← 2.4 Common microsyntaxesTable of contents2.6 Common DOM interfaces →
    1. 2.6 URLs
      1. 2.6.1 Terminology
      2. 2.6.3 Resolving URLs
      3. 2.6.6 Interfaces for URL manipulation

2.5 URLs

This specification defines the term URL, and defines various algorithms for dealing with URLs, because for historical reasons the rules defined by the URI and IRI specifications are not a complete description of what HTML user agents need to implement to be compatible with Web content.

The term "URL" in this specification is used in a manner distinct from the precise technical meaning it is given in RFC 3986. Readers familiar with that RFC will find it easier to read this specification if they pretend the term "URL" as used herein is really called something else altogether. This is a willful violation of RFC 3986. [RFC3986]

2.5.1 Terminology

A URL is a string used to identify a resource.

A URL is a valid URL if at least one of the following conditions holds:

A string is a valid non-empty URL if it is a valid URL but it is not the empty string.

A string is a valid URL potentially surrounded by spaces if, after stripping leading and trailing whitespace from it, it is a valid URL.

A string is a valid non-empty URL potentially surrounded by spaces if, after stripping leading and trailing whitespace from it, it is a valid non-empty URL.

This specification defines the URL about:legacy-compat as a reserved, though unresolvable, about: URI, for use in DOCTYPEs in HTML documents when needed for compatibility with XML tools. [ABOUT]

This specification defines the URL about:srcdoc as a reserved, though unresolvable, about: URI, that is used as the document's address of iframe srcdoc documents. [ABOUT]

2.5.2 Resolving URLs

Resolving a URL is the process of taking a relative URL and obtaining the absolute URL that it implies.

A URL is an absolute URL if resolving it results in the same output regardless of what it is resolved relative to, and that output is not a failure.

An absolute URL is a hierarchical URL if, when resolved and then parsed, there is a character immediately after the <scheme> component and it is a U+002F SOLIDUS character (/).

An absolute URL is an authority-based URL if, when resolved and then parsed, there are two characters immediately after the <scheme> component and they are both U+002F SOLIDUS characters (//).

2.5.3 Interfaces for URL manipulation

An interface that has a complement of URL decomposition IDL attributes has seven attributes with the following definitions:

           attribute DOMString protocol;
           attribute DOMString host;
           attribute DOMString hostname;
           attribute DOMString port;
           attribute DOMString pathname;
           attribute DOMString search;
           attribute DOMString hash;
o . protocol [ = value ]

Returns the current scheme of the underlying URL.

Can be set, to change the underlying URL's scheme.

o . host [ = value ]

Returns the current host and port (if it's not the default port) in the underlying URL.

Can be set, to change the underlying URL's host and port.

The host and the port are separated by a colon. The port part, if omitted, will be assumed to be the current scheme's default port.

o . hostname [ = value ]

Returns the current host in the underlying URL.

Can be set, to change the underlying URL's host.

o . port [ = value ]

Returns the current port in the underlying URL.

Can be set, to change the underlying URL's port.

o . pathname [ = value ]

Returns the current path in the underlying URL.

Can be set, to change the underlying URL's path.

o . search [ = value ]

Returns the current query component in the underlying URL.

Can be set, to change the underlying URL's query component.

o . hash [ = value ]

Returns the current fragment identifier in the underlying URL.

Can be set, to change the underlying URL's fragment identifier.

The table below demonstrates how the getter for search results in different results depending on the exact original syntax of the URL:

Input URL search value Explanation
http://example.com/ empty string No <query> component in input URL.
http://example.com/? ? There is a <query> component, but it is empty.
http://example.com/?test ?test The <query> component has the value "test".
http://example.com/?test# ?test The (empty) <fragment> component is not part of the <query> component.

The following table is similar; it provides a list of what each of the URL decomposition IDL attributes returns for a given input URL.

Input protocol host hostname port pathname search hash
http://example.com/carrot#question%3f http: example.com example.com (empty string) /carrot (empty string) #question%3f
https://www.example.com:4443? https: www.example.com:4443 www.example.com 4443 / ? (empty string)