20976 – Define base URLs in DOM

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	RESOLVED FIXED

Alias:	None

Product:	WebAppsWG
Classification:	Unclassified
Component:	DOM
Version:	unspecified
Hardware:	Other other

Importance:	P3 normal
Target Milestone:	---
Assignee:	Anne
QA Contact:	public-webapps-bugzilla

URL:	http://www.whatwg.org/specs/web-apps/...

Duplicates (1):	22983

Reported:	2013-02-12 12:53 UTC by contributor
Modified:	2015-08-03 08:21 UTC
CC List:	16 users

See Also:	https://bugzilla.mozilla.org/show_bug.cgi?id=903372 http://code.google.com/p/chromium/issues/detail?id=341854

Description contributor 2013-02-12 12:53:24 UTC

Specification: http://www.whatwg.org/specs/web-apps/current-work/
Multipage: http://www.whatwg.org/C#the-element's-base-url
Complete: http://www.whatwg.org/c#the-element's-base-url

Comment:
There is a mismatch here with the DOM concept of a node's base URL. Not sure
what the right way to address it would be.

Posted from: 207.218.72.65
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.30 (KHTML, like Gecko) Chrome/26.0.1403.0 Safari/537.30

Comment 1 Ian 'Hixie' Hickson 2013-03-27 23:35:42 UTC

What's the mismatch exactly?

Comment 2 Anne 2013-03-28 19:52:20 UTC

DOM has: "Each |node| also has an associated |base URL|." And "The baseURI attribute must return the associated base URL."

I guess the DOM should maybe be updated to reflect that we only need base URLs for Document and Element (and DocumentFragment?).

And maybe DOM should supersede http://www.w3.org/TR/xmlbase/ to use http://url.spec.whatwg.org/ for URL parsing and make it clear that the attribute works in an HTML environment.

Comment 3 Ian 'Hixie' Hickson 2013-04-09 17:51:02 UTC

If you want to define "the element's base URL", that's fine by me. Reassign the bug to me once I need to update HTML to point to DOM instead of XML Base.

Comment 4 Anne 2013-08-16 13:45:28 UTC

*** Bug 22983 has been marked as a duplicate of this bug. ***

Comment 5 Anne 2013-09-13 17:02:10 UTC

Removing xml:base: https://bugzilla.mozilla.org/show_bug.cgi?id=903372

Comment 6 Anne 2013-10-29 11:57:14 UTC

xml:base will go away. Node.baseURI will stay. <base> will stay. Base URL will be scoped to a document.

In Chrome

 http://software.hixie.ch/utilities/js/live-dom-viewer/
 .<script>
 var a = document.createElement("a")
 w(a.baseURI)
 a.href = "/test"
 w(a.href)
 </script>

will log "", followed by "http://software.hixie.ch/test", which makes no sense.

So it seems that nodes should have the base URL of their node document. Base URL of nodes will change once they are adopted.

Does all of this make sense?
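
A minimal sketch of the model proposed above, using plain JavaScript objects instead of real DOM nodes so that it runs outside a browser. `FakeDocument`, `FakeNode` and `adopt` are hypothetical names made up for illustration, and `http://example.org/dir/` is an assumed second document URL; none of this is DOM API.

```javascript
// Hypothetical stand-ins for Document and Node; not real DOM classes.
class FakeDocument {
  constructor(baseURL) {
    this.baseURL = new URL(baseURL); // the base URL is scoped to the document
  }
}

class FakeNode {
  constructor(nodeDocument) {
    this.nodeDocument = nodeDocument;
  }
  // Under the proposal, baseURI is the node document's base URL,
  // even for a node that was never inserted into the tree.
  get baseURI() {
    return this.nodeDocument.baseURL.href;
  }
  // Resolve a relative reference against baseURI, as a.href would.
  resolve(relative) {
    return new URL(relative, this.baseURI).href;
  }
}

// Adoption changes the node document, and with it the base URL.
function adopt(node, newDocument) {
  node.nodeDocument = newDocument;
}

const docA = new FakeDocument("http://software.hixie.ch/utilities/js/live-dom-viewer/");
const docB = new FakeDocument("http://example.org/dir/");
const a = new FakeNode(docA);

const beforeAdopt = [a.baseURI, a.resolve("/test")];
adopt(a, docB);
const afterAdopt = [a.baseURI, a.resolve("/test")];
console.log(beforeAdopt, afterAdopt);
```

In this model baseURI and the resolved href always agree, which is the consistency the Chrome result above lacks.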

Comment 7 Boris Zbarsky 2013-10-29 14:58:41 UTC

Assuming we can get rid of xml:base, I believe it does.

Comment 8 Elliott Sprehn 2013-10-29 20:49:17 UTC

This has come up recently with developers using Shadow DOM and Web Components who wish to be able to scope elements with the base uri of the HTML import that defined them.

Specifically they want to be able to make a custom element defined in an imported document use the correct baseURI when instances are created and their ShadowRoots are populated with <img>'s or nested stylesheets are processed.

After a lot of discussion it seems Node.baseURI is the best approach for them (and provides the most flexibility). We'd like to change the spec to:

Node.baseURI on reading:
- walks up the tree until it encounters:
   a ShadowRoot with a baseURI property that is not empty.
   an Element with a base attribute that is not empty.
- else it returns the Document's baseURI (as computed with <base>).

Node.baseURI on setting:
- If ShadowRoot then set the property.
- If Element then set the base attribute.
- else throw an exception.

We can also fix the nonsensical result upon reading as observed in Anne's comment.

Comment 9 Boris Zbarsky 2013-10-29 20:54:39 UTC

Does the setter in your proposal allow relative URIs?

Comment 10 Elliott Sprehn 2013-10-29 21:00:37 UTC

(In reply to Boris Zbarsky from comment #9)
> Does the setter in your proposal allow relative URIs?

Yes, but the value should be resolved to an absolute URL when stored (ex. set in the attribute) just like <base>.

Comment 11 Simon Pieters 2013-10-30 09:33:16 UTC

In webdevdata's June data set I see 100 base attributes, on embed (49), span (27), img (6), a (6), object (3), and a few others. For example:

<embed src='/Noor/Images/Login/intro.swf' quality='high' bgcolor='#ffffff' width='100%' height='250' name='EWIntro' align='left' allowScriptAccess='always' swLiveConnect='true' type='application/x-shockwave-flash' pluginspage='https://www.macromedia.com/go/getflashplayer' base=''/>

<span id="" class="trcLink en" base="mobile" type="" onclick="sm(this,0,['home','changeLanguage.html?language=en'],'/')" nw="0" lang="de" mode=""><i>&nbsp;</i>English</span>

vote+="<div class='castingvote star_yellow' base='star_yellow' value='"+i+"'></"+"div>";for(;i<=6;i++)

Comment 12 Anne 2013-10-30 15:09:15 UTC

So comment 8 seems to be proposing to reintroduce xml:base under a different name. Do we really want to go there given the performance issues and issues with dynamic updates?

Comment 13 Boris Zbarsky 2013-10-30 16:07:09 UTC

> So comment 8 seems to be proposing to reintroduce xml:base under a different
> name.

Not quite.  For example, the proposed behavior for relative URIs is radically different from xml:base (which reevaluates them all the time against the scoping base URI), and is much saner imo.

The dynamic update issues remain, of course.
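
To illustrate the difference described here, a small sketch using only the URL API. The two helper functions and the example URLs are hypothetical; they model the two storage strategies, not any browser's implementation.

```javascript
// Strategy A (comment 10): resolve the value once, when it is stored.
function storeResolved(value, scopingBase) {
  return new URL(value, scopingBase).href; // absolute from now on
}

// Strategy B (xml:base-like): keep the raw value and resolve it on every read.
function readReevaluated(rawValue, currentScopingBase) {
  return new URL(rawValue, currentScopingBase).href;
}

const raw = "components/";
const baseAtSetTime = "http://example.com/app/";
const baseLater = "http://example.net/other/";

const stored = storeResolved(raw, baseAtSetTime);

// After the scoping base URL changes, only strategy B gives a different answer.
const strategyA = stored;
const strategyB = readReevaluated(raw, baseLater);
console.log(strategyA); // http://example.com/app/components/
console.log(strategyB); // http://example.net/other/components/
```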

Comment 14 Anne 2013-10-30 16:47:37 UTC

So nodes have a base URL which is either "inherit" or a parsed URL.

Note: a document's base URL can never be "inherit".

To get a node's parsed base URL you return the closest inclusive ancestor's base URL that is not "inherit". If there is no such ancestor you return the node's node document's base URL.

---

Node.prototype.baseURI returns the result of getting a node's parsed base URL, serialized.

When setting Node.prototype.baseURI on /node/ to /value/, run these steps:

1. Let /base/ be the result of getting /node/'s parsed base URL.

2. Let /url/ be the result of running the URL parser for /value/ against /base/.

3. If /node/ is a ShadowRoot, set its base URL to /url/.

4. Otherwise, if /node/ is an Element, set /node/'s "base" attribute to /url/, serialized.

5. Otherwise, throw a JavaScript TypeError exception.

Note: Changing a document's base URL can only be done through the HTML base element.

---

Either when an element is created that has a "base" attribute or when an element's "base" attribute is set, run these steps:

1. Let /base/ be the result of getting element's parsed base URL.

2. Let /url/ be the result of running the URL parser for /value/ against /base/.

3. Set element's base URL to /url/.

When an element's "base" attribute is removed, set element's base URL to "inherit".
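
The steps above can be sketched roughly as follows. This is a simplified model on plain objects, not the DOM: `SketchNode` and its fields are hypothetical, a shadow root's host is treated as its parent for the upward walk, and the document is a bare object that only carries a base URL.

```javascript
// Hypothetical model of the steps above; SketchNode is not a DOM class.
const INHERIT = "inherit";

class SketchNode {
  // kind is "ShadowRoot", "Element" or "Text"; parent is null when detached.
  constructor(kind, nodeDocument, parent = null) {
    this.kind = kind;
    this.nodeDocument = nodeDocument; // { baseURL: URL }, never "inherit"
    this.parent = parent;
    this.attributes = new Map();
    this.baseURL = INHERIT;
  }

  // Closest inclusive ancestor whose base URL is not "inherit",
  // falling back to the node document's base URL.
  get parsedBaseURL() {
    for (let n = this; n !== null; n = n.parent) {
      if (n.baseURL !== INHERIT) return n.baseURL;
    }
    return this.nodeDocument.baseURL;
  }

  get baseURI() {
    return this.parsedBaseURL.href; // serialized
  }

  set baseURI(value) {
    const url = new URL(value, this.parsedBaseURL); // steps 1 and 2
    if (this.kind === "ShadowRoot") {
      this.baseURL = url; // step 3
    } else if (this.kind === "Element") {
      this.setAttribute("base", url.href); // step 4
    } else {
      throw new TypeError("baseURI can only be set on ShadowRoot or Element"); // step 5
    }
  }

  setAttribute(name, value) {
    if (name === "base") {
      // Attribute steps: parse against the base URL in effect, then store it.
      this.baseURL = new URL(value, this.parsedBaseURL);
    }
    this.attributes.set(name, value);
  }

  removeAttribute(name) {
    this.attributes.delete(name);
    if (name === "base") this.baseURL = INHERIT;
  }
}

const doc = { baseURL: new URL("http://example.com/app/") };
const host = new SketchNode("Element", doc);
const root = new SketchNode("ShadowRoot", doc, host);
const img = new SketchNode("Element", doc, root);
const text = new SketchNode("Text", doc, img);

root.baseURI = "components/";   // relative value, resolved when stored
const inherited = text.baseURI; // found by walking up to root
img.baseURI = "images/";        // stored in the "base" attribute
const own = img.baseURI;
img.removeAttribute("base");    // back to "inherit"
const afterRemoval = img.baseURI;
console.log(inherited, own, afterRemoval);
```

Every read of baseURI walks the ancestor chain in this model.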

Comment 15 Anne 2013-10-30 16:57:55 UTC

(Just to be clear, I'm not sure whether it's a good idea, but I thought it'd be worthwhile to spell out the specifics. Note that the above omits invoking http://dom.spec.whatwg.org/#base-url-change-steps whenever the base URL is set, which it would need to do in the end.)

Comment 16 Anne 2013-10-30 17:07:27 UTC

Another thing we need to consider is what http://dom.spec.whatwg.org/#concept-node-adopt and http://dom.spec.whatwg.org/#concept-node-clone do with the base URL. I guess if there is a non-"inherit" base URL we want it to be kept, same for ShadowRoot?

Comment 17 Boris Zbarsky 2013-10-30 17:47:07 UTC

The need to walk up the DOM on every single base URI lookup is pretty annoying.  :(

Comment 18 Simon Pieters 2013-10-30 21:39:36 UTC

Does the use case really need a base attribute on all elements? Or just ability to set baseURI on ShadowRoot? I don't really understand the use case.

Comment 19 Erik Arvidsson 2013-10-30 21:44:09 UTC

(In reply to Simon Pieters from comment #18)
> Does the use case really need a base attribute on all elements? Or just
> ability to set baseURI on ShadowRoot? I don't really understand the use case.

One might want to use Nodes defined in an HTMLImport in the main document without using shadow dom.

Comment 20 Elliott Sprehn 2013-11-11 21:16:16 UTC

I talked with folks over here and adding baseURI to ShadowRoot is enough for all the current use cases except the one Erik mentioned. Since we can always add the attribute version of baseURI in the future on top of ShadowRoot's support I think we should go that route. Walking up the tree of containing ShadowRoots can also be made faster than walking up the tree of parent Elements.

I also think we should avoid speccing this to be dynamic. In practice browsers are not very dynamic with resource urls already. Instead, at the point where a resource is fetched, we should resolve the URL against the current baseURI of the scope (ShadowRoot or Document).

ex.

root.baseURI = "http://foo/"
img1.src = "a.png";
root.appendChild(img1);
root.baseURI = "http://bar/"
img2.src = "b.png";
root.appendChild(img2);

img1.src == "http://foo/a.png"
img2.src == "http://bar/b.png"

How does that sound Anne/Boris?

Comment 21 Boris Zbarsky 2013-11-11 21:24:18 UTC

That sounds fairly reasonable to me, I guess.  I don't think it's too much worse than what we have with document's baseURI right now.

Comment 22 Anne 2013-11-12 06:38:34 UTC

So let's see. We make baseURI read-write on Document and ShadowRoot. We make writing throw for other nodes. Whenever you resolve a URL you take the current base URL in scope.

Would it make sense to keep baseURI readonly and introduce Document.baseURL / ShadowRoot.baseURL as read-write?

Also, are we sure we want read-write for Document given <base>? It seems somewhat confusing to have two independent points in control of a document's base URL.

Comment 23 Adam Klein 2014-01-16 00:40:34 UTC

Reviving this thread as I've been working on implementing this in Blink.

(In reply to Elliott Sprehn from comment #20)
> root.baseURI = "http://foo/"
> img1.src = "a.png";
> root.appendChild(img1);
> root.baseURI = "http://bar/"
> img2.src = "b.png";
> root.appendChild(img2);
> 
> img1.src == "http://foo/a.png"
> img2.src == "http://bar/b.png"

I'm not sure this example works correctly. Let's say that root, img1, and img2 all share an ownerDocument whose baseURI is "http://baz.com/". Resource loads for http://baz.com/a.png and http://baz.com/b.png will be initiated, since at the point when img1 and img2 have their src property set they are not descendants of root, and thus they get their baseURI from their ownerDocument.

Moreover, to get the "right" images to load, we'll need more dynamism in the system, so that becoming a descendant of a ShadowRoot causes a base URL change.
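
A rough sketch of the ordering problem, assuming the non-dynamic model from comment 20, in which a URL is resolved once at the point the load starts. `Scope`, `Img` and `appendTo` are hypothetical stand-ins for Document/ShadowRoot, an image element and appendChild(); the URLs are the ones used in this thread.

```javascript
// Hypothetical model: a scope is a Document or ShadowRoot with a base URL.
class Scope {
  constructor(baseURI) {
    this.baseURI = baseURI;
  }
}

class Img {
  constructor(ownerDocument) {
    this.scope = ownerDocument; // until the element is appended elsewhere
    this.fetched = [];          // URLs for which a load was started
  }
  set src(value) {
    // Resolve once, at fetch time, against the base URL currently in scope.
    this.fetched.push(new URL(value, this.scope.baseURI).href);
  }
}

// Moving into a new scope does not re-resolve or reload anything.
function appendTo(scope, img) {
  img.scope = scope;
}

const doc = new Scope("http://baz.com/");
const root = new Scope("http://foo/");

const img1 = new Img(doc);
img1.src = "a.png";   // set before appending: resolved against the document
appendTo(root, img1);

const img2 = new Img(doc);
appendTo(root, img2); // appended first: resolved against the shadow root
img2.src = "a.png";

console.log(img1.fetched, img2.fetched);
```

The same relative URL loads from different origins depending only on statement order.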

(Note that the HTML spec (which I can't get to at the moment) says that <img> elements don't reload their resources when switching Documents, but WebKit, Blink, and Gecko all seem to do a reload (haven't tried IE). It would be nice to fix that alongside getting ShadowRoot.baseURI's dynamism specified.)

Comment 24 Anne 2014-01-16 14:38:07 UTC

It sounds bad to design a system where you first load the wrong resource and then after appending it loads the correct one.

The <img> scenario is interesting however. Do you know the <img> is going to be used inside a ShadowRoot? When you do new Image() there's no such association, unless we had actual isolation.

Comment 25 Adam Klein 2014-01-16 18:41:59 UTC

(In reply to Anne from comment #24)
> The <img> scenario is interesting however. Do you know the <img> is going to
> be used inside a ShadowRoot? When you do new Image() there's no such
> association, unless we had actual isolation.

This is off-topic for this bug, but yes, in discussing this last night with some folks I came to the same conclusion: that to get this really right you want iframe-like isolation.

Comment 26 Elliott Sprehn 2014-01-16 21:39:56 UTC

(In reply to Adam Klein from comment #23)
> Reviving this thread as I've been working on implementing this in Blink.
> 
> (In reply to Elliott Sprehn from comment #20)
> > root.baseURI = "http://foo/"
> > img1.src = "a.png";
> > root.appendChild(img1);
> > root.baseURI = "http://bar/"
> > img2.src = "b.png";
> > root.appendChild(img2);
> > 
> > img1.src == "http://foo/a.png"
> > img2.src == "http://bar/b.png"
> 
> I'm not sure this example works correctly. [...]

Yeah I fumbled my example, you need to appendChild() first before assigning the src.

Comment 27 Adam Klein 2014-01-16 21:48:46 UTC

(In reply to Elliott Sprehn from comment #26)
> (In reply to Adam Klein from comment #23)
> > Reviving this thread as I've been working on implementing this in Blink.
> > 
> > (In reply to Elliott Sprehn from comment #20)
> > > root.baseURI = "http://foo/"
> > > img1.src = "a.png";
> > > root.appendChild(img1);
> > > root.baseURI = "http://bar/"
> > > img2.src = "b.png";
> > > root.appendChild(img2);
> > > 
> > > img1.src == "http://foo/a.png"
> > > img2.src == "http://bar/b.png"
> > 
> > I'm not sure this example works correctly. [...]
> 
> Yeah I fumbled my example, you need to appendChild() first before assigning
> the src.

Seems like a bit of a footgun. And note that

root.innerHTML = '<img src="a.png">'

doesn't even give you the option of doing the right thing (unless we somehow change how ShadowRoot.innerHTML works).

Comment 28 Anne 2014-01-17 11:02:58 UTC

So I think for now we are back to the situation in comment 6. Once we have declarative components or isolated components, we can revisit making base URLs more complicated.

Comment 29 Anne 2014-01-17 11:03:57 UTC

And until then the URL API can be used to create absolute URLs.
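
As a brief illustration of that workaround, the `URL` constructor takes a relative reference and a base URL and yields an absolute URL. The base URL and the relative references below are example values chosen for illustration.

```javascript
// Example base URL; any absolute URL works here.
const base = "https://example.com/some-component/import-me.html";

const references = [
  "foo.png",                     // relative to the base's directory
  "./img/foo.png",               // explicit current directory
  "../shared/foo.png",           // parent directory
  "/foo.png",                    // relative to the origin
  "https://cdn.example/foo.png", // already absolute, base is ignored
];

const resolved = references.map((ref) => new URL(ref, base).href);
console.log(resolved);
```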

Comment 30 Mathias Bynens 2014-10-03 20:46:24 UTC

`innerHTML` and similar are not the problem, IMHO. To use relative URLs in <script>s within HTML imports, use the URL API, e.g.:

	var importBaseURL = document.currentScript.ownerDocument.baseURI;
	var url = new URL('foo.png', importBaseURL).toString();
	// Now, use `url` in `innerHTML` all you want.

The problem that still lacks a good solution: authors want to use things like `<link rel=stylesheet href=bar>` expecting `bar` to always be relative to the HTML import that includes it (without having to script it all). Same for `<img>`, references in `<style>` blocks, etc.

Consider this document, located at `https://example.com/some-component/import-me.html`:

    <img src=foo>

It would be nice if the import process would *somehow* turn that into…

    <img src=https://example.com/some-component/foo>

…before inserting it into the parent document.

If setting `shadowRoot.baseURI = document.currentScript.ownerDocument.baseURI` would have this effect, that’d be a fair solution for shadow roots. But IMHO this is a problem for all imports, regardless of whether they use shadow roots or not.

Comment 31 Philip Jägenstedt 2015-05-18 08:04:41 UTC

xml:base is now gone from Blink:
https://code.google.com/p/chromium/issues/detail?id=341854

If that isn't reverted, what's the next step for this bug?

Comment 32 Anne 2015-05-20 01:01:29 UTC

I will make comment 6 the new reality specification-wise and try to get all browsers to implement that. And get HTML to define its base URL stuff in terms of the new DOM base URL stuff.

Comment 33 Philip Jägenstedt 2015-05-20 08:04:37 UTC

That sounds good to me, in particular making Node.baseURI be the node document's base URL would make sense to me.

Comment 34 Anne 2015-08-03 08:20:22 UTC

https://github.com/whatwg/dom/commit/8ca4959505f531663bc91f064a19762e8b90b810

Comment 35 Anne 2015-08-03 08:21:38 UTC

Note that I opened https://github.com/whatwg/dom/issues/61 to design the base URL change notification system, which we'll use for setting a document's base URL and adopting. It has one open question about when it should run relative to the adoption specification callback. Input welcome there.