21058 – Changing base.href has undefined behavior on img.src

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WONTFIX

Alias:	None

Product:	WHATWG
Classification:	Unclassified
Component:	HTML
Version:	unspecified
Hardware:	PC Windows NT

Importance:	P2 normal
Target Milestone:	2016 Q1
Assignee:	Ian 'Hixie' Hickson
QA Contact:	contributor

Reported:	2013-02-20 15:21 UTC by Glenn Maynard
Modified:	2013-11-18 23:24 UTC
CC List:	2 users

Description Glenn Maynard 2013-02-20 15:21:34 UTC

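<base href="http://www.google.com">
<body>
<script>
// Grab the document's <base> element and create a detached <img>.
var base = document.getElementsByTagName("base")[0];
var img = document.createElement("img");
// Change the base URL, then assign a relative src.
base.href = "http://foo.com";
img.src = "images/srpr/logo3w.png";
// Insert the image; which base the relative URL resolves against is the question.
document.body.appendChild(img);
</script>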

The result seems to be unspecified.  The relative URL "images/srpr/logo3w.png" is resolved according to <base> during "update the image data".  This may happen synchronously, in a UA that obtains images immediately, or asynchronously, in a UA that obtains images on demand.  That means in the former it'll be resolved against the old base, and in the latter it'll be resolved against the new base, or against any future value of base.href.
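
To make the two outcomes concrete, here is a minimal sketch that resolves the same relative URL against each base. It uses the standard `URL` constructor rather than a real `<img>` element; the variable names are illustrative only, and it assumes `base.href` is changed at some point between the `src` assignment and the actual fetch.

```javascript
// Illustration only: models the resolution step, not a browser's image loading.
const relative = "images/srpr/logo3w.png";
const oldBase = "http://www.google.com"; // base in effect when src is assigned
const newBase = "http://foo.com";        // base in effect if resolution is deferred

// A UA that obtains images immediately resolves against the old base.
const resolvedEarly = new URL(relative, oldBase).href;

// A UA that obtains images on demand resolves against whatever base is current later.
const resolvedLate = new URL(relative, newBase).href;

console.log(resolvedEarly); // http://www.google.com/images/srpr/logo3w.png
console.log(resolvedLate);  // http://foo.com/images/srpr/logo3w.png
```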

Chrome seems to resolve the .src's URL synchronously when set: the image loads if base.href is changed right after .src is set.  Firefox and IE9 seem to resolve it when added to the document: the image loads if base.href is changed right after appendChild.  I don't know if any browser configurations cause it to be resolved later than that, which the spec also seems to allow.

The sanest behavior is probably to resolve src and srcset synchronously, and stash the result for whenever the actual fetch happens (matching Chrome's effective behavior).
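
A rough sketch of what "resolve synchronously and stash" could look like, using a toy model rather than any real browser internals. `FakeDocument`, `EagerImage`, and `urlToFetch` are hypothetical names invented for this illustration.

```javascript
// Hypothetical model, not a browser API: a document with a mutable base URL
// and an image that resolves its src at assignment time.
class FakeDocument {
  constructor(baseHref) {
    this.baseHref = baseHref;
  }
}

class EagerImage {
  constructor(doc) {
    this.doc = doc;
    this.resolvedSrc = null;
  }

  set src(value) {
    // Resolve immediately against the base in effect right now, and stash the result.
    this.resolvedSrc = new URL(value, this.doc.baseHref).href;
  }

  // Called whenever the UA actually decides to fetch (immediately or on demand).
  urlToFetch() {
    return this.resolvedSrc;
  }
}

const doc = new FakeDocument("http://www.google.com");
const img = new EagerImage(doc);
img.src = "images/srpr/logo3w.png";
doc.baseHref = "http://foo.com"; // a later base change does not affect the stashed URL
console.log(img.urlToFetch());   // http://www.google.com/images/srpr/logo3w.png
```

Under this model the "immediately" and "on demand" paths would fetch the same URL, since the fetch step never consults the base again.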

See http://lists.w3.org/Archives/Public/public-whatwg-archive/2013Feb/0142.html (last section) for a related issue: behavior is similarly unspecified if you revokeObjectURL after changing img.src.

Comment 1 Ian 'Hixie' Hickson 2013-04-12 19:14:37 UTC

This seems to be specified in detail; you describe exactly what the spec says above. The behaviour described above for Firefox and IE is wrong per spec (I expect they actually just queued a task rather than wait for the insertion, by the way, but the effect would be indistinguishable in that test case).

But it seems that today Firefox, Chrome, and IE all do what the spec says:

   http://software.hixie.ch/utilities/js/live-dom-viewer/saved/2209

They all render the same thing for me (the Google logo).

Comment 2 Glenn Maynard 2013-04-13 01:39:27 UTC

(In reply to comment #1)
> This seems to be specified in detail; you describe exactly what the spec
> says above. The behaviour described above for Firefox and IE is wrong per
> spec (I expect they actually just queued a task rather than wait for the
> insertion, by the way, but the effect would be indistinguishable in that
> test case).

I'm confused--I described the spec allowing for two different, incompatible behaviors.

> But it seems that today Firefox, Chrome, and IE all do what the spec says:
> 
>    http://software.hixie.ch/utilities/js/live-dom-viewer/saved/2209
> 
> They all render the same thing for me (the Google logo).

But the spec doesn't seem to actually require this.  This is one of two different, incompatible behaviors it allows.

But in Opera, I see the HP icon, and that seems to be allowed by the spec.  If the URL is resolved when .src is assigned, both the "immediately" and "on demand" paths would end up with the same image.

Comment 3 Ian 'Hixie' Hickson 2013-04-14 07:36:26 UTC

Well, there'll always definitely be a potential difference in behaviour, because the whole point of this part of the spec here is to allow two behaviours (immediate fetching, or delayed fetching).

Resolving the URL early in the on-demand case would bring them a little closer together, but not much in the long run, given srcset="" (you don't know which URL to resolve since you don't know what the environment will be like when you need the image).

If you turn images off in browsers, and then manually ask for the image to be shown, don't they still follow the spec (using the late resolving)?

Comment 4 Glenn Maynard 2013-04-16 01:44:54 UTC

(In reply to comment #3)
> Well, there'll always definitely be a potential difference in behaviour,
> because the whole point of this part of the spec here is to allow two
> behaviours (immediate fetching, or delayed fetching).

But we should try to minimize the ways that difference might accidentally break scripts.

A closely related issue will be capturing blob URLs, which definitely needs to happen on assignment.  I think the solutions to these two problems are related (Anne just started a new thread on this, so I've snipped some stuff to move it there).

> Resolving the URL early in the on-demand case would bring them a little
> closer together, but not much in the long run, given srcset="" (you don't
> know which URL to resolve since you don't know what the environment will be
> like when you need the image).

Why not just resolve all of them?  Parse srcset once on assignment, instead of when the environment changes, so instead of resulting in a list of URL strings and descriptors, it gives a list of parsed URLs and descriptors.

This would also make srcset work reliably with blob URLs, even if the environment changes, if we use my grab-blobs-synchronously proposal.
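
As a sketch of the "resolve all of them on assignment" idea, the following turns a srcset string into a list of resolved URLs plus descriptors. `resolveSrcset` is a hypothetical helper, and the comma splitting is a deliberate simplification: it is not the HTML srcset parsing algorithm and would mishandle URLs that contain commas.

```javascript
// Simplified, hypothetical srcset handling for illustration only.
// Splits candidates on commas and whitespace; not the full HTML algorithm.
function resolveSrcset(srcset, baseHref) {
  return srcset
    .split(",")
    .map((candidate) => candidate.trim())
    .filter((candidate) => candidate.length > 0)
    .map((candidate) => {
      const [url, ...descriptors] = candidate.split(/\s+/);
      // Resolve every candidate now, against the base in effect at assignment.
      return { url: new URL(url, baseHref).href, descriptors };
    });
}

const candidates = resolveSrcset(
  "logo.png 1x, logo-2x.png 2x",
  "http://www.google.com/images/"
);
console.log(candidates);
```

Picking a candidate later, when the environment is known, would then only need the descriptors; the URLs themselves would no longer depend on the current base.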

> If you turn images off in browsers, and then manually ask for the image to
> be shown, don't they still follow the spec (using the late resolving)?

I tried to test this, actually, but I couldn't figure out how to actually do this in any current browser.  Do any support it (without extensions)?

It would still be a problem if they did, though (images would fail to load in this uncommon configuration that otherwise succeeded).

Comment 5 Ian 'Hixie' Hickson 2013-10-23 20:19:26 UTC

> But we should try to minimize the ways that difference might accidentally
> break scripts.

If a page has relative URLs and the script starts changing the path using pushState or actually changing the <base> URL, it's not clear to me that the script is being sane in the first place.


> A closely related issue will be capturing blob URLs, which definitely needs
> to happen on assignment.

I understand that you feel that way, but I don't think it's a good design for Blob either.


> > Resolving the URL early in the on-demand case would bring them a little
> > closer together, but not much in the long run, given srcset="" (you don't
> > know which URL to resolve since you don't know what the environment will be
> > like when you need the image).
> 
> Why not just resolve all of them?

That's a lot of URLs to resolve and keep around just in case the site later changes the base URL. And what happens if the script just concatenates a new alternative to the srcset="" attribute? Do you suddenly forget all the resolved URLs and reresolve it? Wouldn't that be the exact problem we're trying to avoid?


> Parse srcset once on assignment

Parsing and resolving are different steps.


> > If you turn images off in browsers, and then manually ask for the image to
> > be shown, don't they still follow the spec (using the late resolving)?
> 
> I tried to test this, actually, but I couldn't figure out how to actually do
> this in any current browser.  Do any support it (without extensions)?

Opera used to. Dunno about modern UAs.


> It would still be a problem if they did, though (images would fail to load
> in this uncommon configuration that otherwise succeeded).

The alternative is requiring that memory-constrained UAs — those most likely to be disabling images in the first place — keep track of all the URLs in both resolved and unresolved form, which is a memory burden, rather the opposite of what those UAs would want.

Comment 6 Glenn Maynard 2013-10-23 23:24:39 UTC

(In reply to Ian 'Hixie' Hickson from comment #5)
> If a page has relative URLs and the script starts changing the path using
> pushState or actually changing the <base> URL, it's not clear to me that the
> script is being sane in the first place.

(Maybe, but this seems like a poor justification for the spec allowing two different incompatible behaviors.)

> > A closely related issue will be capturing blob URLs, which definitely needs
> > to happen on assignment.
> 
> I understand that you feel that way, but I don't think it's a good design
> for Blob either.

(This is tangential, but as far as I know it's the only available solution for fixing the massive underspecification of blob URL handling.)

> And what happens if the script just concatenates a new
> alternative to the srcset="" attribute? Do you suddenly forget all the
> resolved URLs and reresolve it? Wouldn't that be the exact problem we're
> trying to avoid?

The main problem I'm trying to solve is the spec allowing two different behaviors.  If changing srcset reresolves all of the URLs, that's fine, as long as that's the required behavior, not one out of two options browsers can pick from.

> > Parse srcset once on assignment
> 
> Parsing and resolving are different steps.

I was using the terminology from the URL spec, where resolving URLs against a base URL is part of parsing.  http://url.spec.whatwg.org/#parsing

> > It would still be a problem if they did, though (images would fail to load
> > in this uncommon configuration that otherwise succeeded).
> 
> The alternative is requiring that memory-constrained UAs — those most likely
> to be disabling images in the first place — keep track of all the URLs in
> both resolved and unresolved form, which is a memory burden, rather the
> opposite of what those UAs would want.

It seems like a stretch for a second copy of URLs to be a real memory concern, but that said, the resolved copy of the URL could be optimized away by the browser.  Instead of resolving the URL and storing it in each case, store only the base URL that it resolves against, so it's available later when needed.  That can be shared across a lot of elements, so the memory cost is reduced to a reference in most cases.
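
A minimal sketch of that optimization, again as a toy model with hypothetical names (`BaseSnapshot`, `FakeDocument`, `DeferredImage`): each image keeps its unresolved `src` plus a reference to a shared snapshot of the base URL, and resolution happens only when the fetch is actually needed.

```javascript
// Hypothetical model, not browser internals.
// An immutable snapshot of a base URL that many elements can share.
class BaseSnapshot {
  constructor(href) {
    this.href = href;
    Object.freeze(this);
  }
}

class FakeDocument {
  constructor(href) {
    this.currentBase = new BaseSnapshot(href);
  }
  setBase(href) {
    // Changing the base creates a new snapshot; existing references stay valid.
    this.currentBase = new BaseSnapshot(href);
  }
}

class DeferredImage {
  constructor(doc) {
    this.doc = doc;
    this.rawSrc = null;
    this.baseAtAssignment = null;
  }
  set src(value) {
    this.rawSrc = value;
    this.baseAtAssignment = this.doc.currentBase; // a reference, not a copy
  }
  urlToFetch() {
    // Resolution is deferred, but uses the base captured at assignment.
    return new URL(this.rawSrc, this.baseAtAssignment.href).href;
  }
}

const doc = new FakeDocument("http://www.google.com");
const a = new DeferredImage(doc);
const b = new DeferredImage(doc);
a.src = "images/srpr/logo3w.png";
b.src = "images/other.png";
doc.setBase("http://foo.com");

console.log(a.baseAtAssignment === b.baseAtAssignment); // true: one shared snapshot
console.log(a.urlToFetch()); // http://www.google.com/images/srpr/logo3w.png
```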

Comment 7 Ian 'Hixie' Hickson 2013-10-25 23:34:00 UTC

Even a pointer per element is a lot. UAs fight to save individual bits.

Fundamentally, I'm not convinced that this is really a problem, and the solution seems like a lot of work.

Comment 8 Ian 'Hixie' Hickson 2013-11-18 23:24:22 UTC

I'm marking this as WONTFIX. If you can find a browser that supports late-loading of images, and that wants to early-resolve URLs, then reopen the bug and let me know. In the meantime, what the spec has now seems good enough.