This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 10625 - Spec should cover stopping parsing on location.href = "foo"
Summary: Spec should cover stopping parsing on location.href = "foo"
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML5 spec (editor: Ian Hickson)
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-09-13 19:21 UTC by Eric Seidel
Modified: 2010-12-07 02:19 UTC
CC List: 12 users

See Also:


Description Eric Seidel 2010-09-13 19:21:03 UTC
Spec should cover stopping parsing on location.href = "foo"

It seems that at least Firefox and (historical) WebKit stop parsing whenever a scheduled location change is made, such as by setting window.location.href.

This is not directly discussed in:
http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#the-location-interface

However, http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#navigate does imply that parsing should stop if we get all the way to unloading the document. Existing browsers, though, seem to stop parsing immediately after location.href is set, regardless of whether the load is secure or successful.

See https://bugs.webkit.org/show_bug.cgi?id=43328 for how we had to add this quirk back into WebKit after moving to the HTML5 parser.
Comment 1 Ian 'Hixie' Hickson 2010-09-15 13:25:22 UTC
So basically navigating a browsing context should cause an active parser for that browsing context's document to be aborted immediately?

For navigations started in tasks that are not triggered by the parser, e.g. from a timeout or user interaction, this will result in unpredictable behaviour. Is that intentional?
Comment 2 Eric Seidel 2010-09-15 19:26:20 UTC
I'm not sure I fully understand the desired behavior here.  WebKit's current implementation will abort any parsing associated with the frame you're navigating as soon as a scheduled navigation is registered (sent off to the network layer), even if that navigation eventually fails, turns into a download, etc.  This is not intended to be of our own design, but rather to match FF/IE.  I guess we need more testing to understand what the web/FF/IE expect here.
Comment 3 Adam Barth 2010-09-16 06:19:07 UTC
This might be limited to only certain kinds of navigations.  The issue is that different async navigations race and web sites expect the races to have a particular outcome.
Comment 4 Henri Sivonen 2010-09-16 14:31:43 UTC
bz probably knows more about the constraints than I do.

I don't know what the Web requires to be synchronous. I just know that the Gecko-internal API for stopping the parser has to be synchronous and can be called with the most unexpected things on the call stack. It took many iterations to cover enough edge cases to stop the influx of bug reports about it.
Comment 5 Ian 'Hixie' Hickson 2010-09-28 22:56:57 UTC
So what happens when the parser is aborted?

 * what's the current document readiness set to, if anything?
 * what scripts execute? the same ones as when parsing ends? none?
 * do DOMContentLoaded and load events fire?
 * how about pageshow and popstate?
 * do pending appcache tasks run?

What should I spec here? I'm happy to spec anything you want, but since everyone here seems to agree that something is needed but that their implementations aren't quite right, I don't really know what to do exactly.
Comment 6 Boris Zbarsky 2010-09-29 04:22:51 UTC
The Gecko behavior is that starting a load in a docshell (which is basically a navigation context; maps 1-1 to a window proxy) will issue a Stop() call if the URI is not a javascript: URI.  This check happens after checks for scrolling to anchor, etc.

Now a complication is that there are two kinds of Stop() calls.  Usually navigation starts issue a STOP_NETWORK, which just stops all the network activity associated with the current page in the navigation context.  But for location.href sets, and for any navigation that happens before the new page is actually showing (e.g. during the paint suppression timeout), we issue a STOP_ALL call, which stops network activity, stops image animations, and stops the parser.
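
As a rough illustration of the choice between the two Stop() flavours described above, here is a minimal sketch. It is a toy model, not Gecko source: the function stopKindForNavigation and the fields url, viaLocationSet and pageShowing are hypothetical names, and only STOP_NETWORK and STOP_ALL come from the description.

```javascript
// Toy model of the behaviour described above; NOT actual Gecko code.
const STOP_NETWORK = "STOP_NETWORK"; // stops network activity only
const STOP_ALL = "STOP_ALL";         // also stops image animations and the parser

// `nav` is a hypothetical object:
//   url            - the URI being navigated to
//   viaLocationSet - true if the navigation came from a location.href set
//   pageShowing    - false while the page in the docshell is not yet showing
function stopKindForNavigation(nav) {
  // javascript: URIs do not issue a Stop() call at all.
  if (nav.url.toLowerCase().startsWith("javascript:")) return null;
  // location.href sets, and navigations started before the page is showing,
  // get the stronger stop.
  if (nav.viaLocationSet || !nav.pageShowing) return STOP_ALL;
  // Other navigation starts only stop network activity.
  return STOP_NETWORK;
}

console.log(stopKindForNavigation({
  url: "http://example.com/",
  viaLocationSet: true,
  pageShowing: true,
}));
```

Under these assumptions, only the STOP_ALL branch touches the parser, which is the case this bug is about.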

I believe both versions of Stop() can synchronously trigger onload on the page being navigated away from in some cases (e.g. a parent document that has everything loaded except for one subframe is navigated from an onload handler on the subframe's window; that will synchronously trigger onload on the parent before the location set returns, because otherwise the parent would never get an onload at all in Gecko due to the new load that just got kicked off being associated with it for the moment).

Now what parts of all this are needed for web compat... I don't know.  This code mostly predates my involvement in the project, and I've modified it as little as I can while fixing obvious issues over the years; the regression risk has always scared the crap out of me.

As for the questions in comment 5.... at least in Gecko a location.href set is equivalent to the user hitting the stop button in terms of its "stop" effects.  What's the stop button specified to do, if anything?  If it's not specified, then (still in Gecko) the situation is basically treated as if all outstanding network requests in the window subtree rooted at the relevant window had network errors.  That should cover document readiness, DOMContentLoaded, load, pageshow, popstate behavior.  No idea about appcache.  As for scripts executing and parsing ending, I believe anything that's not in the DOM already needs to not end up in there, if you're trying to duplicate Gecko behavior.

Again, no idea how much of this is _needed_ for compat.
Comment 7 Ian 'Hixie' Hickson 2010-10-15 00:09:36 UTC
Ok well I guess I'm just going to spend a few days testing all of this in browsers I have access to and seeing what there's interop on.

Ways to trigger this I need to test:
 - setting location.href
 - following a link
 - meta refresh
 - navigation targeting the browsing context from another
 - all the above with javascript: URLs and http: URLs
 - all of the above within milliseconds of page load starting and while the network is stalled

Behaviours to check:
 - does 'onload' fire synchronously with whatever triggered the stop?
 - what's the readiness state?
 - what scripts execute if any are pending?
 - does onload fire in subframes if the parent is navigated?
 - does onload fire in parent frames if the child navigates it?
 - how does it affect bfcache?

Anything else?
Comment 8 Boris Zbarsky 2010-10-15 01:28:15 UTC
> - does 'onload' fire synchronously with whatever triggered the stop?

And does it fire at all?
Comment 9 Boris Zbarsky 2010-10-15 01:29:38 UTC
And in particular, if a subframe is navigated during the initial pageload of the parent, before onload has fired on the subframe, and hence before onload has fired on the parent but at the point when the parent's onload is just waiting on the subframe, then which, if any, of the two onload events get fired and when?
Comment 10 Ian 'Hixie' Hickson 2010-10-20 07:35:53 UTC
It seems meta refresh waits for the document to have loaded before taking effect, so I'll ignore that in further tests:

   http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/?nph-003.html

I can't get IE to run scripts at all until the whole document is downloaded and can be all parsed at once, which makes testing exactly what happens rather awkward:

   http://www.hixie.ch/tests/adhoc/dom/level0/timers/nph-010-demo.html

When it comes to the other browsers, it seems WebKit doesn't fire 'unload', but Gecko does, when an inline script just invokes the navigation algorithm; WebKit also seems to do it as a task so the parser doesn't abort until after it has run whatever is in its buffer (e.g. a script immediately following):

   http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/?nph-002.html
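
The difference between aborting the parser immediately and aborting it from a queued task can be sketched with a toy model. Everything here is an assumption made for illustration: the parse function, the "sync" and "queued" mode names, and the idea of treating the parser's buffer as a list of chunks. It does not reproduce any browser's implementation.

```javascript
// Toy model, not browser code. The "parser" walks a list of chunks; a chunk is
// either a string of markup or a function standing in for an inline script.
function parse(chunks, mode) {
  const log = [];    // what the parser actually processed
  const tasks = [];  // stand-in for the event loop's task queue
  let aborted = false;

  const navigate = () => {
    if (mode === "sync") {
      aborted = true;                         // abort the parser right away
    } else {
      tasks.push(() => { aborted = true; });  // abort later, as a task
    }
  };

  for (const chunk of chunks) {
    if (aborted) break;
    if (typeof chunk === "function") {
      chunk(navigate, log);
    } else {
      log.push(chunk);
    }
  }
  // Queued tasks only run once the parser has drained its buffer.
  for (const task of tasks) task();
  return { log, aborted };
}

const page = [
  "<p>before</p>",
  (navigate, log) => { log.push("script 1"); navigate(); },
  (navigate, log) => { log.push("script 2"); },
  "<p>after</p>",
];

console.log(parse(page, "sync").log);   // stops after "script 1"
console.log(parse(page, "queued").log); // also runs "script 2" and "<p>after</p>"
```

In the "queued" mode the script immediately following the navigation still runs, which matches the observation above about the parser running whatever is in its buffer before aborting.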
Comment 11 Ian 'Hixie' Hickson 2010-10-20 07:38:15 UTC
On that last test, Opera fires 'unload'; IE does not. Both do it immediately, not letting the parser finish.
Comment 12 Ian 'Hixie' Hickson 2010-10-20 07:40:01 UTC
(Anyone have any good ideas on how to do the following a link test? In particular, am I just doing something wrong with IE8 or is it possible to have IE8 parse incrementally?)
Comment 13 Boris Zbarsky 2010-10-20 19:31:46 UTC
I've certainly seen IE8 parse incrementally....
Comment 14 Ian 'Hixie' Hickson 2010-10-20 19:40:24 UTC
Actually 
   http://www.hixie.ch/tests/adhoc/dom/level0/timers/nph-010-demo.html
...suggests IE8 parses incrementally when it hits a <script src=""> but just doesn't paint, or something. Still investigating.
Comment 15 Ian 'Hixie' Hickson 2010-10-20 20:59:03 UTC
This test:
   http://www.hixie.ch/tests/adhoc/dom/level0/timers/nph-011-demo.html
...shows that IE does incremental rendering, but I had to use alert() to demonstrate it.
Comment 16 Ian 'Hixie' Hickson 2010-10-20 21:41:44 UTC
I really have no idea how to get IE8 to do this in a navigation test:
   http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/?nph-006.html
Comment 17 Ian 'Hixie' Hickson 2010-10-20 23:13:53 UTC
Testing Firefox on:
   http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/multiframe/002.html
...suggests the 'unload' happens as soon as the parser is spun up, but the previous page can still be showing (though interacting with it may not work). It does not happen before the parser is spun up.
Comment 18 Ian 'Hixie' Hickson 2010-10-21 00:11:59 UTC
If any Microsoft guys have any suggestions on how to test this in IE, please let me know. Failing that I'm probably just going to assume what Gecko does is what the spec should say, since it seems overall the sanest of the various browsers I've tested so far.
Comment 19 Adrian Bateman [MSFT] 2010-10-21 22:12:19 UTC
I'm not sure what we do here. Travis is investigating.
Comment 20 Ian 'Hixie' Hickson 2010-10-21 23:14:24 UTC
Firefox seems to be the only browser that doesn't block the parser while alert() is running, but all the browsers seem to agree that navigating to a javascript: URL that returns void() should not abort the parser altogether. WebKit further does the javascript: navigation async (i.e. it queues a task to do the navigation, rather than doing it sync), while Opera and IE do it sync.

Test: http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/?nph-008.html
Comment 21 Ian 'Hixie' Hickson 2010-10-21 23:17:29 UTC
http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/?nph-007.html confirms the above, modulo Opera, which seems to have a bug with javascript: navigation. Of the remaining browsers, IE is the only one to synchronously run the javascript: URL and navigate; the others all do it async.
Comment 22 Ian 'Hixie' Hickson 2010-10-21 23:23:44 UTC
http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/?nph-010.html shows that behaviour for schemes that don't interact with the browsing context, specifically mailto:, is all over the place. Firefox aborts the parser synchronously, Chrome queues the navigation then aborts the parser, Safari and IE don't abort it at all, and Opera ignores the navigation entirely in my setup for some reason.
Comment 23 Ian 'Hixie' Hickson 2010-10-21 23:41:17 UTC
http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/?nph-011.html says deferred scripts don't execute.
Comment 24 Ian 'Hixie' Hickson 2010-10-22 00:04:25 UTC
Because of many of the various differences listed above, Opera is the only browser that does what I think I'll end up speccing in the multiframe case:
   http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/multiframe/003.html
IE doesn't seem to bother loading the first file in the frame at all (I'll test this differently in 004). WebKit does what Opera does except it doesn't fire the unload like Gecko and Opera normally do. Gecko in this particular case is the odd one out; it fires the parent's onload as soon as the child's parser is aborted, instead of having the load event be delayed by the new document in the subframe as well.
Comment 25 Ian 'Hixie' Hickson 2010-10-22 00:08:59 UTC
I really don't understand what IE is doing here. It's like it's caching the whole document before doing anything: http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/multiframe/004.html
Comment 26 Ian 'Hixie' Hickson 2010-10-22 01:13:56 UTC
http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/multiframe/nph-005.html suggests that when a parser is aborted, any child parsers should be too.
Comment 27 Ian 'Hixie' Hickson 2010-10-22 01:27:47 UTC
Conclusions:
- On the same step of the navigation algorithm that says "Cancel any preexisting attempt to navigate", if the resource being navigated to is not a javascript: URL, then abort any parsers in the browsing context and its descendants.
- When we abort a parser, we should also abort any pending 'fetch' algorithms for that document.
- We definitely don't run its "The End" steps when a parser is aborted.
- There's no difference between whether the navigation is started from within the frame or without.
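
The conclusions above can be sketched as a small model. This is only an illustration under stated assumptions: the BrowsingContext class, its fields, and the abortParsers and navigate functions are hypothetical names, not spec text or any browser's code.

```javascript
// Illustrative model only; names are hypothetical, not spec terms.
class BrowsingContext {
  constructor(name, children = []) {
    this.name = name;
    this.children = children;   // nested browsing contexts (e.g. frames)
    this.parserActive = true;   // an active parser for the current document
    this.pendingFetches = 2;    // stand-in for pending 'fetch' algorithms
    this.ranTheEnd = false;     // whether the "The End" steps have run
  }
}

// Abort the parser in this browsing context and in all of its descendants.
function abortParsers(context) {
  if (context.parserActive) {
    context.parserActive = false;
    context.pendingFetches = 0; // pending fetches for the document are aborted too
    // Deliberately no "The End" steps here: ranTheEnd stays false.
  }
  for (const child of context.children) abortParsers(child);
}

// Only the part of navigation relevant to this bug is modelled. Note that the
// function takes no "source" argument: who started the navigation is irrelevant.
function navigate(context, url) {
  if (!/^javascript:/i.test(url)) abortParsers(context);
}

const inner = new BrowsingContext("inner");
const outer = new BrowsingContext("outer", [inner]);

navigate(outer, "javascript:void(0)");
console.log(outer.parserActive, inner.parserActive); // true true

navigate(outer, "http://example.com/");
console.log(outer.parserActive, inner.parserActive); // false false
```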
Comment 28 Ian 'Hixie' Hickson 2010-10-22 23:07:03 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Accepted
Change Description: see diff given below
Rationale: see above
Comment 29 contributor 2010-10-22 23:07:59 UTC
Checked in as WHATWG revision r5643.
Check-in comment: Define how location.href='foo' aborts parsing.
http://html5.org/tools/web-apps-tracker?from=5642&to=5643
Comment 30 Tony Gentilcore 2010-11-16 17:47:10 UTC
I'm working on implementing this in WebKit. Could you please confirm whether all of the expectations in http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/ are set properly?
Comment 31 Travis Leithead [MSFT] 2010-11-16 20:18:27 UTC
In response to Comment #19:

I reviewed IE's behavior under many of the test cases. It appears that in IE, the firing of unload is tied to whether or not a 'load' event was ever fired. The firing of the load event may or may not occur depending on the timing of the parser, which runs asynchronously while the new page is requested from the server (not blocking). At a point when the navigated web page is partially received, IE simply kills the old parser, which is why load may not fire.

Despite IE currently failing some of these tests, we are OK with the spec change.
Comment 32 Ian 'Hixie' Hickson 2010-12-07 02:19:40 UTC
(In reply to comment #30)
> I'm working on implementing this in WebKit. Could you please confirm whether
> all of the expectations in
> http://www.hixie.ch/tests/adhoc/html/navigation/interrupts/ are set properly?

I think they are, but I haven't verified closely recently. IIRC, some of them (e.g. the mailto: one) don't have a pass condition currently. If you see any errors drop me an e-mail.