Bug 20319 - Parser issue with AAA
Alias: None
Product: HTML WG
Classification: Unclassified
Component: HTML5 spec
Version: unspecified
Hardware: All All
Importance: P2 critical
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: HTML WG Bugzilla archive list
Whiteboard: exclusion
Keywords: CR
Depends on:
Reported: 2012-12-10 04:40 UTC by Ian 'Hixie' Hickson
Modified: 2013-08-02 00:07 UTC
Description Ian 'Hixie' Hickson 2012-12-10 04:40:50 UTC
The HTML parser misorders the text nodes in this example:


Given the importance of the HTML parser, this should probably be fixed before CR.
Comment 1 Michael[tm] Smith 2012-12-18 09:32:40 UTC
So for the simple case of ignoring the "If inner loop counter is greater than or equal to three, then go to the next step in the overall algorithm." part of the AAA, <b><i><a><s><tt><div></b>first</b></div></tt></s></a>second</i> parses into <b><i><a><s><tt></tt></s></a></i></b><i><a><s><tt><div><b></b>first</div></tt></s></a>second</i>

At least that's what I get from hacking html5lib to ignore the "greater than or equal to three" limit.
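
One way to sanity-check the text-node order in a serialized result like the one above is sketched below. It uses only Python's standard `html.parser` tokenizer, not html5lib and not the AAA itself, so it works only on markup that is already well-formed. The names `TextOrderCollector`, `text_nodes_in_order`, and `RESULT` are hypothetical, made up for this illustration.

```python
from html.parser import HTMLParser


class TextOrderCollector(HTMLParser):
    """Record each text node together with the tags open around it."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of currently open tag names
        self.text_nodes = []  # (text, ancestor path) in document order

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Assumption: input is well-formed; a mismatched end tag is an error.
        if not self.open_tags or self.open_tags[-1] != tag:
            raise ValueError(f"unexpected </{tag}>")
        self.open_tags.pop()

    def handle_data(self, data):
        self.text_nodes.append((data, tuple(self.open_tags)))


def text_nodes_in_order(markup):
    """Return [(text, ancestors), ...] in document order."""
    collector = TextOrderCollector()
    collector.feed(markup)
    collector.close()
    return collector.text_nodes


# Serialized tree quoted in Comment 1 (limit of three ignored).
RESULT = (
    "<b><i><a><s><tt></tt></s></a></i></b>"
    "<i><a><s><tt><div><b></b>first</div></tt></s></a>second</i>"
)

if __name__ == "__main__":
    for text, ancestors in text_nodes_in_order(RESULT):
        print(text, "inside", " > ".join(ancestors))
```

For the serialization quoted above, this reports "first" (inside i > a > s > tt > div) before "second" (inside i), i.e. the text nodes are in source order once the limit is ignored.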
Comment 2 Travis Leithead [MSFT] 2013-07-22 23:25:04 UTC
See: http://html5.org/tools/web-apps-tracker?from=5641&to=5642

It seems a little weird to want to "fix" this bug when all of the major recent browsers are 100% consistent in their parsing behavior in this scenario; that is, we've successfully achieved an interoperable HTML5 parser!

What's so bad about leaving the current limits in place? I.e., is there a site compatibility bug that is motivating this change, or is it simply altruistic? The linked bug above describes "practical limits" for the AAA which seem to have resulted in this problem. Yet, if this isn't really a "problem" for web content somewhere, then why fix it?

My inclination is to Won't Fix this bug and call this an interesting anomaly of the AAA (and move on). There are enough other bizarre features of the HTML5 parser that I'm pretty comfortable with that idea.
Comment 3 Travis Leithead [MSFT] 2013-08-01 23:23:13 UTC
Ian, you appear to have addressed this in the WHATWG spec:

The question now becomes, will implementations adjust to match?
Comment 4 Travis Leithead [MSFT] 2013-08-02 00:07:47 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the Editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the Tracker Issue; or you may create a Tracker Issue
yourself, if you are able to do so. For more details, see this document:


Status: Accepted
Change Description: Applied existing patch (see below)

Chose to adopt two WHATWG/HTML5.1 patches to fix this bug in a manner harmonious with the WHATWG spec:

Cleanup the AAA algorithm: https://github.com/w3c/html/commit/9adb3bdc0ad9d12554a33e249dab52f265dfb3c2
Apply fix that prevents this out-of-order issue: