This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 21831 - [HTML] editorial: 12.2: error handling for parse errors
Status: RESOLVED FIXED
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
Reported: 2013-04-25 19:29 UTC by Michael Dyck
Modified: 2013-06-12 22:28 UTC

Description Michael Dyck 2013-04-25 19:29:58 UTC
12.2 "Parsing HTML documents" says:

    The error handling for parse errors is well-defined: user agents must
    either act as described below when encountering such problems, or must
    abort processing at the first error that they encounter for which they
    do not wish to apply the rules described in this specification.

But it's not clear what "act as described below" refers to. The subsequent paragraphs in the section apply only to conformance checkers.

Similarly, it's not clear what "the rules described in this specification
[for parse errors]" are.

My guess is that they're referring to phrasing such as:
    Parse error. Emit the current input character as a character token.
(in 12.2.4.1 Data state). That is, if the user agent arrives at this point,
it can either "abort processing" (though it's not entirely clear what that
entails), or it can "act as described" (i.e. emit the current input character
as a character token), and proceed as if nothing untoward had happened.
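
A minimal sketch of the two options in that reading, in Python. It models only the U+0000 NULL branch of the data state, which is assumed here to be where the quoted "Parse error. Emit the current input character as a character token." text applies. The names `run_data_state`, `DataStateResult`, `ParseAborted` and the `abort_on_error` flag are hypothetical and do not come from the spec, and raising an exception is just one guess at what "abort processing" could mean.

```python
from dataclasses import dataclass, field


class ParseAborted(Exception):
    """Raised when the (hypothetical) user agent stops at a parse error."""


@dataclass
class DataStateResult:
    tokens: list = field(default_factory=list)  # emitted character tokens
    errors: list = field(default_factory=list)  # offsets of parse errors seen


def run_data_state(text, abort_on_error=False):
    """Model only the U+0000 branch of the data state.

    Assumption: the input contains no '<' or '&'. Those characters switch
    tokenizer states in the spec and are deliberately left out of this sketch.
    """
    result = DataStateResult()
    for offset, ch in enumerate(text):
        if ch == "\x00":
            # "Parse error."
            result.errors.append(offset)
            if abort_on_error:
                # Option 1: abort processing at the first such error.
                raise ParseAborted(f"parse error at offset {offset}")
            # Option 2: act as described, i.e. fall through and emit the
            # current input character as a character token.
        result.tokens.append(("character", ch))
    return result
```

With `abort_on_error=False` the NULL character is recorded as an error and still emitted as a character token; with `abort_on_error=True` nothing after the first error is processed.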

(And while a conformance checker can do either, it also has a duty to report
the error, modulo the existence of other errors. Other user agents don't appear
to have even the /option/ to report errors, though presumably they do have that
option, e.g. via an Error Console.)

However, there are also places where the spec simply identifies a parse error,
without any accompanying [recovery] behaviour. E.g.:
    12.2.2.4 Preprocessing the input stream
        Any occurrences of any characters in the ranges ... are parse errors.
    12.2.4 Tokenization
        When an end tag token is emitted with attributes, that is a parse error.
If a user agent encounters these situations, and doesn't want to "abort
processing", what can/must it do?
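
To make the question concrete, here is a sketch in Python of one possible reading, in which a parse error with no stated recovery step is simply noted and processing continues unchanged. This is an assumption for illustration, not something the quoted text says. `EndTagToken`, `emit_end_tag`, `emitted` and `parse_errors` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class EndTagToken:
    name: str
    attributes: dict = field(default_factory=dict)


def emit_end_tag(token, emitted, parse_errors):
    """Emit an end tag token, noting a parse error if it carries attributes."""
    if token.attributes:
        # The quoted text only says this "is a parse error" and names no
        # recovery step, so this sketch just records the error.
        parse_errors.append(f"end tag </{token.name}> has attributes")
    # Assumption: the token is still emitted, unchanged, either way.
    emitted.append(token)
```

Under this reading the only observable difference between the error and non-error cases is the entry in `parse_errors`; whether that is the intended behaviour is exactly what the report asks the spec to state.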

In summary, this is a request to make all these points clearer.

Comment 1 Ian 'Hixie' Hickson 2013-06-05 21:08:57 UTC
Good point. I've tried to fix it. Let me know if the new text is sufficient.

I've not mentioned that any UA can report stuff to the user, because I don't know what it would mean for us to not implicitly allow that already.

Comment 2 contributor 2013-06-05 21:09:19 UTC
Checked in as WHATWG revision r7912.
Check-in comment: Try to clarify how UAs can abort at parse errors.
http://html5.org/tools/web-apps-tracker?from=7911&to=7912