This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 19980 - Clarify "emit the word continue"
Summary: Clarify "emit the word continue"
Status: RESOLVED FIXED
Alias: None
Product: WHATWG
Classification: Unclassified
Component: Encoding
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: Unsorted
Assignee: Anne
QA Contact: sideshowbarker+encodingspec
URL: http://encoding.spec.whatwg.org/#enco...
Reported: 2012-11-16 09:35 UTC by Martin Dürst
Modified: 2012-11-16 14:54 UTC

Description Martin Dürst 2012-11-16 09:35:42 UTC
At http://encoding.spec.whatwg.org/#encodings, the spec says:

"A decoder can end by emitting a code point or decoder error, or the word continue."

This may be just editorial, but I don't understand what it means to "emit the word continue". Does it mean that "continue" ends up in the output? "continue" appears in many places in the rest of the spec, but it seems to be something that's done, not emitted.
Comment 1 Simon Pieters 2012-11-16 09:42:08 UTC
Ambiguity in English? :-)

I guess this was meant: A decoder can end by (emitting a code point or decoder error), or (the word continue).

Suggested replacement:

A decoder can end by the word continue, by emitting a code point, or by emitting a decoder error.
Comment 2 Martin Dürst 2012-11-16 10:41:15 UTC
(In reply to comment #1)

> Suggested replacement:
> 
> A decoder can end by the word continue, by emitting a code point, or by
> emitting a decoder error.

I think this starts to make things clearer. At least "continue" is no longer emitted.

But the fundamental problem seems to be that a decoder is defined as follows:
"A decoder algorithm takes a stream of bytes and emits a stream of code points."
This I think is the right definition, see also https://www.w3.org/Bugs/Public/show_bug.cgi?id=19979. But this would mean that a decoder only 'ends' once it sees an EOF byte.

The sentence in question in this bug, however, seems to be written under the assumption that a decoder "ends" for every code point read in/every error produced. But then the "continue" part also seems to imply that a decoder ends when the word "continue" is seen. As far as I have been able to figure out (e.g. in the UTF-8 decoder), this:
a) just means that the logic moves back to point 1) (it would be very good if the spec said that somewhere, maybe here), and
b) may happen in the middle of decoding a code point (which means that the decoder in no way "ended", because all its variables are still kept)
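
To illustrate points a) and b), here is a minimal Python sketch of a UTF-8-style handler that is called once per byte. It is not the spec's algorithm: the names `Utf8Sketch`, `CONTINUE`, `ERROR`, and `EOF` are hypothetical, and overlong and surrogate checks are omitted. The point is only that returning "continue" puts nothing in the output and leaves the decoder's variables intact for the next call.

```python
CONTINUE = object()  # hypothetical sentinel: nothing to emit yet, read the next byte
ERROR = object()     # hypothetical sentinel standing in for a decoder error
EOF = None           # hypothetical marker for the end of the byte stream


class Utf8Sketch:
    """Simplified UTF-8 handler (assumption: no overlong or surrogate checks)."""

    def __init__(self):
        self.code_point = 0    # bits gathered so far
        self.bytes_needed = 0  # continuation bytes still expected

    def handler(self, byte):
        if byte is EOF:
            if self.bytes_needed:
                self.bytes_needed = 0
                return ERROR   # stream ended in the middle of a sequence
            return EOF
        if self.bytes_needed == 0:
            if byte <= 0x7F:
                return byte    # ASCII: emit a code point right away
            if 0xC2 <= byte <= 0xDF:
                self.bytes_needed, self.code_point = 1, byte & 0x1F
            elif 0xE0 <= byte <= 0xEF:
                self.bytes_needed, self.code_point = 2, byte & 0x0F
            elif 0xF0 <= byte <= 0xF4:
                self.bytes_needed, self.code_point = 3, byte & 0x07
            else:
                return ERROR
            return CONTINUE    # state is kept; nothing is emitted
        if not 0x80 <= byte <= 0xBF:
            self.code_point = self.bytes_needed = 0
            return ERROR       # simplified: the offending byte is not reprocessed
        self.code_point = (self.code_point << 6) | (byte & 0x3F)
        self.bytes_needed -= 1
        if self.bytes_needed:
            return CONTINUE
        cp, self.code_point = self.code_point, 0
        return cp


decoder = Utf8Sketch()
first = decoder.handler(0xE2)   # lead byte of U+20AC: "continue", state retained
second = decoder.handler(0x82)  # still in the middle of the code point
third = decoder.handler(0xAC)   # sequence complete: a code point is emitted
```

In this sketch the word "continue" never reaches the output; it only tells the caller to feed the next byte to the same decoder instance.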
Comment 3 Anne 2012-11-16 12:24:33 UTC
So it does say it is invoked again now: "A decoder can end by emitting a code point or decoder error, or the word continue. Unless the EOF code point is emitted, the decoder algorithm must be invoked again."

How about: "A decoder must be invoked again when the word continue is used, when a code point is emitted that is not the EOF code point, or when a decoder error is emitted."
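
The proposed wording can be read as a driver loop around the decoder. The following Python sketch shows that reading under stated assumptions: `run`, `make_toy_handler`, and the sentinels are hypothetical names, the toy decoder simply pairs bytes big-endian (no surrogate handling), and replacing a decoder error with U+FFFD is one possible error policy rather than something this bug specifies.

```python
CONTINUE, ERROR, EOF = object(), object(), object()  # hypothetical sentinels


def make_toy_handler():
    """Toy two-byte big-endian decoder (assumption: no surrogate handling)."""
    lead = None

    def handler(byte):
        nonlocal lead
        if byte is EOF:
            if lead is not None:
                lead = None
                return ERROR   # dangling lead byte at end of stream
            return EOF
        if lead is None:
            lead = byte
            return CONTINUE    # need one more byte; state is kept in `lead`
        cp, lead = (lead << 8) | byte, None
        return cp

    return handler


def run(handler, data):
    """Invoke the handler again after continue, a code point, or an error."""
    out = []
    stream = iter(data)
    while True:
        result = handler(next(stream, EOF))
        if result is EOF:
            return out         # only EOF stops the loop
        if result is CONTINUE:
            continue           # nothing emitted; just invoke again
        # Assumed error policy: substitute U+FFFD for a decoder error.
        out.append("\uFFFD" if result is ERROR else chr(result))


decoded = run(make_toy_handler(), b"\x00A\x20\xAC")
```

Here "continue" is a control signal to the loop and never appears in `decoded`; only code points (or the substitute for an error) do.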