This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 12246 - Say 'the/a BOM character' throughout - there aren't different 'BOMs' (plural)
Summary: Say 'the/a BOM character' throughout - there aren't different 'BOMs' (plural)
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: LC1 HTML5 spec
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL: http://dev.w3.org/html5/spec/parsing#...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-03-05 04:53 UTC by Leif Halvard Silli
Modified: 2011-08-04 05:13 UTC
CC List: 5 users

Description Leif Halvard Silli 2011-03-05 04:53:05 UTC
'8.2.2.1 Determining the character encoding' and '4.3.1 The script element' both say:

]] This step looks for Unicode Byte Order Marks (BOMs). [[

Nearly everywhere else the BOM is referred to, the spec says "a Byte Order Mark character".
(E.g. http://dev.w3.org/html5/spec/offline#writing-cache-manifests says: ]] a U+FEFF BYTE ORDER MARK (BOM) character [[)

Please use the same or a similar expression here. Also, it is misleading to use the plural form: although it can be encoded in at least 3 ways, there is only one Byte Order Mark character.
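
To illustrate the point about one character with several encoded forms, here is a minimal Python sketch. It assumes the three encodings meant are UTF-8, UTF-16BE and UTF-16LE; the names `BOM` and `encoded_forms` are made up for this example and do not come from the spec.

```python
# One abstract character: U+FEFF BYTE ORDER MARK.
BOM = "\ufeff"

# Assumption: the "at least 3 ways" are UTF-8, UTF-16BE and UTF-16LE.
encoded_forms = {
    "utf-8": BOM.encode("utf-8"),
    "utf-16-be": BOM.encode("utf-16-be"),
    "utf-16-le": BOM.encode("utf-16-le"),
}

for name, raw in encoded_forms.items():
    # Different byte sequences, same single code point once decoded.
    print(name, raw.hex(), raw.decode(name) == BOM)
```

Each byte sequence decodes back to the same single code point, which is the distinction between "the BOM character" and its encoded forms.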

Hence, please change the above quote into roughly this:

    ]] This step looks for the Unicode Byte Order Mark (BOM) character. [[

Likewise, the '8.2.2.2 Character encodings' section currently reads:
http://dev.w3.org/html5/spec/parsing#character-encodings-0

]] When a user agent is to use the UTF-16 encoding but no BOM has been found, user agents must default to UTF-16LE. [[

Again, 'no BOM' is not as clear as saying 'but the BOM has not been found'.
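
The rule quoted above could be sketched roughly as follows in Python. This is only an illustration, not the spec's algorithm: `sniff_bom` and `decode_utf16` are hypothetical helper names, and UTF-32 is ignored for simplicity.

```python
def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is found."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8", 3
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be", 2
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le", 2
    return None, 0


def decode_utf16(data: bytes) -> str:
    """Decode bytes already known to be UTF-16 (hypothetical helper)."""
    encoding, skip = sniff_bom(data)
    if encoding not in ("utf-16-be", "utf-16-le"):
        # No UTF-16 BOM found: default to UTF-16LE, per the quoted text.
        encoding, skip = "utf-16-le", 0
    return data[skip:].decode(encoding)


print(decode_utf16(b"\xfe\xff\x00A"))  # big-endian BOM present
print(decode_utf16(b"A\x00"))          # no BOM, UTF-16LE default applies
```

Both calls decode to the same one-character string; only the second relies on the UTF-16LE default.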

Likewise, '4.2.5.5 Specifying the document's character encoding' says:
http://dev.w3.org/html5/spec/semantics.html#charset

]] If an HTML document does not start with a BOM, [[

Please say "a BOM character" or (as I would prefer) "the BOM character".
Comment 1 Ian 'Hixie' Hickson 2011-05-06 00:26:05 UTC
Please file just one issue per bug.

EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: 

The premise of the first bit is wrong; the spec uses all kinds of ways to refer to BOMs, depending on whether it's talking about BOMs in general, or the character, or whether it's a normative reference, or a non-normative casual mention, etc. This is demonstrated by the multiple such examples that this very bug mentions.

For the second bit: it's just as clear.

For the third bit: I don't see what problem this solves. The current text is fine.
Comment 2 Michael[tm] Smith 2011-08-04 05:13:56 UTC
mass-move component to LC1