This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
This was cloned from bug 16978 as part of operation convergence.
Originally filed: 2012-05-07 18:06:00 +0000
Original reporter: Addison Phillips <addison@lab126.com>

================================================================================
 #0   Addison Phillips   2012-05-07 18:06:51 +0000
--------------------------------------------------------------------------------
3.2.3.3 The lang and xml:lang attributes
http://www.w3.org/TR/html5/elements.html#the-lang-and-xml:lang-attributes (lang).

What does this mean:

--
If the resulting value is the empty string, then it must be interpreted as meaning that the language of the node is explicitly unknown.
--

Does an explicitly unknown language have any different effect? It might be a good idea to add text such as:

--
If the resulting value is the empty string, then it must be interpreted as meaning that the language of the node is explicitly unknown and any language specific processing that applied is implementation defined.
--

================================================================================
 #1   Ian 'Hixie' Hickson   2012-05-10 17:55:59 +0000
--------------------------------------------------------------------------------
I believe this is a duplicate of a previously existing bug with more discussion.

================================================================================
(The other bugs I had in mind don't cover this specific issue.) Addison: What effect would it have if lang="und"? Where is that defined? I'll try to use the same language. (I don't want to explicitly make them equivalent, because the unknown codes have to be passed through to CSS, OpenType, etc.)
(In reply to comment #1)
> (The other bugs I had in mind don't cover this specific issue.)
>
> Addison: What effect would it have if lang="und"? Where is that defined? I'll
> try to use the same language. (I don't want to explicitly make them equivalent,
> because the unknown codes have to be passed through to CSS, OpenType, etc.)

I see lang="und" as being slightly different from lang="", although BCP 47 makes them equivalent in meaning. 'und' is defined by ISO 639-2 and is incorporated along with 'zxx', 'mul', and 'mis'. The specific definitions are here:

http://tools.ietf.org/html/bcp47#section-4.1

See item #5, which has this sub-bullet about 'und':

* The 'und' (Undetermined) primary language subtag identifies linguistic content whose language is not determined. This subtag SHOULD NOT be used unless a language tag is required and language information is not available or cannot be determined. Omitting the language tag (where permitted) is preferred. The 'und' subtag might be useful for protocols that require a language tag to be provided or where a primary language subtag is required (such as in "und-Latn"). The 'und' subtag MAY also be useful when matching language tags in certain situations.

The way I see lang="und" being different from lang="" is probably the same thing you allude to in your comment: there is actually a value there and, as far as any HTML processor is aware, it might contain some meaning or be available for matching. The processor would have to look at the content of the attribute and determine that it is 'und' in order to determine the "undetermined-ness" of the language, which is something we want to avoid.

Hence: the 'und' tag should not be used in HTML5 (although it is not illegal to do so) because HTML5/HTML-next allows the empty string.
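
To make the distinction above concrete, here is a minimal sketch in Python, not taken from the HTML or BCP 47 specifications. The `Node` class and the function names are hypothetical stand-ins for a DOM element and a processor's internals, and the sketch deliberately ignores the other fallbacks a real user agent would consult when no ancestor carries a `lang` attribute. It only illustrates that the empty string can be recognized without inspecting any tag content, whereas 'und' has to be parsed out of the value.

```python
from typing import Optional


class Node:
    """Hypothetical stand-in for a DOM element (not a real DOM API)."""

    def __init__(self, lang: Optional[str] = None,
                 parent: Optional["Node"] = None) -> None:
        self.lang = lang      # None means the lang attribute is absent
        self.parent = parent


def node_language(node: Optional[Node]) -> Optional[str]:
    """Return the nearest lang value, walking up through the ancestors.

    Returns "" when the nearest value is the empty string (explicitly
    unknown) and None when no element in the chain has a lang attribute.
    Assumption: document-level fallbacks are out of scope for this sketch.
    """
    while node is not None:
        if node.lang is not None:
            return node.lang
        node = node.parent
    return None


def is_explicitly_unknown(tag: Optional[str]) -> bool:
    # lang="": recognizable without looking inside the value at all.
    return tag == ""


def is_undetermined(tag: Optional[str]) -> bool:
    # lang="und" or lang="und-Latn": the processor has to split the tag
    # and compare its primary subtag, which is the extra step at issue.
    return tag is not None and tag.split("-")[0].lower() == "und"


root = Node(lang="en")
section = Node(lang="", parent=root)   # resets the inherited "en"
paragraph = Node(parent=section)       # inherits the empty string

print(repr(node_language(paragraph)))                  # ''
print(is_explicitly_unknown(node_language(paragraph)))  # True
print(is_undetermined("und-Latn"))                      # True
```

In this sketch a value such as "und-Latn" would still be passed through unchanged to anything downstream, which matches the concern raised earlier about not collapsing the two cases into one.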
What part of that quoted text says what "effect" lang="und" has? Other than how the value is passed to other tools, how would lang="und" processing differ from lang="" according to the current specs? (i.e. is there anything required of user agents for one that is not required for the other?) I don't understand what you would like specified here.
*** Bug 16978 has been marked as a duplicate of this bug. ***