<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>17864</bug_id>
          
          <creation_ts>2012-07-18 07:08:23 +0000</creation_ts>
          <short_desc>i18n-ISSUE-118: explicitly undefined language</short_desc>
          <delta_ts>2015-06-17 02:58:03 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WHATWG</product>
          <component>HTML</component>
          <version>unspecified</version>
          <rep_platform>Other</rep_platform>
          <op_sys>other</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>NEEDSINFO</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P3</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>Unsorted</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter>contributor</reporter>
          <assigned_to name="Ian &apos;Hixie&apos; Hickson">ian</assigned_to>
          <cc>addison</cc>
    
    <cc>ian</cc>
    
    <cc>mike</cc>
    
    <cc>public-i18n-core</cc>
          
          <qa_contact>contributor</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>70198</commentid>
    <comment_count>0</comment_count>
    <who name="">contributor</who>
    <bug_when>2012-07-18 07:08:23 +0000</bug_when>
<thetext>This was cloned from bug 16978 as part of operation convergence.
Originally filed: 2012-05-07 18:06:00 +0000
Original reporter: Addison Phillips &lt;addison@lab126.com&gt;

================================================================================
 #0   Addison Phillips                                2012-05-07 18:06:51 +0000 
--------------------------------------------------------------------------------
3.2.3.3 The lang and xml:lang attributes
http://www.w3.org/TR/html5/elements.html#the-lang-and-xml:lang-attributes

(lang). What does this mean:

--
If the resulting value is the empty string, then it must be interpreted as meaning that the language of the node is explicitly unknown.
--

Does an explicitly unknown language have any different effect? It might be a good idea to add text such as:

--
If the resulting value is the empty string, then it must be interpreted as meaning that the language of the node is explicitly unknown and any language-specific processing that is applied is implementation-defined.
--
================================================================================
 #1   Ian &apos;Hixie&apos; Hickson                             2012-05-10 17:55:59 +0000 
--------------------------------------------------------------------------------
I believe this is a duplicate of a previously existing bug with more discussion.
================================================================================</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>74821</commentid>
    <comment_count>1</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2012-09-28 18:05:09 +0000</bug_when>
    <thetext>(The other bugs I had in mind don&apos;t cover this specific issue.)

Addison: What effect would it have if lang=&quot;und&quot;? Where is that defined? I&apos;ll try to use the same language. (I don&apos;t want to explicitly make them equivalent, because the unknown codes have to be passed through to CSS, OpenType, etc.)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>74824</commentid>
    <comment_count>2</comment_count>
    <who name="Addison Phillips">addison</who>
    <bug_when>2012-09-28 18:20:18 +0000</bug_when>
    <thetext>(In reply to comment #1)
&gt; (The other bugs I had in mind don&apos;t cover this specific issue.)
&gt; 
&gt; Addison: What effect would it have if lang=&quot;und&quot;? Where is that defined? I&apos;ll
&gt; try to use the same language. (I don&apos;t want to explicitly make them equivalent,
&gt; because the unknown codes have to be passed through to CSS, OpenType, etc.)

I see lang=&quot;und&quot; as being slightly different from lang=&quot;&quot;, although BCP 47 makes them equivalent in meaning. &apos;und&apos; is defined by ISO 639-2 and is incorporated along with &apos;zxx&apos;, &apos;mul&apos;, and &apos;mis&apos;. The specific definitions are here:

  http://tools.ietf.org/html/bcp47#section-4.1

See item #5, which has this sub-bullet about &apos;und&apos;:

       *  The &apos;und&apos; (Undetermined) primary language subtag identifies
          linguistic content whose language is not determined.  This
          subtag SHOULD NOT be used unless a language tag is required
          and language information is not available or cannot be
          determined.  Omitting the language tag (where permitted) is
          preferred.  The &apos;und&apos; subtag might be useful for protocols
          that require a language tag to be provided or where a primary
          language subtag is required (such as in &quot;und-Latn&quot;).  The
          &apos;und&apos; subtag MAY also be useful when matching language tags in
          certain situations.

The way I see lang=&quot;und&quot; being different from lang=&quot;&quot; is probably the same thing you allude to in your comment: there is actually a value there and, as far as any HTML processor is aware, it might contain some meaning or be available for matching. The processor would have to look at the content of the attribute and determine that it is &apos;und&apos; in order to determine the &quot;undetermined-ness&quot; of the language, which is something we want to avoid. Hence: the &apos;und&apos; tag should not be used in HTML5 (although it is not illegal to do so) because HTML5/HTML-next allows the empty string.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>80689</commentid>
    <comment_count>3</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2012-12-30 00:42:08 +0000</bug_when>
    <thetext>What part of that quoted text says what &quot;effect&quot; lang=&quot;und&quot; has? Other than how the value is passed to other tools, how would lang=&quot;und&quot; processing differ from lang=&quot;&quot; according to the current specs? (i.e. is there anything required of user agents for one that is not required for the other?)

I don&apos;t understand what you would like specified here.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>121148</commentid>
    <comment_count>4</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2015-06-17 02:58:03 +0000</bug_when>
    <thetext>*** Bug 16978 has been marked as a duplicate of this bug. ***</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>