<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>25362</bug_id>
          
          <creation_ts>2014-04-16 07:24:48 +0000</creation_ts>
          <short_desc>Proposals for language tag checking functionality</short_desc>
          <delta_ts>2014-04-16 07:24:48 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>HTML Checker</product>
          <component>General</component>
          <version>unspecified</version>
          <rep_platform>PC</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Felix Sasaki">fsasaki</reporter>
          <assigned_to name="Michael[tm] Smith">mike+validator</assigned_to>
          
          
          <qa_contact name="qa-dev tracking">www-validator-cvs</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>103939</commentid>
    <comment_count>0</comment_count>
    <who name="Felix Sasaki">fsasaki</who>
    <bug_when>2014-04-16 07:24:48 +0000</bug_when>
    <thetext>Here are a few proposals for the language tag checking functionality, based on experience developers gained while deploying and modifying the validator.nu library at
https://code.google.com/p/okapi-xliff-toolkit/source/browse/okapi/libraries/lib-xliff/src/main/java/net/sf/okapi/lib/xliff2/lang/Language.java

1) Private use tags
Validating this document 
&lt;!DOCTYPE html&gt;
&lt;html lang=&quot;de-x-a&quot;&gt; ...&lt;/html&gt;
creates this error message:
&quot;Bad value de-x-a for attribute lang on element html: Private use subtag a is too short.&quot;
But such a value should be OK: per the BCP47 syntax, private use subtags can be 1 to 8 characters long.

Looking at this code
https://whattf.svn.cvsdude.com/syntax/trunk/relaxng/datatype/java/src/org/whattf/datatype/Language.java
(I am not sure whether this is the correct place to look.)
This could be fixed by replacing
&quot;subtag.length() &lt; 2&quot; with &quot;subtag.length() &lt; 1&quot;
in the following check:
if (subtag.length() &lt; 2) {
    throw newDatatypeException(&quot;Private use subtag &quot;, subtag, &quot; is too short.&quot;);
}
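
For illustration, here is a minimal, self-contained sketch of the relaxed length rule. It is not the validator.nu code: the class PrivateUseCheck and the method checkPrivateUse are hypothetical names, and the sketch only looks at the private use part of a tag, assuming the BCP47 rule that each private use subtag has 1 to 8 alphanumeric characters.

```java
import java.util.Locale;

class PrivateUseCheck {
    // Hypothetical helper (not part of validator.nu): returns an error
    // message for the private use part of a tag, or null if that part is fine.
    static String checkPrivateUse(String tag) {
        String[] subtags = tag.toLowerCase(Locale.ROOT).split("-", -1);
        boolean inPrivateUse = false;
        int privateUseCount = 0;
        for (String subtag : subtags) {
            if (!inPrivateUse) {
                // The singleton "x" starts the private use sequence.
                inPrivateUse = subtag.equals("x");
                continue;
            }
            privateUseCount++;
            if (subtag.isEmpty()) {
                // Length 0 is the only length that is too short.
                return "Private use subtag is empty.";
            }
            if (subtag.length() > 8) {
                return "Private use subtag " + subtag + " is too long.";
            }
            if (!subtag.matches("[a-z0-9]+")) {
                return "Private use subtag " + subtag + " is not alphanumeric.";
            }
        }
        if (inPrivateUse) {
            if (privateUseCount == 0) {
                return "No subtag after the private use singleton x.";
            }
        }
        return null;
    }
}
```

With this rule the tag de-x-a passes, while de-x (no subtag after x) is still rejected.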

2) Severity of SHOULD NOT violations
Some language tag issues are reported as errors. For example, for
&lt;html lang=&quot;de-latn-de&quot;&gt;
you get
&quot;Bad value de-latn-de for attribute lang on element html: Language tag should omit the default script for the language.&quot;
It may make sense to report such issues as warnings rather than errors, since
they are based on SHOULD NOT statements in BCP47.

Code fixes for 2) have been made in 
https://code.google.com/p/okapi-xliff-toolkit/source/browse/okapi/libraries/lib-xliff/src/main/java/net/sf/okapi/lib/xliff2/lang/Language.java
by prefixing &quot;Warning: &quot; to messages that report SHOULD NOT violations. So the fastest fix may be to reclassify all errors whose message starts with &quot;Warning: &quot; as warnings in the W3C validator.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>