Re: ACTION-233: Update quality issue example to use the solution (XML in "script" tag) for standoff markup

2012/10/2 Phil Ritchie <philr@vistatec.ie>

> OK, understood. Hmm. I think use of the script element will break my
> implementation.


Just to be sure: does your implementation rely on JavaScript processing
with this standoff approach?

<span its-loc-quality-issue=its-loc-quality-issue
its-loc-quality-issue-comment="Sentence without capitalization"
its-loc-quality-issue-severity=30
its-loc-quality-issue-type=typographical></span>

FYI, the change in the toy example
http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/qaissues.js
basically meant adding a call to the XML parser and using different names
when reading attributes, e.g. its:locQualityIssueSeverity instead of
its-loc-quality-issue-severity. See the diff here:

[

-    var qielem = document.getElementById(qiref.substr(1));
-    var issues = qielem.childNodes;
     var issueslist = new String;
-    for(i=0; i<issues.length; i++) {
-	if(issues[i].nodeType==1) { issueslist = issueslist +
-				    issues[i].getAttribute('its-loc-quality-issue-type') + " "; } }
+    var parser = new DOMParser();
+    var standoffits = document.getElementById('its-standoff-1').textContent;
+    var doc = parser.parseFromString(standoffits,'application/xml');
+    var locqualityissues = doc.getElementsByTagNameNS('http://www.w3.org/2005/11/its','locQualityIssues');
+    for(var i=0; i<locqualityissues.length; i++)
+    {
+	// Only read the issues block whose xml:id matches the reference.
+	if (locqualityissues[i].getAttribute('xml:id') == qiref.substr(1))
+	{
+	    var issues = locqualityissues[i].childNodes;
+	    for(var j=0; j<issues.length; j++) {
+		if(issues[j].nodeType==1) { issueslist = issueslist +
+					    issues[j].getAttribute('locQualityIssueType') + " "; } }
+	}
+    }

]
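
The renaming in the diff looks regular enough to be done mechanically. Here is
a minimal sketch, assuming every HTML5 attribute name maps to its XML
counterpart by dropping the "its-" prefix and camel-casing the rest. The
helper names htmlAttrToXmlName and xmlNameToHtmlAttr are hypothetical and are
not part of qaissues.js.

```javascript
// Hypothetical helper: maps an HTML5 ITS attribute name to the local name
// of the matching attribute in the XML standoff markup.
function htmlAttrToXmlName(htmlName) {
  if (htmlName.indexOf('its-') !== 0) {
    throw new Error('Not an ITS HTML5 attribute: ' + htmlName);
  }
  // Drop the "its-" prefix, then turn each "-x" into "X".
  return htmlName
    .slice('its-'.length)
    .replace(/-([a-z])/g, function (match, letter) {
      return letter.toUpperCase();
    });
}

// Hypothetical inverse: maps an XML local name back to the HTML5 name.
function xmlNameToHtmlAttr(xmlName) {
  return 'its-' + xmlName.replace(/[A-Z]/g, function (letter) {
    return '-' + letter.toLowerCase();
  });
}

var xmlName = htmlAttrToXmlName('its-loc-quality-issue-severity');
var htmlName = xmlNameToHtmlAttr('locQualityIssueType');
```

With something like this, the same lookup code could in principle serve both
the "span" variant and the "script" variant of the example.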


Felix



> I'll have to check.
>
> Phil.
>
>
>
>
>
> From:        Felix Sasaki <fsasaki@w3.org>
> To:        Phil Ritchie <philr@vistatec.ie>,
> Cc:        public-multilingualweb-lt@w3.org
> Date:        02/10/2012 11:04
> Subject:        Re: ACTION-233: Update quality issue example to use the
> solution (XML in "script" tag) for standoff markup
> ------------------------------
>
>
>
>
>
> 2012/10/2 Phil Ritchie <philr@vistatec.ie>
> Felix
>
> Before I can answer the question can you tell me what the motivation for
> using the script tags is?
>
> There are two motivations. One is based on
> https://www.w3.org/International/multilingualweb/lt/wiki/LSP_Localization_Chain_Side_Use_Case_Demonstration
> here you have ITS rules files inside HTML5. It seems that this is a
> requirement from Linguaserve: not rules linked, but inside HTML5. So far
> Linguaserve has put the rules files "just somewhere". That makes it invalid
> HTML5. With the rules in the "script" element, it becomes valid again.
> The other motivation is that the standoff we had so far for HTML5 looked
> like this:
>
>  <span its-loc-quality-issues-ref=#lq1>c'es</span> le contenu</p>
>                 <span id=lq1 its-loc-quality-issues=its-loc-quality-issues>
>                     <span
>                         its-loc-quality-issue=its-loc-quality-issue
>                         its-loc-quality-issue-comment="Sentence without capitalization"
>                         its-loc-quality-issue-severity=30
>                         its-loc-quality-issue-type=typographical></span>
>                     <span
>                         its-loc-quality-issue=its-loc-quality-issue
>                         its-loc-quality-issue-comment="'c'es' is unknown. Could be 'c'est'"
>                         its-loc-quality-issue-severity=50
>                         its-loc-quality-issue-type=misspelling></span>
>                 </span>
>
>
>
>  "span" is mis-used to "transport" standoff metadata in the "body"
> element. It works, but is not very clean. Hence "script" which is defined
> for that purpose, see
> http://dev.w3.org/html5/spec/the-script-element.html
>
> about "application/xml" and other types:
> "These types are explicitly listed here because they are poorly-defined
> types that are nonetheless likely to be used as formats for data blocks,
> and it would be problematic if they were suddenly to be interpreted as
> script by a user agent."
> Jirka mentioned this solution on the afternoon of the 26th, see
> http://www.w3.org/2012/09/26-mlw-lt-minutes.html
> (search for "current recommendation is to put the tool info xml into script
> in html"), and pointed us to the related DOM methods:
> https://developer.mozilla.org/en-US/docs/DOM/DOMParser
>
> Felix
>
>
> My demo in Prague used standoff without needing to wrap them in script
> tags.
>
> Phil.
>
>
>
>
>
> From:        Felix Sasaki <fsasaki@w3.org>
> To:        public-multilingualweb-lt@w3.org,
>
> Date:        02/10/2012 09:17
> Subject:        ACTION-233: Update quality issue example to use the
> solution (XML in  "script" tag) for standoff markup
>  ------------------------------
>
>
>
>
> Hi all,
>
> I updated the qaissue example to use XML in the script element, see
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-locQualityIssue-html5-local-2
> the standoff metadata is now in a dedicated "script" element. See also
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/EX-locQualityIssue-html5-local-2.html
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/qaissues.js
>
> So this works, but I have a question for the implementors using HTML5 as an
> input for processing outside the browser.
> If you process
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/examples/html5/EX-locQualityIssue-html5-local-2.html
> with the validator.nu HTML5 parser, the content
> of "script" is not "seen" as XML. The output then is
>
> <html xmlns="http://www.w3.org/1999/xhtml">...
>
> <script type="application/xml" id="its-standoff-1">
>   &lt;its:locQualityIssues xml:id="lq1" xmlns:its="http://www.w3.org/2005/11/its"&gt;
>    &lt;its:locQualityIssue
>     locQualityIssueType="misspelling"
>     locQualityIssueComment="'c'es' is unknown. Could be 'c'est'"
>     locQualityIssueSeverity="50"/&gt;
>    &lt;its:locQualityIssue
>     locQualityIssueType="typographical"
>     locQualityIssueComment="Sentence without capitalization"
>     locQualityIssueSeverity="30"/&gt;
>   &lt;/its:locQualityIssues&gt;
> </script>...</html>
>
> So if we had an XML-based tool that wants to pick up the ITS
> standoff information, it won't work.
> Currently, Linguaserve is using this approach
> https://www.w3.org/International/multilingualweb/lt/wiki/LSP_Localization_Chain_Side_Use_Case_Demonstration
> to embed ITS rules into an HTML file. I had hoped that the "script"
> element would have been an alternative - is it?
> I'm sure this is not a difficult problem, but we probably need some
> guidance for implementors who are not used to processing HTML5.
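
As a possible starting point for such guidance, here is a minimal sketch,
assuming the tool only sees the serialized parser output as a string, with the
"script" content escaped as shown above. The function name unescapeScriptText
is hypothetical. It only reverses the five predefined XML entities, and the
result would still have to be handed to a real XML parser.

```javascript
// Hypothetical helper: reverses the escaping of the five predefined XML
// entities in text taken from the serialized "script" element.
var ENTITIES = { lt: '<', gt: '>', amp: '&', quot: '"', apos: "'" };

function unescapeScriptText(text) {
  // Single pass, so "&amp;lt;" becomes "&lt;" and is not unescaped twice.
  return text.replace(/&(lt|gt|amp|quot|apos);/g, function (match, name) {
    return ENTITIES[name];
  });
}

var escaped = '&lt;its:locQualityIssue locQualityIssueSeverity="50"/&gt;';
var xmlSource = unescapeScriptText(escaped);
// xmlSource can now be passed to whatever XML parser the tool uses.
```

A tool that reads the parser output through a DOM or SAX interface would
normally get the text already unescaped, so this step may only matter when
working on the serialized string.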
>
> Felix
>
> --
> Felix Sasaki
> DFKI / W3C Fellow
>
>
> ************************************************************
> This email and any files transmitted with it are confidential and
> intended solely for the use of the individual or entity to whom they
> are addressed. If you have received this email in error please notify
> the sender immediately by e-mail.
>
> www.vistatec.com
> ************************************************************
>
>
>
>
>
>


-- 
Felix Sasaki
DFKI / W3C Fellow

Received on Tuesday, 2 October 2012 11:33:44 UTC