Re: ACTION-447: Make a batch transformation of the test suite to xliff

On 21.02.13 20:55, Fredrik Liden wrote:
>
> Hi Marcin and Yves,
>
> Regarding the comment for domain4html.html: trans-unit id=1 is 
> actually the extraction of the content of <meta name="keywords" …>.
>
> One of the default global HTML5 rules (in Okapi) specifies that the 
> content of <meta name="keywords" …> is translatable, and a second rule 
> specifies that it contains the default domain for the entire <html> 
> element.
>
> The rules in the <script> element apply the combined domains to the 
> <body> element only. So what you’re seeing is the <p> (trans-unit 
> id=2) with the combined domains, whereas <meta name="keywords" …>, 
> being outside the body, has only the default domain applied.
>
> Another example of why we need some guidelines/expectations for HTML5 
> behavior. :-)
>

Independent of whether this is normative or not: what would the guidance 
look like?
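
To make the question concrete, here is a minimal sketch (Python, purely illustrative, not Okapi's code and not normative) of one possible reading of how the domainRule in domain4html.html resolves for <body>: take the values that the domainPointer selects, split them on commas, trim them, apply the domainMapping, and drop duplicates. The function names are made up, the pointer values are hard-coded instead of being evaluated through XPath, and the case-insensitive matching of the left side of each mapping pair is an assumption that the guidance would have to confirm or reject.

```python
import re

# One "left right" mapping pair; either side may be quoted with ' or ".
_PAIR = re.compile(
    r"""\s*(?:'([^']*)'|"([^"]*)"|([^\s,'"]+))\s+(?:'([^']*)'|"([^"]*)"|([^\s,'"]+))\s*(?:,|$)"""
)


def parse_domain_mapping(text):
    """Turn a domainMapping attribute value into a dict (hypothetical helper)."""
    mapping = {}
    for m in _PAIR.finditer(text):
        left = m.group(1) or m.group(2) or m.group(3)
        right = m.group(4) or m.group(5) or m.group(6)
        # Assumption: the left side is matched case-insensitively.
        mapping[left.lower()] = right
    return mapping


def resolve_domains(pointer_values, mapping):
    """Split, trim, map and de-duplicate the values selected by domainPointer."""
    result = []
    for value in pointer_values:
        for part in value.split(","):
            part = part.strip().strip("'\"")
            if not part:
                continue
            mapped = mapping.get(part.lower(), part)
            if mapped not in result:
                result.append(mapped)
    return result


mapping = parse_domain_mapping(
    "'sports law' LAW, 'labor law' LAW, 'contract law' LAW, "
    "'competition law' LAW,'tort law' LAW"
)
# @content of the "keywords" and "x-mykeywords" meta elements, in document order.
body_domains = resolve_domains(
    ["SPORTS LAW, Judicial Matters", "Sport, Law "], mapping
)
print(", ".join(body_domains))
```

Under these assumptions <body> would end up with "LAW, Judicial Matters, Sport, Law", which is not what the okp:itsDomain value of trans-unit id=2 shows. Whether the mapping should be applied, in which order the values appear, and what content outside the selector gets are exactly the points the guidance would need to settle.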

Best,

Felix

> Cheers,
>
> Fredrik
>
> domain4html.html
>
> <!DOCTYPE html>
> <html>
> <head>
> <meta charset=utf-8>
> <meta name="keywords" content="SPORTS LAW, Judicial Matters"/>
> <meta name="x-mykeywords" content="Sport, Law "/>
> <script type="application/its+xml">
> <its:rules xmlns:its="http://www.w3.org/2005/11/its"
>   xmlns:h="http://www.w3.org/1999/xhtml" version="2.0">
> <its:param name="domainParam">keywords</its:param>
> <its:domainRule selector="/h:html/h:body"
>   domainPointer="/h:html/h:head/h:meta[@name='x-mykeywords' or @name=$domainParam]/@content"
>   domainMapping="'sports law' LAW, 'labor law' LAW, 'contract law' LAW, 'competition law' LAW,'tort law' LAW"/>
> </its:rules>
> </script>
> </head>
> <body>
> <p>Some text about sport and law.</p>
> </body>
> </html>
>
> domain4html.html.xlf
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2"
>   xmlns:okp="okapi-framework:xliff-extensions"
>   xmlns:its="http://www.w3.org/2005/11/its">
> <file original="/Copy of domain4html.html" source-language="en-us"
>   target-language="fr-fr" datatype="html">
> <body>
> <trans-unit id="1" okp:itsDomain="SPORTS LAW, Judicial Matters">
> <source xml:lang="en-us">SPORTS LAW, Judicial Matters</source>
> <target xml:lang="fr-fr">SPORTS LAW, Judicial Matters</target>
> </trans-unit>
> <trans-unit id="2" okp:itsDomain="Sport, Law, SPORTS LAW, Judicial Matters">
> <source xml:lang="en-us">Some text about sport and law.</source>
> <target xml:lang="fr-fr">Some text about sport and law.</target>
> </trans-unit>
> </body>
> </file>
> </xliff>
>
> -----Original Message-----
> From: Yves Savourel
> Sent: Tuesday, February 19, 2013 6:03 AM
> To: 'Mārcis Pinnis'; 'Multilingual Web LT Public List Public List'
> Subject: RE: ACTION-447: Make a batch transformation of the test suite 
> to xliff
>
> Hi Mārcis,
>
> I missed a few comments in your docx file.
>
> Here is the file with my additional notes (nothing major).
>
> (BTW: your comment about Domain in domain4html.html is interesting.
>
> I'll try to look at the test output and see if it matches the info 
> output in the XLIFF file.
>
> If it does, this may be an interesting overriding case.)
>
> -ys
>
> -----Original Message-----
>
> From: Mārcis Pinnis [mailto:marcis.pinnis@Tilde.lv]
>
> Sent: Tuesday, February 19, 2013 5:24 AM
>
> To: Yves Savourel; 'Multilingual Web LT Public List Public List'; Dave 
> Lewis (dave.lewis@cs.tcd.ie <mailto:dave.lewis@cs.tcd.ie>)
>
> Cc: Felix Sasaki (fsasaki@w3.org <mailto:fsasaki@w3.org>)
>
> Subject: RE: ACTION-447: Make a batch transformation of the test suite 
> to xliff
>
> Hi Yves, all,
>
> I had a look at the examples. I believe that either I am missing 
> something (not understanding where the ITS 2.0 data is in the XLIFF 
> documents) or there is some backwards compatibility of content lost 
> when converting from the HTML/XML examples to XLIFF.
>
> 1. I had a look at the Terminology part and I could not find ITS 2.0 
> related terminology annotation in the XLIFF documents. I have attached 
> my findings to this e-mail.
>
> 2. With the Locale Filter I see that instead of having ITS 2.0 
> mark-up, the whole fragment has been removed and replaced with a 
> placeholder (is that because it is not possible to add Locale Filter 
> mark-up in XLIFF at all?). This does not preserve the content, but 
> filters out fragments based on ITS 2.0 consumption/production use-case 
> scenarios (which is, I guess, an internal process and not for data 
> exchange purposes). And ... it does not actually show an XLIFF 
> document with the Locale Filter data category metadata in it (that was 
> what we wanted to see, but the examples, I believe, do not show that). 
> Is this because XLIFF would not be able to handle ITS 2.0 annotation, 
> or for some other reason (I am a bit confused here ... so I 
> would like to clarify)?
>
> Some other findings (more in the attached file):
>
> 3. The Language Information, as I understand it, will be fully passed 
> on to xml:lang (that is clear).
>
> 4. The Domain metadata seems to be transformed from ITS into an OKAPI 
> internal structure.
>
> 5. The Elements Within Text information as I understand it, is just 
> structural, so no mark-up is necessary (that is clear).
>
> Maybe I have just misunderstood what the XLIFF examples would contain? 
> I had the understanding that the transformation to XLIFF would 
> preserve ITS 2.0 metadata. Did I understand it wrong?
>
> Then ... I also had a look at the files in the "roundtrip-example" 
> directory. As I understand from Yves's e-mail, these are not valid 
> XLIFF files, right?!
>
> I still had a look at the examples that contained terminology 
> annotation. I believe Terminology is used incorrectly:
>
> <mrk its:terminology="yes" its:termInfoRef="#ge1">Arizona</mrk>
>
> The attribute is its:term="yes" rather than its:terminology... (or am 
> I again missing some information?)
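
A quick way to catch this kind of slip across the round-trip files would be a small script along these lines. It is only a sketch: the function name is made up, and it assumes that the local Terminology attributes in ITS 2.0 are its:term, its:termInfoRef and its:termConfidence, so any other its:term* attribute is reported as suspicious.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
# Assumed set of local ITS 2.0 attributes for the Terminology data category.
KNOWN_TERM_ATTRS = {"term", "termInfoRef", "termConfidence"}


def suspicious_term_attributes(xml_text):
    """Return (element name, attribute name) pairs for unknown its:term* attributes."""
    hits = []
    prefix = "{" + ITS_NS + "}"
    for elem in ET.fromstring(xml_text).iter():
        for name in elem.attrib:
            if not name.startswith(prefix):
                continue
            local = name[len(prefix):]
            if local.startswith("term") and local not in KNOWN_TERM_ATTRS:
                # Strip any namespace from the element tag for readability.
                hits.append((elem.tag.split("}")[-1], local))
    return hits


sample = (
    '<source xmlns:its="http://www.w3.org/2005/11/its">'
    '<mrk its:terminology="yes" its:termInfoRef="#ge1">Arizona</mrk>'
    "</source>"
)
print(suspicious_term_attributes(sample))
```

For the fragment quoted above, this reports the its:terminology attribute on <mrk> and leaves its:termInfoRef alone.
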
>
> The files seemed not to have Domain and LocaleFilter metadata in them 
> - it would be great to see these categories in action as well.
>
> Best regards,
>
> Mārcis ;o)
>
> -----Original Message-----
>
> From: Yves Savourel [mailto:ysavourel@enlaso.com]
>
> Sent: Monday, February 18, 2013 4:52 PM
>
> To: 'Multilingual Web LT Public List Public List'
>
> Subject: ACTION-447: Make a batch transformation of the test suite to 
> xliff
>
> Hi all,
>
> I've done this action item.
>
> A batch file as well as the XLIFF output have been added to GitHub:
>
> https://github.com/finnle/ITS-2.0-Testsuite/commit/294018ba576799dcbee7b9566da83837dd69f4ae
>
> Notes:
>
> -- The XLIFF outputs are often identical because the test files are 
> just different ways to markup the same content.
>
> -- The XLIFF outputs often make little sense because the input 
> exercises only one data category. For example, a storage size 
> limitation set on a span ("inline") element will not show up on an 
> inline element in XLIFF because there is no information in the input 
> file that says the span element is 'within text' (since the test case 
> is about the storage size). IMHO the outputs are rather useless.
>
> -- Most data categories have output, but only when the extraction uses 
> them. For example, there is no output for directionality because, while 
> the Okapi ITS engine processes and provides that data category, the 
> filter does nothing with it.
>
> Cheers,
>
> -yves
>

Received on Thursday, 21 February 2013 20:05:31 UTC