See also: IRC log
<fsasaki> looking at http://www.w3.org/International/multilingualweb/lt/wiki/BledMay2013#Tuesday
<fsasaki> summary issue 74 - 89 http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013May/0059.html
<fsasaki> summary issue 106 - 123 http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013May/0060.html
<tadej> ACTION: felix to review draft for XLIFF mapping [recorded in http://www.w3.org/2013/05/07-mlw-lt-minutes.html#action01]
<trackbot> Created ACTION-504 - Review draft for XLIFF mapping [on Felix Sasaki - due 2013-05-14].
<tadej_> ACTION: dF to correct XLIFF mapping (change prefix its to itsx for certain attributes) [recorded in http://www.w3.org/2013/05/07-mlw-lt-minutes.html#action02]
<trackbot> Created ACTION-505 - Correct XLIFF mapping (change prefix its to itsx for certain attributes) [on David Filip - due 2013-05-14].
<tadej_> ACTION: Felix to set up the URI placeholder for ITS2-XLIFF mapping (http://www.w3.org/ns/its-xliff) [recorded in http://www.w3.org/2013/05/07-mlw-lt-minutes.html#action03]
<trackbot> Created ACTION-506 - Set up the URI placeholder for ITS2-XLIFF mapping (http://www.w3.org/ns/its-xliff) [on Felix Sasaki - due 2013-05-14].
<Arle> ACTION: Dave to look at the XLIFF 2.0 change tracking module for provenance [recorded in http://www.w3.org/2013/05/07-mlw-lt-minutes.html#action04]
<trackbot> Error finding 'Dave'. You can review and register nicknames at <http://www.w3.org/International/multilingualweb/lt/track/users>.
<Arle> ACTION: dlewis6 to look at the XLIFF 2.0 change tracking module for provenance [recorded in http://www.w3.org/2013/05/07-mlw-lt-minutes.html#action05]
<trackbot> Created ACTION-507 - Look at the XLIFF 2.0 change tracking module for provenance [on David Lewis - due 2013-05-14].
<Arle> ACTION: Yves to examine XLIFF resource data module with respect to external resouce. [recorded in http://www.w3.org/2013/05/07-mlw-lt-minutes.html#action06]
<trackbot> Created ACTION-508 - Examine XLIFF resource data module with respect to external resouce. [on Yves Savourel - due 2013-05-14].
<Arle> ACTION: davidF to make notes on extraction/merging behavior for target pointer [recorded in http://www.w3.org/2013/05/07-mlw-lt-minutes.html#action07]
<trackbot> Error finding 'davidF'. You can review and register nicknames at <http://www.w3.org/International/multilingualweb/lt/track/users>.
<Arle> ACTION: david to make notes on extraction/merging behavior for target pointer [recorded in http://www.w3.org/2013/05/07-mlw-lt-minutes.html#action08]
<trackbot> 'david' is an ambiguous username. Please try a different identifier, such as family name or username (e.g., dlewis6, dfilip).
<Arle> ACTION: dfilip to make notes on extraction/merging behavior for target pointer [recorded in http://www.w3.org/2013/05/07-mlw-lt-minutes.html#action09]
<trackbot> Created ACTION-509 - Make notes on extraction/merging behavior for target pointer [on David Filip - due 2013-05-14].
<Marcis> Hi All
<Marcis> We are waiting for the GotoMeeting Organiser to arrive
Due to connectivity problems the following log is based on offline scribing, done by Arle
Topic: XLIFF
[Portion missed]
David: First public review of XLIFF 2.0 will be done 29 May. Those interested in mapping need to comment now. There is a commenting facility on the web. Felix provided the needed links.
... Table on XLIFF in the wiki has a 2.0 column
Felix: What version should we link to?
David: Public review contains links to the public review draft. That would be a good source for the link.
... We expect at least two public review drafts, a second one in June. But we can refer to public review draft 1. That's better than referencing an editorial draft.
... Felix, it should all be in the email you got. Also the first page of the spec contains the citation format.
... We've concentrated so far on XLIFF 1.2, but 2.0 solutions are marked in most cases. In many cases it is easier to map to 2.0.
... We tried to use the core vocabulary of XLIFF where possible, which leads to some inconsistent solutions when 1.2 can't handle something in ITS 2.0.
... <mrk> is a generic marker in XLIFF, and we use it to map metadata. One marker can carry more than one piece of metadata, e.g., a "translate" marker can also carry information about terminology
... Does anyone have a concern about that?
Dave: Is that best practice in all cases?
Felix: Can you show an example?
Dave: What happens when they change in a workflow, e.g., through resegmentation?
David: We have <sm> and <em> for cases where you can't have well-formed <mrk> in a segment.
Dave: So you don't preclude using nested <mrk>s if that's more appropriate?
David: It depends on what is in the original document. If the spans are all the same in the source, you wouldn't usually want to split them, although you could without violating syntax.
Dave: I'm thinking of cases where you do text analytics after you have extracted the text and need to insert your own.
Felix: This is what Marcis needs to know.
David: Use the existing markers if possible, otherwise insert your own. We need some language to explain this.
... The table started mapping categories, but now we need explanation.
Dave: Example of "The City of London", where you might get two annotations on "City of London" but then realize one needs to be moved to "The City of London"
David and Dave: Discussion about solutions
David: In 2.0 it's easy, but in 1.2 the solution is ugly.
Yves: There is no implementation of 2.0.
David: We're working on implementation statements now. When we are past the review we will gather them.
Felix: Marcis has generated 1.2 markup, and that is what he needs to know.
Dave: We still need to look at localization quality rating for XLIFF. Phil needs that.
David: terminology markup shows where term="yes" has good XLIFF markup, but term="no" needs a workaround.
Yves: We should either always use ITS markup or not at all. Otherwise when you process with an ITS processor, it won't see the XLIFF-native markup and will only see the nos.
David: But that's the mapping.
Yves: It still doesn't work.
Felix: If the processor sees termInfoRef without its:term="yes", it will not validate properly.
David: The goal is not to make it processable by a generic ITS processor. There are a lot of things we would need to do to make it process better.
Dave: But where you do implement ITS, you need to follow the spec.
Yves: You have a lot of attributes for terminology, and mapping some but not all doesn't make sense. You should use all ITS. In 2.0 extensions would allow you to use entirely native ITS markup.
Dave: What is our preference? To use ITS wherever possible? Or use XLIFF equivalents when possible? So we would have <mrk term="yes"> alongside its:term.
Yves: That duplicates information.
David: And creates the possibility for conflicting information.
Dave: There are XLIFF-only and ITS-only processors, plus processors that would know about the mapping.
Felix: Actually the third one is ITS-only since it would use global rules.
David: We should catch places where ITS does not make sense ...
Yves: I've looked at the example from Marcis and he generated an invalid XML file due to some linking [?]
[missed discussion about processing between Yves and David]
Felix: You could do a mapping of its: to itsx:
Felix: Someone on IRC please assign an action to David.
... Marcis is implementing to the end of the month. Terminology, language, elements within text, and domain are what he needs to do.
... David asked if we could have a namespace on the W3C server. Do we still need that?
Yves: Yes. We need the URI. (Prefix is unimportant.)
Felix: Could we agree on what the URI would look like?
... http://www.w3.org/ns/its-xliff/
... That would be the mapping URI with the prefix itsx
... Please assign me an action to do this.
David: Does Marcis need the 2.0 part as well?
Yves: He is doing 1.2 only.
David: There is an issue with needing more than one reference?
Yves: The issue is with the mapping. We don't have term-info locally in ITS?
... There was a problem that in some cases in XLIFF you can have information about the term without referring to anything, but you can't have that in mapping from XLIFF to ITS.
... We have the termInfo pointer globally, but you can point to extra information about the term and put it locally in XLIFF, but the reverse is not possible since you cannot create a global rule.
David: If the original processor knows about global rules, the extractor can do the matching and knows it comes from a global rule, it handles the mapping.
Yves: But what happens if the markup comes from XLIFF, not the original file, so there is no local markup to attach it to?
... For example, if you use Enrycher.
David: So you need to know about the source format to map it back.
... Maybe you do it if you can or you send an error message.
Yves: Without implementations we can't tell if it will work or not.
... In some cases we can't do anything.
... Solution is that we may need a termInfo attribute in ITS, but we decided early on not to do this.
David: In the case of domain we discussed the need for local mechanisms in ITS.
... I wouldn't be opposed, but we are in the last call.
Yves: Who does it and would we really need it? We don't know. If you start with an XLIFF file from an original document you want to mark the terms up, but you probably don't want to inject them back in the original document.
Felix: Tilde use case ends in May, so we don't want to make changes that may render it nonconformant.
Yves: Let's discuss these issues in best practice
David: The big things are ITS prefix, domain.
Dave: Let's put a complex example in domain to be more realistic.
David: I think domain is resolved.
... Same issue as for termInfo. If you introduce it in XLIFF, you need somewhere to put it globally when you get back.
Dave: On domain, for consistency, we have an attribute <domains> with an S. The domain should be comma-separated since there can be more than one. It's not in the spec, but in the suite. Should we use domains for consistency?
Felix: Domains attribute contains a comma-separated list.
David: So you suggest itsx-domains?
Dave: Yes. For RDF mapping it is important too.
Felix: Let's discuss that tomorrow.
David: For elements within text, we suggest a separate unit in both cases.
David: For language information, mtype="x-itsLang". We just need to remove the question marks in the wiki.
Yves: How about making the mtype generic and then use the data categories you need? The only problem would be when you use term and things like that.
David: mtype becomes "x-its" in this case. For XLIFF 2.0 it takes the same mechanism (UPDATED IN WIKI)
David: Now for locale filter, it is basically a more sophisticated version of translate, so it should be easy.
... You can use the same mechanism.
Yves: We haven't thought about this one. In XLIFF you can extract without a target locale. So what do you do when you have some entries for some locales but not others?
... You may do things after extraction, but you need to know information about locales and preserve it after extraction. An XLIFF doc may have no targets. You don't want to reprocess it for every single locale.
... You might need information at the trans-unit level.
David: Can the processor work at the inline level?
Yves: That would certainly be possible.
... If we use an extension, you have to understand it to process the file.
Felix: Maybe Dave can create an example we can look at.
David: The mechanism will be the same in any case since it is an extension, like translate. It would be on a unit. Does it make sense to have it at the segment level?
Yves: You don't extract segments, but units. You need at least one segment (which can be a paragraph if no further segmentation is done).
Dave created examples in the wiki for locale filter.
[missed portion of a few minutes]
David: Ruby is gone
David: text analysis uses mtype="phrase".
Felix: Change this to use should instead of must since confidence may not be needed.
David: If you do use it, we need guidance on how.
Tadej: In the spec it uses a MUST, but it applies only when confidence information is used.
Yves: Does mtype="phrase" make sense here? Does it have a specific meaning? It may not be a phrase. What is the meaning of phrase?
David: There is no defined meaning.
Dave: There is some ISO meaning [?]
David: Even if you can't read the ITS portion, using phrase gives you a notion that there is an idiomatic value, for the generic XLIFF processor that doesn't know ITS.
... Now on to provenance, which is Dave's.
Dave: We only support it in stand-off mode.
David: It works in 1.2, but for 2.0, maybe look at change tracking mechanisms.
[OBSCURED because already added: Action: dlewis6 to look at the XLIFF 2.0 change tracking module for provenance]
David: externalResource
Yves: should be externalResourceRef. I also found that <mrk> doesn't work when implementing.
... Because you are not referring to an object, but rather the content in the object [?]
... But we have cases where you can't put the mrk inside where it needs to go.
David: So here we should use something other than mrk?
Yves: Probably <ph>
David: What if the external resource is a TM, term database?
Yves: Those don't apply. It refers to video, images, etc., stuff from the original document.
David: We need to change mrk for generic inlines. In 2.0 we need to look at the resource data module.
... You have skeleton, original data, and resource data module. So you think this is always just an inline?
Yves: For 1.2 it is clear.
David: Let's use inlines at the inline level in 2.0. But use the skeleton for the structural level?
Yves: The skeleton doesn't make sense to me, but otherwise, the external module for reference is relevant. We need to look at it.
David: I'll look at it.
Yves: I'm not OK with that, because it would force people to add support for a module for something pretty simple.
Felix: The category was suggested by Shaun and dealt with source content to show what belongs to the file and is source-content focused.
Yves: I think that is partially what Microsoft intended in their module.
... Someone needs to look at it anyway.
Felix: What is the timeline?
Yves: Don't know.
Dave: So what are we using for 1.2?
David: generic inline markup. Same for 2.0, but a different set.
[OBSCURED because already added: Action: Yves to examine XLIFF resource data module with respect to external resouce.]
David: for id value inline, we are not addressing this use case. (Explain that NA is after deliberation.)
Yves: To do it we would need an extension.
David: Would require unique IDs within the doc. But I think we can drop it for now and revisit if needed.
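For the comma-separated domains value raised by Dave and Felix earlier under this topic, a consumer might split the list roughly as follows. This is only a sketch: the function name `split_domains` and the sample values are hypothetical, and it assumes plain comma separation with optional surrounding whitespace, which the group had not yet finalized.

```python
def split_domains(value: str) -> list[str]:
    """Split a comma-separated domains value into individual domain names."""
    # Assumption: entries are separated by commas only; whitespace around
    # an entry is not significant, and empty entries are ignored.
    return [part.strip() for part in value.split(",") if part.strip()]


print(split_domains("automotive, law ,medical,"))
```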
David: Mārcis asked why we have both its- and itsx-
Felix: Can you clarify it for me too?
David: Sometimes the local markup does not exist in XLIFF, so you have to use its. In some cases the local markup exists, but some information is lost in the mapping, so the attributes no longer make sense and depend on the data that was mapped.
... For that we use itsx-. The URI is actually the its-xliff namespace, but we use itsx- prefix.
... The namespace has been sort of defined.
Felix: Point Tilde to http://www.w3.org/ns/its-xliff/ There is nothing there, but there will be.
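To make the its/itsx distinction concrete, here is a minimal sketch of an XLIFF 1.2 unit that carries one attribute in each namespace. The itsx URI is the placeholder named in these minutes; which attribute goes on which element, the `xlf` prefix, and the sample values are assumptions for illustration only, not the agreed mapping.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ITS_NS = "http://www.w3.org/2005/11/its"
ITSX_NS = "http://www.w3.org/ns/its-xliff/"  # placeholder URI from the minutes

# Bind readable prefixes; as noted in the discussion, only the URI matters.
ET.register_namespace("xlf", XLIFF_NS)
ET.register_namespace("its", ITS_NS)
ET.register_namespace("itsx", ITSX_NS)

unit = ET.Element(f"{{{XLIFF_NS}}}trans-unit", {"id": "1"})
# Assumed placement: a mapping-specific attribute in the itsx namespace.
unit.set(f"{{{ITSX_NS}}}domains", "automotive, law")

source = ET.SubElement(unit, f"{{{XLIFF_NS}}}source")
# Assumed placement: native ITS markup used where XLIFF has no equivalent.
source.set(f"{{{ITS_NS}}}allowedCharacters", "[a-zA-Z ]")
source.text = "Hello world"

xml_text = ET.tostring(unit, encoding="unicode")
print(xml_text)
```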
David: preserve space
... XLIFF uses xml:space, so what do we say about inline?
... There shouldn't be different whitespace behavior inline, right?
... so I would suggest to say that preserveSpace doesn't need to be defined for inline.
Yves: You could, but I'm not sure tools can actually do it. But you could have an element in a paragraph where you need to preserve space.
David: A code sample inline might need that. Can you put xml:space on that?
Yves: You could. No idea what tools would do, but follow the example of xml:lang here.
David: Agreed. Use generic xits-
... Localization quality issue
Dave: We decided to do this only in standoff mode.
... you can do it on unit, source, or target, depending on where the issue is and where it relates.
David: Localization quality issue
Dave: Not finding anything on this in the spec.
David: Not sure if the rating makes sense for inlines.
... Maybe in 1.2 if mtype="seg"
... In 1.2 both unit and seg are structural. But talking at that fine-grained level doesn't make sense.
Yves: Can we have it on a span element in HTML?
... If so, then we need to support it.
Dave: We can't anticipate what people might do.
David: We should have a note to say that it is discouraged to do it on spans and other inline elements.
Dave: a lot of things like mtConfidence work at fine granularity.
David: This is analogous. It's the same thing: a score.
Dave: Would you apply these to an alt-trans?
David: Alt-trans don't have any equivalent in the target document.
Arle: I think it makes sense to support it because you might need it for your selection mechanism between multiple alt-trans elements.
<scribe> ACTION: dlewis6 to make LQI and LQR similar to mtConfidence in structure. [recorded in http://www.w3.org/2013/05/07-mlw-lt-minutes.html#action10]
<trackbot> Created ACTION-510 - Make LQI and LQR similar to mtConfidence in structure. [on David Lewis - due 2013-05-14].
David: mtConfidence. We don't have much for 2.0.
<scribe> ACTION: dfilip to create mtConfidence examples for XLIFF 2.0 [recorded in http://www.w3.org/2013/05/07-mlw-lt-minutes.html#action11]
<trackbot> Created ACTION-511 - Create mtConfidence examples for XLIFF 2.0 [on David Filip - due 2013-05-14].
David: for origin, it doesn't tell you what it should look like. Don't want to conflict with simple and advanced usages.
Dave: the reasoning for using it as a flag was so that you'd know how to treat things in scenarios, but you can't reserve that phrase.
David: We could recommend that origin is used for a URI. Then it would be like annotatorsRef, but it would be overloaded.
... Maybe we use origin for annotatorsRef, and then we wouldn't need a flag, because the ref would say it is mtConfidence.
... We get rid of annotatorsRef in alt-trans, but put it in origin, and then it is overloaded.
Yves: Doesn't sound good to me. Tools already use origin and want to put values there.
... I use it to tell me that it comes from Microsoft
... We have matchQuality. Why not use annotatorsRef like we do elsewhere?
Dave: annotatorsRef then becomes the flag.
Yves: I don't use annotatorsRef. It shouldn't behave differently depending on where it comes from.
... I already use it. You don't need to overload origin. It doesn't bring anything to the information. You already have confidence.
Felix: The tools processing origin and mtConfidence don't need to understand both.
Yves: Using it will create problems since we already use it. Not just mine, so we can't reuse it for an override.
... We would lose information.
David: So shall we leave origin out of the example?
Dave: Yes, it takes care of the problem.
Felix: The conclusion of those in the room is not to confuse origin and annotatorsRef
Dave: The only issue is that matchQuality doesn't have restrictions on the value.
David: Could be anything: TM, MT,
... Agree we can use mtConfidence as the flag for the processor. We don't have to specify what to do with origin. Resolved.
... Allowed characters. This is only just now stable in terms of the regex. The mechanism is to use the its:allowedCharacters, in both XLIFF 1.2 and 2.0.
... StorageSize in 1.2 is clear, but we need an action about the native attribute in 1.2.
Yves: We actually decided we can't use local markup.
David: 1.2 is resolved, but 2.0 needs checking.
Yves: The example currently shows trans-unit, but normally it would apply to the element with the content that has the limitation.
David: Make it always on source and target, in 2.0 as well.
Yves: Should be the same for allowedCharacters.
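As an illustration of the allowedCharacters mechanism mentioned above, a check over source or target content could look roughly like this. It is a sketch under assumptions: the function name `disallowed_characters` is hypothetical, the value is taken to be a single character class, and Python's `re` module stands in for the restricted regex syntax the group had just stabilized.

```python
import re


def disallowed_characters(text: str, allowed: str) -> list[str]:
    """Return the characters of text that the allowed-characters class rejects."""
    # Assumption: `allowed` is one character class such as "[a-zA-Z ]" or
    # "[^*+]", so each character can be tested against it on its own.
    pattern = re.compile(allowed)
    return [ch for ch in text if not pattern.fullmatch(ch)]


print(disallowed_characters("Hello World", "[a-zA-Z ]"))
print(disallowed_characters("a*b+c", "[^*+]"))
```

An empty result means the content satisfies the restriction; any returned characters are the ones a tool would need to flag.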
<scribe> Scribe: pnietoca
Pablo: Felix is summarising his discussion with Robin Berjon about the handling of HTML
Yves: Which of the 3 options do we choose?
Jirka: Whatever you want!
Karl: It should be the most supported
Yves: Can support a bit of the three
<Arle> \me darobin, what was the ID number for the GTM session you used.
Pablo: Robin has just joined the meeting
Yves: Talking in general about HTML5
Robin: The text on HTML5.1 is going in the right direction
... The reference should be HTML5.1
David: We understand we need to be in sync with HTML5.1, but we need a normative reference
Robin: You can make a normative reference to the above link
Felix: Encourages everyone to look at the latest version of translate HTML
... in the spec
Yves: Looking at the text that is currently there, the attribute style
... is like a script
... why not a list of script attributes there?
Robin: It's terrible practice
Felix: Putting the last reference from Robin in the spec
... is it OK with everybody?
... Talking with Robin about Ruby in HTML5 and the dropped one in ITS
... Would it be good to have a reference to the HTML spec for the moment?
Robin: They are going to update it, so look to the latest version
Yves: Still something to do, correct the text about the use of translate XML vs HTML
Felix: Showing how to point to the HTML spec from the data category table
... Felix changing the spec with the reference and needed text about translate attribute
... The case of translate is similar to language information, too many different behaviours
... We need to summarise the behaviour in a note, as of the time of writing
Karl: Provide examples of the predefined list of attributes
... Suggests not to have two subsequent links to HTML5 in the translate section of spec
Felix: Suggests changing from TEI to another interchange format for the next spec
<fsasaki> ACTION: daveL to ping lerory to update tests for translate in html5 and elements within text [recorded in http://www.w3.org/2013/05/07-mlw-lt-minutes.html#action12]
<trackbot> Created ACTION-512 - Ping lerory to update tests for translate in html5 and elements within text [on David Lewis - due 2013-05-14].
Felix: New issues forbidden :)
<Marcis> Hi Felix
<Marcis> Now I do not hear you
<fsasaki> we are restarting gotomeeting
<fsasaki> from a different computer
<fsasaki> can you leave the call and dial in the call? 682-416-317
<fsasaki> sorry for the inconvenience
<Marcis> We can try
<Marcis> Tatiana's Skype name is kumeliite_1
<fsasaki> tatiana: now you see the most recent version of the showcase
<fsasaki> .. implements all of the required functionality
Tatiana: Show Tilde's demo of ITS enryched Term
<fsasaki> .. annotating plain text example
<fsasaki> demo shows how to upload a file to be annotated
<fsasaki> tatiana: choose one of three languages. Now showing English
<fsasaki> tatiana: first annotation option is statistical annotation. 2nd is term bank based annotation
<fsasaki> tatiana: orange term candidates
<fsasaki> .. source shows how these are mapped to ITS. You can download annotated document
<fsasaki> .. html "script" element contains term list
<fsasaki> .. the html body contains references to terms from that list
<fsasaki> marcis: termConfidence is not given if the term is coming from the term bank
<fsasaki> .. in html5 document you have a tbx format term entry
<fsasaki> marcis: now example of annotation with html5 as input
<fsasaki> now xliff examples
<fsasaki_> now q/a
<fsasaki_> dave: do you prioritize the term bank over the statistical system
<fsasaki_> .. what do you do if you get a clash?
<fsasaki_> tatiana: you can switch
<fsasaki_> .. e.g. having results only produced by the statistical tool
<fsasaki_> .. or you can have only euroterm bank results
<fsasaki_> dave: what do you do if a word is matched in both?
<fsasaki_> marcis: whenever you select term bank
<fsasaki_> .. there is a filtering step
<fsasaki_> .. not to put a heavy burden on the term bank
<fsasaki_> .. in order to narrow down search space in term bank
<fsasaki_> .. we select term candidates with statistics, then do term search in the term bank
<fsasaki_> .. a user can select whether she wants to see the results of statistics or also term bank
<fsasaki_> dave: the schema for the term entries in html?
<fsasaki_> marcis: those are tbx entries
<fsasaki_> dave: had you planned to offer this as a restful web service?
<fsasaki_> marcis: it can be accessed as a web service api
<fsasaki_> .. what you see is just a visual interface for humans
<fsasaki_> .. everything that you send to the web page calls the service
<fsasaki_> dave: did you document the web service interface? can the other partners use that?
<fsasaki_> marcis: the documentation will contain the API description as well
<fsasaki_> tatiana: this is just an interface to showcase the solution
<fsasaki_> tatiana: example of the agricultural domain
<fsasaki_> .. that annotation gives quite a lot of terminology tagged
This is scribe.perl Revision: 1.138 of Date: 2013-04-25 13:59:11
Check for newer version at http://dev.w3.org/cvsweb/~checkout~/2002/scribe/
Guessing input format: RRSAgent_Text_Format (score 1.00)
Succeeded: s/iits-xliff/its-xliff/
Succeeded: s/"set"/"seg"/
Found Scribe: Arle
Inferring ScribeNick: Arle
Found Scribe: pnietoca
Inferring ScribeNick: pnietoca
Scribes: Arle, pnietoca
ScribeNicks: Arle, pnietoca
Present: Karl arle dF dave felix jirka karl mauricio milan pablo tadej yves ankit(IRC)
Agenda: http://www.w3.org/International/multilingualweb/lt/wiki/BledMay2013#Tuesday
Got date from IRC log name: 07 May 2013
Guessing minutes URL: http://www.w3.org/2013/05/07-mlw-lt-minutes.html
People with action items: dave davel david davidf df dfilip dlewis6 felix yves
[End of scribe.perl diagnostic output]