MLW-LT WG -- 13 Sep 2012

<fsasaki> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Sep/0055.html

agenda agreed

html5 conformance

<fsasaki> https://www.w3.org/International/multilingualweb/lt/track/issues/47

<fsasaki> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#conformance

fsasaki: spec currently discusses conformance for XML but not HTML

phil: tried some HTML5 parsing in a browser client, but not through the whole value chain
... would be using XLIFF in other places

<dF> daveL, please mute yourself while not speaking..

<fsasaki> 4.2 Conformance Type 2: The Processing Expectations for ITS Markup - separate mention of HTML5, separate tests

fsasaki: asks whether it makes sense to have separate conformance sections for HTML5 and separate conformance tests, which are available already
... does everyone agree?

phil: clarifies that for HTML5 they are only planning to support the quality data categories

fsasaki: it is OK to support only one data category

phil: for quality it would probably just be local mark-up as well

<dF> Felix, we lost you?

fsasaki: this is fine also

shaunm: also planning to support HTML to the same level as XML currently, but not in the timeframe needed to support the spec progression

fsasaki: this is fine, since it may make sense to have a lower bar for conformance testing than for support in products
... so we push as much as possible for HTML, but it's then not so critical to push on that for conformance

<fsasaki_> dave: agree with felix

<fsasaki_> .. HTML parsing is easy to do

<fsasaki_> .. makes it feasible to show HTML on business products, that's a learning curve

<fsasaki_> .. looking into the tests with leroy

<fsasaki_> .. once you have done a few tests, work with the data categories gets easier

<fsasaki_> .. the business requirements are naturally a different area

<fsasaki_> .. but the pain barrier for parsing HTML gets easier

<fsasaki_> .. it's the same pattern for many data categories

<dF> sounds good

fsasaki: any further questions on the issue of separating HTML implementation in useful products from the process of conformance testing?

Yves_: agrees

Pedro: agrees

<scribe> ACTION: fsaski to write up and close ISSUE-47 [recorded in http://www.w3.org/2012/09/13-mlw-lt-minutes.html#action01]

<trackbot> Sorry, couldn't find user - fsaski

<scribe> ACTION: fsasaki to write up and close ISSUE-47 [recorded in http://www.w3.org/2012/09/13-mlw-lt-minutes.html#action02]

https://www.w3.org/International/multilingualweb/lt/track/issues/49

fsasaki: some mappings for the quality data category are currently in an annex that Arle developed from detailed knowledge of current tools
... but the tools may change over time.

<fsasaki_> http://lists.w3.org/Archives/Public/public-i18n-its-ig/

fsasaki: so should we move this to a separate area, i.e. the ITS IG?
... this way discussion on these topics can continue after the completion of the MLW-LT WG

Yves_: agrees

fsasaki: this mailing list is completely public
... but should this be a wiki or an HTML page?

Yves_: probably easier to use the wiki

<scribe> ACTION: felix to move this material to the ITS IG wiki and remove the issue [recorded in http://www.w3.org/2012/09/13-mlw-lt-minutes.html#action03]

<trackbot> Created ACTION-216 - Move this material to the ITS IG wiki and remove the issue [on Felix Sasaki - due 2012-09-20].

https://www.w3.org/International/multilingualweb/lt/track/issues/48

yves: while implementing storage size, noticed that line breaks in XML may differ depending on platform, which may change the count
... so asks for ideas on how to address this

dF: this is needed
... if the information in the BOM

yves: proposes a way of recording this in the data category
... need to include the value to be counted, including a default

felix: should it be a list of codepoints?

yves: crap
... we can't do that
... will come up with a proposal on the mailing list

dF: propose considering CRLF as shown in editors
... and agrees a default is important

yves: guess line feed would be the default
... will come up with a short proposal on the list

<scribe> ACTION: yves to propose solution for recording storage size count behaviour for line breaks [recorded in http://www.w3.org/2012/09/13-mlw-lt-minutes.html#action04]

<trackbot> Created ACTION-217 - Propose solution for recording storage size count behaviour for line breaks [on Yves Savourel - due 2012-09-20].
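
The line-break problem raised under ISSUE-48 can be illustrated with a minimal sketch. It assumes the count is the encoded byte length and that all line breaks are normalized to one chosen form before counting, with line feed as the default yves guessed at. The function name `storage_size` and its `line_break` parameter are hypothetical and are not taken from the ITS draft.

```python
def storage_size(text: str, encoding: str = "utf-8", line_break: str = "\n") -> int:
    """Return the encoded byte length of text after normalizing line breaks.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Collapse CRLF and lone CR to LF first so every line break is seen once.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # Re-expand each line break to the form chosen for counting.
    normalized = normalized.replace("\n", line_break)
    return len(normalized.encode(encoding))


sample = "one\r\ntwo\nthree"  # mixed platform line breaks
print(storage_size(sample))                     # 13, line breaks counted as LF
print(storage_size(sample, line_break="\r\n"))  # 15, line breaks counted as CRLF
```

The same text yields two different counts, which is why recording the line-break form (and a default) alongside the storage size matters.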

https://www.w3.org/International/multilingualweb/lt/track/issues/45

felix: a lot of discussion during the week on what was needed for HTML5
... namely an API which, for a node in a document, would say what ITS metadata is present

phil: to understand more clearly, would it be a read-only API?

felix: this is what he had in mind
... but also assumes a write API

shaun: would this be a requirement for ITS conformance?

felix: no, it wasn't the intention to specify the API in the standard, just to provide it as a useful piece of support software
... this might help people make use of ITS

<fsasaki_> dave: it is a nice thing in the portfolio

<fsasaki_> .. esp. for editing you can have an API only for local it seems

<fsasaki_> .. it just makes an easy entry point

<fsasaki_> .. mobile might be useful for this too

phil: OK, so it's clearer that it's JavaScript; this is similar to what we've been prototyping
... but hadn't planned a releasable API
... have found a way to save output as a serialised XML file

Des: this sounds more like a client-side library, suitable for best practice in a JavaScript library

phil: yes, felt this would be useful in scenarios where access to other tools was limited but could enable access to some data categories
... can certainly see some common functions we might want to use, but some way from that yet

Des: there may be issues with browser dependency

felix: it is useful if we have an initial exploration just to show that we are considering that issue
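
As a rough illustration of the read-only, local-only lookup discussed under ISSUE-45, the sketch below collects the `its-*` attributes present on each element of an HTML fragment. The discussion concerned a JavaScript library in the browser; Python's standard `html.parser` is swapped in here purely for illustration. The names `LocalItsCollector` and `local_its_metadata` are hypothetical, the attribute in the sample is an assumed spelling, and global rules and inheritance are deliberately ignored.

```python
from html.parser import HTMLParser


class LocalItsCollector(HTMLParser):
    """Collect local its-* attributes for each start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (tag name, {attribute: value}) pairs

    def handle_starttag(self, tag, attrs):
        # Keep only attributes with the its- prefix; valueless ones map to "".
        its = {name: (value or "") for name, value in attrs if name.startswith("its-")}
        if its:
            self.found.append((tag, its))


def local_its_metadata(html):
    """Hypothetical read-only helper: report local ITS markup per element."""
    collector = LocalItsCollector()
    collector.feed(html)
    collector.close()
    return collector.found


# The attribute name below is an assumption used for illustration.
sample = '<p>Price: <span its-loc-quality-issue-type="typographical">10 ,5</span></p>'
print(local_its_metadata(sample))
```

A write API, as felix mentioned, would need a real DOM rather than a streaming parser, which is one reason a browser-side library was the form under discussion.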

https://www.w3.org/International/multilingualweb/lt/track/issues/46

felix: so felix to close this issue, and perhaps raise again at TPAC
... had editorial comments from dave and aaron
... but looking for volunteer to implement some of these

<fsasaki_> dave: happy to work on the wordsmithing part

<fsasaki_> .. would be good to separate this between a few people

<fsasaki_> .. don't want to go too far, and do that in small bits

<fsasaki_> .. examples thing is a good idea, but takes a lot of time

<fsasaki_> .. one question: in terms of the process

<fsasaki_> .. in terms of priorities: can we still do that after November

felix: when the document is in last call, as is our December deadline, this means every word is finished
... strictly this applies to the normative parts,
... but large changes to the non-normative parts are a problem also
... so changes are possible, but last call really means freezing the text.
... so we need to do this before the end of November

Shaun: happy to take on some of that work

felix: there are comments from dave and aaron

<fsasaki_> dave: any help is good, but shaun is in a good position to look into HTML and XML

<fsasaki_> .. so trying with some of the examples might help

<fsasaki_> .. there are good examples here already, but they are worth a review

<fsasaki_> .. and the HTML is worth some effort

<scribe> ACTION: shaun to provide some HTML and revised XML example for introductory section, referencing comments from Dave and Aaron, but based on real use experience [recorded in http://www.w3.org/2012/09/13-mlw-lt-minutes.html#action05]

<trackbot> Created ACTION-218 - Provide some HTML and revised XML example for introductory section, referencing comments from Dave and Aaron, but based on real use experience [on Shaun McCance - due 2012-09-20].

preparation for Prague http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Sep/0045.html

felix: some data categories are still not very stable - provenance, MT confidence, text analysis
... we know people are working on showcases but we need good specification also
... so we need some face to face time in the meeting to wrap these up

<fsasaki_> dave: I'll take care of disambiguation, phil of provenance

<fsasaki_> .. declan will look at the spec and implementation too

felix: the other one is text analytics annotation
... asks DavidF for his view on how certain parts apply to document or segment, and also how to encode MT engine identification

dF: thought this was fairly stable, but we didn't want to prescribe how to represent this
... the requirement is that it is unique on the platform, but it may then depend on the options the platform operator wants to adopt
... though the t extension seems a good fit

felix: was not clear how prescriptive the text was meant to be on use of t values
... since if MAY is used in a formal manner, that means that it is optional but still needs a conformance test for that option
... seems to make sense, let's see what declan thinks as he revises this.
... Dave and David will be at LRC next week
... jirka will arrange a special fun event in Prague, evening of the first day - evening 25th

- DRAFT -

Attendees