W3C

- DRAFT -

ITS IG

14 May 2014

Agenda

See also: IRC log

Attendees

Present
DaveLewis, dF, felix, garyLefman, philr, renat, yves
Regrets
christian, ankit
Chair
felix
Scribe
fsasaki


roll call

checking attendees ...

http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014May/0003.html

namespace bindings and CSS selectors

skip - jirka not here

update on XLIFF mapping

david: volunteered to review
... started doing some minor editorial changes today
... not done yet
... only typos to fix
... only review, did not go through "todo" parts of the mapping

dave: what is the process of the mapping becoming a module in XLIFF 2?
... do we need to support that with a test suite?
... or s.t. like that?

david: would be good to have "in" and "out" files
... yves has examples
... how it looks in native format (usually HTML)
... would be more complex compared to other modules
... in other modules you worry just about localisation roundtrip
... but not about what the information was before extraction
... the XLIFF spec does not have a lot of instructions for extraction / merger
... they expect that both have the same knowledge
... the extraction is not really described
... the extraction parts would be non normative
... it is a new situation: some W3C work to be submitted to OASIS
... a member like LRC submits this to OASIS
... no need to have w3c membership, OASIS membership is sufficient

yves: for tests - would be possible to have input with ITS markers, output in XLIFF
... process and have same kind of output like we had with ITS test suite

felix: makes sense

yves: so we need a set of rules and spell them out into a rule file
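
A minimal sketch of how an "in" / "out" file check could work, assuming the test suite stores one expected XLIFF file per input file. The function name `same_xml` and the sample documents are hypothetical and not part of any agreed test suite. The comparison uses XML canonicalization from the Python standard library (Python 3.8 or later), so that attribute order and indentation do not cause false failures.

```python
import xml.etree.ElementTree as ET


def same_xml(actual: str, expected: str) -> bool:
    """Return True if two XML documents are equal after canonicalization.

    strip_text=True removes leading and trailing whitespace in text nodes,
    so pretty-printed and compact files compare as equal. This is an
    assumption about what the test suite should treat as insignificant.
    """
    canon_actual = ET.canonicalize(xml_data=actual, strip_text=True)
    canon_expected = ET.canonicalize(xml_data=expected, strip_text=True)
    return canon_actual == canon_expected


# Hypothetical expected output file and two candidate tool outputs.
EXPECTED = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en">
  <file id="f1">
    <unit id="u1" translate="no">
      <segment><source>Okapi</source></segment>
    </unit>
  </file>
</xliff>"""

# Same content, different attribute order and no indentation.
ACTUAL_OK = (
    '<xliff srcLang="en" version="2.0" xmlns="urn:oasis:names:tc:xliff:document:2.0">'
    '<file id="f1"><unit translate="no" id="u1">'
    "<segment><source>Okapi</source></segment>"
    "</unit></file></xliff>"
)

# Differs in the value of the translate attribute.
ACTUAL_BAD = ACTUAL_OK.replace('translate="no"', 'translate="yes"')

if __name__ == "__main__":
    print(same_xml(ACTUAL_OK, EXPECTED))
    print(same_xml(ACTUAL_BAD, EXPECTED))
```

A harness along these lines would let different low-level implementations be checked against the same expected files without requiring byte-identical output.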

renat: could we use okapi framework for file conversion purposes?

yves: sure

renat: okapi could convert almost any format into XLIFF

yves: yes

renat: it could develop or extend existing converters to support ITS

david: Okapi is a reference implementation
... having more than input + output files is probably not realistic

felix: Okapi provides low level parsing with ITS "events" but not "real" implementations for all data categories

david: we hope for other low level implementations that should create the same results
... we need a good selection of stakeholders in dublin
... to get buy in of the people

felix: could provide xslt for test suite, but not for real processing

david: would be good for test suite
... bryan should be able to work on something similar

felix: any requirements in terms of numbers of implementations?

david: we put test suite in our charter
... but it is not even an OASIS requirement
... it is a statement of use, it is up to the TC to decide what level of verification is needed
... having input + output files is the minimum I think

felix: what is the timeline for the input + output files and how about the location?

david: we have SVN repository
... oasis admin are OK with links to SVN
... next tc call will be next week - we then should talk about new template for XLIFF 2.1
... will have a different svn than XLIFF 2.0
... about the time frame: this will be another session in feisgiltt
... Kevin will propose an early release schedule - to be discussed next month
... implementations should be ready by November, file in place no later than september

yves: the more comments we get on the mapping the better
... the parts that map to modules, e.g. size and constraints

[discussion about storage size in ITS2 and how it maps to XLIFF2]

david: yves, any other things you think need input except storage size?

yves: yes, the standoff things, e.g. provenance
... also implementation feedback

david: conceptual aspect of "Translate" - what is it for?
... yves has fallback solution: you can extract non translatable text into code
... not sure about this: we are mixing up things
... if s.t. is natural language not translatable for business reasons it should not go into codes
... it should be marked up as translate=no or not go into the extracted content

dave: the ITS "Translate" does not say anything why we annotate it
... in XLIFF it is complicated since the extraction process says s.t. about this as well

david: depends on the semantics of the underlying format
... e.g. in some formats you have assumptions like: markup is not translatable

dave: for ITS I think it only applies to the textual content
... would not apply to markup unless it is textual content of attribute
... maybe ok to have code extraction fallback
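
A minimal sketch of the "marked up as translate=no" option, assuming a well-formed HTML paragraph whose direct child elements may carry `translate="no"`. The function name `paragraph_to_source` is hypothetical, and the output is a simplified, un-namespaced `<source>` fragment rather than a complete XLIFF 2 document. Non-translatable text stays in the extracted content inside a `<mrk translate="no">` annotation; other inline elements are flattened to plain text here, which is a simplification and not what a real extractor would do.

```python
import xml.etree.ElementTree as ET


def paragraph_to_source(html_fragment: str) -> str:
    """Map inline translate="no" elements to <mrk translate="no"> markers."""
    para = ET.fromstring(html_fragment)
    source = ET.Element("source")
    source.text = para.text or ""
    counter = 0
    for child in para:
        inner = "".join(child.itertext())
        if child.get("translate") == "no":
            # Keep the text in the content, but annotate it as protected.
            counter += 1
            mrk = ET.SubElement(
                source, "mrk", {"id": f"m{counter}", "translate": "no"}
            )
            mrk.text = inner
            mrk.tail = child.tail or ""
        else:
            # Simplification: drop the inline markup, keep its text.
            flat = inner + (child.tail or "")
            if len(source):
                source[-1].tail = (source[-1].tail or "") + flat
            else:
                source.text += flat
    return ET.tostring(source, encoding="unicode")


if __name__ == "__main__":
    print(paragraph_to_source(
        '<p>Press <span translate="no">Ctrl</span> to start.</p>'
    ))
```

The alternative fallback mentioned in the discussion, moving the text into an inline code, would remove it from the translatable content altogether; the sketch keeps it visible to the translator instead.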

felix: besides okapi who would implement the mapping?

yves: leroy I assume

david: bryan is esp. interested in the CMS scenario
... he would be interested in that issue, esp. DITA based

dave: we are still using mapping for mapping to RDF

update on RDF mapping

http://www.babelfy.org/

dave: useful for advancing this may be:
... for advancing ontology + test suite
... we could do that in LIDER
... that will help to motivate people
... same for qLabel library
... they only highlight 1-2 data categories
... but some are good XML>RDF data categories

felix: no need to publish a note, we can also just point to the ontology and document it

working with ITS in RDF

yves: you get a lot of data - how to make use of it is hard

dave: exactly

felix: working with various data sets - what is the impact?

yves: babelnet was mostly useful for the translators it seems
... maybe that is the case with other data sets too - they use same linked data?

dave: there are some data set differences, or different data clean up processes
... good to hear from people like yves: what is more useful?
... how to decide which one works well under which circumstances
... we are close to feisgiltt now - for yves a good opportunity to meet linked data guys and talk to localisation people too
... people don't want to be in too many groups, but if we track a specific issue it'd help

event report

felix reports on ITS metadata in json discussion held at the workshop

dave: there was a discussion at the end of the workshop - doing e.g. inline markup thing would be a horror in json
... json-ld mapping from the ITS ontology could help

yves: did not try yet to put anything in json
... not sure - this is really like a transport format

david: think it would have traction in industry

dave: there are two use cases:
... one is internationalisation of apps that use json
... the other one: use json like you use xliff, but talking to javascript browser cat tool
... example: prototype cat tool having xliff+its working in the browser
... from a web developer's point of view, it would be more natural to have everything in json

felix: so having ITS > XLIFF mapping and then that represented as json?

dave: yes - and there are i18n requirements for apps that use json

david: yes, those are different use cases
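
A minimal sketch of one way to avoid inline markup in JSON, assuming a stand-off representation in which annotations point into the plain text by character offsets. All key names (`text`, `annotations`, `start`, `end`, `category`, `value`) and the function `to_standoff` are hypothetical. This is plain JSON, not a JSON-LD mapping of the ITS ontology, and nothing like it was agreed in the call.

```python
import json


def to_standoff(text, annotations):
    """Build a JSON-serialisable stand-off document.

    annotations is a list of (start, end, category, value) tuples, where
    start and end are character offsets into text (end is exclusive).
    """
    for start, end, _category, _value in annotations:
        if not 0 <= start < end <= len(text):
            raise ValueError(f"offsets out of range: {start}-{end}")
    return {
        "text": text,
        "annotations": [
            {"start": s, "end": e, "category": c, "value": v}
            for s, e, c, v in annotations
        ],
    }


if __name__ == "__main__":
    # Hypothetical example: mark "Ctrl" as not translatable.
    doc = to_standoff("Press Ctrl to start.", [(6, 10, "translate", "no")])
    print(json.dumps(doc, indent=2))
```

The text stays a single JSON string that a browser-based tool can display directly, and the annotations can be read or ignored independently of it.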

felix: anything else to report about the workshop?

dave: three interesting groups of people:
... the wikipedia translation group

[see slides now here http://www.w3.org/International/multilingualweb/2014-madrid/slides/sharma.pdf ]

[and here http://www.w3.org/International/multilingualweb/2014-madrid/slides/giner.pdf ]

dave: also interesting: publication office discussion

<dF> FEISGILTT links: home http://bit.ly/1c4RC5Y reg http://bit.ly/1lxmkvr Speakers: http://bit.ly/QnVdI0 Program: http://bit.ly/1lxn0kr

dave: they will start publishing legal text
... they are interested in understanding good ways to leverage ITS > RDF mapping
... and then discussion how to get access to "deep web" data

felix: would hope that wikipedia would use ITS (without necessarily showing the markup to the author / editor)

dave: also libraries like qlabel help to show value of ITS

AOB

adjourned

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2014-05-14 12:58:31 $
