See also: IRC log
david: going through list of participants of the group
current state: https://www.w3.org/2000/09/dbwg/details?group=53116&public=1
michael from DERI introducing himself
michael: working at DERI, concentrating on Semantic Web
... dealing with ontologies
... question answering, multilingual ontology generation
pedro: from linguaserve, localization service provider
... working with multilingual solutions:
... e.g. online translation systems
... localization chain interoperability & web services
... participating here for two aspects:
... 1) providing input to the standard definition
... 2) practical experiences from multilingual processing
... 3) specific work package contribution
david: for people who are not in the group: work packages are reference implementations that the EC-funded group is doing
see http://www.w3.org/International/multilingualweb/lt/wiki/Main_Page#Deliverables.2C_WPs
david: some of the reference implementations will be open source
aaron: from Opera, nice to be here
... I am localization coordinator for opera
... localization of web properties is my interest
... managing relation between browsers and users
... currently handled differently for each browser
... we have developed ways how we manage translations with providers and volunteers
... I'm here to learn more about best practices
<dF> Felix, do we have a public list for the LUX f-2-f
aaron: that we can increase efficiency
link for luxembourg meeting: http://www.w3.org/International/multilingualweb/lt/wiki/Main_Page#Deliverables.2C_WPs
nc, do you want to post a brief introduction here on IRC?
guiseppe: development director of linguaserve, I'm from italy
david: describing the role of XLIFF, the main format for localization roundtripping
... we also have relationships with other TCs like oasis XLIFF etc.
... other relationships to Unicode consortium, ETSI etc.
... I am working at Univ. of Limerick
... research in standardization related to localization
declan: work at DCU
... european projects like euromatrix plus
... panacea (for data acquisition)
... cosyne (multilingual synchronization of wikis)
... background is mostly in MT
... my role in MLW-LT is about metadata for machine translation training
... and work package 4 about online MT systems
milan: from moravia worldwide, another LSP in the group
... helping to develop reference implementations related to XLIFF roundtripping
david: too early for enlaso
phil: CTO with vistaTEC, also LSP
... based in Dublin, Ireland
... a lot of process automation
... in MLW-LT, we want to improve decision making automation, based on metadata
... excited about the potential for improving our process
... we are also member of CNGL (centre for next generation localisation)
felix: working for W3C and DFKI fellow, coordinating underlying EU project and co-chair with David and Dave
... background in i18n activity in W3C, bringing in existing standardization here (ITS) and assuring relation to HTML5 etc.
doates: from adobe
<doates> A brief intro from me as requested: I'm a localization architect working within the globalization group at Adobe. I've been in that group for 14 years, working on various products and tools. I am now the architect of our internal localization platform and am keen to help shape the definition of this group, and help bring it to a point where it is suitable to drive it back into our internal machinery, and also to push it into our products.
<dF> link for luxembourg meeting: http://www.w3.org/International/multilingualweb/lt/wiki/Main_Page#Deliverables.2C_WPs
doates, can you type a short self intro into IRC?
tadej: from JSI
... working on NLP, data mining, semantic web
... our role is to be provider of text analytics tools
... that provide metadata for existing content
... and to use that for other processes in the localization or other pipelines
david: anyone else on the call?
moritz: from cocomore, agency for communication and technology
... joining to work on multilingual CMS solutions
moritz: want to evangelize the metadata too
... metadata will focus on drupal
about carina (introduced by moritz): also at cocomore
scribe: will also join us
David: EU project started January 2012
... W3C project just started 7 March
(see press release at http://www.w3.org/Press/Releases-2012#x2012-mwlt )
david: 13 members of the group form the LT-Web consortium; that is the EC working title of the project
... the w3c working group is larger and is working under the W3C IPR policies
... the EU project has some additional obligations, e.g. work on the standard and test suites
... and to develop reference implementations
... see the deliverables list at http://www.w3.org/International/multilingualweb/lt/wiki/Main_Page#Deliverables.2C_WPs
david: not a formal f2f meeting, since it couldn't have been announced in time
... still the meeting in Luxembourg is open for everybody
see http://www.w3.org/International/multilingualweb/lt/wiki/LuxembourgMarch2012
david: meeting is open for all participants
felix: how about other people to join the meeting?
david: should be OK, but let's address this offline
... other meetings planned: Dublin Meeting in June
<Carina> looking forward to Dublin
http://www.w3.org/International/multilingualweb/lt/wiki/Main_Page#Meetings
david: dublin meeting will be a requirements gathering meeting, hoping to get a lot of feedback
david going through the WG homepage
see http://www.w3.org/International/multilingualweb/lt/#feedback for questionnaire about requirements
david: please fill in the questionnaire http://www.w3.org/2002/09/wbs/1/mlw-lt-requirements/
... and let other stakeholders know about it and see if they can fill it in
http://www.w3.org/International/multilingualweb/lt/wiki/LuxembourgMarch2012#Agenda
david: you can still add points of interest to the agenda
felix: please also have a look at the pre-read materials
david: we had a good opening call, looking forward to meeting you regularly, 95 times
felix: we will need an additional meeting slot, for people in the US
felix: continuing now with work package specific discussion, everybody can stay on the call.
adjourn of this call, continue with WP discussions
tadej: interested how the MT workflow will use metadata
... declan said in the berlin meeting that there are some opportunities
<dF> Declan: what we could handle just now
<dF> Declan: priorities
declan: types of metadata for MT training: domain, terminology, do-not-translate items
<dF> .. 1. terms
declan: and related to training process
<dF> ..2. do not translate
@DF, happy to continue scribing
<dF> ..3. domain related metadata
declan: above would be most useful for MT training
<dF> OK :-)
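The three metadata types Declan lists (terms, do not translate, domain) could be carried per segment roughly as follows. This is only an illustrative sketch: the class `SegmentMetadata`, its field names, the helper `protect_spans` and the `__DNT0__` placeholder format are all hypothetical and were not discussed on the call.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class SegmentMetadata:
    """Hypothetical per-segment record; field names are illustrative only."""
    text: str
    terms: List[str] = field(default_factory=list)             # 1. terminology
    do_not_translate: List[str] = field(default_factory=list)  # 2. do-not-translate spans
    domain: Optional[str] = None                                # 3. domain, e.g. "chemistry"


def protect_spans(segment: SegmentMetadata) -> Tuple[str, Dict[str, str]]:
    """Mask do-not-translate spans with placeholders before the text reaches MT.

    Returns the masked text and a placeholder -> original span mapping,
    so the spans can be restored after translation.
    """
    masked = segment.text
    mapping: Dict[str, str] = {}
    for i, span in enumerate(segment.do_not_translate):
        token = "__DNT{}__".format(i)  # placeholder format is an assumption
        masked = masked.replace(span, token)
        mapping[token] = span
    return masked, mapping
```

For example, a segment with `do_not_translate=["Opera Mini"]` would have that product name replaced by `__DNT0__` before training or decoding.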
tadej: for terminology we can contribute automatic generation
... for "do not translate" it depends on the target language
declan: domain information is about the topic, e.g. chemistry, biology etc.
... but also stylistic
... e.g. whether it is formalized etc.
... could be someone dealing with patents etc.
... we use this to categorize training materials
tadej: sounds very useful
... our existing tools can analyze topic information
... but not genre information yet
david: in the let's MT project things like that were done
... there was a challenge in organizing training data
... result was multidimensional
... main interest was for SMT builders
... they are interested in data, metadata is nice
... but for building the training corpus there needs to be something else
... e.g. a clustering
... if a corpus is really big
... you may need to use metadata for clustering
... but clustering can also depend on other things than metadata
declan: issue is granularity
... on top of a base line you would want to use more specific models
... you would use more domain specific data, identified by metadata
... and cascade the training
... in an SMT system you can have multiple translation models and language models
... we assume that domain specific models can help the quality
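The idea of a baseline plus more specific models selected by domain metadata can be sketched as below. This is a simplified illustration, not how any particular SMT toolkit works: the function names and the `min_segments` threshold are assumptions, and actual model training is left out entirely.

```python
from collections import defaultdict


def partition_by_domain(corpus, min_segments=2):
    """Split (segment, domain) pairs into a baseline pool and domain pools.

    Every segment goes into the baseline pool. A domain pool is kept only
    if it holds at least `min_segments` segments (threshold is an assumption).
    """
    baseline = []
    by_domain = defaultdict(list)
    for segment, domain in corpus:
        baseline.append(segment)
        if domain is not None:
            by_domain[domain].append(segment)
    domain_pools = {d: segs for d, segs in by_domain.items()
                    if len(segs) >= min_segments}
    return baseline, domain_pools


def models_for(domain, domain_pools):
    """Cascade order: the domain-specific model first (if one exists), then baseline."""
    return ([domain] if domain in domain_pools else []) + ["baseline"]
```

A document tagged with a domain that has enough training data would be handled by its domain model backed by the baseline; any other document falls back to the baseline alone.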
david: true, but does domain specific not cut across metadata categories?
declan: for general clustering that might be the case
... but you could use a number of domains to represent a particular domain
tadej: what granularity are you looking for - coarse grained?
declan: yes, higher level
milan: granularity also depends on amount of data
declan: do you have example of ontology you use, Tadej?
tadej: yes
... we are using dmoz
... open directory ontology
... has 1 million categories
... hierarchical
... covers a lot of domains
<tadej> http://www.dmoz.org/
declan: want to have a granularity that MT systems can make use of
tadej: so maybe only top level is useful
<tadej> http://enrycher.ijs.si/
declan: yes
tadej: for enrycher, you can see categories we are providing
declan: great, we could use that as a starting point
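Reducing a hierarchical category to its top level, as suggested above, might look like this. The slash-separated path format with a leading `Top` root is an assumption about how the categories are serialized, and `top_level_category` is a hypothetical helper, not part of enrycher.

```python
def top_level_category(path, root="Top"):
    """Return the first category below the root of a slash-separated path.

    Returns None when the path holds nothing below the root.
    """
    parts = [p for p in path.split("/") if p]
    if parts and parts[0] == root:
        parts = parts[1:]  # drop the root element itself
    return parts[0] if parts else None
```

With this, a path such as `Top/Science/Chemistry/Organic` would be mapped to `Science`, which is closer to the granularity an MT system could use for selecting training data.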
kimmo: any practical questions for me at the moment?
nothing specific for kimmo at the moment
declan: deliverables are to be finished month 15, but standard is month 21
... we will adjust our deliverable for the final standard in month 21
... will still be in allocated PM, but divide the time between start and the end
... so we still do what is in the dow, but adhere to the standard process
moritz: there should not be too much that we have to change a lot
declan: agree
... so move deliverables to month 21 or have two parts
... any other questions about the WP5 work plan?
declan, fine by me
<scribe> ACTION: declan to adjust the WP plan for WP 5 [recorded in http://www.w3.org/2012/03/09-mlw-lt-minutes.html#action01]
moritz: would be good to have some background material about the localization chain
declan: I'll send you some material
moritz: great, thanks
... two main workflows that we could use from drupal: XLIFF or HTML(5)
declan: usually we do a lot of batch processing, offline
... we have done work with XLIFF and TMX and other formats
... we are open about what is easiest and most flexible
... to define standard ways of input
moe, let's have the call with the whole group, so that others can jump in
declan: so that's all for WP5 discussion so far
... anything else?
moritz: nothing specific to discuss
https://www.w3.org/International/multilingualweb/lt/wiki/ImplementationIssues/Drupal
felix: propose to discuss Yves's proposal about a separate module to get localizable content out of drupal - at this call
... not now in detail, but later with Yves on the list or on a call
<scribe> ACTION: moritz to trigger the discussion with Yves on the public mailing list and or a regular call about Yves's issue [recorded in http://www.w3.org/2012/03/09-mlw-lt-minutes.html#action02]
david: Yves reported that it is hard to extract, that was expected
... issues of getting all localizable stuff into the cycle have been addressed by the XLIFF module
that's all for me for WP3, I have no further items
http://www.w3.org/2011/12/mlw-lt-charter.html
http://www.w3.org/International/multilingualweb/lt/wiki/LuxembourgMarch2012#Pre-reading_material
<Declan> It may be worth compiling a master planning chart for the WPs (in a similar format to the plans for WP3, WP5 that I've seen already) so that we can clearly see the inter-dependencies between the deliverables across WPs
http://www.w3.org/International/multilingualweb/lt/wiki/Main_Page#Deliverables.2C_WPs
<scribe> ACTION: felix to check excel upload in the w3c side [recorded in http://www.w3.org/2012/03/09-mlw-lt-minutes.html#action03]
moritz: agenda needs times in it
<scribe> ACTION: felix to add times to luxembourg agenda [recorded in http://www.w3.org/2012/03/09-mlw-lt-minutes.html#action04]
meeting adjourned