07:06:08 RRSAgent has joined #mlw-lt 07:06:08 logging to http://www.w3.org/2012/09/26-mlw-lt-irc 07:06:14 Zakim has joined #mlw-lt 07:06:21 meeting: MLW-LT f2f (day 2) 07:06:23 chair: various 07:06:28 scribe: variousToo 07:06:45 agenda: http://www.w3.org/International/multilingualweb/lt/wiki/PragueSep2012#26_Sept:_MLW-LT_WG_meeting_agenda 07:11:05 Arle has joined #mlw-lt 07:11:06 tadej has joined #mlw-lt 07:11:12 Jirka has joined #mlw-lt 07:11:15 Ankit has joined #mlw-lt 07:11:16 micha has joined #mlw-lt 07:11:17 DomJones has joined #mlw-lt 07:11:17 Pnietoca has joined #mlw-lt 07:11:21 Declan has joined #mlw-lt 07:11:25 leroy has joined #mlw-lt 07:11:46 daveL has joined #mlw-lt 07:12:17 scribe: phil 07:12:26 topic: meeting schedule 07:12:38 philr has joined #mlw-lt 07:12:50 scribe: philr 07:13:14 http://www.w3.org/2012/10/TPAC/ 07:13:23 philr has changed the topic to: Next meeting (philr) 07:15:49 giuseppe has joined #mlw-lt 07:17:31 http://www.w3.org/2012/10/TPAC/ 07:18:14 http://www.w3.org/2012/10/TPAC/#Registration 07:18:40 16/10 October registration deadline 07:19:10 Individuals need to register themselves 07:21:05 f2f will be on Thursday/Friday 07:21:18 Milan has joined #mlw-lt 07:21:36 Sebastian has joined #mlw-lt 07:21:59 Tatiana has joined #mlw-lt 07:22:24 Des has joined #mlw-lt 07:24:02 Next event is March, Rome 07:24:12 http://www.w3.org/2011/12/mlw-lt-charter.html 07:24:34 mhellwig has joined #mlw-lt 07:27:54 Proposed date for next meeting week of January 23/24 2013 in Frankfurt or 30/31 07:29:23 January 29-31 2013 Frankfurt 07:31:34 Next regular call day and time change to accommodate Dave on Monday's 15:00 UTC 07:32:38 Desire to 'fix' tool issue this afternoon 07:32:53 After next call take time to edit the specification 07:33:14 16:00 UTC live editing session 07:33:22 Pedro has joined #mlw-lt 07:34:52 Editing sessions on Monday's after usual call 07:35:37 MLW-LT PC meet on Wednesday 14:00 UTC 07:37:01 Start Monday calls this coming Monday 1 October 
07:37:12 Continue with demonstrations 07:37:52 topic: ITS rules and readiness extension to OASIS CMIS 07:38:54 shaunm has joined #mlw-lt 07:39:42 Dominic and Dave presenting 07:40:35 daveL: CMS Interoperability 07:41:01 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 07:41:12 Thousands of API's 07:41:23 CMIS standardized API 07:42:06 Standard published in 2010. Members inc. Adobe 07:42:24 Many implementors 07:43:10 Models repository as document, folder and relationship objects 07:44:23 2 Requirements for l10n: (a) ITS rules; (b) Readiness 07:45:14 Can apply same rule(s) to several documents? 07:45:21 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 07:46:30 Implemented data model in draft 07:47:08 Polling scheme to define readiness updates and notifications 07:49:50 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 07:50:33 Plans: extend ITS Rule types, integrate xliff, discuss extensions with CMIS-compliant vendors 07:53:50 Readiness essential to implementation 07:57:28 fsasaki: ITS Readiness and CMIS and objectives of Pedro and mhellwig coming together with common needs 07:58:37 fsasaki: no need to have these now in the ITS 2.0 standard, but please move this forward as a joint effort and we can advertise the outcome of this also outside of ITS 2.0 07:58:42 I'm logging. I don't understand 'draft minute', fsasaki. 
Try /msg RRSAgent help 07:58:43 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 07:58:47 Opportunity for Pedro/Cocomore/TCD to work together 08:01:00 Video link of CMS Lion in the Use Case section of draft 08:01:12 dF has joined #mlw-lt 08:01:24 Dave's presentation finished 08:02:06 Presentation by tadej 08:02:34 topic: Enrycher demo 08:02:36 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 08:03:15 tadej: What elements of a document should be treated specially 08:03:36 ..focus mainly on html 08:04:23 ..TA Annotation maybe superseded by tool info 08:05:51 ..html->object model->analysis->add annotations->html 08:08:36 demo is here http://aidemo.ijs.si/mlw/ 08:09:05 ..two markups. First very verbose, second uses single agent-ref div 08:11:11 ..can agent-ref be combined with 'tool info'? 08:14:45 aidemo has rdfa output option 08:15:35 fsasaki: can people take advantage of tadej's process? 08:16:18 tadej finished 08:17:18 philr: can enterprise specific domain models be built? 08:17:28 tadej: yes, in principle 08:18:15 ..not hard to construct multilingual databases 08:19:44 Part of terminology process 08:21:04 discussion on how to integrate enrycher into tools, e.g. terminology systems, translation tools. Confidence of disambiguation annotation not related per se to term "yes" / "no" 08:21:31 Des: any other multilingual databases? 08:22:15 end of demos 08:23:41 fsasaki: Arle will send out the slides that he has compiled of teh demos, all please review and comment 08:48:40 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 08:50:28 topic: test suite 08:50:31 scribe: fsasaki 08:50:55 dom: go through action points and issues related to test suite 08:51:07 .. will go through matrix - currently has only Yves' name as implementor 08:51:21 .. 
will go names to the categories, and will work with people who put the names 08:51:25 phil: question 08:51:43 .. people who are using a data category 08:51:50 .. do I have to produce tests for my tool? 08:51:57 dave: will explain that now 08:52:03 tadej has joined #mlw-lt 08:52:13 (presentation about test suite) 08:53:04 (related issues and action: issue-33, action-139, action-145) 08:53:41 dave: goal of conformance testing is that it is transparent and reliable 08:53:47 .. that is needed in the w3c process 08:54:21 .. but it also helps that other implementors can come and check wether their implementation does the real thing 08:54:30 .. that is the test suite in the w3c lingo 08:54:37 .. we do that per feature and data category 08:54:59 .. essentially what is test is the (ITS) parsers 08:57:13 phil: when I built a dom dynamically, I don't produce any output - what should I do? 08:57:37 dave: takes yves example - he has a model that splits the test output out 08:57:55 .. there is also a representational aspect: 08:58:22 .. even if there is one data category, but if there is one data category tested by many company, that gives an image of output 09:00:32 phil: what about generators and consumers? 09:01:03 dave: the conformance is really saying "I understand all the requirements" of ITS conformance 09:01:55 phil: so if we don't have two tests what will happen? 09:01:59 s/teh demos/the demos/ 09:02:01 dom: come to that in a minute 09:02:41 dave: validation is also helpful, and use case demos 09:02:52 .. these are two activities distinct to conformance testing 09:03:11 .. e.g. 
a nice use case doesn't mean that you contributed to the conformance process 09:03:32 tadeJ: testing makes sure that everybody makes the data model right 09:03:59 dave: data categories are independent 09:04:33 http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-idvalue-attribute-1 09:05:01 Ankit has joined #mlw-lt 09:07:10 felix: example above shows what do to as an implementor for idValue: there is a MUST statement about xml:id, and the test checks that. so implementor need to decide: global and local, and then go into the data category (sub) section and check for the MUST statements, then implement the related tests 09:07:56 leroy explaining the test suite 09:11:31 leroy: output is a tab-delimited output, easier to compare than previous XML format 09:15:02 felix: test suite will also be helpful to explain ITS behaviour, e.g. with regards to inheritance 09:15:34 dom: we now need to get commitments of implementors 09:16:15 .. who is interested to contribute to the test suite for 09:18:41 Zakim has left #mlw-lt 09:30:37 going through the test suite commitments 09:33:04 action: felix to check what to do with directionality and ruby 09:33:05 Created ACTION-228 - Check what to do with directionality and ruby [on Felix Sasaki - due 2012-10-03]. 09:46:57 action: dom to circulate table for testing and everybody to fill in by friday next week , that is 4. October 09:46:57 Sorry, couldn't find user - dom 09:47:22 action: Dominic to circulate table for testing and everybody to fill in by friday next week , that is 4. October 09:47:22 Created ACTION-229 - Circulate table for testing and everybody to fill in by friday next week , that is 4. October [on Dominic Jones - due 2012-10-03]. 09:47:36 CLOSING NOW: 09:48:26 action: leroy to put test suite on githup 09:48:26 Created ACTION-230 - Put test suite on githup [on Leroy Finn - due 2012-10-03]. 
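Leroy's point above, that tab-delimited output is easier to compare than the previous XML format, can be illustrated with a minimal sketch. The line layout assumed here (a node path, a tab, then the data category values) and the function names `parseOutput` and `compareOutputs` are hypothetical; the log does not describe the actual file layout of the test suite.

```javascript
// Parse tab-delimited output into a Map from node path to the rest of the line.
// Assumption: the first field of each line identifies the node.
function parseOutput(text) {
  const rows = new Map();
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === '') continue; // skip blank lines
    const [path, ...values] = line.split('\t');
    rows.set(path, values.join('\t'));
  }
  return rows;
}

// Report nodes that are missing, unexpected, or annotated differently.
function compareOutputs(expectedText, actualText) {
  const expected = parseOutput(expectedText);
  const actual = parseOutput(actualText);
  const differences = [];
  for (const [path, value] of expected) {
    if (!actual.has(path)) differences.push(`missing: ${path}`);
    else if (actual.get(path) !== value) differences.push(`differs: ${path}`);
  }
  for (const path of actual.keys()) {
    if (!expected.has(path)) differences.push(`unexpected: ${path}`);
  }
  return differences;
}

// Hypothetical reference output and implementation output for one test file.
const expectedOutput = '/doc\ttranslate="yes"\n/doc/code[1]\ttranslate="no"\n';
const actualOutput = '/doc\ttranslate="yes"\n/doc/code[1]\ttranslate="yes"\n';
const report = compareOutputs(expectedOutput, actualOutput);
```

With one line per node, a mismatch points directly at the node whose data category values disagree, which is harder to read off a diff of two XML serializations.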
09:57:54 close issue-33 09:57:54 ISSUE-33 Test suite design closed 09:57:58 close action-139 09:57:58 ACTION-139 Check options for test suite design closed 09:58:07 close action-145 09:58:07 ACTION-145 Think about a round tripping test suite data package closed 09:58:16 close action-207 09:58:16 ACTION-207 Get some feedback on test suite input and output from HTML WG closed 09:59:42 action: leroy to create tests for its:param 09:59:43 Created ACTION-231 - Create tests for its:param [on Leroy Finn - due 2012-10-03]. 10:08:59 http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-param-in-global-rules-1 10:09:20 DomJones has joined #mlw-lt 10:09:21 http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-param-in-global-rules-1 10:09:23 leroy has joined #mlw-lt 10:09:46 daveL has joined #mlw-lt 10:57:00 Milan has joined #mlw-lt 11:07:50 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 11:10:01 Scribe: Yves 11:10:26 Ankit has joined #mlw-lt 11:10:40 Topic: New Implementation Proposals 11:11:19 First proposal : Logrus (Serge) 11:11:42 Serge: Idea is to be practical 11:11:58 .. we have many clients, different formats, 11:12:07 .. and have to deal with a zoo of tools 11:12:24 .. a lot of disconnect between tools 11:12:38 .. data is the key for those standards 11:12:55 .. we need to remember the human user is the key as well 11:13:19 .. tagged files are not WYSIWYG 11:13:31 Arle has joined #mlw-lt 11:13:34 mdelolmo has joined #mlw-lt 11:13:40 .. users need context, and this issue is getting more and more important 11:14:02 .. fragmentation is leading to a need for more skilled users 11:14:16 .. HTML5 is a simple universal standard 11:14:45 .. idea is: Work In Context System (WICS) 11:15:09 .. human need context, and therefore need an help page, with text only 11:15:17 .. and without tags 11:15:41 .. can be prepared from the source (XML, XLIFF, etc.) 11:15:56 .. 
the output can be represented there 11:16:20 .. can also involve reference material, term base etc. 11:16:22 What is speaker's name? 11:17:12 .. HTML5+ ITS + Javascript for human view 11:17:22 olaf, it is serge gladkoff 11:17:34 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 11:17:51 .. view can be viewed from any browser 11:18:44 .. 2 stage process: proprietary tagged file (with ITS), then HTML5 + ITS 11:19:01 .. helper page can be used at any time 11:19:18 .. for example: client reviewers 11:19:46 .. not being in context may cause issue for reviewers 11:20:08 .. helper page would be in a single format 11:20:37 giuseppe has joined #mlw-lt 11:20:54 .. this has several advantages like no WYSIWYG, more efficient, etc. 11:21:42 micha has joined #mlw-lt 11:21:45 .. ITS data categories and that format: Terminology, Translate, Loc Note, Text Analysis, Disambiguation, External resource 11:22:37 serge then describes part of the roadmap details. 11:23:02 the result would be a working prototype of WICS generator 11:24:03 Tadej: so if I select a node i would get the annotation for that node? 11:24:16 Serge: that's the idea 11:24:49 .. the idea is tho improve productivity, so we show what helps the user to work fast 11:25:05 .. so the info needs to be determined, tailored 11:25:20 .. the important is to provide the info in visual form 11:25:38 Jan: stage1 seem to be where the issues would be 11:26:11 .. looks a lot like LocStudio a bit. Not easy to implement, but powerful 11:26:27 Serge: yes, that's what we split stage1 from stage2 11:27:10 Serge: this would be for a given client 11:27:29 Arle: what would be the output of stage1? 11:27:43 Serge: it would not be open source 11:27:58 .. output can be anything 11:28:06 Arle: XLIFF? 11:28:18 Serge: yes.. 
possibly 11:28:38 Felix: so you would go from XLIFF to HTML5, not trivial 11:28:56 Serge: not sure the middle would be XLIFF 11:29:20 dF: XLIFF would be good because there are already tools 11:29:45 Des: what about context? how do you get this? (style guide, etc.) 11:30:03 Serge: our understanding is that it can be set with ITS 11:30:27 Des: context is a big problem, so how it's propagated 11:30:48 .. if it's just text there is no context 11:31:09 Serge: not in this case, it's glossary, terms, etc. 11:31:42 dF: in XLIFF 2.0 there is a way to output HTML directly 11:32:07 .. providing full WYSIWYG is impossible, but basic stuff is. 11:32:25 .. so that preview feature in 2.0 is doing that. 11:32:43 Pedro: we tend to do automatic tasks. 11:32:53 .. human task is needed, but where and how 11:33:15 .. but not all metadata can be added automatically (e.g. loc note, etc.) 11:33:33 .. so intervention of the human is important to keep in mind 11:33:45 .. also translator are used to CAT tools 11:34:05 Dave: it's call a viewer, but can the user add things? 11:34:32 Serge: rules can be added to stage1 11:34:46 Dave: but then viewer is not doing this? 11:35:10 Serge: good question. But that's a too big of a scope it seems. 11:35:41 .. it's an HTML editor, and feedback loop is hard 11:35:44 Des has joined #mlw-lt 11:36:06 Felix: would the tagged file +ITS would be valid ITS? 11:36:24 Milan has joined #mlw-lt 11:36:48 .. no I meant the final file HTML5. 11:37:20 Felix: stage1 is proprietary 11:37:29 .. any way to make it open-source? 11:37:49 Serge: part maybe, but not all 11:38:01 .. we'll pick specific client/format 11:38:29 .. the converter will do the conversion will be done by in-house tools, not linked to ITS/HTML5 11:39:21 .. if intermediate file is describe then it's ok, anyone can generate the same file and use the viewer 11:39:57 Felix: many are creating HTML5+ITS, can they use this viewer? 11:40:05 Serge: yes, they should 11:40:13 .. 
source can be HTML5 11:40:27 Pedro: thought stage1 is what we are doing 11:40:49 dF: important differences. here output is bilingual 11:41:18 Felix: here the output is very specific 11:41:36 des: here the HTML5 is not the original format 11:41:41 .. just a container 11:42:18 Pedro: the problem, is the middle file. we produce that 11:42:34 Serge: that's good then 11:43:00 Felix: the issue is people seems to do stage1 already 11:43:10 .. so we need only stage2 11:43:44 serge: the funded project is just stage2 11:44:53 felix: so cocomore, etc. would be input for the viewer 11:45:07 Serge: we would have to agree on the middle format 11:46:17 Pedro: if scope is stage2, if editing the ITS tags could be done it would be better. 11:46:34 DomJones has joined #mlw-lt 11:46:43 dF: that would be a huge task: feeding back is difficult 11:46:47 serge: yes. 11:47:09 .. main goal is to show productivity gain through viewer 11:47:29 .. a big part is to see what is effective 11:47:46 Pedro: see this as very usefil 11:47:55 s/usefil/useful/ 11:48:26 .. users have to be trained 11:48:39 .. it take long to do this 11:49:01 Serge: helper page has many applications 11:49:25 .. we want a general example. not just a single case like Drupal. 11:49:42 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 11:49:48 Tatiana has joined #mlw-lt 11:50:06 dF: see the values through a condensed material 11:50:17 .. everybody is using the same references 11:50:34 Pedro: some data categories are editable is needed 11:51:11 Felix: those are two different use cases 11:51:49 .. stage2 built on top of what some of use do. 11:52:23 rrsagent, draft minutes 11:52:23 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html Yves_ 11:52:55 Serge: some client use only MT-based web site. 11:53:44 Felix: is stage2 valuable? 11:53:56 Jan: valuable for demonstration 11:54:02 .. 
not here today 11:55:06 Serge: if user can annotate, he may amend the glossary for example, and regenerate the helper page 11:55:27 I think Jan said: it's valuable, as a helper page. The more enhanced usage (editing translations, making corrections etc.) is a different task, more complex and not in stage 2 11:55:54 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 11:56:18 Serge: idea is to push ITS as far as possible 11:56:31 .. instead of using javascript 11:57:46 .. usability is important. e.g. no pop-ups 11:58:05 .. user needs context, color coding, no pop-up. 11:58:28 dF: pop-up could be interactive 11:58:58 Arle has joined #mlw-lt 11:59:05 Tadej: using this one could see for example different meaning to do correction. 11:59:58 Felix: need to get a general opinion 12:00:09 .. anyone in favor or not. 12:00:29 Dave: strong feeling that seeing such HTML5 page would be useful 12:00:43 Jan: visualization of the ITS data. 12:00:50 .. it's useful 12:01:36 Arle: also an example of something tools can implement directly 12:02:02 Tatiana: like the idea. 12:02:29 .. context was one of the top requirements we saw in our research 12:04:10 http://www.w3.org/International/multilingualweb/lt/wiki/PragueSep2012#26_Sept:_MLW-LT_WG_meeting_agenda 12:04:30 rrsagent, draft minutes 12:04:30 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html Yves_ 12:05:14 Felix: the tool information is important, we can talk about that. 12:05:40 Second Proposal: Tilde (Tatiana) 12:06:14 s/Tilde (Tatiana)/Init (Sebastian)/ 12:06:18 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 12:06:49 Sebastian: main focus of ]Init[ is government 12:07:38 .. many clients are multilingual (Europe languages) 12:08:22 .. we discovered ITS in May. it's a process. 12:08:29 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 12:08:44 .. 
our proposal is ITS in open/libre-office 12:09:09 .. we create documents in DE, then they are translated 12:09:31 .. but we have a break between the author and the translator 12:09:49 .. data categories we'd like to use: 12:09:53 .. Translate 12:09:57 Locale Filter 12:10:10 .. Terminology (for candidate) 12:10:26 .. Localization Note 12:11:07 .. idea is to add button in O/L-Office 12:11:17 .. so ITS markup could be added in document 12:12:06 .. this would be a plugin, an extension of O/L-Office 12:12:48 .. author would add ITS tags in content 12:13:08 .. then we could save this into OD format. 12:13:29 .. we could also possibly extend Okapi to support ITS in ODF filter 12:14:29 .. for planning: development until Dec. 12:14:47 .. reporting experience after. 12:15:29 .. will need UI to do insertion of the ITS marker 12:15:40 .. save to XLIFF, and merge back 12:16:06 .. the UI would be in English (and German) 12:16:42 .. will not consider MT and Asian languages 12:17:01 .. it would be an open-source project 12:17:46 .. will do either open or Libre-office 12:18:08 .. possibly re-using okapi functionality 12:18:44 Jirka: so when I annotate file the data go outside. 12:19:19 .. seems that the project should be able to save the ITS data in the OD file 12:19:31 .. extensibility too 12:19:39 .. is available 12:20:08 Shaun: interested too. if tags would be in ODF then i can use it too. 12:20:22 Pedro: using two files would be difficult 12:20:43 .. agree that data should be in ODF 12:21:37 dF: yes, ITS markup would need to go in ODF 12:22:02 tadej has joined #mlw-lt 12:22:19 rrsagent, draft minutes 12:22:19 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html Yves_ 12:22:54 dF: don't see a reason to have extraction from O/L-Office 12:23:49 Sebastian: ok, I'm taking the feedback. 12:25:11 Jirka: you can use custom attribute in ODF with some restrictions 12:25:40 .. 
those may be lost if ODF application does not support those extensions 12:26:27 Sebastian: you think extension mechanism will allow for this? 12:26:34 Jirka: I do not know. 12:27:34 Sebastian: need to know between Libre or Open Office 12:28:51 Shaun: LibreOffice is a fork of OpenOffice. Then OpenOffice became part of Apache Software 12:28:53 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 12:29:40 .. LibreOffice is an out-of-the-box application, OpenOffice now is a "platform" developers can add to. 12:29:47 dF: what is the overlap? 12:30:19 Shaun: it was a full fork, both sides continued to merge from each other 12:30:26 .. on occasion. 12:30:27 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 12:30:55 jan: what is the one you use 12:31:07 Sebastian: we are using openOffice 2.1 12:31:32 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 12:31:37 .. but use LibreOffice for other clients 12:32:13 Jan: seems LibreOffice would be the choice then. 12:33:23 .. other platform could be extended like this too. here the idea is to demonstrate ITS usage. 12:33:48 Jirka: maybe the LibreOffice could save as OOXML 12:34:21 .. could speed up adoption 12:34:41 Sebastian: for XLIFF, 1.2 or 2.0? 12:34:58 Felix: if you rely on Okapi, then use what is there. 12:35:20 Third Presentation: Tilde (Tatiana) 12:35:25 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 12:37:52 Tatiana: main business LSP, terminology services, Lingual technologies, and open Linguistic Infrastructure (Tilde META-SHARE node, META-NORD) 12:38:15 .. refocusing as Terminology as a service (TaaS) 12:39:04 .. now developing web-based services, MT, indexing, annotation, etc. 12:39:39 .. our first proposal was very extensive 12:40:03 .. our goal now is to filter and understand the requirement of the LT-Web project 12:40:31 .. 
see that LT-Web needs to be validated 12:41:26 .. our idea is to help in usability, interoperability, productivity 12:41:42 .. several use cases: 12:41:58 First: Simple Terminology Annotation 12:42:30 Data categories: Language Information, Locale Filter, Terminology 12:42:43 .. possibly a confidence attribute 12:43:26 .. unsupervised identification of term candidates 12:43:45 .. in HTML5, then presented to users (highlighted) 12:44:09 .. third aspect is a web service with a web application 12:45:50 Pedro _ scribe 12:46:15 rrsagent, draft minutes 12:46:15 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html Yves_ 12:46:18 Proposal 2: Enhanced terminology annotation and simple machine translation 12:46:55 Includes: term recognition, and existing terminology resources. 12:47:38 .. annotate terminology and interactive interface 12:48:04 .. support consumption by machine translation 12:48:46 .. for existing language resources there is a problem to find the way to reuse and share them 12:49:20 .. ITS DC Considered: Language information, locale filter, terminology and translate 12:49:44 .. benefits: production and consumption of ITS 12:50:34 .. one of the existing resources that can be used in the 2nd proposal is eurotermbank 12:50:59 .. largest terminology bank in Europe 12:51:40 .. more benefits: reusing existing resources like this bank 12:52:35 .. Proposal 3: Enhanced Terminology Annotation and Enhanced Machine Translation 12:53:20 .. Includes the two previous proposals, extend the MT enhancement 12:53:57 .. ITS DC added: Elements Within Text 12:54:20 .. Additional benefits: enhance translation quality 12:55:09 Des: existing terminological resources are used for disambiguation tasks? 12:55:25 Tatiana: yes, it is one of the purposes 12:56:16 Tadej: it also to apply to terms? 
12:56:54 Tatiana: yes, we use terminology datacategory and we explore disambiguation and some other 12:57:28 Tadej: terminology datacategory is not modelled for candidates, we should revise. 12:57:39 Des: candidate can be domai especific. 12:58:01 s/domai /domain / 12:58:23 s/ esp/ sp/ 12:58:46 Tadej: to use confidence for this is interpretation. 12:59:05 Felix: this is about term annotation and not entity annotation, and there is some overlap 12:59:22 .. but there is a different annotation workflow 12:59:25 Tadej: yes 12:59:55 Sergey: Is there any kind of API to access existing resources? 13:00:16 .. In the case of the EU bank? 13:00:22 scribe: Pedro 13:00:47 Tatiana: yes, it is a resource sensitive scenario, does not take any glossary. 13:01:06 s/Pedro _ scribe/scribe: Pedro/ 13:01:09 Sergey: why? there are many other valid glossaries. 13:01:53 Tatiana: maybe with the same architecture we can in short time allow to compile other new resources 13:02:11 Felix: in the enhancer tadej has this approach. 13:02:14 Tadej: yes 13:02:29 s/in the enhancer/in Enrycher/ 13:02:46 Sergey: Is it an open source application? 13:03:14 Tatiana: now it is under discussion, but it will be a service. 13:03:57 Felix: independently of whether it is open source, it can be a service, like Drupal does 13:04:30 Tatiana: in the business model it will be a service level in the future, of course 13:05:08 Felix: Perhaps the best user strategy can be discussed. 13:07:06 Tatiana: schedule Proposal 1: 6 months, Proposal 2 and 3: to be done in 1 year 13:07:18 Sergey: which languages? 13:07:35 Tatiana: English as the source language. 13:07:52 .. with rich morphology for inflection, etc. 13:09:29 .. and other 2 languages with all the morphology. In the future it will be able to add other languages. 13:10:09 Felix: additional confidence attributes were discussed, and need to be stable 13:10:27 ... for this prototyping this will help. 13:10:53 .. but do you plan to be able to see before 6 months? 
13:11:35 Tatiana: we have to see, but yes. 13:12:16 Felix: opinions about these three proposals? 13:12:25 scribe: fsasaki 13:12:31 pedro: three different proposals 13:12:38 .. all final users implied 13:12:57 .. this takes care about the "human side" of the usage of the metadata 13:13:15 .. so this scenario will help users to annotate metadata in the content 13:13:42 s/proposals/proposals (Logrus, ]init[, Tilde)/ 13:13:55 s/this scenario/the Tilde scenario/ 13:14:18 declan: also for MT it might be useful - monolingual or multilingual terminology information 13:14:52 pedro: if we have time, the groups that we are talking to (translators, terminologists, testing) 13:15:25 tatiana: we can also leverage our localization and terminology partners 13:15:41 Pedro scribe/ 13:15:47 scribe: pedro 13:16:00 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 13:16:17 Tatiana: we will not perform all 3 proposals, but want to get your vision 13:16:27 .. and to select the one that applies better 13:17:03 Felix: Is it possible to modify Proposal 1? 13:17:34 Sergey: do not underestimate Proposal 1 13:17:58 dF: it is different in the different languages 13:18:36 Tatiana: maybe the terminology use case will be richer if we combine part of proposal 13:18:57 .. but we do not want only to produce metadata but to consume it 13:19:34 .. terminology per se is not enough 13:20:08 Felix (coordinator): Maybe to collaborate with other existing partners 13:20:37 .. we have to see if it is Matrix or Lucy, or another one. 13:20:47 .. we have to think about it. 13:21:25 .. they can make experiences 13:22:14 .. 
We need to have the feeling if it is better proposal 1, 2 or 3 13:23:20 Tatiana: we have to decide and get back to you 13:24:29 Jim: In terms of preferences I will stack them in this order: 1, 2, 3 13:24:54 Sorry, wasn't Jim, but Jan 13:25:19 Declan: in proposal 3 you cover more 13:25:28 Felix: Of course, but the budget 13:25:38 scribe: pedro 13:25:43 scribe: felix 13:25:53 pedro: we plan to finish most of the stuff middle next year 13:25:59 .. so that we have time for showbusiness 13:26:38 tatiana: the 3rd one would be one year, the others shorter 13:26:49 scribe: felix 13:26:53 scribe: pedro 13:28:10 Tatiana: a possibility is a mixture 13:28:20 Felix: we can discuss later. 13:28:34 .. there are now two other things before Coffee break 13:29:03 .. the current proposals are on the table and no time for looking other 13:29:26 .. decisions will be there in a short time: around 2 weeks 13:29:41 dF: we need the economical proposal details 13:30:30 Felix: yes, from Init we have, but we could have also a budget breakdown from Logrus and Tilde 13:30:41 .. 
for Monday next is fine 13:30:52 Coffee break 13:30:59 scribe: pedro 13:31:06 rrsagent, draft minutes 13:31:06 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html Yves_ 14:06:48 http://www.w3.org/International/multilingualweb/lt/wiki/PragueSep2012#26_Sept:_MLW-LT_WG_meeting_agenda 14:06:49 tadej has joined #mlw-lt 14:07:20 daveL has joined #mlw-lt 14:11:14 I have made the request to generate http://www.w3.org/2012/09/26-mlw-lt-minutes.html fsasaki 14:12:59 https://www.w3.org/International/multilingualweb/lt/track/issues/open 14:13:16 tadej_ has joined #mlw-lt 14:13:18 mhellwig has joined #mlw-lt 14:13:20 https://www.w3.org/International/multilingualweb/lt/track/issues/open 14:13:49 Milan has joined #mlw-lt 14:18:30 scribe: dave 14:18:37 scribe: daveL 14:18:47 topic: tool reference issue 14:19:25 yves: for many data categories, knowing the tool is useful and for some it is really important 14:20:08 ... but the problem is some data cats operate best at a local level, where the tool ref is a big overhead 14:20:32 ... felix proposed a standoff format detailing the tool 14:20:49 ... but this made it difficult to reference the tool. 14:21:29 ... another approach is using xml:id, but this could clash with the many other usages of xml:id. 14:22:54 yves: so an alternative proposal is a tool reference that binds the data category id 14:23:34 ... that could be used to override the tool binding, but not the referenced data category at a local level 14:24:27 ... but this still needs to point to the standoff tool specification file 14:25:07 dF_ has joined #mlw-lt 14:25:08 external tool information is here: http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Sep/att-0189/tool-its-info-example.xml 14:25:15 ... 
but this could be done by replacing the textual tool ID bound to the data category to tool with a URI 14:27:53 df: what if you have two MT in the document 14:28:48 yves: you just define that locally to differentiate 14:30:57 shaun - could we allow tool refs for data categories outside of the ITS one 14:32:53 yves - yes, perhaps with a namespace-like prefix to data category extension 14:33:33 David and Tadej indicate this addresses their tool identification needs 14:35:23 pedro: asks whether this can help with evaluation of MT output 14:35:35 declan: those aren't confidence scores 14:36:00 Des: asks about updating the tool id 14:38:40 felix: asks also how to do this in HTML5 14:39:04 ... possibly using a similar mechanism to quality issue using its:span 14:41:22 jirka: could also put it in a script element 14:43:35 action: shaun to create a list of canonical data category names 14:43:35 Created ACTION-232 - Create a list of canonical data category names [on Shaun McCance - due 2012-10-03]. 14:45:35 shaun: this could be an RDF link, but this requires an RDF tool chain rather than an XML one 14:47:29 jirka: says that using script is the only way currently of including XML in HTML 14:49:12 felix: shows how to do equivalent in the quality issue example 14:49:20 yves: seems like a hack 14:49:41 jirka: possibly, but it's what's possible in html currently. 
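Jirka's suggestion, carrying the tool information XML inside a script element of the HTML page, can be sketched roughly as follows. This is only an illustration under stated assumptions: the `toolInfo` and `tool` element names, the namespace, the `tool-info` id, and the `application/xml` type are hypothetical and are not taken from the ITS 2.0 draft; `extractScriptContent` is a hypothetical helper, and its regular expression only handles this simple case, not arbitrary HTML.

```javascript
// Hypothetical HTML page: the XML sits in a script element with a
// non-JavaScript type, so the browser does not try to execute it.
const page = `<!DOCTYPE html>
<html>
<head>
<script id="tool-info" type="application/xml">
<toolInfo xmlns="http://example.org/hypothetical-tool-info">
  <tool id="mt1" name="ExampleMT" version="1.0"/>
</toolInfo>
</script>
</head>
<body><p>Some translated text.</p></body>
</html>`;

// Return the raw text inside <script id="..."> or null if no such element.
function extractScriptContent(html, id) {
  const pattern = new RegExp(
    '<script\\b[^>]*\\bid="' + id + '"[^>]*>([\\s\\S]*?)<\\/script>', 'i');
  const match = pattern.exec(html);
  return match ? match[1].trim() : null;
}

const toolInfoXml = extractScriptContent(page, 'tool-info');

// DOMParser exists in browsers only, so the parsing step is guarded here.
let toolInfoDoc = null;
if (typeof DOMParser !== 'undefined') {
  toolInfoDoc = new DOMParser().parseFromString(toolInfoXml, 'application/xml');
}
```

In a browser, `document.getElementById('tool-info').textContent` would give the same string without any regular expression; the string-based helper is used only so the sketch also runs outside a browser.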
14:50:00 https://developer.mozilla.org/en-US/docs/DOM/DOMParser 14:50:17 felix: current recommendation is to put the tool info xml into script in html 14:50:41 var doc = parser.parseFromString(stringContainingXMLSource, "application/xml"); 14:50:49 var parser = new DOMParser(); 14:50:50 var doc = parser.parseFromString(stringContainingXMLSource, "application/xml"); 14:50:50 // returns a Document, but not a SVGDocument nor a HTMLDocument 14:51:02 felix: should we do the same in quality issue example 14:51:07 yves: agrees 14:51:34 action: felix to update quality issue to use the same solution 14:51:34 Created ACTION-233 - Update quality issue to use the same solution [on Felix Sasaki - due 2012-10-03]. 14:52:12 felix: so how do we manage conformance? 14:53:14 yves: would need to specify for the appropriate data categories that they MUST support tool info 14:53:20 http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#conformance-product-processing-expectations 14:53:54 dF: agrees it's a MUST for MTconfidence, but does that need to be universal? 14:55:05 felix: explains the current conformance rules and suggests that if an application supports mt confidence it MUST support tool info, and others MAY support it 14:57:54 ... this doesn't need a conformance test, but we should do this to ensure ease of conformance for this option 15:00:40 df: but if it is optional, then it won't get implemented 15:02:59 tadej: it's optional for disambiguation, since this could be manually generated, so the tool is not necessarily important here 15:04:22 dave: also an option for provenance agent data categories 15:08:01 df: so should it be left optional or made specific to mt confidence score 15:09:47 felix: but it's still useful for others, just not mandatory, but would lose that if it was specific to mtconfidence score 15:11:02 tadej: the tool info is important for text analytics 15:13:45 ... 
so you can give the tool info even without the confidence score 15:14:31 jan: asks how confident group is with confidence score being only valid inter-segment per engine rather than between engines 15:16:29 declan: had been discussed but that is currently what we can do and we wanted to keep it simple 15:17:26 if