07:24:52 RRSAgent has joined #mlw-lt
07:24:52 logging to http://www.w3.org/2012/11/01-mlw-lt-irc
07:24:59 Zakim has joined #mlw-lt
07:25:04 meeting: MLW-LT Lyon f2f
07:26:01 chair: felix
07:26:03 scribe: various
07:26:20 agenda: http://www.w3.org/International/multilingualweb/lt/wiki/LyonNov2012#Thursday_1st_Nov:_MLW-LT_WG_meeting_agenda
07:28:48 Ankit has joined #mlw-lt
07:30:11 action: felix to follow-up on discussion about regular expression for allowed characters, see http://www.w3.org/2012/10/31-mlw-minutes.html#item02
07:30:11 Created ACTION-261 - Follow-up on discussion about regular expression for allowed characters, see http://www.w3.org/2012/10/31-mlw-minutes.html#item02 [on Felix Sasaki - due 2012-11-08].
07:37:36 tpacbot has joined #mlw-lt
07:45:45 Yves_ has joined #mlw-lt
07:48:39 yes, I've seen your email. Thanks.
08:00:28 Jirka has joined #mlw-lt
08:01:40 pnietoca has joined #mlw-lt
08:01:45 Milan_ has joined #mlw-lt
08:01:54 kfritsche has joined #mlw-lt
08:01:54 tadej has joined #mlw-lt
08:02:05 mdelolmo has joined #mlw-lt
08:02:05 Naoto has joined #mlw-lt
08:02:19 Marcis_Pinnis has joined #mlw-lt
08:02:19 Fredrik has joined #mlw-lt
08:02:19 mhellwig_ has joined #mlw-lt
08:02:41 daveL has joined #mlw-lt
08:03:50 SebastianSk has joined #mlw-lt
08:06:46 Pedro has joined #mlw-lt
08:07:06 clemens has joined #mlw-lt
08:07:13 scribe: clemens
08:08:39 topic: http://www.w3.org/International/multilingualweb/lt/wiki/LyonNov2012#Thursday_1st_Nov:_MLW-LT_WG_meeting_agenda
08:09:00 renatb has joined #mlw-lt
08:09:20 Agenda accepted
08:09:39 topic: self intro
08:10:00 people doing self-intro
08:10:19 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki
08:15:54 topic: review implementation issues
08:16:06 DomJones has joined #mlw-lt
08:16:58 http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#domain-implementation
08:17:40 changes are: removing quotes and duplicates
08:21:09 Need for a common representation of the ITS data categories in XLIFF, see XLIFF_Mapping
08:21:30 felix: domain issue is "fine"
08:26:45 tpacbot has joined #mlw-lt
08:27:43 ITS to XLIFF working draft mapping: http://www.w3.org/International/multilingualweb/lt/wiki/XLIFF_Mapping
08:31:06 Implementors are waiting for the XLIPP mapping
08:31:16 XLIFF
08:31:31 s/XLIPP/XLIFF/
08:32:57 current direction is that pointers are not needed in XLIFF
08:33:33 "mrk", "sm" and "em" elements in XLIFF. "mrk" and "sm" would be extended
08:34:09 XML Schema subset of regex for allowed characters
08:34:57 http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#allowedchars-implementation
08:36:57 Yves, could you repeat your issues please? Didn't get everything....
08:43:36 the issue is that not all programming languages support all XML Schema regex patterns
08:44:01 thx, Yves
08:44:16 http://www.w3.org/2012/10/31-mlw-minutes.html#item02
08:48:29 we keep the draft as it is now and postpone the topic
08:48:39 Allowed Characters regular expression for not allowing HTML tags in content nodes where only plain text content is allowed
08:59:24 tadej has joined #mlw-lt
09:02:26 rrsagent, draft minutes
09:02:26 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html Yves_
09:04:33 action: pedro to write note about allowed characters issue with help from Mauricio and karl - due 8. November
09:04:34 Created ACTION-262 - write note about allowed characters issue with help from Mauricio and karl [on Pedro Luis Díez Orzas - due 2012-11-08].
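A minimal sketch of the allowed-characters use case mentioned above (rejecting HTML tags in plain-text nodes), assuming the allowed set is written as a single simple character class. The function name `disallowed_characters` and the class `[^<>]` are illustrative assumptions, not taken from the ITS 2.0 draft. Python's `re` module is not an XML Schema regex engine, which is the portability concern raised in the discussion; only a plain character class like this one is expected to behave the same in both.

```python
import re


def disallowed_characters(text, allowed_class):
    """Return the characters of `text` that do not match `allowed_class`.

    `allowed_class` is assumed to be one simple character class such as
    "[^<>]"; anything beyond that may differ between XML Schema regex
    and Python's `re` syntax.
    """
    pattern = re.compile(allowed_class)
    return [ch for ch in text if not pattern.fullmatch(ch)]


# Hypothetical rule: forbid angle brackets so no HTML tag can appear.
ALLOWED = r"[^<>]"

print(disallowed_characters("plain text", ALLOWED))
print(disallowed_characters("some <b>bold</b>", ALLOWED))
```

The first call returns an empty list; the second returns the four angle brackets, which a checker could report against the offending node.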
09:05:34 "Need to add all the document HMTL tags (wrap the content with html, head, body tags) so we can add a link to a global rules XML"
09:05:39 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki
09:09:46 DomJones has joined #mlw-lt
09:12:29 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki
09:13:57 http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Nov/0005.html
09:23:20 action: implying some its:rules in html tags
09:23:20 Sorry, couldn't find implying. You can review and register nicknames at .
09:24:22 http://www.w3.org/TR/xml-i18n-bp/#relating-its-plus-xhtml
09:25:09 1) input 18_8.xml
09:25:31 2) XML parsing of HTML fragment e.g. via validator.nu library in 18_8.xml
09:25:46 output of step 2): DOM or XML serialization
09:25:56 3) normal ITS processing
09:28:42 0) input is "some HTML data"
09:28:50 next step: step 2)
09:28:57 next step: step 3)
09:35:14 Coffee break now
09:39:21 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html Yves_
09:48:27 Zakim has left #mlw-lt
09:52:21 leroy has joined #mlw-lt
10:01:02 Jirka has joined #mlw-lt
10:05:19 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki
10:05:20 Milan_ has joined #mlw-lt
10:06:17 JonasJacek_ has joined #mlw-lt
10:07:32 mhellwig has joined #mlw-lt
10:08:25 that'd be great, thank you :)
10:08:49 scribe: Yves_
10:09:46 mauricio: one possibility for solving this could be changing the HTML content to XHTML
10:10:04 dF: that would allow local ITS markup
10:10:38 Felix: so no change for the data category
10:10:50 .. maybe some note on best practices?
10:11:23 dF: one problem Karl noted is the validator inside Drupal
10:11:34 mdelolmo has joined #mlw-lt
10:11:53 Jirka: I can try to look at it and see if it could be fixed
10:12:32 Felix: but do we need guidance for CMS in general?
10:12:50 dF: maybe in a non-normative section
10:13:37 daveL has joined #mlw-lt
10:13:41 Moritz: yes a best practice
10:13:59 action: david to summarize the options and the recommendations related to HTML parsing workflow in the CMS, see discussion at http://www.w3.org/2012/11/01-mlw-lt-irc#T10-13-41
10:13:59 Sorry, ambiguous username (more than one match) - david
10:13:59 Try using a different identifier, such as family name or username (eg. dlewis6, dfilip)
10:14:11 action: dfilip to summarize the options and the recommendations related to HTML parsing workflow in the CMS, see discussion at http://www.w3.org/2012/11/01-mlw-lt-irc#T10-13-41
10:14:11 Created ACTION-263 - Summarize the options and the recommendations related to HTML parsing workflow in the CMS, see discussion at http://www.w3.org/2012/11/01-mlw-lt-irc#T10-13-41 [on David Filip - due 2012-11-08].
10:14:21 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html Yves_
10:14:39 Felix: so maybe a best practice for CMS
10:15:28 "Troubles with namespaces in HTML5."
10:15:50 xmlns:h="http://www.w3.org/1999/xhtml"
10:16:21 Pedro: Jirka noted we need the namespace definition
10:16:25 http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Oct/0226.html
10:16:52 Jirka: the namespace declaration is missing
10:17:41 .. probably an implementation issue
10:18:09 .. the node in the DOM should list the namespaces
10:18:22 .. I can help if needed
10:19:22 Richard Ishida comes to discuss logistics/topics with HTML5 WG
10:19:40 .. and I18N WG
10:20:50 tomorrow 11 a.m. meeting with i18n wg
10:21:07 back to implementation issues
10:21:23 .. HTML5 namespace issue resolved it seems
10:21:23 "Need to come to an agreement to map domain values to be consistent for both Lucy and DCU's MT Systems."
10:22:06 Felix: important for the showcase that the two systems communicate
10:22:56 action: thomas to follow up on domain list topic with DCU
10:22:56 Created ACTION-264 - Follow up on domain list topic with DCU [on Thomas Rüdesheim - due 2012-11-08].
10:23:28 "Problems to use global rules with the provenance and quality metadata since ATLAS PW1 cannot place files on the client server."
10:24:16 Ankit has joined #mlw-lt
10:24:16 Pedro: our system doesn't allow to create rules file on the client
10:24:38 .. rules file then points to our server, but that's not the right way to do it
10:25:03 .. also will we drop global rules?
10:25:45 David: important question for some data categories like provenance, etc.
10:26:06 .. some idea is to exclude some pointers
10:26:33 .. but some use cases show some non-pointer global rules are useful
10:27:06 .. other issue is how to address attributes
10:27:45 .. translatable attributes are not best practice,
10:28:08 .. so maybe it's ok to drop global rules despite those use cases
10:28:34 Felix: what about the MT quality
10:28:51 dF: well some attributes are not going away, like in HTML5
10:28:57 s/MT quality/quality issue/
10:29:55 .. no way to markup things like alt or title
10:30:33 Felix: two types of implementers: some using XLIFF extraction, others work directly with original data
10:30:51 .. for XLIFF case this is not an issue
10:31:00 .. for original data this is a problem
10:32:08 David: we should be consistent, if an attribute can be translated, it should be able to get other data categories too
10:32:24 .. an approach maybe could be to use a local attribute?
10:33:02 Felix: two topics one is where to put the global rules, the other is do we need global rules
10:33:42 .. with local, you can't address attributes
10:33:59 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html Yves_
10:34:29 Tadej: so global rules in script
10:34:36 .. ok with that solution
10:35:03 .. also concern about a rule should not have pointer and values at the same time
10:35:24 .. basically not doing proxy stand-off annotation
10:35:59 Felix: yes, should we drop this statement?
10:36:37 .. one could have combination of both in some cases
10:37:25 Tadej: this would cause clashes in some cases
10:37:47 .. not sure if this affects stand-off
10:38:03 Felix: stand-off is different
10:38:32 .. the URI points to a piece of info, so it still "add" information
10:38:56 Tadej: so the goal is to keep all info in one place rather than scatter it around
10:39:39 Felix: If we keep the constraint what do we do with cases like Quality Issue
10:40:27 David: is XLIFF the only use case for pointer in Quality issue
10:41:56 yves: pointer for the reference attribute to standoff markup is needed
10:42:01 .. at least in the XLIFF case
10:42:37 felix: would that be resolved by a mapping table?
10:42:50 yves: would not make the stuff processable by an ITS processor
10:43:28 yves: need for pointer would be just for reference attribute, not the other pointers
10:44:09 David: if XLIFF is the only use case, should we make it a special case?
10:44:11 Jirka has joined #mlw-lt
10:45:07 yves: can imagine an "Its only" processor to deal with quality issues
10:45:19 .. they can have special information for ITS
10:45:29 .. but big question is: do we want to allow that to happen
10:45:42 .. that is have other formats to work without XLIFF markup
10:46:13 .. if other formats don't allow the ITS native mapping attribute, what to do?
10:46:26 .. if XLIFF mrk allows extension, the problem goes away
10:47:39 Felix: difficult to discuss because we depend on XLIFF decision
10:49:35 yves: we need to wait for next week's XLIFF meeting; if it is not resolved, we need to find a different way
10:49:43 DomJones has joined #mlw-lt
10:51:12 action: daveL to react to decision on extensibility for mrk in XLIFF and the result for our "pointer" attributes
10:51:12 Created ACTION-265 - React to decision on extensibility for mrk in XLIFF and the result for our "pointer" attributes [on David Lewis - due 2012-11-08].
10:51:51 Felix: Example 54 and 55
10:51:57 .. RDFa
10:52:31 .. .. ITS generic processor can then work with RDFa
10:53:05 .. Karl provided feedback
10:53:21 .. is this a use case for Pointer
10:53:43 dF has joined #mlw-lt
10:53:44 Tadej: the example has some small issues
10:54:07 "typeof=http:/nerd.eurecom.fr/ontology#Place": only single slash
10:54:19 .. single slash missing and mis-named attribute
10:54:40 Example 55: "entityTypeRefPointer" -> "disambiguationClassRefPointer"
10:54:50 "disambigClassRefPointer"
10:55:37 Felix: is that the only way to consume RDFa if you don't know about it?
10:55:41 .. it seems so
10:55:49 Tadej: yes
10:56:55 .. main difference with Example 52
10:57:09 .. it's much less stable, more prone to error
10:57:20 .. using the pointer is more stable
10:57:41 .. but see why 52 would not be recommended
10:58:22 .. for consistency, using local would be better
10:58:53 action: felix to edit disambig example
10:58:53 Created ACTION-266 - Edit disambig example [on Felix Sasaki - due 2012-11-08].
11:00:10 Felix: Currently pointer/non-pointer can be mixed in Disambiguation
11:00:18 .. not good
11:00:28 .. you should do either one
11:00:45 Tadej: agreed
11:01:14 action: felix to try to simplify disambiguation globally
11:01:15 Created ACTION-267 - Try to simplify disambiguation globally [on Felix Sasaki - due 2012-11-08].
11:02:13 "Language Information: Use-Case? When is xml:lang or lang not enough or can't be used?"
11:02:31 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki
11:02:41 karl: we have xml:lang or lang
11:03:01 .. when using HTML we should use lang
11:03:47 .. maybe the content of those attributes should be explicitly defined
11:04:07 Felix: in section 6.7.1
11:04:18 .. we say that BCP47 is the value
11:05:10 Dave: it's there for XML vocabulary where xml:lang is not used
11:05:25 .. no need for pointer in HTML5
11:06:41 Pedro: another case is to keep info about what was the original language
11:06:56 http://www.w3.org/TR/xml-i18n-bp/
11:07:09 Example 19: Declaring language information with a non-standard mechanism
11:07:11 .. but that's not for this.
11:07:21 Felix: yes, it's for XML
11:08:47 Pablo: was not able to come up with use case in HTML
11:09:51 Dave: so all three xml:lang, lang and pointer are relevant
11:11:22 yves: various legacy content has other attributes for language information, that is a use case for language Information
11:11:52 .. all those cases use bcp 47 as a value, so with the pointer attribute we always get the same value
11:12:25 dF: what about the BCP47 extensions
11:13:01 Felix: ITS does not do validation of content
11:13:16 .. could have a note about issue related to extensions
11:15:03 Pedro: maybe ITS 3.0 could have a way to indicate the original language
11:15:17 Felix: could be done in provenance maybe
11:16:09 Moritz: so can we drop langInfo for HTML5?
11:16:46 "[Ed. note: Add something about HTML5 lang]"
11:16:49 Felix: there is an editor note about it
11:17:28 http://tools.ietf.org/html/rfc6497
11:17:31 .. Done with the implementation issues
11:17:43 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html Yves_
11:18:07 Felix: could start with test suite
11:18:30 Jirka: we need to start to work on schema
11:18:46 .. several data categories are still not stable
11:18:56 .. we need that for the publication
11:19:23 Felix: for disambiguation MTconfidence it's about pointer and tool reference
11:19:39 for LQ Precis it's about implementation commitments
11:20:37 dF: name may prevent adoption
11:21:02 Felix: another detail is the question about global rules
11:21:16 .. let say cut off date in 3 weeks
11:21:30 s/let/let's/
11:21:52 .. including the details with global rules and pointers
11:22:40 Jirka: maybe too short, need 2 weeks at least
11:22:52 .. could do it in parallel
11:23:19 felix: Friday 23rd would be the cut off day
11:23:43 .. then end of nov we have stable content
11:23:57 .. then we have two more weeks for schema and tests
11:24:27 .. not sure how realistic it is in december
11:24:44 Leroy: should be ok for tests
11:25:08 Felix: we want to get to LC
11:25:26 .. and we have a January f2f where we want to address comments
11:26:01 .. if we send this at start of December people would have time for comments
11:27:04 .. should we have a call in week of 26th for test/schema
11:27:30 Jirka: not sure, depends on time I need to work on schema
11:27:54 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html Yves_
11:28:38 action: dom to make sure that schedule for test suite and schema update discussed at http://www.w3.org/2012/11/01-mlw-lt-irc#T11-27-30 is taken into account
11:28:38 Sorry, couldn't find dom. You can review and register nicknames at .
11:28:50 action: DomJones to make sure that schedule for test suite and schema update discussed at http://www.w3.org/2012/11/01-mlw-lt-irc#T11-27-30 is taken into account
11:28:50 Sorry, couldn't find DomJones. You can review and register nicknames at .
11:29:09 action: Dominic to make sure that schedule for test suite and schema update discussed at http://www.w3.org/2012/11/01-mlw-lt-irc#T11-27-30 is taken into account
11:29:09 Created ACTION-268 - Make sure that schedule for test suite and schema update discussed at http://www.w3.org/2012/11/01-mlw-lt-irc#T11-27-30 is taken into account [on Dominic Jones - due 2012-11-08].
11:29:47 Pedro: LQ precis could be LQ metrics
11:30:00 dF: could be LQ score
11:30:16 felix: lunch now. back in one hour
11:32:42 Milan_ has joined #mlw-lt
11:47:58 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html Yves_
12:24:54 leroy has joined #mlw-lt
12:29:33 DomJones has joined #mlw-lt
12:35:46 Jirka has joined #mlw-lt
12:38:41 Ankit has joined #mlw-lt
12:40:27 Milan has joined #mlw-lt
12:42:29 daveL has joined #mlw-lt
12:42:32 scribe: daveL
12:42:39 Topic: test suite
12:43:03 DomJones: brief overview of test suite
12:44:02 https://docs.google.com/spreadsheet/ccc?key=0AgIk0-aoSKOadG5HQmJDT2EybWVvVC1VbnF5alN2S3c#gid=0
12:44:40 dF has joined #mlw-lt
12:45:08 ... looking at these tables there are some features that don't have two implementors
12:46:47 action: DomJones to ask Phil to confirm whether or not he will implement provenance
12:46:47 Sorry, couldn't find DomJones. You can review and register nicknames at .
12:47:20 Yves: not committing yet to provenance as it is not stable
12:48:13 mhellwig has joined #mlw-lt
12:48:25 Dom: Yves just confirmed he will support disambiguation
12:50:32 clemens has joined #mlw-lt
12:50:51 Tadej: will implement text analysis annotation if it is stable
12:51:17 dF: seems to be too unstable, needs to be resolved in specification section
12:51:23 tpacbot has joined #mlw-lt
12:51:49 http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-locQualityIssue-html5-local-1
12:52:00 Dom: quality issue is partially covered by two implementors, but different usages of OKAPI are acceptable.
12:52:25 http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#EX-locQualityIssue-html5-local-2
12:53:51 felix: quality has stand-off in script as well as in-line option. Do we want both, as these are not reflected in the test suite currently
12:54:37 ... this choice has a lot of knock on for test suite
12:54:48 Yves: prefer the stand off in script over inline
12:55:40 action: Dave to check with Phil what his preference was between quality issue locally with inline and script-based stand off
12:55:40 Sorry, couldn't find Dave. You can review and register nicknames at .
12:55:55 action: daveL to check with Phil what his preference was between quality issue locally with inline and script-based stand off
12:55:55 Created ACTION-269 - Check with Phil what his preference was between quality issue locally with inline and script-based stand off [on David Lewis - due 2012-11-08].
12:56:31 dF: the script based approach may still receive negative reaction from the HTML WG
12:57:38 Jirka: HTML WG will say to do this using microdata, but this doesn't work in this case
12:57:55 DomJones has left #mlw-lt
12:58:21 ... HTML parsing considers XML in script as if it were CDATA
12:58:51 ... but best way is the link to an external file
13:00:30 ... or use XHTML rather than HTML - which may not suit everyone
13:00:40 its-rules and its-standoff references?
13:01:08 Yves: the reference for standoff mark-up could be an external file
13:01:44 felix: so in html would we have a separate link relation, in addition to its-rules, for such mark-up
13:02:22 felix: not sure if above is needed, just *one* solution
13:03:39 df: prefer a span-based inline solution, vs a standoff in a script
13:12:51 jirka: if using the inline version, rdfa or microdata would be better, but HTML WG has stabilised this choice enough for us to make a decision
13:13:57 felix: but to summarise, for cocomore/linguaserve, the script is better since propagation back to client can be controlled more clearly
13:18:01 ... so there is slight preference for script solution in the room
13:19:30 Jirka: agrees it's not a nice solution but the best we can manage given the problem of coexisting between XML and HTML in general
13:20:54 felix: is there a volunteer to explore the use of external file for standoff
13:21:01 ... back to test suite
13:21:32 DomJones: now have two implementers for quality issues
13:22:15 ... quality Precis, has only partial coverage from UL and vistaTEC
13:22:43 df: only interested if someone produces it
13:22:57 Milan has joined #mlw-lt
13:23:12 yves: there was interest from Des also in this
13:24:23 df: this seems still quite unstable, for example how valid is the value of the score without the (optional) tool info
13:24:41 action: felix to ask phil and des and arle about need and implementation committment for localization precis during next call
13:24:41 Created ACTION-270 - Ask phil and des and arle about need and implementation committment for localization precis during next call [on Felix Sasaki - due 2012-11-08].
13:25:18 dF: so this needs more consideration especially about interpretation of score
13:27:52 felix: highlight the feature at risk in a draft, which allows us to cleanly remove after feature freeze - but not to change the feature
13:29:46 DomJones: mtconfidence and allowed text
13:31:44 ... have also been now supported with implementation
13:32:11 DomJones: asks what language people are using
13:32:33 ... responses from room, Java, Javascript, php
13:32:45 ... so tcd will try and provide some test sample code
13:32:58 DomJones:
13:33:53 ... aim to have an online tutorial to help people with coding the test suite
13:34:41 ... suggest week 3rd December, suggest on Tuesday 4th december
13:34:54 yves: need to avoid XLIFF call 4pm GMT
13:35:45 DomJones: so aim for 2.30pm GMT, 3.30 central european time
13:35:50 http://www.timeanddate.com/worldclock/fixedtime.html?iso=20121204T14
13:35:56 DomJones has joined #mlw-lt
13:36:16 DomJones: we will record this anyway for people who miss it
13:36:21 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki
13:36:39 above worldclock link is the time of the webinar for the test suite
13:38:09 DomJones: now plan to switch to github for all the files and use google docs as index sheet and recording
13:38:49 Leroy: test output will change to be tab-delimited and order the output alphabetically as suggested by Yves
13:39:46 dF has joined #mlw-lt
13:40:20 ... also use its as prefix in all cases
13:41:36 DomJones: when will files be frozen?
13:42:03 felix: after feature freeze, changes are unlikely, though external comment may require some changes.
13:42:26 its-term="yes" vs. its:term="yes"
13:42:35 http://www.w3.org/2012/09/mlw-lt-charter.html
13:43:11 felix: according to charter, testing can continue until october, but we should aim to be completed by March
13:43:31 Bert has joined #mlw-lt
13:43:39 its-term="yes" vs. its:term="yes"
13:44:53 felix: clarification on prefixes in all cases is 'its:'
13:45:08 yves: could just drop prefix altogether
13:45:22 felix, leroy: agreed
13:48:54 DomJones: we will in new year work on making testing more accessible to implementors outside of the WG, as part of the general promotion of the spec and to encourage its uptake
13:52:31 felix: will there be some documentation on this
13:53:05 Leroy: yes there will be some in the github and on the web page
13:54:01 felix: will the test parser be made available
13:55:04 DomJones: yes, we will release that later, targeted at non working group members
13:57:04 rrsagent, generate minutes
13:57:04 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html daveL
13:57:08 rrsagent, generate minutes
13:57:08 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html daveL
13:57:34 matthiasK has joined #mlw-lt
13:57:47 Topic: CMS-TMS demo by linguaserv and cocomore
13:59:17 mhellwig: explains demo
13:59:24 ... two use cases
13:59:53 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki
14:00:02 ... one where cleint have few localisation staff, so they need tools to add its marks
14:00:25 s/cleint/client/
14:00:43 ... second is where staff can enhance content with ITS, but can change the content directly - this is actually higher priority for client
14:01:27 scribe: fsasaki
14:01:37 karl: demo - editing content
14:01:47 .. in the body I can add the content and add metadata
14:02:19 .. demo shows clicking on content, then adding a localization note
14:02:49 .. also specifying concepts, e.g. disambiguation target
14:03:33 karl: can also mark content that should not be translated or only for certain languages
14:03:37 .. that is locale filter metadata
14:04:06 karl: now about global metadata
14:04:14 .. domain, revision agent, translation agent
14:04:26 .. can add translate rules
14:04:41 .. one field for the translate selector
14:05:02 .. we think about helping the users by helping with creating selectors
14:05:41 .. another example of outputting the content as only HTML(5)
14:06:08 .. this is one option of output, the other is the XML file that we have seen from Linguaserve before
14:06:39 .. now there is an example with checking the annotations:
14:07:21 .. one can see the page and click on data category buttons, then you can see the metadata available
14:07:31 .. we now press a button and send the data to linguaserve
14:08:23 (switching machines, now working with Linguaserve machine)
14:08:46 Mauricio: now demo environment, internal workflow interface
14:08:56 .. here we have received the file from Cocomore
14:09:25 .. it has information to transform the drupal XML file to CAT tool oriented files
14:10:09 .. now executed the preparational step
14:10:16 .. now simulating the translation of the file
14:10:34 .. assuming that the file has been translated now
14:10:55 .. setting simulation translation marks, will explain these later
14:11:12 .. the file is now ready to be downloaded by Cocomore
14:11:39 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html Yves_
14:11:41 pedro: we don't go into their servers - we act as a server
14:11:57 .. we don't go to the client's server
14:12:10 karl: now I am starting a manual crown job
14:12:19 .. to check if there are new translations
14:12:26 s/crown/cron/
14:12:33 .. the crown job checks regularly if there are new translations
14:12:36 s/crown/cron/
14:12:37 .. the user doesn't need to do anything
14:13:15 karl: issue with cron job is that drupal has cron jobs internally, so it takes some time until drupal gets to our cron job
14:13:20 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki
14:13:26 (karl's machine loading)
14:14:35 karl: translation received
14:14:46 .. status changed from "in progress" to "needs review"
14:15:04 Milan has joined #mlw-lt
14:15:15 .. now looking at the spanish node
14:15:28 .. the language mgmt tab shows that all metadata is still here
14:15:39 .. there is also additional metadata: revision agent and translation agent
14:16:09 (demo continues with Mauricio's machine)
14:16:30 mauricio: now showing what happens with the XML file on our side
14:16:56 .. current page is linked from our use case description in the wiki
14:17:51 .. now looking at the XML file we saw before - 21_11_orig.xml
14:17:56 .. that we saw this morning
14:18:04 .. now pre-production step
14:18:36 .. in the log there is the XPath of the nodes, will be used for the test suite
14:18:56 .. currently there is only "translate" metadata test suite output in here, others will come later
14:19:13 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki
14:19:28 mauricio: now cat-tool oriented XML file
14:19:51 .. in this file you have various information pieces, e.g. domain information that the translator will see
14:20:17 .. inside the translatable content there are some marks about parts that are not translatable
14:20:49 .. now showing a really translated file
14:21:06 .. now using this for the post-processing step
14:21:37 .. now you see the translators file in the original format of the client
14:21:50 .. what has changed: e.g. the "ready to process" in the readiness has changed
14:22:07 .. originally it had four stages: hTranslate, ...
14:22:15 .. now the state is: publish
14:22:27 .. that means it can be published on the client side
14:22:47 .. the time stamp related to readiness also has changed
14:23:01 .. also we have now translation provenance information
14:23:20 .. local xml:lang attribute has changed from "de" to "es"
14:23:55 .. next showing what will happen with the XML that we used to transalte
14:24:02 s/transalte/translate/
14:25:11 mauricio: content part that was marked as "translate=no" is blocked so that the translators cannot change it
14:25:33 .. also storage size is part of the file
14:25:48 .. that is a file used in the transit translation tool
14:26:15 .. once the engine does post-processing, the storage size will be applied
14:26:37 pedro: that is a simple way to allow the translator to control the number of characters that are possible
14:26:56 .. this is just one cat tool, but that can be done with other cat tools as well
14:27:42 pedro: expansion of global rules has been done before, and will be done in post-processing
14:35:39 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki
14:46:03 Jirka has joined #mlw-lt
14:51:35 Milan has joined #mlw-lt
14:56:08 DomJones has joined #mlw-lt
14:56:46 Ankit has joined #mlw-lt
15:02:23 mhellwig has joined #mlw-lt
15:06:37 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki
15:07:39 clemens has joined #mlw-lt
15:08:18 Provenance XPath selector value: Does the semantic combination between the Agent Provenance and Translate data category rules validate the regular expression for Provenance (//item)?
15:10:34
15:11:09
15:14:41 discussion about interrelation between data categories
15:15:00 yves: you can imagine scenarios where tools would implement only one data category
15:15:27 .. would be better to have an XPath expression that only selects the relevant nodes for provenance
15:15:32 .. but that is probably marginal
15:18:52 topic: demo of online MT system
15:19:30 scribe: DomJOnes
15:20:07 Pablo: Global use of domain, second domain, mixture of translate / domain.
15:20:21 … sample is always the same with different use of tags.
15:20:53 … no ITS errors identified. Click translate, same page presented but translated.
15:21:21 Pedro: Engine is supported by Lucy SW. Thomas absent, behind this is Lucy S/W.
15:21:39 Pedro: Showing translate local
15:22:08 … examples in source text. Tags (translate yes/no) are set randomly.
15:22:34 … showing example of translation based on ITS tag (translate yes/no).
15:23:05 … showing file sent to MT and returned result.
15:24:21 … showing pre-mt file. Searches for "translate" tag, adds in meta-tag for translator to see translate = no in example shown.
15:24:49 … goes to show source code of translated page, searches for "translate" tag, they are not found, they have been cleaned and removed from the text.
15:25:03 Felix: so after translation, the translate tags are removed?
15:25:07 yes.
15:25:41 Pablo: Showing example with translate global rules. All nodes are translated but their children are not translatable.
15:26:50 … shows example being translated. Nodes are translated, children are not. Showing the tags are being parsed correctly.
15:27:23 Marcis: Are the whole spans (including those with translate = no) sent through the translate system?
15:27:35 Pablo: Yes, all tags are sent through MT
15:28:01 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki
15:28:13 Pedro: The reason is that behind the MT engine there can be post-editors. The PE may need to see them for contextual reasons. All data is sent to MT regardless of whether text is to be translated or not.
15:28:32 Marcis: So it's just kept as is in the translation (translate = no)
15:28:50 Pablo: Yes. Shows the output that is translated and that which is not.
15:29:09 … flicks between source code and webpage. Shows tags (translate) have been removed.
15:29:46 Question: In the global rule it looks like the XPath expressions have no namespace, so how can they work with HTML? Is that related to the issue you mentioned for Namespace in HTML5 earlier today?
15:30:09 … shows domain example. Global rules declare domain. Tag, pointer, mapping.
15:30:37 … multiple "economic" domain mappings are added. 15:30:55 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki 15:30:55 Pedro: These domains are needed by the MT engine. 15:31:22 Pablo: shows the text being translated. Meta-data (domain) has been removed. 15:31:40 … Shows global domain and a different engine based on the domain. 15:31:42 Yves, you are right I was going to ask the same question. 15:32:14 … same rule applies, but different engine is used. 15:32:43 felix: asks a question from Yves and Jirka about domain rule… 15:33:05 Jirka: Selector is not completely right for HTML5 where all elements are placed in HTML namespace. 15:33:22 … I hope that this is not a final version, just a temporary one? 15:33:37 Pablo: No problems, I'll change it. But will require some work. 15:33:48 Fredrik has joined #mlw-lt 15:34:22 Pablo: Shows another global rules element. Three different selectors for elements wrt translate yes/no 15:34:33 … different domain. 15:35:22 … When we use domain meta-data we divide the document into different parts. What is sent to the MT engine is shown with tags in place and ITS domain mapping. 15:35:35 … We divide the document into three parts. 15:35:56 … we send three requests to the MT engine. 15:36:13 … it takes longer to translate three as opposed to one, 15:37:18 Pablo: Shows final example applying domain meta-data to different nodes. What is sent to the MT system has multiple domain tags added to it. When page is translated it takes a while… 15:38:02 Pedro: This negatively affects the performance of the MT engine. We must take into account how the ITS usage adds to MT costs. 15:38:11 Milan: How many requests to the MT engine? 15:38:18 Pablo: around 15. 15:38:53 David F: The overhead of multiple fragments depends on the impl of the MT. Some engines may perform better than others. 15:39:24 Pedro: This is how it works, you can balance different engines etc, but this affects the price to the client.
15:39:25 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki 15:39:51 Pablo: as you can see a cache system is present for translation requests. 15:40:10 Milan: What about creating a file per domain and sending three requests, one for each file? 15:40:36 Pablo: Maybe it can be done but it's easier to send one file. 15:40:57 Pedro: We are also using PEditors, we have to perform with a CAT standard system. 15:41:10 Pablo: Showing caching system which results in faster performance. 15:41:33 Pablo: Closes demo. 15:42:15 Yves: a general comment about domain value, when we do mapping we don't mention casing of the domain. Should they be case-sensitive? Should we edit the spec to show this? 15:42:26 Felix: Are keywords case-sensitive in HTML? 15:42:40 jirka: keyword values are case-sensitive. 15:42:41 present: Ankit Bert(partially) Dave David Dom DomJones Felix Fredrik JonasJacek Leroy Marcis Milan Moritz Naoto(partially) Pablo SebastianSk Tadej Yves Clemens jirka karl matthiasK mauricio mdelolmo mhellwig mhellwig pedro pnietoca renatb 15:43:27 ?: Maybe present in the algorithm, a lower case mapping 15:44:03 action: Yves to add a step regarding the lowercasing of the domain data category 15:44:03 Created ACTION-271 - Add a step regarding the lowercasing of the domain data category [on Yves Savourel - due 2012-11-08]. 15:44:28 David F: our general impl is case-insensitive. 15:44:52 … in contract with MT provider you may need to address this but not in our mapping. 15:45:14 Yves: Does it mean the value we return is lowercase as well or should they stay as we get them? 15:45:24 David F: should be lowercase 15:46:02 I assume "STEP 5: Return the resulting string." would be "STEP 5: lowercase the resulting string and return it." 15:47:04 Marcis: general comment, on translation in general, whether to not send something through an MT system and whether it can be specified to be passed through MT engine or not.
Sending a string through MT which is not translated could overload the MT speed. Q: Whether it's possible to differentiate between "send through the MT system but do not translate" and "do not send through the MT system". 15:47:54 David Lewis: Problematic - locale filter allows you to not pass through MT for specific locales. What was being shown was stuff that was not MT'd as it was subsequently PE'd 15:48:11 Marcis: Some things you do not want to translate, or to pass through the MT engine. 15:48:48 Pedro: Very impl dependent. Depends on what behaviour you want. If something is not translated it matters for analysis of the translation. 15:49:05 Marcis: If you know some text is important for MT and some is not can you handle both? 15:49:18 Pedro: This is a problem with the Lucy MT provision. 15:50:09 … Matrex SMT works with chunks of data, so maybe we can see whether there is an improvement but we don't know yet. 15:50:30 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki 15:50:32 trackbot has joined #mlw-lt 15:50:55 Pedro: Different techniques are used but we need to know how SMT performs. 15:51:46 David F: In ITS we have translate, term, disam which can all be combined to make a business case. But these problems are business issues between LSP and MT provider. We should not spend time on this discussion. 15:52:15 Pedro: This is both true and false. We need to test various contracts between LSP and MT provider and this should be documented. 15:52:21 David F: Agrees but not now. 15:54:21 David L: Has no-translate mark-up come up with your clients? Are they interested in this in terms of cost per word? 15:54:38 Pedro: The business model of this contract is not based on that. 15:57:08 You can find the DEMO link here http://www.w3.org/International/multilingualweb/lt/wiki/Use_cases_-_high_level_summary#More_Information_and_Implementation_Status.2FIssues_7 15:57:25 Felix: Showing screen.
30 mins to go, would like to take Bert's presence here to ITS 1.0 uses XPATH 2.0 uses extraction. Shows example of translate data cat, which can be used globally with absolute selectors. 2 subjects, 1: defining selector, 2: using CSS selector at level 3. Think this is the right thing to do based on discussions with Bert. Might be various selectors only relevant to CSS for XPATH. 15:57:43 and here https://www.w3.org/International/multilingualweb/lt/wiki/Online_MT_Systems_Use_Case_Demonstration#Use_Case_Demonstration 15:58:15 might be down sometimes depending on the development 15:58:32 … shows a rule example where node is selected based on evaluated expression with parameter. Having the CSS selector is easier for developers who may not know XPath in detail. 15:59:07 … we don't have anyone who has yet said they would implement CSS selectors. This is a feature at risk as no implementor is available yet. 15:59:41 … however during break Bert said people are working on CSS selectors to XPATH. We could say we support a translation mechanism from CSS selectors to Xpath. CSS level 3. 15:59:51 Jirka: I thought this was already done? 16:00:12 … the code should exist. 16:00:43 Felix: The developer implemented in Python, people are working with other frameworks. Wanted to bring this up while Bert is here. 16:01:14 … nice to have mechanism to use CSS selector and this to be translated into XPATH. Is this something to specify in the Spec, do we point people to libraries etc? 16:01:37 Dave L: If there was mapping from CSS to XPATH do we not need to mention CSS as part of the normative spec? 16:02:00 Felix: If mapping could be part of the spec without being used directly it would be good. 16:02:40 Cocomore?: people are maybe more familiar with CSS selector so would be better to allow users to use CSS selectors. makes more sense if these classes can be selected more easily. 16:02:48 Felix: There is a use-case 16:03:02 … any other thoughts on this?
16:03:16 s/Cocomore?/kfritsche 16:03:17 Jirka: Is someone working on in-browser implementation? 16:03:57 Yves: I think so as some impl are using CSS. If we go the CSS path what does it change for the test cases? Do we need 2 test-cases for both XPATH and CSS? 16:04:08 Dave L: This question has been asked before. 16:04:19 David F: Doubling the test-suite. 16:04:43 Felix: Just a conversion step, conversion test-case. 16:05:05 … other option is we drop the support for CSS selectors. 1) Full direct support, 2) conversion support or 3) we drop it 16:05:16 Jirka: Who asked for this? 16:05:32 Milan: Maybe it was Phil R? 16:05:58 Dave L: Interoperability is helpful in passing file from one system to another. 16:06:18 Felix: Another solution: Don't change anything in the spec, not make it a feature and see if it can be done in an impl 16:06:31 … we would not formally specify this step. 16:06:45 … define the functionality without using it formally. 16:07:02 Jirka: keep CSS with reserved word for query language. 16:07:30 Felix: If we don't change anything we put this feature at risk, see if someone comes up with solution next year. 16:07:52 David L: Increases volume of test suite but maybe not the implementation work 16:08:29 Bert: You can only select elements and not attributes. 16:08:58 Jirka: So CSS test cases would be smaller. In CSS the selector is not case sensitive. HTML not case sensitive, XML would be. 16:09:31 … Some algorithm to transform CSS to XPATH would not be easy because of the case-sensitive vs. case-insensitive problem. 16:09:38 Felix: For XML there is no strong use-case 16:09:44 … only for those using HTML 16:10:28 Bert: There is no official mapping but everything but the pseudo elements can be mapped to XPATH. 16:10:37 Jirka: It depends on user-interaction 16:11:00 David L: Therefore these things may need to be removed.
16:11:29 Jirka: No, these user-interactions will never be applicable 16:12:23 Felix: For the time being no conclusion on this issue but if we don't change anything and no one implements it, it may cause an issue. We need to mark this as a feature at risk. 16:12:46 David F: For people to understand the commitment they are making this should be mapped onto the Google spreadsheet. 16:12:58 Felix: I will take an action to explain this and see take-up 16:13:31 … this is an opportunity to recruit more rule authors who can use CSS selectors as opposed to learning XPATH. 16:13:32 action: felix to make sure that css selectors are marked as feature at risk in the draft, and explain the rational in a mail 16:13:32 Created ACTION-272 - Make sure that css selectors are marked as feature at risk in the draft, and explain the rational in a mail [on Felix Sasaki - due 2012-11-08]. 16:15:04 Felix: Topics you want for the agenda tomorrow. 16:15:51 … we have joint meeting with HTML, planning for next year, XML Prague for example. Last workshop in Nov, plans for 2014 16:16:11 … editorial discussions for specification. 16:16:32 … please see doodle poll for a few more people to commit to editing meetings. 16:16:35 http://doodle.com/heh7k59h7vkvnv88 16:16:50 Jirka: Will the next meeting be in Prague? 16:17:00 Felix: Yes, 23rd, 24th of January. 16:17:50 Felix: Any other topics? Issues? 16:18:53 David F: XLIFF mapping task-force meeting should happen this week. We said we'd circulate this yesterday but we didn't. Can we schedule for tomorrow? Propose 4pm French time. 16:18:59 … 3 pm 16:20:12 Felix: 2-3pm free time tomorrow for implementation discussions. 16:20:42 Jirka: Who will go at 9am to joint meeting with HTML? 16:20:47 Felix: Hope all will join. 16:21:06 … 9am we meet with the HTML group for 30 mins. 16:22:43 Felix: I have asked for a review from the HTML WG on the section on HTML and ITS written by Jirka. Wanted to ask Fredrik for a demo which Yves gave in Prague.
People see what happens to the HTML. 16:22:55 … would be good to have a 5-10 min demo. 16:23:00 Jirka: Closer to 5 min 16:23:19 … @felix do you have examples of HTML markup in your slides? 16:23:24 Felix: Yes, I do. 16:24:06 Jirka: Should also show HTML validator for the WG. Simple, example-driven presentation. 16:24:21 Jirka: Our goal is to get a commitment to get a review from them. 16:24:37 Felix: Looking for a commitment for a contact 16:25:56 Tadej: Plan to discuss tool-info data category tomorrow, would be helpful. I would have some comments on this, if we could discuss before lunch. 16:26:24 Felix: 2 big topics, tool-info (issue 41), too many global rules (issue 51); others are smaller. 16:26:59 Felix: So we meet at 8.30 tomorrow here and at 9 we go to the other room 16:27:30 Felix: Closes the meeting. 16:28:07 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki 16:37:26 I have made the request to generate http://www.w3.org/2012/11/01-mlw-lt-minutes.html fsasaki 16:37:32 rrsagent, bye 16:37:32 I see 18 open action items saved in http://www.w3.org/2012/11/01-mlw-lt-actions.rdf : 16:37:32 ACTION: felix to follow-up on discussion about regular expression for allowed characters, see http://www.w3.org/2012/10/31-mlw-minutes.html#item02 [1] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T07-30-11 16:37:32 ACTION: pedro to write note about allowed characters issue with help from Mauricio and karl - due 8.
November [2] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T09-04-33 16:37:32 ACTION: implying some its:rules in html tags [3] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T09-23-20 16:37:32 ACTION: david to summarize the options and the recommendations related to HTML parsing workflow in the CMS, see discussion at http://www.w3.org/2012/11/01-mlw-lt-irc#T10-13-41 [4] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T10-13-59 16:37:32 ACTION: dfilip to summarize the options and the recommendations related to HTML parsing workflow in the CMS, see discussion at http://www.w3.org/2012/11/01-mlw-lt-irc#T10-13-41 [5] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T10-14-11 16:37:32 ACTION: thomas to follow up on domain list topic with DCU [6] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T10-22-56 16:37:32 ACTION: daveL to react to decision on extensibility for mrk in XLIFF and the result for our "pointer" attributes [7] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T10-51-12 16:37:32 ACTION: felix to edit disambig example [8] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T10-58-53 16:37:32 ACTION: felix to try to simplify disambiguation globally [9] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T11-01-14 16:37:32 ACTION: dom to make sure that schedule for test suite and schema update discussed at http://www.w3.org/2012/11/01-mlw-lt-irc#T11-27-30 is taken into account [10] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T11-28-38 16:37:32 ACTION: DomJones to make sure that schedule for test suite and schema update discussed at http://www.w3.org/2012/11/01-mlw-lt-irc#T11-27-30 is taken into account [11] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T11-28-50 16:37:32 ACTION: Dominic to make sure that schedule for test suite and schema update discussed at http://www.w3.org/2012/11/01-mlw-lt-irc#T11-27-30 is taken into account 
[12] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T11-29-09 16:37:32 ACTION: DomJones to ask Phil to confirm whether or not he will implement provenance [13] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T12-46-47 16:37:32 ACTION: Dave to check with Phil what his preference was between quality issue locally with inline and script-based stand off [14] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T12-55-40 16:37:32 ACTION: daveL to check with Phil what his preference was between quality issue locally with inline and script-based stand off [15] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T12-55-55 16:37:32 ACTION: felix to ask phil and des and arle about need and implementation committment for localization precis during next call [16] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T13-24-41 16:37:32 ACTION: Yves to add a step regarding the lowercasing of the domain data category [17] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T15-44-03 16:37:32 ACTION: felix to make sure that css selectors are marked as feature at risk in the draft, and explain the rational in a mail [18] 16:37:32 recorded in http://www.w3.org/2012/11/01-mlw-lt-irc#T16-13-32