08:08:34 RRSAgent has joined #mlwDub
08:08:34 logging to http://www.w3.org/2012/06/13-mlwDub-irc
08:08:47 meeting: MLW workshop
08:08:50 chair: arle
08:08:55 agenda: http://www.multilingualweb.eu/documents/dublin-workshop/dublin-program
08:09:02 present: again, many, people
08:10:26 scribe: Jirka
08:10:59 Change to agenda - Felix will go through the data categories implementation commitments list first
08:12:54 topic: Implementation commitments
08:13:04 led by Felix
08:14:21 Felix: review of agenda
08:15:31 Key is *real* commitments, not just interest.
08:16:49 Dave: where will we collect implementation commitments?
08:16:51 Decisions (not final details) must be complete by July, with details by November.
08:17:06 Felix: proposes to create a wiki page for such data
08:18:40 ... everyone should think about which categories can gain support in implementations, we will discuss it in the afternoon
08:19:00 ... what we agree to today will appear in the draft, we can change it later if necessary
08:20:06 LINK: http://www.w3.org/International/multilingualweb/lt/wiki/Implementation_Commitments
08:20:51 Sebastian Hellman: There is overlap between the NIF and ITS proposed data categories
08:22:29 ... RDF world can reuse ITS concepts if we provide ITS OWL
08:24:46 Tadej: RDF/RDFa can be used as an interchange format only; it will not generally be possible to reconstruct the original HTML+ITS from NIF
08:25:12 ACTION: Tadej to write a proposal for how the mapping between NIF and HTML+ITS would look, with concrete examples
08:27:34 topic: XLIFF extensibility
08:27:51 Des: are we talking about extensibility in XLIFF in general, or only for ITS?
08:28:13 Felix: In general
08:29:03 David: Individual people can send comments to the XLIFF TC asking for improved extensibility
08:30:23 Bryan: The message could be that we want the custom namespaces feature to be improved
08:30:51 Richard: We should ask personally who will support this
08:31:31 ACTION: Felix to draft email to XLIFF committee about improving extensibility [due 2012-06-15]
08:31:45 will do that by this evening
08:31:55 to be sent to public-multilingualweb-lt
08:32:28 rrsagent, draft minutes
08:32:28 I have made the request to generate http://www.w3.org/2012/06/13-mlwDub-minutes.html Jirka
08:33:02 Topic: LocWorld
08:33:48 David: October 16th - our ITS track
08:34:08 ... Oct 17th pre-conference track for a broader audience
08:34:36 ... LocWorld is Oct 18 and 19, we will have a few talks about ITS there
08:35:14 Felix: We need to find September dates for an additional technical meeting
08:36:21 ... proposal Sep 17/18 in Prague
08:37:36 ... and the alternative is 25/26
08:39:00 ... final decision is September 25/26
08:39:17 ACTION: Jirka to arrange F2F meeting in September at UEP
08:40:52 topic: Project Information Metadata
08:40:56 led by David Filip
08:41:29 (wrt last session - people interested in XLIFF email draft: Richard, Des, Felix, who else?)
08:43:04 David: summarizes current PI related proposed data categories
08:43:43 ... too many and overlapping data categories
08:43:46 all, my list of consensus is now at http://www.w3.org/International/multilingualweb/lt/wiki/Implementation_Commitments - please have a look and come back to it in the afternoon
08:49:10 Felix: we can create a best practice that says to reuse existing Dublin Core properties
08:51:22 Arle: There is also a big overlap with ISO 10669
08:52:27 my idea is to reuse the HTML "meta" element and have dc.subject in there with a scheme, e.g. ISO 10669, DDC (if you want), ...
08:53:34 Maxime: meta applies to the document as a whole, but you might want it for several chunks inside the document
08:54:06 Arle: the idea was to have the ability to apply those data categories to any part of the document
08:55:03 Dave: we should incorporate into ITS only things which help in translation and have a use case, we shouldn't supply general CMS related metadata
09:00:26 ACTION: Felix to summarize discussion around Domain
09:00:37 David: Genre discussion
09:01:16 mail from Georg here http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012May/0096
09:01:55 ... consensus about dropping genre
09:02:09 ... formatType discussion
09:05:16 for Dublin Core examples see http://de.selfhtml.org/html/kopfdaten/meta.htm#dublin_core (in German, apologies)
09:08:26 ... consensus for dropping translationQualification
09:15:57 ... back to domain
09:16:27 Yves: shouldn't we provide an ITS category for domain and for HTML map this to DC in meta?
09:20:21 There seems to be interest in implementing domain
09:20:50 I think Pedro, Declan, Yves raised their hands - please protest otherwise
09:21:22 ... register discussion
09:34:03 FYI, here is the domain mapping rule example http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jun/0049.html
09:35:25 ... genre and purpose dropped
10:03:05 scribe: mhellwig
10:03:34 topic: Translation Process Metadata
10:06:43 dF: the agenda is what is related to process metadata. Everything we define is orthogonal; all data categories are orthogonal, but they need to be in sync
10:07:31 ... orthogonal categories must fit together
10:08:45 ... the state machines need to be in sync, orthogonal values must make sense to the parties in the chain
10:10:36 ... a particularly orthogonal category is provenance
10:12:00 ... there is no single translation process. There are various different requirements, so categories must sync on the fly.
10:13:03 ... working on a category integration platform - a test bed - to simulate lifecycle designs
10:13:33 ... Pedro will talk about the process metadata in the requirements document
10:14:48 ... provenance metadata categories should be connected to process metadata
10:15:43 fsasaki: there are expectations for state machines - what states are allowed and so on. How far are you on this?
10:16:15 dF: discussed yesterday, CNGL is interested and I think the work can be interesting for the work. Particularly the testing setup.
10:16:54 ... use of general process ?? labels most important results.
10:17:40 ... Des talked about process boundaries, we should be aware of them.
10:18:46 Des: on the state machine question, I wouldn't like to enforce a state machine in metadata. It should be purely informative.
10:20:45 Pedro: it's true that it's quite complex, because we cannot constrain the things people want to do in their workflows
10:21:05 ... [talking about process metadata] there are three data categories
10:21:29 ... readiness, progress indicator, localisationCache
10:22:03 ... the last category is a special case, because here LSPs are not only interacting but also publishing to the Web.
10:22:44 ... readiness should indicate readiness for a particular process, its priority, the expectation of when it's supposed to be completed, and whether an element may have already been committed
10:23:18 ... currently target language is not part of the metadata
10:24:01 ... contentType (e.g. MIME, ...); pivotLang, if you go through an intermediate language or to save costs when e.g. translating from Portuguese to Brazilian
10:24:12 ... it reduces costs because you only have to revise
10:24:30 Note: remember slides are available at http://bit.ly/mlwDub
10:24:30 ... contentResultsSource, whether the original content has to be returned
10:24:56 ... contentResultTarget, should all languages be sent in one file or as separate files per language
10:25:32 ... the most important part is to specify the process without constraining it
10:25:53 ... process data model 1, define phases and in each phase different processes
10:26:56 ... process data model 2, a list of processes created from norm ??
10:27:25 ... this list could be published and maintained so it wouldn't need to be a closed list
10:27:38 ... so the list can evolve
10:28:11 ... three scenarios of application: consumes (c), generates (g), transforms (t) content
10:28:51 ... progress indicator: returns information to the CMS on how much has been completed
10:29:15 ... for example, the TMS can return that a process has been 40% completed, but 60% is still to be done
10:29:28 ... finally localisationCache. A lot of discussion about this
10:30:10 ... it's about real-time translation. The LSP directly publishes and we want the client to be able to specify whether content should be cached
10:30:21 ... this is an evolution of LSP
10:31:39 Des: localisationCache is a very good idea, but how do you know when the cache has been invalidated? When the source changes
10:32:05 Pedro: maybe you're right and we're missing a not-valid attribute
10:32:26 Des: there must be a trace back to the source, and if that source changes your cache is out of date and invalid
10:33:48 action: Des to write example for how to deal with invalid cache (re: localisationCache)
10:34:46 Des: The progress indicator is very useful and valid, but this is an ideal candidate for a service boundary. Good candidate for a standard API
10:35:00 Pedro: progress-indicator better in the API
10:35:49 dF: agrees, it's an API function, not about the content
10:36:11 ... the API is a by-product of the implementation
10:37:34 Des: there is a case for it to be in metadata. As a page changes, the system active on that data can update the metadata
10:38:08 Arle: where you have blind operation (crowd-sourcing) and may not have an API, it could prove helpful
10:39:18 dF: progress indicator is a project attribute, not a content attribute
10:40:32 Dag: we have large XML files with a lot of data. A translator might not complete that in a day, so you may want to have information on the progress. There is a case for having progress-indicator
10:41:10 fsasaki: question to XLIFF TC members, how is that handled in XLIFF?
10:41:22 BryanSchnabel: there are state attributes
10:41:28 fsasaki: so there is a solution already
10:42:27 KistenSteffn: it would be a good indicator for technical projects in general.
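A minimal sketch of Des's point about localisationCache ("there must be a trace back to the source"), assuming a hypothetical cache entry that stores a hash of the source text it was produced from. The class and function names are illustrative only; they are not part of ITS or of any proposal discussed in this session.

```python
import hashlib
from dataclasses import dataclass


def source_digest(source_text: str) -> str:
    """Fingerprint of the source content a translation was made from."""
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()


@dataclass
class CachedTranslation:
    # Hypothetical cache entry; field names are assumptions for illustration.
    target_text: str
    source_hash: str  # the "trace back to the source"


def make_entry(source_text: str, target_text: str) -> CachedTranslation:
    """Create a cache entry bound to the current state of the source."""
    return CachedTranslation(target_text, source_digest(source_text))


def is_valid(entry: CachedTranslation, current_source: str) -> bool:
    """The cached translation stays valid only while the source is unchanged."""
    return entry.source_hash == source_digest(current_source)


entry = make_entry("Hello world", "Hallo Welt")
print(is_valid(entry, "Hello world"))   # source unchanged
print(is_valid(entry, "Hello, world"))  # source edited, cache is stale
```

With this shape no separate "not-valid" flag has to be maintained; staleness is derived by comparing the stored hash with the current source.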
10:42:51 Pedro: I think progress indicator is good and useful, but the question is who will implement it
10:43:38 fsasaki: you want to focus on content-related metadata, but if you see something useful coming out of the implementation, it's good to publish that information
10:44:14 fsasaki: is it easier to put it in the requirements or figure it out from the implementation? You should figure that out
10:44:50 DaveL: process name information should be informative
10:46:08 Pedro: process data model 1 looks good, but it makes some a priori assumptions
10:46:25 ... process data model 2 does not make such assumptions
10:47:40 AlexLik: it's worthwhile to watch the terminology
10:48:19 Pedro: is XLIFF an implementation of ITS 2.0?
10:48:29 fsasaki: if there is an existing solution use that.
10:48:50 Pedro: that ITS can be implemented in various formats doesn't prevent us from having it in ITS 2.0
10:49:05 fsasaki: should refer to existing metadata and say we can use that metadata also in other file types
10:49:32 ... but the definition wouldn't be ours. Else we would have these clashes; somebody uses the XLIFF attribute, another the ITS one
10:50:57 Pedro: so what do we do with this data category?
10:51:16 DaveL: the discussion is on how we address the informative aspect
10:51:55 DaveL: should not spend a lot of time on this here if it's not normative
10:52:31 fsasaki: wiki page has been updated, there are currently 14 or 15 items with question marks
10:52:52 ... need to decide now what we put in the normative part
10:53:34 ... and then there are other parts like readiness and we would like to see more how it's coming out of the implementation. Then we can include it in the informative part
10:54:05 Dag: what is different in these data categories from what's in XLIFF?
10:54:54 dF: state attribute has a number of various values, but these are only values necessary for lifecycle. You can't just use them, you need to complete the cycle
10:55:12 Des: agrees, there is a case for readiness within the CMS as well
10:55:42 ... readiness is specific to workflows, so readiness is relevant only in that workflow
10:56:21 Pedro: we have this in our API, but we have to find out whether it's also valid for other formats, like HTML
10:56:34 ... probably can't resolve in 5 minutes
10:56:44 ... but we need to resolve this
10:57:06 ... examples, readiness information: what shall we do with that?
10:57:24 ... targetLanguages also belongs more in the API.
10:57:33 Des: most of this information (on readiness) is project level
10:58:19 dF: place for using the t extension; this kind of information belongs to the project level
11:00:00 fsasaki: one point before we conclude the session. Let's discuss over lunch
11:00:11 ... then get back and discuss this
11:01:10 Olav: we need to define more clearly the state, the process. Are we defining the process, what actions we want done with it. And that will define what's in Linport, in XLIFF.
11:01:54 Pedro: readiness is about what's to do. Not what has been done, that's provenance
11:02:27 dF: about readiness, there is too much in it, but I see value for part of it
11:02:41 ... in a publisher - CMS workflow
11:03:22 Pedro: in the beginning Kimmo asked us why we want to join; and one answer was that we want to move from workflows to data driven processes
12:10:36 topic: provenance section with David Lewis
12:10:39 scribe: dF
12:11:18 DaveL: Explains that provenance seems a general feature that appears in more areas
12:11:43 Use cases are Localisation job monitoring
12:12:06 Synchronizing Parallel Source revisions
12:12:30 Low cost assembly of parallel text with some idea of quality
12:12:58 Distributed quality auditing
12:13:30 in the sense of customer/provider quality data synchronization
12:13:48 Dave was checking a use case with Phil from Vistatec
12:14:16 auditing creates the need to synchronize QA reports
12:14:48 Provenance appears in different spaces. No single killer use case
12:15:05 There is a W3C Provenance WG
12:15:31 as a W3C WG we should look at the work of other WGs
12:15:54 Provenance WG is revising their time schedule
12:16:12 Current timeline looks like Jan 2013
12:16:39 Relating Provenance WG approach
12:17:01 Facts are recorded about entities
12:17:17 Agents can act on entities
12:17:44 Entities can be created or transformed by activities performed by agents
12:18:24 Entities can be attributed to Agents
12:19:07 Data model of the WG is PROV-DM
12:19:42 They also have RDF/OWL and XML format (not well documented)
12:19:51 also a query interface
12:20:40 ITS linking options
12:20:53 you can use different granularities
12:21:01 is fairly flexible
12:21:15 any number of agents can be associated with an entity
12:21:54 Quite heavy and not at all intended as inline markup
12:22:16 Take it as is?
12:22:30 Question by Felix
12:23:49 Felix: Can use existing values. Provenance record would be on document level and URI pointing to places
12:24:11 Dave: With Dom, we have such an implementation
12:24:23 felix: mechanism to use the existing values that are not ID attributes would be idValue, if we go for that, see http://www.w3.org/International/its/wiki/IssuesAndProposedFeatures#Proposal:_idValue
12:24:52 we can do it in various places, basically the same as we do ITS
12:25:25 In a published document you do not really want to include the fine grained provenance info
12:26:26 you can later derive segment relevant info from the higher level
12:26:46 There are options for RDF version
12:27:09 We did a fine grained implementation based on hashes
12:27:28 The hashes need to be recalculated if it changes
12:27:57 Dave is asking Tadej about his implementation: Just one URL?
12:28:38 Showing example on span level referencing a textual provenance store
12:29:16 entity e1 was generated by activity a1
12:29:28 at timestamp
12:29:43 example is also available at http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012May/0065
12:29:48 start and stop time
12:30:29 As dF said earlier, we can provide best practice for other categories
12:30:47 it would not necessarily be a primary ITS category
12:31:38 you can record language in provenance and you can use its tag for that
12:32:02 provenance only tells you what happened, never what should happen
12:32:28 reason to do provenance is to record trustworthiness
12:33:00 Felix: The general mechanism is very clear
12:33:54 This needs to be finalized as best practice for others to use for their provenance related categories
12:34:35 DaveL: The WG is approachable, they are quite open, not too much grounded in industrial process
12:34:51 Yves has a question about a previous slide
12:35:09 Would we need to process the txt version?
12:35:40 DaveL: on top of txt they have XML, RDF, and a query interface
12:36:15 Maxime: But their XML is just one possible serialization, it won't be compatible with our XML
12:36:38 RDF would be easier, they should provide a parser
12:37:08 Maxime: we can put a lot of categories there
12:37:30 DaveL: we need a few initial use cases
12:37:46 looking at existing SQL quality records
12:38:07 we can chain different tools
12:38:37 Jean: Who would own these resources?
12:39:20 Dave: LSP record would contain the XLIFF etc process story
12:39:31 who translated, who reviewed etc.
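A minimal sketch of the entity / activity / agent relationships summarized in this session ("entity e1 was generated by activity a1", with start and stop time, and any number of agents). The class names and fields below are assumptions made for illustration; this is not the PROV-DM serialization and not the example from the mailing list.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Agent:
    name: str


@dataclass
class Activity:
    label: str
    agents: List[Agent]  # any number of agents can be involved
    started: datetime    # start time
    ended: datetime      # stop time


@dataclass
class Entity:
    ident: str
    generated_by: Optional[Activity] = None

    def attributed_to(self) -> List[Agent]:
        """Agents this entity can be attributed to, via its generating activity."""
        return list(self.generated_by.agents) if self.generated_by else []


# Who translated, who reviewed: recorded once on the activity.
translator = Agent("translator-1")
reviewer = Agent("reviewer-1")
a1 = Activity(
    "translate-and-review",
    [translator, reviewer],
    datetime(2012, 6, 13, 9, 0),
    datetime(2012, 6, 13, 10, 30),
)
e1 = Entity("e1", generated_by=a1)

print([agent.name for agent in e1.attributed_to()])
```

The record only states what happened to e1; nothing in it prescribes what should happen next, which matches the distinction drawn above between provenance and readiness.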
12:39:45 It is valuable business intelligence
12:39:59 You want to allocate blame
12:40:28 You get the answer quicker if everything is in one place
12:41:17 we should have a checklist of categories that should be agreed between customer and provider
12:41:39 there is a commercial tension
12:42:26 Dag: Comment, one fairly nice way, not convinced that RDF is necessary, XML should be OK
12:42:37 Info on MT processing should be embedded
12:42:43 should not be too heavy
12:42:57 there would be complexity in processing the link
12:43:05 Dave: agrees
12:43:45 with the second point (MT provenance embedded rather than linked)
12:44:07 Tadej: Adding to the request for inline inclusion
12:44:53 does it all relate to content? Also on markup? Or doesn't it matter?
12:45:17 Dave: You create a connection between content and a piece of metadata
12:45:37 We are creating a binding that is very valuable
12:46:19 Tadej: Provenance should contain the binding
12:46:40 Pedro: It reminds me of a QR code
12:46:53 Can we use it?
12:47:12 Olaf: The code is just a representation
12:47:31 [you still need the underlying categories]
12:48:12 Dave: there can be an alternative to inline QA reporting, but as Dag points out, there is a cost attached to it
12:48:47 Dave: question for Felix. How about WGs working in parallel: legal, political?
12:49:08 Rule for referencing is not more than two steps behind
12:49:48 DaveL: Our requirement is simple. Informally it should be OK
12:50:31 Given there is the risk
12:50:45 not that it would not be possible to do
12:51:25 Action Item: Felix to check with W3C on status of the Provenance group to manage the dependency risk
12:52:31 Richard: they should be excited about the real-world use case we're bringing
12:52:44 Felix goes to TPAC, maybe there
12:53:42 Sebastian: question. Are there other similarly related categories in ITS?
12:54:06 DaveL: we always say about scope, global or span
12:54:25 It was not fully QAed
12:54:43 Arle: We identified 3 levels: span, div, document
12:55:09 Arle, DaveL: This is now outdated, needs revision
12:55:35 Pedro: Tabular overview is out of date
12:55:54 Sebastian wants to relate categories where it makes sense
12:56:04 Felix: disagrees
12:56:36 It may make sense to combine them, but there are different use cases
12:56:53 Interrelations should not be overloaded
12:57:23 dependencies would be too complex and would potentially prevent new usage scenarios
12:58:16 Sebastian: If you do not specify enough, people will not know how to use it
12:59:42 Felix: no inline provenance in yet
13:00:08 Phil was talking about quality records
13:00:20 this is strictly not provenance
13:00:54 DaveL: we need to specify use cases for inline and not
13:01:20 Especially agents are reusable
13:02:12 Who scribes this?
13:02:30 I think we do not have trackbot
13:02:35 How to invite?
13:02:52 scribe: fsasaki
13:03:25 topic: translation metadata
13:04:17 yves going through target pointer proposal
13:04:25 not sure what the consensus currently is
13:05:04 dave: agree that this is a real use case
13:05:47 trackbot, associate this channel with #mlw-lt
13:05:47 Associating this channel with #mlw-lt...
13:06:35 yves: two proposals for implementations already, myself and shaun
13:08:53 richard: why is this needed?
13:08:59 yves explains the proposal again
13:09:08 des: who would generate and consume this?
13:09:25 yves: people who work with Qt TS files
13:09:43 .. people who use XLIFF files, who don't have an XLIFF specific tool
13:09:54 des: I see it in an interchange format, but not in a resource format
13:10:16 yves: you can use your rule to isolate one language as the source and target
13:10:43 .. there are quite a lot of resource file formats that need that information
13:11:01 des: how does target relate to target languages?
13:11:18 yves: workflow related
13:11:23 .. the name is maybe not good
13:11:46 .. other data category: locale filter
13:12:58 .. indicate what needs to be translated to specific locales
13:13:36 .. would be a BCP 47 language tag
13:14:36 alex: does it have to do with target language?
13:14:44 yves: it does, it is like a traditional translate
13:15:25 http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#Identification_of_Language_and_Locale
13:22:14 Discussion on BCP 47
13:22:44 Action Item: Felix to follow up on usage of BCP 47
13:23:01 action: shaun to flesh out locale proposal
13:23:01 Created ACTION-107 - Flesh out locale proposal [on Shaun McCance - due 2012-06-20].
13:24:25 It was still not recorded by trackbot.
13:25:57 http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#preserveSpace
13:31:56 Could look at XSLT xsl:strip-space and xsl:preserve-space elements: http://www.w3.org/TR/xslt20/#strip
13:33:26 http://www.w3.org/International/multilingualweb/lt/wiki/Implementation_Commitments
14:03:47 scribe: Arle
14:03:52 continuation of session with Yves
14:04:09 Yves: Next one is autoLanguageProcessingRule. It tells how the content should be translated, transliterated, MT OK, etc.
14:04:30 .. I don't think there is any implementation commitment.
14:04:39 Felix: Thumbs down.
14:06:22 Yves: Elements within Text is from 1.0. But there it is only a global category. I do not recall why we made that exception to the general case. Perhaps we did not find any exceptions. But after publishing it we got requirements for it. We have two possible implementations. ENLASO and (maybe) SDL. The only change is to add a local aspect to it. There is nothing new except you can specify on the element.
14:06:38 .. We should have two implementations
14:07:42 Olaf-Michael: Why don't we try in version 2.0 to allow all attributes to be local or global? There is no reason why it couldn't be at both levels.
14:08:03 Arle: I think there are some exceptions, like mtDisambiguation.
14:09:35 Felix: Besides local and global, there is a way to point to content. Should we allow all of them for any category? It turns out that it represents different philosophies. For some there is no usage scenario. For someone looking at a table of features, seeing one that is logically local listed as able to be implemented globally can be confusing.
14:09:58 .. There are two perspectives. One is for clean writing of the spec; one is for implementation.
14:10:23 Olaf-Michael: But if we limit it and we run into the issue where someone realizes a case later, then you have to modify the spec.
14:11:06 Yves: But you need implementations, and we had none. You don't want to force the implementer to implement something they think is useless. Theory is one thing.
14:11:15 .. This was the only exception anyway.
14:12:04 Dave: For an implementation of something that is both local and global, do the implementations have to address precedence properly?
14:12:42 Felix: Yes. They have to observe defaults. For example, if you see something that specifies globally that something is translatable, it has to respect the local override.
14:13:00 Dave: So you can't claim conformance by doing only local if it can be global?
14:13:25 Felix: You can do either. Let's look at conformance.
14:13:33 Richard: What was the use case?
14:13:55 Ingo: If you have an ITS processor that doesn't support XPath, you can still handle elements within text.
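A minimal sketch of the precedence Felix describes (local markup overrides a global rule, which overrides the default), using the ITS translate data category. It assumes a simplified resolver: selectors are ElementTree path expressions rather than full XPath, inheritance from ancestor elements is left out, and the document, rule list, and function name are hypothetical.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
LOCAL_TRANSLATE = f"{{{ITS_NS}}}translate"  # its:translate in ElementTree's notation

DOC = f"""<doc xmlns:its="{ITS_NS}">
  <p>Press <code>Enter</code> to continue.</p>
  <p>Type <code its:translate="yes">your name</code> here.</p>
  <note>Internal remark</note>
</doc>"""

# Hypothetical global rules as (selector, value) pairs; later rules win.
GLOBAL_RULES = [(".//code", "no"), (".//note", "no")]


def resolve_translate(root, elem, rules):
    """Return 'yes' or 'no' for one element: local > global > default."""
    # 1. Local markup on the element itself has the highest precedence.
    local = elem.get(LOCAL_TRANSLATE)
    if local is not None:
        return local
    # 2. Global rules: the last matching rule wins.
    value = None
    for selector, rule_value in rules:
        if elem in root.findall(selector):
            value = rule_value
    if value is not None:
        return value
    # 3. Default for elements: translatable.
    return "yes"


root = ET.fromstring(DOC)
for elem in root.iter():
    print(elem.tag, resolve_translate(root, elem, GLOBAL_RULES))
```

A processor that only supports the local step, as in Ingo's remark about tools without XPath, would skip step 2 and still give the same answer for the second `code` element.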
14:14:12 Yves: You could implement ITS with only local
14:14:21 Richard: Did you have customers who demanded this?
14:14:26 Ingo: Not at that time.
14:15:42 Felix: Conformance from ITS 1.0: You need to handle at least one selection mechanism (global or local), defaults, and correctly observe precedence.
14:16:05 .. ITS tools must also process XLink hrefs in rules.
14:16:30 Pedro: In autoLanguageProcessingRule, maybe transliteration is the only value that makes sense.
14:17:00 .. I think MT should be handled elsewhere.
14:18:59 action: yves to update the automaticProcessingRule proposal with today's discussion
14:19:00 Created ACTION-108 - Update the automaticProcessingRule proposal with today's discussion [on Yves Savourel - due 2012-06-20].
14:19:09 next: context data category
14:20:24 yves: going through context data category, related to TBX term location
14:20:43 .. TBX and XLIFF seem to be similar
14:20:59 .. seems to be important
14:21:23 .. we killed format type because this is similar
14:21:39 .. call it "dave is much better with names" :)
14:22:11 alex: call this UI controls?
14:22:46 .. should one synchronise the names with Microsoft related names?
14:23:49 des: this is very narrow
14:24:02 .. for describing context this is a very limited scope
14:24:38 yves: depending on the context value you get one translation or the other
14:24:50 .. so having clear values is important
14:25:51 richard: does this exist in XLIFF?
14:26:01 yves: yes, you have that in XLIFF; restype
14:26:59 felix: is XLIFF restype interoperable?
14:27:09 yves: for the existing list yes, there are extensions with "x-"
14:27:38 .. could also be a namespace based approach
14:27:52 david: in software specialized tools that list is used
14:29:02 richard: where do XLIFF people get info for this list?
14:29:36 scribe did not get the answer 14:29:41 dgroves has joined #mlwdub 14:30:21 Arle has joined #mlwDub 14:30:35 only "intentions", no real committments yet 14:31:37 David F: I might be able to have a PhD student work on this. 14:31:42 david will follow up on "context" with a student of his 14:32:00 Pedro: Let's rename this resourceType. 14:32:21 Yves: Next one is mtConfidence. Do we need the same thing for translations from other sources? 14:33:21 Arle: If you apply it to TM it sounds like fuzzy scores. 14:34:06 David F: Fuzzy scores are not interoperable. Fuzzy scores are basically random numbers. You agree on what a full match is, but nothing beyond that. I think that MT confidence is the same thing: a marketing number. 14:34:39 Pedro: There are five implementation intentions. It's not useful for RbMT, but for SMT it is useful. It is useful within a single tool. 14:34:52 Yves: Combine it with provenance and it is useful. 14:35:24 Declan: Confidence is a big issue right now. It's only valid within a single tool, e.g., benchmarking productivity improvements. We can provide that information, but we don't know how useful it is. 14:35:32 Pedro: Posteditors can use it. 14:35:55 gderiard has joined #mlwDub 14:36:08 Michael-Olaf: The UN and EU, international organizations are working on objective quality measures for MT. 14:36:54 Sebastian: It would be valuable even without objective criteria because you can still assess relative values within a document, collection, etc. It lets you tell which is higher. It is useful for a developer. You don't really need objectivity. 14:37:55 Yves: If you look at MT from Google, Microsoft, it has two scores: one is confidence (based on human input, between 1 and 6). Other systems have other means (e.g., for post-edited, but marked as MT). Crowdsourcing might also be a use. 14:38:12 Dave: It is useful for post-editors to help them focus their works. 14:38:34 .. They don't work in a linear fashion. 
CAT tool feedback finds this useful. 14:39:27 Johann: It would help us know whether to post-edit or translate from scratch. Something too specific between 0 and 1 might be too limiting. We should account for the different needs. 14:40:40 Arle: The more I think about it the more I think it is general. E.g., Crowdsourced materials. 14:41:16 Tadej: You should prescribe a scale for humans for psychometric validity. A machine scale is different. We need to decide which to support. 14:42:05 David F: We are now looking too broadly beyond the intent. Don't expect interoperability, but within a document, it is valuable, as Sebastian noted. 14:42:20 Dave: Perhaps we put crowd assessment into the quality categories. 14:42:31 Johann: The numbers will vary depending on the desired outcome. 14:43:23 Yves: This one seems important and there is implementation desire. But no clear image of what to implement. So within the next two weeks we need to resolve it. 14:44:14 Action: David to come up with a proposal for mtConfidence within two weeks. 14:44:14 Sorry, amibiguous username (more than one match) - David 14:44:14 Try using a different identifier, such as family name or username (eg. dlewis6, dfilip) 14:44:34 Action: dfilip to come up with a proposal for mtConfidence within two weeks. 14:44:34 Created ACTION-109 - Come up with a proposal for mtConfidence within two weeks. [on David Filip - due 2012-06-20]. 14:45:14 Yves: We've seen interest in implementing specialRequirements. 14:45:15 dF: we have a project at DCU around mtConfidence, so please include us in this discussion 14:45:54 Yves: One problem is that this category is not fleshed out. 14:46:22 Richard: We already have the note category. The difference is that this is more structured and machine processable. 14:46:30 Pedro: There are two intentions for this. 14:46:47 Yves: We need someone to drive it and flesh it out and have a mid-July implementation example.
14:47:18 Felix: Remember you have a number of other ones to move forward. Please keep the priorities in order. 14:47:49 Yves: By mid-July we will drop things and you may waste effort, at least until the next version. 14:48:11 Action to Giuseppi to flesh out specialRequirements. 14:48:11 Sorry, couldn't find user - to 14:48:27 Action: Giuseppi to flesh out specialRequirements. 14:48:27 Sorry, couldn't find user - Giuseppi 14:48:56 http://www.w3.org/International/multilingualweb/lt/wiki/Implementation_Commitments#New_ITS_2.0_categories 14:49:05 Action: Pedro to have Giuseppe to flesh out specialRequirements. 14:49:06 Created ACTION-110 - Have Giuseppe to flesh out specialRequirements. [on Pedro Luis Díez Orzas - due 2012-06-20]. 14:49:07 Yves_ has joined #mlwdub 14:49:54 Felix: Looking at the list of prospective categories, we need more information on some of them. We need to refine this list. 14:50:33 Otherwise, this is the list that will be in the public draft. It will be the ITS 1.0 draft with extended categories, plus mappings to RDFa and use in HTML5. 14:50:42 Dave: When does it have to be released? 14:51:01 iprause has joined #mlwDub 14:51:23 Felix: We should have published the draft last month. That's OK because we published the requirements doc. I said that we would publish these categories in the next draft in July. 14:51:38 Action: Felix to publish draft with final list of categories in July. 14:51:38 Created ACTION-111 - Publish draft with final list of categories in July. [on Felix Sasaki - due 2012-06-20]. 14:52:09 Felix: Data category holders, please remember to send out consensus emails. 14:53:24 scribe: yves_ 14:53:26 I have made the request to generate http://www.w3.org/2012/06/13-mlwDub-minutes.html fsasaki 14:53:34 dF: will go through some issues only 14:54:02 .. about relationship with XLIFF. happy that the importance is stressed out 14:54:28 .. in our charter we have internal and external relationships 14:55:09 .. 
dependencies are stronger than liaisons 14:55:25 .. MLW-LT has one with XLIFF TC 14:55:46 .. liaisons: with RDF Web Application WG 14:56:26 .. we committed to an RDFa representation to foster integration of MLW in semantic web 14:57:52 .. Liaison with ULI (Unicode Localization Interoperability group) chaired by Helena from IBM, a TC of Unicode 14:58:15 .. several MLW-LT members are active in ULI 14:58:28 I have made the request to generate http://www.w3.org/2012/06/13-mlwDub-minutes.html Yves_ 14:58:47 I have made the request to generate http://www.w3.org/2012/06/13-mlwDub-minutes.html fsasaki 14:58:51 ULI is looking at a segmentation character proposal 14:59:08 .. on hold for now. 14:59:20 .. will go back to ULI for re-work 15:00:10 .. couple of characters to split or join segments 15:00:30 .. many issues related to the proposal. e.g. intended for plain text 15:00:56 .. may be to have a mapping to element, like BiDi control characters 15:01:36 .... related to our group as well 15:01:59 .. Also some discussion about the Unicode CLDR register 15:02:38 Arle: need to assign an action for the register 15:03:01 Felix: didn't we decide to drop register? 15:03:11 Arle: not if Pedro does it 15:03:50 I have made the request to generate http://www.w3.org/2012/06/13-mlwDub-minutes.html Yves_ 15:04:05 dF: initiating the discussion would be good 15:04:36 .. We also have a liaison with ETSI ISG LIS 15:05:00 .. LISA is gone 15:06:08 .. OSCAR, a LISA working group responsible for TMX, SRX, etc., was killed in the process 15:06:33 .. ESGI is a Telecom standard body 15:06:39 .. not very transparent 15:07:03 .. I would like to have a formal liaison between MLW-LT and ESGI 15:07:43 .. Arle is member, could be the liaison agent 15:08:22 s/w /we / 15:08:35 .. w don't want to be surprised by a new standard 15:09:24 Arle: names and logos of OSCAR standards are protected, everything else is under Creative Commons 15:10:10 dF: some LISA standards were co-owned with ISO, so more visible. 15:10:20 ..
but SRX, TMX, etc. are not 15:10:58 .. personally: thinks TMX 1.4b is too old to catch up with modern data 15:11:23 .. and share a lot 15:11:45 .. potentially TMX and XLIFF share inline elements 15:12:35 .. Would propose to Arle to be a representative with ESGI 15:12:46 felix: ESGI is not in the charter 15:12:55 s/felix/Felix/ 15:13:31 Felix: liaisons are weak connections, much less important than dependencies 15:13:58 dF: i think ESGI is very closed so having a liaison with it may be good. 15:14:23 Pedro: in 10 days ISO TC 37 will have a meeting in Madrid 15:14:44 .. MLW will be there and offer refreshments and a 10-minute presentation 15:15:02 .. I'm asking for input, ideas, etc. 15:15:24 Arle: Alan will be there too. 15:15:41 Action: Arle to interface with Pedro on presentation to TC37 by June 20. 15:15:41 Created ACTION-112 - Interface with Pedro on presentation to TC37 by June 20. [on Arle Lommel - due 2012-06-20]. 15:16:21 Felix: maybe Pedro can show our agreed-upon list of data categories to get feedback 15:16:29 .. as well as any extra ones 15:16:35 s/ESGI/ETSI/ 15:16:50 s/ETSI/ETSI ISG LIS/ 15:16:57 Olaf-Michael: Alan Melby will be in Mardrid. You should coordinate with him too. 15:17:17 s/Mardrid/Madrid/ 15:17:40 Felix: about ETSI liaison: what kind of info would we provide and get 15:17:59 dF: reports/updates from Arle on the ETSI activities 15:18:10 Arle: yes I could provide updates 15:18:53 dF: need the liaison because ETSI has no public mechanism to feedback 15:19:09 dF: also SRX may be interesting 15:19:57 David moves to nominate Arle as Liaison with ETSI for info exchange 15:20:05 +1 15:20:18 +1 15:20:30 Felix senconds 15:20:47 s/senconds/seconds/ 15:20:49 jr has joined #mlwdub 15:20:51 no dissent 15:21:06 dF: About XLIFF 15:21:17 .. currently TC is working on 2.0 15:21:46 .. e.g. ballot about allowance or not of custom namespaces. 15:22:01 .. to me it's a future-proofing measure 15:22:33 ..
But people against have good reasons for being worried too because extensions were abused in 1.2 15:23:25 .. 2.0 will have conformance statements against abuses 15:23:37 I have made the request to generate http://www.w3.org/2012/06/13-mlwDub-minutes.html Yves_ 15:24:46 .. you can join the XLIFF TC easily 15:25:06 much simpler and cheaper than W3C, and also can interact with the comments list 15:25:25 .. OASIS is as transparent as W3C 15:25:57 DaveL: time line about XLIFF 2.0 15:26:35 dF: Twice as many members since last year 15:26:42 no calendar dates set up 15:26:47 for now 15:27:24 .. times are different now. My perception is that important players are in now 15:27:51 Bryan: failure in draft not ready in 2013 15:28:26 dF: some move toward a first draft. 3rd XLIFF symposium in Seattle in October 15:28:59 .. core should be close to completion by October 15:29:33 I have made the request to generate http://www.w3.org/2012/06/13-mlwDub-minutes.html Yves_ 15:30:11 Session closed 15:30:23 Arle: arle and Pedro will be the two members in Madrid for TC 37 15:30:29 I have made the request to generate http://www.w3.org/2012/06/13-mlwDub-minutes.html Yves_ 15:31:43 for upcoming WG calls, see http://www.w3.org/International/multilingualweb/lt/wiki/Main_Page#Upcoming 15:31:52 Events: LocWorld, 15-16-17 Oct 15:32:07 Prague F2F 25-26 Sep 15:32:23 I have made the request to generate http://www.w3.org/2012/06/13-mlwDub-minutes.html Yves_ 15:32:30 dF has joined #mlwdub 15:32:55 Thanks to CNGL for the support for this Workshop 15:33:27 thanks to Trinity College for the support 15:34:08 Thanks to Eithne for the support 15:34:23 thanks to the sessions leaders and scribes 15:34:36 Thanks to Leroy for the filming 15:35:02 Thanks to dotNet: they will provide captions for the video 15:36:08 Yves__ has joined #mlwdub 15:36:30 I have made the request to generate http://www.w3.org/2012/06/13-mlwDub-minutes.html Yves__ 15:37:42 gderiard has joined #mlwDub 15:39:23 rrsagent, bye 15:39:23 I see
17 open action items saved in http://www.w3.org/2012/06/13-mlwDub-actions.rdf : 15:39:23 ACTION: Tadej to Write proposal how mapping between NIF and HTML+ITS would look like with concrete examples [1] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T08-25-12 15:39:23 ACTION: Felix to Draft email to XLIFF committee about improving extensibility [due 2012-06-15] [2] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T08-31-31 15:39:23 ACTION: Jirka to arrange F2F meeting in September at UEP [3] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T08-39-17 15:39:23 ACTION: Felix to Summarize discussion around Domain [4] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T09-00-26 15:39:23 ACTION: Des to write example for how to deal with invalid cache (re: localisationCache) [5] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T10-33-48 15:39:23 ACTION: Item to Felix to check with W3c on status of the Provenance group to manage the dependency risk [6] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T12-51-25 15:39:23 ACTION: Item to fsasaki to check with W3c on status of the Provenance group to manage the dependency risk [7] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T12-51-56 15:39:23 ACTION: Item to Felix to folow up on usage of BCP 47 [8] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T13-22-44 15:39:23 ACTION: shaun to flesh out locale proposal [9] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T13-23-01 15:39:23 ACTION: Item to fsasaki to folow up on usage of BCP 47 [10] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T13-23-19 15:39:23 ACTION: yves to update the automaticProcessingRule proposal with today's discussion [11] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T14-18-59 15:39:23 ACTION: David to come up with a proposal for mtConfidence within two weeks. 
[12] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T14-44-14 15:39:23 ACTION: dfilip to come up with a proposal for mtConfidence within two weeks. [13] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T14-44-34 15:39:23 ACTION: Giuseppi to flesh out specialRequirements. [14] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T14-48-27 15:39:23 ACTION: Pedro to have Giuseppe to flesh out specialRequirements. [15] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T14-49-05 15:39:23 ACTION: Felix to publish draft with final list of categories in July. [16] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T14-51-38 15:39:23 ACTION: Arle to interface with Pedro on presentation to TC37 by June 20. [17] 15:39:23 recorded in http://www.w3.org/2012/06/13-mlwDub-irc#T15-15-41