07:34:09 [fsasaki]
meeting: MLW-LT f2f
07:34:11 [fsasaki]
chair: felix
07:34:37 [fsasaki]
topic: roll call
07:34:45 [fsasaki]
checking attendance ...
07:34:49 [fsasaki]
present+ fsasaki
07:57:55 [tadej]
tadej has joined #mlw-lt
07:59:58 [fsasaki]
fsasaki has joined #mlw-lt
08:00:40 [Yves_]
Yves_ has joined #mlw-lt
08:00:47 [Yves_]
present+ Yves
08:00:57 [daveL]
daveL has joined #mlw-lt
08:01:07 [Marcis]
Marcis has joined #mlw-lt
08:01:14 [Marcis]
present+ Marcis
08:01:23 [Ankit]
Ankit has joined #mlw-lt
08:01:23 [leroy]
leroy has joined #mlw-lt
08:01:29 [Arle]
Arle has joined #mlw-lt
08:01:31 [leroy]
present+ leroy
08:01:35 [Ankit]
present+ Ankit
08:01:41 [Arle]
present+ Arle
08:06:41 [truedesheim]
truedesheim has joined #mlw-lt
08:06:48 [daveL]
present+ dave
08:07:17 [Jirka]
Jirka has joined #mlw-lt
08:09:21 [pnietoca]
pnietoca has joined #mlw-lt
08:09:21 [mdelolmo]
mdelolmo has joined #mlw-lt
08:09:30 [pnietoca]
present+ pnietoca
08:09:31 [daveL]
scribe: daveL
08:09:37 [mdelolmo]
present+ mdelolmo
08:09:39 [fsasaki]
topic: issue-67
08:10:09 [daveL]
yves: had no feedback from shaun to date so we probably can't advance here
08:10:48 [kfritsche]
kfritsche has joined #mlw-lt
08:11:03 [kfritsche]
present+ Karl
08:12:00 [daveL]
felix: comment could be addressed by dropping the ref to XML schema
08:13:05 [daveL]
yves: will respond on issue 105
08:13:14 [fsasaki]
topic: issue-69
08:14:20 [pnietoca]
External rules may also have links to other external rules (see example 20). The linking mechanism is recursive, and subsequently after the processing the rules MUST be read top-down (see example 21).
08:14:58 [daveL]
pablo: had responded that this was clear in the specification, but suggests a clarification
08:15:19 [pnietoca]
the section is 5.4. (last paragraph)
08:15:23 [daveL]
felix: confirms this is just a clarification
08:15:26 [pnietoca]
change it
08:15:47 [fsasaki]
"The linking mechanism is recursive" > "The linking mechanism is recursive in a depth-first approach"
08:16:01 [daveL]
tadej: perhaps explain this recursion as being 'depth first' so that it is easier for computer scientists to understand
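A minimal sketch of what "recursive in a depth-first approach" could mean for linked rule files. The in-memory file table, the entry format, the function name and the cycle guard are all assumptions made for illustration; none of them come from the specification text.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical stand-in for rule files on disk: each file is a list of
# entries, either ("link", other_file) or ("rule", rule_name).
RULE_FILES: Dict[str, List[Tuple[str, str]]] = {
    "main.xml": [("link", "a.xml"), ("rule", "main-1")],
    "a.xml": [("link", "b.xml"), ("rule", "a-1")],
    "b.xml": [("rule", "b-1")],
}


def collect_rules(
    name: str,
    files: Dict[str, List[Tuple[str, str]]],
    seen: Optional[Set[str]] = None,
) -> List[str]:
    """Return rule names in the order they would be read top-down
    after every link has been followed depth-first."""
    if seen is None:
        seen = set()
    if name in seen:  # assumption: a file already visited is skipped
        return []
    seen.add(name)
    ordered: List[str] = []
    for kind, value in files[name]:
        if kind == "link":
            # Follow the link completely before continuing with this file.
            ordered.extend(collect_rules(value, files, seen))
        else:
            ordered.append(value)
    return ordered


print(collect_rules("main.xml", RULE_FILES))  # ['b-1', 'a-1', 'main-1']
```

The rules of the most deeply linked file come first, so rules in the linking file are read after the rules they link to.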
08:17:04 [fsasaki]
topic: issue-70
08:18:09 [daveL]
felix: ref to section 5.5
08:20:31 [fsasaki]
will add one entry between "global selections" and "data category defaults" for inherited information, but not specific to local markup
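A rough sketch of a precedence list with the proposed extra entry. Only the position of "inherited" between "global selections" and "data category defaults" is taken from the note above; placing local markup first, the function name and the parameter names are assumptions.

```python
from typing import Optional, Tuple


def resolve_value(
    local: Optional[str] = None,
    global_rule: Optional[str] = None,
    inherited: Optional[str] = None,
    default: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (source, value) for the first source that supplies a value."""
    candidates = (
        ("local", local),          # assumed to rank first
        ("global", global_rule),
        ("inherited", inherited),  # the proposed new entry
        ("default", default),
    )
    for source, value in candidates:
        if value is not None:
            return source, value
    raise ValueError("no value available from any source")


# An inherited value wins over the default when nothing more specific exists.
print(resolve_value(inherited="no", default="yes"))
```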
08:20:56 [kfritsche]
kfritsche has joined #mlw-lt
08:21:00 [fsasaki]
topic: issue-71
08:21:24 [fsasaki]
scribe: fsasaki
08:22:09 [fsasaki]
daveL: Yves said the problem is: you can have a lot of annotatorRefs
08:22:28 [fsasaki]
.. issue is: how to deal with annotatorRefs with two instances of local standoff markup
08:22:37 [fsasaki]
.. e.g. lq localization issues and provenance records
08:22:54 [fsasaki]
.. so you can have multiple records of the same data category applying to the same selection
08:23:09 [fsasaki]
.. you don't get the information whether the information comes from different processes
08:23:36 [fsasaki]
.. Yves suggested whether we can put the information into the same ...
08:23:49 [fsasaki]
.. my view was: for provenance annotator ref is not that important
08:24:19 [fsasaki]
.. so in the mail last night: could we exclude the lqi and provenance from annotatorsRef
08:24:34 [fsasaki]
.. annotatorRefs is telling you what provided the provenance annotation
08:24:55 [fsasaki]
tadej: from provenance it is not needed, but for lqi?
08:28:08 [fsasaki]
dave: don't think so for lqissue.
08:28:35 [fsasaki]
yves: sounds weird: have annotatorsRef mandatory for some data cats, possible for others, forbidden for two ...
08:28:58 [fsasaki]
.. currently it is required for mt-confidence and disambiguation
08:29:21 [Marcis]
... and Terminology
08:29:26 [fsasaki]
yves: other solution: you could have it mandatory for these two data categories, and don't have it for others
08:29:36 [fsasaki]
.. that would make things a lot simpler
08:30:03 [fsasaki]
dave: agree - not having two features interacting (standoff and annotatorsRef) would be good
08:31:32 [Milan]
Milan has joined #mlw-lt
08:33:47 [fsasaki]
felix: potential resolution - so keep it mandatory for mt-confidence, disambiguation and term, and edit the list of data category items in the spec
08:34:49 [fsasaki]
scribe: daveL
08:36:47 [fsasaki]
action: dLewis6 to come back to chase and kevin about discussion of issue-71
08:38:29 [fsasaki]
action: felix to change example if they agree on issue-71 , see discussion at
08:39:32 [swalter]
swalter has joined #mlw-lt
08:40:00 [swalter]
present+ swalter
08:41:02 [daveL]
felix: example 28 needs to be revised also, will do this now
08:45:39 [dF]
dF has joined #mlw-lt
08:46:33 [dF]
present+ dF
08:49:37 [fsasaki]
scribe: fsasaki
08:49:56 [fsasaki]
daveL: using the example in the test file - should we have usage of the data categories in the elements?
08:49:58 [fsasaki]
yves: yes
08:51:35 [daveL]
dave: this example doesn't actually include the data category attributes to which the annotatorRef refers
08:52:00 [daveL]
felix: makes note that the test file and the example should be revised to include this
08:52:28 [fsasaki]
yves: we don't have annotatorsRef for all disambiguation examples
08:52:46 [daveL]
yves: we don't have annotatorRef in all examples of disambiguation
08:53:15 [fsasaki]
action: tadej to check disambiguation examples with regards to presence of annotatorsRef
08:55:13 [daveL]
topic: ISSUE-72 NIF comment
08:55:32 [daveL]
felix: comment was which version of NIF do we refer to
08:55:55 [daveL]
.. there are 1.0 and 2.0
08:56:08 [daveL]
.. also the question of its stability was raised
08:57:21 [daveL]
... and Christian also raised whether the mapping was canonical
08:58:19 [daveL]
dF: it may be a useful clarification for implementers
08:58:46 [daveL]
felix: but it's not clear what is meant by 'canonical XML' in this case
08:59:34 [daveL]
tadej: it implied there should be a canonical XML serialisation
09:00:56 [daveL]
felix: would such a requirement raise the bar for implementors? this needs to be discussed further on the lists
09:01:48 [daveL]
felix: now will attempt to dial in Christian
09:04:36 [Yves_]
rrsagent, draft minutes
09:08:11 [fsasaki]
topic: issue-68
09:08:17 [fsasaki]
scribe: fsasaki
09:08:54 [fsasaki]
marcis: there was a discussion on ITS term and disambiguation
09:09:05 [chriLi]
chriLi has joined #mlw-lt
09:09:08 [fsasaki]
.. christian brought it up, various comments from the WG
09:09:34 [fsasaki]
.. david suggested that we should not break ITS1.0, but felix said it is not necessary to have it
09:09:46 [daveL]
marcis: summarises discussion
09:10:15 [fsasaki]
daveF: don't break it if it works
09:10:22 [fsasaki]
.. that's the bottom line
09:10:41 [fsasaki]
.. we want to keep also independence of features
09:10:58 [fsasaki]
marcis: I could implement terminology independent of the rest of disambiguation
09:11:13 [fsasaki]
.. the question is: if we agree to change something, it is independent, so different question
09:11:30 [fsasaki]
.. david suggested to have a bp document that specifies how things relate
09:11:44 [fsasaki]
daveF: there are separate use cases for disambiguation and terminology
09:12:02 [fsasaki]
.. things are backed by different use cases, also from the implementers point of view
09:12:56 [fsasaki]
felix: we can also deprecate one of these
09:13:09 [fsasaki]
tadej: if we want to annotate the same fragment - which one to choose?
09:13:14 [fsasaki]
marcis: that is the biggest problem
09:13:21 [fsasaki]
.. we cannot do both
09:13:54 [fsasaki]
marcis: there was a comment from yves, we should break larger problems into smaller ones
09:14:20 [fsasaki]
.. so even if we have an "upper level" data category which we could then use for both scenarios
09:14:51 [fsasaki]
tadej: we could use the same trick we did with annotators ref, e.g. using multiple values in the same attribute
09:15:03 [fsasaki]
.. not sure if we would encourage people to do this
09:15:16 [fsasaki]
.. complex, but same level of complexity as ...
09:15:57 [fsasaki]
tadej: another solution tadej suggested was to have many attributes , but that's the same as having everything in one attribute
09:16:10 [fsasaki]
.. if we can come up with a closed set of types of annotation, that's a solution
09:16:23 [fsasaki]
.. but that needs to be a closed set, since we are specifying attributes
09:16:48 [fsasaki]
.. right now for disambiguation we agreed for three levels: concept, entity, lexicon
09:17:10 [fsasaki]
marcis: there is no definition for each of these levels, e.g. what is a lexical concept?
09:17:20 [fsasaki]
.. I saw that there is a terminology inconsistency
09:17:34 [fsasaki]
.. terminology is not used always in the same way in the disambiguation description
09:18:07 [fsasaki]
daveL: the issue in using both of them for the same term - we are not clear how to combine them?
09:18:16 [fsasaki]
tadej: it is not an issue at the moment
09:18:34 [fsasaki]
.. if you fold it in one data category, it becomes a problem
09:19:45 [fsasaki]
daveF: a big system will have a terminology life cycle with many manual people, but it is an automatic workflow
09:20:09 [fsasaki]
daveL: aim of disambiguation is that it would make the output of automatic annotation available
09:20:21 [fsasaki]
ack c
09:20:51 [fsasaki]
christian: thanks to marcis for putting everything into a condensed form
09:21:08 [fsasaki]
.. where are we with the discussion today: my understanding is the following:
09:21:45 [fsasaki]
.. people think it is not a bad idea to try to come up with a data category that can subsume what ITS2 terminology and ITS2 disambiguation try to cover
09:22:05 [fsasaki]
.. with respect to paying attention to ITS1: situation is that there is no need to go for backwards compatibility
09:22:17 [fsasaki]
.. one way to achieve soft transition would be to deprecate existing ITS term
09:23:00 [fsasaki]
.. one way to come up with the upper level data category: two implementation suggestions were made: based on attribute values and distinct values for annotation types
09:23:18 [fsasaki]
.. this is how I understand the current state of the discussion
09:23:31 [fsasaki]
.. I'm wondering what the next step would be
09:23:46 [fsasaki]
.. to say: we realize that we want to really look into this change
09:23:55 [fsasaki]
.. and want to do something to the current draft
09:24:15 [fsasaki]
.. if this wants to be driven it could be done via mail or a separate call
09:24:27 [fsasaki]
.. need to agree on the approach
09:24:35 [fsasaki]
ack fsa
09:25:12 [daveL]
scribe: daveL
09:25:40 [daveL]
felix: we have agreement that backward compatibility isn't an absolute barrier
09:25:54 [daveL]
... but it is in my view desirable
09:26:01 [daveL]
Christian: fully agree
09:26:37 [daveL]
felix: another point is trying in general to reduce level of substantive change
09:26:57 [daveL]
felix: another point is experience of people who implement and know users of ITS 1.0 terminology
09:27:11 [daveL]
... such as yves and OKAPI community
09:27:49 [daveL]
yves: not necessarily a big problem to change but would like to keep backward compatibility in general
09:28:12 [daveL]
tadej: suggested changes would break backward compatibility
09:29:14 [daveL]
marcis: potentially we add complexity to terminology by including link to external ontology or other lexical resource
09:29:23 [daveL]
df: agrees
09:30:46 [daveL]
felix: compromise is having an umbrella data category, and allow term to stay the same
09:31:18 [fsasaki]
ack ch
09:31:23 [fsasaki]
arle: agree with marcis
09:31:37 [daveL]
marcis: have some questions about the definition of disambiguation, e.g. the meaning of what is a lexical concept
09:32:23 [daveL]
christian: support having an umbrella data category that would not increase complexity of separate term and disambiguation use case
09:33:25 [daveL]
... also we will get better uptake if we can offer an easier route to marking up the output of text analysis
09:33:51 [daveL]
... rather than having to support the more complex issues in disambiguation
09:34:38 [fsasaki]
present+ christan(for10-11-call)
09:35:23 [daveL]
tadej: the reason for defining granularities was the major requirements of linguists, it was not sufficient to have this all in the target external data structure
09:35:47 [daveL]
... so even granularity definition was a compromise
09:36:03 [daveL]
arle: the term 'granularity' may also be an issue
09:36:43 [daveL]
tadej: was previously 'disambiguation type', but it was difficult to find the right term
09:37:39 [daveL]
felix: asks tadej, marcis, christian to come up with a proposal that allows for both use cases and consider backward compatibility for term?
09:37:57 [daveL]
... but this would need to be done by the end of next week?
09:39:07 [Arle]
Without putting too much thought into it, would disambiguationClassType work? Would this always correspond to a description of the kind of disambiguationClass intended?
09:39:49 [daveL]
christian: happy to let marcis and tadej try and draft something over these two days and then I can dial in again to discuss it further
09:40:04 [daveL]
marcis: asks who was originator of disambig
09:41:38 [daveL]
tadej: originally it was a named entity recogniser category, but it became merged with disambiguation after discussion with Linguaserve and others
09:42:29 [daveL]
marcis: could we have a cascading model, since named entity can be composite
09:43:56 [chriLi]
Don't forget to bring the beer bottles to the room as well :-)
09:44:02 [daveL]
daveL: note this overlaps with issue-109 on disambiguation in indic languages
09:44:22 [fsasaki]
topic: issue-75
09:45:02 [daveL]
felix: jorge as shepherd has produced a summary of this topic
09:45:25 [daveL]
christian: my domain comment had three parts
09:46:20 [daveL]
.. one main point - was looking for a way of providing meta-data on a domain without pointing to a resource; this has not yet been resolved
09:46:41 [daveL]
... another point was that domain meta-data is processor specific
09:47:24 [daveL]
... so if in one world it is called x, then the context in which x is meaningful needs to be provided
09:48:08 [daveL]
... now jorge has resolved point 2b, but the above still has to be resolved
09:49:12 [daveL]
felix: felt adding this context meta-data was a new feature but could be resolved with a note that this relates to a single engine use case
09:50:13 [daveL]
christian: broadly agrees such a note would satisfy him, since in ITS the focus was on single engine scenarios. But this needs to be made clear as an assumption in ITS2.0
09:50:44 [daveL]
felix: have now started collecting items on tracker categories as 'not addressed in ITS2.0'
09:51:54 [fsasaki]
topic: issue-73
09:51:55 [daveL]
... so if larger implementors, e.g. sap, adobe, ms, will put resources into the multiengine scenario we could consider it, otherwise we should stick with making explicit the single engine context
09:53:03 [daveL]
felix: with NIF the stability is an issue and will refer back to Sebastian Hellmann about the plan for this
09:53:17 [daveL]
... need this information to react fully to this comment
09:54:00 [daveL]
.. other comment was how the mapping could benefit from canonical definition of mapping
09:54:35 [fsasaki]
ack chr
09:54:53 [daveL]
Felix: so my comment is whether this would be of use to implementors, since in the room there was a lot of familiarisation with the use and benefits of canonicalisation
09:55:19 [daveL]
christian: asks do we have more than one implementation
09:55:32 [daveL]
felix: confirms we have one from sebastian and one from felix
09:56:41 [daveL]
christian: I brought this up to ensure that whenever NIF processing is ensured, we end up with the same representation, and this needs normalisation and canonicalisation
09:57:21 [daveL]
... if not, then we may end up with versions that are incompatible
09:58:29 [daveL]
felix: asks whether some comparison between documents in NIF is a likely use case. would the comparison not take place back in the document itself?
09:59:25 [daveL]
christian: I think you would need a unicode normalisation
09:59:41 [daveL]
felix: but this was related to regex in another data category
10:00:16 [daveL]
christian: if we are recommending normalisation anyway in this other data category, could we not use this to solve the problem here
10:00:34 [fsasaki]
topic: issue-74
10:01:39 [fsasaki]
scribe: fsasaki
10:01:48 [fsasaki]
daveL: christian provided some bullet point comments
10:01:55 [fsasaki]
.. are you planning more re-writing
10:02:04 [fsasaki]
.. or should david and I take your comments in?
10:02:06 [fsasaki]
ack cl
10:02:10 [fsasaki]
ack ch
10:02:17 [fsasaki]
christian: if it would be ok with you
10:02:27 [fsasaki]
.. I could turn the bullet points into something that people could read
10:02:38 [fsasaki]
.. with respect to the general approach
10:02:44 [fsasaki]
.. I could do editing of the doc
10:03:07 [fsasaki]
.. by mid next week
10:04:05 [fsasaki]
that would be action-377
10:04:15 [fsasaki]
davidF: that's clarificatory stuff, not very urgent
10:04:27 [fsasaki]
.. will wait for christian for a more readable version
10:05:06 [fsasaki]
felix: so we have discussed all comments from christian
10:05:50 [fsasaki]
felix will put thoughts on NIF in a mail
10:44:38 [daveL]
daveL has joined #mlw-lt
10:45:41 [fsasaki]
scribe: Yves_
10:45:45 [Yves_]
Scribe: Yves_
10:45:56 [fsasaki]
topic: issue-72
10:46:28 [fsasaki]
original comment here
10:46:38 [fsasaki]
.. see "Section 8.12 (Provenance Data Category)"
10:46:47 [Yves_]
daveL: Provenance issue is about timestamp
10:47:17 [Yves_]
.. quite complex to implement
10:47:44 [Yves_]
.. e.g. when the information is captured, etc.
10:48:10 [Yves_]
.. This is covered by the PROV standard
10:48:22 [Yves_]
.. and we have a mechanism to point to that
10:48:35 [Yves_]
.. so no need in ITS
10:49:03 [fsasaki]
yves: so has the order of provenance a meaning?
10:49:18 [Yves_]
.. so order SHOULD reflect the order things were order in the document
10:49:40 [Yves_]
s/were order/were added/
10:50:53 [swalter]
swalter has joined #mlw-lt
10:50:55 [Yves_]
original commentor got a reply and we are waiting for a response. comment was rejected.
10:51:26 [fsasaki]
topic: issue-76
10:52:17 [Yves_]
Arle: need to re-look at it
10:53:39 [Yves_]
Jirka: proposal for a solution is in the issue's note.
10:53:58 [Yves_]
.. question was about HTML and rules precedence
10:54:00 [fsasaki]
topic: issue-77
10:54:24 [Yves_]
.. no need to change anything
10:54:37 [Yves_]
.. link is the same as link in global rules
10:55:00 [fsasaki]
resolution proposal - see note from jirka Kosek, 22 Jan 2013, 22:58:35 at
10:56:07 [Yves_]
Marcis: my comment was that it was difficult to understand how things work
10:56:24 [Yves_]
.. because it's defined in multiple places
10:57:22 [daveL]
present+ DaveLewis
10:57:27 [Yves_]
felix: in section 6.4 there is some explanation
10:57:37 [Yves_]
.. we would add Jirka's clarification there
10:58:20 [Yves_]
.. this would define the inheritance behavior
10:59:05 [Yves_]
jirka: maybe issue is that global rules need to be read in document order
10:59:15 [fsasaki]
"Global selections in documents (using mechanism of external global rules or inline global rules)" > "Global selections in documents (using mechanism of external global rules or inline global rules), to be processed in document order"
11:00:53 [fsasaki]
"Global selections in documents (using mechanism of external global rules or inline global rules)" > "Global selections in documents (using mechanism of external global rules or inline global rules), to be processed in document order, see section 5.2.1 for details "
11:01:03 [Yves_]
Felix: could point to 5.2.1 in the HTML section
11:02:18 [Yves_]
rrsagent, draft minutes
11:02:18 [RRSAgent]
I have made the request to generate Yves_
11:03:01 [Yves_]
felix: let's close this issue. See the note in the issue page.
11:03:23 [fsasaki]
action: jirka to make edit for issue-77
11:03:23 [trackbot]
Created ACTION-391 - Make edit for issue-77 [on Jirka Kosek - due 2013-01-30].
11:03:41 [fsasaki]
topic: issue-76 again
11:04:08 [Yves_]
Arle: an implementer was looking at issue's type
11:04:21 [Yves_]
.. and saw inconsistency
11:04:27 [fsasaki]
original comment at
11:04:49 [Yves_]
.. solution would be to change the definition
11:05:19 [Yves_]
.. add "or text is translated inconsistently"
11:05:29 [Yves_]
.. and a second example.
11:05:46 [Arle]
Proposed change: The text is inconsistent within itself or text is translated inconsistently (NB: not for use with terminology inconsistency).
11:05:57 [Arle]
Add second example: The translated text uses different wording for a single regulatory notice in the source that occurs multiple times in a series of manuals.
11:06:18 [fsasaki]
change in this sec
11:08:06 [Yves_]
action: arle to make the edit for issue 76
11:08:06 [trackbot]
Created ACTION-392 - Make the edit for issue 76 [on Arle Lommel - due 2013-01-30].
11:08:20 [fsasaki]
topic: issue-78
11:08:58 [Yves_]
Felix: MIME type was registered, no more action is needed.
11:09:10 [fsasaki]
topic: issue-79
11:09:36 [Yves_]
s/MIME Type/rel-type/
11:10:10 [Yves_]
Felix: wrote a reply to that comment
11:11:42 [Yves_]
.. added text indicating namespace prefix can be different from "its" if it exists already
11:12:02 [Yves_]
Jirka: this just duplicates information. not good
11:12:43 [Yves_]
.. the initial text should already address the comment
11:13:09 [fsasaki]
"The namespace URI that MUST be used by implementations of this specification is:" > "The namespace URI that MUST be used by XML-based implementations of this specification is:"
11:13:19 [Yves_]
.. add only "XML-based" to implementation
11:13:53 [fsasaki]
action: felix to go back to richard about new resolution for issue-79
11:13:54 [trackbot]
Created ACTION-393 - Go back to richard about new resolution for issue-79 [on Felix Sasaki - due 2013-01-30].
11:15:03 [Yves_]
Topic: issue-80
11:15:34 [Yves_]
Felix: we can just add links to example
11:15:50 [Yves_]
action: felix to add links to examples for issue 80
11:15:50 [trackbot]
Created ACTION-394 - Add links to examples for issue 80 [on Felix Sasaki - due 2013-01-30].
11:16:03 [fsasaki]
topic: issue-81
11:16:35 [Yves_]
felix: related to issue-89
11:17:24 [Yves_]
Felix: issue is not clear how HTML maps to ITS
11:17:43 [Yves_]
.. some HTML constructs are explicitly mapped, others are not
11:17:56 [Yves_]
.. like terminology (dfn, dt, etc.)
11:18:41 [Yves_]
.. should an implementer of HTML/ITS process those constructs as term? or not?
11:19:01 [Yves_]
.. Possible solution is a mapping defined in best practice
11:19:14 [Yves_]
.. like we did in ITS 1.0
11:19:45 [Yves_]
.. we did this only as a best practice
11:19:59 [Yves_]
.. e.g. we don't talk about dfn in ITS 1.0
11:20:33 [Yves_]
.. for issue 81 we would not define normative relation to term
11:20:53 [Yves_]
.. but provide mapping in best practices document
11:21:27 [Yves_]
.. related issue is issue-97
11:22:15 [Yves_]
.. some HTML features are used but not declared as such, like 'translate'
11:23:50 [Yves_]
.. we should have something like "the ITS processor implementing Tranlsate MUST implement HTML5 translate attribute"
11:24:26 [Yves_]
See also note in issue-97
11:26:13 [Yves_]
Yves: this would resolve the issue
11:26:34 [fsasaki]
"the ITS processor implementing Tranlsate MUST implement HTML5 translate attribute" > "the ITS processor implementing Translate MUST implement HTML5 translate attribute in the same was as the ITS translate attribute for XML content"
11:28:16 [Yves_]
dF: we have a problem
11:29:08 [Yves_]
.. we don't have an its-translate equivalent
11:29:21 [Yves_]
Yves: we map to a functionality not an attribute
11:29:30 [Yves_]
.. like id or lang
11:30:10 [Yves_]
dF: we want to say HTML5 translate is the Translate local markup
11:30:28 [Yves_]
Yves: maybe we can re-use same text as for lang and id
11:31:41 [kfritsche]
"The recommended way to specify language identification is to use xml:lang in XML, and lang in HTML."
11:33:23 [Yves_]
Felix: for language we would need to say that lang has precedence
11:36:02 [fsasaki]
"If the attribute xml:id is present or id in HTML for the selected node, the value of the xml:id attribute or id in HTML MUST take precedence over the idValue value."
11:36:22 [fsasaki]
for lang info to be adapted to mention precedence of xml:lang and lang over langRule
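A small sketch of the id precedence quoted above: a native xml:id (or id in HTML) on the selected node wins over a value computed from a rule's idValue. The function name and the rule-derived argument are hypothetical, and ElementTree is only used here as a convenient XML parser.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def effective_id(elem: ET.Element, rule_id_value: Optional[str]) -> Optional[str]:
    """Prefer the node's own xml:id / id over the value from idValue."""
    native = elem.get(XML_ID)
    if native is None:
        native = elem.get("id")  # HTML-style id attribute
    return native if native is not None else rule_id_value


with_native = ET.fromstring('<p xml:id="x1">text</p>')
without_native = ET.fromstring("<p>text</p>")
print(effective_id(with_native, "from-rule"))     # x1
print(effective_id(without_native, "from-rule"))  # from-rule
```

The same shape would apply to language information if xml:lang and lang are given precedence over langRule, which is still an open action in this discussion.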
11:37:06 [Yves_]
Felix: we don't have an issue for lang
11:37:21 [Yves_]
.. we would also need test cases
11:38:25 [Yves_]
Felix: if there are xml:lang and lang present, lang MUST take precedence
11:38:36 [Yves_]
.. we need a test case for it
11:39:54 [Yves_]
Felix: need to test xml:lang lang in a XHTML file
11:42:10 [Yves_]
action: felix to check what of lang and xml:lang takes precedence
11:42:10 [trackbot]
Created ACTION-395 - Check what of lang and xml:lang takes precedence [on Felix Sasaki - due 2013-01-30].
11:42:30 [Yves_]
action: ankit to create example for xml:lang / lang
11:42:30 [trackbot]
Created ACTION-396 - Create example for xml:lang / lang [on Ankit Srivastava - due 2013-01-30].
11:43:02 [Yves_]
Yves: xml:lang seems to take precedence according to:
11:43:10 [swalter]
In HTML 5 the native HTML 5 translate attribute must be used to express the Translate data category.
11:43:17 [fsasaki]
issue-97 proposal
11:44:16 [Yves_]
action: yves to enter the new text for 97 (above)
11:44:16 [trackbot]
Created ACTION-397 - Enter the new text for 97 (above) [on Yves Savourel - due 2013-01-30].
11:45:29 [Yves_]
dF: I would table the dfn/dt issue before Term/Disambiguation is resolved
11:46:07 [Yves_]
Felix: think there are 2 types of content: clear relation (like id, translate) and unclear (dfn)
11:46:29 [Yves_]
marcis: dfn is very narrow
11:46:42 [Yves_]
.. employed only in very restricted definition
11:47:13 [Yves_]
.. dfn is like a sub-type of ITS term
11:47:28 [Yves_]
Tadej: dt is only in a list
11:48:40 [Yves_]
karlF: adding a default rule would be better
11:48:46 [Yves_]
.. simpler
11:49:02 [Yves_]
Marcis: but only in a BP document
11:49:06 [Yves_]
Felix: yes
11:51:22 [Yves_]
action: Felix to answer Richard to indicate we'll address this with a rule file in BP
11:51:22 [trackbot]
Created ACTION-398 - Answer Richard to indicate we'll address this with a rule file in BP [on Felix Sasaki - due 2013-01-30].
11:51:54 [Yves_]
action: Felix to draft non-normative section clarifying relations to HTML for issue 89
11:51:54 [trackbot]
Created ACTION-399 - Draft non-normative section clarifying relations to HTML for issue 89 [on Felix Sasaki - due 2013-01-30].
11:55:01 [Yves_]
action felix to edit the specification for Translate (MUST missing, etc.)
11:55:02 [trackbot]
Created ACTION-400 - Edit the specification for Translate (MUST missing, etc.) [on Felix Sasaki - due 2013-01-30].
11:55:47 [fsasaki]
topic: issue-82
11:57:59 [Yves_]
Felix: if values are ok, no need to have a mapping
11:59:27 [Yves_]
felix: something without mapping just pass through
12:01:04 [fsasaki]
answer to the comment: "STEP 3-1-2-5-2. Else (if no mapping is found): Add the string (in its original cases) to the result string."
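A minimal sketch of the pass-through behaviour in the quoted step: a value with a mapping is replaced, and a value without one is added unchanged, in its original case. The function name, the dictionary form of the mapping and the case-insensitive lookup are assumptions; case handling is still under discussion in this meeting.

```python
from typing import Dict, List


def map_domains(values: List[str], mapping: Dict[str, str]) -> List[str]:
    """Apply a domain mapping; unmapped values pass through untouched."""
    # Assumption: keys are compared case-insensitively.
    folded = {key.casefold(): target for key, target in mapping.items()}
    result: List[str] = []
    for value in values:
        # No mapping found: keep the string exactly as it was written.
        result.append(folded.get(value.casefold(), value))
    return result


mapping = {"Economía": "ECON", "Leyes": "Law"}
print(map_domains(["ECONOMÍA", "Sports"], mapping))  # ['ECON', 'Sports']
```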
12:01:19 [Yves_]
action daveL: reply to Richard
12:01:20 [trackbot]
Created ACTION-401 - Reply to Richard [on David Lewis - due 2013-01-30].
12:02:16 [fsasaki]
topic: case related comments
12:03:23 [Yves_]
Pablo: at first we used case-sensitive
12:03:37 [Yves_]
.. then we moved to insensitive
12:04:17 [Yves_]
.. we could compare directly
12:04:50 [Yves_]
.. but if document is encoded differently we may have entities
12:04:58 [Yves_]
.. and the string is different
12:05:10 [fsasaki]
scribe: fsasaki
12:05:17 [fsasaki]
yves: by entity you mean "person"?
12:05:21 [fsasaki]
pablo: yes
12:05:22 [pnietoca]
<meta name="description" content="Econom&iacute;a"/>
12:05:30 [pnietoca]
... domainMapping="Economía (ECON), Leyes (Law)"/>
12:05:31 [fsasaki]
yves: but that gets resolved when you parse the document
12:05:37 [fsasaki]
pablo: see example above
12:06:02 [fsasaki]
yves: when you read the document the entity will be converted into í
12:06:13 [fsasaki]
.. if we just do case-sensitive we have a problem
12:06:25 [fsasaki]
.. the reason why we want to have insensitive: to avoid duplicates
12:06:45 [fsasaki]
.. because we know people don't regard casing for keywords anyway
12:07:03 [fsasaki]
.. so in one case we say: case matters, in others we say they don't matter
12:07:19 [fsasaki]
.. so one solution is: case always matters
12:07:27 [fsasaki]
.. but what is the solution for HTML?
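To illustrate the point about entities: once the markup is parsed, "Econom&iacute;a" and "Economía" are the same string, so only case remains as a source of mismatch. In this sketch `html.unescape` stands in for what an HTML parser does to attribute values, and the helper name and `ignore_case` flag are hypothetical.

```python
import html


def same_keyword(a: str, b: str, ignore_case: bool = False) -> bool:
    """Compare two keywords after resolving character references."""
    a, b = html.unescape(a), html.unescape(b)
    if ignore_case:
        a, b = a.casefold(), b.casefold()
    return a == b


# The entity form and the literal form match once resolved.
print(same_keyword("Econom&iacute;a", "Economía"))
# A case difference only matches when case is ignored.
print(same_keyword("econom&iacute;a", "Economía"))
print(same_keyword("econom&iacute;a", "Economía", ignore_case=True))
```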
12:07:50 [fsasaki]
davidF: wouldn't be worried that you preserve case
12:07:58 [fsasaki]
.. only if you fail to map
12:08:12 [fsasaki]
yves: only when you compare during the mapping you are uncertain
12:08:24 [fsasaki]
.. problem is: many documents have keywords typed differently
12:09:04 [fsasaki]
.. could also have a keyword saying "mapping or not"
12:09:34 [fsasaki]
felix: would that delay the problem?
12:10:58 [fsasaki]
resolution: agree with first question in
12:11:09 [fsasaki]
.. 2nd question becomes unnecessary
12:11:50 [fsasaki]
scribe: Yves_
12:12:03 [Yves_]
action yves to fix text and algo for domain case mapping
12:12:03 [trackbot]
Created ACTION-402 - Fix text and algo for domain case mapping [on Yves Savourel - due 2013-01-30].
12:12:17 [Yves_]
scribe Yves_
12:13:59 [Yves_]
dF: dave split the issue into 3 topics
12:14:09 [Yves_]
.. first one was 84
12:14:28 [fsasaki]
reply from dave on issue-84 at
12:14:48 [Yves_]
.. answer is: yes transliterating is different but we didn't have enough use cases for a requirement
12:15:06 [Yves_]
.. that made it as a final data category
12:16:51 [Yves_]
felix: so we are waiting for a reply now
12:17:46 [fsasaki]
topic: ISSUE-86
12:18:00 [Yves_]
felix: implementation commitment
12:18:25 [Yves_]
for several issues
12:18:52 [Yves_]
.. for Ruby and Directionality
12:19:17 [Yves_]
.. basically we don't have experts and no volunteer to implement
12:19:34 [Yves_]
.. Ruby may be ported for XLIFF
12:20:52 [Yves_]
.. still not sure what is the aim: dropping ruby or not?
12:21:08 [Yves_]
.. also not sure when we can expect stability
12:21:20 [Yves_]
.. but we want to be feature complete very soon
12:21:49 [Yves_]
.. questions to the i18n are out, waiting for feedback
12:22:17 [fsasaki]
yves: directionality is not really used in XLIFF
12:22:22 [fsasaki]
.. implementers use control characters
12:22:38 [fsasaki]
.. we tried really hard in XLIFF2
12:22:49 [fsasaki]
.. we have a module for directionality in XLIFF2
12:23:12 [fsasaki]
.. but the implementers would insert rather control characters than markup
12:24:55 [Pedro]
Pedro has joined #mlw-lt
12:25:19 [Yves_]
dF: when we discussed directionality in Lyon, someone described how to do dir with inline markup
12:25:51 [Pedro]
present+ Pedro
12:27:11 [Yves_]
felix: .. for Ruby, I don't think anyone implemented the pointer for example
12:27:26 [Yves_]
Arle: need to speak to Asian developers
12:27:37 [Yves_]
.. group is not representative
12:27:48 [Yves_]
.. for these issues
12:28:50 [Yves_]
Felix: for Japanese there is a detailed document on layout
12:29:07 [Yves_]
.. and requirements in XML and HTML are pushed by this document
12:29:31 [Yves_]
.. Our question is how can we deal with it?
12:30:12 [Yves_]
Arle: maybe it can be defined later in a different namespace
12:30:29 [Yves_]
Felix: maybe, but basically it's the same for ITS 2.
12:30:39 [Yves_]
felix: lunch time now
13:29:14 [Arle]
Arle has joined #mlw-lt
13:30:10 [Ankit]
Ankit has joined #mlw-lt
13:31:02 [daveL]
daveL has joined #mlw-lt
13:31:09 [daveL]
present+ DaveLewis
13:31:18 [Arle]
present+ Arle
13:36:41 [Milan]
Milan has joined #mlw-lt
13:38:37 [Arle]
Scribe: Arle
13:38:48 [fsasaki]
topic: meeting schedule
13:38:58 [fsasaki]
13:38:58 [Arle]
Felix: I thought of discussing the next meetings, but Pedro isn't here.
13:39:30 [Arle]
.. See the wiki page. You will see that thanks to Tadej that we have a face-to-face in Bled in May.
13:40:21 [Arle]
.. I just got an email from Pedro with some offers to host the face-to-face in Madrid, but all are beyond budget (€5000), because he would have to rent meeting space.
13:40:44 [Arle]
.. We might need to think of an alternative to Madrid. One alternative is LocWorld in June in London.
13:40:58 [Arle]
.. We could ask Microsoft if there is a London office we could use.
13:41:16 [fsasaki]
13:41:46 [Arle]
LocWorld is 12–14 June
13:42:18 [Arle]
David: 10 June is XLIFF; 11–12 June (?) is FEISGILTT
13:42:47 [Arle]
Felix: We will need technical discussions in June.
13:43:05 [Arle]
Yves: Whole week is booked for some people with the different events.
13:43:11 [Arle]
Felix: Week of 17th?
13:43:38 [Arle]
.. Please check your calendars to see if that might work.
13:44:25 [Arle]
.. 17–18 June is the suggestion.
13:45:09 [Arle]
Location: TBD in a cheap place.
13:45:22 [Arle]
Felix: Berlin would be free.
13:45:36 [Arle]
s/Location:/.. Location/
13:45:49 [Arle]
Dave: Dublin is an option.
13:46:20 [Arle]
Action: Felix is to check availability of Berlin on 17–18 June.
13:46:21 [trackbot]
Created ACTION-403 - Is to check availability of Berlin on 17–18 June. [on Felix Sasaki - due 2013-01-30].
13:46:34 [Jirka]
Jirka has joined #mlw-lt
13:46:54 [Arle]
Action: daveL to check availability in Dublin for face-to-face meeting on 17–18 June.
13:46:54 [trackbot]
Created ACTION-404 - Check availability in Dublin for face-to-face meeting on 17–18 June. [on David Lewis - due 2013-01-30].
13:48:00 [Arle]
Pedro: I am looking at various possibilities in Madrid still.
13:48:01 [tadej]
tadej has joined #mlw-lt
13:48:49 [Arle]
Felix: Would it be OK for you if we look at other cities to save costs?
13:48:59 [leroy]
leroy has joined #mlw-lt
13:49:01 [Arle]
Pedro: That is fine for me. Leave Madrid as an alternative.
13:49:43 [Arle]
.. My latest option in Madrid comes to 3–3.5K€, if we have everyone stay at the same hotel.
13:50:30 [Arle]
Felix: We need to fix these dates as soon as possible because of Localization World so that travel can be arranged by everyone as appropriate.
13:52:21 [Arle]
.. Dave and I will try to decide so people can make arrangements.
13:52:53 [fsasaki]
13:53:02 [Arle]
.. We are also considering another face-to-face in September, around LRC conference.
13:53:26 [Arle]
.. In Limerick.
13:54:34 [Arle]
.. Dates would be 16–17 September (pending confirmation).
13:54:44 [mdelolmo]
mdelolmo has joined #mlw-lt
13:56:34 [Arle]
.. Would 23–24 September be also good
13:56:45 [fsasaki]
will come back to september meeting tomorrow
13:56:50 [Arle]
s/also good/also good?/
13:57:01 [fsasaki]
23-24 would be difficult for cocomore
13:57:31 [Arle]
topic: Last workshop
13:58:25 [Arle]
Felix: Project ends in December. DoW shows we spend most efforts until September, so if the workshop is in December, mass may be difficult. Do we have a regular workshop, or some other kind of event?
13:58:39 [Arle]
.. Any ideas of other options for final event?
13:58:58 [Arle]
.. We can't drop it due to work package, which describes it as biggest workshop.
13:59:42 [Arle]
Pedro: What about colocation of the final workshop with another event?
14:00:04 [Arle]
.. David: What about tcworld?
14:00:22 [Arle]
s/.. David:/David../
14:00:33 [Arle]
.. It is a big one. Might be good to connect there.
14:01:17 [Arle]
Action: Felix to follow up with Christian on tekom as an option.
14:01:17 [trackbot]
Created ACTION-405 - Follow up with Christian on tekom as an option. [on Felix Sasaki - due 2013-01-30].
14:01:37 [Arle]
Arle: Consider that colocating with a commercial event will likely have higher costs.
14:01:53 [Arle]
Felix: We can do another MLW workshop, or look at other options.
14:01:58 [Arle]
Yves: That is a lot of work.
14:02:09 [Arle]
Felix: Yes, and after September, we can't ask people for a lot of work.
14:02:26 [Arle]
.. Also, September/October is probably too early for the next workshop after the one in March.
14:03:20 [Arle]
.. What if we don't make a conference or go to one? Instead we have an event (possibly closed) to do demos to customers?
14:03:52 [Pedro]
Pedro: Tekom, Wiesbaden 06Nov-08Nov2013
14:04:28 [Arle]
Felix: We can still consider this in January. Let me and Dave know of any options that come to mind.
14:05:01 [Arle]
Dave: I can already confirm space would be available in Dublin in June.
14:06:06 [Arle]
topic: posters
14:06:47 [Arle]
Felix: Our reviewers will most likely not be in Rome. So we need to make a presentation in Luxembourg. Posters would help show completion.
14:07:26 [Arle]
Pedro: What size should they be?
14:07:30 [Arle]
Felix: A0.
14:07:43 [Arle]
Action: Arle to resize templates for posters from A1 to A0.
14:07:43 [trackbot]
Created ACTION-406 - Resize templates for posters from A1 to A0. [on Arle Lommel - due 2013-01-30].
14:08:11 [Arle]
topic: Issues
14:08:34 [daveL]
scribe: daveL
14:10:11 [fsasaki]
topic: issue-88
14:10:12 [fsasaki]
14:10:32 [daveL]
felix: this is just editorial in the directionality section
14:11:00 [Arle]
Scribe: Arle
14:11:09 [fsasaki]
s/topic: Issues//
14:11:42 [Arle]
David: I don't know the difference between the HTML elements here.
14:12:18 [Arle]
Action: Felix to check for clarification on Issue-88
14:12:18 [trackbot]
Created ACTION-407 - Check for clarification on Issue-88 [on Felix Sasaki - due 2013-01-30].
14:13:27 [fsasaki]
topic: issue-92
14:13:51 [fsasaki]
original mail at
14:14:01 [Arle]
Yves: This is a note from Richard asking why information is in a note, which is not normative.
14:14:26 [Arle]
.. Can a note be normative? I believe they can be if they are in a normative section. I believe we have MUSTs in notes.
14:14:36 [Arle]
Felix: I think that is a mistake.
14:14:46 [Arle]
Action: Felix to ensure that there is no MUST in any notes.
14:14:46 [trackbot]
Created ACTION-408 - Ensure that there is no MUST in any notes. [on Felix Sasaki - due 2013-01-30].
14:15:01 [Arle]
Yves: idValue global has one.
14:16:40 [fsasaki]
relation to issue-103 - clarify the algorithm
14:17:20 [Arle]
.. One explanation + bullet explaining that empty = no locale and * = all locales. Then we can eliminate the note.
14:17:49 [Arle]
Felix: Solution is to have three bullets explaining the cases, and delete note. Resolves issue-92 and issue-103.
14:18:17 [Arle]
.. Yves, do you use extended filtering?
14:18:46 [Arle]
Yves: Yes. We do. We need to check with Shaun, but I believe this is the algorithm for extended filtering.
14:20:06 [Arle]
Felix: We need to express the approach described in BCP47 and that it will work for everyone implementing this. Tilde should check.
14:21:02 [Arle]
.. Ankit and Marcis, should we return to this, or can we assume that if we don't hear otherwise, it’s OK?
14:22:00 [Arle]
Action: Yves to follow up with Richard and Norbert on action-92 and action-103.
14:22:00 [trackbot]
Created ACTION-409 - Follow up with Richard and Norbert on action-92 and action-103. [on Yves Savourel - due 2013-01-30].
14:22:14 [fsasaki]
14:22:36 [fsasaki]
14:22:54 [Marcis]
Marcis has joined #mlw-lt
14:23:14 [Arle]
topic: Issue-93
14:23:52 [Arle]
Jirka: Proposed resolution is to use what was proposed by original commenter.
14:24:22 [Arle]
Action: Jirka to write to Henry on issue-93 and make the change in the text.
14:24:23 [trackbot]
Created ACTION-410 - Write to Henry on issue-93 and make the change in the text. [on Jirka Kosek - due 2013-01-30].
14:25:05 [Arle]
topic: Issue-94
14:25:18 [Arle]
Felix: I think Jirka has a proposed resolution.
14:25:44 [Arle]
Jirka: I've sent replies to Henry, but not heard back. I think we should resolve this issue in a different way. See link at end of issue.
14:26:27 [Arle]
.. HTML has different rules for processing white space and decimal numbers. There is different precision between XML and HTML.
14:28:05 [Arle]
.. The easiest resolution is to use the double data type in XML for ITS. It will align XML and HTML. Double is implemented in almost all programming languages. So we move all data types to double and deal with the differences in leading and trailing whitespace between the two.
14:28:28 [Arle]
Felix: This impacts localization quality, MT confidence, and localization quality rating.
14:28:41 [Arle]
.. Is this OK for all implementers?
14:29:19 [Arle]
Jirka: Only difference is that double has lower precision than decimal. And you can use exponential notation.
14:29:38 [Arle]
Felix: Also disambigConfidence and term confidence.
14:30:38 [Arle]
Action: Jirka to change localization quality, localization rating, mt confidence, term confidence, and disambig confidence to use double rather than decimal and respond to Henry (Issue-94)
14:30:38 [trackbot]
Created ACTION-411 - Change localization quality, localization rating, mt confidence, term confidence, and disambig confidence to use double rather than decimal and respond to Henry (Issue-94) [on Jirka Kosek - due 2013-01-30].
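The trade-off Jirka describes (lower precision than decimal, exponential notation allowed, whitespace to be dealt with) can be illustrated with a minimal Python sketch. It assumes Python's `float` (an IEEE 754 double) as a stand-in for the XML double type and `decimal.Decimal` as a stand-in for decimal; `parse_confidence` is a hypothetical helper, not part of any ITS implementation.

```python
from decimal import Decimal

def parse_confidence(raw: str) -> float:
    """Parse an attribute value as a double, ignoring leading/trailing whitespace."""
    return float(raw.strip())

# Exponential notation is accepted when the value is read as a double.
print(parse_confidence(" 8.5e-1 "))  # 0.85

# A double holds roughly 15-17 significant digits; Decimal keeps every digit.
long_value = "0.12345678901234567890123"
as_double = parse_confidence(long_value)
as_decimal = Decimal(long_value)
print(repr(as_double))  # rounded to what a double can represent
print(as_decimal)       # all 23 fractional digits preserved
```

For confidence scores and quality ratings this loss of precision is unlikely to matter, which is consistent with the resolution recorded above.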
14:31:35 [fsasaki]
topic: issue-95
14:31:45 [fsasaki]
14:33:14 [Arle]
Felix: We should reject this. The proposal itself said that translatable is different than localizable (e.g., in formatting numbers and images).
14:33:37 [Arle]
.. Discussion was between Norbert, Felix, Des, and Phil.
14:34:00 [Arle]
.. I think addressing this would take too much time at this point.
14:34:17 [fsasaki]
another point for Dave here
14:34:28 [Arle]
Dave: It really is out of scope for ITS.
14:34:40 [Arle]
.. Translators will deal with this on their own anyway.
14:35:29 [Arle]
Felix: Norbert asked if we could use ITS for localizing CLDR? I don't see that as a real use case.
14:35:43 [Arle]
Action: Felix to let Norbert know that action-95 is out of scope.
14:35:43 [trackbot]
Created ACTION-412 - Let Norbert know that action-95 is out of scope. [on Felix Sasaki - due 2013-01-30].
14:36:16 [fsasaki]
topic: issue-96
14:36:24 [Arle]
14:36:54 [Arle]
topic: issue-98
14:36:57 [fsasaki]
topic: issue-98
14:37:14 [fsasaki]
s/issue-98/issue-98 and issue-99
14:38:07 [truedesheim]
truedesheim has joined #mlw-lt
14:38:17 [Arle]
Milan: related to issue-99. I found that there is no way to do this. It is mentioned only for global approach to selectors and what is allowed. Chapter 1.1 should state that the local approach can be applied only to the content of the current element and any inherited nodes, per 8.1
14:40:29 [Arle]
.. For issue-99, when using selectors in ITS, how do you select attributes? Information is there, but the definition of node differs between XML and HTML, leading to confusion. I see Yves’ suggestion to remove CSS as a selector type since they can point only to elements, but I would keep it and add a note that we can only point to elements, not attributes.
14:40:41 [Arle]
David: I think it makes sense to keep CSS.
14:41:40 [Arle]
Felix: We don't have any implementers using selectors.
14:41:48 [Arle]
Yves: Shaun is, as a prototype.
14:41:54 [Arle]
Felix: I never got it to work.
14:42:07 [Arle]
Yves: Norbert says for HTML people selectors may be important.
14:42:22 [Arle]
.. But with no implementations, it won't happen. It's marked as endangered.
14:42:46 [Arle]
Felix: We can drop "at risk" bits.
14:42:59 [Arle]
.. I agree with Milan's solution, but we might drop them anyway.
14:43:37 [Arle]
Jirka: suggested a path to get implementation.
14:43:59 [Arle]
Felix: It would be nice. Right now we have two paths, doing testing only for XPath, but not for CSS.
14:44:09 [Arle]
Jirka: Do we need tests, since they just select nodes?
14:44:30 [Arle]
Felix: Maybe in the test suite or elsewhere, we would have examples making use of CSS.
14:44:56 [Arle]
.. If we don't have testing, W3C management may not like us saying "you can do it on your own but we haven't done it."
14:45:36 [Arle]
Jirka: We need at least one selection mechanism. Testing is to verify interoperability.
14:46:12 [Arle]
Felix: We need to have at least one example for standardization and users about how to use it. We have no CSS examples.
14:46:22 [Arle]
Jirka: Let's have some examples, parallel to XPath examples.
14:46:45 [Arle]
Felix: Can you link to libraries to convert between CSS and XPath selectors?
14:47:05 [Arle]
.. Are there non-browser conversions?
14:47:51 [Arle]
Action: Jirka to find data on CSS and XPath selectors conversion libraries and keep CSS selectors in the spec.
14:47:51 [trackbot]
Created ACTION-413 - Find data on CSS and XPath selectors conversion libraries and keep CSS selectors in the spec. [on Jirka Kosek - due 2013-01-30].
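To make the CSS/XPath relationship discussed under this topic concrete, here is a minimal sketch of a selector conversion. It is not one of the conversion libraries ACTION-413 refers to: `css_to_xpath` is a hypothetical helper that handles only type selectors, `#id`, and the descendant combinator, and it targets the XPath subset supported by Python's `xml.etree.ElementTree`. Note that the result always selects elements, which matches Milan's point that CSS selectors cannot point to attributes.

```python
import re
import xml.etree.ElementTree as ET

# One simple selector: an element name with an optional #id (assumed subset).
_SIMPLE = re.compile(r"^([A-Za-z][\w-]*)(?:#([\w-]+))?$")

def css_to_xpath(selector: str) -> str:
    """Translate a tiny CSS subset into an ElementTree-compatible XPath."""
    steps = []
    for part in selector.split():  # whitespace = descendant combinator
        match = _SIMPLE.match(part)
        if match is None:
            raise ValueError(f"unsupported selector part: {part!r}")
        name, ident = match.groups()
        steps.append(name if ident is None else f"{name}[@id='{ident}']")
    return ".//" + "//".join(steps)

doc = ET.fromstring("<body><div><p id='intro'>Hi</p><p>There</p></div></body>")
print(css_to_xpath("div p#intro"))  # .//div//p[@id='intro']
print([p.text for p in doc.findall(css_to_xpath("div p"))])
```

A real conversion library would also need to cover classes, attribute selectors, and child/sibling combinators.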
14:48:31 [Arle]
topic: issue-100
14:48:38 [fsasaki]
14:48:54 [Arle]
Felix: Yves proposed a resolution.
14:49:07 [Arle]
Action: Felix to make edit for issue-100 and get back to Norbert.
14:49:07 [trackbot]
Created ACTION-414 - Make edit for issue-100 and get back to Norbert. [on Felix Sasaki - due 2013-01-30].
14:49:35 [fsasaki]
topic: issue-104
14:49:59 [Arle]
Action: Felix to update unicode reference for issue-104
14:49:59 [trackbot]
Created ACTION-415 - Update unicode reference for issue-104 [on Felix Sasaki - due 2013-01-30].
14:50:18 [Arle]
topic: issue-106 and issue-107
14:50:58 [fsasaki]
106 see
14:51:21 [fsasaki]
14:51:27 [Arle]
Karl: Norbert asked some questions and we weren't sure how to resolve them. It isn't up to the spec. The implementation must support UTF-8, but that is up to the implementer. It is best practice, especially for storage size. But we don't think it has to be mandatory for all implementations.
14:51:29 [fsasaki]
106 see
14:52:18 [Arle]
.. Other question was how to handle encoding when the implementation doesn't support it. Again, this is not up to the spec. We can define best practice, but it doesn't need to be stated in the spec.
14:53:30 [Arle]
Stephan: Perhaps we have an explanation about what storage size is used for. The question is about when it is used to mark up text in the source language. It is informational, but not up to the spec to tell us what to do if a tool doesn't support an encoding or if user text cannot be represented in a given encoding.
14:54:00 [Arle]
Karl: We should add a sentence to storage size, per the note on issue-107.
14:54:28 [Arle]
Felix: on issue-106 and issue-107 we do nothing, just let Norbert know the rationale.
14:56:17 [Arle]
Action: Karl to propose solution to Norbert and then Felix can add to spec.
14:56:17 [trackbot]
Created ACTION-416 - Propose solution to Norbert and then Felix can add to spec. [on Karl Fritsche - due 2013-01-30].
14:56:48 [Arle]
Felix: When we go back to Norbert, talk about what we did in the group to show there is consensus.
14:57:23 [Arle]
topic: issue-108 and issue-109
14:57:40 [Arle]
Felix: Both relate to Indic requirements.
14:58:35 [Arle]
Dave: They make a point that there is dependency on context (e.g., part of speech) that influences how you translate things. They want PoS in localizationNote and provided an annex of possible annotations.
14:59:53 [Arle]
.. Adding a data type specifically for this would be a big change. You see companies when they want to add their own metadata use localizationNote with name:value pairs. It could be best practice outside the spec.
15:00:27 [daveL]
15:00:33 [fsasaki]
reply from Dave on locNote its2 req , see above mail
15:01:12 [Arle]
Dave: I pointed them to other relevant resources, like NIF.
15:02:32 [Arle]
Arle: This would be too complex for us to solve this problem. Anything that works for Europe may fall apart elsewhere.
15:02:54 [Arle]
.. I don't think we could solve this in a reasonable time frame without too much controversy.
15:03:45 [Arle]
Tadej: they have PoS taggers in MT already, but it is specialized. This would be scope creep.
15:04:01 [Arle]
Marcis: Once you add PoS, you have to add syntax, etc.…
15:04:22 [Arle]
Dave: Do humans need PoS tagging? I don't know.
15:04:29 [tadej]
tadej has joined #mlw-lt
15:04:42 [Arle]
Marcis: Wouldn't this be duplicating existing work in text analysis?
15:05:35 [Arle]
Action: DaveL to go back to Somnath on issue-108 to explain why we won't address it.
15:05:35 [trackbot]
Created ACTION-417 - Go back to Somnath on issue-108 to explain why we won't address it. [on David Lewis - due 2013-01-30].
15:06:05 [Arle]
Dave: issue-109 falls out of my expertise. It deals with nested output from NER.
15:06:57 [Arle]
Tadej: I didn't quite follow the requirements. It seems they want to show that parts of entities may be entities. I don't know if they need this or are showing what they might do with this.
15:07:35 [Arle]
.. Regardless of this, the comment is that hierarchy is needed.
15:07:40 [Arle]
Dave: We can't do this.
15:08:08 [Arle]
Tadej: Overriding makes that the case, but if we allowed multiple values, we could.
15:08:19 [Arle]
Dave: But you need to show that the different parts are bound together.
15:09:56 [Arle]
Tadej: If you allow multiple values (e.g., something can belong to two entities), then the scope can be ambiguous.
15:10:26 [Arle]
Marcis: But there should be no ambiguous overlaps in a hierarchy.
15:11:59 [Arle]
Stephan: When would you actually use the knowledge that you have nested named entities?
15:12:18 [Arle]
Tadej: Can we make the restriction that entities are contiguous?
15:12:23 [Arle]
Dave: That would be reasonable.
15:13:03 [Arle]
Dave: The solution isn't straightforward. This would be a new feature. I think we should respond in that way.
15:13:20 [Arle]
s/Dave: The solution/.. The solution/
15:15:00 [Arle]
Discussion about whether hierarchy is needed and produced.
15:15:15 [Arle]
Dave: You could also point to a NIF record with that structure in it.
15:15:42 [Arle]
Tadej: If several disambiguationRefs address something, we can't tell which one produced what.
15:16:03 [Arle]
.. If a single node can have multiple values it makes tracking hard. We use stand-off for this.
15:16:20 [Arle]
.. This multiple granularity might break things.
15:16:57 [Arle]
Action: Dave to respond to Somnath on issue-109 to explain we are looking at it to make recommendations.
15:16:57 [trackbot]
Error finding 'Dave'. You can review and register nicknames at <>.
15:17:02 [Arle]
Action: DaveL to respond to Somnath on issue-109 to explain we are looking at it to make recommendations.
15:17:02 [trackbot]
Created ACTION-418 - Respond to Somnath on issue-109 to explain we are looking at it to make recommendations. [on David Lewis - due 2013-01-30].
15:18:09 [fsasaki]
topic: locale filtering question
15:18:18 [fsasaki]
marcis: in content is "de"
15:18:30 [fsasaki]
.. in the localeFilter it would be de-de
15:19:21 [fsasaki]
felix: not matched
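The "not matched" answer follows if locale filtering uses extended filtering as described in BCP 47 (RFC 4647, section 3.3.2), which is what the issue-92/issue-103 discussion earlier in these minutes suggests. The sketch below is a minimal implementation under that assumption. The function names are hypothetical, and treating the filter list as comma-separated is also an assumption; the "empty = no locale" and "* = all locales" cases come from the discussion above.

```python
def extended_filter_match(language_range: str, tag: str) -> bool:
    """Extended filtering of a language tag against one range (RFC 4647, 3.3.2)."""
    range_subtags = language_range.lower().split("-")
    tag_subtags = tag.lower().split("-")
    # The first subtags must match, unless the range starts with a wildcard.
    if range_subtags[0] != "*" and range_subtags[0] != tag_subtags[0]:
        return False
    r, t = 1, 1
    while r < len(range_subtags):
        if range_subtags[r] == "*":
            r += 1                        # wildcard: skip it in the range
        elif t >= len(tag_subtags):
            return False                  # range asks for more than the tag has
        elif range_subtags[r] == tag_subtags[t]:
            r += 1
            t += 1
        elif len(tag_subtags[t]) == 1:
            return False                  # singleton in the tag stops the search
        else:
            t += 1                        # skip a non-matching tag subtag
    return True

def locale_selected(locale_filter_list: str, tag: str) -> bool:
    """Empty list selects no locale; '*' selects all locales."""
    ranges = [r.strip() for r in locale_filter_list.split(",") if r.strip()]
    return any(extended_filter_match(r, tag) for r in ranges)

print(locale_selected("de-de", "de"))   # False: the case Marcis asked about
print(locale_selected("de", "de-DE"))   # True: a shorter range matches a longer tag
```

The asymmetry is the key point: a range matches tags that are at least as specific as the range, never less specific ones.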
15:35:29 [mdelolmo]
mdelolmo has joined #mlw-lt
15:42:29 [tadej]
tadej has joined #mlw-lt
15:46:39 [Arle]
topic: test suite check
15:47:18 [Arle]
Felix: We don't have a lot of coverage (38%) and most of that is thanks to Yves and Fredryk (ENLASO).
15:47:52 [Arle]
.. At the end of January we have the deadline to run all test cases. Is that deadline (next week) realistic? We have some changes, but others are stable.
15:48:19 [Milan]
Milan has joined #mlw-lt
15:48:47 [Arle]
Leroy: The files will remain the same, with changes after the 21st.
15:49:23 [Arle]
Karl: our cases are theoretically all working, but we have some issues with sorting of attributes, which we don't do. That's the only reason we aren't complete.
15:49:58 [Arle]
.. In the input attributes are source and alt. We output them in that order, but the output sorts them.
15:50:15 [Arle]
Leroy: I can run my sorting function on output for you.
15:50:38 [Arle]
Stephan: Actually, it is backward, the source is in order, the output isn't.
15:50:56 [Arle]
Yves: Many engines do not care about order. You have to handle sorting yourselves.
15:51:29 [Arle]
Karl: It's not a big change and then we are done. I will make the change myself.
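Since attribute order is not significant in XML and many engines do not preserve it, one way to make test output comparable is to normalize the order before comparing. A minimal sketch, assuming Python's `xml.etree.ElementTree` (Python 3.8 or later, where serialization keeps insertion order); `sort_attributes` is a hypothetical helper, not the sorting function Leroy mentions.

```python
import xml.etree.ElementTree as ET

def sort_attributes(root):
    """Reorder the attributes of every element alphabetically by name, in place."""
    for node in root.iter():
        ordered = sorted(node.attrib.items())
        node.attrib.clear()
        node.attrib.update(ordered)
    return root

# Hypothetical input with the attribute order from the discussion: src, then alt.
doc = sort_attributes(ET.fromstring('<img src="a.png" alt="text"/>'))
print(list(doc.attrib))  # ['alt', 'src']
```

Applying the same normalization to both expected and actual output removes ordering as a source of false test failures.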
15:51:47 [Arle]
Ankit: We have a few small snags.
15:53:52 [Arle]
Linguaserve: (Some issues. ???)
15:54:06 [Arle]
Thomas: We are working on our implementations, should be ready next week.
15:54:47 [Arle]
David: Connection between Moravia and UL tests…
15:57:42 [Arle]
Felix: David, I know you use Okapi wrapper. When that is integrated in the workflow, you can run the same tests as Okapi. So now you run six cases, but you could run more then.
15:58:36 [fsasaki]
topic: RFC statements
15:59:00 [fsasaki]
15:59:25 [Arle]
Felix: Much is covered by the schema.
16:00:41 [Arle]
.. #25 talks about the content of the annotatorsRef attribute. Currently the data type is text. There is a need for a test case with a file with a non-allowed identifier, where the parser says it is wrong. That would test it, even though it does not produce specified output.
16:01:14 [Arle]
.. David, could you make a test case and get the implementers to run it?
16:01:21 [Arle]
.. See example below:
16:01:29 [fsasaki]
16:01:32 [fsasaki]
16:01:43 [Arle]
.. Second line should throw an error.
16:01:59 [Arle]
Yves: Do we have standard output for the errors?
16:02:15 [Arle]
Felix: No. This will require human verification.
16:05:12 [Arle]
.. We can address issues here until October.
16:05:27 [Arle]
.. After XML Prague would be fine.
16:05:38 [Arle]
Jirka: We can do this using Schematron with regex.
16:05:58 [Arle]
Karl: There are similar cases in the docs to do negative tests.
16:06:38 [Arle]
Jirka: It's already there, but you have to look at the Schematron, not the XSD.
16:09:41 [Arle]
.. Doing as much as possible in Schematron.
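The negative test for annotatorsRef discussed above (a file with a non-allowed identifier that the parser must reject) could be approximated outside Schematron as follows. This is a sketch under stated assumptions: the value is taken to be a space-separated list of `identifier|IRI` pairs, and `ALLOWED_IDENTIFIERS` is an illustrative subset, not the list from the spec. The function name is hypothetical.

```python
# Illustrative subset only; the authoritative identifier list is in the spec.
ALLOWED_IDENTIFIERS = {"mt-confidence", "terminology"}

def invalid_annotators_ref_entries(value: str) -> list:
    """Return the entries that are malformed or use a non-allowed identifier."""
    bad = []
    for entry in value.split():
        identifier, separator, iri = entry.partition("|")
        if not separator or not iri or identifier not in ALLOWED_IDENTIFIERS:
            bad.append(entry)
    return bad

print(invalid_annotators_ref_entries("mt-confidence|http://example.com/mt"))  # []
print(invalid_annotators_ref_entries("not-a-category|http://example.com/x"))
```

As noted in the discussion, there is no standard error output, so a check like this only reports which entries are wrong; a human still has to verify that each implementation flags them.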
16:09:55 [Arle]
Felix: What about #39, #35, #41?
16:11:50 [Arle]
.. If not checked by Schematron, please add later.
16:12:39 [fsasaki]
action: jirka to make schematron tests described at
16:12:39 [trackbot]
Created ACTION-419 - Make schematron tests described at [on Jirka Kosek - due 2013-01-30].
16:13:40 [Arle]
Felix: #31, if values have spaces, must be delimited with quotation marks. Need a test case?
16:13:58 [Arle]
Yves: It's already covered by the test cases, which fail if the output isn't formatted properly.
16:15:27 [Arle]
Felix: #36. Overriding means these won't be combined anyway. Maybe make an action to delete the sentence in 8.11.2?
16:16:50 [Arle]
16:17:00 [trackbot]
trackbot has joined #mlw-lt
16:17:52 [Arle]
Refers to Issue-111
16:18:04 [fsasaki]
action: felix to make edit for issue-111
16:18:04 [trackbot]
Created ACTION-420 - Make edit for issue-111 [on Felix Sasaki - due 2013-01-30].
16:18:45 [Arle]
Felix: #36 is dropped.
16:20:39 [fsasaki]
" If the type of the issue is set to uncategorized, a comment MUST be specified as well." - can be checked, an error if no comment is available
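A check for this MUST statement could look like the sketch below. The attribute names `locQualityIssueType` and `locQualityIssueComment` are assumptions about how the type and comment are carried, and `check_issue` is a hypothetical helper operating on a plain dictionary of attribute values.

```python
def check_issue(attrs: dict) -> list:
    """Return error messages for one quality issue's attributes."""
    errors = []
    issue_type = attrs.get("locQualityIssueType")          # assumed attribute name
    comment = attrs.get("locQualityIssueComment", "")      # assumed attribute name
    if issue_type == "uncategorized" and not comment.strip():
        errors.append("type 'uncategorized' requires a comment")
    return errors

print(check_issue({"locQualityIssueType": "uncategorized"}))
print(check_issue({"locQualityIssueType": "uncategorized",
                   "locQualityIssueComment": "Unclear what is wrong here"}))  # []
```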
16:26:43 [Arle]
Felix: Maybe we put the other MUST statement (about mapping internal types to issue type values) as its own test type. To catch the error, you must be able to parse the category.
16:27:09 [Arle]
.. You need to understand the values and different types or markup. It is on top of the normal test suite functionality.
16:29:58 [Arle_]
Arle_ has joined #mlw-lt
16:32:10 [Arle]
Yves: We don't need the MUST there. The value column covers the same thing.
16:40:02 [Arle]
Discussion about where to test.
16:41:38 [fsasaki]
topic: test suite
16:41:56 [fsasaki]
s/topic: test suite//
16:45:32 [fsasaki]
"The set of characters that are allowed is specified using a regular expression. That is, each character in the selected content MUST be included in the set specified by the regular expression."
16:49:19 [fsasaki]
this is not a test for the processor, but for the consuming application
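What the consuming application has to do for this statement can be sketched as a per-character check. The sketch assumes the regular expression describes a single character (for example a character class) and uses Python's `re` syntax, which is not identical to the regular expression dialect a spec may prescribe; `disallowed_characters` is a hypothetical helper.

```python
import re

def disallowed_characters(content: str, allowed: str) -> list:
    """Return the characters of content that the single-character regex rejects."""
    pattern = re.compile(allowed)
    return [ch for ch in content if pattern.fullmatch(ch) is None]

print(disallowed_characters("abc-1", "[a-z]"))   # ['-', '1']
print(disallowed_characters("abc", "[a-z]"))     # []
```

An empty result means every character of the selected content is in the allowed set.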
16:51:25 [fsasaki]
for IANA charset names see
16:53:16 [fsasaki]
we point to the IANA list, that's it
16:53:34 [fsasaki]
relevant for this MUST statement: "A storageEncoding attribute. It contains the name of the character set encoding used to calculate the number of bytes of the selected text. The name MUST be one of the names or aliases listed in the IANA Character Sets registry . The default value is UTF-8."
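The byte count this statement refers to, and the two failure cases raised under issue-106 and issue-107 (unsupported encoding, text not representable in the encoding), can be illustrated with a short sketch. It assumes Python's codec registry as an approximation of the IANA registry; the two are not identical, so a name being accepted here does not prove it is a registered IANA name. `byte_length` is a hypothetical helper.

```python
import codecs

def byte_length(text: str, storage_encoding: str = "UTF-8") -> int:
    """Number of bytes text occupies in the given encoding (default UTF-8)."""
    codecs.lookup(storage_encoding)             # LookupError if the name is unknown
    return len(text.encode(storage_encoding))   # UnicodeEncodeError if not representable

print(byte_length("Größe"))                # 7
print(byte_length("Größe", "ISO-8859-1"))  # 5
```

Both exceptions correspond to situations the group decided to leave to implementers rather than regulate in the spec.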
16:56:47 [Arle]
Felix: For many quality issue type items, change MUST/MUST NOT to must/must not.
16:57:06 [Arle]
.. Numbers 45–48
16:57:46 [fsasaki]
"See entries 45-48 at these statements are not verifiable. Proposal is to set MUST and MUST NOT to lower case to make clear that the text is just guidance."
17:00:12 [fsasaki]
for 45 " The values a tool implementing the data category produces for the attribute MUST match one of the values provided in this table and MUST be semantically accurate.": re-formulate this :
17:02:28 [fsasaki]
drop "MUST be semantically accurate".
17:03:43 [Arle]
"If a tool can map its internal values to these types it MUST do so and MUST NOT use the value other, which is reserved strictly for values that cannot be mapped to these values." -> "Note that the other category is reserved for cases where a tool-specific category cannot be mapped…"
17:05:41 [fsasaki]
action: arle to work on statements 45-48 at , see prague f2f minutes
17:05:41 [trackbot]
Created ACTION-421 - Work on statements 45-48 at , see prague f2f minutes [on Arle Lommel - due 2013-01-30].
17:06:11 [Arle]
Yves pointed out that the values should be done by class, not on an individual error basis independent of classes.
17:07:00 [Arle]
#48. If a system has a "miscellaneous" or "other" category, it MUST be mapped to this value even if the specific instance of the issue might be mapped to another category -> append note on semantic accuracy here.
17:08:43 [fsasaki]
topic: requirements doc
17:08:54 [fsasaki]
s/doc/doc and issues not addressed in ITS2
17:10:29 [fsasaki]
multi-engine domain scenario + multi engine domain scenario
17:11:19 [fsasaki]
issue-95 and issue-75 would be covered by this
17:14:06 [fsasaki]
(covered by dc.terms
17:19:57 [swalter]
for 45: Note that the other category is reserved... -> Note that the "other" category is reserved for cases where a tool-specific category cannot be mapped to any of the first categories in a semantically accurate manner.
