18 Jun 2014


DaveLewis, arle, felix, yves
david, christian


Open Data Management position statement


dave: have a tbd section - one area is licensing
... that would take up e.g. work done in META-SHARE and tda

yves: I have read the doc, didn't see any issues

"tbd: purpose help to formulate reqdocs. then: mention projects."

dave: trying to attract other projects - would be good to run this past chris wendt
... there is not much in the way of terminology integration
... also may be relevant for alolita
... from wikimedia

"tbd: cef out of focus."

"tbd: mention other standards explicitly? what is a standard?"

dave: point to current projects - there is also ongoing work that we can point to
... e.g. bitext access on the web, we are doing that in the bpmlod group
... there is licensing work in the ld4lt group
... need to have a way to point to these groups

felix: raise awareness in above groups and make them aware that we'd like their input

dave: also ontolex and the tbx / RDF work, need input from Philipp here
... I will present this in the MLi panel at the LT-Innovate event in brussels next week
... maybe have a little questionnaire that people can pick up to follow up

"Open standards. Open standards are standards that can be implemented on a royalty-free basis, that is, without any licensing requirements."

arle: need to add to the above: there needs to be a policy for maintaining them in an open way as well

dave: bring up this week bpmlod call; had raised it at ld4lt last week.

felix: I'll contact ontolex and alolita

dave: if any of this feeds into the CEF call, that would help
... would help to get the right people on board here
... might help to talk to EU people to see if they have guidance

"tbd: mention other standards explicitly? what is a standard?"

dave: two elements of this:
... listing things that are already available
... you can say: relevant existing standards, and standards that are worked on: lemon, XLIFF 2.0, ...
... and then saying: where are the gaps?
... one objective is: we don't know the answers always - there are some areas in which gaps need to be filled and there may be new work to be done
... that may also help e.g. EU to decide what to support
... put in DCAT

felix: MQM?

arle: it is not a standard yet, but moving in that direction

dave: so one could have several categories: final standard, draft standard, technology areas that need standardization

"Bitext Data Management Requirements" - here MQM would fit very well as something being prepared

scribe: e.g. lemon would fit very well
... in that way too
... section at the end - "gap analysis"
... table. we now have numbered requirements. we tick off maturity of avail. solutions
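
A minimal sketch of how such a gap-analysis table could be generated, assuming the three maturity categories dave named above. The requirement numbers, titles, and the maturity assigned to each solution are placeholders for illustration only, not assessments agreed in the meeting.

```python
from dataclasses import dataclass

# Maturity categories named in the discussion above.
MATURITY = ("final standard", "draft standard", "needs standardization")


@dataclass
class Requirement:
    number: int    # requirement number in the document
    title: str     # placeholder title (hypothetical)
    solution: str  # candidate solution, or "" if none is known
    maturity: str  # one of MATURITY


def gap_table(requirements):
    """Return plain-text rows, ticking one maturity column per requirement."""
    header = f"{'Req':<5}{'Solution':<12}" + "".join(f"{m:<24}" for m in MATURITY)
    rows = [header]
    for r in requirements:
        if r.maturity not in MATURITY:
            raise ValueError(f"unknown maturity: {r.maturity!r}")
        ticks = "".join(f"{('x' if r.maturity == m else '-'):<24}" for m in MATURITY)
        rows.append(f"{'R' + str(r.number):<5}{r.solution or '(none)':<12}{ticks}")
    return rows


def gaps(requirements):
    """Requirement numbers for which no standard or draft is available."""
    return [r.number for r in requirements if r.maturity == "needs standardization"]


# Placeholder data: titles and maturity values are illustrative only.
EXAMPLE = [
    Requirement(1, "dataset description", "DCAT", "final standard"),
    Requirement(2, "quality reporting", "MQM", "draft standard"),
    Requirement(3, "hypothetical open area", "", "needs standardization"),
]

if __name__ == "__main__":
    print("\n".join(gap_table(EXAMPLE)))
    print("gaps:", gaps(EXAMPLE))
```

Keeping the requirements as numbered records means the table and the list of gaps are derived from the same data, so the two cannot drift apart as the document is revised.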

felix will add at the beginning about intention to do an IG draft (to be discussed)

dave: also have a contributors section

arle: get feedback from gala too, contacting several people

felix: have a section for mentioning meta-net and other efforts - those communities would benefit from this

dave: good idea
... a lot of mt researchers are now using wikipedia for mt training, maybe a good collab. point with meta-net, e.g. experience in a particular country

felix: I'll check with the meta-net guys

arle: josef v.g. may be the right person to check that

xliff web IDL

felix describing web IDL choice options - web IDL for defining interfaces, plain json for serializing the current XML - XLIFF

yves: web IDL relation is confusing (CR versus other draft)
... in XLIFF we need two things, both API and data format aspect
... API is not useful if you don't have a clear object definition

dave: your existing API - is that a good model?

yves: needs to be more generic
... main problem is data format - depends on what you want to represent
... e.g. how to represent inline code is quite complex, you have distinct solutions
... not sure on how to proceed

dave: talked to david about this - general feeling is: inline codes dealing with overlapping annotations is at the heart of this

yves: if this is then not XML, people would see things differently
... inline representation is key
... will talk to david about this again e.g. on the next call
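
To illustrate why overlapping annotations make the inline representation hard, here is a rough sketch of one of the possible options: a stand-off JSON form in which annotations are character-offset spans rather than nested elements. This is not XLIFF and not a format anyone in the meeting proposed; the field names (`text`, `annotations`, `start`, `end`, `type`) and the sample spans are hypothetical.

```python
import json


def to_standoff(text, annotations):
    """Serialize text plus (start, end, type) spans as a JSON string."""
    for start, end, _ in annotations:
        if not 0 <= start <= end <= len(text):
            raise ValueError(f"span ({start}, {end}) is outside the text")
    return json.dumps({
        "text": text,
        "annotations": [
            {"start": s, "end": e, "type": t} for s, e, t in annotations
        ],
    })


def overlaps(a, b):
    """True if two spans partially overlap, i.e. neither nests in the other."""
    (s1, e1, _), (s2, e2, _) = a, b
    return s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1


# Hypothetical example: a formatting span and a terminology span that cross.
text = "Click the Start button"
spans = [(0, 15, "bold"), (10, 22, "term")]

if __name__ == "__main__":
    print(overlaps(spans[0], spans[1]))
    print(to_standoff(text, spans))
```

The two spans cross each other, so they cannot be written as a single pair of properly nested XML elements; a stand-off form avoids that, at the cost of offsets that must be kept in sync whenever the text changes.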

arle: worth looping in linport in this, they are discussing APIs right now, trying to clarify relationships to others


dave: we had discussion in FEISGILTT about MQM and relationship to ITS "localization quality issue" types
... there then was exchange on the list about this, Arle clarifying things


dave: arle saying how to map MQM into ITS or into other things - is that correct?

arle: to some extent. MQM allows you to declare what you check
... ITS can be more or less of what you check
... so you could declare an MQM model that is ITS
... the idea: ITS provides broad interop, if people map their categories they know what is there
... MQM would not know in advance what is in a metrics

dave: in ITS you have the loc quality profile ref
... in the MQM doc you could put the reference to the precise MQM type
... so the mapping will be definitive
... MQM structure details would be lost or would need to be in the string of ITS locProfileRef

arle: you would define a mapping and point to a file that has the mappings

yves: I think in ITS you cannot declare things at the top

Arle: ok
... the reference could point to an MQM declaration
... that loses the ability to say what the overall profile is
... one could use MQM without having MQM specific markup in ITS
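
The mapping idea discussed above could be sketched roughly as follows. The ITS attribute names and issue type values are those of the ITS 2.0 Localization Quality Issue data category (HTML syntax); the MQM issue names, the choice of `other` as the fallback, the profile URL, and the idea of carrying the precise MQM type as a fragment of the profile reference are assumptions made for illustration, not an agreed mapping.

```python
# Hypothetical content of a mapping file: MQM issue name -> ITS 2.0 issue type.
# Only a handful of entries are shown; the MQM names are illustrative.
MQM_TO_ITS = {
    "terminology": "terminology",
    "mistranslation": "mistranslation",
    "omission": "omission",
    "spelling": "misspelling",
    "typography": "typographical",
}

# Hypothetical URL identifying the MQM metric declaration.
PROFILE_BASE = "http://example.org/mqm-metric"


def its_attributes(mqm_issue):
    """Build ITS 2.0 HTML attributes for one MQM issue.

    Unmapped MQM issues fall back to the ITS type "other" (an assumption).
    The precise MQM type is kept as a fragment of the profile reference so
    that it is not lost in the coarser ITS type.
    """
    its_type = MQM_TO_ITS.get(mqm_issue.lower(), "other")
    return {
        "its-loc-quality-issue-type": its_type,
        "its-loc-quality-issue-profile-ref": f"{PROFILE_BASE}#{mqm_issue}",
    }


if __name__ == "__main__":
    print(its_attributes("Spelling"))
    print(its_attributes("Some-unmapped-issue"))
```

A consumer that only understands ITS still gets a usable issue type, while an MQM-aware tool can follow the profile reference to recover the original category.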

dave: you hit all problems if you start with an XML vocabulary

arle: felix had said a while ago if we should use RDF instead of XML - we may end up doing that for MQM (or have both)

dave: sometimes people have a vocabulary and a document in parallel
... but you can put definitions into RDF and then generate an HTML document which is human readable
... what is your timeline on this, Arle?

arle: development is planned to continue in qt21 and potentially in other projects

dave: in ld4lt it works quite well to team up with people to do some specific work items, e.g. taking an existing model, do rdf related things - but the group who brings in the topic still owns it
... the benefit for the group who brings in the topic is more feedback and visibility, but they don't lose ownership
... good example is meta-share schema discussion
... I am one of the ld4lt co-chairs, we could bring it up on the ld4lt call next week (Thursday 3 p.m.)

felix: arle could bring his material to the call and we'd see what the ontology engineers can do with that

dave: agree. one reason why the MQM / RDF discussion is interesting: they are opportunistic, using what is avail. from wikipedia or babelnet
... many resources are under active curation. being able to report such things back in an open way would be a great use case for MQM
... MQM would have a lot of the semantics for such error reporting

arle: perfect, that is exactly what we want to do

dave: great - I will email ld4lt group and CC Arle, saying we are planning to put it on the agenda

arle: ok

dave: and you can reply to that providing more info

arle: ok



[End of minutes]

