IRC log of rax on 2016-11-25

Timestamps are in UTC.

12:51:03 [RRSAgent]
RRSAgent has joined #rax
12:51:03 [RRSAgent]
logging to http://www.w3.org/2016/11/25-rax-irc
12:51:08 [fsasaki]
meeting: rax cg
12:51:10 [fsasaki]
chair: phil
12:51:17 [fsasaki]
agenda: https://lists.w3.org/Archives/Public/public-rax/2016Nov/0008.html
12:52:00 [fsasaki]
regrets: christian, gerard, jose
12:52:14 [RRSAgent]
I have made the request to generate http://www.w3.org/2016/11/25-rax-minutes.html fsasaki
12:59:30 [philr]
philr has joined #rax
12:59:35 [philr]
present+ philr
12:59:44 [fsasaki]
present+ felix
12:59:54 [RRSAgent]
I have made the request to generate http://www.w3.org/2016/11/25-rax-minutes.html fsasaki
13:01:47 [Timea_T_]
Timea_T_ has joined #rax
13:03:01 [clange]
clange has joined #rax
13:04:02 [fsasaki]
present+ timea, christoph
13:05:24 [fsasaki]
topic: meeting start
13:06:20 [fsasaki]
phil: did a review of use cases this morning. not too much change, missed one that christoph added.
13:06:32 [fsasaki]
https://www.w3.org/community/rax/wiki/Draft_Material#Data_acquisition_from_job_postings_via_GATE
13:06:53 [fsasaki]
phil: thanks a lot for adding this - can you give a brief description?
13:07:05 [fsasaki]
s/this/this, christoph/
13:07:48 [fsasaki]
christoph: sure. have not yet managed to share the descriptions; I have more material and will share it
13:08:17 [fsasaki]
... will also add more concrete examples. Application setting is: we collect job postings in the form of plain text from the web
13:08:31 [fsasaki]
... we do named entity recognition with GATE, and we get XML output
13:08:40 [fsasaki]
... beginning and end of each token is annotated
13:09:03 [clange]
text text text <start/>recognised entity<end/> text text
13:09:22 [fsasaki]
christoph: see above XML example. this has to be translated to RDF
13:09:22 [clange]
<start id="foo"/>
13:09:31 [clange]
<start href="#foo"/>
13:09:41 [fsasaki]
... start and end tags look like the above
13:09:46 [clange]
ids or refs (forgot which direction) are in these start/end tags
13:10:30 [fsasaki]
christoph: we are using an XSLT-based tool I developed (krextor) to create RDF. it is quite hard
13:10:39 [clange]
krextor
13:10:42 [fsasaki]
... with XPath it is hard to select elements between start and end tags
13:11:03 [fsasaki]
... that is a bit tricky, you need a good knowledge of XPath, the sibling axes, etc.
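The selection problem christoph describes (collecting everything between a start and an end marker) can be illustrated outside XSLT. The following is a minimal Python sketch, not the krextor approach. It assumes that the start marker carries an id attribute, that the matching end marker is an <end href="#..."/> element, and that both markers are siblings under the same parent; the log itself leaves the id/ref direction open. The function name milestone_spans and the wrapping <p> element are hypothetical.

```python
import xml.etree.ElementTree as ET


def milestone_spans(parent):
    """Map each start id to the text found between <start id="x"/> and <end href="#x"/>.

    Assumption: both markers are direct children of `parent`.
    """
    spans = {}
    open_id = None
    parts = []
    for child in parent:
        if child.tag == "start":
            open_id = child.get("id")
            # Text right after an empty element lives in its .tail
            parts = [child.tail or ""]
        elif (child.tag == "end" and open_id is not None
              and child.get("href") == "#" + open_id):
            spans[open_id] = "".join(parts)
            open_id = None
        elif open_id is not None:
            # An ordinary element between the markers: keep its text and its tail
            parts.append("".join(child.itertext()))
            parts.append(child.tail or "")
    return spans


doc = ET.fromstring(
    '<p>text text text <start id="foo"/>recognised entity'
    '<end href="#foo"/> text text</p>'
)
spans = milestone_spans(doc)
print(spans)
```

In XPath the same selection needs the following-sibling and preceding-sibling axes, which is the difficulty raised above; the loop here sidesteps that by walking the siblings in document order.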
13:11:59 [fsasaki]
... this is in the context of a European project, in which another partner is doing the extraction
13:12:51 [fsasaki]
phil: is this similar to Martynas' case?
13:13:03 [fsasaki]
christoph: in terms of XPath complexity, yes
13:13:42 [fsasaki]
... general XML to RDF transformation issue?
13:13:42 [fsasaki]
https://github.com/fsasaki/its20-extractor/tree/master/wikipedia-extractor
13:13:45 [philr]
felix: I've written various converters
13:14:09 [philr]
...there are always special-case issues
13:14:37 [philr]
...XML has various ways to include content
13:14:58 [philr]
...special-purpose handling is somewhat unavoidable
13:16:16 [philr]
...example documents with guidance would be useful
13:16:36 [fsasaki]
... may be useful to give guidance on how to handle various cases
13:17:45 [fsasaki]
christoph: there are patterns, e.g. parent-child relations in XML and RDF properties
13:17:57 [fsasaki]
... for this you can provide high-level translation patterns
13:18:12 [philr]
clange: High level translation is possible with simple parent-child relationships
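The parent-child pattern mentioned here can be sketched as follows. This is a minimal Python illustration, not a method from the meeting. It assumes each child element name becomes a property, leaf elements become literal values, and nested elements get generated subject identifiers. The namespace http://example.org/, the function name to_triples, and the sample job posting are all hypothetical.

```python
import xml.etree.ElementTree as ET

BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def to_triples(elem, subject):
    """Turn parent-child nesting into (subject, predicate, object) tuples."""
    triples = []
    for i, child in enumerate(elem):
        predicate = BASE + child.tag
        if len(child) == 0:
            # Leaf element: its text becomes a literal value
            triples.append((subject, predicate, (child.text or "").strip()))
        else:
            # Nested element: mint an identifier and recurse into it
            obj = f"{subject}/{child.tag}{i}"
            triples.append((subject, predicate, obj))
            triples.extend(to_triples(child, obj))
    return triples


doc = ET.fromstring(
    "<job><title>Data Engineer</title>"
    "<employer><name>ACME</name></employer></job>"
)
triples = to_triples(doc, BASE + "job1")
for t in triples:
    print(t)
```

This sketch ignores attributes and any text mixed in between child elements, so it only covers the simple parent-child case.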
13:18:43 [philr]
felix: mixture of text and element nodes is challenging
13:19:19 [fsasaki]
https://github.com/fsasaki/its20-extractor/blob/master/wikipedia-extractor/its-ta-2-nif-wikipedia.xsl#L43
13:19:54 [clange]
fsasaki: handling of specific links (specific to wiki markup)
13:21:00 [fsasaki]
phil: in the FREME project we are also doing named entity recognition on plain text. our services are capable of returning Turtle files, but we can cover many formats
13:22:13 [fsasaki]
https://api-dev.freme-project.eu/ckeditor-dev/ckeditor/samples/freme.html
13:25:53 [fsasaki]
various types of output, inline or external using JSON-LD
13:27:15 [fsasaki]
action: felix to provide examples of round tripping as done in the freme project
13:27:27 [fsasaki]
topic: bdva summit
13:28:48 [philr]
felix: to collect information on what better tooling is needed
13:29:04 [philr]
...best practices and standardization
13:29:20 [philr]
...1.5 hour session on requirements
13:30:04 [philr]
clange: is there more I can do if I do not attend the summit?
13:30:25 [philr]
felix: it would be good if someone from your organization could attend
13:31:18 [philr]
...questionnaire to bdva members but want input from companies
13:31:54 [philr]
Is there a fee to join bdva?
13:32:05 [fsasaki]
felix: yes, will send info on that
13:32:19 [clange]
fsasaki 14:29: EU is not necessarily interested in new standards being developed, but in existing standards to be _applied_ in a better way
13:32:29 [fsasaki]
thanks, clange
13:35:26 [fsasaki]
discussion on AutomationML use case
13:40:25 [fsasaki]
felix will send around further info on BDVA
13:40:28 [fsasaki]
topic: AOB
13:40:37 [fsasaki]
next meeting 9th of December
13:40:50 [fsasaki]
phil cannot make it, christian to chair
13:41:04 [fsasaki]
rrsagent, draft minutes
13:41:04 [RRSAgent]
I have made the request to generate http://www.w3.org/2016/11/25-rax-minutes.html fsasaki
13:48:38 [fsasaki]
fsasaki has left #rax