12:51:03 RRSAgent has joined #rax
12:51:03 logging to http://www.w3.org/2016/11/25-rax-irc
12:51:08 meeting: rax cg
12:51:10 chair: phil
12:51:17 agenda: https://lists.w3.org/Archives/Public/public-rax/2016Nov/0008.html
12:52:00 regrets: christian, gerard, jose
12:52:14 I have made the request to generate http://www.w3.org/2016/11/25-rax-minutes.html fsasaki
12:59:30 philr has joined #rax
12:59:35 present+ philr
12:59:44 present+ felix
12:59:54 I have made the request to generate http://www.w3.org/2016/11/25-rax-minutes.html fsasaki
13:01:47 Timea_T_ has joined #rax
13:03:01 clange has joined #rax
13:04:02 present+ timea, christoph
13:05:24 topic: meeting start
13:06:20 phil: did a review of use cases this morning. not too much change, missed one that christoph added.
13:06:32 https://www.w3.org/community/rax/wiki/Draft_Material#Data_acquisition_from_job_postings_via_GATE
13:06:53 phil: thanks a lot for adding this - can you give a brief description?
13:07:05 s/this/this, christoph/
13:07:48 christoph: sure. have not yet managed to share the descriptions, I have more material, and will get it done to share this
13:08:17 ... will also add more concrete examples. Application setting is: we collect job postings in the form of plain text from the web
13:08:31 ... we do named entity recognition with GATE, and we get XML output
13:08:40 ... beginning and end of each token is annotated
13:09:03 text text text recognised entity text text
13:09:22 christoph: see above XML example. this has to be translated to RDF
13:09:22
13:09:31
13:09:41 ... start and end tags look like the above
13:09:46 ids or refs (forgot which direction) are in these start/end tags
13:10:30 christoph: we are using an XSLT-based tool I developed (trextor) to create RDF. it is quite hard
13:10:39 krextor
13:10:42 ... with XPath it is hard to select elements between start and end tags
13:11:03 ... that is a bit tricky, you need a good knowledge of XPath, the sibling axes etc.
13:11:59 ... in context of a European project, in which another partner is doing the extraction
13:12:51 phil: is this similar to Martynas' case?
13:13:03 christoph: in terms of XPath complexity, yes
13:13:42 ... general XML to RDF transformation issue?
13:13:42 https://github.com/fsasaki/its20-extractor/tree/master/wikipedia-extractor
13:13:45 felix: I've written various converters
13:14:09 ... it is always special case issues
13:14:37 ... XML has various ways to include content
13:14:58 ... special purpose handling is somewhat unavoidable
13:16:16 ... example documents with guidance would be useful
13:16:36 ... may be useful to give guidance on how to handle various cases
13:17:45 christoph: there are patterns, e.g. parent-child relations in XML and RDF properties
13:17:57 ... for this you can provide high-level translation patterns
13:18:12 clange: High level translation is possible with simple parent-child relationships
13:18:43 felix: mixture of text and element nodes is challenging
13:19:19 https://github.com/fsasaki/its20-extractor/blob/master/wikipedia-extractor/its-ta-2-nif-wikipedia.xsl#L43
13:19:54 fsasaki: handling of specific links (specific to wiki markup)
13:21:00 phil: in FREME project we are also doing named entity recognition on plain text. our services are capable of returning turtle files, but we can cover many formats
13:22:13 https://api-dev.freme-project.eu/ckeditor-dev/ckeditor/samples/freme.html
13:25:53 various types of output, inline or external using json-ld
13:27:15 action: felix to provide examples of round tripping as done in the freme project
13:27:27 topic: bdva summit
13:28:48 felix: to collect information on what better tooling is needed
13:29:04 ... best practices and standardization
13:29:20 ... 1.5 hour session on requirements
13:30:04 clange: is there more I can do if I do not attend the summit?
13:30:25 felix: it would be good if someone from your organization could attend
13:31:18 ... questionnaire to bdva members but want input from companies
13:31:54 Is there a fee to join bdva?
13:32:05 felix: yes, will send info on that
13:32:19 fsasaki 14:29: EU is not necessarily interested in new standards being developed, but in existing standards to be _applied_ in a better way
13:32:29 thanks, clange
13:35:26 discussion on automationML use case
13:40:25 felix will send around further info on BDVA
13:40:28 topic: AOB
13:40:37 next meeting 9th of December
13:40:50 phil cannot make it, christian to chair
13:41:04 rrsagent, draft minutes
13:41:04 I have made the request to generate http://www.w3.org/2016/11/25-rax-minutes.html fsasaki
13:48:38 fsasaki has left #rax
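
A minimal sketch of the selection problem christoph describes at 13:10:42, i.e. picking up the content that sits between a start tag and an end tag. The element names `start` and `end`, the attributes `id` and `ref`, and the sample document are assumptions for illustration only; the actual GATE output markup is not preserved in this log. The sketch uses Python's standard `xml.etree.ElementTree` rather than XSLT/XPath, so it is not how Krextor does it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: empty "milestone" elements mark where an entity begins and ends.
SAMPLE = (
    '<doc>text text text <start id="e1"/>recognised entity'
    '<end ref="e1"/> text text</doc>'
)


def milestone_spans(root):
    """Return {id: text} for the text between each start/end milestone pair."""
    open_spans = {}  # id -> text fragments collected since the start milestone
    closed = {}      # id -> joined text, filled in when the end milestone is seen

    def collect(text):
        # Append a text fragment to every span that is currently open.
        if text:
            for fragments in open_spans.values():
                fragments.append(text)

    def walk(elem):
        # Visit nodes in document order: element, its text, children, child tails.
        if elem.tag == "start":
            open_spans[elem.get("id")] = []
        elif elem.tag == "end":
            ref = elem.get("ref")
            if ref in open_spans:
                closed[ref] = "".join(open_spans.pop(ref))
        collect(elem.text)
        for child in elem:
            walk(child)
            collect(child.tail)

    walk(root)
    return closed


spans = milestone_spans(ET.fromstring(SAMPLE))
print(spans)
```

Walking the tree in document order avoids the sibling-axis expressions mentioned in the discussion, at the cost of leaving XSLT; each extracted span could then be turned into an RDF statement by whatever mapping the project uses.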