See also: IRC log
<phila> scribe: phila
<deirdrelee> i'll take .a
<deirdrelee> http://www.w3.org/TR/dwbp/
deirdrelee: We've just published our FPWD of our best practices doc
<deirdrelee> http://www.w3.org/TR/dwbp-ucr/
deirdrelee: and an updated UCR
doc
... Still very much work in progress but we thought it was
important to try and get feedback
... designed a template and then tried to focus on specific
areas
... so we're open to improvements and comments
... also looking at the vocabs in the charter
... one on quality and granularity, and one on usage
... looking at each of those
... how do they differ, what does the UCR set tell us
... discussions centred around building on DCAT
... we met at TPAC. Next f2f is April 13-14 collocated with
ApacheCon
JeniT: Things progressing well with CSVW. Productive f2f recently
... got resolutions to all issues open in GitHub
... since then working on editors enacting the resolutions
<ivan> (we have cca. 300 issues in the github tracker:-)
JeniT: Tabular data model,
processing tab data, metadata etc.
... document for syntax for what you can say about tabular
data
... 2 other specs a little further behind on conversion to JSON
or RDF
... using GH well to collaborate between editors
... small number of people so we're able to agree quickly
... aim is to get new drafts out by end of March
... and then to move rapidly into LCCR
... and having implementations
... one in the bag already but obviously want to encourage
more
... heading for completion in August. All well except for low
levels of participation
<raphael> What are the existing implementations or planned ones?
<JeniT> Gregg Kellogg has been working on a Ruby one
raphael: Just wondering about the other implementations
... what are those?
JeniT: Gregg has one as part of
his systems
... does parsing and converting (written in Ruby)
... some promises too.
... plans to do one in Python and plan is to expand the ODI CSV
Lint Tool
... to use the CSVW spec for validation
<raphael> CSVlint: http://csvlint.io/
JeniT: that gregg may not implement
ivan: Also one coming from Pacific researcher
<davidwood> Has anyone spoken to RPI? John Erickson should have an eye on an implementation.
JeniT: I didn't include Ivan's implementation
ivan: There's Bill as well
<JeniT> davidwood: nope
raphael: Any attempt to get plug ins for Excel etc?
JeniT: Tried to design it so that
the model is an abstraction on top of specific formats
... we have a non-normative section on parsing
<davidwood> I'll ping him
JeniT: leave open for tooling to handle other types of table
... and then process into tabular data
... all we're defining about how to read in the data is the
non-norm stuff around CSV
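The abstraction JeniT describes — an annotated tabular data model that specific formats such as CSV are parsed into — can be sketched roughly as follows. This is illustrative only: the function and field names are mine, and the real CSVW model carries far richer annotations (datatypes, titles, dialect descriptions).

```python
import csv
import io

def parse_table(text, delimiter=","):
    """Parse delimited text into a minimal 'annotated table':
    column names taken from the header row, plus a list of row dicts.
    (A sketch of the idea only; not the CSVW tabular data model itself.)"""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = list(reader)
    columns = rows[0]
    return {
        "columns": columns,
        "rows": [dict(zip(columns, r)) for r in rows[1:]],
    }

table = parse_table("city,population\nGalway,80000\nBristol,460000\n")
print(table["columns"])
print(table["rows"][0]["city"])
```

Because the model is an abstraction, a tool that reads Excel or TSV could produce the same structure and then hand it to the same downstream processing — which is the point JeniT makes about leaving the door open for other table types.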
raphael: I think the ATOS friends
in charge of the DataLift platform can already read CSV and
Excel
... they can do that in a systematic manner
... and we have other tools. They were looking at what was
going on in the CSVW
... waiting for a freeze before doing any implementation
... sounds like we're nearly there so maybe they should start
soon?
JeniT: Once next versions are
published (end March)
... that would be the time to reach implementers
Soeren: I'll need to catch up with people on this. Triplification for Publicdata.eu
<raphael> Datalift platform: http://datalift.org/
Soeren: Ivan ??? did the triplification work
phila: It's Capgemini plus FOKUS plus a little bit of ODI
<JeniT> thanks all :)
Arnaud: On LDP
... LDP is now a Rec. It feels like old news. The WG reached
that point in December but it took a while to get to the
official announcement
... the WG itself...
... we requested an extension to end July 2015
... plan was to complete documents, both in CR - Paging and
PATCH
... initially Paging was part of the LDP spec
... separated it out to allow progress with LDP spec
... problem now is that the spec has evolved and people are not
so interested in implementing Paging
... so my feeling is that we won't reach CR Exit criteria for
paging
... IBM has implemented it and we have some people saying yes
eventually... but
... I was asking what it would take to generate more
interest.
... Problem is that it doesn't do enough. Sandro was interested
but in the end realised it wasn't quite enough for what he
wanted to do
... Oracle did something similar
... Oracle want to be able to do more querying and filtering so
it's not so interesting to them
... we have some functions but not enough. Trying to add what
it would take is a lot of work so I'm not sure it will be done.
Probably going to be parked as a Note
... then there's LD Patch
... use with HTTP PATCH verb
... to update a resource without replacing whole resource
... now in CR
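The LD Patch mechanism Arnaud describes sends a small patch document with the HTTP PATCH verb instead of replacing the whole resource with PUT. A sketch, based on the LD Patch draft's `text/ldpatch` format (the resource URI and property values here are illustrative, not from the spec):

```
PATCH /resource HTTP/1.1
Host: example.org
Content-Type: text/ldpatch

@prefix dcterms: <http://purl.org/dc/terms/> .
Delete { <#> dcterms:title "Old title" } .
Add    { <#> dcterms:title "New title" } .
```

Only the named triples change; the rest of the resource's graph is left untouched on the server.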
... Other than that, the WG has been wondering what to do next
for a while.
... Plan has been to have a workshop. Not clear how much
interest there is
... I mentioned our wishlist before. But should we jump in to
working on that or give people a pause and see what
happens
... been working on the workshop for months. Now looking at
collocating with annotations WG in the Bay Area
... now too late for official workshop
... so may do a mini/unofficial workshop
<davidwood> LDP public meeting announcement: https://www.w3.org/wiki/Main_Page/Linked_Data_2015
Arnaud: I can't take the lead on it. Maybe Ashok will
phila: So how big would that be?
<Arnaud> https://www.w3.org/wiki/Main_Page/Linked_Data_2015
Arnaud: We'd invite people and hope to attract people
<raphael> scribenick: raphael
Ivan: I'm also participating in
the Open Annotation WG
... the OA datamodel is an RDF model which has a JSON-LD
syntax
... there is a need to store and exchange those
annotations
... thus a protocol, and one option is LDP
... the Social Web WG has a similar requirement and they also
list LDP and other candidates (related to activity
streams)
... there will be a Social Web WG meeting in Boston soon with
some representative of the Open Annotation WG
... there will be iAnnotate conference in April where the Open
Annotation WG will meet, with some people from the Social Web
WG and LDP
Arnaud: LDP can also be used for
people who are RDF-centric
... POST-ing things in a graph
... to some extent, serving requirements that the Open
Annotation WG has
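The RDF-centric usage Arnaud mentions — POSTing a graph to a container — would, for the Open Annotation case, look roughly like this under the LDP spec's container rules (the container path, `Slug`, and annotation body here are my illustration, not from either WG's documents):

```
POST /annotations/ HTTP/1.1
Host: example.org
Content-Type: text/turtle
Slug: anno1

@prefix oa: <http://www.w3.org/ns/oa#> .
<> a oa:Annotation ;
   oa:hasTarget <http://example.org/page1> .
```

A conforming LDP server responds `201 Created` with a `Location` header for the new annotation resource, which is the protocol behaviour the OA and Social Web groups are evaluating LDP for.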
Phil: any liaison between LDP and
the CSV group?
... it is unclear what we will do next, looking for your
input
Soeren: what we need is more
collaborative ways to build and share vocabularies
... we use github as a tool to edit vocabularies, this has some
limitations
<Soeren> https://zenodo.org/record/15023#.VPcRzPnF_UU
<Soeren> https://github.com/vocol/vocol
Soeren: VoCol uses Git for versioning
Soeren: Looking at vocabs for connecting manufacturing robots and sensors etc
... we want to do more LD in manufacturing
... vocabs either absent or superficial
... very time consuming
... so vocabs are important from my perspective
raphael: On similar lines to Soeren
http://boris.villazon.terrazas.name/projects/prolov/index.html
raphael: New plugin for Protege and looking up LOV
... still under development, showing usefulness of the plugin
... Things like Smart Cities, when you discuss this, people wish to have some sort of Best practices, what are the vocabs we should use etc.
... they know all those vocabs exist but they wish to have a grouping of concrete examples
... this is what I see
Soeren: People may find things they want to use but they want to use it differently.
... and many vocabs are published and then abandoned
... with GH you can fork, merge and branch and we need something like this
... we are looking at using GH for that
<deirdrelee> +1 this would be interesting to allow people to extend/merge vocabs while still ensuring they are compliant with the original vocabs
<JeniT> we had the same issue in CSV WG about which metadata terms to use
phila: Problem is that people want BPs, but things like DWBP won't be so prescriptive
... we're trying to be tech neutral
... in the last meeting we talked about how standards are fluid. You want adoption but standards and techs evolve
... as a WG we should perhaps address this, but perhaps with some softer recommendations
phila: So how did you handle this balance?
JeniT: We ducked it in CSVW
... we didn't want to define which bits of DC to use which are also duplicated in schema.org
... and hoped that the DWBP might give us some recommendations like schema.org is right for X but dct is better for Y, or use multiple vocabs
... it's not in scope for CSVW
deirdrelee: We're going to be biased towards W3C standards, so are we going to recommend schema.org when there's a W3C Rec doing the same thing?
... when it comes to general topics like the Web, OK, we can talk about those, but in domains like health and transport we don't have the expertise to make pronouncements
JeniT: Reflecting on where this has a practical effect. Our OD certificates... we tried to automatically pull in some info about the dataset that people are registering
... in doing that we wanted to be pulling in the keywords for a dataset, or a description
... we currently support a handful of ways of doing that, but it's a pain that there are so many ways and we don't know which is the one we should be focussing on from an implementation perspective
... I get the point, Deirdre, re domain expertise, but in terms of metadata, a bit of description *could* help to rationalise the metadata a bit
phila: Any comment Tom?
tbaker: Something that DanBri has been advocating is that you simply say both
... they can be mapped at a term level but they can also be mapped at the metadata level by providing both for certain key properties
... so maybe that's what we need to suggest
ivan: That's a very pragmatic solution but it's not nice
... in the CSV WG... You put e.g. title and you can use dcterms:title and you can repeat it in schema which is a pain
... I know what to do, but it feels wrong
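The "pragmatic but not nice" workaround Ivan describes — repeating the same value under both vocabularies — would look something like this in a CSVW metadata file, using common properties (the file name and title are illustrative):

```json
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "example.csv",
  "dc:title": "Example dataset",
  "schema:name": "Example dataset"
}
```

Both properties carry the same string, so any consumer that understands either Dublin Core or schema.org finds a title — at the cost of the duplication Ivan objects to.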
tbaker: The alternative is to put more energy into mapping
davidwood: I admit... when I look through the CSV spec, it's useful for the use cases. I hope it's successful. All the times I've faced turning CSV into RDF it's been very complex
... it's not always handleable in a simple way. RPI showed this
... I have concerns about the approach
ivan: Yes, David, I realise that
... we struggled early in the group to choose the level of complexity that we would handle
... we were looking at a templating language (Mustache)
... work going on in Ghent (iMinds) adapting R2RML
... we had to realise that it was unrealistic to specify that within 3 years
... so now we have a simple mapping. Not quite as simple as R2RML. Covers a large number of our use cases and there's an escape mechanism where people can plug in extra tools
... those tools may work on the JSON mapping or the table or whatever
... maybe once this is done there might be a second round
... we started a Community group to look at more complex cases but the CG has yet to take off.
... No accepted approach to handle all the cases. Maybe the market decides on the best approach
<davidwood> I take the point, Ivan
... or a complex thing that may or may not get implemented
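The "simple mapping" Ivan contrasts with R2RML-style approaches can be sketched as a naive row-to-triples conversion: one subject per row, one predicate per column. All the URIs below are mine, and this deliberately omits everything the real CSVW csv2rdf conversion adds (datatypes, templates, the escape mechanism for plugging in extra tools).

```python
import csv
import io

def csv_to_triples(text, base="http://example.org/"):
    """Naive CSV-to-RDF sketch: mint a subject URI per data row and
    a predicate URI per column, emitting (subject, predicate, value)
    tuples. Far simpler than the CSVW conversion, but shows the shape
    of a 'simple mapping'."""
    rows = list(csv.reader(io.StringIO(text)))
    header = rows[0]
    triples = []
    for i, row in enumerate(rows[1:]):
        subject = f"{base}row/{i}"
        for col, value in zip(header, row):
            triples.append((subject, base + col, value))
    return triples

for t in csv_to_triples("name,city\nAlice,Galway\n"):
    print(t)
```

Complex cases — multiple subjects per row, joins across files, conditional mappings — are exactly what this shape cannot express, which is why the group parked them for a possible second round or the Community Group.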