W3C

– DRAFT –
Weekly DXWGDCAT

06 March 2019

Meeting minutes

<PWinstanley> proposed: approve minutes https://‌www.w3.org/‌2019/‌02/‌27-dxwgdcat-minutes

Approve minutes from last meetings: https://‌www.w3.org/‌2019/‌02/‌27-dxwgdcat-minutes

<riccardoAlbertoni> +1

<alejandra> +1

<SimonCox> +1

<Makx> +1

<PWinstanley> 0 (not there)

Resolved: approve minutes https://‌www.w3.org/‌2019/‌02/‌27-dxwgdcat-minutes

Publication schedule

<alejandra> https://‌docs.google.com/‌spreadsheets/‌d/‌17Du2hZzIxejX7MT6MlmZOIdMKFonrLRUUAn1v5XJwt4/‌edit#gid=0

<PWinstanley> alejandra: as discussed in previous meetings, the target is mid-March. There are many editorial issues to resolve, e.g. how we display properties, but there are also many other issues

<riccardoAlbertoni> I need access

<PWinstanley> ... Dave B made a spreadsheet. we have progressed

<alejandra> https://‌docs.google.com/‌spreadsheets/‌d/‌17Du2hZzIxejX7MT6MlmZOIdMKFonrLRUUAn1v5XJwt4/‌edit#gid=0

<riccardoAlbertoni> now I can see it

<PWinstanley> alejandra: the spreadsheet requires analysis to see what really needs to be done

<PWinstanley> ... e.g. fundingSource was addressed by wasGeneratedBy,

<PWinstanley> ... but it would be nice to see how to record a funding source, so we need an example

<PWinstanley> ... because the requirement was couched in these terms
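
A minimal sketch of the kind of funding-source example asked for here, assuming the funder is attached to the generating activity with prov:wasAssociatedWith. That property choice is an assumption for illustration, not something the group agreed. All http://example.org/ IRIs and the helper names are hypothetical.

```python
# Real vocabulary namespaces.
DCAT = "http://www.w3.org/ns/dcat#"
PROV = "http://www.w3.org/ns/prov#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical namespace for the example resources.
EX = "http://example.org/"


def funding_triples(dataset, project, funder):
    """Link a dataset to a funder through the activity that generated it."""
    return [
        (dataset, RDF + "type", DCAT + "Dataset"),
        (dataset, PROV + "wasGeneratedBy", project),
        (project, RDF + "type", PROV + "Activity"),
        # Assumption: the funder is modelled as an agent associated with the project.
        (project, PROV + "wasAssociatedWith", funder),
        (funder, RDF + "type", PROV + "Agent"),
    ]


def to_ntriples(triples):
    """Serialise IRI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = funding_triples(EX + "dataset/001", EX + "project/p1", EX + "org/funder")
print(to_ntriples(triples))
```

A consumer looking for the funding source would follow prov:wasGeneratedBy from the dataset to the project, then read the agents associated with it.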

<PWinstanley> alejandra: there is one issue about profile definition.

<PWinstanley> ... Perhaps not so fundamental - not a priority

<PWinstanley> ... we haven't done dataset publications

<riccardoAlbertoni> +1 not to consider issue 72 as a priority

<alejandra> https://docs.google.com/spreadsheets/d/17Du2hZzIxejX7MT6MlmZOIdMKFonrLRUUAn1v5XJwt4/edit#gid=0

<PWinstanley> alejandra: dataset publications - I drafted something, but it is incomplete.

<PWinstanley> riccardoAlbertoni: the quality example; I'm considering having this in the primer

<PWinstanley> AndreaPerego: the publications - scientific papers or reports about the datasets - I can provide examples

<PWinstanley> ... we have not included in the spec a way to point to a publication

<alejandra> https://‌github.com/‌w3c/‌dxwg/‌pull/‌803

<PWinstanley> alejandra: yes, but there is an incomplete PR I submitted today for this - perhaps we can discuss later

<PWinstanley> AndreaPerego: I contributed a use case on how this is done

<alejandra> https://‌github.com/‌w3c/‌dxwg/‌issues/‌63

<PWinstanley> alejandra: the data quality model is completed, but the example is designated as incomplete (yellow)

<PWinstanley> alejandra: project context - been partially dealt with - qualified relations

<PWinstanley> SimonCox: I haven't done any more work because we're busy with other things - it is not essential, but perhaps needs to be in a WG note. no action needed at the moment

<PWinstanley> alejandra: I want to agree today what we are not considering for the final pub. we can then see what is essential and can estimate time needed to complete

<PWinstanley> ... mapping of qualified / non-qualified forms - closed. we said that it will not be addressed

<Makx> +1. to won't fix

<PWinstanley> SimonCox: I have done a few, but not specifically from DCAT ... DC mapping to PROV

<PWinstanley> ... it is not hard, but we need to determine importance

<PWinstanley> alejandra: we had already decided that this was niche

<PWinstanley> ... #79 has been addressed

<PWinstanley> ... related datasets has already been addressed. we need examples, but that is the same for all

<PWinstanley> ... then we have schema.org - do we want it in the spec?

<PWinstanley> ... we singled out this one due to its importance

<PWinstanley> SimonCox: it is in the non-normative section. the table I included is still there, but AndreaPerego made a suggestion that some code would do it better

The SPARQL implementation of the mappings is at https://‌github.com/‌ec-jrc/‌dcat-ap-to-schema-org/‌tree/‌master/‌sparql

<alejandra> alejandra: there was a collab notebook with a SPARQL construct

<PWinstanley> SimonCox: it is perhaps better to move this into an annex

<Makx> +1 to move it out of the body

<PWinstanley> ... don't get rid of it altogether, but remove it from the core doc into an informative section

<PWinstanley> alejandra: I agree that it should be moved into a primer or another informative section. There were concerns that, because of the lax nature of schema.org, the mapping might not be accurate

<PWinstanley> AndreaPerego: I support the alejandra proposal. 2 issues: schema.org is released often and changes, so it should be clear when we do this that it is version dependent

<PWinstanley> ... and the mapping relationship using a sparql query will require reshaping the query. Perhaps the relationships should be described discursively rather than through a query

<PWinstanley> ... this may raise some discussion about equivalence.
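
A minimal sketch of a version-tagged mapping table, one way to make the version dependence explicit. The handful of correspondences below are commonly cited ones and are only illustrative. The version label is a placeholder because the minutes do not name a schema.org release, and the helper names are hypothetical.

```python
# Placeholder: the minutes do not say which schema.org release to target.
SCHEMA_ORG_VERSION = "unspecified"

# Illustrative subset of DCAT -> schema.org correspondences (prefixed names).
DCAT_TO_SDO = {
    "dcat:Catalog": "schema:DataCatalog",
    "dcat:Dataset": "schema:Dataset",
    "dcat:Distribution": "schema:DataDownload",
    "dct:title": "schema:name",
    "dct:description": "schema:description",
    "dcat:keyword": "schema:keywords",
}


def map_term(term):
    """Return (schema.org term, version label), or None if no mapping is recorded."""
    target = DCAT_TO_SDO.get(term)
    if target is None:
        return None
    return target, SCHEMA_ORG_VERSION


def map_record(record):
    """Re-key a flat metadata record, dropping properties without a mapping."""
    mapped = {}
    for prop, value in record.items():
        result = map_term(prop)
        if result is not None:
            mapped[result[0]] = value
    return mapped


print(map_record({"dct:title": "Example dataset", "dcat:theme": "ex:Environment"}))
```

Properties with no recorded counterpart are dropped rather than guessed, which keeps the question of equivalence out of the code.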

<alejandra> alejandra: an implementation is probably more useful for the community

<PWinstanley> SimonCox: I have reluctance to drop entirely. we don't have a primer, but there are timing risks.

<alejandra> what about a separate note?

<Makx> as an annex?

<PWinstanley> ... however, all the comments made earlier are valid. But we need to face the lack of appetite of developers to read docs. we need to be specific about the version. It needs to be in an annex

<riccardoAlbertoni> +1 to include in the annex, removing the middle column, specifying it is an attempt and referring to a specific scheme version

<alejandra> PWinstanley: my recollection of the discussion about not completing it was related to what AndreaPerego said about frequent changes to schema.org

<alejandra> ... the proposal was made that schema.org should do the mapping to the revised DCAT rather than vice versa

<alejandra> ... does it need to be complete?

<alejandra> ... yes, specify version

<PWinstanley> AndreaPerego: could there be a separate doc just about this - it could be a standalone note linked to from the spec.

<PWinstanley> riccardoAlbertoni: we should deal with vocab mappings in the next issue

<PWinstanley> ...are there any others that we consider important

<PWinstanley> alejandra: in earlier discussions we did recognise the importance of schema.org. there is an (empty) section on prov-o

<PWinstanley> AndreaPerego: the main issue is timing. we already have mappings from SimonCox. we need a separate doc. I think we need to revise the section. I can add the ones to DCAT-AP.

<PWinstanley> ... schema.org makes sense to me, but the main concern is to keep to schedule

<riccardoAlbertoni> +1 I agree on keeping the alignment with schema.org, and abandoning the others.

<PWinstanley> alejandra: timing is priority. schema.org is a special case due to its wide adoption. otherwise vocab mappings can remain unaddressed

<Makx> annex

<riccardoAlbertoni> +1 to annex

<SimonCox> +1 to annex

<PWinstanley> proposed: to move schema.org mapping to an annexe (or other doc)

<PWinstanley> AndreaPerego: I think a separate doc is better - it will then be clearly side work to the main specification

0

<PWinstanley> alejandra: yes, we want the spec to be long lived

<alejandra> 0

<riccardoAlbertoni> I agree that a separate doc is better but considering the time matters..

<PWinstanley> AndreaPerego: there is scope for a community group to maintain a separate doc

Resolved: to move schema.org mapping to an annex

<PWinstanley> alejandra: ways of accessing items in a dataset

<PWinstanley> AndreaPerego: there is no link in the UCR doc

<PWinstanley> alejandra: perhaps it hasn't been added

<PWinstanley> ... we will leave it for now

<PWinstanley> ... there is no associated requirement

<PWinstanley> alejandra: spatial coverage - has not yet been addressed, though there has been movement from the spatial resolution discussion (about the range of the prop)

<PWinstanley> alejandra: to me this is important and necessary

<PWinstanley> SimonCox: I was drawing attention to this issue - a perennial difficulty. should we be prescriptive, or descriptive?

<PWinstanley> AndreaPerego: the problem is, as SimonCox mentioned, the absence of a common way to specify an area - the most standard way is using geoSPARQL

<PWinstanley> ... there is a need to specify a relationship between the object and the dataset

<PWinstanley> ... in some cases you just use a representative point rather than a boundary (e.g. centroid)

<PWinstanley> ... this was discussed on the spatial data on the web group, but as no vocab was described there was minimal guidance

<alejandra> Location Core Vocabulary: https://‌www.w3.org/‌ns/‌locn

<PWinstanley> ... it is important to recommend a solution - we need to be prescriptive

<alejandra> geoSPARQL: https://‌www.opengeospatial.org/‌standards/‌geosparql

<PWinstanley> ... another important issue is that the vocabs we have at present can specify the geography, but only by a single point and by a bounding box. At the moment we cannot state what kind of geometry it is, i.e. that it is an approximation and not a real point or boundary. schema.org might provide some help, but it is based only on WGS84, and this is not enough as people need to use their own coordinate systems

<PWinstanley> ... my continuing concern with schema.org is the frequent change, and the chance that at some point some properties might not be there or be re-specified
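
A minimal sketch of the two approximations mentioned above, a bounding box and a representative point, written as WKT strings prefixed with a CRS IRI in the style of GeoSPARQL WKT literals. The sample coordinates and function names are hypothetical. The representative point here is simply the centre of the bounding box, not a true centroid.

```python
# CRS84 (longitude, latitude) is the GeoSPARQL default; other CRS IRIs can be passed in.
CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"


def bounding_box(coords):
    """Return (west, south, east, north) for a list of (x, y) pairs."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def bbox_wkt(coords, crs=CRS84):
    """Bounding box as a closed WKT polygon ring, tagged with its CRS."""
    w, s, e, n = bounding_box(coords)
    ring = [(w, s), (e, s), (e, n), (w, n), (w, s)]
    body = ", ".join(f"{x} {y}" for x, y in ring)
    return f"<{crs}> POLYGON(({body}))"


def representative_point_wkt(coords, crs=CRS84):
    """Centre of the bounding box as a WKT point, tagged with its CRS."""
    w, s, e, n = bounding_box(coords)
    return f"<{crs}> POINT({(w + e) / 2} {(s + n) / 2})"


# Hypothetical footprint given as (longitude, latitude) pairs.
footprint = [(4.0, 50.0), (6.0, 50.5), (5.5, 52.0), (4.5, 51.0)]
print(bbox_wkt(footprint))
print(representative_point_wkt(footprint))
```

Passing the CRS explicitly reflects the point that publishers need to use their own coordinate systems. Nothing in the output says whether the geometry is exact or approximate, which is the gap described above.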

<PWinstanley> alejandra: I agree that we should define our own

<Makx> me bye now

<PWinstanley> AndreaPerego: I should work with SimonCox to bring forward a proposal

<PWinstanley> +1

<SimonCox> we need to assign some actions ...

<PWinstanley> alejandra: next is #60. dataset aspects; fine-grained semantics

<PWinstanley> ... important for roba

Action: AndreaPerego to work out a proposal to address https://‌github.com/‌w3c/‌dxwg/‌issues/‌83

<PWinstanley> ... I think it is important and needs resolution

<PWinstanley> AndreaPerego: I think this is partially similar to others where we have to decide which instrument you want to use to get the data.

<SimonCox> I created the original UC, and now I think we need to push this off to profiles

<PWinstanley> ... DCAT needs to point to the instrument description, and not provide that description nor define how it should be described

<SimonCox> I don't think we have time to address this or enough detail in the UCs

<PWinstanley> ... I'm a bit leery about this - there are so many potential use cases, and we really need a generic way otherwise we will only do a partial job

<PWinstanley> SimonCox: Jaroslav was the UCR doc editor covering this but I wrote the original and it reflects concerns and requirements that are genuine, but on mature reflection can be handled elsewhere

<PWinstanley> ... let's close this with a note that there will be application profiles that will cover this requirement

<riccardoAlbertoni> I agree with simon, it seems a can of worms, are we sure we want to open it considered the time matters ?!?

<PWinstanley> +1

+1

<PWinstanley> alejandra: the stuff on versioning. riccardoAlbertoni made a PR.

<PWinstanley> alejandra: Jaroslav was preparing something

<PWinstanley> ... we need something in this version

<PWinstanley> PWinstanley: we need something, but it can quickly get complex so we need to constrain our expectations

<PWinstanley> AndreaPerego: I agree with PWinstanley

Action: SimonCox to streamline DCAT-sdo mapping and move to Annexe.

<PWinstanley> ... re: github issues, is the lifecycle of a dataset or a catalog record versioning, or not?

<PWinstanley> ... and even if this happens, the metadata will still be valid

<PWinstanley> ... usually a dataset has a lifecycle, but it can then be discontinued or withdrawn - and users need to be informed about this otherwise they will be inappropriately using data

<PWinstanley> ... it might be too late in the process to contribute this - there are more controversial issues, but if people agree that this is about versioning I can provide something

<PWinstanley> alejandra: I see it as a separate issue.

<PWinstanley> ... most of this is for the informative sections

<PWinstanley> AndreaPerego: there is official material across EU institutions. In the USA there are similar LoC labels for status / lifecycle

Provenance information

<SimonCox> This is pretty much it ... https://‌w3c.github.io/‌dxwg/‌dcat/#examples-dataset-provenance

alejandra: There are a few proposals by ncar. It seems to be also about qualified form.

<SimonCox> and https://‌w3c.github.io/‌dxwg/‌dcat/#qualified-forms

alejandra: Some of these elements have been covered.
… Do we want to specify relationship to software as well?

<SimonCox> This was nick car's thing of course, and he has been totally occupied with Profiles

<SimonCox> and https://‌w3c.github.io/‌dxwg/‌dcat/#Property:dataset_wasgeneratedby I'm basically comfortable with

<SimonCox> what we have in these sections

<SimonCox> yes - we can always use more examples, but the basic pointers are there now

AndreaPerego: In theory, using PROV, this should be possible.
… The software will be an agent.

<riccardoAlbertoni> +1 to software as an agent see prov:SoftwareAgent

alejandra: I put it as yellow, to see if it has been covered.

Publication source

alejandra: This is about referencing the original metadata with its identifier.

See also a specific issue https://‌github.com/‌w3c/‌dxwg/‌issues/‌132

<SimonCox> of course dcat:Dataset rdfs:subClassOf prov:Entity is entailed by the use of prov:wasGeneratedBy predicate - so it does not actually need to be asserted ...
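
A minimal sketch of the entailment referred to here, using the RDFS domain rule (rdfs2): PROV-O declares prov:Entity as the domain of prov:wasGeneratedBy, so any resource that carries the property is inferred to be a prov:Entity. Strictly this types individual datasets, not the dcat:Dataset class as a whole. The ex: resources and the function name are hypothetical.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_domain_types(triples):
    """Apply rule rdfs2: (p rdfs:domain C) and (s p o) entail (s rdf:type C)."""
    # Collect the declared domain classes of each property.
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    # Type every subject that uses a property with a declared domain.
    inferred = set()
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    # Report only triples that were not already asserted.
    return inferred - set(triples)


graph = [
    ("prov:wasGeneratedBy", "rdfs:domain", "prov:Entity"),  # declared in PROV-O
    ("ex:dataset1", "rdf:type", "dcat:Dataset"),            # hypothetical dataset
    ("ex:dataset1", "prov:wasGeneratedBy", "ex:project1"),  # hypothetical activity
]
print(infer_domain_types(graph))
```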

<SimonCox> Apologies folk - I have to sign off now

alejandra: How important do you think this is?

AndreaPerego: I think it is important, but still my concern is timing.
… So probably we can make it gray, and come back to it if we have time.

Usage notes.

alejandra: ncar commented that this is a provenance use case.

<alejandra> https://‌www.ebi.ac.uk/‌ols/‌ontologies/‌duo

<alejandra> Data Use Ontology

alejandra: there are a number of suggested vocabularies, but they are not W3C ones.
… There's also DUV.

<alejandra> DUV: https://‌www.w3.org/‌TR/‌vocab-duv/

riccardoAlbertoni: I'm not sure what support DUV will provide.

alejandra: Considering the little participation in that issue, I would mark it as won't fix

riccardoAlbertoni: Maybe we can check first DUV / DQV if they may provide something useful.

alejandra: So we may mark it for investigation.
… But do we have time?

riccardoAlbertoni: If the deadline is next week, it would be difficult.

<riccardoAlbertoni> put me on 86

An example http://‌data.jrc.ec.europa.eu/‌dataset/‌af5644c2-d3dd-46e3-9490-44b143fb3163

But this is about lineage (dct:provenance).

AndreaPerego: DUV is more complex than a simple property to link to a piece of narrative text. An option is to provide both examples.

riccardoAlbertoni: I partially agree, but providing different options may lead to confusion.

Dataset / catalog relation

alejandra: I think there are some implications that a dataset must be in the catalog.

riccardoAlbertoni: I don't think there is this constraint.

<alejandra> https://‌www.w3.org/‌TR/‌dcat-ucr/#ID35

alejandra: [checking the DCAT spec] Actually, it doesn't seem to be a requirement. So why did people raise this issue?

AndreaPerego: The related use case was contributed by Makx, so maybe we should ask him. I can only partially guess what it is about, and it is unclear what the proposed solution is.

alejandra: Maybe we need to rephrase a few things in DCAT.
… Maybe the current definition of catalog is misleading.

Temporal coverage

AndreaPerego: the problem is DCAT 2014 didn't provide properties to specify start/end dates, and in DCAT-AP the decision was to use schema.org. But now schema.org is no longer using them for the same purpose (now they use schema:temporalCoverage).
… To address this, we could simply define two properties dcat:startDate and dcat:endDate.
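
A minimal sketch of how the two proposed properties might be used, assuming they would sit on a dct:PeriodOfTime node reached through dct:temporal. dcat:startDate and dcat:endDate are only the names suggested in this discussion, the placement is an assumption, and the dataset IRI and function name are hypothetical. Prefix declarations are omitted, so the output is a Turtle fragment.

```python
from datetime import date


def temporal_coverage_turtle(dataset_iri, start, end):
    """Render a Turtle fragment describing a dataset's temporal coverage."""
    if end < start:
        raise ValueError("end date precedes start date")
    return (
        f"<{dataset_iri}> dct:temporal [\n"
        f"    a dct:PeriodOfTime ;\n"
        # Proposed property names from the discussion, not yet defined in DCAT.
        f'    dcat:startDate "{start.isoformat()}"^^xsd:date ;\n'
        f'    dcat:endDate "{end.isoformat()}"^^xsd:date\n'
        f"] .\n"
    )


fragment = temporal_coverage_turtle(
    "http://example.org/dataset/001", date(2018, 1, 1), date(2018, 12, 31)
)
print(fragment)
```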

alejandra: AndreaPerego, I will then ask you to make a proposal.

Action: AndreaPerego to draft a proposal to address https://github.com/w3c/dxwg/issues/85

alejandra: I think we can close now.

[meeting adjourned]

Summary of action items

  1. AndreaPerego to work out a proposal to address https://‌github.com/‌w3c/‌dxwg/‌issues/‌83
  2. SimonCox to streamline DCAT-sdo mapping and move to Annexe.
  3. AndreaPerego to draft a proposal to address https://github.com/w3c/dxwg/issues/85

Summary of resolutions

  1. approve minutes https://‌www.w3.org/‌2019/‌02/‌27-dxwgdcat-minutes
  2. to move schema.org mapping to an annex
Minutes manually created (not a transcript), formatted by Bert Bos's scribe.perl version 2.49 (2018/09/19 15:29:32), a reimplementation of David Booth's scribe.perl. See CVS log.
