W3C

– DRAFT –
Dataset Exchange Working Group Teleconference

16 September 2020

Attendees

Present
alejandra, AndreaPerego, PWinstanley, RiccardoAlbertoni, SimonCox__
Regrets
-
Chair
RiccardoAlbertoni
Scribe
PWinstanley

Meeting minutes

<RiccardoAlbertoni> PROPOSED: approve last meeting minutes https://www.w3.org/2020/07/15-dxwgdcat-minutes

<AndreaPerego> +1

<alejandra> +1

<RiccardoAlbertoni> +1

<PWinstanley> +1

<SimonCox__> +1

Resolution: approve last meeting minutes https://www.w3.org/2020/07/15-dxwgdcat-minutes

<RiccardoAlbertoni> https://www.w3.org/2017/dxwg/wiki/Meetings:Telecon2020.09.16

approving agenda

alejandra: as some people are unaware of how we handle research data, the paper needs to be completed

RiccardoAlbertoni: we will deal with this at the end of the meeting

status of work on versioning

<RiccardoAlbertoni> DCAT Sprint: Versioning available https://github.com/w3c/dxwg/projects/9#card-17387151

RiccardoAlbertoni: a first issue to be decided is how we are going to include this in the new draft. Do we want it in the recommendation, or in some side note or primer? My view is that it should be in the recommendation - are we all on the same page?

<alejandra> https://www.w3.org/TR/vocab-dcat-2/#qualified-relationship

<alejandra> https://www.w3.org/TR/vocab-dcat-2/#dataset-versions

alejandra: I agree. Revisiting what we have done, the minimal section on versioning and the discussion in the qualifiedRelation part need to be merged with the new points, because the versioning section is minimal at the moment and needs examples

AndreaPerego: I am not against adding it in this way, but it isn't clear how much text we would need. Based on the preparatory work there is a lot to be said, and if we add all of it to the recommendation it will be unbalanced. An option is to have the full text in a separate doc and a summary in the recommendation

<alejandra> I agree that we can start putting more descriptions in a primer

<SimonCox__> https://www.rd-alliance.org/plenaries/rda-16th-plenary-meeting-costa-rica-virtual/future-data-versioning-ig-principles

<alejandra> https://doi.org/10.15497/RDA00042

SimonCox__: There is some work ongoing in RDA on versioning - I'll add the link
… I reviewed a paper coming out of this work a few weeks ago, but it didn't come up with strong recommendations. An earlier piece on databases and snapshots appeared to give the final word, but on examination it was mainly for DBMS etc.
… RDA etc have a bunch of use cases that would be helpful
… They also define requirements.

alejandra: they refer to our use cases as well

RiccardoAlbertoni: I've seen an RDA doc with desiderata of versioning. Most of the guidelines they were looking for seemed out of our scope as we are defining a vocabulary

<alejandra> also, as a reminder, this is a diagram that Jaroslav had done: https://www.w3.org/2017/dxwg/wiki/General_versioning_considerations

AndreaPerego: RDA also analyse the use of FRBR for versioning

<SimonCox__> I've found my copy of the paper. Submitted to DSJ.

AndreaPerego: this is a missing piece from our comparative analysis

<SimonCox__> It tried hard to align everything with FRBR, which I found a bit forced.

AndreaPerego: It might be worth more effort

<SimonCox__> And they also used the term 'versioning' to refer to many relationships between artefacts, not just revision

alejandra: bringing attention also to Jaroslav's diagrams in earlier work
… The RDA doc references this

<SimonCox__> Under the heading 'versioning' I would limit it to (a) datasets (b) revisions/updates/fixes

<SimonCox__> ... service versioning is a difficult topic ...

<alejandra> instead of datasets, should it be resources?

RiccardoAlbertoni: we need to include versioning in the recommendation, but keep the larger doc separate. I will draft the chunk for the recommendation

AndreaPerego: I looked into FRBR and found something that might be relevant, but it needs discussion with others to confirm

RiccardoAlbertoni: can we have another issue in github to cover this?

AndreaPerego: I will add one - there is already the grid that can be expanded, perhaps

RiccardoAlbertoni: when we prepared the synoptic tables my impression was that it would be a challenge to align FRBR.

AndreaPerego: I'm not saying we *must* include, but we need to review

Action: RiccardoAlbertoni to draft the addition to the recommendation

<trackbot> Created ACTION-430 - Draft the addition to the recommendation [on Riccardo Albertoni - due 2020-09-23].

Action: AndreaPerego to open issue in gh on FRBR

<trackbot> Created ACTION-431 - Open issue in gh on frbr [on Andrea Perego - due 2020-09-23].

RiccardoAlbertoni: in the solution we have agreed so far there are terms from the owl and adms namespaces, so the normative terms should be in the normative part whilst the adms terms should be in the section alejandra pointed to. Half of the solution will be normative and half will not. Could we include adms as a normative part of the standard?

alejandra: I don't have an opinion, but we need to analyse the impact
… Do we need to update other sections?

RiccardoAlbertoni: if we use terms coming from a non-normative vocab then we cannot make it normative, but perhaps adms is an exception because it is currently widely used

AndreaPerego: we need to check with PLH, but I don't think that it will break any rules

<SimonCox__> Is PAV still on the table as well?

AndreaPerego: we might end up adding new terms for versioning. In the guidelines there are a few properties that are not in the spec. We have yet to decide whether we are going to add new terms
… or only put them in the guidance/primer

RiccardoAlbertoni: we need to try to get a normative solution

AndreaPerego: we will need implementation evidence

RiccardoAlbertoni: there is plenty of use already for adms

DCAT as a lingua franca on versions?

<RiccardoAlbertoni> https://docs.google.com/spreadsheets/d/1kOp810ep3gQ2iezVXH-abX2q2QubqxNmyJ2bcX6WAFw/edit?usp=sharing

RiccardoAlbertoni: the idea was to list different solutions for versioning
… should we try to provide a solution that embeds all approaches to versioning in DCAT, or do domains extend to cover the aspects they need?
… does it make sense to go there, or not?

<SimonCox__> looks like a combination of PAV+ADMS gives almost all green?

<alejandra> yes, it does Simon

PWinstanley: mention of 'lingua franca' - we need to ensure that effort goes into getting DCAT more widely used

alejandra: yes, we need to push for more uptake, but also to continue our development
… we need to address the most important use cases, perhaps not all of them (though the complete analysis would be useful information to socialise)

<RiccardoAlbertoni> alignment

SimonCox__: versioning had a placeholder left and it needs to be finished. Some examples and testing against user stories also need to be completed

RiccardoAlbertoni: we need to be practical - it is difficult to get people who have been using one metadata standard to change unless there is a big driver/big benefit

<SimonCox__> absence of DCAT tools and software support is a big problem for us

<SimonCox__> DCAT-in-CKAN is the main one that we can point at. And it is definitely not 'native' there.

RiccardoAlbertoni: we need to consider that we are providing terms that are already in other vocabularies and are readily integrated

AndreaPerego: tooling support (as schema.org has with e.g. drupal) is important to facilitate adoption. In addition to improving tooling support we could show DCAT to be an interchange format.

<RiccardoAlbertoni> but you lose information

<SimonCox__> One of the times I was talking with DanBri, he made it pretty clear that he sees DCAT as a bit of a laboratory to kick the tyres of design ideas.

<SimonCox__> But then I think there is some onus on us to take the results of that tyre-kicking into the schema.org forums

<SimonCox__> We are not in a strong position here, wrt the market. Do any of you guys monitor the schema.org lists and GitHub? Discounting the robot, there are typically 10-20 interactions per day there, of various levels of sophistication.

AndreaPerego: the point of DCAT is that it can be used as the basis of application profiles

RiccardoAlbertoni: and this means that the interoperability will only relate to the core - hence some information loss

<SimonCox__> I think we should not look at it as a 'competition' - we have already lost that IMHO.

<SimonCox__> (W3C's version of the RDF/Semantic web in general has pretty much lost the public competition already. There is a lot of utilisation behind the scenes, which is good, but not so much mind-share in public.)

<SimonCox__> The other project which has a lot of energy is Wikidata

<AndreaPerego> Issue for ACTION 431: https://github.com/w3c/dxwg/issues/1251

<AndreaPerego> close action-431

<trackbot> Closed action-431.

<SimonCox__> To advance the versioning topic, we need some strawman proposals and examples. I know I said I would lead this but clearly I am too busy, so I need to step aside (I think you noticed this already).

RiccardoAlbertoni: which are the distinctive features of DCAT we want to promote? This is being attempted in the paper we are writing.

Summary of action items

  1. RiccardoAlbertoni to draft the addition to the recommendation
  2. AndreaPerego to open issue in gh on FRBR

Summary of resolutions

  1. approve last meeting minutes https://www.w3.org/2020/07/15-dxwgdcat-minutes
Minutes manually created (not a transcript), formatted by scribe.perl version 123 (Tue Sep 1 21:19:13 2020 UTC).
