W3C

– DRAFT –
DXWG DCAT Subgroup

29 April 2020

Attendees

Present
alejandra, RiccardoAlbertoni, SimonCox
Regrets
PWinstanley
Chair
RiccardoAlbertoni
Scribe
alejandra

Meeting minutes

<RiccardoAlbertoni> rsagent, create minutes v2

<RiccardoAlbertoni> PROPOSED: approve last meeting minutes https://www.w3.org/2020/04/15-dxwgdcat-minutes

<RiccardoAlbertoni> +1

<SimonCox> +1

+1

<AndreaPerego> +1

Resolution: approve last meeting minutes https://www.w3.org/2020/04/15-dxwgdcat-minutes

<RiccardoAlbertoni> https://www.w3.org/2017/dxwg/wiki/Meetings:Telecon2020.04.29

agenda

https://www.w3.org/2017/dxwg/wiki/Meetings:Telecon2020.04.29

+1

<AndreaPerego> +1

<SimonCox> +1

we will follow this agenda

Mid-term plan and priorities among the issues

<RiccardoAlbertoni> https://docs.google.com/spreadsheets/d/1m3UPFKnpRN4vYhB_d60-pszyTQ6huSDtxp5Xt_A2EFc/edit#gid=0

RiccardoAlbertoni: the document lists candidate journals and criteria on impact; we should select a venue where we have the opportunity to engage people across communities. The current draft is quite technical
… there are some journals that are probably more suitable than others

<RiccardoAlbertoni> ack

RiccardoAlbertoni: we could work on another paper for a broader audience

alejandra: I am interested in having an open access article, and have a pre-print first
… good to have a technical paper first, and we can consider a broader audience later

SimonCox: I've dealt with Semantic Web Journal in the past; there is a separate channel through which papers get published as open access, but the queue is very long

RiccardoAlbertoni: similar experience of a long queue; the data quality paper took a very long time to get its revision

<RiccardoAlbertoni> 500 euro

RiccardoAlbertoni: very good journal and fee for open access is not high
… it could be a good starting point
… it is also very well-known in SW community but not sure outside the community

SimonCox: question - who is the target audience? SW audience?
… we should have an evaluation
… we have evidence of implementations but not sure how much we have in terms of metrics
… it uses the SW stack, but we didn't design it in a rigorous, axiomatized way
… another possible audience is the descriptive metadata and library community

RiccardoAlbertoni: you're right, but data quality vocabulary is not highly axiomatized either
… and got accepted in SWJ
… many similarities could make it less interesting for the journal

+1

alejandra: I agree that we need to define the audience first and it seems to me that it should be broader than SW and indeed include the library community

SimonCox: there is DLIB, but it has not been publishing in the last 3 years

alejandra: highlight aspects on cataloguing and resource, including data services

AndreaPerego: agree on a broader audience and on emphasising new features; we should also emphasise interoperability
… not sure about the library community, as they have specific requirements
… we should look for a journal that is not limited to a specific community in terms of data management

alejandra: perhaps CODATA journal and the Data Science Journal, as there have been special issues around FAIR data... we should continue writing if we agree on technical paper + audience broader than SW

AndreaPerego: we could consider Scientific Data, and we need to take into account that the most used standards are DataCite and schema.org

RiccardoAlbertoni: we need to clarify the contribution, and the writing of the paper could clarify these points

AndreaPerego: my personal view, looking at how DCAT has been used, is that DataCite was designed to support a standard way of citing research data, whereas DCAT is not bound to a specific community: it can be used for scientific data and public sector information, and so has a broader application

AndreaPerego: we can emphasise this broader application

RiccardoAlbertoni: one concern is schema.org as there are big players behind this metadata schema

AndreaPerego: on the schema.org topic, we can argue that it was designed to optimise and improve the indexing of dataset pages by search engines
… DCAT, like any other standard, is not meant to replace the others, but has different applications
… as with other standards, the aim of DCAT is to make data interoperable
… linking resources in an effective and actionable way is something that is missing

RiccardoAlbertoni: we could follow what alejandra was suggesting: not decide now and keep writing. Impact is important for me
… we can discuss this later

https://www.nature.com/sdata/about/oa

Open issues

<RiccardoAlbertoni> Scope of DCAT - datasets, or digital descriptions https://github.com/w3c/dxwg/issues/1235

RiccardoAlbertoni: discussion around DCAT not being just about data and providing patterns for cataloguing
… not sure if we are ready to push for a general cataloguing vocabulary
… provide examples, collect implementations
… do it as a kind of process in which we don't decide now to change the name, but suggest these inclusions in the next PWD

alejandra: +1 to Riccardo but I think it was useful to start the discussion, we can think about the acronym later but it is important to consider the broader cataloguing patterns beyond data

SimonCox: yes, the issue was to indicate to the community that we are considering this but I wasn't expecting to resolve this issue soon

<SimonCox> Do we need a resolution now?

alejandra: does this need to be resolved now? we can leave the issue open

RiccardoAlbertoni: I'd like to give the signal that we won't change the acronym now

<SimonCox> Proposed: Consideration of the scope of DCAT - just data or also other kinds of resources - is important but cannot be resolved now. We should proceed to develop examples and use-cases, then we can later consider whether the vocabulary needs re-naming or not.

+1

<RiccardoAlbertoni> +1

<SimonCox> +1

<AndreaPerego_> +1

RiccardoAlbertoni: is there any volunteer to check where we should emphasise broader catalogues and provide a PR?

Resolution: Consideration of the scope of DCAT - just data or also other kinds of resources - is important but cannot be resolved now. We should proceed to develop examples and use-cases, then we can later consider whether the vocabulary needs re-naming or not.

RiccardoAlbertoni: otherwise we go to next point

SimonCox: I think that at this stage we just need to add a note in the issue referring to this resolution

<RiccardoAlbertoni> is a software solution a dcat:Dataset ? https://github.com/w3c/dxwg/issues/1221

alejandra: +1, I don't think changes related to this are urgent, we should focus on new features or addressing other feedback

<AndreaPerego> RRSAgent: draft minutes v2

RiccardoAlbertoni: issue 1221 - discussion has reached a conclusion
… PR emphasises that the dataset definition is broad

<RiccardoAlbertoni> https://github.com/w3c/dxwg/issues/1195

This is the PR: https://github.com/w3c/dxwg/pull/1226

<AndreaPerego> +1 for me to merge this PR

the PR was approved

<RiccardoAlbertoni> https://github.com/w3c/dxwg/issues/1195

<AndreaPerego> RRSAgent: draft minutes v2

<RiccardoAlbertoni> The notion of dataset in DCAT is broad and inclusive, with the intention of accommodating resource types arising from all communities. If someone considers something a dataset, it is a dataset for DCAT. We refrain from restricting the definition of DCAT dataset to specific communities. Our responsibility, on the contrary, is to make the definition general enough to accommodate any definition of datasets that arise by quite distinct an[CUT]

* I haven't and was wondering the same thing

RiccardoAlbertoni: the issue emphasises the different notions of dataset

AndreaPerego: we're trying to set up a common approach for data documentation
… there was an initial period where we were trying to propose a definition, and every time it didn't fit the definitions of specific disciplines
… so we went for the option of saying "you are the expert"
… if you look at the Google Dataset Search documentation, they give a list of things and then "anything else that looks like a dataset to you"

the feedback points to https://github.com/heidivanparys/discussion_paper_dataset/releases/tag/v20200306

AndreaPerego: the notion of data is also tricky
… structured data depends very much on your background
… what is a unit?

<SimonCox> In particular when we think of humanities!

Action: alejandra to look at https://github.com/w3c/dxwg/issues/1195

<trackbot> Created ACTION-421 - Look at https://github.com/w3c/dxwg/issues/1195 [on Alejandra Gonzalez Beltran - due 2020-05-06].

RiccardoAlbertoni: are we meeting in two weeks again?

<RiccardoAlbertoni> May 13th

<AndreaPerego> Fine with me.

+1

Summary of action items

  1. alejandra to look at https://github.com/w3c/dxwg/issues/1195

Summary of resolutions

  1. approve last meeting minutes https://www.w3.org/2020/04/15-dxwgdcat-minutes
  2. Consideration of the scope of DCAT - just data or also other kinds of resources - is important but cannot be resolved now. We should proceed to develop examples and use-cases, then we can later consider whether the vocabulary needs re-naming or not.
Minutes manually created (not a transcript), formatted by scribe.perl version 114 (Tue Mar 17 13:45:45 2020 UTC).

Diagnostics

Maybe present: AndreaPerego