W3C

– DRAFT –
Data Exchange WG TPAC face to face

09 November 2017

Meeting Minutes

<Makx> ok now

<RiccardoAlbertoni> yes

<kcoyle> https://www.w3.org/2017/dxwg/wiki/Meetings:F2F2017.11.09

<PWinstanley> kcoyle: we'll do this as a meeting following the agenda and cover minutes of last meeting next time

<PWinstanley> ...Looking at the agenda we start with access, linking, and data info

<PWinstanley> ...there will be a break after

<PWinstanley> ...We start with "Access"

<PWinstanley> ...there is only one requirement, but it is complex, so we split into 2 parts

<kcoyle> 6.21 Provide a way to specify access restrictions for both a dataset and a distribution.

<PWinstanley> Makx: background - this came up in the development of DCAT-AP. There is information on licenses etc., but on the data you cannot filter

<PWinstanley> ...in DCAT-AP we added dct:accessRights for filtering

<PWinstanley> kcoyle: is this a request for specific access request?

<PWinstanley> Makx: it's just some place to put it

<PWinstanley> DaveBrowning: the specific example goes back to a summary level of access. Is it a fully blown ODRL?

<PWinstanley> kcoyle: we are talking about a property, which could be a URI, so that people can use it for whatever their needs are

<PWinstanley> ...The second part: do we actually need it, or by passing the first part have we completed both requirements? Makx, this seems to be an addition

<PWinstanley> Makx: one profile in Norway required a mechanism to describe why something wasn't open data - counter to a government policy of open by default. I don't know if it is the same property, or something different

<Zakim> LarsG, you wanted to ask about difference between access restriction an openness

<PWinstanley> ... it depends on the solution, whether it can take other values that specify reasons. The Norwegian idea was to express two things about a data set

<PWinstanley> LarsG: difference between access restriction and openness. They are related but not the same

<PWinstanley> ... We have to beware mixing things up here

<PWinstanley> kcoyle: so you think there could be a property for access restriction and another for license?

<PWinstanley> LarsG: yes

<PWinstanley> Makx: I think LarsG reads too much into the wording; access restriction gives you the specification of the restrictions, but in this case the wording should be to allow simple expression of access, it is not a replacement for ODRL

<PWinstanley> LarsG: could we call it usage constraints?

<PWinstanley> Makx: or 'levels of openness'?

<PWinstanley> kcoyle: can we conclude that we understand it properly?

<PWinstanley> annette_g: it sounds like the licensing and the access restrictions are two separate things

<PWinstanley> antoine: I agree with annette_g; try to avoid merging things. The second part - the meaning/context/rationale for the restriction - could be a third thing

<PWinstanley> ... there could be a technical solution that dealt with all at once, but it is better to keep them separated

<PWinstanley> kcoyle: we need two - one meeting the first sentence, and another relating to license terms

<Makx> can't hear

<PWinstanley> DaveBrowning: there is already a reference to license at the data set level

<Makx> queue please

<PWinstanley> kcoyle: 6.21 becomes two requirements; one referring to the first sentence, and another referring to licensing terms

<PWinstanley> Makx: I want to separate this from licensing, because people understand that licensing is completely different. In the CKAN world people set licenses to datasets and not distributions, but here we need to indicate the openness of the data in a way that is separate from licenses

<RiccardoAlbertoni> +1 to Makx

<PWinstanley> annette_g: is there anything about the current approach for licensing in DCAT that doesn't meet people's needs?

<PWinstanley> Makx: there is no current mechanism in DCAT to say something about openness

<Zakim> LarsG, you wanted to say that different distributions can have different licenses

<PWinstanley> kcoyle: if we are talking about DCAT 1.1, and DCAT 1.0 already has something for licensing then we add

<PWinstanley> LarsG: isn't this something of a transitive closure? Some data can be open and some can be closed from the same dataset

<PWinstanley> kcoyle: can't DCAT describe this for a separate distribution? At what level can DCAT describe a distribution, and is it a point where the property for access restriction and license terms can be added?

<PWinstanley> LarsG: from a dataset you should be able to figure out if it is open or closed

<PWinstanley> Makx: I would get away from the word 'access restriction' - I prefer 'level of openness'. I agree with LarsG - there is a lot of processing in the alternative approaches in working out if the data is open or not, but we need to make it easy

<PWinstanley> ... if people think that it is sensible to make inferences then that's OK, but end users that I know need something simpler

<PWinstanley> LarsG: I see what Makx is thinking about; a 3 option

<RiccardoAlbertoni> pointer to the related use case https://w3c.github.io/dxwg/ucr/#ID17

<PWinstanley> AndreaPerego: we have 2 different levels in the use case; one solution is as Makx describes, the other is more related to the license restrictions. We identified different categories - no limitations; need for authorisation; and registration required (anybody can register, so it is open without authorisation, but needs registration)

<PWinstanley> ... these 3 categories partially overlap with the others, but it helps the end user decide if they want to register or not

<PWinstanley> ...There is no recommended way in DCAT to specify the degree of openness. Licenses are not about access, they are about usage conditions

<PWinstanley> ... they define 'how' data is used

<kcoyle> PROPOSED: accept 6.21 with a way to provide levels of openness (access). Consider creating best practices

<PWinstanley> ...We need to ensure we don't mix these up

<PWinstanley> annette_g: do we want to have a proposal to do a best practices document?

<PWinstanley> kcoyle: given it is not a deliverable, perhaps just a recommendation. There will be a lot of areas that relate to this

<PWinstanley> dsr: you could use a 'resolution' to work on a BP doc

<annette_g> +1

<Makx> +1

<newton> +1

+1

<dsr> +1

<PWinstanley> +1

<AndreaPerego> +1

<LarsG> +1

<antoine> +1

Resolved: accept 6.21 with a way to provide levels of openness (access)

<RiccardoAlbertoni> +1

<Caroline_> +1
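The three access categories AndreaPerego lists (no limitations, registration required, authorisation required) can be illustrated with a minimal Python sketch. The enum labels, the sample records, and the `filter_by_access` helper are all hypothetical and are not part of DCAT or DCAT-AP; the point is only that an access level kept separate from the license makes filtering a one-line test.

```python
from enum import Enum


class AccessLevel(Enum):
    # Hypothetical labels for the three categories mentioned in the discussion.
    NO_LIMITATIONS = "no-limitations"
    REGISTRATION_REQUIRED = "registration-required"
    AUTHORISATION_REQUIRED = "authorisation-required"


# Hypothetical catalogue records; the license is recorded separately
# from the access level, as the group agreed it should be.
datasets = [
    {"title": "Air quality 2016", "license": "CC-BY-4.0",
     "access": AccessLevel.NO_LIMITATIONS},
    {"title": "Patient registry", "license": "custom",
     "access": AccessLevel.AUTHORISATION_REQUIRED},
    {"title": "Road sensors", "license": "CC-BY-4.0",
     "access": AccessLevel.REGISTRATION_REQUIRED},
]


def filter_by_access(records, allowed):
    """Return the titles of records whose access level is in `allowed`."""
    return [r["title"] for r in records if r["access"] in allowed]


# Datasets an end user can reach without asking for authorisation.
open_without_authorisation = filter_by_access(
    datasets,
    {AccessLevel.NO_LIMITATIONS, AccessLevel.REGISTRATION_REQUIRED},
)
print(open_without_authorisation)
```

Note that two records share a license but differ in access level, which is the distinction between usage conditions and access that the discussion draws.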

<PWinstanley> kcoyle: moving on to the topic of linking. 2 proposals:

<ericP> 6.22 Ability to represent the different relationships between datasets, including: versions of a dataset, collections of datasets, to describe their inclusion criteria and to define the 'hasPart'/'partOf' relationship, derivation, e.g. processed data that is derived from raw data RID20

<PWinstanley> ...second is:

<ericP> 6.38 Clarify the relationships between Datasets and zero, one or multiple Catalogs, e.g. in scenarios of copying, harvesting and aggregation of Dataset descriptions among Catalogs.

<PWinstanley> kcoyle: 6.22

<PWinstanley> ...who wants to discuss?

<PWinstanley> Makx: This is a project in itself. We have discussed versioning, subsets, etc. The use case is clear, but it should be moved to the DCAT subgroup as it is too large a problem to decide here

<PWinstanley> ericP: the DDI Alliance are working this into their own infrastructure - so perhaps let them solve it

<PWinstanley> Makx: ericP is jumping into solution space; the requirement is clear and we already have contacts with DDI, so we should keep the requirement here

<kcoyle> PROPOSED: accept 6.22 as a requirement

<annette_g> +1

<newton> +1

<dsr> +1

<RiccardoAlbertoni> +1

<Makx> do we have contacts with DDI or more narrowly with PAV, or is that the same

<LarsG> +1

<PWinstanley> +1

<antoine> +1

<Makx> +1

+1

<Caroline_> +1

Resolved: accept 6.22 as a requirement

<Makx> hard to hear people away from the mike

<PWinstanley> DaveBrowning: I agree with the approach, but not all subsetting of datasets corresponds to a style where provenance is important. A subset with no versioning - a simple subset - is just the simple case without the need for provenance, but there are cases where we do need it

<Makx> impossible to follow Dave

<PWinstanley> annette_g: we are moving into implementation

<PWinstanley> kcoyle: next one in linking

<kcoyle> 6.38 Clarify the relationships between Datasets and zero, one or multiple Catalogs

<PWinstanley> annette_g: isn't this in 6.22?

<PWinstanley> RiccardoAlbertoni: annette_g was saying that this requirement was similar to 6.22, but IMO this is not the case; the requirement could be between a dataset and many catalogues

<PWinstanley> Makx: I support RiccardoAlbertoni; same dataset in many catalogues - we have several catalogue entries. People are struggling to find out if this has happened. Therefore it might be for BP (using identifiers or other solutions)

<kcoyle> PROPOSED: accept 6.38 requirement

<RiccardoAlbertoni> +1

<annette_g> +1

<Makx> +1

<newton> +1

<PWinstanley> +1

<dsr> +1

+1

<Caroline_> +1

<LarsG> +1

<antoine> +1

<AndreaPerego> +1

Resolved: accept 6.38 requirement
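Makx's remark that identifiers might help people notice the same dataset appearing in several catalogues can be sketched as follows. This is only an illustration under stated assumptions: the catalogue names, the `urn:example:` identifiers, and the helper names are hypothetical, and it presumes that harvested entries keep a stable identifier, which the group has not decided.

```python
from collections import defaultdict

# Hypothetical harvested entries: (catalogue name, dataset identifier, title).
entries = [
    ("national-portal", "urn:example:dataset-1", "Air quality 2016"),
    ("city-portal", "urn:example:dataset-1", "Air Quality (2016)"),
    ("city-portal", "urn:example:dataset-2", "Road sensors"),
]


def catalogues_by_identifier(rows):
    """Map each dataset identifier to the set of catalogues listing it."""
    by_id = defaultdict(set)
    for catalogue, identifier, _title in rows:
        by_id[identifier].add(catalogue)
    return by_id


def listed_in_several(rows):
    """Return identifiers that appear in more than one catalogue."""
    return {
        identifier: sorted(catalogues)
        for identifier, catalogues in catalogues_by_identifier(rows).items()
        if len(catalogues) > 1
    }


print(listed_in_several(entries))
```

Titles differ between the two portals in this sample, so matching on title alone would miss the duplicate; that is the case for relying on identifiers.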

<PWinstanley> kcoyle: now to data info; several aspects, mainly getting more detailed data into DCAT

<PWinstanley> ... I will skip the first (more difficult) and return later

<kcoyle> 6.36 Express summary statistics and descriptive metrics to characterize a Dataset.

<kcoyle> PROPOSED: accept requirement 6.36

<PWinstanley> +1

<newton> +1

<Caroline_> +1

<Makx> +1

+1

<antoine> +1 even though there's no use case for this one?

<AndreaPerego> +1

<LarsG> +1

<annette_g> search for the word "statistics"

Resolved: accept requirement 6.36

<kcoyle> 6.44 Define a means to advertise any quality-related information; this might be text-based or more machine-processable

<PWinstanley> kcoyle: another 'nice to have'

<PWinstanley> ...this is not necessarily structured information

<AndreaPerego> Unclear what "advertise" means here.

<PWinstanley> Makx: I was responsible for the use case. in statistical agencies the quality is generally described in text, and we found a way of using DQV

<PWinstanley> ...this could be a separate requirement, but it could be linked with the DQV aspect

<kcoyle> PROPOSED: accept requirement 6.44

<PWinstanley> RiccardoAlbertoni: ....

<AndreaPerego> Or replace "advertise" with "model".

<Makx> +1

<RiccardoAlbertoni> the requirement..

<PWinstanley> LarsG: the word 'advertise' is neither in UC nor requirement.

<PWinstanley> kcoyle: I will make a note to change spreadsheet wording to 'provide'

<kcoyle> PROPOSED: accept requirement 6.44 changing "advertise" to "provide"

<annette_g> +1

<Makx> -1

<AndreaPerego> +1

<Caroline_> +1

<PWinstanley> +1

+1

<LarsG> +1

<PWinstanley> Makx: I think the proposal is incorrect - I think the exact text of the requirement should be in the spreadsheet

<PWinstanley> ... define a way to 'associate'

<Makx> +1

<dsr> +1

<RiccardoAlbertoni> +1

<AndreaPerego> +1

Resolved: accept requirement 6.44 assuming the wording in UCR is correct

<antoine> +1

<PWinstanley> kcoyle: 6.16

<kcoyle> 6.16 Provide a recommended way to attach usage notes to data descriptions.

<PWinstanley> Makx: the text is loose - talking about data descriptions. we should talk about datasets

<AndreaPerego> +1 to Makx

<PWinstanley> ...we use dataset to refer to the descriptions of datasets

<PWinstanley> annette_g: we need to make it clear that usage notes isn't about usage situations, it is about how to use the dataset

<PWinstanley> Linda: usage notes sounds generic

<AndreaPerego> +1

<Makx> The use case says "information on how to use the data"

<annette_g> +1 to Linda

<PWinstanley> Linda: could it be 'notes on how to use the data'?

<PWinstanley> kcoyle: let's reword the requirement to reflect the UC

<newton> maybe usage instructions

<kcoyle> PROPOSED: accept 6.16 but changing the wording to "provide information on how to use the data"

<PWinstanley> +1

<LarsG> +1

<Linda> +1

<Caroline_> +1

<annette_g> +1

<antoine> +1

<newton> +1

<Makx> +1

+1

<AndreaPerego> +1

<RiccardoAlbertoni> +1

Resolved: accept 6.16 but changing the wording to "provide information on how to use the data"

<PWinstanley> kcoyle: 6.17

<kcoyle> 6.17 Provide a way to link publications about a dataset to the dataset.

+1

<annette_g> thumbs up

<PWinstanley> DaveBrowning: is this one-way or bidirectional?

<PWinstanley> kcoyle: it needs to be worked out

+1

<PWinstanley> LarsG: no comment

<kcoyle> PROPOSED: accept 6.17 as is

<PWinstanley> +1

<Makx> +1

<RiccardoAlbertoni> +1

<Linda> +1

+1

<newton> +1

<AndreaPerego> +1

<LarsG> +1

<Caroline_> +1

Resolved: accept 6.17 as is

<annette_g> +1

<PWinstanley> kcoyle: 6.18

<kcoyle> 6.18 Provide a way to link to structured information about the provenance of a dataset

<PWinstanley> annette_g: support

<PWinstanley> kcoyle: I might want to reconsider 'structured'

<PWinstanley> fab_gandon: how would you position the request in relation to PROV-O primitives? You want to introduce a way to link, and this is already in PROV-O

<ericP> PWinstanley: you might find out that you don't want metadata that's as dense as what's provided in PROV-O

<kcoyle> PROPOSED: accept 6.18

+1

<PWinstanley> +1

<Linda> +1

<newton> +1

<dsr> +1

<annette_g> +1

<Makx> +1

<LarsG> +1

<PWinstanley> AndreaPerego: looking at the UC and requirements, there is no recommended way to add provenance information

<PWinstanley> ...perhaps we should be addressing the need to define how provenance should be modelled?

<PWinstanley> kcoyle: should we provide BP guidance?

<PWinstanley> AndreaPerego: OK

Resolved: accept 6.18, with the caveat that recommendations and/or best practices may be needed

<annette_g> +1

<PWinstanley> +1

+1

<dsr> +1

<Makx> best practice is good but we need to be careful not to move too much to work that we are not going to do in this group

<AndreaPerego> +1

<RiccardoAlbertoni> +1

<Makx> +1

<PWinstanley> annette_g: I have no problem with that going in, but if the vocabulary gives the guidance then that would be adequate

<kcoyle> 6.20 Identify common modeling patterns for different aspects of data quality based on frequently referenced data quality attributes found in existing standards and practices

<RiccardoAlbertoni> https://w3c.github.io/dxwg/ucr/#RID18

<PWinstanley> RiccardoAlbertoni: it could be more understandable if we look at the description in the document

<RiccardoAlbertoni> https://w3c.github.io/dxwg/ucr/#RID18

<PWinstanley> Makx: 6.20 is more general - related to 6.44; we need to have some BP for expressing data quality, most likely using DQV. The formulation is difficult to understand.

<Makx> can't hear

<PWinstanley> ScottSimmons: quality is domain-specific, so it is hard to prescribe

<PWinstanley> RiccardoAlbertoni: replying to Makx, partially the requirement relates to what Makx mentioned, but other aspects might not be entirely within DQV

<PWinstanley> AndreaPerego: agreeing with Makx. Differing aspects of data quality and modelling conformity; we need a way to promote best practices, but I don't know if this is just one requirement or more than one

<PWinstanley> ...at present I don't have a complete proposal, but we need to beware clumping too many things together

<PWinstanley> kcoyle: the creation of the requirement often clumps concepts. perhaps we need to review this and 6.44 and develop separate requirements coming from the UCs

<Makx> makes sense to me

<PWinstanley> annette_g: the UCs that I see don't give any more detail than the requirements. RiccardoAlbertoni, is the motivation coming from DQV and is there an expectation that DCAT will pass through some of this ability?

<PWinstanley> RiccardoAlbertoni: DQV allows modelling of quality. I suspect that in the UCs there are issues that are not completely covered in DQV. We may need to dig around in the UCs to deliver more precise requirements, or we could keep it general

<PWinstanley> ... Determining if a UC is covered by DQV could be complicated to do on a case by case basis

<Makx> Also https://www.w3.org/WAI/intro/earl could be in the solution space for testing results

<kcoyle> PROPOSED: accept 6.20 but ask for more attention to the specifics in the related use cases

<Makx> +1

<PWinstanley> RiccardoAlbertoni: being specific means finding a solution

<PWinstanley> kcoyle: we are proposing to ask the DCAT group to do that

<PWinstanley> annette_g: The UCs don't give much guidance - there is little distinction between the requirements

<PWinstanley> kcoyle: they do seem to be connected with redundancy, but there will probably be one solution

<PWinstanley> DaveBrowning: I agree with what you've said, but I am suspicious of the phrase 'fine-grained'

<PWinstanley> ... I agree that we pass to DCAT team

<Makx> +1

<PWinstanley> kcoyle: 'fine-grained' is present in many places

+1

<annette_g> +1

<RiccardoAlbertoni> +1

<newton> +1

<PWinstanley> +1

<Caroline_> +1

Resolved: accept 6.20 but ask for more attention to the specifics in the related use cases

<kcoyle> BREAK UNTIL 11:00

<PWinstanley> kcoyle: we now have a break for 23 minutes

<PWinstanley> until 11:00

<RiccardoAlbertoni> +1 to kcoyle

kcoyle: 6.5.1 is really the 'obvious' requirement from the current work

<kcoyle> 6.5.1 Identify DCAT resources that are subject to versioning, i.e. Catalog, Dataset, Distribution.

<annette_g> Proposed accept 6.5.1 as is

<Makx> +1

<annette_g> +1

+1

<RiccardoAlbertoni> +1

<PWinstanley> +1

<kcoyle> 6.6.1 Provide a conceptual definition of what is considered a version with regard to modifications of the respective subject. The definition should provide a clear guidance on conditions, type and severity of a resource's update

Makx: This is probably impossible and wouldn't be used by many people

newton: If we don't have this - people would like some guidance (e.g. as part of DWBP)

RiccardoAlbertoni: I agree with Makx - is there something in mind?
… some idea that could be progressed

kcoyle: There was a requirement...if I can find it

annette_g: Specific groups will have different ways to do this - we risk frightening people away and raising barriers

Makx: In DCAT-AP - was discussed at length. But in practice no one is using the structures provided

<Makx> +1 to DaveBrowning

DaveBrowning: versioning techniques are very important in many places (esp if multiple consumers) but the details are very domain specific

<Makx> can we reject a requirement?

kcoyle: the requirement as stated goes too far

dsr: If we want to add a feature to DCAT we should know what any new feature is to be used for

Makx: requirement seems more prescriptive than the use case. If there are examples, perhaps we should gather them...

DaveBrowning: I can provide some examples from the financial data space

<annette_g> PROPOSED: rewrite requirement 6.6.1

<kcoyle> +1

+1

<annette_g> +1

<Linda> +1

<newton> +1

<dsr> +1

Resolved: rewrite requirement 6.6.1

<PWinstanley> +1

<Makx> +1

Action: annette_g to rewrite requirement 6.6.1

<trackbot> Created ACTION-54 - Rewrite requirement 6.6.1 [on Annette Greiner - due 2017-11-16].

<kcoyle> 6.8.1 Indicate the status of a version in terms of stability, fidelity etc. (e.g. major, minor, stable). The version identifier might refer to the version status (semantic version)

<newton> http://semver.org
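The semantic versioning scheme newton links to can be illustrated with a minimal Python sketch. It assumes plain MAJOR.MINOR.PATCH identifiers and ignores the pre-release and build suffixes that the scheme also allows; `parse_version` and `change_kind` are hypothetical helper names, and nothing here is a proposal from the group.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


# Only the simple MAJOR.MINOR.PATCH form is handled in this sketch.
_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse_version(text: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH' into a comparable Version tuple."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH identifier: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def change_kind(old: Version, new: Version) -> str:
    """Name the most significant component that differs."""
    if new.major != old.major:
        return "major"
    if new.minor != old.minor:
        return "minor"
    if new.patch != old.patch:
        return "patch"
    return "none"


old, new = parse_version("1.9.3"), parse_version("1.10.0")
print(new > old, change_kind(old, new))
```

Because the components are compared as integers, 1.10.0 sorts after 1.9.3, which a plain string comparison would get wrong.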

Makx: How does the status mentioned here relate to the actual data set?
… last sentence moves in solution space...

PWinstanley: We appear to be thinking linearly around versions without acknowledging forking/merging etc

dsr: We probably need that more sophisticated model

dsr: There is probably an rdf solution....

PWinstanley: Perhaps there is already a solution in the source code space for example

<Jaroslav_Pullmann> dear all, please excuse me being late, my train was delayed ..

Makx: This seems to be broader than the requirement we're talking about. The version delta or version notes were used in DCAT-AP to provide the description

<PWinstanley> DaveBrowning: the intersection between new versions

<PWinstanley> ...and the coverage of a dataset mean that this is more multi-dimensional than the source code example

<PWinstanley> ... the situation might have business drivers

<PWinstanley> ... and appear arbitrary

<PWinstanley> ...The source code example refers to structure, but there is also versioning relating to coverage/usage

annette_g: Perhaps the heart of this is on stability/fidelity aspects?

kcoyle: we already have statements of quality etc which overlap with some part of this.

annette_g: is there anything here that isn't covered elsewhere?

<Makx> let's reject this one

annette_g: Any objections?

<annette_g> PROPOSED: reject requirement 6.8.1 in favor of covering its contents in other requirements.

<Makx> +1

+1

<Linda> +1

<newton> +1

<kcoyle> +1

<RiccardoAlbertoni> +1

<annette_g> +1

<AndreaPerego> +0 - not attending all the discussion

<PWinstanley> +1

<AndreaPerego> No, I'm ok

Resolved: reject requirement 6.8.1 in favor of covering its contents in other requirements.

<Jaroslav_Pullmann> Will there be a replacement for the "status" ?

6.10.1 Indicate the change delta from one version to the next.

Jaroslav_Pullmann: idea was to include an incremental description - not formalised

Makx: Could open problems - since it implies a delta from some previous thing
… using "notes" would be better than "delta"

LarsG: it's more of a hint

<Linda> I think that wasn’t Jaroslav he just left

<AndreaPerego> s/I think that wasn’t LarsG he just left//

kcoyle: Is this really about dataset or about distribution?

kcoyle: So version is not a separate thing - it's part of dataset, distribution etc - wouldn't the description cover this?

<Jaroslav_Pullmann> +1 for distinguishing description from version info

Linda: The requirement sounds like you will always have this - is that the idea? Or is it optional?

Makx: Good to have a way to express the difference because of versioning. (Also everything is optional in DCAT)
… if people want to put lots of info in there then they can

Jaroslav_Pullmann: Retain distinction on version info from description. Could provide guidance as part of solution

kcoyle: should we expand this one to talk about version note rather than delta

Makx: that would be possible (It could be used as a delta, but that would be a specific/profile issue)

<AndreaPerego> Being able to use a "delta" to regenerate a dataset could be also considered as part of provenance information.

<Makx> "provide information about the changes from one version to the next"

<kcoyle> PROPOSED: ACCEPT 6.10.1 generalizing the wording to include various information about the version

<Linda> +1

<PWinstanley> +1

<Makx> +1

+1

<newton> +1

<AndreaPerego> +1

<annette_g> +1

<kcoyle> +1

<RiccardoAlbertoni> +1

Jaroslav_Pullmann: The idea was that the semantics of the change would be apparent

Jaroslav_Pullmann: would this restated requirement be testable?

<kcoyle> PROPOSED: accept 6.10.1 renaming the emphasis to version change information

<Makx> +1

<newton> +1

<Jaroslav_Pullmann> +1

<kcoyle> +1

<annette_g> +1

<AndreaPerego> +1

+1

PWinstanley: The important element is if the change is substantive
… to the information content

kcoyle: the substantive changes could be described in the note - but probably not machine readable

<SimonCox> For many scholarly applications, the issue is whether the change to the data would change the results of any analysis done using it - i.e. reproducibility

kcoyle: Should we go back to 6.8.1 - re-write it to support this kind of thing?

PWinstanley: Example from image processing - lossy vs not. These have different information content

<SimonCox> Folks - unfortunately I'm in UK this week so about to go to bed (after a long day on family matters) so will not be able to join you on 6.46

kcoyle: maybe we can extend 6.10.1 to include Peter's ideas

<annette_g> back in one hour

<SimonCox> For 6.46 also see https://dr-shorthair.github.io/ont/project/

<Jaroslav_Pullmann> Jaroslav - I am back as well

<annette_g> possible rewrite of 6.6.1: Provide guidance on how to express the conditions, type, and extent of a resource's update.

We had a short discussion on versioning over lunch looking at what we have agreed so far

<kcoyle> 6.6.1 Provide a conceptual definition of what is considered a version with regard to modifications of the respective subject. The definition should provide a clear guidance on conditions, type and severity of a resource's update that motivate the creation of a new version in scenarios like dataset evolution, conversion, translations etc.

This is what we had earlier

<kcoyle> 6.6.1: Provide guidance on how to express the conditions, type, and extent of a resource's update

<annette_g> ^ this is what is proposed

This is what we have now

The new text replaces the old

Would be better to say “reasons” rather than “conditions”?

annette_g: we took conditions from the original text

<annette_g> Provide guidance on how to express the motivation, type, and extent of a resource's update.

DaveBrowning: “reasons” seems a better fit compared to “conditions”

<Makx> motivation is OK for me too

kcoyle: anyone feel strongly about this?

<kcoyle> PROPOSED: accept 6.6.1 with the new wording "Provide guidance on how to express the motivation, type, and extent of a resource's update."

<annette_g> +1

<Makx> +1

<Linda> +1

<DaveBrowning> +1

+1

<kcoyle> +1

<PWinstanley> +1

<Jaroslav_Pullmann> +1

Resolved: accept 6.6.1 with new wording

<kcoyle> https://www.w3.org/2017/dxwg/wiki/General_versioning_considerations

kcoyle: during the break we began to look at Jaroslav_Pullmann’s diagram

Karen summarises …

The version refers to the dataset and possibly the distribution, although that’s something for us to decide

We have a date which allows for sorting and filtering

We have free form text that describes what has changed in the dataset.

The metadata could have its own versioning

annette_g: Jaroslav_Pullmann’s diagram seems to imply semantic versioning

I want people to allow versioning that isn’t semantic versioning

Jaroslav_Pullmann concurs

kcoyle: what is the difference between version status and version delta?

Jaroslav_Pullmann: delta is a formal text for what has changed

version status …

Jaroslav_Pullmann: version status could cover the expected durability of the new version

annette_g discusses the intent of 6.8.1

Makx: I don’t see how you distinguish the status of the version from the status of the dataset

Makx: we would be better with a dataset status field

annette_g: I have to disagree

how would you express the cases where you have different versions of the same dataset?

kcoyle: they would be different datasets

annette_g: if you look at W3C technical reports, different versions have their own status

kcoyle: not everyone needs the version status

Jaroslav_Pullmann: yes, so the status is optional

Makx: it is wrong to say there is a difference between documents and data sets

The status is about the document/dataset not the version as such

dsr: I agree with Makx that the status is just a piece of metadata for the dataset where this metadata has changed from one version to the next

kcoyle asks Jaroslav_Pullmann to add date to the next version of the diagram

Makx: the date should be the date of the dataset

kcoyle: we don’t have a requirement for status, which seems to be a gap

<kcoyle> 6.8.1 Indicate the status of the dataset

<Makx> +1 to status of the Dataset

<annette_g> Provide a means to ...

<annette_g> +1

+1

<annette_g> +1

<DaveBrowning> +1

<Makx> +1

<Makx> nod

<Makx> catalog

kcoyle: are we assuming that the versioning metadata could be on the distribution of a dataset?

<annette_g> see 6.5.1

<kcoyle> PROPOSED: 6.8.1 provide a means to indicate the status of the subject being described

<Makx> resource or resource?

annette_g: we should provide some parenthetical examples of what we’re talking about

Makx: when we write up the versioning section we should cite the use of the status field

annette_g: a change delta is the last stanza of a version history, i.e. the difference between this version of the dataset and the previous version

DaveBrowning: I agree but let’s make it clear that this is not expected to be machine interpretable

<alejandra> thanks!

kcoyle: we’re on the agenda section 11-12:30, and 6.10.1 (versioning)

annette_g: people will fill this out in a domain/publisher specific way

<kcoyle> PROPOSED: accept 6.10.1 with proposed wording : provide a way to indicate the change delta or other change information from the previous version

<annette_g> +1

<Jaroslav_Pullmann> +1

<DaveBrowning> +1

<Linda> +1

+1

<alejandra> +1

<Makx> +1

<Caroline> +1

<PWinstanley> +1

<AndreaPerego> +1

Resolved: accept 6.10.1 with proposed wording : provide a way to indicate the change delta or other change information from the previous version

<alejandra> sorry, just thinking that it would be good to also mention something like "change delta (description of the change, and/or possibly machine-readable description of the change)"?

<alejandra> maybe not needed

<alejandra> but we need to remember

<kcoyle> 6.11.1 Provide a means to search and discover existing versions of a DCAT resource. This should include both metadata elements referencing previous and next versions, lists of versions, and by implication the ability to search catalogs for related versions

<annette_g> "It should be possible to use DCAT to search and discover …"

<alejandra> re-phrase as "Provide metadata enabling search and discovery..."

we briefly chat about enabling rich search results akin to schema.org

PWinstanley: it is a matter of how you can use existing fields

We discuss the need for a link to a previous version of a dataset

<alejandra> I was thinking in PAV too, which has pav:previousVersion pav:hasCurrentVersion etc

Makx: I agree with the need for provenance metadata

kcoyle: is PROV-O etc. sufficient?

Makx: yes

<alejandra> yes, it is PAV more than PROV-O (https://pav-ontology.github.io/pav/)

I don’t think we have a current requirement for pointers to previous versions etc.
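The PAV property alejandra mentions, pav:previousVersion, links a resource to the version that preceded it. As a rough Python sketch of how such links could be followed to list a version history, the example below stores statements as plain tuples rather than using an RDF library; the `urn:example:` dataset IRIs and the `version_history` helper are hypothetical.

```python
# IRI of the PAV property discussed above (PAV namespace: http://purl.org/pav/).
PAV_PREVIOUS = "http://purl.org/pav/previousVersion"

# Hypothetical statements as (subject, predicate, object) tuples.
triples = [
    ("urn:example:dataset-v3", PAV_PREVIOUS, "urn:example:dataset-v2"),
    ("urn:example:dataset-v2", PAV_PREVIOUS, "urn:example:dataset-v1"),
]


def version_history(start, statements):
    """Follow previousVersion links from `start` back to the oldest version."""
    previous = {s: o for s, p, o in statements if p == PAV_PREVIOUS}
    history = [start]
    seen = {start}
    while history[-1] in previous:
        older = previous[history[-1]]
        if older in seen:  # stop if the data contains a cycle
            break
        history.append(older)
        seen.add(older)
    return history


print(version_history("urn:example:dataset-v3", triples))
```

The sketch assumes a single linear chain; forked or merged versions, which were raised earlier in the meeting, would need a graph traversal instead.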

<kcoyle> PROPOSED: 6.11.1 is out of scope

<Makx> +1

<annette_g> +1

+1

<DaveBrowning> +1

<PWinstanley> +1

<Jaroslav_Pullmann> +1

<alejandra> could you put the link to what you discussed about relationships of datasets please?

Resolved: 6.11.1 out of scope; pointers between datasets and versions provided in 6.22

<alejandra> +0 (as I didn't follow the previous discussion)

<kcoyle> 6.45.1 Indicate the update method of a Dataset description, e.g. whether each new dataset entirely supersedes previous ones (is stand-alone), or whether there is a base dataset with files that effect updates to that base.

Makx: I see two parts to 6.45

annette_g: if we have a patch update to a base dataset, we need to point to that base dataset

Jaroslav_Pullmann: does a new version always replace the former one?

kcoyle: no, that is not always the case

imagine updates with additions, revisions and deletions, what would you call that?

kcoyle: this occurs in the real world

DaveBrowning: for us this is a dominant use case

DaveBrowning: a dataset could be a set of transactions for changes

I see this as the first of a set of related topics

Makx: this is difficult to wrap my head around due to the different ways people are talking

We don’t have in DCAT any way to describe data that changes data sets

kcoyle: it looks like we can surface this in 6.22 (relationships between datasets)

<AndreaPerego> +1

<kcoyle> PROPOSED: Include requirements from 6.45.1 in the 6.22 requirement

<Linda> +1

+1

<annette_g> +1

<alejandra> +1

<AndreaPerego> +1

<Jaroslav_Pullmann> +1

<DaveBrowning> +1

Resolved: Include requirements from 6.45.1 in the 6.22 requirement

<annette_g> Hi Caroline!

<annette_g> we are starting up

6.35 Provide means to describe the funding (amount and source) of a Dataset (or entire Catalog)

PWinstanley: this is more an application profile issue
… comes from academic end
… people looking at dcat will think: this isn't for me, it's academic

PWinstanley: could be in a profile for cost centers, etc.

kcoyle: seems pretty generic
… could be a legal requirement

DaveBrowning: being a requirement doesn't mean it can't be in a profile

annette_g: balance between value and how much it looks academic

PWinstanley: a primer could show non-academic examples

dsr: should be in context of provenance

annette_g: could be in lieu of a full provenance entry
… provenance as 'this was derived by ...' vs 'this was funded by'

kcoyle: Look at 6.46 - it's the project that provides the funding
… they need to go together

annette_g: keep it simple

PWinstanley: yes, that's important

<annette_g> PROPOSED: accept 6.35 and 6.46

+1

Resolved: remove "e.g. class" from 6.46

<DaveBrowning> +1

<newton> +1

<annette_g> +1

Resolved: accept 6.35 and 6.46

Resolved: remove "e.g. a property" from 6.47

6.47 Provide a means to indicate the relation of Datasets to a project.

annette_g: context of use or of publication?

<AndreaPerego> I think this is related to the funding req.

kcoyle: is there a difference between 6.46 and 6.47?

<AndreaPerego> I think it is more on the level of detail on how to specify a "funding reference".

DaveBrowning: this seems plausible but could become a big structure and a distance from dcat

can we ask that these be reviewed by the DCAT group as possibly handled by PROV-O or some other vocabulary?

<AndreaPerego> +1

<annette_g> PROPOSED: accept 6.35, 6.46, and 6.47 with the proviso that it may be a need fulfilled by extensions to DCAT rather than DCAT itself.

+1

<AndreaPerego> +1

<annette_g> +1

<Caroline> +1

<DaveBrowning> +1

<newton> +1

<dsr> +1

<PWinstanley> +1

Resolved: accept 6.35, 6.46, and 6.47 with the proviso that it may be a need fulfilled by extensions to DCAT rather than DCAT itself.

annette_g: next three are relations to other vocabularies

these are requirements for the dcat group process

kcoyle: schema is less directly related than the others

<annette_g> PROPOSED: accept requirement 6.39 and 6.40

+1

<PWinstanley> +1

<AndreaPerego> +1

<DaveBrowning> +1

<annette_g> +1

<newton> +1

<dsr> +1

Resolved: accept requirement 6.39 and 6.40

kcoyle: I see this as an extra

DaveBrowning: schema has life-cycle characteristics; designed to be slightly fuzzy
… the two can coexist, but we should not force them together too closely

can we consider 6.41 optional?

<annette_g> PROPOSED: reject 6.41 as a requirement, considering it as interesting but optional

<newton> +1

+1

<DaveBrowning> +1

<dsr> +1

<annette_g> +1

<PWinstanley> +1

Resolved: reject 6.41 as a requirement, considering it as interesting but optional

kcoyle: these requirements need to be re-written to state that they are requiring the creation of best practices

<AndreaPerego> +1

kcoyle: that relates to qualified forms, not new functionality for DCAT itself

<annette_g> PROPOSED: rewrite 6.27 and 6.28 to be about giving guidelines rather than defining new terms or functionality in DCAT.

kcoyle: but in using existing vocabularies

+1

<AndreaPerego> +1

<newton> +1

<PWinstanley> +1

<DaveBrowning> +1

<annette_g> +1

Resolved: rewrite 6.27 and 6.28 to be about giving guidelines rather than defining new terms or functionality in DCAT.

adjourn until tomorrow

Action: AndreaPerego to rewrite 6.27 and 6.28

<trackbot> Created ACTION-55 - Rewrite 6.27 and 6.28 [on Andrea Perego - due 2017-11-17].

Summary of Action Items

  1. annette_g to rewrite requirement 6.6.1
  2. AndreaPerego to rewrite 6.27 and 6.28

Summary of Resolutions

  1. accept 6.21 with a way to provide levels of openness (access)
  2. accept 6.22 as a requirement
  3. accept 6.38 requirement
  4. accept requirement 6.36
  5. accept requirement 6.44 assuming the wording in UCR is correct
  6. accept 6.16 but changing the wording to "provide information on how to use the data"
  7. accept 6.17 as is
  8. accept 6.18, with the caveat that recommendations and/or best practices may be needed
  9. accept 6.20 but ask for more attention to the specifics in the related use cases
  10. rewrite requirement 6.6.1
  11. reject requirement 6.8.1 in favor of covering its contents in other requirements.
  12. accept 6.6.1 with new wording
  13. accept 6.10.1 with proposed wording : provide a way to indicate the change delta or other change information from the previous version
  14. 6.11.1 out of scope; pointers between datasets and versions provided in 6.22
  15. Include requirements from 6.45.1 in the 6.22 requirement
  16. remove "e.g. class" from 6.46
  17. accept 6.35 and 6.46
  18. remove "e.g. a property" from 6.47
  19. accept 6.35, 6.46, and 6.47 with the proviso that it may be a need fulfilled by extensions to DCAT rather than DCAT itself.
  20. accept requirement 6.39 and 6.40
  21. reject 6.41 as a requirement, considering it as interesting but optional
  22. rewrite 6.27 and 6.28 to be about giving guidelines rather than defining new terms or functionality in DCAT.
Minutes formatted by Bert Bos's scribe.perl version 2.37 (2017/11/06 19:13:35), a reimplementation of David Booth's scribe.perl. See CVS log.
