Data Exchange WG TPAC face to face - Day 2

Meeting Minutes

<AndreaPerego> Yesterday's minutes: https://www.w3.org/2017/11/09-dxwg-minutes.html

kcoyle: we have a duplication 6.45.1 that can be deleted

kcoyle: publication: Makx provided use cases

Makx: when doing DCAT-AP we found that the way that dataset descriptions move during harvesting from one catalogue to another makes it challenging to know which was the original source of the dataset description

<kcoyle> 6.29 Define means to explicitly control the re-publication of Dataset and Catalog descriptions among data portals (policies). Obligations might apply e.g. to disclose the original source or keep the copies synchronized.

kcoyle: this has to do with re-publication (which might include mirroring of the descriptions, the catalogued info)

<AndreaPerego> Description = metadata

Makx: dataset descriptors are harvested from regional portals and put in the national catalogue. The metadata is copied. This leads to 2 or more descriptors for the one data file. How, therefore, does one identify the original catalogue entry?

<Jaroslav_Pullmann> present

Makx: The issues are: is all or part of the data catalogue copied, is it changed, etc etc

kcoyle: there seems to be a difference between 'control' and 'document'

RubenVerborgh: we need to define 'control'. what do we want?

<RubenVerborgh> RubenVerborgh: At the moment, 6.29 seems to be "source" (current 6.30) and "sync" (not yet in there). So if that's the only thing, we should split them into that.

Makx: I'm relating the requirement to the use case, and it doesn't talk about control but explicit references to the original. The control aspect was added as the requirement was derived from the use case. Suggest 'track' rather than 'control'

Jaroslav_Pullmann: the last para of the use case; how should my metadata be handled by parties who want to copy it?

kcoyle: but we covered permissions yesterday

Jaroslav_Pullmann: that might fit

DaveBrowning: but wasn't that about content, rather than metadata?

<Zakim> AndreaPerego, you wanted to say that, after all, we are talking about provenance information about metadata

kcoyle: but we could apply the same approach

AndreaPerego: we need to know the provenance of the cataloguing data, especially in a federated environment, and especially when the original records follow a different metadata standard.

Makx: Jaroslav_Pullmann is thinking of rights and licenses for metadata. once you limit the usage of metadata the whole idea of sharing is broken. people have to then maintain the chain of limitations and this precludes enrichment pipelines
… The requirement might include the original creation and the route, but we should avoid licensing issues for metadata

Jaroslav_Pullmann: this arose from non-DCAT area where there was varied metadata but there were restrictions on usage

kcoyle: Jaroslav_Pullmann, are you suggesting that this might not be a requirement?

Jaroslav_Pullmann: it might not. My original use case was not a DCAT area

Makx: I didn't want to say it was not a valid requirement. DCAT can be used for closed as well as open data. If there is a requirement to protect/limit the reuse of metadata, then the requirement makes sense, but in open data areas it is potentially troublesome

<kcoyle> 6.29

kcoyle: we have 2 different things: DCAT metadata having provenance; DCAT metadata having expression of rights. Do we need both?

Makx: yes

<riccardoAlbertoni> +1 to kcoyle

kcoyle: could 6.29 be the rights expression, and 6.30 be the provenance?

Jaroslav_Pullmann: yes

kcoyle: then we need to re-word 6.29 to ensure it refers to the metadata rather than the control mechanism

kcoyle: DCAT can be used for non-public data. The rights we are talking about (re-use) ...

Jaroslav_Pullmann: e.g. when the metadata schema is re-formed

<kcoyle> PROPOSED: accept 6.29 reworded as: "Provide means to express rights relating to reuse of DCAT metadata"

<Makx> +1

<AndreaPerego> +1

<Linda> +1

<DaveBrowning> +1

<riccardoAlbertoni> +1

<newton> +1

<RubenVerborgh> +1

<Jaroslav_Pullmann> +1

Resolved: accept 6.29 reworded as: "Provide means to express rights relating to reuse of DCAT metadata"

kcoyle: 6.30 is related

RubenVerborgh: simply provenance. it could be more elaborate than just a link to the source

Jaroslav_Pullmann: this was partially discussed by rob . the reference should be linkable and resolvable to discover the original resource

kcoyle: the use case talks about the original resource, the requirement doesn't

Makx: what is a DCAT resource? we are talking about the metadata description. there needs to be a way to identify the original source of the DCAT metadata

Jaroslav_Pullmann: we don't know what has been copied, what has been added.

Makx: use the term "DCAT metadata" and refer to "original source"

kcoyle: the reference to the original version - is this the metadata, or the site it originated from?

<Zakim> AndreaPerego, you wanted to ask is this related to the use of schema.org for publishing DCAT records

Makx: we need a link directly to the source metadata (assuming it is not a URN)

AndreaPerego: indexable by search engine? this needs the full information in the original record

DaveBrowning: this is what the provenance vocab is meant to handle.

<AndreaPerego> I was thinking that "indexable by search engines" is about publishing DCAT records by using schema.org.

DaveBrowning: the new bit is 'indexable'
… so do we want to understand the provenance, or do we want it to be searchable?

Jaroslav_Pullmann: the requirements have two things: reference to original source; client should be able to access this source
… a client with a reference to the original resource should be able to resolve it

<DaveBrowning> +1 to Jaroslav_Pullmann Interpretation

<kcoyle> PROPOSED: accept 6.30 reworded as: "Provide a way to reference the original DCAT metadata with a dereferenceable reference"

<AndreaPerego> +1

<Jaroslav_Pullmann> +1

<newton> +1

<LarsG> +1

<DaveBrowning> +1

<riccardoAlbertoni> +1

<Makx> +1

<Makx> +1 to Andrea

<riccardoAlbertoni> +1 to AndreaPerego's proposal

<AndreaPerego> Alternative proposal: "Provide a way to reference the original metadata with a dereferenceable reference"

<Makx> apart from the duplication of the word 'reference'

Resolved: accept 6.30 reworded as "Provide a way to reference the original metadata with a dereferenceable reference"

<Makx> +1

<DaveBrowning> +1

<AndreaPerego> +1

<Linda> +1
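Editorial illustration of the 6.30 resolution: a minimal sketch, assuming each harvested copy of a metadata record keeps a dereferenceable reference to the record it was copied from. The catalogue URIs, the `source` key and the `dereference` helper are hypothetical stand-ins, not anything the group agreed on; a real client would resolve the reference over HTTP.

```python
# Hypothetical harvested records, keyed by the URI of each metadata record.
# "source" is an illustrative key holding the reference to the record
# this copy was harvested from (None for the original).
RECORDS = {
    "https://regional.example/dataset/42": {
        "title": "Air quality", "source": None},
    "https://national.example/dataset/a1": {
        "title": "Air quality",
        "source": "https://regional.example/dataset/42"},
    "https://eu.example/dataset/x9": {
        "title": "Air quality",
        "source": "https://national.example/dataset/a1"},
}


def dereference(uri):
    """Stand-in for resolving a metadata URI (e.g. an HTTP GET)."""
    return RECORDS[uri]


def original_source(uri):
    """Follow source references until a record with no source is found."""
    seen = set()
    while True:
        if uri in seen:
            raise ValueError("cycle in source references at " + uri)
        seen.add(uri)
        source = dereference(uri)["source"]
        if source is None:
            return uri
        uri = source


print(original_source("https://eu.example/dataset/x9"))
```

The sketch only covers the "reference the original" half of the discussion; it says nothing about which parts of a record were changed or enriched along the way.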

kcoyle: 6.45 was done yesterday
… now at 6.48

<kcoyle> 6.48 Metadata 'distribution' elements need a content model (see white paper referenced above) associated with links to communicate the protocol expected and the interchange formats (information model and encoding/serialization) available via the link.

<AndreaPerego> I wonder whether this is already covered by https://w3c.github.io/dxwg/ucr/#RID13

<AndreaPerego> Or is it different?

Jaroslav_Pullmann: a drawback with the current approach is missing parameters - we need to know which are mandatory and which are optional

Makx: 2 things: dcat currently supposes that distributions are files and there are format and mediatype; but we are talking about API access where parameters are needed for accessing services.

<AndreaPerego> The UC Makx is referring to is https://w3c.github.io/dxwg/ucr/#ID18

<riccardoAlbertoni> @Makx but this would be requirement 6.15, as far as I understand

RubenVerborgh: I've been looking at 6.48. I think it needs deeper consideration - we need to know what profiles are available. Does 5.21 say any more? 6.48 seems to criticise a potential solution

I don't see how 6.48 follows from 5.21

DaveBrowning: at best it is a duplicate of something already there

RubenVerborgh: we need to reconsider

kcoyle: the note includes profiles, web services,
… it seems we have a use case around web services that is not simply a list of profiles

Makx: I think there is a misunderstanding within the requirement. It works from the perspective of a single link for a distribution, but DCAT doesn't work like that. Different distributions would have different links
… if this use case is about the expression of what is behind a download URL then format etc is adequate. It isn't a content-negotiation matter. The word 'profile' used in the requirement is more like the schema a particular distribution conforms to

<AndreaPerego> +1 to Makx

Makx: I think the requirement mixes things up. File formats are already covered by DCAT. API issues are more like service-based data access

kcoyle: you're referring to 6.48

Makx: yes, but also 5.21 which has more detail. 'profile' here is confusing.

<AndreaPerego> This Req is unclear to me as well.

kcoyle: can we agree that 6.48 is not a valid requirement and so needs to be moved to the side?

<Makx> +1 to unclear

<RubenVerborgh> +1

<AndreaPerego> Or we can ask who contributed it to clarify - I think it was Simon

<DaveBrowning> +1

<riccardoAlbertoni> +1

<Makx> it was Stephen Richard

RubenVerborgh: do we also clarify 5.21?

kcoyle: the use case can remain

Resolved: Eliminate 6.48 as being unclear and possibly erroneous in its assumptions

<Makx> +1

<AndreaPerego> +1

RubenVerborgh: is there anything else that is not with 5.5 (difference between 5.21 and 5.5 matters to me)

<LarsG> +1

<riccardoAlbertoni> +1

kcoyle: part of 5.21 seems to be asking for the metadata to provide sufficient information to determine what is inside [5.5 being the list]
… we are assuming that people know how to use 'format' and that the list is comprehensive

DaveBrowning: 5.21 looks like a good use case, but it is too much in solution space

Linda: it sounds like they both want to describe the difference between different services, and also between content models/ semantic standards

kcoyle: we have something that calls for a reference to a dataset definition or standard, don't we?

<AndreaPerego> It's more about the data schema, in my understanding.

Makx: it doesn't make sense to have format and mediatype
… in DCAT-AP people said there was a need to point to a schema to which the data in the file conforms. For this we included conformsTo in DCAT-AP
… this is separate from passing parameters to services

kcoyle: Do we need a requirement for DCAT to point to the schema that the data conforms to

Makx: yes

kcoyle: 6.14 or 6.48? We are dropping 6.48. If 6.14 doesn't cover it then we need another

Makx: yes, 6.14 covers it

<AndreaPerego> BTW, the requirement about service information in distributions is here https://w3c.github.io/dxwg/ucr/#RID13

Jaroslav_Pullmann: reference to a schema is something I agree with. if the schema is there we need a hint to the type

Makx: I disagree

<Zakim> LarsG, you wanted to talk about schema types

LarsG: then when it comes to schema type we need a mediatype for every schema type there is. There are several RDF or JSON approaches, and we cannot infer from the mediatype

Makx: the schema type isn't a property of the distribution.
… sometimes there is rdf+xml that informs us about types.

kcoyle: in the absence of an identifier for the schema, we just have to work within these limitations. It is not the role of DCAT to fix this

<AndreaPerego> It could be either - better a URI if we have it.

kcoyle: Is this intended to be a name or a URI? Does it matter? could it be either?

Makx: I think URIs are a good way to identify schema, but that is solution space

<riccardoAlbertoni> +1 to Makx

Linda: I think it covers service types as well. in 6.14

kcoyle: what is meant by service types?

<AndreaPerego> There's one specific to services - 6.15

Linda: data could be a download, but also a service

DaveBrowning: in some cases you are interrogating a dataset via an API

Makx: 2 things are unrelated; the second I read as being a file as a distribution. the first is about services where there is a datastore that is accessible via a query. there are 2 different requirements with the service being more complicated
… at SDSVoc we discussed this; a small set of parameters needed to access the information

kcoyle: what about 6.15?

LarsG: in 6.14 I don't see anywhere that the second note only talks about files - could it also be for APIs?

kcoyle: that note makes sense to me, but what is the extent that we need to define the requirement for services?

Linda: I agree with LarsG ; the second note could refer to the schema of the data retrieved from an API or WS request

kcoyle: for me the question is where does DCAT fit here

Makx: I think there are 2 different requirements; one for files (in DCAT-AP a property points to a schema), and the WS/API requirement which is more complex.

<kcoyle> PROPOSED: accept 6.14 re-worded as: "define a way to include identification of the schema the described data conforms to"

<riccardoAlbertoni> +1

<RubenVerborgh> +1

<Linda> +1

<LarsG> +1

<Makx> +1

<DaveBrowning> +1

<AndreaPerego> +1

Resolved: accept 6.14 re-worded as: "define a way to include identification of the schema the described data conforms to"
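Editorial illustration of the reworded 6.14: a minimal sketch, assuming a distribution description carries the media type and, separately, an identifier for the schema the data conforms to (in the spirit of the conformsTo property Makx mentions for DCAT-AP). The dictionary keys and the example.org URIs are hypothetical. It shows LarsG's point that two distributions can share a media type and still follow different schemas.

```python
# Hypothetical distribution descriptions; keys and URIs are illustrative.
distributions = [
    {"download_url": "https://data.example/a.rdf",
     "media_type": "application/rdf+xml",
     "conforms_to": "https://example.org/schema/profile-a"},
    {"download_url": "https://data.example/b.rdf",
     "media_type": "application/rdf+xml",
     "conforms_to": "https://example.org/schema/profile-b"},
    {"download_url": "https://data.example/c.csv",
     "media_type": "text/csv",
     "conforms_to": None},  # no schema identified
]


def by_schema(items, schema_uri):
    """Return the distributions whose data conforms to the given schema."""
    return [d for d in items if d.get("conforms_to") == schema_uri]


# Same media type, different schema: the media type alone cannot
# tell the first two distributions apart.
matches = by_schema(distributions, "https://example.org/schema/profile-a")
print([d["download_url"] for d in matches])
```

Whether the schema is identified by a URI or a name was left open in the discussion; the sketch uses URIs only because they are easy to compare.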

<Makx> related to the service-based data access, this discussion (long read): https://joinup.ec.europa.eu/discussion/dt2-service-based-data-access

<Makx> somewhere in the discussion at https://joinup.ec.europa.eu/discussion/dt2-service-based-data-access, there is a post by AndreaPerego dated 12/12/2016 - 22:41 that reports on the discussion at SDSVoc

<AndreaPerego> Should be this one: https://www.w3.org/2016/12/01-sdsvoc-minutes#item17

<Makx> Thanks Andrea, was looking for that link

<Jaroslav_Pullmann> is there a planned break now?

<Makx> https://www.w3.org/2016/12/01-sdsvoc-minutes#item17

6.15 now

RubenVerborgh: too broad - we'll never solve this because there is no general description for apis
… this will need boundaries

<Jaroslav_Pullmann> the use case mentions web services (~ SOAP/REST?)

RubenVerborgh: provide an extension point

Makx: Ruben, what is an extension point?

RubenVerborgh: a way to connect a DCAT distribution to a description

<Makx> https://www.w3.org/2016/12/01-sdsvoc-minutes#item17

Makx: shared a link to a solution
… an analysis and potential solution

RubenVerborgh: type of service should be out of scope

Jaroslav_Pullmann: dcat should not tackle description of services; extension point should define options for implementing this feature
… so that we do not have heterogeneous descriptions of services
… if we don't specify the type of description it isn't clear how it will be interpreted
… for example dcat should suggest use of openAPI for services
… needs to be detailed but not semantically enabled; point to WSDL documents

<Jaroslav_Pullmann> https://www.w3.org/TR/wsdl20/

Jaroslav_Pullmann: must be formal description of service; should it be expressed in a specific vocabulary?

annette_g: link to documentation that exists, not describe through dcat

Jaroslav_Pullmann: dcat should say which standards to use
… should we prescribe which standards?

Linda: in favor of pointing to service description, but mandating specific standards, no

PWinstanley: ditto to what Linda said. Things move too quickly to make specific recommendations
… don't even define object type as a string - could be anything

kcoyle: should we use "web" or just services?

DaveBrowning: could be logging into a proprietary database
… dropping web doesn't cost us anything

Jaroslav_Pullmann: some of the services do not have formal descriptions;
… important to know how will we describe the service
… afraid without this it will introduce ambiguity; too much freedom would not be good, unless we assume only humans will read it

annette_g: can imagine users looking for a particular api type

Jaroslav_Pullmann: specify either a formal description or just a string

Linda: ref. 6.48 refers to machine-actionable descriptions, UC 5.21
… can we go back to the author of 6.48 and get a better idea of what is meant?

<AndreaPerego> I think the UCs attributed to Stephen Richard have been actually added by Simon.

DaveBrowning: these services are likely to be very domain specific
… and possibly additional metadata to be able to make choices
… 1) ability to extend 2) should make it possible to make intelligent choices

<AndreaPerego> UC: https://w3c.github.io/dxwg/ucr/#ID18

AndreaPerego: requirement is a bit too generic; the use case has a number of issues
… 1) ability to understand the type of distribution
… 2) provide information about type of service
… 3) description of how to query the service
… no general solution for type of service;
… primary issue - when you get to distribution you get a note describing the service and users don't know what to do with that

Linda: agree with 1 & 2, we need those. But how to query the services goes too far
… users don't understand the services, but that's not something we can solve in DCAT

AndreaPerego: there are examples in geo data - using specific aspects of the standard; dcat could have a flag that indicates the services when the distribution is not just files

PROPOSED: accept 6.15 with 1) ability to describe the type of distribution and 2) provide information about the type of service

<Linda> +1

<Makx> +1

<Jaroslav_Pullmann> +1

<AndreaPerego> +1

<RubenVerborgh> +1

<LarsG> +1

<DaveBrowning> +1

<PWinstanley> +1

Resolved: accept 6.15 with 1) ability to describe the type of distribution and 2) provide information about the type of service

6.23 Define way to specify content of packaged files in a Distribution. For example, a set of files may be organised in an archive format and then compressed, but dct:hasFormat property only indicates the encoding type of the outer layer of packaging

Makx: can only say it is a zip file, not what format is in the zip file

Jaroslav_Pullmann: this makes sense; distinguish packaging / compression from file format
… could be recursive - archives within archives; what to do about that?

DaveBrowning: agree it's a real problem. i wouldn't want to explain zipping independently; it could be covered by fine-grained metadata

Makx: could include this in fine-grained info, because there could be packages within packages. Let dcat group see if it fits in 6.14

<Zakim> LarsG, you wanted to talk about two requirements (packages and compression)

LarsG: may be two requirements; one could be a simple encoding, the other a tar with multiple files

<Makx> nod

PROPOSED: accept 6.23, which may be resolved by the solution to 6.14

<Makx> +1

<Linda> +1

<annette_g> +1

<DaveBrowning> +1

<RubenVerborgh> +1

<LarsG> +1

<PWinstanley> +1

<Caroline> +1

<Jaroslav_Pullmann> +1

Resolved: accept 6.23, which may be resolved by the solution to 6.14
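Editorial illustration of the 6.23 discussion: a minimal sketch, assuming the description of a packaged distribution is modelled as nested layers, so that compression, archiving and the formats of the inner files are kept apart and archives within archives can be handled recursively. The `Layer` class and the example package are hypothetical; no such structure was proposed in the meeting.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Layer:
    """One layer of packaging, or a plain file when it has no contents."""
    format: str
    contents: List["Layer"] = field(default_factory=list)


def leaf_formats(layer):
    """Collect the formats of the innermost files, depth first."""
    if not layer.contents:
        return [layer.format]
    found = []
    for inner in layer.contents:
        found.extend(leaf_formats(inner))
    return found


# A gzip-compressed tar archive holding a CSV file and a nested zip
# archive, which itself holds a JSON file.
package = Layer("application/gzip", [
    Layer("application/x-tar", [
        Layer("text/csv"),
        Layer("application/zip", [Layer("application/json")]),
    ]),
])

print(package.format)         # the outer layer only
print(leaf_formats(package))  # what is actually inside
```

A single format property can only report the outer `application/gzip` layer, which is the limitation Makx describes; the recursive walk is what exposes the CSV and JSON content.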

6.37 now

Makx: comes out of DCAT-AP: what are the allowable differences between distributions? people didn't find an agreement
… same data different serializations? different data same series?
… wording in DCAT too vague

DaveBrowning: "all distributions have to be informationally equivalent"
… ability to relate datasets provides functionality that people are using distributions for

<Makx> +1 to DaveBrowning

PROPOSED: accept 6.37 (with particular attention to the use case)

<Linda> +1

<annette_g> +1

<Jaroslav_Pullmann> +1

<DaveBrowning> +1

<RubenVerborgh> +1

<Caroline> +1

Resolved: accept 6.37 (with particular attention to the use case)

<Makx> +1

<annette_g> thanks for helping, Makx!

<Jaroslav_Pullmann> bye!

<annette_g> thanks Jaroslav!

<Makx> bye

<LarsG> Bye all

<annette_g> 1:00 we come back

<annette_g> (one hour from now)

kcoyle: we're going to cover profiles until the break and then we'll work on identifiers and citation
… are profiles machine-actionable or only documents that provide guidelines?
… what do people think about it?

RubenVerborgh: profile can be expressed in machine readable way
… one profile has multiple expressions

kcoyle: should the dcat profile be machine actionable?

RubenVerborgh: a generic profile doesn't need to be machine actionable
… a specific profile can be expressed in a machine-readable way

<RubenVerborgh> correcting myself: a generic profile does not necessarily need a machine-readable expression

<RubenVerborgh> a DCAT profile might

annette_g: [reading about profiles on the charter]

kcoyle: it's up to us to clarify whether profile should be machine actionable or not

DaveBrowning: the profile definition doesn't need to be machine readable

annette_g: in order to validate something against the profile you need a machine-actionable profile

RubenVerborgh: I would like to see an explicit distinction between generic profiles and DCAT AP

<RubenVerborgh> http://profilenegotiation.github.io/I-D-Accept--Schema/I-D-accept-schema.txt

RubenVerborgh: shares a definition of profile

kcoyle: are we asking for more than an AP?
… is it enough to say the profile group will propose a definition?
… profiles will have to be defined.

DaveBrowning: we might get distracted by details when trying to define principles of this?

kcoyle: there are 4 UC related to profile
… 6.1.1 is very vague. let's go to the next ones
… we will start with 6.1.3
… 6.1.3 Profiles have URI identifiers that resolve to more detailed descriptions

kcoyle: the way I read it it provides more information for the content type

RubenVerborgh: profiles don't provide syntax, but do provide semantic constraints
… we need a distinction between Content Type and a Profile
… a content type is related to the syntax
… a profile is layered on top of a content type, but it doesn't provide information about syntax; it provides information about the semantics and how the file could be interpreted

Linda: agrees with RubenVerborgh, the profile doesn't provide info about syntax
… but in my mind profile will have URI identifiers and will link with other resources

<annette_g> +1

kcoyle: profiles may have links pointing to additional information

RubenVerborgh: what matters is that the URI resolves to some useful resource, either for humans or machines

kcoyle: basically saying profile can link to other resources

Linda: it could be XML schema, for instance

kcoyle: there are human beings involved in the creation of metadata

PWinstanley: +1

annette_g: the way it's written isn't good

kcoyle: it says profile can link to other documents

kcoyle: suggest rewording

annette_g: is it a good practice for profiles to have URLs?

kcoyle: yes

<kcoyle> http://dublincore.org/documents/singapore-framework/

kcoyle: a resource like this would be a general model
… then we would specify for dcat

kcoyle: none of the use cases were about the abstract level of a profile

RubenVerborgh: should we be more specific in this requirement?

kcoyle: we need to rewrite it

kcoyle: suggests to move to the identifiers part

DaveBrowning: I don't see a requirement for guidance
… as we mentioned in the beginning of this part
… I think the kind of guidance on how to construct a profile should be a requirement

kcoyle: some of those guidance requirements don't appear in the use cases
… the way those requirements are written doesn't help us with the guidance
… maybe we should ask for the profile guidance group to come up with a suggestion on it

RubenVerborgh: suggests to move to 6.2
… just want to say it looks good to me :)
… but it's written in a specific way; it looks like the answer will be content negotiation

<Zakim> RubenVerborgh, you wanted to suggest to move to 6.2 instead

<annette_g> thumbs up

<kcoyle> PROPOSED: accept 6.2.1

<DaveBrowning> +1

<Linda> +1

<RubenVerborgh> +1

<annette_g> +1

<Caroline> +1

<PWinstanley> +1

Resolved: accept 6.2.1

kcoyle: what about the 6.3.1

RubenVerborgh: there's a note on the 6.3.1, but I don't think it should be there
… the 6.3.1 isn't covered in the 6.2.1, they are different

<kcoyle> PROPOSED: accept 6.3.1

<annette_g> +1

<RubenVerborgh> +1 without the note

<DaveBrowning> +1

<PWinstanley> +1

Resolved: accept 6.3.1

kcoyle: how about 6.4.1 ?

<kcoyle> PROPOSED: accept 6.4.1, but note redundancy with 6.1.3 and 6.1.7, etc.

<RubenVerborgh> +1

<Caroline> +1

<Linda> +1

<DaveBrowning> +1

<PWinstanley> +1

<annette_g> -1

annette_g: I don't think the use cases are talking about more detail about the profile

RubenVerborgh: we have profile as a concept and we also have concrete representation

kcoyle: I believe it's more information to support the profile

annette_g: I believe the profile should be self explanatory
… is it a way to express the profile itself more richly?

RubenVerborgh: yes

Resolved: accept 6.4.1, but note redundancy with 6.1.3 and 6.1.7, etc.

<annette_g> I'm okay with having an option to express the profile with more richness.

kcoyle: now about 6.42.1
… could it be a specific property?
… take an example of data in a specific standard: in that case you may want to say which standard it is

DaveBrowning: is it related to data creation rules?

kcoyle: yes

PWinstanley: what do we mean by a profiled … ?

kcoyle: it's a typo (???)

Linda: why do you want it to be in a profile?

kcoyle: it's a link to guidance resources, standards information; it's not a machine-actionable thing

Linda: why should it be in a profile? maybe it should be in the metadata property in dcat

DaveBrowning: when we create a profile we want to apply a set of rules to the content
… I think it could be a model or schema that dcat links to
… because it's about the data and not about the metadata

kcoyle: I think profile is a community specific DCAT

annette_g: a dcat property can have a url linking to that resource

<kcoyle> PROPOSED: accept 6.42.1 into DCAT

<annette_g> +1

<Linda> +1

<DaveBrowning> +1

<PWinstanley> +1

<Caroline> +1

Resolved: accept 6.42.1 into DCAT worded as: Ability to express "guidance rules" or "creation rules" in DCAT

kcoyle: starting identifiers discussion

https://www.w3.org/2017/dxwg/wiki/Meetings:F2F2017.11.09

kcoyle: I see 6.24 and 6.25 as contradictory

hadleybeeman: we had the same kind of discussion in the past (at TAG and at DWBP)
… I would argue we have to use web identifiers

DWBP #9 https://www.w3.org/TR/dwbp/#UniqueIdentifiers

"Best Practice 9: Use persistent URIs as identifiers of datasets"

<hadleybeeman> https://www.w3.org/TR/dwbp/#DataIdentifiers

DaveBrowning: we haven't made the distinction so far; if we had made it, we could use hadley's suggestion

kcoyle: we also have identifiers that are URNs

hadleybeeman: as WG you can propose a definition for identifier
… but I warn that if you come up with a very distinct definition it will be hard to get approval when it goes to wide review

kcoyle: we don't need to make a decision about the kind of identifiers

<AndreaPerego> Related UC already lists some modelling options: https://w3c.github.io/dxwg/ucr/#ID11

kcoyle: communities sometimes use their own identifiers

kcoyle: I would be reluctant to tell people how to form their own identifiers
… or what they should look like

hadleybeeman: to solve which problem?
… if there's an interoperability problem it is worth solving

kcoyle: if you have a URL as an identifier do you also need to identify the type?

AndreaPerego: people can use both the URI and literals as identifiers

kcoyle: it seems to me you use literal for search and URI to retrieve something

PWinstanley: a vehicle can be identified by its registration plate, but it can change
… the vehicle definitive identifier is the VIN number

<hadleybeeman> People can also use URIs both as dereferenceable links and as literals.

PWinstanley: it's part of the primary and secondary id

kcoyle: sometimes there's no identifier
… there are only literals
… and sometimes each organization chooses a specific model for identifier

hadleybeeman: it doesn't seem relevant not to have URIs
… in the semantic web context it is very important to have URIs, because they are universal identifiers and can also be dereferenceable

<AndreaPerego> I think our main focus should be on "standard" identifiers, as DOIs, ISBNs, ISNIs, ORCIDs.

PWinstanley: in other contexts they have a hard coded mapping for identifiers
… for employees in an organization, for instance
… the real world is messy

kcoyle: getting back to our issue
… there's a use case for non-URI identifiers
… the question is how do you specify the kind of identifier used

<AndreaPerego> There's dct:identifier and adms:Identifier.

hadleybeeman: [points to the DWBP data identifiers section]

<AndreaPerego> The search is made on the ID type, and the prefix URI cannot be (safely) used to denote it.

<Zakim> AndreaPerego, you wanted to explain the "contradiction"

<hadleybeeman> @kcoyle: This is a Google search to return links with the string "dwbp" in them: https://‌www.google.com/‌search?client=firefox-b&ei=qisGWrqTHOKq0gLEi47wBQ&q=link%3Adwbp&oq=link%3Adwbp&gs_l=psy-ab.3...28538.29282.0.29597.4.4.0.0.0.0.82.254.4.4.0....0...1..64.psy-ab..0.0.0....0.clOHx_5ExZU

<annette_g> https://‌www.nytimes.com/‌2017/‌11/‌10/‌world/‌middleeast/‌saudi-arabia-lebanon-france-macron.html?hp&action=click&pgtype=Homepage&clickSource=story-heading&module=first-column-region&region=top-news&WT.nav=top-news could be found by searching for world/middleeast/saudi-arabia-lebanon-france-macron

<annette_g> kcoyle: I'm suggesting we rewrite 6.24 and 6.25 to specify the linking for one of them and searching for the other. Would that meet your needs?

<annette_g> AndreaPerego: okay

Resolved: need a re-write of 6.24 and 6.25 focusing on the functionality desired: 1) linking (identifying) and 2) discovery

<AndreaPerego> +1

<Linda> +1

<annette_g> +1

<PWinstanley> +1

<DaveBrowning> +1
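Editorial illustration of the two functions named in the resolution on 6.24 and 6.25: a minimal sketch, assuming a dataset has one URI used for linking and any number of typed identifiers used for discovery. The field names, the scheme labels and the identifier values are made up for the example. The search is keyed on the identifier type together with its value, following AndreaPerego's remark that the search is made on the ID type.

```python
from collections import defaultdict

# Hypothetical dataset descriptions: "uri" is used for linking,
# "identifiers" holds (scheme, value) pairs used for discovery.
datasets = [
    {"uri": "https://data.example/dataset/1",
     "identifiers": [("doi", "10.1234/abcd")]},
    {"uri": "https://data.example/dataset/2",
     "identifiers": [("doi", "10.1234/efgh"),
                     ("local", "DS-0002")]},
]


def build_index(items):
    """Index dataset URIs by (scheme, value) for discovery."""
    index = defaultdict(list)
    for item in items:
        for scheme, value in item["identifiers"]:
            index[(scheme, value)].append(item["uri"])
    return index


def find(index, scheme, value):
    """Discovery: look up dataset URIs by identifier type and value."""
    return index.get((scheme, value), [])


index = build_index(datasets)
print(find(index, "doi", "10.1234/efgh"))  # discovery by typed literal
print(find(index, "local", "DS-0002"))     # same dataset, other scheme
```

The returned URI is what a client would then dereference, which matches kcoyle's reading that the literal is used for search and the URI to retrieve something.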

<annette_g> kcoyle: now let's look at 6.26

<annette_g> AndreaPerego: Some identifiers are secondary. In some communities there is a need to distinguish primary and secondary ones.

<annette_g> hadleybeeman: can you give an example?

<annette_g> AndreaPerego: an introduction may have an identifier separate from the thing itself (?)

<annette_g> kcoyle: we include even invalid identifiers

<annette_g> kcoyle: what would we want to see happen if more than one identifier were included?

<annette_g> AndreaPerego: we need the distinction between primary and secondary

<annette_g> hadleybeeman: I would encourage avoiding the terms primary and secondary

<annette_g> PWinstanley: we have to be careful about providing illustrations about this, where things might appear to be the same in the short term but not in the long term.

<annette_g> PWinstanley: Naive usage can end up with a mess.

<annette_g> kcoyle: we don't seem to have a strong case for calling things primary and secondary.

<kcoyle> PROPOSED: consider 6.26 out of scope

<AndreaPerego> +1

<annette_g> +1

<PWinstanley> +1

<DaveBrowning> +1

<hadleybeeman> +1

<Linda> +1

Resolved: 6.26 is out of scope

<annette_g> kcoyle: 6.43.1 not sure why it's in this section

<annette_g> 6.43.1 Define a means to identify a serialized DCAT Data(sub)set (i.e. a particular Distribution of a Dataset or its subset).

<annette_g> hadleybeeman: this comes up in versioning, when do you issue a new version in time series?

<annette_g> DaveBrowning: we already have a requirement to support subsets of datasets.

<annette_g> kcoyle: can we say it's implicit in the other requirements

<kcoyle> PROPOSED: 6.43.1 is not needed because covered already in other requirements

<PWinstanley> +1

<Linda> +1

<DaveBrowning> +1

<annette_g> +1

Resolved: 6.43.1 is not needed because covered already in other requirements

<hadleybeeman> For/re 6.19 https://www.w3.org/TR/vocab-duv/

<annette_g> 6.19 Provide a way to specify information required for data citation (e.g., dataset authors, title, publication year, publisher, persistent identifier) https://w3c.github.io/dxwg/ucr/#RID17

<DaveBrowning> AndreaPerego: Introduces use case - include the full set of terms such as persistent id, full set of authors for citation

<DaveBrowning> ... perhaps including guidance pointing at the full range of info needed

<DaveBrowning> annette_g: Summarising - we have much of this already, so it's mainly going to be guidance?

<DaveBrowning> AndreaPerego: Yes. Perhaps we should make it explicit what you have to do for full citation support

<kcoyle> https://‌www.w3.org/‌TR/‌vocab-duv/#Citation_Model

<DaveBrowning> kcoyle: DUV covers most of this - suggests accept requirement, point at DUV and leave to DCAT team to explore if more is needed

<annette_g> PROPOSED: accept req 6.19 but suggest the group looks at DUV as a gap analysis.

<AndreaPerego> +1

<DaveBrowning> ... Gap analysis would be interesting

<DaveBrowning> PWinstanley: This is an area with quite a bit of work already done - need for a 'recipe book' approach. DCAT provides the basis but the building blocks exist

<DaveBrowning> ... e.g. CERIF etc.

<kcoyle> +1

<Linda> +1

<DaveBrowning> +1

<annette_g> +1

<PWinstanley> +1

<PWinstanley> http://‌www.eurocris.org/‌cerif/‌main-features-cerif

<AndreaPerego> +1

Resolved: accept req 6.19 but suggest the group looks at DUV as a gap analysis.

<annette_g> 6.32 Allow for specification of the start and/or end date of temporal coverage. https://‌w3c.github.io/‌dxwg/‌ucr/#RID27.1

<AndreaPerego> dct:temporal <http://‌reference.data.gov.uk/‌id/‌quarter/‌2006-Q1> ;

<DaveBrowning> AndreaPerego: Currently there is somewhat complex support, but nothing easy like 'start date' and 'end date'

<DaveBrowning> PWinstanley: perhaps needs to be date time rather than just date?

<AndreaPerego> The current solution in DCAT-AP, following ADMS, is dct:temporal + dct:PeriodOfTime + schema:startDate / schema:endDate

<AndreaPerego> date or date time are both supported - it is a matter of datatype.

<annette_g> PROPOSED: accept req. 6.32 but also including the ability to specify time rather than just dates.

<Linda> +1

<kcoyle> +1

<DaveBrowning> +1

<PWinstanley> +1

<annette_g> +1

<AndreaPerego> +1

Resolved: accept req. 6.32 but also including the ability to specify time rather than just dates.
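
Illustration of the pattern AndreaPerego describes above (dct:temporal + dct:PeriodOfTime + schema:startDate / schema:endDate), where choosing between a date and a date-time is only a matter of the literal's datatype. This is a minimal sketch, not an agreed DCAT design: the function names and the example dataset IRI are hypothetical, and the Turtle is built with plain string formatting rather than an RDF library.

```python
from datetime import date, datetime

# Namespaces of the vocabularies named in the discussion.
DCT = "http://purl.org/dc/terms/"
SCHEMA = "http://schema.org/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def typed_literal(value):
    """Return a Turtle typed literal whose datatype follows the Python type."""
    # datetime is a subclass of date, so it has to be tested first.
    if isinstance(value, datetime):
        return f'"{value.isoformat()}"^^<{XSD}dateTime>'
    if isinstance(value, date):
        return f'"{value.isoformat()}"^^<{XSD}date>'
    raise TypeError("expected a date or datetime")


def temporal_coverage(dataset_iri, start=None, end=None):
    """Build a Turtle statement for a period with a start and/or an end."""
    if start is None and end is None:
        raise ValueError("give a start, an end, or both")
    props = [f"a <{DCT}PeriodOfTime>"]
    if start is not None:
        props.append(f"<{SCHEMA}startDate> {typed_literal(start)}")
    if end is not None:
        props.append(f"<{SCHEMA}endDate> {typed_literal(end)}")
    body = " ;\n    ".join(props)
    return f"<{dataset_iri}> <{DCT}temporal> [\n    {body}\n] ."


if __name__ == "__main__":
    # Hypothetical dataset IRI; the start is a plain date, the end a date-time.
    print(temporal_coverage(
        "http://example.org/dataset/1",
        start=date(2006, 1, 1),
        end=datetime(2006, 3, 31, 23, 59, 59),
    ))
```

Passing only `start` or only `end` yields an open-ended period, which matches the "start and/or end date" wording of the requirement.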

<annette_g> 6.34 Provide means to specify spatial coverage with geometries

<DaveBrowning> AndreaPerego: This covers co-ordinate systems but also bounding boxes etc (see use case).

<DaveBrowning> ... DCAT only covers dct:spatial rather than arbitrary bounding boxes

<AndreaPerego> Example from SDW BP: https://‌www.w3.org/‌TR/‌sdw-bp/#ex-geodcat-ap-bag-addresses

<DaveBrowning> PWinstanley: Can we avoid this becoming a rabbit hole - there are lots of reference systems in astronomy?

<DaveBrowning> Linda: These are much more standard

<annette_g> PROPOSED: accept req. 6.34

<kcoyle> +1

<Linda> +1

<annette_g> +1

<DaveBrowning> annette_g: This could be worked through by the DCAT team

<PWinstanley> +1

<DaveBrowning> +1

<AndreaPerego> +1

Resolved: accept req. 6.34
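
Illustration of what "spatial coverage with geometries" could look like for the bounding-box case raised above. This is a sketch under stated assumptions, not something the group agreed: it assumes the GeoDCAT-AP style of attaching a WKT literal via locn:geometry to a dct:Location, longitude/latitude axis order (CRS84), and a box that does not cross the antimeridian. The function names and the dataset IRI are hypothetical.

```python
DCT = "http://purl.org/dc/terms/"
LOCN = "http://www.w3.org/ns/locn#"
GSP = "http://www.opengis.net/ont/geosparql#"


def bbox_to_wkt(west, south, east, north):
    """Return a closed WKT polygon ring for a lon/lat bounding box."""
    if not (-180 <= west <= east <= 180):
        raise ValueError("need -180 <= west <= east <= 180 (no antimeridian crossing)")
    if not (-90 <= south <= north <= 90):
        raise ValueError("need -90 <= south <= north <= 90")
    # A WKT ring must end on the point it started from.
    ring = [(west, south), (east, south), (east, north), (west, north), (west, south)]
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"


def spatial_coverage(dataset_iri, west, south, east, north):
    """Build a Turtle statement carrying the box as a WKT literal (assumed pattern)."""
    wkt = bbox_to_wkt(west, south, east, north)
    return (
        f"<{dataset_iri}> <{DCT}spatial> [\n"
        f"    a <{DCT}Location> ;\n"
        f'    <{LOCN}geometry> "{wkt}"^^<{GSP}wktLiteral>\n'
        f"] ."
    )


if __name__ == "__main__":
    # Hypothetical dataset IRI and a rough bounding box.
    print(spatial_coverage("http://example.org/dataset/1", 3.3, 50.7, 7.2, 53.6))
```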

<annette_g> 6.33 Provide means to specify the reference system(s) used in a dataset

<AndreaPerego> CRS on SDW BP: https://‌www.w3.org/‌TR/‌sdw-bp/#CRS-background

<DaveBrowning> AndreaPerego: This has been widely discussed in geospatial.

<DaveBrowning> PWinstanley: So this is suggested for geospatial but other reference systems exist in other domains (UCUM)

<DaveBrowning> ... suggestion - should we aim to support this wider remit for other numerical data

<AndreaPerego> Makes sense to me.

<DaveBrowning> kcoyle: Concerned that the more general case gets very complicated

<DaveBrowning> AndreaPerego: This could be associated at the distribution as well

<DaveBrowning> ... should be added to requirement

<DaveBrowning> kcoyle: this could be in a profile rather than the core DCAT

<annette_g> PROPOSED: accept req. 6.33 and include application to distributions. It is not for spatial data alone.

<Linda> +1

<DaveBrowning> +1

<kcoyle> +1

<PWinstanley> +1

<annette_g> +1

Resolved: accept req. 6.33 and include application to distributions. It is not for spatial data alone.

<annette_g> 6.31 Define extension points (e.g. properties) for integration of external, specialized vocabularies.

<annette_g> PROPOSED: reject req. 6.31 as it is covered in other requirements already.

<PWinstanley> +1

<annette_g> +1

<Linda> +1

<DaveBrowning> +1

<AndreaPerego> +1

<kcoyle> +1

Resolved: reject req. 6.31 as it is covered in other requirements already.

Action: kcoyle to put out a call for use cases for profiles

<trackbot> Created ACTION-56 - Put out a call for use cases for profiles [on Karen Coyle - due 2017-11-18].

Action: kcoyle to write up changes to UCR doc implied by F2F resolutions

<trackbot> Created ACTION-57 - Write up changes to ucr doc implied by f2f resolutions [on Karen Coyle - due 2017-11-18].

<DaveBrowning> RRSAgent: draft minutes v2

<DaveBrowning> :)

<annette_g> thank you for persevering!


10 November 2017
