IRC log of dxwgdcat on 2018-06-28

Timestamps are in UTC.

08:28:17 [RRSAgent]
RRSAgent has joined #dxwgdcat
08:28:18 [RRSAgent]
logging to https://www.w3.org/2018/06/28-dxwgdcat-irc
08:28:28 [PWinstanley]
rrsagent, make logs public
08:28:38 [SimonCox]
rrsagent, make logs public
08:29:46 [SimonCox]
Meeting: DXWG DCAT subgroup teleconference 28 June 201
08:30:00 [SimonCox]
s/201/2018/
08:30:15 [SimonCox]
Agenda: https://www.w3.org/2017/dxwg/wiki/Meetings:DCAT-Telecon2018.06.28
08:30:34 [arminhaller]
arminhaller has joined #dxwgdcat
08:30:47 [SimonCox]
chair: SimonCox
08:31:03 [SimonCox]
regrets: Dave Browning
08:31:56 [SimonCox]
present+
08:34:31 [arminhaller]
present+
08:35:23 [riccardoAlbertoni]
riccardoAlbertoni has joined #dxwgdcat
08:35:30 [riccardoAlbertoni]
present+
08:35:46 [Jaroslav_Pullmann]
Jaroslav_Pullmann has joined #dxwgdcat
08:35:53 [Jaroslav_Pullmann]
present+
08:36:43 [arminhaller]
scribe: arminhaller
08:36:44 [SimonCox]
Agenda: https://www.w3.org/2017/dxwg/wiki/Meetings:DCAT-Telecon2018.06.28
08:36:47 [PWinstanley]
PWinstanley has joined #dxwgdcat
08:36:50 [PWinstanley]
present+
08:36:50 [arminhaller]
scribenick: arminhaller
08:37:05 [SimonCox]
Topic: Confirm agenda
08:37:51 [arminhaller]
+1 for agenda
08:37:53 [riccardoAlbertoni]
present+
08:37:59 [SimonCox]
topic: Approve minutes from last meeting
08:38:02 [riccardoAlbertoni]
+1 to agenda
08:38:09 [arminhaller]
0 not there
08:38:11 [SimonCox]
https://www.w3.org/2018/06/21-dxwgdcat-minutes
08:38:18 [SimonCox]
0 not there
08:38:19 [PWinstanley]
+1
08:38:19 [riccardoAlbertoni]
0 (I was not there)
08:38:25 [Jaroslav_Pullmann]
0 (absent)
08:38:45 [SimonCox]
Resolved: minutes approved
08:38:59 [SimonCox]
Topic: Mailing list questions
08:39:07 [SimonCox]
https://lists.w3.org/Archives/Public/public-dxwg-comments/2018Apr/0001.html
08:39:45 [riccardoAlbertoni]
+1 to have an issue
08:39:46 [SimonCox]
q?
08:40:10 [SimonCox]
Action: SimonCox to acknowledge comment
08:40:10 [trackbot]
Sorry, but no Tracker is associated with this channel.
08:40:22 [SimonCox]
Action: Simon to create issue from mailing list comment
08:40:22 [trackbot]
Sorry, but no Tracker is associated with this channel.
08:41:12 [SimonCox]
Topic: Catalogues in which dataset is a bag of files
08:41:22 [SimonCox]
https://github.com/w3c/dxwg/issues/256
08:41:44 [SimonCox]
q?
08:42:35 [arminhaller]
SimonCox: Do we need the plenary to vote on this?
08:43:08 [arminhaller]
PWinstanley: Push it through here and get the nod from the main group (UCR).
08:43:37 [arminhaller]
... should get top of the agenda next week
08:44:36 [arminhaller]
... for the main meeting
08:45:46 [arminhaller]
... this is a special case, we won't need new use cases in many instances
08:46:25 [SimonCox]
q?
08:46:55 [SimonCox]
Topic: How to express distributions provided as compressed files
08:47:07 [SimonCox]
https://github.com/w3c/dxwg/issues/259
08:48:11 [AndreaPerego]
AndreaPerego has joined #dxwgdcat
08:48:12 [SimonCox]
q?
08:48:27 [AndreaPerego]
present+ AndreaPerego
08:49:06 [arminhaller]
q+
08:49:06 [SimonCox]
q?
08:49:12 [SimonCox]
ack arminhaller
08:49:16 [Jaroslav_Pullmann]
q+
08:49:24 [AndreaPerego]
RRSAgent, make logs public
08:49:31 [AndreaPerego]
RRSAgent, draft minutes v2
08:49:31 [RRSAgent]
I have made the request to generate https://www.w3.org/2018/06/28-dxwgdcat-minutes.html AndreaPerego
08:49:38 [SimonCox]
arminhaller: from API pov it is just the media-type that matters
08:49:44 [SimonCox]
q?
08:49:44 [PWinstanley]
q+
08:49:49 [SimonCox]
ack Jaroslav_Pullmann
08:49:52 [SimonCox]
q?
08:49:56 [SimonCox]
q+
08:50:52 [SimonCox]
ack PWinstanley
08:51:00 [AndreaPerego]
q+ to ask if Makx's comment on GH could be relevant here: https://github.com/w3c/dxwg/issues/54#issuecomment-359062055
08:51:01 [arminhaller]
Jaroslav_Pullmann: There was a suggestion to indicate the original media type, i.e. in the distribution metadata we should allow the media type description
08:51:33 [arminhaller]
PWinstanley: How deep do we go: MIME types, encodings? This can become a rabbit hole
08:51:57 [arminhaller]
Jaroslav_Pullmann: we have packaging like tar, compressions like zip
08:52:07 [SimonCox]
ack SimonCox
08:52:11 [arminhaller]
PWinstanley: We should only be concerned with the compression type, not the content
08:52:14 [arminhaller]
+1 to PWinstanley
08:52:58 [arminhaller]
SimonCox: we should give the user more information than the Web architecture provides us
08:53:09 [SimonCox]
q?
08:53:13 [SimonCox]
ack AndreaPerego
08:53:13 [Zakim]
AndreaPerego, you wanted to ask if Makx's comment on GH could be relevant here: https://github.com/w3c/dxwg/issues/54#issuecomment-359062055
08:54:19 [arminhaller]
AndreaPerego: for packaged distributions, at least compressed ones, you can use the + to include the included mime type
08:54:37 [SimonCox]
q?
08:54:40 [arminhaller]
s/+/+zip
08:55:20 [arminhaller]
SimonCox: In the past we had encountered similar problems with GML, with one level of compression
08:55:28 [arminhaller]
q+
08:55:32 [AndreaPerego]
q+
08:55:48 [riccardoAlbertoni]
+1 to also considering the type of resource involved (e.g. GML in Simon's example)
08:56:13 [arminhaller]
SimonCox: the risk is that you have a potential infinite level of nesting
08:56:49 [SimonCox]
q?
08:56:55 [SimonCox]
ack arminhaller
08:57:00 [riccardoAlbertoni]
+q
08:57:23 [AndreaPerego]
q-
08:57:40 [SimonCox]
ack riccardoAlbertoni
08:57:43 [Jaroslav_Pullmann]
q+
08:57:45 [AndreaPerego]
q+
08:57:52 [PWinstanley]
+1 to arminhaller and the avoidance of rabbit holes
08:58:23 [arminhaller]
arminhaller: we should support +zip suffix, but not arbitrary levels of file hierarchies that may be contained in a packaged file or compressed file
08:58:35 [SimonCox]
ack Jaroslav_Pullmann
08:58:50 [SimonCox]
q+
08:58:59 [arminhaller]
Jaroslav_Pullmann: there might be recursive structures, but that is normally not the case
08:59:14 [SimonCox]
q+ to note connection with previous agenda topic
08:59:28 [arminhaller]
... it is important to let the user know what is in the compressed format
08:59:56 [SimonCox]
ack AndreaPerego
09:00:36 [arminhaller]
AndreaPerego: Want to report how we deal with this issue. We ignore the fact that a package distribution is compressed.
09:00:54 [arminhaller]
... people like to know what is inside, shape, CSV or whatever
09:01:20 [arminhaller]
... we want to say what is the primary format
09:02:19 [arminhaller]
... there was also a comment from David Reed. I don't care about the compression. The software can uncompress on the fly. There is no need to tell the machine the compression format.
09:02:41 [Jaroslav_Pullmann]
q+
09:02:51 [SimonCox]
ack SimonCox
09:02:51 [Zakim]
SimonCox, you wanted to note connection with previous agenda topic
09:02:56 [riccardoAlbertoni]
I agree that the compression is not the interesting thing; the interesting thing is what is compressed
09:03:18 [AndreaPerego]
RRSAgent, draft minutes v2
09:03:18 [RRSAgent]
I have made the request to generate https://www.w3.org/2018/06/28-dxwgdcat-minutes.html AndreaPerego
09:03:22 [arminhaller]
SimonCox: Are we understanding what the requirement is, as riccardoAlbertoni said?
09:04:09 [arminhaller]
... the fact that there may be multiple files within overlaps with the previous issue that we did not discuss
09:06:04 [arminhaller]
... we should record an agreement that it is important to know what is within the archive
09:06:27 [AndreaPerego]
q+
09:06:28 [Jaroslav_Pullmann]
+1 for focusing on purpose of Distribution metadata (indicating bare content type)
09:06:35 [PWinstanley]
q+
09:07:13 [SimonCox]
Proposal: We agree that the content of an archive distribution (i.e. what is inside a zip or tar file) is important for the users of a Catalogue and should be part of the description
09:07:24 [SimonCox]
s/Proposal/Proposed/
09:07:28 [SimonCox]
+1
09:07:29 [Jaroslav_Pullmann]
+1
09:07:30 [riccardoAlbertoni]
+1
09:07:47 [AndreaPerego]
+1 (we also to that in the JRC Data Catalogue)
09:08:06 [arminhaller]
+1 (just using the +zip suffix if it is a compressed file)
09:08:10 [AndreaPerego]
s/+1 (we also to that in the JRC Data Catalogue)/+1 (we also do that in the JRC Data Catalogue)/
09:08:20 [arminhaller]
s/we also to/we also do
09:09:25 [SimonCox]
... in the dcat:mediaType property
09:09:36 [PWinstanley]
I've a question about the degree of mandation of this, because there is an element of agency involved here - sometimes people keeping the collection (e.g. in CKAN etc) may be given a compressed file but might not have all the information / understanding of the contents. So the information is desirable, but the resolution uses 'should'
09:09:49 [SimonCox]
resolved: We agree that the content of an archive distribution (i.e. what is inside a zip or tar file) is important for the users of a Catalogue and should be part of the description
09:10:52 [SimonCox]
q?
09:11:07 [AndreaPerego]
RRSAgent, draft minutes v2
09:11:07 [RRSAgent]
I have made the request to generate https://www.w3.org/2018/06/28-dxwgdcat-minutes.html AndreaPerego
09:11:29 [SimonCox]
ack Jaroslav_Pullmann
09:11:36 [PWinstanley]
q-
09:11:45 [arminhaller]
Jaroslav_Pullmann: We are not considering nesting of content
09:11:58 [arminhaller]
... flat content that is optimised through compression
09:12:20 [arminhaller]
... metadata is there for a purpose: for the agent to know what is inside
09:12:54 [arminhaller]
... automated agents should know about the surface format
09:13:05 [riccardoAlbertoni]
+1 to Jaroslav_Pullmann (otherwise, if we are not talking about "flat" file we need an extra use case/requirement to consider)
09:13:11 [arminhaller]
... I am advocating that we include both
09:13:15 [PWinstanley]
s/mandation/requirement/
09:13:38 [arminhaller]
SimonCox: what about an archive with mixed file formats in it
09:13:43 [arminhaller]
s/in it/in it?
09:13:57 [arminhaller]
Jaroslav_Pullmann: Then we talk about different data
09:14:37 [AndreaPerego]
I think it's too strong to say it's different data.
09:15:04 [arminhaller]
arminhaller: What about a compressed file that contains ttl, n3 and rdf/xml files that are all equivalent, semantically.I have done that before.
09:15:20 [arminhaller]
s/semantically.I have/semantically. I have/
09:15:37 [AndreaPerego]
+1 to arminhaller. Same with CSV, TSV, and spreadsheet formats.
09:16:13 [SimonCox]
q?
09:16:38 [SimonCox]
ack AndreaPerego
09:17:40 [arminhaller]
AndreaPerego: In the geospatial community it is common to have a shape file with additional files like manifests included in there.
09:17:43 [riccardoAlbertoni]
+1 AndreaPerego
09:18:18 [arminhaller]
... for standard nested formats we don't need to do anything
09:19:02 [arminhaller]
... if nesting is done in an arbitrary way, a readme file within the structure should be used
09:19:21 [SimonCox]
This topic is also related to https://github.com/w3c/dxwg/issues/256 and https://github.com/w3c/dxwg/issues/81
09:19:53 [arminhaller]
... the fact that they use a zip bundle is deliberate, because they intentionally want to help users
09:20:22 [PWinstanley]
q+
09:20:26 [riccardoAlbertoni]
perhaps we should add an extra use case/requirement about Andrea's bundle ..
09:20:27 [arminhaller]
... here I find it difficult to use metadata to describe the content, unless we use sitemap
09:20:35 [SimonCox]
where it is well-known bundle structure, then is this handled with dct:conformsTo ?
09:20:40 [SimonCox]
ack PWinstanley
09:20:43 [SimonCox]
q?
09:21:09 [SimonCox]
+1 to riccardoAlbertoni !
09:21:27 [arminhaller]
PWinstanley: In highly structured content, they give you a context.xml file with a common pattern
09:21:45 [SimonCox]
q?
09:21:58 [arminhaller]
SimonCox: I wonder if this is already covered with the conformsTo property
09:22:08 [riccardoAlbertoni]
+q
09:22:13 [arminhaller]
+1 on two separate use cases
09:22:15 [SimonCox]
ack riccardoAlbertoni
09:22:21 [PWinstanley]
s/common pattern/common pattern that only deals with the current level in relation to the parent level/
09:23:11 [arminhaller]
riccardoAlbertoni: just to reiterate we need two use cases, one flat file use case and one for bundled distributions
09:23:18 [SimonCox]
q?
09:23:52 [AndreaPerego]
RRSAgent, draft minutes v2
09:23:52 [RRSAgent]
I have made the request to generate https://www.w3.org/2018/06/28-dxwgdcat-minutes.html AndreaPerego
09:23:56 [SimonCox]
q?
09:24:39 [SimonCox]
q?
09:25:01 [arminhaller]
SimonCox: +1 to Jakubklimek's contributions on Github
09:25:07 [SimonCox]
action: SimonCox to add some notes into https://github.com/w3c/dxwg/issues/259 about our discussion
09:25:07 [trackbot]
Sorry, but no Tracker is associated with this channel.
09:25:24 [AndreaPerego]
q+
09:25:33 [SimonCox]
ack AndreaPerego
09:25:54 [arminhaller]
AndreaPerego: We are not getting any feedback on DCAT
09:26:04 [arminhaller]
... is there a conference where we can get feedback?
09:26:08 [SimonCox]
Topic: no feedback on DCAT FPWD
09:26:27 [PWinstanley]
q+
09:26:58 [arminhaller]
SimonCox: I presented DCAT at several conferences recently. One was a conference in Melbourne organised by my organisation, CSIRO.
09:27:23 [arminhaller]
... another one with ANDS, the data service provider in Australia
09:27:52 [arminhaller]
... I can trigger responses from those users
09:27:56 [SimonCox]
action: SimonCox to trigger feedback from ANDS
09:27:56 [trackbot]
Sorry, but no Tracker is associated with this channel.
09:27:57 [SimonCox]
q?
09:28:11 [SimonCox]
ack PWinstanley
09:29:19 [riccardoAlbertoni]
thanks, enjoy the rest of the week!
09:29:23 [arminhaller]
bye
09:29:32 [AndreaPerego]
RRSAgent, draft minutes v2
09:29:32 [RRSAgent]
I have made the request to generate https://www.w3.org/2018/06/28-dxwgdcat-minutes.html AndreaPerego
09:29:34 [Jaroslav_Pullmann]
bye!
09:29:39 [AndreaPerego]
RRSAgent, draft minutes v2
09:29:39 [RRSAgent]
I have made the request to generate https://www.w3.org/2018/06/28-dxwgdcat-minutes.html AndreaPerego
09:29:47 [PWinstanley]
bye
09:29:53 [arminhaller]
arminhaller has joined #dxwgdcat
09:36:09 [AndreaPerego]
RRSAgent, draft minutes v2
09:36:09 [RRSAgent]
I have made the request to generate https://www.w3.org/2018/06/28-dxwgdcat-minutes.html AndreaPerego
09:40:27 [arminhaller]
arminhaller has joined #dxwgdcat
13:16:28 [Zakim]
Zakim has left #dxwgdcat