14:30:45 RRSAgent has joined #sdsvoc_versioning
14:30:45 logging to http://www.w3.org/2016/12/01-sdsvoc_versioning-irc
14:31:05 RRSagent, make minutes public
14:31:05 I'm logging. I don't understand 'make minutes public', AxelPolleres. Try /msg RRSAgent help
14:31:27 RRSAgent, draft minutes
14:31:27 I have made the request to generate http://www.w3.org/2016/12/01-sdsvoc_versioning-minutes.html AxelPolleres
14:31:54 jrvosse has joined #sdsvoc_versioning
14:31:59 sebastian has joined #sdsvoc_versioning
14:32:00 to join type /join #sdsvoc_versioning
14:33:10 rrsagent, make log public
14:33:47 1) How to describe your change frequency and change characteristics
14:33:48 E.g. order preserving, monotonicity, no change in the ontology but changes in the instances, etc.
14:33:49 Are there specific vocabularies? Should we join forces to create one? Which are the potential needs?
14:34:53 How to query data across time?
14:34:53 Collect requirements/use cases
14:35:40 Javier has joined #sdsvoc_versioning
14:35:52 How to efficiently store and archive public datasets, allowing users to ask complex cross-time queries? how can metadata make that easier?
14:36:43 Jacco: use case annotating artworks, vocabulary is updated regularly, assumed to be only additions
14:36:55 … but some risk for semantic changes.
14:37:10 … problems with querying archived data over time.
14:37:23 problems with concept drift
14:37:37 scribe: Javier
14:38:03 chair: AxelPolleres
14:38:32 David: validity period in one dimension but also the authors and changes
14:38:58 David: version means for us schema updates, not the ongoing data (changes are always happening there)
14:39:03 newton has joined #sdsvoc_versioning
14:39:28 … could be streams.
14:40:18 David: in streaming, transaction is fairly complete
14:41:08 David: we treat versions as software releases
14:41:23 … common in the finance, data market.
14:42:15 … customers don’t want to look at transaction logs, but they want to search/lookup events.
14:42:24 David: you don't want to track all the transaction logs, but we would like to have a most common practice with DCAT
14:42:55 Transaction log vs. “full snapshots”
14:44:50 David: we have different representations for different needs: full snapshots + Logs
14:45:32 David: It may change every minute, but it is published once per day
14:45:53 “here you get the realtime dataset and here you get the daily snapshot” might be different distributions of the same dataset.
14:47:00 David: time dimension might be part of the data/snapshot or part of the metadata
14:47:58 Meeting: SDSVoc bar camp: Versions and archives - how to annotate and query
14:48:17 present+ newton
14:48:20 metadata needs to be able either to say that “this column in the data” contains the temporal validity of the data, or to describe the temporal extent of the whole dataset/distribution/resource
14:48:31 present+ AxelPolleres
14:48:37 present+ jrvosse
14:48:40 present+ Javier
14:48:58 present+ Jacco
14:49:17 present+ sebastian
14:50:00 Willem: Eurostat overwrites, does not provide history.
14:50:50 Newton: do you have different URIs for versions?
14:51:10 Willem: no, just overwritten at the same URI
14:51:41 David: similar case, but it’d be actually quite valuable to look at the changes.
14:52:21 … but if you can monitor, knowing the change rate, you could follow that.
14:53:28 … archiving and finding value in the differences is a common case.
14:53:57 Javier has joined #sdsvoc_versioning
14:54:23 … use case in the legal domain, precedent cases that have been overturned… which past cases they affect.
14:55:09 Willem: Eurostat updates daily
14:55:25 Axel: so advertising the change frequency would be helpful to monitor.
14:56:08 Willem: for taxonomies, we collect the changes and generate a new release, e.g. every 3 months
14:56:37 taxonomies/vocabularies
14:59:54 Axel: could it be that certain parts of the datasets change at particular frequencies, e.g. thinking of statistical data, some may change daily, others annually, or at other frequencies.
14:59:58 David: regular rhythm of updates + emergency updates (e.g. news)
15:01:12 David provides another example of schema change… e.g. splitting of a country, the historic data then needs to be split, which can’t be done algorithmically.
15:01:14 BernadetteLoscio__ has joined #sdsvoc_versioning
15:01:27 David: most of our system uses versioned URIs
15:02:20 … could be “update is every three months, except when it doesn’t” … that might be too complicated to make it machine readable.
15:03:09 dct:accrualPeriodicity ;
15:03:09 Newton: dct has periodicity
15:03:12 https://www.w3.org/TR/dwbp/#AccessUptoDate
15:03:22 @David we use it for the external customer, but not internally
15:04:36 Willem: DCAT-AP has frequency of update
15:05:08 Axel: it would be interesting to have the growth rate
15:05:42 Axel: or other characteristics, like order.
15:06:09 Willem: DCAT-AP has also special things like biweekly/fortnightly.
15:07:02 Newton: travel restrictions are another example of data that changes over time.
15:07:05 present+ BernadetteLoscio
15:07:22 … some government agencies have only current, others back over 10 years.
15:07:51 some common metadata would be useful to build such an application.
15:07:53 I don't know if you discussed that, but I think we also need better definition for versioning
15:08:31 Javier: what do you use to process temporal/archived data? SPARQL? something else?
15:08:58 David: different, not a standard way to do it.
15:09:23 +1 BernadetteLoscio__
15:09:37 … big data technologies, SQL, SPARQL, etc. different systems
15:10:36 Axel: what would be helpful in terms of standards/metadata?
15:11:17 David: if metadata was more standardised that might help. important is benefits in terms of ease of use, enabling automation.
15:11:44 … anything that simplifies the exchange of the description of the interface.
15:12:13 … frequency of update, etc.
15:12:31 Javier: should we talk about Memento?
15:12:53 Jacco: … which relies on content negotiation (we had talked about that)
15:13:22 Phila: Internet Archive uses it.
15:13:36 Willem: we are planning to use it.
15:14:13 … on the Publications Office metadata (from the 50’s to today), CELLAR project.
15:15:16 … in the commission this is used in production.
15:15:54 … we have an issue how to model changes over time, e.g. organisational changes, country changes, etc.
15:16:27 … e.g. change from kingdom to republic for a country.
15:16:57 Javier: to some extent reflected on Wikidata
15:18:12 Axel: “concept drift” ?
15:18:30 David: similar example, what if companies merge
15:18:48 … mostly done manually.
15:19:46 … doesn’t happen overnight, so there is some transition phase when it’s unclear.
15:20:34 … in the financial area this matters, if over 24hrs it’s not clear what the share price is, how it translates to yesterday’s share price.
15:24:28 Axel: when starting the session I was aiming at much lower hanging fruits… i.e. making recommendations for best practices to a) describe dataset change frequency, characteristics, b) align existing vocabs in that space, c) allow to describe different practices of slicing datasets based on temporal extent, etc.
15:24:44 present+ DavidBrowning
15:25:24 Javier: another issue is online APIs… e.g. metadata for APIs that indicate whether the data behind has changed.
15:25:25 I also think that we should have a better definition for the basic concepts! to have a kind of agreement
15:25:29 present+ WillemVanGemert
15:25:38 rescuing BernadetteLoscio__ question: I don't know if you discussed that, but I think we also need better definition for versioning
15:25:51 David: we use e.g. Kafka based.
15:26:01 Is it in the scope of a (new) wg to create a definition about it?
15:27:00 Axel: is there anything needed/best practice for notifications of changes. i.e. push vs. pull
15:27:17 Jacco: openarchives.org/rs ResourceSync
15:27:45 Newton: WebMention (W3C rec track)
15:27:58 … could be useful/related.
15:28:24 … also the data-usage vocabulary by DWBP WG
15:29:38 Axel: who would be willing to push in a WG for such issues to be addressed?
15:30:24 Jacco: could be in DCAT 2.0 spatial and temporal coverage (which was mentioned in the panel).
15:32:11 Axel: I think discussing use cases would make sense, because common “categories” of use cases might require different modeling (e.g. evolving datasets that represent snapshots vs. delta/updates)
15:32:17 W3C WebMention -> https://www.w3.org/TR/webmention/
15:32:56 David: only talk more about time-granularity and how to model that in DCAT, or talk also about which other areas/standards might be affected?
15:33:46 Newton: my impression is we don’t have a clear definition of version.
15:34:09 … e.g. software versioning is clearer than data versioning, e.g. DBpedia example.
15:35:52 Axel: DBpedia is a good example, because you could split it per DBpedia version, vs. different versions by resource taken from the Wikipedia edit history.
15:38:54 Axel: take-home/conclusion: extensions to model temporal extent, changes and versioning should be categorized, use-case driven.
15:39:32 David: this would help us to understand what is the scope of standardisation in this space…
15:39:52 … we have discussed *some* examples in the breakout-session.
15:40:34 Newton: it seems no group has taken this issue seriously or considered it seriously as “in scope”.
15:41:10 Willem: not only use cases, but also examples and current solutions should be collected.
15:49:23 newton has joined #sdsvoc_versioning