Topics for the Workshop
The main topics of the Workshop are:
- transformation (to other formats)
- combinations of data from different models (e.g., linked data and CSV)
- quality assessment and self-description
- extracting human-readable "stories" from data
Below, the Program Committee highlights a small number of topics of known interest, but position papers from participants may cover a broad range of open data topics. For all topics on the agenda, the Program Committee's goal is greater alignment between open data publishers and those who deliver open data products and services.
Data Discovery, Data Description, Data Meaning
The majority of open data is published through data portals. These provide a catalog of the available data sets, links to applications that use the data, and more. These work well for human users but what about automatic discovery by machines? What if you could search the web for data as easily as you can for documents?
Wouldn't it be great if an application could find the data on a given site automatically, and understand what it contains?
The use of the /.well-known/ URI prefix [RFC 5785] is a good candidate for this and has been implemented in some cases for VoID data descriptions, but more standardization might be needed as well as the adoption of best practice by data publishers.
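As a rough illustration of how a client might use the /.well-known/ prefix, the sketch below builds the candidate location of a VoID description for any page on a site and, optionally, requests it. It assumes the `void` suffix that the VoID documentation describes for this purpose; the function names, the example host, and the choice of Turtle as the requested format are illustrative assumptions, not something this call specifies.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request, urlopen

# Assumption: the path suffix "void" under /.well-known/, as described in the
# VoID documentation. The text above only names the /.well-known/ prefix.
WELL_KNOWN_VOID_PATH = "/.well-known/void"


def void_well_known_url(site_url: str) -> str:
    """Return the candidate VoID description URL for the host of site_url."""
    parts = urlsplit(site_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {site_url!r}")
    # Keep scheme and host; drop the original path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN_VOID_PATH, "", ""))


def fetch_void_description(site_url: str, timeout: float = 10.0) -> bytes:
    """Request the VoID description, asking for Turtle (an illustrative choice)."""
    request = Request(
        void_well_known_url(site_url),
        headers={"Accept": "text/turtle"},
    )
    with urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Hypothetical host, used only to show the URL construction.
    print(void_well_known_url("http://data.example.org/catalog/rainfall?page=2"))
```

A publisher that does not expose a description at that location would typically answer with an error status, which is one reason the text above calls for wider adoption of a common practice.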
The advantage of manual discovery of data is that the content, provenance and quality of the data can readily be assessed, but how can the same assessment be applied to something discovered programmatically? What is the key metadata that is needed and how should it be exposed? If the right metadata is provided, can the machines do the rest unaided?
This is as important to data publishers as it is to data consumers and is crucial to the effective use and reuse of open data. How can provenance and annotations be integrated into the data processing and production chain?
Web-Oriented Data Formats for Tabular Data
Data is and will continue to be published in a variety of different formats. Some data goes through careful and detailed preparation before publication, other data sets are saved directly from common desktop software and published ‘as is.’ Applications need to be able to handle this variety.
The majority of open data is published as tabular data in CSV files that can readily be converted to JSON, but are some methods of doing this better than others? What would a generic API for a CSV file look like? Are there ways of structuring or packaging CSV files that make them better for handling sophisticated data? How would that play with linked data, whether accessed via a SPARQL query or via JSON for Linked Data [JSON-LD]? See the Data Protocols site for a survey of current standards in this area.
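To make the question about CSV-to-JSON conversion concrete, here is a minimal sketch of two common shapes for the same table: one JSON object per row, and one array per column. The sample data, column names and function names are invented for illustration; neither shape is proposed here as the preferred one.

```python
import csv
import io
import json

# Hypothetical sample data, used only for illustration.
SAMPLE_CSV = "station,year,rainfall_mm\nLeeds,2011,720\nYork,2011,610\n"


def csv_to_records(text: str) -> list:
    """Row-oriented: a list with one {column: value} object per row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def csv_to_columns(text: str) -> dict:
    """Column-oriented: one list of values per column name."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    return {name: [row[i] for row in body] for i, name in enumerate(header)}


if __name__ == "__main__":
    print(json.dumps(csv_to_records(SAMPLE_CSV), indent=2))
    print(json.dumps(csv_to_columns(SAMPLE_CSV), indent=2))
```

In both shapes every value stays a string, because a bare CSV file carries no information about types or units. That gap is part of what structuring or packaging conventions for CSV would need to address.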
The hashtag for the event is #odw13
W3C gratefully acknowledges Google for hosting this workshop.
Expression of interest — please send a short e-mail to Phil Archer ASAP.
3 March 2013:
Deadline for Position Papers
26 March 2013:
Acceptance notification and registration instructions sent. Program and position papers posted on the workshop website.
22nd April 2013, 19:00
Open Data Meetup – London
23rd April 2013, 09:30
18:00 - 20:00 ODI Networking Evening Sold Out!
24th April 17:00