Data Catalog Vocabulary
The Data Catalog Vocabulary project is a task force within the W3C Interest Group on eGovernment. The project started in mid-April 2010. The group has produced a Use Cases and Requirements document and is currently working on the dcat RDF vocabulary.
Context: Government data catalogs
Governments produce large amounts of valuable data as part of daily operations and decision-making. This data can be useful to many citizens and organizations, and it is ultimately they who paid for its production. Governments increasingly recognize this and are starting to make this data publicly available through one-stop portals called data catalogs, such as data.gov, data.gov.uk, statcentral.ie and many others.
The goals of the group are:
- To propose a unified format for publishing the contents of such data catalogs, using DERI's dcat proposal as a starting point.
- To provide support to initial implementors of the format.
Notes on scope:
- The focus is on catalogs of government data, such as data.gov. (Applicability of the format to other kinds of data catalogs, such as catalogs of climate change data, is desirable but optional.)
- The focus is on data catalogs. Standard formats for representing the actual data inside the datasets are out of scope.
- The focus is on existing data catalogs and how their contents can be expressed in a unified way.
- An RDF vocabulary will be developed, although the group may explore its mapping to non-RDF syntaxes as well in its second round of deliverables.
Further deliverables, such as a “dcat Deployment Guide” with concrete advice for syntaxes/formats/protocols to use with the dcat vocabulary, will be considered when the first round of deliverables is reasonably complete.
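As a rough illustration of what "expressing catalog contents in a unified way" with an RDF vocabulary could look like, the sketch below emits N-Triples for a catalog and its datasets using only the Python standard library. It is not the group's format: the namespace `http://www.w3.org/ns/dcat#` is the one later assigned to dcat by W3C and may differ from the draft discussed here, the use of `dcat:Catalog`, `dcat:Dataset`, `dcat:dataset` and `dct:title` is an assumption about how a catalog would be modelled, and the `example.org` URIs and the function name `catalog_triples` are hypothetical.

```python
# Namespace URIs. DCAT is an assumption: the W3C namespace later assigned to
# dcat; the group's working draft may have used a different URI.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Format an IRI as an N-Triples term."""
    return f"<{value}>"


def literal(value):
    """Format a plain string literal, escaping backslashes and double quotes.

    Newlines and other control characters are not handled in this sketch.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def catalog_triples(catalog_uri, title, datasets):
    """Return N-Triples lines for a catalog and its (uri, title) datasets."""
    catalog = iri(catalog_uri)
    lines = [
        f"{catalog} {iri(RDF_TYPE)} {iri(DCAT + 'Catalog')} .",
        f"{catalog} {iri(DCT + 'title')} {literal(title)} .",
    ]
    for dataset_uri, dataset_title in datasets:
        dataset = iri(dataset_uri)
        # Link the catalog to the dataset, then describe the dataset itself.
        lines.append(f"{catalog} {iri(DCAT + 'dataset')} {dataset} .")
        lines.append(f"{dataset} {iri(RDF_TYPE)} {iri(DCAT + 'Dataset')} .")
        lines.append(f"{dataset} {iri(DCT + 'title')} {literal(dataset_title)} .")
    return lines


if __name__ == "__main__":
    # Hypothetical catalog and dataset URIs, for illustration only.
    for line in catalog_triples(
        "http://example.org/catalog",
        "Example government data catalog",
        [("http://example.org/dataset/1", "Air quality measurements")],
    ):
        print(line)
```

N-Triples is used here only because it can be written without an RDF library; the same statements could be serialized as RDF/XML, Turtle or RDFa, which is the kind of choice a deployment guide would address.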
The group is also maintaining a Resource Guide on Data Catalogs.
Meetings and participation
Participation is open to anyone.
There are weekly teleconferences. Details are announced on the eGov IG mailing list, which is also used for discussion of the group's work.
To get write access to this wiki, you currently must formally join the group. If you need to change something and can't get write access, e-mail email@example.com.
The participant list below started with the initial participation survey, and some names have been added as people showed up in meetings. Please feel free to edit your own entry below, or add or remove yourself as appropriate.
- Andrew Houghton: This effort has broader implications for the library community in describing and making their catalogs available to the Semantic Web community. My interest is to follow and contribute to this effort and bring a broader perspective to it.
- Brand Niemann: US EPA
- Cory Casanave: Model Driven Solutions
- Craig Norvell: I work at Franz Inc (AllegroGraph) and we are interested in participating in development of this Gov't data.
- Dan Brickley
- Dan Thomas: Washington DC Gov't
- David James: Sunlight Foundation
- Ed Summers: US Library of Congress. (I am a software developer working at the Library of Congress. I participated in the W3C working group on SKOS, where my contribution was mainly reviewing documents, and a demonstration which now lives at id.loc.gov.)
- Erik Wilde: UC Berkeley. (core web architecture background and interest; interested in using lightweight approaches for exposing data so that people can use and reuse them with the simplest possible set of technologies and tools, so that this can be done on the largest possible variety of platforms and by the largest possible set of people.)
- Fadi Maali: DERI.
- George Thomas: US Dept of Health and Human Services. (interested in using dcat/voiD on data.gov work with the US government.)
- Jon Phipps
- Kate Geyer: Working on the Massachusetts Open Data Initiative. We are moving to open standards for our datasets and a linked open data model.
- Li Ding: RPI. (we are working on linking government data and building appealing demos to drive the consumption of linked government data at Rensselaer Polytechnic Institute.)
- Libby Miller
- Luigi Montanez: Sunlight Foundation
- Martín Álvarez: CTIC Spain. We maintain a list of the public open data catalogs.
- Niklas Lindström: I'm using and contributing to open source libraries for RDF (e.g. RDFLib in Python), and am currently developing the legal information system in Sweden, employing RDF and linked data principles for collecting documents and data produced by about a hundred agencies (using RDF and Atom).
- Paul Hermans
- Peter Krantz: Developer of the opengov catalog platform: http://code.google.com/p/opengov-catalog/ currently used in opengov.se. This currently supports RDF metadata about datasets primarily based on the DC vocabulary e.g.: http://www.opengov.se/data/71/rdf/
- Rich Wolverton: Massachusetts State Gov't
- Richard Cyganiak: DERI. Co-founder of the LOD project. Authored the first version of the dcat vocabulary together with Fadi Maali. Also involved in the development of related vocabularies such as voiD and SDMX+RDF.
- Rufus Pollock
- Sandro Hawke: I'm here for logistical and process support (as W3C staff contact) and to offer what technical/design/implementation help I can (as a coder with lots of RDF experience)
- Thomas Bandholtz: need a data catalog for Linked Environment Data
- Vassilios Peristeras: Egovernment Cluster leader in DERI. I work with Fadi and Richard on governmental linked data.
- William Waites
- Daniel Bennett: CTO, eCitizen Foundation. Have been working on "Repository Schema" to help expose large web datasets: http://advocatehope.org/tech-tidbits/repository-schema