Context

From Data on the Web Best Practices

In general, publishing data on the Web means publishing datasets in order to share data with multiple users. For this document, as defined by DCAT, "a dataset is a collection of data, available for access or download in one or more formats". By data, "we mean known facts that can be recorded and that have implicit meaning" [Navathe].

Publishing data on the Web concerns two important aspects:

  1. the sharing of data on a large scale: this means that data published on the Web may be consumed by distinct groups of users with different requirements and expectations.
  2. the use of the Web as a platform for data publication and sharing: this means that data should be published according to the architectural principles of the Web [WEBARCH].

Considering the first aspect, a single dataset may be made available in different data formats or distributions in order to meet the requirements of different users. Again quoting the DCAT specification, a distribution "Represents a specific available form of a dataset. Each dataset might be available in different forms, these forms might represent different formats of the dataset or different endpoints. Examples of distributions include a downloadable CSV file, an API or an RSS feed" [DCAT].
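
The relationship between a dataset and its distributions can be sketched as a small data model. This is only an illustration in Python, not the DCAT vocabulary itself: the class names, the field names and the example.org URLs are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Distribution:
    """One specific available form of a dataset (hypothetical model)."""
    access_url: str   # where this form of the dataset can be obtained
    media_type: str   # format of this form, e.g. "text/csv"


@dataclass
class Dataset:
    """A collection of data, available in one or more distributions."""
    title: str
    distributions: List[Distribution] = field(default_factory=list)

    def formats(self) -> List[str]:
        """Return the media types in which this dataset is offered."""
        return [d.media_type for d in self.distributions]


# Hypothetical dataset offered both as a CSV download and through an API.
bus_stops = Dataset(
    title="Bus stops",
    distributions=[
        Distribution("http://example.org/bus-stops.csv", "text/csv"),
        Distribution("http://example.org/api/bus-stops", "application/json"),
    ],
)
```

Here `bus_stops.formats()` lists one media type per distribution, so a consumer can choose the form that suits its requirements.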

In addition, metadata is essential to make datasets understandable. Metadata provides additional information that helps data consumers better understand the meaning of the data and its structure, and clarifies other issues such as rights and license terms, the organization that generated the data, data quality, data access methods and the update schedule of datasets.
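
As a rough illustration of the kinds of metadata listed above, the sketch below describes a dataset with a plain Python dictionary and reports which descriptive fields are still missing. The field names and values are hypothetical, chosen only to mirror the list in the paragraph; they are not taken from any particular vocabulary.

```python
# Hypothetical field names mirroring the kinds of metadata mentioned above.
EXPECTED_FIELDS = [
    "license",
    "publisher",
    "quality",
    "access_methods",
    "update_schedule",
]


def missing_metadata(metadata: dict) -> list:
    """Return the expected fields that are absent or empty in `metadata`."""
    return [name for name in EXPECTED_FIELDS if not metadata.get(name)]


# Hypothetical, partially described dataset.
example_metadata = {
    "license": "http://example.org/licenses/open",
    "publisher": "Example Transport Agency",
    "update_schedule": "monthly",
}
```

A publisher could run a check like `missing_metadata(example_metadata)` before release to see which descriptions a consumer would still lack.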

The second aspect mainly concerns the identification of resources and the creation of links between resources. In our context, a resource may be a whole dataset or a specific item of a given dataset. All resources should be published with stable URIs, so that they can be referenced. Even resources that are not meant to be directly accessible should, at the very least, be given unique and stable URIs.

Links between resources should also be established in order to transform the Web into a Web of Data. A link may be defined as a relationship between two resources in which one resource (representation) refers to the other resource by means of a URI.
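
The definition of a link given above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `Link` type, the `links_from` helper and all example.org URIs are hypothetical and are not part of any specification cited here.

```python
from typing import List, NamedTuple


class Link(NamedTuple):
    """A relationship in which one resource refers to another by URI."""
    source: str    # URI of the resource that makes the reference
    relation: str  # URI naming the kind of relationship (hypothetical)
    target: str    # URI of the resource being referred to


def links_from(links: List[Link], source_uri: str) -> List[str]:
    """Return the URIs of all resources that `source_uri` refers to."""
    return [link.target for link in links if link.source == source_uri]


# Hypothetical links: a dataset refers to one of its items, and that
# item refers to a resource published elsewhere.
links = [
    Link("http://example.org/dataset/bus-stops",
         "http://example.org/rel/hasItem",
         "http://example.org/dataset/bus-stops/42"),
    Link("http://example.org/dataset/bus-stops/42",
         "http://example.org/rel/locatedIn",
         "http://example.com/places/city-centre"),
]
```

Because every resource, whether a whole dataset or a single item, is identified by a stable URI, following links is a matter of looking up the URIs returned by `links_from`.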

Some examples of Publishing Data on the Web:

  • ...
  • ...

[Navathe] Ramez Elmasri; Shamkant B. Navathe. Fundamentals of Database Systems. 2010.