Difference between revisions of "Dcat examples"

From Government Linked Data (GLD) Working Group Wiki

Latest revision as of 14:05, 14 February 2013

This page contains a number of example datasets, represented together with their distributions using DCAT. It is mainly meant to help discuss issues related to the Distribution class in DCAT (see this email for some background).

The examples below use the recent DCAT specification.
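
The snippets on this page omit their prefix declarations. As a sketch, assuming the standard namespaces for DCAT, Dublin Core terms, RDFS and FOAF, each snippet should parse as Turtle when preceded by the block below. The binding of the empty prefix to http://example.org/ is an assumption made for illustration. So is the reading of dc: (used in example 6) as the Dublin Core terms namespace, which is where the references property is defined.

   # Assumed for illustration: base namespace for :ds1, :dist1, etc.
   @prefix :     <http://example.org/> .
   @prefix dcat: <http://www.w3.org/ns/dcat#> .
   @prefix dct:  <http://purl.org/dc/terms/> .
   # Assumed: dc: is taken to mean Dublin Core terms, like dct:
   @prefix dc:   <http://purl.org/dc/terms/> .
   @prefix foaf: <http://xmlns.com/foaf/0.1/> .
   @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .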


1. A dataset available as a downloadable file (a CSV file for example)

   :ds1 a dcat:Dataset ;
           dcat:distribution :dist1 .
   :dist1 a dcat:Download ;
          dcat:accessURL  <http://example.org/dist1.csv>;
          dct:format [ rdfs:label "CSV" ].

Using the new proposed model of distribution (http://www.w3.org/2011/gld/wiki/Data_Catalog_Vocabulary/AccessURLRedesign)

   :ds1 a dcat:Dataset ;
           dcat:distribution :dist1 .
   :dist1 a dcat:Distribution ;
          dcat:downloadURL  <http://example.org/dist1.csv>;
          dct:format [ rdfs:label "CSV" ].

2. A dataset that is available through some web page

This covers several cases: the dataset can only be obtained through a splash page where the user must follow some links or tick some boxes before accessing the data; the publisher is not sure exactly how the dataset can be obtained, but is sure that it can be reached somehow via the web page; or the dataset is updated periodically (e.g. daily), with a new URL for each new version. If the format of the data is known, it can still be stated via dct:format, even if the access URL points to an HTML page.

   :ds2 a dcat:Dataset ;
        dcat:distribution :dist2 .
   :dist2 a dcat:Distribution ;
        dcat:accessURL <http://example.org/dist2.html> ;
        dct:format [ rdfs:label "CSV" ].

Using the new proposed model of distribution

   :ds2 a dcat:Dataset ;
        dcat:landingPage <http://example.org/dist2.html> ;
        dcat:distribution :dist2 .
   :dist2 a dcat:Distribution ;
        dcat:accessURL <http://example.org/dist2.html> ;
        dct:format [ rdfs:label "CSV" ].

Notice that a Distribution instance is still needed.

3. A dataset that is available as a download and through some web page

This means that the dataset can be downloaded from a known URL and can also be obtained through some landing page. The landing page might provide options to filter the dataset and obtain only part of it, or might simply link to the download file.

   :ds3 a dcat:Dataset ;
        dcat:distribution :dist3_1 , :dist3_2 .
   :dist3_1 a dcat:Distribution ;
        dcat:accessURL <http://example.org/dist3_1.html> .
   :dist3_2 a dcat:Download ;
        dcat:accessURL <http://example.org/dist3_2.csv> ;
        dct:format [ rdfs:label "CSV" ].

Using the new proposed model of distribution

   :ds3 a dcat:Dataset ;
        dcat:landingPage <http://example.org/dist3_1.html> ;
        dcat:distribution :dist3_2 .
   :dist3_2 a dcat:Distribution ;
        dcat:downloadURL <http://example.org/dist3_2.csv> ;
        dct:format [ rdfs:label "CSV" ].

Notice that a Distribution instance for the landing page is not needed as the downloadable distribution is defined.

4. A dataset available as an RSS feed

The assumption is that users can subscribe to the dataset using a standard feed reader, so the format would be RSS or Atom.

   :ds4 a dcat:Dataset ;
        dcat:distribution :dist4 .
   :dist4 a dcat:Feed ;
        dcat:accessURL <http://example.org/dist4.rss> .

Using the new proposed model of distribution

   :ds4 a dcat:Dataset ;
        dcat:distribution :dist4 .
   :dist4 a dcat:Distribution ;
        dcat:accessURL <http://example.org/dist4.rss> ;
        dcat:mediaType "application/atom+xml" .

5. A dataset available through a Web API

The class dcat:WebService (a subclass of dcat:Distribution) is used. Although the name "WebService" is not a perfect fit for an API, I suggest that this class can still be used to indicate that the data can be accessed programmatically over the Web.

   :ds5 a dcat:Dataset ;
       dcat:distribution :dist5 .
   :dist5 a dcat:WebService ;
       dcat:accessURL <http://example.org/dist5/sparql> ;
       dct:format [ rdfs:label "SPARQL" ].

Using the new proposed model of distribution

   :ds5 a dcat:Dataset ;
       dcat:distribution :dist5 .
   :dist5 a dcat:Distribution ;
       dcat:accessURL <http://example.org/dist5/sparql> ;
       dct:format [ rdfs:label "SPARQL" ].

6. A dataset with an example record

Some datasets provide examples of records in the dataset (see this example on thedatahub). dcat:Distribution is interpreted as an obtainable form of the dataset. In this sense, an example is not an instance of dcat:Distribution. However, a broader interpretation can include any resource related to the dataset. This interpretation is used on thedatahub.org.

If we stick to the "narrow" interpretation of dcat:Distribution, the example can be linked to using dc:references (@@@ but this meaning is more specific than dc:references):

   :ds6 a dcat:Dataset ;
          dc:references <http://example.org/id/example1> ;
          dcat:distribution :dist6.
   :dist6 a dcat:Download;
          dcat:accessURL <http://example.org/dist6.rdf>;
          dct:format [ rdfs:label "RDF/XML" ].

Note that VoID has a void:exampleResource property that can be used specifically for datasets in RDF, where the example record would be the URI of an entity described in the dataset.
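
As a minimal sketch of that alternative, assuming the dataset is published as RDF and is additionally typed as void:Dataset, the example record could be linked as follows. The prefix bindings and the reuse of the :ds6 and :dist6 names from the snippet above are assumptions made for illustration.

   # Assumed prefix bindings; void: is the VoID namespace
   @prefix :     <http://example.org/> .
   @prefix dcat: <http://www.w3.org/ns/dcat#> .
   @prefix void: <http://rdfs.org/ns/void#> .

   # The example record is the URI of an entity described in the dataset
   :ds6 a dcat:Dataset , void:Dataset ;
          void:exampleResource <http://example.org/id/example1> ;
          dcat:distribution :dist6 .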

7. A dataset with additional documentation

Sometimes, additional documentation (API documentation, data dictionaries, XML schemas) is provided for a dataset.

If we stick to the “narrow” interpretation of dcat:Distribution, then such documents can be linked via the foaf:page property (@@@ but this meaning is more specific than foaf:page):

   :ds7 a dcat:Dataset;
         foaf:page <http://example.org/dist7-explanation.pdf> ;
         dcat:distribution :dist7 .
   :dist7 a dcat:Download;
         dcat:accessURL  <http://example.org/dist7.csv>;
         dct:format [ rdfs:label "CSV" ].

8. Describing distribution format

A proposal related to [issue 12] is to add a dcat:mediaType property that states the format of the distribution as a media type defined by IANA. In the example, a plain string literal is used for the format value; however, the value must be one that is defined by IANA, otherwise dct:format must be used.

   :dist8 a dcat:Download ;
         dcat:accessURL <http://example.org/dist8.csv> ;
         dcat:mediaType "text/csv" .
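
For contrast, a sketch of the fallback case, where the format has no media type defined by IANA and dct:format is used instead. The :dist8b name is hypothetical, and the "SPARQL" label is borrowed from example 5 as an assumed instance of such a format.

   # Hypothetical distribution whose format is not an IANA media type
   :dist8b a dcat:WebService ;
         dcat:accessURL <http://example.org/dist8b/sparql> ;
         dct:format [ rdfs:label "SPARQL" ] .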