Difference between revisions of "Dcat examples"

From Government Linked Data (GLD) Working Group Wiki

Revision as of 13:51, 14 February 2013

This page contains a number of example datasets represented with their distributions using DCAT. It is mainly meant to help discuss issues related to the Distribution class in DCAT (see this email for some background).

The examples below use the recent DCAT specification.
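
The snippets on this page omit prefix declarations. To parse them as Turtle, a preamble along the following lines could be prepended. This is a sketch under assumptions: the base namespace for the empty prefix is a placeholder chosen here, and dc: is assumed to denote the DCMI Terms namespace (the same as dct:), since the references property used in example 5 is defined there.

   @prefix dcat: <http://www.w3.org/ns/dcat#> .
   @prefix dct:  <http://purl.org/dc/terms/> .
   @prefix dc:   <http://purl.org/dc/terms/> .             # assumption: same namespace as dct:
   @prefix foaf: <http://xmlns.com/foaf/0.1/> .
   @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
   @prefix :     <http://example.org/> .                   # assumption: placeholder namespace for :ds1, :dist1, ...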


1. A dataset available as a downloadable file (a CSV file for example)

   :ds1 a dcat:Dataset ;
           dcat:distribution :dist1 .
   :dist1 a dcat:Download ;
          dcat:accessURL  <http://example.org/dist1.csv>;
          dcat:format [ rdfs:label "CSV" ].

Using the new proposed model of distribution

   :ds1 a dcat:Dataset ;
           dcat:distribution :dist1 .
   :dist1 a dcat:Distribution ;
          dcat:downloadURL  <http://example.org/dist1.csv>;
          dcat:format [ rdfs:label "CSV" ].

2. A dataset that is available through some web page

This covers two situations: the dataset is obtained through a splash page where the user needs to follow some links or check some boxes before accessing the data, or the publisher is not sure exactly how the dataset can be obtained but is sure that it can be obtained somehow via the web page. It is also the case for datasets that are updated periodically (e.g. daily) with a new URL for each new version of the dataset. If the format of the data is known, it can still be stated via dcat:format, even if the access URL points to an HTML page.

   :ds2 a dcat:Dataset ;
        dcat:distribution :dist2 .
   :dist2 a dcat:Distribution ;
        dcat:accessURL <http://example.org/dist2.html> ;
        dct:format [ rdfs:label "CSV" ].

The web page can additionally be stated as the landing page of the dataset:

   :ds2 a dcat:Dataset ;
        dcat:landingPage <http://example.org/dist2.html> ;
        dcat:distribution :dist2 .
   :dist2 a dcat:Distribution ;
        dcat:accessURL <http://example.org/dist2.html> ;
        dct:format [ rdfs:label "CSV" ].

Notice that a Distribution instance is still needed.

3. A dataset available as RSS feed

The assumption is that users can subscribe to the dataset using a standard feed reader, so the format would be RSS or Atom.

   :ds3 a dcat:Dataset ;
        dcat:distribution :dist3 .
   :dist3 a dcat:Feed ;
        dcat:accessURL <http://example.org/dist3.rss> .

4. A dataset available through a Web API

The class dcat:WebService (a subclass of dcat:Distribution) is used. Although the name "WebService" is not perfect for an API, I suggest that this class can still be used to indicate that the data can be accessed programmatically over the Web.

   :ds4 a dcat:Dataset ;
       dcat:distribution :dist4 .
   :dist4 a dcat:WebService ;
       dcat:accessURL <http://example.org/dist4/sparql> ;
       dct:format [ rdfs:label "SPARQL" ].

5. A dataset with an example record

Some datasets provide examples of their records (see this example on thedatahub). dcat:Distribution is interpreted as an obtainable form of the dataset. In this sense, an example record is not an instance of dcat:Distribution. However, a broader interpretation can include any resource related to the dataset. This broader interpretation is used on thedatahub.org.

If we stick to the "narrow" interpretation of dcat:Distribution, the example can be linked to using dc:references (@@@ but this meaning is more specific than dc:references):

   :ds5 a dcat:Dataset ;
          dc:references <http://example.org/id/example1> ;
          dcat:distribution :dist5 .
   :dist5 a dcat:Download;
          dcat:accessURL <http://example.org/dist5.rdf>;
          dct:format [ rdfs:label "RDF/XML" ].

Note that VoID has a void:exampleResource property that can be used specifically for datasets in RDF, where the example record would be the URI of an entity described in the dataset.
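
For an RDF dataset, the VoID alternative mentioned above might look like the following sketch. It assumes the void: prefix is bound to http://rdfs.org/ns/void# and that the dataset is additionally typed as void:Dataset; the example URI is the same placeholder used above.

   # assumes: @prefix void: <http://rdfs.org/ns/void#> .
   :ds5 a dcat:Dataset, void:Dataset ;
          void:exampleResource <http://example.org/id/example1> ;
          dcat:distribution :dist5 .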

6. A dataset with additional documentation

Sometimes, additional documentation (API documentation, data dictionaries, XML schemas) is provided for a dataset.

If we stick to the “narrow” interpretation of dcat:Distribution, then such documents can be linked via the foaf:page property (@@@ but this meaning is more specific than foaf:page):

   :ds6 a dcat:Dataset;
         foaf:page <http://example.org/dist6-explanation.pdf> ;
         dcat:distribution :dist6 .
   :dist6 a dcat:Download;
         dcat:accessURL  <http://example.org/dist6.csv>;
         dct:format [ rdfs:label "CSV" ].

7. Describing distribution format

A proposal related to [issue 12] is to add a dcat:mediaType property that states the format of the distribution according to IANA. In the example, a plain string literal is used for the format value. However, the value must be one that is defined by IANA; otherwise dct:format must be used.

   :dist7 a dcat:Download ;
         dcat:accessURL <http://example.org/dist7.csv> ;
         dcat:mediaType "text/csv" .
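
For contrast, a distribution whose format has no IANA-defined media type would fall back to dct:format, as in the earlier examples. The sketch below is illustrative only: the resource name :dist7b, the URL and the format label are hypothetical.

   # hypothetical distribution in a format without an IANA media type
   :dist7b a dcat:Download ;
         dcat:accessURL <http://example.org/dist7.dat> ;
         dct:format [ rdfs:label "Custom in-house format" ] .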