Please check the errata for any errors or issues reported since publication.
See also translations.
This document is also available in this non-normative format: Turtle
Copyright © 2020 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and permissive document license rules apply.
DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. This document defines the schema and provides examples for its use.
DCAT enables a publisher to describe datasets and data services in a catalog using a standard model and vocabulary that facilitates the consumption and aggregation of metadata from multiple catalogs. This can increase the discoverability of datasets and data services. It also makes it possible to have a decentralized approach to publishing data catalogs and makes federated search for datasets across catalogs in multiple sites possible using the same query mechanism and structure. Aggregated DCAT metadata can serve as a manifest file as part of the digital preservation process.
The namespace for DCAT terms is http://www.w3.org/ns/dcat#
The suggested prefix for the DCAT namespace is dcat
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document was published by the Dataset Exchange Working Group as a Superseded Recommendation. A newer specification exists that is recommended for new adoption in place of this specification.
For purposes of the W3C Patent Policy, this Superseded Recommendation has the same status as an active Recommendation; it retains licensing commitments and remains available as a reference for old -- and possibly still deployed -- implementations, but is not recommended for future implementation. New implementations should follow the latest version of the Data Catalog Vocabulary (DCAT) specification.
This document defines a major revision of the original DCAT vocabulary ([VOCAB-DCAT-20140116]) in response to new use cases, requirements and community experience since that publication. This revision extends the original DCAT standard in line with community practice while supporting diverse approaches to data description and dataset exchange. The main changes to the DCAT vocabulary have been:
- a dcat:Resource class for representing any asset that can be included in the catalog; this is now the super-class of dcat:Dataset
- dcat:DataService, as a sub-class of dcat:Resource, to support catalog service end-points providing access to data assets
This new version of the vocabulary updates and expands the original but preserves backward compatibility. A full list of the significant changes (with links to the relevant GitHub issues) is described in § D. Change history.
The exit criteria for CR focussed on v2 new features that replicate features that were included in application profiles of v1 as a way of remedying missing and necessary elements. The exit criteria also included recent commitments by organisations such as EC Joinup to adopt the DCAT v2 model in their work. Implementation will be evidenced by showing use of the new properties/classes (or terms with equivalent meaning) in implementations of catalogs.
Issues, requirements, and features that have been considered and discussed by the Dataset Exchange Working Group but have not been addressed due to lack of maturity or consensus are collected in GitHub. Those believed to be a priority for a future release are in the milestone DCAT Future Priority Work.
The original DCAT vocabulary was developed and hosted at the Digital Enterprise Research Institute (DERI), then refined by the eGov Interest Group, and finally standardized in 2014 [VOCAB-DCAT-20140116] by the Government Linked Data (GLD) Working Group.
This revised version of DCAT was developed by the Dataset Exchange Working Group in response to a new set of Use Cases and Requirements [DCAT-UCR] gathered from people's experience with the DCAT vocabulary since the original version, and from new applications that were not considered in the first version. A summary of the changes from [VOCAB-DCAT-20140116] is provided in § D. Change history.
DCAT incorporates terms from pre-existing vocabularies where stable terms with appropriate meanings could be found, such as foaf:homepage and dct:title.
Informal summary definitions of the externally-defined terms are included in the DCAT vocabulary for convenience, while authoritative definitions are available in the normative references.
Changes to definitions in the references, if any, supersede the summaries given in this specification.
Note that conformance to DCAT (§ 4. Conformance) concerns usage of only the terms in the DCAT vocabulary specification, so possible changes to other external definitions will not affect the conformance of DCAT implementations.
This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 March 2019 W3C Process Document.
This section is non-normative.
Sharing data resources among different organizations, researchers, governments and citizens requires the provision of metadata. This is irrespective of the data being open or not. DCAT is a vocabulary for publishing data catalogs on the Web, which was originally developed in the context of government data catalogs such as data.gov and data.gov.uk, but it is also applicable and has been used in other contexts.
This revision of DCAT has extended the previous version to support further use cases and requirements [DCAT-UCR]. These include the possibility of cataloging other resources in addition to datasets, such as data services. The revision also supports describing relationships between datasets as well as between datasets and other cataloged resources. Guidance on how to document licenses and rights statements associated with the cataloged items is provided.
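DCAT provides RDF classes and properties to allow datasets and data services to be described and included in a catalog. The use of a standard model and vocabulary facilitates the consumption and aggregation of metadata from multiple catalogs, which can increase the discoverability of datasets and data services, allow a decentralized approach to publishing data catalogs, and make federated search for datasets across catalogs in multiple sites possible using the same query mechanism and structure. Aggregated DCAT metadata can also serve as a manifest file as part of the digital preservation process.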
Data described in a catalog can come in many formats, ranging from spreadsheets, through XML and RDF to various specialized formats. DCAT does not make any assumptions about these serialization formats of the datasets but it does distinguish between the abstract dataset and its different manifestations or distributions.
Data is often provided through a service which supports selection of an extract, sub-set, or combination of existing data, or of new data generated by some data processing function. DCAT allows the description of a data access service to be included in a catalog.
Complementary vocabularies can be used together with DCAT to provide more detailed format-specific information. For example, properties from the VoID vocabulary [VOID] can be used within DCAT to express various statistics about a dataset if that dataset is in RDF format.
This document does not prescribe any particular method of deploying data catalogs expressed in DCAT. DCAT information can be presented in many forms including RDF accessible via SPARQL endpoints, embedded in HTML pages as [HTML-RDFa], or serialized as RDF/XML [RDF-SYNTAX-GRAMMAR], [N3], [Turtle], [JSON-LD] or other formats. Within this document the examples use [Turtle] because of its readability.
This section is non-normative.
The original Recommendation [VOCAB-DCAT-20140116] published in January 2014 provided the basic framework for describing datasets. It made an important distinction between a dataset as an abstract idea and a distribution as a manifestation of the dataset. Although DCAT has been widely adopted, it has become clear that the original specification lacked a number of essential features that were added either through the mechanism of a profile, such as the European Commission's DCAT-AP [DCAT-AP], or the development of larger vocabularies that to a greater or lesser extent built upon the base standard, such as the Healthcare and Life Sciences Community Profile [HCLS-Dataset], the Data Tag Suite [DATS] and more. This revision of DCAT has been developed to address the specific shortcomings that have come to light through the experiences of different communities, the aim being to improve interoperability between the outputs of these larger vocabularies. For example, in this new DCAT version we provide classes, properties and guidance to address identifiers, dataset quality information, and data citation issues.
This revision includes re-writing of the specification throughout. Significant changes from the 2014 Recommendation are marked within the text using "Note" sections, as well as being described in § D. Change history.
The namespace for DCAT is http://www.w3.org/ns/dcat#.
DCAT also makes extensive use of terms from other vocabularies, in particular Dublin Core [DCTERMS].
DCAT defines a minimal set of classes and properties of its own.
Namespaces and prefixes used in normative parts of this recommendation are shown in the following table.
Prefix | Namespace |
---|---|
dc | http://purl.org/dc/elements/1.1/ |
dcat | http://www.w3.org/ns/dcat# |
dct | http://purl.org/dc/terms/ |
dctype | http://purl.org/dc/dcmitype/ |
foaf | http://xmlns.com/foaf/0.1/ |
locn | http://www.w3.org/ns/locn# |
odrl | http://www.w3.org/ns/odrl/2/ |
owl | http://www.w3.org/2002/07/owl# |
prov | http://www.w3.org/ns/prov# |
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs | http://www.w3.org/2000/01/rdf-schema# |
skos | http://www.w3.org/2004/02/skos/core# |
time | http://www.w3.org/2006/time# |
vcard | http://www.w3.org/2006/vcard/ns# |
xsd | http://www.w3.org/2001/XMLSchema# |
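For convenience, the prefixes in the table above can be written as Turtle declarations. This is only a transcription of the table into Turtle syntax; it adds no terms of its own.

```turtle
# Prefix declarations transcribed from the normative namespace table.
@prefix dc:     <http://purl.org/dc/elements/1.1/> .
@prefix dcat:   <http://www.w3.org/ns/dcat#> .
@prefix dct:    <http://purl.org/dc/terms/> .
@prefix dctype: <http://purl.org/dc/dcmitype/> .
@prefix foaf:   <http://xmlns.com/foaf/0.1/> .
@prefix locn:   <http://www.w3.org/ns/locn#> .
@prefix odrl:   <http://www.w3.org/ns/odrl/2/> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix prov:   <http://www.w3.org/ns/prov#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix time:   <http://www.w3.org/2006/time#> .
@prefix vcard:  <http://www.w3.org/2006/vcard/ns#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
```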
This section is non-normative.
Namespaces and prefixes used in examples and guidelines in the document and not from normative parts of the recommendation are shown in the following table.
Prefix | Namespace |
---|---|
adms | https://www.w3.org/ns/adms# |
dqv | http://www.w3.org/ns/dqv# |
earl | http://www.w3.org/ns/earl# |
geosparql | http://www.opengis.net/ont/geosparql# |
oa | http://www.w3.org/ns/oa# |
sdmx-attribute | http://purl.org/linked-data/sdmx/2009/attribute# |
sdo | https://schema.org/ |
w3cgeo | http://www.w3.org/2003/01/geo/wgs84_pos# |
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST, MUST NOT, and SHOULD in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
A data catalog conforms to DCAT if:
A DCAT profile is a specification for a data catalog that adds additional constraints to DCAT. A data catalog that conforms to the profile also conforms to DCAT. Additional constraints in a profile MAY include:
This section is non-normative.
DCAT is an RDF vocabulary for representing data catalogs. DCAT is based around six main classes (Figure 1):
- dcat:Catalog represents a catalog, which is a dataset in which each individual item is a metadata record describing some resource; the scope of dcat:Catalog is collections of metadata about datasets or data services.
- dcat:Resource represents a dataset, a data service or any other resource that may be described by a metadata record in a catalog. This class is not intended to be used directly, but is the parent class of dcat:Dataset, dcat:DataService and dcat:Catalog. Member items in a catalog should be members of one of the sub-classes, or of a sub-class of these, or of a sub-class of dcat:Resource defined in a DCAT profile or other DCAT application. dcat:Resource is effectively an extension point for defining a catalog of any kind of resource. dcat:Dataset and dcat:DataService can be used for datasets and services which are not documented in any catalog.
- dcat:Dataset represents a dataset. A dataset is a collection of data, published or curated by a single agent. Data comes in many forms including numbers, words, pixels, imagery, sound and other multi-media, and potentially other types, any of which might be collected into a dataset.
- dcat:Distribution represents an accessible form of a dataset such as a downloadable file.
- dcat:DataService represents a data service. A data service is a collection of operations accessible through an interface (API) that provide access to one or more datasets or data processing functions.
- dcat:CatalogRecord represents a metadata item in the catalog, primarily concerning the registration information, such as who added the item and when.
A dataset in DCAT is defined as a "collection of data, published or curated by a single agent, and available for access or download in one or more serializations or formats". A dataset is a conceptual entity, and can be represented by one or more distributions that serialize the dataset for transfer. Distributions of a dataset can be provided via data services.
A data service typically provides selection, extraction, combination, processing or transformation operations over datasets that might be hosted locally or remote to the service. The result of any request to a data service is a representation of a part or all of a dataset or catalog. A data service might be tied to specific datasets, or its source data might be configured at request- or run-time. A data distribution service allows selection and download of a distribution of a dataset or subset. A data discovery service allows a client to find a suitable dataset. Other kinds of data service include data transformation services, such as coordinate transformation services, re-sampling and interpolation services, and various data processing services, including simulation and modelling services. Note that a data service in DCAT is a collection of operations or API which provides access to data. An interactive user-interface is often available to provide convenient access to API operations, but its description is outside the scope of DCAT. The details of a particular data service endpoint will often be specified through a description conforming to a standard service type, which complements the scope of the DCAT vocabulary itself.
Descriptions of datasets and data services can be included in a catalog.
A catalog is a kind of dataset whose member items are descriptions of datasets and data services.
Other types of things might also be cataloged, but the scope of DCAT is currently limited to datasets and data services.
To extend the scope of a catalog beyond datasets and data services it is recommended to define additional sub-classes of dcat:Resource in a DCAT profile or other DCAT application.
To extend the scope of service descriptions beyond data distribution services it is recommended to define additional sub-classes of dcat:DataService in a DCAT profile or other DCAT application.
A catalog record describes an entry in the catalog. Notice that while dcat:Resource represents the dataset or service itself, dcat:CatalogRecord is the record that describes the registration of an item in the catalog. The use of dcat:CatalogRecord is considered optional. It is used to capture provenance information about entries in a catalog explicitly. If this is not necessary then dcat:CatalogRecord can be safely ignored.
The DCAT vocabulary is an OWL2 ontology [OWL2-OVERVIEW] formalized using [RDF-SCHEMA].
Each class and property in DCAT is denoted by an [IRI].
Locally defined elements are in the namespace http://www.w3.org/ns/dcat#.
Elements are also adopted from several external vocabularies, in particular [FOAF], [DCTERMS] and [PROV-O].
RDF allows resources to have global identifiers (IRIs) or to be blank nodes. Blank nodes can be used to denote resources without explicitly naming them with an IRI. They can appear in the subject and object position of a triple [RDF11-PRIMER]. For example, in many actual DCAT catalogs, distributions are represented as blank nodes nested inside the related dataset description. While blank nodes can offer flexibility for some use cases, in a Linked Data context, blank nodes limit our ability to collaboratively annotate data. A blank node resource cannot be the target of a link and it can't be annotated with new information from new sources. As one of the biggest benefits of the Linked Data approach is that "anyone can say anything anywhere", use of blank nodes undermines some of the advantages we can gain from wide adoption of the RDF model. Even within the closed world of a single application dataset, use of blank nodes can quickly become limiting when integrating new data [LinkedDataPatterns]. For these reasons, it is recommended that instances of the DCAT main classes have a global identifier, and use of blank nodes is generally discouraged when encoding DCAT in RDF.
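The difference between the two styles can be sketched in Turtle as follows. The resource names (:dataset-a, :dataset-b, :dataset-b-csv) and the http://example.org/ IRIs are hypothetical and serve only as illustration.

```turtle
@prefix :     <http://example.org/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .

# Discouraged: the distribution is a blank node, so it cannot be the
# target of a link or be annotated from another source.
:dataset-a a dcat:Dataset ;
  dcat:distribution [
    a dcat:Distribution ;
    dcat:downloadURL <http://example.org/files/a.csv>
  ] .

# Recommended: the distribution has its own global identifier.
:dataset-b a dcat:Dataset ;
  dcat:distribution :dataset-b-csv .

:dataset-b-csv a dcat:Distribution ;
  dcat:downloadURL <http://example.org/files/b.csv> .
```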
All RDF examples in this document are written in Turtle syntax [Turtle] and many are available from the DXWG code repository.
This example provides a quick overview of how DCAT might be used to represent a government catalog and its datasets.
First, the catalog description:
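A minimal sketch of such a catalog description is given here. The base IRI http://example.org/, the homepage IRI and the language value are illustrative assumptions; the resource names follow those used in the surrounding prose.

```turtle
@prefix :     <http://example.org/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# The catalog, its publisher and the datasets it lists.
:catalog a dcat:Catalog ;
  dct:title "Imaginary Catalog"@en ;
  foaf:homepage <http://example.org/catalog> ;
  dct:publisher :transparency-office ;
  dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ;
  dcat:dataset :dataset-001 , :dataset-002 , :dataset-003 .
```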
The publisher of the catalog has the relative URI :transparency-office. Further description of the publisher can be provided as in Example 2:
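One possible publisher description, as a sketch: the choice of foaf:Organization and the name literal are assumptions made for illustration.

```turtle
@prefix :     <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# The publisher referenced from the catalog via dct:publisher.
:transparency-office a foaf:Organization ;
  foaf:name "Transparency Office"@en .
```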
The catalog lists each of its datasets via the dcat:dataset property. In Example 1, an example dataset was mentioned with the relative URI :dataset-001. A possible description of it using DCAT is shown below:
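A sketch of such a dataset description follows. Only the identifier :dataset-001 and the issue date 2011-12-05 come from the text of this document; the title, keywords, modification date, frequency, interval, Geonames and contact IRIs, and the :finance-ministry publisher are illustrative assumptions that have not been checked against the services they point to.

```turtle
@prefix :     <http://example.org/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

:dataset-001 a dcat:Dataset ;
  dct:title "Imaginary dataset"@en ;
  dcat:keyword "accountability"@en , "transparency"@en , "payments"@en ;
  # Publication and revision dates.
  dct:issued "2011-12-05"^^xsd:date ;
  dct:modified "2011-12-15"^^xsd:date ;
  # Update frequency (assumed code-list IRI, here "weekly").
  dct:accrualPeriodicity <http://purl.org/linked-data/sdmx/2009/code#freq-W> ;
  # Temporal coverage and resolution (one-day spacing).
  dct:temporal <http://reference.data.gov.uk/id/quarter/2006-Q1> ;
  dcat:temporalResolution "P1D"^^xsd:duration ;
  # Spatial coverage and resolution (30 metres).
  dct:spatial <http://sws.geonames.org/6695072/> ;
  dcat:spatialResolutionInMeters "30.0"^^xsd:decimal ;
  dcat:contactPoint <http://example.org/transparency-office/contact> ;
  dct:publisher :finance-ministry ;
  dcat:distribution :dataset-001-csv .
```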
Five distinct temporal descriptors are shown for this dataset.
The dataset publication and revision dates are shown in dct:issued and dct:modified.
For the frequency of update of the dataset in dct:accrualPeriodicity, we use an instance from the content-oriented guidelines developed as part of the W3C Data Cube Vocabulary [VOCAB-DATA-CUBE] efforts.
The temporal coverage or extent is given in dct:temporal using an item from the Interval dataset (originally available from http://reference.data.gov.uk/id/interval) from data.gov.uk.
The temporal resolution, which describes the minimum spacing of items within the dataset, is given in dcat:temporalResolution using the standard datatype xsd:duration.
Additionally, the spatial coverage or extent is given in dct:spatial using a URI from Geonames.
The spatial resolution, which describes the minimum spatial separation of items within the dataset, is given in dcat:spatialResolutionInMeters using the standard datatype xsd:decimal.
A contact point is provided where comments and feedback about the dataset can be sent. Further details about the contact point, such as email address or telephone number, can be provided using vCard [VCARD-RDF].
One representation of the dataset, :dataset-001-csv, can be downloaded as a 5 kB CSV file. This is represented as an RDF resource of type dcat:Distribution.
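A sketch of this distribution is shown here. The download IRI and title are hypothetical, and the byte size assumes that 5 kB means 5120 bytes.

```turtle
@prefix :     <http://example.org/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# A downloadable CSV representation of :dataset-001.
:dataset-001-csv a dcat:Distribution ;
  dct:title "CSV distribution of imaginary dataset 001"@en ;
  dcat:downloadURL <http://example.org/files/001.csv> ;
  dcat:mediaType <http://www.iana.org/assignments/media-types/text/csv> ;
  dcat:byteSize "5120"^^xsd:decimal .
```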
The catalog classifies its datasets according to a set of domains represented by the relative URI :themes. SKOS [SKOS-REFERENCE] can be used to describe the domains used:
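As a sketch, the scheme can be attached to the catalog with dcat:themeTaxonomy and a dataset classified with dcat:theme. The label text is an illustrative assumption.

```turtle
@prefix :     <http://example.org/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# The catalog declares which concept scheme it uses for classification.
:catalog dcat:themeTaxonomy :themes .

:themes a skos:ConceptScheme ;
  skos:prefLabel "A set of domains to classify datasets"@en .

# A dataset is classified under one concept of that scheme.
:dataset-001 dcat:theme :accountability .
```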
Notice that this dataset is classified under the domain represented by the relative URI :accountability.
It is recommended to define the concept as part of the concept scheme identified by the URI :themes that was used to describe the catalog domains. An example SKOS description:
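A minimal sketch of such a concept definition, with an assumed label:

```turtle
@prefix :     <http://example.org/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# The concept is declared as a member of the :themes scheme.
:accountability a skos:Concept ;
  skos:inScheme :themes ;
  skos:prefLabel "Accountability"@en .
```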
The type or genre of a dataset can be indicated using the dct:type property.
It is recommended that the value of the property is taken from a well governed and broadly recognised set of resource types, such as the DCMI Type Vocabulary [DCTERMS], the MARC Genre/Terms Scheme, the [ISO-19115-1] MD_Scope codes, the DataCite resource types, or the PARSE.Insight content-types from Re3data [RE3DATA-SCHEMA].
In the following examples, a (notional) dataset is classified separately using values from different vocabularies.
It is also possible for multiple classifications to be present in a single description.
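The following sketch classifies a notional dataset with two dct:type values in a single description. dctype:Dataset is taken from the DCMI Type Vocabulary; the second IRI is a hypothetical stand-in for a term from another controlled vocabulary, and the dataset name :dataset-type-example is likewise hypothetical.

```turtle
@prefix :       <http://example.org/> .
@prefix dcat:   <http://www.w3.org/ns/dcat#> .
@prefix dct:    <http://purl.org/dc/terms/> .
@prefix dctype: <http://purl.org/dc/dcmitype/> .

:dataset-type-example a dcat:Dataset ;
  # Value from the DCMI Type Vocabulary.
  dct:type dctype:Dataset ;
  # Hypothetical value standing in for another vocabulary's term.
  dct:type <http://example.org/resource-types/survey> .
```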
If the catalog publisher decides to keep metadata describing its records (i.e. the records containing metadata describing the datasets), dcat:CatalogRecord can be used. For example, while :dataset-001 was issued on 2011-12-05, its description on Imaginary Catalog was added on 2011-12-11. This can be represented by DCAT as in Example 9:
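A sketch of the two dates kept apart: the record carries the listing date and the dataset carries its own issue date. The record name :record-001 is hypothetical, and foaf:primaryTopic is used here on the assumption that it links a catalog record to the resource it describes.

```turtle
@prefix :     <http://example.org/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

:catalog dcat:record :record-001 .

# The record: when the description was added to the catalog.
:record-001 a dcat:CatalogRecord ;
  foaf:primaryTopic :dataset-001 ;
  dct:issued "2011-12-11"^^xsd:date .

# The dataset itself: when the data was issued.
:dataset-001 a dcat:Dataset ;
  dct:issued "2011-12-05"^^xsd:date .
```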
:dataset-002 is available as a CSV file. However, :dataset-002 can only be obtained through some Web page where the user needs to follow some links, provide some information and check some boxes before accessing the data.
Notice the use of a dcat:landingPage and the definition of the dcat:Distribution instance.
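This case might be sketched as follows. The landing page IRI and the distribution name :dataset-002-csv are hypothetical; the distribution points at the landing page through dcat:accessURL because no direct download address is known.

```turtle
@prefix :     <http://example.org/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .

:dataset-002 a dcat:Dataset ;
  dcat:landingPage <http://example.org/dataset-002.html> ;
  dcat:distribution :dataset-002-csv .

# The CSV is reachable only by going through the landing page.
:dataset-002-csv a dcat:Distribution ;
  dcat:accessURL <http://example.org/dataset-002.html> ;
  dcat:mediaType <http://www.iana.org/assignments/media-types/text/csv> .
```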
On the other hand, :dataset-003 can be obtained through some landing page but also can be downloaded from a known URL.
Notice that we used dcat:downloadURL with the downloadable distribution and that the other distribution accessible through the landing page does not have to be defined as a separate dcat:Distribution instance.
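A sketch of this second case, with hypothetical IRIs and a hypothetical distribution name :dataset-003-csv:

```turtle
@prefix :     <http://example.org/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .

:dataset-003 a dcat:Dataset ;
  # Access via the landing page needs no separate distribution.
  dcat:landingPage <http://example.org/dataset-003.html> ;
  dcat:distribution :dataset-003-csv .

# The file with a known address is described with dcat:downloadURL.
:dataset-003-csv a dcat:Distribution ;
  dcat:downloadURL <http://example.org/files/dataset-003.csv> ;
  dcat:mediaType <http://www.iana.org/assignments/media-types/text/csv> .
```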
:dataset-004 is distributed in different representations from different services.
The dcat:accessURL for each dcat:Distribution corresponds with the dcat:endpointURL of the service.
Each service is characterized by its general type using dct:type (here using values from the INSPIRE spatial data service type vocabulary), its specific API definition using dct:conformsTo, with the detailed description of the individual endpoint parameters and options linked using dcat:endpointDescription.
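The pattern can be sketched with two services. All names and http://example.org/ IRIs are hypothetical, and the INSPIRE and OGC IRIs used for dct:type and dct:conformsTo are assumptions that should be checked against the respective registries. dcat:accessService and dcat:servesDataset are used here to link distributions, services and the dataset.

```turtle
@prefix :     <http://example.org/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

:dataset-004 a dcat:Dataset ;
  dcat:distribution :dataset-004-features , :dataset-004-map .

# Each distribution's access URL matches its service's endpoint URL.
:dataset-004-features a dcat:Distribution ;
  dcat:accessURL <http://example.org/api/features> ;
  dcat:accessService :feature-service .

:dataset-004-map a dcat:Distribution ;
  dcat:accessURL <http://example.org/api/maps> ;
  dcat:accessService :map-service .

:feature-service a dcat:DataService ;
  dcat:endpointURL <http://example.org/api/features> ;
  dcat:endpointDescription <http://example.org/api/features/description> ;
  dct:type <https://inspire.ec.europa.eu/metadata-codelist/SpatialDataServiceType/download> ;
  dct:conformsTo <http://www.opengis.net/def/serviceType/ogc/wfs> ;
  dcat:servesDataset :dataset-004 .

:map-service a dcat:DataService ;
  dcat:endpointURL <http://example.org/api/maps> ;
  dcat:endpointDescription <http://example.org/api/maps/description> ;
  dct:type <https://inspire.ec.europa.eu/metadata-codelist/SpatialDataServiceType/view> ;
  dct:conformsTo <http://www.opengis.net/def/serviceType/ogc/wms> ;
  dcat:servesDataset :dataset-004 .
```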
The (revised) DCAT vocabulary is available in RDF.
The primary artefact dcat2.ttl is a serialization of the core DCAT vocabulary.
Alongside it are a set of other RDF files that provide additional information, including:
DCAT requires use of elements from a number of other vocabularies. Furthermore, DCAT may be augmented by additional elements from external vocabularies, following the usual RDFS [RDF-SCHEMA] and OWL2 [OWL2-OVERVIEW] rules and patterns.
Elements from a number of complementary vocabularies MAY be used together with DCAT to provide more detailed information. For example: properties from the VoID vocabulary [VOID] allow the description of various statistics about a DCAT-described dataset if that dataset is in RDF format; properties from the Provenance ontology [PROV-O] can be used to provide more information about the workflow that generated a dataset or service and related activities and agents; classes and properties from the Organization Ontology [VOCAB-ORG] can be used to explain additional details of responsible agents.
The definitions (including domain and range) of terms outside the DCAT namespace are provided here only for convenience and MUST NOT be considered normative. The authoritative definitions of these terms are in the corresponding specifications, i.e. [DC11], [DCTERMS], [FOAF], [PROV-O], [RDF-SCHEMA], [SKOS-REFERENCE], [XMLSCHEMA11-2] and [VCARD-RDF].
The following properties are specific to this class: catalog record, has part, dataset, service, catalog, homepage, themes.
The following properties are inherited from the super-class dcat:Dataset: distribution, frequency, spatial/geographic coverage, spatial resolution, temporal coverage, temporal resolution, was generated by.
The following properties are inherited from the super-class dcat:Resource: access rights, conforms to, contact point, creator, description, has policy, identifier, is referenced by, keyword/tag, landing page, license, catalog language, relation, rights, qualified relation, publisher, release date, theme/category, title, type/genre, update/modification date, qualified attribution.
RDF Class: | dcat:Catalog |
---|---|
Definition: | A curated collection of metadata about resources (e.g., datasets and data services in the context of a data catalog) |
Sub-class of: | dcat:Dataset |
Usage note: | A Web-based data catalog is typically represented as a single instance of this class. |
See also: | § 6.5 Class: Catalog Record, § 6.6 Class: Dataset |
RDF Property: | foaf:homepage |
---|---|
Definition: | A homepage of the catalog (a public Web document usually available in HTML). |
Range: | foaf:Document |
Usage note: | foaf:homepage is an inverse functional property (IFP) which means that it MUST be unique and precisely identify the Web-page for the resource. This property indicates the canonical Web-page, which might be helpful in cases where there is more than one Web-page about the resource. |
RDF Property: | dcat:themeTaxonomy |
---|---|
Definition: | A knowledge organization system (KOS) used to classify the catalog's datasets and services. |
Domain: | dcat:Catalog |
Range: | rdfs:Resource |
Usage note: | It is recommended that the taxonomy is organized in a skos:ConceptScheme, skos:Collection, owl:Ontology or similar, which allows each member to be denoted by an IRI and published as Linked Data. |
RDF Property: | dct:hasPart |
---|---|
Definition: | An item that is listed in the catalog. |
Domain: | dcat:Catalog |
Range: | dcat:Resource |
Usage note: | This is the most general predicate for membership of a catalog. Use of a more specific sub-property is recommended when available. |
See also: | Sub-properties of dct:hasPart, in particular dcat:dataset, dcat:catalog, dcat:service. |
RDF Property: | dcat:dataset |
---|---|
Definition: | A collection of data that is listed in the catalog. |
Sub-property of: | dct:hasPart |
Domain: | dcat:Catalog |
Range: | dcat:Dataset |
RDF Property: | dcat:service |
---|---|
Definition: | A site or end-point that is listed in the catalog. |
Sub-property of: | dct:hasPart |
Domain: | dcat:Catalog |
Range: | dcat:DataService |
RDF Property: | dcat:catalog |
---|---|
Definition: | A catalog whose contents are of interest in the context of this catalog. |
Sub-property of: | dct:hasPart |
Domain: | dcat:Catalog |
Range: | dcat:Catalog |
RDF Property: | dcat:record |
---|---|
Definition: | A record describing the registration of a single dataset or data service that is part of the catalog. |
Domain: | dcat:Catalog |
Range: | dcat:CatalogRecord |
The following properties are specific to this class: access rights, conforms to, contact point, creator, description, has policy, identifier, is referenced by, keyword/tag, landing page, license, resource language, relation, rights, qualified relation, publisher, release date, theme/category, title, type/genre, update/modification date, qualified attribution.
RDF Class: | dcat:Resource |
---|---|
Definition: | Resource published or curated by a single agent. |
Usage note: | The class of all cataloged resources, the super-class of dcat:Dataset, dcat:DataService, dcat:Catalog and any other member of a dcat:Catalog. This class carries properties common to all cataloged resources, including datasets and data services. It is strongly recommended to use a more specific sub-class. When describing a resource which is not a dcat:Dataset or dcat:DataService, it is recommended to create a suitable sub-class of dcat:Resource, or use dcat:Resource with the dct:type property to indicate the specific type. |
Usage note: | dcat:Resource is an extension point that enables the definition of any kind of catalog. Additional sub-classes may be defined in a DCAT profile or other DCAT application for catalogs of other kinds of resources. |
See also: | § 6.5 Class: Catalog Record |
RDF Property: | dct:accessRights |
---|---|
Definition: | Information about who can access the resource or an indication of its security status. |
Range: | dct:RightsStatement |
Usage note: | Information about licenses and rights MAY be provided for the Resource. See also guidance at § 8. License and rights statements. |
See also: | § 6.4.20 Property: rights |
RDF Property: | dct:conformsTo |
---|---|
Definition: | An established standard to which the described resource conforms. |
Range: | dct:Standard ("A basis for comparison; a reference point against which other things can be evaluated." [DCTERMS]) |
Usage note: | This property SHOULD be used to indicate the model, schema, ontology, view or profile that the cataloged resource content conforms to. |
RDF Property: | dcat:contactPoint |
---|---|
Definition: | Relevant contact information for the cataloged resource. Use of vCard is recommended [VCARD-RDF]. |
Range: | vcard:Kind |
RDF Property: | dct:creator |
---|---|
Definition: | The entity responsible for producing the resource. |
Range: | foaf:Agent |
Usage note: | Resources of type foaf:Agent are recommended as values for this property. |
See also: | § 6.11 Class: Organization/Person |
RDF Property: | dct:description |
---|---|
Definition: | A free-text account of the item. |
Range: | rdfs:Literal |
RDF Property: | dct:title |
---|---|
Definition: | A name given to the item. |
Range: | rdfs:Literal |
RDF Property: | dct:issued |
---|---|
Definition: | Date of formal issuance (e.g., publication) of the item. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] (xsd:gYear, xsd:gYearMonth, xsd:date, or xsd:dateTime). |
Usage note: | This property SHOULD be set using the first known date of issuance. |
See also: | § 6.5.3 Property: listing date and § 6.7.3 Property: release date |
RDF Property: | dct:modified |
---|---|
Definition: | Most recent date on which the item was changed, updated or modified. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] (xsd:gYear, xsd:gYearMonth, xsd:date, or xsd:dateTime). |
Usage note: | The value of this property indicates a change to the actual item, not a change to the catalog record. An absent value MAY indicate that the item has never changed after its initial publication, or that the date of last modification is not known, or that the item is continuously updated. |
See also: | § 6.6.2 Property: frequency, § 6.5.4 Property: update/modification date and § 6.7.4 Property: update/modification date |
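The date datatypes permitted for dct:issued and dct:modified can be illustrated with typed literals of differing precision. The resource names and date values below are hypothetical.

```turtle
@prefix :    <http://example.org/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Only the year is known.
:item-year dct:issued "2011"^^xsd:gYear .

# Year and month are known.
:item-month dct:issued "2011-12"^^xsd:gYearMonth .

# Full date for issuance, full timestamp for the last modification.
:item-day dct:issued "2011-12-05"^^xsd:date ;
  dct:modified "2011-12-15T09:30:00Z"^^xsd:dateTime .
```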
RDF Property: | dct:language |
---|---|
Definition: | A language of the item. This refers to the natural language used for textual metadata (i.e. titles, descriptions, etc.) of a cataloged resource (i.e. dataset or service) or the textual values of a dataset distribution. |
Range: | Resources defined by the Library of Congress (ISO 639-1, ISO 639-2) SHOULD be used. If an ISO 639-1 (two-letter) code is defined for the language, then its corresponding IRI SHOULD be used; if no ISO 639-1 code is defined, then the IRI corresponding to the ISO 639-2 (three-letter) code SHOULD be used. |
Usage note: | Repeat this property if the resource is available in multiple languages. |
Usage note: | The value(s) provided for members of a catalog (i.e. dataset or service) override the value(s) provided for the catalog if they conflict. |
Usage note: | If representations of a dataset are available for each language separately, define an instance of dcat:Distribution for each language and describe the specific language of each distribution using dct:language (i.e. the dataset will have multiple dct:language values and each distribution will have just one as the value of its dct:language property). |
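The per-language pattern in the last usage note can be sketched as follows. The dataset and distribution names and the download IRIs are hypothetical; the language IRIs follow the Library of Congress ISO 639-1 pattern.

```turtle
@prefix :     <http://example.org/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

# The dataset lists every language in which it is available.
:dataset-multilingual a dcat:Dataset ;
  dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ,
               <http://id.loc.gov/vocabulary/iso639-1/es> ;
  dcat:distribution :dataset-multilingual-en , :dataset-multilingual-es .

# Each distribution carries exactly one language.
:dataset-multilingual-en a dcat:Distribution ;
  dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ;
  dcat:downloadURL <http://example.org/files/data-en.csv> .

:dataset-multilingual-es a dcat:Distribution ;
  dct:language <http://id.loc.gov/vocabulary/iso639-1/es> ;
  dcat:downloadURL <http://example.org/files/data-es.csv> .
```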
RDF Property: | dct:publisher |
---|---|
Definition: | The entity responsible for making the item available. |
Usage note: | Resources of type foaf:Agent are recommended as values for this property. |
See also: | § 6.11 Class: Organization/Person |
RDF Property: | dct:identifier |
---|---|
Definition: | A unique identifier of the item. |
Range: | rdfs:Literal |
Usage note: | The identifier might be used as part of the URI of the item, but still having it represented explicitly is useful. |
RDF Property: | dcat:theme |
---|---|
Definition: | A main category of the resource. A resource can have multiple themes. |
Sub-property of: | dct:subject |
Range: | skos:Concept |
Usage note: | The set of skos:Concept instances used to categorize the resources is organized in a skos:ConceptScheme describing all the categories and their relations in the catalog. |
See also: | § 6.3.2 Property: themes |
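
A minimal, non-normative sketch of how a theme might be attached to a dataset and tied back to the catalog's concept scheme is given below. The `ex:` namespace, the scheme and the concept are all hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# The catalog declares which concept scheme holds its themes.
ex:catalog
  a dcat:Catalog ;
  dcat:themeTaxonomy ex:themes ;
  dcat:dataset ex:dataset-001 .

ex:themes
  a skos:ConceptScheme ;
  skos:prefLabel "Example themes"@en .

# A single category, linked to its scheme with skos:inScheme.
ex:transport
  a skos:Concept ;
  skos:inScheme ex:themes ;
  skos:prefLabel "Transport"@en .

# The dataset is categorized with that concept.
ex:dataset-001
  a dcat:Dataset ;
  dcat:theme ex:transport .
```
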
RDF Property: | dct:type |
---|---|
Definition: | The nature or genre of the resource. |
Sub-property of: | dc:type |
Range: | rdfs:Class |
Usage note: | The value SHOULD be taken from a well governed and broadly recognised controlled vocabulary, such as:
|
Usage note: | To describe the file format, physical medium, or dimensions of the resource, use the dct:format element. |
RDF Property: | dct:relation |
---|---|
Definition: | A resource with an unspecified relationship to the cataloged item. |
Usage note: | dct:relation SHOULD be used where the nature of the relationship between a cataloged item and related resources is not known. A more specific sub-property SHOULD be used if the nature of the relationship of the link is known.
The property dcat:distribution SHOULD be used to link from a dcat:Dataset to a representation of the dataset, described as a dcat:Distribution |
See also: | Sub-properties of dct:relation, in particular dcat:distribution, dct:hasPart (and its sub-properties dcat:catalog, dcat:dataset, dcat:service), dct:isPartOf, dct:conformsTo, dct:isFormatOf, dct:hasFormat, dct:isVersionOf, dct:hasVersion, dct:replaces, dct:isReplacedBy, dct:references, dct:isReferencedBy, dct:requires, dct:isRequiredBy |
Many existing and legacy catalogs do not distinguish between dataset components, representations, documentation, schemata and other resources that are lumped together as part of a dataset. dct:relation is a super-property of a number of more specific properties which express more precise relationships, so use of dct:relation is not inconsistent with a subsequent reclassification with more specific semantics, though the more specialized sub-properties SHOULD be used to link a dataset to component and supplementary resources if possible.
RDF Property: | dcat:qualifiedRelation |
---|---|
Definition: | Link to a description of a relationship with another resource |
Sub-property of: | prov:qualifiedInfluence |
Domain: | dcat:Resource |
Range: | dcat:Relationship |
Usage note: | Used to link to another resource where the nature of the relationship is known but does not match one of the standard [DCTERMS] properties (dct:hasPart, dct:isPartOf, dct:conformsTo, dct:isFormatOf, dct:hasFormat, dct:isVersionOf, dct:hasVersion, dct:replaces, dct:isReplacedBy, dct:references, dct:isReferencedBy, dct:requires, dct:isRequiredBy) or [PROV-O] properties (prov:wasDerivedFrom, prov:wasInfluencedBy, prov:wasQuotedFrom, prov:wasRevisionOf, prov:hadPrimarySource, prov:alternateOf, prov:specializationOf). |
This DCAT property follows the common qualified relation pattern described in § 13. Qualified relations.
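
The qualified relation pattern might look as follows in Turtle. This is a non-normative sketch: the `ex:` namespace, the two datasets and the role `ex:calibrationSource` are hypothetical and stand in for a relationship that none of the standard properties expresses.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

ex:dataset-002
  a dcat:Dataset ;
  dcat:qualifiedRelation [
    a dcat:Relationship ;
    dct:relation ex:dataset-001 ;        # the related resource
    dcat:hadRole ex:calibrationSource    # the role it plays (hypothetical)
  ] .

# A role taken from a (hypothetical) controlled vocabulary of entity roles.
ex:calibrationSource
  a dcat:Role ;
  skos:prefLabel "calibration source"@en .
```
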
RDF Property: | dcat:keyword |
---|---|
Definition: | A keyword or tag describing the resource. |
Range: | rdfs:Literal |
RDF Property: | dcat:landingPage |
---|---|
Definition: | A Web page that can be navigated to in a Web browser to gain access to the catalog, a dataset, its distributions and/or additional information. |
Sub-property of: | foaf:page |
Range: | foaf:Document |
Usage note: | If the distribution(s) are accessible only through a landing page (i.e. direct download URLs are not known), then the landing page link SHOULD be duplicated as dcat:accessURL on a distribution (see § 5.7 Dataset available only behind some Web page). |
RDF Property: | prov:qualifiedAttribution |
---|---|
Definition: | Link to an Agent having some form of responsibility for the resource |
Sub-property of: | prov:qualifiedInfluence |
Domain: | prov:Entity |
Range: | prov:Attribution |
Usage note: | Used to link to an Agent where the nature of the relationship is known but does not match one of the standard [DCTERMS] properties (dct:creator , dct:publisher ).
Use dcat:hadRole on the prov:Attribution to capture the responsibility of the Agent with respect to the Resource.
See § 13.1 Relationships between datasets and agents for usage examples. |
This DCAT property follows the common qualified relation pattern described in § 13. Qualified relations.
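
As a non-normative sketch, a qualified attribution could be written as below. The `ex:` namespace, the agent and the role `ex:custodian` are hypothetical; in practice the role would preferably come from a controlled vocabulary of agent roles.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

ex:dataset-001
  a dcat:Dataset ;
  prov:qualifiedAttribution [
    a prov:Attribution ;
    prov:agent ex:agency-x ;      # the responsible agent
    dcat:hadRole ex:custodian     # its responsibility (hypothetical role)
  ] .

ex:agency-x
  a foaf:Organization ;
  foaf:name "Example Agency" .

ex:custodian
  a dcat:Role ;
  skos:prefLabel "custodian"@en .
```
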
RDF Property: | dct:license |
---|---|
Definition: | A legal document under which the resource is made available. |
Range: | dct:LicenseDocument |
Usage note: | Information about licenses and rights MAY be provided for the Resource. See also guidance at § 8. License and rights statements. |
See also: | § 6.4.20 Property: rights, § 6.7.5 Property: license |
RDF Property: | dct:rights |
---|---|
Definition: | A statement that concerns all rights not addressed with dct:license or dct:accessRights, such as copyright statements. |
Range: | dct:RightsStatement |
Usage note: | Information about licenses and rights MAY be provided for the Resource. See also guidance at § 8. License and rights statements. |
See also: | § 6.4.19 Property: license, § 6.7.7 Property: rights, § 6.4.1 Property: access rights |
RDF Property: | odrl:hasPolicy |
---|---|
Definition: | An ODRL conformant policy expressing the rights associated with the resource. |
Range: | odrl:Policy |
Usage note: | Information about rights expressed as an ODRL policy [ODRL-MODEL] using the ODRL vocabulary [ODRL-VOCAB] MAY be provided for the resource. See also guidance at § 8. License and rights statements. |
See also: | § 6.4.19 Property: license, § 6.4.1 Property: access rights, § 6.4.20 Property: rights |
RDF Property: | dct:isReferencedBy |
---|---|
Definition: | A related resource, such as a publication, that references, cites, or otherwise points to the cataloged resource. |
Usage note: | In relation to the use case of data citation, when the cataloged resource is a dataset, the dct:isReferencedBy property can be used to relate the dataset to the resources (such as scholarly publications) that cite or point to the dataset. Multiple dct:isReferencedBy properties can be used to indicate that the dataset has been referenced by multiple publications or other resources. |
Usage note: | This property is used to associate a resource with the resource (of type dcat:Resource ) in question. For other relations to resources not covered with this property, the more generic property dcat:qualifiedRelation can be used. See also § 13. Qualified relations. |
For examples on the use of this property, see § C.3 Link datasets and publications.
The following properties are specific to this class (dcat:CatalogRecord): conforms to, description, listing date, primary topic, title, update/modification date.
RDF Class: | dcat:CatalogRecord |
---|---|
Definition: | A record in a catalog, describing the registration of a single dcat:Resource . |
Usage note | This class is optional and not all catalogs will use it. It exists for catalogs where a distinction is made between metadata about a dataset or service and metadata about the entry in the catalog about the dataset or service. For example, the publication date property of the dataset reflects the date when the information was originally made available by the publishing agency, while the publication date of the catalog record is the date when the dataset was added to the catalog. In cases where both dates differ, or where only the latter is known, the publication date SHOULD only be specified for the catalog record. Notice that the W3C PROV Ontology [PROV-O] allows describing further provenance information such as the details of the process and the agent involved in a particular change to a dataset or its registration. |
See also | § 6.6 Class: Dataset |
If a catalog is represented as an RDF Dataset with named graphs (as defined in [SPARQL11-QUERY]), then it is appropriate to place the description of each dataset (consisting of all RDF triples that mention the dcat:Dataset, dcat:CatalogRecord, and any of its dcat:Distributions) into a separate named graph. The name of that graph SHOULD be the IRI of the catalog record.
RDF Property: | dct:title |
---|---|
Definition: | A name given to the record. |
Range: | rdfs:Literal |
RDF Property: | dct:description |
---|---|
Definition: | A free-text account of the record. |
Range: | rdfs:Literal |
RDF Property: | dct:issued |
---|---|
Definition: | The date of listing (i.e. formal recording) of the corresponding dataset or service in the catalog. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] (xsd:gYear, xsd:gYearMonth, xsd:date, or xsd:dateTime). |
Usage note: | This indicates the date of listing the dataset in the catalog and not the publication date of the dataset itself. |
See also: | § 6.4.7 Property: release date |
RDF Property: | dct:modified |
---|---|
Definition: | Most recent date on which the catalog entry was changed, updated or modified. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] (xsd:gYear, xsd:gYearMonth, xsd:date, or xsd:dateTime). |
Usage note: | This indicates the date of last change of a catalog entry, i.e. the catalog metadata description of the dataset, and not the date of the dataset itself. |
See also: | § 6.4.8 Property: update/modification date |
RDF Property: | foaf:primaryTopic |
---|---|
Definition: | The dcat:Resource (dataset or service) described in the record. |
Usage note: | The foaf:primaryTopic property is functional: each catalog record can have at most one primary topic, i.e. it describes exactly one dataset or service. |
RDF Property: | dct:conformsTo |
---|---|
Definition: | An established standard to which the described resource conforms. |
Range: | dct:Standard (A basis for comparison; a reference point against which other things can be evaluated.) |
Usage note: | This property SHOULD be used to indicate the model, schema, ontology, view or profile that the catalog record metadata conforms to. |
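
The distinction between metadata about the catalog entry and metadata about the dataset itself can be sketched as follows. This is a non-normative example; the `ex:` namespace, the resource names and all dates are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

ex:catalog
  a dcat:Catalog ;
  dcat:dataset ex:dataset-001 ;
  dcat:record  ex:record-001 .

# The record: when the entry was listed and last edited in the catalog.
ex:record-001
  a dcat:CatalogRecord ;
  foaf:primaryTopic ex:dataset-001 ;       # at most one primary topic
  dct:issued   "2020-03-01"^^xsd:date ;    # listing date
  dct:modified "2020-06-15"^^xsd:date .    # last change to the entry

# The dataset: when the data itself was published by its provider.
ex:dataset-001
  a dcat:Dataset ;
  dct:issued "2019-11-20"^^xsd:date .
```
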
The following properties are specific to this class: distribution, frequency, spatial/geographic coverage, spatial resolution, temporal coverage, temporal resolution, was generated by.
The following properties are inherited from the super-class dcat:Resource: access rights, conforms to, contact point, creator, description, has policy, identifier, is referenced by, keyword/tag, landing page, license, resource language, relation, rights, qualified relation, publisher, release date, theme/category, title, type/genre, update/modification date, qualified attribution.
Information about licenses and rights SHOULD be provided on the level of Distribution. Information about licenses and rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing license or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts.
RDF Class: | dcat:Dataset |
---|---|
Definition: | A collection of data, published or curated by a single agent, and available for access or download in one or more representations. |
Sub-class of: | dcat:Resource |
Usage note: | This class describes the conceptual dataset. One or more representations might be available, with differing schematic layouts and formats or serializations. |
Usage note: | This class describes the actual dataset as published by the dataset provider. In cases where a distinction between the actual dataset and its entry in the catalog is necessary (because metadata such as modification date might differ), the catalog record class can be used for the latter. |
RDF Property: | dcat:distribution |
---|---|
Definition: | An available distribution of the dataset. |
Sub-property of: | dct:relation |
Domain: | dcat:Dataset |
Range: | dcat:Distribution |
RDF Property: | dct:accrualPeriodicity |
---|---|
Definition: | The frequency at which the dataset is published. |
Range: | dct:Frequency (A rate at which something recurs) |
Usage note: | The value of dct:accrualPeriodicity gives the rate at which the dataset-as-a-whole is updated. This may be complemented by dcat:temporalResolution to give the time between collected data points in a time series. |
Examples showing how dct:accrualPeriodicity and dcat:temporalResolution may be combined are given in § 9.1 Temporal properties.
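
As a brief, non-normative sketch, a dataset that is republished daily but contains one observation every 15 minutes might be described as below. The `ex:` namespace and the dataset are hypothetical; the frequency IRI is taken from the DCMI Collection Description Frequency vocabulary, which is one possible choice of controlled vocabulary.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

ex:air-quality
  a dcat:Dataset ;
  # How often the dataset as a whole is updated.
  dct:accrualPeriodicity <http://purl.org/cld/freq/daily> ;
  # Spacing between individual data points in the series.
  dcat:temporalResolution "PT15M"^^xsd:duration .
```
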
RDF Property: | dct:spatial |
---|---|
Definition: | The geographical area covered by the dataset. |
Range: | dct:Location (A spatial region or named place) |
Usage note: | The spatial coverage of a dataset may be encoded as an instance of dct:Location , or may be indicated using a URI reference (link) to a resource describing a location. It is recommended that links are to entries in a well maintained gazetteer such as Geonames. |
Options for expressing the details of a dct:Location are provided in § 6.15 Class: Location.
RDF Property: | dcat:spatialResolutionInMeters |
---|---|
Definition: | Minimum spatial separation resolvable in a dataset, measured in meters. |
Range: | xsd:decimal |
Usage note: | If the dataset is an image or grid this should correspond to the spacing of items. For other kinds of spatial datasets, this property will usually indicate the smallest distance between items in the dataset. |
The range of this property is a decimal number representing a length in meters. This is intended to provide a summary indication of the spatial resolution of the data as a single number. More complex descriptions of various aspects of spatial precision, accuracy, resolution and other statistics can be provided using the Data Quality Vocabulary [VOCAB-DQV].
RDF Property: | dct:temporal |
---|---|
Definition: | The temporal period that the dataset covers. |
Range: | dct:PeriodOfTime (An interval of time that is named or defined by its start and end dates) |
Usage note: | The temporal coverage of a dataset may be encoded as an instance of dct:PeriodOfTime , or may be indicated using a URI reference (link) to a resource describing a time period or interval. |
Options for expressing the details of a dct:PeriodOfTime are provided in § 6.14 Class: Period of Time.
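
The spatial and temporal coverage properties above might be combined as in the following non-normative sketch. The `ex:` namespace, the location IRI `ex:region-north` and all values are hypothetical; a real description would preferably link to an entry in a well maintained gazetteer.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

ex:dataset-003
  a dcat:Dataset ;
  # Spatial coverage given as a link to a resource describing a location.
  dct:spatial ex:region-north ;
  # Summary spatial resolution: a grid spacing of 30 metres.
  dcat:spatialResolutionInMeters "30.0"^^xsd:decimal ;
  # Temporal coverage given as an inline period with start and end dates.
  dct:temporal [
    a dct:PeriodOfTime ;
    dcat:startDate "2016-03-04"^^xsd:date ;
    dcat:endDate   "2018-08-05"^^xsd:date
  ] .

ex:region-north
  a dct:Location .
```
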
RDF Property: | dcat:temporalResolution |
---|---|
Definition: | Minimum time period resolvable in the dataset. |
Range: | xsd:duration |
Usage note: | If the dataset is a time-series this should correspond to the spacing of items in the series. For other kinds of dataset, this property will usually indicate the smallest time difference between items in the dataset. |
This is intended to provide a summary indication of the temporal resolution of the data as a single value. More complex descriptions of various aspects of temporal precision, accuracy, resolution and other statistics can be provided using the Data Quality Vocabulary [VOCAB-DQV].
The distinction between dcat:temporalResolution and dct:accrualPeriodicity is illustrated by examples in § 9.1 Temporal properties.
RDF Property: | prov:wasGeneratedBy |
---|---|
Definition: | An activity that generated, or provides the business context for, the creation of the dataset. |
Domain: | prov:Entity |
Range: | prov:Activity (An activity is something that occurs over a period of time and acts upon or with entities; it may include consuming, processing, transforming, modifying, relocating, using, or generating entities.) |
Usage note: | The activity associated with generation of a dataset will typically be an initiative, project, mission, survey, on-going activity ("business as usual") etc. Multiple prov:wasGeneratedBy properties can be used to indicate the dataset production context at various levels of granularity. |
Usage note: | Use prov:qualifiedGeneration to attach additional details about the relationship between the dataset and the activity, e.g. the exact time that the dataset was produced during the lifetime of a project |
Details about how to describe the activity that generated a dataset, such as a project, initiative, on-going activity, mission or survey, are out of scope for this document. prov:Activity provides for some basic properties such as begin and end time, associated agents etc. Further details may be provided through classes defined in applications. A number of ontologies for describing projects are available, for example VIVO for academic research projects [VIVO-ISF], DOAP (Description of a Project) for software projects [DOAP], and DBpedia for general projects [DBPEDIA-ONT], which are expected to be suitable for different applications.
The following properties are specific to this class: access rights, access URL, access service, byte size, compression format, conforms to, description, download URL, format, has policy, license, media type, packaging format, release date, rights, spatial resolution, temporal resolution, title, update/modification date.
RDF Class: | dcat:Distribution |
---|---|
Definition: | A specific representation of a dataset. A dataset might be available in multiple serializations that may differ in various ways, including natural language, media-type or format, schematic organization, temporal and spatial resolution, level of detail or profiles (which might specify any or all of the above). |
Usage note: | This represents a general availability of a dataset. It implies no information about the actual access method of the data, i.e. whether by direct download, API, or through a Web page. The use of the dcat:downloadURL property indicates directly downloadable distributions. |
See also: | § 6.8 Class: Data Service |
Links between a dcat:Distribution and services or Web addresses where it can be accessed are expressed using dcat:accessURL, dcat:accessService, dcat:downloadURL, as shown in Figure 1 and described in the definitions below.
RDF Property: | dct:title |
---|---|
Definition: | A name given to the distribution. |
Range: | rdfs:Literal |
RDF Property: | dct:description |
---|---|
Definition: | A free-text account of the distribution. |
Range: | rdfs:Literal |
RDF Property: | dct:issued |
---|---|
Definition: | Date of formal issuance (e.g., publication) of the distribution. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] (xsd:gYear, xsd:gYearMonth, xsd:date, or xsd:dateTime). |
Usage note: | This property SHOULD be set using the first known date of issuance. |
See also: | § 6.4.7 Property: release date |
RDF Property: | dct:modified |
---|---|
Definition: | Most recent date on which the distribution was changed, updated or modified. |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] (xsd:gYear, xsd:gYearMonth, xsd:date, or xsd:dateTime). |
See also: | § 6.4.8 Property: update/modification date |
RDF Property: | dct:license |
---|---|
Definition: | A legal document under which the distribution is made available. |
Range: | dct:LicenseDocument |
Usage note: | Information about licenses and rights SHOULD be provided on the level of Distribution. Information about licenses and rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing license or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts. See also guidance at § 8. License and rights statements. |
See also: | § 6.7.7 Property: rights, § 6.4.19 Property: license |
RDF Property: | dct:accessRights |
---|---|
Definition: | A rights statement that concerns how the distribution is accessed. |
Range: | dct:RightsStatement |
Usage note: | Information about licenses and rights MAY be provided for the Distribution. See also guidance at § 8. License and rights statements. |
See also: | § 6.7.5 Property: license, § 6.7.7 Property: rights, § 6.4.1 Property: access rights |
RDF Property: | dct:rights |
---|---|
Definition: | Information about rights held in and over the distribution. |
Range: | dct:RightsStatement |
Usage note: | Information about licenses and rights SHOULD be provided on the level of Distribution. Information about licenses and rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing license or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts. See also guidance at § 8. License and rights statements. |
See also: | § 6.7.5 Property: license, § 6.4.20 Property: rights |
RDF Property: | odrl:hasPolicy |
---|---|
Definition: | An ODRL conformant policy expressing the rights associated with the distribution. |
Range: | odrl:Policy |
Usage note: | Information about rights expressed as an ODRL policy [ODRL-MODEL] using the ODRL vocabulary [ODRL-VOCAB] MAY be provided for the distribution. See also guidance at § 8. License and rights statements. |
See also: | § 6.4.19 Property: license, § 6.7.6 Property: access rights, § 6.7.7 Property: rights |
RDF Property: | dcat:accessURL |
---|---|
Definition: | A URL of the resource that gives access to a distribution of the dataset. E.g. landing page, feed, SPARQL endpoint. |
Domain: | dcat:Distribution |
Range: | rdfs:Resource |
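Usage note: | If the distribution(s) are accessible only through a landing page (i.e. direct download URLs are not known), then the landing page URL associated with the dataset SHOULD be duplicated as dcat:accessURL on a distribution (see § 5.7 Dataset available only behind some Web page). |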
See also | § 6.7.11 Property: download URL, § 6.7.10 Property: access service |
dcat:accessURL matches the property-chain dcat:accessService/dcat:endpointURL. In the RDF representation of DCAT this is axiomatized as an OWL property-chain axiom.
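
The property chain can be pictured with a small, non-normative sketch. The `ex:` namespace, the distribution, the service and the endpoint address are hypothetical. Following dcat:accessService and then dcat:endpointURL from the distribution arrives at the same IRI that is given directly as dcat:accessURL, so a reasoner applying the axiom would be expected to derive the dcat:accessURL statement from the other two.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

ex:dist-sparql
  a dcat:Distribution ;
  dcat:accessService ex:sparql-service ;
  # Stated explicitly here; equal to the service's endpoint URL.
  dcat:accessURL <http://example.org/sparql> .

ex:sparql-service
  a dcat:DataService ;
  dcat:endpointURL <http://example.org/sparql> .
```
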
RDF Property: | dcat:accessService |
---|---|
Definition: | A data service that gives access to the distribution of the dataset |
Range: | dcat:DataService |
Usage note: | dcat:accessService SHOULD be used to link to a description of a dcat:DataService that can provide access to this distribution. |
See also | § 6.7.11 Property: download URL, § 6.7.9 Property: access URL |
RDF Property: | dcat:downloadURL |
---|---|
Definition: | The URL of the downloadable file in a given format. E.g. CSV file or RDF file. The format is indicated by the distribution's dct:format and/or dcat:mediaType |
Domain: | dcat:Distribution |
Range: | rdfs:Resource |
Usage note: | dcat:downloadURL SHOULD be used for the URL at which this distribution is available directly, typically through an HTTP GET request. |
See also | § 6.7.9 Property: access URL, § 6.7.10 Property: access service |
RDF Property: | dcat:byteSize |
---|---|
Definition: | The size of a distribution in bytes. |
Domain: | dcat:Distribution |
Range: | rdfs:Literal typed as xsd:decimal . |
Usage note: | The size in bytes can be approximated (as a decimal) when the precise size is not known. |
RDF Property: | dcat:spatialResolutionInMeters |
---|---|
Definition: | The minimum spatial separation resolvable in a dataset distribution, measured in meters. |
Range: | xsd:decimal |
Usage note: | If the dataset is an image or grid this should correspond to the spacing of items. For other kinds of spatial datasets, this property will usually indicate the smallest distance between items in the dataset. |
Usage note: | Alternative spatial resolutions might be provided as different dataset distributions |
The range of this property is a decimal number representing a length in meters. This is intended to provide a summary indication of the spatial resolution of the data distribution as a single number. More complex descriptions of various aspects of spatial precision, accuracy, resolution and other statistics can be provided using the Data Quality Vocabulary [VOCAB-DQV].
RDF Property: | dcat:temporalResolution |
---|---|
Definition: | Minimum time period resolvable in the dataset distribution. |
Range: | xsd:duration |
Usage note: | If the dataset is a time-series this should correspond to the spacing of items in the series. For other kinds of dataset, this property will usually indicate the smallest time difference between items in the dataset. |
Usage note: | Alternative temporal resolutions might be provided in different dataset distributions |
This is intended to provide a summary indication of the temporal resolution of the data distribution as a single value. More complex descriptions of various aspects of temporal precision, accuracy, resolution and other statistics can be provided using the Data Quality Vocabulary [VOCAB-DQV].
RDF Property: | dct:conformsTo |
---|---|
Definition: | An established standard to which the distribution conforms. |
Range: | dct:Standard (A basis for comparison; a reference point against which other things can be evaluated.) |
Usage note: | This property SHOULD be used to indicate the model, schema, ontology, view or profile that this representation of a dataset conforms to. This is (generally) a complementary concern to the media-type or format. |
See also: | § 6.7.17 Property: format, § 6.7.16 Property: media type |
RDF Property: | dcat:mediaType |
---|---|
Definition: | The media type of the distribution as defined by IANA [IANA-MEDIA-TYPES]. |
Sub-property of: | dct:format |
Domain: | dcat:Distribution |
Range: | dct:MediaType |
Usage note: | This property SHOULD be used when the media type of the distribution is defined in IANA [IANA-MEDIA-TYPES], otherwise dct:format MAY be used with different values. |
See also: | § 6.7.17 Property: format, § 6.7.15 Property: conforms to |
RDF Property: | dct:format |
---|---|
Definition: | The file format of the distribution. |
Range: | dct:MediaTypeOrExtent |
Usage note: | dcat:mediaType SHOULD be used if the type of the distribution is defined by IANA [IANA-MEDIA-TYPES]. |
See also: | § 6.7.16 Property: media type, § 6.7.15 Property: conforms to |
RDF Property: | dcat:compressFormat |
---|---|
Definition: | The compression format of the distribution in which the data is contained in a compressed form, e.g. to reduce the size of the downloadable file. |
Range: | dct:MediaType |
Usage note: | This property is to be used when the files in the distribution are compressed, e.g. in a ZIP file. The format SHOULD be expressed using a media type as defined by IANA [IANA-MEDIA-TYPES], if available. |
See also: | § 6.7.19 Property: packaging format. |
For examples on the use of this property, see § C.5 Compressed and packaged distributions.
RDF Property: | dcat:packageFormat |
---|---|
Definition: | The package format of the distribution in which one or more data files are grouped together, e.g. to enable a set of related files to be downloaded together. |
Range: | dct:MediaType |
Usage note: | This property is to be used when the files in the distribution are packaged, e.g. in a TAR file, a Frictionless Data Package or a BagIt file. The format SHOULD be expressed using a media type as defined by IANA [IANA-MEDIA-TYPES], if available. |
See also: | § 6.7.18 Property: compression format. |
For examples on the use of this property, see § C.5 Compressed and packaged distributions.
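
A non-normative sketch of how the format-related properties might fit together is shown below. The `ex:` namespace, the file URLs and the byte size are hypothetical; the media type IRIs follow the IANA registry pattern. The first distribution is a single gzip-compressed CSV file, the second a ZIP archive that groups several CSV files.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# A CSV file that has been compressed with gzip.
ex:dist-csv-gz
  a dcat:Distribution ;
  dcat:downloadURL    <http://example.org/files/data.csv.gz> ;
  dcat:mediaType      <https://www.iana.org/assignments/media-types/text/csv> ;
  dcat:compressFormat <https://www.iana.org/assignments/media-types/application/gzip> ;
  dcat:byteSize       "5120"^^xsd:decimal .

# Several CSV files grouped together in one ZIP package.
ex:dist-csv-zip
  a dcat:Distribution ;
  dcat:downloadURL   <http://example.org/files/data-bundle.zip> ;
  dcat:mediaType     <https://www.iana.org/assignments/media-types/text/csv> ;
  dcat:packageFormat <https://www.iana.org/assignments/media-types/application/zip> .
```

In both cases dcat:mediaType describes the data files themselves, while the compression or packaging layer is described separately.
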
The following properties are specific to this class: endpoint description, endpoint URL, serves dataset.
The following properties are inherited from the super-class dcat:Resource: access rights, conforms to, contact point, creator, description, has policy, identifier, is referenced by, keyword/tag, landing page, license, resource language, relation, rights, qualified relation, publisher, release date, theme/category, title, type/genre, update/modification date, qualified attribution.
RDF Class: | dcat:DataService |
---|---|
Definition: | A collection of operations that provides access to one or more datasets or data processing functions. |
Sub-class of: | dcat:Resource |
Sub-class of: | dctype:Service |
Usage note: | If a dcat:DataService is bound to one or more specified Datasets, they are indicated by the dcat:servesDataset property. |
Usage note: | The kind of service can be indicated using the dct:type property. Its value may be taken from a controlled vocabulary such as the INSPIRE spatial data service type code list [INSPIRE-SDST]. |
For examples on the use of this class and related properties, see § C.4 Data services.
RDF Property: | dcat:endpointURL |
---|---|
Definition: | The root location or primary endpoint of the service (a Web-resolvable IRI). |
Domain: | dcat:DataService |
Range: | rdfs:Resource |
RDF Property: | dcat:endpointDescription |
---|---|
Definition: | A description of the services available via the end-points, including their operations, parameters etc. |
Domain: | dcat:DataService |
Range: | rdfs:Resource |
Usage note: | The endpoint description gives specific details of the actual endpoint instances, while dct:conformsTo is used to indicate the general standard or specification that the endpoints implement. |
Usage note: | An endpoint description may be expressed in a machine-readable form, such as an OpenAPI (Swagger) description [OpenAPI], an OGC GetCapabilities response [WFS], [ISO-19142], [WMS], [ISO-19128], a SPARQL Service Description [SPARQL11-SERVICE-DESCRIPTION], an [OpenSearch] or [WSDL20] document, or a Hydra API description [HYDRA]; otherwise it may be given in text or some other informal mode if a formal representation is not possible. |
RDF Property: | dcat:servesDataset |
---|---|
Definition: | A collection of data that this data service can distribute. |
Range: | dcat:Dataset |
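
A data service description using the three class-specific properties might look as follows. This is a non-normative sketch; the `ex:` namespace, the endpoint addresses and the dataset are hypothetical, and the SPARQL 1.1 Protocol is used only as one example of a standard that a service could declare with dct:conformsTo.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

ex:sparql-service
  a dcat:DataService ;
  dct:title "Example SPARQL endpoint"@en ;
  # The general specification the endpoint implements.
  dct:conformsTo <https://www.w3.org/TR/sparql11-protocol/> ;
  # The root location of the service.
  dcat:endpointURL <http://example.org/sparql> ;
  # Details of this particular endpoint instance.
  dcat:endpointDescription <http://example.org/sparql/description> ;
  # The dataset(s) the service gives access to.
  dcat:servesDataset ex:dataset-001 .

ex:dataset-001
  a dcat:Dataset .
```
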
RDF Class: | skos:ConceptScheme |
---|---|
Definition: | A knowledge organization system (KOS) used to represent themes/categories of datasets in the catalog. |
See also: | § 6.3.2 Property: themes, § 6.4.12 Property: theme/category |
RDF Class: | skos:Concept |
---|---|
Definition: | A category or a theme used to describe datasets in the catalog. |
Usage note: | It is recommended to use either skos:inScheme or skos:topConceptOf on every skos:Concept used to classify datasets to link it to the concept scheme it belongs to. This concept scheme is typically associated with the catalog using dcat:themeTaxonomy. |
See also: | § 6.3.2 Property: themes, § 6.4.12 Property: theme/category |
RDF Classes: | foaf:Person for people and foaf:Organization for government agencies or other entities. |
---|---|
Usage note: | [FOAF] provides several properties to describe these entities. |
The following properties are specific to this class: relation, had role.
Examples illustrating use of this class and its properties are given in § 13. Qualified relations.
RDF Class: | dcat:Relationship |
---|---|
Definition: | An association class for attaching additional information to a relationship between DCAT Resources |
Sub-class of: | prov:EntityInfluence |
Usage note: | Use to characterize a relationship between datasets, and potentially other resources, where the nature of the relationship is known but is not adequately characterized by the standard [DCTERMS] properties (dct:hasPart, dct:isPartOf, dct:conformsTo, dct:isFormatOf, dct:hasFormat, dct:isVersionOf, dct:hasVersion, dct:replaces, dct:isReplacedBy, dct:references, dct:isReferencedBy, dct:requires, dct:isRequiredBy) or [PROV-O] properties (prov:wasDerivedFrom, prov:wasInfluencedBy, prov:wasQuotedFrom, prov:wasRevisionOf, prov:hadPrimarySource, prov:alternateOf, prov:specializationOf). |
RDF Property: | dct:relation |
---|---|
Definition: | The resource related to the source resource. |
Usage note: | In the context of a dcat:Relationship this is expected to point to another dcat:Dataset or other cataloged resource. |
RDF Property: | dcat:hadRole |
---|---|
Definition: | The function of an entity or agent with respect to another entity or resource. |
Domain: | prov:Attribution or dcat:Relationship |
Range: | dcat:Role |
Usage note: | May be used in a qualified-attribution to specify the role of an Agent with respect to an Entity. It is recommended that the value be taken from a controlled vocabulary of agent roles, such as [ISO-19115] CI_RoleCode . |
Usage note: | May be used in a qualified-relation to specify the role of an Entity with respect to another Entity. It is recommended that the value be taken from a controlled vocabulary of entity roles. |
This DCAT property complements prov:hadRole, which provides the function of an entity or agent with respect to an activity.
Examples illustrating use of this class are given in § 13. Qualified relations.
RDF Class: | dcat:Role |
---|---|
Definition: | A role is the function of a resource or agent with respect to another resource, in the context of resource attribution or resource relationships. |
Sub-class of: | skos:Concept |
Usage note: | Used in a qualified-attribution to specify the role of an Agent with respect to an Entity. It is recommended that the values be managed as a controlled vocabulary of agent roles, such as [ISO-19115-1] CI_RoleCode . |
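Usage note: | Used in a qualified-relation to specify the role of an Entity with respect to another Entity. It is recommended that the values be managed as a controlled vocabulary of entity roles, such as the [ISO-19115-1] DS_AssociationTypeCode values, the IANA Registry of Link Relations [IANA-RELATIONS], the [DataCite] metadata schema, or the MARC relators. |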
This DCAT class complements prov:Role, which provides the function of an entity or agent with respect to an activity.
The following properties are specific to this class: start date, end date, beginning, end.
Examples illustrating use of these options for the temporal coverage of a dataset are given in § 9.1 Temporal properties.
RDF Class: | dct:PeriodOfTime |
---|---|
Definition: | An interval of time that is named or defined by its start and end. |
Usage note: | The start and end of the interval SHOULD be given by using properties dcat:startDate or time:hasBeginning, and dcat:endDate or time:hasEnd, respectively. The interval can also be open, i.e., it can have just a start or just an end. |
RDF Property: | dcat:startDate |
---|---|
Definition: | The start of the period. |
Domain: | dct:PeriodOfTime |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] (xsd:gYear, xsd:gYearMonth, xsd:date, or xsd:dateTime). |
RDF Property: | dcat:endDate |
---|---|
Definition: | The end of the period. |
Domain: | dct:PeriodOfTime |
Range: | rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] (xsd:gYear, xsd:gYearMonth, xsd:date, or xsd:dateTime). |
RDF Property: | time:hasBeginning |
---|---|
Definition: | Beginning of a period or interval. |
Range: | time:Instant |
Usage note: | Use of the property time:hasBeginning entails that the value of the dct:temporal property is a member of the time:TemporalEntity class from [OWL-TIME]. In this context this could be taken to imply that dct:PeriodOfTime is equivalent to the sub-class time:ProperInterval. |
RDF Property: | time:hasEnd |
---|---|
Definition: | End of a period or interval. |
Range: | time:Instant |
Usage note: | Use of the property time:hasEnd entails that the value of the dct:temporal property is a member of the time:TemporalEntity class from [OWL-TIME]. In this context this could be taken to imply that dct:PeriodOfTime is equivalent to the sub-class time:ProperInterval. |
The following properties are specific to this class: geometry, bounding box, centroid.
Examples illustrating use of these options for the spatial coverage of a dataset are given in § 9.2 Spatial properties.
RDF Class: | dct:Location |
---|---|
Definition: | A spatial region or named place. |
Usage note: |
|
RDF Property: | locn:geometry |
---|---|
Definition: | Associates any resource with the corresponding geometry. [LOCN] |
Range: | rdfs:Literal |
Usage note: | The range of this property is intentionally generic, with the purpose of allowing different geometry encodings. E.g., the geometry could be encoded as WKT (geosparql:asWKT [GeoSPARQL]). |
RDF Property: | dcat:bbox |
---|---|
Definition: | The geographic bounding box of a resource. |
Range: | rdfs:Literal |
Usage note: | The range of this property is intentionally generic, with the purpose of allowing different geometry encodings. E.g., the geometry could be encoded as WKT (geosparql:asWKT [GeoSPARQL]). |
RDF Property: | dcat:centroid |
---|---|
Definition: | The geographic center (centroid) of a resource. |
Range: | rdfs:Literal |
Usage note: | The range of this property is intentionally generic, with the purpose of allowing different geometry encodings. E.g., the geometry could be encoded as WKT (geosparql:asWKT [GeoSPARQL]). |
This section is non-normative.
The scientific and data provider communities use a number of different identifiers for publications, authors and data. DCAT primarily relies on persistent HTTP URIs as an effective way of making identifiers actionable. Notably, quite a few identifier schemes can be encoded as dereferenceable HTTP URIs, and some of them are also returning machine-readable metadata (e.g., DOIs [ISO-26324] and ORCIDs). Regardless, data providers still might need to refer to legacy identifiers, non-HTTP dereferenceable identifiers, locally minted or third-party-provided identifiers. In these cases, [DCTERMS] and [VOCAB-ADMS] can be of use.
The property dct:identifier explicitly indicates HTTP URIs as well as legacy identifiers. In the following examples, dct:identifier identifies a dataset, but it can similarly be used with any kind of resource.
Proxy dereferenceable URIs can be used when resources do not have HTTP-dereferenceable IDs. For example, in Example 14, https://example.org/proxyid is a proxy for id.
The property adms:identifier [VOCAB-ADMS] can express other locally minted identifiers or external identifiers, like DOI, ELI, arΧiv for creative works and ORCID, VIAF, ISNI for actors such as authors and publishers, as long as the identifiers are globally unique and stable.
Example 15 uses adms:schemaAgency and dct:creator to represent the authority that defines the identifier scheme (e.g., the DOI foundation in the example); adms:schemaAgency is used when the authority has no URI associated. The CrossRef and DataCite display guidelines recommend displaying DOIs as a full URL link in the form https://doi.org/10.xxxx/xxxxx/.
Example 15 does not represent the authority responsible for assigning and maintaining identifiers using that scheme (e.g., Zenodo), as naming the registrant goes against the philosophy of DOI, where the sub-spaces are abstracted from the organization that registers them, with the advantage that DOIs do not change when the organization changes or the responsibility for that sub-space is handed over to someone else. Example 15 shows a locally minted identifier for the creator of the dataset (e.g., https://example.org/PoelenJorritHID) and its corresponding ORCID identifier (e.g., https://orcid.org/0000-0003-3138-4118).
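The identifier patterns described above might be sketched as follows. This is an illustrative fragment rather than one of the specification's numbered examples: the dataset URI is hypothetical, the DOI value is the placeholder pattern quoted in the text, and the agency name is given as a literal on the assumption that the authority has no URI.

```turtle
@prefix adms: <http://www.w3.org/ns/adms#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical dataset identified primarily by its HTTP URI.
<https://example.org/id> a dcat:Dataset ;
    dct:identifier "https://example.org/id"^^xsd:anyURI ;
    # A secondary, externally assigned identifier.
    adms:identifier [
        a adms:Identifier ;
        # Placeholder DOI in the full-URL display form (not a real DOI).
        skos:notation "https://doi.org/10.xxxx/xxxxx"^^xsd:anyURI ;
        # Name of the authority defining the scheme, given as a literal.
        adms:schemaAgency "International DOI Foundation"
    ] .
```

When the authority does have a URI, dct:creator on the adms:Identifier node could point to it in place of the literal.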
When the HTTP dereferenceable ID returns an RDF/OWL description for the dataset, the use of owl:sameAs might be considered. For example, when dereferenced with media type text/turtle, https://doi.org/10.5281/zenodo.1486279 returns a [SCHEMA-ORG] description for the dataset, which might dynamically enrich the description provided by https://example.org/id.
The need to distinguish between primary and alternative (or legacy) identifiers for a dataset within DCAT has been posed as a requirement. However, it is very much application-specific and would be better addressed in DCAT profiles rather than mandating a general approach.
Depending on the application context, specific guidelines such as "DCAT-AP: How to manage duplicates?" can be adopted for distinguishing authoritative datasets from datasets harvested by third-party catalogs.
If identifiers are not HTTP dereferenceable, common identifier types can be served as RDF datatypes [RDF11-CONCEPTS] or custom OWL datatypes [OWL2-SYNTAX] for the sake of interoperability; see ex:type in Example 17.
If a registered URI type is used (following [RFC3986], § 3.1 Scheme), the identifier scheme is part of the URI; thus indicating a separate identifier scheme in 'type' is redundant. For example, DOI is registered as a namespace in the info URI scheme [IANA-URI-SCHEMES] (see DOI FAQ #11), so according to [RFC3986], it should be encoded as in Example 18.
Otherwise, examples of common types for identifier scheme (arXiv, etc.) are defined in DataCite schema and FAIRsharing Registry.
This section is non-normative.
Selecting the right way to express conditions for access to and re-use of resources can be complex. Implementers should always seek legal advice before deciding which conditions apply to the resource being described.
This specification distinguishes three main situations: one where a statement is associated with a resource that is explicitly declared as a 'license'; a second, where the statement is associated with a resource denoting only access rights; a third, covering all the other cases - i.e., statements not concerning licensing conditions and/or access rights (e.g., copyright statements).
To address these scenarios, it is recommended to use the property dct:rights, and its sub-properties dct:license and dct:accessRights. More precisely:
use dct:license to refer to licenses;
use dct:accessRights to express statements concerning only access rights (e.g., whether data can be accessed by anyone or just by authorized parties);
use dct:rights for all the other types of rights statements, i.e., those which are not covered by dct:license and dct:accessRights, such as copyright statements.
Finally, in the particular case when rights are expressed via ODRL policies, it is recommended to use the odrl:hasPolicy property as the link from the description of the cataloged resource or distribution to the ODRL policy, in addition to the corresponding [DCTERMS] property that matches the same ODRL policy type.
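A minimal sketch of the three situations, plus the ODRL case, is given below. All example.org URIs and the copyright text are hypothetical; the license shown is only one possible choice, and the second distribution assumes that its ODRL policy plays the role of a license.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix odrl: <http://www.w3.org/ns/odrl/2/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<https://example.org/dist/001> a dcat:Distribution ;
    # Case 1: a resource explicitly declared as a license.
    dct:license <https://creativecommons.org/licenses/by/4.0/> ;
    # Case 2: a statement about access rights only (hypothetical code list entry).
    dct:accessRights <https://example.org/access-rights/public> ;
    # Case 3: any other rights statement, here a copyright notice.
    dct:rights [
        a dct:RightsStatement ;
        rdfs:label "Copyright Example Organization"@en
    ] .

# ODRL case: odrl:hasPolicy is used in addition to the matching DCTERMS property.
<https://example.org/dist/002> a dcat:Distribution ;
    odrl:hasPolicy <https://example.org/policy/002> ;
    dct:license    <https://example.org/policy/002> .
```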
Recommendations on the use of these properties on the different types of resources defined in DCAT are provided in the relevant class descriptions.
This section is non-normative.
Five temporal properties of resources may be described using DCAT.
The date of formal issuance (publication) of a resource is given using dct:issued. The value is usually encoded as an xsd:date.
The date on which a resource was last changed is given using dct:modified. The value is usually encoded as an xsd:date.
The update schedule of a resource is given using dct:accrualPeriodicity. The value should be taken from a controlled vocabulary such as the Dublin Core Collection Description Frequency Vocabulary.
The minimum temporal separation of items in a dataset is given using dcat:temporalResolution. The value is encoded as an xsd:duration.
The update schedule and the temporal resolution can be combined to support the description of different kinds of time-series data.
The temporal extent of a dataset is given using dct:temporal. The value is a dct:PeriodOfTime.
A number of options for expressing the details of a dct:PeriodOfTime are recommended in § 6.14 Class: Period of Time.
Examples of these follow.
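The five temporal properties might be combined as in the sketch below, which is illustrative and not one of the specification's numbered examples. The dataset URIs and all dates are hypothetical. The first dataset describes an hourly time series updated daily over a closed period given with dcat:startDate and dcat:endDate; the second shows an open interval expressed with time:hasBeginning only.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix time: <http://www.w3.org/2006/time#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical time series: hourly readings, refreshed once a day.
<https://example.org/dataset/hourly-readings> a dcat:Dataset ;
    dct:issued   "2019-06-01"^^xsd:date ;
    dct:modified "2019-12-15"^^xsd:date ;
    dct:accrualPeriodicity <http://purl.org/cld/freq/daily> ;
    dcat:temporalResolution "PT1H"^^xsd:duration ;
    dct:temporal [
        a dct:PeriodOfTime ;
        dcat:startDate "2019-01-01"^^xsd:date ;
        dcat:endDate   "2019-12-31"^^xsd:date
    ] .

# Hypothetical open-ended period: a beginning but no end.
<https://example.org/dataset/ongoing-survey> a dcat:Dataset ;
    dct:temporal [
        a dct:PeriodOfTime ;
        time:hasBeginning [
            a time:Instant ;
            time:inXSDDate "2016-03-04"^^xsd:date
        ]
    ] .
```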
Two spatial properties of datasets may be described using DCAT.
The minimum spatial separation of items in a dataset is given using dcat:spatialResolutionInMeters. The value is a decimal number.
An example of the use of dcat:spatialResolutionInMeters is given in Example 3.
The spatial extent of a dataset is given using dct:spatial. The value is a dct:Location.
A number of options for expressing the details of a dct:Location are recommended in § 6.15 Class: Location.
Examples of these follow.
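As a rough sketch of the two spatial properties, the fragment below gives a spatial resolution and a dct:Location described by a bounding box and a centroid. The dataset URI and the coordinates are made up for illustration, and WKT with the geosparql:wktLiteral datatype is only one of the encodings the generic range allows.

```turtle
@prefix dcat:      <http://www.w3.org/ns/dcat#> .
@prefix dct:       <http://purl.org/dc/terms/> .
@prefix geosparql: <http://www.opengis.net/ont/geosparql#> .
@prefix xsd:       <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical dataset with a 30 m resolution and a rectangular extent.
<https://example.org/dataset/land-cover> a dcat:Dataset ;
    dcat:spatialResolutionInMeters "30.0"^^xsd:decimal ;
    dct:spatial [
        a dct:Location ;
        # Illustrative box from (3.0, 50.0) to (8.0, 54.0), longitude first.
        dcat:bbox "POLYGON((3.0 50.0, 3.0 54.0, 8.0 54.0, 8.0 50.0, 3.0 50.0))"^^geosparql:wktLiteral ;
        # Center of the box above.
        dcat:centroid "POINT(5.5 52.0)"^^geosparql:wktLiteral
    ] .
```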
This section is non-normative.
Versioning can be applied to any of the first-class DCAT resources, including Catalogs, Datasets and Distributions. The notion of version is closely tied to community practices, data management policies and the workflows in place. It is up to data providers to decide when and why a new version should be released. For this reason, DCAT refrains from providing definitions or rules about when changes in a resource should result in a new release of it.
Versioning may be understood as involving relationships between datasets, which is supported by the dcat:qualifiedRelation property and described in § 13.2 Relationships between datasets and other resources. The class dcat:Relationship supports providing information about the relationship, and could be extended for versioning information.
This section is non-normative.
Dataset citation is one of the requirements identified for this DCAT revision. Data citation is the practice of referencing data in a similar way as when providing bibliographic references, acknowledging data as a first-class output in any investigative process. Data citation offers multiple benefits, such as supporting proper attribution and credit to those producing the data, facilitating data discovery, supporting tracking the impact and reuse of data, allowing for collaboration and re-use of data, and enabling the reproducibility of results based on the data.
To support data citation, the dataset description should include at a minimum: the dataset identifier, the dataset creator(s), the dataset title, the dataset publisher and the dataset publication or release date. These elements are those required by the DataCite metadata schema [DataCite], which is the metadata associated with the persistent identifiers (Digital Object Identifiers or DOIs) assigned by [DataCite] to research data.
In order to support data citation, this DCAT revision has added the consideration of dereferenceable identifiers and support for indicating the creators of the cataloged resources. The remaining properties necessary for data citation were already available in DCAT 2014 [VOCAB-DCAT-20140116].
The constraints on the availability of properties required for data citation in the dataset description can be represented as a DCAT data citation profile.
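A dataset description carrying the minimum citation elements listed above could look like the following sketch. All URIs, the title and the date are hypothetical, and the mapping of each element to a [DCTERMS] property is an assumption made for illustration rather than a normative profile.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

<https://example.org/dataset/citable> a dcat:Dataset ;
    # Identifier (dereferenceable).
    dct:identifier "https://example.org/dataset/citable"^^xsd:anyURI ;
    # Creator(s).
    dct:creator <https://example.org/agent/jane-doe> ;
    # Title.
    dct:title "Example citable dataset"@en ;
    # Publisher.
    dct:publisher <https://example.org/org/example-repository> ;
    # Publication or release date.
    dct:issued "2020-01-15"^^xsd:date .
```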
This section is non-normative.
The Data Quality Vocabulary (DQV) [VOCAB-DQV] offers common modelling patterns for different aspects of Data Quality. It can relate DCAT datasets and distributions with different types of quality information including:
dqv:QualityAnnotation, which represents feedback and quality certificates given about the dataset or its distribution.
dqv:QualityPolicy, which represents a policy or agreement that is chiefly governed by data quality concerns.
dqv:QualityMeasurement, which represents a metric value providing quantitative or qualitative information about the dataset or distribution.
Each type of quality information can pertain to one or more quality dimensions, namely, quality characteristics relevant to the consumer. Treating quality as a multi-dimensional space is an established practice in quality management, because it splits quality management into addressable chunks. DQV does not define a normative list of quality dimensions. It offers the quality dimensions proposed in ISO/IEC 25012 [ISO-IEC-25012] and [ZaveriEtAl] as two possible starting points. It also provides an RDF representation for the quality dimensions and categories defined in the latter. Ultimately, implementers will need to choose for themselves the collection of quality dimensions that best fits their needs. The following section shows how DCAT and DQV can be coupled to describe the quality of datasets and distributions. For a comprehensive introduction and further examples of use, please refer to [VOCAB-DQV].
A data consumer (:consumer1) describes the quality of the dataset :genoaBusStopsDataset, which includes a georeferenced list of bus stops in Genoa. The consumer annotates the dataset with a DQV quality note (:genoaBusStopsDatasetCompletenessNote) about data completeness (ldqd:completeness) to warn that the dataset includes only 20500 out of the 30000 stops.
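The annotation scenario above might be expressed roughly as follows. This is a sketch and not the specification's own example: the default namespace and the :textBody node are hypothetical, and the annotation is typed with the generic dqv:QualityAnnotation class.

```turtle
@prefix :     <https://example.org/quality#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dqv:  <http://www.w3.org/ns/dqv#> .
@prefix ldqd: <http://www.w3.org/2016/05/ldqd#> .
@prefix oa:   <http://www.w3.org/ns/oa#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

:genoaBusStopsDataset a dcat:Dataset ;
    dqv:hasQualityAnnotation :genoaBusStopsDatasetCompletenessNote .

# The note targets the dataset and is attributed to the consumer.
:genoaBusStopsDatasetCompletenessNote a dqv:QualityAnnotation ;
    oa:hasTarget :genoaBusStopsDataset ;
    oa:hasBody :textBody ;
    oa:motivatedBy dqv:qualityAssessment ;
    prov:wasAttributedTo :consumer1 ;
    dqv:inDimension ldqd:completeness .

# Hypothetical body node carrying the human-readable warning.
:textBody a oa:TextualBody ;
    rdf:value "Incomplete dataset: it contains only 20500 out of 30000 bus stops."@en .
```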
The activity :myQualityChecking employs the service :myQualityChecker to check the quality of the :genoaBusStopsDataset dataset. The metric :completenessWRTExpectedNumberOfEntities is applied to measure the dataset completeness (ldqd:completeness) and it results in the quality measurement :genoaBusStopsDatasetCompletenessMeasurement.
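A possible rendering of this measurement scenario is sketched below. The default namespace is hypothetical, and the measured value is an assumption derived from the stop counts in the annotation scenario (20500 of 30000, rounded), since the text does not state a value for the measurement.

```turtle
@prefix :     <https://example.org/quality#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dqv:  <http://www.w3.org/ns/dqv#> .
@prefix ldqd: <http://www.w3.org/2016/05/ldqd#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

:genoaBusStopsDataset a dcat:Dataset ;
    dqv:hasQualityMeasurement :genoaBusStopsDatasetCompletenessMeasurement .

# The metric and the quality dimension it belongs to.
:completenessWRTExpectedNumberOfEntities a dqv:Metric ;
    dqv:inDimension ldqd:completeness ;
    dqv:expectedDataType xsd:decimal .

# The measurement produced by the checking activity.
:genoaBusStopsDatasetCompletenessMeasurement a dqv:QualityMeasurement ;
    dqv:computedOn :genoaBusStopsDataset ;
    dqv:isMeasurementOf :completenessWRTExpectedNumberOfEntities ;
    # Assumed value: 20500 / 30000, rounded to four decimals.
    dqv:value "0.6833"^^xsd:decimal ;
    prov:wasGeneratedBy :myQualityChecking .

:myQualityChecking a prov:Activity ;
    prov:wasAssociatedWith :myQualityChecker ;
    prov:used :genoaBusStopsDataset .
```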
Other examples of quality documentation are available in [VOCAB-DQV], including examples about how to express dataset accuracy and precision.
This section shows different modelling patterns combining [VOCAB-DQV] with [PROV-O] and EARL [EARL10-Schema] to represent the conformance degree to a stated quality standard and the details about the conformance tests.
The use of dct:conformsTo and dct:Standard is a well-known pattern to represent the conformance to a standard. Example 33, directly borrowed from [SDW-BP] (Example 51), declares a fictional a:Dataset conformant to the EU INSPIRE Regulation on interoperability of spatial data sets and services ("Commission Regulation (EU) No 1089/2010 of 23 November 2010 implementing Directive 2007/2/EC of the European Parliament and of the Council as regards interoperability of spatial data sets and services").
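The conformance pattern can be sketched as below. The a: namespace and the URI standing for the regulation are hypothetical placeholders, so this fragment should not be read as the borrowed [SDW-BP] example itself.

```turtle
@prefix a:    <https://example.org/a#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

# Fictional dataset declared conformant to a standard.
a:Dataset a dcat:Dataset ;
    dct:conformsTo <https://example.org/standard/inspire-interoperability> .

# Hypothetical URI standing in for Commission Regulation (EU) No 1089/2010.
<https://example.org/standard/inspire-interoperability> a dct:Standard ;
    dct:title "Commission Regulation (EU) No 1089/2010 of 23 November 2010 implementing Directive 2007/2/EC of the European Parliament and of the Council as regards interoperability of spatial data sets and services"@en .
```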
Another example concerns the specification of the coordinate reference system (CRS) used in a dataset, information that is typically included in geospatial metadata. Example 34 shows how the CRS of a dataset can be specified in DCAT:
In Example 34, http://www.opengis.net/def/crs/EPSG/0/28992 is a URI from the OGC CRS Registry, corresponding to EPSG:28992 ("Amersfoort / RD New") (see also Example 28).
Some legal contexts require the degree of conformance to be specified. For example, INSPIRE metadata adopts a specific controlled vocabulary [INSPIRE-DoC] to express non-conformance and non-evaluation besides full compliance. Similar controlled vocabularies can be defined in other contexts.
Example 35 specifies some newly minted concepts representing the degree of conformance (i.e., conformant, not conformant) and uses dct:type to indicate the result of the conformance test. Following a pattern used in [GeoDCAT-AP], the example uses a prov:Entity to model the conformance test (e.g., a:testResult), a prov:Activity to model the testing activity (e.g., a:testingActivity), and a prov:Plan derived from the Data on the Web Best Practices [DWBP] (e.g., a:conformanceTest) to check for the whole set of best practices. A qualified PROV association binds the testing activity to the conformance test.
Also, [VOCAB-DQV] can be deployed to measure the compliance to a specific standard. In Example 36, :levelOfComplianceToDWBP is a quality metric that measures the compliance of a dataset to [DWBP] in terms of the percentage of passed compliance tests. Example 36 assumes iso as a namespace prefix representing the quality dimensions and categories defined in ISO/IEC 25012 [ISO-IEC-25012].
The quality measurement :measurement_complianceToDWBP represents the level of compliance for dataset a:Dataset, namely, a measurement of the metric :levelOfComplianceToDWBP. If only a part of the compliance tests succeeds (e.g., half of the compliance tests), the measurement would look as in Example 37.
Further information about the tests can be provided using EARL [EARL10-Schema]. EARL provides specific classes to describe the testing activity, which can be adopted in conjunction with [PROV-O].
Example 38 describes the testing activity a:testingActivity as an earl:Assertion instead of a qualified association on the prov:Activity. The earl:Assertion states that dataset a:Dataset has been tested with the conformance test a:conformanceTest, and it has passed the test as described in a:testResult.
Example 39 shows how the description would have looked if the subtest a:testq1 had failed. In particular, dct:description and earl:info provide additional warnings or error messages in a human-readable form.
Depending on the details required about tests, [VOCAB-DQV] can express the testing activity and errors as well. In Example 40, :error is a quality annotation that represents the previous error, and a:testResult is defined as a dqv:QualityMetadata to collect the above annotations and the compliance measurements providing provenance information.
Of course, the above modelling patterns can represent any quality tests, not only conformance to standards.
This section is non-normative.
DCAT includes elements to support description of many aspects of datasets and data-services. Nevertheless, additional information is required in order to fully express the semantics of some relationships. An example is that, while [DCTERMS] provides the standard roles creator, contributor and publisher for attribution of a resource to a responsible party or agent, there are many other potential roles, see for example the CI_RoleCode values from [ISO-19115-1]. Similarly, while [DCTERMS] and [PROV-O] provide some properties to capture relationships between resources, including was derived from, was quoted from, is version of, references and several others, many additional concerns are seen in the list of [ISO-19115-1] DS_AssociationTypeCodes, the IANA Registry of Link Relations [IANA-RELATIONS], the DataCite metadata schema [DataCite] and the MARC relators. While these relations could be captured with additional sub-properties of dct:relation, dct:contributor, etc., this would lead to an explosion in the number of properties, and anyway the full set of potential roles and relationships is unknown.
A common approach for meeting these kinds of requirement is to introduce an additional resource to carry parameters that qualify the relationship. Precedents are the qualified terms in [PROV-O] and the sample relations in the Semantic Sensor Network ontology [VOCAB-SSN]. The general Qualified Relation pattern is described in [LinkedDataPatterns].
Many of the qualified terms from [PROV-O] are relevant to the description of resources in catalogs but these are incomplete due to the activity-centric viewpoint taken by PROV-O. Addressing some of the gaps, additional forms are included in the DCAT vocabulary to satisfy requirements that do not involve explicit activities. These are summarized in Figure 5:
Note that, while the focus of these qualified forms is to allow for additional roles on a relationship, other aspects of the relationship, such as the applicable time interval, are easily attached when a specific node is used to describe the relationship like this (e.g., see the chart of Influence relations in [PROV-O] for some examples).
The standard [DCTERMS] properties dct:contributor, dct:creator and dct:publisher, and the generic prov:wasAttributedTo from [PROV-O], support basic associations of responsible agents with a cataloged resource.
However, there are many other roles of importance in relation to datasets and services, e.g., funder, distributor, custodian, editor.
Some of these roles are enumerated in the CI_RoleCode values from [ISO-19115-1], in the [DataCite] metadata schema, and included within the MARC relators.
A general method for assigning an agent to a resource with a specified role is provided by using the qualified form prov:qualifiedAttribution from [PROV-O].
Example 41 provides an illustration:
In Example 41 the roles are denoted by IRIs from a (non-normative) linked data representation of the CI_RoleCode codelist from [ISO-19115-1].
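The qualified attribution pattern might look like the sketch below. The dataset, agent and role IRIs are all hypothetical; in practice the role values would come from a controlled vocabulary such as a linked data rendering of CI_RoleCode.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix prov: <http://www.w3.org/ns/prov#> .

<https://example.org/dataset/xyz> a dcat:Dataset ;
    # One intermediate node per agent and role pair.
    prov:qualifiedAttribution [
        a prov:Attribution ;
        prov:agent   <https://example.org/org/research-council> ;
        dcat:hadRole <https://example.org/role/funder>
    ] ;
    prov:qualifiedAttribution [
        a prov:Attribution ;
        prov:agent   <https://example.org/org/data-archive> ;
        dcat:hadRole <https://example.org/role/custodian>
    ] .
```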
The standard [DCTERMS] property dct:relation and sub-properties such as dct:hasPart / dct:isPartOf, dct:hasVersion / dct:isVersionOf, dct:replaces / dct:isReplacedBy, dct:requires / dct:isRequiredBy, together with the [PROV-O] properties prov:wasDerivedFrom and prov:wasQuotedFrom, support the description of relationships between datasets and other cataloged resources.
However, there are many other relationships of importance, e.g., alternate, canonical, original, preview, stereo-mate, working-copy-of.
Some of these roles are enumerated in the DS_AssociationTypeCodes values from [ISO-19115-1], the IANA Registry of Link Relations [IANA-RELATIONS], in the [DataCite] metadata schema, and included within the MARC relators.
A general method for relating a resource to another resource with a specified role is provided by using the qualified form dcat:qualifiedRelation.
Example 42 provides illustrations:
In Example 42 the roles are denoted by IRIs from [IANA-RELATIONS] and from a (non-normative) linked data representation of the DS_AssociationTypeCode codelist from [ISO-19115-1].
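A qualified relation could be sketched as follows. The dataset URIs are hypothetical, and the IRI form used for the IANA "original" link relation is an assumption made for illustration.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

<https://example.org/dataset/working-copy> a dcat:Dataset ;
    dcat:qualifiedRelation [
        a dcat:Relationship ;
        # The related resource.
        dct:relation <https://example.org/dataset/original> ;
        # Role of the related resource; the IRI form is assumed.
        dcat:hadRole <http://www.iana.org/assignments/relation/original>
    ] .
```

Because the relationship is a node of its own, further details such as an applicable time interval could be attached to it.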
This section is non-normative.
The DCAT-2014 vocabulary [VOCAB-DCAT-20140116] has been extended for application in data catalogs in different domains. Each of these new specifications constitutes a DCAT profile, i.e. a named set of constraints based on DCAT (see § 4. Conformance). In some cases, a profile extends one of the DCAT profiles themselves, by adding classes and properties for metadata fields not covered in the reference DCAT profile.
Some of the DCAT profiles are:
The DCAT vocabulary supports the attribution of data and metadata to various participants such as resource creators, publishers and other parties or agents via qualified relations, and as such defines terms that may be related to personal information. In addition, it also supports the association of rights and licenses with cataloged Resources and Distributions. These rights and licenses could potentially include or reference sensitive information such as user and asset identifiers as described in [ODRL-VOCAB]. Implementations that produce, maintain, publish or consume such vocabulary terms must take steps to ensure security and privacy considerations are addressed at the application level.
The editors gratefully acknowledge the contributions made to this document by all members of the working group, especially Annette Greiner, Antoine Isaac, Armin Haller, Dan Brickley, Ine de Visser, Jaroslav Pullmann, Lars G. Svensson, Linda van den Brink, Makx Dekkers, Nicholas Car, Rob Atkinson, Tom Baker.
The editors would also like to thank the following for comments received: Addison Phillips, Andreas Kuckartz, Anna Odgaard Ingram, Armando Stellato, Bert van Nuffelen, Chris Little, Chris Sweeney, Chris Wood, Clemens Portele, Daniel Pop, Dave Reynolds, Guillaume Duffes, Ian Davis, Jakob Voß, Jakub Klímek, James Passmore, Leigh Dodds, Luca Trani, Marco Brattinga, Matthias Palmér, Melanie Barlow, Nancy Fallgren, Nuno Freire, Øystein Åsnes, Pano Maria, Peter Parslow, Renato Iannella, Ruth Duerr, Siri Jodha S. Khalsa, Stephane Fellah, Stephen Richard, Stijn Goedertier, Tom Kralidis, Vladimir Alexiev, Wouter Beek, Yves Coene.
The editors also gratefully acknowledge the chairs of this Working Group: Karen Coyle, Caroline Burle and Peter Winstanley — and staff contacts Phil Archer and Dave Raggett.
This section is non-normative.
Schema.org [SCHEMA-ORG] includes a number of types and properties based on the original DCAT work (see sdo:Dataset as a starting point), and the index for Google's Dataset Search service relies on structured description in Web pages about datasets using both schema.org and DCAT. A comparison of the DCAT backbone, shown in Figure 1 above, with the related classes from [SCHEMA-ORG] in Figure 6 shows the similarity.
General purpose Web search services that use metadata at all rely primarily on [SCHEMA-ORG], so the relationship of DCAT to [SCHEMA-ORG] is of interest for data providers and catalog publishers who wish their datasets and services to be exposed through those indexes.
A mapping between DCAT 2014 and schema.org was discussed on the original proposal to extend [SCHEMA-ORG] for describing datasets and data catalogs. Partial mappings between DCAT 2014 [VOCAB-DCAT-20140116] and [SCHEMA-ORG] were provided earlier by the Spatial Data on the Web Working Group, building upon previous work.
A recommended mapping from the revised DCAT (this document) to [SCHEMA-ORG] version 3.4 is available in an RDF file.
This mapping is axiomatized using the predicates rdfs:subClassOf, rdfs:subPropertyOf, owl:equivalentClass, owl:equivalentProperty, skos:closeMatch, and also using the annotation properties sdo:domainIncludes and sdo:rangeIncludes to match [SCHEMA-ORG] semantics. The alignment is summarized in the table below, considering the prefix sdo as http://schema.org/.
DCAT element | target element from schema.org |
---|---|
dcat:Resource | sdo:Thing |
dct:title | sdo:name |
dct:description | sdo:description |
dcat:keyword (dcat:keyword is singular, sdo:keywords is plural) | sdo:keywords |
dcat:theme | sdo:about |
dct:identifier | sdo:identifier |
dct:type | sdo:additionalType |
dct:issued | sdo:datePublished |
dct:modified | sdo:dateModified |
dct:language | sdo:inLanguage |
dct:relation | sdo:isRelatedTo |
dcat:landingPage | sdo:url |
dct:publisher | sdo:publisher |
dcat:contactPoint | sdo:contactPoint |
dcat:Catalog | sdo:DataCatalog |
dct:hasPart | sdo:hasPart |
dcat:dataset | sdo:dataset |
dcat:distribution | sdo:distribution |
dcat:Dataset | sdo:Dataset |
dcat:Dataset (with dct:accrualPeriodicity fixed to <http://purl.org/cld/freq/continuous>) | sdo:DataFeed |
dct:spatial | sdo:spatialCoverage |
dct:temporal | sdo:temporalCoverage |
dct:accrualPeriodicity | sdo:repeatFrequency |
prov:wasGeneratedBy | [ owl:inverseOf sdo:result ] |
dcat:Distribution | sdo:DataDownload |
dct:format | sdo:encodingFormat |
dcat:mediaType | sdo:encodingFormat |
dcat:byteSize | sdo:contentSize |
dcat:accessURL | sdo:contentUrl |
dcat:downloadURL | sdo:contentUrl |
dct:license | sdo:license |
dcat:DataService | sdo:WebAPI |
dcat:endpointURL | sdo:url |
dcat:endpointDescription | sdo:documentation, sdo:hasOfferCatalog |
dct:type (in context of a dcat:DataService) | sdo:serviceType |
dcat:servesDataset | sdo:serviceOutput |
dcat:Relationship | sdo:Role |
This section is non-normative.
In many legacy catalogs and repositories (e.g. CKAN), ‘datasets’ are ‘just a bag of files’. There is no distinction made between part/whole, distribution (representation), and other kinds of relationship (e.g. documentation, schema, supporting documents) from the dataset to each of the files.
If the nature of the relationships between a dataset and component resources in a catalog, repository, or elsewhere is not known, dct:relation can be used:
If it is clear that any of these related resources is a proper representation of the dataset, dcat:distribution should be used.
This example is available from the DXWG code repository at csiro-dap-examples.ttl.
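The distinction between the two cases can be sketched as below. This fragment is not the content of the repository file mentioned above; all URIs are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

# Nature of the links unknown: every file is just a related resource.
<https://example.org/dataset/bag-of-files> a dcat:Dataset ;
    dct:relation <https://example.org/files/table.csv> ,
                 <https://example.org/files/readme.pdf> .

# One file is known to be a proper representation of the dataset,
# so it is described as a distribution; the other stays a generic relation.
<https://example.org/dataset/described> a dcat:Dataset ;
    dcat:distribution <https://example.org/dataset/described/csv> ;
    dct:relation <https://example.org/files/readme.pdf> .

<https://example.org/dataset/described/csv> a dcat:Distribution ;
    dcat:downloadURL <https://example.org/files/table.csv> .
```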
Additional detail about the nature of the related resources can be given using suitable elements from other RDF vocabularies, along with dataset descriptors from DCAT. For example, the example above might be more fully expressed as follows (embedded comments explain the different resources in the graph):
This example is available from the DXWG code repository at csiro-stratchart.ttl.
The provenance or business context of a dataset can be described using elements from the W3C Provenance Ontology [PROV-O].
For example, a simple link from a dataset description to the project that generated the dataset can be formalized as follows (other details elided for clarity):
This example is available from the DXWG code repository at csiro-dap-examples.ttl.
Several properties capture provenance information, including within the citation and title, but the primary link to a formal description of the project is through prov:wasGeneratedBy.
A terse description of the project is shown as a prov:Activity, though this would not necessarily be part of the same catalog.
Note that as the project is ongoing, the activity has no end date.
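A link from a dataset to the project that generated it might be sketched as follows. The URIs, label and start date are hypothetical and do not reproduce the repository example.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

<https://example.org/dataset/survey-results> a dcat:Dataset ;
    prov:wasGeneratedBy <https://example.org/project/coastal-survey> .

# Terse description of the project. It is ongoing, so no
# prov:endedAtTime is given.
<https://example.org/project/coastal-survey> a prov:Activity ;
    rdfs:label "Coastal survey project"@en ;
    prov:startedAtTime "2013-07-01T00:00:00Z"^^xsd:dateTime .
```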
Further provenance information might be provided using the other starting point properties from PROV, in particular prov:wasAttributedTo (to link to an agent associated with the dataset production) and prov:wasDerivedFrom (to link to a predecessor dataset). Both of these complement Dublin Core properties already used in DCAT, as follows:
prov:wasAttributedTo provides a general link to all kinds of associated agents, such as project sponsors, managers, dataset owners, etc., which are not correctly characterized using dct:creator, dct:contributor or dct:publisher.
prov:wasDerivedFrom supports a more specific relationship to an input or predecessor dataset compared with dct:source, which is not necessarily a previous dataset.
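The contrast between the PROV and Dublin Core properties can be illustrated with a short sketch. All ex: resources here are hypothetical and are chosen only to show which property suits which kind of link.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix ex:   <https://example.org/> .            # hypothetical namespace

ex:dataset-v2
  a dcat:Dataset ;
  # Specific Dublin Core role: the agent who created the dataset.
  dct:creator ex:alice ;
  # General association with an agent whose role is not creator, contributor or publisher.
  prov:wasAttributedTo ex:funding-agency ;
  # Specific link to a predecessor dataset.
  prov:wasDerivedFrom ex:dataset-v1 ;
  # Looser link to a source that is not itself a dataset.
  dct:source ex:field-notebook .

ex:funding-agency a prov:Agent .
ex:dataset-v1 a dcat:Dataset .
```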
Further patterns for the use of qualified properties for resource attribution and interrelationships are described in § 13. Qualified relations.
Often datasets are associated with publications (scholarly articles, reports, etc.), and this version of DCAT relies on the property dct:isReferencedBy to link publications about a dataset to the dataset.
The following example shows how a dataset published in the Dryad repository is linked to a publication available in the Nature Scientific Data journal:
This example is available from the DXWG code repository at dryad-globtherm-sdata.ttl.
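As a rough sketch of the pattern, with a hypothetical dataset and article IRI standing in for the real identifiers used in the repository file:

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <https://example.org/> .            # hypothetical namespace

ex:dataset-thermal
  a dcat:Dataset ;
  dct:title "Example thermal tolerance dataset"@en ;
  # The publication that references (cites or describes) this dataset.
  dct:isReferencedBy <https://example.org/articles/data-descriptor-42> .
```

The link points from the dataset to the publication, so a catalog can list the publications about a dataset without needing any metadata from the publisher of the article.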
Data services may be described using DCAT.
The values of the classifiers dct:type, dct:conformsTo, and dcat:endpointDescription provide progressively more detail about a service, whose actual endpoint is given by the dcat:endpointURL.
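A generic sketch of these four properties on one service description follows. It is hypothetical and is not one of the repository examples described next; every ex: resource and every example.org URL is an assumption made for illustration.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <https://example.org/> .            # hypothetical namespace

ex:search-service
  a dcat:DataService ;
  dct:title "Example catalog search service"@en ;
  # Coarsest classifier: the general kind of service.
  dct:type ex:service-type-discovery ;
  # More detail: the standard or protocol the service implements.
  dct:conformsTo ex:search-protocol-spec ;
  # Most detail: a description of this particular endpoint and its operations.
  dcat:endpointDescription <https://example.org/api/description.json> ;
  # The actual endpoint.
  dcat:endpointURL <https://example.org/api> ;
  # A dataset made available through the service.
  dcat:servesDataset ex:dataset-001 .
```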
The first example describes a data catalog hosted by the European Environment Agency (EEA).
This is classified as a dcat:DataService and has the dct:type set to "discovery" from the INSPIRE classification of spatial data service types [INSPIRE-SDST].
This example is available from the DXWG code repository at eea-csw.ttl
Example 49 shows a dataset hosted by Geoscience Australia, which is available from three distinct services, as indicated by the value of the dcat:servesDataset property of each of the service descriptions.
These are classified as a dcat:DataService and also have the dct:type set to "download" and "view" from the INSPIRE classification of spatial data service types [INSPIRE-SDST].
Example 49 is available from the DXWG code repository at ga-courts.ttl
The first example is for a distribution with a downloadable file that is compressed into a GZIP file.
The second example is for a distribution with several files packed into a TAR file.
The third example is for a distribution with several files packed into a TAR file which has been compressed into a GZIP file.
These examples are available from the DXWG code repository at compress-and-package.ttl
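The three cases can be sketched roughly as follows. This is an illustration under stated assumptions, not the content of the repository file: the ex: distributions and download URLs are hypothetical, and the IANA media type IRIs are assumed to be suitable values for dcat:mediaType, dcat:compressFormat and dcat:packageFormat.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex:   <https://example.org/> .            # hypothetical namespace

# 1. A single CSV file compressed with GZIP.
ex:dist-gzip
  a dcat:Distribution ;
  dcat:downloadURL <https://example.org/files/data.csv.gz> ;
  dcat:mediaType <http://www.iana.org/assignments/media-types/text/csv> ;
  dcat:compressFormat <http://www.iana.org/assignments/media-types/application/gzip> .

# 2. Several files packed into a TAR file.
ex:dist-tar
  a dcat:Distribution ;
  dcat:downloadURL <https://example.org/files/data.tar> ;
  dcat:packageFormat <http://www.iana.org/assignments/media-types/application/x-tar> .

# 3. Several files packed into a TAR file that is then compressed with GZIP.
ex:dist-tar-gzip
  a dcat:Distribution ;
  dcat:downloadURL <https://example.org/files/data.tar.gz> ;
  dcat:packageFormat <http://www.iana.org/assignments/media-types/application/x-tar> ;
  dcat:compressFormat <http://www.iana.org/assignments/media-types/application/gzip> .
```

Keeping packaging and compression in separate properties lets a consumer work out the unpacking steps without parsing file extensions.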
A full change-log is available on GitHub
The document has undergone the following changes since the W3C Recommendation of 16 January 2014 [VOCAB-DCAT-20140116]:
dct:isReferencedBy was added to the class dcat:Resource to associate the resource described in the catalog with an external resource that references, cites, or points to the cataloged resource. In particular, in the case of datasets, this property supports the data citation use case where a publication references a dataset. For other types of relations not covered by this or other known properties, the specification provides the qualified relations pattern.
See Issue #63.
dct:Location and three new properties (locn:geometry, dcat:bbox, dcat:centroid) added to support description of the coordinates of a geographical area, to be used for specifying the spatial coverage of a resource.
See Issue #83.
dct:PeriodOfTime and four new properties (dcat:startDate, dcat:endDate, time:hasBeginning, time:hasEnd) added to support description of a temporal interval, to be used for specifying the temporal coverage of a resource.
See Issue #85.
dcat:themeTaxonomy relaxed to allow linking to a taxonomy that is not formalized as a skos:ConceptScheme.
See Issue #119.
dcat:spatialResolutionInMeters added to support description of the spatial resolution of datasets and distributions.
See Issue #84.
dcat:temporalResolution added to support description of the temporal resolution of datasets and distributions.
See Issue #84.
dcat:packageFormat and dcat:compressFormat were added to specify packaged and compressed distributions, respectively.
See Issue #54.
dcat:qualifiedRelation and a new class dcat:Relationship added to support relationships between datasets or other resources.
See Issue #79.
dcat:hadRole is added to support the use of prov:qualifiedAttribution to associate an agent with a resource, where the role of the agent with relation to the resource is specified, and is something other than the standard [DCTERMS] roles: creator, publisher or contributor.
See Issue #79.
dct:creator is recommended for use in the context of a dataset or other resource to allow the entity responsible for generating the resource to be recorded.
See Issue #61.
prov:wasGeneratedBy is recommended for use in the context of a dataset to allow the provenance or business context to be recorded.
See Issue #71.
dct:relation is recommended for use in the context of a cataloged resource to capture general relationships, including the case where the package of resources associated with a cataloged item includes a mixture of representations, parts, documentation and other elements which are not strictly 'distributions' of a dataset - see Issue #253.
The more general use of dct:relation is driven by the requirement documented in Issue #81.
dcat:mediaType has been tightened from dct:MediaTypeOrExtent to dct:MediaType.
See Issue #127.
dct:conformsTo is recommended for use in the context of a dcat:Distribution to allow the model or schema used for the representation to be indicated as well as the serialization (which is indicated using dct:format and dcat:mediaType).
See Issue #55.
dcat:mediaType usage fixed.
See Issue #170.
dcat:Catalog was limited to datasets. This has been generalized, and properties common to all cataloged resources are now associated with a super-class dcat:Resource.
See Issue #172 and Issue #116.
dcat:Catalog was limited to datasets. The new class dcat:DataService has been added to support cataloging of various kinds of data services.
See Issue #172, Issue #56, Issue #432, Issue #821.
dcat:Dataset was a sub-class of dctype:Dataset, which is a term of the DCMI Types vocabulary [DCTERMS]. This relationship has been removed in the revised DCAT vocabulary.
See Issue #98.
dcat:Distribution allowed a number of alternative interpretations. The definition has been rephrased to clarify that distributions are primarily representations of datasets.
See Issue #52 and related use cases.
The domain of dcat:theme was dcat:Dataset, which limited use of this property in other contexts. The domain has been relaxed in this revision.
See Issue #123.
The domain of dcat:keyword was dcat:Dataset, which limited use of this property in other contexts. The domain has been relaxed in this revision.
See Issue #121.
The domain of dcat:contactPoint was dcat:Dataset, which limited use of this property in other contexts. The domain has been relaxed in this revision.
See Issue #95.
The domain of dcat:landingPage was dcat:Dataset, which limited use of this property in other contexts. The domain has been relaxed in this revision.
See Issue #122.
vann:usageNote:
DCAT 2014 [VOCAB-DCAT-20140116] included documentation captured as text using vann:usageNote elements, which is a sub-property of rdfs:seeAlso - an owl:ObjectProperty that cannot have a Literal value. This revision of DCAT has fixed these issues and replaced the use of vann:usageNote with skos:scopeNote.
See Issue #233.
dct:conformsTo for dcat:CatalogRecord to cover this requirement.
See Issue #502.
dct:license, dct:accessRights, and dct:rights in the context of dcat catalogs and distributions.
See Issue #114 for the background discussion.
Class: Catalog:
This class has been made a sub-class of dcat:Dataset. Moreover, the following properties have been added:
dct:hasPart, to specify a cataloged resource, irrespective of its type;
dcat:service, to specify a cataloged data service;
dcat:catalog, to specify sub-catalogs.
See Issue #172.