Copyright © 2016 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
Datasets published on the Web are accessed and experienced by consumers in a variety of ways, but little information about these experiences is typically conveyed. Dataset publishers often lack feedback from consumers about how datasets are used. Consumers lack an effective way to discuss experiences with fellow collaborators and to explore reference material that cites the dataset. A dataset, as defined by DCAT, is a collection of data, published or curated by a single agent, and available for access or download in one or more formats. The Dataset Usage Vocabulary (DUV) is used to describe consumer experiences, citations, and feedback about the dataset from the human perspective.
By specifying a number of foundational concepts used to collect dataset consumer feedback, experiences, and cited references associated with a dataset, APIs can be written to support collaboration across the Web by structurally publishing consumer opinions and experiences, and to provide a means for dataset consumers and producers to advertise and search for published open dataset usage.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at
This document presents the most mature version of the Dataset Usage Vocabulary that could be produced in the lifespan of the Data on the Web Best Practices Working Group. At the time of publication, it has remained stable for several months, even after receiving feedback and suggestions from the community. Further clarifications and extensions of this model may be carried out by future working groups.
This document was published by the Data on the Web Best Practices Working Group as a Working Group Note. If you wish to make comments regarding this document, please send them to (subscribe, archives). All comments are welcome.
Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 September 2015 W3C Process Document.
The Data on the Web Best Practices Working Group identified Best Practices [DWBP] for citing published data, conveying feedback between consumers and publishers, and providing descriptive metadata that gives the consumer insight into how published datasets can be used. The Dataset Usage Vocabulary is viewed as an extension to the Data Catalog (DCAT) vocabulary version 1.0 [VOCAB-DCAT] that fills existing gaps in order to adequately cite, describe usage of, and convey feedback on published datasets and distributions.
Broadly speaking, the vocabulary is domain-independent and open. As an open vocabulary, the DUV encourages publishers to add descriptive metadata tailored to each publisher’s domain-specific needs. When publishers and consumers exchange, combine, and reuse published data, the DUV metadata allows usage information to be clearly identified and cross-referenced across datasets. Because the DUV relies heavily on vocabulary reuse, descriptive metadata can in many cases be leveraged from the original vocabulary.
The DUV is modular, consisting of four submodels (DCAT, citation, usage, feedback) that support different practitioner needs. As a result, the DUV can be used in part (DCAT+citation, DCAT+usage, DCAT+feedback) or in its entirety, depending on the practitioner’s requirements. The DCAT+usage submodel is loosely defined: it can either indicate usage from an informational perspective or, if descriptive metadata is required, indicate where to find this information. Additionally, the citation submodel, which describes basic electronic citation elements, bibliographic references to a dataset, and citation rationale, can be used independently.
The namespace for DUV is
. It should be noted that DUV makes extensive use of terms from other vocabularies, which rely on their own namespaces. The full set of namespaces and prefixes used in this document is shown in the table below.
Prefix | Namespace |
biro | |
cito | |
dcat | |
dct | |
disco | |
dqv | |
duv | |
fabio | |
foaf | |
frbr | |
oa | |
pav | |
rdf | |
rdfs | |
skos | |
vann | |
xsd | |
The DUV is intended for data producers and publishers interested in tracking, sharing, and persisting consumer dataset usage. It is also intended for collaborators who require an exchange medium to advertise and interactively convey dataset usage.
The scope of the DUV is defined by the data usage requirements for datasets in the Data on the Web Best Practices (DWBP) Use Case document [DWBP-UCR]. These requirements include citing datasets on the Web, tracking the usage of datasets, and sharing feedback on and rating datasets. These requirements were derived from fourteen real-world use case examples provided in the use case document.
Following DWBP, the DUV relies heavily on vocabulary reuse to support citation, feedback, and usage of datasets published on the Web. This section provides our rationale and approach for vocabulary selection and reuse.
The core DUV begins with reusing the Data Catalog Vocabulary [VOCAB-DCAT] dcat:Dataset and dcat:Distribution classes and many of their related properties. In fact, the DUV can be considered an extension of the dcat:Dataset and dcat:Distribution classes.
The Web Annotation Vocabulary [Annotation-Vocab] is used to describe duv:UserFeedback as a subclass inheriting the behavior of oa:Annotation. The intent of a duv:UserFeedback is provided by the oa:motivatedBy property and oa:Motivation instances (for example: oa:describing, oa:replying). A subset of the Motivation instances is important for describing feedback to data publishers, and blogs between dataset consumers. In addition to supporting duv:UserFeedback, the Web Annotation vocabulary provides a generic way of annotating any Web resource, so it is recommended that the Web Annotation vocabulary be used to annotate the dcat:Dataset for uses beyond the scope of the DUV.
The Semantic Publishing and Referencing [SPAR] Ontologies provide a suite of vocabularies used to relate entities to citations and bibliographic records, and to describe the publication process along with other related activities. The DUV relies directly on the FRBR-aligned Bibliographic Ontology [FaBIO] and Dublin Core [DC-TERMS] to describe citations and references between datasets and cited sources. In addition to ontologies, the research community provided basic criteria for citing data on the Web [MSUDataCite]. These resources helped scope the DUV citation model to the minimal requirements for electronic dataset publication. Finally, the data citation principles being adopted [FORCE11-Citation] are also being considered to ensure the DUV is consistent with guidelines developed by other data citation communities.
As discussed earlier, the DUV is intended to be thorough enough to offer a starting point for describing dataset usage, yet open to alternative classes and properties as required by developers. For example, duv:UserFeedback, a subclass of oa:Annotation, reuses the object property oa:hasBody. The [Annotation-Vocab] uses oa:TextualBody as a means to capture textual information. If feedback needs to be captured in other ways, it is highly recommended to explore resources such as the Open Knowledge Foundation's Linked Open Vocabularies [OKFN], for example the Review vocabulary [REV], which features a very lightweight way to specify ratings.
While the focus of this document is mainly on representing the DUV using semantic expressions, a very active effort, schema.org [SCHEMA], provides lightweight data structures that can be used as markup that commercial search engines recognize, aiding in discovery. Not only do schema.org and the DUV rely on many related concepts, but efforts are also underway to extend schema.org to support additional data structures.
This section depicts the vocabulary in its entirety as a conceptual model. Boxes are used to identify each class. Labeled open arrows identify object and literal properties. White arrowheads depict class inheritance, with the parent class identified by the arrowhead. Please note that while most of the properties are reused from other vocabularies, they are included in the diagram to (1) reduce the learning curve for DUV implementers, who would otherwise need to refer constantly to third-party vocabularies, and (2) provide a starting place to familiarize DUV implementers with the basics needed to get started.
The citation model was motivated by the UCR requirement R-Citable: it should be possible to cite data on the Web. The citation model is largely based on classes, properties, and recommended approaches taken from the SPAR Ontologies.
The citation model seeks to meet the needs of:
and dcat:Distribution (see Example 1) by using basic bibliographic reference criteria provided by the data publisher. To do this, DUV extensions are added to the DCAT model in two ways. First, because dcat:Dataset and dcat:Distribution are forms of electronic media that can potentially be cited, both use the same citation properties. Second, to fill any information gaps, new properties were created or properties from other vocabularies were added.
The table below shows how a portion of the DUV properties can be used to form a bibliographic reference.
Bibliographic Field | Definition | Property used in DUV |
Author(s) | An individual, a group of individuals, or an organization responsible for creating the dataset or distribution. | dct:creator |
Title | Name of the dataset or distribution or the name of the activities that produced the data. | dct:title |
Year | When data was produced. | dct:created |
Publisher | This may be the name of the archive where it is housed or the organization responsible for performing publication services. | dct:publisher |
Distributor (if used) | Organization that makes the dataset available for downloading and use | duv:hasDistributor |
Edition or version | Edition or version number associated with the dataset | pav:version |
Access information (a URL or other persistent identifier). | Web address of dataset or distribution, or persistent identifier such as digital object identifier (DOI). | dct:identifier |
Using the metadata values from the properties in Example 1, it is possible to form the following example reference:
dct:creator + . + dct:title + pav:version + . + dct:publisher + [. + duv:hasDistributor + ] . + dct:identifier
MyCity Bus Association. Bus stops of MyCity. 1.0 version. Transport Agency, 2015 [publisher,distributor]. doi:10.0902/1975.16
with bibliographic references that provide additional insights to data consumers (see Example 2). Data publishers can also annotate a dataset or distribution with bibliographic references provided by data consumers (see Example 3). For example, a researcher uses a dataset to perform some experiments and then publishes a paper with the results. The dataset can be annotated with the bibliographic reference of this paper.
Classes include: dcat:Dataset, biro:BibliographicReference. Properties include: frbr:part, biro:isReferencedBy.
Note that while the DUV reuses a limited set of [SPAR] classes and properties to support basic [VOCAB-DCAT] dataset and distribution citation requirements, the [SPAR] ontologies have a richer set of classes and properties for more advanced representations. For example, the [FaBIO] fabio:Expression class has many subclasses (e.g. fabio:Policy) that help characterize referenced material.
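The reference pattern given in the citation model (dct:creator, dct:title, pav:version, dct:publisher, optional duv:hasDistributor, dct:identifier) can be sketched as a small string-assembly routine. The following is only an illustration, not part of the vocabulary: the function name format_citation and the idea of passing the property values as a plain dictionary of display strings are assumptions made for this sketch. It follows the pattern literally, so its output differs slightly from the hand-written example reference, which also adds a year and words the version differently.

```python
def format_citation(meta):
    """Assemble a reference string from DUV citation properties.

    `meta` is a hypothetical dict mapping property names such as
    "dct:creator" to display strings. duv:hasDistributor is optional,
    mirroring the bracketed part of the pattern.
    """
    parts = [
        meta["dct:creator"],
        # Title and version share one segment in the pattern.
        f'{meta["dct:title"]} {meta["pav:version"]}',
        meta["dct:publisher"],
    ]
    distributor = meta.get("duv:hasDistributor")
    if distributor:
        parts.append(distributor)
    # Segments are separated by ". " and the identifier comes last.
    return ". ".join(parts) + ". " + meta["dct:identifier"]


stops = {
    "dct:creator": "MyCity Bus Association",
    "dct:title": "Bus stops of MyCity",
    "pav:version": "1.0",
    "dct:publisher": "Transport Agency",
    "dct:identifier": "doi:10.0902/1975.16",
}
print(format_citation(stops))
```

When the publisher and the distributor are the same agent, as in Example 1, an implementer would probably want to merge the two segments rather than repeat the name; the sketch leaves that policy decision out.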
The usage model was motivated by the UCR requirement R-TrackDataUsage: it should be possible to track the usage of data. Sharing and tracking dataset usage can enhance the utility of datasets by describing how a consumer can use them (see Example 6). Usage can be considered descriptive information provided by the data publisher to help the consumer community make use of datasets and distributions. Based on the use cases, data usage can provide guidance about how to use the dataset and about tools (see Example 6) that can be used with it. The use cases also stipulate the need to track usage metrics.
The following classes constitute the Usage Model: dcat:Dataset, duv:Usage, duv:UsageTool. Properties include: duv:hasUsageTool, dct:identifier, dct:title, pav:version, dct:issued, and dct:description.
The feedback model was motivated by the UCR requirement R-UsageFeedback: data consumers should have a way of sharing feedback and rating data (see Example 4). User feedback is important for addressing data quality concerns about published datasets. Different users may have different experiences with the same dataset, so it is important to capture the context in which the data was used and the profile of the user who used it. R-UsageFeedback should also provide a way for consumers to communicate suggested corrections or advice back to the dataset publisher.
The following classes constitute the Feedback Model: dcat:Dataset, oa:Annotation, oa:Motivation, dqv:UserQualityFeedback, oa:TextualBody. Properties include: duv:hasRating, oa:hasBody.
RDF Class: | duv:RatingFeedback |
Definition | Predefined criteria used to express a user opinion about a dataset or distribution using a discrete range of values. |
rdfs:isDefinedBy | |
Label | rating feedback |
rdfs:subClassOf | duv:UserFeedback |
RDF Class: | duv:Usage |
Definition | Actions that can be performed on a given dataset or distribution. |
rdfs:isDefinedBy | |
Label | usage |
RDF Class: | duv:UsageTool |
Definition | A tool that can use a dataset or distribution. |
rdfs:isDefinedBy | |
Label | usage tool |
RDF Class: | duv:UserFeedback |
Definition | User feedback on a dataset or distribution. |
Label | user feedback |
rdfs:subClassOf | oa:Annotation |
RDF Property: | dct:created |
Definition | Date of creation of the resource |
vann:usageNote | dcat:Dataset (subject) dct:created (predicate) rdfs:Literal (object)
dcat:Distribution (subject) dct:created (predicate) rdfs:Literal (object) |
Label | created |
RDF Property: | dct:creator |
Definition | An entity primarily responsible for making the resource. |
vann:usageNote | dcat:Dataset (subject) dct:creator (predicate) foaf:Agent (object)
dcat:Distribution (subject) dct:creator (predicate) foaf:Agent (object) |
Label | creator |
RDF Property: | dct:description |
Definition | A free text account of the resource |
vann:usageNote | duv:Usage (subject) dct:description (predicate) rdfs:Literal (object)
duv:UsageTool (subject) dct:description (predicate) rdfs:Literal (object) |
Label | description |
RDF Property: | disco:fundedBy |
Definition | The agent (person, organization) responsible for sponsoring the dataset/distribution research that made the creation of the dataset possible, such as codifying and digitizing the data. |
vann:usageNote | dcat:Dataset (subject) disco:fundedBy (predicate) foaf:Agent (object) dcat:Distribution (subject) disco:fundedBy (predicate) foaf:Agent (object) |
Label | funded by |
RDF Property: | oa:hasBody |
Definition | Body of the comment associated with user feedback. |
vann:usageNote | duv:UserFeedback (subject) oa:hasBody (predicate) (object)
oa:Annotation (subject) oa:hasBody (predicate) oa:TextualBody (object) |
Label | has body |
RDF Property: | duv:hasDistributor |
Definition | The distributor is the organization that makes the dataset or distribution available for downloading and use. |
vann:usageNote | dcat:Dataset (subject) duv:hasDistributor (predicate) foaf:Agent (object) dcat:Distribution (subject) duv:hasDistributor (predicate) foaf:Agent (object) |
Label | has distributor |
RDF Property: | duv:hasRating |
Definition | RatingFeedback has rating opinion |
vann:usageNote | duv:RatingFeedback (subject) duv:hasRating (predicate) skos:Concept (object) |
Label | has rating |
RDF Property: | oa:hasTarget |
Definition | Dataset or distribution associated with UserFeedback. |
vann:usageNote | duv:UserFeedback (subject) oa:hasTarget (predicate) dcat:Dataset (object) duv:UserFeedback (subject) oa:hasTarget (predicate) dcat:Distribution (object) |
Label | has target |
RDF Property: | duv:hasUsage |
Definition | Dataset or distribution usage guidance/instructions. |
vann:usageNote | dcat:Dataset (subject) duv:hasUsage (predicate) duv:Usage (object)
dcat:Distribution (subject) duv:hasUsage (predicate) duv:Usage (object) |
Label | has usage |
RDF Property: | duv:hasUsageTool |
Definition | A usage tool (application, service) referred to by usage guidance/instructions. |
vann:usageNote | duv:Usage (subject) duv:hasUsageTool (predicate) duv:UsageTool (object) |
Label | has usage tool |
RDF Property: | dct:identifier |
Definition | The identifier of the dataset or distribution. |
vann:usageNote | dcat:Dataset (subject) dct:identifier (predicate) rdfs:Literal (object) dcat:Distribution (subject) dct:identifier (predicate) rdfs:Literal (object) |
Label | identifier |
RDF Property: | biro:isReferencedBy |
Definition | The relation between a publication and the bibliographic record or bibliographic reference describing it. |
Label | is referenced by |
RDF Property: | dct:issued |
Definition | Date of formal issuance (e.g. publication) of the dataset or distribution |
vann:usageNote | dcat:Dataset (subject) dct:issued (predicate) rdfs:Literal (object)
dcat:Distribution (subject) dct:issued (predicate) rdfs:Literal (object) |
Label | issued |
RDF Property: | oa:motivatedBy |
Definition | Reason behind a citation annotation or user feedback. |
vann:usageNote | duv:UserFeedback (subject) oa:motivatedBy (predicate) oa:Motivation (object) oa:Annotation (subject) oa:motivatedBy (predicate) oa:Motivation (object) |
Label | motivated by |
RDF Property: | frbr:part |
Definition | A part of an endeavour |
Label | part |
RDF Property: | frbr:partOf |
Definition | An endeavour of an endeavour |
Label | part of |
RDF Property: | duv:performedBy |
Definition | Usage performed by agent. |
vann:usageNote | duv:Usage (subject) duv:performedBy (predicate) foaf:Agent (object) |
Label | performed by |
RDF Property: | dct:publisher |
Definition | An entity responsible for making the resource available. |
vann:usageNote | dcat:Dataset (subject) dct:publisher (predicate) foaf:Agent (object)
dcat:Distribution (subject) dct:publisher (predicate) foaf:Agent (object) |
Label | publisher |
RDF Property: | biro:references |
Definition | The relation between a bibliographic record or a bibliographic reference and the publication being referenced. |
Label | references |
RDF Property: | duv:refersTo |
Definition | Dataset/distribution associated with Usage. |
vann:usageNote | duv:Usage (subject) duv:refersTo (predicate) dcat:Dataset (object) |
Label | refers to |
RDF Property: | dct:title |
Definition | A name given to the dataset or distribution |
vann:usageNote | dcat:Dataset (subject) dct:title (predicate) rdfs:Literal (object)
dcat:Distribution (subject) dct:title (predicate) rdfs:Literal (object) |
Label | title |
RDF Property: | pav:version |
Definition | The version of the dataset or distribution |
vann:usageNote | dcat:Dataset (subject) pav:version (predicate) rdfs:Literal (object)
dcat:Distribution (subject) pav:version (predicate) rdfs:Literal (object) |
Label | version |
This section shows some examples to illustrate the application of the Dataset Usage Vocabulary. The following examples are based on the running example of the Data on the Web Best Practices document.
Example 1 - Citation: Basic reference criteria for a dataset.
:stops-2015-05-05 a dcat:Dataset ;
    dct:title "Bus stops of MyCity" ;
    dct:identifier ""^^xsd:anyURI ;
    dct:issued "2015-05-05"^^xsd:date ;
    pav:version "1.0" ;
    dct:publisher :transport-agency-mycity ;
    dct:creator :bus-association-mycity ;
    duv:hasDistributor :transport-agency-mycity ;
    disco:fundedBy :bus-association-mycity ;
    dct:created "2015-05-05"^^xsd:date .
Example 2 - Citation: A memorandum that gives more insights about the dataset stops-2015-05-05
:stops-2015-05-05 a dcat:Dataset ;
    dct:title "Bus stops of MyCity" ;
    dct:identifier ""^^xsd:anyURI ;
    dct:issued "2015-05-05"^^xsd:date ;
    pav:version "1.0" ;
    dct:publisher :transport-agency-mycity ;
    dct:creator :bus-association-mycity ;
    duv:hasDistributor :transport-agency-mycity ;
    disco:fundedBy :bus-association-mycity ;
    dct:created "2015-05-05"^^xsd:date ;
    dcat:distribution :stops-2015-05-05.csv ;
    biro:isReferencedBy :stops-memorandum .

:stops-memorandum a fabio:InstructionalWork ;
    frbr:realization :memo-a-expression .

:memo-a-expression a fabio:PolicyDocument ;
    frbr:part :reference-to-stops-dataset .

:reference-to-stops-dataset a biro:BibliographicReference ;
    dct:bibliographicCitation "MyCity Bus Association. Bus stops of MyCity. 1.0 version. Transport Agency, 2015 [publisher,distributor]. doi:10.0902/1975.16"^^xsd:string ;
    biro:references :stops-2015-05-05 .
Example 3 - Citation: A research paper that has a reference to the dataset stops-2015-05-05
:stops-2015-05-05 a dcat:Dataset ;
    dct:title "Bus stops of MyCity" ;
    dct:identifier ""^^xsd:anyURI ;
    dct:issued "2015-05-05"^^xsd:date ;
    pav:version "1.0" ;
    dct:publisher :transport-agency-mycity ;
    dct:creator :bus-association-mycity ;
    duv:hasDistributor :transport-agency-mycity ;
    disco:fundedBy :bus-association-mycity ;
    dct:created "2015-05-05"^^xsd:date ;
    dcat:distribution :stops-2015-05-05.csv ;
    biro:isReferencedBy :stops-memorandum ;
    biro:isReferencedBy :stops-paper .

:stops-paper a fabio:ResearchPaper ;
    frbr:realization :paper-a-expression .

:paper-a-expression a fabio:JournalArticle ;
    frbr:part :reference-to-stops-dataset .

:reference-to-stops-dataset a biro:BibliographicReference ;
    dct:bibliographicCitation "MyCity Bus Association. Bus stops of MyCity. 1.0 version. Transport Agency, 2015 [publisher,distributor]. doi:10.0902/1975.16"^^xsd:string ;
    biro:references :stops-2015-05-05 .
Example 4 - Feedback: Providing a mechanism for consumers to comment and ask the dataset publisher questions.
:stops-2015-05-05 a dcat:Dataset ;
    dct:title "Bus stops of MyCity" ;
    dct:identifier ""^^xsd:anyURI ;
    dct:issued "2015-05-05"^^xsd:date ;
    pav:version "1.0" ;
    dct:publisher :transport-agency-mycity ;
    dct:creator :bus-association-mycity ;
    duv:hasDistributor :transport-agency-mycity ;
    disco:fundedBy :bus-association-mycity ;
    dct:created "2015-05-05"^^xsd:date ;
    dcat:distribution :stops-2015-05-05.csv ;
    duv:hasFeedback :comment1 .

:stops-2015-05-05.csv a dcat:Distribution ;
    dcat:downloadURL <> ;
    dct:title "CSV distribution of stops-2015-05-05 dataset" ;
    dct:description "CSV distribution of the bus stops dataset of MyCity" ;
    dcat:mediaType "text/csv;charset=UTF-8" ;
    dct:license <> ;
    duv:hasFeedback :comment2 .

:comment1 a duv:UserFeedback ;
    oa:hasBody "This list is missing stop 3"^^xsd:string ;
    oa:hasTarget :stops-2015-05-05 ;
    oa:motivatedBy oa:editing ;
    dct:creator :localresident .

:comment2 a duv:UserFeedback ;
    oa:hasTarget :stops-2015-05-05.csv ;
    oa:hasBody "Are tab delimited formats also available?"^^xsd:string ;
    oa:motivatedBy oa:questioning ;
    dct:creator :localresident .

:localresident a foaf:Person ;
    foaf:name "Alan Law"^^xsd:string .
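An API that collects consumer comments would need to turn each submission into a feedback description like the ones in Example 4. The following is a minimal sketch of that step, assuming the prefixes (duv, oa, dct and the default prefix) are declared elsewhere in the output document. The helper names turtle_string and feedback_to_turtle are hypothetical and not part of the DUV or of any library.

```python
def turtle_string(text):
    """Escape a Python string as a Turtle double-quoted literal."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def feedback_to_turtle(name, target, body, motivation, creator):
    """Render one duv:UserFeedback resource as Turtle text.

    `name`, `target` and `creator` are local names in the default
    namespace; `motivation` is the local name of an oa:Motivation
    instance such as "questioning" or "editing".
    """
    lines = [
        f":{name} a duv:UserFeedback ;",
        f"    oa:hasTarget :{target} ;",
        f"    oa:hasBody {turtle_string(body)} ;",
        f"    oa:motivatedBy oa:{motivation} ;",
        f"    dct:creator :{creator} .",
    ]
    return "\n".join(lines)


comment = feedback_to_turtle(
    "comment2",
    "stops-2015-05-05.csv",
    "Are tab delimited formats also available?",
    "questioning",
    "localresident",
)
print(comment)
```

The sketch does not validate the local names it receives. A real service would also need to check that the motivation is one defined by the Web Annotation Vocabulary and that the target identifies a known dataset or distribution.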
Example 5 - Feedback: A rating feedback provided by a dataset consumer. To describe possible values for rating, the data publisher used SKOS.
:stops-2015-05-05 a dcat:Dataset ;
    dct:title "Bus stops of MyCity" ;
    dct:identifier ""^^xsd:anyURI ;
    dct:issued "2015-05-05"^^xsd:date ;
    pav:version "1.0" ;
    dct:publisher :transport-agency-mycity ;
    dct:creator :bus-association-mycity ;
    duv:hasDistributor :transport-agency-mycity ;
    disco:fundedBy :bus-association-mycity ;
    dct:created "2015-05-05"^^xsd:date ;
    dcat:distribution :stops-2015-05-05.csv ;
    duv:hasFeedback :rating1 .

:rating1 a duv:RatingFeedback ;
    oa:hasBody :good ;
    oa:hasTarget :stops-2015-05-05 ;
    oa:motivatedBy oa:assessing .

:good a skos:Concept ;
    skos:inScheme :rating ;
    skos:prefLabel "good"@en ;
    skos:prefLabel "boa"@pt .

:bad a skos:Concept ;
    skos:inScheme :rating ;
    skos:prefLabel "bad"@en ;
    skos:prefLabel "ruim"@pt .

:excellent a skos:Concept ;
    skos:inScheme :rating ;
    skos:prefLabel "excellent"@en ;
    skos:prefLabel "excelente"@pt .

:rating a skos:ConceptScheme ;
    skos:prefLabel "A set of values to rate datasets and distributions." .
Example 6 - Dataset Usage: A dataset consumer uses the distribution stops-2015-05-05.csv to produce a visualization of bus routes from MyCity. A route calculator service was used to generate the bus routes ( ). The dataset publisher collects this usage and updates the dataset metadata with a new property duv:hasUsage.
:stops-2015-05-05.csv a dcat:Distribution ;
    dcat:downloadURL <> ;
    dct:title "CSV distribution of stops-2015-05-05 dataset" ;
    dct:description "CSV distribution of the bus stops dataset of MyCity" ;
    dcat:mediaType "text/csv;charset=UTF-8" ;
    dct:license <> ;
    duv:hasUsage :route-vis .

:route-vis a duv:Usage ;
    dct:title "Visualization of MyCity bus routes"^^xsd:string ;
    dct:created "2016-05-05"^^xsd:date ;
    duv:hasUsageTool :route-calculator ;
    duv:refersTo :stops-2015-05-05.csv .

:route-calculator a duv:UsageTool ;
    dct:title "Route Calculator"^^xsd:string ;
    dct:created "2016-03-04"^^xsd:date ;
    dct:identifier ""^^xsd:anyURI .