Abstract

This document defines the procedures and rules to be applied when converting tabular data into RDF. Tabular data may be complemented with metadata annotations that describe its structure, the meaning of its content and how it may form part of a collection of interrelated tabular data. This document specifies the effect of this metadata on the resulting RDF.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

The CSV on the Web Working Group was chartered to produce Recommendations for "Access methods for CSV Metadata", "Metadata vocabulary for CSV data" and "Mapping mechanism to transforming CSV into various Formats (e.g., RDF, JSON, or XML)". This document aims to satisfy the RDF variant of the mapping Recommendation.

Due to the limited resources available within the CSV on the Web Working Group, this document describes only a simple mapping, that is, one where each row of tabular data describes a single resource and a single RDF triple is created per cell. The Working Group solicits input on the value of mapping a single row of tabular data into multiple inter-related resources. This document was published by the CSV on the Web Working Group as a First Public Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-csv-wg@w3.org (subscribe, archives). All comments are welcome.

Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 August 2014 W3C Process Document.

1. Introduction

This document describes the processing of tabular data to create an RDF graph comprising subject-predicate-object triples [ rdf11-concepts ]. Since RDF is an abstract syntax, these triples MAY be serialized in a concrete RDF syntax such as N-Triples [ n-triples ], Turtle [ turtle ], RDFa [ rdfa-primer ], JSON-LD [ json-ld ], or TriG [ trig ]. The RDF serializations offered by a conversion application are implementation defined.

The Tabular Data Model [ tabular-data-model ] defines an annotated tabular data model consisting of tables, columns, rows, and cells, enriched with metadata annotations that describe the structure of the tabular data and the meaning of its content. These metadata annotations are described in [ tabular-metadata ] and may be embedded within the CSV encoding itself as a header line or provided within a separate metadata document. The resulting annotated table conforms to the annotated tabular data model. A group of tables is a collection of tables published as a single atomic unit.

The conversion procedure described in this specification operates on the annotated tabular data model. This specification does not specify the processes needed to convert CSV-encoded data into tabular data form. Please refer to [ tabular-data-model ] for details of parsing tabular data. Further details on parsing cells within tabular data are provided in [ tabular-metadata ].

Conversion applications MUST provide at least two modes of operation: standard and minimal.

Standard mode conversion frames the information gleaned from the cells of the tabular data with details of the rows, tables, and a group of tables within which that information is provided.

Minimal mode conversion includes only the information gleaned from the cells of the tabular data.

Standard and minimal conversion are described normatively below.

Conversion applications MAY offer additional implementation specific conversion modes.

Transformation definitions, as defined in [ tabular-metadata ], MAY be used to specify how tabular data can be transformed into another format using a script or template. Such transformation definitions MAY use the RDF output described in this specification as input.

Finally, note that the conversion procedure is considered to be entirely textual. There is no requirement on conversion applications to check the semantic consistency of the data during the conversion, nor to validate the triples against RDF schema. Downstream applications SHOULD be aware of the potential for inconsistencies and take appropriate action.

Issue 62 Should the RDF/JSON transformation check the values?

2. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY , MUST , SHALL , and SHOULD are to be interpreted as described in [ RFC2119 ].

Tabular data MUST conform to the description from [ tabular-data-model ]. In particular note that each row MUST contain the same number of cells (although some of these cells may be empty).

Note

Not all CSV-encoded data can be parsed into a tabular data model. An algorithm for parsing CSV-based files is described in [ tabular-data-model ].

This specification makes use of the compact IRI Syntax; please refer to the Compact IRIs section of [ json-ld ].

This specification makes use of the following namespaces:

csvw :
http://www.w3.org/ns/csvw#
rdf :
http://www.w3.org/1999/02/22-rdf-syntax-ns#
xsd :
http://www.w3.org/2001/XMLSchema#
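
As an informal illustration of how the compact IRIs used in this specification relate to the namespaces listed above, the following minimal Python sketch expands a `prefix:name` string. The function name `expand_compact_iri` is hypothetical, and the sketch covers only the three prefixes shown; it is not the expansion algorithm defined in [ json-ld ].

```python
# Prefix table copied from the namespace list above.
NAMESPACES = {
    "csvw": "http://www.w3.org/ns/csvw#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand_compact_iri(value: str) -> str:
    """Expand 'prefix:name' to a full IRI; return any other string unchanged.

    Hypothetical helper for illustration only.
    """
    prefix, sep, name = value.partition(":")
    # A name starting with "//" indicates an absolute IRI such as http://...
    if sep and prefix in NAMESPACES and not name.startswith("//"):
        return NAMESPACES[prefix] + name
    return value
```

For example, `expand_compact_iri("csvw:Table")` yields `http://www.w3.org/ns/csvw#Table`, while an absolute IRI is returned as given.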

3. Typographical conventions

The following typographic conventions are used in this specification:

markup
Markup (elements, attributes, properties), machine processable values (string, characters, media types), property name, or a file name is in red-orange monospace font.
variable
A variable in pseudo-code or in an algorithm description is in italics.
definition
A definition of a term, to be used elsewhere in this or other specifications, is in bold and italics.
definition reference
A reference to a definition in this document is underlined and is also an active link to the definition itself.
markup definition reference
A reference to a definition in this document, when the reference itself is also a markup, is underlined, red-orange monospace font, and is also an active link to the definition itself.
external definition reference
A reference to a definition in another document is underlined, in italics, and is also an active link to the definition itself.
markup external definition reference
A reference to a definition in another document, when the reference itself is also a markup, is underlined, in italics red-orange monospace font, and is also an active link to the definition itself.
hyperlink
A hyperlink is underlined and in blue.
[reference]
A document reference (normative or informative) is enclosed in square brackets and links to the references section.
Note

Notes are in light green boxes with a green left border and with a "Note" header in green. Notes are normative or informative depending on whether they are in a normative or informative section, respectively.

Example 1
Examples are in light khaki boxes, with khaki left border, and with a numbered "Example" header in khaki. Examples are always informative. The content of the example is in monospace font and may be syntax colored.

4. Converting Tabular Data to RDF

The procedures for converting tabular data into RDF are described below for both standard and minimal modes.

4.1 Algorithm terms

about URL
The about URL annotation on the current cell. As defined in [ tabular-data-model ].
annotated table
The annotated table is defined in [ tabular-data-model ] as describing a particular table and its annotations.
blank node
A blank node is defined in [ rdf11-concepts ] as an RDF Term disjoint from IRIs or literals.
cell
A cell is defined in [ tabular-data-model ] as the intersection of a row and a column within a table.
cell errors
Cell errors are defined in [ tabular-data-model ] as a (possibly empty) list of validation errors generated while parsing the literal content of a cell to generate the semantic value.
cell value
A cell value is defined in [ tabular-data-model ] as the semantic value of the cell; this MAY be null or a sequence of values.
column
A column is defined in [ tabular-data-model ] as a vertical arrangement of cells within a table.
group of tables
A group of tables is defined in [ tabular-data-model ] as comprising a set of annotated tables and a set of annotations that relate to that group.
group of tables identifier
The group of tables identifier is the id annotation on a group of tables. As defined in [ tabular-data-model ].
literal node
A literal node is defined in [ rdf11-concepts ] as a node within an RDF graph that provides values such as strings, numbers, and dates.
node
A node is defined in [ rdf11-concepts ] as a subject or an object of an RDF triple. When in subject position, it can be either a blank node or identified with a URL; when in object position, it can be a blank node, a literal, or identified with a URL.
non-core annotations
Core annotations are listed in [ tabular-data-model ]; groups of tables and tables may also have other annotations that are not defined in that specification; these are known as non-core annotations.
notes
A list of notes, as defined in [ tabular-data-model ], attached to an annotated table or group of tables using the notes property. This may be an empty list.
predicate
A predicate is defined in [ rdf11-concepts ] as an IRI that denotes the property used to relate nodes within an RDF triple.
prefixed name
A prefixed name is an abbreviation for a URI, in the syntax prefix:name. See Names of Common Properties in [ tabular-metadata ] for information on expansion.
property URL
The property URL annotation on the current cell. As defined in [ tabular-data-model ].
row
The row is defined in [ tabular-data-model ] as a horizontal arrangement of cells within a table.
row number
A row number is defined in [ tabular-data-model ] as the abstract position of the row within the table, starting from 1.
row source number
A row source number is defined in [ tabular-data-model ] as the position of the row within the source tabular data file. Provision of the row source number is dependent on parsing applications and may be reported as null.
subject
Within this algorithm, a subject is the resource that the value of a given cell refers to. This may be specified using about URL.
table identifier
The table identifier is the id annotation on an annotated table. As defined in [ tabular-data-model ].
value URL
The value URL annotation on the current cell. As defined in [ tabular-data-model ].

4.2 Generating RDF

A conformant RDF conversion application MUST emit triples conforming to those described in this algorithm according to the chosen mode of conversion: standard or minimal.

Unless specified otherwise, the steps in the algorithm defined herein apply to both standard and minimal modes.

Note

Where an annotated table is in isolation (e.g. in the absence of a group of tables), a default group of tables is provided with a single tables annotation that refers to the given table.

  1. In standard mode only, establish a new node G. If the group of tables has an identifier then node G MUST be identified accordingly; else if identifier is null, then node G MUST be a new blank node.
  2. In standard mode only, specify the type of node G as csvw:TableGroup; emit the following triple:

    subject
    node G
    predicate
    rdf:type
    object
    csvw:TableGroup
  3. In standard mode only, emit the triples generated by running the algorithm specified in section 6. JSON-LD to RDF over any notes and non-core annotations specified for the group of tables, with node G as an initial subject, the notes or non-core annotation as property, and the value of the notes or non-core annotation as value.

  4. For each table where the suppress output annotation is false:

    1. In standard mode only, establish a new node T which represents the current table.

      If the table has an identifier then node T MUST be identified accordingly; else if identifier is null, then node T MUST be a new blank node.

    2. In standard mode only, relate the table to the group of tables; emit the following triple:

      subject
      node G
      predicate
      csvw:table
      object
      node T
    3. In standard mode only, specify the type of node T as csvw:Table; emit the following triple:

      subject
      node T
      predicate
      rdf:type
      object
      csvw:Table
    4. In standard mode only, specify the source tabular data file URL for the current table based on the url annotation; emit the following triple:

      subject
      node T
      predicate
      csvw:url
      object
      a node identified by URL
    5. In standard mode only, emit the triples generated by running the algorithm specified in section 6. JSON-LD to RDF over any notes and non-core annotations specified for the table, with node T as an initial subject, the notes or non-core annotation as property, and the value of the notes or non-core annotation as value.

      Note

      All other core annotations for the table are ignored during the conversion; including information about table schemas and columns specified therein, foreign keys, direction, transformations etc.

    6. For each row in the current table:

      1. In standard mode only, establish a new blank node R which represents the current row.

      2. In standard mode only, relate the row to the table; emit the following triple:

        subject
        node T
        predicate
        csvw:row
        object
        node R
      3. In standard mode only, specify the type of node R as csvw:Row; emit the following triple:

        subject
        node R
        predicate
        rdf:type
        object
        csvw:Row
      4. In standard mode only, specify the row number n for the row; emit the following triple:

        subject
        node R
        predicate
        csvw:rownum
        object
        a literal n; specified with datatype IRI xsd:integer
      5. In standard mode only, specify the row source number n_source for the row within the source tabular data file URL using a fragment-identifier as specified in [ RFC7111 ]; if row source number is not null, emit the following triple:

        subject
        node R
        predicate
        csvw:url
        object
        a node identified by URL#row=n_source
      6. In standard mode only, emit the triples generated by running the algorithm specified in section 6. JSON-LD to RDF over any non-core annotations specified for the row, with node R as an initial subject, the non-core annotation as property, and the value of the non-core annotation as value.

      7. Establish a new blank node S_def to be used as the default subject for cells where about URL is undefined.

        Note

        A row MAY describe multiple interrelated subjects; where the value URL annotation on one cell matches the about URL annotation on another cell in the same row.

        For each cell in the current row where the suppress output annotation for the column associated with that cell is false:

        1. Establish a node S from about URL if set, or from S_def otherwise as the current subject.

        2. In standard mode only, relate the current subject to the current row; emit the following triple:

          subject
          node R
          predicate
          csvw:describes
          object
          node S
        3. If the value of property URL for the cell is not null, then predicate P takes the value of property URL.

          Else, predicate P is constructed by appending the value of the name annotation for the column associated with the cell to the tabular data file URL as a fragment identifier.

        4. If the value URL for the current cell is not null, then value URL identifies a node V_url that is related to the current subject using the predicate P; emit the following triple:
          subject
          node S
          predicate
          P
          object
          node V_url
        5. Else, if the cell value is a list and the cell ordered annotation is true, then the cell value provides an ordered sequence of literal nodes for inclusion within the RDF output using an instance of rdf:List V_list as defined in [ rdf-schema ]. This instance is related to the subject using the predicate P; emit the triples defining list V_list plus the following triple:
          subject
          node S
          predicate
          P
          object
          node V_list
        6. Else, if the cell value is a list, then the cell value provides an unordered sequence of literal nodes for inclusion within the RDF output, each of which is related to the subject using the predicate P. For each value provided in the sequence, add a literal node V_literal; emit the following triple:
          subject
          node S
          predicate
          P
          object
          literal node V_literal
        7. Else, if the cell value is not null, then the cell value provides a single literal node V_literal for inclusion within the RDF output that is related to the current subject using the predicate P; emit the following triple:
          subject
          node S
          predicate
          P
          object
          literal node V_literal

          The literal nodes derived from the cell values MUST be expressed according to the cell value's datatype as defined below: Interpreting datatypes.

          Note

          In the case when a cell value does not have a datatype, the conversion should default to string.

          Note

          In the case where a sequence of values is provided, each value in the list has its own datatype; the datatype may be different for different items in the sequence.
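
The algorithm above can be sketched informally in Python as follows. This is a minimal illustration under stated assumptions, not a conformant implementation: the dictionary keys (`id`, `url`, `tables`, `rows`, `cells`, `number`, `source_number`, `column_name`, `about_url`, `property_url`, `value_url`, `value`, `suppress_output`) are hypothetical stand-ins for the corresponding annotations, literals are represented as `(lexical form, datatype IRI)` tuples typed as xsd:string, and the sketch omits notes, non-core annotations, and the rdf:List construction for ordered lists (every list is treated as unordered).

```python
from itertools import count

CSVW = "http://www.w3.org/ns/csvw#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"


def convert(group, mode="standard"):
    """Return (subject, predicate, object) tuples for a group of tables."""
    standard = mode == "standard"
    counter = count(1)
    triples = []

    def bnode():
        # Fresh blank node label
        return f"_:b{next(counter)}"

    def emit(s, p, o):
        # An RDF graph is a set, so skip triples already emitted
        if (s, p, o) not in triples:
            triples.append((s, p, o))

    def literal(value, datatype="string"):
        # Simplification: a literal is a (lexical form, datatype IRI) tuple
        return (str(value), XSD + datatype)

    if standard:
        g = group.get("id") or bnode()                     # steps 1-2
        emit(g, RDF_TYPE, CSVW + "TableGroup")
    for table in group["tables"]:                          # step 4
        if table.get("suppress_output"):
            continue
        url = table["url"]
        if standard:
            t = table.get("id") or bnode()                 # steps 4.1-4.4
            emit(g, CSVW + "table", t)
            emit(t, RDF_TYPE, CSVW + "Table")
            emit(t, CSVW + "url", url)
        for row in table["rows"]:                          # step 4.6
            if standard:
                r = bnode()
                emit(t, CSVW + "row", r)
                emit(r, RDF_TYPE, CSVW + "Row")
                emit(r, CSVW + "rownum", literal(row["number"], "integer"))
                if row.get("source_number") is not None:
                    emit(r, CSVW + "url", f"{url}#row={row['source_number']}")
            s_def = bnode()                                # default subject
            for cell in row["cells"]:
                if cell.get("suppress_output"):
                    continue
                s = cell.get("about_url") or s_def
                if standard:
                    emit(r, CSVW + "describes", s)
                p = cell.get("property_url") or f"{url}#{cell['column_name']}"
                value = cell.get("value")
                if cell.get("value_url") is not None:
                    emit(s, p, cell["value_url"])
                elif isinstance(value, list):
                    for item in value:                     # treated as unordered
                        emit(s, p, literal(item))
                elif value is not None:
                    emit(s, p, literal(value))
    return triples
```

Calling `convert(group, "minimal")` on the same input yields only the cell-derived triples, which makes the difference between the two modes easy to inspect.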

4.3 Interpreting datatypes

Cell values are expressed in the RDF output according to the cell value's datatype. The relationship between the value of the cell value's datatype and the datatype IRI used in the RDF is provided in the table below.

Note

A datatype's format annotation is irrelevant to the conversion procedure defined in this specification; the cell value has already been parsed from the contents of the cell according to the format annotation.

Where the contents of the cell cannot be parsed, or other validation errors occur, cell errors will be provided. It is an implementation decision to determine how conversion applications should proceed in the event that cell errors are encountered.

datatype              RDF datatype IRI                Remarks
anyAtomicType         xsd:anyAtomicType
anyURI                xsd:anyURI
base64Binary          xsd:base64Binary
boolean               xsd:boolean
date                  xsd:date
dateTime              xsd:dateTime
dateTimeStamp         xsd:dateTimeStamp
decimal               xsd:decimal
integer               xsd:integer
long                  xsd:long
int                   xsd:int
short                 xsd:short
byte                  xsd:byte
nonNegativeInteger    xsd:nonNegativeInteger
positiveInteger       xsd:positiveInteger
unsignedLong          xsd:unsignedLong
unsignedInt           xsd:unsignedInt
unsignedShort         xsd:unsignedShort
unsignedByte          xsd:unsignedByte
nonPositiveInteger    xsd:nonPositiveInteger
negativeInteger       xsd:negativeInteger
double                xsd:double
duration              xsd:duration
dayTimeDuration       xsd:dayTimeDuration
yearMonthDuration     xsd:yearMonthDuration
float                 xsd:float
gDay                  xsd:gDay
gMonth                xsd:gMonth
gMonthDay             xsd:gMonthDay
gYear                 xsd:gYear
gYearMonth            xsd:gYearMonth
hexBinary             xsd:hexBinary
QName                 xsd:QName
string                xsd:string or rdf:langString    depending on whether or not the value has an associated language.
normalizedString      xsd:normalizedString
token                 xsd:token
language              xsd:language
Name                  xsd:Name
NMTOKEN               xsd:NMTOKEN
xml                   rdf:XMLLiteral
html                  rdf:HTML
json                  csvw:JSON                       csvw:JSON is a sub-class of xsd:string
time                  xsd:time

In the case of rdf:langString, the appropriate language tag (as defined in [ rdf11-concepts ]) MUST be provided for each string, based on the value of the cell value's language. (See section on Graph Literals in [ rdf11-concepts ] for further details on language tagged literals.)

Note

According to [ rdf11-concepts ] language tags cannot be combined with any other xsd datatypes. If a cell has any other datatype than string, the value of lang MUST be ignored. Also, all literals have a datatype; however, specific serializations, like Turtle [ turtle ], MAY provide a special syntax for literals with datatype xsd:string or rdf:langString.
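
The datatype rules above can be illustrated with a short Python sketch that renders a cell value as an N-Triples literal. The function names `datatype_iri` and `literal` are hypothetical, and the fallback of prefixing any unlisted datatype name with the xsd namespace is an assumption made for brevity rather than something this specification states.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CSVW = "http://www.w3.org/ns/csvw#"

# Rows of the table above that do not follow the "xsd:<name>" pattern.
SPECIAL = {
    "xml": RDF + "XMLLiteral",
    "html": RDF + "HTML",
    "json": CSVW + "JSON",
}


def datatype_iri(datatype=None, lang=None):
    """Return the datatype IRI for a cell value (hypothetical helper)."""
    datatype = datatype or "string"        # a missing datatype defaults to string
    if datatype == "string" and lang:
        return RDF + "langString"          # only strings may carry a language
    # Assumption: names not in SPECIAL map directly onto the xsd namespace.
    return SPECIAL.get(datatype, XSD + datatype)


def literal(lexical, datatype=None, lang=None):
    """Serialize a cell value as an N-Triples literal."""
    escaped = (
        lexical.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    iri = datatype_iri(datatype, lang)
    if iri == RDF + "langString":
        return f'"{escaped}"@{lang}'
    # For any datatype other than string, lang is ignored.
    return f'"{escaped}"^^<{iri}>'
```

Note that `literal("50", "integer", lang="fr")` drops the language tag, reflecting the rule that lang is ignored for datatypes other than string.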

5. Inclusion of provenance information

This section is non-normative.

In addition to the namespaces defined above, the following namespace is used in this section:

prov :
http://www.w3.org/ns/prov#

Conversion applications MAY include provenance information in the RDF output describing how and when the output was created; e.g., using terms from the PROV Ontology [ prov-o ]. Information that may be of interest to downstream applications includes:

In order to facilitate the provision of such information, this specification introduces two instances of prov:Role:

csvw:csvEncodedTabularData
Defines the role of the source tabular data file.
csvw:tabularMetadata
Defines the role of the metadata description file.
An illustrative example of provenance information is provided below in Turtle [ turtle ] syntax; the conversion application used is identified as http://example.org/my-csv2rdf-application:
Example 2: http://example.org/prov-example.ttl
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .


<> prov:wasGeneratedBy [

    a prov:Activity ;
    prov:wasAssociatedWith  <http://example.org/my-csv2rdf-application> ;
    prov:startedAtTime "2015-02-13T15:12:44"^^xsd:dateTime ;
    prov:endedAtTime   "2015-02-13T15:12:46"^^xsd:dateTime ;
    prov:qualifiedUsage [ a prov:Usage ;
        prov:entity <http://example.org/csv/data.csv> ;
        prov:hadRole csvw:csvEncodedTabularData
    ];

    prov:qualifiedUsage [ a prov:Usage ;
        prov:entity 
                <http://example.org/csv/data.csv-metadata.json> ,                 
                <http://example.org/csv/metadata.json> ;

        prov:hadRole csvw:tabularMetadata
    ];

] .
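
A conversion application could assemble a provenance statement of this shape programmatically. The following Python sketch is one possible approach, assuming a hypothetical function `provenance_turtle`; it returns only the statement body, so the `@prefix` declarations shown in the example above still need to precede it in the output document.

```python
from datetime import datetime


def provenance_turtle(app, started, ended, csv_url, metadata_urls=()):
    """Build a Turtle provenance statement (hypothetical helper).

    app, csv_url and metadata_urls are absolute URLs; started and ended
    are datetime objects for the conversion activity.
    """
    lines = [
        "<> prov:wasGeneratedBy [",
        "    a prov:Activity ;",
        f"    prov:wasAssociatedWith <{app}> ;",
        f'    prov:startedAtTime "{started.isoformat()}"^^xsd:dateTime ;',
        f'    prov:endedAtTime "{ended.isoformat()}"^^xsd:dateTime ;',
        f"    prov:qualifiedUsage [ a prov:Usage ; prov:entity <{csv_url}> ;"
        " prov:hadRole csvw:csvEncodedTabularData ] ;",
    ]
    if metadata_urls:
        # All metadata documents share one usage with the tabularMetadata role
        entities = " , ".join(f"<{url}>" for url in metadata_urls)
        lines.append(
            f"    prov:qualifiedUsage [ a prov:Usage ; prov:entity {entities} ;"
            " prov:hadRole csvw:tabularMetadata ] ;"
        )
    lines.append("] .")
    return "\n".join(lines)
```

The trailing semicolons before the closing bracket are permitted by the Turtle grammar, which keeps the line-by-line assembly simple.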


6. JSON-LD to RDF

This section defines a mechanism for transforming the [ json-ld ] dialect used for non-core annotations and notes originating from the processing of metadata (as defined in [ tabular-metadata ]) into RDF in a manner consistent with the Deserialize JSON-LD to RDF Algorithm (as defined in [ json-ld-api ]). Converters MAY use any algorithm which results in equivalent triples.

The distribution SHALL be related
Note

Conversion applications may have other means to create annotated tables, e.g., through some application-specific APIs. In such cases the exact format for non-core annotations or notes may be different. Specifications for such annotation processes should specify how these annotations should be converted into RDF.

Given a subject, property and value in normalized form:

  1. Property is a term defined in [ csvw-context ], a prefixed name, or an absolute URL; expand it to an absolute URL by replacing a term with the URI from the term definition in [ csvw-context ], or a prefixed name as described in Names of Common Properties in [ tabular-metadata ].
  2. If value is an array, generate RDF by running this algorithm using subject and property, with each array member in turn as value.
  3. If value is an object containing @value, create an RDF Literal lit using the string value of @value and language from @language, or datatype from @type if present, expanding @type as necessary using the procedure outlined for property, and emit the following triple:
    subject
    node subject
    predicate
    property
    object
    literal node lit
    Note

    If neither @language nor @type is present, the literal lit has the datatype xsd:string.

  4. Else, if value is an object:
    1. Establish a new node S from the value of @id, if it exists, and a new blank node otherwise, and emit the following triple:
      subject
      node subject
      predicate
      property
      object
      node S
    2. For every value of @type, either a term defined in [ csvw-context ], a prefixed name, or an absolute URL; establish a new node T_i by expanding the value to an absolute URL by replacing a term with the URI from the term definition in [ csvw-context ] or a prefixed name with its expanded value. For each T_i, emit the following triple:
      subject
      node S
      predicate
      rdf:type
      object
      node T_i
    3. For every key and val from value where key does not start with @ ( U+0040 ), generate RDF by running this algorithm using S for subject, key for property and val for value.
  5. Else, establish lit as an RDF Literal as follows:
    1. If value is true or false, create an RDF Literal lit using the strings "true" or "false" accordingly, with datatype xsd:boolean.
    2. Else, if value is a JSON number with a non-zero fractional part, create an RDF Literal lit using the canonical representation for value with datatype xsd:double.
    3. Else, if value is a JSON number with no non-zero fractional part, create an RDF Literal lit using the canonical representation for value with datatype xsd:integer.
    4. Otherwise, create an RDF Literal lit using the canonical representation for value with datatype xsd:string .

    Emit the following triple:

    subject
    node subject
    predicate
    property
    object
    literal node lit
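
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a conforming implementation: triples are modelled as plain tuples, `PREFIXES` is an abbreviated and hypothetical stand-in for the term and prefix definitions in [ csvw-context ], blank node labels are invented, the default language from the metadata context is not applied, and `repr()` is used for numbers with a fractional part rather than the canonical xsd:double form.

```python
from itertools import count

# Hypothetical, abbreviated stand-in for the prefixes defined in [csvw-context].
PREFIXES = {
    "dc": "http://purl.org/dc/terms/",
    "schema": "http://schema.org/",
    "oa": "http://www.w3.org/ns/oa#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}
XSD = PREFIXES["xsd"]
RDF_TYPE = PREFIXES["rdf"] + "type"
_blank_ids = count(1)  # source of illustrative blank node labels


def expand(name):
    """Expand a known prefixed name to an absolute URL; anything else passes through."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return name


def literal(lexical, datatype=None, language=None):
    """Model an RDF literal as a tuple; untyped, untagged literals are xsd:string."""
    if datatype is None and language is None:
        datatype = XSD + "string"
    return ("literal", lexical, datatype, language)


def to_triples(subject, prop, value):
    """Return the list of triples for one subject/property/value combination."""
    prop = expand(prop)                                        # step 1
    if isinstance(value, list):                                # step 2
        return [t for member in value for t in to_triples(subject, prop, member)]
    if isinstance(value, dict) and "@value" in value:          # step 3
        datatype = expand(value["@type"]) if "@type" in value else None
        lit = literal(str(value["@value"]), datatype, value.get("@language"))
        return [(subject, prop, lit)]
    if isinstance(value, dict):                                # step 4
        node = value.get("@id") or "_:b%d" % next(_blank_ids)
        triples = [(subject, prop, node)]
        types = value.get("@type", [])
        for type_name in [types] if isinstance(types, str) else types:
            triples.append((node, RDF_TYPE, expand(type_name)))
        for key, val in value.items():
            if not key.startswith("@"):
                triples.extend(to_triples(node, key, val))
        return triples
    if isinstance(value, bool):                                # step 5.1
        lit = literal("true" if value else "false", XSD + "boolean")
    elif isinstance(value, float) and not value.is_integer():  # step 5.2
        lit = literal(repr(value), XSD + "double")
    elif isinstance(value, (int, float)):                      # step 5.3
        lit = literal(str(int(value)), XSD + "integer")
    else:                                                      # step 5.4
        lit = literal(str(value), XSD + "string")
    return [(subject, prop, lit)]
```

For instance, `to_triples("http://example.org/tree-ops-ext", "dc:modified", {"@value": "2010-12-31", "@type": "xsd:date"})` yields a single triple whose object is a literal typed as xsd:date. The boolean test precedes the number tests because Python treats `bool` as a subtype of `int`.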

7. Examples

This section is non-normative.

In addition to the namespaces defined above, the examples provided here make use of the following namespaces:

dc :
http://purl.org/dc/terms/
foaf :
http://xmlns.com/foaf/0.1/
oa :
http://www.w3.org/ns/oa#
org :
http://www.w3.org/ns/org#
schema :
http://schema.org/

Furthermore, these examples also make use of the Turtle syntax @base declaration (as defined in [ turtle ]). Where a single tabular data file is used in the example, the @base declaration is set to the URL of that tabular data file.

Note

Each example expresses a more complex conversion than the one before it; readers of this specification are advised to work through the examples in sequential order.

7.1 Simple example

This example comprises a single annotated table containing information about countries: country code, position (latitude, longitude) and name. Whilst the input tabular data file, published at http://example.org/countries.csv , includes a header line, no further metadata annotations are given. The tabular data file is provided below:

Example 3: http://example.org/countries.csv
countryCode,latitude,longitude,name
AD,42.546245,1.601554,Andorra
AE,23.424076,53.847818,"United Arab Emirates"
AF,33.93911,67.709953,Afghanistan

The annotated table generated from parsing the tabular data file is shown below and provides the basis for the conversion to RDF.

Annotations for the resulting table T, with 4 columns and 3 rows, are shown below:

id core annotations
url columns rows
T http://example.org/countries.csv C1 , C2 , C3 , C4 R1 , R2 , R3

Annotations for the columns, rows and cells in table T are shown in the tables below.

Column annotations:

id core annotations
table number source number cells name titles
C1 T 1 1 C1.1 , C2.1 , C3.1 countryCode countryCode
C2 T 2 2 C1.2 , C2.2 , C3.2 latitude latitude
C3 T 3 3 C1.3 , C2.3 , C3.3 longitude longitude
C4 T 4 4 C1.4 , C2.4 , C3.4 name name

Row annotations:

id core annotations
table number source number cells
R1 T 1 2 C1.1 , C1.2 , C1.3 , C1.4
R2 T 2 3 C2.1 , C2.2 , C2.3 , C2.4
R3 T 3 4 C3.1 , C3.2 , C3.3 , C3.4

Cell annotations:

id core annotations
table column row string value value property URL
C1.1 T C1 R1 "AD" "AD" null
C1.2 T C2 R1 "42.546245" "42.546245" null
C1.3 T C3 R1 "1.601554" "1.601554" null
C1.4 T C4 R1 "Andorra" "Andorra" null
C2.1 T C1 R2 "AE" "AE" null
C2.2 T C2 R2 "23.424076" "23.424076" null
C2.3 T C3 R2 "53.847818" "53.847818" null
C2.4 T C4 R2 "United Arab Emirates" "United Arab Emirates" null
C3.1 T C1 R3 "AF" "AF" null
C3.2 T C2 R3 "33.93911" "33.93911" null
C3.3 T C3 R3 "67.709953" "67.709953" null
C3.4 T C4 R3 "Afghanistan" "Afghanistan" null

Minimal mode output for this example is provided in Turtle [ turtle ] syntax below:

Example 4: http://example.org/countries-minimal.ttl
@base <http://example.org/countries.csv> .

_:8228a149-8efe-448d-b15f-8abf92e7bd17
  <#countryCode> "AD" ;
  <#latitude> "42.546245" ;
  <#longitude> "1.601554" ;
  <#name> "Andorra" .


_:ec59dcfc-872a-4144-822b-9ad5e2c6149c
  <#countryCode> "AE" ;
  <#latitude> "23.424076" ;
  <#longitude> "53.847818" ;
  <#name> "United Arab Emirates" .


_:e8f2e8e9-3d02-4bf5-b4f1-4794ba5b52c9
  <#countryCode> "AF" ;
  <#latitude> "33.93911" ;
  <#longitude> "67.709953" ;
  <#name> "Afghanistan" .
Note

The about URL annotation has not been set for cells in table T ( { "url": "http://example.org/countries.csv" } ); cells in a given row where about URL has not been specified are assumed to refer to the same subject. This unspecified subject is treated as a blank node.

Given that the property URL is null for cells in table T ( { "url": "http://example.org/countries.csv" } ), the property URL defaults to the URI Template (see [ RFC6570 ]) #{[column-name]}, where [column-name] is the value of the name annotation of the column associated with the cell. For example, the value of the property URL annotation for all cells in column C1 ( "name": "countryCode" ) is http://example.org/countries.csv#countryCode .
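
The two defaulting rules described in this note (one blank-node subject per row, and a property URL built from the column name) can be sketched in Python as follows. This is an illustration only: the blank node labels `_:row1`, `_:row2`, ... are invented, triples are plain tuples, all values are left as plain strings, and treating empty cells as null is an assumption that does not arise in this data.

```python
import csv
import io
from urllib.parse import quote

CSV_TEXT = """countryCode,latitude,longitude,name
AD,42.546245,1.601554,Andorra
AE,23.424076,53.847818,"United Arab Emirates"
AF,33.93911,67.709953,Afghanistan
"""


def minimal_triples(csv_text, table_url):
    """Yield one (subject, predicate, object) tuple per non-empty cell."""
    rows = csv.reader(io.StringIO(csv_text))
    names = next(rows)  # the header line supplies the column names
    for number, row in enumerate(rows, start=1):
        subject = "_:row%d" % number  # no about URL: one blank node per row
        for name, cell in zip(names, row):
            if cell == "":
                continue  # assumption: empty cells are null and produce no triple
            # Default property URL: "#" plus the percent-encoded column name,
            # resolved against the URL of the tabular data file.
            predicate = table_url + "#" + quote(name, safe="")
            yield (subject, predicate, cell)


for triple in minimal_triples(CSV_TEXT, "http://example.org/countries.csv"):
    print(triple)
```

Each of the three rows contributes four triples, matching the shape of the minimal mode output above.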

Standard mode output for this example is provided in Turtle [ turtle ] syntax below:

Example 5: http://example.org/countries-standard.ttl
@base <http://example.org/countries.csv> .
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .


_:d4f8e548-9601-4e41-aadb-09a8bce32625 a csvw:TableGroup ;
  csvw:table [ a csvw:Table ;
    csvw:url <http://example.org/countries.csv> ;
    csvw:row [ a csvw:Row ;
      csvw:rownum "1"^^xsd:integer ;
      csvw:url <#row=2> ;
      csvw:describes _:8228a149-8efe-448d-b15f-8abf92e7bd17
    ], [ a csvw:Row ;

      csvw:rownum "2"^^xsd:integer ;
      csvw:url <#row=3> ;
      csvw:describes _:ec59dcfc-872a-4144-822b-9ad5e2c6149c
    ], [ a csvw:Row ;

      csvw:rownum "3"^^xsd:integer ;
      csvw:url <#row=4> ;
      csvw:describes _:e8f2e8e9-3d02-4bf5-b4f1-4794ba5b52c9
    ]
  ] .


_:8228a149-8efe-448d-b15f-8abf92e7bd17
  <#countryCode> "AD" ;
  <#latitude> "42.546245" ;
  <#longitude> "1.601554" ;
  <#name> "Andorra" .


_:ec59dcfc-872a-4144-822b-9ad5e2c6149c
  <#countryCode> "AE" ;
  <#latitude> "23.424076" ;
  <#longitude> "53.847818" ;
  <#name> "United Arab Emirates" .


_:e8f2e8e9-3d02-4bf5-b4f1-4794ba5b52c9
  <#countryCode> "AF" ;
  <#latitude> "33.93911" ;
  <#longitude> "67.709953" ;
  <#name> "Afghanistan" .
Note

Even though the table was defined in isolation, the annotated table is wrapped in a group of tables.

The types of the group of tables and the table are explicitly stated: csvw:TableGroup and csvw:Table respectively.

The csvw:url property provides a reference to the original tabular data file and to specific rows therein, noting the need to escape the Turtle-syntax reserved character = ( U+003D ) within the fragment identifier.

The row number is provided for each row using the csvw:rownum property.

A subject and row are related using the csvw:describes property.
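
The row-level structure described in these notes can be sketched in Python as follows. The function name `row_triples`, the blank node labels and the tuple representation of triples are illustrative assumptions; only the csvw: terms are taken from the example output above.

```python
CSVW = "http://www.w3.org/ns/csvw#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def row_triples(table_node, table_url, number, source_number, subject):
    """Return the standard mode triples that link a table, a row and its subject."""
    row_node = "_:r%d" % number  # hypothetical blank node label for the row
    return [
        (table_node, CSVW + "row", row_node),
        (row_node, RDF_TYPE, CSVW + "Row"),
        # Row number within the table; the header line is not counted.
        (row_node, CSVW + "rownum", number),
        # Source number within the file; the header line is counted.
        (row_node, CSVW + "url", "%s#row=%d" % (table_url, source_number)),
        (row_node, CSVW + "describes", subject),
    ]


# Row R1 of table T: row number 1, source number 2 (the header occupies line 1).
for triple in row_triples("_:table", "http://example.org/countries.csv", 1, 2, "_:row1"):
    print(triple)
```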

7.2 Example with single table and rich annotations

This example is based on Use Case #11 - City of Palo Alto Tree Data and comprises a single annotated table describing an inventory of tree maintenance operations. The input tabular data file, published at http://example.org/tree-ops-ext.csv , and the associated metadata description http://example.org/tree-ops-ext.csv-metadata.json are provided below:

Example 6: http://example.org/tree-ops-ext.csv
GID,On Street,Species,Trim Cycle,Diameter at Breast Ht,Inventory Date,Comments,Protected,KML
1,ADDISON AV,Celtis australis,Large Tree Routine Prune,11,10/18/2010,,,"<Point><coordinates>-122.156485,37.440963</coordinates></Point>"
2,EMERSON ST,Liquidambar styraciflua,Large Tree Routine Prune,11,6/2/2010,,,"<Point><coordinates>-122.156749,37.440958</coordinates></Point>"
6,ADDISON AV,Robinia pseudoacacia,Large Tree Routine Prune,29,6/1/2010,cavity or decay; trunk decay; codominant leaders; included bark; large leader or limb decay; previous failure root damage; root decay; beware of BEES,YES,"<Point><coordinates>-122.156299,37.441151</coordinates></Point>"
Example 7: http://example.org/tree-ops-ext.csv-metadata.json
{
  "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
  "@id": "http://example.org/tree-ops-ext",
  "url": "tree-ops-ext.csv",
  "dc:title": "Tree Operations",
  "dcat:keyword": ["tree", "street", "maintenance"],
  "dc:publisher": [{
    "schema:name": "Example Municipality",
    "schema:url": {"@id": "http://example.org"}
  }],
  "dc:license": {"@id": "http://opendefinition.org/licenses/cc-by/"},
  "dc:modified": {"@value": "2010-12-31", "@type": "xsd:date"},
  "notes": [{
    "@type": "oa:Annotation",
    "oa:hasTarget": {"@id": "http://example.org/tree-ops-ext"},
    "oa:hasBody": {
      "@type": "oa:EmbeddedContent",
      "rdf:value": "This is a very interesting comment about the table; it's a table!",
      "dc:format": {"@value": "text/plain"}
    }
  }],
  "dialect": {"trim": true},
  "tableSchema": {
    "columns": [{
      "name": "GID",
      "titles": [
        "GID",
        "Generic Identifier"
      ],
      "dc:description": "An identifier for the operation on a tree.",
      "datatype": "string",
      "required": true, 
      "suppressOutput": true
    }, {
      "name": "on_street",
      "titles": "On Street",
      "dc:description": "The street that the tree is on.",
      "datatype": "string"
    }, {
      "name": "species",
      "titles": "Species",
      "dc:description": "The species of the tree.",
      "datatype": "string"
    }, {
      "name": "trim_cycle",
      "titles": "Trim Cycle",
      "dc:description": "The operation performed on the tree.",
      "datatype": "string",
      "lang": "en"
    }, {
      "name": "dbh",
      "titles": "Diameter at Breast Ht",
      "dc:description": "Diameter at Breast Height (DBH) of the tree (in feet), measured 4.5ft above ground.",
      "datatype": "integer"
    }, {
      "name": "inventory_date",
      "titles": "Inventory Date",
      "dc:description": "The date of the operation that was performed.",
      "datatype": {"base": "date", "format": "M/d/yyyy"}
    }, {
      "name": "comments",
      "titles": "Comments",
      "dc:description": "Supplementary comments relating to the operation or tree.",
      "datatype": "string",
      "separator": ";"
    }, {
      "name": "protected",
      "titles": "Protected",
      "dc:description": "Indication (YES / NO) whether the tree is subject to a protection order.",
      "datatype": {"base": "boolean", "format": "YES|NO"},
      "default": "NO"
    }, {
      "name": "kml",
      "titles": "KML",
      "dc:description": "KML-encoded description of tree location.",
      "datatype": "xml"
    }],
    "primaryKey": "GID",
    "aboutUrl": "http://example.org/tree-ops-ext#gid-{GID}"
  }

}
Note

The notes annotation in the metadata description uses the Open Annotation data model currently under development within the Web Annotations Working Group. This is purely illustrative; no constraints are placed on the value of the notes annotation.

The annotated table generated from parsing the tabular data file and associated metadata is shown below and provides the basis for the conversion to RDF.

Core annotations for the resulting table T, with 9 columns and 3 rows, are shown below:

id core annotations
id url columns rows notes
T <http://example.org/tree-ops-ext> http://example.org/tree-ops-ext.csv C1 , C2 , C3 , C4 , C5 , C6 , C7 , C8 , C9 R1 , R2 , R3 [{ "@type": "oa:Annotation", ... }]

Non-core annotations for table T are:

dc:title
"Tree Operations"
dcat:keyword
["tree", "street", "maintenance"]
dc:publisher
[{ "schema:name": "Example Municipality", "schema:url": { "@id": "http://example.org" } }]
dc:license
{ "@id": "http://opendefinition.org/licenses/cc-by/" }
dc:modified
"2010-12-31"
Note

The value of the notes annotation has been shortened for clarity in the table above.

Annotations for the columns, rows and cells in table T are shown in the tables below.

Column annotations:

id core annotations annotations
table number source number cells name titles required suppress output dc:description
C1 T 1 1 C1.1 , C2.1 , C3.1 GID GID , Generic Identifier true true An identifier for the operation on a tree.
C2 T 2 2 C1.2 , C2.2 , C3.2 on_street On Street The street that the tree is on.
C3 T 3 3 C1.3 , C2.3 , C3.3 species Species The species of the tree.
C4 T 4 4 C1.4 , C2.4 , C3.4 trim_cycle Trim Cycle The operation performed on the tree.
C5 T 5 5 C1.5 , C2.5 , C3.5 dbh Diameter at Breast Ht Diameter at Breast Height (DBH) of the tree (in feet), measured 4.5ft above ground.
C6 T 6 6 C1.6 , C2.6 , C3.6 inventory_date Inventory Date The date of the operation that was performed.
C7 T 7 7 C1.7 , C2.7 , C3.7 comments Comments Supplementary comments relating to the operation or tree.
C8 T 8 8 C1.8 , C2.8 , C3.8 protected Protected Indication (YES / NO) whether the tree is subject to a protection order.
C9 T 9 9 C1.9 , C2.9 , C3.9 kml KML KML-encoded description of tree location.
Note

In this example, output for column C1 ( GID ) is not required; note the suppress output annotation on this column.

Row annotations:

id core annotations
table number source number cells primary key
R1 T 1 2 C1.1 , C1.2 , C1.3 , C1.4 , C1.5 , C1.6 , C1.7 , C1.8 , C1.9 C1.1
R2 T 2 3 C2.1 , C2.2 , C2.3 , C2.4 , C2.5 , C2.6 , C2.7 , C2.8 , C2.9 C2.1
R3 T 3 4 C3.1 , C3.2 , C3.3 , C3.4 , C3.5 , C3.6 , C3.7 , C3.8 , C3.9 C3.1

Cell annotations:

id core annotations
table column row string value value about URL
C1.1 T C1 R1 "1" "1" <http://example.org/tree-ops-ext#gid-1>
C1.2 T C2 R1 "ADDISON AV" "ADDISON AV" <http://example.org/tree-ops-ext#gid-1>
C1.3 T C3 R1 "Celtis australis" "Celtis australis" <http://example.org/tree-ops-ext#gid-1>
C1.4 T C4 R1 "Large Tree Routine Prune" "Large Tree Routine Prune" (English) <http://example.org/tree-ops-ext#gid-1>
C1.5 T C5 R1 "11" 11 <http://example.org/tree-ops-ext#gid-1>
C1.6 T C6 R1 "10/18/2010" 2010-10-18 <http://example.org/tree-ops-ext#gid-1>
C1.7 T C7 R1 "" null <http://example.org/tree-ops-ext#gid-1>
C1.8 T C8 R1 "" false <http://example.org/tree-ops-ext#gid-1>
C1.9 T C9 R1 "<Point><coordinates>-122.156485,37.440963</coordinates></Point>" "<Point><coordinates>-122.156485,37.440963</coordinates></Point>" (XML) <http://example.org/tree-ops-ext#gid-1>
C2.1 T C1 R2 "2" "2" <http://example.org/tree-ops-ext#gid-2>
C2.2 T C2 R2 "EMERSON ST" "EMERSON ST" <http://example.org/tree-ops-ext#gid-2>
C2.3 T C3 R2 "Liquidambar styraciflua" "Liquidambar styraciflua" <http://example.org/tree-ops-ext#gid-2>
C2.4 T C4 R2 "Large Tree Routine Prune" "Large Tree Routine Prune" (English) <http://example.org/tree-ops-ext#gid-2>
C2.5 T C5 R2 "11" 11 <http://example.org/tree-ops-ext#gid-2>
C2.6 T C6 R2 "6/2/2010" 2010-06-02 <http://example.org/tree-ops-ext#gid-2>
C2.7 T C7 R2 "" null <http://example.org/tree-ops-ext#gid-2>
C2.8 T C8 R2 "" false <http://example.org/tree-ops-ext#gid-2>
C2.9 T C9 R2 "<Point><coordinates>-122.156749,37.440958</coordinates></Point>" "<Point><coordinates>-122.156749,37.440958</coordinates></Point>" (XML) <http://example.org/tree-ops-ext#gid-2>
C3.1 T C1 R3 "6" "6" <http://example.org/tree-ops-ext#gid-6>
C3.2 T C2 R3 "ADDISON AV" "ADDISON AV" <http://example.org/tree-ops-ext#gid-6>
C3.3 T C3 R3 "Robinia pseudoacacia" "Robinia pseudoacacia" <http://example.org/tree-ops-ext#gid-6>
C3.4 T C4 R3 "Large Tree Routine Prune" "Large Tree Routine Prune" (English) <http://example.org/tree-ops-ext#gid-6>
C3.5 T C5 R3 "29" 29 <http://example.org/tree-ops-ext#gid-6>
C3.6 T C6 R3 "6/1/2010" 2010-06-01 <http://example.org/tree-ops-ext#gid-6>
C3.7 T C7 R3 "cavity or decay; trunk decay; codominant leaders; included bark; large leader or limb decay; previous failure root damage; root decay; beware of BEES" "cavity or decay" , "trunk decay" , "codominant leaders" , "included bark" , "large leader or limb decay" , "previous failure root damage" , "root decay" , "beware of BEES" <http://example.org/tree-ops-ext#gid-6>
C3.8 T C8 R3 "YES" true <http://example.org/tree-ops-ext#gid-6>
C3.9 T C9 R3 "<Point><coordinates>-122.156299,37.441151</coordinates></Point>" "<Point><coordinates>-122.156299,37.441151</coordinates></Point>" (XML) <http://example.org/tree-ops-ext#gid-6>
Note

The lists of values from cells in column C7 ( "name": "comments" ) are assumed to be unordered as the boolean ordered annotation, which defaults to false, has not been set within the metadata description.
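
The relationship between the string value and value columns in the cell annotations above can be sketched in Python as follows. This is a simplified illustration, not the parsing procedure itself: `parse_cell` and `DATE_PATTERNS` are hypothetical names, only the datatypes and formats used in this example are handled, and the mapping from the date pattern M/d/yyyy to a `strptime` pattern is an assumption made for the sketch.

```python
from datetime import datetime

# Hypothetical mapping from the date pattern used in the metadata to strptime.
DATE_PATTERNS = {"M/d/yyyy": "%m/%d/%Y"}


def parse_cell(string_value, base="string", fmt=None, separator=None, default=""):
    """Return a cell's value: a typed value, a list of strings, or None for null."""
    text = string_value.strip()  # mirrors "dialect": {"trim": true}
    if separator is not None:
        if text == "":
            return None  # an empty list-valued cell is null
        return [item.strip() for item in text.split(separator)]
    if text == "":
        text = default  # fall back to the column's default annotation
    if text == "":
        return None
    if base == "integer":
        return int(text)
    if base == "date":
        return datetime.strptime(text, DATE_PATTERNS[fmt]).date()
    if base == "boolean":
        true_text, false_text = fmt.split("|")
        if text not in (true_text, false_text):
            raise ValueError("not a boolean: %r" % text)
        return text == true_text
    return text


print(parse_cell("6/2/2010", base="date", fmt="M/d/yyyy"))       # cell C2.6
print(parse_cell("", base="boolean", fmt="YES|NO", default="NO"))  # cell C1.8
print(parse_cell("root decay; beware of BEES", separator=";"))
```

Note how the empty Protected cells in rows R1 and R2 become false through the default annotation, whereas the empty Comments cells become null because that column has a separator and no default.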

Minimal mode output for this example is provided in Turtle [ turtle ] syntax below:

Example 8: http://example.org/tree-ops-ext-minimal.ttl
@base <http://example.org/tree-ops-ext.csv> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/tree-ops-ext#gid-1>
  <#on_street> "ADDISON AV" ;
  <#species> "Celtis australis" ;
  <#trim_cycle> "Large Tree Routine Prune"@en ;
  <#dbh> 11 ;
  <#inventory_date> "2010-10-18"^^xsd:date ;
  <#protected> false ;
  <#kml> "<Point><coordinates>-122.156485,37.440963</coordinates></Point>"^^rdf:XMLLiteral .


<http://example.org/tree-ops-ext#gid-2>
  <#on_street> "EMERSON ST" ;
  <#species> "Liquidambar styraciflua" ;
  <#trim_cycle> "Large Tree Routine Prune"@en ;
  <#dbh> 11 ;
  <#inventory_date> "2010-06-02"^^xsd:date ;
  <#protected> false ;
  <#kml> "<Point><coordinates>-122.156749,37.440958</coordinates></Point>"^^rdf:XMLLiteral .


<http://example.org/tree-ops-ext#gid-6>
  <#on_street> "ADDISON AV" ;
  <#species> "Robinia pseudoacacia" ;
  <#trim_cycle> "Large Tree Routine Prune"@en ;
  <#dbh> 29 ;
  <#inventory_date> "2010-06-01"^^xsd:date ;
  <#comments> "cavity or decay", "trunk decay", "codominant leaders", "included bark", "large leader or limb decay", "previous failure root damage", "root decay", "beware of BEES" ;
  <#protected> true ;
  <#kml> "<Point><coordinates>-122.156299,37.441151</coordinates></Point>"^^rdf:XMLLiteral .
Note

The subject described by each row is explicitly defined using the about URL annotation; e.g. the subject of row R1 is http://example.org/tree-ops-ext#gid-1 .

Output for column C1 ( { "name": "GID" } ) is not included as the suppress output annotation is true.

A language tag is specified for the values of column C4 ( { "name": "trim_cycle" } ) as the language annotation is en.

The datatype annotation is set on columns C5, C6, C8 and C9 ( { "name": "dbh" }, { "name": "inventory_date" }, { "name": "protected" } and { "name": "kml" } ): integer, date, boolean and xml respectively. The datatype property is inherited by all cells in each of those columns; therefore the RDF output for those cells includes the appropriate datatype IRI.

Cells C1.7 and C2.7 ( rows R1 and R2; column { "name": "comments" } ) have null values; no output is included for these cells.

Cell C3.7 ( row R3; column { "name": "comments" } ) contains an unordered sequence of values; the values are included as a simple set of triples, as opposed to an instance of rdf:List, because the ordered annotation has defaulted to false.
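
The way the about URL annotation is derived from the aboutUrl template in the metadata description can be sketched in Python as follows. This is a deliberately reduced illustration: `expand_about_url` is a hypothetical helper that handles only simple `{name}` expressions rather than the full URI Template syntax, and it resolves the result against the URL of the tabular data file.

```python
import re
from urllib.parse import quote, urljoin


def expand_about_url(template, variables, base_url):
    """Expand simple {name} expressions and resolve the result against base_url."""

    def substitute(match):
        # Percent-encode the substituted value, as simple expansion requires.
        return quote(str(variables[match.group(1)]), safe="")

    expanded = re.sub(r"\{([A-Za-z0-9_]+)\}", substitute, template)
    return urljoin(base_url, expanded)


# Row R3 of table T, where the cell in column GID has the value "6".
print(expand_about_url(
    "http://example.org/tree-ops-ext#gid-{GID}",
    {"GID": "6"},
    "http://example.org/tree-ops-ext.csv",
))
```

Because the template in this example is already an absolute URL, resolution against the base leaves it unchanged; a relative template such as the hypothetical `#gid-{GID}` would instead resolve to a fragment of the tabular data file URL.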

Standard mode output for this example is provided in Turtle [ turtle ] syntax below:

Example 9: http://example.org/tree-ops-ext-standard.ttl
@base <http://example.org/tree-ops-ext.csv> .
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix oa: <http://www.w3.org/ns/oa#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

_:68fc08e5-56a0-47e2-a784-3a644d8257c4 a csvw:TableGroup ;
  csvw:table <http://example.org/tree-ops-ext> .

<http://example.org/tree-ops-ext> a csvw:Table ;

  csvw:url <http://example.org/tree-ops-ext.csv> ;
  dc:title "Tree Operations"@en ;
  dcat:keyword "tree"@en, "street"@en, "maintenance"@en ;
  dc:publisher [
    schema:name "Example Municipality"@en ;
    schema:url <http://example.org>
  ] ;

  dc:license <http://opendefinition.org/licenses/cc-by/> ;
  dc:modified "2010-12-31"^^xsd:date ;
  csvw:note [
    a oa:Annotation ;
    oa:hasTarget <http://example.org/tree-ops-ext> ;
    oa:hasBody [
      a oa:EmbeddedContent ;
      rdf:value "This is a very interesting comment about the table; it's a table!"@en ;
      dc:format "text/plain"
    ]
  ] ;

  csvw:row [ a csvw:Row ;
    csvw:rownum 1 ;
    csvw:url <#row=2> ;
    csvw:describes <http://example.org/tree-ops-ext#gid-1>
  ], [ a csvw:Row ;

    csvw:rownum 2 ;
    csvw:url <#row=3> ;
    csvw:describes <http://example.org/tree-ops-ext#gid-2>
  ], [ a csvw:Row ;

    csvw:rownum 3 ;
    csvw:url <#row=4> ;
    csvw:describes <http://example.org/tree-ops-ext#gid-6>
  ] .


<http://example.org/tree-ops-ext#gid-1>
  <#on_street> "ADDISON AV" ;
  <#species> "Celtis australis" ;
  <#trim_cycle> "Large Tree Routine Prune"@en ;
  <#dbh> 11 ;
  <#inventory_date> "2010-10-18"^^xsd:date ;
  <#protected> false ;
  <#kml> "<Point><coordinates>-122.156485,37.440963</coordinates></Point>"^^rdf:XMLLiteral .


<http://example.org/tree-ops-ext#gid-2>
  <#on_street> "EMERSON ST" ;
  <#species> "Liquidambar styraciflua" ;
  <#trim_cycle> "Large Tree Routine Prune"@en ;
  <#dbh> 11 ;
  <#inventory_date> "2010-06-02"^^xsd:date ;
  <#protected> false ;
  <#kml> "<Point><coordinates>-122.156749,37.440958</coordinates></Point>"^^rdf:XMLLiteral .


<http://example.org/tree-ops-ext#gid-6>
  <#on_street> "ADDISON AV" ;
  <#species> "Robinia pseudoacacia" ;
  <#trim_cycle> "Large Tree Routine Prune"@en ;
  <#dbh> 29 ;
  <#inventory_date> "2010-06-01"^^xsd:date ;
  <#comments> "cavity or decay", "trunk decay", "codominant leaders", "included bark", "large leader or limb decay", "previous failure root damage", "root decay", "beware of BEES" ;
  <#protected> true ;
  <#kml> "<Point><coordinates>-122.156299,37.441151</coordinates></Point>"^^rdf:XMLLiteral .
Note

Table T ( { "url": "http://example.org/tree-ops-ext.csv" } ) has been explicitly identified: { "@id": "http://example.org/tree-ops-ext" } .

Non-core annotations and notes specified for table T ( { "url": "http://example.org/tree-ops-ext.csv" } ) are included in the output.

As the metadata description file http://example.org/tree-ops-ext.csv-metadata.json defines a default language within the context ( "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}] ), all non-core annotations with plain string values (e.g. dc:title, dcat:keyword and the schema:name of dc:publisher) are expressed in the RDF output using the appropriate language tag. Annotations that carry an @id or an explicit @type (e.g. dc:license and dc:modified) are not language-tagged.

7.3 Example with single table and using virtual columns to produce multiple subjects per row

This example uses a single annotated table describing a listing of music events. Each row from the tabular data file corresponds to three resources: the music event itself, the location where that event occurs and the offer to sell tickets for that event. The goal is to convert the CSV content into schema.org markup that a search engine such as Google can use to index music events. Details of how Google expects this information to be structured can be found here .

The input tabular data file, published at http://example.org/events-listing.csv , and the associated metadata description http://example.org/events-listing.csv-metadata.json are provided below:

Example 10: http://example.org/events-listing.csv
Name, Start Date, Location Name, Location Address, Ticket Url
B.B. King,2014-04-12T19:30,"Lupo’s Heartbreak Hotel","79 Washington St., Providence, RI",https://www.etix.com/ticket/1771656
B.B. King,2014-04-13T20:00,"Lynn Auditorium","Lynn, MA, 01901",http://frontgatetickets.com/venue.php?id=11766
Example 11: http://example.org/events-listing.csv-metadata.json
{
  "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
  "url": "events-listing.csv",
  "dialect": {"trim": true},
  "tableSchema": {
    "columns": [{
      "name": "name",
      "titles": "Name",
      "aboutUrl": "#event-{_row}",
      "propertyUrl": "schema:name"
    }, {
      "name": "start_date",
      "titles": "Start Date",
      "datatype": {
        "base": "datetime",
        "format": "yyyy-MM-ddTHH:mm"
      },
      "aboutUrl": "#event-{_row}",
      "propertyUrl": "schema:startDate"
    }, {
      "name": "location_name",
      "titles": "Location Name",
      "aboutUrl": "#place-{_row}",
      "propertyUrl": "schema:name"
    }, {
      "name": "location_address",
      "titles": "Location Address",
      "aboutUrl": "#place-{_row}",
      "propertyUrl": "schema:address"
    }, {
      "name": "ticket_url",
      "titles": "Ticket Url",
      "datatype": "anyURI",
      "aboutUrl": "#offer-{_row}",
      "propertyUrl": "schema:url"
    }, {
      "name": "type_event",
      "virtual": true,
      "aboutUrl": "#event-{_row}",
      "propertyUrl": "rdf:type",
      "valueUrl": "schema:MusicEvent"
    }, {
      "name": "type_place",
      "virtual": true,
      "aboutUrl": "#place-{_row}",
      "propertyUrl": "rdf:type",
      "valueUrl": "schema:Place"
    }, {
      "name": "type_offer",
      "virtual": true,
      "aboutUrl": "#offer-{_row}",
      "propertyUrl": "rdf:type",
      "valueUrl": "schema:Offer"
    }, {
      "name": "location",
      "virtual": true,
      "aboutUrl": "#event-{_row}",
      "propertyUrl": "schema:location",
      "valueUrl": "#place-{_row}"
    }, {
      "name": "offers",
      "virtual": true,
      "aboutUrl": "#event-{_row}",
      "propertyUrl": "schema:offers",
      "valueUrl": "#offer-{_row}"
    }]
  }

}
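The header line of the tabular data file carries a space after each comma, which the "trim": true dialect setting in the metadata is there to absorb. As a minimal sketch only, assuming Python's standard csv module stands in for a conforming tabular data parser, the effect can be illustrated as follows. The function name read_trimmed is hypothetical, and the CSV content is embedded in the code so that the sketch is self-contained.

```python
import csv
import io

# The tabular data from the events listing, embedded so the sketch runs on its own.
CSV_TEXT = (
    "Name, Start Date, Location Name, Location Address, Ticket Url\n"
    'B.B. King,2014-04-12T19:30,"Lupo’s Heartbreak Hotel",'
    '"79 Washington St., Providence, RI",https://www.etix.com/ticket/1771656\n'
    'B.B. King,2014-04-13T20:00,"Lynn Auditorium",'
    '"Lynn, MA, 01901",http://frontgatetickets.com/venue.php?id=11766\n'
)


def read_trimmed(text):
    """Return (titles, rows), stripping leading and trailing whitespace per cell.

    Hypothetical helper: it approximates the "trim": true dialect setting
    and is not a full implementation of the tabular data model.
    """
    reader = csv.reader(io.StringIO(text))
    records = [[cell.strip() for cell in record] for record in reader]
    return records[0], records[1:]


titles, rows = read_trimmed(CSV_TEXT)
```

With trimming applied, the titles match the "titles" values declared in the column descriptions, so each source column can be bound to its description.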
Note

The CSV to RDF translation is limited to providing one statement, or triple, per column in the table. The target schema.org markup requires 10 statements to describe each event. As the base tabular data file contains 5 columns, an additional 5 virtual columns have been added to provide the full complement of statements, including the relationships between the 3 resources (event, location, and offer) described by each row of the table. Note that the virtual annotation is set to true for these virtual columns.

Furthermore, note that no attempt is made to reconcile locations or offers that may be associated with more than one event; every row in the table will create both a new location resource and a new offer resource in addition to the event resource. If considered necessary, applications such as OpenRefine may be used to identify and reconcile duplicate location resources once the RDF output has been generated.

The annotated table generated from parsing the tabular data file and associated metadata is shown below and provides the basis for the conversion to RDF.

Annotations for the resulting table T , with 10 columns and 2 rows, are shown below:

id core annotations
url columns rows
T http://example.org/events-listing.csv C1 , C2 , C3 , C4 , C5 , C6 , C7 , C8 , C9 , C10 R1 , R2

Annotations for the columns, rows and cells in table T are shown in the tables below.

Column annotations:

id core annotations
table number source number cells name titles virtual
C1 T 1 1 C1.1 , C2.1 name Name
C2 T 2 2 C1.2 , C2.2 start_date Start Date
C3 T 3 3 C1.3 , C2.3 location_name Location Name
C4 T 4 4 C1.4 , C2.4 location_address Location Address
C5 T 5 5 C1.5 , C2.5 ticket_url Ticket Url
C6 T 6 6 C1.6 , C2.6 type_event true
C7 T 7 7 C1.7 , C2.7 type_place true
C8 T 8 8 C1.8 , C2.8 type_offer true
C9 T 9 9 C1.9 , C2.9 location true
C10 T 10 10 C1.10 , C2.10 offers true

Row annotations:

id core annotations
table number source number cells
R1 T 1 2 C1.1 , C1.2 , C1.3 , C1.4 , C1.5 , C1.6 , C1.7 , C1.8 , C1.9 , C1.10
R2 T 2 3 C2.1 , C2.2 , C2.3 , C2.4 , C2.5 , C2.6 , C2.7 , C2.8 , C2.9 , C2.10

Cell annotations:

id core annotations
table column row string value value about URL property URL value URL
C1.1 T C1 R1 "B.B. King" "B.B. King" <http://example.org/events-listing.csv#event-1> schema:name
C1.2 T C2 R1 "2014-04-12T19:30" 2014-04-12T19:30:00 <http://example.org/events-listing.csv#event-1> schema:startDate
C1.3 T C3 R1 "Lupo’s Heartbreak Hotel" "Lupo’s Heartbreak Hotel" <http://example.org/events-listing.csv#place-1> schema:name
C1.4 T C4 R1 "79 Washington St., Providence, RI" "79 Washington St., Providence, RI" <http://example.org/events-listing.csv#place-1> schema:address
C1.5 T C5 R1 "https://www.etix.com/ticket/1771656" <https://www.etix.com/ticket/1771656> <http://example.org/events-listing.csv#offer-1> schema:url
C1.6 T C6 R1 "" null <http://example.org/events-listing.csv#event-1> rdf:type schema:MusicEvent
C1.7 T C7 R1 "" null <http://example.org/events-listing.csv#place-1> rdf:type schema:Place
C1.8 T C8 R1 "" null <http://example.org/events-listing.csv#offer-1> rdf:type schema:Offer
C1.9 T C9 R1 "" null <http://example.org/events-listing.csv#event-1> schema:location <http://example.org/events-listing.csv#place-1>
C1.10 T C10 R1 "" null <http://example.org/events-listing.csv#event-1> schema:offers <http://example.org/events-listing.csv#offer-1>
C2.1 T C1 R2 "B.B. King" "B.B. King" <http://example.org/events-listing.csv#event-2> schema:name
C2.2 T C2 R2 "2014-04-13T20:00" 2014-04-13T20:00:00 <http://example.org/events-listing.csv#event-2> schema:startDate
C2.3 T C3 R2 "Lynn Auditorium" "Lynn Auditorium" <http://example.org/events-listing.csv#place-2> schema:name
C2.4 T C4 R2 "Lynn, MA, 01901" "Lynn, MA, 01901" <http://example.org/events-listing.csv#place-2> schema:address
C2.5 T C5 R2 "http://frontgatetickets.com/venue.php?id=11766" <http://frontgatetickets.com/venue.php?id=11766> <http://example.org/events-listing.csv#offer-2> schema:url
C2.6 T C6 R2 "" null <http://example.org/events-listing.csv#event-2> rdf:type schema:MusicEvent
C2.7 T C7 R2 "" null <http://example.org/events-listing.csv#place-2> rdf:type schema:Place
C2.8 T C8 R2 "" null <http://example.org/events-listing.csv#offer-2> rdf:type schema:Offer
C2.9 T C9 R2 "" null <http://example.org/events-listing.csv#event-2> schema:location <http://example.org/events-listing.csv#place-2>
C2.10 T C10 R2 "" null <http://example.org/events-listing.csv#event-2> schema:offers <http://example.org/events-listing.csv#offer-2>
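The about URL and value URL annotations in the cell table above follow from expanding the {_row} variable in each column's aboutUrl and valueUrl and resolving the result against the table URL. The following is a minimal sketch of that step, not a conforming implementation: it assumes plain string substitution in place of full URI Template expansion, it covers only four of the ten columns, and the names COLUMNS, expand and cell_annotations are hypothetical.

```python
from urllib.parse import urljoin

TABLE_URL = "http://example.org/events-listing.csv"

# (name, aboutUrl, propertyUrl, valueUrl) for a subset of the column descriptions.
COLUMNS = [
    ("name", "#event-{_row}", "schema:name", None),
    ("location_name", "#place-{_row}", "schema:name", None),
    ("type_event", "#event-{_row}", "rdf:type", "schema:MusicEvent"),
    ("location", "#event-{_row}", "schema:location", "#place-{_row}"),
]


def expand(template, row_number):
    """Substitute only the {_row} variable (assumption: no other variables occur)."""
    return template.replace("{_row}", str(row_number))


def cell_annotations(row_number):
    """Return (column name, about URL, property URL, value URL) for one row."""
    annotations = []
    for name, about, prop, value in COLUMNS:
        about_url = urljoin(TABLE_URL, expand(about, row_number))
        value_url = None
        if value is not None:
            value = expand(value, row_number)
            # Fragment references are resolved against the table URL;
            # prefixed names such as schema:MusicEvent are left as written.
            value_url = urljoin(TABLE_URL, value) if value.startswith("#") else value
        annotations.append((name, about_url, prop, value_url))
    return annotations
```

Calling cell_annotations(2) gives the annotations for row R2, where the location column relates the event-2 resource to the place-2 resource.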

Minimal mode output for this example is provided in Turtle [ turtle ] syntax below:

Example 12: http://example.org/events-listing-minimal.ttl
@base <http://example.org/events-listing.csv> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<#event-1> a schema:MusicEvent ;

  schema:name "B.B. King" ;
  schema:startDate "2014-04-12T19:30:00"^^xsd:dateTime ;
  schema:location <#place-1> ;
  schema:offers <#offer-1> .

<#place-1> a schema:Place ;

  schema:name "Lupo’s Heartbreak Hotel" ;
  schema:address "79 Washington St., Providence, RI" .

<#offer-1> a schema:Offer ;

  schema:url "https://www.etix.com/ticket/1771656"^^xsd:anyURI .

<#event-2> a schema:MusicEvent ;

  schema:name "B.B. King" ;
  schema:startDate "2014-04-13T20:00:00"^^xsd:dateTime ;
  schema:location <#place-2> ;
  schema:offers <#offer-2> .

<#place-2> a schema:Place ;

  schema:name "Lynn Auditorium" ;
  schema:address "Lynn, MA, 01901" .

<#offer-2> a schema:Offer ;

  schema:url "http://frontgatetickets.com/venue.php?id=11766"^^xsd:anyURI .
Note

Three resources are defined for each row within the table: event, location, and offer.

Each column description in the metadata explicitly defines both aboutUrl and propertyUrl properties, which are used to create the about URL and property URL annotations on the column's cells.

Columns C6 , C7 and C8 ( { "name": "type_event"} , { "name": "type_place"} and { "name": "type_offer"} ) define the semantic types of the resources described by each row: schema:MusicEvent , schema:Place and schema:Offer respectively.

Column C9 ( { "name": "location"} ) uses the about URL, property URL and value URL to assert the relationship between the event and location resources.

Column C10 ( { "name": "offers"} ) uses the about URL, property URL and value URL to assert the relationship between the event and offer resources.
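The way one triple is derived per cell, with the object taken from the value URL where one exists and from the cell value otherwise, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the conversion algorithm itself: cells are reduced to four fields, literals are rendered without datatypes, a cell with neither a value nor a value URL is assumed to produce no triple, and the names ROW_1_CELLS, term, triples and group_by_subject are hypothetical.

```python
# A subset of the cells of row R1: (about URL, property URL, value, value URL).
# The value is None for virtual columns, whose object comes from the value URL.
ROW_1_CELLS = [
    ("#event-1", "schema:name", "B.B. King", None),
    ("#event-1", "rdf:type", None, "schema:MusicEvent"),
    ("#event-1", "schema:location", None, "#place-1"),
    ("#place-1", "schema:name", "Lupo’s Heartbreak Hotel", None),
    ("#place-1", "rdf:type", None, "schema:Place"),
]


def term(reference):
    """Render a fragment reference as <...>; leave prefixed names as written."""
    return f"<{reference}>" if reference.startswith("#") else reference


def triples(cells):
    """Yield one (subject, predicate, object) per cell that has something to state."""
    for about, prop, value, value_url in cells:
        if value_url is not None:
            obj = term(value_url)
        elif value is not None:
            obj = '"' + value + '"'
        else:
            continue  # assumption: a null value with no value URL yields no triple
        yield term(about), prop, obj


def group_by_subject(cells):
    """Collect predicate/object pairs under each subject, as Turtle output does."""
    grouped = {}
    for subject, predicate, obj in triples(cells):
        grouped.setdefault(subject, []).append((predicate, obj))
    return grouped
```

Grouping by subject shows why the output contains one block of statements per resource: cells that share an about URL contribute to the same subject.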

Standard mode output for this example is provided in Turtle [ turtle ] syntax below:

Example 13: http://example.org/events-listing-standard.ttl
@base <http://example.org/events-listing.csv> .
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

_:95cc7970-ce99-44b0-900c-e2c2c028bbd3 a csvw:TableGroup ;
  csvw:table [ a csvw:Table ;
    csvw:url <http://example.org/events-listing.csv> ;
    csvw:row [ a csvw:Row ;
      csvw:rownum 1 ;
      csvw:url <#row=2> ;
      csvw:describes <#event-1>, <#place-1>, <#offer-1>
    ], [ a csvw:Row ;

      csvw:rownum 2 ;
      csvw:url <#row=3> ;
      csvw:describes <#event-2>, <#place-2>, <#offer-2>
    ]
  ] .


<#event-1> a schema:MusicEvent ;

  schema:name "B.B. King" ;
  schema:startDate "2014-04-12T19:30:00"^^xsd:dateTime ;
  schema:location <#place-1> ;
  schema:offers <#offer-1> .

<#place-1> a schema:Place ;

  schema:name "Lupo’s Heartbreak Hotel" ;
  schema:address "79 Washington St., Providence, RI" .

<#offer-1> a schema:Offer ;

  schema:url "https://www.etix.com/ticket/1771656"^^xsd:anyURI .

<#event-2> a schema:MusicEvent ;

  schema:name "B.B. King" ;
  schema:startDate "2014-04-13T20:00:00"^^xsd:dateTime ;
  schema:location <#place-2> ;
  schema:offers <#offer-2> .

<#place-2> a schema:Place ;

  schema:name "Lynn Auditorium" ;
  schema:address "Lynn, MA, 01901" .

<#offer-2> a schema:Offer ;

  schema:url "http://frontgatetickets.com/venue.php?id=11766"^^xsd:anyURI .
Note

The resources described by each row are explicitly defined using the about URL annotation; in this case there are three resources per row (event, location, and offer). The relationship between the row and each subject resource is asserted using the csvw:describes property; e.g. for row R1 we state [] csvw:describes <#event-1>, <#place-1>, <#offer-1> .
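The set of subjects listed against csvw:describes for a row corresponds to the distinct about URLs of that row's cells. A minimal sketch, assuming first-seen ordering is acceptable and using the hypothetical names described_subjects and describes_statement, is:

```python
# About URLs of cells C1.1 to C1.10 (row R1), in column order.
ROW_1_ABOUT_URLS = [
    "#event-1", "#event-1", "#place-1", "#place-1", "#offer-1",
    "#event-1", "#place-1", "#offer-1", "#event-1", "#event-1",
]


def described_subjects(about_urls):
    """Return the distinct about URLs in first-seen order."""
    # dict preserves insertion order, so this removes duplicates stably.
    return list(dict.fromkeys(about_urls))


def describes_statement(about_urls):
    """Render the csvw:describes predicate-object list for one row."""
    subjects = ", ".join(f"<{url}>" for url in described_subjects(about_urls))
    return f"csvw:describes {subjects}"
```

For row R1 this yields the three subjects shown in the standard mode output, even though ten cells contribute about URLs.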

7.4 Example with group of tables comprising four interrelated tables

This example is based on Use Case #4 - Publication of public sector roles and salaries and uses four annotated tables published as a group of tables. Information about senior roles and junior roles within a government department or organization is published in CSV format by each department. These submissions are validated against a centrally published schema to ensure that all the data published by departments is consistent. Additionally, lists of organizations and professions are published centrally, providing controlled vocabularies against which departmental submissions are validated.

Information published about junior and senior roles provides summary information for each post within the government department or organization. Whilst the junior role information is anonymous, providing only an indication of the number of full-time-equivalent (FTE) staff occupying a given post, the senior role information specifies the named individual occupying each post. As such, each row from the tabular data file describing senior roles corresponds to two resources: the post and the person occupying that post.

This example is concerned only with converting the information provided by each government department or organization, not the centrally published information listing organizations and professions.

The input tabular data files and associated metadata descriptions are provided below:

Example 14: http://example.org/gov.uk/data/organizations.csv
Organization Unique Reference,Organization Name,Department Reference
hefce.ac.uk,Higher Education Funding Council for England,bis.gov.uk
bis.gov.uk,"Department for Business, Innovation and Skills",xx
Example 15: http://example.org/gov.uk/data/professions.csv
Profession
Finance
Information Technology
Operational Delivery
Policy
Example 16: http://example.org/senior-roles.csv
Post Unique Reference,Name,Grade,Job Title,Reports to Senior Post,Profession,Organization Reference
90115,Steve Egan,SCS1A,Deputy Chief Executive,90334,Finance,hefce.ac.uk
90334,Sir Alan Langlands,SCS4,Chief Executive,xx,Policy,hefce.ac.uk
Example 17: http://example.org/junior-roles.csv
Reporting Senior Post,Grade,Payscale Minimum (£),Payscale Maximum (£),Generic Job Title,Number of Posts (FTE),Profession,Organization Reference
90115,4,17426,20002,Administrator,8.67,Operational Delivery,hefce.ac.uk
90115,5,19546,22478,Administrator,0.5,Operational Delivery,hefce.ac.uk
Example 18: http://example.org/metadata.json
{
  "@type": "TableGroup",
  "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
  "tables": [{
    "url": "gov.uk/data/organizations.csv",
    "tableSchema": "gov.uk/schema/organizations.json",
    "suppressOutput": true
  }, {
    "url": "gov.uk/data/professions.csv",
    "tableSchema": "gov.uk/schema/professions.json",
    "suppressOutput": true
  }, {
    "url": "senior-roles.csv",
    "tableSchema": "gov.uk/schema/senior-roles.json"
  }, {
    "url": "junior-roles.csv",
    "tableSchema": "gov.uk/schema/junior-roles.json"
  }]
}
Example 19: http://example.org/gov.uk/schema/organizations.json
{
  "@id": "http://example.org/gov.uk/schema/organizations.json",
  "@context": "http://www.w3.org/ns/csvw",
  "columns": [{
    "name": "ref",
    "titles": "Organization Unique Reference",
    "datatype": "string",
    "required": true,
    "propertyUrl": "dc:identifier"
  }, {
    "name": "name",
    "titles": "Organization Name",
    "datatype": "string",
    "propertyUrl": "foaf:name"
  }, {
    "name": "department",
    "titles": "Department Reference",
    "datatype": "string",
    "null": "xx",
    "propertyUrl": "org:subOrganizationOf",
    "valueUrl": "http://example.org/organization/{ref}"
  }],
  "primaryKey": "ref",
  "aboutUrl": "http://example.org/organization/{ref}",
  "foreignKeys": [{
    "columnReference": "department",
    "reference": {
      "schemaReference": "http://example.org/gov.uk/schema/organizations.json",
      "columnReference": "ref"
    }
  }]
}

Example 20: http://example.org/gov.uk/schema/professions.json
{
  "@id": "http://example.org/gov.uk/schema/professions.json",
  "@context": "http://www.w3.org/ns/csvw",
  "columns": [{
    "name": "name",
    "titles": "Profession",
    "datatype": "string",
    "required": true
  }],
  "primaryKey": "name"


}

Example 21: http://example.org/gov.uk/schema/senior-roles.json
{
  "@id": "http://example.org/gov.uk/schema/senior-roles.json",
  "@context": "http://www.w3.org/ns/csvw",
  "columns": [{
    "name": "ref",
    "titles": "Post Unique Reference",
    "datatype": "string",
    "required": true,
    "propertyUrl": "dc:identifier"
  }, {
    "name": "name",
    "titles": "Name",
    "datatype": "string",
    "aboutUrl": "http://example.org/organization/{organizationRef}/person/{_row}",
    "propertyUrl": "foaf:name"
  }, {
    "name": "grade",
    "titles": "Grade",
    "datatype": "string",
    "propertyUrl": "http://example.org/gov.uk/def/grade"
  }, {
    "name": "job",
    "titles": "Job Title",
    "datatype": "string",
    "propertyUrl": "http://example.org/gov.uk/def/job"
  }, {
    "name": "reportsTo",
    "titles": "Reports to Senior Post",
    "datatype": "string",
    "null": "xx",
    "propertyUrl": "org:reportsTo",
    "valueUrl": "http://example.org/organization/{organizationRef}/post/{reportsTo}"
  }, {
    "name": "profession",
    "titles": "Profession",
    "datatype": "string",
    "propertyUrl": "http://example.org/gov.uk/def/profession"
  }, {
    "name": "organizationRef",
    "titles": "Organization Reference",
    "datatype": "string",
    "propertyUrl": "org:postIn",
    "valueUrl": "http://example.org/organization/{organizationRef}",
    "required": true
  }, {
    "name": "post_holder",
    "virtual": true,
    "propertyUrl": "org:heldBy",
    "valueUrl": "http://example.org/organization/{organizationRef}/person/{_row}"
  }],
  "primaryKey": "ref",
  "aboutUrl": "http://example.org/organization/{organizationRef}/post/{ref}",
  "foreignKeys": [{
    "columnReference": "reportsTo",
    "reference": {
      "schemaReference": "http://example.org/gov.uk/schema/senior-roles.json",
      "columnReference": "ref"
    }
  }, {
    "columnReference": "profession",
    "reference": {
      "schemaReference": "http://example.org/gov.uk/schema/professions.json",
      "columnReference": "name"
    }
  }, {
    "columnReference": "organizationRef",
    "reference": {
      "schemaReference": "http://example.org/gov.uk/schema/organizations.json",
      "columnReference": "ref"
    }
  }]


}
Example 22: http://example.org/gov.uk/schema/junior-roles.json
{
  "@id": "http://example.org/gov.uk/schema/junior-roles.json",
  "@context": "http://www.w3.org/ns/csvw",
  "columns": [{
    "name": "reportsToSenior",
    "titles": "Reporting Senior Post",
    "datatype": "string",
    "propertyUrl": "org:reportsTo",
    "valueUrl": "http://example.org/organization/{organizationRef}/post/{reportsToSenior}",
    "required": true
  }, {
    "name": "grade",
    "titles": "Grade",
    "datatype": "string",
    "propertyUrl": "http://example.org/gov.uk/def/grade"
  }, {
    "name": "min_pay",
    "titles": "Payscale Minimum (£)",
    "datatype": "integer",
    "propertyUrl": "http://example.org/gov.uk/def/min_pay"
  }, {
    "name": "max_pay",
    "titles": "Payscale Maximum (£)",
    "datatype": "integer",
    "propertyUrl": "http://example.org/gov.uk/def/max_pay"
  }, {
    "name": "job",
    "titles": "Generic Job Title",
    "datatype": "string",
    "propertyUrl": "http://example.org/gov.uk/def/job"
  }, {
    "name": "number",
    "titles": "Number of Posts (FTE)",
    "datatype": "number",
    "propertyUrl": "http://example.org/gov.uk/def/number_of_posts" 
  }, {
    "name": "profession",
    "titles": "Profession",
    "datatype": "string",
    "propertyUrl": "http://example.org/gov.uk/def/profession"
  }, {
    "name": "organizationRef",
    "titles": "Organization Reference",
    "datatype": "string",
    "propertyUrl": "org:postIn",
    "valueUrl": "http://example.org/organization/{organizationRef}",
    "required": true
  }],
  "foreignKeys": [{
    "columnReference": "reportsToSenior",
    "reference": {
      "schemaReference": "http://example.org/gov.uk/schema/senior-roles.json",
      "columnReference": "ref"
    }
  }, {
    "columnReference": "profession",
    "reference": {
      "schemaReference": "http://example.org/gov.uk/schema/professions.json",
      "columnReference": "name"
    }
  }, {
    "columnReference": "organizationRef",
    "reference": {
      "schemaReference": "http://example.org/gov.uk/schema/organizations.json",
      "columnReference": "ref"
    }
  }]

}
Note

This example makes extensive use of the example.org domain. As described in [ RFC6761 ], this domain is used for illustrative examples within documentation. In reality, the resources described here with the URL path /gov.uk would be centrally published by the UK Government at, say, the domain data.gov.uk .

Given that these resources are centrally published with an aspiration for reuse, the schema descriptions have been factored out into separate resources. As such, the top-level metadata description resource metadata.json simply provides the list of tables and binds each of them to the appropriate schema that is defined elsewhere.

Finally, note that because the centrally published metadata descriptions are intended to be reused across many government departments and organizations, extra consideration has been given to defining URIs for the person and post resources defined in each row of the senior roles tabular data and subsequently referenced from the junior roles tabular data. To ensure that naming clashes are avoided, the unique reference for the organization to which the person or post belongs has been included in a path segment of the identifier. For example, the URI template property aboutUrl used to identify the senior post is specified as http://example.org/organization/{organizationRef}/post/{ref} , thus yielding http://example.org/organization/hefce.ac.uk/post/90115 for the post described in the first row of the senior roles tabular data.
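The expansion described above can be sketched as follows. This is a minimal illustration that assumes only simple {name} variables occur in the template; it is not a full URI Template processor, and the names expand and FIRST_ROW are hypothetical.

```python
import re
from urllib.parse import quote

POST_ABOUT_URL = "http://example.org/organization/{organizationRef}/post/{ref}"
PERSON_ABOUT_URL = "http://example.org/organization/{organizationRef}/person/{_row}"

# Values from the first row of the senior roles tabular data; _row is the row number.
FIRST_ROW = {"ref": "90115", "organizationRef": "hefce.ac.uk", "_row": 1}


def expand(template, variables):
    """Replace each {name} with its percent-encoded value (simple expansion only)."""
    return re.sub(
        r"\{(\w+)\}",
        lambda match: quote(str(variables[match.group(1)]), safe=""),
        template,
    )


post_url = expand(POST_ABOUT_URL, FIRST_ROW)
person_url = expand(PERSON_ABOUT_URL, FIRST_ROW)
```

Because the organization reference forms its own path segment, two departments that reuse the same post reference would still obtain distinct identifiers.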

The group of tables generated from parsing the tabular data files and associated metadata is shown below and provides the basis for the conversion to RDF.

Annotations for the resulting group of tables G and the four tables Ta , Tb , Tc , and Td are shown below.

Group of Tables annotations:

id core annotations
tables
G Ta , Tb , Tc , Td

Table annotations:

id core annotations
url columns rows suppress output foreign keys
Ta http://example.org/gov.uk/data/organizations.csv Ca1 , Ca2 , Ca3 Ra1 , Ra2 true Fa1
Tb http://example.org/gov.uk/data/professions.csv Cb1 Rb1 , Rb2 , Rb3 , Rb4 true
Tc http://example.org/senior-roles.csv Cc1 , Cc2 , Cc3 , Cc4 , Cc5 , Cc6 , Cc7 , Cc8 Rc1 , Rc2 false Fc1 , Fc2 , Fc3
Td http://example.org/junior-roles.csv Cd1 , Cd2 , Cd3 , Cd4 , Cd5 , Cd6 , Cd7 , Cd8 Rd1 , Rd2 false Fd1 , Fd2 , Fd3
Note

In this example, output for the centrally published lists of organizations and professions, tables Ta and Tb ( http://example.org/gov.uk/data/organizations.csv and http://example.org/gov.uk/data/professions.csv respectively), is not required; only information from the departmental submissions is to be translated to RDF. Note the suppress output annotation on these tables.

The following foreign keys are defined:

id columns in table columns in referenced table
Fa1 Ca3 Ca1
Fc1 Cc5 Cc1
Fc2 Cc6 Cb1
Fc3 Cc7 Ca1
Fd1 Cd1 Cc1
Fd2 Cd7 Cb1
Fd3 Cd8 Ca1
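The foreign keys listed above can be checked with a simple lookup: every non-null value in the referencing column should appear in the referenced column. The following is a hedged sketch of such a check, not part of the conversion procedure; the data is reduced to the columns involved, the "xx" token is the null value declared for the department and reportsTo columns, and the function name unresolved is hypothetical.

```python
# Reduced copies of the organizations and senior roles data.
ORGANIZATIONS = [
    {"ref": "hefce.ac.uk", "department": "bis.gov.uk"},
    {"ref": "bis.gov.uk", "department": "xx"},
]
SENIOR_ROLES = [
    {"ref": "90115", "reportsTo": "90334", "organizationRef": "hefce.ac.uk"},
    {"ref": "90334", "reportsTo": "xx", "organizationRef": "hefce.ac.uk"},
]


def unresolved(rows, column, referenced_rows, referenced_column, null_token=None):
    """Return values of `column` that match no value of `referenced_column`.

    Values equal to `null_token` are treated as null and are not checked.
    """
    known = {row[referenced_column] for row in referenced_rows}
    return [
        row[column]
        for row in rows
        if row[column] != null_token and row[column] not in known
    ]


fa1 = unresolved(ORGANIZATIONS, "department", ORGANIZATIONS, "ref", null_token="xx")
fc1 = unresolved(SENIOR_ROLES, "reportsTo", SENIOR_ROLES, "ref", null_token="xx")
fc3 = unresolved(SENIOR_ROLES, "organizationRef", ORGANIZATIONS, "ref")
```

An empty list for each key indicates that the sample data is consistent; a value such as a post reference that does not exist in the senior roles table would be reported instead.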

Annotations for the columns, rows and cells in tables Ta , Tb , Tc and Td are shown in the tables below.

Column annotations:

id core annotations
table number source number cells name titles required virtual
Ca1 Ta 1 1 Ca1.1 , Ca2.1 ref Organization Unique Reference true
Ca2 Ta 2 2 Ca1.2 , Ca2.2 name Organization Name
Ca3 Ta 3 3 Ca1.3 , Ca2.3 department Department Reference
Cb1 Tb 1 1 Cb1.1 , Cb2.1 , Cb3.1 , Cb4.1 name Profession true
Cc1 Tc 1 1 Cc1.1 , Cc2.1 ref Post Unique Reference true
Cc2 Tc 2 2 Cc1.2 , Cc2.2 name Name
Cc3 Tc 3 3 Cc1.3 , Cc2.3 grade Grade
Cc4 Tc 4 4 Cc1.4 , Cc2.4 job Job Title
Cc5 Tc 5 5 Cc1.5 , Cc2.5 reportsTo Reports to Senior Post
Cc6 Tc 6 6 Cc1.6 , Cc2.6 profession Profession
Cc7 Tc 7 7 Cc1.7 , Cc2.7 organizationRef Organization Reference true
Cc8 Tc 8 8 Cc1.8 , Cc2.8 post_holder true
Cd1 Td 1 1 Cd1.1 , Cd2.1 reportsToSenior Reporting Senior Post true
Cd2 Td 2 2 Cd1.2 , Cd2.2 grade Grade
Cd3 Td 3 3 Cd1.3 , Cd2.3 min_pay Payscale Minimum (£)
Cd4 Td 4 4 Cd1.4 , Cd2.4 max_pay Payscale Maximum (£)
Cd5 Td 5 5 Cd1.5 , Cd2.5 job Generic Job Title
Cd6 Td 6 6 Cd1.6 , Cd2.6 number Number of Posts (FTE)
Cd7 Td 7 7 Cd1.7 , Cd2.7 profession Profession
Cd8 Td 8 8 Cd1.8 , Cd2.8 organizationRef Organization Reference true
Note

Column Cc8 , with the virtual annotation specified as true , is used to relate the person resource, whose name is provided in column Cc2 , to the associated post resource within the current row of table Tc ( { "url": "http://example.org/senior-roles.csv" } ).

Row annotations:

id core annotations
table number source number cells
Ra1 Ta 1 2 Ca1.1 , Ca1.2 , Ca1.3
Ra2 Ta 2 3 Ca2.1 , Ca2.2 , Ca2.3
Rb1 Tb 1 2 Cb1.1
Rb2 Tb 2 3 Cb2.1
Rb3 Tb 3 4 Cb3.1
Rb4 Tb 4 5 Cb4.1
Rc1 Tc 1 2 Cc1.1 , Cc1.2 , Cc1.3 , Cc1.4 , Cc1.5 , Cc1.6 , Cc1.7 , Cc1.8
Rc2 Tc 2 3 Cc2.1 , Cc2.2 , Cc2.3 , Cc2.4 , Cc2.5 , Cc2.6 , Cc2.7 , Cc2.8
Rd1 Td 1 2 Cd1.1 , Cd1.2 , Cd1.3 , Cd1.4 , Cd1.5 , Cd1.6 , Cd1.7 , Cd1.8
Rd2 Td 2 3 Cd2.1 , Cd2.2 , Cd2.3 , Cd2.4 , Cd2.5 , Cd2.6 , Cd2.7 , Cd2.8
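
In the row annotations above, each source number is one greater than the row number, which is consistent with each source file starting with a single header line. The following is a minimal Python sketch of that relationship, assuming exactly one header line and the RFC 7111 style `row=` fragment that appears in the `csvw:url` values of the standard mode output for this example. The helper names `source_number` and `row_url` are hypothetical and are not defined by this specification.

```python
def source_number(row_number: int, header_rows: int = 1) -> int:
    """Position of the row in the source file, counting header lines (assumed: 1)."""
    return row_number + header_rows


def row_url(table_url: str, row_number: int, header_rows: int = 1) -> str:
    """Build a row URL by appending an RFC 7111 style `row=` fragment."""
    return f"{table_url}#row={source_number(row_number, header_rows)}"


if __name__ == "__main__":
    # Row Rc1 of table Tc: row number 1, source number 2.
    print(row_url("http://example.org/senior-roles.csv", 1))
```

A file with a different number of header lines would need a different `header_rows` value; the sketch does not attempt to detect this.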

Cell annotations:

id core annotations
table column row string value value about URL property URL value URL
Ca1.1 Ta Ca1 Ra1 "hefce.ac.uk" "hefce.ac.uk" <http://example.org/organization/hefce.ac.uk> dc:identifier
Ca1.2 Ta Ca2 Ra1 "Higher Education Funding Council for England" "Higher Education Funding Council for England" <http://example.org/organization/hefce.ac.uk> foaf:name
Ca1.3 Ta Ca3 Ra1 "bis.gov.uk" "bis.gov.uk" <http://example.org/organization/hefce.ac.uk> org:subOrganizationOf <http://example.org/organization/bis.gov.uk>
Ca2.1 Ta Ca1 Ra2 "bis.gov.uk" "bis.gov.uk" <http://example.org/organization/bis.gov.uk> dc:identifier
Ca2.2 Ta Ca2 Ra2 "Department for Business, Innovation and Skills" "Department for Business, Innovation and Skills" <http://example.org/organization/bis.gov.uk> foaf:name
Ca2.3 Ta Ca3 Ra2 "xx" null <http://example.org/organization/bis.gov.uk> org:subOrganizationOf
Cb1.1 Tb Cb1 Rb1 "Finance" "Finance"
Cb2.1 Tb Cb1 Rb2 "Information Technology" "Information Technology"
Cb3.1 Tb Cb1 Rb3 "Operational Delivery" "Operational Delivery"
Cb4.1 Tb Cb1 Rb4 "Policy" "Policy"
Cc1.1 Tc Cc1 Rc1 "90115" "90115" <http://example.org/organization/hefce.ac.uk/post/90115> dc:identifier
Cc1.2 Tc Cc2 Rc1 "Steve Egan" "Steve Egan" <http://example.org/organization/hefce.ac.uk/person/1> foaf:name
Cc1.3 Tc Cc3 Rc1 "SCS1A" "SCS1A" <http://example.org/organization/hefce.ac.uk/post/90115> <http://example.org/gov.uk/def/grade>
Cc1.4 Tc Cc4 Rc1 "Deputy Chief Executive" "Deputy Chief Executive" <http://example.org/organization/hefce.ac.uk/post/90115> <http://example.org/gov.uk/def/job>
Cc1.5 Tc Cc5 Rc1 "90334" "90334" <http://example.org/organization/hefce.ac.uk/post/90115> org:reportsTo <http://example.org/organization/hefce.ac.uk/post/90334>
Cc1.6 Tc Cc6 Rc1 "Finance" "Finance" <http://example.org/organization/hefce.ac.uk/post/90115> <http://example.org/gov.uk/def/profession>
Cc1.7 Tc Cc7 Rc1 "hefce.ac.uk" "hefce.ac.uk" <http://example.org/organization/hefce.ac.uk/post/90115> org:postIn <http://example.org/organization/hefce.ac.uk>
Cc1.8 Tc Cc8 Rc1 "" null <http://example.org/organization/hefce.ac.uk/post/90115> org:heldBy <http://example.org/organization/hefce.ac.uk/person/1>
Cc2.1 Tc Cc1 Rc2 "90334" "90334" <http://example.org/organization/hefce.ac.uk/post/90334> dc:identifier
Cc2.2 Tc Cc2 Rc2 "Sir Alan Langlands" "Sir Alan Langlands" <http://example.org/organization/hefce.ac.uk/person/2> foaf:name
Cc2.3 Tc Cc3 Rc2 "SCS4" "SCS4" <http://example.org/organization/hefce.ac.uk/post/90334> <http://example.org/gov.uk/def/grade>
Cc2.4 Tc Cc4 Rc2 "Chief Executive" "Chief Executive" <http://example.org/organization/hefce.ac.uk/post/90334> <http://example.org/gov.uk/def/job>
Cc2.5 Tc Cc5 Rc2 "xx" null <http://example.org/organization/hefce.ac.uk/post/90334> org:reportsTo
Cc2.6 Tc Cc6 Rc2 "Policy" "Policy" <http://example.org/organization/hefce.ac.uk/post/90334> <http://example.org/gov.uk/def/profession>
Cc2.7 Tc Cc7 Rc2 "hefce.ac.uk" "hefce.ac.uk" <http://example.org/organization/hefce.ac.uk/post/90334> org:postIn <http://example.org/organization/hefce.ac.uk>
Cc2.8 Tc Cc8 Rc2 "" null <http://example.org/organization/hefce.ac.uk/post/90334> org:heldBy <http://example.org/organization/hefce.ac.uk/person/2>
Cd1.1 Td Cd1 Rd1 "90115" "90115" org:reportsTo <http://example.org/organization/hefce.ac.uk/post/90115>
Cd1.2 Td Cd2 Rd1 "4" "4" <http://example.org/gov.uk/def/grade>
Cd1.3 Td Cd3 Rd1 "17426" 17426 <http://example.org/gov.uk/def/min_pay>
Cd1.4 Td Cd4 Rd1 "20002" 20002 <http://example.org/gov.uk/def/max_pay>
Cd1.5 Td Cd5 Rd1 "Administrator" "Administrator" <http://example.org/gov.uk/def/job>
Cd1.6 Td Cd6 Rd1 "8.67" 8.67 <http://example.org/gov.uk/def/number_of_posts>
Cd1.7 Td Cd7 Rd1 "Operational Delivery" "Operational Delivery" <http://example.org/gov.uk/def/profession>
Cd1.8 Td Cd8 Rd1 "hefce.ac.uk" "hefce.ac.uk" org:postIn <http://example.org/organization/hefce.ac.uk>
Cd2.1 Td Cd1 Rd2 "90115" "90115" org:reportsTo <http://example.org/organization/hefce.ac.uk/post/90115>
Cd2.2 Td Cd2 Rd2 "5" "5" <http://example.org/gov.uk/def/grade>
Cd2.3 Td Cd3 Rd2 "19546" 19546 <http://example.org/gov.uk/def/min_pay>
Cd2.4 Td Cd4 Rd2 "22478" 22478 <http://example.org/gov.uk/def/max_pay>
Cd2.5 Td Cd5 Rd2 "Administrator" "Administrator" <http://example.org/gov.uk/def/job>
Cd2.6 Td Cd6 Rd2 "0.5" 0.5 <http://example.org/gov.uk/def/number_of_posts>
Cd2.7 Td Cd7 Rd2 "Operational Delivery" "Operational Delivery" <http://example.org/gov.uk/def/profession>
Cd2.8 Td Cd8 Rd2 "hefce.ac.uk" "hefce.ac.uk" org:postIn <http://example.org/organization/hefce.ac.uk>
Note

Notice that value URL is not specified for cells Ca2.3 and Cc2.5 because in each case the cell value is null and the virtual annotation of column Cb5 is not defined.
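
The cell annotations for Ca2.3 and Cc2.5 show a string value of "xx" mapping to a null value with no value URL. A minimal Python sketch of that behaviour follows, assuming that "xx" is the null marker for those columns (as the table suggests) and that a value URL can be approximated by a prefix plus the cell value. The function `cell_object` and its `value_url_prefix` parameter are hypothetical simplifications and do not perform real URI Template expansion.

```python
from typing import Optional


def cell_object(string_value: str,
                null_marker: str,
                value_url_prefix: Optional[str] = None) -> Optional[str]:
    """Return the Turtle object for a cell, or None when the cell is null."""
    if string_value == null_marker:
        # Null cell: no value URL is produced, so no triple can be emitted.
        return None
    if value_url_prefix is not None:
        # Stand-in for value URL expansion (assumed: prefix + cell value).
        return f"<{value_url_prefix}{string_value}>"
    # No value URL: the cell value is used as a plain literal.
    return f'"{string_value}"'


if __name__ == "__main__":
    prefix = "http://example.org/organization/hefce.ac.uk/post/"
    print(cell_object("90334", "xx", prefix))  # cell Cc1.5
    print(cell_object("xx", "xx", prefix))     # cell Cc2.5, null
```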

Minimal mode output for this example is provided in Turtle [turtle] syntax below:

Example 23: http://example.org/public-sector-roles-and-salaries-minimal.ttl
@prefix dc: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.org/organization/hefce.ac.uk/post/90115>

  dc:identifier "90115" ;
  org:heldBy <http://example.org/organization/hefce.ac.uk/person/1> ;
  <http://example.org/gov.uk/def/grade> "SCS1A" ;
  <http://example.org/gov.uk/def/job> "Deputy Chief Executive" ;

  org:reportsTo <http://example.org/organization/hefce.ac.uk/post/90334> ;
  <http://example.org/gov.uk/def/profession> "Finance" ;

  org:postIn <http://example.org/organization/hefce.ac.uk> .

<http://example.org/organization/hefce.ac.uk/person/1>

  foaf:name "Steve Egan" .

<http://example.org/organization/hefce.ac.uk/post/90334>

  dc:identifier "90334" ;
  org:heldBy <http://example.org/organization/hefce.ac.uk/person/2> ;
  <http://example.org/gov.uk/def/grade> "SCS4" ;
  <http://example.org/gov.uk/def/job> "Chief Executive" ;
  <http://example.org/gov.uk/def/profession> "Policy" ;

  org:postIn <http://example.org/organization/hefce.ac.uk> .

<http://example.org/organization/hefce.ac.uk/person/2>

  foaf:name "Sir Alan Langlands" .

_:d8b8e40c-8c74-458b-99f7-64d1cf5c65f2
  org:reportsTo <http://example.org/organization/hefce.ac.uk/post/90115> ;
  <http://example.org/gov.uk/def/grade> "4" ;
  <http://example.org/gov.uk/def/min_pay> "17426"^^xsd:integer ;
  <http://example.org/gov.uk/def/max_pay> "20002"^^xsd:integer ;
  <http://example.org/gov.uk/def/job> "Administrator" ;
  <http://example.org/gov.uk/def/number_of_posts> "8.67"^^xsd:double ;
  <http://example.org/gov.uk/def/profession> "Operational Delivery" ;

  org:postIn <http://example.org/organization/hefce.ac.uk> .

_:fa1fa954-dd5f-4aa1-b2bc-20bf9867fac6
  org:reportsTo <http://example.org/organization/hefce.ac.uk/post/90115> ;
  <http://example.org/gov.uk/def/grade> "5" ;
  <http://example.org/gov.uk/def/min_pay> "19546"^^xsd:integer ;
  <http://example.org/gov.uk/def/max_pay> "22478"^^xsd:integer ;
  <http://example.org/gov.uk/def/job> "Administrator" ;
  <http://example.org/gov.uk/def/number_of_posts> "0.5"^^xsd:double ;
  <http://example.org/gov.uk/def/profession> "Operational Delivery" ;
  org:postIn <http://example.org/organization/hefce.ac.uk> .

Note

Output for tables Ta and Tb ({ "url": "http://example.org/gov.uk/data/organizations.csv" } and { "url": "http://example.org/gov.uk/data/professions.csv" }) is not included because the suppress output annotation is true.

The property URL is specified for all cells in tables Tc and Td.

Columns Cc5 and Cd1 ({ "name": "reportsTo" } and { "name": "reportsToSenior" }) use the about URL, property URL and value URL annotations to assert the relationship between the given post and the senior post it reports to for the cells therein.

Similarly, columns Cc7 and Cd8 (both with { "name": "organizationRef" }) use the about URL, property URL and value URL annotations to assert the relationship between the given post and the organization to which it belongs for the cells in those columns.

Finally, note that two resources are created for each row within table Tc ({ "url": "http://example.org/senior-roles.csv" }): the person and the post they occupy. The relationship between these resources is specified via virtual column Cc8 ({ "name": "post_holder" }) using the about URL, property URL and value URL annotations.
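
The way a single row of table Tc yields two related resources can be sketched in Python as follows. This is an illustration under stated assumptions, not the normative procedure: the function `senior_role_triples` is hypothetical, the person identifier is assumed to be derived from the row number (as the about URLs in the cell annotations suggest), "xx" is assumed to be the null marker for the reportsTo column, and predicates are kept as prefixed names for brevity.

```python
ORG = "http://example.org/organization/hefce.ac.uk"


def senior_role_triples(row_number, ref, name, reports_to, null_marker="xx"):
    """Return (subject, predicate, object) tuples for one row of table Tc."""
    post = f"{ORG}/post/{ref}"             # about URL of the post resource
    person = f"{ORG}/person/{row_number}"  # about URL of the person (assumed)
    triples = [
        (post, "dc:identifier", ref),
        (person, "foaf:name", name),
        # Virtual column post_holder: no source cell, only the three URLs.
        (post, "org:heldBy", person),
    ]
    if reports_to != null_marker:
        # The value URL links this post to the senior post it reports to.
        triples.append((post, "org:reportsTo", f"{ORG}/post/{reports_to}"))
    return triples


if __name__ == "__main__":
    for triple in senior_role_triples(1, "90115", "Steve Egan", "90334"):
        print(triple)
```

Only a subset of the columns is shown; grade, job, profession and organizationRef would follow the same pattern with the post as subject.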

Standard mode output for this example is provided in Turtle [turtle] syntax below:

Example 24: http://example.org/public-sector-roles-and-salaries-standard.ttl
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

_:3d36cfbb-d2d5-4573-a1a7-3bf817062db8 a csvw:TableGroup ;
  csvw:table [ a csvw:Table ;
    csvw:url <http://example.org/senior-roles.csv> ;
    csvw:row [ a csvw:Row ;
      csvw:rownum "1"^^xsd:integer ;
      csvw:url <http://example.org/senior-roles.csv#row=2> ;
      csvw:describes <http://example.org/organization/hefce.ac.uk/post/90115>, <http://example.org/organization/hefce.ac.uk/person/1>
    ], [ a csvw:Row ;

      csvw:rownum "2"^^xsd:integer ;
      csvw:url <http://example.org/senior-roles.csv#row=3> ;
      csvw:describes <http://example.org/organization/hefce.ac.uk/post/90334>, <http://example.org/organization/hefce.ac.uk/person/2>
    ]
  ], [ a csvw:Table ;

    csvw:url <http://example.org/junior-roles.csv> ;
    csvw:row [ a csvw:Row ;
      csvw:rownum "1"^^xsd:integer ;
      csvw:url <http://example.org/junior-roles.csv#row=2> ;
      csvw:describes _:d8b8e40c-8c74-458b-99f7-64d1cf5c65f2
    ], [ a csvw:Row ;

      csvw:rownum "2"^^xsd:integer ;
      csvw:url <http://example.org/junior-roles.csv#row=3> ;
      csvw:describes _:fa1fa954-dd5f-4aa1-b2bc-20bf9867fac6
    ]
  ] .


<http://example.org/organization/hefce.ac.uk/post/90115>

  dc:identifier "90115" ;
  org:heldBy <http://example.org/organization/hefce.ac.uk/person/1> ;
  <http://example.org/gov.uk/def/grade> "SCS1A" ;
  <http://example.org/gov.uk/def/job> "Deputy Chief Executive" ;

  org:reportsTo <http://example.org/organization/hefce.ac.uk/post/90334> ;
  <http://example.org/gov.uk/def/profession> "Finance" ;

  org:postIn <http://example.org/organization/hefce.ac.uk> .

<http://example.org/organization/hefce.ac.uk/person/1>

  foaf:name "Steve Egan" .

<http://example.org/organization/hefce.ac.uk/post/90334>

  dc:identifier "90334" ;
  org:heldBy <http://example.org/organization/hefce.ac.uk/person/2> ;
  <http://example.org/gov.uk/def/grade> "SCS4" ;
  <http://example.org/gov.uk/def/job> "Chief Executive" ;
  <http://example.org/gov.uk/def/profession> "Policy" ;

  org:postIn <http://example.org/organization/hefce.ac.uk> .

<http://example.org/organization/hefce.ac.uk/person/2>

  foaf:name "Sir Alan Langlands" .

_:d8b8e40c-8c74-458b-99f7-64d1cf5c65f2
  org:reportsTo <http://example.org/organization/hefce.ac.uk/post/90115> ;
  <http://example.org/gov.uk/def/grade> "4" ;
  <http://example.org/gov.uk/def/min_pay> "17426"^^xsd:integer ;
  <http://example.org/gov.uk/def/max_pay> "20002"^^xsd:integer ;
  <http://example.org/gov.uk/def/job> "Administrator" ;
  <http://example.org/gov.uk/def/number_of_posts> "8.67"^^xsd:double ;
  <http://example.org/gov.uk/def/profession> "Operational Delivery" ;

  org:postIn <http://example.org/organization/hefce.ac.uk> .

_:fa1fa954-dd5f-4aa1-b2bc-20bf9867fac6
  org:reportsTo <http://example.org/organization/hefce.ac.uk/post/90115> ;
  <http://example.org/gov.uk/def/grade> "5" ;
  <http://example.org/gov.uk/def/min_pay> "19546"^^xsd:integer ;
  <http://example.org/gov.uk/def/max_pay> "22478"^^xsd:integer ;
  <http://example.org/gov.uk/def/job> "Administrator" ;
  <http://example.org/gov.uk/def/number_of_posts> "0.5"^^xsd:double ;
  <http://example.org/gov.uk/def/profession> "Operational Delivery" ;
  org:postIn <http://example.org/organization/hefce.ac.uk> .

Note

Table group G was explicitly defined, but has not been explicitly identified; the table group and table resources are treated as blank nodes.

The person and post resources described by each row of table Tc ({ "url": "http://example.org/senior-roles.csv" }) are explicitly defined using the aboutUrl property; therefore, for row Rc1, say, we state [] csvw:describes <http://example.org/organization/hefce.ac.uk/post/90115>, <http://example.org/organization/hefce.ac.uk/person/1> . By contrast, the aboutUrl property has not been defined for the resources described by each row of table Td ({ "url": "http://example.org/junior-roles.csv" }), so blank nodes are used; e.g. for row Rd1 we state [] csvw:describes _:d8b8e40c-8c74-458b-99f7-64d1cf5c65f2 .
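
The choice between a named resource and a blank node can be sketched in Python as follows. This is a simplified illustration: `row_subject` is a hypothetical helper, the template variable `ref` is an assumed column name, Python's `str.format` stands in for URI Template expansion of simple `{name}` variables only, and a fresh UUID is used to label each blank node in the same style as the examples.

```python
import uuid
from typing import Optional


def row_subject(about_url_template: Optional[str], row: dict) -> str:
    """Return the Turtle term for the resource described by a row."""
    if about_url_template is None:
        # No aboutUrl defined: mint a blank node with a unique label.
        return f"_:{uuid.uuid4()}"
    # aboutUrl defined: fill the template from the row's cell values.
    return "<" + about_url_template.format(**row) + ">"


if __name__ == "__main__":
    template = "http://example.org/organization/hefce.ac.uk/post/{ref}"
    print(row_subject(template, {"ref": "90115"}))  # named post resource
    print(row_subject(None, {"grade": "4"}))        # blank node
```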

A. Acknowledgements

At the time of publication, the following individuals had participated in the Working Group, in the order of their first name: Adam Retter, Alf Eaton, Anastasia Dimou, Andy Seaborne, Axel Polleres, Christopher Gutteridge, Dan Brickley, Davide Ceolin, Eric Stephan, Erik Mannens, Gregg Kellogg, Ivan Herman, Jeni Tennison, Jeremy Tandy, Jürgen Umbrich, Rufus Pollock, Stasinos Konstantopoulos, William Ingram, and Yakov Shafranovich.

B. Changes since the first public working draft of 08 January 2015

The document has undergone substantial changes since the first public working draft. Below are some of the changes made:

C. References

C.1 Normative references

[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[iso8601]
ISO 8601:2004 Representation of dates and times. International Standard (IS).
[json-ld-api]
Markus Lanthaler; Gregg Kellogg; Manu Sporny. JSON-LD 1.0 Processing Algorithms and API. 16 January 2014. W3C Recommendation. URL: http://www.w3.org/TR/json-ld-api/
[rdf11-concepts]
Richard Cyganiak; David Wood; Markus Lanthaler. RDF 1.1 Concepts and Abstract Syntax. 25 February 2014. W3C Recommendation. URL: http://www.w3.org/TR/rdf11-concepts/
[rdfa-core]
Ben Adida; Mark Birbeck; Shane McCarron; Ivan Herman et al. RDFa Core 1.1 - Third Edition. 16 December 2014. W3C Proposed Edited Recommendation. URL: http://www.w3.org/TR/rdfa-core/
[tabular-data-model]
Jeni Tennison; Gregg Kellogg. Model for Tabular Data and Metadata on the Web. W3C Working Draft. URL: http://www.w3.org/TR/2015/WD-tabular-data-model-20150416/
[tabular-metadata]
Jeni Tennison; Gregg Kellogg. Metadata Vocabulary for Tabular Data. W3C Working Draft. URL: http://www.w3.org/TR/2015/WD-tabular-metadata-20150416/
[tr35]
Mark Davis; CLDR committee members. TR35, Unicode Locale Data Markup Language (LDML). Report. URL: http://unicode.org/reports/tr35/

C.2 Informative references

[RFC6570]
J. Gregorio; R. Fielding; M. Hadley; M. Nottingham; D. Orchard. URI Template. March 2012. Proposed Standard. URL: https://tools.ietf.org/html/rfc6570
[RFC6761]
S. Cheshire; M. Krochmal. Special-Use Domain Names. February 2013. Proposed Standard. URL: https://tools.ietf.org/html/rfc6761
[RFC7111]
M. Hausenblas; E. Wilde; J. Tennison. URI Fragment Identifiers for the text/csv Media Type. January 2014. Informational. URL: https://tools.ietf.org/html/rfc7111
[csvw-context]
Gregg Kellogg. Metadata Vocabulary for Tabular Data. URL: http://www.w3.org/ns/csvw
[json-ld]
Manu Sporny; Gregg Kellogg; Markus Lanthaler. JSON-LD 1.0. 16 January 2014. W3C Recommendation. URL: http://www.w3.org/TR/json-ld/
[n-triples]
Gavin Carothers; Andy Seaborne. RDF 1.1 N-Triples. 25 February 2014. W3C Recommendation. URL: http://www.w3.org/TR/n-triples/
[prov-o]
Timothy Lebo; Satya Sahoo; Deborah McGuinness. PROV-O: The PROV Ontology. 30 April 2013. W3C Recommendation. URL: http://www.w3.org/TR/prov-o/
[r2rml]
Souripriya Das; Seema Sundara; Richard Cyganiak. R2RML: RDB to RDF Mapping Language. 27 September 2012. W3C Recommendation. URL: http://www.w3.org/TR/r2rml/
[rdf-schema]
Dan Brickley; Ramanathan Guha. RDF Schema 1.1. 25 February 2014. W3C Recommendation. URL: http://www.w3.org/TR/rdf-schema/
[rdfa-primer]
Ivan Herman; Ben Adida; Manu Sporny; Mark Birbeck. RDFa 1.1 Primer - Third Edition. 17 March 2015. W3C Note. URL: http://www.w3.org/TR/rdfa-primer/
[rfc3986]
T. Berners-Lee; R. Fielding; L. Masinter. Uniform Resource Identifier (URI): Generic Syntax. January 2005. Internet Standard. URL: https://tools.ietf.org/html/rfc3986
[sparql11-query]
Steven Harris; Andy Seaborne. SPARQL 1.1 Query Language. 21 March 2013. W3C Recommendation. URL: http://www.w3.org/TR/sparql11-query/
[trig]
Gavin Carothers; Andy Seaborne. RDF 1.1 TriG. 25 February 2014. W3C Recommendation. URL: http://www.w3.org/TR/trig/
[turtle]
Eric Prud'hommeaux; Gavin Carothers. RDF 1.1 Turtle. 25 February 2014. W3C Recommendation. URL: http://www.w3.org/TR/turtle/
[vocab-dcat]
Fadi Maali; John Erickson. Data Catalog Vocabulary (DCAT). 16 January 2014. W3C Recommendation. URL: http://www.w3.org/TR/vocab-dcat/