Abstract

Validation, conversion, display, and search of tabular data on the web require additional metadata that describes how the data should be interpreted. This document defines a vocabulary for metadata that annotates tabular data. This can be used to provide metadata at various levels, from groups of tables and how they relate to each other down to individual cells within a table.

The metadata defined in this specification is used to provide annotations on an annotated table or group of tables, as defined in [tabular-data-model]. Annotated tables form the basis for all further processing, such as validating, converting, or displaying the tables.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

The CSV on the Web Working Group was chartered to produce a Recommendation "Access methods for CSV Metadata" as well as Recommendations for "Metadata vocabulary for CSV data" and "Mapping mechanism to transforming CSV into various Formats (e.g., RDF, JSON, or XML)". This document aims to primarily satisfy the second of those Recommendations.

This document was published by the CSV on the Web Working Group as a Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-csv-wg@w3.org ( subscribe , archives ). All comments are welcome.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy . W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy .

This document is governed by the 1 August 2014 W3C Process Document .

1. Introduction

Interpreting tabular data that is available on the web, particularly as CSV, usually requires additional metadata. As an example, say that the following CSV file were available at http://example.org/tree-ops.csv:

Example 1: http://example.org/tree-ops.csv
GID,On Street,Species,Trim Cycle,Inventory Date
1,ADDISON AV,Celtis australis,Large Tree Routine Prune,10/18/2010
2,EMERSON ST,Liquidambar styraciflua,Large Tree Routine Prune,6/2/2010

A human consumer of this data might be able to figure out the meaning of the different columns, particularly if there were some additional human-readable documentation made available. Automated processors would have a much harder time; realistically they would be limited to displaying the information in a table. Making available machine-readable metadata helps with the interpretation of the tabular data. For example, say that the following metadata file were available at http://example.org/tree-ops.csv-metadata.json:

Example 2: http://example.org/tree-ops.csv-metadata.json
{
  "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
  "url": "tree-ops.csv",
  "dc:title": "Tree Operations",
  "dcat:keyword": ["tree", "street", "maintenance"],
  "dc:publisher": {
    "schema:name": "Example Municipality",
    "schema:url": {"@id": "http://example.org"}
  },
  "dc:license": {"@id": "http://opendefinition.org/licenses/cc-by/"},
  "dc:modified": {"@value": "2010-12-31", "@type": "xsd:date"},
  "tableSchema": {
    "columns": [{
      "name": "GID",
      "titles": ["GID", "Generic Identifier"],
      "dc:description": "An identifier for the operation on a tree.",
      "datatype": "string",
      "required": true
    }, {
      "name": "on_street",
      "titles": "On Street",
      "dc:description": "The street that the tree is on.",
      "datatype": "string"
    }, {
      "name": "species",
      "titles": "Species",
      "dc:description": "The species of the tree.",
      "datatype": "string"
    }, {
      "name": "trim_cycle",
      "titles": "Trim Cycle",
      "dc:description": "The operation performed on the tree.",
      "datatype": "string"
    }, {
      "name": "inventory_date",
      "titles": "Inventory Date",
      "dc:description": "The date of the operation that was performed.",
      "datatype": {"base": "date", "format": "M/d/yyyy"}
    }],
    "primaryKey": "GID",
    "aboutUrl": "#gid-{GID}"
  }
}

Given the location of the CSV file, this metadata document can be located by appending -metadata.json to the URL (as described in [tabular-data-model]). It provides information for different types of applications:

Note

Implementations may fulfil one or more of these functions. In particular, Converters may or may not act as a Validator (perhaps through the setting of a flag), and check the data that they are converting to ensure that it is compliant with the schema. If a Converter does not also act as a Validator it may produce invalid output.
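
The location convention mentioned in the introduction, appending -metadata.json to the URL of the CSV file, can be illustrated with a minimal Python sketch. The function name metadata_url_for is hypothetical, and the sketch covers only this one convention; [tabular-data-model] remains the authority on how metadata is located.

```python
def metadata_url_for(csv_url: str) -> str:
    """Return the conventional metadata location for a CSV file URL.

    Hypothetical helper: it only appends the "-metadata.json" suffix
    described in the introduction and does no network access.
    """
    return csv_url + "-metadata.json"


if __name__ == "__main__":
    print(metadata_url_for("http://example.org/tree-ops.csv"))
```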

[tabular-data-model] defines an annotated tabular data model in which groups of tables, individual tables, columns, rows, and cells can carry annotations. That specification also describes how to locate metadata about a given tabular data file.

This document defines the format and structure of metadata documents, and how these are interpreted to create an annotated tabular data model. It also defines how to validate tabular data based on some of these annotations.

2. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY , MUST , MUST NOT , SHOULD , and SHOULD NOT are to be interpreted as described in [ RFC2119 ].

The metadata format is based on a dialect of [JSON-LD] as defined in section A. JSON-LD Dialect. This metadata can therefore be expressed as an RDF graph. However, it is not necessary for conformant applications to be able to process all JSON-LD, only the dialect defined in this specification. All applications that conform to this specification (including validators and applications that read or convert tabular data) MUST read the JSON-based format described in this document.

Tabular data MUST conform to the description from [tabular-data-model]. In particular, note that each row MUST contain the same number of cells (although some of these cells may be empty). Parsers might not be able to map all CSV-encoded data to such a table. As such, the metadata format described in this specification cannot be applied to all CSV files.

This specification makes use of the compact IRI syntax; please refer to Compact IRIs in [JSON-LD].

This specification makes use of the following namespaces:

csvw :
http://www.w3.org/ns/csvw#
dc :
http://purl.org/dc/terms/
dcat :
http://www.w3.org/ns/dcat#
foaf :
http://xmlns.com/foaf/0.1/
rdf :
http://www.w3.org/1999/02/22-rdf-syntax-ns#
schema :
http://schema.org/
xsd :
http://www.w3.org/2001/XMLSchema#

3. Typographical conventions

The following typographic conventions are used in this specification:

markup
Markup (elements, attributes, properties), machine processable values (string, characters, media types), property name, or a file name is in red-orange monospace font.
variable
A variable in pseudo-code or in an algorithm description is in italics.
definition
A definition of a term, to be used elsewhere in this or other specifications, is in bold and italics.
definition reference
A reference to a definition in this document is underlined and is also an active link to the definition itself.
markup definition reference
A reference to a definition in this document, when the reference itself is also a markup, is underlined, red-orange monospace font, and is also an active link to the definition itself.
external definition reference
A reference to a definition in another document is underlined, in italics, and is also an active link to the definition itself.
markup external definition reference
A reference to a definition in another document, when the reference itself is also a markup, is underlined, in italics red-orange monospace font, and is also an active link to the definition itself.
hyperlink
A hyperlink is underlined and in blue.
[reference]
A document reference (normative or informative) is enclosed in square brackets and links to the references section.
Note

Notes are in light green boxes with a green left border and with a "Note" header in green. Notes are normative or informative depending on whether they are in a normative or informative section, respectively.

Example 3
Examples are in light khaki boxes, with khaki left border, and with a numbered "Example" header in khaki. Examples are always informative. The content of the example is in monospace font and may be syntax colored.

4. Annotating Tables

The metadata defined in this specification is used to provide annotations on an annotated table or group of tables, as defined in [tabular-data-model]. Annotated tables form the basis for all further processing, such as validating, converting, or displaying the tables.

All compliant applications MUST create annotated tables based on the algorithm defined here. All compliant applications MUST generate errors and stop processing if a metadata document:

Compliant applications MUST ignore properties (aside from common properties) which are not defined in this specification and MUST generate a warning when they are encountered.

If a property has a value that is not permitted by this specification, then if a default value is provided for that property, compliant applications MUST use that default value and MUST generate a warning. If no default value is provided for that property, compliant applications MUST generate a warning and behave as if the property had not been specified.

Metadata documents contain descriptions of groups of tables, tables, columns, rows, and cells, which are used to create annotations on an annotated tabular data model. A description object is a JSON object that describes a component of the annotated tabular data model (a group of tables, a table or a column) and has one or more properties that are mapped into properties on that component. There are two types of description objects:

The description objects themselves contain a number of properties. These are:

For example, in the column description

Example 4
{
  "name": "inventory_date",
  "titles": "Inventory Date",
  "dc:description": "The date of the operation that was performed.",
  "datatype": {
    "base": "date",
    "format": "M/d/yyyy"
  }
}

the properties name, titles, and dc:description are used to create the name, titles, and dc:description annotations on the column in the data model. The datatype property is an inherited property: it provides the datatype annotation on the column and also affects the value of each cell in that column (see section 5.7 Inherited Properties).

5. Metadata Format

This section defines a set of properties and permitted values for annotating tabular data, and how these properties should be interpreted by applications.

A metadata document is a JSON document which holds an object at the top level. This object is a description object of either a group of tables or a single table. A metadata document may contain other referenced or embedded description objects for tables and columns. Additional JSON objects, not part of the annotated tabular data model, are used to describe schemas, dialect descriptions, foreign key definitions and transformation definitions.

Metadata Properties
Fig. 1 Diagram showing the properties of different metadata descriptors (see the diagram in SVG or PNG formats)

5.1 Property Syntax

There are different types of properties on description objects:

5.1.1 Array Properties

Array properties hold an array of one or more objects, which are usually description objects.

For example, the tables property is an array property. A table group description might contain:

"tables": [{
  "url": "https://example.org/countries.csv",
  "tableSchema": "https://example.org/countries.json"
}, {
  "url": "https://example.org/country_slice.csv",
  "tableSchema": "https://example.org/country_slice.json"
}]

in which case the tables property has a value that is an array of two table description objects.

Any items within an array that are not valid objects of the type expected are ignored. If the supplied value of an array property is not an array (eg if it is an integer), compliant applications MUST issue a warning and proceed as if the property had been supplied with an empty array.
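
The handling of malformed array properties described above can be sketched in Python as follows. The function name normalize_array_property is hypothetical, and the sketch makes a simplifying assumption: an item counts as a "valid object of the type expected" as soon as it is a JSON object, whereas a real processor would also check it against the relevant description type.

```python
import warnings


def normalize_array_property(name, value):
    """Return the usable items of an array property (hypothetical helper).

    A non-array value triggers a warning and is treated as an empty array.
    Assumption: any JSON object (Python dict) counts as a valid item.
    """
    if not isinstance(value, list):
        warnings.warn(
            f"{name}: expected an array, got {type(value).__name__}; "
            "proceeding with an empty array"
        )
        return []
    # Items that are not objects are silently ignored.
    return [item for item in value if isinstance(item, dict)]
```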

5.1.3 URI Template Properties

URI template properties contain a [URI-TEMPLATE] which can be used to generate a URI. These URI templates are expanded in the context of each row by combining the template with a set of variables with values as defined in [URI-TEMPLATE]. The variables that are set are:

column names
a variable is set for each column within the schema; the name of the variable is the name of the column from the annotated table and the value is derived from the value of the cell in that column in the row that is currently being processed, namely one of:
Note

The languages of cell values are ignored.

_column
_column is set to the column number of the column from the annotated table that is currently being processed
_sourceColumn
_sourceColumn is set to the source number of the column that is currently being processed; this usually varies from _column by skip columns
_row
_row is set to the row number of the row from the annotated table that is currently being processed
_sourceRow
_sourceRow is set to the source number of the row that is currently being processed; this usually varies from _row by skip rows and header rows
_name
_name is set to the percent-decoded column name annotation, as defined in [tabular-data-model], for the column that is currently being processed. (Percent-decoding is necessary as name may have been encoded if taken from titles; this prevents double percent-encoding.)

The annotation value is the result of:

  1. applying the template against the cell in that column in the row that is currently being processed
  2. expanding any prefixes as if the value were the name of a common property , as described in section 5.8 Common Properties
  3. resolving the resulting URL against the base URL of the table url if not null

If the supplied value of a URI template property is not a string (eg if it is an integer), compliant applications MUST issue a warning and proceed as if the property had been supplied with an empty string.

For example, the aboutUrl property holds a URI template that is used to generate a URL identifier for each row, which might look like:

Example 8: expanded aboutUrl using _row
"aboutUrl": "http://example.org/example.csv#row.{_row}"

The about URL annotations that are generated and used as identifiers for the rows would then look like http://example.org/example.csv#row.1, http://example.org/example.csv#row.2 and so on.

Alternatively, with the CSV and metadata in section 1. Introduction, the aboutUrl might look like:

Example 9: definition of aboutUrl using column titles
"aboutUrl": "http://example.org/tree/{on_street}/{GID}"

This would generate URIs such as http://example.org/tree/ADDISON%20AV/1 and http://example.org/tree/EMERSON%20ST/2 .

If the value of the on_street or GID column were null , the URL would still be generated with the null value generating an empty string in the URL. For example if on_street were null and GID were 3 , the generated URL would be http://example.org/tree//3 .

Once the URI has been generated, it is resolved against the url of the table (eg the CSV file) to create an absolute URI. For example, given an aboutUrl within a schema such as:

Example 10: definition of aboutUrl as a relative URL
"aboutUrl": "#row.{_row}"

and given a CSV file at http://example.com/temp.csv, the URL for the first row will be http://example.com/temp.csv#row.1.
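
The expansion and resolution behaviour in the preceding examples can be approximated with the Python sketch below. It is not a full [URI-TEMPLATE] implementation: the hypothetical helper expand_simple handles only plain {name} expressions (no operators such as {#name}), and the prefix-expansion step is omitted. Null values expand to an empty string and list values are joined with commas, matching the examples in this section.

```python
import re
from urllib.parse import quote, urljoin

# Matches only simple expressions such as {GID} or {_row}.
VARIABLE = re.compile(r"\{([A-Za-z0-9_]+)\}")


def expand_simple(template, variables):
    """Expand plain {name} expressions (assumption: no template operators)."""
    def substitute(match):
        value = variables.get(match.group(1))
        if value is None:
            return ""  # a null value contributes an empty string
        if isinstance(value, (list, tuple)):
            return ",".join(quote(str(item), safe="") for item in value)
        return quote(str(value), safe="")
    return VARIABLE.sub(substitute, template)


def about_url(template, variables, table_url):
    """Expand the template, then resolve the result against the table url."""
    return urljoin(table_url, expand_simple(template, variables))


if __name__ == "__main__":
    print(about_url("#row.{_row}", {"_row": 1}, "http://example.com/temp.csv"))
    print(expand_simple("http://example.org/tree/{on_street}/{GID}",
                        {"on_street": "ADDISON AV", "GID": "1"}))
```

Percent-encoding each value with an empty safe set keeps reserved characters such as spaces and slashes from leaking into the generated URL, which is what simple string expansion requires.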

The propertyUrl property might be defined as "{#_name}" , meaning that it resolves as a fragment identifier relative to the URL of the source of the table. For example, accessing it from a column with the column name GID would look like:

Example 11 : expanded propertyUrl

"http://example.org/example.csv#GID"

A value defined within the data is also subject to expansion. For example, consider the following table:

Example 12: table with compact URLs and micro syntax
project_name,project_type,keywords
CSVW,foaf:Project,table;data;conversion

The project_type column might have a valueUrl specified as "{project_type}" . In the first row the cell value is "foaf:Project" . The foaf prefix is understood, as described in section 5.8 Common Properties , to expand to http://xmlns.com/foaf/0.1/Project .

Similarly, the keywords column might have a valueUrl specified as "https://duckduckgo.com/?q={keywords}" . If the column also specifies "separator": ";" , then the cell value of the keywords column would be an array of the three values table , data , and conversion . This is set as the value of the keywords variable within the URI template, which means the result would be https://duckduckgo.com/?q=table,data,conversion .

If the value in the keywords column were an empty sequence (created from an empty cell in the original data), the reference to that column would be expanded to an empty string, generating https://duckduckgo.com/?q= .

When a cell's value is not a string, the canonical representation of that value is used within the expanded URL. For example, the data may include dates such as those in:

Example 13: CSV containing dates
GID,On Street,Species,Trim Cycle,Inventory Date
1,ADDISON AV,Celtis australis,Large Tree Routine Prune,10/18/2010
2,EMERSON ST,Liquidambar styraciflua,Large Tree Routine Prune,6/2/2010

The Inventory Date column description would indicate that these were dates with the format M/d/yyyy:

Example 14
{
  "name": "inventory_date",
  "titles": "Inventory Date",
  "datatype": {
    "base": "date",
    "format": "M/d/yyyy"
  }
}

The string value of the inventory_date column in the first row is parsed to create the date 18th October 2010. When the inventory_date column is referenced within a URI template such as http://example.org/event/{inventory_date} , the canonical representation of that date, as defined in [ xmlschema11-2 ] is used within the URL, giving the result http://example.org/event/2010-10-18 .
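
As an illustration of the date example above, the following Python sketch parses a cell string and produces its canonical representation. The mapping from the metadata format M/d/yyyy to a strptime pattern is hard-coded here as an assumption; a real processor would need a general translation of date format patterns, and the function name canonical_date is hypothetical.

```python
from datetime import datetime

# Assumption: only the single pattern used in the example is supported.
STRPTIME_FORMATS = {"M/d/yyyy": "%m/%d/%Y"}


def canonical_date(text, fmt):
    """Parse a cell string and return the canonical xsd:date form (YYYY-MM-DD)."""
    parsed = datetime.strptime(text, STRPTIME_FORMATS[fmt])
    return parsed.date().isoformat()


if __name__ == "__main__":
    value = canonical_date("10/18/2010", "M/d/yyyy")
    print("http://example.org/event/" + value)
```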

5.1.4 Column Reference Properties

Column reference properties hold one or more references to other column description objects. The referenced description object must have a name property. Column reference properties can then reference column description objects through values that are:

  • strings — which MUST match the name on a column description object within the metadata document
  • arrays — lists of strings as above

If the supplied value of a column reference property is not a string or array (eg if it is an integer), or if any of the values in the supplied array are not strings, or if any of the supplied strings do not reference one or more columns, compliant applications MUST issue a warning and proceed as if the property had not been specified (which may mean that an error is issued, if the property was required).
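
A minimal Python sketch of checking a column reference property follows. The function name resolve_column_references is hypothetical. The sketch reads the rule above as treating an invalid value as if the property were absent, signalled by returning None; that reading, and leaving the required-property check to the caller, are assumptions of this example.

```python
import warnings


def resolve_column_references(value, columns):
    """Return the referenced column names as a list, or None if invalid.

    `columns` is the list of column description objects from the schema.
    """
    names = {column["name"] for column in columns if "name" in column}
    references = [value] if isinstance(value, str) else value
    valid = (
        isinstance(references, list)
        and all(isinstance(ref, str) for ref in references)
        and all(ref in names for ref in references)
    )
    if not valid:
        warnings.warn(f"invalid column reference {value!r}; ignoring property")
        return None
    return references
```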

For example, the primaryKey property is a column reference property on the schema. It has to hold references to columns defined elsewhere in the schema, and the descriptions of those columns must have name properties. It can hold a single reference, like this:

Example 15
"tableSchema": {
  "columns": [{
    "name": "GID"
  }, ... ],
  "primaryKey": "GID"
}

or it can contain an array of references, like this:

Example 16
"tableSchema": {
  "columns": [{
    "name": "givenName"
  }, {
    "name": "familyName"
  }, ... ],
  "primaryKey": [ "givenName", "familyName" ]
}

5.1.5 Object Properties

Object properties hold either a single object or a reference to an object by URL. Their values may be:

  • strings — resolved as URLs against the base URL
  • objects — interpreted as structured objects

If the supplied value of an object property is not a string or object (eg if it is an integer), compliant applications MUST issue a warning and proceed as if the property had been specified as an object with no properties.

Object properties are often used when the values can be or should be values within controlled vocabularies, or structured information which may be held elsewhere. For example, the dialect of a table is an object property. It could be provided as a URL that indicates a commonly used dialect, like this:

Example 17
"dialect": "http://example.org/tab-separated-values"

or a structured object, like this:

Example 18
"dialect": {
  "delimiter": "\t",
  "encoding": "utf-8"
}

When specified as a string, the resolved URL is used to fetch the referenced object during normalization as described in section 6.1 Normalization. For example, if http://example.org/tab-separated-values resolved to:

Example 19
{
  "@context": "http://www.w3.org/ns/csvw",
  "quoteChar": null,
  "header": true,
  "delimiter": "\t"
}


Following normalization, the value of the dialect property would then be:

Example 20
"dialect": {
  "@id": "http://example.org/tab-separated-values",
  "quoteChar": null,
  "header": true,
  "delimiter": "\t"
}
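
The effect of normalization on an object property, as illustrated by the dialect examples, can be sketched in Python. The function normalize_object_property and its fetch_json parameter are hypothetical: fetch_json stands in for whatever retrieval mechanism a processor uses, and dropping @context from the fetched object is inferred from the example rather than quoted from the normalization rules.

```python
import warnings
from urllib.parse import urljoin


def normalize_object_property(value, base_url, fetch_json):
    """Return an object for an object property value (hypothetical helper).

    `fetch_json` is a caller-supplied callable mapping a URL to a parsed
    JSON object; it is an assumption of this sketch.
    """
    if isinstance(value, dict):
        return value
    if isinstance(value, str):
        resolved = urljoin(base_url, value)
        fetched = dict(fetch_json(resolved))
        fetched.pop("@context", None)  # inferred from the example output
        return {"@id": resolved, **fetched}
    warnings.warn("object property is neither a string nor an object")
    return {}  # proceed as if an object with no properties was given
```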


5.1.6 Natural Language Properties

Natural language properties hold natural language strings. Their values may be:

  • strings — interpreted as natural language strings in the default language
  • arrays — interpreted as alternative natural language strings in the default language
  • objects whose properties MUST be language codes as defined by [ BCP47 ] and whose values are either strings or arrays, providing natural language strings in that language

Natural language properties are used for things like descriptions and titles. For example, the titles property on a column description provides a natural language label for a column. If it's a plain string like this:

Example 21
"titles": "Project title"

then that string is assumed to be in the default language provided through the @language property of the nearest @context (or have an undefined language, und, if there is no such property). Multiple alternative values can be given in an array:

Example 22
"titles": [
  "Project title",
  "Project"
]

It's also possible to provide multiple values in different languages, using an object structure. For example:

Example 23
"titles": {
  "en": "Project title",
  "fr": "Titre du projet"
}

and within such an object, the values of the properties can themselves be arrays:

Example 24
"titles": {
  "en": [ "Project title", "Project" ],
  "fr": "Titre du projet"
}

The annotation value of a natural language property is an object whose properties are language codes and where the values of those properties are an array of strings (see Language Maps in [ JSON-LD ]).

Note

When extracting an annotation value from metadata that has already been merged, a natural language property will already have this form.

If the supplied value of a natural language property is not a string, array or object (eg if it is an integer), compliant applications MUST issue a warning and proceed as if the property had been specified as an empty array. If the supplied value is an array, any items in that array that are not strings MUST be ignored. If the supplied value is an object, any properties that are not valid language codes as defined by [BCP47] MUST be ignored, as must any properties whose value is not a string or an array, and any items that are not strings within array values of these properties.
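
The conversion of a natural language property into its annotation value (an object mapping language codes to arrays of strings) can be sketched in Python as below. The function name normalize_natural_language is hypothetical, and the regular expression is only a rough approximation of a [BCP47] language tag used to keep the example short; it is not a conformant validator.

```python
import re
import warnings

# Assumption: a loose approximation of BCP47 tags, not a full validator.
LANGUAGE_TAG = re.compile(r"^[A-Za-z]{2,8}(-[A-Za-z0-9]{1,8})*$")


def normalize_natural_language(value, default_language="und"):
    """Return {language code: [strings]} for a natural language property."""
    def only_strings(items):
        return [item for item in items if isinstance(item, str)]

    if isinstance(value, str):
        return {default_language: [value]}
    if isinstance(value, list):
        return {default_language: only_strings(value)}
    if isinstance(value, dict):
        result = {}
        for language, text in value.items():
            if not LANGUAGE_TAG.match(language):
                continue  # not a usable language code: ignored
            if isinstance(text, str):
                result[language] = [text]
            elif isinstance(text, list):
                result[language] = only_strings(text)
        return result
    warnings.warn("natural language property has an invalid value")
    return {default_language: []}  # treated as an empty array
```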

5.1.7 Atomic Properties

Atomic properties hold atomic values. Their values may be:

  • numbers — interpreted as integers or doubles
  • booleans — interpreted as booleans ( true or false )
  • strings — interpreted as defined by the property
  • objects — interpreted as defined by the property
  • arrays — lists of numbers, booleans, strings, or objects

The annotation value of a boolean atomic property is false if unset; otherwise, the annotation value of an atomic property is the normalized value of that property, or the defined default value or null, if unset. Processors MUST issue a warning if a property is set to an invalid value type, such as a boolean atomic property being set to the number 1 or a numeric atomic property being set to the string "3.1415", and act as if the property had not been specified (which may mean using the default value for the property, or may mean raising an error and halting processing if the property is a required property).
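
The type checks in the two examples above (a boolean property set to the number 1, a numeric property set to the string "3.1415") can be sketched in Python. The helper names are hypothetical, and the sketch only covers the fall-back-to-default case, not required properties. Note that Python treats bool as a subtype of int, so the checks have to exclude that case explicitly.

```python
import warnings


def boolean_property(name, value, default=False):
    """Return a set boolean property's value, or the default if invalid."""
    if isinstance(value, bool):
        return value
    warnings.warn(f"{name}: expected a boolean, got {value!r}")
    return default


def numeric_property(name, value, default=None):
    """Return a set numeric property's value, or the default if invalid."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return value
    warnings.warn(f"{name}: expected a number, got {value!r}")
    return default
```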

5.2 Top-Level Properties

The top-level object of a metadata document or object referenced through an object property (whether it is a table group description, table description, schema, dialect description or transformation definition) MUST have a @context property. This is an array property, as defined in Section 8.7 of [JSON-LD]. The @context property MUST have one of the following values:

5.3 Table Groups

A table group description is a JSON object that describes a group of tables.

5.3.1 Required Properties

tables

An array property of table descriptions for the tables in the group, namely those listed in the tables annotation on the group of tables being described. Compliant applications MUST raise an error if this array does not contain one or more table descriptions.

When an array of table descriptions B is merged into an original array of table descriptions A, each table description within B is combined into the original array A by:

  • if there is a table description with the same url in A, the table description from B is merged into the matching table description in A.
  • otherwise, the table description from B is appended to the array of table descriptions A.
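
The two merge rules above can be sketched in Python as follows. The function merge_table_arrays is hypothetical. How two matching table descriptions are themselves merged is not covered in this section, so the sketch takes that step as a parameter; the default keep_existing, which keeps the properties already in A and copies over only the missing ones, is a placeholder assumption and not the specification's merge algorithm.

```python
def keep_existing(target, source):
    """Placeholder merge (assumption): copy only properties missing in target."""
    for key, value in source.items():
        target.setdefault(key, value)


def merge_table_arrays(a, b, merge_description=keep_existing):
    """Merge table descriptions from array b into array a, matching on url."""
    by_url = {table["url"]: table for table in a if "url" in table}
    for table in b:
        match = by_url.get(table.get("url")) if "url" in table else None
        if match is not None:
            merge_description(match, table)  # same url: merge into A's entry
        else:
            a.append(table)  # no match: append to A
            if "url" in table:
                by_url[table["url"]] = table
    return a
```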

5.3.2 Optional Properties

The description of a group of tables MAY also contain:

dialect

An object property that provides a single dialect description. If provided, dialect provides hints to processors about how to parse the referenced files to create tabular data models for the tables in the group. This may be provided as an embedded object within the JSON metadata or as a URL reference. See section 5.9 Dialect Descriptions for more details.

notes

An array property that provides an array of objects representing arbitrary annotations on the annotated group of tables . The value of this property becomes the value of the notes annotation for the group of tables . The properties on these objects are interpreted equivalently to common properties as described in section 5.8 Common Properties . When an array of note objects B is merged into an original array of note objects A , each note object from B is appended into the array A .

Note

The Web Annotation Working Group is developing a vocabulary for expressing annotations. In future versions of this specification, we anticipate referencing that vocabulary.

tableDirection

An atomic property that MUST have a single string value that is one of "rtl", "ltr" or "default". Indicates whether the tables in the group should be displayed with the first column on the right, on the left, or based on the first character in the table that has a specific direction. The value of this property becomes the value of the direction annotation for all the tables in the table group. See Bidirectional Tables in [ tabular-data-model ] for details. The default value for this property is "default".

tableSchema

An object property that provides a single schema description as described in section 5.5 Schemas, used as the default for all the tables in the group. This may be provided as an embedded object within the JSON metadata or as a URL reference to a separate JSON object that is a schema description.

transformations

An array property of transformation definitions that provide mechanisms to transform the tabular data into other formats. The value of this property becomes the value of the transformations annotation for all the tables in the table group.

When an array of transformation definitions B is merged into an original array of transformation definitions A, each transformation definition within B is combined into the original array A by:

  • if there is a transformation definition with the same url in A, the transformation definition from B is merged into the matching transformation definition in A.
  • otherwise, the transformation definition from B is appended to the array of transformation definitions A.
@id
If included, @id is a link property that identifies the group of tables , as defined by [ tabular-data-model ], described by this table group description . It MUST NOT start with _: . The value of this property becomes the value of the id annotation for the group of tables .
@type

If included, @type is an atomic property that MUST be set to "TableGroup" . Publishers MAY include this to provide additional information to JSON-LD based toolchains.

The description MAY contain any common properties as defined in section 5.8 Common Properties to provide extra metadata about the group of tables as a whole.

The description MAY contain inherited properties to describe cells within the tables.

Issue 22 This issue relates to the use of type vs datatype as a column property. (This issue seems moot now that neither are included.)

5.4 Tables

A table description is a JSON object that describes a table within a CSV file.

Issue 50 A CSV file might not be the same as the table that it contains. For example, a given CSV file might contain two tables (in different regions of the CSV file), or might contain a table that isn't positioned at the top left of the CSV file. We invite comment about whether we should assume that pre-processing is used to extract tables where there isn't a 1:1 correspondence between CSV file and table, or not.

5.4.1 Required Properties

url

This link property gives the single URL of the CSV file that the table is held in, relative to the location of the metadata document. The value of this property is the value of the url annotation for the annotated table this table description describes.

5.4.2 Optional Properties

The description of a table MAY also contain:

dialect

As defined for table groups.

notes

An array property that provides an array of objects representing arbitrary annotations on the annotated tabular data model. The value of this property becomes the value of the notes annotation for the table. The properties on these objects are interpreted equivalently to common properties as described in section 5.8 Common Properties. When an array of note objects B is merged into an original array of note objects A, each note object from B is appended into the array A.

Note

The Web Annotation Working Group is developing a vocabulary for expressing annotations. In future versions of this specification, we anticipate referencing that vocabulary.

Issue 70 Should there be column or level notes as well?

Issue 71 The Annotation Model can indeed become very complex.
suppressOutput

A boolean atomic property. If true, suppresses any output that would be generated when converting this table. The value of this property becomes the value of the suppress output annotation for this table. The default is false.

tableDirection

As defined for table groups. The value of this property becomes the value of the direction annotation for this table.

tableSchema

An object property that provides a single schema description as described in section 5.5 Schemas. This may be provided as an embedded object within the JSON metadata or as a URL reference to a separate JSON schema document. If a table description is within a table group description, the tableSchema from that table group acts as the default for this property.

If a tableSchema is not declared in a table description, it may be declared on the table group description, which is then used as the schema for this table description.

The @id property of the tableSchema, if there is one, becomes the value of the schema annotation for this table.

Note

When a schema is referenced by URL, this URL becomes the value of the @id property in the normalized schema description, and thus the value of the schema annotation on the table.

transformations

As defined for table groups. The value of this property becomes the value of the transformations annotation for this table.

@id

If included, @id is a link property that identifies the table, as defined in [ tabular-data-model ], described by this table description. It MUST NOT start with _: . The value of this property becomes the value of the id annotation for this table.

@type

If included, @type is an atomic property that MUST be set to "Table". Publishers MAY include this to provide additional information to JSON-LD based toolchains.

The description MAY contain any common properties as defined in section 5.8 Common Properties to provide extra metadata about the table as a whole.

The description MAY contain inherited properties to describe cells within the table.

5.5 Schemas

A schema is a definition of a tabular format that may be common to multiple tables. For example, multiple tables from different sources may have the same columns and be designed such that they can be aggregated together.

A schema description is a JSON object that encodes the information about a schema, which describes the structure of a table. All the properties of a schema description are optional.

columns

An array property of column descriptions as described in section 5.6 Columns. These are matched to columns in tables that use the schema by position: the first column description in the array applies to the first column in the table, the second to the second and so on.

The name properties of the column descriptions MUST be unique within a given table description.

When an array of column descriptions B is merged into an original array of column descriptions A, each column description within B is combined into the original array A, based on the index of each column description, as follows:

  1. if there is a non-empty case-insensitive intersection between the percent-decoded name and titles values for the column descriptions at the same index within A and B, the column description from B is merged into the matching column description in A.
  2. otherwise, if there are column descriptions at the same index within A and B, implementations MUST generate an error.
  3. otherwise, if at a given index there is no column description within A, but there is a column description within B, then:
    1. if the virtual property of the column description in B is true, then the column description is appended to A.
    2. otherwise, implementations MUST generate an error.
  4. otherwise, if at a given index there is no column description within B, but there is a column description within A, then:
    1. if the virtual property of the column description in A is true, then the column description is retained.
    2. otherwise, implementations MUST generate an error.
foreignKeys

An array property of foreign key definitions that define how the values from specified columns within this table link to rows within this table or other tables. A foreign key definition is a JSON object that MUST contain only the following properties:

columnReference

A column reference property that holds either a single reference to a column description object within this schema, or an array of references. These form the referencing columns for the foreign key definition.

reference

An object property that identifies a referenced table and a set of referenced columns within that table. Its properties are:

resource

A link property holding a URL that is the identifier for a specific table that is being referenced. If this property is present then schemaReference MUST NOT be present. The table group MUST contain a table whose url annotation is identical to the expanded value of this property. That table is the referenced table.

schemaReference

A link property holding a URL that is the identifier for a schema that is being referenced. If this property is present then resource MUST NOT be present. The table group MUST contain a table with a tableSchema having a @id that is identical to the expanded value of this property, and there MUST NOT be more than one such table. That table is the referenced table.

columnReference

A column reference property that holds either a single reference (by name) to a column description object within the tableSchema of the referenced table, or an array of such references.

The value of this property becomes the foreign keys annotation on the table using this schema by creating a list of foreign keys comprising a list of columns in the table and a list of columns in the referenced table. The value of this property is also used to create the value of the referenced rows annotation on each of the rows in the table that uses this schema, which is a pair of the relevant foreign key and the referenced row in the referenced table.

As defined in [ tabular-data-model ], validators MUST check that, for each row, the combination of cells in the referencing columns references a unique row within the referenced table through a combination of cells in the referenced columns . For examples, see section 5.5.1.1 Foreign Key Reference Between Tables and section 5.5.1.2 Foreign Key Reference Between Schemas .
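As a rough illustration of this check, the following Python sketch reports every row whose referencing cells do not match exactly one row in the referenced table. It assumes each table has already been parsed into a list of dictionaries keyed by column name, which is a simplification of the annotated tabular data model. The function names are hypothetical, and the final "UK" row is invented for the example to show a failure.

```python
def as_list(reference):
    """A column reference is either a single name or an array of names."""
    return [reference] if isinstance(reference, str) else list(reference)


def check_foreign_key(rows, referenced_rows, foreign_key):
    """Return (row number, key) pairs that do not reference a unique row."""
    source_columns = as_list(foreign_key["columnReference"])
    target_columns = as_list(foreign_key["reference"]["columnReference"])

    # Count how often each combination of referenced cells occurs.
    counts = {}
    for row in referenced_rows:
        key = tuple(row[c] for c in target_columns)
        counts[key] = counts.get(key, 0) + 1

    errors = []
    for number, row in enumerate(rows, start=1):
        key = tuple(row[c] for c in source_columns)
        if counts.get(key, 0) != 1:
            errors.append((number, key))
    return errors


countries = [
    {"countryCode": "AD", "name": "Andorra"},
    {"countryCode": "AE", "name": "United Arab Emirates"},
    {"countryCode": "AF", "name": "Afghanistan"},
]
country_slice = [
    {"countryRef": "AF", "year": "1960", "population": "9616353"},
    {"countryRef": "AF", "year": "1961", "population": "9799379"},
    {"countryRef": "UK", "year": "1962", "population": "0"},  # invented bad row
]
foreign_key = {
    "columnReference": "countryRef",
    "reference": {
        "resource": "http://example.org/countries.csv",
        "columnReference": "countryCode",
    },
}
errors = check_foreign_key(country_slice, countries, foreign_key)
```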

Note

It is not required for the table or schema referenced from a foreignKeys property to have a similarly defined primaryKey, though frequently it will.

When an array of foreign key definitions B is merged into an original array of foreign key definitions A, each foreign key definition within B which does not appear within A is appended to the original array A.

primaryKey

A column reference property that holds either a single reference to a column description object or an array of references. The value of this property becomes the primary key annotation for each row within a table that uses this schema by creating a list of the cells in that row that are in the referenced columns.

As defined in [ tabular-data-model ], validators MUST check that each row has a unique combination of values of cells in the indicated columns. For example, if primaryKey is set to ["familyName", "givenName"] then every row must have a unique value for the combination of values of cells in the familyName and givenName columns.
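The uniqueness check could be sketched in Python along these lines. This is an assumption-laden illustration: rows are modelled as dictionaries keyed by column name, `duplicate_primary_keys` is a hypothetical name, and the sample people are invented.

```python
def duplicate_primary_keys(rows, primary_key):
    """Return (first row number, duplicate row number, key) for each clash."""
    columns = [primary_key] if isinstance(primary_key, str) else list(primary_key)
    first_seen = {}
    duplicates = []
    for number, row in enumerate(rows, start=1):
        key = tuple(row[c] for c in columns)
        if key in first_seen:
            duplicates.append((first_seen[key], number, key))
        else:
            first_seen[key] = number
    return duplicates


people = [
    {"familyName": "Smith", "givenName": "Alex"},
    {"familyName": "Smith", "givenName": "Sam"},
    {"familyName": "Smith", "givenName": "Alex"},  # repeats row 1
]
clashes = duplicate_primary_keys(people, ["familyName", "givenName"])
```

Note that the two rows sharing only familyName do not clash; it is the combination of values across all referenced columns that has to be unique.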

@id

If included, @id is a link property that identifies the schema described by this schema description. It MUST NOT start with _: .

@type

If included, @type is an atomic property that MUST be set to "Schema" . Publishers MAY include this to provide additional information to JSON-LD based toolchains.

The description MAY contain any common properties as defined in section 5.8 Common Properties to provide extra metadata about the schema as a whole.

The description MAY contain inherited properties to describe cells within tables that use this schema.

5.5.1 Examples

This section is non-normative.

5.5.1.1 Foreign Key Reference Between Tables

A list of countries is published at http://example.org/countries.csv with the structure:

Example 25: http://example.org/countries.csv
countryCode,latitude,longitude,name
AD,42.546245,1.601554,Andorra
AE,23.424076,53.847818,"United Arab Emirates"
AF,33.93911,67.709953,Afghanistan

Another file contains information about the population in some countries each year, at http://example.org/country_slice.csv with the structure:

Example 26: http://example.org/country_slice.csv
countryRef,year,population
AF,1960,9616353
AF,1961,9799379
AF,1962,9989846

The following metadata for the group of tables links the two together by defining a foreignKeys property:

Example 27: http://example.org/countries.json
{
  "@context": "http://www.w3.org/ns/csvw",
  "tables": [{
    "url": "http://example.org/countries.csv",
    "tableSchema": {
      "columns": [{
        "name": "countryCode",
        "datatype": "string",
        "propertyUrl": "http://www.geonames.org/ontology{#_name}"
      }, {
        "name": "latitude",
        "datatype": "number"
      }, {
        "name": "longitude",
        "datatype": "number"
      }, {
        "name": "name",
        "datatype": "string"
      }],
      "aboutUrl": "http://example.org/countries.csv{#countryCode}",
      "propertyUrl": "http://schema.org/{_name}",
      "primaryKey": "countryCode"
    }
  }, {
    "url": "http://example.org/country_slice.csv",
    "tableSchema": {
      "columns": [{
        "name": "countryRef",
        "valueUrl": "http://example.org/countries.csv{#countryRef}"
      }, {
        "name": "year",
        "datatype": "gYear"
      }, {
        "name": "population",
        "datatype": "integer"
      }],
      "foreignKeys": [{
        "columnReference": "countryRef",
        "reference": {
          "resource": "http://example.org/countries.csv",
          "columnReference": "countryCode"
        }
      }]
    }
  }]


}

Within the annotated table generated for countries.csv, each row will have a primary key annotation whose value is a list containing the cell from the first column of that row ( countryCode ).

The annotated table generated for country_slice.csv will have a foreign keys annotation whose value is a list containing a single foreign key referencing the first column from the table generated from country_slice.csv ( countryRef ) and the first column from the table generated from countries.csv ( countryCode ). Each row within that table will have a referenced row annotation referencing this foreign key and the third row in the table generated from countries.csv.

When the population data in country_slice.csv is validated, the validator must check that every countryRef within country_slice.csv has a matching countryCode within countries.csv.

5.5.1.2 Foreign Key Reference Between Schemas

When publishing information about public sector roles and salaries, as in Use Case 4, the UK government requires departments to publish two files which are interlinked. The first lists senior grades (simplified here), e.g., at HEFCE_organogram_senior_data_31032011.csv :

Example 28
Post Unique Reference,              Name,Grade,             Job Title,Reports to Senior Post
                90115,        Steve Egan,SCS1A,Deputy Chief Executive,                 90334
                90250,     David Sweeney,SCS1A,              Director,                 90334
                90284,       Heather Fry,SCS1A,              Director,                 90334
                90334,Sir Alan Langlands, SCS4,       Chief Executive,                    xx

The second provides information about the number of junior positions that report to those individuals (simplified here), e.g., at HEFCE_organogram_junior_data_31032011.csv :

Example 29
Reporting Senior Post,Grade,Payscale Minimum (£),Payscale Maximum (£),Generic Job Title,Number of Posts in FTE,          Profession
                90284,    4,               17426,               20002,    Administrator,                     2,Operational Delivery
                90284,    5,               19546,               22478,    Administrator,                     1,Operational Delivery
                90115,    4,               17426,               20002,    Administrator,                  8.67,Operational Delivery
                90115,    5,               19546,               22478,    Administrator,                   0.5,Operational Delivery

The schemas are reused by multiple departments and for multiple pairs of files. The schemas are therefore defined in separate files, and they need to define links between the schemas which are then picked up as applying between tables that use those schemas.

The metadata file for the particular publication of the files above is:

Example 30
{
  "@context": "http://www.w3.org/ns/csvw",
  "tables": [{
    "url": "HEFCE_organogram_senior_data_31032011.csv",
    "tableSchema": "http://example.org/schema/senior-roles.json"
  }, {
    "url": "HEFCE_organogram_junior_data_31032011.csv",
    "tableSchema": "http://example.org/schema/junior-roles.json"
  }]


}

The schema for the senior role CSV (at http://example.org/schema/senior-roles.json ) is as follows:

Example 31
{
  "@id": "http://example.org/schema/senior-roles.json",
  "@context": "http://www.w3.org/ns/csvw",
  "columns": [{
    "name": "ref",
    "titles": "Post Unique Reference"
  }, {
    "name": "name",
    "titles": "Name"
  }, {
    "name": "grade",
    "titles": "Grade"
  }, {
    "name": "job",
    "titles": "Job Title"
  }, {
    "name": "reportsTo",
    "titles": "Reports to Senior Post"
  }],
  "primaryKey": "ref"


}

The schema for the junior role CSV (at http://example.org/schema/junior-roles.json ) is as follows; it includes a foreign key reference to the senior roles schema:

Example 32
{
  "@id": "http://example.org/schema/junior-roles.json",
  "@context": "http://www.w3.org/ns/csvw",
  "columns": [{
    "name": "reportsTo",
    "titles": "Reporting Senior Post"
  }, 
  ...
  ],
  "foreignKeys": [{
    "columnReference": "reportsTo",
    "reference": {
      "schemaReference": "http://example.org/schema/senior-roles.json",
      "columnReference": "ref"
    }
  }]


}

The foreign key definition here contains a schemaReference to senior-roles.json. Implementations will look for the table referenced within the original metadata file whose tableSchema is senior-roles.json, which is HEFCE_organogram_senior_data_31032011.csv. The implementation will therefore look for a relationship between the reportsTo column in HEFCE_organogram_junior_data_31032011.csv and the ref column in HEFCE_organogram_senior_data_31032011.csv.

For example, in the first line of HEFCE_organogram_junior_data_31032011.csv, the reportsTo ( Reporting Senior Post ) column contains the value 90284. When validating that file, validators will check that there is a single row within the table generated from HEFCE_organogram_senior_data_31032011.csv whose ref column contains the value 90284.

5.5.1.3 Weak Linking between Tables

Foreign key definitions provide for strong linking between tables that guarantees (through validation) the existence of a referenced row. It is also possible to provide weak linking between tables, which is not tested by validators but which may be useful when converting tabular data into other formats, using aboutUrl and valueUrl.

Taking the example above as a starting point, the schema for HEFCE_organogram_senior_data_31032011.csv could use aboutUrl to provide a URL for each row, which can similarly be created as a valueUrl for the reportsTo column:

Example 33
{
  "@id": "http://example.org/schema/senior-roles.json",
  "@context": "http://www.w3.org/ns/csvw",
  "aboutUrl": "#role-{ref}",
  "columns": [{
    "name": "ref",
    "titles": "Post Unique Reference"
  }, {
    "name": "name",
    "titles": "Name"
  }, {
    "name": "grade",
    "titles": "Grade"
  }, {
    "name": "job",
    "titles": "Job Title"
  }, {
    "name": "reportsTo",
    "titles": "Reports to Senior Post",
    "valueUrl": "#role-{reportsTo}"
  }],
  "primaryKey": "ref"

}

The URLs generated for the values of the reportsTo column will (if the data is correct) match the URLs generated for each row within the table. There will be no validation error, however, if there is a value in the reportsTo column that does not match a value in the ref column. In contrast, if a foreign key had been specified with:

Example 34
"foreignKeys": [{
  "columnReference": "reportsTo",
  "reference": {
    "schemaReference": "http://example.org/schema/senior-roles.json",
    "columnReference": "ref"
  }

}]

then validators would raise an error if a value in the reportsTo column did not match any value in the ref column.
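To make the weak-linking behaviour concrete, the following Python sketch expands the two templates from the senior roles schema for the four senior rows shown earlier and lists the reportsTo URLs that have no matching row URL. It is only a sketch: `expand` is a hypothetical helper that handles simple {column} expressions rather than the full URI Template syntax, and the base URL `TABLE_URL` is assumed, since the example metadata gives the table's url in relative form.

```python
import re
from urllib.parse import quote, urljoin

# Assumed absolute location of the table; not stated in the examples.
TABLE_URL = "http://example.org/HEFCE_organogram_senior_data_31032011.csv"


def expand(template, row):
    """Expand simple {column} expressions; other operators are not handled."""
    return re.sub(
        r"\{(\w+)\}",
        lambda match: quote(row[match.group(1)], safe=""),
        template,
    )


rows = [
    {"ref": "90115", "reportsTo": "90334"},
    {"ref": "90250", "reportsTo": "90334"},
    {"ref": "90284", "reportsTo": "90334"},
    {"ref": "90334", "reportsTo": "xx"},
]

# aboutUrl identifies each row; valueUrl gives the value of the reportsTo cell.
row_urls = {urljoin(TABLE_URL, expand("#role-{ref}", row)) for row in rows}
value_urls = [urljoin(TABLE_URL, expand("#role-{reportsTo}", row)) for row in rows]

# Weak links that point nowhere: no validation error is raised for these.
unmatched = [url for url in value_urls if url not in row_urls]
```

Here the only unmatched URL is the one generated from the placeholder value xx; with a foreign key definition in place, that row would be reported as a validation error instead.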

3.9 5.6 Columns

A column description is a simple JSON object that describes a single column. The description provides additional human-readable documentation for a column, as well as additional information that may be used to validate the cells within the column, create a user interface for data entry, or inform conversion into other formats. All properties are optional.

name

An atomic property that gives a single canonical name for the column. The value of this property becomes the name annotation for the described column. This MUST be a string and this property has no default value, which means it MUST be ignored if the supplied value is not a string.

For ease of reference within URI template properties, column names are restricted as defined in Variables in [ URI-TEMPLATE ] with the additional provision that names beginning with "_" are reserved by this specification and MUST NOT be used within metadata documents.

suppressOutput

A boolean atomic property. If true, suppresses any output that would be generated when converting cells in this column. The value of this property becomes the suppress output annotation for the described column. The default is false.

titles

A natural language property that provides possible alternative names for the column. The string values of this property, along with their associated language tags, become the titles annotation for the described column.

If there is no name property defined on this column, the first titles value having the same language tag as default language, or und if no default language is specified, becomes the name annotation for the described column. This annotation MUST be percent-encoded as necessary to conform to the syntactic requirements defined in [ RFC3986 ].

virtual

A boolean atomic property taking a single value which indicates whether the column is a virtual column not present in the original source. The default value is false. The normalized value of this property becomes the virtual annotation for the described column. If present, a virtual column MUST appear after all other non-virtual column definitions.

Note

Virtual columns are useful for inserting cells with default values into an annotated table to control the results of conversions.

We invite comment on whether virtual columns are useful enough to include in the final recommendation in spite of the added complexity.

@id

If included, @id is a link property that identifies the columns, as defined in [ tabular-data-model ], and potentially appearing across separate tables, described by this column description. It MUST NOT start with _: .

@type

If included, @type is an atomic property that MUST be set to "Column" . Publishers MAY include this to provide additional information to JSON-LD based toolchains.

If the column description has neither name nor titles properties, the string "_col. [N] " where [N] is the column number , becomes the name annotation for the described column .

The description MAY contain any common properties as defined in section 5.8 Common Properties to provide extra metadata about the column as a whole, such as a full description.

The description MAY contain any of the inherited properties to describe cells within the column.
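The rules for the name annotation can be sketched as a small Python function. This is an illustration under stated assumptions, not normative behaviour: the function name column_name is hypothetical, titles is assumed to have already been normalized to an object mapping language tags to arrays of strings, and falling back to "_col.[N]" when no title matches the default language is a choice made for this sketch.

```python
from urllib.parse import quote


def column_name(description, column_number, default_language="und"):
    """Derive the name annotation for a column description (a dict).

    Assumes `titles` is already normalized to a mapping from language
    tags to lists of strings, e.g. {"en": ["On Street"]}.
    """
    name = description.get("name")
    if isinstance(name, str):
        return name
    titles = description.get("titles") or {}
    candidates = titles.get(default_language) or []
    if candidates:
        # Percent-encode so the result can be used in URI templates.
        return quote(candidates[0], safe="")
    # Neither a usable name nor a matching title (sketch assumption).
    return "_col.{}".format(column_number)
```

For example, a column described only by the English title "On Street" would be named "On%20Street" when the default language is en.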

5.6.1 Examples

This section is non-normative.

5.6.1.1 Use of virtual columns

Virtual columns are useful when data needs to be added as part of an output transformation that doesn't exist in the source file. This may be to add type information to a column, or to relate different columns having different aboutUrl . For example, the http://example.org/tree-ops.csv example used in the introduction can be used with the following metadata:

Example 35: http://example.org/tree-ops-virtual.json
{
  "url": "tree-ops.csv",
  "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
  "tableSchema": {
    "columns": [{
      "name": "GID",
      "titles": "GID",
      "datatype": "string",
      "propertyUrl": "schema:url",
      "valueUrl": "#gid-{GID}"
    }, {
      "name": "on_street",
      "titles": "On Street",
      "datatype": "string",
      "aboutUrl": "#location-{GID}",
      "propertyUrl": "schema:streetAddress"
    }, {
      "name": "species",
      "titles": "Species",
      "datatype": "string",
      "propertyUrl": "schema:name"
    }, {
      "name": "trim_cycle",
      "titles": "Trim Cycle",
      "datatype": "string"
    }, {
      "name": "inventory_date",
      "titles": "Inventory Date",
      "datatype": {"base": "date", "format": "M/d/yyyy"},
      "aboutUrl": "#event-{inventory_date}",
      "propertyUrl": "schema:startDate"
    }, {
      "propertyUrl": "schema:event",
      "valueUrl": "#event-{inventory_date}",
      "virtual": true
    }, {
      "propertyUrl": "schema:location",
      "valueUrl": "#location-{GID}",
      "virtual": true
    }, {
      "aboutUrl": "#location-{GID}",
      "propertyUrl": "rdf:type",
      "valueUrl": "schema:PostalAddress",
      "virtual": true
    }],
    "aboutUrl": "#gid-{GID}"
  }

}

This metadata creates a relationship model between data in each column by different combinations of aboutUrl , propertyUrl , and valueUrl on existing columns, and defining new virtual columns to supply additional information. In this case, the on_street and inventory_date values are split into separate entities, each having their own aboutUrl . New virtual columns are defined to provide a location type, and to relate the main row entity to the event and location associated with it. The result of converting the table to RDF would include the following, for the first row, with the contributions from the virtual columns highlighted:

Example 36
<#gid-1>
  schema:url <#gid-1> ;
  schema:name "Celtis australis" ;
  :trim_cycle "Large Tree Routine Prune" ;
  schema:event <#event-2010-10-18> ;
  schema:location <#location-1> ;
  .

<#event-2010-10-18> a schema:Event ;
  schema:startDate "2010-10-18"^^xsd:date ;
  .

<#location-1> a schema:PostalAddress ;
  schema:streetAddress "ADDISON AV" ;
  .

The JSON would similarly include, again with the contributions from the virtual columns highlighted:

Example 37
{
  "@id": "#gid-1",
  "schema:url": "#gid-1",
  "schema:name": "Celtis australis",
  "trim_cycle": "Large Tree Routine Prune",
  "schema:event": {
    "@id": "#event-2010-10-18",
    "@type": "schema:Event",
    "schema:startDate": "2010-10-18"
  },
  "schema:location": {
    "@id": "#location-1",
    "@type": "schema:PostalAddress",
    "schema:streetAddress": "ADDISON AV"
  }
}

5.7 Inherited Properties

A cell may be assigned annotations based on properties on the description objects for the group of tables, table, schema, or column that it appears in. These properties are known as inherited properties and are listed below. To ascertain a value for certain annotations on cells, an application MUST identify the relevant property in the descriptions of the group of tables, table, schema, or column.

aboutUrl

A URI template property that MAY be used to indicate what a cell contains information about. The value of this property becomes the about URL annotation for the described column .

Note

aboutUrl is typically defined on a schema description or table description to indicate what each row is about. If defined on individual column descriptions , care must be taken to ensure that transformed cell values maintain a semantic relationship.

datatype

An atomic property that contains either a single string that is the main datatype of the values of the cell or a datatype description object. If the value of this property is a string, it MUST be one of the built-in datatypes defined in section 5.11.1 Built-in Datatypes; if it is an object then it describes a more specialised datatype. If a cell contains a sequence (i.e. the separator property is specified and not null) then this property specifies the datatype of each value within that sequence. See 5.11 Datatypes and Parsing Cells in [ tabular-data-model ] for more details.

The normalized value of this property becomes the datatype annotation for the described column.

We invite comment on whether datatype should allow for a "union" of types for a cell; this would allow for a set of datatypes that could be matched against the string value of a cell, choosing the first match; e.g., to match either a date or datetime.

default

An atomic property holding a single string that is used to create a default value for the cell in cases where the original string value is an empty string. See Parsing Cells in [ tabular-data-model ] for more details. If not specified, the default for the default property is the empty string, "". The value of this property becomes the default annotation for the described column.

lang

An atomic property giving a single string language code as defined by [ BCP47 ]. Indicates the language of the value within the cell. See Parsing Cells in [ tabular-data-model ] for more details. The value of this property becomes the lang annotation for the described column . The default is und .

null

An atomic property giving the string or strings used for null values within the data. If the string value of the cell is equal to any one of these values, the cell value is null. See Parsing Cells in [ tabular-data-model ] for more details. If not specified, the default for the null property is the empty string "". The value of this property becomes the null annotation for the described column.

ordered

A boolean atomic property taking a single value which indicates whether a list that is the value of the cell is ordered (if true) or unordered (if false). The default is false. This property is irrelevant if the separator is null or undefined, but this is not an error. The value of this property becomes the ordered annotation for the described column, and the ordered annotation for the cells within that column.

propertyUrl

A URI template property that MAY be used to create a URI for a property if the table is mapped to another format. The value of this property becomes the property URL annotation for the described column.

Note

propertyUrl is typically defined on a column description. If defined on a schema description, table description or table group description, care must be taken to ensure that transformed cell values maintain an appropriate semantic relationship, for example by including the name of the column in the generated URL by using _name in the template.

required

A boolean atomic property taking a single value which indicates whether the cell must have a non-null value. The default is false . The value of this property becomes the required annotation for the described column .

separator

An atomic property that MUST have a single string value that is the character used to separate items in the string value of the cell. If null (the default) or unspecified, the cell does not contain a list. Otherwise, applications MUST split the string value of the cell on the specified separator character and parse each of the resulting strings separately. The cell's value will then be a list. Conversion specifications MUST use the separator to determine the conversion of a cell into the target format. See Parsing Cells in [ tabular-data-model ] for more details. The value of this property becomes the separator annotation for the described column.

textDirection

An atomic property that MUST have a single string value that is one of "rtl" or "ltr" (the default). Indicates whether the text within cells should be displayed by default as left-to-right or right-to-left text. The value of this property becomes the text direction annotation for the column. See Bidirectional Tables in [ tabular-data-model ] for details.

valueUrl

A URI template property that is used to map the values of cells into URLs. The value of this property becomes the value URL annotation for the described column.

Note

This allows processors to build URLs from cell values, for example to reference RDF resources, as defined in [ rdf-concepts ]. For example, if the value URL were "{#reference}", each cell value of a column named reference would be used to create a URI such as http://example.com/#1234, if 1234 were a cell value of that column.

Note

valueUrl is typically defined on a column description . If defined on a schema description , table description or table group description , care must be taken to ensure that transformed cell values maintain an appropriate semantic relationship.

The value of an inherited property is the first value, if any, found by looking in the current description object and then through each of its containing objects: an inherited property defined in a column description takes precedence over one defined in a schema description, which in turn takes precedence over one defined in a table description, which in turn takes precedence over one defined in a table group description.
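The precedence order can be sketched as a lookup from the most specific description outwards. This is a minimal sketch in Python: the function name inherited_value and the use of plain dicts for description objects are assumptions made for illustration.

```python
def inherited_value(name, column, schema, table, group, default=None):
    """Return the first value of `name` found, most specific first.

    Each argument after `name` is a description object represented as a
    dict, or None when that level is absent.
    """
    for description in (column, schema, table, group):
        if description is not None and name in description:
            return description[name]
    return default


# Fragments modelled on the tree-ops metadata used in this section.
schema = {"aboutUrl": "#gid-{GID}"}
on_street = {"name": "on_street", "aboutUrl": "#location-{GID}"}
species = {"name": "species"}

# The column-level value wins for on_street; species falls back to the schema.
on_street_about = inherited_value("aboutUrl", on_street, schema, {}, None)
species_about = inherited_value("aboutUrl", species, schema, {}, None)
```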

5.7.1 Examples

This section is non-normative.

In the following example, the aboutUrl property is defined on the tableSchema, and therefore affects all cells for that table.

Example 38: http://example.org/tree-ops.csv-metadata.json
{
  "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
  "url": "tree-ops.csv",
  "dc:title": "Tree Operations",
  "dcat:keyword": ["tree", "street", "maintenance"],
  "dc:publisher": {
    "schema:name": "Example Municipality",
    "schema:url": {"@id": "http://example.org"}
  },
  "dc:license": {"@id": "http://opendefinition.org/licenses/cc-by/"},
  "dc:modified": {"@value": "2010-12-31", "@type": "xsd:date"},
  "tableSchema": {
    "columns": [{
      "name": "GID",
      "titles": ["GID", "Generic Identifier"],
      "dc:description": "An identifier for the operation on a tree.",
      "datatype": "string",
      "required": true
    }, {
      "name": "on_street",
      "titles": "On Street",
      "dc:description": "The street that the tree is on.",
      "datatype": "string"
    }, {
      "name": "species",
      "titles": "Species",
      "dc:description": "The species of the tree.",
      "datatype": "string"
    }, {
      "name": "trim_cycle",
      "titles": "Trim Cycle",
      "dc:description": "The operation performed on the tree.",
      "datatype": "string"
    }, {
      "name": "inventory_date",
      "titles": "Inventory Date",
      "dc:description": "The date of the operation that was performed.",
      "datatype": {"base": "date", "format": "M/d/yyyy"}
    }],
    "primaryKey": "GID",
    "aboutUrl": "#gid-{GID}"
  }

}

The equivalent effect could be achieved by using the aboutUrl property on each column:

Example 39: http://example.org/tree-ops.csv-metadata.json
{
  "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
  "url": "tree-ops.csv",
  "dc:title": "Tree Operations",
  "dcat:keyword": ["tree", "street", "maintenance"],
  "dc:publisher": {
    "schema:name": "Example Municipality",
    "schema:url": {"@id": "http://example.org"}
  },
  "dc:license": {"@id": "http://opendefinition.org/licenses/cc-by/"},
  "dc:modified": {"@value": "2010-12-31", "@type": "xsd:date"},
  "tableSchema": {
    "columns": [{
      "name": "GID",
      "titles": ["GID", "Generic Identifier"],
      "aboutUrl": "#gid-{GID}",
      "dc:description": "An identifier for the operation on a tree.",
      "datatype": "string",
      "required": true
    }, {
      "name": "on_street",
      "titles": "On Street",
      "aboutUrl": "#gid-{GID}",
      "dc:description": "The street that the tree is on.",
      "datatype": "string"
    }, {
      "name": "species",
      "titles": "Species",
      "aboutUrl": "#gid-{GID}",
      "dc:description": "The species of the tree.",
      "datatype": "string"
    }, {
      "name": "trim_cycle",
      "titles": "Trim Cycle",
      "aboutUrl": "#gid-{GID}",
      "dc:description": "The operation performed on the tree.",
      "datatype": "string"
    }, {
      "name": "inventory_date",
      "titles": "Inventory Date",
      "aboutUrl": "#gid-{GID}",
      "dc:description": "The date of the operation that was performed.",
      "datatype": {"base": "date", "format": "M/d/yyyy"}
    }],
    "primaryKey": "GID"
  }

}

5.8 Common Properties

Descriptions of groups of tables, tables, schemas and columns MAY contain any common properties whose names are either absolute URLs or prefixed names . For example, a table description may contain dc:description , dcat:keyword , or schema:copyrightHolder properties to provide a description, keywords, or the name of the copyright holder, as defined in Dublin Core Terms , DCAT , or schema.org .

5.8.1 Names of Common Properties

The names of common properties are prefixed names , in the syntax prefix : name .

Prefixed names can be expanded to provide a URI, by replacing the prefix and following colon with the URI that the prefix is associated with. Expansion is intended to be entirely consistent with Section 6.3 IRI Expansion in [ JSON-LD-API ] and implementations MAY use a JSON-LD processor for performing prefixed name and IRI expansion.

The prefixes that are recognized are those defined for [ rdfa-core ] within the RDFa 1.1 Initial Context and other prefixes defined within [ csvw-context ] and these MUST NOT be overridden. These prefixes are periodically extended; refer to [ csvw-context ] for details. Properties from other vocabularies MUST be named using absolute URLs.

Note

Forbidding the declaration of new prefixes ensures consistent processing between JSON-LD-aware and non-JSON-LD-aware processors.

This specification does not define how common properties are interpreted by implementations. Implementations SHOULD treat the prefixed names for common properties and the URLs that they expand into in the same way. For example, if an implementation recognises and displays the value of the dc:description property, it should also recognise and display the value of the http://purl.org/dc/terms/description property in the same way.
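Prefix expansion can be sketched as a simple string substitution. The following Python is a minimal illustration, not a JSON-LD processor: the function name expand_name is hypothetical, and the PREFIXES table is a hand-picked subset of the RDFa 1.1 Initial Context rather than the full set of recognized prefixes.

```python
# A small, illustrative subset of the recognized prefixes.
PREFIXES = {
    "dc": "http://purl.org/dc/terms/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "schema": "http://schema.org/",
}


def expand_name(name):
    """Expand a prefixed name to a URL; leave anything else unchanged."""
    prefix, colon, local = name.partition(":")
    if colon and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    # Absolute URLs (and unknown prefixes) are returned as-is.
    return name
```

Treating both spellings alike then amounts to comparing expanded names, so expand_name("dc:description") and the absolute URL http://purl.org/dc/terms/description compare equal.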

5.8.2 Values of Common Properties

Common properties can take any JSON value, so long as any objects within the value (for example as items of an array or values of properties on other objects) adhere to the following restrictions, which are designed to ensure compatibility between JSON-LD-aware and non-JSON-LD-aware processors:

  • If a @value property is used on an object, that object MUST NOT have any other properties aside from either @type or @language, and MUST NOT have both @type and @language as properties. The value of the @value property MUST be a string, number, or boolean value.

    If @type is also used, its value MUST be one of:

    If a @language property is used, it MUST have a string value that adheres to the syntax defined in [ BCP47 ], or be null.

  • If a @type property is used on an object without a @value property, its value MUST be one of:

    A @type property can also have a value that is an array of such values.

  • The values of @id properties are link properties and are treated as URLs. During normalization, as described in section 6.1 Normalization, they will have any prefix expanded and the result resolved against the base URL. Therefore, if an @id property is used on an object, it MUST have a value that is a string and that string MUST NOT start with _: .

  • A @language property MUST NOT be used on an object unless it also has a @value property.

  • Aside from @value, @type, @language, and @id, the properties used on an object MUST NOT start with @ .

These restrictions are also described in section A. JSON-LD Dialect, from the perspective of a processor that otherwise supports JSON-LD. Examples of common property values and the impact of normalization are given in section 6.1.1 Examples.
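Several of the restrictions above can be checked with a recursive walk over the value. The Python below is a partial sketch under stated assumptions: the function name check_common_value and the problem messages are hypothetical, and the checks on the permitted values of @type and on [ BCP47 ] syntax for @language are deliberately left out.

```python
KEYWORDS = {"@value", "@type", "@language", "@id"}


def check_common_value(value):
    """Return a list of problems found in a common property value."""
    problems = []
    if isinstance(value, list):
        for item in value:
            problems.extend(check_common_value(item))
    elif isinstance(value, dict):
        keys = set(value)
        if "@value" in keys:
            if keys - {"@value", "@type", "@language"}:
                problems.append("@value object has other properties")
            if "@type" in keys and "@language" in keys:
                problems.append("@value object has both @type and @language")
            if not isinstance(value["@value"], (str, int, float, bool)):
                problems.append("@value must be a string, number or boolean")
        elif "@language" in keys:
            problems.append("@language used without @value")
        if "@id" in keys:
            ident = value["@id"]
            if not isinstance(ident, str) or ident.startswith("_:"):
                problems.append("@id must be a string not starting with _:")
        for key, nested in value.items():
            if key.startswith("@"):
                if key not in KEYWORDS:
                    problems.append("unsupported keyword " + key)
            else:
                # Ordinary properties may hold nested objects or arrays.
                problems.extend(check_common_value(nested))
    return problems
```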

5.9 Dialect Descriptions

Much of the tabular data that is published on the web is messy, and CSV parsers frequently need to be configured in order to correctly read in CSV. A dialect description provides hints to parsers about how to parse the file linked to from the url property in a table description. It can have any of the following properties, which relate to the flags described in Section 5 Parsing Tabular Data within the [ tabular-data-model ]:

commentPrefix

An atomic property that sets the comment prefix flag to the single provided value, which MUST be a single character string. The default is "#".

delimiter

An atomic property that sets the delimiter flag to the single provided value, which MUST be a single character string. The default is ",".

doubleQuote

A boolean atomic property that, if true, sets the escape character flag to ". If false, to \. The default is true.

encoding

An atomic property that sets the encoding flag to the single provided string value, which MUST be defined in [ encoding ]. The default is "utf-8".

header

A boolean atomic property that, if true, sets the header row count flag to 1, and if false to 0, unless headerRowCount is provided, in which case the value provided for the header property is ignored. The default is true.

headerRowCount

A numeric atomic property that sets the header row count flag to the single provided value, which MUST be a non-negative integer. The default is 1.

lineTerminators

An atomic property that sets the line terminators flag to either an array containing the single provided string value, or the provided array. The default is ["\r\n", "\n"].

quoteChar

An atomic property that sets the quote character flag to the single provided value, which MUST be a single character or null. If the value is null, the escape character flag is also set to null. The default is ".

skipBlankRows

A boolean atomic property that sets the skip blank rows flag to the single provided boolean value. The default is false.

skipColumns

A numeric atomic property that sets the skip columns flag to the single provided numeric value, which MUST be a non-negative integer. The default is 0.

skipInitialSpace

A boolean atomic property that, if true, sets the trim flag to "start". If false, to false. If the trim property is provided, the skipInitialSpace property is ignored. The default is false.

skipRows

A numeric atomic property that sets the skip rows flag to the single provided numeric value, which MUST be a non-negative integer. The default is 0.

trim

An atomic property that, if the boolean true, sets the trim flag to true and if the boolean false to false. If the value provided is a string, sets the trim flag to the provided value, which MUST be one of "true", "false", "start", or "end". The default is false.

@id

If included, @id is a link property that identifies the dialect described by this dialect description. It MUST NOT start with _: .

@type

If included, @type is an atomic property that MUST be set to "Dialect". Publishers MAY include this to provide additional information to JSON-LD based toolchains.

Note

Dialect descriptions do not provide a mechanism for handling CSV files in which there are multiple tables within a single file (e.g. separated by empty lines).

The default dialect description for CSV files is:

{

  "encoding": "utf-8",
  "lineTerminators": ["\r\n", "\n"],
  "quoteChar": "\"",
  "doubleQuote": true,
  "skipRows": 0,
  "commentPrefix": "#",
  "header": true,
  "headerRowCount": 1,
  "delimiter": ",",
  "skipColumns": 0,
  "skipBlankRows": false,
  "skipInitialSpace": false,
  "trim": false
}
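The interactions between dialect properties (header with headerRowCount, skipInitialSpace with trim, and quoteChar with doubleQuote) can be sketched as follows. This Python is a minimal illustration under assumptions: the function name parsing_flags and the shape of the returned dictionary are hypothetical, and only a subset of the flags is computed.

```python
DEFAULT_DIALECT = {
    "encoding": "utf-8",
    "lineTerminators": ["\r\n", "\n"],
    "quoteChar": "\"",
    "doubleQuote": True,
    "skipRows": 0,
    "commentPrefix": "#",
    "header": True,
    "headerRowCount": 1,
    "delimiter": ",",
    "skipColumns": 0,
    "skipBlankRows": False,
    "skipInitialSpace": False,
    "trim": False,
}


def parsing_flags(dialect):
    """Derive a few parser flags from a dialect description (a dict)."""
    merged = dict(DEFAULT_DIALECT, **dialect)
    # headerRowCount, when provided, takes priority over header.
    if "headerRowCount" in dialect:
        header_rows = dialect["headerRowCount"]
    else:
        header_rows = 1 if merged["header"] else 0
    # trim, when provided, takes priority over skipInitialSpace.
    if "trim" in dialect:
        trim = dialect["trim"]
    else:
        trim = "start" if merged["skipInitialSpace"] else False
    # A null quote character also nulls the escape character.
    quote_char = merged["quoteChar"]
    if quote_char is None:
        escape = None
    else:
        escape = "\"" if merged["doubleQuote"] else "\\"
    return {
        "header row count": header_rows,
        "trim": trim,
        "quote character": quote_char,
        "escape character": escape,
        "delimiter": merged["delimiter"],
    }
```

The sketch checks whether a property was explicitly provided in the dialect, rather than reading the merged value, because the override rules depend on presence and not on the value.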

5.10 Transformation Definitions

A transformation definition is a definition of how tabular data can be transformed into another format using a script or template.

For example, the following transformation definition will enable a processor that supports it to generate an iCalendar document using a Mustache template based on the JSON created from the simple mapping to JSON.

Example 40
{
  "url": "templates/ical.txt",
  "titles": "iCalendar",
  "targetFormat": "http://www.iana.org/assignments/media-types/text/calendar",
  "scriptFormat": "https://mustache.github.io/",
  "source": "json"
}

A processor that recognises templates in the Mustache format indicated by "https://mustache.github.io/" and that could convert tables into JSON based on [ csv2json ] would retrieve the template from "templates/ical.txt" and apply this to the resulting JSON.

Transformation definitions have the following properties:

5.10.1 Required Properties

Transformation definitions MUST have the following properties:

url

A link property giving the single URL of the file that the script or template is held in, relative to the location of the metadata document.

scriptFormat

A link property giving the single URL for the format that is used by the script or template. If one has been defined, this should be a URL for a media type, in the form http://www.iana.org/assignments/media-types/media-type such as http://www.iana.org/assignments/media-types/application/javascript. Otherwise, it can be any URL that describes the script or template format.

Note

The scriptFormat URL is intended as an informative identifier for the template format, and applications SHOULD NOT access the URL. The template formats that an application supports are implementation defined.

targetFormat

A link property giving the single URL for the format that will be created through the transformation. If one has been defined, this should be a URL for a media type, in the form http://www.iana.org/assignments/media-types/media-type such as http://www.iana.org/assignments/media-types/text/calendar. Otherwise, it can be any URL that describes the target format.

Note

The targetFormat URL is intended as an informative identifier for the target format, and applications SHOULD NOT access the URL.

5.10.2 Optional Properties

Transformation definitions MAY have the following properties:

source

A single string atomic property that provides, if specified, the format to which the tabular data should be transformed prior to the transformation using the script or template. If the value is json, the tabular data MUST first be transformed to JSON as defined by [ csv2json ] using standard mode. If the value is rdf, the tabular data MUST first be transformed to an RDF graph as defined by [ csv2rdf ] using standard mode. If the source property is missing or null (the default) then the source of the transformation is the annotated tabular data model. No other values are valid.

titles

A natural language property that describes the format that will be generated from the transformation. This is useful if the target format is a generic format (such as application/json) and the transformation is creating a specific profile of that format.

@id

If included, @id is a link property that identifies the transformation described by this transformation definition. It MUST NOT start with _: .

@type

If included, @type is an atomic property that MUST be set to "Template". Publishers MAY include this to provide additional information to JSON-LD based toolchains.

The transformation definition MAY contain any common properties to provide extra metadata about the transformation.
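
The constraints on the optional properties above can be checked mechanically. The following is a minimal, non-normative sketch; the function name check_transformation and the shape of its return value are illustrative assumptions and are not defined by this specification.

```python
# Allowed values for "source": json, rdf, or missing/null (the default).
ALLOWED_SOURCES = (None, "json", "rdf")


def check_transformation(definition: dict) -> list[str]:
    """Return a list of problems found in the optional properties.

    Hypothetical helper: only the rules stated for source, @id and
    @type are checked; required properties are not examined here.
    """
    problems = []
    if definition.get("source") not in ALLOWED_SOURCES:
        problems.append("source must be 'json', 'rdf' or null")
    identifier = definition.get("@id")
    if isinstance(identifier, str) and identifier.startswith("_:"):
        problems.append("@id must not start with '_:'")
    if "@type" in definition and definition["@type"] != "Template":
        problems.append("@type must be 'Template'")
    return problems
```

An empty list indicates that none of the three rules is violated.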

5.10.3 Processing Transformation Definitions

Implementations MAY present users with options for transformations based on the available transformation definitions and their properties. Implementations SHOULD filter this list to only include those transformations whose scriptFormat they understand and can apply, and whose source property, if present, specifies a format that the implementation can convert to. Users may find the targetFormat and titles properties useful in deciding which transformation to apply.

When directed by a user to transform a table using a transformation definition , implementations MUST :

  1. Convert the table to the format specified by the source property, if this is specified and not null .
  2. Fetch the script or template from the location specified by the url property and raise an error if this does not exist.
  3. Use the scriptFormat property to determine how to interpret that script or template, and apply it to the table (or the result of converting the table).
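
The filtering and the three processing steps above might be sketched as follows. This is a non-normative illustration: the callables convert and fetch, and the interpreters mapping, are hypothetical hooks that an implementation would supply, and none of these names come from this specification.

```python
def usable_transformations(definitions, known_script_formats, convertible_sources):
    """Keep definitions whose scriptFormat is understood and whose
    source, if present and not null, is a format we can convert to."""
    usable = []
    for definition in definitions:
        if definition.get("scriptFormat") not in known_script_formats:
            continue
        source = definition.get("source")
        if source is not None and source not in convertible_sources:
            continue
        usable.append(definition)
    return usable


def run_transformation(table, definition, convert, fetch, interpreters):
    """Apply one transformation definition to a table.

    convert(table, source), fetch(url) and interpreters[scriptFormat]
    are assumed, implementation-supplied callables.
    """
    # Step 1: convert the table if a non-null source is specified.
    source = definition.get("source")
    data = convert(table, source) if source is not None else table
    # Step 2: fetch the script or template; a missing one is an error.
    script = fetch(definition["url"])
    if script is None:
        raise LookupError("no script or template at " + definition["url"])
    # Step 3: interpret the script according to scriptFormat and apply it.
    interpret = interpreters[definition["scriptFormat"]]
    return interpret(script, data)
```

Passing the hooks as arguments keeps the sketch independent of any particular HTTP client or template engine.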

5.11 Datatypes

Cells within tables may be annotated with a datatype which indicates the type of the values obtained by parsing the string value of the cell. See [ tabular-data-model ] for a description of annotations on a datatype .

5.11.1 Built-in Datatypes

The possible built-in datatypes, as shown on the diagram , are:

  • the datatypes defined in [ xmlschema11-2 ] as derived from and including anyAtomicType
  • the datatype number which is mapped to double in the data model
  • the datatype binary which is mapped to base64Binary in the data model
  • the datatype datetime which is mapped to dateTime in the data model
  • the datatype any which is mapped to anyAtomicType in the data model
  • the datatype xml , a sub-type of string , which indicates the value is an XML fragment
  • the datatype html , a sub-type of string , which indicates the value is an HTML fragment
  • the datatype json , a sub-type of string , which indicates the value is serialized JSON
Built-in Datatype Hierarchy diagram
Fig. 2 Diagram showing the built-in datatypes, based on [ xmlschema11-2 ]; names in parentheses denote aliases to the [ xmlschema11-2 ] terms (see the diagram in SVG or PNG formats)
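
As a non-normative illustration, the aliases in the list above can be held in a small lookup table. The names ALIASES, STRING_SUBTYPES and xsd_equivalent are assumptions made for this sketch only.

```python
# Aliases from the list above and the XML Schema datatype each maps to.
ALIASES = {
    "number": "double",
    "binary": "base64Binary",
    "datetime": "dateTime",
    "any": "anyAtomicType",
}

# Additional built-in datatypes described above as sub-types of string.
STRING_SUBTYPES = {"xml", "html", "json"}


def xsd_equivalent(name: str) -> str:
    """Return the XML Schema datatype a built-in name stands for.

    xml, html and json are reported as string (their stated parent);
    any other name is assumed to already be an XML Schema name.
    """
    if name in STRING_SUBTYPES:
        return "string"
    return ALIASES.get(name, name)
```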

5.11.2 Derived Datatypes

More specialised datatypes can be defined through a datatype description . A datatype description may have any of the following properties, all of which are optional.

base

An atomic property that contains a single string: a term defined in the default context representing a built-in datatype URL, as listed above. Its default is string . All values of the datatype MUST be valid values of the base datatype. The value of this property becomes the base annotation for the described datatype .

format

An atomic property that contains either a single string or an object that defines the format of a value of this type, used when parsing a string value as described in Parsing Cells in [ tabular-data-model ]. The value of this property becomes the format annotation for the described datatype .

length

A numeric atomic property that contains a single integer that is the exact length of the value. The value of this property becomes the length annotation for the described datatype . See Length Constraints in [ tabular-data-model ] for details.

minLength

An atomic property that contains a single integer that is the minimum length of the value. The value of this property becomes the minimum length annotation for the described datatype . See Length Constraints in [ tabular-data-model ] for details.

maxLength

A numeric atomic property that contains a single integer that is the maximum length of the value. The value of this property becomes the maximum length annotation for the described datatype . See Length Constraints in [ tabular-data-model ] for details.

minimum

An atomic property that contains a single number or string that is the minimum valid value (inclusive); equivalent to minInclusive . The value of this property becomes the minimum annotation for the described datatype . See Value Constraints in [ tabular-data-model ] for details.

maximum

An atomic property that contains a single number or string that is the maximum valid value (inclusive); equivalent to maxInclusive . The value of this property becomes the maximum annotation for the described datatype . See Value Constraints in [ tabular-data-model ] for details.

minInclusive

An atomic property that contains a single number or string that is the minimum valid value (inclusive). The value of this property becomes the minimum annotation for the described datatype . See Value Constraints in [ tabular-data-model ] for details.

maxInclusive

An atomic property that contains a single number or string that is the maximum valid value (inclusive). The value of this property becomes the maximum annotation for the described datatype . See Value Constraints in [ tabular-data-model ] for details.

minExclusive

An atomic property that contains a single number or string that is the minimum valid value (exclusive). The value of this property becomes the minimum exclusive annotation for the described datatype . See Value Constraints in [ tabular-data-model ] for details.

maxExclusive

An atomic property that contains a single number or string that is the maximum valid value (exclusive). The value of this property becomes the maximum exclusive annotation for the described datatype . See Value Constraints in [ tabular-data-model ] for details.

The datatype description MAY contain any common properties to provide extra metadata about the datatype, such as a title or description.

Applications MUST raise an error if both length and minLength are specified and they do not have the same value. Similarly, applications MUST raise an error if both length and maxLength are specified and they do not have the same value. Applications MUST raise an error if length , maxLength , or minLength are specified and the base datatype is not string or one of its subtypes, or a binary type.
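
A minimal, non-normative sketch of these checks follows. The set LENGTH_BASES is an illustrative and deliberately incomplete list of string and binary datatype names, and the function name is an assumption of this sketch.

```python
# Illustrative, incomplete set of string-derived and binary datatypes.
LENGTH_BASES = {
    "string", "normalizedString", "token", "language", "Name", "NMTOKEN",
    "xml", "html", "json",
    "binary", "base64Binary", "hexBinary",
}


def check_length_facets(datatype: dict) -> None:
    """Raise ValueError if the length facets of a datatype description
    conflict with each other or with the base datatype."""
    length = datatype.get("length")
    min_length = datatype.get("minLength")
    max_length = datatype.get("maxLength")
    if length is None and min_length is None and max_length is None:
        return
    # The default base is string, as stated for the base property.
    base = datatype.get("base", "string")
    if base not in LENGTH_BASES:
        raise ValueError("length facets are not allowed for base " + base)
    if length is not None and min_length is not None and length != min_length:
        raise ValueError("length and minLength differ")
    if length is not None and max_length is not None and length != max_length:
        raise ValueError("length and maxLength differ")
```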

In all ways, including the errors described below, the minimum property is equivalent to the minInclusive property and the maximum property is equivalent to the maxInclusive property. Applications MUST raise an error if both minimum and minInclusive are specified and they do not have the same value. Similarly, applications MUST raise an error if both maximum and maxInclusive are specified and they do not have the same value.

Applications MUST raise an error if both minInclusive and minExclusive are specified, or if both maxInclusive and maxExclusive are specified. Applications MUST raise an error if both minInclusive and maxInclusive are specified and maxInclusive is less than minInclusive , or if both minInclusive and maxExclusive are specified and maxExclusive is less than or equal to minInclusive . Similarly, applications MUST raise an error if both minExclusive and maxExclusive are specified and maxExclusive is less than minExclusive , or if both minExclusive and maxInclusive are specified and maxInclusive is less than or equal to minExclusive .

Applications MUST raise an error if minimum , minInclusive , maximum , maxInclusive , minExclusive , or maxExclusive are specified and the base datatype is not a numeric, date/time, or duration type.

Validation against these properties is as defined in [ xmlschema11-2 ].
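
The consistency rules for the value constraints can be sketched as below. This is non-normative: it assumes the facet values have already been parsed into mutually comparable Python values, it omits the check on the base datatype, and the function name is hypothetical.

```python
FACETS = ("minInclusive", "maxInclusive", "minExclusive", "maxExclusive")


def check_value_facets(datatype: dict) -> dict:
    """Fold minimum/maximum into minInclusive/maxInclusive, then raise
    ValueError on any of the conflicts described above.

    Returns the facets that are present, keyed by canonical name.
    """
    d = dict(datatype)
    for alias, canonical in (("minimum", "minInclusive"), ("maximum", "maxInclusive")):
        if alias in d:
            if canonical in d and d[canonical] != d[alias]:
                raise ValueError(alias + " and " + canonical + " differ")
            d[canonical] = d.pop(alias)
    min_in, max_in, min_ex, max_ex = (d.get(name) for name in FACETS)
    if min_in is not None and min_ex is not None:
        raise ValueError("both minInclusive and minExclusive specified")
    if max_in is not None and max_ex is not None:
        raise ValueError("both maxInclusive and maxExclusive specified")
    if min_in is not None and max_in is not None and max_in < min_in:
        raise ValueError("maxInclusive is less than minInclusive")
    if min_in is not None and max_ex is not None and max_ex <= min_in:
        raise ValueError("maxExclusive is not greater than minInclusive")
    if min_ex is not None and max_ex is not None and max_ex < min_ex:
        raise ValueError("maxExclusive is less than minExclusive")
    if min_ex is not None and max_in is not None and max_in <= min_ex:
        raise ValueError("maxInclusive is not greater than minExclusive")
    return {name: d[name] for name in FACETS if name in d}
```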

6. Merging Metadata

When processing a tabular data file, the Locating Metadata section in [ tabular-data-model ] describes different locations for locating metadata. To properly transform a tabular data file, such as a CSV file, processors MUST merge metadata from these separate sources to create a single metadata document in a manner consistent with this algorithm.

Implementations MUST check and issue warnings where merge issues are found as noted below and in the relevant property definitions.

Merging of metadata happens in order from highest priority to lowest priority by merging the first two metadata files ( A and B ) together to create new merged metadata AB' . This is then used to merge in the next metadata file until all metadata have been processed to create a table group description .

If the top-level object of either of the metadata files is a table description , it is turned into a table group description containing a single table description (i.e., having a tables property whose value is an array containing the original table description). Any @context definitions are moved from the table description to the table group description .
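
The wrapping step and the merge order described above might be sketched as follows. This is non-normative: it assumes that a top-level object with a tables property is already a table group description, and merge_pair is a hypothetical callable standing in for the merge of two normalized documents.

```python
from functools import reduce


def to_table_group(document: dict) -> dict:
    """Wrap a table description in a table group description, moving
    @context from the table description to the group."""
    if "tables" in document:
        return document  # assumed to be a table group description already
    table = {key: value for key, value in document.items() if key != "@context"}
    group = {}
    if "@context" in document:
        group["@context"] = document["@context"]
    group["tables"] = [table]
    return group


def merge_all(documents, merge_pair):
    """Merge documents given in priority order (highest first).

    The first two are merged, and the result is then merged with each
    following document in turn. At least one document is required.
    """
    return reduce(merge_pair, (to_table_group(doc) for doc in documents))
```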

Merging has two stages: the normalization of metadata documents, described in section 6.1 Normalization , and the merging of those normalized documents, described in section 6.2 Merging .

6.1 Normalization

Prior to merging, each description object is expanded relative to its @context and values are normalized as follows:

  1. If the property is a common property or notes the value MUST be normalized as follows:
    1. If the value is an array, each value within the array is normalized in place as described here.
    2. If the value is a string, replace it with an object with a @value property whose value is that string. If a default language is specified, add a @language property whose value is that default language.
    3. If the value is an object with a @value property, it remains as is.
    4. If the value is any other object, normalize each property of that object as follows:
      1. If the property is @id , expand any prefixed names and resolve its value against the base URL .
      2. If the property is @type , then its value remains as is.
      3. Otherwise, normalize the value of the property as if it were a common property, according to this algorithm.
    5. Otherwise, the value remains as is.
  2. If the property is an array property each element of the value is normalized using this algorithm.
  3. If the property is a link property the value is turned into an absolute URL using the base URL .
  4. If the property is an object property with a string value, the string is a URL referencing a JSON document containing a single object. Fetch this URL to retrieve an object, which may have a local @context . Raise an error if fetching this URL does not result in a JSON object. Normalize each property in the resulting object recursively using this algorithm and with its local @context then remove the local @context property. If the resulting object does not have an @id property, add an @id whose value is the original URL. This object becomes the value of the original object property.
  5. If the property is an object property with an object value, normalize each property recursively using this algorithm.
  6. If the property is a natural language property and the value is not already an object, it is turned into an object whose properties are language codes and where the values of those properties are arrays. The suitable language code for the values is determined through the default language ; if it can't be determined the language code und MUST be used.
  7. If the property is an atomic property that can be a string or an object, normalize to the object form as described for that property.

Following this normalization process, the @base and @language properties within the @context are no longer relevant; the normalized metadata can have its @context set to http://www.w3.org/ns/csvw .
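
Step 1 of the normalization algorithm, which covers common properties and notes, can be sketched as a small recursive function. This is a non-normative illustration: the function name and its parameters are assumptions, and the expand argument is a hypothetical hook for expanding prefixed names that defaults to leaving the value unchanged.

```python
from urllib.parse import urljoin


def normalize_common(value, base_url, default_language=None, expand=lambda v: v):
    """Normalize the value of a common property (step 1 above)."""
    if isinstance(value, list):
        # 1.1: normalize each value within the array.
        return [normalize_common(v, base_url, default_language, expand) for v in value]
    if isinstance(value, str):
        # 1.2: wrap the string in a value object, adding the default language.
        wrapped = {"@value": value}
        if default_language is not None:
            wrapped["@language"] = default_language
        return wrapped
    if isinstance(value, dict):
        if "@value" in value:
            return value  # 1.3: value objects remain as is
        result = {}
        for key, item in value.items():
            if key == "@id":
                # 1.4.1: expand prefixed names, then resolve against the base URL.
                result[key] = urljoin(base_url, expand(item))
            elif key == "@type":
                result[key] = item  # 1.4.2: remains as is
            else:
                # 1.4.3: treat as a common property, recursively.
                result[key] = normalize_common(item, base_url, default_language, expand)
        return result
    return value  # 1.5: numbers, booleans and null remain as is
```

The examples in section 6.1.1 give inputs and normalized outputs against which such a function can be compared.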

6.1.1 Examples

This section is non-normative.

The following are examples of how common properties are normalized.

In this example, a simple string is used as the title for a table using the dc:title common property :

Example 41: Giving a title to a Table
{
  "@context": [ "http://www.w3.org/ns/csvw", { "@language": "en" } ],
  "@type": "Table",
  "url": "http://example.com/table.csv",
  "tableSchema": [...],
  "dc:title": "The title of this Table"
}

Since there is a default language , this is equivalent to explicitly specifying the language of that title; the original value becomes the value of the @value property within a value object :

Example 42: Giving a title to a Table
{
  "@type": "Table",
  "url": "http://example.com/table.csv",
  "tableSchema": [...],
  "dc:title": {"@value": "The title of this Table", "@language": "en"}
}

It is also possible to use a simple value object to give a title. However, in this case the default language is not applied to the title:

Example 43: Giving a label to a Table
{
  "@context": [ "http://www.w3.org/ns/csvw", { "@language": "en" } ],
  "@type": "Table",
  "url": "http://example.com/table.csv",
  "tableSchema": [...],
  "dc:title": {"@value": "The title of this Table"}
}


The next example uses an array of a string and a value object to give two titles with different languages:

Example 44: Giving a title to a Table
{
  "@context": [ "http://www.w3.org/ns/csvw", { "@language": "en" } ],
  "@type": "Table",
  "url": "http://example.com/table.csv",
  "tableSchema": [...],
  "dc:title": [
    "The title of this Table",
    {"@value": "Der Titel dieser Tabelle", "@language": "de"}
  ]
}

The normalized version of this is:

Example 45: Giving a title to a Table
{
  "@type": "Table",
  "url": "http://example.com/table.csv",
  "tableSchema": [...],
  "dc:title": [
    {"@value": "The title of this Table", "@language": "en"},
    {"@value": "Der Titel dieser Tabelle", "@language": "de"}
  ]
}

The next example demonstrates a node object , in which the value of the schema:url property is a reference to another resource:

Example 46: Referencing a URL
{
  "@context": [ "http://www.w3.org/ns/csvw", { "@base": "http://example.com/" } ],
  "@type": "Table",
  "url": "table.csv",
  "tableSchema": [...],
  "schema:url": {"@id": "table.csv"}
}

The value of the @id property is normalized as described in section 6.1 Normalization against the base URL provided through the @base property, which means the above example is equivalent to:

Example 47: Referencing a URL
{
  "@context": "http://www.w3.org/ns/csvw",
  "@type": "Table",
  "url": "http://example.com/table.csv",
  "tableSchema": [...],
  "schema:url": {"@id": "http://example.com/table.csv"}
}

The following example shows the dc:publisher property as an array that contains a single node object :

Example 48: Embedded object
{
  "@context": "http://www.w3.org/ns/csvw",
  "@type": "Table",
  "url": "http://example.com/table.csv",
  "tableSchema": [...],
  "dc:publisher": [{
    "schema:name": "Example Municipality",
    "schema:url": {"@id": "http://example.org"}
  }]
}

Following normalization, the schema:name property of the dc:publisher is expanded as shown:

Example 49: Normalized embedded object
"dc:publisher": [{
  "schema:name": { "@value": "Example Municipality" },
  "schema:url": { "@id": "http://example.org" }
}]
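
The rules in section 6.1 Normalization for link properties and natural language properties are simpler than those for common properties. A non-normative sketch follows; both function names are illustrative assumptions, and the base URL is assumed to be known to the caller.

```python
from urllib.parse import urljoin


def normalize_link(value: str, base_url: str) -> str:
    """Turn a link property value into an absolute URL."""
    return urljoin(base_url, value)


def normalize_natural_language(value, default_language=None):
    """Turn a string or array of strings into an object keyed by
    language code, using und when no default language is known."""
    if isinstance(value, dict):
        return value  # already in object form
    values = value if isinstance(value, list) else [value]
    return {default_language or "und": list(values)}
```

With a default language of en, a titles value of "Foo" becomes an object with a single en property holding an array; without one, the property is und .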

6.2 Merging

A description object B is merged into an original description object A by merging each property of B into A . If the property from B does not exist on A , it is simply added to A . If A does have the property, the way the values are merged depends on the type of the property.
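
The outer shape of this merge can be sketched as below. This is non-normative: the per-type merge rules are not reproduced here, so they are represented by merge_values , a hypothetical callable supplied by the caller.

```python
def merge_descriptions(a: dict, b: dict, merge_values) -> dict:
    """Merge description object b into a copy of a.

    merge_values(name, a_value, b_value) is an assumed callable that
    applies whatever rule is appropriate for the property's type.
    """
    merged = dict(a)
    for name, b_value in b.items():
        if name not in merged:
            merged[name] = b_value  # absent from A: simply added
        else:
            merged[name] = merge_values(name, merged[name], b_value)
    return merged
```

Returning a copy rather than modifying A in place is a design choice of this sketch, made so that the inputs remain available for comparison.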

6.3 Example

This section is non-normative.

For example, consider the following two metadata documents to be merged (located at http://example.com/metadata.json and http://example.com/doc1.csv-metadata.json ):

Example 50: Metadata A to be merged
{
  "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
  "tables": [{
    "url": "doc1.csv",
    "dc:title": "foo",
    "tableDirection": "ltr",
    "tableSchema": {
      "aboutUrl": "{#foo}",
      "columns": [{
        "name": "foo", 
        "titles": "Foo", 
        "required": true
      }, {
        "name": "bar"
      }]
    }
  }, {
    "url": "doc2.csv"
  }]

}
Example 51: Metadata B to be merged
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "http://example.com/doc1.csv",
  "dc:description": "bar",
  "tableSchema": {
    "propertyUrl": "{#_name}",
    "columns": [{
      "titles": "Foo",
      "required": false
    }, {
      "name": "bar"
    }, {
    }]
  }

}

The process of merging performs the following steps:

  1. Normalize A to use the language specified in the @context within the natural language property titles , and to expand the link property url against the base URL for A , http://example.com/metadata.json :
    Example 52: Metadata A after normalization
    {
      "tables": [{
        "url": "http://example.com/doc1.csv",
        "dc:title": {"@value": "foo", "@language": "en"},
        "tableDirection": "ltr",
        "tableSchema": {
          "aboutUrl": "{#foo}",
          "columns": [{
            "name": "foo", 
            "titles": { "en": [ "Foo" ] }, 
            "required": true
          }, {
            "name": "bar"
          }]
        }
      }, {
        "url": "http://example.com/doc2.csv"
      }]
    }
  2. Normalize B from a table description to a table group description by embedding the table description in a tables property, resolve the link property url (which is already an absolute URL), and normalize the titles property to use the und language:
    Example 53: Metadata B after normalization
    {
      "tables": [{
        "url": "http://example.com/doc1.csv",
        "dc:description": {"@value": "bar"},
        "tableSchema": {
          "propertyUrl": "{#_name}",
          "columns": [{
            "titles": { "und": [ "Foo" ] },
            "required": false
          }, {
            "name": "bar"
          }, {
          }]
        }
      }]
    }
  3. tables is an array property with rules specified in section 5.3 Table Groups ; each value is merged accordingly:
    1. The first tables values from A and B are now the following:
      Example 54: first tables values from A and B
      {
        "url": "http://example.com/doc1.csv",
        "dc:title": {"@value": "foo", "@language": "en"},
        "tableDirection": "ltr",
        "tableSchema": {
          "aboutUrl": "{#foo}",
          "columns": [{
            "name": "foo", 
            "titles": { "en": [ "Foo" ] }, 
            "required": true
          }, {
            "name": "bar"
          }]
        }
      }
      
      
      {
        "url": "http://example.com/doc1.csv",
        "dc:description": {"@value": "bar"},
        "tableSchema": {
          "propertyUrl": "{#_name}",
          "columns": [{
            "titles": { "und": [ "Foo" ] },
            "required": false
          }, {
            "name": "bar"
          }, {
          }]
        }
      
      }
      
      As these have the same url , these are merged. Each property from B is considered:
      1. url is the same (otherwise the two table descriptions would not be merged).
      2. dc:description does not exist in A so it is added to A .
      3. The objects held by the tableSchema object properties are merged:
        1. B has a propertyUrl which is added to A .
        2. Each has columns which is merged as described in 5.5 Schemas :
          1. The first column matches on titles because they each have the value "Foo" and the language tag en matches und . Because the und value is present in the en array, that value is removed from the und array, and because the array is now empty the und property is removed. The value of required in A is retained, as is the value of name .
          2. The second column has a match on name .
      The merged table description is now the following:
      Example 55: first tables values merged
      {
        "url": "http://example.com/doc1.csv",
        "dc:title": {"@value": "foo", "@language": "en"},
        "dc:description": {"@value": "bar"},
        "tableDirection": "ltr",
        "tableSchema": {
          "aboutUrl": "{#foo}",
          "propertyUrl": "{#_name}",
          "columns": [{
            "name": "foo", 
            "titles": { "en": [ "Foo" ]}, 
            "required": true
          },{
            "name": "bar"
          }]
        }
      }

    2. The second tables value from A is retained as is.

The resulting merged metadata is now the following:

Example 56: Fully merged metadata
{
  "tables": [{
    "url": "http://example.com/doc1.csv",
    "dc:title": {"@value": "foo", "@language": "en"},
    "dc:description": {"@value": "bar"},
    "tableDirection": "ltr",
    "tableSchema": {
      "aboutUrl": "{#foo}",
      "propertyUrl": "{#_name}",
      "columns": [{
        "name": "foo", 
        "titles": { "en": [ "Foo" ]}, 
        "required": true
      },{
        "name": "bar"
      }]
    }
  }, {
    "url": "http://example.com/doc2.csv"
  }]



}
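
The handling of titles in the first column of this example can be sketched as follows. This is non-normative and covers only what the walkthrough describes: matching when a value is shared and the language tags are equal or one of them is und , and removing und values that already appear under a specific language. Taking the union of the remaining values is an assumption of this sketch.

```python
def titles_match(a_titles: dict, b_titles: dict) -> bool:
    """True if the two normalized titles objects share a value under
    language tags that are equal or where either tag is und."""
    for a_lang, a_values in a_titles.items():
        for b_lang, b_values in b_titles.items():
            compatible = a_lang == b_lang or "und" in (a_lang, b_lang)
            if compatible and set(a_values) & set(b_values):
                return True
    return False


def merge_titles(a_titles: dict, b_titles: dict) -> dict:
    """Combine two normalized titles objects (union is an assumption),
    then drop und values already present under a specific language."""
    merged = {lang: list(values) for lang, values in a_titles.items()}
    for lang, values in b_titles.items():
        target = merged.setdefault(lang, [])
        for value in values:
            if value not in target:
                target.append(value)
    known = {v for lang, values in merged.items() if lang != "und" for v in values}
    remaining = [v for v in merged.get("und", []) if v not in known]
    if remaining:
        merged["und"] = remaining
    else:
        merged.pop("und", None)  # an empty und array is removed
    return merged
```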


7. Security Considerations

Applications that process tabular data may use that data to drive other actions, which may have security implications. These behaviors are outside the scope of this specification.

Third party metadata provided about a tabular data file (such as a CSV file) may rename or ignore headers, or exclude rows or columns, which may lead to data being misinterpreted by applications that process it.

Transformation definitions are a possible security risk as they enable the creators of metadata to reference arbitrary code that may be executed to convert tabular data into other formats. Implementations should run this arbitrary code in a sandboxed environment to reduce the security risk.

A. JSON-LD Dialect

The Metadata Vocabulary for Tabular Data uses a format based on JSON-LD [ JSON-LD ] with some restrictions.

A.1 URL Compaction

When normalizing metadata, prefixed names used in common properties and notes are expanded to absolute URLs. For some serializations, these are more appropriately presented using prefixed names or terms . This algorithm compacts an absolute URL to a prefixed name or term .

  1. If the URL exactly matches the absolute IRI associated with a term in [ csvw-context ], replace the URL with that term.
  2. Otherwise, if the URL starts with the absolute IRI associated with a term in [ csvw-context ], replace the matched part of the URL with the term separated with a : ( U+003A ) to create a prefixed name . If the resulting prefixed name is rdf:type , replace with @type .
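
A non-normative sketch of this compaction follows. The CONTEXT table is a small illustrative subset of [ csvw-context ]; the Table entry is an assumed example of a term. Restricting prefix matches to IRIs ending in # or / , preferring the longest match, and returning unmatched URLs unchanged are all assumptions of this sketch rather than rules stated above.

```python
# Illustrative subset of term-to-IRI associations (the Table entry is assumed).
CONTEXT = {
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "csvw": "http://www.w3.org/ns/csvw#",
    "Table": "http://www.w3.org/ns/csvw#Table",
}


def compact_url(url: str, context: dict = CONTEXT) -> str:
    """Compact an absolute URL to a term or prefixed name."""
    # Step 1: an exact match is replaced by the term itself.
    for term, iri in context.items():
        if url == iri:
            return term
    # Step 2: otherwise replace a matching leading IRI by "term:".
    candidates = [
        (iri, term)
        for term, iri in context.items()
        if iri.endswith(("#", "/")) and url.startswith(iri)
    ]
    if not candidates:
        return url  # assumption: leave unmatched URLs as they are
    iri, term = max(candidates, key=lambda pair: len(pair[0]))
    prefixed = term + ":" + url[len(iri):]
    return "@type" if prefixed == "rdf:type" else prefixed
```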

B. Acknowledgements

This document is largely influenced by the Data Package specification and the JSON Table Schema , which are maintained as part of Data Protocols . Particular contributors to that work are Rufus Pollock, Paul Fitzpatrick, Andrew Berkeley, Francis Irving, Benoit Chesneau, Leigh Dodds, Martin Keegan, and Gunnlaugur Thor Briem.

At the time of publication, the following individuals had participated in the Working Group, in the order of their first name: Adam Retter, Alf Eaton, Anastasia Dimou, Andy Seaborne, Axel Polleres, Christopher Gutteridge, Dan Brickley, Davide Ceolin, Eric Stephan, Erik Mannens, Gregg Kellogg, Ivan Herman, Jeni Tennison, Jeremy Tandy, Jürgen Umbrich, Rufus Pollock, Stasinos Konstantopoulos, William Ingram, and Yakov Shafranovich.

C. IANA Considerations

This section has not yet been submitted to IANA for review, approval, and registration.

application/csvm+json

Type name:
application
Subtype name:
csvm+json
Required parameters:
N/A
Optional parameters:
N/A
Encoding considerations:
See [ RFC6839 ], section 3.1
Security considerations:
See [ RFC4627 Issue ] and section 7. Security Considerations We intend to include a registration for a new datatype, namely application/csvm+json . We invite comment about how to indicate of this specification
Interoperability considerations:
Note that this format is consistent with application/ld+json , or whether we should just use not the same as the existing application/json text/csv or and application/ld+json text/tab-delimited-values and not create mediatypes, but a specific JSON-based format used to annotate such documents
Published specification:
This specification.
Applications that use this media type:
Data validators and converters to alternate structured representations of tabular data.
Fragment identifier considerations:
See [ RFC6839 ], section 3.1
Additional information:
Magic number(s):
n/a
File extension(s):
".json"
Macintosh file type code(s):
"TEXT"
Person & email address to contact for further information:
Ivan Herman <ivan@w3.org>
Intended usage:
COMMON
Restrictions on usage:
None
Author/Change controller:
The Tabular metadata specification is the product of the CSV on the Web Working Group. The W3C reserves change control over this specification.

D. JSON-LD Context

The JSON-LD context, located at http://www.w3.org/ns/csvw.jsonld, is used with metadata documents. When used within a metadata document, the context can be referenced as http://www.w3.org/ns/csvw. See [ csvw-context ] for a full description of defined terms and prefixes. This context may be updated from time to time to define new terms and prefixes.

{ "@context": { "id": "@id", "type": "@type", "dc:title": { "@container": "@language" }, "dc:description": { "@container": "@language" }, "rdfs:comment": { "@container": "@language" }, "rdfs:domain": { "@type": "@id" }, "rdfs:label": { "@container": "@language" }, "rdfs:range": { "@type": "@id" }, "rdfs:subClassOf": { "@type": "@id" }, "rdfs:subPropertyOf": { "@type": "@id" }, "owl:equivalentClass": { "@type": "@vocab" }, "owl:equivalentProperty": { "@type": "@vocab" }, "owl:oneOf": { "@container": "@list", "@type": "@vocab" }, "owl:imports": { "@type": "@id" }, "owl:versionInfo": { "@type": "xsd:string", "@language": null }, "owl:inverseOf": { "@type": "@vocab" }, "owl:unionOf": { "@type": "@vocab", "@container": "@list" }, "rdfs_classes": { "@reverse": "rdfs:isDefinedBy", "@type": "@id" }, "rdfs_properties": { "@reverse": "rdfs:isDefinedBy", "@type": "@id" }, "rdfs_datatypes": { "@reverse": "rdfs:isDefinedBy", "@type": "@id" }, "rdfs_instances": { "@reverse": "rdfs:isDefinedBy", "@type": "@id" }, "cc": "http://creativecommons.org/ns#", "csvw": "http://www.w3.org/ns/csvw#", "ctag": "http://commontag.org/ns#", "dc": "http://purl.org/dc/terms/", "dc11": "http://purl.org/dc/elements/1.1/", "dcat": "http://www.w3.org/ns/dcat#", "dcterms": "http://purl.org/dc/terms/", "foaf": "http://xmlns.com/foaf/0.1/", "gr": "http://purl.org/goodrelations/v1#", "grddl": "http://www.w3.org/2003/g/data-view#", "ical": "http://www.w3.org/2002/12/cal/icaltzd#", "ma": "http://www.w3.org/ns/ma-ont#", "og": "http://ogp.me/ns#", "org": "http://www.w3.org/ns/org#", "owl": "http://www.w3.org/2002/07/owl#", "prov": "http://www.w3.org/ns/prov#", "qb": "http://purl.org/linked-data/cube#", "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#", "rdfa": "http://www.w3.org/ns/rdfa#", "rdfs": "http://www.w3.org/2000/01/rdf-schema#", "rev": "http://purl.org/stuff/rev#", "rif": "http://www.w3.org/2007/rif#", "rr": "http://www.w3.org/ns/r2rml#", "schema": { "@id": "csvw:schema", "@type": "@id" }, "sd": 
"http://www.w3.org/ns/sparql-service-description#", "sioc": "http://rdfs.org/sioc/ns#", "skos": "http://www.w3.org/2004/02/skos/core#", "skosxl": "http://www.w3.org/2008/05/skos-xl#", "v": "http://rdf.data-vocabulary.org/#", "vcard": "http://www.w3.org/2006/vcard/ns#", "void": "http://rdfs.org/ns/void#", "wdr": "http://www.w3.org/2007/05/powder#", "wrds": "http://www.w3.org/2007/05/powder-s#", "xhv": "http://www.w3.org/1999/xhtml/vocab#", "xml": "rdf:XMLLiteral", "xsd": "http://www.w3.org/2001/XMLSchema#", "any": "xsd:anySimpleType", "binary": "xsd:base64Binary", "datetime": "xsd:dateTime", "describedby": "wrds:describedby", "html": "rdf:HTML", "license": "xhv:license", "maximum": "csvw:maxInclusive", "minimum": "csvw:minInclusive", "number": "xsd:double", "role": "xhv:role", "Column": "csvw:Column", "Dialect": "csvw:Dialect", "Direction": "csvw:Direction", "Schema": "csvw:Schema", "Table": "csvw:Table", "TableGroup": "csvw:TableGroup", "Template": "csvw:Template", "columns": { "@id": "csvw:columns", "@type": "@id", "@container": "@list" }, "commentPrefix": { "@id": "csvw:commentPrefix" }, "datatype": { "@id": "csvw:datatype" }, "default": { "@id": "csvw:default" }, "delimiter": { "@id": "csvw:delimiter" }, "dialect": { "@id": "csvw:dialect", "@type": "@id" }, "doubleQuote": { "@id": "csvw:doubleQuote", "@type": "xsd:boolean" }, "encoding": { "@id": "csvw:encoding" }, "foreignKeys": { "@id": "csvw:foreignKeys" }, "format": { "@id": "csvw:format" }, "header": { "@id": "csvw:header", "@type": "xsd:boolean" }, "headerColumnCount": { "@id": "csvw:headerColumnCount", "@type": "xsd:nonNegativeInteger" }, "headerRowCount": { "@id": "csvw:headerRowCount", "@type": "xsd:nonNegativeInteger" }, "language": { "@id": "csvw:language" }, "length": { "@id": "csvw:length", "@type": "xsd:nonNegativeInteger" }, "lineTerminator": { "@id": "csvw:lineTerminator" }, "maxExclusive": { "@id": "csvw:maxExclusive" }, "maxInclusive": { "@id": "csvw:maxInclusive" }, "maxLength": { "@id": 
"csvw:maxLength", "@type": "xsd:nonNegativeInteger" }, "minExclusive": { "@id": "csvw:minExclusive" }, "minInclusive": { "@id": "csvw:minInclusive" }, "minLength": { "@id": "csvw:minLength", "@type": "xsd:nonNegativeInteger" }, "name": { "@id": "csvw:name" }, "notes": { "@id": "csvw:notes" }, "null": { "@id": "csvw:null" }, "predicateUrl": { "@id": "csvw:predicateUrl", "@type": "xsd:anyURI" }, "primaryKey": { "@id": "csvw:primaryKey" }, "quoteChar": { "@id": "csvw:quoteChar" }, "required": { "@id": "csvw:required", "@type": "xsd:boolean" }, "resources": { "@id": "csvw:resources", "@type": "@id", "@container": "@set" }, "row": { "@id": "csvw:row", "@container": "@set" }, "separator": { "@id": "csvw:separator" }, "skipBlankRows": { "@id": "csvw:skipBlankRows", "@type": "xsd:boolean" }, "skipColumns": { "@id": "csvw:skipColumns", "@type": "xsd:nonNegativeInteger" }, "skipInitialSpace": { "@id": "csvw:skipInitialSpace", "@type": "xsd:boolean" }, "skipRows": { "@id": "csvw:skipRows", "@type": "xsd:nonNegativeInteger" }, "source": { "@id": "csvw:source" }, "table": { "@id": "csvw:table", "@type": "@id", "@container": "@set" }, "table-direction": { "@id": "csvw:table-direction", "@type": "@vocab" }, "targetFormat": { "@id": "csvw:targetFormat" }, "templateFormat": { "@id": "csvw:templateFormat" }, "templates": { "@id": "csvw:templates", "@type": "@id" }, "text-direction": { "@id": "csvw:text-direction", "@type": "@vocab" }, "title": { "@id": "csvw:title", "@container": "@language" }, "trim": { "@id": "csvw:trim", "@type": "xsd:boolean" }, "uriTemplate": { "@id": "csvw:uriTemplate" }, "json": "csvw:json" }, "@id": "http://www.w3.org/ns/csvw#", "@type": "owl:Ontology", "dc:title": { "en": "Metadata Vocabulary for Tabular Data" }, "dc:description": { "en": "Validation, conversion, display and search of tabular data on the web\n requires additional metadata that describes how the data should be\n interpreted. 
This document defines a vocabulary for metadata that\n annotates tabular data. This can be used to provide metadata at various\n levels, from collections of data from CSV documents and how they relate\n to each other down to individual cells within a table." }, "rdfs_classes": [ { "@id": "csvw:Column", "@type": "rdfs:Class", "rdfs:label": { "en": "Column Description" }, "rdfs:comment": { "en": "A Column Description describes a single column." } }, { "@id": "csvw:Dialect", "@type": "rdfs:Class", "rdfs:label": { "en": "Dialect Description" }, "rdfs:comment": { "en": "A Dialect Description provides hints to parsers about how to parse a linked file." } }, { "@id": "csvw:Direction", "@type": "rdfs:Class", "rdfs:label": { "en": "Direction" }, "rdfs:comment": { "en": "The class of table/text directions." } }, { "@id": "csvw:Schema", "@type": "rdfs:Class", "rdfs:label": { "en": "Schema" }, "rdfs:comment": { "en": "A Schema is a definition of a tabular format that may be common to multiple tables." } }, { "@id": "csvw:Table", "@type": "rdfs:Class", "rdfs:label": { "en": "Table Description" }, "rdfs:comment": { "en": "A table description is a JSON object that describes a table within a CSV file." } }, { "@id": "csvw:TableGroup", "@type": "rdfs:Class", "rdfs:label": { "en": "Table Group Description" }, "rdfs:comment": { "en": "A Table Group Description describes a group of Tables." } }, { "@id": "csvw:Template", "@type": "rdfs:Class", "rdfs:label": { "en": "Template Specification" }, "rdfs:comment": { "en": "A Template Specification is a definition of how tabular data can be transformed into another format." } } ], "rdfs_properties": [ { "@id": "csvw:columns", "@type": "rdf:Property", "rdfs:label": { "en": "columns" }, "rdfs:comment": { "en": "An array of Column Descriptions." 
}, "rdfs:domain": "csvw:Schema", "rdfs:range": "csvw:Column" }, { "@id": "csvw:commentPrefix", "@type": "rdf:Property", "rdfs:label": { "en": "comment prefix" }, "rdfs:comment": { "en": "A character that, when it appears at the beginning of a skipped row, indicates a comment that should be associated as a comment annotation to the table. The default is \"#\"." }, "rdfs:domain": "csvw:Dialect" }, { "@id": "csvw:datatype", "@type": "rdf:Property", "rdfs:label": { "en": "datatype" }, "rdfs:comment": { "en": "The main datatype of the values of the cell. If the cell contains a list (ie separator is specified and not null) then this is the datatype of each value within the list." }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] } }, { "@id": "csvw:default", "@type": "rdf:Property", "rdfs:label": { "en": "default" }, "rdfs:comment": { "en": "An atomic property holding a single string that provides a default string value for the cell in cases where the original string value is a null value. This default value may be used when converting the table into other formats." }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] } }, { "@id": "csvw:delimiter", "@type": "rdf:Property", "rdfs:label": { "en": "delimiter" }, "rdfs:comment": { "en": "The separator between cells. The default is \",\"." }, "rdfs:domain": "csvw:Dialect" }, { "@id": "csvw:dialect", "@type": "rdf:Property", "rdfs:label": { "en": "dialect" }, "rdfs:comment": { "en": "Provides hints to processors about how to parse the referenced files for to create tabular data models for an individual table, or all the tables in a group." }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table" ] }, "rdfs:range": "csvw:Dialect" }, { "@id": "csvw:doubleQuote", "@type": "rdf:Property", "rdfs:label": { "en": "double quote" }, "rdfs:comment": { "en": "If true, sets the escape character flag to \". If false, to \\\\." 
}, "rdfs:domain": "csvw:Dialect", "rdfs:range": "xsd:boolean" }, { "@id": "csvw:encoding", "@type": "rdf:Property", "rdfs:label": { "en": "encoding" }, "rdfs:comment": { "en": "The character encoding for the file, one of the encodings listed in [encoding]. The default is utf-8." }, "rdfs:domain": "csvw:Dialect" }, { "@id": "csvw:foreignKeys", "@type": "rdf:Property", "rdfs:label": { "en": "foreign keys" }, "rdfs:comment": { "en": "An array of foreign key definitions that define how the values from specified columns within this table link to rows within this table or other tables." }, "rdfs:domain": "csvw:Schema" }, { "@id": "csvw:format", "@type": "rdf:Property", "rdfs:label": { "en": "format" }, "rdfs:comment": { "en": "A definition of the format of the cell, used when parsing the cell." }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] } }, { "@id": "csvw:header", "@type": "rdf:Property", "rdfs:label": { "en": "header" }, "rdfs:comment": { "en": "" }, "rdfs:domain": "csvw:Dialect", "rdfs:range": "xsd:boolean" }, { "@id": "csvw:headerColumnCount", "@type": "rdf:Property", "rdfs:label": { "en": "header column count" }, "rdfs:comment": { "en": "The number of header columns (following the skipped columns) in each row. The default is 0.\n" }, "rdfs:domain": "csvw:Dialect", "rdfs:range": "xsd:nonNegativeInteger" }, { "@id": "csvw:headerRowCount", "@type": "rdf:Property", "rdfs:label": { "en": "header row count" }, "rdfs:comment": { "en": "The number of header rows (following the skipped rows) in the file. The default is 1." }, "rdfs:domain": "csvw:Dialect", "rdfs:range": "xsd:nonNegativeInteger" }, { "@id": "csvw:language", "@type": "rdf:Property", "rdfs:label": { "en": "language" }, "rdfs:comment": { "en": "A language code as defined by [BCP47]. Indicates the language of the value within the cell." 
}, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] } }, { "@id": "csvw:length", "@type": "rdf:Property", "rdfs:label": { "en": "length" }, "rdfs:comment": { "en": "The exact length of the value of the cell." }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] }, "rdfs:range": "xsd:nonNegativeInteger" }, { "@id": "csvw:lineTerminator", "@type": "rdf:Property", "rdfs:label": { "en": "line terminator" }, "rdfs:comment": { "en": "The character that is used at the end of a row. The default is CRLF." }, "rdfs:domain": "csvw:Dialect" }, { "@id": "csvw:maxExclusive", "@type": "rdf:Property", "rdfs:label": { "en": "max exclusive" }, "rdfs:comment": { "en": "The maximum value for the cell (exclusive)." }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] } }, { "@id": "csvw:maxInclusive", "@type": "rdf:Property", "rdfs:label": { "en": "max inclusive" }, "rdfs:comment": { "en": "The maximum value for the cell (inclusive). " }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] } }, { "@id": "csvw:maxLength", "@type": "rdf:Property", "rdfs:label": { "en": "max length" }, "rdfs:comment": { "en": "The maximum length of the value of the cell." }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] }, "rdfs:range": "xsd:nonNegativeInteger" }, { "@id": "csvw:minExclusive", "@type": "rdf:Property", "rdfs:label": { "en": "min exclusive" }, "rdfs:comment": { "en": "The minimum value for the cell (exclusive)." }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] } }, { "@id": "csvw:minInclusive", "@type": "rdf:Property", "rdfs:label": { "en": "min inclusive" }, "rdfs:comment": { "en": "The minimum value for the cell (inclusive)." 
}, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] } }, { "@id": "csvw:minLength", "@type": "rdf:Property", "rdfs:label": { "en": "min length" }, "rdfs:comment": { "en": "The minimum length of the value of the cell." }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] }, "rdfs:range": "xsd:nonNegativeInteger" }, { "@id": "csvw:name", "@type": "rdf:Property", "rdfs:label": { "en": "name" }, "rdfs:comment": { "en": "An atomic property that gives a canonical name for the column. This must be a string. Conversion specifications must use this property as the basis for the names of properties/elements/attributes in the results of conversions." }, "rdfs:domain": "csvw:Column" }, { "@id": "csvw:notes", "@type": "rdf:Property", "rdfs:label": { "en": "notes" }, "rdfs:comment": { "en": "An array of objects representing annotations. This specification does not place any constraints on the structure of these objects." }, "rdfs:domain": "csvw:Table" }, { "@id": "csvw:null", "@type": "rdf:Property", "rdfs:label": { "en": "null" }, "rdfs:comment": { "en": "The string used for null values. If not specified, the default for this is the empty string." }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] } }, { "@id": "csvw:predicateUrl", "@type": "rdf:Property", "rdfs:label": { "en": "predicate URL" }, "rdfs:comment": { "en": "An atomic property that holds one or more URIs that may be used as URIs for predicates if the table is mapped to another format." }, "rdfs:domain": "csvw:Column", "rdfs:range": "xsd:anyURI" }, { "@id": "csvw:primaryKey", "@type": "rdf:Property", "rdfs:label": { "en": "primary key" }, "rdfs:comment": { "en": "A column reference property that holds either a single reference to a column description object or an array of references." 
}, "rdfs:domain": "csvw:Schema" }, { "@id": "csvw:quoteChar", "@type": "rdf:Property", "rdfs:label": { "en": "quote char" }, "rdfs:comment": { "en": "The character that is used around escaped cells." }, "rdfs:domain": "csvw:Dialect" }, { "@id": "csvw:required", "@type": "rdf:Property", "rdfs:label": { "en": "required" }, "rdfs:comment": { "en": "A boolean value which indicates whether every cell within the column must have a non-null value." }, "rdfs:domain": "csvw:Column", "rdfs:range": "xsd:boolean" }, { "@id": "csvw:resources", "@type": "rdf:Property", "rdfs:label": { "en": "resources" }, "rdfs:comment": { "en": "An array of table descriptions for the tables in the group." }, "rdfs:domain": "csvw:TableGroup", "rdfs:range": "csvw:Table" }, { "@id": "csvw:row", "@type": "rdf:Property", "rdfs:label": { "en": "row" }, "rdfs:comment": { "en": "Relates a Table to each Row output." }, "rdfs:subPropertyOf": "rdfs:member", "rdfs:domain": "csvw:Table" }, { "@id": "csvw:schema", "@type": "rdf:Property", "rdfs:label": { "en": "schema" }, "rdfs:comment": { "en": "An object property that provides a schema description for an individual table, or all the tables in a group." }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table" ] }, "rdfs:range": "csvw:Schema" }, { "@id": "csvw:separator", "@type": "rdf:Property", "rdfs:label": { "en": "separator" }, "rdfs:comment": { "en": "The character used to separate items in the string value of the cell." }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] } }, { "@id": "csvw:skipBlankRows", "@type": "rdf:Property", "rdfs:label": { "en": "skip blank rows" }, "rdfs:comment": { "en": "Indicates whether to ignore wholly empty rows (ie rows in which all the cells are empty). The default is false." 
}, "rdfs:domain": "csvw:Dialect", "rdfs:range": "xsd:boolean" }, { "@id": "csvw:skipColumns", "@type": "rdf:Property", "rdfs:label": { "en": "skip columns" }, "rdfs:comment": { "en": "The number of columns to skip at the beginning of each row, before any header columns. The default is 0." }, "rdfs:domain": "csvw:Dialect", "rdfs:range": "xsd:nonNegativeInteger" }, { "@id": "csvw:skipInitialSpace", "@type": "rdf:Property", "rdfs:label": { "en": "skip initial space" }, "rdfs:comment": { "en": "If true, sets the trim flag to \"start\". If false, to false." }, "rdfs:domain": "csvw:Dialect", "rdfs:range": "xsd:boolean" }, { "@id": "csvw:skipRows", "@type": "rdf:Property", "rdfs:label": { "en": "skip rows" }, "rdfs:comment": { "en": "The number of rows to skip at the beginning of the file, before a header row or tabular data." }, "rdfs:domain": "csvw:Dialect", "rdfs:range": "xsd:nonNegativeInteger" }, { "@id": "csvw:source", "@type": "rdf:Property", "rdfs:label": { "en": "source" }, "rdfs:comment": { "en": "The format to which the tabular data should be transformed prior to the transformation using the template. If the value is \"json\", the tabular data should first be transformed first to JSON based on the simple mapping defined in Generating JSON from Tabular Data on the Web. If the value is \"rdf\", it should similarly first be transformed to XML based on the simple mapping defined in Generating RDF from Tabular Data on the Web." }, "rdfs:domain": "csvw:Template" }, { "@id": "csvw:table", "@type": "rdf:Property", "rdfs:label": { "en": "table" }, "rdfs:comment": { "en": "Relates an Table group to annotated tables. (Note, this is different from csvw:resources, which relates metadata, rather than resulting annotated table descriptions." 
}, "rdfs:subPropertyOf": "rdfs:member", "rdfs:domain": "csvw:TableGroup", "rdfs:range": "csvw:Table" }, { "@id": "csvw:table-direction", "@type": "rdf:Property", "rdfs:label": { "en": "table direction" }, "rdfs:comment": { "en": "One of csvw:rtl csvw:ltr or csvw:default. Indicates whether the tables in the group should be displayed with the first column on the right, on the left, or based on the first character in the table that has a specific direction. " }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table" ] }, "rdfs:range": "csvw:Direction" }, { "@id": "csvw:targetFormat", "@type": "rdf:Property", "rdfs:label": { "en": "target format" }, "rdfs:comment": { "en": "A URL for the format that will be created through the transformation. If one has been defined, this should be a URL for a media type, in the form http://www.iana.org/assignments/media-types/media-type such as http://www.iana.org/assignments/media-types/text/calendar. Otherwise, it can be any URL that describes the target format." }, "rdfs:domain": "csvw:Template" }, { "@id": "csvw:templateFormat", "@type": "rdf:Property", "rdfs:label": { "en": "template format" }, "rdfs:comment": { "en": "A URL for the format that is used by the template. If one has been defined, this should be a URL for a media type, in the form http://www.iana.org/assignments/media-types/media-type such as http://www.iana.org/assignments/media-types/application/javascript. Otherwise, it can be any URL that describes the template format." }, "rdfs:domain": "csvw:Template" }, { "@id": "csvw:templates", "@type": "rdf:Property", "rdfs:label": { "en": "templates" }, "rdfs:comment": { "en": "An array of template specifications that provide mechanisms to transform the tabular data into other formats. 
" }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table" ] }, "rdfs:range": "csvw:Template" }, { "@id": "csvw:text-direction", "@type": "rdf:Property", "rdfs:label": { "en": "text direction" }, "rdfs:comment": { "en": "One of csvw:rtl or csvw:ltr. Indicates whether the text within cells should be displayed by default as left-to-right or right-to-left text. " }, "rdfs:domain": { "owl:unionOf": [ "csvw:TableGroup", "csvw:Table", "csvw:Schema", "csvw:Column" ] }, "rdfs:range": "csvw:Direction" }, { "@id": "csvw:title", "@type": "rdf:Property", "rdfs:label": { "en": "title" }, "rdfs:comment": { "en": "For a Template: A natural language property that describes the format that will be generated from the transformation. This is useful if the target format is a generic format (such as application/json) and the transformation is creating a specific profile of that format.\n\nFor a Column: A natural language property that provides possible alternative names for the column." }, "rdfs:domain": { "owl:unionOf": [ "csvw:Template", "csvw:Column" ] } }, { "@id": "csvw:trim", "@type": "rdf:Property", "rdfs:label": { "en": "trim" }, "rdfs:comment": { "en": "Indicates whether to trim whitespace around cells; may be true, false, start or end. The default is false." }, "rdfs:domain": "csvw:Dialect", "rdfs:range": "xsd:boolean" }, { "@id": "csvw:uriTemplate", "@type": "rdf:Property", "rdfs:label": { "en": "uri template" }, "rdfs:comment": { "en": "A URI template property that may be used to create a unique identifier for each row when mapping data to other formats." }, "rdfs:domain": "csvw:Schema" } ], "rdfs_datatypes": [ { "@id": "csvw:json", "@type": "rdfs:Datatype", "rdfs:label": { "en": "json" }, "rdfs:comment": { "en": "A literal containing JSON." }, "rdfs:subClassOf": "rdfs:Literal" } ], "rdfs_instances": [ { "@id": "csvw:ltr", "@type": "Direction", "rdfs:label": { "en": "left to right" }, "rdfs:comment": { "en": "Indicates text should be processed left to right." 
} }, { "@id": "csvw:rtl", "@type": "Direction", "rdfs:label": { "en": "right to left" }, "rdfs:comment": { "en": "Indicates text should be processed right to left." } } ] }
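To illustrate how the term definitions in this context relate short property names to IRIs, the following Python sketch resolves a term through a small fragment of the context. It is a simplified illustration only, not the JSON-LD expansion algorithm defined in [ JSON-LD-API ]. The function name expand_term and the choice of entries in CONTEXT_FRAGMENT are assumptions made for the example, though each entry is copied from the context above.

```python
import json

# A hand-picked fragment of the context above (the selection is illustrative).
CONTEXT_FRAGMENT = json.loads("""
{
  "csvw": "http://www.w3.org/ns/csvw#",
  "xsd": "http://www.w3.org/2001/XMLSchema#",
  "maximum": "csvw:maxInclusive",
  "number": "xsd:double",
  "null": { "@id": "csvw:null" }
}
""")


def expand_term(term, context=CONTEXT_FRAGMENT):
    """Resolve `term` to an absolute IRI using simple prefix substitution."""
    definition = context.get(term)
    if definition is None:
        # Unknown terms are returned unchanged in this sketch.
        return term
    if isinstance(definition, dict):
        # Expanded term definitions carry their IRI under "@id".
        definition = definition["@id"]
    prefix, colon, suffix = definition.partition(":")
    prefix_iri = context.get(prefix)
    # Substitute the prefix only when it is itself defined as a plain IRI string.
    if colon and isinstance(prefix_iri, str) and not suffix.startswith("//"):
        return prefix_iri + suffix
    return definition


print(expand_term("maximum"))
print(expand_term("null"))
```

Under these assumptions, maximum resolves through csvw:maxInclusive to an IRI in the csvw namespace, which shows that maximum and maxInclusive are two names for the same property.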

E. Changes since the working draft of 08 January 2015

The document has undergone substantial changes since the last working draft. Below are some of the changes made:

F. References

F.1 Normative references

[BCP47]
A. Phillips; M. Davis. Tags for Identifying Languages. September 2009. IETF Best Current Practice. URL: http://tools.ietf.org/html/bcp47
[JSON-LD]
Manu Sporny; Gregg Kellogg; Markus Lanthaler. JSON-LD 1.0. 16 January 2014. W3C Recommendation. URL: http://www.w3.org/TR/json-ld/
[JSON-LD-API]
Markus Lanthaler; Gregg Kellogg; Manu Sporny. JSON-LD 1.0 Processing Algorithms and API. 16 January 2014. W3C Recommendation. URL: http://www.w3.org/TR/json-ld-api/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[RFC3986]
T. Berners-Lee; R. Fielding; L. Masinter. Uniform Resource Identifier (URI): Generic Syntax. January 2005. Internet Standard. URL: https://tools.ietf.org/html/rfc3986
[URI-TEMPLATE]
J. Gregorio; R. Fielding; M. Hadley; M. Nottingham; D. Orchard. URI Template. March 2012. Proposed Standard. URL: https://tools.ietf.org/html/rfc6570
[csv2json]
Jeremy Tandy; Ivan Herman. Generating JSON from Tabular Data on the Web. W3C Working Draft. URL: http://www.w3.org/TR/2015/WD-csv2json-20150416/
[csv2rdf]
Jeremy Tandy; Ivan Herman; Gregg Kellogg. Generating RDF from Tabular Data on the Web. W3C Working Draft. URL: http://www.w3.org/TR/2015/WD-csv2rdf-20150416/
[encoding]
Anne van Kesteren; Joshua Bell; Addison Phillips. Encoding. 16 September 2014. W3C Candidate Recommendation. URL: http://www.w3.org/TR/encoding/
[tabular-data-model]
Jeni Tennison; Gregg Kellogg. Model for Tabular Data and Metadata on the Web. W3C Working Draft. URL: http://www.w3.org/TR/2015/WD-tabular-data-model-20150416/
[xmlschema11-2]
David Peterson; Sandy Gao; Ashok Malhotra; Michael Sperberg-McQueen; Henry Thompson; Paul V. Biron et al. W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes. 5 April 2012. W3C Recommendation. URL: http://www.w3.org/TR/xmlschema11-2/

F.2 Informative references

[RFC4627]
D. Crockford. The application/json Media Type for JavaScript Object Notation (JSON). July 2006. Informational. URL: https://tools.ietf.org/html/rfc4627
[RFC6839]
T. Hansen; A. Melnikov. Additional Media Type Structured Syntax Suffixes. January 2013. Informational. URL: https://tools.ietf.org/html/rfc6839
[csvw-context]
Gregg Kellogg. Metadata Vocabulary for Tabular Data. URL: http://www.w3.org/ns/csvw
[rdf-concepts]
Graham Klyne; Jeremy Carroll. Resource Description Framework (RDF): Concepts and Abstract Syntax. 10 February 2004. W3C Recommendation. URL: http://www.w3.org/TR/rdf-concepts/
[rdfa-core]
Ben Adida; Mark Birbeck; Shane McCarron; Ivan Herman et al. RDFa Core 1.1 - Third Edition. 17 March 2015. W3C Recommendation. URL: http://www.w3.org/TR/rdfa-core/