Copyright © 2015 W3C ® ( MIT , ERCIM , Keio , Beihang ). W3C liability , trademark and document use rules apply.
Validation, conversion, display, and search of tabular data on the web requires additional metadata that describes how the data should be interpreted. This document defines a vocabulary for metadata that annotates tabular data. This can be used to provide metadata at various levels, from groups of tables and how they relate to each other down to individual cells within a table.
The metadata defined in this specification is used to provide annotations on an annotated table or group of tables , as defined in [ tabular-data-model ]. Annotated tables form the basis for all further processing, such as validating, converting, or displaying the tables.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
The CSV on the Web Working Group was chartered to produce a Recommendation "Access methods for CSV Metadata" as well as Recommendations for "Metadata vocabulary for CSV data" and "Mapping mechanism to transforming CSV into various Formats (e.g., RDF, JSON, or XML)". This document aims to primarily satisfy the second of those Recommendations.
This document was published by the CSV on the Web Working Group as a Candidate Recommendation. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-csv-wg@w3.org (subscribe, archives). W3C publishes a Candidate Recommendation to indicate that the document is believed to be stable and to encourage implementation by the developer community. This Candidate Recommendation is expected to advance to Proposed Recommendation no earlier than 30 October 2015. All comments are welcome.
Please see the Working Group's implementation report .
Publication as a Candidate Recommendation does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy . W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy .
This document is governed by the 1 August 2014 W3C Process Document .
Interpreting tabular data that is available on the web, particularly as CSV, usually requires additional metadata. As an example, say that the following CSV file were available at http://example.org/tree-ops.csv:
GID,On Street,Species,Trim Cycle,Inventory Date
1,ADDISON AV,Celtis australis,Large Tree Routine Prune,10/18/2010
2,EMERSON ST,Liquidambar styraciflua,Large Tree Routine Prune,6/2/2010
A human consumer of this data might be able to figure out the meaning of the different columns, particularly if there were some additional human-readable documentation made available. Automated processors would have a much harder time; realistically they would be limited to displaying the information in a table.
Making available machine-readable metadata helps with the interpretation of the tabular data. For example, say that the following metadata file were available at http://example.org/tree-ops.csv-metadata.json:
{ "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}], "url": "tree-ops.csv", "dc:title": "Tree Operations", "dcat:keyword": ["tree", "street", "maintenance"], "dc:publisher": { "schema:name": "Example Municipality", "schema:url": {"@id": "http://example.org"} }, "dc:license": {"@id": "http://opendefinition.org/licenses/cc-by/"}, "dc:modified": {"@value": "2010-12-31", "@type": "xsd:date"}, "tableSchema": { "columns": [{ "name": "GID", "titles": ["GID", "Generic Identifier"], "dc:description": "An identifier for the operation on a tree.", "datatype": "string", "required": true }, { "name": "on_street", "titles": "On Street", "dc:description": "The street that the tree is on.", "datatype": "string" }, { "name": "species", "titles": "Species", "dc:description": "The species of the tree.", "datatype": "string" }, { "name": "trim_cycle", "titles": "Trim Cycle", "dc:description": "The operation performed on the tree.", "datatype": "string" }, { "name": "inventory_date", "titles": "Inventory Date", "dc:description": "The date of the operation that was performed.", "datatype": {"base": "date", "format": "M/d/yyyy"} }], "primaryKey": "GID", "aboutUrl": "#gid-{GID}" } }
This metadata file may be referenced from a Link header when the CSV file is retrieved, or found by looking in known locations for metadata (as described in [ tabular-data-model ]). It provides information for different types of applications; for example, a validator can check that the values in the GID column are all present and unique.
Implementations may fulfil one or more of these functions. In particular, Converters may or may not act as a Validator (perhaps through the setting of a flag), and check the data that they are converting to ensure that it is compliant with the schema. If a Converter does not also act as a Validator it may produce invalid output.
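As a rough illustration of the validator role, the following Python sketch checks that every value in a key column is present and unique. The function name `check_primary_key` is hypothetical, and the sketch matches the column by its header title rather than by resolving `titles` to `name` as a full processor would; it is not the validation algorithm defined by this specification.

```python
import csv
import io


def check_primary_key(csv_text, key_column):
    """Return a list of problems found in key_column (an empty list means none)."""
    problems = []
    seen = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        value = row.get(key_column)
        if value is None or value == "":
            # Corresponds to "required": true on the column description.
            problems.append(f"row {row_number}: {key_column} is missing")
        elif value in seen:
            # Corresponds to "primaryKey": "GID" on the schema.
            problems.append(f"row {row_number}: duplicate {key_column} {value!r}")
        else:
            seen.add(value)
    return problems


TREE_OPS = """GID,On Street,Species,Trim Cycle,Inventory Date
1,ADDISON AV,Celtis australis,Large Tree Routine Prune,10/18/2010
2,EMERSON ST,Liquidambar styraciflua,Large Tree Routine Prune,6/2/2010
"""

print(check_primary_key(TREE_OPS, "GID"))
```

A Converter that does not run such a check could still convert the file, but its output might then contain duplicate row identifiers.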
[ tabular-data-model ] defines an annotated tabular data model in which groups of tables, individual tables, columns, rows, and cells can be annotated with annotations. That specification also describes how to locate metadata about a given tabular data file.
This document defines the format and structure of metadata documents, and how these are interpreted to create an Annotated Tabular Data Model . It also defines how to validate tabular data based on some of these annotations.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY , MUST , MUST NOT , SHOULD , and SHOULD NOT are to be interpreted as described in [ RFC2119 ].
The metadata format is based on a dialect of [ JSON-LD ] as defined in section A. JSON-LD Dialect . This metadata can therefore be expressed as an RDF graph. It is not necessary for conformant applications to be able to process all JSON-LD, only the dialect defined in this specification. All applications that conform to this specification (including validators and applications that read or convert tabular data) MUST read the JSON-based format described in this document.
Tabular data MUST conform to the description from [ tabular-data-model ]. In particular note that each row MUST contain the same number of cells (although some of these cells may be empty). Parsers might not be able to map all CSV-encoded data to such a table. As such, the metadata format described in this specification cannot be applied to all CSV files.
This specification makes use of the compact IRI Syntax ; please refer to the Compact IRIs from [ JSON-LD ].
This specification makes use of the following namespaces:
csvw: http://www.w3.org/ns/csvw#
dc: http://purl.org/dc/terms/
dcat: http://www.w3.org/ns/dcat#
foaf: http://xmlns.com/foaf/0.1/
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
schema: http://schema.org/
xsd: http://www.w3.org/2001/XMLSchema#
The following typographic conventions are used in this specification:
markup
markup definition reference
markup external definition reference
Notes are in light green boxes with a green left border and with a "Note" header in green. Notes are normative or informative depending on whether they are in a normative or informative section, respectively.
Examples are in light khaki boxes, with khaki left border, and with a numbered "Example" header in khaki. Examples are always informative. The content of the example is in monospace font and may be syntax colored.
The metadata defined in this specification is used to provide annotations on an annotated table or group of tables , as defined in [ tabular-data-model ]. Annotated tables form the basis for all further processing, such as validating, converting, or displaying the tables.
All compliant applications MUST create annotated tables based on the algorithm defined here. All compliant applications MUST generate errors and stop processing if a metadata document:
Compliant applications MUST ignore properties (aside from common properties ) which are not defined in this specification and MUST generate a warning when they are encountered.
If a property has a value that is not permitted by this specification, then if a default value is provided for that property, compliant applications MUST use that default value and MUST generate a warning. If no default value is provided for that property, compliant applications MUST generate a warning and behave as if the property had not been specified.
Metadata documents contain descriptions of groups of tables, tables, columns, rows, and cells, which are used to create annotations on an annotated tabular data model . A description object is a JSON object that describes a component of the annotated tabular data model (a group of tables , a table or a column ) and has one or more properties that are mapped into properties on that component. There are two types of description objects:
The description objects contain a number of properties. These are:
name of a column or the dc:provenance of a
For example, in the column description
{ "name": "inventory_date", "titles": "Inventory Date", "dc:description": "The date of the operation that was performed.", "datatype": { "base": "date", "format": "M/d/yyyy" } }
the properties name, titles, and dc:description are used to create the name, titles, datatype and dc:description annotations on the column in the data model. The datatype property is an inherited property that also affects the value of each cell in that column (see section 5.7 Inherited Properties for more on inherited properties).
This section defines a set of properties and permitted values for annotating tabular data, and how these properties should be interpreted by applications.
A metadata document is a JSON document which holds an object at the top level. This object is a description object of either a group of tables or a single table . A metadata document may contain other referenced or embedded description objects , description objects for tables and columns . Additional JSON objects, not part of the annotated tabular data model , are used to describe schemas , dialect descriptions , foreign key definitions and transformation definitions .
There are different types of properties on description objects:
Array properties hold an array of one or more objects, which are usually description objects .
For example, the tables property is an array property. A table group description might contain:
"tables": [{ "url": "https://example.org/countries.csv", "tableSchema": "https://example.org/countries.json" }, { "url": "https://example.org/country_slice.csv", "tableSchema": "https://example.org/country_slice.json" }]
in which case the tables property has a value that is an array of two table description objects.
Any items within an array that are not valid objects of the type expected are ignored. If the supplied value of an array property is not an array (e.g. if it is an integer), compliant applications MUST issue a warning and proceed as if the property had been supplied with an empty array.
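The fallback rule for array properties can be sketched in Python as follows. The helper name `coerce_array_property` is hypothetical, and "valid object of the expected type" is reduced here to "is a JSON object", which is a simplifying assumption rather than the full check.

```python
import warnings


def coerce_array_property(name, value):
    """Return the usable items of an array property, warning on bad input."""
    if not isinstance(value, list):
        # Not an array at all: warn and behave as if an empty array was given.
        warnings.warn(f"{name}: expected an array, got {type(value).__name__}")
        return []
    # Items that are not objects are silently ignored (simplified validity test).
    return [item for item in value if isinstance(item, dict)]


tables = coerce_array_property(
    "tables",
    [{"url": "https://example.org/countries.csv"}, 42],
)
print(tables)
```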
Link properties hold a single reference to another resource by URL. Their value is a string, resolved as a URL against the base URL. If the supplied value of a link property is not a string (e.g. if it is an integer), compliant applications MUST issue a warning and proceed as if the property had been supplied with an empty string.
For example, the url property is a link property. A table description might contain:
"url" : "example-2014-01-03.csv"
in which case the url property on the table would have a single value, a link to example-2014-01-03.csv, resolved against the base URL of the metadata document in which this was located.
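Resolution of a link property can be sketched with Python's standard `urllib.parse.urljoin`. The function names `base_url` and `resolve_link` and the metadata location used in the demonstration are assumptions made for illustration; the sketch also takes into account an optional `@base` member in the `@context` array.

```python
from urllib.parse import urljoin


def base_url(metadata_location, context):
    """Return the base URL: @base (if any) resolved against the metadata location."""
    base = metadata_location
    if isinstance(context, list):
        for item in context:
            if isinstance(item, dict) and "@base" in item:
                base = urljoin(metadata_location, item["@base"])
    return base


def resolve_link(metadata_location, context, value):
    """Resolve a link property value against the base URL of the metadata document."""
    return urljoin(base_url(metadata_location, context), value)


# Hypothetical location of the metadata document.
LOCATION = "http://example.com/meta/example.json"
context = ["http://www.w3.org/ns/csvw", {"@base": "http://example.org/"}]
print(resolve_link(LOCATION, context, "example-2014-01-03.csv"))
```

Without an `@base` member, the same relative value would instead resolve against the location of the metadata document itself.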
For example, if the metadata document contained:
"@context" : [ "http://www.w3.org/ns/csvw" , { "@base" : "http://example.org/" }]
this is equivalent to specifying:
"url" : "http://example.org/example-2014-01-03.csv"
URI template properties contain a [ URI-TEMPLATE ] which can be used to generate a URI. These URI templates are expanded in the context of each row by combining the template with a set of variables with values as defined in [ URI-TEMPLATE ]. The variables that are set are:
null. The languages of cell values are ignored.
_column
_column is set to the column number of the column from the annotated table that is currently being processed
_sourceColumn
_sourceColumn is set to the source number of the column that is currently being processed; this usually varies from _column by skip columns
_row
_row is set to the row number of the row from the annotated table that is currently being processed
_sourceRow
_sourceRow is set to the source number of the row that is currently being processed; this usually varies from _row by skip rows and header rows
_name
_name is set to the URI decoded column name annotation, as defined in [ tabular-data-model ], for the column that is currently being processed. (Percent-decoding is necessary as name may have been encoded if taken from titles; this prevents double percent-encoding.)
The annotation value is the result of:
If the supplied value of a URI template property is not a string (e.g. if it is an integer), compliant applications MUST issue a warning and proceed as if the property had been supplied with an empty string.
For example, the aboutUrl property holds a URI template that is used to generate a URL identifier for each row, which might look like:
"aboutUrl" : "http://example.org/example.csv#row.{_row}"
The about URL annotations that are generated and used as identifiers for the rows would then look like http://example.org/example.csv#row.1, http://example.org/example.csv#row.2 and so on.
Alternatively, with the CSV and metadata in section 1. Introduction, the aboutUrl might look like:
"aboutUrl" : "http://example.org/tree/{on_street}/{GID}"
This would generate URIs such as http://example.org/tree/ADDISON%20AV/1 and http://example.org/tree/EMERSON%20ST/2. If the value of the on_street or GID column were null, the URL would still be generated with the null value generating an empty string in the URL. For example, if on_street were null and GID were 3, the generated URL would be http://example.org/tree//3.
Once the URI has been generated, it is resolved against the url of the table (e.g. the CSV file) to create an absolute URI. For example, given an aboutUrl within a schema such as:
"aboutUrl" : "#row.{_row}"
and given a CSV file at http://example.com/temp.csv, the URL for the first row will be http://example.com/temp.csv#row.1.
The propertyUrl property might be defined as "{#_name}", meaning that it resolves as a fragment identifier relative to the URL of the source of the table. For example, accessing it from a column with the column name GID would look like:
"http://example.org/example.csv#GID"
A value defined within the data is also subject to expansion. For example, consider the following table:
project_name,project_type,keywords
CSVW,foaf:Project,table;data;conversion
The project_type column might have a valueUrl specified as "{project_type}". In the first row the cell value is "foaf:Project". The foaf prefix is understood, as described in section 5.8 Common Properties, to expand to http://xmlns.com/foaf/0.1/Project.
Similarly, the keywords column might have a valueUrl specified as "https://duckduckgo.com/?q={keywords}". If the column also specifies "separator": ";", then the cell value of the keywords column would be an array of the three values table, data, and conversion. This is set as the value of the keywords variable within the URI template, which means the result would be https://duckduckgo.com/?q=table,data,conversion. If the value in the keywords column were an empty sequence (created from an empty cell in the original data), the reference to that column would be expanded to an empty string, generating https://duckduckgo.com/?q=.
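The expansion and resolution steps described for URI template properties can be approximated in Python as below. This is a deliberately small sketch: it handles only `{var}` and `{#var}` expressions rather than the full [ URI-TEMPLATE ] syntax, it does not perform the prefix expansion shown for foaf:Project, and the function names `expand_template` and `row_url` are hypothetical.

```python
import re
from urllib.parse import quote, urljoin


def expand_template(template, variables):
    """Expand {var} and {#var} expressions; null values expand to nothing."""
    def substitute(match):
        operator, name = match.group(1), match.group(2)
        value = variables.get(name)
        if value is None:
            return ""
        # A list-valued cell (from "separator") becomes comma-separated values.
        items = value if isinstance(value, list) else [value]
        text = ",".join(quote(str(item), safe="") for item in items)
        return operator + text

    return re.sub(r"\{(#?)(\w+)\}", substitute, template)


def row_url(template, table_url, variables):
    """Expand the template, then resolve the result against the table's url."""
    return urljoin(table_url, expand_template(template, variables))


print(expand_template(
    "http://example.org/tree/{on_street}/{GID}",
    {"on_street": "ADDISON AV", "GID": "1"},
))
print(row_url("#row.{_row}", "http://example.com/temp.csv", {"_row": 1}))
```

A production implementation would more likely rely on a complete URI template library than on a regular expression.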
When a cell's value is not a string, the canonical representation of that value is used within the expanded URL. For example, the data may include dates such as those in:
GID,On Street,Species,Trim Cycle,Inventory Date
1,ADDISON AV,Celtis australis,Large Tree Routine Prune,10/18/2010
2,EMERSON ST,Liquidambar styraciflua,Large Tree Routine Prune,6/2/2010
The Inventory Date column description would indicate that these were dates with the format M/d/yyyy:
{ "name": "inventory_date", "titles": "Inventory Date", "datatype": { "base": "date", "format": "M/d/yyyy" } }
The string value of the inventory_date column in the first row is parsed to create the date 18th October 2010. When the inventory_date column is referenced within a URI template such as http://example.org/event/{inventory_date}, the canonical representation of that date, as defined in [ xmlschema11-2 ], is used within the URL, giving the result http://example.org/event/2010-10-18.
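A minimal Python sketch of this date handling follows. It assumes that the format M/d/yyyy corresponds to the `strptime` pattern `%m/%d/%Y`; that mapping is hard-coded here for illustration and is not a general translation of date formats. The function name `canonical_date` is hypothetical.

```python
from datetime import datetime


def canonical_date(raw, strptime_format="%m/%d/%Y"):
    """Parse a cell's string value and return the date in yyyy-mm-dd form."""
    return datetime.strptime(raw, strptime_format).date().isoformat()


template = "http://example.org/event/{inventory_date}"
print(template.replace("{inventory_date}", canonical_date("10/18/2010")))
```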
Column reference properties hold one or more references to other column description objects. The referenced description object must have a name property. Column reference properties can then reference column description objects through values that are:
name on a column description object within the metadata
Compliant applications MUST issue a warning and proceed as if the column reference property had not been specified if:
For example, the primaryKey property is a column reference property on the schema. It has to hold references to columns defined elsewhere in the schema, and the descriptions of those columns must have name properties. It can hold a single reference, like this:
"tableSchema": { "columns": [{ "name": "GID" }, ... ], "primaryKey": "GID" }
or it can contain an array of references, like this:
"tableSchema": { "columns": [{ "name": "givenName" }, { "name": "familyName" }, ... ], "primaryKey": [ "givenName", "familyName" ] }
If the primaryKey property were given an invalid value, such as 1, or a column name were misspelled, the processor will issue a warning and ignore the value. On the other hand, the columnReference property is a required property; if it has an invalid value, such as an empty array, then the processor will issue an error as if the property were not specified at all.
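The handling of an optional column reference property such as primaryKey might be sketched as follows. The function name `resolve_column_references` is hypothetical, and the conditions treated as invalid (wrong type, empty array, unknown column name) are this sketch's reading of the examples above rather than a complete statement of the rules.

```python
import warnings


def resolve_column_references(schema, property_name):
    """Return the referenced column names as a list, or None if unusable."""
    if property_name not in schema:
        return None
    names = {
        column.get("name")
        for column in schema.get("columns", [])
        if isinstance(column, dict)
    }
    value = schema[property_name]
    references = [value] if isinstance(value, str) else value
    valid = (
        isinstance(references, list)
        and len(references) > 0
        and all(isinstance(ref, str) and ref in names for ref in references)
    )
    if not valid:
        # Warn and behave as if the property had not been specified.
        warnings.warn(f"{property_name}: invalid column reference {value!r}")
        return None
    return references


schema = {
    "columns": [{"name": "givenName"}, {"name": "familyName"}],
    "primaryKey": ["givenName", "familyName"],
}
print(resolve_column_references(schema, "primaryKey"))
```

For a required property such as columnReference, a caller would treat the `None` result as an error instead of continuing.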
Object properties hold either a single object or a reference to an object by URL. Their values may be:
If the supplied value of an object property is not a string or object (e.g. if it is an integer), compliant applications MUST issue a warning and proceed as if the property had been specified as an object with no properties.
Object properties are often used when the values can be or should be values within controlled vocabularies, or structured information which may be held elsewhere.
For example, the dialect of a table is an object property. It could be provided as a URL that indicates a commonly used dialect, like this:
"dialect" : "http://example.org/tab-separated-values"
or a structured object, like this:
"dialect": { "delimiter": "\t", "encoding": "utf-8" }
When specified as a string, the resolved URL is used to fetch the referenced object during normalization as described in section 6. Normalization. For example, if http://example.org/tab-separated-values resolved to:
{ "@context": "http://www.w3.org/ns/csvw", "quoteChar": null, "header": true, "delimiter": "\t" }
Following normalization, the value of the dialect property would then be:
"dialect": { "@id": "http://example.org/tab-separated-values", "quoteChar": null, "header": true, "delimiter": "\t" }
Natural language properties hold natural language strings. Their values may be:
Natural language properties are used for titles. For example, the titles property on a column description provides a natural language label for a column. If it's a plain string like this:
"titles" : "Project title"
then that string is assumed to be in the default language (or have an undefined language, und, if there is no such property). Multiple alternative values can be given in an array:
"titles": [ "Project title", "Project" ]
It's also possible to provide multiple values in different languages, using an object structure. For example:
"titles": { "en": "Project title", "fr": "Titre du projet" }
and within such an object, the values of the properties can themselves be arrays:
"titles": { "en": [ "Project title", "Project" ], "fr": "Titre du projet" }
The annotation value of a natural language property is an object whose properties are language codes and where the values of those properties are an array of strings (see Language Maps in [ JSON-LD ]).
When extracting an annotation value from metadata that has already been normalized, a natural language property will already have this form.
If the supplied value of a natural language property is not a string, array or object (e.g. if it is an integer), compliant applications MUST issue a warning and proceed as if the property had been specified as an empty array. If the supplied value is an array, any items in that array that are not strings MUST be ignored. If the supplied value is an object, any properties that are not valid language codes as defined by [ BCP47 ] MUST be ignored, as must any properties whose value is not a string or an array, and any items that are not strings within array values of these properties.
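The mapping from the permitted forms of a natural language property to its annotation value (a language map whose values are arrays of strings) can be sketched in Python as follows. The function name `normalize_natural_language` is hypothetical, and the regular expression is only a loose approximation of [ BCP47 ] well-formedness, used here as an assumption.

```python
import re
import warnings

# Loose approximation of a language tag; not a full BCP47 validator.
LANGUAGE_TAG = re.compile(r"^[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*$")


def normalize_natural_language(value, default_language="und"):
    """Return a language map: language code -> list of strings."""
    if isinstance(value, str):
        return {default_language: [value]}
    if isinstance(value, list):
        # Non-string items are ignored.
        return {default_language: [v for v in value if isinstance(v, str)]}
    if isinstance(value, dict):
        result = {}
        for language, titles in value.items():
            if not LANGUAGE_TAG.match(language):
                continue  # not a valid language code: ignored
            if isinstance(titles, str):
                result[language] = [titles]
            elif isinstance(titles, list):
                result[language] = [t for t in titles if isinstance(t, str)]
            # any other value type is ignored
        return result
    # Any other type: warn and act as if an empty array was specified.
    warnings.warn(f"invalid natural language value: {value!r}")
    return {default_language: []}


print(normalize_natural_language(
    {"en": ["Project title", "Project"], "fr": "Titre du projet"}
))
```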
Atomic properties hold atomic values . Their values may be:
true or false
The annotation value of a boolean atomic property is false if unset; otherwise, the annotation value of an atomic property is the normalized value of that property, or the defined default value or null, if unset.
Processors MUST issue a warning if a property is set to an invalid value type, such as a boolean atomic property being set to the number 1 or a numeric atomic property being set to the string "3.1415", and act as if the property had not been specified (which may mean using the default value for the property, or may mean raising an error and halting processing if the property is a required property).
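The type check on atomic properties can be illustrated in Python. Both helper names are hypothetical and the property names in the demonstration are only illustrative. One detail worth noting is that Python treats `bool` as a subclass of `int`, so the numeric check has to exclude booleans explicitly.

```python
import warnings


def read_boolean_property(description, name, default=False):
    """Return a boolean atomic property, falling back to the default."""
    if name not in description:
        return default
    value = description[name]
    if isinstance(value, bool):
        return value
    warnings.warn(f"{name}: expected true or false, got {value!r}")
    return default


def read_numeric_property(description, name, default=None):
    """Return a numeric atomic property, falling back to the default."""
    if name not in description:
        return default
    value = description[name]
    # bool is a subclass of int in Python, so it must be ruled out first.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return value
    warnings.warn(f"{name}: expected a number, got {value!r}")
    return default


print(read_boolean_property({"suppressOutput": True}, "suppressOutput"))
```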
The top-level object of a metadata document or object referenced through an object property (whether it is a table group description, table description, schema, dialect description or transformation definition) MUST have a @context property. This is an array property, as defined in Section 8.7 of [ JSON-LD ]. The @context MUST have one of the following values:
http://www.w3.org/ns/csvw, or
http://www.w3.org/ns/csvw and the object represents a local context definition, which is restricted to contain either or both of the following members:
@base
an atomic property that provides the base URL against which other URLs within the metadata file are resolved. If present, its value MUST be a string that is interpreted as a URL which is resolved against the location of the metadata document to provide the base URL for other URLs in the metadata document; if unspecified, the base URL used for interpreting relative URLs within the metadata document is the location of the metadata document itself.
Note that the @base property of the @context object provides the base URL used for URLs within the metadata document, not the URLs that appear as data within the group of tables or table it describes. URI template properties are not resolved against this base URL: they are resolved against the URL of the table.
@language
an atomic property that indicates the default language for the values of natural language or string-valued common properties in the metadata document; if present, its value MUST be a language code [ BCP47 ]. The default is und.
Note that the @language property of the @context object, which gives the default language used within the metadata file, is distinct from the lang property on a description object, which gives the language used in the data within a group of tables, table, or column.
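A check of the @context value along these lines might look like the following Python sketch. The function name `read_context` is hypothetical. The sketch accepts either the plain string or a two-item array of that string followed by a local context object limited to @base and @language, and it does not validate the language code itself.

```python
CSVW_CONTEXT = "http://www.w3.org/ns/csvw"


def read_context(context):
    """Return (base, language) from a @context value, or raise ValueError."""
    if context == CSVW_CONTEXT:
        return None, "und"
    if (
        isinstance(context, list)
        and len(context) == 2
        and context[0] == CSVW_CONTEXT
        and isinstance(context[1], dict)
    ):
        local = context[1]
        unexpected = set(local) - {"@base", "@language"}
        if unexpected:
            raise ValueError(f"unexpected @context members: {sorted(unexpected)}")
        # @base stays relative here; the caller resolves it against the
        # location of the metadata document.
        return local.get("@base"), local.get("@language", "und")
    raise ValueError("invalid @context value")


print(read_context(["http://www.w3.org/ns/csvw", {"@language": "en"}]))
```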
A table group description is a JSON object that describes a group of tables .
tables
An array property of table descriptions for the tables in the group, namely those listed in the tables annotation on the group of tables being described. Compliant applications MUST raise an error if this array does not contain one or more table descriptions.
The description of a group of tables MAY also contain:
dialect
An object property that provides a single dialect description. If provided, dialect provides hints to processors about how to parse the referenced files to create tabular data models for the tables in the group. This may be provided as an embedded object or as a URL reference. See section 5.9 Dialect Descriptions for more details.
notes
An array property that provides an array of objects representing arbitrary annotations on the annotated group of tables. The value of this property becomes the value of the notes annotation for the group of tables. The properties on these objects are interpreted equivalently to common properties as described in section 5.8 Common Properties. When an array of note objects B is merged into an original array of note objects A, each note object from B is appended into the array A.
The Web Annotation Working Group is developing a vocabulary for expressing annotations. In future versions of this specification, we anticipate referencing that vocabulary.
tableDirection
An atomic property that MUST have a single string value that is one of "rtl", "ltr" or "auto". Indicates whether the tables in the group should be displayed with the first column on the right, on the left, or based on the first character in the table that has a specific direction. The value of this property becomes the value of the table direction annotation for all the tables in the table group. See Bidirectional Tables in [ tabular-data-model ] for details. The default value for this property is "auto".
tableSchema
An object property that provides a single schema description as described in section 5.5 Schemas , used as the default for all the tables in the group. This may be provided as an embedded object within the JSON metadata or as a URL reference to a separate JSON object that is a schema description .
transformations
An array property of transformation definitions that provide mechanisms to transform the tabular data into other formats. The value of this property becomes the value of the transformations annotation for all the tables in the table group.
@id
@id is a link property that identifies the group of tables, as defined by [ tabular-data-model ], described by this table group description. It MUST NOT start with _:. The value of this property becomes the value of the id annotation for the group of tables.
@type
If included, @type is an atomic property that MUST be set to "TableGroup". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
The description MAY contain any common properties to provide extra metadata about the group of tables as a whole.
The description MAY contain inherited properties to describe cells within the tables.
Two table group descriptions are compatible if any table descriptions they contain with matching normalized url properties are themselves compatible as defined in section 5.4.3 Table Description Compatibility.
A table description is a JSON object that describes a table within a CSV file.
url
This link property gives the single URL of the CSV file that the table is held in, relative to the location of the metadata document. The value of this property is the value of the url annotation for the annotated table this table description describes.
The description of a table MAY also contain:
dialect
As defined for table groups .
notes
An array property that provides an array of objects representing arbitrary annotations on the annotated tabular data model. The value of this property becomes the value of the notes annotation for the table. The properties on these objects are interpreted equivalently to common properties as described in section 5.8 Common Properties. When an array of note objects B is merged into an original array of note objects A, each note object from B is appended into the array A.
The Web Annotation Working Group is developing a vocabulary for expressing annotations. In future versions of this specification, we anticipate referencing that vocabulary.
suppressOutput
A boolean atomic property. If true, suppresses any output that would be generated when converting this table. The value of this property becomes the value of the suppress output annotation for this table. The default is false.
tableDirection
As defined for table groups . The value of this property becomes the value of the table direction annotation for this table.
tableSchema
An object property that provides a single schema description as described in section 5.5 Schemas. This may be provided as an embedded object within the JSON metadata or as a URL reference to a separate JSON schema document. If a table description is within a table group description, the tableSchema from that table group acts as the default for this property. If a tableSchema is not declared in a table description, it may be declared on the table group description, which is then used as the schema for this table description.
The @id property of the tableSchema, if there is one, becomes the value of the schema annotation for this table.
When a schema is referenced by URL, this URL becomes the value of the @id property in the normalized schema description, and thus the value of the schema annotation on the table.
transformations
As defined for table groups . The value of this property becomes the value of the transformations annotation for this table.
@id
If included, @id is a link property that identifies the table, as defined in [ tabular-data-model ], described by this table description. It MUST NOT start with _:. The value of this property becomes the value of the id annotation for this table.
@type
If included, @type is an atomic property that MUST be set to "Table". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
The description MAY contain any common properties to provide extra metadata about the table as a whole.
The description MAY contain inherited properties to describe cells within the table.
Two table descriptions are compatible if they have equivalent normalized url properties, and have compatible schemas as defined in section 5.5.1 Schema Compatibility.
A schema is a definition of a tabular format that may be common to multiple tables. For example, multiple tables from different sources may have the same columns and be designed such that they can be aggregated together.
A schema description is a JSON object that encodes the information about a schema , which describes the structure of a table. All the properties of a schema description are optional.
columns
An array property of column descriptions as described in section 5.6 Columns . These are matched to columns in tables that use the schema by position: the first column description in the array applies to the first column in the table, the second to the second and so on.
The name properties of the column descriptions MUST be unique within a given table description.
foreignKeys
An array property of foreign key definitions that define how the values from specified columns within this table link to rows within this table or other tables. A foreign key definition is a JSON object that MUST contain only the following properties:
columnReference
A column reference property that holds either a single reference to a column description object within this schema, or an array of references. These form the referencing columns for the foreign key definition .
reference
An object property that identifies a referenced table and a set of referenced columns within that table. Its properties are:
resource
A
link
property
holding
a
URL
that
is
the
identifier
for
a
specific
table
that
is
being
referenced.
If
this
property
is
present
then
schemaReference
MUST
NOT
be
present.
The
table
group
MUST
contain
a
table
whose
url
annotation
is
identical
to
the
expanded
value
of
this
property.
That
table
is
the
referenced
table
.
schemaReference
A
link
property
holding
a
URL
that
is
the
identifier
for
a
schema
that
is
being
referenced.
If
this
property
is
present
then
resource
MUST
NOT
be
present.
The
table
group
MUST
contain
a
table
with
a
tableSchema
having
a
@id
that
is
identical
to
the
expanded
value
of
this
property,
and
there
MUST
NOT
be
more
than
one
such
table.
That
table
is
the
referenced
table
.
columnReference
A
column
reference
property
that
holds
either
a
single
reference
(by
name)
to
a
column
description
object
within
the
tableSchema
of
the
referenced
table
,
or
an
array
of
such
references.
The value of this property becomes the foreign keys annotation on the table using this schema by creating a list of foreign keys comprising a list of columns in the table and a list of columns in the referenced table. The value of this property is also used to create the value of the referenced rows annotation on each of the rows in the table that uses this schema, which is a pair of the relevant foreign key and the referenced row in the referenced table.
As
defined
in
[
tabular-data-model
],
validators
MUST
check
that,
for
each
row,
the
combination
of
cells
in
the
referencing
columns
references
a
unique
row
within
the
referenced
table
through
a
combination
of
cells
in
the
referenced
columns
.
For
examples,
see
section
5.5.2.1
Foreign
Key
Reference
Between
Tables
and
section
5.5.2.2
Foreign
Key
Reference
Between
Schemas
.
It
is
not
required
for
the
table
or
schema
referenced
from
a
foreignKeys
property
to
have
a
similarly
defined
primaryKey
,
though
frequently
it
will.
primaryKey
A column reference property that holds either a single reference to a column description object or an array of references. The value of this property becomes the primary key annotation for each row within a table that uses this schema by creating a list of the cells in that row that are in the referenced columns.
As
defined
in
[
tabular-data-model
],
validators
MUST
check
that
each
row
has
a
unique
combination
of
values
of
cells
in
the
indicated
columns.
For
example,
if
primaryKey
is
set
to
["familyName",
"givenName"]
then
every
row
must
have
a
unique
value
for
the
combination
of
values
of
cells
in
the
familyName
and
givenName
columns.
rowTitles
A
column
reference
property
that
holds
either
a
single
reference
to
a
column
description
object
or
an
array
of
references.
The
value
of
this
property
determines
the
titles
annotation
for
each
row
within
a
table
that
uses
this
schema.
The
titles
annotation
holds
the
list
of
the
values
of
the
cells
in
that
row
that
are
in
the
referenced
columns;
if
the
value
is
not
a
string
or
has
no
associated
language,
it
is
interpreted
as
a
string
with
an
undefined
language
(
und
).
@id
If
included,
@id
is
a
link
property
that
identifies
the
schema
described
by
this
schema
description
.
It
MUST
NOT
start
with
_:
.
@type
If
included,
@type
is
an
atomic
property
that
MUST
be
set
to
"Schema"
.
Publishers
MAY
include
this
to
provide
additional
information
to
JSON-LD
based
toolchains.
The description MAY contain any common properties to provide extra metadata about the schema as a whole.
The description MAY contain inherited properties to describe cells within tables that use this schema.
Two
schemas
are
compatible
if
they
have
the
same
number
of
non-
virtual
column
descriptions
,
and
the
non-
virtual
column
descriptions
at
the
same
index
within
each
are
compatible
with
each
other.
Column
descriptions
are
compatible
under
the
following
conditions:
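If either column description has neither name nor titles properties.
If there is a case-sensitive match between the name properties of the columns.
If there is a non-empty case-sensitive intersection between the titles values, where matches must have a matching language; und matches any language, and languages match if they are equal when truncated, as defined in [ BCP47 ], to the length of the shortest language tag.
If not validating, and one schema has a name property but not a titles property, and the other has a titles property but not a name property.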
A
column
description
within
embedded
metadata
where
the
header
dialect
property
is
false
will
have
neither
name
nor
titles
properties.
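The compatibility rules can be sketched in code. The following is a minimal, non-normative illustration that reads the conditions as: either description lacks both name and titles; the name values match case-sensitively; the titles values intersect for matching languages; or, when not validating, one side has only name and the other only titles. The function names are hypothetical, titles is assumed to be already normalized to an object mapping language tags to arrays of strings, and [ BCP47 ] truncation is approximated by comparing leading subtags.

```python
def languages_match(a, b):
    """'und' matches any language; otherwise compare the tags truncated
    to the number of subtags in the shorter one (an approximation)."""
    if a == "und" or b == "und":
        return True
    pa, pb = a.lower().split("-"), b.lower().split("-")
    n = min(len(pa), len(pb))
    return pa[:n] == pb[:n]


def titles_intersect(t1, t2):
    """t1 and t2 map a language tag to a list of title strings."""
    return any(
        languages_match(l1, l2) and bool(set(v1) & set(v2))
        for l1, v1 in t1.items()
        for l2, v2 in t2.items()
    )


def columns_compatible(c1, c2, validating=True):
    # Condition 1: either description has neither name nor titles.
    if any("name" not in c and "titles" not in c for c in (c1, c2)):
        return True
    # Condition 2: case-sensitive match between the names.
    if "name" in c1 and "name" in c2 and c1["name"] == c2["name"]:
        return True
    # Condition 3: non-empty intersection between the titles.
    if "titles" in c1 and "titles" in c2 and titles_intersect(c1["titles"], c2["titles"]):
        return True
    # Condition 4: only applies when not validating.
    if not validating:
        for a, b in ((c1, c2), (c2, c1)):
            if "name" in a and "titles" not in a and "titles" in b and "name" not in b:
                return True
    return False


def schemas_compatible(s1, s2, validating=True):
    """Compare the non-virtual column descriptions position by position."""
    def real(schema):
        return [c for c in schema.get("columns", []) if not c.get("virtual", False)]

    cols1, cols2 = real(s1), real(s2)
    return len(cols1) == len(cols2) and all(
        columns_compatible(a, b, validating) for a, b in zip(cols1, cols2)
    )
```

Virtual columns are filtered out before comparison, so a schema that only adds virtual columns remains compatible with the schema it extends.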
This section is non-normative.
A
list
of
countries
is
published
at
http://example.org/countries.csv
with
the
structure:
countryCode,latitude,longitude,name AD,42.5,1.6,Andorra AE,23.4,53.8,"United Arab Emirates" AF,33.9,67.7,Afghanistan
Another
file
contains
information
about
the
population
in
some
countries
each
year,
at
http://example.org/country_slice.csv
with
the
structure:
countryRef,year,population AF,1960,9616353 AF,1961,9799379 AF,1962,9989846
The
following
metadata
for
the
group
of
tables
links
the
two
together
by
defining
a
foreignKeys
property:
{ "@context": "http://www.w3.org/ns/csvw", "tables": [{ "url": "http://example.org/countries.csv", "tableSchema": { "columns": [{ "name": "countryCode", "datatype": "string", "propertyUrl": "http://www.geonames.org/ontology{#_name}" }, { "name": "latitude", "datatype": "number" }, { "name": "longitude", "datatype": "number" }, { "name": "name", "datatype": "string" }], "aboutUrl": "http://example.org/countries.csv{#countryCode}", "propertyUrl": "http://schema.org/{_name}", "primaryKey": "countryCode" } }, { "url": "http://example.org/country_slice.csv", "tableSchema": { "columns": [{ "name": "countryRef", "valueUrl": "http://example.org/countries.csv{#countryRef}" }, { "name": "year", "datatype": "gYear" }, { "name": "population", "datatype": "integer" }], "foreignKeys": [{ "columnReference": "countryRef", "reference": { "resource": "http://example.org/countries.csv", "columnReference": "countryCode" } }] } }] }
Within
the
annotated
table
generated
for
countries.csv
,
each
row
will
have
a
primary
key
annotation
whose
value
is
a
list
containing
the
cell
from
the
first
column
of
that
row
(
countryCode
).
The
annotated
table
generated
for
country_slice.csv
will
have
a
foreign
keys
annotation
whose
value
is
a
list
containing
a
single
foreign
key
referencing
the
first
column
from
the
table
generated
from
country_slice.csv
(
countryRef
)
and
the
first
column
from
the
table
generated
from
countries.csv
(
countryCode
).
Each
row
within
that
table
will
have
a
referenced
row
annotation
referencing
this
foreign
key
and
the
third
row
in
the
table
generated
from
countries.csv
.
When
the
population
data
in
country_slice.csv
is
validated,
the
validator
must
check
that
every
countryRef
within
country_slice.csv
has
a
matching
countryCode
within
countries.csv
.
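As a rough, non-normative sketch of that check, the following compares the raw string values of the referencing and referenced columns using the sample data above. The function names are hypothetical, and the sketch skips details that a conforming validator would need, such as comparing parsed cell values rather than strings and handling null cells.

```python
import csv
import io

COUNTRIES = '''countryCode,latitude,longitude,name
AD,42.5,1.6,Andorra
AE,23.4,53.8,"United Arab Emirates"
AF,33.9,67.7,Afghanistan
'''

COUNTRY_SLICE = '''countryRef,year,population
AF,1960,9616353
AF,1961,9799379
AF,1962,9989846
'''


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def check_foreign_key(rows, referencing, referenced_rows, referenced):
    """Return (row number, message) pairs for rows whose referencing cells
    do not match exactly one row in the referenced table."""
    index = {}
    for number, row in enumerate(referenced_rows, start=1):
        key = tuple(row[c] for c in referenced)
        index.setdefault(key, []).append(number)
    errors = []
    for number, row in enumerate(rows, start=1):
        key = tuple(row[c] for c in referencing)
        matches = index.get(key, [])
        if len(matches) != 1:
            errors.append((number, f"{key} matches {len(matches)} rows"))
    return errors


if __name__ == "__main__":
    problems = check_foreign_key(
        read_rows(COUNTRY_SLICE), ["countryRef"],
        read_rows(COUNTRIES), ["countryCode"],
    )
    print(problems)
```

Both column lists are passed as arrays so that the same function covers compound keys, where columnReference holds more than one column name.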
When
publishing
information
about
public
sector
roles
and
salaries,
as
in
Use
Case
4
,
the
UK
government
requires
departments
to
publish
two
files
which
are
interlinked.
The
first
lists
senior
grades
(simplified
here)
e.g.,
at
HEFCE_organogram_senior_data_31032011.csv
:
Post Unique Reference, Name,Grade, Job Title,Reports to Senior Post 90115, Steve Egan,SCS1A,Deputy Chief Executive, 90334 90250, David Sweeney,SCS1A, Director, 90334 90284, Heather Fry,SCS1A, Director, 90334 90334,Sir Alan Langlands, SCS4, Chief Executive, xx
The
second
provides
information
about
the
number
of
junior
positions
that
report
to
those
individuals
(simplified
here)
e.g.,
at
HEFCE_organogram_junior_data_31032011.csv
:
Reporting Senior Post,Grade,Payscale Minimum (£),Payscale Maximum (£),Generic Job Title,Number of Posts in FTE, Profession 90284, 4, 17426, 20002, Administrator, 2,Operational Delivery 90284, 5, 19546, 22478, Administrator, 1,Operational Delivery 90115, 4, 17426, 20002, Administrator, 8.67,Operational Delivery 90115, 5, 19546, 22478, Administrator, 0.5,Operational Delivery
The schemas are reused by multiple departments and for multiple pairs of files. The schemas are therefore defined in separate files, and they need to define links between the schemas which are then picked up as applying between tables that use those schemas.
The metadata file for the particular publication of the files above is:
{ "@context": "http://www.w3.org/ns/csvw", "tables": [{ "url": "HEFCE_organogram_senior_data_31032011.csv", "tableSchema": "http://example.org/schema/senior-roles.json" }, { "url": "HEFCE_organogram_junior_data_31032011.csv", "tableSchema": "http://example.org/schema/junior-roles.json" }] }
The
schema
for
the
senior
role
CSV
(at
http://example.org/schema/senior-roles.json
)
is
as
follows:
{ "@id": "http://example.org/schema/senior-roles.json", "@context": "http://www.w3.org/ns/csvw", "columns": [{ "name": "ref", "titles": "Post Unique Reference" }, { "name": "name", "titles": "Name" }, { "name": "grade", "titles": "Grade" }, { "name": "job", "titles": "Job Title" }, { "name": "reportsTo", "titles": "Reports to Senior Post" }], "primaryKey": "ref" }
The
schema
for
the
junior
role
CSV
(at
http://example.org/schema/junior-roles.json
)
is
as
follows;
it
includes
a
foreign
key
reference
to
the
senior
roles
schema:
{ "@id": "http://example.org/schema/junior-roles.json", "@context": "http://www.w3.org/ns/csvw", "columns": [{ "name": "reportsTo", "titles": "Reporting Senior Post" }, ... ], "foreignKeys": [{ "columnReference": "reportsTo", "reference": { "schemaReference": "http://example.org/schema/senior-roles.json", "columnReference": "ref" } }] }
The
foreign
key
definition
here
contains
a
schemaReference
to
senior-roles.json
.
Implementations
will
look
for
the
table
referenced
within
the
original
metadata
file
whose
tableSchema
is
senior-roles.json
,
which
is
HEFCE_organogram_senior_data_31032011.csv
.
The
implementation
will
therefore
look
for
a
relationship
between
the
reportsTo
column
in
HEFCE_organogram_junior_data_31032011.csv
and
the
ref
column
in
HEFCE_organogram_senior_data_31032011.csv
.
For
example,
in
the
first
line
of
HEFCE_organogram_junior_data_31032011.csv
,
the
reportsTo
(
Reporting
Senior
Post
)
column
contains
the
value
90284
.
When
validating
that
file,
validators
will
check
that
there
is
a
single
row
within
the
table
generated
from
HEFCE_organogram_senior_data_31032011.csv
whose
ref
column
contains
the
value
90284
.
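The lookup of the referenced table can be sketched as follows. This is a non-normative illustration: the function name is hypothetical, the table descriptions are assumed to have had their tableSchema links already loaded into objects, and all URLs are assumed to be already expanded. The sketch also insists on exactly one match in the resource case, which is slightly stricter than the text requires.

```python
def find_referenced_table(tables, reference):
    """Locate the referenced table for the `reference` object of a
    foreign key definition within a list of table descriptions."""
    if "resource" in reference and "schemaReference" in reference:
        raise ValueError("resource and schemaReference must not both be present")
    if "resource" in reference:
        # Match on the table's url.
        matches = [t for t in tables if t["url"] == reference["resource"]]
    else:
        # Match on the @id of the table's schema.
        target = reference["schemaReference"]
        matches = [t for t in tables if t.get("tableSchema", {}).get("@id") == target]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one referenced table, found {len(matches)}")
    return matches[0]
```

Applied to the metadata above, a reference with schemaReference set to http://example.org/schema/senior-roles.json resolves to the table description for HEFCE_organogram_senior_data_31032011.csv.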
Foreign
key
definitions
provide
for
strong
linking
between
tables
that
guarantees
(through
validation)
the
existence
of
a
referenced
row.
It
is
also
possible
to
provide
weak
linking
between
tables
that
are
not
tested
by
validations
but
which
may
be
useful
when
converting
tabular
data
into
other
formats,
using
aboutUrl
and
valueUrl
.
Taking
the
example
above
as
a
starting
point,
the
schema
for
HEFCE_organogram_senior_data_31032011.csv
could
use
aboutUrl
to
provide
a
URL
for
each
row,
which
can
similarly
be
created
as
a
valueUrl
for
the
reportsTo
column:
{ "@id": "http://example.org/schema/senior-roles.json", "@context": "http://www.w3.org/ns/csvw", "aboutUrl": "#role-{ref}", "columns": [{ "name": "ref", "titles": "Post Unique Reference" }, { "name": "name", "titles": "Name" }, { "name": "grade", "titles": "Grade" }, { "name": "job", "titles": "Job Title" }, { "name": "reportsTo", "titles": "Reports to Senior Post", "valueUrl": "#role-{reportsTo}" }], "primaryKey": "ref" }
The
URLs
generated
for
the
values
of
the
reportsTo
will
(if
the
data
is
correct)
match
the
URLs
generated
for
each
row
within
the
table.
There
will
be
no
validation
error,
however,
if
there
is
a
value
in
the
reportsTo
column
that
does
not
match
a
value
in
the
ref
column.
In
contrast,
if
a
foreign
key
had
been
specified
with:
"foreignKeys": [{ "columnReference": "reportsTo", "reference": { "schemaReference": "http://example.org/schema/senior-roles.json", "columnReference": "ref" } }]
then
validators
would
raise
an
error
if
a
value
in
the
reportsTo
column
did
not
match
any
value
in
the
ref
column.
A column description is a JSON object that describes a single column. The description provides additional human-readable documentation for a column, as well as additional information that may be used to validate the cells within the column, create a user interface for data entry, or inform conversion into other formats. All properties are optional.
name
An atomic property that gives a single canonical name for the column. The value of this property becomes the name annotation for the described column . This MUST be a string and this property has no default value, which means it MUST be ignored if the supplied value is not a string.
For
ease
of
reference
within
URI
template
properties
,
column
names
are
restricted
as
defined
in
Variables
in
[
URI-TEMPLATE
]
with
the
additional
provision
that
names
beginning
with
"_"
are
reserved
by
this
specification
and
MUST
NOT
be
used
within
metadata
documents.
suppressOutput
A
boolean
atomic
property
.
If
true
,
suppresses
any
output
that
would
be
generated
when
converting
cells
in
this
column.
The
value
of
this
property
becomes
the
suppress
output
annotation
for
the
described
column
.
The
default
is
false
.
titles
A natural language property that provides possible alternative names for the column. The string values of this property, along with their associated language tags, become the titles annotation for the described column .
If there is no name property defined on this column, the first titles value having the same language tag as default language, or und if no default language is specified, becomes the name annotation for the described column. This annotation MUST be percent-encoded as necessary to conform to the syntactic requirements defined in [ RFC3986 ].
virtual
A
boolean
atomic
property
taking
a
single
value
which
indicates
whether
the
column
is
a
virtual
column
not
present
in
the
original
source.
The
default
value
is
false
.
The
normalized
value
of
this
property
becomes
the
virtual
annotation
for
the
described
column
.
If
present,
a
virtual
column
MUST
appear
after
all
other
non-virtual
column
definitions.
Virtual columns are useful for inserting cells with default values into an annotated table to control the results of conversions.
@id
If
included,
@id
is
a
link
property
that
identifies
the
columns
,
as
defined
in
[
tabular-data-model
],
and
potentially
appearing
across
separate
tables,
described
by
this
column
description
.
It
MUST
NOT
start
with
_:
.
@type
If
included,
@type
is
an
atomic
property
that
MUST
be
set
to
"Column"
.
Publishers
MAY
include
this
to
provide
additional
information
to
JSON-LD
based
toolchains.
If
the
column
description
has
neither
name
nor
titles
properties,
the
string
"_col.
[N]
"
where
[N]
is
the
column
number
,
becomes
the
name
annotation
for
the
described
column
.
The description MAY contain any common properties to provide extra metadata about the column as a whole, such as a full description.
The description MAY contain inherited properties to describe cells within the column.
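The way the name annotation is derived can be sketched as below. This is a non-normative illustration: the function name is hypothetical, titles is assumed to be normalized to an object mapping language tags to arrays of strings, and encoding every reserved character with `urllib.parse.quote` is one possible reading of "percent-encoded as necessary". Falling back to `_col.[N]` when titles exist but none is in the expected language is also an assumption of this sketch.

```python
from urllib.parse import quote


def column_name(description, column_number, default_language=None):
    """Derive the name annotation for a column description."""
    name = description.get("name")
    if isinstance(name, str):
        return name
    titles = description.get("titles")
    if titles:
        # Use the default language, or "und" when none is specified.
        values = titles.get(default_language or "und", [])
        if values:
            return quote(values[0], safe="")
    # Neither name nor a usable title (assumption: same fallback for both).
    return f"_col.{column_number}"
```

For example, a column description holding only the English title "Job Title" yields the name `Job%20Title` when the default language is `en`.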
This section is non-normative.
virtual
columns
Virtual
columns
are
useful
when
data
needs
to
be
added
as
part
of
an
output
transformation
that
doesn't
exist
in
the
source
file.
This
may
be
to
add
type
information
to
a
column,
or
to
relate
different
columns
having
different
aboutUrl
.
For
example,
the
http://example.org/tree-ops.csv
example
used
in
the
introduction
can
be
used
with
the
following
metadata:
{ "url": "tree-ops.csv", "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}], "tableSchema": { "columns": [{ "name": "GID", "titles": "GID", "datatype": "string", "propertyUrl": "schema:url", "valueUrl": "#gid-{GID}" }, { "name": "on_street", "titles": "On Street", "datatype": "string", "aboutUrl": "#location-{GID}", "propertyUrl": "schema:streetAddress" }, { "name": "species", "titles": "Species", "datatype": "string", "propertyUrl": "schema:name" }, { "name": "trim_cycle", "titles": "Trim Cycle", "datatype": "string" }, { "name": "inventory_date", "titles": "Inventory Date", "datatype": {"base": "date", "format": "M/d/yyyy"}, "aboutUrl": "#event-{inventory_date}", "propertyUrl": "schema:startDate" }, { "propertyUrl": "schema:event", "valueUrl": "#event-{inventory_date}", "virtual": true }, { "propertyUrl": "schema:location", "valueUrl": "#location-{GID}", "virtual": true }, { "aboutUrl": "#location-{GID}", "propertyUrl": "rdf:type", "valueUrl": "schema:PostalAddress", "virtual": true }], "aboutUrl": "#gid-{GID}" } }
This
metadata
creates
a
relationship
model
between
data
in
each
column
by
different
combinations
of
aboutUrl
,
propertyUrl
,
and
valueUrl
on
existing
columns,
and
defining
new
virtual
columns
to
supply
additional
information.
In
this
case,
the
on_street
and
inventory_date
values
are
split
into
separate
entities,
each
having
their
own
aboutUrl
.
New
virtual
columns
are
defined
to
provide
a
location
type,
and
to
relate
the
main
row
entity
to
the
event
and
location
associated
with
it.
The
result
of
converting
the
table
to
RDF
would
include
the
following,
for
the
first
row,
with
the
contributions
from
the
virtual
columns
highlighted:
<#gid-1> schema:url <#gid-1> ; schema:name "Celtis australis" ; :trim_cycle "Large Tree Routine Prune" ; schema:event <#event-2010-10-18> ; schema:location <#location-1> ; . <#event-2010-10-18> a schema:Event ; schema:startDate "2010-10-18"^^xsd:date ; . <#location-1> a schema:PostalAddress ; schema:streetAddress "ADDISON AV" ; .
The JSON would similarly include, again with the contributions from the virtual columns highlighted:
{ "@id": "#gid-1", "schema:url": "#gid-1", "schema:name": "Celtis australis", "trim_cycle": "Large Tree Routine Prune", "schema:event": { "@id": "#event-2010-10-18", "@type": "schema:Event", "schema:startDate": "2010-10-18" }, "schema:location": { "@id": "#location-1", "@type": "schema:PostalAddress", "schema:streetAddress": "ADDISON AV" } }
Columns and cells may be assigned annotations based on properties on the description objects for groups of tables, tables, schemas, or columns. These properties are known as inherited properties and are listed below.
If an inherited property is not defined on a column description, it defaults to the first value, if any, found by looking through all of its containing objects: an inherited property defined in its containing schema description takes precedence over one defined in its containing table description, which in turn takes precedence over one defined in its containing table group description. This value is used to determine the value of the relevant annotation on the described column, which is then used to determine the value of the relevant annotation on the cells in that column.
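The lookup order can be illustrated with a short, non-normative sketch. The function name and the two constant tables are hypothetical; the fallback values are the defaults stated for each property in this section, and properties without a stated default here (such as datatype) simply yield `None` in this sketch.

```python
INHERITED = {
    "aboutUrl", "datatype", "default", "lang", "null", "ordered",
    "propertyUrl", "required", "separator", "textDirection", "valueUrl",
}

# Defaults as stated in the property definitions of this section.
DEFAULTS = {
    "default": "", "lang": "und", "null": "", "ordered": False,
    "required": False, "separator": None, "textDirection": "inherit",
}


def inherited_value(name, column, schema, table, group):
    """Return the value of an inherited property for a column, searching the
    column, schema, table and table group descriptions in that order."""
    if name not in INHERITED:
        raise KeyError(f"{name} is not an inherited property")
    for description in (column, schema, table, group):
        if description is not None and name in description:
            return description[name]
    return DEFAULTS.get(name)
```

Checking for the presence of the key, rather than for a truthy value, matters here because `false`, `null` and the empty string are all legitimate explicit values.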
aboutUrl
A URI template property that MAY be used to indicate what a cell contains information about. The value of this property becomes the about URL annotation for the described column and is used to create the value of the about URL annotation for the cells within that column as described in section 5.1.3 URI Template Properties .
aboutUrl
is
typically
defined
on
a
schema
description
or
table
description
to
indicate
what
each
row
is
about.
If
defined
on
individual
column
descriptions
,
care
must
be
taken
to
ensure
that
transformed
cell
values
maintain
a
semantic
relationship.
datatype
An atomic property that contains either a single string that is the main datatype of the values of the cell or a datatype description object. If the value of this property is a string, it MUST be the name of one of the built-in datatypes defined in section 5.11.1 Built-in Datatypes and this value is normalized to an object whose base property is the original string value. If it is an object then it describes a more specialized datatype. If a cell contains a sequence (i.e. the separator property is specified and not null) then this property specifies the datatype of each value within that sequence. See 5.11 Datatypes and Parsing Cells in [ tabular-data-model ] for more details.
The normalized value of this property becomes the datatype annotation for the described column .
default
An
atomic
property
holding
a
single
string
that
is
used
to
create
a
default
value
for
the
cell
in
cases
where
the
original
string
value
is
an
empty
string.
See
Parsing
Cells
in
[
tabular-data-model
]
for
more
details.
If
not
specified,
the
default
for
the
default
property
is
the
empty
string,
""
.
The
value
of
this
property
becomes
the
default
annotation
for
the
described
column
.
lang
An
atomic
property
giving
a
single
string
language
code
as
defined
by
[
BCP47
].
Indicates
the
language
of
the
value
within
the
cell.
See
Parsing
Cells
in
[
tabular-data-model
]
for
more
details.
The
value
of
this
property
becomes
the
lang
annotation
for
the
described
column
.
The
default
is
und
.
null
An
atomic
property
giving
the
string
or
strings
used
for
null
values
within
the
data.
If
the
string
value
of
the
cell
is
equal
to
any
one
of
these
values,
the
cell
value
is
null
.
See
Parsing
Cells
in
[
tabular-data-model
]
for
more
details.
If
not
specified,
the
default
for
the
null
property
is
the
empty
string
""
.
The
value
of
this
property
becomes
the
null
annotation
for
the
described
column
.
ordered
A
boolean
atomic
property
taking
a
single
value
which
indicates
whether
a
list
that
is
the
value
of
the
cell
is
ordered
(if
true
)
or
unordered
(if
false
).
The
default
is
false
.
This
property
is
irrelevant
if
the
separator
is
null
or
undefined,
but
this
is
not
an
error.
The
value
of
this
property
becomes
the
ordered
annotation
for
the
described
column
,
and
the
ordered
annotation
for
the
cells
within
that
column.
propertyUrl
A URI template property that MAY be used to create a URI for a property if the table is mapped to another format. The value of this property becomes the property URL annotation for the described column and is used to create the value of the property URL annotation for the cells within that column as described in section 5.1.3 URI Template Properties .
propertyUrl
is
typically
defined
on
a
column
description
.
If
defined
on
a
schema
description
,
table
description
or
table
group
description
,
care
must
be
taken
to
ensure
that
transformed
cell
values
maintain
an
appropriate
semantic
relationship,
for
example
by
including
the
name
of
the
column
in
the
generated
URL
by
using
_name
in
the
template.
required
A boolean atomic property taking a single value which indicates whether the cell must have a non-null value. See Parsing Cells in [ tabular-data-model ] for more details. The default is false, which means cells can have null values. The value of this property becomes the required annotation for the described column.
separator
An
atomic
property
that
MUST
have
a
single
string
value
that
is
the
character
string
used
to
separate
items
in
the
string
value
of
the
cell.
If
null
(the
default)
or
unspecified,
the
cell
does
not
contain
a
list.
Otherwise,
applications
MUST
split
the
string
value
of
the
cell
on
the
specified
separator
character
and
parse
each
of
the
resulting
strings
separately.
The
cell's
value
will
then
be
a
list.
See
Parsing
Cells
in
[
tabular-data-model
]
for
more
details.
The
value
of
this
property
becomes
the
separator
annotation
for
the
described
column
.
textDirection
An atomic property that MUST have a single string value that is one of "ltr", "rtl", "auto" or "inherit" (the default). Indicates whether the text within cells should be displayed as left-to-right text (ltr), as right-to-left text (rtl), according to the content of the cell (auto) or in the direction inherited from the table direction annotation of the table. The value of this property determines the text direction annotation for the column, and the text direction annotation for the cells within that column: if the value is inherit then the value of the text direction annotation is the value of the table direction annotation on the table, otherwise it is the value of this property. See Bidirectional Tables in [ tabular-data-model ] for details.
valueUrl
A URI template property that is used to map the values of cells into URLs. The value of this property becomes the value URL annotation for the described column and is used to create the value of the value URL annotation for the cells within that column as described in section 5.1.3 URI Template Properties .
This
allows
processors
to
build
URLs
from
cell
values
,
for
example
to
reference
RDF
resources
,
as
defined
in
[
rdf-concepts
].
For
example,
if
the
value
URL
were
"{#reference}"
,
each
cell
value
of
a
column
named
reference
would
be
used
to
create
a
URI
such
as
http://example.com/#1234
,
if
1234
were
a
cell
value
of
that
column.
valueUrl
is
typically
defined
on
a
column
description
.
If
defined
on
a
schema
description
,
table
description
or
table
group
description
,
care
must
be
taken
to
ensure
that
transformed
cell
values
maintain
an
appropriate
semantic
relationship.
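How the default, null and separator properties interact when a cell is read can be sketched as follows. This is a deliberately simplified, non-normative reading: the function name is hypothetical, datatype parsing and the required check are left out, and Parsing Cells in [ tabular-data-model ] remains the authoritative description of the algorithm.

```python
def parse_cell(string_value, default="", null="", separator=None):
    """Turn the string value of a cell into a value, a list, or None (null)."""
    # The null property may hold one string or several.
    nulls = [null] if isinstance(null, str) else list(null)
    # An empty cell takes the default value.
    if string_value == "":
        string_value = default
    if separator is None:
        # No separator: the cell holds a single value.
        return None if string_value in nulls else string_value
    if string_value == "":
        return []
    if string_value in nulls:
        return None
    # Split into items and apply the null check to each item.
    return [None if item in nulls else item for item in string_value.split(separator)]
```

With the defaults for all three properties, an empty cell comes out as null, because the default value (the empty string) is itself the default null string.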
This section is non-normative.
In
the
following
example,
aboutUrl
property
is
defined
on
the
tableSchema
,
and
therefore
affects
all
cells
for
that
table.
{ "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}], "url": "tree-ops.csv", "dc:title": "Tree Operations", "dcat:keyword": ["tree", "street", "maintenance"], "dc:publisher": { "schema:name": "Example Municipality", "schema:url": {"@id": "http://example.org"} }, "dc:license": {"@id": "http://opendefinition.org/licenses/cc-by/"}, "dc:modified": {"@value": "2010-12-31", "@type": "xsd:date"}, "tableSchema": { "columns": [{ "name": "GID", "titles": ["GID", "Generic Identifier"], "dc:description": "An identifier for the operation on a tree.", "datatype": "string", "required": true }, { "name": "on_street", "titles": "On Street", "dc:description": "The street that the tree is on.", "datatype": "string" }, { "name": "species", "titles": "Species", "dc:description": "The species of the tree.", "datatype": "string" }, { "name": "trim_cycle", "titles": "Trim Cycle", "dc:description": "The operation performed on the tree.", "datatype": "string" }, { "name": "inventory_date", "titles": "Inventory Date", "dc:description": "The date of the operation that was performed.", "datatype": {"base": "date", "format": "M/d/yyyy"} }], "primaryKey": "GID", "aboutUrl": "#gid-{GID}" } }
The
equivalent
effect
could
be
achieved
by
using
the
aboutUrl
property
on
each
column:
{ "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}], "url": "tree-ops.csv", "dc:title": "Tree Operations", "dcat:keyword": ["tree", "street", "maintenance"], "dc:publisher": { "schema:name": "Example Municipality", "schema:url": {"@id": "http://example.org"} }, "dc:license": {"@id": "http://opendefinition.org/licenses/cc-by/"}, "dc:modified": {"@value": "2010-12-31", "@type": "xsd:date"}, "tableSchema": { "columns": [{ "name": "GID", "titles": ["GID", "Generic Identifier"], "aboutUrl": "#gid-{GID}", "dc:description": "An identifier for the operation on a tree.", "datatype": "string", "required": true }, { "name": "on_street", "titles": "On Street", "aboutUrl": "#gid-{GID}", "dc:description": "The street that the tree is on.", "datatype": "string" }, { "name": "species", "titles": "Species", "aboutUrl": "#gid-{GID}", "dc:description": "The species of the tree.", "datatype": "string" }, { "name": "trim_cycle", "titles": "Trim Cycle", "aboutUrl": "#gid-{GID}", "dc:description": "The operation performed on the tree.", "datatype": "string" }, { "name": "inventory_date", "titles": "Inventory Date", "aboutUrl": "#gid-{GID}", "dc:description": "The date of the operation that was performed.", "datatype": {"base": "date", "format": "M/d/yyyy"} }], "primaryKey": "GID" } }
Descriptions
of
groups
of
tables,
tables,
schemas
and
columns
MAY
contain
any
common
properties
whose
names
are
either
absolute
URLs
or
prefixed
names
.
For
example,
a
table
description
may
contain
dc:description
,
dcat:keyword
,
or
schema:copyrightHolder
properties
to
provide
a
description,
keywords,
or
the
name
of
the
copyright
holder,
as
defined
in
Dublin
Core
Terms
,
DCAT
,
or
schema.org
.
The
names
of
common
properties
are
prefixed
names
,
in
the
syntax
prefix
:
name
.
Prefixed names can be expanded to provide a URI, by replacing the prefix and following colon with the URI that the prefix is associated with. Expansion is intended to be entirely consistent with Section 6.3 IRI Expansion in [ JSON-LD-API ] and implementations MAY use a JSON-LD processor for performing prefixed name and IRI expansion.
The prefixes that are recognized are those defined for [ rdfa-core ] within the RDFa 1.1 Initial Context and other prefixes defined within [ csvw-context ] and these MUST NOT be overridden. These prefixes are periodically extended; refer to [ csvw-context ] for details. Properties from other vocabularies MUST be named using absolute URLs.
Forbidding the declaration of new prefixes ensures consistent processing between JSON-LD-aware and non-JSON-LD-aware processors.
This
specification
does
not
define
how
common
properties
are
interpreted
by
implementations.
Implementations
SHOULD
treat
the
prefixed
names
for
common
properties
and
the
URLs
that
they
expand
into
in
the
same
way.
For
example,
if
an
implementation
recognises
and
displays
the
value
of
the
dc:description
property,
it
should
also
recognise
and
display
the
value
of
the
http://purl.org/dc/terms/description
property
in
the
same
way.
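A minimal, non-normative sketch of prefixed name expansion follows. The function name is hypothetical and the prefix table lists only the three prefixes used in the examples of this section; a real implementation would take the full set from the RDFa 1.1 Initial Context and [ csvw-context ], or delegate to a JSON-LD processor.

```python
# Subset of the recognized prefixes, limited to those used in this section.
PREFIXES = {
    "dc": "http://purl.org/dc/terms/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "schema": "http://schema.org/",
}


def expand(name):
    """Expand a prefixed name to a URL; leave absolute URLs and unknown
    prefixes untouched."""
    prefix, colon, local = name.partition(":")
    if colon and not local.startswith("//") and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return name


def same_property(a, b):
    """Treat a prefixed name and the URL it expands to as the same property."""
    return expand(a) == expand(b)
```

Comparing expanded forms is one way to satisfy the recommendation that `dc:description` and `http://purl.org/dc/terms/description` be handled identically.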
Common properties can take any JSON value, so long as any objects within the value (for example as items of an array or values of properties on other objects) adhere to the following restrictions, which are designed to ensure compatibility between JSON-LD-aware and non-JSON-LD-aware processors:
If
a
@value
property
is
used
on
an
object,
that
object
MUST
NOT
have
any
other
properties
aside
from
either
@type
or
@language
,
and
MUST
NOT
have
both
@type
and
@language
as
properties.
The
value
of
the
@value
property
MUST
be
a
string,
number,
or
boolean
value.
If
@type
is
also
used,
its
value
MUST
be
one
of:
If
a
@language
property
is
used,
it
MUST
have
a
string
value
that
adheres
to
the
syntax
defined
in
[
BCP47
],
or
be
null
.
If
a
@type
property
is
used
on
an
object
without
a
@value
property,
its
value
MUST
be
one
of:
@type
as
defined
for
any
of
the
description
objects
in
this
specification.
A
@type
property
can
also
have
a
value
that
is
an
array
of
such
values.
The
values
of
@id
properties
are
link
properties
and
are
treated
as
URLs.
During
normalization
,
as
described
in
section
6.
Normalization
,
they
will
have
any
prefix
expanded
and
the
result
resolved
against
the
base
URL
.
Therefore,
if
an
@id
property
is
used
on
an
object,
it
MUST
have
a
value
that
is
a
string
and
that
string
MUST
NOT
start
with
_:
.
A
@language
property
MUST
NOT
be
used
on
an
object
unless
it
also
has
a
@value
property.
Aside from @value, @type, @language, and @id, the properties used on an object MUST NOT start with @.
These
restrictions
are
also
described
in
section
A.
JSON-LD
Dialect
,
from
the
perspective
of
a
processor
that
otherwise
supports
JSON-LD.
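Several of these restrictions are structural and can be checked without a JSON-LD processor. The sketch below is non-normative and partial: the function name is hypothetical, and it does not check the permitted values of @type or the [ BCP47 ] syntax of @language.

```python
KEYWORDS = {"@value", "@type", "@language", "@id"}


def check_common_value(value, path="value"):
    """Return a list of messages describing violated restrictions."""
    errors = []
    if isinstance(value, list):
        for i, item in enumerate(value):
            errors += check_common_value(item, f"{path}[{i}]")
        return errors
    if not isinstance(value, dict):
        return errors  # strings, numbers, booleans and null are always allowed
    if "@value" in value:
        extra = set(value) - {"@value", "@type", "@language"}
        if extra:
            errors.append(f"{path}: unexpected properties with @value: {sorted(extra)}")
        if "@type" in value and "@language" in value:
            errors.append(f"{path}: @type and @language used together")
        if not isinstance(value["@value"], (str, int, float, bool)):
            errors.append(f"{path}: @value must be a string, number or boolean")
    elif "@language" in value:
        errors.append(f"{path}: @language used without @value")
    lang = value.get("@language")
    if lang is not None and not isinstance(lang, str):
        errors.append(f"{path}: @language must be a string or null")
    if "@id" in value:
        ident = value["@id"]
        if not isinstance(ident, str) or ident.startswith("_:"):
            errors.append(f"{path}: @id must be a string that does not start with _:")
    for key, item in value.items():
        if key.startswith("@"):
            if key not in KEYWORDS:
                errors.append(f"{path}: property {key} must not start with @")
        else:
            errors += check_common_value(item, f"{path}.{key}")
    return errors
```

The function recurses into arrays and nested objects because the restrictions apply to any object within a common property value, not only to the outermost one.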
Examples
of
common
property
values
and
the
impact
of
normalization
are
given
in
section
6.1
Examples
.
Much
of
the
tabular
data
that
is
published
on
the
web
is
messy,
and
CSV
parsers
frequently
need
to
be
configured
in
order
to
correctly
read
in
CSV.
A
dialect
description
provides
hints
to
parsers
about
how
to
parse
the
file
linked
to
from
the
url
property
in
a
table
description
.
It
can
have
any
of
the
following
properties,
which
relate
to
the
flags
described
in
Section
5
Parsing
Tabular
Data
within
the
[
tabular-data-model
]:
commentPrefix
An atomic property that sets the comment prefix flag to the single provided value, which MUST be a single character string. The default is "#".
delimiter
An atomic property that sets the delimiter flag to the single provided value, which MUST be a single character string. The default is ",".
doubleQuote
A boolean atomic property that, if true, sets the escape character flag to ". If false, to \. The default is true.
encoding
An atomic property that sets the encoding flag to the single provided string value, which MUST be an encoding defined in [encoding]. The default is "utf-8".
header
A boolean atomic property that, if true, sets the header row count flag to 1, and if false to 0, unless headerRowCount is provided, in which case the value provided for the header property is ignored. The default is true.
headerRowCount
A numeric atomic property that sets the header row count flag to the single provided value, which MUST be a non-negative integer. The default is 1.
lineTerminators
An atomic property that sets the line terminators flag to either an array containing the single provided string value, or the provided array. The default is ["\r\n", "\n"].
quoteChar
An atomic property that sets the quote character flag to the single provided value, which MUST be a single character string or null. If the value is null, the escape character flag is also set to null. The default is ".
skipBlankRows
A boolean atomic property that sets the skip blank rows flag to the single provided boolean value. The default is false.
skipColumns
A numeric atomic property that sets the skip columns flag to the single provided numeric value, which MUST be a non-negative integer. The default is 0.
skipInitialSpace
A boolean atomic property that, if true, sets the trim flag to "start". If false, to false. If the trim property is provided, the skipInitialSpace property is ignored. The default is false.
skipRows
A numeric atomic property that sets the skip rows flag to the single provided numeric value, which MUST be a non-negative integer. The default is 0.
trim
An atomic property that, if the boolean true, sets the trim flag to true and if the boolean false to false. If the value provided is a string, sets the trim flag to the provided value, which MUST be one of "true", "false", "start", or "end". The default is true.
@id
If included, @id is a link property that identifies the dialect described by this dialect description. It MUST NOT start with _:.
@type
If included, @type is an atomic property that MUST be set to "Dialect". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
Dialect descriptions do not provide a mechanism for handling CSV files in which there are multiple tables within a single file (e.g. separated by empty lines).
The default dialect description for CSV files is:
{ "encoding": "utf-8", "lineTerminators": ["\r\n", "\n"], "quoteChar": "\"", "doubleQuote": true, "skipRows": 0, "commentPrefix": "#", "header": true, "headerRowCount": 1, "delimiter": ",", "skipColumns": 0, "skipBlankRows": false, "skipInitialSpace": false, "trim": false }
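Several dialect properties interact: headerRowCount overrides header, trim overrides skipInitialSpace, and a null quoteChar also clears the escape character. The sketch below shows one way a parser might resolve those interactions into flags. The function name `dialect_flags` and the flag keys are hypothetical. It also assumes that a null quoteChar takes precedence over doubleQuote, which the text does not spell out, and it only sets the trim flag when trim or skipInitialSpace is actually supplied.

```python
def dialect_flags(dialect):
    """Resolve the interdependent dialect properties into parser flags."""
    flags = {}

    # headerRowCount, when provided, makes the header property irrelevant.
    if "headerRowCount" in dialect:
        flags["header row count"] = dialect["headerRowCount"]
    else:
        flags["header row count"] = 1 if dialect.get("header", True) else 0

    # doubleQuote selects the escape character; a null quoteChar clears it
    # (precedence of quoteChar over doubleQuote is an assumption of this sketch).
    quote_char = dialect.get("quoteChar", '"')
    flags["quote character"] = quote_char
    if quote_char is None:
        flags["escape character"] = None
    else:
        flags["escape character"] = '"' if dialect.get("doubleQuote", True) else "\\"

    # A single string is wrapped in an array; an array is used as given.
    terminators = dialect.get("lineTerminators", ["\r\n", "\n"])
    flags["line terminators"] = (
        [terminators] if isinstance(terminators, str) else list(terminators)
    )

    # trim wins over skipInitialSpace when both are present.
    if "trim" in dialect:
        trim = dialect["trim"]
        if not isinstance(trim, bool) and trim not in ("true", "false", "start", "end"):
            raise ValueError(f"invalid trim value: {trim!r}")
        flags["trim"] = trim
    elif "skipInitialSpace" in dialect:
        flags["trim"] = "start" if dialect["skipInitialSpace"] else False
    return flags
```

For example, `dialect_flags({"header": False, "headerRowCount": 2})` reports a header row count of 2, because the header value is ignored once headerRowCount is given.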
A transformation definition is a definition of how tabular data can be transformed into another format using a script or template.
For example, the following transformation definition will enable a processor that supports it to generate an iCalendar document using a Mustache template based on the JSON created from the simple mapping to JSON.
{ "url": "templates/ical.txt", "titles": "iCalendar", "targetFormat": "http://www.iana.org/assignments/media-types/text/calendar", "scriptFormat": "https://mustache.github.io/", "source": "json" }
A processor that recognises templates in the Mustache format indicated by "https://mustache.github.io/" and that could convert tables into JSON based on [csv2json] would retrieve the template from "templates/ical.txt" and apply this to the resulting JSON.
Transformation definitions MUST have the following properties:
url
A link property giving the single URL of the file that the script or template is held in, relative to the location of the metadata document.
scriptFormat
A link property giving the single URL for the format that is used by the script or template. If one has been defined, this should be a URL for a media type, in the form http://www.iana.org/assignments/media-types/media-type such as http://www.iana.org/assignments/media-types/application/javascript. Otherwise, it can be any URL that describes the script or template format. The scriptFormat URL is intended as an informative identifier for the template format, and applications SHOULD NOT access the URL. The template formats that an application supports are implementation defined.
targetFormat
A link property giving the single URL for the format that will be created through the transformation. If one has been defined, this should be a URL for a media type, in the form http://www.iana.org/assignments/media-types/media-type such as http://www.iana.org/assignments/media-types/text/calendar. Otherwise, it can be any URL that describes the target format. The targetFormat URL is intended as an informative identifier for the target format, and applications SHOULD NOT access the URL.
Transformation definitions MAY have the following properties:
source
A single string atomic property that provides, if specified, the format to which the tabular data should be transformed prior to the transformation using the script or template. If the value is json, the tabular data MUST first be transformed to JSON as defined by [csv2json] using standard mode. If the value is rdf, the tabular data MUST first be transformed to an RDF graph as defined by [csv2rdf] using standard mode. If the source property is missing or null (the default) then the source of the transformation is the annotated tabular data model. No other values are valid.
titles
A natural language property that describes the format that will be generated from the transformation. This is useful if the target format is a generic format (such as application/json) and the transformation is creating a specific profile of that format.
@id
If included, @id is a link property that identifies the transformation described by this transformation definition. It MUST NOT start with _:.
@type
If included, @type is an atomic property that MUST be set to "Template". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
The transformation definition MAY contain any common properties to provide extra metadata about the transformation.
Implementations MAY present users with options for transformations based on the available transformation definitions and their properties. Implementations SHOULD filter this list to only include those transformations whose scriptFormat they understand and can apply, and whose source property, if present, specifies a format that the implementation can convert to. Users may find the targetFormat and titles properties useful in deciding which transformation to apply.
When directed by a user to transform a table using a transformation definition , implementations MUST :
source property, if this is specified and not null.
url property and raise an error if this does not exist.
scriptFormat property to determine how to interpret that script or template, and apply it to the table (or the result of converting the table).
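The filtering guidance for transformation definitions can be sketched as a simple selection over the available definitions. This is an illustration only: the function name `usable_transformations` and its parameters are hypothetical, and the sets of supported formats stand in for whatever a particular implementation can actually handle.

```python
def usable_transformations(definitions, supported_script_formats,
                           supported_sources=("json", "rdf")):
    """Keep the definitions whose scriptFormat and source this implementation supports."""
    usable = []
    for definition in definitions:
        # The scriptFormat URL is compared as an identifier; it is never fetched.
        if definition.get("scriptFormat") not in supported_script_formats:
            continue
        # A missing or null source means the annotated tabular data model is used directly.
        source = definition.get("source")
        if source is not None and source not in supported_sources:
            continue
        usable.append(definition)
    return usable


# Example input: the iCalendar definition from this section plus a script this
# hypothetical implementation cannot run.
definitions = [
    {
        "url": "templates/ical.txt",
        "titles": "iCalendar",
        "targetFormat": "http://www.iana.org/assignments/media-types/text/calendar",
        "scriptFormat": "https://mustache.github.io/",
        "source": "json",
    },
    {
        "url": "scripts/convert.js",
        "targetFormat": "http://www.iana.org/assignments/media-types/application/json",
        "scriptFormat": "http://www.iana.org/assignments/media-types/application/javascript",
    },
]
choices = usable_transformations(definitions, {"https://mustache.github.io/"})
```

The remaining entries in `choices` could then be offered to the user, labelled with their titles and targetFormat values.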
Cells within tables may be annotated with a datatype which indicates the type of the values obtained by parsing the string value of the cell. See [tabular-data-model] for a description of annotations on a datatype.
The possible built-in datatypes, as shown on the diagram , are:
anyAtomicType, whose identifier URLs are generated by prefixing the name with http://www.w3.org/2001/XMLSchema#.
number, double in the data model, http://www.w3.org/2001/XMLSchema#double.
binary, base64Binary in the data model, http://www.w3.org/2001/XMLSchema#base64Binary.
datetime, dateTime in the data model, http://www.w3.org/2001/XMLSchema#dateTime.
any, anyAtomicType in the data model, http://www.w3.org/2001/XMLSchema#anyAtomicType.
xml, a sub-type of string, whose identifier URL is http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral and which indicates the value is an XML
html, a sub-type of string, whose identifier URL is http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML and which indicates the value is an HTML
json, a sub-type of string, whose identifier URL is http://www.w3.org/ns/csvw#JSON and which indicates the value is serialized
More specialized datatypes can be defined through a datatype description. A datatype description may have any of the following properties, all of which are optional.
base
An atomic property that contains a single string: the name of one of the built-in datatypes, as listed above (and which are defined as terms in the default context). Its default is string. All values of the datatype MUST be valid values of the base datatype. The value of this property becomes the base annotation for the described datatype.
format
An atomic property that contains either a single string or an object that defines the format of a value of this type, used when parsing a string value as described in Parsing Cells in [ tabular-data-model ]. The value of this property becomes the format annotation for the described datatype .
length
A numeric atomic property that contains a single integer that is the exact length of the value. The value of this property becomes the length annotation for the described datatype . See Length Constraints in [ tabular-data-model ] for details.
minLength
An atomic property that contains a single integer that is the minimum length of the value. The value of this property becomes the minimum length annotation for the described datatype . See Length Constraints in [ tabular-data-model ] for details.
maxLength
A numeric atomic property that contains a single integer that is the maximum length of the value. The value of this property becomes the maximum length annotation for the described datatype . See Length Constraints in [ tabular-data-model ] for details.
minimum
An atomic property that contains a single number or string that is the minimum valid value (inclusive); equivalent to minInclusive. The value of this property becomes the minimum annotation for the described datatype. See Value Constraints in [tabular-data-model] for details.
maximum
An atomic property that contains a single number or string that is the maximum valid value (inclusive); equivalent to maxInclusive. The value of this property becomes the maximum annotation for the described datatype. See Value Constraints in [tabular-data-model] for details.
minInclusive
An atomic property that contains a single number or string that is the minimum valid value (inclusive). The value of this property becomes the minimum annotation for the described datatype . See Value Constraints in [ tabular-data-model ] for details.
maxInclusive
An atomic property that contains a single number or string that is the maximum valid value (inclusive). The value of this property becomes the maximum annotation for the described datatype . See Value Constraints in [ tabular-data-model ] for details.
minExclusive
An atomic property that contains a single number or string that is the minimum valid value (exclusive). The value of this property becomes the minimum exclusive annotation for the described datatype . See Value Constraints in [ tabular-data-model ] for details.
maxExclusive
An atomic property that contains a single number or string that is the maximum valid value (exclusive). The value of this property becomes the maximum exclusive annotation for the described datatype . See Value Constraints in [ tabular-data-model ] for details.
@id
If included, @id is a link property that identifies the datatype described by this datatype description. The value of this property becomes the id annotation for the described datatype. It MUST NOT start with _: and it MUST NOT be the URL of a built-in datatype.
@type
If included, @type is an atomic property that MUST be set to "Datatype". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
The datatype description MAY contain any common properties to provide extra metadata about the datatype, such as a title or description.
Applications MUST raise an error if the @id property has the value of a built-in datatype and any other property is specified. In these cases, the other properties are ignored.
Applications MUST raise an error if both length and minLength are specified and length is less than minLength. Similarly, applications MUST raise an error if both length and maxLength are specified and length is greater than maxLength.
Applications MUST raise an error if minLength and maxLength are both specified and minLength is greater than maxLength.
Applications MUST raise an error if length, maxLength, or minLength are specified and the base datatype is not string or one of its subtypes, or a binary type.
In all ways, including the errors described below, the minimum property is equivalent to the minInclusive property and the maximum property is equivalent to the maxInclusive property. Applications MUST raise an error if both minimum and minInclusive are specified and they do not have the same value. Similarly, applications MUST raise an error if both maximum and maxInclusive are specified and they do not have the same value.
Applications MUST raise an error if both minInclusive and minExclusive are specified, or if both maxInclusive and maxExclusive are specified.
Applications MUST raise an error if both minInclusive and maxInclusive are specified and maxInclusive is less than minInclusive, or if both minInclusive and maxExclusive are specified and maxExclusive is less than or equal to minInclusive. Similarly, applications MUST raise an error if both minExclusive and maxExclusive are specified and maxExclusive is less than minExclusive, or if both minExclusive and maxInclusive are specified and maxInclusive is less than or equal to minExclusive.
Applications MUST raise an error if minimum, minInclusive, maximum, maxInclusive, minExclusive, or maxExclusive are specified and the base datatype is not a numeric, date/time, or duration type.
Validation against these properties is as defined in [ xmlschema11-2 ].
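The consistency rules for length and value constraints can be collected into a single check. The sketch below is illustrative only: `check_datatype_constraints` is a hypothetical name, it assumes all bounds are plain numbers (date/time and duration bounds given as strings would first need parsing), it reads the length rules as "length below minLength" and "length above maxLength" being errors, and it omits the checks that depend on the base datatype.

```python
def check_datatype_constraints(datatype):
    """Return a list of constraint conflicts in a datatype description (empty if none)."""
    errors = []
    length = datatype.get("length")
    min_len = datatype.get("minLength")
    max_len = datatype.get("maxLength")
    if length is not None and min_len is not None and length < min_len:
        errors.append("length is less than minLength")
    if length is not None and max_len is not None and length > max_len:
        errors.append("length is greater than maxLength")
    if min_len is not None and max_len is not None and min_len > max_len:
        errors.append("minLength is greater than maxLength")

    # minimum and maximum are aliases for minInclusive and maxInclusive.
    for alias, name in (("minimum", "minInclusive"), ("maximum", "maxInclusive")):
        if alias in datatype and name in datatype and datatype[alias] != datatype[name]:
            errors.append(f"{alias} and {name} differ")
    min_inc = datatype.get("minInclusive", datatype.get("minimum"))
    max_inc = datatype.get("maxInclusive", datatype.get("maximum"))
    min_exc = datatype.get("minExclusive")
    max_exc = datatype.get("maxExclusive")

    if min_inc is not None and min_exc is not None:
        errors.append("both minInclusive and minExclusive specified")
    if max_inc is not None and max_exc is not None:
        errors.append("both maxInclusive and maxExclusive specified")
    if min_inc is not None and max_inc is not None and max_inc < min_inc:
        errors.append("maxInclusive is less than minInclusive")
    if min_inc is not None and max_exc is not None and max_exc <= min_inc:
        errors.append("maxExclusive is less than or equal to minInclusive")
    if min_exc is not None and max_exc is not None and max_exc < min_exc:
        errors.append("maxExclusive is less than minExclusive")
    if min_exc is not None and max_inc is not None and max_inc <= min_exc:
        errors.append("maxInclusive is less than or equal to minExclusive")
    return errors
```

An application following the rules above would raise an error whenever the returned list is not empty.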
When processing a tabular data file, the Locating Metadata section in [tabular-data-model] describes different locations for locating metadata. To properly transform a tabular data file, such as a CSV file, processors MUST merge metadata from these separate sources to create a single metadata document in a manner consistent with this algorithm. Implementations MUST check and issue warnings where merge issues are found as noted below and in the relevant property definitions.
Merging of metadata happens in order from highest priority to lowest priority by merging the first two metadata files (A and B) together to create new merged metadata AB'. This is then used to merge in the next metadata file until all metadata have been processed to create a table group description.
If the top-level object of either metadata file is a table description, it is turned into a table group description containing a single table description (i.e., having a tables property whose value is an array containing the original table description). Ensure that @context definitions are moved from the table description to the table group description.
Merging has two stages: the normalization of metadata documents, described in section 6.1 Normalization, and the merging of those normalized documents, described in section 6.2 Merging.
6.1 Normalization
Prior to merging, each description object is expanded relative to its @context and values are normalized as follows:
notes the value MUST be normalized as follows:
@value property whose value is that string. If a default language is specified, add a @language property whose value is that default language.
@value property, it remains as is.
@id, expand any prefixed names and resolve its value against the base URL.
@type, then its value remains as is.
@context.
@context then remove the local @context property. If the resulting object does not have an @id property, add an @id whose value is the original URL. This object becomes the value of the original object property.
und MUST be used.
Following this normalization process, the @base and @language properties within the @context are no longer relevant; the normalized metadata can have its @context set to http://www.w3.org/ns/csvw.
This section is non-normative.
The following are examples of how common properties are normalized.
In this example, a simple string is used as the title for a table using the dc:title common property:
{ "@context": [ "http://www.w3.org/ns/csvw", { "@language": "en" } ], "@type": "Table", "url": "http://example.com/table.csv", "tableSchema": [...], "dc:title": "The title of this Table" }
Since there is a default language, this is equivalent to explicitly specifying the language of that title; the original string value becomes the value of the @value property within a value object:
{ "@type": "Table", "url": "http://example.com/table.csv", "tableSchema": [...], "dc:title": {"@value": "The title of this Table", "@language": "en"} }
It is also possible to use a simple value object to give a title. However, in this case the default language is not applied to the title:
{ "@context": [ "http://www.w3.org/ns/csvw", { "@language": "en" } ], "@type": "Table", "url": "http://example.com/table.csv", "tableSchema": [...], "dc:title": {"@value": "The title of this Table"} }
The next example uses an array of a string and a value object to give two titles with different languages:
{ "@context": [ "http://www.w3.org/ns/csvw", { "@language": "en" } ], "@type": "Table", "url": "http://example.com/table.csv", "tableSchema": [...], "dc:title": [ "The title of this Table", {"@value": "Der Titel dieser Tabelle", "@language": "de"} ] }
The normalized version of this is:
{ "@type": "Table", "url": "http://example.com/table.csv", "tableSchema": [...], "dc:title": [ {"@value": "The title of this Table", "@language": "en"}, {"@value": "Der Titel dieser Tabelle", "@language": "de"} ] }
The next example demonstrates a node object, in which the value of the schema:url property is a reference to another resource:
{ "@context": [ "http://www.w3.org/ns/csvw", { "@base": "http://example.com/" } ], "@type": "Table", "url": "table.csv", "tableSchema": [...], "schema:url": {"@id": "table.csv"} }
The value of the @id property is normalized as described in section 6.1 Normalization against the base URL provided through the @base property, which means the above example is equivalent to:
{ "@context": "http://www.w3.org/ns/csvw", "@type": "Table", "url": "http://example.com/table.csv", "tableSchema": [...], "schema:url": {"@id": "http://example.com/table.csv"} }
The following example shows the dc:publisher property as an array that contains a single node object:
{ "@context": "http://www.w3.org/ns/csvw", "@type": "Table", "url": "http://example.com/table.csv", "tableSchema": [...], "dc:publisher": [{ "schema:name": "Example Municipality", "schema:url": {"@id": "http://example.org"} }] }
Following normalization, the schema:name property of the dc:publisher is expanded as shown:
"dc:publisher": [{ "schema:name": { "@value": "Example Municipality" }, "schema:url": { "@id": "http://example.org" } }]
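The normalization of common property values shown in these examples can be sketched as a short recursive function. This is a minimal sketch under stated assumptions: the name `normalize_common_property` and the small `PREFIXES` table are illustrative (a real processor would take its prefixes from [csvw-context]), property names are left unexpanded as in the examples, and numbers, booleans, and null are assumed to pass through unchanged.

```python
from urllib.parse import urljoin

# Illustrative subset of prefixes; assumed here, not read from the real context.
PREFIXES = {
    "schema": "http://schema.org/",
    "dc": "http://purl.org/dc/terms/",
}


def expand_prefixed_name(value):
    """Expand 'prefix:local' when the prefix is known; otherwise return the value unchanged."""
    prefix, sep, local = value.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return value


def normalize_common_property(value, base_url, default_language=None):
    if isinstance(value, list):
        return [normalize_common_property(v, base_url, default_language) for v in value]
    if isinstance(value, str):
        # A plain string becomes a value object, with the default language if any.
        normalized = {"@value": value}
        if default_language is not None:
            normalized["@language"] = default_language
        return normalized
    if isinstance(value, dict):
        if "@value" in value:
            return value  # value objects remain as they are
        normalized = {}
        for key, nested in value.items():
            if key == "@id":
                normalized[key] = urljoin(base_url, expand_prefixed_name(nested))
            elif key == "@type":
                normalized[key] = nested
            else:
                normalized[key] = normalize_common_property(nested, base_url, default_language)
        return normalized
    return value  # numbers, booleans, null: assumed unchanged
```

Applied to the dc:publisher value above with no default language, the string "Example Municipality" becomes a value object, while the schema:url node object keeps its already absolute @id.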
Applications that process tabular data may use that data to drive other actions, which may have security implications. These behaviors are outside the scope of this specification.
Third party metadata provided about a tabular data file (such as a CSV file) may rename or ignore headers, or exclude rows or columns, which may lead to data being misinterpreted by applications that process it.
Transformation definitions are a possible security risk as they enable the creators of metadata to reference arbitrary code that may be executed to convert tabular data into other formats. Implementations should run this arbitrary code in a sandboxed environment to reduce the security risk.
The Metadata Vocabulary for Tabular Data uses a format based on JSON-LD [ JSON-LD ] with some restrictions.
The value of any @id or @type contained within a metadata document MUST NOT be a blank node.
A metadata document MUST NOT add a new context (i.e. include a @context property except at the top level), or extend the top-level context in any way other than as specifically allowed in section 5.2 Top-Level Properties.
Common properties and notes may contain arbitrary JSON-LD with the following restrictions:
The value of any member of @type MUST be either a term defined in [csvw-context], a prefixed name where the prefix is a term defined in [csvw-context], or an absolute URL.
Values MAY be a string, native JSON type (such as number, true, or false), value object, node object or an array of zero or more of any of these.
Values MUST NOT use list objects or set objects .
Keys of node objects MUST NOT include @graph, @context, terms, or blank nodes.
When normalizing metadata, prefixed names used in common properties and notes are expanded to absolute URLs. For some serializations, these are more appropriately presented using prefixed names or terms. This algorithm compacts an absolute URL to a prefixed name or term.
: (U+003A) to create a prefixed name. If the resulting prefixed name is rdf:type, replace with @type.
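Compacting an absolute URL to a prefixed name might look like the sketch below. The earlier steps of the algorithm are not recoverable from the text here, so this illustrates the idea rather than transcribing it: the function name `compact_url` and the `PREFIXES` table are assumptions, exact matches against terms are not handled, and a real processor would use the prefixes defined in [csvw-context].

```python
# Illustrative subset of prefixes; assumed here, not read from the real context.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "schema": "http://schema.org/",
    "dc": "http://purl.org/dc/terms/",
}


def compact_url(url):
    """Compact an absolute URL to 'prefix:local' when a known namespace matches."""
    for prefix, namespace in PREFIXES.items():
        if url.startswith(namespace) and len(url) > len(namespace):
            # Join the prefix and the remainder with a colon (U+003A).
            prefixed_name = f"{prefix}:{url[len(namespace):]}"
            # rdf:type is written as the JSON-LD keyword @type.
            return "@type" if prefixed_name == "rdf:type" else prefixed_name
    return url  # no known namespace: leave the URL absolute
```

URLs outside the known namespaces are returned unchanged, so the function is safe to apply to every property name in a normalized document.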
This document is influenced by Data Package specification and the JSON Table Schema , which are maintained as part of Data Protocols . Particular contributors to that work are Rufus Pollock, Paul Fitzpatrick, Andrew Berkeley, Francis Irving, Benoit Chesneau, Leigh Dodds, Martin Keegan, and Gunnlaugur Thor Briem.
This section has not yet been submitted to IANA for review, approval, and registration.
text/csv and text/tab-delimited-values mediatypes, but a JSON-based format used to annotate such
Although no byte sequence can be guaranteed at a specific location, a valid application/csvm+json document MUST somewhere contain the string "@context" (including quotation characters), followed by one or more whitespace, colon or open-square-bracket characters, followed by the string "http://www.w3.org/ns/csvw" (including quotation characters).
org.w3c.csvm conforms to public.json
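The content-sniffing rule for application/csvm+json translates directly into a regular expression. The following is a minimal sketch; the names `CSVM_MAGIC` and `looks_like_csvm` are hypothetical, and a match is only a hint that the text is a metadata document, not proof that it is valid.

```python
import re

# "@context" (quoted), then one or more whitespace, colon or open-square-bracket
# characters, then the quoted csvw namespace URL.
CSVM_MAGIC = re.compile(r'"@context"[\s:\[]+"http://www\.w3\.org/ns/csvw"')


def looks_like_csvm(text):
    """Return True if the text contains the csvw @context marker."""
    return CSVM_MAGIC.search(text) is not None
```

Both the plain string form of @context and the array form with a second object for @base or @language satisfy the pattern.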
The JSON-LD context, located at http://www.w3.org/ns/csvw.jsonld, is used with metadata documents. When used within a metadata document, the context can be referenced as http://www.w3.org/ns/csvw. See [csvw-context] for a full description of defined terms and prefixes. This context may be updated from time to time to define new terms and prefixes.
A JSON-LD processor retrieving the context will use content negotiation to request the resource at http://www.w3.org/ns/csvw with an Accept: application/ld+json HTTP header as defined in Remote Document and Context Retrieval in [JSON-LD-API]. Without this header, this resource will return an HTML representation of the csvw namespace. All serializations of this resource also define the metadata vocabulary using RDFS.
rowTitles property has been added to help screen readers provide cues about focus within a table.
@id to reference an external datatype definition in XSD, OWL, or some other format.
trim changed from false to true.
The document has undergone substantial changes since the last working draft. Below are some of the changes made:
notes and common properties defined.
resources property was changed to tables.
foreignKeys.
templates property was changed to transformations.
urlTemplate property was changed from a schema property to the aboutUrl common property, and propertyUrl and valueUrl were added as common properties.
virtual columns to allow data to be inserted into a row.