Copyright © 2014-2015 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark and document use rules apply.
Validation, conversion, display and search of tabular data on the web requires additional metadata that describes how the data should be interpreted. This document defines a vocabulary for metadata that annotates tabular data. This can be used to provide metadata at various levels, from collections of data from CSV documents and how they relate to each other down to individual cells within a table.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
The CSV on the Web Working Group was chartered to produce a Recommendation "Access methods for CSV Metadata" as well as Recommendations for "Metadata vocabulary for CSV data" and "Mapping mechanism to transforming CSV into various Formats (e.g., RDF, JSON, or XML)". This document aims to primarily satisfy the second of those Recommendations.
This document was published by the CSV on the Web Working Group as a First Public Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-csv-wg@w3.org (subscribe, archives). All comments are welcome.
Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 August 2014 W3C Process Document.
Interpreting tabular data that is available on the web, particularly as CSV, usually requires additional metadata. As an example, say that the following CSV file were available at http://example.org/tree-ops.csv:
GID,On Street,Species,Trim Cycle,Inventory Date
1,ADDISON AV,Celtis australis,Large Tree Routine Prune,10/18/2010
2,EMERSON ST,Liquidambar styraciflua,Large Tree Routine Prune,6/2/2010
3,EMERSON ST,Liquidambar styraciflua,Large Tree Routine Prune,6/2/2010
A human consumer of this data might be able to figure out the meaning of the different columns, particularly if there were some additional human-readable documentation made available. Automated processors would have a much harder time; realistically they would be limited to displaying the information in a table.
Making available machine-readable metadata helps with the interpretation of the tabular data. For example, say that the following metadata file were available at http://example.org/tree-ops.csv-metadata.json:
{
  "@id": "tree-ops.csv",
  "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
  "dc:title": "Tree Operations",
  "dc:keywords": ["tree", "street", "maintenance"],
  "dc:publisher": [{
    "sch:name": "Example Municipality",
    "sch:web": "http://example.org"
  }],
  "dc:license": "http://opendefinition.org/licenses/cc-by/",
  "dc:modified": "2010-12-31",
  "schema": {
    "columns": [{
      "name": "GID",
      "title": [ "GID", "Generic Identifier" ],
      "dc:description": "An identifier for the operation on a tree.",
      "datatype": "string",
      "required": true
    }, {
      "name": "on-street",
      "title": "On Street",
      "dc:description": "The street that the tree is on.",
      "datatype": "string"
    }, {
      "name": "species",
      "title": "Species",
      "dc:description": "The species of the tree.",
      "datatype": "string"
    }, {
      "name": "trim-cycle",
      "title": "Trim Cycle",
      "dc:description": "The operation performed on the tree.",
      "datatype": "string"
    }, {
      "name": "inventory-date",
      "title": "Inventory Date",
      "dc:description": "The date of the operation that was performed.",
      "datatype": "date",
      "format": "M/D/YYYY"
    }],
    "primaryKey": "GID"
  }
}
Given the location of the CSV file, this metadata document can be located by appending -metadata.json to the URL (as described in Model for Tabular Data and Metadata on the Web). It provides information for different types of applications; for example, a validator can check that the values in the GID column are all present and unique.
The Model for Tabular Data and Metadata on the Web specification defines an Annotated Tabular Data Model in which tables, columns, rows and cells can be annotated with properties and values, and a Grouped Tabular Data Model in which a group of tables is annotated. That specification also describes how to locate metadata about a given CSV file.
This document defines the format and structure of metadata documents, and how these are interpreted to create an Annotated Tabular Data Model. It also defines how to validate tabular data based on some of these annotations. This metadata can be expressed as an RDF graph. However, all applications that conform to this specification (including validators and applications that read or convert tabular data) MUST read the JSON-based format described in this document.
Metadata documents are [JSON-LD] documents, however the aim is for the documents to be useable without any extra processing. To be valid, a metadata document MUST use a JSON-LD Context, either explicitly via the @context entry, or through the use of an HTTP Link header (see Interpreting JSON as JSON-LD in [JSON-LD]). The default location for this context is http://www.w3.org/ns/csvw. CSVW aware processors SHOULD assume a context at this location if one is not provided with the metadata document.
We invite comments on the utility of this approach: is it useful for CSV metadata to be interpretable as JSON-LD? Is it helpful to be able to map it to RDF?
Should JSON-LD keywords be aliased? The sense is to alias @id as url and not alias the others. We invite comments on the utility of this approach.
The metadata defined in this specification is used to annotate an existing annotated table or group of tables, as defined in [tabular-data-model]. Annotated tables form the basis for all further processing, such as validating or displaying the tables. All compliant applications MUST create annotated tables based on the algorithm defined here.
Metadata documents contain descriptions of groups of tables, tables, columns, rows, cells and regions which are used to create annotations on a tabular data model. There are two types of description objects:
The description objects themselves contain a number of properties. These are:
name of a column or the dc:provenance of a table
For example, in the column description
{ "name": "inventory-date", "title": "Inventory Date", "dc:description": "The date of the operation that was performed.", "datatype": "date", "format": "M/D/YYYY" }
the properties name, title and dc:description are direct annotations that become name, title and dc:description properties on the column in the data model. The datatype and format properties are inherited properties that become datatype and format properties on the cells within the column.
Direct annotations are properties on the description object for a given table, column, row or cell which map directly to properties on the described table, column, row or cell. The name of the annotation is the same as the name of the property on the description object. The value of the annotation is the same as the value of the property on the description object.
A cell may be assigned annotations based on properties on the description objects for the group of tables, table, column or row that it appears in. These properties are known as inherited properties and are listed in section 3.10 Inherited Properties. To ascertain a value for these annotations, an application MUST identify the relevant property in the descriptions of the table or column.
Applications MUST raise an error if the value of a property in a table description is not compatible with the value of that property on the group of tables. Applications MUST raise an error if the value of a property in a column description is not compatible with the value of that property on the table. Applications MUST raise an error if the value of a property on a cell is not compatible with the values of that property on both the column and the row that the cell is associated with.
A value for a cell, column or table is compatible with a value on a column, table or group of tables if they are the same value or if the first value is a sub-value of the second value. The definitions of individual inherited properties indicate what values count as sub-values of others.
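The compatibility rule for inherited properties can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of this specification: the function names and the `is_sub_value` callback are hypothetical, because what counts as a sub-value is defined separately for each inherited property.

```python
from typing import Any, Callable

_MISSING = object()  # sentinel: the property has not been seen at any level yet


def is_compatible(inner: Any, outer: Any,
                  is_sub_value: Callable[[Any, Any], bool]) -> bool:
    """True if both values are the same, or inner is a sub-value of outer."""
    return inner == outer or is_sub_value(inner, outer)


def check_inherited(prop: str, group: dict, table: dict, column: dict,
                    is_sub_value: Callable[[Any, Any], bool]) -> Any:
    """Return the most specific value of prop, or None if it is never set.

    Raises ValueError when an inner description is not compatible with
    the nearest enclosing description that sets the same property.
    """
    value = _MISSING
    # Walk from the outermost description to the innermost one.
    for level, description in (("group of tables", group),
                               ("table", table),
                               ("column", column)):
        if prop not in description:
            continue
        candidate = description[prop]
        if value is not _MISSING and not is_compatible(candidate, value, is_sub_value):
            raise ValueError(
                f"{prop!r} on the {level} ({candidate!r}) is not compatible "
                f"with the inherited value {value!r}")
        value = candidate
    return None if value is _MISSING else value
```

The callback keeps the per-property definition of "sub-value" outside the generic check, so the same walk can be reused for every inherited property.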
This section defines a set of properties and permitted values for annotating tabular data, and how these annotations should be interpreted by applications.
A metadata document is a JSON document which holds an object at the top level. This object is a description object of either a table group or a single table.
A description object is a JSON object that describes a component of the tabular data model (a table group, a table, a column, a row or a cell) and has one or more properties that are mapped into properties on that component.
There are different types of properties on description objects:
These hold an array of one or more objects, which are usually description objects.
For example, the resources property is an array property. A table group description might contain:
"resources": [{ "@id": "https://example.org/countries.csv", "schema": "https://example.org/countries.json" }, { "@id": "https://example.org/country_slice.csv", "schema": "https://example.org/country_slice.json" }]
in which case the resources property has a value that is an array of two table description objects.
These hold one or more references to other resources by URL. Their values may be:
For example, the dc:hasVersion property is a link property. A table description might contain:
"dc:hasVersion": "example-2014-01-03.csv"
in which case the dc:hasVersion property on the table would have a single value, a link to example-2014-01-03.csv. Alternatively, the metadata document might contain:
"dc:hasVersion": [ "example-2014-01-03.csv", "example-2014-01-17.csv", "example-2014-01-25.csv" ]
in which case the dc:hasVersion property on the table would be an array of three values, links to other versions of the table.
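As an illustration of how a processor might treat both forms of a link property uniformly, the following sketch normalises the value to a list and resolves each entry against a base URL. The function name and the base URL used in the usage note are assumptions made for this example only.

```python
from urllib.parse import urljoin


def resolve_links(value, base_url):
    """Return a list of absolute URLs for a link property value.

    value is either a single URL string or an array of URL strings;
    relative URLs are resolved against base_url.
    """
    links = [value] if isinstance(value, str) else list(value)
    return [urljoin(base_url, link) for link in links]
```

For instance, `resolve_links("example-2014-01-03.csv", "http://example.org/data/metadata.json")` yields a one-element list, so calling code never needs to distinguish between the single-value and array forms.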
A URI template property contains a [ URI-TEMPLATE ] which can be used to generate a URI. These URI templates are expanded in the context of each row by combining the template with a set of variables with values. The variables that are set are:
_row
_row is set to the row number of the row that is currently being processed
For example, the urlTemplate property holds a URI template that is used to generate a URL identifier for each row, which might look like:
"urlTemplate": "http://example.org/example.csv#row={_row}"
The identifiers that are generated for the rows would then look like http://example.org/example.csv#row=1, http://example.org/example.csv#row=2 and so on. Alternatively, with the CSV and metadata in the section 1. Introduction, the urlTemplate might look like:
"urlTemplate": "http://example.org/tree/{on%2Dstreet}/{GID}"
This would generate URIs such as http://example.org/tree/ADDISON%20AV/1 and http://example.org/tree/EMERSON%20ST/2.
Once the URI has been generated, it is resolved against the location of the resource (eg the CSV file) to create an absolute URI. For example, given a urlTemplate within a schema such as:
"urlTemplate": "#row={_row}"
and given a CSV file at http://example.com/temp.csv, the URL for the first row will be http://example.com/temp.csv#row=1.
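The expansion and resolution steps described above can be sketched as follows. This is a deliberately small illustration, not a full [URI-TEMPLATE] implementation: it only handles simple `{name}` expressions (operators such as `{#name}` are not supported), and the function names are hypothetical.

```python
import re
from urllib.parse import quote, unquote, urljoin

# Matches a simple template expression such as {GID} or {on%2Dstreet}.
_VARIABLE = re.compile(r"\{([^{}]+)\}")


def expand_template(template, variables):
    """Expand simple {name} expressions using the values in variables."""
    def substitute(match):
        # Variable names may be percent-encoded in the template.
        name = unquote(match.group(1))
        # Simple expansion percent-encodes everything but unreserved characters.
        return quote(str(variables[name]), safe="")
    return _VARIABLE.sub(substitute, template)


def row_url(template, variables, resource_url):
    """Expand the template for one row and resolve it against the resource."""
    return urljoin(resource_url, expand_template(template, variables))
```

With the first row of the tree operations example, `expand_template("http://example.org/tree/{on%2Dstreet}/{GID}", {"on-street": "ADDISON AV", "GID": "1"})` produces the first URI listed above, and `row_url("#row={_row}", {"_row": 1}, "http://example.com/temp.csv")` produces the resolved row URL.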
These hold one or more references to other column description objects. The referenced description object must have a name property. Column reference properties can then reference other column description objects through values that are:
For example, the primaryKey property is a column reference property on the schema. It has to hold references to columns defined elsewhere in the schema, and the descriptions of those columns must have name properties. It can hold a single reference, like this:
"schema": { "columns": [{ "name": "GID" }, ... ], "primaryKey": "GID" }
or it can contain an array of references, like this:
"schema": { "columns": [{ "name": "givenName" }, { "name": "familyName" }, ... ], "primaryKey": [ "givenName", "familyName" ] }
These hold one or more objects or references to objects by URL. Their values may be:
Object properties are often used when the values can be or should be values within controlled vocabularies, or structured information which may be held elsewhere. For example, the dc:creator of a table should be an object property. It could be provided as a URL that indicates the creator, like this:
"dc:creator": "http://ons.gov.uk"
or a structured object, like this:
"dc:creator": { "sch:name": "Office of National Statistics", "sch:url": "http://ons.gov.uk", "sch:email": "info@ons.gsi.gov.uk" }
or an array of URLs, like this:
"dc:creator" : [ "http://ons.gov.uk" , "https://www.gov.uk/government/organisations/department-for-transport" ]
or an array of structured objects:
"dc:creator": [{ "sch:name": "Office of National Statistics", "sch:url": "http://ons.gov.uk", "sch:email": "info@ons.gsi.gov.uk" }, { "sch:name": "Department for Transport", "sch:url": "https://www.gov.uk/government/organisations/department-for-transport" }]
or an array that mixes URLs and objects:
"dc:creator": [{ "sch:name": "Office of National Statistics", "sch:url": "http://ons.gov.uk", "sch:email": "info@ons.gsi.gov.uk" }, "https://www.gov.uk/government/organisations/department-for-transport" ]
These hold natural language strings. Their values may be:
Natural language properties are used for things like descriptions and titles. For example, the title property provides a natural language label for a column. If it's a plain string like this:
"title" : "Project title"
then that string is assumed to be in the language provided through the @language property of the nearest @context (or have no assumed language, if there is no such property). Multiple alternative values can be given in an array:
"title": [ "Project title", "Project" ]
It's also possible to provide multiple values in different languages, using an object structure. For example:
"title": { "en": "Project title", "fr": "Titre du projet" }
and within such an object, the values of the properties can themselves be arrays:
"title": { "en": [ "Project title", "Project" ], "fr": "Titre du projet" }
We invite comment on whether it would be useful to enable some markup in natural language strings, for example by stating that they are interpreted as HTML or Markdown.
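The three shapes a natural language property can take (plain string, array, or object keyed by language) can be brought into one form before further processing. The sketch below is one possible normalisation, under the assumption that strings without a known language are filed under the language code "und"; the function name and that choice of key are illustrative only.

```python
def normalise_natural_language(value, default_language=None):
    """Return a dict mapping language codes to lists of strings.

    default_language is the @language of the nearest @context, if any;
    "und" (undetermined) is used here as an assumed fallback key.
    """
    language = default_language or "und"
    if isinstance(value, str):
        return {language: [value]}
    if isinstance(value, list):
        return {language: list(value)}
    # Otherwise the value is an object keyed by language code, whose
    # values are themselves either a single string or an array of strings.
    return {code: [text] if isinstance(text, str) else list(text)
            for code, text in value.items()}
```

After normalisation, every title can be handled the same way, for example by looking up the preferred language and taking the first string in its list.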
These hold atomic values. Their values may be:
true or false)
JSON does not have date or time types. Where a property takes a date as a value, this MUST be a string in the format YYYY-MM-DD.
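A validator might check the YYYY-MM-DD requirement along the following lines. This is a sketch with a hypothetical function name; it checks the shape of the string first, because lenient date parsers can accept values such as 2010-1-5 that do not match the required format.

```python
import re
from datetime import date

# Exactly four digits, two digits, two digits, separated by hyphens.
_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_metadata_date(value):
    """Return a datetime.date for a YYYY-MM-DD string, or raise ValueError."""
    if not isinstance(value, str) or not _DATE.fullmatch(value):
        raise ValueError(f"not a YYYY-MM-DD string: {value!r}")
    year, month, day = (int(part) for part in value.split("-"))
    # date() itself rejects impossible dates such as 2010-02-31.
    return date(year, month, day)
```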
The top-level object (whether it is a table group description or a table description) MAY have a @context property. This holds an object that provides metadata for interpreting other properties, namely:
@language
indicates the default language for the values of properties in the metadata document; if present, its value MUST be a language code [BCP47]
Note that the @language property of the @context object, which gives the default language used within the metadata file, is distinct from the language property on a description object, which gives the language used in the data within a group of tables, table or column.
@base
indicates the base URL against which other URLs within the description are resolved; if present, its value MUST be a URL which is resolved against the location of the metadata document to provide the base URL for other URLs in the metadata document; if unspecified, the base URL used for interpreting relative URLs within the metadata document is the location of the metadata document itself
Note that the @base property of the @context object provides the base URL used for URLs within the metadata document, not the URLs that appear within the group of tables or table it describes.
The top-level object (whether it is a table group description or a table description) MAY also have an import property. This is a link property which references one or more other metadata files to be imported into the original metadata file. If the import property contains an array, imports are carried out in sequence: the first metadata file referenced is imported into the original metadata file; the second is imported into the result and so on. If a referenced metadata file has already been imported (or was the original metadata file) it is ignored.
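The ordering and de-duplication of imports can be sketched as follows. Everything in this example is an assumption made for illustration: `load` stands for whatever retrieves and parses a metadata file, `merge` stands for the merge of one description object into another, and imports declared inside imported files are not followed.

```python
from urllib.parse import urljoin


def apply_imports(original_url, load, merge):
    """Import the files referenced by the import property, in sequence.

    load(url) returns the parsed metadata document at url (hypothetical).
    merge(a, b) returns the result of importing b into a (hypothetical).
    """
    result = load(original_url)
    imports = result.get("import", [])
    if isinstance(imports, str):
        imports = [imports]          # a single link is treated as a one-item array
    seen = {original_url}            # the original file counts as already imported
    for reference in imports:
        url = urljoin(original_url, reference)
        if url in seen:
            continue                 # already imported, so it is ignored
        seen.add(url)
        result = merge(result, load(url))
    return result
```

Passing `load` and `merge` in as parameters keeps the sequencing logic independent of how files are fetched and of the per-property merge rules.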
If the top-level object of any of the metadata files is a table description, it is treated as if it were a table group description containing a single table description (ie having a single resource property whose value is the same as the original table description).
An imported description object B is imported into an original description object A by merging each property of B into A. If the property from B does not exist on A, it is simply added to A. If A does have the property, the way the values are merged depends on the type of the property, as follows:
und should be used. The arrays should provide the values from A followed by those from B that were not already a value in A.
If the type of the property cannot be determined, because it is not defined in this specification (ie because it is an extension property), the type of the property is determined based on its values in A and B, as follows, and merged accordingly:
Descriptions of groups of tables, tables, schemas, columns, rows and cells MAY contain any properties whose names are either absolute URLs or prefixed names. For example, a table description may contain dc:description, dcat:keyword or schema:copyrightHolder properties to provide a description, keywords or the name of the copyright holder, as defined in Dublin Core Terms, DCAT or schema.org.
The same prefixes are pre-defined as for [rdfa-core] within the RDFa 1.1 Initial Context and MUST NOT be overridden. Properties from other vocabularies MUST be defined using full URLs.
Forbidding the declaration of new prefixes ensures consistent processing between JSON-LD-aware and non-JSON-LD-aware processors.
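A processor that is not JSON-LD-aware could recognise common property names with a fixed prefix table, as in the sketch below. The table shown is only a three-entry excerpt of the RDFa 1.1 Initial Context chosen for illustration, the function name is hypothetical, and the absolute URL test is a simple heuristic rather than full URL validation.

```python
# Illustrative excerpt of pre-defined prefixes; a real processor would
# carry the complete list from the RDFa 1.1 Initial Context.
KNOWN_PREFIXES = {
    "dc": "http://purl.org/dc/terms/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "schema": "http://schema.org/",
}


def expand_property_name(name):
    """Return the full URL for a common property name, or None.

    Prefixed names with a known prefix are expanded; names that already
    look like absolute URLs (scheme followed by //) are returned as-is.
    """
    prefix, separator, local = name.partition(":")
    if separator and prefix in KNOWN_PREFIXES:
        return KNOWN_PREFIXES[prefix] + local
    if separator and local.startswith("//"):
        return name
    return None
```

Because the prefix table is fixed, the same name always expands to the same URL whether or not the metadata is processed as JSON-LD.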
A table group description is a JSON object that describes a group of tables.
resources
@id
The description of a group of tables MAY also contain:
schema
table-direction
An atomic property that MUST have a single string value that is one of "rtl", "ltr" or "default". Indicates whether the tables in the group should be displayed with the first column on the right, on the left, or based on the first character in the table that has a specific direction. See section 4.1.1 Bidirectional Tables for more details.
This should be a defined controlled vocabulary in JSON-LD, so that the values map on to URIs in the RDF version rather than strings. We invite comment on how to configure the JSON-LD context to enable these values to be interpreted in this way.
dialect
Provides hints to processors about how to parse the referenced files to create tabular data models for the tables in the group. This may be provided as an embedded object or as a URL reference. See section 3.6 Dialect Descriptions.
templates
targetFormat and templateFormat in A, the template specification from B is imported into the matching template specification in A
@type
@type MUST be set to "TableGroup". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
The description MAY contain any common properties as defined in section 3.3 Common Properties to provide extra metadata about the set of tables as a whole.
The description MAY contain any of the properties defined in section 2.2 Inherited Properties to describe cells within the tables.
This issue relates to the use of type vs datatype as a column property. (This issue seems moot now that neither are included.)
A table description is a JSON object that describes a table within a CSV file.
A CSV file might not be the same as the table that it contains. For example, a given CSV file might contain two tables (in different regions of the CSV file), or might contain a table that isn't positioned at the top left of the CSV file. We invite comment about whether we should assume that pre-processing is used to extract tables where there isn't a 1:1 correspondence between CSV file and table, or not.
@id
This link property gives the single URL of the CSV file that the table is held in, relative to the location of the metadata document.
The description of a table MAY also contain:
schema
notes
An object property that provides an array of objects representing annotations. This specification does not place any constraints on the structure of these objects.
The Web Annotation Working Group is developing a vocabulary for expressing annotations. In future versions of this specification, we anticipate referencing that vocabulary. Should there be column level notes as well?
The Annotation Model can indeed become very complex.
table-direction
templates
dialect
@type
@type MUST be set to "Table". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
We invite comment on whether we should include properties that help in checking the integrity of the file: datapackage includes bytes and hash. We could reuse the Subresource Integrity work here.
The description MAY contain any of the common properties as defined in section 3.3 Common Properties to provide extra metadata about the table as a whole.
The description MAY contain any of the properties defined in section 2.2 Inherited Properties to describe cells within the table.
Much of the tabular data that is published on the web is messy, and CSV parsers frequently need to be configured in order to correctly read in CSV. A dialect description provides hints to parsers about how to parse the file linked to from the @id property. It can have any of the following properties, which relate to the flags described in Section 5 Parsing Tabular Data within [tabular-data-model]:
encoding
lineTerminator
quoteChar
doubleQuote
If true, sets the escape character flag to ". If false, to \.
skipRows
commentPrefix
header
If true, sets the header row count flag to 1, and if false to 0, unless headerRowCount is provided, in which case the value provided for the header property is ignored.
headerRowCount
delimiter
skipColumns
headerColumnCount
skipBlankRows
skipInitialSpace
If true, sets the trim flag to "start". If false, to false. If the trim property is provided, the skipInitialSpace property is ignored.
trim
If true, sets the trim flag to true and if the boolean false to false. If the value provided is a string, sets the trim flag to the provided value, which MUST be one of "true", "false", "start" or "end".
@type
@type MUST be set to "Dialect". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
The default dialect description for CSV files is:
{ "encoding": "utf-8", "lineTerminator": "\r\n", "quoteChar": "\"", "doubleQuote": true, "skipRows": 0, "header": true, "headerRowCount": 1, "delimiter": ",", "skipColumns": 0, "headerColumnCount": 0, "skipBlankRows": false, "skipInitialSpace": false, "trim": false }
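The precedence rules between dialect properties (headerRowCount over header, trim over skipInitialSpace, and doubleQuote selecting the escape character) can be sketched as a small function. The function name and the shape of the returned dictionary are assumptions for illustration; only the three derived flags discussed above are computed.

```python
def dialect_flags(dialect):
    """Derive parser flags from the properties given in a dialect description."""
    flags = {}

    # doubleQuote selects the escape character: a quote if true, a backslash if false.
    flags["escape character"] = '"' if dialect.get("doubleQuote", True) else "\\"

    # An explicit headerRowCount wins; otherwise header maps to 1 or 0.
    if "headerRowCount" in dialect:
        flags["header row count"] = dialect["headerRowCount"]
    else:
        flags["header row count"] = 1 if dialect.get("header", True) else 0

    # An explicit trim wins; otherwise skipInitialSpace maps to "start" or False.
    if "trim" in dialect:
        trim = dialect["trim"]
        if isinstance(trim, str) and trim not in ("true", "false", "start", "end"):
            raise ValueError(f"invalid trim value: {trim!r}")
        flags["trim"] = trim
    else:
        flags["trim"] = "start" if dialect.get("skipInitialSpace", False) else False

    return flags
```

The function looks only at properties that the publisher actually supplied, so that a supplied header value is not silently overridden by a default headerRowCount.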
A template specification is a definition of how tabular data can be transformed into another format.
Template specifications MUST have the following properties:
targetFormat
A URL for the format that will be created through the transformation. If one has been defined, this should be a URL for a media type, in the form http://www.iana.org/assignments/media-types/media-type such as http://www.iana.org/assignments/media-types/text/calendar. Otherwise, it can be any URL that describes the target format. The targetFormat URL is intended as an informative identifier for the target format, and applications MAY NOT access the URL.
templateFormat
A URL for the format that is used by the template. If one has been defined, this should be a URL for a media type, in the form http://www.iana.org/assignments/media-types/media-type such as http://www.iana.org/assignments/media-types/application/javascript. Otherwise, it can be any URL that describes the template format. The templateFormat URL is intended as an informative identifier for the template format, and applications MAY NOT access the URL. The template formats that an application supports are implementation defined.
Template specifications MAY have the following properties:
title
application/json) and the transformation is creating a specific profile of that format.
source
If the value is "json", the tabular data should first be transformed to JSON based on the simple mapping defined in Generating JSON from Tabular Data on the Web. If the value is "rdf", it should similarly first be transformed to RDF based on the simple mapping defined in Generating RDF from Tabular Data on the Web. If the source property is missing or null then the source of the transformation is the annotated tabular data model.
@type
@type MUST be set to "Template". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
The template specification MAY contain any common properties as defined in section 3.3 Common Properties to provide extra metadata about the transformation.
The following template specification will enable a processor that supports it to generate an iCalendar document using a Mustache template based on the JSON created from the simple mapping to JSON.
{ "title": "iCalendar", "targetFormat": "http://www.iana.org/assignments/media-types/text/calendar", "templateFormat": "https://mustache.github.io/", "source": "json" }
A schema is a definition of a tabular format that may be common to multiple tables. For example, multiple tables from different sources may have the same columns and be designed such that they can be aggregated together.
A schema description is a JSON object that encodes the information about a schema. All the properties of a schema description are optional.
columns
An array property of column descriptions as described in section 3.9 Columns. These are matched to columns in tables that use the schema by position: the first column description in the array applies to the first column in the table, the second to the second and so on.
The name properties of the column descriptions MUST be unique within a given table description.
When an array of column descriptions B is imported into an original array of column descriptions A, each column description within B is combined into the original array A by:
primaryKey
A column reference property that holds either a single reference to a column description object or an array of references. Validators MUST check that each row has a unique combination of cells in the indicated columns. For example, if primaryKey is set to ["familyName", "givenName"] then every row must have a unique value for the combination of the familyName and givenName columns.
Composite primary keys and foreign key references.
foreignKeys
An array property of foreign key definitions that define how the values from specified columns within this table link to rows within this table or other tables. A foreign key definition is a JSON object with the properties:
columns
reference
An object with the properties:
resource
schema MUST NOT be present. The metadata document MUST contain a description of the resource.
schema
resource MUST NOT be present. The metadata document that forms the basis of processing MUST contain a description of a resource that uses the referenced schema, and there MUST NOT be more than one such resource.
columns
It is not required for the resource or schema referenced from a foreignKeys property to have a similarly defined primaryKey.
When an array of foreign key definitions B is imported into an original array of foreign key definitions A, each foreign key definition within B which does not appear within A is appended to the original array A.
The cross reference between files should be limited to files from one publisher - else they are just web links with no guarantee of whether the target of the link exists which 'foreign key' might imply.
urlTemplate
@type
@type MUST be set to "Schema". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
The description MAY contain any of the common properties as defined in section 3.3 Common Properties to provide extra metadata about the schema as a whole.
The description MAY contain any of the inherited properties defined for cells in section 2.2 Inherited Properties.
A list of countries is published at http://example.org/countries.csv with the structure:
countryCode,latitude,longitude,name
AD,42.546245,1.601554,Andorra
AE,23.424076,53.847818,"United Arab Emirates"
AF,33.93911,67.709953,Afghanistan
Another file contains information about the population in some countries each year, at http://example.com/country_slice.csv with the structure:
countryRef,year,population
AF,1960,9616353
AF,1961,9799379
AF,1962,9989846
The following metadata for the group of tables links the two together by defining a foreignKeys property:
{ "@context": "http://www.w3.org/ns/csvw", "resources": [{ "@id": "http://example.org/countries.csv", "schema": { "columns": [{ "name": "countryCode", "datatype": "string" }, { "name": "latitude", "datatype": "number" }, { "name": "longitude", "datatype": "number" }, { "name": "name", "datatype": "string" }], "urlTemplate": "http://example.org/countries.csv{#countryCode}", "primaryKey": "countryCode" } }, { "@id": "http://example.com/country_slice.csv", "schema": { "columns": [{ "name": "countryRef", "datatype": "string" }, { "name": "year", "datatype": "gYear" }, { "name": "population", "datatype": "integer" }], "foreignKeys": [{ "columns": "countryRef", "reference": { "resource": "http://example.org/countries.csv", "columns": "countryCode" } }] } }] }
When the population data in country_slice.csv is processed (displayed or mapped into another format), a link can be made from the content of the countryRef column based on the urlTemplate for countries.csv. For example, if the countryRef column (the value of columns in the foreignKeys object) in country_slice.csv contains the value UK then the processor will use that value to populate the countryCode variable (the value of reference.columns in the foreignKeys object) when interpreting the urlTemplate for countries.csv, and create the URL http://example.org/countries.csv#UK. The processor does not need to retrieve http://example.org/countries.csv or check that the value UK appears within the countryCode column to create this link: it is created purely based on the urlTemplate in the description of the referenced resource.
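The link creation described above can be illustrated with a small Python sketch. It is a simplification: the helper expand_url_template is a name invented for this example, and it handles only the {name} and {#name} forms of URI Template expressions that appear in this document's examples, not the full template syntax.

```python
import re
from urllib.parse import quote


def expand_url_template(template, variables):
    """Expand {name} and {#name} expressions; other operators are not handled."""
    def replace(match):
        operator, name = match.group(1), match.group(2)
        value = variables.get(name)
        if value is None:
            return ""  # undefined variables expand to nothing
        if operator == "#":
            # Fragment expansion: prefix "#" and leave reserved characters as they are.
            return "#" + quote(str(value), safe=":/?#[]@!$&'()*+,;=")
        # Simple expansion: percent-encode everything except unreserved characters.
        return quote(str(value), safe="")
    return re.sub(r"\{(#?)([A-Za-z0-9_]+)\}", replace, template)


# Values taken from the example metadata; the row is a single illustrative cell.
foreign_key = {"columns": "countryRef", "reference": {"columns": "countryCode"}}
template = "http://example.org/countries.csv{#countryCode}"
row = {"countryRef": "UK"}

# The referenced column name becomes the template variable; the cell supplies its value.
variables = {foreign_key["reference"]["columns"]: row[foreign_key["columns"]]}
link = expand_url_template(template, variables)
```

Note that nothing in this sketch retrieves the referenced file, which matches the behaviour described above: the link is built from the template alone.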
When publishing information about public sector roles and salaries, as in Use Case 4, the UK government requires departments to publish two files which are interlinked. The first lists senior grades (simplified here), e.g. at HEFCE_organogram_senior_data_31032011.csv:
Post Unique Reference,Name,Grade,Job Title,Reports to Senior Post
90115,Steve Egan,SCS1A,Deputy Chief Executive,90334
90250,David Sweeney,SCS1A,Director,90334
90284,Heather Fry,SCS1A,Director,90334
90334,Sir Alan Langlands,SCS4,Chief Executive,xx
The second provides information about the number of junior positions that report to those individuals (simplified here), e.g. at HEFCE_organogram_junior_data_31032011.csv:
Reporting Senior Post,Grade,Payscale Minimum (£),Payscale Maximum (£),Generic Job Title,Number of Posts in FTE,Profession
90284,4,17426,20002,Administrator,2,Operational Delivery
90284,5,19546,22478,Administrator,1,Operational Delivery
90115,4,17426,20002,Administrator,8.67,Operational Delivery
90115,5,19546,22478,Administrator,0.5,Operational Delivery
The schemas are reused by multiple departments and for multiple pairs of files. The schemas are therefore defined in separate files, and they need to define links between the schemas which are then picked up as applying between tables that use those schemas.
The metadata file for the particular publication of the files above is:
{ "@context": "http://www.w3.org/ns/csvw", "resources": [{ "@id": "HEFCE_organogram_senior_data_31032011.csv", "schema": "http://example.org/schema/senior-roles.json" }, { "@id": "HEFCE_organogram_junior_data_31032011.csv", "schema": "http://example.org/schema/junior-roles.json" }] }
The schema for the senior role CSV (at http://example.org/schema/senior-roles.json) is as follows; it includes a foreign key reference to itself:
{ "@context": "http://www.w3.org/ns/csvw", "@id": "http://example.org/schema/senior-roles.json", "columns": [{ "name": "ref", "title": "Post Unique Reference" }, { "name": "name", "title": "Name" }, { "name": "grade", "title": "Grade" }, { "name": "job", "title": "Job Title" }, { "name": "reportsTo", "title": "Reports to Senior Post" }], "primaryKey": "ref", "urlTemplate": "#post-{ref}", "foreignKeys": [{ "columns": "reportsTo", "reference": { "schema": "http://example.org/schema/senior-roles.json", "columns": "ref" } }] }
The schema for the junior role CSV (at http://example.org/schema/junior-roles.json) is as follows; it includes a foreign key reference to the senior roles schema:
{ "@context": "http://www.w3.org/ns/csvw", "@id": "http://example.org/schema/junior-roles.json", "columns": [{ "name": "reportsTo", "title": "Reporting Senior Post" }, ... ], "foreignKeys": [{ "columns": "reportsTo", "reference": { "schema": "http://example.org/schema/senior-roles.json", "columns": "ref" } }] }
In the first line of HEFCE_organogram_junior_data_31032011.csv, the reportsTo (Reporting Senior Post) column contains the value 90284. When creating a link from that column, the urlTemplate defined within the schema at http://example.org/schema/senior-roles.json is used to generate a URL by expanding the variable ref based on the value from the reportsTo column. This gives the relative URL #post-90284 which is then resolved against the base URL of the resource that uses the senior-roles.json schema within the original metadata file, namely HEFCE_organogram_senior_data_31032011.csv.
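The final resolution step can be sketched with Python's standard urljoin function. The metadata file location http://example.org/metadata.json is a hypothetical value chosen for this illustration; the example above does not say where the metadata file is published.

```python
from urllib.parse import urljoin

# Hypothetical location of the metadata file (an assumption for this sketch).
metadata_url = "http://example.org/metadata.json"

# The resource identifier in the metadata is relative, so resolve it first.
resource_url = urljoin(metadata_url, "HEFCE_organogram_senior_data_31032011.csv")

# Then resolve the relative URL produced by expanding "#post-{ref}" with ref=90284.
link = urljoin(resource_url, "#post-90284")
```

Under that assumption the resulting link points at a fragment of the senior roles CSV file rather than at the schema document.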
A column description is a simple JSON object that describes a single column. The description provides additional human-readable documentation for a column, as well as additional information that may be used to validate the cells within the column, create a user interface for data entry, or inform conversion into other formats.
Should there be a way to suppress columns?
name
An atomic property that gives a single canonical name for the column. This MUST be a string. Conversion specifications MUST use this property as the basis for the names of properties/elements/attributes in the results of conversions.
For ease of reference within URI template properties, column names SHOULD consist only of alphanumeric characters or underscores ([a-zA-Z0-9_]+). Names beginning with _ are reserved by this specification and MUST NOT be used.
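A check of these naming rules might look like the following Python sketch. The function name check_column_name and the split into errors (for MUST-level rules) and warnings (for SHOULD-level rules) are choices made for this example only.

```python
import re

# Pattern quoted in the text above for recommended column names.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9_]+")


def check_column_name(name):
    """Return (errors, warnings) for a candidate column name."""
    errors, warnings = [], []
    if not isinstance(name, str):
        errors.append("name MUST be a string")
        return errors, warnings
    if name.startswith("_"):
        errors.append("names beginning with _ are reserved")
    if not NAME_PATTERN.fullmatch(name):
        warnings.append("name SHOULD consist only of [a-zA-Z0-9_]+")
    return errors, warnings


result_ok = check_column_name("countryCode")
result_reserved = check_column_name("_id")
result_spaces = check_column_name("Job Title")
```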
What to do with conversion if no column name is given?
We invite comment on what the syntactic limitations should be on column names to make them most useful when used as the basis of conversion into other formats, bearing in mind that different target languages such as JSON, RDF and XML have different syntactic limitations and common naming conventions.
During validation, if there is no title property and the column already has a title annotation then a validator MUST issue a warning if the existing title annotation does not match the name specified in the column description.
title
A natural language property that provides possible alternative names for the column. The possible column titles are defined as:
if title is a string, that string
if title is an array, the strings in that array
if title is an object, the string or strings that are the value of the property of that object whose name is the column language
where the column language is the value of the language property on the column description, or (if there is no such language), the value of the language property on the table description.
If the column already has a title annotation (because a header row has been included in the original CSV file) then a validator MUST issue a warning if the existing title annotation is not the same as any of the possible column titles.
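The three cases for the title property, and the warning that follows from them, can be sketched in Python as below. The function names and the way the column and table languages are passed in are assumptions made for this illustration.

```python
def possible_titles(title, column_language=None, table_language=None):
    """Return the list of possible column titles for a title property value."""
    if isinstance(title, str):
        return [title]
    if isinstance(title, list):
        return list(title)
    if isinstance(title, dict):
        # Use the column's language, falling back to the table's language.
        language = column_language if column_language is not None else table_language
        value = title.get(language, [])
        return [value] if isinstance(value, str) else list(value)
    return []


def title_warning(existing_title, title, column_language=None, table_language=None):
    """Return a warning message if the existing title annotation is not a possible title."""
    titles = possible_titles(title, column_language, table_language)
    if existing_title not in titles:
        return "title annotation %r is not one of %r" % (existing_title, titles)
    return None


# Hypothetical title property with values in two languages.
titles = {"en": "Country", "fr": ["Pays", "Nation"]}
french = possible_titles(titles, table_language="fr")
warning = title_warning("Land", titles, column_language="en")
```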
The facility to specify multiple potential titles for a column is important when the same column description is used for multiple CSVs, through a mechanism yet to be defined by this specification.
required
row
predicateUrl
@type
If included, @type MUST be set to "Column". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
The description MAY contain any of the common properties as defined in section 3.3 Common Properties to provide extra metadata about the column as a whole, such as a full description.
The description MAY contain any of the inherited properties defined for cells in section 2.2 Inherited Properties.
3.8 Cells
Cells can be described using cell description objects. A cell description object is a JSON object within a metadata file that includes properties that describe an individual cell.
3.8.1 Required Properties
The following properties MUST appear on a cell description:
row
an integer; the number of the row on which the cell appears
column
an integer; the number of the column on which the cell appears
3.8.2 Optional Properties
@type
If included, @type MUST be set to "Cell". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
Cell descriptions may override inherited properties, as described in section 2. Annotating Tables.
It is good practice to define these properties on columns, so that all cells within a given column are handled in the same way, or on tables if appropriate. These properties are:
null
An atomic property giving the string or strings used for null values. If not specified, the default for this is the empty string.
language
An atomic property giving a single string language code as defined by [ BCP47 ]. Indicates the language of the value within the cell.
text-direction
An atomic property that MUST have a single string value that is one of "rtl" or "ltr" (the default). Indicates whether the text within cells should be displayed by default as left-to-right or right-to-left text. See section 4.1.1 Bidirectional Tables for more details.
separator
An atomic property that MUST have a single string value that is the character used to separate items in the string value of the cell. If null or unspecified, the cell does not contain a list. Otherwise, applications MUST split the string value of the cell on the specified separator character and parse each of the resulting strings separately. The cell's value will then be a list. Conversion specifications MUST use the separator to determine the conversion of a cell into the target format. See 3.12 Parsing cells for more details.
default
null
value.
This
default
value
MAY
be
used
when
converting
the
table
into
other
formats.
format
An atomic property that contains a single string that is the definition of the format of the cell, used when parsing the cell as described in 3.12 Parsing cells.
datatype
An atomic property that contains a single string that is the main datatype of the values of the cell. If the cell contains a list (i.e. separator is specified and not null) then this is the datatype of each value within the list. Conversion specifications MUST use the datatype of the value to determine the conversion of a cell into the target format. See 3.11 Datatypes for more details.
length
An atomic property that contains a single integer that is the exact length of the value of the cell. See section 3.11.1 Length Constraints for details.
minLength
An atomic property that contains a single integer that is the minimum length of the value of the cell. See section 3.11.1 Length Constraints for details.
maxLength
An atomic property that contains a single integer that is the maximum length of the value of the cell. See section 3.11.1 Length Constraints for details.
minimum
An atomic property that contains a single number that is the minimum value for the cell (inclusive); equivalent to minInclusive. See section 3.11.2 Value Constraints for details.
maximum
An atomic property that contains a single number that is the maximum value for the cell (inclusive); equivalent to maxInclusive. See section 3.11.2 Value Constraints for details.
minInclusive
An atomic property that contains a single number that is the minimum value for the cell (inclusive). See section 3.11.2 Value Constraints for details.
maxInclusive
An atomic property that contains a single number that is the maximum value for the cell (inclusive). See section 3.11.2 Value Constraints for details.
minExclusive
An atomic property that contains a single number that is the minimum value for the cell (exclusive). See section 3.11.2 Value Constraints for details.
maxExclusive
An atomic property that contains a single number that is the maximum value for the cell (exclusive). See section 3.11.2 Value Constraints for details.
Cells within tables may be annotated with a datatype which indicates the type of the value obtained by parsing the value of the cell. The format expected in the cell is determined by the format annotation, if there is one; otherwise a default format determined by the type is used.
The possible datatypes are:
the datatypes defined in [ xmlschema-2 ] with the exception of those that rely on XML mechanisms for definition, namely:
anySimpleType
string; a sub-value of anySimpleType
normalizedString; a sub-value of string
token; a sub-value of normalizedString
language; a sub-value of token
Name; a sub-value of token
NCName; a sub-value of Name
boolean; a sub-value of anySimpleType
decimal; a sub-value of anySimpleType
integer; a sub-value of decimal
nonPositiveInteger; a sub-value of integer
negativeInteger; a sub-value of nonPositiveInteger
long; a sub-value of integer
int; a sub-value of long
short; a sub-value of int
byte; a sub-value of short
nonNegativeInteger; a sub-value of integer
unsignedLong; a sub-value of nonNegativeInteger
unsignedInt; a sub-value of unsignedLong
unsignedShort; a sub-value of unsignedInt
unsignedByte; a sub-value of unsignedShort
positiveInteger; a sub-value of nonNegativeInteger
float; a sub-value of anySimpleType
double; a sub-value of anySimpleType
duration; a sub-value of anySimpleType
dateTime; a sub-value of anySimpleType
time; a sub-value of anySimpleType
date; a sub-value of anySimpleType
gYearMonth; a sub-value of anySimpleType
gYear; a sub-value of anySimpleType
gMonthDay; a sub-value of anySimpleType
gDay; a sub-value of anySimpleType
gMonth; a sub-value of anySimpleType
hexBinary; a sub-value of anySimpleType
base64Binary; a sub-value of anySimpleType
anyURI; a sub-value of anySimpleType
number which is exactly equivalent to double
binary which is exactly equivalent to base64Binary
datetime which is exactly equivalent to dateTime
geopoint
any
which
anySimpleType
any
xml
which
array
html
any
json
The length, minLength and maxLength properties indicate the exact, minimum and maximum lengths of the values of cells.
Applications MUST raise an error if both length and minLength are specified and they do not have the same value. Similarly, applications MUST raise an error if both length and maxLength are specified and they do not have the same value.
Applications MUST raise an error if length, maxLength or minLength are specified and the cell value is not a list (i.e. separator is not specified), a string or one of its subtypes, or a binary value.
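The consistency rules between length, minLength and maxLength can be sketched in Python as follows. The function name check_length_properties and the use of ValueError to signal the error are assumptions made for this illustration; the sketch covers only the rules about conflicting properties, not the datatype restriction.

```python
def check_length_properties(properties):
    """Raise ValueError if length conflicts with minLength or maxLength.

    properties: dict of property name -> value, as read from a description.
    """
    length = properties.get("length")
    if length is None:
        return
    for other in ("minLength", "maxLength"):
        value = properties.get(other)
        # An error is required when both are specified and the values differ.
        if value is not None and value != length:
            raise ValueError(
                "length (%r) and %s (%r) do not have the same value" % (length, other, value)
            )


# Consistent properties pass silently.
check_length_properties({"length": 2, "minLength": 2, "maxLength": 2})

# Conflicting properties are reported.
try:
    check_length_properties({"length": 2, "maxLength": 3})
    conflict_detected = False
except ValueError:
    conflict_detected = True
```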
The length of a value of a cell is determined as follows:
null
its
length
is
zero
The minimum, maximum, minInclusive, maxInclusive, minExclusive and maxExclusive properties indicate limits on the values of cells. These apply to numeric and date/time types. The minimum property is equivalent to the minInclusive property and the maximum property is equivalent to the maxInclusive property.
Validation against these properties is as defined in [ xmlschema-2 ].
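As a rough sketch of how these limits interact, the following Python functions treat minimum and maximum as aliases for minInclusive and maxInclusive and then compare a parsed value against whichever limits are present. The function names are invented for this example, and the comparison relies on Python's ordering of numbers and dates rather than on the full rules of [xmlschema-2].

```python
import datetime


def normalise_limits(properties):
    """Copy the limit properties, folding minimum/maximum into their equivalents."""
    limits = dict(properties)
    if "minimum" in limits:
        limits.setdefault("minInclusive", limits.pop("minimum"))
    if "maximum" in limits:
        limits.setdefault("maxInclusive", limits.pop("maximum"))
    return limits


def within_limits(value, properties):
    """Return True if value satisfies every limit that is specified."""
    limits = normalise_limits(properties)
    if "minInclusive" in limits and value < limits["minInclusive"]:
        return False
    if "maxInclusive" in limits and value > limits["maxInclusive"]:
        return False
    if "minExclusive" in limits and value <= limits["minExclusive"]:
        return False
    if "maxExclusive" in limits and value >= limits["maxExclusive"]:
        return False
    return True


inclusive_ok = within_limits(1960, {"minimum": 1960, "maximum": 1962})
exclusive_ok = within_limits(1960, {"minExclusive": 1960})
date_ok = within_limits(datetime.date(2014, 7, 1), {"minimum": datetime.date(2014, 1, 1)})
```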
Unlike many other data formats, tabular data is designed to be read by humans. For that reason, it's common for data to be represented within tabular data in a human-readable way. The separator and format properties indicate the format used to represent data within the table. This is used:
The process of parsing the string value of a cell into a single value or a list of values is as follows:
What should be the mapping of an empty cell?
datatype is string or anySimpleType or any, strip leading and trailing whitespace from the value
null value, then the value is null
separator property is not null, create a list of values by splitting the string at the character specified by the separator property
format, if one is specified, as described below; raise an error if any of the values do not match the specified format
format, as described below
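Two of the parsing steps, the null check and the splitting on separator, can be sketched in Python as shown below. This is a partial illustration only: it deliberately leaves out whitespace stripping and format handling, and the function name parse_cell and its keyword arguments are assumptions made for this example.

```python
def parse_cell(string_value, null="", separator=None):
    """Apply the null and separator steps to a cell's string value.

    null: the string used for null values (the empty string by default).
    separator: the separator character, or None if the cell is not a list.
    """
    # A cell whose string value matches the null string has a null value.
    if string_value == null:
        return None
    # With a separator, the value becomes a list of separately parsed strings.
    if separator is not None:
        return string_value.split(separator)
    return string_value


empty = parse_cell("")
tags = parse_cell("red;green;blue", separator=";")
missing = parse_cell("NA", null="NA")
plain = parse_cell("Andorra")
```

Each string in a resulting list would then be checked and parsed against the format and datatype in the later steps.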
If the datatype is a string type, the format property provides a regular expression for the string values, in the syntax defined by [ECMASCRIPT].
We invite comment about which reference to use for regular expression syntax. Other possibilities are to use that defined by XML Schema or XPath.
It is not uncommon for numbers within tabular data to be formatted for human consumption, which may involve using commas for decimal points, grouping digits in the number using commas, or adding currency symbols or percent signs to the number.
If the datatype is a numeric type, the format property indicates the expected format for that number. Validators MUST check that the numbers in the column adhere to the specified format. Converters MUST use the format property to parse the number when mapping it into a suitable type in the target language of the conversion.
When the datatype is a numeric type, the format property's value MUST be a number format as specified in [xslt-21].
We invite comment on the best format to specify how to parse numbers.
Register of recognised date-time picture string formats.
Boolean values may be represented in many ways aside from the standard 1 and 0 or true and false.
If the datatype is boolean, the format property provides the true and false values expected, separated by |. For example if format is Y|N then cells must hold either Y or N with Y meaning true and N meaning false.
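A minimal Python sketch of this boolean handling is given below. The function name parse_boolean and the use of ValueError for cells that hold neither value are assumptions made for this example.

```python
def parse_boolean(string_value, boolean_format):
    """Parse a cell against a boolean format such as "Y|N"."""
    # The value before "|" means true, the value after it means false.
    true_value, false_value = boolean_format.split("|", 1)
    if string_value == true_value:
        return True
    if string_value == false_value:
        return False
    raise ValueError("%r does not match boolean format %r" % (string_value, boolean_format))


yes = parse_boolean("Y", "Y|N")
no = parse_boolean("N", "Y|N")
```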
Dates and times are commonly represented in tabular data in formats other than those defined in [ xmlschema-2 ].
If the datatype is a date or time type, the format property indicates the expected format for that date or time. Validators MUST check that the dates or times in the column adhere to the specified format. Converters MUST use the format property to parse the date or time when mapping it into a suitable type in the target language of the conversion.
When the datatype is a date or time type, the format property's value MUST be a date/time format as specified in [xslt-21].
We invite comment on which format to use when parsing dates and times.
We invite comment on whether there are standard formats to use when parsing durations.
This section describes how applications should use the metadata supplied about a CSV file when they process that CSV file.
We intend to include other sections here about:
Much of this is likely to be non-normative. We invite comment on whether it's useful to provide this kind of guidance.
There are two levels of bidirectionality to consider when displaying tables: the directionality of the table (i.e. whether the columns should be arranged left-to-right or right-to-left) and the directionality of the content of individual cells.
The table-direction property provides information about the desired display of the table. If table-direction=ltr then the first column SHOULD be displayed on the left and the last column on the right. If table-direction=rtl then the first column SHOULD be displayed on the right and the last column on the left.
If table-direction=default then tables SHOULD be displayed with attention to the bidirectionality of the content of the file. Specifically, the values of the cells in the table should be scanned breadth first: from the first cell in the first column through to the last cell in the first column, down to the last cell in the last column. If the first character in the table with a strong type as defined in [UNICODE-BIDI] indicates a RTL directionality, the table should be displayed with the first column on the right and the last column on the left. Otherwise, the table should be displayed with the first column on the left and the last column on the right. Characters such as whitespace, quotes, commas and numbers do not have a strong type, and therefore are skipped when identifying the character that determines the directionality of the table.
Implementations SHOULD enable user preferences to override the indicated metadata about the directionality of the table.
Once the directionality of the table has been determined, each cell within the table should be considered as a separate paragraph, as defined by the UBA in [UNICODE-BIDI]. The default directionality for the cell is determined by looking at the text-direction property, which is an inherited property.
Thus, as defined by the UBA, if a cell contains no characters with a strong type (if it's a number or date for example) then the way the cell is displayed should be determined by the text-direction property of the cell. However, when the cell contains characters with a strong type (such as letters) then they MUST be displayed according to the Unicode Bidirectional Algorithm as described in [UNICODE-BIDI].
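The scan used to pick a table direction when table-direction is default can be sketched in Python with the standard unicodedata module. The function name detect_table_direction and the representation of the table as a list of columns (each a list of cell strings) are assumptions made for this illustration.

```python
import unicodedata


def detect_table_direction(columns):
    """Return "rtl" or "ltr" based on the first strongly typed character.

    columns: list of columns, first column first; each column is a list of
    cell strings, first cell first.
    """
    for column in columns:
        for cell in column:
            for character in cell:
                bidi_class = unicodedata.bidirectional(character)
                # R and AL are the strong right-to-left classes; L is strong left-to-right.
                if bidi_class in ("R", "AL"):
                    return "rtl"
                if bidi_class == "L":
                    return "ltr"
                # Digits, whitespace and punctuation have no strong type and are skipped.
    return "ltr"


# Hypothetical tables: the first column of each holds only numbers.
hebrew_table = [["1", "2"], ["\u05e9\u05dc\u05d5\u05dd", "abc"]]
latin_table = [["12, ", "34"], ["abc", "def"]]
hebrew_direction = detect_table_direction(hebrew_table)
latin_direction = detect_table_direction(latin_table)
```

A table with no strongly typed characters at all falls through to "ltr" here, matching the "Otherwise" case in the text.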
We intend to detail how to validate groups of tabular data files against metadata. This would be normative: compliant validators would have to report the errors and warnings that we define. We invite comment on whether this is a useful thing to specify.
Conversions of tabular data to other formats operate over an annotated table constructed as defined in section 2. Annotating Tables. The mechanics of these conversions to other formats are defined in other specifications.
Conversion specifications MUST define a default mapping from an annotated table that lacks any annotations (i.e. that is equivalent to an un-annotated table).
Conversion specifications MUST use either the name or the predicateUrl of a column as the basis for naming machine-readable fields in the target format, such as the name of the equivalent element or attribute in XML, property in JSON or property URI in RDF.
Conversion specifications MAY use any of the properties defined in this specification to adjust the mapping of an annotated table into another format.
Conversion specifications MAY define additional properties, not defined in this specification, which are specifically used when converting to the target format of the conversion. For example, a conversion to XML might specify an element-or-attribute property on columns that determines whether a particular column is represented through an element or an attribute in the data.
Conversion specifications SHOULD specify format-specific properties specifying external processing steps to provide more control to people defining conversions. If these are specified, the conversion specification MUST specify at what point in the processing this external processing takes place, and what it takes place on. Examples might be:
This document is largely a copy of content from the Data Package specification and the JSON Table Schema , which are maintained as part of Data Protocols . Particular contributors to that work are Rufus Pollock, Paul Fitzpatrick, Andrew Berkeley, Francis Irving, Benoit Chesneau, Leigh Dodds, Martin Keegan, and Gunnlaugur Thor Briem.
application/csvm+json
We intend to include a registration for a new media type, namely application/csvm+json. We invite comment about how to indicate that this is consistent with application/ld+json, or whether we should just use application/json or application/ld+json and not create a specific media type for the metadata files defined in this document.
See csvm-context.json.
TODO: General CSV security considerations.
The JSON-LD context, located at http://www.w3.org/ns/csvw.jsonld, is used with metadata documents.
{
"@context": {
"id": "@id",
"type": "@type",
"dc:title": {
"@container": "@language"
},
"dc:description": {
"@container": "@language"
},
"rdfs:comment": {
"@container": "@language"
},
"rdfs:domain": {
"@type": "@id"
},
"rdfs:label": {
"@container": "@language"
},
"rdfs:range": {
"@type": "@id"
},
"rdfs:subClassOf": {
"@type": "@id"
},
"rdfs:subPropertyOf": {
"@type": "@id"
},
"owl:equivalentClass": {
"@type": "@vocab"
},
"owl:equivalentProperty": {
"@type": "@vocab"
},
"owl:oneOf": {
"@container": "@list",
"@type": "@vocab"
},
"owl:imports": {
"@type": "@id"
},
"owl:versionInfo": {
"@type": "xsd:string",
"@language": null
},
"owl:inverseOf": {
"@type": "@vocab"
},
"owl:unionOf": {
"@type": "@vocab",
"@container": "@list"
},
"rdfs_classes": {
"@reverse": "rdfs:isDefinedBy",
"@type": "@id"
},
"rdfs_properties": {
"@reverse": "rdfs:isDefinedBy",
"@type": "@id"
},
"rdfs_datatypes": {
"@reverse": "rdfs:isDefinedBy",
"@type": "@id"
},
"rdfs_instances": {
"@reverse": "rdfs:isDefinedBy",
"@type": "@id"
},
"cc": "http://creativecommons.org/ns#",
"csvw": "http://www.w3.org/ns/csvw#",
"ctag": "http://commontag.org/ns#",
"dc": "http://purl.org/dc/terms/",
"dc11": "http://purl.org/dc/elements/1.1/",
"dcat": "http://www.w3.org/ns/dcat#",
"dcterms": "http://purl.org/dc/terms/",
"earl": "http://www.w3.org/ns/earl#",
"foaf": "http://xmlns.com/foaf/0.1/",
"gr": "http://purl.org/goodrelations/v1#",
"grddl": "http://www.w3.org/2003/g/data-view#",
"ical": "http://www.w3.org/2002/12/cal/icaltzd#",
"ma": "http://www.w3.org/ns/ma-ont#",
"og": "http://ogp.me/ns#",
"org": "http://www.w3.org/ns/org#",
"owl": "http://www.w3.org/2002/07/owl#",
"prov": "http://www.w3.org/ns/prov#",
"qb": "http://purl.org/linked-data/cube#",
"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
"rdfa": "http://www.w3.org/ns/rdfa#",
"rdfs": "http://www.w3.org/2000/01/rdf-schema#",
"rev": "http://purl.org/stuff/rev#",
"rif": "http://www.w3.org/2007/rif#",
"rr": "http://www.w3.org/ns/r2rml#",
"schema": {
"@id": "csvw:schema",
"@type": "@id"
},
"sd": "http://www.w3.org/ns/sparql-service-description#",
"sioc": "http://rdfs.org/sioc/ns#",
"skos": "http://www.w3.org/2004/02/skos/core#",
"skosxl": "http://www.w3.org/2008/05/skos-xl#",
"v": "http://rdf.data-vocabulary.org/#",
"vcard": "http://www.w3.org/2006/vcard/ns#",
"void": "http://rdfs.org/ns/void#",
"wdr": "http://www.w3.org/2007/05/powder#",
"wrds": "http://www.w3.org/2007/05/powder-s#",
"xhv": "http://www.w3.org/1999/xhtml/vocab#",
"xml": "rdf:XMLLiteral",
"xsd": "http://www.w3.org/2001/XMLSchema#",
"any": "xsd:anySimpleType",
"binary": "xsd:base64Binary",
"datetime": "xsd:dateTime",
"describedby": "wrds:describedby",
"html": "rdf:HTML",
"license": "xhv:license",
"maximum": "csvw:maxInclusive",
"minimum": "csvw:minInclusive",
"number": "xsd:double",
"role": "xhv:role",
"Column": "csvw:Column",
"Dialect": "csvw:Dialect",
"Direction": "csvw:Direction",
"Schema": "csvw:Schema",
"Table": "csvw:Table",
"TableGroup": "csvw:TableGroup",
"Template": "csvw:Template",
"columns": {
"@id": "csvw:columns",
"@type": "@id",
"@container": "@list"
},
"commentPrefix": {
"@id": "csvw:commentPrefix"
},
"datatype": {
"@id": "csvw:datatype"
},
"default": {
"@id": "csvw:default"
},
"delimiter": {
"@id": "csvw:delimiter"
},
"dialect": {
"@id": "csvw:dialect",
"@type": "@id"
},
"doubleQuote": {
"@id": "csvw:doubleQuote",
"@type": "xsd:boolean"
},
"encoding": {
"@id": "csvw:encoding"
},
"foreignKeys": {
"@id": "csvw:foreignKeys"
},
"format": {
"@id": "csvw:format"
},
"header": {
"@id": "csvw:header",
"@type": "xsd:boolean"
},
"headerColumnCount": {
"@id": "csvw:headerColumnCount",
"@type": "xsd:nonNegativeInteger"
},
"headerRowCount": {
"@id": "csvw:headerRowCount",
"@type": "xsd:nonNegativeInteger"
},
"language": {
"@id": "csvw:language"
},
"length": {
"@id": "csvw:length",
"@type": "xsd:nonNegativeInteger"
},
"lineTerminator": {
"@id": "csvw:lineTerminator"
},
"maxExclusive": {
"@id": "csvw:maxExclusive"
},
"maxInclusive": {
"@id": "csvw:maxInclusive"
},
"maxLength": {
"@id": "csvw:maxLength",
"@type": "xsd:nonNegativeInteger"
},
"minExclusive": {
"@id": "csvw:minExclusive"
},
"minInclusive": {
"@id": "csvw:minInclusive"
},
"minLength": {
"@id": "csvw:minLength",
"@type": "xsd:nonNegativeInteger"
},
"name": {
"@id": "csvw:name"
},
"notes": {
"@id": "csvw:notes"
},
"null": {
"@id": "csvw:null"
},
"predicateUrl": {
"@id": "csvw:predicateUrl",
"@type": "xsd:anyURI"
},
"primaryKey": {
"@id": "csvw:primaryKey"
},
"quoteChar": {
"@id": "csvw:quoteChar"
},
"required": {
"@id": "csvw:required",
"@type": "xsd:boolean"
},
"resources": {
"@id": "csvw:resources",
"@type": "@id",
"@container": "@set"
},
"row": {
"@id": "csvw:row",
"@container": "@set"
},
"separator": {
"@id": "csvw:separator"
},
"skipBlankRows": {
"@id": "csvw:skipBlankRows",
"@type": "xsd:boolean"
},
"skipColumns": {
"@id": "csvw:skipColumns",
"@type": "xsd:nonNegativeInteger"
},
"skipInitialSpace": {
"@id": "csvw:skipInitialSpace",
"@type": "xsd:boolean"
},
"skipRows": {
"@id": "csvw:skipRows",
"@type": "xsd:nonNegativeInteger"
},
"source": {
"@id": "csvw:source"
},
"table": {
"@id": "csvw:table",
"@type": "@id",
"@container": "@set"
},
"table-direction": {
"@id": "csvw:table-direction",
"@type": "@vocab"
},
"targetFormat": {
"@id": "csvw:targetFormat"
},
"templateFormat": {
"@id": "csvw:templateFormat"
},
"templates": {
"@id": "csvw:templates",
"@type": "@id"
},
"text-direction": {
"@id": "csvw:text-direction",
"@type": "@vocab"
},
"title": {
"@id": "csvw:title",
"@container": "@language"
},
"trim": {
"@id": "csvw:trim",
"@type": "xsd:boolean"
},
"uriTemplate": {
"@id": "csvw:uriTemplate"
},
"json": "csvw:json"
},
"@id": "http://www.w3.org/ns/csvw#",
"@type": "owl:Ontology",
"dc:title": {
"en": "Metadata Vocabulary for Tabular Data"
},
"dc:description": {
"en": "Validation, conversion, display and search of tabular data on the web\n requires additional metadata that describes how the data should be\n interpreted. This document defines a vocabulary for metadata that\n annotates tabular data. This can be used to provide metadata at various\n levels, from collections of data from CSV documents and how they relate\n to each other down to individual cells within a table."
},
"rdfs_classes": [
{
"@id": "csvw:Column",
"@type": "rdfs:Class",
"rdfs:label": {
"en": "Column Description"
},
"rdfs:comment": {
"en": "A Column Description describes a single column."
}
},
{
"@id": "csvw:Dialect",
"@type": "rdfs:Class",
"rdfs:label": {
"en": "Dialect Description"
},
"rdfs:comment": {
"en": "A Dialect Description provides hints to parsers about how to parse a linked file."
}
},
{
"@id": "csvw:Direction",
"@type": "rdfs:Class",
"rdfs:label": {
"en": "Direction"
},
"rdfs:comment": {
"en": "The class of table/text directions."
}
},
{
"@id": "csvw:Schema",
"@type": "rdfs:Class",
"rdfs:label": {
"en": "Schema"
},
"rdfs:comment": {
"en": "A Schema is a definition of a tabular format that may be common to multiple tables."
}
},
{
"@id": "csvw:Table",
"@type": "rdfs:Class",
"rdfs:label": {
"en": "Table Description"
},
"rdfs:comment": {
"en": "A table description is a JSON object that describes a table within a CSV file."
}
},
{
"@id": "csvw:TableGroup",
"@type": "rdfs:Class",
"rdfs:label": {
"en": "Table Group Description"
},
"rdfs:comment": {
"en": "A Table Group Description describes a group of Tables."
}
},
{
"@id": "csvw:Template",
"@type": "rdfs:Class",
"rdfs:label": {
"en": "Template Specification"
},
"rdfs:comment": {
"en": "A Template Specification is a definition of how tabular data can be transformed into another format."
}
}
],
"rdfs_properties": [
{
"@id": "csvw:columns",
"@type": "rdf:Property",
"rdfs:label": {
"en": "columns"
},
"rdfs:comment": {
"en": "An array of Column Descriptions."
},
"rdfs:domain": "csvw:Schema",
"rdfs:range": "csvw:Column"
},
{
"@id": "csvw:commentPrefix",
"@type": "rdf:Property",
"rdfs:label": {
"en": "comment prefix"
},
"rdfs:comment": {
"en": "A character that, when it appears at the beginning of a skipped row, indicates a comment that should be associated as a comment annotation to the table. The default is \"#\"."
},
"rdfs:domain": "csvw:Dialect"
},
{
"@id": "csvw:datatype",
"@type": "rdf:Property",
"rdfs:label": {
"en": "datatype"
},
"rdfs:comment": {
"en": "The main datatype of the values of the cell. If the cell contains a list (ie separator is specified and not null) then this is the datatype of each value within the list."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
}
},
{
"@id": "csvw:default",
"@type": "rdf:Property",
"rdfs:label": {
"en": "default"
},
"rdfs:comment": {
"en": "An atomic property holding a single string that provides a default string value for the cell in cases where the original string value is a null value. This default value may be used when converting the table into other formats."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
}
},
{
"@id": "csvw:delimiter",
"@type": "rdf:Property",
"rdfs:label": {
"en": "delimiter"
},
"rdfs:comment": {
"en": "The separator between cells. The default is \",\"."
},
"rdfs:domain": "csvw:Dialect"
},
{
"@id": "csvw:dialect",
"@type": "rdf:Property",
"rdfs:label": {
"en": "dialect"
},
"rdfs:comment": {
"en": "Provides hints to processors about how to parse the referenced files for to create tabular data models for an individual table, or all the tables in a group."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table"
]
},
"rdfs:range": "csvw:Dialect"
},
{
"@id": "csvw:doubleQuote",
"@type": "rdf:Property",
"rdfs:label": {
"en": "double quote"
},
"rdfs:comment": {
"en": "If true, sets the escape character flag to \". If false, to \\\\."
},
"rdfs:domain": "csvw:Dialect",
"rdfs:range": "xsd:boolean"
},
{
"@id": "csvw:encoding",
"@type": "rdf:Property",
"rdfs:label": {
"en": "encoding"
},
"rdfs:comment": {
"en": "The character encoding for the file, one of the encodings listed in [encoding]. The default is utf-8."
},
"rdfs:domain": "csvw:Dialect"
},
{
"@id": "csvw:foreignKeys",
"@type": "rdf:Property",
"rdfs:label": {
"en": "foreign keys"
},
"rdfs:comment": {
"en": "An array of foreign key definitions that define how the values from specified columns within this table link to rows within this table or other tables."
},
"rdfs:domain": "csvw:Schema"
},
{
"@id": "csvw:format",
"@type": "rdf:Property",
"rdfs:label": {
"en": "format"
},
"rdfs:comment": {
"en": "A definition of the format of the cell, used when parsing the cell."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
}
},
{
"@id": "csvw:header",
"@type": "rdf:Property",
"rdfs:label": {
"en": "header"
},
"rdfs:comment": {
"en": ""
},
"rdfs:domain": "csvw:Dialect",
"rdfs:range": "xsd:boolean"
},
{
"@id": "csvw:headerColumnCount",
"@type": "rdf:Property",
"rdfs:label": {
"en": "header column count"
},
"rdfs:comment": {
"en": "The number of header columns (following the skipped columns) in each row. The default is 0.\n"
},
"rdfs:domain": "csvw:Dialect",
"rdfs:range": "xsd:nonNegativeInteger"
},
{
"@id": "csvw:headerRowCount",
"@type": "rdf:Property",
"rdfs:label": {
"en": "header row count"
},
"rdfs:comment": {
"en": "The number of header rows (following the skipped rows) in the file. The default is 1."
},
"rdfs:domain": "csvw:Dialect",
"rdfs:range": "xsd:nonNegativeInteger"
},
{
"@id": "csvw:language",
"@type": "rdf:Property",
"rdfs:label": {
"en": "language"
},
"rdfs:comment": {
"en": "A language code as defined by [BCP47]. Indicates the language of the value within the cell."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
}
},
{
"@id": "csvw:length",
"@type": "rdf:Property",
"rdfs:label": {
"en": "length"
},
"rdfs:comment": {
"en": "The exact length of the value of the cell."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
},
"rdfs:range": "xsd:nonNegativeInteger"
},
{
"@id": "csvw:lineTerminator",
"@type": "rdf:Property",
"rdfs:label": {
"en": "line terminator"
},
"rdfs:comment": {
"en": "The character that is used at the end of a row. The default is CRLF."
},
"rdfs:domain": "csvw:Dialect"
},
{
"@id": "csvw:maxExclusive",
"@type": "rdf:Property",
"rdfs:label": {
"en": "max exclusive"
},
"rdfs:comment": {
"en": "The maximum value for the cell (exclusive)."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
}
},
{
"@id": "csvw:maxInclusive",
"@type": "rdf:Property",
"rdfs:label": {
"en": "max inclusive"
},
"rdfs:comment": {
"en": "The maximum value for the cell (inclusive). "
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
}
},
{
"@id": "csvw:maxLength",
"@type": "rdf:Property",
"rdfs:label": {
"en": "max length"
},
"rdfs:comment": {
"en": "The maximum length of the value of the cell."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
},
"rdfs:range": "xsd:nonNegativeInteger"
},
{
"@id": "csvw:minExclusive",
"@type": "rdf:Property",
"rdfs:label": {
"en": "min exclusive"
},
"rdfs:comment": {
"en": "The minimum value for the cell (exclusive)."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
}
},
{
"@id": "csvw:minInclusive",
"@type": "rdf:Property",
"rdfs:label": {
"en": "min inclusive"
},
"rdfs:comment": {
"en": "The minimum value for the cell (inclusive)."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
}
},
{
"@id": "csvw:minLength",
"@type": "rdf:Property",
"rdfs:label": {
"en": "min length"
},
"rdfs:comment": {
"en": "The minimum length of the value of the cell."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
},
"rdfs:range": "xsd:nonNegativeInteger"
},
{
"@id": "csvw:name",
"@type": "rdf:Property",
"rdfs:label": {
"en": "name"
},
"rdfs:comment": {
"en": "An atomic property that gives a canonical name for the column. This must be a string. Conversion specifications must use this property as the basis for the names of properties/elements/attributes in the results of conversions."
},
"rdfs:domain": "csvw:Column"
},
{
"@id": "csvw:notes",
"@type": "rdf:Property",
"rdfs:label": {
"en": "notes"
},
"rdfs:comment": {
"en": "An array of objects representing annotations. This specification does not place any constraints on the structure of these objects."
},
"rdfs:domain": "csvw:Table"
},
{
"@id": "csvw:null",
"@type": "rdf:Property",
"rdfs:label": {
"en": "null"
},
"rdfs:comment": {
"en": "The string used for null values. If not specified, the default for this is the empty string."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
}
},
{
"@id": "csvw:predicateUrl",
"@type": "rdf:Property",
"rdfs:label": {
"en": "predicate URL"
},
"rdfs:comment": {
"en": "An atomic property that holds one or more URIs that may be used as URIs for predicates if the table is mapped to another format."
},
"rdfs:domain": "csvw:Column",
"rdfs:range": "xsd:anyURI"
},
{
"@id": "csvw:primaryKey",
"@type": "rdf:Property",
"rdfs:label": {
"en": "primary key"
},
"rdfs:comment": {
"en": "A column reference property that holds either a single reference to a column description object or an array of references."
},
"rdfs:domain": "csvw:Schema"
},
{
"@id": "csvw:quoteChar",
"@type": "rdf:Property",
"rdfs:label": {
"en": "quote char"
},
"rdfs:comment": {
"en": "The character that is used around escaped cells."
},
"rdfs:domain": "csvw:Dialect"
},
{
"@id": "csvw:required",
"@type": "rdf:Property",
"rdfs:label": {
"en": "required"
},
"rdfs:comment": {
"en": "A boolean value which indicates whether every cell within the column must have a non-null value."
},
"rdfs:domain": "csvw:Column",
"rdfs:range": "xsd:boolean"
},
{
"@id": "csvw:resources",
"@type": "rdf:Property",
"rdfs:label": {
"en": "resources"
},
"rdfs:comment": {
"en": "An array of table descriptions for the tables in the group."
},
"rdfs:domain": "csvw:TableGroup",
"rdfs:range": "csvw:Table"
},
{
"@id": "csvw:row",
"@type": "rdf:Property",
"rdfs:label": {
"en": "row"
},
"rdfs:comment": {
"en": "Relates a Table to each Row output."
},
"rdfs:subPropertyOf": "rdfs:member",
"rdfs:domain": "csvw:Table"
},
{
"@id": "csvw:schema",
"@type": "rdf:Property",
"rdfs:label": {
"en": "schema"
},
"rdfs:comment": {
"en": "An object property that provides a schema description for an individual table, or all the tables in a group."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table"
]
},
"rdfs:range": "csvw:Schema"
},
{
"@id": "csvw:separator",
"@type": "rdf:Property",
"rdfs:label": {
"en": "separator"
},
"rdfs:comment": {
"en": "The character used to separate items in the string value of the cell."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
}
},
{
"@id": "csvw:skipBlankRows",
"@type": "rdf:Property",
"rdfs:label": {
"en": "skip blank rows"
},
"rdfs:comment": {
"en": "Indicates whether to ignore wholly empty rows (ie rows in which all the cells are empty). The default is false."
},
"rdfs:domain": "csvw:Dialect",
"rdfs:range": "xsd:boolean"
},
{
"@id": "csvw:skipColumns",
"@type": "rdf:Property",
"rdfs:label": {
"en": "skip columns"
},
"rdfs:comment": {
"en": "The number of columns to skip at the beginning of each row, before any header columns. The default is 0."
},
"rdfs:domain": "csvw:Dialect",
"rdfs:range": "xsd:nonNegativeInteger"
},
{
"@id": "csvw:skipInitialSpace",
"@type": "rdf:Property",
"rdfs:label": {
"en": "skip initial space"
},
"rdfs:comment": {
"en": "If true, sets the trim flag to \"start\". If false, to false."
},
"rdfs:domain": "csvw:Dialect",
"rdfs:range": "xsd:boolean"
},
{
"@id": "csvw:skipRows",
"@type": "rdf:Property",
"rdfs:label": {
"en": "skip rows"
},
"rdfs:comment": {
"en": "The number of rows to skip at the beginning of the file, before a header row or tabular data."
},
"rdfs:domain": "csvw:Dialect",
"rdfs:range": "xsd:nonNegativeInteger"
},
{
"@id": "csvw:source",
"@type": "rdf:Property",
"rdfs:label": {
"en": "source"
},
"rdfs:comment": {
"en": "The format to which the tabular data should be transformed prior to the transformation using the template. If the value is \"json\", the tabular data should first be transformed first to JSON based on the simple mapping defined in Generating JSON from Tabular Data on the Web. If the value is \"rdf\", it should similarly first be transformed to XML based on the simple mapping defined in Generating RDF from Tabular Data on the Web."
},
"rdfs:domain": "csvw:Template"
},
{
"@id": "csvw:table",
"@type": "rdf:Property",
"rdfs:label": {
"en": "table"
},
"rdfs:comment": {
"en": "Relates an Table group to annotated tables. (Note, this is different from csvw:resources, which relates metadata, rather than resulting annotated table descriptions."
},
"rdfs:subPropertyOf": "rdfs:member",
"rdfs:domain": "csvw:TableGroup",
"rdfs:range": "csvw:Table"
},
{
"@id": "csvw:table-direction",
"@type": "rdf:Property",
"rdfs:label": {
"en": "table direction"
},
"rdfs:comment": {
"en": "One of csvw:rtl csvw:ltr or csvw:default. Indicates whether the tables in the group should be displayed with the first column on the right, on the left, or based on the first character in the table that has a specific direction. "
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table"
]
},
"rdfs:range": "csvw:Direction"
},
{
"@id": "csvw:targetFormat",
"@type": "rdf:Property",
"rdfs:label": {
"en": "target format"
},
"rdfs:comment": {
"en": "A URL for the format that will be created through the transformation. If one has been defined, this should be a URL for a media type, in the form http://www.iana.org/assignments/media-types/media-type such as http://www.iana.org/assignments/media-types/text/calendar. Otherwise, it can be any URL that describes the target format."
},
"rdfs:domain": "csvw:Template"
},
{
"@id": "csvw:templateFormat",
"@type": "rdf:Property",
"rdfs:label": {
"en": "template format"
},
"rdfs:comment": {
"en": "A URL for the format that is used by the template. If one has been defined, this should be a URL for a media type, in the form http://www.iana.org/assignments/media-types/media-type such as http://www.iana.org/assignments/media-types/application/javascript. Otherwise, it can be any URL that describes the template format."
},
"rdfs:domain": "csvw:Template"
},
{
"@id": "csvw:templates",
"@type": "rdf:Property",
"rdfs:label": {
"en": "templates"
},
"rdfs:comment": {
"en": "An array of template specifications that provide mechanisms to transform the tabular data into other formats. "
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table"
]
},
"rdfs:range": "csvw:Template"
},
{
"@id": "csvw:text-direction",
"@type": "rdf:Property",
"rdfs:label": {
"en": "text direction"
},
"rdfs:comment": {
"en": "One of csvw:rtl or csvw:ltr. Indicates whether the text within cells should be displayed by default as left-to-right or right-to-left text. "
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:TableGroup",
"csvw:Table",
"csvw:Schema",
"csvw:Column"
]
},
"rdfs:range": "csvw:Direction"
},
{
"@id": "csvw:title",
"@type": "rdf:Property",
"rdfs:label": {
"en": "title"
},
"rdfs:comment": {
"en": "For a Template: A natural language property that describes the format that will be generated from the transformation. This is useful if the target format is a generic format (such as application/json) and the transformation is creating a specific profile of that format.\n\nFor a Column: A natural language property that provides possible alternative names for the column."
},
"rdfs:domain": {
"owl:unionOf": [
"csvw:Template",
"csvw:Column"
]
}
},
{
"@id": "csvw:trim",
"@type": "rdf:Property",
"rdfs:label": {
"en": "trim"
},
"rdfs:comment": {
"en": "Indicates whether to trim whitespace around cells; may be true, false, start or end. The default is false."
},
"rdfs:domain": "csvw:Dialect",
"rdfs:range": "xsd:boolean"
},
{
"@id": "csvw:uriTemplate",
"@type": "rdf:Property",
"rdfs:label": {
"en": "uri template"
},
"rdfs:comment": {
"en": "A URI template property that may be used to create a unique identifier for each row when mapping data to other formats."
},
"rdfs:domain": "csvw:Schema"
}
],
"rdfs_datatypes": [
{
"@id": "csvw:json",
"@type": "rdfs:Datatype",
"rdfs:label": {
"en": "json"
},
"rdfs:comment": {
"en": "A literal containing JSON."
},
"rdfs:subClassOf": "rdfs:Literal"
}
],
"rdfs_instances": [
{
"@id": "csvw:ltr",
"@type": "Direction",
"rdfs:label": {
"en": "left to right"
},
"rdfs:comment": {
"en": "Indicates text should be processed left to right."
}
},
{
"@id": "csvw:rtl",
"@type": "Direction",
"rdfs:label": {
"en": "right to left"
},
"rdfs:comment": {
"en": "Indiects text should be processed right to left"
}
}
]
}
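
The dialect properties defined in the vocabulary above map fairly closely onto the options of Python's standard csv module, so their effect can be illustrated with a short sketch. This is not a conforming processor. The helper name read_table is hypothetical, and the defaults marked "assumed" are not stated in the vocabulary comments above. Skipped rows are treated as physical lines, which is a simplification.

```python
import csv
import io

# Defaults taken from the vocabulary comments where stated; entries marked
# "assumed" are illustrative only and are not given in this section.
DEFAULTS = {
    "commentPrefix": "#",
    "delimiter": ",",
    "doubleQuote": True,        # assumed
    "headerRowCount": 1,
    "quoteChar": '"',           # assumed
    "skipBlankRows": False,
    "skipColumns": 0,
    "skipInitialSpace": False,  # assumed
    "skipRows": 0,              # assumed
}


def read_table(text, dialect=None):
    """Return (comments, header_rows, data_rows) for CSV text.

    Hypothetical helper: `dialect` is a dict using the property names
    of a Dialect Description.
    """
    d = {**DEFAULTS, **(dialect or {})}
    lines = text.splitlines(keepends=True)

    # Skipped rows that start with the comment prefix become comments.
    skipped, rest = lines[:d["skipRows"]], lines[d["skipRows"]:]
    comments = [
        line[len(d["commentPrefix"]):].strip()
        for line in skipped
        if line.startswith(d["commentPrefix"])
    ]

    reader = csv.reader(
        io.StringIO("".join(rest), newline=""),
        delimiter=d["delimiter"],
        quotechar=d["quoteChar"],
        doublequote=d["doubleQuote"],
        # When quote doubling is off, a backslash is used as the escape
        # character, following the doubleQuote comment.
        escapechar=None if d["doubleQuote"] else "\\",
        skipinitialspace=d["skipInitialSpace"],
    )

    # Drop the leading skipped columns from every row.
    rows = [row[d["skipColumns"]:] for row in reader]
    if d["skipBlankRows"]:
        rows = [row for row in rows if any(cell != "" for cell in row)]

    n = d["headerRowCount"]
    return comments, rows[:n], rows[n:]
```

For example, calling read_table with {"skipRows": 1, "delimiter": ";", "skipBlankRows": True} on a file whose first line is a "#" comment treats that line as a comment, the next row as the header, and drops wholly empty rows from the data.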