W3C

TriG

RDF Dataset Language

W3C First Public Working Draft 09 April 2013

This version:
http://www.w3.org/TR/2013/WD-trig-20130409/
Latest published version:
http://www.w3.org/TR/trig/
Latest editor's draft:
https://dvcs.w3.org/hg/rdf/raw-file/default/trig/index.html
Editor:
Gavin Carothers, Lex Machina
Authors:
Chris Bizer, Freie Universität Berlin
Richard Cyganiak, Freie Universität Berlin

Abstract

The Resource Description Framework (RDF) is a general-purpose language for representing information in the Web.

This document defines a textual syntax for RDF called TriG that allows an RDF dataset to be completely written in a compact and natural text form, with abbreviations for common usage patterns and datatypes. TriG is an extension of the Turtle [TURTLE-TR] format.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

TriG is intended the meet the charter requirement of the RDF Working Group to define an RDF syntax for multiple graphs. TriG is an extension of the Turtle syntax for RDF [[!TURTLE-TR]. The current document is based on the original proposal by Chris Bizer and Richard Cyganiak.

This document was published by the RDF Working Group as a First Public Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-rdf-comments@w3.org (subscribe, archives). All comments are welcome.

Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

1. Introduction

This document defines TriG a concrete syntax for RDF as defined in the RDF Concepts and Abstract Syntax ([RDF11-CONCEPTS]). TriG is an extension of Turtle ([TURTLE-TR]), extended to support representing a complete RDF Dataset.

2. An Introduction to TriG

This section is non-normative.

A TriG document allows writing down an RDF Dataset in a compact textual form. It consists of a sequence of directives, graph statements which contain triple-generating statements and optional blank lines. Comments may be given after a # that is not part of another lexical token and continue to the end of the line.

Graph statements are a pair of an IRI and a group of triple statements surrounded by {}. The IRI of the graph statement may be used in another graph statement which implies merging the tripes generated by each graph statement, and may reoccur as part of any triple statement. Optionally one graph statement may not not be labeled with an IRI. Such a graph statement corresponds to the Default Graph of an RDF Dataset.

The construction of an RDF Dataset from a TriG document is defined in section 4. TriG Grammar and section 5. Parsing.

2.1 Graph Statements

This section is non-normative.

A graph statement pairs an IRI with a RDF Graph. The triple statements that make up the graph are enclosed in {}.

In a TriG document a graph IRI MAY be used to label more then one graph. The IRI of a graph statement may be omitted. In this case the graph is considered the default graph of the RDF Dataset.

A RDF Dataset may contain only a single graph.

Example 1

A RDF Dataset may contain a default graph, and named graphs.

Example 2
Issue 1

A provenance example

Issue 2

A verisions example

Issue 3

A web snapshot example

2.2 Other Terms

All other terms and directives come from Turtle.

2.2.1 Sal Considerations for Blank Nodes

This section is non-normative.

BlankNodes sharing the same label in differently labeled graph statements MUST be considered to be the same BlankNode.

3. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MUST, MUST NOT, REQUIRED, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this specification are to be interpreted as described in [RFC2119].

This specification defines conformance criteria for:

A conforming TriG document is a Unicode string that conforms to the grammar and additional constraints defined in section 4. TriG Grammar, starting with the trigDoc production. A TriG document serializes an RDF dataset.

A conforming TriG parser is a system capable of reading TriG documents on behalf of an application. It makes the serialized RDF dataset, as defined in section 5. Parsing, available to the application, usually through some form of API.

The IRI that identifies the TriG language is: http://www.w3.org/ns/formats/TriG

Note

This specification does not define how TriG parsers handle non-conforming input documents.

3.1 Media Type and Content Encoding

The media type of TriG is application/trig. The content encoding of TriG content is always UTF-8.

4. TriG Grammar

A TriG document is a Unicode[UNICODE] character string encoded in UTF-8. Unicode characters only in the range U+0000 to U+10FFFF inclusive are allowed.

4.1 White Space

White space (production WS) is used to separate two terminals which would otherwise be (mis-)recognized as one terminal. Rule names below in capitals indicate where white space is significant; these form a possible choice of terminals for constructing a TriG parser.

White space is significant in the production String.

4.2 Comments

Comments in TriG take the form of '#', outside an IRIREF or String, and continue to the end of line (marked by characters U+000D or U+000A) or end of file if there is no end of line after the comment marker. Comments are treated as white space.

4.3 IRI References

Relative IRIs are resolved with base IRIs as per Uniform Resource Identifier (URI): Generic Syntax [RFC3986] using only the basic algorithm in section 5.2. Neither Syntax-Based Normalization nor Scheme-Based Normalization (described in sections 6.2.2 and 6.2.3 of RFC3986) are performed. Characters additionally allowed in IRI references are treated in the same way that unreserved characters are treated in URI references, per section 6.5 of Internationalized Resource Identifiers (IRIs) [RFC3987].

The @base directive defines the Base IRI used to resolve relative IRIs per RFC3986 section 5.1.1, "Base URI Embedded in Content". Section 5.1.2, "Base URI from the Encapsulating Entity" defines how the In-Scope Base IRI may come from an encapsulating document, such as a SOAP envelope with an xml:base directive or a mime multipart document with a Content-Location header. The "Retrieval URI" identified in 5.1.3, Base "URI from the Retrieval URI", is the URL from which a particular Turtle document was retrieved. If none of the above specifies the Base URI, the default Base URI (section 5.1.4, "Default Base URI") is used. Each @base directive sets a new In-Scope Base URI, relative to the previous one.

4.4 Escape Sequences

There are three forms of escapes used in TriG documents:

Context where each kind of escape sequence can be used
numeric
escapes
string
escapes
reserved character
escapes
IRIs, used as RDF terms or as in @prefix or @base declarations yes no no
local names no no yes
Strings yes yes no
Note

%-encoded sequences are in the character range for IRIs and are explicitly allowed in local names. These appear as a '%' followed by two hex characters and represent that same sequence of three characters. These sequences are not decoded during processing. A term written as <http://a.example/%66oo-bar> in TriG designates the IRI http://a.example/%66oo-bar and not IRI http://a.example/foo-bar. A term written as ex:%66oo-bar with a prefix @prefix ex: <http://a.example/> also designates the IRI http://a.example/%66oo-bar.

4.5 Grammar

The EBNF used here is defined in XML 1.0 [EBNF-NOTATION]. Production labels consisting of a number and a final 'g' are unique to TriG. All Production labels consisting of only a number reference the production with that number in the Turtle grammar [TURTLE-TR]. Production labels consisting of a number and a final 's', e.g. [60s], reference the production with that number in the SPARQL Query Language for RDF grammar [RDF-SPARQL-QUERY].

[1g] trigDoc ::= graph_statement*
[2g] graph_statement ::= directive | graph
[3g] graph ::= graphIri? '{' (triples '.')* '}'
[4g] graphIri ::= iri
[3] directive ::= prefixID | base | sparqlPrefix | sparqlBase
[4] prefixID ::= '@prefix' PNAME_NS IRIREF '.'
[5] base ::= '@base' IRIREF '.'
[28*] sparqlPrefix ::= [Pp] [Rr] [Ee] [Ff] [Ii] [Xx] PNAME_NS IRIREF
[29*] sparqlBase ::= [Bb] [Aa] [Ss] [Ee] IRIREF
[6] triples ::= subject predicateObjectList | blankNodePropertyList predicateObjectList?
[7] predicateObjectList ::= verb objectList (';' (verb objectList)?)*
[8] objectList ::= object (',' object)*
[9] verb ::= predicate | 'a'
[10] subject ::= iri | blank
[11] predicate ::= iri
[12] object ::= iri | blank | blankNodePropertyList | literal
[13] literal ::= RDFLiteral | NumericLiteral | BooleanLiteral
[14] blank ::= BlankNode | collection
[15] blankNodePropertyList ::= '[' predicateObjectList ']'
[16] collection ::= '(' object* ')'
[17] NumericLiteral ::= INTEGER | DECIMAL | DOUBLE
[128s] RDFLiteral ::= String (LANGTAG | '^^' iri)?
[133s] BooleanLiteral ::= 'true' | 'false'
[18] String ::= STRING_LITERAL_QUOTE | STRING_LITERAL_SINGLE_QUOTE | STRING_LITERAL_LONG_SINGLE_QUOTE | STRING_LITERAL_LONG_QUOTE
[135s] iri ::= IRIREF | PrefixedName
[136s] PrefixedName ::= PNAME_LN | PNAME_NS
[137s] BlankNode ::= BLANK_NODE_LABEL | ANON

Productions for terminals

[19] IRIREF ::= '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
[139s] PNAME_NS ::= PN_PREFIX? ':'
[140s] PNAME_LN ::= PNAME_NS PN_LOCAL
[141s] BLANK_NODE_LABEL ::= '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)?
[144s] LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
[20] INTEGER ::= [+-]? [0-9]+
[21] DECIMAL ::= [+-]? ([0-9]* '.' [0-9]+)
[22] DOUBLE ::= [+-]? ([0-9]+ '.' [0-9]* EXPONENT | '.' [0-9]+ EXPONENT | [0-9]+ EXPONENT)
[154s] EXPONENT ::= [eE] [+-]? [0-9]+
[23] STRING_LITERAL_QUOTE ::= '"' ([^#x22#x5C#xA#xD] | ECHAR | UCHAR)* '"'
[24] STRING_LITERAL_SINGLE_QUOTE ::= "'" ([^#x27#x5C#xA#xD] | ECHAR | UCHAR)* "'"
[25] STRING_LITERAL_LONG_SINGLE_QUOTE ::= "'''" (("'" | "''")? [^'\] | ECHAR | UCHAR)* "'''"
[26] STRING_LITERAL_LONG_QUOTE ::= '"""' (('"' | '""')? [^"\] | ECHAR | UCHAR)* '"""'
[27] UCHAR ::= '\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX
[159s] ECHAR ::= '\' [tbnrf\"']
[160s] NIL ::= '(' WS* ')'
[161s] WS ::= #x20 | #x9 | #xD | #xA
[162s] ANON ::= '[' WS* ']'
[163s] PN_CHARS_BASE ::= [A-Z] | [a-z] | [#00C0-#00D6] | [#00D8-#00F6] | [#00F8-#02FF] | [#0370-#037D] | [#037F-#1FFF] | [#200C-#200D] | [#2070-#218F] | [#2C00-#2FEF] | [#3001-#D7FF] | [#F900-#FDCF] | [#FDF0-#FFFD] | [#10000-#EFFFF]
[164s] PN_CHARS_U ::= PN_CHARS_BASE | '_'
[166s] PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | #00B7 | [#0300-#036F] | [#203F-#2040]
[167s] PN_PREFIX ::= PN_CHARS_BASE ((PN_CHARS | '.')* PN_CHARS)?
[168s] PN_LOCAL ::= (PN_CHARS_U | ':' | [0-9] | PLX) ((PN_CHARS | '.' | ':' | PLX)* PN_CHARS | ':' | PLX)?
[169s] PLX ::= PERCENT | PN_LOCAL_ESC
[170s] PERCENT ::= '%' HEX HEX
[171s] HEX ::= [0-9] | [A-F] | [a-f]
[172s] PN_LOCAL_ESC ::= '\' ('_' | '~' | '.' | '-' | '!' | '$' | '&' | "'" | '(' | ')' | '*' | '+' | ',' | ';' | '=' | '/' | '?' | '#' | '@' | '%')

5. Parsing

Issue 4

Define a method of parsing that treats each graph statement as a Turtle document. Merge any graph statements that have the same label, or if they don't have a label merge to form the default graph.

A. Differences from previous TriG

This section is non-normative.

B. Internet Media Type, File Extension and Macintosh File Type

Contact:
Eric Prud'hommeaux
See also:
How to Register a Media Type for a W3C Specification
Internet Media Type registration, consistency of use
TAG Finding 3 June 2002 (Revised 4 September 2002)

The Internet Media Type / MIME Type for TriG is "application/trig".

It is recommended that TriG files have the extension ".trig" (all lowercase) on all platforms.

It is recommended that TriG files stored on Macintosh HFS file systems be given a file type of "TEXT".

This information that follows has been submitted to the IESG for review, approval, and registration with IANA.

Type name:
application
Subtype name:
trig
Required parameters:
None
Optional parameters:
None
Encoding considerations:
The syntax of TriG is expressed over code points in Unicode [UNICODE]. The encoding is always UTF-8 [UTF-8].
Unicode code points may also be expressed using an \uXXXX (U+0000 to U+FFFF) or \UXXXXXXXX syntax (for U+10000 onwards) where X is a hexadecimal digit [0-9A-Fa-f]
Security considerations:
TriG is a general-purpose assertion language; applications may evaluate given data to infer more assertions or to dereference IRIs, invoking the security considerations of the scheme for that IRI. Note in particular, the privacy issues in [RFC3023] section 10 for HTTP IRIs. Data obtained from an inaccurate or malicious data source may lead to inaccurate or misleading conclusions, as well as the dereferencing of unintended IRIs. Care must be taken to align the trust in consulted resources with the sensitivity of the intended use of the data; inferences of potential medical treatments would likely require different trust than inferences for trip planning.
TriG is used to express arbitrary application data; security considerations will vary by domain of use. Security tools and protocols applicable to text (e.g. PGP encryption, MD5 sum validation, password-protected compression) may also be used on Turtle documents. Security/privacy protocols must be imposed which reflect the sensitivity of the embedded information.
TriG can express data which is presented to the user, for example, RDF Schema labels. Application rendering strings retrieved from untrusted Turtle documents must ensure that malignant strings may not be used to mislead the reader. The security considerations in the media type registration for XML ([RFC3023] section 10) provide additional guidance around the expression of arbitrary data and markup.
TriG uses IRIs as term identifiers. Applications interpreting data expressed in TriG should address the security issues of Internationalized Resource Identifiers (IRIs) [RFC3987] Section 8, as well as Uniform Resource Identifier (URI): Generic Syntax [RFC3986] Section 7.
Multiple IRIs may have the same appearance. Characters in different scripts may look similar (a Cyrillic "о" may appear similar to a Latin "o"). A character followed by combining characters may have the same visual representation as another character (LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT has the same visual representation as LATIN SMALL LETTER E WITH ACUTE). Any person or application that is writing or interpreting data in TriG must take care to use the IRI that matches the intended semantics, and avoid IRIs that make look similar. Further information about matching of similar characters can be found in Unicode Security Considerations [UNISEC] and Internationalized Resource Identifiers (IRIs) [RFC3987] Section 8.
Interoperability considerations:
There are no known interoperability issues.
Published specification:
This specification.
Applications which use this media type:
No widely deployed applications are known to use this media type. It may be used by some web services and clients consuming their data.
Additional information:
Magic number(s):
TriG documents may have the strings '@prefix' or '@base' (case dependent) near the beginning of the document.
File extension(s):
".trig"
Base URI:
The TriG '@base <IRIref>' term can change the current base URI for relative IRIrefs in the query language that are used sequentially later in the document.
Macintosh file type code(s):
"TEXT"
Person & email address to contact for further information:
Eric Prud'hommeaux <eric@w3.org>
Intended usage:
COMMON
Restrictions on usage:
None
Author/Change controller:
The TriG specification is the product of the RDF WG. The W3C reserves change control over this specifications.

C. References

C.1 Normative references

[EBNF-NOTATION]
Tim Bray; Jean Paoli; C. M. Sperberg-McQueen; Eve Maler; François Yergeau. EBNF Notation 26 November 2008. W3C Recommendation. URL: http://www.w3.org/TR/REC-xml/#sec-notation
[RDF11-CONCEPTS]
Richard Cyganiak; David Wood. RDF 1.1 Concepts and Abstract Syntax 05 June 2012. W3C Working Draft (work in progress). URL: http://www.w3.org/TR/2012/WD-rdf11-concepts-20120605/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Internet RFC 2119. URL: http://www.ietf.org/rfc/rfc2119.txt
[RFC3023]
M. Murata; S. St.Laurent; D. Kohn. XML Media Types January 2001. Internet RFC 3023. URL: http://www.ietf.org/rfc/rfc3023.txt
[RFC3986]
T. Berners-Lee; R. Fielding; L. Masinter. Uniform Resource Identifier (URI): Generic Syntax. January 2005. Internet RFC 3986. URL: http://www.ietf.org/rfc/rfc3986.txt
[RFC3987]
M. Dürst; M. Suignard. Internationalized Resource Identifiers (IRIs). January 2005. Internet RFC 3987. URL: http://www.ietf.org/rfc/rfc3987.txt
[TURTLE-TR]
Eric Prud'hommeaux; Gavin Carothers. Turtle: Terse Triple Language 19 February 2013. W3C Candidate Recommendation. URL: http://www.w3.org/TR/2013/CR-turtle-20130219/
[UNICODE]
The Unicode Consortium. The Unicode Standard.. Defined by: The Unicode Standard, Version 6.2.0, (Mountain View, CA: The Unicode Consortium, 2012. ISBN 978-1-936213-07-8) , as updated from time to time by the publication of new versions URL: http://www.unicode.org/standard/versions/enumeratedversions.html
[UTF-8]
F. Yergeau. UTF-8, a transformation format of ISO 10646. IETF RFC 3629. November 2003. URL: http://www.ietf.org/rfc/rfc3629.txt

C.2 Informative references

[RDF-SPARQL-QUERY]
Andy Seaborne; Eric Prud'hommeaux. SPARQL Query Language for RDF. 15 January 2008. W3C Recommendation. URL: http://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/
[UNISEC]
Mark Davis; Michel Suignard. Unicode Security Considerations 4 August 2010. URL: http://www.unicode.org/reports/tr36/