N-Triples-Format

From RDF Working Group Wiki
Revision as of 17:58, 1 April 2011 by Rcygania2 (Talk | contribs)

RDF-Triples

This document defines an easy-to-parse line-based serialization for RDF.

This syntax is an improved version of N-Triples, a format originally defined in the RDF Test Cases document. N-Triples was originally intended for writing test cases, but it has proven popular as a dump format for RDF data.

To distinguish it from the original N-Triples, the improved version defined here is called RDF-Triples.

Modifications

Modifications to N-Triples:

  • The charset is UTF-8
  • IRIs, not RDF URI References
  • Media type: application/rdf-triples
  • Tied closely to the Turtle grammar

Media type

Type name: application

Subtype name: rdf-triples

Charset: UTF-8 (default and only)

File extension: .nt

Intended usage: Common

Note: text/ntriples+turtle would be consistent with the +xml convention established in the XML Media Types registration (RFC3023 §7 ¶1) in that a tool which did not have a specific parser for ntriples could safely fall back to a turtle parser (which could fall back to N3, but ericP is nervous about pursuing text/ntriples+turtle+n3+bytes+bits+…). Note recent registrations for text/turtle and text/n3.

Sub-Language of Turtle

This language is a sub-language of Turtle. Any RDF-Triples document is a syntactically valid Turtle document and an RDF-Triples parser will generate the same triples as if the document were processed by a Turtle parser.


Compatibility with N-Triples

An N-Triples document written using only absolute IRIs is a valid RDF-Triples document that generates the same triples. N-Triples uses \u escapes for characters outside US-ASCII, and processing turns these back into the original characters.
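
The escape processing mentioned above can be sketched as follows. This is a minimal illustration, not part of the proposal: the function name decode_unicode_escapes is hypothetical, and the sketch assumes the \uXXXX and \UXXXXXXXX escape forms used by N-Triples. Other escapes such as \\ or \" are deliberately left untouched.

```python
import re

# Match a backslash followed by a \uXXXX escape, a \UXXXXXXXX escape,
# or any other single character (so that "\\" is consumed as one unit
# and a following "u0041" is not mistaken for an escape).
_ESCAPE = re.compile(r'\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|.)', re.DOTALL)


def decode_unicode_escapes(text: str) -> str:
    """Replace \\uXXXX and \\UXXXXXXXX escapes with the characters they name."""
    def replace(match: re.Match) -> str:
        hex_digits = match.group(1) or match.group(2)
        if hex_digits is None:
            # Some other escape (\\, \", \n, ...): leave it as written.
            return match.group(0)
        return chr(int(hex_digits, 16))

    return _ESCAPE.sub(replace, text)
```

For example, applying the function to the ASCII-only literal "caf\u00E9" gives the same literal written directly in UTF-8 with the character é.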

RDF-Triples has a different media type. The media type proposed for N-Triples is text/plain, and all text/* media types default to US-ASCII (RFC 2046, sec 4.1.2). A specific media type helps a client distinguish between formats when it asks for a broad range of types (e.g. */*) and has to decide what to do based on the response.
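
To illustrate why a specific media type helps, here is a rough sketch of a client choosing a parser from a Content-Type header. The function name pick_format and the format labels are hypothetical; only the media type strings come from this page.

```python
from typing import Optional

# Media types mentioned on this page, mapped to illustrative format labels.
FORMATS = {
    'application/rdf-triples': 'rdf-triples',
    'text/turtle': 'turtle',
    'text/n3': 'n3',
}


def pick_format(content_type: str) -> Optional[str]:
    """Return a format label for a Content-Type header value, or None.

    Parameters such as "; charset=UTF-8" are ignored, and the media type
    is compared case-insensitively.
    """
    media_type = content_type.split(';', 1)[0].strip().lower()
    return FORMATS.get(media_type)
```

With this mapping, a response labelled text/plain yields None: the client cannot tell from the header alone whether the body is N-Triples or some other plain text.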

Earlier versions of N-Triples were unclear as to the meaning of relative URIs. In the final version of the RDF Test Cases document, the section on URI References restricts them to RDF URI references, i.e. absolute. The grammar rule is called uriref and the token absoluteURI, but the token is not restricted beyond being a sequence of characters delimited by < >.


Grammar

Turtle-based grammar

The grammar is based on the Turtle working draft (as of 2011-03-31).

The tokens IRI_REF, BLANK_NODE_LABEL and STRING_LITERAL2 are taken from Turtle.

The same white space, tokenization and comment rules apply.

   rdf-triples-doc     ::= (triple)* <EOF>
   triple              ::= subj pred obj '.'
   subj                ::= IRI_REF | BLANK_NODE_LABEL
   pred                ::= IRI_REF 
   obj                 ::= IRI_REF | BLANK_NODE_LABEL | lit
   lit                 ::= STRING_LITERAL2 ( '^^' IRI_REF | '@' LANG )?
   LANG                ::= [a-zA-Z]+ ( "-" [a-zA-Z0-9]+ )*
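
The grammar is simple enough that one triple can be recognised with a single regular expression. The sketch below is an illustration under stated assumptions, not a conforming parser: the token patterns are simplified approximations of the Turtle tokens (for example, \u escapes inside IRIs and the full blank node label character set are not handled), comments are not handled, the datatype or language suffix on a literal is treated as optional, and the name parse_triple is hypothetical. It takes one triple per input string.

```python
import re

# Simplified approximations of the Turtle tokens used by the grammar.
IRI_REF = r'<[^<>"{}|^`\\\x00-\x20]*>'
BLANK_NODE_LABEL = r'_:[A-Za-z0-9_]+'
STRING_LITERAL2 = r'"(?:[^"\\\n\r]|\\.)*"'
LANG = r'[a-zA-Z]+(?:-[a-zA-Z0-9]+)*'

# lit ::= STRING_LITERAL2 with an optional datatype or language tag.
LITERAL = rf'{STRING_LITERAL2}(?:\^\^{IRI_REF}|@{LANG})?'

# triple ::= subj pred obj '.'
TRIPLE = re.compile(
    rf'\s*(?P<subj>{IRI_REF}|{BLANK_NODE_LABEL})'
    rf'\s*(?P<pred>{IRI_REF})'
    rf'\s*(?P<obj>{IRI_REF}|{BLANK_NODE_LABEL}|{LITERAL})'
    r'\s*\.\s*'
)


def parse_triple(text: str) -> tuple:
    """Return (subject, predicate, object) as raw token strings.

    Raises ValueError if the text is not a single well-formed triple.
    """
    match = TRIPLE.fullmatch(text)
    if match is None:
        raise ValueError(f'not an RDF-Triples statement: {text!r}')
    return match.group('subj'), match.group('pred'), match.group('obj')
```

Input that relies on Turtle abbreviations, such as prefixed names or 'a' for rdf:type, does not match the pattern and is rejected with ValueError.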

As a restriction of Turtle

This restricts Turtle to triples written as "subject predicate object .", with no abbreviation forms.

  • No prefixed names
  • No directives
  • No predicate-object lists (';' syntax)
  • No object lists (',' syntax)
  • No RDF collection syntax ('(' ')' syntax)
  • No unlabelled blank nodes ('[' ']' syntax)
  • No special literal forms for integer, decimal or double
  • No 'a' for rdf:type
  • No '()' for rdf:nil
  • No keywords for 'true' and 'false'
  • Strings are only written with a single pair of double quotes ("...")
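
The effect of these restrictions is easiest to see from the writing side: every term is spelled out in full and every triple gets its own statement. The following is a minimal sketch of such a writer. All helper names (iri, literal, escape_string, serialize) are hypothetical, only a few common string escapes are handled, and emitting a literal with neither datatype nor language tag is an assumption carried over from N-Triples.

```python
RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type'
XSD_INTEGER = 'http://www.w3.org/2001/XMLSchema#integer'


def iri(value: str) -> str:
    """Write an IRI in full; there are no prefixed names."""
    return f'<{value}>'


def escape_string(value: str) -> str:
    """Escape backslash, double quote, and line breaks for a "..." string."""
    return (value.replace('\\', '\\\\')
                 .replace('"', '\\"')
                 .replace('\n', '\\n')
                 .replace('\r', '\\r'))


def literal(value: str, datatype: str = None, lang: str = None) -> str:
    """Write a literal; numbers and booleans must be given as typed strings."""
    if datatype is not None and lang is not None:
        raise ValueError('a literal cannot have both a datatype and a language tag')
    text = f'"{escape_string(value)}"'
    if datatype is not None:
        return f'{text}^^{iri(datatype)}'
    if lang is not None:
        return f'{text}@{lang}'
    return text  # plain literal (assumed allowed, as in N-Triples)


def serialize(triples) -> str:
    """Write each (subject, predicate, object) as its own 'subject predicate object .' line."""
    return ''.join(f'{s} {p} {o} .\n' for s, p, o in triples)
```

Where Turtle could write "a" for rdf:type and a bare 42 for an integer, this writer emits iri(RDF_TYPE) and literal('42', datatype=XSD_INTEGER) in full.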

Issues to be addressed

  • Should the syntax be widened to include any of the other features of Turtle, for example the special literal form for integers?
  • Should Modifications to N-Triples mention that relative IRIs are allowed?
  • Comments in N-Triples have to be on their own line; in Turtle they can be on a line that also has non-whitespace content. Should RDF-Triples follow N-Triples?
  • In N-Triples, a triple can't be split over multiple lines, and one line can't have multiple triples. In Turtle, both are allowed. Should RDF-Triples follow N-Triples?