Trivial Object Description Language (TODL) 1.0

Status

Experimental. This document is missing some explanation text, but should still be usable to people who know RDF/XML.

Note: This is a fork of N-Triples that tries to keep the language architecturally pure. I believe it is still a sublanguage of N3.

1. Purpose

TODL is a language for communicating information using simple descriptive statements. It can be used to serialize program data structures, database contents, or any other basic information.

TODL is comparable to RDF/XML, XML, KIF, and many other languages and formats. Its chief advantages are its simplicity and global-scope identifiers.

Understanding TODL is trivial if you know RDF/XML, but TODL does not rely on RDF/XML in any way.

TODL is a human-readable format, using only printable US-ASCII characters (plus space and cr/lf), but it is not meant to be easy to read or write by hand.

TODL is perhaps an excellent language on which to base the Semantic Web.

2. Introduction (non-normative)

TODL is based on an old idea that one can communicate arbitrary information through simple three-part sentences. In grammatical terms, each of these sentences has a subject, a predicate, and an object. Each sentence conveys the information that something (identified by the subject) has some relationship (identified by the predicate) with something else (identified by the object).

In TODL, we identify everything (whether identified by the subject, predicate, or object of a sentence) using the same kind of identifier. We have three kinds of identifiers:

  1. Local-Scope Identifiers look like _:sam, _:friend, and _:g357x. When you see a TODL expression, you can try to learn what one of these terms identifies by looking at the sentences in which it is used, but you must start from scratch, neither looking at the text of the identifier nor remembering any previous meaning. These identifiers are like pronouns, acting as placeholders that identify something for a little while and later (in another TODL document) identify something else.
  2. Global-Scope Identifiers look like web addresses (such as http://www.w3.org/2000/01/sw/people#sandro). Like local-scope identifiers, they can identify anything. The difference is that they are supposed to identify exactly one thing to everybody and for all time. It is probably impossible to implement or enforce this behavior perfectly, but we still try. Anyone creating TODL content should be careful to use global-scope identifiers only as their creator intended. If you don't know the intended meaning, don't use it.
  3. String Literals look like "Hello, World!". They identify strings of Unicode characters. Control characters, characters not in US-ASCII, quote, and backslash may be included in strings using backslash-escape sequences like "\n" for lf (new line) and "\u03b1" for the Greek small letter alpha.

It is possible, even reasonable, for several identifiers to denote the same thing.
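
As a rough illustration of the three kinds of identifier, the sketch below classifies terms written in the concrete syntax of section 3, where global-scope identifiers are wrapped in angle brackets. The function name identifier_kind and the predicate http://example.org/terms#name are made up for this example; they are assumptions, not part of TODL.

```python
import re

# Local-scope identifiers: '_:' followed by a name, as in the section 3 grammar.
NODE_ID = re.compile(r"_:[A-Za-z][A-Za-z0-9]*\Z")


def identifier_kind(term: str) -> str:
    """Return 'local', 'global', or 'literal' for one TODL term (hypothetical helper)."""
    if NODE_ID.match(term):
        return "local"
    if len(term) > 2 and term.startswith("<") and term.endswith(">"):
        return "global"
    if len(term) >= 2 and term.startswith('"') and term.endswith('"'):
        return "literal"
    raise ValueError(f"not a TODL identifier: {term!r}")


# One three-part sentence: subject, predicate, object.
sentence = (
    "<http://www.w3.org/2000/01/sw/people#sandro>",
    "<http://example.org/terms#name>",  # hypothetical predicate, for illustration only
    '"Sandro"',
)
kinds = [identifier_kind(term) for term in sentence]
```

Any of the three positions may hold any of the three kinds; the classification depends only on the shape of the term, not on where it appears in the sentence.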

...go on...

3. Syntax

This syntax description was borrowed from the N-Triples documentation and then modified to suit TODL. Thanks to all the people responsible for N-Triples; sorry we're diverging.

A TODL document is a sequence of US-ASCII characters and is defined by the todlDoc grammar term below. Parsing it results in a sequence of statements formed from the subject, predicate and object terms. The meaning of these terms is defined in section 4.

The grammar below is written in the EBNF notation used in XML 1.0 (Second Edition):

todlDoc ::= line*
line ::= ws* ( comment | triple )? eoln
comment ::= '#' ( character - ( cr | lf ) )*
triple ::= subject ws+ predicate ws+ object ws* '.' ws*
subject ::= identifier
predicate ::= identifier
object ::= identifier
identifier ::= uriref | nodeID | literal
uriref ::= '<' absoluteURI '>'
nodeID ::= '_:' name
literal ::= '"' string '"'
ws ::= space | tab
eoln ::= cr | lf | cr lf
space ::= #x20 /* US-ASCII space - decimal 32 */
cr ::= #xD /* US-ASCII carriage return - decimal 13 */
lf ::= #xA /* US-ASCII linefeed - decimal 10 */
tab ::= #x9 /* US-ASCII horizontal tab - decimal 9 */
string ::= character* with escapes as defined in section Strings
name ::= [A-Za-z][A-Za-z0-9]*
absoluteURI ::= character+ with escapes as defined in section URI References
character ::= [#x20-#x7E] /* US-ASCII space to decimal 126 */
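
The grammar above can be approximated with a few regular expressions. The following is a minimal sketch, not a reference parser: the names parse_line, URIREF, NODE_ID, and LITERAL are illustrative, and it assumes that the text of a uriref contains no space, "<", or ">" (the grammar itself defers those details to the URI References section). It returns the three terms unprocessed, without decoding string escapes.

```python
import re

# Assumption: URI text is printable US-ASCII without space, '<', or '>'.
URIREF = r"<[\x21-\x3B\x3D\x3F-\x7E]+>"
NODE_ID = r"_:[A-Za-z][A-Za-z0-9]*"
# Any printable character except '"' and '\', or a backslash plus one character.
LITERAL = r'"(?:[\x20\x21\x23-\x5B\x5D-\x7E]|\\[\x20-\x7E])*"'
TERM = f"(?:{URIREF}|{NODE_ID}|{LITERAL})"
WS = r"[ \t]"

# triple ::= subject ws+ predicate ws+ object ws* '.' ws*  (with leading ws* from line)
TRIPLE = re.compile(f"{WS}*({TERM}){WS}+({TERM}){WS}+({TERM}){WS}*\\.{WS}*\\Z")
# A line may also be empty or hold only a comment.
BLANK_OR_COMMENT = re.compile(r"[ \t]*(?:#[\x20-\x7E]*)?\Z")


def parse_line(line: str):
    """Return (subject, predicate, object) as raw terms, or None for blank/comment lines."""
    body = line.rstrip("\r\n")  # drop the eoln
    if BLANK_OR_COMMENT.match(body):
        return None
    match = TRIPLE.match(body)
    if match is None:
        raise ValueError(f"not a valid TODL line: {line!r}")
    return match.group(1), match.group(2), match.group(3)


example = (
    "<http://www.w3.org/2000/01/sw/people#sandro> "
    '<http://example.org/terms#name> "Sandro" .\n'  # hypothetical predicate
)
statement = parse_line(example)
```

Because subject, predicate, and object all use the same identifier production, a single TERM pattern serves all three positions.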

3.2 Strings

TODL strings, like N-Triples strings, are sequences of US-ASCII character productions encoding [Unicode] character strings. The characters outside the US-ASCII range are made available by \-escape sequences as follows:

Escape sequence   Encodes Unicode character
\\                Backslash character (decimal 92, #x5c)
\"                Double quote (decimal 34, #x22)
\n                Linefeed (decimal 10, #xA) - lf character
\r                Carriage return (decimal 13, #xD) - cr character
\t                Horizontal tab (decimal 9, #x9) - tab character
\uxxxx            4 required hexadecimal digits xxxx encoding character [#x0-#x8],[#xB-#xC],[#xE-#x1F],[#x7F-#xFFFF]
\Uxxxxxxxx        8 required hexadecimal digits xxxxxxxx encoding character [#x10000-#x10FFFF]

This escaping satisfies the [Charmod] section Reference Processing Model on making the full Unicode character range U+0 to U+10FFFF available to applications and providing only one way to escape any character.

It is recommended but not required that the resulting Unicode character string be made available to applications in UTF-8 encoding.
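
The escape sequences described in this section could be decoded along the following lines. This is a sketch under stated assumptions: the function name unescape is hypothetical, it expects the text between the surrounding quotes, it accepts hexadecimal digits in either case, and it does not check that a \u or \U escape falls within the ranges listed for it.

```python
import re

# Single-character escapes from the table: the character after the backslash
# maps to the character it encodes.
_SIMPLE = {"\\": "\\", '"': '"', "n": "\n", "r": "\r", "t": "\t"}

# A backslash followed by uXXXX, UXXXXXXXX, or (at most) one other character.
_ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.?))", re.DOTALL)


def unescape(body: str) -> str:
    """Decode the backslash escapes in the text of a TODL string literal."""

    def replace(match):
        if match.group(1) is not None:  # \uxxxx
            return chr(int(match.group(1), 16))
        if match.group(2) is not None:  # \Uxxxxxxxx
            return chr(int(match.group(2), 16))
        char = match.group(3)
        if char not in _SIMPLE:  # unknown escape or trailing backslash
            raise ValueError(f"invalid escape sequence: \\{char}")
        return _SIMPLE[char]

    return _ESCAPE.sub(replace, body)


greeting = unescape(r"Hello, World!\n")
alpha = unescape(r"\u03b1")
```

The result is an ordinary Python str of Unicode characters; encoding it as UTF-8 for an application is a separate step, for example with str.encode("utf-8").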

3.3 URI References

N-Triples says:

URI references are defined and encoded using the rules defined in Character Encoding in URI References. That is, disallowed characters are represented in UTF-8 and then encoded using the %HH format, where HH is the byte value expressed using hexadecimal notation.

...but this makes no sense to me. A URI-Reference is simply a Unicode character string with a very restricted syntax (printable US-ASCII, no spaces, no "<", no "[", etc.). The use of URI escaping with "%" is up to higher layers, and not our concern.

4. Semantics

...

5. Related Work

5.1 N-Triples

The need for something like TODL was temporarily assuaged by N-Triples, but when the RDF Core WG reasserted N-Triples' role as a parser-testing language, disallowing literals as subjects and introducing XML and language-tagged literals, N-Triples became unusable as a trivial description language.

5.2 N3

At this time, TODL is a sublanguage of N3, but N3 is an evolving language. New versions of the two may not be synchronized.

5.3 RDF/XML

The semantics of TODL are intended to match an essential subset of those of RDF/XML. Interoperation should be straightforward, at least around common, central features.

Sandro Hawke
First: 2002-03-29; This: $Id: Overview.html,v 1.1 2002/03/29 17:01:08 sandro Exp $