No definition of line terminator for canonical N-Triples

The RDF Dataset Canonicalization Working Group makes use of an inferred canonical representation for N-Quads. Such a form is not actually defined in the N-Quads spec, but it can be inferred from the canonical representation of N-Triples [1].

However, while the whitespace between components of a triple (subject, predicate, object, and terminating ‘.’) is defined, there is no discussion of the whitespace separating triples in a document. This is in line with the grammar production for triple [2], but there is no definition of what comprises an ntriplesDoc [3] in canonical form. The suggested change is to add that each triple MUST be terminated by a single newline (U+000A). Without such a change, a document using several different forms of the EOL token could still be considered canonical, where EOL is [#xD#xA]+.
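
To illustrate the suggested rule, here is a minimal Python sketch of a check that every EOL in a document is exactly one U+000A. The function name is hypothetical, and treating an empty document as canonical is my own assumption, not something the spec states. Scanning the whole document is safe because raw CR and LF cannot appear inside a literal in N-Triples; they have to be escaped.

```python
import re

# EOL production from the N-Triples grammar: [#xD#xA]+
EOL = re.compile(r"[\r\n]+")


def has_canonical_line_endings(doc: str) -> bool:
    """Return True if every EOL run in doc is a single U+000A.

    Hypothetical helper: it checks only line termination, not the
    rest of the canonical form (term spacing, escapes, and so on).
    """
    if doc == "":
        # Assumption: an empty document is trivially canonical.
        return True
    if not doc.endswith("\n"):
        # The last triple must also be terminated by a newline.
        return False
    # Each maximal run of CR/LF characters must be exactly one LF.
    return all(m.group() == "\n" for m in EOL.finditer(doc))
```

With this check, a document using "\r\n", or with a blank line between triples, is rejected, although both match EOL in the grammar.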

A future release of N-Quads should have a corresponding Canonicalization section, adding a canonical representation in which any graph label is followed by a single space (U+0020), unless the graph label is omitted.
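
As a rough sketch of what such a section would imply for a serializer, the following Python function joins already-serialized terms into one statement line. The function name and signature are hypothetical. The terms are assumed to be in canonical form already, and the trailing U+000A reflects the change suggested above rather than anything in the current specs.

```python
from typing import Optional


def canonical_quad_line(subject: str, predicate: str, obj: str,
                        graph: Optional[str] = None) -> str:
    """Join canonical terms into one statement line.

    Hypothetical helper: each term, including the graph label when
    present, is followed by a single space (U+0020), then '.', then
    a single newline (U+000A) as suggested in this message.
    """
    terms = [subject, predicate, obj]
    if graph is not None:
        # The graph label may be an IRI or a blank node.
        terms.append(graph)
    return " ".join(terms) + " .\n"
```

When the graph label is omitted, the result is the same line that the canonical N-Triples rules would give for the triple.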

Editorially, section 2.1 Simple Statements says “The graph label IRI can be omitted”, and should rather say “The graph label component can be omitted” (or similar), as a graph label may be a blank node.

Gregg Kellogg
gregg@greggkellogg.net

[1] https://www.w3.org/TR/n-triples/#canonical-ntriples
[2] https://www.w3.org/TR/n-triples/#grammar-production-triple
[3] https://www.w3.org/TR/n-triples/#grammar-production-ntriplesDoc

Received on Wednesday, 23 November 2022 22:00:53 UTC