Are documents with uppercase language tags valid Turtle?

Dear all,

The part of the Turtle draft spec about quoted literals [1] points to the RDF 1.1 Concepts and Abstract Syntax draft [2], which says:
The language tag must be well-formed according to section 2.2.9 of [BCP47], and must be normalized to lowercase.

Note the *must* next to lowercase. This means that, by this definition, any parser that encounters an uppercase language tag must throw an error.
In other words: tolerant behavior is not acceptable, because it’s a “must” and not a “should”.

Unfortunately, this is in contradiction with the BNF syntax [3], which reads:
    [76s] <LANGTAG> ::= BASE
                      | PREFIX
                      | "@" [a-zA-Z]+ ( "-" [a-zA-Z0-9]+ )*

It’s also in contradiction with the editor’s draft of 23 January 2013 [4]:
    [144s]	LANGTAG	::=	'@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
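
To make the mismatch concrete, here is a minimal sketch in JavaScript. The regular expression is a direct transcription of the LANGTAG production (minus the leading '@'); the names GRAMMAR_LANGTAG and isLowercase are my own, chosen for illustration, and do not come from either spec or from any existing parser.

```javascript
// LANGTAG production from the editor's draft, without the leading '@':
//   '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
const GRAMMAR_LANGTAG = /^[a-zA-Z]+(-[a-zA-Z0-9]+)*$/;

// Hypothetical check for the "must be normalized to lowercase" wording in [2].
function isLowercase(tag) {
  return tag === tag.toLowerCase();
}

// Print, for each tag, whether the grammar accepts it and whether it is lowercase.
for (const tag of ['en', 'en-us', 'en-US', 'EN']) {
  console.log(tag, GRAMMAR_LANGTAG.test(tag), isLowercase(tag));
}
```

Tags such as en-US and EN match the grammar but fail the lowercase check, which is exactly the gap between the two documents.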

So either the spec is incorrect when pointing to [2], because that is a different definition of language tag,
or the syntax should be updated to allow only lowercase tags.
Note, however, that some software (such as 4store) does return uppercase language tags.

So the question is:
Are uppercase language tags correct, and *must* parsers accept them, or *must* they throw an error?
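
For illustration, the two candidate behaviors could be sketched as follows. This is only a sketch under my own assumptions: the function name readLanguageTag and its strict flag are hypothetical, not the API of any of the parsers mentioned here, and which branch is the right one is precisely the open question.

```javascript
// Hypothetical helper: handles a language tag (without the leading '@')
// either strictly or tolerantly, depending on the `strict` flag.
function readLanguageTag(tag, strict) {
  // Both modes reject tags that do not match the LANGTAG grammar at all.
  if (!/^[a-zA-Z]+(-[a-zA-Z0-9]+)*$/.test(tag)) {
    throw new Error('Malformed language tag: ' + tag);
  }
  const lower = tag.toLowerCase();
  // Strict reading of [2]: a tag that is not already lowercase is an error.
  if (strict && tag !== lower) {
    throw new Error('Language tag is not lowercase: ' + tag);
  }
  // Tolerant reading: accept the tag and normalize it to lowercase.
  return lower;
}
```

With strict set to false, readLanguageTag('en-US', false) yields 'en-us'; with strict set to true, the same input throws.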

I have seen this issue on several parsers [5][6] and am currently facing it for my own JavaScript-based Turtle/N3 parser [7].

Best,

Ruben

[1] http://www.w3.org/TR/turtle/#turtle-literals
[2] http://www.w3.org/TR/2012/WD-rdf11-concepts-20120605/#dfn-literal
[3] http://www.w3.org/TR/2011/WD-turtle-20110809/turtle.bnf
[4] https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/index.html#grammar-production-LANGTAG
[5] http://www.openrdf.org/issues/browse/RIO-44
[6] https://github.com/ruby-rdf/rdf-n3/issues/2
[7] https://github.com/RubenVerborgh/node-n3/issues/2

Received on Wednesday, 6 February 2013 14:07:50 UTC