Prepared by ericP for media type discussions with the IETF and a W3C Team discussion in January 2008.
An XML serialization of RDF has been a standard since 1998. Two textual representations of RDF, n3 and turtle, have been in use since 2000 and 2001 respectively. This page documents the issues associated with selecting media types for these languages.
name | role | current media type | example |
---|---|---|---|
RDF | data model | N/A | |
RDFXML | XML serialization of RDF | application/rdf+xml | <rdf:Description rdf:about="http://purl.org/commons/record/ncbi_gene/1812"> <rdfs:label>human DRD1</rdfs:label> </rdf:Description> |
ntriples | simple serialization of RDF | text/plain | <http://purl.org/commons/record/ncbi_gene/1812> <http://www.w3.org/2000/01/rdf-schema#label> "human DRD1" . |
turtle | textual serialization of RDF | application/x-turtle | ncbi_gene:1812 rdfs:label "human DRD1" . |
n3 | extension¹ of turtle language expressing a superset of RDF | text/rdf+n3 (not registered) | hcls:kb foo:asserts { ncbi_gene:1812 rdfs:label "human DRD1" } . |
¹ Note that n3 actually predates turtle, but considering it an extension is practical for media type consideration.
The registration application included the following parameters:
with the expectation that, until HTTP either drops its default charset or adopts UTF-8 as the default, each HTTP transaction that includes non-ASCII characters will include a charset parameter:
Content-type: text/n3; charset="UTF-8"
Content-type: text/turtle; charset="UTF-8"
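The expectation above can be sketched as a small helper that a server script might call when building its response. This is only an illustration: the function name `content_type_header` is hypothetical, and the quoted `charset="UTF-8"` form simply mirrors the headers shown above.

```python
def content_type_header(media_type: str, body: str) -> str:
    """Return a Content-type header line for a Turtle or N3 body.

    Hypothetical helper: the charset parameter is added only when the
    body contains characters outside US-ASCII.
    """
    if body.isascii():
        return f"Content-type: {media_type}"
    return f'Content-type: {media_type}; charset="UTF-8"'


# An ASCII-only body needs no parameter; a body with "ü" does.
print(content_type_header("text/turtle", 'ncbi_gene:1812 rdfs:label "human DRD1" .'))
print(content_type_header("text/n3", 'hcls:kb foo:asserts { ncbi_gene:1812 rdfs:label "Dürst" } .'))
```

A server could just as well send the parameter unconditionally; the conditional form only follows the wording of the expectation.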
While the choice of subtree may be forced for practical reasons (the desire to not include a charset parameter in every HTTP transaction), there remains the ideological choice between text/ and application/. The two camps principally divide on whether one's grandmother should see turtle and whether the text/ tree is meaningful if that is the only litmus test.
MIME (RFC2046) says that the default character encoding is us-ascii unless specified in the registration process (RFC3023). n3 and turtle, as well as related languages like SPARQL, are explicitly UTF-8. As stated above, HTTP would require a charset parameter for these languages if they used text/ media types. While MIME applies to technologies like SMTP and NNTP, the vast majority of custom code that will need to accommodate media types for these languages consists of HTTP server scripts serving turtle or n3 and HTTP client scripts requesting them.
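A client script that receives one of these media types has to decide which charset applies. The following is a minimal sketch of that decision, assuming the defaults as this page describes them (us-ascii for text/ under MIME, ISO-8859-1 for text/ over HTTP 1.1). The function name `effective_charset` and its `transport` argument are hypothetical; header parsing uses Python's standard `email.message.Message`.

```python
from email.message import Message
from typing import Optional

# Defaults for text/* when no charset parameter is present (assumed from
# the discussion on this page: RFC2046 for MIME, RFC2616 for HTTP 1.1).
TEXT_DEFAULTS = {"mime": "us-ascii", "http": "iso-8859-1"}


def effective_charset(content_type: str, transport: str) -> Optional[str]:
    """Return the charset a recipient would apply, or None if the
    transport rules give no default (anything outside text/)."""
    msg = Message()
    msg["Content-Type"] = content_type
    declared = msg.get_content_charset()  # lower-cased, or None if absent
    if declared is not None:
        return declared
    if msg.get_content_maintype() == "text":
        return TEXT_DEFAULTS[transport]
    return None  # left to the media type registration


print(effective_charset('text/turtle; charset="UTF-8"', "http"))
print(effective_charset("text/turtle", "http"))
print(effective_charset("text/turtle", "mime"))
print(effective_charset("application/x-turtle", "http"))
```

The sketch shows why a text/ type without the parameter is a problem for UTF-8 languages: both transports fall back to a charset other than UTF-8.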
Choosing an application/ media type obviates the need to include ;charset="UTF-8" in HTTP transactions. However, Martin Dürst asserts that modern browsers use the user's preferences instead of 8859-1, so there is no observable default charset in HTTP any more. If the community agrees that this is the only practice we need to worry about, we should update HTTP. The contra-indications would be security considerations and the newly introduced need for users to select a charset manually in order to view documents that they used to view without intervention. Re security: no utf-8 document interpreted as 8859-1 will have any unintended control characters (u01-u1f). The principal debate here appears to be between Frank Ellermann, who prefers "default ASCII", and Martin Dürst, who prefers "no default", which allows the application to use the internal charset.
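The security remark above can be checked with a short sketch. It encodes a string as UTF-8, misreads the bytes as ISO-8859-1, and confirms that no character in the range 01-1F appears that was not already in the original. The function names are hypothetical, and only the range named above is examined.

```python
def c0_controls(text: str) -> set:
    """Return the characters of `text` in the range U+0001..U+001F."""
    return {ch for ch in text if 0x01 <= ord(ch) <= 0x1F}


def misread_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode the same bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def multibyte_bytes_are_high(first: int, last: int) -> bool:
    """Check that every UTF-8 byte for code points first..last is >= 0x80."""
    for cp in range(first, last + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates cannot be encoded as UTF-8
        if any(b < 0x80 for b in chr(cp).encode("utf-8")):
            return False
    return True


sample = 'ncbi_gene:1812 rdfs:label "Martin Dürst" .\n'
misread = misread_as_latin1(sample)

# The newline is an intended control character, present in both readings.
assert c0_controls(misread) == c0_controls(sample) == {"\n"}

# Every byte of a multi-byte sequence is >= 0x80, so none can fall in 01-1F.
print(multibyte_bytes_are_high(0x80, 0xFFFF))
```

The misread text is still garbled (the "ü" becomes two Latin-1 characters); the check only shows that the garbling does not introduce characters in the 01-1F range.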
"; charset='UTF-8'" with each transaction -- TimBL
Dec 17 Eric Prud'homme (8.2K) Media types for RDF languages N3 and Turtle
│ initial summary of the problem space
Dec 17 Garret Wilson (7.6K) ├─>Re: Media types for RDF languages N3 and Turtle
│ │ pointer to discussion of media type for ntriples
Dec 17 Sean B. Palmer (2.9K) │ └─>
│ │ is us-ascii the default encoding? changing to UTF-8 would require an RFC
Dec 17 Garret Wilson (6.2K) │ └─>
│ │ text/plain defaults to us-ascii; other subtrees may pick their own defaults
│ │ could render text/ obsolete
Dec 17 Sean B. Palmer (2.1K) │ └─>
│ │ does interpreting utf-8 as US-ASCII cause security issues?
│ │ DanC noted that civil disobedience was an option
Dec 17 Garret Wilson (0.7K) │ └─>
│ │ +1 to civil disobedience if we document it
Dec 30 Garret Wilson (1.6K) │ └─>
│ note that RFC2616 (HTTP 1.1) has a default charset of ISO-8859-1
Dec 17 Eric Prud'homme (8.3K) └─>[RESEND] Media types for RDF languages N3 and Turtle
│ same summary of the problem space (some thinkos and typos corrected)
Dec 19 Graham Klyne (7.3K) └─>Re: [RESEND] Media types for RDF languages N3 and Turtle
│ use text/ *only* for media types intended *primarily* for human consumption
│ use '-' instead of '+', e.g. application/rdf-turtlec (or provide good use case for '+')
Dec 20 Eric Prud'homme (10K) ├─>Re: Re: [RESEND] Media types for RDF languages N3 and Turtle
│ │ is the above metric (primarily for human consumption) shared by the community?
│ │ is there precedent for '-'?
Dec 20 Graham Klyne (10K) │ └─>Re: [RESEND] Media types for RDF languages N3 and Turtle
│ Ned Freed asserts that text/html was a mistake
│ no precedent for '-'
│ point of RDF is to *not* end up with a family of syntactically related languages
Dec 21 Garret Wilson (2.1K) └─>Re: [RESEND] Media types for RDF languages N3 and Turtle
│ above metric (primarily for human consumption) precludes everything but text/plain
│ propose criteria that match spirit of RFC2046:
│ • all bytes compose text characters
│ • always editable in a text editor
│ • abstract values represented in text
Dec 21 Garret Wilson (1.1K) └─>
  • revision control system (e.g. CVS) treats it as text
Dec 18 Eric Prud'homme (5.5K) Request for review of Turtle (an RDF serialization) media type: text/turtle
│ strawman request for text/turtle
Dec 18 Julian Reschke (1.6K) ├─>Re: Request for review of Turtle (an RDF serialization) media type: text/turtle
│ │ why not application/?
│ │ note HTTP 1.1 default charset
│ │ see also HTTPbis issue 20
Dec 18 Frank Ellermann (0.3K) │ ├─>Unknown text/* subtypes (was: Request for review of Turtle (an RDF serialization) media type: text/turtle)
│ │ │ HTTP oddities shouldn't affect MIME registration
│ │ │ 2616bis can switch to "unknown text is ASCII"
Dec 18 Julian Reschke (0.6K) │ │ ├─>Re: Unknown text/* subtypes
│ │ │ │ why the HTTP 1.1 rule?
Dec 18 Frank Ellermann (1.3K) │ │ │ └─>
│ │ │ │ some RFC history
Dec 26 Martin Duerst (3.3K) │ │ │ └─>
│ │ │ necessary for backwards-compatibility with very early HTTP versions
│ │ │ backwards compatibility is no longer necessary
Dec 26 Martin Duerst (1.2K) │ │ └─>Re: Unknown text/* subtypes (was: Request for review of Turtle (an RDF serialization) media type: text/turtle)
│ │ │ "there is no default" so the application can look for internal charset info
│ │ │ "unknown text is ASCII" would prohibit app from using the internal charset
Dec 26 Eric Prud'homme (3.2K) │ │ ├─>Re: Re: Unknown text/* subtypes (was: Request for review of Turtle (an RDF serialization) media type: text/turtle)
│ │ │ how will the browser know how to look for charset info?
Dec 28 Frank Ellermann (2.2K) │ │ ├─>Re: Unknown text/* subtypes
│ │ │ │ "default ASCII" would be consistent with MIME
│ │ │ │ docs with chars outside 00-7f should be erroneous without some charset
│ │ │ │ years after 2616bis, could migrate to "default UTF-8"
│ │ │ │ current apps allow internal charsets to override if there is no explicit charset parm (**contradicts Martin's assertion**)
│ │ │ │ nobody treats text/html as "default Latin-1"
Dec 28 Anne van Kesteren (0.7K) │ │ │ ├─>
│ │ │ │ existing software treats text/xml as it treats application/xml
Jan 13 Ian Hickson (1.6K) │ │ │ └─>
│ │ │ │ (HTML4, HTML5, CSS) override both MIME and HTTP
│ │ │ │ defining behavior in HTTP likely to be ignored
│ │ │ │ use lower level (MIME) or app-level (XML, HTML, CSS, ...)
Jan 13 Eric Prud'homme (7.6K) │ │ │ ├─>Re: Re: Unknown text/* subtypes
│ │ │ │ │ consistent with Martin's assertion
│ │ │ │ │ what should the CRLF rules be?
│ │ │ │ │ when is "default" charset applied?
Jan 13 Ned Freed (7.5K) │ │ │ │ └─>
│ │ │ │ MIME specs cover some SMTP-specific stuff
│ │ │ │ CRLF rules strengthened recently
│ │ │ │ text/ defaults to UTF-8, then 8859-1 won't fly for mail
Jan 15 Frank Ellermann (1.1K) │ │ │ └─>Re: Unknown text/* subtypes
│ │ │ │ assume-Latin-1 rule ignored by everyone, including the W3C validator
Jan 15 Ian Hickson (0.6K) │ │ │ └─>
│ │ │ │ handling should be in [media-specific specs like] HTML
Jan 16 Frank Ellermann (0.9K) │ │ │ └─>
│ │ │ @@@ lost
Jan 04 Julian Reschke (2.4K) │ │ └─>Re: Unknown text/* subtypes
│ │ 2046 (text/* is US-ASCII), 2616 (text/* over HTTP is ISO8859-1), 3023 (text/xml is US-ASCII)
│ │ 2616 should get out of it
Dec 18 Eric Prud'homme (4.6K) │ └─>Re: Re: Request for review of Turtle (an RDF serialization) media type: text/turtle
│ turtle is the most human-readable RDF format
│ text/ is useless if we can't use it here
Dec 19 James Cloos (0.5K) └─>Re: Request for review of Turtle (an RDF serialization) media type: text/turtle
│ should unicode reference be to UCS (ISO 10646)?
Dec 20 Felix Sasaki (1.0K) └─>
│ see Charmod Referencing Unicode and Charmod C062
Dec 20 James Cloos (0.8K) └─>
│ IETF point of view is to prefer ISO over an industry organization
Dec 21 Felix Sasaki (1.1K) └─>
  Unicode provides additional semantics useful for implementers
Oct 15 Tim Berners-Lee (21) N-Triples MIME type should not be text/plain -- comment on RDF Test Cases.
│ propose text/rdf+n3 or text/rdf+n3; level=nt for NTriples
Oct 24 Graham Klyne (47) └─>Re: N-Triples MIME type should not be text/plain -- comment on RDF Test Cases.
│ +xml convention assumed common consumers could fallback to suffix
│ ntriples isn't principally human-consumable
Nov 03 Garret Wilson (71) └─>Re: N-Triples MIME type should not be text/plain -- comment on RDF Test Cases.
│ use of text/ is problematic RFC 4329
Nov 04 Garret Wilson (108) └─>Re: N-Triples MIME type should not be text/plain -- comment on RDF Test Cases.
│ background: 1 2
Nov 04 Graham Klyne (148) ├─>
│ │ argument against text/
│ │ history of +xml
Nov 04 Garret Wilson (27) │ ├─>Re: N-Triples MIME type should not be text/plain -- comment on RDF Test Cases.
│ │ recipe use case introducing application/recipe+rdf+n3 and application/config+rdf+n3
Nov 04 Graham Klyne (71) │ │ └─>
│ │ │ distinguishing between RDF super-languages not useful due to open content model
Nov 04 Garret Wilson (39) │ │ └─>Re: N-Triples MIME type should not be text/plain -- comment on RDF Test Cases.
│ │ having browser render application/recipe+rdf+n3 still useful
Nov 04 Garret Wilson (29) │ └─>Re: N-Triples MIME type should not be text/plain -- comment on RDF Test Cases.
│ why *NOT* render application/...+rdf+n3 ?
Nov 05 Dan Brickley (22) └─>Re: N-Triples MIME type should not be text/plain -- comment on RDF Test Cases.
│ note text/xml broken
Nov 04 Garret Wilson (61) └─>Re: N-Triples MIME type should not be text/plain -- comment on RDF Test Cases.
  *why* is it broken? default interpretation by the browser, and allowed/default encoding c.f. RFC2045 (MIME bodies) and RFC2046 (MIME media types)