I18n-Comments

From RDF Working Group Wiki

Status

I (ericP) believe that Addison Phillips's response closes these issues.

Analysis

Key:

  • [S] : also an issue in SPARQL
  • [F] : forward compatibility (new documents rejected by old parsers)
  • [B] : backward compatibility (old documents rejected by new parsers)
  • [E] : editorial

178: Ack [E] Reference IETF BCP 47 for language declaration

Refer to BCP 47 for langtags.

ericP: +1

gavinc: we do indirectly; we link to language tags as defined in RDF Concepts, which directly references BCP 47. +0

179: Nack [SF] Scope of document language label

No LANG directive to set default document language.

ericP: undecided

gavinC: -0.9 way too big a feature, very hard to use for a serializer, impossible to use for a streaming serializer

180: Ack [E] Various random character references should be more explicit

ericP: +1

gavinC: +1

181: Ack [E] Non-ASCII IRI example

ericP: +1

gavinc: +1

182: Nack [E] cultural relevance of examples

Good examples with relevant features are difficult to come up with.

183: Nack [E] Use of types questionable?

Asserts the year 2007 and the dollar amount 14074.2E9 should have datatypes.

Most scalar values have a datatype.

ericP: any proposals for unit-less numbers to replace the current example?

184: Ack Malformed escapes

The characters -, \uB7, \u300 to \u36F and \u203F to \u2040 are permitted anywhere except the first character.

ericP: I don't understand these codepoints enough to understand why they'd be illegal. ... ahh, it's just that "\uB7" isn't a convention we've been using.

185: Ack [E] Line terminator assumptions

The reference to #xA; is making some assumptions about line terminators

ericP: Apparently "Assumes that line feeds in this document are #xA" described the following example. I've re-worded to clarify that this assumption is not Turtle-wide or even Turtle-spec-wide.

187: Nack [SBF] escape syntax

use \u{xxxxx} instead of \uxxxx or \Uxxxxxxxx

191 also specifically proposes \Ux{6} (six instead of eight)

ericP: -0.5 could be worth it only if the rest of the world really likes that format.
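To make the compatibility cost concrete, here is a minimal Python sketch that decodes the current escapes and the proposed braced form. The function names and the 1-to-6 digit bound on the braced form are assumptions of this sketch, not something the comment or the Turtle grammar specifies.

```python
import re

# Current Turtle forms: \uXXXX (4 hex digits) and \UXXXXXXXX (8 hex digits).
CURRENT_UCHAR = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")

# Proposed form from comment 187: \u{X...}.
# The 1-6 digit bound is an assumption made for this sketch.
PROPOSED_UCHAR = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}")


def decode_current(text):
    """Replace \\uXXXX and \\UXXXXXXXX escapes with the characters they name."""
    return CURRENT_UCHAR.sub(
        lambda m: chr(int(m.group(1) or m.group(2), 16)), text
    )


def decode_proposed(text):
    """Replace \\u{X...} escapes with the characters they name."""
    return PROPOSED_UCHAR.sub(lambda m: chr(int(m.group(1), 16)), text)
```

A decoder written for the current forms does not recognize the braced form at all (here it simply leaves the text untouched; a real parser would reject the document), which is the [F] concern noted in the issue title.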

188: Nack [S] special handling of % in IRI

Why aren't %dd's de-escaped?
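One way to illustrate the question, as a sketch only: if a parser decoded %dd sequences, an IRI that was legal as written could turn into one containing characters the grammar forbids. The pattern below is a simplified approximation of Turtle's IRIREF production (it ignores UCHAR escapes), and the example IRI is hypothetical.

```python
import re
from urllib.parse import unquote

# Simplified stand-in for IRIREF: '<', any characters except
# controls, space, <, >, ", {, }, |, ^, ` and backslash, then '>'.
IRIREF = re.compile(r'<[^\x00-\x20<>"{}|^`\\]*>')

iri = "http://example.org/a%3Eb"   # hypothetical IRI with an escaped '>'
decoded = unquote(iri)             # 'http://example.org/a>b'

legal_as_written = IRIREF.fullmatch("<" + iri + ">") is not None
legal_after_decoding = IRIREF.fullmatch("<" + decoded + ">") is not None
```

Here `legal_as_written` is true and `legal_after_decoding` is false, because the decoded form contains a bare `>`.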

<>s and the like in IRIs require escaping.

189: Ack [S] reference obs-language-tag instead of defining your own

Use BCP47's "obs-language-tag" production in grammar. obs-language-tag looks like:

      obs-language-tag = primary-subtag *( "-" subtag )
      primary-subtag   = 1*8ALPHA
      subtag           = 1*8(ALPHA / DIGIT)

We could extend the XML expressivity to include {min,max} and incorporate obs-language-tag into the current grammar in the form:

 -[144s] LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
 +[144s] LANGTAG ::= '@' [a-zA-Z]{1,8} ('-' [a-zA-Z0-9]{1,8})*

This would also allow us to improve the production for UCHAR:

 -[27] UCHAR ::= '\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX
 +[27] UCHAR ::= '\u' HEX{4} | '\U' HEX{8}
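As a quick sanity check that the {n} form is only a shorthand, the following sketch translates both versions of the UCHAR production into Python regular expressions and compares them on a few samples. The sample strings are chosen for illustration.

```python
import re

HEX = "[0-9A-Fa-f]"

# Long-hand production: HEX repeated explicitly 4 or 8 times.
LONGHAND = re.compile(r"\\u" + HEX * 4 + r"|\\U" + HEX * 8)
# Short-hand production using {n} repetition counts.
SHORTHAND = re.compile(r"\\u" + HEX + r"{4}|\\U" + HEX + r"{8}")

samples = [r"\u00E9", r"\U0001F600", r"\u00E", r"\U0001F60", r"\u00G9"]

# Each entry: (sample, accepted by long-hand, accepted by short-hand).
results = [
    (s, LONGHAND.fullmatch(s) is not None, SHORTHAND.fullmatch(s) is not None)
    for s in samples
]
```

Both patterns accept the first two samples and reject the last three.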

Simply referencing obs-language-tag is a pain when folks try to synthesize a grammar, either for implementation or comprehension.

ericP: -.9 to reference, +1 to adopting {} notation and making Turtle more precise as currently mocked up in editor's draft notations for UCHAR and, most importantly, LANGTAG.
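The effect of the tighter LANGTAG production can be sketched by translating both versions into Python regular expressions. The helper name `accepts` and the sample tags are assumptions for illustration.

```python
import re

# Current production: subtags of any length.
OLD_LANGTAG = re.compile(r"@[a-zA-Z]+(-[a-zA-Z0-9]+)*")
# Proposed production: each subtag limited to 1-8 characters,
# following the shape of obs-language-tag.
NEW_LANGTAG = re.compile(r"@[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")


def accepts(pattern, tag):
    """Return True if the whole tag matches the given production."""
    return pattern.fullmatch(tag) is not None
```

Ordinary tags such as `@en`, `@en-US` and `@zh-Hant-TW` match both productions. A tag with a nine-character subtag, such as `@abcdefghi` or `@en-abcdefghi`, matches the current production but not the proposed one.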

190: Nack [S] attempting to erase combining marks?

Using XML4 productions leads us to prohibit combining marks and surrogates.

Shouldn't we prohibit surrogates?

191: Ack [S] Various nits in Appendix B

Comments on media type registration form:

  • covers docs and not in-memory representations -- RDF covers the abstract syntax.
  • The reference to U+0 should read U+0000 -- okidoke.
  • We recommend a different escape syntax altogether -- costs discussed in 187
  • We recommend six-digit rather than eight-digit \U representation -- that would probably break no deployed data.

192: [E] referencing Unicode

Obsolete reference to Unicode

ericP: anyone have the keys to update respec / bibref / biblio.js?

193: Nack [E] define when escapes are evaluated

Don't 7.2 and 6.3 cover that?