See also: IRC log
Filed: https://www.w3.org/Bugs/Public/show_bug.cgi?id=18646
close ACTION-142
<trackbot> ACTION-142 File a ticket on HTML5 asking for accommodation of IANA time zone IDs closed
Prepare some examples of ITS2NIF showing NFC issue for consideration next week
felix: probably take a little while
richard: updated templates/boilerplates for articles
... now show live results from test framework database
... still doing some tweaks for maintainability
... if look at test page
<r12a> http://www.w3.org/International/tests/
richard: the list is a pain to maintain
... so removing links to list pages
... because results page does that work
<r12a> http://www.w3.org/International/tests/html-css/character-encoding/results-basics
richard: above is link to results page
richard: end date for charter renewal is close
... not a lot of response
... 9 currently
... would like more
... anyone on call with an AC Rep get them to do so
http://lists.w3.org/Archives/Public/www-international/2012JulSep/0072.html
<r12a> http://lists.w3.org/Archives/Public/www-international/2012JulSep/0076.html
http://www.w3.org/2012/08/15-i18n-minutes.html#item06
norbert: not actually another type to use for 6a. don't complain about it
addison: okay
>> 7. Section 2.6. These escapes are malformed or use a questionable syntax:
>>
>> The characters -, \uB7, \u300 to \u36F and \u203F to 2040 are permitted anywhere except the first character.
>
> + Use the U+XXXX or U+XXXXXX notation to refer to code points in the specification (rather than other escaped forms).
(also name the characters?)
addison: appears to be a "poor man's" attempt to prevent including combining marks
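As an illustration of the notation requested in comment 7, the following minimal sketch formats the code points quoted from Section 2.6 in U+XXXX form and looks up their character names. The helper names `u_plus` and `describe` are hypothetical and are not taken from any specification.

```python
import unicodedata


def u_plus(cp):
    """Format a code point in U+XXXX notation (at least four hex digits)."""
    return f"U+{cp:04X}"


def describe(cp):
    """Return the U+XXXX form followed by the Unicode character name."""
    name = unicodedata.name(chr(cp), "<no name>")
    return f"{u_plus(cp)} {name}"


# Code points quoted in the Turtle draft text under review.
for cp in (0xB7, 0x300, 0x36F, 0x203F, 0x2040):
    print(describe(cp), "category:", unicodedata.category(chr(cp)))
```

Printing the general category alongside the name shows which of the listed code points are combining marks (category Mn) and which are not.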
>> 8. Section 3. The reference to #xA; is making some assumptions about line terminators too :-)
>
> + Don't make assumptions about line terminator characters.
>
>> An example of two identical triples containing literal objects containing newlines, written in plain and long literal forms. Assumes that line feeds in this document are #xA. (example3.ttl):
>>
norbert: should define line terms
>> 9. Section 5.1. The following might need attention: >> >> The media type of Turtle is text/turtle. The content encoding of Turtle content is always UTF-8. Charset parameters on the mime type are required until such time as the text/ media type tree permits UTF-8 to be sent without a charset parameter. See section B Internet Media Type, File Extension and Macintosh File Type for the media type registration form.
okay as is
>> 10. Section 6. Refers to TURTLE documents as being encoded as UTF-8. In practice, UTF-8 is a serialization. The actually document should just be "a sequence of Unicode characters" This probably needs to distinguish more clearly between processing and storage/transmission. For the former it's just a sequence of Unicode characters, for the latter it's UTF-8.
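The distinction drawn in comment 10 can be sketched as follows: during processing the text is a sequence of Unicode code points, and UTF-8 only appears when that sequence is serialized for storage or transmission. The sample string is an assumption chosen for illustration.

```python
# U+00E9 LATIN SMALL LETTER E WITH ACUTE is written as an escape so the
# example does not depend on the encoding of this file.
text = "caf\u00e9"

# Processing view: a sequence of Unicode code points.
code_points = [f"U+{ord(ch):04X}" for ch in text]

# Storage/transmission view: the same text serialized as UTF-8 bytes.
encoded = text.encode("utf-8")

# Decoding the bytes recovers the identical code point sequence.
round_tripped = encoded.decode("utf-8")

print(code_points)
print(len(text), "code points,", len(encoded), "bytes")
```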
>> 11. Section 6.2. Says in part this: "continue to the end of line (marked by characters U+000D or U+000A)" which again makes assumptions about line terminators. Should there be a rule for line termination? Such as this? http://ecma-international.org/ecma-262/5.1/#sec-7.3
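To show why an explicit rule matters, here is a minimal sketch that splits text using a fixed set of line terminators. The set used (LF, CR, U+2028, U+2029, with CR LF treated as one terminator) is modeled on the ECMA-262 section linked above and is an assumption for illustration, not something the Turtle draft defines. The names `LINE_TERMINATOR` and `split_lines` are hypothetical.

```python
import re

# CR LF is listed first so that the pair counts as a single terminator.
LINE_TERMINATOR = re.compile("\r\n|[\n\r\u2028\u2029]")


def split_lines(text):
    """Split text on any of the line terminators defined above."""
    return LINE_TERMINATOR.split(text)


sample = "a\nb\r\nc\rd\u2028e"
print(split_lines(sample))   # explicit rule covers every terminator
print(sample.split("\n"))    # assuming only U+000A leaves stray CRs behind
```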
>> 12. Section 6.4. The \u (lowercase u) syntax allows: >> >> A Unicode codepoint in the range U+0 to U+FFFF inclusive corresponding to the value encoded by the four hexadecimal digits interpreted from most significant to least significant digit. >> >> This is probably wrong, given that the surrogate code points fall into this range. No mention is made of surrogate pair handling. And since there is a second form that can handle the complete Unicode charac
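The surrogate concern in comment 12 can be made concrete with a small decoder for the two escape forms. Rejecting surrogate code points outright is one possible policy, assumed here for illustration. The Turtle draft does not state it, and the names `ESCAPE` and `decode_escapes` are hypothetical.

```python
import re

# \uXXXX (four hex digits) or \UXXXXXXXX (eight hex digits).
ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_escapes(text):
    """Replace numeric escapes with characters, refusing surrogates."""
    def replace(match):
        cp = int(match.group(1) or match.group(2), 16)
        if 0xD800 <= cp <= 0xDFFF:
            # Surrogates are code points but not Unicode scalar values.
            raise ValueError(f"escape denotes a surrogate: U+{cp:04X}")
        if cp > 0x10FFFF:
            raise ValueError(f"escape is outside Unicode: {cp:#x}")
        return chr(cp)
    return ESCAPE.sub(replace, text)
```

Under this policy, a supplementary character must be written with the eight-digit form rather than as a surrogate pair of two four-digit escapes.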
>> 13. Section 6.4 contains this Note: >> >> -- >> %-encoded sequences are in the character range for IRIs and are explicitly allowed in local names. These appear as a '%' followed by two hex characters and represent that same sequence of three characters. These sequences are not decoded during processing. A term written as <http://a.example/%66oo-bar> in Turtle designates the IRI http://a.example/%66oo-bar and not IRI http://a.example/foo-bar. A term writte
richard: why do you think they do that?
addison: let's ask
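The behavior described in the quoted Note can be sketched with the IRIs it uses as examples: when %-encoded sequences are not decoded, the two spellings remain distinct strings, and they only coincide if a consumer decodes them. Using `urllib.parse.unquote` for the decoding step is an assumption made for illustration.

```python
from urllib.parse import unquote

written = "http://a.example/%66oo-bar"
plain = "http://a.example/foo-bar"

# Without decoding, the two IRIs are different sequences of characters.
print(written == plain)

# Only after percent-decoding do they become the same string.
print(unquote(written) == plain)
```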
>> 14. Section 6.5 (Grammar) defines LANGTAG far more permissively than BCP 47 does--even in its obsolete forms, to wit:
>>
>> [144s] LANGTAG ::= '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
>>
>> It would be better to define this at least in terms of BCP 47's "obs-language-tag" production:
>>
>> obs-language-tag = primary-subtag *( "-" subtag )
>> primary-subtag = 1*8ALPHA
>> subtag = 1*8(ALPHA / DIGIT)
>>
>> Since the RDF spec refers
norbert: RDF already references LANGTAG in Section 2.1 of BCP 47
... so why not use it?
addison: okay to refer to full definition
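The difference between the two productions quoted in comment 14 can be sketched with regular expressions transcribed from them. The function names are hypothetical, and neither pattern implements the full BCP 47 Language-Tag grammar.

```python
import re

# Turtle draft production [144s]: '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
TURTLE_LANGTAG = re.compile(r"@[a-zA-Z]+(-[a-zA-Z0-9]+)*")

# BCP 47 obs-language-tag: 1*8ALPHA followed by *( "-" 1*8(ALPHA / DIGIT) )
OBS_LANGUAGE_TAG = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")


def turtle_accepts(tag):
    """True if '@' + tag matches the Turtle draft LANGTAG production."""
    return TURTLE_LANGTAG.fullmatch("@" + tag) is not None


def obs_accepts(tag):
    """True if tag matches the obs-language-tag production."""
    return OBS_LANGUAGE_TAG.fullmatch(tag) is not None


for tag in ("en-US", "abcdefghijk", "en-averyverylongsubtag"):
    print(tag, turtle_accepts(tag), obs_accepts(tag))
```

The last two tags have a subtag longer than eight characters, so the Turtle draft pattern accepts them while the obs-language-tag pattern does not.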
>> 15. Same section. PN_CHARS_BASE erases various Unicode ranges without explanation. This appears to be an attempt to eliminate combining marks and the surrogates. This probably isn't how to do this?
http://www.w3.org/TR/2012/WD-turtle-20120710/#grammar-production-PN_CHARS_BASE
richard: do we have a list of this kind?
... use to refer to XML?
http://www.unicode.org/reports/tr31/#Default_Identifier_Syntax
norbert: what are they really trying to accomplish here? just identifiers? or something else?
addison: not sure if UTR31 is too restrictive??
... should we say "charmod-norm wrong"?
norbert: start with question: what are you trying to do here?
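If the intent behind the excluded ranges is indeed to filter combining marks and surrogates, one alternative is to test Unicode properties instead of hard-coding ranges. The following is a minimal sketch using the General_Category property. It is an illustration only, not a proposal from the meeting, and the function names are hypothetical.

```python
import unicodedata


def is_combining_mark(ch):
    """True for general categories Mn, Mc and Me."""
    return unicodedata.category(ch).startswith("M")


def is_surrogate(ch):
    """True for surrogate code points (general category Cs)."""
    return unicodedata.category(ch) == "Cs"


def acceptable_first_char(ch):
    """Hypothetical rule: a letter that is neither a mark nor a surrogate."""
    return unicodedata.category(ch).startswith("L")
```

Because the test is property-based, its result follows the Unicode version of the character database in use rather than a frozen list of ranges.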
>> 16. Appendix B contains this note:
>>
>> Encoding considerations:
>> The syntax of Turtle is expressed over code points in Unicode [UNICODE]. The encoding is always UTF-8 [UTF-8].
>> Unicode code points may also be expressed using an \uXXXX (U+0 to U+FFFF) or \UXXXXXXXX syntax (for U+10000 onwards) where X is a hexadecimal digit [0-9A-Fa-f]
>> 17. Appendix B contains security considerations, that reference UTR#36 (good). Should there also be reference to UTS#39??
http://www.unicode.org/reports/tr39/
>> 18. In "References", the Unicode reference is to Unicode 4.0, which is well out of date. The current version is 6.1, for example.
> W3C I18N Techniques: Developing specifications > Referencing the Unicode Standard > http://www.w3.org/International/techniques/developing-specs#unicoderef
norbert: but a specific version may be needed for implementation consistency, e.g. for identifier syntax
addison: but not restrict code point usage to a specific version
richard: note that above link doesn't go directly to charmod
... but to our techniques page
<scribe> ACTION: addison: move TURTLE comments to tracker and send the comments to the WG using the usual process [recorded in http://www.w3.org/2012/08/22-i18n-minutes.html#action01]
<trackbot> Created ACTION-146 - Move TURTLE comments to tracker and send the comments to the WG using the usual process [on Addison Phillips - due 2012-08-29].
richard: made a few changes in the process document, so take a look at it again
<scribe> ACTION: addison: send TPAC registration reminder [recorded in http://www.w3.org/2012/08/22-i18n-minutes.html#action02]
<trackbot> Created ACTION-147 - Send TPAC registration reminder [on Addison Phillips - due 2012-08-29].
<matial> Bye