W3C

Internationalization Working Group Teleconference

21 Aug 2014

Agenda

See also: IRC log

Attendees

Present
Addison, Richard, JcK, Leandro, Mati, Koji, PLH_(guest)
Regrets
Felix
Chair
Addison Phillips
Scribe
Addison Phillips

Agenda

Action Items

http://www.w3.org/International/track/actions/open

close action-332

<trackbot> Closed action-332.

Info Share

leandro: coming to TPAC

richard: new community group
... looking for members
... character description group

<r12a> http://www.w3.org/community/cdl/

richard: describing characters based on strokes

JcK: ietf vs unicode issue about precombined marks suspended but not gone away

<JcK> Especially if it is to be an action item, that should be more like "new apparently precombined characters without decompositions" or words to that effect. A different version would be whether Unicode "characters

<JcK> are language and phonetically independent within a script (as described in Section 2.2 of the Standard) or the degree to which language, phonetic use, and other usage issues that cannot be detected by a reader with no context are significant.

RADAR

https://www.w3.org/International/wiki/Review_radar#Scheduled_Last_Call_reviews

Encoding LC

<r12a> http://www.w3.org/International/docs/encoding/encoding-doc.html#issue-26154

<r12a> http://www.w3.org/International/docs/encoding/encoding-doc.html

(addison summarizes)

richard: want to reach CR by 12 september
... what is outstanding?
... working on clarifying, etc.
... going through disposition of comments
... will help
... two sections
... first is the LC issues
... two raised in bugzilla, rest on winter list
... second section is bugzilla items
... mostly old
... anne not willing to work on the larger items
... he sees this as on-going project without specific deadlines
... don't know if we can defer

addison: don't know how we could correct later

richard: unicode violation text
... ken whistler, chief editor, said "none of these are violations"
... went further and suggested changes
... anne will change
... another thing causing worry
... things are not as clear cut on list of encodings and indexes
... what characters in what codepoints, etc.
... thought it came from survey
... of all browsers
... but seems to be based on older version of opera (12?)

<r12a> http://www.w3.org/International/tests/repository/encoding/indexes/results-indexes.en.php

addison: did discuss with Anne, many errors are of the same type

richard: above link is from Martin's tests, which I adapted
... for example 1253

http://www.w3.org/International/tests/repository/run?manifest=encoding/indexes&test=windows-1253_test

<r12a> windows 1253

<r12a> Firefox

<r12a> 0081 -> FFFD

<r12a> 0088 -> FFFD

<r12a> 008A -> FFFD

<r12a> 008C -> FFFD

<r12a> 008D -> FFFD

<r12a> 008E -> FFFD

<r12a> 008F -> FFFD

<r12a> 0090 -> FFFD

<r12a> 0098 -> FFFD

<r12a> 009A -> FFFD

<r12a> 009C -> FFFD

<r12a> 009D -> FFFD

<r12a> 009E -> FFFD

<r12a> 009F -> FFFD

<r12a> Chrome

<r12a> 00AA -> ª instead of FFFD

<r12a> IE

<r12a> 00AA -> ª instead of FFFD

<r12a> 00D2 -> F8FA instead of FFFD

<r12a> 00FF -> F8FB instead of FFFD

<r12a> windows 874

<r12a> Firefox

<r12a> 23 mismatches in rows 8 and 9

<r12a> Chrome

<r12a> 8 mismatches in rows D and F

<r12a> IE

<r12a> same as Chrome

<r12a> ibm866

<r12a> Firefox

<r12a> 001A -> 007F

addison: need to say what to do and get consensus on moving to it or allow variations

<r12a> 001C -> 001A

<r12a> 007F -> 001C

<r12a> Chrome

<r12a> replacement characters after 0080

<r12a> IE

<r12a> same as firefox
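
The mismatch lists above compare what a browser decodes against an expected index. A minimal sketch of that kind of comparison follows, assuming Python's built-in codecs stand in for both sides; they are not the Encoding spec indexes and not browser data. The names decode_table, mismatches and format_row are hypothetical and used only for illustration.

```python
# Sketch: list single-byte values on which two byte -> character tables disagree.
# Assumption: tables are built from Python's own codecs purely for illustration.

def decode_table(codec, start=0x80, stop=0x100):
    """Map each byte in [start, stop) to the character the codec yields.

    Bytes the codec cannot decode become U+FFFD (errors="replace").
    """
    return {b: bytes([b]).decode(codec, errors="replace")
            for b in range(start, stop)}


def mismatches(expected, actual):
    """Return (byte, expected_char, actual_char) for every disagreement."""
    return [(b, expected[b], actual[b])
            for b in sorted(expected)
            if expected[b] != actual[b]]


def format_row(byte, expected_char, actual_char):
    """Render one row in the style used in the lists above."""
    return f"{byte:04X} -> {ord(actual_char):04X} instead of {ord(expected_char):04X}"


if __name__ == "__main__":
    # Example pairing only: treat latin-1 as "expected" and ascii as "actual".
    for row in mismatches(decode_table("latin-1"), decode_table("ascii"))[:3]:
        print(format_row(*row))
```

To compare against a real index, the expected table would need to be loaded from the index data itself rather than from a second codec.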

richard: bit of a shock
... calls into question label choices

addison: need to survey the browsers?

JcK: claims of some of the wilder changes are troubling

addison: 1252 thing seems well established
... multibyte mappings require deeper insight into what is actually happening
... any actions?

richard: plh?
... one option, not the best, would be to deeply fork

addison: hard to fork because it forms a system

richard: looked for references
... mainly section 6
... utf-8
... a few things
... if just going to "normal" CR
... could then go back to LC if needed, if tests found issues

plh: could work, but can't change things ref by html5

richard: mainly indexes and mainly details
... check what css and html actually need

addison: more references now than just the big 2

richard: css cares only about determine encoding, I think
... define "indexes and such" as the "registry part"

addison: if only the registry part is subject to correction, won't cause problems to advance to CR and then correct later

plh: not a REC, so would work to fix if testing came back negative
... did we address enough of issues to move to CR

richard: probably yes

addison: may know we have some flaws (bugzilla bugs on mbcs)

plh: put notes in CR version to call attention to that
... on specific encodings

richard: don't know if IE will change as well?

plh: best way to get attention is to get CR

richard: obsolete word removal
... provided option, but anne didn't like

addison: 26514 looks like it could be deferred

plh: would agree that it could be deferred

addison: violation statement

"is not relevant"

<r12a> "For the purposes of specifications using this specification, that registry is obsolete"

<r12a> User agents actually use a subset of the IANA Character Sets registry in a particular way. This specification documents this to establish interoperability on the Open Web Platform. Specifications and applications using this specification must restrict themselves to the encodings as documented in this specification.

<r12a> Anne's preference: For the purposes of specifications using this specification, that registry is no longer relevant.

addison: could fork and replace or bin that paragraph

JcK: reuse of labels with different meanings is more problematic
... hard to tell what an implementation is doing when it sees a given label
... which standard it follows

richard: already a real problem

plh: not looking for convergence... looking for everyone to use UTF-8

<plh> "Authors must use the utf-8 encoding and must use the ASCII case-insensitive "utf-8" label to identify it. "
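
The "ASCII case-insensitive" matching in the quoted text can be sketched as below. This is a minimal illustration, not the spec's algorithm verbatim: the label set is a deliberately small subset, trimming ASCII whitespace before matching is an assumption, and the names normalize_label and is_utf8_label are hypothetical.

```python
import string

# ASCII-only lowercasing: maps A-Z to a-z and leaves every other character
# alone. str.lower() is avoided because it also folds some non-ASCII
# characters (for example U+212A KELVIN SIGN becomes "k").
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

# Illustrative subset only; the full label list lives in the Encoding spec.
UTF8_LABELS = {"utf-8", "utf8"}

# Tab, line feed, form feed, carriage return, space.
ASCII_WHITESPACE = "\t\n\x0c\r "


def normalize_label(label):
    """Trim ASCII whitespace and lowercase ASCII letters only."""
    return label.strip(ASCII_WHITESPACE).translate(_ASCII_LOWER)


def is_utf8_label(label):
    """True if the label names utf-8 under ASCII case-insensitive matching."""
    return normalize_label(label) in UTF8_LABELS
```

With this sketch, is_utf8_label("UTF-8") holds, while a label containing a non-ASCII look-alike is left unchanged by normalization and so does not match.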

addison: mention actual goal (UTF-8) in preface?

<r12a> New protocols and formats, as well as existing formats deployed in new contexts, must use the utf-8 encoding exclusively. If these protocols and formats need to expose the encoding's name or label, they must expose it as "utf-8".

richard: close remaining
... press on with tests
... get anne to do edits
... and work on false statement
... move towards CR

JcK: refocus the preface on moving to UTF-8 and the rest of this is legacy compatibility

richard: "is no longer relevant"

<scribe> ACTION: richard: kick off additional discussion with anne about preface wording [recorded in http://www.w3.org/2014/08/21-i18n-minutes.html#action01]

<trackbot> Created ACTION-333 - Kick off additional discussion with anne about preface wording [on Richard Ishida - due 2014-08-28].

JcK: will suggest new text in the next hour

richard: will look into CR

AOB?

<r12a> oh, one thing i forgot to mention in the infoshare - we published an updated WD of Predefined Counter Styles today !

Summary of Action Items

[NEW] ACTION: richard: kick off additional discussion with anne about preface wording [recorded in http://www.w3.org/2014/08/21-i18n-minutes.html#action01]
 
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2014/08/27 17:38:58 $