16:13:30 RRSAgent has joined #i18n
16:13:30 logging to http://www.w3.org/2007/11/09-i18n-irc
16:13:37 amit has joined #i18n
16:13:44 http://www.w3.org/html/wg/html5/#determining0
16:13:49 http://www.whatwg.org/specs/web-apps/current-work/multipage/section-parsing.html
16:13:56 http://lists.w3.org/Archives/Public/public-i18n-core/2007OctDec/0088.html
16:13:58 16:13 -!- Irssi: Join to #i18n was synced in 0 secs
16:13:58 16:13 < Hixie> http://www.whatwg.org/specs/web-apps/current-work/multipage/section-parsing.html#parsing
16:14:01 16:13 < Hixie> http://www.whatwg.org/specs/web-apps/current-work/multipage/section-parsing.html#the-input0
16:14:57 ScribeNick: fantasai
16:15:11 Addison: THere was a badly-titled thread saying something about making windows-1252 the default encoding.
16:15:29 Addison: Our first reaction was, wouldn't it be nice if that were something else, say utf-8
16:15:47 s/THere/There/
16:15:56 Addison: At the same time we recognize that there's a legacy encoding issue, since previous versions of HTML required iso-????
16:16:14 apppp has joined #i18n
16:16:16 hsivonen has joined #i18n
16:16:23 http://hsivonen.iki.fi/charmod-checking/
16:16:28 http://hsivonen.iki.fi/charmod-norm-checking/
16:16:51 smedero has joined #i18n
16:16:56 Addison: If you actually look at the sections, 8.2 and ....
16:17:09 Addison: It does not in fact say that the default encoding of the universe at large is windows 1252
16:17:23 Addison: In the sequence there's looking at byte sequences, then using heuristics, etc.
16:17:33 Philip has joined #i18n
16:17:37 Addison: at the end of that sequence there's a paragraph that says
16:17:52 MikeSmith has joined #i18n
16:17:55 Addison: if all else fails, you have to supply some implementation-defined default and we recommend you do these things.
16:18:21 Addison: And windows-1252 just appears out of nowhere.
16:18:48 Addison: One thought we had was for us to provide some information on why windows-1252 is preferable and how it differs from the standard ISO encodings.
16:19:21 plh has joined #i18n
16:19:27 "
16:19:28 When a user agent would otherwise use the ISO-8859-1 encoding, it must instead use the Windows-1252 encoding."
16:19:46 Henri: that part is a violation of charmod
16:19:54 Addison doesn't consider that a violation of charmod
16:20:35 Addison: There are superset encodings and they're often tagged with the subset encodings.
16:21:13 Addison: using the superset interpretation doesn't conflict with using the subset interpretation
16:21:31 Addison: We're not proposing a substantive change, just providing more justification for what you're doing.
16:21:48 Addison: We also looked at the structure of the paragraph, and had some concerns.
16:21:56 Addison: one was the phrasing of "western demographics" etc
16:22:09 Addison: We had several reactions.
16:22:22 Addison: One, it's not clear what a western demographic is and how you tell when you're talking to one on the internet.
16:22:36 Addison: We proposed 2 things, one of which was to turn two things around.
16:22:51 Addison: We have a love of utf-8, and we'd like you to mention that one first and then the legacy thing
16:22:51 amit has joined #i18n
16:23:13 Addison: We also think the wording could be changed somewhat on the windows-1252 to say that "in a legacy context, if you have to guess, you should guess this one"
16:23:59 Ian: I haven't gotten to that issue yet, haven't looked at it in detail, sounds ok
16:24:07 Richard: Is it purely editorial?
16:24:18 Addison: It doesn't change the result, it just changes how you explain the result.
16:24:32 Ian: Do you have any recommendation for dealing with say Japan and other parts of East Asia?
16:25:15 Addison: There are a variety of things in step #7 that allow for various heuristics and sniffing.
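The superset relationship Addison describes can be seen by decoding the same bytes both ways. The following is a minimal sketch, assuming Python's built-in codecs as a stand-in for a browser's decoder; the sample bytes and the helper name `has_c1_controls` are made up for illustration and do not come from the HTML5 draft.

```python
# Illustrative sample: curly quotes and a euro sign live in the 0x80-0x9F
# range, which is exactly where the two encodings disagree.
sample = b"\x93quoted\x94 caf\xe9 \x80"

as_iso = sample.decode("iso-8859-1")       # strict ISO-8859-1 reading
as_win = sample.decode("windows-1252")     # superset reading


def has_c1_controls(text: str) -> bool:
    """Return True if text contains C1 control characters (U+0080..U+009F)."""
    return any(0x80 <= ord(ch) <= 0x9F for ch in text)


# Bytes at 0xA0 and above (here 0xE9, "e" with acute) decode identically.
same_upper_range = b"caf\xe9".decode("iso-8859-1") == b"caf\xe9".decode("windows-1252")

print(repr(as_iso))   # contains invisible C1 control characters
print(repr(as_win))   # contains printable quotes and a euro sign
print(has_c1_controls(as_iso), has_c1_controls(as_win), same_upper_range)
```

Under these assumptions the two readings only diverge on bytes 0x80 to 0x9F, which ISO-8859-1 maps to control characters that are rarely intended in web content. Note that Python's windows-1252 codec leaves a few bytes in that range (for example 0x81) undefined and raises an error on them, so a real decoder would need its own policy for those.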
16:25:36 Ian: windows-1252 is fine for US and UK, but what about other places?
16:25:41 Felix: Depends on what device.
16:26:18 Addison: Most implementations use information in the browser, e.g. what the browser uses or if a narrower auto-detect is set (as for Japanese)
16:26:44 Ian: So in the Japanese cases, you expect that the rest of the steps would take care of it?
16:26:57 Addison: I think you'd trap those encodings before you get to step 7(?)
16:27:30 Addison: Might want to mention that in some cases of getting a subset encoding to use the superset encoding.
16:27:38 Addison: I think we can provide that information.
16:28:07 Ian: I believe when I wrote that section that I checked a browser and that was the only mapping they had.
16:28:24 Addison: Most browsers don't just do GBK, but do ????
16:28:46 Addison: There are some cases, such as in Japan, where the byte patterns are completely different.
16:28:57 Addison: where the encoding schemes are different even though the charset is the same
16:29:06 Addison: that kind of autodetection is a separate thing
16:29:11 Addison: I think this is still valid.
16:29:36 Addison: The only question I have is, if you're thinking "what should happen in step 7" is some language-dependent or context-dependent thing ...
16:30:01 Hixie: In this final step, you don't have any information from the content
16:30:19 Addison: You might want to think about splitting step 7 and doing a utf-8 detection first
16:30:50 Addison: UTF-8 has recognizable byte patterns, it would be great to put that first before saying "use your favorite legacy encoding"
16:31:13 Hixie: The concern is what happens if the user enters some bytes into the form and then submits it?
16:31:24 Addison: We were just looking at that in the i18n working group
16:31:44 Hixie: We'd have to make sure that that's what the server was expecting.
16:31:54 Felix asks a question
16:32:11 Hixie: Typically different localizations of the browser have different default encodings.
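Addison's suggestion of trying UTF-8 detection before falling back to a legacy default could look roughly like the sketch below. It assumes that "recognizable byte patterns" means the bytes pass a strict UTF-8 decode; the function name `guess_encoding` and its `legacy_default` parameter are hypothetical, and this is not the algorithm in the HTML5 draft.

```python
def guess_encoding(data: bytes, legacy_default: str = "windows-1252") -> str:
    """Guess an encoding when no other information is available.

    Assumption for this sketch: valid UTF-8 is treated as UTF-8, and
    anything else gets the implementation-defined legacy default.
    """
    try:
        # Strict decoding rejects stray continuation bytes, overlong
        # forms, and truncated sequences.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return legacy_default
    return "utf-8"


utf8_page = "caf\u00e9".encode("utf-8")   # bytes 63 61 66 c3 a9
legacy_page = b"caf\xe9"                   # lone 0xE9 is not valid UTF-8

print(guess_encoding(utf8_page))
print(guess_encoding(legacy_page))
print(guess_encoding(legacy_page, legacy_default="shift_jis"))
```

The `legacy_default` parameter stands in for the locale-dependent default Hixie mentions. Pure ASCII input also passes the UTF-8 check, which is harmless for decoding but, per Hixie's concern, could matter if the guess also determines the encoding used for form submission.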
16:32:41 s/asks a question/what information are you looking at to guess what encoding the user applies?/
16:33:21 Hixie: well, the email's in my pile. I don't know when I'll get to it.
16:33:41 Addison: We'll look at superset encodings and try to write up a document that you can reference.
16:34:13 Introductions
16:34:21 Richard Ishida: W3C Internationalization Lead
16:34:28 Anne van Kesteren: Opera Software
16:34:42 Elika: fantasai, CSSWG Invited Expert, works on international text layout
16:34:56 Addison Phillips: Yahoo, i18n wg
16:35:13 Amit Parashar: something-or-other chair
16:35:23 Henri Sivonen: working on HTML5 conformance checker
16:35:27 Ian Hickson: HTML5 editor
16:35:51 Felix Sasaki: i18n Core, i18n ITS and Web Services Policy WG [W3C]
16:36:32 Philippe Le Hegaret: W3C, Architecture Domain (XML, Web Services, i18n), and Video
16:36:49 Ishida: Can you explain the alt text issue?
16:36:50 Najib Tounsi, W3C Morocco Office Mgr.
16:37:08 Ishida: We believe that you should never put human-readable text in an attribute value because you can't put markup in it
16:37:30 Ishida: which is important for various i18n reasons: bidi, language annotation, ruby, etc.
16:38:00 Hixie: We still have the element; we can't get rid of it. It still has alt attr, because it's had that.
16:38:12 Hixie: We can't give it content because HTML parsers all close it right after the start tag.
16:38:25 Hixie: We also have the tag, which has full fallback capabilities.
16:38:37 Ishida: Would the group advise the tag then?
16:39:08 Hixie: I don't think we'll have a recommendation one way or another; if your fallback content needs element content, then you'll have to use
16:39:27 Hixie: We've been doing some work, e.g. Acid2, on making sure the tag works properly in various browsers.
16:39:49 Ishida asks about some XHTML2 stuff
16:40:12 Hixie: The XHTML2 group did two things, one was switching some attributes into elements, e.g. title attributes.
16:40:27 Hixie: Then they also went and started using rdf for everything: we are certainly not going to do that.
16:40:46 Hixie: For the first one, I'm not convinced that the benefits of using an element for these things outweigh the costs
16:41:06 Hixie: We can try not to do things like that in the future though
16:41:33 Hixie: This problem comes up in many places, e.g. in DOM APIs that take a string.
16:41:49 Hixie: There are also places where we can't make such changes, such as the element
16:42:03 <fantasai> Hixie: whose content winds up in places like filenames where you can't have structured markup anyway
16:42:29 <fantasai> Ishida: Can you use bidi in filenames?
16:43:05 <fantasai> Hixie: probably, but I'm not going to recommend it
16:44:17 <fantasai> Ishida: We might need to start thinking about how to convert text from markup to strings with bidi control characters.
16:44:35 <anne> (I think HTML 5 should get &rlo;, &lro;, and &pdf; (or something in that direction) for BiDi. These are already in IE.)
16:44:58 <fantasai> Hixie: We did consider having a DOM attribute that would pull out e.g.
bidi control characters from the markup and alt text from images
16:45:13 <fantasai> Hixie: not sure where that's going
16:45:34 <fantasai> Hixie: I would recommend finding solutions for plaintext, since that will work for both
16:46:58 <fantasai> Discussion of that
16:47:18 <fantasai> language tags are in Unicode, but were deprecated as soon as they were added: they were added as deprecated and should never be used
16:48:37 <anne> (even though the characters they map to are apparently deprecated)
16:49:13 <fantasai> discussion of markup-plaintext thing
16:49:44 <apppp> reference RFC 3066 should point to BCP 47
16:51:01 <fantasai> Addison notes that the i18n group needs to review the date parsing things
16:51:06 <najib> +1 for adding &rle, ..., &pdf; in HTML
16:51:52 <fantasai> Henri notes that it's using ISO dates anyway
16:52:00 <jgraham_> jgraham_ has joined #i18n
16:52:09 <fantasai> najib, if we're adding more entities I want &zwsp;
16:52:11 <fantasai> :)
16:52:56 <najib> It depends on usage frequencies. :-)
16:53:32 <fantasai> Topic: Validator checking entity reqs
16:53:40 <smedero> smedero has left #i18n
16:54:51 <fantasai> Henri: I don't check that character entities are only used for characters that are unclear.
16:55:02 <fantasai> Henri: because I can't tell mechanically whether the character is unclear
16:57:05 <anne> fantasai, I think &zwsp; is also supported by IE
16:57:23 <fantasai> cool
16:57:30 <fantasai> let's add it :P
16:57:40 <fantasai> all the characters next to it have names,
16:57:44 <fantasai> zwnj, zwj etc
16:57:54 <najib> I don't have IE on MacOS :-( & :-)
16:58:59 <fantasai> Ishida explains that this part of charmod is about best practices
16:59:16 <fantasai> it's not should in the normative sense
17:00:48 <fantasai> Elika: Maybe you should go through the document and change the wording of should sentences that don't match RFC2119 to something else
17:01:13 <fantasai> Ishida: Well, we mean it that way for authors.
Maybe we need to create different classes and explain which recommendations apply to which
17:01:45 <fsasaki> http://hsivonen.iki.fi/charmod-norm-checking/
17:02:04 <amit> amit has joined #i18n
17:02:08 <fantasai> Henri: I documented which constructs in HTML5 result in a continuous string
17:02:48 <fantasai> Henri: I don't have any other comment there except that I wrote this and it is available :)
17:03:44 <fantasai> Henri: I have another comment, but it's targeted at the unicode/icu specs
17:03:58 <fantasai> Ishida: Might want to psot to the unicode list
17:04:02 <fantasai> s/psot/post/
17:04:20 <apppp> RRSAgent, set logs world-visible
17:04:25 <apppp> RRSAgent, make minutes
17:04:25 <RRSAgent> I have made the request to generate http://www.w3.org/2007/11/09-i18n-minutes.html apppp
17:05:16 <apppp> Title: I18N / HTML5 break out session
17:05:29 <apppp> Scribe: fantasai
17:05:35 <apppp> ScribeNick: fantasai
17:05:45 <apppp> RRSAgent, make minutes
17:05:45 <RRSAgent> I have made the request to generate http://www.w3.org/2007/11/09-i18n-minutes.html apppp
17:08:42 <apppp> Present: Anne, Addison, Richard, Hixix, Fantasai, Najib, Amit, Philippe, Hsivonen, J.Graham
17:08:49 <apppp> RRSAgent, make minutes
17:08:49 <RRSAgent> I have made the request to generate http://www.w3.org/2007/11/09-i18n-minutes.html apppp
17:09:18 <apppp> s/Hixix/Hixie
17:09:24 <apppp> RRSAgent, make minutes
17:09:24 <RRSAgent> I have made the request to generate http://www.w3.org/2007/11/09-i18n-minutes.html apppp
17:09:44 <apppp> Chair: Addison Phillips (I18N)
17:09:53 <apppp> Regrets: none
17:09:59 <apppp> RRSAgent, make minutes
17:09:59 <RRSAgent> I have made the request to generate http://www.w3.org/2007/11/09-i18n-minutes.html apppp
17:10:55 <apppp> Meeting: I18N / HTML5 Break-Out
17:11:01 <apppp> RRSAgent, make minutes
17:11:01 <RRSAgent> I have made the request to generate http://www.w3.org/2007/11/09-i18n-minutes.html apppp
17:11:31 <jgraham_> jgraham_ has joined #i18n
17:11:49
<jgraham_> jgraham_ has left #i18n
17:13:03 <apppp> apppp has left #i18n
17:24:39 <Philip> Philip has left #i18n
18:28:48 <najib> najib has joined #I18N
18:42:26 <najib> najib has joined #I18N
18:42:42 <r12a> r12a has joined #i18n
18:42:54 <r12a> we're on our way
18:43:10 <r12a> just talking with Ian Jacobs about note vs Rec stuff
18:45:31 <anne> anne has left #i18n
18:46:03 <anne> anne has joined #i18n
18:55:37 <fsasaki> fsasaki has joined #i18n
19:10:34 <aphillip_> aphillip_ has joined #i18n
19:10:48 <fsasaki> http://lists.w3.org/Archives/Public/public-i18n-core/2007OctDec/index.html
19:11:07 <fsasaki> http://lists.w3.org/Archives/Public/public-i18n-core/2007OctDec/0071.html
19:11:14 <fsasaki> http://www.w3.org/2007/10/30-core-minutes.html#item08
19:11:32 <fsasaki> <scribe> ACTION: Addison to write a note for consumption by XML Core expressing our support for extending XML 1.0 [recorded in http://www.w3.org/2007/10/30-core-minutes.html#action03]
19:14:30 <fsasaki> Dear XML Core Working Group,
19:14:31 <fsasaki> the i18n Core Working Group looked at your discussion about XML 1.0 and XML 1.1. [1]. We discussed this topic recently [2] and hereby would like to express our support for extending XML 1.0 in the way you described it at [1].
19:14:33 <fsasaki> On behalf of the i18n Core Working Group,
19:14:34 <fsasaki> Felix Sasaki
19:14:36 <fsasaki> [1] http://lists.w3.org/Archives/Public/public-i18n-core/2007OctDec/0071.html
19:14:37 <fsasaki> [2] http://www.w3.org/2007/10/30-core-minutes.html#item08
19:43:00 <aphillip_> aphillip_ has left #i18n
19:49:42 <hsivonen> hsivonen has left #i18n
20:33:23 <aphillip_> aphillip_ has joined #i18n
20:46:18 <chaals> chaals has joined #i18n
20:46:56 <aphillip_> we're around
20:47:00 <aphillip_> in our room
20:47:07 <aphillip_> shall we wander past?
20:48:00 <chaals> in 20 minutes?
20:48:19 <aphillip_> sure... us come to you? you're in the webapi room, right?
20:49:11 <chaals> yes please.
cambridge a
20:49:16 <aphillip_> done see you in 20
20:49:28 <chaals> thanks
21:08:37 <fsasaki> fsasaki has joined #i18n
22:53:12 <chaals> chaals has left #i18n