Minute-taker: Konstantin Kalaitzidis

MINUTES - START: 8:45am - MARTIN opened meeting

CONFIDENTIALITY
Martin said the meeting is open/public. If you have something that you do not want others to know, please do not say it.

GOALS OF THIS MEETING
1. Why is this important?
2. Why should the W3C work on this?
3. How can your organization help?

9:00am - MARTIN: Reviewed Agenda

9. AUDIENCE QUESTION: (UNKNOWN SPEAKER) - You think we can finish today?
-- MISHA RESPONSE - purpose of this workshop is to get started ...and see what we can finish in the next 2 years or so.
-----------------------------------------------------------------------------------------------------------------
10. INTRODUCTIONS - went around room, clockwise.
1. Judy Brewer: Web Accessibility - increase accessibility of the web
2. Barbara Tillett: Library of Congress - Multilingual/multicultural interoperability
3. Cathy Wissink: Microsoft - focus on cultural/locale implementation models. I18n of Domain Names
4. Tex Texin: Progress Software - noted overlap on papers. Interested in best practices of i18n/l10n.
5. Yves Savourel: Interested in making sure l10n tools can work together.
6. Shigemichi Yazawa: GlobalSight: interest in i18n, l10n, Unicode. Involved in ITS group - trying to define a way to localize the XML document.
7. Edgardo A. Luzcando: VERIZON: Learn more
8. Suzanne Topping: Bizwonk Inc - process improvement. Prof Assoc of L10n - one of the founders.
9. Konstantin Kalaitzidis - IT Infrastructure i18n
10. Randall K. Barry: Library of Congress - character set work.
11. Debasish Banerjee: IBM: Websphere group. No robust architecture for a heterogeneous environment and web services.
12. Sumanth Muthyala: Computer Corp: How should we design tools for i18n?
13. Garrey Learmonth: Cisco Systems - IT Architect: IT Infrastructure i18n. Current state of i18n. Best practices.
14. Teresa Mulvihill: Tech Writer. New to W3C and XML. Learn and develop standards for XML, XSL, and Cascading Style Sheets
15. Etan Wexler: Unaffiliated - Priority is education and evangelism to Computing Professionals
16. Richard Ishida: XEROX - Global Design Consultant: Interested in everything.
17. Misha Wolf: REUTERS - Interested in standards across the board. Finding out what it is that we refer to as i18n. Up to us to define what new workgroups we need. Also, these will need chairs. Specific interest in the education and outreach. Also web accessibility, since there is so much overlap. How do we do things right? Hope to develop program that works.
18. Mike Ksar: Microsoft: In this business for about 40 years. Main interest is to assess where the new W3C is right now. Character model, and how it was received by the rest of the W3C community. Interested in everything that comes out of this workshop. Need to see what focus is ... we cannot do everything. Also see how we work together with other groups. Coordinate fragmented efforts. We need to take giant steps to increase the coordination and interest in getting trust heightened by various people working in this area. Ask what can you volunteer on?
19. Sue Ellen Wright: Ken? State University - Training next generation of multilingual experts. Chair of ANSI ...? Point person for language codes. Has to do with expansion of language codes to meet need of IT industry. ISO 639 standards. Interested in coordination and not duplication.
20. Jennifer DeCamp: Foreign Language Tech Research at MITRE - advise US government on foreign languages. Also teach multilingual web site class. What types of issues should we be talking about?
21. Peter Constable: SIL International - 70 countries, 1100 languages. Much interaction with people in the academic sector. Work with UNESCO. Find solutions for those who need to work with non-roman scripts. ISO TC37. Looking for the right venues for addressing these issues.
22. John Jenkins: APPLE: Utah. Part of the International Text and Graphics group. Represent Apple at UTC. Main concern is internationalized Domain Names.
Determine when 2 domain names are the same.
23. Sam Sun: Not for profit organization. Hear W3C recommendations and guidelines.
24. Tony Graham: Sun Microsystems. Look into XSL solutions.
25. Vincent Quint: W3C - Formats, StyleSheets. Coordinate all activities. Understand what the community is expecting from W3C. Involved in implementing whatever we decide here.
26. Arnold Winkler: UNISYS - 3 hats. Trying to kill one of the groups. Programming languages and sorting. I18n Evangelist at UNISYS. Responsibility for accessibility. Try to make sure that we do not spread our resources too thinly. Put the right work into the right group, and do the job well.
27. Katsuhiko Momoi: NETSCAPE: Look at what this group is doing. Interested in Open Source Development. Ambiguities will negatively affect our work. Need well-defined standards. Taught Japanese language for years. Eliminate ambiguities, and come to consensus.
28. Asmus Freytag: Technical VP of Unicode Consortium. Represents Unicode Consortium here. Interests are focused on character set standardization, and anything intertwined with that. Unicode not only works on localization, but also related areas. Also look out for issues that would fit in the Unicode Consortium. Make sure W3C and Unicode do not duplicate efforts.
29. Francois Yergeau: ALIS Technologies - Interest is anything language-related and computers. Guidelines for multilingual sites. Localizing web sites and apps. In domain of Outreach, W3C could do something by being more international and more multilingual by first applying these to itself.
30. JUDY BREWER: Worked for a number of years in accessibility. Share what we learned in my area. Also learn from what you are exploring. Hope that our groups will work more closely together.
31. Chris Lilley - W3C:

END OF INTRODUCTIONS
-----------------------------------------------------------------------------------------------------------------
PRESENTATION #1
TITLE: WEB ACCESSIBILITY
PRESENTER: JUDY BREWER (JB)
ADDITIONAL INFORMATION - See presentation

Notes from presentation:
1. Factors for accessibility. Promote awareness of our work.
2. Authoring Tool Accessibility - Greatest potential to improve accessibility.
3. XIG-XAG: if there is an XML I18n Guidelines & XAG, then the acronym is XIG-XAG :-)

QUESTIONS FROM THE AUDIENCE

QUESTION 1 (UNKNOWN SPEAKER) - 3 parts to my question:
- Tell us about the number of resources you require to do this?
- Your project is well integrated - interested about process that produced this.
- OUTREACH - How did you get people to address this?
(JB) - Most sponsored by W3C. Some outside sponsorship - Government of Canada, European Commission. Also have Corp Sponsors (funding). Amounts of sponsorship vary quite a bit. Currently we have between 5-7 W3C staff. Also, we have participation from invited experts. Going to different groups and looking for participation.

QUESTION 2. (SUZANNE) - Info on the IMPACT matrix.
(JB) - Draft done by the I18n Content Guidelines.

QUESTION 3. (MIKE) - Do you have any metrics from people who have implemented some of these guidelines?
(JB) - Varies according to the guideline. Most info is on the User Agent Accessibility guidelines. Dialog with these Developers have a fair amount of knowledge that went into this implementation. Canada implemented guidelines. EU adapted guidelines quite a bit. US lagging. Hong Kong, Japan looking at implementation targets. Also working with multinationals. Cost Factors is a document that covers some of this.

QUESTION 4. (UNKNOWN SPEAKER) - Are you working with the other committees?
(JB) - Worked on 2 Fed Advisory Committees. Part of our responsibilities to follow standards around the world, not just in the US. Starting to look at processes in Japan.

QUESTION 5. (FRANCOIS) - Puzzled by your statement "do not try to generalize too much"
(JB) - Felt that in some cases lost too much to go from abstract to a universal Design. Keep scope fairly consistent.

QUESTION 6. (UNKNOWN SPEAKER) - Within the EU, are UK and France interested in implementing your guidelines?
(JB) - Ministerial committee in 2000 - adopting guidelines. Several countries have come closer to our guidelines. Other countries have not done enough.

QUESTION 7. (MISHA) - Have you thought about influencing computer class curriculums?
(JB) - To some extent. We have released a white paper on this. President Clinton received commitments from 25-40 universities to follow these guidelines, but nothing happened.

END OF PRESENTATION #1 AND QUESTIONS
-----------------------------------------------------------------------------------------------------------------
PRESENTATION #2 - PRESENTATION: Int'l DTDs
PRESENTER: Richard Ishida
ADDITIONAL INFORMATION: See presentation.

Minute-taker: Garrey Learmonth

Richard Ishida - Guidelines for DTD design.
===============================
Susan - Naming approach for guidelines ?
Richard - none.
Sue Ellen - What progress with ITS work ?
Richard - Website on egroups for ITS reqs - not much movement.
Unknown - What extent is L10N using XML ?
Richard - Developers of DTD's
Tony G - How useful are attributes/emphasis types to non-authors ?
Richard - n/r
Francois - Addressing content as opposed to wider context e.g. site design etc
Richard - Agreed.
Unknown - UI focus ?
Richard - Prioritization needed.
Unknown - Categorization of L10N issues needed.
Richard - n/r
Unknown - Guideline document - what type ?
Richard - XML, XSL, HTML, Javascript.
Unknown - How many docs in Xerox in XML ?
Richard - More and more.
Peter C - Tagging guidelines wrt fine typography/rendering issues ?
Richard - n/r
Martin - Add to general guidelines or separate ?
General - Yes

Yves S - Localization Formats
======================
Randy - Transliteration an issue ?
Yves - Never seen it coming up.
Tex - Does W3C need to work on this if other communities involved ? Likelihood of success ?
Yves - Yes since more influence and more generic viewpoint.
Tex - Process definition needed + communities
Yves - ITS : no real progress since no supporting infrastructure.
Unknown - TMX : can be more than for extending translation memory e.g.
Yves - no response
Unknown - Coordinate OLAF/TMX categories.
Yves - no response
Martin - W3C role to get basic infrastructure correct.

Peter Constable - Outstanding Issues in L10N
=================================
Chris - Where is language/dialect division ?
Peter - Various sub-divisions.
Unknown - Can industry trust ISO ?
Peter - n/r
Unknown - Question wrt what *is* a locale ? Also web services issues wrt no direct end user e.g. propagation of locale etc ?
Peter - Definition of params needed
Susan - W3C to define user pref model.
Peter - n/r
Unknown - How should language codes be defined wrt variations on each language ?
Martin - tbd
Tex - Issue is a problem of identifying attributes of L10N behaviour - worry about naming later. Not just a language problem - identifying core languages vs numerous local dialects.
Peter - Many languages not including dialects needed and are not currently addressed. No need to know local dialects simply how to represent.
Tex - Industry requirement is no. 1 issue vs needs of linguists.
Martin - Usage scenarios to be identified.

Judy Brewer WAI
=============
Susan - Education/outreach activities ?
Judy - Yes
KK - Define WAI
Judy - Needs refinement/clarification - see website.
Edgardo - Exchange of L10N data - centralized or smart clients ?
Martin - All websites in all languages or what ?
Barbara T - Building blocks for the semantic web ?
Susan - Minority languages ?
Tex - Who is influencing what ?
Chris - wrt returning L10N info etc - possible privacy issues etc.
Jennifer DeCamp - Are there elements of foreign language capabilities to be included in 508 WAI reqs ?
Judy - W3C not policy making body.

Minute-taker: Etan Wexler

I've marked certain portions of the text as notes from me:
m/\[[^\]]+--EW]/
Certain portions are questioned transcription:
m/\[\?[^\]]+]/
A few attributions are questioned:
m/^([A-Z]{2}|\?+)\?:/

Cast:
DB: Debasish Banerjee
RB: Randall Barry
JB: Judy Brewer
PC: Peter Constable
JD: Jennifer DeCamp
MD: Martin Dürst
AF: Asmus Freytag
TG: Tony Graham
RI: Richard Ishida
JJ: John Jenkins
KK: Konstantin Kalaitzidis
CL: Chris Lilley
EL: Edgardo Luzcando
KM: Katsuhiko Momoi
TM: Teresa Mulvihill
ST: Suzanne Topping
TT: Tex Texin
EW: Etan Wexler
CW: Cathy Wissink
MW: Misha Wolf
SW: Sue Ellen Wright
FY: François Yergeau

Dialogue:
TT: Eating our own dogfood should not take the place of more important W3C work. It sounds good, but it can be a distraction from things that need our attention.
CL: We should strive for due diligence and be a good example. Eating our own dogfood is a kind of outreach.
AF: Translation can be showcased by treating a few specifications rather than translating every document that the W3C produces.
ST: A good candidate for such a showcase is the charter of an outreach group.
[SW mentions the process of formal liaison with TC37 and other ISO committees involved in locale identifiers.]
AF: We can either leave locale identifiers to another body or we can formulate requirements and see that some body meets them. If the first body does not meet our requirements, we can shop around.
SW: We can leave language identifiers to TC37, but 'xml:lang' is closer to W3C work.
JD: We need to identify language, but also orthography, script, and modality (voice or text, for example).
PC: We should leave it to TC37 or IETF as we have done. We can reconsider that decision later.
AF: IETF has not met our needs. We need to discuss our requirements.
ST: The requirements should be formulated by a working group.
AF: Locales are a high priority, no matter where the work happens.
TT: Leave the distinction between scripts, languages, and dialects to other bodies. We should address the general syntax issues.
[...]
KK: What I propose is the development of an end-to-end Web i18n model that consists of three components, what I call the IT infrastructure components. I think the first two components are well covered. The first is content i18n, the second is application i18n, and the third, which is kind of new, is infrastructure i18n. By "infrastructure" I mean databases, operating systems, networks, information security, application servers, Web servers. In with that I throw the Web services component. What I want to see is how these infrastructure components correlate with each other and what the impact is on Web i18n. We all talk about Web i18n, but to me Web i18n goes beyond content and application development. It should also cover, well... How does one get this information out to everybody, so everybody around the world can access this information? I am referring to the lower levels of i18n. Perhaps Web i18n is better defined here to say we deal with content and application i18n, and we leave the other component to another group to do the research, see what is involved. The bottom line is, if we internationalize according to the guidelines developed here, are we done? Is the Web i14ed? I would say no.
ST?: Are you proposing a model for the flow through those pieces, or a model which includes the lowest models for each piece?
KK: Models for each piece, and then an overall Web i18n model which would be comprehensive and cover everything, end to end. Right now there is a lot of emphasis on content and application i18n, and that is part of what we are looking at, but...
RI: Can you give us examples of issues in the third part?
KK: Sure. We can i14e an e-commerce application, l6e it... This is a real-world example, actually three real-world examples: Beijing, Johannesburg, and Moscow.
It can take seconds to download and fill out an e-commerce page but it can take ten minutes to look at the same information. That is just from the industrial point of view. Remember yesterday what the vice president of the World Bank was saying: access to information. It could be educational information. We develop and l6e all this good information, and yet most of the world cannot see this information because of infrastructure. I can bring up the example of the Apache Web server; we had to do certain things with the configuration file just so that the Japanese people could see it. When we upgraded the system, things broke down. When one develops content and applications, there are components in the infrastructure that deploy those applications and content, and we need to do things to them to ensure that they are deployed properly.
RI: My inclination is to ask, what can the W3C do about these things? Is that going to come later, Martin?
KK: It is just an idea for everybody to contemplate. I do not think this is the forum for that. It would take days to cover specifics.
MD: On the World Wide Web we have many cases where we are trying to think end to end but we are actually only working on something in the middle. For example, we won't go to any operating system vendor and tell them what i18n means for that operating system. We work on the place where data are interchanged, and that might have some consequences that we have to examine.
EL: If I may further the information technology perspective... As opposed to the model of the components themselves, I was interested in flow through the components. That is where I was seeing an XML guideline for i14zed content flow across systems of Web components.
[Question punted to DB.]
DB: What are the problems we are facing, propagating i18n? It is nothing special; it is like any planned server architectures. The only thing is that we are using mainly SOAP over HTTP as the carrier.
Client and server can be in different locales or time zones. Today, in most conventional architectures, the server imposes its i18n properties on the client, which is probably not acceptable. Or desirable, rather. It is not only locale. I may want to drive my business logic depending on the locale, especially when I talk to a database. Queries can be locale-sensitive, and that should be in the client's locale, not in the server's. But today we do not have a standardized mechanism to propagate the client's i18n context to the server side. Especially while we have other context flowing, that is, in CORBA or J2EE. As a client, if you start a transaction, sure, the server has to go by the transaction identifier to drive the transaction to completion. But regarding i18n, there is no standard mechanism. We are thinking of propagating the i18n context using locale identifiers as a specialized SOAP header which would be attached to the payload.
MD: In that case we go to the technical details. You want to exchange that information.
DB: Yes, I want to exchange that information transparently. A client programmer should not have to create a header or whatever mechanism we want to do. The architecture should provide transparent support to carry the information seamlessly from hop to hop.
MD: Misha, does that explain it well enough?
MW: Well, are those people here whose companies are developing Web services and tools for them... Does anybody want to speak more about what they see as the requirements?
CW: I do think a lot of what we are discussing has been covered. I think we do have to define what flow means in this case. There are a few things. One is identifying what we are sending out, transmittal of data. What market am I in? What language am I using? And also, how do we accept that? That is, on the other side, with regard to transactions. You know, what kind of currency. I think that was well covered in the document that was presented here.
How we do that is up for debate, but the concept is pretty well out there as far as Web services.
ST: That sounds like it is a combination of flow and then back to the whole locales situation.
MD: There is quite a relationship there, yes.
ST: I wonder if there is a better home, if that would be something to try to lump in there as part of...
FY?: It would be a slightly [?higher approach]. Define what you want to transmit, how you want to transmit it, when you want to transmit it, and what to do with it.
CW: It might be useful at some point to call out what we mean as well as collation. That is a nice application of how we use the locale identifier.

Poll: How important is this for W3C?
18 - Existing work (reviews, character model)
24 - Guidelines, best practices
10 - Education & outreach
07 - Gathering user requirements/solutions
18 - Distributed services incl Web services, locales, collations
05 - IDNS
10 - Localizability

Dialogue continued:
RB: Are we going to provide guidelines in all the languages that we intend to serve?
MD: W3C policy sets English as working language, and translations are welcomed.
TT: But that is irrelevant. What is the best for the situation? If it conflicts with policy, we worry about that later.
JB: W3C policy is a crucial issue. W3C is perceived as a standards organization. We at WAI attempted to put off questions on translation policy, but various governments came to us and said, "Look, we would be using, referring directly to your documents if we had official translations."
[...]
ST: Are guidelines a part of education and outreach?
AF: The first task is to decide how many groups we want to form. None of this guidelines business fits into the current structure. The Character Model has stuff in it that is a source for guidelines. We should separate the tasks of writing information and transmitting information.
Every time we put a "should" into a Recommendation, that is a guideline, and the formulation of such belongs in technical committees. Some people should go through all the Recommendations and pull all the appropriate stuff out and collect it into one area, which is different from what the working group has been doing, which is to put very specific character or other core information into [?conduct] of other working groups to put it into their documents.
MW: On the guidelines, the one issue I would like to raise is... If I listened correctly to Judy's talk, WAI divided the guidelines by audience. There are many ways of cutting it. But WAI divided it by audience, which if I remember is content, user agents, authoring tools, and specifications. Is that the right dimension for i18n? That is, intended audience rather than topic. And, if it is, are those four the right four categories?
CW: I think that that is something that will take more time to determine than the amount of time that we have today.
MW: Okay, but insofar as we are discussing something like guidelines, let us say we do it for twenty minutes. What, actually, is it that we want to discuss? I do not know. That is one possibility.
CW: I think that that would require a little bit more brainstorming and then cutting things out. I think that it is easier to have a whole lot of information that might be pertinent to guidelines and best practices and then start cutting things out.
RI: I do not think we need to get the answer today, but I think that it is a great forum and a great opportunity to do the brainstorming, and then we can capture those things and think about them later.
MD: As a chair, I would like to do a bit of brainstorming now. (Not too much.) Looking at this [the handwritten list of audience categories --EW], are there any things that you think are important for i18n that maybe WAI did not really consider?
FY: Site design. Which is not the same as content.
JB: That is what we [Web Accessibility Initiative --EW] intend by "content". It is people who are developing content and people who are developing sites.
EW: So, what would your distinction be between content and site design?
FY: A site designer decides how a site is structured, how it is accessed, how it is navigated. I am not talking about page design. I am talking about the whole site. "Content" makes me think of the pages themselves. Of course, the pages reflect the design of the site.
ST?: Well, I think that for i18n, "content" would not be a good title. We talk a lot about separation of content from architecture, so it would be misleading.
MD: Anything else to add here?
ST: L6ability, again, for lack of a better word. Or, if we are talking audiences, l6ers.
RB: Are these facets of i18n? I am a little confused about what these words connote.
MW: These are audiences we want to read the guidelines and do something with them.
RB: That would be content creators, then?
MW: Yeah. Add "creators" to each of them. Or "designers".
FY: I think it is important to distinguish the audience from the content of the guidelines. L6ability, for me, would go under the site design/content audience.
RI: [To ST --EW] You mean a group of l6ers?
ST: Unfortunately, I think that looking at what the WAI people did, as a construction for them, may be hampering our ability to brainstorm here. I have a hard time thinking, "How is this going to fit?" Whereas we should be thinking, "What do the people who are going to be looking at our guidelines need?" And then categorize them.
KM: I do agree with that. It seems to me that the top four items on that list are closely related. I was really thinking that guidelines and best practices have to do with reviews and Character Model.
FY: I think part of the rationale for guidelines came from adding less reviews, because they would be guidelines that others could use.
[Going back to the list of audiences.]
TT: I would add administrators.
There is a class of tools for administration and guidelines on how to set them up. It also bears on the distributed infrastructure.
RI: Does "site designers" fall into that category or is it still separate?
TM: No.
TT: Well, with "site design", I think of the network of, the flow of the pages. But administration, when one deals with multiple languages, deals with an overarching architecture for how to manage all these things. What is common? What is separate?
MW: And there is the Web server administrator.
TT: There is also support and how one deals with changes and versioning, and also problem detection.
KK: This is what I was mentioning: scalability, availability, and so on.
ST?: I do not think that it is administration, though.
TM: Yeah, it is.
ST: Well, what if one is developing it from Ground Zero? I will be going through a table of contents and say, "I want to design a system that is end-to-end." I would never look at administration until I had gone through---
TM: I would look at it first. I would look at the back end, the administration, first, and build it from there.
TT: So we need a table of contents for the guidelines, so that one knows where to look.
MD: I would like to think about cases where somebody wants to do it in one language but still have it i14ed so that the second that they want to have added versions or they want to have it adapted to different audiences or they want to have multilingual content on the same page... Often, when I talk to people, I get the impression that they are thinking about one of these, so we have to distinguish these cases.
RI: One of the questions I struggle with is, what do the guidelines contain? Do we just take the Character Model, or any other bits and bobs lying around in the current specifications, and then make those visible in one place? Or do we go as far as handling cultural impacts of color, time and date formats, and so on? Which is kind of the other end of the scale.
Where do we cut the amount of work that we are going to do?
FY: Well, there is the distinction between l10n and i18n. "Make sure that you can change the color." That is i18n. "Put the right color in." That is l10n.
RI: I certainly do not think we should be doing l10n guidelines. I think that that is for the l10n community to do. But even within i18n, do we get into color issues, translatability issues, and so on?
FY: That is why I was insisting that we separate the issue of the audience and the content. So do we finish that?
TT: I think that we need some more debate over this. There is no l10n group in the W3C, at the moment. There are other groups that do it. Part of what we need to address here, in establishing the charters, is, should there be a l10n group? There clearly need to be guidelines around l10n, especially in the context of the Web. I do not think that we should rule that out. I have in mind that there will be guidelines for those things, but we also need to determine scope and do something to categorize better what we want to accomplish through these guidelines.
CW: Let us try to use our resources the best way we can, and try to leverage existing resources. We have a limited amount of time, a limited amount of resources, and there is a lot of good stuff out there, perhaps on l10n, that we can reference. Or we can say, "We recommend the use of this particular set of guidelines." That way we can concentrate on what is specific to the W3C.
ST: Is it enough to say that we should create a committee and have it develop recommendations and reports on how to proceed?
MW: That is how we will end up, but in order to put to the W3C management that there should be such a working group... The decision is made, in the end, by the member companies of the W3C. The W3C management puts out a draft charter for an Activity and puts it to a vote. We need to formulate something plausible and then put it forward and see what happens. On l10n, there are two things.
One is to say, in the guidelines, "Make sure you use a locally appropriate color for your Web page. And there is a book that discusses it." Another way is, "Here is a list of fifty-nine colors." I am very in favor of us doing the former and very not in favor of us doing the latter.
[MW notes the day's time constraint, comparing priority votes with remaining minutes.]
TT: Misha, in general I agreed with your remarks with respect to, for example, colors, but as a developer, I find that if I do not know the range of data, then I cannot do development appropriately. If you tell me, "There are a lot of date formats out there", fine, I can accommodate some variability, but if I never anticipate a Japanese character in the middle of a string of numbers, I still cannot do the job. I think that we do need a bit more with respect to l10n and then some specific registry that records what goes where. That is fine for somebody else to do.
RI: Certainly, I try to say, "These are the things that you are going to encounter, these are the more common ones." But one cannot enumerate every possibility and then one must point people off. Not only that, but different communities have different requirements. It will be different industries in the same country, or different regions, and so on. The thing that worried me though, going back to what Cathy said, and I do not mean to point a finger at you, but you said that we should focus on what most interests W3C.
CW: Maybe I misspoke, then. We have a certain amount of expertise in this particular group. I think that we should focus our time on that expertise and if there are the people who, say, have the expertise to talk about l10n best practices, those resources should be allocated to them and not here.
RB: I would like to make a comment about l10n and i18n. I was surprised at how many people mentioned l10n in their position papers.
I thought, "Hey, this is supposed to be about i18n", but it made me realize that, in order to do l10n, one really must know about it and think about it when one does i18n. Something as simple as, say, identifying season is so relative to where one is in the hemisphere, and if one had not thought about that in advance, one cannot let somebody l6e to identify season.
RI: The point is, the people talking about l10n were actually talking about l6ability and l10n enabling.
[Discussion of what to discuss next.]
TG: I was wondering if one more audience group could be schema developers and style sheet developers.
RI: Are they content developers?
TG: I am not sure where style sheet developers fit in among such groups. Schema developers---
MW: ---are very different people from content developers.
[Crosstalk.]
TT: Another community I would target is testers. I suggest that for three reasons. One is that I would like to educate quality assurance folks because when they report problems it brings the developers along. They are a community that is rarely addressed, so they are happy to get that kind of information. Developers often do not want to do things different from what they are used to doing. Second, I am interested in conformance testing to know whether products are in complete conformance, or to what degree they conform to the guidelines. Third, with respect to i18n quality assurance folks, they often do not know what to test or what values trigger the boundary conditions, the edge conditions, and therefore they do not do a good job of testing other than putting volumes of data through the system.
RB: I would like to suggest prioritization of all these things, because it got so diverse. In another project that I tackled where we had some existing standards but that needed application guidelines, we started with the ones [guidelines --EW] for content since, without content, all of it is useless. But one cannot have content without an authoring tool.
They [the audiences --EW] fall into an order which helps to prioritize where one could suggest they be done. FY: Do you really think that that should be done today? RB: No, I am just suggesting it, just raising it as something that needs consideration at some point. MD: Given that WAI has spent several years in several working groups covering half of that, I think that that is a very important point. CL: Judy made the point about looking for levers and looking for multipliers. The first thing that we need to do is decide what sort of content is desirable. Having done that, we do not necessarily write the content note. We then write the authoring tool note. If we get a few tool vendors to use the stuff, then we produce large amounts of it, whereas if we get five or six content developers to follow the guidelines, we have a drop in the ocean. RI: I totally agree with that and I would go further. I would say that what WAI is already doing is trying to create tools to help authors. One might have all the guidelines that one needs, but text authors are under pressure all the time and they will forget things. One needs software to pop up and say, "You did not put the 'xml:lang' thing in here; I cannot spell-check it." Make the tools force authors to do certain things. ??: So we have them [the audiences --EW] in the right order already. TT: I would put testing higher than authoring tools, because one can trap the problems early on. MD: I would like to see schema stuff higher. I think that schema has the biggest leverage. RI: The schema and the style sheet go together because one does not get anything without those. And then the content developers come along and fill it in. MD: It depends. In some cases, like more data-oriented schemas, maybe style sheets come later. 
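[Editorial illustration, not part of the discussion as minuted.] RI's remark above about software that points out a missing 'xml:lang' can be sketched as a small checker. This is only a minimal sketch under stated assumptions: the function name `elements_missing_lang` is hypothetical, it uses Python's standard `xml.etree.ElementTree`, and it treats an element as covered when it or any ancestor carries a non-empty 'xml:lang'. It is not a description of any WAI or W3C tool.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def elements_missing_lang(xml_text):
    """Return tags of text-bearing elements with no xml:lang in scope.

    Hypothetical helper: only an element's leading text is checked
    (not tail text), and xml:lang="" is treated as "no language".
    """
    root = ET.fromstring(xml_text)
    missing = []

    def walk(elem, inherited):
        # xml:lang is inherited from the nearest ancestor that sets it.
        lang = elem.get(XML_LANG, inherited) or None
        if lang is None and elem.text and elem.text.strip():
            missing.append(elem.tag)
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return missing


if __name__ == "__main__":
    sample = '<doc><p xml:lang="en">Hello</p><p>Bonjour</p></doc>'
    for tag in elements_missing_lang(sample):
        print(f"<{tag}> has text but no xml:lang; cannot spell-check it")
```

An authoring tool could run such a check on save and prompt the author, which is the kind of lever the speakers describe; the same list could also feed the checklists that testers use.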
ST: One thing that, Tex, you may have mentioned in your paper, is that checklists are useful, too, not just guidelines with lots of text, but something that one can scan as one is trying to validate whatever it is that one is doing. RI: Are we going to continue this discussion on the mailing list? Are we forming a community of people here who will continue to discuss things amongst themselves? CL: Name the mailing list. Is it www-international? MD: I suggest that general issues go on www-international. If you are in the Working Group and have something more specific to the Working Group or the Interest Group, send it to that group. TT: What about mailing to the list set up for the workshop? MD: Any of you is free to send something to that mailing list and everybody in the world can look at the archives. They just cannot post, and most of them do not get it. Whether you want to discuss on that mailing list is up to everybody here. CL: Can you just say what the mailing list is? MD: It is www-i18n-workshop. MW: At w3.org. Minute-taker: Sam X. Sun Discussion minutes (under “What W3C I18N group should work on”) • Discussion topic: Review Existing work MD: review work is very important… RI: need guidelines, to reduce the review work MD: need to know what to do, to advise them, and develop information model MW: should finish the character model. Let’s concentrate on first version, not sure about the second version. RI: any new specification to be written? E.g. CSS-3 spec MD: at the moment not working on it TT: suggest to look at comments submitted from different reviews, and listing categories not being addressed. PC: listing projects that haven’t been done. Documenting un-encoded characters… glyph variants… etc. MD: There are two w3c recommendations (developed in Japan) that cover this topic. What to do with them? 
??: open language archive community is looking at this… look for solutions… two communities looking for same solution, this wg might be the organization to unite the effort. MW: Should finish the character model first. MD: these will need coordination. ??: This area resembles research, given that other people are doing it, there’s danger of separate solutions, the role of w3c can be like stamping… or provide guidance. • Discussion topic: Locale, collation, and web services MD: A few pointers: w3c last Friday changed XML protocol activity to web service activity, setting up two wgs: web service architecture working group, and CC/PP (composite capability/preference profiles), initiated from mobile community, about property of devices, what software is available, information that can be cached and pointed to, not sending all info at all the time, etc. MD: announcement of another workshop on delivering context, in March 2002, relating i18n and localization. I’ll report what we learnt here to that workshop ??: need to formalize and standardize i18n: setting up architecture group, define the context… otherwise may create interoperability mess ??: what does it mean by i18n context? … CW: Regarding locale identifier, from experience: extending locale identifier with different configuration settings is difficult… ST: mentioned this in my position paper. Need to define what’s meant by locale, what to eliminate, to include in locale identification, language identification, etc. MD: what about db approach? ST: choose from relational db, choose English – doesn’t necessarily mean American currency… ??: need some basic set data as default,… as a dropdown menu choice, … not something fixed MD: what needs to be achieved, how to cover all minority cases? 
Can we agree that there’s identifier for it, …in some cases an identifier is sufficient, in other cases the identifier points to additional data that people can pull them out… CW: this can be difficult to implement, although seems useful. For now we’d best focus on what locale identifier covers… MD: in some cases, identifier doesn’t matter CW: this is a valid point… it’s very hard to convince vendor… MW: Should define a general mechanism, data structure, etc. CW: also provide flexibility… DB: There are some languages not well defined, should not mix the issue… need ways to identify that… MW: locale may use XML to add capability for search. Should we expect that same query to different search engines would produce same results? AF: Possible to devise syntax to achieve… such mechanism may achieve two things: create a list of common registered identifiers to map existing info… agreement of the defaults… even come with shortcuts … MD: from implementation point of view it is important to work on the notion, content and identification, as well as propagation and use of “locale” include: date, language, time, collation, time zone, etc. TT: also how to address un-encoded characters… RI: need to find some way to hook into XML document… MD: would like to ask for use cases, examples. Send them to me and will bring it to the other workshop MW: are they of same importance? Minute-taker: Tony Graham Education & Outreach: fy: includes "eat your own dog food" md: Currently working on getting all mail in mail archive using right encoding. mw: Education includes what? md: Includes sponsoring t-shirts for Unicode Conferences fy: That's outreach. tt: Problem is existing documentation is all specs. Need "this is how to do it or how to use it" in a way that users can understand. mw: Need templates ri: Work on HTML tidy, for example. md: Also, specifications how to find out encoding page is supposed to be in, complicated because of compromises going on over the years. 
ri: Bill Hall mentioned Universities, as did Tim Bray md: but not in a position paper. ri: Are universities appropriate for W3C outreach? VC: Nothing prevents us. Haven't done that yet. md: Think about best way to get multiplier effect. ri: Computer-based training programs, distance learning... md: Get contacts for people working on standard curricula ri: Rather than target those people, make everybody think they need to know about this. st: Professional association is looking at education. Collecting current providers and making list available to people looking. Could do the same. tm: Society for Technical Communication already has ties to universities. Should use someone else's avenues. st: A lot of WAI focus on developing materials. Need people focusing on PR. mw: W3C has communication team. st: Getting stories planted. md: They do that. mw: ST is talking about ongoing. st: yes, raise visibility. ri: Business with web site wants people to keep coming back to web site. I always need to know latest browser i18n support. If had that on a web site, they'd keep coming back. Would have strategically placed links to draw them in to other things. af: The way universities work, not in business of training, unless convince research issue on internationalization. cw: On web sites, many could point to W3C i18n working group. cw: If can point i18n problems at graduate students, they go off and do it, but you hit very few people. md: Books speak about new technology, books often forget i18n. Book gets translated as is into Japanese, and things that you don't worry about in the US are missing in the Japanese. md: Issues should go in original of book. cl: Data structures 101 course is missing a component, and we can help write it for you. af: An author writes basic book on algorithm, did task of writing his algorithm to work with Unicode. Localisability (10 minutes) ri: The i18n tag stuff fits into this category. Propose we develop i18n tag set. E.g. 
translate flag or designer's note. Maybe l10n would do much of definition of tags. md: Also work at code localization level. When looked at XSLT, stylesheet adds labels, what do I have to do to get labels in different languages. James Clark had easy/complex solution. md: XForms has places that you know will need l10n. Draw up example, and say, look, this has to be simpler. ri: Much localizability will be guidelines? md: yes. ri: Is this appropriate for W3C? Is there something to do here? md: Probably not xml: attribute. mw: Are we developing actual tags? Belong to W3C or l10n bodies? ys: A lot of properties also related to schema. Some aspects already in schema, also related to rules file. ys: Same mechanism could be applied to Javascript code. md: XML Schema already has mechanism: say "integer" in schema or data. If have to put things in XML Schema, W3C will have to do something. If not, or Schema group says no, it has to be done somewhere else. mw: Anything else? Architectural work to make it possible, guidelines so people do the right thing, question of whether develop tags. Anything else? sy: RI's tagset makes document easily localizable. Rule file has same purpose and complements the other. Makes sense to put both in same standard or guideline. md: Rule file. Schema and instance, but rule file appears to be in the middle. Not used in other problems. Is l10n special? ys: Reason is l10n tools don't support schema. Build their own rule files and go from there. ri: Stylesheet for translation. Need include rule files for translation. md: Look at RDDL. Gathering user requirements/solutions (7 minutes) mw: Comes from meeting with XML Query where asking what would western-language speaker do to find "'Bill' near 'Monica'". What would a Chinese user expect where maybe don't have concept of "word". We didn't know. How do we find experts? mw: Instead of telling people way to do it, asking them way to do it and maintaining network. 
mw: It's not inherent in a language how a search engine would work. mw: If ask Chinese person who's never used a search engine, they wouldn't be able to tell you. md: Applies to i18n because people who know this are all over the place. mw: For schema, MD set up a web site asking people what calendar they used for different activities. st: So group becomes collection point? mw: Don't know. fy: should i18n group set up center of expertise. cw: Tempting but could get out of hand. mw: Netscape periodically asks "How do we do numbered lists in different languages?" pc: But other groups addressing same issues in contexts not necessarily related to the web. E.g. group working on collation sequences for European languages. Open Language Archive community looking at this kind of thing. Is there another venue that's the right place to hold this? ri: We don't want to be a registry. Don't want to hold that knowledge. But don't want to be ad hoc. tt: It's not a problem specific to the web. Members have resources in other countries. Larger issue than the Consortium. Don't use W3C resources. md: Other groups exist, vendors have contacts. For reviews, could have list of places to send questionnaires if we need for feedback. If have contacts in standard organizations, could send it to them for feedback. st: If done, save information somewhere searchable. ri: So could be resource to other people. pc: Unencoded characters. Needs to be a common way of documenting so it becomes interchangeable. Don't want to be repository, but could provide common representation. Schema or whatever. pc: Open Language Archive community have multiple archives around the world and problem of querying distributed information. mw: Solution is web page, public or member-only, where form weaker thing than liaisons with experts. Someone asks us, they want answer now. We have no resource for where to go and find expert in Chinese searching, etc. fy: Use case for Semantic Web? bt: Just what I was going to say. 
fy takes minute. md: Those people get a lot of mail, some ignore, some look by chance. More specific polling if clear question could work better. ri: Poll of people not working for W3C member? Multilingual Domain Names (5 minutes) as: Are we having a brand of multilingual dog food? st: How would W3C participate in IDN cw: Two major software vendors put this in their position papers. Lot of interest in industry. W3C i18n looks like place. km: IETF develops protocol. Other people can look at edge cases. Rather not see something crazy done. cw: Have more questions than answers. Have impression people don't want to touch, not important. af: It's the wrong forum. May be at level of focussed input. km: Crazy things might happen in China. cw: Hard to get worldwide solution when different reactions from different parts of the world. md: We need one solution and not more than one. We can't influence the solution very much. If send something to that mailing list saying we know better, they won't listen. md: Can use IETF process. E.g. one of chairs took one year to understand specific Chinese issue. E.g. if few people, not W3C, not Unicode, document problems, list impressive cases of how it doesn't work. Send in as IETF document, as anyone can do, may be published, e.g. as informational. One way to influence what's going on. md: In working group last call. md: W3C has interest in how IDN works in resource identifiers. Have published draft. Looking for feedback. Discussed issues with BiDi. tt: Why are we here? fy: Question is if anyone knows if last call documents satisfactory? md: Good point to review them again. ?? found one character that should be included not excluded. mw: Last call ends in ten days time. tt: But meeting has to do with next charter. mw: But people here are concerned. cw: My question is if people satisfied or doing anything? People here have knowledge to add valuable input. Should we be involved? mw: I hear that people find working group very difficult. 
Process doesn't appear very good. MD says any of us can review them. cw: It's late in the process, but might not be the end. MD reviews IETF process. mw: Who resolves issues after last call? md: Editor fixes editorial. Big issues go back to working group. mw: Way forward is to send in comments. Repeat if ignored. This group can't do anything except, if important, go through IETF liaison? yf: Two channels. Normal and liaison. md: Can ignore most things on that mailing list as useless, but sometimes you may say something to move things forward. st: I asked on list, what is the process for concerns? Was never answered. cw: Is this important to this group? mw: It's important, but what can we do? fy: We can review as group, send comments as individuals. md: Difference between sending as group and as individuals. ss: On IRIs, this working group interested in i18n of URIs? md: Not completely done, want to move on with character model, character model says to use this. We have good base to move forward with it. ss: Process for IRI? md: I send to last poll, send to ??, he will say yes or no. ss: Anyone want to implement IRI? cw: Can't give definitive answer. md: IE implemented parts 2-3 years ago. Others have done a bit recently. mw: All we can record is that a lot of concern was expressed about the process in that WG and we noted ten days to go in last call. People who want support from others can send concerns to workshop mailing list. yf takes minute. md: Minute that for web we are interested in single solution that works across the web. jj: A simple single solution. Simple enough that my mother can understand it. yf revises minute. mw: We only have power as individuals, and it's awfully complex. Summary and Conclusions md: We have a lot of consensus. We listed a lot of things that are quite a bit of work. Let everybody say whether would be able to help with something or if know people who would be able to help. 
We know what is important and what we should and shouldn't do, but need to know if have people to do work. pc: Question of knowing who is not working for member company. md: Can handle somehow. cl: Interest group is public? md: No, but public or not public could be part of rechartering. mw: No point having people on working group who don't do anything, but also better to have more than fewer. Poll of who can participate in what. yf notes numbers. af: Room split into two subgroups. One person who answered yes to E&O answered yes to existing work. md: Propose last round where everybody can say something. km: Hope provision for character model document. af: Glad accessibility mentioned. cl: been valuable, but needs to continue to be sure it has lasting value. Maintaining momentum. vq: Have better idea of what people should do. Not sure have resources. Have to prioritise. More work needed to make final decision. tg: Useful day. af: Need to identify people willing to drive activities, be chairperson, etc. ri: Willing to drive guidelines. af: Look for caretaker drivers? st: I'd do outreach. No-one volunteers for locales. st: I could kick it off, but... mw: Have feeling it needs to be someone from a member company. tt: Don't look a gift horse in the mouth. ??: ys volunteers for localizability. jj: Need regular f2f meetings. Maybe have day like this with each IUC. af: A workshop like this is good, but should it have the same focus or just one of the subject matters. mw: Need balance between subjects and future path. Money is practical matter. W3C probably wouldn't do one two or three times a year. ri: Or organise like normal face to face. md: Can also think about panel. CL hints to TG that Sun could host another meeting like this in Dublin. 10 people interested in attending "a thing like this" in Dublin. vq: Motivation of workshop was to gather ideas. 
Next step is to elaborate plans then make an activity proposal describing plan and structure for activity, chairs, charter, timeline. This usually takes a few weeks. Propose to members. Management decide on feedback from members. ri: Would be good to have meeting in Dublin. vq: Could be first f2f meeting. But have to announce f2f 8 weeks in advance. j: Would have liked to see more on language codes and how different organisations would work together, e.g. ISO TC 37. st: Maybe do something from outreach perspective. mw: Glad it's happened. I'm tired. I want to rest. Supper would be good. ri: Nice to see everybody and meet new people. What time are we going for dinner? Haiku. DB: Would like to see more architectural issues being discussed. mw: More four letter words? kk: learned a lot. My issues were touched upon. st: Wondering if with right outreach/PR, could get more resources. el: first time. looking forward to workgroup. i18n will have major role in industry. sy: Good outcome. Hope each group will produce result. ys: Good meeting. tt: Was enthusiastic jj: But now? tt: Now I'm tired. Was glad to see got volunteers and leaders. Next meeting should be two-day. cw: Was impressed by diversity of people around the table. Hope cross-pollination will continue. Vendors/IT don't always get to see all sides. bt: Cross-pollination was useful. Dealt with some high priority issues for libraries. Glad to contribute. Libraries have a lot to contribute. See us as a resource. fy: Good workshop. Optimistic to see we have volunteers and small set of possible activities. md: Glad it's over. Glad no people fighting with each other. Thank minute takers. Thank presenters. Thank Barbara for getting it organised. Round of applause for Barbara. md: Thanks for loan of projector. Thanks for participating. ri: Thanks to Martin. Round of applause for Martin.