The MultilingualWeb-LT Working Group published a First Public Working Draft of Metadata for the Multilingual Web – Usage Scenarios and Implementations. This document introduces a variety of usage scenarios and applications for the Internationalization Tag Set (ITS) 2.0, ranging from simple machine translation or quality checks on human translation to training for machine translation systems or automatic text analysis. Many of the underlying implementations will be showcased at the upcoming W3C MultilingualWeb Workshop on 12–13 March in Rome.
A new FAQ page devoted to the topic of private-use characters, noncharacters, and sentinels has been posted on the Unicode web site. This FAQ aims to clear up confusion about whether noncharacters are permitted in Unicode text, and how they differ from ordinary private-use characters. The recently published Corrigendum #9: Clarification About Noncharacters makes it clear that noncharacters are permitted even in interchange, and the new FAQ page addresses some of the fine points about their usage and about differences from other types of Unicode code points. The brief mentions of noncharacters in other FAQ pages have also been updated accordingly.
Are you unclear about what Unicode “noncharacters” even are? The new FAQ page also answers basic questions about noncharacters and private-use characters, and provides a bit of history about how they came to be part of the Unicode Standard.
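As a rough illustration of the distinction, the following Python sketch classifies code points as noncharacters or private-use characters. The ranges come from the Unicode Standard; the function names `is_noncharacter` and `is_private_use` are our own choices for this example, not part of any library.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacters."""
    # U+FDD0..U+FDEF: a contiguous run of 32 noncharacters in the BMP.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of each of the 17 planes (U+nFFFE, U+nFFFF).
    return 0 <= cp <= 0x10FFFF and (cp & 0xFFFE) == 0xFFFE


def is_private_use(cp: int) -> bool:
    """Return True if cp lies in one of the three private-use ranges."""
    return (
        0xE000 <= cp <= 0xF8FF          # BMP Private Use Area
        or 0xF0000 <= cp <= 0xFFFFD     # Plane 15
        or 0x100000 <= cp <= 0x10FFFD   # Plane 16
    )


# Counting over the whole code space confirms the total of 66 noncharacters.
NONCHARACTER_COUNT = sum(is_noncharacter(cp) for cp in range(0x110000))
```

Note that the two sets do not overlap: U+FFFFE, for instance, sits at the end of a private-use plane but is a noncharacter rather than a private-use character.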
The deadline for speaker submissions for the 6th MultilingualWeb Workshop (March 12–13, 2013 in Rome, Italy) is this Friday (January 18 at 23:59 UTC).
With a keynote by Mark Davis and Vladimir Weinstein (Google) and special breakout sessions on linked open data and other critical topics, this Workshop is set to continue the tradition of excellence set by the previous five Workshops, and will provide an outstanding forum for thought leaders to share their ideas and gain critical feedback.
While the organizers have already received many excellent submissions, there is still time to make a proposal, and we encourage interested parties to do so by the deadline. With over 100 attendee registrations already submitted for the Workshop, we are certain to have a large and diverse audience and stimulating discussion about all of the presentations.
For more information, please visit the Rome Workshop Call for Participation.
Mark Davis and Vladimir Weinstein (Google) to deliver keynote, “Innovations in Internationalization at Google,” at MultilingualWeb Workshop
Mark Davis (President and Cofounder, Unicode Consortium, and Software Engineer, Unicode and ICU, Google) and Vladimir Weinstein (Engineering Manager, Google) will deliver the keynote talk at the upcoming 6th MultilingualWeb Workshop in Rome, Italy (March 12–13).
The keynote will discuss how Google supports its ambitious goals of removing barriers to information, in an ever increasing number of languages, through recent innovations in internationalization technology.
The MultilingualWeb workshop series examines best practices and standards related to all aspects of creating, localizing and deploying the Web multilingually. It aims to raise the visibility of existing best practices and standards and identify gaps, with a view to helping content creators, localizers, tools developers, and others meet the challenges of the multilingual Web.
Participation is free. We welcome participation from both speakers and non-speaking attendees. For more information and to register, see the Call for Participation.
Led by experts in the field, two special break-out sessions on Internationalized Domain Names (IDN) and Linked Open Data (LOD) are planned for the upcoming MultilingualWeb workshop, to be held at the headquarters of the UN’s Food and Agriculture Organization in the heart of Rome, on 12-13 March. We will also continue the Open Space discussions that have been so popular in the past.
In addition, lunch-time exhibition sessions will showcase the recent work and progress made on implementing the ITS 2.0 specification, a major effort in the W3C to improve support for language- and translation-related processes.
Register soon to ensure you get a place, especially if you are interested in also speaking. See the Call for Participation.
The W3C’s MultilingualWeb workshops bring together approximately 150 implementers, leading developers, localizers, researchers and users of the Web to discuss best practices and standards related to all aspects of creating, localizing and deploying the Web multilingually. One and a half days of presentations will be followed by break-out sessions that will allow attendees to explore additional topics in an in-depth, discussion-oriented fashion.
Participation is free.
If you have any questions, contact the program committee chair, Dr. Arle Lommel (firstname.lastname@example.org).
This document defines data categories and their implementation as a set of elements and attributes called the Internationalization Tag Set (ITS) 2.0.
ITS 2.0 is designed to foster the creation and localization of multilingual Web content. It focuses on HTML5 and XML-based formats in general, and is intended to leverage localization workflows based on the XML Localization Interchange File Format (XLIFF) and language technology applications such as machine translation or named entity annotation. In addition to the HTML5 and XML implementations, algorithms to convert ITS attributes to NIF are provided.
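To give a feel for how a data category is used in practice, here is a minimal Python sketch that reads the local `its:translate` attribute in an XML document and collects only the text marked as translatable. The sample document and the `translatable_text` helper are hypothetical, written for this illustration. The sketch handles local markup only and ignores global rules, so it is not a conformant ITS 2.0 processor.

```python
import xml.etree.ElementTree as ET

# The ITS namespace URI; its:translate takes the values "yes" or "no".
ITS_NS = "http://www.w3.org/2005/11/its"
TRANSLATE = f"{{{ITS_NS}}}translate"

# Hypothetical sample document, for illustration only.
SAMPLE = """<doc xmlns:its="http://www.w3.org/2005/11/its">
  <title>Installation guide</title>
  <p>Run <code its:translate="no">setup.exe</code> to begin.</p>
</doc>"""


def translatable_text(elem, inherited=True):
    """Collect text chunks whose nearest its:translate setting is "yes".

    Element content is translatable by default, and a local setting
    is inherited by child elements unless they override it.
    """
    flag = elem.get(TRANSLATE)
    translate = inherited if flag is None else (flag == "yes")
    chunks = []
    if translate and elem.text and elem.text.strip():
        chunks.append(elem.text.strip())
    for child in elem:
        chunks.extend(translatable_text(child, translate))
        # Tail text belongs to the parent, so the parent's setting applies.
        if translate and child.tail and child.tail.strip():
            chunks.append(child.tail.strip())
    return chunks


segments = translatable_text(ET.fromstring(SAMPLE))
print(segments)
```

Here the file name inside `<code>` is skipped, while the surrounding sentence fragments and the title are collected for translation.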
Last Call means that the MultilingualWeb-LT Working Group feels that ITS 2.0 is ready to move to recommendation. If you have comments on the document, please send them to the list mentioned in the document status before 10 January.
12–13 March 2013 in Rome, Italy, hosted by the Food and Agriculture Organisation of the United Nations.
The W3C announces today the sixth in a series of workshops exploring the mechanisms and processes needed to ensure that the World Wide Web lives up to its potential around the world and across barriers of language and culture.
Anyone may attend at no charge and the W3C welcomes participation by both speakers and non-speaking attendees. Early registration is encouraged due to limited space.
Building on the success of five highly regarded previous workshops in Madrid, Pisa, Limerick, Luxembourg, and Dublin, this workshop will emphasize the application of theory and technology to meet practical needs. The workshop brings together participants interested in the best practices and standards needed to help content creators, localizers, language tools developers, and others meet the challenges of the multilingual Web. It provides further opportunities for networking across communities that span the various aspects involved. We are particularly interested in speakers who can demonstrate novel solutions for reaching out to a global, multilingual audience. Registration now online.
A report summarizing the MultilingualWeb workshop in Luxembourg is now available from the MultilingualWeb site. Alongside the summaries are links to slides, video recordings, and the IRC log for each speaker and the discussion sessions.
Entitled “The Multilingual Web – The Way Ahead”, the workshop surveyed and shared information about currently available best practices and standards that can help content creators and localizers address the needs of the multilingual Web. Attendees also heard about gaps that need to be addressed, and enjoyed opportunities to network and share information among the various communities involved in enabling the multilingual Web.
This workshop also included a half-day Open Space discussion session run by Jaap van der Meer of TAUS, where attendees split into breakout groups to discuss topics of their own choosing.
You can also find links to videos, slides, etc., as well as links to social media related to the event, on the program page of the workshop.
This is the final workshop in the series belonging to the first MultilingualWeb project. The MultilingualWeb-LT project, which follows on from the original project, is holding a workshop in Dublin on 11-13 June entitled The Multilingual Web – Linked Open Data and MultilingualWeb-LT Requirements and plans to hold additional workshops next year that will be similar in format to those run so far.
A new version of the Character Model for the World Wide Web 1.0: Normalization was published. The only significant change was a note to clarify that content of the Working Draft is currently out of date, and the Internationalization Core Working Group intends to substantially alter or replace the recommendations found in this document with very different recommendations in the near future.
W3C Workshop, Call for Participation: The Multilingual Web – Linked Open Data and MultilingualWeb-LT Requirements
11 – 13 June 2012, Dublin, Ireland, hosted by Trinity College Dublin.
Organized by the MultilingualWeb-LT Working Group, the purpose of this workshop is two-fold: first, to discuss the intersection between Linked Open Data and Multilingual Technologies (11 June), and second, to discuss Requirements of the W3C MultilingualWeb-LT Working Group (12 – 13 June). For more information, see the Call for Participation.
Participation is free. We welcome participation from both speakers and non-speaking attendees. However, whereas future MultilingualWeb workshops will continue the wide-ranging format of previous MultilingualWeb events, and will aim again at a larger audience, attendees for this workshop are required to participate actively in discussions and will need to submit a position statement for the workshop registration. There are limited spaces available.
The MultilingualWeb-LT Working Group aims to define metadata for web content (mainly HTML5) and “deep Web” content (for example, a CMS or XML files from which HTML pages are generated) that facilitates its interaction with multilingual technologies and localization processes.