Version 8.0 of the Unicode Standard is now available. It includes 41 new emoji characters (including five modifiers for diversity), 5,771 new ideographs for Chinese, Japanese, and Korean, the new Georgian lari currency symbol, and 86 lowercase Cherokee syllables. It also adds letters to existing scripts to support Arwi (the Tamil language written in the Arabic script), the Ik language in Uganda, Kulango in Côte d’Ivoire, and other languages of Africa. In total, this version adds 7,716 new characters and six new scripts. For full details on Version 8.0, see Unicode 8.0.
The first version of Unicode Technical Report #51, Unicode Emoji, is being released at the same time. That document describes the new emoji characters. It provides design guidelines and data for improving emoji interoperability across platforms, gives background information about emoji symbols, and describes how they are selected for inclusion in the Unicode Standard. The data is used to support emoji characters in implementations, specifying which symbols are commonly displayed as emoji, how the new skin-tone modifiers work, and how composite emoji can be formed with joiners. The Unicode website now supplies charts of emoji characters, showing vendor variations and providing other useful information.
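As a rough illustration of the two mechanisms mentioned above, the following Python sketch builds a skin-tone modifier sequence and a joiner sequence by plain string concatenation. The helper names (`with_skin_tone`, `join_emoji`, `code_points`) are hypothetical and exist only for this example. Whether a sequence is drawn as a single glyph depends on the font and platform, which the sketch does not try to detect.

```python
# Minimal sketch: emoji sequences are ordinary strings of code points.
# All helper names here are illustrative, not part of any library.

WAVING_HAND = "\U0001F44B"       # U+1F44B WAVING HAND SIGN
MEDIUM_SKIN_TONE = "\U0001F3FD"  # U+1F3FD, one of the five modifiers U+1F3FB..U+1F3FF
ZWJ = "\u200D"                   # U+200D ZERO WIDTH JOINER

def with_skin_tone(base, modifier):
    """Append a skin-tone modifier directly after the base emoji."""
    return base + modifier

def join_emoji(*parts):
    """Form a composite emoji by placing a ZWJ between the parts."""
    return ZWJ.join(parts)

def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

waving = with_skin_tone(WAVING_HAND, MEDIUM_SKIN_TONE)
family = join_emoji("\U0001F468", "\U0001F469", "\U0001F467")  # man, woman, girl

print(code_points(waving))
print(code_points(family))
```

Both results remain multi-code-point strings, so code that counts or truncates "characters" may need to treat each sequence as one unit.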
Some of the changes in Version 8.0 and associated Unicode technical standards may require modifications in implementations. For more information, see Unicode 8.0 Migration and the migration sections of UTS #10, UTS #39, and UTS #46.
Provide input to the planning of the Big Data Value Chain: Contribute to BDVA Summit 18-19 June, Madrid
In the context of the Big Data Value Association Madrid Summit, 17-19th June, there are two sessions of specific relevance to standards and also to multilingualism: on 18th June a session on standardization, and on 19th June a session on Multilingual Data Value Chains. If you want to participate actively in these sessions or want to provide further feedback, please contact Felix Sasaki <firstname.lastname@example.org> on Standardization and Asun Gomez-Perez <email@example.com> on Multilingual Data Value Chains. Presentations will be short in order to promote wide participation.
If you cannot be in Madrid, please still provide your input – see the session links above for further instructions. The BDVA Summit will be crucial in shaping upcoming funding opportunities related to Big Data. Don’t miss the chance to describe your views on opportunities, challenges and potential solutions for the Big Data Value Chain!
Language Tags and Locale Identifiers for the World Wide Web describes the best practices for identifying or selecting the language of content as well as the locale preferences used to process or display data values and other information on the Web. It describes how document formats, specifications, and implementations should handle language tags, as well as extensions to language tags that describe the cultural or linguistic preferences referred to in internationalization as a “locale”.
Changes in this update include the following: All references to RFC3066bis were updated to BCP 47 or to RFC 5646 or RFC 4647 as appropriate. References to HTML were changed to point to HTML5. Imported and rewrote the text formerly contained in Web Services Internationalization Usage Scenarios defining internationalization, locale, and other important terms. Modified and reorganized the other sections of this document. Moved the Web services materials to an appendix.
The updated Working Draft of Requirements for Hangul Text Layout and Typography brings the English version of the draft into line with a number of changes prompted by feedback that were added to the editor’s copy. Notes pointing to as yet unresolved comments were also added to the document. It also points to the new location of the editor’s draft, on GitHub, and suggests the use of GitHub issues for future comments.
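To make the idea of selecting content by language tag more concrete, here is a minimal Python sketch of the "lookup" matching scheme from RFC 4647, in which a requested tag is progressively truncated until it matches an available one. The function name `lookup` and its parameters are assumptions made for this example. It is a simplified illustration, not a complete or validating BCP 47 implementation.

```python
# Minimal sketch of RFC 4647 "lookup": truncate the requested language
# range from the end until one of the available tags matches.
# The function name and signature are illustrative assumptions.

def lookup(priority_list, available, default=None):
    """Return the best available tag for a prioritized list of ranges."""
    # Language tags compare case-insensitively; remember original spelling.
    by_lower = {tag.lower(): tag for tag in available}
    for language_range in priority_list:
        if language_range == "*":
            continue  # the wildcard range is skipped in lookup
        subtags = language_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in by_lower:
                return by_lower[candidate]
            subtags.pop()
            # A single-character subtag (such as "x" or an extension
            # singleton) is removed together with the subtag after it.
            if subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default

print(lookup(["zh-Hant-TW", "en"], ["en", "zh-Hant", "fr"]))  # zh-Hant
print(lookup(["de-CH"], ["en", "fr"], default="en"))          # en
```

A request for `zh-Hant-TW` falls back to `zh-Hant` before the next range in the list is tried, and the default is used only when no range matches.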
The document describes requirements for general Korean language/Hangul text layout and typography realized with technologies like CSS, SVG and XSL-FO. The document is mainly based on a project to develop the international standard for Korean text layout.
Multilingual Linked Data for a Digital Single Market – Dedicated LD4LT call, 2 April 2015, 3 p.m. CEST
The LIDER project is fostering the creation of a community around Linguistic Linked Data (LLD): linked data used to represent metadata about linguistic resources and the resources themselves, e.g. lexica, thesauri, corpora, multilingual semantic networks etc. In a dedicated LD4LT community group call on 2 April, 3 p.m. CEST, we will discuss how LLD can contribute to the creation of the digital single market. For more details, see the slides that will be presented during the call.
See the program. The keynote speaker will be Paige Williams, Director of Global Readiness, Trustworthy Computing, Microsoft. She is followed by a strong line-up in sessions entitled Developers and Creators, Localizers, Machines, and Users, including speakers from Microsoft, the European Parliament, the UN FAO, Intel, Verisign, and many more. The workshop is made possible with the generous support of the LIDER project.
Participation in the event is free. Please register via the Riga Summit for the Multilingual Digital Single Market site.
The MultilingualWeb workshops, funded by the European Commission and coordinated by the W3C, look at best practices and standards related to all aspects of creating, localizing and deploying the multilingual Web. The workshops are successful because they attract a wide range of participants, from fields such as localization, language technology, browser development, content authoring and tool development, to create a holistic view of the interoperability needs of the multilingual Web.
We look forward to seeing you in Riga!
The Unicode® Consortium announced the start of the beta review for Unicode 8.0.0, which is scheduled for release in June 2015. All beta feedback must be submitted by April 27, 2015.
Unicode 8.0.0 comprises several changes which require careful migration in implementations, including the conversion of Cherokee to a bicameral script, a different encoding model for New Tai Lue, and additional character repertoire. Implementers need to change code and check assumptions regarding case mappings, New Tai Lue syllables, Han character ranges, and confusables. Character additions in Unicode 8.0.0 include emoji symbol modifiers for implementing skin tone diversity, other emoji symbols, a large collection of CJK unified ideographs, a new currency sign for the Georgian lari, and six new scripts. For more information on emoji in Unicode 8.0.0, see the associated draft Unicode Emoji report.
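One way to check case-mapping assumptions for Cherokee is sketched below in Python. It assumes an interpreter whose built-in Unicode database is at version 8.0 or later (Python 3.5 and newer). The helper name `caseless_equal` is hypothetical. The sketch shows that the existing Cherokee letters now behave as uppercase, and that caseless comparison should go through case folding rather than lowercasing alone.

```python
# Minimal sketch, assuming the interpreter's Unicode data is 8.0 or later.
import unicodedata

CHEROKEE_CAPITAL_A = "\u13A0"  # existing letter, now treated as uppercase
CHEROKEE_SMALL_A = "\uAB70"    # lowercase letter added in Unicode 8.0

def caseless_equal(left, right):
    """Compare two strings caselessly using Unicode case folding."""
    return left.casefold() == right.casefold()

print("Unicode data version:", unicodedata.unidata_version)

# The older code point now has a lowercase mapping, and vice versa.
print(CHEROKEE_CAPITAL_A.lower() == CHEROKEE_SMALL_A)
print(CHEROKEE_SMALL_A.upper() == CHEROKEE_CAPITAL_A)

# Folding both sides makes the comparison independent of which
# direction the case-folding data maps the letters.
print(caseless_equal(CHEROKEE_CAPITAL_A, CHEROKEE_SMALL_A))
```

Code that previously assumed Cherokee text had no case distinction, for example when indexing or comparing identifiers, is the kind of assumption this check is meant to surface.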
Please review the documentation, adjust code, test the data files, and report errors and other issues to the Unicode Consortium by April 27, 2015. Feedback instructions are on the beta page.
We would like to remind you that the deadline for speaker proposals for the 8th MultilingualWeb Workshop (April 29, 2015, Riga, Latvia) is on Sunday, March 8, at 23:59 UTC.
Featuring a keynote by Paige Williams (Director of Global Readiness, Trustworthy Computing at Microsoft) and sessions for various audiences (Web developers, content creators, localisers, users, and multilingual language processing), this workshop will focus on the advances and challenges faced in making the Web truly multilingual. It provides an outstanding and influential forum for thought leaders to share their ideas and gain critical feedback.
While the organizers have already received many excellent submissions, there is still time to make a proposal, and we encourage interested parties to do so by the deadline. With roughly 150 attendees anticipated for the Workshop from a wide variety of profiles, we are certain to have a large and diverse audience that can provide constructive and useful feedback, with stimulating discussion about all of the presentations.
The workshop is made possible by the generous support of the LIDER project and will be part of the Riga Summit 2015 on the Multilingual Digital Single Market. We are organizing the workshop as part of the Riga Summit to strengthen the Europe-related community at large. Depending on the number of submissions to the MultilingualWeb workshop, we may suggest moving some presentations to other days of the summit. For these reasons we highly recommend that you attend the whole Riga Summit! See the line-up of speakers already confirmed for the various events during the summit.
For more information and to register a presentation proposal, please visit the Riga Workshop Call for Participation. For registration as a regular participant of the MultilingualWeb workshop or other events at the Riga Summit, please register at the Riga Summit 2015 site.
We are pleased to announce that Paige Williams, Director of Global Readiness, Trustworthy Computing at Microsoft, will deliver the keynote at the 8th Multilingual Web Workshop, “Data, content and services for the Multilingual Web,” in Riga, Latvia (29 April 2015).
Paige spent 10 years managing internationalization of Microsoft.com, before joining the Trustworthy Computing organization in 2005. In TwC, Paige oversees compliance with company policy for geographic, country-region and cultural requirements, establishing a new center of excellence for market and world readiness, globalization/localizability, and language programs, tools, resources and external community forums to reach markets across the world with the right local experience.
The Multilingual Web Workshop series brings together participants interested in the best practices, new technologies, and standards needed to help content creators, localizers, language tools developers, and others address the new opportunities and challenges of the multilingual Web. It will provide for networking across communities and building connections.
Registration for the Workshop is free, and early registration is recommended since space at the Workshop is limited.
The workshop will be part of the Riga Summit 2015 on the Multilingual Digital Single Market. We are organizing the workshop as part of the Riga Summit to strengthen the Europe-related community at large. Depending on the number of submissions to the MultilingualWeb workshop, we may also suggest moving presentations to other days of the summit. For these reasons we highly recommend that you attend the whole Riga Summit!
There is still opportunity for individuals to submit proposals to speak at the workshop. Ideal proposals will highlight emerging challenges or novel solutions for reaching out to a global, multilingual audience. The deadline for speaker proposals is March 8, but early submission is strongly encouraged. See the Call for Participation for more details.
This workshop is made possible by the generous support of the LIDER project.
The Cascading Style Sheets (CSS) Working Group has published a Candidate Recommendation of CSS Counter Styles Level 3. It adds new built-in counter styles to those defined in CSS 2.1, but, more importantly, it also allows authors to define custom styles for list markers, numbered headings and other types of generated content.
At the same time, the Internationalization Working Group has updated their Working Draft of Predefined Counter Styles, which provides custom rules for over a hundred counter styles in use around the world. It serves both as a ready-to-use set of styles to copy into your own style sheets, and also as a set of worked examples.
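To give a feel for what a custom counter style computes, the following Python sketch reproduces the marker-generation steps of the "alphabetic" counter system as we understand them from the CSS Counter Styles specification. The function name `alphabetic_marker` is hypothetical. Raising an error for out-of-range values is a simplification for this example; in a style sheet, a fallback style would be used instead.

```python
# Minimal sketch of an "alphabetic" counter system: symbols act like
# digits of a numbering scheme that has no zero (a, b, ..., z, aa, ab, ...).
# The function name is illustrative; error handling is simplified.

def alphabetic_marker(value, symbols):
    """Return the marker text for a counter value of 1 or greater."""
    if len(symbols) < 2:
        raise ValueError("an alphabetic system needs at least two symbols")
    if value < 1:
        raise ValueError("alphabetic systems only represent values >= 1")
    count = len(symbols)
    marker = ""
    while value != 0:
        value -= 1                               # shift to a zero-based digit
        marker = symbols[value % count] + marker  # prepend the next symbol
        value //= count
    return marker

lower_alpha = [chr(code) for code in range(ord("a"), ord("z") + 1)]
print([alphabetic_marker(n, lower_alpha) for n in (1, 2, 26, 27, 28)])
```

The same function works with any list of symbols, so a style that counts with, say, three custom glyphs needs only a different `symbols` argument.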