W3C

XML Technology

XML Technologies including XML, XML Namespaces, XML Schema, XSLT, Efficient XML Interchange (EXI), and other related standards.

XML Essentials

XML is supported by a set of essential technologies such as the infoset and namespaces. They address issues that arise when using XML in specific application contexts.
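As an illustration of the kind of issue namespaces address, the following minimal sketch uses Python's standard `xml.etree.ElementTree` module to tell apart two `title` elements that come from different vocabularies. The sample document, the namespace URIs, and the `titles` helper are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing two vocabularies; both namespace URIs are made up.
DOC = """\
<order xmlns="http://example.org/ns/order"
       xmlns:inv="http://example.org/ns/inventory">
  <title>Spring order</title>
  <inv:item>
    <inv:title>Widget</inv:title>
  </inv:item>
</order>"""

# The prefixes used in queries are local to this script; only the URIs matter.
NS = {
    "o": "http://example.org/ns/order",
    "i": "http://example.org/ns/inventory",
}


def titles(xml_text):
    """Return the order title and the list of inventory item titles."""
    root = ET.fromstring(xml_text)
    order_title = root.find("o:title", NS).text
    item_titles = [e.text for e in root.iterfind(".//i:title", NS)]
    return order_title, item_titles


print(titles(DOC))
```

Both elements share the local name `title`, but each query only matches the element whose namespace URI corresponds to the prefix used.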

Efficient Interchange

XML standards are omnipresent in enterprise computing, and are part of the foundation of the Web. Because the standards are highly interoperable and affordable, people have wanted to use them in a wide variety of applications. However, in some settings (on devices with low memory or low bandwidth, or where performance is critical) experience has shown that a more efficient form of XML is required.

Schema

Formal descriptions of vocabularies create flexibility in authoring environments and quality control chains. W3C’s XML Schema, SML, and data binding technologies provide the tools for quality control of XML data.

Security

Manipulating data with XML sometimes requires integrity, authentication, and privacy. XML Signature, XML Encryption, and XKMS can help create a secure environment for XML.

Transformation

Very frequently one wants to transform XML content into other formats (including other XML formats). XSLT and XPath are very powerful tools for creating different representations of XML content.
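The idea of selecting nodes with a path expression and emitting a different representation can be sketched in Python. This is not XSLT: Python's standard library has no XSLT processor, and `xml.etree.ElementTree` supports only a limited subset of XPath. The catalog document and the `english_books_as_csv` helper are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document, made up for this sketch.
CATALOG = """\
<catalog>
  <book lang="en"><title>XML Basics</title><price>20</price></book>
  <book lang="fr"><title>Le XML</title><price>25</price></book>
  <book lang="en"><title>XPath in Practice</title><price>30</price></book>
</catalog>"""


def english_books_as_csv(xml_text):
    """Select books with lang="en" and render them as CSV text."""
    root = ET.fromstring(xml_text)
    rows = ["title,price"]
    # The attribute predicate is part of the XPath subset ElementTree supports.
    for book in root.findall("./book[@lang='en']"):
        rows.append(f"{book.findtext('title')},{book.findtext('price')}")
    return "\n".join(rows)


print(english_books_as_csv(CATALOG))
```

In an XSLT stylesheet the same selection would typically appear in a template's `match` or `select` expression, with the output format described declaratively rather than built up in a loop.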

Query

XQuery (supported by XPath) is a query language for XML to extract data, similar to the role of SQL for databases, or SPARQL for the Semantic Web.

Components

The XML ecosystem uses additional tools to create a richer environment for using and manipulating XML documents. These components include style sheets, XLink, xml:id, XInclude, XPointer, XForms, XML fragments, and XML Events.

Processing

A processing model defines what operations should be performed in what order on an XML document.

Internationalization

W3C has worked with the community on the internationalization of XML, for instance for specifying the language of XML content.
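In XML, the language of content is declared with the `xml:lang` attribute, and the declared value applies to an element's descendants unless they override it. The following minimal sketch computes the effective language of each element with Python's `xml.etree.ElementTree`; the sample document and the `languages` helper are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document, made up for this sketch.
DOC = """\
<doc xml:lang="en">
  <p>Hello</p>
  <p xml:lang="fr">Bonjour <em>le monde</em></p>
</doc>"""


def languages(xml_text):
    """Return (tag, effective language) pairs in document order."""
    result = []

    def walk(elem, inherited):
        # An element without its own xml:lang inherits the nearest ancestor's value.
        lang = elem.get(XML_LANG, inherited)
        result.append((elem.tag, lang))
        for child in elem:
            walk(child, lang)

    walk(ET.fromstring(xml_text), None)
    return result


print(languages(DOC))
```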

Publishing

XML grew out of the technical publication community. Use XSL-FO to publish even large or complex multilingual XML documents to HTML, PDF or other formats; include SVG diagrams and MathML formulas in the output.

News

Character Model for the World Wide Web: String Matching and Searching builds upon Character Model for the World Wide Web 1.0: Fundamentals to provide authors of specifications, software developers, and content developers a common reference on string identity matching on the World Wide Web and thereby increase interoperability.

This new version introduces numerous editorial changes, replaces some temporary terminology with better terms, and integrates the case folding text from the string matching algorithm into the case folding section. The document template was also adapted to match the new Internationalization publication process. See details of changes.
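To make the notion of string identity matching concrete, here is a minimal sketch that compares strings after Unicode normalization and case folding, using Python's standard `unicodedata` module. It shows only one possible combination of steps; whether normalization or case folding is appropriate depends on the format or protocol in question. The helper names `match_key` and `same_identity` are made up for this example.

```python
import unicodedata


def match_key(text):
    """Build a comparison key: NFC-normalize, case-fold, then NFC-normalize again."""
    folded = unicodedata.normalize("NFC", text).casefold()
    # Case folding can produce text that is no longer normalized, so normalize once more.
    return unicodedata.normalize("NFC", folded)


def same_identity(a, b):
    """Return True if the two strings match under the key above."""
    return match_key(a) == match_key(b)


# Precomposed and decomposed forms of the same accented letter match.
print(same_identity("caf\u00e9", "cafe\u0301"))
# Case folding maps the German sharp s to "ss".
print(same_identity("Stra\u00dfe", "STRASSE"))
# Accents are still significant, so these do not match.
print(same_identity("resume", "r\u00e9sum\u00e9"))
```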

Over the last year and a half, the LIDER project has organized several roadmapping events to gather a broad community around the topic of linguistic linked data. In July this year, LIDER will engage with two selected communities. On July 6, the 5th LIDER roadmapping workshop will be held in Rome at Sapienza University of Rome. The topic will be cross-media linked data, and the event will feature several high-level speakers from the multimedia area. On July 13, LIDER will organize the 6th roadmapping workshop in Munich. The event will be hosted by Siemens and will focus on content analytics and linked data in healthcare and medicine.

Participation in both workshops is limited. If you are interested in the Rome event, please contact Tiziano Flati; for Munich, please contact Philipp Cimiano.

Language Tags and Locale Identifiers for the World Wide Web describes the best practices for identifying or selecting the language of content as well as the locale preferences used to process or display data values and other information on the Web. It describes how document formats, specifications, and implementations should handle language tags, as well as extensions to language tags that describe the cultural or linguistic preferences referred to in internationalization as a “locale”.

Changes in this update include the following: All references to RFC3066bis were updated to BCP 47 or to RFC 5646 or RFC 4647 as appropriate. References to HTML were changed to point to HTML5. Imported and rewrote the text formerly contained in Web Services Internationalization Usage Scenarios defining internationalization, locale, and other important terms. Modified and reorganized the other sections of this document. Moved the Web services materials to an appendix.
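As a rough illustration of how an implementation might select content by language tag, the sketch below follows the general shape of the lookup scheme in RFC 4647: each preferred tag is progressively truncated from the right until it matches an available tag. It is a simplified sketch, not a complete implementation; the `lookup` function and the `"en"` default are assumptions made up for this example.

```python
def lookup(preferred, available, default="en"):
    """Pick the best available tag for a language priority list.

    Simplified, RFC 4647-style lookup; the default value is an assumption.
    """
    # Language tags compare case-insensitively.
    by_lower = {tag.lower(): tag for tag in available}
    for language_range in preferred:
        subtags = language_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in by_lower:
                return by_lower[candidate]
            subtags.pop()
            # A dangling single-character subtag (such as "x" or "u") is removed as well.
            if subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


print(lookup(["zh-Hant-TW"], ["zh", "zh-Hant", "en"]))
print(lookup(["fr-CA", "de"], ["de", "en"]))
```

Truncating from the right means a request for a specific regional variant can still fall back to a more general tag when no exact match exists.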

See the program. The keynote speaker will be Paige Williams, Director of Global Readiness, Trustworthy Computing, Microsoft. Her keynote is followed by a strong lineup of sessions entitled Developers and Creators, Localizers, Machines, and Users, including speakers from Microsoft, the European Parliament, the UN FAO, Intel, Verisign, and many more. The workshop is made possible by the generous support of the LIDER project.

Participation in the event is free. Please register via the Riga Summit for the Multilingual Digital Single Market site.

The MultilingualWeb workshops, funded by the European Commission and coordinated by the W3C, look at best practices and standards related to all aspects of creating, localizing, and deploying the multilingual Web. The workshops are successful because they attract a wide range of participants, from fields such as localization, language technology, browser development, content authoring, and tool development, to create a holistic view of the interoperability needs of the multilingual Web.

We look forward to seeing you in Riga!

The LIDER project is developing LingHub, a repository of metadata about language resources and linguistic data. During a dedicated conference call on 19 March, 3 p.m. CET, LingHub will be discussed in the LD4LT community group to gather feedback from the public at large. The call is open to the public; no LD4LT group participation is required. Dial-in information is available. The call will be relevant for anybody interested specifically in language resources, or in public metadata repositories and the re-use of public sector information in general.

The Unicode® Consortium announced the start of the beta review for Unicode 8.0.0, which is scheduled for release in June, 2015. All beta feedback must be submitted by April 27, 2015.

Unicode 8.0.0 comprises several changes which require careful migration in implementations, including the conversion of Cherokee to a bicameral script, a different encoding model for New Tai Lue, and additional character repertoire. Implementers need to change code and check assumptions regarding case mappings, New Tai Lue syllables, Han character ranges, and confusables. Character additions in Unicode 8.0.0 include emoji symbol modifiers for implementing skin tone diversity, other emoji symbols, a large collection of CJK unified ideographs, a new currency sign for the Georgian lari, and six new scripts. For more information on emoji in Unicode 8.0.0, see the associated draft Unicode Emoji report.

Please review the documentation, adjust code, test the data files, and report errors and other issues to the Unicode Consortium by April 27, 2015. Feedback instructions are on the beta page.

See more information about testing the 8.0.0 beta. See the current draft summary of Unicode 8.0.0.

The LIDER project is developing a reference architecture for working with Linguistic Linked Data (LLD). LLD is linked data used to represent metadata about linguistic resources and the resources themselves, e.g. lexica, thesauri, corpora, multilingual semantic networks etc. The reference architecture defines various aspects of LLD processing, related e.g. to LLD publishing, linking, services or discovery. As part of this activity, the LD4LT community group is organizing a conference call on 5 March, 3 p.m. CET, to gather feedback from the public at large.

The call is open to the public; no LD4LT group participation is required. Dial-in information is available. No knowledge about LLD is required. We are especially interested in feedback from potential users of LLD in application areas related to content analytics.

We would like to remind you that the deadline for speaker proposals for the 8th MultilingualWeb Workshop (April 29, 2015, Riga, Latvia) is on Sunday, March 8, at 23:59 UTC.

Featuring a keynote by Paige Williams (Director of Global Readiness, Trustworthy Computing at Microsoft) and sessions for various audiences (Web developers, content creators, localisers, users, and multilingual language processing), this workshop will focus on the advances and challenges faced in making the Web truly multilingual. It provides an outstanding and influential forum for thought leaders to share their ideas and gain critical feedback.

While the organizers have already received many excellent submissions, there is still time to make a proposal, and we encourage interested parties to do so by the deadline. With roughly 150 attendees anticipated for the Workshop from a wide variety of profiles, we are certain to have a large and diverse audience that can provide constructive and useful feedback, with stimulating discussion about all of the presentations.

The workshop is made possible by the generous support of the LIDER project and will be part of the Riga Summit 2015 on the Multilingual Digital Single Market. We are organizing the workshop as part of the Riga Summit to strengthen the European related community at large. Depending on the number of submissions to the MultilingualWeb workshop, we may suggest moving some presentations to other days of the summit. For these reasons we highly recommend that you attend the whole Riga Summit! See the line-up of speakers already confirmed for the various events during the summit.

For more information and to register a presentation proposal, please visit the Riga Workshop Call for Participation. To register as a regular participant of the MultilingualWeb workshop or other events at the Riga Summit, please use the Riga Summit 2015 site.

The article Tagging text with no language was updated to correct the statement that lang="" is not appropriate for HTML. Support for this value was introduced with HTML5.

In addition, various editorial changes were made and the page was reorganized, moving the information about XHTML and XML schema considerations to a new advanced section.

We are pleased to announce that Paige Williams, Director of Global Readiness, Trustworthy Computing at Microsoft, will deliver the keynote at the 8th Multilingual Web Workshop, “Data, content and services for the Multilingual Web,” in Riga, Latvia (29 April 2015).

Paige spent 10 years managing internationalization of Microsoft.com, before joining the Trustworthy Computing organization in 2005. In TwC, Paige oversees compliance of company policy for geographic, country-region and cultural requirements, establishing a new center of excellence for market and world readiness, globalization/localizability, and language programs, tools, resources and external community forums to reach markets across the world with the right local experience.

The Multilingual Web Workshop series brings together participants interested in the best practices, new technologies, and standards needed to help content creators, localizers, language tools developers, and others address the new opportunities and challenges of the multilingual Web. It will provide for networking across communities and building connections.

Registration for the Workshop is free, and early registration is recommended since space at the Workshop is limited.

The workshop will be part of the Riga Summit 2015 on the Multilingual Digital Single Market. We are organizing the workshop as part of the Riga Summit to strengthen the European related community at large. Depending on the number of submissions to the MultilingualWeb workshop, we may also suggest moving presentations to other days of the summit. For these reasons we highly recommend that you attend the whole Riga Summit!

There is still opportunity for individuals to submit proposals to speak at the workshop. Ideal proposals will highlight emerging challenges or novel solutions for reaching out to a global, multilingual audience. The deadline for speaker proposals is March 8, but early submission is strongly encouraged. See the Call for Participation for more details.

This workshop is made possible by the generous support of the LIDER project.