Section 1.1 of the Extensible Markup Language (XML) specification gives as a design goal that "Terseness in XML markup is of minimal importance." The Standard Generalized Markup Language (SGML), of which XML is a profile, has a number of features intended to reduce typing when humans are entering markup directly, or to reduce file sizes, but these features were not included in XML.
The resulting XML specification gave us a highly regular language, but one that can consume considerable bandwidth when transmitted in any quantity. Furthermore, although parsing has been greatly simplified in terms of code complexity and run-time requirements, larger data streams necessarily entail greater I/O activity, and this can be significant in some applications.
There has been a steadily increasing demand to find ways to transmit pre-parsed XML documents and Schema-defined objects in such a way that embedded, low-memory, and/or low-bandwidth devices can make use of an interoperable, accessible, internationalized, standard representation for structured information, yet without the overhead of parsing an XML text stream.
Several independent experimenters have reported significant savings in bandwidth, memory usage, and CPU consumption using an ASN.1-based representation of XML documents. Others have claimed that gzip is adequate.
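As a rough illustration of the gzip side of that comparison, the following Python sketch measures how much a repetitive XML document shrinks under gzip. The document contents, element names, and the helper functions `build_sample_xml` and `size_report` are hypothetical and chosen only for illustration; they do not come from any of the experiments mentioned above, and the numbers produced say nothing about ASN.1-based encodings.

```python
import gzip


def build_sample_xml(n_records: int) -> bytes:
    """Build a repetitive, hypothetical XML document (illustration only)."""
    rows = "".join(
        f'<reading sensor="s{i % 4}"><value>{i * 0.5:.1f}</value></reading>'
        for i in range(n_records)
    )
    doc = f'<?xml version="1.0" encoding="UTF-8"?><readings>{rows}</readings>'
    return doc.encode("utf-8")


def size_report(xml_bytes: bytes) -> dict:
    """Return raw size, gzip-compressed size, and their ratio."""
    compressed = gzip.compress(xml_bytes)
    return {
        "raw": len(xml_bytes),
        "gzip": len(compressed),
        "ratio": len(compressed) / len(xml_bytes),
    }


if __name__ == "__main__":
    report = size_report(build_sample_xml(1000))
    print(f"raw bytes:  {report['raw']}")
    print(f"gzip bytes: {report['gzip']}")
    print(f"ratio:      {report['ratio']:.3f}")
```

Note that this measures only bytes on the wire. A receiver of gzip-compressed XML must still decompress the stream and then parse the XML text, so compression alone does not remove the parsing overhead discussed above.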
Advantages of a binary representation of a pre-parsed stream of Information Items (as defined by the XML Infoset) might include:
One potential and very serious disadvantage is that one might lose the View Source Principle which has helped the Web to spread.
The purpose of the Workshop, then, is to study methods to compress XML documents, comparing Infoset-level representations with other methods, in order to determine whether a W3C Working Group might be chartered to produce an interoperable specification for such a transmission format.
We expect several groups to contribute to the workshop:
Although the Workshop is public, it is restricted to approximately 60 places, with at most two attendees per organization. In addition, people wishing to attend must submit a position paper, and will be informed by the Program Committee of the success of their application. The intent is to make sure that participants have an active interest in the area.
Per W3C Process, attendance is on a strict first-come first-served basis!
The Workshop will produce the following:
These will be published on the W3C Web site by the end of October, 2003.
There will be a limit of 60 participants at the Workshop. To ensure maximum diversity amongst participants, only two participants may attend per organization.
Position papers are required in order to participate in this workshop. Each organization or individual wishing to participate must submit a position paper explaining their interest in the workshop no later than 11th August 2003.
There will be no participation fee.
To attend, you must register by filling out the registration form. The URI for the registration form will be sent to you after your position paper is accepted. Send papers (in XHTML/HTML, DocBook or PDF) directly to the conference chair, Liam Quin.
Organizations wishing to participate in the Workshop must submit a position paper. Position papers can be anywhere from one page to 20 pages or more, but must address at least the following questions:
Position papers are due no later than the 11th of August, 2003.
The Position Papers will be made public no later than one month after the Workshop.
Any future work in this area will be governed by the W3C Patent Policy, and will be on Royalty Free terms.
The final agenda will be announced in August. The outline will be as follows:
The W3C Contact and Conference Chair is Liam Quin <liam@w3.org>.
The Workshop will be hosted in the Bay Area, by Sun Microsystems, at the corner of Montague and Lafayette in Santa Clara, CA (see directions).
Participation by teleconference and by Internet Relay Chat (IRC) may be arranged in some cases. Subsequent discussion is expected to occur on a publicly readable mailing list.
This activity will consume 30% of the time of one W3C staff member for chairing the workshop, and 10% of the time of one W3C staff member for managing the workshop website. This workshop is part of the W3C XML Activity.