XML Packaging Working Group Charter

This [DRAFT!] charter is written in accordance with section 3.2.2 of the W3C Process.


Table of Contents

  1. Mission Statement
  2. Scope
    1. Collection
    2. Association
    3. Compression
    4. Dynamic Creation and Incremental Processing
    5. Packaging
  3. Criteria for Success
  4. Duration
  5. Deliverables
  6. Release Policy
  7. Milestones and Schedules
  8. Dependencies with other W3C Groups
  9. Coordination with External Groups
  10. Confidentiality
  11. Intellectual Property Rights Policy
  12. Communication and Meeting Mechanisms
  13. Requirements for Participants


1. Mission Statement

There is a great need for a general purpose packaging mechanism for XML and related files. Consider the following five use cases:

Average User: Has an XML document, a DTD, unparsed (binary) entity files and a stylesheet, and would like to collect these files, describe the relationships between them, compress them and package them together for easier transmission over the web.

SVG: An application exporting SVG has to write out any raster image data as separate files, along with web font files for any fonts used in the document. Thus, one application file can turn into dozens of files when exported as SVG. For large SVG files, compression becomes very important to keep the size down for faster network transmission. SVG files tend to compress well.

Content Protection: A key issue for many is encryption and authentication. They may want to use a proprietary encryption scheme and encrypt the content files while leaving the packaging structure as is in order to retain direct access to the content files. The same scenario could be applied for authentication.

Long Term Storage: The need is to save the content and associated metadata such that the entire unit can be resurrected into the appropriate Databases and file managers. It should have the means to attach authenticity or digital signatures, so there is a means of proving it is the sole source for this document/unit of information. The information has to be stored up to 50-75 years and needs to be able to withstand a legal challenge that the data is what was sent to customers, suppliers, etc.

Dynamic Creation: Increasingly, web servers need to dynamically generate information to transmit. Many of these dynamically created components could be part of a package. If package transmission could begin before all the component files exist, and continue with the transmission of generated or preexisting components, then the package could be sent without having to be stored to disk. Also, the client could begin unpackaging as the package was being received, for early display, content checking and terminating the transmission. The result is a much more efficient transmission scheme.

The mission of the XML Packaging Working Group will be to create a general purpose, flexible, powerful and highly interoperable mechanism for collecting, associating, compressing, encrypting, authenticating, dynamically transmitting, incrementally processing and packaging XML and related files, such that it can be applied in many as yet unforeseen circumstances to support many as yet unforeseen needs.

2. Scope

The scope of the work on XML Packaging covers a wide range of subjects. Though this sounds very open ended, it is not. The idea is to invent new technology only when existing technology will not meet the goal set forth in the Mission Statement. Having the specification define which existing and new technologies will be used for XML Packaging supports the need for true interoperability. The main areas of work are discussed in the rest of this section.

Terms used in this document and their meanings:

Component
Refers to a file within a Collection.
Collection
Refers to a group of Components that are gathered together, along with whatever extra information is defined for the group. This is analogous to files on a file system, with a bit more data.
Index
A file that refers to each Component in a Collection. The information in the Index generally corresponds to the information that can be retrieved from the average file system. The Index is also considered a Component in general discussion.
Association
Refers to a relationship between Components in a Collection. This information is metadata.
Manifest
A compilation of metadata about Components, and the Associations between them. This information could be stored in a new Component.
Package
Refers to a Collection that is bundled together, or packaged, into one file using the packaging scheme to be defined by this Working Group. All Packages are Collections, but not all Collections have been packaged, so they are not all Packages.
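The vocabulary above can be pictured as a small data model. The following is only an illustrative sketch in Python, not something defined by this charter: the class names mirror the terms, but every field name (such as media_type and relation) and the example relation values are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A file within a Collection (field names are hypothetical)."""
    name: str
    data: bytes
    media_type: str = "application/octet-stream"


@dataclass
class Association:
    """A relationship between two Components, held as metadata."""
    source: str    # name of the Component the relationship starts from
    target: str    # name of the related Component
    relation: str  # e.g. "stylesheet" or "dtd"; vocabulary is illustrative


@dataclass
class Manifest:
    """Metadata about Components and the Associations between them."""
    associations: list = field(default_factory=list)

    def related(self, source, relation):
        """Return the target names linked from `source` by `relation`."""
        return [a.target for a in self.associations
                if a.source == source and a.relation == relation]


@dataclass
class Collection:
    """A group of Components plus the extra information defined for it."""
    components: dict = field(default_factory=dict)
    manifest: Manifest = field(default_factory=Manifest)

    def add(self, component):
        self.components[component.name] = component

    def index(self):
        """File-system-like listing: (name, size in bytes) per Component."""
        return [(c.name, len(c.data)) for c in self.components.values()]


collection = Collection()
collection.add(Component("doc.xml", b"<doc/>", "text/xml"))
collection.add(Component("style.css", b"doc { display: block }", "text/css"))
collection.manifest.associations.append(
    Association("doc.xml", "style.css", "stylesheet"))
```

In this model a Package would simply be one serialized form of a Collection, which matches the statement that all Packages are Collections but not the reverse.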

Collection

Since a Collection is a grouping of Components, while a Package takes the same group of Components and packages them into one file, we have two different forms for Collections. Consider briefly how these two forms can be put to use. In a client/server environment one might enumerate four scenarios:

  1. Single Package at server, multiple files at client -- client pulls only those pieces needed.
  2. Single Package at server, single Package at client -- client downloads complete Package.
  3. Multiple files at server, multiple files at client -- client pulls only those pieces needed, no decomposition needed at server or client.
  4. Multiple files at server, single Package file at client -- client downloads complete Collection, composition at client.

Observing the four scenarios shows the utility of supporting the two forms. This "mode-neutrality" would allow servers and clients to not care whether a Collection is packaged or not, because they can convert between modes efficiently. The WG needs to decide what form(s) will be supported, and how this will be done.

The WG will also need to specify naming conventions for Components in Collections; how they can be addressed from the outside; how Components will refer to one another.

Supporting random access to Components within a Collection is raised in scenario one. Will random access be supported in XML Packaging? Will an Index be defined, containing file system like information for each Component? When the Index has been processed, will it be possible to do random access without processing other Components first? The WG will determine the issues with regard to random access, and specify the implementation for the level of random access to be supported.
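One way to picture the random access question is an Index that records a byte offset and length for each Component in a single packaged file. The sketch below is a hypothetical layout written in Python for illustration only; the function names and the plain concatenation format are assumptions, not a proposal of the WG.

```python
import io


def build_package(components):
    """Concatenate component bytes; return (package_bytes, index).

    `index` maps each name to (offset, length), the file-system-like
    information a reader needs to locate one Component.
    """
    body = io.BytesIO()
    index = {}
    for name, data in components.items():
        index[name] = (body.tell(), len(data))
        body.write(data)
    return body.getvalue(), index


def read_component(package, index, name):
    """Random access: seek straight to one Component using the Index."""
    offset, length = index[name]
    stream = io.BytesIO(package)
    stream.seek(offset)
    return stream.read(length)


package, index = build_package({
    "doc.xml": b"<doc/>",
    "doc.dtd": b"<!ELEMENT doc EMPTY>",
    "logo.png": b"\x89PNG...",  # placeholder bytes, not a real image
})
```

Once the Index has been read, read_component can fetch "doc.dtd" without touching the bytes of any other Component, which is the property scenario one depends on.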

Association

There is a great need to associate Components with other Components, Components with metadata and possibly Components with files or metadata outside a Collection. This additional information makes many applications of XML Packaging much more powerful. After a discussion of this topic with Dan Connolly, Tim Berners-Lee and Bert Bos, Michael Sperberg-McQueen wrote:

(See http://lists.w3.org/Archives/Member/w3c-xml-plenary/1998Dec/0010.html)

"Stylesheet linkage is one prominent instance of a large class of problems, related to the question of locating metadata relevant to processing a particular document in a particular way. When that problem is solved, other similar kinds of metadata will take center stage. The problem of locating a Dublin Core description of a document, or some other RDF metadata, will be the hot question. Then the problem of locating the schema, or a schema in a particular notation or schema language. And then the next flavor of the week.
The best solution to this problem, as was recognized a long time ago in discussions of the XML work group (at least my memory says it was long ago), would be to define an application for packaging this kind of information."

The Association metadata would be useful for both the packaged and unpackaged forms of a Collection. The accumulated association information for a Collection is referred to as "the Manifest" within this document. Information that may be associated includes:

The WG will need to specify exactly what kinds of metadata the Collection should be able to associate, and the semantics required by applications supporting this mechanism. The metadata defined in the specification should be limited to information that is of broad general use to XML applications. If the defined metadata is not limited using some sort of metric, this area of work could prove to be unproductive.

The WG will need to consider what syntax the Manifest should be written in, for example XML defined with a schema and having its own namespace. The WG should also consider whether to support an extensibility mechanism for application specific metadata.
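As an illustration of the "XML with its own namespace" option, the following Python sketch writes and reads a tiny manifest. The namespace URI, the element names (manifest, association) and the attribute names are all hypothetical, chosen only for this example; no manifest syntax has been defined.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for this illustration.
MANIFEST_NS = "http://example.org/ns/xml-packaging-manifest"


def build_manifest(associations):
    """Serialize (source, relation, target) triples as namespaced XML."""
    ET.register_namespace("pkg", MANIFEST_NS)
    root = ET.Element(f"{{{MANIFEST_NS}}}manifest")
    for source, relation, target in associations:
        ET.SubElement(root, f"{{{MANIFEST_NS}}}association", {
            "source": source, "relation": relation, "target": target})
    return ET.tostring(root, encoding="unicode")


def read_manifest(xml_text):
    """Parse the manifest back into (source, relation, target) triples."""
    root = ET.fromstring(xml_text)
    return [(e.get("source"), e.get("relation"), e.get("target"))
            for e in root.iter(f"{{{MANIFEST_NS}}}association")]


triples = [("doc.xml", "stylesheet", "style.css"),
           ("doc.xml", "dtd", "doc.dtd")]
manifest_xml = build_manifest(triples)
```

Because read_manifest only looks at elements in the manifest namespace, elements from other namespaces would be skipped, which is one possible shape for an extensibility mechanism.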

Compression

Compression is of particular importance to the Long Term Storage and SVG use cases. For these and other applications, what is needed is a significant reduction in the size of components. The WG has many things to consider in relation to compression:

Shall a compression mechanism for non-binary files be chosen by the group, to encourage interoperability? Shall the compression and decompression method allow for processing as an in-stream filter?
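To make the "in-stream filter" question concrete, here is a minimal Python sketch that compresses and decompresses data chunk by chunk with the standard zlib module. The choice of zlib is an assumption made for illustration; it does not imply that the WG has selected it.

```python
import zlib


def compress_stream(chunks):
    """Yield compressed bytes as each input chunk arrives."""
    compressor = zlib.compressobj()
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:
            yield out
    yield compressor.flush()  # emit whatever is still buffered


def decompress_stream(chunks):
    """Yield decompressed bytes as each compressed chunk arrives."""
    decompressor = zlib.decompressobj()
    for chunk in chunks:
        out = decompressor.decompress(chunk)
        if out:
            yield out
    tail = decompressor.flush()
    if tail:
        yield tail


source = [b"<doc>", b"<p>hello</p>" * 200, b"</doc>"]
restored = b"".join(decompress_stream(compress_stream(iter(source))))
```

Neither side needs the whole input in memory or on disk, so a filter of this kind could sit between the packaging layer and the transmission layer.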

Will compression/decompression be left to the transmission protocol? What effect will this have on interoperability? Or, will compression be done at or below the level of the packaging? Or, will the whole package be compressed after packaging? What effect does this have on random access?

Shall components that are already compressed, such as a JPEG, be subjected to further compression when packaged? Will it be possible to unpackage a component without decompressing it?
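One possible answer to the first question is to record a compression method per component, so that already-compressed data is stored as is. The Python sketch below illustrates the idea; the suffix list, the method labels "stored" and "deflate", and the function names are assumptions made for this example.

```python
import zlib

# Assumption for illustration: these suffixes mark already-compressed data.
ALREADY_COMPRESSED = (".jpg", ".jpeg", ".png", ".gif", ".gz", ".zip")


def encode_component(name, data):
    """Return (method, payload); skip recompression for compressed formats."""
    if name.lower().endswith(ALREADY_COMPRESSED):
        return "stored", data
    return "deflate", zlib.compress(data)


def decode_component(entry):
    """Return the original bytes of a packaged Component."""
    method, payload = entry
    return zlib.decompress(payload) if method == "deflate" else payload


xml_text = b"<doc>" + b"<p>text</p>" * 100 + b"</doc>"
xml_entry = encode_component("doc.xml", xml_text)
jpeg_entry = encode_component("photo.JPG", b"\xff\xd8placeholder")
```

Because each entry carries its own method label, a tool could also hand the payload on unchanged, which corresponds to unpackaging a component without decompressing it.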

Will a package within a package be recognized?

Shall there be many choices of compression technology, or just a few (maybe one), for non-binary files? How does this decision affect ease of implementation, documentation and learning?

The WG will have to review available compression technologies, such as zlib, bzip2 and many others to decide their suitability for the Packaging Specification.

Dynamic Creation and Incremental Processing

Web servers are increasingly generating information dynamically. Web clients need to process much of their information incrementally. Support for both dynamically generated information and the incremental processing of a package is important for the WG to consider. These two areas of work are combined for discussion because they are much like two sides of a coin.

When transmitting a Package as you go, once a dynamically created Component is reached, the size of that Component may be difficult or impossible to obtain on most systems at the time it is needed. Without that size, it is impossible to determine the starting byte for the rest of the Components. Hence, it is not possible to generate an Index in advance, on all systems, for a Package containing dynamically created Components. On the other hand, it is generally easy for a client to compute sizes and create an index during receipt of the package. The WG will need to consider the implications of this problem, what the best solution is and what effect this may have on the Packaging Specification overall.

The WG also needs to consider if and how it will support packages without an Index; if the specification will enable the general usage of XML Packaging without preventing usages over simple file transfer protocols; if a conforming client should be required to check the Package for an Index, and create one if needed; how to indicate component boundaries if serial transmission is supported.
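The two preceding paragraphs can be illustrated with a small Python sketch: the sender writes components serially without knowing their final sizes, and the receiver builds an Index as the data arrives. The wire format used here (a length-prefixed name, length-prefixed chunks, and a zero-length chunk as the component boundary) is purely hypothetical and is only one of many ways to mark boundaries.

```python
import io
import struct


def send_package(components, out):
    """Write components serially; total sizes need not be known up front.

    `components` yields (name, chunk_iterator) pairs. Each chunk is
    length-prefixed and a zero-length chunk marks the component boundary.
    """
    for name, chunks in components:
        encoded = name.encode("utf-8")
        out.write(struct.pack(">H", len(encoded)) + encoded)
        for chunk in chunks:
            if chunk:
                out.write(struct.pack(">I", len(chunk)) + chunk)
        out.write(struct.pack(">I", 0))  # end-of-component marker


def receive_package(stream):
    """Read the serial form, building an Index as components arrive."""
    store = io.BytesIO()
    index = {}
    while True:
        header = stream.read(2)
        if len(header) < 2:
            break  # end of package
        (name_len,) = struct.unpack(">H", header)
        name = stream.read(name_len).decode("utf-8")
        start = store.tell()
        while True:
            (size,) = struct.unpack(">I", stream.read(4))
            if size == 0:
                break
            store.write(stream.read(size))
        index[name] = (start, store.tell() - start)
    return store.getvalue(), index


def generated_rows():
    """Stand-in for dynamically created content of unknown final size."""
    yield b"<rows>"
    for i in range(3):
        yield f"<row n='{i}'/>".encode("ascii")
    yield b"</rows>"


wire = io.BytesIO()
send_package([("static.xml", iter([b"<doc/>"])),
              ("dynamic.xml", generated_rows())], wire)
wire.seek(0)
data, index = receive_package(wire)
```

The sender never computes the size of "dynamic.xml", yet the receiver ends up with an offset and length for every Component, which is the asymmetry described above.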

Another area that will need to be addressed by the WG is how supporting Dynamic Creation and Incremental Processing affects the Manifest. Some of the metadata may be known ahead of time, some during transmission, and some only at the end of transmission. Shall an incomplete Manifest be allowed in the package to support incremental processing and display? If the Manifest is only allowed once in the package, then processing could not begin until after the whole Package had been received. The WG shall decide whether subsequent Manifests add to the information from the previous Manifest, supersede the previous one, or follow some other solution.
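The difference between the additive and the superseding policy can be shown with a short Python sketch. Manifests are modelled here as plain dictionaries mapping a component name to its metadata; this representation and the function names are assumptions made only for illustration.

```python
def merge_manifests(manifests):
    """Additive policy: later Manifests add to or refine earlier entries."""
    combined = {}
    for manifest in manifests:
        for component, metadata in manifest.items():
            combined.setdefault(component, {}).update(metadata)
    return combined


def supersede_manifests(manifests):
    """Replacement policy: the last Manifest received wins outright."""
    return dict(manifests[-1]) if manifests else {}


# A partial Manifest sent early, and a later one sent near the end.
early = {"doc.xml": {"type": "text/xml"}}
late = {"doc.xml": {"stylesheet": "style.css"},
        "style.css": {"type": "text/css"}}

merged = merge_manifests([early, late])
replaced = supersede_manifests([early, late])
```

Under the additive policy a sender can transmit small increments, while under the superseding policy every later Manifest has to repeat whatever earlier information should survive.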

The WG will need to consider the idea of subsetting a package. Shall a client be able to request of a server just certain Components in the package? Shall servers be allowed to take Components from an already existing package to create a new package for transmission?
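The server-side form of subsetting can be sketched in Python as follows. ZIP is used here purely as a stand-in container because the standard library supports it; that choice, and the function names, are assumptions for the example rather than decisions of the WG.

```python
import io
import zipfile


def make_package(components):
    """Build an in-memory ZIP from a name -> bytes mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in components.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def subset_package(package, wanted):
    """Create a new package holding only the requested Components."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as source, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as target:
        for name in wanted:
            target.writestr(name, source.read(name))
    return out.getvalue()


def list_package(package):
    """Return the Component names in a package, in stored order."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return archive.namelist()


full = make_package({
    "doc.xml": b"<doc/>",
    "doc.dtd": b"<!ELEMENT doc EMPTY>",
    "big.svg": b"<svg/>",
})
small = subset_package(full, ["doc.xml", "doc.dtd"])
```

A client request for "just certain Components" would then map onto the `wanted` list, leaving open how such a request is expressed on the wire.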

Packaging

The mechanism chosen for physically packaging Components will largely determine which of the features outlined above may or may not be supported.

Things for the WG to consider when choosing a packaging mechanism: What set of features will the mechanism not support? Which use cases will not be well supported, or completely unsupported? How much time and resources will be required from the WG, and implementors to design, implement and test the mechanism? What is the availability of underlying technology needed to create the packaging mechanism? How much work will be needed to get the chosen mechanism changed to work for XML Packaging?

What are some of the mechanisms that the WG may consider for Packaging? XML is an obvious mechanism to consider. ZIP is a highly used format for packaging information for transmission on the web. MIME is also highly used for transmitting multiple files at the same time. Some other existing mechanism may be found by the WG. Or, a new mechanism could be designed by the WG.
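To give a feel for two of the existing mechanisms named above, the following Python sketch packs the same components once as a ZIP archive and once as a MIME multipart message, using only the standard library. The "related" multipart subtype, the use of Content-Disposition filenames to carry component names, and the function names are assumptions made for this example.

```python
import io
import zipfile
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def pack_zip(components):
    """Package components as a ZIP archive (has a central directory)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in components.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def unpack_zip(package):
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


def pack_mime(components):
    """Package components as a MIME multipart message (serial parts)."""
    package = MIMEMultipart("related")
    for name, data in components.items():
        part = MIMEApplication(data)  # base64-encoded octet-stream part
        part.add_header("Content-Disposition", "attachment", filename=name)
        package.attach(part)
    return package.as_bytes()


def unpack_mime(package):
    message = message_from_bytes(package)
    return {part.get_filename(): part.get_payload(decode=True)
            for part in message.walk() if not part.is_multipart()}


components = {"doc.xml": b"<doc/>", "logo.png": b"\x89PNG\r\n\x1a\n"}
```

The two behave differently with respect to the features discussed earlier: ZIP keeps a directory of entries that supports access by name, while a MIME multipart body is read part by part in order, which suits serial transmission.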

3. Criteria for Success

The Working Group has fulfilled its mission if it succeeds in producing a highly interoperable specification that stimulates the development and widespread use of the future XML Packaging Recommendation, for both general purpose and application specific XML Packaging needs. This will be demonstrated by having a number of interoperable implementations of the specification before going to Proposed Recommendation.

4. Duration

This Working Group is scheduled to last for 18 months, from October 1st 1999 to April 1st 2001.

5. Deliverables

The deliverables for the working group will include:

The deliverables of the Working Group may also include:

6. Release Policy

By default, all documents developed by the Working Group are available from the group's web page. Selected documents may be published via the W3C's technical reports page after approval from W3C management.

Each document must have an editor and an assigned date by which it should become stable. Published documents are to represent the consensus of the Working Group, except where noted. Any issues remaining at this date will be described in the document to avoid delaying its wider release. The Requirements Document will be released first, followed by the first Working Draft within 3 months. Each new Working Draft must be published within 3 months of the previous one. This will continue until a "Last Call" is made on a Working Draft.

It is the policy of the Working Group to publish meeting minutes within a week of a Face to Face meeting, and a day of a teleconference, if not sooner.

Documents that do not fulfill the criteria above (e.g. longer documents describing specific technical solutions brought up by one member of the Working Group) have to be submitted to W3C before they can be published on the W3C technical reports page.

7. Milestones and Schedules

Milestones are mostly approximate at this point, though the time allotted for the work should allow for some adjustment. Additional milestones may be added when the group decides to take on additional work items.

Early September, 1999
Deadline for Advisory Committee representatives to submit their review of the proposal.
Mid September, 1999
Director's Decision. Work starts via e-mail and teleconferences.
Mid September to early October, 1999
Working Group Call for Participation
December, 1999
First Face to Face meeting. This meeting will either wrap up work on the Requirements Document, or advance work on the first Working Draft.
Requirements Document Published.
April, 2000
Second Face to Face meeting. This meeting will either wrap up work on the first Working Draft, or begin work on the hard issues left over from the first Working Draft.
First Working Draft Published.
June, 2000
Third Face to Face meeting. This meeting will either wrap up work on the second Working Draft, or review and make decisions on the remaining work for the WG. This may also include a review of implementations, and how interoperable they are.
Second Working Draft Published.
August, 2000
Publication of third Working Draft, expected to go to Last Call.
Source Code being readied for publication (if so adopted by group).
October, 2000
Proposed Recommendation
November, 2000
Group in quiescence, pending any follow up fixes to the Recommendation.
April, 2001
Working Group closed.

8. Dependencies with other W3C Groups

The XML Packaging Working Group will have to take into account the needs of other groups within the W3C, and to ask certain groups to review published or internal documents from the Working Group. The following W3C activities may have dependencies on the XML Packaging WG:

The XML Packaging Working Group has dependencies on the following WG:

9. Coordination with External Groups

The following is a group that is known or presumed to be working on, or interested in, standards relating to XML Packaging, with pointers to the documents discussing the respective project. The W3C XML Packaging working group will need to liaise with this group.

The WAP (Wireless Application Protocol) Forum: The relationship of this WG with the WAP Forum is based on each group's work on a specification that attempts to reduce the transmission size of XML documents. There has been cooperation between the W3C and the WAP Forum, as indicated by The WAP Forum - W3C Cooperation White Paper. Since the e-mail from Bruce Martin, WAP Liaison to the W3C, to the XML CG entitled "Liaison: WAP / W3C Binary XML Coordination" states, "The W3C and WAP Forum have recently created a committee to facilitate better coordination between the two organizations," the coordination between the XML Packaging WG and the WAP Forum may be better facilitated by this new committee. The author has not been able to identify this committee through a search of the W3C web site; both external and internal documents were searched.

In the same e-mail from Bruce Martin, he states that, "The WAP Forum strongly desires the development of a single industry standard for [...] content space, and ideally, the work within WAP will naturally converge with that of the W3C." The WAP Forum has produced the Wireless Application Protocol Binary XML Content Format Specification Version 1.1. Since the XML Packaging specification is designed to compress the size of XML documents for transmission over the Web, and the WAP Binary Specification says, "The binary XML content format is designed to reduce the transmission size of XML documents, allowing more effective use of XML data on narrowband communication channels," there may be some collision between the work on the two specifications. This needs to be coordinated. Other links are Comment on HDML Submission (see bottom), Handheld Device Markup Language Specification and HTML 4.0 Guidelines for Mobile Access.

10. Confidentiality

The work of the XML Packaging WG is generally covered by the usual W3C member confidentiality agreement. Access to e-mail discussions and to documents developed by the working group will be limited to W3C members and invited experts, until released for publication by the joint agreement of the working group and the W3C management team. Working group members are required to honor the confidentiality of the group's work, until such time that the work is publicly released. This charter will remain a confidential document of the W3C.

11. Intellectual Property Rights Policy

The W3C Intellectual Property Notice and Legal Disclaimers document and the related linked documents explain the policy that the XML Packaging WG will follow. Any person considering membership in the WG should review these documents to know what the policies are and to be sure of being ready to follow them. Joining the WG implies agreement to follow the policies of the W3C in relation to Intellectual Property Rights, as described in those documents.

12. Communication and Meeting Mechanisms

See XML Activity Membership and Decision Process

13. Requirements for Participants

Participation by all members of the working group is expected to take 20% of each participant's work time. More should be expected for the chairs and editor(s).

To be successful, we expect the XML Packaging WG to have approximately 8 to 12 active members for its 18 month duration. The WG will have 3 to 4 Face to Face meetings per year. The level of participation expected of WG members requires that they be conversant in XML and that they support the purpose of the WG set forth in this charter. Requirements for meeting attendance and timely response are described in the Process document.

The W3C Team Contact will ensure that the mailing lists, public and Group pages are adequately maintained; that public Working Drafts are made available on the Technical Reports page; will serve as liaison between non-Team document

Paul Grosso of ArborText will serve as Co-chair and Daniel Veillard will serve as both Co-chair and W3C Team staff contact for the XML Packaging Working Group.

It is expected that this WG's work would consume moderate communications resources for press and media relations and speaking appearances or meeting planning resources.


Author: Joel Nava, Adobe Systems, Inc.
Maintainer : Daniel Veillard, W3C <veillard@w3.org>
$Id: xml-packaging-charter.html,v 1.3 2000/07/25 16:27:31 connolly Exp $
copy of Id: xml-packaging-charter.html,v 1.5 1999/07/16 11:30:03 veillard Exp