Copyright © 2005 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use rules apply.
This document describes use cases for evaluating the potential benefits of an efficient serialization format for XML. The use cases are documented here to understand the constraints involved in environments for which XML employment may be problematic because of one or more characteristics of XML. Desirable properties of XML and alternative formats to address the use cases are derived and discussed in a separate publication of the XML Binary Characterization Working Group (XBC WG) [XBC Properties].
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is a Working Group Note, produced by the XML Binary Characterization Working Group as part of the XML Activity.
This document is part of a set of documents produced according to the Working Group's charter, in which the Working Group has been determining Use Cases, characterizing the Properties that are required by those Use Cases, and establishing objective, shared Measurements to help judge whether XML 1.x and alternate binary encodings provide the required properties.
The XML Binary Characterization Working Group has ended its work. This document is not expected to become a Recommendation later. It will be maintained as a WG Note.
Discussion of this document takes place on the public public-xml-binary@w3.org mailing list (public archives).
Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
1 Introduction
2 Use Case Structure
3 Documented Use Cases
3.1 Metadata in Broadcast Systems
3.1.1 Description
3.1.2 Domain & Stakeholders
3.1.3 Justification
3.1.4 Analysis
3.1.5 Properties
3.1.5.1 Must Have
3.1.5.2 Should Have
3.1.5.3 Nice To Have
3.1.6 Alternatives
3.1.7 References
3.2 Floating Point Arrays in the Energy Industry
3.2.1 Description
3.2.2 Domain
3.2.3 Justification
3.2.4 Analysis
3.2.5 Properties
3.2.5.1 Must Have
3.2.5.2 Should Have
3.2.5.3 Nice To Have
3.2.6 Alternatives
3.2.7 References
3.3 X3D Graphics Model Compression, Serialization and Transmission
3.3.1 Description
3.3.2 Domain & Stakeholders
3.3.3 Justification
3.3.4 Analysis
3.3.5 Properties
3.3.5.1 Must Have
3.3.5.2 Should Have
3.3.5.3 Nice To Have
3.3.6 Alternatives
3.3.7 References
3.4 Web Services for Small Devices
3.4.1 Description
3.4.2 Domain
3.4.3 Justification
3.4.4 Analysis
3.4.5 Properties
3.4.5.1 Must Have
3.4.5.2 Should Have
3.4.5.3 Nice To Have
3.4.6 Alternatives
3.4.7 References
3.5 Web Services within the Enterprise
3.5.1 Description
3.5.2 Domain
3.5.3 Justification
3.5.4 Analysis
3.5.5 Properties
3.5.5.1 Must Have
3.5.5.2 Should Have
3.5.5.3 Nice To Have
3.5.6 Alternatives
3.5.7 References
3.6 Electronic Documents
3.6.1 Description
3.6.2 Domain & Stakeholders
3.6.3 Justification
3.6.4 Analysis
3.6.5 Properties
3.6.5.1 Must Have
3.6.5.2 Should Have
3.6.5.3 Nice To Have
3.6.6 Alternatives
3.6.7 References
3.7 FIXML in the Securities Industry
3.7.1 Description
3.7.2 Domain
3.7.3 Justification
3.7.4 Analysis
3.7.5 Properties
3.7.5.1 Must Have
3.7.5.2 Should Have
3.7.5.3 Nice To Have
3.7.6 Alternatives
3.7.7 References
3.8 Multimedia XML Documents for Mobile Handsets
3.8.1 Description
3.8.2 Domain
3.8.3 Justification
3.8.4 Analysis
3.8.5 Properties
3.8.5.1 Must Have
3.8.5.2 Should Have
3.8.5.3 Nice To Have
3.8.6 Alternatives
3.8.7 References
3.9 Intra/Inter Business Communication
3.9.1 Description
3.9.2 Domain & Stakeholders
3.9.3 Justification
3.9.4 Analysis
3.9.5 Properties
3.9.5.1 Must Have
3.9.5.2 Should Have
3.9.5.3 Nice To Have
3.9.6 Alternatives
3.9.7 References
3.10 XMPP Instant Messaging Compression
3.10.1 Description
3.10.2 Domain & Stakeholders
3.10.3 Analysis
3.10.4 Justification
3.10.5 Properties
3.10.5.1 Must Have
3.10.5.2 Should Have
3.10.5.3 Nice To Have
3.10.6 Alternatives
3.10.7 References
3.11 XML Documents in Persistent Store
3.11.1 Description
3.11.2 Domain & Stakeholders
3.11.3 Justification
3.11.4 Analysis
3.11.5 Properties
3.11.5.1 Must Have
3.11.5.2 Should Have
3.11.5.3 Nice To Have
3.11.6 Alternatives
3.11.7 References
3.12 Business and Knowledge Processing
3.12.1 Description
3.12.2 Domain & Stakeholders
3.12.3 Justification
3.12.4 Analysis
3.12.5 Properties
3.12.5.1 Must Have
3.12.5.2 Should Have
3.12.5.3 Nice To Have
3.12.6 Alternatives
3.12.7 References
3.13 XML Content-based Routing and Publish Subscribe
3.13.1 Description
3.13.2 Domain & Stakeholders
3.13.3 Justification
3.13.4 Analysis
3.13.5 Properties
3.13.5.1 Must Have
3.13.5.2 Should Have
3.13.5.3 Nice To Have
3.13.6 Alternatives
3.13.7 References
3.14 Web Services Routing
3.14.1 Description
3.14.2 Domain & Stakeholders
3.14.3 Justification
3.14.4 Analysis
3.14.5 Properties
3.14.5.1 Must Have
3.14.5.2 Should Have
3.14.5.3 Nice To Have
3.14.6 Alternatives
3.14.7 References
3.15 Military Information Interoperability
3.15.1 Description
3.15.2 Domain & Stakeholders
3.15.3 Justification
3.15.4 Analysis
3.15.5 Properties
3.15.5.1 Must Have
3.15.5.2 Should Have
3.15.5.3 Nice To Have
3.15.6 Alternatives
3.15.7 References
3.16 Sensor Processing and Communication
3.16.1 Description
3.16.2 Domain & Stakeholders
3.16.3 Justification
3.16.4 Analysis
3.16.5 Properties
3.16.5.1 Must Have
3.16.5.2 Should Have
3.16.5.3 Nice To Have
3.16.6 Alternatives
3.16.7 References
3.17 SyncML for Data Synchronization
3.17.1 Description
3.17.2 Domain & Stakeholders
3.17.3 Justification
3.17.4 Analysis
3.17.5 Properties
3.17.5.1 Must Have
3.17.5.2 Should Have
3.17.5.3 Nice To Have
3.17.6 Alternatives
3.17.7 References
3.18 Supercomputing and Grid Processing
3.18.1 Description
3.18.2 Domain & Stakeholders
3.18.3 Justification
3.18.4 Analysis
3.18.5 Properties
3.18.5.1 Must Have
3.18.5.2 Should Have
3.18.5.3 Nice To Have
3.18.6 Alternatives
3.18.7 References
4 References
While XML has been enormously successful as a markup language for documents and data, the overhead associated with generating, parsing, transmitting, storing, or accessing XML-based data has hindered its employment in some environments. The question has been raised as to whether some optimized serialization of XML is appropriate to satisfy the constraints present in such environments. In order to address this question, a compatible means of classifying the requirements posed by specific use cases and the applicable characteristics of XML must be devised. This allows a characterization of the potential gap between what XML currently supports and use case requirements. In addition, it also provides a way to compare use case requirements to determine the degree to which an alternate serialization could be beneficial.
Use cases describing situations where some characteristics of XML may prevent its effective use are presented in this document. The XBC WG has made efforts through internal discussion and dialog with the XML community to define a comprehensive set of use cases to examine and compare potential solutions, including alternate serializations of XML as well as other available means. Comments are invited, especially if important use cases or solutions to address them have been omitted.
In this section we elaborate on the template used to present the use cases. All the use cases collected by this WG are listed in 3 Documented Use Cases.
Description: Provides an overview of the use case.
Domain & Stakeholders: Identifies the functional and/or business area(s) and users to which the use case pertains.
Justification: This is a discussion of why (or why not) a standard solution for efficient XML encoding should be pursued.
Analysis: Examines why XML is appropriate for addressing the use case, and the limitations of XML which hinder its use. Requirements for the use case are discussed in this section.
Properties: References/discusses the properties required to support the use case. For each use case the properties are partitioned into three categories:
Must Have: This is the set of properties which must be supported for a format to be adopted in the use case domain. This is intended to be a high bar in that an unsupported "must have" property would not simply make a format undesirable but actually unusable.
Should Have: This is the set of properties which are important, but not critical, to the use case. A format which did not support "should have" properties would be significantly less desirable than one that did. However, formats not supporting "should have" properties would still be usable for that use case.
Nice To Have: This is the set of properties which are not important, but supporting them brings some benefit to the use case. However, the benefit is generally minor and would be traded off to support "should have" or "must have" properties for that use case.
Alternatives: Presents alternatives to XML for addressing the use case.
References: Lists references to industries, standards, etc. that are related to the use case.
The use cases identified by or submitted to the working group are documented below, in accordance with the template defined in 2 Use Case Structure.
The constant progress of digital TV, the multiplication of channels, the competition and convergence with the Web, and the widespread deployment of a variety of set-top boxes (notably Personal Video Recorders, or PVRs for short) call for services on TV sets that extend beyond simply broadcasting audio/video content.
For instance, above a certain number of channels, broadcasters find themselves having to provide EPG (Electronic Program Guide) services to their users, who would otherwise be overwhelmed by the sheer amount of available content. These EPGs also allow PVRs to automatically pick up recording schedules for given programs, based on user-defined criteria that are matched against metadata broadcast alongside the data.
Similarly, efforts are under way to define a world-wide standard for timed text systems under the leadership of the W3C Timed Text Working Group. Timed text is an essential service for a large variety of video use cases, such as broadcasting subtitles for foreign movies, providing accessible video information, or implementing karaoke systems.
However, there are constraints that cause problems when trying to deploy such services, all of which rely on XML, to television sets and other video devices:
Bandwidth: TV bandwidth is extremely expensive, and how much data you use for services directly constrains the number of channels that you are able to broadcast. In addition to the potential technical issues, there are strong economic motivations to reduce bandwidth usage as much as possible. One week of electronic program guide data easily amounts to roughly 30Mb of XML data.
Processing Power: Most set-top boxes are inexpensive, and the low-end ones suffer from low processing power. Unlike mobile devices, a box the size of the average set-top box (STB) places few physical limits on the processing power it can house; heat and battery life, notably, are of little or no concern. However, the large-scale deployment of STBs and similar devices into households requires them to be extremely inexpensive and therefore as limited as possible (an average mid-range STB may have up to 16Mb of memory and a processor of up to 80MHz, but frequently less, and systems anticipated for deployment in 2008 may have processors of up to 120MHz and up to 64Mb of memory). Convergence with mobile devices also remains a prime motivator for the television industry, and the constraints applicable to mobile devices apply equally to broadcast XML metadata, notably because of efforts such as DVB-H (Digital Video Broadcast for Handsets).
Unidirectional Network: This being broadcast, there is not typically a way for TVs to request data. Instead, it is being continuously streamed and re-streamed to them, a process which is called carouselling (the data itself being 'on a carousel'). Some set-top boxes do in fact have a return channel but the vast majority don't and it is not expected that most would support one in the near future.
Change Resilience: Upgrading several million STBs is often very impractical. Therefore, it must be possible to evolve the broadcast format without breaking older hardware. While XML is perfectly suited to this, the above issues make it unusable. It is thus required that the binary XML format replacing it be resilient to changes in the schema.
These constraints translate into a set of requirements: a decoder should be able to begin decoding at any point in the stream without waiting for an entire document to have been received and be able to reconstruct the document progressively; if a decoder fails to receive a fragment, then, within a limited time duration, it should be able to receive a repeated copy of that same fragment, knowing that some fragments may be more important than others and therefore repeated more frequently; and the transmission size of the data as well as the amount of processing power required to process it should be minimized.
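The decoding requirements above can be illustrated with a minimal sketch of a carousel receiver. The fragment model (an index, a total count and a payload) and the names `Fragment`, `CarouselReceiver`, `carousel_from` and `tune_in` are assumptions made for illustration only; they are not taken from MPEG-7 BiM or from any broadcast standard.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Optional


@dataclass(frozen=True)
class Fragment:
    """Hypothetical carousel fragment: a self-describing piece of one document."""
    index: int      # position of this fragment within the document
    total: int      # number of fragments that make up the document
    payload: bytes  # encoded content of the fragment


@dataclass
class CarouselReceiver:
    """Rebuilds a document from fragments received in any order."""
    received: Dict[int, bytes] = field(default_factory=dict)
    total: Optional[int] = None

    def accept(self, fragment: Fragment) -> None:
        # A repeated fragment is harmless: the first copy received is kept.
        self.total = fragment.total
        self.received.setdefault(fragment.index, fragment.payload)

    def is_complete(self) -> bool:
        return self.total is not None and len(self.received) == self.total

    def available(self) -> bytes:
        # Progressive reconstruction: whatever has arrived, in document order.
        return b"".join(self.received[i] for i in sorted(self.received))


def carousel_from(parts: List[Fragment], start: int) -> Iterator[Fragment]:
    """Endless repetition of `parts`, joined at position `start` (mid-cycle tune-in)."""
    return itertools.islice(itertools.cycle(parts), start, None)


def tune_in(carousel: Iterable[Fragment], receiver: CarouselReceiver) -> int:
    """Feed fragments to the receiver until the document is complete.

    Returns the number of fragments that had to be read.
    """
    count = 0
    for fragment in carousel:
        receiver.accept(fragment)
        count += 1
        if receiver.is_complete():
            break
    return count
```

A receiver that joins the carousel at the second of three fragments still completes the document after one full cycle, because the fragment it missed is simply repeated. A real carousel would also repeat important fragments more often than others, which this sketch does not model.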
As a result, MPEG-7 BiM (a binary encoding of XML originally created to carry video metadata) has been integrated into a number of broadcasting standards, notably ARIB, DVB, and TV Anytime.
This use case is relevant to the entirety of the television distribution industry, comprising content providers, broadcast infrastructure deployers, television and set-top box manufacturers, and of course the broadcasting companies themselves.
It also covers similar requirements that can be found in digital radio broadcasting, where one equally needs to broadcast EPG metadata to very limited devices and to integrate with mobile devices (for instance by sending SVG ads as part of the radio stream).
Finally, convergence with TV is considered to be a major next step in mobile services, and participants on both sides of the fence are currently very active in making television available anywhere, at any time, and on any device. Quite naturally, this leads to the need for common technology between the TV industry and the rest of the Web. As such, the major stakeholders in the domain have expressed interest in reusing a solution commonly agreed upon across several industries.
Television is a very large market that has a strong need for program metadata, and is increasingly converging with the Web at large (with a strong emphasis on mobile devices at first), notably using technologies such as XHTML, SVG, XForms, SMIL, and Timed Text.
Deployed systems already use binary XML, currently standardized as part of ISO MPEG and industry fora such as ARIB, DVB, or TV-Anytime, but have expressed interest in using more broadly adopted technologies.
XML is appropriate for these situations because:
existing specifications based on XML are being reused wholesale;
most major TV standards in the area are already XML-based, and the industry has no wish to go through another standards cycle;
XML is well-suited to describing structured information such as metadata;
XML has proven to be a good format in which to specify user interfaces, notably using XHTML, SVG, or XForms. These are needed for TV applications, and they need to be broadcast;
the industry wishes to publish its data, especially the Electronic Program Guides, to as many media as possible. XML enables it to publish directly to desktops, mobile devices, and TVs using off-the-shelf or Open Source software.
DVB EIT schedules provide a small subset of the required functionality and do not integrate well with a more generic information ecosystem.
Without a W3C format of binary XML, application domains will likely adopt different formats--a phenomenon that already started in the Broadcast domain. This would further cluster the market. A convergence (which can be recognized right now) between the broadcast and the mobile communication services would be hindered by this clustering. In this concrete case, mobile devices would be required to implement codecs of both domains to enable value added services like interactive and location aware TV broadcast. This drives up the initial investment, which translates into a great obstacle for a converged service in the marketplace.
The upstream segment of the energy industry is concerned with exploration for and production of oil and gas. XML-based techniques have made very little penetration into the upstream technology part of the energy industry. The most basic reason for this is the nature of the data, which does not at this time lend itself to being represented usefully in XML.
There are basically two core types of data in this industry: well logs and seismic data. Well logs are moderately large datasets while seismic datasets are very large, typically in the order of gigabytes. Although the Petrotechnical Open Standards Consortium [POSC] has produced an XML schema for well logs [WellLogML], it has not been widely adopted by the industry. At the time of writing, we are not aware of anyone even considering a schema definition for seismic data.
One example of the magnitude of data and processing needs is data from marine seismic surveys. A typical data collection arrangement includes a ship, traveling at a fairly slow speed, trailing a cable behind it that is about two miles long and which contains about a hundred listening devices (geophones) spaced evenly along the cable. An air gun array on the boat fires every thirty seconds or so and the echoes from the subsurface received by each geophone are recorded for about six seconds at a two millisecond sampling rate. That amounts to roughly 36 million words of floating point data per hour. The ship travels back and forth, systematically covering an area several miles on a side, resulting in a highly redundant 3D sampling of echoes from the subsurface. The redundancy, loosely speaking, results from the fact that the cable is moving so that, for example, a given subsurface point might be sampled by geophone 10 on shot 100, geophone 14 on shot 101, geophone 18 on shot 102 and so on. The high degree of redundancy later results in a large number of processing operations involving reordering and sub-setting the data. The communication of this data and the necessity of these operations influence the formats and structures that are appropriate.
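The figure of 36 million words per hour can be checked with a short calculation. The sketch below uses only the approximate quantities given in the paragraph above; counting both endpoints of the six second recording window (3001 samples per trace rather than 3000) is an assumption, as is the 4-byte word size used for the byte estimate.

```python
GEOPHONES = 100            # listening devices along the cable
RECORD_SECONDS = 6.0       # recording time per shot
SAMPLE_INTERVAL_S = 0.002  # two millisecond sampling rate
SHOT_INTERVAL_S = 30.0     # one shot roughly every thirty seconds
BYTES_PER_WORD = 4         # assumption: 32-bit floating point samples


def samples_per_trace(record_s: float = RECORD_SECONDS,
                      interval_s: float = SAMPLE_INTERVAL_S) -> int:
    # Assumption: samples are taken at both t = 0 and t = record_s.
    return round(record_s / interval_s) + 1


def words_per_hour() -> int:
    shots_per_hour = round(3600 / SHOT_INTERVAL_S)  # 120 shots
    return GEOPHONES * samples_per_trace() * shots_per_hour


def bytes_per_hour() -> int:
    return words_per_hour() * BYTES_PER_WORD


print(words_per_hour())  # 36012000, i.e. about 36 million words
print(bytes_per_hour())  # 144048000, i.e. about 144 million bytes under the 4-byte assumption
```

With 3000 samples per trace instead of 3001 the result is exactly 36,000,000 words, so either reading is consistent with the estimate in the text.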
Both seismic and well log data include control data, easily represented in XML, as well as large arrays of floating point numbers, not easily represented efficiently in XML. Although in practice an XML representation is not used, such data may be represented as shown in the following fragment (with a whole document consisting of a large number of these fragments):
<seisdata>
<lineheader>...</lineheader>
<header>
<linename>westcam 2811</linename>
<ntrace>1207</ntrace>
<nsamp>3001</nsamp>
<tstart>0.0</tstart>
<tinc>4.0</tinc>
<shot>120</shot>
<geophone>7</geophone>
</header>
<trace>0.0 0.0 468.34 3.245672E04 6.9762345E05 ... (3001 floats)</trace>
<header>
...
</header>
<trace> ... </trace>
...
</seisdata>
The scope within the Energy Industry as discussed above is very broad, encompassing a very large number of technical issues and usage scenarios involving, for example, integration of drilling information, processing of seismic and well data, integration of seismic and well data into interpretation systems, and so on.
There are a number of dominant technology vendors in this sector as well as a number of small companies that "work around the edges". The dominant technology vendors in this field provide proprietary solutions that do not interoperate easily with each other. Providing communications between these products within a company, or between companies, is a constant problem: this is the main motivator to develop Web service interfaces for these products. A second motivator for a standard is that it will open the door for smaller companies to provide useful add-on products. Large budgets in this sector are allocated to the purchase of software packages and display devices, but these budgets are small compared to the leverage of mass-market devices, so a longer term objective is to encourage a situation where more technologies with mass-market cost leverage can be used.
Given that this scenario involves interoperability between companies using disparate systems, XML is a natural choice due to its ubiquity and tool availability.
The main shortcoming of XML for this application is the processing expense incurred while converting floating point data to and from a character representation, as well as the extra size of some representations. Thus, the main requirement for this use case is the ability to represent sequences of scalars like floating point numbers in a native binary format in order to facilitate efficient use in application processing. In the example shown above, the header information would still have a textual value representation (useful for any infoset-based processing), but the data composed of floating point numbers would appear as a binary stream that is as directly usable as possible.
A candidate format must support floating point numbers in multiple native formats representative of common architectures with appropriate type indication. This allows reader-makes-right conversion only when needed and direct memory load otherwise. In practice, most operations involve moving data between machines with the same floating point formats so the solution should not impose undue overhead on the most common situation in order to handle the less common ones. It is generally considered too expensive to incur processing overhead of conversion between floating point and character representation for this data. The ecosystem of data communication and processing involving multiple independently developed applications indicates the need for exchange formats that are very processing efficient and space efficient in ways that are not processing intensive. This implies that a directly random access, random update, and efficient update capable format would be very useful and that transmitting low-level deltas might make sense to support update or repetitive communication.
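The reader-makes-right approach described above can be sketched as follows. The block layout used here (a one-byte byte-order tag, a 32-bit sample count, then IEEE 754 single-precision samples) and the names `encode_trace` and `decode_trace` are hypothetical and chosen only for illustration; they do not correspond to any existing format. The sketch also assumes that the platform's C `float` is 4 bytes wide.

```python
import array
import struct
import sys

# Tag describing this machine's byte order, in struct-module notation.
NATIVE_TAG = b"<" if sys.byteorder == "little" else b">"


def encode_trace(samples, tag=NATIVE_TAG):
    """Write samples as raw float32 values in the byte order named by `tag`.

    Hypothetical layout: 1-byte order tag, uint32 count, then the samples.
    """
    data = array.array("f", samples)
    if tag != NATIVE_TAG:
        data.byteswap()  # only needed here to simulate a writer on a foreign machine
    header = tag + struct.pack(tag.decode("ascii") + "I", len(data))
    return header + data.tobytes()


def decode_trace(block):
    """Load samples directly, converting only if the writer's byte order differs."""
    tag = block[:1]
    (count,) = struct.unpack_from(tag.decode("ascii") + "I", block, 1)
    data = array.array("f")
    data.frombytes(block[5:5 + 4 * count])  # direct load, no text parsing
    if tag != NATIVE_TAG:
        data.byteswap()  # reader makes right
    return data
```

In the common case, where writer and reader share a floating point format, decoding is a single bulk copy and no per-value conversion takes place. The byte swap is paid only by a reader whose architecture differs from the writer's.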
Developers working in this industry find it most desirable to use a coherent library interface to get efficient access to the usually-native scalar data in the most direct way after receiving a block of data in the format. Similar efficiency is needed in the creation or update of an instance of the format. This need is not fulfilled by something that operates like a traditional parser. A parser-driven architecture usually involves considering each byte of the entire object first, generating many parse events, and then building a memory representation involving memory allocation and data copies, often through interfaces that must be invoked repeatedly. The goal of infrastructure for applications in this industry is to have the overall minimum net overhead in processing. To support adoption as a common interchange format, the industry needs a standard format that can support this lightweight low-overhead processing model.
Notes: Specialized Codecs makes sense if it is the property that allows direct representation of binary scalars. Platform Neutrality is needed in the sense that native scalars from all needed platforms must be supported by any implementation if the format has a Single Conformance Class.
Data Compression: One expert in this area has said, "For us, binary compression is probably not that important because transmission speeds are constantly improving. The additional time needed to compress and decompress seismic data would probably slow things down. We also place a greater value in the message structures than the transmission mechanics". Or, in more picturesque words, again from an expert in the field when asked about compressing seismic data, "Been there, done that, doesn't work, not interested". Bear in mind that this epigram encapsulates decades of experience and highly sophisticated R&D.
CORBA: There is, in fact, a CORBA-based integration platform currently deployed (although perhaps not widely) in this space. Without diving into technical details, it is clear that some companies would prefer an approach based on Web services.
XML Protocol Attachments: It is possible to represent seismic data control information in XML and to put the floating point arrays in a binary attachment using XOP. This data architecture is certainly viable, assuming that the issues involving floating point numbers are addressed, as evidenced by the fact that many of the proprietary vendor data formats work this way. It is, however, less flexible than the header-trace architecture described above, which is probably one reason why the latter is used in industry-wide seismic data standards (e.g. SEGY). Nonetheless, Web services that return data using XOP are an attractive alternative for dealing with seismic data.
Extensible 3D (X3D) Graphics [Extensible 3D (X3D) Graphics] is an XML-enabled ISO Standard 3D file format to enable real-time communication of 3D data across all applications and networks. It is used for commercial applications, engineering and scientific visualization, medical imaging, training, modeling and simulation (M&S), multimedia, entertainment, education, and more. [X3D Markets] Computer-Aided Design (CAD) and architecture scenes are also supported, though because they have larger file sizes and are more complex they are not typically streamed for web-delivered viewing.
Web-delivered file sizes in this use case typically range from 1-1000 KB of data while CAD files may run to several hundred megabytes apiece. An optimized serialization of the X3D data may be performed in concert with application-specific compression (e.g. combining coplanar polygons, quantizing colors, etc.). Lossy geometric compression is sometimes acceptable. Due to interaction requirements, the latency associated with deserialization, decompression and parsing must be minimal. Compatibility with digital signatures and encryption is also important for protecting digital content assets.
Support of Web-based interchange, rendering and interactivity for 3D graphics scenes. Stakeholders include tool builders, content authors, application developers and end users of 3D graphics models.
The X3D Compressed Binary Encoding Request For Proposals (RFP) from the Web3D Consortium [X3D RFP] lists and justifies ten separate technical requirements. Many of these have parallels to a general optimized serialization format. The Web3D Consortium and X3D designers see great value in aligning with W3C standardization efforts in this area. This serves as significant evidence of the need for an industry standard.
Taken together, the following technical requirements for the X3D Compressed Binary Encoding RFP (indicated by emphasized type) include many requirements for an optimized serialization format. Strictly speaking, such an efficient serialization is not necessarily "compressed", though it is very likely to be more compact. Other factors, such as speed of parsing or databinding performance, may override the desire for compact representation for some applications, such as CAD. The Web3D Consortium's RFP has chosen to include the ability to perform application-specific size optimizations, both lossless and lossy, as part of the process of converting the document from XML to the optimized serialization. However, the optimized format can still be represented as XML; in other words, a lossy geometric compression to an efficient format can be translated back to a lossy XML representation.
X3D Compatibility: The compressed binary encoding shall be able to encode all of the abstract functionality described in X3D Abstract Specification. Since X3D is expressed in XML, any optimized serialization that encodes all XML features will also be capable of encoding an X3D document.
Interoperability: The compressed binary encoding shall contain identical information to the other X3D encodings (XML and Classic VRML). It shall support an identical round-trip conversion between the X3D encodings. This corresponds to the Roundtrip Support property.
Multiple, separable data types: The compressed binary encoding shall support multiple, separable media data types, including all node (element) and field (attribute) types in X3D. The RFP allows the possibility of performing domain-specific compressions or encodings of data in the XML document, for example of polygons and textures. The ability to make use of Specialized Codecs is essential to meeting this requirement.
Processing Performance: The compressed binary encoding shall be easy and efficient to process in a runtime environment. Because the data for interactive applications is delivered across the Web, low latency is important, as is the ability to process documents quickly. This matters even more for CAD files, which may be several hundred megabytes in size. 3D files often contain long arrays of numeric data. With the XML format, a reader must extract the string information and convert it to a binary representation, which is computationally expensive. Experimental data has shown that an optimized format can be 20 times as fast as XML. The ability to rapidly process arrays of floating point data is also important. Thus the Accelerated Sequential Access property is also relevant to this use case.
Ease of Implementation: Binary compression algorithms shall be easy to implement. This corresponds to Implementation Cost.
Streaming: Compressed binary encoding will operate in a variety of network-streaming environments. X3D documents are often streamed in web environments, and portions of the 3D scene are rendered as they arrive on the client computer. This corresponds to the Streamable property.
Compression: Compressed binary encoding algorithms will together enable effective compression of diverse datatypes. 3D data often consists of large arrays of floating point data that can be compressed in various ways. The ability to employ Specialized Codecs, either lossless or lossy, would meet this requirement.
Security: Compressed binary encoding will optionally enable security, content protection, privacy preferences and metadata such as encryption, conditional access, and watermarking. This corresponds to the Signable property.
Bundling: Mechanisms for bundling multiple files (e.g. X3D scene, inlined sub-scenes, image files, audio file, etc.) into a single archive file will be considered. This corresponds to Embedding Support.
Intellectual Property Rights (IPR): Technology submissions must meet the Web3D Consortium IPR policy. The W3C Patent Policy [W3C PP] is compatible with this.
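The Compression requirement above allows lossy Specialized Codecs for numeric arrays. As one possible illustration, the sketch below quantizes floating point coordinates to 16-bit integer codes. This is a generic uniform quantizer written for this example, not the method specified by the RFP or by any X3D encoding; the function names and the choice of 16 bits are assumptions.

```python
import array

LEVELS = 65535  # largest value of an unsigned 16-bit code (assumed code width)


def quantize(values):
    """Map floats onto 16-bit codes spanning [min(values), max(values)] (lossy)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / LEVELS if hi > lo else 1.0
    # Clamp to LEVELS to guard against floating point rounding at the upper bound.
    codes = array.array(
        "H", (min(LEVELS, round((v - lo) / step)) for v in values)
    )
    return lo, step, codes


def dequantize(lo, step, codes):
    """Rebuild approximate floats; each is within about step / 2 of the original."""
    return [lo + code * step for code in codes]
```

Each coordinate is stored in two bytes instead of the four needed for a single-precision value, at the cost of a bounded error. Whether that error is acceptable depends on the application, which is why the text treats lossy compression as only sometimes acceptable.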
GZIP is the specified compression scheme for the Virtual Reality Modeling Language (VRML 97) specification, the second-generation ISO predecessor to X3D. GZIP is not type-aware and does not compress large sets of floating-point numbers well. GZIP allows staged decompression of 64KB blocks, which might be used to support streaming capabilities. GZIP outputs are strings and require a second pass for any parsing, thus degrading parsing and loading performance. A GZIPed file would not gain the parsing speed and databinding advantages of an optimized format.
Numerous piecemeal, incompatible proprietary solutions exist in the 3D graphics industry for Web-page plug-ins. None address the breadth of technical capabilities that might be enabled by a general purpose optimized serialization format.
An X3D-specific compression and serialization algorithm for XML is certainly feasible and demonstrated. Compatibility with a general recommendation for an optimized format is desirable in order to maximize interoperability with other XML technologies, and reduce implementation cost. Many of these issues are common to other use-case domains; broad mutual benefits become possible via a common recommendation.
As Web services become more and more ubiquitous, there is a greater demand to use this technology as a way to deliver content to small devices such as PDAs, pagers and mobile phones. These devices often share the following characteristics:
They have limited memory and limited processing power.
Battery life is at a premium.
They are connected to low-bandwidth, high-latency networks which in some cases are regulated by "pay-per-byte" policies.
XML-based messaging is at the heart of the current Web services technology. XML's self-describing nature has significant advantages, but they come at the price of bandwidth and performance. XML-based messages are larger and require more processing than other protocols, and are therefore not well suited for a domain having the characteristics outlined above. Increased bandwidth usage affects wireless networks due to bandwidth restrictions allotted for communication by each device. In addition, the larger the message the higher the probability of a retransmission as a result of an on-the-air collision.
The target platforms for this use case include a broad range of PDAs, handhelds and mobile handsets, including mass market devices that limit code size to 64K and heap size to 230K. The transport packet size may vary from network to network, but it is typically measured in bytes (e.g. 128 bytes).
Small devices connected to low-bandwidth, high-latency networks. Two examples in this domain are cellular phone networks and PDA networks employed by the military.
XML is the fundamental technology underlying a Web services infrastructure, and one of the main reasons why Web services are not being deployed in the mobile space. A number of alternative serializations have already been developed to deliver XML content to small devices; however, many of these are not interoperable. This lack of interoperability results in fragmentation and the need for specialized gateways to transcode proprietary formats.
In order to satisfy the requirements of this use case, an alternative serialization must be faster to process and must produce smaller packets. Faster processing will result in lower battery consumption while smaller packets will result in reduced latency as well as, assuming a pay-per-byte model, a more cost-effective service. In addition to being small and fast, an alternative serialization should also be streamable, i.e. it should be possible for the client application to operate on any prefix of the serialized data.
Assuming that the same amount of information is encoded in an alternative serialization, a way to quantify efficiency is to consider the instruction to data ratio, that is, the amount of effort that is needed to produce or consume a unit of data. Even though this is an implementation requirement, an alternative serialization must enable the creation of "thin" stacks with a low instruction to data ratio.
The reduction in latency that results from improving parsing speed may or may not be noticeable to the consumer, depending on the transport latency of the network, which is the dominant factor in many existing networks. Nevertheless, a more efficient parsing method will improve battery life on the device as well as throughput on the server.
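The streamable property described above can be sketched with textual XML and the pull parser from Python's standard library. The message content and the 32-byte packet size are hypothetical, and an alternative serialization would need its own decoder. The sketch only shows the consumption pattern, in which the client handles each complete element as packets arrive rather than waiting for the whole message.

```python
import xml.etree.ElementTree as ET

# Hypothetical message; in practice it would arrive over the network.
MESSAGE = (
    b"<orders>"
    b"<order id='1'><qty>100</qty></order>"
    b"<order id='2'><qty>250</qty></order>"
    b"<order id='3'><qty>75</qty></order>"
    b"</orders>"
)
PACKET_SIZE = 32  # hypothetical transport packet size in bytes


def consume(data, packet_size=PACKET_SIZE):
    """Feed the message packet by packet and collect completed orders."""
    parser = ET.XMLPullParser(events=("end",))
    orders = []

    def drain():
        for _event, elem in parser.read_events():
            if elem.tag == "order":
                orders.append((elem.get("id"), int(elem.findtext("qty"))))

    for offset in range(0, len(data), packet_size):
        parser.feed(data[offset:offset + packet_size])
        drain()  # handle whatever the prefix received so far completes
    parser.close()
    drain()      # pick up any events the parser delivered only at the end
    return orders


print(consume(MESSAGE))
```

How early each element is reported depends on the underlying parser build, which is why the sketch drains events again after `close()`.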
Proprietary solutions result in the so-called gatewayed networks, where communication is always routed through a single point that translates to and from XML. This architecture not only creates a single point of failure within a network but also fragments the entire network by creating non-interoperable, domain-specific solutions.
Message size reductions are attainable via the use of standard data compression techniques. Even though in general decompression is less expensive than compression, it is still too costly for most small devices. Additionally, the extra burden of compressing packets has a negative impact on the overall system throughput.
In addition to the added cost, redundancy-based compression algorithms tend to perform very poorly on small messages, in many cases resulting in larger messages. Mobile clients often carry on dialogs with servers which consist of a large number of small messages. Examples of this include: data synchronization, stateful web services, multi-player games, querying and browsing data. In all of these use cases, the cumulative stream of messages that make up the dialog can grow very large even though all of the individual messages are rather small. Thus, there is still a need to reduce the amount of data exchanged, but doing this by compressing each message individually is not a viable solution.
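The effect of compressing small messages individually can be observed with a short sketch. The message format is made up for illustration, and DEFLATE (via Python's `zlib` module) stands in for "redundancy-based compression" in general. The second strategy, a single compression context shared across the dialog, is one possible way to exploit redundancy between messages. It is an assumption of this sketch, not something this use case prescribes, and it requires both ends to keep state and to see the messages in order.

```python
import zlib

# Hypothetical dialog: 200 small messages, as in a multi-player game.
messages = [
    f'<move player="p{i % 4}" x="{i}" y="{i * 2}"/>'.encode("ascii")
    for i in range(200)
]
raw_total = sum(len(m) for m in messages)

# Strategy 1: compress every message on its own.
per_message_total = sum(len(zlib.compress(m)) for m in messages)

# Strategy 2: one compression context for the whole dialog, flushed after
# each message so that every message can still be sent immediately.
stream = zlib.compressobj()
shared_total = 0
for m in messages:
    shared_total += len(stream.compress(m) + stream.flush(zlib.Z_SYNC_FLUSH))

print("uncompressed:          ", raw_total)
print("compressed per message:", per_message_total)
print("shared context:        ", shared_total)
```

The per-message figure includes a fixed header and checksum for every message, which is part of why very small messages gain little or nothing from individual compression.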
A large number of existing enterprise systems are built using distributed technologies such as RMI, DCOM and CORBA. As the industry moves from distributed object systems to Service Oriented Architectures (SOAs), the use of Web services technologies becomes more significant even within the confines of a single enterprise. Many of the concepts behind SOAs are applicable to divisions within a corporation, so it is only natural to extend the applicability of Web services to intranet systems.
A stumbling block that several re-architected systems are facing is that XML-based messages are larger and require more processing than those from existing protocols: data is represented inefficiently and binding requires more computation. It has been shown that an RMI service can perform up to an order of magnitude faster than an equivalent Web Service due to the processing required to parse and bind XML data into programmatic objects.
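The difference between binding XML and binding a binary record can be sketched as follows. The message, its field names and the fixed 15-byte layout are hypothetical and do not correspond to RMI, CORBA or any other wire format. The sketch only contrasts the work involved: the XML path parses markup and converts text to numbers, while the binary path unpacks fixed-width fields.

```python
import struct
import xml.etree.ElementTree as ET

XML_MSG = (
    b"<quote><symbol>XYZ</symbol><price>85.25</price><qty>100</qty></quote>"
)

# Hypothetical fixed layout: 3-byte symbol, float64 price, uint32 quantity.
RECORD = struct.Struct("<3sdI")
BIN_MSG = RECORD.pack(b"XYZ", 85.25, 100)


def bind_xml(data):
    """Parse the markup, then convert each text node to its native type."""
    root = ET.fromstring(data)
    return (
        root.findtext("symbol"),
        float(root.findtext("price")),
        int(root.findtext("qty")),
    )


def bind_binary(data):
    """Unpack fixed-width fields directly into native values."""
    symbol, price, qty = RECORD.unpack(data)
    return (symbol.decode("ascii"), price, qty)


print(bind_xml(XML_MSG), len(XML_MSG), "bytes")
print(bind_binary(BIN_MSG), len(BIN_MSG), "bytes")
```

Timing the two functions, for example with `timeit`, gives a rough feel for the gap on a given machine. The figures vary with the parser, the message shape and the hardware, so none are quoted here.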
The domain is that of distributed systems, typically based on binary protocols, which for technical reasons (e.g., interoperability) or for economic reasons (e.g., reduction of software licenses) need to be re-architected as Web services. An important constraint for this type of re-deployment is to maintain (or improve) the system's performance, a task that has been found challenging given the additional processing requirements of XML-based protocols.
There are some important economic reasons that support the use of Web services as an alternative to existing technologies for building distributed systems. First, preliminary results show more powerful hardware is needed to re-deploy existing systems using Web services technologies given the additional processing requirements of an XML messaging system. Second, assuming the company in question already develops (or is planning to develop) Web services to communicate outside their firewall, there is the extra incentive in using the same set of tools and the same development team to build intranet applications. This reduces both software fees (e.g., by reusing application servers and development tools) as well as training costs associated with having separate development teams for each technology. Third, some companies that have successfully deployed CORBA-based systems, but are not planning on deploying Web services, may find an additional incentive to do so if a more efficient serialization is standardized.
Intranet Web services differ from Internet Web services especially in the areas of deployment and security: deployments are easier to manage and security is typically defined by a single domain. The requirements for intranet systems are somewhat different from those for Internet systems, permitting the use of certain optimizations in the former which would be difficult or simply impossible to implement in the latter. Consequently, in many cases the degree of coupling of the systems can be adjusted if this helps in achieving the desired performance goals.
The main requirement for this use case is reducing XML processing time in order to achieve a level of performance comparable to the existing systems. Due to the availability of high-speed networks in these scenarios, reducing message sizes is of a lesser priority. It is worth pointing out that some systems re-deployed using Web services will still meet their performance requirements. Therefore, this use case applies only to a subset of the aforementioned re-deployments.
In some cases, it may be possible to re-design the system's interfaces to make them more coarse grained in order to reduce the number of messages exchanged. Although this is technically feasible in most cases, the costs associated with this effort can be prohibitive.
Documents are the most basic form of recorded human communication, dating back thousands of years. Electronic documents are the transition of this invention to the online, computerized world. Books, forms, contracts, e-mails, spreadsheets, and Web pages are only some of the forms in which electronic documents are used. Unlike paper-based documents, electronic documents are not limited to static text and images. Electronic documents regularly contain static content, dynamic content (e.g., animations, video), and interactive content (e.g., form fields). This wide range of content has a great effect on selecting an appropriate representation format and must be considered in evaluating this use case.
Documents are first created in some authoring environment. During the creation process the author may elect to include text, fonts, image, videos, or other resources which are to be rendered more than once when the document is displayed. For example, a company logo may appear in the header of each page of a document, but this should not require adding the logo to the document more than once.
In a special case of document creation, new documents are created by assembling a set of existing documents into a single aggregate document. For example, this may be done to combine a basic product manual with additional documentation for optional product accessories into a customized manual for an individual purchaser. When documents are bound together in this way it may be important that the data in the original documents is not modified, so as to preserve signatures or other properties of the file, or it may be desirable to identify and eliminate duplicated resources, such as fonts.
After a document has been created it is usually read, in whole or in part. Documents are not necessarily read front to back; a particular reader may select a different order or read only part of a document. A reader may, for example, obtain the document by traversing a hyperlink which points to a specific location within the document. It is important that rendering a document for reading be fast, even when starting at an arbitrary location in this way, and even when documents are large (millions of pages). This implies that it must be possible to navigate to specific sections within a document quickly, as well as follow links to shared resources within the document, as mentioned above under document creation. Finally, if a document is being retrieved over a slow link, it may be useful to fetch portions of the document in the order in which they are being rendered and read (e.g., starting at page 700), as opposed to document order (i.e., starting at page 1).
Documents often contain information of a sensitive or proprietary nature and so can be secured using encryption technologies. Encrypting the document can serve either to keep the contents confidential, to--in conjunction with the rendering application--allow only certain operations ("rights") on the document, or both. Typically a description of any rights granted is embedded within the document itself when it is encrypted. It is often desirable that only portions of a document be encrypted so that intermediaries can access some portion of the data in the file.
Documents, and especially those used in business transactions, are often signed to indicate authenticity of, consent to, or agreement with the document. In electronic documents, this is implemented by digitally signing the document. The digital signature must itself be stored in the document. Multiple signatures may be applied to a document, each one signing those which came before it. Additional information is sometimes added to a document after it has been signed but without invalidating a signature--in the same way one can initial a correction to a paper document--but so that it is clear that any subsequent changes were not present when the pre-existing signatures were applied. In some cases signatures should apply to only part of a document, leaving other parts for later modification. Finally, it must be possible for a recipient to validate all of these signatures.
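The idea that each signature also signs those which came before it can be sketched in a few lines. This is only an illustration of the chaining: it uses HMAC-SHA256 from Python's standard library as a stand-in for a real digital signature scheme (an HMAC is a symmetric construct, not a public-key signature), and the keys and document content are hypothetical.

```python
import hashlib
import hmac


def sign(key, content, prior_signatures):
    """Return a MAC over the content and every earlier signature."""
    mac = hmac.new(key, content, hashlib.sha256)
    for earlier in prior_signatures:
        mac.update(earlier)
    return mac.digest()


def verify_all(keys, content, signatures):
    """Check each signature against the content and its predecessors."""
    for i, (key, signature) in enumerate(zip(keys, signatures)):
        expected = sign(key, content, signatures[:i])
        if not hmac.compare_digest(expected, signature):
            return False
    return True


# Hypothetical signers and document.
keys = [b"author-key", b"reviewer-key", b"approver-key"]
content = b"<contract>...</contract>"

signatures = []
for key in keys:
    signatures.append(sign(key, content, signatures))

print(verify_all(keys, content, signatures))
```

Because each later signature covers the earlier ones, altering the content or replacing an earlier signature causes verification of the chain to fail.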
Documents are often long-lived and, during the course of their lives, used in different environments with varying constraints. For example, when a document is being published for general consumption, it might be most desirable to select an encoding such as XML which is widely understood. If, however, the same document is being transmitted between partners with known expectations a more compact format such as [XOP] might be preferred. Thus, a single document may sometimes be transformed between different encodings at different times and for different purposes. Such transformations should preserve the information in the document, but these operations cannot be expected to be compatible with encryption mechanisms used to secure documents.
Even when various encodings are available documents tend to push the available storage and bandwidth of the devices on which they are created, stored, transmitted, and read. In other words, as device capabilities increase, users respond by creating larger documents. Note that these documents rarely contain only text; they generally contain larger elements such as fonts and images and, increasingly, video and 3D models which these same enhanced devices make possible.
Electronic documents, like their paper counterparts, can be modified or re-purposed. In electronic documents, this typically occurs when pages, images, videos, and so forth are either copied out of a document to be used elsewhere or removed from a document to produce an altered version of that document. Again, these operations should be efficient: removing any one page from a one million page document should not take significantly longer than doing the same to a ten page document. Documents may also be modified by their recipients to include comments of various types--editors' marks, sticky notes, etc.--usually intended to communicate responses back to the author. These comments may be stored within the document itself; both adding them to and extracting them from the document should be efficient.
Finally, some documents are designed to be interactive beyond the limited interactions of rendering, signing, and annotating. These documents may contain form fields, GUI widgets such as buttons and listboxes, or other active elements, data islands bound to these widgets, and code, scripts, or declarative logic to validate input to these elements, enable or disable the elements, transmit the document, modify the document, interact with the rendering application, and so forth. It must be possible to describe and access all of these elements within the document itself.
Electronic documents are used extensively throughout government, business, and personal domains as well as in the interchange between these entities.
XML is in its roots a syntax for marking documents, and so the electronic document use cases seem highly relevant. Interestingly, XML has a number of shortcomings (discussed below) with respect to many of the requirements derived from this use case. Arguably, these occur because XML (and SGML) were focused largely on textual documents, but such documents represent a decreasing fraction of all electronic documents. Thus, Binary XML as a natural extension of XML to handle new document types, and documents containing new content, seems particularly relevant.
Documents are almost always exchanged between two or more people, and often between larger entities such as corporations or governments. It is, therefore, extremely desirable that an electronic document format should be easily consumable by all parties involved. XML, as a widely accepted, implemented, and used format, fits this need quite well.
Unfortunately there are a number of requirements imposed by electronic documents which XML fails to address:
Documents frequently contain embedded resources such as fonts, images, and video which are themselves encoded in binary formats. It must be possible to efficiently embed these resources in documents.
It must be possible to navigate to and render a specified location in better than linear time with respect to the size of the document.
The document encoding must be efficient with respect to space, that is, it must have low redundancy.
In order to make updates efficient, it must be possible to update a document in time proportional to the size of the update rather than the size of the document.
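The navigation requirement above can be illustrated with a minimal sketch of one common approach, an offset index stored alongside the document body. The layout, function names and page content are hypothetical and are not taken from any particular format. The index lets a reader jump straight to a page instead of scanning the document from the beginning.

```python
import io


def write_document(pages):
    """Concatenate pages and record (offset, length) for each one."""
    body = io.BytesIO()
    index = []
    for page in pages:
        index.append((body.tell(), len(page)))
        body.write(page)
    return body.getvalue(), index


def read_page(blob, index, page_number):
    """Fetch one page (0-based) without touching the pages before it."""
    offset, length = index[page_number]
    return blob[offset:offset + length]


# Hypothetical 1000-page document.
pages = [f"<page n='{n}'>...</page>".encode("ascii") for n in range(1, 1001)]
blob, index = write_document(pages)

print(read_page(blob, index, 699))  # the 700th page
```

With textual XML and no such index, locating page 700 generally means parsing everything that precedes it. The same index is one ingredient for updates whose cost is proportional to the size of the change, although this sketch does not implement updates.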
There are a number of requirements which XML does address, but which are enumerated here as well because they would also be requirements on any Binary XML encoding:
The format should be widely accepted, available, and implemented.
Re-usable resources may appear, or be referenced from, multiple locations within the document. In order to maintain reasonable document sizes, it must be possible for these resources to be used by reference, rather than by duplication.
It must be possible to efficiently assemble even large documents.
It must be possible to assemble signed documents in such a way that their signatures are preserved.
It must be possible for a document to contain multiple signatures, full or selective, from one or more signers.
It must be possible to read a secured (encrypted) document without suffering an unreasonable delay when first viewing the document, without unreasonably exposing the decrypted contents of the document, and while obeying rights associated with the document.
It must be possible to efficiently extract data from the document (i.e., a document fragment) and without modification to the extracted data.
Finally, the introduction of multiple formats (i.e., XML and Binary XML), implies the following desirable requirement:
The conversion of a document between different encodings must preserve all information in the document, including digital signatures.
The current de facto standard for interchange of electronic documents is Adobe's Portable Document Format, or PDF. PDF meets all of the requirements stated here except that it is not based on a widely accepted, implemented, and available format, and so, while widely deployed for document viewing, is not sufficiently easy to use for general-purpose interchange.
Earlier formats for electronic documents, such as TIFF, DVI, RTF, AFP, and PostScript, do not, among other shortcomings, support the full range of required document features, such as dynamic and interactive content.
HTML/XHTML, SVG, XSL-FO, and other XML-based formats can, in combination, provide coverage for most of the required document features. However, they fail to meet certain file format requirements as described under Analysis, above.
There are other proprietary formats, such as Microsoft Word and SWF, which meet many of the functional requirements described here but, due to their proprietary nature, also lack sufficiently widespread acceptance, implementation, and availability.
The Securities industry has cooperated to define a standard protocol and a common messaging language called FIX which allows real-time, vendor/platform neutral electronic exchange of securities transactions between financial institutions.
The original definition of FIX was as a tag-value pair format. Due to increased competition by the year 1999, and to better accommodate business models of emerging initiatives, an XML-based message format for application-layer messages called FIXML was devised. Even though FIXML was designed to have minimum impact on existing systems, in order to protect investments in traditional FIX systems and processes, it soon became evident that the new message size was as much as 6 times larger than its tag-value predecessor, a condition that prevented key participants in the industry from integrating FIXML into their systems. This problem, together with some positive findings made through experiments, spurred the discussion for size reduction of FIXML messages, which culminated in a new format called Transport Optimized FIXML (TO-FIXML) in FIXML version 4.4. TO-FIXML is essentially a collection of XML Schema definitions that uses name abbreviations as well as attributes instead of elements wherever possible to collectively reduce the size of FIXML messages by up to a factor of 4.
The securities industry engaged in capital markets such as derivatives, equity and fixed-income markets, where the FIX protocol is applicable and which is moving towards SOA architectures based on FIXML. Major roles played in the industry include brokers, exchanges and clearing houses.
Even though TO-FIXML has been designed to minimize message sizes, some industry participants still consider it to be a sub-optimal solution and envisage the possibility of further optimization by studying binary-compatible XML formats.
XML was the natural choice for the securities industry in light of its expandability and flexibility, which was required for the continuous and rapid evolution of the FIX protocol. There was also a demand for cross-industry interoperability given the broad adoption of XML by other financial industries.
XML Schema is the point of agreement for multiple parties to share a common transport format. However, the bloated size of the XML instances resulted in artificial changes to the schemas, with the sole purpose of reducing the number of bytes on the wire. The methods used for this purpose include the use of name abbreviations and the use of attributes in favor of elements wherever possible. Clearly, XML Schema is not the right place to tackle this problem given that the syntax verbosity is a property exclusive to the XML serialization. Stated differently, XML Schema is the point of agreement in terms of vocabulary and structure, not in terms of syntax.
Shown below are two sample FIXML order messages. The two messages carry the same information, yet their appearance is quite different. The first one is in FIXML version 4.3 format, while the second is in TO-FIXML format (FIXML version 4.4). As shown below, the one in TO-FIXML is much more compact than its equivalent FIXML 4.3 message. Some items such as 'Sender' and 'TransactTime' that were elements in FIXML 4.3 became attributes in TO-FIXML with abbreviated names 'SID' and 'TxnTm', respectively.
<FIXML DTDVersion="2.0.0" FIXVersion="4.3">
  <FIXMLMessage>
    <Header>
      <Sender><CompID>CAT</CompID></Sender>
      <Target><CompID>DOG</CompID></Target>
      <SendingTime>2004-10-13T12:00:00</SendingTime>
    </Header>
    <Order>
      <ClOrdID>123456</ClOrdID>
      <Instrument>
        <Symbol>XYZ</Symbol>
      </Instrument>
      <Side Value="1"/>
      <TransactTime>2004-10-13T12:00:00</TransactTime>
      <OrderQtyData>
        <OrderQty>100</OrderQty>
      </OrderQtyData>
      <OrdType Value="2"/>
      <Price>85.00</Price>
    </Order>
  </FIXMLMessage>
</FIXML>
<FIXML v="4.4" r="20030618" s="20031218">
  <Order ID="123456" Side="1"
         TxnTm="2004-10-13T12:00:00" Typ="2" Px="85.00">
    <Hdr SID="CAT" TID="DOG"
         Snt="2004-10-13T12:00:00"/>
    <Instrmt Sym="XYZ"/>
    <OrdQty Qty="100"/>
  </Order>
</FIXML>
Message size alone can be substantially reduced by standard compression methods. However, there is a study that shows compression of FIXML instances increases round trip time over 10 Mbps networks. Compression may be useful for considerably slower networks, which is not the typical case in FIXML. The same study also suggests that marshalling/unmarshalling costs do not seem to make tangible performance differences in those data sets typically seen in FIX scenarios.
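The claim that the two sample order messages carry the same information in different sizes can be checked with a short sketch. The messages are copied from the samples above with the whitespace between elements removed. The dictionary keys used to compare the fields are names chosen for this illustration and are not FIX terminology.

```python
import xml.etree.ElementTree as ET

FIXML_43 = (
    '<FIXML DTDVersion="2.0.0" FIXVersion="4.3"><FIXMLMessage>'
    "<Header><Sender><CompID>CAT</CompID></Sender>"
    "<Target><CompID>DOG</CompID></Target>"
    "<SendingTime>2004-10-13T12:00:00</SendingTime></Header>"
    "<Order><ClOrdID>123456</ClOrdID>"
    "<Instrument><Symbol>XYZ</Symbol></Instrument>"
    '<Side Value="1"/>'
    "<TransactTime>2004-10-13T12:00:00</TransactTime>"
    "<OrderQtyData><OrderQty>100</OrderQty></OrderQtyData>"
    '<OrdType Value="2"/><Price>85.00</Price>'
    "</Order></FIXMLMessage></FIXML>"
)
TO_FIXML = (
    '<FIXML v="4.4" r="20030618" s="20031218">'
    '<Order ID="123456" Side="1" TxnTm="2004-10-13T12:00:00"'
    ' Typ="2" Px="85.00">'
    '<Hdr SID="CAT" TID="DOG" Snt="2004-10-13T12:00:00"/>'
    '<Instrmt Sym="XYZ"/><OrdQty Qty="100"/>'
    "</Order></FIXML>"
)


def fields_43(text):
    """Extract the order fields from the element-based FIXML 4.3 form."""
    msg = ET.fromstring(text).find("FIXMLMessage")
    header, order = msg.find("Header"), msg.find("Order")
    return {
        "sender": header.findtext("Sender/CompID"),
        "target": header.findtext("Target/CompID"),
        "sent": header.findtext("SendingTime"),
        "order_id": order.findtext("ClOrdID"),
        "symbol": order.findtext("Instrument/Symbol"),
        "side": order.find("Side").get("Value"),
        "time": order.findtext("TransactTime"),
        "qty": order.findtext("OrderQtyData/OrderQty"),
        "type": order.find("OrdType").get("Value"),
        "price": order.findtext("Price"),
    }


def fields_to(text):
    """Extract the same fields from the attribute-based TO-FIXML form."""
    order = ET.fromstring(text).find("Order")
    header = order.find("Hdr")
    return {
        "sender": header.get("SID"),
        "target": header.get("TID"),
        "sent": header.get("Snt"),
        "order_id": order.get("ID"),
        "symbol": order.find("Instrmt").get("Sym"),
        "side": order.get("Side"),
        "time": order.get("TxnTm"),
        "qty": order.find("OrdQty").get("Qty"),
        "type": order.get("Typ"),
        "price": order.get("Px"),
    }


print(fields_43(FIXML_43) == fields_to(TO_FIXML))
print(len(FIXML_43), "bytes vs", len(TO_FIXML), "bytes")
```

The sketch also shows the cost noted earlier: the two forms need different extraction code, because the size reduction was achieved by changing the schema rather than the serialization.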
The Service Enabler standard for mobile handsets benefits from extensive use of XML-based technologies for interoperability. For example, SMIL, SVG and XHTML are used as document formats for mobile content services such as:
Multimedia Messaging Services (MMS): MMS in 3G consists of multiple XML documents, such as SMIL, SVG and XHTML. The handset is required to parse and render multi-namespaced XML documents.
Map Services: Map data delivered to a handset is split into multiple chunks based on region and level of detail; handsets retrieve additional chunks in response to user zooms and scrolls. Additional data, such as restaurant information supplied by other content providers, can also be overlaid on top.
XML documents in these services can be quite large. For instance, the map data represented in SVG could be 100 KB or more. Rich content MMS could also be very large. Even on today's high-end handsets with 120 MHz 32-bit RISC processors, parsing a raw 100 KB XML document takes approximately 10 seconds.
XML is required for maximum interoperability. In fact, XML technology is already widely adopted in the mobile services space. As this area requires a solution for narrow band and limited footprint devices, the importance of this use case should be considered high.
This use case requires the following capabilities of XML to be preserved:
Interoperability
Multiple namespace support
In-memory, random access using a DOM
Interoperability is mandatory as the same documents must be shared among different handsets. Moreover, for map services, the layering of multiple source map data requires interoperability among the providers. Support for multiple namespaces is a must in order to deliver multi-format messages (e.g. HTML + SVG) to the devices. DOM access is required to support ECMA scripting as well as for efficient rendering of formats such as SVG.
The requirements not satisfied by current XML solutions that must be addressed are:
Efficient transmission of XML documents by reducing their sizes
Efficient access to a DOM, i.e. efficient DOM parsing
The WAP Forum defined a WAP Binary XML format as an alternative serialization for XML. However, this format has a number of shortcomings, the biggest of which is the lack of support for multi-namespace documents due to the use of a "single dimension" system of 6-bit tags.
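The "single dimension" limitation can be made concrete with a sketch of how a WBXML tag token is laid out, as we understand the format: one byte in which the top two bits are flags and the remaining six bits identify the tag. The function and key names are chosen for this illustration. The token carries a tag identity only and has no field for a namespace.

```python
def decode_tag_token(token):
    """Split a one-byte WBXML tag token into its flags and tag identity."""
    if not 0 <= token <= 0xFF:
        raise ValueError("a tag token is a single byte")
    return {
        "has_attributes": bool(token & 0x80),  # bit 7
        "has_content": bool(token & 0x40),     # bit 6
        "tag_id": token & 0x3F,                # bits 5-0: 64 possible values
    }


print(decode_tag_token(0xC5))
```

With only six bits of tag identity per token, the space of tags is small and flat, which is consistent with the difficulty described above in representing documents that mix several namespaces.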
A large business communicates via XML with a number of remote businesses, some of which can be small business partners. These remote or small businesses often have access only to slow transmission lines and have limited hardware and technical expertise. The large business cannot expect the smaller partners to upgrade often or to use expensive technology. The primary illustrations of this use case come from the energy, banking, and retail industries.
In the energy industry, the major upstream (exploration and production) operations of oil companies are largely in developing countries and it is a common problem to have very slow and perhaps unreliable communications between the main office and remote sites. It's not that the oil companies don't know how to set up a satellite feed, it's that they are often required by the local governments to use the communication facilities provided by that government, and these communications can be technically low-end and expensive. So the common problem is one where there is plenty of processing power and bandwidth at both central and remote sites, but the communication between the two is slow.
Although many scenarios illustrating this problem have to do with upstream operations, this specific example will be from downstream (refining and marketing). It involves transmission of Point of Sale (POS) information back and forth between back office systems and remote sites. The data flowing to the remote sites includes "incremental price book" for dry goods and wet stock, currency exchange rates, promotion codes/rates/groups and so on. The data coming back includes raw sales transactions data, tank data, etc. One might have 1000 transactions per day per site with an average file size of 3 KB, for a total size of 3 MB, typically broken up into 12 documents (transmittal every 2 hours, referred to as "trickle feed"). Each document would then average 250 KB.
Currently the scope is for many thousands of sites connected to several regional back-office hubs. Connectivity ranges from VSAT to 32 kbps analog connections. The 32 kbps connections would only communicate once a day. This downstream situation includes a factor which is not common in upstream operations. Not only are there communication limitations, but in this case some of the remote sites also have limited processing capabilities because they are small businesses with limited resources.
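The figures in this example can be worked through in a few lines. The sketch takes 1 KB as 1000 bytes and assumes the full 32 kbps is available with no protocol overhead, retransmission or latency. Both are simplifying assumptions, so the results are lower bounds rather than measurements.

```python
TRANSACTIONS_PER_DAY = 1000
BYTES_PER_TRANSACTION = 3_000    # "3 KB", taking 1 KB = 1000 bytes
DOCUMENTS_PER_DAY = 12           # one transmittal every 2 hours
LINK_BITS_PER_SECOND = 32_000    # the 32 kbps analog connection

daily_bytes = TRANSACTIONS_PER_DAY * BYTES_PER_TRANSACTION
document_bytes = daily_bytes // DOCUMENTS_PER_DAY


def transfer_seconds(n_bytes, bits_per_second=LINK_BITS_PER_SECOND):
    """Idealized transfer time: payload bits divided by link rate."""
    return n_bytes * 8 / bits_per_second


print(daily_bytes)                       # 3000000 bytes per site per day
print(document_bytes)                    # 250000 bytes per document
print(transfer_seconds(document_bytes))  # 62.5 seconds per document
print(transfer_seconds(daily_bytes))     # 750.0 seconds for a daily upload
```

Under these assumptions a site that communicates once a day over the 32 kbps link would need at least 12.5 minutes of connection time for its sales data alone, before counting the data flowing in the other direction.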
In the banking industry, there are typically one or more main data centers and several connected branch offices, ATMs, and business partners. The main data center may have the latest in technology; however, the connected branch offices, ATMs and partners are often without access to high speed connections and powerful hardware. Communicating between the various entities can be accomplished with XML Web Services; however, the size and speed issues of XML are troublesome for those without access to high speed lines and/or powerful machines.
These same issues affect the retail industry as stores often are connected to the main data center over less than optimal links. In addition, the retail store needs to perform real-time purchase/return transactions that require round-trip communications with the main center.
Retailing operations of large companies, particularly those where the actual retail outlets are SMEs (Small to Medium Size Enterprises), and large companies with various small business partners and/or branch offices. The belief is that the experience gained in this situation is likely to be directly applicable to a number of other scenarios in the industry.
Note that the players in this use case have rather different situations and needs. The large company has significant sunk investment in complex back-office systems, lots of hardware and a te