This document is a submission to the World Wide Web Consortium from IBM (see Submission Request, W3C Staff Comment).
This document is a NOTE made available by W3C for discussion only. This indicates no endorsement of its content, nor that W3C has had any editorial control in its preparation, nor that W3C has, is, or will be allocating any resources to the issues addressed by the NOTE.
Users will be accessing the Internet increasingly from information appliances such as PDAs, cell phones, and set-top boxes. These devices do not have the same rendering capabilities (display size, color depth, screen resolution, etc.) or network connectivity as traditional desktop clients, and therefore, content must be modified, or transcoded, for proper display on those devices.
This proposal presents annotations that can be attached to HTML/XML documents to guide their adaptation to the characteristics of diverse information appliances. It also provides a vocabulary for transcoding, and syntax of the language for annotating Web content. Used in conjunction with device capability information, style sheets, and other mechanisms, these annotations enable a high quality user experience for users who are accessing Web content from information appliances.
The proposed framework is broadly applicable to cases when content adaptation is desirable. It therefore enhances language translation, Web accessibility, and speech enabling efforts.
As more and more information appliances, or pervasive computing devices, become available for connecting to the Web, the same Web content needs to be rendered differently on client devices, taking account of their physical and performance constraints such as screen size, memory size, and connection bandwidth. For example, a large full-color image may be reduced in size and color depth, and unimportant portions of the content may be removed. Such content adaptation, also called transcoding, can be done for either a set of consecutive elements in an HTML document or a set of individual elements.
This adaptation can be done at a content server, a proxy, or a client device. Such adaptation results in better presentation and faster delivery to the client device. An original HTML document, authored for a specific client such as a PC, can be augmented with annotations, which provide hints for adapting the document to other client devices. It is important to note that the result of applying an annotation to a target document depends on a particular transcoding policy and on knowing the particular needs of the target device. The role of the external annotation is to provide hints that help a transcoding policy make better decisions on content adaptation given the device's characteristics. The specification of such transcoding tools is beyond the scope of this proposal.
This proposal is motivated by the requirements of rendering already published HTML documents on various types of web-enabled pervasive devices. Although external annotation is a general concept with many potential applications, such as text formatting and language processing, we focus on annotations of HTML documents that facilitate content adaptation for pervasive computing devices.
The following figure depicts several paths from an original HTML document to different client devices. An HTML document, which is provided for a desktop PC (path 1), is analyzed and annotated with a separate file by using an annotation tool (path 2). The annotated document must be viewable from a normal browser on a PC (path 3). Furthermore, such an annotated document can also be authored by using a standalone editor (path 4). Upon a request from a pervasive device, a proxy server may adapt the document on the basis of attached annotations (path 5). The rendered document is then downloaded to a client device (path 6). In the process of document transcoding (path 5), it is also necessary to exploit user preferences and device capabilities for the content adaptation. Such information profiles can be described by using Composite Capability/Preference Profiles [CC/PP]. CC/PP is an extensible framework for specifying client-side profiles by using the RDF data model [RDF], and the profiles can be delivered to a proxy server over HTTP [CC/PP-exchange].
Annotations could range from simple ones such as the importance of document elements (elements with lower importance might be ignored when display space is limited) to more complex ones such as specifying separate image files for each device type and possibly for different user preferences.
A possible approach is to define a set of new tags and attributes for annotation, and embed them directly into an HTML document. However, this approach has the following limitations. HTML 4.0 [HTML] is an established international standard, which is huge already. It would be an extremely tough and time-consuming task to extend the specification, and to incorporate the extensions into the standard. Even if such extension were to be agreed upon, the existing browsers are not capable of handling these new tags. It is possible to add some extra attributes, because normal browsers will ignore such extras. However, introducing custom-tailored elements and attributes may confuse browsers that do not understand such a complex structure.
Taking account of the requirement that annotated HTML documents must be viewable by normal browsers (path 3 in the above figure), it will not be acceptable to define such extensions. In addition, further demands on content annotation are emerging, such as Web accessibility [WAI], speech synthesis [SpML] [VXML], and language translation. For example, the word "bank" in a document can be annotated to indicate that it should be interpreted in the sense of "financial institution" rather than that of "river bank." It would be impractical to incorporate these kinds of requirements in their entirety into the established HTML specification, and then modify existing HTML documents. External content annotation is thus key to adapting already published HTML and XML [XML] documents to various constraints stemming from user preferences, client devices, media types, and so on.
Markup languages such as HTML [HTML] embed
annotations into documents. For example, an <ol> tag
indicates the start of an ordered list, and a paragraph begins with a
<p> tag. On the other hand, annotations could be
external, when they reside in a file separated from an original
document. Although making annotations external may require additional
bookkeeping tasks, it has the substantial advantage of not requiring any
modification of existing content that is already published as HTML and XML
files on the Web. Furthermore, this approach offers the advantage of enabling
a one-time specification of meta information for elements that are present in
multiple document files. Finally, external annotation can offer latency and
bandwidth advantages through caching at requesting network elements.
In the following sections, we propose a framework of external annotations for HTML and XML documents, and then introduce a vocabulary, or a tag set, for content adaptation.
This section describes a framework of external annotation, which prescribes a representation scheme of annotation files, and a way of linking original documents with an external annotation file. The basic ideas behind this annotation framework are as follows:
An annotation file refers to portions of a subject document. A reference
may point to a single element (e.g., an IMG element), or a range
of elements (e.g., an H2 element and the following paragraphs).
XPointer allows for such addressing into the internal document structure. For
example, root().child(3).child(7) points to the seventh child
element of the third child element of the root element of the subject
document. If a target element has an id attribute,
the attribute can be used for direct addressing without the need for a long
path expression. Furthermore, a range of elements can be pointed to by using a
span keyword in XPointer.
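The addressing scheme described above can be illustrated with a minimal sketch. It assumes a simplified reading of the pointer syntax used in this document: only root(), child(n), and child(n,NAME) are handled, and the helper name resolve is hypothetical. The id-based and span forms are not covered.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical helper: handles only root() followed by child(n) or child(n,NAME).
_STEP = re.compile(r"child\((\d+)(?:,\s*(\w+))?\)")


def resolve(root, pointer):
    """Return the element addressed by a root().child(...) style pointer."""
    steps = pointer.split(".")
    if steps[0] != "root()":
        raise ValueError("only pointers starting with root() are handled")
    node = root
    for step in steps[1:]:
        match = _STEP.fullmatch(step)
        if match is None:
            raise ValueError(f"unsupported step: {step}")
        index, name = int(match.group(1)), match.group(2)
        # Only child elements are counted; an optional name restricts candidates.
        candidates = [c for c in node if name is None or c.tag == name]
        if index < 1 or index > len(candidates):
            raise LookupError(f"no match for step: {step}")
        node = candidates[index - 1]
    return node


# A small well-formed stand-in for a subject document (illustrative only).
doc = ET.fromstring(
    "<html><head/><body><h2>A</h2><p>one</p><h2>B</h2><p>two</p></body></html>"
)
second_heading = resolve(doc, "root().child(2).child(2,h2)")
print(second_heading.text)
```

Here the pointer walks to the second child of the root (the body) and then to the second h2 child of that element.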
When annotation files are stored in a repository, an appropriate annotation file for a subject HTML document is selected dynamically from the repository either implicitly by means of a structural analysis of the subject document, or explicitly by means of a reference contained in the subject document or some other association database. One annotation file could be associated with a single subject file. On the other hand, a single annotation file may contain meta information about multiple subject files. For example, all of the external annotations for an application, consisting of a set of HTML files and image files, could be encoded in a single annotation file. This approach will be useful if an authoring tool generates these subject files at one time. Furthermore, it is possible for multiple annotation files to be associated with a single subject file. This many-to-one association is useful when constituents of an HTML document appear in different HTML files.
Since the annotation file must be updated whenever the subject HTML file is revised, it is necessary to provide a way of keeping them synchronized. Many methods can be used for this purpose. For instance, this synchronization could be implemented with a database that has a general-purpose meta-content capability. Another implementation might use a digest value (hash value such as MD5 [MD5] or SHA-1 [SHA1]) for ensuring the subject file has not been changed. For example, if an MD5 value of an entire HTML file is recorded in the annotation file, a system can check if a given file is an up-to-date version of the subject HTML file.
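The digest-based check can be sketched as follows. This is only an illustration under stated assumptions: the function names md5_hex and is_up_to_date are hypothetical, and how the digest is recorded in the annotation file is left open, as it is in this proposal.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of the raw bytes of a subject file."""
    return hashlib.md5(data).hexdigest()


def is_up_to_date(subject_bytes: bytes, recorded_digest: str) -> bool:
    """True if the subject file still matches the digest stored with the annotation."""
    return md5_hex(subject_bytes) == recorded_digest.lower()


# Illustrative content; a real system would read the subject file's bytes.
original = b"<HTML><BODY><H2>Rabbit 2000</H2></BODY></HTML>"
recorded = md5_hex(original)          # digest recorded when the annotation was written
revised = original.replace(b"2000", b"2001")

print(is_up_to_date(original, recorded))
print(is_up_to_date(revised, recorded))
```

When the check fails, the annotation file should be treated as possibly stale, because its references may no longer address the intended elements.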
This framework is applicable not only to transcoding for web-enabled personal devices, but also to other cases where content adaptation is desirable. For example, when HTML documents are translated into multiple target languages using a machine translation engine, linguistic annotations such as specifying proper nouns that should not be translated would be useful for improving the translation accuracy. In other situations, content adaptation may be needed so that user-side constraints can be met or alleviated. For example, text content should be transcoded into audio content for a user who is driving a car.
This section provides a sample annotation vocabulary, which can be used to
give adaptation hints for rendering HTML documents on pervasive computing devices.
The vocabulary includes three types of annotation: alternatives, splitting
hints, and selection criteria. In this document, the namespace [Namespaces] prefix
pcd is used for the vocabulary we propose for pervasive computing devices, while
rdf represents the RDF vocabulary [RDF].
A document or any set of its elements can be provided with alternative representations. For example, a color image may have a grayscale image as an alternative for clients with a monochrome display. A transcoding proxy selects the one alternative that best suits the capabilities of the requesting client device. Elements in the subject document can then be altered either by replacement or by on-demand conversion.
The <pcd:Alternatives> tag specifies a list of alternative representations for a subject element. The <rdf:Alt> tag provided by the RDF data model is used to specify alternatives to be included in the pcd:Alternatives element. Each item in the RDF containers (such as rdf:Seq) may include a pcd:Replace element. Note that the original item is an alternative for itself, by default.
The alternatives can be formed hierarchically as an AND-OR tree. For
example, alternatives for a video may include an audio track and a
sequence of images. The
<rdf:Seq> tags allow specifying a combination of
alternative representations for a subject element, as follows.
<rdf:Description about="...">
  <pcd:Alternatives>
    <!-- a collection of an audio track and an image sequence -->
    <rdf:Bag>
      <rdf:li> <!-- audio track --> </rdf:li>
      <rdf:li>
        <rdf:Seq>
          <rdf:li> <!-- the first image --> </rdf:li>
          <rdf:li> <!-- the second image --> </rdf:li>
        </rdf:Seq>
      </rdf:li>
    </rdf:Bag>
  </pcd:Alternatives>
</rdf:Description>
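One way to read the AND-OR tree is sketched below. It rests on assumptions that this proposal does not fix: rdf:Alt is treated as an OR node (exactly one item is chosen), rdf:Bag and rdf:Seq as AND nodes (all items are required), and the selection policy is simply "first feasible item". The tuple-based tree encoding and the function name choose are hypothetical.

```python
def choose(node, fits):
    """Return the list of leaf resources to deliver, or None if nothing is feasible.

    node is ("leaf", name) or (kind, [children]) with kind in {"alt", "bag", "seq"}.
    fits is a predicate telling whether a leaf resource suits the client.
    """
    kind = node[0]
    if kind == "leaf":
        return [node[1]] if fits(node[1]) else None
    children = node[1]
    if kind == "alt":
        # OR node: take the first alternative that can be satisfied.
        for child in children:
            picked = choose(child, fits)
            if picked is not None:
                return picked
        return None
    if kind in ("bag", "seq"):
        # AND node: every item must be satisfiable; order is preserved.
        result = []
        for child in children:
            picked = choose(child, fits)
            if picked is None:
                return None
            result.extend(picked)
        return result
    raise ValueError(f"unknown node kind: {kind}")


# The video example: the original video, or an audio track plus two images.
tree = ("alt", [
    ("leaf", "video"),
    ("bag", [
        ("leaf", "audio"),
        ("seq", [("leaf", "image1"), ("leaf", "image2")]),
    ]),
])

print(choose(tree, lambda name: True))
print(choose(tree, lambda name: name != "video"))
```

A client that can play the video receives it unchanged; a client that cannot falls through to the combination of the audio track and the image sequence.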
The <pcd:Replace> tag specifies an available alternative resource or combination of resources. The resource to be substituted is indicated by a pcd:resourceToSubstitute element.
The <pcd:resourceToSubstitute> tag specifies a resource to be substituted in place of an original resource that is indicated by an about attribute of the rdf:Description element. The pcd:resourceToSubstitute element may also contain a CDATA section with the HTML text that replaces the annotated element containing the original resource.
An HTML file, which can be shown as a single page on a normal desktop PC, may be divided into multiple pages on clients with a smaller display screen.
The <pcd:Group> tag specifies a set of elements to be considered as a logical unit. Another use for the pcd:Group tag is to provide hints for determining appropriate page break points. Alternatives may be provided for the group as a whole.
For example, a news headline may be associated with an alternative
for a news story that consists of paragraphs of text and some images. In
the following example, the range of elements from the second occurrence
of an H2 element through the third occurrence of a
P element is annotated as a group.
<rdf:Description about=span("http://foo.com/catalog.html#root().child(2,H2)",
                            "http://foo.com/catalog.html#root().child(3,P)")>
  <pcd:Group/>
</rdf:Description>
The annotation may contain information to help the transcoding proxy select, from the alternative representations, the one that best suits the client device. This information may indicate (1) the client device capability expected for an alternative resource, (2) the resource requirements of an alternative, (3) its fidelity to the original items, (4) the semantic role of an element, and (5) importance or priority.
The <pcd:clientCapability> tag specifies the hardware or software capabilities of a client device that an alternative is suitable for. It uses CC/PP [CC/PP] to define the client device or its capabilities. The transcoding proxy chooses the alternative for which the pcd:clientCapability is closest to the requesting client device.
The <pcd:resourceRequirement> tag specifies the characteristics of either an alternative resource or an original resource. For example, for an image, this tag would include its width, height, number of colors, and size (in bytes). For a video, it would also include the minimum bandwidth required for streaming. The attributes used to describe resource needs would be borrowed from [CC/PP] wherever possible.
The <pcd:fidelity> tag specifies the fidelity of an alternative as compared to the original. The fidelity value ranges from 0 to 1, where 1 is the fidelity of the original element, and 0 is the fidelity corresponding to the element being dropped from the transcoded page. The default fidelity value is 1. When a fidelity is specified with a value out of range, the default value of 1 is used. A transcoding proxy will try to select the alternative with the highest fidelity within the constraints of the client device. A technique of content selection, which exploits resource requirements, fidelity, and priority, is found in [Mohan99].
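The fidelity rules above can be sketched as follows. Several points are assumptions made for illustration: the constraints of the client device are reduced to a single byte budget, non-numeric values are treated like out-of-range values, and the names parse_fidelity and best_alternative, the dictionary layout, and the byte sizes are hypothetical. This is not the selection technique of [Mohan99].

```python
def parse_fidelity(raw):
    """Fidelity in [0, 1]; missing, malformed, or out-of-range values become 1."""
    if raw is None:
        return 1.0
    try:
        value = float(raw)
    except ValueError:
        return 1.0  # assumption: malformed values fall back to the default
    return value if 0.0 <= value <= 1.0 else 1.0


def best_alternative(alternatives, max_bytes):
    """Pick the highest-fidelity alternative whose size fits the byte budget."""
    feasible = [a for a in alternatives if a["size"] <= max_bytes]
    if not feasible:
        return None
    return max(feasible, key=lambda a: parse_fidelity(a.get("fidelity")))


# Sizes are invented for illustration; the original has no fidelity attribute,
# so it takes the default value of 1.
alternatives = [
    {"name": "original video", "size": 900_000},
    {"name": "low bit-rate video", "size": 120_000, "fidelity": "0.8"},
    {"name": "audio track", "size": 30_000, "fidelity": "0.6"},
]

choice = best_alternative(alternatives, max_bytes=200_000)
print(choice["name"])
```

With a budget of 200,000 bytes the original is excluded, and the low bit-rate video wins over the audio track because its fidelity is higher.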
The <pcd:role> tag specifies the role of a subject element in the document. A transcoder can make decisions on the allocation of client resources (display area, data volume for transmission, etc.) for each element relying on this value. Values of this role attribute will include proper content, advertisement, decoration, icon, and so on.
The <pcd:importance> tag specifies the priority of the subject elements relative to the rest of the elements on the page. When the importance of an element is low, for example, it will be ignored or displayed in a very small font. The value of importance is a real number ranging from -1 to 1, where 1 stands for the highest priority and -1 for the lowest. The default importance value is 0. When an importance is specified with a value out of range, the default value of 0 is used. A transcoding proxy can make decisions on the allocation of client resources (display area, data volume for transmission, etc.) for each element with regard to this importance value.
As in the following example, if an element with a decoration role has a low importance value such as -0.2, it may not be sent to a lightweight client.
<rdf:Description about="http://foo.com/catalog.html#root().child(2,IMG)">
  <pcd:role value="decoration" />
  <pcd:importance value="-0.2" />
</rdf:Description>
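The following sketch shows how a transcoder might read such a description and act on it. It is only an illustration: the fragment is wrapped in an rdf:RDF element with the namespace URIs used in the annotation files of this document so that it parses, and the function names, the threshold of 0, and the drop rule itself are assumptions rather than part of the vocabulary.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/TR/REC-rdf-syntax",
    "pcd": "http://www.ibm.com/annot/pcd",
}

ANNOTATION = """<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:pcd="http://www.ibm.com/annot/pcd">
  <rdf:Description about="http://foo.com/catalog.html#root().child(2,IMG)">
    <pcd:role value="decoration" />
    <pcd:importance value="-0.2" />
  </rdf:Description>
</rdf:RDF>"""


def parse_importance(raw):
    """Importance in [-1, 1]; missing, malformed, or out-of-range values become 0."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return 0.0  # assumption: malformed values fall back to the default
    return value if -1.0 <= value <= 1.0 else 0.0


def read_hints(xml_text):
    """Map each about expression to its role and importance hints."""
    hints = {}
    for desc in ET.fromstring(xml_text).findall("rdf:Description", NS):
        role = desc.find("pcd:role", NS)
        importance = desc.find("pcd:importance", NS)
        hints[desc.get("about")] = {
            "role": role.get("value") if role is not None else None,
            "importance": parse_importance(
                importance.get("value") if importance is not None else None
            ),
        }
    return hints


def should_send(hint, threshold=0.0):
    """Hypothetical policy: drop decorations whose importance is below the threshold."""
    return not (hint["role"] == "decoration" and hint["importance"] < threshold)


hints = read_hints(ANNOTATION)
for about, hint in hints.items():
    print(about, hint, should_send(hint))
```

A different transcoding policy could keep the element and shrink it instead; the annotation only supplies the hint.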
The following examples further illustrate features of the adaptation vocabulary explained above. Suppose we have the following HTML file (catalog.html). It is a car catalog that shows several cars. The HTML file contains a major description, an image, and an additional description of each car. The images are usually of very high quality and thus very large, but they cannot be omitted even for smaller client devices.
<HTML>
<HEAD>
<META name="link" rel="meta" href="http://foo.com/catalog.meta">  <--- Link to
</HEAD>                                                                 meta document
<BODY>
<H2>Turtle Tubo 999</H2>
<P> ... </P>              <--- More important description
<EMBED src="car1.mpg">    <-- Video; should be replaced for smaller devices
<P> ... </P>              <--- Less important description
<H2>Rabbit 2000</H2>      +
                          |<-- Block of a logical unit;
<IMG src="carrot.jpg">    |    may be displayed in a single page
<P> ... </P>              +
</BODY>
</HTML>
A possible annotation file associated with the above HTML file is as
follows. It contains an RDF description specifying alternate versions of the
catalog.html page for various client devices. Note that
it is also necessary to specify the client device that the original page was
authored for.
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:pcd="http://www.ibm.com/annot/pcd"
         xmlns:prf="http://www.w3.org/TR/WD-profile-vocabulary"
         xmlns:dc="http://purl.org/metadata/dublin_core">
  <rdf:Description about="http://foo.com/catalog.html">
    <dc:title>Car catalog</dc:title>
    <dc:author>Hiroshi Maruyama</dc:author>
  </rdf:Description>
  <rdf:Description about="http://foo.com/catalog.html">
    <!-- the page was authored for a typical PC connected over a modem -->
    <pcd:clientCapability value="http://www.ibm.com/profiles/pc_modem.ccpp" />
    <pcd:Alternatives>
      <rdf:Alt>
        <rdf:li>
          <!-- alternative resource suitable for PDAs with CDPD modems -->
          <pcd:Replace>
            <pcd:clientCapability prf:Default="http://www.palmpilot.com/profiles/PalmIII.ccpp" prf:Modem="CDPD" />
            <pcd:resourceToSubstitute pcd:target="http://foo.com/catalog_pda.html" />
          </pcd:Replace>
        </rdf:li>
        <rdf:li>
          <!-- alternative resource suitable for cell phones -->
          <pcd:Replace>
            <pcd:clientCapability prf:Default="http://www.nokia.com/profiles/2160" />
            <pcd:resourceToSubstitute pcd:target="http://foo.com/catalog.wml" />
          </pcd:Replace>
        </rdf:li>
      </rdf:Alt>
    </pcd:Alternatives>
  </rdf:Description>
  <rdf:Description about="http://foo.com/catalog.html#root().child(2,P)">
    <pcd:importance value="-0.2" />
  </rdf:Description>
  <rdf:Description about=span("http://foo.com/catalog.html#root().child(2,H2)",
                              "http://foo.com/catalog.html#root().child(3,P)")>
    <pcd:Group />
  </rdf:Description>
  <rdf:Description about="http://foo.com/catalog.html#root().child(2,IMG)">
    <pcd:role value="decoration" />
    <pcd:importance value="-0.1" />
  </rdf:Description>
</rdf:RDF>
Next we look at a more complex annotation for the same page, where individual items of the page are annotated. The video has alternative videos rendered at different bit-rates. The video also has alternatives as an audio track and a set of images.
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:pcd="http://www.ibm.com/annot/pcd"
         xmlns:prf="http://www.w3.org/TR/WD-profile-vocabulary">
  <rdf:Description about="http://foo.com/catalog.html#root().child(1,EMBED)">
    <pcd:importance value="0.8" />
    <pcd:resourceRequirement width="320" height="140" bpp="8" bw="1.4Mbps" color="color" media="video/mpeg" />
    <pcd:Alternatives>
      <rdf:Alt>
        <rdf:li>
          <pcd:Replace>
            <pcd:fidelity value="0.8" />
            <pcd:resourceRequirement width="160" height="70" bpp="8" bw="28Kbps" color="color" mime="video/h.263" />
            <pcd:resourceToSubstitute>
              <![CDATA[ <EMBED src="http://foo.com/car1.h263" width=170 height=80> ]]>
            </pcd:resourceToSubstitute>
          </pcd:Replace>
        </rdf:li>
        <rdf:li>
          <!-- a collection of an audio track and an image sequence -->
          <rdf:Bag>
            <rdf:li>
              <!-- First item in the collection -->
              <pcd:importance value="-0.4" />
              <rdf:Alt>
                <rdf:li>
                  <pcd:Replace>
                    <pcd:fidelity value="0.6" />
                    <pcd:resourceRequirement bw="8Kbps" mime="audio/real" />
                    <pcd:resourceToSubstitute>
                      <![CDATA[ <EMBED src="http://foo.com/car1_audio.ra" width=80 height=20> ]]>
                    </pcd:resourceToSubstitute>
                  </pcd:Replace>
                </rdf:li>
              </rdf:Alt>
            </rdf:li>
            <rdf:li>
              <!-- Second item in the collection -->
              <pcd:importance value="0.6" />
              <!-- in the sequence of two images, each has a color and a b/w version -->
              <rdf:Seq>
                <rdf:li>
                  <!-- First item in the sequence -->
                  <pcd:Replace>
                    <pcd:fidelity value="0.4" />
                    <pcd:resourceRequirement width="320" height="240" bpp="8" size="18Kb" color="color" mime="image/jpeg" />
                    <pcd:resourceToSubstitute>
                      <![CDATA[ <IMG src="car1_1.jpeg" width=320 height=240> ]]>
                    </pcd:resourceToSubstitute>
                  </pcd:Replace>
                </rdf:li>
                <rdf:li>
                  <!-- Second item in the sequence -->
                  <pcd:Replace>
                    <pcd:fidelity value="0.4" />
                    <pcd:resourceRequirement width="320" height="240" bpp="8" size="18Kb" color="color" mime="image/jpeg" />
                    <pcd:resourceToSubstitute>
                      <![CDATA[ <IMG src="car1_2.jpeg" width=320 height=240> ]]>
                    </pcd:resourceToSubstitute>
                  </pcd:Replace>
                </rdf:li>
              </rdf:Seq>
            </rdf:li>
          </rdf:Bag>
        </rdf:li>
      </rdf:Alt>
    </pcd:Alternatives>
  </rdf:Description>
</rdf:RDF>
Many thanks to the following people who have contributed through review and comment:
Resource identification validity property names (a namespace and property names) for RDF are needed to confirm that the resource described by the about attribute has not changed since the metadata was prepared. At a minimum, the MD5 hash property and the IMT (Internet Mime Type) property should be defined. In addition, it would be necessary to specify whether
If two or more
Description about expressions using XPointer
are such that one is a superset of the other, and both have
importance values, which one governs? (This question should be
answered by the RDF Data Model after XLink is approved.)
If two or more annotation files are in play, and more than one has a
Description with the same
about expression, what are
the semantics of merging the
role properties from the multiple Descriptions?