Annotation of Web Content for Transcoding

W3C Note 10 July 1999

This version: http://www.w3.org/1999/07/NOTE-annot-19990710
Latest version: http://www.w3.org/TR/annot
Authors: Masahiro Hori (horim@jp.ibm.com)
Rakesh Mohan (rakeshm@us.ibm.com)
Hiroshi Maruyama (maruyama@jp.ibm.com)
Sandeep Singhal (singhal@us.ibm.com)

Status of This Document

This document is a submission to the World Wide Web Consortium from IBM (see Submission Request, W3C Staff Comment).

This document is a NOTE made available by W3C for discussion only. This indicates no endorsement of its content, nor that W3C has had any editorial control in its preparation, nor that W3C has, is, or will be allocating any resources to the issues addressed by the NOTE.

Abstract

Users will be accessing the Internet increasingly from information appliances such as PDAs, cell phones, and set-top boxes. These devices do not have the same rendering capabilities (display size, color depth, screen resolution, etc.) or network connectivity as traditional desktop clients, and therefore, content must be modified, or transcoded, for proper display on those devices.

This proposal presents annotations that can be attached to HTML/XML documents to guide their adaptation to the characteristics of diverse information appliances. It also provides a vocabulary for transcoding, and syntax of the language for annotating Web content. Used in conjunction with device capability information, style sheets, and other mechanisms, these annotations enable a high quality user experience for users who are accessing Web content from information appliances.

The proposed framework is broadly applicable to cases when content adaptation is desirable. It therefore enhances language translation, Web accessibility, and speech enabling efforts.

1. Introduction
   1.1 Annotation
   1.2 Related standardization activities
2. Framework
3. Annotation Vocabulary
4. Examples
Acknowledgements
References
Appendix: Open Issues

1. Introduction

As more and more information appliances, or pervasive computing devices, become available for connecting to the Web, the same Web content needs to be rendered differently on client devices, taking account of their physical and performance constraints such as screen size, memory size, and connection bandwidth. For example, a large full-color image may be reduced in size and color depth, and unimportant portions of the content may be removed. Such content adaptation, also called transcoding, can be done for either a set of consecutive elements in an HTML document, or a set of individual elements.

This adaptation can be done at a content server, a proxy, or a client device. Such adaptation results in better presentation and faster delivery to the client device. An original HTML document, authored for a specific client such as a PC, can be augmented with annotations, which provide hints for adapting the document to other client devices. It is important to note that the result of applying an annotation to a target document depends on the particular transcoding policy and on the particular needs of the target device. The role of the external annotation is to provide hints that help a transcoding policy make better decisions on content adaptation given the device's characteristics. The specification of such transcoding tools is beyond the scope of this proposal.

This proposal is motivated by the requirements of rendering already published HTML documents on various types of web-enabled pervasive devices. Although external annotation is a general concept with many potential applications, such as text formatting and language processing, we focus on annotations of HTML documents that facilitate content adaptation for pervasive computing devices.

The following figure depicts several paths from an original HTML document to different client devices. An HTML document, which is provided for a desktop PC (path 1), is analyzed and annotated with a separate file by using an annotation tool (path 2). The annotated document must be viewable from a normal browser on a PC (path 3). Furthermore, such an annotated document can also be authored by using a standalone editor (path 4). Upon a request from a pervasive device, a proxy server may adapt the document on the basis of the attached annotations (path 5). The rendered document is then downloaded to the client device (path 6). In the process of document transcoding (path 5), it is also necessary to exploit user preferences and device capabilities for the content adaptation. Such information profiles can be described by using Composite Capability/Preference Profiles [CC/PP]. CC/PP is an extensible framework for specifying client-side profiles by using the RDF data model [RDF], and the profiles can be delivered to a proxy server over HTTP [CC/PP-exchange].

Figure: Adaptation of HTML documents for personal computing devices

1.1 Annotation

Annotations could range from simple ones such as the importance of document elements (elements with lower importance might be ignored when display space is limited) to more complex ones such as specifying separate image files for each device type and possibly for different user preferences.

A possible approach is to define a set of new tags and attributes for annotation, and embed them directly into an HTML document. However, this approach has the following limitations. HTML 4.0 [HTML] is an established international standard, which is huge already. It would be an extremely tough and time-consuming task to extend the specification, and to incorporate the extensions into the standard. Even if such extension were to be agreed upon, the existing browsers are not capable of handling these new tags. It is possible to add some extra attributes, because normal browsers will ignore such extras. However, introducing custom-tailored elements and attributes may confuse browsers that do not understand such a complex structure.

Taking account of the requirement that annotated HTML documents must be viewable by normal browsers (path 3 in the above figure), it will not be acceptable to define such extensions. In addition, further demands on content annotation are emerging, such as Web accessibility [WAI], speech synthesis [SpML] [VXML], and language translation. For example, the word "bank" in a document can be annotated to indicate that it should be interpreted in the sense of "financial institution" rather than that of "river bank." It would be impractical to incorporate these kinds of requirements in their entirety into the established HTML specification, and then modify existing HTML documents. External content annotation is thus key to adapting already published HTML and XML [XML] documents to various constraints stemming from user preferences, client devices, media types, and so on.

Markup languages such as HTML [HTML] embed annotations into documents. For example, an <ol> tag indicates the start of an ordered list, and a paragraph begins with a <p> tag. On the other hand, annotations can be external, residing in a file separate from the original document. Although making annotations external may require additional bookkeeping tasks, it has the substantial advantage of not requiring any modification of existing content that is already published as HTML and XML files on the Web. Furthermore, this approach enables a one-time specification of meta information for elements that are present in multiple document files. Finally, external annotation can offer latency and bandwidth advantages through caching at requesting network elements.

1.2 Related standardization activities

XHTML [XHTML]: A new HTML Working Group [HTML-WG] started in Summer 1998, in order to define the next generation of HTML, or the Extensible HyperText Markup Language (XHTML). The working group aims to re-cast the current monolithic definition of HTML as a suite of XML tag-sets [XHTML-mod]. The XHTML activity also pursues ways of specifying conformance profiles, which provide semantic constraints on tag support by means of the modularized tag definitions, and help transform a Web document to make it suitable for different devices. The transformation here is a coarse-grained adaptation that does not take further account of the structure of the Web resource. On the other hand, our annotation proposal addresses the other aspect of content adaptation, namely transcoding, which includes conversion of media types and modification of document structure.
RDF [RDF]: The RDF data model defines a simple model for describing relations among resources in terms of named properties and values. In particular, the RDF data model does not define concrete semantics in any application domain. Although the annotation vocabulary introduced in the following section is meant to characterize Web resources, namely HTML/XML documents, the vocabulary does not in itself specify a tag set for describing individual Web resources. The role of the proposed vocabulary is to constrain possibilities of the content alternation, decomposition, and combination. In addition, the conformance profiles mentioned above are currently anticipated to be encoded in RDF [HTML-WG]. Therefore, it makes sense for this annotation vocabulary to be encoded in RDF as well, so that comprehensive content adaptation mechanisms can be pursued consistently in the future.
MPEG-7 [MPEG-7]: A description of various types of audio-visual information is being standardized in the Multimedia Content Description Interface, or MPEG-7, whose working draft is targeted for December 1999. MPEG-7 shares the goal of this external annotation proposal in the sense of not defining the content itself, but instead defining information about the content. The difference is in the primary focus on the assumed content types. This annotation proposal deals with Web resources encoded in HTML/XML, while MPEG-7 focuses on concrete media types. Therefore, it is necessary for this annotation proposal to be consistent with the progress of the MPEG-7 standardization, in pursuit of more convenient and efficient ways of content adaptation.

In the following sections, we propose a framework of external annotations for HTML and XML documents, and then introduce a vocabulary, or a tag set, for content adaptation.

2. Framework

This section describes a framework of external annotation, which prescribes a representation scheme of annotation files, and a way of linking original documents with an external annotation file. The basic ideas behind this annotation framework are as follows:

New tags and/or attributes need not be introduced into the existing language specification, and
Annotation with arbitrary complexity can be given to any part of the subject document.

The external annotation files contain hint information that is linked to elements in the original document. RDF [RDF] is used as the syntax of external annotation files. In addition, XPointer [XPointer] is used for linking annotations with the annotated elements. An annotation file, which is an XML document, therefore contains a set of descriptions for annotating the subject HTML file. The way of adding a descriptive note is illustrated in the figure below.

Figure: An HTML document accompanied by an external annotation file

An annotation file refers to portions of a subject document. A reference may point to a single element (e.g., an IMG element), or a range of elements (e.g., an H2 element and the following paragraphs). XPointer allows for such addressing into the internal document structure. For example, root().child(3).child(7) points to the seventh child element of the third child element of the root element of the subject document. If a target element has an id attribute, the attribute can be used for direct addressing without the need for a long path expression. Furthermore, a range of elements can be pointed to by using the span keyword in XPointer.
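The child-based addressing described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it handles pointers of the form root().child(n) and root().child(n,TYPE), ignores id and span addressing, and the function name resolve and the sample document are hypothetical rather than part of this proposal or of XPointer.

```python
import re
import xml.etree.ElementTree as ET

# Matches one "child(n)" or "child(n,TYPE)" step; other XPointer terms are not handled.
STEP = re.compile(r"child\((\d+)(?:,(\w+))?\)")

def resolve(root, pointer):
    """Resolve a pointer such as 'root().child(3).child(7)' against an element tree."""
    if not pointer.startswith("root()"):
        raise ValueError("only pointers starting at root() are supported")
    node = root
    for number, element_type in STEP.findall(pointer):
        children = list(node)
        if element_type:
            # child(n,TYPE) counts only the children of the given element type
            children = [c for c in children if c.tag.lower() == element_type.lower()]
        index = int(number) - 1  # child positions are counted from 1
        if index < 0 or index >= len(children):
            return None
        node = children[index]
    return node

# Hypothetical, well-formed sample document used only for this sketch.
doc = ET.fromstring(
    "<html><head/><body><h2>A</h2><p>one</p><h2>B</h2><p>two</p></body></html>"
)
second_h2 = resolve(doc, "root().child(2).child(2,H2)")
print(second_h2.text)  # B
```

Here root().child(2) selects the body element, and child(2,H2) then selects the second H2 child within it.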

When annotation files are stored in a repository, an appropriate annotation file for a subject HTML document is selected dynamically from the repository either implicitly by means of a structural analysis of the subject document, or explicitly by means of a reference contained in the subject document or some other association database. One annotation file could be associated with a single subject file. On the other hand, a single annotation file may contain meta information about multiple subject files. For example, all of the external annotations for an application, consisting of a set of HTML files and image files, could be encoded in a single annotation file. This approach will be useful if an authoring tool generates these subject files at one time. Furthermore, it is possible for multiple annotation files to be associated with a single subject file. This many-to-one association is useful when constituents of an HTML document appear in different HTML files.

Since the annotation file must be updated whenever the subject HTML file is revised, it is necessary to provide a way of keeping them synchronized. Many methods can be used for this purpose. For instance, this synchronization could be implemented with a database that has a general-purpose meta-content capability. Another implementation might use a digest value (hash value such as MD5 [MD5] or SHA-1 [SHA1]) for ensuring the subject file has not been changed. For example, if an MD5 value of an entire HTML file is recorded in the annotation file, a system can check if a given file is an up-to-date version of the subject HTML file.
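The digest-based check mentioned above might look like the following Python sketch. It assumes that the annotation file records the hexadecimal MD5 value of the entire subject file; how that value is stored in the annotation is left open by this proposal, and the function names are hypothetical.

```python
import hashlib

def md5_digest(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of the given bytes."""
    return hashlib.md5(data).hexdigest()

def is_up_to_date(subject_bytes: bytes, recorded_digest: str) -> bool:
    """Check whether a subject file still matches the digest recorded for it."""
    return md5_digest(subject_bytes) == recorded_digest.lower()

original = b"<HTML><BODY><H2>Rabbit 2000</H2></BODY></HTML>"
recorded = md5_digest(original)  # stored with the annotation at authoring time
revised = original.replace(b"2000", b"2001")

print(is_up_to_date(original, recorded))  # True
print(is_up_to_date(revised, recorded))   # False
```

A mismatch only signals that the subject file has changed; deciding whether the annotations are still usable remains a matter of policy.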

This framework is applicable not only to transcoding for web-enabled personal devices, but also other cases when content adaptation is desirable. For example, when HTML documents are translated into multiple target languages using a machine translation engine, linguistic annotations such as specifying proper nouns that should not be translated would be useful for improving the translation accuracy. In other situations, content adaptation may be needed, so that user-side constraints can be met or alleviated. For example, text contents should be transcoded into audio content for a user who is driving a car.

3. Annotation Vocabulary

This section provides a sample annotation vocabulary, which can be used for adaptation hints of rendering HTML documents for pervasive computing devices. The vocabulary includes three types of annotation: alternatives, splitting hint and selection criteria. In this document, a namespace [Namespaces] prefix pcd is used for the vocabulary we propose for pervasive computing devices, while the prefix rdf represents the RDF vocabulary [RDF].

3.1 Alternatives

A document or any set of its elements can be provided with alternative representations. For example, a color image may have a grayscale image as an alternative for clients with a monochrome display. A transcoding proxy selects the one alternative that best suits the capabilities of the requesting client device. Elements in the subject document can then be altered either by replacement or on-demand conversion.

pcd:Alternatives

The <pcd:Alternatives> tag specifies a list of alternative representations for a subject element. The <rdf:Alt> tag provided by the RDF data model is used to specify the alternatives to be included in the pcd:Alternatives element. Each item in the RDF containers (rdf:Alt, rdf:Bag and rdf:Seq) may include a pcd:Replace element. Note that the original item is an alternative for itself, by default.

The alternatives can be formed hierarchically as an AND-OR tree. For example, alternatives for a video may include an audio track and a sequence of images. The <rdf:Bag> and <rdf:Seq> tags allow specifying a combination of alternative representations for a subject element as follows.

<rdf:Description about="...">
                <pcd:Alternatives>
                  <!-- a collection of an audio track and an image sequence -->
                  <rdf:Bag>
                    <rdf:li> <!-- audio track --> </rdf:li>
                    <rdf:li> 
                      <rdf:Seq>
                        <rdf:li> <!-- the first image  --> </rdf:li>
                        <rdf:li> <!-- the second image --> </rdf:li>
                      </rdf:Seq>
                    </rdf:li>
                  </rdf:Bag>
                </pcd:Alternatives>
              </rdf:Description>
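One way to read the AND-OR tree is that rdf:Alt is an OR node (exactly one item is chosen) while rdf:Bag and rdf:Seq are AND nodes (all items are delivered together). The following Python sketch enumerates the concrete combinations that such a tree allows. The nested-tuple representation, the function name expand, and the resource names are assumptions made for illustration only; they are not part of the proposed vocabulary.

```python
from itertools import product

def expand(node):
    """Return every concrete combination of leaf resources that a node allows."""
    if isinstance(node, str):       # leaf: a single resource
        return [[node]]
    kind, items = node
    expanded = [expand(item) for item in items]
    if kind == "Alt":               # OR: pick exactly one item
        return [combo for options in expanded for combo in options]
    if kind in ("Bag", "Seq"):      # AND: take every item together
        return [sum(parts, []) for parts in product(*expanded)]
    raise ValueError(f"unknown container: {kind}")

# Hypothetical alternatives for a video: a lower bit-rate video, or an audio
# track combined with a sequence of two images.
video_alternatives = ("Alt", [
    "video-low-bitrate",
    ("Bag", ["audio-track", ("Seq", ["image-1", "image-2"])]),
])

for combination in expand(video_alternatives):
    print(combination)
# ['video-low-bitrate']
# ['audio-track', 'image-1', 'image-2']
```

The sketch does not model the rule that the original item is an alternative for itself by default; a transcoder would add the original as one more option at the top level.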

pcd:Replace

The <pcd:Replace> tag specifies an available alternative resource or combination of resources. The resource to be substituted is indicated by a pcd:resourceToSubstitute element.

pcd:resourceToSubstitute

The <pcd:resourceToSubstitute> tag specifies a resource to be substituted in place of the original resource that is indicated by the about attribute of the rdf:Description element. A pcd:resourceToSubstitute element may also contain a CDATA section with the HTML text that replaces the annotated element of the original resource.

3.2 Splitting hint

An HTML file, which can be shown as a single page in a normal desktop PC, may be divided into multiple pages in clients with a smaller display screen.

pcd:Group

The <pcd:Group> tag specifies a set of elements to be considered as a logical unit. Another use for the pcd:Group tag is to provide hints for determining appropriate page break points. Alternatives may be provided for the group as a whole.

For example, a news headline may be associated with an alternative for a news story that consists of paragraphs of text and some images. In the following example, the range of elements from the second occurrence of an H2 element through the third occurrence of a P element is annotated as a group.

<rdf:Description 
                about=span("http://foo.com/catalog.html#root().child(2,H2)",
                           "http://foo.com/catalog.html#root().child(3,P)")>
                <pcd:Group/>
              </rdf:Description>

3.3 Selection criteria

The annotation may contain information that helps the transcoding proxy select, from the alternative representations, the one that best suits the client device. This information may indicate (1) the client device capability expected for an alternative resource, (2) the resource requirements of an alternative, (3) its fidelity to the original items, (4) the semantic role of an element, and (5) importance or priority.

pcd:clientCapability

The <pcd:clientCapability> tag specifies the hardware or software capabilities of a client device that an alternative is suitable for. It uses CC/PP [CC/PP] to define the client device or its capabilities. The transcoding proxy chooses the alternative for which the pcd:clientCapability is closest to the requesting client device.

pcd:resourceRequirement

<pcd:resourceRequirement> tag specifies the characteristics of either an alternative resource or an original resource. For example, for an image, this tag would include its width, height, number of colors, and size (in terms of bytes). For a video, it would also include the minimum bandwidth required for streaming. The attributes used to describe resource needs would borrow attributes from [CC/PP] wherever possible.

pcd:fidelity

The <pcd:fidelity> tag specifies the fidelity of an alternative as compared to the original. The fidelity value ranges from 0 to 1, where 1 is the fidelity of the original element, and 0 is the fidelity corresponding to the element being dropped from the transcoded page. The default fidelity value is 1. When a fidelity is specified with a value out of range, the default value of 1 is used. A transcoding proxy will try to select the alternative with the highest fidelity within the constraints of the client device. A technique of content selection, which exploits resource requirements, fidelity, and priority, is found in [Mohan99].
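The selection rule above can be illustrated with a small Python sketch. It is a deliberately simplified policy, not the technique of [Mohan99]: the only client constraint considered is available bandwidth, the Alternative class and its field names are hypothetical, and the figures are borrowed from the video example in Section 4.

```python
from dataclasses import dataclass

def normalize(value, low, high, default):
    """Fall back to the default when a value lies outside [low, high]."""
    return value if low <= value <= high else default

@dataclass
class Alternative:
    name: str
    fidelity: float        # pcd:fidelity, expected in [0, 1]
    bandwidth_kbps: float  # hypothetical stand-in for pcd:resourceRequirement

def select(alternatives, available_kbps):
    """Pick the highest-fidelity alternative that fits the bandwidth budget."""
    feasible = [a for a in alternatives if a.bandwidth_kbps <= available_kbps]
    if not feasible:
        return None
    # Out-of-range fidelity values are treated as the default value 1.
    return max(feasible, key=lambda a: normalize(a.fidelity, 0.0, 1.0, 1.0))

candidates = [
    Alternative("original video", 1.0, 1400),
    Alternative("H.263 video", 0.8, 28),
    Alternative("audio track", 0.6, 8),
]
print(select(candidates, 56).name)  # H.263 video
print(select(candidates, 10).name)  # audio track
print(select(candidates, 4))        # None
```

When no alternative fits, the sketch returns None; a real transcoder might instead drop the element, which corresponds to a fidelity of 0.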

pcd:role

<pcd:role> tag specifies a role of a subject element in the document. A transcoder can make decisions on the allocation of client resources (display area, data volume for transmission, etc.) for each element relying on this value. Values of this role attribute will include proper content, advertisement, decoration, icon, and so on.

pcd:importance

The <pcd:importance> tag specifies the priority of the subject elements relative to the rest of the elements on the page. When the importance of an element is low, for example, it may be ignored or displayed in a very small font. The value of importance is a real number ranging from -1 to 1, where 1 stands for the highest priority, and -1 for the lowest. The default importance value is 0. When an importance is specified with a value out of range, the default value of 0 is used. A transcoding proxy can make decisions on the allocation of client resources (display area, data volume for transmission, etc.) for each element with regard to this importance value.

As in the following example, if an element with a decoration role has a low importance value such as -0.2, it may not be sent to a lightweight client.

<rdf:Description 
                     about="http://foo.com/catalog.html#root().child(2,IMG)">
                <pcd:role value="decoration" />
                <pcd:importance value="-0.2" />
              </rdf:Description>
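How a transcoder uses the importance value is a matter of policy. As one possible reading, the Python sketch below drops every element whose importance falls below a threshold chosen for the client. The threshold, the function names, and the element labels are hypothetical; only the value range and the default of 0 come from the definition above.

```python
def importance_of(raw_value):
    """Parse a pcd:importance value; missing or out-of-range values become 0."""
    if raw_value is None:
        return 0.0
    value = float(raw_value)
    return value if -1.0 <= value <= 1.0 else 0.0

def keep_for_client(annotations, threshold):
    """Return the targets whose importance is at least the client's threshold."""
    return [target for target, raw in annotations if importance_of(raw) >= threshold]

annotations = [
    ("H2 heading", None),       # no annotation: default importance 0
    ("decorative IMG", "-0.2"),
    ("EMBED video", "0.8"),
    ("sidebar", "3.5"),         # out of range: treated as 0
]
print(keep_for_client(annotations, 0.0))
# ['H2 heading', 'EMBED video', 'sidebar']
```

With a threshold of 0, the decorative image with importance -0.2 is the only element withheld from the lightweight client.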

4. Examples

The following examples further illustrate features of the adaptation vocabulary explained above. Suppose we have the following HTML file (catalog.html). It is a car catalog that shows several cars. The HTML file contains a major description, an image, and an additional description of each car. The images are usually of very high quality and thus very large, but they cannot be omitted even for smaller client devices.

catalog.html

<HTML>
        <HEAD>
          <META name="link" rel="meta" 
                href="http://foo.com/catalog.meta">  <--- Link to 
        </HEAD>                                           meta document
        <BODY>
          <H2>Turtle Tubo 999</H2>
          <P> ... </P>  <--- More important description

          <EMBED src="car1.mpg"> <-- Video;
                                     should be replaced for smaller devices

          <P> ... </P>  <--- Less important description

          <H2>Rabbit 2000</H2>    +
                                  |<-- Block of a logical unit;
          <IMG src="carrot.jpg">  |    may be displayed in a single page 
          <P> ... </P>            +

        </BODY>
        </HTML>

A possible annotation file associated with the above HTML file is as follows. It contains an RDF description specifying alternate versions of the original catalog.html page for various client devices. Note that it is also necessary to specify the client device that the original page is suitable for.

catalog.meta

<?xml version="1.0"?>

        <rdf:RDF
           xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
           xmlns:pcd="http://www.ibm.com/annot/pcd"
           xmlns:prf="http://www.w3.org/TR/WD-profile-vocabulary"
           xmlns:dc="http://purl.org/metadata/dublin_core">

          <rdf:Description about="http://foo.com/catalog.html">
            <dc:title>Car catalog</dc:title>
            <dc:author>Hiroshi Maruyama</dc:author>
          </rdf:Description>

          <rdf:Description about="http://foo.com/catalog.html">
            <!-- the page was authored for a typical PC connected over a modem -->
            <pcd:clientCapability 
               value="http://www.ibm.com/profiles/pc_modem.ccpp" />
            <pcd:Alternatives>
              <rdf:Alt>
                <rdf:li>
                  <!-- alternative resource suitable for PDAs with CDPD modems -->
                  <pcd:Replace> 
                    <pcd:clientCapability 
                       prf:Default="http://www.palmpilot.com/profiles/PalmIII.ccpp"
                       prf:Modem="CDPD" />
                    <pcd:resourceToSubstitute
                       pcd:target="http://foo.com/catalog_pda.html" />
                  </pcd:Replace>
                </rdf:li>
                <rdf:li>
                  <!-- alternative resource suitable for cell phones -->
                  <pcd:Replace> 
                    <pcd:clientCapability 
                       prf:Default="http://www.nokia.com/profiles/2160" />
                    <pcd:resourceToSubstitute
                       pcd:target="http://foo.com/catalog.wml" />
                  </pcd:Replace>
                </rdf:li>
              </rdf:Alt>
            </pcd:Alternatives>
          </rdf:Description>

          <rdf:Description 
                 about="http://foo.com/catalog.html#root().child(2,P)">
            <pcd:importance value="-0.2" />
          </rdf:Description>

          <rdf:Description 
                 about=span("http://foo.com/catalog.html#root().child(2,H2)",
                            "http://foo.com/catalog.html#root().child(3,P)")>
            <pcd:Group />
          </rdf:Description>

          <rdf:Description 
                 about="http://foo.com/catalog.html#root().child(2,IMG)">
            <pcd:role value="decoration" />
            <pcd:importance value="-0.1" />
          </rdf:Description>

        </rdf:RDF>

Next we look at a more complex annotation for the same page, where individual items of the page are annotated. The video has alternative videos rendered at different bit-rates. The video also has alternatives as an audio track and a set of images.

catalog.meta

<?xml version="1.0"?>

        <rdf:RDF
           xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
           xmlns:pcd="http://www.ibm.com/annot/pcd"
           xmlns:prf="http://www.w3.org/TR/WD-profile-vocabulary">

          <rdf:Description 
                 about="http://foo.com/catalog.html#root().child(1,EMBED)">
            <pcd:importance value="0.8" />
            <pcd:resourceRequirement 
               width="320" height="140" bpp="8" bw="1.4Mbps"
               color="color" mime="video/mpeg" /> 

            <pcd:Alternatives>
              <rdf:Alt>
                <rdf:li>
                  <pcd:Replace> 
                    <pcd:fidelity value="0.8" />
                    <pcd:resourceRequirement 
                       width="160" height="70" bpp="8" bw="28Kbps"
                       color="color" mime="video/h.263" />
                    <pcd:resourceToSubstitute>
                      <![CDATA[
                         <EMBED src="http://foo.com/car1.h263" width=170 height=80>
                       ]]> 
                    </pcd:resourceToSubstitute>
                  </pcd:Replace>
                </rdf:li>
                <rdf:li>
                  <!-- a collection of an audio track and an image sequence -->
                  <rdf:Bag> 
                    <rdf:li> <!-- **First** item in the collection -->
                      <pcd:importance value="-0.4" />
                      <rdf:Alt>
                        <rdf:li>
                          <pcd:Replace> 
                            <pcd:fidelity value="0.6" />
                            <pcd:resourceRequirement 
                               bw="8Kbps" mime="audio/real" />
                            <pcd:resourceToSubstitute>
                              <![CDATA[
                                 <EMBED src="http://foo.com/car1_audio.ra"
                                        width=80 height=20>
                               ]]>
                            </pcd:resourceToSubstitute>
                          </pcd:Replace>
                        </rdf:li>
                      </rdf:Alt>
                    </rdf:li>
                    <rdf:li> <!-- **Second** item in the collection -->
                      <pcd:importance value="0.6" />
                      <!-- in the sequence of two images, 
                           each has a color and a b/w version -->
                      <rdf:Seq>  
                        <rdf:li> <!-- **First** item in the sequence -->
                          <pcd:Replace> 
                            <pcd:fidelity value="0.4" />
                            <pcd:resourceRequirement 
                               width="320" height="240" bpp="8" size="18Kb"
                               color="color" mime="image/jpeg" />
                            <pcd:resourceToSubstitute>
                              <![CDATA[
                                 <IMG src="car1_1.jpeg" width=320 height=240>
                               ]]>
                            </pcd:resourceToSubstitute>
                         </pcd:Replace>
                        </rdf:li>
                        <rdf:li> <!-- **Second** item in the sequence -->
                          <pcd:Replace> 
                            <pcd:fidelity value="0.4" />
                            <pcd:resourceRequirement 
                               width="320" height="240" bpp="8" size="18Kb"
                               color="color" mime="image/jpeg" />
                            <pcd:resourceToSubstitute>
                              <![CDATA[
                                 <IMG src="car1_2.jpeg" width=320 height=240>
                               ]]>
                            </pcd:resourceToSubstitute>
                          </pcd:Replace>
                        </rdf:li>
                      </rdf:Seq>
                    </rdf:li>
                  </rdf:Bag>
                </rdf:li>
              </rdf:Alt>
            </pcd:Alternatives>
          </rdf:Description>

        </rdf:RDF>

Acknowledgements

Many thanks to the following people who have contributed through review and comment:

David Fallside, Shin-ichi Hirose, Kazushi Kuse, Chung-Sheng Li, Bob Schloss, John R. Smith, Naohiko Uramoto.

References

[CC/PP]: Composite Capability/Preference Profiles (CC/PP): A user side framework for content negotiation. W3C Note, http://www.w3.org/TR/NOTE-CCPP/ (11/1998).
[CC/PP-exchange]: CC/PP exchange protocol based on HTTP Extension Framework. W3C Note, http://www.w3.org/TR/NOTE-CCPPexchange (04/1999).
[HTML]: HTML 4.0 Specification. W3C Recommendation, http://www.w3.org/TR/REC-html40/ (04/1998).
[HTML-WG]: HyperText Markup Language. W3C User Interface Domain Activity Statement, http://www.w3.org/MarkUp/Activity.html (12/1998).
[MD5]: The MD5 Message-Digest Algorithm. IETF Network Working Group RFC1321, http://www.ietf.org/rfc/rfc1321.txt (04/1992).
[Mohan99]: R. Mohan, J.R. Smith, and C-S. Li: Adapting Multimedia Internet Content for Universal Access. IEEE Transactions on Multimedia, Vol. 1, No. 1 (03/1999).
[MPEG-7]: MPEG-7: Context and Objectives. ISO/IEC JTC1/SC29/WG11 N2460, http://drogo.cselt.stet.it/mpeg/standards/mpeg-7/mpeg-7.htm (10/1998).
[Namespaces]: Namespaces in XML. W3C Recommendation, http://www.w3.org/TR/REC-xml-names/ (01/1999).
[RDF]: Resource Description Framework (RDF) Model and Syntax Specification. W3C Recommendation, http://www.w3.org/TR/REC-rdf-syntax/ (02/1999).
[SHA1]: SHA1 Secure Hash Algorithm - Version 1.0. http://www.w3.org/PICS/DSig/SHA1_1_0.html (10/1997).
[SpML]: SpeechML: Speech Markup Language and Browser. IBM alphaWorks, http://www.alphaWorks.ibm.com/formula/speechml (02/1999).
[VXML]: The Voice eXtensible Markup Language (VXML) Forum, http://www.vxml.org/ (03/1999).
[WAI]: Web Content Accessibility Guidelines. W3C Working Draft, http://www.w3.org/TR/WD-WAI-PAGEAUTH/ (02/1999).
[XHTML]: XHTML™ 1.0: The Extensible HyperText Markup Language. W3C Working Draft, http://www.w3.org/TR/WD-html-in-xml/ (03/1999).
[XHTML-mod]: Modularization of XHTML™. W3C Working Draft, http://www.w3.org/TR/xhtml-modularization/ (04/1999).
[XML]: Extensible Markup Language (XML) 1.0. W3C Recommendation, http://www.w3.org/TR/REC-xml (02/1998).
[XPointer]: XML Pointer Language (XPointer). W3C Working Draft, http://www.w3.org/TR/WD-xptr (03/1998).

Appendix: Open Issues

Issue-1. Element-level identification validity

Resource identification validity property names (a namespace and propertyNames) for RDF are needed for confirming that the resource described by the about attribute has not changed since the metadata was prepared. At a minimum, the MD5 hash property and the IMT (Internet Mime Type) property should be defined. In addition, it would be necessary to specify whether

(a) rdf:Description blocks which specify subsections of a resource are to be considered invalid if the MD5 property in the about for the entire page does not match, or
(b) rdf:Description blocks apply unless the MD5 in that description block is a mismatch against that subsection of the resource.

Issue-2. Mediation of multiple annotations

If two or more Description about expressions using XPointer are such that one is a superset of the other, and both have role or importance values, which one governs? (This question should be answered by the RDF Data Model after XLink is approved.)

If two or more annotation files are in play, and more than one has a Description with the same about expression, what are the semantics of merging the importance or alt or role properties from the multiple Descriptions?
