This document is a work in progress, representing a revision of the working draft dated 1998-10-05 incorporating suggestions received in review comments and further deliberations of the W3C Mobile Access Interest Group. It also incorporates suggestions resulting from reviews by members of the IETF CONNEG working group and the WAPForum. It is the first public review draft of this document. Publication as a working draft does not imply endorsement by the W3C membership.
All RDF code has been validated with SiRPAC, the W3C RDF validator.
The RDF code was updated on July 26, 1999, to reflect the Resource Description Framework (RDF) Model and Syntax Specification Recommendation and the 3 March Resource Description Framework (RDF) Schemas Proposed Recommendation. Minor updates to the text, reflecting the work that has been conducted in the W3C since November 1998, have also been made.
Review comments from the public on this document should be sent to firstname.lastname@example.org which is an automatically archived email list. Information on how to subscribe to public W3C email lists can be found at http://www.w3.org/Mail/Request.
This document is a NOTE made available by the W3 Consortium for discussion only. This indicates no endorsement of its content, nor that the Consortium has, is, or will be allocating any resources to the issues addressed by this NOTE.
Copyright © 1997,1998, 1999 W3C (MIT, INRIA, Keio ), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
1.1 Wireless networks
1.2 Goals of this work
2. Metadata and profiles
3. Composite Capability/Preferences Profiles (CC/PP)
3.1 Inline example
3.2 Indirect references
3.3 Runtime changes
4. Protocol considerations
In this note we describe a method for using RDF, the Resource Description Framework of the W3C, to create a general yet extensible framework for describing user preferences and device capabilities. This information can be provided by the user to servers and content providers. The servers can use this information describing the user's preferences to customize the service or content provided. The ability of RDF to reference profile information via URLs helps minimize the number of network transactions required to adapt content to a device, while the framework fits well into the current and future protocols being developed at the W3C and the WAP Forum.
This document describes the rationale and design of a profile service to describe the capabilities and preferences of Web enabled applications. A Composite Capability/Preference Profile (CC/PP) is a collection of the capabilities and preferences associated with a user and the agents used by the user to access the World Wide Web. These user agents include the hardware platform, system software and applications used by the user. User agent capabilities and preferences can be thought of as metadata, or properties and descriptions of the user agent hardware and software.
A description of the user's capabilities and preferences is necessary but insufficient to provide a general content negotiation solution. A general framework for content negotiation requires a means for describing the metadata or attributes and preferences of the user and the user's agents, the attributes of the content, and the rules for adapting content to the capabilities and preferences of the user. The current mechanisms, such as accept headers and <alt> tags, are somewhat limited. Future services will be more demanding. For example: the content might be authored in multiple languages with different levels of confidence in the translation and the user might be able to understand multiple languages with different levels of proficiency. To complete the negotiation, some rule is needed for selecting a version of the document based on weighing the user's proficiency in different languages against the quality of the document's various translations.
This proposal focuses on the design of a user agent profile service based on XML/RDF. RDF, the Resource Description Framework [RDF][RDF-Schema], was designed by the W3C. There is a specification that describes how to encode RDF using XML. RDF was designed to describe the machine understandable properties of web content. In this proposal we explore the use of RDF to describe capabilities and preferences associated with a user and the hardware and software agents used to access the web. We expect that the use of a common technology to encode metadata describing both content and a user's preferences will encourage the adoption of the technology and simplify the use of metadata in the Web. Hopefully, powerful tools for dealing with XML and RDF, some of which are already under development, will be available.
Some potentially complex negotiation may have to take place between the content or the server of the content and the user of the content. For example: the content might be authored in multiple languages with different levels of confidence in the translation and the user might be able to understand multiple languages with different levels of proficiency. Though we hope that the use of RDF to encode the metadata describing the content and the user's preferences will facilitate the development of solutions to these kinds of complex negotiations, the implementation of appropriate rules for the negotiation is left to application developers.
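As an illustration of the kind of rule an application developer might supply, the following Python sketch picks a document version by multiplying translation quality by user proficiency. The scoring rule, the function name pick_language, the 0 to 1 scales and the sample data are all assumptions made for this example; they are not defined by this note.

```python
def pick_language(translation_quality, user_proficiency):
    """Return the language with the highest quality * proficiency score.

    Both arguments map a language tag to a number in [0, 1] (assumed scale).
    Returns None when the user understands none of the available versions.
    """
    scores = {
        lang: quality * user_proficiency[lang]
        for lang, quality in translation_quality.items()
        if lang in user_proficiency
    }
    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical data: the document was authored in English and translated
# with varying confidence; the user is most proficient in French.
document_versions = {"en": 1.0, "fr": 0.6, "de": 0.9}
user = {"fr": 1.0, "de": 0.8, "en": 0.5}

best = pick_language(document_versions, user)
print(best)
```

A product of two weights is only one possible choice; a real negotiation rule could equally use thresholds or an ordered preference list.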
Alternate methods for describing the attributes or metadata of documents are under investigation by other organizations such as the IETF Content Negotiation [CONNEG] working group. Though this proposal is not directly compatible with the IETF CONNEG proposals currently under development, RDF allows the use of multiple vocabularies. Hopefully, this will provide a means for interoperability, at least at the level of attribute vocabularies. The CONNEG working group is also developing a media feature matching algebra. Efforts are underway to ensure that the CONNEG algebra and RDF are complementary technologies. In addition to the IETF, we are particularly concerned about the WAP Forum and ETSI. The success of the CC/PP effort will undoubtedly hinge on our ability to cooperate with those organizations.
Compared to the typical wireline data networks available to corporate desktop users, wireless networks are more expensive and less reliable, and they provide less bandwidth with higher latency. SMS data service on GSM networks provides 22 bytes (!) per second to a typical mobile host. The situation is rapidly changing. Emerging packet-oriented cellular networks, such as CDPD and CDMA, and packet-oriented bearer technologies, such as GPRS and EDGE, are providing higher bandwidth and lower latency. Within the next decade we should see the deployment of "third generation" cellular networks that provide low latency and megabit bandwidth to mobile hosts.
But today's wireless networks are slow and tomorrow's wireless networks will be slow compared to tomorrow's wireline networks. Protocols designed for wireline networks without regard for the limitations of wireless networks often exhibit undesirable behavior when deployed on wireless networks.
CC/PPs are intended to provide information necessary to adapt the content and the content delivery mechanisms to best fit the capabilities and preferences of the user and its agents. Protocol design is beyond the scope of this group; however, the use of CC/PPs does have some impact on web protocols, and in this section some of those issues are discussed. The design and implementation of HTTP-NG is being actively carried out by another group. In this section we limit our discussion to some of the issues that may need to be considered in HTTPng or similar protocols:
The goal of this work is to:
The data model for the capability and preferences profile is similar to a table of tables. Each individual table roughly corresponds to a significant hardware or software component. The primary goal is to be able to describe the desired table of tables in an unambiguous and interoperable fashion. Secondary goals include general applicability and good performance.
In most documents on 3rd generation networks, scenarios are presented where users will want to assert several preferential factors [IMT-2000]. Mechanisms for this also exist [Agent-attrib]. Such preferences include:
They will also want to assert hardware platform attributes, like:
We also expect them to want to assert software defined variables, such as:
It is interesting to note that metadata (capabilities and preferences) associated with the device, the software used to access the web and the user of the device could originate from different sources created at different times. The hardware vendor might have profile information available for its products, the software vendor might supply a default profile, and the user's preferences might apply across multiple applications (preferred language) or change during a session (sound on/off). If the mechanism is too complex people won't use it, and if it is too slow people won't use it. The challenge is to provide an efficient mechanism for communicating the profiles for constrained devices, such as smart phones, using slow networks, such as GSM SMS.
The CC/PP proposal describes an interoperable encoding for capabilities and preferences of user agents, specifically web browsers. The proposal is also intended to support applications other than browsers, including email, calendars, etc. Support for peripherals like printers and fax machines will require other types of attributes such as type of printer, location, Postscript support, color, etc. We believe an XML/RDF based approach would be suitable. However, metadata descriptions of devices like printers or fax machines may use a different scheme. Every reasonable effort will be made to provide interoperability with other important proposals.
The basic data model for a CC/PP is a collection of tables. Though RDF makes modeling a wide range of data structures possible, it is unlikely that this flexibility will be used in the creation of complex data models for profiles. In the simplest form each table in the CC/PP is a collection of RDF statements with simple, atomic properties. These tables may be constructed from default settings, persistent local changes or temporary changes made by a user. One extension to the simple table of properties data model is the notion of a separate, subordinate collection of default properties. Default settings might be properties defined by the vendor. In the case of hardware the vendor often has a very good idea of the physical properties of any given model of product. However, the current owner of the product may be able to add options, such as memory or persistent store or additional I/O devices, that add new properties or change the values of some original properties. These would be persistent local changes. An example of a temporary change would be turning sound on or off.
The profile is associated with the current network session or transaction. Each major component may have a collection of attributes or preferences. Examples of major components are the hardware platform upon which all the software is executing, the software platform upon which all the applications are hosted, and each of the applications. The following is a simplified example of the sort of data expected to be encoded in these profiles.
Memory = 64mb
CPU = PPC
Screen = 640*400*8
BlueTooth = Yes
OS version = 1.0
HTML version = 4.0
WML version = 1.0
Sound = ON
Images = Yes
Language = English
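The "table of tables" reading of the data above can be sketched in Python as a dictionary of dictionaries. This is only an illustration: the assignment of each attribute to a component, and the component names used as keys, are assumptions made for the sketch, as is the helper name lookup.

```python
# Outer table: one entry per major component (grouping is an assumption).
# Inner tables: simple, atomic attribute/value pairs taken from the list above.
profile = {
    "HardwarePlatform": {
        "Memory": "64mb",
        "CPU": "PPC",
        "Screen": "640*400*8",
        "BlueTooth": "Yes",
    },
    "SoftwarePlatform": {
        "OS version": "1.0",
        "HTML version": "4.0",
        "WML version": "1.0",
        "Sound": "ON",
        "Images": "Yes",
    },
    "UserPreferences": {
        "Language": "English",
    },
}


def lookup(profile, component, attribute):
    """Return the value of an attribute of a component, or None if absent."""
    return profile.get(component, {}).get(attribute)


print(lookup(profile, "HardwarePlatform", "CPU"))
```

Because an attribute is looked up within its component, two components can use the same attribute name without conflict.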
Some collections of properties and property values may be common to a particular component. For example: a specific model of a smart phone may come with a specific CPU, screen size and amount of memory by default. Gathering these "default" properties together as a distinct RDF resource makes it possible to independently retrieve and cache those properties. A collection of "default" properties is not mandatory, but it may improve network performance, especially the performance of relatively slow wireless networks.
Any RDF graph consists of nodes, arcs and leaves. Nodes are resources, arcs are properties and leaves are property values. An RDF graph based on the previous example that includes "Default" properties for each major component is relatively straightforward.
The introduction of "Defaults" makes the graph of each major component more of a simple tree than a table. In this example the major components are associated with the current network session. In this case, the network session is serving as the root of a tree that includes the trees of each major component. RDF was originally intended to describe metadata associated with documents or other objects that can be named via a URI. The closest thing to a "document" associated with a CC/PP is the current network session.
From the point of view of any particular network transaction, the only property or capability information that is important is whatever is "current". The network transaction does not care about the differences between defaults and persistent local changes; it only cares about the capabilities and preferences that apply to the current network transaction. Because this information may originate from multiple sources and because different parts of the capability profile may be differentially cached, the various components must be explicitly described in the network transaction.
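One way to picture how a "current" view could be derived from layered sources is the following Python sketch. It assumes three layers for a single component (vendor defaults, persistent local changes, temporary changes) and assumes that the most recent layer wins; the variable names and sample values are hypothetical, and attributes from different components are flattened into one table only for brevity.

```python
from collections import ChainMap

# Hypothetical layers for one component.
vendor_defaults = {"Memory": "16mB", "CPU": "PPC", "Sound": "On"}
persistent_changes = {"Memory": "32mB"}   # e.g. the owner added memory
temporary_changes = {"Sound": "Off"}      # e.g. sound muted for this session

# ChainMap searches its maps left to right, so the leftmost layer that
# defines an attribute supplies the current value.
current = dict(ChainMap(temporary_changes, persistent_changes, vendor_defaults))

print(current)
```

The consumer of the profile sees only the resulting values; which layer a value came from matters only for caching and transmission.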
The CC/PP is the encoding of profile information that needs to be shared between a client and a server, gateway or proxy. The persistent encoding of profile information and the encoding for the purposes of interoperability (communication) need not be the same. In this document we consider the use of XML/RDF as the interoperability encoding. Persistent storage of profile information is left to the individual applications.
Consider a more realistic example of inline encoding of a CC/PP for a hypothetical smart phone. This is an example of the type of information a phone might provide to a gateway/proxy/server. Note that we do not explicitly name the "current network session". Instead, the profiles of the major components are collected in a "Bag". This is probably not necessary since the document in question, the network session, is unlikely to contain additional RDF.
This sample profile is a collection of the capabilities and preferences associated with either a user or the hardware platform or a software component. Each collection of capabilities and preferences is organized within a description block. These description blocks may contain subordinate description blocks to describe default attributes or other collections of attributes.
There is nothing that prevents the use of multiple namespaces. This might be useful to either define experimental or non-standard attributes or to define application specific capabilities and preferences.
Delivering all of the CC/PP at one time, inline, makes some simplifications possible. If the user has overridden some default property, then there is no reason to send the default; all that is needed is to send the current value for that attribute. In the example above, there is no reason to send the hardware platform's default setting of "Memory=16mb" since the user has upgraded the memory to 32mb.
The significance of an attribute is generally limited to the component it is describing. For example, each software application can define a value for a "Version" attribute. This indicates the version of the particular application being described. In general, side effects that extend beyond the bounds of a particular component are not defined in this document. The relationship between components is system and application dependent.
The major disadvantage of this format is that it is verbose. Some networks are very slow and this would be a moderately expensive way to handle metadata. There are several optimizations possible to help deal with network performance issues. One strategy is to use a compressed form of XML [TokenXML], and a complementary strategy is to use indirect references.
Instead of enumerating each set of attributes, a remote reference can be used to name a collection of attributes such as the hardware platform defaults. This has the advantage of enabling the separate fetching and caching of functional subsets. This might be very nice if the link between the gateway or the proxy and the client agent was slow and the link between the gateway or proxy and the site named by the remote reference was fast - a typical case when the user agent is a smart phone. Another advantage is the simplification of the development of different vocabularies for hardware vendors and software vendors (assuming this is a good thing).
The following example uses indirect references. First, the profile provided by the user agent. It refers to default profiles provided by the hardware and software platform vendors:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:prf="http://www.w3.org/TR/WD-profile-vocabulary#">
  <rdf:Description about="HardwarePlatform">
    <prf:Defaults>
      <rdf:li resource="http://www.nokia.com/profiles/2160"/>
    </prf:Defaults>
    <prf:Modifications Memory="32mB"/>
  </rdf:Description>
  <rdf:Description about="SoftwarePlatform">
    <prf:Defaults>
      <rdf:li resource="http://www.symbian.com/profiles/pda"/>
    </prf:Defaults>
    <prf:Modifications Sound="Off" Images="Off"/>
  </rdf:Description>
  <rdf:Description about="EpocEmail">
    <prf:Defaults>
      <rdf:li resource="http://www.symbian.com/epoc/profiles/epocemail"/>
    </prf:Defaults>
  </rdf:Description>
  <rdf:Description about="EpocCalendar">
    <prf:Defaults>
      <rdf:li resource="http://www.symbian.com/epoc/profiles/epoccal"/>
    </prf:Defaults>
  </rdf:Description>
  <rdf:Description about="UserPreferences">
    <prf:Defaults Language="English"/>
  </rdf:Description>
</rdf:RDF>
Next, the profile provided by the hardware vendor.
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description Vendor="Nokia"
                   Model="2160"
                   Type="PDA"
                   ScreenSize="800x600x24"
                   CPU="PPC"
                   Keyboard="Yes"
                   Memory="16mB"
                   Bluetooth="YES"
                   Speaker="Yes"/>
</rdf:RDF>
Finally, the profiles provided by the software platform and application vendors.
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description Version="EpocEmail1.0" HTMLVersion="4.0"/>
</rdf:RDF>
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description Version="EpocCalendar1.0" HTMLVersion="4.0"/>
</rdf:RDF>
All we did in the second example was group different collections of default attributes together in such a way that they could be named by a URL. Since the hardware and software platform default profiles were independently described using a URL, they can be separately fetched and cached. When an application in the server/gateway/proxy uses RDF to process the CC/PP it may encounter attributes with default values and user specified values. It is up to the application to enforce the rule that user specified attributes override default values. RDF does not provide a convenient mechanism for implementing that rule.
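The override rule described above has to be implemented by the application. The following Python sketch shows one way a server/gateway/proxy might do so using the standard xml.etree.ElementTree parser rather than a full RDF processor. The function name resolve, the abbreviated client profile, and the FETCHED_DEFAULTS dictionary (a stand-in for default profiles already retrieved by URL, with an abbreviated set of attributes) are assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PRF = "http://www.w3.org/TR/WD-profile-vocabulary#"

# Abbreviated version of the user agent profile shown earlier.
CLIENT_PROFILE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:prf="http://www.w3.org/TR/WD-profile-vocabulary#">
  <rdf:Description about="HardwarePlatform">
    <prf:Defaults>
      <rdf:li resource="http://www.nokia.com/profiles/2160"/>
    </prf:Defaults>
    <prf:Modifications Memory="32mB"/>
  </rdf:Description>
  <rdf:Description about="UserPreferences">
    <prf:Defaults Language="English"/>
  </rdf:Description>
</rdf:RDF>"""

# Stand-in for vendor profiles that were fetched (and possibly cached) by URL.
FETCHED_DEFAULTS = {
    "http://www.nokia.com/profiles/2160": {
        "Vendor": "Nokia", "Model": "2160", "CPU": "PPC", "Memory": "16mB",
    },
}


def resolve(profile_xml, fetched):
    """Return {component: current attributes}; modifications override defaults.

    Raises KeyError if a referenced default profile is not in `fetched`.
    """
    root = ET.fromstring(profile_xml)
    result = {}
    for desc in root.findall(f"{{{RDF}}}Description"):
        current = {}
        for defaults in desc.findall(f"{{{PRF}}}Defaults"):
            current.update(defaults.attrib)              # inline defaults
            for li in defaults.findall(f"{{{RDF}}}li"):  # indirect defaults
                current.update(fetched[li.get("resource")])
        for mods in desc.findall(f"{{{PRF}}}Modifications"):
            current.update(mods.attrib)                  # user values win
        result[desc.get("about")] = current
    return result


resolved = resolve(CLIENT_PROFILE, FETCHED_DEFAULTS)
print(resolved["HardwarePlatform"]["Memory"])
```

Applying the modifications last is what enforces the rule; the order in which the default sources are merged does not matter as long as they do not conflict with each other.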
It is worth noting again that the information we are most concerned with is the current profile. Default properties might have some importance, for example, they may be worth caching independently of any particular session or user. However, the key is for the client and the server/gateway/proxy to have a consistent view of the current profile.
It is important to be able to add to and modify attributes associated with the current CC/PP. We need to be able to modify the value of certain attributes, such as turning sound on and off and we need to make persistent changes to reflect things like a memory upgrade. We need to be able to override the default profile provided by the vendor. However, we only need to concern ourselves with changes to the current profile. Reflecting changes to preferences or capabilities in persistent storage is beyond the scope of this document.
Our problem is to propagate changes to the current CC/PP to the server/gateway/proxy. One solution is to transmit the entire CC/PP with each change. It would replace the previous profile. This is not ideal for slow networks. An alternative is to send only the changes. Thus if Sound were to be changed from "Off" to "On" the only data that would need to be sent would be:
Alternatively, the <Modifications> element could be used to communicate changes.
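Whichever encoding is used on the wire, the idea of sending only the changes can be sketched in Python as a difference between two attribute tables. The function names and sample values are hypothetical, and the sketch assumes that attributes are only added or changed, never removed.

```python
def profile_changes(previous, current):
    """Return only the attributes that are new or changed in `current`."""
    return {k: v for k, v in current.items() if previous.get(k) != v}


def apply_changes(previous, changes):
    """Return the profile the receiver holds after applying the changes."""
    return {**previous, **changes}


# Hypothetical software platform attributes before and after the user
# turns sound on.
before = {"Sound": "Off", "Images": "Off"}
after = {"Sound": "On", "Images": "Off"}

changes = profile_changes(before, after)
print(changes)
```

The receiver reconstructs the current profile by applying the changes to the profile it already holds for the session, which is why both sides need a consistent view of that profile.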
When used in the context of a web browsing application, a CC/PP should be associated with a notion of a current session rather than a user or a node. HTTP and WSP (the WAP session protocol) define different session semantics. The client, server, gateways and proxies may already have their own, well defined notions of what constitutes a connection or a session. Our protocol strategy is to send as little information as possible, and if anyone is missing something, they have to ask for it. If there is good reason to believe that someone is going to ask for a profile, the client can elect to send the most efficient form of the profile that makes sense.
Consider the following possible interaction between a server and a client. When the client begins a session it sends a minimal profile using as much indirection as possible. If the server/gateway/proxy does not have a CC/PP for this session, then it asks for one. When a profile is sent, the client tries a minimal form, i.e., it uses as much indirection as possible and only names the non-default attributes of the profile. The server/gateway/proxy can try to fill in the profile using the indirect HTTP references (which may be independently cached). If any of these fail, a request for additional data can be sent to the client, which can reply with a fully enumerated profile. If the client changes the value of an attribute, such as turning sound off, only that change needs to be sent.
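The interaction described above can be simulated with a small Python sketch. It is not a protocol definition: the class and method names, the ProfileIncomplete exception, the session identifiers and the contents of the vendor profile are all hypothetical, and network fetches are replaced by a dictionary lookup.

```python
class ProfileIncomplete(Exception):
    """Signals that the client must send a fully enumerated profile."""


class Server:
    def __init__(self, fetch):
        self.fetch = fetch    # callable: URL -> dict; raises KeyError on failure
        self.cache = {}       # default profiles, cached by URL across sessions
        self.sessions = {}    # session id -> current profile

    def receive_minimal(self, session_id, default_urls, modifications):
        """Build the current profile from indirect references plus overrides."""
        profile = {}
        for url in default_urls:
            if url not in self.cache:
                try:
                    self.cache[url] = self.fetch(url)
                except KeyError:
                    raise ProfileIncomplete(url) from None
            profile.update(self.cache[url])
        profile.update(modifications)      # non-default values win
        self.sessions[session_id] = profile

    def receive_full(self, session_id, profile):
        """Fallback: the client enumerated every attribute itself."""
        self.sessions[session_id] = dict(profile)

    def receive_change(self, session_id, changes):
        """Later updates carry only the attributes that changed."""
        self.sessions[session_id].update(changes)


# Hypothetical vendor site reachable from the server over a fast link.
vendor_site = {"http://www.symbian.com/profiles/pda": {"Sound": "On", "Images": "On"}}
server = Server(vendor_site.__getitem__)

server.receive_minimal("s1", ["http://www.symbian.com/profiles/pda"], {"Images": "Off"})
server.receive_change("s1", {"Sound": "Off"})
print(server.sessions["s1"])
```

Keeping the cache of default profiles separate from the per-session state is what lets a default profile fetched for one session be reused for another.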
It is likely that servers and gateways/proxies will be concerned with different preferences. For example, the server may need to know which language the user prefers and the gateway may have responsibility to trim images to 8 bits of color (to save bandwidth). However, the exact use of profile information by each server/gateway/proxy is hard to predict. Therefore gateways/proxies should forward all profile information to the server. Any requests for profile information that the gateway/proxy cannot satisfy should be forwarded to the client.
The ability to compose a profile from sources provided by third parties at run-time exposes the system to a new type of attack. For example, if the URL that named the hardware platform defaults were to be compromised via an attack on DNS, it would be possible to load incorrect profile information. If cached within a server/gateway/proxy this could be a serious denial of service attack. If this is a serious enough problem it may be worth adding digital signatures to the URLs used to refer to profile components.
New versions of HTTP such as HTTPng should be able to support the CC/PP framework without difficulty. HTTP 1.0 servers and proxies may not be able to handle CC/PPs. Clients need to be able to detect communication with old servers and adapt the protocol accordingly. HTTP 1.1, perhaps via the Mandatory/Optional Extension Framework, should be able to support sessions and profiles. At the least, 1.1 proxy servers should pass requests that include CC/PPs on to servers in the hope that the servers will understand the requests. New versions of 1.1 proxies and servers should be able to use CC/PPs.
This protocol discussion is not a specific proposal for HTTP or WSP. Its intent is merely to illustrate how the design allows us to exploit the cacheability of both the current session state and the default profiles.
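As a rough illustration of checking a fetched profile component before caching it, the following Python sketch pairs each reference with a SHA-256 digest of the expected content. This is a simpler stand-in for the digital signatures mentioned above, not an implementation of them: it only helps if the reference itself arrives over a path the receiver trusts, and the reference format, function names and sample bytes are assumptions made for the sketch.

```python
import hashlib
import hmac


def make_reference(url, profile_bytes):
    """Pair a profile URL with the SHA-256 digest of its expected content."""
    return {"url": url, "sha256": hashlib.sha256(profile_bytes).hexdigest()}


def verify(reference, fetched_bytes):
    """Return True only if the fetched bytes match the digest in the reference."""
    actual = hashlib.sha256(fetched_bytes).hexdigest()
    return hmac.compare_digest(actual, reference["sha256"])


# Hypothetical default profile content as published by the vendor.
genuine = b'<rdf:Description Vendor="Nokia" Model="2160" Memory="16mB"/>'
tampered = b'<rdf:Description Vendor="Nokia" Model="2160" Memory="1mB"/>'

ref = make_reference("http://www.nokia.com/profiles/2160", genuine)
print(verify(ref, genuine), verify(ref, tampered))
```

A server/gateway/proxy using such a check would refuse to cache a component that fails verification and could fall back to asking the client for a fully enumerated profile.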
Since the original writing of this document, a W3C Note has been produced, which describes how to use the HTTP 1.1 extension framework to implement a mechanism for communicating CC/PP profiles and profile differences. See the note, CC/PP exchange protocol based on HTTP Extension Framework [Ohto], for more information.
In this document, we have described a proposal for the use of XML/RDF to describe user preferences and the capabilities of the device and software used to access the Web. Encodings of hypothetical user profiles were used to illustrate some of the benefits of RDF. Some of the possible ramifications for Web protocol design were discussed.
[Agent-attrib] Client-Specific Web Services by Using User Agent Attributes. Tomihisa Kamada, Tomohiko Miyazaki. W3C Note.
[CONNEG] IETF working group on content negotiation
[IMT-2000] Ericsson in Wideband Wireless Multimedia.
[MIME] RFC 2045 Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies, Borenstein N., Freed N., 1996/11/27
[Ohto] CC/PP exchange protocol based on HTTP Extension Framework
[RDF] Resource Description Framework (RDF) Model and Syntax Specification. Lassila O., Swick R. W3C Working Draft.
[RDF-Schema] Resource Description Framework (RDF) Schema Specification. Brickley, D., Guha, R.V. , Layman, A., W3C Working Draft.
[TokenXML] Binary XML Content Format Specification