Composite Profile Information

Carl Binding, Reto Hermann, Andreas Schade
IBM Research, Zurich Laboratory,
CH-8803 Ruschlikon, Switzerland
email: {cbd,rhe,san}@zurich.ibm.com

1. Introduction

Although the overall application architecture of an internet and a mobile internet application does not differ fundamentally - indeed, both use the same lower-layer transport protocols and similar mark-up languages - the presentation to the end-user will have to differ because mobile end-user devices will remain smaller and thus more limited in their input-output capabilities than traditional personal computers. Hence, applications will benefit from the ability to adapt the generated content optimally to the capabilities of the end-user device and the preferences of the end-user.

The content generation process is driven by parameters associated with the request for information. Hence, if we augment such requests with additional information to describe the capabilities of the device and the end-user's preferences, we enable the application to exploit these and generate better suited mark-up output to be rendered on the end-user device. We refer to this additional information as composite profile information (CPI).

This position paper lists the CPI requirements we have identified, reviews and criticizes the current standardization efforts, and briefly describes an implementation of a standardized profile environment.

2. Requirements for Composite Profile Information

The requirements for handling of such capability and preference profile information can be summarized as follows:

  1. Expressiveness: an end-user compute entity must be enabled to express its capabilities and the user's preferences in a concise and non-ambiguous way.
  2. Transport syntax: the profile information must be transmitted to the content generating origin-server in a space efficient and reliable way.
  3. Handling: the origin-server must be enabled to handle the information contained in a preference profile through an appropriate mechanism, i.e. through a programmatic interface to query the profile's values.
  4. Efficiency: the transport and handling of profiles must be efficient as each information request can be associated with a preference profile.
  5. Aggregation: various processing nodes in the network between end-user device and origin-server may augment the profile associated with a request. For example, a content transforming node on the path may support additional transformations which are beyond the capabilities of either the end-user client device or the origin-server. This information can be of use to the origin-server and should therefore be aggregated into the profile.
  6. Manageability: the entire profile handling system must be manageable.

3. Composite Profile Information Standards

Current internet technology does not use preference information on a wide-spread scale. In part, this is due to the limited possibilities of forwarding preference profile information with an HTTP request. The HTTP/1.1 standard [1] supports a limited set of header fields, of which only the User-Agent header field identifies the end-user's browsing environment. Additional preference information can be conveyed via the various Accept header fields, but the range of capabilities which can be described through these means remains limited.

Common practice thus is to statically associate a device profile with the device's User-Agent information. This profile is then accessed by the origin-server based on the HTTP request's header value.
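
The static association described above can be sketched as a simple table look-up. This is only an illustration: the product tokens, the property names, and the fallback profile are invented for the example and are not taken from any real device database.

```python
# Hypothetical static table mapping User-Agent product tokens to device profiles.
DEVICE_PROFILES = {
    "ExamplePhone/1.0": {"ScreenSize": (96, 64), "ColorCapable": False},
    "ExampleBrowser/5.0": {"ScreenSize": (1024, 768), "ColorCapable": True},
}

# Assumed fallback used when the User-Agent is missing or unknown.
DEFAULT_PROFILE = {"ScreenSize": (640, 480), "ColorCapable": True}


def profile_for_request(headers):
    """Return the static device profile selected by the User-Agent header."""
    user_agent = headers.get("User-Agent", "")
    # Match on the leading product token, e.g. "ExamplePhone/1.0 (extra info)".
    product = user_agent.split(" ", 1)[0]
    return DEVICE_PROFILES.get(product, DEFAULT_PROFILE)
```

Such a table can only describe the device class; it cannot reflect the preferences of an individual user or the current state of a particular device.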

3.1 CC/PP

A more dynamic approach is advocated by the Composite Capability/Preference Profiles (CC/PP) model [3]. It proposes composite preference information that groups device capabilities and user preferences into a set of components. For each component, a possible set of default property values is indicated through the inclusion of a URL reference. Each component may also contain one or more properties which override the default values. Every property defines its name, its type, and possibly a set of values (for enumerations, for example).
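
The interplay of default values and overriding properties can be illustrated with a minimal sketch. The dictionary layout, the function name, and the example URL are assumptions made for this illustration; CC/PP itself expresses components in RDF rather than in Python dictionaries.

```python
def resolve_component(component, fetch_defaults):
    """Merge a component's default properties with its explicit overrides.

    `component` is a dict with an optional "defaults" URL and an optional
    "properties" dict; `fetch_defaults` maps a URL to a dict of defaults.
    Both shapes are assumptions of this sketch.
    """
    resolved = {}
    default_url = component.get("defaults")
    if default_url is not None:
        # Start from the referenced default values.
        resolved.update(fetch_defaults(default_url))
    # Properties given in the component itself override the defaults.
    resolved.update(component.get("properties", {}))
    return resolved


# Hypothetical default store standing in for a retrieval over the network.
DEFAULTS = {
    "http://example.com/hw/phone-a": {"ScreenSize": "96x64", "ColorCapable": "No"},
}

hardware = {
    "defaults": "http://example.com/hw/phone-a",
    "properties": {"ColorCapable": "Yes"},
}

resolved = resolve_component(hardware, DEFAULTS.__getitem__)
```

Here `resolved` keeps the default screen size but carries the overridden color capability.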

A profile's components and their properties are specified in the profile schema or vocabulary. Thus, distinct CC/PP vocabularies can be defined for different applications. An XML-based syntax, the resource description framework (RDF) [4], is used to externalize a CC/PP compliant profile and associated default profiles.

From an implementation point of view, the CC/PP proposal suffers from various shortcomings:

  1. RDF syntax [4] is ambiguous and thus hard to parse properly. In our opinion, it would have been sufficient to define an XML DTD for externalizing CC/PP profiles with the benefit of considerably simplifying the parsing task.
  2. The formalism to specify profile schemata is incomplete. The proposal lacks expressiveness regarding the definition of comparison operations applied to properties, and the introduced property types and their formats are not mandatory. The notation also does not clearly state which properties are mandatory and which can be omitted.
  3. The CC/PP model leaves the determination of possible relations between property values, their syntax, and their meaning to the application. For example, there are no rules for the spelling of literal enumeration values (e.g. capitalization, white-space) or for the format of numeric values. Similarly, defaulting for omitted property values is not specified.

3.2 UAProf

The WAP Forum has based its User Agent profile (UAProf) [8] work on the CC/PP and RDF framework. These have been extended with a specific transfer syntax for profiles and profile differences, as well as resolution rules for property default and difference values. In addition to the Wireless Application Protocol itself, the proposed m-services initiative also requires adoption of the UAProf standard [2].

To describe the capabilities and preferences of a mobile end-user device, the UAProf vocabulary defines six components: hardware, software, user agent, network, WAP characteristics, and push environment. The set of available scalar property types is fixed and comprises numerics (integer), booleans, dimensions (a two-dimensional metric), and (string) literals which are also used for enumerated values. In addition to these base types, multi-valued types are also supported using the RDF collection mechanism. Bags indicate an unordered, multi-valued set, alternatives describe a choice from multiple values, and sequences provide an ordered set of values. The application of profile differences to base profiles is driven by resolution policies that are associated with each property. Possible resolution policies are locked (value can be set once), override (value can be modified), or append (for multi-value sets only).
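
The effect of the three resolution policies can be sketched as follows. The flat dictionary representation, the function name, and the policy assigned to each property in the example are illustrative assumptions; the normative rules are those of the UAProf specification [8].

```python
def apply_difference(base, diff, policies):
    """Apply a profile difference to a base profile, property by property.

    `policies` maps each property name to "locked", "override", or "append".
    """
    result = dict(base)
    for name, value in diff.items():
        policy = policies[name]
        if policy == "locked":
            # The value can be set once: the first value seen is kept.
            result.setdefault(name, value)
        elif policy == "override":
            # A later value replaces the earlier one.
            result[name] = value
        elif policy == "append":
            # Multi-valued properties accumulate their values.
            result[name] = list(result.get(name, [])) + list(value)
        else:
            raise ValueError("unknown resolution policy: %r" % (policy,))
    return result


# Illustrative property names and policy assignments (assumptions).
policies = {"ScreenSize": "locked", "ColorCapable": "override", "CcppAccept": "append"}
base = {"ScreenSize": "96x64", "ColorCapable": "No", "CcppAccept": ["text/html"]}
diff = {"ScreenSize": "200x200", "ColorCapable": "Yes", "CcppAccept": ["image/gif"]}

merged = apply_difference(base, diff, policies)
```

In this example the locked screen size keeps its base value, the color capability is overridden, and the accepted content types are concatenated.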

The UAProf standard includes a transport syntax for transporting profiles and profile differences. Profile references, i.e. their URLs, are contained in an x-wap-profile HTTP header, and profile differences are carried as RDF/XML data in x-wap-profile-diff headers. The x-wap-profile header refers to the profile differences through an index and includes a checksum (MD5) of the difference value for validation of its integrity.
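
The integrity check on a profile difference can be sketched as below. The token layout "index-digest" and the base64 encoding of the MD5 digest are assumptions of this sketch; the exact header grammar should be taken from the UAProf specification [8]. The function names are hypothetical.

```python
import base64
import hashlib


def diff_digest(diff_body):
    """Return the base64-encoded MD5 digest of a profile difference (bytes)."""
    return base64.b64encode(hashlib.md5(diff_body).digest()).decode("ascii")


def verify_diff(token, diffs):
    """Check a token of the assumed form "<index>-<digest>".

    `diffs` maps a difference index to the bytes received in the
    corresponding x-wap-profile-diff header.
    """
    index_text, _, digest = token.partition("-")
    if not index_text.isdigit():
        return False
    body = diffs.get(int(index_text))
    return body is not None and diff_digest(body) == digest
```

A difference whose recomputed digest does not match the value announced in the x-wap-profile header would be rejected before any resolution takes place.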

Other transfer syntaxes are based on RDF/XML documents embedded in MIME multi-part messages [6] or on a distinct set of HTTP extension headers [7]. The UAProf schema makes incomplete use of the already incomplete CC/PP schema formalism, and embeds property types and resolution policies in XML comments. Further shortcomings of the CC/PP framework and the UAProf schema in particular are:

  1. No fully machine readable schema definition to allow automatic verification of profiles against their schema.
  2. Missing component type compatibility rules for profile aggregation.
  3. No machine readable type declarations and resolution policies for properties.
  4. No syntax rules for literals and literal enumerations. Valid literal values, white-space and capitalization rules, syntax for numbers, etc. are not clearly defined.
  5. No relation functions defining an ordering or equivalence relationship between property values, i.e. indicating whether a given property value p1 is equivalent, inferior, or superior to some other property value p2. For example, does a Java Virtual Machine property value of SUN_JVM_1 indicate similar, superior, or inferior capabilities compared with a Java Virtual Machine property value of MS_VM_J13?

4. Implementation Experiences

At the IBM Zurich Research Laboratory we have implemented middleware software which performs default resolution and profile difference evaluation for UAProf compliant profiles. An API implemented in C and in Java exposes operations such as:

The implementation has been done in the C language for performance reasons. On top of our base CPI library, we have implemented a Java wrapper using Java Native Interface (JNI). Additional software was written to use the CPI library in an Apache module or a Java Servlet context.

Several functional units can be distinguished in our UAProf/CPI implementation. An XML parser produces a DOM-like [5] representation of RDF/XML data on which RDF syntax rules are verified. The UAProf schema information drives the parsing of the RDF/XML document representing the profile.

Since no fully machine readable formalism has been defined for the UAProf schema, we have hand-crafted an extended database model to capture CC/PP compatible schemata and applied it to the UAProf schema. Thus, our CPI library can draw on machine accessible UAProf schema information, which it uses in processing profile and profile difference data. Our database model for schemata captures a schema's components, their types, and the property types. For each property type, we store its name, its type, and its resolution policy.

After verification of the profile's compliance with the schema and the RDF/XML syntax, default resolution is performed. For performance reasons, we have implemented a two level cache to hold default component values. The first level is an in-core cache; the secondary level uses a relational database. Hence, our CPI library retrieves default components - referenced via their URLs - only once across the internet to load the default component caches. Depending on the lifetime of the CPI library instantiation (e.g. per HTTP request or across multiple HTTP requests), access to a default component is serviced via the database cache or the in-core cache and becomes independent of network latency and throughput.

The default component cache can also be pre-loaded during initialization of the CPI library to avoid penalizing the very first profile resolution with retrieval of default component values.
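
The two-level look-up for default components can be sketched as follows. This is not the C implementation described here: the class and method names, the table layout, the JSON serialization, and the use of SQLite as the relational store are all assumptions made to keep the illustration self-contained.

```python
import json
import sqlite3


class DefaultComponentCache:
    """In-core cache backed by a relational database, filled on demand."""

    def __init__(self, db_path, fetch):
        self._memory = {}    # first level: in-core cache
        self._fetch = fetch  # retrieves a default component by URL
        self._db = sqlite3.connect(db_path)  # second level: database cache
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS defaults (url TEXT PRIMARY KEY, body TEXT)"
        )

    def get(self, url):
        if url in self._memory:
            return self._memory[url]
        row = self._db.execute(
            "SELECT body FROM defaults WHERE url = ?", (url,)
        ).fetchone()
        if row is not None:
            component = json.loads(row[0])
        else:
            # Only on a miss in both levels is the network accessed.
            component = self._fetch(url)
            self._db.execute(
                "INSERT OR REPLACE INTO defaults VALUES (?, ?)",
                (url, json.dumps(component)),
            )
            self._db.commit()
        self._memory[url] = component
        return component

    def preload(self, urls):
        """Fill both cache levels during initialization."""
        for url in urls:
            self.get(url)
```

A short-lived library instance (for example, one per HTTP request) starts with an empty in-core cache and is then served from the database; a long-lived instance is served from memory after the first access.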

A similar two-level cache approach is used for resolved profiles. Indeed, there is a high likelihood that clients present identical profiles with a series of subsequent HTTP requests since the device's capabilities and the user's preferences are unlikely to change between subsequent HTTP requests. We detect this by checksumming the HTTP headers related to the profile information and using the checksum value as a look-up key into a profile cache. Thus, if subsequent requests carry identical profile information, profile resolution is only performed once; profiles for later requests are serviced out of the profile cache.
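
The checksum-based look-up can be sketched as follows. The choice of MD5 as the checksum, the header selection by name prefix, and the function names are assumptions of this sketch; only a single in-memory cache level is shown.

```python
import hashlib

# Selects both x-wap-profile and x-wap-profile-diff headers (assumed rule).
PROFILE_HEADER_PREFIX = "x-wap-profile"


def profile_cache_key(headers):
    """Checksum the profile-related headers of a request.

    `headers` is a list of (name, value) pairs in the order received.
    """
    digest = hashlib.md5()
    for name, value in headers:
        lowered = name.lower()
        if lowered.startswith(PROFILE_HEADER_PREFIX):
            # Separators keep name/value boundaries unambiguous.
            digest.update(lowered.encode("utf-8") + b"\0")
            digest.update(value.encode("utf-8") + b"\0")
    return digest.hexdigest()


_profile_cache = {}


def resolved_profile(headers, resolve):
    """Return the cached resolved profile, resolving it only on a cache miss."""
    key = profile_cache_key(headers)
    if key not in _profile_cache:
        _profile_cache[key] = resolve(headers)
    return _profile_cache[key]
```

Headers unrelated to the profile do not influence the key, so two requests that differ only in, say, their cookies still share one cache entry.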

Another implementation issue has been the handling of profile matching which requires an ordering function on profile property values. For simple scalar types, such as integers or even dimensions, the ordering function is simple. However, for literal enumeration values no simple relationship can be defined: lexicographical ordering, for example, is not meaningful.

Our implementation therefore associates with each enumeration type an explicit enumeration of the relations between its literal values. Evidently, this externalized encoding of the order relationship requires O(n^2) space in the schema database, but it supports convenient extension of the order relationship for a given enumeration type without rebuilding the CPI library when new property values are introduced.
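
An explicit relation table of this kind can be sketched as below. The enumeration type, its literal values, and the relations between them are purely hypothetical, and the in-memory dictionary stands in for the schema database.

```python
# Relation of the first value to the second: -1 inferior, 0 equivalent, 1 superior.
# One entry per unordered pair is stored; the reverse direction is derived.
RELATIONS = {
    "ExampleVmVersion": {  # hypothetical enumeration type
        ("VM_A_1", "VM_A_2"): -1,
        ("VM_A_1", "VM_B_1"): 0,
        ("VM_A_2", "VM_B_1"): 1,
    },
}


def compare(enum_type, first, second):
    """Return -1, 0, or 1 for a known relation, or None if none is recorded."""
    if first == second:
        return 0
    table = RELATIONS.get(enum_type, {})
    if (first, second) in table:
        return table[(first, second)]
    if (second, first) in table:
        # The stored relation is for the reversed pair, so negate it.
        return -table[(second, first)]
    return None
```

Introducing a new literal value then only requires adding rows that relate it to the existing values; the comparison code itself stays unchanged.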

References

[1] R. Fielding et al. Hypertext Transfer Protocol - HTTP/1.1. IETF, June 1999. RFC 2616.

[2] GSM Association. M-Services Guideline, May 2001. PRD AA.35.

[3] G. Klyne, F. Reynolds, C. Woodrow, and H. Ohto. Composite Capability/Preference profiles (CC/PP): Structure and Vocabularies. W3C, June 1999. http://www.w3c.org/TR/NOTE-CCPPexchange.

[4] Ora Lassila and Ralph R. Swick. Resource Description Framework (RDF): Model and Syntax Specification. W3C, 1999. http://www.w3c.org/TR/REC-rdf-syntax.

[5] W3C. Document Object Model (DOM) Level 1 Specification, October 1998. http://www.w3.org/TR/REC-DOM-Level-1.

[6] WAP Forum. Wireless Application Protocol: Push Access Protocol (PAP), April 2001. WAP-247-PAP.

[7] WAP Forum. Wireless Application Protocol: Push OTA Protocol, April 2001. WAP-235-PushOTA.

[8] WAP Forum. Wireless Application Protocol: User Agent profile Specification, October 2001. WAP-248-UAPROF.