International Workshop on the Implementation of a Device Description Repository

A workshop report prepared by the DDWG

12/13 July 2006, Madrid, Spain (hosted by Telefónica)


Introduction
Participants
Agenda
Executive Summary
Miscellany
Program Committee

Introduction

At a meeting in Barcelona of the MWI Steering Council it was proposed that the DDWG hold an international workshop to consider the implications of its recently published Requirements document. That document outlined the basic properties of a Device Description Repository that, if implemented and widely available, would significantly enhance content adaptation solutions. Such solutions enable Web content to adapt to the varied characteristics of mobile devices. The next step for the DDWG was to consider how such a technology should be implemented, and whether or not the W3C should take an active role in such an implementation. The workshop was called to get various views on the role of a DDR, its design and its implementation. Telefónica I+D offered to host the workshop in Madrid, and a call for participation was issued in April 2006. This is the report of the DDWG workshop that was held in Madrid on 12 and 13 July 2006.

Participants

Chris Abbott (Mobile Phone Wizards AS)
Daniel Appelquist (Vodafone)
Stéphane Boyera (W3C)
Pontus Carlsson (Drutt Corporation)
Rafael Casero (SATEC)
Ronan Cremin (mTLD / .mobi)
Jose Manuel Cantera Fonseca (Telefónica)
Eric Fruhinsholz (Zandan)
Rotan Hanrahan (MobileAware, Chair)
Juan J. Hierro (Telefónica)
Masayasu Ishikawa (W3C)
Martin Jones (Volantis Systems)
Cedric Kiss (W3C)
Iñaki Martínez de Lizarrondo (Telefónica)
Ignacio Marín (Fundación CTIC)
Bennett Marks (Nokia)
Edouard Marques (France Telecom)
Richard Nkrumah (Enough Software)
Luca Passani (WURFL, Openwave)
David Sanders (Vodafone)
Bertrand Schmitt (Zandan)
Andrea Trasatti (WURFL, M:Metrics)
José Ángel Martínez Usero (Technosite)
Njål Wilberg (Mobile Phone Wizards AS)
Toshihiko Yamakami (ACCESS)
Yoram Zahavi (Flash Networks)

Agenda

Linked presentations are in PDF format. There are slight changes to the schedule from the original published agenda, resulting from adjustments made during the workshop.

12 July 2006

Time	Item
9:00 - 9:30	Registration and coffee.
9:30 - 9:45	Welcome from the chair.
9:45 - 11:00	Introduction statements from participants. (Slides from WURFL, Volantis)	Minutes
11:00 - 11:15	Break
11:15 - 11:45	France Telecom Group presentation + Q&A	Minutes
11:45 - 12:30	Debate	Minutes
12:30 - 13:00	Review of Landscape/Ecosystem	Minutes
13:00 - 14:30	Lunch
14:30 - 15:00	MobileAware presentation + Q&A	Minutes
15:00 - 15:30	Review of the DD Requirements	Minutes
15:30 - 16:30	Debate
16:30 - 17:00	Round-up
17:00 - 17:30	Guest presentation: MORFEO MyMobileWeb

13 July 2006

Time	Item
9:00 - 9:45	Coffee. Review of previous day
9:45 - 10:15	Telefónica presentation + Q&A	Minutes
10:15 - 11:00	Technosite presentation + Q&A	Minutes
11:00 - 11:15	Break
11:15 - 12:00	Mobile Phone Wizards presentation + Q&A	Minutes
12:00 - 12:30	Guest presentation: WURFL and WALL	Minutes
12:30 - 13:00	Debate	Minutes
13:10 - 14:00	Lunch
14:00 - 15:30	DDWG Charter discussion	Minutes
15:30 - 16:00	DDWG Charter 2 drafting
16:00 - 16:30	Wrap-up

Executive Summary

The workshop took place at the Telefónica I+D offices in Madrid and commenced with introductory statements from the participants. These included two brief slide sets. The participants represented interested parties from many domains, including mobile operators, manufacturers, service providers, accessibility, content adaptation and device information providers.

Each of the participants expressed an interest in producing or consuming device description data, and acknowledged that an agreement on how such data could be shared and accessed would be of general benefit to the Web, especially its mobile aspect that exhibits the most variability in its client devices.

Initially, the workshop discussed the scope of the DDR. Should it be limited to browsers, or extend to include the physical characteristics of the mechanism in which the browser executes? Should the DDR only hold static properties, or should it include dynamic properties? It was concluded that the DDWG should focus on static properties of Web browsers and their static execution environment, but that the design of the DDR (both API and deployment architecture) should be sufficiently extensible to allow it to support additional properties, data types etc. Meanwhile, the design of the DDR should take into account related technologies that have already been deployed, such as W3C CC/PP, OMA UAProf, W3C DCI and OMA DPE.

The workshop considered the possibility of device information coming from multiple sources. There is a general expectation that the existing sources will not be abandoned, but rather incorporated into a DDR "umbrella". This would be a DDR framework in which there could be many contributors sharing and providing information through a set of nodes (access points, local instances, shared services). From the client's point of view the DDR would appear as a single information source, but within the framework there would be multiple sources. Comparisons to similar views of the DNS were mentioned. However, the quality (is the data validly constructed and has the data been verified through testing on real devices?) will vary from source to source, so the DDR framework must provide some support for trust information associated with the results from any queries. A trust mechanism is seen as essential for a commercial environment, though only to the extent that one can measure the trust/quality rather than the ability to impose trust/quality. By not demanding that all data is verified prior to insertion, the open source model of community contributions could continue. To allay fears of using unverified data, the DDR would need to provide some means of indicating the reliability of the results. Furthermore, versioning would be necessary in commercial environments, especially where testing requires a stable set of input for repeatable examination. In this case, being able to access certain versions (even if they are old) would be beneficial.
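The idea of a single query point over several sources, with trust and version information attached to each answer, can be sketched as follows. This is only an illustration under stated assumptions: the workshop did not define a trust model or an API, so the names Source, PropertyResult and query, and the simple verified flag standing in for trust, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertyResult:
    """One answer to a query, with provenance attached (hypothetical shape)."""
    value: object
    source: str      # name of the contributing source
    verified: bool   # True if the source tested the value on a real device
    version: int     # version of the source's data set that supplied the value


class Source:
    """A hypothetical contributor holding versioned property data."""

    def __init__(self, name: str, verified: bool):
        self.name = name
        self.verified = verified
        self._versions = {}  # version -> {(device_id, property_name): value}

    def publish(self, version: int, data: dict) -> None:
        """Store a complete data set under a version number."""
        self._versions[version] = dict(data)

    def lookup(self, device_id: str, prop: str,
               version: Optional[int] = None) -> Optional[PropertyResult]:
        """Answer from the requested version, or from the latest if none is given."""
        if not self._versions:
            return None
        chosen = max(self._versions) if version is None else version
        data = self._versions.get(chosen, {})
        if (device_id, prop) not in data:
            return None
        return PropertyResult(data[(device_id, prop)], self.name,
                              self.verified, chosen)


def query(sources: list, device_id: str, prop: str) -> list:
    """Ask every source and return all answers, verified ones first."""
    results = []
    for source in sources:
        result = source.lookup(device_id, prop)
        if result is not None:
            results.append(result)
    # sorted() is stable, so sources keep their relative order within each group.
    return sorted(results, key=lambda r: not r.verified)
```

A caller sees one list of answers and can decide for itself how much weight to give unverified data. Pinning the version argument of lookup gives the stable input that repeatable testing needs.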

The meaning of "device" was a topic of debate. The assumption that recognising the browser is sufficient to recognise the physical device is clearly wrong. The User Agent string in the HTTP header is insufficient to identify the true client context. Other techniques, typically based on patterns in sets of headers with a fall-back mechanism based on notional hierarchies of devices, are not guaranteed. Furthermore, the end user is free to customise the hardware or software and this further confuses the idea of "device". The use of the term "device" is historical: early Web-enabled mobile units had a permanent relationship between hardware and the embedded browser. This is no longer the case. In reality, we are trying to determine the delivery context, in which the physical and software characteristics are major constituents. Nevertheless, we continue to use the term "device". In the end, something practical that concentrated on those contextual properties that strongly influence the adaptation of content for presentation on mobile Web-enabled devices (as defined in the original DDWG charter) seemed to offer the best value.

There are many actors involved in the provision and use of device descriptions. While the DDWG's Landscape and Ecosystem documents described many of these actors, the workshop suggested that these should be revised to include the chain of events and show the different stages in which the actors are involved. Furthermore, this should suggest the points where actors make contributions, and where they obtain benefits. When this is in balance we have a viable ecosystem. Otherwise some actors may withdraw. Unfortunately, not all actors (especially manufacturers) understand the way benefits can be obtained from providing input to device descriptions.

If the DDR extends to encompass a lot of device information then there is a concern that the information may be (mis)used for marketing purposes, by using it to present unfavourable feature comparisons. It was agreed, however, that merely preventing certain data entering into the repository would not prevent "comparison shopping". This is already a feature of the Web, and in many cases it is using unverified data, or data that has been transcribed incorrectly, leading to false comparisons. Given the possibility of mining the data in the DDR, effort must be put into considering the issues of trust, quality and (possibly) access, so that at least the information (which is or will be public) is detectably reliable. Ignoring this could lead to negative feedback from the industry.

Publishing the data to the DDR is another issue that concerns manufacturers. It is important that data is not released before the new devices are launched on the market. Currently, the data only appears after launch, and then there is a lag time while various companies test the devices to get reliable descriptions. Meanwhile the adaptations of content for such devices will be sub-optimal, which cannot be good for sales. If the descriptions were published directly into the DDR on the day the devices were launched, there would be no lag and the presentation quality on the new devices would be better. So the DDR needs a good publishing mechanism and an efficient means of distributing the new data.

It is possible that the DDR could grow to contain hundreds of properties for all devices, but in practice only a small number of properties will be necessary to help the general adaptation solutions for the "mobile Web". Specialist companies will probably extend the collection by many hundreds of properties, but these would only be available from their own DDR instances (via the same DDR APIs) and probably only to facilitate their own commercial products. In this way, one API could serve the general Web community and the specialist commercial community equally.

The DDWG carried out an exercise in which adaptation specialists revealed their opinions of what were the essential properties for content adaptation. This was the first time that some insight into the use of device descriptions was shared in the industry. The contributors agreed that these essential properties would be sufficient to achieve a functional content adaptation for the mobile Web and would be of general benefit to the Web community. The kind of "mobile perfect" presentations that the commercial solutions offer rely on many hundreds of additional properties, which would remain available only on a commercial basis. Contributing to the establishment of a trusted collection of essential properties for the spectrum of mobile devices was generally seen as a way to enhance the mobile Web, and in such an improved environment there would be greater opportunity for commercial innovation. Consequently, the commercial representatives were positively disposed to the idea.

The DDWG has already communicated some information about the essential properties to the OMA UAProf group, and so it was suggested in the workshop that the DDWG work closely with the OMA to evolve these properties into a proper vocabulary.

It was suggested that the design of the APIs for the DDR should employ the OMG IDL technology. With care, interfaces defined via IDL can be translated with ease into WSDL to provide a Web Services compatible solution. The IDL could also be used to produce implementations in popular languages such as Java, C/C++, C#, PHP, Perl etc.
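To give a feel for what a language binding of such an interface might look like, here is a minimal Python sketch. It is not derived from any IDL produced by the DDWG (none existed at the time of the workshop). The interface name DDRService, its two operations and the UnknownProperty error are assumptions made purely for illustration.

```python
from abc import ABC, abstractmethod


class UnknownProperty(KeyError):
    """Raised when a node has no value for the requested device/property pair."""


class DDRService(ABC):
    """Hypothetical binding of an abstract, IDL-style DDR interface."""

    @abstractmethod
    def get_property_value(self, device_id: str, property_name: str) -> str:
        """Return the value of one property for one device."""

    @abstractmethod
    def list_property_names(self) -> list:
        """Return the names of the properties this node can answer for."""


class InMemoryDDR(DDRService):
    """A trivial implementation backed by a nested dictionary."""

    def __init__(self, data: dict):
        self._data = data  # device_id -> {property_name: value}

    def get_property_value(self, device_id: str, property_name: str) -> str:
        try:
            return self._data[device_id][property_name]
        except KeyError:
            raise UnknownProperty(f"{device_id}/{property_name}") from None

    def list_property_names(self) -> list:
        return sorted({name for props in self._data.values() for name in props})
```

Because callers depend only on the abstract interface, a specialist node with many more properties and a public node with only the essential ones could both be used through the same calls.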

Some maintainers of device information employ a hierarchical system in which the properties of new devices can be inherited from existing devices represented in their collection. This provides a convenient way to add new devices, and also can provide a fall-back mechanism for approximate matching of devices. It was felt, however, that inheritance should not be mandated in the data model because there can be multiple inheritance paths (the device, its accessories, the embedded software, added software etc.). Instead, one could support the notion of arbitrary grouping/categorisation of devices external to the main data. When serialised (e.g. for import/export purposes), the data associated with each device should be complete and self-contained. This supports the notion of a "single exportable unit", which could form the means by which nodes of a DDR framework would exchange device descriptions. The exchange of grouping/categorisation information was not discussed, though it was recognised as an important issue if one wanted to exchange repository information between DDR nodes with perfect fidelity. Identifying each "exportable unit" raised the general problem of indexing the DDR data. A device identifier is an obvious key, but in the absence of clarity about the definition of "device", and the known limitations of User Agent strings, it was obvious that some work would be needed to define an appropriate unique identifier to ensure there would be no ambiguity in DDR queries.
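The difference between an inheritance-based internal store and a self-contained exportable unit can be illustrated with a short sketch. The store layout, the fall_back field and the JSON export shape below are hypothetical, since the workshop did not define an import/export format. The point is only that inherited values are resolved before serialisation, so the exported record carries no reference to a parent.

```python
import json

# Hypothetical internal store: each entry may name a parent to inherit from.
INTERNAL_STORE = {
    "generic": {"fall_back": None,
                "props": {"screen_width": 96, "colors": 2}},
    "vendor_series": {"fall_back": "generic",
                      "props": {"screen_width": 128, "colors": 256}},
    "vendor_model_a": {"fall_back": "vendor_series",
                       "props": {"screen_width": 176}},
}


def resolve(device_id, store):
    """Merge properties along the fall-back chain, the nearest entry winning."""
    chain = []
    seen = set()
    current = device_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in fall-back chain at {current!r}")
        seen.add(current)
        chain.append(store[current])
        current = store[current]["fall_back"]
    merged = {}
    # Apply the most generic entry first so that specific entries override it.
    for entry in reversed(chain):
        merged.update(entry["props"])
    return merged


def export_unit(device_id, store):
    """Serialise one device as a complete record with no parent reference."""
    unit = {"id": device_id, "properties": resolve(device_id, store)}
    return json.dumps(unit, sort_keys=True)
```

A node receiving the output of export_unit needs no knowledge of the sender's hierarchy, which is what allows each maintainer to keep or drop inheritance internally.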

Some of the information that could be supplied through the DDR might be subjective in nature, such as indicating the most appropriate features of the device to use in order to maximise the accessibility of the content/service. It is possible that (a node of) the DDR could supply machine-readable guidance provided to the community by accessibility specialists. For example, the DDR could indicate the most legible fonts, best colour contrasts, most accessible buttons to use as "hot keys". Currently, guidance on what to do in this space is available in publications intended for human consumption (e.g. the WAI guidelines), but the DDR could make some of this information directly available on a per-device basis. It was agreed that the DDWG should liaise with WAI.

Prior to the commencement of the charter discussion, the workshop participants debated the expected nature of the DDR. All agreed it should be simple. The API and a vocabulary of essential properties are high priorities. The DDR should not merely gather data from other sources, but be a bridge to those sources. Avoid replicating past efforts. Instead, build on them. Re-use where possible. While the architecture of the DDR framework is important, the participants felt that the emphasis should be on the API, creating an initial vocabulary, supporting extensibility and demonstrating the utility of the solution. The DDWG should concentrate on how the data is moved around (e.g. queries, updates etc.) and not on how the data is stored. The general feeling that the DDR framework will be distributed in nature will need more careful examination, possibly outside W3C, but not necessarily in the second charter.

Charter Discussion summary

Based on the follow-on items mentioned in the first DDWG charter, the following items were initially considered.

An architectural design (including trust model)
A design for a reference repository of device descriptions
A reference (prototype?) implementation of a repository
A solution for aiding discovery of vocabularies and profiles. (forum?)
A strategy for extending, distributing and delegating the repository where appropriate.
A set of interface specifications for methods to access the device repository.
Guidelines on the population and maintenance of device information repositories.

The meaning of "reference" was discussed. This could mean an actual collection of device descriptions. It could mean an implementation (in software) of an API defined by the group, with the implementation being a definitive interpretation of how the API should operate. The relationship between Apache Tomcat and the Servlet specifications was cited as an example. The participants felt the suggested items also supported the need for extensibility in any final solution. The creation of an interface specification was generally seen as an achievable goal.

The participants continued to consider the following items:

A collection of values for core characteristics for a large set of devices
Decision on running an actual repository (node)
Definition of the namespace of core capabilities and process for extending it
Administrative protocols (including import/export/archive formats)
Administrative functionalities
Security model(s) for access and administration
Update of the list of "essential device properties"
Definitions of categories/groups of device characteristics
Establishing formal relationships with DIWG and the subgroup responsible for DCI
Establishing formal relationships with OMA DPE and Vocabulary Management activities
Establishing formal relationships with WAI
Inheritance and fallback/best-effort mechanisms (and issues of trust inheritance)
Unique index for identifying the device, the context etc. (Role of User-Agent Header string)
Versioning (implied in the DDR requirements)
Clustering/inheritance within DDR data
Should DDR node describe how it gets data? (Assumptions, direct testing, inferred, guesswork...)
SAY SOMETHING ABOUT KEEPING FREE INFO FREE (DDWG should examine and then choose/support an option)
PRINCIPLE: Ensure proposed solution works with real/legacy devices.

The participants felt that the DDWG should look again at the "Top 5" survey it conducted to determine the set of essential device properties that have the most impact on content adaptation. This information should be published by the DDWG and be used to determine an initial vocabulary of essential properties with which to seed an initial implementation of a repository. The DDWG should also put in place a mechanism for review of such a vocabulary as mobile devices evolve.

The idea that information can come from multiple sources suggests some form of namespacing may be necessary. This is separate from any notion of categorisation or grouping, which could be implemented ad-hoc and should not be tied directly to any internal data structures (such as hierarchies). While it was considered useful to be able to add new devices by inheriting properties from existing devices, it was felt that such an inheritance hierarchy should not be imposed on the repository implementation. A sufficiently flexible categorisation strategy should be able to support such functionality if required. The fall-back mechanism for device recognition is similarly associated with the concept of device hierarchies, but the means by which recognition is performed (a process described during the workshop as a "hack") is not solid (deterministic) enough to guarantee the same behaviour in all implementations. Therefore such a process is considered a "value-add" and a "commercial opportunity" for suppliers of DDR-related services (including adaptation). While the DDR could acknowledge the availability of such a mechanism, it could not prescribe how it behaves.
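One way to picture the separation between namespaced property data and ad-hoc grouping is the sketch below. It is an assumption-laden illustration, not a proposed design: the Repository class, its method names and the use of plain string labels for namespaces and groups are all hypothetical.

```python
from collections import defaultdict


class Repository:
    """Keeps property values and device groupings in separate structures."""

    def __init__(self):
        # (device_id, namespace, property_name) -> value
        self._values = {}
        # group label -> set of device ids; independent of the property data
        self._groups = defaultdict(set)

    def put(self, device_id, namespace, name, value):
        """Record a value contributed under the given namespace."""
        self._values[(device_id, namespace, name)] = value

    def get(self, device_id, namespace, name):
        """Return the value from one namespace, or None if absent."""
        return self._values.get((device_id, namespace, name))

    def tag(self, group, device_id):
        """Add a device to an arbitrary group without touching its properties."""
        self._groups[group].add(device_id)

    def members(self, group):
        """Return the devices in a group, sorted for stable output."""
        return sorted(self._groups.get(group, ()))
```

Two contributors can then publish a property with the same name without colliding, and any number of overlapping categorisations can be layered on top without imposing a hierarchy on the stored data.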

The issue of device recognition ambiguity comes about partly because of the failure of the User Agent string to properly identify the device, and also because of the ambiguity in the interpretation of what we mean by "device". A device could be the physical mechanism in which the browser is operating, it could be the browser software that is acting as an agent for the user, or it could be the combination of the two. Furthermore, with the ability to add physical attachments to the user mechanism and the ability to replace, extend or otherwise change the software, the dynamic nature of the mobile environment should be recognised. Other work, such as the DIWG DCI and the OMA DPE, recognises this evolution in technology, but the User Agent string in the HTTP headers as currently used is insufficient in this context. Therefore it was proposed that device recognition be considered a specialised process in which as much information about the client is gathered (inferred) from whatever information is provided in the request (HTTP headers), and that this process is in need of some work. No conclusion was reached regarding whether or not the DDWG should take responsibility for addressing this issue (e.g. through a revision of the use/format of the User Agent string).
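A recognition process of the kind discussed, drawing on several request headers with a fall-back step, might look roughly like the following. This is a deliberately simple heuristic written for illustration, not a method presented at the workshop. The lookup tables are hypothetical, and the order of the steps is an assumption. The x-wap-profile header is the one commonly used to carry a UAProf profile URL.

```python
def recognise(headers, known_user_agents, known_profiles):
    """Return (device_key, method), or (None, "unrecognised").

    known_user_agents maps User-Agent strings to device keys;
    known_profiles maps UAProf profile URLs to device keys.
    Both tables are hypothetical inputs supplied by the caller.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}
    user_agent = h.get("user-agent", "")

    # Step 1: exact match on the User-Agent string.
    if user_agent in known_user_agents:
        return known_user_agents[user_agent], "exact-user-agent"

    # Step 2: match on the profile URL, which is often sent in quotes.
    profile = h.get("x-wap-profile", "").strip('"')
    if profile in known_profiles:
        return known_profiles[profile], "uaprof-url"

    # Step 3: fall back to the longest known string that prefixes the User-Agent.
    candidates = [ua for ua in known_user_agents if ua and user_agent.startswith(ua)]
    if candidates:
        best = max(candidates, key=len)
        return known_user_agents[best], "user-agent-prefix"

    return None, "unrecognised"
```

Returning the method alongside the key lets the caller judge how far to trust the match. A different implementation could reasonably order or weight these steps differently, which is why such behaviour is hard to standardise.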

There was a debate about providing free access to certain device information. There is a good argument for making certain basic information free, and currently this is done through the publication of UAProf by several manufacturers. This published information is freely available. If the industry was to move to access such information via a standardised API that is part of the DDR then there is a concern that the providers of DDR access points would require payment. The idea that free information would now require payment was not liked. However, it was pointed out that the open source community would likely provide the means to access their free data via a DDR API, and that it would probably be an easy exercise for organisations like the OMA to provide a DDR API for their growing collection of validated UAProf instances. What was required, it was felt, was a mechanism to ensure that any free device information to which a DDR entity has added no extra value (other than providing an alternative interface) should be precluded from requiring payment. Derived information would be considered to be "value-add" and a payment may be justified. It was not clear how to distinguish between what is value-added and what is not. For example, gathering a set of UAProf instances and making them available through a Web Service could be seen as value added, which could be pay-per-use, but this might also be viewed as a proxy to free data that should remain free.

The participants recognised that there were many worthwhile tasks that the DDWG could undertake, but generally expressed a preference for concentrating on those tasks that could be achieved in a 12-18 month timeframe. Among these achievable tasks the participants identified the following:

Requirements doc to UAProf Vocabulary Management activity in OMA that specifies in satisfactory detail the top N properties. (Update the top N before we send to OMA.)
Definition of DDR API (and required semantics)
Call for prototype implementations
Call for hosting of a permanent access point
Policy statement on the management and use of DDR data (Example: the top N properties could be required to continue to be free via any system using DDR)
Requirements for unique device identifier and UA identifier.
Establish a policy for ongoing update of Top N properties.

It was felt that a useful vocabulary could be developed in close cooperation with the OMA, while the DDWG would focus on creating an API. To demonstrate the usefulness/viability of such an API, the group would call for prototype implementations, with the expectation that some of the group members would provide implementations. Furthermore, the DDWG was encouraged to make a call for a host of a permanent access point to the repository, which (depending on the ultimate architecture of the repository) would have some or all of the device descriptions stored on the host. At a minimum, such a host would provide access to the essential device properties as defined in a DDWG/OMA vocabulary, and initially collected as a task of the DDWG. Whether or not the W3C would offer to be such a host would depend on resourcing. If there are multiple such hosts then it follows that some aspect of distribution will be part of the deployment architecture. However, while such aspects of the architecture can be anticipated, the workshop participants suggested that the architecture should not be a primary concern for the DDWG, which should instead focus on the API for contribution of data and execution of queries, and treat the repository itself as a "black box".

In terms of making the essential device information generally available, several related activities were identified. 1) Determining the essential properties. 2) Creating a formal vocabulary for the essential properties. 3) Gathering and verifying values for the identified properties for a large collection of mobile devices. 4) Storing the gathered data in a machine readable format. 5) Distributing the gathered data to one or more hosts who will provide access to the data using a DDR API.

The above activities can be performed by one or more bodies who are committed to the goals of the DDWG. For example, the determination of the essential properties and creation of a formal vocabulary can be conducted between W3C and OMA based on previous work of the DDWG and UAProf WG. The gathering of data could be conducted through the open source community, with verification assisted by existing repository owners (e.g. some DDWG members). The storage of the gathered data could be accomplished by placing the data into an import/export format document hosted on (say) the W3C site. Finally, the distribution of the data can be achieved by downloading of the data to remote nodes that implement the DDR API and which in turn permit users to execute DDR queries via these APIs. Alternative means of distribution can be considered later if the document size or frequency of access cause undesirable performance burdens.
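The download-and-serve arrangement described above can be sketched in a few lines. The document layout (a version number plus a list of self-contained device records in JSON) and the Node class are assumptions for illustration only, as no import/export format had been defined. Fetching the document over the network is left out so that the sketch stays self-contained.

```python
import json


class Node:
    """A hypothetical remote node that serves queries from a downloaded document."""

    def __init__(self):
        self.version = 0
        self._devices = {}  # device_id -> {property_name: value}

    def refresh(self, document_text):
        """Load a downloaded export document if it is newer than the current data.

        Returns True if the node's data was replaced, False otherwise.
        """
        document = json.loads(document_text)
        if document["version"] <= self.version:
            return False
        self._devices = {d["id"]: d["properties"] for d in document["devices"]}
        self.version = document["version"]
        return True

    def query(self, device_id, property_name):
        """Return the stored value, or None if the device or property is unknown."""
        return self._devices.get(device_id, {}).get(property_name)
```

In this arrangement every node answers from its own copy, so the cost of the central document is one download per node per release rather than one request per query.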

The following statements were also generally viewed positively at the workshop:

[It was suggested (though not unanimously supported) that] anything that falls into the "subjective" could be classed as "business opportunities". E.g. my technique for using the data is better than yours...
[We should consider] management of derived data.
Focus on what is needed to get onto the Rec track. Issue: Quantification of Trust & Quality.
Can I discover from where the response got its information when I query the DDR?
Implementation within the black boxes should be out of scope. Charter must take into account all communities that will create/use this data. Can't have situation where everyone has to pay. There might be a risk if we ignore what is in the box. Must be a requirement for a common free base, regardless of what is inside. Cannot have situation where getting access requires payment for access.
Open Source (e.g. WURFL) could provide simple implementation of the API so that at least one free source of info is available.
When data is given for free (e.g. Nokia device profile) it should continue to be free even if it gets included within some other mechanism (e.g. a DDR interface). We should mandate that this free basic information continues to be available free through DDR.

The workshop concluded by agreeing that device recognition is a problem, that making device information generally available has significant benefits, and that a vocabulary of the Top N properties (for use in content adaptation) would be a good piece of work to do in cooperation with the OMA. The workshop suggested that the second charter focus mainly on an API for the DDR, and encourage implementation through prototypes, possibly in an open source environment. The architecture of the repository itself should be recognised, but should not necessarily be determined under the new charter. The architecture is a longer-term project. Trust (of the data provided via a DDR) is an issue raised by many participants, but for an initial implementation it may be sufficient to adopt a simple trust model. Finally, the workshop concluded that any device information that is currently freely available should continue to be freely available even if made available via services making use of the DDR API, though how such an aspiration could be enforced remains an open issue.

The results of the workshop, in particular the outcome of the charter discussion, will be used as a major contribution to the drafting of a new charter for the DDWG. The Chair concluded the workshop by thanking the participants for their contributions and thanking the host, Telefónica, for providing the venue.

Miscellany

Live minutes from the two days
Summary of presentations
Photographs and Videos
Liaison statement from OMA (stored on OMA site)

Program Committee

The Program Committee was appointed by the MWI Steering Council to review submissions to the workshop, and comprises the group Chair, W3C Team Contact and the following appointees:

Daniel Appelquist (Vodafone, Chair MWI BPWG)
Mark Butler (HP)
Joachim Dahlgren (Drutt Corporation)
Zohar Iarchy (Adamind)
Bennett Marks (Nokia, Vice-Chair OMA BAC-MAE)
James Pearce (Argo Interactive Ltd)
Jo Rabin (mTLD / .mobi)
Peter Stark (SonyEricsson)
Andrea Trasatti (M:Metrics Ltd and WURFL)
Keith Waters (France Telecom)

Workshop Chair: Rotan Hanrahan, MobileAware, MWI Device Description Working Group Chair