Bringing Augmented Reality into the Web

Position paper for the Mobile Augmented Reality Summit, Barcelona, 17 February 2010.

Introduction

The concept of Augmented Reality (AR) is not new - just think of heads-up displays in post-war military aeroplanes. Even the term Augmented Reality is not new (it is believed to have been coined in 1990 by Thomas Caudell, then of Boeing) and users are already familiar with AR or AR-like experiences. The most cited example is that of graphical overlays on TV coverage of sports events, but things like 'Bright Dancing', promoted by British ISP Talk Talk as part of its sponsorship of the X-Factor TV talent show, have brought some elements of the technology directly to public attention. The marketing and advertising world is experimenting with AR technologies, essentially using image recognition to trigger the display of product information, promotions and the like. See, for example, Cross Platform's work promoting a fashion retailer: AR was used to attract and engage the passing public before guiding them into the store to find out if they had won a prize.

Pranav Mistry using hand movements to control a computer screen that exists only as a projected image in front of him

Figure 1: Pranav Mistry demonstrates the 'Sixth Sense Device'

On the cutting edge of the technology, Pranav Mistry, at MIT's Media Lab, is the man behind so-called 'sixth sense' devices that allow users to interact with the real world in very sophisticated ways: pulling up information about real-world objects, displaying data on everyday surfaces and responding to gestures as commands (see Figure 1).

In the Q&A session at the end of his November TED India talk, Mistry makes the point that for people with sensory disabilities, the numerical term in 'sixth sense' may be inappropriate. AR has real potential to help people with a variety of disabilities; examples include:

Only some of the current and future AR applications make use of a smartphone as a mobile computing platform, so the arrival of smartphones can only be part of the reason for all the excitement around the term today. Improvements in image recognition, display technologies and multimedia platforms must also be part of the story. Nevertheless, smartphones are an important part of the landscape and do have the potential to put AR in more people's hands - or should that be eyes, ears and hands?

Standards and Interoperability

Despite the excitement, there is perhaps a roadblock for AR: a lack of standardisation and interoperability. A serious attempt at creating an interoperable standard has been made in ARML. This is an extension to the Open Geospatial Consortium's KML (formerly Keyhole Markup Language), initially developed by Google for Google Earth. However, not all AR applications use the format, so data produced by third parties for use in one 'Augmented Reality Browser' is unlikely to be interoperable with data created for another. And what is an Augmented Reality Browser? Don't we already have several well-known and massively distributed applications that can render data, dynamic images and more, delivered in a wide variety of formats?

Web technologies may not be able directly to support AR today, but many of the key pieces are already in place, more are close to market, and the gap between what AR delivers today and what is possible on the Web is closing all the time.

Data

It's not so long ago that all the buzz was about mash-ups: taking data from multiple sources and putting them together in some fashion, usually on a map. Figure 2 shows a screenshot of an example of this: mouseprice.com is a service that uses some of the data made available very recently by the UK government. Users can indulge the British obsession with house prices by surfing on their desktop, but surely the data could easily be included within a mobile AR experience. The most recent price paid for the house you're walking past, or the location of houses for sale near you, will be of interest to some users. Whether it's a mash-up or AR, it's one data set superimposed on another.

Screenshot of mouseprice.com showing prices paid and dates for house purchases on a UK street and a map of the location

Figure 2: House Prices Map created using RDF data from the UK Land Registry as well as other sources
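To make the idea of superimposing one data set on another concrete, the following is a minimal sketch of how a mobile client might pick out the price records close to the user's position. It is not how mouseprice.com works: the record shape, the function names, the sample addresses and the 100 metre radius are all assumptions made for illustration, and the distance is computed with the standard haversine formula.

```javascript
const EARTH_RADIUS_M = 6371000; // mean Earth radius in metres

// Great-circle distance between two latitude/longitude points (haversine formula).
function distanceMetres(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.latitude - a.latitude);
  const dLon = toRad(b.longitude - a.longitude);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.latitude)) * Math.cos(toRad(b.latitude)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h));
}

// Keep only the records within `radiusMetres` of the user, nearest first.
function nearbyRecords(user, records, radiusMetres) {
  return records
    .map((record) => ({ ...record, distance: distanceMetres(user, record) }))
    .filter((record) => record.distance <= radiusMetres)
    .sort((x, y) => x.distance - y.distance);
}

// Invented sample data, for illustration only.
const sales = [
  { address: "1 Example Street", price: 250000, latitude: 51.5010, longitude: -0.1420 },
  { address: "2 Example Street", price: 310000, latitude: 51.5012, longitude: -0.1421 },
  { address: "9 Faraway Road", price: 180000, latitude: 51.6000, longitude: -0.3000 },
];
const here = { latitude: 51.5011, longitude: -0.1420 };

// The two Example Street records are a few metres away; Faraway Road is excluded.
const visible = nearbyRecords(here, sales, 100);
```

An AR view would then draw the `visible` records over the camera image, where a mash-up would draw them on a map; the selection step is the same in both cases.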

APIs

The W3C Geolocation API specification is a Last Call Working Draft. The API is nevertheless already well implemented in products in the marketplace: Android, iPhone and Firefox 3.5. A test suite is under development, as is a second version of the API to incorporate new features (such as access to the compass and accelerometer). In devices that already implement the nascent standard, finding the user's current location is as simple as:

navigator.geolocation.getCurrentPosition(successCallback)
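In practice the call is made on the navigator.geolocation object and is given a callback that receives the position once it is known. The sketch below shows one way to wrap it; the function name requestLocation, the shape of the simplified result and the 10 second timeout are assumptions for illustration, and the geolocation object is passed in as a parameter only so that the code can also be exercised outside a browser.

```javascript
// Ask a Geolocation implementation for the current position and hand a
// simplified result to `onFix`. In a browser, pass navigator.geolocation as `geo`.
function requestLocation(geo, onFix, onError) {
  if (!geo || typeof geo.getCurrentPosition !== "function") {
    onError(new Error("Geolocation is not available"));
    return;
  }
  geo.getCurrentPosition(
    (position) =>
      onFix({
        latitude: position.coords.latitude,   // decimal degrees
        longitude: position.coords.longitude, // decimal degrees
        accuracy: position.coords.accuracy,   // metres
      }),
    onError,
    { enableHighAccuracy: true, timeout: 10000 } // give up after 10 seconds
  );
}

// Browser usage:
// requestLocation(navigator.geolocation, (fix) => console.log(fix), console.error);
```

The error callback matters as much as the success one: the user may decline to share a location, or the device may be unable to obtain a fix.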

The standardisation of other APIs is well under way in two other W3C Working Groups. Firstly, the Device APIs and Policy Working Group is working on giving Web developers more access to the information and functionality of the device itself (camera, PIM data etc.). The first public working draft of the contact API is now available. Secondly, the WebApps Working Group is documenting existing APIs (like XMLHttpRequest) and developing new APIs in areas such as DOM Events, Cross Origin Resource Sharing and Server-Sent Events.

Display

HTML 5, SVG, canvas and CSS background layers: these latest Web technologies and more are making the Web an ever more powerful open platform with highly flexible and responsive display capabilities. Several features of HTML 5 have been separated out into new documents in recent months (such as Web Storage, which defines name/value pair storage APIs for Web clients, and Programmable HTTP Caching and Serving, which defines APIs for accessing HTTP requests while offline). This allows the HTML 5 Working Group to make more rapid progress in some areas without necessarily having to wait for the entire standard to be completed.
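As an illustration of the name/value pair storage mentioned above, the sketch below keeps the user's last known position so that an application has something to show before a fresh fix arrives. It relies only on the getItem and setItem methods of Web Storage; the key name, the function names and the in-memory stand-in are assumptions made for this example.

```javascript
const LAST_FIX_KEY = "lastKnownPosition"; // hypothetical key name

// Save a position as JSON under a single key. `storage` is anything with the
// Web Storage getItem/setItem methods; in a browser, pass window.localStorage.
function saveLastFix(storage, fix) {
  storage.setItem(LAST_FIX_KEY, JSON.stringify(fix));
}

// Read the position back, or return null if nothing usable has been stored.
function loadLastFix(storage) {
  const raw = storage.getItem(LAST_FIX_KEY);
  if (raw === null) return null;
  try {
    return JSON.parse(raw);
  } catch (e) {
    return null; // the stored value was not valid JSON
  }
}

// Minimal in-memory stand-in for environments without Web Storage.
function createMemoryStorage() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => { items.set(key, String(value)); },
  };
}

// Browser usage:
// saveLastFix(window.localStorage, { latitude: 41.39, longitude: 2.17 });
```

Because Web Storage holds strings only, structured values such as a position have to be serialised, here with JSON.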

So What Needs to be Done?

Answering that is a significant task in itself and W3C exists to facilitate community action to build consensus. Therefore, if there is to be work within W3C to make AR part of the Web then the first step is to gather that community together and look at what needs doing. At one extreme there will be nothing to do — the consensus will be that everything is already in place for AR to be available through existing Web technologies. At the other extreme would be an agreement that a whole new technology stack was required.

Neither scenario seems entirely plausible but it's worth noting that W3C is not advancing a specific plan for Augmented Reality. It is for the community to decide on the most appropriate course of action, knowing that W3C is ready to support such action.

The vehicle that W3C has for doing this kind of community-gathering and roadmap-building work is an Incubator Group (XG). It is the community itself that creates the charter for an XG and there must be at least 3 W3C members in the XG for it to go ahead. The rules for membership of XGs are similar to those for full Working Groups: any W3C member may join an XG. The XG chair and W3C Team may invite experts from non-member companies to join the group. An XG does not create standards. Its task is to write a report that may or may not highlight the standards work that needs to be done. In effect it's a way to do the groundwork that will speed up progress if a full Working Group is chartered to create new standards. Invited Experts may also participate in WGs but where individuals are affiliated with companies that might reasonably be expected to join as W3C members, they will be asked to do so.

A full list of current and past XGs is available.

If you are interested in establishing an ARXG, please contact me through phila@w3.org.

Phil Archer
For the OMWeb Team

Acknowledgements

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 248687, Open Media Web (OMWeb).

Thanks to Dan Appelquist and Shadi Abou-Zahra for their help in writing this short paper.

Last updated: 2010/02/04 16:44:37