MWI Team Blog

Dispatches from members of the W3C Mobile Web Initiative Team


Augmented Reality on the Web — 26 April 2010

The concept of Augmented Reality (AR) is not new - just think of heads-up displays in post-war military aeroplanes. Even the term Augmented Reality is not new (it is believed to have been coined in 1990 by Thomas Caudell, then of Boeing) and users are already familiar with AR or AR-like experiences. The most cited example is that of graphical overlays on TV coverage of sports events, but things like 'Bright Dancing', promoted by British ISP Talk Talk as part of its sponsorship of the X-Factor TV talent show, have brought some elements of the technology directly to public attention. The marketing and advertising world is experimenting with AR technologies, essentially using image recognition to trigger the display of product information, promotions and the like. See, for example, Cross Platform's work promoting a fashion retailer: AR was used to attract and engage the attention of the passing public before directing them into the store to find out if they had won a prize.

Pranav Mistry using hand movements to control a computer screen that exists only as a projected image in front of him

Figure 1: Pranav Mistry demonstrates the 'Sixth Sense Device'

On the cutting edge of the technology, Pranav Mistry, at MIT's Media Lab, is the man behind so-called 'sixth sense' devices that allow users to interact with the real world in very sophisticated ways: pulling up information about real-world objects, displaying data on everyday surfaces and responding to gestures as commands (see Figure 1).

In the Q&A session at the end of his November TED India talk, Mistry makes the point that for people with sensory disabilities, the numerical term in 'sixth sense' may be inappropriate. AR has real potential to help people with a variety of disabilities; examples include:

  • a wheelchair user seeking an accessible route or the one that has the fewest slopes to a particular destination;
  • a blind person being guided by an audio narrator about the surrounding world;
  • assistance with daily activities (think of a person with cognitive impairments who could see the contents of a can before buying it at the supermarket);
  • a deaf person seeing the beats of music in addition to feeling the vibrations while dancing away with friends.

Only some of the current and future AR applications make use of a smartphone as a mobile computing platform, so the arrival of smartphones can only be part of the reason for all the excitement around the term today. Improvements in image recognition, display technologies and multimedia platforms must also be part of the story. Nevertheless, smartphones are an important part of the landscape and do have the potential to put AR in more people's hands - or should that be eyes, ears and hands?

Standards and Interoperability

Despite the excitement, there is perhaps a roadblock for AR: a lack of standardization and interoperability. A serious attempt at creating an interoperable standard has been made in ARML. This is an extension to the Open Geospatial Consortium's KML (formerly Keyhole Markup Language), originally developed by Keyhole, Inc. for the product that became Google Earth. However, not all AR applications use the format, so data produced by third parties for use in one 'Augmented Reality Browser' is unlikely to be interoperable with that created for another. And what is an Augmented Reality Browser? Don't we already have several well-known and massively distributed applications that can render data, dynamic images and more, delivered in a wide variety of formats?

Web technologies may not be able to support AR directly today, but many of the key pieces are already in place, more are close to market, and the gap between what AR delivers today and what is possible on the Web is closing all the time.


It's not so long ago that all the buzz was about mash-ups: taking data from multiple sources and putting them together in some fashion, usually on a map. Figure 2 shows a screenshot of an example of this: a service that uses some of the data made available very recently by the UK government. Users can indulge the British obsession with house prices by surfing on their desktop, but surely the data could easily be included within a mobile AR experience? The most recent price paid for the house you're walking past, or where the houses for sale are near your current location, will be of interest to some users. Whether it's a mash-up or AR, it's one data set superimposed on another.

Screenshot showing prices paid and dates for house purchases on a UK street, with a map of the location

Figure 2: House Prices Map created using RDF data from the UK Land Registry as well as other sources
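
To make the idea of one data set superimposed on another a little more concrete, here is a minimal sketch of the 'houses near your current location' case. The PricePaid record shape, its field names and the function names are assumptions made for illustration; they are not the schema of the UK government data or the code behind the service in Figure 2. Distances use the standard haversine formula.

```typescript
// Hypothetical record shape: field names are assumptions for illustration only.
interface PricePaid {
  address: string;
  price: number; // pounds sterling
  lat: number; // degrees
  lon: number; // degrees
}

const EARTH_RADIUS_M = 6371000; // mean Earth radius in metres

function toRadians(degrees: number): number {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance in metres between two points (haversine formula).
function distanceMetres(lat1: number, lon1: number, lat2: number, lon2: number): number {
  const sinHalfLat = Math.sin(toRadians(lat2 - lat1) / 2);
  const sinHalfLon = Math.sin(toRadians(lon2 - lon1) / 2);
  const a =
    sinHalfLat * sinHalfLat +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * sinHalfLon * sinHalfLon;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// Keep only the records within radiusMetres of the given position, nearest first.
function recordsNear(
  records: PricePaid[],
  lat: number,
  lon: number,
  radiusMetres: number
): PricePaid[] {
  return records
    .map((record) => ({ record, metres: distanceMetres(lat, lon, record.lat, record.lon) }))
    .filter((entry) => entry.metres <= radiusMetres)
    .sort((x, y) => x.metres - y.metres)
    .map((entry) => entry.record);
}
```

An AR client could call something like recordsNear with the device's current position and a radius of a couple of hundred metres, then draw the surviving records over the camera view rather than on a map.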

A 'nice to have' feature for future AR development might be a common method of identifying Points of Interest across different data sources. Perhaps a URI scheme? That should make the combining of data from different sources easier, and it might also provide a bridge across the sadly all too real divide between those who see AR as part of, and those who see it as quite separate from, linked data (a.k.a. the Semantic Web). Individuals will always have their preferences for how they like to process data, but the format in which it is made available should not be a barrier to any serious developer.
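
As a rough illustration of why a shared identifier helps, the sketch below merges records from two sources that happen to use the same URI for the same Point of Interest. Everything in it is hypothetical: the PoiRecord shape, the mergeByPoi function, the example.org URIs and the property names are invented for the example, and no such URI scheme exists yet.

```typescript
// A record from any source, keyed by a shared Point of Interest URI.
// The shape and property names are assumptions made for illustration.
interface PoiRecord {
  poi: string; // URI identifying the Point of Interest
  properties: Record<string, string>;
}

// Merge any number of sources into one lookup from POI URI to combined properties.
// When two sources use the same property name, the later source wins.
function mergeByPoi(...sources: PoiRecord[][]): Record<string, Record<string, string>> {
  const merged: Record<string, Record<string, string>> = {};
  for (const source of sources) {
    for (const record of source) {
      merged[record.poi] = { ...(merged[record.poi] || {}), ...record.properties };
    }
  }
  return merged;
}

// Two made-up sources describing the same house under the same (made-up) URI.
const pricesPaid: PoiRecord[] = [
  { poi: "http://example.org/poi/house-1", properties: { lastPricePaid: "250000" } },
];
const forSaleListings: PoiRecord[] = [
  { poi: "http://example.org/poi/house-1", properties: { forSale: "true" } },
];
const combinedPois = mergeByPoi(pricesPaid, forSaleListings);
```

With a common identifier the join is a simple key lookup; without one, the same merge would need address matching or coordinate comparison, both of which are error-prone.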


Today, the W3C Geolocation API specification is a Last Call Working Draft; however, the API is already well implemented in products in the marketplace (Android, iPhone and Firefox 3.5), and there's a test suite under development. Yet this is perhaps just the most obvious of the APIs already under development at W3C. Between them, the Geolocation, Device APIs & Policy and Web Apps working groups are in various stages of completing work on APIs for:

As these APIs and more become available, Web applications, whether delivered as pages on a Web site or as widgets, will be able to offer a full range of Augmented Reality and Augmented Reality-like services as part of the regular Web experience. These working groups do not specify how, say, Firefox should ascertain a user's location, merely how a Web application can access the data. In the case of location it couldn't be simpler:
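
A minimal sketch of what that access can look like follows. The getCurrentPosition method and the coords.latitude, coords.longitude and coords.accuracy properties come from the W3C Geolocation API; the describeLocation helper and the PositionLike, PositionErrorLike and GeolocationLike types are assumptions introduced here so that the example is self-contained and can be exercised outside a browser.

```typescript
// Minimal structural types covering only the parts of the Geolocation API used below.
interface PositionLike {
  coords: { latitude: number; longitude: number; accuracy: number };
}

interface PositionErrorLike {
  code: number;
  message: string;
}

interface GeolocationLike {
  getCurrentPosition(
    success: (position: PositionLike) => void,
    error?: (error: PositionErrorLike) => void
  ): void;
}

// Ask for the current position and pass a readable summary (or the failure) to report.
function describeLocation(geolocation: GeolocationLike, report: (text: string) => void): void {
  geolocation.getCurrentPosition(
    (position) => {
      const { latitude, longitude, accuracy } = position.coords;
      report(`You are at ${latitude}, ${longitude} (within ${accuracy} m)`);
    },
    (error) => report(`Location unavailable: ${error.message}`)
  );
}

// In a browser that implements the API, the real object is navigator.geolocation:
//   describeLocation(navigator.geolocation, (text) => console.log(text));
```

Note that the application never learns how the position was obtained (GPS, Wi-Fi, cell towers or anything else); it only receives the coordinates and an accuracy estimate, or an error if the user or device declines.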




HTML 5, SVG, canvas and CSS background layers: these latest Web technologies and more are making the Web an ever-more powerful open platform with highly flexible and responsive display capabilities. Several features of HTML 5 have been separated out into new documents in recent months (such as Web Storage, which defines name/value pair storage APIs for Web clients, and Programmable HTTP Caching and Serving, which defines APIs for accessing HTTP requests while offline). This allows the HTML 5 Working Group to make more rapid progress in some areas without necessarily having to wait for the entire standard to be completed.

So What Needs to be Done?

Answering that is a significant task in itself, but several groups are looking at the issue closely; AR Devcamp, for example, is a very active community. W3C participated in the Augmented Reality Summit at this year's Mobile World Congress and, working with the organizers of that event, we're returning to Barcelona this June for a W3C Workshop on the topic.

Join us!

The Call for Papers has been made with a closing date of 29th May. Various topics are suggested in the call but essentially, if you're interested in seeing AR developed in the royalty-free, open platform of the Web, this is the workshop for you.

by Phil ARCHER


Contacts: Dominique Hazael-Massieux