Dave Raggett

This is my home page where you can learn about my interests, achievements and how to contact me. Here is my Curriculum Vitae.

Recipient of Talking Hands Award in January 2004.

I am a member of the W3C Team and a visiting professor at the University of the West of England. I work from home in the UK on behalf of W3C's European host ERCIM. Much of my time is spent on participating in European projects, e.g. Gatekeeper (smart healthcare for elderly and frail patients), Boost 4.0 (big data in factories), SDN-microSENSE (cybersecurity for smart grids) and TERMINET (next generation IoT). My past projects include Create-IoT (Internet of Things), COMPOSE, webinos, Serenoa and PrimeLife.

For W3C, I am helping with the Web of Data and the Web of Things, which defines an abstraction layer for digital twins that decouples developers from the underlying IoT technologies as a basis for addressing the fragmentation of the IoT. I first talked about the Web of Things in 2007, and later organised a Web of Things workshop in 2014, leading to a W3C Recommendation for thing descriptions in JSON-LD for digital twins as objects with properties, actions and events. Over the years I have organised many other W3C workshops, e.g. Graph Data in 2019, Web Payments in 2014, Web & Automotive workshop in 2013, Multimodal Interaction in 2007, Voice Browsers and Shaping the future of HTML both in 1998.
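As a rough illustration of the "objects with properties, actions and events" idea, the sketch below builds a thing description as a plain Python dictionary and lists the affordances it declares. The lamp, its affordance names and the example.com URLs are invented for illustration, and the structure only loosely follows the shape of a thing description; it is not a validated instance.

```python
import json

# Hypothetical digital twin of a lamp; every name and URL here is illustrative.
lamp = {
    "@context": "https://www.w3.org/2019/wot/td/v1",
    "title": "ExampleLamp",
    "securityDefinitions": {"nosec_sc": {"scheme": "nosec"}},
    "security": ["nosec_sc"],
    "properties": {
        "on": {"type": "boolean",
               "forms": [{"href": "https://lamp.example.com/on"}]}
    },
    "actions": {
        "toggle": {"forms": [{"href": "https://lamp.example.com/toggle"}]}
    },
    "events": {
        "overheated": {"data": {"type": "string"},
                       "forms": [{"href": "https://lamp.example.com/overheated"}]}
    },
}


def affordances(description):
    """Return the names of the properties, actions and events that are declared."""
    return {kind: sorted(description.get(kind, {}))
            for kind in ("properties", "actions", "events")}


print(json.dumps(affordances(lamp), indent=2))
```

An application written against such a description only needs the affordance names, which is what decouples it from the underlying IoT protocol.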

I enjoy working on software projects, e.g. the Arena Web Hub for the Web of Things, HTML Tidy for cleaning up HTML markup, and HTML Slidy which is a JavaScript library for HTML based slide presentations. In the early nineties, whilst I was working at HP Labs, I developed an HTTP server and the Arena Web browser which was subsequently adopted by CERN for early work on CSS.

My current focus is on how to build AI systems that mimic human reasoning inspired by decades of advances in the cognitive sciences, and hundreds of millions of years of evolution of the brain. This is a major paradigm shift compared to the Semantic Web which is steeped in the Aristotelian tradition of mathematical logic and formal semantics. This will enable the Sentient Web as the combination of sensing, actuation and cognition federated across the Web in support of markets of services based upon open standards.

The W3C Cognitive AI Community Group is seeking to incubate ideas that combine symbolic information (graphs) with sub-symbolic information (statistics), rules and high performance graph algorithms. This combination enables machine learning and reasoning in the presence of uncertainty, incompleteness and inconsistencies. The starting point has been the development of the chunks and rules format as an amalgam of RDF and Property Graphs. A series of demos are being developed to explore different aspects, using an open source JavaScript library.

I was a W3C Fellow for many years on behalf of a number of companies, most recently JustSystems, involving work on the relationship between XBRL and the Semantic Web. I was briefly a member of the XBRL Standards Board. Before that, I was a W3C Fellow for Volantis, Canon, Openwave Systems, and HP Labs in Bristol, England. I have been very closely involved with the development of HTML from the early days (HTML+, HTML 3.0, 3.2, 4.0, XHTML) as well as setting up the IETF HTTP working group and helping to initiate work on VRML. I used to be the activity lead for XHTML, XForms, MathML, Voice Browsers and Multimodal Interaction. I am a visiting professor for the University of the West of England.

Whilst working for JustSystems I became involved in work on XBRL, a markup standard for financial reports defined by XBRL International with the support of financial institutions around the world. XBRL makes use of XML Schema, XLink and XML Namespaces to give precise semantics to financial data. There is a huge potential for combining XBRL with the Semantic Web as a basis for analysing financial data and combining it with other sources of information. My starting point was to develop an open source tool (xbrlimport) to translate XBRL into RDF. The Semantic Web, with its ability to represent a World Wide Web of machine interpretable data and metadata, will give users tremendous flexibility for exploring huge amounts of information about companies and markets.
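To make the idea of translating XBRL into RDF more concrete, here is a minimal sketch that reads a tiny XBRL-like fragment and emits one group of triples per reported fact. It is not the mapping used by xbrlimport; the sample fragment, the `ex:` taxonomy, the `VOCAB` predicate names and the `facts_to_triples` helper are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# A made-up instance fragment; the ex: taxonomy and its values are hypothetical.
SAMPLE = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:ex="http://example.com/taxonomy">
  <ex:Revenue contextRef="FY2008" unitRef="USD">1200000</ex:Revenue>
  <ex:NetIncome contextRef="FY2008" unitRef="USD">150000</ex:NetIncome>
</xbrl>"""

VOCAB = "http://example.com/vocab#"  # hypothetical vocabulary for the predicates


def facts_to_triples(xml_text, base="http://example.com/report#"):
    """Turn each reported fact into a few subject/predicate/object triples."""
    triples = []
    for index, fact in enumerate(ET.fromstring(xml_text)):
        if "contextRef" not in fact.attrib:
            continue  # skip contexts, units and other non-fact elements
        # ElementTree spells qualified names as "{namespace}local".
        namespace, _, local_name = fact.tag[1:].partition("}")
        subject = f"<{base}fact{index}>"
        triples += [
            (subject, f"<{VOCAB}concept>", f"<{namespace}#{local_name}>"),
            (subject, f"<{VOCAB}value>", f'"{fact.text.strip()}"'),
            (subject, f"<{VOCAB}context>", f'"{fact.get("contextRef")}"'),
            (subject, f"<{VOCAB}unit>", f'"{fact.get("unitRef")}"'),
        ]
    return triples


for triple in facts_to_triples(SAMPLE):
    print(" ".join(triple), ".")
```

Once the facts are expressed as triples they can be merged with triples from other sources, which is the kind of combination described above.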

I joined the EU PrimeLife Project in February 2009 to help with work on privacy enhancing technologies. The project is funded by the European Commission's 7th Framework Programme. I am particularly interested in the concept of privacy providers as a new class of web services giving users lifelong control over their personal data. You get to determine just how much personally identifying information you disclose to websites. The approach also offers single sign-on and opportunities for supporting micropayments as value added features for participating websites.

I have also been working on broadening the Web to include all kinds of network appliances, whether in the home, office or on the move, and at the same time reducing the cost and complexities involved in developing Web applications through declarative languages that enable higher level authoring tools. The long term aim is to avoid the need for Web application authors to have to learn the intricacies of markup, style sheet and scripting languages, and the infuriating variations across browsers. This will reduce the development and maintenance costs compared with today's approaches, whilst improving the quality and the end-user experience on whatever device he or she is using. I launched the Model Based User Interface Incubator Group in October 2008 to evaluate research on model-based user interface design as a framework for authoring Web applications and with a view to proposing work on related standards.

My other software projects have included xbrlimport, EzMath, XForms-Tiny and several experimental browsers. I have explored the potential of custom XML applications written in Haxe and deployed via the once ubiquitous Flash Player. I developed components for rendering and for editing SVG that work on any browser with Flash Player 9 and above. I worked on a cross-browser library with a view to enabling the use of Web browsers for editing the W3C site, and a model-based user interface editor named Quill for Serenoa that runs in the browser together with a cloud-based rule engine.

In my spare time I enjoy diving with the Bath Sub-Aqua Club, and became an assistant open water instructor. Here are some of my collections of photos from my diving trips to the Scilly Isles, the southern Red Sea, Socorro (Baja California), and to South Africa for tiger sharks and the famous sardine run. I am married with a son and a daughter, and live in Frome, near Bath in the west of England. Since August 2007 I have been a visiting professor for the University of the West of England in the Faculty of Environment and Technology.

Some (older) publications/presentations:

The Web of Things: Extending the Web into the Real World, January 2010, an Invited talk at SOFSEM 2010, 36th International Conference on Current Trends in Theory and Practice of Computer Science, Špindlerův Mlýn, Czech Republic
Geolocation on the Mobile Web, 23 April 2008, W3C Track, WWW2008 conference, Beijing, China
Towards the Web of Things, 27 March 2008, Internet of Things, Zurich, Switzerland.
Towards the Web of Things, Mobile Web 2.0, Seoul, 5 March 2008
Towards the Web of Things at UWE Web Developer's Conference in Bristol, UK on 26 September 2007
Google Tech Talk on Forms, 5 March 2007, Mountain View
Ubiquitous Web presentation on 19 September 2006 at CE2006, Antibes, France
Slidy, a web-based alternative to PowerPoint, XTech on 19 May 2006, Amsterdam
Interview in the March 2006 edition of the Loquendo Newsletter
Web Applications and the Ubiquitous Web, on 13 March 2006 at the Next Generation Web Conference, Seoul, Korea
Outline on 7 March 2006 to the Japan Members meeting of the W3C Ubiquitous Web workshop
My Tech Talk on the Web of Applications at the Googleplex on 1st February 2006. Google recorded the talk, see the video
Applying AJAX to add speech services to Web browsers on 31 January 2006 at AVIOS/SpeechTek West
Presentations on MMI Activity and Ubiquitous Web at Multimodal Web Applications for Embedded Systems, W3C seminar 21 June 2005 - Toulouse, France
Demonstration of a text to speech extension for the Mozilla-Firefox browser and its application to render RSS to speech at the W3C Technical Plenary, 2nd March 2005, Boston, MA.
Presentation on the Ubiquitous Web at W3C Technical Plenary, 2nd March 2005, Boston, MA.
CSS Extensions for Multimodal Interaction, written together with Max Froumentin, and introducing the principle of modality independence.
Position paper for the Mobile Web Initiative Workshop, 18-19th November 2004, Barcelona, Spain: Winning users over with more attractive and more flexible mobile web applications [Slides].

The Ubiquitous Web

The Ubiquitous Web seeks to broaden the capabilities of browsers to enable new kinds of web applications, particularly those involving coordination with other devices. These applications involve identifying resources and managing them within the context of an application session. The resources can be remote as in a network printer and projector, or local, as in the estimated battery life, network signal strength, and audio volume level. The Ubiquitous Web will provide a framework for exposing device coordination capabilities to Web applications. I organized and chaired a W3C workshop on the Ubiquitous Web in Tokyo on 9-10 March 2006 as a means to share use cases, research results, and implementation experience. The workshop raised a number of security related issues, and the importance of extending the web application model out into the physical world of sensors and effectors. In March 2007 I launched the Ubiquitous Web Applications working group. I organized a Workshop on declarative models of distributed web applications in June 2007, in Dublin, Ireland, and plan to hold further workshops on related topics to guide W3C's standards activities in these areas.

Voice Browsers and Multimodal Interaction

I ran a workshop in 1998 to look at the opportunities for W3C to take a role in extending the Web to support voice interaction as the means for browsing Web content. This led to the setting up of a Voice Browser activity and a working group to develop related standards. I was the W3C Activity Lead for Voice Browsers until March 2005. Voice Browsers offer the means to access Web-based services over any telephone, or for hands & eyes free operation such as in a car. Voice interaction allows browsers to shrink in size as you no longer need the physical space for a high resolution display. The primary initial market is for replacing the current generation of touch-tone voice menuing systems, so common these days when you call up companies. Voice Browsers allow you to use spoken commands rather than having to press "1" for this and "2" for that etc.

My interest in multimodal interaction started years ago, and led to work within the Voice Browser activity and more recently to a new W3C Multimodal activity of which I am the W3C Activity Lead. This work is still at an early stage, but aims to weave together ideas for visual, aural and tactile interaction with the Web, offering users the means to choose whether to use their eyes or ears, and fingers or speech as appropriate to the context in which they find themselves.

Whilst I was working for HP Labs I developed a voice browser together with a student (Guillaume Belrose) to test out ideas for using context-free grammars for more flexible voice interaction dialogs. The applications were written in XML using a language we called TalkML. More recently, I have begun to study ideas for the use of natural language in multimodal systems, based upon event driven nested state machines, and inspired by David Harel's work on State Charts. Max Froumentin and I explored this in some ideas for extending CSS to describe interaction based upon the idea of text as an abstract modality. Whilst CSS is perhaps easier for authors, an XML based representation for state machines is likely to provide greater flexibility, and this is now being pursued within the Voice Browser working group. I am currently working on developing a means to integrate speech with web pages via an open source proxy speech server based on HTTP. This will be usable with any modern web browser without the need for plugins, and is being developed to enable widespread experimentation with multimodal web applications.
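The notion of event driven nested state machines can be sketched in a few lines. In the example below an event that the current state does not handle is passed up to its enclosing state, in the spirit of statecharts. The `State` and `Machine` classes, the booking dialog and its event names are hypothetical and are not taken from TalkML or from the CSS extension work.

```python
class State:
    """A named state that may be nested inside a parent state."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.transitions = {}  # event name -> target State

    def on(self, event, target):
        self.transitions[event] = target
        return self


class Machine:
    """Dispatches events to the current state, then to its ancestors."""

    def __init__(self, initial):
        self.current = initial

    def dispatch(self, event):
        state = self.current
        while state is not None:
            if event in state.transitions:
                self.current = state.transitions[event]
                return True
            state = state.parent  # let the enclosing state try
        return False  # nobody handled the event


# A hypothetical dialog: three steps nested inside a "booking" state.
idle = State("idle")
booking = State("booking")
ask_city = State("ask_city", parent=booking)
ask_date = State("ask_date", parent=booking)
confirm = State("confirm", parent=booking)

idle.on("start", ask_city)
ask_city.on("city", ask_date)
ask_date.on("date", confirm)
booking.on("cancel", idle)  # defined once, valid in every nested step

machine = Machine(idle)
for event in ["start", "city", "cancel"]:
    machine.dispatch(event)
    print(event, "->", machine.current.name)
```

Placing "cancel" on the enclosing state is what nesting buys: the transition does not have to be repeated for each step of the dialog.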

Computers with Common Sense

I am intrigued by the idea of giving computers a modicum of common sense, or in other words a practical knowledge of everyday things. This would have huge benefits, for instance, much smarter ways of searching for information, and more flexible user interfaces to applications. While it might sound easy, this is in fact very difficult and has defeated traditional approaches based upon mathematical logic and AI (artificial intelligence). More recently, work on speech recognition and natural language processing using statistical methods has shown great promise. Statistical approaches offer a way out of the combinatorial explosion faced by AI, and I am excited by work in cognitive science on relevancy theory and the potential for applying statistical learning techniques to semantics, learning on the fly or from tagged corpora.

My long term aim is to understand this better and to put it into practice in the form of a multi-user conversational agent that is accessible over the Web, so that we can harness the power of the Web to allow volunteers to teach the system common sense knowledge by conversing with it in English (and eventually other languages). I plan to work on an open source broad coverage statistical natural language processor for parsing and generation, and a relevancy-based inference system for natural language semantics. Here are some more details. If you are interested in collaborating on this, please contact me.

Tidying up your markup!

"HTML Tidy" is an open source utility for tidying up HTML. Tidy is composed of an HTML parser and an HTML pretty printer. The parser goes to considerable lengths to correct common markup errors. It also provides advice on how to make your pages more accessible to people with disabilities, and can be used to convert HTML content into XML as XHTML. Tidy is W3C open source and available free. It has been successfully compiled on a large number of platforms, and is being integrated into many HTML authoring tools. Recently the maintenance of Tidy has been taken over by a group of dedicated volunteers on SourceForge, see: http://tidy.sourceforge.net/.

XForms — the future of Web forms

A few years ago, I set up a working group that is focusing on standards for the next generation of Web forms. The key idea is to separate the user interface and presentation from the underlying data model and logic. This allows content providers to plug in different user interfaces as befits different devices, for example, voice browsers, cell phones, palm-tops, television and desktop machines. XForms builds on XML to transfer form data as structured data.

XForms, whilst rooted in forms, is also about the common building blocks for interactive Web applications. The aim is to make it easier to build powerful Web applications in a world where increasingly everything will be interconnected. Web servers, for instance, have now shrunk to the size of a single chip. We want to make it easier to achieve the layout and behavior you want without the need to struggle with complex scripts or having to hack layout using tables and spacer gifs etc.
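The separation of user interface from data model and logic can be illustrated with a small sketch in which one model drives two different presentations and a single shared validation step. This is not XForms markup; the `Field` class, the travel form and the two renderers are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """One item of the data model, independent of how it is presented."""
    name: str
    label: str
    required: bool = False


# A hypothetical form model shared by every user interface.
TRAVEL_FORM = [
    Field("city", "Destination city", required=True),
    Field("date", "Travel date"),
]


def render_visual(fields):
    """Presentation for a screen: one labelled input line per field."""
    return "\n".join(
        f"{f.label}{' *' if f.required else ''}: [________]" for f in fields
    )


def render_voice(fields):
    """Presentation for a voice browser: one spoken prompt per field."""
    return [f"Please say the {f.label.lower()}." for f in fields]


def validate(fields, data):
    """Model-level check that applies whichever interface collected the data."""
    return [f.name for f in fields if f.required and not data.get(f.name)]


print(render_visual(TRAVEL_FORM))
print(render_voice(TRAVEL_FORM))
print(validate(TRAVEL_FORM, {"date": "tomorrow"}))
```

Adding a new device then means adding a renderer, while the model and its constraints stay untouched.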

An easy way to add Math to Web pages

In 1993 I first started work on how to incorporate mathematical expressions into Web pages. This work led me to set up the W3C Math working group which has produced the MathML specification. MathML is an XML application and very verbose. In search of an easier to learn and more concise notation, I have been inspired by how people say mathematical expressions when reading aloud. The result is now available for downloading as a plugin and standalone editing tool for the EzMath notation developed together with Davy Batsalle from ENST. EzMath is particularly simple to use as well as providing a convenient way to author MathML. Have a look and see how much smaller and more obvious EzMath is compared to MathML! I later worked on a reimplementation of EzMath with a filter for mapping XHTML+EzMath to XHTML+MathML.

Easier ways to add style and behaviour to Web pages

I am interested in ideas for easier ways to apply style and behaviour to HTML and XML documents. My approach has been to look at ways to extend ECMAScript to combine extensible cascading style rules (CSS) with object-oriented scripting. I sketched out the ideas in a proposal called Spice, which was officially submitted to W3C by Hewlett-Packard, and led to work with IBM, Microsoft and Netscape in ECMA on a new edition of ECMAScript. This has taken a long time to develop but is now nearing completion.

"Beginning XHTML"

XHTML follows in the footsteps of HTML, combining the benefits of its easy to understand vocabulary with the versatile syntax of XML to create an Extensible HTML, which will be easily accessible not only by today's desktop browsers, but by other equipment - such as cell phones - without the processing power to interpret the now lenient rules of HTML. Anyone who wants to learn how to create a Web page will need to learn XHTML. Sadly this book is now out of print.

"Raggett on HTML 4"

'Raggett on HTML 4' was published (1998) by Addison Wesley, ISBN 0-201-17805-2. The intelligent person's guide to HTML 4, as written by one of the chief architects of HTML, and editor of the HTML+, 3.0, 3.2 and 4.0 specifications. Here is Chapter 1 - introduction to the World Wide Web, and Chapter 2 - a history of HTML. See also these notes on my personal involvement with the early days. Sadly this book too is now out of print.

Subsetting and Extending HTML

The range of browser platforms is undergoing a massive expansion with set-top boxes for televisions, handhelds, cellphones, voice browsers and embedded devices as well as conventional desktop systems. Defining HTML as the lowest common denominator of these devices would fall far short of the potential for the upper end. As a result, W3C has worked on ways to modularize HTML and how to combine it with other tag sets, for instance SVG (W3C's web drawing standard), SMIL (used for multimedia synchronization), MathML (mathematical expressions) and RDF (used for representing metadata).

A key ingredient in this is the means to formally specify a document profile that defines what tag sets can be used together, what image formats, the level of style sheet support, which scripting libraries can be used etc. The document profile provides the basis for interoperability guarantees. It also makes it feasible to provide transformation tools for converting content from one profile to another.

I developed a way of formalizing document profiles as a set of modular assertions that break free of the limitations of document type definitions (DTDs) as used in SGML and XML. The approach is being named Assertion Grammars. An early spin-off from this work is a tool for generating DTDs called dtdgen. When I get time, I plan to combine this with ideas developed for XForms, to produce a powerful new way to describe XML document integrity constraints that bursts free of the static nature of XML Schema, to cover dynamic constraints expressed in functional and logical terms.

Much later in August 2006, when trying to write modular schemas for an XML grammar for DOM events, I came up with a way to combine assertion grammars with RelaxNG. The result is expressed in XML and allows you to write definitions that extend earlier ones, but without the need to modify the definitions they extend. This is in contrast to RelaxNG, which allows definitions closer to the document's root element to refer to definitions that are closer to the leaves in the document tree, but not the other way around. The problem with the top down nature of most grammar formalisms is that if you want to add a new definition, you can't just compose sets of grammar rules, since the new definition has to be referenced from the old, and that means changing the old definition. My approach borrows from type definitions for object oriented programming languages as well as from the tree regular expressions that form the basis for RelaxNG. The new approach is called Exert as a contraction for XML assertions.

Email: dsr@w3.org