
Internal Working Draft

Editors' Draft

This version: http://www.w3.org/2006/07/SWD/RDFa/scenarios/20070109
Latest version: http://www.w3.org/2006/07/SWD/RDFa/scenarios
Editors:
Ben Adida, Creative Commons, ben@adida.net
Michael Hausenblas, JOANNEUM RESEARCH, michael.hausenblas@joanneum.at

Current web pages, written in HTML, are chock-full of structured data. When publishers can express this data, and when tools can read it, a new world of user functionality becomes available, letting users copy and paste structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar. A license on a document can be detected so that the user is informed of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published as easily as the original photo itself, enabling structured search and sharing.

RDFa is a syntax for expressing this structured data in HTML. The rendered data within HTML is reused by the RDFa markup, so that publishers don't repeat themselves. The underlying abstract representation is RDF, which lets publishers build their own vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time. The expressed structure is closely tied to the data, so that rendered data can be copied and pasted along with its relevant structure.

This document provides use case scenarios for RDFa. An introduction to implementing RDFa is provided in the RDFa Primer, while the details of the syntax are explained in the RDFa Syntax.

This is an internal draft produced by the Semantic Web Deployment Working Group.

This document is for internal review only and is subject to change without notice. This document has no formal standing within the W3C.

Last Modified: $Id: Overview.xml,v 1.2 2007/01/09 15:14:37 adida Exp $

Introduction

RDFa is a syntax that expresses this structured data using a set of elements and attributes that embed RDF in HTML. An important goal of RDFa is to achieve this RDF embedding without repeating existing HTML content when that content is the structured data.

An HTML document marked up with RDFa constructs is a valid HTML Document. RDFa is about using HTML compatible constructs and extensions to specify RDF 'content'. It is not about embedding RDF/XML syntax into HTML documents.

This document presents the major use cases where embedding structured data in HTML using RDFa provides significant benefit. Each use case explores how publishers, tool builders, and consumers benefit from RDFa. In parallel, the reader is encouraged to look at the RDFa Primer and RDFa Syntax.

We note that RDFa makes use of XML namespaces. In this document, we assume, for simplicity's sake, that the following namespaces are defined: dc for Dublin Core, foaf for FOAF, cc for Creative Commons, and xsd for XML Schema Definitions:

dc: http://purl.org/dc/elements/1.1/
foaf: http://xmlns.com/foaf/0.1/
cc: http://web.resource.org/cc/
xsd: http://www.w3.org/2001/XMLSchema#
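
To make the prefix convention concrete, the following minimal sketch expands a prefixed name such as dc:title into a full URI using the namespaces listed above. The helper name expand_prefixed_name is hypothetical and is not part of RDFa, and the xsd entry assumes the standard XML Schema namespace URI.

```python
# Prefix-to-namespace table, taken from the namespace list in this document.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "cc": "http://web.resource.org/cc/",
    # Assumption: the standard XML Schema namespace URI for the xsd prefix.
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand_prefixed_name(name: str) -> str:
    """Expand a prefixed name like 'dc:title' into a full URI (illustrative helper)."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {name!r}")
    return PREFIXES[prefix] + local


title_uri = expand_prefixed_name("dc:title")
name_uri = expand_prefixed_name("foaf:name")
```

With this table, dc:title and foaf:name each resolve to a single unambiguous URI, which is what allows properties from different vocabularies to be mixed in one page.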

In this document, we consider Publishers, Tool Builders, and Users. For simplicity, we give our fictitious users first names whose first letter matches their role: Paul, Patrick, and Peter are publishers, Tim, Tod, and Tara are tool builders, and Ursula is a user.

Audience

This document assumes a reader who has reasonable experience with HTML. Prior experience with RDF is not necessary, as the important concepts are explained along the way.

An Overview of the Use Cases

Use Case #1 (Basic Structured Blogging): Paul maintains a blog and wishes to "mark up" his existing page with structure so that tools can pick up his blog post tags, authors, titles, and his blogroll. In particular, his HTML blog should be usable as its own structured feed.
Use Case #2 (Publishing an Event): Paul sometimes gives talks on various topics, and announces them on his blog. He would like to mark up these announcements with proper scheduling information, so that RDFa-enabled agents can automatically obtain the scheduling information and add it to the browsing user's calendar. Importantly, some of the rendered data might be more informal than the machine-readable data required to produce a calendar event.
Use Case #3 (Content Management Metadata): Tod sells an HTML-based content management system, where all documents are processed and edited as HTML, sent from one editor to another, and eventually published and indexed. He would like to build up the editorial metadata within the HTML document itself, so that it is easier to manage and less likely to be lost.
Use Case #4 (Self-Contained HTML Chunks): Tara runs a video sharing web site. When Paul wants to blog about a video, he can paste a chunk of HTML provided by Tara directly into his blog. The video is then available inline, in his blog, along with any licensing information (for example, a Creative Commons license) about the video.
Use Case #5 (Web Clipboard): Ursula is looking for a new apartment and some items with which to furnish it. She browses various RDFa-enabled web pages, including apartment listings, furniture stores, kitchen appliances, etc. Every time she finds an item she likes, she can point to it, right-click, copy it to her Web Clipboard, and paste it into her apartment-hunting page, where it can be organized, sorted, and categorized. Any additional features of the HTML that are not structured, e.g. links to photos, are conserved by the copy-and-paste.
Use Case #6 (Semantic Wiki): Tim runs a Semantic Wiki, where users contribute content in Wiki markup or using a WYSIWYG tool. In all cases, Tim produces HTML+RDFa, so that users like Ursula can copy and paste the structured content. In particular, Ursula may be pasting her apartment-and-furnishing finds into a Semantic Wiki.
Use Case #7 (Advanced Structured Publishing for Scientists): Patrick writes a science blog where he discusses proteins, genes, and chemicals. As he has very little control over the layout (he's using a fairly constrained hosting provider), Patrick adds RDFa to indicate the scientific components he's working with. Ulrich, a scientist, can browse Patrick's site with an RDFa-aware browser and automatically cross-reference the proteins and genes that Patrick is talking about.

Use Case #1 — Basic Structured Blogging

Paul maintains a blog and wishes to "mark up" his existing page with structure so that tools can pick up his blog post tags, authors, titles, and his blogroll, and so that he does not need to maintain a parallel version of his data in "structured format." He could use microformats, but he wishes to combine existing vocabularies and, at some point in the future, add his own extensions, too. Thus, he uses RDFa to embed FOAF and Dublin Core properties.

Paul's Blog ...

My WWW2007 Talk

a post by Paul.

I'm giving a talk at the WWW2007 Conference about structured blogging.

...

A user with an RDFa-aware browser can automatically pick up Paul's list of acquaintances. An RDFa-aware newsreader can use the HTML page itself as a newsfeed, rather than seek out a separate, parallel RSS or Atom file.

Importantly, if Paul edits one of his blog posts, the corresponding structured data is also automatically updated, since it is, in fact, the exact same data in the exact same file. The RDFa-aware newsreader will automatically pick up the updated title, content, and tags.

Use Case #2 — Publishing an Event - Overriding Some of the Rendered Data

Paul sometimes gives talks on various topics, and announces them on his blog, as well as on a static page of his web site that archives all of the talks he's given. He would like to mark up these announcements with proper scheduling information, so that RDFa-enabled agents can automatically obtain the scheduling information and add it to the browsing user's calendar. Importantly, some of the rendered data might be more informal than the machine-readable data required to produce a calendar event.

My WWW2007 Talk

a post by Paul.

I'm giving a talk at the WWW2007 Conference about structured blogging, on the second day of the conference at 10.

...

When Ursula points her RDFa-enabled web browser to Paul's blog, she notices a calendar icon next to the event description. By clicking on it, she gets the option to add the event to her calendar of choice.

Note that the rendered HTML uses informal language to describe the scheduling of the event, i.e. "the second day of the conference," while the structured data contains the complete iCal timestamp.
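
The override described above can be sketched from the consumer's side as follows. This is a minimal illustration, not the RDFa processing rules: it assumes that a property attribute names the property, that a content attribute (when present) supplies the machine-readable value in place of the rendered text, and it does not handle nested property elements. The cal: prefix, the about value, and the timestamp are hypothetical placeholders; in particular, the timestamp is not the actual date of the talk.

```python
from html.parser import HTMLParser

# Hypothetical markup; attribute values are placeholders for illustration only.
SNIPPET = """
<div about="#www2007-talk">
  <h2 property="dc:title">My WWW2007 Talk</h2>
  <p>I'm giving a talk at the WWW2007 Conference, on the
     <span property="cal:dtstart" content="20070101T100000">second day of the conference at 10</span>.</p>
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collect property values, preferring the content attribute over rendered text."""

    def __init__(self):
        super().__init__()
        self.values = {}       # property name -> machine-readable value
        self._current = None   # property whose rendered text is being collected
        self._tag = None       # tag name of the element being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # The explicit value overrides whatever text is rendered.
            self.values[prop] = attrs["content"]
        else:
            # No override: fall back to the rendered text of the element.
            self._current, self._tag = prop, tag
            self.values[prop] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.values[self._current] += data

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._tag:
            self._current, self._tag = None, None


extractor = PropertyExtractor()
extractor.feed(SNIPPET)
```

After parsing, extractor.values holds the rendered title for dc:title but the content value for cal:dtstart, so a calendar tool never has to interpret the phrase "second day of the conference."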

Use Case #3 — Content Management Metadata

Tod sells an HTML-based content management system, where all documents are processed and edited as HTML, sent from one editor to another, and eventually published and indexed. He would like to build up the editorial metadata within the HTML document itself, so that the metadata is never lost.

For this purpose, Tod's software uses RDFa with non-rendered metadata, using the document's HEAD. Peter, one of Tod's customers, runs Foo Magazine, which ships content to aggregators and business partners using XHTML 2, whose structural features make it easy to group components of an article (e.g. pictures, recipes, sidebars of book excerpts) into individual elements of an XHTML file. As Peter performs editorial tasks using Tod's content management system, metadata properties are added to the HEAD of the document. These data are not rendered, but they can be extracted using a generic RDFa parser. Peter can thus insert a block of workflow and rights reuse metadata about the document and its components at a single point in the XHTML file and then ship the document off to a business partner.

The sample XHTML shown here contains metadata about the document itself, about one subcomponent (a recipe) and about a subcomponent of that subcomponent (a picture within the recipe):

2006-04-03 2007-02-24 XZ3214 Recipe r003423 Joe Smith 2007-03-12 Add Some Tex Mex Sizzle to Your Kid's Lunch

Amigo Corn Dogs

...

EZ Bean Tacos ...

...

Use Case #4 — Self-Contained HTML Chunks

Tara runs a video sharing web site. Paul frequently blogs about videos. Some are his own, which he distributes exclusively, while others are videos from Tara's site which he reviews. When Paul wants to blog about a video from Tara's site, he can paste a chunk of HTML provided by Tara directly into his blog:

The US Constitution, a Documentary available under a CC License. Please provide credit to Tara.

In the example above, the video available at http://example.org/tara/video_123 has four properties: a dc:title of "The US Constitution, a Documentary"; a license of http://creativecommons.org/licenses/by/2.5/; and a cc:attributionName and a dc:creator, both "Tara".

When Paul uses this markup, the video is then available inline, in his blog, along with this structured title and licensing information about the video. A user browsing Paul's blog with an RDFa-aware browser can tell that the video shared from Tara's site is licensed under Creative Commons.

Note specifically that the RDFa markup allows Paul to display, within a single HTML page, multiple videos, each with its own license, title, and other structured information. The videos excerpted from Tara's site may be available under a Creative Commons license, while Paul's own videos are licensed under different terms.
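
One way to picture why each video keeps its own license is to view the page's structured data as (subject, property, value) statements, as in the sketch below. The statements for Tara's video follow the description above, with the license property assumed to be cc:license. Paul's video URL and terms URL are hypothetical, and the helper properties_of is illustrative only.

```python
DC = "http://purl.org/dc/elements/1.1/"
CC = "http://web.resource.org/cc/"

TARA_VIDEO = "http://example.org/tara/video_123"
PAUL_VIDEO = "http://example.org/paul/video_1"   # hypothetical URL for Paul's own video

# All statements found on a single page of Paul's blog.
TRIPLES = [
    (TARA_VIDEO, DC + "title", "The US Constitution, a Documentary"),
    # Assumption: the license is expressed with a cc:license property.
    (TARA_VIDEO, CC + "license", "http://creativecommons.org/licenses/by/2.5/"),
    (TARA_VIDEO, CC + "attributionName", "Tara"),
    (TARA_VIDEO, DC + "creator", "Tara"),
    # Hypothetical terms for Paul's exclusively distributed video.
    (PAUL_VIDEO, CC + "license", "http://example.org/paul/terms"),
]


def properties_of(subject, triples):
    """Group the statements about one subject by property (illustrative helper)."""
    props = {}
    for s, p, o in triples:
        if s == subject:
            props.setdefault(p, []).append(o)
    return props


tara = properties_of(TARA_VIDEO, TRIPLES)
paul = properties_of(PAUL_VIDEO, TRIPLES)
```

Because every statement names its subject, the license of Tara's video never leaks onto Paul's video, even though both appear in the same HTML page.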

Use Case #5 — Web Clipboard

Ursula is looking for a new apartment and some items with which to furnish it. She browses various RDFa-enabled web pages, including apartment listings, furniture stores, kitchen appliances, etc. Every time she finds an item she likes, she can point to it, right-click, copy it to her Web Clipboard, and paste it into her apartment-hunting page, where it can be organized, sorted, categorized. Any additional features of the HTML that are not structured, e.g. links to photos, are conserved by the copy-and-paste.

More specifically, when a browser's DOM representation of an HTML document includes structured data expressed using RDFa, an RDFa-aware agent can copy a selected node's HTML chunk, along with all relevant structure. For example, using the HTML from Use Case #1, one can copy the HTML+RDFa corresponding to the first blog post, resulting in the following chunk of HTML stored in the clipboard:

My WWW2007 Talk

a post by Paul.

I'm giving a talk at the WWW2007 Conference about structured blogging.


Notice how the reference in the about attribute is absolute, a transformation performed by the RDFa agent during the copy operation. Notice also how any existing HTML links, images, etc. are also copied. Thus, when Ursula pastes the content into her own page, she keeps all of the useful HTML rendering and structured data.
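
The relative-to-absolute transformation can be sketched with standard URI resolution. This is a minimal illustration assuming the agent knows the URL of the page being copied from; the page URL, the about values, and the helper name absolutize_about are hypothetical.

```python
from urllib.parse import urljoin

PAGE_URL = "http://example.org/paul/blog/"   # hypothetical URL of Paul's blog page


def absolutize_about(page_url: str, about: str) -> str:
    """Resolve an about reference against the URL of the page it was copied from."""
    return urljoin(page_url, about)


# A fragment-only reference becomes a full URI that stays valid after pasting elsewhere.
copied_about = absolutize_about(PAGE_URL, "#post-1")

# A reference that is already absolute is left unchanged.
unchanged = absolutize_about(PAGE_URL, "http://example.org/tara/video_123")
```

Without this step, a pasted reference would resolve against Ursula's page rather than Paul's, and the copied statements would describe the wrong resource.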

The information Ursula aggregates can then be managed using any set of existing RDF tools for querying, sorting, and navigating.

Use Case #6 — Semantic Wiki

Tim runs a Semantic Wiki (such as Semantic Wikipedia) where users contribute content in Wiki markup or using a WYSIWYG tool. In all cases, Tim produces HTML+RDFa, so that users like Ursula can copy the data from the Semantic Wiki to paste it into other web pages, exactly as in Use Case #5. A sample Wiki markup snippet giving some details about Ursula might look like:

This Wiki markup is then automatically transformed into the following HTML+RDFa:

About Me

My name is Ursula Er, I live in Norge and I am 32 years; in my spare time I test ice creams, e.g. I am the reviewer of Best icecream in Norway - reviewed daily.

In addition, Tim allows users to embed HTML+RDFa directly into a Wiki page by means of a macro. In particular, Ursula may be pasting her apartment-and-furnishing finds into a Semantic Wiki.

Use Case #7 — Advanced Structured Publishing for Scientists

Patrick writes web-based science articles where he discusses proteins, genes, and chemicals. As he has very little control over the layout (he's using a fairly constrained hosting provider), Patrick adds RDFa to indicate the scientific components he's working with. Ulrich, a scientist, can browse Patrick's site with an RDFa-aware browser and automatically cross-reference the proteins and genes that Patrick is talking about.

Specifically, Patrick may write the following blog post:

Let's talk about the Corticotropin-lipotropin precursor protein, aka UPA3_HUMAN. I was reading the other day about....

Ulrich runs an RDFa-aware web browser, e.g. Firefox with a GreaseMonkey plugin, which provides automatic popups with additional information for all proteins appropriately marked up using RDFa. When he visits Patrick's blog post, Ulrich can point to the protein name in the blog post and get a pop-up window which automatically retrieves information about the protein from public RDF databases, including related proteins, genes, and publications.

Acknowledgments

The editors gratefully acknowledge contributions from:

Mark Birbeck
Jeremy Carroll
Steven Pemberton
Guus Schreiber
Ralph Swick
Elias Torres

Bibliography

Semantic Wikipedia
Microformats
The Friend of a Friend (FOAF) Project
The Semantic Web Deployment Working Group
RDF-in-HTML Task Force
Semantic Web Best Practices and Deployment Working Group
HTML Working Group
RDF Calendar Interest Group Note
Representing vCard Objects in RDF/XML