WebID 1.0

Web Identification and Discovery

Unofficial Draft 25 July 2010

This version:
http://www.w3.org/2005/Incubator/webid/spec/drafts/ED-webid-20100725/
Latest editor's draft:
http://www.w3.org/2005/Incubator/webid/spec/
Previous version:
http://www.w3.org/2005/Incubator/webid/spec/drafts/ED-webid-20100718/
Editor:
Manu Sporny, Digital Bazaar, Inc. msporny@digitalbazaar.com
Authors:
Toby Inkster
Henry Story
Bruno Harbulot
Reto Bachmann-Gmür

This document is also available in this non-normative format: Diff from previous Editors Draft.


Abstract

Social networking, identity and privacy have been at the center of how we interact with the Web in the last decade. The explosion of social networking sites has brought the world closer together as well as created new points of pain regarding ease of use and the Web. Remembering login details, passwords, and sharing private information across the many websites and social groups that we are a part of has become more difficult and complicated than necessary. The Social Web is designed to ensure that control of identity and privacy settings is always simple and under one's control. WebID is a key enabler of the Social Web. This specification outlines a simple universal identification mechanism that is distributed and openly extensible, and that improves privacy, security and control over how people identify themselves and control access to their information on the Web.

How to Read this Document

There are a number of concepts that are covered in this document that the reader may want to be aware of before continuing. General knowledge of public key cryptography and RDF [RDF-PRIMER] and RDFa [RDFA-CORE] is necessary to understand how to implement this specification. WebID uses a number of specific technologies like HTTP over TLS [HTTP-TLS], X.509 certificates [X509V3], RDF/XML [RDF-SYNTAX-GRAMMAR] and XHTML+RDFa [XHTML-RDFA].

A general Introduction is provided for all that would like to understand why this specification is necessary to simplify usage of the Web.

The terms used throughout this specification are listed in the section titled Terminology.

Developers that are interested in implementing this specification will be most interested in the sections titled Authentication Sequence and Authentication Sequence Details.

Status of This Document

This document is merely a public working draft of a potential specification. It has no official standing of any kind and does not represent the support or consensus of any standards organisation.

The source code for this document is available via Github at the following URL: http://github.com/msporny/webid-spec

1. Introduction

This section is non-normative.

The WebID specification is designed to help alleviate the difficulty of remembering different logins, passwords and settings for websites. It is also designed to provide a universal and extensible mechanism to express public and private information about yourself. This section outlines the motivation behind the specification and the relationship to other similar specifications that are in active use today.

1.1 Motivation

This section is non-normative.

It is a fundamental design criterion of the Web to enable individuals and organizations to control how they interact with the rest of society. This includes how one expresses their identity, public information and personal details to social networks, Web sites and services.

Semantic Web vocabularies such as Friend-of-a-Friend (FOAF) permit distributed hyperlinked social networks to exist. This vocabulary, along with other vocabularies, allows one to add information and services protection to distributed social networks.

One major criticism of open networks is that they seem to have no way of protecting the personal information distributed on the web or limiting access to resources. Few people are willing to make all their personal information public; many would like large pieces to be protected and made available only to a select group of agents. Giving access to information is very similar to giving access to services. There are many occasions when people would like services to only be accessible to members of a group, such as allowing only friends, family members or colleagues to post an article, photo or comment on a blog. How does one do this in a flexible way, without requiring a central point of access control?

Using a process made popular by OpenID, we show how one can tie a User Agent to a URL by proving that one has write access to the URL. WebID is a simpler alternative to OpenID (it requires fewer connections) that uses X.509 certificates to tie a User Agent (Browser) to a Person identified via a URL. WebID also provides a few features beyond those of OpenID. These features include trust management, via digital signatures, and free-form extensibility via RDFa. By using the existing SSL certificate exchange mechanism, WebID integrates more smoothly with existing Web browsers, including browsers on mobile devices. WebID also permits automated session login in addition to interactive session login. Additionally, all data is encrypted and guaranteed to only be received by the person or organization that was intended to receive it.

1.2 Relation to OpenID

This section is non-normative.

This section needs to be re-written. The flow and grammar leaves much to be desired. -- manu

WebID is compatible with OpenID. Both protocols use a URL that dereferences to a Personal Profile Document. This Personal Profile Document is where further information about an identity can be discovered. This mechanism is compatible with both WebID and OpenID. Therefore, WebID does not intend to replace OpenID, but can work beside OpenID by sharing the content in the Personal Profile Document.

That said, there are a number of benefits that WebID achieves over OpenID:

WebID gives people and other agents a WebID URL for identification. OpenID also provides a URL to a Personal Profile Document. However, in the case of WebID, one does not need to remember the URL since the User Agent remembers the URL on behalf of the person browsing. To log in on a WebID web site there is no need to enter any identifier like one has to do for OpenID. Just one click tells the browser to send the WebID URL. The person that is browsing does not need to remember either their WebID URL or the website password. The only password one may need to remember is the one that is used to access their collection of WebIDs in their browser, and that's only if they opt-in to password protect their WebIDs.

While WebID works well in a browser environment, it is also very useful outside of the browser environment. WebID can also operate without requiring the use of any passwords. This is useful to developers that may want to use WebID to perform server-to-server or peer-to-peer verification of identity. WebID works for automated agents such as Search Agents, API Agents, and other automated mechanisms that are often found outside of the browser environment.

The WebID protocol requires just one direct network connection to establish identity via the client. The server requires one connection to the client and one connection to retrieve the WebID Profile if it does not have the credential information cached. Compare this to the much more complex OpenID sequence, which requires six connections by the client to establish a login. In a world of distributed data where each site can point to data on any other site, multiple connections become costly to manage.

WebID builds on a number of well-established Internet and Web standards: REST, RDF [RDF-PRIMER], RDFa [RDFA-CORE], RDF/XML [RDF-SYNTAX-GRAMMAR], TLS [HTTP-TLS], and X.509 [X509V3]. Building on previous standards makes both explaining and implementing WebID easier for developers.

Since WebID is RESTful, you can perform basic HTTP operations to GET your WebID, and if you need to update it, you can use HTTP PUT semantics. You can also create a WebID via POST. This is an improvement over the OpenID specification, which requires a new set of operations described in the OpenID Attribute Exchange specification.

WebID is built on RDF and thus enables all of the advanced semantic web concepts that RDF enables. For example, a developer may perform machine reasoning with a WebID. One can construct machine-executable statements like "If this WebID claims to be a friend of one of our partner WebIDs that is trusted and the relationship is bi-directional, trust the WebID." While OpenID attempts to support this use case by mapping OpenID to RDF, it's far easier to do with WebID because WebID is natively RDF-aware.
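
The quoted rule can be written down directly once the relevant statements have been extracted from the profiles involved. The following is a minimal sketch in Python, not part of the specification: it assumes that friendship claims (for example foaf:knows statements) have already been collected into an in-memory mapping, and the names KnowsGraph and is_trusted are hypothetical.

```python
from typing import Dict, Set

# Hypothetical in-memory view of friendship claims:
# WebID URL -> set of WebID URLs that it claims to know.
KnowsGraph = Dict[str, Set[str]]


def is_trusted(webid: str, knows: KnowsGraph, trusted_partners: Set[str]) -> bool:
    """Trust a WebID if it and some trusted partner each claim to know the other."""
    claimed = knows.get(webid, set())
    # Only partners that the WebID itself claims as friends are candidates.
    for partner in claimed & trusted_partners:
        # The relationship must be bi-directional.
        if webid in knows.get(partner, set()):
            return True
    return False
```

A WebID that merely claims a trusted partner as a friend, without the partner making the reverse claim, is not trusted under this rule.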

It is easy to extend a WebID with new attributes via RDF. The power of RDF allows developers to add extensions to WebID by defining new vocabularies that they publish. There is no authorization process necessary and thus WebID allows for distributed innovation. Every WebID property is a URI, which when clicked, can give you yet more information about what the property means. A developer can create new usage classes by extending their vocabulary at will. A developer can add relationships to a WebID by simply adding more HTML to the developer's page. OpenID does not provide any type of distributed innovation akin to RDF.

Implementing WebID is easier than OpenID because all of the basic technologies have been working and integrated into Web browsers for many years. There were already three interoperable implementations of WebID before this specification was written.

WebID is truly decentralized - with WebID you get a web of trust. OpenID only supports the Web of Trust model if you indirectly trust the OpenID provider. In other words - OpenID is not truly decentralized. In OpenID you must trust OpenID providers. With WebID you only have to trust the people and the organizations with which you are communicating. In other words, you don't have to ask anyone whether or not you can trust your friends. You can query people that you trust directly to see if someone is trustworthy or not. There is no need for a central WebID authority.

WebID is fully distributed: anyone can set up a WebID by placing a single file on a web server of their choosing. There is no need for a special OpenID-like provider service. The only thing that anyone who wants a WebID needs is a web account where they can post their WebID file, ideally on their own domain name. You can also use a WebID hosting provider, but it's not necessary for WebID to work. While it is possible to run an OpenID server, other OpenID applications may not trust you and thus you won't be able to fully utilize your private OpenID credentials. The reason that there are a few large OpenID providers and very few small OpenID providers is this trust design issue related to OpenID.

WebID does not require HTTP redirects. Redirects are problematic on many cell phones, because telecoms rely heavily on proxies, which selectively block redirects.

A WebID provider is 100% compatible with an OpenID provider and thus can inter-operate with OpenID-powered networks.

1.3 Relation to OAuth

This section is non-normative.

OAuth and WebID are mutually beneficial when used together. WebID can be used to provide RSA parameters to the RSA-SHA1 signature method required by OAuth 1.0. WebID can also be used to establish the consumer_key and HTTPS connection that will be used to transmit OAuth Tokens in OAuth 2.0.

2. The WebID Protocol

2.1 Terminology

Verification Agent
Performs authentication on provided WebID credentials and determines if an Identification Agent can have access to a particular resource. A Verification Agent is typically a Web server, but may also be a peer on a peer-to-peer network.
Identification Agent
Provides identification credentials to a Verification Agent. The Identification Agent is typically also a User Agent.
Identification Certificate
An X.509 [X509V3] Certificate that must contain a Subject Alternative Name extension with a URI entry. The URI should be a URL, and should not be a URN. The URL identifies the Identification Agent. The URL must be dereference-able and result in a document containing RDF data. For example, the certificate would contain http://example.org/webid#public, known as a WebID URL, as the Subject Alternative Name:
X509v3 extensions:
   ...
   X509v3 Subject Alternative Name:
      URI:http://example.org/webid#public
WebID URL
A URL specified via the Subject Alternative Name extension of the Identification Certificate that identifies an Identification Agent.
public key
A widely distributed cryptographic key that can be used to verify digital signatures and encrypt data between a sender and a receiver. A public key is always included in an Identification Certificate.
WebID Profile
A structured document that contains identification credentials for the Identification Agent expressed using the Resource Description Framework [RDF-CONCEPTS]. Either the XHTML+RDFa 1.1 [XHTML-RDFA] serialization format or the RDF/XML [RDF-SYNTAX-GRAMMAR] serialization format must be supported by the mechanism, e.g. a Web Service, providing the WebID Profile document. Alternate RDF serialization formats, such as N3 [N3] or Turtle [TURTLE], may be supported by the mechanism providing the WebID Profile document.

Whether RDF/XML, XHTML+RDFa 1.1, both, or neither should be required serialization formats in the specification is currently under heavy debate.
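
As an illustration of the Identification Certificate and WebID URL definitions above, the following Python sketch collects the URI entries from a certificate's Subject Alternative Name. It is a sketch under stated assumptions, not a normative algorithm: it assumes the certificate has already been decoded into the dictionary shape returned by Python's ssl.SSLSocket.getpeercert(), and the function name webid_urls_from_cert is hypothetical.

```python
from typing import Any, Dict, List


def webid_urls_from_cert(cert: Dict[str, Any]) -> List[str]:
    """Collect URI entries from the Subject Alternative Name of a decoded certificate.

    Assumption: `cert` has the shape returned by ssl.SSLSocket.getpeercert(),
    where "subjectAltName" is a tuple of (type, value) pairs such as
    ("URI", "http://example.org/webid#public").
    """
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "URI"]
```

Entries of other types, such as DNS names, are ignored; a certificate without the extension yields an empty list.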

2.2 Authentication Sequence

The following steps are executed by Verification Agents and Identification Agents to determine if access should be granted to a particular resource.

  1. The Identification Agent attempts to access a resource using HTTP over TLS [HTTP-TLS] via the Verification Agent.
  2. The Verification Agent must request the Identification Certificate of the Identification Agent as a part of the TLS client-certificate retrieval protocol.
  3. The Verification Agent must extract the public key and the WebID URL contained in the Subject Alternative Name extension of the Identification Certificate.
  4. The public key information associated with the WebID URL must be checked by the Verification Agent. This process should occur either by dereferencing the WebID URL and extracting RDF data from the resulting document, or by utilizing a cached version of the RDF data contained in the document or other data source that is up-to-date and trusted by the Verification Agent. The processing and extraction mechanism is further detailed in the sections titled Processing the WebID Profile and Extracting WebID URL Details.
  5. If the public key in the Identification Certificate is found in the list of public keys associated with the WebID URL, the Verification Agent must assume that the client intends to use the public key to verify their ownership of the WebID URL.
  6. The Verification Agent verifies that the Identification Agent owns the WebID Profile by using the public key to create a cryptographic challenge. The challenge should be fulfilled by performing TLS mutual-authentication between the Verification Agent and the Identification Agent. If the Verification Agent does not have access to the TLS layer, a digital signature challenge must be provided by the Verification Agent. These processes are detailed in the sections titled Authorization and Secure Communication.

The Identification Agent may re-establish a different identity at any time by executing all of the steps in the Authentication Sequence again. Additional algorithms, detailed in the next section, may be performed to determine if the Identification Agent can access a particular resource after the last step of the Authentication Sequence has been completed.
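
Steps 3 to 5 of the sequence above amount to a key comparison, which can be sketched in Python as follows. This is an illustration only: the types RSAKey and ClientCertificate, the function key_matches_profile and the lookup_keys callback are hypothetical names, and the sketch assumes that the certificate has already been parsed and that the keys published for a WebID URL can be obtained either by dereferencing it or from a trusted cache. Step 6, proving possession of the private key, is left to the TLS handshake and is not shown.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class RSAKey:
    modulus: int
    exponent: int


@dataclass(frozen=True)
class ClientCertificate:
    webid_url: str      # URI entry from the Subject Alternative Name (step 3)
    public_key: RSAKey  # public key carried in the certificate (step 3)


def key_matches_profile(
    cert: ClientCertificate,
    lookup_keys: Callable[[str], Iterable[RSAKey]],
) -> bool:
    """Compare the certificate's key with the keys published for its WebID URL."""
    # Step 4: obtain the keys associated with the WebID URL.
    published = lookup_keys(cert.webid_url)
    # Step 5: the claim stands only if the certificate's key is among them.
    return any(key == cert.public_key for key in published)
```

Passing the lookup as a callback keeps the comparison independent of whether the profile was fetched over the network or served from a cache.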

2.3 Authentication Sequence Details

This section covers details about each step in the authentication process.

2.3.1 Initiating a TLS Connection

This section will detail how the TLS connection process is started and used by WebID to create a secure channel between the Identification Agent and the Verification Agent.

2.3.2 Exchanging the Identification Certificate

This section will detail how the certificate is selected and sent to the Verification Agent.

2.3.3 Processing the WebID Profile

A Verification Agent must be able to process documents in RDF/XML [RDF-SYNTAX-GRAMMAR] and XHTML+RDFa [XHTML-RDFA]. A server responding to a WebID Profile request should support HTTP content negotiation. The server must return a representation in RDF/XML for media type application/rdf+xml. The server must return a representation in XHTML+RDFa for media type text/html or media type application/xhtml+xml. Verification Agents and Identification Agents may support any other RDF format via HTTP content negotiation.
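
The mapping from media type to serialization described above can be sketched as a small content negotiation helper. The Python below is a minimal illustration, assuming a simplified reading of the HTTP Accept header: it honours q-values but ignores wildcards such as */*, and the names SERIALIZATIONS and choose_serialization are hypothetical.

```python
from typing import Optional

# Media types named in this section, mapped to the serialization to return.
SERIALIZATIONS = {
    "application/rdf+xml": "RDF/XML",
    "text/html": "XHTML+RDFa",
    "application/xhtml+xml": "XHTML+RDFa",
}


def choose_serialization(accept_header: str) -> Optional[str]:
    """Return the serialization for the highest-ranked supported media type.

    Simplification: wildcard ranges are not handled, and ties are resolved
    in favour of the media type listed first.
    """
    best_type, best_q = None, 0.0
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if media_type in SERIALIZATIONS and q > best_q:
            best_type, best_q = media_type, q
    return SERIALIZATIONS.get(best_type)
```

A return value of None means that none of the serializations named in this section was requested; how a server responds in that case is outside the scope of this sketch.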

This section will explain how a Verification Agent extracts semantic data describing the identification credentials from a WebID Profile.

2.3.4 Extracting WebID URL Details

The Verification Agent may use a number of different methods to extract the public key information from the WebID Profile.

The following SPARQL query outlines one way in which the public key could be extracted from the WebID Profile:
PREFIX cert: <http://www.w3.org/ns/auth/cert#>
PREFIX rsa: <http://www.w3.org/ns/auth/rsa#>
SELECT ?modulus ?exp
WHERE {
   ?key cert:identity <http://example.org/webid#public>;
      a rsa:RSAPublicKey;
      rsa:modulus [ cert:hex ?modulus; ];
      rsa:public_exponent [ cert:decimal ?exp ] .
}

This section still needs more information.
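
The query above returns the modulus as a cert:hex literal and the exponent as a cert:decimal literal. Before they can be compared with the key in the Identification Certificate, both need to be converted to integers. The following Python sketch shows one way to do so; the function names are hypothetical, and the tolerance for separators inside the hexadecimal literal is an assumption rather than something this document specifies.

```python
import re


def parse_cert_hex(literal: str) -> int:
    """Convert a cert:hex literal to an integer.

    Assumption: separators such as spaces, colons and newlines may appear
    in the literal and carry no meaning, so they are stripped first.
    """
    digits = re.sub(r"[^0-9a-fA-F]", "", literal)
    if not digits:
        raise ValueError("no hexadecimal digits in cert:hex literal")
    return int(digits, 16)


def parse_cert_decimal(literal: str) -> int:
    """Convert a cert:decimal literal to an integer."""
    return int(literal.strip())
```

Comparing integers rather than strings avoids false mismatches caused by letter case, leading zeros or formatting differences between the profile and the certificate.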

2.3.5 Authorization

This section will explain how a Verification Agent may use the information discovered via a WebID URL to determine if one should be able to access a particular resource. It will explain how a Verification Agent can use links to other RDFa documents to build knowledge about the given WebID.

2.3.6 Secure Communication

This section will explain how an Identification Agent and a Verification Agent may communicate securely using a set of verified identification credentials.

If the Verification Agent has verified that the WebID Profile is owned by the Identification Agent, the Verification Agent should use the verified public key contained in the Identification Certificate for all TLS-based communication with the Identification Agent. This ensures that both the Verification Agent and the Identification Agent are communicating in a secure manner, ensuring cryptographically protected privacy for both sides.

2.4 The WebID Profile

The WebID Profile is a structured document that contains identification credentials for the Identification Agent expressed using the Resource Description Framework [RDF-CONCEPTS]. The following sections describe how to express certain common properties that could be used by Verification Agents and other entities that consume a WebID Profile.

The following vocabularies are used in their shortened form in the subsequent sections:

foaf
http://xmlns.com/foaf/0.1/
cert
http://www.w3.org/ns/auth/cert#
rsa
http://www.w3.org/ns/auth/rsa#

2.4.1 Personal Information

Personal details are the most common requirement when registering an account with a website. Some of these pieces of information include an e-mail address, a name and perhaps an avatar image. This section includes properties that should be used when conveying key pieces of personal information but are not required to be present in a WebID Profile:

foaf:mbox
The e-mail address that is associated with the WebID URL.
foaf:name
The name that is most commonly used to refer to the individual or agent.
foaf:depiction
An image representation of the individual or agent.

2.4.2 Cryptographic Details

Cryptographic details are important when Verification Agents and Identification Agents interact. The following properties should be used when conveying cryptographic information in WebID Profile documents:

rsa:RSAPublicKey
Expresses an RSA public key. The RSAPublicKey must specify the rsa:modulus and rsa:public_exponent properties.
cert:identity
Used to associate an RSAPublicKey with a WebID URL. A WebID Profile must contain at least one RSAPublicKey that is associated with the corresponding WebID URL.
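
To make the properties in this section concrete, the Python sketch below embeds a small, illustrative WebID Profile in RDF/XML and reads the key associated with a WebID URL from it. Everything in it is an assumption made for illustration: the profile content, the deliberately tiny modulus, and the function name keys_for. It uses a plain XML parser and handles only the exact document shape shown, so it is not a substitute for a real RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CERT = "http://www.w3.org/ns/auth/cert#"
RSA = "http://www.w3.org/ns/auth/rsa#"

# Illustrative profile only; the modulus is far too short for real use.
PROFILE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:cert="http://www.w3.org/ns/auth/cert#"
         xmlns:rsa="http://www.w3.org/ns/auth/rsa#">
  <foaf:Person rdf:about="http://example.org/webid#public">
    <foaf:name>Example Person</foaf:name>
    <foaf:mbox rdf:resource="mailto:person@example.org"/>
  </foaf:Person>
  <rsa:RSAPublicKey>
    <cert:identity rdf:resource="http://example.org/webid#public"/>
    <rsa:modulus cert:hex="c0ffee"/>
    <rsa:public_exponent cert:decimal="65537"/>
  </rsa:RSAPublicKey>
</rdf:RDF>
"""


def keys_for(profile_xml: str, webid_url: str):
    """Return (modulus, exponent) pairs for keys whose cert:identity is webid_url.

    Only handles the exact RDF/XML shape used in PROFILE above.
    """
    root = ET.fromstring(profile_xml)
    keys = []
    for key in root.iter(f"{{{RSA}}}RSAPublicKey"):
        identity = key.find(f"{{{CERT}}}identity")
        modulus = key.find(f"{{{RSA}}}modulus")
        exponent = key.find(f"{{{RSA}}}public_exponent")
        if identity is None or modulus is None or exponent is None:
            continue  # a key must carry an identity, a modulus and an exponent
        if identity.get(f"{{{RDF}}}resource") != webid_url:
            continue  # key belongs to a different WebID URL
        keys.append((int(modulus.get(f"{{{CERT}}}hex"), 16),
                     int(exponent.get(f"{{{CERT}}}decimal"))))
    return keys
```

Calling keys_for(PROFILE, "http://example.org/webid#public") yields a single pair holding the modulus and exponent as integers, while any other WebID URL yields an empty list.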

Change History

This section is non-normative.

2010-07-25 Added WebID Profile section.

2010-07-18 Updates from WebID community related to RDF/XML support, authentication sequence corrections, abstract and introduction updates.

2010-07-11 Initial version.

Acknowledgments

This section is non-normative.

The following people have been instrumental in providing thoughts, feedback, reviews, criticism and input in the creation of this specification:

  • Melvin Carvalho
  • Bruno Harbulot
  • Toby Inkster
  • Ian Jacobi
  • Jeff Sayre
  • Henry Story

A. References

A.1 Normative references

[HTTP-TLS]
E. Rescorla. HTTP Over TLS. May 2000. Internet RFC 2818. URL: http://www.ietf.org/rfc/rfc2818.txt
[N3]
Tim Berners-Lee; Dan Connolly. Notation3 (N3): A readable RDF syntax. 14 January 2008. W3C Team Submission. URL: http://www.w3.org/TeamSubmission/2008/SUBM-n3-20080114/
[RDF-PRIMER]
Frank Manola; Eric Miller. RDF Primer. 10 February 2004. W3C Recommendation. URL: http://www.w3.org/TR/2004/REC-rdf-primer-20040210/
[RDF-SYNTAX-GRAMMAR]
Dave Beckett. RDF/XML Syntax Specification (Revised). 10 February 2004. W3C Recommendation. URL: http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210
[RDFA-CORE]
Shane McCarron; et al. RDFa Core 1.1: Syntax and processing rules for embedding RDF through attributes. 22 April 2010. W3C Working Draft. URL: http://www.w3.org/TR/2010/WD-rdfa-core-20100422
[TURTLE]
David Beckett; Tim Berners-Lee. Turtle: Terse RDF Triple Language. January 2008. W3C Team Submission. URL: http://www.w3.org/TeamSubmission/turtle/
[X509V3]
ITU-T Recommendation X.509 version 3 (1997). "Information Technology - Open Systems Interconnection - The Directory Authentication Framework" ISO/IEC 9594-8:1997.
[XHTML-RDFA]
Shane McCarron; et al. XHTML+RDFa 1.1. 22 April 2010. W3C Working Draft. URL: http://www.w3.org/TR/WD-xhtml-rdfa-20100422

A.2 Informative references

[RDF-CONCEPTS]
Graham Klyne; Jeremy J. Carroll. Resource Description Framework (RDF): Concepts and Abstract Syntax. 10 February 2004. W3C Recommendation. URL: http://www.w3.org/TR/2004/REC-rdf-concepts-20040210