WebID 1.0

Abstract

Social networking, identity and privacy have been at the center of how we interact with the Web in the last decade. The explosion of social networking sites has brought the world closer together, but it has also created new pain points around ease of use. Remembering login details and passwords, and sharing private information across the many websites and social groups that we are a part of, has become more difficult and complicated than necessary. The Social Web is designed to ensure that managing identity and privacy settings is always simple and under one's own control. WebID is a key enabler of the Social Web. This specification outlines a simple, universal identification mechanism that is distributed and openly extensible, and that improves privacy, security and control over how people identify themselves and control access to their information on the Web.

How to Read this Document

There are a number of concepts that are covered in this document that the reader may want to be aware of before continuing. General knowledge of public key cryptography and RDF [RDF-PRIMER] and RDFa [RDFA-CORE] is necessary to understand how to implement this specification. WebID uses a number of specific technologies like HTTP over TLS [HTTP-TLS], X.509 certificates [X509V3], RDF/XML [RDF-SYNTAX-GRAMMAR] and XHTML+RDFa [XHTML-RDFA].

A general Introduction is provided for all that would like to understand why this specification is necessary to simplify usage of the Web.

The terms used throughout this specification are listed in the section titled Terminology.

Developers that are interested in implementing this specification will be most interested in the sections titled Authentication Sequence and Authentication Sequence Details.

1. Introduction

This section is non-normative.

The WebID specification is designed to help alleviate the difficulty that remembering different logins, passwords and settings for websites has created. It is also designed to provide a universal and extensible mechanism to express public and private information about yourself. This section outlines the motivation behind the specification and the relationship to other similar specifications that are in active use today.

1.1 Motivation

This section is non-normative.

It is a fundamental design criterion of the Web to enable individuals and organizations to control how they interact with the rest of society. This includes how one expresses their identity, public information and personal details to social networks, Web sites and services.

Semantic Web vocabularies such as Friend-of-a-Friend (FOAF) permit distributed hyperlinked social networks to exist. This vocabulary, along with other vocabularies, allows one to add protection for information and services to distributed social networks.

One major criticism of open networks is that they seem to have no way of protecting the personal information distributed on the Web or of limiting access to resources. Few people are willing to make all their personal information public. Many would like large parts of it to be protected and made available only to a selected group of agents. Giving access to information is very similar to giving access to services. There are many occasions when people would like services to be accessible only to members of a group, such as allowing only friends, family members or colleagues to post an article, photo or comment on a blog. How does one do this in a flexible way, without requiring a central point of access control?

Using a process made popular by OpenID, we show how one can tie a User Agent to a URI by proving that one has write access to the URI. WebID is an authentication protocol which uses X.509 certificates to associate a User Agent (Browser) with a Person identified via a URI. A WebID profile can also be used for OpenID; WebID additionally provides features such as trust management via digital signatures and free-form extensibility via RDF. By using the existing SSL certificate exchange mechanism, WebID integrates smoothly with existing Web browsers, including browsers on mobile devices. WebID also permits automated session login in addition to interactive session login. Additionally, all data is encrypted in transit, so that it can be read only by the person or organization intended to receive it.

2. Preconditions

2.1 Terminology

Verification Agent

Performs authentication on provided WebID credentials and determines if an Identification Agent can have access to a particular resource. A Verification Agent is typically a Web server, but may also be a peer on a peer-to-peer network.

Identification Agent

Provides identification credentials to a Verification Agent. The Identification Agent is typically also a User Agent.

Identification Certificate

An X.509 [X509V3] Certificate that must contain a Subject Alternative Name extension with at least one URI entry identifying the Identification Agent. This URI should be dereference-able and result in a document containing RDF data. For example, a certificate identifying the WebID URI http://example.org/webid#public would contain the following:

X509v3 extensions:
   ...
   X509v3 Subject Alternative Name:
      URI:http://example.org/webid#public

TODO: cover the case where there is more than one URI entry

WebID URI

A URI specified via the Subject Alternative Name extension of the Identification Certificate that identifies an Identification Agent.

public key

A widely distributed cryptographic key that can be used to verify digital signatures and encrypt data between a sender and a receiver. A public key is always included in an Identification Certificate.

WebID Profile

A structured document that contains identification credentials for the Identification Agent expressed using the Resource Description Framework [RDF-CONCEPTS]. Either the XHTML+RDFa 1.1 [XHTML-RDFA] serialization format or the RDF/XML [RDF-SYNTAX-GRAMMAR] serialization format must be supported by the mechanism, e.g. a Web Service, providing the WebID Profile document. Alternate RDF serialization formats, such as N3 [N3] or Turtle [TURTLE], may be supported by the mechanism providing the WebID Profile document.

Whether RDF/XML, XHTML+RDFa 1.1, both, or neither should be required serialization formats in this specification is currently under heavy debate.

2.2 Creating the certificate

The user agent will create an Identification Certificate with a Subject Alternative Name URI entry. This URI must be one that dereferences to a document the user controls, so that the user can publish the public key of the Identification Certificate at this URI.

For example, if a user Joe controls http://joe.example/profile, then his WebID can be http://joe.example/profile#me
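
The relationship between the two URIs in Joe's example can be illustrated with a short Python sketch. It assumes that the WebID URI differs from the profile document URI only by its fragment identifier, as in the example above; the function name is hypothetical and not part of this specification.

```python
from urllib.parse import urldefrag


def profile_document_uri(webid_uri: str) -> str:
    """Return the URI of the document to fetch for a WebID URI.

    HTTP clients do not send the fragment to the server, so the
    profile document is located by the WebID URI minus its fragment.
    """
    document_uri, _fragment = urldefrag(webid_uri)
    return document_uri


# Joe's WebID names Joe; the URI without "#me" names his profile document.
print(profile_document_uri("http://joe.example/profile#me"))
```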

explain why the WebID URI is different from the URI of the WebID profile document.

The following certificate, shown as output of the openssl program, is used as an example throughout this specification.

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            5f:df:d6:be:2c:73:c1:fb:aa:2a:2d:23:a6:91:3b:5c
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: O=FOAF+SSL, OU=The Community of Self Signers, CN=Not a Certification Authority
        Validity
            Not Before: Jun  8 14:16:14 2010 GMT
            Not After : Jun  8 16:16:14 2010 GMT
        Subject: O=FOAF+SSL, OU=The Community Of Self Signers/UID=https://example.org/profile#me, CN=Joe (Personal)
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:cb:24:ed:85:d6:4d:79:4b:69:c7:01:c1:86:ac:
                    c0:59:50:1e:85:60:00:f6:61:c9:32:04:d8:38:0e:
                    07:19:1c:5c:8b:36:8d:2a:c3:2a:42:8a:cb:97:03:
                    98:66:43:68:dc:2a:86:73:20:22:0f:75:5e:99:ca:
                    2e:ec:da:e6:2e:8d:15:fb:58:e1:b7:6a:e5:9c:b7:
                    ac:e8:83:83:94:d5:9e:72:50:b4:49:17:6e:51:a4:
                    94:95:1a:1c:36:6c:62:17:d8:76:8d:68:2d:de:78:
                    dd:4d:55:e6:13:f8:83:9c:f2:75:d4:c8:40:37:43:
                    e7:86:26:01:f3:c4:9a:63:66:e1:2b:b8:f4:98:26:
                    2c:3c:77:de:19:bc:e4:0b:32:f8:9a:e6:2c:37:80:
                    f5:b6:27:5b:e3:37:e2:b3:15:3a:e2:ba:72:a9:97:
                    5a:e7:1a:b7:24:64:94:97:06:6b:66:0f:cf:77:4b:
                    75:43:d9:80:95:2d:2e:85:86:20:0e:da:41:58:b0:
                    14:e7:54:65:d9:1e:cf:93:ef:c7:ac:17:0c:11:fc:
                    72:46:fc:6d:ed:79:c3:77:80:00:0a:c4:e0:79:f6:
                    71:fd:4f:20:7a:d7:70:80:9e:0e:2d:7b:0e:f5:49:
                    3b:ef:e7:35:44:d8:e1:be:3d:dd:b5:24:55:c6:13:
                    91:a1
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Key Usage: critical
                Digital Signature, Non Repudiation, Key Encipherment, Key Agreement, Certificate Sign
            Netscape Cert Type:
                SSL Client, S/MIME
            X509v3 Subject Key Identifier:
                08:8E:A5:5B:AE:5D:C3:8B:00:B7:30:62:65:2A:5A:F5:D2:E9:00:FA
            X509v3 Subject Alternative Name: critical
                URI:https://joe.example/profile#me
    Signature Algorithm: sha1WithRSAEncryption
        cf:8c:f8:7b:b2:af:63:f0:0e:dc:64:22:e5:8a:ba:03:1e:f1:
        ee:6f:2c:f5:f5:10:ad:4c:54:fc:49:2b:e1:0d:cd:be:3d:7c:
        78:66:c8:ae:42:9d:75:9f:2c:29:71:91:5c:29:5b:96:ea:e1:
        e4:ef:0e:5c:f7:07:a0:1e:9c:bf:50:ca:21:e6:6c:c3:df:64:
        29:6b:d3:8a:bd:49:e8:72:39:dd:07:07:94:ac:d5:ec:85:b1:
        a0:5c:c0:08:d3:28:2a:e6:be:ad:88:5e:2a:40:64:59:e7:f2:
        45:0c:b9:48:c0:fd:ac:bc:fb:1b:c9:e0:1c:01:18:5e:44:bb:
        d8:b8

Should we formally require the Issuer to be O=FOAF+SSL, OU=The Community of Self Signers, CN=Not a Certification Authority? This was discussed on the list as a way of allowing servers to distinguish certificates that are FOAF+SSL enabled from others. It will probably need some very deep TLS thinking to get this right.

discuss the importance for UIs of the CN

The above certificate is no longer valid, as I took a valid certificate and changed the time and WebID. As a result the signature is now invalid. A completely valid certificate should be generated to avoid nit-pickers picking nits.

2.3 Publishing the WebID Profile Document

The WebID Profile document must expose the relation between the WebID URI and the Identification Agent's public keys using the cert and rsa ontologies, as well as the cert or xsd datatypes. The set of relations to be published at the WebID Profile document can be presented in a graphical notation as follows.

The document can publish many more relations than are of interest to the WebID protocol, as shown in the above graph by the grayed out relations.

The encoding of this graph is immaterial to the protocol, so long as a well-known mapping from the format of the representation to such a graph can be found. Below we discuss the most well-known formats, and a method for dealing with new unknown formats as they come along.

The WebID provider must publish the graph of relations in one of the well-known formats, though the provider may publish it in several formats, using content negotiation, to increase the usability of the site.

Add content negotiation pointers

It is particularly useful to have one of the representations be in HTML or XHTML even if it is not marked up in RDFa as this allows people using a web browser to understand what the information at that URI represents.
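
As an illustration of how a WebID provider might choose among several published formats, the following Python sketch selects a representation from a request's Accept header. It is a simplification under stated assumptions: only exact media types and the `*/*` wildcard are handled, ties are broken by the server's own ordering, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an HTTP Accept header into (media type, q) pairs."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # Default quality when no q parameter is given.
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # Treat a malformed q value as "not acceptable".
        ranges.append((media_type, q))
    return ranges


def choose_representation(accept: str, offered: List[str]) -> Optional[str]:
    """Pick the offered media type that the client prefers most.

    `offered` is listed in server preference order, which breaks ties.
    Returns None when nothing on offer is acceptable to the client.
    """
    preferences = dict(parse_accept(accept))
    best, best_q = None, 0.0
    for media_type in offered:
        q = preferences.get(media_type, preferences.get("*/*", 0.0))
        if q > best_q:
            best, best_q = media_type, q
    return best
```

With `offered` set to `["text/html", "application/rdf+xml"]`, a browser sending `text/html` receives the human-readable page, while a Verification Agent that ranks `application/rdf+xml` highest receives RDF/XML.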

2.3.1 Turtle

A widely used format for writing RDF graphs is the Turtle notation.

 @prefix cert: <http://www.w3.org/ns/auth/cert#> .
 @prefix rsa: <http://www.w3.org/ns/auth/rsa#> .
 @prefix foaf: <http://xmlns.com/foaf/0.1/> .
 @prefix : <https://joe.example/profile#> .

 :me a foaf:Person;
     foaf:name "Joe" .

 [] a rsa:RSAPublicKey;
    rsa:modulus """
      00:cb:24:ed:85:d6:4d:79:4b:69:c7:01:c1:86:ac:
      c0:59:50:1e:85:60:00:f6:61:c9:32:04:d8:38:0e:
      07:19:1c:5c:8b:36:8d:2a:c3:2a:42:8a:cb:97:03:
      98:66:43:68:dc:2a:86:73:20:22:0f:75:5e:99:ca:
      2e:ec:da:e6:2e:8d:15:fb:58:e1:b7:6a:e5:9c:b7:
      ac:e8:83:83:94:d5:9e:72:50:b4:49:17:6e:51:a4:
      94:95:1a:1c:36:6c:62:17:d8:76:8d:68:2d:de:78:
      dd:4d:55:e6:13:f8:83:9c:f2:75:d4:c8:40:37:43:
      e7:86:26:01:f3:c4:9a:63:66:e1:2b:b8:f4:98:26:
      2c:3c:77:de:19:bc:e4:0b:32:f8:9a:e6:2c:37:80:
      f5:b6:27:5b:e3:37:e2:b3:15:3a:e2:ba:72:a9:97:
      5a:e7:1a:b7:24:64:94:97:06:6b:66:0f:cf:77:4b:
      75:43:d9:80:95:2d:2e:85:86:20:0e:da:41:58:b0:
      14:e7:54:65:d9:1e:cf:93:ef:c7:ac:17:0c:11:fc:
      72:46:fc:6d:ed:79:c3:77:80:00:0a:c4:e0:79:f6:
      71:fd:4f:20:7a:d7:70:80:9e:0e:2d:7b:0e:f5:49:
      3b:ef:e7:35:44:d8:e1:be:3d:dd:b5:24:55:c6:13:
      91:a1
    """^^cert:hex;
    rsa:public_exponent "65537"^^cert:int;
    cert:identity :me .
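
A profile like the one above could also be produced programmatically. The Python sketch below renders the key-related triples from integer values; the function name is hypothetical, the modulus is written as one unbroken hexadecimal string rather than colon-separated bytes, and the WebID URI is assumed to contain no characters that need escaping inside a Turtle IRI.

```python
def turtle_key_description(webid_uri: str, modulus: int, exponent: int) -> str:
    """Render the cert/rsa triples for one RSA public key as Turtle."""
    return (
        "@prefix cert: <http://www.w3.org/ns/auth/cert#> .\n"
        "@prefix rsa: <http://www.w3.org/ns/auth/rsa#> .\n"
        "\n"
        "[] a rsa:RSAPublicKey;\n"
        f'   rsa:modulus "{modulus:x}"^^cert:hex;\n'
        f'   rsa:public_exponent "{exponent}"^^cert:int;\n'
        f"   cert:identity <{webid_uri}> .\n"
    )


# Toy modulus for illustration only; a real key has a much larger modulus.
print(turtle_key_description("https://joe.example/profile#me", 0xCB24ED, 65537))
```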

2.3.2 RDFa HTML notation

There are many ways of writing out the above graph using RDFa in HTML. Here is just one example.

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:cert="http://www.w3.org/ns/auth/cert#"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
      xmlns:owl="http://www.w3.org/2002/07/owl#"
      xmlns:rsa="http://www.w3.org/ns/auth/rsa#"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<head>
</head>
<body>
<h2>My RSA Public Key</h2>

   <dl typeof="rsa:RSAPublicKey">
   <dt>WebID</dt><dd href="#me" rel="cert:identity">https://joe.example/profile#me</dd>
   <dt>Modulus (hexadecimal)</dt>
   <dd property="rsa:modulus" datatype="cert:hex">
      00:cb:24:ed:85:d6:4d:79:4b:69:c7:01:c1:86:ac:
      c0:59:50:1e:85:60:00:f6:61:c9:32:04:d8:38:0e:
      07:19:1c:5c:8b:36:8d:2a:c3:2a:42:8a:cb:97:03:
      98:66:43:68:dc:2a:86:73:20:22:0f:75:5e:99:ca:
      2e:ec:da:e6:2e:8d:15:fb:58:e1:b7:6a:e5:9c:b7:
      ac:e8:83:83:94:d5:9e:72:50:b4:49:17:6e:51:a4:
      94:95:1a:1c:36:6c:62:17:d8:76:8d:68:2d:de:78:
      dd:4d:55:e6:13:f8:83:9c:f2:75:d4:c8:40:37:43:
      e7:86:26:01:f3:c4:9a:63:66:e1:2b:b8:f4:98:26:
      2c:3c:77:de:19:bc:e4:0b:32:f8:9a:e6:2c:37:80:
      f5:b6:27:5b:e3:37:e2:b3:15:3a:e2:ba:72:a9:97:
      5a:e7:1a:b7:24:64:94:97:06:6b:66:0f:cf:77:4b:
      75:43:d9:80:95:2d:2e:85:86:20:0e:da:41:58:b0:
      14:e7:54:65:d9:1e:cf:93:ef:c7:ac:17:0c:11:fc:
      72:46:fc:6d:ed:79:c3:77:80:00:0a:c4:e0:79:f6:
      71:fd:4f:20:7a:d7:70:80:9e:0e:2d:7b:0e:f5:49:
      3b:ef:e7:35:44:d8:e1:be:3d:dd:b5:24:55:c6:13:
      91:a1
    </dd>
    <dt>Exponent (decimal)</dt>
    <dd property="rsa:public_exponent" datatype="cert:int">65537</dd>
   </dl>
</body>
</html>

If a WebID provider would prefer not to mark up the data in RDFa, but instead to provide a human-readable format for users and to have the RDF graph appear in a machine-readable format such as RDF/XML, then the provider may publish a link from the HTML to the machine-readable format (if this is available at a dedicated URI) as follows:

<html>
<head>
<link rel="alternate" type="application/rdf+xml" href="profile.rdf"/>
</head>
<body> ...  </body>
</html>
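
A consumer that receives only the HTML page could follow such a link to the machine-readable representation. The Python sketch below shows one way to do this with the standard library. It assumes the link element carries rel="alternate" and type="application/rdf+xml"; the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AlternateLinkFinder(HTMLParser):
    """Collect href values of <link> elements that point at RDF/XML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rels and media_type == "application/rdf+xml" and href:
            self.hrefs.append(href)


def rdf_alternates(html_text: str, base_uri: str) -> list:
    """Return absolute URIs of RDF/XML alternates linked from the page."""
    finder = AlternateLinkFinder()
    finder.feed(html_text)
    # Relative links such as "profile.rdf" are resolved against the page URI.
    return [urljoin(base_uri, href) for href in finder.hrefs]
```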

2.3.3 In RDF/XML

RDF/XML is easy to generate automatically from structured data, be it in object notation or in relational databases. Parsers for it are also widely available.

TODO: the dsa ontology

2.3.4 In Portable Contacts format using GRDDL

TODO: discuss other formats and GRDDL, XSPARQL options for xml formats

summarize and point to content negotiation documents

3. The WebID Protocol

3.1 Authentication Sequence

The following steps are executed by Verification Agents and Identification Agents to determine the global identity of the requesting agent. Once this is known, the identity can be used to determine if access should be granted to the requested resource.

1. The Identification Agent attempts to access a resource using HTTP over TLS [HTTP-TLS] via the Verification Agent.
2. The Verification Agent must request the Identification Certificate of the Identification Agent as a part of the TLS client-certificate retrieval protocol.
3. The Verification Agent must extract the public key and all the URI entries contained in the Subject Alternative Name extension of the Identification Certificate. An Identification Certificate may contain multiple URI entries which are considered claimed WebID URIs.
4. The Verification Agent must attempt to verify the public key information associated with at least one of the claimed WebID URIs. The Verification Agent may attempt to verify more than one claimed WebID URI. This verification process should occur either by dereferencing the WebID URI and extracting RDF data from the resulting document, or by utilizing a cached version of the RDF data contained in the document or other data source that is up-to-date and trusted by the Verification Agent. The processing and extraction mechanism is further detailed in the sections titled Processing the WebID Profile and Extracting WebID URI Details.
5. If the public key in the Identification Certificate is found in the list of public keys associated with the claimed WebID URI, the Verification Agent must assume that the client intends to use this public key to verify their ownership of the WebID URI. On the other hand, if no matching public key is found in the list of public keys associated with the claimed WebID URI, the Verification Agent must attempt to verify another claimed WebID URI. The authentication must fail if no matching public key is found among all the claimed WebID URIs.
6. The Verification Agent verifies that the Identification Agent owns the private key corresponding to the public key sent in the Identification Certificate. This should be fulfilled by performing TLS mutual-authentication between the Verification Agent and the Identification Agent. If the Verification Agent does not have access to the TLS layer, a digital signature challenge must be provided by the Verification Agent. These processes are detailed in the sections titled Authorization and Secure Communication.
7. If the public key in the Identification Certificate matches one in the set given by the profile document graph given above then the Verification Agent knows that the Identification Agent is indeed identified by the WebID URI. The verification is done by querying the Personal Profile graph as specified in querying the RDF graph.

The Identification Agent may re-establish a different identity at any time by executing all of the steps in the Authentication Sequence again. Additional algorithms, detailed in the next section, may be performed to determine if the Identification Agent can access a particular resource after the last step of the Authentication Sequence has been completed.
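
The decision logic of the sequence above can be summarised in a short Python sketch. It is not a complete implementation: proof that the client holds the matching private key is assumed to come from the TLS handshake, and `keys_for` is a hypothetical callback standing in for dereferencing (or reading a cached copy of) the WebID Profile and returning the RSA keys it associates with a WebID URI.

```python
from typing import Callable, Iterable, Optional, Set, Tuple

RSAKey = Tuple[int, int]  # (modulus, public exponent)


def authenticate(
    certificate_key: RSAKey,
    claimed_webids: Iterable[str],
    keys_for: Callable[[str], Set[RSAKey]],
) -> Optional[str]:
    """Return the first claimed WebID URI whose profile lists the key.

    Returns None when no claimed WebID URI can be verified, in which
    case authentication fails.
    """
    for webid in claimed_webids:
        try:
            published = keys_for(webid)
        except Exception:
            # An unreachable or unparsable profile cannot verify this
            # claim, so move on to the next claimed WebID URI.
            continue
        if certificate_key in published:
            return webid
    return None
```

Returning the first verified WebID URI is a design choice of this sketch; a Verification Agent may instead verify every claimed WebID URI and keep all that succeed.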

3.2 Authentication Sequence Details

This section covers details about each step in the authentication process.

3.2.1 Initiating a TLS Connection

This section will detail how the TLS connection process is started and used by WebID to create a secure channel between the Identification Agent and the Verification Agent.

3.2.2 Exchanging the Identification Certificate

This section will detail how the certificate is selected and sent to the Verification Agent.
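
Once the certificate has been received, the claimed WebID URIs have to be read from its Subject Alternative Name extension. The Python sketch below assumes that the TLS layer hands over the certificate in the dictionary form returned by `ssl.SSLSocket.getpeercert()`. That method returns an empty dictionary for certificates that were not validated, so deployments accepting self-signed Identification Certificates may need to obtain and parse the certificate by other means. The function name is hypothetical.

```python
def claimed_webid_uris(peer_certificate: dict) -> list:
    """Return every URI entry of the Subject Alternative Name extension.

    `peer_certificate` is expected to look like the result of
    ssl.SSLSocket.getpeercert(), where "subjectAltName" is a sequence
    of (kind, value) pairs such as ("URI", "https://joe.example/profile#me").
    """
    return [
        value
        for kind, value in peer_certificate.get("subjectAltName", ())
        if kind == "URI"
    ]
```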

3.2.3 Processing the WebID Profile

A Verification Agent must be able to process documents in RDF/XML [RDF-SYNTAX-GRAMMAR] and XHTML+RDFa [XHTML-RDFA]. A server responding to a WebID Profile request should be able to deliver at least RDF/XML or RDFa. The Verification Agent must set the Accept header to request application/rdf+xml with a higher priority than text/html and application/xhtml+xml. If the server answers such a request with an HTML representation of the resource, this should describe the WebID Profile with RDFa.
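
A request with the required priorities could be built as in the following Python sketch. The specific q values are assumptions chosen only so that application/rdf+xml ranks above the two HTML media types; the constant and function names are hypothetical.

```python
from urllib.request import Request

# application/rdf+xml has the implicit default q=1.0, so it outranks
# both HTML media types listed after it.
PROFILE_ACCEPT = (
    "application/rdf+xml, "
    "application/xhtml+xml;q=0.8, "
    "text/html;q=0.7"
)


def profile_request(document_uri: str) -> Request:
    """Build (without sending) a GET request for a WebID Profile."""
    return Request(document_uri, headers={"Accept": PROFILE_ACCEPT})
```

The returned object can be passed to `urllib.request.urlopen` to fetch the profile document.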

This section will explain how a Verification Agent extracts semantic data describing the identification credentials from a WebID Profile.

3.2.4 Verifying the WebID is identified by that public key

There are a number of different ways to check the public key given in the X.509 certificate against the one provided by the WebID Profile or another trusted source. In essence, the check is that the graph of relations in the Profile contains a particular pattern of relations.

Assuming the public key is an RSA key, and that its modulus is "9D79BFE2498..." and exponent "65537" then the following SPARQL query could be used:

PREFIX cert: <http://www.w3.org/ns/auth/cert#>
PREFIX rsa: <http://www.w3.org/ns/auth/rsa#>
ASK {
   [] cert:identity <http://example.org/webid#public>;
      rsa:modulus  "9D79BFE2498..."^^cert:hex;
      rsa:public_exponent "65537"^^cert:int .
}

If the query returns true, then the graph has validated the associated public key with the WebID.

The above requires the SPARQL endpoint (or the underlying triple store) to be able to do inferencing on datatypes. This is because the numerical values may be expressed with different xsd and cert datatypes, all of which must be supported by Verification Agents. The cert datatypes allow the numerical expression to be spread over a number of lines, or to contain arbitrary characters such as "9D ☮ 79 ☮ BF ☮ E2 ☮ F4 ☮ 98 ☮...". The value itself need not necessarily be expressed in cert:hex, but could use a number of xsd integer datatype notations, cert:int or future base64 notations.
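
To illustrate what such datatype handling amounts to for cert:hex, the Python sketch below treats every character that is not a hexadecimal digit as a separator and discards it. This reading of cert:hex is an assumption based on the description above, and the function name is hypothetical.

```python
import re


def cert_hex_to_int(lexical: str) -> int:
    """Convert a cert:hex lexical form to an integer, ignoring separators."""
    # Drop colons, whitespace, line breaks and any other non-hex characters.
    digits = re.sub(r"[^0-9A-Fa-f]", "", lexical)
    if not digits:
        raise ValueError("cert:hex literal contains no hexadecimal digits")
    return int(digits, 16)
```

Under this rule, "9D ☮ 79 ☮ BF" and "9d:79:bf" denote the same number, which is the equivalence that a datatype-aware store would need to recognise.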

Should we define the base64 notation?

If the SPARQL endpoint doesn't provide a literal inferencing engine, then the modulus and exponent should be extracted from the graph, normalised into big integers (integers without an upper bound), and compared with the values given in the public key certificate. After replacing the ?webid variable in the following query with the required value, the Verification Agent can query the Profile Graph with:

PREFIX cert: <http://www.w3.org/ns/auth/cert#>
PREFIX rsa: <http://www.w3.org/ns/auth/rsa#>
SELECT ?m ?e
WHERE {
   [] cert:identity ?webid ;
        rsa:modulus ?m ;
        rsa:public_exponent ?e .
}

Here the Verification Agent must check that one of the answers for ?m and ?e matches the integer values of the modulus and exponent given in the public key in the certificate.
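
The comparison of the query answers with the certificate can be sketched in Python as follows. The sketch assumes each answer is available as a pair of (lexical form, datatype URI) literals for ?m and ?e, and it supports only cert:hex, cert:int and xsd:integer. The function names and the row format are hypothetical.

```python
CERT = "http://www.w3.org/ns/auth/cert#"
XSD = "http://www.w3.org/2001/XMLSchema#"
HEX_DIGITS = set("0123456789abcdefABCDEF")


def literal_to_int(lexical: str, datatype: str) -> int:
    """Turn one typed literal into a Python int, which is unbounded."""
    if datatype == CERT + "hex":
        # Keep only hexadecimal digits; everything else is a separator.
        return int("".join(c for c in lexical if c in HEX_DIGITS), 16)
    if datatype in (CERT + "int", XSD + "integer"):
        return int("".join(lexical.split()))
    raise ValueError("unsupported datatype: " + datatype)


def key_is_published(rows, modulus: int, exponent: int) -> bool:
    """True if any (?m, ?e) answer equals the certificate's key."""
    for m, e in rows:
        try:
            if literal_to_int(*m) == modulus and literal_to_int(*e) == exponent:
                return True
        except ValueError:
            continue  # Skip answers that cannot be interpreted.
    return False
```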

The public key could be a DSA key. We need to add an ontology for DSA too. What other cryptographic ontologies should we add?

3.2.5 Authorization

This section will explain how a Verification Agent may use the information discovered via a WebID URI to determine if one should be able to access a particular resource. It will explain how a Verification Agent can use links to other RDFa documents to build knowledge about the given WebID.

3.2.6 Secure Communication

This section will explain how an Identification Agent and a Verification Agent may communicate securely using a set of verified identification credentials.

If the Verification Agent has verified that the WebID Profile is owned by the Identification Agent, the Verification Agent should use the verified public key contained in the Identification Certificate for all TLS-based communication with the Identification Agent. This ensures that both the Verification Agent and the Identification Agent are communicating in a secure manner, ensuring cryptographically protected privacy for both sides.

3.3 The WebID Profile

The WebID Profile is a structured document that contains identification credentials for the Identification Agent expressed using the Resource Description Framework [RDF-CONCEPTS]. The following sections describe how to express certain common properties that could be used by Verification Agents and other entities that consume a WebID Profile.

The following vocabularies are used in their shortened form in the subsequent sections:

foaf: http://xmlns.com/foaf/0.1/
cert: http://www.w3.org/ns/auth/cert#
rsa: http://www.w3.org/ns/auth/rsa#

3.3.1 Personal Information

Personal details are the most common requirement when registering an account with a website. Some of these pieces of information include an e-mail address, a name and perhaps an avatar image. This section includes properties that should be used when conveying key pieces of personal information but are not required to be present in a WebID Profile:

foaf:mbox: The e-mail address that is associated with the WebID URI.
foaf:name: The name that is most commonly used to refer to the individual or agent.
foaf:depiction: An image representation of the individual or agent.

3.3.2 Cryptographic Details

Cryptographic details are important when Verification Agents and Identification Agents interact. The following properties should be used when conveying cryptographic information in WebID Profile documents:

rsa:RSAPublicKey: Expresses an RSA public key. The RSAPublicKey must specify the rsa:modulus and rsa:public_exponent properties.
cert:identity: Used to associate an RSAPublicKey with a WebID URI. A WebID Profile must contain at least one RSAPublicKey that is associated with the corresponding WebID URI.