Draft Spec

From Read Write Web Community Group

Introduction

The Web was designed to be an interactive, collaborative space, where every user could be both a consumer and a producer of information. The first major browsers were designed mainly to be read-only, creating an asymmetric web where users could look but not change. In the second decade of the web things started to become more interactive, with projects like editable encyclopedias, social networks, blogs and content management systems becoming commonplace.

However, the very fabric of the web was designed so that the connected data and content could be edited in a standardized way, creating new forms of communication and feedback that would allow rich interactions, incentives and instant updates. Much progress has been made on this front. We aim to describe the motivation and the state of the art for handling changes to the fabric of the web using established standards.

Identity

Once you allow writing to the web as well as reading, you will often want to know who is doing the writing. An important distinction needs to be set out here, because it is a common point of confusion. When most technical people hear the term "identity", they immediately think of "verifying identity", also known as authentication. That is natural: on the web you usually want to check that someone is who they say they are. Identifying an entity and verifying that identity are, however, two separate steps.

The web is built in a modular way, so this section describes only how entities are identified. Authentication is covered later.

The first axiom of the web is that any entity that you care about should be given a URI. The reason for this is that the web, as a global space, should be usable by many different systems unambiguously, without needing translation. Any entity that may be used in a system other than your own should be given a URI. The majority of systems do *not* follow this rule, making interoperability very hard. However, more and more systems are starting to follow this pattern. One that takes advantage of this technique is the WebID Identity Spec.

We would like to identify ourselves to a server and have the server tell us who it thinks we are. There is no standard for this as of today, but an HTTP header has been proposed that would do this:

   User: http://www.w3.org/People/Berners-Lee/card#i

This is already implemented in a few systems used by the RWW Group.
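As an illustration only, a client might read the proposed header from a response like this. The sketch assumes the header is named `User` as shown above and carries a single http(s) URI. The function name `user_from_headers` and the dictionary of headers are hypothetical, not part of any specification.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit


def user_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the URI carried in the proposed `User` header, if present."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lowercase.
        if name.lower() == "user":
            uri = value.strip()
            # Assumption of this sketch: only absolute http(s) URIs are accepted.
            if urlsplit(uri).scheme in ("http", "https"):
                return uri
    return None


# Hypothetical response headers, reusing the example value from the text.
response_headers = {
    "Content-Type": "text/turtle",
    "User": "http://www.w3.org/People/Berners-Lee/card#i",
}
print(user_from_headers(response_headers))
```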

Discovery

One particularly useful form of URI is an HTTP URI. This is particularly handy because, using the HTTP GET verb, you can get more information (content or data) about the thing identified: the so-called "follow your nose" technique. One nuance of HTTP URIs is that a document generally contains several data points, each of which is addressed by appending the # (or "anchor") character and a name to the document's URI. It is important to distinguish between a document and, say, a person described in that document, as you may want to say different things about the two.
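The split between a document and a thing described inside it can be seen with Python's standard `urllib.parse.urldefrag`. The helper name `split_identifier` is hypothetical. The URI is the example used earlier on this page.

```python
from typing import Tuple
from urllib.parse import urldefrag


def split_identifier(uri: str) -> Tuple[str, str]:
    """Split a URI into the document URI and the name after the # character."""
    document, fragment = urldefrag(uri)
    return document, fragment


document, fragment = split_identifier("http://www.w3.org/People/Berners-Lee/card#i")
# The document is what an HTTP GET retrieves; the full URI names the person.
print("document:", document)
print("fragment:", fragment)
```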

There is a growing movement that is taking advantage of this pattern, known as Linked Data. The aim of Linked Data is to complete the Web's original vision where "everything can be connected to everything".

While "follow your nose" is a powerful and intuitive technique, there are other techniques for discovery. One is called reverse search. In a forward search you give something a URL and get back words and data. In a reverse search you type in words and get back URLs. This familiar technique is commonly used by search engines. Linked Data comes with a query language called SPARQL, which allows complex searches to be performed against endpoints to give rich data sets.
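A reverse search expressed in SPARQL might look like the sketch below, which only builds the request URL and does not contact any server. It relies on the SPARQL protocol convention of passing the query text in a `query` parameter. The endpoint address and the choice of the FOAF `name` property are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; a real one would be discovered or configured.
ENDPOINT = "http://example.org/sparql"

# Reverse search: start from words (a name) and ask for matching URIs.
QUERY = """
SELECT ?person WHERE {
  ?person <http://xmlns.com/foaf/0.1/name> "Tim Berners-Lee" .
}
"""


def sparql_query_url(endpoint: str, query: str) -> str:
    """Build a GET URL that carries the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_query_url(ENDPOINT, QUERY)
print(url)
```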

Another pattern for discovery is called ".well-known". When you are unable to perform a forward or reverse search, it is possible to put metadata in "well-known" locations under host/.well-known/. This tends to be domain-specific metadata, but can also be domain-independent.
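The well-known location for a given host can be derived from any URI on that host. A minimal sketch, where the helper name `well_known_url` and the example URIs are hypothetical:

```python
from urllib.parse import urlsplit, urlunsplit


def well_known_url(resource_uri: str, name: str) -> str:
    """Return host/.well-known/<name> for the host that serves resource_uri."""
    parts = urlsplit(resource_uri)
    # Keep scheme and host; drop the original path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/.well-known/" + name, "", ""))


print(well_known_url("http://example.org/people/alice#me", "sparql"))
```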

Authentication

Once you know someone's identity, it can be important to verify that identity. There are many ways to do this, but it is important to keep the web modular: a system that is going to scale widely should not closely couple identity with verifying identity.

Some techniques for verifying identity rely on a shared secret, such as a password or a cookie that contains an unguessable string.
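As a rough sketch of the cookie case, Python's standard `secrets` and `hmac` modules can generate an unguessable string and compare it safely. The function names are hypothetical, and the 32-byte token length is an assumption of this example rather than a requirement stated here.

```python
import hmac
import secrets


def new_session_token() -> str:
    """Generate an unguessable, URL-safe string suitable for a cookie value."""
    return secrets.token_urlsafe(32)  # 32 random bytes; length is an assumption


def token_matches(presented: str, expected: str) -> bool:
    """Compare two tokens in constant time to avoid leaking timing information."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


issued = new_session_token()
print(token_matches(issued, issued))
print(token_matches("guess", issued))
```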

Another is PKI (public key infrastructure), one example of which is the X.509 certificate system, which is built into most browsers. A particular advantage of X.509 is that it is an existing standard that allows you to both state an identity and verify it in one go. A spec that explains one technique for this is the WebID + TLS specification.

Authorization

Once you have established a web scale identity (a URI) and possibly verified it, it is possible to control who has access to various resources.

The Authorization specification is still being fleshed out, but so far it is loosely based on the UNIX file system model. The aim is to make the web one giant file system. The operations allowed so far are:

  • Read
  • Write
  • Control (ability to change permissions)
  • Append (can write but not delete, something like an inbox)
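The four operations above could be modelled as flags and checked against a table of grants. This is only a sketch: the `Mode` enum, the in-memory `ACL` table and all URIs are hypothetical, and it does not model any relationship between modes (for example whether Write should imply Append), since that is not settled here.

```python
from enum import Flag, auto


class Mode(Flag):
    READ = auto()
    WRITE = auto()
    CONTROL = auto()  # ability to change permissions
    APPEND = auto()   # can add but not delete, like an inbox


# Hypothetical grants: resource URI -> agent URI -> granted modes.
ACL = {
    "http://example.org/inbox/": {
        "http://example.org/alice#me": Mode.READ | Mode.WRITE | Mode.CONTROL,
        "http://example.org/bob#me": Mode.APPEND,
    },
}


def is_allowed(agent: str, resource: str, mode: Mode) -> bool:
    """Return True if every requested mode has been granted to the agent."""
    granted = ACL.get(resource, {}).get(agent, Mode(0))
    return (granted & mode) == mode


inbox = "http://example.org/inbox/"
print(is_allowed("http://example.org/bob#me", inbox, Mode.APPEND))
print(is_allowed("http://example.org/bob#me", inbox, Mode.WRITE))
```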

A given resource can link to its Access Control file using a rel="meta" link in its headers. (This may change to rel="acl" in future.)
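Assuming the link is carried in an HTTP `Link` response header (an assumption of this sketch, not something this page confirms), a client could locate the Access Control file roughly as follows. The parser is deliberately simplistic and would not cope with commas inside quoted parameter values. The function name and example URIs are hypothetical.

```python
import re
from typing import Optional

# Matches "<target>" followed by its parameters, up to the next comma.
LINK_RE = re.compile(r'<([^>]+)>([^,]*)')
# Matches the rel parameter, with or without quotes.
REL_RE = re.compile(r'rel\s*=\s*"?([^";]+)"?')


def acl_link(link_header: str, rel: str = "meta") -> Optional[str]:
    """Return the first link target whose rel values include `rel`."""
    for target, params in LINK_RE.findall(link_header):
        match = REL_RE.search(params)
        # A rel parameter may list several space-separated relation types.
        if match and rel in match.group(1).split():
            return target
    return None


header = '<http://example.org/other>; rel="next", <http://example.org/doc.acl>; rel="meta"'
print(acl_link(header))
```

Passing `rel="acl"` instead would cover the possible future relation name mentioned above.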

Reading and Writing

The first sections have shown how resources on the web can be identified and access controlled. Finally, if we have permission, we would like to know how to change things. Fortunately, this ability was built into HTTP from the start, so HTTP will be the common technique.

HTTP POST

This is probably the most common technique, and is the one used in web forms. However, there is no standard way in which POST data is interpreted. Often the POST data is appended to the end of a data structure, but this is not required.

HTTP PUT

This puts a resource at a location, creating it or replacing what was there.

HTTP DELETE

This deletes a resource.

HTTP PATCH

This is generally used to change a small part of a resource.
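The four verbs above can be illustrated with the standard library's `urllib.request.Request`, which accepts a `method` argument. The sketch only constructs the requests and never sends them. The URLs, bodies and the `text/plain` content type are hypothetical placeholders, and no particular patch format is implied.

```python
from typing import Optional
from urllib.request import Request

# Hypothetical resource and container URLs.
CONTAINER = "http://example.org/notes/"
RESOURCE = "http://example.org/notes/today"


def build_request(method: str, url: str, body: Optional[bytes] = None,
                  content_type: str = "text/plain") -> Request:
    """Construct (but do not send) an HTTP request with the given verb."""
    headers = {"Content-Type": content_type} if body is not None else {}
    return Request(url, data=body, headers=headers, method=method)


requests_to_send = [
    build_request("POST", CONTAINER, b"a new note"),          # interpretation is server-specific
    build_request("PUT", RESOURCE, b"full replacement text"),  # place a resource at a location
    build_request("PATCH", RESOURCE, b"a small change"),       # patch format left unspecified
    build_request("DELETE", RESOURCE),                         # remove the resource
]

for req in requests_to_send:
    print(req.get_method(), req.full_url)
```

Sending any of these with `urllib.request.urlopen` would require a server that accepts the verb and, per the earlier sections, grants the caller the matching permission.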

SPARQL Update

As mentioned above, Linked Data has a query language, SPARQL, which also has update ability. One proposal is to discover a SPARQL "endpoint" at

   .well-known/sparql
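Under that proposal, an update could be sent as an HTTP POST whose body is the update text, using the `application/sparql-update` media type from the SPARQL protocol. The sketch below only builds the request. The host, the triple being inserted and the function name are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request

# Hypothetical update: record a name for a hypothetical person URI.
UPDATE = """
INSERT DATA {
  <http://example.org/alice#me> <http://xmlns.com/foaf/0.1/name> "Alice" .
}
"""


def sparql_update_request(host_uri: str, update: str) -> Request:
    """Build (but do not send) a POST to the proposed .well-known/sparql endpoint."""
    parts = urlsplit(host_uri)
    endpoint = urlunsplit((parts.scheme, parts.netloc, "/.well-known/sparql", "", ""))
    return Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


req = sparql_update_request("http://example.org/alice", UPDATE)
print(req.get_method(), req.full_url)
```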

Standards

Specifications that deal with these techniques that are standards or near standards are:

Further Reading