The TANN (Time, Authority-Name, Name) URI Scheme

Status

Merged with work from Tim Kindberg into the tag: URI scheme, although the details remain essentially unchanged.

Proposed by sandro on 2001-02-05. The only complaint to date (02-08) is that some people (DanC) think it's not worth the chunk of scheme namespace it takes.

One should perhaps use "urn:x-tann:" as the prefix for now, not "tann:".

Related idea: "fetchable names". If we use a portion of HTTP space, we address DanC's concern and let people type the name into a browser and get something back. Maybe we can map it to a kind of proxy for general SW inference systems, which can generate HTML pages of information about arbitrary object identifiers.

Overview

This document describes a URI scheme for easily making permanent identifiers (URNs). This scheme meets the following goals better than any other scheme yet known to the author:

  1. easily generated by a human
  2. relatively short, memorable, and transcribable
  3. no central authority or administration

One reason this scheme can meet these goals is that it does not attempt to solve the "broken links" problem. The URNs are just object identifiers, with no convenient embedded hints for mapping to URLs.

Some examples of TANN URIs:

    tann:2001,sandro@w3.org,my-dog
    tann:2001-01-31,w3.org,RDF
    tann:2001,sandro_hawke@yahoo.com,
    tann:1992,microsoft.com,uuid/f81d4fae-7dec-11d0-a765-00a0c91e6bf6
    tann:2000,w3.org,random_id/56e99e03abff293e9b11c5d1968f7522

A TANN URI is composed of a time, an authority-name, and a name. The authority-name is an e-mail address or domain name which (at the given moment in the past) identified the person or organization which has (in perpetuity) authority over allocating the name field.

The time value represents a precise instant in time (UTC). When the instant is the start of a time unit (such as a year or a day), the lower-order fields are omitted.

The selection of which time value to use, over all possible time values for which the authority-name is valid, is delegated to the authority. The recommended practice is to use the earliest large time interval (e.g., a year) for which the name was valid, since that allows for short URIs. The use of a later time value is suggested if name-space management practices change in an incompatible way.

How To Use It

  1. Decide whether you want to be a naming authority or use part of the space managed by some other authority. You may use someone else's namespace with their explicit permission. Typically, someone may delegate a part of their space for public use to support software interoperability. (Some uses are suggested in the examples above.)

  2. Decide whether to use (1) an e-mail address you actually use for e-mail, (2) an e-mail address you make for just this purpose (perhaps using a service like yahoo mail or hotmail), or (3) a domain name registered to you. Option (1) has the advantage of identifying you in a recognizable way as the creator of the TANN, but may encourage people to send you unwanted e-mail. Whatever you decide, call that your "authority-name".

  3. Pick a time during which that authority identifier was yours. It should be in the past, no matter how confident you are that you'll still be using the same identifier at some point in the future. You will generally want it to be as short as possible, so a good practice is to pick the start of the most recent year in which the identifier was yours. If you didn't have it then, pick the start of a smaller time unit, like a month or day. If you get down to small units like days, don't forget that the time should be in UTC (Coordinated Universal Time, the successor to Greenwich Mean Time).

    You may want to also use the time field for versioning, to keep your name fields simpler. The time you select is up to you, except that it must be some time for which the authority identifier identifies you.

  4. Finally, pick any bytes you want to follow the comma in the Name field. Conventionally, one would use text with UTF-8 encoding for Unicode characters not in ASCII, but this is not mandated. Any unsafe URI characters should be URI escaped (%xx) as usual.
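
The four steps above can be sketched in a few lines of Python. This is only an illustration: the helper name make_tann is hypothetical, and leaving "/" unescaped in the name is an assumption based on the examples earlier in this document, not something the scheme mandates.

```python
from urllib.parse import quote


def make_tann(time: str, authority_name: str, name: str = "") -> str:
    """Assemble a TANN URI from its three parts (hypothetical helper)."""
    # Step 2: the syntax calls for a lowercase authority-name.
    authority = authority_name.lower()
    # Step 4: encode the name as UTF-8 and %xx-escape unsafe characters.
    # Keeping "/" unescaped is an assumption taken from the examples.
    escaped = quote(name, safe="/")
    return f"tann:{time},{authority},{escaped}"


print(make_tann("2001", "Sandro@W3.org", "my-dog"))
print(make_tann("2001-01-31", "w3.org", "café menu"))
```

The caller is still responsible for step 3: the helper does not check that the time is one at which the authority-name really identified them.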

Syntax

Formally, the syntax is this:

    TANN_URI ::=  "tann:" time "," authority_name ( "," name )?

    time    ::=  year ("-" month ("-" day ("-" hour 
                      ("-" minute ("-" second ("." digit+ )? )? )? )? )? )? 

    authority_name ::=  lowercase_email_address | lowercase_domain_name

    name    ::=  (URIchars)*
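
As a rough check of the grammar, the following Python sketch turns it into a regular expression. Several details are assumptions, because the grammar leaves them undefined: a four-digit year, two-digit lower-order time fields, a simplified pattern for e-mail addresses and domain names, and a URIchars set made of the usual URI characters plus "%". The names Tann and parse_tann are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Tann(NamedTuple):
    time: str
    authority_name: str
    name: Optional[str]  # None when the optional ("," name) part is absent


# Nested optional groups mirror the nesting of the "time" production.
TIME = r"\d{4}(?:-\d{2}(?:-\d{2}(?:-\d{2}(?:-\d{2}(?:-\d{2}(?:\.\d+)?)?)?)?)?)?"
# Assumed, simplified shape: optional "local-part@" followed by a host name.
AUTHORITY = r"(?:[a-z0-9._+-]+@)?[a-z0-9.-]+"
# Assumed URIchars: unreserved and reserved URI characters plus "%".
NAME = r"[A-Za-z0-9._~:/?#\[\]@!$&'()*+,;=%-]*"

TANN_RE = re.compile(
    rf"tann:(?P<time>{TIME}),(?P<authority>{AUTHORITY})(?:,(?P<name>{NAME}))?"
)


def parse_tann(uri: str) -> Tann:
    """Split a TANN URI into its parts, or raise ValueError."""
    match = TANN_RE.fullmatch(uri)
    if match is None:
        raise ValueError(f"not a TANN URI: {uri!r}")
    return Tann(match["time"], match["authority"], match["name"])


print(parse_tann("tann:2001,sandro@w3.org,my-dog"))
print(parse_tann("tann:2001,sandro_hawke@yahoo.com,"))
```

The pattern checks only the shape of the string. It does not reject out-of-range values such as month 13, and it does not enforce the omission of trailing minimal time fields.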

Time Details

The time is specified in UTC. All terms but "year" are padded with leading zeros to their maximum width (two characters), and all rightmost terms holding the minimal value (01 for month and day, 00 for hour, minute, and second, 0 for the fractional digits) must be omitted. This allows lexicographic ordering to match time ordering, short time values to be chosen when authority identifiers are long-lived, and string equality comparison to serve as time equality comparison. (This ordering behavior required that the character separating the time from the authority field be lexicographically less than the character dividing the time fields, so that a shortened time such as "2001," sorts before "2001-01-31,". Comma sorts before hyphen in ASCII, while slash sorts after it; otherwise we would have used slash instead of comma.)
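
The padding and omission rules can be sketched as a small Python function. This is one possible reading of the rules, not a reference implementation: the name tann_time is hypothetical, the input is assumed to be a timezone-aware datetime, and the year is assumed to be written with four digits.

```python
from datetime import datetime, timezone


def tann_time(moment: datetime) -> str:
    """Render a timezone-aware datetime as a TANN time string (sketch)."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    utc = moment.astimezone(timezone.utc)
    # (zero-padded text, minimal value) for each field after the year.
    fields = [
        (f"{utc.month:02d}", "01"),
        (f"{utc.day:02d}", "01"),
        (f"{utc.hour:02d}", "00"),
        (f"{utc.minute:02d}", "00"),
        (f"{utc.second:02d}", "00"),
    ]
    # Fractional seconds, with trailing zeros removed.
    fraction = f"{utc.microsecond:06d}".rstrip("0")
    parts = [text for text, _ in fields]
    if not fraction:
        # Drop rightmost fields while they hold their minimal value.
        while parts and parts[-1] == fields[len(parts) - 1][1]:
            parts.pop()
    result = "-".join([f"{utc.year:04d}"] + parts)
    return result + ("." + fraction if fraction else "")


print(tann_time(datetime(2001, 1, 1, tzinfo=timezone.utc)))
print(tann_time(datetime(2001, 1, 31, tzinfo=timezone.utc)))
```

When a fraction is present, no lower-order field can be dropped, because the fraction is then the rightmost term and it is not minimal.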

An example: W3C obtained "w3.org" on 1994-07-06. Had this TANN scheme been in place, W3C might have used "tann:1994-07-06,w3.org," for the first few weeks. If it wanted, come August 1 it could switch to "tann:1994-08,w3.org,". Come Jan 1, it could switch to "tann:1995,w3.org," which would be a good name to use until some problem arose. If in July of 2000 (say), it realized a wholly new organization and policy was needed, it could start using "tann:2000,w3.org,". Also, if some need arose for a temporary namespace, an arbitrary narrow time could be used, like "tann:2000-01-01-00-00-00.00001,w3.org,". (This is probably best avoided by using sensible namespace policies.)
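
The ordering property can be checked directly on some of these example prefixes. The short Python sketch below sorts them as plain strings and prints the character codes that make this work; the choice of prefixes and the variable names are only for illustration.

```python
# Example prefixes from the text, deliberately listed out of order.
prefixes = [
    "tann:2000-01-01-00-00-00.00001,w3.org,",
    "tann:1995,w3.org,",
    "tann:2000,w3.org,",
    "tann:1994-07-06,w3.org,",
]

# Plain string sorting; no date parsing is involved.
ordered = sorted(prefixes)
for prefix in ordered:
    print(prefix)

# "tann:2000," sorts before "tann:2000-..." only because "," has a lower
# character code than "-". A "/" separator would sort after "-".
print(ord(","), ord("-"), ord("/"))
```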

Authority Name Details

The important property of authority names is that no two people, groups of people, or other entities act as if they have rights to the same combination of Time and Authority Name.

For an organization, this means the naming authority rests with the general organizational authority (such as the board of directors) until or unless it is clearly delegated to an appropriate internal agency.

For informal groups, such as a family sharing an e-mail address, the matter should be discussed and resolved among all parties who might imagine they are the identified naming authority.

Name Details

The Name part of the TANN may be further divided, as in the examples above, where we can imagine Microsoft defining a UUID space for general use, or w3.org defining a space where arbitrary strongly-random identifiers can be created.
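
A minimal Python sketch of such subdivided names follows. Everything specific in it is an assumption: the authority "example.org" is a placeholder, the helper names are hypothetical, random UUIDs (version 4) are used for simplicity, and the 32 hexadecimal digits of the random identifier simply match the length of the example earlier in this document.

```python
import secrets
import uuid


def random_id_name() -> str:
    """Name in a 'random_id/' sub-space: 128 strongly random bits as hex."""
    return "random_id/" + secrets.token_hex(16)


def uuid_name() -> str:
    """Name in a 'uuid/' sub-space, using a random (version 4) UUID."""
    return "uuid/" + str(uuid.uuid4())


# "example.org" stands in for an authority that has opened these
# sub-spaces for public use; it is a placeholder, not a real delegation.
print(f"tann:2000,example.org,{random_id_name()}")
print(f"tann:2000,example.org,{uuid_name()}")
```

Because the identifiers are random, anyone permitted to use the sub-space can mint names without coordinating with other users of the same authority.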

FAQ

What if someone else uses my name space? That's an issue beyond the scope of this standard. TANNs are like DCE GUIDs and UUIDs: they provide a way to generate unique identifiers, but they don't enforce the uniqueness, they don't provide security against spoofing, etc. What they do provide is a way to avoid unintentional conflicts.

What's wrong with PURLs and Handles? They are administered through a central authority and thus have overhead which, while perhaps needed for their designed uses, is probably inappropriate for others.

What's wrong with "mid:" URIs? Functionally, they are okay, but semantically they are wrong: they identify mail messages, not arbitrary objects.

Can you assign URIs to everything? No, but you can come very close. You can assign a tann: identifier to any particular individual object you can imagine, but there are precisely describable sets (such as the set of irrational numbers between 0 and 1) whose members cannot all be systematically assigned URIs, since such a set has more members than there are URIs, even if we do not limit URI length.