Minutes

The "UDI BOF" was held at the 24th IETF in Cambridge, MA, USA on 14 July 1992. Attendees are listed.

Introduction

Tim Berners-Lee (TBL) opened the BOF with a summary of the terms used in the discussion to date. The information quoted in a reference to an object could comprise many things, possibly including one unique name (Unique Resource Number, or URN, was one acronym) and zero or more addresses (Uniform Resource Locators, or URLs), which gave instructions for retrieving the object.

The object of the meeting was to formalize a standard string syntax for URNs and URLs in general, and to define specific syntaxes for addresses in the namespaces of each of the existing network protocols. [There was a discussion on acronyms at various times. URL was decided upon for an address, and that is used throughout these minutes for clarity.] The result should be a standards track document (requiring a working group, which should probably be in the Directory Area but could be in Applications).

NOT to be discussed were the differences between names and addresses, URN schemes (which are not yet well enough defined), the full set of information to be given in a reference, or IPv7.

To be discussed were the overall string syntax, including allowed characters and escaping systems for unallowed characters, the order of components (little/big-endian), punctuation characters, and the particular prefix to be used to identify each namespace.

Specific schemes should be handled in appendices of the resulting document, and should include:

Prospero

FTP

WWW

telnet

net man. db?

nntp

WAIS

gopher

finger

X.500

Discussion

We need a method of keeping the set of appendices up to date without going through the same standards-track procedure that applies to the full document.

It was pointed out that for WAIS one could imagine a separate name space for databases and for documents. If this was taken further, a separate prefix would be used for each type of object. It was on balance agreed that this could go too far. One prefix should be used per protocol, but it should be made clear how to determine the type of an object from the URL.

Peterd was concerned that we need a syntax for attaching a URN to a URL, but accepted that it was not for discussion at the BOF.

Cliff Lynch suggested a three-part structure of name, address, and other stuff. Peterd suggested that the document should talk about what the addresses aren't, in a section on Scope. This section could also provide an example of a complete reference, including other information, by way of explanation but not recommendation.

TBL had submitted in the background document the W3 implementation:

scheme: ____blah___ (the syntax of ____blah___ depending on the value of scheme, within certain constraints.) There was no dissent, although TBL noted that this is the reverse order from the WAIS proposal.

John Curran was concerned that URLs be resolvable in such a way that any two references to the same URL get the same thing (unambiguity). It was generally felt that the system W3 uses to allow URLs to be incompletely quoted in context was an application issue and was not relevant.

The question of what we are identifying came up: is it a "resource locator", that is, a scheme for somehow identifying resources, or perhaps a procedure for locating a resource (Sollins and C. Lynch)? Cliff Neumann suggested Document Access Instructions as an alternative handle/name/identifier for these addresses. URL (Uniform Resource Locator) was decided on by an almost unanimous vote.

Peter Deutch pointed out that we want to focus on interoperability, not on longevity. We ought to be able to hand URLs around in the short term, but they need not be long-lived. URLs are not unique (in the sense that one document may have several). This should be made clear in the document.

The class of object you get back should be predictable (C. Lynch). W3 has a real problem with that, since everything is a "document" and handled in a similar way. One might get a pointer to a database in a piece of mail. The question arose of whether one gets back a file or a directory from an FTP URL. Archie really wants to know what it is getting back. Within a scheme, there should be documented syntax that will clarify which sort of object will come back. If we go too far down this track, we fairly quickly get to a full object-oriented world, with full-scale typing. Alan Emtage suggested a simple enumeration of acceptable types, with extensions based on documented new subtypes, based on documented protocols.

A separate issue was whether URLs should be human readable or only machine readable. Previously, this had included the issue of being printable. This is needed because we don't have names now. The question arose of whether, once these addresses exist, they will be replaceable with names; they will be presented as new functionality, not as replacing existing systems. There was agreement on some way of specifying the class of objects.

"Context" prefix

IT WAS AGREED that the context, or namespace, prefix be the first (leftmost) part of the URL, and be separated from the rest of the URL by a colon.
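As a rough illustration of the agreed layout, the sketch below splits a URL at its first colon into a namespace prefix and a scheme-specific remainder. The function name `split_prefix`, its error handling, and the sample addresses are assumptions made for this example, not something defined at the BOF.

```python
def split_prefix(url: str) -> tuple[str, str]:
    """Split a URL into (prefix, rest) at the first colon.

    Hypothetical helper: the name and error handling are illustrative only.
    """
    prefix, sep, rest = url.partition(":")
    if not sep or not prefix:
        raise ValueError(f"no namespace prefix in {url!r}")
    return prefix, rest


# Only the first colon separates the prefix; later colons belong to the rest.
print(split_prefix("ftp://host.example:21/pub/readme"))
print(split_prefix("news:comp.infosystems"))
```

Splitting on the leftmost colon only means that the remainder is free to use colons for its own purposes, such as a port number.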

The punctuation "//" was discussed. Currently, W3 URLs use "@" for login information. An extension of the server hostname can include a port number in all current schemes.
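For illustration only, Python's standard `urllib.parse.urlsplit` can pull these pieces apart. It follows the generic URL syntax that was standardized after this meeting, so it shows one way the "//", "@", and port conventions can fit together, not a decision taken at the BOF. The login name, host, and port below are made-up sample values.

```python
from urllib.parse import urlsplit

# Sample address (hypothetical): login "anonymous", host "ftp.example.org", port 2121.
parts = urlsplit("ftp://anonymous@ftp.example.org:2121/pub/README")

print(parts.scheme)    # ftp
print(parts.username)  # anonymous
print(parts.hostname)  # ftp.example.org
print(parts.port)      # 2121
print(parts.path)      # /pub/README
```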

The issue of how to manage additional schemas was discussed. Each appendix should be checked out by a particular group within the IESG. Perhaps it should be an "experimental standard," rather than simply an "information rfc." The document will describe how to write appendices.

Syntax details

The syntax should be human typable (majority agreement).

Should one use punctuation, or attribute-value pairs? Attribute-value pairs get misspelt. (Note X.400 vs. Internet addresses.)

It was decided to use a short string with punctuation rather than an attribute-value pair system.

We must specify the terminator (by declaring some characters as illegal inside a URL).

Is a URL nestable? If one URL can contain another, one needs nestable begin-end pairs (Alan E.). Currently W3 URLs are not nested visibly, although escaping allows URLs to be encapsulated within URLs, for example by gateways.
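A minimal sketch of encapsulation by escaping follows. It assumes percent-style escaping as implemented by Python's `urllib.parse.quote`; the escaping scheme itself was still to be defined by the group, and the gateway address is hypothetical.

```python
from urllib.parse import quote, unquote

inner = "ftp://ftp.example.org/pub/README"

# Escape every reserved character so the inner URL cannot be confused with
# the structure of the outer one (safe="" leaves no punctuation unescaped).
outer = "http://gateway.example.org/fetch/" + quote(inner, safe="")

print(outer)
# The gateway recovers the original address by reversing the escaping.
print(unquote(outer.rsplit("/", 1)[1]) == inner)
```

Because the inner URL's slashes and colon are escaped, no begin-end pair is needed to mark where it starts and stops.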

Allowed characters: characters should be disallowed if they are needed as terminators (`"', `;') or are too easily mutilated by passage through (for example ASCII/EBCDIC/ASCII) gateways (tilde, backslash). A subset of an ISO 7-bit code should be defined, with reference to MIME work.
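The character restrictions above could be checked along the following lines. This is a sketch under stated assumptions: the two sets contain only the characters named in these minutes, the additional rule that spaces and control characters are rejected is an assumption of this example, and the name `disallowed_characters` is hypothetical. The actual subset was left to be defined.

```python
# Characters the minutes name as problematic; these sets are illustrative, not normative.
TERMINATORS = {'"', ";"}
GATEWAY_FRAGILE = {"~", "\\"}


def disallowed_characters(url: str) -> list[str]:
    """Return the characters this sketch would reject, in order of first appearance."""
    bad: list[str] = []
    for ch in url:
        # Assumption: only visible 7-bit characters (no space, no controls) are allowed.
        visible_7bit = 0x21 <= ord(ch) <= 0x7E
        rejected = not visible_7bit or ch in TERMINATORS or ch in GATEWAY_FRAGILE
        if rejected and ch not in bad:
            bad.append(ch)
    return bad


print(disallowed_characters("ftp://host.example/pub/readme"))  # []
print(disallowed_characters("ftp://host.example/~user/a;b"))   # ['~', ';']
```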

Future discussion

Mailing lists: The NIR list at McGill will be used by Jill Foster and George Brett's NIR BOF. ietf-url@merit.edu will be used by this group. First of all, we should ask Mike Schwartz whether he is willing to run all mailing lists on one machine (at least nir and url) in order to cut down on multiple copies of cross-posted messages.

Accomplishments

Things which the meeting had brought to light included:

"File:" is too broad a description, "ftp:" would be better. If a given client knows that it can in fact access some FTP sites as local files, that is a local client issue.
Escaping is to be defined
Relative naming is a client issue
We should look at what we call "news:" ("usenet:"?)
We should be able to tell what sort of an object we have (e.g. database or document) by simple examination of the URL
We need a scheme for managing the addition of new schemas (cf Directory object definitions).
The document should be an "experimental standard" RFC.
Mailing lists will be defined and a single message sent to cniarch to say what is happening.

The timeframe for the document was: soon. It will probably be in the directory services area (Eric Huizer has suggested this). We need a WG if we want to go through the IESG. But in reality, if we have big applications buying in, then it will be a de facto standard.

Final comments

We may need to separate WAIS from Z39.50. (C Lynch). We may also need SQL.

Also, we should include an "X-junk:" type extension mechanism for experimental schemes.
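One possible reading of this suggestion is that any prefix beginning with "X-" marks an experimental scheme. The sketch below assumes that reading; the function name `is_experimental` and the case-insensitive comparison are assumptions of this example, not something recorded at the meeting.

```python
def is_experimental(url: str) -> bool:
    """Report whether the URL's prefix starts with "X-" (hypothetical convention)."""
    prefix, sep, _rest = url.partition(":")
    # Assumption: the marker is matched without regard to case.
    return bool(sep) and prefix.lower().startswith("x-")


print(is_experimental("X-junk:some/opaque/string"))
print(is_experimental("ftp://host.example/pub/readme"))
```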

[These minutes noted on-line by Karen Sollins (Thanks!) and edited by Tim Berners-Lee. They are available as URL http://www.w3.org/timbl/Public/USTrip1992/IETF-24/UDI_BOF_Minutes.html ]