From bajan@mocha.bunyip.com Mon Feb  7 19:27 MET 1994

Subject: Changes to URL document

Hi All,
	Tim and I have exchanged mail over the past couple of days and he
has made some revisions to the current URL draft. [It is available as
<URL:ftp://ftp.w3.org/pub/www/doc/draft-uri-url-02.txt> and
<URL:ftp://ftp.w3.org/pub/www/doc/draft-uri-url-02.ps>]

Tim also says that there is a change list summarising changes to the
document at URL:http://www.w3.org/hypertext/WWW/Addressing/URL/7_1_Changes.html


I have retrieved the draft and asked Tim to make the following changes to
conform to the decisions of the working group before resubmitting it to
the Secretariat as a new draft.

----------------------

1) From the minutes:

  - The Area Directors require the group to produce a companion document
    to the current URL draft containing a list of functional
    specifications and requirements. This document can then be used to
    determine if the current URL draft meets the requirements. Similar
    documents will be required for all UR* protocol specifications.

I believe that the initial discussion in the current draft is in conflict
with this since the first 6 or so pages try to fill this role and this
will be taken up by another document. This document should concern itself
with the definition of the URL and leave the design criteria to the doc
mentioned above.

2) The term "universal" is used quite widely in the document. Since the
group decided some time ago (when we changed the name to Uniform Resource
Identifier), this is probably not appropriate. We can't really claim to
have a "universal" system. A "uniform" system more adequately describes
it.

3) While the document is "wordsmithed" by the author, under the IETF process
it is viewed as the product of the working group and as such "author's
comments" are probably not appropriate (such as those in the section on
readability). In any case this more properly belongs in the requirements
document.

4) References to non-standard URLs should be left out in the main text.
For example the section on the "Full Form" URL doesn't make much sense since
for the purposes of the document there is no other form. The same applies
to references to fragments in the main body. I suggest that if you would
like to put these into the process that they be split out from this
document entirely and submitted to the working group for separate
consideration.

5) The WG has decided that the "URL:" prefix is standard and this should be
made clear in the draft. Currently the only place that this appears is in
the BNF. It should rightly be part of the "Scheme" section which
currently makes no mention of it.

[There are a couple of sentences which don't parse. "Abstract" final
paragraph and "Characteristics" for example. I'm sure the list will be
happy to make an editing pass]


------------------


-- 
-Alan
___________________________________________________________________________
From bajan@mocha.bunyip.com Fri Feb 18 00:29 MET 1994
Return-Path: <bajan@mocha.bunyip.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.0/SMI-4.0)
	id AA20123; Fri, 18 Feb 1994 00:29:37 --100
Received: from mocha.bunyip.com by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA03842; Fri, 18 Feb 1994 00:29:21 +0100
Received: by mocha.bunyip.com (5.65a/IDA-1.4.2b/CC-Guru-2b)
        id AA26704  (mail destined for timbl@www0.cern.ch) on Thu, 17 Feb 94 18:28:55 -0500
Message-Id: <9402172328.AA26704@mocha.bunyip.com>
From: bajan@bunyip.com (Alan Emtage)
Date: Thu, 17 Feb 1994 18:28:55 -0500
In-Reply-To: Tim Berners-Lee's message as of Feb 17, 10:35
Mime-Version: 1.0
X-Mailer: Mail User's Shell (7.2.3 5/22/91)
To: timbl@www0.cern.ch

Subject: Re: Changes to URL document

Cc: uri@bunyip.com, jkrey@isi.edu, Erik.Huizer@surfnet.nl,
        klensin@infoods.mit.edu, fullton@cnidr.org
Content-Type: text/plain; charset=US-ASCII
Content-Length: 5505
Status: RO
X-Status: 

Hello Tim,

> > I believe that the initial discussion in the current draft is in conflict
> > with this since the first 6 or so pages try to fill this role and this
> > will be taken up by another document. This document should concern itself
> > with the definition of the URL and leave the design criteria to the doc
> > mentioned above.
> 
> Who is working on this draft?  Karen and Larry are defining URNs not URLs
> or URIs.  If I take out a definition of what the document is about,
> I need something to refer to.  I can refer to Karen and Larry's document
> for the definition of a URN, and perhaps of a URL which they do
> define in passing.  But that document does not give requirements for URLs.
> What can I quote here?

John Kunze has a copy of the draft document that was presented at Houston
and will be cleaning that up and posting it to the list. I think this
will work quite nicely since there was pretty broad agreement on what was
to be in and, more importantly, out of that document. We should have it
soon. I don't expect that there will be any (ok, "much" :-) controversy
over that document since the meeting was quite clear on it and most of
the points made were listed in the minutes and people seem to have agreed
to them.

> > 2) The term "universal" is used quite widely in the document. Since the
> > group decided some time ago (when we changed the name to Uniform Resource
> > Identifier), this is probably not appropriate. We can't really claim to
> > have a "universal" system. A "uniform" system more adequately describes
> > it.
> 
> The universality of the URI name space is in fact important,
> but if you like I will take that out of the document and make this draft
> only talk about URLs.  I propose to write a separate informational
> document on WWW's use of URIs as input to the group.  This would be  
> appropriate given the group's desire to take into account but not
> be driven by current designs.

More than appropriate I think... necessary. As the first system to
widely deploy URLs (in the basic framework, not the exact syntax being
proposed) I think the group would welcome such a document.  Please bring
it along... we'll move into the flow...

> > 5) The WG has decided that the "URL:" prefix is standard and this should be
>      ^^^^^^^^^^^^^^^^^^
> > made clear in the draft. Currently the only place that this appears is in
> > the BNF. It should rightly be part of the "Scheme" section which
> > currently makes no mention of it.
> 
> Sorry, it wasn't clear to me that it had. Look at Larry Masinter's
> message of 17 Dec available as
> <http://www.acl.lanl.gov/URI/archive/uri-archive.messages/900.html>
> which summarises the problems. My personal feeling is that this shouldn't
> hold us up as defining the URL itself is more important than its wrappers
> for plain text.  But there seem to be a lot of suggestions about this.
> Do you regard prefix as part of the URL, or part of a wrapper for
> plain text use?  Have I missed a roar of consent about this one?

[Alan's thought bubble:]
Oh God :-) I don't want to reopen this debate... really, really I don't.
Ok, let me see if I can try it this way (he thinks naively), maybe
they won't kill me....

[In an authoritative voice:]

"The working group has decided..."

Seriously folks, having discussed this at length with my co-chair, the
AD's and several of you privately since Houston here is my sense of where
the group stands and how best to move on.

(1) The URL: prefix debate splits people into 3 groups.

  (a) Vehement supporters "for"
  (b) Supporters "for" & supporters "against"
  (c) Vehement supporters "against" (I'm not sure if you can "support
      against" something but you get the idea)

(b) has people with defined opinions but who are willing to let the
respective opposing side "win" to allow us to move on.

It is the opinion of the WG Chairs and the ADs (all 3 of whom were
present for the debate in Houston) that the (a) and (b) groups have shown
the greater numbers "for" and thus the motion FOR a "URL:" prefix as part
of the URL proper carries under the conditions of "rough consensus". 

This IN NO WAY invalidates the arguments of (c). Their points are noted
and remain valid.

Look folks, this has become somewhat of an intractable point. Which, all
things considered is not too bad considering that we have few of them and
in the general scheme we have broad consensus on a wide range of issues
in a very complex area. At this stage (a) and ("for" b) are not going to
change many of the minds of (c) so we have reached a crossroads and a
decision has been made.

2) The issue of the wrapper (in plain text) needs to be resolved. People
seem to be starting to use the <> syntax although no formal decision has
been made on that yet. I think the concept is pretty well accepted so it's
a matter of choosing what it's going to be. People working with HTML have
understandably complained that this will make their life more difficult.
Can we reach some closure on this one?

URN requirements are well on their way and we should be into the specs
stage in time for Seattle (5 weeks). URCs are a bit of an unknown
quantity at this time so we'll have to start that debate going.


-- 
-Alan
___________________________________________________________
From: bajan@bunyip.com (Alan Emtage)
Date: Mon, 7 Mar 1994 00:07:12 -0500
X-Mailer: Mail User's Shell (7.2.5 10/14/92)
To: uri@bunyip.com

Subject: Unresolved URL issues

Content-Type: text
Content-Length: 6956


Hello All,
	As Larry enumerated last week there are a couple of minor issues
still to be resolved before we can put the bow on the URL box.

In decreasing order of importance (by my reckoning) here they are. I have
included where I believe the current majority opinion lies (and pro/con
arguments if appropriate). My own personal observations for resolving the
issue are also there. If you have strong objections to these proposals,
please raise your hand. My plan is to formalize the mailing-list position
in short order in Seattle and move on.

1) The syntax and semantics for certain URLs.

a) the FTP URL.

Majority(?): The syntax should be "URL:ftp://host/a/b/c/d". Meaning that
repeated CWD commands "a", "b", "c" should be performed and a RETR done
on "d". The "/" is a directory boundary and if embedded "/" are to be
allowed they must be quoted via the same mechanism as whitespace (ie,
%<number>).
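
[A minimal sketch of the mapping described above, in Python. It assumes
that "%<number>" means a two-digit hex escape (so an embedded "/" is
written %2F); the function name ftp_commands is hypothetical and is not
taken from the draft.]

```python
from urllib.parse import unquote


def ftp_commands(url):
    """Translate "URL:ftp://host/a/b/c/d" into (host, list of FTP commands)."""
    prefix = "URL:ftp://"
    if not url.startswith(prefix):
        raise ValueError("expected an ftp URL with the URL: prefix")
    host, _, path = url[len(prefix):].partition("/")
    # Split on the literal "/" first and decode %XX afterwards, so that a
    # quoted "/" (%2F) stays inside a single path component.
    parts = [unquote(p) for p in path.split("/") if p]
    if not parts:
        raise ValueError("URL names no object to retrieve")
    # One CWD per directory component, then RETR on the last component.
    commands = ["CWD " + d for d in parts[:-1]]
    commands.append("RETR " + parts[-1])
    return host, commands


if __name__ == "__main__":
    print(ftp_commands("URL:ftp://host/a/b/c/d"))
```

Splitting before decoding is what keeps a quoted "/" from being treated
as a directory boundary.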

Pro: Will work in most cases

Con: Will fail in (a minority of) cases

Personal comments: I've had some experience with automated ftp retrieval
through archie and the technique we use is that proposed above. We
perform an additional "PWD" as the first command after login and keep
track of what's returned, presumably the root of the ~ftp tree. We
eliminate that from given paths. However there are problems with this in
that this would fail if a file called /pub/ftp/a was being retrieved and
~ftp was /pub/ftp. 
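
[To make the failure case concrete, here is a small Python sketch of the
prefix-stripping step. It is an illustration only, not archie's actual
code, and the name strip_ftp_root is hypothetical.]

```python
def strip_ftp_root(path, root):
    """Remove the PWD-reported root from the front of path, if present."""
    root = root.rstrip("/")
    if root and (path == root or path.startswith(root + "/")):
        return path[len(root):] or "/"
    return path


if __name__ == "__main__":
    root = "/pub/ftp"  # what PWD returned right after login
    # Intended case: an absolute server path is made relative to the root.
    print(strip_ftp_root("/pub/ftp/a", root))
    # Problem case: a file that really is called /pub/ftp/a inside the
    # ftp tree produces the same input string, so it is stripped too and
    # the wrong object is requested.
```

The function cannot tell the two cases apart because both arrive as the
same string, which is the ambiguity described above.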

I propose that the path be given relative to the login (uid/password) in
the URL, as opposed to an absolute path. The URL still contains enough
information that it is not "relative" (or "partial") and the context may
be fully resolved on the host in question. It does however prevent the
conversion of the URL to another access method. Not a requirement in any
case, I believe.

I would further suggest (as I believe has been done in the past) that the
"login" term in the BNF be rewritten to have the username/password
combination at the end of the hostname/port, not at the beginning. This
would allow it to conform to the other access methods. In addition, an
"account" term will have to be added (RFC 959 has a triplet,
username/password/account).

Also as Larry notes, there is no current provision for typing the object
being referenced or the transfer mode that has to be used. Since both are
required for access to the object and since the draft requirements allow
such typing in cases where the information is necessary for access, I
propose that we allow the terms "binary", "ascii" and "tenex" to be used
as transmission specifiers (again, see RFC 959). Since the only two
objects that can be obtained from ftp sessions are (potentially)
directories (that is the contents of a directory) and files, I propose
that we specify "directory" and "file" as object types. There is currently no
(standard) mechanism in FTP to determine if an object is a directory or a
file so this is needed. [You can do like archie and parse the ls(1)
listing but that is so ugly as to be rejected out of hand.... in any case
there is no standard for responses to the LIST command].

The question of what to do about multiple types of objects in a directory
may need to be addressed.

IMHO, I agree with John Curran. With something like FTP we can't bother
about every possible implementation under the sun... it's been around too
long and in many cases is too unstandardized to try to get 100% of all
implementations.

This would lead to an ftp URL like:

URL:ftp://hostport@[[[username]:password]:account]/binary/file/a/b/c

[Some people may prefer other delimiters for the "binary", "file"
separators]
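
[A rough Python sketch of how a client might pick such a URL apart. It
rests on assumptions that go beyond the text: the nested brackets are
read as "up to three optional colon-separated fields after the @", the
"/" delimiters for the mode and type are taken literally, and the names
ProposedFtpUrl and parse_proposed_ftp_url and the host in the example
are hypothetical.]

```python
from typing import List, NamedTuple, Optional
from urllib.parse import unquote

MODES = {"binary", "ascii", "tenex"}  # transmission specifiers named above
TYPES = {"directory", "file"}         # object types named above


class ProposedFtpUrl(NamedTuple):
    hostport: str
    username: Optional[str]
    password: Optional[str]
    account: Optional[str]
    mode: str
    objtype: str
    path: List[str]


def parse_proposed_ftp_url(url: str) -> ProposedFtpUrl:
    prefix = "URL:ftp://"
    if not url.startswith(prefix):
        raise ValueError("expected the URL:ftp:// prefix")
    authority, _, rest = url[len(prefix):].partition("/")
    # Login information follows the host and port, after the "@".
    hostport, _, login = authority.partition("@")
    fields = [f or None for f in login.split(":")] if login else []
    if len(fields) > 3:
        raise ValueError("at most username:password:account expected")
    fields += [None] * (3 - len(fields))
    username, password, account = fields
    segments = rest.split("/")
    if len(segments) < 2 or segments[0] not in MODES or segments[1] not in TYPES:
        raise ValueError("expected /<mode>/<type>/ before the path")
    # Decode %XX only after splitting, so a quoted "/" stays in one component.
    path = [unquote(s) for s in segments[2:] if s]
    return ProposedFtpUrl(hostport, username, password, account,
                          segments[0], segments[1], path)


if __name__ == "__main__":
    print(parse_proposed_ftp_url(
        "URL:ftp://ftp.example.org:21@anonymous:guest/binary/file/a/b/c"))
```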

b) Telnet URLs

Majority: ?

There hasn't been significant discussion on this if memory serves
correctly.

Larry asks if rlogin, tn3270 (and telnet-with-local-echo) are the same.

My comments: I think we should let telnet be pure telnet. While rlogin
and tn3270 are very similar, they probably should remain separate.
rlogin has a different default port, and tn3270 may require parameters
other than login and password. I suggest that we define similar, yet
distinct, URLs for these beasties. The telnet-with-local-echo is really a
matter of the attributes of the telnet session (if I understand the term
correctly). Can we provide a syntax to specify telnet parameters WHICH
ARE NECESSARY FOR ACCESS on the URL, hopefully the same as say the
Prospero attributes?

c) News and NNTP URLs

Majority: unclear

There was a discussion on this some time back. I believe that Tim and
Mitra were on opposite sides of this fence. Would it be possible for the
two of them to send a message 20 lines or less with a brief summary of
their respective sides? Whatever is finally decided the WG grandfathered
the news URL already.

2) Wrappers

Majority: Wrappers will exist in contexts which need them and are context
specific. However the spec should define them for plain text at least.
Currently <>, {}, '' and "" are possible candidates. Each context
(protocol, system) is free to define its own wrapper. [We should probably
look at documenting these when they come into common use].

URLs need to be identified in plain, running text. The "URL:" prefix
allows the URL to be explicitly identified (as opposed to a process of
elimination: "if it's not a URN and if it's not a URC then it's a URL"),
and whitespace must be encoded, but sometimes a wrapper is still
necessary: for example, in cases where no whitespace may be present to
identify the end of the URL.

The debate remaining centres on the characters for the wrapper. Many
people currently use <> (including TimBL if I'm not mistaken). Any
character chosen would cause it to be disallowed in the URL proper, having
to be quoted via the normal mechanism. Wrappers are not part of the URL
proper.
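
[A small Python sketch of what the <> wrapper buys in running text. It
assumes the <URL:...> form used earlier in this thread and a two-digit
hex escape for quoting; the names find_wrapped_urls and
quote_wrapper_chars are hypothetical.]

```python
import re

# A wrapped URL: "<", then the URL: prefix, then anything up to ">".
# The wrapper characters and whitespace cannot appear unquoted inside.
WRAPPED_URL = re.compile(r"<(URL:[^<>\s]+)>")


def find_wrapped_urls(text):
    """Return the URLs (without their wrappers) found in running text."""
    return WRAPPED_URL.findall(text)


def quote_wrapper_chars(url):
    """Quote the wrapper characters so the URL can be wrapped safely."""
    return url.replace("<", "%3C").replace(">", "%3E")


if __name__ == "__main__":
    text = "see<URL:ftp://host/a/b>and<URL:http://host/x>."
    print(find_wrapped_urls(text))
    print(quote_wrapper_chars("URL:ftp://host/a<b>"))
```

Both URLs are recovered even though no whitespace marks where they end,
and the wrapper itself is not returned as part of the URL.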

Pro: <> are available in most character sets and are already widely used.
(I perceive) Some leaning towards their "intuitive" nature. "" and '' are
used pervasively in text and {} are not available from certain systems
(IBM comes to mind if I remember correctly... it's been a long time).

Con: <> is already used in SGML and would cause additional parsing
headaches.

Con-con: We can't restrict particular characters because particular
systems use them.... this restriction would disallow most (non
alphanumeric) characters.

Personal comments: I fully understand the position of the SGML people
although I don't see many alternatives here. We could look at [] perhaps?

3) Other (to be defined URLs).

Majority: Not a lot of debate. Seem to be comfortable with the current
crop of defined URLs.... may want to wait before defining others?



That's it as far as I know... anything else that I've forgotten?


-- 
-Alan
_____________________________________________________________