Date: Tue, 22 Mar 94 17:51:49 +0100
From: Tim Berners-Lee <timbl@ptpc00.cern.ch>
To: <uri@bunyip.com>

FTP syntax

Reply-To: timbl@www0.cern.ch

Summarizing, the consensus is now that the FTP syntax should be
of the form

	ftppath		<fpath> [ <separator> <mode> ]

	<fpath>		<xpalphas>

	<separator>	(we have been talking about a colon)

	<mode>		dir | bin | text | tenex

where if the mode is omitted the client has to do its best to figure it out.

Now all we have to do is to make this wholesome.  You obviously
must allow any character in the xpalphas bit. So that means that the
separator (like colon) must be escapable.  Now the rules for
escaping characters in URLs are such that reserved characters
like / have different meanings when escaped, but other characters
not explicitly reserved MAY be escaped if the medium in question
doesn't like them.

[background: This is what gave us the flexibility to arrive
at consensus on the character set issue in Amsterdam.  It allows
software to display Latin-1 characters to people, but encode them
in mail. It allows Gopher and HTTP to have slightly different
excluded sets. But it does define a canonical form for standard
interchange.  In other words, if you come across %20 in a URL or
you come across " " you must treat them the same. But if you come
across %2F and "/" you must treat them differently.  This allows us
latitude and also avoids the yukky prospect of escaped escape sequences]
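
To make the %20 versus %2F rule concrete, here is a minimal sketch in
Python of how two URL parts might be compared. It is an illustration
only, not part of any spec. The names RESERVED, comparison_form and
same are made up for the example. The reserved set is the one quoted
later in this message, reading "vline" as "|", and adding "%" to it is
my own assumption so that an escaped percent sign is never unescaped.

```python
import re

# Assumption: the reserved set listed later in this message, with
# "vline" read as "|". "%" is added so %25 is never turned back into
# a bare escape character.
RESERVED = set('{}/#?|[]\\^~<>%')

_ESCAPE = re.compile(r'%([0-9A-Fa-f]{2})')


def comparison_form(part):
    """Unescape non-reserved characters; keep escaped reserved ones."""
    def fix(match):
        ch = chr(int(match.group(1), 16))
        if ch in RESERVED:
            # An escaped reserved character differs in meaning from the
            # bare one, so leave it escaped (hex digits upper-cased).
            return '%' + match.group(1).upper()
        return ch
    return _ESCAPE.sub(fix, part)


def same(a, b):
    """True if the two parts must be treated the same."""
    return comparison_form(a) == comparison_form(b)
```

With this, same("a%20b", "a b") holds, while same("a%2Fb", "a/b") does
not. Note that this is a form for comparison only, not the canonical
interchange form.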

Now it is no good if some gateway unescapes the separator, so the separator
must be from, or added to, the set of reserved characters. Currently
this set is

	 { | } | / | # | ? | vline | [ | ] | \ | ^ | ~ | < | >

but does NOT include the "extra" characters (from Amsterdam)

 	! | * | " |  ' | ( | ) | : | ; | , | space

 where space is now disallowed in canonical form (Houston).
 What I suggest is that, since we are going to need some separator
 characters for other schemes just as we do now, we grab them now
 while we have the chance.  For example, suppose we take

 	! and *
	
 and say that they are reserved, and may not be used unescaped
 except when having special meanings (to be defined).  This would allow

  ftp://info.cern.ch/pub/www/doc/draft-www-bernerslee-uri-00.text!text
 
 The change to the safe character set would not affect very many URLs
 at this stage.
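
 As a rough sketch of what a client would do with this, assuming "!"
 is the separator and is reserved: split the ftppath at the last
 unescaped "!" and check the tail against the mode list. The function
 name split_ftppath is hypothetical, and raising an error on an
 unknown tail is just one possible choice, since the other meanings
 of "!" are still to be defined.

```python
# The four modes from the syntax above.
MODES = ("dir", "bin", "text", "tenex")


def split_ftppath(ftppath, separator="!"):
    """Return (fpath, mode); mode is None when it is omitted.

    Must be called on the still-escaped path. A literal "!" inside a
    file name arrives as %21, so it is never mistaken for the separator.
    """
    fpath, sep, tail = ftppath.rpartition(separator)
    if not sep:
        # No separator: the client has to guess the mode itself.
        return ftppath, None
    if tail not in MODES:
        # Assumption: an unescaped separator with an unknown tail is
        # treated as an error here.
        raise ValueError("unknown mode: %r" % tail)
    return fpath, tail
```

 For the example above, the path
 /pub/www/doc/draft-www-bernerslee-uri-00.text!text gives the fpath
 /pub/www/doc/draft-www-bernerslee-uri-00.text and the mode "text".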

 I wouldn't be in favor of doing this to ":" as I am sure it turns up
 in current URLs a lot more than "!".

 [The only alternative I see is to use as a separator any of the
 non-hex % sequences, which are currently all reserved, such as

   ftp://info.cern.ch/pub/www/doc/draft-www-bernerslee-uri-00.text%:text

 but this is, I think, messier.]

Summary of proposal:

	1.	! and *	reserved
	2.	separator is !
	
Ok?

Tim BL