email address in a URI

Hi all,

Someone has pointed out that tag URIs (taguri.org) are using an old 
notion of email address syntax rather than the latest standard (RFC 
2822). It seems that I have some time before we go to RFC to fix this.

I've hit some issues that I'm guessing will have to be addressed for 
mailto URIs. Since it's easy to get the details wrong, I'd appreciate 
comments on the following.

Currently in tag we have:

emailAddress = 1*(alphaNum /"-"/"."/"_") "@" DNSname

whereas RFC 2822 has:

addr-spec       =       local-part "@" domain
local-part      =       dot-atom / quoted-string / obs-local-part
dot-atom        =       [CFWS] dot-atom-text [CFWS]
dot-atom-text   =       1*atext *("." 1*atext)
atext           =       ALPHA / DIGIT / ; Any character except controls,
                         "!" / "#" /     ;  SP, and specials.
                         "$" / "%" /     ;  Used for atoms
                         "&" / "'" /
                         "*" / "+" /
                         "-" / "/" /
                         "=" / "?" /
                         "^" / "_" /
                         "`" / "{" /
                         "|" / "}" /
                         "~"

We only need to fix the "local-part" for tag.

I'd appreciate comments on the following replacement within
tag syntax, and the logic behind it:

emailAddress = dot-atom-text-uri "@" DNSname
dot-atom-text-uri = 1*atext-uri *("." 1*atext-uri)
atext-uri = ALPHA / DIGIT / ; see RFC 2822
             "!" / "$" /    ; only URI-compatible
             "&" / "'" /    ; characters included
             "*" / "+" /
             "-" / "/" /
             "=" / "_" /
             "~"
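
As a rough check of the grammar above, here is a minimal Python sketch of
the proposed local part as a regular expression. It assumes ALPHA and DIGIT
mean the ASCII letters and digits, and it leaves out DNSname, which is
defined elsewhere in the tag syntax. The names ATEXT_URI, DOT_ATOM_TEXT_URI
and is_tag_local_part are made up for illustration and are not part of any
spec.

```python
import re

# Characters allowed by the proposed atext-uri rule
# (assumption: ALPHA / DIGIT are ASCII letters and digits).
ATEXT_URI = r"[A-Za-z0-9!$&'*+\-/=_~]"

# dot-atom-text-uri = 1*atext-uri *("." 1*atext-uri)
DOT_ATOM_TEXT_URI = re.compile(rf"{ATEXT_URI}+(?:\.{ATEXT_URI}+)*")


def is_tag_local_part(local_part: str) -> bool:
    """Return True if local_part matches the proposed dot-atom-text-uri."""
    return DOT_ATOM_TEXT_URI.fullmatch(local_part) is not None
```

With this sketch, "tim.kindberg" and "o'brien+tag" are accepted, while
".tim", "tim..k", "a#b" and "a%41" are rejected.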

The logic behind the above is:
0. Avoid obsolete local parts, and local parts involving CFWS
(comments and white space, which could introduce ambiguity in who can
use which email addresses)
1. But enable as many email addresses allowed by RFC 2822 as possible
2. While ensuring RFC 3986 compatibility
3. Can't %-encode characters without ambiguity, since RFC 2822
allows email addresses containing % HEX HEX constructs
4. So we have to avoid "^" / "`" / "{" / "|" / "}", which cannot
appear unencoded in a URI
5. And it seems a bad idea to allow "#" / "%" / "?", which have
special meaning in URIs
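
Points 4 and 5 can be cross-checked mechanically. The following Python
sketch compares the RFC 2822 atext set with the proposed atext-uri set and
with the unreserved, gen-delims and sub-delims sets of the URI generic
syntax (RFC 3986). All variable names are illustrative, and ALPHA / DIGIT
are again assumed to be ASCII.

```python
import string

ALNUM = set(string.ascii_letters + string.digits)

# RFC 2822 atext: ALPHA / DIGIT plus this punctuation.
RFC2822_ATEXT = ALNUM | set("!#$%&'*+-/=?^_`{|}~")

# Proposed atext-uri from this message.
ATEXT_URI = ALNUM | set("!$&'*+-/=_~")

# Character classes from the URI generic syntax (RFC 3986).
UNRESERVED = ALNUM | set("-._~")
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")

# atext characters that have no literal form in a URI at all;
# "%" lands here because it only ever introduces a %-escape.
no_literal_form = RFC2822_ATEXT - (UNRESERVED | GEN_DELIMS | SUB_DELIMS)

# Everything the proposal drops from RFC 2822 atext.
excluded = RFC2822_ATEXT - ATEXT_URI

print("no literal form in a URI:", "".join(sorted(no_literal_form)))
print("dropped by the proposal: ", "".join(sorted(excluded)))
```

Under these assumptions, the dropped characters are exactly the five from
point 4 plus the three from point 5, and every remaining atext-uri
character is a legal literal in a URI.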

Cheers,

Tim.

-- 

Tim Kindberg
hewlett-packard laboratories
filton road
stoke gifford
bristol bs34 8qz
uk

purl.org/net/TimKindberg
timothy@hpl.hp.com
voice +44 (0)117 312 9920
fax +44 (0)117 312 8003

Received on Wednesday, 6 July 2005 09:56:55 UTC