This document has no official standing. It stands on its own merits. See related writings.
This is a large subset of the language problem: we have a "string" (a sequence of symbols from some alphabet), and we want to give it to someone else and have some degree of confidence about what they will learn from it. Our subset here is that we're only interested in identifying objects, not in conveying information about the objects.
The first tool is Syntax: We can divide up the set of all possible strings into those which are/are-not productions of some grammar. Often we can use a regular expression. The strings which are productions of the grammar are said to "conform to the syntax" or "be in the language."
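As a rough illustration of the syntax tool, the sketch below uses a regular expression to split strings into those that are in a language and those that are not. The pattern is an assumed, simplified approximation of dot-separated name labels, not the full grammar from the DNS RFCs, and `LABEL`, `NAME` and `conforms` are names invented here.

```python
import re

# Assumed toy grammar: one or more dot-separated labels, each made of
# letters, digits and interior hyphens. This is NOT the full DNS syntax.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
NAME = re.compile(rf"{LABEL}(?:\.{LABEL})*")


def conforms(s: str) -> bool:
    """Return True if s is a production of the toy grammar."""
    return NAME.fullmatch(s) is not None


# Partition some candidate strings into in-language / not-in-language.
for candidate in ["w3.org", "IBM.COM", "not a name", "-bad-.org"]:
    print(candidate, conforms(candidate))
```

Note that conforming to the syntax says nothing yet about whether the string denotes anything.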
The second tool is the idea that we can have a conceptual look-up table: each string appears once in the left column, and in the right column is the object we are identifying with that string. We could only physically build that table for some kinds of objects, but we can imagine it for everything. The string in the left column is called the "name" or "identifier" (or "symbol", "sign", etc). The thing-itself in the right column is the "denotation" of the string (or name, or identifier, etc). The table itself is a "name mapping" and is a representation of the semantics of a language.
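For the kinds of objects where the table can actually be built, a name mapping might be sketched as a dictionary. Everything here is hypothetical: the `Organization` class merely stands in for "the thing itself", and the entries are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Organization:
    """Stand-in for the thing itself; a real denotation need not be data."""
    label: str


# Left column: names. Right column: denotations. Dictionary keys are unique,
# so each name appears exactly once. All entries are invented.
name_mapping = {
    "acme": Organization("Acme Corp (invented)"),
    "globex": Organization("Globex (invented)"),
}


def denotation(name: str) -> Organization:
    """Look up the object identified by name."""
    if name not in name_mapping:
        raise KeyError(f"{name!r} has no defined denotation")
    return name_mapping[name]


# The set of names with currently defined denotations.
defined_names = set(name_mapping)
```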
The term namespace is somewhat ambiguous, since it can refer to a name mapping, to the set of names with currently defined denotations, or to the set of names allowed by the syntax. Some people even refer to the name of an XML namespace as a namespace itself.
The DNS gives us an extremely well-known, precisely defined, and open family of identification languages. Internet Standard STD 13 (RFCs 1034 and 1035, clarified in RFC 2181) defines the familiar syntax of strings like "w3.org" and an operational method for managing mappings from those strings to other strings (most commonly IP addresses). A variety of legal structures and registrars' databases define a more fundamental name mapping, from DNS names to people/organizations who have legitimate authority to alter the various string mappings. We'll call this the "DNS owner name mapping".
What makes the DNS so interesting and useful is how open and operational it is. In general, anyone can establish an entry in the DNS owner name mapping for about $10/year and then publish entries for mapping strings->strings, which can be used by bazillions of people. There are of course some tricky spots, such as people suing to obtain other people's domain names, but in general the system works remarkably well for the problems it solves.
One of the reasons the DNS is so useful is that name mappings can "leverage" off of other mappings by being defined in terms of them. We can define a mapping from American corporations to their chief executives, and then leverage off the DNS and say things like "the chief executive of the DNS owner of IBM.COM". Using math function notation, we're saying f(dns_name)=g(owner(dns_name)) or, if we need something more operational, f(dns_name)=g(ip_address(dns_name)).
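A minimal sketch of leveraging one mapping off another, assuming both mappings can be written down as small tables. The domain, owner and executive below are invented placeholders, not real registry data.

```python
# Invented data: a real DNS owner name mapping lives in legal structures and
# registrars' databases, not in a dictionary.
owner = {"example.com": "Example Corp"}          # DNS name -> owner
chief_executive = {"Example Corp": "A. Person"}  # g: corporation -> executive


def f(dns_name: str) -> str:
    """f(dns_name) = g(owner(dns_name))"""
    return chief_executive[owner[dns_name]]


print(f("example.com"))
```

The new mapping `f` never stores its own table of DNS names; it is defined entirely in terms of the two mappings it leverages.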
We can also leverage off multiple name mappings at once. Imagine a report of internet traffic between hosts. We can identify the bytes sent in one direction in some time period like bytes(ip_address(dns_name1), ip_address(dns_name2)). This is blindingly obvious in function notation, but might be missed in thinking about name mapping.
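The traffic report can be sketched the same way, with a two-argument mapping built on top of the one-argument one. The host names, addresses (taken from the 192.0.2.0/24 documentation range) and byte counts are all made up for illustration.

```python
# Invented data for illustration only.
ip_address = {"a.example": "192.0.2.1", "b.example": "192.0.2.2"}

# Bytes sent from the first address to the second in some time period.
bytes_sent = {
    ("192.0.2.1", "192.0.2.2"): 1500,
    ("192.0.2.2", "192.0.2.1"): 300,
}


def traffic(dns_name1: str, dns_name2: str) -> int:
    """bytes(ip_address(dns_name1), ip_address(dns_name2))"""
    return bytes_sent[(ip_address[dns_name1], ip_address[dns_name2])]


print(traffic("a.example", "b.example"))
print(traffic("b.example", "a.example"))
```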
With a few syntax tricks, any number of name mappings can be combined, just like functions, into a single name mapping. Math function notation uses the syntax trick of having expressions composed of expressions (a recursive definition), which is relatively hard to parse, but easy to use. Sometimes the set of allowable names omits some character which can simply be used as a delimiter. More generally, we can define a reversible transformation on the names which guarantees a delimiter character will be available. And we can define the delimiters in a more complex way, like saying end-quotes won't be recognized when preceded by a backslash.
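One possible reversible transformation is backslash escaping, sketched below. The choice of `,` as delimiter and `\` as escape character is an assumption made for this example, and `escape`, `combine` and `split` are names invented here.

```python
from typing import List

DELIM = ","  # assumed delimiter
ESC = "\\"   # assumed escape character


def escape(name: str) -> str:
    """Reversible transformation: afterwards every DELIM in name is escaped."""
    return name.replace(ESC, ESC + ESC).replace(DELIM, ESC + DELIM)


def combine(names: List[str]) -> str:
    """Join one or more names into a single combined name."""
    return DELIM.join(escape(n) for n in names)


def split(combined: str) -> List[str]:
    """Inverse of combine: a delimiter preceded by ESC is not recognized."""
    parts, current, i = [], [], 0
    while i < len(combined):
        ch = combined[i]
        if ch == ESC and i + 1 < len(combined):
            current.append(combined[i + 1])  # escaped character, taken literally
            i += 2
        elif ch == DELIM:
            parts.append("".join(current))   # unescaped delimiter ends a name
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts
```

For a non-empty list of names, `split(combine(names))` gives back the original list, even when the names themselves contain commas or backslashes.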
Syntax | Examples | Denotation |
---|---|---|
URI | http://foo.com | Depends on the scheme. The denotation of "data:,hello" is a sequence of five ASCII characters. The denotation of an HTTP URI is defined in RFC 2616 like this: "resource" is defined as "a network data object or service", and "The semantics are that the identified resource is located at the server listening for TCP connections on that port of that host, and the Request-URI for the resource is abs_path." |
URI-Reference | http://foo.com/#bar | Disallowed. Instead, use literals (data:,) and define a mapping to whatever objects you intended the URI-References to denote. |
Block-Scope Identifier | <foo> | Anything. An existential variable. Scope depends on the context. Might be a file or HTTP resource. If scopes are nested, then lexically identical identifiers in scopes which have an ancestor-descendant relationship are considered identical. That means an identifier used in sibling scopes is not shared unless it appears in a common ancestor. May be rewritten into Global-Scope form with no change in semantics if in an asserted context and all matching identifiers are rewritten to the same thing (i.e. may be Skolemized). |
Global-Scope Identifier | <foo,673d4a22-1b02-11d5-9472-0050ba428008> <foo,sandro@w3.org,2001> | Anything. These are Skolem constants [with a syntax that suggests they are Skolem functions with constant arguments?!]. There's some argument they should still be variables, i.e. in a query context they'll match stuff other than themselves too - but in that case you have no constants. {sandro@w3.org,2000}foo NO, they're not -- they're still variables.... In a query context they'll match stuff. Uhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh. Denotation is universally fixed. (2-term form uses UUID as per opaquelocktoken, 3-term form uses authority and time, as per TANN.) Replacing all occurrences in the universe of one such identifier with another such identifier which has no occurrences in the universe does not alter the semantics of any statements. |
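
The rewriting of Block-Scope identifiers into the 2-term Global-Scope form might be sketched as follows. This is only an illustration under several assumptions: a single flat scope (nested scopes are not handled), an invented statement syntax, a guessed pattern for what counts as an identifier, and random UUIDs from Python's `uuid.uuid4` rather than whatever generation scheme opaquelocktoken prescribes.

```python
import re
import uuid

# Assumed shape of a Block-Scope identifier: <name> with no comma inside.
BLOCK_ID = re.compile(r"<([A-Za-z_][A-Za-z0-9_]*)>")


def skolemize(text: str) -> str:
    """Rewrite each <name> in one scope to a 2-term <name,uuid> form."""
    fresh = {}  # one fresh UUID per distinct identifier in this scope

    def rewrite(match):
        name = match.group(1)
        if name not in fresh:
            fresh[name] = uuid.uuid4()
        # All matching identifiers are rewritten to the same thing.
        return f"<{name},{fresh[name]}>"

    return BLOCK_ID.sub(rewrite, text)


print(skolemize("<foo> knows <bar>. <foo> is old."))
```

Identifiers already in Global-Scope form contain a comma, so the assumed pattern leaves them untouched.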
Sandro Hawke
$Date: 2001/03/23 19:17:21 $