Widely-Shared Strings-to-Objects Mappings
(The Semantic Web Identification Problem)


This document has no official standing. It stands on its own merits. See related writings.

Two Dimensions: Syntax and Semantics [theory]

This is a large subset of the language problem: we have a "string" (a sequence of symbols from some alphabet), and we want to give it to someone else with some degree of confidence about what they will learn from it. Our subset here is that we are interested only in identifying objects, not in conveying information about those objects.

The first tool is Syntax: we can divide the set of all possible strings into those which are productions of some grammar and those which are not. Often we can use a regular expression. The strings which are productions of the grammar are said to "conform to the syntax" or "be in the language."
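As a minimal sketch of the syntax tool, assuming a toy grammar invented for this illustration (dot-separated labels, each a lowercase letter followed by lowercase letters or digits), a regular expression divides strings into those in the language and those outside it:

```python
import re

# Hypothetical toy grammar, not taken from any standard:
# one or more labels separated by dots, each label a lowercase
# letter followed by lowercase letters or digits.
LABEL = r"[a-z][a-z0-9]*"
NAME = re.compile(rf"{LABEL}(?:\.{LABEL})*")

def conforms(s: str) -> bool:
    """Return True if s is a production of the toy grammar."""
    return NAME.fullmatch(s) is not None

print(conforms("w3.org"))   # True
print(conforms("w3..org"))  # False: empty label
print(conforms("3w.org"))   # False: label starts with a digit
```

Note that conforming to the syntax says nothing yet about what, if anything, a string denotes.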

The second tool is the idea that we can have a conceptual look-up table: each string appears once in the left column, and in the right column is the object we are identifying with that string. We could only physically build that table for some kinds of objects, but we can imagine it for everything. The string in the left column is called the "name" or "identifier" (or "symbol", "sign", etc). The thing-itself in the right column is the "denotation" of the string (or name, or identifier, etc). The table itself is a "name mapping" and is a representation of the semantics of a language.
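For the kinds of objects where the table can physically be built, a dictionary is one way to sketch it. The `Thing` class, the names, and the printer are all hypothetical stand-ins chosen for this illustration:

```python
class Thing:
    """Stand-in for an object being identified (hypothetical)."""
    def __init__(self, description: str) -> None:
        self.description = description

printer = Thing("the printer on the third floor")

# Left column: names. Right column: denotations.
# Each name appears once, but two names may share one denotation.
name_mapping = {
    "lp0": printer,
    "office-printer": printer,
}

def denotation(name: str):
    """Look up the object a name identifies; None if the name is undefined."""
    return name_mapping.get(name)

print(denotation("lp0") is denotation("office-printer"))  # True
```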

The term namespace is somewhat ambiguous, since it can refer to a name mapping, to the set of names with currently defined denotations, or to the set of names allowed by the syntax. Some people even refer to the name of an XML namespace as a namespace itself.

The Domain Name System (DNS) [practice]

The DNS gives us an extremely well-known, precisely defined, and open family of identification languages. Internet Standard STD0013 (RFCs 1034 and 1035, clarified in RFC 2181) defines the familiar syntax of strings like "w3.org" and an operational method for managing mappings from those strings to other strings (most commonly IP addresses). A variety of legal structures and registrars' databases define a more fundamental name mapping, from DNS names to the people or organizations who have legitimate authority to alter the various string mappings. We'll call this the "DNS owner name mapping".

What makes the DNS so interesting and useful is how open and operational it is. In general, anyone can establish an entry in the DNS owner name mapping for about $10/year and then publish entries for mapping strings->strings, which can be used by bazillions of people. There are of course some tricky spots, such as people suing to obtain other people's domain names, but in general the system works remarkably well for the problems it solves.

Leveraging and Combined Mappings [theory]

One of the reasons the DNS is so useful is that name mappings can "leverage" off of other mappings by being defined in terms of them. We can define a mapping from American corporations to their chief executives, and then leverage off the DNS and say things like "the chief executive of the DNS owner of IBM.COM". Using math function notation, we're saying f(dns_name)=g(owner(dns_name)), or, if we need something more operational, f(dns_name)=g(ip_address(dns_name)).
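A minimal sketch of leveraging, written as function composition over look-up tables. The tables, the company, and the person are hypothetical placeholders, and the helper name `leverage` is ours:

```python
from typing import Callable, Dict

# Hypothetical tables standing in for the real mappings.
owner: Dict[str, str] = {"example.com": "Example Corp"}
chief_executive: Dict[str, str] = {"Example Corp": "Pat Placeholder"}

def leverage(g: Dict[str, str], inner: Dict[str, str]) -> Callable[[str], str]:
    """Build f(name) = g(inner(name)) from two look-up tables."""
    def f(dns_name: str) -> str:
        # DNS names compare case-insensitively, so normalize first.
        return g[inner[dns_name.lower()]]
    return f

ceo_of_dns_owner = leverage(chief_executive, owner)
print(ceo_of_dns_owner("EXAMPLE.COM"))  # Pat Placeholder
```

The new mapping never stores DNS names itself; it changes automatically whenever the owner table changes.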

We can also leverage off multiple name mappings at once. Imagine a report of internet traffic between hosts. We can identify the bytes sent in one direction in some time period like bytes(ip_address(dns_name1), ip_address(dns_name2)). This is blindingly obvious in function notation, but might be missed in thinking about name mapping.
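The traffic-report case can be sketched the same way. All host names, addresses (taken from the 192.0.2.0/24 documentation range), and byte counts here are made up for illustration:

```python
# Hypothetical DNS name -> IP address mapping.
ip_address = {"a.example": "192.0.2.1", "b.example": "192.0.2.2"}

# Hypothetical report: bytes sent from the first address to the second
# in some time period.
traffic = {
    ("192.0.2.1", "192.0.2.2"): 1500,
    ("192.0.2.2", "192.0.2.1"): 400,
}

def bytes_sent(dns_name1: str, dns_name2: str) -> int:
    """bytes(ip_address(dns_name1), ip_address(dns_name2))"""
    return traffic[(ip_address[dns_name1], ip_address[dns_name2])]

print(bytes_sent("a.example", "b.example"))  # 1500
print(bytes_sent("b.example", "a.example"))  # 400
```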

With a few syntax tricks, any number of name mappings can be combined, just like functions, into a single name mapping. Math function notation uses the syntax trick of having expressions composed of expressions (a recursive definition), which is relatively hard to parse, but easy to use. Sometimes the set of allowable names omits some character which can simply be used as a delimiter. More generally, we can define a reversible transformation on the names which guarantees a delimiter character will be available. And we can define the delimiters in a more complex way, like saying end-quotes won't be recognized when preceded by a backslash.
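One possible reversible transformation is backslash escaping. The sketch below assumes a comma as the delimiter and a backslash as the escape character (both arbitrary choices), and round-trips any non-empty list of names:

```python
DELIM = ","
ESC = "\\"

def escape(name: str) -> str:
    """Reversible transformation: afterwards no unescaped DELIM remains."""
    return name.replace(ESC, ESC + ESC).replace(DELIM, ESC + DELIM)

def combine(names):
    """Join several names into a single combined name."""
    return DELIM.join(escape(n) for n in names)

def split_names(combined: str):
    """Invert combine(): a DELIM preceded by ESC is not a delimiter."""
    names, current, i = [], [], 0
    while i < len(combined):
        ch = combined[i]
        if ch == ESC and i + 1 < len(combined):
            current.append(combined[i + 1])  # escaped character, taken literally
            i += 2
        elif ch == DELIM:
            names.append("".join(current))   # unescaped delimiter ends a name
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    names.append("".join(current))
    return names

parts = ["a,b", "c\\d", ""]
print(split_names(combine(parts)) == parts)  # True
```

Escaping the escape character first is what makes the transformation reversible; without that step, a name ending in a backslash would swallow the delimiter that follows it.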

Proposed Solution #29

Syntax: URI
Examples: http://foo.com
Denotation: Depends on the scheme. The denotation of "data:,hello" is a sequence of five ASCII characters. The denotation of an HTTP URI is defined in RFC 2616 like this: "resource" is defined as "a network data object or service", and "The semantics are that the identified resource is located at the server listening for TCP connections on that port of that host, and the Request-URI for the resource is abs_path."

Syntax: URI-Reference
Examples: http://foo.com/#bar
Denotation: Disallowed. Instead, use literals (data:,) and define a mapping to whatever objects you intended the URI-References to denote.

Syntax: Block-Scope Identifier
Examples: <foo>
Denotation: Anything. An existential variable. Scope depends on the context. Might be a file or HTTP resource. If scopes are nested, then lexically identical identifiers in scopes which have an ancestor-descendant relationship are considered identical. That means an identifier used in sibling scopes is not shared unless it appears in a common ancestor. May be rewritten into Global-Scope form with no change in semantics if in an asserted context and all matching identifiers are rewritten to the same thing (i.e., may be Skolemized).

Syntax: Global-Scope Identifier
Examples: <foo,673d4a22-1b02-11d5-9472-0050ba428008>
Denotation: Anything. These are Skolem constants [with a syntax that suggests they are Skolem functions with constant arguments?!]. There's some argument they should still be variables, i.e., in a query context they'll match stuff other than themselves too - but in that case you have no constants.
{sandro@w3.org,2000}foo
NO, they're not -- they're still variables.... In a query context they'll match stuff. Uhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.
Denotation is universally fixed. (2-term form uses UUID as per opaquelocktoken, 3-term form uses authority and time, as per TANN.) Replacing all occurrences in the universe of one such identifier with another such identifier which has no occurrences in the universe does not alter the semantics of any statements.

Notes:
if there exists x,y,z such that
x is shared with ancestors, but not with siblings (unless they in turn share with the same ancestor).
if grandfather then parent ; father
I.e., you look to see if it occurs in the premise. Need compound head?
if grandfather then parent
if grandfather then father
The system KNOWS b is a skolem function of and . So it can keep track across rules.... Wow. Yeah, it works.
WSL 2 forms:
someuri
uri("someuri")
foo
("x","y").foo OR authority("x","y").foo
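The rewriting of Block-Scope identifiers into Global-Scope form can be sketched as follows. This assumes a single flat scope held in one string, assumes identifiers are written `<name>`, and uses random (version 4) UUIDs for the 2-term form; nested scopes and the 3-term form are not handled, and the function name `skolemize` is ours:

```python
import re
import uuid

# Assumed surface syntax for a block-scope identifier: <name>
BLOCK_ID = re.compile(r"<([A-Za-z_][A-Za-z0-9_]*)>")

def skolemize(scope_text: str) -> str:
    """Rewrite each <name> as <name,UUID>, giving every occurrence
    of the same name within this scope the same UUID."""
    fresh = {}

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in fresh:
            fresh[name] = uuid.uuid4()  # new globally unique term
        return f"<{name},{fresh[name]}>"

    return BLOCK_ID.sub(replace, scope_text)

print(skolemize("<a> knows <b>; <b> knows <a>"))
```

Because all matching identifiers are rewritten to the same thing, the two occurrences of `<a>` stay linked after the rewrite, while running the function on a sibling scope would give its `<a>` a different UUID.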

Sandro Hawke
$Date: 2001/03/23 19:17:21 $

forall x,y if f(x,y) then there exists z such that g(x,z)

Can I rewrite that as

if there exists x,y such that f(x,y) then there exists ....

Sure.

INFERENCE RULE:
premise - set of triples, some vars
conclusion - triple, possibly some same vars, possibly some different.

The conclusion MIGHT need a set of triples to give us scope. Without it, you don't know which variables matter. Maybe that's okay.

if ya x,y | f(x,y) then q(z,x)
if ya x,y | f(x,y) then q(z,x)

if wants shutdown then from ;
if wants shutdown then active now;
if wants shutdown; is amazing then status amazing;

now
a wants shutdown
b wants shutdown
b is amazing

sk1 from a
sk1 active now
sk2 from b
sk2 active now
sk2 status amazing

How do we know it's sk2? Because we know shutdown-request depends on x.

if wants shutdown; is amazing then status amazing;

That gives us sk3 status amazing.

So it works if you link the names.

for all x if there exists l2 such that lx wants shutdown and lx=x then there exists mx such that mx=lx []

Something like that -- there are a lot of such-that-exivar-equals-univar.

if wants shutdown then from ;
if wants shutdown then active now;
if wants shutdown; is amazing then status amazing;

for all x if there exists x1 such that x1 = x and x1 wants shutdown then there exists shutdown_request such that shutdown_request=skolem("x", x) and shutdown_request from x.

I.e., conclusion-only variables are made skolems of all the universal variables named in the premise.

Did proto16 need to know the univar? I can do:

rule "everyone is loved by someone" ( ==> !foo loves $x );
rule "indirect love" ( $x loves $y; $y loves $z ==> $x likes $z );
rule "indirect like" ( $x likes $y; $y likes $z ==> $x knows $z );
rule "output" ( $someone knows bill ==> return done done );

with <*>. It's for making gigantic ground facts, if we want that for some reason.

No, we can do that with:

rule "everyone is loved by someone" ($x equals $x ==> $foo loves $x)

We'd like to say f(x,y) = f(y,x). No, that's no problem: if f(x,y)=z then f(y,z)=z and the reverse.

So a rule has
a set-of-triples premise,
a triple conclusion,
a syntax for recognizing universal variables,
and a practice of handling conclusion variables differently when they appear in the premise,
and is equivalent to FOPC. That'll be nice to prove.
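The idea that conclusion-only variables become skolems of the universal variables in the premise can be tried out with a small sketch. It is not proto16 and makes several assumptions of its own: triples are tuples of strings, `$x` marks a universal variable, `!req` marks a conclusion-only variable, conclusions may contain several triples, and rules are applied in a single pass rather than to a fixed point:

```python
def match(pattern, fact, binding):
    """Extend binding so that pattern matches fact; return None on failure."""
    binding = dict(binding)
    for term, value in zip(pattern, fact):
        if term.startswith("$"):              # universal variable
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:                   # constants must match exactly
            return None
    return binding

def apply_rule(premise, conclusion, facts):
    """Derive triples from one rule in a single pass over the facts."""
    bindings = [{}]
    for pattern in premise:
        bindings = [extended
                    for b in bindings
                    for fact in facts
                    if (extended := match(pattern, fact, b)) is not None]
    derived = set()
    for b in bindings:
        # Skolem arguments: every universal variable bound by the premise.
        args = ",".join(b[var] for var in sorted(b))

        def ground(term):
            if term.startswith("$"):
                return b[term]
            if term.startswith("!"):          # conclusion-only variable
                return f"skolem({term[1:]},{args})"
            return term

        for triple in conclusion:
            derived.add(tuple(ground(t) for t in triple))
    return derived

facts = {("a", "wants", "shutdown"),
         ("b", "wants", "shutdown"),
         ("b", "is", "amazing")}
rules = [
    ([("$x", "wants", "shutdown")],
     [("!req", "from", "$x"), ("!req", "active", "now")]),
    ([("$x", "wants", "shutdown"), ("$x", "is", "amazing")],
     [("!req", "status", "amazing")]),
]
derived = set()
for premise, conclusion in rules:
    derived |= apply_rule(premise, conclusion, facts)
for triple in sorted(derived):
    print(" ".join(triple))
```

Because both rules name the conclusion-only variable `!req` and bind only `$x`, the request derived for b in the second rule is the same term as the one derived for b in the first, which is the "link the names" behavior the notes arrive at.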
