Semantic Web Abstract Messaging Protocol (SWAMP)

Status

Personal draft; parts nicely polished, but other parts may be unintelligible.

Abstract

We introduce a practical approach to protocol abstraction, viewing all messages as special-case serializations of RDF graphs. This allows a clean separation of protocol semantics from other protocol details. Web servers are examined in terms of the serialized RDF graphs they receive and transmit, and their function as managed knowledge repositories is explored. The need for formally-described (de)serialization modules is discussed.

The Pitch

The SWAMP View: consider all interprocess communication in terms of what knowledge the receiver is intended to learn, using the "RDF Model" for knowledge. All message-passing protocols are concrete instantiations of SWAMP. We can write drivers for each concrete protocol, and new network apps can be written at the exchange-of-knowledge level. We get interoperability and protocol abstraction at O(n) cost (one driver per protocol, rather than one translator per pair of protocols) and with no need for universal adoption of a knowledge format or agent language.

With SWAMP, web servers turn into RDF databases which happen to speak the HTTP instantiation of SWAMP. Web services servers turn into the domain-specific agents they should be; they can talk over the SOAP instantiation of SWAMP if that makes sense in their environment. In any case, the fundamental behaviors are described in terms of knowledge learned (and possible external actions).
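As a rough illustration of the one-driver-per-protocol idea, here is a minimal sketch in Python. Both concrete syntaxes ("lines" and "json") and all function names are invented for this example; they are not part of SWAMP. Each driver only knows how to map its own syntax to and from a set of triples, and translation between any two syntaxes goes through that common graph.

```python
import json

# A graph is modelled as a frozenset of (subject, predicate, object) strings.

def decode_lines(text):
    """Toy syntax A (hypothetical): one 'subject predicate object' per line."""
    return frozenset(
        tuple(line.split(" ", 2)) for line in text.splitlines() if line.strip()
    )

def encode_lines(graph):
    return "\n".join(" ".join(triple) for triple in sorted(graph))

def decode_json(text):
    """Toy syntax B (hypothetical): a JSON array of 3-element arrays."""
    return frozenset(tuple(triple) for triple in json.loads(text))

def encode_json(graph):
    return json.dumps(sorted(graph))

# One (decode, encode) driver per concrete syntax: n entries for n syntaxes.
DRIVERS = {
    "lines": (decode_lines, encode_lines),
    "json": (decode_json, encode_json),
}

def translate(text, source, target):
    """Convert between any two syntaxes by pivoting through the graph."""
    decode, _ = DRIVERS[source]
    _, encode = DRIVERS[target]
    return encode(decode(text))

converted = translate("ex:msft stock:ticker MSFT", "lines", "json")
```

Adding a third syntax means adding one more entry to DRIVERS, not one translator for every existing syntax.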

Background

There are two basic abstractions of communication: message-passing and shared-memory. With message-passing, we consider communication to be the act of one being passing some message to another. With shared-memory, we consider communication to be the act of affecting one's environment in a way which is perceived by another. These are both conceptual abstractions; some real-world actions can be seen in either light.

The Internet is fundamentally built as a messaging system. The Web is built on top of it using both shared-memory and message-passing components. The most basic web operation, fetching and displaying a web page, is the way we "perceive our environment", retrieving a piece of shared memory. Managing the other half of the shared-memory system, sometimes called "authoring" or "content-creation", is delegated to each web server. HTTP and HTML provide POST (and forms) as a way to send a message to a web server.

The Semantic Web can be imagined either as a society of beings passing around (machine-understandable) messages, or as a shared environment which beings can consult for information (and sometimes affect with something like HTTP POST, but that's up to each web server).

There is a meeting place between these models. We can view the "semantic web servers" as a class of beings responsible for maintaining the shared memory. To fetch some information, you send a request to one of these repositories. To publish some information, you send a different kind of request.
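The meeting place between the two models might be sketched like this: a repository is just an agent that receives messages, and "fetch" and "publish" are two kinds of request. This is a simplification under stated assumptions: messages here are plain dictionaries rather than RDF graphs, and the names Repository, "kind", "pattern" and accepts_publish are hypothetical.

```python
class Repository:
    """A toy 'semantic web server': an agent that maintains shared memory."""

    def __init__(self, accepts_publish=True):
        self.triples = set()
        # Whether publishing is allowed is a policy left to each server.
        self.accepts_publish = accepts_publish

    def handle(self, message):
        kind = message["kind"]
        if kind == "fetch":
            pattern = message["pattern"]  # None acts as a wildcard
            found = [
                triple for triple in self.triples
                if all(want is None or want == got
                       for want, got in zip(pattern, triple))
            ]
            return {"kind": "result", "triples": sorted(found)}
        if kind == "publish":
            if not self.accepts_publish:
                return {"kind": "refused", "triples": []}
            self.triples.update(message["triples"])
            return {"kind": "accepted", "triples": []}
        raise ValueError("unknown message kind: %r" % kind)

repo = Repository()
repo.handle({"kind": "publish",
             "triples": [("ex:msft", "stock:ticker", "MSFT")]})
reply = repo.handle({"kind": "fetch", "pattern": ("ex:msft", None, None)})
```

From the outside the repository looks like shared memory; from the inside it is message handling all the way down.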

Messaging

@@ integrate old explanation: http://www.w3.org/2001/05/14/swarch/. It does not include bNodes. They complicate things a bit.

@@ integrate or point to http://www.w3.org/2001/12/semrun/, which has most technical details. But it's missing an ontology of messaging, falling slightly below that.

Message sending API? (1) Prepare two graphs which, between them, describe some message: its source, destination, reply-to, and so on. (2) One graph is the message itself; the other is used at the local end to help guide it, and may contain information that cannot be encoded using the protocol.
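One possible shape for such a two-graph API is sketched below. Everything in it is an assumption made for illustration: the msg: and local: property names, the "_:msg" node label, and the prepare/send/transmit functions are not defined anywhere in this draft.

```python
MSG = "_:msg"  # blank node standing for the message itself

def prepare(source, destination, reply_to, body, local_hints):
    """Build the message graph and the local-only guidance graph."""
    message_graph = {
        (MSG, "msg:source", source),
        (MSG, "msg:destination", destination),
        (MSG, "msg:replyTo", reply_to),
    } | set(body)
    # Guidance may hold things the wire protocol cannot express.
    guidance_graph = {(MSG, prop, value) for prop, value in local_hints.items()}
    return message_graph, guidance_graph

def send(message_graph, guidance_graph, transmit):
    """Hand the message graph to a transport; guidance never leaves this end."""
    hints = {prop: value for (_, prop, value) in guidance_graph}
    transmit(sorted(message_graph), hints)

sent = []
message, guidance = prepare(
    "ex:me", "ex:quoteAgent", "ex:me",
    body=[("ex:msft", "stock:ticker", "MSFT")],
    local_hints={"local:timeoutSeconds": "5"},
)
send(message, guidance, lambda triples, hints: sent.append((triples, hints)))
```

The transmit callable stands in for whichever concrete protocol driver is chosen; it sees the hints but only serializes the message graph.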

In practice, when talking about messages, you use a prototype with some parts missing, or some other ontological trick. What is our ontology of messaging? What are some of our ontologies of particular instance protocols?

Concrete Instance Syntaxes

In the abstract, passing a SWAMP message indicates that certain RDF triples are "true". Free of any shared context between sender and receiver, that's impossible. With enough shared context, it may involve transmitting only one bit. In practice, we'll be somewhere in the middle.
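The one-bit extreme can be made concrete with a small sketch. Here the shared context is assumed to be an agreed, ordered list of candidate graphs, so a message is just an index into that list; the stock:trend property and the function names are made up for the example.

```python
def bits_needed(candidates):
    """Bits required to pick one graph out of the shared candidate list."""
    return (len(candidates) - 1).bit_length()

def encode(graph, candidates):
    """Sender: the whole message is the index of the intended graph."""
    return candidates.index(graph)

def decode(index, candidates):
    """Receiver: expand the index back into the triples asserted as true."""
    return candidates[index]

price_down = frozenset({("ex:msft", "stock:trend", "down")})
price_up = frozenset({("ex:msft", "stock:trend", "up")})

# Context both ends already agree on, before any message is sent.
shared_context = [price_down, price_up]

wire_value = encode(price_up, shared_context)
learned = decode(wire_value, shared_context)
```

With two candidates, one bit is enough; with less shared context the candidate list grows and so does the message, until the sender has to spell out the triples themselves.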

Each instance requires the creation of software proxies. The proxies can be declarative or procedural, I suppose. The question is: does it all go in the address? We don't so much have an address as a general description of the destination agent, which might include which protocols it's available via, and the details for those protocols.
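A description of the destination agent, as opposed to a bare address, might look something like the following sketch. The swamp: vocabulary, the agent name, and the quotes.example host are all hypothetical; the point is only that the sender picks a protocol by consulting the description against the drivers it has.

```python
AGENT = "ex:quoteAgent"

# What we know about the destination: two ways of reaching it.
DESCRIPTION = {
    (AGENT, "swamp:availableVia", "_:ep1"),
    ("_:ep1", "swamp:protocol", "udp"),
    ("_:ep1", "swamp:host", "quotes.example"),
    ("_:ep1", "swamp:port", "4000"),
    (AGENT, "swamp:availableVia", "_:ep2"),
    ("_:ep2", "swamp:protocol", "http"),
    ("_:ep2", "swamp:url", "http://quotes.example/q"),
}

def endpoints(graph, agent):
    """Collect each endpoint node of the agent as a property dictionary."""
    nodes = sorted(o for (s, p, o) in graph
                   if s == agent and p == "swamp:availableVia")
    return [{p: o for (s, p, o) in graph if s == node} for node in nodes]

def choose_endpoint(graph, agent, supported):
    """Return the first endpoint matching the sender's protocol preferences."""
    available = endpoints(graph, agent)
    for protocol in supported:
        for endpoint in available:
            if endpoint.get("swamp:protocol") == protocol:
                return endpoint
    return None

chosen = choose_endpoint(DESCRIPTION, AGENT, ["http", "udp"])
```

Here the protocol details live in the description graph rather than in the address, which is one possible answer to the question above, not necessarily the right one.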

Use case: query a stock price using (a) SemXML/UDP and (b) GET http://finance.yahoo.com/q?s=msft | perl -nle 'if (m/(\d+\.\d\d)/) { print $1; }'


    # Load a template/default message.
    # (rdf_load, set and send are hypothetical API functions.)
    my $message = rdf_load("http://...//scrape_stocks_from_yahoo");

    # Fill it in.
    set($message, "stock:ticker", "MSFT");

    # "Send" it, which involves looking in the database for everything
    # known about it (UDP/TCP, etc.).
    send($message);

Response issues? The API needs ways to handle the response, if it comes, via the various TellByX mechanisms.