SemRun: A Basic RDF-Based Service Provider

Project Status: experimental. First draft & presentation mid-December 2001. On hold for a few months. 3/1/2002, it still looks good. Touched it up a bit.

State of the Code: implements an older version of the model; needs to be updated

Overview

semrun is a general-purpose semantic web service provider (aka "agent"). You invoke it with the address of an RDF file containing instructions for it, and it follows those instructions. It's like a programming language interpreter (Perl, Python), but its language is purely declarative (like pure Prolog) and geared more for the machine than the programmer (like Java bytecode).

The language can be extended internally, by providing rules which map your new constructs onto existing ones, or externally, by adding code to semrun, teaching it new primitives. For now, the primitives are divided into two classes: those which are basic in every service provider, and those which are particular to specialized providers. Specialized providers might offer primitives giving access to the network or local computing resources (filesystem, display, special hardware), or higher-performance shortcuts for clusters of rules.

semrun treats its RDF input as a knowledge base to be queried for knowledge of things which it cares about. The existence of objects with rdf:type sr:Request, sr:Conditional, and sr:Import fundamentally guides its behavior.

Here is the basic algorithm:

Loop:

Query for new sr:Imports. If well-formed, attempt to fetch the named RDF content, and merge it into the knowledge base. If this fails, abort.

Query for new sr:Conditionals. If well-formed, add them as rules to the knowledge base (or something like that, depending on the technology being used).

Query for new sr:Requests. Attempt to perform each one; attach the results (success, failure, possible additional data) as a response.

If none of these queries had any results, exit (we've done as requested!), else loop back and see if the new results of requests give us new things to do.
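The loop above can be sketched in a few lines of Python. This is only an illustration, not the Perl prototype: the knowledge base is a plain set of (subject, predicate, object) tuples, the property names sr:source, sr:action and sr:response are made up for the example, and fetch and perform are caller-supplied stand-ins for the real fetching and action code.

```python
# Sketch only. Triples are (subject, predicate, object) tuples in a set.
# sr:source and sr:action are hypothetical property names.

RDF_TYPE = "rdf:type"


def instances_of(kb, cls):
    """Return the subjects whose rdf:type is cls."""
    return {s for (s, p, o) in kb if p == RDF_TYPE and o == cls}


def value_of(kb, subject, predicate):
    """Return one object for (subject, predicate), or None if absent."""
    for (s, p, o) in kb:
        if s == subject and p == predicate:
            return o
    return None


def run(kb, fetch, perform):
    """Loop until no new Imports, Conditionals or Requests turn up.

    fetch(address) returns a set of triples, or raises on failure.
    perform(request, action) returns a set of response triples.
    """
    seen = set()
    rules = []
    while True:
        progress = False
        for imp in sorted(instances_of(kb, "sr:Import") - seen):
            seen.add(imp)
            progress = True
            source = value_of(kb, imp, "sr:source")
            if source is not None:        # treat as well-formed
                kb |= fetch(source)       # an exception here aborts the run
        for cond in sorted(instances_of(kb, "sr:Conditional") - seen):
            seen.add(cond)
            progress = True
            rules.append(cond)            # stand-in for installing a rule
        for req in sorted(instances_of(kb, "sr:Request") - seen):
            seen.add(req)
            progress = True
            action = value_of(kb, req, "sr:action")
            if action is not None:
                kb |= perform(req, action)  # attach the response
        if not progress:
            return kb, rules
```

Because responses are merged back into the same knowledge base, a response that itself contains an sr:Request or sr:Import is picked up on the next pass, which is the "loop back" step.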

The Code

The current prototype code is in Perl, using RDF::Pool and xsb for all the heavy lifting.

Basic Guiding Objects

Import

Merge in another RDF Graph, fetched off the web. A cached copy is used unless it is older than some given max-age or its TTL has elapsed. (@@ Is there any reason this is Basic, not some kind of Request? I'm not sure.)
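The cache test might look something like the following. The function name and its parameters are invented for this sketch; it assumes max-age comes from whoever asked for the Import and the TTL comes from the source of the cached copy, with both measured in seconds.

```python
import time


def needs_fetch(cached_at, max_age, ttl, now=None):
    """Return True if the graph should be fetched again.

    cached_at: time the cached copy was stored, or None if there is none
    max_age:   oldest copy the requester will accept, in seconds
    ttl:       lifetime the source gave the copy, in seconds
    """
    if cached_at is None:
        return True
    if now is None:
        now = time.time()
    age = now - cached_at
    return age > max_age or age >= ttl
```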

Request

Attempt to perform some Action (see below for some Actions). A response to the request is linked off the request object.

It is important to make clear which requests are non-committal. These are like HTTP GET. The only actions they may perform are TellByReply and other non-committal actions.

Conditional

Used for Horn-style inference (in the style of cwm --think with log:implies).
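For readers who haven't seen this style of inference, here is a minimal forward-chaining sketch. It is not how semrun or cwm does it; it just assumes a rule is a (body, head) pair of triple patterns, that every term is a string, and that terms starting with "?" are variables.

```python
def match(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple, or return None."""
    result = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def solutions(body, kb, bindings=None):
    """Yield every variable binding that satisfies all patterns in body."""
    bindings = {} if bindings is None else bindings
    if not body:
        yield bindings
        return
    for triple in kb:
        extended = match(body[0], triple, bindings)
        if extended is not None:
            yield from solutions(body[1:], kb, extended)


def think(kb, rules):
    """Apply (body, head) rules until no new triples appear."""
    kb = set(kb)
    while True:
        new = set()
        for body, head in rules:
            for b in solutions(body, kb):
                for pattern in head:
                    new.add(tuple(b.get(term, term) for term in pattern))
        if new <= kb:
            return kb
        kb |= new
```

A rule with body [("?x", "parent", "?y"), ("?y", "parent", "?z")] and head [("?x", "grandparent", "?z")] would add a grandparent triple for every two-step parent chain in the knowledge base.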

Basic Actions

TellByEMail

Send an RDF Graph by e-mail.

TellByUDP

Send an RDF Graph in a UDP packet. Note that UDP, unreliable as it is, may be completely appropriate: if service providers are not perfect, reliability has to be handled end-to-end anyway.

TellByPost

Send an RDF Graph in an HTTP POST operation.

TellByReply

Send an RDF Graph back by the natural reply channel (if any), based on how the input arrived.

Some Experimental Specialized Actions

GetRDF

Get an RDF graph from somewhere on the web and load it in, reified. This can be used as a safe, controlled Import, when de-reification is done by conditionals.

StoreRDF

GetContent, PutContent, PostContent

Get/Put some content (a string of bytes + media-type) from the web, or Post as an HTML Form would do.

Halt

ConsolePrint

InvokeSimpleWebService

TCPStartListen, TCPStopListen, TCPWrite

OpenWindow, CloseWindow

Create (destroy) a graphic display window on the console, whose behavior can be described/controlled. Here's the demo: semrun quakeworld.rdf.

Security Considerations

The input to semrun is trusted. Even without any external extensions (such as those giving access to local files), malicious input can, at the very least, send e-mail. In practice, input should be treated as if it were machine code; don't semrun anything from an untrusted source!

If you want to handle untrusted RDF, you need to GetRDF it into a reified graph, then have some security rules to decide what to actually assert.
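As a rough illustration of that two-step approach, the sketch below keeps untrusted triples as inert statement records and only asserts the ones a policy accepts. The policy here (an allow-list of predicates, plus a flat refusal to create any of the three guiding objects) is just an example I made up, and the function names are hypothetical; real security rules would presumably be written as Conditionals.

```python
# The three classes that guide semrun's behavior; untrusted input
# must never be allowed to create instances of them.
GUIDING = {"sr:Request", "sr:Conditional", "sr:Import"}


def reify(triples, source):
    """Wrap each triple as a statement record tagged with its origin."""
    return [{"subject": s, "predicate": p, "object": o, "source": source}
            for (s, p, o) in triples]


def is_safe(stmt, allowed_predicates):
    """Example policy: allow-listed predicates only, no guiding objects."""
    if stmt["predicate"] == "rdf:type" and stmt["object"] in GUIDING:
        return False
    return stmt["predicate"] in allowed_predicates


def assert_safe(kb, statements, allowed_predicates):
    """Return kb plus the de-reified statements the policy accepts."""
    accepted = {(st["subject"], st["predicate"], st["object"])
                for st in statements if is_safe(st, allowed_predicates)}
    return kb | accepted
```

The point of the reified form is that a statement record describing an sr:Request is just data; nothing in the main loop will act on it unless a rule chooses to assert it.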

Just-In-Time Information Providing

Nearly all the XMethods Sample Web-Services are, in concept, specialized queries: get current temperature for a zip code, get current stock price for a ticker symbol, get Barnes&Noble price for a book, etc. These could be used via an InvokeSimpleWebService request, but a more interesting and elegant approach is to have the client simply try to use the data, and have a Web Service invocation happen, when needed, to retrieve the data. This is the kind of remote-query-on-demand where having monotonic open-world semantics pays off, especially if we model their notion of "current" properly.

This could be done as a Conditional which matches on other Conditionals, or with a hook inside the inference system (set up by some other Basic Guiding Object). I think it's an important design point that one be able to catch all uses via searching for Conditionals. You can say "Tell me MSFT is at 62.525" but you can only ask for the real value if you use a conditional.
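Here is a toy version of the "hook inside the inference system" option, just to show the shape of query-on-demand. Everything in it is an assumption: ask, the providers table, and the ex:price predicate are invented, and a provider is any function from a subject to a list of values. It also ignores the hard part, which is modelling "current"; once a value is fetched it simply stays in the knowledge base.

```python
def ask(kb, subject, predicate, providers):
    """Return known objects for (subject, predicate).

    On a miss, call the provider registered for the predicate (if any)
    and record whatever it returns in kb before answering.
    """
    found = {o for (s, p, o) in kb if s == subject and p == predicate}
    if not found and predicate in providers:
        for o in providers[predicate](subject):
            kb.add((subject, predicate, o))
            found.add(o)
    return found
```

With a stub provider registered under ex:price, the first ask(kb, "MSFT", "ex:price", providers) would call out for the value and the second would be answered from the knowledge base.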

Messages Are RDF Graphs, or Objects?

Should the response to "What is your name?" be "My name is 'Sandro'" or just "Sandro"? I think we should require all communication to be in complete sentences (RDF Graphs), so that metadata can always be attached.

Should we have/allow identifiers which function like "me", "you", "now", and "this message"? In protocol, yes. In the database, no. On reading it in, any such terms (unless quoted of course, however that works) should be turned into the appropriate term.
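The reading-in step could be as simple as a lookup table applied to every term. The identifiers msg:me, msg:you, msg:now and msg:thisMessage below are placeholders invented for this sketch, and it makes no attempt at the quoting question.

```python
def resolve_indexicals(triples, sender, receiver, received_at, message_id):
    """Replace protocol-level indexical terms with concrete terms.

    The msg:* names are hypothetical; quoted terms are not handled.
    """
    table = {
        "msg:me": sender,
        "msg:you": receiver,
        "msg:now": received_at,
        "msg:thisMessage": message_id,
    }
    return {tuple(table.get(term, term) for term in triple)
            for triple in triples}
```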

Sandro Hawke
$Date: 2002/03/01 20:15:47 $