SWAN 1.0 Specification (very rough draft)


Status

I wrote this document on January 3-5, 2001, during my first weeks at W3C and in the RDF community, trying to bring my earlier work into this arena. I abandoned the effort because it seemed hopeless to talk about languages when the model was still so poorly understood. I'm making it public anyway, full of my notes. Hopefully, we're getting closer to a time for talking about languages.

-- sandro (2001-08-30)


This is probably much more a research paper than a spec! Still, any good implementation effort starts with a spec, right?

@@ something about building a language by parsing it into triples, then running rules on the triples. advantages and limitations. Possibly even stuff about ambiguous parses. Performance characteristics.

@@ how do we get a table of contents?

   Model
   Example
   Modules Overview
   Issues (overloading, namespaces, future expansion, whitespace)
   Inclusion of Module Specs (machine & human) (esp SwanBase, Grammar,
       Lists, StringLiterals)  -- or are Modules HERE and just
       the operator specs are out there, included for the module
       in question?
   Interesting Ideas (annotation frame, client use, boolean=class,
         closing-the-world)
   Software Guide

       Equations
           = -- ;
       Asides
           [ ]
       Reification 
           { }    or semanticContent(string_as_message)
       SwanBase
       Parsing       making a parse tree; possibly ambiguous; impl
           techs, lex+yacc
       Variables        (implicit rule-ification)
       Booleans        true,false,not,";",and,or,xor,implies,iff,if/then/else
       Documentation
       ExternalIdentification (and namespaces?)    (inclusion?)
       Dates and Times ?    events?     [mandatory for ExternalStuff]
       Web Interactions ?                  "        "          "
       Collections
       StringLiterals
       NumericLiterals
       Mathematics
       PropertyStatements
       Comments
       Markup
       TableShorthand    (?   use | )
               x=author:|title:|
	       swick   rdf                (use parens if you need spaces)
	       ora     rdf        
             Should turn into a list, I think.

	     x=table(author, title) of 
	     ``
	      swick  (rdf [=x])
	      ora    x           first_given:true
	     ''

	     x=table of 
	     ``
              author:  title:
	      swick  (rdf [=x])
	      ora    x           first_given:true
	     ''

	     "table" is just an ordinary function -- no language support.
	       
       SelfReference   
                  - every object has a many-many map to the
		    source coordinates where it is involved;
		    it might be reasonable to sort by the first 
		    mention, in certain cases!
		  - be able to talk about closing the world;
		    @.message.explicitSemanticContent
		        [ does that include language semantics? ]
		    Usually you would do that on some section
		    that had no rules, I think, or something?

    doing inclusion lets us version them separately, which is good.
    and put machine stuff near it.

    (inclusion uses xml, like here.    perl build-combined script)
        or use Swan-perlish
	     Operator (
	       name:"colon";
	       prec:950;
	       example:( 
	          Example:true;
		  ordering:file;
		  
             )
         Not a bad idea.   But doesn't show I know xml.   :-)
	 do XML for now, machine convert later!

	 


Bootstrapping, Modules, and some Examples

something

SWAN is being built in modules, some of which are built using other modules. The higher layer modules can be implemented entirely in the subset of SWAN provided by the lower layer modules.

SwanBase is the lowest layer module. The syntax for SwanBase looks like this:


   # first, some simple property statement triples
   someDocument.author=ora;
   ora.firstName="Ora";
   ora.lastName="Lassila";
   someDocument.author=ralph;
   ralph.emailAddress="swick@w3.org";
   someDocument.title="RDF Model and Syntax";

   # and now, a rule saying anything ora writes is good:
   if   $x.author=ora
   then $x.quality=good;
   end
   
   # and a more complicated one, saying that anything ora 
   # writes with ralph will be read by at least one person with an IQ
   # greater than 150.
   if   $x.author=ora;
        $x.author=ralph
   then $x.reader=$y;
        $y.intelligenceQuotient=$z;
	$z.greaterThan=150;          # awkward syntax in SwanBase
   end
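
As a rough illustration of the "parse into triples, then run rules on the triples" idea, here is a minimal Python sketch of forward chaining over the first rule above. The tuple representation and the helper names (match, solve, substitute, forward_chain) are assumptions made for this illustration; they are not part of SWAN.

```python
# Hypothetical sketch: a triple is a (subject, property, value) tuple, and
# any term starting with "$" is a rule variable, as in the SwanBase example.

def match(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple; return None on failure."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if isinstance(p, str) and p.startswith("$"):
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new

def solve(patterns, facts, bindings=None):
    """Yield every binding that satisfies all patterns against the facts."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    for fact in facts:
        extended = match(patterns[0], fact, bindings)
        if extended is not None:
            yield from solve(patterns[1:], facts, extended)

def substitute(pattern, bindings):
    """Replace bound variables in a pattern with their values."""
    return tuple(bindings.get(term, term) if isinstance(term, str) else term
                 for term in pattern)

def forward_chain(facts, rules):
    """Apply (if_patterns, then_patterns) rules until no new triples appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for if_part, then_part in rules:
            for bindings in list(solve(if_part, facts)):
                for pattern in then_part:
                    triple = substitute(pattern, bindings)
                    if triple not in facts:
                        facts.add(triple)
                        changed = True
    return facts

facts = {
    ("someDocument", "author", "ora"),
    ("someDocument", "author", "ralph"),
    ("someDocument", "title", "RDF Model and Syntax"),
}
# if $x.author=ora then $x.quality=good; end
rules = [([("$x", "author", "ora")], [("$x", "quality", "good")])]
result = forward_chain(facts, rules)
```

The sketch only covers rules whose conclusions use variables already bound in the "if" part. The second rule above introduces new objects ($y, $z) in its conclusion, which would need fresh identifiers and is left out here.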

@@@@ link to more formal grammar?

[ Literals in SwanBase are base-ten whole numbers less than 2^31, and strings delimited by double quotes (") whose contents may only be ASCII 32, 33, and 35-126 (that is, printable ASCII other than the double quote itself). Program text may only be ASCII 0-126, with 0-32 treated as whitespace (which just separates tokens when needed). Higher layers are expected to loosen this restricted syntax and character set. ]
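
The literal restrictions above are simple enough to check mechanically. The following Python sketch shows one way to do so; the function names are hypothetical, and accepting leading zeros in numbers is an assumption, since the text does not say either way.

```python
def is_swanbase_string_literal(token):
    """True if token is "..." and its body uses only ASCII 32, 33, 35-126."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        return False
    return all(ord(c) in (32, 33) or 35 <= ord(c) <= 126 for c in token[1:-1])

def is_swanbase_number_literal(token):
    """True if token is a base-ten whole number below 2**31."""
    return token.isascii() and token.isdigit() and int(token) < 2**31

def is_swanbase_program_text(text):
    """True if every character of the program text is ASCII 0-126."""
    return all(ord(c) <= 126 for c in text)
```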

Higher layer modules can make the syntax more compact in many different ways. Exactly how they work will probably not be clear until the more detailed explanations given below.

The first rewrite makes heavy use of the colon operator, in its sense of "return an object which is known to have this property with this value", and of the semicolon operator, in its sense of "both operands are the same object".

   Document someDocument [ =                                   # declare it a "Document", which equals
      author:(ora; firstName:"Ora"; lastName:"Lassila");       # something with an author like this
      author:ralph [.emailAddress=f];                          # and like this [oh, and ralph has some email address]
      title:"RDF Model and Syntax";                            # and this is all still the same as something with this title
   ];

   f=lowercase(ralph.lastName)+"@"+domainName(ralph.employer); # oh yeah, ralph's email is his last name, followed
                                                               # by "@" and the domain name of his employer.
   ralph.employer = w3c or mit;				       # alternatives are fine
   ralph.emailAddress.last(3) != "edu";                        # as are list operations on strings, and negation
   w3c.domainName="w3.org";
   "Ralph Swick"=ralph.firstName+" "+ralph.lastName;           # hey - this is declarative - it all works backwards too!
                                                               # (so now we know ralph's email address)
   # only the second rule can be made more compact
   if   $x = (author:ora; author:ralph)
   then $x.reader.intelligenceQuotient > 150;

The syntax and semantics used in the later examples are defined using the SwanBase language of the first example. The layering can be done late (in a client), so that the latest versions of SWAN, unofficial dialects, or even non-SWAN languages can be understood by any client that understands the SwanBase syntax and the semantic ontologies used to add syntactic and semantic structures.

Underlying Model

Expressions map many-to-one to a result value object and some semantic content, which we model as a set of property statements (triples of Subject, Property, Value). The result object may be identified with a newly-created identifier.
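
To make the "result object plus a set of triples" model concrete, here is a small Python sketch of how the colon and semicolon operators described earlier might be evaluated. The names fresh, colon and semi are hypothetical, and treating a;b as "rename b's object to a's" is only one possible reading of "both operands are the same object".

```python
import itertools

_ids = itertools.count(1)

def fresh():
    """Return a newly created identifier for an otherwise unnamed result."""
    return f"_g{next(_ids)}"

def colon(prop, value):
    """property:value -> (fresh object, {(object, property, value)})."""
    obj = fresh()
    return obj, {(obj, prop, value)}

def semi(left, right):
    """a;b -> both operands denote one object; keep a's identifier."""
    (a, left_triples), (b, right_triples) = left, right
    renamed = {tuple(a if term == b else term for term in triple)
               for triple in right_triples}
    return a, left_triples | renamed

# author:ora; title:"RDF Model and Syntax"
doc, triples = semi(colon("author", "ora"),
                    colon("title", "RDF Model and Syntax"))
```

Here doc is the newly created identifier for the result object, and triples holds two property statements that both have doc as their subject.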

Identifiers

Identifiers are the same as in C, except that in certain cases a dollar-sign ("$") may be used as the first character.

The at-sign character ("@") is also a special identifier.
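
A possible recognizer for these identifiers, sketched in Python. The text does not say in which cases the dollar-sign is allowed, so this sketch assumes it may always prefix an otherwise ordinary C identifier; the function name is hypothetical.

```python
import re

# Assumption: "$" is an optional prefix to a C identifier, and "@" stands alone.
_IDENTIFIER = re.compile(r"\$?[A-Za-z_][A-Za-z0-9_]*|@")

def is_identifier(token):
    """True if token is a C-style identifier, a $-prefixed one, or "@"."""
    return _IDENTIFIER.fullmatch(token) is not None
```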

By convention, the names of classes (which are the same as boolean-valued properties) are capitalized. (Another convention, of using only these properties with the implied-of operator, could prevent conflict with any possible future infix keyword operators, but this may not be worthwhile.)

Literals

Non-negative rational numbers may be specified in base 10 like 51, 51.0, or 51.2, in base 2 like 0b01010.0110, and in base 16 like 0x43cd.3e. No distinction is made in parsing between integers and rationals. Negative numbers and scientific notation may be expressed with operators.
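
The numeric forms above can all be read as exact rationals. The following Python sketch uses the standard fractions module to do that; parse_number is a hypothetical name, and accepting upper-case hex digits is an assumption not stated in the text.

```python
from fractions import Fraction

_BASES = {"0b": 2, "0x": 16}
_DIGITS = "0123456789abcdef"

def _digit_value(ch, base):
    """Return the value of one digit, rejecting anything outside the base."""
    value = _DIGITS.find(ch.lower())
    if value < 0 or value >= base:
        raise ValueError(f"bad digit {ch!r} for base {base}")
    return value

def parse_number(text):
    """Parse 51, 51.2, 0b01010.0110 or 0x43cd.3e into an exact Fraction."""
    base = _BASES.get(text[:2], 10)
    digits = text[2:] if base != 10 else text
    whole, dot, frac = digits.partition(".")
    if not whole or (dot and not frac):
        raise ValueError(f"malformed numeric literal: {text!r}")
    value = Fraction(0)
    for ch in whole:                 # integer part, most significant first
        value = value * base + _digit_value(ch, base)
    scale = Fraction(1, base)
    for ch in frac:                  # fractional part, one place at a time
        value += _digit_value(ch, base) * scale
        scale /= base
    return value
```

Because the result is always a Fraction, 51 and 51.0 parse to the same value, which matches the rule that integers and rationals are not distinguished. A leading minus sign is rejected, since negative numbers are left to operators.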

Character sequences (strings) may be specified like "Hello" or ``Hello''. Quotes may be embedded like "He said ""Hello!"" yesterday." or ``He said ``Hello!'' yesterday.''. Note that the doubled backquote and doubled single-quote marks nest. @@@@ this isn't very C-like, especially below
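
The two quoting styles behave differently: a doubled "" inside a double-quoted string stands for one quote mark, while the backquote/single-quote form nests. A Python sketch of both follows. The function names are hypothetical, and keeping nested quote marks verbatim in the resulting value is an assumption, since the text does not say what the value of a nested string is.

```python
def parse_double_quoted(text):
    """Parse "..." where a doubled "" stands for one embedded quote mark."""
    if len(text) < 2 or text[0] != '"' or text[-1] != '"':
        raise ValueError("not a double-quoted string")
    body = text[1:-1]
    if '"' in body.replace('""', ''):
        raise ValueError("unpaired quote mark inside string")
    return body.replace('""', '"')

def parse_nested_quoted(text):
    """Parse ``...'' by tracking nesting depth; inner quotes are kept as-is."""
    if len(text) < 4 or not (text.startswith("``") and text.endswith("''")):
        raise ValueError("not a ``...'' string")
    depth = 0
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "``":
            depth += 1
            i += 2
        elif pair == "''":
            depth -= 1
            i += 2
            if depth == 0 and i != len(text):
                raise ValueError("string closed before end of text")
        else:
            i += 1
    if depth != 0:
        raise ValueError("unbalanced quotes")
    return text[2:-2]
```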

Long strings containing arbitrary characters may be quoted using "custom" quotes, like this:

`some_marker`This string can contain almost anything'some_marker'

The text between the quote marks (some_marker in this example) must syntactically be an identifier. These quotes nest as well.
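
Custom quotes can be handled the same way as ordinary nested quotes, with the marker read from the opening delimiter. The Python sketch below is one possible reading; parse_custom_quoted is a hypothetical name, and only quotes using the same marker are counted for nesting, which is an assumption.

```python
import re

_OPENER = re.compile(r"`([A-Za-z_][A-Za-z0-9_]*)`")

def parse_custom_quoted(text):
    """Parse `marker`...'marker', allowing nested quotes with the same marker."""
    found = _OPENER.match(text)
    if not found:
        raise ValueError("missing `marker` opener")
    marker = found.group(1)
    opener, closer = f"`{marker}`", f"'{marker}'"
    depth = 0
    i = 0
    while i < len(text):
        if text.startswith(opener, i):
            depth += 1
            i += len(opener)
        elif text.startswith(closer, i):
            depth -= 1
            i += len(closer)
            if depth == 0:
                if i != len(text):
                    raise ValueError("text continues after closing marker")
                return text[len(opener):i - len(closer)]
        else:
            i += 1
    raise ValueError("unterminated custom-quoted string")
```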

Finally, the special sequence `' begins a string which continues until an out-of-band end marker, such as the end-of-file.

Operators

(This output is programmatically generated.)

Operator COLON ":"   property:value=subject   precedence 950, left-associative
   Meaning: an object for which the given property has the given value
   Example 1: color:red -- return an object, call it x, for which x.color=red
   Example 2: dc:author -- can have the namespace-like meaning an XMLer might expect if "dc" is defined properly, as a property (with the value author) of only one possible object, which happens to be the Dublin Core "Author" semantic grounding element.

Operator SEMI-COLON ";"   a;b=c   precedence 100, left-associative
   Meaning: what's here is semantic, meaning stuff
   Example 1: text of first example -- meaning of first example

Original Author: Sandro Hawke, December 2000
Last Modified: $Author: sandro $ $Date: 2001/08/30 15:26:46 $
Current Status: awaiting first draft.