
   Possible projects in MIT-CSAIL-DIG


Here are some projects we would like people to take on. They are suitable for MIT students, students elsewhere, and open source hackers; just please mail me for details and coordination before starting. A project is done when the code works, tests exist, and the documentation and code are all on the web.

Small projects....

Context-free grammar manipulation

Uses: Notation3 language. Requires: basic understanding of context-free grammars.

@prefix cfg: <http://www.w3.org/2000/10/swap/grammar/bnf#>.
We have an ontology for describing the syntax of grammars. There is one RDF property, cfg:mustBeOneSequence, which allows you to construct a whole grammar in RDF that is equivalent to a pure BNF (Backus-Naur Form) grammar. In practice, people writing parsers use all kinds of grammars, as do people trying to read them. Some of these grammars are known as Extended BNF (EBNF). These involve shorthand notations for zero or more of something, for example, or for an optional production. The CFG ontology has properties for expressing these shortcuts, like

[ cfg:zeroOrMore  ex:expression ]

A set of N3 rules exists to generate the pure CFG style from the shorthand forms.
Goal: To promote the use of machine-readable grammars, and specifically the use of interoperable standards for them, in particular semantic web standards that are shared with many other applications.
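
The expansion of shorthand forms into pure BNF described above can be sketched in Python. This is only an illustration of the idea, not the N3 rules themselves: the grammar is assumed to be a dict mapping each nonterminal to a list of alternative sequences (in the spirit of cfg:mustBeOneSequence), shorthand is assumed to be marked with tuples such as ("zeroOrMore", "expression"), and the generated helper names are hypothetical. Nested shorthand is not handled.

```python
def desugar(grammar):
    """Expand ("zeroOrMore", X) and ("optional", X) markers into plain BNF.

    `grammar` maps a nonterminal to a list of alternatives; each
    alternative is a list of symbols (an empty list is the empty sequence).
    """
    pure = {}
    helpers = {}  # shorthand marker -> generated nonterminal name

    def helper_for(marker):
        kind, inner = marker
        if marker in helpers:
            return helpers[marker]
        name = f"{inner}_{kind}"  # hypothetical naming scheme
        if kind == "zeroOrMore":
            # X* becomes: helper -> (empty) | X helper
            pure[name] = [[], [inner, name]]
        elif kind == "optional":
            # X? becomes: helper -> (empty) | X
            pure[name] = [[], [inner]]
        else:
            raise ValueError(f"unknown shorthand: {kind}")
        helpers[marker] = name
        return name

    for head, alternatives in grammar.items():
        pure[head] = [
            [helper_for(s) if isinstance(s, tuple) else s for s in alt]
            for alt in alternatives
        ]
    return pure


ebnf = {
    "list": [["(", ("zeroOrMore", "expression"), ")"]],
    "expression": [["NUMBER"], ["list"]],
}
pure = desugar(ebnf)
```

After the call, `pure` contains only plain sequences: the shorthand in "list" has been replaced by a reference to a generated nonterminal with an empty alternative and a recursive one.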

Reading XML documents as RDF graphs

Uses: Programming in Python.
It is sometimes useful to be able to just peer into an XML document when manipulating RDF.  One way to do this is to use XQuery, and then generate RDF/XML. Another way would be just to parse the XML file to an RDF graph.  

It turns out that there are two ways to do this. If you preserve the order of the document, you have to use RDF collections, aka lists. If you are not worried about order, as is the case with many XML to RDF conversions, then the graph you get is simpler.
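
The two approaches above can be compared with a small sketch using Python's standard xml.etree.ElementTree. The mapping is an assumption made for illustration, not the one in the cwm module: each element becomes a blank node, a leaf element without attributes becomes a plain literal, tag names are used directly as property names, and the CONTENT property is hypothetical. Mixed text content is ignored.

```python
import itertools
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CONTENT = "http://example.org/xml#content"  # hypothetical property


def xml_to_triples(xml_text, ordered=False):
    """Return (root_term, triples) for an XML string.

    Blank nodes are strings like "_:b0"; literals are plain strings.
    """
    counter = itertools.count()
    triples = []

    def new_node():
        return f"_:b{next(counter)}"

    def make_list(items):
        # Build an rdf:first / rdf:rest chain and return its head.
        head = RDF + "nil"
        for item in reversed(items):
            cell = new_node()
            triples.append((cell, RDF + "first", item))
            triples.append((cell, RDF + "rest", head))
            head = cell
        return head

    def visit(element):
        # A leaf element with no attributes is treated as a literal.
        if len(element) == 0 and not element.attrib:
            return (element.text or "").strip()
        node = new_node()
        for name, value in element.attrib.items():
            triples.append((node, name, value))
        if ordered:
            # Keep document order: wrap each child in a list item.
            items = []
            for child in element:
                item = new_node()
                triples.append((item, child.tag, visit(child)))
                items.append(item)
            triples.append((node, CONTENT, make_list(items)))
        else:
            # Ignore order: one triple per child element.
            for child in element:
                triples.append((node, child.tag, visit(child)))
        return node

    return visit(ET.fromstring(xml_text)), triples


doc = "<person><name>Ann</name><age>30</age></person>"
root, simple = xml_to_triples(doc)
root_ordered, with_lists = xml_to_triples(doc, ordered=True)
```

For this two-child document the unordered form needs two triples, while the ordered form needs seven, which shows why the unordered graph is the simpler one.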

In the context of the cwm software, this would just be another input language. There is in the cwm suite a Python module to do this, but it was started and never finished. Finish it, or make a new one.  You can use the xml2rdf.pt

Implementing GRDDL in cwm

GRDDL is a specification for saying what the RDF semantics are of an arbitrary document, and for automatically extracting RDF information from such a document by following pointers from the document or the XML schema.

It could potentially be the necessary tool to make large amounts of existing data available in the RDF model, and on the semantic web, connectable to the other data on the semantic web.

GRDDL uses XSLT, so this would involve finding an implementation of XSLT which is in, or could be connected to, Python, and setting cwm to use GRDDL when loading a remote document.
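
As a starting point, the "following pointers from the document" step can be sketched for the general XML case, where the root element carries a grddl:transformation attribute listing transformation URIs. This is a minimal sketch under stated assumptions: it uses only the standard library, does not cover XHTML profiles or namespace documents, does not resolve relative URIs, and leaves the choice of XSLT processor open. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"


def grddl_transformations(xml_text):
    """Return the transformation URIs named on the root element, in order."""
    root = ET.fromstring(xml_text)
    # ElementTree spells a namespaced attribute as "{namespace}localname".
    value = root.get(f"{{{GRDDL_NS}}}transformation", "")
    # The attribute value is a whitespace-separated list of URIs.
    return value.split()


sample = (
    '<doc xmlns:grddl="http://www.w3.org/2003/g/data-view#" '
    'grddl:transformation="http://example.org/a.xsl http://example.org/b.xsl">'
    "<item/></doc>"
)
transforms = grddl_transformations(sample)
```

Each URI found this way would then be fetched and applied to the document with whatever XSLT implementation is chosen, and the resulting RDF/XML loaded into the store.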

(TBD)

Implementing a SPARQL client in CWM with distributed query

CWM can run as a SPARQL server. This project is to take the hook already inserted in the code and make CWM pass a part of a query to a remote SPARQL server. This will allow CWM to participate in a distributed system of delegated or peer servers.
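
Passing part of a query to a remote server comes down to a SPARQL Protocol request, which can be sketched with the Python standard library. The endpoint URL and function names below are hypothetical, and the sketch says nothing about how the existing hook in CWM splits the query or merges the results.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def sparql_request(endpoint, query):
    """Build a GET request that carries `query` in the `query` parameter."""
    separator = "&" if "?" in endpoint else "?"
    url = endpoint + separator + urlencode({"query": query})
    # Ask for the XML results format for SELECT queries.
    return Request(url, headers={"Accept": "application/sparql-results+xml"})


def run_remote(endpoint, query, timeout=30):
    """Send the sub-query and return the raw response body as bytes."""
    with urlopen(sparql_request(endpoint, query), timeout=timeout) as response:
        return response.read()


# Hypothetical endpoint; nothing is sent until run_remote() is called.
request = sparql_request(
    "http://example.org/sparql",
    "SELECT ?s WHERE { ?s ?p ?o }",
)
```

The response body would then be parsed into variable bindings and joined with whatever part of the query was answered locally.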

Other things


Other student projects (there are millions; come and talk to us) are related to Paper Trail and to Diff, Patch, Update and Sync.



$Id$ timbl timbl
