

Possible projects in MIT-CSAIL-DIG
Here are some projects we are interested in people doing.
Suitable for MIT students, students elsewhere, and open source hackers --
just please mail me for details and coordination before starting.
A project is done when the code works, tests exist, and the documentation and code are all
on the web.
Small projects
Context-free grammar manipulation
Uses: Notation3 language. Requires: basic understanding of context-free grammars.
@prefix cfg: <http://www.w3.org/2000/10/swap/grammar/bnf#>.
We have an ontology for describing the syntax of grammars. There is one RDF property, cfg:mustBeOneSequence,
which allows you to construct a whole grammar in RDF which is
equivalent to a pure BNF (Backus-Naur Form) grammar. In practice,
people writing parsers use all kinds of grammars, as do people
trying to read them. Some of these grammars are known as Extended BNF
(EBNF). These involve shorthand notations for zero or more of something,
for example, or an optional production. The CFG ontology has
properties for expressing these shortcuts, like
[ cfg:zeroOrMore ex:expression ]
A set of N3 rules exists to generate the pure CFG style from the shorthand forms.
- Make N3 rules to go backwards and generate the shorthands from the pure
form, wherever the pure form matches a pattern that has a shorthand.
- Make N3 rules to generate BNF in its conventional syntax from the RDF/N3. (use log:outputString)
- Make N3 rules to generate, from the RDF with shorthand forms, the EBNF in the syntax used in the XML specification.
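As a rough illustration of the shorthand-to-pure direction and of printing BNF text, here is a minimal Python sketch. It is not the N3 rules themselves and does not use the cfg ontology. It assumes a hypothetical in-memory representation: a dict from nonterminal names to lists of alternatives, with tuples such as ("zeroOrMore", "expression") standing in for the shorthand nodes. The function names, the generated helper names, and the `""` notation for the empty alternative are all inventions for this example.

```python
# Hypothetical representation (not the cfg ontology): a grammar maps a
# nonterminal name to a list of alternatives; each alternative is a list of
# symbols. A symbol is a string, or a shorthand tuple (kind, symbol_name)
# where kind is "zeroOrMore" or "optional" and symbol_name is a plain string.

def expand_shorthand(grammar):
    """Return an equivalent grammar that uses only plain alternatives."""
    pure = {}
    counter = 0

    def fresh(kind):
        # Invent a helper nonterminal name for one shorthand occurrence.
        nonlocal counter
        counter += 1
        return "_%s_%d" % (kind, counter)

    for name, alternatives in grammar.items():
        new_alts = []
        for seq in alternatives:
            new_seq = []
            for sym in seq:
                if isinstance(sym, tuple):
                    kind, inner = sym
                    helper = fresh(kind)
                    if kind == "zeroOrMore":
                        # helper ::= <empty> | inner helper
                        pure[helper] = [[], [inner, helper]]
                    elif kind == "optional":
                        # helper ::= <empty> | inner
                        pure[helper] = [[], [inner]]
                    else:
                        raise ValueError("unknown shorthand: %r" % (kind,))
                    new_seq.append(helper)
                else:
                    new_seq.append(sym)
            new_alts.append(new_seq)
        pure[name] = new_alts
    return pure


def to_bnf(grammar):
    """Render a pure grammar as text, one 'name ::= a b | c' rule per line."""
    lines = []
    for name, alternatives in grammar.items():
        rhs = " | ".join(" ".join(seq) if seq else '""' for seq in alternatives)
        lines.append("%s ::= %s" % (name, rhs))
    return "\n".join(lines)


example = {"list": [["(", ("zeroOrMore", "expression"), ")"]]}
print(to_bnf(expand_shorthand(example)))
```

The backwards direction asked for in the first task would be the inverse: recognise a helper rule of the form "empty, or X followed by itself" and replace it with the shorthand.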
Goal:
To promote the use of machine-readable grammars, and specifically
the use of interoperable standards for them: common
semantic web standards which are shared with many other applications.
Reading XML documents as RDF graphs
Uses: Programming in Python.
It is sometimes useful to be able to just peer into an XML document when
manipulating RDF. One way to do this is to use XQuery, and then
generate RDF/XML. Another way would be just to parse the XML file to an
RDF graph.
It turns out that there are two ways to do
this. If you preserve the order of the document, you have to use
RDF collections, aka lists. If you are not worried about order, as is
the case with many XML to RDF conversions, then the graph you get
is simpler.
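The difference between the two mappings can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the unfinished cwm module: triples are plain tuples, blank nodes are generated strings such as "_:b1", the EX vocabulary (including "children", "tag" and "value") is hypothetical, and attributes, mixed content and XML namespaces are ignored. Only rdf:first, rdf:rest and rdf:nil are the standard RDF collection terms.

```python
import itertools
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/xml#"  # hypothetical vocabulary for this sketch


def xml_to_triples(text, ordered=False):
    """Map an XML document to a list of (subject, predicate, object) tuples."""
    ids = itertools.count(1)
    triples = []

    def bnode():
        return "_:b%d" % next(ids)

    def value_of(elem):
        # A childless element is treated as a literal, anything else as a node.
        if len(elem) == 0:
            return (elem.text or "").strip()
        return visit(elem)

    def visit(elem):
        node = bnode()
        if ordered:
            # Keep document order: link the children through an RDF collection.
            cells = [bnode() for _ in elem]
            head = cells[0] if cells else RDF + "nil"
            triples.append((node, EX + "children", head))
            for i, child in enumerate(elem):
                item = bnode()
                triples.append((item, EX + "tag", child.tag))
                triples.append((item, EX + "value", value_of(child)))
                rest = cells[i + 1] if i + 1 < len(cells) else RDF + "nil"
                triples.append((cells[i], RDF + "first", item))
                triples.append((cells[i], RDF + "rest", rest))
        else:
            # Ignore order: each child tag becomes a property of its parent.
            for child in elem:
                triples.append((node, EX + child.tag, value_of(child)))
        return node

    visit(ET.fromstring(text))
    return triples


doc = "<person><name>Tim</name><phone>1</phone><phone>2</phone></person>"
for triple in xml_to_triples(doc):
    print(triple)
```

For this three-child document the unordered mapping yields three triples, while the ordered one needs a list cell and an item node per child, which is the extra complexity described above.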
In the context of the cwm software, this would
just be another input language. There is in the cwm suite a
Python module to do this, but it was started and never finished.
Finish it, or make a new one. You can use the xml2rdf.pt
Implementing GRDDL in cwm
GRDDL is a specification for saying what the RDF semantics are of an
arbitrary document, and for automatically extracting RDF information
from such a document by following pointers from the document or the XML
schema.
It could potentially be the tool needed to make large
amounts of existing data available in the RDF model, and on the
semantic web, connectable to the other data on the semantic web.
GRDDL uses XSLT, so this would involve finding an implementation of XSLT
which is in, or could be connected to, Python, and setting cwm to
use GRDDL when loading a remote document.
(TBD)
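The first step, finding which transformations a document points to, might be sketched as follows. This covers only one of the pointer routes: an XML document whose root element carries a transformation attribute in the GRDDL namespace, holding a whitespace-separated list of URI references. Pointers via the XML schema or namespace document are not handled, and actually running the XSLT needs an engine outside Python's standard library, so that step is left out. The function name and the example URIs are hypothetical.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"


def grddl_transformations(xml_text, base_uri):
    """Return absolute URIs of the transformations named on the root element."""
    root = ET.fromstring(xml_text)
    # ElementTree spells a namespaced attribute as "{namespace}localname".
    raw = root.get("{%s}transformation" % GRDDL_NS, "")
    # Relative references are resolved against the document's own URI.
    return [urljoin(base_uri, ref) for ref in raw.split()]


doc = (
    '<po xmlns:grddl="http://www.w3.org/2003/g/data-view#" '
    'grddl:transformation="po2rdf.xsl http://example.org/other.xsl">'
    "<item/></po>"
)
print(grddl_transformations(doc, "http://example.org/docs/po.xml"))
```

Each URI found this way would then be fetched and applied to the document with whatever XSLT implementation is chosen, and the resulting RDF/XML loaded as usual.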
Implementing SPARQL client in CWM with distributed query
CWM can run as a SPARQL server. This project is to take the hook already inserted in the code and make CWM
pass a part of a query to a remote SPARQL server. This will allow CWM
to participate in a distributed system of delegated or peer servers.
Other things
Other student projects (of millions; come talk
to us) are related to Paper Trail and to Diff, Patch, Update and
Sync.
$Id$ timbl
