This is a first pass at an n3 [1] parser using lex/yacc.  I've used
some C++ features, although I think it would be pretty easy to make
it straight C.  The hardest part would be keeping track of who needs
to free char* strings.  It would also simplify things to make it
write the output directly, instead of indirecting the writes through
StreamOfTriples.

No known system dependencies beyond some lex, yacc, and C++ with STL,
but there probably are some.

As of this writing, this constitutes about 2 days' work, including
borrowing and modifying an old lex/yacc code base, and refreshing my
memory on STL.  I worked without the n3 BNF until some checking at
the end.

Issues:

   - what approach should we use for encoding a context into a
     stream of triples?  We can reify each statement, but then what
     do we do with the statement identifiers?  Put them in a list,
     or some RDF collection, or...?

     Current Solution: a lisp-style list

   - what vocabulary to use in various places?  (see vocab array)

   - should it be an error to redefine a @prefix in the same file?

     Current Solution: no, so you can just cat n3 files, although
     that won't work if you're using "<>"

   - should we leave <> intact, or replace it with a top-level
     context identifier?  Use the filename?  Require it from user?

To Do:

   [ added 2002-08-30 ]

   - add support for "this"
   - change lists to use rdfns
   - MAYBE reorg to return AST instead of constructing as it runs
   - MAYBE redo in plain C (with AST rewrite -- no need to free things)
   - use LX vocab for variables & formulas
   - use _: for genids
   - output prolog with rules recognition???

   [ from 2001-03-28 ]

   - play with adding expressions
   - pass in top-level context identifier
   - missing >-prop-> and <-prop<- and maybe some other unused features
   - input from files named on command line (use filenames in error
     listing)
   - validate (no-triples-out) option

[1] http://www.w3.org/DesignIssues/Notation3.html

sandro@w3.org
$Id: README,v 1.3 2002/08/30 16:20:45 sandro Exp $
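
Note on the first issue (encoding a context as a stream of triples):
the following is only a rough sketch of what "reify each statement,
then put the statement identifiers in a lisp-style list" could look
like.  It is not taken from the parser.  The Triple struct, the genid
and encodeContext helpers, and every "ex:" vocabulary name are
assumptions made up for illustration; the real code goes through
StreamOfTriples and its vocab array, which are not shown here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical triple of plain strings (not the parser's own type).
struct Triple {
    std::string subject, predicate, object;
};

// Hypothetical helper: returns a fresh identifier such as "_:g1".
static std::string genid(int &counter) {
    return "_:g" + std::to_string(++counter);
}

// Reify each statement of a context, then link the statement
// identifiers into a lisp-style list (first/rest cells ending in nil).
// Appends the generated triples to `out` and returns the identifier of
// the list head, or "ex:nil" for an empty context.
// All "ex:" names are placeholders, not the parser's vocabulary.
std::string encodeContext(const std::vector<Triple> &statements,
                          int &counter,
                          std::vector<Triple> &out) {
    std::vector<std::string> ids;

    // Step 1: one identifier per statement, described by three triples.
    for (const Triple &t : statements) {
        std::string id = genid(counter);
        out.push_back({id, "ex:subject", t.subject});
        out.push_back({id, "ex:predicate", t.predicate});
        out.push_back({id, "ex:object", t.object});
        ids.push_back(id);
    }

    // Step 2: build the list back to front, so each new cell can point
    // at the already-built remainder of the list.
    std::string rest = "ex:nil";
    for (auto it = ids.rbegin(); it != ids.rend(); ++it) {
        std::string cell = genid(counter);
        out.push_back({cell, "ex:first", *it});
        out.push_back({cell, "ex:rest", rest});
        rest = cell;
    }

    // The head is nil exactly when the context had no statements.
    assert(ids.empty() == (rest == "ex:nil"));
    return rest;
}
```

Each statement costs three reification triples plus two list-cell
triples, so a context of N statements becomes 5*N triples, and the
returned head identifier is what would stand for the context itself.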