Cwm enhancement roadmap

Weekly Meeting Agenda

In connection with the PAW project, we have paw/cwm dev meetings starting 9 Mar at 11:00 ET for 90 mins and every 2nd week on Thursday thereafter until further notice.

Meetings were held 31 Aug, 15 September, 28 September, 12 Oct, 26 Oct, and 28 Oct (all paw-team meetings), 4 Nov (no records), 23 Nov, 7 Dec, 21 Dec, 2006 Jan 4, Jan 19, 1 Feb (IRC only), 15 Feb, 9 March, 23 March, and 6 Apr. The 20 Apr meeting was cancelled.

Convene, take roll, review records and agenda.
- Meeting coordinates: #paw on irc.freenode.net Dial in +1-617-761-6200 Code 79296 ("SW CWM").
- Round the table updates (a sort of roll call and agenda review) expected: Yarden aka Jordan (?), Vlad, Eikeon aka Dan K., DanC, TimBL, Yosi
- misc action: Eikeon to ask Ron to copy TimBL's keys across from the pychinko space to the policyawareweb.org space
Vlad's progress integrating pychinko and cwm
Much progress has been made on built-ins in pychinko. Functions seem to work, as well as relational ops, as of ~Sep. log:includes progress? log:semantics? Formulas as 1st-class objects?

15 Feb Conclusion: The TAR file will have the whole source tree in, with swap/pychinko... but the CVS won't. Aim for: --pythink ? -or --rete or -run. release with a few tests and explanation of when to use --rete vs --think.

ACTION: Vlad to send mail to public-cwm-announce@w3.org about the inclusion of the rete as an alpha test, test users welcome. Vlad sent the relevant mail message, but it seems to be stuck in a moderation queue as of 7 Apr.
PAW demo item? How is it going? Debugging cwm --why and proof checking. Diag.tracking.
Tim and Vlad talked about cwm proofs. On 26 Apr, Vlad reported reducing the size of the generated proof from 355KB to 54KB:
This is the command I ran to generate the proof:
```
cwm.py http://www.policyawareweb.org/2006/lean/troop42-policy.n3 \
    http://www.policyawareweb.org/2006/lean/bengine.n3 --think \
    --filter="http://www.policyawareweb.org/2006/lean/bfilter.n3" --why > proof.n3
```
To check the proof I tried:
```
check.py proof.n3
```
The proof and the "lean" REIN files are available in the SVN: svn+ssh://mindswap.org/home/svn/paw/www/2006/lean
Perhaps that addresses the "Indexing for speed in a world of nested formulae with variables" performance issues?

Yosi implemented log:supports, though Dan sent a "can't reproduce log:supports results" message on 26 Apr. Is somebody working on using log:supports rather than log:conclusion/log:includes in the REIN engine? Lalana?

ACTION TimBL: To think about phasing out statement[CONTEXT]. (Tim: Upgraded the javascript API.)

See also: pfproxy.py, PAW dev SVN info

Older test case: DanC ran it at the face-to-face meeting:
```
cwm --rdf http://dig.csail.mit.edu/2005/09/rein/examples/judy-req.rdf --n3 http://dig.csail.mit.edu/2005/09/rein/engine.n3 --think --filter="http://dig.csail.mit.edu/2005/09/rein/filter.n3" --why
```
TBL checked in minimally-working code in the 26 Oct 2005 meeting.
cwm systems paper
- ACTION: Vlad to submit cwm systems paper to Journal of Theory and Practice of Logic Programming (TPLP) special issue on Logic Programming and the Web. Progress: see A Reasoner for the Web, in progress.
N3 semantics.
- ACTION: LK to submit N3 semantics paper to Journal of Theory and Practice of Logic Programming (TPLP) special issue on Logic Programming and the Web. In progress; see A Logic For the Web [DRAFT].
N3 changes and n3.n3
Unicode problems in reg exps using predictiveParser.

N3 decimal - any reasons why not move 1.0 in N3, turtle and sparql to xsd:decimal? Also, true and false keywords.

ACTION TimBL/Yosi. Change n3.n3 to include @true, @false, and separate type for decimal, and make sure output format of those is right - must preserve type. And make tests.

update 10 Feb: bnf2turtle -- write a turtle version of an EBNF grammar

Issue: BNF in html won't build.

Report: Notation3 spec has changed.
Common RDF API. (was: clean-up priorities)
- ACTION: DanC to work with timbl to separate operators from term.py, look at other class organization
- ACTION: DanC to kill sumOf
Experience with AJAR/Tabulator stuff suggests a compromise of (subject, predicate, object, why), where why is used for provenance filtering and SPARQL source.
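To make the (subject, predicate, object, why) shape concrete, here is a minimal sketch in Python. It is not the cwm or Tabulator API: the `Statement` class, the `from_source` helper, the `ex:` terms and the example URIs are all hypothetical, and `why` is assumed to hold the URI of the document a statement came from.

```python
from typing import Iterable, List, NamedTuple


class Statement(NamedTuple):
    """One statement plus its provenance (illustrative names only)."""
    subject: str
    predicate: str
    object: str
    why: str  # assumption: the URI of the source document


def from_source(statements: Iterable[Statement], source: str) -> List[Statement]:
    """Provenance filter: keep only statements whose `why` matches `source`."""
    return [s for s in statements if s.why == source]


# Two statements loaded from two different (made-up) documents.
store = [
    Statement("ex:judy", "ex:memberOf", "ex:troop42", "http://example.org/a.n3"),
    Statement("ex:judy", "ex:age", "12", "http://example.org/b.n3"),
]

trusted = from_source(store, "http://example.org/a.n3")
```

Keeping `why` as a fourth field means a provenance filter is an ordinary scan or index lookup, and the same field could plausibly serve as the source graph name in a SPARQL query.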

Earlier, we discussed a very clean common class structure for the basic N3 language elements. Arrive at a common class tree for that. The [new] fact that pychinko seems to be happy using cwm classes now suggests that there is a clean break here and cooperation will be easy.

Wiki page on WebDataInterface design
the N3 abstract syntax interface to parsers and serializers.
ACTION TimBL: Add formulaFromN3String() and formulaFromRDFXMLString() methods to store.

Possible to preserve similarity between parser and unparser interface to allow straight-through data merge at speed?

aka: on arch of rdflib doing N3 formulae (Eikeon)
Common test suites -- manifests of manifests?
getting rdflib to pass the N3 parsing tests
Nary operations and built-ins
@@ links to Nary Note and Rules charter. NaryRelations in ESW wiki

p.s. The cwm/Inferenceweb integration item has moved to the TAMI agenda. Some technical discussion is in the inferenceweb archive; see Thu Mar 23 discussion, for example

by Release 1.1 2005/11/?? (Requires Python 2.3)

This contains performance work which would have been in 0.9 except for an unexpected medical situation. This will use Python sets where appropriate (instead of homebrew sets or sequences) and will therefore require Python 2.3.

Longer Term

There is no timescale for these enhancements, as they are not subject to existing funded development or volunteer commitment. Both are of course welcome at any time!

Philosophy: To be able to integrate with other projects such as redfoot, pychinko, redland, etc, make the base classes for N3 terms as light as possible. Move functionality such as indexing for query, parsing, pretty printing, serializing, etc into separate modules.

Performance

Performance: Make a test/perform test suite with standard tests so as to plot performance changes from version to version.

Develop specific benchmarks for specific problems, such as pipelineable cases, large store/small rules, large rules/small data, etc. Include feedback from users to characterize the different ways in which cwm is used.
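A timing harness for such a suite could be as small as the sketch below. Everything in it is an assumption for illustration: the `time_cases` function, the case names, and above all the placeholder workloads, which stand in for real cwm runs that the roadmap does not specify.

```python
import time
from typing import Callable, Dict


def time_cases(cases: Dict[str, Callable[[], object]], repeats: int = 3) -> Dict[str, float]:
    """Return the best wall-clock time in seconds for each named case."""
    results = {}
    for name, run in cases.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run()
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


# Placeholder workloads only; a real suite would invoke cwm on fixed inputs.
cases = {
    "large-store-small-rules": lambda: sum(range(200_000)),
    "large-rules-small-data": lambda: sorted(range(50_000), reverse=True),
}

timings = time_cases(cases)
for name, seconds in sorted(timings.items()):
    # One tab-separated row per case, easy to append to a per-version log.
    print(f"{name}\t{seconds:.6f}")
```

Recording one row per case per release would give the data needed to plot changes from version to version.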

Clones

Cwm demonstrates the power of a particular language and a particular general-purpose tool. Encourage and liaise with compatible systems, and share test data. Work to define subsets of the rule languages which work with different types of engine.

Syntax

N3 grammar close the loop: Connect the N3 grammar to syntax tests by generating a parser directly from the grammar. Closing the loop on the syntax: we have a definition of the N3 syntax in N3. This is only a form of documentation until we have automatically generated a parser and/or a test suite from that syntax definition. A straightforward job: we have a syntax validator based on the BNF in RDF already. The interesting bit is the ontology for annotating the BNF grammar with RDF production rules, because it could then be reused to annotate foreign and legacy grammars to enable automatic extraction of RDF from them. (Apart from the niceness of having something defined in itself, of course.) (July 2004: this is done, without the grammar annotation necessary to actually generate triples in the store.)

Extended RDF output syntax for formulae (how? have RDF Core locked all the exits? Prevents RDF2 not superset of RDF/XML1)

Query/Inference

Built-ins for lists: in (needs multi-value function, below); append, prepend, second..seventh, last, concat.
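The intended behaviour of these list built-ins can be sketched as plain Python functions. This is only an illustration under assumptions, not cwm code: the function names and argument orders are hypothetical, lists are modelled as tuples, second..seventh are folded into one 1-based `nth` helper, and concat is read here as joining lists (the roadmap does not say whether it means that or string concatenation).

```python
from typing import Iterator, Tuple, TypeVar

T = TypeVar("T")


def members(items: Tuple[T, ...]) -> Iterator[T]:
    """Multi-valued "in": yield one result per list member."""
    yield from items


def append(items: Tuple[T, ...], item: T) -> Tuple[T, ...]:
    """Return a new list with `item` added at the end."""
    return items + (item,)


def prepend(items: Tuple[T, ...], item: T) -> Tuple[T, ...]:
    """Return a new list with `item` added at the front."""
    return (item,) + items


def nth(items: Tuple[T, ...], n: int) -> T:
    """1-based access, covering second..seventh with n = 2..7."""
    if n < 1:
        raise IndexError("positions are 1-based")
    return items[n - 1]


def last(items: Tuple[T, ...]) -> T:
    """Return the final member of a non-empty list."""
    return items[-1]


def concat(*lists: Tuple[T, ...]) -> Tuple[T, ...]:
    """Join several lists into one (assumed reading of concat)."""
    joined: Tuple[T, ...] = ()
    for items in lists:
        joined += items
    return joined
```

Only `members` needs the multi-value machinery, since it produces several bindings for one input; the others are ordinary single-valued functions.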

Sets, plus builtins, see syntax in N3. Information is lost, and optimizations are lost, when sets are actually represented as ordered lists because that is the only syntax easy to use in RDF. owl:oneOf provides the constructor for sets.

Builtins extracting closed world data: s=Q(F, y, G) is the set of x such that formula F.substitute((y, x)) log:includes G.
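The definition of Q can be illustrated with a toy model. The sketch below makes strong simplifying assumptions that are not in the roadmap: a formula is a frozenset of ground triples, log:includes is reduced to a subset test (real log:includes also matches variables), and the candidate values for x are passed in explicitly. All function names and `ex:` terms are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
Formula = FrozenSet[Triple]


def substitute(formula: Formula, var: str, value: str) -> Formula:
    """Replace every occurrence of `var` in the formula with `value`."""
    return frozenset(
        tuple(value if term == var else term for term in triple)
        for triple in formula
    )


def includes(formula: Formula, other: Formula) -> bool:
    """Simplified log:includes: every triple of `other` appears in `formula`."""
    return other <= formula


def closed_world_set(f: Formula, y: str, g: Formula,
                     candidates: Iterable[str]) -> Set[str]:
    """Q(F, y, G): the candidates x for which F with y := x includes G."""
    return {x for x in candidates if includes(substitute(f, y, x), g)}


F = frozenset({("?y", "ex:memberOf", "ex:troop42")})
G = frozenset({("ex:judy", "ex:memberOf", "ex:troop42")})

result = closed_world_set(F, "?y", G, ["ex:judy", "ex:bob"])
```

With these inputs only `ex:judy` turns F into a formula that includes G, so it is the sole member of the result set.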

Symbols used as variables as separate python class - save some time checking occurrences

Implement backward chainer (like Euler) on the same store. Allow forward and backward chainers to interact, share built-ins.

Store

Net

Implement persistent cache - make an ontology (based very strongly on HTTP headers) for recording the results of an HTTP (etc) get. Store metadata about fetched ontologies. Build an index into the persistent cache for offline working.

Functions: find the closure of these documents under eg --closure=ptr and make sure that they are all in my persistent cache. Allow environment variable to specify persistent cache mode (default: ontologies only, write to cache when online and expiry check fails, read from cache when offline). Provide pcache.urlopen() as a transparent extension of urlopen()
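One possible shape for the cache-backed urlopen() is sketched below. No pcache module exists yet, so every name here is hypothetical: the `PersistentCache` class, its `offline` and `max_age` parameters, and the on-disk layout. For brevity it records metadata as JSON rather than in an HTTP-header ontology, uses a fixed maximum age as the expiry check, and leaves out the environment variable and the ontologies-only restriction.

```python
import hashlib
import io
import json
import os
import tempfile
import time
import urllib.request
from typing import Callable, Dict, Tuple

Fetcher = Callable[[str], Tuple[bytes, Dict[str, str]]]


def http_fetch(url: str) -> Tuple[bytes, Dict[str, str]]:
    """Default fetcher: a plain GET returning the body and response headers."""
    with urllib.request.urlopen(url) as response:
        return response.read(), dict(response.headers.items())


class PersistentCache:
    """Hypothetical on-disk cache keyed by a hash of the URL."""

    def __init__(self, directory: str, fetch: Fetcher = http_fetch,
                 offline: bool = False, max_age: float = 3600.0) -> None:
        self.directory = directory
        self.fetch = fetch
        self.offline = offline
        self.max_age = max_age  # seconds before a cached copy counts as expired
        os.makedirs(directory, exist_ok=True)

    def _paths(self, url: str) -> Tuple[str, str]:
        key = hashlib.sha256(url.encode("utf-8")).hexdigest()
        base = os.path.join(self.directory, key)
        return base + ".body", base + ".meta.json"

    def urlopen(self, url: str) -> io.BytesIO:
        body_path, meta_path = self._paths(url)
        if os.path.exists(body_path) and os.path.exists(meta_path):
            with open(meta_path, encoding="utf-8") as f:
                meta = json.load(f)
            fresh = time.time() - meta["fetched"] < self.max_age
            # Offline: always serve the cached copy. Online: only while fresh.
            if self.offline or fresh:
                with open(body_path, "rb") as f:
                    return io.BytesIO(f.read())
        elif self.offline:
            raise OSError("offline and not in cache: " + url)
        # Online and missing or expired: fetch, then write through to the cache.
        body, headers = self.fetch(url)
        with open(body_path, "wb") as f:
            f.write(body)
        with open(meta_path, "w", encoding="utf-8") as f:
            json.dump({"url": url, "fetched": time.time(), "headers": headers}, f)
        return io.BytesIO(body)


# Demonstration with a fake fetcher, so no network access is needed.
calls = []


def fake_fetch(url: str) -> Tuple[bytes, Dict[str, str]]:
    calls.append(url)
    return b"@prefix ex: <http://example.org/> .", {"Content-Type": "text/n3"}


cache_dir = tempfile.mkdtemp()
online = PersistentCache(cache_dir, fetch=fake_fetch)
first = online.urlopen("http://example.org/ont.n3").read()
offline = PersistentCache(cache_dir, fetch=fake_fetch, offline=True)
second = offline.urlopen("http://example.org/ont.n3").read()
```

Passing the fetcher in as a parameter keeps the cache logic testable offline, which matches the offline-working goal; the second read above is served from disk without calling the fetcher again.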

Remote Operation

IRC bot (sWBot) -- Your friendly N3-happy robotic chat room companion. See sandbox security below.

Security

Sandboxed version of cwm, which is immune to attacks from rules in data, builtin functions in untrusted rules, confidentiality violations by indirect web access, and so on. While you can build a secure trusted system without needing a sandbox at all (it isn't really the way to build one), sandboxing is handy for things like demo servers and IRC bots.

Cwm Development Agenda and Release Roadmap

PAW Dec 2006 work

Goals: