THIS TOOL IS NOW OBSOLETE!

This tool, and the underlying talk management system, is now obsolete. Please consult the new “How to add a new Talk” page.

Management of W3C Public Presentations

1. Introduction
2. The CGI Script and its URI-s
3. Invoking the CGI Script Directly
4. The Data Files
- 4.1. The Talk RDF Files
- 4.2. The Auxiliary RDF Files
5. How Does it All Work?

1. Introduction

The goal of this project is to create a unified management of all public presentations of the W3C Team, Office Staff, and those Working Group or Advisory Board Participants who make presentations on behalf of W3C. Data for all these presentations are stored in one place and various “views” of the data can be retrieved through scripts.

The main (and public) view of the presentation data is via the Public Presentations’ page (in fact, this page is simply redirected to the “real” page which is again mapped via .htaccess to a CGI script). This page may be used to filter the view of the presentation data along different axes: time, W3C activities, countries, etc. The main tool to add a new presentation to the underlying database is to use the “Upcoming Talks” input form. The rest is done by scripts on the W3C site, backed up by the communication team for possible editing (spell check problems, etc).

Another (team-only) view of the presentation data is available via another URI. There are two tricks there: this URI corresponds to a real directory, hence the ACL system can ensure that it is visible to the team only, and the .htaccess in the directory not only maps to the same CGI script but also adds a “hidden” form option to the query string (extraData=yes). The core script looks for this option and, when it is present, displays some extra information for the team-only view (e.g., comments to the comm team).

2. The CGI Script and its URI-s

The CGI script can be accessed from the URI http://www.w3.org/2004/08/W3CTalks (technically, this URI is rewritten to the “real” script). By default it displays all recent past and upcoming presentations.

A server-side remap is provided from http://www.w3.org/Talks/ to the new view. More precisely, the remap points to one of the optimizing cache files rather than to the script proper. This is important to know if one wants to refer to the URI-s directly; http://www.w3.org/2004/08/W3CTalks must be used in this case.

3. Invoking the CGI Script Directly

Obviously, the usual way of invoking the CGI script is via the form in the Public Presentations’ page. Using those forms, full XHTML pages are returned with the forms filled in with the current values. Another possibility is to invoke the CGI script with an additional parameter, forcing it to return the list of talks only. The additional parameters are as follows.

xmlFragment=yes: The CGI script returns only a <div> structure containing the list of talks. The class attribute of the <div> is set to presentationList to allow CSS styling, for example. By default, the <div> is built up of headers (depending on the request, these can be countries, years and months, etc., in the form of <h2> and, possibly, <h3> elements) and a definition list with the talk data. Just as for the full XHTML view, the talk data is formatted into proper English text.
noHeader=yes: This additional parameter forces the script to drop the header elements and return the list of talks only (enclosed in the <div>). This may be more appropriate for some pages.
extraData=yes: Some extra data, like the comments to the comm team or RDF data related to events only (and used by the comm team) are also displayed.
debug=yes: As its name suggests, this is for debugging :-) Some extra information is added to a comment field of the output; i.e., one can get to it with “view source”.

Using this CGI invocation, the return value may be embedded into another page using a URI of the form, for example:

http://www.w3.org/2004/08/W3CTalks?activity=Semantic+Web+Activity&submit=Submit&xmlFragment=yes

The return can be included into a page via, e.g., a <?php include(...);?>. See, for example, my Public Page, the Offices home page, or the Offices’ news archive that use this technique. One can also use a crontab job instead of PHP, updating a page every night, etc.
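As a minimal sketch of the crontab alternative, the fragment URI above can be assembled and fetched from Python. Only the base URI and the query parameters come from this page; the helper names build_fragment_uri and fetch_fragment are hypothetical, the UTF-8 decoding is an assumption, and since the tool is obsolete the URI may no longer answer.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URI = "http://www.w3.org/2004/08/W3CTalks"


def build_fragment_uri(activity, no_header=False):
    """Return a URI asking the script for the <div> fragment only."""
    params = {"activity": activity, "submit": "Submit", "xmlFragment": "yes"}
    if no_header:
        # Drop the <h2>/<h3> headers and keep only the list of talks.
        params["noHeader"] = "yes"
    return BASE_URI + "?" + urlencode(params)


def fetch_fragment(uri, timeout=10):
    """Fetch the fragment as text (hypothetical helper; needs network access).

    Assumption: the response is encoded as UTF-8.
    """
    with urlopen(uri, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A nightly job could write the result of fetch_fragment(build_fragment_uri("Semantic Web Activity")) into a file that the target page then includes statically.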

4. The Data Files

The tool uses a number of RDF/XML files to store the talk data. Although the “Upcoming Talks” input form is the preferred way to add new entries, it is perfectly possible for the W3C team to edit those files directly (though with care, obviously) or to maintain, essentially, their own files (see below). This is particularly useful if existing entries are to be modified: while it is possible to use the upcoming entry form for a change request there is, alas! no direct editing tool of the RDF data yet (this is part of the “to do” list...).

All data are stored in RDF/XML format; sorry, no Turtle format (yet). The reason is purely practical: the RDF tool that has been used does not have a Turtle parser. If such a parser is added to the tool (rdflib), then Turtle becomes possible, too…

The tool uses a number of RDF files, some specifically for the talk management, while others are maintained and used by totally different projects and only reused here. Typically, users are interested in the former only.

4.1. The Talk RDF Files

The RDF data for the talks themselves are stored in /2004/08/TalkFiles/{2004,2005,…}. The script parses all RDF files (i.e., files with an rdf suffix) found in those directories. The input form manipulates the Talks.rdf file; however, it is perfectly possible to add and maintain other files separately. For example, an Ivan.rdf file containing Ivan’s presentations for a specific year may be used, if Ivan decided not to use the standard input form. It is important to stick to the year structure, because some of the internal optimizations of the scripts (when retrieving the data) are based on it.
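The file discovery described above can be sketched as follows. This is an illustration only, not the actual code of the script: the function name find_talk_files and its arguments are hypothetical, and the parsing step (done with rdflib in the real tool) is left out.

```python
from pathlib import Path


def find_talk_files(root, years):
    """List every file with an .rdf suffix in the per-year directories."""
    found = []
    for year in years:
        year_dir = Path(root) / str(year)
        if not year_dir.is_dir():
            continue  # nothing stored for that year
        # Talks.rdf and any personal file (e.g. Ivan.rdf) are picked up alike.
        found.extend(sorted(year_dir.glob("*.rdf")))
    return found
```

Because the lookup is driven purely by the directory layout, a file placed outside the year directories would simply never be seen, which is one reason to stick to the year structure.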

The Events.rdf file is manipulated by a separate submission script. It is used internally to store events of interest (without a speaker), which the tool displays when the extraData=yes option is ‘on’. Each resource has an extra (RDF) type (<talk:eventNotification>True</talk:eventNotification>).

All the RDF data follows an ontology whose format was discussed by the W3C Semantic Web team in autumn of 2004. The tool (currently) does not make a very strict check on the RDF data using the ontology (another item for the ‘to do’ list); ie, if you choose to edit your own RDF file, do it with care and use “editing by example”…

4.2. The Auxiliary RDF Files

The tool also uses a number of other RDF files that are usually generated and maintained outside this project. Just out of interest, here they are:

TeamFoafs.rdf is simply a file dump of a script written by Dominique that accesses the W3C public team page and creates FOAF data. It contains the name and the home page addresses of the W3C Team. (The fact that it is dumped into a file is simply a matter of efficiency).

FoafBridge.rdf binds the FOAF terms to the “contact” terms that are used elsewhere, by defining some RDF subclass and subproperty relations. (A.f.a.i.k. the FOAF ontology makes the same connections, but it was more efficient to add the few relationships here rather than reading in a full ontology).

Offices.rdf contains all the contact addresses of the Offices’ staff; this file is maintained by the Head of Offices (and also used to generate, for example, the Office Staff List).

groups.rdf is maintained by the system team, and includes data on W3C activities, groups, group chairs.

langinfo.rdf contains data on languages, e.g., the ISO two-letter code and the name of the language both natively and in English. It is usually maintained by whoever maintains the translations.

5. How Does it All Work?

Some of the links and/or machine references in this section are accessible to the team only. However, it does give a general idea of how the system works, which may be of a general interest.

5.1. Top Layer: What Scripts are Involved and How?

The core of the management is based on the following Python scripts:

ManageTalks.py

This script makes heavy use of a SPARQL implementation on top of a Python RDF library called RDFlib. There is a separate document (generated from the source) that gives more details on how this script works internally. (Note also a separate description of how the popup mechanism for abstracts works, which has nothing to do with RDF…). There are, however, some general issues that are important to note here:

The script depends heavily on some external tools, either developed by myself or generally available. These are all documented separately (all modules listed there are used).
The script implements an extra optimizing step. If the request is to have all presentations for a specific year (and no other constraints are specified), the script simply displays the file TalkFiles/yXXXX.html, where 'XXXX' denotes the year. This also means that those 'cache' files are to be kept up-to-date; see below on how this is done. Suffice it to say at this point that if the ManageTalks.py script is invoked directly (as opposed to being invoked from another Python module), those cache files are generated.
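The cache shortcut can be illustrated with a small sketch. The function name cache_file_for and the form field name "year" are assumptions made for the example; only the TalkFiles/yXXXX.html naming comes from the description above.

```python
import re


def cache_file_for(params):
    """Return the cache file for a year-only request, or None otherwise."""
    # Ignore empty fields and the form's submit button (assumed field name).
    constraints = {k: v for k, v in params.items() if k != "submit" and v}
    if set(constraints) != {"year"}:
        return None  # other constraints present: the full query is needed
    year = constraints["year"]
    if not re.fullmatch(r"\d{4}", year):
        return None  # not a four-digit year, so no cache file can match
    return f"TalkFiles/y{year}.html"
```

Any request that carries a further constraint, such as an activity or a country, falls through to the regular query path.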

W3CTalks.py

This is just a very small script making the bridge between the CGI call and ManageTalks.py. Our friendly system team at W3C links this script to the real CGI directory of the W3C server.

AddTalk.py

This script takes a text file on the standard input, looks for an RDF description of a talk, and updates the TalkFiles/XXXX/Talks.rdf file by adding a new entry (XXXX is the year of the talk, as found in the input). The new entry has an extra (RDFS) comment to make clear that, well, it is a new entry. This script is invoked when a new entry is filled in via the input form.
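The first step, locating the RDF description inside the incoming text, might look like the sketch below. It assumes that the description is delimited by <rdf:RDF> and </rdf:RDF> tags, which this page does not state; the name extract_rdf is hypothetical and this is not the code of AddTalk.py.

```python
import re

# Assumption: the talk description is a single <rdf:RDF>...</rdf:RDF> block.
RDF_BLOCK = re.compile(r"<rdf:RDF\b.*?</rdf:RDF>", re.DOTALL)


def extract_rdf(text):
    """Return the first RDF/XML block found in the text, or None."""
    match = RDF_BLOCK.search(text)
    return match.group(0) if match else None
```

When fed from a mail pipe, the function would be called as extract_rdf(sys.stdin.read()), so that mail headers and signatures around the RDF block are ignored.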

AddEvent.py

This script is very similar to AddTalk.py but is used to update the TalkFiles/XXXX/Events.rdf file.

(Note: the external scripts are installed on wiggum as part of the python libraries; on homer, they are available under /home/ivan/W3C/dev/2004/PythonLib-IH.)

5.2. Day-to-day: How the Scripts are Used

Keeping the cache files up-to-date: User 'ivan' on homer has a crontab entry that runs the /home/ivan/WWW/2004/08/UpdateTalks script every two hours. The script makes some CVS updates locally (to be sure) and invokes ManageTalks.py; the latter updates the cached files. Finally, everything is committed back via CVS. Voilà!
Managing a new entry: The task of the “Upcoming Talks” input form is to format the input as an RDF/XML entry and to send it via mail to (among others) ivan+talks@w3.org. A procmail entry on homer under user ‘ivan’ pipes this into a shell script on homer: /home/ivan/WWW/2004/08/AddTalk. This script makes some CVS updates to be sure, and invokes the AddTalk.py script. The latter updates the talk files, updates the cached files, and commits everything into CVS. Adding a new event follows the same strategy, except that the email address is ivan+events@w3.org, which leads to the invocation of AddEvent.py.
Checking entries: Once a week, somebody from the comm team checks the new entries, and possibly updates them (spelling mistakes, etc).
Updating entries: This is still done manually: if the radio button on the “Upcoming Talks” input form signals that the entry must be updated, the mail is sent to the comm team, and the change is done manually.

5.3. Errors

If the data files are edited by hand, errors might occur, unfortunately. These errors are detected by the run-time libraries (rdflib, etc.) and exceptions are raised. As a rule, the exceptions are handled differently depending on whether the script runs as a CGI script or via the crontab (for background processing of the cache files).

When running as a CGI script, a more “graceful” output is provided on the screen, though the content being displayed might be only partial.
When running separately, the exception data are written to standard error. The file /home/ivan/WWW/2008/08/log.error contains the exception description.
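The two behaviours can be sketched as below. This is an assumption-laden illustration, not the script's own code: it detects CGI mode through the GATEWAY_INTERFACE environment variable that CGI servers set, and the names running_as_cgi and report_error as well as the message text are hypothetical.

```python
import os
import sys
import traceback


def running_as_cgi(environ=None):
    """CGI servers set GATEWAY_INTERFACE; cron jobs normally do not."""
    environ = os.environ if environ is None else environ
    return "GATEWAY_INTERFACE" in environ


def report_error(exc, environ=None, stderr=None):
    """Return text for the page in CGI mode; else log to stderr, return None."""
    stderr = sys.stderr if stderr is None else stderr
    if running_as_cgi(environ):
        # "Graceful" output: the page is still produced, possibly partial.
        return "<p>Some talk data could not be read; the list may be partial.</p>"
    # Background run: write the full exception description to standard error.
    traceback.print_exception(type(exc), exc, exc.__traceback__, file=stderr)
    return None
```

In a crontab entry, standard error would then be redirected to a log file so that the exception description is kept.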

T.b.d.: provide some sort of a check as a separate, probably team only script that would return an immediate diagnosis.

5.4. The Input Forms

From time to time, it is necessary to update the submission form: team or office staff may change, new activities come in, etc. It does not come as a surprise that this update is also done via a Python script, SubmitTalkGenerate.py: it uses all the auxiliary RDF files listed above, plus the core of a PHP script, SubmitTalk-core.php, to generate the final input form (by adding all the data in pull-down option lists, essentially). The update runs once a day via another crontab entry on homer under user 'ivan' (using the /home/ivan/WWW/2004/09/SubmitTalkUpdate script, which wraps the call to SubmitTalkGenerate.py in a set of CVS updates and commits). Otherwise, the input form itself is a fairly standard PHP form, nothing particular. (The same mechanism applies to the SubmitEvent form.)
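Generating a pull-down option list from extracted data can be sketched as follows. The function name option_list and the sample values are hypothetical; in the real tool the values would come from the auxiliary RDF files, and the result would be merged into the PHP form.

```python
from html import escape


def option_list(name, values):
    """Build an XHTML <select> element with one <option> per value."""
    lines = [f'<select name="{escape(name)}">']
    for value in values:
        text = escape(value)  # escapes &, <, > and quotes
        lines.append(f'  <option value="{text}">{text}</option>')
    lines.append("</select>")
    return "\n".join(lines)
```

For example, option_list("activity", ["Semantic Web Activity"]) yields a three-line <select> block that can be pasted into the form template.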

Ivan Herman, Head of Offices
Last modified: $Date: 2018/07/13 16:19:21 $