About the Tabulator


This is one possible form of a semantic web browser.

It works on Firefox and Camino, may be bugs on other platforms.  No no-portable features were knowingly used.  Patches for other platforms are welcome.

Motivations

Ethos

The browser browses things, not documents. Of course some thing are documents, but the browser considers them first as things, and then as documents.

Other browsers have tended to focus on a document at a time, or work by amassing a large static database of RDF from all known sources and then browsing it.

This browser works in a web of documents. This is partly in order to scale: to operate in an unbounded space without having to manage with more than a certain finite memory. It is partly in order to develop the links we need in this space.

When Google came to the web, it seemed to make the links less necessary, because one could ask Google instead of tracing the links to find something. Tracing the links, that is, through remembered pages, relevant pages, trusted pages, using one's instinct and the indications left by authors to find the goal. However, the links are still important. In fact, Google requires the links to work at all. And not just any old links: links patiently and carefully made by thoughtful authors, automated or most often human, for the benefit of inquiring readers.

Now we are making the semantic web, and we have the Google experience, and we tend to start by making a huge index: we are tempted to start with the Google index and ignore the links. But we are missing the links. Apart from the FOAF world, much semantic web data is isolated from other webs of data, even though connections could be made. We need to make the links across systems. This browser is designed to use these links, and hence encourage their creation.

Documents have content which describe things. They are in the browser a special sort of object.

The use of a URI to denote something is an invitation to look it up. This is the hypertext link in the Semantic Web. If the URI we use to identify something is formed from adding a hash and a local identifier to the URI of a document, then the reference is to the thing in the document with that local identifier. This is the simplest breadcrumbs protocol.

Breadcrumbs protocols

A breadcrumbs protocol is convention by which information is left to allow another to follow.  When the information provider follows the convention on the breadcrumbs to leave, and an information seeker follows the convention on what links to follow, then the protocol that the seeker will be able to solve certain sorts of problem.

There are other forms, such as the FOAF link. The FOAF link is a breadcrumbs protocol as follows. Document ?d1 makes a reference the thing identified as ?x in document ?d2 when:

 ?d1 log:says { ?y foaf:sha1sum ?s. ?y rdfs:seeAlso ?d2}.
?d2 log:says { ?x foaf:mbox_sha1sum ?s }.

or the same with foaf:mbox (email address)  instead of foaf:mbox_sha1sum (cryptographic hash of  email address).  If the information provider of ?d1 leaves that information. The seeker, having read d1, and wanting information about ?y (1) looks up the document found under rdfs:seeAlso, and (2) finds in d2 reference to something which has the same foaf:mbox_sh1sum value, and infers that it is the same thing.  The protocol assumes that only one person (or entity) has a given email address, and hence checksum of email address. 

This is clearly more complicated than the simple link!  The tabular can do both.   Links which could be followed are represented by blue dots.  The smushing by mailbox is handled when the properties are declared as owl:InverseFunctionalProperty. 

Exploring a web of data in outline mode

Background

Many RDF visualizers have represented the data as circle-and-arrow diagrams.  This gives you a feel for how a small number of things connect.  It allows you to see clustering when a large number of things are related by the same properties, such as in Foafnaut or How/Why diagrams we have made at W3C. Circles and arrows are very intuitive and useful when trying to understand something. However, it isn't a way to look at data with many nodes and many different properties. It isn't used in applications we think of as handling data, such as personal financial management, music management, calendar management programs, for example.  In these cases one uses tables or matrices. These are the densest way of comparing objects of the same class (strictly, which are likely to share common properties).  mSpace  is an example of a table-based semantic web browser. These table-based systems, though, tend to operate on a given set of data, and don't naturally allow browsing.

Outline mode is extremely common and clearly natural for tree-oriented data. People are typically very comfortable in a tree-like environment. In fact, much data in the world has been organized into trees, and the web is composed of overlapping trees with local roots all over the place.  This suggests that in fact a tree-oriented browser will feel natural even though the world is actually a web. In early hypertext, Peter Brown's Guide system was an outliner, and the Gopher system was a presented as tree (not in outline mode) though in fact a web. 

The outline browser is quite straightforward.  Most of the design decisions are around how to represent options to load data, and whether to load it automatically - many variations are possible.  There are questions as to whether to express links in bother directions, and whether to suppress links which are reverse versions of a link displayed at a higher level.

One meme of RDF ethos is that the direction one choses for a given property is arbitrary: it doesn't matter whether one defines "parent" or "child", "employee" or "employer". This philosophy (from the Enquire design of 1980) is that one should not favor one way over another.  On the other hand, also one should not encourage people having to declare both a property and its inverse, which would simply double the number of definitions out there, and give one more axis of arbitrary variation in the way information is expressed. Therefore, the design is to make the system treat forward and backward links equivalently.   The outliner therefore displays links in both directions. (it should sort them so obverse and reverse appear together).

The outline is therefore redundant, in that whenever an object is expanded more than one deep, for outer link will also be found as a "dual" inner link in the opposite direction. (e.g. Sam: father Joe: [ is father of: Sam].)  That isn't necessarily a problem. An interesting operation is "inversion", in which everything which the outer link is removed, and replaced by its inner dual link, and all the information which was outside the original link  is removed in favor of its dual inside the inside the dual of the original link. This is called refocusing in the user manual. It brings the selected branch to be a new root.

Query by example

The outline mode

(to be continued)


Acknowledgments

The code uses the RDF browser written by Jim Ley, without which the activation energy for the project would not have been reached at all.

Tim Berners-Lee hacked together the original program in spare time around the end of 2005.

Ruth Dhanaraj improved the whole program in many ways, including coding the asynchronous source management module (sources.js), January-Febrary 2006.