Primer - Getting into the semantic web and RDF using N3

The world of the semantic web, as based on RDF, is really simple at the base. This article shows you how to get started. It uses a simplified teaching language -- Notation 3 or N3 -- which is basically equivalent to RDF in its XML syntax, but easier to scribble when getting started.

Subject, verb and object

In RDF, information is simply a collection of statements, each with a subject, verb and object - and nothing else. In N3, you can write an RDF triple just like that, with a period:
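As a minimal illustration, here is one statement. The identifiers #pat, #knows and #jo are just names chosen for the example:

```n3
# subject   verb      object, ended by a period
<#pat> <#knows> <#jo> .
```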

Everything, be it subject, verb, or object, is identified with a Uniform Resource Identifier (URI). This is something like <http://www.w3.org/> or <http://www.w3.org/2000/10/swap/test/s1.n3#includes>. When everything before the "#" is left out, as in <#pat>, the identifier refers to something in the current document, whatever that document is.

There is one exception: the object (only) can be a literal, such as a string or integer:
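A small sketch of both kinds of literal, assuming the illustrative properties #age and #eyecolor:

```n3
<#pat> <#knows> <#jo> .        # object is a URI
<#pat> <#age> 24 .             # object is an integer literal
<#pat> <#eyecolor> "blue" .    # object is a string literal
```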

The verb "knows" is in RDF called a "property" and thought of as a noun expressing a relation between the two. In fact you can write
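One way to see the noun reading, as a sketch: N3 also accepts the keywords "has" and "is ... of" around the property, so the three statements below all say the same thing. The identifiers #child and #al are assumptions made for the example:

```n3
<#pat> <#child> <#al> .        # plain triple
<#pat> has <#child> <#al> .    # "pat has child al"
<#al> is <#child> of <#pat> .  # "al is child of pat", same triple with the order reversed
```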

There are two shortcuts for when you have several statements about the same subject: a semicolon ";" introduces another property of the same subject, and a comma introduces another object with the same predicate and subject.
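A sketch of both shortcuts in one statement, using illustrative identifiers:

```n3
<#pat> <#child> <#al> , <#jo> ;   # comma: two objects for the same subject and property
       <#age> 24 ;                # semicolon: another property of <#pat>
       <#eyecolor> "blue" .       # the period ends the whole group
```

This stands for four separate triples, all with <#pat> as the subject.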

Sometimes the things involved in a statement don't actually have any identifier you want to give them - you know one exists, but you only want to give its properties. You represent this by square brackets with the properties inside.
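For instance, assuming the illustrative properties #child and #age, two unnamed children could be described like this:

```n3
# Each pair of square brackets stands for something that exists but has no identifier
<#pat> <#child> [ <#age> 4 ] , [ <#age> 3 ] .
```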

You could read this as #pat has a #child which has an #age of "4" and a #child which has an #age of "3". There are two important things to remember

There are many ways of combining square brackets - but you can figure that out from the examples later on. There is not much left to learn about using N3 to express data, so let us move on.

Sharing concepts

The semantic web can't define in one document what something means. That's something you can do in English (or occasionally in math), but when we really communicate using the concept "title" (such as on a Library of Congress catalog card or a web page), we rely on a shared concept of "title". On the semantic web, we share quite precisely by using exactly the same URI for the concept of title.
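By way of contrast, a document could try to give itself a title using a term it makes up itself. The title string here is only an illustration:

```n3
# The subject <> is the document itself; <#title> is a term local to this document
<> <#title> "A simple example of N3" .
```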

(The <>, being an empty URI reference, always refers to the document it is written in.) The <#title> refers to the concept of #title as defined by the document itself. This won't mean much to the reader. However, a group of people created a list of properties called the Dublin Core, among which is their idea of title, which they gave the identifier <http://purl.org/dc/elements/1.1/title>. So we can make a much better defined statement if we say
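A sketch of such a statement, using this article's own title as the example value:

```n3
# The property is now the shared Dublin Core idea of title, written out in full
<> <http://purl.org/dc/elements/1.1/title> "Primer - Getting into the semantic web and RDF using N3" .
```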

That of course would be a bit verbose - imagine using such long identifiers for everything like #age and #eyecolor above. So N3 allows you to set up a shorthand prefix for the long part - the part we call the namespace. You set it up using "@prefix" like this:
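A minimal sketch, assuming we pick "dc" as the prefix name for the Dublin Core namespace:

```n3
# Declare the shorthand once ...
@prefix dc: <http://purl.org/dc/elements/1.1/> .

# ... then dc:title stands for <http://purl.org/dc/elements/1.1/title>
<> dc:title "Primer - Getting into the semantic web and RDF using N3" .
```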

Note that when you use a prefix, you use a colon instead of a hash between dc and title, and you don't use the <angle brackets> around the whole thing. This is much quicker. This is how you will see and write almost all your predicates in N3. Once set up, a prefix can be used for the rest of the file.

There is an increasingly large number of RDF vocabularies for you to refer to - check the RDF home page and things linked from it - and you can build your own for your own applications very simply.

From now on, we are going to use some well known namespaces, and so to save space, I will just assume the prefixes
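Written out as declarations, these would look as follows. The prefix names rdf, rdfs and owl are the conventional ones:

```n3
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
```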

These are the RDF, RDF schema, and OWL namespaces, respectively. They give us the core terms with which we can bootstrap ourselves into the semantic web. I am also going to assume that the empty prefix stands for the document we are writing, which we can say in N3 as
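A sketch of that declaration, followed by an illustrative statement rewritten with it. The terms pat, child and age are assumptions made for the example:

```n3
# The empty prefix maps to the current document's own namespace
@prefix : <#> .

# Same meaning as: <#pat> <#child> [ <#age> 4 ] , [ <#age> 3 ] .
:pat :child [ :age 4 ] , [ :age 3 ] .
```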

With the empty prefix set up this way, <#pat> can be written as :pat, which is slightly fewer characters to type. Now that you understand how to write data in N3, you can start making up your own vocabularies, because they are just data themselves.

Making vocabularies

Things like dc:title above are RDF Properties. When you want to define a new vocabulary you define new classes of things and new properties. When you say what type of thing something is, you say a Class it belongs to.

The property which tells you what type something is is rdf:type, which can be abbreviated in N3 to just a. So we can define a class of person
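A minimal sketch, with the prefixes spelled out so the fragment stands alone. The names Person and Pat are illustrative:

```n3
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <#> .

:Person a rdfs:Class .   # "a" is short for rdf:type
:Pat a :Person .         # Pat is a member of the class Person
```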

Classes just tell you about the thing which is in them. An object can be in many classes. There doesn't have to be any hierarchical relationship -- think of Person, AnimateObject, Animal, TallPerson, Friend, and so on. If there is a relationship between two classes you can state it - check out the properties (of classes) in the RDF Schema and OWL vocabularies.
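As one example of such a relationship, RDF Schema provides rdfs:subClassOf. In this sketch the class names are assumptions:

```n3
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <#> .

:Person a rdfs:Class .
# Everything in the class Woman is also in the class Person
:Woman a rdfs:Class ; rdfs:subClassOf :Person .
```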

A property is something which is used to declare a relationship between two things.
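Declaring one looks much like declaring a class. The property name sister is illustrative:

```n3
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <#> .

:sister a rdf:Property .
```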

Sometimes when a relationship exists between two things, you immediately know something about them, which you can express as a class. When the subject of any property must be in a class, that class is a domain of the property. When the object must be in a class, that class is called the range of a property. A property can have many domains and ranges, but typically one specifies one.
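A sketch of a domain and a range for an illustrative property, assuming the classes Person and Woman are defined in the same document:

```n3
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <#> .

# Whoever has a sister is a Person; whoever is a sister is a Woman
:sister rdfs:domain :Person ;
        rdfs:range  :Woman .
```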

Note that the class identifiers start with capitals and properties with lower case letters. This is not a rule, but it is a good convention to stick to. Note also that because the domain of rdfs:range and rdfs:domain themselves is rdf:Property, it follows that :sister is an rdf:Property without it being stated explicitly.

Equivalence

Often, you define a vocabulary where one or more of the terms, whether or not you realized it when you started, is in fact exactly the same as one in another vocabulary. This is a really useful titbit of information for any machine or person dealing with the information! The property of equivalence between two terms is so useful and fundamental that N3 has a special shorthand for it, "=".
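A sketch of the shorthand in use. The foo vocabulary and its namespace URI are hypothetical, invented only for this example; dc is the Dublin Core namespace mentioned earlier:

```n3
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dc:  <http://purl.org/dc/elements/1.1/> .
@prefix foo: <http://example.org/foo#> .   # hypothetical vocabulary
@prefix : <#> .

:Woman = foo:FemaleAdult .                 # our class is the same as theirs
:Title a rdf:Property ; = dc:title .       # our property is the same as Dublin Core's title
```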

Tip: Use other people's vocabularies when you can - it helps interchange of data. When you define your own vocabulary which includes synonyms, do record the equivalence because this, likewise, will help present and future processors process your and others' data in meaningful ways.

Choosing a namespace and publishing your vocabulary

Good on-line documentation for vocabulary terms helps people read and write RDF data. Writers need to see how a term is supposed to be used; readers need to see what it is supposed to mean. People developing software which uses the terms need to know in particular detail exactly what each URI means.

If you document your vocabulary using the RDF Schema and OWL vocabularies, then your documentation will be machine-readable in a variety of interesting and useful ways, as mentioned above and covered in more detail in Vocabulary Documentation. This kind of RDF-documentation-in-RDF is sometimes called a "schema" or an "ontology."

The easiest way to help people find your documentation is to make the URIs you create as vocabulary terms also work in a web browser. This happens automatically if you follow the naming convention we use here, where the vocabulary definition document has a URI like http://example.com/terms and it refers to its terms like <#Woman>. With the @prefix declaration above, this gives the URI http://example.com/terms#Woman which should work in any browser to display the definition document.
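To make this concrete, here is a sketch of what such a definition document might contain, assuming it is served at http://example.com/terms:

```n3
# Document location (assumed): http://example.com/terms
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <#> .

# :Woman expands to http://example.com/terms#Woman
:Woman a rdfs:Class .
```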

Ideally, you should publish your documentation on the web using a server and portion of URI-space which are owned by an organization which can commit to maintaining them well into the future. That way, many years down the road, RDF data using your terms will still be documented and potentially understandable. The convention of putting the current year into the URI can help with stability; some day people may be tempted to re-use http://example.com/food-vocabulary, but they will probably only touch http://example.com/2003/food-vocabulary, when they really mean to upgrade the documentation there. In some circumstances you can also achieve increased stability by using a specialized domain name which may be insulated from possible organizational renaming and trademark issues.

Of course if you are just playing around, you can use a file (say mydb.n3) in the same directory as the rest of your work. When you do that, you can simply use <mydb.n3#> as your namespace identifier, because in N3 (as in HTML), the URIs can be specified relative to the current location.
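A sketch of a data file sitting next to mydb.n3 and borrowing its terms. The property name age is an assumption about what mydb.n3 defines:

```n3
# The namespace is given relative to this file's own location
@prefix : <mydb.n3#> .

<#pat> :age 24 .   # :age resolves to the term "age" in mydb.n3
```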

Now you know all you need to start creating your own vocabularies, or ontologies, and you have pointers to where to look for the richer ways of defining them. You don't have to go any further, as what you have now will allow you to create new applications, and create schemas, data files, and programs which interchange and manipulate data for the semantic web.

At this point, you should be getting the hang of it and be writing stuff. To give you some more ideas, though, there is a longer list of more complex and varied examples. These come with less tutorial explanation.

Or, you can continue with a tutorial which goes into more features of the language, explaining how to process your data and involve other data on the Web. In that case, the next bit is about: Shortcuts and long cuts

	age	eyecolor
pat	24	blue
al	3	green
jo	5	green
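
As a sketch, the table above could be written in N3 with one statement group per row, using the semicolon shortcut. Treating each row label and column header as an identifier in the current document is an assumption:

```n3
<#pat> <#age> 24 ; <#eyecolor> "blue" .
<#al>  <#age>  3 ; <#eyecolor> "green" .
<#jo>  <#age>  5 ; <#eyecolor> "green" .
```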
