Copyright © 2003 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web. It is particularly intended for representing metadata about Web resources, such as the title, author, and modification date of a Web page, copyright and licensing information about a Web document, or the availability schedule for some shared resource. However, by generalizing the concept of a "Web resource", RDF can also be used to represent information about things that can be identified on the Web, even when they can't be directly retrieved on the Web. RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning.
This Primer is designed to provide the reader with the basic knowledge required to effectively use RDF. It introduces the basic concepts of RDF and describes its XML syntax. It describes how to define RDF vocabularies using the RDF Vocabulary Description Language, and gives an overview of some deployed RDF applications. It also describes the content and purpose of other RDF specification documents.
This is a W3C RDF Core Working Group Last Call Working Draft produced as part of the W3C Semantic Web Activity (Activity Statement).
This document is in the Last Call review period, which ends on 21 February 2003. This document has been endorsed by the RDF Core Working Group.
This document incorporates material developed by the Working Group designed to provide the reader with the basic knowledge required to effectively use RDF in their particular applications.
This document is being released for review by W3C Members and other interested parties to encourage feedback and comments, especially with regard to how the changes made affect existing implementations and content.
In conformance with W3C policy requirements, known patent and IPR constraints associated with this Working Draft are detailed on the RDF Core Working Group Patent Disclosure page.
Comments on this document are invited and should be sent to the public mailing list www-rdf-comments@w3.org. An archive of comments is available at http://lists.w3.org/Archives/Public/www-rdf-comments/.
This is a public W3C Last Call Working Draft for review by W3C Members and other interested parties. This section describes the status of this document at the time of its publication. It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite as other than "work in progress". A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.
1. Introduction
2. Making Statements About Resources
2.1 Basic Concepts
2.2 The RDF Model
2.3 Structured Property Values and Blank Nodes
2.4 Typed Literals
2.5 Concepts Summary
3. An XML Syntax for RDF: RDF/XML
3.1 Basic Principles
3.2 Abbreviating and Organizing RDF URIrefs
3.3 RDF/XML Summary
4. Other RDF Capabilities
4.1 RDF Containers
4.2 RDF Collections
4.3 RDF Reification
4.4 More on Structured Values: rdf:value
5. Defining RDF Vocabularies: RDF Schema
5.1 Defining Classes
5.2 Defining Properties
5.3 Interpreting RDF Schema Declarations
5.4 Other Schema Information
5.5 Richer Schema Languages
6. Some RDF Applications: RDF in the Field
6.1 Dublin Core Metadata Initiative
6.2 PRISM
6.3 XPackage
6.4 RSS 1.0: RDF Site Summary
6.5 CIM/XML
6.6 Gene Ontology Consortium
6.7 Describing Device Capabilities and User Preferences
7. Other Parts of the RDF Specification
7.1 RDF Semantics
7.2 Test Cases
8. References
8.1 Normative References
8.2 Informational References
9. Acknowledgments
A. More on Uniform Resource Identifiers (URIs)
B. More on the Extensible Markup Language (XML)
The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web. It is particularly intended for representing metadata about Web resources, such as the title, author, and modification date of a Web page, copyright and licensing information about a Web document, or the availability schedule for some shared resource. However, by generalizing the concept of a "Web resource", RDF can also be used to represent information about things that can be identified on the Web, even when they can't be directly retrieved on the Web. Examples include information about items available from online shopping facilities (e.g., information about specifications, prices, and availability), or the description of a Web user's preferences for information delivery.
RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning. Since it is a common framework, application designers can leverage the availability of common RDF parsers and processing tools. The ability to exchange information between different applications means that the information may be made available to applications other than those for which it was originally created.
RDF is based on the idea of identifying things using Web identifiers (URIs), and describing resources in terms of simple properties and property values. This enables RDF to represent simple statements about resources as a graph of nodes and arcs representing the resources, and their properties and values. To make this discussion somewhat more concrete as soon as possible, the group of statements "there is someone whose name is Eric Miller, whose email address is em@w3.org, and whose title is Dr." could be represented as the RDF graph in Figure 1:
Figure 1 illustrates that RDF uses URIs to identify:
RDF also provides an XML-based syntax (called RDF/XML) for recording and exchanging these graphs. Example 1 is a small chunk of RDF in RDF/XML corresponding to the graph in Figure 1:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
<contact:Person rdf:about="http://www.w3.org/People/EM/contact#me">
<contact:fullName>Eric Miller</contact:fullName>
<contact:mailbox rdf:resource="mailto:em@w3.org"/>
<contact:personalTitle>Dr.</contact:personalTitle>
</contact:Person>
</rdf:RDF>
Note that this RDF/XML also contains URIs, as well as properties like mailbox and fullName (in an abbreviated form), and their respective values em@w3.org and Eric Miller.
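To make the correspondence between Example 1 and its graph more tangible, the following sketch reads the RDF/XML with Python's standard XML parser and lists the statements it contains. It is an illustration only: the helper names uri_of and simple_triples are assumptions of this sketch, and it handles just the narrow shape used in Example 1 (a typed node element with rdf:about and flat property elements), not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Example 1, copied from the text above.
EXAMPLE_1 = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
  <contact:Person rdf:about="http://www.w3.org/People/EM/contact#me">
    <contact:fullName>Eric Miller</contact:fullName>
    <contact:mailbox rdf:resource="mailto:em@w3.org"/>
    <contact:personalTitle>Dr.</contact:personalTitle>
  </contact:Person>
</rdf:RDF>"""


def uri_of(tag):
    """Turn ElementTree's '{namespace}local' tag form into a plain URI."""
    namespace, local = tag[1:].split("}", 1)
    return namespace + local


def simple_triples(rdf_xml):
    """Yield (subject, predicate, object) strings for the shape of Example 1.

    Hypothetical helper: it ignores most of the RDF/XML syntax and does not
    distinguish resources from literals in its output.
    """
    root = ET.fromstring(rdf_xml)
    for node in root:
        subject = node.get("{%s}about" % RDF_NS)
        # A typed node element (contact:Person) implies an rdf:type statement.
        yield (subject, RDF_NS + "type", uri_of(node.tag))
        for prop in node:
            resource = prop.get("{%s}resource" % RDF_NS)
            # rdf:resource gives a resource as the value; otherwise the
            # element's text content is the value.
            obj = resource if resource is not None else (prop.text or "")
            yield (subject, uri_of(prop.tag), obj)


triples = list(simple_triples(EXAMPLE_1))
for triple in triples:
    print(triple)
```

Each printed tuple names a resource, one of its properties, and that property's value, with the abbreviated property names expanded into full URIs.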
Like HTML, this RDF/XML is machine processable, and, using URIs, can link pieces of information across the Web. However, unlike conventional hypertext, RDF URIs can refer to any identifiable thing, including things that may not be directly retrievable on the Web (such as the person Eric Miller). The result is that in addition to describing such things as Web pages, we can also describe cars, businesses, people, news events, etc. In addition, RDF properties themselves have URIs, to precisely identify the kind of relationship that exists between the linked items.
The following documents contribute to the specification of RDF:
This Primer is intended to provide an introduction to RDF and describe some existing RDF applications, to help information system designers and application developers understand the features of RDF and how to use them. In particular, the Primer is intended to answer such questions as:
The Primer is a non-normative document, which means that it does not provide a definitive specification of RDF. The examples and other explanatory material in the Primer are provided to help you understand RDF, but they may not always provide definitive or fully-complete answers. In such cases, you should refer to the relevant normative parts of the RDF specification. To help you do this, we provide links pointing to the relevant parts of the normative specifications.
RDF is intended to provide a simple way to make statements about Web resources, e.g., Web pages. In this section, we describe the basic ideas behind the way RDF provides these capabilities (the normative specification describing these concepts is RDF Concepts and Abstract Syntax [RDF-CONCEPTS]).
Imagine that we want to state the fact that someone named John Smith created a particular Web page. A straightforward way to state this in English would be in the form of a simple statement such as:
http://www.example.org/index.html has a creator whose value is John Smith
We've underlined parts of this statement to illustrate that, in order to describe the properties of something, we need ways to name, or identify, a number of things:
In this statement, we've used the Web page's URL (Uniform Resource Locator) to identify it. In addition, we've used the word "creator" to identify the property we want to talk about, and the two words "John Smith" to identify the thing (a person) we want to say is the value of this property.
We could state other properties of this Web page by writing additional English statements of the same general form, using the URL to identify the page, and words (or other expressions) to identify the properties and their values. For example, to specify the date the page was created, and the language in which the page is written, we could write the additional statements:
http://www.example.org/index.html has a creation-date whose value is August 16, 1999
http://www.example.org/index.html has a language whose value is English
RDF is based on the idea that the things we want to describe have properties which have values, and that resources can be described by making statements, similar to those above, that specify those properties and values. RDF uses a particular terminology for talking about the various parts of statements. Specifically, the part that identifies the thing the statement is about (the Web page in this example) is called the subject. The part that identifies the property or characteristic of the subject that the statement specifies (creator, creation-date, or language in these examples) is called the predicate, and the part that identifies the value of that property is called the object. So, taking the English statement
http://www.example.org/index.html has a creator whose value is John Smith
the RDF terms for the various parts of the statement are:
However, while English is good for communicating between (English-speaking) humans, RDF is about making machine-processable statements. To make these kinds of statements suitable for processing by machines, we need two things:
Fortunately, the existing Web architecture provides both these necessary facilities.
As we've seen, the Web already provides one form of identifier, the Uniform Resource Locator (URL). We used a URL in our original example to identify the Web page that John Smith created. A URL is a character string that identifies a Web resource by representing its primary access mechanism (essentially, its network "location"). However, we would also like to be able to record information about many things that, unlike Web pages, don't have network locations or URLs.
The Web provides a more general form of identifier for these purposes, called the Uniform Resource Identifier (URI). URLs are a particular kind of URI. All URIs share the property that different persons or organizations can independently create them, and use them to identify things. However, URIs are not limited to identifying things that have network locations, or use other computer access mechanisms. In fact, we can create a URI to refer to anything we want to talk about, including
Because of this generality, RDF uses URIs as the basis of its mechanism for identifying the subjects, predicates, and objects in statements. To be more precise, RDF uses URI references [URIS]. A URI reference (or URIref) is a URI, together with an optional fragment identifier at the end. For example, the URI reference http://www.example.org/index.html#section2 consists of the URI http://www.example.org/index.html and (separated by the "#" character) the fragment identifier section2. RDF defines a resource as anything that is identifiable by a URI reference, so using URIrefs allows RDF to describe practically anything, and to state relationships between such things as well. URIrefs and fragment identifiers are discussed further in Appendix A and [RDF-CONCEPTS].
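As a small illustration of how a URI reference divides into a URI and a fragment identifier, the following sketch uses the urldefrag function from Python's standard urllib.parse module. The variable names are chosen for this sketch only.

```python
from urllib.parse import urldefrag

uriref = "http://www.example.org/index.html#section2"

# urldefrag splits a URI reference at the "#" character.
uri, fragment = urldefrag(uriref)

print(uri)       # the URI part, before the "#"
print(fragment)  # the fragment identifier, after the "#"
```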
To represent RDF statements in a machine-processable way, RDF uses the Extensible Markup Language [XML]. XML was designed to allow anyone to design their own document format and then write a document in that format. RDF defines a specific XML markup language, referred to as RDF/XML, for use in representing RDF information, and for exchanging it between machines. An example of RDF/XML was given in Section 1. That example (Example 1) used tags such as <contact:fullName> and <contact:personalTitle> to delimit the text content Eric Miller and Dr., respectively. Such tags allow programs written with an understanding of what the tags mean to properly interpret that content. Appendix B provides further background on XML in general. The specific RDF/XML syntax used for RDF is described in more detail in Section 3.
Now that we've introduced RDF's basic statement concepts, URI references for identifying things we want to talk about on the Web, and RDF/XML as a machine-processable way of representing RDF statements, we can describe how RDF lets us use URIs to make statements about resources. In the introduction, we said that RDF was based on the idea of expressing simple statements about resources, where those statements are built using subjects, predicates, and objects. In RDF, we could represent our original English statement:
http://www.example.org/index.html has a creator whose value is John Smith
by an RDF statement having:
Note how we have used URIrefs to identify not only the subject of the original statement, but also the predicate and object, instead of using the words "creator" and "John Smith", respectively. We'll discuss this further later in this section.
RDF models statements as nodes and arcs in a graph. RDF's graph model is defined in [RDF-CONCEPTS]. In this notation, a statement is represented by:
So the RDF statement above would be represented by the graph shown in Figure 2:
Groups of statements are represented by corresponding groups of nodes and arcs. So if we wanted to also represent the additional statements
http://www.example.org/index.html has a creation-date whose value is August 16, 1999
http://www.example.org/index.html has a language whose value is English
we could, by using suitable URIrefs to name the properties "creation-date" and "language", use the graph shown in Figure 3:
Figure 3 illustrates that the objects of RDF statements may be either resources identified by URIrefs, or constant values (called literals) represented by character strings, in order to represent certain kinds of property values. Literals may not be the subjects or predicates of RDF statements. (The simple character string literals we will use for now are called plain literals, to distinguish them from the typed literals we will introduce in Section 2.4. The various kinds of literals that can be used in RDF statements are defined in [RDF-CONCEPTS].) In drawing RDF graphs, nodes that represent resources identified by URIrefs are shown as ellipses, while nodes that represent literals are shown as boxes (labeled by the literal itself).
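The restriction on where literals may appear can be sketched as a simple check. The classes URIRef and Literal and the function check_statement below are hypothetical names introduced for this illustration; they are not part of any RDF specification or library, and the sketch covers only the two kinds of node introduced so far.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URIRef:
    """A resource identified by a URI reference (drawn as an ellipse)."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A plain literal, i.e. a character string (drawn as a box)."""
    value: str


def check_statement(subject, predicate, obj):
    """Return the statement as a tuple, rejecting literals where not allowed."""
    if isinstance(subject, Literal):
        raise ValueError("a literal may not be the subject of a statement")
    if not isinstance(predicate, URIRef):
        raise ValueError("a predicate must be identified by a URIref")
    # The object may be either a resource or a literal.
    return (subject, predicate, obj)


page = URIRef("http://www.example.org/index.html")
language = URIRef("http://www.example.org/terms/language")

statement = check_statement(page, language, Literal("English"))
print(statement)
```

Calling check_statement(Literal("English"), language, page) would raise ValueError, since a literal cannot be a subject.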
Sometimes it is not convenient to draw graphs when discussing them, so an alternative way of writing down the statements, called triples, is also used. In the triples notation, each statement in the graph is written as a simple triple of subject, predicate, and object node labels (either URIref or literal), in that order. The triples representing the three statements shown in Figure 3 would be written in full as:
<http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/creator> <http://www.example.org/staffid/85740> .
<http://www.example.org/index.html> <http://www.example.org/terms/creation-date> "August 16, 1999" .
<http://www.example.org/index.html> <http://www.example.org/terms/language> "English" .
Each triple corresponds to a single arc in the graph, complete with the arc's beginning and ending nodes (the subject and object of the statement). Unlike the drawn graph (but like the original statements), the triples notation requires that a node be separately identified for each statement it appears in. So, for example, http://www.example.org/index.html appears three times (once in each triple) in the triples representation of the graph, but only once in the drawn graph. However, the triples represent exactly the same information as the drawn graph, and this is a key point: what is fundamental to RDF is the graph model of the statements. The notation used to represent or depict the graph is secondary.
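The relationship between the triples notation and the graph can be sketched in a few lines of Python. This is an illustration under simple assumptions: each triple is a tuple of strings, and literals keep their quotation marks so that they remain distinguishable from URIrefs.

```python
PAGE = "http://www.example.org/index.html"

# The three triples from the text above, one tuple per statement.
triples = [
    (PAGE, "http://purl.org/dc/elements/1.1/creator",
     "http://www.example.org/staffid/85740"),
    (PAGE, "http://www.example.org/terms/creation-date", '"August 16, 1999"'),
    (PAGE, "http://www.example.org/terms/language", '"English"'),
]

nodes = set()  # each node is stored once, as in the drawn graph
arcs = []      # one labeled arc per triple

for subject, predicate, obj in triples:
    nodes.add(subject)
    nodes.add(obj)
    arcs.append((subject, predicate, obj))

# The page is written out once per triple, but it is a single node.
mentions = sum(1 for subject, _, _ in triples if subject == PAGE)
print(mentions, "mentions in the triples")
print(len(nodes), "nodes and", len(arcs), "arcs in the graph")
```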
The full triples notation requires that URI references be written out completely, in angle brackets, which, as the example above illustrates, can result in very long lines. For convenience, we will use a shorthand way of writing triples in the rest of this Primer, and also in other RDF specifications. In this shorthand, we can substitute a qualified name (or QName) without angle brackets as an abbreviation of a full URI reference. A QName contains a prefix that has been assigned to a namespace URI, followed by a colon, and then a local name (QNames are discussed further in Appendix B). So, for example, if the QName prefix foo is assigned to the namespace URI http://example.org/somewhere/, then the QName foo:bar is shorthand for the URIref http://example.org/somewhere/bar. We will also make extensive use in these examples of several "well-known" QName prefixes (which we will use without explicitly specifying them each time), defined as follows:
prefix rdf:, namespace URI: http://www.w3.org/1999/02/22-rdf-syntax-ns#
prefix rdfs:, namespace URI: http://www.w3.org/2000/01/rdf-schema#
prefix dc:, namespace URI: http://purl.org/dc/elements/1.1/
prefix daml:, namespace URI: http://www.daml.org/2001/03/daml+oil#
prefix ex:, namespace URI: http://www.example.org/ (or http://www.example.com/)
prefix xsd:, namespace URI: http://www.w3.org/2001/XMLSchema#
We will also use variations on the "example" prefix ex: as needed in the examples, where this will not cause confusion, for example,
prefix exterms:, namespace URI: http://www.example.org/terms/ (for terms used by our example organization),
prefix exstaff:, namespace URI: http://www.example.org/staffid/ (for our example organization's staff identifiers),
prefix ex2:, namespace URI: http://www.domain2.example.org/ (for a second example organization), and so on.
Using our new shorthand, we can write the previous set of triples as:
ex:index.html dc:creator exstaff:85740 .
ex:index.html exterms:creation-date "August 16, 1999" .
ex:index.html exterms:language "English" .
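The QName shorthand amounts to a table lookup followed by string concatenation. The following sketch shows one way to expand a QName back into a full URIref; the PREFIXES table repeats the prefix assignments listed above, and the function name expand is an assumption of this sketch.

```python
# Prefix assignments taken from the list of "well-known" prefixes above.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ex": "http://www.example.org/",
    "exterms": "http://www.example.org/terms/",
    "exstaff": "http://www.example.org/staffid/",
}


def expand(qname):
    """Expand a QName such as 'dc:creator' into a full URIref."""
    prefix, local_name = qname.split(":", 1)
    # An unknown prefix raises KeyError rather than being guessed.
    return PREFIXES[prefix] + local_name


shorthand = ("ex:index.html", "dc:creator", "exstaff:85740")
full = tuple(expand(term) for term in shorthand)
print(full)
```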
The examples we've just given of RDF statements begin to illustrate some of the advantages of using URIrefs as RDF's basic way of identifying things. For instance, instead of identifying the creator of the Web page in our first example by the character string "John Smith", we've assigned him a URIref, in this case (using a URIref based on his employee number) http://www.example.org/staffid/85740 . An advantage of using a URIref in this case is that we can be more precise in our identification. That is, the creator of the page isn't the character string "John Smith", or any one of the thousands of people named John Smith, but the particular John Smith associated with that URIref (whoever created the URIref defines the association). Moreover, since we have a URIref for the creator of the page, it is a full-fledged resource, and we can record additional information about him, such as his name, and age, as in the graph shown in Figure 4:
These examples also illustrate that RDF uses URIrefs as predicates in RDF statements. That is, rather than using character strings (or words) such as "creator" or "name" to identify properties, RDF uses URIrefs. Using URIrefs to identify properties is important for a number of reasons. First, it allows us to distinguish the properties we use from properties someone else may use that would otherwise be identified by the same character string. For instance, in our example, example.org uses "name" to mean someone's full name written out as a character string literal (e.g., "John Smith"), but someone else may intend "name" to mean something different (e.g., the name of a variable in a piece of program text). A program encountering "name" as a property identifier on the Web wouldn't necessarily be able to distinguish these uses. However, if example.org writes http://www.example.org/terms/name for its "name" property, and the other person writes http://www.domain2.example.org/genealogy/terms/name for hers, we can keep straight the fact that there are distinct properties involved (even if a program cannot automatically determine the distinct meanings). Another reason why it is important to use URIrefs to identify properties is that it allows us to treat RDF properties as resources themselves. Since properties are resources, we can record descriptive information about them (e.g., the English description of what example.org means by "name"), simply by adding additional RDF statements with the property's URIref as the subject.
Using URIrefs as subjects, predicates, and objects in RDF statements allows us to begin to develop and use a shared vocabulary on the Web, reflecting (and creating) a shared understanding of the concepts we talk about. For example, in the triple
ex:index.html dc:creator exstaff:85740 .
the predicate dc:creator, when fully expanded as a URIref, is an unambiguous reference to the "creator" attribute in the Dublin Core metadata attribute set (discussed further in Section 6.1), a widely-used set of attributes (properties) for describing information of all kinds. The writer of this triple is effectively saying that the relationship between the Web page (identified by http://www.example.org/index.html ) and the creator of the page (a distinct person, identified by http://www.example.org/staffid/85740 ) is exactly the concept identified by http://purl.org/dc/elements/1.1/creator . Moreover, anyone else, or any program, that understands http://purl.org/dc/elements/1.1/creator will know exactly what is meant by this relationship.
Of course, RDF's use of URIrefs doesn't solve all our problems because, for example, people can still use different URIrefs to refer to the same thing. However, the fact that these different URIrefs are used in the commonly-accessible "Web space" creates the opportunity both to identify equivalences among these different references, and to migrate toward the use of common references.
The result of all this is that RDF provides a way to make statements that applications can more easily process. Now an application can't actually "understand" such statements, of course, but it can deal with them in a way that makes it seem like it does. For example, a user could search the Web for all book reviews and create an average rating for each book. Then, the user could put that information back on the Web. Another web site could take that list of book rating averages and create a "Top Ten Highest Rated Books" page. Here, the availability and use of a shared vocabulary about ratings, and a shared group of URIrefs identifying the books they apply to, allows individuals to build a mutually-understood and increasingly-powerful (as additional contributions are made) "information base" about books on the Web. The same principle applies to the vast amounts of information that people create about thousands of subjects every day on the Web.
RDF statements are similar to a number of other formats for recording information, such as:
and information in these formats can be treated as RDF statements, allowing RDF to be used to integrate data from many sources.
Things would be very simple if the only types of information we had to record about things were obviously in the form of the simple RDF statements we've illustrated so far. However, most real-world data involves structures that are more complicated than that, at least on the surface. For instance, in our original example, we recorded the date the Web page was created as a single exterms:creation-date property, with a plain literal as its value. However, suppose we wanted to show, as the value of the exterms:creation-date property, the month, day, and year as separate pieces of information? Or, in the case of John Smith's personal information, suppose we wanted to record his address. We might write the whole address out as a plain literal, as in the triple
exstaff:85740 exterms:address "1501 Grant Avenue, Bedford, Massachusetts 01730" .
However, suppose we wanted to record John's address as a structure consisting of separate street, city, state, and Zip code values? How do we do this in RDF?
We can represent such structured information in RDF by considering the aggregate thing we want to talk about (like John Smith's address) as a resource, and then making statements about that new resource. So, in the RDF graph, in order to break up John Smith's address into its component parts, we create a new node to represent the concept of John Smith's address, and assign that concept a new URIref to identify it, say http://www.example.org/addressid/85740 (which we will abbreviate as exaddressid:85740). We then write RDF statements (create additional arcs and nodes) with that node as the subject, to represent the additional information, producing the graph shown in Figure 5:
or the triples:
exstaff:85740 exterms:address exaddressid:85740 .
exaddressid:85740 exterms:street "1501 Grant Avenue" .
exaddressid:85740 exterms:city "Bedford" .
exaddressid:85740 exterms:state "Massachusetts" .
exaddressid:85740 exterms:Zip "01730" .
Using this approach allows us to represent structured information in RDF, but it can involve generating numerous "intermediate" URIrefs to represent aggregate concepts such as John's address. Such concepts may never need to be referred to directly from outside a particular graph, and hence may not require "universal" identifiers. In addition, in the drawing of the graph representing the group of statements shown in Figure 5, we didn't really need the URIref we assigned to identify "John Smith's address", since we could just as easily have drawn the graph as in Figure 6:
In Figure 6, which is a perfectly good RDF graph, we've used a node without a label to stand for the concept of "John Smith's address". This unlabeled node, or blank node, serves its purpose in the drawing without needing a URIref, since the node itself provides the necessary connectivity between the various other parts of the graph. (Blank nodes were called anonymous resources in [RDF-MS].) However, we would need some form of explicit identifier for that node if we wanted to represent this graph as triples. To see this, we can try to write the triples corresponding to what is shown in Figure 6. What we would get would be something like:
exstaff:85740 exterms:address ??? .
??? exterms:street "1501 Grant Avenue" .
??? exterms:city "Bedford" .
??? exterms:state "Massachusetts" .
??? exterms:Zip "01730" .
where ??? stands for something that indicates the presence of the blank node. Since a complex graph might contain more than one blank node, we would also need a way to differentiate between these multiple blank nodes in a triples representation of the graph. To do this, we use blank node identifiers, having the form _:name, to indicate the presence of blank nodes in triples. For instance, in this example we might use the blank node identifier _:johnaddress to refer to the blank node, in which case the resulting triples might be:
exstaff:85740 exterms:address _:johnaddress .
_:johnaddress exterms:street "1501 Grant Avenue" .
_:johnaddress exterms:city "Bedford" .
_:johnaddress exterms:state "Massachusetts" .
_:johnaddress exterms:Zip "01730" .
In a triples representation of a graph, each distinct blank node in the graph is given a different blank node identifier. Unlike URIrefs and literals, blank node identifiers are not considered to be actual parts of the RDF graph (this can be seen by looking at the drawn graph in Figure 6 and noting that the blank node has no blank node identifier). Blank node identifiers are just a way of representing the blank nodes in a graph (and distinguishing one blank node from another) when the graph is written in triple form. Blank node identifiers also have significance only within the triples representing a single graph (two different graphs with the same number of blank nodes might independently use the same blank node identifiers to distinguish them, and it would be incorrect to assume that blank nodes from different graphs having the same blank node identifiers are the same). If it is expected that a node in a graph will need to be referenced from outside the graph, a URIref should be assigned to identify it.
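Because blank node identifiers have significance only within a single graph, a program that combines the triples of two graphs has to keep their blank nodes apart. The following sketch shows one way this might be done, by giving every blank node a fresh identifier as its graph is read. The function names, the "_:b1"-style identifiers, and the second staff member exstaff:85741 are hypothetical and introduced only for this illustration.

```python
from itertools import count


def is_blank(term):
    """Blank node identifiers have the form _:name."""
    return term.startswith("_:")


def merge(*graphs):
    """Combine graphs, renaming blank nodes so that none are shared."""
    fresh = count(1)
    merged = []
    for graph in graphs:
        renamed = {}  # old identifier -> new identifier, for this graph only
        for subject, predicate, obj in graph:
            row = []
            for term in (subject, obj):
                if is_blank(term):
                    if term not in renamed:
                        renamed[term] = "_:b%d" % next(fresh)
                    term = renamed[term]
                row.append(term)
            merged.append((row[0], predicate, row[1]))
    return merged


# Two separate graphs that happen to use the same blank node identifier.
graph_a = [
    ("exstaff:85740", "exterms:address", "_:addr"),
    ("_:addr", "exterms:city", '"Bedford"'),
]
graph_b = [
    ("exstaff:85741", "exterms:address", "_:addr"),  # hypothetical staff id
    ("_:addr", "exterms:city", '"Boston"'),
]

merged = merge(graph_a, graph_b)
for triple in merged:
    print(triple)
```

Without the renaming step, the two addresses would be treated as one node that has two different cities, which neither original graph says.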
At the beginning of this section, we noted that we can represent aggregate structures, like John Smith's address, by considering the aggregate thing we want to talk about as a resource, and then making statements about that new resource. This example illustrates an important aspect of RDF: RDF directly represents only binary relationships, e.g. the relationship between John Smith and the literal representing his address. When we try to represent the relationship between John and the group of separate components of this address, we are dealing with an n-ary (n-way) relationship (in this case, n=5) between John and the street, city, state, and zip components. In order to represent such structures directly in RDF (e.g., considering the address as a group of street, city, state, and zip sub-components), we need to break this n-way relationship up into a group of separate binary relationships. Blank nodes give us one way to do this. Each time we have an n-ary relationship, we can choose one of the participants as the subject of the relationship (John in this case), and create a blank node to represent the rest of the relationship (John's address in this case). We can then represent the remaining participants in the relationship (such as the city in our example) as separate properties of the new resource represented by the blank node.
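The decomposition just described can be sketched as a small function that takes one participant as the subject, introduces a blank node for the rest of the relationship, and emits one binary statement per remaining participant. The function name decompose and its parameters are assumptions of this sketch.

```python
def decompose(subject, linking_property, components, blank_id):
    """Break an n-ary relationship into binary statements via a blank node.

    subject          -- the participant chosen as the subject (e.g. John)
    linking_property -- property connecting the subject to the blank node
    components       -- mapping of property -> value for the other participants
    blank_id         -- blank node identifier to use, e.g. "_:johnaddress"
    """
    triples = [(subject, linking_property, blank_id)]
    for prop, value in components.items():
        triples.append((blank_id, prop, value))
    return triples


address = {
    "exterms:street": '"1501 Grant Avenue"',
    "exterms:city": '"Bedford"',
    "exterms:state": '"Massachusetts"',
    "exterms:Zip": '"01730"',
}

triples = decompose("exstaff:85740", "exterms:address", address, "_:johnaddress")
for s, p, o in triples:
    print(s, p, o, ".")
```

The five-way relationship between John and the four address components becomes five binary statements: one linking John to the blank node, and four describing the blank node.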
Blank nodes also give us a way to more accurately make statements about resources that may not have URIs, but that are described in terms of relationships with other resources that do have URIs. For example, when making statements about a person, say Jane Smith, it may seem natural to use a URI based on that person's email address as her URI, e.g., mailto:jane@example.org. However, this approach can cause problems. For example, we may want to record information about Jane's mailbox (e.g., the server it is on) as well as about Jane herself (e.g., her current address), and using a URIref for Jane based on her email address makes it difficult to know which thing we're talking about. The same problem exists when a company's Web page URL, say http://www.example.com/, is used as the URI of the company itself. Once again, we may need to record information about the Web page (e.g., who created it and when) as well as about the company, and using http://www.example.com/ as an identifier for both makes it difficult to know which thing we're talking about.
The fundamental problem is that using Jane's mailbox as a stand-in for Jane isn't really accurate: Jane and her mailbox are not the same thing, and hence their identifiers should be different. When Jane herself doesn't have a URI, a blank node gives us a more accurate way of modeling this situation. We can represent Jane by a blank node, and give the blank node an exterms:mailbox property having the URIref mailto:jane@example.org as its value. We can also assign the blank node an rdf:type property with a value of exterms:Person (we will discuss types in more detail in the following sections), an exterms:name property with a value of "Jane Smith", and any other descriptive information we might want to provide, as shown in the following triples:
_:jane exterms:mailbox mailto:jane@example.org .
_:jane rdf:type exterms:Person .
_:jane exterms:name "Jane Smith" .
_:jane exterms:empID "23748" .
_:jane exterms:age "26" .
This says, accurately, that "there is a resource of type exterms:Person, whose electronic mailbox is identified by mailto:jane@example.org, whose name is Jane Smith, etc." That is, the blank node can be read as "there is a resource". Statements with that blank node as subject then provide information about the characteristics of that resource.
In practice, using blank nodes instead of URIrefs in these cases doesn't change the way we actually handle this kind of information very much. For example, if we know independently that an email address uniquely identifies someone at example.org (particularly if the address is unlikely to be reused), we can still use that fact to associate information about that person from multiple sources, even though the email address is not the person's URI. For example, if we were to find another piece of RDF on the web that described a book, and gave the author's contact information as mailto:jane@example.org, we might reasonably conclude that the author's name is Jane Smith. The point is that saying something like "the author of the book is mailto:jane@example.org" is typically a shorthand for "the author of the book is someone whose mailbox is mailto:jane@example.org". Using a blank node to represent this "someone" is just a more accurate way to represent the real world situation. (Incidentally, some RDF-based schema languages allow specifying that certain properties are unique identifiers. This is discussed further in Section 5.5.)
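As a rough sketch of this kind of reasoning, the following Python fragment combines two small sets of triples by way of the mailbox value. It assumes, as discussed above, that a mailbox identifies exactly one person. The book identifier exbooks:rdf-book, the property exterms:authorContact, and the function names are hypothetical and chosen only for illustration.

```python
def resources_with_mailbox(triples, mailbox):
    """Return the subjects (URIrefs or blank nodes) that have this mailbox."""
    return {s for s, p, o in triples if p == "exterms:mailbox" and o == mailbox}


def author_names(book, book_triples, people_triples):
    """Find names of people whose mailbox is given as the book's author contact."""
    names = set()
    for s, p, contact in book_triples:
        if s == book and p == "exterms:authorContact":
            for person in resources_with_mailbox(people_triples, contact):
                names |= {
                    o
                    for s2, p2, o in people_triples
                    if s2 == person and p2 == "exterms:name"
                }
    return names


# The description of Jane, with Jane represented by a blank node.
jane_graph = [
    ("_:jane", "exterms:mailbox", "mailto:jane@example.org"),
    ("_:jane", "rdf:type", "exterms:Person"),
    ("_:jane", "exterms:name", "Jane Smith"),
]

# A hypothetical description of a book found elsewhere on the Web.
book_graph = [
    ("exbooks:rdf-book", "exterms:authorContact", "mailto:jane@example.org"),
]

print(author_names("exbooks:rdf-book", book_graph, jane_graph))
```

Note that the conclusion depends on knowledge held outside the RDF itself (that the mailbox is a unique identifier); nothing in the triples alone licenses it.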
In the last section, we described how to handle situations in which we needed to take property values represented by plain literals, and break them up into structured values that identify the individual parts of those property values. Using this approach, instead of, say, recording the date a Web page was created as a single exterms:creation-date property, with a single plain literal as its value, we could represent the value as a structure consisting of the month, day, and year as separate pieces of information. However, so far, we've followed the practice of representing any constant values that serve as objects in RDF statements by these plain (untyped) literals, even when we probably intend for the value of the property to be a number (e.g., the value of a year or age property) or some other kind of more specialized value.
For example, in Figure 4 we illustrated an RDF graph recording information about John Smith. In that graph, we recorded the value of John Smith's exterms:age property as the plain literal "27", as shown in Figure 7:
In this case, our hypothetical organization example.org probably intends for "27" to be interpreted as a number, rather than as the string consisting of the character "2" followed by the character "7". However, an application reading that literal "27" would only know to do that if the application was explicitly given the information that the literal "27" was intended to represent a number, and knew which number the literal "27" was supposed to represent. The common practice in programming languages or database systems is to provide this kind of information by associating a datatype with the literal, in this case, a datatype like decimal or integer. An application that understands the datatype then knows, for example, whether the literal "10" is intended to represent the number ten, the number two, or the string consisting of the character "1" followed by the character "0", depending on whether the specified datatype is integer, binary, or string. In RDF, typed literals are used to provide this kind of information.
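The dependence of a literal's meaning on its datatype can be illustrated with a small Python sketch. The datatype names integer, binary, and string are the informal ones used above, not datatype URIrefs, and the function name interpret is an assumption of this sketch.

```python
# Map each informal datatype name to a function that turns the literal's
# character string into the value it is intended to represent.
PARSERS = {
    "integer": int,                   # base-10 reading
    "binary": lambda s: int(s, 2),    # base-2 reading
    "string": str,                    # the characters themselves
}


def interpret(literal, datatype):
    """Return the value that `literal` represents under `datatype`."""
    return PARSERS[datatype](literal)


print(interpret("10", "integer"))  # 10
print(interpret("10", "binary"))   # 2
print(interpret("10", "string"))   # 10 (the two-character string)
```

The same two characters yield three different values; without the datatype, an application has no way to choose among them.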
Using a typed literal, we could describe John Smith's age as being the integer number 27 using the triple:
<http://www.example.org/staffid/85740> <http://www.example.org/terms/age> "27"^^<http://www.w3.org/2001/XMLSchema#integer> .
or, using our QName simplification for writing long URIs:
exstaff:85740 exterms:age "27"^^xsd:integer .
or as shown in Figure 8:
Similarly, in the graph shown in Figure 3 describing information about a Web page, we recorded the value of the page's exterms:creation-date property as the plain literal "August 16, 1999". However, using a typed literal, we could describe the creation date of the Web page as being the date August 16, 1999, using the triple:
ex:index.html exterms:creation-date "1999-08-16"^^xsd:date .
or as shown in Figure 9:
As these examples illustrate, an RDF typed literal is formed by explicitly pairing a URIref identifying a particular datatype (in these examples, the datatypes integer and date from XML Schema Part 2: Datatypes [XML-SCHEMA2]) with a literal that the datatype uses to represent the intended value. In each case, this results in a single node in the RDF graph with the pair as its label.
Unlike typical programming languages and database systems, RDF has no built-in set of datatypes of its own, such as datatypes for integers, reals, strings, or dates. Instead, it relies on datatypes defined elsewhere that can be identified by a datatype URI. RDF typed literals simply provide a way to explicitly indicate, for a given literal, what datatype should be used to interpret it. As far as RDF is concerned, you can write any pair of URIref and literal you want as a typed literal. This gives RDF the flexibility to directly represent information coming from different sources without the need to perform type conversions between these sources and a native set of RDF datatypes. (Type conversions would still be required when moving information between systems with different datatype systems, but RDF would impose no extra conversions into and out of a native set of RDF types.)
The actual interpretation of a typed literal (determining the value it denotes) must be performed by an RDF processor that is programmed to "understand" that datatype. In particular, we've used XML Schema datatypes in the two examples we've just presented, and will be using XML Schema datatypes in most of our other examples as well (for one thing, XML Schema data types have URIrefs we can use to refer to them, specified in [XML-SCHEMA2]). XML Schema datatypes have a "first among equals" status in RDF. They are treated no differently than any other datatype, but they are expected to be the most widely used, and therefore the most likely to be interoperable among different software. As a result, it is expected that many RDF processors will be programmed to recognize these datatypes. However, RDF software could be programmed to process other sets of datatypes as well.
RDF datatype concepts also borrow a conceptual framework from XML Schema datatypes [XML-SCHEMA2] to more precisely describe datatype requirements. RDF's use of this framework is defined in RDF Concepts and Abstract Syntax [RDF-CONCEPTS].
The flexibility provided by RDF typed literals comes at a price. For one thing, RDF has no way of knowing whether or not a URIref in a typed literal actually identifies a datatype. Moreover, even when a URIref does identify a datatype, RDF itself does not define the validity of pairing that datatype with a particular literal. This validity can only be determined by software built to understand that datatype. For example, you could write the triple:
exstaff:85740 exterms:age "pumpkin"^^xsd:integer .
or the graph shown in Figure 10:
The typed literal in Figure 10 is valid RDF, but obviously an error as far as the xsd:integer datatype is concerned, since "pumpkin" is not defined as being a legal literal for xsd:integer.
In general, RDF software may be called on to process RDF data that contains datatypes that it has not been programmed to understand, in which case there are some things the software will not be able to do. This includes recognizing whether or not a particular string represents a legal value for a particular datatype. In this case, RDF software not built to understand the xsd:integer datatype would not be able to recognize that "pumpkin" is not a valid xsd:integer.
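The difference between software that understands a datatype and software that does not can be sketched as follows. This is a simplified model, not a conforming implementation: the table KNOWN_DATATYPES, the function check_typed_literal, and the datatype URIref http://www.example.org/datatypes#vegetable are all assumptions of this sketch, and the xsd:integer check covers only an optional sign followed by decimal digits.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# The datatypes this hypothetical processor has been programmed to
# understand, each mapped to a check on the literal's character string.
KNOWN_DATATYPES = {
    XSD + "integer": re.compile(r"[+-]?[0-9]+").fullmatch,
}


def check_typed_literal(literal, datatype_uri):
    """Check a typed literal against its datatype.

    Returns True or False when the processor understands the datatype,
    and None when it does not and so cannot decide.
    """
    checker = KNOWN_DATATYPES.get(datatype_uri)
    if checker is None:
        return None
    return checker(literal) is not None


print(check_typed_literal("27", XSD + "integer"))
print(check_typed_literal("pumpkin", XSD + "integer"))
print(check_typed_literal("pumpkin", "http://www.example.org/datatypes#vegetable"))
```

In all three cases the typed literal itself is acceptable as RDF; only the first two can be judged by this processor, and only the first is a legal xsd:integer.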
Taken as a whole, RDF is simple: nodes-and-arcs diagrams interpreted as statements about things identified by URIrefs. This section has presented an introduction to these concepts. As noted earlier, the normative (i.e., definitive) RDF specification describing these concepts is the RDF Concepts and Abstract Syntax [RDF-CONCEPTS], which should be consulted for further information. Together with the RDF Semantics [RDF-SEMANTICS] document, [RDF-CONCEPTS] provides the definition of the abstract syntax for RDF, together with its formal semantics (meaning).
However, in addition to the basic techniques for representing RDF statements in diagrams (or triples) we've seen so far, it should be clear that we also need a way for people to define the vocabularies they intend to use in those statements, including:
The basis for describing such vocabularies in RDF is the RDF Vocabulary Description Language 1.0: RDF Schema [RDF-VOCABULARY], which will be described in Section 5.
Additional background on the basic ideas underlying RDF, and its role in providing a general language for describing Web information, can be found in [WEBDATA]. RDF draws upon ideas from knowledge representation, artificial intelligence, and data management, including Conceptual Graphs, logic-based knowledge representation, frames, and relational databases. Some possible sources of background information on these subjects include [Sowa], [CG], [KIF], [Hayes], [Luger], and [Gray].
As we described in Section 2, RDF's conceptual model is a graph. RDF provides an XML syntax for writing down and exchanging RDF graphs, called RDF/XML. Unlike triples, which are intended as a shorthand notation, RDF/XML is the normative syntax for writing RDF. RDF/XML is defined in the RDF/XML Syntax Specification [RDF-SYNTAX]. This section describes this RDF/XML syntax.
We can illustrate the basic ideas behind the RDF/XML syntax using some of the examples we've presented already. Suppose we want to represent one of our initial statements:
http://www.example.org/index.html has a creation-date whose value is August 16, 1999
The RDF graph for this single statement, after assigning a URIref to the creation-date property, is shown in Figure 11:
with a triple representation of:
ex:index.html exterms:creation-date "August 16, 1999" .
Example 2 shows the RDF/XML syntax corresponding to the graph in Figure 11:
1. <?xml version="1.0"?>
2. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.          xmlns:exterms="http://www.example.org/terms/">
4.   <rdf:Description rdf:about="http://www.example.org/index.html">
5.     <exterms:creation-date>August 16, 1999</exterms:creation-date>
6.   </rdf:Description>
7. </rdf:RDF>
(we have added line numbers to use in explaining the example).
This seems like a lot of overhead. We can understand better what is going on by considering each part of this XML in turn (a brief introduction to XML is provided in Appendix B).
Line 1, <?xml version="1.0"?>, is the XML declaration, which indicates that the following content is XML, and what version of XML it is.
Line 2 begins an rdf:RDF element. This indicates that the following XML content (starting here and ending with the </rdf:RDF> in Line 7) is intended to represent RDF. Following the rdf:RDF on this same line is an XML namespace declaration, represented as an xmlns attribute of the rdf:RDF start-tag. This declaration specifies that all tags in this content prefixed with rdf: are part of the namespace identified by the URIref http://www.w3.org/1999/02/22-rdf-syntax-ns#. This namespace is the source for the RDF-specific terms used in RDF/XML.
Line 3 specifies another XML namespace declaration, this time for the prefix exterms:. This is expressed as another xmlns attribute of the rdf:RDF element, and specifies that the namespace URIref http://www.example.org/terms/ is to be associated with the exterms: prefix. This namespace is the source for the specific terms defined by our example organization, example.org. The ">" at the end of line 3 indicates the end of the rdf:RDF start-tag. Lines 1-3 are general "housekeeping" necessary to indicate that we are defining RDF/XML content, and to identify the sources of the terms we are using.
Lines 4-6 provide the RDF/XML for the specific statement we're representing. An obvious way to talk about any RDF statement is to say it's a description, and that it's about the subject of the statement (in this case, about http://www.example.org/index.html), and this is the way RDF/XML represents the statement. The rdf:Description start-tag in Line 4 indicates that we're starting a description of a resource, and goes on to identify the resource the statement is about (the subject of the statement) using the rdf:about attribute to specify the URIref of the subject resource. Line 5 provides a property element, with the QName <exterms:creation-date> as its tag, to hold the plain literal August 16, 1999 of the creation-date property of the statement. It is nested within the containing rdf:Description element, indicating that this property applies to the resource specified in the rdf:about attribute of the rdf:Description element. The URIref of the creation-date property corresponding to the QName <exterms:creation-date> is obtained by appending the name creation-date to the URIref of the exterms: prefix (http://www.example.org/terms/), giving http://www.example.org/terms/creation-date. Line 6 indicates the end of this particular rdf:Description element.
Finally, Line 7 indicates the end of the rdf:RDF element started on Line 2.
Example 2 illustrates the basic ideas used by RDF/XML to encode an RDF graph as XML elements, attributes, element content, and attribute values. The URIref labels for properties and object nodes are written as XML QNames, consisting of a short prefix denoting a namespace URI, together with a local name denoting a namespace-qualified element or attribute, as described in Appendix B. The (namespace URIref, local name) pair are chosen so that concatenating them forms the URIref of the original node. The URIrefs of subject nodes are written as XML attribute values. The nodes labeled by literals (which are always object nodes) become element text content or attribute values. (All these options are described in [RDF-SYNTAX]).
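The relationship between a QName and the URIref it stands for is simple concatenation, as the following Python sketch shows. The prefix table reproduces the namespace declarations from Example 2; the function name expand_qname is an assumption of this sketch.

```python
# Prefix-to-namespace bindings, as declared by the xmlns attributes
# in Lines 2-3 of Example 2.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "exterms": "http://www.example.org/terms/",
}


def expand_qname(qname, namespaces=NAMESPACES):
    """Expand 'prefix:local' by appending the local name to the namespace URIref."""
    prefix, local_name = qname.split(":", 1)
    return namespaces[prefix] + local_name


print(expand_qname("exterms:creation-date"))
print(expand_qname("rdf:Description"))
```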
We could represent an RDF graph consisting of multiple statements in RDF/XML by using RDF/XML similar to Lines 4-6 in Example 2 to separately represent each statement. For example, if we wanted to write the following two statements:
ex:index.html exterms:creation-date "August 16, 1999" .
ex:index.html exterms:language "English" .
we could write the RDF/XML in Example 3:
1. <?xml version="1.0"?>
2. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.          xmlns:exterms="http://www.example.org/terms/">
4.   <rdf:Description rdf:about="http://www.example.org/index.html">
5.     <exterms:creation-date>August 16, 1999</exterms:creation-date>
6.   </rdf:Description>
7.   <rdf:Description rdf:about="http://www.example.org/index.html">
8.     <exterms:language>English</exterms:language>
9.   </rdf:Description>
10. </rdf:RDF>
Example 3 is the same as Example 2, with the addition of lines 7-9, a second rdf:Description element to represent the second statement. We could represent an arbitrary number of additional statements in the same way, using a separate rdf:Description element for each additional statement. As Example 3 illustrates, once the overhead of writing the XML and namespace declarations is dealt with, writing each additional RDF statement in RDF/XML is straightforward.
The RDF/XML syntax provides a number of abbreviations to make common uses easier to write. For example, it is typical for the same resource to be described with several properties and values at the same time, as in Example 3, where the resource ex:index.html is the subject of several statements. To handle such cases, RDF/XML allows multiple property elements representing those properties to be nested within the rdf:Description element that identifies the subject resource. For example, if we wanted to represent the following group of statements about http://www.example.org/index.html:
ex:index.html dc:creator exstaff:85740 .
ex:index.html exterms:creation-date "August 16, 1999" .
ex:index.html exterms:language "English" .
whose graph (the same as Figure 3) is shown in Figure 12:
we could write the RDF/XML as shown in Example 4:
1. <?xml version="1.0"?>
2. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.          xmlns:dc="http://purl.org/dc/elements/1.1/"
4.          xmlns:exterms="http://www.example.org/terms/">
5.   <rdf:Description rdf:about="http://www.example.org/index.html">
6.     <exterms:creation-date>August 16, 1999</exterms:creation-date>
7.     <exterms:language>English</exterms:language>
8.     <dc:creator rdf:resource="http://www.example.org/staffid/85740"/>
9.   </rdf:Description>
10. </rdf:RDF>
Compared with the previous two examples, Example 4 adds an additional namespace declaration (in Line 3), and an additional creator property element (in Line 8). In addition, we've nested the property elements for the three properties whose subject is http://www.example.org/index.html within a single rdf:Description element identifying that subject, rather than writing a separate rdf:Description element for each statement.
Line 8 also introduces a new form of property element. (The element tag also uses a different namespace prefix, the new namespace prefix dc: we defined in Line 3.) The exterms:language element in Line 7 is similar to the exterms:creation-date element we defined in Example 2. Both these elements represent properties with plain literals as property values, and such elements are specified by enclosing the literal within start- and end-tags corresponding to the property name. However, the dc:creator element on Line 8 represents a property whose value is another resource, rather than a literal. If we had written the URIref of this resource as a plain literal within start- and end-tags in the same way as we wrote the literal values of the other elements, we would be saying that the value of the dc:creator element was the character string http://www.example.org/staffid/85740, rather than the resource identified by that literal interpreted as a URIref. In order to indicate the difference, we've written the dc:creator element using what XML calls an empty-element tag (it has no separate end-tag), and defined the property value using an rdf:resource attribute within that empty element. The rdf:resource attribute indicates that the property element's value is another resource, identified by its URIref. Because the URIref is being used as an attribute value, RDF/XML requires that we write out the URIref, rather than abbreviating it as a QName, as we've done in writing element and attribute names.
It is important to understand that the RDF/XML in Example 4 is an abbreviation. The RDF/XML in Example 5, in which each statement is written separately, describes exactly the same RDF graph:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:exterms="http://www.example.org/terms/">
<rdf:Description rdf:about="http://www.example.org/index.html">
<exterms:creation-date>August 16, 1999</exterms:creation-date>
</rdf:Description>
<rdf:Description rdf:about="http://www.example.org/index.html">
<exterms:language>English</exterms:language>
</rdf:Description>
<rdf:Description rdf:about="http://www.example.org/index.html">
<dc:creator rdf:resource="http://www.example.org/staffid/85740"/>
</rdf:Description>
</rdf:RDF>
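One way to convince ourselves that Examples 4 and 5 describe the same graph is to extract the triples from each and compare them. The Python sketch below does this with the standard library's XML parser. It is not a general RDF/XML parser: it assumes only the restricted forms used so far (rdf:Description elements with rdf:about, and property elements with either literal content or an rdf:resource attribute), and the function name simple_triples is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

HEADER = """\
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:exterms="http://www.example.org/terms/">
"""
ABOUT = '<rdf:Description rdf:about="http://www.example.org/index.html">'
DATE = "<exterms:creation-date>August 16, 1999</exterms:creation-date>"
LANG = "<exterms:language>English</exterms:language>"
CREATOR = '<dc:creator rdf:resource="http://www.example.org/staffid/85740"/>'
END = "</rdf:Description>"

# Example 4: one rdf:Description holding all three property elements.
EXAMPLE_4 = HEADER + ABOUT + DATE + LANG + CREATOR + END + "</rdf:RDF>"
# Example 5: a separate rdf:Description for each statement.
EXAMPLE_5 = (HEADER + ABOUT + DATE + END + ABOUT + LANG + END
             + ABOUT + CREATOR + END + "</rdf:RDF>")


def simple_triples(rdfxml):
    """Extract a set of (subject, predicate, object) triples."""
    root = ET.fromstring(rdfxml)
    triples = set()
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree reports tags as "{namespace}local"; joining the
            # two parts gives the URIref of the property.
            namespace, local = prop.tag[1:].split("}", 1)
            predicate = namespace + local
            resource = prop.get(f"{{{RDF_NS}}}resource")
            if resource is not None:
                obj = ("uri", resource)
            else:
                obj = ("literal", prop.text or "")
            triples.add((subject, predicate, obj))
    return triples


print(simple_triples(EXAMPLE_4) == simple_triples(EXAMPLE_5))
```

Because a graph is a set of triples, the grouping of property elements into rdf:Description elements makes no difference to the result.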
We will describe a few additional RDF/XML abbreviations in the following sections. However, you should consult [RDF-SYNTAX] for a more thorough description of the abbreviations that are available.
RDF/XML also allows us to represent graphs that include nodes that have no URIrefs, i.e., blank nodes. For example, Figure 13 (taken from [RDF-SYNTAX]) shows a graph saying "the document 'http://www.w3.org/TR/rdf-syntax-grammar' has a title 'RDF/XML Syntax Specification (Revised)' and has an editor, the editor has a name 'Dave Beckett' and a home page 'http://purl.org/net/dajobe/' ".
This illustrates an idea we discussed in Section 2.3: the use of a blank node to represent something that does not have a URIref, but can be described in terms of other information. In this case, the blank node represents a person, the editor of the document, and the person is described by his name and home page.
RDF/XML provides several ways to represent blank nodes. These are described in [RDF-SYNTAX]. The approach we will illustrate here, which is the most direct approach, is to assign a blank node identifier to the blank node. A blank node identifier serves to identify a blank node within a particular RDF/XML document but, unlike a URIref, is unknown outside the document in which it is assigned. A blank node is referred to in RDF/XML using an rdf:nodeID attribute with a blank node identifier as its value in places where the URIref of a resource node would otherwise appear. Specifically, a statement with a blank node as its subject can be written in RDF/XML using an rdf:Description element which specifies an rdf:nodeID attribute instead of an rdf:about attribute. Similarly, a statement with a blank node as its object can be written using a property element with an rdf:nodeID attribute instead of an rdf:resource attribute. Using rdf:nodeID, Example 6 shows the RDF/XML corresponding to Figure 13:
1. <?xml version="1.0"?>
2. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.          xmlns:dc="http://purl.org/dc/elements/1.1/"
4.          xmlns:exterms="http://example.org/stuff/1.0/">
5.   <rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">
6.     <dc:title>RDF/XML Syntax Specification (Revised)</dc:title>
7.     <exterms:editor rdf:nodeID="abc"/>
8.   </rdf:Description>
9.   <rdf:Description rdf:nodeID="abc">
10.     <exterms:fullName>Dave Beckett</exterms:fullName>
11.     <exterms:homePage rdf:resource="http://purl.org/net/dajobe/"/>
12.   </rdf:Description>
13. </rdf:RDF>
In Example 6, the blank node identifier abc is used in Line 9 to identify the blank node as the subject of several statements, and is used in Line 7 to indicate that the blank node is the value of a resource's exterms:editor property. The advantage of using a blank node identifier over some of the other approaches described in [RDF-SYNTAX] is that using a blank node identifier allows the same blank node to be referred to in more than one place in the same RDF/XML document.
Finally, the typed literals we described in Section 2.4 may be used as property values instead of the plain literals we have used in the examples so far. A typed literal is represented in RDF/XML by adding an rdf:datatype attribute specifying a datatype URIref to the property element containing the literal.
For example, to change the statement from Example 2 to use a typed literal instead of a plain literal for the creation-date property, the triple representation would be:
ex:index.html exterms:creation-date "1999-08-16"^^xsd:date .
with corresponding RDF/XML syntax shown in Example 7:
1. <?xml version="1.0"?>
2. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3. xmlns:exterms="http://www.example.org/terms/">
4. <rdf:Description rdf:about="http://www.example.org/index.html">
5. <exterms:creation-date rdf:datatype=
"http://www.w3.org/2001/XMLSchema#date">1999-08-16
</exterms:creation-date>
6. </rdf:Description>
7. </rdf:RDF>
In Line 5 of Example 7, a typed literal is given as the value of the exterms:creation-date property element by adding an rdf:datatype attribute to the element's start-tag to specify the datatype. The value of this attribute is the URIref of the datatype, in this case, the URIref of the XML Schema date datatype. Since this is an attribute value, the URIref must be written out, rather than using the QName abbreviation xsd:date that we used in the triple. A literal appropriate to this datatype is then written as the element content, in this case, the literal 1999-08-16, which is the literal representation for August 16, 1999 in the XML Schema date datatype.
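A consumer of this RDF/XML might read the typed literal roughly as in the following Python sketch. It handles a single property element written on one line, and it converts only plain year-month-day xsd:date literals (no time zones or negative years); the names read_typed_literal and to_value are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET
from datetime import date

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

# The property element from Line 5 of Example 7, with the namespace
# declarations it needs in order to be parsed on its own.
PROPERTY_XML = (
    "<exterms:creation-date"
    ' xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
    ' xmlns:exterms="http://www.example.org/terms/"'
    ' rdf:datatype="http://www.w3.org/2001/XMLSchema#date"'
    ">1999-08-16</exterms:creation-date>"
)


def read_typed_literal(element):
    """Return (literal string, datatype URIref or None) for a property element."""
    return element.text, element.get(f"{{{RDF_NS}}}datatype")


def to_value(literal, datatype):
    """Convert the literal if this sketch understands the datatype."""
    if datatype == XSD_DATE:
        return date.fromisoformat(literal)
    # Unknown or absent datatype: leave the character string untouched.
    return literal


literal, datatype = read_typed_literal(ET.fromstring(PROPERTY_XML))
print(to_value(literal, datatype))
```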
For the most part, we will continue to use plain (untyped) literals in our examples. However, you should be aware that typed literals from appropriate datatypes, such as XML Schema datatypes, can always be used instead.
Although additional abbreviated forms for writing RDF/XML are available, the facilities we have illustrated so far provide a simple but general way to serialize graphs in RDF/XML. Using these facilities, an RDF graph is written in RDF/XML as follows:
Each node is listed in turn as the subject of an rdf:Description element, using an rdf:about attribute if the node has a URIref, or an rdf:nodeID attribute if the node is blank.
For each triple with that node as its subject, a property element is written, with either literal content, an rdf:resource attribute specifying the object of the triple (if the object node has a URIref), or an rdf:nodeID attribute specifying the object of the triple (if the object node is blank).
Compared to some of the more abbreviated serialization approaches described in [RDF-SYNTAX], this simple serialization approach provides the most direct representation of the actual graph structure, and is particularly recommended for applications in which the output RDF/XML is to be used in further RDF processing.
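The simple serialization approach described above can be sketched in Python as follows. This is an illustration under several assumptions, not a complete serializer: blank nodes are written as strings beginning with "_:", plain literals as ("literal", text) tuples, every property URIref is split at its last "#" or "/", and typed literals are not handled. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def split_uri(uri):
    """Split a property URIref into (namespace, local name)."""
    cut = max(uri.rfind("#"), uri.rfind("/")) + 1
    return uri[:cut], uri[cut:]


def serialize(triples, prefixes):
    """Write triples as RDF/XML, one rdf:Description per subject node."""
    ET.register_namespace("rdf", RDF_NS)
    for prefix, namespace in prefixes.items():
        ET.register_namespace(prefix, namespace)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    descriptions = {}
    for subject, predicate, obj in triples:
        if subject not in descriptions:
            node = ET.SubElement(root, f"{{{RDF_NS}}}Description")
            if subject.startswith("_:"):
                node.set(f"{{{RDF_NS}}}nodeID", subject[2:])
            else:
                node.set(f"{{{RDF_NS}}}about", subject)
            descriptions[subject] = node
        namespace, local = split_uri(predicate)
        prop = ET.SubElement(descriptions[subject], f"{{{namespace}}}{local}")
        if isinstance(obj, tuple):          # plain literal
            prop.text = obj[1]
        elif obj.startswith("_:"):          # blank node object
            prop.set(f"{{{RDF_NS}}}nodeID", obj[2:])
        else:                               # URIref object
            prop.set(f"{{{RDF_NS}}}resource", obj)
    return ET.tostring(root, encoding="unicode")


EX = "http://example.org/stuff/1.0/"
DC = "http://purl.org/dc/elements/1.1/"
DOC = "http://www.w3.org/TR/rdf-syntax-grammar"

# The graph of Figure 13, as serialized by hand in Example 6.
figure_13 = [
    (DOC, DC + "title", ("literal", "RDF/XML Syntax Specification (Revised)")),
    (DOC, EX + "editor", "_:abc"),
    ("_:abc", EX + "fullName", ("literal", "Dave Beckett")),
    ("_:abc", EX + "homePage", "http://purl.org/net/dajobe/"),
]

print(serialize(figure_13, {"dc": DC, "exterms": EX}))
```

Note that register_namespace changes the prefix table for the whole ElementTree module, which is acceptable in a sketch but may be undesirable in a larger program.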
So far, we've been describing resources that we imagine have been given URIrefs already. For instance, in our initial examples, we provided descriptive information about example.org's web page, whose URIref was http://www.example.org/index.html. We referred to this resource using an rdf:about attribute citing its full URIref. Although RDF doesn't specify or control how URIrefs are assigned to resources, sometimes we want to achieve the effect of assigning URIrefs to resources that are part of an organized group of resources. For example, suppose a sporting goods company, example.com, wanted to provide an RDF-based catalog of its products, such as tents, hiking boots, and so on, as an RDF/XML document, identified by (and located at) http://www.example.com/2002/04/products. In that resource, each product might be given a separate RDF description. This catalog, along with one of these descriptions, the catalog entry for a model of tent called the "Overnighter", might be written in RDF/XML as shown in Example 8:
1. <?xml version="1.0"?>
2. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.          xmlns:exterms="http://www.example.com/terms/">
4.   <rdf:Description rdf:ID="item10245">
5.     <exterms:model>Overnighter</exterms:model>
6.     <exterms:sleeps>2</exterms:sleeps>
7.     <exterms:weight>2.4</exterms:weight>
8.     <exterms:packedSize>14x56</exterms:packedSize>
9.   </rdf:Description>
     ...other product descriptions...
10. </rdf:RDF>
(We've included the surrounding xml, RDF, and namespace information in lines 1 through 3, and line 10, but this information would only need to be defined once for the whole catalog, not repeated for each entry in the catalog).
Example 8 is similar to our previous examples in the way it represents the properties (model, sleeping capacity, weight) of the resource (the tent) being described. However, in line 4, the rdf:Description element has an rdf:ID attribute instead of an rdf:about attribute. Using rdf:ID indicates that we are using a fragment identifier, given by the value of the rdf:ID attribute (item10245 in this case, which might be the catalog number assigned by example.com), as an abbreviation of the complete URIref of the resource we are describing. The fragment identifier item10245 will be interpreted relative to a base URI, in this case, the URI of the containing catalog document. The full URIref for the tent is formed by taking the base URI (of the catalog), and appending # (to indicate that what follows is a fragment identifier) and then item10245 to it, giving the absolute URIref http://www.example.com/2002/04/products#item10245.
The rdf:ID attribute is somewhat similar to the ID attribute in XML and HTML, in that it defines a name which must be unique within the document (in this case, the catalog) in which it is defined. In this case, the rdf:ID attribute appears to be assigning a name (item10245) to this particular kind of tent. Any other RDF/XML within this catalog could refer to the tent by using the relative URIref #item10245 in an rdf:about attribute. This would be understood as being a URIref defined relative to the base URIref of the catalog. Using a similar abbreviation, we could also have given the URIref of the tent by specifying rdf:about="#item10245" in the catalog entry (i.e., by specifying the relative URIref directly) instead of rdf:ID="item10245" . The two forms are essentially synonyms: the full URIref formed by RDF/XML is the same in either case: http://www.example.com/2002/04/products#item10245. In either case, example.com would be giving the URIref for the tent in a two-stage process, first assigning the URIref for the whole catalog, and then using a relative URIref in the description of the tent in the catalog to indicate the URIref that has been assigned to this particular kind of tent. Moreover, you can think of this use of a relative URIref as either being an abbreviation for a full URIref that has been assigned to the tent independently of the RDF, or as being the assignment of the URIref to the tent within the catalog.
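The equivalence of the two forms can be checked with a short Python sketch. It assumes that the base URI has no fragment identifier of its own; the function names are assumptions of this sketch, and urljoin is the standard library's relative-reference resolver.

```python
from urllib.parse import urljoin

CATALOG = "http://www.example.com/2002/04/products"


def uriref_from_id(base_uri, rdf_id):
    """rdf:ID="x": append '#' and the identifier to the base URI."""
    return base_uri + "#" + rdf_id


def uriref_from_about(base_uri, about):
    """rdf:about="#x": resolve the relative URIref against the base URI."""
    return urljoin(base_uri, about)


print(uriref_from_id(CATALOG, "item10245"))
print(uriref_from_about(CATALOG, "#item10245"))
```

Both calls produce http://www.example.com/2002/04/products#item10245, the absolute URIref of the tent.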
RDF located outside the catalog could refer to this tent by using the full URIref, i.e., by concatenating the relative URIref #item10245 of the tent to the base URI of the catalog, forming the absolute URIref http://www.example.com/2002/04/products#item10245. For example, an outdoor sports web site exampleRatings.com might use RDF to provide ratings of various tents. The (5-star) rating given to the tent described in Example 8 might then be represented on exampleRatings.com's web site as shown in Example 9:
1. <?xml version="1.0"?>
2. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.          xmlns:sportex="http://www.exampleRatings.com/terms/">
4.   <rdf:Description rdf:about="http://www.example.com/2002/04/products#item10245">
5.     <sportex:ratingBy>Richard Roe</sportex:ratingBy>
6.     <sportex:numberStars>5</sportex:numberStars>
7.   </rdf:Description>
8. </rdf:RDF>
In Example 9, line 4 uses an rdf:Description element with an rdf:about attribute whose value is the full URIref of the tent. The use of this URIref allows the tent being referred to in the rating to be precisely identified.
These examples illustrate several points. First, even though RDF doesn't specify or control how URIrefs are assigned to resources (in this case, the various tents and other items in the catalog), the effect of assigning URIrefs to resources in RDF can be achieved by combining a process (external to RDF) that identifies a single document (the catalog in this case) as the source for descriptions of those resources, with the use of relative URIrefs in descriptions of those resources within that document. For instance, example.com could use this catalog as the central source where its products are described, with the understanding that if a product's item number isn't in an entry in this catalog, it's not a product known to example.com. (Note that RDF does not assume any particular relationship exists between two resources just because their URIrefs have the same base, or are otherwise similar. This relationship may be known to example.com, but it is not directly defined by RDF.)
These examples also illustrate one of the basic architectural principles of the Web, which is that anyone should be able to say anything they want about existing resources [BERNERS-LEE98]. The examples further illustrate that the RDF describing a particular resource does not need to be located all in one place; instead, it may be distributed throughout the web. This is true not only for situations like this one, in which one organization is rating or commenting on resources defined by another, but also for situations in which the original definer of a resource (or anyone else) wishes to amplify the description of that resource by providing additional information about it. This may be done either by modifying the RDF document in which the resource was originally described, to add the properties and values needed to describe the additional information, or, as this example illustrates, by creating a separate document, and providing the additional properties and values in rdf:Description elements that refer to the original resource via its URIref using rdf:about.
The discussion above indicated that fragment identifiers such as #item10245 will be interpreted relative to a base URI. By default, this base URI would be the URI of the resource in which the fragment identifier is used. However, in some cases it is desirable to be able to explicitly specify this base URI. For instance, suppose that in addition to the catalog located at http://www.example.com/2002/04/products, example.com wanted to provide a duplicate catalog on a mirror site, say at http://mirror.example.com/2002/04/products. This could create a problem, since if the catalog were accessed from the mirror site, the URIref for our example tent would be generated from the URI of the containing document, forming http://mirror.example.com/2002/04/products#item10245, rather than http://www.example.com/2002/04/products#item10245, and hence would apparently refer to a different resource than the one intended. Alternatively, example.com might want to assign a base URIref for its set of product URIrefs without publishing a single source document whose location defines the base.
To deal with such cases, RDF/XML supports XML Base [XML-BASE], which allows an XML document to specify a base URI other than the URI of the document itself. Example 10 shows how we would define the catalog using XML Base:
1. <?xml version="1.0"?>
2. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.          xmlns:exterms="http://www.example.com/terms/"
4.          xml:base="http://www.example.com/2002/04/products">
5.   <rdf:Description rdf:ID="item10245">
6.     <exterms:model>Overnighter</exterms:model>
7.     <exterms:sleeps>2</exterms:sleeps>
8.     <exterms:weight>2.4</exterms:weight>
9.     <exterms:packedSize>14x56</exterms:packedSize>
10.   </rdf:Description>
    ...other product descriptions...
11. </rdf:RDF>
In Example 10, the xml:base declaration in line 4 specifies that the base URI for the content within the rdf:RDF element (until another xml:base attribute is specified) is http://www.example.com/2002/04/products, and all relative URIrefs cited within that content will be interpreted relative to that base, no matter what the URI of the containing document is. As a result, the relative URIref of our tent, #item10245, will be interpreted as the same absolute URIref, http://www.example.com/2002/04/products#item10245, no matter what the actual URI of the catalog document is, or whether the base URIref actually identifies a particular document at all.
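The effect of xml:base on the tent's URIref can be checked with a small sketch. It uses Python's standard urljoin function as a stand-in for the resolution an RDF/XML parser performs; the variable names are ours and are not part of RDF or XML Base.

```python
from urllib.parse import urljoin

# The relative URIref that rdf:ID="item10245" gives rise to.
RELATIVE_REF = "#item10245"

# Without xml:base, the base URI is the URI the document was retrieved from.
www_doc = "http://www.example.com/2002/04/products"
mirror_doc = "http://mirror.example.com/2002/04/products"
from_www = urljoin(www_doc, RELATIVE_REF)
from_mirror = urljoin(mirror_doc, RELATIVE_REF)

# With xml:base, the declared base is used no matter where the document
# was actually retrieved from.
XML_BASE = "http://www.example.com/2002/04/products"
with_base = urljoin(XML_BASE, RELATIVE_REF)

print(from_www)
print(from_mirror)
print(with_base)
```

Without the declaration, the copy retrieved from the mirror yields a different absolute URIref than the copy retrieved from the main site; with the declaration, both copies yield the same one.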
So far, we've been talking about a single product description, a particular model of tent, from example.com's catalog. However, example.com will probably offer several different models of tents, as well as multiple instances of other categories of products, such as backpacks, hiking boots, and so on. This idea of things being classified into different kinds or categories is similar to the programming language concept of objects having different types or classes. RDF supports this concept by providing a predefined property, rdf:type. When an RDF resource is described with an rdf:type property, the value of that property is considered to be a resource that represents a category or class of things, and the subject of that property is considered to be an instance of that category or class. Using rdf:type, Example 11 shows how example.com might indicate that our product description is that of a tent:
1. <?xml version="1.0"?>
2. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.          xmlns:exterms="http://www.example.com/terms/"
4.          xml:base="http://www.example.com/2002/04/products">
5.   <rdf:Description rdf:ID="item10245">
6.     <rdf:type rdf:resource="http://www.example.com/terms/Tent" />
7.     <exterms:model>Overnighter</exterms:model>
8.     <exterms:sleeps>2</exterms:sleeps>
9.     <exterms:weight>2.4</exterms:weight>
10.     <exterms:packedSize>14x56</exterms:packedSize>
11.   </rdf:Description>
    ...other product descriptions...
12. </rdf:RDF>
In Example 11, the rdf:type property in line 6 indicates that the resource being described is an instance of the class identified by the URIref http://www.example.com/terms/Tent. In this case, we imagine that example.com has described its classes as part of the same vocabulary that it uses to describe its other terms (such as the property exterms:weight), so we use the absolute URIref of the class to refer to it. If example.com had described these classes as part of the product catalog itself, we could have used the relative URIref #Tent to refer to it.
RDF itself does not define a vocabulary for defining application-specific classes of things, such as Tent in this example. Instead, such classes would be described in an RDF Schema. The facilities provided by RDF for describing application-specific classes and their properties are discussed in Section 5. Other such facilities for describing classes can also be defined, such as the DAML+OIL and OWL languages described in Section 5.5.
Since describing resources as instances of specific types or classes is fairly common, RDF/XML provides a special abbreviation for instances described as members of classes using the rdf:type property. In this abbreviation, the rdf:type property and its value are removed, and the rdf:Description element is replaced by an element whose name is the QName corresponding to the class URIref. Using this abbreviation, example.com's tent from Example 11 could also be described as shown in Example 12:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:exterms="http://www.example.com/terms/"
         xml:base="http://www.example.com/2002/04/products">
  <exterms:Tent rdf:ID="item10245">
    <exterms:model>Overnighter</exterms:model>
    <exterms:sleeps>2</exterms:sleeps>
    <exterms:weight>2.4</exterms:weight>
    <exterms:packedSize>14x56</exterms:packedSize>
  </exterms:Tent>
  ...other product descriptions...
</rdf:RDF>
Both Example 11 and Example 12 illustrate that RDF statements can be written in RDF/XML in a way that closely resembles descriptions that might have been written directly in XML. This is an important consideration, given the increasing use of XML in all kinds of applications, since it suggests that RDF could be used in these applications without requiring major changes in the way their information is structured.
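To make the typed node abbreviation of Example 12 concrete, the following sketch recovers the rdf:type statement that the abbreviation stands for. It is not an RDF/XML parser: it uses Python's standard ElementTree module, handles only top-level node elements that carry rdf:ID, and assumes xml:base is declared on the rdf:RDF element. The helper name expand_qname is our own.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"
RDF_TYPE = RDF_NS + "type"

# A shortened copy of Example 12.
DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:exterms="http://www.example.com/terms/"
         xml:base="http://www.example.com/2002/04/products">
  <exterms:Tent rdf:ID="item10245">
    <exterms:model>Overnighter</exterms:model>
  </exterms:Tent>
</rdf:RDF>"""


def expand_qname(tag):
    """Turn ElementTree's '{namespace}local' tag into a full URIref."""
    namespace, local = tag[1:].split("}", 1)
    return namespace + local


root = ET.fromstring(DOC)
base = root.get("{%s}base" % XML_NS)

triples = []
for node in root:
    subject = base + "#" + node.get("{%s}ID" % RDF_NS)
    class_uri = expand_qname(node.tag)
    # An element other than rdf:Description names the class of the subject.
    if class_uri != RDF_NS + "Description":
        triples.append((subject, RDF_TYPE, class_uri))

for triple in triples:
    print(triple)
```

The element name exterms:Tent expands to the same class URIref that Example 11 states explicitly with rdf:type.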
The examples above have illustrated some of the basic ideas behind the RDF/XML syntax. These examples provide enough information to enable you to begin writing useful RDF/XML. For a more thorough discussion of the principles behind the modeling of RDF statements in XML (known as striping), together with a presentation of the other RDF/XML abbreviations available, and other details and examples about writing RDF in XML, you should refer to the RDF/XML Syntax Specification [RDF-SYNTAX].
RDF provides a number of additional capabilities, including some built-in types and properties for representing groups of resources and RDF statements, and capabilities for deploying RDF information in the World Wide Web. These additional capabilities are described in the following sections.
There is often a need to describe groups of things. For example, we might want to say that a book was created by several authors, or to list the students in a course, or the software modules in a package. RDF provides several pre-defined types and properties that can be used to describe such groups.
First, RDF provides a container vocabulary consisting of three predefined types (together with some associated predefined properties). A container is a resource that contains things. The contained things are called members. The members of a container may be resources or literals. RDF defines three types of containers:
A Bag (a resource having type rdf:Bag) is a group of resources or literals, possibly including duplicate members, where there is no significance in the order of the members. For example, a Bag might be used to describe a group of part numbers in which the order of entry or processing of the part numbers does not matter.
A Sequence or Seq (a resource having type rdf:Seq) is a group of resources or literals, possibly including duplicate members, where the order of the members is significant. For example, a Sequence might be used to describe a group that must be maintained in alphabetical order.
An Alternative or Alt (a resource having type rdf:Alt) is a group of resources or literals that are alternatives (typically for a single value of a property). For example, an Alt might be used to describe alternative language translations for the title of a book, or to describe a list of alternative Internet sites at which a resource might be found. An application using a property whose value is an Alt container should be aware that it can choose any one of the members of the group as appropriate.
To describe a resource as being one of these types of containers, you give the resource an rdf:type property whose value is one of the pre-defined resources rdf:Bag, rdf:Seq, or rdf:Alt (whichever is appropriate). The container resource (which may either be a blank node or a resource with a URIref) denotes the group as a whole. The members of the container can be described by defining a container membership property for each member with the container resource as its subject and the member as its object. These container membership properties have names of the form rdf:_n, where n is a decimal integer greater than zero, with no leading zeros, e.g., rdf:_1, rdf:_2, rdf:_3, and so on, and are used specifically for describing the members of containers. Container resources may also have other properties that describe the container, in addition to the container membership properties and the rdf:type property.
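As a minimal sketch of the statements this paragraph describes, the following Python represents triples as plain tuples of strings. The function names describe_container and is_membership_property, and the blank node label _:group, are our own illustrative choices and are not defined by RDF.

```python
import re

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# rdf:_n, where n is a decimal integer greater than zero, no leading zeros.
MEMBERSHIP = re.compile(r"_[1-9][0-9]*")


def is_membership_property(uri):
    """Return True if uri is a container membership property such as rdf:_3."""
    if not uri.startswith(RDF_NS):
        return False
    return MEMBERSHIP.fullmatch(uri[len(RDF_NS):]) is not None


def describe_container(container, kind, members):
    """Build the rdf:type statement plus one rdf:_n statement per member."""
    if kind not in ("Bag", "Seq", "Alt"):
        raise ValueError("kind must be 'Bag', 'Seq', or 'Alt'")
    triples = [(container, RDF_NS + "type", RDF_NS + kind)]
    for n, member in enumerate(members, start=1):
        triples.append((container, RDF_NS + "_%d" % n, member))
    return triples


for triple in describe_container("_:group", "Bag", ["part-17", "part-42"]):
    print(triple)
```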
It is important to understand that while these types of containers are described using pre-defined RDF types and properties, any special meanings associated with these containers, e.g., that the members of an Alt container are alternative values, are only intended meanings. These specific container types, and their definitions, are provided with the aim of establishing a shared convention among those who need to describe groups of things. All RDF does is provide the types and properties that can be used to construct the RDF graphs to describe each type of container. RDF has no more built-in understanding of what a resource of type rdf:Bag is than it has of what a resource of type exterms:Tent (discussed in Section 3.2) is. In each case, applications must be written to behave according to the particular meaning involved for each type. This point will be expanded on in the following examples.
A typical use of a container is to indicate that the value of a property is a group of things. For example, to represent the sentence "Course 6.001 has the students Amy, Tim, John, Mary, and Sue", you could describe the course by giving it a s:students property whose value is a container of type rdf:Bag (the group of students) and then, using the container membership properties, describe the individual students as being members of that container, as in the RDF graph shown in Figure 14:
Since the value of the s:students property in this example is described as a Bag, there is no intended significance in the order given for the URIrefs of each student, even though the properties in the graph have integers in their names. It is up to applications creating and processing graphs that include rdf:Bag containers to ignore any (apparent) order in the names of the membership properties.
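One way an application might honor the intended meaning of a Bag is to compare members as an unordered collection that still counts duplicates. The sketch below does this with Python's Counter; the tuple representation of triples, the function bag_members, and the blank node label _:class are assumptions made for illustration only.

```python
from collections import Counter

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
STUDENT = "http://example.edu/students/"


def bag_members(triples, container):
    """Collect the members of a container, ignoring the numbers in rdf:_n."""
    return Counter(
        obj
        for subj, pred, obj in triples
        if subj == container and pred.startswith(RDF_NS + "_")
    )


# Two descriptions of the same group, numbered in different orders.
graph_a = [
    ("_:class", RDF_NS + "type", RDF_NS + "Bag"),
    ("_:class", RDF_NS + "_1", STUDENT + "Amy"),
    ("_:class", RDF_NS + "_2", STUDENT + "Tim"),
]
graph_b = [
    ("_:class", RDF_NS + "type", RDF_NS + "Bag"),
    ("_:class", RDF_NS + "_1", STUDENT + "Tim"),
    ("_:class", RDF_NS + "_2", STUDENT + "Amy"),
]

same_group = bag_members(graph_a, "_:class") == bag_members(graph_b, "_:class")
print(same_group)
```

Nothing in RDF itself performs this comparison; it is behavior the application chooses to implement for resources of type rdf:Bag.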
RDF/XML provides some special syntax and abbreviations to make it simpler to describe such containers. For example, Example 13 describes the graph shown in Figure 14:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:s="http://example.edu/students/vocab#">
  <rdf:Description rdf:about="http://example.edu/courses/6.001">
    <s:students>
      <rdf:Bag>
        <rdf:li rdf:resource="http://example.edu/students/Amy"/>
        <rdf:li rdf:resource="http://example.edu/students/Tim"/>
        <rdf:li rdf:resource="http://example.edu/students/John"/>
        <rdf:li rdf:resource="http://example.edu/students/Mary"/>
        <rdf:li rdf:resource="http://example.edu/students/Sue"/>
      </rdf:Bag>
    </s:students>
  </rdf:Description>
</rdf:RDF>
Example 13 shows that RDF/XML provides li as a convenience element to avoid having to explicitly number each membership property. The numbered properties rdf:_1, rdf:_2, and so on are generated from the li elements in forming the corresponding graph. The element name li was chosen to be mnemonic with the term "list item" from HTML. Note also the use of a <rdf:Bag> element within the <s:students> property element. The <rdf:Bag> element is another example of the abbreviation we used in Example 12 that lets us replace both an rdf:Description element and an rdf:type element with a single element. Since no URIref is specified, the Bag is a blank node. Its nesting within the <s:students> property element is an abbreviated way of indicating that the blank node is the value of this property. These abbreviations are described further in [RDF-SYNTAX].
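The generation of numbered properties from li elements can be sketched as follows. This is a deliberately narrow illustration using Python's standard ElementTree module, not a conforming RDF/XML parser: it handles only the shape of Example 13 (an rdf:Description whose property elements each contain one container element), and the blank node labels it invents (_:b1 and so on) are arbitrary.

```python
import itertools
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Example 13, with two of the five students omitted for brevity.
DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:s="http://example.edu/students/vocab#">
  <rdf:Description rdf:about="http://example.edu/courses/6.001">
    <s:students>
      <rdf:Bag>
        <rdf:li rdf:resource="http://example.edu/students/Amy"/>
        <rdf:li rdf:resource="http://example.edu/students/Tim"/>
        <rdf:li rdf:resource="http://example.edu/students/John"/>
      </rdf:Bag>
    </s:students>
  </rdf:Description>
</rdf:RDF>"""


def q(name):
    """ElementTree name for a term in the RDF namespace."""
    return "{%s}%s" % (RDF_NS, name)


def uri(tag):
    """Turn ElementTree's '{namespace}local' tag into a full URIref."""
    namespace, local = tag[1:].split("}", 1)
    return namespace + local


root = ET.fromstring(DOC)
blank_ids = itertools.count(1)
triples = []
for description in root.findall(q("Description")):
    subject = description.get(q("about"))
    for prop in description:
        for container in prop:
            # No URIref is given for the container, so it is a blank node.
            node = "_:b%d" % next(blank_ids)
            triples.append((subject, uri(prop.tag), node))
            triples.append((node, RDF_NS + "type", uri(container.tag)))
            # Each rdf:li becomes rdf:_1, rdf:_2, ... in document order.
            for n, li in enumerate(container.findall(q("li")), start=1):
                triples.append((node, RDF_NS + "_%d" % n, li.get(q("resource"))))

for triple in triples:
    print(triple)
```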
The graph structure for an rdf:Seq container, and the corresponding RDF/XML, are similar to those for an rdf:Bag (the only difference is in the type, rdf:Seq). Once again, although an rdf:Seq container is intended to describe a sequence, it is up to applications creating and processing the graph to appropriately interpret the sequence of integer-valued property names.
As an illustration of an Alt container, the sentence "The source code for X11 may be found at ftp.example.org, ftp.example1.org, or ftp.example2.org" could be expressed in the RDF graph shown in Figure 15:
Example 14 shows how the graph in Figure 15 could be written in RDF/XML:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:s="http://example.org/packages/vocab#">
  <rdf:Description rdf:about="http://example.org/packages/X11">
    <s:DistributionSite>
      <rdf:Alt>
        <rdf:li rdf:resource="ftp://ftp.example.org"/>
        <rdf:li rdf:resource="ftp://ftp.example1.org"/>
        <rdf:li rdf:resource="ftp://ftp.example2.org"/>
      </rdf:Alt>
    </s:DistributionSite>
  </rdf:Description>
</rdf:RDF>
An Alt container is intended to have at least one member, identified by the property rdf:_1. This member is intended to be considered as the default or preferred value. Other than the member identified as rdf:_1, the order of the remaining elements is not significant.
The RDF in Figure 15 as written states simply that the value of the s:DistributionSite property is the Alt container resource itself. Any additional meaning that is to be read into this graph, e.g., that one of the members of the Alt container is to be considered as the value of the s:DistributionSite property, or that ftp://ftp.example.org is the default or preferred value, must be built into an application's understanding of how an Alt is intended to behave, and/or into the meaning defined for the particular property (s:DistributionSite in this case), which also must be understood by the application.
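The kind of behavior an application might build in for an Alt can be sketched as below. The selection policy shown (use any member the caller accepts, otherwise fall back to the rdf:_1 member) is only one possible reading of the intended meaning; the function names, the blank node label _:sites, and the tuple representation of triples are our assumptions.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def numbered_members(triples, container):
    """Return a container's members ordered by the integer in rdf:_n."""
    numbered = []
    for subj, pred, obj in triples:
        if subj == container and pred.startswith(RDF_NS + "_"):
            numbered.append((int(pred[len(RDF_NS) + 1:]), obj))
    return [member for _, member in sorted(numbered)]


def choose(triples, container, acceptable=None):
    """Pick one member of an Alt; rdf:_1 is treated as the default."""
    members = numbered_members(triples, container)
    if not members:
        raise ValueError("an Alt is intended to have at least one member")
    if acceptable is not None:
        for member in members:
            if acceptable(member):
                return member
    return members[0]


# The container from Figure 15, with an arbitrary blank node label.
graph = [
    ("http://example.org/packages/X11",
     "http://example.org/packages/vocab#DistributionSite", "_:sites"),
    ("_:sites", RDF_NS + "type", RDF_NS + "Alt"),
    ("_:sites", RDF_NS + "_1", "ftp://ftp.example.org"),
    ("_:sites", RDF_NS + "_2", "ftp://ftp.example1.org"),
    ("_:sites", RDF_NS + "_3", "ftp://ftp.example2.org"),
]

default_site = choose(graph, "_:sites")
other_site = choose(graph, "_:sites", acceptable=lambda u: "example2" in u)
print(default_site)
print(other_site)
```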
Alt containers are frequently used in conjunction with language tagging. For example, a work whose title has been translated into several languages might have its Title property pointing to an Alt container holding each of the language variants.
The distinction between the intended meanings of a Bag and an Alt can be further illustrated by considering the authorship of the book "Huckleberry Finn". The book has exactly one author, but the author has two names (Mark Twain and Samuel Clemens). Either name is sufficient to specify the author. Thus using an Alt container of the author's names more accurately represents the relationship than using a Bag (which might suggest there are two different authors).
Users are free to choose their own ways to describe groups of resources, rather than using the ones described here. These RDF containers are merely provided as common definitions that, if generally used, could help make data involving groups of resources more interoperable.
Sometimes there are clear alternatives to using these RDF container types. For example, a relationship between a particular resource and a group of other resources could be indicated by making the first resource the subject of multiple statements using the same property. This is structurally not the same as the resource being the subject of a single statement whose object is a container containing multiple members. In some cases, these two structures may have equivalent meaning, but in other cases they may not. The choice of which to use in a given situation should be made with this in mind.
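The structural difference between the two approaches shows up as soon as an application asks for the values of the property. In this sketch, which reuses the course and students vocabulary from Example 13, triples are plain tuples and the helper objects and the blank node label _:group are hypothetical names chosen for illustration.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
COURSE = "http://example.edu/courses/6.001"
STUDENTS = "http://example.edu/students/vocab#students"
AMY = "http://example.edu/students/Amy"
TIM = "http://example.edu/students/Tim"

# Repeated properties: one statement per student.
repeated = [
    (COURSE, STUDENTS, AMY),
    (COURSE, STUDENTS, TIM),
]

# Container: a single statement whose object is the group as a whole.
contained = [
    (COURSE, STUDENTS, "_:group"),
    ("_:group", RDF_NS + "type", RDF_NS + "Bag"),
    ("_:group", RDF_NS + "_1", AMY),
    ("_:group", RDF_NS + "_2", TIM),
]


def objects(triples, subject, predicate):
    """Return the objects of all statements with this subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects(repeated, COURSE, STUDENTS))
print(objects(contained, COURSE, STUDENTS))
```

In the first graph the property has two values, each a student; in the second it has one value, the container, and the students are reached only through the container's membership properties.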
Consider as an example the relationship between a writer and her publications. We might have the sentence:
Sue has written "Anthology of Time", "Zoological Reasoning", and "Gravitational Reflections".
In this case, there are three resources each of which was written independently by the same writer. This could be expressed using repeated properties as:
exstaff:Sue exterms:publication ex:AnthologyOfTime .
exstaff:Sue exterms:publication ex:ZoologicalReasoning .
exstaff:Sue exterms:publication ex:GravitationalReflections .
In this