W3C

RDF Primer

Editor's Working Draft 24 October 2002

This version:
http://www.w3.org/2001/09/rdfprimer/rdf-primer-20021021.html
Latest version:
http://www.w3.org/TR/rdf-primer/
Previous version:
http://www.w3.org/TR/2002/WD-rdf-primer-20020426
Editors:
Frank Manola, The MITRE Corporation, fmanola@mitre.org
Eric Miller, W3C, em@w3.org
Series Editor:
Brian McBride, Hewlett-Packard Laboratories, bwm@hplb.hpl.hp.com

Abstract

The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web. It is particularly intended for representing metadata about Web resources, such as the title, author, and modification date of a Web page, copyright and licensing information about a Web document, or the availability schedule for some shared resource. However, by generalizing the concept of a "Web resource", RDF can also be used to represent information about things that can be identified on the Web, even when they can't be directly retrieved on the Web. Examples include information about items available from online shopping facilities (e.g., information about specifications, prices, and availability), or the description of a Web user's preferences for information delivery. RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning. Since it is a common framework, application designers can leverage the availability of common RDF parsers and processing tools. The ability to exchange information between different applications means that the information may be made available to applications other than those for which it was originally created. This Primer is designed to provide the reader with the basic fundamentals required to effectively use RDF in their particular applications.

Status of this Document

This is a W3C RDF Core Working Group Working Draft produced as part of the W3C Semantic Web Activity. This document incorporates material developed by the Working Group designed to provide the reader the basic fundamentals required to effectively use RDF in their particular applications.

This document is being released for review by W3C members and other interested parties to encourage feedback and comments. This is the current state of an ongoing work on the Primer.

This is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use it as reference material or to cite as other than "work in progress". A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.

Comments on this document are invited and should be sent to the public mailing list www-rdf-comments@w3.org. An archive of comments is available at http://lists.w3.org/Archives/Public/www-rdf-comments/.

Table of Contents

  1. Introduction
  2. Making Statements About Resources
      2.1 Uniform Resource Identifiers (URIs)
      2.2 Documents: Extensible Markup Language (XML)
      2.3 The RDF Model
      2.4 Structured Property Values and Blank Nodes
      2.5 Typed Literals
  3. An XML Syntax for RDF: RDF/XML
      3.1 Defining New RDF Resources
      3.2 Additional RDF/XML Abbreviations
      3.3 RDF/XML Summary
  4. Other RDF Classes and Properties
      4.1 RDF Containers
      4.2 RDF Reification
      4.3 Miscellaneous RDF Facilities
            4.3.1 More on Structured Values: rdf:value
            4.3.2 Boolean-valued Properties
            4.3.3 Embedding RDF in HTML
  5. Defining RDF Vocabularies: RDF Schema
      5.1 Defining Classes
      5.2 Defining Properties
      5.3 Interpreting RDF Schema Declarations
      5.4 Other Schema Information
      5.5 Richer Schema Languages
  6. Some RDF Applications: RDF in the Field
      6.1 Dublin Core Metadata Initiative
      6.2 PRISM
      6.3 XPackage
      6.4 Intelligent Routing: Reuters Health Information
      6.5 RSS: RDF Site Summary 1.0
      6.6 CIM/XML
      6.7 Gene Ontology Consortium
  7. Other Parts of the RDF Specification
      7.1 Model Theory
      7.2 Test Cases
  8. RDF As a Data Model
  9. References
      9.1 Normative References
      9.2 Informational References
10. Acknowledgments

Appendices

  A. Changes


1. Introduction

The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web. It is particularly intended for representing metadata about Web resources, such as the title, author, and modification date of a Web page, copyright and licensing information about a Web document, or the availability schedule for some shared resource. However, by generalizing the concept of a "Web resource", RDF can also be used to represent information about things that can be identified on the Web, even when they can't be directly retrieved on the Web. Examples include information about items available from online shopping facilities (e.g., information about specifications, prices, and availability), or the description of a Web user's preferences for information delivery.

RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning. Since it is a common framework, application designers can leverage the availability of common RDF parsers and processing tools. The ability to exchange information between different applications means that the information may be made available to applications other than those for which it was originally created.

To make this discussion somewhat more concrete as soon as possible, the following is a small chunk of RDF in its XML serialization format.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns="http://www.w3.org/2000/10/swap/pim/contact#">
  <Person rdf:about="http://www.w3.org/People/EM/contact#me">
    <fullName>Eric Miller</fullName>
    <mailbox rdf:resource="mailto:em@w3.org"/>
    <personalTitle>Semantic Web Activity Lead</personalTitle> 
  </Person>
</rdf:RDF>

This example roughly translates as a collection of statements: "there is someone whose name is Eric Miller, whose email address is em@w3.org, and whose title is Semantic Web Activity Lead". Note that the example contains what seem to be Web addresses, as well as some properties like mailbox and fullName, and their respective values em@w3.org and Eric Miller.
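To see how a program might pull these property/value pairs out of the example, here is a minimal sketch in Python using only the standard library's generic XML parser. It is not an RDF parser: it assumes the exact shape of the example above (one described resource with simple child elements), and the variable names are ours, chosen for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

doc = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns="http://www.w3.org/2000/10/swap/pim/contact#">
  <Person rdf:about="http://www.w3.org/People/EM/contact#me">
    <fullName>Eric Miller</fullName>
    <mailbox rdf:resource="mailto:em@w3.org"/>
    <personalTitle>Semantic Web Activity Lead</personalTitle>
  </Person>
</rdf:RDF>"""

root = ET.fromstring(doc)
person = root[0]  # assumption: exactly one described resource

# The thing being described is named by the rdf:about attribute.
subject = person.get(f"{{{RDF_NS}}}about")

pairs = []
for prop in person:
    # ElementTree writes qualified tags as "{namespace-uri}local-name".
    local_name = prop.tag.split("}", 1)[1]
    # A value is either a reference (rdf:resource) or plain text content.
    value = prop.get(f"{{{RDF_NS}}}resource") or prop.text
    pairs.append((local_name, value))

print(subject)
for local_name, value in pairs:
    print(" ", local_name, "=", value)
```

A real application would normally use an RDF parser rather than walking the XML by hand, since RDF/XML permits many other ways of writing the same statements.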

Like HTML, this form of information is machine processable, and links pieces of data across the Web. However, unlike conventional hypertext, RDF references can refer to any identifiable thing, including things that may or may not be Web-based data. The result is that in addition to describing Web pages, we can also convey information about cars, businesses, people, news events, etc. Further, RDF references themselves can be labeled, to indicate the kind of relationship that exists between the linked items.

The complete specification of RDF consists of a number of documents:

This Primer is intended to augment the other parts of the RDF specification, to help information system designers and application developers understand the features of RDF and how to use them. In particular, the Primer is intended to answer such questions as:

  • What does RDF look like?
  • What information can RDF represent?
  • How is RDF information created, accessed, and processed?
  • How can existing information be combined with RDF?

The Primer is a non-normative document, which means that it does not provide a definitive (from the W3C's point of view) specification of RDF. The examples and other explanatory material in this document are provided to help you understand RDF, but they may not always provide definitive or fully-complete answers. In such cases, you should refer to the relevant normative parts of the RDF specification. To help you do this, we provide links pointing to the relevant parts of the normative specifications.

2. Making Statements About Resources

RDF is intended to provide a simple way to state properties of (facts about) Web resources, e.g., Web pages. For example, imagine that we want to record the fact that someone named John Smith created a particular Web page. A straightforward way to state this fact in English would be in the form of a simple statement, e.g.:

http://www.example.org/index.html has a creator whose value is John Smith

This statement has three distinct parts (the Web page's URL, the word "creator", and the name "John Smith"). They illustrate that, in order to describe the properties of something, we need ways to name, or identify, a number of things:

  • We need a way to identify the thing we want to describe (the Web page, in this case)
  • We need a way to identify a specific property (the creator) of the thing that we want to describe
  • We need a way to identify the thing we want to assign as the value of this property (who the creator is), for the thing we want to describe

In this statement, we've used the Web page's URL (Uniform Resource Locator) to identify it. In addition, we've used the word "creator" to identify the property we want to talk about, and the two words "John Smith" to identify the thing (a person) we want to say is the value of this property.

We could state other properties of this Web page by writing additional English statements of the same general form, using the URL to identify the page, and words (or other expressions) to identify the properties and their values. For example, to specify the date the page was created, and the language in which the page is written, we could write the additional statements:

http://www.example.org/index.html has a creation-date whose value is August 16, 1999
http://www.example.org/index.html has a language whose value is English

RDF is based on the idea that the things we want to describe have properties which have values, and that resources can be described by making statements, similar to those above, that specify those properties and values. RDF uses a particular terminology for talking about the various parts of statements. Specifically, the part that identifies the thing the statement is about (the Web page in this example) is called the subject. The part that identifies the property or characteristic of the subject that the statement specifies (creator, creation-date, or language in these examples) is called the predicate, and the part that identifies the value of that property is called the object. So, taking the English statement

http://www.example.org/index.html has a creator whose value is John Smith

the RDF terms for the various parts of the statement are:

  • the subject is the URL http://www.example.org/index.html
  • the predicate is the word "creator"
  • the object is the words "John Smith"

However, while English is good for communicating between (English-speaking) humans, RDF is about making machine-processable statements. To make these kinds of statements suitable for processing by machines, we need two things:

  • a system of machine-processable identifiers that allows us to identify a subject, predicate, or object in a statement without any possibility of confusion with a similar-looking identifier that might be used by someone else on the Web.
  • a machine-processable format for representing these statements and exchanging them between machines.

Fortunately, the existing Web architecture provides us with both of the necessary mechanisms. The Web's Uniform Resource Identifier (URI) provides us with a way to uniquely identify anything we want to talk about in an RDF statement, and the Extensible Markup Language (XML) provides us with a format for representing and exchanging RDF statements. The next two sections briefly describe these mechanisms.

2.1 Uniform Resource Identifiers (URIs)

If we want to discuss something, we must first identify it. How else will we know what we are referring to? In everyday communication, we use references such as "Bob", "The Moon", "373 Whitaker Ave.", "California", "VIN 2745534", "today's weather", etc., to identify things. Ambiguities in these identifiers are generally resolved in terms of a shared semantic context between the sender and the receiver. To refer to "things" on the Web, we also use identifiers.

As we've seen, the Web already provides one form of identifier, the Uniform Resource Locator (URL). We used a URL in our original example to identify the Web page that John Smith created. A URL is a character string that identifies a Web resource by representing its primary access mechanism (essentially, its network "location"). However, we would like to be able to record information about many things in addition to Web pages. In particular, we'd like to record information about lots of things that don't have network locations or URLs. For example, I (a human being) don't have a network location or URL, and yet my employer needs to record all sorts of things about me in order to pay my salary, keep track of the work that I've been doing, and so on. My doctor needs to record other sorts of things about me in order to keep track of my medical history, tests that have been performed (and the results, who performed them, and when), inoculations I've received, etc.

We've recorded information about lots of things that don't have URLs in files (both manual and automated) for many years, and the way we identify those things is by assigning them identifiers: values that we uniquely associate with the individual things. The identifiers we use to identify various kinds of things go by names like "Social Security Number", "Part Number", "license number", "employee number", "user-id", etc. In some cases, these identifiers (such as Social Security Numbers) are assigned by a recognized authority of some kind. In other cases, these identifiers are generated by a private organization or individual. In some cases, these identifiers have a national or international scope within which they are unique (a Social Security Number has national scope), while in other cases they may only be unique within a very limited scope (my employee number is only unique among the numbers assigned by my specific employer). Nevertheless, these identifiers serve, if used properly, to identify the things we want to talk about.

The Web provides its own form of identifier for these purposes, called the Uniform Resource Identifier (URI). The URLs we've already discussed are a particular kind of URI. All URIs share the property that different persons or organizations can independently create them, and use them to identify things. However, URIs are not limited to identifying things that have network locations, or use other computer access mechanisms. In fact, we can create a URI to refer to anything we want to talk about, including

  • network-accessible things, such as an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), or a collection of other resources.
  • things that are not network-accessible, such as human beings, corporations, and bound books in a library.
  • abstract concepts that don't physically exist, like the concept of a "creator".

URIs essentially constitute an infinite stock of names that can be used to identify things. A number of different URI schemes (URI forms) have already been developed, and are being used, for various purposes. Examples include:

  • http: (Hypertext Transfer Protocol, primarily for Web pages)
  • mailto: (email addresses), e.g., mailto:em@w3.org
  • ftp: (File Transfer Protocol)
  • urn: (Uniform Resource Names, intended to be persistent location-independent resource identifiers), e.g., urn:isbn:0-520-02356-0 (for a book)

URIs are defined in RFC 2396 [URI]. Some additional discussion of URIs can be found in Naming and Addressing: URIs, URLs, ... [NAMEADDRESS]. A list of existing URI schemes can be found in Addressing Schemes [ADDRESS-SCHEMES], and it is a good idea to consider adapting one of the existing schemes for any specialized identification purposes you may have, rather than trying to invent a new one.

No one person or organization controls who makes URIs or how they can be used. While some URI schemes, such as the http: scheme used in URLs, depend on centralized systems such as DNS, other schemes, such as freenet:, are completely decentralized. This means that, as with any other kind of name, you don't need special authority or permission to create a URI for something. Also, you can create URIs for things you don't own, just as in ordinary language you can use whatever name you like for things you don't own. The URI is the foundation of the Web. While nearly every other part of the Web can be replaced, the URI cannot: it holds the Web together.

Since the URI is such a general identification mechanism, capable of identifying anything, it should not be surprising that RDF uses URIs as the basis of its mechanism for identifying the subjects, predicates, and objects in statements. To be more precise, RDF uses URI references [URI] to define its subjects, predicates, and objects. A URI reference (or URIref) is a URI, together with an optional fragment identifier at the end. For example, the URI reference http://www.example.org/index.html#section2 consists of the URI http://www.example.org/index.html and (separated by the "#" character) the fragment identifier section2. RDF defines a resource as anything that is identifiable by a URI reference, and hence using URIrefs allows RDF to describe practically anything, and to state relationships between such things as well.
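The split of a URI reference into a URI and an optional fragment identifier can be illustrated with a short sketch. It assumes Python's standard urllib.parse module, which is our choice for illustration and is not part of RDF.

```python
from urllib.parse import urldefrag

uriref = "http://www.example.org/index.html#section2"

# urldefrag separates the part before "#" from the fragment identifier.
uri, fragment = urldefrag(uriref)
print(uri)       # http://www.example.org/index.html
print(fragment)  # section2

# A URIref without a fragment identifier yields an empty fragment.
plain_uri, no_fragment = urldefrag("http://www.example.org/index.html")
print(repr(no_fragment))  # ''
```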

In order to make writing URIrefs easier, URIrefs may be either absolute or relative. An absolute URIref refers to a resource independently of the context in which the URIref appears, e.g., the URIref http://www.example.org/index.html. A relative URIref is a shorthand form of an absolute URIref, where some prefix of the URIref is missing, and information from the context in which the URIref appears is required to fill in the missing information. For example, the relative URIref otherpage.html, when appearing in a resource http://www.example.org/index.html, would be filled out to the absolute URIref http://www.example.org/otherpage.html. A URIref that does not contain a URI is considered a reference to the current document (the document in which it appears). So, an empty URIref within a document is considered equivalent to the URIref of the document itself. A URIref consisting of just a fragment identifier is considered equivalent to the URIref of the document in which it appears, with the fragment identifier appended to it. For example, within http://www.example.org/index.html, if #section2 appeared as a URIref, it would be considered equivalent to the absolute URIref http://www.example.org/index.html#section2.
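The three cases described above (a relative URIref, an empty URIref, and a URIref consisting of just a fragment identifier) can be tried out with a minimal sketch. It assumes Python's standard urllib.parse.urljoin, which for these simple cases gives the results described in the text; it is an illustration, not a statement of how any RDF processor resolves URIrefs.

```python
from urllib.parse import urljoin

# The document in which the URIrefs appear.
base = "http://www.example.org/index.html"

resolved = {
    relative: urljoin(base, relative)
    for relative in ("otherpage.html", "", "#section2")
}

for relative, absolute in resolved.items():
    print(repr(relative), "->", absolute)
# 'otherpage.html' -> http://www.example.org/otherpage.html
# '' -> http://www.example.org/index.html
# '#section2' -> http://www.example.org/index.html#section2
```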

Both RDF and web browsers use URIrefs to identify things. However, RDF and browsers interpret URIrefs in slightly different ways. This is because RDF uses URIrefs only to identify things, while browsers also use URIrefs to retrieve things. Often there is no effective difference, but in some cases the difference can be significant. One obvious difference is that when a URIref is used in a browser, there is the expectation that it identifies a resource that can actually be retrieved: that something is actually "at" the location identified by the URI. However, in RDF a URIref may be used to identify something, like a person, that has no physical existence on the web, and hence can't be retrieved. People sometimes use RDF together with a convention that, when a URIref is used to identify an RDF resource, a page containing descriptive information about that resource will be placed on the web "at" that URI, so that the URIref can be used in a browser to retrieve that information. This can be a useful convention in some circumstances (although it creates a difficulty in distinguishing the identity of the original resource from the identity of the web page describing it). However, this convention is not an explicit part of the definition of RDF, and RDF itself does not assume that a URIref identifies something that can be retrieved.

Another difference is in the way URIrefs with fragment identifiers are handled. Fragment identifiers are often seen in URLs that identify HTML documents, where they serve to identify a specific place within the document identified by the URL. In normal HTML usage, where URI references are used to retrieve the indicated resources, the two URIrefs:

http://www.example.org/index.html
http://www.example.org/index.html#section2

are related (they both refer to the same document, the second one identifying a location within the first one). However, as noted already, RDF uses URI references purely to identify resources, not to retrieve them, and RDF assumes no particular relationship between these two URIrefs. As far as RDF is concerned, they are syntactically different URI references, and hence may refer to unrelated things. (This doesn't mean that the HTML-defined containment relationship might not exist, just that RDF doesn't assume that a relationship exists based only on the fact that the URI parts of the URI references are the same.)

In later sections, we'll see how RDF uses URIrefs for identifying the subjects, predicates, and objects in statements. But before we do that, we need to briefly introduce, in the next section, the basis of how RDF statements can be physically represented and exchanged.

2.2 Documents: Extensible Markup Language (XML)

The Extensible Markup Language [XML] was designed to allow anyone to design their own document format and then write a document in that format. Like HTML documents (Web pages), XML documents contain text. This text consists primarily of plain text content, and markup in the form of tags. This markup allows a processing program to interpret the various pieces of content (elements). In HTML, the set of permissible tags, and their interpretation, is defined by the HTML specification. However, XML allows users to define their own markup languages (tags and the structures in which they can appear) adapted to their own specific requirements. For example, the following is a simple passage marked up using an XML-based markup language:

<sentence><person href="http://example.com/#me">I</person> 
just got a new pet <animal>dog</animal>.</sentence>

Elements delimited by tags (<sentence>, <person>, etc.) are introduced to reflect a particular structure associated with the passage. These tags allow a program written with an understanding of these particular elements to properly interpret the passage.

This particular markup language uses the words "sentence," "person," and "animal" as tag names in an attempt to convey some of the meaning of the elements; and they would convey meaning to an English-speaking person reading it, or to a program specifically written to interpret this vocabulary. However, there is no built-in meaning here. For example, to non-English speakers, or to a program not written to understand this markup, the element <person> may mean absolutely nothing. Take the following passage, for example:

<dfgre><reghh bjhb="http://example.com/#me">I</reghh> 
just got a new pet <yudis>dog</yudis>.</dfgre>

To a machine, this passage has exactly the same structure as the previous example. However, it is no longer clear to an English-speaker what is being said, because the tags are no longer English words. Moreover, others may have used the same words as tags in their own markup languages, but with completely different intended meanings. For example, "sentence" in another markup language might refer to the amount of time that a convicted criminal must serve in a penal institution. So additional mechanisms must be provided to help keep XML vocabulary straight.

To prevent confusion, it is necessary to uniquely identify markup elements. This is done in XML using XML Namespaces [XML-NS]. A namespace is just a way of identifying a part of the Web (space) which acts as a qualifier for a specific set of names. A namespace is created for an XML markup language by creating a URI for it. By qualifying tag names with the URIs of their namespaces, anyone can create their own tags and properly distinguish them from tags with identical spellings created by others. A useful practice is to create a Web page to describe the markup language (and the intended meaning of the tags) and use the URL of that Web page as the URI for its namespace. The following example illustrates the use of an XML namespace.

<my:sentence xmlns:my="http://example.org/xml/documents/">
   <my:person my:href="http://example.com/#me">I</my:person> 
just got a new pet <my:animal>dog</my:animal>.
</my:sentence>

In this example, xmlns:my="http://example.org/xml/documents/" declares a namespace for use in this piece of XML. It maps the prefix my to the namespace URI http://example.org/xml/documents/. The XML content can then use qualified names (or QNames) like my:person as tags. A QName contains a prefix that identifies a namespace, followed by a colon, and then a local name for an XML tag (element) or attribute. By using namespace URIs to distinguish specific collections of names, and qualifying tags with the URIs of the namespaces they come from, as in this example, we don't have to worry about tag names conflicting. Two tags having the same spelling are considered the same only if they also have the same namespace URIs.
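To see how a namespace-aware program treats these qualified names, here is a minimal sketch using the XML parser in Python's standard library. The second namespace URI, http://example.org/legal/, is hypothetical and introduced only to show two tags with the same spelling being kept apart.

```python
import xml.etree.ElementTree as ET

doc = """<my:sentence xmlns:my="http://example.org/xml/documents/">
   <my:person my:href="http://example.com/#me">I</my:person>
just got a new pet <my:animal>dog</my:animal>.
</my:sentence>"""

root = ET.fromstring(doc)

# ElementTree replaces each prefix with its namespace URI, written as
# "{namespace-uri}local-name", so the prefix "my" itself no longer matters.
print(root.tag)
for child in root:
    print(child.tag, child.attrib)

# A tag spelled "sentence" in a different (hypothetical) namespace
# is a different tag as far as the parser is concerned.
other = ET.fromstring('<law:sentence xmlns:law="http://example.org/legal/"/>')
print(other.tag == root.tag)  # False
```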

RDF defines a specific XML markup language, referred to as RDF/XML, for use in representing RDF information, and for exchanging it between machines. An example of RDF/XML was given in Section 1, and the language is described in more detail in Section 3.

In RDF/XML, XML QName tags are used together with namespace URIs to provide URIrefs for the names used in RDF. This is done by concatenating the namespace URI and the local (tag) name. For example, if the XML namespace assigned to prefix foo has the URI http://example.org/somewhere/, then the tag foo:bar can be used as a shorthand for the URIref http://example.org/somewhere/bar. Similarly, in the previous example, the my:person tag would have the namespace URI http://example.org/xml/documents/ concatenated with local name person, giving it the URIref http://example.org/xml/documents/person. This simple method restricts the URIrefs that can be generated from the XML, and allows the same URIref to be constructed in multiple ways, but if care is taken, it works satisfactorily.
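The concatenation just described can be sketched in a few lines of Python. The function name expand_qname and the prefix foo2 are hypothetical, introduced only for illustration; foo2 shows how the same URIref can be constructed in more than one way.

```python
from typing import Mapping

PREFIXES = {
    "foo": "http://example.org/somewhere/",
    "my": "http://example.org/xml/documents/",
    # Hypothetical second prefix whose namespace URI ends in mid-word.
    "foo2": "http://example.org/somewhere/b",
}


def expand_qname(qname: str, prefixes: Mapping[str, str]) -> str:
    """Concatenate the namespace URI for the prefix with the local name."""
    prefix, local_name = qname.split(":", 1)
    return prefixes[prefix] + local_name


print(expand_qname("foo:bar", PREFIXES))    # http://example.org/somewhere/bar
print(expand_qname("my:person", PREFIXES))  # http://example.org/xml/documents/person

# Two different QNames can expand to the same URIref.
print(expand_qname("foo2:ar", PREFIXES) == expand_qname("foo:bar", PREFIXES))  # True
```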

2.3 The RDF Model

Now that we've introduced URI references for identifying things we want to talk about on the Web, and XML as a machine-processable way of representing RDF statements, we can describe how RDF lets us use URIs to make statements about resources. In the introduction, we said that RDF was based on the idea of expressing simple statements about resources, where those statements are built using subjects, predicates, and objects. In RDF, we could represent our original English statement:

http://www.example.org/index.html has a creator whose value is John Smith

by an RDF statement having:

  • a subject http://www.example.org/index.html
  • a predicate http://purl.org/dc/elements/1.1/creator
  • and an object http://www.example.org/staffid/85740

Note how we have introduced URIrefs to identify not only the subject of the original statement, but also the predicate and object, instead of using the words "creator" and "John Smith", respectively. We'll discuss this further a bit later on.

RDF models statements as nodes and arcs in a graph. In this notation, a statement is represented by:

  • a node for the subject, labeled with its URIref
  • a node for the object, labeled with its URIref
  • an arc for the predicate, labeled with its URIref, directed from the subject node to the object node.

So the RDF statement above would be represented by the graph shown in Figure 1:

Figure 1: A Simple RDF Statement

Collections of statements are represented by corresponding collections of nodes and arcs. So if we wanted to also represent the additional statements

http://www.example.org/index.html has a creation-date whose value is August 16, 1999
http://www.example.org/index.html has a language whose value is English

we could, by introducing suitable URIrefs to name the properties "creation-date" and "language", use the graph shown in Figure 2:

Figure 2: Several Statements About the Same Resource

Figure 2 illustrates that RDF permits the objects of statements (but not the subjects or predicates) to be constant values (called literals) represented by character strings, as well as URIrefs, in order to represent certain kinds of property values. In drawing RDF graphs, nodes that represent resources identified by URIrefs are shown as ellipses, while nodes that represent literals are shown as boxes (labeled by the literal itself). RDF graphs are technically "labeled directed graphs", since the arcs have labels, and are "directed" (point in a specific direction, from subject to object).

Sometimes it is not convenient to draw graphs, so an alternative way of writing down the statements, called N-Triples, can also be used. In the N-Triples notation, each statement in the graph is written as a simple triple of subject, predicate, and object node labels (either URIref or literal), in that order. The N-Triples representing the three statements shown in Figure 2 would be written:

<http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/creator> <http://www.example.org/staffid/85740> .

<http://www.example.org/index.html> <http://www.example.org/terms/creation-date> "August 16, 1999" .

<http://www.example.org/index.html> <http://www.example.org/terms/language> "English" .

Each triple corresponds to a single arc in the graph, complete with the arc's beginning and ending nodes (the subject and object of the statement). Unlike the drawn graph (but like the original statements), the N-Triples notation requires that a node be separately identified for each statement it appears in. So, for example, http://www.example.org/index.html appears three times (once in each triple) in the N-Triples representation of the graph, but only once in the drawn graph. However, the triples represent exactly the same information as the graph.
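As an illustration of how a graph held in memory might be written out in this notation, here is a minimal Python sketch. The tuple layout and the function names are our own assumptions, and only backslashes and double quotes in literals are escaped; a full N-Triples writer would need to handle more cases.

```python
from typing import List, Tuple

# (subject, predicate, object, object_is_literal)
Triple = Tuple[str, str, str, bool]

PAGE = "http://www.example.org/index.html"

TRIPLES: List[Triple] = [
    (PAGE, "http://purl.org/dc/elements/1.1/creator",
     "http://www.example.org/staffid/85740", False),
    (PAGE, "http://www.example.org/terms/creation-date",
     "August 16, 1999", True),
    (PAGE, "http://www.example.org/terms/language",
     "English", True),
]


def escape_literal(text: str) -> str:
    # Only backslash and double-quote escaping is handled in this sketch.
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples_line(triple: Triple) -> str:
    """Write one arc of the graph as a single N-Triples line."""
    subject, predicate, obj, is_literal = triple
    obj_text = f'"{escape_literal(obj)}"' if is_literal else f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_text} ."


lines = [to_ntriples_line(t) for t in TRIPLES]
print("\n".join(lines))

# The page is one node in the drawn graph but is written once per triple.
print(len({t[0] for t in TRIPLES}), "subject node,", len(lines), "lines")
```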

The N-Triples syntax requires that URI references be written out in full, in angle brackets, which, as the example above illustrates, can result in very long lines. For convenience, we will use a shorthand way of writing triples in the rest of this Primer, and also in other RDF specifications. In this shorthand, we can substitute a QName without angle brackets as an abbreviation of a full URI reference. We will also make extensive use in these examples of several "well-known" QName prefixes (which we will use without explicitly specifying them each time), defined as follows:

prefix rdf:, namespace URI: http://www.w3.org/1999/02/22-rdf-syntax-ns#
prefix rdfs:, namespace URI: http://www.w3.org/2000/01/rdf-schema#
prefix dc:, namespace URI: http://purl.org/dc/elements/1.1/
prefix daml:, namespace URI: http://www.daml.org/2001/03/daml+oil#
prefix ex:, namespace URI: http://www.example.org/ (or http://www.example.com/)
prefix xsd:, namespace URI: http://www.w3.org/2001/XMLSchema#

We will also use variations on the "example" prefix ex: as needed in the examples, where this will not cause confusion, for example,

prefix exterms:, namespace URI: http://www.example.org/terms/ (for terms used by our example organization),
prefix exstaff:, namespace URI: http://www.example.org/staffid/ (for our example organization's staff identifiers),
prefix ex2:, namespace URI: http://www.domain2.example.org/ (for a second example organization), and so on.

Using our new shorthand, we can write the previous set of triples as:

ex:index.html dc:creator exstaff:85740 .

ex:index.html exterms:creation-date "August 16, 1999" .

ex:index.html exterms:language "English" .
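
The shorthand is a purely mechanical substitution, which the following Python sketch illustrates. The names PREFIXES, expand_qname, and to_ntriples_line are our own and are not part of any RDF specification; we also assume that ex: is bound to http://www.example.org/ and that any object beginning with a double quote is a literal.

```python
# Prefix table copied from the lists above. The dictionary and the function
# names are illustrative assumptions, not part of any RDF specification.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://www.example.org/",
    "exterms": "http://www.example.org/terms/",
    "exstaff": "http://www.example.org/staffid/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand_qname(qname):
    """Return the full URI reference abbreviated by a prefix:local shorthand."""
    prefix, local = qname.split(":", 1)
    return PREFIXES[prefix] + local


def to_ntriples_line(subject, predicate, obj):
    """Write one shorthand triple out in full, N-Triples style.

    An object that starts with a double quote is treated as a literal
    and copied unchanged; anything else is expanded as a QName.
    """
    obj_text = obj if obj.startswith('"') else "<" + expand_qname(obj) + ">"
    return "<{}> <{}> {} .".format(
        expand_qname(subject), expand_qname(predicate), obj_text
    )
```

Applied to the first shorthand triple above, to_ntriples_line("ex:index.html", "dc:creator", "exstaff:85740") reproduces the corresponding full-length triple.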

The examples we've just given of RDF statements begin to illustrate some of the advantages of using URIrefs as RDF's basic way of identifying things. For instance, instead of identifying the creator of the Web page in our first example by the character string "John Smith", we've assigned him a URIref, in this case (using a URIref based on his employee number) http://www.example.org/staffid/85740 . An advantage of using a URIref in this case is that we can be more precise in our identification. That is, the creator of the page isn't the character string "John Smith", or any one of the thousands of people named John Smith, but the particular John Smith associated with that URIref (whoever created the URIref defines the association). Moreover, since we have a URIref for the creator of the page, it is a full-fledged resource, and we can record additional information about him, such as his name and age, as in the graph shown in Figure 3:

Figure 3: More Information about John Smith (SVG version)

These examples also illustrate that RDF uses URIrefs as predicates in RDF statements. That is, rather than using character strings (or words) such as "creator" or "name" to identify properties, RDF uses URIrefs. Using URIrefs to identify properties is important for a number of reasons. First, it allows us to distinguish the properties we use from properties someone else may use that would otherwise be identified by the same character string. For instance, in our example, example.org uses "name" to mean someone's full name written out as a character string (e.g., "John Smith"), but someone else may intend "name" to mean something different (e.g., the name of a variable in a piece of program text). A program encountering "name" as a property identifier on the Web wouldn't necessarily be able to distinguish these uses. However, if example.org writes http://www.example.org/terms/name for its "name" property, and the other person writes http://www.domain2.example.org/genealogy/terms/name for hers, we can keep straight the fact that there are distinct properties involved (even if a program cannot automatically determine the distinct meanings). Another reason why it is important to use URIrefs to identify properties is that it allows us to treat RDF properties as resources themselves. Since properties are resources, we can record descriptive information about them (e.g., the English description of what example.org means by "name"), simply by adding additional RDF statements with the property's URIref as the subject.

Using URIrefs as subjects, predicates, and objects in RDF statements allows us to begin to develop and use a shared vocabulary on the Web, reflecting (and creating) a shared understanding of the concepts we talk about. For example, in the triple

ex:index.html  dc:creator  exstaff:85740 .

the predicate dc:creator, when fully expanded as a URIref, is an unambiguous reference to the "creator" attribute in the Dublin Core metadata attribute set, a widely-used collection of attributes (properties) for describing information of all kinds. The writer of this triple is effectively saying that the relationship between the Web page (identified by http://www.example.org/index.html ) and the creator of the page (a distinct person, identified by http://www.example.org/staffid/85740 ) is exactly the concept defined by http://purl.org/dc/elements/1.1/creator . Moreover, anyone else, or any program, that understands http://purl.org/dc/elements/1.1/creator will know exactly what is meant by this relationship.

Of course, RDF's use of URIrefs doesn't solve all our problems because, for example, people can still use different URIrefs to refer to the same thing. However, the fact that these different URIrefs are used in the commonly-accessible "Web space" creates the opportunity both to identify equivalences among these different references, and to migrate toward the use of common references.

The result of all this is that RDF provides a way to make statements that applications can more easily process. Now an application can't actually "understand" such statements, of course, but it can deal with them in a way that makes it seem like it does. For example, a user could search the Web for all book reviews and create an average rating for each book. Then, the user could put that information back on the Web. Another web site could take that list of book rating averages and create a "Top Ten Highest Rated Books" page. Here, the availability and use of a shared vocabulary about ratings, and a shared group of URIrefs identifying the books they apply to, allows individuals to build a mutually-understood and increasingly-powerful (as additional contributions are made) "information base" about books on the Web. The same principle applies to the vast amounts of information that people create about thousands of subjects every day on the Web.

RDF statements are similar to a number of other formats for recording information, such as:

  • entries in a simple record or catalog listing describing the resource in a data processing system
  • rows in a simple relational database
  • simple assertions in formal logic

and information in these formats can be treated as RDF statements, allowing RDF to be used as a unifying model for integrating data from many sources. This relationship is further explored in Section 8.

2.4 Structured Property Values and Blank Nodes

Things would be very simple if the only types of information we had to record about things were obviously in the form of the simple RDF statements we've illustrated so far. However, most real-world data involves structures that are more complicated than that, at least on the surface. For instance, in our original example, we recorded the date the Web page was created as a single exterms:creation-date property, with a simple character string as its value. However, suppose we wanted to show, as the value of the exterms:creation-date property, the month, day, and year as separate pieces of information? Or, in the case of John Smith's personal information, suppose we wanted to record his address. We might write the whole address out as a character string, as in the triple

exstaff:85740  exterms:address  "1501 Grant Avenue, Bedford, Massachusetts 01730" .

However, suppose we wanted to record John's address as a structure consisting of separate street, city, state, and Zip code values? How do we do this in RDF?

We can represent such structured information in RDF by considering the aggregate thing we want to talk about (like John Smith's address) as a separate resource, and then making separate statements about that new resource. So, in the RDF graph, in order to break up John Smith's address into its component parts, we create a new node to represent the concept of John Smith's address, and assign that concept a new URIref to identify it, say http://www.example.org/addressid/85740 (which we will abbreviate as exaddressid:85740). We then write RDF statements (create additional arcs and nodes) with that node as the subject, to represent the additional information, producing the graph shown in Figure 4:

Figure 4: Breaking Up John's Address (SVG version)

or the triples:

exstaff:85740      exterms:address  exaddressid:85740 .
exaddressid:85740  exterms:street   "1501 Grant Avenue" .
exaddressid:85740  exterms:city     "Bedford" .
exaddressid:85740  exterms:state    "Massachusetts" .
exaddressid:85740  exterms:Zip      "01730" .

Using this approach allows us to represent structured information in RDF, but it can involve generating numerous "intermediate" URIrefs to represent aggregate concepts such as John's address, concepts that may never need to be referred to directly from outside a particular graph, and thus don't, strictly speaking, require "universal" identifiers. In addition, in the drawing of the graph representing the collection of statements shown in Figure 4, we don't really need the URIref we assigned to identify "John Smith's address", since we could just as easily have drawn the graph as in Figure 5:

Figure 5: Using a Blank Node (SVG version)

In Figure 5, which is a perfectly good RDF graph, we've used a node without a label to stand for the concept of "John Smith's address". This unlabeled node, or blank node, functions perfectly well in the drawing without needing a URIref, since the node itself provides the necessary connectivity between the various other parts of the graph. However, we do need some form of explicit identifier for that node if we are going to represent this graph as triples. To see this, we can try to write the triples corresponding to what is shown in the drawn graph. What we would get would be something like:

exstaff:85740  exterms:address  ??? .
???            exterms:street   "1501 Grant Avenue" .
???            exterms:city     "Bedford" .
???            exterms:state    "Massachusetts" .
???            exterms:Zip      "01730" .

where ??? stands for something that indicates the presence of the blank node. Since a complex graph might contain more than one blank node, we also need a way to differentiate between the various blank nodes in the triples representation of the graph. To do this, the triples notation uses a node identifier, having the form _:name, to indicate the presence of a blank node. For instance, in this example we might generate the node identifier _:johnaddress to refer to the blank node, in which case the resulting triples might be:

exstaff:85740  exterms:address  _:johnaddress .
_:johnaddress  exterms:street   "1501 Grant Avenue" .
_:johnaddress  exterms:city     "Bedford" .
_:johnaddress  exterms:state    "Massachusetts" .
_:johnaddress  exterms:Zip      "01730" .

In a triples representation of a graph, each distinct blank node in the graph is given a different node identifier. Unlike URIrefs and character string literals, node identifiers are not considered to be actual parts of the RDF graph (this can be seen by looking at the drawn graph in Figure 5 and noting that there is no node identifier used to label the blank node). Node identifiers only have significance within the triple representation of the graph, and only for the purpose of distinguishing one blank node from another (so that two collections of triples that differ only by re-naming their node identifiers are considered to represent identical RDF graphs). Node identifiers also have significance only within the triples representing a single graph (so that two different graphs with the same number of blank nodes might use the same node identifiers to distinguish them, and it would be unwise to assume that blank nodes from different graphs having the same node identifiers referred to the same resource). If it is expected that a node in a graph will need to be referenced from outside the graph, a URIref should be assigned to identify it.
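
The point that node identifiers matter only up to renaming can be made concrete with a short Python sketch. It is an illustration under simplifying assumptions, not an algorithm defined by RDF: triples are represented as tuples of strings in our shorthand, any term beginning with _: is taken to be a node identifier, and the function names blank_ids and same_graph are our own. The comparison tries every possible renaming, so it is only practical for graphs with a handful of blank nodes.

```python
from itertools import permutations


def blank_ids(triples):
    """Collect the node identifiers (terms starting with '_:') used in the triples."""
    return sorted(
        {term for triple in triples for term in triple if term.startswith("_:")}
    )


def same_graph(triples_a, triples_b):
    """Return True if the two triple sets differ at most by renaming node identifiers.

    Brute force: every one-to-one renaming of the identifiers is tried in turn.
    """
    a, b = set(triples_a), set(triples_b)
    ids_a, ids_b = blank_ids(a), blank_ids(b)
    if len(a) != len(b) or len(ids_a) != len(ids_b):
        return False
    for candidate in permutations(ids_b):
        renaming = dict(zip(ids_a, candidate))
        renamed = {
            tuple(renaming.get(term, term) for term in triple) for triple in a
        }
        if renamed == b:
            return True
    return False


# The same two statements written with two different node identifiers.
with_johnaddress = [
    ("exstaff:85740", "exterms:address", "_:johnaddress"),
    ("_:johnaddress", "exterms:city", '"Bedford"'),
]
with_b1 = [
    ("exstaff:85740", "exterms:address", "_:b1"),
    ("_:b1", "exterms:city", '"Bedford"'),
]
```

Here same_graph(with_johnaddress, with_b1) holds, because the two collections differ only in the identifier chosen for the blank node.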

At the beginning of this section, we noted that we can represent aggregate structures, like John Smith's address, by considering the aggregate thing we want to talk about as a separate resource, and then making separate statements about that new resource. This example illustrates an important aspect of RDF: RDF directly represents only binary relationships, e.g. the relationship between John Smith and the character string representing his address. When we try to deal with the relationship between John and the collection of separate components of this address, we are dealing with an n-ary (n-way) relationship (in this case, n=5) between John and the street, city, state, and zip components. In order to represent such structures directly in RDF (e.g., considering the address as a collection of street, city, state, and zip sub-components), we need to break this n-way relationship up into a collection of separate binary relationships. Blank nodes give us one way to do this. Each time we have an n-ary relationship, we can choose one of the participants as the subject of the relationship (John in this case), and create a blank node to represent the rest of the relationship (John's address in this case). We can then represent the remaining participants in the relationship (such as the city in our example) as separate properties of the new resource represented by the blank node.
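
The decomposition just described can be sketched in a few lines of Python. This is only an illustration: the function name structured_value and the _:b1, _:b2, ... naming scheme for node identifiers are our own choices, and triples are again written as tuples of shorthand strings.

```python
from itertools import count

# Source of fresh node identifiers; the "_:b<number>" scheme is arbitrary.
_fresh = count(1)


def structured_value(subject, prop, parts):
    """Break one n-ary relationship into binary triples through a new blank node.

    `subject` is the participant chosen as the subject of the relationship,
    `prop` links it to the blank node, and `parts` maps a property to the
    value of each remaining participant.
    """
    node = "_:b{}".format(next(_fresh))
    triples = [(subject, prop, node)]
    triples.extend((node, part_prop, value) for part_prop, value in parts.items())
    return triples


# John's address: one 5-way relationship becomes five binary statements.
address = structured_value(
    "exstaff:85740",
    "exterms:address",
    {
        "exterms:street": '"1501 Grant Avenue"',
        "exterms:city": '"Bedford"',
        "exterms:state": '"Massachusetts"',
        "exterms:Zip": '"01730"',
    },
)
```

The first triple in address links John to the new blank node, and the remaining four use that node as their subject, mirroring the triples shown earlier in this section.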

Blank nodes also give us a way to more accurately model statements about resources that may not have URIs, but that are described in terms of relationships with other resources that do have URIs. For example, when making statements about a person, say Jane Smith, it may seem natural to use that person's email address as her URI, e.g., mailto:jane@example.org. However, this approach can cause a number of problems. One obvious problem is that Jane Smith's email address may change when she changes jobs, and so it may be hard to combine information about Jane recorded at different times. Another problem is that we may want to record information about Jane's mailbox (e.g., the server it is on) as well as about Jane herself (e.g., her current address), and using a URIref for Jane based on her email address makes it difficult to know which thing we're talking about. The same problem exists when a company's Web page URL, say http://www.example.com/, is used as the URI of the company itself. Once again, we may need to record information about the Web page (e.g., who created it and when) as well as about the company, and using http://www.example.com/ as an identifier for both makes it difficult to know which thing we're talking about.

The fundamental problem is that using Jane's email address as a stand-in for Jane is an inaccurate model: Jane's email address identifies a mailbox, and Jane and her mailbox are not the same thing. When Jane herself doesn't have a URI, a blank node gives us a more accurate way of modeling this situation. We can represent Jane by a blank node, and give the blank node an exterms:emailaddress property having the URIref mailto:jane@example.org as its value. We can also assign the blank node an rdf:type property with a value of exterms:Person (we will discuss types in more detail in the following sections), an exterms:name property with a value of "Jane Smith", and any other descriptive information we might want to provide, as shown in the following triples:

_:jane  exterms:emailaddress  mailto:jane@example.org .
_:jane  rdf:type              exterms:Person .
_:jane  exterms:name          "Jane Smith" .
_:jane  exterms:empID         "23748" .
_:jane  exterms:age           "26" .

This says, accurately, that "there is a resource of type Person, whose email address is mailto:jane@example.org, whose name is Jane Smith, etc." That is, the existence of a blank node effectively says "there is a resource". Statements with that blank node as subject then provide information about the characteristics of that resource.

In practice, using blank nodes instead of URIrefs in these cases doesn't change the way we actually handle this kind of information very much. For example, if we know independently that an email address uniquely identifies someone at example.org (particularly if the address is unlikely to be reused), we can still use that fact to associate information about that person from multiple sources, even though the email address is not the person's URI. For example, if we were to find another piece of RDF on the Web that described a book, and gave the author's contact information as the email address mailto:jane@example.org, we might reasonably conclude that the author's name is Jane Smith. The point is that saying something like "the author of the book is mailto:jane@example.org" is actually a shorthand for "the author of the book is someone whose email address is mailto:jane@example.org". Using a blank node to represent this "someone" simply makes what is actually happening more explicit. (Incidentally, some RDF-based schema languages allow specifying that certain properties are unique identifiers. This is discussed further in Section 5.5.)
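
As a rough sketch of this kind of association, the following Python combines two small sets of triples through a shared email address. The second source, including the terms ex2:book1 and ex2:author, is invented purely for illustration, as are the helper function names. The sketch also builds in the assumption stated above, namely that we know independently that a mailbox identifies exactly one person.

```python
# Description of Jane, as in the triples above (abbreviated).
jane_description = [
    ("_:jane", "exterms:emailaddress", "mailto:jane@example.org"),
    ("_:jane", "rdf:type", "exterms:Person"),
    ("_:jane", "exterms:name", '"Jane Smith"'),
]

# Hypothetical second source: ex2:book1 and ex2:author are invented here.
book_description = [
    ("ex2:book1", "ex2:author", "_:a"),
    ("_:a", "exterms:emailaddress", "mailto:jane@example.org"),
]


def values(triples, subject, prop):
    """Objects of all statements with the given subject and property."""
    return [o for s, p, o in triples if s == subject and p == prop]


def subjects(triples, prop, obj):
    """Subjects of all statements with the given property and object."""
    return [s for s, p, o in triples if p == prop and o == obj]


def author_names(book, book_graph, people_graph):
    """Names of people in people_graph whose mailbox matches an author of `book`.

    Relies on the assumption that a mailbox identifies exactly one person.
    """
    names = []
    for author in values(book_graph, book, "ex2:author"):
        for mailbox in values(book_graph, author, "exterms:emailaddress"):
            for person in subjects(people_graph, "exterms:emailaddress", mailbox):
                names.extend(values(people_graph, person, "exterms:name"))
    return names
```

Note that the two blank nodes _:a and _:jane are never compared by identifier; the connection is made only through the shared value of exterms:emailaddress.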

2.5 Typed Literals

In the last section, we described how to handle situations in which we needed to take property values represented by character string literals, and break them up into structured values that identify the individual parts of those property values. Using this approach, instead of, say, recording the date a Web page was created as a single exterms:creation-date property, with a single character string literal as its value, we could represent the value as a structure consisting of the month, day, and year as separate pieces of information. However, so far, we've followed the practice of representing any constant values that serve as objects in RDF statements by character string literals, even when we probably intend for the value of the property to be a number (e.g., the value of a year or age property) or some other kind of more specialized value.

For example, earlier in Figure 3, we illustrated an RDF graph recording information about John Smith. In that graph, we recorded the value of John Smith's exterms:age property as the literal "27", as shown in Figure 6:

Figure 6: Representing John Smith's Age (SVG version)

In this case, our hypothetical organization example.org probably intends for "27" to be interpreted as a number, rather than as the string consisting of the character "2" followed by the character "7". However, an application reading that literal "27" would only know how to do that if the application was explicitly given the information that the literal "27" was intended to represent a number, and knew which number the literal "27" was supposed to represent. The common practice in programming languages or database systems is to provide this kind of information by associating a datatype with the literal, in this case, a datatype like decimal or integer. An application that understands the datatype then knows, for example, whether the literal "10" is intended to represent the number ten, the number two, or the string consisting of the character "1" followed by the character "0", depending on whether the specified datatype is integer, binary, or string. In RDF, typed literals are used to provide this kind of information.
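
The dependence of a literal's meaning on its datatype can be illustrated with a small Python sketch. The three datatype names are taken from the paragraph above and are used here as plain labels rather than as URIrefs; the table INTERPRETERS and the function interpret are our own illustrative names.

```python
# One interpretation function per datatype label. The labels "integer",
# "binary", and "string" follow the paragraph above and are not URIrefs.
INTERPRETERS = {
    "integer": lambda literal: int(literal, 10),  # decimal numeral
    "binary": lambda literal: int(literal, 2),    # base-2 numeral
    "string": lambda literal: literal,            # the characters themselves
}


def interpret(datatype, literal):
    """Return the value that `literal` represents under the given datatype."""
    return INTERPRETERS[datatype](literal)
```

With these definitions the same literal "10" yields the number ten, the number two, or the two-character string, depending only on the datatype supplied with it.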

Using a typed literal, we could describe John Smith's age as being the integer number 27, written in N-Triples as:

<http://www.example.org/staffid/85740>  <http://www.example.org/terms/age> <http://www.w3.org/2001/XMLSchema#integer"27"> .

or, using our QName simplification for writing long URIs:

exstaff:85740  exterms:age  xsd:integer"27" .

or as shown in Figure 7:

Figure 7: A Typed Literal for John Smith's Age (SVG version)

@@Above Ntriples and abbreviated syntax for typed literals is temporary until the official syntax for these is determined.@@

Similarly, in the graph shown in Figure 2 describing information about a Web page, we recorded the value of the page's exterms:creation-date property as the character string literal "August 16, 1999". However, using a typed literal, we could describe the creation date of the Web page as being the date August 16, 1999, using the triple:

ex:index.html  exterms:creation-date  xsd:date"1999-08-16" .

or as shown in Figure 8:

Figure 8: A Typed Literal for a Web Page's Creation Date (SVG version)

As these examples illustrate, an RDF typed literal is formed by explicitly pairing a URIref identifying a particular datatype (in these examples, the datatypes integer and date from XML Schema Part 2: Datatypes [XML-SCHEMA2]) with a literal that the datatype uses to represent the intended value. In each case, this results in a single node in the RDF graph with the pair as its label.

We've used XML Schema datatypes in the two examples we've just presented, and will be using XML Schema datatypes in most of our other examples as well (for one thing, XML Schema data types have URIrefs we can use to refer to them, specified in [XML-SCHEMA2]). However, unlike typical programming languages and database systems (and unlike the XML Schema language), RDF does not build in any particular collection of datatypes (not even XML Schema datatypes). Instead, RDF typed literals simply provide a way to explicitly indicate, for a given literal, what datatype should be used to interpret it. As far as RDF is concerned, you can write any pair of URIref and literal you want as a typed literal. This gives RDF the flexibility to directly represent information coming from different sources without the need to perform type conversions between these sources and a native set of RDF datatypes. (Type conversions would still be required when moving information between systems with different datatype systems, but RDF would impose no extra conversions into and out of a native set of RDF types.)

However, this flexibility comes at a price. For one thing, RDF has no way of knowing whether or not a URIref in a typed literal actually identifies a datatype. Moreover, even when a URIref does identify a datatype, RDF cannot check the validity of pairing that datatype with a particular literal. For example, you could write the triple:

exstaff:85740  exterms:age  xsd:integer"pumpkin" .

or the graph shown in Figure 9:

Figure 9: An Invalid Typed Literal for John Smith's Age (SVG version)

and RDF would not see anything wrong with this. However, proper use of typed literals clearly requires that, given a pair of datatype URIref and literal, the literal should be a legal representation of one of the datatype's legal values.

RDF datatype concepts borrow a conceptual framework from XML Schema datatypes [XML-SCHEMA2] to more precisely describe these datatype requirements. RDF's use of this framework is defined in RDF Concepts and Abstract Data Model [RDF-CONCEPTS]. The framework involves distinguishing between what might be written in RDF (or program) text as a literal to represent a value (usually a character string of some kind), and the actual value that literal is intended to represent or denote. For example, the literal "10" may be written to refer to the value ten in a decimal representation, to the value two in a binary representation, or to the string consisting of a "1" followed by a "0". Which value the literal denotes is determined by the datatype associated with the literal "10". In the case of numbers, the terms numeral and number are commonly used to distinguish between the figures that are written down (the numeral) and the value that is meant (the number). We use this distinction when we talk about the "Roman numerals" like "IV" that we sometimes see chiseled on buildings. We don't call these "Roman numbers" because the Romans were using the same numbers (the number four in this case) that we do; it's the way they wrote them down that was different.

Specifically, RDF defines a datatype to have:

  • a value space, that defines the collection of legal values that the datatype can represent.
  • a lexical space, that defines the collection of legal literals that you can write down to denote members of the value space.
  • a datatype mapping, that defines which values (things in the value space) are denoted by which literals (things in the lexical space).

Moreover, a useful datatype mapping will satisfy some other conditions:

  • Each literal (member of the datatype's lexical space) is associated with exactly one member of the datatype's value space (so that, given a datatype and a literal, there is no ambiguity in which value is meant).
  • Each member of the datatype's value space has at least one corresponding literal in the lexical space (so we can represent all the values associated with the data type).

If the datatype mapping satisfies these conditions, an RDF typed literal, since it pairs the URIref of a datatype with a literal, will unambiguously identify a specific member of a datatype mapping and thus a specific member of the value space of the datatype.

For example, using these concepts, the XML Schema datatype xsd:boolean can be described as shown in Table 1. In the datatype mapping for this datatype, each member of the value space (represented here as T and F) has two literal representations defined in the lexical space.

Table 1: A Description of datatype xsd:boolean

Value Space        {T, F}
Lexical Space      {"0", "1", "true", "false"}
Datatype Mapping   {<"true", T>, <"1", T>, <"0", F>, <"false", F>}

Given the datatype description in Table 1, Table 2 shows the RDF typed literals that can be used for datatype xsd:boolean and how the datatype mapping enables a specific value to be determined for each typed literal.

Table 2: Typed Literals for datatype xsd:boolean

Typed Literal            Member of Datatype Mapping   Member of Value Space
                         Denoted by Typed Literal     Denoted by Typed Literal
<xsd:boolean, "true">    <"true", T>                  T
<xsd:boolean, "1">       <"1", T>                     T
<xsd:boolean, "false">   <"false", F>                 F
<xsd:boolean, "0">       <"0", F>                     F
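
The description of xsd:boolean given above translates almost directly into code. The following Python sketch is our own illustration: it uses Python's True and False to stand for the values T and F, and the names VALUE_SPACE, LEXICAL_SPACE, DATATYPE_MAPPING, denoted_value, and mapping_is_useful are assumptions made for this example.

```python
# The three components of the datatype, with True/False standing for T/F.
VALUE_SPACE = {True, False}
LEXICAL_SPACE = {"0", "1", "true", "false"}
DATATYPE_MAPPING = {"true": True, "1": True, "0": False, "false": False}


def denoted_value(literal):
    """Return the member of the value space denoted by <xsd:boolean, literal>."""
    if literal not in LEXICAL_SPACE:
        raise ValueError("{!r} is not in the lexical space".format(literal))
    return DATATYPE_MAPPING[literal]


def mapping_is_useful():
    """Check the two conditions stated above for a useful datatype mapping.

    A dictionary already gives each literal exactly one value, so it remains
    to check that every literal is mapped and that every value is reachable.
    """
    every_literal_mapped = set(DATATYPE_MAPPING) == LEXICAL_SPACE
    every_value_denoted = set(DATATYPE_MAPPING.values()) == VALUE_SPACE
    return every_literal_mapped and every_value_denoted
```

Both "true" and "1" denote the same value, which shows that a value may have more than one literal even though each literal has exactly one value.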

With this background, we can see how the interpretation of the triple describing John Smith's age:

exstaff:85740  exterms:age  xsd:integer"27" .

works. The triple states that John's age is the member of the value space of the datatype xsd:integer that is represented by the literal "27". Based on the definition of xsd:integer given in [XML-SCHEMA2], it can be determined that John's age is the integer value twenty-seven.

We said earlier that RDF typed literals only provide a way to explicitly indicate the datatype that should be used to interpret a given literal, and that RDF doesn't build in any datatypes. This means that RDF specifies nothing about which datatypes exist, or what their value and lexical spaces, or datatype mappings, might be. The interpretation of a typed literal (determining the value it denotes) must be performed externally to RDF by an application that understands that datatype.

This explains why we said earlier that RDF would be unable to see anything wrong with the typed literal in the triple:

exstaff:85740  exterms:age  xsd:integer"pumpkin" .

or the graph shown in Figure 9. Although "pumpkin" is not in the lexical space of the datatype xsd:integer, determining this requires knowing which literals belong to that datatype's lexical space, and RDF itself does not have this information.
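
An application that does understand xsd:integer can perform the check that RDF itself cannot. The Python sketch below is one hypothetical way to do so; the function name integer_value is our own, and the regular expression encodes our reading of the xsd:integer lexical space in [XML-SCHEMA2] (an optional sign followed by one or more decimal digits).

```python
import re

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Lexical space of xsd:integer as we read [XML-SCHEMA2]:
# an optional sign followed by one or more decimal digits.
_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def integer_value(datatype_uri, literal):
    """Return the integer denoted by a typed literal, or raise ValueError.

    Only xsd:integer is understood by this sketch; any other datatype
    URIref is rejected rather than guessed at.
    """
    if datatype_uri != XSD_INTEGER:
        raise ValueError("datatype not understood: " + datatype_uri)
    if _INTEGER_LEXICAL.fullmatch(literal) is None:
        raise ValueError("{!r} is not in the lexical space".format(literal))
    return int(literal)
```

Given the literal "27" the function returns the integer twenty-seven, while the literal "pumpkin" is rejected with a ValueError.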

There will continue to be a great deal of RDF in the Web that does not use typed literals, since this is a relatively new facility in RDF. However, as use of RDF develops further, the use of typed literals will develop further as well.

@@Need to add datatype range declarations to the Schema section.@@

@@Need to update all subsequent Figure numbers, and generate SVG versions for the new ones added in this section.@@

@@Need a "Summary" or "Overview" section header here to set the following summary off?@@

This is all there is to basic RDF: nodes-and-arcs diagrams interpreted as statements about concepts or digital resources identified by URIrefs. However, it should be clear that, in addition to the basic techniques for representing RDF statements in diagrams (or triples), we also need a way for people to define the vocabularies they intend to use in those statements, including:

  • defining types of things (like ex:Person)
  • defining properties (like ex:age and ex:creation-date), and
  • defining the types of things (or datatypes) that can serve as the subjects or objects of statements involving those properties (like specifying that the value of an ex:age property should always be an xsd:integer).

The basis for defining such vocabularies in RDF is RDF Schema, which will be described in Section 4. Additional discussion of the basic ideas underlying the RDF data model, and its role in providing a general language for describing Web information, can be found in [RDF-CONCEPTS] and [WEBDATA].

3. An XML Syntax for RDF: RDF/XML

To summarize what we have said already, RDF models statements in terms of a graph consisting of nodes and arcs. The nodes represent resources; each node is labeled with a URIref or a character string literal, or is blank. The arcs connect the nodes and are all labeled with URIrefs. This graph is more precisely called a labeled directed graph; each arc has a direction (drawn as an arrow) connecting two nodes. Each arc can also be described as a triple consisting of a subject node (at the blunt end of the arrow), a property arc, and an object node (at the sharp end of the arrow). The property arc is interpreted as an attribute, relationship, or predicate of the resource, with a value given by the object node.

RDF defines an XML syntax for writing down and exchanging RDF graphs. This syntax is defined in the RDF/XML Syntax Specification [RDFXML]. We can illustrate the basic ideas behind the RDF/XML syntax using some of the examples we've presented already. Suppose we want to represent one of our initial statements:

http://www.example.org/index.html has a creation-date whose value is August 16, 1999

The RDF graph for this single statement, after assigning a URIref to the creation-date property, is shown in Figure 6:

with a triple representation of:

ex:index.html  exterms:creation-date  "August 16, 1999" .

Corresponding RDF/XML syntax for the graph in Figure 6 would be:

1. <?xml version="1.0"?>
2. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.             xmlns:ex="http://www.example.org/terms/">

4.   <rdf:Description rdf:about="http://www.example.org/index.html">
5.       <ex:creation-date>August 16, 1999</ex:creation-date>
6.   </rdf:Description>

7. </rdf:RDF>

(we have added line numbers to use in explaining the example).

This seems like a lot of overhead. We can understand better what is going on by considering each part of this XML in turn.

Line 1, <?xml version="1.0"?>, is the XML declaration, which indicates that the following content is XML, and what version of XML it is.

Line 2 begins an rdf:RDF element. This indicates that the following XML content (starting here and ending with the </rdf:RDF> in Line 7) is intended to represent RDF. Following the rdf:RDF on this same line is an XML namespace declaration, represented as an xmlns attribute of the rdf:RDF start-tag. This declaration specifies that all tags in this content prefixed with rdf: are part of the namespace identified by the URIref http://www.w3.org/1999/02/22-rdf-syntax-ns#. This namespace is the source for the RDF-specific terms used in RDF/XML.

Line 3 specifies another XML namespace declaration, this time for the prefix ex:. This is expressed as another xmlns attribute of the rdf:RDF element, and specifies that the namespace URIref http://www.example.org/terms/ is to be associated with the ex: prefix. This namespace is the source for the specific terms defined by our example organization, example.org. The ">" at the end of line 3 indicates the end of the rdf:RDF start-tag. Lines 1-3 are general "housekeeping" necessary to indicate that we are defining RDF/XML content, and to identify the sources of the terms we are using.

Lines 4-6 provide the RDF/XML for the specific statement we're representing. An obvious way to talk about any RDF statement is to say it's a description, and that it's about the subject of the statement (in this case, about http://www.example.org/index.html). This is exactly the way the RDF/XML represents the statement. The rdf:Description start-tag in Line 4 indicates that we're starting a description, and goes on to identify the resource the statement is about (the subject of the statement) using the rdf:about attribute to specify the URIref of the subject resource. Line 5 provides a property element, with the QName <ex:creation-date> as its tag, to hold the value August 16, 1999 of the creation-date property of the statement. It is nested within the rdf:Description element started in Line 4, indicating that this property applies to the resource specified in that element. The complete URIref of the creation-date property corresponding to the QName <ex:creation-date> would be obtained by replacing the ex: prefix with the namespace URI defined for it in Line 3. Line 6 indicates the end of this particular rdf:Description element.

Finally, Line 7 indicates the end of the rdf:RDF element started on Line 2.

This example illustrates the basic ideas used by RDF/XML to encode an RDF graph as XML elements, attributes, element content, and attribute values. The URIref labels for properties and object nodes are written as XML QNames, consisting of a short prefix denoting a namespace URI, together with a local name denoting a namespace-qualified element or attribute, as described in Section 2.2. The (namespace URIref, local name) pair is chosen such that concatenating the two parts forms the original node URIref. The URIrefs of subject nodes are stored in XML attribute values. The nodes labeled by character string literals (which can only be object nodes) become element text content or attribute values.
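
The concatenation rule can be sketched in a few lines of Python. This is only an illustration and not part of RDF/XML: the function name expand_qname and the NAMESPACES table are our own, and the bindings simply mirror the xmlns declarations used in the example.

```python
# Prefix-to-namespace bindings, mirroring the xmlns attributes of the example.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ex": "http://www.example.org/terms/",
}


def expand_qname(qname, namespaces=NAMESPACES):
    """Return the full URIref for a QName such as 'ex:creation-date'."""
    prefix, local_name = qname.split(":", 1)
    # The URIref is the namespace URIref followed directly by the local name.
    return namespaces[prefix] + local_name


print(expand_qname("ex:creation-date"))
```

An undeclared prefix raises a KeyError here, which loosely corresponds to the XML requirement that every prefix used in a document be declared.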

We could represent an RDF graph consisting of multiple statements in RDF/XML by using RDF/XML similar to Lines 4-6 in the previous example to separately represent each statement. For example, if we wanted to write the two statements:

ex:index.html  exterms:creation-date  "August 16, 1999" .
ex:index.html  exterms:language "English" .

we could write the RDF/XML as:

1.  <?xml version="1.0"?>
2.  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.              xmlns:ex="http://www.example.org/terms/">

4.    <rdf:Description rdf:about="http://www.example.org/index.html">
5.        <ex:creation-date>August 16, 1999</ex:creation-date>
6.    </rdf:Description>

7.    <rdf:Description rdf:about="http://www.example.org/index.html">
8.        <ex:language>English</ex:language>
9.    </rdf:Description>

10. </rdf:RDF>

This is the same as our initial example, with the addition of lines 7-9, a second rdf:Description element to represent the second statement. We could represent an arbitrary number of additional statements in the same way, using a separate rdf:Description element for each additional statement. As this example illustrates, once the overhead of writing the XML and namespace declarations is dealt with, writing each additional RDF statement in RDF/XML is straightforward.
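
To see how such a document maps back to statements, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It assumes the restricted form used so far (top-level rdf:Description elements with rdf:about, containing literal-valued property elements). The function name literal_triples is ours, and a real RDF/XML parser handles far more than this.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://www.example.org/terms/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <ex:creation-date>August 16, 1999</ex:creation-date>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <ex:language>English</ex:language>
  </rdf:Description>
</rdf:RDF>"""


def literal_triples(rdf_xml):
    """Return (subject, predicate, literal) tuples, one per property element."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes a qualified name as "{namespace}local";
            # removing the braces concatenates the two parts into the URIref.
            predicate = prop.tag[1:].replace("}", "", 1)
            triples.append((subject, predicate, prop.text))
    return triples


for triple in literal_triples(RDF_XML):
    print(triple)
```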

The RDF/XML syntax provides several abbreviations to make common uses easier to write. For example, it is typical for the same resource to be described with several properties and values at the same time, as in the example above. To handle this case, RDF/XML allows multiple property elements representing those properties to be nested within the rdf:Description element that identifies the subject resource. For example, if we wanted to represent our previous collection of statements about http://www.example.org/index.html:

ex:index.html  dc:creator  exstaff:85740 .
ex:index.html  exterms:creation-date  "August 16, 1999" .
ex:index.html  exterms:language "English" .

whose graph (the same as Figure 2) is shown in Figure 7:

Figure 7: Several Statements About the Same Resource

the RDF/XML syntax for the graph shown in Figure 7 would be:

1.  <?xml version="1.0"?>
2.  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.              xmlns:dc="http://purl.org/dc/elements/1.1/"
4.              xmlns:ex="http://www.example.org/terms/">

5.    <rdf:Description rdf:about="http://www.example.org/index.html">
6.         <ex:creation-date>August 16, 1999</ex:creation-date>
7.         <ex:language>English</ex:language>
8.         <dc:creator rdf:resource="http://www.example.org/staffid/85740"/>
9.    </rdf:Description>

10. </rdf:RDF>

(we have added line numbers again to use in explaining the example).

Compared with the previous two examples, we've added an additional namespace declaration (in Line 3), and an additional property element (in Line 8). In addition, we've nested the three property elements whose subject is http://www.example.org/index.html within the rdf:Description element identifying that subject, rather than writing a separate rdf:Description element for each statement.

Line 8 also introduces a new form of property element. The ex:language element in Line 7 is similar to the ex:creation-date element we defined in the first example. Both these elements represent properties with character strings as property values, and such elements are specified by enclosing the character string within start- and end-tags corresponding to the property name. However, the dc:creator element on Line 8 represents a property whose value is another resource, rather than a character string. If we had written this element in the same way as the others, we would be saying that the value of the dc:creator element was the character string http://www.example.org/staffid/85740, rather than the resource identified by that string interpreted as a URIref. Hence, in order to indicate the difference, we've represented the property by what XML calls an empty element (it has no separate end tag), and defined the property value using an rdf:resource attribute within that empty element. The rdf:resource attribute indicates that its value is another resource, identified by its URIref. Because the URIref is being used as an attribute value, we cannot abbreviate it as a QName, as we've done in writing element and attribute names (this is due to the need to conform to XML syntax). Instead, we must write it out as a full URIref. This element tag also uses a different namespace prefix, the new namespace prefix dc: we defined in Line 3.
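
The distinction between the two forms of property element can be made concrete with a small sketch. It uses Python's standard xml.etree.ElementTree module; the function name describe_objects and the "resource"/"literal" labels are our own choices for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://www.example.org/terms/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <ex:creation-date>August 16, 1999</ex:creation-date>
    <ex:language>English</ex:language>
    <dc:creator rdf:resource="http://www.example.org/staffid/85740"/>
  </rdf:Description>
</rdf:RDF>"""


def describe_objects(rdf_xml):
    """Map each property's local name to ('resource', URIref) or ('literal', text)."""
    result = {}
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        for prop in description:
            local_name = prop.tag.split("}", 1)[1]
            resource = prop.get(f"{{{RDF_NS}}}resource")
            if resource is not None:
                # An rdf:resource attribute means the value is another resource.
                result[local_name] = ("resource", resource)
            else:
                # Otherwise the element content is a character string literal.
                result[local_name] = ("literal", prop.text)
    return result


for name, value in describe_objects(RDF_XML).items():
    print(name, value)
```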

It is important to understand that the RDF/XML above is an abbreviation. The RDF/XML below, in which the three statements are written with separate rdf:Description elements, describes exactly the same RDF graph:

 <?xml version="1.0"?>
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:ex="http://www.example.org/terms/">

   <rdf:Description rdf:about="http://www.example.org/index.html">
       <ex:creation-date>August 16, 1999</ex:creation-date>
   </rdf:Description>

   <rdf:Description rdf:about="http://www.example.org/index.html">
       <ex:language>English</ex:language>
   </rdf:Description>

   <rdf:Description rdf:about="http://www.example.org/index.html">
       <dc:creator rdf:resource="http://www.example.org/staffid/85740"/>
   </rdf:Description>

 </rdf:RDF>

We will describe some further RDF/XML abbreviations in the following sections. However, the general approach we have illustrated so far is referred to as the RDF/XML basic serialization syntax [RDF-MS]. In this approach, an RDF graph is written in RDF/XML as follows:

  • All blank nodes are assigned arbitrary URIs.
  • Each resource is listed in turn as the subject of a top-level rdf:Description element, using an rdf:about attribute.
    For each triple, with this resource as subject, an appropriate property element is created, with either literal content (possibly empty) or an rdf:resource attribute specifying the object of the triple.
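
The second step above can be sketched in Python as follows. This is an illustration under several assumptions of our own: the names URIRef, split_uri and serialize_basic are hypothetical, property URIrefs are split after the last "#" or "/", and blank nodes are not handled at all.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class URIRef(str):
    """Marks an object that is a resource rather than a literal."""


def split_uri(uri):
    # Simplification: split after the last '#' or '/'.
    cut = max(uri.rfind("#"), uri.rfind("/")) + 1
    return uri[:cut], uri[cut:]


def serialize_basic(triples):
    """Write one top-level rdf:Description per subject, one property element per triple."""
    ET.register_namespace("rdf", RDF_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    descriptions = {}
    for subject, predicate, obj in triples:
        if subject not in descriptions:
            descriptions[subject] = ET.SubElement(
                root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
            )
        namespace, local_name = split_uri(predicate)
        prop = ET.SubElement(descriptions[subject], f"{{{namespace}}}{local_name}")
        if isinstance(obj, URIRef):
            prop.set(f"{{{RDF_NS}}}resource", str(obj))
        else:
            prop.text = obj
    # Other namespaces receive generated prefixes; prefixes carry no meaning.
    return ET.tostring(root, encoding="unicode")


print(serialize_basic([
    ("http://www.example.org/index.html",
     "http://www.example.org/terms/language", "English"),
    ("http://www.example.org/index.html",
     "http://purl.org/dc/elements/1.1/creator",
     URIRef("http://www.example.org/staffid/85740")),
]))
```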

@@The first bullet above may need to change due to the introduction of NodeIDs. Need to decide whether to introduce them here or in a later section.@@

The basic serialization syntax is particularly recommended for applications in which the output RDF/XML is to be used in further RDF processing, because it most directly represents the RDF graph.

Finally, the typed literals we described in Section 2.5 may be used as property values instead of the character string literals we have used in the examples so far. A typed literal is represented in RDF/XML by adding an rdf:datatype attribute specifying a datatype URIref to the property element containing the literal.

For example, to change the statement shown in Figure 6 to use a typed literal instead of a character literal for the creation-date property, the triple representation would be:

ex:index.html  exterms:creation-date  xsd:date"1999-08-16" .

and the corresponding RDF/XML syntax would be:

1. <?xml version="1.0"?>
2. <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.             xmlns:ex="http://www.example.org/terms/">

4.   <rdf:Description rdf:about="http://www.example.org/index.html">
5.      <ex:creation-date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1999-08-16</ex:creation-date>
6.   </rdf:Description>

7. </rdf:RDF>

In Line 5, a typed literal is given as the value of the ex:creation-date property element by adding an rdf:datatype attribute to the element's start-tag to specify the datatype. The value of this attribute is the URI of the datatype, in this case, the URIref of the XML Schema date datatype. Since this is an attribute value, the full URIref must be written out, rather than using the QName abbreviation xsd:date that we used in the triple. A literal appropriate to this datatype is then written as the element content, in this case, 1999-08-16, which is the literal representation for August 16, 1999 in the XML Schema date datatype.
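
As a rough illustration of how an application might use the datatype, the sketch below reads the rdf:datatype attribute and converts the literal to a Python date. The function name literal_value is ours, and only the XML Schema date datatype in its plain YYYY-MM-DD form is handled.

```python
import datetime
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://www.example.org/terms/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <ex:creation-date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1999-08-16</ex:creation-date>
  </rdf:Description>
</rdf:RDF>"""


def literal_value(prop):
    """Convert a property element's literal according to its rdf:datatype, if any."""
    datatype = prop.get(f"{{{RDF_NS}}}datatype")
    if datatype == XSD_DATE:
        # Handles only the plain YYYY-MM-DD form; xsd:date also allows
        # a time zone suffix, which this sketch does not parse.
        return datetime.date.fromisoformat(prop.text)
    # No recognized datatype: keep the character string as it is.
    return prop.text


# First property element of the first rdf:Description element.
prop = ET.fromstring(RDF_XML)[0][0]
print(literal_value(prop))
```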

@@Could mention using XML entities as in the Datatype draft to shorten these long datatype URIrefs, but this might introduce yet another complication.@@

3.1. Defining New RDF Resources

So far, we've been describing resources that we imagine have been defined (and given URIrefs) already. For instance, in our initial examples, we've been providing descriptive information about example.org's web page, whose URIref was http://www.example.org/index.html. We referred to this resource (defined elsewhere) using an rdf:about attribute. However, obviously we also want to be able to introduce new resources. For example, suppose a company, example.com, wanted to provide an RDF-based catalog of its products as an RDF/XML document, identified by (and located at) http://www.example.com/2002/04/products. Within that resource, each product might be given a separate RDF description. This catalog, along with one of these descriptions (the catalog entry for a model of tent called the "Overnighter") might be written:

1.   <?xml version="1.0"?>
2.   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.               xmlns:ex="http://www.example.com/terms/">

4.     <rdf:Description rdf:ID="10245">
5.          <ex:model>Overnighter</ex:model>
6.          <ex:sleeps>2</ex:sleeps>
7.          <ex:weight>2.4</ex:weight>
8.          <ex:packedSize>14x56</ex:packedSize>
9.     </rdf:Description>

  ...other product descriptions...

10.  </rdf:RDF>

(We've included the surrounding XML, RDF, and namespace information in lines 1 through 3, and line 10, but this information would only need to be defined once for the whole catalog, not repeated for each entry in the catalog.)

This is similar to our previous examples in the way it represents the properties (model, sleeping capacity, weight, packed size) of the resource (the tent) being described. However, in line 4, the rdf:Description element has an rdf:ID attribute instead of an rdf:about attribute. Using rdf:ID indicates that we are using a fragment identifier, given by the value of the rdf:ID attribute ("10245" in this case, which might be the catalog number used by example.com), as a shorthand for the complete URIref of the resource we want to describe. This fragment identifier 10245 will be interpreted relative to a base URI, in this case, the URI of the containing catalog. The full URIref for the tent is formed by taking the base URI (of the catalog), and appending #10245 to it, giving the URIref http://www.example.com/2002/04/products#10245.

The rdf:ID attribute is somewhat similar to the ID attribute in XML and HTML, in that it defines a label which can be used to refer to this resource. This label must be unique within the resource (in this case, the catalog) in which it is defined. Any other RDF within this catalog could refer to this resource (this particular catalog entry) by using the relative URIref #10245 in an rdf:about attribute. This would be understood to refer to another resource defined within the catalog. We could also have introduced the URIref of the catalog entry itself by specifying rdf:about="#10245" instead of rdf:ID="10245" (i.e., by specifying the relative URIref directly). The full URIref formed by RDF is the same in either case: http://www.example.com/2002/04/products#10245.
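
The equivalence of the two forms can be checked with Python's standard urllib.parse.urljoin, which resolves a relative URI reference against a base URI. The helper names resolve_id and resolve_about are ours, introduced only for this sketch.

```python
from urllib.parse import urljoin

CATALOG_URI = "http://www.example.com/2002/04/products"


def resolve_id(base_uri, rdf_id):
    """URIref for rdf:ID="...": the fragment '#' + ID resolved against the base URI."""
    return urljoin(base_uri, "#" + rdf_id)


def resolve_about(base_uri, about):
    """URIref for rdf:about="...": the (possibly relative) value resolved against the base URI."""
    return urljoin(base_uri, about)


print(resolve_id(CATALOG_URI, "10245"))
print(resolve_about(CATALOG_URI, "#10245"))
```

Both calls print the same URIref, http://www.example.com/2002/04/products#10245.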

RDF located outside the catalog could refer to this catalog entry by using the full URIref, i.e., by concatenating the relative URIref #10245 of the catalog entry to the base URI of the catalog, forming the absolute URIref http://www.example.com/2002/04/products#10245. For example, an outdoor sports web site exampleRatings.com might use RDF to provide ratings of various tents. The (5-star) rating given to the tent we described earlier might then be represented on exampleRatings.com's web site as:

1.  <?xml version="1.0"?>
2.  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.              xmlns:sportex="http://www.exampleRatings.com/terms/">

4.    <rdf:Description rdf:about="http://www.example.com/2002/04/products#10245">
5.         <sportex:ratingBy>Richard Roe</sportex:ratingBy>
6.         <sportex:numberStars>5</sportex:numberStars>
7.    </rdf:Description>
8.  </rdf:RDF>

In this example, line 4 uses an rdf:Description element with an rdf:about attribute whose value is the full URIref of the tent's catalog entry, defined by the earlier RDF description. The use of this URIref allows the tent being referred to in the rating to be precisely identified.

This example not only shows how new resources can be defined in RDF/XML; it also illustrates one of the basic architectural principles of the Web, which is that anyone should be able to say anything they want about existing resources [BERNERS-LEE98]. The example also illustrates the fact that the RDF describing a particular resource does not need to be located all in one place; instead, it may be distributed throughout the web. This is true not only for examples like this one, in which one organization is rating or commenting on resources defined by another, but also for situations in which the original creator of a resource (or anyone else) wishes to amplify the description of that resource by providing additional information about it. This may be done either by modifying the original document in which the resource was defined, to add the properties and values needed to describe the additional information, or, as this example illustrates, by creating a separate document, and providing the additional properties and values in an rdf:Description element that refers to the original resource using rdf:about.

The previous example indicated that fragment identifiers such as #10245 will be interpreted relative to a base URI. By default, this base URI would be the URI of the resource in which the fragment is used. However, in some cases it is desirable to be able to explicitly specify this base URI. For instance, suppose that in addition to the catalog located at http://www.example.com/2002/04/products, example.com wanted to provide a duplicate catalog on a mirror site, say at http://mirror.example.com/2002/04/products. This could create a problem, since if the catalog was retrieved from the mirror site, the URIref generated for our example tent would be http://mirror.example.com/2002/04/products#10245, rather than http://www.example.com/2002/04/products#10245, and hence apparently a different tent. To deal with this problem, RDF/XML supports XML Base [XML-BASE], which allows an XML document to specify a base URI other than the base URI of the document. In this case, we would define the catalog as:

1.   <?xml version="1.0"?>
2.   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.               xmlns:ex="http://www.example.com/terms/"
4.               xml:base="http://www.example.com/2002/04/products">

5.     <rdf:Description rdf:ID="10245">
6.          <ex:model>Overnighter</ex:model>
7.          <ex:sleeps>2</ex:sleeps>
8.          <ex:weight>2.4</ex:weight>
9.          <ex:packedSize>14x56</ex:packedSize>
10.    </rdf:Description>

  ...other product descriptions...

11.  </rdf:RDF>

The xml:base declaration in line 4 specifies that the base URI for the content within the rdf:RDF element (until another xml:base attribute is specified) is http://www.example.com/2002/04/products, and all relative URIrefs cited within that content will be interpreted relative to that base, no matter where the actual content is located. As a result, the relative URIref of our tent, #10245, will generate the same absolute URIref, http://www.example.com/2002/04/products#10245, no matter where the catalog is located.
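
The effect of xml:base on the mirror-site problem can be sketched as follows. The helper names catalog and subject_uris are ours, the catalog is cut down to a single property, and the sketch only looks for xml:base on the root element (XML Base also allows it on nested elements).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def catalog(base_attribute=""):
    """Build a cut-down catalog, optionally with an xml:base attribute."""
    return f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF_NS}"
         xmlns:ex="http://www.example.com/terms/"
         {base_attribute}>
  <rdf:Description rdf:ID="10245">
    <ex:model>Overnighter</ex:model>
  </rdf:Description>
</rdf:RDF>"""


def subject_uris(rdf_xml, retrieval_uri):
    """Resolve every rdf:ID against xml:base if present, else the retrieval URI."""
    root = ET.fromstring(rdf_xml)
    # Simplification: only an xml:base on the root element is considered.
    base = root.get(f"{{{XML_NS}}}base", retrieval_uri)
    return [
        urljoin(base, "#" + node.get(f"{{{RDF_NS}}}ID"))
        for node in root
        if node.get(f"{{{RDF_NS}}}ID") is not None
    ]


MIRROR = "http://mirror.example.com/2002/04/products"
WITH_BASE = catalog('xml:base="http://www.example.com/2002/04/products"')

print(subject_uris(catalog(), MIRROR))   # URIref depends on the mirror location
print(subject_uris(WITH_BASE, MIRROR))   # URIref fixed by xml:base
```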

So far, we've been talking about a single product description, a particular model of tent, from example.com's catalog. However, example.com will probably offer several different models of tents, as well as multiple instances of other categories of products, such as backpacks, hiking boots, and so on. This idea of instances of things that can be classified into different kinds or categories is similar to the programming language concept of objects having different types or classes. RDF supports this concept by providing a predefined property, rdf:type. When an RDF resource is defined as having an rdf:type property, the value of that property is considered to be a resource that defines a category or class of things, and the original resource is considered to be an instance of that category or class. Using rdf:type, example.com might indicate that our product description is that of a tent as follows:

  <?xml version="1.0"?>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
              xmlns:ex="http://www.example.com/terms/"
              xml:base="http://www.example.com/2002/04/products">

    <rdf:Description rdf:ID="10245">
         <rdf:type rdf:resource="http://www.example.com/terms/Tent" />
         <ex:model>Overnighter</ex:model>
         <ex:sleeps>2</ex:sleeps>
         <ex:weight>2.4</ex:weight>
         <ex:packedSize>14x56</ex:packedSize>
    </rdf:Description>

  ...other product descriptions...

  </rdf:RDF>

Note the use of the rdf:type property to indicate that the instance belongs to class Tent. In this case, we imagine that example.com has defined its classes as part of the same vocabulary that it uses to describe its other terms (such as the property ex:weight), so we use the absolute URIref of the class to refer to it. If example.com had defined these classes in the product catalog itself, we could have used the relative URIref #Tent to refer to it.

RDF itself does not define a vocabulary for defining application-specific classes of things, like Tent in this example. Instead, such classes would be defined in an RDF Schema. The RDF Schema vocabulary is described in Section 5. Other vocabularies for defining classes can also be defined, such as the DAML+OIL language described in Section 5.5. In addition, RDF defines several pre-defined types of its own for various purposes. These will be described in Section 4.

Since defining resources as instances of specific types is fairly common, the RDF/XML syntax provides a special abbreviation for instances defined as members of classes using the rdf:type property. In this abbreviation, the rdf:type property and value are removed, and the rdf:Description element name is replaced by the class name. Using this abbreviation, example.com's tent from the example above could also be defined as:

  <?xml version="1.0"?>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
              xmlns:ex="http://www.example.com/terms/"
              xml:base="http://www.example.com/2002/04/products">

    <ex:Tent rdf:ID="10245">
         <ex:model>Overnighter</ex:model>
         <ex:sleeps>2</ex:sleeps>
         <ex:weight>2.4</ex:weight>
         <ex:packedSize>14x56</ex:packedSize>
    </ex:Tent>

  ...other product descriptions...

  </rdf:RDF>

Both this abbreviation and the previous description of the tent (using the full <rdf:Description rdf:ID="10245"> element) illustrate that RDF statements can be written in RDF/XML in a way that closely resembles the descriptions that might have been written directly in XML. This is an important consideration, given the increasing use of XML in all kinds of applications, since it suggests that RDF could be used in these applications without major changes in information structure being required, and that much deployed XML can be interpreted as RDF statements.
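
That the typed-node abbreviation and the explicit rdf:type property say the same thing can be checked with a short sketch. To keep it small, both sample documents identify the tent with rdf:about and its full URIref rather than rdf:ID, and carry only one other property; the function name type_statements is ours.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ABOUT = f"{{{RDF_NS}}}about"
RESOURCE = f"{{{RDF_NS}}}resource"
DESCRIPTION = f"{{{RDF_NS}}}Description"
TYPE = f"{{{RDF_NS}}}type"

HEADER = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://www.example.com/terms/">"""

FULL = HEADER + """
  <rdf:Description rdf:about="http://www.example.com/2002/04/products#10245">
    <rdf:type rdf:resource="http://www.example.com/terms/Tent"/>
    <ex:model>Overnighter</ex:model>
  </rdf:Description>
</rdf:RDF>"""

ABBREVIATED = HEADER + """
  <ex:Tent rdf:about="http://www.example.com/2002/04/products#10245">
    <ex:model>Overnighter</ex:model>
  </ex:Tent>
</rdf:RDF>"""


def type_statements(rdf_xml):
    """Return the set of (subject, class URIref) pairs stated by the document."""
    statements = set()
    for node in ET.fromstring(rdf_xml):
        subject = node.get(ABOUT)
        if node.tag != DESCRIPTION:
            # Typed node: the element name itself names the class.
            statements.add((subject, node.tag[1:].replace("}", "", 1)))
        for type_prop in node.findall(TYPE):
            statements.add((subject, type_prop.get(RESOURCE)))
    return statements


print(type_statements(FULL) == type_statements(ABBREVIATED))
```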

@@Talk about using frag ids in this example vs. not using them as in John's employee number? Comment about using URIs with # at the end for namespace ids (from Syntax doc or CC/PP?)@@

Finally, RDF/XML allows the definition of new resources that have no URIs, i.e., blank nodes. For example, Figure 8 (from [RDF-XML]) shows a graph saying "the document 'http://www.w3.org/TR/rdf-syntax-grammar' has a title 'RDF/XML Syntax Specification (Revised)' and has an editor, the editor has a name 'Dave Beckett' and a home page 'http://purl.org/net/dajobe/' ".

Figure 8: Graph for Another RDF/XML Example

This illustrates an idea we discussed at the end of Section 2: the use of a blank node to represent something that does not have a URI, but can be described in terms of other information. In this case, the blank node represents a person, the editor of the document, and the person is described by his name and home page. Some RDF/XML corresponding to Figure 8 is:

1.  <?xml version="1.0"?>
2.  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
3.              xmlns:dc="http://purl.org/dc/elements/1.1/"
4.              xmlns:ex="http://example.org/stuff/1.0/">

5.     <rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">
6.       <dc:title>RDF/XML Syntax Specification (Revised)</dc:title>
7.       <ex:editor rdf:parseType="Resource">
8.         <ex:fullName>Dave Beckett</ex:fullName>
9.         <ex:homePage rdf:resource="http://purl.org/net/dajobe/" />
10.      </ex:editor>
11.    </rdf:Description>

12. </rdf:RDF>

Much of this XML is similar to what we have seen before. What is new is in lines 7-10, which specify the blank node, and its properties and their values. Line 7 begins the element describing the ex:editor property of the containing rdf:Description element (in Line 5). The start tag of this ex:editor element contains an attribute rdf:parseType="Resource". This indicates that the contents of the element are to be considered as if they were inside a new rdf:Description element that defines a new, unnamed resource. This new resource is the value of the ex:editor property, corresponding to the blank node in the graph. Within the ex:editor start and end tags (on lines 7 and 10), lines 8 and 9 define the ex:fullName and ex:homePage properties of this new resource, respectively. The end tag </ex:editor> on line 10 indicates the end of the information provided about this new resource.

The ability to use rdf:parseType="Resource" inside elements in this way makes it relatively easy to write RDF/XML to represent RDF graphs that involve intermediate blank nodes at various points.
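
A sketch of how a processor might turn rdf:parseType="Resource" into a blank node is shown below. The function name triples_from and the blank node labels of the form _:b1 are our own conventions. For brevity, literal and URIref objects are both returned as plain strings, so the distinction between them is lost.

```python
import itertools
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ABOUT = f"{{{RDF_NS}}}about"
RESOURCE = f"{{{RDF_NS}}}resource"
PARSE_TYPE = f"{{{RDF_NS}}}parseType"
DESCRIPTION = f"{{{RDF_NS}}}Description"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.org/stuff/1.0/">
  <rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">
    <dc:title>RDF/XML Syntax Specification (Revised)</dc:title>
    <ex:editor rdf:parseType="Resource">
      <ex:fullName>Dave Beckett</ex:fullName>
      <ex:homePage rdf:resource="http://purl.org/net/dajobe/" />
    </ex:editor>
  </rdf:Description>
</rdf:RDF>"""


def triples_from(rdf_xml):
    """Return triples, creating a fresh blank node for each parseType="Resource"."""
    counter = itertools.count(1)
    triples = []

    def properties(subject, element):
        for prop in element:
            predicate = prop.tag[1:].replace("}", "", 1)
            if prop.get(PARSE_TYPE) == "Resource":
                # The element content describes a new, unnamed resource.
                blank = f"_:b{next(counter)}"
                triples.append((subject, predicate, blank))
                properties(blank, prop)
            elif prop.get(RESOURCE) is not None:
                triples.append((subject, predicate, prop.get(RESOURCE)))
            else:
                triples.append((subject, predicate, prop.text))

    for description in ET.fromstring(rdf_xml).findall(DESCRIPTION):
        properties(description.get(ABOUT), description)
    return triples


for triple in triples_from(RDF_XML):
    print(triple)
```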

@@parsetype=Resource can be an abbreviation too, right? Have a Description without an rdf:about, as in ex:editor followed by rdf:Description and then a nested ex:fullname.@@

3.2. Additional RDF/XML Abbreviations

We've already described a number of abbreviations that RDF/XML provides to allow graphs to be represented more compactly. For example, we showed that multiple property elements that describe the same resource can be nested within the same rdf:Description element that identifies the resource. We also showed that the name of an rdf:Description element can be replaced by the class name of the resource. In this section, we will briefly describe some additional RDF/XML abbreviations.

To start with, consider our tent example from Section 3.1:

  <?xml version="1.0"?>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
              xmlns:ex="http://www.example.com/terms/"
              xml:base="http://www.example.com/2002/04/products">

    <ex:Tent rdf:ID="10245">
         <ex:model>Overnighter</ex:model>
         <ex:sleeps>2</ex:sleeps>
         <ex:weight>2.4</ex:weight>
         <ex:packedSize>14x56</ex:packedSize>
    </ex:Tent>

  </rdf:RDF>

One of the abbreviations allowed by RDF/XML is that when properties are not repeated within an rdf:Description element, and the values of those properties are literals, the properties can be written as XML attributes of the rdf:Description element (this can't be done when properties are repeated because XML does not allow the same attribute to appear more than once within the same element). Using this abbreviation, we can convert the elements in this example to attributes, and write the description as:

  <?xml version="1.0"?>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
              xmlns:ex="http://www.example.com/terms/"
              xml:base="http://www.example.com/2002/04/products">

    <ex:Tent rdf:ID="10245"
         ex:model="Overnighter"
         ex:sleeps="2"
         ex:weight="2.4"
         ex:packedSize="14x56"/>

  </rdf:RDF>
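
As an informal check that the attribute form carries the same properties as the element form, the sketch below collects the literal-valued properties of the tent from both versions and compares them. The function name literal_properties is ours; storing the result in a dictionary reflects the restriction that no property is repeated.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"

HEADER = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://www.example.com/terms/"
         xml:base="http://www.example.com/2002/04/products">"""

ELEMENT_FORM = HEADER + """
  <ex:Tent rdf:ID="10245">
    <ex:model>Overnighter</ex:model>
    <ex:sleeps>2</ex:sleeps>
    <ex:weight>2.4</ex:weight>
    <ex:packedSize>14x56</ex:packedSize>
  </ex:Tent>
</rdf:RDF>"""

ATTRIBUTE_FORM = HEADER + """
  <ex:Tent rdf:ID="10245"
           ex:model="Overnighter"
           ex:sleeps="2"
           ex:weight="2.4"
           ex:packedSize="14x56"/>
</rdf:RDF>"""


def literal_properties(node):
    """Collect literal properties written as attributes or as child elements."""
    syntax_namespaces = ("{" + RDF_NS + "}", "{" + XML_NS + "}")
    properties = {}
    for name, value in node.attrib.items():
        # rdf: and xml: attributes are syntax, not properties of the resource.
        if name.startswith("{") and not name.startswith(syntax_namespaces):
            properties[name[1:].replace("}", "", 1)] = value
    for child in node:
        properties[child.tag[1:].replace("}", "", 1)] = child.text
    return properties


element_props = literal_properties(ET.fromstring(ELEMENT_FORM)[0])
attribute_props = literal_properties(ET.fromstring(ATTRIBUTE_FORM)[0])
print(element_props == attribute_props)
```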

Another abbreviation is that of nested rdf:Description elements. Suppose we want to say that John Smith created our example Web page from the beginning of Section 2, and also provide some information about John Smith himself. We might do this with the following RDF/XML:

 <?xml version="1.0"?>
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:ex="http://www.example.org/terms/">

   <rdf:Description rdf:about="http://www.example.org/index.html">
       <dc:creator rdf:resource="http://www.example.org/staffid/85740"/>
   </rdf:Description>

   <rdf:Description rdf:about="http://www.example.org/staffid/85740">
       <ex:name>John Smith</ex:name>
       <ex:age>36</ex:age>
   </rdf:Description>

 </rdf:RDF>

This form makes it clear that two separate resources are being described, but it is less clear that the second resource is the one referenced by the first one. The same information could be expressed by nesting the second description inside the dc:creator element of the first one, as in the following RDF/XML:

 <?xml version="1.0"?>
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:ex="http://www.example.org/terms/">

   <rdf:Description rdf:about="http://www.example.org/index.html">
       <dc:creator> 
          <rdf:Description rdf:about="http://www.example.org/staffid/85740">
             <ex:name>John Smith</ex:name>
             <ex:age>36</ex:age>
          </rdf:Description>
       </dc:creator>
   </rdf:Description>

 </rdf:RDF>

Notice that because we're not just citing a URIref as the value of dc:creator, but instead providing a complete rdf:Description for the resource, we nest the description between dc:creator start- and end-tags.

Yet another abbreviation works on these nested rdf:Description elements, or their equivalents. When the object of a statement is another resource (e.g., the nested description in the example above), and the values of any properties given in-line for that resource are literals, we can write the nested properties as additional XML attributes of the outer property element. Applying this abbreviation to the example above gives the following RDF/XML:

 <?xml version="1.0"?>
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:ex="http://www.example.org/terms/">

   <rdf:Description rdf:about="http://www.example.org/index.html">
       <dc:creator rdf:resource="http://www.example.org/staffid/85740"
             ex:name="John Smith"
             ex:age="36" />
   </rdf:Description>

 </rdf:RDF>

Once again, recall that we are describing abbreviations. For example, the two examples above describe exactly the same RDF graph. Some of these abbreviations may be helpful in making RDF/XML easier for people to read, or in enabling RDF/XML to more closely resemble certain forms of more conventional XML.
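
The claim that the nested form and the attribute form describe the same graph can be tested by extracting the triples from each and comparing them as sets. The sketch below handles only the constructs used in these two examples; the names uri and triples_from are ours.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ABOUT = f"{{{RDF_NS}}}about"
RESOURCE = f"{{{RDF_NS}}}resource"
DESCRIPTION = f"{{{RDF_NS}}}Description"

HEADER = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://www.example.org/terms/">
  <rdf:Description rdf:about="http://www.example.org/index.html">"""
FOOTER = """
  </rdf:Description>
</rdf:RDF>"""

NESTED = HEADER + """
    <dc:creator>
      <rdf:Description rdf:about="http://www.example.org/staffid/85740">
        <ex:name>John Smith</ex:name>
        <ex:age>36</ex:age>
      </rdf:Description>
    </dc:creator>""" + FOOTER

ATTRIBUTES = HEADER + """
    <dc:creator rdf:resource="http://www.example.org/staffid/85740"
                ex:name="John Smith"
                ex:age="36"/>""" + FOOTER


def uri(name):
    """Turn ElementTree's "{namespace}local" form into a URIref."""
    return name[1:].replace("}", "", 1)


def triples_from(rdf_xml):
    triples = set()

    def describe(node):
        subject = node.get(ABOUT)
        for prop in node:
            nested = prop.find(DESCRIPTION)
            if nested is not None:
                # Nested description: its subject is the object of this property.
                triples.add((subject, uri(prop.tag), describe(nested)))
            elif prop.get(RESOURCE) is not None:
                obj = prop.get(RESOURCE)
                triples.add((subject, uri(prop.tag), obj))
                # Non-rdf attributes are literal properties of the object resource.
                for name, value in prop.attrib.items():
                    if not name.startswith("{" + RDF_NS + "}"):
                        triples.add((obj, uri(name), value))
            else:
                triples.add((subject, uri(prop.tag), prop.text))
        return subject

    for node in ET.fromstring(rdf_xml).findall(DESCRIPTION):
        describe(node)
    return triples


print(triples_from(NESTED) == triples_from(ATTRIBUTES))
```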

3.3. RDF/XML Summary

The examples above have illustrated some of the basic ideas behind the RDF/XML syntax. For a discussion of the basic principles behind the modeling of RDF statements in XML (known as striping), and other details about writing RDF in XML, refer to the RDF/XML Syntax Specification [RDF-XML].

[RDF-XML] notes a number of caveats about this syntax. First, not all graphs that can be expressed in the RDF Model Theory [RDF-MODEL] can be represented in RDF/XML. For example, it is not possible to use the RDF/XML serialization for serializing an RDF graph in which any triple has a property label which cannot be expressed as an XML namespace-qualified name (QName). Moreover, a round trip from RDF/XML to an RDF graph and back to RDF/XML preserves the meaning (the graph is the same), but the RDF/XML that comes out may not be exactly the same as the RDF/XML that went in.
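
The QName restriction can be illustrated with a small check. A property URIref can be written as an element name only if it can be split into a namespace part and a local name that is a legal XML name. The sketch below uses an ASCII-only approximation of that rule; the function name split_property_uri and the second test URIref are hypothetical.

```python
import re

# ASCII-only approximation of an XML NCName: a letter or underscore followed
# by letters, digits, '.', '-' or '_'. The real XML rule also admits many
# non-ASCII characters.
NCNAME_TO_END = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def split_property_uri(uri):
    """Return (namespace, local_name), or None if no legal local name exists."""
    for cut in range(1, len(uri)):
        # Try each split point, longest candidate local name first.
        if NCNAME_TO_END.match(uri, cut):
            return uri[:cut], uri[cut:]
    return None


print(split_property_uri("http://www.example.org/terms/creation-date"))
# A hypothetical property URIref ending in digits cannot be split, because a
# local name may not begin with a digit.
print(split_property_uri("http://www.example.org/terms/2002"))
```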

Second, we noted above that the RDF/XML basic serialization syntax is recommended for applications in which the output RDF/XML is to be used in further RDF processing. This basic serialization does not conform to some more restricted sub-dialects of RDF, such as [RSS] or [CC/PP]. As a result, it is not appropriate for such applications, for which dialect-specific serializers are needed.

Finally, if more human-readable output is needed, there are many different choices, with many RDF/XML documents corresponding to identical RDF graphs. Individual triples can be represented in numerous ways. High-quality RDF serialization requires that serializing code consider these choices, since which of them is most appropriate depends on the application.

4. Other RDF Classes and Properties

RDF defines a number of additional classes and properties, providing capabilities for representing containers and RDF statements, and for deploying RDF information in the World Wide Web. These additional classes and properties are described in the following sections.

4.1. RDF Containers

There is often a need to represent collections of things. For example, we might want to say that a book was created by several authors, or to list the students in a course, or the software modules in a package. RDF provides several pre-defined container types that ca