— This is a first draft of a document that might become an interest group note. —
Copyright © 2007 W3C ® (MIT , ERCIM , Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web. This Primer is designed to provide the reader with the basic knowledge required to effectively use RDF. It describes how to define RDF vocabularies using the RDF Vocabulary Description Language. It introduces the basic concepts of RDF and describes its Turtle serialization syntax.
The original version of this Primer[RDF_PRIMER] was part of the RDF Recommendation published in February 2004, was based on the RDF/XML serialization syntax of RDF. The text of the original primer has been adapted to Turtle for the purpose of this document, and some of the application examples (that were defined by external bodies in terms of RDF/XML) have been removed.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is an Interest Group Note, developed by the Semantic Web Interest Group.
As of the publication of this Interest Group Note the Semantic Web Interest Group has completed work on this document. Comments on this document may be sent to public-n3-discuss@w3.org (with public archive). Further discussion on this material may be sent to the Semantic Web Interest Group mailing list, semantic-web@w3.org (also with public archive).
Publication as a Working Group Note does not imply endorsement by the W3C Membership. This document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
The W3C maintains a list of any patent disclosures related to the original, RDF/XML version of this work. @@maybe something is needed here on Yahoo! vs. Dave?@@@
The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web. It is particularly intended for representing metadata about Web resources, such as the title, author, and modification date of a Web page, copyright and licensing information about a Web document, or the availability schedule for some shared resource. However, by generalizing the concept of a "Web resource", RDF can also be used to represent information about things that can be identified on the Web, even when they cannot be directly retrieved on the Web. Examples include information about items available from on-line shopping facilities (e.g., information about specifications, prices, and availability), or the description of a Web user's preferences for information delivery.
RDF is intended for situations in which this information needs to be processed by applications, rather than being only displayed to people. RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning. Since it is a common framework, application designers can leverage the availability of common RDF parsers and processing tools. The ability to exchange information between different applications means that the information may be made available to applications other than those for which it was originally created.
RDF is based on the idea of identifying things using Web identifiers
(called Uniform Resource Identifiers, or URIs), and
describing resources in terms of simple properties and property values. This
enables RDF to represent simple statements about resources as a
graph of nodes and arcs representing the resources, and their
properties and values. To make this discussion somewhat more concrete as soon
as possible, the group of statements "there is a Person identified by
http://www.w3.org/People/EM/contact#me , whose name
is Eric Miller, whose email address is em@w3.org, and whose title is Dr."
could be represented as the RDF graph in Figure 1:
Figure 1 illustrates that RDF uses URIs to identify:
http://www.w3.org/People/EM/contact#me http://www.w3.org/2000/10/swap/pim/contact#Person http://www.w3.org/2000/10/swap/pim/contact#mailbox mailto:em@w3.org as the
value of the mailbox property (RDF also uses character strings such as
"Eric Miller", and values from other datatypes such as integers and
dates, as the values of properties) RDF also provides a text-based syntax (called Turtle) for recording and exchanging these graphs. Example 1 is a small chunk of RDF in Turtle syntax corresponding to the graph in Figure 1:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix contact: <http://www.w3.org/2000/10/swap/pim/contact#>. <http://www.w3.org/People/EM/contact#me> rdf:type contact:Person; contact:fullName "Eric Miller"; contact:mailbox <mailto:em@w3.org>; contact:personalTitle "Dr.".
Note that this example also contains URIs, as
well as properties like mailbox and fullName (in an
abbreviated form), and their respective values em@w3.org, and
Eric Miller.
Like HTML, this Turtle syntax is machine processable and, using URIs, can link pieces of information across the Web. However, unlike conventional hypertext, RDF URIs can refer to any identifiable thing, including things that may not be directly retrievable on the Web (such as the person Eric Miller). The result is that in addition to describing such things as Web pages, RDF can also describe cars, businesses, people, news events, etc. In addition, RDF properties themselves have URIs, to precisely identify the relationships that exist between the linked items.
The following documents contribute to the specification of RDF:
This Primer is intended to provide an introduction to RDF and describe some existing RDF applications, to help information system designers and application developers understand the features of RDF and how to use them. In particular, the Primer is intended to answer such questions as:
This Primer is a non-normative document, which means that it does not provide a definitive specification of RDF. The examples and other explanatory material in the Primer are provided to help readers understand RDF, but they may not always provide definitive or fully-complete answers. In such cases, the relevant normative parts of the RDF specification should be consulted. To help in doing this, the Primer describes the roles these other documents play in the complete specification of RDF, and provides links pointing to the relevant parts of the normative specifications, at appropriate places in the discussion.
This Primer follows as closely as possible the RDF/XML Version of the Primer [RDF-PRIMER]. RDF/XML provides a serialization based on XML, whereas the Turtle syntax used in this document is different. Both serializations are equivalent in the sense that they can express the same RDF information, and the choice among the two is solely based on personal taste, availability of parser, etc. Tools also exist that convert the two serialization format to one another without any loss of expressivity.
RDF is intended to provide a simple way to make statements about Web resources, e.g., Web pages. This section describes the basic ideas behind the way RDF provides these capabilities (the normative specification describing these concepts is RDF Concepts and Abstract Syntax [RDF-CONCEPTS]).
Imagine trying to state that someone named John Smith created a particular Web page. A straightforward way to state this in a natural language such as English would be in the form of a simple statement such as:
http://www.example.org/index.html
has a creator whose value is John Smith
Parts of this statement are emphasized to illustrate that, in order to describe the properties of something, there need to be ways to name, or identify, a number of things:
In this statement, the Web page's URL (Uniform Resource Locator) is used to identify it. In addition, the word "creator" is used to identify the property, and the two words "John Smith" to identify the thing (a person) that is the value of this property.
Other properties of this Web page could be described by writing additional English statements of the same general form, using the URL to identify the page, and words (or other expressions) to identify the properties and their values. For example, the date the page was created, and the language in which the page is written, could be described using the additional statements:
http://www.example.org/index.html
has a creation-date whose value is August 16,
1999
http://www.example.org/index.html has a
language whose value is English
RDF is based on the idea that the things being described have properties which have values, and that resources can be described by making statements, similar to those above, that specify those properties and values. RDF uses a particular terminology for talking about the various parts of statements. Specifically, the part that identifies the thing the statement is about (the Web page in this example) is called the subject. The part that identifies the property or characteristic of the subject that the statement specifies (creator, creation-date, or language in these examples) is called the predicate, and the part that identifies the value of that property is called the object. So, taking the English statement
http://www.example.org/index.html
has a creator whose value is John Smith
the RDF terms for the various parts of the statement are:
http://www.example.org/index.html However, while English is good for communicating between (English-speaking) humans, RDF is about making machine-processable statements. To make these kinds of statements suitable for processing by machines, two things are needed:
Fortunately, the existing Web architecture provides both these necessary facilities.
As illustrated earlier, the Web already provides one form of identifier, the Uniform Resource Locator (URL). A URL was used in the original example to identify the Web page that John Smith created. A URL is a character string that identifies a Web resource by representing its primary access mechanism (essentially, its network "location"). However, it is also important to be able to record information about many things that, unlike Web pages, do not have network locations or URLs.
The Web provides a more general form of identifier for these purposes, called the Uniform Resource Identifier (URI). URLs are a particular kind of URI. All URIs share the property that different persons or organizations can independently create them, and use them to identify things. However, URIs are not limited to identifying things that have network locations, or use other computer access mechanisms. In fact, a URI can be created to refer to anything that needs to be referred to in a statement, including
Because of this generality, RDF uses URIs as the basis of its mechanism
for identifying the subjects, predicates, and objects in statements. To be
more precise, RDF uses URI
references [URIS]. A URI reference (or
URIref) is a URI, together with an optional fragment
identifier at the end. For example, the URI reference
http://www.example.org/index.html#section2 consists of the URI
http://www.example.org/index.html and (separated by the "#"
character) the fragment identifier Section2. RDF URIrefs can
contain Unicode [UNICODE] characters (see [RDF-CONCEPTS]), allowing many languages to be
reflected in URIrefs. RDF defines a resource as anything that is
identifiable by a URI reference, so using URIrefs allows RDF to describe
practically anything, and to state relationships between such things as well.
URIrefs and fragment identifiers are discussed further in Appendix A, and in [RDF-CONCEPTS].
To represent RDF statements in a machine-processable way, RDF normatively uses the Extensible Markup Language [XML], but other syntaxes are also possible. The XML serialization syntax (referrred to as RDF/XML) is described in a separate document [RDF-SYNTAX]. This document uses the Turtle syntax [TURTLE]. An example of Turtle was given in Section 1. Turtle content can contain Unicode [UNICODE] characters, allowing information from many languages to be directly represented. The specific Turtle syntax is defined in [TURTLE]
Section 2.1 has introduced RDF's basic statement concepts, the idea of using URI references to identify the things referred to in RDF statements, and Turtle as a machine-processable way to represent RDF statements. With that background, this section describes how RDF uses URIs to make statements about resources. The introduction said that RDF was based on the idea of expressing simple statements about resources, where each statement consists of a subject, a predicate, and an object. In RDF, the English statement:
http://www.example.org/index.html
has a creator whose value is John Smith
could be represented by an RDF statement having:
http://www.example.org/index.html http://purl.org/dc/elements/1.1/creator http://www.example.org/staffid/85740 Note how URIrefs are used to identify not only the subject of the original statement, but also the predicate and object, instead of using the words "creator" and "John Smith", respectively (some of the effects of using URIrefs in this way will be discussed later in this section).
RDF models statements as nodes and arcs in a graph. RDF's graph model is defined in [RDF-CONCEPTS]. In this notation, a statement is represented by:
So the RDF statement above would be represented by the graph shown in Figure 2:
Groups of statements are represented by corresponding groups of nodes and arcs. So, to reflect the additional English statements
http://www.example.org/index.html
has a creation-date whose value is August 16,
1999
http://www.example.org/index.html has a
language whose value is English
in the RDF graph, the graph shown in Figure 3 could be used (using suitable URIrefs to name the properties "creation-date" and "language"):
Figure 3 illustrates that objects in RDF statements
may be either URIrefs, or constant values (called literals)
represented by character strings, in order to represent certain kinds of
property values. (In the case of the predicate
http://purl.org/dc/elements/1.1/language the literal is an
international standard two-letter code for English.) Literals may not be used
as subjects or predicates in RDF statements. In drawing RDF graphs, nodes
that are URIrefs are shown as ellipses, while nodes that are literals are
shown as boxes. (The simple character string literals used in these examples
are called plain
literals, to distinguish them from the typed
literals to be introduced in Section 2.4.
The various kinds of literals that can be used in RDF statements are defined
in [RDF-CONCEPTS]. Both plain and typed
literals can contain Unicode [UNICODE] characters,
allowing information from many languages to be directly represented.)
Sometimes it is not convenient to draw graphs when discussing them, so an alternative way of writing down the statements, called triples, is also used. In the triples notation, each statement in the graph is written as a simple triple of subject, predicate, and object, in that order. For example, the three statements shown in Figure 3 would be written in the triples notation as:
<http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/creator> <http://www.example.org/staffid/85740> . <http://www.example.org/index.html> <http://www.example.org/terms/creation-date> "August 16, 1999" . <http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/language> "en" .
Each triple corresponds to a single arc in the graph, complete with the
arc's beginning and ending nodes (the subject and object of the statement).
Unlike the drawn graph (but like the original statements), the triples
notation requires that a node be separately identified for each statement it
appears in. So, for example, http://www.example.org/index.html
appears three times (once in each triple) in the triples representation of
the graph, but only once in the drawn graph. However, the triples represent
exactly the same information as the drawn graph, and this is a key point:
what is fundamental to RDF is the graph model of the statements. The
notation used to represent or depict the graph is secondary.
The full triples notation requires that URI references be written out
completely, in angle brackets, which, as the example above illustrates, can
result in very long lines on a page. For convenience, the Primer uses a
shorthand way of writing triples (the same shorthand is also used in other
RDF specifications). This shorthand substitutes a
qualified name (or QName) without angle brackets as an
abbreviation for a full URI reference . A QName contains a prefix
that has been assigned to a namespace URI, followed by a colon, and then a
local name. The full URIref is formed from the QName by appending
the local name to the namespace URI assigned to the prefix. So, for example,
if the QName prefix foo is assigned to the namespace URI
http://example.org/somewhere/, then the QName
foo:bar is shorthand for the URIref
http://example.org/somewhere/bar. Primer examples will also use
several "well-known" QName prefixes (without explicitly specifying them each
time), defined as follows:
prefix rdf:, namespace URI:
http://www.w3.org/1999/02/22-rdf-syntax-ns#
prefix rdfs:, namespace URI:
http://www.w3.org/2000/01/rdf-schema#
prefix dc:, namespace URI:
http://purl.org/dc/elements/1.1/
prefix owl:, namespace URI:
http://www.w3.org/2002/07/owl#
prefix ex:, namespace URI: http://www.example.org/
(or http://www.example.com/)
prefix xsd:, namespace URI:
http://www.w3.org/2001/XMLSchema#
Obvious variations on the "example" prefix ex: will also be
used as needed in the examples, for instance,
prefix exterms:, namespace URI:
http://www.example.org/terms/ (for terms used by an example
organization),
prefix exstaff:, namespace URI:
http://www.example.org/staffid/ (for the example organization's
staff identifiers),
prefix ex2:, namespace URI:
http://www.domain2.example.org/ (for a second example
organization), and so on.
Using this new shorthand, the previous set of triples can be written as:
ex:index.html dc:creator exstaff:85740 . ex:index.html exterms:creation-date "August 16, 1999" . ex:index.html dc:language "en" .
Since RDF uses URIrefs instead of
words to name things in statements, RDF refers to a set of URIrefs
(particularly a set intended for a specific purpose) as a
vocabulary. Often, the URIrefs in such vocabularies are organized so
that they can be represented as a set of QNames using a common prefix. That
is, a common namespace URIref will be chosen for all terms in a vocabulary,
typically a URIref under the control of whoever is defining the vocabulary.
URIrefs that are contained in the vocabulary are formed by appending
individual local names to the end of the common URIref. This forms a set of
URIrefs with a common prefix. For instance, as illustrated by the previous
examples, an organization such as example.org might define a vocabulary
consisting of URIrefs starting with the prefix
http://www.example.org/terms/ for terms it uses in its business,
such as "creation-date" or "product", and another vocabulary of URIrefs
starting with http://www.example.org/staffid/ to identify its
employees. RDF uses this same approach to define its own vocabulary of terms
with special meanings in RDF. The URIrefs in this RDF vocabulary all begin
with http://www.w3.org/1999/02/22-rdf-syntax-ns#, conventionally
associated with the QName prefix rdf:. The RDF Vocabulary
Description Language (described in Section 5)
defines an additional set of terms having URIrefs that begin with
http://www.w3.org/2000/01/rdf-schema#, conventionally associated
with the QName prefix rdfs:. (Where a specific QName prefix is
commonly used in connection with a given set of terms in this way, the QName
prefix itself is sometimes used as the name of the vocabulary. For example,
someone might refer to "the rdfs: vocabulary".)
Using common URI prefixes provides a convenient way to organize the URIrefs for a related set of terms. However, this is just a convention. The RDF model only recognizes full URIrefs; it does not "look inside" URIrefs or use any knowledge about their structure. In particular, RDF does not assume there is any relationship between URIrefs just because they have a common leading prefix (see Appendix A for further discussion). Moreover, there is nothing that says that URIrefs with different leading prefixes cannot be considered part of the same vocabulary. A particular organization, process, tool, etc. can define a vocabulary that is significant for it, using URIrefs from any number of other vocabularies as part of its vocabulary.
In addition, sometimes an organization will use a vocabulary's namespace
URIref as the URL of a Web resource that provides further information about
that vocabulary. For example, as noted earlier, the QName prefix
dc: will be used in Primer examples, associated with the
namespace URIref http://purl.org/dc/elements/1.1/. In fact, this
refers to the Dublin Core vocabulary described in Section 6.1. Accessing this namespace URIref in a Web
browser will retrieve additional information about the Dublin Core vocabulary
(specifically, an RDF schema).
In the rest of the Primer, the term vocabulary will be used when referring to a set of URIrefs defined for some specific purpose, such as the set of URIrefs defined by RDF for its own use, or the set of URIrefs defined by example.org to identify its employees.
URIrefs from different vocabularies can be freely mixed in RDF graphs. For
example, the graph in Figure 3 uses URIrefs from the
exterms:, exstaff:, and dc:
vocabularies. Also, RDF imposes no restrictions on how many statements using a given URIref as predicate can
appear in a graph to describe the same resource. For example, if the resource
ex:index.html had been created by the cooperative efforts of
several staff members in addition to John Smith, example.org might have
written the statements:
ex:index.html dc:creator exstaff:85740 . ex:index.html dc:creator exstaff:27354 . ex:index.html dc:creator exstaff:00816 .
These examples of RDF statements begin to illustrate some of the
advantages of using URIrefs as RDF's basic way of identifying things. For
instance, in the first statement, instead of identifying the creator of the
Web page by the character string "John Smith", he has been assigned a URIref,
in this case (using a URIref based on his employee number)
http://www.example.org/staffid/85740 . An advantage of using a
URIref in this case is that the identification of the statement's subject can
be more precise. That is, the creator of the page is not the character string
"John Smith", or any one of the thousands of people named John Smith, but the
particular John Smith associated with that URIref (whoever created the URIref
defines the association). Moreover, since there is a URIref to refer to John
Smith, he is a full-fledged resource, and additional information can be
recorded about him, simply by adding additional RDF statements with John's
URIref as the subject. For example, Figure 4 shows
some additional statements giving John's name and age.
These examples also illustrate that RDF uses URIrefs as
predicates in RDF statements. That is, rather than using character
strings (or words) such as "creator" or "name" to identify properties, RDF
uses URIrefs. Using URIrefs to identify properties is important for a number
of reasons. First, it distinguishes the properties one person may use from
different properties someone else may use that would otherwise be identified
by the same character string. For instance, in the example in Figure 4, example.org uses "name" to mean someone's full
name written out as a character string literal (e.g., "John Smith"), but
someone else may intend "name" to mean something different (e.g., the name of
a variable in a piece of program text). A program encountering "name" as a
property identifier on the Web (or merging data from multiple sources) would
not necessarily be able to distinguish these uses. However, if example.org
writes http://www.example.org/terms/name for its "name"
property, and the other person writes
http://www.domain2.example.org/genealogy/terms/name for hers, it
is clear that there are distinct properties involved (even if a program
cannot automatically determine the distinct meanings). Also, using URIrefs to
identify properties enables the properties to be treated as resources
themselves. Since properties are resources, additional information can be
recorded about them (e.g., the English description of what example.org means
by "name"), simply by adding additional RDF statements with the property's
URIref as the subject.
Using URIrefs as subjects, predicates, and objects in RDF statements supports the development and use of shared vocabularies on the Web, since people can discover and begin using vocabularies already used by others to describe things, reflecting a shared understanding of those concepts. For example, in the triple
ex:index.html dc:creator exstaff:85740 .
the predicate dc:creator, when fully expanded as a URIref, is
an unambiguous reference to the "creator" attribute in the Dublin Core
metadata attribute set (discussed further in Section
6.1), a widely-used set of attributes (properties) for describing
information of all kinds. The writer of this triple is effectively saying
that the relationship between the Web page (identified by
http://www.example.org/index.html ) and the creator of the page
(a distinct person, identified by
http://www.example.org/staffid/85740 ) is exactly the concept
identified by http://purl.org/dc/elements/1.1/creator. Another person familiar
with the Dublin Core vocabulary, or who finds out what
dc:creator means (say by looking up its definition on the Web)
will know what is meant by this relationship. In addition, based on this
understanding, people can write programs to behave in accordance with that
meaning when processing triples containing the predicate
dc:creator.
Of course, this
depends on increasing the general use of URIrefs to refer to things instead
of using literals; e.g., using URIrefs like exstaff:85740 and
dc:creator instead of character string literals like John
Smith and creator. Even then, RDF's use of URIrefs
does not solve all identification problems because, for example, people can
still use different URIrefs to refer to the same thing. For this reason, it
is a good idea to try to use terms from existing vocabularies (such as the
Dublin Core) where possible, rather than making up new terms that might
overlap with those of some other vocabulary. Appropriate vocabularies for use
in specific application areas are being developed all the time, as
illustrated by the applications described in Section
6. However, even when synonyms are created, the fact that these different
URIrefs are used in the commonly-accessible "Web space" provides the
opportunity both to identify equivalences among these different references,
and to migrate toward the use of common references.
In addition, it is important to distinguish between any meaning that
RDF itself associates with terms (such as dc:creator in
the previous example) used in RDF statements and additional,
externally-defined meaning that people (or programs written by those
people) might associate with those terms. As a language, RDF directly defines
only the graph syntax of subject, predicate, and object triples, certain
meanings associated with URIrefs in the rdf: vocabulary, and
certain other concepts to be described later. These things are normatively
defined in [RDF-CONCEPTS] and [RDF-SEMANTICS]. However, RDF does not define
the meanings of terms from other vocabularies, such as
dc:creator, that might be used in RDF statements. Specific
vocabularies will be created, with specific meanings assigned to the URIrefs
defined in them, externally to RDF. RDF statements using URIrefs from these
vocabularies may convey the specific meanings associated with those terms to
people familiar with these vocabularies, or to RDF applications written to
process these vocabularies, without conveying any of these meanings to an
arbitrary RDF application not specifically written to process these
vocabularies.
For example, people can associate meaning with a triple such as
ex:index.html dc:creator exstaff:85740 .
based on the meaning they associate with the appearance of the word
"creator" as part of the URIref dc:creator, or based on their
understanding of the specific definition of dc:creator in the
Dublin Core vocabulary. However, as far as an arbitrary RDF application is
concerned the triple might as well be something like
fy:joefy.iunm ed:dsfbups fytubgg:85740 .
as far as any built-in meaning is concerned. Similarly, any natural
language text describing the meaning of dc:creator that might be
found on the Web provides no additional meaning that an arbitrary RDF
application can directly use.
Of course, URIrefs from a particular vocabulary can be used in RDF
statements even though a given application may not be able to associate any
special meanings with them. For example, generic RDF software would recognize
that the above expression is an RDF statement, that ed:dsfbups
is the predicate, and so on. It will simply not associate with the triple any
special meaning that the vocabulary developer might have associated with a
URIref like ed:dsfbups. Moreover, based on their understanding
of a given vocabulary, people can write RDF applications to behave in
accordance with the special meanings assigned to URIrefs from that
vocabulary, even though that meaning will not be accessible to RDF
applications not written in that way.
The result of all this is that RDF provides a way to make statements that
applications can more easily process. An application cannot actually
"understand" such statements, as noted already, any
more than a database system "understands" terms like "employee" or "salary"
in processing a query like SELECT NAME FROM EMPLOYEE WHERE SALARY >
35000. However, if an application is appropriately written, it
can deal with RDF statements in a way that makes it seem like it does
understand them, just as a database system and its
applications can do useful work in processing employee and payroll
information without understanding "employee" and "payroll". For
example, a user could search the Web for all book reviews and create an
average rating for each book. Then, the user could put that information back
on the Web. Another Web site could take that list of book rating averages and
create a "Top Ten Highest Rated Books" page. Here, the availability and use
of a shared vocabulary about ratings, and a shared group of URIrefs
identifying the books they apply to, allows individuals to build a
mutually-understood and increasingly-powerful (as additional contributions
are made) "information base" about books on the Web. The same principle
applies to the vast amounts of information that people create about thousands
of subjects every day on the Web.
RDF statements are similar to a number of other formats for recording information, such as:
and information in these formats can be treated as RDF statements, allowing RDF to be used to integrate data from many sources.
Things would be very simple if the only types of information to be
recorded about things were obviously in the form of the simple RDF statements
illustrated so far. However, most real-world data involves structures that
are more complicated than that, at least on the surface. For instance, in the
original example, the date the Web page was created is recorded as a single
exterms:creation-date property, with a plain literal as its
value. However, suppose the value of the exterms:creation-date
property needed to record the month, day, and year as separate pieces of
information? Or, in the case of John Smith's personal information, suppose
John's address was being described. The whole address could be written out as
a plain literal, as in the triple
exstaff:85740 exterms:address "1501 Grant Avenue, Bedford, Massachusetts 01730" .
However, suppose John's address needed to be recorded as a structure consisting of separate street, city, state, and postal code values? How would this be done in RDF?
Structured information like this is represented in RDF by considering the
aggregate thing to be described (like John Smith's address) as a resource,
and then making statements about that new resource. So, in the RDF graph, in
order to break up John Smith's address into its component parts, a new node
is created to represent the concept of John Smith's address, with a new
URIref to identify it, say
http://www.example.org/addressid/85740 (abbreviated as
exaddressid:85740). RDF statements (additional arcs and nodes)
can then be written with that node as the subject, to represent the
additional information, producing the graph shown in Figure 5:
or the triples:
exstaff:85740 exterms:address exaddressid:85740 . exaddressid:85740 exterms:street "1501 Grant Avenue" . exaddressid:85740 exterms:city "Bedford" . exaddressid:85740 exterms:state "Massachusetts" . exaddressid:85740 exterms:postalCode "01730" .
This way of representing structured information in RDF can involve
generating numerous "intermediate" URIrefs such as
exaddressid:85740 to represent aggregate concepts such as John's
address. Such concepts may never need to be referred to directly from outside
a particular graph, and hence may not require "universal" identifiers. In
addition, in the drawing of the graph representing the group of
statements shown in Figure 5, the URIref assigned to
identify "John Smith's address" is not really needed, since the graph could
just as easily have been drawn as in Figure 6:
Figure 6, which is a perfectly good RDF graph, uses a node without a URIref to stand for the concept of "John Smith's address". This blank node serves its purpose in the drawing without needing a URIref, since the node itself provides the necessary connectivity between the various other parts of the graph. (Blank nodes were called anonymous resources in [RDF-MS].) However, some form of explicit identifier for that node is needed in order to represent this graph as triples. To see this, trying to write the triples corresponding to what is shown in Figure 6 would produce something like:
exstaff:85740 exterms:address ??? . ??? exterms:street "1501 Grant Avenue" . ??? exterms:city "Bedford" . ??? exterms:state "Massachusetts" . ??? exterms:postalCode "01730" .
where ??? stands for something that indicates the presence of the blank
node. Since a complex graph might contain more than one blank node, there
also needs to be a way to differentiate between these different blank nodes
in a triples representation of the graph. As a result, triples use blank node
identifiers, having the form _:name, to indicate the
presence of blank nodes. For instance, in this example a blank node
identifier _:johnaddress might be used to refer to the blank
node, in which case the resulting triples might be:
exstaff:85740 exterms:address _:johnaddress . _:johnaddress exterms:street "1501 Grant Avenue" . _:johnaddress exterms:city "Bedford" . _:johnaddress exterms:state "Massachusetts" . _:johnaddress exterms:postalCode "01730" .
In a triples representation of a graph, each distinct blank node in the graph is given a different blank node identifier. Unlike URIrefs and literals, blank node identifiers are not considered to be actual parts of the RDF graph (this can be seen by looking at the drawn graph in Figure 6 and noting that the blank node has no blank node identifier). Blank node identifiers are just a way of representing the blank nodes in a graph (and distinguishing one blank node from another) when the graph is written in triple form. Blank node identifiers also have significance only within the triples representing a single graph (two different graphs with the same number of blank nodes might independently use the same blank node identifiers to distinguish them, and it would be incorrect to assume that blank nodes from different graphs having the same blank node identifiers are the same). If it is expected that a node in a graph will need to be referenced from outside the graph, a URIref should be assigned to identify it. Finally, because blank node identifiers represent (blank) nodes, rather than arcs, in the triple form of an RDF graph, blank node identifiers may only appear as subjects or objects in triples; blank node identifiers may not be used as predicates in triples.
The beginning of this section noted that aggregate structures, like John Smith's address, can be represented by considering the aggregate thing to be described as a separate resource, and then making statements about that new resource. This example illustrates an important aspect of RDF: RDF directly represents only binary relationships, e.g. the relationship between John Smith and the literal representing his address. Representing the relationship between John and the group of separate components of this address involves dealing with an n-ary (n-way) relationship (in this case, n=5) between John and the street, city, state, and postal code components. In order to represent such structures directly in RDF (e.g., considering the address as a group of street, city, state, and postal code components), this n-way relationship must be broken up into a group of separate binary relationships. Blank nodes provide one way to do this. For each n-ary relationship, one of the participants is chosen as the subject of the relationship (John in this case), and a blank node is created to represent the rest of the relationship (John's address in this case). The remaining participants in the relationship (such as the city in this example) are then represented as separate properties of the new resource represented by the blank node.
Blank nodes also provide a way to more accurately make statements about
resources that may not have URIs, but that are described in terms of
relationships with other resources that do have URIs. For example,
when making statements about a person, say Jane Smith, it may seem natural to
use a URI based on that person's email address as her URI, e.g.,
mailto:jane@example.org. However, this approach can cause
problems. For example, it may be necessary to record information both about
Jane's mailbox (e.g., the server it is on) as well as about Jane
herself (e.g., her current physical address), and using a URIref for
Jane based on her email address makes it difficult to know whether it is Jane
or her mailbox that is being described. The same problem exists when a
company's Web page URL, say http://www.example.com/, is used as
the URI of the company itself. Once again, it may be necessary to record
information about the Web page itself (e.g., who created it and when) as well
as about the company, and using http://www.example.com/ as an
identifier for both makes it difficult to know which of these is the actual
subject.
The fundamental problem is that using Jane's mailbox as a
stand-in for Jane is not really accurate: Jane and her mailbox are
not the same thing, and hence they should be identified differently. When
Jane herself does not have a URI, a blank node provides a more accurate way
of modeling this situation. Jane can be represented by a blank node, and that
blank node used as the subject of a statement with
exterms:mailbox as the property and the URIref
mailto:jane@example.org as its value. The blank node could also
be described with an rdf:type property having a value of
exterms:Person (types are discussed in more detail in the
following sections), an exterms:name property having a value of
"Jane Smith", and any other descriptive information that might
be useful, as shown in the following triples:
_:jane exterms:mailbox <mailto:jane@example.org> . _:jane rdf:type exterms:Person . _:jane exterms:name "Jane Smith" . _:jane exterms:empID "23748" . _:jane exterms:age "26" .
(Note that mailto:jane@example.org is written within angle
brackets in the first triple. This is because
mailto:jane@example.org is a full URIref in the
mailto URI scheme, rather than a QName abbreviation, and full
URIrefs must be enclosed in angle brackets in the triples notation.)
This says, accurately, that "there is a resource of type
exterms:Person, whose electronic mailbox is identified by
mailto:jane@example.org, whose name is Jane Smith,
etc." That is, the blank node can be read as "there is a resource".
Statements with that blank node as subject then provide information about the
characteristics of that resource.
In practice, using blank nodes instead of URIrefs in these cases does not
change the way this kind of information is handled very much. For example, if
it is known that an email address uniquely identifies someone at example.org
(particularly if the address is unlikely to be reused), that fact can still
be used to associate information about that person from multiple sources,
even though the email address is not the person's URI. In this case, if some
RDF is found on the Web that describes a book, and gives the author's contact
information as mailto:jane@example.org, it might be reasonable,
combining this new information with the previous set of triples, to conclude
that the author's name is Jane Smith. The point is that saying something like
"the author of the book is mailto:jane@example.org" is typically
a shorthand for "the author of the book is someone whose mailbox is
mailto:jane@example.org". Using a blank node to represent this
"someone" is just a more accurate way to represent the real world situation.
(Incidentally, some RDF-based schema languages allow specifying that certain
properties are unique identifiers of the resources they describe.
This is discussed further in Section 5.5.)
Using blank nodes in this way can also help avoid the use of literals in
what might be inappropriate situations. For example, in describing Jane's
book, lacking a URIref to identify the author, the publisher might have
written (using the publisher's own ex2terms: vocabulary):
ex2terms:book78354 rdf:type ex2terms:Book . ex2terms:book78354 ex2terms:author "Jane Smith" .
However, the author of the book is not really the character string "Jane Smith", but a person whose name is Jane Smith. The same information might be more accurately given by the publisher using a blank node, as:
ex2terms:book78354 rdf:type ex2terms:Book . ex2terms:book78354 ex2terms:author _:author78354 . _:author78354 rdf:type ex2terms:Person . _:author78354 ex2terms:name "Jane Smith" .
This essentially says "resource ex2terms:book78354 is of type
ex2terms:Book, and its author is a resource of type
ex2terms:Person, whose name is Jane Smith." Of
course, in this particular case the publisher might instead have assigned its
own URIrefs to its authors instead of using blank nodes to identify them, in
order to encourage external references to its authors.
Finally, the example above giving Jane's age as 26 illustrates the fact that sometimes the value of a property may appear to be simple, but actually may be more complex. In this case, Jane's age is actually 26 years, but the units information (years) is not explicitly given. Such information is often omitted in contexts where it can be safely assumed that anyone accessing the property value will understand the units being used. However, in the wider context of the Web, it is generally not safe to make this assumption. For example, a U.S. site might give a weight value in pounds, but someone accessing that data from outside the U.S. might assume that weights are given in kilograms. In general, careful consideration should be given to explicitly representing units and similar information. This issue is discussed further in Section 4.4, which describes an RDF feature for representing such information as structured values, as well as some other techniques for representing such information.
The last section described how to handle situations in which property
values represented by plain literals had to be broken up into structured
values to represent the individual parts of those literals. Using this
approach, instead of, say, recording the date a Web page was created as a
single exterms:creation-date property, with a single plain
literal as its value, the value would be represented as a structure
consisting of the month, day, and year as separate pieces of information,
using separate plain literals to represent the corresponding values. However,
so far, all constant values that serve as objects in RDF statements have been
represented by these plain (untyped) literals, even when the intent is
probably for the value of the property to be a number (e.g., the value of a
year or age property) or some other kind of more
specialized value.
For example, Figure 4 illustrated an RDF graph
recording information about John Smith. That graph recorded the value of John
Smith's exterms:age property as the plain literal "27", as shown
in Figure 7:
In this case, the hypothetical organization example.org probably intends
for "27" to be interpreted as a number, rather than as the string consisting
of the character "2" followed by the character "7" (since the literal represents the value of an "age"
property). However, there is no information in Figure 7's graph that
explicitly indicates that "27" should be interpreted as a number. Similarly,
example.org also probably intends for "27" to be interpreted as a
decimal number, i.e., the value twenty seven, rather than,
say, as an octal number, i.e., the value twenty three.
However, once again there is no information in Figure 7's graph that
explicitly indicates this. Specific applications might be written with the
understanding that they should interpret values of the
exterms:age property as decimal numbers, but this would mean
that proper interpretation of this RDF would depend on information not
explicitly provided in the RDF graph, and hence on information that would not
necessarily be available to other applications that might need to interpret
this RDF.
The common practice in programming languages or database systems is to
provide this additional information about how to interpret a literal by
associating a datatype with the literal, in this case, a datatype
like decimal or integer. An application that
understands the datatype then knows, for example, whether the literal "10" is
intended to represent the number ten, the number two, or
the string consisting of the character "1" followed by the character "0",
depending on whether the specified datatype is integer,
binary, or string. (More specialized datatypes
could also be used to include the units information mentioned at the end of
Section 2.3, e.g., a datatype
integerYears, although the Primer will not elaborate on this
idea.) In RDF, typed
literals are used to provide this kind of information.
An RDF typed literal is formed by pairing a string with a URIref that identifies a particular datatype. This results in a single literal node in the RDF graph with the pair as the literal. The value represented by the typed literal is the value that the specified datatype associates with the specified string. For example, using a typed literal, John Smith's age could be described as being the integer number 27 using the triple:
<http://www.example.org/staffid/85740> <http://www.example.org/terms/age> "27"^^<http://www.w3.org/2001/XMLSchema#integer> .
or, using the QName simplification for writing long URIs:
exstaff:85740 exterms:age "27"^^xsd:integer .
or as shown in Figure 8:
Similarly, in the graph shown in Figure 3
describing information about a Web page, the value of the page's
exterms:creation-date property was written as the plain literal
"August 16, 1999". However, using a typed literal, the creation date of the
Web page could be explicitly described as being the date August 16,
1999, using the triple:
ex:index.html exterms:creation-date "1999-08-16"^^xsd:date .
or as shown in Figure 9:
Unlike typical programming languages and database systems, RDF has no
built-in set of datatypes of its own, such as datatypes for integers, reals,
strings, or dates. Instead, RDF typed literals simply
provide a way to explicitly indicate, for a given literal, what datatype
should be used to interpret it. The datatypes used in typed literals are
defined externally to RDF, and identified by their datatype URIs.
(There is one exception: RDF defines a built-in datatype with the URIref
rdf:XMLLiteral to represent XML content as a literal value. This
datatype is defined in [RDF-CONCEPTS], and
its use is described in Section 4.5.) For
instance, the examples in Figure 8 and Figure 9 use the datatypes integer and
date from the XML Schema datatypes defined in XML Schema Part 2: Datatypes [XML-SCHEMA2]. An advantage of this approach is
that it gives RDF the flexibility to directly represent information coming
from different sources without the need to perform type conversions between
these sources and a native set of RDF datatypes. (Type conversions would
still be required when moving information between systems having different
sets of datatypes, but RDF would impose no extra conversions into and out of
a native set of RDF datatypes.)
RDF datatype concepts are based on a conceptual framework from XML Schema datatypes [XML-SCHEMA2], as described in RDF Concepts and Abstract Syntax [RDF-CONCEPTS]. This conceptual framework defines a datatype as consisting of:
xsd:date, this set of values is a set of dates.
xsd:date defines
1999-08-16 as being a legal way to write a literal of this
type (as opposed, say, to August 16, 1999). As defined in [RDF-CONCEPTS], the lexical space of a
datatype is a set of Unicode [UNICODE]
strings, allowing information from many languages to be directly
represented. xsd:date determines
that, for this datatype, the string 1999-08-16 represents
the date August 16, 1999. The lexical-to-value mapping is a
factor because the same character string may represent different values
for different datatypes. Not all datatypes are suitable for use in RDF. For a datatype to be
suitable for use in RDF, it must conform to the conceptual framework just
described. This basically means that, given a character string, the datatype
must unambiguously define whether or not the string is in its lexical space,
and what value in its value space the string represents. For example, the
basic XML Schema datatypes such as xsd:string,
xsd:boolean, xsd:date, etc. are suitable for use in
RDF. However, some of the built-in XML Schema datatypes are not suitable for
use in RDF. For example, xsd:duration does not have a
well-defined value space, and xsd:QName requires an enclosing
XML document context. Lists of the XML Schema datatypes that are currently
considered suitable and unsuitable for use in RDF are given in [RDF-SEMANTICS].
Since the value that a given typed literal denotes
is defined by the typed literal's datatype, and, with the exception of
rdf:XMLLiteral, RDF does not define any datatypes, the actual
interpretation of a typed literal appearing in an RDF graph (e.g.,
determining the value it denotes) must be performed by software that is
written to correctly process not only RDF, but the typed literal's datatype
as well. Effectively, this software must be written to process an extended
language that includes not only RDF, but also the datatype, as part of its
built-in vocabulary. This raises the issue of which datatypes will be
generally available in RDF software. Generally, the XML Schema datatypes that
are listed as suitable for use in RDF in [RDF-SEMANTICS] have a "first among
equals" status in RDF. As noted already, the examples in Figure 8 and Figure 9 used some of
these XML Schema datatypes, and the Primer will be using these datatypes in
most of its other examples of typed literals as well (for one thing, XML
Schema datatypes already have assigned URIrefs that can be used to refer to
them, specified in [XML-SCHEMA2]). These XML
Schema datatypes are treated no differently than any other datatype, but they
are expected to be the most widely used, and therefore the most likely to be
interoperable among different software. As a result, it is expected that much
RDF software will also be written to process these datatypes. However, RDF
software could be written to process other sets of datatypes as well,
assuming they were determined to be suitable for use with RDF, as described
already.
In general, RDF software may be called on to process RDF data that
contains references to datatypes that the software has not been written to
process, in which case there are some things the software will not be able to
do. For one thing, with the exception of
rdf:XMLLiteral, RDF itself does not define the URIrefs that
identify datatypes. As a result, RDF software, unless it has been written to
recognize specific URIrefs, will not be able to determine whether or not a
URIref written in a typed literal actually identifies a datatype. Moreover,
even when a URIref does identify a datatype, RDF itself does not define the
validity of pairing that datatype with a particular literal. This
validity can only be determined by software written to correctly process that
particular datatype.
For example, the typed literal in the triple:
exstaff:85740 exterms:age "pumpkin"^^xsd:integer .
or the graph shown in Figure 10:
is valid RDF, but obviously an error as far as the
xsd:integer datatype is concerned, since "pumpkin"
is not defined as being in the lexical space of xsd:integer. RDF
software not written to process the xsd:integer datatype would
not be able to recognize this error.
However, proper use of RDF typed literals provides more information about the intended interpretation of literal values, and hence makes RDF statements a better means of information exchange among applications.
Taken as a whole, RDF is basically simple: nodes-and-arcs diagrams interpreted as statements about things identified by URIrefs. This section has presented an introduction to these concepts. As noted earlier, the normative (i.e., definitive) RDF specification describing these concepts is RDF Concepts and Abstract Syntax [RDF-CONCEPTS], which should be consulted for further information. The formal semantics (meaning) of these concepts is defined in the (normative) RDF Semantics [RDF-SEMANTICS] document.
However, in addition to the basic techniques for describing things using RDF statements discussed so far, it should be clear that people or organizations also need a way to describe the vocabularies (terms) they intend to use in those statements, specifically, vocabularies for:
exterms:Person) exterms:age and
exterms:creation-date), and exterms:age property should always be an
xsd:integer). The basis for describing such vocabularies in RDF is the RDF Vocabulary Description Language 1.0: RDF Schema [RDF-VOCABULARY], which will be described in Section 5.
Additional background on the basic ideas underlying RDF, and its role in providing a general language for describing Web information, can be found in [WEBDATA]. RDF draws upon ideas from knowledge representation, artificial intelligence, and data management, including Conceptual Graphs, logic-based knowledge representation, frames, and relational databases. Some possible sources of background information on these subjects include [SOWA], [CG], [KIF], [HAYES], [LUGER], and [GRAY].
As described in Section 2, RDF's conceptual model is a graph. Turtle provides an textual syntax for writing down and exchanging RDF graphs. This syntax includes the triple notation used in this document so far, but also includes some additional syntactic means to simplify the specification of RDF graphs. Turtle is defined in the @@@Final Title of the document@@@ [TURTLE]. This section describes this syntax.
The basic ideas behind the Turtle syntax can be illustrated using some of the examples presented already. Take as an example the English statement:
http://www.example.org/index.html
has a creation-date whose value is August 16,
1999
The RDF graph for this single statement, after assigning a URIref to the
creation-date property, is shown in Figure
11:
with a triple representation of:
ex:index.html exterms:creation-date "August 16, 1999" .
(Note that a typed literal is not used for the date value in this example. Representing typed literals in Turtle will be described later in this section.)
Example 2 shows the Turtle syntax corresponding to the graph in Figure 11:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix exterms: <http://www.example.org/terms/>. <http://www.example.org/index.html> exterms:creation-date "August 16, 1999".
The only additional syntactic element to what has been used so far is the usage of the @prefix keyword in first two lines. These are the namespace declarations, ie, the association of a prefix used in the document with a URI; one line per prefix.
The Turtle syntax provides a number of abbreviations to make common uses easier to write. For example, it is typical for the same resource to be described with several properties and values at the same time. To handle such cases, Turtle allows multiple property elements representing those properties to be nested within the symbol element that identifies the subject resource. For example, to represent the following group of statements about http://www.example.org/index.html:
ex:index.html dc:creator exstaff:85740 . ex:index.html exterms:creation-date "August 16, 1999" . ex:index.html dc:language "en" .
whose graph (the same as Figure 3) is shown in Figure 12:
the Turtle version of the graph shown in Figure 12 could be written as:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix dc: <http://purl.org/dc/elements/1.1/#>.
@prefix exterms: <hhttp://www.example.org/terms/>.
<http://www.example.org/index.html>
exterms:creation-date "August 16, 1999";
dc:language "en";
dc:creator <http://www.example.org/staffid/85740>.
Compared with the previous examples, Example 3
adds an additional dc:language dc:creator
properties to the same subject. Note the usage of the ";"
character: this character separates the additional property-object pairs for
the same subject. The series of such pairs is closed with the
"." character. Note also that whitespace characters can be used anywhere to ensure a more readable format.
It is important to understand that Example 3 is an abbreviation. Example 4, in which each statement is written separately, describes exactly the same RDF graph (the graph of Figure 12):
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix dc: <http://purl.org/dc/elements/1.1/#>. @prefix exterms: <hhttp://www.example.org/terms/>. <http://www.example.org/index.html> exterms:creation-date "August 16, 1999". <http://www.example.org/index.html> dc:language "en". <http://www.example.org/index.html> dc:creator <http://www.example.org/staffid/85740>.
The following sections will describe a few additional Turtle abbreviations. The Turtle specification document[TURTLE] provides a more thorough description of the abbreviations that are available.
Turtle can also represent graphs that include nodes that have no URIrefs, i.e., the blank nodes described in Section 2.3. For example, Figure 13 (taken from [RDF-SYNTAX]) shows a graph saying "the document 'http://www.w3.org/TR/rdf-syntax-grammar' has a title 'RDF/XML Syntax Specification (Revised)' and has an editor, the editor has a name 'Dave Beckett' and a home page 'http://purl.org/net/dajobe/' ".
This illustrates an idea discussed in Section 2.3: the use of a blank node to represent something that does not have a URIref, but can be described in terms of other information. In this case, the blank node represents a person, the editor of the document, and the person is described by his name and home page.
Turtle provides several ways to represent graphs containing blank nodes.
The approach illustrated here, which is the most direct approach, is to
assign a blank node identifier to each blank node (just as for
direct triples in our previous examples). A blank node identifier serves to
identify a blank node within a particular Turtle document but, unlike a
URIref, is unknown outside the document in which it is assigned. A blank node
is referred to in Turtle using the special namespace prefix "_"
with a blank node identifier as its value, in places where the URIref of a
resource would otherwise appear. Using this facility, Example 5 shows the Turtle corresponding to Figure 13:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix dc: <http://purl.org/dc/elements/1.1/#>.
@prefix exterms: <http://www.example.org/terms/>.
<http://www.w3.org/TR/rdf-syntax-grammar>
dc:title "RDF/XML Syntax Specification (Revised)";
exterm:editor _:abc.
_:abc
exterms:fullName "Dave Beckett";
exterms:homePage <http://purl.org/net/dajobe/>.
In Example 5, the blank node identifier
abc is used to identify the blank node as the subject of several
statements, and is used separately to indicate that the blank node is the
value of a resource's exterms:editor property. The advantage of
using a blank node identifier is that using a blank node identifier allows
the same blank node to be referred to in more than one place in the same
Turtle document.
However, the reuse of the blank node from within the Turtle document is not always necessary. Turtle also provides an abbreviated syntax using anonymous blank nodes, ie, without the necessity to define local names. Example 6 shows another approach of Turtle for the serialization of the graph on Figure 13:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix dc: <http://purl.org/dc/elements/1.1/#>.
@prefix exterms: <http://www.example.org/terms/>.
<http://www.w3.org/TR/rdf-syntax-grammar>
dc:title "RDF/XML Syntax Specification (Revised)";
exterm:editor [
exterms:fullName "Dave Beckett";
exterms:homePage <http://purl.org/net/dajobe/>.
].
In Example 6, the pair of [ and
] enclose a blank node, whose identification is left to the
system (essentially the parser). Although this approach does not give the
possibility to access this blank node from other parts of the same Turtle
file, it covers a large number of cases of blank node usage.
In the rest of the Primer, the examples will use typed literals from appropriate datatypes rather than plain (untyped) literals, in order to emphasize the value of typed literals in conveying more information about the intended interpretation of literal values. (The exceptions will be that plain literals will continue to be used in examples taken from actual applications that do not currently use typed literals, in order to accurately reflect the usage in those applications.) In Turtle, both plain and typed literals (and, with certain exceptions, tags) can contain Unicode [UNICODE] characters, allowing information from many languages to be directly represented.
So far, the examples have assumed that the resources being described have
been given URIrefs already. For instance, the initial examples provided
descriptive information about example.org's Web page, whose URIref was
http://www.example.org/index.html. This resource was identified
in Turtle using its full URIref. Although RDF does not specify or control how
URIrefs are assigned to resources, sometimes it is desirable to achieve the
effect of assigning URIrefs to resources that are part of an organized group
of resources. For example, suppose a sporting goods company, example.com,
wanted to provide an RDF-based catalog of its products, such as tents, hiking
boots, and so on, as an RDF/XML document, identified by (and located at)
http://www.example.com/2002/04/products. In that resource, each
product might be given a separate RDF description. This catalog, along with
one of these descriptions, the catalog entry for a model of tent called the
"Overnighter", might be written in Turtle as shown in Example 7:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. @prefix exterms: <hhttp://www.example.org/terms/>. :item10245 exterm:models "Overnighter"^^xsd:string; exterm:sleeps "2"^^xsd:integer; exterm:weight "2.4"^^xsd:decimal; exterm:packedSize "784"^^xsd:integer. ...other product descriptions...
Example 7 is similar to previous examples in the
way it represents the properties (model, sleeping capacity, weight) of the
resource (the tent) being described. Note also that although the
datatypes associated with the various property values are given
explicitly, the units associated with some of these property values
are not, even though this information should be available to properly
interpret the values. Representing units and similar information that may be
associated with property values is discussed in Section
4.4. In this example, the value of exterms:sleeps is the
number of persons the tent can sleep, the value of
exterms:weight is given in kilograms, and the value of
exterms:packedSize is given in square centimeters, the area the
tent occupies on a backpack.)
An important difference from previous examples is the usage of
the :item10245 to identify a resource. Using a namespace
without a prefix specifies a fragment identifier, as an
abbreviation of the complete URIref of the resource being described. The
fragment identifier item10245 will be interpreted relative to a
base URI, in this case the URI of the containing catalog document.
The full URIref for the tent is formed by taking the base URI (of the
catalog), and appending the character "#" (to indicate that what
follows is a fragment identifier) and then item10245 to it,
giving the absolute URIref
http://www.example.com/2002/04/products#item10245.
This formalism is somewhat similar to the id attribute in XML
and HTML, in that it defines a name which must be unique relative to the
current base URI (in this example, that of the catalog). Any other Turtle
statement within this catalog could refer to the tent by using either the
absolute URIref
http://www.example.com/2002/04/products#item10245, or the
relative URIref :item10245. The relative URIref would
be understood as being a URIref defined relative to the base URIref of the
catalog. This use of a relative URIref can be thought of either as being an
abbreviation for a full URIref that has been assigned to the tent
independently of the RDF, or as being the assignment of the URIref to the
tent within the catalog.
RDF located outside the catalog could refer to this tent by using
the full URIref, i.e., by concatenating the relative URIref
#item10245 of the tent to the base URI of the catalog, forming
the absolute URIref
http://www.example.com/2002/04/products#item10245. For example,
an outdoor sports Web site exampleRatings.com might use RDF to provide
ratings of various tents. The (5-star) rating given to the tent described in
Example 7 might then be represented on
exampleRatings.com's Web site as shown in Example 8:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix sportex: <http://www.exampleRatings.com/terms/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
<http://www.example.com/2002/04/products#item10245>
sportex:ratingBy "Richard Roe"^^xsd:string;
sportex:numberStars "5"^^xsd:integer.
Example 8 uses a full URIref of the tent. The use of this URIref allows the tent being referred to in the rating to be precisely identified.
These examples illustrate several points. First, even though RDF does not specify or control how URIrefs are assigned to resources (in this case, the various tents and other items in the catalog), the effect of assigning URIrefs to resources in RDF can be achieved by combining a process (external to RDF) that identifies a single document (the catalog in this case) as the source for descriptions of those resources, with the use of relative URIrefs in descriptions of those resources within that document. For instance, example.com could use this catalog as the central source where its products are described, with the understanding that if a product's item number is not in an entry in this catalog, it is not a product known to example.com. (Note that RDF does not assume any particular relationship exists between two resources just because their URIrefs have the same base, or are otherwise similar. This relationship may be known to example.com, but it is not directly defined by RDF.)
These examples also illustrate one of the basic architectural principles of the Web, which is that anyone should be able to freely add information about an existing resource, using any vocabulary they please [BERNERS-LEE98]. The examples further illustrate that the RDF describing a particular resource does not need to be located all in one place; instead, it may be distributed throughout the Web. This is true not only for situations like this one, in which one organization is rating or commenting on a resource defined by another, but also for situations in which the original definer of a resource (or anyone else) wishes to amplify the description of that resource by providing additional information about it. This may be done by modifying the RDF document in which the resource was originally described, to add the properties and values needed to describe the additional information. Or, as this example illustrates, a separate document could be created, providing the additional properties and values, referring to the original resource via its URIref.
The discussion above indicated that relative URIrefs such as
:item10245 will be interpreted relative to a base URI.
By default, this base URI would be the URI of the resource in which the
relative URIref is used. However, in some cases it is desirable to be able to
explicitly specify this base URI. For instance, suppose that in addition to
the catalog located at http://www.example.com/2002/04/products,
example.org wanted to provide a duplicate catalog on a mirror site, say at
http://mirror.example.com/2002/04/products. This could create a
problem, since if the catalog was accessed from the mirror site, the URIref
for the example tent would be generated from the URI of the containing
document, forming
http://mirror.example.com/2002/04/products#item10245, rather
than http://www.example.com/2002/04/products#item10245, and
hence would apparently refer to a different resource than the one intended.
Alternatively, example.org might want to assign a base URIref for its set of
product URIrefs without publishing a single source document whose
location defines the base.
To deal with such cases, Turtle support a @base facility,
which allows a Turtle document to specify a base URI other than the URI of
the document itself. Example 9 shows how the catalog
would be described using XML Base:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix exterms: <hhttp://www.example.org/terms/>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. @base <http://www.example.com/2002/04/products>. :item10245 exterm:models "Overnighter"^^xsd:string; exterm:sleeps "2"^^xsd:integer; exterm:weight "2.4"^^xsd:decimal; exterm:packedSize "784"^^xsd:integer. ...other product descriptions...
In Example 10, the @base declaration
specifies that the base URI for the content within the file element is
http://www.example.com/2002/04/products, and all relative
URIrefs cited within that content will be interpreted relative to that base,
no matter what the URI of the containing document is. As a result, the
relative URIref of the tent, #item10245, will be interpreted as
the same absolute URIref,
http://www.example.com/2002/04/products#item10245, no matter
what the actual URI of the catalog document is, or whether the base URIref
actually identifies a particular document at all.
So far, the examples have used a single product description, a particular
model of tent, from example.com's catalog. However, example.com will probably
offer several different models of tents, as well as multiple instances of
other categories of products, such as backpacks, hiking boots, and so on.
This idea of things being classified into different kinds or
categories is similar to the programming language concept of objects
having different types or classes. RDF supports this
concept by providing a predefined property, rdf:type. When an
RDF resource is described with an rdf:type property, the value
of that property is considered to be a resource that represents a category or
class of things, and the subject of that property is considered to
be an instance of that category or class. Using
rdf:type, Example 10 shows how
example.com might indicate that the product description is that of a tent:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix exterms: <http://www.example.org/terms/>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. @base <http://www.example.com/2002/04/products>. :item10245 rdf:type <http://www.example.com/terms/Tent>; exterm:models "Overnighter"^^xsd:string; exterm:sleeps "2"^^xsd:integer; exterm:weight "2.4"^^xsd:decimal; exterm:packedSize "784"^^xsd:integer.
In Example 10, the rdf:type property
indicates that the resource being described is an instance of the class
identified by the URIref http://www.example.com/terms/Tent. This
assumes that example.com has described its classes as part of the same
vocabulary that it uses to describe its other terms (such as the property
exterms:weight), so the absolute URIref of the class is used to
refer to it. If example.com had described these classes as part of the
product catalog itself, the relative URIref :Tent could have
been used to refer to it.
RDF itself does not provide facilities for defining application-specific
classes of things, such as Tent in this example, or their
properties, such as exterms:weight. Instead, such classes would
be described in an RDF schema, using the RDF Schema
language discussed in Section 5. Other such
facilities for describing classes can also be defined, such as the
DAML+OIL and OWL languages described in Section 5.5.
It is fairly common in RDF for resources to have rdf:type
properties that describe the resources as instances of specific types or
classes. Turtle introduces a special abbreviation for the
rdf:type property using the a keyword. Using this
abbreviation, example.com's tent from Example 10
could also be described as shown in Example 11:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. @prefix exterms: <hhttp://www.example.org/terms/>. @prefix xsd: <http://www.w3.org/2001/XMLSchema#>. @base <http://www.example.com/2002/04/products>. :item10245 a <http://www.example.com/terms/Tent>; exterm:models "Overnighter"^^xsd:string; exterm:sleeps "2"^^xsd:integer; exterm:weight "2.4"^^xsd:decimal; exterm:packedSize "784"^^xsd:integer.
RDF provides a number of additional capabilities, such as built-in types and properties for representing groups of resources and RDF statements, and capabilities for representing XML fragments as property values. These additional capabilities are described in the following sections.
There is often a need to describe groups of things: for example, to say that a book was created by several authors, or to list the students in a course, or the software modules in a package. RDF provides several predefined (built-in) types and properties that can be used to describe such groups.
First, RDF provides a container vocabulary consisting of three predefined types (together with some associated predefined properties). A container is a resource that contains things. The contained things are called members. The members of a container may be resources (including blank nodes) or literals. RDF defines three types of containers:
rdf:Bag rdf:Seq rdf:Alt A Bag (a resource having type rdf:Bag) represents a group of resources or literals, possibly
including duplicate members, where there is no significance in the order of
the members. For example, a Bag might be used to describe a group of part
numbers in which the order of entry or processing of the part numbers does
not matter.
A Sequence or Seq (a resource having type
rdf:Seq) represents a group of
resources or literals, possibly including duplicate members, where the order
of the members is significant. For example, a Sequence might be used to
describe a group that must be maintained in alphabetical order.
An Alternative or Alt (a resource having type
rdf:Alt) represents a group of
resources or literals that are alternatives (typically for a single
value of a property). For example, an Alt might be used to describe
alternative language translations for the title of a book, or to describe a
list of alternative Internet sites at which a resource might be found. An
application using a property whose value is an Alt container should be aware
that it can choose any one of the members of the group as appropriate.
To describe a resource as being one of these types of containers, the
resource is given an rdf:type property whose value is one of the
predefined resources rdf:Bag, rdf:Seq, or
rdf:Alt (whichever is appropriate). The container resource
(which may either be a blank node or a resource with a URIref) denotes the
group as a whole. The members of the container can be described by
defining a container membership property for each member with the
container resource as its subject and the member as its object. These
container membership properties have names of the form rdf:_n
, where n is a decimal integer greater than zero, with no
leading zeros, e.g., rdf:_1, rdf:_2,
rdf:_3, and so on, and are used specifically for describing the
members of containers. Container resources may also have other properties
that describe the container, in addition to the container membership
properties and the rdf:type property.
It is important to understand that while these types of containers are
described using predefined RDF types and properties, any special meanings
associated with these containers, e.g., that the members of an Alt container
are alternative values, are only intended meanings. These specific
container types, and their definitions, are provided with the aim of
establishing a shared convention among those who need to describe groups of
things. All RDF does is provide the types and properties that can be used to
construct the RDF graphs to describe each type of container. RDF has no more
built-in understanding of what a resource of type rdf:Bag is
than it has of what a resource of type ex:Tent (discussed in Section 3.2) is. In each case, applications must be
written to behave according to the particular meaning involved for each type.
This point will be expanded on in the following examples.
A typical use of a container is to indicate that the value of a property
is a group of things. For example, to represent the sentence "Course 6.001
has the students Amy, Mohamed, Johann, Maria, and Phuong", the course could
be described by giving it a s:students property (from an
appropriate vocabulary) whose value is a container of type
rdf:Bag (representing the group of
students). Then, using the container membership properties, individual
students could be identified as being members of that group, as in the RDF
graph shown in Figure 14:
Since the value of the s:students property in this example is
described as a Bag, there is no intended significance in the order given for
the URIrefs of the students, even though the membership properties in the
graph have integers in their names. It is up to applications creating and
processing graphs that include rdf:Bag containers to ignore any
(apparent) order in the names of the membership properties.
Example 12 describes the graph shown in Figure 14:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix s: <http://example.org/students/vocab#>.
<http://example.org/courses/6.001>
s:students [
a rdf:Bag;
rdf:_1 <http://example.org/students/Amy>;
rdf:_2 <http://example.org/students/Mohamed>;
rdf:_3 <http://example.org/students/Johann>;
rdf:_4 <http://example.org/students/Maria>;
rdf:_5 <http://example.org/students/Phuong>.
].
The graph structure for an rdf:Seq container, and the
corresponding Turtle, are similar to those for an rdf:Bag (the
only difference is in the type, rdf:Seq). Once again, although
an rdf:Seq container is intended to describe a sequence, it is
up to applications creating and processing the graph to appropriately
interpret the sequence of integer-valued property names.
Note that this example also shows a Turtle abbreviation for the property http://www.w3.org/1999/02/22-rdf-syntax-ns#type in the form of the special predicate name "a". Because using rdf typing is very frequent, this abbreviation is very useful in practice.
To illustrate an Alt container, the sentence "The source code for X11 may be found at ftp.example.org, ftp1.example.org, or ftp2.example.org" could be expressed in the RDF graph shown in Figure 15:
Example 13 shows how the graph in Figure 15 could be written in Turtle:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix s: <http://example.org/packages/vocab#>.
<http://example.org/packages/X11>
s:DistributionSite [
a rdf:Alt;
rdf:_1 <ftp://ftp.example.org>;
rdf:_2 <ftp://ftp1.example.org>;
rdf:_3 <ftp://ftp2.example.org>.
].
An Alt container is intended to have at least one member, identified by
the property rdf:_1. This member is intended to be considered as
the default or preferred value. Other than the member identified as
rdf:_1, the order of the remaining elements is not significant.
The RDF in Figure 15 as written states
simply that the value of the s:DistributionSite site property is
the Alt container resource itself. Any additional meaning that is to be read
into this graph, e.g., that one of the members of the Alt container
is to be considered as the value of the s:DistributionSite site
property, or that ftp://ftp.example.org is the default or
preferred value, must be built into an application's understanding of the
intended meaning of an Alt container, and/or into the meaning defined for the
particular property (s:DistributionSite in this case), which
also must be understood by the application.
Alt containers are frequently used in conjunction with language tagging.
(Turtle permits the use of a language tag to indicate that the element
content is in a specified language. The use of this tag is described in [TURTLE], and illustrated later in Section 6.2.) For example, a work whose title has been
translated into several languages might have its title property
pointing to an Alt container holding literals representing the titles
expressed in each of the language variants.
The distinction between the intended meanings of a Bag and an Alt can be further illustrated by considering the authorship of the book "Huckleberry Finn". The book has exactly one author, but the author has two names (Mark Twain and Samuel Clemens). Either name is sufficient to specify the author. Thus using an Alt container for the author's names more accurately represents the relationship than using a Bag (which might suggest there are two different authors).
Users are free to choose their own ways to describe groups of resources, rather than using the RDF container vocabulary. These RDF containers are merely provided as common definitions that, if generally used, could help make data involving groups of resources more interoperable.
Sometimes there are clear alternatives to using these RDF container types. For example, a relationship between a particular resource and a group of other resources could be indicated by making the first resource the subject of multiple statements using the same property. This is structurally different from the resource being the subject of a single statement whose object is a container containing multiple members. In some cases, these two structures may have equivalent meaning, but in other cases they may not. The choice of which to use in a given situation should be made with this in mind.
Consider as an example the relationship between a writer and her publications, as in the sentence:
Sue has written "Anthology of Time", "Zoological Reasoning", and "Gravitational Reflections".
In this case, there are three resources each of which was written independently by the same writer. This could be expressed using repeated properties as:
exstaff:Sue exterms:publication ex:AnthologyOfTime . exstaff:Sue exterms:publication ex:ZoologicalReasoning . exstaff:Sue exterms:publication ex:GravitationalReflections .
In this example there is no stated relationship between the publications other than that they were written by the same person. Each of the statements is an independent fact, and so using repeated properties would be a reasonable choice. However, this could just as reasonably be represented as a statement about the group of resources written by Sue:
exstaff:Sue exterms:publication _:z _:z rdf:type rdf:Bag . _:z rdf:_1 ex:AnthologyOfTime . _:z rdf:_2 ex:ZoologicalReasoning . _:z rdf:_3 ex:GravitationalReflections .
On the other hand, the sentence:
The resolution was approved by the Rules Committee, having members Fred, Wilma, and Dino.
says that the committee as a whole approved the resolution; it
does not necessarily state that each committee member individually
voted in favor of the resolution. In this case, it would be potentially
misleading to model this sentence as three separate
exterms:approvedBy statements, one for each committee member, as
shown below:
ex:resolution exterms:approvedBy ex:Fred . ex:resolution exterms:approvedBy ex:Wilma . ex:resolution exterms:approvedBy ex:Dino .
since these statements say that each member individually approved the resolution.
In this case, it would be better to model the sentence as a single
exterms:approvedBy statement whose subject is the resolution and
whose object is the committee itself. The committee resource could then be
described as a Bag whose members are the members of the committee, as in the
following triples:
ex:resolution exterms:approvedBy ex:rulesCommittee . ex:rulesCommittee rdf:type rdf:Bag . ex:rulesCommittee rdf:_1 ex:Fred . ex:rulesCommittee rdf:_2 ex:Wilma . ex:rulesCommittee rdf:_3 ex:Dino .
When using RDF containers, it is important to understand that the
statements are not constructing containers, as in a programming
language data structure. Instead, the statements are describing
containers (groups of things) that presumably exist. For instance, in the
Rules Committee example just given, the Rules Committee is an unordered group
of people, whether it is described in RDF that way or not. Saying that the
resource ex:rulesCommittee has type rdf:Bag is not
saying that the Rules Committee is a data structure, or constructing a
particular data structure to hold the members of the group (the Rules
Committee could be described as a Bag without describing any members at all).
Instead, it is describing the Rules Committee as having characteristics
corresponding to those associated with a Bag container, namely that it has
members, and their order of description is not significant. Similarly, using
the container membership properties simply describes a container resource as
having certain things as members. This does not necessarily say that the
things described as members are the only members that exist. For
example, the triples given above to describe the Rules Committee say only
that Fred, Wilma, and Dino are members of the committee, not that they are
the only members of the committee.
Also, Example 12 and Example
13 illustrated a common "pattern" in describing containers, regardless of
the type of container involved (e.g., use of a blank node with an appropriate
rdf:type property to represent the container itself, and use of
rdf:_n to generate sequentially-numbered container membership
properties). However, it is important to understand that RDF does not
enforce this particular way of using the RDF container vocabulary,
and so it is possible to use this vocabulary in other ways. For example, in
some cases it might be appropriate to use a container resource having a
URIref rather than using a blank node. Moreover, it is possible to use the container vocabulary in
ways that may not describe graphs with the "well-formed" structures shown in
the previous examples. For example, Example 14 shows
the Turtle syntax for a graph similar to the Alt container shown in Figure 15:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix s: <http://example.org/packages/vocab#>.
<http://example.org/packages/X11>
s:DistributionSite [
a rdf:Alt;
a rdf:Bag;
rdf:_2 <ftp://ftp.example.org>;
rdf:_2 <ftp://ftp1.example.org>;
rdf:_5 <ftp://ftp2.example.org>
].
As noted in [RDF-SEMANTICS], RDF imposes
no "well-formedness" conditions on the use of the container vocabulary, so Example 14 is perfectly legal, even though the
container is described as both a Bag and an Alt, it is described as
having two distinct values of the rdf:_2 property, and it does
not have rdf:_1, rdf:_3, or rdf:_4
properties.
As a result, RDF applications that require containers to be "well-formed" should be written to check that the container vocabulary is being used appropriately, in order to be fully robust.
A limitation of the containers described in Section
4.1 is that there is no way to close them, i.e., to say "these
are all the members of the container". As noted in Section 4.1, a container only says that certain
identified resources are members; it does not say that other members do not
exist. Also, while one graph may describe some of the members, there is no
way to exclude the possibility that there is another graph somewhere that
describes additional members. RDF provides support for describing groups
containing only the specified members, in the form of RDF
collections. An RDF collection is a group of things represented as a
list structure in the RDF graph. This list structure is constructed using a
predefined collection vocabulary consisting of the predefined type
rdf:List, the predefined properties rdf:first and
rdf:rest, and the predefined resource rdf:nil.
To illustrate this, the sentence "The students in course 6.001 are Amy, Mohamed, and Johann" could be represented using the graph shown in Figure 16:
In this graph, each member of the collection, such as s:Amy,
is the object of an rdf:first property whose subject is a
resource (a blank node in this example) that represents a list. This list
resource is linked to the rest of the list by an rdf:rest
property. The end of the list is indicated by the rdf:rest
property having as its object the resource rdf:nil (the resource
rdf:nil represents the empty list, and is defined as being of
type rdf:List). This structure will be familiar to those who
know the Lisp programming language. As in Lisp, the rdf:first
and rdf:rest properties allow applications to traverse the
structure. Each of the blank nodes forming this list structure is implicitly
of type rdf:List (that is, each of these nodes implicitly has an
rdf:type property whose value is the predefined type
rdf:List), although this is not explicitly shown in the graph.
The RDF Schema language [RDF-VOCABULARY]
defines the properties rdf:first and rdf:rest as
having subjects of type rdf:List, so the information about these
nodes being lists can generally be inferred, rather than the corresponding
rdf:type triples being written out all the time.
Turtle provides a special notation to make it easy to describe collections using graphs of this form. To illustrate how this notation works, the Turtle from Example 15 would result in the RDF graph shown in Figure 16:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix s: <http://example.org/students/vocab#>.
<http://example.org/courses/6.001>
s:students (
<http://example.org/students/Amy>
<http://example.org/students/Mohamed>
<http://example.org/students/Johann>
).
The use of Turtle abbreviation for lists always defines a list structure
like the one shown in Figure 16, i.e., a fixed finite
list of items with a given length and terminated by rdf:nil, and
which uses "new" blank nodes that are unique to the list structure itself.
However, RDF does not enforce this particular way of using the RDF
collection vocabulary, and so it is possible to use this vocabulary in other
ways, some of which may not describe lists or closed collections. To see why,
note that the graph shown in Figure 16 could also be
written in Turtle by writing out the same triples "in longhand" (without
using the special Turtle notation) using the collection vocabulary, as in Example 16:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix s: <http://example.org/students/vocab#>.
<http://example.org/courses/6.001> s:student _:sch1.
_:sch1
rdf:first <http://example.org/students/Amy>;
rdf:rest _:sch2.
_:sch2
rdf:first <http://example.org/students/Mohamed>;
rdf:rest _:sch3.
_:sch3
rdf:first <http://example.org/students/Johann>;
rdf:rest rdf:nil.
As noted in [RDF-SEMANTICS] (and as was
the case for the container vocabulary described in Section 4.1), RDF
imposes no "well-formedness" conditions on the use of the collection
vocabulary so, when writing triples in longhand, it is possible to define RDF
graphs with structures other than the well-structured graphs that would be
automatically generated by using list notation of Turtle. For example, it is
not illegal to assert that a given node has two distinct values of the
rdf:first property, to create structures that have forked or
non-list tails, or to simply omit part of the description of a collection.
Also, graphs defined by using the collection vocabulary in longhand could use
URIrefs to identify the components of the list instead of blank nodes unique
to the list structure. In this case, it would be possible to create triples
in other graphs that effectively added elements to the collection, making it
non-closed.
As a result, RDF applications that require collections to be well-formed should be written to check that the collection vocabulary is being used appropriately, in order to be fully robust. In addition, languages such as OWL [OWL], which can define additional constraints on the structure of RDF graphs, can rule out some of these cases.
RDF applications sometimes need to describe other RDF statements using RDF, for instance, to record information about when statements were made, who made them, or other similar information (this is sometimes referred to as "provenance" information). For example, Ex