W3C

HTML 5

A vocabulary and associated APIs for HTML and XHTML

This is revision 1.2852.

5 Microdata

Status: First draft. ISSUE-76 (Microdata/RDFa) blocks progress to Last Call

5.1 Introduction

5.1.1 Overview

This section is non-normative.

Sometimes, it is desirable to annotate content with specific machine-readable labels, e.g. to allow generic scripts to provide services that are customised to the page, or to enable content from a variety of cooperating authors to be processed by a single script in a consistent manner.

For this purpose, authors can use the microdata features described in this section. Microdata allows nested groups of name-value pairs to be added to documents, in parallel with the existing content.

5.1.2 The basic syntax

This section is non-normative.

At a high level, microdata consists of a group of name-value pairs. The groups are called items, and each name-value pair is a property. Items and properties are represented by regular elements.

To create an item, the item attribute is used.

To add a property to an item, the itemprop attribute is used on one of the item's descendants.

Here there are two items, each of which has the property "name":

<div item>
 <p>My name is <span itemprop="name">Elizabeth</span>.</p>
</div>

<div item>
 <p>My name is <span itemprop="name">Daniel</span>.</p>
</div>

Properties generally have values that are strings.

Here the item has three properties:

<div item>
 <p>My name is <span itemprop="name">Neil</span>.</p>
 <p>My band is called <span itemprop="band">Four Parts Water</span>.</p>
 <p>I am <span itemprop="nationality">British</span>.</p>
</div>

Properties can also have values that are URLs. This is achieved using the a element and its href attribute, the img element and its src attribute, or other elements that link to or embed external resources.

In this example, the item has one property, "image", whose value is a URL:

<div item>
 <img itemprop="image" src="google-logo.png" alt="Google">
</div>

Properties can also have values that are dates, times, or dates and times. This is achieved using the time element and its datetime attribute.

In this example, the item has one property, "birthday", whose value is a date:

<div item>
 I was born on <time itemprop="birthday" datetime="2009-05-10">May 10th 2009</time>.
</div>

Properties can also themselves be groups of name-value pairs, by putting the item attribute on the element that declares the property.

Items that are not part of others are called top-level microdata items.

In this example, the outer item represents a person, and the inner one represents a band:

<div item>
 <p>Name: <span itemprop="name">Amanda</span></p>
 <p>Band: <span itemprop="band" item> <span itemprop="name">Jazz Band</span> (<span itemprop="size">12</span> players)</span></p>
</div>

The outer item here has two properties, "name" and "band". The "name" is "Amanda", and the "band" is an item in its own right, with two properties, "name" and "size". The "name" of the band is "Jazz Band", and the "size" is "12".

The outer item in this example is a top-level microdata item.

Properties don't have to be given as descendants of the element with the item attribute. They can be associated with a specific item using the subject attribute, which takes the ID of the element with the item attribute.

This example is the same as the previous one, but all the properties are separated from their items:

<div item id="amanda"></div>
<p>Name: <span subject="amanda" itemprop="name">Amanda</span></p>
<div subject="amanda" itemprop="band" item id="jazzband"></div>
<p>Band: <span subject="jazzband" itemprop="name">Jazz Band</span></p>
<p>Size: <span subject="jazzband" itemprop="size">12</span> players</p>

This gives the same result as the previous example. The first item has two properties, "name", set to "Amanda", and "band", set to another item. That second item has two further properties, "name", set to "Jazz Band", and "size", set to "12".
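As a rough illustration, the data that both versions of this markup describe can be pictured as plain JavaScript objects. This is only a sketch: the object shape (a `properties` map from names to lists of values) and the variable names are assumptions made for illustration and are not part of the microdata DOM API.

```javascript
// Hypothetical plain-data picture of the "Amanda" example.
// An item is modelled as { properties: { name: [values...] } };
// each value is either a string or another item.
const jazzBandItem = {
  properties: {
    name: ['Jazz Band'],
    size: ['12'],
  },
};

const amandaItem = {
  properties: {
    name: ['Amanda'],
    band: [jazzBandItem], // the value of "band" is itself an item
  },
};

// Only the outer item is a top-level microdata item.
const topLevel = [amandaItem];
```

Each name maps to a list because an item can have several properties with the same name.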

An item can have multiple properties with the same name and different values.

This example describes an ice cream, with two flavors:

<div item>
 <p>Flavors in my favorite ice cream:</p>
 <ul>
  <li itemprop="flavor">Lemon sorbet</li>
  <li itemprop="flavor">Apricot sorbet</li>
 </ul>
</div>

This results in an item with two properties, both "flavor", having the values "Lemon sorbet" and "Apricot sorbet".

An element introducing a property can also introduce multiple properties at once, to avoid duplication when some of the properties have the same value.

Here we see an item with two properties, "favorite-color" and "favorite-fruit", both set to the value "orange":

<div item>
 <span itemprop="favorite-color favorite-fruit">orange</span>
</div>

It's important to note that there is no relationship between the microdata and the content of the document where the microdata is marked up.

There is no semantic difference, for instance, between the following two examples:

<figure>
 <img src="castle.jpeg">
 <legend><span item><span itemprop="name">The Castle</span></span> (1986)</legend>
</figure>

<span item><meta itemprop="name" content="The Castle"></span>
<figure>
 <img src="castle.jpeg">
 <legend>The Castle (1986)</legend>
</figure>

Both have a figure with a caption, and both, completely unrelated to the figure, have an item with a name-value pair with the name "name" and the value "The Castle". The only difference is that if the user drags the caption out of the document, in the former case, the item will be included in the drag-and-drop data. In neither case is the image in any way associated with the item.

5.1.3 Typed items

This section is non-normative.

The examples in the previous section show how information could be marked up on a page that doesn't expect its microdata to be re-used. Microdata is most useful, though, when it is used in contexts where other authors and readers are able to cooperate to make new uses of the markup.

For this purpose, it is necessary to give each item a type, such as "com.example.person", or "org.example.cat", or "net.example.band". Types are identified in three ways:

  • URLs
  • Reversed DNS labels
  • Predefined types

URLs are self-explanatory. Reversed DNS labels are strings such as "org.example.animals.cat" or "com.example.band".

The type for an item is given as the value of the item attribute.

Here, the item is "org.example.animals.cat":

<section item="org.example.animals.cat">
 <h1 itemprop="name">Hedral</h1>
 <p itemprop="desc">Hedral is a male american domestic
 shorthair, with a fluffy black fur with white paws and belly.</p>
 <img itemprop="img" src="hedral.jpeg" alt="" title="Hedral, age 18 months">
</section>

In this example the "org.example.animals.cat" item has three properties, a "name" ("Hedral"), a "desc" ("Hedral is..."), and an "img" ("hedral.jpeg").

An item can only have one type. The type gives the context for the properties: a property named "class" given for an item with the type "com.example.census.person" might refer to the class of an individual, while a property named "class" given for an item with the type "com.example.school.teacher" might refer to the classroom a teacher has been assigned.

5.1.4 Selecting names when defining vocabularies

This section is non-normative.

Using microdata means using a vocabulary. For some purposes, an ad-hoc vocabulary is adequate. For others, a vocabulary will need to be designed. Where possible, authors are encouraged to re-use existing vocabularies, as this makes content re-use easier.

When designing new vocabularies, identifiers can be created using URLs, reversed DNS labels, or, for properties, plain words (with no dots or colons). For URLs, conflicts with other vocabularies can be avoided by only using identifiers that correspond to pages that the author has control over. Similarly, for reversed DNS labels, conflicts can be avoided by using a domain name that the author has control over, or by using suffixes that correspond to the path components of pages that the author has control over.

For instance, if Jon and Adam both write content at example.com, at http://example.com/jon/... and http://example.com/adam/... respectively, then they could select identifiers of the form "com.example.jon.name" and "com.example.adam.name" respectively.
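The naming scheme in this example can be sketched as a small helper that reverses the host labels of a page URL and appends its path components. The function name `reversedDnsPrefix` is hypothetical, and the sketch relies on the `URL` constructor found in current JavaScript environments, which is an assumption and not something this section defines.

```javascript
// Hypothetical helper: derive a reversed DNS prefix from a page URL by
// reversing the host labels and appending the non-empty path components.
function reversedDnsPrefix(pageUrl) {
  const url = new URL(pageUrl);
  const hostLabels = url.hostname.split('.').reverse();
  const pathLabels = url.pathname.split('/').filter(function (part) {
    return part !== '';
  });
  return hostLabels.concat(pathLabels).join('.');
}

// Identifiers for a "name" property in Jon's and Adam's vocabularies.
const jonName = reversedDnsPrefix('http://example.com/jon/') + '.name';
const adamName = reversedDnsPrefix('http://example.com/adam/') + '.name';
```

Because the two authors control different paths, the two identifiers cannot collide.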

Properties whose names are just plain words can only be used within the context of the types for which they are intended; properties named using URLs or reversed DNS labels can be reused in items of any type. If an item has no type, and is not part of another item, then if its properties have names that are just plain words, they are not intended to be globally unique, and are instead only intended for limited use. Generally speaking, authors are encouraged either to use properties with globally unique names (URLs, reversed DNS labels) or to ensure that their items are typed.

Here, an item is an "org.example.animals.cat", and most of the properties have names that are words defined in the context of that type. There are also a few additional properties whose names come from other vocabularies.

<section item="org.example.animals.cat">
 <h1 itemprop="name com.example.fn">Hedral</h1>
 <p itemprop="desc">Hedral is a male american domestic
 shorthair, with a fluffy <span
 itemprop="com.example.color">black</span> fur with <span
 itemprop="com.example.color">white</span> paws and belly.</p>
 <img itemprop="img" src="hedral.jpeg" alt="" title="Hedral, age 18 months">
</section>

This example has one item, of type "org.example.animals.cat", with the following properties:

Property            Value
name                Hedral
com.example.fn      Hedral
desc                Hedral is a male american domestic shorthair, with a fluffy black fur with white paws and belly.
com.example.color   black
com.example.color   white
img                 .../hedral.jpeg

5.1.5 Predefined vocabularies

ISSUE-73 (predefined-voc) blocks progress to Last Call

This section is non-normative.

To make the most common tasks simpler, certain vocabularies have been predefined. These use short names for types and properties.

For example, the vCard vocabulary can be used to mark up people's names:

<span item=vcard><span itemprop=fn>George Washington</span></span>

This creates a single item with a single name-value pair, with the name "fn" and the value "George Washington". This is defined to map to the following vCard:

BEGIN:VCARD
PROFILE:VCARD
VERSION:3.0
SOURCE:document's address
FN:George Washington
N:Washington;George;;;
END:VCARD

5.1.6 Using the microdata DOM API

This section is non-normative.

The microdata becomes even more useful when scripts can use it to expose information to the user, for example offering it in a form that can be used by other applications.

The document.getItems(typeNames) method provides access to the top-level microdata items. It returns a NodeList containing the items with the specified types, or all types if no argument is specified.

Each item is represented in the DOM by the element on which the relevant item attribute is found. The type of that element can be obtained using the element.item DOM attribute.

This sample shows how the getItems() method can be used to obtain a list of all the top-level microdata items of one type given in the document:

var cats = document.getItems("com.example.feline");

Once an element representing an item has been obtained, its properties can be extracted using the properties DOM attribute. This attribute returns an HTMLPropertyCollection, which can be enumerated to go through each element that adds one or more properties to the item. It can also be indexed by name, which will return an object with a list of the elements that add properties with that name.

Each element that adds a property also has a content DOM attribute that returns its value.

This sample gets the first item of type "net.example.user" and then pops up an alert using the "name" property from that item.

var user = document.getItems('net.example.user')[0];
alert('Hello ' + user.properties['name'][0].content + '!');

The HTMLPropertyCollection object, when indexed by name in this way, actually returns a PropertyNodeList object with all the matching properties. The PropertyNodeList object can be used to obtain all the values at once using its contents attribute, which returns an array of all the values.

In an earlier example, an "org.example.animals.cat" item had two "com.example.color" values. This script looks up the first such item and then lists all its values.

var cat = document.getItems('org.example.animals.cat')[0];
var colors = cat.properties['com.example.color'].contents;
var result;
if (colors.length == 0) {
  result = 'Color unknown.';
} else if (colors.length == 1) {
  result = 'Color: ' + colors[0];
} else {
  result = 'Colors:';
  for (var i = 0; i < colors.length; i += 1)
    result += ' ' + colors[i];
}

It's also possible to get a list of all the property names using the object's names DOM attribute.

This example creates a big list with a nested list for each item on the page, each listing all the property names used in that item.

var outer = document.createElement('ul');
// getItems() with no argument returns all the top-level microdata items.
var items = document.getItems();
for (var item = 0; item < items.length; item += 1) {
  var itemLi = document.createElement('li');
  var inner = document.createElement('ul');
  // The names attribute lists each property name of the item once.
  var names = items[item].properties.names;
  for (var name = 0; name < names.length; name += 1) {
    var propLi = document.createElement('li');
    propLi.appendChild(document.createTextNode(names[name]));
    inner.appendChild(propLi);
  }
  itemLi.appendChild(inner);
  outer.appendChild(itemLi);
}
document.body.appendChild(outer);

If faced with the following from an earlier example:

<section item="org.example.animals.cat">
 <h1 itemprop="name com.example.fn">Hedral</h1>
 <p itemprop="desc">Hedral is a male american domestic
 shorthair, with a fluffy <span
 itemprop="com.example.color">black</span> fur with <span
 itemprop="com.example.color">white</span> paws and belly.</p>
 <img itemprop="img" src="hedral.jpeg" alt="" title="Hedral, age 18 months">
</section>

...it would result in the following output:

  •
      ◦ name
      ◦ com.example.fn
      ◦ desc
      ◦ com.example.color
      ◦ img

(The duplicate occurrence of "com.example.color" is not included in the list.)

5.2 Encoding microdata

5.2.1 The microdata model

The microdata model consists of groups of name-value pairs known as items.

Each group has at most one type, each name has one or more values, and each value is either a string or another group of name-value pairs.

5.2.2 Items: the item attribute

Every HTML element may have an item attribute specified.

An element with the item attribute specified creates a new item, a group of name-value pairs.

The attribute, if specified, must have a value that is either:

The item type of an element with an item attribute is the value of the element's item attribute. If the attribute's value is the empty string, the element is said to have no item type.

5.2.3 Associating names with items

The subject attribute may be specified on any HTML element to associate the element with an element with an item attribute. If the subject attribute is specified, the attribute's value must be the ID of an element with an item attribute, in the same Document as the element with the subject attribute.

An element's corresponding item is determined by its position in the DOM and by any subject attributes on the element, and is defined as follows:

If the element has a subject attribute

If there is an element in the document with an ID equal to the value of the subject attribute, and if the first such element has an item attribute specified, then that element is the corresponding item. Otherwise, there is no corresponding item.

If the element has no subject attribute but does have an ancestor with an item attribute specified

The nearest ancestor element with the item attribute specified is the element's corresponding item.

If the element has neither subject attribute nor an ancestor with an item attribute specified

The element has no corresponding item.

The list of elements that create items but do not themselves have a corresponding item forms the list of top-level microdata items.
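The rules above can be sketched against a simplified element model. This is an illustration under stated assumptions, not the DOM: each element is a plain object with optional `id`, `hasItem`, `subject`, and `parent` fields, and the names `correspondingItem` and `findTopLevelItems` are hypothetical.

```javascript
// Hypothetical element model: { id, hasItem, subject, parent }.
// `elements` lists every element of the document in tree order.
function correspondingItem(el, elements) {
  if (el.subject !== undefined) {
    // Use the first element whose ID equals the subject value.
    const target = elements.find(function (candidate) {
      return candidate.id === el.subject;
    });
    return target !== undefined && target.hasItem ? target : null;
  }
  // Otherwise walk up to the nearest ancestor with an item attribute.
  for (let node = el.parent; node; node = node.parent) {
    if (node.hasItem) return node;
  }
  return null;
}

// Elements that create items but have no corresponding item themselves.
function findTopLevelItems(elements) {
  return elements.filter(function (el) {
    return el.hasItem && correspondingItem(el, elements) === null;
  });
}

// The earlier "Amanda" markup that uses subject attributes, in this model.
const amandaEl = { id: 'amanda', hasItem: true, parent: null };
const nameEl = { subject: 'amanda', parent: null };
const jazzbandEl = { id: 'jazzband', hasItem: true, subject: 'amanda', parent: null };
const sizeEl = { subject: 'jazzband', parent: null };
const modelDocument = [amandaEl, nameEl, jazzbandEl, sizeEl];
```

In this model, `findTopLevelItems(modelDocument)` contains only the element with the ID "amanda", since the "jazzband" element has a corresponding item through its subject attribute.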

5.2.4 Names: the itemprop attribute

Every HTML element that has a corresponding item may have an itemprop attribute specified.

An element with the itemprop attribute specified adds one or more name-value pairs to its corresponding item.

The itemprop attribute, if specified, must have a value that is an unordered set of unique space-separated tokens representing the names of the name-value pairs that it adds. The attribute's value must have at least one token.

Each token must be either:

The property names of an element are the tokens that the element's itemprop attribute is found to contain when its value is split on spaces, with the order preserved but with duplicates removed (leaving only the first occurrence of each name).
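A minimal sketch of this splitting step follows. The function name `propertyNames` is hypothetical, and the sketch assumes that "spaces" means the HTML space characters (space, tab, line feed, form feed, and carriage return).

```javascript
// Split an itemprop value on spaces, keeping the original order and
// dropping every occurrence of a name after the first.
function propertyNames(itempropValue) {
  const seen = new Set();
  const names = [];
  for (const token of itempropValue.split(/[ \t\n\f\r]+/)) {
    if (token !== '' && !seen.has(token)) {
      seen.add(token);
      names.push(token);
    }
  }
  return names;
}
```

For instance, `propertyNames('favorite-color favorite-fruit favorite-color')` yields the two names "favorite-color" and "favorite-fruit", in that order.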

Within an item, the properties are unordered with respect to each other, except for properties with the same name, which are ordered in tree order.

In the following example, the "a" property has the values "1" and "2", in that order, but whether the "a" property comes before the "b" property or not is not important:

<div item>
 <p itemprop="a">1</p>
 <p itemprop="a">2</p>
 <p itemprop="b">test</p>
</div>

Thus, the following is equivalent:

<div item>
 <p itemprop="b">test</p>
 <p itemprop="a">1</p>
 <p itemprop="a">2</p>
</div>

As is the following:

<div item>
 <p itemprop="a">1</p>
 <p itemprop="b">test</p>
 <p itemprop="a">2</p>
</div>
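The equivalence of these three fragments can be checked with a small sketch that groups values by name and compares only the order within each name. The representation (a list of `[name, value]` pairs in tree order, with string values only) and the function names are assumptions made for illustration.

```javascript
// Group the values of each property name, preserving tree order per name.
function groupByName(pairs) {
  const groups = new Map();
  for (const [name, value] of pairs) {
    if (!groups.has(name)) groups.set(name, []);
    groups.get(name).push(value);
  }
  return groups;
}

// Two property lists are equivalent when every name has the same values
// in the same order, regardless of how different names are interleaved.
function sameProperties(pairsA, pairsB) {
  const a = groupByName(pairsA);
  const b = groupByName(pairsB);
  if (a.size !== b.size) return false;
  for (const [name, values] of a) {
    const other = b.get(name);
    if (other === undefined || other.length !== values.length) return false;
    for (let i = 0; i < values.length; i += 1) {
      if (values[i] !== other[i]) return false;
    }
  }
  return true;
}

// The three fragments above, followed by one that swaps the "a" values.
const firstPairs = [['a', '1'], ['a', '2'], ['b', 'test']];
const secondPairs = [['b', 'test'], ['a', '1'], ['a', '2']];
const thirdPairs = [['a', '1'], ['b', 'test'], ['a', '2']];
const swappedPairs = [['a', '2'], ['a', '1'], ['b', 'test']];
```

Here the first three lists compare as equivalent, while `swappedPairs` does not, because it changes the order of the two "a" values.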

5.2.5 Values

The property value of a name-value pair added by an element with an itemprop attribute depends on the element, as follows:

If the element also has an item attribute

The value is the item created by the element.

If the element is a meta element

The value is the value of the element's content attribute, if any, or the empty string if there is no such attribute.

If the element is an audio, embed, iframe, img, source, or video element

The value is the absolute URL that results from resolving the value of the element's src attribute relative to the element at the time the attribute is set, or the empty string if there is no such attribute or if resolving it results in an error.

If the element is an a, area, or link element

The value is the absolute URL that results from resolving the value of the element's href attribute relative to the element at the time the attribute is set, or the empty string if there is no such attribute or if resolving it results in an error.

If the element is an object element

The value is the absolute URL that results from resolving the value of the element's data attribute relative to the element at the time the attribute is set, or the empty string if there is no such attribute or if resolving it results in an error.

If the element is a time element with a datetime attribute

The value is the value of the element's datetime attribute.

Otherwise

The value is the element's textContent.
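The cases above can be sketched as a single function over a simplified element model. Everything here is an assumption for illustration: elements are plain objects with `tagName`, `attributes`, `textContent`, and `hasItem` fields, URL resolution is approximated with the `URL` constructor against a caller-supplied base URL, and the function names are hypothetical.

```javascript
// Elements whose value comes from src or href, per the cases above.
const SRC_ELEMENTS = ['audio', 'embed', 'iframe', 'img', 'source', 'video'];
const HREF_ELEMENTS = ['a', 'area', 'link'];

// Resolve a possibly relative URL; the empty string stands for a missing
// attribute or a resolution error.
function resolveUrl(value, baseUrl) {
  if (value === undefined) return '';
  try {
    return new URL(value, baseUrl).href;
  } catch (error) {
    return '';
  }
}

// Hypothetical element model: { tagName, attributes, textContent, hasItem }.
function propertyValue(el, baseUrl) {
  const attrs = el.attributes || {};
  if (el.hasItem) return el; // the value is the item itself
  if (el.tagName === 'meta') {
    return attrs.content !== undefined ? attrs.content : '';
  }
  if (SRC_ELEMENTS.includes(el.tagName)) return resolveUrl(attrs.src, baseUrl);
  if (HREF_ELEMENTS.includes(el.tagName)) return resolveUrl(attrs.href, baseUrl);
  if (el.tagName === 'object') return resolveUrl(attrs.data, baseUrl);
  if (el.tagName === 'time' && attrs.datetime !== undefined) {
    return attrs.datetime;
  }
  return el.textContent;
}
```

The order of the checks mirrors the order of the cases: an element that also has an item attribute yields the item, whatever its tag name.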

The URL property elements are the a, area, audio, embed, iframe, img, link, object, source, and video elements.

If a property's value is an absolute URL, the property must be specified using a URL property element.

5.3 Microdata DOM API

document . getItems( [ types ] )

Returns a NodeList of the elements in the Document that create items, that are not part of other items, and that are of one of the types given in the argument, if any are listed.

The types argument is interpreted as a space-separated list of types.

element . properties

If the element has an item attribute, returns an HTMLPropertyCollection object with all the element's properties. Otherwise, returns an empty HTMLPropertyCollection object.

element . content [ = value ]

Returns the element's value.

Can be set, to change the element's value.

The document.getItems(typeNames) method takes an optional string that contains an unordered set of unique space-separated tokens representing types. When called, the method must return a live NodeList object containing all the elements in the document, in tree order, that are each top-level microdata items with a type equal to one of the types specified in that argument, having obtained the types by splitting the string on spaces. If there are no tokens specified in the argument, or if the argument is missing, then the method must return a NodeList containing all the top-level microdata items in the document.
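The filtering behavior described for getItems() can be sketched over a plain array. This is a simplified model under stated assumptions: each top-level item is an object with a `type` string (empty for untyped items), the input array is already in tree order, and the result is a snapshot array and not a live NodeList. The name `filterItemsByType` is hypothetical.

```javascript
// Return the top-level items whose type matches one of the space-separated
// tokens, or all of them when no tokens are given.
function filterItemsByType(topLevelItems, typeNames) {
  const tokens = (typeNames === undefined ? '' : String(typeNames))
    .split(/[ \t\n\f\r]+/)
    .filter(function (token) {
      return token !== '';
    });
  if (tokens.length === 0) return topLevelItems.slice();
  return topLevelItems.filter(function (item) {
    return tokens.includes(item.type);
  });
}

// Three hypothetical top-level items, in tree order.
const sampleItems = [
  { type: 'org.example.animals.cat' },
  { type: 'net.example.user' },
  { type: '' },
];
```

Calling `filterItemsByType(sampleItems, 'net.example.user')` keeps only the second item, while a missing or all-whitespace argument keeps all three.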

The item DOM attribute on elements must reflect the element's item content attribute.

The itemprop DOM attribute on elements must reflect the element's itemprop content attribute.

The properties DOM attribute on elements must return an HTMLPropertyCollection rooted at the Document node, whose filter matches only elements that have property names and have a corresponding item that is equal to the element on which the attribute was invoked.

The content DOM attribute's behavior depends on the element, as follows:

If the element is a meta element

The attribute must act as it would if it was reflecting the element's content content attribute.

If the element is an audio, embed, iframe, img, source, or video element

The attribute must act as it would if it was reflecting the element's src content attribute.

If the element is an a, area, or link element

The attribute must act as it would if it was reflecting the element's href content attribute.

If the element is an object element

The attribute must act as it would if it was reflecting the element's data content attribute.

If the element is a time element with a datetime attribute

The attribute must act as it would if it was reflecting the element's datetime content attribute.

Otherwise

The attribute must act the same as the element's textContent attribute.

The subject DOM attribute on elements must reflect the element's subject content attribute.

5.4 Predefined vocabularies

ISSUE-73 (predefined-voc) blocks progress to Last Call

A number of predefined types exist, for describing common structures. Each such type has a set of predefined property names that are used to describe data of that type. In addition, there are some predefined global property names that can be used for any item.

5.4.1 General

The predefined global property name about can be used to name an item for the purposes of identifying or referring to the data defined in that item.

A single property with the name about may be present within each item. Its value must be an absolute URL.

5.4.2 vCard

An item with the predefined type vcard represents a person's or organization's contact information.

The following are the type's predefined property names. They are based on the vocabulary defined in the vCard specification and its extensions, where more information on how to interpret the values can be found. [RFC2426] [RFC4770]

fn

Gives the formatted text corresponding to the name of the person or organization.

The value must be text.

Exactly one property with the name fn must be present within each item with the type vcard.

n

Gives the structured name of the person or organization.

The value must be an item with zero or more of each of the family-name, given-name, additional-name, honorific-prefix, and honorific-suffix properties.

Unless one of the conditions given below applies, exactly one property with the name n must be present within each item with the type vcard.

If one of the following conditions does apply, then the n may be omitted:

The item with the type vcard has both an fn property and an org property, and they both have values that are strings and those strings are identical when compared in a case-sensitive manner.

The contact information must be for an organization.

The item with the type vcard has an fn property whose value consists of a string with zero space characters.

The value of the fn property must be a nickname.

The item with the type vcard has an fn property whose value consists of a string with exactly one sequence of space characters, which occurs neither at the immediate start nor the immediate end of the string.

The value of the fn property must be a name in one of the following forms:

  • Last, First
  • Last F.
  • Last F
  • First Last

family-name (inside n)

Gives the family name of the person, or the full name of the organization.

The value must be text.

Any number of properties with the name family-name may be present within the item that forms the value of the n property of an item with the type vcard.

given-name (inside n)

Gives the given-name of the person.

The value must be text.

Any number of properties with the name given-name may be present within the item that forms the value of the n property of an item with the type vcard.

additional-name (inside n)

Gives any additional names of the person.

The value must be text.

Any number of properties with the name additional-name may be present within the item that forms the value of the n property of an item with the type vcard.

honorific-prefix (inside n)

Gives the honorific prefix of the person.

The value must be text.

Any number of properties with the name honorific-prefix may be present within the item that forms the value of the n property of an item with the type vcard.

honorific-suffix (inside n)

Gives the honorific suffix of the person.

The value must be text.

Any number of properties with the name honorific-suffix may be present within the item that forms the value of the n property of an item with the type vcard.

nickname

Gives the nickname of the person or organization.

The nickname is the descriptive name given instead of or in addition to the one belonging to a person, place, or thing. It can also be used to specify a familiar form of a proper name specified by the fn or n properties.

The value must be text.

Any number of properties with the name nickname may be present within each item with the type vcard.

photo

Gives a photograph of the person or organization.

The value must be an absolute URL.

Any number of properties with the name photo may be present within each item with the type vcard.

bday

Gives the birth date of the person or organization.

The value must be a valid date string.

A single property with the name bday may be present within each item with the type vcard.

adr

Gives the delivery address of the person or organization.

The value must be an item with zero or more type, post-office-box, extended-address, and street-address properties, and optionally a locality property, optionally a region property, optionally a postal-code property, and optionally a country-name property.

If no type properties are present within an item that forms the value of an adr property of an item with the type vcard, then the address type strings intl, postal, parcel, and work are implied.

Any number of properties with the name adr may be present within each item with the type vcard.

type (inside adr)

Gives the type of delivery address.

The value must be text that, when compared in a case-sensitive manner, is equal to one of the address type strings.

Within each item with the type vcard, there must be no more than one adr property item with a type property whose value is pref.

Any number of properties with the name type may be present within the item that forms the value of an adr property of an item with the type vcard, but within each such adr property item there must only be one type property per distinct value.

post-office-box (inside adr)

Gives the post office box component of the delivery address of the person or organization.

The value must be text.

Any number of properties with the name post-office-box may be present within the item that forms the value of an adr property of an item with the type vcard.

extended-address (inside adr)

Gives an additional component of the delivery address of the person or organization.

The value must be text.

Any number of properties with the name extended-address may be present within the item that forms the value of an adr property of an item with the type vcard.

street-address (inside adr)

Gives the street address component of the delivery address of the person or organization.

The value must be text.

Any number of properties with the name street-address may be present within the item that forms the value of an adr property of an item with the type vcard.

locality (inside adr)

Gives the locality component (e.g. city) of the delivery address of the person or organization.

The value must be text.

A single property with the name locality may be present within the item that forms the value of an adr property of an item with the type vcard.

region (inside adr)

Gives the region component (e.g. state or province) of the delivery address of the person or organization.

The value must be text.

A single property with the name region may be present within the item that forms the value of an adr property of an item with the type vcard.

postal-code (inside adr)

Gives the postal code component of the delivery address of the person or organization.

The value must be text.

A single property with the name postal-code may be present within the item that forms the value of an adr property of an item with the type vcard.

country-name (inside adr)

Gives the country name component of the delivery address of the person or organization.

The value must be text.

A single property with the name country-name may be present within the item that forms the value of an adr property of an item with the type vcard.

label

Gives the formatted text corresponding to the delivery address of the person or organization.

The value must be either text or an item with zero or more type properties and exactly one value property.

If no type properties are present within an item that forms the value of a label property of an item with the type vcard, or if the value of such a label property is text, then the address type strings intl, postal, parcel, and work are implied.

Any number of properties with the name label may be present within each item with the type vcard.

type (inside label)

Gives the type of delivery address.

The value must be text that, when compared in a case-sensitive manner, is equal to one of the address type strings.

Within each item with the type vcard, there must be no more than one label property item with a type property whose value is pref.

Any number of properties with the name type may be present within the item that forms the value of a label property of an item with the type vcard, but within each such label property item there must only be one type property per distinct value.

value (inside label)

Gives the actual formatted text corresponding to the delivery address of the person or organization.

The value must be text.

Exactly one property with the name value must be present within the item that forms the value of a label property of an item with the type vcard.

tel

Gives the telephone number of the person or organization.

The value must be either text that can be interpreted as a telephone number as defined in the CCITT specifications E.163 and X.121, or an item with zero or more type properties and exactly one value property. [E163] [X121]

If no type properties are present within an item that forms the value of a tel property of an item with the type vcard, or if the value of such a tel property is text, then the telephone type string voice is implied.

Any number of properties with the name tel may be present within each item with the type vcard.
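The two permitted shapes of a tel value, and the implied voice type, can be sketched as follows. The data model (a string, or an object with a `properties` map of `type` and `value` lists), the function names, the sample numbers, and the use of "fax" as a sample type string are all assumptions made for illustration.

```javascript
// Hypothetical model: a tel value is either a string or
// { properties: { type: [...], value: [...] } }.
function telephoneTypes(telValue) {
  if (typeof telValue === 'string') return ['voice']; // text implies voice
  const types = telValue.properties.type || [];
  return types.length === 0 ? ['voice'] : types.slice();
}

// Extract the number itself from either shape.
function telephoneNumber(telValue) {
  if (typeof telValue === 'string') return telValue;
  return telValue.properties.value[0]; // exactly one value is required
}

// Sample values; the numbers are placeholders.
const plainTel = '+1 555 0100';
const untypedTel = { properties: { value: ['+1 555 0101'] } };
const faxTel = { properties: { type: ['fax', 'pref'], value: ['+1 555 0102'] } };
```

The same pattern applies to the label and email properties, with their own implied type strings.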

type (inside tel)

Gives the type of telephone number.

The value must be text that, when compared in a case-sensitive manner, is equal to one of the telephone type strings.

Within each item with the type vcard, there must be no more than one tel property item with a type property whose value is pref.

Any number of properties with the name type may be present within the item that forms the value of a tel property of an item with the type vcard, but within each such tel property item there must only be one type property per distinct value.

value (inside tel)

Gives the actual telephone number of the person or organization.

The value must be text that can be interpreted as a telephone number as defined in the CCITT specifications E.163 and X.121. [E163] [X121]

Exactly one property with the name value must be present within the item that forms the value of a tel property of an item with the type vcard.

email

Gives the e-mail address of the person or organization.

The value must be either text or an item with zero or more type properties and exactly one value property.

If no type properties are present within an item that forms the value of an email property of an item with the type vcard, or if the value of such an email property is text, then the e-mail type string internet is implied.

Any number of properties with the name email may be present within each item with the type vcard.

type (inside email)

Gives the type of e-mail address.

The value must be text that, when compared in a case-sensitive manner, is equal to one of the e-mail type strings.

Within each item with the type vcard, there must be no more than one email property item with a type property whose value is pref.

Any number of properties with the name type may be present within the item that forms the value of an email property of an item with the type vcard, but within each such email property item there must only be one type property per distinct value.

value (inside email)

Gives the actual e-mail address of the person or organization.

The value must be text.

Exactly one property with the name value must be present within the item that forms the value of an email property of an item with the type vcard.

mailer

Gives the name of the e-mail software used by the person or organization.

The value must be text.

Any number of properties with the name mailer may be present within each item with the type vcard.

tz

Gives the time zone of the person or organization.

The value must be text and must match the following syntax:

  1. Either a U+002B PLUS SIGN character (+) or a U+002D HYPHEN-MINUS character (-).
  2. A valid non-negative integer that is exactly two digits long and that represents a number in the range 00..23.
  3. A U+003A COLON character (:).
  4. A valid non-negative integer that is exactly two digits long and that represents a number in the range 00..59.

Any number of properties with the name tz may be present within each item with the type vcard.
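
The tz syntax above maps directly onto a regular expression. The following is a minimal sketch; the function name `isValidTz` is an assumption made for this example.

```javascript
function isValidTz(value) {
  // A sign, two digits in the range 00..23, a colon, two digits in the range 00..59.
  return /^[+-](?:[01][0-9]|2[0-3]):[0-5][0-9]$/.test(value);
}
```

Under this check, "+01:00" and "-08:00" are accepted, while "01:00" (no sign), "+1:00" (one-digit hour), and "+24:00" (hour out of range) are rejected.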

geo

Gives the geographical position of the person or organization.

The value must be text and must match the following syntax:

  1. Optionally, either a U+002B PLUS SIGN character (+) or a U+002D HYPHEN-MINUS character (-).
  2. One or more digits in the range U+0030 DIGIT ZERO .. U+0039 DIGIT NINE.
  3. Optionally*, a U+002E FULL STOP character (.) followed by one or more digits in the range U+0030 DIGIT ZERO .. U+0039 DIGIT NINE.
  4. A U+003B SEMICOLON character (;).
  5. Optionally, either a U+002B PLUS SIGN character (+) or a U+002D HYPHEN-MINUS character (-).
  6. One or more digits in the range U+0030 DIGIT ZERO .. U+0039 DIGIT NINE.
  7. Optionally*, a U+002E FULL STOP character (.) followed by one or more digits in the range U+0030 DIGIT ZERO .. U+0039 DIGIT NINE.

The optional components marked with an asterisk (*) should be included, and should have six digits each.

The value specifies latitude and longitude, in that order (i.e., "LAT LON" ordering), in decimal degrees. The longitude represents the location east and west of the prime meridian as a positive or negative real number, respectively. The latitude represents the location north and south of the equator as a positive or negative real number, respectively.

Any number of properties with the name geo may be present within each item with the type vcard.
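
A geo value in the syntax above could be parsed as follows. This is a sketch under the assumption that a consumer wants the two components as numbers; the names `GEO_PATTERN` and `parseGeo` are hypothetical.

```javascript
// Optional sign, digits, optional fractional part; twice, separated by ";".
var GEO_PATTERN = /^([+-]?[0-9]+(?:\.[0-9]+)?);([+-]?[0-9]+(?:\.[0-9]+)?)$/;

function parseGeo(value) {
  var match = GEO_PATTERN.exec(value);
  if (!match) return null;
  // Latitude comes first, then longitude.
  return { latitude: parseFloat(match[1]), longitude: parseFloat(match[2]) };
}
```

For example, the string "34.052339;-118.410623" yields a latitude of 34.052339 and a longitude of -118.410623, while a comma-separated string yields null.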

title

Gives the job title, functional position or function of the person or organization.

The value must be text.

Any number of properties with the name title may be present within each item with the type vcard.

role

Gives the role, occupation, or business category of the person or organization.

The value must be text.

Any number of properties with the name role may be present within each item with the type vcard.

logo

Gives the logo of the person or organization.

The value must be an absolute URL.

Any number of properties with the name logo may be present within each item with the type vcard.

agent

Gives the contact information of another person who will act on behalf of the person or organization.

The value must be either an item with the type vcard, or an absolute URL, or text.

Any number of properties with the name agent may be present within each item with the type vcard.

org

Gives the name and units of the organization.

The value must be either text or an item with one organization-name property and zero or more organization-unit properties.

Any number of properties with the name org may be present within each item with the type vcard.

organization-name (inside org)

Gives the name of the organization.

The value must be text.

Exactly one property with the name organization-name must be present within the item that forms the value of an org property of an item with the type vcard.

organization-unit (inside org)

Gives the name of the organization unit.

The value must be text.

Any number of properties with the name organization-unit may be present within the item that forms the value of the org property of an item with the type vcard.

categories

Gives the name of a category or tag that the person or organization could be classified as.

The value must be text.

Any number of properties with the name categories may be present within each item with the type vcard.

note

Gives supplemental information or a comment about the person or organization.

The value must be text.

Any number of properties with the name note may be present within each item with the type vcard.

rev

Gives the revision date and time of the contact information.

The value must be text that is a valid global date and time string.

The value distinguishes the current revision of the information from other renditions of the information.

Any number of properties with the name rev may be present within each item with the type vcard.

sort-string

Gives the string to be used for sorting the person or organization.

The value must be text.

Any number of properties with the name sort-string may be present within each item with the type vcard.

sound

Gives a sound file relating to the person or organization.

The value must be an absolute URL.

Any number of properties with the name sound may be present within each item with the type vcard.

url

Gives a URL relating to the person or organization.

The value must be an absolute URL.

Any number of properties with the name url may be present within each item with the type vcard.

class

Gives the access classification of the information regarding the person or organization.

The value must be text with one of the following values:

This is merely advisory and cannot be considered a confidentiality measure.

Any number of properties with the name class may be present within each item with the type vcard.

impp

Gives a URL for instant messaging and presence protocol communications with the person or organization.

The value must be either an absolute URL or an item with zero or more type properties and exactly one value property.

If no type properties are present within an item that forms the value of an impp property of an item with the type vcard, or if the value of such an impp property is an absolute URL, then no IMPP type strings are implied.

Any number of properties with the name impp may be present within each item with the type vcard.

type (inside impp)

Gives the intended use of the IMPP URL.

The value must be text that, when compared in a case-sensitive manner, is equal to one of the IMPP type strings.

Within each item with the type vcard, there must be no more than one impp property item with a type property whose value is pref.

Any number of properties with the name type may be present within the item that forms the value of an impp property of an item with the type vcard, but within each such impp property item there must only be one type property per distinct value.

value (inside impp)

Gives the actual URL for instant messaging and presence protocol communications with the person or organization.

The value must be an absolute URL.

Exactly one property with the name value must be present within the item that forms the value of an impp property of an item with the type vcard.

The address type strings are:

dom

Indicates a domestic delivery address.

intl

Indicates an international delivery address.

postal

Indicates a postal delivery address.

parcel

Indicates a parcel delivery address.

home

Indicates a residential delivery address.

work

Indicates a delivery address for a place of work.

pref

Indicates the preferred delivery address when multiple addresses are specified.

The telephone type strings are:

home

Indicates a residential number.

msg

Indicates a telephone number with voice messaging support.

work

Indicates a telephone number for a place of work.

voice

Indicates a voice telephone number.

fax

Indicates a facsimile telephone number.

cell

Indicates a cellular telephone number.

video

Indicates a video conferencing telephone number.

pager

Indicates a paging device telephone number.

bbs

Indicates a bulletin board system telephone number.

modem

Indicates a MODEM-connected telephone number.

car

Indicates a car-phone telephone number.

isdn

Indicates an ISDN service telephone number.

pcs

Indicates a personal communication services telephone number.

pref

Indicates the preferred telephone number when multiple telephone numbers are specified.

The e-mail type strings are:

internet

Indicates an Internet e-mail address.

x400

Indicates an X.400 addressing type.

pref

Indicates the preferred e-mail address when multiple e-mail addresses are specified.

The IMPP type strings are:

personal
business

Indicates the type of communication for which this IMPP URL is appropriate.

home
work
mobile

Indicates the location of a device associated with this IMPP URL.

pref

Indicates the preferred address when multiple IMPP URLs are specified.

5.4.2.1 Examples

Here is a long example vcard for a fictional character called "Jack Bauer":

<section id="jack" item="vcard">
 <h1 itemprop="fn">Jack Bauer</h1>
 <img itemprop="photo" alt="" src="jack-bauer.jpg">
 <p itemprop="org" item>
  <span itemprop="organization-name">Counter-Terrorist Unit</span>
  (<span itemprop="organization-unit">Los Angeles Division</span>)
 </p>
 <p>
  <span itemprop="adr" item>
   <span itemprop="street-address">10201 W. Pico Blvd.</span><br>
   <span itemprop="locality">Los Angeles</span>,
   <span itemprop="region">CA</span>
   <span itemprop="postal-code">90064</span><br>
   <span itemprop="country-name">United States</span><br>
  </span>
  <span itemprop="geo">34.052339;-118.410623</span>
 </p>
 <h2>Assorted Contact Methods</h2>
 <ul>
  <li itemprop="tel" item><span itemprop="value">+1 (310) 597
  3781</span> <span itemprop="type">work</span>
  <meta itemprop="type" content="pref"></li>
  <li><a itemprop="url"
  href="http://en.wikipedia.org/wiki/Jack_Bauer">I'm on
  Wikipedia</a> so you can leave a message on my user talk
  page.</li>
  <li><a itemprop="url"
  href="http://www.jackbauerfacts.com/">Jack Bauer Facts</a></li>
  <li itemprop="email"><a
  href="mailto:j.bauer@la.ctu.gov.invalid">j.bauer@la.ctu.gov.invalid</a></li>
  <li itemprop="tel" item><span itemprop="value">+1 (310) 555
  3781</span> <span><meta itemprop="type" content="cell">mobile
  phone</span></li>
 </ul>
 <p itemprop="note">If I'm out in the field, you may be better
 off contacting <span itemprop="agent" item="vcard"><a
 itemprop="email" href="mailto:c.obrian@la.ctu.gov.invalid"><span
 itemprop="fn">Chloe O'Brian</span></a></span> if it's about
 work, or ask <span itemprop="agent">Tony Almeida</span> if
 you're interested in the CTU five-a-side football team we're
 trying to get going.</p>
 <ins datetime="2008-07-20T21:00:00+01:00">
  <meta itemprop="rev" content="2008-07-20T21:00:00+01:00">
  <p itemprop="tel" item><strong>Update!</strong>
  My new <span itemprop="type">home</span> phone number is
  <span itemprop="value">01632 960 123</span>.</p>
 </ins>
</section>

This example shows a site's contact details (using the address element) containing an address with two street components:

<address item=vcard>
 <strong itemprop="fn">Alfred Person</strong> <br>
 <span itemprop="adr" item>
  <span itemprop="street-address">1600 Amphitheatre Parkway</span> <br>
  <span itemprop="street-address">Building 43, Second Floor</span> <br>
  <span itemprop="locality">Mountain View</span>,
   <span itemprop="region">CA</span> <span itemprop="postal-code">94043</span>
 </span>
</address>

5.4.3 vEvent

An item with the predefined type vevent represents an event.

The following are the type's predefined property names. They are based on the vocabulary defined in the iCalendar specification, where more information on how to interpret the values can be found. [RFC2445]

Only the parts of the iCalendar vocabulary relating to events are used here; this vocabulary cannot express a complete iCalendar instance.

attach

Gives the address of an associated document for the event.

The value must be an absolute URL.

Any number of properties with the name attach may be present within each item with the type vevent.

categories

Gives the name of a category or tag that the event could be classified as.

The value must be text.

Any number of properties with the name categories may be present within each item with the type vevent.

class

Gives the access classification of the information regarding the event.

The value must be text with one of the following values:

This is merely advisory and cannot be considered a confidentiality measure.

A single property with the name class may be present within each item with the type vevent.

comment

Gives a comment regarding the event.

The value must be text.

Any number of properties with the name comment may be present within each item with the type vevent.

description

Gives a detailed description of the event.

The value must be text.

A single property with the name description may be present within each item with the type vevent.

geo

Gives the geographical position of the event.

The value must be text and must match the following syntax:

  1. Optionally, either a U+002B PLUS SIGN character (+) or a U+002D HYPHEN-MINUS character (-).
  2. One or more digits in the range U+0030 DIGIT ZERO .. U+0039 DIGIT NINE.
  3. Optionally*, a U+002E FULL STOP character (.) followed by one or more digits in the range U+0030 DIGIT ZERO .. U+0039 DIGIT NINE.
  4. A U+003B SEMICOLON character (;).
  5. Optionally, either a U+002B PLUS SIGN character (+) or a U+002D HYPHEN-MINUS character (-).
  6. One or more digits in the range U+0030 DIGIT ZERO .. U+0039 DIGIT NINE.
  7. Optionally*, a U+002E FULL STOP character (.) followed by one or more digits in the range U+0030 DIGIT ZERO .. U+0039 DIGIT NINE.

The optional components marked with an asterisk (*) should be included, and should have six digits each.

The value specifies latitude and longitude, in that order (i.e., "LAT LON" ordering), in decimal degrees. The longitude represents the location east and west of the prime meridian as a positive or negative real number, respectively. The latitude represents the location north and south of the equator as a positive or negative real number, respectively.

A single property with the name geo may be present within each item with the type vevent.

location

Gives the location of the event.

The value must be text.

A single property with the name location may be present within each item with the type vevent.

resources

Gives a resource that will be needed for the event.

The value must be text.

Any number of properties with the name resources may be present within each item with the type vevent.

status

Gives the confirmation status of the event.

The value must be text with one of the following values:

A single property with the name status may be present within each item with the type vevent.

summary

Gives a short summary of the event.

The value must be text.

User agents should replace U+000A LINE FEED (LF) characters in the value by U+0020 SPACE characters when using the value.

A single property with the name summary may be present within each item with the type vevent.

dtend

Gives the date and time by which the event ends.

If the property with the name dtend is present within an item with the type vevent that has a property with the name dtstart whose value is a valid date string, then the value of the property with the name dtend must be text that is a valid date string also. Otherwise, the value of the property must be text that is a valid global date and time string.

In either case, the value must be later in time than the value of the dtstart property of the same item.

The time given by the dtend property is not inclusive. For day-long events, therefore, the dtend property's value will be the day after the end of the event.

A single property with the name dtend may be present within each item with the type vevent, so long as that vevent does not have a property with the name duration.

dtstart

Gives the date and time at which the event starts.

The value must be text that is either a valid date string or a valid global date and time string.

Exactly one property with the name dtstart must be present within each item with the type vevent.

duration

Gives the duration of the event.

The value must be text that is a valid vevent duration string.

The duration represented is the sum of all the durations represented by integers in the value.

A single property with the name duration may be present within each item with the type vevent, so long as that vevent does not have a property with the name dtend.
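
The interplay between dtstart, dtend, and duration can be summarised by a check such as the one below. It is a sketch only: the input shape (an object mapping property names to arrays of text values) and the name `checkEventTiming` are assumptions, the `DATE_ONLY` pattern is a simplified stand-in for a valid date string, and `Date.parse` is used as an approximation for comparing the two values.

```javascript
// Simplified stand-in for "valid date string" (no month or day range checks).
var DATE_ONLY = /^[0-9]{4}-[0-9]{2}-[0-9]{2}$/;

// props: hypothetical model mapping property names to arrays of text values.
function checkEventTiming(props) {
  var errors = [];
  var starts = props.dtstart || [];
  var ends = props.dtend || [];
  var durations = props.duration || [];
  if (starts.length !== 1) errors.push('exactly one dtstart is required');
  if (ends.length > 1) errors.push('at most one dtend is allowed');
  if (durations.length > 1) errors.push('at most one duration is allowed');
  if (ends.length > 0 && durations.length > 0)
    errors.push('dtend and duration must not both be present');
  if (starts.length === 1 && ends.length === 1) {
    var startIsDate = DATE_ONLY.test(starts[0]);
    var endIsDate = DATE_ONLY.test(ends[0]);
    if (startIsDate !== endIsDate)
      errors.push('dtend must be a date string exactly when dtstart is one');
    else if (!(Date.parse(ends[0]) > Date.parse(starts[0])))
      errors.push('dtend must be later than dtstart');
  }
  return errors;
}
```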

transp

Gives whether the event is to be considered as consuming time on a calendar, for the purpose of free-busy time searches.

The value must be text with one of the following values:

A single property with the name transp may be present within each item with the type vevent.

contact

Gives the contact information for the event.

The value must be text.

Any number of properties with the name contact may be present within each item with the type vevent.

url

Gives a URL for the event.

The value must be an absolute URL.

A single property with the name url may be present within each item with the type vevent.

exdate

Gives a date and time at which the event does not occur despite the recurrence rules.

The value must be text that is either a valid date string or a valid global date and time string.

Any number of properties with the name exdate may be present within each item with the type vevent.

exrule

Gives a rule for finding dates and times at which the event does not occur despite the recurrence rules.

The value must be text that matches the RECUR value type defined in the iCalendar specification. [RFC2445]

Any number of properties with the name exrule may be present within each item with the type vevent.

rdate

Gives a date and time at which the event recurs.

The value must be text that is one of the following:

Any number of properties with the name rdate may be present within each item with the type vevent.

rrule

Gives a rule for finding dates and times at which the event occurs.

The value must be text that matches the RECUR value type defined in the iCalendar specification. [RFC2445]

Any number of properties with the name rrule may be present within each item with the type vevent.

created

Gives the date and time at which the event information was first created in a calendaring system.

The value must be text that is a valid global date and time string.

A single property with the name created may be present within each item with the type vevent.

last-modified

Gives the date and time at which the event information was last modified in a calendaring system.

The value must be text that is a valid global date and time string.

A single property with the name last-modified may be present within each item with the type vevent.

sequence

Gives a revision number for the event information.

The value must be text that is a valid non-negative integer.

A single property with the name sequence may be present within each item with the type vevent.

A string is a valid vevent duration string if it matches the following pattern:

  1. A U+0050 LATIN CAPITAL LETTER P character.
  2. One of the following:
5.4.3.1 Examples

Here is an example of a page that uses the vevent vocabulary to mark up an event:

<body item="vevent">
 ...
 <h1 itemprop="summary">Bluesday Tuesday: Money Road</h1>
 ...
 <time itemprop="dtstart" datetime="2009-05-05T19:00:00Z">May 5th @ 7pm</time>
 (until <time itemprop="dtend" datetime="2009-05-05T21:00:00Z">9pm</time>)
 ...
 <a href="http://livebrum.co.uk/2009/05/05/bluesday-tuesday-money-road"
    rel="bookmark" itemprop="url">Link to this page</a>
 ...
 <p>Location: <span itemprop="location">The RoadHouse</span></p>
 ...
 <p><input type=button value="Add to Calendar"
           onclick="location = getCalendar(this)"></p>
 ...
 <meta itemprop="description" content="via livebrum.co.uk">
</body>

The "getCalendar()" method could look like this:

function getCalendar(node) {
  // Walk up from the button to the nearest ancestor that is a vevent item.
  while (node && !(node.item && node.item.contains('vevent')))
    node = node.parentNode;
  if (!node) {
    alert('No event data found.');
    return;
  }
  // DTSTAMP needs fixed-width, zero-padded UTC fields (YYYYMMDDTHHMMSSZ).
  function pad(number) {
    return (number < 10 ? '0' : '') + number;
  }
  var stamp = new Date();
  var stampString = '' + stamp.getUTCFullYear() + pad(stamp.getUTCMonth() + 1) + pad(stamp.getUTCDate()) + 'T' +
                         pad(stamp.getUTCHours()) + pad(stamp.getUTCMinutes()) + pad(stamp.getUTCSeconds()) + 'Z';
  var calendar = 'BEGIN:VCALENDAR\r\nPRODID:HTML\r\nVERSION:2.0\r\nBEGIN:VEVENT\r\nDTSTAMP:' + stampString + '\r\n';
  for (var propIndex = 0; propIndex < node.properties.length; propIndex += 1) {
    var prop = node.properties[propIndex];
    var value = prop.contents;
    var parameters = '';
    if (prop.localName == 'time') {
      // iCalendar dates and times are written without separators.
      value = value.replace(/[:-]/g, '');
      if (prop.date && !prop.time)
        parameters = ';VALUE=DATE';
      else
        parameters = ';VALUE=DATE-TIME';
    } else {
      // Escape backslashes first so that later escapes are not doubled.
      value = value.replace(/\\/g, '\\\\');
      value = value.replace(/;/g, '\\;');
      value = value.replace(/,/g, '\\,');
      value = value.replace(/\n/g, '\\n');
    }
    for (var nameIndex = 0; nameIndex < prop.itemprop.length; nameIndex += 1) {
      var name = prop.itemprop[nameIndex];
      // Only names without a colon or full stop are exported.
      if (name.indexOf(':') == -1 && name.indexOf('.') == -1)
        calendar += name.toUpperCase() + parameters + ':' + value + '\r\n';
    }
  }
  calendar += 'END:VEVENT\r\nEND:VCALENDAR\r\n';
  return 'data:text/calendar;component=vevent,' + encodeURI(calendar);
}

The same page could offer some markup, such as the following, for copy-and-pasting into blogs:

<div item="vevent">
 <p>I'm going to
 <strong itemprop="summary">Bluesday Tuesday: Money Road</strong>,
 <time itemprop="dtstart" datetime="2009-05-05T19:00:00Z">May 5th at 7pm</time>
 to <time itemprop="dtend" datetime="2009-05-05T21:00:00Z">9pm</time>,
 at <span itemprop="location">The RoadHouse</span>!</p>
 <p><a href="http://livebrum.co.uk/2009/05/05/bluesday-tuesday-money-road"
       itemprop="url">See this event on livebrum.co.uk</a>.</p>
 <meta itemprop="description" content="via livebrum.co.uk">
</div>

5.4.4 Licensing works

An item with the predefined type work represents a work (e.g. an article, an image, a video, a song, etc). This type is primarily intended to allow authors to include licensing information for works.

The following are the type's predefined property names.

title

Gives the name of the work.

A single property with the name title may be present within each item with the type work.

author

Gives the name or contact information of one of the authors or creators of the work.

The value must be either an item with the type vcard, or text.

Any number of properties with the name author may be present within each item with the type work.

license

Identifies one of the licenses under which the work is available.

The value must be an absolute URL.

Any number of properties with the name license may be present within each item with the type work.

In addition, exactly one property with the name about must be present within each item with the type work, giving the URL of the work.

5.4.4.1 Examples

This example shows an embedded image entitled My Pond, licensed under the Creative Commons Attribution-Share Alike 3.0 United States License and the MIT license simultaneously.

<figure item="work">
 <img itemprop="about" src="mypond.jpeg">
 <legend>
  <p><cite itemprop="title">My Pond</cite></p>
  <p><small>Licensed under the <a itemprop="license"
  href="http://creativecommons.org/licenses/by-sa/3.0/us/">Creative
  Commons Attribution-Share Alike 3.0 United States License</a>
  and the <a itemprop="license"
  href="http://www.opensource.org/licenses/mit-license.php">MIT
  license</a>.</small></p>
 </legend>
</figure>

5.5 Converting HTML to other formats

In all these algorithms, unless otherwise stated, operations that iterate over a series of elements (whether items, properties, or otherwise) must do so in tree order.

A generic API upon which the vocabulary-specific conversions defined below (vCard, iCalendar) can be built will need to provide the following information when given a Document (or equivalent):

5.5.1 JSON

Given a list of nodes nodes in a Document, a user agent must run the following algorithm to extract the microdata from those nodes into a JSON form:

  1. Let result be an empty object.

  2. Let items be an empty array.

  3. For each node in nodes, check if the element is a top-level microdata item, and if it is then get the object for that element and add it to items.

  4. Add an entry to result called "items" whose value is the array items.

  5. Return the result of serializing result to JSON.

When the user agent is to get the object for an item item, it must run the following substeps:

  1. Let result be an empty object.

  2. Add an entry to result called "type" whose value is the item type of item.

  3. Let properties be an empty object.

  4. For each element element that has one or more property names and whose corresponding item is item, run the following substeps:

    1. Let value be the property value of element.

    2. If value is an item, then get the object for value, and then replace value with the object returned from those steps.

    3. For each name name in element's property names, run the following substeps:

      1. If there is no entry named name in properties, then add an entry named name to properties whose value is an empty array.

      2. Append value to the entry named name in properties.

  5. Add an entry to result called "properties" whose value is the object properties.

  6. Return result.
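
The two algorithms above can be sketched in JavaScript as follows. The sketch does not use any DOM API: it assumes a hypothetical in-memory model in which an item is `{ type, elements }` and `elements` lists, in tree order, the elements whose corresponding item is that item, each with its property names and its property value. The names `isItem`, `getObject`, and `extractJSON` are likewise assumptions made for the example.

```javascript
// Hypothetical model (not a DOM API): an item is
// { type: string, elements: [{ names: [string], value: string | item }] }.
function isItem(value) {
  return typeof value === 'object' && value !== null;
}

// "Get the object" for one item.
function getObject(item) {
  var result = { type: item.type };
  var properties = {};
  item.elements.forEach(function (element) {
    var value = element.value;
    // A nested item is replaced by the object obtained for it.
    if (isItem(value)) value = getObject(value);
    element.names.forEach(function (name) {
      if (!Object.prototype.hasOwnProperty.call(properties, name))
        properties[name] = [];
      properties[name].push(value);
    });
  });
  result.properties = properties;
  return result;
}

// Top-level algorithm: the caller passes only top-level microdata items.
function extractJSON(topLevelItems) {
  var result = { items: topLevelItems.map(function (item) {
    return getObject(item);
  }) };
  return JSON.stringify(result);
}
```

Each property name maps to an array even when only one value is present, so consumers do not need to distinguish single-valued from multi-valued properties.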

5.5.2 RDF

To convert a Document to RDF, a user agent must run the following algorithm:

  1. If the title element is not null, then generate the following triple:

    subject
    the document's current address
    predicate
    http://purl.org/dc/terms/title
    object
    the textContent of the title element, as a plain literal, with the language information set from the language of the title element, if it is not unknown.
  2. For each a, area, and link element in the Document, run these substeps:

    1. If the element does not have a rel attribute, then skip this element.

    2. If the element does not have an href attribute, then skip this element.

    3. If resolving the element's href attribute relative to the element is not successful, then skip this element.

    4. Otherwise, split the value of the element's rel attribute on spaces, obtaining list of tokens.

    5. Convert each token in list of tokens to ASCII lowercase.

    6. If list of tokens contains more than one instance of the token up, then remove all such tokens.

    7. Coalesce duplicate tokens in list of tokens.

    8. If list of tokens contains both the tokens alternate and stylesheet, then remove them both and replace them with the single (uppercase) token ALTERNATE-STYLESHEET.

    9. For each token token in list of tokens that contains neither a U+003A COLON character (:) nor a U+002E FULL STOP character (.), generate the following triple:

      subject
      the document's current address
      predicate
      the concatenation of the string "http://www.w3.org/1999/xhtml/vocab#" and token, with any characters in token that are not valid in the <ifragment> production of the IRI syntax being %-escaped [RFC3987]
      object
      the absolute URL that results from resolving the value of the element's href attribute relative to the element
  3. For each meta element in the Document that has a name attribute and a content attribute, if the value of the name attribute contains neither a U+003A COLON character (:) nor a U+002E FULL STOP character (.), generate the following triple:

    subject
    the document's current address
    predicate
    the concatenation of the string "http://www.w3.org/1999/xhtml/vocab#" and the value of the element's name attribute, converted to ASCII lowercase, with any characters in the value that are not valid in the <ifragment> production of the IRI syntax being %-escaped [RFC3987]
    object
    the value of the element's content attribute, as a plain literal, with the language information set from the language of the element, if it is not unknown.
  4. For each article, section, blockquote, and q element in the Document that has a cite attribute that resolves successfully relative to the element, generate the following triple:

    subject
    the document's current address
    predicate
    http://purl.org/dc/terms/source
    object
    the absolute URL that results from resolving the value of the element's cite attribute relative to the element
  5. For each element that is also a top-level microdata item, run the following steps:

    1. Generate the triples for the item. Let item be the subject returned.

    2. Generate the following triple:

      subject
      the document's current address
      predicate
      http://www.w3.org/1999/xhtml/vocab#item
      object
      item
    3. If the element is, or is a descendant of, an address element that has no article element ancestors, and the item has the type vcard, generate the following triple:

      subject
      the document's current address
      predicate
      http://purl.org/dc/terms/creator
      object
      item
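
The handling of rel tokens in the substeps for a, area, and link elements (splitting through to predicate generation) can be sketched as below. The function name `relPredicates` is an assumption, and `encodeURIComponent` is used as an approximation of the %-escaping step: it escapes more characters than the <ifragment> production strictly requires.

```javascript
var VOCAB = 'http://www.w3.org/1999/xhtml/vocab#';

// Returns the predicates generated for one rel attribute value.
function relPredicates(rel) {
  // Split on space characters and convert each token to ASCII lowercase.
  var tokens = rel.split(/[ \t\n\f\r]+/).filter(function (token) {
    return token !== '';
  }).map(function (token) {
    return token.replace(/[A-Z]/g, function (c) { return c.toLowerCase(); });
  });
  // More than one "up" token: remove all of them.
  var ups = tokens.filter(function (token) { return token === 'up'; });
  if (ups.length > 1)
    tokens = tokens.filter(function (token) { return token !== 'up'; });
  // Coalesce duplicates, keeping the first occurrence of each token.
  tokens = tokens.filter(function (token, index, all) {
    return all.indexOf(token) === index;
  });
  // "alternate" together with "stylesheet" becomes one uppercase token.
  if (tokens.indexOf('alternate') !== -1 && tokens.indexOf('stylesheet') !== -1) {
    tokens = tokens.filter(function (token) {
      return token !== 'alternate' && token !== 'stylesheet';
    });
    tokens.push('ALTERNATE-STYLESHEET');
  }
  // Tokens containing ":" or "." generate no triple.
  return tokens.filter(function (token) {
    return token.indexOf(':') === -1 && token.indexOf('.') === -1;
  }).map(function (token) {
    return VOCAB + encodeURIComponent(token);
  });
}
```

Each returned predicate would be paired with the document's current address as subject and the resolved href as object.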

When the user agent is to generate the triples for an item item, it must follow the following steps:

  1. If any of the elements whose corresponding item is item have a property name equal to the string "about", and the first such element is a URL property element whose value is not an item, let subject be the value of that property. Otherwise, let subject be a new blank node.

  2. Let type be the item type of item.

  3. If type is neither the empty string nor an absolute URL, then let type be the result of concatenating the string "http://www.w3.org/1999/xhtml/custom#" with the type, with any characters in type that are not valid in the <ifragment> production of the IRI syntax being %-escaped.

  4. If type is not the empty string, generate the following triple:

    subject
    subject
    predicate
    http://www.w3.org/1999/02/22-rdf-syntax-ns#type
    object
    type
  5. For each element element that has one or more property names and whose corresponding item is item, run the following substeps:

    1. Let value be the property value of element.

    2. If value is an item, then generate the triples for value, and then replace value with the subject returned from those steps.

    3. Otherwise, if element is not one of the URL property elements, let value be a plain literal, with the language information set from the language of the element, if it is not unknown.

    4. For each name name in element's property names, run the following substeps:

      1. If name is equal to the string "about", skip this name.

      2. Otherwise, if type is work, and name is equal to the string "title", let name be the string "http://purl.org/dc/elements/1.1/title".

      3. Otherwise, if type is work, and name is equal to the string "author", let name be the string "http://creativecommons.org/ns#attributionName".

      4. Otherwise, if type is work, and name is equal to the string "license", let name be the string "http://www.w3.org/1999/xhtml/vocab#license".

      5. Otherwise, if name is not an absolute URL, then let name be the result of concatenating the string "http://www.w3.org/1999/xhtml/custom#" with name, with any characters in name that are not valid in the <ifragment> production of the IRI syntax being %-escaped. [RFC3987]

      6. Generate the following triple:

        subject
        subject
        predicate
        name
        object
        value
  6. Return subject.
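
The name handling in step 5.4 above can be sketched as follows. This is a minimal illustration, not part of the algorithm: the function names and the `WORK_PREDICATES` table are hypothetical, the absolute-URL test is simplified to "has a scheme", and `urllib.parse.quote` only approximates %-escaping against the <ifragment> production (it also escapes non-ASCII characters that <ifragment> would allow).

```python
from urllib.parse import quote, urlsplit

CUSTOM_PREFIX = "http://www.w3.org/1999/xhtml/custom#"

# Hypothetical lookup table for the three special cases of the "work" type.
WORK_PREDICATES = {
    "title": "http://purl.org/dc/elements/1.1/title",
    "author": "http://creativecommons.org/ns#attributionName",
    "license": "http://www.w3.org/1999/xhtml/vocab#license",
}


def is_absolute_url(name):
    # Simplification (assumption): anything with a scheme counts as absolute.
    return bool(urlsplit(name).scheme)


def predicate_for(name, item_type):
    """Return the predicate for a property name, or None if it is skipped."""
    if name == "about":
        return None
    if item_type == "work" and name in WORK_PREDICATES:
        return WORK_PREDICATES[name]
    if is_absolute_url(name):
        return name
    # Approximation of "%-escape anything not valid in <ifragment>".
    return CUSTOM_PREFIX + quote(name, safe="!$&'()*+,;=:@/?-._~")
```

For example, `predicate_for("band", "")` yields the custom-namespace URL ending in `#band`, while `predicate_for("title", "work")` yields the Dublin Core title predicate.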

5.5.3 vCard

Given a list of nodes nodes in a Document, a user agent must run the following algorithm to extract any vcard data represented by those nodes (only the first vCard is returned):

  1. If none of the nodes in nodes are items with the type vcard, then there is no vCard. Abort the algorithm, returning nothing.

  2. Let node be the first node in nodes that is an item with the type vcard.

  3. Let output be an empty string.

  4. Add a vCard line with the type "BEGIN" and the value "VCARD" to output.

  5. Add a vCard line with the type "PROFILE" and the value "VCARD" to output.

  6. Add a vCard line with the type "VERSION" and the value "3.0" to output.

  7. Add a vCard line with the type "SOURCE" and the result of escaping the vCard text string that is the document's current address as the value to output.

  8. If the title element is not null, add a vCard line with the type "NAME" and with the result of escaping the vCard text string obtained from the textContent of the title element as the value to output.

  9. If there is a property named about whose corresponding item is node and the element of the first such property is a URL property element and has a value that is not an item, add a vCard line with the type "UID" and with the result of escaping the vCard text string that is that property's value as the value to output.

  10. For each element element that has one or more property names and whose corresponding item is node: for each name name in element's property names, run the following substeps:

    1. If name is equal to the string "about", skip this name.

    2. Let parameters be an empty set of name-value pairs.

    3. Run the appropriate set of substeps from the following list. The steps will set a variable value, which is used in the next step.

      If the property's value is an item subitem and name is n
      1. Let n1 be the value of the first property named family-name in subitem, or the empty string if there is no such property or the property's value is itself an item.

      2. Let n2 be the value of the first property named given-name in subitem, or the empty string if there is no such property or the property's value is itself an item.

      3. Let n3 be the value of the first property named additional-name in subitem, or the empty string if there is no such property or the property's value is itself an item.

      4. Let n4 be the value of the first property named honorific-prefix in subitem, or the empty string if there is no such property or the property's value is itself an item.

      5. Let n5 be the value of the first property named honorific-suffix in subitem, or the empty string if there is no such property or the property's value is itself an item.

      6. Let value be the concatenation of the following, in this order:

        1. The result of escaping the vCard text string n1
        2. A U+003B SEMICOLON character (;)
        3. The result of escaping the vCard text string n2
        4. A U+003B SEMICOLON character (;)
        5. The result of escaping the vCard text string n3
        6. A U+003B SEMICOLON character (;)
        7. The result of escaping the vCard text string n4
        8. A U+003B SEMICOLON character (;)
        9. The result of escaping the vCard text string n5
      If the property's value is an item subitem and name is adr
      1. Let value be the empty string.

      2. Append to value the result of collecting vCard subproperties named post-office-box in subitem.

      3. Append a U+003B SEMICOLON character (;) to value.
      4. Append to value the result of collecting vCard subproperties named extended-address in subitem.

      5. Append a U+003B SEMICOLON character (;) to value.
      6. Append to value the result of collecting vCard subproperties named street-address in subitem.

      7. Append a U+003B SEMICOLON character (;) to value.
      8. Append to value the result of collecting the first vCard subproperty named locality in subitem.

      9. Append a U+003B SEMICOLON character (;) to value.
      10. Append to value the result of collecting the first vCard subproperty named region in subitem.

      11. Append a U+003B SEMICOLON character (;) to value.
      12. Append to value the result of collecting the first vCard subproperty named postal-code in subitem.

      13. Append a U+003B SEMICOLON character (;) to value.
      14. Append to value the result of collecting the first vCard subproperty named country-name in subitem.

      15. If there is a property named type in subitem, and the first such property has a value that is not an item and whose value consists only of alphanumeric ASCII characters, then add a parameter named "TYPE" whose value is the value of that property to parameters.

      If the property's value is an item subitem and name is org
      1. Let value be the empty string.

      2. Append to value the result of collecting the first vCard subproperty named organization-name in subitem.

      3. For each property named organization-unit in subitem, run the following steps:

        1. If the value of the property is an item, then skip this property.

        2. Append a U+003B SEMICOLON character (;) to value.

        3. Append the result of escaping the vCard text string given by the value of the property to value.

      If the property's value is an item subitem with the type vcard and name is agent
      1. Let value be the result of escaping the vCard text string obtained from extracting a vCard from the element that represents subitem.

      2. Add a parameter named "VALUE" whose value is "VCARD" to parameters.

      If the property's value is an item and name is none of the above
      1. Let value be the result of collecting the first vCard subproperty named value in subitem.

      2. If there is a property named type in subitem, and the first such property has a value that is not an item and whose value consists only of alphanumeric ASCII characters, then add a parameter named "TYPE" whose value is the value of that property to parameters.

      Otherwise (the property's value is not an item)
      1. Let value be the property's value.

      2. If element is one of the URL property elements, add a parameter with the name "VALUE" and the value "URI" to parameters.

      3. Otherwise, if element is a time element and the value is a valid date string, add a parameter with the name "VALUE" and the value "DATE" to parameters.

      4. Otherwise, if element is a time element and the value is a valid global date and time string, add a parameter with the name "VALUE" and the value "DATE-TIME" to parameters.

      5. Prefix every U+005C REVERSE SOLIDUS character (\) in value with another U+005C REVERSE SOLIDUS character (\).

      6. Prefix every U+002C COMMA character (,) in value with a U+005C REVERSE SOLIDUS character (\).

      7. Unless name is geo, prefix every U+003B SEMICOLON character (;) in value with a U+005C REVERSE SOLIDUS character (\).

      8. Replace every U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF) in value with a U+005C REVERSE SOLIDUS character (\) followed by a U+006E LATIN SMALL LETTER N.

      9. Replace every remaining U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF) character in value with a U+005C REVERSE SOLIDUS character (\) followed by a U+006E LATIN SMALL LETTER N.

    4. Add a vCard line with the type name, the parameters parameters, and the value value to output.

  11. If there is no property named n whose corresponding item is node, then run the following substeps:

    1. If there is no property named fn whose corresponding item is node, then skip the remainder of these substeps.

    2. If the first property named fn whose corresponding item is node has a value that is an item, then skip the remainder of these substeps.

    3. Let fn be the value of the first property named fn whose corresponding item is node.

    4. If there is a property named org whose corresponding item is node, and the value of the first such property is equal to fn (and is not an item), then add a vCard line with the type "N" whose value is four U+003B SEMICOLON characters (";;;;") to output. Then, skip the remainder of these substeps.

    5. If the space characters in fn, if any, are not all contiguous, then skip the remainder of these substeps.

    6. Split fn on spaces, and let part one be the first resulting token, and part two be the second, if any, or the empty string if there is no second token. (There cannot be three, given the previous step.)

    7. If the last character of part one is a U+002C COMMA character (,), then remove that character from part one and add a vCard line with the type "N" whose value is the concatenation of the following strings:

      1. The result of escaping the vCard text string part one
      2. A U+003B SEMICOLON character (;)
      3. The result of escaping the vCard text string part two
      4. Three U+003B SEMICOLON characters (;)

      Then, skip the remainder of these substeps.

    8. If part two is two Unicode code-points long and its second character is a U+002E FULL STOP character (.), then add a vCard line with the type "N" whose value is the concatenation of the following strings:

      1. The result of escaping the vCard text string part one
      2. A U+003B SEMICOLON character (;)
      3. The result of escaping the vCard text string consisting of the first character of part two
      4. Three U+003B SEMICOLON characters (;)

      Then, skip the remainder of these substeps.

    9. If part two is one Unicode code-point long, then add a vCard line with the type "N" whose value is the concatenation of the following strings:

      1. The result of escaping the vCard text string part one
      2. A U+003B SEMICOLON character (;)
      3. The result of escaping the vCard text string part two
      4. Three U+003B SEMICOLON characters (;)

      Then, skip the remainder of these substeps.

    10. Add a vCard line with the type "N" whose value is the concatenation of the following strings:

      1. The result of escaping the vCard text string part two
      2. A U+003B SEMICOLON character (;)
      3. The result of escaping the vCard text string part one
      4. Three U+003B SEMICOLON characters (;)
  12. Add a vCard line with the type "END" and the value "VCARD" to output.
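
The fallback in step 11, which derives an N line from fn when no n property exists, can be sketched as below. This is an illustrative sketch under assumptions: `derive_n` and its arguments are hypothetical names, `fn` and `org` are passed as plain strings (with `None` meaning that there is no usable org value), and a string with no tokens at all is treated as producing no N line, a case the steps above do not spell out.

```python
import re

SPACE_CHARS = " \t\n\f\r"  # the space characters


def escape_vcard_text(value):
    # Backslashes first, so escapes added afterwards are not doubled.
    value = value.replace("\\", "\\\\").replace(",", "\\,").replace(";", "\\;")
    return value.replace("\r\n", "\\n").replace("\r", "\\n").replace("\n", "\\n")


def derive_n(fn, org=None):
    """Return the value of the N line derived from fn, or None for no line."""
    if org is not None and org == fn:
        return ";;;;"
    # All space characters must form one contiguous run.
    positions = [i for i, ch in enumerate(fn) if ch in SPACE_CHARS]
    if positions and positions[-1] - positions[0] + 1 != len(positions):
        return None
    tokens = [t for t in re.split("[ \t\n\f\r]+", fn) if t]
    if not tokens:
        return None  # assumption: nothing to derive from an empty name
    part_one = tokens[0]
    part_two = tokens[1] if len(tokens) > 1 else ""
    if part_one.endswith(","):
        # "Family, Given"
        return escape_vcard_text(part_one[:-1]) + ";" + escape_vcard_text(part_two) + ";;;"
    if len(part_two) == 2 and part_two[1] == ".":
        # "Family G."
        return escape_vcard_text(part_one) + ";" + escape_vcard_text(part_two[0]) + ";;;"
    if len(part_two) == 1:
        # "Family G"
        return escape_vcard_text(part_one) + ";" + escape_vcard_text(part_two) + ";;;"
    # "Given Family"
    return escape_vcard_text(part_two) + ";" + escape_vcard_text(part_one) + ";;;"
```

With this sketch, both `derive_n("Doe, John")` and `derive_n("John Doe")` give `Doe;John;;;`, while `derive_n("John Q Public")` gives `None` because the spaces are not contiguous.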

When the above algorithm says that the user agent is to add a vCard line consisting of a type type, optionally some parameters, and a value value to a string output, it must run the following steps:

  1. Let line be an empty string.

  2. Append type, converted to ASCII uppercase, to line.

  3. If there are any parameters, then for each parameter, in the order that they were added, run these substeps:

    1. Append a U+003B SEMICOLON character (;) to line.

    2. Append the parameter's name to line.

    3. Append a U+003D EQUALS SIGN character (=) to line.

    4. Append the parameter's value to line.

  4. Append a U+003A COLON character (:) to line.

  5. Append value to line.

  6. Let maximum length be 75.

  7. While line is more than maximum length Unicode code points long, run the following substeps:

    1. Append the first maximum length Unicode code points of line to output.

    2. Remove the first maximum length Unicode code points from line.

    3. Append a U+000D CARRIAGE RETURN character (CR) to output.

    4. Append a U+000A LINE FEED character (LF) to output.

    5. Append a U+0020 SPACE character to output.

    6. Let maximum length be 74.

  8. Append (what remains of) line to output.

  9. Append a U+000D CARRIAGE RETURN character (CR) to output.

  10. Append a U+000A LINE FEED character (LF) to output.
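
The line-building and folding steps above can be sketched as follows. This is a minimal sketch: `add_vcard_line` and `ascii_upper` are hypothetical names, parameters are modelled as a sequence of (name, value) pairs in insertion order, and the function returns the extended output string instead of mutating it.

```python
def ascii_upper(text):
    # Uppercase only a-z, leaving non-ASCII letters untouched.
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in text)


def add_vcard_line(output, type_, value, parameters=()):
    """Append one folded vCard line to output and return the new string."""
    line = ascii_upper(type_)
    for name, param_value in parameters:
        line += ";" + name + "=" + param_value
    line += ":" + value
    maximum_length = 75
    # Python string lengths and slices count Unicode code points.
    while len(line) > maximum_length:
        output += line[:maximum_length] + "\r\n "
        line = line[maximum_length:]
        # Continuation lines start with a space, so they carry one
        # code point fewer of the original line.
        maximum_length = 74
    return output + line + "\r\n"
```

Dropping the limit to 74 after the first fold keeps every physical line, including its leading space, at 75 code points or fewer.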

When the steps above require the user agent to obtain the result of collecting vCard subproperties named subname in subitem, the user agent must run the following steps:

  1. Let value be the empty string.

  2. For each property named subname in the item subitem, run the following substeps:

    1. If the value of the property is itself an item, then skip this property.

    2. If this is not the first property named subname in subitem (ignoring any that were skipped by the previous step), then append a U+002C COMMA character (,) to value.

    3. Append the result of escaping the vCard text string given by the value of the property to value.

  3. Return value.

When the steps above require the user agent to obtain the result of collecting the first vCard subproperty named subname in subitem, the user agent must run the following steps:

  1. If there are no properties named subname in subitem, then abort these substeps, returning the empty string.

  2. If the value of the first property named subname in subitem is an item, then abort these substeps, returning the empty string.

  3. Return the result of escaping the vCard text string given by the value of the first property named subname in subitem.
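
The two collecting algorithms above can be sketched as follows. The data model is an assumption made for illustration only: a subitem is a list of (name, value) pairs in document order, a string value is a plain property value, and any non-string value stands for a nested item. The function names are hypothetical.

```python
def escape_vcard_text(value):
    # Backslashes first, so escapes added afterwards are not doubled.
    value = value.replace("\\", "\\\\").replace(",", "\\,").replace(";", "\\;")
    return value.replace("\r\n", "\\n").replace("\r", "\\n").replace("\n", "\\n")


def collect_subproperties(subitem, subname):
    """Join every non-item value named subname with commas."""
    values = [
        escape_vcard_text(value)
        for name, value in subitem
        if name == subname and isinstance(value, str)  # skip nested items
    ]
    return ",".join(values)


def collect_first_subproperty(subitem, subname):
    """Return the first value named subname, or "" if missing or an item."""
    for name, value in subitem:
        if name == subname:
            return escape_vcard_text(value) if isinstance(value, str) else ""
    return ""
```

Note the difference between the two: the first skips nested items and keeps looking, whereas the second only ever inspects the first property with the given name.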

When the above algorithms say the user agent is to escape the vCard text string value, the user agent must use the following steps:

  1. Prefix every U+005C REVERSE SOLIDUS character (\) in value with another U+005C REVERSE SOLIDUS character (\).

  2. Prefix every U+002C COMMA character (,) in value with a U+005C REVERSE SOLIDUS character (\).

  3. Prefix every U+003B SEMICOLON character (;) in value with a U+005C REVERSE SOLIDUS character (\).

  4. Replace every U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF) in value with a U+005C REVERSE SOLIDUS character (\) followed by a U+006E LATIN SMALL LETTER N.

  5. Replace every remaining U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF) character in value with a U+005C REVERSE SOLIDUS character (\) followed by a U+006E LATIN SMALL LETTER N.

  6. Return the mutated value.

This algorithm can generate invalid vCard output, if the input does not conform to the rules described for the vcard predefined type and predefined property names.

5.5.4 iCalendar

Given a list of nodes nodes in a Document, a user agent must run the following algorithm to extract any vevent data represented by those nodes:

  1. If none of the nodes in nodes are items with the type vevent, then there is no vEvent data. Abort the algorithm, returning nothing.

  2. Let output be an empty string.

  3. Add an iCalendar line with the type "BEGIN" and the value "VCALENDAR" to output.

  4. Add an iCalendar line with the type "PRODID" and the value equal to a user-agent specific string representing the user agent to output.

  5. Add an iCalendar line with the type "VERSION" and the value "2.0" to output.

  6. For each node node in nodes that is an item with the type vevent, run the following steps:

    1. Add an iCalendar line with the type "BEGIN" and the value "VEVENT" to output.

    2. Add an iCalendar line with the type "DTSTAMP" and a value consisting of an iCalendar DATE-TIME string representing the current date and time, with the annotation "VALUE=DATE-TIME", to output. [RFC2445]

    3. If there is a property named about whose corresponding item is node and the element of the first such property is a URL property element and has a value that is not an item, add an iCalendar line with the type "UID" and that property's value as the value to output.

    4. For each element element that has one or more property names and whose corresponding item is node: for each name name in element's property names, run the appropriate set of substeps from the following list:

      If name is equal to the string "about"
      If the property's value is an item

      Skip the property.

      If element is a time element

      Let value be the result of stripping all U+002D HYPHEN-MINUS (-) and U+003A COLON (:) characters from the property's value.

      If the property's value is a valid date string then add an iCalendar line with the type name and the value value to output, with the annotation "VALUE=DATE".

      Otherwise, if the property's value is a valid global date and time string then add an iCalendar line with the type name and the value value to output, with the annotation "VALUE=DATE-TIME".

      Otherwise skip the property.

      Otherwise

      Add an iCalendar line with the type name and the property's value as the value to output.

    5. Add an iCalendar line with the type "END" and the value "VEVENT" to output.

  7. Add an iCalendar line with the type "END" and the value "VCALENDAR" to output.
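
The handling of time elements in step 6.4 can be sketched as follows. This is a rough illustration: `icalendar_time` is a hypothetical name, and the two regular expressions only check the general shape of a date or a global date and time string; they are assumptions standing in for the full "valid date string" and "valid global date and time string" definitions and do not check calendar validity.

```python
import re

# Shape-only approximations of the two string formats (assumption).
DATE_RE = re.compile(r"\d{4,}-\d{2}-\d{2}")
DATETIME_RE = re.compile(
    r"\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})"
)


def icalendar_time(value):
    """Return (stripped value, annotation), or None if the property is skipped."""
    # Strip all HYPHEN-MINUS and COLON characters from the value.
    stripped = value.replace("-", "").replace(":", "")
    if DATE_RE.fullmatch(value):
        return stripped, "VALUE=DATE"
    if DATETIME_RE.fullmatch(value):
        return stripped, "VALUE=DATE-TIME"
    return None
```

For instance, `icalendar_time("2009-05-05")` returns `("20090505", "VALUE=DATE")`.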

When the above algorithm says that the user agent is to add an iCalendar line consisting of a type type, a value value, and optionally an annotation, to a string output, it must run the following steps:

  1. Let line be an empty string.

  2. Append type, converted to ASCII uppercase, to line.

  3. If there is an annotation:

    1. Append a U+003B SEMICOLON character (;) to line.

    2. Append the annotation to line.

  4. Append a U+003A COLON character (:) to line.

  5. Prefix every U+005C REVERSE SOLIDUS character (\) in value with another U+005C REVERSE SOLIDUS character (\).

  6. Prefix every U+002C COMMA character (,) in value with a U+005C REVERSE SOLIDUS character (\).

  7. Prefix every U+003B SEMICOLON character (;) in value with a U+005C REVERSE SOLIDUS character (\).

  8. Replace every U+000D CARRIAGE RETURN U+000A LINE FEED character pair (CRLF) in value with a U+005C REVERSE SOLIDUS character (\) followed by a U+006E LATIN SMALL LETTER N.

  9. Replace every remaining U+000D CARRIAGE RETURN (CR) or U+000A LINE FEED (LF) character in value with a U+005C REVERSE SOLIDUS character (\) followed by a U+006E LATIN SMALL LETTER N.

  10. Append value to line.

  11. Let maximum length be 75.

  12. While line is more than maximum length Unicode code points long, run the following substeps:

    1. Append the first maximum length Unicode code points of line to output.

    2. Remove the first maximum length Unicode code points from line.

    3. Append a U+000D CARRIAGE RETURN character (CR) to output.

    4. Append a U+000A LINE FEED character (LF) to output.

    5. Append a U+0020 SPACE character to output.

    6. Let maximum length be 74.

  13. Append (what remains of) line to output.

  14. Append a U+000D CARRIAGE RETURN character (CR) to output.

  15. Append a U+000A LINE FEED character (LF) to output.

This algorithm can generate invalid iCalendar output, if the input does not conform to the rules described for the vevent predefined type and predefined property names.

5.5.5 Atom

Given a Document source, a user agent must run the following algorithm to extract an Atom feed:

  1. If the Document source does not contain any article elements, then return nothing and abort these steps. This algorithm can only be used with documents that contain distinct articles.

  2. Let R be an empty XML Document object whose address is user-agent defined.

  3. Append a feed element in the Atom namespace to R.

  4. For each element candidate that is, or is a descendant of, an address element that has no article element ancestors, and that is an item that has the type vcard, if there is a property property named fn whose corresponding item is candidate, and the value of property is not an item, then append an author element in the Atom namespace to the root element of R whose contents is a text node with its data set to the value of property.

  5. If there is a link element whose rel attribute's value includes the keyword icon, and that element also has an href attribute whose value successfully resolves relative to the link element, then append an icon element in the Atom namespace to the root element of R whose contents is a text node with its data set to the absolute URL resulting from resolving the value of the href attribute.

  6. Append an id element in the Atom namespace to the root element of R whose contents is a text node with its data set to the document's current address.

  7. Optionally: Let x be a link element in the Atom namespace. Add a rel attribute whose value is the string "self" to x. Append a text node with its data set to the (user-agent defined) address of R to x. Append x to the root element of R.

    This step would be skipped when the document R has no convenient address. The presence of the rel="self" link is a "should"-level requirement in the Atom specification.

  8. Let x be a link element in the Atom namespace. Add a rel attribute whose value is the string "alternate" to x. If the document being converted is an HTML document, add a type attribute whose value is the string "text/html" to x. Otherwise, the document being converted is an XML document; add a type attribute whose value is the string "application/xhtml+xml" to x. Append a text node with its data set to the document's current address to x. Append x to the root element of R.

  9. Let subheading text be the empty string.

  10. Let heading be the first element of heading content whose nearest ancestor of sectioning content is the body element, if any, or null if there is none.

  11. Take the appropriate action from the following list, as determined by the type of the heading element:

    If heading is null

    Let heading text be the textContent of the title element, if there is one, or the empty string otherwise.

    If heading is a hgroup element

    If heading contains no child h1–h6 elements, let heading text be the empty string.

    Otherwise, let headings list be a list of all the h1–h6 element children of heading, sorted first by descending rank and then in tree order (so h1s first, then h2s, etc, with each group in the order they appear in the document). Then, let heading text be the textContent of the first entry in headings list, and if there are multiple entries, let subheading text be the textContent of the second entry in headings list.

    If heading is an h1–h6 element

    Let heading text be the textContent of heading.

  12. Append a title element in the Atom namespace to the root element of R whose contents is a text node with its data set to heading text.

  13. If subheading text is not the empty string, append a subtitle element in the Atom namespace to the root element of R whose contents is a text node with its data set to subheading text.

  14. Let global update date have no value.

  15. For each article element article that does not have an ancestor article element, run the following steps:

    1. Let E be an entry element in the Atom namespace, and append E to the root element of R.

    2. Let heading be the first element of heading content whose nearest ancestor of sectioning content is article, if any, or null if there is none.

    3. Take the appropriate action from the following list, as determined by the type of the heading element:

      If heading is null

      Let heading text be the empty string.

      If heading is a hgroup element

      If heading contains no child h1–h6 elements, let heading text be the empty string.

      Otherwise, let headings list be a list of all the h1–h6 element children of heading, sorted first by descending rank and then in tree order (so h1s first, then h2s, etc, with each group in the order they appear in the document). Then, let heading text be the textContent of the first entry in headings list.

      If heading is an h1–h6 element

      Let heading text be the textContent of heading.

    4. Append a title element in the Atom namespace to E whose contents is a text node with its data set to heading text.

    5. For each element candidate that is, or is a descendant of, an address element whose nearest article element ancestor is article, and that is an item that has the type vcard, if there is a property property named fn whose corresponding item is candidate, and the value of property is not an item, then append an author element in the Atom namespace to E whose contents is a text node with its data set to the value of property.

    6. Clone article and its descendants into an environment that has scripting disabled, has no plugins, and fails any attempt to fetch any resources. Let cloned article be the resulting clone article element.

    7. Remove from the subtree rooted at cloned article any article elements other than the cloned article itself, any header, footer, or nav elements whose nearest ancestor of sectioning content is the cloned article, and the first element of heading content whose nearest ancestor of sectioning content is the cloned article, if any.

    8. If cloned article contains any ins or del elements with datetime attributes whose values parse as global date and time strings without errors, then let update date be the value of the datetime attribute that parses to the newest global date and time.

      Otherwise, let update date have no value.

      This value is used below; it is calculated here because in certain cases the next step mutates the cloned article.

    9. If the document being converted is an HTML document, then: Let x be a content element in the Atom namespace. Add a type attribute whose value is the string "html" to x. Append a text node with its data set to the result of running the HTML fragment serialization algorithm on cloned article to x. Append x to E.

      Otherwise, the document being converted is an XML document: Let x be a content element in the Atom namespace. Add a type attribute whose value is the string "xml" to x. Append a div element to x. Move all the child nodes of the cloned article node to that div element, preserving their relative order. Append x to E.

    10. Establish the value of id and has-alternate from the first of the following to apply:

      If the article node has a descendant a or area element with an href attribute that successfully resolves relative to that descendant and a rel attribute whose value includes the bookmark keyword
      Let id be the absolute URL resulting from resolving the value of the href attribute of the first such a or area element, relative to the element. Let has-alternate be true.
      If the article node has an id attribute
      Let id be the document's current address, with the fragment identifier (if any) removed, and with a new fragment identifier specified, consisting of the value of the article element's id attribute. Let has-alternate be false.
      Otherwise
      Let id be a user-agent defined undereferenceable yet globally unique absolute URL. Let has-alternate be false.
    11. Append an id element in the Atom namespace to E whose contents is a text node with its data set to id.

    12. If has-alternate is true: Let x be a link element in the Atom namespace. Add a rel attribute whose value is the string "alternate" to x. Append a text node with its data set to id to x. Append x to E.

    13. If article has a pubdate attribute, and parsing that attribute's value as a global date and time string does not result in an error, then let publication date be the value of that attribute.

      Otherwise, let publication date have no value.

    14. If update date has no value but publication date does, then let update date have the value of publication date.

      Otherwise, if publication date has no value but update date does, then let publication date have the value of update date.

    15. If update date has a value, and global update date has no value or is less recent than update date, then let global update date have the value of update date.

    16. If publication date and update date both still have no value, then let them both have a value that is a valid global date and time string representing the global date and time of the moment that this algorithm was invoked.

    17. Append a published element in the Atom namespace to E whose contents is a text node with its data set to publication date.

    18. Append an updated element in the Atom namespace to E whose contents is a text node with its data set to update date.

  16. If global update date has no value, then let it have a value that is a valid global date and time string representing the date and time of the Document's source file's last modification, if it is known, or else the moment that this algorithm was invoked.

  17. Insert an updated element in the Atom namespace, whose contents is a text node with its data set to global update date, into the root element of R before the first entry element in the Atom namespace.

  18. Return the Atom document R.
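
The per-article date handling in steps 15.14 to 15.16 is order-sensitive, and can be sketched as follows. This is an illustrative sketch only: `reconcile_dates` is a hypothetical name, dates are modelled as timezone-aware `datetime` objects instead of strings, and `None` stands for "no value".

```python
from datetime import datetime, timezone


def reconcile_dates(publication_date, update_date, global_update_date, now):
    """Return (publication date, update date, global update date)."""
    # Step 14: let each date fill in for the other when only one is known.
    if update_date is None and publication_date is not None:
        update_date = publication_date
    elif publication_date is None and update_date is not None:
        publication_date = update_date
    # Step 15: track the most recent update date across all articles.
    if update_date is not None and (
        global_update_date is None or global_update_date < update_date
    ):
        global_update_date = update_date
    # Step 16: fall back to the moment the algorithm was invoked.
    if publication_date is None and update_date is None:
        publication_date = update_date = now
    return publication_date, update_date, global_update_date
```

Because the fallback to the current time happens after the global update date is considered, an article with no dates at all does not affect global update date.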

The Atom namespace is: http://www.w3.org/2005/Atom
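
As a rough illustration of how elements in this namespace might be created, the following sketch builds the feed-level skeleton from steps 3, 6, 8, and 12 using Python's standard `xml.etree.ElementTree` module. The helper names `atom` and `make_feed` and the `is_html` flag are hypothetical, and the document address and heading text are assumed to have been determined already.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def atom(tag):
    # ElementTree spells namespaced names as "{namespace}local-name".
    return "{%s}%s" % (ATOM_NS, tag)


def make_feed(address, heading_text, is_html=True):
    """Build a feed element with id, alternate link, and title children."""
    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("id")).text = address
    link = ET.SubElement(feed, atom("link"))
    link.set("rel", "alternate")
    link.set("type", "text/html" if is_html else "application/xhtml+xml")
    # The steps above place the address in a text node inside the link.
    link.text = address
    ET.SubElement(feed, atom("title")).text = heading_text
    return feed
```

Calling `ET.tostring(make_feed("http://example.com/", "News"))` would then serialize the skeleton, with every element bound to the Atom namespace.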