Copyright © 2017 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and permissive document license rules apply.
This specification defines new HTML attributes to embed machine-readable data in HTML documents in a style similar to RDFa. It is compatible with JSON, and can be written in a style which is convertible to RDF, although two-way conversion is not lossless.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document is a Working Draft to update the 26 June 2017 W3C Working Draft. The editors hope to request advancement to Candidate Recommendation before the end of November 2017. Review is particularly sought on the new algorithms for conversion to JSON-LD and RDFa.
If you wish to make comments regarding this document, please submit them as GitHub issues. All feedback is welcome, but please note the contribution guidelines require agreement to the terms of the W3C Patent Policy for substantive contributions.
All changes are listed in detail in the GitHub commit history. Significant changes are described in the changes section of this document.
This document was published by the Web Platform Working Group as a Working Draft. This document is intended to become a W3C Recommendation.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 March 2017 W3C Process Document.
This specification is an extension to HTML. All normative content in the HTML specification not specifically overridden by this specification is intended to be normative for this specification. This specification depends on the HTML specification and its extensions for definitions of individual HTML elements and attributes. [HTML52][html-extensions]
Information expressed as Microdata can be converted to JSON-LD, JSON, or RDFa as described in Section 6. Microdata can be converted to RDF, for example via conversion to JSON-LD or RDFa, if additional constraints are applied to the Microdata content. A process to convert Microdata directly to RDF is described in Microdata to RDF. [JSON-LD], [JSON], [rdfa-core], [microdata-rdf]
For the purposes of this specification, the terms "URL" and "URI" are equivalent. The URL specification, and RFC 3986 which uses the term URI, define a URL, valid URL, and absolute URL. [RFC3986][URL]
A valid absolute URL is an absolute URL which is valid.
RFC 3986 defines the term resolve a URL [RFC3986].
DOM 4.1 defines textContent for attributes, and for elements or nodes, as well as the term tree order. [DOM41]
This specification relies on the HTML specification to define the following terms: [HTML52]
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words "must", "must not", "should", "should not", and "may" in the normative parts of this document are to be interpreted as described in RFC2119. [RFC2119]
Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.
Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)
The Microdata model consists of groups of name-value pairs known as items.
Each group is known as an item. Each item can have zero or more item types, global identifier(s), and associated name-value pairs. Each name in the name-value pair is known as a property, and each property has one or more values. Each value is either a string or itself a group of name-value pairs (an item). The names are unordered relative to each other, but if a particular name has multiple values, they do have a relative order.
itemscope, itemtype, and itemid
Every HTML element may have an itemscope attribute specified. The itemscope attribute is a boolean attribute.
An element with the itemscope attribute specified creates a new item, a group of name-value pairs that describe properties, and their values, of the thing represented by that element.
Elements with an itemscope attribute may have an itemtype attribute specified, to give the item types of the item.
The itemtype attribute, if specified, must have a value that is an unordered set of unique space-separated tokens that are case-sensitive, each of which is a valid absolute URL, and all of which are in the same vocabulary. The attribute's value must have at least one token.
The item types of an item are the tokens obtained by splitting the element's itemtype attribute's value on spaces. If the itemtype attribute is missing or parsing it in this way finds no tokens, the item is said to have no item types.
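The splitting step can be illustrated with a short, non-normative Python sketch. It assumes that an element's attributes are available as a plain dictionary; the names split_on_spaces and item_types are hypothetical helpers introduced here, not terms defined by this specification. The set of space characters is the one HTML defines (U+0020, U+0009, U+000A, U+000C, and U+000D).

```python
import re

# Runs of the HTML space characters: U+0020, U+0009, U+000A, U+000C, U+000D.
_SPACE_RUN = re.compile(r"[ \t\n\f\r]+")


def split_on_spaces(value):
    """Return the tokens in value, in order, dropping empty leading/trailing pieces."""
    return [token for token in _SPACE_RUN.split(value) if token]


def item_types(attrs):
    """Return the item types for an element whose attributes are given as a dict.

    A missing itemtype attribute, or one with no tokens, yields an empty list,
    which corresponds to "no item types" in the text above.
    """
    return split_on_spaces(attrs.get("itemtype", ""))
```

For instance, an itemtype value that is wrapped across two lines of markup still yields two tokens, and a value consisting only of spaces yields none.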
The item types determine the vocabulary identifier. This is a URL that is prepended to property names, which identifies them as part of their vocabulary. The value of the vocabulary identifier for an item is determined as follows:
itemtype attribute, split on spaces.
User agents must not automatically dereference unknown URLs given as item types and property names. These URLs are a priori opaque identifiers.
A specification could define that its item types can be dereferenced to provide the user with help information. Vocabulary authors are encouraged to provide useful information at the given URL, either in prose or a formal language such as RDF.
The itemtype attribute must not be specified on elements that do not have an itemscope attribute specified.
An item is said to be a typed item when either it has an item type, or it is the value of a property of a typed item. The relevant types for a typed item are the item's item types, if it has any, or else the relevant types of the item for which it is a property's value.
Elements with an itemscope attribute may also have an itemid attribute specified, to give a global identifier for the item, so that it can be related to other items elsewhere on the Web, or with concepts beyond the Web such as ISBN numbers for published books.
The itemid attribute, if specified, must have a value that is a valid URL potentially surrounded by space characters.
The global identifier of an item is the value of its element's itemid attribute, if it has one, resolved relative to the element on which the attribute is specified. If the itemid attribute is missing or if resolving it fails, it is said to have no global identifier.
The itemid attribute must not be specified on elements that do not have an itemscope attribute.
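As a non-normative illustration, the following Python sketch derives a global identifier from an itemid value. It assumes the element's attributes are available as a dictionary and that the caller supplies the element's base URL; global_identifier and base_url are hypothetical names. Python's urljoin is used as a stand-in for the "resolve a URL" operation of RFC 3986 and is not guaranteed to match it in every edge case.

```python
from urllib.parse import urljoin

# The HTML space characters that may surround the URL in an itemid value.
HTML_SPACE_CHARACTERS = " \t\n\f\r"


def global_identifier(attrs, base_url):
    """Return the resolved itemid, or None when there is no global identifier."""
    if "itemid" not in attrs:
        return None  # attribute missing: no global identifier
    raw = attrs["itemid"].strip(HTML_SPACE_CHARACTERS)
    try:
        return urljoin(base_url, raw)
    except ValueError:
        return None  # resolving failed: no global identifier
```

With a base URL of https://md.example.com/index.html, an itemid of " product-catalog/33041 " would resolve to https://md.example.com/product-catalog/33041.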
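This example shows Microdata used to describe the products of a model railway manufacturer. The vocabulary identifier is https://md.example.com/. The example uses five property names: name, product-code, scale, digital, and track-type.
It identifies four item types: https://md.example.com/loco, https://md.example.com/passengers, https://md.example.com/track, and https://md.example.com/lighting.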
Each item that uses this vocabulary can be given one or more of these types, depending on what the product is.
Thus, a locomotive might be marked up as:
<dl itemscope itemtype="https://md.example.com/loco
https://md.example.com/lighting"
itemid="https://md.example.com/product-catalog/33041">
<dt>Name:
<dd itemprop="name">Tank Locomotive (DB 80)
<dt>Product code:
<dd itemprop="product-code">33041
<dt>Scale:
<dd itemprop="scale">HO
<dt>Digital:
<dd itemprop="digital">Delta
</dl>
A turnout lantern retrofit kit might be marked up as:
<dl itemscope itemtype="https://md.example.com/track
https://md.example.com/lighting"
itemid="https://md.example.com/product-catalog/74470">
<dt>Name:
<dd itemprop="name">Turnout Lantern Kit
<dt>Product code:
<dd itemprop="product-code">74470
<dt>Purpose:
<dd>For retrofitting 2 <span itemprop="track-type">C</span> Track
turnouts. <meta itemprop="scale" content="HO">
</dl>
A passenger car with no lighting might be marked up as:
<dl itemscope itemtype="https://md.example.com/passengers"
itemid="https://md.example.com/product-catalog/8710">
<dt>Name:
<dd itemprop="name">Express Train Passenger Car (DB Am 203)
<dt>Product code:
<dd itemprop="product-code">8710
<dt>Scale:
<dd itemprop="scale">Z
</dl>
itemprop and itemref attributes
The itemprop attribute, when added to any HTML element that is part of an item, identifies a property of that item. The attribute must be an unordered set of unique space-separated tokens, representing the case-sensitive names of the properties that it adds. The attribute must contain at least one token.
Each token must be either a valid absolute URL or a string that contains no "." (U+002E) characters and no ":" (U+003A) characters.
Vocabulary specifications must not define property names for Microdata that contain "." (U+002E) characters, ":" (U+003A) characters, nor space characters (defined in [HTML52] as U+0020, U+0009, U+000A, U+000C, and U+000D).
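The token requirement can be approximated with a short, non-normative Python sketch. The function names are hypothetical, and "valid absolute URL" is approximated here by checking that Python's urlsplit finds a scheme, which is considerably looser than the definitions referenced by this specification.

```python
from urllib.parse import urlsplit


def looks_like_absolute_url(token):
    """Rough stand-in for 'valid absolute URL': the token has a URL scheme."""
    try:
        return bool(urlsplit(token).scheme)
    except ValueError:
        return False


def is_acceptable_property_name(token):
    """Check one itemprop token against the requirement stated above."""
    if token == "":
        return False  # splitting on spaces never produces an empty token
    if "." not in token and ":" not in token:
        return True  # a plain name defined by the vocabulary
    return looks_like_absolute_url(token)
```

Under these assumptions "product-code" and "https://schema.org/name" are accepted, while "foo.bar" is rejected because it contains a "." and has no scheme.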
The property names of an element are determined as follows:
itemprop attribute, split on spaces.
Within an item, the properties are unordered with respect to each other, except for properties with the same name, which are ordered in the order they are given by the algorithm that defines the properties of an item.
In the following example, the "a" property has the values "1" and "2", in that order, but whether the "a" property comes before the "b" property or not is not important:
<div itemscope>
<p itemprop="a">1</p>
<p itemprop="a">2</p>
<p itemprop="b">test</p>
</div>
Thus, the following is equivalent:
<div itemscope>
<p itemprop="b">test</p>
<p itemprop="a">1</p>
<p itemprop="a">2</p>
</div>
As is the following:
<div itemscope>
<p itemprop="a">1</p>
<p itemprop="b">test</p>
<p itemprop="a">2</p>
</div>
Elements with an itemscope attribute may have an itemref attribute specified, to give a list of additional elements to crawl to find the name-value pairs of the item.
The itemref attribute, if specified, must have a value that is an unordered set of unique space-separated tokens that are case-sensitive, consisting of IDs of elements in the same document.
The itemref attribute must not be specified on elements that do not have an itemscope attribute specified.
The preceding example:
<div itemscope>
<p itemprop="a">1</p>
<p itemprop="a">2</p>
<p itemprop="b">test</p>
</div>
Could also be written as follows:
<div id="x">
<p itemprop="a">1</p>
</div>
<div itemscope itemref="x">
<p itemprop="b">test</p>
<p itemprop="a">2</p>
</div>
When an element with an itemprop attribute adds a property to multiple items, the requirement above regarding the tokens applies for each item individually.
For the following code:
<div itemscope itemtype="http://example.com/a" itemref="x"></div>
<div itemscope itemtype="http://example.com/b" itemref="x"></div>
<meta id="x" itemprop="z" content="">
The author should be certain that z is a valid property name for both the http://example.com/a and http://example.com/b vocabularies.
content attribute, element-specific attributes and element content
The algorithm to determine the value for a name-value pair is given by applying the first matching case in the following list:
If the element has an itemscope attribute: The value is the item created by the element.
If the element has a content attribute: The value is the textContent of the element's content attribute.
HTML only allows the content attribute on the meta element. This specification changes the content model to allow it on any element, as a global attribute.
If the element is an audio, embed, iframe, img, source, track, or video element: If the element has a src attribute, let proposed value be the result of resolving that attribute's textContent. If proposed value is a valid absolute URL, the value is proposed value; otherwise the value is the empty string.
If the element is an a, area, or link element: If the element has an href attribute, let proposed value be the result of resolving that attribute's textContent. If proposed value is a valid absolute URL, the value is proposed value; otherwise the value is the empty string.
If the element is an object element: If the element has a data attribute, let proposed value be the result of resolving that attribute's textContent. If proposed value is a valid absolute URL, the value is proposed value; otherwise the value is the empty string.
If the element is a data or meter element: If the element has a value attribute, the value is that attribute's textContent.
If the element is a time element: If the element has a datetime attribute, the value is that attribute's textContent.
Otherwise: The value is the element's textContent.
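The list of cases above can be sketched, non-normatively, in Python. The Element class, the base_url parameter, and the function names are hypothetical and exist only for this illustration. Two behaviours are assumptions rather than statements of this specification: a URL property element that lacks its URL attribute yields the empty string, and a data, meter, or time element that lacks its value or datetime attribute falls back to the element's text content.

```python
from dataclasses import dataclass, field
from urllib.parse import urljoin, urlsplit

SRC_ELEMENTS = {"audio", "embed", "iframe", "img", "source", "track", "video"}
HREF_ELEMENTS = {"a", "area", "link"}


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element: name, attributes, text content."""
    tag: str
    attrs: dict = field(default_factory=dict)
    text: str = ""


def _resolved_url(element, attr_name, base_url):
    """Resolve a URL attribute; return "" if absent or not absolute after resolving."""
    if attr_name not in element.attrs:
        return ""  # assumption: a missing attribute is treated like a failed resolve
    try:
        proposed = urljoin(base_url, element.attrs[attr_name])
        return proposed if urlsplit(proposed).scheme else ""
    except ValueError:
        return ""


def property_value(element, base_url):
    """Apply the first matching case, in the order the cases are listed."""
    attrs = element.attrs
    if "itemscope" in attrs:
        return element  # the value is the item created by the element
    if "content" in attrs:
        return attrs["content"]
    if element.tag in SRC_ELEMENTS:
        return _resolved_url(element, "src", base_url)
    if element.tag in HREF_ELEMENTS:
        return _resolved_url(element, "href", base_url)
    if element.tag == "object":
        return _resolved_url(element, "data", base_url)
    if element.tag in {"data", "meter"} and "value" in attrs:
        return attrs["value"]
    if element.tag == "time" and "datetime" in attrs:
        return attrs["datetime"]
    return element.text  # assumption: also used as the fallback for data/meter/time
```

For example, a time element with datetime="2013-08-29" and the text "today" produces the value "2013-08-29", and a link element with href="?comments=0" is resolved against the page URL.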
The URL property elements are the a, area, audio, embed, iframe, img, link, object, source, track, and video elements.
If a property's value, as defined by the property's definition, is an absolute URL, the property must be specified using a URL property element.
These requirements do not apply just because a property value happens to match the syntax for a URL. They only apply if the property is explicitly defined as taking such a value.
For example, a book about the first moon landing could be called "mission:moon". A "title" property from a vocabulary that defines a title as being a string would not expect the title to be given in an a element, even though it looks like a URL. On the other hand, if there was a (rather narrowly scoped!) vocabulary for "books whose titles look like URLs" which had a "title" property whose content was defined as a URL, then the property would expect the title to be given in an a element (or one of the other URL property elements), because of the requirement above.
To find the properties of an item defined by the element root, the user agent must run the following steps. These steps are also used to flag Microdata Errors.
Let results, memory, and pending be empty lists of elements.
Add the element root to memory.
Add the child elements of root, if any, to pending.
If root has an itemref attribute, split the value of that itemref attribute on spaces. For each resulting token ID, if there is an element in the document whose ID is ID, then add the first such element to pending.
Loop: If pending is empty, jump to the step labeled end of loop.
Remove an element from pending and let current be that element.
If current is already in memory, there is a Microdata Error; return to the step labeled loop.
Add current to memory.
If current does not have an itemscope attribute, then: add all the child elements of current to pending.
If current has an itemprop attribute specified and has one or more property names, then add current to results.
Return to the step labeled loop.
End of loop: Sort results in tree order.
Return results.
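The steps above can be sketched, non-normatively, in Python. The Element class and the document argument (the root of a small in-memory tree) are hypothetical stand-ins for the DOM. The sketch uses Python's str.split, which splits on a slightly larger set of whitespace characters than the HTML space characters, and it returns the elements that triggered a Microdata Error alongside the results.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: elements are compared and hashed by identity
class Element:
    """Hypothetical stand-in for a DOM element."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    text: str = ""


def tree_order(node):
    """Yield node and its descendants in tree order (preorder, depth first)."""
    yield node
    for child in node.children:
        yield from tree_order(child)


def find_properties(root, document):
    """Return (results, errors) for the item defined by the element root."""
    position = {el: i for i, el in enumerate(tree_order(document))}
    first_with_id = {}
    for el in tree_order(document):
        if "id" in el.attrs:
            first_with_id.setdefault(el.attrs["id"], el)  # keep the first such element

    results, errors = [], []
    memory = [root]
    pending = list(root.children)
    for token in root.attrs.get("itemref", "").split():
        if token in first_with_id:
            pending.append(first_with_id[token])

    while pending:
        current = pending.pop()
        if current in memory:
            errors.append(current)  # Microdata Error: element reached twice
            continue
        memory.append(current)
        if "itemscope" not in current.attrs:
            pending.extend(current.children)
        if current.attrs.get("itemprop", "").split():
            results.append(current)

    results.sort(key=lambda el: position[el])  # sort results in tree order
    return results, errors
```

Because the results are sorted at the end, the order in which elements are removed from pending does not affect the outcome.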
A document must not contain any items for which the algorithm to find the properties of an item finds any Microdata Error.
An item is a top-level Microdata item if its element does not have an itemprop attribute.
All itemref attributes in a Document must be such that there are no cycles in the graph formed from representing each item in the Document as a node in the graph and each property of an item whose value is another item as an edge in the graph connecting those two items.
A document must not contain an itemprop attribute that would not be a property of any item in that document were the properties all to be determined.
In this example, a single license statement is applied to two works, using itemref from the items representing the works:
<!DOCTYPE HTML>
<html>
<head>
<title>Photo gallery</title>
</head>
<body>
<h1>My photos</h1>
<figure itemscope itemtype="http://n.whatwg.org/work" itemref="licenses">
<img itemprop="work" src="images/house.jpeg" alt="A white house, boarded up, sits in a forest.">
<figcaption itemprop="title">The house I found.</figcaption>
</figure>
<figure itemscope itemtype="http://n.whatwg.org/work" itemref="licenses">
<img itemprop="work" src="images/mailbox.jpeg" alt="Outside the house is a mailbox. It has a leaflet inside.">
<figcaption itemprop="title">The mailbox.</figcaption>
</figure>
<footer>
<p id="licenses">All images licensed under the <a itemprop="license"
href="http://www.opensource.org/licenses/mit-license.php">MIT
license</a>.</p>
</footer>
</body>
</html>
The above results in two items with the type "http://n.whatwg.org/work", one with:
work: images/house.jpeg
title: The house I found.
license: http://www.opensource.org/licenses/mit-license.php
...and one with:
work: images/mailbox.jpeg
title: The mailbox.
license: http://www.opensource.org/licenses/mit-license.php
Given a list of nodes nodes in a Document, a user agent must run the following algorithm to extract the Microdata from those nodes into [JSON-LD]:
Let result be an empty object.
Let items be an empty array.
For each node in nodes, check if the element is a top-level Microdata item, and if it is then get the object for JSON-LD for that element and add it to items.
Add an entry to result called "items" whose value is the array items.
Add an entry to result called "@context" whose value is the following object:
{ "@vocab" : "" }
Return the result of serializing result to JSON. [JSON]
This algorithm returns an object with a single property that is an array, instead of just returning an array, so that it is possible to extend the algorithm in the future if necessary.
When the user agent is to get the object for JSON-LD for an item item, potentially together with a list of elements memory, it must run the following substeps:
Let result be an empty object.
If no memory was passed to the algorithm, let memory be an empty list.
Add item to memory.
If the item has any item types, add an entry to result called "@type" whose value is an array listing the item types of item, in the order they were specified on the itemtype attribute.
If the item has a global identifier, add an entry to result called "@id" whose value is the global identifier of item.
Let properties be an empty object.
For each element element that has one or more property names and is one of the properties of the item item, in the order those elements are given by the algorithm that returns the properties of an item, run the following substeps:
Let value be the property value of element.
If value is an item, then: If value is in memory, then let value be the string "ERROR". Otherwise, get the object for value, passing a copy of memory, and then replace value with the object returned from those steps.
For each name name in element's property names, run the following substeps:
If there is no entry named name in result, then add an entry named name to result whose value is an empty array.
Append value to the entry named name in result.
Return result.
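The substeps above can be sketched, non-normatively, in Python. The sketch assumes that the item model has already been extracted from the document; the Item class, its field names, and the function names are hypothetical. Each entry in properties pairs the property names of one element with its value, which is either a string or another Item.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: items are compared by identity
class Item:
    """Hypothetical, already-extracted item."""
    types: list = field(default_factory=list)
    identifier: str = None
    # Each entry is (property names, value); value is a str or another Item.
    properties: list = field(default_factory=list)


def object_for_json_ld(item, memory=None):
    """Build the JSON-LD object for item, guarding against cycles with memory."""
    result = {}
    memory = list(memory or [])  # work on a copy so sibling branches stay independent
    memory.append(item)
    if item.types:
        result["@type"] = list(item.types)
    if item.identifier is not None:
        result["@id"] = item.identifier
    for names, value in item.properties:
        if isinstance(value, Item):
            if value in memory:
                value = "ERROR"
            else:
                value = object_for_json_ld(value, memory)
        for name in names:
            result.setdefault(name, []).append(value)
    return result


def extract_json_ld(top_level_items):
    """Wrap the per-item objects as described by the extraction algorithm."""
    return {
        "items": [object_for_json_ld(item) for item in top_level_items],
        "@context": {"@vocab": ""},
    }
```

The returned dictionary can be passed to json.dumps to obtain the serialized form. An item that refers to itself through one of its properties produces the string "ERROR" at that position instead of recursing.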
For example, take this markup:
<!DOCTYPE HTML>
<title>My Blog</title>
<article itemscope itemtype="https://schema.org/BlogPosting">
<header>
<h1 itemprop="headline">Progress report</h1>
<p><time itemprop="datePublished" datetime="2013-08-29">today</time></p>
<link itemprop="url" href="?comments=0">
</header>
<p>All in all, he's doing well with his swim lessons. The biggest thing was he had trouble
putting his head in, but we got it down.</p>
<section>
<h1>Comments</h1>
<article itemprop="comment" itemscope itemtype="https://schema.org/Comment" id="c1">
<link itemprop="url" href="#c1">
<footer>
<p>Posted by: <span itemprop="creator" itemscope itemtype="https://schema.org/Person">
<span itemprop="name">Greg</span>
</span></p>
<p><time itemprop="dateCreated" datetime="2013-08-29">15 minutes ago</time></p>
</footer>
<p>Ha!</p>
</article>
<article itemprop="comment" itemscope itemtype="https://schema.org/Comment" id="c2">
<link itemprop="url" href="#c2">
<footer>
<p>Posted by: <span itemprop="creator" itemscope itemtype="https://schema.org/Person">
<span itemprop="name">Charlotte</span>
</span></p>
<p><time itemprop="dateCreated" datetime="2013-08-29">5 minutes ago</time></p>
</footer>
<p>When you say "we got it down"...</p>
</article>
</section>
</article>
It would be turned into the following JSON by the algorithm above (supposing that the page's URL was http://blog.example.com/progress-report):
{
"items": [
{
"@type": [ "https://schema.org/BlogPosting" ],
"headline": [ "Progress report" ],
"datePublished": [ "2013-08-29" ],
"url": [ "http://blog.example.com/progress-report?comments=0" ],
"comment": [
{
"@type": [ "https://schema.org/Comment" ],
"url": [ "http://blog.example.com/progress-report#c1" ],
"creator": [
{
"@type": [ "https://schema.org/Person" ],
"name": [ "Greg" ]
}
],
"dateCreated": [ "2013-08-29" ]
},
{
"@type": [ "https://schema.org/Comment" ],
"url": [ "http://blog.example.com/progress-report#c2" ],
"creator": [
{
"@type": [ "https://schema.org/Person" ],
"name": [ "Charlotte" ]
}
],
"dateCreated": [ "2013-08-29" ]
}
]
}
],
"@context": { "@vocab": "" }
}
Given a list of nodes nodes in a Document, a user agent must run the following algorithm to extract the Microdata from those nodes into a JSON form:
Let result be an empty object.
Let items be an empty array.
For each node in nodes, check if the element is a top-level Microdata item, and if it is then get the object for that element and add it to items.
Add an entry to result called "items" whose value is the array items.
Return the result of serializing result to JSON in the shortest possible way (meaning no whitespace between tokens, no unnecessary zero digits in numbers, and only using Unicode escapes in strings for characters that do not have a dedicated escape sequence), and with a lowercase "e" used, when appropriate, in the representation of any numbers. [JSON]
This algorithm returns an object with a single property that is an array, instead of just returning an array, so that it is possible to extend the algorithm in the future if necessary.
When the user agent is to get the object for an item item, potentially together with a list of elements memory, it must run the following substeps:
Let result be an empty object.
If no memory was passed to the algorithm, let memory be an empty list.
Add item to memory.
If the item has any item types, add an entry to result called "type" whose value is an array listing the item types of item, in the order they were specified on the itemtype attribute.
If the item has a global identifier, add an entry to result called "id" whose value is the global identifier of item.
Let properties be an empty object.
For each element element that has one or more property names and is one of the properties of the item item, in the order those elements are given by the algorithm that returns the properties of an item, run the following substeps:
Let value be the property value of element.
If value is an item, then: If value is in memory, then let value be the string "ERROR". Otherwise, get the object for value, passing a copy of memory, and then replace value with the object returned from those steps.
For each name name in element's property names, run the following substeps:
If there is no entry named name in properties, then add an entry named name to properties whose value is an empty array.
Append value to the entry named name in properties.
Add an entry to result called "properties" whose value is the object properties.
Return result.
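A non-normative Python sketch of the JSON extraction follows. As before, the Item class and the function names are hypothetical, and the item model is assumed to have been extracted already. Passing separators=(",", ":") to json.dumps removes whitespace between tokens, which approximates the "shortest possible" serialization; whether Python's escaping rules match the requirement in every case has not been verified here.

```python
import json
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: items are compared by identity
class Item:
    """Hypothetical, already-extracted item."""
    types: list = field(default_factory=list)
    identifier: str = None
    # Each entry is (property names, value); value is a str or another Item.
    properties: list = field(default_factory=list)


def get_object(item, memory=None):
    """Build the plain JSON object for item, with values grouped under "properties"."""
    result = {}
    memory = list(memory or [])  # copy, so each branch tracks only its own ancestors
    memory.append(item)
    if item.types:
        result["type"] = list(item.types)
    if item.identifier is not None:
        result["id"] = item.identifier
    properties = {}
    for names, value in item.properties:
        if isinstance(value, Item):
            value = "ERROR" if value in memory else get_object(value, memory)
        for name in names:
            properties.setdefault(name, []).append(value)
    result["properties"] = properties
    return result


def extract_json(top_level_items):
    """Serialize the top-level items compactly, with no whitespace between tokens."""
    result = {"items": [get_object(item) for item in top_level_items]}
    return json.dumps(result, separators=(",", ":"), ensure_ascii=False)
```

Unlike the JSON-LD form, every object carries a "properties" entry, even when the item has no properties.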
For example, take this markup:
<!DOCTYPE HTML>
<title>My Blog</title>
<article itemscope itemtype="https://schema.org/BlogPosting">
<header>
<h1 itemprop="headline">Progress report</h1>
<p><time itemprop="datePublished" datetime="2013-08-29">today</time></p>
<link itemprop="url" href="?comments=0">
</header>
<p>All in all, he's doing well with his swim lessons. The biggest thing was he had trouble
putting his head in, but we got it down.</p>
<section>
<h1>Comments</h1>
<article itemprop="comment" itemscope itemtype="https://schema.org/Comment" id="c1">
<link itemprop="url" href="#c1">
<footer>
<p>Posted by: <span itemprop="creator" itemscope itemtype="https://schema.org/Person">
<span itemprop="name">Greg</span>
</span></p>
<p><time itemprop="dateCreated" datetime="2013-08-29">15 minutes ago</time></p>
</footer>
<p>Ha!</p>
</article>
<article itemprop="comment" itemscope itemtype="https://schema.org/Comment" id="c2">
<link itemprop="url" href="#c2">
<footer>
<p>Posted by: <span itemprop="creator" itemscope itemtype="https://schema.org/Person">
<span itemprop="name">Charlotte</span>
</span></p>
<p><time itemprop="dateCreated" datetime="2013-08-29">5 minutes ago</time></p>
</footer>
<p>When you say "we got it down"...</p>
</article>
</section>
</article>
It would be turned into the following JSON by the algorithm above (supposing that the page's URL was http://blog.example.com/progress-report):
{
"items": [
{
"type": [ "https://schema.org/BlogPosting" ],
"properties": {
"headline": [ "Progress report" ],
"datePublished": [ "2013-08-29" ],
"url": [ "http://blog.example.com/progress-report?comments=0" ],
"comment": [
{
"type": [ "https://schema.org/Comment" ],
"properties": {
"url": [ "http://blog.example.com/progress-report#c1" ],
"creator": [
{
"type": [ "https://schema.org/Person" ],
"properties": {
"name": [ "Greg" ]
}
}
],
"dateCreated": [ "2013-08-29" ]
}
},
{
"type": [ "https://schema.org/Comment" ],
"properties": {
"url": [ "http://blog.example.com/progress-report#c2" ],
"creator": [
{
"type": [ "https://schema.org/Person" ],
"properties": {
"name": [ "Charlotte" ]
}
}
],
"dateCreated": [ "2013-08-29" ]
}
}
]
}
}
]
}
A typed item can generally be readily converted to RDFa. This is useful for example to provide greater support for internationalization, or to include markup in the resulting data. The algorithm to convert a Microdata item to RDFa is as follows:
Replace the itemscope attribute with a vocab attribute, whose value is the vocabulary identifier for the item.
Let local type be the result of removing the vocabulary identifier from the first value of itemtype.
Replace the itemtype attribute with a typeof attribute whose value is local type.
If splitting the itemtype attribute results in more than one value, then for each value extra type of itemtype after the first, add an empty span child element to the element that represents the item, with the attribute rel="rdf:type" and a resource attribute whose value is extra type.
Change each itemprop attribute to a property attribute with the same value.
Change each itemid attribute to an about attribute with the same value.
If there is an itemref attribute, for each value reference that is a result of splitting the value on spaces, run the following substeps:
Add a parent element to the element whose id is reference, with a vocab attribute whose value is the vocabulary identifier for the item.
The choice of element is left to the author, to provide sufficient flexibility to avoid unwanted changes in the rendering of the content.
If the element whose id is reference has no resource attribute, add a resource attribute whose value is a NUMBER SIGN U+0023 followed by reference to the element.
If the element whose id is reference has no typeof attribute, add a typeof="rdfa:Pattern" attribute to the element.
Add a link child element to the element that represents the item, with a rel="rdfa:copy" attribute and an href attribute whose value is a NUMBER SIGN U+0023 followed by reference.
And then remove the itemref attribute.
There is significant scope for optimising this algorithm, which may result in redundant vocab declarations in particular.
The example of data for a model locomotive given above would be converted to the following RDFa:
<dl vocab="https://md.example.com/" typeof="loco"
about="https://md.example.com/product-catalog/33041">
<span rel="rdf:type" resource="https://md.example.com/lighting"></span>
<dt>Name:
<dd property="name">Tank Locomotive (DB 80)
<dt>Product code:
<dd property="product-code">33041
<dt>Scale:
<dd property="scale">HO
<dt>Digital:
<dd property="digital">Delta
</dl>
An example using itemref shows scope for optimisation:
<figure vocab="https://schema.org/" typeof="CreativeWork">
<link rel="rdfa:copy" href="#licenses">
<img property="image" src="images/house.jpeg"
alt="A white house, boarded up, sits in a forest.">
<figcaption property="name">The house I found.</figcaption>
</figure>
<figure vocab="https://schema.org/" typeof="CreativeWork">
<link rel="rdfa:copy" href="#licenses">
<img property="image" src="images/mailbox.jpeg"
alt="Outside the house is a mailbox. It has a leaflet inside.">
<figcaption property="name">The mailbox.</figcaption>
</figure>
<footer>
<div vocab="https://schema.org/">
<div vocab="https://schema.org/">
<p typeof="rdfa:Pattern" resource="#licenses"
id="licenses">All images licensed under the <a property="license"
href="http://www.opensource.org/licenses/mit-license.php">MIT license</a>.</p>
</div>
</div>
</footer>
It could be rewritten as:
<div vocab="https://schema.org/">
<figure typeof="CreativeWork">
<link rel="rdfa:copy" href="#licenses">
<img property="image" src="images/house.jpeg"
alt="A white house, boarded up, sits in a forest.">
<figcaption property="name">The house I found.</figcaption>
</figure>
<figure typeof="CreativeWork">
<link rel="rdfa:copy" href="#licenses">
<img property="image" src="images/mailbox.jpeg"
alt="Outside the house is a mailbox. It has a leaflet inside.">
<figcaption property="name">The mailbox.</figcaption>
</figure>
<footer>
<p typeof="rdfa:Pattern" resource="#licenses"
id="licenses">All images licensed under the <a property="license"
href="http://www.opensource.org/licenses/mit-license.php">MIT license</a>.</p>
</footer>
</div>
This specification adds the following global attributes and associated validity constraints to HTML:
itemscope
itemtype
The itemtype attribute must not be specified on elements that do not have an itemscope attribute. This attribute performs a function similar to the combination of vocab and typeof attributes in [rdfa-core].
itemprop
This attribute is equivalent to the property attribute in [rdfa-core].
itemid
The itemid attribute must not be specified on elements that do not have an itemscope attribute specified. This is approximately equivalent to declaring that an item is owl:sameAs the value of the attribute. [owl-ref]
itemref
The itemref attribute must not be specified on elements that do not have an itemscope attribute.
This section changes the content models defined by HTML in the following ways:
The content attribute is redefined by this specification as a global attribute that may be present on any element.
This is consistent with [HTML-RDFA], which uses the attribute for the same purpose.
If the itemprop attribute is present on a link or meta element, that element is flow content and phrasing content, and may be used where phrasing content is expected.
If a link element has an itemprop attribute, the rel attribute may be omitted.
If a meta element has an itemprop attribute, the name, http-equiv, and charset attributes must be omitted, and the content attribute must be present.
If the itemprop attribute is specified on an a or area element, then the href attribute must also be specified.
If the itemprop attribute is specified on an audio, embed, iframe, img, source, track, or video element, then the src attribute must also be specified.
If the itemprop attribute is specified on an object element, then the data attribute must also be specified.
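The validity constraints listed in this section lend themselves to a simple per-element check. The following non-normative Python sketch assumes an element is described by its name and a dictionary of attributes; check_element and the wording of the messages are hypothetical. It covers only the constraints stated in this section and is not a complete conformance checker.

```python
# Attribute that the text above requires alongside itemprop, keyed by element name.
REQUIRED_WITH_ITEMPROP = {
    "a": "href", "area": "href",
    "audio": "src", "embed": "src", "iframe": "src", "img": "src",
    "source": "src", "track": "src", "video": "src",
    "object": "data",
}


def check_element(tag, attrs):
    """Return a list of human-readable problems for one element (empty if none)."""
    problems = []
    # itemtype, itemid and itemref are only allowed together with itemscope.
    for name in ("itemtype", "itemid", "itemref"):
        if name in attrs and "itemscope" not in attrs:
            problems.append(f"{name} requires itemscope")
    if "itemprop" in attrs:
        required = REQUIRED_WITH_ITEMPROP.get(tag)
        if required is not None and required not in attrs:
            problems.append(f"{tag} with itemprop requires {required}")
        if tag == "meta":
            for name in ("name", "http-equiv", "charset"):
                if name in attrs:
                    problems.append(f"meta with itemprop must omit {name}")
            if "content" not in attrs:
                problems.append("meta with itemprop requires content")
    return problems
```

For example, an img element that carries itemprop but no src attribute is reported, while a meta element with both itemprop and content passes.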
This section is not normative
Microdata can be used to provide machine-parseable information about content that is processed by tools to improve accessibility.
When editing content that contains Microdata, authors should consider the possibility that this is the case. Authoring and content management tools should implement the Authoring Tool Accessibility Guidelines, and in this context note Guideline B1.1.2 - Ensure Accessibility Information is Preserved, if applicable drawing attention to the fact that changes in content may mean the encoded metadata is not accurate. [ATAG20]
Authors should be aware that a great deal of accessibility information is ignored in extracting Microdata, including attributes such as alt and ARIA information. Authors should consider whether to encode accessibility information explicitly, or to use a more expressive approach such as RDFa. [rdfa-core]
This section is not normative
Microdata conversion loses almost all internationalisation-related information unless that information is specifically encoded as Microdata. In that case it is important to pay attention when editing, as described above for accessibility.
Machine-readable data may be presented to users, for example by search engines. Identifying content that should, or should not, be translated, would be helpful but currently Microdata strips markup. It is possible to use XMLLiterals in RDFa to ensure that markup is kept. [rdfa-core]
Vocabulary design is difficult. Different languages and cultures present ambiguity differently: two terms with different meanings in one situation may be most naturally translated by a single term that has both meanings, or a single term may have two natural translations. When developing for localisation, it is important to provide sufficient contextual information about terms in a vocabulary to enable accurate translation.
This section is not normative
Microdata does not introduce new mechanisms to transmit privacy-sensitive information. However, it identifies information more clearly, in a way that facilitates finding data and merging it with data from other sources.
Authors and processors should take care to ensure that their use of Microdata is in line with privacy policies and any applicable regulation.
This section is not normative
Microdata does not generally interact with browsers, being a static document format that lacks any DOM interface.
Microdata makes information machine-readable, but does not automatically include provenance information for the statements it encodes.
Processors of Microdata should consider the trustworthiness of sources they use, including the possibility that data is no longer accurate, and the possibility that data gathered over an insecure connection has been altered by a "man-in-the-middle" attack.
application/microdata+json
This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.
application/json [JSON]
application/json [JSON]
application/json [JSON]
application/json [JSON]
The application/microdata+json type asserts that the resource is a JSON text that consists of an object with a single entry called "items" consisting of an array of entries, each of which consists of an object with an entry called "id" whose value is a string, an entry called "type" whose value is another string, and an entry called "properties" whose value is an object whose entries each have a value consisting of an array of either objects or strings, the objects being of the same form as the objects in the aforementioned "items" entry. Thus, the relevant specifications are the JSON specification and this specification. [JSON]
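The shape described above can be expressed as a small recursive check. The following Python sketch is illustrative only and is not part of the registration. It reads the paragraph literally, so it assumes that every item carries "id" and "type" entries whose values are strings; whether those entries may be absent is not stated here. The function names is_item and is_microdata_json are hypothetical.

```python
import json


def is_item(obj):
    """True if obj has the item shape described above (literal reading)."""
    if not isinstance(obj, dict):
        return False
    # Assumption: "id" and "type" are both required and both strings.
    if not isinstance(obj.get("id"), str) or not isinstance(obj.get("type"), str):
        return False
    props = obj.get("properties")
    if not isinstance(props, dict):
        return False
    for values in props.values():
        # Each property value is an array of strings or nested items.
        if not isinstance(values, list):
            return False
        for value in values:
            if not (isinstance(value, str) or is_item(value)):
                return False
    return True


def is_microdata_json(text):
    """True if text is JSON with a single "items" entry holding valid items."""
    try:
        doc = json.loads(text)
    except ValueError:
        return False
    if not isinstance(doc, dict) or set(doc) != {"items"}:
        return False
    items = doc["items"]
    return isinstance(items, list) and all(is_item(item) for item in items)


example = json.dumps({
    "items": [{
        "id": "http://example.com/people/1",
        "type": "http://example.com/Person",
        "properties": {
            "name": ["Example Person"],
            "address": [{
                "id": "http://example.com/addresses/1",
                "type": "http://example.com/Address",
                "properties": {"city": ["Example City"]},
            }],
        },
    }]
})
print(is_microdata_json(example))
```

The identifiers and URLs in the example document are placeholders chosen for illustration, not values defined by any vocabulary.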
Applications that transfer data intended for use with Microdata, especially in the context of drag-and-drop, are the primary application class for this type.
application/json [JSON]
application/json [JSON]
application/json [JSON]
Fragment identifiers used with application/microdata+json resources have the same semantics as when used with application/json (namely, at the time of writing, no semantics at all). [JSON]
An exact history of changes to the text is available in the Github commit log. The following information is provided as an overview of the substantive changes made to the specification between each publication.
Changes made between the current draft and the second Working Draft:
itemid may be specified on an element with an itemscope attribute, not just a typed item
Changes made between the second Working Draft and the First Public Working Draft:
itemid in general, rather than expecting vocabularies to explain what it means.
content attribute on any element where an itemprop attribute is present, to provide a readable value for a property.
data, meter, and time elements' textContent is used if they do not have an attribute that supplies the value.
content attribute present, it provides the value.
Changes made between the First Public Working Draft and the 23 October 2013 W3C Note:
The original specification for Microdata was developed by Ian Hickson. Without him this specification would not exist. Uptake has substantially been driven by its use for the schema.org vocabulary.
The current editors would like to thank the following people for direct contributions to this work:
Christine Runnegar, Gregg Kellogg, Ivan Herman, Jeni Tennison, Jens Oliver Meiert, Léonie Watson, Manu Sporny, Marcos Cáceres, Markus Lanthaler, Nick Doty, Philippe Le Hégaret, Ralph Swick, Robin Berjon, Shane McCarron, Tab Atkins, Tavis Tucker, Tobie Langel, "Unor", Xiaoqian Wu, Yves Lafon.