This page describes a proposal for improving the breadcrumbs mechanism in schema.org.
Breadcrumbs have historically been poorly handled in schema.org. There are some subtle markup issues involved in getting the design right, so the candidate designs are described on separate pages.
For technical background, see also the issue tracker, http://www.w3.org/2011/webschema/track/issues/10
The bulk of this page collects real-world HTML samples for evaluating the following designs:
- WebSchemas/BreadcrumbsDesign1 (child property): uses a 'child' property
- WebSchemas/BreadcrumbsDesign2 (JSON-LD): uses JSON-LD's support for ordered lists
- WebSchemas/BreadcrumbsDesign3 (section indicator): simply indicates the chunk of HTML that contains the breadcrumbs, without representing everything in triples
- WebSchemas/BreadcrumbsDesign4 (RDFa lists): uses RDFa's inlist="" syntax
- link any further proposals here
Requirements and Constraints
- It should be possible to extract (possibly several independent) ordered lists of pairs of URLs and their anchor text labels using this markup
- This should be possible using Microdata or RDFa markup, or possibly JSON-LD
- The structure should be preserved when parsing the page with standard parsers that emit triples
- The extraction rules should ensure that relative URLs (e.g. /books/stephen_king) are resolved into complete URLs (e.g. http://example.com/books/stephen_king)
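As a rough illustration of the target output these requirements describe, the following Python sketch pulls an ordered list of (URL, label) pairs out of plain breadcrumb HTML and resolves relative URLs against a base. It is not one of the candidate designs: it ignores Microdata, RDFa and JSON-LD entirely and simply collects every anchor in document order. The names `AnchorCollector` and `extract_pairs`, and the base URL passed in, are assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collect (absolute URL, label) pairs for each <a href> in document order."""

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url  # assumed page URL, used to resolve relative hrefs
        self.pairs = []
        self._href = None         # href of the anchor currently open, if any
        self._text = []           # text chunks seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace so nested markup does not affect the label.
            label = " ".join("".join(self._text).split())
            self.pairs.append((self._href, label))
            self._href = None


def extract_pairs(html, base_url):
    """Return the ordered list of (URL, label) pairs found in html."""
    parser = AnchorCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.pairs
```

For example, `extract_pairs('<a href="/books/stephen_king">Stephen King</a>', "http://example.com/")` yields a single pair whose URL is http://example.com/books/stephen_king. A real extractor would additionally need the markup from one of the designs to tell breadcrumb anchors apart from other links on the page.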
Sample BC1 (from the somewhat dated page https://support.google.com/webmasters/answer/185417 ):
<a href="http://www.example.com/dresses">Dresses</a> › <a href="http://www.example.com/dresses/real">Real Dresses</a> › <a href="http://www.example.com/dresses/real/green">Real Green Dresses</a>
Sample BC2:
<a href="category/books.html">Books</a> > <a href="category/books-literature.html">Literature & Fiction</a> > <a href="category/books-classics">Classics</a>
These two samples differ essentially only in the separator character, which could also be an image. Such variation is common, but it makes no interesting difference to the essential structure. In theory the anchor content could also include an image, but this is less common.
Sample BC3: multiple trails
Extracted from https://support.google.com/webmasters/answer/185417?hl=en by taking the Microdata example and removing the microdata (danbri):
<div>
  <a href="http://www.example.com/books"> <span>Books</span> </a> ›
  <div>
    <a href="http://www.example.com/books/authors"> <span>Authors</span> </a> ›
    <div>
      <a href="http://www.example.com/books/authors/stephenking"> <span>Stephen King</span> </a>
    </div>
  </div>
</div>
<div>
  <a href="http://www.example.com/books"> <span>Books</span> </a> ›
  <div>
    <a href="http://www.example.com/fiction"> <span>Fiction</span> </a> ›
    <div>
      <a href="http://www.example.com/books/fiction/horror"> <span>Horror</span> </a>
    </div>
  </div>
</div>
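To make the "several independent ordered lists" case concrete, here is a minimal Python sketch that splits a page like BC3 into separate trails. It relies on a strong assumption that holds for this sample only: each trail is wrapped in its own outermost div, so a trail ends whenever the div nesting depth returns to zero. None of the candidate designs works this way; the names `TrailCollector` and `extract_trails` are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class TrailCollector(HTMLParser):
    """Group (URL, label) pairs into trails, one per outermost <div>."""

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.trails = []     # completed trails, in document order
        self._current = []   # pairs of the trail being read
        self._depth = 0      # current <div> nesting depth
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self._depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            label = " ".join("".join(self._text).split())
            self._current.append((self._href, label))
            self._href = None
        elif tag == "div" and self._depth > 0:
            self._depth -= 1
            # Assumption: leaving the outermost div closes one trail.
            if self._depth == 0 and self._current:
                self.trails.append(self._current)
                self._current = []


def extract_trails(html, base_url):
    """Return a list of trails, each an ordered list of (URL, label) pairs."""
    parser = TrailCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.trails
```

Applied to the BC3 markup, this heuristic would return two trails of three pairs each. It breaks as soon as a page wraps both trails in a common container div, which is one reason explicit markup is needed rather than structural guessing.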