WebSchemas/Breadcrumbs

From W3C Wiki


This is an archived WebSchemas proposal, the Breadcrumbs proposal for schema.org. See the Proposals listing for more. Note: active schema.org development is now based on GitHub.



Overview

This page describes a proposal for improving the breadcrumbs mechanism in schema.org.

Breadcrumbs have historically been poorly handled in schema.org. Getting the design right involves some subtle markup issues, so the candidate designs have been split out into separate pages.

For technical background, see also the issue tracker, http://www.w3.org/2011/webschema/track/issues/10

The bulk of this page collects real-world HTML samples for evaluating the candidate designs.

Candidate Designs

Requirements and Constraints

Must-haves:

  • It should be possible to extract (possibly several independent) ordered lists of pairs of URLs and their anchor text labels using this markup

Desirable:

  • This should be possible using Microdata or RDFa markup, or possibly JSON-LD
  • The structure should be preserved when parsing the page with standard parsers that emit triples
  • The extraction rules should ensure that relative URLs, e.g. /books/stephen_king, are turned into complete URLs, e.g. http://example.com/books/stephen_king
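As a rough illustration of the core requirement, the sketch below pulls an ordered list of (URL, label) pairs out of plain HTML and resolves relative URLs against a base. It uses only the Python standard library. The names AnchorCollector and extract_pairs, and the base URL passed in, are assumptions made for this example and are not part of any proposal; it also treats every anchor in the input as part of one trail, which real markup would need to narrow down.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collects (absolute URL, label) pairs from <a> elements in document order."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # assumed to be the URL of the page being parsed
        self.pairs = []           # finished (url, label) pairs
        self._href = None         # resolved href of the <a> currently open, if any
        self._text = []           # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                # Turn relative references into complete URLs.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the anchor text.
            label = " ".join("".join(self._text).split())
            self.pairs.append((self._href, label))
            self._href = None


def extract_pairs(html, base_url):
    """Hypothetical helper: returns the ordered (url, label) pairs found in html."""
    parser = AnchorCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.pairs


pairs = extract_pairs('<a href="/books/stephen_king">Stephen King</a>',
                      "http://example.com/")
# pairs == [("http://example.com/books/stephen_king", "Stephen King")]
```

Text outside the anchors, such as separator characters, is simply ignored here, so the ordering of the pairs comes from document order alone.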

HTML Samples

Sample BC1

BC1 (from an older version of https://support.google.com/webmasters/answer/185417 ):

<a href="http://www.example.com/dresses">Dresses</a> ›   
 <a href="http://www.example.com/dresses/real">Real Dresses</a> ›    
  <a href="http://www.example.com/dresses/real/green">Real Green Dresses</a>

Sample BC2

BC2 (from http://schema.org/WebPage and https://schema.org/breadcrumb ):

<a href="category/books.html">Books</a> >
 <a href="category/books-literature.html">Literature & Fiction</a> >
 <a href="category/books-classics">Classics</a>

These two samples differ essentially only in the separator character, which could also be an image. The pattern is common, but the choice of separator makes no interesting difference to the essential structure. In theory the anchor content could also include an image, but this is less common.
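To make the comparison of BC1 and BC2 concrete, the following sketch splits each sample into its anchor labels and the text found between anchors. It is only an illustration using the Python standard library; the names TrailShape and shape are hypothetical, and the samples are joined onto single lines for brevity.

```python
from html.parser import HTMLParser


class TrailShape(HTMLParser):
    """Separates anchor labels from the text found between anchors."""

    def __init__(self):
        super().__init__()
        self.labels = []       # text inside <a> elements, in order
        self.separators = []   # non-blank text outside <a> elements
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            self.labels.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = False

    def handle_data(self, data):
        if self._in_anchor:
            self.labels[-1] += data
        elif data.strip():
            self.separators.append(data.strip())


def shape(html):
    """Hypothetical helper: returns (labels, separators) for one trail."""
    parser = TrailShape()
    parser.feed(html)
    parser.close()
    return parser.labels, parser.separators


bc1 = ('<a href="http://www.example.com/dresses">Dresses</a> › '
       '<a href="http://www.example.com/dresses/real">Real Dresses</a> › '
       '<a href="http://www.example.com/dresses/real/green">Real Green Dresses</a>')
bc2 = ('<a href="category/books.html">Books</a> > '
       '<a href="category/books-literature.html">Literature & Fiction</a> > '
       '<a href="category/books-classics">Classics</a>')

labels1, seps1 = shape(bc1)
labels2, seps2 = shape(bc2)
```

Both samples reduce to three labels with two separators between them; only the separator text differs, which is why an extraction rule can disregard it.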

Sample BC3: multiple trails

Extracted from https://support.google.com/webmasters/answer/185417?hl=en by taking the Microdata example and removing the microdata (danbri):

<div>
  <a href="http://www.example.com/books">
    <span>Books</span>
  </a> ›
  <div>
    <a href="http://www.example.com/books/authors">
      <span>Authors</span>
    </a> ›
    <div>
      <a href="http://www.example.com/books/authors/stephenking">
        <span>Stephen King</span>
      </a>
    </div>
  </div>
</div>

<div>
  <a href="http://www.example.com/books">
    <span>Books</span>
  </a> ›
  <div>
    <a href="http://www.example.com/fiction">
      <span>Fiction</span>
    </a> ›
    <div>
      <a href="http://www.example.com/books/fiction/horror">
        <span>Horror</span>
      </a>
    </div>
  </div>
</div>
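
With the Microdata removed, nothing in BC3 explicitly marks where one trail ends and the next begins. The sketch below shows one possible heuristic, assuming that each outermost div holds exactly one trail; that assumption, along with the names TrailCollector and extract_trails, is made for this illustration only and does not come from any of the candidate designs.

```python
from html.parser import HTMLParser


class TrailCollector(HTMLParser):
    """Groups (url, label) pairs into one trail per outermost <div>."""

    def __init__(self):
        super().__init__()
        self.trails = []   # list of trails, each a list of (url, label)
        self._depth = 0    # current <div> nesting depth
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            if self._depth == 0:
                # Assumption: a new outermost <div> starts a new trail.
                self.trails.append([])
            self._depth += 1
        elif tag == "a" and self._depth > 0:
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self._depth > 0:
            self._depth -= 1
        elif tag == "a" and self._href is not None:
            label = " ".join("".join(self._text).split())
            self.trails[-1].append((self._href, label))
            self._href = None


def extract_trails(html):
    """Hypothetical helper: returns a list of trails found in html."""
    parser = TrailCollector()
    parser.feed(html)
    parser.close()
    return parser.trails
```

Applied to the BC3 sample, this heuristic would yield two independent trails of three pairs each. It would fail on pages that wrap all trails in a shared container, which is part of why explicit markup is needed.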