One billion documents
Millions of pages used non-standard elements
But there were no semantics there
Classes (<p class="...">)
But there were no semantics there either