This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 29483 - metatags already too redundant
Status: RESOLVED MOVED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: CR HTML5 spec
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: Robin Berjon
QA Contact: HTML WG Bugzilla archive list
Reported: 2016-02-19 02:58 UTC by Nick Levinson
Modified: 2016-04-28 16:15 UTC

Description Nick Levinson 2016-02-19 02:58:30 UTC
Redundancy is getting out of hand despite the instruction at WHATWG MetaExtensions to reuse what's roughly close enough. In order to support different parsers and not be misunderstood due to omitting a tag, my new website (cold32.com) on a recent day had the following meta tag values on a typical page (index.html):

-- nine naming me as author, creator, designer, and publisher: author; article:author; web_author; creator; dc.creator; dcterms.creator; designer; dc.publisher; dcterms.publisher (the roles can differ substantially, but that still leaves three for author, three for creator, and two for publisher)

-- five copyright statements: rights; dcterms.rights; dcterms.rightsHolder; dc.dateCopyrighted; dcterms.dateCopyrighted (dcterms.rightsHolder would be useful if rights had been transferred, such as to a literary agent; without such a transfer, omitting this tag is still potentially problematic if a parser would understand the omission to mean there is no rights holder) (dc.dateCopyrighted and dcterms.dateCopyrighted could presumably hold a year only, but when a copyright that began in one year stretches over several years, a distinction has to be made between the revisions and the balance, which the main machine-readable copyright notice already does)

-- five for description, all identical: description; dc.description; dcterms.description; twitter:description; og:description

-- five for page title, four identical and one nearly so: dc.title; dcterms.title; twitter:title; og:title; application-name

-- four for coverage by space and/or time: dcterms.coverage; dcterms.spatial; dc.temporal; dcterms.temporal

-- four for date of first appearance: created; dc.created; dcterms.created; article:published_time (including a specific time would be rare and this last is explicitly date-time)

-- three for date of last modification: dc.modified; dcterms.modified; article:modified_time (last explicitly datetime)

-- three that say this is a "website": dc.type; dcterms.type; og:type

-- three for language: dc.language; dcterms.language; og:locale (it's possible the last was meant for markup applied in Russia to a page written in Chinese, but since the spec calls for a language code it seems unlikely a literal place is what they'd want to find)

-- three for an icon: two meta tags plus one link tag: meta twitter:image and og:image (a meta thumbnail arguably should be different), both redundant with link icon

-- two for audience type: audience; dcterms.audience

-- two short URLs (protocol-to-TLD only): one meta tag plus one link tag: meta msapplication-starturl is redundant with link shortlink

-- two page URLs: one meta tag plus one link tag: meta og:url is redundant with link canonical

-- two for main color: theme-color; msapplication-navbutton-color (they may differ, but it is not clear whether the latter can be omitted)
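
As an illustration of the kind of redundancy listed above, here is a minimal Python sketch that collects the meta tags on a page and groups the names that carry identical content. It is only a sketch under stated assumptions: the sample markup and its values are made up for the example (they are not copied from cold32.com), the function name redundant_groups is hypothetical, and name and property are treated as interchangeable, as in this report.

```python
from collections import defaultdict
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect (key, content) pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Assumption: name and property are treated as interchangeable.
        key = attr.get("name") or attr.get("property")
        content = attr.get("content")
        if key and content is not None:
            self.pairs.append((key, content))


def redundant_groups(html):
    """Return {content: [keys, ...]} for every content value shared by 2+ keys."""
    collector = MetaCollector()
    collector.feed(html)
    by_content = defaultdict(list)
    for key, content in collector.pairs:
        by_content[content.strip()].append(key)
    return {c: keys for c, keys in by_content.items() if len(keys) > 1}


# Made-up sample markup, for illustration only.
SAMPLE = """
<head>
<meta name="description" content="Example description">
<meta name="dc.description" content="Example description">
<meta property="og:description" content="Example description">
<meta name="theme-color" content="#336699">
</head>
"""

print(redundant_groups(SAMPLE))
```

This only detects exact duplicates; values that are nearly identical (such as the application-name case above) or link tags that repeat a meta tag would need additional handling.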

While we might think that dc.* and dcterms.* would not both be needed on the same page, a claim that one replaces the other does not seem clearly supported by DCMI, so I assume parsers expect both and I have to supply both.

And there are more metatags that aren't even registered in HTML5 or MetaExtensions but seem to be in use for scholarly works.

I'm treating the name and property attributes in meta tags as interchangeable.

This doesn't consider microdata, which don't use meta tags but do overlap meta tags.

I predict redundancy will generally get worse, as more websites get large and decide to throw their weight around and make their own systems using meta tags, creating their own branded comprehensive systems that don't leave gaps for what's already available.

Suggestion: Require in HTML5 that if one attribute value is absent and a certain other value is present in the same file, the metadata be interpreted as if both were present and identical. The attribute values to which this applies would be enumerated, probably in WHATWG MetaExtensions in an additional column (this is not a legacy synonymy, so no existing column would fit).
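
A minimal Python sketch of how a parser might apply such a rule follows. Everything in it is an assumption made for illustration: the SYNONYM_GROUPS table is hypothetical (no such column exists in MetaExtensions), the function name expand_synonyms is invented, and the choice to take the first present key in a group as the source value is just one possible policy.

```python
# Hypothetical synonym table: each group lists keys that would be treated as
# carrying the same value when only some of them appear on the page.
SYNONYM_GROUPS = [
    ("description", "dc.description", "dcterms.description",
     "twitter:description", "og:description"),
    ("dc.language", "dcterms.language"),
]


def expand_synonyms(meta, groups=SYNONYM_GROUPS):
    """Return a copy of meta in which absent keys inherit a present synonym's value.

    Keys that are explicitly present are never overwritten, even if their
    values disagree with another key in the same group.
    """
    expanded = dict(meta)
    for group in groups:
        present = [key for key in group if key in meta]
        if not present:
            continue  # nothing on the page to inherit from
        value = meta[present[0]]  # assumed policy: first present key wins
        for key in group:
            expanded.setdefault(key, value)
    return expanded


page = {"description": "Example description", "dcterms.language": "en"}
print(expand_synonyms(page))
```

With a rule like this, an author could supply one tag per group and still be understood by a parser that looks for any of the others.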
Comment 1 Léonie Watson 2016-04-28 16:15:53 UTC
Moved to HTML on Github:
https://github.com/w3c/html/issues/302