Jonathan Rees took an action (ACTION-227) on 2/12/2009: "Summarize TAG work on metadata, with Larry" due 2/24.
I decided to cast the net wider, so that we could better think about what we might do. I got a bit carried away, and this turned into a bird's-eye outline of this immense field. My research was done in the most lowbrow way: using a search engine, and using an online encyclopedia.
The Wikipedia article on metadata is worth reading.
It's very important to understand the definition of "metadata": metadata is data about data, or information about information. Information about something is not metadata unless that something is itself data. Although the word is widely stretched beyond this scope, I (JAR) do not approve, since we already have perfectly good words for other kinds of information: data, description, information, etc.
In addition, the W3C SVG Recommendation defines "metadata" consistently with general use, and having other W3C documents depart from the usage in a Recommendation would be a bad idea.
(The etymology of "metadata" is a circus. The "meta" in "metadata" is a back-formation from "metaphysics", which originally just meant "the volume that comes after the physics volume" but because of that volume's content came to mean something much more... well, metaphysical.)
If we want to broaden the scope of the investigation - which already seems much too broad - beyond data about data, we'll need to choose a different word - "description", "information about", etc.
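The test implied by this definition can be sketched in a few lines of Python. This is only an illustration of the strict reading argued for here, not anything taken from a specification: the `Statement` type, the `subject_is_data` flag, and the example subjects and values are all hypothetical, chosen to show that the deciding question is whether the thing described is itself data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """A single piece of information about some subject (hypothetical model)."""
    subject: str           # the thing being described
    subject_is_data: bool  # assumption: the caller knows whether the subject is data
    prop: str
    value: str


def is_metadata(s: Statement) -> bool:
    """Strict reading: a statement is metadata only if its subject is data."""
    return s.subject_is_data


# Illustrative examples only; names and values are made up.
examples = [
    Statement("a report document", True, "author", "some person"),
    Statement("a planet", False, "mass", "some number of kilograms"),
    Statement("a web service", False, "expected message format", "some format"),
    Statement("a message sent to the service", True, "encoding", "UTF-8"),
]

for s in examples:
    label = "metadata" if is_metadata(s) else "data, but not metadata"
    print(f"{s.subject}: {s.prop} -> {label}")
```

Under this sketch, a statement about a document or a message counts as metadata, while a statement about a planet or a service is simply data.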
Obviously any data can have metadata, but the following are specific domains in which people talk about metadata.
Schemas...
Syntactic bases...
Embedded metadata / markup...
See Metadata standards, crosswalks, and standard organizations from QE II Library for a long list of links about metadata related to the library world. (W3C is on its list of 5 standards organizations in this area.)
Neither here nor there: Google search results are metadata; the books and media sections of Amazon are metadata; link lists and webliographies such as LSRN are metadata.
The following are not on topic (data about data), but come up often in discussions of metadata on the Web.
TAG Resolution endorsing W3C Team Comment on the identifications of WS Transfer resources, WSRA WG charter, WS Metadata Exchange, WS Policy, and so on. All of these documents use the word "metadata" to describe "what other endpoints need to know to interact with" a service. This is definitely outside the accepted definition of the word "metadata", as a service is not data. You would not say that a planet's mass is "metadata"; it is just data. "Metadata" can be about the messages sent when using the service, but not about the service itself. I would urge the WG to change its terminology.
In contexts where what a URI is supposed to "identify" matters, it becomes important to ask a URI owner what they want to say about this (AWWW 2.2.2.1). You could consider this to be "metadata" if you took the position that what you recover is data about the URI, where the URI is data. JAR does not favor this usage.
Larry Masinter, Phil Archer, Dan Connolly, Eran Hammer-Lahav, Ray Denenberg