Workshop Metadata for Content Adaptation

Minutes session 2 (Oct. 12, 2004, 11:00-13:00)


Daniel Appelquist (Vodafone)

"Metadata Power for Content Adaptation and Discovery"

Some types of metadata:

- topics/genre (to find related info)

- child protection

- commercial data (business relations)

- device requirements

- digital rights management

- location/event (for personal messages, blogs, etc.)

Balance optimal user experience with usability for content authors.

Also balance with standardization, which enables more players.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Rotan Hanrahan: Example of reducing table to its most important columns shows that metadata must express notion of importance, and also that importance is relative.

Daniel A: Ideally, importance judged by client, but for practical reasons we fell back to numerical importance value.
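
The numerical fallback mentioned here can be illustrated with a minimal sketch. It assumes each column carries an author-assigned importance number (higher means more important) and that the device can show only a fixed number of columns. The function name, the sample table and the importance values are hypothetical and were not presented at the workshop.

```python
def reduce_columns(header, rows, importance, max_columns):
    """Keep the max_columns most important columns, preserving source order."""
    # Rank column indices by importance; sorted() is stable, so ties keep source order.
    ranked = sorted(range(len(header)), key=lambda i: -importance[i])
    # Restore the original left-to-right order for the columns that survive.
    keep = sorted(ranked[:max_columns])
    return [header[i] for i in keep], [[row[i] for i in keep] for row in rows]


# Hypothetical table and importance values, for illustration only.
header = ["Flight", "Gate", "Departs", "Aircraft"]
rows = [["VF101", "A4", "11:00", "A320"], ["VF202", "B1", "13:00", "B737"]]
importance = [10, 5, 8, 1]

small_header, small_rows = reduce_columns(header, rows, importance, 2)
```

A single number per column cannot express that importance is relative, which is the limitation raised in the question above; the sketch shows only the practical fallback.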

Max Froumentin: You have an authoring language?

Daniel A: Authors probably authoring in their own format and translating to ours. CNN, e.g., has large databases. We bypass HTML.

(Orange): Adaptation and discovery, is that the same thing?

Daniel A: Grey area. Some things are clearly discovery, others not clear.

Rotan H: Overlap.

(Orange): E.g., when you want location-related info, in general context-based info, is that treated anywhere, in W3C or elsewhere?

Rotan H: Result of discovery may provide metadata for adaptation. E.g., when looking for a map and finding an image, you now know that it is a map and should be compressed with a different algorithm from, say, a face.


Phil Archer (ICRA)

"Metadata for child protection"

ICRA vocabulary (RDF), evolved from RSAC (PICS).

Looking for an architecture that allows sharing metadata: multiple resources with the same label should only require one instance of the label (e.g., all URLs with the same prefix have the same label: a selector/query for resources). Also, a resource should be able to tell where to go to find its labels (if any).
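
One way to picture the shared-label idea is a lookup table keyed by URL prefix, so that one label instance covers every resource under that prefix. The sketch below is an assumption-laden illustration: the label fields are invented and are not ICRA vocabulary terms, and the "longest prefix wins" rule is only one possible way to resolve overlapping labels.

```python
# Hypothetical labels keyed by URL prefix; the field names are NOT ICRA terms.
labels = {
    "http://example.org/": {"nudity": False, "violence": False},
    "http://example.org/games/": {"nudity": False, "violence": True},
}


def find_label(url, labels):
    """Return the label whose prefix is the longest match for url, or None."""
    matches = [prefix for prefix in labels if url.startswith(prefix)]
    if not matches:
        return None
    # Assumption: the most specific (longest) prefix overrides shorter ones.
    return labels[max(matches, key=len)]
```

With these sample labels, a page under http://example.org/games/ picks up the more specific label, while other pages on the site fall back to the site-wide one.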

Not just for blocking, but also to give provider a chance to provide alternative content that will not be blocked.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Wendy Chisholm: About (not) broadcasting personal data: same concern in accessibility world. Usually solved by having something in the middle. An organization asks for accessible content, a person gets it from the organization.

Phil A: Publishers are loath to say what age, e.g., their content is for. So an intermediary doesn't work, unless it itself makes the judgements.

Daniel A: Can you trust self-labeling?

Phil A: Providers don't lie that much. Porn industry, e.g., *want* you to know that they've got porn. Of course, there is a risk. That's why there are other systems. Don't have to rely on one system only. But external classifiers spend very little time on any Web site, so their classification should be looked at critically as well.

Max F: URLs may not have a structure that allows you to select groups of them for labeling.

Bert Bos: Or use RDF query as the "subject" of a label?

Phil A: Possible. Still have to deal with cascading of multiple labels.



Oskari Koskimies (Nokia): Content and context: does the metadata describe the image (a map), or the place where it is used (a mobile phone)?

Shadi Abou-Zahra: Content is described, and some (external) rules that match content to contexts.

Oskari K: Also need to know that certain things cannot be displayed without one another: if you drop the image, the text makes no sense.
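
The dependency described here can be sketched as a "requires" relation between content items: when an adaptation step drops one item, everything that depends on it is dropped too. The item names and the shape of the relation are hypothetical; the sketch only illustrates the kind of metadata being asked for.

```python
def displayable(items, requires, dropped):
    """Return the items that can still be shown once `dropped` are removed."""
    removed = set(dropped)
    changed = True
    # Repeat until stable so that indirect dependencies are removed as well.
    while changed:
        changed = False
        for item in items:
            needs = requires.get(item, ())
            if item not in removed and any(dep in removed for dep in needs):
                removed.add(item)
                changed = True
    return [item for item in items if item not in removed]


# Hypothetical page: the directions make no sense without the map image.
items = ["headline", "map_image", "directions_text", "footer"]
requires = {"directions_text": ["map_image"]}
visible = displayable(items, requires, dropped=["map_image"])
```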

Rotan H: That is a third kind of metadata. You have to consider the adaptation/discovery process as well. W3C doesn't develop software, but still needs to define the framework in which those processes can work.

???: There are even more types of metadata.

Wendy C: Relationships between resources very important for accessibility: you combine resources differently for different people.

Rotan H: Easy enough to match screen width to image size, but much harder to judge whether content is "appropriate".
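
The "easy" half of this remark can be sketched in a few lines, assuming the server holds several pre-scaled variants of an image and knows the screen width in pixels. The function name and the fallback to the smallest variant are assumptions made for illustration.

```python
def pick_variant(variant_widths, screen_width):
    """Pick the widest image variant that fits the screen."""
    fitting = [w for w in variant_widths if w <= screen_width]
    # Assumption: if nothing fits, fall back to the smallest variant available.
    return max(fitting) if fitting else min(variant_widths)
```

No comparably simple rule exists for judging whether content is "appropriate", which is the harder half of the remark.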

Oskari K:

Rotan H: Complex problem, but can start with something simple (and extensible).

Lisa Seeman: Also consider evolution: old systems interpreting new terms. RDF helps, since it allows defining subclasses. There are many extensions possible: emotional health, religion, etc. Don't fix a specific scenario.
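
How subclassing lets an old system cope with a new term can be sketched as a walk up the subclass chain until a known term is reached. The sketch uses a plain dictionary rather than an RDF library, assumes each term has at most one superclass (RDF itself allows several), and uses invented "ex:" terms throughout.

```python
# Hypothetical subclass relations; none of these terms come from a real vocabulary.
SUBCLASS_OF = {
    "ex:GamblingForMoney": "ex:Gambling",
    "ex:Gambling": "ex:PotentiallyHarmful",
}


def nearest_known(term, known_terms, subclass_of):
    """Walk up the subclass chain until a term the old system knows is found."""
    seen = set()  # guards against cycles in the subclass data
    while term is not None and term not in seen:
        if term in known_terms:
            return term
        seen.add(term)
        term = subclass_of.get(term)
    return None
```

An old filter that only understands "ex:Gambling" would then treat the newer "ex:GamblingForMoney" as gambling instead of ignoring it.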

Andy Heath: Defining a basic set of terms is hard. Dublin Core succeeded, but others failed.

Rotan H: Like software design...

Mark Birbeck: Device capabilities, server capabilities, document content... Maybe more promising to study the process. Where would what information be used? Client, proxy, server.

Rotan H: Is having a lot of metadata a disincentive for authors?

Daniel A: Authors are authoring in something else already, goes into a database. Not in HTML directly. NewsML, e.g., is used, because it gives content providers added value.

Roland M: The format provides implicit metadata. The format and the tags in it have certain semantics already.

(MobileAware): Have to keep delivery context separate, subject for different workshop. Although you need to understand where it enters into the process. This workshop should focus on content.

Rotan H: But they have to match, can learn from delivery context what needs to be said about the content.

Stephane Boyera: Two separate points: content metadata itself, and how to reach a state in which we have such metadata.

Rotan H: Author's metadata and generated/implied metadata.

(MobileAware): Author not always aware of delivery context.

Daniel A: Correct.

Rhys Lewis: Our experience as well. Authors are not experts on devices. They want a pixel somewhere and get upset when they can't get it.

Rhys L: I'd like to see some taxonomies come out of this workshop. Maybe separate from accessibility taxonomy, but used in same way. Maybe use namespaces for extensibility.

Roland M: Taxonomies, processes and domains. But need convenient way to associate metadata with resource. The best taxonomy doesn't help if you can't easily associate it with a resource.

Rotan H: Yes, like Lisa's suggestion earlier about syntax for HTML: why link inside instead of around its subject? We need to figure out questions like that.

Lisa S: Ambiguities in text are a big problem. Maybe solving that and solving the adaptation to mobile devices is in fact the same thing, or can be solved the same way.

Lisa S: Good processes while creating content can help. E.g., when adding layers to an image, you're prompted for a title for the layer. But sometimes you also want to talk about a site as a whole.

Lisa S: Need to consider people's personal scenarios. For security and privacy, many processes need to happen in the UA.

Mark B: When I said "process" I meant in particular that metadata can come from various places and be used at various times.

Rotan H: Lisa's example of people with failing memory. That may also be a result of the medium/presentation. A large page when linearized causes people to forget the start.