When talking about media semantics, one has to commit to basic principles of how "media semantics" should be understood. There are different possible ways of applying semantics to media (Grosky 94):
On the essence level (i.e. the binary sequences) it is up to a codec and/or a format to assign a distinct meaning to a certain pattern. For example, to represent a circle in SVG a certain command (<circle cx="100" cy="50" r="40" stroke="black" stroke-width="2" fill="red"/>) is needed; representing the same shape in a raster format yields a certain binary pattern.
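The contrast between the two essence-level encodings can be sketched in a few lines of Python. The rasterizer below is a deliberately naive point-in-circle test, not any particular codec; it only illustrates that the same shape ends up as a textual command in one format and as a binary pixel pattern in the other.

```python
# The SVG essence: a textual drawing command.
svg = '<circle cx="100" cy="50" r="40" stroke="black" stroke-width="2" fill="red"/>'

def rasterize_circle(cx, cy, r, width, height):
    """Return a binary pattern: byte 1 where a pixel lies inside the circle, else 0."""
    pixels = []
    for y in range(height):
        for x in range(width):
            pixels.append(1 if (x - cx) ** 2 + (y - cy) ** 2 <= r * r else 0)
    return bytes(pixels)

# The raster essence of the very same circle.
raster = rasterize_circle(cx=100, cy=50, r=40, width=200, height=100)

print(len(svg))     # size of the vector (textual) representation
print(len(raster))  # size of the raster (binary) representation: 200 * 100 = 20000
```

Neither byte sequence carries meaning by itself; it is the format (SVG vs. the raster layout) that assigns the pattern its interpretation as a circle.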
On the domain level (an opera, football, WW II, etc.) there exists a collection of logical entities, possibly represented in a somewhat controlled vocabulary such as an ontology. Here the task is to assign a certain audio-visual pattern (feature) to one or more of the logical entities; this is also known as bridging the semantic gap.
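A minimal sketch of such an assignment, with entirely invented concept names and feature values: low-level features extracted from a media segment are matched against prototype vectors attached to domain entities, here by nearest-neighbour distance. Real semantic-gap bridging uses far richer features and classifiers; this only shows the shape of the mapping.

```python
import math

# Toy "ontology": domain entities with hypothetical prototype feature vectors
# (e.g. dominant colour and motion energy -- all values made up here).
prototypes = {
    "football_pitch": (0.1, 0.8, 0.1, 0.7),
    "opera_stage":    (0.6, 0.2, 0.2, 0.2),
}

def classify(feature, prototypes):
    """Assign a feature vector to the nearest prototype (Euclidean distance)."""
    return min(prototypes, key=lambda name: math.dist(feature, prototypes[name]))

# Feature vector supposedly extracted from a video segment.
observed = (0.15, 0.75, 0.1, 0.65)
print(classify(observed, prototypes))  # -> football_pitch
```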
On the metadata level the main question is how to represent the content description. Taking MPEG-7 as one prominent multimedia metadata standard, the question here is that of formally describing its semantics (and hence being able to reason with it).
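The point of a formal semantics is that descriptions become something one can reason over. As an illustrative sketch only, the snippet below reduces a content description to subject-predicate-object triples and computes a transitive closure; the vocabulary ("depicts", "subEventOf") and the concepts are invented, and real MPEG-7 descriptions are XML Schema-based documents, not triples.

```python
# Hypothetical content description as triples (invented vocabulary).
triples = {
    ("segment1", "depicts", "Penalty"),
    ("Penalty", "subEventOf", "FootballMatch"),
    ("FootballMatch", "subEventOf", "SportsEvent"),
}

def entailed_events(start, triples):
    """Follow subEventOf links transitively from a starting concept."""
    found, frontier = set(), {start}
    while frontier:
        concept = frontier.pop()
        for s, p, o in triples:
            if s == concept and p == "subEventOf" and o not in found:
                found.add(o)
                frontier.add(o)
    return found

# A segment depicting a penalty is thereby also about the match and the event.
print(entailed_events("Penalty", triples))  # {'FootballMatch', 'SportsEvent'}
```

This kind of entailment is exactly what an informal XML description alone does not license, which is why a formal account of the metadata's semantics is needed.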
Relevance to Use Cases
With and being the proposed priorities, and ? indicating a not yet assigned item.
This is a non-exhaustive list of questions that should be answered in order to agree upon the ontological commitments of the media semantics modelling:
Does the essence itself have rigid properties? (Select a part of a still image and store it as a new document; this yields a new image that might allow a different interpretation. Another nice example is the compilation of arguments and counter-arguments done in the VoxPopuli project.)
Is the context in which we consume a certain piece of essence relevant? If yes, how should it be represented?
References which have not been directly linked from within the text (e.g. references to resources not available on the web).
(Grosky 94) W. I. Grosky. Multimedia Information Systems. IEEE MultiMedia, 1(1):12–24, 1994.