Authors: Vassilis Tzouvaras, Michael Hausenblas, Stamatia Dasiopoulou
Use Case: MPEG-7 metadata interoperability
1. Introduction
MPEG-7 was developed to provide standardized tools for describing different aspects of multimedia at different levels of abstraction. Its XML-based syntax enables smooth interchange across applications and over the web, but the lack of precise semantics hinders metadata interoperability. Two representative examples include:
- Semantically identical metadata can be represented in multiple ways. For example, an image depicting a player scoring a goal can be annotated using the free text tag (“Zinedine Zidane scoring against England”), the keyword tag (Zidane, goal, France, England), the label tag, and so on.
<FreeTextAnnotation xml:lang="en"> Zinedine Zidane scoring against England. </FreeTextAnnotation>
| Using the free text annotation | 
<KeywordAnnotation xml:lang="en"> 
        <Keyword>Zinedine</Keyword>
        <Keyword>Zidane</Keyword>
        <Keyword>scoring</Keyword>
        <Keyword>England</Keyword>
        <Keyword>goal</Keyword>
</KeywordAnnotation>
| Using the keyword annotation |
<StructuredAnnotation>
        <Who>
                <Name xml:lang="en"> Zinedine Zidane </Name>
        </Who>
        <WhatAction>
                <Name xml:lang="en"> Zinedine Zidane scoring against England. </Name>
        </WhatAction>
</StructuredAnnotation>
| Using a structured annotation with labels |
<Semantic id="FormalAbstractionDescription"> 
      <SemanticBase xsi:type="AgentObjectType" id="Zidane">
         <Label> <Name> Zidane </Name> </Label>
         <Agent xsi:type="PersonType">
             <Name>
                <GivenName> Zinedine </GivenName>
                <FamilyName> Zidane </FamilyName>
             </Name>
         </Agent>
      </SemanticBase>
      <SemanticBase xsi:type="EventType" id="scoring">
         <Label> 
             <Name>  Zinedine Zidane scoring against England. </Name>
         </Label>
      </SemanticBase>
</Semantic>
| Using MPEG-7-built-in (non-formal) semantic descriptor |
- The intended semantics underlying the structure of descriptions defined within MPEG-7, for example the decomposition relation between an image and its constituent segments, are not formal and as such cannot be exploited automatically. For example, an image annotated as depicting Zidane and an image with a segment annotated as depicting Zidane will not both be retrieved by a corresponding ‘semantic’ query unless customized query expansion is performed to cover both cases.
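The first point can be made concrete with a minimal sketch. It assumes Python's standard xml.etree.ElementTree and trimmed, namespace-free versions of three of the fragments above (with the keyword spelled "Zidane"); the helper names `who_names` and `mentions` are hypothetical. A lookup written for one annotation style returns nothing for the others, so an application has to anticipate every variant or fall back to plain text matching.

```python
import xml.etree.ElementTree as ET

# Trimmed, namespace-free versions of three annotation styles shown above.
FREE_TEXT = '<FreeTextAnnotation>Zinedine Zidane scoring against England.</FreeTextAnnotation>'
KEYWORDS = (
    '<KeywordAnnotation>'
    '<Keyword>Zinedine</Keyword><Keyword>Zidane</Keyword>'
    '<Keyword>scoring</Keyword><Keyword>England</Keyword><Keyword>goal</Keyword>'
    '</KeywordAnnotation>'
)
STRUCTURED = (
    '<StructuredAnnotation>'
    '<Who><Name>Zinedine Zidane</Name></Who>'
    '<WhatAction><Name>Zinedine Zidane scoring against England.</Name></WhatAction>'
    '</StructuredAnnotation>'
)

def who_names(xml_text):
    """Lookup written only for the structured style: names under <Who>."""
    root = ET.fromstring(xml_text)
    return [n.text.strip() for n in root.findall('./Who/Name')]

def mentions(xml_text, term):
    """Fallback that ignores structure and scans all text content."""
    root = ET.fromstring(xml_text)
    return term.lower() in ' '.join(root.itertext()).lower()

docs = {'free_text': FREE_TEXT, 'keywords': KEYWORDS, 'structured': STRUCTURED}
structured_hits = {k: who_names(v) for k, v in docs.items()}
text_hits = {k: mentions(v, 'Zidane') for k, v in docs.items()}
print(structured_hits)
print(text_hits)
```

Only the structured annotation answers the `<Who>` lookup, although all three fragments describe the same scene. The text scan finds all three, but at the cost of discarding the very structure MPEG-7 provides.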
2. Existing MPEG-7 ontologies
To alleviate the resulting interoperability issues, efforts have been undertaken to translate MPEG-7 into an ontology and, through appropriate frameworks, to enable its integration with other ontologies, thus enhancing interoperability. The two main methodologies are the proposals by Hunter et al. and Tsinaraki et al. Both approaches aim to provide a framework for interoperable MPEG-7 compliant multimedia metadata. However, given the continuously growing research interest in formalizing multimedia-related semantics and building a common metadata framework, the question of how interoperable these proposals are with each other becomes particularly important.
2.1. Using the MPEG-7/ABC Ontology
In the approach proposed by Hunter, the ABC ontology serves as the core ontology, providing attachment points for integrating the MPEG-7 ontology and domain-specific ontologies. More specifically, the mpeg7:MultimediaContent class (and the subsequent multimedia and segment hierarchy) is defined as a subclass of the abc:Manifestation class, while the corresponding domain ontologies are assumed to be appropriately attached to corresponding ABC classes.
A first observation is that, apart from the structure-related description schemes, MPEG-7 also includes descriptions of other aspects (e.g., the semantic ones), for which it is not clear how they should be mapped to ABC or how they relate to possibly relevant domain-specific definitions. For example, mpeg7:Agent could be mapped to abc:Agent. A domain-specific class o:Person would in turn have to be linked to abc:Agent, whether as an equivalent class, as a subclass, or through some property. This leaves the semantics of the relation between mpeg7:Agent and o:Person open, which in turn reduces interoperability between pre-existing MPEG-7 based annotation metadata and metadata newly created under the ABC core ontology framework.
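How much depends on the choice of link can be seen in a small sketch. It assumes a toy reasoner that only follows subclass axioms, treats class equivalence as a subclass relation in both directions, and takes mpeg7:Agent to be declared equivalent to abc:Agent. The function name `is_subclass` and this setup are illustrative assumptions and not part of either proposal.

```python
def is_subclass(sub, sup, axioms):
    """True if `sup` is reachable from `sub` via subclass/equivalence axioms."""
    edges = set()
    for a, rel, b in axioms:
        if rel in ('subClassOf', 'equivalentClass'):
            edges.add((a, b))
        if rel == 'equivalentClass':
            edges.add((b, a))  # equivalence holds in both directions
    seen, stack = {sub}, [sub]
    while stack:
        current = stack.pop()
        for a, b in edges:
            if a == current and b not in seen:
                seen.add(b)
                stack.append(b)
    return sup in seen

# Assumed mapping of the MPEG-7 class onto the ABC core class.
base = [('mpeg7:Agent', 'equivalentClass', 'abc:Agent')]
# Two of the alternatives for attaching the domain class o:Person.
as_equivalent = base + [('o:Person', 'equivalentClass', 'abc:Agent')]
as_subclass = base + [('o:Person', 'subClassOf', 'abc:Agent')]

# Each pair: (o:Person is an mpeg7:Agent, mpeg7:Agent is an o:Person)
results = {
    'equivalent': (is_subclass('o:Person', 'mpeg7:Agent', as_equivalent),
                   is_subclass('mpeg7:Agent', 'o:Person', as_equivalent)),
    'subclass': (is_subclass('o:Person', 'mpeg7:Agent', as_subclass),
                 is_subclass('mpeg7:Agent', 'o:Person', as_subclass)),
}
print(results)
```

Under equivalence every mpeg7:Agent is also inferred to be an o:Person, whereas under subclassing only the converse holds. Linking through a property would yield neither inference, so the same annotation answers different queries depending on a choice the framework leaves open.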
Let us assume that someone follows the approach by Hunter, using the Multimedia Description Scheme (MDS) part of the MPEG-7 ontology to address the structural aspects, in order to annotate an image depicting Zidane scoring. Assuming a soccer ontology s, the involved classes would be s:Goal, s:Player, s:Scoring and mpeg7:Image (at least in a simple case where spatiotemporal decomposition is not taken into account). One possible way to represent this annotation is with the following statements:
:image01 rdf:type mpeg7:Image
:goal01 rdf:type s:Goal
:scoring01 rdf:type s:Scoring
:image01 mpeg7:depicts :goal01
:goal01 abc:hasAction :scoring01
:scoring01 abc:hasAgent _:b1
_:b1 :hasName 'Zinedine Zidane'
where additionally the following hold:
mpeg7:Image rdfs:subClassOf mpeg7:MultimediaContent
mpeg7:MultimediaContent rdfs:subClassOf abc:Manifestation
s:Scoring rdfs:subClassOf abc:Action
s:Goal rdfs:subClassOf abc:Event
Notice that under this framework, having attached this annotation to a specific image region rather than the whole image, i.e.
:region01 rdf:type mpeg7:StillRegion
:region01 mpeg7:depicts :goal01
we would be able to retrieve the corresponding image when querying for images depicting Zinedine Zidane scoring, due to the subclass relation mpeg7:StillRegion rdfs:subClassOf mpeg7:Image, something that is not inherently possible with MPEG-7 itself.
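The retrieval argument can be traced with a minimal sketch. It assumes a toy triple store in plain Python that applies only rdfs:subClassOf inference (a real system would use an RDF library and a reasoner). The helper names are hypothetical, and the statements are those given above, written with rdfs:subClassOf and a blank node _:b1, with the annotation attached to :region01.

```python
SUBCLASS = 'rdfs:subClassOf'
TYPE = 'rdf:type'

triples = {
    (':region01', TYPE, 'mpeg7:StillRegion'),
    (':goal01', TYPE, 's:Goal'),
    (':scoring01', TYPE, 's:Scoring'),
    (':region01', 'mpeg7:depicts', ':goal01'),
    (':goal01', 'abc:hasAction', ':scoring01'),
    (':scoring01', 'abc:hasAgent', '_:b1'),
    ('_:b1', ':hasName', 'Zinedine Zidane'),
    # Schema-level statements from the text.
    ('mpeg7:StillRegion', SUBCLASS, 'mpeg7:Image'),
    ('mpeg7:Image', SUBCLASS, 'mpeg7:MultimediaContent'),
    ('mpeg7:MultimediaContent', SUBCLASS, 'abc:Manifestation'),
    ('s:Scoring', SUBCLASS, 'abc:Action'),
    ('s:Goal', SUBCLASS, 'abc:Event'),
}

def objects(graph, s, p):
    """All objects o such that (s, p, o) is in the graph."""
    return {o for (s2, p2, o) in graph if s2 == s and p2 == p}

def infer_types(graph):
    """Add rdf:type triples implied by rdfs:subClassOf until nothing changes."""
    graph = set(graph)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(graph):
            if p != TYPE:
                continue
            for sup in objects(graph, o, SUBCLASS):
                if (s, TYPE, sup) not in graph:
                    graph.add((s, TYPE, sup))
                    changed = True
    return graph

def images_of_scorer(graph, name):
    """Resources typed mpeg7:Image depicting an event whose action has an agent called `name`."""
    hits = set()
    for (s, p, o) in graph:
        if p == TYPE and o == 'mpeg7:Image':
            for event in objects(graph, s, 'mpeg7:depicts'):
                for action in objects(graph, event, 'abc:hasAction'):
                    for agent in objects(graph, action, 'abc:hasAgent'):
                        if name in objects(graph, agent, ':hasName'):
                            hits.add(s)
    return hits

without_inference = images_of_scorer(triples, 'Zinedine Zidane')
with_inference = images_of_scorer(infer_types(triples), 'Zinedine Zidane')
print(without_inference, with_inference)
```

Without inference the query finds nothing, because :region01 is only asserted to be a still region. With the subclass closure it is also typed as mpeg7:Image and is returned. Note that what comes back is :region01 itself; recovering the enclosing image would need a decomposition relation, which these statements do not contain.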
Leaving aside individual modeling decisions (e.g., whether still regions should be modeled as a subclass of image or related to it only through partonomic decomposition relations), one sees evidence for the value of using an upper ontology that is generic enough to allow consistent integration between an MPEG-7 ontology and domain-specific ones.
2.2. Using the MPEG-7/Tsinaraki Ontology
In the approach by Tsinaraki et al., on the other hand, the semantic part of MPEG-7 is translated into an ontology that serves as the core for attaching domain-specific ontologies, in order to achieve MPEG-7 compliant domain-specific annotations. A first observation is that under this approach the initial conceptualization of the domain-specific ontologies needs to be “mapped” to the MPEG-7 modeling rationale. Consequently, annotation metadata produced with this approach would not be interoperable with metadata from approaches that couple domain-specific ontologies with an MPEG-7-like one following a procedure similar to the one proposed by Hunter.
3. Possible Solutions
This section presents possible solutions to the interoperability problems that arise from the different translations/formalisations of the MPEG-7 standard, as illustrated in the motivating example. Three approaches in the literature try to overcome such interoperability problems. These approaches are:
TODO: Michael to describe syntactic (XML, XML-Schema) and semantic (RDF/OWL/rules) aspects.
- Create syntactic mappings between terms of two or more standards (e.g. CIDOC CRM vs. Dublin Core). The proposed solution exploits the expressive power and reasoning support of OWL and SWRL (or another rule language layered on top of ontologies) in order to create syntactic as well as semantic mappings.
- Align the domain ontologies in a multimedia core ontology (or framework) that ensures interoperability. This approach covers the work that is in progress in the K-Space project.
- Use MPEG-7 profiles. This approach will be mainly covered by Michael.
The aim of this section is not to present the solutions in full detail but rather the mechanisms that ensure interoperability in MPEG-7 based multimedia applications. In addition, we will present the interoperability problems that are solved and the new ones that are introduced.