W3C > Semantic Web Use Cases and Case Studies

Case Study: A Digital Music Archive (DMA) for the Norwegian National Broadcaster (NRK) using Semantic Web techniques

Dr. Robert H.P. Engels, ESIS, and Jon Roar Tønnesen, NRK, Norway

September 2007


General Description

Introduction

Digitizing the complete radio and television production process is a major undertaking for many public and commercial broadcasters. Many public broadcasters possess enormous archives, often reaching back more than 60 years and including pre-WWII sound assets on bakelite, wax and even(!) chocolate. Whereas the older assets have proven remarkably resistant to the ravages of time, more modern storage formats such as digital video tape, certain CDs and tapes are not that robust. At NRK (the Norwegian National Broadcaster) it is expected that many tapes recorded in the late 1980s and early 1990s will be unrecoverable within 5 years unless immediate action is taken, and digitizing is considered an appropriate way of preserving these assets for the future.

Besides preservation, digitizing also makes the assets more easily available, eliminating many manual or labor-intensive steps in the production process. NRK estimates that in a year of broadcasting at most 5% of all assets available in-house are actually used on air.

Semantic Web technology is used primarily to open up the enormous amount of metadata on music tracks held in the archives, so that more of the “hidden treasures” are used in broadcasting. This potentially gives the broadcaster an advantage over the competition by being better informed and more interesting.

System objectives and components

Metadata for all registered music at NRK has been handled by a group of librarians from the archiving department. From 1962 to 1982 all registrations of incoming records were made on paper; from 1981 until 2007 all registrations were made in a simple, file-based, non-relational database. Objectives of the system are to:

The complete system was put into production during summer 2007.

Modeling the repository

An important design principle was to use Semantic Web technology for the solution, so as to bring a Semantic Web scenario to end users in a commercial environment. The designed solution is business-critical and runs in a real-world production environment. During initial tests, some drawbacks were identified, as well as potential opportunities for expanding the solution:


Figure 1: Some screenshots of the actual Digital Music Archive User Interface

Conclusion and future work

At design time, RDF stores and repositories did not yet seem able to store and reason over the sheer number of triples needed for a proper representation of the archive assets (2006 estimate: more than 150 million triples for the complete database). The evaluation showed that, at that time, the whole stack could not be served by Semantic Web technology proper. It was therefore decided to build a production-ready system in-house, in which objects, properties and relations are stored in a scalable RDBMS that is mapped to a Semantic Web based publication layer. This approach made a production system possible while still showing the benefits of Semantic Web technology in search and navigation scenarios. Part of the solution is an export layer through which all metadata can be exported to a variety of formats, including XML/OWL.
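The storage approach described above can be illustrated with a minimal sketch. It is not the NRK implementation: the table layout, the `http://example.org/dma/` namespace, the function names and the sample track are all assumptions made for illustration, and N-Triples is used as the export format only because it is simple to emit. The sketch keeps objects, properties and relations in relational tables and maps each row to one RDF triple on export.

```python
import sqlite3

# Hypothetical namespace; the real archive's URIs are not given in the case study.
BASE = "http://example.org/dma/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def build_store():
    """Create an in-memory relational store for objects, properties and relations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE object (id INTEGER PRIMARY KEY, kind TEXT NOT NULL);
        CREATE TABLE property (
            object_id INTEGER NOT NULL REFERENCES object(id),
            name TEXT NOT NULL,
            value TEXT NOT NULL
        );
        CREATE TABLE relation (
            subject_id INTEGER NOT NULL REFERENCES object(id),
            name TEXT NOT NULL,
            target_id INTEGER NOT NULL REFERENCES object(id)
        );
        """
    )
    return conn


def escape_literal(value):
    """Escape characters that may not appear raw inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def export_ntriples(conn):
    """Map every row of the relational store to one N-Triples line."""
    lines = []
    # Each object becomes an rdf:type statement.
    for oid, kind in conn.execute("SELECT id, kind FROM object ORDER BY id"):
        lines.append(f"<{BASE}object/{oid}> <{RDF_TYPE}> <{BASE}schema/{kind}> .")
    # Each property becomes a triple with a literal object.
    for oid, name, value in conn.execute(
        "SELECT object_id, name, value FROM property ORDER BY rowid"
    ):
        lines.append(
            f'<{BASE}object/{oid}> <{BASE}schema/{name}> "{escape_literal(value)}" .'
        )
    # Each relation becomes a triple linking two objects.
    for sid, name, tid in conn.execute(
        "SELECT subject_id, name, target_id FROM relation ORDER BY rowid"
    ):
        lines.append(
            f"<{BASE}object/{sid}> <{BASE}schema/{name}> <{BASE}object/{tid}> ."
        )
    return lines


if __name__ == "__main__":
    conn = build_store()
    # Made-up sample data, not taken from the archive.
    conn.execute("INSERT INTO object VALUES (1, 'Track')")
    conn.execute("INSERT INTO object VALUES (2, 'Artist')")
    conn.execute("INSERT INTO property VALUES (1, 'title', 'Morgenstemning')")
    conn.execute("INSERT INTO property VALUES (2, 'name', 'Edvard Grieg')")
    conn.execute("INSERT INTO relation VALUES (1, 'composedBy', 2)")
    print("\n".join(export_ntriples(conn)))
```

Keeping the triple mapping in the export function means the relational schema can be tuned for scale independently of the published vocabulary, which is the trade-off the text describes.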

Tests are being conducted on parts of the archive with currently available technology in order to evaluate how well today's systems scale.

As soon as SPARQL end-points for internet resources with metadata in the field of music become available, such connectors will be added to the administration module of the Digital Music Archive.

Key Benefits of Using Semantic Web Technology