This workshop brought together developers and users working on the multilingual application of metadata and the semantic web. It discussed the problems and issues involved in making the Web truly world wide, and how that goal and the development of a more semantically rich web affect each other.
Some tools were presented, and both areas of success and many areas requiring significant further work were identified. Work on glossary tools was directly advanced in preparation for and as a result of this workshop.
The workshop was jointly organised with CEN-ISSS MMI-DC, ensuring rapid flow of information to other relevant European organisations.
This report is part of the SWAD-Europe project Work package 3: Dissemination and Implementation.
It describes a developer workshop held in Copenhagen in July 2004, on the topic of metadata in a multilingual world. The workshop was attended by developers based in Europe, with additional participation from the USA and Australia. This workshop was held jointly with the CEN-ISSS MMI-DC workshop group, which investigates and makes recommendations on the use of metadata in Europe.
A short list of background reading for workshop participants is available. Two position papers were provided, one of them from Thomas Baker, who was unable to attend the meeting himself.
The Dublin Core Metadata Activity maintains a number of international mirrors of its content, with a variable amount available in translation, including schemas, documents describing usage, etc.
The W3C maintains a collection of documents, and a glossary derived from those documents. A large number of volunteer translators provide translations of various of these documents, and in some cases of terms from the glossary, in order to help ensure consistency of translations.
A number of countries or organisations are working in multiple languages - Fundación Sidar is one, working primarily in Castellano, Catalá, Gallego and Português with collections of documents and tools. Some of Sidar's tools, such as Hera, are using multilingual RDF vocabularies as a base for interfaces and document output.
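As an illustration of how a multilingual vocabulary can drive an interface, the following is a minimal sketch of choosing a label by language preference with a fallback. It is not taken from Hera or any Sidar tool: the term identifier `ex:checkpoint`, the labels, and the function `label_for` are all hypothetical, and the vocabulary is held in a plain dictionary rather than an RDF store.

```python
# Hypothetical vocabulary: each term maps language tags to display labels.
# In an RDF vocabulary these would be language-tagged literals on one resource.
LABELS = {
    "ex:checkpoint": {
        "es": "Punto de verificación",
        "ca": "Punt de verificació",
        "pt": "Ponto de verificação",
        "en": "Checkpoint",
    },
}


def label_for(term, preferred, labels=LABELS, default="en"):
    """Return (language, label) for the first preferred language that has a label."""
    available = labels.get(term, {})
    for tag in list(preferred) + [default]:
        # Try the full tag first (e.g. "pt-BR"), then its primary subtag ("pt").
        for candidate in (tag, tag.split("-")[0]):
            if candidate in available:
                return candidate, available[candidate]
    # No label at all: fall back to the term identifier itself.
    return None, term


if __name__ == "__main__":
    print(label_for("ex:checkpoint", ["gl", "pt-BR"]))
```

In this sketch a user who prefers Galician and then Brazilian Portuguese is shown the Portuguese label, because no Galician label exists in the hypothetical data. A real interface would need a more careful matching policy than this simple subtag truncation.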
There are an increasing number of applications of multilingual approaches and technologies to providing accessibility for people with disabilities, including the use of simplified language and the provision of visual or other multimedia aids to comprehension.
Finally, many governments and similar organisations (such as the European Commission) are required to work in a number of languages at once, and need to ensure that they are providing and managing information appropriately for this need.
The workshop was attended by developers from
It was broadcast via the #rdfig IRC channel, allowing remote participation, and at relevant points various developers took part in the workshop through this method.
A number of tools and vocabularies were presented or discussed. The first day's discussion log, the first day's "chumped" highlights, the second day's log and the second day's highlights are all available.
It was clear that in many areas there is a lot of development, with tools moving towards the level of commercially developed end-user products, while other areas still involve research and development. It is also clear that this is a very large area for exploration, and that most systems currently work only at a fairly basic level.
In particular, tools dealing with time or location in any complexity tend to be in the early phases of development.
It was clear that the complexity of this area is due in part to the fact that language cannot be readily separated from its cultural context in many important use cases, and that representing this information in a machine-readable way is therefore a very complex problem.
It seems that simple dictionary tools are useful for many cases, and these are relatively advanced in development.
A substantial amount of discussion was devoted to looking at use cases - the things that participants actually want the semantic web to do. These were discussed briefly in the logs as well as being collected in the highlights.
A clear outcome of the workshop was the need for simple step-by-step explanations of how to use vocabularies, oriented to developers who want to copy working examples rather than understand the entire theoretical base and then derive their own tools and code.
Systems that can handle any type of postal address should, in theory, be readily available, but they are not yet in widespread use. Handling areas such as people's names or the locations of things is still largely a matter of research, although some simple use cases are being met by existing tools.
The outcomes of this workshop, and in particular the lessons learned in the discussions, will be used to inform discussions at the FOAF workshop to take place in Galway, on the topic of how to capture the names of people (and places) in a way that makes sense in the context of a multilingual world wide semantic web.
W3C's glossary system was updated to use the SKOS vocabulary, developed for the SWAD-E project, in preparation for this workshop.
As a planned follow-up to the workshop, Sidar's glossary is expected to migrate to SKOS, as a preliminary step to developing interoperability with the W3C glossary.
Attempting to run a workshop or any similar event in Europe during the two peak months of summer is difficult, and somewhat reduces the opportunity for the broadest possible participation. This effect is spread over at least two months of the year since people take vacations at different times. As this workshop showed, it is possible to make important and valuable progress during this extended slowdown, but the limited availability of people means that it is unlikely to be possible at the same rate as at other times of year.
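To give a sense of what a glossary entry expressed in SKOS might look like, the following is a minimal sketch that writes one entry as Turtle. It assumes an entry is modelled as a `skos:Concept` with language-tagged `skos:prefLabel` and `skos:definition` values. The actual structure of the W3C and Sidar glossaries is not described in this report, and the entry URI, the labels, and the function `concept_to_turtle` are hypothetical.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def escape(text):
    """Minimal escaping for a single-line Turtle string literal (assumes no newlines)."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def concept_to_turtle(uri, pref_labels, definitions=None):
    """Serialise one glossary entry as a skos:Concept in Turtle.

    pref_labels and definitions map language tags to text.
    """
    lines = [f"<{uri}> a skos:Concept"]
    for lang, label in sorted(pref_labels.items()):
        lines.append(f'    skos:prefLabel "{escape(label)}"@{lang}')
    for lang, text in sorted((definitions or {}).items()):
        lines.append(f'    skos:definition "{escape(text)}"@{lang}')
    # Predicates on the same subject are separated by ";" and the block ends with ".".
    return f"@prefix skos: <{SKOS_NS}> .\n\n" + " ;\n".join(lines) + " .\n"


if __name__ == "__main__":
    # Hypothetical entry; the URI is a placeholder, not a real glossary address.
    print(concept_to_turtle(
        "http://example.org/glossary#metadata",
        {"en": "metadata", "es": "metadatos"},
    ))
```

Because both glossaries would describe their terms with the same properties, a translation tool could in principle look up the label for a concept in either source without knowing how each glossary is stored internally.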