Difference between revisions of "Vocabulary and Dataset"

From Library Linked Data

Revision as of 10:57, 24 July 2011

LLD Datasets, Value Vocabularies and Metadata Element Sets

Editors: Antoine Isaac, William Waites, Jeff Young, Marcia Zeng.

This page is a draft for the side deliverable on "LLD Vocabularies and Datasets" as penciled in this plan.

@@TODO: for general TODOs see the Discussion page


Introduction: Scope and Definitions

This document, a deliverable from the W3C Library Linked Data incubator group, is an attempt to identify a set of useful resources for creating or consuming linked data in the library domain. It is intended both for novices seeking an overview of the library linked data domain, and for experts in search of a quick look-up or refresher. The final report of the incubator group suggests that the success of linked data in any domain relies on the ability of its practitioners to identify, re-use or connect to already available datasets and data models. Library Linked Data is not an exception. Such an identification effort is crucial given the complexity and variety of library data resources, many of them already available as linked data at the time of writing this report. We hope that this document will help those who undertake such a task.

In previous library terminology explanation efforts, we have identified the following types of resources of interest, which are not mutually exclusive (as shown later):

  • Datasets : A dataset is a collection of structured metadata -- descriptions of things, such as books in a library. The equivalent of a dataset in the library world is a collection of Library records. Library records consist of statements about things, where each statement consists of an element ("attribute" or "relationship") of the entity, and a "value" for that element. The elements that are used are usually selected from a set of standard elements, such as Dublin Core. The values for the elements are either taken from value vocabularies such as LCSH, or are free text values. Similar notions to "dataset" include "collection" or "metadata record set". Note that in the Linked Data context, Datasets do not necessarily consist of clearly identifiable "records".

    • a record from a dataset for a given book could have a Subject element drawn from Dublin Core, and a value for Subject drawn from LCSH.
    • the same dataset may contain records for authors as first-class entities that are linked from their book, described with elements like "name" from FOAF.
    • a dataset may be self-describing, in that it contains information about itself as a distinct entity, for example with a modified date and maintainer/curator elements drawn from Dublin Core.

  • Value vocabularies : A value vocabulary defines resources (instances of topics, art styles, authors) that are used as values of elements in metadata records. Typically a value vocabulary does not define bibliographic resources such as books but concepts related to bibliographic resources (persons, languages, countries, etc.). They are "building blocks" with which metadata records can be populated. Many libraries mandate specific value vocabularies for selecting values for a particular metadata element. A value vocabulary thus represents a "controlled list" of allowed values for an element. Examples include: thesaurus, code list, term list, classification scheme, subject heading list, taxonomy, authority file, digital gazetteer, concept scheme, and other types of knowledge organisation system. Value vocabularies often have HTTP URIs assigned to their values, which can appear in a metadata record instead of, or in addition to, the literal value.

    • LCSH defines topics of books (e.g., Travel).
    • Art and Architecture Thesaurus defines art styles (e.g., Impressionist) among others.
    • VIAF defines authorities (e.g., Mark Twain).
    • GeoNames defines geographical locations (e.g. Paris).

  • Metadata element sets or element sets: A metadata element set defines classes and attributes used to describe entities of interest. In the linked data terminology, such element sets are generally made concrete through (RDF) schemas or (OWL) ontologies, the term "RDF vocabulary" being often used as an umbrella for these. Usually a metadata element set does not describe bibliographic entities, rather it provides elements to be used by others to describe such entities.

    • Dublin Core defines elements such as Creator and Date (but DC does not define bibliographic records that use those elements).
    • FRBR defines entities such as Work and Manifestation and elements that link and describe them.
    • MARC21 defines elements (fields) to describe bibliographic records and authorities.
    • FOAF and ORG define elements to describe people and organisations as might be used for describing authors and publishers.
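The three resource types above can be seen working together in a small sketch. It models a tiny dataset as (entity, element, value) statements in Python, with elements taken from Dublin Core and FOAF and one value taken from a value vocabulary. All example.org URIs are hypothetical, and the LCSH identifier is a placeholder, not a real heading URI.

```python
# Real element set namespaces.
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical entity URIs, made up for this illustration.
BOOK = "http://example.org/book/1"
AUTHOR = "http://example.org/person/1"
DATASET = "http://example.org/dataset"
# Placeholder identifier, NOT a real LCSH heading URI.
LCSH_TRAVEL = "http://id.loc.gov/authorities/subjects/shPLACEHOLDER"

# Each statement is (entity, element, value); a value is a URI or free text.
statements = [
    (BOOK, DCTERMS + "subject", LCSH_TRAVEL),       # value from a value vocabulary
    (BOOK, DCTERMS + "title", "A Tramp Abroad"),    # free-text value
    (BOOK, DCTERMS + "creator", AUTHOR),            # author as a first-class entity
    (AUTHOR, FOAF + "name", "Mark Twain"),
    (DATASET, DCTERMS + "modified", "2011-07-24"),  # the dataset describes itself
]


def describe(entity, data):
    """Group the statements about one entity as {element: [values]}."""
    description = {}
    for subject, element, value in data:
        if subject == entity:
            description.setdefault(element, []).append(value)
    return description


book = describe(BOOK, statements)
```

Note that nothing here forms a "record" in the traditional sense: the description of the book is simply the set of statements that share the same subject.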

This report is intended as an entry point for practitioners to find, understand and explore some exemplar Metadata Element Sets, Value Vocabularies and Datasets. It is especially grounded in the use cases our incubator group has gathered. We do not aim here to draw up a complete list of the various resources related to the (library) linked data "cloud". We hope this report will prove an inspirational complement to more complete listing tools such as Semantic Web search engines (like Sindice or Falcons), or registries such as the Metadata Registry or CKAN. We of course encourage our readers to also use these, just as we did ourselves for the CKAN dataset registry.


CKAN is a registry for datasets. It is a tool for people to share information about datasets of all types and collaboratively describe them. The CKAN registry is not itself a linked data service; however, there is a linked data version of the information it contains. Many of the datasets described in CKAN are in linked data form.

CKAN has the concept of curated groups of datasets and is used to maintain information about membership of the wider LOD Cloud as well as the subset that pertains to Library Linked Data. The curators of these groups have arrived at a set of conventions for using the tagging facilities in CKAN to describe datasets that are to be included. This includes information about dataset size, example resources and access methods (e.g. SPARQL endpoints) and, crucially, links to other datasets.


When publishing a new dataset, adding it to CKAN means that it is included in a frequently consulted list of datasets. Following the conventions of the LOD Cloud and LLD groups means that its relationships to other datasets are documented and that it will be counted amongst the growing number of linked data corpora and appear in diagrams and visualisations that are produced as part of the study of this type of data. Having such datasets documented in this way means that we can build tools to gain a greater understanding of their nature and how they fit together. Whilst interesting in itself, this process is important in that this kind of understanding makes it easier to determine if a particular dataset is suitable or appropriate for a given task and thus makes it easier to use.

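To illustrate the results of this process, consider the following diagram of the datasets in the LLD group and their links.

[Diagram: datasets of the CKAN LLD group and their connections. Original at: http://semantic.ckan.net/group/?group=http://ckan.net/group/lld]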

The brightly coloured circles represent the datasets that are part of the LLD group. The grey circles represent datasets that they are connected to but that are not members of this group (they typically are members of the LOD Cloud group). The size of the circles and the thickness of the lines are related to the size of the dataset and the number of outward links (logarithmic) respectively. It is immediately apparent that, though there are some densely connected clusters of datasets in LLD, the majority are actually connected through datasets that are not necessarily library data in themselves, with DBpedia and GeoNames figuring prominently. It is also apparent that linking to other datasets that do not have this central character is quite common.
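The logarithmic relation between link counts and line thickness can be sketched as follows. The exact formula used for the diagram is not given here, so the function below (name, base width and the choice of base-10 logarithm) is only an assumption meant to show why a logarithmic scale keeps heavily linked datasets from dominating the picture.

```python
import math


def line_thickness(outward_links, base_width=1.0):
    """Map a link count to a stroke width that grows with log10 of the count.

    Hypothetical scaling, not the formula used for the original diagram.
    """
    if outward_links < 1:
        return 0.0  # no links, no line
    return base_width * (1.0 + math.log10(outward_links))
```

With this scaling, a dataset with a thousand times more outward links gets a line that is only a few times thicker, so both small and large link sets remain visible.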

Published Datasets

@@TODO: Just before the final delivery of this document, we will add here a snapshot of the CKAN LLD group. I.e., a simple bullet list that sums up the packages available there, with direct pointers to these.

Value vocabularies

Published value vocabularies

This section describes value vocabularies, which have been made available as linked data and/or mentioned as being relevant by one of the LLD XG cases.

Every entry features a brief introduction to the vocabulary, as well as links to its location. Cases collected by the LLD XG that refer to the value vocabulary are also listed under each entry.

Classification systems

Dewey Decimal Classification (DDC) summaries

Dewey Summaries is a suitable data set containing the top classes of Dewey Decimal Classification (DDC) 22. It provides access to the top three levels of the DDC in eleven languages and access to Abridged Edition 14 (assignable numbers and captions) in three languages.

Universal Decimal Classification (UDC) summary

The Universal Decimal Classification (UDC) is a multilingual classification scheme for all fields of knowledge. The UDC Summary represents a selection of around 2,000 classes extracted from the UDC scheme. [1]

Subject Headings/Subject Authority Files

Library of Congress Subject Headings (LCSH)

LCSH is a comprehensive list of subject headings published in print and as linked data. Subject authority headings can be accessed through Library of Congress Authorities.

RAMEAU, French National Library

RAMEAU is a subject heading vocabulary used by the French National Library. It was developed starting from the subject heading repository of the Quebec University, which is itself derived from the Library of Congress Subject Headings (LCSH). RAMEAU has been published as linked data by the TELplus project.

SWD (Schlagwortnormdatei), German National Library

A controlled vocabulary system managed by the German National Library (DNB) in cooperation with various library networks. The inclusion of keywords in the SWD is defined by "Rules for the Keyword Catalogue" (RSWK). [2]

National Diet Library List of Subject Headings (NDLSH)

The National Diet Library List of Subject Headings (NDLSH) is a list of subject headings applied to the catalog of the National Diet Library, including mainly the topical headings and some proper name headings. [3]

Name Authority Data

VIAF (Virtual International Authority File)

VIAF is a joint project of multiple national libraries around the world that virtually combines the name authority files of participating institutions into a single name authority service. As of the winter of 2011, there are 21 authority files of personal, corporate, and conference names from 18 organizations participating in VIAF. [4]

Union List of Artist Names (ULAN)

ULAN is a structured vocabulary containing more than 225,000 names and biographical and bibliographic information about artists and architects, including a wealth of variant names, pseudonyms, and language variants.

It is not yet published as linked data per se, but appears in [5].


GeoNames

The GeoNames geographical database contains over 10 million geographical names. It consists of 7.5 million unique features, of which 2.8 million are populated places, and 5.5 million alternate names. [6]


STW Thesaurus for Economics

The thesaurus provides vocabulary on any economic subject. It also covers technical terms used in law, sociology, or politics, and geographic names.[7]

AGROVOC multilingual thesaurus

AGROVOC is a multilingual structured and controlled vocabulary designed to cover the terminology of all subject fields in agriculture, forestry, fisheries, food and related domains (e.g. environment). [8]

Eurovoc - Multilingual Thesaurus of the European Union

EuroVoc is a multilingual, multidisciplinary thesaurus covering the activities of the EU, the European Parliament in particular. It contains terms in 24 languages (as of May 2011).[9]

Other Controlled Vocabularies

MARC Code List of Relators (MARC Relators) (also in element sets)

The MARC Relators provide a list of properties for describing the relationship between a name and a bibliographic resource.


PRONOM

PRONOM is the online registry of technical information about the file formats, software products and other technical components required to support long-term access to electronic records and other digital objects of cultural, historical or business value. [10]

Creative Commons (CC) License set

The Creative Commons provides an infrastructure which consists of a set of copyright licenses and tools that create a balance inside the traditional “all rights reserved” setting that copyright law creates. [11]

Preservation vocabularies from LoC

Preservation Events

A concept scheme for the preservation events, i.e., actions performed on digital objects within a preservation repository.

Preservation Level Role

A concept scheme for the preservation level roles, i.e., values that specify in what context a set of preservation options is applicable.

Additional sources


WordNet

WordNet is a lexical database of English where nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (called "synsets"). Each synset expresses a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. [12] WordNet has been published as linked data by the Vrije Universiteit Amsterdam.

Freebase (also in datasets)

Freebase is an open, Creative Commons licensed collection of structured data, and a platform for accessing and manipulating that data via the Freebase API. Freebase imports data from a wide variety of open data sources, such as Wikipedia, MusicBrainz, and others.[13] Note that Freebase is essentially a dataset, but because it includes many reference resources, some parts of it can be used as value vocabularies in certain cases.


DBpedia

DBpedia extracts structured information from Wikipedia. The DBpedia data set features labels and abstracts for over three million things, half of them classified in an ontology, and contains millions of links to images, external web pages, and other RDF datasets. [14]

Work in progress, or relevant for cases but not in progress officially

Aquatic Sciences and Fisheries Abstracts (ASFA) Thesaurus

The Thesaurus is used for the subject indexing of the Aquatic Sciences and Fisheries Abstracts (ASFA), an abstracting and indexing service that covers the world's literature on the science, technology, management, and conservation of marine, brackish water, and freshwater resources and environments, including their socio-economic and legal aspects.[15]

Fisheries Reference Metadata

The Fisheries Reference Metadata system stores all the classification systems (for species, countries, water areas, commodities, fishing vessels, fishing gears, etc.) used by FAO to describe fisheries observations such as time-series data on fisheries capture and production and species fact sheets.

Agriculture Thesaurus and Glossary

The Agricultural Thesaurus and Glossary are online vocabulary tools of agricultural terms in English and Spanish provided by the USDA National Agricultural Library. The subject scope of agriculture is broadly defined in the NAL Agricultural Thesaurus, and includes terminology in the supporting biological, physical and social sciences. The definitions of terms in the thesaurus were separately published as the Glossary of Agricultural Terms.[16]

Art and Architecture Thesaurus (AAT)

A multilingual controlled vocabulary for fine art, architecture, decorative arts, archival materials, and material culture, intended for indexing, cataloging and searching, and as a research tool.

Medical Subject Headings (MeSH)

A comprehensive controlled vocabulary produced by the National Library of Medicine (NLM) for biomedical and health-related information and documents.


A classification system for describing and classifying the subject of images represented in various media such as paintings, drawings and photographs.

The Getty Thesaurus of Geographic Names (TGN)

A structured, world-coverage vocabulary of over 1.3 million names, including vernacular and historical names, coordinates, place types, and descriptive notes, focusing on places important for the study of art and architecture.

Other value vocabularies relevant to the LLD field, not mentioned in the use cases

New York Times subject headings

The New York Times uses approximately 30,000 tags to power its Times Topics Pages. These tags (categorized into 'people', 'organization', 'place', and 'descriptor') are published as linked open data and are mapped to Freebase, DBpedia, and GeoNames.

MARC Countries list

MARC Countries list identifies current national entities, states of the United States, provinces and territories of Canada and Australia, divisions of the United Kingdom, and internationally recognized dependencies. The entries include references to their equivalent ISO 3166 codes.

MARC List for Languages

The MARC List for Languages provides three-character lowercase alphabetic strings that serve as the identifiers of languages and language groups. It has been cross-referenced with ISO 639-1, 639-2, and 639-5, where appropriate.
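As a minimal sketch of how such a cross-reference might be used, the Python snippet below checks the shape of a MARC language code and looks up the corresponding ISO 639-1 code. The table is a hand-picked excerpt of four well-known pairs, not the full list, and the function names are hypothetical.

```python
import re

# Small excerpt of MARC language code -> ISO 639-1 code pairs.
MARC_TO_ISO639_1 = {
    "eng": "en",  # English
    "fre": "fr",  # French
    "ger": "de",  # German
    "spa": "es",  # Spanish
}


def has_marc_code_shape(code):
    """True if the string is three lowercase ASCII letters.

    This checks the shape only, not whether the code is actually assigned.
    """
    return re.fullmatch(r"[a-z]{3}", code) is not None


def to_iso639_1(marc_code):
    """Return the ISO 639-1 code, or None if this excerpt has no mapping."""
    if not has_marc_code_shape(marc_code):
        raise ValueError("not a three-letter lowercase code: %r" % marc_code)
    return MARC_TO_ISO639_1.get(marc_code)
```

Returning None for an unmapped code reflects the "where appropriate" in the description: not every MARC language code has a two-letter ISO 639-1 equivalent.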

Metadata Element Sets

This section lists metadata element sets mentioned in the use cases gathered by the Library Linked Data group in 2010-2011. These include some of the most relevant RDF vocabularies for practitioners who want to re-use available Semantic Web technology for creating or converting data from the library domain.

These RDF vocabularies are represented using the constructs offered by the RDF Schema (RDFS) and Web Ontology Language (OWL) ontology modeling languages. In addition to the documentation made available on their own websites, the reader can view their content using generic ontology creation and visualization tools such as Protégé, the Manchester ontology browser, OWL Sight or the Live OWL Documentation Environment (see for example the DOAP ontology rendered in LODE).

For each element set, we give a pointer to a human-readable website and indicate the corresponding RDF namespace, as well as a common abbreviation used for it. We also provide or re-use a short description, focused on the main scope or usage domain for the element set. We have sometimes emphasized important design decisions that characterize the element set, including indications on whether the element set is connected to another one, or on its relation to traditional library usages. Finally, cases collected by the LLD XG are also listed under each entry as relevant usage examples.

Metadata element sets published as RDF vocabularies

This sub-section lists the relevant ontologies (OWL or RDFS) available at the time of writing this report, as identified in the cases gathered by the LLD Incubator Group.

@@TODO: Similar to what LOV and the UMBEL doc have done, we could include a graph that shows all our metadata element sets, a bit like for DC in LOV. The links would indicate that a metadata element set "re-uses" another. And the size of the circles would depend on the number of times the vocabulary appears in our use cases (which can be --even manually-- extracted from our LLD vocabulary wiki page, as Paul has done here). PLEASE DON'T FORGET THAT THE MOCK-UP BELOW IS JUST TO GIVE AN IDEA!!!


Other relevant pointers (as a reminder):

Dublin Core

Dublin Core 1.1 is the legacy Dublin Core element set containing 15 basic property elements capable of describing anything. A critical aspect of these properties is the lack of an rdfs:range setting, which allows one to use them both with literal values and with fully-fledged RDF resources.

The DCMI Metadata Terms /terms/ namespace refines the legacy /elements/1.1/ namespace with some rdfs:range restrictions and a variety of new properties. Note that interoperability with the /elements/1.1/ set is preserved via rdfs:subPropertyOf.
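To make the role of rdfs:subPropertyOf concrete, here is a minimal Python sketch of the inference it licenses: every statement made with a /terms/ property also holds for its legacy /elements/1.1/ parent. The table covers only three of the declared sub-property links, the function name is hypothetical, and the example.org URIs are made up.

```python
DC11 = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"

# Excerpt of the rdfs:subPropertyOf links declared by DCMI Metadata Terms.
SUB_PROPERTY_OF = {
    DCTERMS + "creator": DC11 + "creator",
    DCTERMS + "title": DC11 + "title",
    DCTERMS + "subject": DC11 + "subject",
}


def with_legacy_statements(triples):
    """Add the statements implied by rdfs:subPropertyOf to a set of triples."""
    result = set(triples)
    for subject, prop, value in triples:
        parent = SUB_PROPERTY_OF.get(prop)
        if parent is not None:
            result.add((subject, parent, value))
    return result


# Hypothetical data: a book linked to its author with the newer property.
data = {("http://example.org/book/1", DCTERMS + "creator", "http://example.org/person/1")}
expanded = with_legacy_statements(data)
```

A consumer that only understands the legacy 15 elements can therefore still find the creator of the book, which is the interoperability the sub-property links are meant to preserve.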

Friend of a Friend (FOAF)

FOAF is a basic common-sense and widely used ontology for describing persons and other closely-related entities on the web.

Vocabulary of Interlinked Datasets (VoID)

VoID (from "Vocabulary of Interlinked Datasets") is an RDF based schema to describe linked datasets. With VoID the discovery and usage of linked datasets can be performed both effectively and efficiently. A VoID dataset is a collection of data, published and maintained by a single provider, available as RDF, and accessible, for example, through dereferenceable HTTP URIs or a SPARQL endpoint.
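As an illustration, the following Python sketch writes a small VoID description in Turtle. It uses the void:Dataset class and the void:sparqlEndpoint and void:triples properties from the VoID vocabulary, plus dcterms:title. The function name, dataset URI, endpoint URL and triple count are all hypothetical.

```python
def void_description(dataset_uri, title, sparql_endpoint, triple_count):
    """Return a minimal VoID description of one dataset as a Turtle string."""
    # Escape backslashes and double quotes so the title is a valid Turtle literal.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        "<%s> a void:Dataset ;" % dataset_uri,
        '    dcterms:title "%s" ;' % safe_title,
        "    void:sparqlEndpoint <%s> ;" % sparql_endpoint,
        "    void:triples %d ." % int(triple_count),
    ]
    return "\n".join(lines)


# Hypothetical dataset and endpoint, for illustration only.
turtle = void_description(
    "http://example.org/dataset",
    "Example library dataset",
    "http://example.org/sparql",
    1000,
)
```

A description like this is what lets a registry or crawler learn how to access a dataset without prior knowledge of it.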

OAI-ORE (Open Archives Initiative - Object Reuse and Exchange)

The Open Archives Initiative Object Reuse and Exchange model defines elements to describe aggregations of web resources, which together form complex digital objects, such as a journal article and its different digital variations and accompanying material. It also proposes a "resource map" mechanism to indicate and describe provenance of metadata on these aggregations, as well as "proxies" to describe any given resource from the perspective of a specific aggregation, when resources are included in different aggregations.
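The relation between aggregation, resource map and proxy can be sketched with a handful of triples in Python. The ORE terms used (ore:Aggregation, ore:aggregates, ore:ResourceMap, ore:describes, ore:Proxy, ore:proxyFor, ore:proxyIn) come from the ORE vocabulary. The article and its files are hypothetical example.org resources, and the helper function is ours.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace

triples = [
    # The resource map carries the metadata about the aggregation.
    (EX + "rem", RDF_TYPE, ORE + "ResourceMap"),
    (EX + "rem", ORE + "describes", EX + "article"),
    # The aggregation groups the digital variations of one article.
    (EX + "article", RDF_TYPE, ORE + "Aggregation"),
    (EX + "article", ORE + "aggregates", EX + "article.pdf"),
    (EX + "article", ORE + "aggregates", EX + "article.html"),
    # A proxy stands for the PDF as seen from within this aggregation.
    (EX + "proxy/pdf", RDF_TYPE, ORE + "Proxy"),
    (EX + "proxy/pdf", ORE + "proxyFor", EX + "article.pdf"),
    (EX + "proxy/pdf", ORE + "proxyIn", EX + "article"),
]


def aggregated_resources(aggregation, data):
    """List the resources that an aggregation aggregates, sorted by URI."""
    return sorted(o for s, p, o in data if s == aggregation and p == ORE + "aggregates")


parts = aggregated_resources(EX + "article", triples)
```

Statements attached to the proxy rather than to the PDF itself stay local to this aggregation, which matters when the same file is included in several aggregations.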


SKOS

"SKOS provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other similar types of controlled vocabulary."[17] SKOS deliberately avoids declaring an rdfs:domain for some of its properties (especially labelling and note properties), enabling one to re-use them for any kind of resource.
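A minimal Python sketch of both points follows: a two-concept hierarchy linked with skos:broader, and a skos:prefLabel attached to a resource that is not a concept, which the absence of an rdfs:domain permits. The example.org URIs, the labels and the helper function are hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/"  # hypothetical namespace

triples = [
    (EX + "concept/travel", SKOS + "prefLabel", "Travel"),
    (EX + "concept/voyages", SKOS + "prefLabel", "Voyages and travels"),
    (EX + "concept/voyages", SKOS + "broader", EX + "concept/travel"),
    # A book is not a skos:Concept, yet it can carry a SKOS label,
    # because skos:prefLabel declares no rdfs:domain.
    (EX + "book/1", SKOS + "prefLabel", "A Tramp Abroad"),
]


def broader_chain(concept, data):
    """Follow skos:broader upwards and return the ancestors, nearest first.

    Simplification: only the first broader concept is followed at each step.
    """
    chain = []
    seen = {concept}
    current = concept
    while True:
        parents = [o for s, p, o in data if s == current and p == SKOS + "broader"]
        if not parents or parents[0] in seen:
            return chain  # top reached, or a cycle detected
        current = parents[0]
        seen.add(current)
        chain.append(current)


ancestors = broader_chain(EX + "concept/voyages", triples)
```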


SKOS-XL

SKOS-XL is a SKOS extension that provides support for describing lexical entities attached to concepts. It "reifies" the labels of skos:Concepts, treating them as fully-fledged RDF resources. This allows them to be annotated further, or to be linked to each other using, say, an "isTranslationOf" property.
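The reification can be sketched in Python as follows. The skosxl:prefLabel and skosxl:literalForm properties come from the SKOS-XL vocabulary. The isTranslationOf property is a hypothetical example.org property, as are the concept and label URIs. Literals are modelled as (text, language) pairs.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
EX = "http://example.org/"  # hypothetical namespace

triples = [
    # The concept points to label resources instead of plain literals.
    (EX + "concept/travel", SKOSXL + "prefLabel", EX + "label/travel-en"),
    (EX + "concept/travel", SKOSXL + "prefLabel", EX + "label/travel-fr"),
    # Each label resource carries its literal form.
    (EX + "label/travel-en", SKOSXL + "literalForm", ("Travel", "en")),
    (EX + "label/travel-fr", SKOSXL + "literalForm", ("Voyages", "fr")),
    # Labels are resources, so they can be linked (hypothetical property).
    (EX + "label/travel-fr", EX + "isTranslationOf", EX + "label/travel-en"),
]


def plain_labels(data):
    """Derive plain skos:prefLabel statements from the reified labels."""
    literal_of = {s: o for s, p, o in data if p == SKOSXL + "literalForm"}
    return [
        (s, SKOS + "prefLabel", literal_of[o])
        for s, p, o in data
        if p == SKOSXL + "prefLabel" and o in literal_of
    ]


labels = plain_labels(triples)
```

The derivation in plain_labels goes from the richer SKOS-XL form back to ordinary SKOS labels, so applications that only understand plain SKOS can still use the data.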

BIBO (Bibliographic Ontology)

BIBO (Bibliographic Ontology) can be used as a citation ontology or document classification ontology, or a way to describe any kind of bibliographic things in RDF.


This is an RDF Schema for EXIF, a standard for mainly technical image metadata that is usually embedded in the image file (e.g., a JPEG file). Each key of the EXIF specification has been directly mapped to a corresponding property. Other efforts, such as an EXIF OWL ontology [18], have been reported that preserve the groupings of metadata keys provided in the original EXIF specification (e.g., pixel composition and geo location).

UMBEL (Upper Mapping and Binding Exchange Layer) Vocabulary

The UMBEL (Upper Mapping and Binding Exchange Layer) Reference Concepts dataset is derived from the OpenCyc ontology. It includes thousands of coherently structured and linked concepts, and is broadly applicable as orienting nodes to any knowledge domain. The UMBEL vocabulary provides classes and properties to describe this conceptual knowledge. It also intends to function as the basis for constructing domain ontologies. [19] It re-uses external vocabularies whenever possible.


vCard

The vCard ontology enables representing business card profiles defined by vCard (RFC2426).


OWL ontology: http://lexvo.org/ontology

The name Lexvo is derived from the Ancient Greek λεξικόν (lexicon) and the Latin vocabularium (vocabulary).[20] The ontology provides a vocabulary for defining global URIs for languages, words, characters, and other human language-related objects.

MARC Code List of Relators

The MARC relators vocabulary provides a list of properties for describing the relationship between a name and a bibliographic resource.

OPM (Open Provenance Model)

The Open Provenance Model is a generic model to express and share provenance information. It consists of a lightweight Open Provenance Model Vocabulary which enables basic representation of provenance data, and a more expressive Open Provenance Model OWL Specification geared towards inference.


CIDOC CRM

The CIDOC object-oriented Conceptual Reference Model (CRM) is developed by the International Council of Museums (ICOM) to represent descriptions of objects from the cultural sector and make them interoperable. It makes intensive use of events to link objects, persons, places and more conceptual notions together.

xmlns:crm="http://purl.org/NET/cidoc-crm/core#" @@TODO: we need to see who is behind this one.

CIDOC CRM generic (version-independent) namespace for the interpretation of the CIDOC CRM as RDF schema: http://www.cidoc-crm.org/rdfs/cidoc-crm, and generic namespace used for CIDOC CRM English labels (initial codes followed by their English names): http://www.cidoc-crm.org/rdfs/cidoc-crm-english-label [21]

Music Ontology

"The Music Ontology Specification provides main concepts and properties for describing music (i.e. artists, albums and tracks) on the Semantic Web". It applies the FRBR distinctions to the music domain.

Creative Commons Rights Expression Language (CC REL)

CC REL enables describing copyright licenses in RDF.

CiTO: A Citation Type Ontology

CiTO, one of the SPAR ontologies, is a minimal ontology for describing reference citations in research articles.

DOAP (Description of a Project)

Description of a Project (DOAP) is a vocabulary for describing software projects, especially open-source projects.

W3C Basic Geo vocabulary

This small ontology is aimed at representing Geo Positioning (latitude, longitude and altitude) for spatial objects, according to the WGS84 standard.
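A minimal sketch of its use in Python: the function below produces geo:lat and geo:long statements (and geo:alt when an altitude is given) for a resource, after checking that the coordinates are in range. The function name and the example.org URI are hypothetical, and the coordinates for Paris are approximate.

```python
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"


def geo_statements(resource_uri, lat, long_, alt=None):
    """Return Basic Geo triples for one resource, with values as strings."""
    if not -90 <= lat <= 90:
        raise ValueError("latitude out of range: %r" % lat)
    if not -180 <= long_ <= 180:
        raise ValueError("longitude out of range: %r" % long_)
    triples = [
        (resource_uri, GEO + "lat", str(lat)),
        (resource_uri, GEO + "long", str(long_)),
    ]
    if alt is not None:
        triples.append((resource_uri, GEO + "alt", str(alt)))
    return triples


# Hypothetical place URI; approximate WGS84 coordinates of Paris.
paris = geo_statements("http://example.org/place/paris", 48.8566, 2.3522)
```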

DCMI Type Vocabulary

A general, cross-domain list of Dublin Core Metadata Initiative (DCMI) approved terms that may be used as values for the resource type element to identify the genre of a resource.

Dublin Core Collection Description vocabularies

The DCMI Collection Description Application Profile Task Group developed a Dublin Core collections application profile and several vocabularies. Its work was based on the RSLP Collection description schema.

Functional Requirements for Bibliographic Records (FRBR) and related ontologies

FRBR (Functional Requirements for Bibliographic Records) is a conceptual reference model developed by the International Federation of Library Associations and Institutions (IFLA) "to provide a (...) framework for relating the data that are recorded in bibliographic records to the needs of users of those records" (FRBR Final Report, sec. 2.1) and for assessing their actual relevance.

The IFLA "FRBR family" consists of three conceptual models each covering an aspect of the data recorded in bibliographic and authority records. The entities, attributes, and relationships defined by each of the models are included in the Metadata Registry:

The FRBR Final Report describes an entity-relationship model that has been the source of a number of other ontology implementations:

[AI] William, I can't find FRBR mentioned in your case. Can we remove it?

Work in progress to make RDF vocabularies available

@@TODO: Some of these should be moved to the previous section at the time of final publishing.

MADS/RDF (Metadata Authority Description Schema in RDF)

MADS/RDF is designed for use with controlled values for names (personal, corporate, geographic, etc.), thesauri, taxonomies, subject heading systems, and other controlled value lists. The MADS/RDF ontology is mapped to SKOS.

ISAD(G) (General International Standard Archival Description)

ISAD(G) (General International Standard Archival Description) defines the elements that should be included in an archival finding aid.

W3C Ontology for Media Resources

Defines a core set of metadata properties for media resources, along with their mappings to elements from a set of existing metadata formats. It mainly targets media resources available on the Web, as opposed to media resources that are only accessible in local archives or museums.

ISBD (International Standard Bibliographic Description)

This is a preliminary registration of classes and properties from International Standard Bibliographic Description (ISBD) consolidated edition. The ISBD is useful and applicable for descriptions of bibliographic resources in any type of catalogue.

EAC-CPF (Encoded Archival Context – Corporate bodies, Persons, and Families)

EAC-CPF (Encoded Archival Context – Corporate bodies, Persons, and Families) is aimed at representing authoritative information about the context of archival materials, including "the identification and characteristics of the persons, organizations, and families (agents) who have been the creators, users, or subjects of records, as well as the relationships amongst them" [22]. It is a parallel effort to the Encoded Archival Description (EAD) standard for representation of archival finding aids.

A core concept in EAC-CPF is the distinction between agents and identities: the same agent can have different identities, and one identity can correspond to several agents.
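This many-to-many relation can be sketched in Python as a set of (agent, identity) pairs. The sketch only illustrates the distinction itself and is not EAC-CPF markup. The function names are hypothetical, and the names are chosen as familiar examples of a pseudonym and of a shared pen name.

```python
# Each pair links an agent (a real person) to an identity (a name used publicly).
links = {
    ("Samuel Clemens", "Mark Twain"),          # one agent, two identities
    ("Samuel Clemens", "Samuel L. Clemens"),
    ("Frederic Dannay", "Ellery Queen"),       # one identity, two agents
    ("Manfred B. Lee", "Ellery Queen"),
}


def identities_of(agent):
    """All identities under which an agent is known, sorted."""
    return sorted(identity for a, identity in links if a == agent)


def agents_behind(identity):
    """All agents who share an identity, sorted."""
    return sorted(agent for agent, i in links if i == identity)
```

Keeping the pairs in one flat set, rather than nesting identities under agents, avoids privileging either direction of the relation.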

Documentation, RDF/XML file


MARC has played a crucial role in the creation and exchange of library metadata. The MarcOnt initiative has created an OWL ontology that includes a small sub-set of MARC elements, connected to other ontologies.

PREMIS (Preservation Metadata: Implementation Strategies)

Preservation Metadata: Implementation Strategies (PREMIS) defines a core set of preservation metadata elements, with a supporting data dictionary, applicable to a broad range of digital preservation activities.

EAD and other archive-oriented element sets

EAD (Encoded Archival Description) is a standard for encoding archival finding aids using Extensible Markup Language (XML).

  • Usage examples: Cluster Archives
  • Work relevant for EAD in RDF has been done in the LOCAH project (linked data and documentation available) and in EuropeanaConnect (schema available).

Note that the LOCAH element set only handles a part of EAD, and introduces other elements that the LOCAH participants found useful to publish archival collection data as linked data. Readers may also be interested in the lightweight Archival vocabulary maintained by Aaron Rubinstein for describing archives and the named entities associated with them.

Metadata element sets from cases for which no RDF vocabulary is available

Categories for the Description of Works of Art (CDWA)

Categories for the Description of Works of Art (CDWA) includes 532 categories and subcategories for describing and accessing information about works of art, architecture, other material culture, groups and collections of works, and related images.


A subset of elements based on the Categories for the Description of Works of Art (CDWA) and Cataloging Cultural Objects: A Guide to Describing Cultural Works and Their Images (CCO). It is an XML schema to describe core records for works of art and material culture.

EBU P/Meta Semantic Metadata Schema (P/META)

A standard vocabulary for information relating to programme information in the professional broadcasting industry.


SPECTRUM

SPECTRUM is a UK-originated standard for managing museum collections, from descriptive metadata for objects to loan information. [23]

MODS (Metadata Object Description Schema)

Metadata Object Description Schema (MODS) includes a subset of MARC fields and uses language-based tags rather than numeric ones, in some cases regrouping elements from the MARC 21 bibliographic format. MODS is expressed using XML.

Other metadata element sets (no RDF vocabulary) relevant to the LLD field, not mentioned in the use cases

Visual Resources Association (VRA) Core Categories (VRA Core)

Visual Resources Association (VRA) Core Categories (VRA Core) specifies a set of core categories for creating records to describe works of visual culture as well as the images that document them.

  • An OWL ontology for VRA core 3.0 has been created by Mark van Assem for the W3C Semantic Web Best Practices and Deployment working group.

Text Encoding Initiative (TEI) Guidelines

The "Guidelines for Electronic Text Encoding and Interchange" is a standard for representing all kinds of literary and linguistic texts for online research and teaching.

PBCore (Public Broadcasting Metadata Dictionary)

PBCore is a metadata standard designed to describe media, both digital and analog. The PBCore XML Schema Definition (XSD) defines the structure and content of PBCore. The element set and related value vocabularies are available at the Metadata Registry.