Main Page

From Linked Data for Language Technology Community Group


We also invite you to participate in any of the upcoming group calls and roadmapping workshops being organised by the group.

Calls and workshops are announced over the mailing list. We pursue three major activities:

  • Roadmapping activities (primarily 2013-2015)
    • Roadmapping activities have not stalled since 2015; they are now conducted in more focused sub-discussions, namely on web standards for language resource metadata and linguistic annotation
  • Metadata vocabularies for language resources (since 2015)
  • Web standards for linguistic annotation (since 2019)

Web standards for linguistic annotations

Since 2013, LD4LT has embraced the NLP Interchange Format (NIF) as a community standard for representing linguistic annotations on the web. However, NIF competes and partially overlaps in this regard with other web standards such as Web Annotation, with ISO standards that primarily build on XML technologies, and with community- or platform-specific solutions such as TEI-XML (for the Digital Humanities) and the LAPPS Interchange Format (for Galaxy). We identified the need to harmonize these efforts and are currently working towards a synthesis.

  • GitHub: repository
  • We hold (more or less) regular teleconferences in collaboration with the COST Action Nexus Linguarum, announced via the LD4LT mailing list

OWL Metamodel for Language Resources

With the aim of converting data/metadata of Language Resources into the cloud of Linguistic Linked Data, an OWL metamodel is being developed based on the input and previous experience of well-established LR communities such as META-SHARE. In this section we are collecting pointers and materials related to that topic.

Roadmapping activities

During an initial phase, LD4LT focused on roadmapping activities and surveys regarding the needs and potential of linked data in language technology. This was conducted in conjunction with the FP7 project LIDER: Linked Data as an enabler of cross-media and multilingual content analytics for enterprises across Europe, with a peak in activity in 2015. On this basis, we then focused on more specialized aspects such as language resource metadata and exchange formats for linguistic annotations.

  • Summary of LD4LT Roadmapping workshops
    • 05/03/2015: 1st session on LIDER reference architecture NOTE: this had been announced to be 3 March, but the call will be 5 March.
    • 19/03/2015: Presentation of linghub to the LD4LT community
    • 02/04/2015: Presentation on the Digital Single Market and linguistic linked data enhanced content analytics


Past Events

Use Cases

LD4LT is developing use cases to explore the relevance of linguistic linked data in different industrial and governmental applications. Initial input on such use cases is being gathered through an online questionnaire currently offered by the group, and these will be elaborated on in face-to-face sessions at the above workshops. To seed this process, however, a number of suggested use cases are being summarised on this wiki based on previous or ongoing applications.

The results of an initial survey conducted by the LIDER project into requirements and use cases related to linguistic linked data are now available. It is interesting to read this in tandem with results of a survey just released by LT-Innovate on interest in a European Language Cloud.

Collaborative Landscape

The LD4LT Community Group aims to work closely with other active communities with an interest in building consensus on interoperability of linguistic data using linked data.

W3C Best Practice for Multilingual Linked Open Data Community Group

This group focusses on capturing best practice in publishing and using linguistic linked data.

LD4LT collaborates on improving the understanding of the use cases to which these best practices may apply.

W3C OntoLex Community Group

Developing a model as well as demonstrations and best practice for the representation of lexica (and machine readable dictionaries) relative to ontologies.

LD4LT aims to collaborate on improving the understanding of industrial use cases that may make use of the OntoLex model.

W3C Linked Data Models for Emotion and Sentiment Analysis Community Group

Open Knowledge Foundation working group on Open Data in Linguistics

W3C Data Activity

Works to facilitate potentially Web-scale data integration and processing. It does this by providing standard data exchange formats, models, tools, and guidance.

LD4LT will promote awareness of best practice from the Data Activity and standardised vocabularies (e.g. DCAT, ORG, PROV) from the W3C to identify further language-specific use cases and vocabularies that may be advanced in collaboration with other groups, e.g. BP-MLOD.

W3C Internationalization Activity

Works with W3C working groups and liaises with other organizations to make it possible to use Web technologies with different languages, scripts, and cultures. It includes the ITS Interest Group, which is maintaining an RDF vocabulary for text annotation based on the Internationalization Tag Set 2.0.

LD4LT will promote awareness of existing guidelines and best practice promoted in relation to web content, and also assist in identifying use cases needing best practice in interlinking multilingual web and linguistic linked data, and promote these via the BP-MLOD Community Group.



Teleconference logistics

Tracker, mailing list etc.

  • ISSUE and ACTION tracker
  • Mailing list archive

LD4LT Community Group Public Landing page