

Entity Reconciliation Community Group

Matching entities across data sources using different identifiers and formats is a pervasive issue on the web. This group revolves around developing a web API that data providers can expose, which eases the reconciliation of third-party data to their own identifiers. OpenRefine's reconciliation API is used as a starting point. Our goals are to document this existing API, share our experiences and lessons learnt from it, propose an improved protocol with a view to promoting it as a standard, and build tooling around it. A description of the existing protocol can be found here: https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation-Service-API

reconciliation-api


Note: Community Groups are proposed and run by the community. Although W3C hosts these conversations, the groups do not necessarily represent the views of the W3C Membership or staff.

drafts / licensing info

name
Reconciliation Service API v0.1


Supporting reconciliation from a library perspective

I’ve recently had the opportunity to briefly present our Community Group and what we do in a lightning talk at SWIB20, this year’s iteration of the annual (and this year digital) Semantic Web in Libraries conference (slides, video):

OpenRefine, and in particular its reconciliation feature, is widely used in the library world, where authority files are an established part of traditional cataloging workflows. Early reconciliation data sources for library use cases include FAST, VIAF, and VIVO.

Our Open Infrastructure team at hbz is offering a reconciliation service for the Integrated Authority File (GND). The GND is the main authority file in the German-speaking library field. It contains persons and corporations, subject headings, geographical entities, events, and works. With our reconciliation service, we’re building a bridge from a traditional library dataset to new applications within and outside the library domain, e.g. in the (German-speaking) digital humanities. This complements the general development of the GND in recent years, especially within the GND4C project, of opening up organizational structures, processes, data models, and tooling of the GND to other cultural heritage institutions like archives and museums.

Besides services, the library world is also the source of new clients that interact with services using the reconciliation API. Two of the known clients are from the library domain: AlmaRefine and Cocoda. Managing, identifying, and connecting entities is at the very core of librarianship, making it an ideal field for the goals of our Community Group.

Therefore, I’m very happy to join Antonin as co-chair of our group. I’m looking forward to helping advance and promote our goal of a common protocol for data matching on the Web, both in the library field and beyond.

Reconciliation test bench helps services improve

The reconciliation test bench developed by our Community Group gives an overview of the API features supported by reconciliation endpoints available online. It also lets developers try out their service interactively, helping them improve reconciliation quality and user experience.
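For readers who have not seen what such an endpoint exchanges, here is a minimal sketch of a batch of reconciliation queries and of reading the candidates back, following the general shape of the existing OpenRefine protocol (a `queries` form parameter holding a JSON object keyed by `q0`, `q1`, ..., and a response with a `result` list per key). The helper names `build_query_batch` and `best_candidates`, the type identifier `Person`, and the sample response with its `example:` identifiers are all made up for illustration. No real service is contacted, and an actual endpoint may support more or fewer fields.

```python
import json
from urllib.parse import parse_qs, urlencode


def build_query_batch(names, type_id=None, limit=3):
    """Build the form-encoded request body for a batch of reconciliation queries.

    Hypothetical helper: each name becomes one query under the key q0, q1, ...
    """
    queries = {}
    for index, name in enumerate(names):
        query = {"query": name, "limit": limit}
        if type_id is not None:
            query["type"] = type_id  # restrict candidates to one type
        queries[f"q{index}"] = query
    return urlencode({"queries": json.dumps(queries)})


def best_candidates(response, min_score=0.0):
    """Return the highest-scoring candidate per query key, or None if there is none."""
    best = {}
    for key, payload in response.items():
        candidates = [
            c for c in payload.get("result", []) if c.get("score", 0) >= min_score
        ]
        best[key] = (
            max(candidates, key=lambda c: c.get("score", 0)) if candidates else None
        )
    return best


# Build a request body for two names ("Person" is an assumed type identifier).
body = build_query_batch(["Hannah Arendt", "Unknown Name"], type_id="Person")

# Decode the body again to show what a service would receive.
decoded = json.loads(parse_qs(body)["queries"][0])

# Hand-written response in the protocol's general shape (not real service data).
sample_response = {
    "q0": {
        "result": [
            {"id": "example:1", "name": "Arendt, Hannah", "score": 92.5, "match": True},
            {"id": "example:2", "name": "Arendt, Hans", "score": 40.0, "match": False},
        ]
    },
    "q1": {"result": []},
}

best = best_candidates(sample_response)
```

In practice the body would be sent as a POST request to the service URL, and a client such as OpenRefine would show the scored candidates to the user instead of picking the top one automatically.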

Today, lobid announced that their GND reconciliation endpoint now implements the Suggest API, which helps users select entities, properties and types from OpenRefine’s user interface. They report that the test bench was used to plan and test this improvement. We hope this will encourage other services to implement more aspects of the API.

If you want to get involved with improving the test bench, head over to its GitHub repository.

Mapping the reconciliation ecosystem

We have started to map the existing environment around entity reconciliation on the Web. Our goal is to get a complete picture of all the data providers, clients, protocols, tools and other resources which are relevant to our community group.

This effort is happening on GitHub: the reconciliation-api/census repository hosts it as a collection of markdown files, which are exposed as a website at https://reconciliation-api.github.io/census/. If you are aware of anything even remotely related to entity matching on the Web, please add it there.

Our charter is still not final – feel free to tweak it. And if you want to get involved in running the group, it would be great to have more chairs.

Call for Participation in Entity Reconciliation Community Group

The Entity Reconciliation Community Group has been launched:


Matching entities across data sources using different identifiers and formats is a pervasive issue on the web.

This group revolves around developing a web API that data providers can expose, which eases the reconciliation of third-party data to their own identifiers. OpenRefine’s reconciliation API is used as a starting point. Our goals are to document this existing API, share our experiences and lessons learnt from it, propose an improved protocol with a view to promoting it as a standard, and build tooling around it.

A description of the existing protocol can be found here:
https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation-Service-API


In order to join the group, you will need a W3C account. Please note, however, that W3C Membership is not required to join a Community Group.

This is a community initiative. This group was originally proposed on 2019-06-08 by Antonin Delpeuch. The following people supported its creation: Antonin Delpeuch, Ettore Rizza, Owen Stephens, Juliane Schneider, Ethan Gruber, Thad Guidry, Christina Harlow, Markus Mandalka. W3C’s hosting of this group does not imply endorsement of the activities.

The group must now choose a chair. Read more about how to get started in a new group and good practice for running a group.

We invite you to share news of this new group in social media and other channels.

If you believe that there is an issue with this group that requires the attention of the W3C staff, please email us at site-comments@w3.org.

Thank you,
W3C Community Development Team