RDF and XML Interoperability Community Group

The goal of this group is to 1) identify application areas in which the combined processing of XML and RDF data and tooling is beneficial; 2) identify issues that hinder the joint usage of the two technology stacks; and 3) formulate best practices to resolve the issues or propose standardization topics. The goal takes into account not only the data representation formats XML and RDF, but also all related technologies (e.g. for XML: XSLT, XQuery; for RDF: RDF Schema, SPARQL) and selected XML (e.g. OData) or RDF vocabularies. The group should be driven by the needs of industries that already deploy one or both technology stacks. It will also cover adjacent technologies like JSON where they relate to the topics covered in this group. The outcome should focus not on a big architecture of how to work with XML and RDF, but on small building blocks (best practices or standardization topics) that can be reused across industries and application scenarios.

Note: Community Groups are proposed and run by the community. Although W3C hosts these conversations, the groups do not necessarily represent the views of the W3C Membership or staff.

No Reports Yet Published

This group does not have a Chair and thus cannot publish new reports. Learn how to choose a Chair.

Converting XML-encoded texts to RDF and back

We created a generic research platform called Knora, intended primarily for use in the humanities. Knora internally uses an RDF triplestore and offers a RESTful API through which users perform all necessary operations (reading, creating, updating, and deleting data). The Knora base ontology provides basic value types designed for the representation of qualitative data, including versioning and permissions.

Marked-up texts (e.g., for digital editions) are an important part of data in the humanities. Our goal is to represent these texts adequately in RDF once they are imported into Knora, and to export them again if the user wishes. At the moment, we support the import of XML-encoded texts into Knora and their export as XML. The export delivers an XML document that is equivalent to the imported one, though not necessarily identical at the character-stream level.
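The post does not define "equivalent" precisely. As a rough illustration only, one plausible reading is that two documents are equivalent when their canonical forms match. The sketch below assumes that reading and uses the C14N 2.0 support in Python's standard library (Python 3.8 or later); the helper name `equivalent` and the sample documents are hypothetical and are not taken from Knora.

```python
import xml.etree.ElementTree as ET


def equivalent(xml_a: str, xml_b: str) -> bool:
    """Compare two XML documents by their canonical (C14N 2.0) form.

    Assumption: "equivalent" is read here as canonical equality, which
    ignores attribute order, quote style and empty-element syntax.
    """
    return ET.canonicalize(xml_a) == ET.canonicalize(xml_b)


# Hypothetical import and export: same content, different character streams.
imported = '<p class="x" id="1">Hello <em>world</em><br/></p>'
exported = "<p id='1' class='x'>Hello <em>world</em><br></br></p>"

print(imported == exported)            # plain string comparison
print(equivalent(imported, exported))  # comparison of canonical forms
```

The two strings differ character by character, yet they describe the same elements, attributes and text, which is the kind of difference a round trip through RDF could plausibly introduce.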

Before an XML document representing a text can be imported, a mapping has to be provided. A mapping expresses the relations between XML elements and attributes and their corresponding entities defined in ontologies (classes and properties). With a mapping provided, XML documents can be converted to RDF and stored in Knora’s triplestore. During the conversion, markup and content are separated, since we use a so-called standoff-based approach (markup refers to positions or ranges of the text via the index positions of single characters). The text is stored as a string and the markup is represented as RDF triples, which allows for SPARQL queries.
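The separation of markup and content described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Knora's actual implementation: the `ex:` class and property names, the `MAPPING` dictionary and the helper functions are all hypothetical, element attributes are ignored, and triples are shown as plain tuples rather than being written to a triplestore.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class StandoffTag:
    name: str   # XML element name
    start: int  # index of the first character covered
    end: int    # index one past the last character covered


def xml_to_standoff(xml_string):
    """Split an XML document into a plain string and a list of standoff tags."""
    chunks, tags = [], []
    pos = 0

    def walk(elem):
        nonlocal pos
        tag = StandoffTag(elem.tag, pos, pos)
        tags.append(tag)  # appended on entry, so tags stay in document order
        if elem.text:
            chunks.append(elem.text)
            pos += len(elem.text)
        for child in elem:
            walk(child)
            if child.tail:  # text that follows the child's closing tag
                chunks.append(child.tail)
                pos += len(child.tail)
        tag.end = pos

    walk(ET.fromstring(xml_string))
    return "".join(chunks), tags


# Hypothetical mapping from XML element names to ontology classes.
MAPPING = {"text": "ex:Root", "p": "ex:Paragraph", "em": "ex:Emphasis"}


def to_triples(tags, mapping):
    """Represent each standoff tag as (subject, predicate, object) tuples."""
    triples = []
    for i, tag in enumerate(tags):
        subject = f"ex:tag{i}"
        triples.append((subject, "rdf:type", mapping[tag.name]))
        triples.append((subject, "ex:start", tag.start))
        triples.append((subject, "ex:end", tag.end))
    return triples


text, tags = xml_to_standoff("<text><p>Hello <em>world</em>!</p></text>")
triples = to_triples(tags, MAPPING)
```

Here `text` holds the bare string `Hello world!`, and the `em` element becomes a tag covering the range 6 to 11, so `text[6:11]` recovers the emphasized word. Once the ranges are stored as triples, a query can select tags by class or by position without parsing any XML.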

Our goal is to develop an editor that allows for creating and editing texts directly in a native standoff format. For now, we are still using embedded markup (e.g., HTML in a browser-based GUI) that is converted to RDF and back, which limits the advantages of the standoff approach. One of the main advantages of standoff is the ability to add layers of annotations to a text without interfering with the existing markup, unlike embedded markup such as XML, where overlapping annotations cannot be expressed as properly nested elements. Our approach is inspired by Desmond Schmidt’s work: http://multiversiondocs.blogspot.com
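To make the overlap problem concrete, the following sketch shows two annotation layers over the same string whose ranges intersect without one containing the other. The `Annotation` type, the layer names and the sample text are hypothetical and serve only as an illustration; they are not part of Knora.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    layer: str  # name of the annotation layer
    label: str  # what the annotation says about the range
    start: int  # index of the first character covered
    end: int    # index one past the last character covered


def properly_overlap(a: Annotation, b: Annotation) -> bool:
    """True if the two ranges intersect but neither contains the other."""
    intersect = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return intersect and not (a_in_b or b_in_a)


text = "The quick brown fox"

# Two independent layers annotating the same string.
emphasis = Annotation("editorial", "emphasis", 4, 15)        # "quick brown"
noun_phrase = Annotation("linguistic", "noun_phrase", 10, 19)  # "brown fox"

print(text[emphasis.start:emphasis.end])
print(text[noun_phrase.start:noun_phrase.end])
print(properly_overlap(emphasis, noun_phrase))
```

In standoff form both annotations simply coexist as ranges over the string. As embedded XML, the two ranges could not both be written as single well-formed elements, because one element would have to close inside the other.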

You will find more information about the creation and handling of standoff markup in Knora here:

Call for Participation in RDF and XML Interoperability Community Group

The RDF and XML Interoperability Community Group has been launched:


The goal of this group is to 1) identify application areas in which the combined processing of XML and RDF data and tooling is beneficial; 2) identify issues that hinder the joint usage of the two technology stacks; and 3) formulate best practices to resolve the issues or propose standardization topics. The goal takes into account not only the data representation formats XML and RDF, but also all related technologies (e.g. for XML: XSLT, XQuery; for RDF: RDF Schema, SPARQL) and selected XML (e.g. OData) or RDF vocabularies. The group should be driven by the needs of industries that already deploy one or both technology stacks. It will also cover adjacent technologies like JSON where they relate to the topics covered in this group. The outcome should focus not on a big architecture of how to work with XML and RDF, but on small building blocks (best practices or standardization topics) that can be reused across industries and application scenarios.


In order to join the group, you will need a W3C account. Please note, however, that W3C Membership is not required to join a Community Group.

This is a community initiative. This group was originally proposed on 2016-05-30 by Christian Dirschl. The following people supported its creation: Christian Dirschl, Felix Sasaki, Mohamed ZERGAOUI, Charles Foster, Jose Emilio Labra Gayo, Andreas Blumauer, Yuanzhe Yang, Thomas Thurner, Timea Turdean, Reul Quentin, Florent Georges, Erika Pauwels, Rob Walpole. W3C’s hosting of this group does not imply endorsement of the activities.

The group must now choose a chair. Read more about how to get started in a new group and good practice for running a group.

We invite you to share news of this new group in social media and other channels.

If you believe that there is an issue with this group that requires the attention of the W3C staff, please email us at site-comments@w3.org

Thank you,
W3C Community Development Team