W3C

RDF Data Shapes
Working Group Charter

The mission of the RDF Data Shapes Working Group is to produce a language for defining structural constraints on RDF graphs. In the same way that SPARQL made it possible to query RDF data, the product of the RDF Data Shapes WG will enable the definition of graph topologies for interface specification, code development, and data verification.

End date 1 June 2017
Confidentiality Proceedings are public
Chair Arnaud Le Hors, IBM
Staff Contact Eric Prud'hommeaux 0.15 FTE
Teleconference Schedule One 60-90 minute call per week, plus task force calls as necessary
Face-to-Face Meetings 3-5 during the course of the group, although the chairs may schedule or cancel meetings as needed to help the group reach its goals. These meetings will use teleconferencing facilities, but effective participation generally requires attending in person, so participants should budget for travel. Remote participation does not necessarily qualify for good standing.

Introduction

While the Semantic Web has demonstrated considerable value for collaborative contributions to data, adoption in many mission-critical environments requires data to conform to specified patterns. This need for interface definitions spans domains. For instance, validation in a banking context shares many requirements with quality assurance of linked clinical data. Systems like the Linked Data Platform will be greatly enhanced by the ability to publish, consume and verify data against machine-readable interface specifications. Development of standards and tools to meet these requirements can greatly increase the utility and ubiquity of Semantic Web data.

Most data representation languages used in conventional settings offer some sort of input validation, ranging from parsing grammars for domain-specific languages to XML Schema, RelaxNG or Schematron for XML structures. While the distributed nature of RDF affects the notion of "validity", tool chains and consumers of data frequently need to know that the data conforms to some defined interface.

This working group will address the need for RDF data validation/interface definition on the Semantic Web. It will address issues like:

In addressing these issues, the WG may consider whether it is necessary, practical or desirable to normalize a graph prior to validation. That is, whether an algorithm can and should be defined that creates a canonical form of a given graph.
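As a rough illustration of what a canonical form could mean, the sketch below handles only the easy case of a ground graph (no blank nodes), where sorting the de-duplicated triples is enough. The function name and the string representation of terms are assumptions made for this example; graphs containing blank nodes would need a genuine canonicalization algorithm, which this sketch deliberately refuses to attempt.

```python
def canonical_lines(triples):
    """Return sorted, de-duplicated N-Triples-like lines for a ground graph.

    Each term is assumed to be already serialized, e.g. "<http://example.org/a>"
    or '"42"^^<http://www.w3.org/2001/XMLSchema#integer>'.
    """
    lines = set()
    for s, p, o in triples:
        # Blank node labels are not stable across documents, so sorting
        # alone cannot canonicalize them; reject them in this sketch.
        if any(term.startswith("_:") for term in (s, p, o)):
            raise ValueError("blank nodes need a real canonicalization algorithm")
        lines.add(f"{s} {p} {o} .")
    return sorted(lines)


# Two serializations of the same ground graph, in different order and with
# a duplicate triple, yield the same canonical form.
a = [("<ex:s>", "<ex:p>", "<ex:o2>"), ("<ex:s>", "<ex:p>", "<ex:o1>")]
b = [("<ex:s>", "<ex:p>", "<ex:o1>"), ("<ex:s>", "<ex:p>", "<ex:o2>"),
     ("<ex:s>", "<ex:p>", "<ex:o1>")]
same = canonical_lines(a) == canonical_lines(b)
```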

Motivation

One motivation for this work is Application Integration, where different software components, potentially maintained by different organizations, need to function together smoothly. As an everyday example, imagine an international company with a dozen divisions, each providing a feed of their Human Resources data to authorized users. Different divisions might use different software to produce their feeds, and there might be many distinct applications which consume the data, ranging from an employee phone book to a hiring-compliance monitoring system.

While systems like this are built and maintained around the world today, their complexity often becomes a problem. Not only are the systems expensive and sometimes unpleasant to maintain, but changing data fields and adding new applications can grow to be practically impossible. An "RDF Data Shapes" standard would help manage the complexity, greatly reducing the cost and hassle, by separating components while still allowing them to work together.

Specifically, in this example, an RDF Data Shapes Language would allow:

In all cases, the semantics of the data are determined by RDF and the vocabularies specified by the shape, so if the shapes match, the systems can reasonably be expected to interoperate correctly.

While RDF Data Shapes are expected to have immediate everyday utility, as illustrated above, they have even wider potential applicability, ranging in scale. At the large end, RDF Data Shapes might be used by loosely-knit communities, where data is provided by organizations which are not under any central authority, such as charities and researchers around the world concerned with quality-of-life measures. At the small end, RDF Data Shapes might be used within a mobile application environment to provide interoperability among independent sensor modules and tools for analyzing and acting on sensor results. The common thread is that RDF Data Shapes allow a loose coupling, where independently maintained elements of an overall system can reliably and comfortably interoperate.

Scope

This Working Group will address these design goals:

Association of properties with Primitive Data Types
Constrain the object of a property to have a particular datatype.
Association of properties with Structural Types
Assert that the object of a property must conform to a specific shape.
Hierarchies of shape definitions
Permit any of a set of shapes to stand for a specified shape, e.g. to say that either a User shape or an Employee shape can be used in place of a Commentor shape.
Data feed description
Description of data streams and query interfaces.
Conformance
The relation of schemata to RDF graph instances; obligations on schema-aware processors. The WG will define a process for checking that the constraints expressed in a schema are respected in an RDF graph (validation).
Extensibility
The language will enable implementations to take advantage of extensions in the language to perform verification or other actions not required for conformance.
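
To make the first three design goals concrete, here is a minimal sketch of a conformance check, assuming an in-memory graph of (subject, predicate, object) tuples. Everything named here (the `SHAPES` table, the `conforms` function, the `ex:` terms) is hypothetical and invented for illustration; it is not a proposal for the language the Working Group will define. A property constraint either lists acceptable datatypes or lists shapes, any one of which the object may conform to, mirroring the User-or-Employee example above.

```python
from typing import NamedTuple

XSD = "http://www.w3.org/2001/XMLSchema#"


class Literal(NamedTuple):
    """A typed literal: lexical form plus datatype IRI (IRIs are plain str)."""
    lexical: str
    datatype: str


# Hypothetical shape table: shape name -> {property: (kind, alternatives)}.
SHAPES = {
    "User": {"ex:nick": ("datatype", (XSD + "string",))},
    "Employee": {"ex:badge": ("datatype", (XSD + "integer",))},
    "Comment": {
        "ex:text": ("datatype", (XSD + "string",)),
        # Either a User or an Employee may stand in as the author.
        "ex:author": ("shape", ("User", "Employee")),
    },
}


def conforms(graph, node, shape, _seen=frozenset()):
    """True if each constrained property has a value and all values comply.

    Only the declared datatype IRI is compared; lexical forms are not parsed.
    """
    if (node, shape) in _seen:  # already being checked: break the cycle
        return True
    seen = _seen | {(node, shape)}
    for prop, (kind, alternatives) in SHAPES[shape].items():
        values = [o for s, p, o in graph if s == node and p == prop]
        if not values:
            return False
        for value in values:
            if kind == "datatype":
                ok = isinstance(value, Literal) and value.datatype in alternatives
            else:
                ok = isinstance(value, str) and any(
                    conforms(graph, value, alt, seen) for alt in alternatives)
            if not ok:
                return False
    return True


GRAPH = {
    ("ex:c1", "ex:text", Literal("Nice post", XSD + "string")),
    ("ex:c1", "ex:author", "ex:alice"),
    ("ex:alice", "ex:badge", Literal("42", XSD + "integer")),
    ("ex:c2", "ex:text", Literal("7", XSD + "integer")),
    ("ex:c2", "ex:author", "ex:alice"),
}

print(conforms(GRAPH, "ex:c1", "Comment"))  # True: author matches Employee
print(conforms(GRAPH, "ex:c2", "Comment"))  # False: ex:text is not a string
```

The sketch treats every listed property as required and ignores unlisted properties; cardinality, closed shapes, and extension hooks are among the choices a real language would have to settle.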

The first order of business of the Working Group will be to review existing work in this space, including the W3C Submissions Resource Shapes 2.0, Shape Expressions 1.0 Primer, Shape Expressions 1.0 Definition, and SPIN, along with discussion at the RDF Validation Workshop (e.g. the Stardog ICV presentation). From these inputs, and other input it deems useful, the Working Group shall select appropriate Use Cases and decide which Requirements are to be met by the group's eventual output. These Use Cases and Requirements are to be documented in a Working Group Note within 3-6 months after the start of the group.

Deliverables

Recommendation Track:

  1. An RDF vocabulary, such as Resource Shapes 2.0, for expressing these shapes in RDF triples, so they can be stored, queried, analyzed, and manipulated with normal RDF tools, with some extensibility mechanism for complex use cases.

  2. Semantics, possibly defined as SPARQL operations, specifying how shapes are evaluated against RDF graphs.

  3. OPTIONAL - Compact, human-readable, non-RDF syntax for expressing constraints on RDF graph patterns (aka shapes), suitable for the use cases determined by the group.
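
One way to picture how the first two Recommendation Track deliverables could fit together: a shape stored as ordinary triples, and a small translator that turns each datatype constraint into a SPARQL query returning the values that violate it. This is a sketch only. The `ex:property`, `ex:predicate` and `ex:datatype` terms and the `violation_queries` function are hypothetical names chosen for this example, not drawn from Resource Shapes 2.0 or any other input document; the generated query uses only the standard SPARQL 1.1 functions `isLiteral` and `datatype`.

```python
# A shape expressed as plain triples, using a hypothetical ex: vocabulary.
SHAPE_TRIPLES = [
    ("ex:EmployeeShape", "ex:property", "_:p1"),
    ("_:p1", "ex:predicate", "http://example.org/ns#name"),
    ("_:p1", "ex:datatype", "http://www.w3.org/2001/XMLSchema#string"),
]


def objects(triples, subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def violation_queries(triples, shape):
    """Build one SPARQL SELECT per datatype constraint of `shape`.

    Each query selects the (node, value) pairs that break the constraint,
    so an empty result means the data graph satisfies it.
    """
    queries = []
    for prop_node in objects(triples, shape, "ex:property"):
        for predicate in objects(triples, prop_node, "ex:predicate"):
            for datatype in objects(triples, prop_node, "ex:datatype"):
                queries.append(
                    "SELECT ?node ?value WHERE {\n"
                    f"  ?node <{predicate}> ?value .\n"
                    f"  FILTER (!isLiteral(?value) || datatype(?value) != <{datatype}>)\n"
                    "}"
                )
    return queries


for query in violation_queries(SHAPE_TRIPLES, "ex:EmployeeShape"):
    print(query)
```

Because the shape is itself RDF, the same triples could be queried or edited with ordinary RDF tools, and the generated query could be run on any SPARQL 1.1 endpoint holding the data to be checked.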

Not Recommendation Track:

  1. Use Case and Requirements: a collection of use cases and a derived list of requirements that gives a practical foundation with which to analyze proposed designs for elements of the platform.

  2. Relationship to SPARQL: The group will document how its output relates to SPARQL, including instructions for how to use SPARQL systems to perform validation.

  3. Relationship to OWL: The group will document how its output relates to OWL, including OWL using the Closed World Assumption (CWA) and Unique Names Assumption (UNA), as reported at the Workshop. This may include instructions for translating to/from OWL, as the Working Group deems useful. (This deliverable may not be needed if OWL is integral to the group's output.)

  4. Primer: W3C Note introducing users to the technical output of the group.

  5. Test Suite and/or Validator: to help ensure interoperability and correct implementation. The group will choose the form of this deliverable, such as a git repository.

Schedule

The group will document significant deviations from this schedule on its home page.

Date Event Description
2014-09 Start Group Launch, First Teleconferences
2014-11 F2F1 Face-to-face meeting (W3C TPAC)
2014-12 UCR Release Use Cases and Requirements and First Public Working Drafts
2015-03 WD2 Second public Working Drafts published
2015-05 F2F2 Face-to-face meeting
2015-06 LCWD Last Call Working Drafts published
2015-09 F2F3 Face-to-face meeting, if needed
2015-10 CR Candidate Recommendation published
2016-01 PR Proposed Recommendation published
2016-02 REC Recommendation published

Dependencies and Liaisons

W3C Groups

Where the groups listed below have already closed, the RDF Data Shapes Working Group will make a reasonable attempt to communicate with relevant individuals and the larger community.

Liaisons

Data Activity Coordination Group
ensure proper coordination between the work of this group and all the other Working Groups in the Data Activity.
RDF Working Group (closed)
address technical issues associated with the core semantics of RDF.
public-sparql-dev@w3.org
complement the SPARQL 1.1 Query Language.
Data on the Web Best Practices Data Working Group
solicit test cases from the DWBP Working Group's work on data publication and vocabulary creation.
CSV on the Web Working Group
solicit test cases from the CSVW's work on the conversion of CSV data to other formats according to defined criteria.
WAI Protocol and Formats Working Group
The Web Accessibility Initiative anticipates benefits from this work. The WAIPF WG coordinates among several WAI groups, including the Evaluation and Repair Tools Working Group where the EARL vocabulary is maintained.

Dependencies

none

Participation

In general, people participate in this group as representatives of W3C member organizations. At least one representative from each participating organization is expected to devote significant time to this effort (about one day per week, or more, depending on duties), to accept and complete appropriate action items on a timely basis, and to travel to face-to-face meetings, as scheduled by the chairs in consultation with the group.

On a case-by-case basis, using the invited expert process, people may be allowed to participate as individuals, not representing an organization.

To be successful, the Working Group is expected to have between ten and thirty active participants for its duration.

Participants are reminded of the Good Standing requirements of the W3C Process.

Communication

This group primarily conducts its work on the mailing list public-data-shapes-wg@w3.org (public archives). The mailing list member-data-shapes-wg@w3.org (W3C member-access-only archives) may be used for administrative purposes, such as travel planning.

Information about the group (deliverables, participants, face-to-face meetings, teleconferences, etc.) will be available from the group's home page.

Decision Policy

As explained in the Process Document (section 3.3, Consensus), this group will seek to make decisions when there is consensus. When the Chair puts a question and observes dissent, after due consideration of different opinions, the Chair should record a decision (possibly after a formal vote) and any objections, and proceed.

A formal vote should allow for remote asynchronous participation—using, for example, email and/or web-based survey techniques. Any resolution taken in a face-to-face meeting or teleconference is to be considered provisional until 5 working days after the publication of the resolution in draft minutes sent to the group's mailing list.

This charter is written in accordance with Section 3.4, Votes of the W3C Process Document and includes no voting procedures beyond what the Process Document requires.

Patent Policy

This Working Group operates under the W3C Patent Policy (5 February 2004 Version). To promote the widest adoption of Web standards, W3C seeks to issue Recommendations that can be implemented, according to this policy, on a Royalty-Free basis.

For more information about disclosure obligations for this group, please see the W3C Patent Policy Implementation.

About this Charter

This charter for the RDF Data Shapes Working Group has been created according to section 6.2 of the W3C Process Document. In the event of a conflict between this document or the provisions of any charter and the W3C Process, the W3C Process shall take precedence.

References

The following items should be included as background material by the Working Group, documenting some of the work in this field:

  1. Dublin Core Application Profiles WG database of validation requirements.
  2. Public ShEx Wiki, and discussion.
  3. Driessche's tests.
  4. Algebra coverage tests.
  5. Labra Gayo's implementation and tests.

Eric Prud'hommeaux (eric@w3.org), Phil Archer (phil@w3.org), and Sandro Hawke (sandro@w3.org), editors

$Id: charter.html,v 1.57 2014-09-26 20:05:09 eric Exp $