RDFCore notes: Container model problems

Author: danbri@w3.org $Id: Overview.html,v 1.4 2001/06/07 03:12:09 danbri Exp $

Nearby: proposed interpretation of RDF containers, 2000-12-13

Overview

This document is concerned with the RDF formal model treatment of containers. It is unconcerned with syntactic issues; those problems are covered by the 'container interpretation' doc above.

The issue here is raised in the container interpretation doc:

This document motivates and proposes a simplification to the specification of the RDF model consistent with the proposed container syntax simplification.

Problem statement

The RDF Model and Syntax recommendation includes two kinds of special-case support for three types of "container" or grouping construct: Bags, Seqs and Alts. The specification expresses support for these at both the syntactic and the formal model level. Regarding syntax, M+S defines some XML structures (based around the rdf:li construct) that make it easier to serialise representations of containers. There are known problems with the detail of this syntax; see the Beckett/McBride proposed simplification.

In addition to the problem with container syntax, RDF's specification of a formal model for containers is also problematic. This document proposes editorial changes to the Model and Syntax specification to simplify the spec's description of the model.

The "formal model" section of the spec tells us:

[[ As described in Section 3, it is frequently necessary to represent a collection of resources or literals; for example to state that a property has an ordered sequence of values. RDF defines three kinds of collections: ordered lists, called Sequences, unordered lists, called Bags, and lists that represent alternatives for the (single) value of a property, called Alternatives.
Formally, these three collection types are defined by:
10. There are three elements of Resources, not contained in Properties, known as RDF:Seq, RDF:Bag, and RDF:Alt.
11. There is a subset of Properties corresponding to the ordinals (1, 2, 3, ...) called Ord. We refer to elements of Ord as RDF:_1, RDF:_2, RDF:_3, ... ]]

The next paragraph in the specification is problematic. It tells us that collection elements are numbered in sequence using the ordinal properties rdf:_1, rdf:_2, etc. Note that this concerns the formal model for representing RDF containers, not any particular XML representation (complete or partial) of this information.

[[ To represent a collection c, create a triple {RDF:type, c, t} where t is one of the three collection types RDF:Seq, RDF:Bag, or RDF:Alt. The remaining triples {RDF:_1, c, r1}, ..., {RDF:_n, c, rn}, ... point to each of the members rn of the collection. For a single collection resource there may be at most one triple whose predicate is any given element of Ord and the elements of Ord must be used in sequence starting with RDF:_1. For resources that are instances of the RDF:Alt collection type, there must be exactly one triple whose predicate is RDF:_1 and that is the default value for the Alternatives resource (that is, there must always be at least one alternative). ]]

Two problems are posed by this formulation:

This situation is in tension with a broad design goal of RDF: to allow Web services to aggregate and process partial descriptions.
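The tension can be made concrete with a small sketch. It assumes triples are held as plain Python tuples in the {pred, sub, obj} order used by M+S; the function name strict_violations and the sample resources "c", "a", "b", "d" are illustrative only and not part of any RDF API. Read strictly, the M+S paragraph on sequential numbering rejects a description that simply happens to omit rdf:_2:

```python
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Matches the ordinal properties rdf:_1, rdf:_2, ... and captures the number.
ORD = re.compile(re.escape(RDF) + r"_([1-9][0-9]*)$")


def strict_violations(triples, c):
    """List the ways container c breaks a strict reading of M+S.

    triples is a set of (pred, subj, obj) tuples (hypothetical encoding).
    """
    types = {o for p, s, o in triples if s == c and p == RDF + "type"}
    ordinals = []
    for p, s, o in triples:
        m = ORD.match(p)
        if s == c and m:
            ordinals.append(int(m.group(1)))

    problems = []
    distinct = set(ordinals)
    if len(ordinals) != len(distinct):
        problems.append("an ordinal property is used more than once")
    if distinct != set(range(1, len(distinct) + 1)):
        problems.append("ordinals are not in sequence from rdf:_1")
    if RDF + "Alt" in types and 1 not in distinct:
        problems.append("rdf:Alt has no rdf:_1 default value")
    return problems


full = {
    (RDF + "type", "c", RDF + "Seq"),
    (RDF + "_1", "c", "a"),
    (RDF + "_2", "c", "b"),
    (RDF + "_3", "c", "d"),
}
# The same container as seen by a service that never learned about rdf:_2.
partial = full - {(RDF + "_2", "c", "b")}

print(strict_violations(full, "c"))     # []
print(strict_violations(partial, "c"))  # ['ordinals are not in sequence from rdf:_1']
```

Nothing in the partial description is false; it is merely incomplete, yet the strict reading treats it as malformed.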

(Tentative) Proposed Simplification

Note: It is possible to take a permissive reading of the "formal model" for containers, and read these paragraphs as describing the structure of containers "in the abstract". This permissive reading would allow actual RDF representations of containers to be partial or incomplete, while insisting that the container itself (in some idealised form) does consist of a complete (ungappy) collection. For example, it could be true that some container has 20 elements, yet no RDF representation of that container actually describes the complete data structure.
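As a rough illustration of the permissive reading, the sketch below merges two partial descriptions of one 20-element container. The tuple encoding (pred, subj, obj), the helper names known_members and undescribed, and the resources "c", "x1", "x2", "x20" are all assumptions made for the example, not drawn from M+S or from any implementation.

```python
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Matches the ordinal properties rdf:_1, rdf:_2, ... and captures the number.
ORD = re.compile(re.escape(RDF) + r"_([1-9][0-9]*)$")


def known_members(triples, c):
    """Map each ordinal to the set of members asserted for container c."""
    members = {}
    for pred, subj, obj in triples:
        m = ORD.match(pred)
        if subj == c and m:
            members.setdefault(int(m.group(1)), set()).add(obj)
    return members


def undescribed(triples, c):
    """Ordinals below the highest known one for which no member is known."""
    known = known_members(triples, c)
    return [n for n in range(1, max(known, default=0) + 1) if n not in known]


# Two services, each holding only part of the picture.
source_a = {
    (RDF + "type", "c", RDF + "Bag"),
    (RDF + "_1", "c", "x1"),
    (RDF + "_2", "c", "x2"),
}
source_b = {
    (RDF + "_2", "c", "x2"),
    (RDF + "_20", "c", "x20"),
}

merged = source_a | source_b  # aggregation is plain set union
print(sorted(known_members(merged, "c")))  # [1, 2, 20]
print(undescribed(merged, "c")[:3])        # [3, 4, 5]
```

On this reading the missing ordinals 3 to 19 mean "not yet described", not "absent from the container", so the merged graph is an acceptable, if incomplete, account of it.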

The alternative to this reading is to take the paragraph about sequential numbering to apply to instances of RDF data, not just when serialised as XML/RDF but also in the context of databases, APIs etc. To the author's knowledge (@@refs to the contrary welcome), there are no implementations of RDF that take this view, since it makes exchange of partial information about containers highly problematic.

If the permissive reading is taken, and RDF databases, APIs etc are allowed to represent partial (gappy) information about containers, a substantial simplification to the "formal model" section of RDF M+S becomes possible: the text, making no strong claims about RDF implementations, can be removed (or transferred to a less central part of the spec). RDF Schema already contains a section describing Model and Syntax constructs; it is proposed that the section of M+S formal model relating to containers be moved to RDF Schema.

Rationale: The M+S container constructs, like properties, classes etc., have no special relationship to the RDF model. The only reading of M+S that gives them such a privileged role is (?) unimplemented. By removing the privileged role given to containers in the RDF formal model, the spec can be made simpler and clearer without changing the model.

Test Cases

I want to show some test cases here couched against RDF APIs, rather than in terms of syntax. Suggestions needed on how to do this in a machine friendly format....
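One possible shape for such test cases, offered only as a starting point for discussion: each case lists triples to add, a query pattern, and the expected result. The TripleStore class here is a hypothetical stand-in for whatever RDF API is under test, and the case names and resources are invented for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class TripleStore:
    """Hypothetical minimal API: a set of (pred, subj, obj) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, pred, subj, obj):
        self._triples.add((pred, subj, obj))

    def match(self, pred=None, subj=None, obj=None):
        """Return the triples matching the pattern; None is a wildcard."""
        pattern = (pred, subj, obj)
        return {
            t for t in self._triples
            if all(q is None or q == v for q, v in zip(pattern, t))
        }


# A Seq described only partially: rdf:_2 is unknown to this source.
GAPPY_SEQ = [
    (RDF + "type", "c", RDF + "Seq"),
    (RDF + "_1", "c", "a"),
    (RDF + "_3", "c", "b"),
]

CASES = [
    {
        "name": "gappy-seq-is-storable",
        "add": GAPPY_SEQ,
        "query": (RDF + "_3", "c", None),
        "expect": {(RDF + "_3", "c", "b")},
    },
    {
        "name": "missing-ordinal-yields-nothing",
        "add": GAPPY_SEQ,
        "query": (RDF + "_2", "c", None),
        "expect": set(),
    },
]


def run_case(case, store_factory=TripleStore):
    """Load the case's triples into a fresh store and compare the query result."""
    store = store_factory()
    for triple in case["add"]:
        store.add(*triple)
    return store.match(*case["query"]) == case["expect"]


results = {case["name"]: run_case(case) for case in CASES}
print(results)
```

Because the cases are plain data, they could be serialised to any machine-readable form and replayed against a different store by passing another store_factory.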