Skeleton document for discussion in Boston, January 2007

This document is for discussion only and has no official status. It is one of three discussion documents for use at the meeting. The others are Boston 2 and, creatively, Boston 3.

The document largely follows the XGR structure but all Open Questions need to be resolved. The resource grouping and linkage aspects are being split out into new documents and we may need to expand some sections to cover the broader remit of the group.

Contents

1 Use cases
2 Requirements
3 Data model
3.1 WDR Example 1
3.1.1 Attribution (lines 2 - 4)
3.1.2 Scope (line 5)
3.1.3 Description (line 6 points to lines 8 - 11)
3.2 WDR Repositories
3.3 Trust
3.4 WDR Semantics
3.4.1 Included content
3.4.2 Unique Identity Rules, Caching and Expiry.
4 The WDR Vocabulary
5 WDR in ATOM/RSS
6 WDR in RDFa
7 Glossary
8 References
9 Acknowledgements

1 Use cases

The use cases for POWDER were established in the WCL-XG.

Ed note: Consider "new" work, e.g. robots.txt. If necessary, prepare new use case(s) here.

2 Requirements

These will need to be reviewed and revised in the light of the new name (i.e. WDR, not cLabel).

Fundamentals

  1. It must be possible for both resource creators and third parties to make assertions about information resources.
  2. A group of one or more assertions, known as a description, combined with attribution and a scope of resources that they refer to, together constitute a Content Label, also written as cLabel. This must be able to describe aspects of those resources using terms chosen from different vocabularies. Such vocabularies might include, but are not limited to, those that describe a resource's subject matter, its suitability for children, its conformance with accessibility guidelines and/or Mobile Web Best Practice, its scientific accuracy and the editorial policy applied to its creation.
  3. It must be possible to group information resources and have cLabels refer to that group of resources i.e. define the scope of the cLabel. For example, cLabels can refer to all the pages of a Web site, defined sections of a Web site, or all resources on multiple Web sites.
  4. cLabels must support a single composite assertion taking the place of a number of other assertions. For example, WAI AAA can be defined as WAI AA plus a series of detailed descriptors. Other examples include mobileOK and age-based classifications.
  5. It must be possible for more than one cLabel to refer to the same resource or group of resources.
  6. It must be possible for a resource to refer to one or more cLabels. It follows that there must be a linking mechanism between content and labels.
  7. cLabels must be able to point to any resource(s) independently of those resources.
  8. A cLabel must include assertions about itself using appropriate vocabularies. As a minimum, a cLabel must have metadata describing who created it. Good practice would be to declare its period of validity, how to provide feedback about it, who last verified it and when etc.
  9. It must be possible for a cLabel to refer to other cLabels.
  10. There must be standard vocabularies for assertions about cLabels.
  11. cLabels, their components and individual assertions should have unique and unambiguous identifiers.
  12. Assertions within cLabels should be made using descriptors that themselves have unique identifiers.
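Requirement 4 (composite assertions) can be illustrated with a minimal Python sketch. It assumes that a composite is simply a named set of other assertions, which is not something the XGR mandates. All descriptor names below, including the definition of WAI AA in terms of WAI A, are purely illustrative placeholders.

```python
from typing import Dict, Set

# Hypothetical table of composites: each key stands in for the set of
# assertions on the right. None of these names come from a real vocabulary.
COMPOSITES: Dict[str, Set[str]] = {
    "wai:AAA": {"wai:AA", "ex:detailedDescriptor1", "ex:detailedDescriptor2"},
    "wai:AA": {"wai:A", "ex:detailedDescriptor3"},
}


def expand(assertion: str, composites: Dict[str, Set[str]] = COMPOSITES) -> Set[str]:
    """Return the assertion together with everything it stands for."""
    seen: Set[str] = set()
    stack = [assertion]
    while stack:
        term = stack.pop()
        if term in seen:
            continue  # guards against circular definitions
        seen.add(term)
        stack.extend(composites.get(term, ()))
    return seen


print(sorted(expand("wai:AAA")))
```

A processor that only understands the detailed descriptors could expand a composite in this way before evaluating a cLabel.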

Fitting in with commercial or other large scale workflows

  1. It must be possible for cLabels to be authenticated.
  2. It must be possible to create and edit cLabels without modifying the resources they describe. OQ 1: It is an open question whether there may be a requirement for some forms of cLabel that involve editing those resources.
  3. It must be possible to identify a default cLabel for a group of resources and provide an override at specific locations within the scope of the cLabel.

Encoding labels for humans and machines

  1. It must be possible to express cLabels and cLabel metadata in a machine readable way.
  2. The machine readable form of a cLabel must be defined by a formal grammar.
  3. cLabels must provide support for a human readable summary of the claims they contain.
  4. It must be possible to express cLabels in a compact form.
  5. Vocabularies and authentication data must be formally encoded and support URI references.

New (from dotMobi)

If the use cases and requirements differ significantly from the XGR then we can either a) amend the report, or, more simply b) prepare a Group Note to underpin all 3 Recs.

3 Data model

The data model in the XGR meets these requirements. (add any explanatory notes)

Ed note: Does the data model presented in the XGR actually meet the requirements? We need to work through this. Again, identify what will be in this Rec and what will be covered by the other two?

3.1 WDR Example 1

The following worked example shows how the data model is encoded in RDF.

1  <wdr:WDR rdf:ID="WDR_1"> 
2   <foaf:maker rdf:resource="http://labellingauthority.example.org/foaf.rdf#me" />
3    <dcterms:issued>2006-09-01</dcterms:issued>
4    <wdr:validUntil>2007-09-01</wdr:validUntil> 
5    <wdr:hasScope rdf:resource="$URI" />
6    <wdr:hasDescription rdf:resource="#description_1" />
7  </wdr:WDR>

8  <rdf:Description rdf:ID="description_1">
9    <ex:colour>red</ex:colour>
10   <ex:shape>square</ex:shape>
11 </rdf:Description>

The Web Description Resource Class, lines 1 to 7, includes the 3 key elements:

3.1.1 Attribution (lines 2 - 4)

Line 2 uses the FOAF vocabulary and its conventions to declare that the label was "made by" the entity described at http://labellingauthority.example.org/foaf.rdf#me. There are no formal requirements for this data but it is expected that it will provide generic information about the labelling authority such as its name, homepage URL, contact details etc.

Lines 3 and 4 give specific information about the WDR itself: the date on which it was issued and the date until which it is valid.

3.1.2 Scope (line 5)

The scope of a Web Description Resource is defined in a discrete block of data, the format for which is described in [Resource Grouping].

3.1.3 Description (line 6 points to lines 8 - 11)

This is a straightforward RDF class offering a description of the resources defined in the scope, as claimed by the labelling authority. In this case, the resources are described as red and square.

On their own, each Class in the example is consistent within the RDF data model. However, taken together, and especially when the scope is included, the semantics of a WDR do not fit the RDF data model and so cannot be processed in isolation. The subject - predicate - object triples only have the desired semantics of "these resources have the following property/value pairs" when taken together. It is for this reason that, although standard RDF tools are useful when processing WDR, the results must be used in the specific context of POWDER.
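As a rough illustration of reading the three elements together, the following Python sketch parses Example 1 with the standard library XML parser and returns attribution, scope and description as one unit. The rdf:RDF wrapper and the namespace URIs for wdr and ex are assumptions made so that the fragment parses; the example itself does not declare them. The helper name read_wdr is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
DCT = "http://purl.org/dc/terms/"
WDR = "http://example.org/wdr#"   # assumed: no WDR namespace is defined yet
EX = "http://example.org/vocab#"  # assumed: placeholder description vocabulary

# Example 1 without its line numbers, wrapped in an assumed rdf:RDF element.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:foaf="{FOAF}" xmlns:dcterms="{DCT}"
         xmlns:wdr="{WDR}" xmlns:ex="{EX}">
  <wdr:WDR rdf:ID="WDR_1">
    <foaf:maker rdf:resource="http://labellingauthority.example.org/foaf.rdf#me" />
    <dcterms:issued>2006-09-01</dcterms:issued>
    <wdr:validUntil>2007-09-01</wdr:validUntil>
    <wdr:hasScope rdf:resource="$URI" />
    <wdr:hasDescription rdf:resource="#description_1" />
  </wdr:WDR>
  <rdf:Description rdf:ID="description_1">
    <ex:colour>red</ex:colour>
    <ex:shape>square</ex:shape>
  </rdf:Description>
</rdf:RDF>"""


def read_wdr(xml_text: str) -> dict:
    """Collect attribution, scope and description of the first WDR found."""
    root = ET.fromstring(xml_text)
    wdr = root.find(f"{{{WDR}}}WDR")
    resource = f"{{{RDF}}}resource"
    # Index each description block by its fragment identifier.
    descriptions = {
        "#" + d.get(f"{{{RDF}}}ID"): {c.tag.split("}")[1]: c.text for c in d}
        for d in root.findall(f"{{{RDF}}}Description")
    }
    return {
        "maker": wdr.find(f"{{{FOAF}}}maker").get(resource),
        "issued": wdr.find(f"{{{DCT}}}issued").text,
        "valid_until": wdr.find(f"{{{WDR}}}validUntil").text,
        "scope": wdr.find(f"{{{WDR}}}hasScope").get(resource),
        "description": descriptions[wdr.find(f"{{{WDR}}}hasDescription").get(resource)],
    }


print(read_wdr(DOC))
```

The point of returning a single record is that the description only means "the resources in scope are red and square, according to the maker" once all three parts are held together.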

Ed Note: Then a second, more complex example. Perhaps including a classification and a folksonomy tag?

Ed Note: Open questions 2 - 4 come under Resource grouping. See recent TAG finding on metadata in URIs http://www.w3.org/2001/tag/doc/metaDataInURI-31

3.2 WDR Packages

Ed Note: we didn't explore this fully in the XGR and we need to. The simple WDR examples are fine for describing everything within a given scope, but we must support different WDRs for different groups.

A package is likely to be something like

1  <wdr:Package>
2    <foaf:maker rdf:resource="http://labellingauthority.example.org/foaf.rdf#me" />
3    <dcterms:issued>2006-09-01</dcterms:issued>
4    <wdr:validUntil>2007-09-01</wdr:validUntil> 
5    <wdr:hasScope rdf:resource="$URI_1" />
6    <wdr:hasDefaultDescription rdf:resource="#description_1" />

9    <wdr:hasWDR>
10     <wdr:WDR rdf:ID="WDR_2">
11       <wdr:hasScope rdf:resource="$URI_2" />
12       <wdr:hasDescription rdf:resource="#description_2" />
13     </wdr:WDR>
14   </wdr:hasWDR>

15 </wdr:Package>

This is starting to look more like RDF-CL! However, there are important differences. The scope linked in line 5 would define the scope for the full package and the default description is given for resources in the package's scope (rather like hostRestriction and default Label in RDF-CL). The attribution is given at the package level and is inherited by all subsequent WDRs within the package ... but maybe this is a bad idea? A single site might have multiple labels from multiple sources and so require multiple attribution statements. Maybe we need "default attribution"? Hmmm...

The RDF predicate hasWDR links the package to a constituent WDR. There is no sequence implied here (as there is in RDF-CL). This means that a single package might have several WDRs for the same resource (implying more processing). Alternatively, we could use the RDF-CL technique of putting everything in an ordered sequence (implying less processing at the cost of reduced flexibility). Maybe support both??

An important point here is that scope is always defined separately (and may or may not be defined using RDF).
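To make the processing implications concrete, here is a minimal Python sketch of how a client might resolve a package for a given URI. It rests on several assumptions that are not settled above: scope is modelled as a plain predicate over URIs (a stand-in for whatever [Resource Grouping] defines), attribution is inherited from the package, and the default description applies only when no constituent WDR matches. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Description = Dict[str, str]
Scope = Callable[[str], bool]  # assumed stand-in for a separately defined scope


def prefix(p: str) -> Scope:
    """Hypothetical scope: every URI starting with the given prefix."""
    return lambda uri: uri.startswith(p)


@dataclass
class WDR:
    scope: Scope
    description: Description


@dataclass
class Package:
    maker: str
    scope: Scope
    default_description: Description
    wdrs: List[WDR] = field(default_factory=list)


def describe(package: Package, uri: str) -> List[Tuple[str, Description]]:
    """Return (maker, description) pairs the package asserts about uri."""
    if not package.scope(uri):
        return []  # outside the package scope: nothing is claimed
    # No sequence is implied, so every constituent WDR must be checked
    # and more than one may match.
    matches = [w.description for w in package.wdrs if w.scope(uri)]
    if not matches:
        matches = [package.default_description]  # assumed override rule
    return [(package.maker, d) for d in matches]


pkg = Package(
    maker="http://labellingauthority.example.org/foaf.rdf#me",
    scope=prefix("http://example.org/"),
    default_description={"colour": "red"},
    wdrs=[WDR(prefix("http://example.org/blue/"), {"colour": "blue"})],
)
print(describe(pkg, "http://example.org/blue/page"))
```

With an ordered sequence, as in RDF-CL, the loop could stop at the first match; without one, every constituent WDR has to be tested, which is the extra processing referred to above.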

3.2 WDR Repositories

Ed Note: Open Question 5: Should there be a standard protocol for WDR repositories? Probably yes: SPARQL "DESCRIBE"? What would come back? The description and attribution triples, or just the description (since we know who we're asking)? Do we also want a "give me all the WDRs for this/these domains" query? Can we construct SPARQL queries for this (using RegExes etc.)? Basically, try to use an existing protocol rather than invent a new one.

Ed note: Open Question 6: Should a repository provide a bulk data transfer capability alongside whatever capabilities it offers for transfer of description, WDRs and packages? Maybe specify its URI in the LA's FOAF file? Data encoding will depend on grouping??

Ed Note: OQ 7 is firmly in the linkage Rec. This is the HTTP Link header stuff. Mark Nottingham is not answering my e-mails and his IETF draft has now expired.

3.3 Trust

Ed Note: OQ 8 The form of authentication and certification mechanisms for cLabels requires further study. Should we expand on a particular trust model? Should we insist on a single trust model?

Ed Note: OQ 9: There is also work to be done to more clearly define the roles of various players in the trust chain, such as labelling authority, certification provider etc. (Follows from OQ 8)

3.4 WDR Semantics

Adapted from XGR: We define a Web Description Resource as a resource that contains a description, a definition of the scope of the description and assertions about both the circumstances of its own creation and the entity that created it. In other words, a WDR is the expression of an opinion held by an individual, organization or automaton at a particular point in time. It cannot be taken as proof, in a logical sense, that one or more of the assertions expressed in the WDR is true as an empirical fact.

Furthermore, the WDR is limited by the vocabularies used. That is, inferences cannot be drawn about a resource or group of resources based on the absence of any descriptor. To give a simple example of this, if a WDR describes a resource solely in terms of its colour, no inference can be drawn about its shape.
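The colour and shape example can be sketched in Python as follows. This is only an illustration of the "no inference from absence" rule, assuming a description is held as a simple mapping from descriptor to value. The sentinel UNKNOWN and the function name are hypothetical.

```python
from typing import Dict

# Sentinel meaning "the WDR is silent on this descriptor". It is
# deliberately distinct from False, None or an empty string.
UNKNOWN = object()


def what_is_claimed(description: Dict[str, str], descriptor: str):
    """Return the claimed value, or UNKNOWN if no claim is made."""
    return description.get(descriptor, UNKNOWN)


description = {"colour": "red"}  # a WDR that only talks about colour

print(what_is_claimed(description, "colour"))
print(what_is_claimed(description, "shape") is UNKNOWN)
```

A client that treated the missing shape as "not square" would be drawing exactly the inference that this section rules out.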

3.4.1 Included content

Ed Note: Open question 10: how to enact the finding that a label on an HTML page covers elements within the page. Suggest that linking to a WDR that includes a scope statement covers this one, and that if the elements are not explicitly covered in the scope but the client can establish that the elements are part of a labelled resource, then it SHOULD treat the label as covering those elements.

Ed Note: OQ 11 - seek input from ERT. Maybe the scope Rec will cover this.
Ed Note: Resolve OQ 12 (HTTP redirects)

3.4.2 Unique Identity Rules, Caching and Expiry.

Ed Note: Resolve OQ 13 (new label = new URI)
Ed Note: Resolve OQ 14 (cache-header overrides valid until - fits with TAG finding on authoritative metadata).
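If OQ 14 were resolved as the note above suggests, a client's expiry calculation might look something like the Python sketch below. This is an assumption about a question that is still open, not an agreed rule. The function names are hypothetical, and the HTTP cache lifetime is assumed to have already been reduced to a number of seconds (for example from a Cache-Control max-age value).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def effective_expiry(
    valid_until: datetime,
    fetched_at: datetime,
    max_age_seconds: Optional[int] = None,
) -> datetime:
    """Return the moment after which a cached WDR should be refetched."""
    if max_age_seconds is not None:
        # Assumed rule: the transport-level cache lifetime is authoritative
        # and overrides the validUntil date carried inside the WDR.
        return fetched_at + timedelta(seconds=max_age_seconds)
    return valid_until


def is_expired(now: datetime, expiry: datetime) -> bool:
    """True once the effective expiry has been reached."""
    return now >= expiry


fetched = datetime(2007, 1, 15, 12, 0, tzinfo=timezone.utc)
valid_until = datetime(2007, 9, 1, tzinfo=timezone.utc)
print(effective_expiry(valid_until, fetched, max_age_seconds=3600))
```

Whether a cache header should be able to extend a WDR's life beyond its validUntil date, as this sketch allows, is one of the points that would need to be settled when the question is resolved.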

4 The WDR Vocabulary

Ed Note: Since the resource grouping issue has been split out, I reckon we can go ahead and produce the vocabulary and its RDF Schema version readily enough. No doubt it will evolve as the WG progresses.
Ed note: OQ 15 is about vocabulary terms and refers back to 13. It states: "This section is subject to further review and elaboration - e.g. which terms are mandatory, which are optional, how to provide a unique identifier for the vocabulary. Note also that some of the terms suggest that labels can be altered, and there is an open question as to whether this is in fact possible, given that each instance of a Content Label must have a unique and unambiguous ID. Equally it is important that when a label is 'renewed' that it is not then necessary to change all references to it. It may be possible to work around this by accessing labels by 30x redirection, but the rules applications would be required to follow remain to be discussed"
Ed Note: OQ 16 just says we'll define a vocabulary, but also says that we may define tests for some of the terms.

5 WDR in ATOM/RSS

T.B.C.

6 WDR in RDFa

T.B.C.

7 Glossary

The following terms are used throughout this report. Definitions have been collected from W3C glossaries where possible and provided a priori where necessary.

Assertion Any expression which is claimed to be true. [W3C definition source]

Authenticate, (n. authentication) To provide evidence that assertions made in a cLabel or a certificate are the authentic view of the entity that created them. Such evidence will typically be acquired by direct communication with that entity.

Category A thematically-related sub-group of terms within a vocabulary.

Certificate A cLabel containing assertions about the veracity of claims made in another cLabel.

Certification The process of verification of claims and the creation of a certificate.

Claim An assertion whose truth can be assessed by an independent party.

Classification A specialization of a description; one that is pre-defined.

Content Label, cLabel A resource that contains a description, a definition of the scope of the description and assertions about both the circumstances of its own creation and the entity that created it.

Content provider An entity (individual, organization or automaton) that provides resources in response to requests, whether or not the resource was created by that entity.

Description A resource that contains only assertions and claims.

Descriptor An aspect of a resource about which it is possible to make assertions. For example, color, size and shape. A descriptor becomes a vocabulary term when it is associated with possible values.

Expression An instance of a vocabulary term and its value.

Information resource A resource which has the property that all of its essential characteristics can be conveyed in a message. [W3C definition source]

Labeling Authority (acronym LA) An organization that provides infrastructure for the generation and authentication of content labels.

Labelmark A human perceivable sign that a cLabel has been issued.

Package A collection of cLabels and certificates that apply within some scope.

Repository A storage mechanism for descriptions, cLabels and packages from which they can be retrieved without necessarily being linked from the content they describe.

Resource Anything that might be identified by a URI. [W3C definition source]

Resource creator The individual or organization that created the resource.

Schema (pl., schemata) A document that describes an XML or RDF vocabulary. Any document which describes, in a formal way, a language or parameters of a language. [W3C definition source]

Scope The set of resources to which a cLabel states it applies, or to which a Package states it applies.

Summary A short description of what is said about the resource by the cLabel, suitable for display to end users.

Trustmark A human perceivable sign that a certificate has been issued.

Valid A cLabel is valid if it has an associated schema or schemata and if it complies with the constraints expressed therein. [Adapted W3C definition]

Verification The process of assessing the correctness of claims.

Vocabulary A collection of vocabulary terms, usually linked to a document that defines the precise meaning of the descriptors and the domain in which the vocabulary is expected to be used. When associated with a schema, attributes are expressed as URI references. [This definition is an amalgam of those provided in Composite Capability/Preference Profiles (CC/PP): Structure and Vocabularies 1.0 and OWL Web Ontology Language Guide.]

Vocabulary term An attribute that can describe one or more resources using a defined set of values or data type. Attributes may be expressed as a URI reference. See also descriptor and expression.

Well-formed Syntactically legal. [W3C definition source]

8 References

9 Acknowledgements