Arch/Extensibility3 - W3C RIF-WG Wiki

This text is being proposed for Arch, replacing Arch/Extensibility2. It differs mostly in style, and in some ways the style of the earlier page is probably better. The main substantive difference is that the "component" level of modularity has been removed and instead we just talk about dialects.

Systems are extensible if they can grow or change without coordination among all their users. Because it is not practical to coordinate among the full user base of an open system, spread across many organizations and across the Web, open systems must be extensible if they are to offer ongoing improvements in functionality. RIF has been designed to support extensibility so that it may be incorporated as a key element in such open systems and in the Web as a whole.

The RIF model of extensibility is based on the idea of having multiple RIF dialects, each with its own XML syntax. Whenever a new feature is needed, a new dialect can be published which supports the new feature. This new dialect will largely "overlap" with older dialects, so that documents in the new dialect which do not use the new feature can be understood as instances of an older dialect. This provides backward compatibility.

Beyond this, RIF also provides a form of forward compatibility in the way software is required to handle RIF documents which contain unknown or unimplemented features. The publishers of dialects are encouraged to provide "fallback" information which allows older systems to handle new-dialect documents in a graceful way. This allows features to be adopted by segments of the overall RIF user base while minimizing impact on other users.

Both aspects of RIF extensibility, management of overlap for backward compatibility and fallback procedures for forward compatibility, rely on XML namespaces and the Web as a decentralized information store. Our approach leverages Web Architecture and Semantic Web concepts to provide high functionality at a low implementation cost.

Linked Dialects

In order to coordinate between dialects (to manage overlap) and to allow implementations to download fallback information on-demand, we define "linked dialects" as dialects which are coordinated with Web content as defined in this section.

Not all RIF dialects need to be linked, but unlinked dialects do not support RIF's extensibility features. It is expected that dialects in wide use will be linked, while dialects in early development and those used privately by individuals and small organizations will typically be unlinked.

Each linked dialect must have a single IRI, used to refer to the dialect. Dereference of that IRI, with the content type "application/rif+xml", MUST return a RIF document in [@@some dialect] which includes [@@some information]. The dereference MAY include HTTP redirection steps.

Every IRI used in the dialect's abstract syntax (as identifiers for syntactic classes and properties) must similarly return RIF content. The RIF content and the IRIs (after fragment truncation) MAY be the same for all syntactic elements and for one or more dialects.
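As a rough illustration of the two paragraphs above, the following Python sketch builds (but does not send) a request for the RIF content behind an IRI, and groups syntactic-element IRIs by the document they share after fragment truncation. The function names and the example.org IRIs are hypothetical and not part of RIF; only the "application/rif+xml" content type comes from the text.

```python
from urllib.parse import urldefrag
from urllib.request import Request

# Content type named in the text above.
RIF_MEDIA_TYPE = "application/rif+xml"


def build_dereference_request(iri):
    """Build (without sending) a request for the RIF content behind an IRI.

    Hypothetical helper: the fragment is truncated first, since only the
    document part of the IRI is sent to the server.
    """
    document_iri, _fragment = urldefrag(iri)
    return Request(document_iri, headers={"Accept": RIF_MEDIA_TYPE})


def group_by_document(iris):
    """Group IRIs by the document IRI they share after fragment truncation."""
    groups = {}
    for iri in iris:
        document_iri, _fragment = urldefrag(iri)
        groups.setdefault(document_iri, []).append(iri)
    return groups
```

Passing the request to urllib.request.urlopen would follow any HTTP redirections by default, which matches the allowance for redirection steps. Several syntactic elements whose IRIs differ only in their fragment end up in the same group, so one retrieval can serve all of them.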

The content for the syntactic elements must include a list of all the linked dialects which use the syntactic element. The person or organization responsible for maintaining that content MUST, on a request in good faith, include the reference IRI of any dialect which claims to use that syntactic element.

Syntactic Overlap

The key principle for providing scalable backward compatibility is that the semantics of a given linguistic expression must not change between dialects. As long as this principle is followed, implementations can process input documents without any concern about what dialect they might be written in. A system can simply try to parse each document as being written in one or more of the dialects it implements, and if the document does parse, then it can be correctly treated as if it had been written with that dialect in mind. The meaning is, by definition, the same.
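The "try each implemented dialect" strategy might be sketched as follows. This is a deliberately simplified model, not a real RIF parser: each dialect is represented only by the set of namespace-qualified element names it allows, and the dialect IRIs, the namespace, and the element names are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and dialects, used only for illustration.
NS = "{http://example.org/rif#}"
IMPLEMENTED_DIALECTS = {
    "http://example.org/dialects/basic": {
        NS + "Document", NS + "Rule", NS + "And",
    },
    "http://example.org/dialects/extended": {
        NS + "Document", NS + "Rule", NS + "And", NS + "Or",
    },
}


def matching_dialects(xml_text, dialects=IMPLEMENTED_DIALECTS):
    """Return the IRIs of every implemented dialect the document fits.

    "Fits" here only means that every element used in the document is
    allowed by the dialect; a real implementation would run a full parse.
    """
    root = ET.fromstring(xml_text)
    used = {element.tag for element in root.iter()}
    return [iri for iri, allowed in dialects.items() if used <= allowed]
```

Because the semantics must agree wherever dialects overlap, a document that fits several dialects can be processed under any one of them. An empty result means the document is not in any implemented dialect, which is the case extension handling exists for.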

This principle of non-conflict between dialects where there is syntactic overlap has been practiced for many years in data formats and computer languages as they have evolved. It is relatively obvious and easy to manage if the dialects are all developed by the same organization, or if the dialects have very little overlap. (Most XML-based languages, if they use namespaces, can be seen as non-conflicting. They only share syntactic elements in the "xml" pseudo-namespace, and the semantics of those parts of the syntax are managed by W3C.)

Non-conflict is ensured like this: every dialect definition MUST include normative references to every prior dialect which uses the same syntactic elements. It must state, prominently, that the semantics are the same in the areas of syntactic overlap. By doing this, any contradiction between the semantics of the dialect stated elsewhere in its definition and prior dialects becomes an internal inconsistency in the specification. This is more readily apparent, and less subject to debate, than possible inconsistency with other specifications. Of course, internal inconsistency may be debated, and internal inconsistencies may be discovered years after a dialect is deployed. This is an unfortunate fact of life. [ heh ]

More practically, each dialect SHOULD include an extensive test suite. An implementation SHOULD be tested against all the tests provided with all linked dialects reachable from the implemented dialects. The vocabulary for linking to tests and describing types of tests is specified in our test suite document [???].

Extension Handling

Extension handling is optional in two ways:

- If a dialect is unlinked, documents which use it cannot be handled gracefully by systems which do not support the dialect. Such implementations MUST give a fatal error message indicating that such a document is written in a RIF dialect which is not fully supported by its publisher.
- If the RIF input handler does not fully implement RIF extension handling, it MUST, upon receipt of a message not in the syntax of any dialect it implements, issue a fatal error message indicating that the system cannot process the document because the system does not fully implement RIF's extension handling and that some extension handling is necessary to process this document.
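The two cases above can be read as a small decision procedure. The following Python sketch is one possible reading. The function and exception names are hypothetical, and the order in which the two error conditions are checked is an assumption, since the text does not say which message takes precedence when both apply.

```python
class RifFatalError(Exception):
    """Raised when a document cannot be processed at all (hypothetical name)."""


def choose_action(parses_in_implemented_dialect,
                  handler_has_extension_handling,
                  dialect_is_linked):
    """Decide how an input handler treats an incoming document.

    Returns "process" or "fallback", or raises RifFatalError.
    """
    if parses_in_implemented_dialect:
        # The document is in a dialect the system implements: no extension
        # handling is needed.
        return "process"
    if not handler_has_extension_handling:
        # Assumption: checked first, because a handler without extension
        # handling may be unable to tell whether the dialect is linked.
        raise RifFatalError(
            "Cannot process this document: this system does not fully "
            "implement RIF extension handling, and some extension handling "
            "is necessary to process it."
        )
    if not dialect_is_linked:
        raise RifFatalError(
            "This document is written in a RIF dialect which is not fully "
            "supported by its publisher."
        )
    # Linked dialect and a capable handler: fallback procedures apply.
    return "fallback"
```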

(more details about fallback to follow. Impacts remain the same as in Arch/Extensibility2. The actual fallback procedures are BLD rulesets. There is a default "trim" procedure, which cuts until it's syntactically valid.)