Document To Do: add some concrete examples!
- Background and Terminology
- Managing Overlap
- Fallback Procedures
- Choice: What Fallback Mechanisms Are Mandated?
- Choice: Where Is The Extension Metadata?
- Choice: How complex an impact structure?
- Random Questions
- Example Extensions
1. Background and Terminology
1.1. Basic Terms
A RIF Document is an XML document with a root element called "Document" in the RIF namespace (http://www.w3.org/2007/rif#). In general, RIF documents are expected to convey machine-processible rules, data for use with rules, and metadata about rules.
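As a minimal sketch of this definition, the following Python check tests only the root element name and namespace. The function name and the `payload` child element are illustrative assumptions, not part of any RIF specification.

```python
import xml.etree.ElementTree as ET

RIF_NS = "http://www.w3.org/2007/rif#"

def is_rif_document(xml_text: str) -> bool:
    """Return True if the text is well-formed XML rooted at rif:Document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    # ElementTree spells qualified names as "{namespace}localname".
    return root.tag == f"{{{RIF_NS}}}Document"

# "payload" is a made-up child element, used only to have some content.
good = f'<Document xmlns="{RIF_NS}"><payload/></Document>'
bad = '<Document xmlns="http://example.org/other#"/>'
print(is_rif_document(good), is_rif_document(bad))
```

This says nothing about whether the document conforms to any particular dialect; it only identifies the document as RIF.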
A RIF System is anything which might produce or consume a RIF Document. Typical RIF systems include rule authoring tools and rule engines. These systems may consist of a non-RIF subsystem and a RIF translation subsystem; taken as a whole, they form a RIF system.
A RIF Dialect is an XML language for RIF Documents. Each RIF Dialect defines semantics for the set of RIF Documents which conform to its syntax definition. Dialects may overlap other dialects; that is, a given document may be an expression in multiple dialects at the same time.
A Language Conflict occurs when multiple dialects specify different meanings for the same document. That is, if there can exist a RIF Document which syntactically conforms to two dialects, and a system can be conformant to one of the dialects without also being conformant to the other, then there is a language conflict between the dialects.
A RIF Extension is a set of changes to one dialect (the "base" dialect) which produce another dialect (the "extended" dialect), where the extended dialect is a superset of the base dialect.
A RIF Profile is the complement of a RIF Extension; it is a set of changes to one dialect (the "base" dialect) which produce another dialect (the "profile" dialect), where the profile dialect is a subset of the base dialect.
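The superset and subset relationships can be pictured with a deliberately simplified model in which a dialect is nothing more than the set of element names it allows. That model, the element names, and the helper functions below are all assumptions made for illustration; real dialects also constrain structure and define semantics, and namespaces are omitted here for brevity.

```python
import xml.etree.ElementTree as ET

# Toy model (an assumption): a dialect is the set of element names it allows.
CORE = frozenset({"Document", "Rule", "And"})
EXTENDED = CORE | {"Naf"}   # an extension adds constructs to the base
PROFILE = CORE - {"And"}    # a profile removes constructs from the base

def conforms(xml_text: str, dialect: frozenset) -> bool:
    """True if every element in the document is allowed by the dialect."""
    return all(el.tag in dialect for el in ET.fromstring(xml_text).iter())

def extends(extended: frozenset, base: frozenset) -> bool:
    """True if `extended` is a strict superset of `base`."""
    return base < extended

def restricts(profile: frozenset, base: frozenset) -> bool:
    """True if `profile` is a strict subset of `base`."""
    return profile < base

doc = "<Document><Rule><And/></Rule></Document>"
# The same document is an expression in CORE and EXTENDED, but not in PROFILE.
print(conforms(doc, CORE), conforms(doc, EXTENDED), conforms(doc, PROFILE))
```

In this model every document of the base dialect is automatically a document of the extended dialect, which is why overlap between dialects is the normal case rather than the exception.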
A system is Backward Compatible if it accepts old versions of its input language. All systems are backward compatible for languages which change only by incorporating extensions (that is, by growing). In a large, decentralized system (like the Web), backward compatibility is extremely important because new systems will almost certainly have to read old-version data (either old documents, or documents recently written by old software).
A system is Forward Compatible if it behaves well when given input in future or unknown languages. In a large, decentralized system (like the Web), if the systems are not all forward compatible, new language versions are extremely difficult to deploy. Systems which are not forward compatible will behave badly when they encounter new-version data, so the users of these systems will tend to push back on the people trying to publish the new-version data. If a large enough fraction of the user base is using such systems, the push back becomes too great and migration to new versions is prevented. In small, controlled environments, the software for all the users can be upgraded at once, but that is not practical on the Web.
A Fallback mechanism provides forward compatibility by defining a transformation by which any RIF document in an unknown (or simply unimplemented) dialect can be converted into a RIF document in an implemented dialect. In many cases, fallback transformations will have to be defined to be lossy (changing the semantics of the document). Fallback mechanisms can be simple, like saying that certain kinds of unrecognized language constructs are to be ignored (as in CSS), or they can be complex, invoking a Turing complete processor (as in XForms-Tiny).
Impact is information about the type and degree of change performed by a fallback transformation. For instance, a fallback transformation which affects performance might be handled differently from one which will cause different results to be produced. This difference is considered impact information.
An Invisible Extension defines a dialect which has exactly the same syntax as some other dialect but different semantics. This is sometimes desirable when the different semantics are related in a practical and useful way, such as reflecting the different capabilities of competing implementation technologies. Deployment, testing, and the definition of conformance for invisible extensions require out-of-band information, which may be problematic. For example, there is a subset of OWL-Full which has the same syntax as OWL-DL, but which has more entailments (i.e., different semantics). This subset of OWL-Full is an invisible extension of OWL-DL; its presence (and thus the different intended semantics) cannot be determined by inspection and must be conveyed out-of-band in any applications where the semantic difference might matter.
Extensible systems may support User Extensions (Vendor Extensions), Official Extensions or both. A user extension is one which can be defined and widely (and legitimately) deployed without coordinating with a central authority (such as W3C or IANA). Official extensions are those produced under the auspices of the organization which produced the base dialect (in this case W3C). Some people consider user extensibility to be required for a system to truly be extensible. The RIF Charter extensibility requirement concerns user extensions.
Some partially-formed ideas about dependencies:
An extension A requires extension B if the base dialect of A is a superset of the extended dialect of B.
Extensions A and B are independent if, for all dialects D which can be a base dialect for both A and B, the dialects D+A+B and D+B+A are well-defined and identical.
An extension A is compatible with extension B if A requires B, B requires A, or A and B are independent.
Dialects are incompatible if either there is no dialect which can be a base for both of them or if they are not compatible.
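These dependency definitions can be tried out on a toy model. In the sketch below a dialect is a mapping from construct name to a label for its semantics, and an extension records the base it was written against plus the constructs it adds. The model, the construct names (`Naf`, `List`, `Append`) and the finite list of candidate bases are all assumptions; in particular, "for all dialects D" is approximated by checking only the candidates supplied.

```python
from dataclasses import dataclass

def includes(big: dict, small: dict) -> bool:
    """True if every construct of `small` is in `big` with the same meaning."""
    return all(big.get(name) == meaning for name, meaning in small.items())

@dataclass
class Extension:
    base: dict  # the base dialect the extension was written against
    adds: dict  # constructs the extension introduces

    @property
    def extended(self) -> dict:
        return {**self.base, **self.adds}

def apply(dialect: dict, ext: Extension):
    """Return dialect+ext, or None when the result is not well-defined."""
    if not includes(dialect, ext.base):
        return None  # this dialect cannot serve as a base
    for name, meaning in ext.adds.items():
        if dialect.get(name, meaning) != meaning:
            return None  # same syntax, different semantics
    return {**dialect, **ext.adds}

def requires(a: Extension, b: Extension) -> bool:
    return includes(a.base, b.extended)

def independent(a: Extension, b: Extension, candidate_bases) -> bool:
    for d in candidate_bases:
        if not (includes(d, a.base) and includes(d, b.base)):
            continue  # d is not a possible base for both
        d_a, d_b = apply(d, a), apply(d, b)
        if d_a is None or d_b is None:
            return False
        ab, ba = apply(d_a, b), apply(d_b, a)
        if ab is None or ba is None or ab != ba:
            return False
    return True

def compatible(a: Extension, b: Extension, candidate_bases) -> bool:
    return requires(a, b) or requires(b, a) or independent(a, b, candidate_bases)

BLD = {"And": "conjunction", "Or": "disjunction"}
lists = Extension(BLD, {"List": "ordered sequence"})
list_ops = Extension(lists.extended, {"Append": "list concatenation"})
wfnaf = Extension(BLD, {"Naf": "well-founded"})
smnaf = Extension(BLD, {"Naf": "stable-model"})

print(requires(list_ops, lists))           # list_ops builds on lists
print(independent(lists, wfnaf, [BLD]))    # disjoint additions commute
print(compatible(wfnaf, smnaf, [BLD]))     # same syntax, different meaning
```

Under these assumptions the two negation-as-failure extensions come out incompatible because they give one construct two meanings, which is the language conflict situation described earlier.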
1.3. Motivation Scenario
Acme Widget Co. has a complex pricing structure, with bulk discounts, high-volume customer discounts, periodic sales, overstock sales, and multiple shipping options. They encode their pricing structure in a RIF ruleset which they want to give to customers so that the customers can computationally determine their best timing and grouping of orders. This provides a mutual advantage, as long as Acme designs their pricing structure to accurately reflect their costs and business goals.
Unfortunately, at the time of this effort, Acme finds there is no standard RIF dialect which supports pricing structures varying over time. They can publish a simplified version of their rules, without time-varying parts, but that version would be missing some important information. So they meet with their two biggest customers, who they know are rules-savvy, and design an extension to a RIF dialect which gives them this functionality.
Some questions arise:
- Do they need permission from anyone to do this? (That is, is RIF user extensible or not?)
- What should the syntax look like? What namespace? (How do they avoid language conflict?)
- What happens if RIF-WG wants to make it a standard later? Will the namespace have to change?
- Can they still publish just one ruleset, and have it work for both the users understanding the extension and those not? (That is, is there an effective fallback mechanism?)
These are some of the less obvious goals. Maybe they should all be enumerated here.
User Extensibility: user communities should be able to deploy dialects without any sort of approval from anybody (e.g., W3C).
Let user extensions function as prototypes for standards: it should be possible to prototype possible standard dialects using the user extensibility mechanism, so that (for instance) some community can gain practical experience with a feature before W3C incorporates it into a standard.
2. Managing Overlap
The straightforward way to provide for backward compatibility is to ensure that there are no language conflicts. This requires that designers of extensions never accidentally use the same syntax, and that they are careful to use the same semantics when they do use the same syntax.
2.1. Choice: How Do Extension-Creators Discover Overlap?
Because RIF uses an XML syntax with XML namespaces (which are URIs), there are several options here.
2.2. Random Questions
Does every syntactic element blessed by RIF-WG go in one w3.org namespace, multiple w3.org namespaces, or can some of them be in non-w3.org namespaces?
implies: Do user extensions have to change namespaces if they become part of a W3C recommendation?
Do we ever allow invisible extensions?
How do you know when extensions are compatible?
Do people ever need to specify dialects, or can they just specify extensions from a null core? Maybe BLD is a package of 6 extensions? Or maybe it's just one extension? Is that a useful concept?
Do we recommend/allow/forbid document-level flags to change the meaning of syntactic elements? e-mail.
3. Fallback Procedures
3.1. Choice: What Fallback Mechanisms Are Mandated?
If a system receives a document which does not conform to the syntax of any dialect it implements, what should it do?
3.2. Choice: Where Is The Extension Metadata?
Again, if a system receives a document which does not conform to the syntax of any dialect it implements, what should it do? Specifically, how can it learn which elements belong to which extensions? How can it learn what Impact is associated with each fallback option it has? How do I learn which fallbacks to perform?
Some of these options parallel the Approaches to Discovering Overlap, but there are some others available here, too.
All the net-access approaches involve some possible security risks. Perhaps the links could optionally include a secure-hash (if you want to make sure no one changes the extension metadata) or a public key (if you want to let the extension metadata change but are worried about imposters changing it).
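The secure-hash option could look roughly like the following sketch. It assumes the link carries a hex-encoded SHA-256 digest of the metadata document; the function name and the sample metadata bytes are hypothetical, and the public-key variant is not shown.

```python
import hashlib
import hmac

def metadata_matches(metadata_bytes: bytes, expected_sha256_hex: str) -> bool:
    """Check fetched extension metadata against the digest carried in the link."""
    actual = hashlib.sha256(metadata_bytes).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())

# Hypothetical metadata as the extension author published it.
published = b"<ExtensionMetadata>fallback rules here</ExtensionMetadata>"
digest_in_link = hashlib.sha256(published).hexdigest()

tampered = published.replace(b"fallback", b"malicious")
print(metadata_matches(published, digest_in_link))
print(metadata_matches(tampered, digest_in_link))
```

A hash pins the metadata to one exact version, so it suits the "no one changes it" case; allowing authorized updates would need a signature scheme instead.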
3.3. Choice: How complex an impact structure?
The design of the impact information structure depends on what you might do differently, based on the information.
Many prior extensible languages use just 1 bit of impact information. Sometimes (e.g., in SOAP) it is a "Must Understand" flag. Other times anything that is not "must understand" is considered to be metadata.
The rest of this section is from the Arch/Extensibility2 strawman.
Each fallback substitution has zero or more two-part impact flags. Each flag consists of a type and a severity, indicating what kind of effect performing that substitution will have. Based on the type and severity, different types of RIF-consuming systems can behave differently.
Repeats of the same impact SHOULD NOT have greater impact.
soundness : performing this substitution will make the resulting ruleset produce incorrect answers and/or behaviors.
completeness : performing this substitution will make the resulting ruleset produce fewer distinct answers and/or behaviors than it otherwise would. If the results would have been complete before, they no longer will be.
performance : performing this substitution may cause rule processing systems to handle this ruleset with significantly degraded performance
presentation : this substitution only affects aspects of the ruleset intended for human readers.
Under certain conditions, impact flags may be interrelated. For instance, if a negation-as-failure component is being used, a completeness impact flag being set should automatically raise the soundness impact flag.
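One possible encoding of the two-part flags, and of the negation-as-failure interrelation just described, is sketched below. The two-level severity scale is an assumption (the text does not fix one), and reading "repeats SHOULD NOT have greater impact" as "keep the worst severity per type" is likewise an interpretation.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum

class ImpactType(Enum):
    SOUNDNESS = "soundness"
    COMPLETENESS = "completeness"
    PERFORMANCE = "performance"
    PRESENTATION = "presentation"

class Severity(IntEnum):  # hypothetical scale; the source does not define one
    MINOR = 1
    MAJOR = 2

@dataclass(frozen=True)
class ImpactFlag:
    type: ImpactType
    severity: Severity

def effective_flags(flags, uses_naf: bool) -> set:
    """Merge repeated flags, then apply the negation-as-failure rule."""
    worst = {}
    for flag in flags:
        # Repeats do not accumulate: keep only the worst severity per type.
        worst[flag.type] = max(worst.get(flag.type, flag.severity), flag.severity)
    if uses_naf and ImpactType.COMPLETENESS in worst:
        # Losing answers under negation-as-failure can create wrong answers,
        # so a completeness flag raises soundness to at least the same level.
        sev = worst[ImpactType.COMPLETENESS]
        worst[ImpactType.SOUNDNESS] = max(worst.get(ImpactType.SOUNDNESS, sev), sev)
    return {ImpactFlag(t, s) for t, s in worst.items()}

flags = [
    ImpactFlag(ImpactType.COMPLETENESS, Severity.MINOR),
    ImpactFlag(ImpactType.COMPLETENESS, Severity.MINOR),
]
print(len(effective_flags(flags, uses_naf=False)))
print(len(effective_flags(flags, uses_naf=True)))
```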
Systems MUST NOT silently perform any fallback substitution which has even a slight chance of producing incorrect answers or behavior. Instead, software SHOULD inform users of fallback substitutions which have minimal effects and SHOULD require confirmation from users before performing fallback substitutions which may have greater effects. RIF Software MUST NOT lead a reasonable user to think that errors stemming from fallback substitutions are due to faulty input.
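The handling rule in the preceding paragraph might be coded along these lines. The sketch assumes that soundness and completeness are the impact types that can lead to incorrect answers or behavior, and that severity is one of the strings "minor" or "major"; neither assumption comes from the text.

```python
from enum import Enum

class Action(Enum):
    APPLY_SILENTLY = "apply silently"
    INFORM_USER = "inform user"
    REQUIRE_CONFIRMATION = "require confirmation"

# Assumption: these are the types that risk incorrect answers or behavior.
RISKY_TYPES = {"soundness", "completeness"}

def handling(flags) -> Action:
    """Choose how to treat a substitution given (type, severity) pairs."""
    risky = [severity for impact_type, severity in flags if impact_type in RISKY_TYPES]
    if not risky:
        return Action.APPLY_SILENTLY        # no chance of incorrect results
    if "major" in risky:
        return Action.REQUIRE_CONFIRMATION  # greater effects need consent
    return Action.INFORM_USER               # minimal effects: tell the user

print(handling([("presentation", "major")]).value)
print(handling([("completeness", "minor")]).value)
print(handling([("soundness", "major"), ("performance", "minor")]).value)
```

Whatever the outcome, a system following the last requirement would also record which substitutions were performed, so that later errors can be traced to the fallback rather than blamed on the input.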
This table indicates suggested handling of impact information by a system which answers queries for users, using reasoning on a rulebase provided in RIF.
3.4. Random Questions
Can we use the namespace+element URI to look up the fallback information?
Must we support off-line fallback?
Do we have a simple flag indicating non-semantic (metadata) extensions? (Do we consider metadata an extension?)
Is there a way to make the fallback processing extensible? Could we make only Trim-to-Fit mandatory, but have XSLT or BLDX be encouraged and motivated?
All official dialect elements go into the rif: namespace. That namespace will dereference, just as everyone else's SHOULD, to:
- pointers to fallback/impact information
- pointers to documentation
- pointers to community resources
User extensions go in their own separate namespaces (which might happen to be on w3.org, or purl.org, or whatever). If they become official, the namespace has to change. But fallback substitution between the two should mean implementations and data don't have to change.
extensible fallback options, up to fixed-replacement.
no invisible extensions
having in-line data or in-line imports is an extension.
no specific notion of metadata -- it's just extensions for elements which can be ignored with minimal impact.
4.2. Input Processing Procedure
Proposed, that all systems which consume RIF MUST do this:
- You have a RIF document to process
- You try to parse it (or schema validate it) according to each of the dialects you know. If you succeed with any of them, you're done. Otherwise...
- You parse the document using the all-RIF schema to obtain the dialect metadata.
- The metadata (or its absence) will indicate whether you should do web-based fallback processing. If so, you must either do it, or give the user an error message. If it fails, the user must be given an error message. If it succeeds, you have more dialect metadata.
- The dialect metadata allows you to construct additional XML schemas. The document must be schema valid with respect to at least one of them.
- The dialect metadata also includes fallback/impact information, back to a base dialect you probably implement. You must do this transformation or warn the user. Based on the impact, you may have to warn the user.
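The steps above can be sketched as a single function whose collaborators are passed in as callables. Every hook name here (`read_metadata`, `fetch_metadata`, `build_validators`, `fall_back`, `warn`), the `web_fallback` metadata key, and the string-matching stubs in the demonstration are hypothetical; the sketch only fixes the order of the steps and where errors and warnings surface.

```python
class RifError(Exception):
    """Raised when the user must be shown an error message."""

def consume(doc, known_dialects, read_metadata, fetch_metadata,
            build_validators, fall_back, warn):
    # Try every implemented dialect first; success ends processing.
    for name, accepts in known_dialects.items():
        if accepts(doc):
            return name, doc
    # Parse with the all-RIF schema to obtain the dialect metadata.
    meta = read_metadata(doc)
    # Web-based fallback processing, if the metadata calls for it.
    if meta.get("web_fallback"):
        try:
            meta = {**meta, **fetch_metadata(meta)}
        except OSError as exc:
            raise RifError(f"could not fetch dialect metadata: {exc}") from exc
    # The document must validate against at least one derived schema.
    if not any(is_valid(doc) for is_valid in build_validators(meta)):
        raise RifError("document matches no schema derived from its metadata")
    # Transform back to an implemented dialect, warning about any impact.
    target, new_doc, impacts = fall_back(doc, meta)
    if impacts:
        warn(f"fallback to {target} has impact: {sorted(impacts)}")
    return target, new_doc

# Demonstration with stand-in hooks; real ones would parse and validate XML.
warnings_shown = []
result = consume(
    "<Document><Naf/></Document>",
    known_dialects={"BLD": lambda d: "Naf" not in d},
    read_metadata=lambda d: {"web_fallback": False},
    fetch_metadata=lambda m: {},
    build_validators=lambda m: [lambda d: "Naf" in d],
    fall_back=lambda d, m: ("BLD", d.replace("<Naf/>", ""), {"soundness"}),
    warn=warnings_shown.append,
)
print(result)
print(warnings_shown)
```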
4.3. Dialect Metadata
Whole Dialect includes a grammar for the dialect, and a set of zero or more fallbacks to other dialects.
Extension includes a partial grammar to match against and additional branches to add to it. Also, zero or more fallbacks which transform from the modified grammar back to the original.
Names any element/attribute URIs which are not to be followed.
Names any additional URIs which are to be followed.
Fallback procedures are named by URIs, which are to be followed unless also given in dialect metadata.
Namespace documents, fallback data, etc., are all in RDF/XML. (Alternatively, in some RIF dialect.)
@@ Can we add content hashes, somehow, later?
4.4. Fallback Functionality
- xml tag/attr substitution [ with this impact ]
- omit this subtree [ with this impact ]
- replace this subtree with its content [ with this impact ]
- replace this subtree with this value [ with this impact ]
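A rough sketch of the four substitutions listed above, using Python's ElementTree. The rule format (tag mapped to operation, argument, and impact flags), the operation names, and the element names in the example are assumptions; only element substitution is shown, the root element is never rewritten, and text or tail content around removed elements is simply dropped.

```python
import xml.etree.ElementTree as ET

def apply_fallbacks(root, rules):
    """Rewrite the elements below root; return the impact flags incurred."""
    impacts = []

    def rewrite(parent):
        new_children = []
        for child in list(parent):
            rule = rules.get(child.tag)
            if rule is None:
                rewrite(child)
                new_children.append(child)
                continue
            op, arg, flags = rule
            impacts.extend(flags)
            if op == "rename":      # xml tag substitution
                child.tag = arg
                rewrite(child)
                new_children.append(child)
            elif op == "omit":      # omit this subtree
                pass
            elif op == "unwrap":    # replace this subtree with its content
                rewrite(child)
                new_children.extend(child)
            elif op == "replace":   # replace this subtree with this value
                new_children.append(ET.fromstring(arg))
            else:
                raise ValueError(f"unknown fallback operation: {op}")
        parent[:] = new_children

    rewrite(root)
    return impacts

# Hypothetical extension elements and their fallback rules.
rules = {
    "Naf": ("unwrap", None, ("soundness",)),
    "Note": ("omit", None, ("presentation",)),
    "OldAnd": ("rename", "And", ()),
    "Ext": ("replace", "<Atom/>", ("completeness",)),
}
root = ET.fromstring(
    "<Document><Naf><Atom/></Naf><Note>hi</Note><OldAnd/><Ext/></Document>"
)
impacts = apply_fallbacks(root, rules)
print([el.tag for el in root], impacts)
```

Collecting the flags while rewriting keeps the impact information next to the substitutions that caused it, which is what a consumer needs in order to decide whether to warn the user.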
Additional fallback mechanisms may be specified later, especially XSLT and BLDX. This is done by simply saying that if you don't understand some of the Extension data, you ignore it. (@@ Is that okay? Elsewhere, we seem to want schema validity.)
Impact information is as in previous strawman.
5. Example Extensions
See RIF Dialect Structure, and other versions of that....
5.1. Add Neg (classical negation) to BLD
5.2. Add SMNAF (Stable Model Negation-As-Failure) to BLD
5.3. Add WFNAF (Well-Founded Negation-As-Failure) to BLD
5.4. Add Lists to BLD
5.5. Add Object-Oriented Non-Monotonic Inheritance to BLD
David Orchard has edited a series of drafts on "versioning" for the TAG and XML Schema Working Group: