
This is $Revision: 1.64 $.

THIS DRAFT IS OBSOLETE. See Approved Charter

Please send comments (noting revision number!) as follows:

Comments to | Address
Editor | sandro@w3.org
W3C Team | w3t-semweb-review@w3.org (archives)
W3C Members | w3c-ac-forum@w3.org (archives)
Public (technical comments) | public-rule-workshop-discuss@w3.org (archives)

We're aiming for an Activity Proposal to go to the AC for formal review in early September.

*DRAFT*
Rules Working Group Charter

This [DRAFT] charter defines the scope of work and plans for a W3C Working Group. See the workshop report.


1. Mission

The Working Group is to specify a data format or language for rule exchange across systems, especially systems using diverse technologies and systems using W3C technologies. This language will function as an interlingua into which established rule languages can be mapped, allowing rules written for one application to be published, shared, and re-used with other applications and other rule engines.

This mission is part of W3C's larger goal of enabling the sharing of information in forms suited to machine processing. Rules themselves represent a valuable form of information for which there is, at present, no consensus on a standard interchange format. Rules provide a powerful business logic representation, as business rules, in many modern information systems. Rules are often the technology of choice for creating maintainable adapters between information systems. Finally, as part of the Semantic Web architecture, rules can extend the OWL Web Ontology Language to more thoroughly cover a broader set of applications. Each of these styles of application was presented at the recent W3C Workshop on Rule Language Interoperability.

1.1. Usage Scenarios

To help motivate and clarify the scope of this working group, here are three cornerstone scenarios, each illustrating a kind of application which should be supported by this work.

Finding New Customers

Jackson is trying to find someone: he needs at least one more client before the end of the quarter. He has access to dozens of databases containing millions of potential candidates, but how is he going to narrow the field to the five or ten leads he should seriously pursue?

He thinks for a minute, then constructs a new query. He clicks on interesting properties, sees their values, and locks in the ones he thinks will act as useful filters. After a few minutes he gets frustrated, because the same concepts seem to have different names in different databases. Worse, the same idea is sometimes expressed in different structures; one database follows the Outlook model of having "name of assistant" and "phone number of assistant" properties, while another simply has an "assistant" property, which links to another person. Trying to handle structural variations like this in his query is becoming impossible.

Fortunately, Jackson's system supports a rule language. The query construction interface helps him construct mapping rules between different constructs which seem equivalent to him, letting him infer new information that is customized to his needs, so he can query over a virtual unified database with a structure that seems to him to be simple and straightforward.

In fact, these rules were already being used; the data views Jackson saw were in many cases constructed by rules other people had written. His own rules will be available to his department (because he stored them in the department workspace), allowing his co-workers to use the unified view he finds so useful.
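
A minimal sketch of the kind of mapping rule Jackson might build, written in Python purely for illustration. The record layouts and property names (assistant_name, assistant_phone, assistant) are assumptions modelled on the two structures described in this scenario; nothing here is a proposed syntax for the interchange language.

```python
# Hypothetical records from two databases that model "assistant" differently.
flat_record = {
    "name": "Pat Lee",
    "assistant_name": "Sam Roy",
    "assistant_phone": "555-0100",
}
linked_record = {
    "name": "Kim Wu",
    "assistant": {"name": "Alex Poe", "phone": "555-0199"},
}


def unify_assistant(record):
    """Mapping rule: if a record has flat assistant_* properties,
    infer an equivalent linked "assistant" person."""
    unified = {k: v for k, v in record.items()
               if not k.startswith("assistant_")}
    if "assistant_name" in record or "assistant_phone" in record:
        unified["assistant"] = {
            "name": record.get("assistant_name"),
            "phone": record.get("assistant_phone"),
        }
    return unified


# Both sources can now be queried through one virtual structure.
unified_view = [unify_assistant(r) for r in (flat_record, linked_record)]
assistant_phones = [r["assistant"]["phone"] for r in unified_view]
```

A query written against the linked "assistant" structure then works over both sources, which is the "virtual unified database" idea in miniature.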

Validating Prescriptions

Bob goes to his new physician, Dr. Rosen, complaining of a painful cough and some difficulty breathing. The diagnosis of pneumonia is straightforward, and Dr. Rosen prepares to prescribe erythromycin. First, he asks Bob if he is taking any medications. Unfortunately, Bob is not entirely forthcoming: he says no, even though he takes pimozide to help manage Tourette's disorder. The omission seems harmless enough, and Bob is uncomfortable with people knowing about this difficult aspect of his medical history.

Fortunately, Bob uses the same pharmacy for both prescriptions, and his pharmacy checks all prescriptions against a merged rule base. This rule base includes the fact that erythromycin is a macrolide antibiotic (coming from the erythromycin vendor) and an encoding of the 1996 FDA bulletin that pimozide is contraindicated with macrolides. When the pharmacist enters the prescription, he is informed of the potentially dangerous drug interaction. He talks to Bob, and with Bob's permission contacts Dr. Rosen to plan an alternative therapy.

The same technology could be made available to doctors, to double check their own knowledge and available references, and to consumers who want to take a greater role in understanding their own health care. The key is the ability to efficiently merge rules from multiple sources because we have an interchange language.
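
The pharmacy check can be sketched as follows, assuming a toy encoding in which the vendor supplies "is a" facts and the FDA bulletin is encoded as (drug, drug class) pairs that must not be combined. The data structures and the function name check_prescription are hypothetical and only illustrate merging rules from two sources.

```python
# Facts and rules from two hypothetical sources, merged into one rule base.
vendor_facts = {("is_a", "erythromycin", "macrolide")}
fda_rules = [
    # (drug, drug class) pairs that must not be combined
    ("pimozide", "macrolide"),
]
patient_prescriptions = {"pimozide"}


def check_prescription(new_drug, current_drugs, facts, contraindications):
    """Return warnings for the new drug against the patient's current drugs."""
    classes = {cls for (rel, drug, cls) in facts
               if rel == "is_a" and drug == new_drug}
    return [
        (current, new_drug, cls)
        for (current, cls) in contraindications
        if current in current_drugs and cls in classes
    ]


warnings = check_prescription(
    "erythromycin", patient_prescriptions, vendor_facts, fda_rules)
```

Neither source alone produces the warning; it only appears once the vendor's classification and the contraindication rule are evaluated together.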

Processing Loan Applications

Cory was shopping for a home equity loan. A web search found a site (loans.example.com) of which Cory had heard and which offered to get him three free quotes. He entered the required information. The forms application he used embedded rules that indicated that as his location was in California, he was required by state law to specify whether his application was for home improvement. This "intelligent form" means that he was less likely to have his application returned for additional information. His application was then dispatched to three lenders. The lenders in turn each added his application to their applicant database where it was subject to matching by their rules.

One lender's system determined a suitable rate and sent Cory an e-mail and paper-mail reply immediately. The second flagged the application for review by a loan officer who looked briefly at the data before authorizing the automated offer process to continue. At the third lender, Cory was automatically classified as a highly desirable customer, and a loan officer was flagged to call Cory and personally move the process forward.

The rules in each lender's rule base were in fact based on a combination of their own business rules, rules of their aftermarket loan trading partners, and rules encoding government regulations. Again, this becomes much more practical if it can be based on a common interchange language.

In each case, conventional rules technology is enhanced not only by the usual economies of standardization, but also by the ability to exchange and merge rules from different sources. Particularly in the first scenario, we see the kind of ad hoc data fusion which is the hallmark of the Web, finally being done by machine.

1.2. Language Subsets (Profiles)

In scoping any language, there is a tension between a narrow scope, allowing easier implementation, and a broad scope, providing more utility. If the language is too small, people will in practice have to make non-standard extensions. If the language is too large, implementers will skip features seen as more difficult or less important, forcing users to stay within the intersection of implemented feature sets if they really want portability.

Some relief is available in the form of language levels or profiles, where certain groups of features are marked as being in a particular profile, as in RuleML's model. Implementers can then completely implement one or more profiles of the standard, giving users a simpler choice and typically a larger feature set to work with.

For this work, there are certain well known and natural profiles, based on well-studied logics and sub-languages of first-order logic. These include syntactic sub-languages (such as conjunctive normal form and the language of Horn clauses) as well as semantic sub-languages with reduced expressive power and desirable computational properties (such as the subset of FOL which can be transformed into Horn clauses, e.g. using Skolemization).
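
As an illustration of a syntactic profile, the sketch below tests whether a clause falls in the Horn sub-language, using the standard definition that a Horn clause is a disjunction of literals with at most one positive literal. The clause encoding (a list of predicate name and sign pairs) is an assumption made for this example.

```python
def is_horn(clause):
    """A clause is a list of (predicate, is_positive) literals, read as a
    disjunction. It is Horn if at most one literal is positive."""
    return sum(1 for (_, positive) in clause if positive) <= 1


# NOT hasParent(x, y) OR NOT hasBrother(y, z) OR hasUncle(x, z)
# i.e. hasParent(x, y) AND hasBrother(y, z) -> hasUncle(x, z)
uncle_rule = [("hasParent", False), ("hasBrother", False), ("hasUncle", True)]

# male(x) OR female(x): two positive literals, so not Horn
either_rule = [("male", True), ("female", True)]
```

A tool could use a check of this kind to report which profile a given rule set stays within.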

The working group may identify a small number of sub-languages or profiles of the standard in order to reduce the occurrence of partial implementations and re-invented extensions.

1.3. Compatibility

The language must work well with both XML data and RDF data. It should generally work well with both XML tools and RDF tools. Together, these technologies cover the key formats for network data sources.

Compatibility with OWL is also important. Users must be able to express knowledge in OWL and have it largely carry over to work in their rules engine. Users must also be able to express additional ontological knowledge in the rules language (e.g. the definition of uncle) and have it merge cleanly with OWL in a suitable reasoner. Note that some use cases for rules can be addressed with OWL alone, but users at the workshop reported that it sometimes falls short of what they need. The SWRL and SWRL-FOL submissions present approaches to this compatibility.

2. Language Features

The Working Group is to specify an interlingua, a common format into which existing rule languages can be mapped. This interlingua may itself be considered a rule language, and may be supported natively by some rule systems in the future, but interlingua features are more important than features which make the language itself easy to use directly.

2.1. Extensibility

The mapping must not be lossy. It must be possible for rules to round-trip through the interlingua without significant distortion, so that users can trust it to provide portability of their rules and preserve the intended meaning and behavior and thus their investment in developing them. Not all features will be covered by the specification, however, so some features will have to be mapped via an extension mechanism.
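
One way to picture a non-lossy mapping with an extension mechanism is sketched below. It assumes a hypothetical core vocabulary ("if", "then"), a hypothetical vendor namespace (http://vendor.example/ext#), and a vendor-specific "salience" feature that the core does not cover; none of these names come from this charter.

```python
CORE_FEATURES = {"if", "then"}           # hypothetical core vocabulary
EXT_NAMESPACE = "http://vendor.example/ext#"  # hypothetical extension namespace


def to_interlingua(rule):
    """Split a source rule into core parts and URI-named extensions."""
    core = {k: v for k, v in rule.items() if k in CORE_FEATURES}
    extensions = {EXT_NAMESPACE + k: v
                  for k, v in rule.items() if k not in CORE_FEATURES}
    return {"core": core, "extensions": extensions}


def from_interlingua(doc):
    """Rebuild the source rule, restoring extension features by name."""
    rule = dict(doc["core"])
    for uri, value in doc["extensions"].items():
        rule[uri.rsplit("#", 1)[1]] = value
    return rule


source_rule = {
    "if": "applicant.state == 'CA'",
    "then": "require(loan_purpose)",
    "salience": 10,  # feature outside the hypothetical core
}
exchanged = to_interlingua(source_rule)
round_tripped = from_interlingua(exchanged)
```

A receiving system that does not understand the extension can still process the core parts, while the originating system gets back exactly what it published.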

Editor's Note: Should we talk about the role URIs play in extensibility?

If the users find the system inadequate for their needs, they will need to extend it. Editor's Note: or they won't use it at all!

2.2. Full First-Order Logic

The core of the language will be full first-order logic (FOL, also called predicate calculus) with equality. This is a very well studied logic, which is a compatible extension of RDF and OWL.

It is understood that not all rule engines will offer complete FOL reasoning and that not all authoring tools and languages will support full FOL syntax. This is an unfortunate but necessary outcome of building an interlingua among today's rule systems.

2.3. Rule Syntax

The primary normative syntax of the language should use XML. It is unlikely this format will be practical to author or read directly, except for the simplest rules. Users are expected to work through tools or rule languages which are transformed to and from this format.

The Working Group may specify a secondary syntax which is more convenient for direct human use and is encouraged to do so to help its internal process.

The Working Group must specify a mapping of the syntax of the language to RDF, so that rules can be stored and queried using RDF tools. There are potential semantic difficulties around self-reference here which may need to be addressed, at least with effective advice about how to avoid problems.

Note that because the language is a semantic extension of RDF, its XML syntax will function as another W3C Recommended XML syntax for RDF data.

Editor's Note: Should explain something about the expression language? How should it be connected to XPath?

2.4. Data Formats

Editor's Note: More introductory and explanatory text is needed here.

Editor's Note: How about the separation of rules from data, and how the rule language can be defined using XSD?

As in FOL, predicates with more than two arguments ("n-ary" predicates) and logical functions must be supported by the language, and standard techniques for mapping n-ary atomic sentences and logical function terms to RDF must be specified. This might be done via role or slot names and "de-Skolemizing", as in @@@ref?

For instance, if a conclusion p(a,b,c) is reached, it should be visible via a standard mapping as RDF triples like { _:x p1 a ; p2 b ; p3 c }. @@@@ is it okay to just tell users to write the rules which say that? No, it should be in the ontology for p....
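
The mapping in the example above can be sketched in Python as follows. The positional role names (p1, p2, p3) follow the example; the blank node naming scheme and the function name nary_to_triples are assumptions, and whether role names should instead come from the ontology for p is left open, as noted above.

```python
import itertools

_blank_ids = itertools.count()


def nary_to_triples(predicate, args):
    """Map p(a, b, c) to triples about a fresh blank node, one per argument
    position, using the positional role names p1, p2, p3, ..."""
    node = f"_:x{next(_blank_ids)}"
    return [(node, f"{predicate}{i}", arg)
            for i, arg in enumerate(args, start=1)]


triples = nary_to_triples("p", ["a", "b", "c"])
```

Each call produces a new blank node, so two different conclusions with the same predicate do not share a subject.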

2.5. Introspection and Rule Metadata

The language must support working with properties of properties and properties of rules. This can be done while staying first-order by having intensional semantics for predicates, where what seem to be second-order properties are really first-order properties of something in the domain of discourse which has the same name (URI) as the subject property. (This is what Pat Hayes calls "punning" and is widely implemented, although it is not normally a part of FOL.)
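
A small sketch of this "punning", assuming facts are stored as subject, predicate, object triples and using made-up names in an ex: namespace. The same name appears in predicate position in one fact and in subject position in others, and both uses are handled by ordinary first-order lookups.

```python
# The name ex:hasUncle is used both as a predicate and as a subject.
# All names in the ex: namespace are hypothetical.
triples = {
    ("ex:alice", "ex:hasUncle", "ex:bob"),
    ("ex:hasUncle", "ex:definedBy", "ex:familyOntology"),
    ("ex:hasUncle", "rdf:type", "ex:FamilyRelation"),
}


def properties_of_property(prop, data):
    """Facts whose subject is the individual named like the property."""
    return {(p, o) for (s, p, o) in data if s == prop}


def uses_of_property(prop, data):
    """Facts where the same name is used in predicate position."""
    return {(s, o) for (s, p, o) in data if p == prop}


metadata = properties_of_property("ex:hasUncle", triples)
usages = uses_of_property("ex:hasUncle", triples)
```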

2.6. Datatype Support

The language must make it practical to write rules about strings, integers, Boolean values, and XML nodes. It may offer support for dates, times, and other XML Schema datatypes. This support should borrow directly from XQuery 1.0 and XPath 2.0 Functions and Operators where possible, both for ease of implementation and to allow users to re-use their expertise.

Editor's Note: How can we manage the complexity of importing XML Schema? Should we use the Infoset/XPath1.0 datamodel instead of the XQuery1.0/XPath2.0 one?

The Working Group need not maintain compatibility with RDF's datatype semantics, if it provides an approach which is largely compatible and simpler to use and implement. For instance, the language need not support integer literals if it supports string literals and a logical function or predicate which maps strings-representing-integers to integers.
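
The string-based approach can be sketched as below. The function names are hypothetical, and Python's int() is used only as a stand-in for the mapping; the lexical forms it accepts differ slightly from those of the XML Schema integer datatype.

```python
def string_to_integer(lexical):
    """Logical function from strings-representing-integers to integers.
    Returns None when the string does not represent an integer.
    Assumption: Python's int() stands in for the real lexical mapping."""
    try:
        return int(lexical)
    except ValueError:
        return None


def older_than(age_literal, threshold_literal):
    """Example rule condition written only over string literals."""
    age = string_to_integer(age_literal)
    threshold = string_to_integer(threshold_literal)
    return age is not None and threshold is not None and age > threshold
```

Comparing the mapped values rather than the strings matters: as strings, "9" sorts after "18", while as integers 9 is less than 18.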

Editor's Note: Do we need to say anything about datatype extensibility here? cf. how OWL does it.

2.7. External Conditions

The truth value of some atomic sentences depends on the state of the world outside the rule engine and possibly outside the computer. The predicates used in such sentences function to provide input to the rule system. Variable arguments to these input predicates can return values from outside the rule engine.

In practice, two kinds of input predicates are needed:

  1. Standard Built-In Input Predicates. These might include predicates for accessing the current time, prompting the user for input, fetching web content, or performing a side-effect-free SOAP call. The Working Group should specify a small but useful set of these.

  2. Plug-In Input Predicates. These predicates can only be evaluated by a rule engine which is linked to a suitable implementation of the plug-in, in-process or via a protocol like SOAP. These plug-ins might be candidates for future standardization, or they might be entirely local and application specific. The Working Group should specify any necessary framework to support this linking.

The Working Group should not attempt to support input predicates which have side effects.
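
The two kinds of input predicates described above might be linked to an engine roughly as sketched here. The registry layout, the predicate names (current_year, credit_score) and the in-process lookup table standing in for a remote service are all assumptions for illustration.

```python
import datetime

# Hypothetical registry of input predicates, keyed by name.
builtin_inputs = {
    "current_year": lambda: datetime.date.today().year,
}
plugin_inputs = {}


def register_plugin(name, implementation):
    """Link a plug-in input predicate to a local implementation."""
    plugin_inputs[name] = implementation


def evaluate_input(name, *args):
    """Return the value bound by an input predicate from outside the engine."""
    if name in builtin_inputs:
        return builtin_inputs[name](*args)
    if name in plugin_inputs:
        return plugin_inputs[name](*args)
    raise LookupError(f"no implementation linked for input predicate {name!r}")


# An application-specific plug-in; the table stands in for a real service.
register_plugin("credit_score", lambda applicant: {"cory": 720}.get(applicant))
score = evaluate_input("credit_score", "cory")
```

An engine that lacks the plug-in fails loudly at evaluation time, which is one possible way to surface a missing link; the implementations shown only read data and have no side effects.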

2.8. Actions

Certain facts, being known to be true, can be used to guide behavior. Equivalently, when the consequent of a rule includes such a fact, the rule can be seen as an "if condition then action" rule; when the condition is true, it becomes appropriate to perform the action, to engage in some behavior.

We see several kinds of actions in use:

Assert or Create
These can be restructured as purely logical "if condition then condition" rules, where the asserted facts are inferred, and the created objects are named by existentially quantified variables.
Retract, Update, or Delete

These actions have a straightforward meaning and effect when applied to premise data; they are simply requests to modify stored data. The change in the rule engine input data may have complex consequences, but the semantics are clear. Applied to input data, these actions should be supported in situations where modifying stored data makes sense.

In some rule systems, these actions may be applied to inferred data, in which case they have engine-specific behavior and non-logical semantics. The Working Group should not include support for this use in the language.

Calling a Standard Built-In Procedure
@@@@ What procedures might be a standard library, other than the procedures to invoke a plug-in / web-service ?
Calling a Plug-In
@@@@

Editor's Note: Can we channel everything through SOAP? Or do we need some other kind of extensibility mechanism here? Can existing Java expressions be automatically turned into SOAP+WSDL expressions? I think so.

2.9. Named Closed Worlds

To allow rule sets to be merged and the overall system to remain monotonic, some care needs to be taken around applications of the Closed-World Assumption, to make sure they are carefully scoped and avoid reference loops.
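
One possible reading of a carefully scoped Closed-World Assumption is sketched below: a fact is treated as false only relative to a named fact set that is declared complete, so merging in other rule sets or fact sets cannot change the answer. The names and data are hypothetical, and this is only one interpretation of the idea, not a design the charter prescribes.

```python
# Named fact sets; the names and contents are hypothetical.
worlds = {
    "flights-2005-12-07": {("flight", "SFO", "BOS"), ("flight", "SFO", "JFK")},
    "flights-partial": {("flight", "SFO", "SEA")},
}
# Names of fact sets whose publishers declare them complete.
closed_worlds = {"flights-2005-12-07"}


def known_false_in(world_name, fact):
    """Scoped closed-world test. Returns True or False only for a named fact
    set that is declared complete; otherwise returns None ("unknown")."""
    if world_name not in closed_worlds:
        return None
    return fact not in worlds[world_name]


no_lax_flight = known_false_in("flights-2005-12-07", ("flight", "SFO", "LAX"))
```

Because the conclusion names the fact set it was drawn from, adding facts under a different name leaves it valid.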

Editor's Note: This needs a lot more explanation. What should the WG do about it? Note how monotonicity fits in here.

2.10. Defaults and Rule Priorities

In practice, many rule languages provide for an order of execution and a way to specify default values and default rules. This becomes more difficult when rule sets can be merged. The Working Group may provide a mechanism to support these features, if it can be done within the overall monotonic system.

2.11. Explanations

The working group may specify a standard way to talk about and look at the proof supporting a conclusion. This is important work in practice, but is not a required deliverable.
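
As a rough illustration of what "looking at the proof" could mean, the sketch below runs a naive forward chainer over ground rules and records which premises supported each derived fact. The rule encoding, the function names, and the example facts are assumptions; a standard proof format would need far more detail.

```python
def forward_chain(facts, rules):
    """Apply ground rules (premises, conclusion) until nothing new is derived.
    Returns a dict mapping each derived fact to the premises that produced it."""
    known = set(facts)
    support = {}
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                support[conclusion] = premises
                changed = True
    return support


def explain(fact, support):
    """Return a nested proof (fact, [sub-proofs]); given facts have none."""
    return (fact, [explain(p, support) for p in support.get(fact, ())])


facts = {"erythromycin is a macrolide", "patient takes pimozide"}
rules = [
    (("erythromycin is a macrolide", "patient takes pimozide"),
     "erythromycin is contraindicated"),
    (("erythromycin is contraindicated",), "alert pharmacist"),
]
proof = explain("alert pharmacist", forward_chain(facts, rules))
```

The resulting tree lets a user trace a conclusion back to the facts and rules, and hence the sources, that it depends on.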

3. Out of Scope

Editor's Note: Explain more why these are of interest, and what they are?

3.1. Non-Monotonic Logics (NAF)

Non-monotonic logics, including traditional logic programming features (especially general Negation-As-Failure, NAF), are out of scope. NAF is essentially the type of negation seen in many commercial and research rule systems, but combining it with FOL is an unsolved research problem.

For more information see issue 3.1 in the workshop report, along with the discussion notes.

Editor's Note: This may be the most contested point in the charter.

3.2. Temporal Logic

Temporal logic can be very useful in modeling business rules, but it is out of scope for this version due to its complexity.

3.3. Modal Logic

Modal logic can be very useful in modeling business rules, but it is out of scope for this version due to its complexity.

Editor's Note: Explain how this can be mapped to FOL?

3.4. Characterizing Rule Engines

Whether an engine uses forward chaining, backward chaining, or some other strategy does not need to be specified or discussed. It should be possible to implement this spec using almost any rules technology.
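
To illustrate why the evaluation strategy need not be specified, the sketch below answers the same question over ground Horn rules with a naive forward chainer and a naive backward chainer. The rule encoding and function names are assumptions; real engines are far more sophisticated, but the point is that both strategies agree on what follows from the rules.

```python
def forward(facts, rules, goal):
    """Derive everything derivable, then check whether the goal was reached."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return goal in known


def backward(facts, rules, goal, pending=frozenset()):
    """Work back from the goal to the facts, skipping goals already in progress."""
    if goal in facts:
        return True
    if goal in pending:  # already trying to prove this goal; avoid looping
        return False
    pending = pending | {goal}
    return any(
        all(backward(facts, rules, p, pending) for p in premises)
        for premises, conclusion in rules
        if conclusion == goal
    )


facts = {"a"}
rules = [(("a",), "b"), (("b",), "c"), (("c",), "b")]
answers = [(forward(facts, rules, g), backward(facts, rules, g))
           for g in ("a", "b", "c", "d")]
```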

4. Relationships to other Efforts

4.1. OMG PRR

PRR has smaller scope: just production rules. We expect the significant overlap in membership to help us avoid any unnecessary duplication of effort.

4.2. OMG OCL

@@@ At least the extension/restriction being defined for PRR

4.3. OMG SBVR

May be a user....

4.4. ISO CL

Lots of overlap.

4.5. RuleML

Useful XML syntax, sub-language/profile design, useful expertise

4.6. W3C DAWG (SPARQL)

Should be compatible with the RDF view of the data

4.7. W3C XQuery

We borrow from XQuery 1.0 and XPath 2.0 Functions and Operators (see 2.6).

5. Schedule and Deliverables

About 12 months to CR, 18 months to Rec. First face-to-face in early December -- perhaps December 7-8, in Bay Area, co-ordinated with OMG meeting that week. Second F2F at Tech Plenary, end of February.

Editor's Note: There are some concerns about this timeline. Perhaps doing a Level-1 and Level-2 would help? What's described above is Level-2, and might reasonably be given 2-3 years, some people say.

@@@ roughly as dawg

6. Participation, Meeting, and Logistics

@@@ roughly as dawg


Sandro Hawke, W3C
$Id: charter.html,v 1.64 2005/11/24 16:10:54 sandro Exp $