This is one of the possible Use Cases.
Facts from several Web sources, e.g. about businesses, are to be integrated. Rules can be used for creating an integration view of the sources. An individual, e.g. a specific business, can be identified across the sources via rule-based URI normalization. Classes can be aligned employing taxonomic reasoning. The rules and taxonomies themselves often require integration in turn. Optionally, integrity constraints, expressed as (normative) rules, help to maintain the consistency, completeness, etc. of the sources and the integration result.
Originally proposed by: HaroldBoley. Later proposed to the First F2F WG meeting as Use Case NRC-1: http://lists.w3.org/Archives/Public/public-rif-wg/2005Dec/0029.html
- Based on perceived business and organizational need for Information Integration, which is a world-wide necessity.
A version of the use case, including integrity constraints, with two Web sources for New Brunswick Businesses is implemented and described in http://www.ruleml.org/usecases/nbbizkb.
3. Links to Related Use Cases
Rule Based Service Level Management and SLAs for Service Oriented Computing: Rule-based information integration. XML serialization done uniformly in RuleML.
Rule Interchange Through Test-Driven Verification and Validation: Special, fact-like, integrity constraints. Techniques can thus be transferred between these two use cases, e.g. RuleML sublanguages, serialization, and semantic attributes.
4. Relationship to OWL/RDF Compatibility
OWL Compatibility can be achieved as in SWRL or, with Hybrid Rules, as in the Realistic Architecture. RDF Compatibility can be achieved via translation of URI-identified businesses to RDF 'about' descriptions and the direct use of RDFS sector/category taxonomies.
5. Examples of Rule Platforms Supporting this Use Case
The rule engine OO jDREW http://www.jdrew.org/oojdrew has been employed to run NBBizKB http://www.ruleml.org/usecases/nbbizkb. It permits bottom-up as well as top-down execution, the latter with better support for negation as failure (NAF).
6. Benefits of Interchange
- Facts from the various sources can be transformed into the integration view when the need arises or once and for all.
- Normalization rules can be written in any interchangeable format.
- Rules creating an integration view can be written in any interchangeable format.
7. Requirements on the RIF
An instantiation of this use case was implemented with POSL rules as NBBizKB and tested in OO jDREW. The need to construct such integration rules through iterative refinement with human experts implies the requirement of a human-readable syntax.
In this use case, the identity criterion for businesses across the Web sources is a problem if no URI is provided or URI normalization cannot be done: in NBBizKB, normalized phone numbers had to be used instead. This implies the requirement to 'webize' the language with URIs and interface it to the current official URI normalization algorithm.
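The identity criterion described above can be sketched as follows. This is only an illustration under stated assumptions, not how NBBizKB does it (NBBizKB uses POSL rules): the function names, the record fields `uri` and `phone`, and the North American phone format are all hypothetical, and `normalize_uri` covers only a few of the syntax-based normalization steps (case of scheme and host, default port, empty path, fragment removal), not a complete normalization algorithm.

```python
import re
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumption: only these default ports are stripped.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_uri(uri: str) -> str:
    """Partial syntax-based normalization; userinfo and fragment are dropped."""
    parts = urlsplit(uri.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"  # treat an empty path as "/"
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def normalize_phone(phone: str) -> str:
    """Keep digits only; assumes North American numbers (hypothetical choice)."""
    digits = re.sub(r"\D", "", phone)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    return digits


def identity_key(record: dict) -> Optional[str]:
    """Prefer a normalized URI; fall back to a normalized phone number."""
    if record.get("uri"):
        return "uri:" + normalize_uri(record["uri"])
    if record.get("phone"):
        return "tel:" + normalize_phone(record["phone"])
    return None  # no identity criterion available for this record
```

Two records from different sources would then be treated as the same business when `identity_key` returns the same non-`None` value for both.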
Given that the same business can be identified in both sources, and assuming it is correctly classified w.r.t. each source's taxonomy, an alignment between the two taxonomic classes can be hypothesized. The hypothesis becomes stronger the more such business-occurrence pairs are found in both sources. This implies the requirement to combine rules with taxonomies and to permit uncertainty handling, as explored in Fuzzy RuleML.
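One simple way to quantify such an alignment hypothesis is to count how often a pair of classes co-occurs for businesses found in both sources. The sketch below is an assumption made for illustration, not the uncertainty model of Fuzzy RuleML: the function name, the input shape (one `(class_in_S1, class_in_S2)` pair per matched business), and the choice of support count plus relative frequency as the measure of strength are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

ClassPair = Tuple[str, str]  # (class in source 1, class in source 2)


def alignment_strengths(
    pairs: Iterable[ClassPair],
) -> Dict[ClassPair, Tuple[int, float]]:
    """Map each observed class pair to (support, confidence).

    support    = number of matched businesses classified under both classes
    confidence = support / number of matched businesses in the S1 class
    """
    pairs = list(pairs)
    pair_counts = Counter(pairs)
    s1_counts = Counter(c1 for c1, _ in pairs)
    return {
        (c1, c2): (n, n / s1_counts[c1])
        for (c1, c2), n in pair_counts.items()
    }


# Hypothetical class names, one pair per business identified in both sources.
matched = [
    ("Restaurants", "Food Services"),
    ("Restaurants", "Food Services"),
    ("Restaurants", "Retail"),
    ("Software", "IT"),
]
strengths = alignment_strengths(matched)
```

Reporting the support alongside the ratio matters here: a pair seen once has a ratio of 1.0 but very little evidence behind it, whereas the text asks for a hypothesis that strengthens as more pairs are found.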
8.1. Actors and their Goals
- Providers - want their Web sources to be re-used
- Fact Integrators - want to create an integration view of all the sources with fact-integrating rules
- Customers - want to use multiple sources as if they were one uniform source
- Rule Integrators - want to integrate the fact-integration rules with rule-integrating rules
8.2. Main Sequence
- Providers make their Web sources available
- Fact Integrators create an integration view dynamically, when the need arises
- Customers use sources in a uniform way
- Rule Integrators dynamically integrate the Fact Integrators' rules
8.3. Alternate Sequences
Like Main Sequence except:
- Fact Integrators create an integration view statically, once and for all
Government analysts, venture capitalists, or entrepreneurs want to monitor the progress of business development in some region XY. Facts about XY businesses are available from two Web sources, S1 and S2. While S1 contains detailed information, it has not been updated since time T. S2 contains less information but continues to be updated after T. As part of the information, a classification of the sector or category of each business is given in the two sources, using two respective taxonomies.
A Web Service is to create an integration view using all business information from S1 except where it is overwritten by S2, adding new entries for businesses only occurring in S2. For integrating the classifications, corresponding sectors or categories need to be determined and aligned in the taxonomies.
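The override behaviour of this integration view can be sketched in a few lines. This is a minimal illustration rather than the rule-based implementation the use case calls for, and it assumes that each record has already been assigned an identity key shared across S1 and S2. The function name, the dictionary layout, the sample data, and the decision to ignore `None` values from S2 are all hypothetical.

```python
from typing import Any, Dict

Record = Dict[str, Any]


def integrate(s1: Dict[str, Record], s2: Dict[str, Record]) -> Dict[str, Record]:
    """Build an integration view: S1 fields, overwritten by S2 where S2 has a value.

    Businesses that occur only in S2 are added as new entries.
    Neither input is modified.
    """
    view = {key: dict(rec) for key, rec in s1.items()}
    for key, rec in s2.items():
        merged = view.setdefault(key, {})
        for field, value in rec.items():
            if value is not None:  # assumption: None means "no information"
                merged[field] = value
    return view


# Hypothetical sample data keyed by an identity key.
s1 = {"tel:5065550100": {"name": "Acme", "sector": "Manufacturing", "employees": 12}}
s2 = {
    "tel:5065550100": {"name": "Acme Ltd.", "sector": None},
    "tel:5065550199": {"name": "Beta Foods"},
}
view = integrate(s1, s2)
```

In this sample, the detailed `employees` field survives from S1, the name is taken from the more recent S2, and the business known only to S2 appears as a new entry. Aligning the `sector` values across the two taxonomies is a separate step that this sketch does not attempt.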
"Links to Related Use Cases" should also point to several other Information Integration Use Cases, which together make a strong argument for a RIF.