Interview: Oracle on Data on the Web – Part 1 with Reza B’Far

This is part 1 of a 2-part interview with Oracle about data on the Web. In part 1, the focus is on the consumption of data by applications, such as those that enterprises provide to their employees. In part 2, the focus is on back-end data management.

For this part of the interview I spoke with Reza B’Far, Vice President of Development.

IJ: How does the Oracle apps team use Web standards for data?

Reza: Oracle uses a number of W3C standards, but one of my focus areas is the application of Semantic Web technologies. OWL and PROV are the two standards we’ve used in our Fusion applications. Fusion applications bring together and integrate Oracle acquisitions from the past decade related to enterprise resource planning (ERP), human resources, supply chain management, financials, customer relations, and so on.

IJ: What are some examples of Fusion applications?

Reza: For example, enterprise customers use Fusion GRC (governance, risk, and compliance) to ensure they comply with various government rules and regulations. They also use the tools to detect overpayment or fraud. In the team that I run, the problems of discovering things like overpayment, segregation of duties (SOD) violations, fraud, and others are best solved by using an artificial intelligence (AI) approach. We have found that OWL provides an optimal way to capture the knowledge required by the AI engine, for example, for intelligent searches.

IJ: What are intelligent searches?

Reza: These are heuristic-based searches. Take the example of trying to detect fraud in an enterprise environment where a lot of systems interact. Suppose Jack reports to Joe and they collude in some way on one transaction out of 100,000. How do we detect this? One might try to look at all possible permutations of the transactions in the system, but there’s no known solution if you take this sort of brute-force approach of examining every possible permutation.

Reza: On the other hand, if you use heuristics based on domain expertise, you can make your search engine smarter and reduce the problem space. The challenge is how to capture the domain knowledge. There are a variety of ways to do this, even several approaches using Semantic Web technology. However, we found OWL worked best for us. OWL lets us represent all the entities in the system as well as statements like “the probability of fraud due to duplicate payment or overpayment is high.” OWL is very versatile because it does not require you to use a single grand schema to represent your world. And, beyond heuristic reasoning, OWL gives us the secondary benefit of data aggregation.
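
To make the idea of a heuristic shrinking the problem space concrete, here is a minimal sketch. It is not Oracle's engine and does not use OWL; the `Payment` record, its fields, and the "same payee, same invoice" rule are hypothetical stand-ins for domain knowledge that, in the interview's setting, would be captured in OWL. The point is only that a domain rule lets a search examine a handful of candidates instead of every pair of transactions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payment:
    # Hypothetical record layout, chosen for illustration only.
    payment_id: str
    payee: str
    approver: str
    invoice: str
    amount: float


def brute_force_pair_count(n: int) -> int:
    """Number of transaction pairs a naive pairwise comparison would visit."""
    return n * (n - 1) // 2


def duplicate_payment_candidates(payments):
    """Heuristic: only payments sharing a payee and an invoice can be duplicates.

    Grouping by that key is a single pass over the data, and only groups
    with more than one payment are handed on for closer inspection.
    """
    groups = {}
    for p in payments:
        groups.setdefault((p.payee, p.invoice), []).append(p)
    return [group for group in groups.values() if len(group) > 1]


payments = [
    Payment("p1", "acme", "joe", "inv-100", 500.0),
    Payment("p2", "acme", "joe", "inv-100", 500.0),   # same payee and invoice as p1
    Payment("p3", "acme", "ann", "inv-101", 250.0),
    Payment("p4", "globex", "joe", "inv-100", 75.0),  # same invoice number, other payee
]

candidates = duplicate_payment_candidates(payments)
candidate_ids = [[p.payment_id for p in group] for group in candidates]
```

With 100,000 transactions, `brute_force_pair_count(100_000)` is already close to five billion pairs, and that is before considering larger combinations; the grouping heuristic touches each transaction once.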

IJ: So you have OWL statements and RDF data. Then what happens?

Reza: We have a reasoning engine (the Planner and Reasoning Engine), which uses the heuristics and walks through the data to verify compliance, detect fraud, etc.

IJ: What were you using before OWL?

Reza: Though we did capture some data using a variety of formats, there really was nothing before OWL. Roughly five years ago, we started using OWL to scale this product line by allowing our partners to add their own rules. As an example, a company like Deloitte might use their own rules expressed in OWL, customer data, and Oracle’s tools.

IJ: What is the reception to using OWL?

Reza: Fairly positive. The biggest barrier to OWL adoption has been that people are unfamiliar with it. So we have invested in educating our partners and customers, and this investment has paid off. Within Oracle, we’ve gone from “OWL is weird” to “OWL is a possibility.” But we need more champions with specific applications that generate revenue.

IJ: How are you using PROV?

Reza: PROV is at least as important to us as OWL. Until PROV, one of the biggest problems we faced was maintaining transaction audit trails in a heterogeneous environment in a standard and compatible way. Audit trails are described with literally millions of different formats in different organizations. This used to mean it was impossible to create a single audit timeline. PROV solves this problem. We now provide (and consume) a PROV feed that unifies the audit trails generated by transactions across heterogeneous systems.

IJ: What’s an example?

Reza: Suppose I own a retail store and I contract with someone to help out during the holiday season. Months later that person becomes an employee. PROV lets me track changes over time for metadata from heterogeneous systems. It provides a standardized temporal structure for metadata, allowing me to aggregate temporal data from different systems. This lets me do things like look at payment data and changes to employee status and detect fraud.
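
The retail example can be sketched in a few lines. This is an illustration under stated assumptions, not Oracle's PROV feed and not a PROV serialization: the two source formats (`hr_rows`, `payroll_rows`), the field names, and the `ProvRecord` class are all hypothetical. The record loosely borrows PROV's core notions of an entity, an activity, and an agent, plus a timestamp, so that events from both systems can be merged into one timeline and checked together.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ProvRecord:
    # Simplified, PROV-inspired shape; not the actual PROV data model.
    entity: str      # who or what the event is about
    activity: str    # "kind:detail", e.g. "status:employee"
    agent: str       # who performed or approved the activity
    time: datetime


def from_hr(row):
    """Normalize a row from a hypothetical HR system (day/month/year dates)."""
    return ProvRecord(
        entity=f"person:{row['worker']}",
        activity=f"status:{row['new_status']}",
        agent=row["changed_by"],
        time=datetime.strptime(row["date"], "%d/%m/%Y"),
    )


def from_payroll(row):
    """Normalize a row from a hypothetical payroll system (ISO 8601 timestamps)."""
    return ProvRecord(
        entity=f"person:{row['payee']}",
        activity=f"payment:{row['type']}",
        agent=row["approved_by"],
        time=datetime.fromisoformat(row["ts"]),
    )


def timeline(*feeds):
    """Merge any number of normalized feeds into one chronological audit trail."""
    return sorted((r for feed in feeds for r in feed), key=lambda r: r.time)


def contractor_payments_after_hire(records):
    """Flag contractor-type payments made while the person is an employee."""
    status, flagged = {}, []
    for r in records:
        kind, detail = r.activity.split(":", 1)
        if kind == "status":
            status[r.entity] = detail
        elif kind == "payment" and detail == "contractor" and status.get(r.entity) == "employee":
            flagged.append(r)
    return flagged


hr_rows = [
    {"worker": "sam", "new_status": "contractor", "changed_by": "joe", "date": "01/11/2013"},
    {"worker": "sam", "new_status": "employee", "changed_by": "joe", "date": "15/03/2014"},
]
payroll_rows = [
    {"payee": "sam", "type": "contractor", "approved_by": "joe", "ts": "2013-12-20T10:00:00"},
    {"payee": "sam", "type": "salary", "approved_by": "ann", "ts": "2014-03-31T09:00:00"},
    {"payee": "sam", "type": "contractor", "approved_by": "joe", "ts": "2014-04-02T09:00:00"},
]

unified = timeline(map(from_hr, hr_rows), map(from_payroll, payroll_rows))
flagged = contractor_payments_after_hire(unified)
```

Neither system alone shows a problem here; only the merged timeline reveals a contractor payment made after the change to employee status.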

IJ: Are there other Semantic Web technologies you are thinking of adopting?

Reza: We are actively looking at using the Linked Data Platform (LDP) specifications.

IJ: Any comments about vocabulary management?

Reza: I think there’s a dissonance in vocabulary creation, particularly related to Dublin Core. There’s no standard mechanism to rationalize OWL implementations with Dublin Core. Dublin Core defines a bunch of canonical domain objects. Dublin Core should be mashed into OWL. Or there could be guidelines on using OWL for consistency with Dublin Core. There is a risk of stumbling when using both unless you use them consistently.

IJ: Thanks for your time!