OGCWG F2F - February, 2012
From Oil, Gas and Chemicals Business Group
Organizational Face-To-Face Meeting Report
At this meeting to discuss formation of a W3C Oil, Gas and Chemicals Business Group, participants identified a number of potential use cases for this industry segment. Each has clear business drivers and could support better, faster and cheaper strategic and operational business decisions, or compliance with legal requirements. In each case there are potential linkages both to current activity in the W3C and to activity in our industry, providing opportunities for synergies: The W3C is uniquely positioned to provide the expertise required to use this technology effectively because, well, that’s where the technology is coming from. Our industry can help the W3C by providing use cases that drive prioritization of how the technology is best implemented and extended, and that show where the use of Semantic Web technology is most effective for our industry.
An organizational face-to-face meeting was hosted by Chevron in Houston on February 13, 2012 to discuss forming a W3C Oil, Gas and Chemicals Business Group. Present at the meeting were representatives from the following companies and organizations:
- Algebraix Data
- Apache Corp
as well as two unaffiliated individuals and a representative of the W3C. Other companies that were not able to attend, but have previously expressed interest in this effort (e.g. by contributing to the draft charter of the group) include:
- Saudi Aramco
- University of Southern California
The most substantive part of the meeting was a discussion of potential use cases that might be pursued by this group. The use cases that received the most interest are as follows:
Use Case: Data Provenance
The business driver is to increase the reliability of decisions based on diverse information by understanding the sources of that information.
The value chain in the Oil & Gas industry progresses from exploration, to production, transportation, refining, and finally, to distribution and marketing. Moving any given resource through this chain may require more than 10 years of investment, much of which occurs in the form of data acquisition, processing, and interpretation, in complex workflows that traverse functional and organizational boundaries. Importantly for the present use case, these workflows generally build on previous results, integrating new data and creating new, derived data products. As new information and knowledge accumulate, however, previous assumptions and work often require modification. At each step or node in this complex system of workflows, there is thus significant need to evaluate data provenance, either to verify the validity of source data for the next step in a workflow, or to identify how previous processing or interpretation should be modified. This need represents a key use case for data provenance.
Semantic technologies seem well suited for capturing the complex, anastomosing flow of data in the Oil & Gas industry. Capturing this opportunity is extremely difficult at this time, however, because very few of our major software systems expose information about data provenance. Such exposure is necessary to compile the provenance of data as it traverses the multitude of systems in use today. Standards are clearly needed in this area in order to enable vendors to expose information about data provenance in a usable way.
There has recently been research associated with the Oil & Gas industry in using Semantic Web technology to deal with data provenance issues (e.g. predicting missing reservoir engineering provenance data and querying provenance in a distributed "Smart Oilfields" environment), and the W3C has recently formed a Data Provenance Working Group, following on from the prior W3C work of the Provenance Incubator Group, to develop Semantic Web standards for data provenance. The potential synergies here are obvious: Since the Working Group is relatively young, this could be an opportune time to provide use cases and requirements from our industry. Conversely, technical input from the W3C working group can help the Oil & Gas industry head in the right direction. One specific question to ask is, "Does the Oil, Gas & Chemicals industry have requirements for data provenance information that are substantially different from those of, say, the Health Care and Life Sciences industry or eGovernment?" We won't know the answer to that until we look at the details, but we might guess that the relative importance of real-time and time series data in our industry might lead to special requirements and techniques.
In this use case, we recognize as one possible opportunity use of the Energy Industry Profile (EIP) of ISO 19115-1 to capture key provenance information. In particular, the LI_Lineage package in ISO 19115-1 offers the means to document the originating process of a resource, and references to its source data, information core to the incremental construction of a provenance graph. Because the EIP is designed to document virtually all information resource types of importance to the Oil & Gas industry – including digital services, structured and unstructured products, and physical assets – it could contribute to a design for representing data provenance that enables compilation of provenance graphs capable of referencing all information resource types.
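As a rough illustration of the idea above, the following Python sketch builds a provenance graph incrementally from lineage records and then walks it in both directions: upstream, to verify the sources of a derived product, and downstream, to find what needs rework when a source is revised. The LineageRecord class, its field names, and the resource names are hypothetical simplifications invented for this example; they are not the actual ISO 19115-1 or EIP schema.

```python
from dataclasses import dataclass, field


@dataclass
class LineageRecord:
    """Hypothetical, simplified lineage entry: one resource, the process
    that produced it, and the resources it was derived from."""
    resource: str
    process: str
    sources: list[str] = field(default_factory=list)


def build_graph(records):
    """Map each resource to the set of its direct sources."""
    graph = {}
    for rec in records:
        graph.setdefault(rec.resource, set()).update(rec.sources)
        for src in rec.sources:
            # Make sure source-only resources also appear as nodes.
            graph.setdefault(src, set())
    return graph


def upstream(graph, resource):
    """All resources that `resource` was directly or indirectly derived from."""
    seen = set()
    stack = [resource]
    while stack:
        node = stack.pop()
        for src in graph.get(node, ()):
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return seen


def downstream(graph, resource):
    """All resources that depend, directly or indirectly, on `resource`."""
    return {node for node in graph if resource in upstream(graph, node)}


# Illustrative records only; the names are made up.
records = [
    LineageRecord("seismic_survey_raw", "acquisition"),
    LineageRecord("seismic_volume", "processing", ["seismic_survey_raw"]),
    LineageRecord("well_log", "logging"),
    LineageRecord("reservoir_model", "interpretation",
                  ["seismic_volume", "well_log"]),
]
graph = build_graph(records)
```

Here upstream(graph, "reservoir_model") collects every input the model ultimately rests on, while downstream(graph, "seismic_survey_raw") lists the derived products to re-examine if the raw survey is reprocessed. A real implementation would presumably use a standard provenance vocabulary rather than an ad hoc dictionary.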
Use Case: Linked Enterprise Data -- Drilling and Exploration
The next two use cases are both aimed at providing input, and possibly requirements, to the W3C activities in Linked Enterprise Data, and at potentially benefiting from their technical expertise and development work. Here again the W3C activity is relatively young (a 2011 Workshop appears to be heading toward establishing a new working group in this area), which again makes it a good time to provide input of this sort. There is considerable enthusiasm in the industry to develop Semantic Web approaches to drilling information, possibly concentrating on the currently hot areas of shale and fracking. One possibility that was discussed for connecting this to current industry activity could be to consider "semantifying" the University of Tulsa Bricks Taxonomy.
The business driver is predicated on the fact that drilling information is kept in extremely diverse forms. The Semantic Web provides an opportunity to reconcile related information from diverse sources so that it can be more effectively analyzed, leading to quicker and higher quality decisions.
Use Case: Linked Enterprise Data -- Value Chain
The idea here is to create linked data models across the enterprise which model many parts of the value chain (e.g. drilling, exploration, production and maintenance), replacing static enterprise information models in this area.
The expected benefit to the business is to enable the optimization of operations and strategic decisions across the entire enterprise as opposed to one silo at a time.
Again the connection with the W3C would be with the Linked Enterprise Data activity, as described above.
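To make the idea of linking data across silos slightly more concrete, here is a minimal Python sketch that represents facts from three parts of the value chain as subject-predicate-object triples and answers a question that no single silo could answer alone. It uses plain Python sets rather than an actual RDF toolkit, and every identifier, predicate name, and value in it is hypothetical.

```python
# Each silo publishes facts as (subject, predicate, object) triples.
# All identifiers, predicates and values below are hypothetical.
drilling = {
    ("well:W-001", "drilledBy", "rig:R-7"),
    ("well:W-001", "totalDepthM", 3200),
}
production = {
    ("well:W-001", "producesInto", "facility:F-2"),
    ("facility:F-2", "dailyOutputBbl", 15000),
}
maintenance = {
    ("facility:F-2", "nextShutdown", "2012-06-01"),
}

# Because the silos share identifiers, merging them is a set union.
linked = drilling | production | maintenance


def objects(graph, subject, predicate):
    """Return every object matching the given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def shutdowns_affecting(graph, well):
    """Follow well -> facility -> shutdown date across silo boundaries."""
    dates = set()
    for facility in objects(graph, well, "producesInto"):
        dates |= objects(graph, facility, "nextShutdown")
    return dates
```

Calling shutdowns_affecting(linked, "well:W-001") finds the maintenance shutdown that affects the well, whereas the same query against the drilling data alone returns nothing. The sketch assumes the silos already agree on identifiers; in practice, reconciling identifiers is likely to be a large part of the work.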
Use Case: Regulatory and Compliance Information
This use case is a bit speculative. We are aware that there is a tremendous amount of Semantic Web activity at the W3C in the general area of eGovernment, and in fact there are both a recently chartered Working Group and an Interest Group in this area. Again, a young working group is a great place to contribute use cases and requirements from our industry and to get back technical assistance.
The business driver in the industry would be more effective processes for understanding our complex regulatory environment as well as making compliance activities more effective and less expensive.
The speculative part of this is that we do not know at this time whether there is substantial activity in the eGovernment Semantic Web area in regulations that would affect our industry. If there is, there could be potential areas of cooperation.
Use Case: Big Data Analytics
Many big data analytics projects are working on unstructured or semi-structured data. Semantic Web information could provide a more sophisticated basis for analytics. The W3C is looking at big data issues in the cloud and linked data, so there is potential for interaction on this. The big technology companies (Oracle, IBM, Amazon, etc.) are all quite interested and active in this area.
A potential business driver might be to push analytics further down into our huge masses of real-time data in order to provide more sophisticated generation of events that can be used to optimize operations, both in terms of efficiency and safety.
Logistics and Action Items
In order to make any of these things happen, it is first necessary for the group to actually form under the rules of the W3C for Business Groups. In order to do this, five organizations must commit to joining the group. At present, as can be seen at the group Web site, only two organizations (Chevron and Statoil) have committed. However, of the nine meeting participants, seven expressed an intent to start the process necessary to make such a commitment, and there was a consensus of all participants that one way or another this would be done in the next 60 days. That is, as of April 15, if there are at least five committed organizations the group will actually form and start working, and if not ... well, that will be that. If the group is a "go", however, at that time we will discuss organizing another F2F at the SemTech Conference in June, establish a schedule for regular telcons and establish other communication processes.
Although the group cannot really form and start to do business until the membership issue is settled, there are nevertheless a couple of things we can do in the meantime. Since there was considerable interest expressed in finding out how the Health Care and Life Sciences Interest Group has added value in their industry in a similar activity, we will try to schedule an educational telcon on this subject in the next month or so. And we need to find out whether the W3C eGovernment activity has components that are involved with regulation of our industry.