Knowledge Ecosystem Task Force Proposal


D R A F T           for discussion                     submitted by Tim Clark 2/13/06


Problem Statement


Scientific knowledge discovery, publication and discourse can be understood as a knowledge ecosystem containing numerous lifecycle processes. Currently, information in this ecosystem is produced, moves within and is exchanged across public, corporate, private, institutional, and collaboration ownership spaces in the form of millions of semantically uncharacterized digital resources.


Scientists and health care providers increasingly rely on these resources to an extraordinary degree.


These digital resources can potentially be richly interconnected and contextualized in terms of one another. Establishing these interconnections is part of the process of creating, sharing, discussing, publishing and consuming new knowledge. However, such ecosystem process activities are not currently well supported by digital models, because information interconnections across processes in the “knowledge ecosystem” currently lack a complete machine-accessible semantic characterization.


For example, there is currently no widely recognized machine-accessible semantic differentiation between a manuscript and a publication; or between an illustration and experimental image data; or between an experiment, its data, the data interpretation, and the hypothesis the experiment was designed to validate.
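As a purely illustrative sketch of what such a differentiation might look like in machine-accessible form, the Python fragment below types each artifact and records typed links between artifacts. All names here (ArtifactType, Artifact, Link, related, and the relation strings) are hypothetical and are not drawn from any existing ontology or from the work of this task force; a real model would more likely be expressed in an ontology language than in application code.

```python
from dataclasses import dataclass
from enum import Enum


class ArtifactType(Enum):
    """Hypothetical artifact kinds, taken from the examples in the text."""
    MANUSCRIPT = "manuscript"
    PUBLICATION = "publication"
    ILLUSTRATION = "illustration"
    IMAGE_DATA = "image_data"
    EXPERIMENT = "experiment"
    DATA = "data"
    INTERPRETATION = "interpretation"
    HYPOTHESIS = "hypothesis"


@dataclass(frozen=True)
class Artifact:
    """A digital resource with an explicit, machine-readable kind."""
    identifier: str
    kind: ArtifactType


@dataclass(frozen=True)
class Link:
    """A typed relation from one artifact to another."""
    subject: Artifact
    relation: str  # hypothetical relation name, e.g. "produced"
    target: Artifact


def related(links, subject, relation):
    """Return the targets linked from `subject` by `relation`, in order."""
    return [
        link.target
        for link in links
        if link.subject == subject and link.relation == relation
    ]


# Illustrative instances only; identifiers are made up.
hypothesis = Artifact("h1", ArtifactType.HYPOTHESIS)
experiment = Artifact("e1", ArtifactType.EXPERIMENT)
data = Artifact("d1", ArtifactType.DATA)
interpretation = Artifact("i1", ArtifactType.INTERPRETATION)

links = [
    Link(experiment, "designed_to_test", hypothesis),
    Link(experiment, "produced", data),
    Link(interpretation, "interprets", data),
]
```

With explicit kinds and relations, a software agent could ask, for instance, which data an experiment produced by calling related(links, experiment, "produced"), rather than inferring the answer from untyped files.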


This problem exists across multiple scientific domains. We believe that solving the semantic characterization problem at the common level of knowledge processes can facilitate the organization and exchange of knowledge not only within domains but also across them. This will be particularly important in goal-oriented clinical research but can be expected to benefit the entire life science and health care community.

Mission and Scope


Our mission in this task force is to develop a collective understanding, and shared semantic models, of inter-related digital knowledge artifacts in the research process as they propagate through laboratories, academic communities, collaborations, research institutes, scientific publishers, bibliographic knowledge bases, and companies.


These models will be representations of common process elements in the ecosystem and their transitions of evidentiary support, accessibility, meaning, interconnectedness and ownership. They will function as a semantic interoperability layer between the many domain (and personal) ontologies in life science and health care by establishing agreed semantics of the knowledge processes in which the artifacts characterized by the domain ontologies are created, evolved and consumed.


A simple way of visualizing the mission of this group is to imagine the processes by which digital resources circulate in the knowledge ecosystem as horizontal layers connecting information across vertical domains of health care and life science research and practice areas. While each separate vertical domain may require its own specialized ontology, the semantics of the ecosystem process model(s) can connect the domains and provide an important means of interoperating across their ontologies.


The primary intent of developing common semantic models of the knowledge ecosystem processes is to accelerate scientific discovery. We hypothesize such models will achieve this by facilitating:


(1)    improved personal-, laboratory-, and research community-level organization and characterization of knowledge;

(2)    high-bandwidth knowledge interoperability between knowledge producers and consumers, based upon the ability to publish, interchange and share richly contextualized resources;

(3)    creation of fluid bridging and interoperability across domain and personal ontologies, using shared models; and

(4)    vastly improved capability of autonomous software agents to navigate and extract meaning from a developing corpus of semantically characterized digital resources.

Charter Relevance


The mission of this Task Force supports the following goals of HCLS SIG’s charter:


(1)    define core vocabularies that can bridge data and ontologies developed by individual communities of practice in HCLS

(2)    provide … descriptive vocabularies to better enable the integration and relationships among people, data, observations, software, collections of algorithms, and scholarly publications / clinical trials.

Statement of Objectives


1.        Prepare a set of general use cases defining the application of a knowledge process ontology to key research and practice areas of HCLS.

2.        Prepare a draft ontology supporting the use cases, with accompanying white paper.

3.        Conduct a comprehensive discussion on the draft ontology and use cases with input from multiple sources, resulting in revised drafts.

4.        Develop an open and diverse set of interoperating pilot applications to validate and refine the use cases and ontology.

5.        Recruit additional members and solicit active involvement from key players in the space, such as academic publishers, academic libraries, research institutes, individual laboratories, health care providers, and pharmaceutical companies.

6.        Task Force Final Report: a one-year assessment of development activities, lessons learned, etc. with recommendations for future activities, if any, to be undertaken by new task forces.

Tasks and Deliverables


1.        Initial Draft Use Case Document: March 30, 2006 – in consultation and collaboration with the other working groups.

2.        Initial Draft Process Ontology & Whitepaper: April 30, 2006.

3.        Revised Drafts of Use Cases and Process Ontology: June 30, 2006 – in consultation and collaboration with the other working groups.

4.        Pilot Applications: March 2006 through December 2007.


6.        Task Force Final Report: March 31, 2007.