Advertising and Discovering
Alan H. Karp, Kevin Smathers
Palo Alto, California
One of the key needs for businesses to operate effectively over the Internet is the ability to discover providers of the services they need. It is our position that the mechanisms used by existing marketplaces and ad hoc consortia do not fully meet this requirement because they lack the flexibility needed in the dynamic environment of the Internet. In particular, web-based services need to be able to describe themselves in new ways without undue delay. We believe that vocabularies are a representation of ontologies that is well suited to business needs. A framework that allows for easy creation, dissemination, and evolution of vocabularies, while accommodating industry standard vocabularies, better meets this need.
As business services move to the web, it becomes increasingly important to automate the way buyers and sellers find each other. Advertising at the Super Bowl is not going to attract software. Clearly, some sort of automated advertising and discovery mechanism is needed. However, blind searching on keywords results in too many false misses as well as too many false hits. What is needed is a context that provides semantic meaning to the search terms.
An ontology provides semantic content to an attribute, allowing parties to agree on the meaning of the value associated with that attribute. For example, in an electrical engineering ontology, GATES might be an integer denoting the number of electronic gates per square inch. In a landscape architecture ontology, GATES might be a string denoting the style of gate. There are many ways to represent ontologies. Examples include hierarchical taxonomies and vocabulary expressed as name-value pairs. We feel that the latter best meets the needs of e-businesses.
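As a minimal sketch of the GATES example, a vocabulary can be modeled as a mapping from attribute names to the type of value each one carries. The `Vocabulary` class, its `validate` method, and the sample values are hypothetical names introduced here for illustration; they are not part of any system described in this paper.

```python
from dataclasses import dataclass


@dataclass
class Vocabulary:
    """Hypothetical vocabulary: a name plus the value type of each attribute."""
    name: str
    attribute_types: dict

    def validate(self, description: dict) -> bool:
        """True if every attribute is known and its value has the declared type."""
        return all(
            attr in self.attribute_types
            and isinstance(value, self.attribute_types[attr])
            for attr, value in description.items()
        )


# The same attribute name carries different meaning in each vocabulary.
electrical = Vocabulary("electrical-engineering", {"GATES": int})
landscape = Vocabulary("landscape-architecture", {"GATES": str})

chip = {"GATES": 250000}           # illustrative value: gates per square inch
fence = {"GATES": "wrought iron"}  # illustrative value: style of gate

print(electrical.validate(chip))   # True
print(electrical.validate(fence))  # False
print(landscape.validate(fence))   # True
```

The point of the sketch is only that a name-value pair is meaningful relative to the vocabulary it is expressed in.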
Business interactions on the web use ontologies in a way that differs from many common uses, leading to a set of requirements on the infrastructure.
We have been discovering things on other computers as long as there have been networks, probably even before. The Internet and the growth of commerce on it have dramatically increased both the need for discovery and the difficulty in providing it. While a survey of all ontology efforts would be of value, it is beyond the scope of this position paper. However, understanding why these efforts fail to meet the needs of business being conducted on the Internet is important for understanding our position. We’ve picked some representative examples to illustrate the problem.
The earliest efforts attempted to index the entire web. The Alta Vista search engine provides a full text search of every web page it has indexed. A problem with this approach is that a query tends to return either no hits or 40,000, a phenomenon that has been facetiously called the Alta Vista effect. Yahoo also indexed the web, but with human indexers. However, searching it requires following a fixed hierarchy, something better suited to humans than to software. Other hierarchical indexing schemes, such as CoS Naming and LDAP, provide wildcarding that avoids some of these problems, but they don’t provide any special mechanisms to support ontologies. You can always make a particular node in the hierarchy an ontology node, with the points below it defined in terms of that ontology. Unfortunately, the very feature that makes it easier for software to search, wildcarding, makes it impossible to enforce the use of the ontology.
Object systems, such as CORBA or Jini, provide a means to find objects that implement specific interfaces. However, a business service is more than the interfaces it supports. Although the Jini technology family includes JavaSpaces, a very general search engine that could be used to provide more complex description and discovery, there is no specific support for ontologies. VerticalNet supports a large number of trading communities. Each has a specialized ontology to enable businesses within that community to find each other. However, there is only one ontology per community, so additions must wait for approval from a central authority.
More recent efforts to provide description and discovery frameworks are UDDI and ebXML. UDDI Version 1 allows businesses to describe themselves in one or more standard taxonomies, such as NAICS and UNSPSC. However, these taxonomies are not flexible enough for all situations. For example, it is difficult to decide if the category “computer service” is for hardware or software. Version 2 of UDDI, currently being designed, allows new taxonomies to be introduced. However, allowing the description of a business service in detail is not one of UDDI’s goals. It is only intended to provide a first level filter; further discrimination is done in direct communication with the service provider. Another standard in the making, ebXML, is similar to UDDI Version 2 in that it allows categorization in different taxonomies. However, ebXML includes more of the other aspects of doing business on the web than does UDDI.
E-speak was designed for doing business in the dynamic environment of the Internet. Business services in an e-speak environment are constructed by specifying the job that needs to be done rather than how the job is to be done. Thus, a business process that needs a billing service describes the properties it is looking for rather than naming a specific billing service. Hence, a rich, flexible description and discovery mechanism was critical to making e-speak useful. It’s not surprising that e-speak incorporates many of the features needed for discovery of business services on the web. Our position is strongly influenced by our experience with e-speak, both in understanding the most valuable features of e-speak vocabularies as well as in knowing what extensions would be most useful.
We believe that businesses will want to use industry standard ontologies, but they need a way to meet special needs on a time scale shorter than these standard ontologies can be updated. We propose a framework for defining vocabularies to meet this need. A vocabulary is a particular representation of an ontology that allows a business service to be described as attribute-value pairs. Further, a vocabulary can be advertised as a business service itself in another vocabulary. A very simple base vocabulary understood by all is needed to ground the recursion.
This mechanism is best illustrated by an example. Say that I am interested in finding a billing service to incorporate into my business processes. I find general business services vocabularies by doing a lookup in the base vocabulary. I use those vocabularies to find the ones related to business processes. I can then look for the exact service I’m interested in. Note that I may turn up more vocabularies, which I can then use to extend my search. More complex cases involving detailed business services might well go through more levels. However, at the end of the process, I will have found a vocabulary that is rich enough to describe the services in sufficient detail. Of course, the vocabulary creator decides how much detail is sufficient. If a business finds that it needs an extension, all it needs to do is create a new vocabulary and advertise it in the previous vocabulary.
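The lookup chain in this example can be sketched as follows. This is an illustration under stated assumptions, not the e-speak API: the `Advertisement` and `Registry` classes, the vocabulary names, the attribute names, and the `AcmeBilling` service are all hypothetical. Matching is plain equality on attribute values.

```python
from dataclasses import dataclass


@dataclass
class Advertisement:
    vocabulary: str   # vocabulary the description is expressed in
    attributes: dict  # attribute-value pairs describing the target
    target: str       # name of the advertised service or vocabulary


class Registry:
    """Hypothetical in-memory store of advertisements."""

    def __init__(self):
        self.ads = []

    def advertise(self, ad):
        self.ads.append(ad)

    def lookup(self, vocabulary, constraints):
        """Return ads in `vocabulary` whose attributes equal every constraint."""
        return [
            ad for ad in self.ads
            if ad.vocabulary == vocabulary
            and all(ad.attributes.get(k) == v for k, v in constraints.items())
        ]


registry = Registry()
# A vocabulary is itself advertised in another vocabulary; "base" grounds the recursion.
registry.advertise(Advertisement("base", {"topic": "business services"}, "general-business"))
registry.advertise(Advertisement("general-business", {"area": "business processes"}, "billing-vocab"))
registry.advertise(Advertisement("billing-vocab", {"currency": "USD", "cycle": "monthly"}, "AcmeBilling"))

# Step 1: find general business vocabularies through the base vocabulary.
step1 = registry.lookup("base", {"topic": "business services"})
# Step 2: use one of them to find vocabularies about business processes.
step2 = registry.lookup(step1[0].target, {"area": "business processes"})
# Step 3: describe the desired service in the vocabulary that was found.
services = registry.lookup(step2[0].target, {"cycle": "monthly"})
print([ad.target for ad in services])  # ['AcmeBilling']
```

Under this model, extending a vocabulary amounts to one more `advertise` call that names the new vocabulary as a target in the existing one.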
The vocabulary framework must support certain features that fall into 4 categories.
Advertising and searching
Evolution of vocabularies
Creation and control
This last point needs some explanation. Consider a string-valued attribute. What constitutes a match on a “less than” query? Is it collating sequence? In what language? Is it substring? Case sensitive? Starting at the beginning of the string? The creator of the vocabulary has the semantic knowledge to understand the meaning of the attributes and answer these questions, and the answers may be different for different attributes.
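One way to picture creator-defined matching is a table that binds each attribute to its own comparison function. The sketch below is an assumption made for illustration: the attribute names `company_name` and `part_number`, and both matching rules, are hypothetical choices, not rules taken from any existing vocabulary.

```python
from typing import Callable, Dict


def case_insensitive_collation(candidate: str, query: str) -> bool:
    """'Less than' means earlier in case-insensitive collating order."""
    return candidate.casefold() < query.casefold()


def proper_prefix(candidate: str, query: str) -> bool:
    """'Less than' means a case-sensitive proper prefix of the query string."""
    return candidate != query and query.startswith(candidate)


# The vocabulary creator chooses the rule attribute by attribute.
LESS_THAN: Dict[str, Callable[[str, str], bool]] = {
    "company_name": case_insensitive_collation,
    "part_number": proper_prefix,
}


def less_than(attribute: str, candidate: str, query: str) -> bool:
    """Apply the matching rule the vocabulary defines for `attribute`."""
    return LESS_THAN[attribute](candidate, query)


print(less_than("company_name", "acme", "Zenith"))    # True
print("acme" < "Zenith")                              # False: raw code-point order
print(less_than("part_number", "AB-12", "AB-1234"))   # True
print(less_than("part_number", "ab-12", "AB-1234"))   # False
```

The second line of output shows why a single built-in string comparison is not enough: the same query gives a different answer once the attribute's intended semantics are applied.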
Once such a general vocabulary mechanism is in place, it can be used for other purposes. Events can be defined in terms of a vocabulary. Publishers and subscribers can find each other by advertising in the event’s vocabulary. Event state can be specified as attribute values in the vocabulary, and subscription filters can be specified as constraint expressions in the vocabulary. A vocabulary can also form the basis of online negotiation and contract formation. Each multi-valued attribute can be treated as a clause in a contract, and negotiation can proceed to settle on values in the contract.
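The event use case might look like the following sketch, in which event state is a set of attribute values and a subscription filter is a list of constraints over those attributes. The `(attribute, operator, value)` triple format, the `matches` function, and the sample event are assumptions made here for illustration.

```python
import operator

# Comparison operators allowed in a constraint expression (hypothetical set).
OPS = {
    "==": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def matches(event_state: dict, subscription_filter: list) -> bool:
    """True if the event state satisfies every (attribute, op, value) constraint."""
    return all(
        attr in event_state and OPS[op](event_state[attr], value)
        for attr, op, value in subscription_filter
    )


# Event state expressed as attribute values in a hypothetical pricing vocabulary.
price_drop = {"item": "toner", "price": 42.0, "region": "US"}

# Subscription filters expressed as constraints in the same vocabulary.
cheap_toner = [("item", "==", "toner"), ("price", "<", 50.0)]
eu_only = [("region", "==", "EU")]

print(matches(price_drop, cheap_toner))  # True
print(matches(price_drop, eu_only))      # False
```

Because publisher and subscriber share the vocabulary, both sides agree on what `price` and `region` mean without any further coordination.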