Multi-service search services, or meta-services, collate results from many different search services, such as Lycos or Alta Vista. One of the many challenges such meta-services face is that each base service represents the contents of its database in a different manner. To compensate, meta-services must employ a variety of heuristics and custom code to collate information in a manner appropriate for users. This approach is problematic for two reasons. It is not robust to changes in the base services' representations, and it is wasteful: the meta-engine must often compute information about the data, possibly by downloading it over a congested network, that the base service could have provided.
Most global Web search services use a confidence score as their only indication of relevance to the user's query. This score is just a number; most services use integers in the range [0 .. 1000], with 1000 being a "perfect match." Meta-search engines typically normalize these scores and rank each result by the sum of its normalized scores. This method is flawed because one service's notion of a "high score" can differ dramatically from another's. For example, given the query "Used Car," one service may give a high score because the word "Car" appears in the title, whereas another will give an equally high score because "Used Car" appears somewhere in the body text.
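The normalize-and-sum ranking described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the method of any particular meta-search engine: the function name `merge_by_summed_score`, the input layout, and the fixed `MAX_SCORE` of 1000 are all hypothetical.

```python
from collections import defaultdict

# Assumption: every base service reports integer scores in [0 .. 1000].
MAX_SCORE = 1000


def merge_by_summed_score(results_by_service):
    """Rank URLs by the sum of their normalized per-service scores.

    results_by_service maps a (hypothetical) service name to a list of
    (url, raw_score) pairs as returned by that service.
    """
    totals = defaultdict(float)
    for hits in results_by_service.values():
        for url, raw_score in hits:
            # Normalize to [0, 1], then accumulate across services.
            totals[url] += raw_score / MAX_SCORE
    # Highest total first; ties are broken alphabetically by URL.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = {
        # Assumed to score highly when "Car" appears in the title.
        "service_a": [("http://example.com/car-history", 900),
                      ("http://example.com/used-cars", 400)],
        # Assumed to score highly when "Used Car" appears in the body.
        "service_b": [("http://example.com/used-cars", 900),
                      ("http://example.com/car-wash", 300)],
    }
    for url, total in merge_by_summed_score(sample):
        print(f"{total:.2f}  {url}")
```

The sketch makes the weakness visible: the two scores of 900 are added as if they measured the same thing, although each service arrived at its score by a different criterion. Nothing in the bare number lets the meta-engine tell the two cases apart.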
What is needed is a richer formulation of the results returned by search services. This representation should include things such as:
These features, and undoubtedly others, are needed to enable meta-search services to perform as well as they can. Without this information, meta-search engines must either infer the data, which wastes computation time; download the page and extract the information, which wastes network bandwidth; or do without, which produces less than optimal results. The obvious solution is to create a standard representation that allows search services to convey as much information as possible about their results to their users, be they human or artificial.