Service Discovery in Massive Scale Federations; Why Look Beyond the Web?

BT Position Paper

Joint W3C/OMG Workshop on Distributed Objects and Mobile Code, June 24-25, 1996

Summary

The marriage between distributed object technology (the real men) and Web technology (the earth mothers) has been announced. From the groom's point of view, preparations for the marriage will be complete once he's taught the bride how to speak IIOP, learnt a bit of HTTP to make her comfortable and finished work on some important object services in the shed at the bottom of the OMG. However, by some mysterious organic process, she has amassed knowledge of all things. Whatever question he has, he must discover the spell that will draw out the answer. He needs to get in touch with her inner feelings. Consummation will be delayed until this chauvinistic gap is bridged.

It is fashionable to criticise how well the Web achieves this or that goal, then invent a "proper" service to do it better. It is tempting to suppose that technology for structured applications, drawn from the traditions of the distributed object world, should be inserted into the Web on the day of the marriage. This paper aims to show that, in the field of service discovery, the Web deserves a long hard look before we click on the button marked "Fixed in the Next Release". Accordingly, no new technology is proposed, merely some optimistic thoughts leading to the insight that the Web is a "proper" system for resource discovery (well, nearly).

Background

The ideas in this paper stemmed from a short exercise: describing the steps required for service discovery in terms familiar to the distributed object community, then comparing them with how they are implemented in the Web. The big assumption (cheat) was that data designed to be human-readable is equivalent to machine-readable data.

Insights

A hyper-link is a name-to-address mapping, the name being the underlined text (within the context of the title of the Web page on which it sits). A page of hyper-links is therefore a name server (albeit unidirectional), and the Web is therefore a massive scale federation of name servers.
Natural language descriptions are a major step towards machine-readable classification systems (type repositories) and point to the need for classification in many alternative contexts.
A large number of Web pages, news postings and other Internet resources list related hyper-links surrounded by text describing the resources pointed to. Therefore the Internet is a massive scale federated trader database from the point of view of both services (export) and clients (import).
Some Web pages list alternative hyper-links to essentially the same item, but in different formats, on different mirror sites, etc. Therefore parts of the Web already act as locators.
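The first insight can be made concrete with a small sketch. The Python below is only an illustration, not a proposed service: it treats a single page as a unidirectional name server, with the page title as the naming context and each piece of link text bound to the address in its href. The class name LinkNameServer, its resolve method, the sample page and the example.com and example.org addresses are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkNameServer(HTMLParser):
    """Treat one Web page as a unidirectional name server (hypothetical sketch)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""      # naming context for every binding on the page
        self.bindings = {}   # link text (name) -> absolute URL (address)
        self._in_title = False
        self._href = None    # address of the anchor currently open, if any
        self._text = []      # text fragments collected inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Relative addresses are resolved against the page's own URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._href is not None:
            # Normalise whitespace so the name matches what a reader sees.
            name = " ".join("".join(self._text).split())
            if name:
                self.bindings[name] = self._href
            self._href = None

    def resolve(self, name):
        """Return the address bound to a name, or None if the page has no such link."""
        return self.bindings.get(name)


# Hypothetical page standing in for any page of hyper-links.
page = """<html><head><title>Object Services</title></head><body>
<a href="/naming.html">Naming</a> <a href="http://example.org/trader">Trading</a>
</body></html>"""

server = LinkNameServer("http://example.com/index.html")
server.feed(page)
server.close()
print(server.title, server.resolve("Naming"))
```

Resolution runs in one direction only, from name to address, which matches the "albeit unidirectional" caveat above. Federation would correspond to fetching a resolved address and repeating the exercise on the page found there.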

At this point the reader is probably thinking, "Well, isn't that obvious?" and probably also thinking, "But the assumption that human-readable equals machine-readable is just plain wrong". The very fact that these insights follow from adopting just one assumption suggests that examining that assumption should itself provide further insight.

Web data and metadata are currently jumbled together throughout the Web, which is fine for humans, but difficult (though not impossible) for machines to interpret.
The discovery service designs most likely to succeed will be those able to overlay structure on the metadata embedded in the human-readable data that makes up the Web today, by mimicking human perception. Put another way, there is no point designing discovery services for massive scale systems that only work with structured data.
As authoring systems that structure metadata become more usable and prevalent, authors' self-interest will ensure rapid take-up, because the resulting information will be more discoverable for the world at large.
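As a rough illustration of overlaying structure on human-readable data, the sketch below treats every list item that contains a hyper-link as a trader offer: the link address is the service, and the words around it are its description. It is a deliberately naive stand-in for "mimicking human perception", and it assumes one resource per list item. The names OfferExtractor and import_offers, the sample page and the example.org addresses are hypothetical.

```python
from html.parser import HTMLParser


class OfferExtractor(HTMLParser):
    """Collect (description words, address) offers, one per <li> that contains a link."""

    def __init__(self):
        super().__init__()
        self.offers = []     # list of (set of description words, href)
        self._words = None   # words seen in the current <li>, or None outside one
        self._href = None    # first href seen in the current <li>

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._close_item()   # tolerate unclosed <li>, common in hand-written HTML
            self._words = []
        elif tag == "a" and self._words is not None and self._href is None:
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._words is not None:
            # Strip common punctuation and fold case so words compare cleanly.
            self._words.extend(w.strip(".,;:()").lower() for w in data.split())

    def handle_endtag(self, tag):
        if tag in ("li", "ul", "ol"):
            self._close_item()

    def _close_item(self):
        # Export the item as an offer only if it actually pointed somewhere.
        if self._words is not None and self._href:
            self.offers.append((set(self._words) - {""}, self._href))
        self._words, self._href = None, None


def import_offers(offers, query):
    """Return addresses whose description contains every word of the query."""
    wanted = set(query.lower().split())
    return [href for words, href in offers if wanted <= words]


# Hypothetical page listing the same item in two formats on two mirrors.
page = """<ul>
<li><a href="http://example.org/ps/paper.ps">Position paper</a> (PostScript, UK mirror)
<li><a href="http://example.org/html/paper.html">Position paper</a> (HTML, US mirror)
</ul>"""

extractor = OfferExtractor()
extractor.feed(page)
extractor.close()
print(import_offers(extractor.offers, "paper postscript"))
```

Plain keyword matching is used here purely for brevity. Anything closer to human perception, such as recognising that "PostScript" names a format while "UK mirror" names a location, is exactly the harder problem the text describes.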

Conclusion

Trying to design a "proper" name server, trader, locator or type repository for a massive scale federated system won't work if all the data that will populate it is "left as an exercise for the user". At present, the distributed object community has all the proper (but hollow) objects, while the Web has all the data to populate them (somewhere). Therefore the problem is not just to add metadata structure to content, but primarily to learn how to extract structure from the unstructured data that will both continue to exist, and continue to be created.

Bob Briscoe, BT
last modified on 25 March, 1996
