Meeting on Web Efficiency and Robustness

Cambridge, Massachusetts, USA, April 1996

The workshop was organized by the World Wide Web Consortium and hosted by Digital Equipment Corporation's Cambridge Research Laboratory.


American Internet Corporation Andrew Sudduth  
Bellcore Paul Lin  
Bellcore Mike Little Presenter
Bellcore Vivek Ratan  
Carnegie Mellon University Mahadev Satyanarayanan  
CyberCash, Inc. Donald Eastlake  
DEC Tony Dahbura  
DEC SRC Bill Weihl  
DEC/W3C Jim Gettys Host
FTP Software Harald Skardal Trip report
Harvard University Margo Seltzer Presenter
HP Labs, Hewlett-Packard Ltd. Andy Norman  
Ipsilon Greg Minshall Additional notes
Iris Associates Steve Beckhardt Presenter
ISI Bob Braden  
Lawrence Berkeley National Laboratory Van Jacobson  
Merit Network, Inc. Larry Blunk  
Microsoft Corporation Butler Lampson  
Microsoft Corporation Paul Leach  
MIT Dave Karger  
MIT Liuba Shrira  
MIT/LCS Dave Clark  
MIT/LCS Greg Ganger  
MIT/LCS Anthony Joseph  
MIT/LCS Barbara Liskov  
Open Market, Inc. John R. Ellis Presenter
PARC/UCLA Lixia Zhang  
University of Washington Brian Bershad  
W3C Henrik Frystyk Nielsen Note taker
W3C Phillip Hallam-Baker Note taker
W3C Dave Raggett  
Xerox PARC Steve Deering  
Xerox PARC Mike Spreitzer Presenter


Jim Gettys welcomed everybody by introducing the scope of the discussion.

Short Presentations

The morning consisted of a set of short presentations, which left room for more focused discussion in the afternoon.

Steve Beckhardt: Experiences from Lotus Notes

One of the main differences between Lotus Notes and the Web is that Notes was not designed to scale to the same degree as the Web. One of the scaling issues is the problem of software installation and maintenance.

A key part of Lotus Notes is a centralized database for public keys. The centralized structure prevents proper scaling, and the architecture is being redesigned to support a decentralized database model. Another point where scaling is an issue is the Notes directory service.

Notes does use replication and caching, but only at the document level, not at the database level. In the Notes model, no single replica knows the total topology of the system. Whether or not it is an advantage to know the full topology was discussed further in the afternoon.

Steve also gave some thoughts on locking in a distributed system. The VMS model, which includes locking, can cause synchronization problems, and systems based on locking are hard to debug. There was some discussion of whether locking belongs in the protocol layer or whether it is part of the application. The common feeling was that the main part belongs in the application layer, not in the protocol layer.

One of the strengths of the Notes architecture is the layering of the APIs into three levels:

This ensures high portability across a large set of platforms.

Steve mentioned that the current time precision in HTTP, which is measured in seconds, is not good enough in a distributed system. Higher precision is often required, and he suggested that milliseconds would be a good alternative.
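As a small illustration of the problem (a sketch using Python's standard date utilities, which are not part of HTTP itself), two versions of a resource modified within the same second serialize to identical second-granularity timestamps, so a cache validator cannot tell them apart:

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical example: two modifications of the same resource,
# 300 ms apart, falling inside the same wall-clock second.
t1 = datetime(1996, 4, 19, 12, 0, 0, 100_000, tzinfo=timezone.utc)
t2 = datetime(1996, 4, 19, 12, 0, 0, 400_000, tzinfo=timezone.utc)

# HTTP dates (RFC 1123 style) carry whole seconds only, so both
# versions produce the same Last-Modified header value.
h1 = format_datetime(t1, usegmt=True)
h2 = format_datetime(t2, usegmt=True)

assert t1 != t2   # the versions are genuinely different ...
assert h1 == h2   # ... but second-granularity timestamps cannot show it
```

With millisecond precision, as suggested, the two timestamps would differ and validation could distinguish the versions.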

Another key element in the Notes model is access control. Access control is very hard to get right while keeping it simple. Notes currently has a very flexible but also very complex model that has evolved over the years. It was not clear whether the Notes solution would port to the Web.

Margo Seltzer: Experiences from Cooperative Caches

One of the most fascinating properties of the Web (and the main reason for this workshop) is that the traffic keeps growing exponentially. A problem is that no one is collecting global data anymore and hence nobody really knows how stable the Net is - is it as stable as we think it is? Some of the main results of Margo's studies of server and proxy log files show that:

Margo's model exploits the fact that Web servers have a lot of information about which documents are popular and where they are requested from. The servers use this information to hand off sets of documents to cooperative caches, which then "own" each set until it expires or gets flushed from the cache. Cooperative caches flatten out a more hierarchical cache structure and instead generate a dynamic tree structure. This generally leads to fewer hops to reach the data. Margo mentioned that they had developed an algorithm to guarantee convergence.
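The handoff mechanism described above can be sketched as follows. This is a toy illustration of the idea, not Margo's actual design: all class and field names here are invented, and the real system's popularity tracking and convergence algorithm are not shown.

```python
import time

class CooperativeCache:
    """Toy cache that 'owns' a set of documents until a deadline."""
    def __init__(self, name):
        self.name = name
        self.owned = {}          # url -> (content, expiry timestamp)

    def accept_handoff(self, docs, ttl):
        """Take ownership of a document set handed off by a server."""
        expiry = time.time() + ttl
        for url, content in docs.items():
            self.owned[url] = (content, expiry)

    def lookup(self, url):
        entry = self.owned.get(url)
        if entry is None:
            return None          # not owned: forward toward the origin
        content, expiry = entry
        if time.time() >= expiry:
            del self.owned[url]  # flushed when the ownership expires
            return None
        return content

class OriginServer:
    """Server that tracks request counts and hands off popular sets."""
    def __init__(self, docs):
        self.docs = docs
        self.hits = {url: 0 for url in docs}

    def popular_set(self, threshold):
        """Documents requested often enough to be worth handing off."""
        return {u: self.docs[u] for u, n in self.hits.items()
                if n >= threshold}
```

A server would periodically compute `popular_set(...)` and call `accept_handoff(...)` on a nearby cooperative cache, which then answers requests for that set directly.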

A main problem with cooperative caches is that people in general don't want to help those downstream. However, as Margo noted, this is not really the situation, as the resources taken by a cooperative cache are in fact gained back elsewhere in the system. Providing a cache service, and hence better access times, may also be a business advantage, and is therefore a role that ISPs could take on.

Mike Little: Parameterizing Quality of Service on the Web

Is Having Web Access Good Enough? (is there more to life on the Web?)

Discriminating Factors

Why Parametric Measures For Services?

Quality Of Service Parametrization

What is QoS? The current situation is "best effort". The phone system, by contrast, is designed for a specific quality of service.

Parametric measures for services are needed in order to support robustness, evaluation, and management. Some possible metrics are: latency, throughput, availability, integrity, accountability, confidentiality, non-repudiation, authentication, access control, capacity, consistency, precedence, and compatibility. Price is also important.

Mechanism Impacts

Caching improves some things but worsens others. Most people are looking to improve latency and throughput.

How dynamic are the parameters, and how can they be measured? Applications can adapt to what they can get, so the service does not have to be complete at all times.

Henrik: The importance of these metrics is context dependent; one may sometimes be willing to trade one parameter against another, for example latency for confidentiality.

PHB: It is important for applications (i.e. computers) to know what quality of service they are obtaining, and to be able to choose the level they need. (For example, in the Lotus Notes system, the password system needs up-to-date, current information, while document distribution is less critical.)

Dave Clark: We need an interface that allows applications to know what quality they are getting, to specify what they want, and to state what they are prepared to pay for it. At present, systems merely react to the environment they are in; they do not select their environment.
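The kind of interface Dave Clark describes, where an application states the quality it needs and the price it will pay, might be sketched like this. This is purely illustrative; no such API was specified at the workshop, and every name and parameter below is an assumption:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class QoSRequest:
    """What the application wants; None means 'don't care'."""
    max_latency_ms: Optional[float] = None
    min_throughput_kbps: Optional[float] = None
    confidentiality: bool = False
    max_price: Optional[float] = None

@dataclass(frozen=True)
class ServiceOffer:
    """What a service level actually provides, and at what price."""
    latency_ms: float
    throughput_kbps: float
    confidential: bool
    price: float

def acceptable(req, offer):
    """Would this offer satisfy every stated requirement?"""
    if req.max_latency_ms is not None and offer.latency_ms > req.max_latency_ms:
        return False
    if (req.min_throughput_kbps is not None
            and offer.throughput_kbps < req.min_throughput_kbps):
        return False
    if req.confidentiality and not offer.confidential:
        return False
    if req.max_price is not None and offer.price > req.max_price:
        return False
    return True

def choose(req, offers):
    """Select the environment instead of reacting to it: pick the
    cheapest offer that meets the request, or None if none does."""
    ok = [o for o in offers if acceptable(o if False else req, o)] if False else \
         [o for o in offers if acceptable(req, o)]
    return min(ok, key=lambda o: o.price) if ok else None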

Possible Next Steps

John Ellis: Can the Web be used for Commerce?

How to get from here to there.

The original topic for this presentation was "mobile and disconnected use", but it quickly took a more commerce-oriented direction with John's experiences from Open Market's plans for making money on the Web. Currently, the Web is too slow and too unstable for any serious commercial use.

We are the elite with fast access to the Web. 90% of corporate employees are in offices of 100 people or fewer, and companies do not have T1 lines to every office. 20% of Xerox workers have only laptops. How do they use the Web?

Open Market is headed towards a position where it can sell its products on the Web, but currently there is almost no commerce done on the Web (last year saw $3 trillion of commerce overall, of which $60 million was on the Web). One of the main problems is that it is hard to provide content rather than individual Web pages, the main reason being that the Web is not semantically specified. Content providers depend on consistent content, not single Web pages.

If you are using the Web seriously, you are not surfing; you have a fixed, persistent relationship. At present the infrastructure for this does not exist; Open Market has a cache which pulls down related information each night. Users can thus run offline, but the main interest is desktops with slow links.

Open Market's solution for speeding up Web access is a tool that handles time-shifted retrieval, for example at night when bandwidth is available. Together with Pathfinder, Open Market offers offline services with high performance, because the local cache is used. One of John's main claims was that it is more important to have consistent content than a transparent cache. For example, batch downloads handle dynamic content by taking a snapshot and then using it in the local cache. The end user, and especially the content provider, must be able to control the content of the cache. John expects batch download to happen at large scale within the next 6 months.
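The snapshot idea above can be sketched as follows. This is a minimal illustration, not Open Market's tool: the function names are invented, and `fetch` stands in for a real HTTP client so the sketch stays self-contained.

```python
import time

def snapshot(urls, fetch, cache):
    """Time-shifted batch retrieval: take one consistent snapshot of a
    set of related pages (e.g. overnight, when bandwidth is available)
    and record when it was taken.  Dynamic content is frozen at this
    moment, which is exactly the 'snapshot' behavior described."""
    taken_at = time.time()
    for url in urls:
        cache[url] = (fetch(url), taken_at)
    return taken_at

def serve(url, cache, fetch):
    """Serve from the local snapshot if present; otherwise fall back
    to a live fetch (for users who are still online)."""
    if url in cache:
        content, _ = cache[url]
        return content
    return fetch(url)
```

The key property is consistency: every page in the set comes from the same moment in time, even if the live content changes afterwards.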

Mike Spreitzer: ILU, an OO RPC System

The Inter-Language Unification system (ILU) is a multi-language object interface system. The object interfaces provided by ILU hide implementation distinctions between different languages, between different address spaces, and between operating system types. ILU can be used to build multi-lingual object-oriented libraries ("class libraries") with well-specified, language-independent interfaces, to implement distributed systems, and to define and document interfaces between the modules of non-distributed programs. ILU interfaces are specified in ILU's Interface Specification Language.

Discussion Topics

The afternoon was open for general discussion. It turned out that there was (with a few exceptions) a large reluctance to discuss favorite CS topics; in this category were locking, naming in general, and export control of crypto technology.

How robust is the Internet?

What is the feasible design scope?

How easy is it to change the Web?

Caching and Replication

How common is multicast? It seems that the current MBONE is growing exponentially, even faster than the rest of the Internet. All current router implementations have multicast capability, but it is often not turned on. Maybe all will turn it on within a year. Many see it as a value-added service and want to find a model for charging for it.



What data do we need to know in order to go on?

Action items

The only solid action item was for Margo to try to find information about the transatlantic link (and the links to Australia and New Zealand, for that matter).

Henrik Frystyk Nielsen,
Additional comments by Phillip M. Hallam-Baker,
@(#) $Id: 960419_Notes.html,v 1.5 1997/10/31 19:27:22 frystyk Exp $