The workshop was organized by the World Wide Web Consortium and hosted by Digital Equipment Corporation's Cambridge Research Laboratory.
|Organization|Name|Role|
|American Internet Corporation|Andrew Sudduth||
|Carnegie Mellon University|Mahadev Satyanarayanan||
|FTP Software|Harald Skardal|Trip report|
|HP Labs, Hewlett-Packard Ltd.|Andy Norman||
|Lawrence Berkeley National Laboratory|Van Jacobson||
|Merit Network, Inc.|Larry Blunk||
|Open Market, Inc.|John R. Ellis|Presenter|
|University of Washington|Brian Bershad||
|W3C|Henrik Frystyk Nielsen|Note taker|
Jim Gettys welcomed everybody by introducing the scope of the discussion.
The morning consisted of a set of short presentations, which left room for more focused discussion in the afternoon.
One of the main differences between Lotus Notes and the Web is that Notes
was not designed to scale to the same degree as the Web. One of the scaling
issues is the problem of software installation and maintenance.
A key part of Lotus Notes is a centralized database for public keys. The
centralized structure prevents proper scaling, and the architecture is being
redesigned to support a decentralized database model. Another point where
scaling is an issue is the Notes directory service.
Notes does use replication and caching, but only at the document level, not
at the database level. In the Notes model, no single replica knows the
total topology of the system. Whether it is an advantage to know the full
topology or not was discussed further in the afternoon.
Steve also gave some thoughts on locking in a distributed system. The VMS
model, which includes locking, can cause problems with synchronization, and
systems based on locking are hard to debug. There was some discussion about
whether locking belongs in the protocol layer or whether it is part of the
application. The common feeling was that the main part belongs in the
application layer and not in the protocol layer.
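The application-layer approach favored in the discussion can be illustrated with a small sketch of optimistic concurrency control, where the application detects write conflicts via a version number instead of relying on protocol-level locks. The class and method names here are illustrative, not from the workshop:

```python
class VersionConflict(Exception):
    """Raised when a writer's snapshot has gone stale."""

class Document:
    def __init__(self, body=""):
        self.body = body
        self.version = 0  # incremented on every successful write

    def read(self):
        # Readers take a snapshot of the version alongside the data.
        return self.body, self.version

    def write(self, new_body, expected_version):
        # Application-layer check: refuse the write if another writer
        # got in first, instead of holding a protocol-level lock.
        if expected_version != self.version:
            raise VersionConflict(
                f"expected v{expected_version}, store is at v{self.version}")
        self.body = new_body
        self.version += 1

doc = Document("draft")
body, v = doc.read()
doc.write(body + " + edit A", v)      # succeeds; version becomes 1
conflict = False
try:
    doc.write(body + " + edit B", v)  # stale version 0: rejected
except VersionConflict:
    conflict = True
```

The conflicting writer simply re-reads and retries at the application level; no protocol layer needs to track locks or debug lock ownership.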
One of the strengths of the Notes architecture is the layering of the APIs into three levels:
This ensures high portability over a large set of platforms.
Steve mentioned that the current time precision in HTTP, which is measured in seconds, is not good enough in a distributed system. Often higher precision is required, and he suggested that milliseconds would be a good alternative.
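The precision problem can be seen directly: an HTTP-date (RFC 1123 format, as used in `Last-Modified`) carries only whole seconds, so two updates within the same second are indistinguishable, while millisecond timestamps still tell them apart. A minimal sketch, with timestamps chosen arbitrarily:

```python
from email.utils import formatdate

# Two updates 50 ms apart, within the same second.
t1 = 1000000000.000
t2 = t1 + 0.050

# HTTP-date carries only whole seconds, so both updates
# produce the same Last-Modified value...
http_date_1 = formatdate(t1, usegmt=True)
http_date_2 = formatdate(t2, usegmt=True)
same_at_second_precision = (http_date_1 == http_date_2)

# ...while millisecond timestamps still distinguish them.
ms_1 = int(t1 * 1000)
ms_2 = int(t2 * 1000)
```

A cache comparing only HTTP-dates would wrongly treat the second update as unchanged; this is the validation hazard behind the suggestion.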
Another key element in the Notes model is access control. Access control
is very hard to get right and at the same time keep simple. Notes currently
has a very flexible but also very complex model that has evolved over
the years. It was not clear whether the Notes solution would port to the Web.
One of the most fascinating properties of the Web (and the main reason for this workshop) is that the traffic keeps growing exponentially. A problem is that no one is collecting global data anymore, and hence nobody really knows how stable the Net is. Is it as stable as we think it is? Some of the main results of Margo's studies of server and proxy log files show that:
Margo's model exploits the fact that Web servers have a lot of information about which documents are popular and where they are requested from. The servers use this information to hand off sets of documents to cooperative caches, which then "own" each set until it expires or gets flushed from the cache. Cooperative caches flatten out a more hierarchical cache structure and instead generate a dynamic tree structure. This will in general lead to fewer hops in order to get to the data. Margo mentioned that they had developed an algorithm in order to guarantee convergence.
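The hand-off idea can be sketched in a few lines: the server tracks request counts, then leases its most popular documents to a cooperating cache until an expiry time. This is an illustrative reading of the model, not Margo's actual algorithm (in particular, the convergence guarantee is not shown):

```python
import time

class Server:
    """Origin server that tracks popularity and hands off
    document sets to cooperating caches (illustrative sketch)."""
    def __init__(self, documents):
        self.documents = documents                    # name -> body
        self.hits = {name: 0 for name in documents}   # popularity counters

    def record_request(self, name):
        self.hits[name] += 1

    def hand_off(self, cache, top_n=2, ttl=3600):
        # Lease the most popular documents to the cache, which
        # "owns" them until the lease expires.
        popular = sorted(self.hits, key=self.hits.get, reverse=True)[:top_n]
        expiry = time.time() + ttl
        for name in popular:
            cache.accept(name, self.documents[name], expiry)
        return popular

class CooperativeCache:
    def __init__(self):
        self.owned = {}   # name -> (body, lease expiry)

    def accept(self, name, body, expiry):
        self.owned[name] = (body, expiry)

    def lookup(self, name):
        entry = self.owned.get(name)
        if entry and entry[1] > time.time():
            return entry[0]   # served without touching the origin
        return None           # not owned (or lease expired)

server = Server({"/a": "A", "/b": "B", "/c": "C"})
for _ in range(5): server.record_request("/a")
for _ in range(3): server.record_request("/b")
server.record_request("/c")

cache = CooperativeCache()
handed = server.hand_off(cache)   # the two most popular documents
```

Requests for the leased documents now terminate at the cache, which is how the dynamic tree reduces hop counts relative to a fixed hierarchy.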
A main problem with using cooperative caches is that people in general don't want to help people down the river. However, as Margo mentioned, this is not really the situation, as the resources taken by a cooperative cache are in fact gained somewhere else in the system. Also, providing a cache service, and thereby better access times, may be a business advantage, and therefore a role which can be taken on by the ISPs.
Is Having Web Access Good Enough? (is there more to life on the Web?)
What is QoS? The current situation is "best effort". The phone system, in contrast, is designed for a specific quality of service.
Parametric measures for services are needed in order to have robustness, evaluation, and management. Some means of measuring are: latency, throughput, availability, integrity, accountability, confidentiality, non-repudiation, authentication, access control, capacity, consistency, precedence, and compatibility. Price is also important.
Caching improves some things but worsens others. Most people are looking to improve latency and throughput.
How dynamic are the parameters, and how can you measure them? Applications can adapt to what they can get, so the service does not have to be complete at all times.
Henrik: The importance of these metrics is context dependent; sometimes one may be willing to trade one parameter against another, for example latency for confidentiality.
PHB: It is important for applications (i.e. computers) to know what quality of service they are obtaining, and to be able to choose the level they need. (Example: in the Lotus Notes system, the password system needs up-to-date, current information, while document distribution is less critical.)
Dave Clark: We need an interface that allows applications to know what quality they are getting, and to specify what they want and what they are prepared to pay for it. At present, systems are merely reacting to the environment they are in; they do not select their environment.
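Dave Clark's point suggests an interface shape: the application states its requirements (including price), the system reports what it can actually deliver, and the application selects rather than merely reacts. A minimal sketch under assumed parameter names; none of this is from the workshop itself:

```python
from dataclasses import dataclass

@dataclass
class QosRequest:
    max_latency_ms: float      # what the application needs
    min_confidentiality: int   # 0 = cleartext, 1 = encrypted (assumed scale)
    max_price: float           # what it is prepared to pay

@dataclass
class QosOffer:
    latency_ms: float          # what the network reports it can deliver
    confidentiality: int
    price: float

def select_offer(request, offers):
    """Pick the cheapest offer that meets the application's stated
    needs: the application selects its environment rather than
    merely reacting to it."""
    acceptable = [o for o in offers
                  if o.latency_ms <= request.max_latency_ms
                  and o.confidentiality >= request.min_confidentiality
                  and o.price <= request.max_price]
    return min(acceptable, key=lambda o: o.price) if acceptable else None

req = QosRequest(max_latency_ms=200, min_confidentiality=1, max_price=5.0)
offers = [QosOffer(50, 0, 1.0),    # fast and cheap, but cleartext
          QosOffer(150, 1, 3.0),   # meets all constraints
          QosOffer(80, 1, 9.0)]    # fine, but too expensive
chosen = select_offer(req, offers)
```

Henrik's trade-off point fits the same shape: relaxing `max_latency_ms` in the request can make a higher-confidentiality offer acceptable.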
How to get from here to there.
The original topic for this presentation was "mobile and disconnected use", but it quickly turned in a more commerce-oriented direction with John's experiences from Open Market's plans for making money off the Web. Currently, the Web is too slow and too unstable for any serious commercial use.
We are the elite with fast access to the Web. 90% of corporate employees are in offices of 100 people or fewer, and there are not T1s to every office. 20% of Xerox workers only have laptops. How do they use the Web?
Open Market is headed towards a position where it can sell its products on the Web, but currently there is almost no commerce done on the Web (last year, $3T of commerce overall, of which $60M on the Web). One of the main problems is that it is hard to provide content instead of individual Web pages. The main reason for this is that the Web is not semantically specified. Content providers depend on consistent content, not single Web pages.
If you are using the Web seriously, you are not surfing; you have a fixed, persistent relationship. At present the infrastructure is not there; Open Market has a cache which pulls down related information each night. One can thus run offline, but the main interest is desktops with slow links.
Open Market's solution to speed up Web access is to provide a tool that handles
time-shifted retrieval, for example during the night when bandwidth is
available. Together with Pathfinder, Open Market offers offline services
with high performance by using a local cache. One of John's main claims
was that it is more important to have consistent content than a transparent
cache. As an example, batch downloads handle dynamic content by taking a
snapshot and then using this in the local cache. The end user, and especially
the content provider, must be able to control the content of the cache. John
expects that batch download is going to happen on a large scale within the
next 6 months.
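The time-shifted retrieval idea can be sketched as a nightly batch that freezes pages, including dynamic ones, into local snapshots that daytime reads are served from. This is an illustrative sketch, not Open Market's product; the fetch function is injected so the example runs offline:

```python
class SnapshotCache:
    """Sketch of time-shifted retrieval: fetch a fixed set of pages
    during off-peak hours and serve snapshots from a local cache."""
    def __init__(self, urls):
        self.urls = urls
        self.snapshots = {}   # url -> body captured at batch time

    def nightly_batch(self, fetch):
        # Run when bandwidth is cheap. Dynamic content is frozen
        # as a snapshot at batch time, so the cached copy is
        # internally consistent even if the origin keeps changing.
        for url in self.urls:
            self.snapshots[url] = fetch(url)

    def get(self, url):
        # Daytime reads hit the local snapshot, never the network;
        # None means the page was not in the configured batch set.
        return self.snapshots.get(url)

# Stand-in for the network: a dict of page bodies (hypothetical URL).
fake_site = {"http://example.com/news": "headlines at 02:00"}
cache = SnapshotCache(list(fake_site))
cache.nightly_batch(lambda url: fake_site[url])
```

The configured URL list is the control point John emphasized: the user and the content provider decide what goes into the cache, rather than a transparent cache deciding for them.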
The Inter-Language Unification system (ILU) is a multi-language object interface system. The object interfaces provided by ILU hide implementation distinctions between different languages, between different address spaces, and between operating system types. ILU can be used to build multi-lingual object-oriented libraries ("class libraries") with well-specified language-independent interfaces. It can also be used to implement distributed systems, and to define and document interfaces between the modules of non-distributed programs. ILU interfaces are specified in ILU's Interface Specification Language.
The afternoon was open for general discussions. It turned out that there was (with a few exceptions) a large reluctance to discuss favorite CS topics. In this category were locking, naming in general, and export control of crypto technology.
How common is multicast? It seems that the current MBONE is growing exponentially, even faster than the rest of the Internet. All current router implementations have multicast capability, but it is often not turned on. Maybe all will turn on multicast within a year. Many see it as a value-added service and want to find a model for charging for it.
The only solid action item was for Margo to try and find information about the transatlantic link (and to Australia and New Zealand for that matter).