Network-Centric Measurement
of Caching
Position Paper
Solom Heddaya
InfoLibria, Inc.
1998.10.20
Network managers
deploy caching in order to improve their networks, not to achieve hit
ratios, or op/s, or any of the other commonly used caching metrics.
They seek to improve their network's bandwidth capacity, web service
capacity and response time, without adversely affecting network
latency, or network availability and reliability.
Very little current
research work addresses these issues of pressing concern and value to network
managers. Most of the current benchmarking and performance characterization
research focuses on the behavior of the network cache as a server.
While this approach can be useful in optimizing certain aspects of network
caches, it does not accurately reflect the impact of caching on the network.
Typical cache performance metrics, such hit ratio (or rate), mystify network
managers. They would much rather see metrics that quantify the promised
network capacity expansion, response time speedup (to the end-user), and
availability enhancement. The reliability impact of caching ranks high
on their list, too.
The network-centric
point of view impacts workload characterization. For example, request and
response routing information needs to be included in workload characteristics,
in order to address such problems as network cache placement.
In this paper, we
argue that the repertoire of research in the field should widen to include
the point of view of the network, and we provide an initial attempt at
clarifying how this might be done using research from the field of performability.
Network Bottlenecks
Oscillate
As the Internet grows
at a historic pace, doubling in aggregate traffic rate every three to six
months, it suffers from bottlenecks that frustrate its users. These bottlenecks
oscillate between the two major constituents of the Internet: the client/server
complex, and the network itself. Recently the Internet was stressed to
deliver the Starr report [W98, K98b].
On this particular occasion, the bottleneck was the server(s). However,
ordinary traffic conditions give rise to an unacceptably low 40 kilobit/s
average transfer rate per TCP connection through the backbone [K98a].
This latter measurement reflects transfer rates delivered, not over modem
lines, but over a dedicated T1-class last hop.
Network caching applies
server-like functionality to solve the network congestion problem. So,
is a network cache to be judged on how well it functions as a server, or
on the extent to which it improves the network? On the one hand, a network
cache looks like a high performance server, whose performance can be characterized
via the traditional throughput and response time [MA98].
On the other hand, a network cache expands bandwidth and speeds up response
time. These benefits are the true goals of network caching. With only one
exception, measurements of network cache performance continue to focus
on the server aspect of network caching.
Server-Centric Performance
Characterization
Network caches, as implemented
most commonly today, originated from work on high performance web servers.
From the point of view of the network, servers are hosts, while network
caches are more like routers or switches. The dominance of the server point
of view in characterizing cache workload and cache performance can be
seen by noting the following:
-
Cache performance parameters
reflect throughput (commonly in op/s) and response time of the cache itself,
augmented with cache hit ratio. These are the same parameters used to characterize
servers.
-
Cache hit ratio is a
poor indicator of bandwidth savings, for several reasons: First, hit ratio
ignores the network distance saved [HMY97]. A packet
transmitted over 30 hops uses up more network capacity (bandwidth) than
one that traverses only 10 hops. Second, network capacity is often determined
by the bandwidth across (relatively) few links. Packets that consume
otherwise idle bandwidth have zero impact on network capacity. Third, even
if we focus on a single bottleneck link, the hit ratio that matters will
be the hit ratio at peak link utilization, not the average hit ratio.
Fourth, misses are double (or more) counted when they traverse multiple
caches. This understates the aggregate hit ratio of a collective cache.
-
Cache workloads typically
capture only server-centric workload characteristics, such as request arrival
time, requested object name, file size, etc. This is insufficient to predict
a cache's impact on the network. For example, a trace that shows three
requests A1, A2, A3 for the same object can yield a hit ratio of
zero, 1/3, or 2/3, depending on where caches are placed in the network.
If the three requests reach different caches, the hit ratio will be zero.
Lack of route information in traces or models restricts their validity
to situations where a cache is to be deployed at the exact same spot in
the network at which the load was traced or modeled.
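The placement-dependence of the hit ratio can be made concrete with a small
sketch. The code below is purely illustrative (the trace, cache names, and
routing map are hypothetical, not drawn from any real measurement): given a
trace of requests and a mapping from each request to the cache its route
passes through, it computes the aggregate hit ratio, showing that the same
three-request trace yields 0, 1/3, or 2/3 depending on placement.

```python
def aggregate_hit_ratio(trace, placement):
    """trace: list of (request_id, object_name) pairs in arrival order.
    placement: maps request_id -> cache_id, i.e., which cache the
    request's route happens to pass through (hypothetical routing)."""
    contents = {}  # cache_id -> set of objects that cache has seen
    hits = 0
    for req, obj in trace:
        cache = contents.setdefault(placement[req], set())
        if obj in cache:
            hits += 1       # object already cached along this route
        else:
            cache.add(obj)  # miss: the cache stores the object
    return hits / len(trace)

# Three requests, A1..A3, all for the same object (as in the text).
trace = [("A1", "obj"), ("A2", "obj"), ("A3", "obj")]

# All three routes share one cache: one miss, two hits -> 2/3.
print(aggregate_hit_ratio(trace, {"A1": "c1", "A2": "c1", "A3": "c1"}))
# Two routes share a cache: 1/3.
print(aggregate_hit_ratio(trace, {"A1": "c1", "A2": "c1", "A3": "c2"}))
# Every route reaches a different cache: 0.
print(aggregate_hit_ratio(trace, {"A1": "c1", "A2": "c2", "A3": "c3"}))
```

A trace without route information cannot distinguish these three cases,
which is precisely why its predictive validity is limited to the spot
where it was collected.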
Network-Centric
Performance Evaluation
A number of network-related
factors must be taken into account for network caching to be evaluated
in the proper context. These range from network topology, to capacity enhancement,
to the effect of cascaded caches on each other's performance. Furthermore,
network and web content availability need to be suitably defined and quantified.
The requirements for such a network-centric performance model include:
-
Workload characteristics
should include route information for requests and responses. The risks
of not doing so have been discussed earlier.
-
Other network phenomena
should be modelled as well, such as slow client connections (e.g.,
modems [AC98]), and backbone congestion leading to
low TCP connection bandwidth to the content-provider's HTTP server.
-
Throughput, hit ratio,
and response time should all be modelled in the aggregate. Or, conversely,
the behavior of each cache should not be reported in isolation. For example,
the aggregate hit ratio of a caching system that intercepts the same (miss)
request multiple times along the request path would be higher when a miss
is counted only once. Similarly, the aggregate throughput of the caching
system would be lower than the sum of the constituent caches' throughputs,
if a miss is counted as a single operation, even though it may trigger
multiple operation executions at different caches.
-
Benefits to the network
should be the end result of performance evaluation. This means that
average bandwidth savings, for example, should be replaced with bandwidth
savings at peak link, or network, utilization. For example, a cache that
thrashes during such times would provide zero bandwidth expansion, no matter
what the average traffic reduction it delivers may be.
-
Cache performance metrics
should include the impact of the caching system on network service availability
and reliability [HHE97]. Availability is
the probability that a request submitted to the network will lead to the
successful initiation of service. Availability depends on the length
of the downtime caused by a cache failure, until any fail-safety mechanism
kicks in to restore service.
-
Reliability is the probability
that a service, once successfully initiated, runs to completion. The typical
metric for reliability is the mean time between failures (MTBF).
When cached files are small, and hence transaction lifetimes are short,
reliability can be safely ignored. This is especially true if availability
is high. The converse is not true. Large files, such as the Starr report
when served as a single object, resulted in an 89% failure rate at the house.gov
web server [K98b]. Many of these failures occurred
after the download started.
-
A single figure of merit
that combines performance (including overhead), availability and reliability,
is performability. It turns out that such a composite metric can
be defined simply yet meaningfully (see [HHE97]).
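As a rough illustration of how such a composite figure might behave, the
sketch below discounts a cache's bandwidth saving by the probability that
service starts (availability) and then runs to completion (reliability).
The multiplicative combination and the sample numbers are assumptions made
here for illustration; [HHE97] gives its own, more careful, definition. The
availability estimate uses the standard steady-state formula
MTBF / (MTBF + MTTR).

```python
def availability(mtbf_hours, mttr_hours):
    """Standard steady-state availability estimate: the fraction of time
    the caching path can successfully initiate service, where MTTR is
    the downtime until a fail-safety mechanism restores service."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def performability(bandwidth_saving, avail, reliability):
    """Illustrative composite: expected useful bandwidth saving, counting
    only transfers that start and run to completion. (Hypothetical
    combination, not the definition from [HHE97].)"""
    return bandwidth_saving * avail * reliability

# A cache failing once every 1000 hours, restored in 1 hour: ~0.999 available.
a = availability(1000.0, 1.0)
# With a 30% peak-link bandwidth saving and a 95% chance each transfer
# completes, the composite figure is a bit over 28%.
print(performability(0.30, a, 0.95))
```

The point of the single figure is that a cache with impressive average
savings but poor availability or reliability scores low, matching the
network manager's view of its actual benefit.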
Long
Term vs. Short Term
Aside from the tactical
effects of network caching, which can be quantified reasonably well using
the approach we outlined above, we should not ignore the strategic impact
on the network. For example, network scalability can be dramatically enhanced
(or hampered) by caching. If the network scales by upgrading individual
links, then a parallel computing solution would be suitable, but if the
network grows primarily by adding new links and nodes, then a distributed
computing approach to caching would be preferable.
References
[AC98]
J. Almeida and P. Cao, "Wisconsin
Proxy Benchmark 1.0", Univ. of Wisconsin (as of Oct. 20, 1998).
[HHE97]
A. Heddaya, A. Helal, and A. Elmagarmid, "Recovery-Enhanced Reliability,
Dependability and Performability," Chapter 4 in Recovery
Mechanisms In Database Systems (V. Kumar and M. Hsu, eds.) Prentice-Hall,
Dec. 1997.
[HMY97]
A. Heddaya, S. Mirdad and D. Yates, "Diffusion-based Caching Along Routing
Paths", Proc. 2nd Web Caching Workshop, Boulder, Colorado, June 9-10, 1997.
[K98a]
Keynote Systems, Inc., "Top
10 Discoveries about the Internet", (as of Oct. 20, 1998).
[K98b]
Keynote Systems, Inc., "Clinton/Lewinsky
Scandal : Effect on Internet Performance", Oct. 6, 1998.
[MA98]
D.A. Menasce, V.A.F. Almeida, "Capacity Planning for Web Performance: Metrics,
Models, & Methods," Prentice-Hall, 1998.
[W98]
D. Wessels, "Report
on the effect of the Independent Council Report on the NLANR Web Caches",
NLANR, Sep 23, 1998.