A Network-Aware Distributed Storage Cache for Data Intensive Environments¹
Brian L. Tierney, Jason Lee, Brian Crowley, Mason Holding
Computing Sciences Directorate
Lawrence Berkeley National Laboratory
University of California, Berkeley, CA, 94720
Jeremy Hylton, Fred L. Drake, Jr.
Corporation for National Research Initiatives, Reston, VA 20191
Abstract
Modern scientific computing involves organizing, moving,
visualizing, and analyzing massive amounts of data at
multiple sites around the world. The technologies, the
middleware services, and the architectures that are used to
build useful high-speed, wide area distributed systems,
constitute the field of data intensive computing. In this
paper we will describe an architecture for data intensive
applications where we use a high-speed distributed data
cache as a common element for all of the sources and sinks
of data. This cache-based approach provides standard
interfaces to a large, application-oriented, distributed,
on-line, transient storage system. We describe our
implementation of this cache, how we have made it
“network aware,” and how we do dynamic load balancing
based on the current network conditions. We also show
that access to knowledge of the current network conditions
produces large increases in application throughput.
1.0 Introduction
High-speed data streams resulting from the operation of
on-line instruments and imaging systems are a staple of
modern scientific, health care, and intelligence
environments. The advent of high-speed networks is
providing the potential for new approaches to the
collection, organization, storage, analysis, visualization,
and distribution of the large-data-objects that result from
such data streams. The result will be to make both the data
and its analysis much more readily available.
For example, health care imaging systems illustrate the
need for both high data rates and real-time cataloging.
Medical video and image data used for diagnostic purposes
(e.g., X-ray CT, MRI, and cardio-angiography) are
collected at centralized facilities and may be accessed at
locations other than the point of collection (e.g., the
hospitals of the referring physicians). A second example is
high energy physics experiments, which generate high rates
and massive volumes of data that must be processed and
archived in real time. This data must also be accessible to
large scientific collaborations — typically hundreds of
investigators at dozens of institutions around the world.
In this paper we will describe how “Computational
Grid” environments can be used to help with these types of
applications, and how a high-speed network cache is a
particularly important component in a data intensive grid
architecture. We describe our implementation of a network
cache, how we have made it “network aware,” and how we
adapt its operation to current network conditions.
2.0 Data Intensive Grids
The integration of the various technological approaches
being used to address the problem of integrated use of
dispersed resources is frequently called a “grid,” or a
computational grid — a name arising by analogy with the
grid that supplies ubiquitous access to electric power. See,
e.g., [10]. Basic grid services are those that locate, allocate,
coordinate, utilize, and provide for human interaction with
the various resources that actually perform useful
functions.

1. The work described in this paper is supported by DARPA, Information
Technology Office (http://www.darpa.mil/ito/ResearchAreas.html) and the
U.S. Dept. of Energy, Office of Science, Office of Computational and
Technology Research, Mathematical, Information, and Computational
Sciences Division (http://www.er.doe.gov/production/octr/mics/index.html),
under contract DE-AC03-76SF00098 with the University of California.
This is report no. LBNL-42896.

Grids are built from collections of primarily independent
services. The essential aspect of grid services is that they
are uniformly available throughout the distributed
environment of the grid. Services may be grouped into
integrated sets of services, sometimes called “middleware.”
Current grid tools include Globus [8], Legion [16], SRB
[3], and workbench systems like Habanero [11] and
WebFlow [2].
From the application’s point of view, the Grid is a
collection of middleware services that provide applications
with a uniform view of distributed resource components
and the mechanisms for assembling them into systems.
From the middleware system's point of view, the Grid is a
standardized set of basic services providing scheduling,
resource discovery, global data directories, security,
communication services, etc. However, from the Grid
implementor's point of view, these services result from and
must interact with a heterogeneous set of capabilities, and
frequently involve “drilling” down through the various
layers of the computing and communications infrastructure.
2.1 Architecture for Data Intensive Environments
Our model is to use a high-speed distributed data storage
cache as a common element for all of the sources and sinks
of data involved in high-performance data systems. We use
the term “cache” to mean storage that is faster than typical
local disk, and temporary in nature. This cache-based
approach provides standard interfaces to a large,
application-oriented, distributed, on-line, transient storage
system.
Each data source deposits its data in the cache, and each
data consumer takes data from the cache, often writing the
processed data back to the cache. A tertiary storage system
manager migrates data to and from the cache at various
stages of processing. (See Figure 1.) We have used this
model for data handling systems for high energy physics
data and for medical imaging data. For more information
see [15] and [14].
The high-speed cache serves several roles in this
environment. It provides a standard high data rate interface
for high-speed access by data sources, processing
resources, mass storage systems (MSS), and user interface /
data visualization elements. It provides the functionality of
a single very large, random access, block-oriented I/O
device (i.e., a “virtual disk”). It serves to isolate the
application from tertiary storage systems and instrument
data sources, helping eliminate contention for those
resources.
This cache can be used as a large buffer, able to absorb
data from a high rate data source and then to forward it to a
slower tertiary storage system. The cache also provides an
“impedance matching” function between a small number of
high throughput streams and a larger number of lower speed
streams, e.g., between fine-grained accesses by many
applications and the coarse-grained nature of a few parallel
tape drives in the tertiary storage system.
Depending on the size of the cache relative to the
objects of interest, the tertiary storage system management
may only involve moving partial objects to the cache. In
other words, the cache may contain a moving window for
an extremely large off-line object/data set. Generally, the
cache storage configuration is large (e.g., 100s of
gigabytes) compared to the available disks of a typical
computing environment (e.g., 10s of gigabytes), and very
large compared to any single disk (e.g., hundreds of
gigabytes vs. ~10 gigabytes).
2.2 Network-Aware Applications
In order to efficiently use high-speed wide area
networks, applications will need to be “network-aware”[6].
Network-aware applications attempt to adjust their
demands in response to changes in resource availability.
For example, emerging QoS services will allow
network-aware applications to participate in resource
management, so that network resources are applied in a
way that is most effective for the applications. Services
with a QoS assurance are likely to be more expensive than
best-effort services, so applications may prefer to adjust
rather than pay a higher price. Network-aware applications
will require a general-purpose service that provides
information about the past, current, and future state of all
the network links that it wishes to use. Our monitoring
system, described below, is a first step in providing this
service.
Figure 1: The Data Handling Model. A data source (an instrument or
simulation) feeds a real-time data cache partition in the large,
high-speed network cache; parallel computation / data analysis and
visualization applications work against application data and processing
scratch partitions of the same cache; and a data cataloguing, archiving,
and access control system moves data to and from a tertiary storage
system backed by disk and tape storage.

3.0 The Distributed-Parallel Storage System
Our implementation of this high-speed, distributed
cache is called the Distributed-Parallel Storage System
(DPSS) [7]. LBNL designed and implemented the DPSS as
part of the DARPA MAGIC project [18], and as part of the
U.S. Department of Energy’s high-speed distributed
computing program. This technology has been successful
in providing an economical, high-performance, widely
distributed, and highly scalable architecture for caching
large amounts of data that can potentially be used by many
different users.
Typical DPSS implementations consist of several
low-cost workstations as DPSS block servers, each with
several disk controllers, and several disks on each
controller. A four-server DPSS with a capacity of one
Terabyte (costing about $80K in mid-1999) can thus
produce throughputs of over 50 MBytes/sec by providing
parallel access to 20-30 disks.
Other papers describing the DPSS in more detail include
[23], which describes how the DPSS was used to provide
high-speed access to remote data for a terrain visualization
application, [24], which describes the basic architecture
and implementation, and [25], which describes how the
instrumentation abilities in the DPSS were used to help
track down a wide area network problem. This paper
focuses on how we were able to greatly improve total
throughput to applications by making the DPSS “network
aware.”
The application interface to the DPSS cache supports a
variety of I/O semantics, including Unix-like I/O semantics
through an easy-to-use client API library (e.g., dpssOpen(),
dpssRead(), dpssWrite(), dpssLSeek(), dpssClose()). The
data layout on the disks is completely up to the application,
and the usual strategy for sequential reading applications is
to write the data “round-robin,” striping blocks of data
across the servers. The client library also includes a flexible
data replication ability, allowing for multiple levels of fault
tolerance. The DPSS client library is multi-threaded, where
the number of client threads is equal to the number of
DPSS servers. Therefore the speed of the client scales with
the speed of the server, assuming the client host is powerful
enough.
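
To make the I/O semantics concrete, the sketch below shows how a
sequential-reading application might use the client library. Only the
function names above come from the paper; the header name, argument
lists, open mode, and return conventions are assumptions made here for
illustration.

```c
/* Hypothetical usage of the DPSS client API described above.
 * Only the function names (dpssOpen, dpssLSeek, dpssRead, dpssClose)
 * appear in the paper; signatures and the header name are assumed. */
#include <stdio.h>
#include <stdlib.h>
/* #include "dpss.h"   assumed client library header */

#define BLOCK_SIZE (64 * 1024)   /* DPSS data blocks are typically 64 KB */

int main(void)
{
    char *buf = malloc(BLOCK_SIZE);

    /* Open a logical data set by name (the mode string is an assumption). */
    int fd = dpssOpen("physics_run42.data", "r");
    if (fd < 0) {
        fprintf(stderr, "dpssOpen failed\n");
        return 1;
    }

    /* Unix-like semantics: seek, then read block-sized chunks.  The
     * multi-threaded client library fans the requests out to all servers. */
    dpssLSeek(fd, 0L, 0 /* SEEK_SET */);
    long n;
    while ((n = dpssRead(fd, buf, BLOCK_SIZE)) > 0) {
        /* ... process n bytes of data ... */
    }

    dpssClose(fd);
    free(buf);
    return 0;
}
```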
The internal architecture of the DPSS is illustrated in
Figure 2. Requests for blocks of data are sent from the
client to the “DPSS master” process, which determines
which “DPSS block servers” the blocks are located on, and
forwards the requests to the appropriate servers. The server
then sends the block directly back to the client. Servers
may be anywhere in the network: there is no assumption
that they are all at the same location, or even the same city.
DPSS performance, as measured by total throughput, is
optimized for a relatively small number (a few thousand)
of relatively large files (greater than 50 MB). Performance
is the same for any file size greater than 50 MB. We have
also shown that performance scales well with the number
of clients, up to at least 64 clients. For example, if the
DPSS system is configured to provide 50 MB/sec to 1
client, it can provide 1 MB/sec to each of 50 simultaneous
clients. The DPSS master host starts to run out of resources
with more than 64 clients.
Because of the threaded nature of the DPSS server, a
server scales linearly with the number of disks, up to the
network limit of the host (possibly limited by the network
card or the CPU). The total DPSS system throughput scales
linearly with the number of servers, up to at least 10
servers.
The DPSS provides several important and unique
capabilities for data intensive distributed computing
environments. It provides application-specific interfaces to
an extremely large space of logical blocks; it offers the
ability to build large, high-performance storage systems
from inexpensive commodity components; and it offers the
ability to increase performance by increasing the number of
parallel disk servers.
DPSS data blocks are available to clients immediately as
they are placed into the cache. It is not necessary to wait
until the entire file has been transferred before requesting
data. This is particularly useful to clients requesting data
from a tape archive. As the file moves from tape to the
DPSS cache, the blocks in the cache are immediately
available to the client. If a block is not available, the
application can either block, waiting for the data to arrive,
or continue to request other blocks of data which may be
ready to read.
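
The sketch below illustrates the second option, polling for blocks that
are already cached while a file is still migrating in from tape. The two
helper calls are hypothetical names invented here to show the access
pattern; the paper only states that a client may either block or go on
to blocks that are already available.

```c
/* Illustration of reading whichever blocks are already in the cache.
 * dpssBlockAvailable() and dpssReadBlock() are hypothetical helpers,
 * named here only to show the access pattern described above. */
void drain_available_blocks(int fd, int nblocks, char *buf)
{
    int done[nblocks];
    for (int i = 0; i < nblocks; i++)
        done[i] = 0;

    for (int remaining = nblocks; remaining > 0; ) {
        for (int i = 0; i < nblocks; i++) {
            if (done[i] || !dpssBlockAvailable(fd, i))   /* hypothetical */
                continue;            /* not cached yet; try again next pass */
            dpssReadBlock(fd, i, buf);                   /* hypothetical */
            /* ... process block i ... */
            done[i] = 1;
            remaining--;
        }
    }
}
```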
The DPSS is dynamically reconfigurable, allowing one
to add or remove servers or disks on the fly. This is done by
storing the DPSS hardware resource information in a
Globus Metacomputing Directory Service (MDS)[5]
formatted LDAP database, which may be updated
dynamically.

Figure 2: DPSS Architecture. The client application sends block
requests to the DPSS master, which forwards them to the appropriate
DPSS data servers; each server runs a block request thread, several
disk read threads (one per disk) that fill a shared memory cache, and
a block writer thread that sends the blocks directly back to the
clients. Servers may also exchange blocks with other DPSS servers.

Software agents are used to monitor network,
host, and disk availability and load, storing this information
into the LDAP database as well. This information can then
be used for fault tolerance and load balancing. We describe
this load balancing facility in more detail below.
4.0 Network-Aware Adaptation
For the DPSS cache to be effective in a wide area
network environment, it must have sufficient knowledge of
the network to adjust for a wide range of network
performance conditions and sufficient adaptability to be
able to dynamically reconfigure itself in the face of
congestion and component failure.
4.1 Monitoring System
We have developed a software agent architecture for
distributed system monitoring and management. We call
this system Java Agents for Monitoring and Management
(JAMM) [13]. The agents, whose implementation is based
on Java and RMI, can be used to launch a wide range of
system and network monitoring tools, extract their results,
and publish them into an LDAP database. These agents can
securely start any monitoring program on any host and
manage the output of any monitoring data. For example, we
use the agents to run netperf [19] and ping for network
monitoring, vmstat and uptime for host monitoring, and
xntpdc for host clock synchronization monitoring. These
results are uploaded to an LDAP database at regular
intervals, typically every few minutes, for easy access by
any process in the system. We run these agents on every
host in a distributed system, including the client host, so
that we can learn about the network path between the client
and any server.
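
The JAMM agents themselves are Java/RMI programs, but the
collect-and-publish cycle they perform is simple. The C sketch below
runs a monitoring command, captures one line of output, and stamps it
with the collection time; the command strings, the two-minute interval,
the hostname, and the print-instead-of-LDAP publishing step are
placeholders, not the JAMM implementation.

```c
/* Generic collect-and-publish loop of the kind the agents perform.
 * The real JAMM agents are Java/RMI and write results into an LDAP
 * database; here "publishing" is just a printf so the sketch runs. */
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static void run_and_publish(const char *cmd, const char *attribute)
{
    char line[512];
    FILE *p = popen(cmd, "r");
    if (p == NULL)
        return;
    if (fgets(line, sizeof(line), p) != NULL) {
        line[strcspn(line, "\n")] = '\0';
        /* In the real system this record is uploaded to the LDAP database. */
        printf("%s: %s (collected at %ld)\n", attribute, line, (long)time(NULL));
    }
    pclose(p);
}

int main(void)
{
    for (;;) {
        run_and_publish("uptime", "hostLoad");
        run_and_publish("vmstat 1 2 | tail -1", "hostVmstat");
        /* the hostname below is a placeholder for a monitored DPSS server */
        run_and_publish("ping -c 1 dpss-server.example.org | tail -1", "pingResult");
        sleep(120);   /* publish at regular intervals, every few minutes */
    }
    return 0;
}
```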
4.2 TCP Receive Buffers
The DPSS uses the TCP protocol for data transfers. For
TCP to perform well over high-speed networks, it is
critical that there be enough buffer space for the congestion
control algorithms to work correctly [12]. Proper buffer
size is a function of the network bandwidth-delay product,
but because bandwidth-delay products in the Internet can
span 4-5 orders of magnitude, it is impossible to configure
the default TCP parameters on a host to be optimal for all
connections [21].
To solve this problem, the DPSS client library
automatically determines the bandwidth-delay product for
each connection to a DPSS server and sets the TCP buffer
size to the optimal value. The bandwidth and delay of each
link are obtained from the agent monitoring results which
are stored in the LDAP database.
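
A minimal sketch of that buffer-sizing step follows: compute the
bandwidth-delay product from the monitored bandwidth and round-trip
time and install it as the receive buffer before connecting, since
setting it after the connection is established is too late for the
window scale negotiation. The function name is invented here; in the
DPSS the bandwidth and delay values come from the LDAP database.

```c
/* Set the TCP receive buffer to the bandwidth-delay product before
 * connecting, as the DPSS client library does with values taken from
 * the monitoring results in LDAP. */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

int connect_with_tuned_buffer(const struct sockaddr_in *server,
                              double bandwidth_bits_per_sec,
                              double rtt_seconds)
{
    /* bandwidth-delay product, in bytes */
    int bufsize = (int)(bandwidth_bits_per_sec * rtt_seconds / 8.0);

    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;

    /* Must happen before connect() so TCP can negotiate a window scale
     * large enough to use the whole buffer. */
    setsockopt(s, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize));

    if (connect(s, (const struct sockaddr *)server, sizeof(*server)) < 0)
        return -1;
    return s;
}

/* Example: the WAN path in Table 1 (155 Mbit/sec, 44 ms RTT) gives
 * 155e6 * 0.044 / 8, roughly 850 KBytes of receive buffer. */
```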
There are several open issues involved in obtaining
accurate network throughput and latency measures. One
issue is that the use of past performance data to predict the
future may be of limited utility. Another issue is whether to
use active or passive measurement techniques.
Network information such as available bandwidth varies
dynamically due to changing traffic and often cannot be
measured accurately. As a result, characterizing the
network with a single number can be misleading. The
measured bandwidth availability might appear to be stable
based on measurements every 10 minutes, but might
actually be very bursty; this burstiness might only be
noticed if measurements are made every few seconds.
These issues are described in more detail in [17] and
[27]. We plan to adopt techniques used in other projects
such as NWS, once they are proven to be sound.
4.3 Load Balancing
The DPSS can perform load balancing if the data blocks
are replicated on multiple servers. The DPSS master uses
status information in the LDAP database to determine how
to forward a client's block request to the server that will
give the fastest response. A minimum cost flow algorithm
[1][9] is used by the DPSS master to optimize the
assignment of block requests to servers.
Our approach is to treat load balancing as a
combinatorial problem. There is some number of clients
and servers. Each client must be assigned to one or more
servers without any server being overloaded.
The minimum cost flow approach is a good match for
the combinatorial nature of the problem, but there are
several practical challenges to overcome. In particular, the
minimum cost flow algorithm is an offline algorithm; the
number of blocks each client will request must be known in
advance in order to generate a flow of blocks from servers
to clients for a given period. However, client arrivals and
departures are unpredictable, and for some clients, the
request rate and the amount of data requested is also
variable. Our solution is to run the algorithm each time a
client request arrives, using the actual request for the
current client and estimates for every other client. The
algorithm itself is fast (less than 1 ms for typical graphs),
so this solution is workable.
We model the DPSS load balancing problem as a
transportation problem [1] (p. 99). Each server has a supply
of blocks that must be delivered to the clients. The network
is represented as a bipartite graph, where each node is a
client or server and each edge is a network path from server
to client. Each edge has a per-block cost and a maximum
capacity. The algorithm finds a flow of blocks from servers
to clients that minimizes the total cost. It is defined for a
balanced network, where the total demand is equal to the
total supply. For the DPSS, this situation occurs only when
the clients have saturated the servers. To create a balanced
problem, we introduce a ghost client and a ghost server that
have infinite capacity and high-cost links to other servers
and clients, respectively. Supply or demand is assigned to
one of the ghosts to create a balanced problem.
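
The sketch below shows one way the balanced problem could be
represented before it is handed to a min-cost-flow solver. The field
names, fixed array sizes, and ghost-edge cost are illustrative choices
made here, not the DPSS master's actual data structures.

```c
/* Illustrative representation of the balanced transportation problem:
 * servers supply blocks, clients demand blocks, and two ghost nodes
 * absorb whatever imbalance remains through high-cost edges. */
#define MAX_NODES 64
#define MAX_EDGES (MAX_NODES * MAX_NODES)
#define GHOST_COST 1000000L          /* high per-block cost on ghost edges */

struct edge {
    int  from, to;                   /* server node -> client node      */
    long cost;                       /* per-block cost (latency-based)  */
    long capacity;                   /* max blocks on this edge         */
};

struct problem {
    long        supply[MAX_NODES];   /* > 0 for servers, < 0 for clients */
    struct edge edges[MAX_EDGES];
    int         nnodes, nedges;
};

/* Make supply and demand balance by assigning the difference to a ghost:
 * excess supply flows to the ghost client, excess demand is covered by
 * the ghost server.  Ghost edges get effectively unlimited capacity. */
static void balance(struct problem *p, int ghost_server, int ghost_client,
                    long total_supply, long total_demand)
{
    long diff = total_supply - total_demand;
    if (diff > 0)
        p->supply[ghost_client] -= diff;   /* extra demand at ghost client */
    else if (diff < 0)
        p->supply[ghost_server] += -diff;  /* extra supply at ghost server */
}
```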
We assign a cost and capacity based on the assumption
that network latency is the dominant factor affecting
application performance, so that selecting servers with the
lowest latency will maximize application performance. The
total latency from a client's request to its receipt of the first
tile from a server is affected by three different network
paths: the paths from client to master, master to server, and
server to client. The master obtains the latencies from these
three paths from the LDAP database. The total delay for the
edge cost is the sum of the three latencies, the processing
delay at the master and server, and the transmission delay
of a data block across the link between server and client.
Data blocks are large (typically 64KB), so the transmission
delay is non-trivial, even across a high-speed network.
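
In code form, the per-edge cost is just the sum of those terms; the
worked number in the final comment shows why the block transmission
time matters even on a fast link. The function and parameter names are
chosen here for illustration.

```c
/* Edge cost = client->master latency + master->server latency +
 * server->client latency + processing delay + time to transmit one
 * data block on the server->client path.  All times are in seconds,
 * bandwidth is in bits per second. */
#define BLOCK_BYTES (64 * 1024)

static double edge_cost(double lat_client_master,
                        double lat_master_server,
                        double lat_server_client,
                        double processing_delay,
                        double server_client_bw)
{
    double transmission = 8.0 * BLOCK_BYTES / server_client_bw;
    return lat_client_master + lat_master_server + lat_server_client
         + processing_delay + transmission;
}

/* A 64 KB block on a 100 Mbit/sec path takes 8 * 65536 / 100e6 seconds,
 * about 5.2 ms, which is not negligible next to the path latencies. */
```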
One limitation of this approach is that the graph does not
represent the actual network topology. Several edges in the
graph may actually share the same bottleneck link in the
real network, but the graph does not capture this
information. The minimum cost flow algorithm could
accommodate a more detailed model of the network, but
the monitoring system only collects information about
host-to-host performance.
The edge capacity is set to the bandwidth obtained from
the LDAP database. This capacity may be reduced based on
the degree of replication of the data blocks. When data is
loaded into the DPSS, blocks are distributed across n
servers and each block is replicated m times, where m <= n.
If we assume blocks are uniformly distributed to servers,
then it is unlikely that any one server will store more than
a fraction m/n of the blocks requested. The actual edge
capacity assigned is the minimum of the bandwidth and the
m/n fraction of the data requested by the client.
The bandwidth data from the LDAP database is also
used to set the server's supply. The supply at a server is the
total bandwidth available to all clients. This bandwidth
must be determined heuristically because the monitoring
system only reports the maximum bandwidth available to
each client. We might naively assume that the total
bandwidth is the sum of the bandwidth available to each
client. If several clients share the same bottleneck link,
however, the total bandwidth will be less. We
conservatively assume that all clients share the same
bottleneck link and set the total bandwidth to the maximum
bandwidth available to any one client.
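
Both heuristics are simple enough to state directly. The function names
below are chosen here for illustration, and the decision to express the
capacity cap in bytes is an assumption; the paper compares the path
bandwidth and the m/n share of the request without specifying units.

```c
/* Capacity and supply heuristics described above.  Bandwidth values
 * come from the monitoring results in LDAP; m is the replication count
 * and n the number of servers holding the data set. */
static double min_double(double a, double b) { return a < b ? a : b; }

/* Edge capacity: the path bandwidth, capped at the m/n share of the
 * client's request that any single server is likely to hold. */
static double edge_capacity(double path_bandwidth, double bytes_requested,
                            int m, int n)
{
    return min_double(path_bandwidth,
                      ((double)m / (double)n) * bytes_requested);
}

/* Server supply: assume, conservatively, that all clients sit behind the
 * same bottleneck, so the total is the largest per-client bandwidth. */
static double server_supply(const double *client_bandwidth, int nclients)
{
    double best = 0.0;
    for (int i = 0; i < nclients; i++)
        if (client_bandwidth[i] > best)
            best = client_bandwidth[i];
    return best;
}
```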
The load balancing implementation maintains a graph
data structure that is modified whenever clients arrive or
leave. The edge costs are recomputed every three minutes
based on data from LDAP. We use the CS2 [4] minimum
cost flow solver. For a particular request, the solver
determines what proportion of the blocks will be delivered
by each server. Each block must be looked up in the block
database to determine which specific servers it is loaded on.
A stride scheduler [26] chooses one of the available servers
based on the proportions assigned by the solver.
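
A generic stride scheduler of the kind cited is sketched below: each
eligible server receives tickets in proportion to the share the solver
assigned to it, and each block goes to the eligible server with the
smallest pass value. This is a textbook version of the technique, not
the DPSS code; the ticket scaling factor is arbitrary.

```c
/* Generic stride scheduler: tickets are proportional to the share of
 * blocks the min cost flow solver assigned to each server; each pick
 * advances the chosen server's pass by its stride (STRIDE1 / tickets). */
#define STRIDE1 (1 << 20)

struct stride_state {
    double stride;
    double pass;
};

/* share[i] is the solver's fraction for server i (the shares sum to 1). */
static void stride_init(struct stride_state *s, const double *share, int n)
{
    for (int i = 0; i < n; i++) {
        double tickets = share[i] > 0.0 ? share[i] * 1000.0 : 1.0;
        s[i].stride = STRIDE1 / tickets;
        s[i].pass   = s[i].stride;
    }
}

/* Choose among the servers that actually hold this block (holds[i] != 0). */
static int stride_pick(struct stride_state *s, const int *holds, int n)
{
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (!holds[i])
            continue;
        if (best < 0 || s[i].pass < s[best].pass)
            best = i;
    }
    if (best >= 0)
        s[best].pass += s[best].stride;
    return best;   /* -1 if no listed server holds the block */
}
```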
5.0 Results
5.1 TCP Buffer Tuning
Table 1 shows the results from dynamic setting of the
TCP receive buffer size. This table illustrates that buffers
can be hand-tuned for either LAN access or WAN access,
but not both at once. It is also apparent that while setting
the buffer size big enough is particularly important for the
WAN case, it is also important not to set it too big for the
LAN environment.
If the buffers are too large, throughput may decrease
because the larger receive buffer allows the congestion
window to grow sufficiently large that multiple packets are
lost (in a major buffer overflow) during a single round trip
time (RTT), which then leads to a timeout instead of a
smooth fast retransmit/recovery [20].
5.2 Load Balancing
We first ran a series of tests to verify that latency is the
dominant factor in determining which server to use in the
load balancing algorithm. Figures 3, 4, and 5
show the results of using dynamic load balancing while
varying one factor at a time. In Figure 3 we used servers with the
same load and latency, and varied the available network
throughput (the first two servers were on OC-3, the third on
10BT ethernet, and the fourth on 100BT ethernet). In
Figure 4 we used DPSS servers with the same network
throughput and latency, but varied the server CPU power
available by using servers with other jobs running
Table 1

  buffer method                         network   Total Throughput
  hand tune for LAN (64 KB buffers)     LAN       33 MBytes/sec
                                        WAN       5.5 MBytes/sec
  hand tune for WAN (512 KB buffers)    LAN       19 MBytes/sec
                                        WAN       14 MBytes/sec
  auto tune in DPSS library             LAN       33 MBytes/sec
                                        WAN       14 MBytes/sec

  LAN RTT = 1 ms over OC-12 (622 Mbit/sec) network
  WAN RTT = 44 ms over OC-3 (155 Mbit/sec) network
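
As a rough consistency check (these numbers are computed from the path
parameters above, not reported in the paper), the bandwidth-delay
products are about 622 Mbit/sec × 1 ms / 8 ≈ 78 KBytes for the LAN path
and 155 Mbit/sec × 44 ms / 8 ≈ 850 KBytes for the WAN path. This is in
line with the table: 64 KB buffers are close to what the LAN path
needs but an order of magnitude too small for the WAN, while the 512 KB
hand-tuned buffers do much better on the WAN but pay the over-buffering
penalty on the LAN.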
