Journal ArticleDOI

Packet pacing in small buffer optical packet switched networks

TL;DR: It is argued that the loss-delay tradeoff mechanism provided by pacing can be instrumental in overcoming the performance hurdle arising from the scarcity of buffers in OPS networks.
Abstract: In the absence of a cost-effective technology for storing optical signals, emerging optical packet switched (OPS) networks are expected to have severely limited buffering capability. To mitigate the performance degradation resulting from small buffers, this paper proposes that optical edge nodes “pace” the injection of traffic into the OPS core. Our contributions relating to pacing in OPS networks are three-fold: first, we develop real-time pacing algorithms of poly-logarithmic complexity that are feasible for practical implementation in emerging high-speed OPS networks. Second, we provide an analytical quantification of the benefits of pacing in reducing traffic burstiness and traffic loss at a link with very small buffers. Third, we show via simulations of realistic network topologies that pacing can significantly reduce network losses at the expense of a small and bounded increase in end-to-end delay for real-time traffic flows. We argue that the loss-delay tradeoff mechanism provided by pacing can be instrumental in overcoming the performance hurdle arising from the scarcity of buffers in OPS networks.

Summary

Introduction

  • The maturation of Wavelength Division Multiplexing (WDM) technology in recent years has made it possible to harness the enormous bandwidth potential of an optical fibre cost-effectively.
  • As systems supporting hundreds of wavelengths per fibre with transmission rates of 10-40 Gbps per wavelength become available, electronic switching is increasingly challenged in scaling to match these transport capacities.
  • The mechanism to achieve this, termed “pacing”, reduces the short-time-scale burstiness of arbitrary traffic, but without incurring significant delay penalties.

A. Loss for Real-Time Traffic

  • The link rate is set at 10 Gbps, and packets have a constant size of 1250 bytes (this is consistent with earlier studies of slotted OPS systems).
  • Fig. 1 shows the packet losses as a function of buffer size obtained from simulations of short range as well as long range dependent (LRD) input traffic at various system loads (the traffic model is detailed in section V).
  • The plots illustrate that an OPS node with very limited buffering (say 10 to 20 packets) can experience significant losses even at low to moderate traffic loads, particularly with the LRD model which is more representative of real-world traffic.
  • This may be unacceptable in a core network supporting real-time applications with stringent loss requirements.

B. Prior Work

  • Some earlier works have proposed modifying traffic characteristics to improve performance in optical networks with limited buffers.
  • ATM and IP networks have used rate-based shaping methods such as GCRA or leaky-bucket to protect the network against relatively longer-time-scale rate fluctuations, while short-time-scale burstiness is expected to be absorbed by router buffers.
  • The ineffectiveness of rate-based shaping against losses caused by short-time-scale burstiness has been confirmed by studies in [22] and the authors' earlier work in [23].
  • Following on from their arguments in [24] showing that router buffer size need only be inversely proportional to the square root of the number of TCP flows, they have recently shown in [25] that by making each TCP sender “pace” its packet injections into the network, a router buffer size of 10-20 packets suffices to realise near-maximum TCP throughput performance (a back-of-envelope comparison of these sizing rules follows this list).
  • Another difference is that rather than pacing at the end-hosts, the authors focus on pacing at the edge of the optical network.
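
A quick back-of-envelope comparison of the buffer-sizing rules mentioned above, in illustrative Python; the link rate, RTT, and flow count are assumptions of ours, not figures from the paper:

# Buffer sizing rules of thumb: delay-bandwidth product vs. the
# square-root rule of [24]. All numbers are illustrative assumptions.
C   = 10e9        # link rate (bits/s)
RTT = 0.1         # nominal round-trip time (s)
N   = 10_000      # number of long-lived TCP flows
PKT = 1250 * 8    # packet size (bits), matching the paper's 1250-byte packets

bdp = RTT * C                 # classic rule-of-thumb: 1e9 bits
sqrt_rule = bdp / N ** 0.5    # [24]: divide by sqrt(N)  -> 1e7 bits

print(bdp / PKT, sqrt_rule / PKT)   # ~100000 vs ~1000 packets
# with per-sender TCP pacing [25], 10-20 packets suffice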

III. SYSTEM MODEL AND OFF-LINE OPTIMUM

  • The packet pacer smoothes traffic entering the OPS network, and is therefore employed at the optical edge switches on their egress links connecting to the all-optical packet switching core.
  • The objective of the pacer is to produce the smoothest output traffic such that each packet is released by its deadline.
  • A feasible exit curve, namely one which is causal and satisfies the delay constraint, must lie in the region bounded above by the arrival curve A(t), and below by the deadline curve D(t).
  • Current mechanisms for smoothing consider one or a few video streams at end-hosts or video servers; by contrast, OPS edge nodes will have to perform the pacing on large traffic aggregates at extremely high data rates.
  • The time-constraints for computing the optimal pacing patterns are also much more stringent: unlike video smoothing, where a few frames (tens to hundreds of milliseconds) of delay are acceptable, OPS edge nodes will have to buffer traffic for a shorter time lest the buffering requirement become prohibitively expensive (at 10 Gbps, 1 msec of buffering needs 10 Mbits of RAM).

IV. EFFICIENT REAL-TIME PACING

  • It is shown in [28] that an off-line pacer yields the smoothest output traffic satisfying the delay constraints if its service rate follows the shortest path lying between the arrival and deadline curves.
  • In the on-line case, however, the packet arrival process is non-deterministic, and the arrival curve is not known beforehand.
  • Upon each packet arrival, the deadline curve is augmented, and this may require a recomputation of the convex hull which defines the optimal exit curve.

A. Single Delay Class – Constant Amortised Cost Algorithm

  • The authors first consider the case where all packets entering the pacer have identical delay constraints.
  • This simplifies the hull update algorithm since each packet arrival augments the deadline curve at the end.
  • At this stage the hull is convex and the backward scan can stop, resulting in the new hull.
  • The algorithm of Fig. 4 has constant amortised computation cost per packet arrival (Claim 1).
  • A packet from the head of the pacer queue is released as soon as sufficient credits (corresponding to the packet size) are available, such credits being deducted when the packet departs the pacer.

B. Single Delay Class – Logarithmic Cost Algorithm

  • In operation 2 the mid-point E of the right half is examined, and it is found that E-H lies above the original hull, so the algorithm moves left, until it reaches point D in operation 3, which gives the desired tangent and final hull O-A-B-C-D-H. Fig. 7 depicts the update algorithm in pseudocode.
  • Along with each vertex the authors also store pointers to its predecessor and successor on the boundary of its convex hull.
  • The arrival of the new packet, which causes the deadline curve to be amended, results in appending a new vertex to the end of the hull.
  • The search process is a binary search of the AVL tree, as described by Preparata [40, procedure TANGENT].

C. General Poly-Logarithmic Cost Algorithm

  • The authors now consider the general case where arriving packets may have arbitrary delay constraints.
  • The idea behind the algorithm is that for an incoming packet with arbitrary deadline, the original deadline curve is split into two parts, corresponding to the left and right of the new arrival’s deadline.
  • The convex hull of each part is independently computed, after the deadline curve to the right has been shifted up to account for the new packet arrival (a simplified sketch of this split-shift-merge step follows this list).
  • The vertices are stored in the leaves of a search-tree structure T capable of supporting concatenable-queue operations, such as the 2-3 tree [41, section 4.12], with the value of time used as the search key.
  • The division is a recursive process detailed in [41, section 4.12].
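
As a concrete, simplified illustration of the split-shift-merge idea above, the sketch below (all names ours) rebuilds the deadline curve and its hull in linear time from per-packet deadlines; the paper instead attains poly-logarithmic cost by splitting and re-joining hulls stored in a concatenable-queue (2-3 tree).

import bisect

def convex_hull(points):
    # Monotone scan keeping non-increasing edge slopes, as required of
    # the optimal exit curve; points are (time, cumulative_workload).
    hull = []
    for p in points:
        while len(hull) >= 2 and \
              (p[1] - hull[-1][1]) * (hull[-1][0] - hull[-2][0]) >= \
              (hull[-1][1] - hull[-2][1]) * (p[0] - hull[-1][0]):
            hull.pop()          # convexity violated: drop the middle vertex
        hull.append(p)
    return hull

def insert_packet(packets, deadline, size):
    # packets: sorted list of (deadline, size) pairs. The new deadline
    # conceptually splits the deadline curve at `deadline` and shifts
    # everything to its right up by `size`; this O(n) version simply
    # rebuilds the cumulative curve and recomputes the hull.
    bisect.insort(packets, (deadline, size))
    curve, total = [], 0
    for t, s in packets:
        total += s
        curve.append((t, total))
    return convex_hull(curve)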

V. PERFORMANCE EVALUATION FOR A SINGLE FLOW

  • Having addressed the feasibility of pacing at high data rates, the authors demonstrate the utility of pacing in OPS systems with small buffers.
  • This section evaluates via analysis and simulation the impact of pacing on traffic burstiness and loss for a single flow, while the next section evaluates via simulation loss performance for several flows in realistic network topologies.

A. Traffic Models

  • The authors apply their pacing technique to Poisson and long range dependent (LRD) traffic models (both of which were introduced in section II-A); the Poisson model is selected for its simplicity and ease of illustration of the central ideas, while the LRD model is chosen since it is believed to be more reflective of traffic in real networks.
  • Fig. 10 shows, for Poisson and LRD traffic, the burstiness β(s) versus time-scale s (in µsec) on log-log scale, observed in simulation for pacing delays d of 0 (i.e. no pacing), 10µsec, 100µsec, 1msec, and 10msec (one way to measure such a burstiness curve is sketched after this list).
  • At very long time-scales (beyond the delay budget d of the pacer), burstiness is again invariant to pacing.
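
β(s) is not defined in this excerpt; as one plausible stand-in, the sketch below measures burstiness as the coefficient of variation of the bytes arriving in windows of length s (an assumption of ours, not the paper's exact metric).

import statistics

def burstiness(arrivals, s, horizon):
    # arrivals: list of (time, bytes); s: time-scale; horizon: trace length.
    # Returns the coefficient of variation of per-window byte counts.
    bins = [0.0] * int(horizon / s)
    for t, size in arrivals:
        if t < horizon:
            bins[int(t / s)] += size
    mean = statistics.fmean(bins)
    return statistics.pstdev(bins) / mean if mean else 0.0

# e.g. sweep s from 1e-6 to 1e-2 seconds for a paced and an unpaced trace
# and plot beta(s) against s on log-log axes, as in Fig. 10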

VI. SIMULATION STUDY OF NETWORKS

  • The previous section considered the impact of pacing on burstiness and loss at a single node.
  • First, unlike the single-link scenario considered earlier in which all traffic was smoothed by a single pacer, traffic will now be paced locally and independently by each ingress point of the OPS core – though globally suboptimal, this is the only feasible practical option.
  • For both topologies the authors quantify the core packet loss as a function of contention resolution resources (optical buffers and wavelength converters) for various end-to-end delay penalties (incurred via pacing at the edge).
  • The cumulative distribution of packet size is shown in Fig. 16, with distinct steps at 40 (the minimum packet size for TCP), 1500 (the maximum Ethernet payload size), as well as at 532 and 576 from TCP implementations that don’t use path MTU discovery.
  • The benefits of pacing are again clear: for say 60 wavelengths per fibre, pacing delays of less than a millisecond are able to reduce loss by more than one order of magnitude.

VII. CONCLUSIONS

  • Emerging optical packet switched (OPS) networks will likely have very limited contention resolution resources, usually implemented in the form of packet buffers or wavelength converters.
  • This can cause high packet losses and adversely impact end-to-end performance.
  • Pacing dramatically reduces traffic burstiness for a bounded and controllable penalty in end-to-end delay.
  • The authors showed via simulation of realistic OPS network topologies that pacing can reduce losses by orders of magnitude, at the expense of a small and bounded increase in end-to-end delay.
  • The authors also intend to compare their traffic pacing at the optical edge to pacing TCP traffic at end-hosts [25].


Packet Pacing in Small Buffer
Optical Packet Switched Networks
Vijay Sivaraman, Hossam Elgindy, David Moreland, and Diethelm Ostry
Abstract—In the absence of a cost-effective technology for
storing optical signals, emerging optical packet switched (OPS)
networks are expected to have severely limited buffering capabil-
ity. To mitigate the performance degradation resulting from small
buffers, this paper proposes that optical edge nodes “pace” the
injection of traffic into the OPS core. Our contributions relating
to pacing in OPS networks are three-fold: first, we develop real-
time pacing algorithms of poly-logarithmic complexity that are
feasible for practical implementation in emerging high-speed OPS
networks. Second, we provide an analytical quantification of the
benefits of pacing in reducing traffic burstiness and traffic loss at
a link with very small buffers. Third, we show via simulations of
realistic network topologies that pacing can significantly reduce
network losses at the expense of a small and bounded increase
in end-to-end delay for real-time traffic flows. We argue that
the loss-delay trade-off mechanism provided by pacing can be
instrumental in overcoming the performance hurdle arising from
the scarcity of buffers in OPS networks.
I. INTRODUCTION
The maturation of Wavelength Division Multiplexing
(WDM) technology in recent years has made it possible
to harness the enormous bandwidth potential of an opti-
cal fibre cost-effectively. As systems supporting hundreds
of wavelengths per fibre with transmission rates of 10-40
Gbps per wavelength become available, electronic switching
is increasingly challenged in scaling to match these transport
capacities. All-optical switching [1] shows promise in meeting
these challenges. To support data traffic efficiently, various
optical sub-wavelength switching methods [2], [3] have been
proposed, of which optical packet switching (OPS) [4] is
particularly attractive. Several experimental test-beds [5]–[9]
have demonstrated the feasibility of OPS.
A fundamental concern in OPS networks is contention,
which occurs at a switching node whenever two or more
packets try to leave on the same output link, on the same wave-
length, at the same time. Today’s electronic switches resolve
this contention relatively easily by using electronic random-
access memory (RAM) that can store over a million packets.
By comparison, a state-of-the-art random-access optical buffer
available on an integrated opto-electronic chip [10] can hold
Vijay Sivaraman is with the School of Electrical Engineering and Telecom-
munications, University of New South Wales, Sydney, NSW 2052, Australia.
Email: Vijay.Sivaraman@ieee.org. Ph: +61 2 9385 6577. Fax: +61 2 9385
5993.
Hossam Elgindy is with the School of Computer Science and Engineering,
University of New South Wales, Sydney, NSW 2052, Australia. Email:
elgindyh@cse.unsw.edu.au
David Moreland and Diethelm Ostry are with the CSIRO ICT Cen-
tre, PO Box 76, Epping, NSW 1710, Australia. Emails: {David.Moreland,
Diet.Ostry}@csiro.au
at most a few dozen packets. Alternatively, spools of fibre can
implement fibre delay lines (FDLs) [11] that provide optical
buffering capability. Unfortunately, the high speed of light
means that a significant buffering capability would necessitate
large fibre spools too unwieldy to be practical: 1 km of fibre
stores 5µsec of optical data; by contrast a conventional router
today typically buffers 50-250msec of electronic data. More-
over, incorporating FDLs into a typical OPS switch design
requires larger optical crossbars, which can add significantly
to cost as the FDL buffers increase. Wavelength conversion
[12], [13] is another technique for resolving contentions in
the optical domain, whereby a contending packet is converted
to an unused wavelength on the same outgoing link. However,
wavelength converters are expensive, and often limited in their
conversion range. Other methods such as deflection routing
[14] and combinational schemes [15] have also been investi-
gated for alleviating contentions, but usually incur overheads
such as packet reordering, complexity, etc. It therefore seems
that OPS networks of the foreseeable future will have very
limited contention resolution resources.
Our objective in this paper is to investigate the impact
of limited contention resolution resources (henceforth “small
buffers”) on the end-to-end performance of real-time traffic
in an OPS network, and to develop means of managing the
performance degradation. We begin by observing that,
notwithstanding the high bandwidth available in OPS networks,
small buffers at OPS nodes cause significant loss when the
traffic exhibits short-time-scale burstiness. To alleviate this,
we advocate that traffic characteristics be modified before
injection into the OPS network, with the aim of using the
limited contention resolution resources more effectively. The
mechanism to achieve this, termed “pacing”, reduces the
short-time-scale burstiness of arbitrary traffic, but without
incurring significant delay penalties. Our contributions in this
context are three-fold. We first develop algorithms that perform
efficient real-time optimal pacing of high data rate traffic.
Our algorithms vary in complexity between amortised constant
time and poly-logarithmic time in the number of queued
packets, and are shown to be amenable to efficient hardware
implementation. Our second contribution develops a novel
analytic framework to quantify the impact of pacing on a
short or long range dependent traffic stream. It lets us estimate,
for various pacer delay budgets, the burstiness of the output
traffic stream and associated loss at a link with very small
buffers. The analytical estimates match well with simulation
and provide insight into the operation of the pacer. For our
final contribution, we demonstrate via simulation of realistic
OPS network topologies derived from operational networks
in Australia and the USA, that pacing can help achieve
acceptable network loss performance at the cost of a small and
controllable increase in flow end-to-end delays. We therefore
propose pacing as an attractive mechanism for the realisation
of cost-effective (i.e. with small buffers) OPS networks that
can provide the desired loss and delay performance.

[Fig. 1. Loss vs. buffer size at a finite-buffer switch: packet loss (log scale, 1e-06 to 1) versus buffer size (0-30 packets) for Poisson traffic at 40/60/80% load and LRD traffic at 40/50/60% load.]
The rest of this paper is organised as follows: section II
illustrates the performance impact of small buffers, discusses
prior work addressing this issue, and outlines our approach of
pacing packets. In section III we briefly describe the system
setting and recall results from off-line smoothing techniques
for video traffic relevant to our work, while section IV devel-
ops efficient algorithms for the real-time pacing of arbitrary
time-constrained traffic. In section V we quantify via analysis
the impact of pacing on traffic burstiness and loss for a single
flow, after which the loss-delay trade-off achievable via pacing
in realistic network scenarios is investigated via simulation in
section VI. The paper is concluded in section VII, which also
points to directions for future research.
II. SMALL BUFFERS: IMPACT AND SOLUTIONS
In this section we first illustrate via simulation the impact
of small buffers on losses for real-time traffic (the impact on
TCP traffic when mixed with real-time traffic is studied in
our subsequent work [16], [17]). We then briefly review some
prior solutions to reducing the performance degradation, and
outline our approach to tackling this through packet pacing.
A. Loss for Real-Time Traffic
A direct and obvious impact of small network buffers is
an increase in packet losses. As an example we consider a
single link with a queue of finite and small capacity. The link
rate is set at 10 Gbps, and packets have a constant size of
1250 bytes (this is consistent with earlier studies of slotted
OPS systems). Fig. 1 shows the packet losses as a function of
buffer size obtained from simulations of short range (Poisson)
as well as long range dependent (LRD) input traffic at various
system loads (the traffic model is detailed in section V). The
plots illustrate that an OPS node with very limited buffering
(say 10 to 20 packets) can experience significant losses even
at low to moderate traffic loads, particularly with the LRD
model which is more representative of real-world traffic. This
may be unacceptable in a core network supporting real-time
applications with stringent loss requirements.
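The Poisson half of this experiment is simple to reproduce. The sketch below is our minimal stand-in for such a simulator (an M/D/1/K-style queue; the LRD source is omitted), using the stated link rate and packet size:

import random
from collections import deque

def sim_loss(load, buf_pkts, n_pkts=200_000, link_bps=10e9, pkt_bytes=1250):
    # FIFO link with a finite buffer of buf_pkts packets (counted to
    # include the one in service -- one plausible convention), Poisson
    # arrivals, fixed-size packets: service takes 1 usec at 10 Gbps.
    service = pkt_bytes * 8 / link_bps
    t, lost = 0.0, 0
    departures = deque()            # departure times of packets in system
    for _ in range(n_pkts):
        t += random.expovariate(load / service)
        while departures and departures[0] <= t:
            departures.popleft()    # packets that left before this arrival
        if len(departures) >= buf_pkts:
            lost += 1               # buffer full: packet dropped
        else:
            start = departures[-1] if departures else t
            departures.append(max(start, t) + service)
    return lost / n_pkts

for b in (5, 10, 20, 30):
    print(b, sim_loss(load=0.8, buf_pkts=b))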
B. Prior Work
Some earlier works have proposed modifying traffic char-
acteristics to improve performance in optical networks with
limited buffers. The approach in [18]–[20] is to treat this as a
global scheduling problem, wherein packets are transmitted by
the optical edge nodes at appropriate time instants that meet
the packet’s time-constraints while minimising (a weighted
measure of) loss in the network. The general problem is shown
to be NP-hard [20], and approximate off-line [19], [20] and
on-line [18] algorithms are developed for restricted topologies.
Though theoretically insightful, these methods require global
network-wide co-ordinated scheduling amongst the nodes,
which is not practically feasible in packet networks.
Traffic shaping has been widely used for controlling packet
losses in electronic networks. ATM and IP networks have
used rate-based shaping methods such as GCRA or leaky-
bucket to protect the network against relatively longer-time-
scale rate fluctuations, while short-time-scale burstiness is
expected to be absorbed by router buffers. The requirements
for controlled loss, delay and delay variation in ATM networks
have stimulated extensive studies [21] of queueing behavior
for a broad range of traffic models and traffic control and
shaping policies. Low transmission delay requires small queue
lengths, and it has been conjectured that actively spacing
traffic might ensure negligible cell delay variation [21, pg.
121]; however, the provision of guaranteed delay constraints
in networks limited to small buffers has not been investigated.
When buffers at routers are small, we observed in simulations
that short-time-scale burstiness seems to be the main cause of
performance degradation. Rate-based shaping cannot protect
against such losses, since a low shaping rate leads to excessive
queueing delays at the shaper, while a large shaping rate is
ineffective in reducing short-time-scale burstiness. This has
been confirmed by studies in [22] and our earlier work in
[23]. What is therefore required is a means of reducing the
short-time-scale burstiness of the traffic without introducing
excessive delays; such traffic “pacing” is the topic of this
paper.
Parallel to our work, researchers at Stanford have developed
techniques that make networks with very small buffers work-
able. Following on from their arguments in [24] showing that
router buffer size need only be inversely proportional to the
square-root of the number of TCP flows, they have recently
shown in [25] that by making each TCP sender “pace” its
packet injections into the network, a router buffer size of 10-
20 packets suffices to realise near-maximum TCP throughput
performance. Though some aspects of the model are still
under discussion [26], [27], it would seem their identification
of the advantages of TCP pacing bolsters our proposal to
pace traffic at the optical edge. There are however significant
differences in our approaches: while their study considers
TCP traffic, our current study primarily focuses on non-TCP
real-time traffic (we have since extended our work [16], [17] to
scenarios that have mixed TCP and real-time traffic). Another
difference is that rather than pacing at the end-hosts, we focus
on pacing at the edge of the optical network. The former has
the advantage that it may require changes only in the end-host
TCP implementation. Nevertheless, packet spacing may not be
adequately preserved when traffic reaches the core network,
particularly if there is a significant volume of bursty real-
time traffic sharing links with the TCP traffic. Our approach
puts the pacing as close to the small buffer OPS network
as possible, potentially delivering improved performance for
all traffic. On the downside, our approach requires dedicated,
possibly expensive, high-speed pacing engines.
C. Packet Pacing
Pacing, also known as smoothing, has been studied before
in the context of video traffic transmission. Unlike a shaper,
which releases traffic at a given rate, a pacer accepts arbitrary
traffic with given delay constraints, and releases traffic that
is smoothest subject to the time-constraints of the traffic.
Here “smoothness” may be measured using the maximum
rate, rate variance, etc. The delay tolerance of traffic passing
through the pacer is crucial to the efficacy of the pacer: the
longer the traffic can be held back at the pacer, the larger the
window of opportunity the pacer has to smooth traffic and
reduce burstiness. A fundamental theoretical contribution in
[28] identifies an optimal strategy for the off-line smoothing of
stored video clips. This has led to several studies on dynamic
smoothing of broadcast video streams [29]–[31] (where a
few seconds of distribution delay is acceptable) as well as
interactive video streams [32] (wherein only a few frames can
be buffered at the smoother).
To the best of our knowledge, there has not been a study
(apart from our own [33], [34]) on the use of traffic pacing
techniques for alleviating contentions in OPS networks with
very small buffering resources. Our paper makes three new
contributions in this regard: (1) we develop new algorithms of
provably low complexity that can perform efficient pacing of
arbitrary traffic in real-time, (2) we develop a novel analytical
framework for estimating burstiness and loss of a paced traffic
stream, and (3) we quantify via simulation of realistic topolo-
gies the loss-delay tradeoffs that packet pacing facilitates. In
the next section we describe the architecture of the pacer and
elaborate on the optimal off-line algorithm which provides the
basis for our real-time pacing algorithms.
III. SYSTEM MODEL AND OFF-LINE OPTIMUM
The packet pacer smoothes traffic entering the OPS network,
and is therefore employed at the optical edge switches on
their egress links connecting to the all-optical packet switching
core. Note that the optical edge switches process packets
electronically, and are therefore assumed to have ample buffers
required to do the pacing. Once a packet enters the OPS
core, it is processed all-optically by each OPS core switch,
where buffering is limited. The idea of pacing is therefore
to modify the traffic profile entering the OPS network so
as to use the limited buffers more efficiently and reduce
losses, but without adversely affecting end-to-end delay. As
we mentioned in the previous section, rate-based shaping is
unsuitable as it does not effectively resolve short-time-scale
burstiness, while adversely affecting end-to-end delay. Our
pacing method instead smoothes traffic, that is, it minimises
output burstiness, subject to delay constraints. We show in this
paper that this approach is very effective in reducing short-
time-scale burstiness, and hence OPS losses, while preserving
end-to-end delay performance.

[Fig. 2. Model of the pacer: input packets pass through a CoS classifier and an EDF sorter that assigns per-class delay bounds τ, and a rate controller modulates the service rate of the output server.]

[Fig. 3. Arrival, deadline, and exit curves for an example workload process: workload (bytes) versus time, showing the arrival curve A(t), the deadline curve D(t) offset from it by the delay bound, and a feasible exit curve S(t) lying between them.]
A generic architecture of our pacer is shown in Fig. 2.
Incoming packets are classified (according to some criteria)
and assigned a deadline by which they are to be released by the
pacer. A special case we will consider later is when all packets
have identical delay constraints, in which case the architecture
can be simplified. The objective of the pacer is to produce the
smoothest output traffic such that each packet is released by
its deadline. It is natural for the pacer therefore to release
packets in order of deadline, namely to implement Earliest
Deadline First (EDF) service [35], [36], which has known
optimality properties [37] and can be implemented efficiently
[38]. However, the pacer, much like a traffic shaper, is non-
work-conserving, and in trying to produce a smooth output,
behaves as a variable rate server whose rate is modulated
by the deadlines of the waiting packets. The challenge is
in determining the rate modulation strategy that maximally
smoothes the output (this section) and in implementing this
scheme efficiently in real-time at high data rates (next section).
Our pacing strategy derives from studies of video traffic
smoothing, which we summarise next. Let [0, T ] denote the
time interval during which the pacing system is considered,
chosen such that the system is devoid of traffic at 0 and T .
Denote by A(t), 0 ≤ t ≤ T, the arrival curve, namely the
cumulative workload (say in units of bytes) arriving in [0, t).
Denote by D(t), 0 ≤ t ≤ T, the deadline curve, namely the
cumulative workload that has to be served in [0, t) so as not
to violate any deadlines (thus any traffic with deadline earlier
than t contributes to D(t)). Fig. 3 depicts an example A(t) and
D(t) for the case where all arriving traffic has identical delay
requirements. Note that by definition D(t) can never lie above
A(t). Any service schedule implemented by the pacer can be
represented by an exit curve S(t), 0 ≤ t ≤ T, corresponding to
the cumulative traffic released by the pacer in [0, t). A feasible
exit curve, namely one which is causal and satisfies the delay
constraint, must lie in the region bounded above by the arrival
curve A(t), and below by the deadline curve D(t).
Amongst all feasible exit curves, the one which corresponds
to the smoothest output traffic, measured by various metrics
such as transmission rate variance, has been shown in [28] to
be the shortest path between the origin (0, 0) and the point
(T, D(T)), as shown in Fig. 3. This curve always comprises
a sequence of straight-line segments joining points on the
arrival and deadline curves, each segment representing a period
during which the service rate is a constant. Computation of
this curve requires knowledge of the complete traffic arrival
curve, which restricts the approach to off-line applications
like the transmission of stored video files. For on-line video
transmission applications such as news and sports broadcasts
for which delays of seconds to minutes are tolerable, on-line
algorithms can be derived from the above off-line optimum by
maintaining a time window (i.e. delay buffer) to implement a
lookahead capability (see for example [29]–[31]). There has
also been some work in smoothing interactive video streams
[32] wherein a few frames are buffered at the smoother.
Unlike the video transmission context, smoothing or pacing
in OPS networks will have to operate under much more
demanding conditions. Current mechanisms for smoothing con-
sider one or a few video streams at end-hosts or video servers;
by contrast OPS edge nodes will have to perform the pacing
on large traffic aggregates at extremely high data rates. The
time-constraints for computing the optimal pacing patterns are
also much more stringent: unlike video smoothing, where
a few frames (tens to hundreds of milliseconds) of delay is
acceptable, OPS edge nodes will have to buffer traffic for a shorter
time lest the buffering requirement become prohibitively
expensive (at 10 Gbps, 1 msec of buffering needs 10 Mbits
of RAM). Our next section therefore develops algorithms that
are amenable to efficient implementation at OPS edge nodes.
IV. EFFICIENT REAL-TIME PACING
It is shown in [28] that an off-line pacer yields the smoothest
output traffic satisfying the delay constraints if its service
rate follows the shortest path lying between the arrival and
deadline curves. In the on-line case, however, the packet
arrival process is non-deterministic, and the arrival curve is not
known beforehand. In the absence of any assumptions about
// determine length and deadline of new packet p
1. L = length(p); T = currtime; Tp = T + d
// append new hull piece
2. h = new hullPiece
3. h.startT = (hullList.empty() ? T : hullList.tail().endT)
4. h.endT = Tp
5. h.slope = L / (Tp - h.startT)
6. hullList.append(h)
// scan backwards to restore hull convexity
7. h = hullList.tail()
8. while ((hPrev = h.prev) ≠ NULL && hPrev.slope ≤ h.slope)
9.   h.slope = [h.slope · (h.endT - h.startT)
        + hPrev.slope · (hPrev.endT - max(T, hPrev.startT))]
        / (h.endT - max(T, hPrev.startT))
10.  hullList.delete(hPrev)
11. end while // the hull is now convex
Fig. 4. On-line algorithm for hull update upon packet arrival
[Fig. 5. Example showing single-class hull update in amortised O(1) time: a new packet arrival appends hull piece D-E to the original hull O-A-B-C-D (operation a); convexity is violated at D, so C-D-E is replaced by C-E (operation b), then at C, so B-C-E is replaced by B-E (operation c); convexity holds at B and the scan stops (operation d), giving final hull O-A-B-E.]
future packet arrivals, our on-line algorithm determines the
smoothest output for the packets currently in the pacer. Thus
at time t, the arrival curve considered to the right of t is a
horizontal line (since future arrivals are not known yet), and
the shortest-path exit curve degenerates to the convex hull of
the deadline curve [33]. Upon each packet arrival, the deadline
curve is augmented, and this may require a recomputation of
the convex hull which defines the optimal exit curve. This
section develops algorithms for the efficient update of the
convex hull of the deadline curve upon each packet arrival.
A. Single Delay Class – Constant Amortised Cost Algorithm
We first consider the case where all packets entering the
pacer have identical delay constraints. This simplifies the
hull update algorithm since each packet arrival augments the
deadline curve at the end. Our first algorithm computes the
convex hull in O(1) amortised time per packet arrival.
Fig. 4 depicts this update algorithm performed upon each
packet arrival, and Fig. 5 illustrates the operations with an
example. Recalling that the convex hull is piecewise-linear,
we store it as a doubly linked list, where each element of the
list corresponds to a linear segment whose start/end times and
slope are maintained. In step 1 of the algorithm, the length
of the incoming packet is determined, along with its deadline.
The arrival of this new packet causes the deadline curve to
be amended, which results in a new segment being appended
to the hull. Steps 2-6 therefore create a new linear segment
with the appropriate slope and append it to the end of the
hull (shown by operation a in Fig. 5). The new piece may
cause the hull to lose convexity, since the newly added piece
may have slope larger than its preceding piece(s). Steps 7-
11 therefore scan the hull backwards and restore convexity.
If a hull piece has slope larger than its preceding piece, the
two can be combined into a single piece which joins the end-
points of the two pieces (as depicted by operations b and
c in Fig. 5). The backward scan repeatedly fuses hull pieces
until the slope of the last piece is smaller than the preceding
piece (operation d in Fig. 5). At this stage the hull is convex
and the backward scan can stop, resulting in the new hull.
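For concreteness, Fig. 4 renders into Python as below, with the doubly linked list replaced by a plain list of [startT, endT, slope] pieces; the fused piece's start time is set to max(T, hPrev.startT), which step 9 of the figure leaves implicit.

def hull_update(hull, pkt_len, now, d):
    # hull: list of [startT, endT, slope] pieces, convex on entry.
    # pkt_len: packet length (bytes); now: arrival time; d: delay bound.
    Tp = now + d                                  # step 1: packet deadline
    startT = hull[-1][1] if hull else now         # steps 2-6: append piece
    hull.append([startT, Tp, pkt_len / (Tp - startT)])
    # steps 7-11: scan backwards, fusing pieces until convexity holds
    while len(hull) >= 2 and hull[-2][2] <= hull[-1][2]:
        prev, last = hull[-2], hull[-1]
        s = max(now, prev[0])                     # unserved part of prev
        slope = (last[2] * (last[1] - last[0]) +
                 prev[2] * (prev[1] - s)) / (last[1] - s)
        hull[-2:] = [[s, last[1], slope]]         # fuse the two pieces
    return hull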
Claim 1: The algorithm of Fig. 4 has constant amortised
computation cost per packet arrival.
Proof: Our proof method follows the technique outlined
for amortised analysis in [39] that assigns a dollar cost to each
unit of computation. We start with the invariant that every point
on the hull has a $1 deposit associated with it. Upon packet
arrival, steps 1-6 are constant time operations, consuming $1
paid by the arriving packet. Further, an additional $1 is de-
posited at the end point of the newly added hull segment. The
loop in steps 7-11 walks backwards through the hull checking
for convexity at each hull point. Each check is a constant time
operation, and is paid for by the $1 deposited at the hull point.
If convexity fails, the hull point is removed, fusing two hull
pieces into one. If convexity holds, the arriving packet deposits
$1 at that hull point, and the algorithm terminates. Thus each
arriving packet has paid a constant $3 in computation cost,
and at termination of processing, a $1 deposit is still available
at each hull point, maintaining the invariant. This completes
the proof.
In spite of a constant amortised cost per packet arrival, a
packet arrival in the worst-case may cause all hull points to be
scanned (steps 7-11) in order to restore convexity. We therefore
develop in the next section an algorithm that has worst-case
complexity logarithmic in the number of packets queued at
the pacer.
The pacer releases packets based on the computed exit curve
in a fairly straightforward manner: a process accumulates
“credits” or “tokens” (much like a leaky-bucket) at an instanta-
neous rate stipulated by the slope of the piece (corresponding
to the current time) in the exit curve. A packet from the
head of the pacer queue is released as soon as sufficient
credits (corresponding to the packet size) are available, such
credits being deducted when the packet departs the pacer. The
released packet is eligible for transmission on the output link.
The link scheduler selects from amongst eligible packets for
transmission, and is assumed to be FIFO in this work.
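A minimal sketch of this release loop in discrete time steps (our illustration, not the authors' implementation), accruing credits at the slope of the current hull piece:

def pace(queue, hull, t, dt):
    # queue: FIFO list of packet sizes (bytes); hull: [startT, endT, slope]
    # pieces of the computed exit curve; t: current time; dt: time step.
    credits, released = 0.0, []
    while queue and hull:
        if hull[0][1] <= t:
            hull.pop(0)              # advance past an expired hull piece
            continue
        credits += hull[0][2] * dt   # accrue credits at the piece's rate
        t += dt
        while queue and credits >= queue[0]:
            credits -= queue[0]      # deduct the departing packet's size
            released.append((t, queue.pop(0)))
    return released                  # (release time, packet size) pairs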
B. Single Delay Class – Logarithmic Cost Algorithm
Our O(log n) worst-case complexity algorithm for optimal
exit curve computation is illustrated with an example in Fig. 6.

[Fig. 6. Example showing single-class hull update: a new packet arrival appends vertex H to the deadline curve; a binary search over the original hull O-A-B-C-D-E-F-G (first test vertex C, second test vertex E) finds the tangent D-H, giving final hull O-A-B-C-D-H.]

Starting with the original convex hull O-A-B-C-D-E-F-G, a
new packet arrival at time 0 adds a new point H to the deadline
curve. From H, we find a tangent to the original convex hull by
doing a binary search on the hull segment slopes. Operation
1 examines the mid-point C of the hull, realises that H-C
lies below the original hull, and so moves right. In operation
2 the mid-point E of the right half is examined, and it is
found that E-H lies above the original hull, so the algorithm
moves left, until it reaches point D in operation 3, which gives
the desired tangent and final hull O-A-B-C-D-H.
1) Determine size and deadline of newly arrived packet and create new vertex u.
2) Search in the AVL tree for the tangent point r from vertex u to the convex hull.
3) Delete all vertices with time larger than that of r, insert vertex u, and rebalance the AVL tree.
Fig. 7. Single-class hull update algorithm
Fig. 7 depicts the update algorithm in pseudocode. Recalling
that the deadline curve is piecewise-linear, and the start/end
times of the individual segments correspond to arrival of new
packets as illustrated in Fig. 6, we can represent it as a planar
polygonal line whose vertices v0, v1, . . . , vn are in increasing
order with respect to both axes. The vertices are stored in
a height-balanced search-tree structure T, the AVL tree for
example, with the value of time used as the search key. Along
with each vertex we also store pointers to its predecessor and
successor on the boundary of its convex hull.
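The tangent search of step 2 can be illustrated with an ordinary binary search over an array of hull vertices; the paper's AVL tree gives the same O(log n) search while also supporting step 3's deletions efficiently, so this array version is only a sketch.

def find_tangent(hull, u):
    # hull: list of (time, workload) vertices with decreasing edge slopes;
    # u: the new vertex created by the packet arrival. Returns the index
    # of the tangent point r, the vertex where slope(r, u) falls between
    # the slopes of r's preceding and following hull edges.
    def slope(a, b):
        return (b[1] - a[1]) / (b[0] - a[0])
    lo, hi = 0, len(hull) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if slope(hull[mid], u) <= slope(hull[mid], hull[mid + 1]):
            lo = mid + 1    # chord dips below the hull: tangent lies right
        else:
            hi = mid        # chord clears the hull: tangent is here or left
    return lo
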
In step 1 of the algorithm, the size of the incoming packet
is determined, along with its deadline, and a new vertex u
is created. The arrival of the new packet, which causes the
deadline curve to be amended, results in appending a new
vertex to the end of the hull. The new vertex may cause it to
lose convexity, since the newly added segment may have slope
larger than that of the preceding hull edge. Step 2 therefore
searches the tree T for the unique vertex r such that slope
of the segment connecting r and u is smaller than that of
the preceding hull edge but larger than that of the following
hull edge. The search process is a binary search of the AVL
tree, as described by Preparata [40, procedure TANGENT]. At

Citations
Journal ArticleDOI
31 Mar 2009
TL;DR: This paper provides a synopsis of the recently proposed buffer sizing strategies and broadly classifies them according to their desired objective: link utilisation, and per-flow performance, and discusses the pros and cons of these different approaches.
Abstract: The past few years have witnessed a lot of debate on how large Internet router buffers should be. The widely believed rule-of-thumb used by router manufacturers today mandates a buffer size equal to the delay-bandwidth product. This rule was first challenged by researchers in 2004 who argued that if there are a large number of long-lived TCP connections flowing through a router, then the buffer size needed is equal to the delay-bandwidth product divided by the square root of the number of long-lived TCP flows. The publication of this result has since reinvigorated interest in the buffer sizing problem with numerous other papers exploring this topic in further detail - ranging from papers questioning the applicability of this result to proposing alternate schemes to developing new congestion control algorithms, etc. This paper provides a synopsis of the recently proposed buffer sizing strategies and broadly classifies them according to their desired objective: link utilisation, and per-flow performance. We discuss the pros and cons of these different approaches. These prior works study buffer sizing purely in the context of TCP. Subsequently, we present arguments that take into account both real-time and TCP traffic. We also report on the performance studies of various high-speed TCP variants and experimental results for networks with limited buffers. We conclude this paper by outlining some interesting avenues for further research.

107 citations

Journal ArticleDOI
TL;DR: The results reveal that CUBIC and YeAH outperform the other high-speed TCP variants in different cases of buffer size; however, they still require more improvement to extend their ability to fully utilize the high-speed bandwidths, especially when the applied buffer is close to or less than the BDP of the link.

35 citations


Cites background from "Packet pacing in small buffer optic..."

  • ...…et al., 2006, Beheshti et al., 2006, Prasad et al., 2007, Vishwanath and Sivaraman, 2008, Vishwanath et al., 2009a,b, Vishwanath and Sivaraman, 2009, Sivaraman et al., 2009, LeGrange et al., 2009, Vishwanath et al., 2011), to fit the all-fiber networks which is the fastest type of high-speed…...


Journal ArticleDOI
TL;DR: This study is the first to consider interactions between real-time and TCP traffic in very small (potentially all-optical) buffers and informs router manufacturers and network operators of the factors to consider when dimensioning such small buffer sizes for desired performance balance.
Abstract: In the past few years there has been vigorous debate regarding the size of buffers required at core Internet routers. Recent arguments supported by theory and experimentation show that under certain conditions, core router buffer sizes of a few tens of packets suffice for realizing acceptable end-to-end TCP throughputs. This is a significant step toward the realization of optical packet switched (OPS) networks, which are inherently limited in their ability to buffer optical signals. However, prior studies have largely ignored the presence of real-time traffic, which is increasing in importance as a source of revenue for Internet service providers. In this paper, we study the interaction that happens between real-time (open-loop) and TCP (closed-loop) traffic when they multiplex at buffers of very small size (few tens of packets) and make a significant discovery - namely that in a specific range of buffer size, real-time traffic losses increase as buffer size becomes larger. Our contributions pertaining to this anomalous behavior are threefold. First, we exhibit this anomalous loss performance for real-time traffic via extensive simulations using synthetic traffic and real video traces. Second, we develop quantitative models that reveal the dynamics of buffer sharing between real-time and TCP traffic that lead to this behavior. Third, we show how various factors such as the nature of real-time traffic, mixture of long-lived and short-lived TCP flows, and packet sizes impact the severity of the anomaly. Our study is the first to consider interactions between real-time and TCP traffic in very small (potentially all-optical) buffers and informs router manufacturers and network operators of the factors to consider when dimensioning such small buffer sizes for desired performance balance between real-time and TCP traffic.

30 citations

Journal ArticleDOI
TL;DR: The benefits of pacing in practical scenarios multiplexing both TCP and real-time traffic are demonstrated, highlighting that unlike host pacing that requires adoption by a critical mass of users, edge pacing can be deployed relatively easily under service provider control to facilitate rapid migration to core networks with small buffers.

10 citations

Proceedings ArticleDOI
04 Jul 2011
TL;DR: This paper presents the design and prototype of a hardware implementation of a packet pacing system based on the NetFPGA system, and shows that traffic pacing can be implemented with few hardware resources and without reducing system throughput.
Abstract: Optical packet switching networks promise to provide high-speed data communication and serve as the foundation of the future Internet. A key technological problem is the very small size of packet buffers that can be implemented in the optical domain. Existing protocols, for example the transmission control protocol, do not perform well in such small-buffer networks. To address this problem, we have proposed techniques for actively pacing traffic to ensure that traffic bursts are reduced or eliminated and thus do not cause packet losses in routers with small buffers. In this paper, we present the design and prototype of a hardware implementation of a packet pacing system based on the NetFPGA system. Our results show that traffic pacing can be implemented with few hardware resources and without reducing system throughput. Therefore, we believe traffic pacing can be deployed widely to improve the operation of current and future networks.

9 citations


Cites background from "Packet pacing in small buffer optic..."

  • ...Sivaraman et al. [17] show the impact of small buffers on real-time and TCP traffic and identify short-time-scale burstiness as the major contributor to degradation....


References
Book
01 Jan 1990
TL;DR: The updated new edition of the classic Introduction to Algorithms is intended primarily for use in undergraduate or graduate courses in algorithms or data structures and presents a rich variety of algorithms and covers them in considerable depth while making their design and analysis accessible to all levels of readers.
Abstract: From the Publisher: The updated new edition of the classic Introduction to Algorithms is intended primarily for use in undergraduate or graduate courses in algorithms or data structures. Like the first edition, this text can also be used for self-study by technical professionals since it discusses engineering issues in algorithm design as well as the mathematical aspects. In its new edition, Introduction to Algorithms continues to provide a comprehensive introduction to the modern study of algorithms. The revision has been updated to reflect changes in the years since the book's original publication. New chapters on the role of algorithms in computing and on probabilistic analysis and randomized algorithms have been included. Sections throughout the book have been rewritten for increased clarity, and material has been added wherever a fuller explanation has seemed useful or new information warrants expanded coverage. As in the classic first edition, this new edition of Introduction to Algorithms presents a rich variety of algorithms and covers them in considerable depth while making their design and analysis accessible to all levels of readers. Further, the algorithms are presented in pseudocode to make the book easily accessible to students from all programming language backgrounds. Each chapter presents an algorithm, a design technique, an application area, or a related topic. The chapters are not dependent on one another, so the instructor can organize his or her use of the book in the way that best suits the course's needs. Additionally, the new edition offers a 25% increase over the first edition in the number of problems, giving the book 155 problems and over 900 exercises that reinforce the concepts the students are learning.

21,651 citations


"Packet pacing in small buffer optic..." refers methods in this paper

  • ...Proof: Our proof method follows the technique outlined for amortized analysis in [39] that assigns a dollar cost to each unit of computation....


01 Jan 2005

19,250 citations

Book
01 Jan 1974
TL;DR: This text introduces the basic data structures and programming techniques often used in efficient algorithms, and covers use of lists, push-down stacks, queues, trees, and graphs.
Abstract: From the Publisher: With this text, you gain an understanding of the fundamental concepts of algorithms, the very heart of computer science. It introduces the basic data structures and programming techniques often used in efficient algorithms. Covers use of lists, push-down stacks, queues, trees, and graphs. Later chapters go into sorting, searching and graphing algorithms, the string-matching algorithms, and the Schonhage-Strassen integer-multiplication algorithm. Provides numerous graded exercises at the end of each chapter.

9,262 citations


"Packet pacing in small buffer optic..." refers methods in this paper

  • ...The tree is then divided about the split vertex so that all the leaves to its left, together with the vertex itself, are in one 2-3 tree and all the leaves to its right are in a second 2-3 tree. The division is a recursive process detailed in [41, section 4.12]....


Book
01 Jan 1998
TL;DR: The second edition of Optical Networks: A Practical Perspective succeeds the first as the authoritative source for information on optical networking technologies and techniques as discussed by the authors, covering componentry and transmission in detail but also emphasizing the practical networking issues that affect organizations as they evaluate, deploy, or develop optical solutions.
Abstract: This fully updated and expanded second edition of Optical Networks: A Practical Perspective succeeds the first as the authoritative source for information on optical networking technologies and techniques. Written by two of the field's most respected individuals, it covers componentry and transmission in detail but also emphasizes the practical networking issues that affect organizations as they evaluate, deploy, or develop optical solutions.

2,282 citations

Journal Article
TL;DR: The general concept of OBS protocols and in particular, those based on Just-Enough-Time (JET), is described, along with the applicability ofOBS protocols to IP over WDM, and the performance of JET-based OBS Protocols is evaluated.
Abstract: To support bursty traffic on the Internet (and especially WWW) efficiently, optical burst switching (OBS) is proposed as a way to streamline both protocols and hardware in building the future generation Optical Internet. By leveraging the attractive properties of optical communications and at the same time, taking into account its limitations, OBS combines the best of optical circuit-switching and packet/cell switching. In this paper, the general concept of OBS protocols and in particular, those based on Just-Enough-Time (JET), is described, along with the applicability of OBS protocols to IP over WDM. Specific issues such as the use of fiber delay-lines (FDLs) for accommodating processing delay and/or resolving conflicts are also discussed. In addition, the performance of JET-based OBS protocols which use an offset time along with delayed reservation to achieve efficient utilization of both bandwidth and FDLs as well as to support priority-based routing is evaluated.

1,997 citations


"Packet pacing in small buffer optic..." refers background in this paper

  • ...To support data traffic efficiently, various optical sub-wavelength switching methods such as in [2], [3] have been proposed, of which optical packet switching (OPS) [4] is particularly attractive....


Frequently Asked Questions (2)
Q1. What have the authors contributed in "Packet pacing in small buffer optical packet switched networks"?

To mitigate the performance degradation resulting from small buffers, this paper proposes that optical edge nodes “pace” the injection of traffic into the OPS core. Second, the authors provide an analytical quantification of the benefits of pacing in reducing traffic burstiness and traffic loss at a link with very small buffers. Third, the authors show via simulations of realistic network topologies that pacing can significantly reduce network losses at the expense of a small and bounded increase in end-to-end delay for real-time traffic flows. The authors argue that the loss-delay trade-off mechanism provided by pacing can be instrumental in overcoming the performance hurdle arising from the scarcity of buffers in OPS networks.

Their future work targets a deeper study of TCP performance, particularly when mixed with real-time traffic [16], [17].