
Enhancing TCP's Loss Recovery Using Limited Transmit

01 Jan 2001 - RFC 3042, pp. 1-9
TL;DR: This document proposes Limited Transmit, a TCP mechanism that sends a new data segment in response to each of the first two duplicate acknowledgments, increasing the chance of recovering a lost segment via fast retransmit rather than a costly retransmission timeout.
Abstract: This document proposes a new Transmission Control Protocol (TCP) mechanism that can be used to more effectively recover lost segments when a connection's congestion window is small, or when a large number of segments are lost in a single transmission window. The "Limited Transmit" algorithm calls for sending a new data segment in response to each of the first two duplicate acknowledgments that arrive at the sender. Transmitting these segments increases the probability that TCP can recover from a single lost segment using the fast retransmit algorithm, rather than using a costly retransmission timeout. Limited Transmit can be used both in conjunction with, and in the absence of, the TCP selective acknowledgment (SACK) mechanism.
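As a rough illustration, here is a minimal Python sketch of the rule described above; the names are hypothetical and the normative conditions live in the RFC itself (e.g., the receiver's advertised window must permit the new segment, and outstanding data must stay within cwnd plus two segments).

```python
# Minimal sketch of the Limited Transmit rule described in the abstract.
# Names are illustrative, not taken from the RFC's text.

DUPTHRESH = 3  # duplicate ACKs needed to trigger fast retransmit

def on_duplicate_ack(dupack_count):
    """Decide the sender's action for the dupack_count-th duplicate ACK."""
    if dupack_count < DUPTHRESH:
        # Limited Transmit: send one previously unsent segment, without
        # changing cwnd, to keep the ACK clock running.
        return "send one new segment (Limited Transmit)"
    # Standard fast retransmit takes over on the third duplicate ACK.
    return "fast retransmit the presumed-lost segment"

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "->", on_duplicate_ack(n))
```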


Citations
01 Apr 2004
Abstract: The purpose of this document is to advance NewReno TCP's Fast Retransmit and Fast Recovery algorithms in RFC 2582 from Experimental to Standards Track status.

1,602 citations

Proceedings ArticleDOI
16 Aug 2009
TL;DR: This paper uses high-resolution timers to enable microsecond-granularity TCP timeouts and shows that eliminating the minimum retransmission timeout bound is safe for all environments, including the wide-area.
Abstract: This paper presents a practical solution to a problem facing high-fan-in, high-bandwidth synchronized TCP workloads in datacenter Ethernets: the TCP incast problem. In these networks, receivers can experience a drastic reduction in application throughput when simultaneously requesting data from many servers using TCP. Inbound data overfills small switch buffers, leading to TCP timeouts lasting hundreds of milliseconds. For many datacenter workloads that have a barrier synchronization requirement (e.g., filesystem reads and parallel data-intensive queries), throughput is reduced by up to 90%. For latency-sensitive applications, TCP timeouts in the datacenter impose delays of hundreds of milliseconds in networks with round-trip times in microseconds. Our practical solution uses high-resolution timers to enable microsecond-granularity TCP timeouts. We demonstrate that this technique is effective in avoiding TCP incast collapse in simulation and in real-world experiments. We show that eliminating the minimum retransmission timeout bound is safe for all environments, including the wide-area.
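To make the timer change concrete, here is a hedged Python sketch of the standard RTO estimator (per RFC 6298) with a configurable minimum bound; the paper's proposal corresponds to fine-grained timers with that bound effectively removed. The class and parameter names are assumptions, not the authors' code.

```python
# Standard RTO estimator (Jacobson/Karn style, as in RFC 6298) with a
# configurable floor. Removing the floor is safe only with timers fine
# enough to act on microsecond-scale RTTs, which is the paper's point.

ALPHA, BETA, K = 1/8, 1/4, 4   # standard smoothing gains

class RtoEstimator:
    def __init__(self, rto_min=0.2):   # 200 ms is a common default floor
        self.srtt = None
        self.rttvar = None
        self.rto_min = rto_min

    def on_rtt_sample(self, r):
        if self.srtt is None:          # first measurement
            self.srtt, self.rttvar = r, r / 2
        else:
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r
        return max(self.rto_min, self.srtt + K * self.rttvar)

# Datacenter setting in the spirit of the paper: no floor, microsecond RTTs.
est = RtoEstimator(rto_min=0.0)
print(est.on_rtt_sample(100e-6))   # ~300 microseconds instead of 200 ms
```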

483 citations


Cites background from "Enhancing TCP's Loss Recovery Using..."

  • ...TCP mechanisms such as Limited Transmit [1] were specifically designed to help TCP recover from packet loss when window sizes are small—exactly the problem that occurs during incast collapse....


  • ...Prior work characterizing TCP incast collapse ended on a somewhat down note, finding that TCP improvements (NewReno, SACK [22], RED [13], ECN [30], Limited Transmit [1], and modifications to Slow Start) sometimes increased throughput, but did not substantially prevent TCP incast collapse because the majority of timeouts were caused by full window losses [28]....


Journal ArticleDOI
01 Jan 2002
TL;DR: This paper illustrates the impact of reordering on TCP performance, and proposes several alternatives to dynamically make the fast retransmission algorithm more tolerant of the reordering observed in the network.
Abstract: Previous research indicates that packet reordering is not a rare event on some Internet paths. Reordering can cause performance problems for TCP's fast retransmission algorithm, which uses the arrival of duplicate acknowledgments to detect segment loss. Duplicate acknowledgments can be caused by the loss of a segment or by the reordering of segments by the network. In this paper we illustrate the impact of reordering on TCP performance. In addition, we show the performance of a conservative approach to "undo" the congestion control state changes made in conjunction with spurious retransmissions. Finally, we propose several alternatives to dynamically make the fast retransmission algorithm more tolerant of the reordering observed in the network and assess these algorithms.
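The following Python sketch (not the paper's exact algorithm; names are illustrative) shows the core idea behind one such alternative: adapt the duplicate-ACK threshold upward when a retransmission turns out to have been spurious.

```python
# Sketch of a reordering-tolerant fast retransmit: raise the duplicate-ACK
# threshold after a retransmission proves spurious, so reordering of similar
# depth no longer triggers a false fast retransmit. Illustrative only.

class AdaptiveDupThresh:
    def __init__(self):
        self.dupthresh = 3   # standard initial value

    def on_spurious_retransmit(self, reordering_depth):
        # The segment arrived late rather than being lost: next time, wait
        # for at least that many duplicate ACKs before retransmitting.
        self.dupthresh = max(self.dupthresh, reordering_depth + 1)

    def should_fast_retransmit(self, dupacks):
        return dupacks >= self.dupthresh
```

The trade-off, as the paper's framing suggests, is between avoiding spurious retransmissions (higher threshold) and reacting quickly to genuine loss (lower threshold).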

322 citations

Proceedings Article
26 Feb 2008
TL;DR: This paper analyzes the Incast problem, explores its sensitivity to various system parameters, and examines the effectiveness of alternative TCP- and Ethernet-level strategies in mitigating the TCP throughput collapse.
Abstract: Cluster-based and iSCSI-based storage systems rely on standard TCP/IP-over-Ethernet for client access to data. Unfortunately, when data is striped over multiple networked storage nodes, a client can experience a TCP throughput collapse that results in much lower read bandwidth than should be provided by the available network links. Conceptually, this problem arises because the client simultaneously reads fragments of a data block from multiple sources that together send enough data to overload the switch buffers on the client's link. This paper analyzes this Incast problem, explores its sensitivity to various system parameters, and examines the effectiveness of alternative TCP- and Ethernet-level strategies in mitigating the TCP throughput collapse.
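A back-of-envelope calculation makes the buffer-overload mechanism concrete; the buffer and fragment sizes below are assumptions chosen for illustration, not measurements from the paper.

```python
# Illustration (assumed numbers): why synchronized striped reads overflow
# a shallow switch buffer as the number of storage servers grows.

buffer_bytes = 64 * 1024        # assumed per-port switch buffer: 64 KB
fragment_bytes = 32 * 1024      # assumed per-server response fragment: 32 KB

for n_servers in (2, 4, 8, 16):
    burst = n_servers * fragment_bytes
    print(f"{n_servers:2d} servers -> {burst // 1024} KB burst "
          f"({'overflows' if burst > buffer_bytes else 'fits in'} "
          f"{buffer_bytes // 1024} KB buffer)")
```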

280 citations



Journal ArticleDOI
01 Apr 2005
TL;DR: Measurement results showing the impact of the current network environment on a number of traditional and proposed protocol mechanisms are provided and can be used to guide the definition of more realistic Internet modeling scenarios.
Abstract: In this paper we explore the evolution of both the Internet's most heavily used transport protocol, TCP, and the current network environment with respect to how the network's evolution ultimately impacts end-to-end protocols. The traditional end-to-end assumptions about the Internet are increasingly challenged by the introduction of intermediary network elements (middleboxes) that intentionally or unintentionally prevent or alter the behavior of end-to-end communications. This paper provides measurement results showing the impact of the current network environment on a number of traditional and proposed protocol mechanisms (e.g., Path MTU Discovery, Explicit Congestion Notification, etc.). In addition, we investigate the prevalence and correctness of implementations using proposed TCP algorithmic and protocol changes (e.g., selective acknowledgment-based loss recovery, congestion window growth based on byte counting, etc.). We present results of measurements taken using an active measurement framework to study web servers and a passive measurement survey of clients accessing information from our web server. We analyze our results to gain further understanding of the differences between the behavior of the Internet in theory versus the behavior we observed through measurements. In addition, these measurements can be used to guide the definition of more realistic Internet modeling scenarios. Finally, we present several lessons that will benefit others taking Internet measurements.

242 citations

References
Journal ArticleDOI
01 Aug 1988
TL;DR: Measurements and the reports of beta testers suggest that the seven new algorithms added to 4BSD TCP are fairly good at dealing with congested conditions on the Internet.
Abstract: In October of '86, the Internet had the first of what became a series of 'congestion collapses'. During this period, the data throughput from LBL to UC Berkeley (sites separated by 400 yards and three IMP hops) dropped from 32 Kbps to 40 bps. Mike Karels and I were fascinated by this sudden factor-of-thousand drop in bandwidth and embarked on an investigation of why things had gotten so bad. We wondered, in particular, if the 4.3BSD (Berkeley UNIX) TCP was mis-behaving or if it could be tuned to work better under abysmal network conditions. The answer to both of these questions was "yes". Since that time, we have put seven new algorithms into the 4BSD TCP: (i) round-trip-time variance estimation; (ii) exponential retransmit timer backoff; (iii) slow-start; (iv) more aggressive receiver ack policy; (v) dynamic window sizing on congestion; (vi) Karn's clamped retransmit backoff; (vii) fast retransmit. Our measurements and the reports of beta testers suggest that the final product is fairly good at dealing with congested conditions on the Internet. This paper is a brief description of (i) - (v) and the rationale behind them. (vi) is an algorithm recently developed by Phil Karn of Bell Communications Research, described in [KP87]. (vii) is described in a soon-to-be-published RFC. Algorithms (i) - (v) spring from one observation: the flow on a TCP connection (or ISO TP-4 or Xerox NS SPP connection) should obey a 'conservation of packets' principle. And, if this principle were obeyed, congestion collapse would become the exception rather than the rule. Thus congestion control involves finding places that violate conservation and fixing them. By 'conservation of packets' I mean that for a connection 'in equilibrium', i.e., running stably with a full window of data in transit, the packet flow is what a physicist would call 'conservative': a new packet isn't put into the network until an old packet leaves. The physics of flow predicts that systems with this property should be robust in the face of congestion. Observation of the Internet suggests that it was not particularly robust. Why the discrepancy? There are only three ways for packet conservation to fail: (1) the connection doesn't get to equilibrium, or (2) a sender injects a new packet before an old packet has exited, or (3) the equilibrium can't be reached because of resource limits along the path. In the following sections, we treat each of these in turn.
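Algorithm (i) is often summarized by the scaled-integer filter widely reproduced from this paper's appendix; the sketch below restates it in Python. Treat it as a sketch following the customary sa/sv convention, not the paper's verbatim code.

```python
# Scaled-integer RTT estimator in the style of Jacobson's appendix: srtt is
# kept scaled by 8 and the mean deviation by 4, so the filter needs only
# shifts and adds.

def rtt_update(sa, sv, m):
    """sa: srtt * 8, sv: mean deviation * 4, m: new RTT sample (ticks).
    Returns updated (sa, sv, rto)."""
    m -= sa >> 3          # error term: sample minus current estimate
    sa += m               # srtt += error / 8  (in scaled units)
    if m < 0:
        m = -m            # |error|
    m -= sv >> 2
    sv += m               # mdev += (|error| - mdev) / 4  (scaled units)
    rto = (sa >> 3) + sv  # rto = srtt + 4 * mdev  (unscaled)
    return sa, sv, rto

# Example: srtt = 500 ticks (sa = 4000), mdev = 125 ticks (sv = 500); a
# 700-tick sample yields (4200, 575, 1100), i.e. srtt 525, rto 1100 ticks.
print(rtt_update(8 * 500, 4 * 125, 700))
```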

5,620 citations

01 Mar 1997
TL;DR: This document defines the requirement keywords used in IETF specifications (MUST, SHOULD, MAY, and their variants), states how they should be interpreted, and gives authors a boilerplate phrase to incorporate near the beginning of their documents.
Abstract: In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. Authors who follow these guidelines should incorporate this phrase near the beginning of their document: "The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL NOT', 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'MAY', and 'OPTIONAL' in this document are to be interpreted as described in RFC 2119."

3,501 citations


Additional excerpts

  • ...The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this document, are to be interpreted as described in [B97]....


01 Sep 1981
Transmission Control Protocol (RFC 793)

3,411 citations

01 Apr 1999
TL;DR: This document defines TCP's four intertwined congestion control algorithms: slow start, congestion avoidance, fast retransmit, and fast recovery, as well as discussing various acknowledgment generation methods.
Abstract: This document defines TCP's four intertwined congestion control algorithms: slow start, congestion avoidance, fast retransmit, and fast recovery. In addition, the document specifies how TCP should begin transmission after a relatively long idle period, as well as discussing various acknowledgment generation methods.
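As a compact illustration of how the four algorithms fit together, here is a simplified Python sketch; constants and window accounting are simplified, and RFC 2581 and its successors remain normative.

```python
# Simplified sketch of slow start, congestion avoidance, fast retransmit,
# and fast recovery working on a single congestion window variable.

MSS = 1460

class CongestionControl:
    def __init__(self):
        self.cwnd = 2 * MSS        # initial window (simplified)
        self.ssthresh = 64 * 1024

    def on_new_ack(self):
        if self.cwnd < self.ssthresh:
            self.cwnd += MSS                      # slow start: exponential
        else:
            self.cwnd += MSS * MSS // self.cwnd   # congestion avoidance: ~linear

    def on_triple_dupack(self, flight_size):
        # fast retransmit: resend the missing segment immediately; then
        # fast recovery: halve the window instead of collapsing to 1 MSS.
        self.ssthresh = max(flight_size // 2, 2 * MSS)
        self.cwnd = self.ssthresh + 3 * MSS       # inflate by the 3 dup ACKs

    def on_timeout(self):
        self.ssthresh = max(self.cwnd // 2, 2 * MSS)
        self.cwnd = MSS                           # restart in slow start
```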

2,237 citations

01 Oct 1996
TL;DR: TCP may experience poor performance when multiple packets are lost from one window of data because of the limited information available from cumulative acknowledgments.
Abstract: TCP may experience poor performance when multiple packets are lost from one window of data. With the limited information available from cumulative acknowledgments, a TCP sender can only learn about a single lost packet per round trip time. An aggressive sender could choose to retransmit packets early, but such retransmitted segments may have already been successfully received.
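The sketch below illustrates the contrast: given SACK blocks (RFC 2018), a sender can enumerate every hole in the window from a single ACK, where a cumulative ACK alone reveals only the first. The helper function and its names are illustrative.

```python
# With SACK, one ACK exposes all holes in the window at once; with only a
# cumulative ACK, the sender learns about a single hole per round trip.

def holes_from_sack(cum_ack, sack_blocks, high_seq):
    """Return sequence ranges not yet reported received by the peer.
    sack_blocks: list of (start, end) ranges the receiver holds."""
    missing, edge = [], cum_ack
    for start, end in sorted(sack_blocks):
        if start > edge:
            missing.append((edge, start))   # a hole before this SACK block
        edge = max(edge, end)
    if edge < high_seq:
        missing.append((edge, high_seq))
    return missing

# Two losses in one window are visible from a single ACK:
print(holes_from_sack(1000, [(2000, 3000), (4000, 5000)], 6000))
# -> [(1000, 2000), (3000, 4000), (5000, 6000)]
```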

1,639 citations