
Improving Performance of QUIC in WiFi

TL;DR: It is shown that a Bursty QUIC (BQUIC), i.e., a customized version of QUIC that is targeted to increase its burstiness, can achieve better performance in WiFi, with throughput gains between 20% and 30%.
Abstract: QUIC is a new transport protocol under standardization since 2016. Initially developed by Google as an experiment, the protocol is already deployed at large scale, thanks to its support in Chromium and Google's servers. In this paper we experimentally analyze the performance of QUIC in WiFi networks. We perform experiments using both a controlled WiFi testbed and a production WiFi mesh network. In particular, we study how QUIC interplays with MAC layer features such as IEEE 802.11 frame aggregation. We show that the current implementation of QUIC in Chromium achieves sub-optimal throughput in wireless networks. Indeed, burstiness in modern WiFi standards may improve network performance, and we show that a Bursty QUIC (BQUIC), i.e., a customized version of QUIC that is targeted to increase its burstiness, can achieve better performance in WiFi. BQUIC outperforms the current version of QUIC in WiFi, with throughput gains between 20% and 30%.

Summary (3 min read)

Introduction

  • With the exponential growth in adoption of mobile phones and other smart connected devices, the usage of wireless networks continues to grow.
  • On the other hand, efforts from content providers in developing customized mobile versions of websites and new low-latency transport protocols such as QUIC have contributed to improve user experience.
  • The authors start from the observation that Chromium’s implementation of QUIC (version 39) has suboptimal performance in WiFi.
  • As mentioned above, frame aggregation is a key feature to achieve high throughput in recent 802.11 standards and burstiness increases aggregation opportunities.

B. QUIC protocol

  • QUIC provides several cross-layer enhancements, covering the weaknesses of TCP for transporting web content.
  • QUIC also provides an improved congestion controller, better RTT estimation, and a better loss recovery mechanism than TCP.
  • TCP_ACKING, which generates an ACK for every 2 received packets, was the default mode of QUIC in Chromium at the time of writing.
  • In either mode, if an out-of-order packet or a previously missing packet is received, the ACK is sent without any delay to inform the sender immediately about it.
  • Reducing traffic burstiness is known to prevent congestion and, as a consequence, reduce the undesirable effects of packet loss.

C. Transport protocols enhancements for WiFi

  • To the best of their knowledge, no previous work has evaluated the performance of QUIC in WiFi by exploring its interactions with the wireless medium and 802.11 enhancements such as frame aggregation.
  • Many works propose to reduce TCP acknowledgement frequency in wireless networks.
  • Oliveira et al. [12] propose Dynamic Adaptive Acknowledgement, where the delay window is adjusted according to the channel condition.
  • Reducing the number of ACKs saves wireless resources and reduces interferences with other packets.
  • The authors show, for example, that since QUIC runs in user space it incurs performance penalties, particularly on mobile devices, which are usually constrained in processing power.

III. METHODOLOGY

  • The authors now describe their methodology, covering their customization to Chromium’s QUIC implementation (Sec. III-A), their test environment (Sec. III-B and Sec. III-C) and the performance metrics used in the experiments (Sec. III-D).
  • The results presented in the evaluation have been obtained by parsing traces captured with tcpdump.

C. Wireless community network

  • The authors' second testbed is a production wireless community network deployed in Sants, a neighborhood of the city of Barcelona [16].
  • The nodes use the Linux/OpenWrt [18] based distribution provided by the Quick Mesh Project (QMP) [19], which runs the BMX6 mesh routing protocol [20].
  • QMPSU is part of Guifi.net [21], a larger community network started in 2004 that has more than 30,000 operative nodes.
  • Fig. 1 shows the geographic location of active nodes and links, using distinct colors to represent wireless links configured with different channels.
  • There are also a number of point-to-point links using Ubiquiti parabolic antennas running the original manufacturer firmware.

D. Measuring performance

  • The authors have selected 10 websites from Alexa’s top 100 list, downloaded their landing pages and other publicly available pages and hosted them on their servers.
  • The selected websites are a mix of social networks, online shopping, news and search engines.
  • The main characteristics of the cloned pages are summarized in Tab. II.
  • The authors load these pages from the clients using the default Chromium QUIC implementation and BQUIC.
  • The authors parse the HAR file and the captured traffic to calculate various metrics such as the page load time (PLT), throughput, and packet inter-arrival time over 30 runs.

A. Bulk transfer throughput

  • Tab. III shows the mean values of the measured throughput and end-to-end loss (%) obtained by downloading their 10 MB synthetic web page during the 100 runs.
  • The losses have been computed by comparing the identification field of the IP header of transmitted and received datagrams.
  • This is mainly because the antenna gain of the RPi is higher than in the smartphone, and thus, the network card can use MCS with higher bitrates during the transfer.
  • The throughput gain of BQUIC over QUIC is similar for both devices (26% and 23% in the smartphone and the RPi, respectively).
  • Despite these differences and the large variations of measured throughput between mesh nodes (1.97 Mbps for RP2 and 20.9 Mbps for RP4 with QUIC) the authors observe significant performance improvements (between 20% and 31%) in all cases.

B. Web page load time

  • In order to see the impact of BQUIC upon different types of web browsing the authors perform experiments using the cloned websites.
  • Fig. 2 shows the mean PLT for various cloned web pages, which are summarized in Tab. II.
  • Using BQUIC the authors observe a decrease in PLT for all websites ranging from 5% for small web pages such as Google and Live up to 25% for large web pages such as Amazon and Facebook.
  • The authors can see that the larger the page is, the larger is the reduction of the PLT.
  • This is an expected result, since the connection establishment, which includes the exchange of certificates, has a larger relative overhead for small web pages.

C. Detailed analysis

  • To better understand the difference in behavior of QUIC and BQUIC, the authors now perform a detailed analysis for one of the experimental runs from Section IV-A with RP5 node.
  • The authors observe similar trends for the other nodes, albeit to different extent.
  • In QUIC the slow start phase exits when increasing delay is detected.
  • Fig. 4 shows the ACK reception (blue vertical bar) and packet transmission (yellow circle) at the sender side during an interval of 20 ms.
  • Note that, despite BQUIC being more bursty than QUIC at packet level, as shown before, Fig. 6 depicts similar variations of throughput at larger time scale.

V. CONCLUSIONS

  • The authors analyzed the performance of QUIC in WiFi, investigating the interactions of the protocol with 802.11 frame aggregation.
  • The authors first highlighted that Chromium’s QUIC (v.39) delivers sub-optimal throughput in typical WiFi scenarios.
  • The root-cause is the way QUIC paces packets in the network and its acknowledgment mechanisms.
  • The authors carried out experiments using both a controlled testbed and a production WMN.


Improving Performance of QUIC in WiFi
Jawad Manzoor, Llorenç Cerdà-Alabern, Ramin Sadre and Idilio Drago
Université Catholique de Louvain, {jawad.manzoor,ramin.sadre}@uclouvain.be
Universitat Politècnica de Catalunya, llorenc@ac.upc.edu
Politecnico di Torino, idilio.drago@polito.it
I. INTRODUCTION
With the exponential growth in adoption of mobile phones
and other smart connected devices, the usage of wireless
networks continues to grow. Today, wireless networks are
commonplace in every sector of life including homes, offices,
restaurants, hospitals and university campuses. In a recent
Cisco white paper [1] it is predicted that wireless and mobile
device traffic will exceed that of PCs, and comprise more than
63 percent of total IP traffic by 2021. There is a continuous
increase in user demand for richer mobile web content and
reduced loading time. At the same time, complex applications
and web content put a large computational burden on mobile
devices, which are in general more resource-constrained than
PCs and laptops.
Nonetheless, technological advancements have made it possible to overcome these issues and provide a better user experience. On one hand, WiFi technologies, which are based on the IEEE 802.11 standards, have undergone an enormous evolution in recent years [2]. For instance, the 802.11n standard, which is predominant nowadays, allows modulation and coding schemes (MCS) that support data rates up to 600 Mbps. It also includes a
high throughput enhancement called frame aggregation, which
consists of combining two or more data frames into a single
transmission, thus reducing the fixed overhead associated with
each frame transmission. On the other hand, efforts from
content providers in developing customized mobile versions
of websites and new low-latency transport protocols such as
QUIC have contributed to improving the user experience. QUIC
started as an experimental protocol designed by Google and
has emerged as a serious alternative to TCP. In a recent
measurement study [3], it was estimated that around 7% of
the Internet traffic is QUIC.
In this paper we experimentally analyze the performance of
QUIC in WiFi networks. We start from the observation that
Chromium’s implementation of QUIC (version 39) has sub-
optimal performance in WiFi. We then investigate root-causes
for the problem, finding that some of the QUIC features in
Chromium impair the protocol performance in WiFi networks.
In particular, bursty traffic has been traditionally considered
undesirable, since it can lead to longer queuing delays and
multiple consecutive losses which are more difficult to recover.
This fact has motivated QUIC to be designed with packet
pacing [4], with the aim to reduce burstiness and, thus, packet
losses. While this consequence is generally true in wired
networks, we observe that due to the characteristics of the
WiFi medium and interactions between transport and MAC
protocols, bursty traffic might actually be beneficial in WiFi.
One apparent reason for this behavior is frame aggregation. As
mentioned above, frame aggregation is a key feature to achieve
high throughput in recent 802.11 standards and burstiness
increases aggregation opportunities.
Therefore, we implement and evaluate a Bursty QUIC
(BQUIC) in Chromium, which is a customized version of
QUIC targeted to increase traffic burstiness by reducing
the transport layer acknowledgment frequency and disabling
packet pacing. We show the advantage of BQUIC for two
different use cases of WiFi: (i) in home or enterprise wireless
local area networks (WLANs), and (ii) wireless mesh networks
(WMN) such as Guifi.net, MadMesh, Meraki and Google
WiFi. We analyze these use cases by taking measurements in
a lab and a production WMN. Our experimental results show
that increasing burstiness of QUIC improves its throughput in
WiFi, with gains ranging between 20% to 30%.
Our results are a step forward for understanding perfor-
mance trade-offs in QUIC. They can help in the design of
the standard protocol, which would extract better performance
from lower layer protocols. Since layer-2 protocols in wired
and wireless networks are nowadays significantly different,
we believe that the transport protocol can be improved by
individually tuning it for both networks.
II. BACKGROUND AND RELATED WORK
A. 802.11 and frame aggregation
802.11 is a set of IEEE standards that regulate wireless
transmission. A recent measurement study [5] with millions of
Cisco Meraki access points (APs) shows that around 99% of

the APs use the 802.11n and 802.11ac standards. These stan-
dards introduce enhancements to increase data rates. Among
them, frame aggregation is a simple method to enhance
throughput.
The wireless medium has a high overhead, which includes
the MAC and PHY headers, acknowledgments (ACK), backoff
time and inter-frame spacing. For ACKs and small segments,
the overhead in terms of bytes can be higher than the actual
payload. The frame aggregation scheme amortizes this over-
head and achieves high data throughput by combining multiple
data frames into a single transmission unit.
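The amortization argument above can be made concrete with a small calculation. The following is a minimal sketch, not a model of any real 802.11 PHY: the timing values `PAYLOAD_US` and `OVERHEAD_US` are hypothetical and were chosen only to show the trend, and the function name is ours.

```python
def airtime_efficiency(n_frames, payload_us, overhead_us):
    """Fraction of airtime spent on payload when n_frames share one
    fixed per-transmission overhead (headers, backoff, spacing, ACK)."""
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    payload_time = n_frames * payload_us
    return payload_time / (payload_time + overhead_us)


# Hypothetical timings (assumptions, not measured values).
PAYLOAD_US = 40.0    # airtime of one data frame payload
OVERHEAD_US = 160.0  # fixed overhead paid once per transmission

for n in (1, 2, 8, 32):
    print(n, round(airtime_efficiency(n, PAYLOAD_US, OVERHEAD_US), 3))
```

Under these assumed numbers the efficiency grows with the number of aggregated frames, which is why a sender that keeps the MAC queue filled with back-to-back packets gives the aggregation scheme more to work with.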
Aggregation in the WiFi MAC architecture is supported at two layers. In the first layer multiple MAC service data units (MSDUs) are aggregated into an A-MSDU. In the second layer multiple A-MSDUs are combined to form an aggregated MAC protocol data unit (A-MPDU). A detailed description of these concepts can be found in [6]. Their impact on throughput has also been extensively evaluated [6], [7], [8]. We here evaluate how to profit from these mechanisms to improve QUIC performance.
B. QUIC protocol
QUIC is a user-space transport protocol running on top of
UDP. QUIC provides several cross-layer enhancements, cover-
ing the weaknesses of TCP for transporting web content. For
example, in the case of HTTP/2 running over TLS and TCP,
the loss of a single TCP packet blocks all HTTP/2 streams,
since a single connection is shared by all streams. QUIC
instead is designed to handle the streams, thus eliminating
head-of-line blocking delays in case of a packet loss.
QUIC profits from recent advances in TLS, implementing
new TLS 1.3 concepts such as zero handshake latency. That
is, QUIC is able to reduce latency by reusing credentials of
known servers on repeated connections. QUIC also provides
an improved congestion controller, better RTT estimation, and
a better loss recovery mechanism than TCP.
QUIC is still under development, thus its features and
operations are not completely standardized yet. To understand
its internal workings, we have studied the QUIC source code
in the open-source Chromium project [9]. Our experiments in
this paper have been performed with QUIC version 39.
Two aspects of QUIC implementation particularly influence
how the protocol interacts with 802.11 frame aggregation: (i)
acknowledgment modes, and (ii) packet pacing.
1) QUIC acknowledgment modes: Chromium’s implementation of QUIC includes two acknowledgment modes:
TCP_ACKING: This mode is similar to TCP delayed acknowledgment, in which an ACK is generated for every 2 received packets in accordance with RFC 1122. This was the default mode of QUIC in Chromium at the time of writing.
ACK_DECIMATION: In this mode acknowledgments are delayed up to a maximum of 10 packets (unless unlimited_decimation is enabled) and a cumulative ACK is generated. The maximum duration for which the ACK can be delayed is 25 ms, and the actual delay_time is calculated on the fly as the minimum of max_delay_time and one quarter of the minimum RTT observed during the session. ACK_DECIMATION is only considered after at least 100 packets have been received, to avoid interfering with slow start. This mode was disabled in Chromium at the time of writing.
In either mode, if an out-of-order packet or a previously missing packet is received, the ACK is sent without any delay to inform the sender immediately about it.
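The rules above can be summarized in a short sketch of the receiver-side decision. This is an illustration under stated assumptions, not Chromium's code: the names `AckState`, `should_ack_now` and `ack_delay_ms` are hypothetical, the sketch assumes the receiver acknowledges every 2 packets until the 100-packet threshold is reached, and it only counts packets (the delay timer is computed but not scheduled).

```python
from dataclasses import dataclass

MAX_DELAY_MS = 25.0      # maximum ACK delay given in the text
DECIMATION_PACKETS = 10  # ACK_DECIMATION: at most 10 packets per ACK
MIN_RECEIVED = 100       # decimation is considered after 100 packets
EARLY_PACKETS = 2        # assumption: ACK every 2 packets before that


def ack_delay_ms(min_rtt_ms):
    """Delay timer: minimum of the 25 ms cap and a quarter of the min RTT."""
    return min(MAX_DELAY_MS, min_rtt_ms / 4.0)


@dataclass
class AckState:
    received: int = 0  # packets received during the session
    unacked: int = 0   # packets received since the last ACK


def should_ack_now(state, in_order):
    """Return True if the packet that just arrived triggers an ACK."""
    state.received += 1
    state.unacked += 1
    if not in_order:
        threshold = 1  # out-of-order or missing packet: ACK immediately
    elif state.received < MIN_RECEIVED:
        threshold = EARLY_PACKETS
    else:
        threshold = DECIMATION_PACKETS
    if state.unacked >= threshold:
        state.unacked = 0
        return True
    return False


state = AckState()
acks = sum(should_ack_now(state, in_order=True) for _ in range(1000))
print(acks, ack_delay_ms(8.0))
```

With an RTT of 6 ms to 10 ms, as in the testbeds described later, the computed delay timer stays well below the 25 ms cap, so in this sketch the packet count is what usually triggers the ACK.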
2) Packet pacing: Chromium’s implementation of QUIC
includes a packet pacing mechanism that aims to reduce send-
ing bursts of packets by introducing delay between consecutive
packets. Reducing traffic burstiness is known to prevent con-
gestion and, as a consequence, reduce the undesirable effects
of packet loss. However, it may also hinder performance in
high-speed networks with low loss rates.
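To illustrate what pacing does to packet spacing, the sketch below spreads packets at a given rate and compares that with an unpaced burst. It is a simplified illustration, not Chromium's pacer: the function names, the 1350-byte packet size and the 20 Mbps rate are assumptions, and the unpaced case is idealized as all packets leaving at time zero.

```python
def pacing_interval_us(packet_bytes, pacing_rate_mbps):
    """Gap between consecutive packets: bits / (Mbit/s) gives microseconds."""
    if pacing_rate_mbps <= 0:
        raise ValueError("pacing rate must be positive")
    return packet_bytes * 8 / pacing_rate_mbps


def send_times_us(n_packets, packet_bytes, pacing_rate_mbps=None):
    """Departure times of a window of packets.

    pacing_rate_mbps=None models an unpaced sender that releases the
    whole window back to back (idealized as time zero for all packets).
    """
    if pacing_rate_mbps is None:
        return [0.0] * n_packets
    gap = pacing_interval_us(packet_bytes, pacing_rate_mbps)
    return [i * gap for i in range(n_packets)]


# Hypothetical values: 1350-byte packets paced at 20 Mbps.
print(send_times_us(4, 1350, 20.0))  # evenly spaced departures
print(send_times_us(4, 1350))        # one burst
```

The paced schedule hands packets to the MAC layer one at a time, whereas the burst leaves several packets queued together, which is the situation frame aggregation can exploit.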
C. Transport protocols enhancements for WiFi
To the best of our knowledge, no previous work has evalu-
ated the performance of QUIC in WiFi by exploring its inter-
actions with the wireless medium and 802.11 enhancements
such as frame aggregation. However, given the popularity of
TCP, it is not a surprise that previous works have targeted
similar problems in TCP deployments.
Considering the role of TCP acknowledgments for the
protocol reliability, reducing the acknowledgement frequency
and performing delayed cumulative acknowledgements may
provide benefits in wireless networks. Many works propose to
reduce TCP acknowledgement frequency in wireless networks.
Altman et al. [10] investigated the impact of delaying TCP acknowledgements beyond the two segments recommended by RFC 1122. Singh et al. [11] propose TCP with adaptive delayed acknowledgement, which aims to reduce the number of ACKs to one per congestion window. Oliveira et al. [12] propose Dynamic Adaptive Acknowledgement, where the delay window is adjusted according
to the channel condition. In [13], the same authors provide
an improved delaying window strategy for robustness against
losses.
These works agree that a key factor affecting TCP performance in wireless networks is the contention and collision between ACK and data packets. Reducing the number of ACKs saves wireless resources and reduces interference with other packets. Moreover, lowering the ACK frequency increases the burstiness of traffic, as the sender releases a micro-burst of packets after receiving the cumulative ACK. This behavior may reduce the inter-packet time, increasing the opportunities for frame aggregation at the 802.11 MAC layer.
The closest work to ours is [14], which provides a deep view on QUIC performance. The authors show, for example, that since QUIC runs in user space it incurs performance penalties, particularly on mobile devices, which are usually constrained in processing power. We extend the knowledge about QUIC performance here, showing how the protocol interacts with features of lower-layer protocols in WiFi.
III. METHODOLOGY
We now describe our methodology, covering our customization of Chromium’s QUIC implementation (Sec. III-A), our test environment (Sec. III-B and Sec. III-C) and the performance metrics used in the experiments (Sec. III-D). The results presented in the evaluation have been obtained by parsing traces captured with tcpdump.
A. Bursty QUIC
Our goal is to study QUIC’s behavior and improve its performance in WiFi networks. Based on the observations in the previous section, we believe that QUIC traffic, being smooth rather than bursty, cannot fully exploit frame aggregation in the WiFi MAC layer. Therefore, we tune QUIC to produce bursty traffic: we have compiled a version of Chromium with ACK_DECIMATION as the default acknowledgment mode and without packet pacing. Disabling packet pacing is important, since pacing can neutralize the burstiness created by ACK_DECIMATION. These two features are not controllable from the browser configuration, so we had to study the source code and make the required modifications. We call this tuned version Bursty QUIC (BQUIC).
We focus only on WiFi and do not evaluate the scenarios
with wired or hybrid connectivity between client and server.
There are concerns about the impact of high burstiness on
packet drops and queuing delays in wired networks, particu-
larly in long Internet paths. However, we will show that the
performance gains in the WiFi environment are high and can
potentially overshadow other effects. Moreover, in real-world
scenarios service providers are increasingly deploying caches
and CDN nodes closer to end-users [15]. Thus, the scenario
tested in the following is already popular and tends to become
widespread as more servers are deployed closer to users.
In the following experiments we place the server geographically close to the clients, i.e., at the WiFi access point or mesh network gateway, with an average RTT in the range of 6 ms to 10 ms between the clients and the server. Performing experiments at Internet-wide scale is left for future work. We carry out experiments in two testbeds: (i) a controlled lab and (ii) a real production mesh network.
B. Lab testbed
Our lab testbed consists of a client connected to a WiFi
router, all using 802.11n. The WiFi router in our testbed is
connected to a server through a Gigabit Ethernet connection.
We use two devices as clients: (i) an Android smartphone with a quad-core ARM Cortex-A57 CPU and 2 GB RAM running Android 6.0 and (ii) a Raspberry Pi 3 (RPi) with a quad-core ARM Cortex-A53 CPU and 1 GB RAM running Debian 9. The server in our testbed has a quad-core Intel Core i5-3470 CPU and 8 GB RAM and runs Ubuntu 16.04. We cross-compile the Chromium browser with QUIC and BQUIC for ARM and Android and deploy it on the respective clients.
C. Wireless community network
Our second testbed is a production wireless community network deployed in Sants, a neighborhood of the city of Barcelona (Spain) [16]. The network has been operative since 2009 and in 2012 was joined by nodes installed at Universitat Politècnica de Catalunya (UPC) within the EU CONFINE project [17]. The nodes use the Linux/OpenWrt [18] based distribution provided by the Quick Mesh Project (QMP) [19], which runs the BMX6 mesh routing protocol [20]. From now on we will refer to this network as QMPSU. QMPSU is part of Guifi.net [21], a larger community network started in 2004 that has more than 30,000 operative nodes. At the time of writing QMPSU has around 80 active nodes. Fig. 1 shows the geographic location of active nodes and links, using distinct colors to represent wireless links configured with different channels. In QMPSU there are 2 gateways that connect QMPSU to the rest of Guifi.net and the Internet.
Figure 1: QMPSU geographical topology. Colors indicate links configured in the same WiFi channel. [Map with axes x (km) and y (km); marked locations: RP1 to RP5, server, UPC Campus Nord.]
QMPSU is 802.11an-based and the most common hardware is the Ubiquiti NanoStation M5, equipped with a sectorial antenna and running QMP firmware. There are also a number of point-to-point links using Ubiquiti parabolic antennas running the original manufacturer firmware. A detailed description of QMPSU can be found in [22], and a live monitoring page updated hourly can be accessed on-line [23].
QMPSU has been deployed by its own users. Its unplanned growth in an urban area, using heterogeneous WiFi devices, has produced a high diversity in the quality of the links. Thus, it offers a very realistic testbed to evaluate the performance of QUIC under a variety of conditions.
We deploy five RPi clients, attached through their Ethernet ports, at the premises of different volunteers across the QMPSU network. Moreover, we set up a server in one of the gateways of QMPSU to the Internet. These nodes are marked as RP and server, respectively, in Fig. 1.
The server has an Intel dual-core CPU and 8 GB RAM,
running Ubuntu 16.04. The hardware specifications of the RPi
are similar to the smartphone used in the Lab testbed, and we
will show later that lab results with smartphone and RPi are
quite similar. For this reason we only use RPis for experiments
in QMPSU for convenience of deployment and maintenance.
Tab. I shows the number of wireless hops from the clients
to the gateway (W-hops). Note that many of these hops use
different frequencies and thus are not interfering with each
other.

Table I: Characteristics of the client locations.
RP     Name                           W-hops
RP1    BCNevaristoarnus5Rd3-BPi       4
RP2    GS-BCNpisuerga17Rd1            3
RP3    GS26gener10-8710               3
RP4    GSgV-rb-dce0                   1
RP5    BCNJardiBotanicSants186-ba35   5
Table II: Statistics of cloned web pages with the number of objects of various file types.
Website     HTML   CSS   JS   Image   Other   Total   Size (kB)
Google         2     1    3       5       1      12          56
Live           2     2    2       2       0       8         262
Twitter        6     1    4       2       3      16         421
Wikipedia      1     1    2      20       1      25         441
Reddit         4     2    5      26       2      39         470
Yahoo         16    13    5      48       4      86         839
Ebay           4     1    6       3      14      28         985
Instagram      3     1    7      25       1      37       1 409
YouTube        8     3    5     113      20     149       2 911
Facebook       1     1    8     123       1     134       3 560
Amazon         5     2   14      41       2      64       3 723
D. Measuring performance
We have selected 10 websites from Alexa’s top 100 list,
downloaded their landing pages and other publicly available
pages and hosted them on our servers. The selected websites
are a mix of social networks, online shopping, news and search
engines. The main characteristics of the cloned pages are
summarized in Tab. II.
We load these pages from the clients using the default Chromium QUIC implementation and BQUIC. To automate the page loading we use Chrome-HAR-capturer (see footnote 1) to connect to remote clients in the lab or WMN and load the pages repeatedly while capturing traffic at both client and server sides.
We parse the HAR file and the captured traffic to calculate
various metrics such as the page load time (PLT), throughput,
and packet inter-arrival time over 30 runs. We also analyze
data segments and ACK packets. We compute throughput by dividing the number of bits sent in the UDP payloads of the QUIC connections by the duration of the transfer, ignoring connection establishment time. We have also instrumented the web server to log the CWND size on every acknowledgment.
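The throughput definition above can be sketched as follows. This is a minimal illustration under assumptions: the packet list is taken to be already extracted from a tcpdump trace as `(timestamp_s, udp_payload_bytes)` pairs, and `handshake_end_s`, the instant at which connection establishment is considered finished, is a hypothetical parameter supplied by the caller.

```python
def throughput_mbps(packets, handshake_end_s=0.0):
    """Throughput in Mbps of one QUIC connection.

    packets: iterable of (timestamp_s, udp_payload_bytes) pairs.
    Packets sent before handshake_end_s are ignored, mirroring the
    exclusion of connection establishment time.
    """
    data = sorted(p for p in packets if p[0] >= handshake_end_s)
    if len(data) < 2:
        raise ValueError("need at least two data packets")
    duration_s = data[-1][0] - data[0][0]
    if duration_s <= 0:
        raise ValueError("transfer duration must be positive")
    total_bits = 8 * sum(size for _, size in data)
    return total_bits / duration_s / 1e6


# Toy trace: one handshake packet, then three large payloads.
trace = [(0.0, 1200), (0.5, 1_250_000), (1.0, 1_250_000), (2.0, 1_250_000)]
print(throughput_mbps(trace, handshake_end_s=0.5))
```

Whether the first data packet's bytes should count toward the total is a boundary choice; for transfers of thousands of packets it has little effect on the result.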
To emulate bulk file transfers, we have created a synthetic
web page with a large image of 10 MB that we have
downloaded in 100 runs over 10 days from the lab nodes and
the mesh nodes. We calculate the mean throughput and 95%
confidence interval for all runs. We also compute the relative
improvement achieved by BQUIC, referred to as gain.
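The summary statistics described above can be sketched as below. Two assumptions are made that the text does not spell out: the confidence interval uses the normal approximation (reasonable for 100 runs), and the gain is taken as the relative difference of the mean throughputs with QUIC as the baseline.

```python
import math
import statistics


def mean_ci95(samples):
    """Mean and half-width of a 95% confidence interval
    (normal approximation, an assumption of this sketch)."""
    mean = statistics.mean(samples)
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width


def gain_percent(quic_mbps, bquic_mbps):
    """Relative improvement of BQUIC over QUIC, in percent."""
    return 100.0 * (bquic_mbps - quic_mbps) / quic_mbps


# Toy per-run throughputs, made up for illustration only.
runs = [10.0, 12.0, 14.0]
print(mean_ci95(runs))
print(gain_percent(34.8, 43.8))
```

Applied to the rounded means reported for the Android client (34.8 and 43.8 Mbps), this definition gives a value that rounds to the 26% gain listed in Tab. III; for other rows, recomputing from rounded means may differ from the reported gain by about one percentage point.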
IV. EVALUATION
A. Bulk transfer throughput
Tab. III shows the mean values of the measured throughput and end-to-end loss (%) obtained by downloading our 10 MB synthetic web page during the 100 runs. The losses have been computed by comparing the identification field of the IP header of transmitted and received datagrams. We have computed the 95% confidence intervals for throughput, and they are small in all cases (less than 10%). In the lab testbed, we can see that the RPi achieves higher throughput than the smartphone. This is mainly because the antenna gain of the RPi is higher than that of the smartphone, and thus the network card can use MCS with higher bitrates during the transfer. However, the throughput gain of BQUIC over QUIC is similar for both devices (26% and 23% in the smartphone and the RPi, respectively).
Table III: Performance comparison of QUIC and BQUIC
                Throughput (Mbps)         Loss rate (%)
Device          QUIC     BQUIC    Gain    QUIC    BQUIC
Lab
  Android       34.8     43.8     26%     -       -
  RPi           42.38    52.3     23%     -       -
Mesh network
  RP1           8.2      10.6     29%     0.63    0.8
  RP2           1.97     2.54     28%     0.79    1.23
  RP3           12.8     15.5     20%     0.36    0.4
  RP4           20.9     27.2     30%     0.01    0.01
  RP5           18.6     24.5     31%     0.05    0.06
Footnote 1: https://github.com/cyrus-and/chrome-har-capturer
Regarding the losses measured at the transport layer, Tab. III
shows that they are negligible. This is normal on a WiFi link
of an acceptable quality, since 802.11 retransmits lost unicast
frames multiple times before abandoning its transmission.
Indeed, the worst connected device (RP2) has a loss of only 0.79% in QUIC and 1.23% in BQUIC. We can see that the loss rate increases slightly in BQUIC, but it remains small. Moreover, the high throughput gain achieved by BQUIC outweighs this negative effect.
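The loss measurement described earlier in this subsection, which matches the IP identification field of sent and received datagrams, can be sketched as a set difference. This is an illustration only: it assumes the identification values have already been extracted from the sender-side and receiver-side traces, and that they are unique within one run (the field is 16 bits wide, so it wraps after 65,536 datagrams).

```python
def loss_percent(sent_ids, received_ids):
    """End-to-end loss in percent, from IP identification values
    seen at the sender and at the receiver."""
    sent = set(sent_ids)
    if not sent:
        raise ValueError("no datagrams were sent")
    lost = sent - set(received_ids)
    return 100.0 * len(lost) / len(sent)


# Toy example: datagram 3 never reaches the receiver.
print(loss_percent([1, 2, 3, 4], [1, 2, 4]))
```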
Notice that there are significant differences between the clients in terms of delays, number of hops and link capacities, which is expected in a production network. Despite these differences and the large variations of measured throughput between mesh nodes (1.97 Mbps for RP2 and 20.9 Mbps for RP4 with QUIC) we observe significant performance improvements (between 20% and 31%) in all cases. Since each node is affected by many factors, such as the antenna hardware, firmware and wireless link conditions, investigating the low-level details to find the root cause of the observed differences is out of the scope of this paper. Our main objective is to experimentally show that burstiness increases the performance of QUIC in WiFi.
B. Web page load time
In order to see the impact of BQUIC upon different types
of web browsing we perform experiments using the cloned
websites. RP5 is used as client. Fig. 2 shows the mean PLT
for various cloned web pages which are summarized in Tab. II.
Using BQUIC we observe a decrease in PLT for all websites
ranging from 5% for small web pages such as Google and
Live up to 25% for large web pages such as Amazon and
Facebook. We can see that the larger the page is, the larger the reduction in PLT. This is an expected result, since the connection establishment, which includes the exchange of certificates, has a larger relative overhead for small web pages.
Figure 2: Relative reduction in PLT achieved for various websites by using BQUIC vs QUIC. [Plot; y-axis: transfer time reduction (%), 0 to 30; x-axis: google, live, twitter, wikipedia, reddit, yahoo, ebay, instagram, facebook, amazon.]
Figure 3: CWND size comparison. [Plot; x-axis: time [s]; y-axis: cwnd (kbytes); series: QUIC, BQUIC.]
C. Detailed analysis
To better understand the difference in behavior of QUIC
and BQUIC, we now perform a detailed analysis for one of
the experimental runs from Section IV-A with RP5 node. We
observe similar trends for the other nodes, albeit to different
extent.
1) CWND size: We instrument the web server to log the
size of the congestion window upon each acknowledgement.
Fig. 3 shows the CWND size of standard QUIC and BQUIC.
We can see that QUIC exits from slow start phase much earlier
than BQUIC and thus achieves lower throughput. In QUIC
the slow start phase exits when increasing delay is detected.
The detection algorithm is called on every new ACK frame
and a new RTT measurement is performed. If the minimum
delay of the first few packets of the current burst exceeds the
minimum delay during the session by a certain threshold, the
slow start phase exits. The early exit from slow start in QUIC is plausibly due to packet pacing, which reduces aggregation opportunities and allows only a few packets to be transmitted together. The following packets are transmitted in separate units, each after gaining access to the wireless medium, which adds extra delay. The algorithm detects the increased delay and exits slow start. In BQUIC there is no packet pacing and the
inter packet time is much smaller which allows a large number
of consecutive packets to be aggregated and transmitted as part
of a single unit. Therefore the CWND increases to a much
larger value before exiting slow start.
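The delay-based exit condition described in this subsection can be sketched as a single predicate. This is a simplified illustration, not Chromium's implementation: the function name is ours, and the number of samples per burst (`n_samples`) and the delay margin (`threshold_ms`) are hypothetical values, since the text only speaks of "the first few packets" and "a certain threshold".

```python
def should_exit_slow_start(burst_rtts_ms, session_min_rtt_ms,
                           n_samples=8, threshold_ms=4.0):
    """Delay-increase check run on each new ACK.

    burst_rtts_ms: RTT samples of the current burst, in arrival order.
    Returns True when the minimum RTT over the first n_samples of the
    burst exceeds the session minimum by more than threshold_ms.
    """
    if len(burst_rtts_ms) < n_samples:
        return False  # not enough samples in this burst yet
    burst_min = min(burst_rtts_ms[:n_samples])
    return burst_min > session_min_rtt_ms + threshold_ms


# Paced packets that each wait for medium access see inflated RTTs.
print(should_exit_slow_start([12.0] * 8, session_min_rtt_ms=6.0))
# Aggregated packets share one medium access and stay near the minimum.
print(should_exit_slow_start([7.0] * 8, session_min_rtt_ms=6.0))
```

The two calls mirror the explanation above: extra per-packet medium-access delay pushes the burst minimum over the margin and ends slow start early, while closely spaced packets keep it below the margin.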
2) Sequence-Acknowledgement analysis: Fig. 4 shows the
ACK reception (blue vertical bar) and packet transmission
(yellow circle) at the sender side during an interval of 20 ms.
Figure 4: Transmission of data packets (circles) and reception of ACKs (vertical bars) at the sender side. Panels: QUIC and BQUIC; x-axis: time [s]; y-axis: packet number.
Figure 5: Inter packet time histogram and ECDF for QUIC and BQUIC. x-axis: inter packet time [µs] (log10 scale); y-axes: count ×10^3 and ECDF (annotated values: 0.18 and 0.89).
The effect of packet pacing in QUIC can be observed in the
upper sub-figure where segments are mostly evenly spaced
from one another. In the middle of the figure many ACKs
are received together and as a consequence many segments
are transmitted by the sender, which can be observed by
a steeper slope. The bottom sub-figure represents BQUIC.
The figure shows far fewer ACKs in the interval, due to
ACK_DECIMATION. Furthermore, since a large window is
acknowledged by each ACK and pacing is disabled, a burst
of packets is released shortly after an ACK is received.
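The two effects seen in Fig. 4 can be approximated with a small Python sketch. It is a simplified model under stated assumptions, not the behavior of any particular QUIC implementation: `ack_every` (how many packets one ACK covers) and `pacing_gap_us` (the spacing imposed by the pacer) are hypothetical parameters, and a gap of zero stands in for "pacing disabled".

```python
def ack_count(packets_received: int, ack_every: int) -> int:
    """ACKs sent if the receiver acknowledges every `ack_every` packets.

    The final, possibly partial, batch is still acknowledged (ceiling division).
    """
    return -(-packets_received // ack_every)


def release_times(ack_time_us, newly_acked: int, pacing_gap_us):
    """Send times of the packets released by a single ACK.

    With pacing_gap_us == 0 all packets leave back to back (a burst);
    with a positive gap they are spread evenly after the ACK arrives.
    """
    return [ack_time_us + i * pacing_gap_us for i in range(newly_acked)]
```

For example, acknowledging every 10 packets instead of every 2 cuts the ACKs for 100 packets from 50 to 10, and each of those ACKs then releases a larger group of packets at once when no pacing gap is applied.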
3) Inter packet time: We measure the inter packet time
(IPT) at the server side to get a better insight into the different
acknowledgement strategies. Fig. 5 shows the IPT histogram
(upper sub-figure) and the empirical cumulative distribution
function, ECDF (lower sub-figure). We can see that in BQUIC
89% of packets are sent with an IPT lower than 100 µs, while
in QUIC this value is only 18%. The histogram shows two
peaks in QUIC around 100 µs and 500 µs due to different
pacing rates used by QUIC during this execution. The pacing
rate is decided by QUIC on the fly depending on the link
conditions such as bandwidth, RTT etc., and varies between
different nodes in the mesh and even different runs using the
same node. On the other hand, BQUIC IPT is very small and
is concentrated around 30 µs.
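As an illustration of how such IPT statistics can be derived from a packet trace, the following Python sketch computes inter packet times from send timestamps and evaluates the ECDF at a given value. The function names and the use of integer microsecond timestamps are assumptions made for this example; the capture tooling used in the experiments is not specified here.

```python
def inter_packet_times(timestamps_us):
    """Gaps between consecutive packet send times, in microseconds."""
    ordered = sorted(timestamps_us)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def fraction_below(samples, limit):
    """Empirical CDF evaluated just below `limit`: share of samples < limit."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(1 for value in samples if value < limit) / len(samples)


# Synthetic trace: three packets sent back to back, a pause, then one more.
trace_us = [0, 30, 60, 560, 590]
ipt_us = inter_packet_times(trace_us)
share_under_100us = fraction_below(ipt_us, 100)
```

Applied to a real sender-side trace, `fraction_below(ipt_us, 100)` corresponds to the quantity reported above as 89% for BQUIC and 18% for QUIC.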
4) Throughput: Fig. 6 shows the throughput of QUIC and
BQUIC computed by averaging over intervals of 50 ms during
the bulk transfer time (recall that we compute the throughput

Citations
Journal ArticleDOI
TL;DR: A comprehensive study on the performance of QUIC in Wireless Mesh Networks (WMN), which shows that while QUIC outperforms TCP in wired networks, it exhibits significantly lower performance than TCP in the WMN.
Abstract: The exponential growth in adoption of mobile phones and the widespread availability of wireless networks has caused a paradigm shift in the way we access the Internet. It has not only eased access to the Internet, but also increased users’ appetite for responsive services. New protocols to speed up Internet applications have naturally emerged. The QUIC transport protocol is one prominent case. Initially developed by Google as an experiment, the protocol has already made phenomenal strides, thanks to its support in Google’s servers and Chrome browser. Since QUIC is still a relatively new protocol, there is a lack of sufficient understanding about its behavior in real network scenarios, particularly in the case of wireless networks. In this paper we present a comprehensive study on the performance of QUIC in Wireless Mesh Networks (WMN). We perform a measurement campaign on a production WMN to compare the performance of QUIC against TCP when retrieving files from the Internet. Our results show that while QUIC outperforms TCP in wired networks, it exhibits significantly lower performance than TCP in the WMN. We investigate the reasons for this behavior and identify the root causes of the performance issues. We find that some design choices of QUIC may penalize the protocol in WiFi, e.g., uncovering sub-optimal interactions of QUIC with MAC layer features, such as frame aggregation. Finally, we implement and evaluate our solution and demonstrate up to 28% increase in throughput of QUIC.

9 citations

Posted Content
TL;DR: This paper surveys major attempts on reducing latency and increasing the throughput on different networks and surroundings such as wired networks, wireless networks, application layer transport control, Remote Direct Memory Access, and machine learning based transport control.
Abstract: Modern applications are highly sensitive to communication delays and throughput. This paper surveys major attempts on reducing latency and increasing the throughput. These methods are surveyed on different networks and surroundings such as wired networks, wireless networks, application layer transport control, Remote Direct Memory Access, and machine learning based transport control.

2 citations


Cites background from "Improving Performance of QUIC in Wi..."

  • ...[392] showed that QUIC achieves sub-optimal throughput in WiFi networks....


Dissertation
13 Jun 2019
TL;DR: Thesis under joint supervision (cotutelle): Universitat Politecnica de Catalunya and Universite catholique de Louvain.
Abstract: Thesis under joint supervision (cotutelle): Universitat Politecnica de Catalunya and Universite catholique de Louvain

1 citation

Peer ReviewDOI
01 Nov 2022
TL;DR: In this paper, the authors provide a brief survey on the practical application and progress of MPQUIC in data communication, identifying the application domain, tools used, and evaluation parameters.
Abstract: Since its inception, the Internet has experienced tremendous speed and functionality improvements. Among these developments are innovative approaches such as the design and deployment of Internet Protocol version six (IPv6) and the continuous modification of TCP. New transport protocols like Stream Communication Transport Protocol (SCTP) and Multipath TCP (MPTCP), which can use multiple data paths, have been developed to overcome the IP-coupled challenge in TCP. However, given the difficulties of packet modifiers over the Internet that prevent the deployment of newly proposed protocols, e.g., SCTP, a UDP innovative approach with QUIC (Quick UDP Internet Connection) has been put forward as an alternative. QUIC reduces the connection establishment complexity in TCP and its variants, high security, stream multiplexing, and pluggable congestion control. Motivated by the gains and acceptability of MPTCP, Multipath QUIC has been developed to enable multipath transmission in QUIC. While several researchers have reviewed the progress of improvement and application of MPTCP, the review on MPQUIC improvement is limited. To breach the gap, this paper provides a brief survey on the practical application and progress of MPQUIC in data communication. We first review the fundamentals of multipath transport protocols. We then provide details on the design of QUIC and MPQUIC. Based on the articles reviewed, we looked at the various applications of MPQUIC, identifying the application domain, tools used, and evaluation parameters. Finally, we highlighted the open research issues and directions for further investigations.
References
Proceedings ArticleDOI
07 Aug 2017
TL;DR: The experience with QUIC is presented, an encrypted, multiplexed, and low-latency transport protocol designed from the ground up to improve transport performance for HTTPS traffic and to enable rapid deployment and continued evolution of transport mechanisms.
Abstract: We present our experience with QUIC, an encrypted, multiplexed, and low-latency transport protocol designed from the ground up to improve transport performance for HTTPS traffic and to enable rapid deployment and continued evolution of transport mechanisms. QUIC has been globally deployed at Google on thousands of servers and is used to serve traffic to a range of clients including a widely-used web browser (Chrome) and a popular mobile video streaming app (YouTube). We estimate that 7% of Internet traffic is now QUIC. We describe our motivations for developing a new transport, the principles that guided our design, the Internet-scale process that we used to perform iterative experiments on QUIC, performance improvements seen by our various services, and our experience deploying QUIC globally. We also share lessons about transport design and the Internet ecosystem that we learned from our deployment.

610 citations


"Improving Performance of QUIC in Wi..." refers background in this paper

  • ...In a recent measurement study [3], it was estimated that around 7% of the Internet traffic is QUIC....


Journal ArticleDOI
TL;DR: The emerging 802.11 standard is overviewed and its finalized amendments and those under development are highlighted, to address the technical context of its extensions.
Abstract: The introduction of IEEE's 802.11 standards has enabled a mass market, with a huge impact in the home, office, and public areas. Today, laptops, PCs, printers, cellular phones, VoIP phones, MP3 players, Blu-Ray players, and many more devices incorporate wireless LAN technology. With low-cost chipsets and support for high data rates, 802.11 has become a universal solution for an ever increasing application space. As a direct consequence of its high market penetration, several amendments to the basic 802.11 standard have been developed or are under development. They fix technology issues or add functionality expected to be required by future applications. In this article we overview the emerging 802.11 standard and address the technical context of its extensions. The article highlights its finalized amendments and those under development.

422 citations


"Improving Performance of QUIC in Wi..." refers background in this paper

  • ...11 standards, have undergone an enormous evolution in the recent years [2]....


Journal ArticleDOI
TL;DR: This article investigates the key MAC enhancements that help 802.11n achieve high throughput and high efficiency, and concludes that overall, the two-level aggregation is the most efficacious.
Abstract: IEEE 802.11n is an ongoing next-generation wireless LAN standard that supports a very high-speed connection with more than 100 Mb/s data throughput measured at the medium access control layer. This article investigates the key MAC enhancements that help 802.11n achieve high throughput and high efficiency. A detailed description is given for various frame aggregation mechanisms proposed in the latest 802.11n draft standard. Our simulation results confirm that A-MSDU, A-MPDU, and a combination of these methods improve extensively the channel efficiency and data throughput. We analyze the performance of each frame aggregation scheme in distinct scenarios, and we conclude that overall, the two-level aggregation is the most efficacious.

380 citations


"Improving Performance of QUIC in Wi..." refers background in this paper

  • ...Their impact on throughput has also been extensively evaluated [7], [6], [8]....


  • ...A detailed description of these concepts can be found in [6]....


Proceedings ArticleDOI
01 Nov 2006
TL;DR: An analytical model is proposed to study the performance improvement of the MAC protocol by using the two frame aggregation techniques, namely A-MPDU and A-MSDU (MAC Service Data Unit Aggregation) and results show that the network throughput performance is significantly improved when compared with both randomized and fixed frame aggregation algorithms.
Abstract: The IEEE 802.11a/b/g have been widely accepted as the de facto standards for wireless local area networks (WLANs). The recent IEEE 802.11n proposals aim at providing a physical layer transmission rate of up to 600 Mbps. However, to fully utilize this high data rate, the current IEEE 802.11 medium access control (MAC) needs to be enhanced. In this paper, we investigate the performance improvement of the MAC protocol by using the two frame aggregation techniques, namely A-MPDU (MAC Protocol Data Unit Aggregation) and A-MSDU (MAC Service Data Unit Aggregation). We first propose an analytical model to study the performance under uni-directional and bi-directional data transfer. Our proposed model incorporates packet loss either from collisions or channel errors. Comparison with simulation results shows that the model is accurate in predicting the network throughput. We also propose an optimal frame size adaptation algorithm with A-MSDU under error-prone channels. Simulation results show that the network throughput performance is significantly improved when compared with both randomized and fixed frame aggregation algorithms.

239 citations


"Improving Performance of QUIC in Wi..." refers background in this paper

  • ...Their impact on throughput has also been extensively evaluated [7], [6], [8]....


Book ChapterDOI
TL;DR: This paper proposes a new delayed ACK scheme in which the delay coefficient varies with the sequence number of the TCP packet, and shows that the ACK thinning allows to increase TCP throughput substantially more than previous improvement methods.
Abstract: We study in this paper TCP performance over a static multihop network that uses IEEE 802.11 protocol for access. For such networks it has been shown in [6] that TCP performance is mainly determined by the hidden terminal effects (and not by drop probabilities at buffers) which limits the number of packets that can be transmitted simultaneously in the network. We propose new approaches for improving the performance based on thinning the ACK streams that competes over the same radio resources as the TCP packets. In particular, we propose a new delayed ACK scheme in which the delay coefficient varies with the sequence number of the TCP packet. Through simulations we show that the ACK thinning allows to increase TCP throughput substantially more than previous improvement methods.

144 citations


"Improving Performance of QUIC in Wi..." refers background in this paper

  • ...al [10] investigated the impact of increasing the TCP delayed acknowledgement mechanism to more than two segments as recommended by RFC 1122....


Frequently Asked Questions (15)
Q1. What future work is mentioned in the paper "Improving Performance of QUIC in WiFi"?

Performing experiments on a hybrid scenario combining WiFi and Internet-wide scale is left for future work. 

In this paper the authors experimentally analyze the performance of QUIC in WiFi networks. The authors perform experiments using both a controlled WiFi testbed and a production WiFi mesh network. In particular, the authors study how QUIC interplays with MAC layer features such as IEEE 802.11 frame aggregation. The authors show that the current implementation of QUIC in Chromium achieves sub-optimal throughput in wireless networks. Indeed, burstiness in modern WiFi standards may improve network performance, and the authors show that a Bursty QUIC (BQUIC), i.e., a customized version of QUIC that is targeted to increase its burstiness, can achieve better performance in WiFi. 

Since a large window is acknowledged by each ACK and pacing is disabled, a burst of packets is released shortly after an ACK is received. 

Lowering the ACK frequency increases burstiness of traffic as the sender releases a micro burst of packets after receiving the cumulative ACK. 

Considering the role of TCP acknowledgments for the protocol reliability, reducing the acknowledgement frequency and performing delayed cumulative acknowledgements may provide benefits in wireless networks. 

The nodes use the linux/openwrt [18] based distribution provided by the Quick Mesh Project (QMP) [19], which runs the BMX6 mesh routing protocol [20]. 

There are concerns about the impact of high burstiness on packet drops and queuing delays in wired networks, particularly in long Internet paths. 

To automate the page loading the authors use Chrome-HAR-capturer to connect to remote clients in the lab or WMN and repeatedly load the pages while capturing traffic at both client and server sides. 

Singh et al. [11] propose TCP with adaptive delayed acknowledgement, which aims to reduce the number of ACKs to one per congestion window. 

Two aspects of QUIC implementation particularly influence how the protocol interacts with 802.11 frame aggregation: (i) acknowledgment modes, and (ii) packet pacing. 1) QUIC acknowledgment modes: Chromium’s implementation of QUIC includes two acknowledgment modes: • TCP ACKING: 

The authors show, for example, that since QUIC runs in user space it incurs performance penalties, particularly for mobile devices that are usually constrained by processing power. 

Oliveira et al. [12] propose Dynamic Adaptive Acknowledgement where the delay window is adjusted according to the channel condition. 

This is mainly because the antenna gain of the RPi is higher than in the smartphone, and thus, the network card can use MCS with higher bitrates during the transfer. 

The standard deviation of the throughput measured at 50 ms intervals increases only from 8.5 in QUIC to 9.1 in BQUIC (7%). 

The pacing rate is decided by QUIC on the fly depending on the link conditions such as bandwidth, RTT etc., and varies between different nodes in the mesh and even different runs using the same node. 
