From: Neal Cardwell
Date: Wed, 25 Jan 2017 17:01:04 -0500
To: Hans-Kristian Bakke
Cc: bloat <bloat@lists.bufferbloat.net>
Subject: Re: [Bloat] Initial tests with BBR in kernel 4.9

On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian Bakke wrote:

> Hi
>
> Kernel 4.9 finally landed in Debian testing, so I could finally test BBR
> in a real-life environment that I have struggled to get any kind of
> performance out of.
>
> The challenge at hand is UDP-based OpenVPN through Europe at around 35 ms
> RTT to my VPN provider, with plenty of bandwidth available at both ends
> and everything completely unknown in between. After tuning the UDP buffers
> up to make room for my 500 mbit/s symmetrical bandwidth at 35 ms, the
> download part seemed to work nicely at an unreliable 150 to 300 mbit/s,
> while the upload was stuck at 30 to 60 mbit/s.
>
> Just by activating BBR, the bandwidth instantly shot up to around
> 150 mbit/s using a fat TCP test to a public iperf3 server located near my
> VPN exit point in the Netherlands. Replace BBR with CUBIC again and the
> performance is once again all over the place, ranging from very bad to
> bad, but never better than 1/3 of BBR's "steady state". In other words,
> "instant WIN!"

Glad to hear it. Thanks for the test report!

> However, seeing the requirement of fq and pacing for BBR, and noticing
> that I am running pfifo_fast within a VM with a virtio NIC on a Proxmox VE
> host with fq_codel on all physical interfaces, I was surprised to see that
> it worked so well.
>
> I then replaced pfifo_fast with fq and the performance went right down to
> only 1-4 mbit/s from around 150 mbit/s.
> Removing the fq again regained the performance at once.
>
> I have got some questions for you guys who know a lot more than me about
> these things:
>
> 1. Do fq (and fq_codel) even work reliably in a VM? What is the best
> choice of default qdisc to use in a VM in general?

Eric covered this one. We are not aware of specific issues with fq in VM
environments, and we have tested that fq works sufficiently well on Google
Cloud VMs.

> 2. Why does BBR immediately "fix" all my issues with upload through that
> "unreliable" big-BDP link with pfifo_fast, when fq pacing is a
> requirement?

For BBR, pacing is part of the design: it makes BBR more "gentle" in terms
of the rate at which it sends, putting less pressure on buffers and keeping
packet loss lower. This is particularly important when a BBR flow is
restarting from idle. In this case BBR starts with a full cwnd, and it
counts on pacing to pace out the packets at the estimated bandwidth, so that
the queue can stay relatively short and yet the pipe can be filled
immediately.

Running BBR without pacing makes BBR more aggressive, particularly when
restarting from idle, but also in the steady state, where BBR tries to use
pacing to keep the queue short.

For bulk transfer tests with one flow, running BBR without pacing will
likely cause higher queues and loss rates at the bottleneck, which may
negatively impact other traffic sharing that bottleneck.

> 3. Could fq_codel on the physical host be the reason that it still works?

Nope, fq_codel does not implement pacing.

> 4. Does BBR work _only_ with fq pacing, or could fq_codel be used as a
> replacement?

Nope, BBR needs pacing to work correctly, and currently fq is the only
Linux qdisc that implements pacing.

> 5. Is BBR perhaps modified to do the right thing without having to change
> the qdisc in the current kernel 4.9?

Nope. Linux 4.9 contains the initial public release of BBR from September
2016.
And there have been no code changes since then (just expanded comments).

Thanks for the test report!

neal
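For anyone wanting to reproduce the setup discussed above, a minimal sketch of enabling BBR with fq pacing on a Linux 4.9+ machine might look like the following. The interface name `eth0` is an assumption; substitute your own, and note these commands need root:

```shell
# Load the BBR module and select it as the congestion control
# (tcp_bbr is typically built as a module in distro 4.9 kernels).
modprobe tcp_bbr
sysctl -w net.ipv4.tcp_congestion_control=bbr

# Install fq at the root of the interface; fq provides the pacing
# that BBR relies on. "eth0" is a placeholder.
tc qdisc replace dev eth0 root fq

# Optionally make both settings persistent across reboots.
cat >> /etc/sysctl.conf <<'EOF'
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
EOF
```

`net.core.default_qdisc` only affects interfaces brought up afterwards, which is why the explicit `tc qdisc replace` is there for the running interface.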
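A rough way to verify that BBR and fq pacing are actually in effect, assuming a reasonably recent iproute2 (`eth0` is again a placeholder for the interface under test):

```shell
# Confirm fq sits at the root of the interface, and see its stats
# (throttled packets indicate pacing is happening).
tc -s qdisc show dev eth0

# Confirm the congestion control in use.
sysctl net.ipv4.tcp_congestion_control

# Inspect live TCP sockets; with BBR over fq the per-socket details
# should include the congestion control name and a pacing_rate field.
ss -tin
```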