<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Jan 25, 2017 at 5:38 PM, Hans-Kristian Bakke <span dir="ltr"><<a href="mailto:hkbakke@gmail.com" target="_blank">hkbakke@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div style="font-family:verdana,sans-serif">Actually.. the 1-4 mbit/s results with fq sporadically appears again as I keep testing but it is most likely caused by all the unknowns between me an my testserver. But still, changing to pfifo_qdisc seems to normalize the throughput again with BBR, could this be one of those times where BBR and pacing actually is getting hurt for playing nice in some very variable bottleneck on the way?</div></div></blockquote><div><br></div><div>Possibly. Would you be able to take a tcpdump trace of each trial (headers only would be ideal), and post on a web site somewhere a pcap trace for one of the slow trials?</div><div><br></div><div>For example:</div><div><br></div><div> tcpdump -n -w /tmp/out.pcap -s 120 -i eth0 -c 1000000 &<br></div><div><br></div><div>thanks,</div><div>neal</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="gmail-HOEnZb"><div class="gmail-h5"><div class="gmail_extra"><br><div class="gmail_quote">On 25 January 2017 at 23:01, Neal Cardwell <span dir="ltr"><<a href="mailto:ncardwell@google.com" target="_blank">ncardwell@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span>On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian Bakke <span dir="ltr"><<a href="mailto:hkbakke@gmail.com" target="_blank">hkbakke@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div style="font-family:verdana,sans-serif">Hi</div><div style="font-family:verdana,sans-serif"><br></div><div style="font-family:verdana,sans-serif">Kernel 4.9 finally landed in Debian testing so I could finally test BBR in a real life environment that I have struggled with getting any kind of performance out of.</div><div style="font-family:verdana,sans-serif"><br></div><div style="font-family:verdana,sans-serif">The challenge at hand is UDP based OpenVPN through europe at around 35 ms rtt to my VPN-provider with plenty of available bandwith available in both ends and everything completely unknown in between. After tuning the UDP-buffers up to make room for my 500 mbit/s symmetrical bandwith at 35 ms the download part seemed to work nicely at an unreliable 150 to 300 mbit/s, while the upload was stuck at 30 to 60 mbit/s. </div><div style="font-family:verdana,sans-serif"><br></div><div style="font-family:verdana,sans-serif">Just by activating BBR the bandwith instantly shot up to around 150 mbit/s using a fat tcp test to a public iperf3 server located near my VPN exit point in the Netherlands. Replace BBR with qubic again and the performance is once again all over the place ranging from very bad to bad, but never better than 1/3 of BBRs "steady state". In other words "instant WIN!"</div></div></blockquote><div><br></div></span><div>Glad to hear it. 

thanks,
neal

> On 25 January 2017 at 23:01, Neal Cardwell <ncardwell@google.com> wrote:
>> On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian Bakke <hkbakke@gmail.com> wrote:
>>> Hi
>>>
>>> Kernel 4.9 finally landed in Debian testing, so I could finally test BBR
>>> in a real-life environment that I have struggled to get any kind of
>>> performance out of.
>>>
>>> The challenge at hand is UDP-based OpenVPN across Europe, at around
>>> 35 ms RTT to my VPN provider, with plenty of bandwidth available at both
>>> ends and everything in between completely unknown. After tuning the UDP
>>> buffers up to make room for my 500 mbit/s symmetrical bandwidth at
>>> 35 ms, the download side seemed to work nicely at an unreliable 150 to
>>> 300 mbit/s, while the upload was stuck at 30 to 60 mbit/s.
>>>
>>> Just by activating BBR, the bandwidth instantly shot up to around
>>> 150 mbit/s using a fat TCP test to a public iperf3 server located near
>>> my VPN exit point in the Netherlands. Replace BBR with CUBIC again and
>>> the performance is once again all over the place, ranging from very bad
>>> to bad, but never better than 1/3 of BBR's "steady state". In other
>>> words: instant WIN!
>>
>> Glad to hear it. Thanks for the test report!
>>
>>> However, seeing the requirement of fq and pacing for BBR, and noticing
>>> that I am running pfifo_fast within a VM with a virtio NIC on a Proxmox
>>> VE host with fq_codel on all physical interfaces, I was surprised to see
>>> that it worked so well. I then replaced pfifo_fast with fq and the
>>> performance went right down to only 1-4 mbit/s from around 150 mbit/s.
>>> Removing fq again regained the performance at once.
>>>
>>> I have some questions for you guys who know a lot more about these
>>> things than I do:
>>>
>>> 1. Does fq (and fq_codel) even work reliably in a VM? What is the best
>>> choice of default qdisc to use in a VM in general?
>>
>> Eric covered this one. We are not aware of specific issues with fq in VM
>> environments, and we have tested that fq works sufficiently well on
>> Google Cloud VMs.
>>
>>> 2. Why does BBR immediately "fix" all my issues with upload through that
>>> "unreliable" big-BDP link with pfifo_fast, when fq pacing is a
>>> requirement?
>>
>> For BBR, pacing is part of the design: it makes BBR more "gentle" in
>> terms of the rate at which it sends, in order to put less pressure on
>> buffers and keep packet loss lower. This is particularly important when a
>> BBR flow is restarting from idle. In that case BBR starts with a full
>> cwnd and counts on pacing to spread the packets out at the estimated
>> bandwidth, so that the queue can stay relatively short and yet the pipe
>> can be filled immediately.
>>
>> Running BBR without pacing makes BBR more aggressive, particularly when
>> restarting from idle, but also in steady state, where BBR relies on
>> pacing to keep the queue short.
>>
>> For bulk transfer tests with one flow, running BBR without pacing will
>> likely cause higher queues and loss rates at the bottleneck, which may
>> negatively impact other traffic sharing that bottleneck.
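>>
>> A quick way to watch this from the sender side, assuming a reasonably
>> recent iproute2, is "ss -tin", which reports the socket's pacing_rate and
>> delivery_rate alongside cwnd and rtt; for example (the address below is
>> just a placeholder for the iperf3 server):
>>
>>   ss -tin dst 192.0.2.1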
>>
>>> 3. Could fq_codel on the physical host be the reason that it still
>>> works?
>>
>> Nope, fq_codel does not implement pacing.
>>
>>> 4. Does BBR work _only_ with fq pacing, or could fq_codel be used as a
>>> replacement?
>>
>> Nope, BBR needs pacing to work correctly, and currently fq is the only
>> Linux qdisc that implements pacing.
>>
>>> 5. Is BBR perhaps modified to do the right thing without having to
>>> change the qdisc in the current kernel 4.9?
>>
>> Nope. Linux 4.9 contains the initial public release of BBR from
>> September 2016, and there have been no code changes since then (just
>> expanded comments).
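>>
>> For reference, a minimal sketch of enabling BBR with fq pacing on a 4.9
>> sender (this assumes the tcp_bbr module that ships with 4.9 is available;
>> note that net.core.default_qdisc only applies to qdiscs created after it
>> is set, so an already-up interface may still need an explicit
>> "tc qdisc replace dev eth0 root fq"):
>>
>>   sysctl -w net.core.default_qdisc=fq
>>   sysctl -w net.ipv4.tcp_congestion_control=bbr
>>   sysctl net.ipv4.tcp_congestion_control   # verify it now reads bbr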
>>
>> Thanks for the test report!
>>
>> neal