<div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">I did record the qdisc settings but didn't capture the stats. Throttling is definitely active, though, when I watch the tc -s stats in real time while testing (looking at tun1):</div><div class="gmail_extra"><br></div><div class="gmail_extra"><div class="gmail_default" style="font-family:verdana,sans-serif">tc -s qdisc show</div><div class="gmail_default" style="font-family:verdana,sans-serif">qdisc noqueue 0: dev lo root refcnt 2</div><div class="gmail_default" style="font-family:verdana,sans-serif"> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)</div><div class="gmail_default" style="font-family:verdana,sans-serif"> backlog 0b 0p requeues 0</div><div class="gmail_default" style="font-family:verdana,sans-serif">qdisc fq 8007: dev eth0 root refcnt 2 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 quantum 3028 initial_quantum 15140 refill_delay 40.0ms</div><div class="gmail_default" style="font-family:verdana,sans-serif"> Sent 1420855729 bytes 969198 pkt (dropped 134, overlimits 0 requeues 0)</div><div class="gmail_default" style="font-family:verdana,sans-serif"> backlog 0b 0p requeues 0</div><div class="gmail_default" style="font-family:verdana,sans-serif">  124 flows (123 inactive, 0 throttled)</div><div class="gmail_default" style="font-family:verdana,sans-serif">  0 gc, 0 highprio, 3 throttled, 3925 ns latency, 134 flows_plimit</div><div class="gmail_default" style="font-family:verdana,sans-serif">qdisc fq 8008: dev tun1 root refcnt 2 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 quantum 3000 initial_quantum 15000 refill_delay 40.0ms</div><div class="gmail_default" style="font-family:verdana,sans-serif"> Sent 1031289740 bytes 741181 pkt (dropped 0, overlimits 0 requeues 0)</div><div class="gmail_default" style="font-family:verdana,sans-serif"> backlog 101616b 3p requeues 0</div><div class="gmail_default" style="font-family:verdana,sans-serif">  16 flows (15 inactive, 1 throttled), next packet delay 351937 ns</div><div class="gmail_default" style="font-family:verdana,sans-serif">  0 gc, 0 highprio, 58377 throttled, 12761 ns latency</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><br></div><div class="gmail_extra"><br><div class="gmail_quote">On 26 January 2017 at 00:33, Eric Dumazet <span dir="ltr"><<a href="mailto:eric.dumazet@gmail.com" target="_blank">eric.dumazet@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="gmail-"><br>
On Thu, 2017-01-26 at 00:04 +0100, Hans-Kristian Bakke wrote:<br>
> I can do that. I guess I should do the capture from tun1, as that is<br>
> the place where the TCP traffic is visible? My non-virtual NIC is only<br>
> seeing OpenVPN-encapsulated UDP traffic.<br>
><br>
<br>
</span>But is FQ installed at the point where the TCP sockets are?<br>
<br>
You should give us "tc -s qdisc show xxx" so that we can check if<br>
pacing (throttling) actually happens.<br>
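<br>
A minimal sketch of such a check, assuming fq is the root qdisc on the<br>
interface and that its statistics line has the form<br>
"0 gc, 0 highprio, N throttled, ..." as in the output quoted in this thread.<br>
The helper name fq_throttled is hypothetical, and other iproute2 versions<br>
may format this line differently:<br>

```shell
# Hypothetical helper: read `tc -s qdisc show` output on stdin and print
# fq's cumulative "throttled" counter, one line per fq qdisc found.
# Assumes the counter sits on the same line as "highprio" and is the
# field immediately before the literal token "throttled,".
fq_throttled() {
  awk '/highprio/ {
    for (i = 2; i <= NF; i++)
      if ($i == "throttled,") print $(i - 1)
  }'
}

# Example usage (requires iproute2 and an fq qdisc on tun1):
#   tc -s qdisc show dev tun1 | fq_throttled
#   sleep 5
#   tc -s qdisc show dev tun1 | fq_throttled
```

If the second number is larger than the first while a transfer is running,<br>
fq has been throttling flows on that interface in between.<br>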
<div class="gmail-HOEnZb"><div class="gmail-h5"><br>
<br>
> On 25 January 2017 at 23:48, Neal Cardwell <<a href="mailto:ncardwell@google.com">ncardwell@google.com</a>><br>
> wrote:<br>
>         On Wed, Jan 25, 2017 at 5:38 PM, Hans-Kristian Bakke<br>
>         <<a href="mailto:hkbakke@gmail.com">hkbakke@gmail.com</a>> wrote:<br>
>                 Actually, the 1-4 mbit/s results with fq sporadically<br>
>                 appear again as I keep testing, but that is most likely<br>
>                 caused by all the unknowns between me and my<br>
>                 test server. Still, changing to pfifo_fast seems<br>
>                 to normalize the throughput again with BBR. Could this<br>
>                 be one of those times where BBR with pacing actually<br>
>                 gets hurt for playing nice at some very variable<br>
>                 bottleneck on the way?<br>
><br>
><br>
>         Possibly. Would you be able to take a tcpdump trace of each<br>
>         trial (headers only would be ideal), and post on a web site<br>
>         somewhere a pcap trace for one of the slow trials?<br>
><br>
><br>
>         For example:<br>
><br>
><br>
>            tcpdump -n -w /tmp/out.pcap -s 120 -i eth0 -c 1000000 &<br>
><br>
><br>
><br>
>         thanks,<br>
>         neal<br>
><br>
><br>
><br>
><br>
>                 On 25 January 2017 at 23:01, Neal Cardwell<br>
>                 <<a href="mailto:ncardwell@google.com">ncardwell@google.com</a>> wrote:<br>
>                         On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian<br>
>                         Bakke <<a href="mailto:hkbakke@gmail.com">hkbakke@gmail.com</a>> wrote:<br>
>                                 Hi<br>
><br>
><br>
>                                 Kernel 4.9 finally landed in Debian<br>
>                                 testing, so I could at last test BBR in<br>
>                                 a real-life environment that I have<br>
>                                 struggled to get any kind of<br>
>                                 performance out of.<br>
><br>
><br>
>                                 The challenge at hand is UDP-based<br>
>                                 OpenVPN through Europe at around 35 ms<br>
>                                 RTT to my VPN provider, with plenty of<br>
>                                 bandwidth available at both<br>
>                                 ends and everything completely unknown<br>
>                                 in between. After tuning the<br>
>                                 UDP buffers up to make room for my 500<br>
>                                 mbit/s symmetrical bandwidth at 35 ms,<br>
>                                 the download part seemed to work<br>
>                                 nicely at an unreliable 150 to 300<br>
>                                 mbit/s, while the upload was stuck at<br>
>                                 30 to 60 mbit/s.<br>
><br>
><br>
>                                 Just by activating BBR, the bandwidth<br>
>                                 instantly shot up to around 150 mbit/s<br>
>                                 using a fat TCP test to a public<br>
>                                 iperf3 server located near my VPN exit<br>
>                                 point in the Netherlands. Replace BBR<br>
>                                 with cubic again and the performance<br>
>                                 is once again all over the place,<br>
>                                 ranging from very bad to bad, but<br>
>                                 never better than 1/3 of BBR's "steady<br>
>                                 state". In other words, "instant WIN!"<br>
><br>
><br>
>                         Glad to hear it. Thanks for the test report!<br>
><br>
>                                 However, seeing the requirement of fq<br>
>                                 and pacing for BBR and noticing that I<br>
>                                 am running pfifo_fast within a VM with<br>
>                                 virtio NIC on a Proxmox VE host with<br>
>                                 fq_codel on all physical interfaces, I<br>
>                                 was surprised to see that it worked so<br>
>                                 well.<br>
>                                 I then replaced pfifo_fast with fq and<br>
>                                 the performance went right down to<br>
>                                 only 1-4 mbit/s from around 150<br>
>                                 mbit/s. Removing the fq again regained<br>
>                                 the performance at once.<br>
><br>
><br>
>                                 I have some questions for you guys<br>
>                                 who know a lot more than I do about<br>
>                                 these things:<br>
>                                 1. Do fq (and fq_codel) even work<br>
>                                 reliably in a VM? What is the best<br>
>                                 choice for default qdisc to use in a<br>
>                                 VM in general?<br>
><br>
><br>
>                         Eric covered this one. We are not aware of<br>
>                         specific issues with fq in VM environments.<br>
>                         And we have tested that fq works sufficiently<br>
>                         well on Google Cloud VMs.<br>
><br>
>                                 2. Why does BBR immediately "fix" all my<br>
>                                 issues with upload through that<br>
>                                 "unreliable" big-BDP link with<br>
>                                 pfifo_fast when fq pacing is a<br>
>                                 requirement?<br>
><br>
><br>
>                         For BBR, pacing is part of the design in order<br>
>                         to make BBR more "gentle" in terms of the rate<br>
>                         at which it sends, in order to put less<br>
>                         pressure on buffers and keep packet loss<br>
>                         lower. This is particularly important when a<br>
>                         BBR flow is restarting from idle. In this case<br>
>                         BBR starts with a full cwnd, and it counts on<br>
>                         pacing to pace out the packets at the<br>
>                         estimated bandwidth, so that the queue can<br>
>                         stay relatively short and yet the pipe can be<br>
>                         filled immediately.<br>
><br>
><br>
>                         Running BBR without pacing makes BBR more<br>
>                         aggressive, particularly in restarting from<br>
>                         idle, but also in the steady state, where BBR<br>
>                         tries to use pacing to keep the queue short.<br>
><br>
><br>
>                         For bulk transfer tests with one flow, running<br>
>                         BBR without pacing will likely cause higher<br>
>                         queues and loss rates at the bottleneck, which<br>
>                         may negatively impact other traffic sharing<br>
>                         that bottleneck.<br>
><br>
>                                 3. Could fq_codel on the physical host<br>
>                                 be the reason that it still works?<br>
><br>
><br>
>                         Nope, fq_codel does not implement pacing.<br>
><br>
>                                 4. Does BBR _only_ work with fq pacing<br>
>                                 or could fq_codel be used as a<br>
>                                 replacement?<br>
><br>
><br>
>                         Nope, BBR needs pacing to work correctly, and<br>
>                         currently fq is the only Linux qdisc that<br>
>                         implements pacing.<br>
><br>
>                                 5. Is BBR perhaps modified to do the<br>
>                                 right thing without having to change<br>
>                                 the qdisc in the current kernel 4.9?<br>
><br>
><br>
>                         Nope. Linux 4.9 contains the initial public<br>
>                         release of BBR from September 2016. And there<br>
>                         have been no code changes since then (just<br>
>                         expanded comments).<br>
><br>
><br>
>                         Thanks for the test report!<br>
><br>
><br>
>                         neal<br>
><br>
><br>
><br>
><br>
><br>
><br>
><br>
><br>
</div></div><div class="gmail-HOEnZb"><div class="gmail-h5">> ______________________________<wbr>_________________<br>
> Bloat mailing list<br>
> <a href="mailto:Bloat@lists.bufferbloat.net">Bloat@lists.bufferbloat.net</a><br>
> <a href="https://lists.bufferbloat.net/listinfo/bloat" rel="noreferrer" target="_blank">https://lists.bufferbloat.net/<wbr>listinfo/bloat</a><br>
<br>
<br>
</div></div></blockquote></div><br></div></div>