<div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif">The receiver (as in the nginx proxy in the dumps) is actually running fq qdisc with BBR on kernel 4.9. Could that explain what you are seeing?</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">Changing it to cubic does not change the resulting throughput though, and it also was not involved at all in the Windows 10 -> linux router -> Apache server tests which also gives the same 23-ish MB/s with pacing.</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 26 January 2017 at 21:54, Eric Dumazet <span dir="ltr"><<a href="mailto:eric.dumazet@gmail.com" target="_blank">eric.dumazet@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">It looks like the receiver is seriously limiting receive window in the<br>
"pacing" case.<br>
<br>
Only after 30 Mbytes are transferred does it finally increase, from<br>
359168 to 1328192.<br>
<br>
DRS (Dynamic Right-Sizing, the receive-window auto-tuning) is not working as expected. Again, maybe related to the HZ value.<br>
<span class="im HOEnZb"><br>
<br>
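A minimal sketch of how the receive-window auto-tuning limits and the kernel HZ could be inspected on the receiving side (sysctl names are the standard Linux ones; the /boot/config path is the usual Debian/Ubuntu convention and is an assumption here):<br>
<br>
# auto-tuning enabled flag plus min/default/max receive buffer in bytes<br>
sysctl net.ipv4.tcp_moderate_rcvbuf net.ipv4.tcp_rmem<br>
# global cap on socket receive buffers<br>
sysctl net.core.rmem_max<br>
# timer frequency the running kernel was built with<br>
grep 'CONFIG_HZ=' /boot/config-$(uname -r)<br>
<br>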
On Thu, 2017-01-26 at 21:19 +0100, Hans-Kristian Bakke wrote:<br>
</span><div class="HOEnZb"><div class="h5">> There are two packet captures from fq with and without pacing here:<br>
><br>
><br>
> <a href="https://owncloud.proikt.com/index.php/s/KuXIl8h8bSFH1fM" rel="noreferrer" target="_blank">https://owncloud.proikt.com/<wbr>index.php/s/KuXIl8h8bSFH1fM</a><br>
><br>
><br>
><br>
> The server (with fq pacing/nopacing) is 10.0.5.10 and is running an<br>
> Apache2 webserver on TCP port 443. The TCP client is an nginx<br>
> reverse proxy at 10.0.5.13 on the same subnet, which in turn is proxying<br>
> the connection from the Windows 10 client.<br>
> - I did try to connect directly to the server with the client (via a<br>
> Linux gateway router), avoiding the nginx proxy and just using plain<br>
> non-SSL HTTP. That did not change anything.<br>
> - I also tried stopping the eth0 interface to force the traffic to the<br>
> eth1 interface in the LACP bond, which changed nothing.<br>
> - I also pulled each of the cables on the switch to force the traffic<br>
> to switch between interfaces in the LACP link between the client<br>
> switch and the server switch.<br>
><br>
><br>
> The CPU is a 5-6 year old Intel Xeon X3430 @ 4x2.40GHz on a<br>
> SuperMicro platform. It is not very loaded, and the results are always<br>
> in the same ballpark with fq pacing on.<br>
><br>
><br>
><br>
> top - 21:12:38 up 12 days, 11:08,  4 users,  load average: 0.56, 0.68, 0.77<br>
> Tasks: 1344 total,   1 running, 1343 sleeping,   0 stopped,   0 zombie<br>
> %Cpu0  :  0.0 us,  1.0 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st<br>
> %Cpu1  :  0.0 us,  0.3 sy,  0.0 ni, 97.4 id,  2.0 wa,  0.0 hi,  0.3 si,  0.0 st<br>
> %Cpu2  :  0.0 us,  2.0 sy,  0.0 ni, 96.4 id,  1.3 wa,  0.0 hi,  0.3 si,  0.0 st<br>
> %Cpu3  :  0.7 us,  2.3 sy,  0.0 ni, 94.1 id,  3.0 wa,  0.0 hi,  0.0 si,  0.0 st<br>
> KiB Mem : 16427572 total,   173712 free,  9739976 used,  6513884 buff/cache<br>
> KiB Swap:  6369276 total,  6126736 free,   242540 used.  6224836 avail Mem<br>
><br>
><br>
> This seems OK to me. It does have 24 drives in 3 ZFS pools, 144TB<br>
> raw storage in total, with several SAS HBAs that are pretty much always<br>
> poking the system in some way or another.<br>
><br>
><br>
> There are around 32K interrupts when running at ~23 MB/s (as seen in<br>
> Chrome downloads) with pacing on, and about 25K interrupts when running<br>
> at ~105 MB/s with fq nopacing. Is that normal?<br>
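<br>
If it helps, a rough sketch of how the per-second interrupt rate can be measured (interface names eth0/eth1 as above; the "in" column of vmstat is interrupts per second):<br>
<br>
# system-wide interrupt and context-switch rates, one line per second<br>
vmstat 1 5<br>
# per-IRQ counters for the NICs; diff two snapshots taken a second apart<br>
grep -E 'eth0|eth1' /proc/interrupts; sleep 1; grep -E 'eth0|eth1' /proc/interrupts<br>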
><br>
><br>
> Hans-Kristian<br>
><br>
><br>
><br>
> On 26 January 2017 at 20:58, David Lang <<a href="mailto:david@lang.hm">david@lang.hm</a>> wrote:<br>
>         Is there any CPU bottleneck?<br>
><br>
>         pacing causing this sort of problem makes me think that the<br>
>         CPU either can't keep up or that something (Hz setting type of<br>
>         thing) is delaying when the CPU can get used.<br>
><br>
>         It's not clear from the posts if the problem is with sending<br>
>         data or receiving data.<br>
><br>
>         David Lang<br>
><br>
><br>
>         On Thu, 26 Jan 2017, Eric Dumazet wrote:<br>
><br>
>                 Nothing jumps out at me.<br>
><br>
>                 We use FQ on links varying from 1Gbit to 100Gbit, and<br>
>                 we have no such<br>
>                 issues.<br>
><br>
>                 You could probably check on the server the various TCP<br>
>                 infos given by the ss command:<br>
><br>
><br>
>                 ss -temoi dst <remoteip><br>
><br>
><br>
>                 pacing rate is shown. You might have some issues, but<br>
>                 it is hard to say.<br>
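<br>
A sketch of how that could be sampled on the server against the nginx proxy while a download runs (the 10.0.5.13 address is taken from earlier in the thread); cwnd, rtt, pacing_rate and the receive-window values are the fields of interest:<br>
<br>
# print the TCP state of every connection to the proxy once per second<br>
while true; do ss -temoi dst 10.0.5.13; sleep 1; done<br>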
><br>
><br>
>                 On Thu, 2017-01-26 at 19:55 +0100, Hans-Kristian Bakke<br>
>                 wrote:<br>
>                         After some more testing I see that if I disable fq pacing the<br>
>                         performance is restored to the expected levels:<br>
>                         # for i in eth0 eth1; do tc qdisc replace dev $i root fq nopacing; done<br>
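<br>
Conversely, a sketch of how pacing can be turned back on and the active qdisc checked afterwards (whether a nopacing flag is printed depends on the iproute2 version):<br>
<br>
# fq paces by default, so replacing the root qdisc re-enables pacing<br>
for i in eth0 eth1; do tc qdisc replace dev $i root fq; done<br>
# show the root qdisc and its parameters to confirm<br>
tc qdisc show dev eth0<br>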
><br>
><br>
>                         Is this expected behaviour? There is some background traffic, but only<br>
>                         in the sub-100 mbit/s range on the switches and gateway between the<br>
>                         server and client.<br>
><br>
><br>
>                         The chain:<br>
>                         Windows 10 client -> 1000 mbit/s -> switch -> 2 x gigabit LACP -><br>
>                         switch -> 4 x gigabit LACP -> gw (fq_codel on all nics) -><br>
>                         4 x gigabit LACP (the same as in) -> switch -> 2 x LACP -><br>
>                         server (with misbehaving fq pacing)<br>
><br>
><br>
><br>
>                         On 26 January 2017 at 19:38, Hans-Kristian Bakke <<a href="mailto:hkbakke@gmail.com">hkbakke@gmail.com</a>> wrote:<br>
>                                 I can add that this is without BBR, just plain old kernel 4.8 cubic.<br>
><br>
>                                 On 26 January 2017 at 19:36, Hans-Kristian Bakke <<a href="mailto:hkbakke@gmail.com">hkbakke@gmail.com</a>> wrote:<br>
>                                         Another day, another fq issue (or user error).<br>
><br>
><br>
>                                         I try to do the seemingly simple task of downloading a<br>
>                                         single large file over local gigabit LAN from a physical<br>
>                                         server running kernel 4.8 and sch_fq on Intel server NICs.<br>
><br>
><br>
>                                         For some reason it wouldn't go past around 25 MB/s.<br>
>                                         After having replaced SSL with no SSL, replaced Apache<br>
>                                         with nginx and verified that there is plenty of bandwidth<br>
>                                         available between my client and the server, I tried to<br>
>                                         change the qdisc from fq to pfifo_fast. It instantly shot<br>
>                                         up to around the expected 85-90 MB/s. The same happened<br>
>                                         with fq_codel in place of fq.<br>
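<br>
For anyone reproducing the A/B test above, a sketch of the qdisc swap (interface name eth0 is an assumption; repeat for each bonded NIC):<br>
<br>
# switch the root qdisc to pfifo_fast, re-run the download, then switch back<br>
tc qdisc replace dev eth0 root pfifo_fast<br>
tc qdisc replace dev eth0 root fq<br>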
><br>
><br>
>                                         I then checked the statistics for fq, and the throttled<br>
>                                         counter is increasing massively every second (eth0 and<br>
>                                         eth1 are LACPed using Linux bonding, so both are shown<br>
>                                         here):<br>
><br>
><br>
>                                         qdisc fq 8007: root refcnt 2 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 quantum 3028 initial_quantum 15140 refill_delay 40.0ms<br>
>                                          Sent 787131797 bytes 520082 pkt (dropped 15, overlimits 0 requeues 0)<br>
>                                          backlog 98410b 65p requeues 0<br>
>                                           15 flows (14 inactive, 1 throttled)<br>
>                                           0 gc, 2 highprio, 259920 throttled, 15 flows_plimit<br>
>                                         qdisc fq 8008: root refcnt 2 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 quantum 3028 initial_quantum 15140 refill_delay 40.0ms<br>
>                                          Sent 2533167 bytes 6731 pkt (dropped 0, overlimits 0 requeues 0)<br>
>                                          backlog 0b 0p requeues 0<br>
>                                           24 flows (24 inactive, 0 throttled)<br>
>                                           0 gc, 2 highprio, 397 throttled<br>
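<br>
A sketch of how the throttled counter can be watched while a transfer is running (assumes watch(1) is available):<br>
<br>
# refresh the fq statistics for both bonded interfaces every second<br>
watch -n 1 'tc -s qdisc show dev eth0; tc -s qdisc show dev eth1'<br>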
><br>
><br>
>                                         Do you have any suggestions?<br>
><br>
><br>
>                                         Regards,<br>
>                                         Hans-Kristian<br>
><br>
><br>
><br>
><br>
><br>
><br>
><br>
><br>
<br>
<br>
</div></div></blockquote></div><br></div>