[Bloat] Excessive throttling with fq

Hans-Kristian Bakke hkbakke at gmail.com
Thu Jan 26 15:46:01 EST 2017


# ethtool -i eth0
driver: e1000e
version: 3.2.6-k
firmware-version: 1.9-0
expansion-rom-version:
bus-info: 0000:04:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

# ethtool -k eth0
Features for eth0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: on
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
hw-tc-offload: off [fixed]

# grep HZ /boot/config-4.8.0-2-amd64
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_MACHZ_WDT=m
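
That gives HZ=250 here, i.e. a 4 ms tick. If TSO autodefer really does
depend on the tick when HZ < 1000, that sounds like it could matter at
these rates. If it is useful as a data point I can retest with TSO/GSO
disabled on the NICs; I assume this is the right knob (not yet tested here):

# for i in eth0 eth1; do ethtool -K $i tso off gso off; done

(and "ethtool -K $i tso on gso on" afterwards to restore it)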


On 26 January 2017 at 21:41, Eric Dumazet <eric.dumazet at gmail.com> wrote:

>
> Can you post :
>
> ethtool -i eth0
> ethtool -k eth0
>
> grep HZ /boot/config.... (what is the HZ value of your kernel)
>
> I suspect a possible problem with TSO autodefer when/if HZ < 1000
>
> Thanks.
>
> On Thu, 2017-01-26 at 21:19 +0100, Hans-Kristian Bakke wrote:
> > There are two packet captures from fq with and without pacing here:
> >
> >
> > https://owncloud.proikt.com/index.php/s/KuXIl8h8bSFH1fM
> >
> >
> >
> > The server (with fq pacing/nopacing) is 10.0.5.10 and is running an
> > Apache2 web server on TCP port 443. The TCP client is an nginx
> > reverse proxy at 10.0.5.13 on the same subnet, which in turn proxies
> > the connection from the Windows 10 client.
> > - I did try to connect to the server directly from the client (via a
> > Linux gateway router), avoiding the nginx proxy and using plain
> > non-SSL HTTP. That did not change anything.
> > - I also tried stopping the eth0 interface to force the traffic onto
> > the eth1 interface in the LACP bond, which changed nothing.
> > - I also pulled each of the cables on the switch to force the traffic
> > to move between interfaces in the LACP link between the client
> > switch and the server switch.
> >
> >
> > The CPU is a 5-6 year old Intel Xeon X3430 CPU @ 4x2.40GHz on a
> > SuperMicro platform. It is not very loaded and the results are always
> > in the same ballpark with fq pacing on.
> >
> >
> >
> > top - 21:12:38 up 12 days, 11:08,  4 users,  load average: 0.56, 0.68, 0.77
> > Tasks: 1344 total,   1 running, 1343 sleeping,   0 stopped,   0 zombie
> > %Cpu0  :  0.0 us,  1.0 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> > %Cpu1  :  0.0 us,  0.3 sy,  0.0 ni, 97.4 id,  2.0 wa,  0.0 hi,  0.3 si,  0.0 st
> > %Cpu2  :  0.0 us,  2.0 sy,  0.0 ni, 96.4 id,  1.3 wa,  0.0 hi,  0.3 si,  0.0 st
> > %Cpu3  :  0.7 us,  2.3 sy,  0.0 ni, 94.1 id,  3.0 wa,  0.0 hi,  0.0 si,  0.0 st
> > KiB Mem : 16427572 total,   173712 free,  9739976 used,  6513884 buff/cache
> > KiB Swap:  6369276 total,  6126736 free,   242540 used.  6224836 avail Mem
> >
> >
> > This seems OK to me. It does have 24 drives in 3 ZFS pools, 144 TB of
> > raw storage in total, with several SAS HBAs that are pretty much always
> > poking the system in some way or another.
> >
> >
> > There are around 32K interrupts when running at 23 MB/s (as seen in
> > Chrome's download view) with pacing on, and about 25K interrupts when
> > running at 105 MB/s with fq nopacing. Is that normal?
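> >
> > (If a per-interface breakdown of those interrupts would help, I assume
> > something like "watch -d -n1 'grep eth /proc/interrupts'" is the easiest
> > way to get it.)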
> >
> >
> > Hans-Kristian
> >
> >
> >
> > On 26 January 2017 at 20:58, David Lang <david at lang.hm> wrote:
> >         Is there any CPU bottleneck?
> >
> >         Pacing causing this sort of problem makes me think that the
> >         CPU either can't keep up or that something (an HZ-setting type
> >         of thing) is delaying when the CPU can get used.
> >
> >         It's not clear from the posts if the problem is with sending
> >         data or receiving data.
> >
> >         David Lang
> >
> >
> >         On Thu, 26 Jan 2017, Eric Dumazet wrote:
> >
> >                 Nothing jumps out at me.
> >
> >                 We use FQ on links varying from 1 Gbit to 100 Gbit, and
> >                 we have no such issues.
> >
> >                 You could probably check, on the server, the various TCP
> >                 stats given by the ss command:
> >
> >
> >                 ss -temoi dst <remoteip>
> >
> >
> >                 The pacing rate is shown. You might have some issues, but
> >                 it is hard to say.
> >
> >
> >                 On Thu, 2017-01-26 at 19:55 +0100, Hans-Kristian Bakke wrote:
> >                         After some more testing I see that if I disable fq
> >                         pacing the performance is restored to the expected
> >                         levels:
> >
> >                         # for i in eth0 eth1; do tc qdisc replace dev $i root fq nopacing; done
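> >
> >                         Pacing is on by default in fq, so my assumption is
> >                         that turning it back on for comparison is just the
> >                         same loop with plain "fq" again:
> >
> >                         # for i in eth0 eth1; do tc qdisc replace dev $i root fq; done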
> >
> >
> >                         Is this expected behaviour? There is some background
> >                         traffic, but only in the sub-100 Mbit/s range on the
> >                         switches and gateway between the server and client.
> >
> >
> >                         The chain:
> >                         Windows 10 client -> 1000 Mbit/s -> switch ->
> >                         2 x gigabit LACP -> switch -> 4 x gigabit LACP ->
> >                         gw (fq_codel on all NICs) -> 4 x gigabit LACP
> >                         (the same as in) -> switch -> 2 x LACP ->
> >                         server (with misbehaving fq pacing)
> >
> >
> >
> >                         On 26 January 2017 at 19:38, Hans-Kristian Bakke <hkbakke at gmail.com> wrote:
> >                                 I can add that this is without BBR, just
> >                                 plain old kernel 4.8 cubic.
> >
> >                                 On 26 January 2017 at 19:36, Hans-Kristian Bakke <hkbakke at gmail.com> wrote:
> >                                         Another day, another fq issue (or
> >                                         user error).
> >
> >
> >                                         I am trying to do the seemingly simple
> >                                         task of downloading a single large file
> >                                         over the local gigabit LAN from a
> >                                         physical server running kernel 4.8 and
> >                                         sch_fq on Intel server NICs.
> >
> >
> >                                         For some reason it would not go past
> >                                         around 25 MB/s. After replacing SSL with
> >                                         plain HTTP, replacing Apache with nginx
> >                                         and verifying that there is plenty of
> >                                         bandwidth available between my client
> >                                         and the server, I tried to change the
> >                                         qdisc from fq to pfifo_fast. Throughput
> >                                         instantly shot up to around the expected
> >                                         85-90 MB/s. The same happened with
> >                                         fq_codel in place of fq.
> >
> >
> >                                         I then checked the statistics for fq,
> >                                         and the throttled counter is increasing
> >                                         massively every second (eth0 and eth1
> >                                         are LACPed using Linux bonding, so both
> >                                         are shown here):
> >
> >
> >                                         qdisc fq 8007: root refcnt 2 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 quantum 3028 initial_quantum 15140 refill_delay 40.0ms
> >                                          Sent 787131797 bytes 520082 pkt (dropped 15, overlimits 0 requeues 0)
> >                                          backlog 98410b 65p requeues 0
> >                                           15 flows (14 inactive, 1 throttled)
> >                                           0 gc, 2 highprio, 259920 throttled, 15 flows_plimit
> >                                         qdisc fq 8008: root refcnt 2 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 quantum 3028 initial_quantum 15140 refill_delay 40.0ms
> >                                          Sent 2533167 bytes 6731 pkt (dropped 0, overlimits 0 requeues 0)
> >                                          backlog 0b 0p requeues 0
> >                                           24 flows (24 inactive, 0 throttled)
> >                                           0 gc, 2 highprio, 397 throttled
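> >
> >                                         (For reference, the same stats can be
> >                                         re-read at any time with
> >                                         "tc -s qdisc show dev eth0" (and eth1).
> >                                         As far as I understand the fq counters,
> >                                         "throttled" is the number of times the
> >                                         pacer had to delay a flow, which is why
> >                                         I am watching it here.)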
> >
> >
> >                                         Do you have any suggestions?
> >
> >
> >                                         Regards,
> >                                         Hans-Kristian
> >
> >
> >
> >
> >
> >
>
>
>