[Bloat] Excessive throttling with fq
Hans-Kristian Bakke
hkbakke at gmail.com
Thu Jan 26 15:46:01 EST 2017
# ethtool -i eth0
driver: e1000e
version: 3.2.6-k
firmware-version: 1.9-0
expansion-rom-version:
bus-info: 0000:04:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
# ethtool -k eth0
Features for eth0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: on
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
hw-tc-offload: off [fixed]
# grep HZ /boot/config-4.8.0-2-amd64
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_MACHZ_WDT=m
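
Regarding the TSO autodefer suspicion (CONFIG_HZ is 250 here): if it helps
I can take TSO/GSO out of the picture and re-run the transfer while
watching the fq throttled counter, something like this (assuming it is
safe to toggle offloads on the e1000e bond slaves on this box):

# for i in eth0 eth1; do ethtool -K $i tso off gso off; done
# watch -n 1 'tc -s qdisc show dev eth0; tc -s qdisc show dev eth1'

and restore the offloads afterwards:

# for i in eth0 eth1; do ethtool -K $i tso on gso on; done

I can also grab the TCP info suggested earlier from the server side while
the transfer runs, e.g. against the nginx proxy:

# ss -temoi dst 10.0.5.13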
On 26 January 2017 at 21:41, Eric Dumazet <eric.dumazet at gmail.com> wrote:
>
> Can you post :
>
> ethtool -i eth0
> ethtool -k eth0
>
> grep HZ /boot/config.... (what is the HZ value of your kernel)
>
> I suspect a possible problem with TSO autodefer when/if HZ < 1000
>
> Thanks.
>
> On Thu, 2017-01-26 at 21:19 +0100, Hans-Kristian Bakke wrote:
> > There are two packet captures from fq with and without pacing here:
> >
> >
> > https://owncloud.proikt.com/index.php/s/KuXIl8h8bSFH1fM
> >
> >
> >
> > The server (with fq pacing/nopacing) is 10.0.5.10 and is running an
> > Apache2 webserver on TCP port 443. The TCP client is an nginx reverse
> > proxy at 10.0.5.13 on the same subnet, which in turn proxies the
> > connection from the Windows 10 client.
> > - I did try to connect directly to the server with the client (via a
> > Linux gateway router), avoiding the nginx proxy and just using plain
> > non-SSL HTTP. That did not change anything.
> > - I also tried stopping the eth0 interface to force the traffic onto
> > the eth1 interface in the LACP bond, which changed nothing.
> > - I also pulled each of the cables on the switch to force the traffic
> > to switch between interfaces in the LACP link between the client
> > switch and the server switch.
> >
> >
> > The CPU is a 5-6 year old Intel Xeon X3430 (4 cores @ 2.40 GHz) on a
> > SuperMicro platform. It is not very loaded, and the results are always
> > in the same ballpark with fq pacing on.
> >
> >
> >
> > top - 21:12:38 up 12 days, 11:08,  4 users,  load average: 0.56, 0.68, 0.77
> > Tasks: 1344 total,   1 running, 1343 sleeping,   0 stopped,   0 zombie
> > %Cpu0  :  0.0 us,  1.0 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> > %Cpu1  :  0.0 us,  0.3 sy,  0.0 ni, 97.4 id,  2.0 wa,  0.0 hi,  0.3 si,  0.0 st
> > %Cpu2  :  0.0 us,  2.0 sy,  0.0 ni, 96.4 id,  1.3 wa,  0.0 hi,  0.3 si,  0.0 st
> > %Cpu3  :  0.7 us,  2.3 sy,  0.0 ni, 94.1 id,  3.0 wa,  0.0 hi,  0.0 si,  0.0 st
> > KiB Mem : 16427572 total,   173712 free,  9739976 used,  6513884 buff/cache
> > KiB Swap:  6369276 total,  6126736 free,   242540 used.  6224836 avail Mem
> >
> >
> > This seems OK to me. It does have 24 drives in 3 ZFS pools, 144 TB raw
> > storage in total, with several SAS HBAs that are pretty much always
> > poking the system in some way or another.
> >
> >
> > There are around 32K interrupts when running at 23 MB/s (as seen in
> > Chrome's downloads) with pacing on, and about 25K interrupts when
> > running at 105 MB/s with fq nopacing. Is that normal?
> >
> >
> > Hans-Kristian
> >
> >
> >
> > On 26 January 2017 at 20:58, David Lang <david at lang.hm> wrote:
> > Is there any CPU bottleneck?
> >
> > Pacing causing this sort of problem makes me think that the
> > CPU either can't keep up or that something (an HZ-setting type of
> > thing) is delaying when the CPU can get used.
> >
> > It's not clear from the posts if the problem is with sending
> > data or receiving data.
> >
> > David Lang
> >
> >
> > On Thu, 26 Jan 2017, Eric Dumazet wrote:
> >
> > Nothing jumps out at me.
> >
> > We use FQ on links varying from 1Gbit to 100Gbit, and we have no such
> > issues.
> >
> > You could probably check on the server the various TCP infos given by
> > the ss command:
> >
> > ss -temoi dst <remoteip>
> >
> > The pacing rate is shown. You might have some issue there, but it is
> > hard to say.
> >
> >
> > On Thu, 2017-01-26 at 19:55 +0100, Hans-Kristian Bakke wrote:
> >
> > After some more testing I see that if I disable fq pacing, the
> > performance is restored to the expected levels:
> >
> > # for i in eth0 eth1; do tc qdisc replace dev $i root fq nopacing; done
> >
> > Is this expected behaviour? There is some background traffic, but only
> > in the sub-100 mbit/s range on the switches and gateway between the
> > server and the client.
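> >
> > Switching pacing back on for comparison is just the same command
> > without the nopacing flag, since fq paces by default:
> >
> > # for i in eth0 eth1; do tc qdisc replace dev $i root fq; done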
> >
> >
> > The chain:
> > Windows 10 client -> 1000 mbit/s -> switch -> 2 x gigabit LACP ->
> > switch -> 4 x gigabit LACP -> gw (fq_codel on all NICs) ->
> > 4 x gigabit LACP (the same as in) -> switch -> 2 x LACP ->
> > server (with misbehaving fq pacing)
> >
> >
> >
> > On 26 January 2017 at 19:38, Hans-Kristian Bakke <hkbakke at gmail.com> wrote:
> >
> > I can add that this is without BBR, just plain old kernel 4.8 cubic.
> >
> > On 26 January 2017 at 19:36, Hans-Kristian Bakke <hkbakke at gmail.com> wrote:
> >
> > Another day, another fq issue (or user error).
> >
> > I am trying to do the seemingly simple task of downloading a single
> > large file over the local gigabit LAN from a physical server running
> > kernel 4.8 and sch_fq on Intel server NICs.
> >
> > For some reason it wouldn't go past around 25 MB/s. After having
> > replaced SSL with no SSL, replaced Apache with nginx and verified that
> > there is plenty of bandwidth available between my client and the
> > server, I tried changing the qdisc from fq to pfifo_fast. It instantly
> > shot up to around the expected 85-90 MB/s. The same happened with
> > fq_codel in place of fq.
> >
> > I then checked the statistics for fq, and the throttled counter is
> > increasing massively every second (eth0 and eth1 are LACPed using
> > Linux bonding, so both are seen here):
> >
> >
> > qdisc fq 8007: root refcnt 2 limit 10000p flow_limit 100p buckets 1024
> > orphan_mask 1023 quantum 3028 initial_quantum 15140 refill_delay 40.0ms
> >  Sent 787131797 bytes 520082 pkt (dropped 15, overlimits 0 requeues 0)
> >  backlog 98410b 65p requeues 0
> >   15 flows (14 inactive, 1 throttled)
> >   0 gc, 2 highprio, 259920 throttled, 15 flows_plimit
> > qdisc fq 8008: root refcnt 2 limit 10000p flow_limit 100p buckets 1024
> > orphan_mask 1023 quantum 3028 initial_quantum 15140 refill_delay 40.0ms
> >  Sent 2533167 bytes 6731 pkt (dropped 0, overlimits 0 requeues 0)
> >  backlog 0b 0p requeues 0
> >   24 flows (24 inactive, 0 throttled)
> >   0 gc, 2 highprio, 397 throttled
> >
> >
> > Do you have any suggestions?
> >
> >
> > Regards,
> > Hans-Kristian