# ethtool -i eth0
driver: e1000e
version: 3.2.6-k
firmware-version: 1.9-0
expansion-rom-version:
bus-info: 0000:04:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

# ethtool -k eth0
Features for eth0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: on
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
hw-tc-offload: off [fixed]

# grep HZ /boot/config-4.8.0-2-amd64
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_MACHZ_WDT=m

On 26 January 2017 at 21:41, Eric Dumazet wrote:
>
> Can you post :
>
> ethtool -i eth0
> ethtool -k eth0
>
> grep HZ /boot/config.... (what is the HZ value of your kernel)
>
> I suspect a possible problem with TSO autodefer when/if HZ < 1000
>
> Thanks.
>
> On Thu, 2017-01-26 at 21:19 +0100, Hans-Kristian Bakke wrote:
> > There are two packet captures from fq with and without pacing here:
> >
> > https://owncloud.proikt.com/index.php/s/KuXIl8h8bSFH1fM
> >
> > The server (with fq pacing/nopacing) is 10.0.5.10 and is running an
> > Apache2 webserver on TCP port 443. The TCP client is an nginx
> > reverse proxy at 10.0.5.13 on the same subnet, which in turn is
> > proxying the connection from the Windows 10 client.
> > - I did try to connect directly to the server with the client (via a
> > Linux gateway router), avoiding the nginx proxy and just using plain
> > no-SSL HTTP. That did not change anything.
> > - I also tried stopping the eth0 interface to force the traffic onto
> > the eth1 interface in the LACP bond, which changed nothing.
> > - I also pulled each of the cables on the switch to force the traffic
> > to switch between interfaces in the LACP link between the client
> > switch and the server switch.
> >
> > The CPU is a 5-6 year old Intel Xeon X3430 @ 4x2.40GHz on a
> > SuperMicro platform. It is not very loaded, and the results are always
> > in the same ballpark with fq pacing on.
> >
> > top - 21:12:38 up 12 days, 11:08, 4 users, load average: 0.56, 0.68, 0.77
> > Tasks: 1344 total, 1 running, 1343 sleeping, 0 stopped, 0 zombie
> > %Cpu0 : 0.0 us, 1.0 sy, 0.0 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > %Cpu1 : 0.0 us, 0.3 sy, 0.0 ni, 97.4 id, 2.0 wa, 0.0 hi, 0.3 si, 0.0 st
> > %Cpu2 : 0.0 us, 2.0 sy, 0.0 ni, 96.4 id, 1.3 wa, 0.0 hi, 0.3 si, 0.0 st
> > %Cpu3 : 0.7 us, 2.3 sy, 0.0 ni, 94.1 id, 3.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > KiB Mem : 16427572 total, 173712 free, 9739976 used, 6513884 buff/cache
> > KiB Swap: 6369276 total, 6126736 free, 242540 used. 6224836 avail Mem
> >
> > This seems OK to me. It does have 24 drives in 3 ZFS pools at 144TB
> > raw storage in total, with several SAS HBAs that are pretty much always
> > poking the system in some way or another.
> >
> > There are around 32K interrupts when running @23 MB/s (as seen in
> > Chrome downloads) with pacing on, and about 25K interrupts when running
> > @105 MB/s with fq nopacing. Is that normal?
> >
> > Hans-Kristian
> >
> > On 26 January 2017 at 20:58, David Lang wrote:
> >
> > Is there any CPU bottleneck?
> >
> > Pacing causing this sort of problem makes me think that the
> > CPU either can't keep up or that something (HZ setting type of
> > thing) is delaying when the CPU can get used.
> >
> > It's not clear from the posts if the problem is with sending
> > data or receiving data.
> >
> > David Lang
> >
> > On Thu, 26 Jan 2017, Eric Dumazet wrote:
> >
> > Nothing jumps on my head.
> >
> > We use FQ on links varying from 1Gbit to 100Gbit, and we have no
> > such issues.
> >
> > You could probably check on the server the various TCP infos given
> > by the ss command:
> >
> > ss -temoi dst
> >
> > pacing rate is shown. You might have some issues, but it is hard
> > to say.
> >
> > On Thu, 2017-01-26 at 19:55 +0100, Hans-Kristian Bakke wrote:
> >
> > After some more testing I see that if I disable fq pacing the
> > performance is restored to the expected levels:
> > # for i in eth0 eth1; do tc qdisc replace dev $i root fq nopacing; done
> >
> > Is this expected behaviour? There is some background traffic, but only
> > in the sub-100 Mbit/s range on the switches and gateway between the
> > server and client.
> >
> > The chain:
> > Windows 10 client -> 1000 Mbit/s -> switch -> 2 x gigabit LACP -> switch
> > -> 4 x gigabit LACP -> gw (fq_codel on all NICs) -> 4 x gigabit LACP
> > (the same as in) -> switch -> 2 x LACP -> server (with misbehaving fq
> > pacing)
> >
> > On 26 January 2017 at 19:38, Hans-Kristian Bakke wrote:
> >
> > I can add that this is without BBR, just plain old kernel 4.8 cubic.
> >
> > On 26 January 2017 at 19:36, Hans-Kristian Bakke wrote:
> >
> > Another day, another fq issue (or user error).
> >
> > I try to do the seemingly simple task of downloading a
> > single large file over local gigabit LAN from a
> > physical server running kernel 4.8 and sch_fq on Intel
> > server NICs.
> >
> > For some reason it wouldn't go past around 25 MB/s.
> > After having replaced SSL with no SSL, replaced Apache
> > with nginx and verified that there is plenty of
> > bandwidth available between my client and the server, I
> > tried to change the qdisc from fq to pfifo_fast. It
> > instantly shot up to around the expected 85-90 MB/s.
> > The same happened with fq_codel in place of fq.
> >
> > I then checked the statistics for fq, and the throttled counter is
> > increasing massively every second (eth0 and eth1 are LACPed using
> > Linux bonding, so both are seen here):
> >
> > qdisc fq 8007: root refcnt 2 limit 10000p flow_limit 100p buckets 1024
> > orphan_mask 1023 quantum 3028 initial_quantum 15140 refill_delay 40.0ms
> > Sent 787131797 bytes 520082 pkt (dropped 15, overlimits 0 requeues 0)
> > backlog 98410b 65p requeues 0
> > 15 flows (14 inactive, 1 throttled)
> > 0 gc, 2 highprio, 259920 throttled, 15 flows_plimit
> > qdisc fq 8008: root refcnt 2 limit 10000p flow_limit 100p buckets 1024
> > orphan_mask 1023 quantum 3028 initial_quantum 15140 refill_delay 40.0ms
> > Sent 2533167 bytes 6731 pkt (dropped 0, overlimits 0 requeues 0)
> > backlog 0b 0p requeues 0
> > 24 flows (24 inactive, 0 throttled)
> > 0 gc, 2 highprio, 397 throttled
> >
> > Do you have any suggestions?
> >
> > Regards,
> > Hans-Kristian
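
For reference, here is a rough consolidation of the diagnostics discussed in the thread, written as shell commands to be run as root on the sending server. The interface names (eth0/eth1) and the client address (10.0.5.13) are the ones from the thread; the watch wrapper and the CONFIG_HZ grep are just one convenient way to collect the same information, not commands quoted from the discussion.

# Check the kernel tick rate; the kernel in this thread runs with
# CONFIG_HZ=250, which is what Eric's TSO autodefer suspicion (HZ < 1000)
# refers to.
grep 'CONFIG_HZ=' /boot/config-$(uname -r)

# Watch the fq "throttled" counter while a transfer runs; in the report
# above it increases massively every second on the busy interface.
watch -n 1 "tc -s qdisc show dev eth0"

# Inspect the per-socket pacing rate (Eric's suggestion), filtered to the
# nginx proxy acting as the TCP client in this setup.
ss -temoi dst 10.0.5.13

# Toggle pacing off and back on to compare throughput directly; plain "fq"
# restores the default behaviour, which includes pacing.
for i in eth0 eth1; do tc qdisc replace dev $i root fq nopacing; done
for i in eth0 eth1; do tc qdisc replace dev $i root fq; done

Comparing throughput and the throttled counter with pacing on versus off is what separates an fq pacing problem from a CPU or link bottleneck in the discussion above.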