From: Eric Dumazet <eric.dumazet@gmail.com>
To: Hans-Kristian Bakke <hkbakke@gmail.com>
Cc: David Lang <david@lang.hm>, bloat <bloat@lists.bufferbloat.net>
Subject: Re: [Bloat] Excessive throttling with fq
Date: Thu, 26 Jan 2017 13:00:52 -0800
Message-ID: <1485464452.5145.172.camel@edumazet-glaptop3.roam.corp.google.com>
In-Reply-To: <CAD_cGvGC_Xy4ztKV04R2SeU=YXntqZyRC2HiXm88hfp9L9i7Kg@mail.gmail.com>

For some reason, even though this NIC advertises TSO support,
tcpdump clearly shows that TSO is not being used at all.

Oh wait, maybe TSO is not enabled on the bonding device?
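
A quick way to check is to look at the offload settings on the bond itself, not just on the slaves, and to turn them on if they are off. This is only a sketch, assuming the bonding interface is named bond0 (adjust to whatever the actual device is called):

# ethtool -k bond0 | grep -E 'tcp-segmentation-offload|generic-segmentation-offload'
# ethtool -K bond0 tso on gso on

If TSO/GSO is really in effect, tcpdump on the sending host should once again show frames much larger than the MTU, for example:

# tcpdump -ni bond0 -c 20 'tcp and greater 3000'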

On Thu, 2017-01-26 at 21:46 +0100, Hans-Kristian Bakke wrote:
> # ethtool -i eth0
> driver: e1000e
> version: 3.2.6-k
> firmware-version: 1.9-0
> expansion-rom-version:
> bus-info: 0000:04:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: no
> 
> 
> # ethtool -k eth0
> Features for eth0:
> rx-checksumming: on
> tx-checksumming: on
> tx-checksum-ipv4: off [fixed]
> tx-checksum-ip-generic: on
> tx-checksum-ipv6: off [fixed]
> tx-checksum-fcoe-crc: off [fixed]
> tx-checksum-sctp: off [fixed]
> scatter-gather: on
> tx-scatter-gather: on
> tx-scatter-gather-fraglist: off [fixed]
> tcp-segmentation-offload: on
> tx-tcp-segmentation: on
> tx-tcp-ecn-segmentation: off [fixed]
> tx-tcp-mangleid-segmentation: on
> tx-tcp6-segmentation: on
> udp-fragmentation-offload: off [fixed]
> generic-segmentation-offload: on
> generic-receive-offload: on
> large-receive-offload: off [fixed]
> rx-vlan-offload: on
> tx-vlan-offload: on
> ntuple-filters: off [fixed]
> receive-hashing: on
> highdma: on [fixed]
> rx-vlan-filter: on [fixed]
> vlan-challenged: off [fixed]
> tx-lockless: off [fixed]
> netns-local: off [fixed]
> tx-gso-robust: off [fixed]
> tx-fcoe-segmentation: off [fixed]
> tx-gre-segmentation: off [fixed]
> tx-gre-csum-segmentation: off [fixed]
> tx-ipxip4-segmentation: off [fixed]
> tx-ipxip6-segmentation: off [fixed]
> tx-udp_tnl-segmentation: off [fixed]
> tx-udp_tnl-csum-segmentation: off [fixed]
> tx-gso-partial: off [fixed]
> tx-sctp-segmentation: off [fixed]
> fcoe-mtu: off [fixed]
> tx-nocache-copy: off
> loopback: off [fixed]
> rx-fcs: off
> rx-all: off
> tx-vlan-stag-hw-insert: off [fixed]
> rx-vlan-stag-hw-parse: off [fixed]
> rx-vlan-stag-filter: off [fixed]
> l2-fwd-offload: off [fixed]
> busy-poll: off [fixed]
> hw-tc-offload: off [fixed]
> 
> 
> # grep HZ /boot/config-4.8.0-2-amd64
> CONFIG_NO_HZ_COMMON=y
> # CONFIG_HZ_PERIODIC is not set
> CONFIG_NO_HZ_IDLE=y
> # CONFIG_NO_HZ_FULL is not set
> # CONFIG_NO_HZ is not set
> # CONFIG_HZ_100 is not set
> CONFIG_HZ_250=y
> # CONFIG_HZ_300 is not set
> # CONFIG_HZ_1000 is not set
> CONFIG_HZ=250
> CONFIG_MACHZ_WDT=m
> 
> 
> 
> On 26 January 2017 at 21:41, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> > Can you post :
> >
> > ethtool -i eth0
> > ethtool -k eth0
> >
> > grep HZ /boot/config.... (what is the HZ value of your kernel)
> >
> > I suspect a possible problem with TSO autodefer when/if HZ < 1000
> >
> > Thanks.
> >
> > On Thu, 2017-01-26 at 21:19 +0100, Hans-Kristian Bakke wrote:
> > > There are two packet captures from fq with and without pacing here:
> > >
> > > https://owncloud.proikt.com/index.php/s/KuXIl8h8bSFH1fM
> > >
> > > The server (with fq pacing/nopacing) is 10.0.5.10 and is running an
> > > Apache2 webserver on TCP port 443. The TCP client is an nginx reverse
> > > proxy at 10.0.5.13 on the same subnet, which in turn is proxying the
> > > connection from the Windows 10 client.
> > > - I did try to connect directly to the server with the client (via a
> > >   Linux gateway router), avoiding the nginx proxy and just using
> > >   plain no-SSL HTTP. That did not change anything.
> > > - I also tried stopping the eth0 interface to force the traffic to
> > >   the eth1 interface in the LACP, which changed nothing.
> > > - I also pulled each of the cables on the switch to force the traffic
> > >   to switch between interfaces in the LACP link between the client
> > >   switch and the server switch.
> > >
> > > The CPU is a 5-6 year old Intel Xeon X3430 CPU @ 4x2.40GHz on a
> > > SuperMicro platform. It is not very loaded and the results are always
> > > in the same ballpark with fq pacing on.
> > >
> > > top - 21:12:38 up 12 days, 11:08,  4 users,  load average: 0.56, 0.68, 0.77
> > > Tasks: 1344 total,   1 running, 1343 sleeping,   0 stopped,   0 zombie
> > > %Cpu0  :  0.0 us,  1.0 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> > > %Cpu1  :  0.0 us,  0.3 sy,  0.0 ni, 97.4 id,  2.0 wa,  0.0 hi,  0.3 si,  0.0 st
> > > %Cpu2  :  0.0 us,  2.0 sy,  0.0 ni, 96.4 id,  1.3 wa,  0.0 hi,  0.3 si,  0.0 st
> > > %Cpu3  :  0.7 us,  2.3 sy,  0.0 ni, 94.1 id,  3.0 wa,  0.0 hi,  0.0 si,  0.0 st
> > > KiB Mem : 16427572 total,   173712 free,  9739976 used,  6513884 buff/cache
> > > KiB Swap:  6369276 total,  6126736 free,   242540 used.  6224836 avail Mem
> > >
> > > This seems OK to me. It does have 24 drives in 3 ZFS pools, 144 TB
> > > raw storage in total, with several SAS HBAs that are pretty much
> > > always poking the system in some way or the other.
> > >
> > > There are around 32K interrupts when running @ 23 MB/s (as seen in
> > > Chrome downloads) with pacing on, and about 25K interrupts when
> > > running @ 105 MB/s with fq nopacing. Is that normal?
> > >
> > > Hans-Kristian
> > >
> > > On 26 January 2017 at 20:58, David Lang <david@lang.hm> wrote:
> > > > Is there any CPU bottleneck?
> > > >
> > > > Pacing causing this sort of problem makes me think that the CPU
> > > > either can't keep up or that something (HZ setting type of thing)
> > > > is delaying when the CPU can get used.
> > > >
> > > > It's not clear from the posts if the problem is with sending data
> > > > or receiving data.
> > > >
> > > > David Lang
> > > >
> > > > On Thu, 26 Jan 2017, Eric Dumazet wrote:
> > > >
> > > > > Nothing jumps out at me.
> > > > >
> > > > > We use FQ on links varying from 1Gbit to 100Gbit, and we have no
> > > > > such issues.
> > > > >
> > > > > You could probably check on the server the various TCP infos
> > > > > given by the ss command:
> > > > >
> > > > > ss -temoi dst <remoteip>
> > > > >
> > > > > The pacing rate is shown. You might have some issues, but it is
> > > > > hard to say.
> > > > >
> > > > > On Thu, 2017-01-26 at 19:55 +0100, Hans-Kristian Bakke wrote:
> > > > > > After some more testing I see that if I disable fq pacing, the
> > > > > > performance is restored to the expected levels:
> > > > > > # for i in eth0 eth1; do tc qdisc replace dev $i root fq nopacing; done
> > > > > >
> > > > > > Is this expected behaviour? There is some background traffic,
> > > > > > but only in the sub-100 mbit/s range on the switches and
> > > > > > gateway between the server and client.
> > > > > >
> > > > > > The chain:
> > > > > > Windows 10 client -> 1000 mbit/s -> switch -> 2 x gigabit LACP ->
> > > > > > switch -> 4 x gigabit LACP -> gw (fq_codel on all NICs) ->
> > > > > > 4 x gigabit LACP (the same as in) -> switch -> 2 x LACP ->
> > > > > > server (with misbehaving fq pacing)
> > > > > >
> > > > > > On 26 January 2017 at 19:38, Hans-Kristian Bakke <hkbakke@gmail.com> wrote:
> > > > > > > I can add that this is without BBR, just plain old kernel 4.8
> > > > > > > cubic.
> > > > > > >
> > > > > > > On 26 January 2017 at 19:36, Hans-Kristian Bakke <hkbakke@gmail.com> wrote:
> > > > > > > > Another day, another fq issue (or user error).
> > > > > > > >
> > > > > > > > I am trying to do the seemingly simple task of downloading
> > > > > > > > a single large file over the local gigabit LAN from a
> > > > > > > > physical server running kernel 4.8 and sch_fq on Intel
> > > > > > > > server NICs.
> > > > > > > >
> > > > > > > > For some reason it wouldn't go past around 25 MB/s. After
> > > > > > > > having replaced SSL with no SSL, replaced Apache with nginx
> > > > > > > > and verified that there is plenty of bandwidth available
> > > > > > > > between my client and the server, I tried to change the
> > > > > > > > qdisc from fq to pfifo_fast. It instantly shot up to around
> > > > > > > > the expected 85-90 MB/s. The same happened with fq_codel in
> > > > > > > > place of fq.
> > > > > > > >
> > > > > > > > I then checked the statistics for fq, and the throttled
> > > > > > > > counter is increasing massively every second (eth0 and eth1
> > > > > > > > are LACPed using Linux bonding, so both are seen here):
> > > > > > > >
> > > > > > > > qdisc fq 8007: root refcnt 2 limit 10000p flow_limit 100p
> > > > > > > >   buckets 1024 orphan_mask 1023 quantum 3028
> > > > > > > >   initial_quantum 15140 refill_delay 40.0ms
> > > > > > > >  Sent 787131797 bytes 520082 pkt (dropped 15, overlimits 0 requeues 0)
> > > > > > > >  backlog 98410b 65p requeues 0
> > > > > > > >   15 flows (14 inactive, 1 throttled)
> > > > > > > >   0 gc, 2 highprio, 259920 throttled, 15 flows_plimit
> > > > > > > > qdisc fq 8008: root refcnt 2 limit 10000p flow_limit 100p
> > > > > > > >   buckets 1024 orphan_mask 1023 quantum 3028
> > > > > > > >   initial_quantum 15140 refill_delay 40.0ms
> > > > > > > >  Sent 2533167 bytes 6731 pkt (dropped 0, overlimits 0 requeues 0)
> > > > > > > >  backlog 0b 0p requeues 0
> > > > > > > >   24 flows (24 inactive, 0 throttled)
> > > > > > > >   0 gc, 2 highprio, 397 throttled
> > > > > > > >
> > > > > > > > Do you have any suggestions?
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Hans-Kristian
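
For the record, the ss suggestion from earlier in the thread can be combined with a live view of the fq statistics while a transfer is running. A rough sketch (10.0.5.13 is the nginx proxy / TCP client address mentioned above and eth0 is one of the bonded NICs; substitute whatever matches the actual setup):

# ss -temoi dst 10.0.5.13
# watch -n 1 'tc -s qdisc show dev eth0'

The pacing rate reported by ss, together with the throttled counter from tc, should help show whether fq is pacing the flow far below line rate.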


Thread overview: 15+ messages
2017-01-26 18:36 Hans-Kristian Bakke
2017-01-26 18:38 ` Hans-Kristian Bakke
2017-01-26 18:55   ` Hans-Kristian Bakke
2017-01-26 19:18     ` Eric Dumazet
2017-01-26 19:58       ` David Lang
2017-01-26 20:19         ` Hans-Kristian Bakke
2017-01-26 20:41           ` Eric Dumazet
2017-01-26 20:46             ` Hans-Kristian Bakke
2017-01-26 21:00               ` Eric Dumazet [this message]
     [not found]                 ` <CAD_cGvFXR+Qb9_gnp=k4UttJZnrRRm4i19of7D4v8MK9EjeZ6Q@mail.gmail.com>
2017-01-26 21:07                   ` Eric Dumazet
     [not found]                     ` <CAD_cGvGuCU+R=ddTGTnLF3C8avmJ=UZyAYAkD0FQzd-v6fknPw@mail.gmail.com>
2017-01-26 21:33                       ` Eric Dumazet
2017-01-26 20:54           ` Eric Dumazet
2017-01-26 20:57             ` Hans-Kristian Bakke
2020-02-19  6:58               ` Alexey Ivanov
2020-02-19 13:52                 ` Neal Cardwell
