From: Eric Dumazet <eric.dumazet@gmail.com>
To: Hans-Kristian Bakke <hkbakke@gmail.com>
Cc: David Lang <david@lang.hm>, bloat <bloat@lists.bufferbloat.net>
Subject: Re: [Bloat] Excessive throttling with fq
Date: Thu, 26 Jan 2017 12:41:21 -0800
Message-ID: <1485463281.5145.164.camel@edumazet-glaptop3.roam.corp.google.com>
In-Reply-To: <CAD_cGvFfG-wK6VgG7+2XPXRhnt1x1obRcfs+qzShViZ5K+O1ag@mail.gmail.com>


Can you post:

ethtool -i eth0
ethtool -k eth0

grep HZ /boot/config....    (what is the HZ value of your kernel?)

I suspect a possible problem with TSO autodefer when/if HZ < 1000
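
A minimal sketch of gathering that, assuming eth0 is the relevant interface
and a /boot/config-$(uname -r) style path (adjust to your distribution):

  ethtool -i eth0                                           # driver name, version, firmware
  ethtool -k eth0 | grep -E 'segmentation|receive-offload'  # TSO/GSO/GRO state
  grep 'CONFIG_HZ' /boot/config-$(uname -r)                 # e.g. CONFIG_HZ=250 vs CONFIG_HZ=1000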

Thanks.

On Thu, 2017-01-26 at 21:19 +0100, Hans-Kristian Bakke wrote:
> There are two packet captures from fq with and without pacing here:
> 
> 
> https://owncloud.proikt.com/index.php/s/KuXIl8h8bSFH1fM
> 
> 
> 
> The server (with fq pacing/nopacing) is 10.0.5.10 and is running an
> Apache2 web server on TCP port 443. The TCP client is an nginx reverse
> proxy at 10.0.5.13 on the same subnet, which in turn proxies the
> connection from the Windows 10 client.
> - I did try to connect to the server directly from the client (via a
> Linux gateway router), avoiding the nginx proxy and just using plain
> non-SSL HTTP. That did not change anything.
> - I also tried stopping the eth0 interface to force the traffic onto
> the eth1 interface in the LACP bond, which changed nothing.
> - I also pulled each of the cables on the switch to force the traffic
> to switch between interfaces in the LACP link between the client
> switch and the server switch.
> 
> 
> The CPU is a 5-6 year old Intel Xeon X3430 @ 4x2.40 GHz on a
> SuperMicro platform. It is not very loaded, and the results are always
> in the same ballpark with fq pacing on.
> 
> 
> 
> top - 21:12:38 up 12 days, 11:08,  4 users,  load average: 0.56, 0.68, 0.77
> Tasks: 1344 total,   1 running, 1343 sleeping,   0 stopped,   0 zombie
> %Cpu0  :  0.0 us,  1.0 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> %Cpu1  :  0.0 us,  0.3 sy,  0.0 ni, 97.4 id,  2.0 wa,  0.0 hi,  0.3 si,  0.0 st
> %Cpu2  :  0.0 us,  2.0 sy,  0.0 ni, 96.4 id,  1.3 wa,  0.0 hi,  0.3 si,  0.0 st
> %Cpu3  :  0.7 us,  2.3 sy,  0.0 ni, 94.1 id,  3.0 wa,  0.0 hi,  0.0 si,  0.0 st
> KiB Mem : 16427572 total,   173712 free,  9739976 used,  6513884 buff/cache
> KiB Swap:  6369276 total,  6126736 free,   242540 used.  6224836 avail Mem
> 
> 
> This seems OK to me. It does have 24 drives in 3 ZFS pools (144 TB raw
> storage in total) with several SAS HBAs that are pretty much always
> poking the system in one way or another.
> 
> 
> There are around 32K interrupts when running at 23 MB/s (as seen in
> Chrome downloads) with pacing on, and about 25K interrupts when running
> at 105 MB/s with fq nopacing. Is that normal?
> 
> 
> Hans-Kristian
> 
> 
> 
> On 26 January 2017 at 20:58, David Lang <david@lang.hm> wrote:
>         Is there any CPU bottleneck?
>         
>         pacing causing this sort of problem makes me think that the
>         CPU either can't keep up or that something (HZ setting type of
>         thing) is delaying when the CPU can get used.
>         
>         It's not clear from the posts if the problem is with sending
>         data or receiving data.
>         
>         David Lang
>         
>         
>         On Thu, 26 Jan 2017, Eric Dumazet wrote:
>         
>                 Nothing jumps out at me.
>                 
>                 We use FQ on links varying from 1Gbit to 100Gbit, and
>                 we have no such issues.
>                 
>                 You could probably check on the server the various TCP
>                 infos given by the ss command:
>                 
>                 
>                 ss -temoi dst <remoteip>
>                 
>                 
>                 The pacing rate is shown. You might have some issues,
>                 but it is hard to say.
>                 
>                 
>                 On Thu, 2017-01-26 at 19:55 +0100, Hans-Kristian Bakke
>                 wrote:
>                         After some more testing I see that if I disable
>                         fq pacing the performance is restored to the
>                         expected levels:
>                         
>                         # for i in eth0 eth1; do tc qdisc replace dev $i root fq nopacing; done
>                         
>                         
>                         Is this expected behaviour? There is some
>                         background traffic, but only in the sub-100
>                         mbit/s range on the switches and gateway
>                         between the server and client.
>                         
>                         
>                         The chain:
>                         Windows 10 client -> 1000 mbit/s -> switch ->
>                         2 x gigabit LACP -> switch -> 4 x gigabit LACP ->
>                         gw (fq_codel on all nics) -> 4 x gigabit LACP
>                         (the same as in) -> switch -> 2 x LACP ->
>                         server (with misbehaving fq pacing)
>                         
>                         
>                         
>                         On 26 January 2017 at 19:38, Hans-Kristian Bakke
>                         <hkbakke@gmail.com> wrote:
>                                 I can add that this is without BBR, just
>                                 plain old kernel 4.8 cubic.
>                         
>                         On 26 January 2017 at 19:36, Hans-Kristian Bakke
>                         <hkbakke@gmail.com> wrote:
>                                         Another day, another fq issue (or
>                                         user error).
>                         
>                         
>                                         I am trying to do the seemingly
>                                         simple task of downloading a
>                                         single large file over local
>                                         gigabit LAN from a physical
>                                         server running kernel 4.8 and
>                                         sch_fq on Intel server NICs.
>                         
>                         
>                                         For some reason it wouldn't go
>                                         past around 25 MB/s. After having
>                                         replaced SSL with no SSL,
>                                         replaced Apache with nginx and
>                                         verified that there is plenty of
>                                         bandwidth available between my
>                                         client and the server, I tried to
>                                         change the qdisc from fq to
>                                         pfifo_fast. It instantly shot up
>                                         to around the expected 85-90
>                                         MB/s. The same happened with
>                                         fq_codel in place of fq.
>                         
>                         
>                                         I then checked the statistics for
>                                         fq, and the throttled counter is
>                                         increasing massively every second
>                                         (eth0 and eth1 are LACPed using
>                                         Linux bonding, so both are shown
>                                         here):
>                         
>                         
>                                         qdisc fq 8007: root refcnt 2 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 quantum 3028 initial_quantum 15140 refill_delay 40.0ms
>                                          Sent 787131797 bytes 520082 pkt (dropped 15, overlimits 0 requeues 0)
>                                          backlog 98410b 65p requeues 0
>                                           15 flows (14 inactive, 1 throttled)
>                                           0 gc, 2 highprio, 259920 throttled, 15 flows_plimit
>                                         qdisc fq 8008: root refcnt 2 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 quantum 3028 initial_quantum 15140 refill_delay 40.0ms
>                                          Sent 2533167 bytes 6731 pkt (dropped 0, overlimits 0 requeues 0)
>                                          backlog 0b 0p requeues 0
>                                           24 flows (24 inactive, 0 throttled)
>                                           0 gc, 2 highprio, 397 throttled
>                         
>                         
>                                         Do you have any suggestions?
>                         
>                         
>                                         Regards,
>                                         Hans-Kristian
>                         
>                         
>                         
>                         
> 
> 
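
For anyone hitting the same symptoms, a minimal sketch of the checks discussed
in this thread (eth0/eth1 and the <remoteip> placeholder are taken from the
messages above; adjust to your own setup):

  # watch the fq 'throttled' counter while the transfer runs
  tc -s qdisc show dev eth0

  # per-connection pacing rate, rtt and cwnd on the sending server
  ss -temoi dst <remoteip>

  # disable pacing as a test, then re-enable it (pacing is the default for fq)
  tc qdisc replace dev eth0 root fq nopacing
  tc qdisc replace dev eth0 root fq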



