From: Georgios Amanakis <gamanakis@gmail.com>
To: "Toke Høiland-Jørgensen" <toke@toke.dk>
Cc: Jonathan Morton <chromatix99@gmail.com>,
Cake List <cake@lists.bufferbloat.net>
Subject: Re: [Cake] cake at 60gbit
Date: Thu, 5 Jul 2018 19:48:07 -0400
Message-ID: <CACvFP_hR3WZ4m++HZB_Mixo3Fw4hOxRjmxCqa6E+K85ixd3WiQ@mail.gmail.com>
In-Reply-To: <8736wxco28.fsf@toke.dk>
I am going to give it a try tonight with your patch applied, and report back.
Thank you!
George
On Thu, Jul 5, 2018, 6:31 PM Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> Toke Høiland-Jørgensen <toke@toke.dk> writes:
>
> > Jonathan Morton <chromatix99@gmail.com> writes:
> >
> >>> On 3 Jul, 2018, at 1:23 am, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> >>>
> >>> My hunch is that this has something to do with the way mlx5 uses
> >>> multiple receive queues (and thus multiple CPUs). Which is probably
> >>> different from veth...
> >>
> >> At this stage I'm pretty confident it has nothing to do with Cake, and
> >> everything to do with the Mellanox hardware and driver. It does strike
> >> me that Linux' default handling of multiqueue hardware doesn't map
> >> very well to the qdisc interface.
> >
> > Well, it doesn't happen with fq_codel, so even if it is a driver bug, it
> > is being triggered by cake specifically...
>
> Right, so finally got some time to investigate this further.
>
> I suspected that cake_dequeue() was looping forever, so I added some
> debug statements to investigate this, and it turns out I was right. Using
> the debug patch below, I get loop aborts on loop 'i' in unlimited mode
> and on loop 'l' if I enable the shaper at 70 Gbit. It
> happens pretty reliably, but only when I load up the link sufficiently
> (need 4-6 TCP flows which get ~50 Gbps of total throughput).
>
> The weird thing is that cake appears to somehow get into a state where
> sch->q.qlen is >0 while all tin backlogs are 0. I have no clue how this
> happens; as far as I can tell, all changes to tin_backlog are paired
> with a change to q.qlen. The only thing outside of cake itself that
> modifies q.qlen is peek(), which is not being used here.
>
> I'm giving up for tonight; if anyone else has any ideas, I'm all ears.
>
> -Toke
>
> Sample debug output:
>
> [ 5456.068281] Loop counter i hit 100k; aborting! i 100001 j 0 k 180 l 3 m 0 qlen 2 qbkllog 33184 tin 2 deficit 172 tot backlog 0
>
> With this debug patch:
>
> @@ -1892,6 +1892,20 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
> u64 delay;
> u32 len;
>
> + int i=0,j=0,k=0,l=0,m=0;
> +
> +#define COUNT_LOOP(v) do { \
> + if (++v > 100000) { \
> + int tot_bkl = 0; \
> + struct cake_tin_data *t; \
> + int n; \
> + for(n=0,t = q->tins; n < CAKE_MAX_TINS; n++,t++) \
> + tot_bkl += t->tin_backlog; \
> + net_warn_ratelimited("Loop counter " #v " hit 100k; aborting! i %d j %d k %d l %d m %d qlen %d qbkllog %d tin %d deficit %d tot backlog %d", i, j, k, l, m, sch->q.qlen, sch->qstats.backlog, q->cur_tin, b->tin_deficit, tot_bkl); \
> + return NULL; \
> + } \
> + } while(0);
> +
> begin:
> if (!sch->q.qlen)
> return NULL;
> @@ -1912,6 +1926,7 @@ begin:
> /* In unlimited mode, can't rely on shaper timings, just balance
> * with DRR
> */
> + i=0;
> while (b->tin_deficit < 0 ||
> !(b->sparse_flow_count + b->bulk_flow_count)) {
> if (b->tin_deficit <= 0)
> @@ -1923,6 +1938,7 @@ begin:
> q->cur_tin = 0;
> b = q->tins;
> }
> + COUNT_LOOP(i);
> }
> } else {
> /* In shaped mode, choose:
> @@ -1960,8 +1976,10 @@ retry:
> head = &b->old_flows;
> if (unlikely(list_empty(head))) {
> head = &b->decaying_flows;
> - if (unlikely(list_empty(head)))
> + if (unlikely(list_empty(head))) {
> + COUNT_LOOP(j);
> goto begin;
> + }
> }
> }
> }
> @@ -2008,6 +2026,7 @@ retry:
> flow->set = CAKE_SET_SPARSE_WAIT;
> }
> }
> + COUNT_LOOP(k);
> goto retry;
> }
>
> @@ -2050,6 +2069,7 @@ retry:
> srchost->srchost_refcnt--;
> dsthost->dsthost_refcnt--;
> }
> + COUNT_LOOP(l);
> goto begin;
> }
>
> @@ -2075,6 +2095,8 @@ retry:
> kfree_skb(skb);
> if (q->rate_flags & CAKE_FLAG_INGRESS)
> goto retry;
> +
> + COUNT_LOOP(m);
> }
>
> b->tin_ecn_mark += !!flow->cvars.ecn_marked;
>
>
>
>
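For anyone following along: the state described above (sch->q.qlen > 0 while every tin backlog is 0) amounts to a broken accounting invariant. The following is only a minimal userspace sketch of that invariant, not kernel code and not part of the patch. All the model_* names and MODEL_MAX_TINS are hypothetical stand-ins for sch->q.qlen, sch->qstats.backlog, tin_backlog and CAKE_MAX_TINS, and it assumes every packet has a nonzero length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_TINS 8 /* hypothetical; stands in for CAKE_MAX_TINS */

/* Hypothetical userspace model, not the kernel structures. */
struct model_tin {
    uint32_t backlog_bytes; /* plays the role of tin_backlog */
};

struct model_qdisc {
    uint32_t qlen;          /* packets queued, like sch->q.qlen */
    uint32_t backlog_bytes; /* bytes queued, like sch->qstats.backlog */
    struct model_tin tins[MODEL_MAX_TINS];
};

/* Sum the per-tin byte backlogs, as the tot_bkl loop in the debug patch does. */
uint64_t model_total_tin_backlog(const struct model_qdisc *q)
{
    uint64_t total = 0;

    for (size_t n = 0; n < MODEL_MAX_TINS; n++)
        total += q->tins[n].backlog_bytes;
    return total;
}

/*
 * The accounting is consistent when the per-tin byte total matches the
 * qdisc-wide byte counter, and the packet count is zero exactly when
 * the byte total is zero.
 */
bool model_accounting_consistent(const struct model_qdisc *q)
{
    uint64_t total = model_total_tin_backlog(q);

    if (total != q->backlog_bytes)
        return false;
    return (q->qlen == 0) == (total == 0);
}

/* Enqueue: every counter is updated together (the "paired" updates). */
void model_enqueue(struct model_qdisc *q, size_t tin, uint32_t len)
{
    assert(tin < MODEL_MAX_TINS);
    assert(len > 0);
    q->tins[tin].backlog_bytes += len;
    q->backlog_bytes += len;
    q->qlen++;
}

/* Dequeue: refuses to underflow any counter; returns false if it would. */
bool model_dequeue(struct model_qdisc *q, size_t tin, uint32_t len)
{
    assert(tin < MODEL_MAX_TINS);
    if (q->qlen == 0 || q->tins[tin].backlog_bytes < len ||
        q->backlog_bytes < len)
        return false;
    q->tins[tin].backlog_bytes -= len;
    q->backlog_bytes -= len;
    q->qlen--;
    return true;
}
```

Setting the model's counters to the values from the sample debug output (qlen 2, backlog 33184, all tin backlogs 0) makes model_accounting_consistent() return false. That is the same mismatch the log line shows, which suggests looking for a path that changes one counter without the others.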