From: Toke Høiland-Jørgensen
To: Jonathan Morton
Cc: Georgios Amanakis, Cake List
Subject: Re: [Cake] cake at 60gbit
Date: Fri, 06 Jul 2018 00:31:11 +0200
Message-ID: <8736wxco28.fsf@toke.dk>
In-Reply-To: <87fu10haw7.fsf@toke.dk>

Toke
Høiland-Jørgensen writes:

> Jonathan Morton writes:
>
>>> On 3 Jul, 2018, at 1:23 am, Toke Høiland-Jørgensen wrote:
>>>
>>> My hunch is that this has something to do with the way mlx5 uses
>>> multiple receive queues (and thus multiple CPUs). Which is probably
>>> different from veth...
>>
>> At this stage I'm pretty confident it has nothing to do with Cake, and
>> everything to do with the Mellanox hardware and driver. It does strike
>> me that Linux' default handling of multiqueue hardware doesn't map
>> very well to the qdisc interface.
>
> Well, it doesn't happen with fq_codel, so even if it is a driver bug, it
> is being triggered by cake specifically...

Right, so I finally got some time to investigate this further. I
suspected that cake_dequeue() was looping forever, so I added some debug
statements to check; and it turns out I was right. Using the debug patch
below, I get loop aborts on loop 'i' in unlimited mode, and on loop 'l'
if I enable the shaper at 70 gbit. It happens pretty reliably, but only
when I load up the link sufficiently (it needs 4-6 TCP flows which get
~50 Gbps of total throughput).

The weird thing is that cake somehow appears to get into a state where
sch->q.qlen is > 0 while all tin backlogs are 0. I have no clue how this
happens; as far as I can tell, all changes to tin_backlog are paired
with a change to q.qlen. The only thing outside of cake itself that
modifies q.qlen is peek(), which is not being used here.

I'm giving up for tonight; if anyone else has any ideas, I'm all ears.

-Toke

Sample debug output:

[ 5456.068281] Loop counter i hit 100k; aborting!
i 100001 j 0 k 180 l 3 m 0 qlen 2 qbkllog 33184 tin 2 deficit 172 tot backlog 0

With this debug patch:

@@ -1892,6 +1892,20 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
 	u64 delay;
 	u32 len;
 
+	int i=0,j=0,k=0,l=0,m=0;
+
+#define COUNT_LOOP(v) do {						\
+		if (++v > 100000) {					\
+			int tot_bkl = 0;				\
+			struct cake_tin_data *t;			\
+			int n;						\
+			for(n=0,t = q->tins; n < CAKE_MAX_TINS; n++,t++) \
+				tot_bkl += t->tin_backlog;		\
+			net_warn_ratelimited("Loop counter " #v " hit 100k; aborting! i %d j %d k %d l %d m %d qlen %d qbkllog %d tin %d deficit %d tot backlog %d", i, j, k, l, m, sch->q.qlen, sch->qstats.backlog, q->cur_tin, b->tin_deficit, tot_bkl); \
+			return NULL;					\
+		}							\
+	} while(0);
+
 begin:
 	if (!sch->q.qlen)
 		return NULL;
@@ -1912,6 +1926,7 @@ begin:
 		/* In unlimited mode, can't rely on shaper timings, just balance
 		 * with DRR
 		 */
+		i=0;
 		while (b->tin_deficit < 0 ||
 		       !(b->sparse_flow_count + b->bulk_flow_count)) {
 			if (b->tin_deficit <= 0)
@@ -1923,6 +1938,7 @@ begin:
 				q->cur_tin = 0;
 				b = q->tins;
 			}
+			COUNT_LOOP(i);
 		}
 	} else {
 		/* In shaped mode, choose:
@@ -1960,8 +1976,10 @@ retry:
 				head = &b->old_flows;
 				if (unlikely(list_empty(head))) {
 					head = &b->decaying_flows;
-					if (unlikely(list_empty(head)))
+					if (unlikely(list_empty(head))) {
+						COUNT_LOOP(j);
 						goto begin;
+					}
 				}
 			}
 		}
@@ -2008,6 +2026,7 @@ retry:
 				flow->set = CAKE_SET_SPARSE_WAIT;
 			}
 		}
+		COUNT_LOOP(k);
 		goto retry;
 	}
 
@@ -2050,6 +2069,7 @@ retry:
 			srchost->srchost_refcnt--;
 			dsthost->dsthost_refcnt--;
 		}
+		COUNT_LOOP(l);
 		goto begin;
 	}
 
@@ -2075,6 +2095,8 @@ retry:
 		kfree_skb(skb);
 		if (q->rate_flags & CAKE_FLAG_INGRESS)
 			goto retry;
+
+		COUNT_LOOP(m);
 	}
 
 	b->tin_ecn_mark += !!flow->cvars.ecn_marked;
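
For anyone who wants to reason about the bad state without the hardware, here is a rough userspace sketch of the invariant described above: the packet count and the summed per-tin byte backlog should be zero or non-zero together. This is not code from sch_cake; every name starting with model_ or MODEL_ is made up for illustration, and it assumes every packet has a non-zero length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_TINS 8 /* hypothetical stand-in for CAKE_MAX_TINS */

/* Hypothetical userspace stand-ins for the counters named in the mail. */
struct model_tin {
	uint32_t tin_backlog; /* bytes queued in this tin */
};

struct model_qdisc {
	uint32_t qlen; /* packets queued overall, like sch->q.qlen */
	struct model_tin tins[MODEL_MAX_TINS];
};

/* Sum the per-tin byte backlogs, the same sum COUNT_LOOP() prints. */
static uint64_t model_total_backlog(const struct model_qdisc *q)
{
	uint64_t tot = 0;

	for (int n = 0; n < MODEL_MAX_TINS; n++)
		tot += q->tins[n].tin_backlog;
	return tot;
}

/* True when "packets are queued" and "bytes are queued" agree.
 * Assumes no zero-length packets.
 */
static bool model_counters_consistent(const struct model_qdisc *q)
{
	return (q->qlen > 0) == (model_total_backlog(q) > 0);
}

/* Paired update on enqueue: both counters change together. */
static void model_enqueue(struct model_qdisc *q, int tin, uint32_t len)
{
	assert(tin >= 0 && tin < MODEL_MAX_TINS);
	q->tins[tin].tin_backlog += len;
	q->qlen++;
}

/* Paired update on dequeue: both counters change together. */
static void model_dequeue(struct model_qdisc *q, int tin, uint32_t len)
{
	assert(tin >= 0 && tin < MODEL_MAX_TINS);
	assert(q->qlen > 0 && q->tins[tin].tin_backlog >= len);
	q->tins[tin].tin_backlog -= len;
	q->qlen--;
}
```

As long as only the paired helpers touch the counters, the check holds; the state in the sample output (qlen 2, tot backlog 0) fails it. In the qdisc itself, a similar predicate behind WARN_ON_ONCE() after each enqueue and dequeue might catch the first unpaired update instead of the later spin, though I have not tried that.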