From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dave.taht@gmail.com>
Received: from mail-qt1-x841.google.com (mail-qt1-x841.google.com
 [IPv6:2607:f8b0:4864:20::841])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by lists.bufferbloat.net (Postfix) with ESMTPS id 5F5253B29E
 for <cake@lists.bufferbloat.net>; Fri, 15 Feb 2019 15:45:37 -0500 (EST)
Received: by mail-qt1-x841.google.com with SMTP id a48so12481229qtb.4
 for <cake@lists.bufferbloat.net>; Fri, 15 Feb 2019 12:45:37 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:references:in-reply-to:from:date:message-id:subject:to
 :cc; bh=YJTbfuvia5zxL3BZL0cm6NY0CVOUKYEUITk5bLS5nnQ=;
 b=pYFZjMIqREtLXGgf4PsDBzVWz5QGo5jqOWvlyNhivr4ZAGOu730XxEiU5/nyNa8MZz
 5k+lWJwuWwsAf2fys6jnY9FzccHBT4BsqHA67VkMm4wkLUaIb1QXXtfU2l/wWFc425Ni
 SO+y3gMCrPQgS/rlEZv70iOdWsjiDGYsB3GFLe9kAfVnqA6/HycaF4ABI53W9+ddv+mg
 56N9EL7ZBgo4+3Eze5rEEhYWAqWogKCSGGiXiyUf+lrXUW5OJSUgqjhw3k7NiTE/wxds
 iNcmB1fmN0jZWUyOhp3LQ6Yy2kiIKpY/V2+jVyJIg6PmgBUq1LwR/QfsYSd+moLf8eiu
 CGow==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=YJTbfuvia5zxL3BZL0cm6NY0CVOUKYEUITk5bLS5nnQ=;
 b=Yzq541+NQSTQ/AIobDkSNiGga8S3BUuK9Ts57XzJwIl/KqIMXdwjKHlUk69JFRNaUK
 1IcDTXW8OqAhhuhaonzAMbsP8t7PLkt5xAK1/OKyCKvqraQAekOuNjGzGrjonUt2MUoE
 kpz2oAWwhNCY156czOWwBTuIODCOsTa4yFEbIHyYeTNikshZm8gAvnis78/0n0J4m3C5
 adP9lmacVdbOtrfGqG2n2Dfnmd53DMggoYwN1R6FPaTTHPUjj6WhNO1QZX7aw7Qt0VZm
 lgUUZXEJSWFXgsKReonAmGf68ZdB47KMhTCikQmWPp1spPrOTT6w2GlLxQ+mPiwfXHuy
 HGgA==
X-Gm-Message-State: AHQUAuauex6Ehg6jBmou1wmTtbVNgn8BzxKGEyjKMMDEV/j9nd0V1r3z
 CMCohOPeU8oRT5+rM56ZUXxqp8x5s8c5ngsCNiM=
X-Google-Smtp-Source: AHgI3IaaBWjkFEMWglVg17nnbs5YbJ0S85QtKH++W0ABx11KnAWDHzeKiLoqe84g9qYvtQdfBojkEUiIyEmE4ND5FCk=
X-Received: by 2002:ac8:3f46:: with SMTP id w6mr9369147qtk.175.1550263536826; 
 Fri, 15 Feb 2019 12:45:36 -0800 (PST)
MIME-Version: 1.0
References: <CAF3M4P1pqJkQMG7LOt4cTxpP4Z+SjDrm+4YXpXXaOCHL+rH4Cw@mail.gmail.com>
In-Reply-To: <CAF3M4P1pqJkQMG7LOt4cTxpP4Z+SjDrm+4YXpXXaOCHL+rH4Cw@mail.gmail.com>
From: Dave Taht <dave.taht@gmail.com>
Date: Fri, 15 Feb 2019 12:45:24 -0800
Message-ID: <CAA93jw6Zo_STv=Aq_iks75m5bAxBUHKnsuoXX-yKbHGKVHuo=g@mail.gmail.com>
To: Adrian Popescu <adriannnpopescu@gmail.com>
Cc: Cake List <cake@lists.bufferbloat.net>
Content-Type: text/plain; charset="UTF-8"
Subject: Re: [Cake] Dropping dropped
X-BeenThere: cake@lists.bufferbloat.net
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Cake - FQ_codel the next generation <cake.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/cake>,
 <mailto:cake-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/cake>
List-Post: <mailto:cake@lists.bufferbloat.net>
List-Help: <mailto:cake-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/cake>,
 <mailto:cake-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Fri, 15 Feb 2019 20:45:37 -0000

I still regard inbound shaping as our biggest deployment problem,
especially on cheap hardware.

Some days I want to go back to revisiting the ideas in the "bobbie"
shaper, other days...

In terms of speeding up cake:

* At higher speeds (e.g. > 200Mbit) cake tends to bottleneck on a
single cpu, in softirq. An LWN article just went by about a proposed
set of improvements for that:
https://lwn.net/SubscriberLink/779738/771e8f7050c26ade/

* Hardware multiqueue is more and more common (APU2 has 4). FQ_codel
is inherently parallel and could take advantage of hardware
multiqueue, if there were a better way to express it. What happens
nowadays, when running at line rate, is that you get the "mq"
scheduler with 4 fq_codel instances. But with 64 hardware queues,
increasingly common at >10GigE, I tend to think that having 64k
fq_codel queues (64 x the default 1024) is excessive. I'd love it if
there was a divisor in the mq -> subqdisc code so that we would have,
oh, 32 queues per hw queue in this case.

Worse, there's no way to attach a single globally shaped instance
(e.g. cake) to that hardware without forcing all those hardware queues
(even across cpus) into one. The ingress mirred code is also a problem
here. A "cake-mq" seemed feasible (basically you just turn the
shaper tracking into an atomic operation in three places), but the
overlying qdisc architecture for sch_mq -> subqdiscs has to be
extended or bypassed, somehow. (there's no way for sch_mq to
automagically pass sub-qdisc options to the next qdisc, and there's no
reason to have sch_mq

* I really liked the ingress "skb list" rework, but I'm not sure how
to get that from A to B.

* And I have a long-standing dream of being able to kill off mirred
entirely and just be able to write

tc qdisc add dev eth0 ingress cake bandwidth X

*  native codel is 32 bit, cake is 64 bit. I

* hashing three times as cake does is expensive. Getting a partial
hash and combining it into a final would be faster.

* 8-way set associative is slower than 4-way, and 4-way is almost
indistinguishable from 8-way. Even direct mapping

* The cake blue code is rarely triggered and inline
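
To make the partial-hash point in the list above concrete, here is a
minimal sketch of the idea, not cake's actual code. It hashes each
header field once and then derives the flow hash by cheaply mixing the
partial results, instead of running three full hashes. Everything named
here (partial_hash, combine, classify, the use of CRC32 and the
boost-style mixing constant) is an assumption made for illustration;
the kernel uses its own hash functions and key layout.

```python
import zlib

MASK32 = 0xFFFFFFFF

def partial_hash(field: bytes) -> int:
    """Hash one header field. CRC32 is only a stand-in for the kernel's hash."""
    return zlib.crc32(field) & MASK32

def combine(a: int, b: int) -> int:
    """Cheaply mix two 32-bit partial hashes (boost-style hash_combine)."""
    return (a ^ (b + 0x9E3779B9 + (a << 6) + (a >> 2))) & MASK32

def classify(src: bytes, dst: bytes, proto_ports: bytes):
    """Return (src_host_hash, dst_host_hash, flow_hash).

    Each field is hashed exactly once; the flow hash is built from the
    partial results rather than by rehashing the whole tuple.
    """
    src_h = partial_hash(src)
    dst_h = partial_hash(dst)
    rest_h = partial_hash(proto_ports)
    flow_h = combine(combine(src_h, dst_h), rest_h)
    return src_h, dst_h, flow_h
```

Two flows between the same pair of hosts share both host hashes and
differ only in the flow hash, which is what per-host and per-flow
isolation both need. Whether this actually wins in the kernel would
have to be measured.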

I really did want cake to be faster than htb+fq_codel. A while back I
started a project to basically resurrect "early cake" - which WAS 40%
faster than htb+fq_codel - and add in *only* the idea of an atomic
builtin hw-mq shaper, but I haven't got back to it.

https://github.com/dtaht/fq_codel_fast

With everything I ripped out, it was about 5% less cpu to start with.
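
For what a shaper shared across hardware queues might look like, here
is a rough sketch under stated assumptions. It is not the fq_codel_fast
code. Several per-queue workers charge packets against one shared "next
send time", so the aggregate rate is bounded no matter which queue a
packet came from. The SharedShaper class and run_queues helper are
hypothetical names, and the lock is only a stand-in for the single
atomic update a kernel implementation would use.

```python
import threading

class SharedShaper:
    """One rate limit shared by every queue (illustrative only)."""

    def __init__(self, rate_bps: int):
        # Nanoseconds of link time consumed by one byte at this rate.
        self.ns_per_byte = 8_000_000_000 // rate_bps
        self.next_send_ns = 0
        # Stand-in for an atomic update of next_send_ns.
        self._lock = threading.Lock()

    def charge(self, now_ns: int, nbytes: int) -> int:
        """Reserve link time for one packet; return when it may be sent."""
        with self._lock:
            start = max(now_ns, self.next_send_ns)
            self.next_send_ns = start + nbytes * self.ns_per_byte
            return start

def run_queues(shaper, n_queues: int, packets_per_queue: int, nbytes: int) -> int:
    """Simulate n_queues hardware queues all charging the same shaper."""
    def worker():
        for _ in range(packets_per_queue):
            shaper.charge(0, nbytes)

    threads = [threading.Thread(target=worker) for _ in range(n_queues)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shaper.next_send_ns
```

At 100Mbit a byte costs 80ns, so two back-to-back 1500 byte packets
offered at time 0 get send times of 0 and 120000ns regardless of which
queue offered them.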

I can't tell you how many times I've looked over

https://elixir.bootlin.com/linux/latest/source/net/sched/sch_mqprio.c

hoping that enlightenment would strike and there was a clean way to get
rid of that layer of abstraction.

But coming up with how to run more stuff in parallel was beyond my rcu-foo.