From: Benjamin Cronce
Date: Mon, 6 Mar 2017 07:30:23 -0600
To: Dave Taht
Cc: Jonathan Morton, "cake@lists.bufferbloat.net", Eric Luehrsen
Subject: Re: [Cake] [LEDE-DEV] Cake SQM killing my DIR-860L - was: [17.01] Kernel: bump to 4.4.51

On Fri, Mar 3, 2017 at 12:21 AM, Dave Taht <dave.taht@gmail.com> wrote:
> As this is devolving into a cake-specific discussion, removing the
> lede mailing list.
>
> On Thu, Mar 2, 2017 at 9:49 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
> >
> >> On 3 Mar, 2017, at 07:00, Eric Luehrsen <ericluehrsen@hotmail.com> wrote:
> >>
> >> That's not what I was going for. Agreed, it would not be good to depend
> >> on an inferior hash. You mentioned divide as a "cost", so I was
> >> proposing a thought around a "benefit" estimate. If hash collisions are
> >> not as important (or are they?), then what is "benefit / cost"?
> >
> > The computational cost of one divide is not the only consideration I
> > have in mind.
> >
> > Cake's set-associative hash is fundamentally predicated on the number
> > of hash buckets *not* being prime, as it requires further decomposing
> > the hash into a major and minor part when a collision is detected. The
> > minor part is then iterated to try to locate a matching or free bucket.
> >
> > This is considerably easier to do and reason about when everything is
> > a power of two. Then modulus is a masking operation and divide is a
> > shift, either of which can be done in one cycle flat.
> >
> > AFAIK, however, the main CPU cost of the hash function in Cake is not
> > the hash itself, but the packet dissection required to obtain the data
> > it operates on. This is something a profile would shed more light on.
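To make sure I follow the major/minor idea, here is roughly how I picture
it - a toy sketch with made-up constants and names, not Cake's actual code:

/*
 * Toy sketch of a set-associative flow-hash lookup with a power-of-two
 * bucket count.  Hash value 0 is reserved here to mean "free".
 */
#include <stdint.h>

#define NUM_BUCKETS 1024                     /* power of two */
#define SET_WAYS    8                        /* buckets probed per set */
#define NUM_SETS    (NUM_BUCKETS / SET_WAYS) /* also a power of two */

static uint32_t bucket_tag[NUM_BUCKETS];     /* stored flow hash, 0 = free */

/* Return a bucket index for 'hash', or -1 if the whole set is taken. */
static int set_assoc_lookup(uint32_t hash)
{
	uint32_t set   = hash & (NUM_SETS - 1);              /* major part: mask */
	uint32_t start = (hash / NUM_SETS) & (SET_WAYS - 1); /* minor part: shift+mask */
	uint32_t base  = set * SET_WAYS;

	for (uint32_t i = 0; i < SET_WAYS; i++) {
		uint32_t idx = base + ((start + i) & (SET_WAYS - 1));

		if (bucket_tag[idx] == hash)      /* same flow already here */
			return idx;
		if (bucket_tag[idx] == 0) {       /* free bucket, claim it */
			bucket_tag[idx] = hash;
			return idx;
		}
	}
	return -1;                                /* genuine collision, set full */
}

With both constants powers of two, the divide and the modulus compile down
to a shift and a mask, which I take to be the one-cycle property mentioned
above.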
> Tried. MIPS wasn't a good target.
>
> The jhash3 setup cost is bad, but I agree flow dissection can be deeply
> expensive, as can the other 42+ functions a packet needs to traverse to
> get from ingress to egress.
>
> But staying on hashing:
>
> One thing that landed in 4.10 or 4.11 was fq_codel relying on an
> skb->hash if one already existed (injected already by TCP, by hardware,
> or by the tunneling tool). We only need to compute a partial hash on the
> smaller subset of keys in that case (if we can rely on the skb->hash,
> which we cannot do in the NAT case).
>
> Another thing I did, long ago, was read the (60s-era!) literature about
> set-associative CPU cache architectures... and...
>
> In all of these cases I really, really wanted to just punt all this
> extra work to hardware on ingress - computing 3 hashes can easily be
> done in parallel there and appended to the packet as it completes.
>
> I have been working quite a bit more with the ARM architecture of late,
> and the "perf" profiler over there is vastly better than the MIPS one
> we've had.
>
> (and aarch64 is *nice*. So is NEON)
>
> - but I hadn't got around to dinking with cake there until yesterday.
>
> One thing I'm noticing is that even the gigE-capable ARMs have weak or
> non-existent L2 caches, and generally struggle to get past 700 Mbit/s
> bidirectionally on the network.
>
> Some quick tests of pfifo vs cake on the "lime-2" (armv7 dual core) are
> here:
>
> http://www.taht.net/~d/lime-2/
>
> The rrul tests were not particularly pleasing. [1]
>
> ...
>
> A second thing on my mind is to be able to take advantage of A) more
> cores
>
> ... and B) hardware that increasingly has 4 or more lanes in it.
>
> 1) Presently fq_codel's (and cake's) behavior there when set as the
> default qdisc is sub-optimal - if you have 64 hardware queues, you end
> up with 64 instances, each with 1024 queues. While this might be awesome
> from an FQ perspective, I really don't think the AQM will be as good. Or
> maybe it might be - what happens with 64000 queues at 100 Mbit?
>
> 2) It's currently impossible to shape network traffic across cores.
> I'd like to imagine that with a single atomic exchange or sloppily
> shared values, shaping would be feasible.

When you need to worry about multithreading, many times perfect is very
much the enemy of good. Depending on how quickly you need the network to
react, you could do something along the lines of a "shared pool" of
bandwidth. Each core gets a split of the bandwidth, any unused bandwidth
can be added to the pool, and cores that want more bandwidth can take
bandwidth from the pool.

You could treat it like task stealing, except each core can generate
tokens that represent a quantum of bandwidth that is only valid for some
interval. If a core suddenly needs bandwidth, it can attempt to "take
back" from its publicly shared pool. If other cores have already
borrowed, it can attempt to borrow from another core. If it can't find
any spare bandwidth, it just waits for some interval related to how long
a quantum is valid, and assumes it's safe.

Or something... I don't know, it's 7 am and I just woke up.
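To make that half-awake idea slightly more concrete, here is a very rough
user-space sketch using C11 atomics. All names and numbers are invented
for illustration, and it only shows the shared-pool half, not the
per-core stealing; a real implementation would also expire stale
donations every interval:

/*
 * Shared-pool bandwidth budget across cores, sketched with C11 atomics.
 * Each core keeps a private per-interval byte budget and donates what it
 * does not expect to use; any core may borrow from the pool with a
 * single atomic operation.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CORES      4
#define INTERVAL_BYTES (125000000 / 100)          /* ~1 Gbit/s in 10 ms slices */
#define CORE_SHARE     (INTERVAL_BYTES / NUM_CORES)

static _Atomic int64_t shared_pool;               /* donated, unclaimed bytes */
static int64_t         local_budget[NUM_CORES];   /* touched only by its core */

/* Start of each interval: keep what we expect to use, donate the rest. */
static void refill(int core, int64_t expected_bytes)
{
	int64_t spare = CORE_SHARE - expected_bytes;

	local_budget[core] = CORE_SHARE;
	if (spare > 0) {
		atomic_fetch_add(&shared_pool, spare);
		local_budget[core] -= spare;
	}
}

/* May this core transmit 'len' bytes right now? */
static bool may_send(int core, int64_t len)
{
	if (local_budget[core] >= len) {
		local_budget[core] -= len;
		return true;
	}
	/* One atomic op to borrow from the pool; sloppy but cheap. */
	if (atomic_fetch_sub(&shared_pool, len) >= len)
		return true;
	atomic_fetch_add(&shared_pool, len);      /* lost the race, undo */
	return false;                             /* wait for the next interval */
}

A losing core briefly drives the pool negative before undoing its borrow,
which can make a neighbour's borrow fail spuriously - that seems
acceptable in the "sloppily shared values" spirit, since the worst case
is waiting one interval.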
> (also softirq is a single thread, I believe)
>
> 3) mq and mqprio are commonly deployed on the high end for this.
>
> So I've thought about doing up another version - call it - I dunno -
> smq - "smart multi-queue" - and seeing how far we could get.
>
> >  - Jonathan Morton
>
> [1] If you are on this list and are not using flent, tough. I'm not
> going through the trouble of generating graphs myself anymore.
>
> --
> Dave Täht
> Let's go make home routers and wifi faster! With better software!
> http://blog.cerowrt.org