[Bloat] How about a topical LWN article on demonstrating the real-world goodness of CAKE?

Daniel Sterling sterling.daniel at gmail.com
Mon Aug 10 10:00:28 EDT 2020


I've been wanting for a while to write up what I did to improve my
home network.

Here's a quick overview:

I'm running a small laptop-class Sandy Bridge CPU in a small desktop
computer, running OpenWrt, running cake. It has two NICs -- the
built-in Realtek NIC, and an old Intel gigabit NIC in the PCI slot.

Internet comes in on the Realtek NIC (WAN in OpenWrt) and goes out the
Intel NIC (LAN in OpenWrt).

my internet is AT&T gigabit fiber, but I throttle that heavily with
cake (see below)

I manually apply cake with my own scripts. I'll post those on gist and
reply to this email with that info; I just wanted to write this up
quickly this morning. but it basically comes down to applying two
simple cake tc lines, one to each NIC.

For wifi I use UBNT's SOHO line -- Amplifi HD units.

it works really rather well; after some tweaking I've managed to
essentially get rid of the things that I've empirically found really
hurt home network performance:

1. wifi dead zones -- solved by using as many amplifi HD units as you
like, meshed or wired together. obviously wires are better than mesh,
and a dedicated backhaul set of APs is better than mesh, but mesh works
too.

2. wifi trying to use 5ghz when it's too slow and refusing to switch
to 2.4ghz -- solved by the amplifi AP having a setting where it kicks
devices off the 5ghz network proactively to convince them to switch to
2.4ghz. thank you UBNT!

3. TCP not dropping enough packets. (or rather, not having good queue
management)

4. TCP (or rather, the network) dropping too many TCP packets --
streams / apps / web sites will get "stuck"

so after much tweaking, I've got cake set to 40mbit down, 20mbit up,
enforced by two cakes (one for each NIC). that's fairly low --

it's low on purpose, to heavily throttle bulk streams so that I can play
latency-sensitive games with basically no jitter and low latency, even
if other people are using the wifi. even if I can't wire an xbox, I
can still get low latency gaming on wifi

but it's still high enough that we can stream HD video.
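
A minimal sketch of what "two cakes" at those rates could look like.
These are not the actual scripts; the interface names eth0 (WAN) and
eth1 (LAN) are assumptions, as is the choice to leave every other cake
option at its default. Downloads leave the router through the LAN NIC
and uploads through the WAN NIC, so each direction can be shaped with
cake as the egress qdisc on a different device. The Python below only
builds and prints the tc lines; it does not run them.

```python
import shlex

# Hypothetical interface names; substitute the router's real devices.
WAN_IF = "eth0"  # assumed: the NIC facing the ISP
LAN_IF = "eth1"  # assumed: the NIC facing the home network


def cake_cmd(dev, rate_mbit):
    """Build a tc command that installs cake as the root (egress) qdisc."""
    return ["tc", "qdisc", "replace", "dev", dev, "root",
            "cake", "bandwidth", f"{rate_mbit}mbit"]


def shaper_cmds(down_mbit=40, up_mbit=20):
    """Return one cake command per NIC, download first, then upload."""
    # Download traffic exits via the LAN NIC, upload traffic via the WAN NIC.
    return [cake_cmd(LAN_IF, down_mbit), cake_cmd(WAN_IF, up_mbit)]


if __name__ == "__main__":
    for cmd in shaper_cmds():
        print(shlex.join(cmd))
```

Running the printed lines as root on the router would apply the two
shapers; "tc qdisc replace" is used so the lines can be re-run after
changing a rate.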

and of course low jitter and low latency across the board means good
ssh and video call performance.

just wanted to write this up quickly to reply to this thread -- cake
really is amazing and I'd bet people would be willing to pay for a
magic box like I've set up that they can stick in between their
existing CPE and a decent AP that applies cake. or if AP vendors would
put cake in their APs themselves, that would be good too.

but as you note, #1 and #2 on my list are important, even before queue
management comes into play. you have to be willing to buy a good AP
before cake really starts to matter, I think

-- Dan

On Mon, Aug 10, 2020 at 8:57 AM David Collier-Brown <davecb.42 at gmail.com> wrote:
>
> On 2020-08-09 5:35 p.m., Jonathan Morton wrote:
>
> Are the risks and tradeoffs well enough understood (and visible enough
> for troubleshooting) to recommend broader deployment?
>
> I recently gave openwrt a try on some hardware that I ultimately
> concluded was insufficient for the job.  Fairly soon after changing out
> my access point, I started getting complaints of Wi-Fi dropping in my
> household, especially when someone was trying to videoconference.  I
> discovered that my AP was spontaneously rebooting, and the box was
> getting hot.
>
> Most CPE devices these days rely on hardware accelerated packet forwarding to achieve their published specs.  That's all about taking packets in one side and pushing them out the other as quickly as possible, with only minimal support from the CPU (likely, new connections get a NAT/firewall lookup, that's all).  It has the advantages of speed and power efficiency, but unfortunately it is also incompatible with our debloating efforts.  So debloated CPE will tend to run hotter and with lower peak throughput, which may be noticeable to cable and fibre users; VDSL (FTTC) users might have service of 80Mbps or less where this effect is less likely to matter.
>
> It sounds like that AP had a very marginal thermal design which caused the hardware to overheat as soon as the CPU was under significant load, which it can easily be when a shaper and AQM are running on it at high throughput.  The cure is to use better designed hardware, though you could also contemplate breaking the case open to cure the thermal problem directly.  There are some known reliable models which could be collected into a list.  As a rule of thumb, the ones based on ARM cores are likely to be designed with CPU performance more in mind than those with MIPS.
>
> Cake has some features which can be used to support explicit classification and (de)prioritisation of traffic via firewall marking rules, either by rewriting the Diffserv field or by associating metadata with packets within the network stack (fwmark).  This can be very useful for pushing Bittorrent or WinUpdate swarm traffic out of the way.  But for most situations, the default flow-isolating behaviour already works pretty well, especially for ensuring that one computer's network load has only a bounded effect on any other.  We can discuss that in more detail if that would be helpful.
>
> I'm primarily thinking of this week's version of the home router problem (;-))
>
> Because of the degree to which we're working from home and videoconferencing, a lot of low-price, medium-performance devices are suddenly too wimpy for their new role.
>
> A (very!) draft version is up in Google docs, at https://docs.google.com/document/d/1gWKp9HqTbuHLfgD59WU4KJ8Og3eHuBtIeC7BUK0Ju9w/edit?usp=sharing
>
> Using myself as the guinea-pig, running pfifo_fast was clearly bad, fq_codel was better, and cake was good with a newish Fedora and the stock Rogers router.  It's been a while since I did rrul tests, and in any case, I think that to convince readers we need a very practical way of making it clear that they have a problem. I'm thinking that making VOIP fail might do the trick (;-))
>
> The hard part, IMHO, is constructing a test that immediately communicates the idea that the reader has a problem, and that CAKE addresses it.
>
> Returning to the hardware question, https://evenroute.com/iqrv3 seems to be capable of handling up to ~300 Mbit/s connections, and my ISP only delivers 170 (and advertises 150, which is mildly surprising!)
>
> I just ordered one, so I'll have a "plug in" example, along with reflashing my Linksys for the umpty-thousandth time.
>
> --dave
>
>  I suspect not enough people are aware of the later efforts of the bufferbloat team, so I'm thinking of one or two articles, starting with LWN and an audience of aficionados.
>
> The core community is aware of what we've done, but in my view we haven't converted "grandma". Grandma, as well as a whole bunch of ordinary engineers and partners of engineers, are dependent on debloated performance because they're working at home now, and competing with granddaughter playing video games while they're trying to hold a video call.
>
> Right now, my colleagues at work suffer from more than a second of bloat-related lag. They therefore tend to speak over each other on con-calls, apologize, start again and talk over each other, again. After a little while, the picture becomes a distinctly silly one: a bunch of grown adults putting their hands up and waving, like little kids in school. No-one has called out “me, me, teacher” yet, but I expect it any time.
>
> I propose we show the results in terms that we can explain to Grandma, specifically concentrating on functioning VOIP. I just upgraded to Fedora 31, and the networking is absolutely stock, so I make a perfect victim/guinea-pig (;-))
>
> Who's interested?
>
>
>
>
> --
> David Collier-Brown,         | Always do right. This will gratify
> System Programmer and Author | some people and astonish the rest
> davecb at spamcop.net           |                      -- Mark Twain

