From: Jonathan Foulkes
Date: Mon, 10 Aug 2020 13:58:41 -0400
To: davecb@spamcop.net
Cc: Jonathan Morton, tomh@tomh.org, "dave.collier-brown@indexexchange.com", Y via Bloat
Message-Id: <95BA0E2B-9DB3-433F-804F-118AC7A90F5E@jonathanfoulkes.com>
Subject: Re: [Bloat] How about a topical LWN article on demonstrating the real-world goodness of CAKE?

Hi David,

Great topic, and glad you brought it up, as increasing awareness of all the goodness that de-bloating brings to end users is important, especially with all the WFH and, soon, teaching/learning going on these days.

Background / disclosure: I’m the founder and CEO of Evenroute, so first, thank you for your order :-) Second, please let me know if you have any questions once you get the unit. Happy to personally support you, or any member of this list.

Given I have access to the combined data of the deployed IQrouter fleet, I have a pretty good view of how well Cake performs in the real world, as we are on nearly every ISP domestically (US) and a surprising number of international deployments as well (even though we only market in the US).
As you might imagine, given our marketing, we have a huge number of users with really bad lines, or on challenging tech, like WISPs & satellite. So we get to see the worst of the worst.
But interestingly enough, we also have a certain number of users with extremely low and stable latencies with no QoS, yet they still continue to deploy their IQrouters, likely because the prior ISP-supplied device *added* latency, and because of the fairness that Cake's per-device, per-host settings provide, plus correct prioritization based on traffic type & DSCP marks (most WiFi-calling smartphone traffic is correctly marked).
>>> Are the risks and tradeoffs well enough understood (and visible enough for troubleshooting) to recommend broader deployment?
We’ve been deploying SQM / CAKE for 4+ years in the IQrouter, and while we have evolved a lot of our algorithms to deal with the millions of permutations in configurations and settings, I’ve seen SQM and lately CAKE likewise mature, and my assessment is they are indeed ready for prime time in terms of foundational tech.
The challenge is not the core tech, it’s accessibility. That was my take in 2015 when I first discovered it, and it led to the founding of my company.

As for troubleshooting visibility, check out the Status->Ping Stats page once your IQrouter has been running for a few days. Very helpful in triaging modem and line issues. Basically it's a line capacity usage monitor and ping plotter.

> but in my view we haven't converted "grandma".

Because until I produced a product with zero user configuration requirements (relative to QoS), ‘grandma’ was never a viable user.
Back story: I live in a large (3,000 homes) development in a rural area. Most residents are retired professionals, and DSL was the only choice up until recently, so a bunch of non-technical grandmas and grandpas are my neighbors. The IQrouter was developed to meet the needs of that audience. Grandma should be able to deploy in 15 minutes and have a de-bloated DSL connection that gets rid of the 5,000ms+ lag spikes. So the initial config workflow, and all the tuning and dynamic line adaptation, were largely born of dealing with my local needs.

Funny enough, our focus on non-techie usability led to a skeptical backlash from some ‘techies’, who find our marketing messaging and simple UI too off-putting, even though we do expose the full native UI for OpenWRT under the ‘Advanced Menus’ option. Heck, you can even install OpenWRT packages on this thing.
Smart techies love it though. A recent IT guy, after our support team let him know about the OpenWRT support, wrote: “Now, I know that this really is the best router for all consumers to use in their homes!” We made his WISP service usable.

> Because of the degree to which we're working from home and videoconferencing, a lot of low-price, medium-performance devices are suddenly too wimpy for their new role.

Big time. As noted above, it's not just the CPE devices, it’s the congestion on the backhauls that causes issues, and it’s everything from DSL (slammed DSLAMs are endemic) to cable systems with oversubscribed local loops and congested CMTS backhaul. Hell, even fiber-to-the-home ISPs manage to have variable capacity (and bloat) in the evenings.

Traffic management is a requirement these days.

>> I propose we show the results in terms that we can explain to Grandma, specifically concentrating on functioning VOIP.

This is desperately needed, as there need to be more points of proof and articles outlining the problem, and the impacts of resolving it with effective traffic management.

Stuff like this article from Jim Gettys on the needs of teachers: "Bufferbloat in Action due to Covid-19". BTW, he calls out the IQrouter as his ‘Go to’ recommendation for non-techies. He runs one himself and gave one to his non-techie brother.

More comments on the other response later, got work to do ;-)

Cheers,

Jonathan Foulkes


> On Aug 10, 2020, at 8:57 AM, David Collier-Brown <davecb.42@gmail.com> wrote:


> On 2020-08-09 5:35 p.m., Jonathan Morton wrote:

>>> Are the risks and tradeoffs well enough understood (and visible enough for troubleshooting) to recommend broader deployment?

>>> I recently gave openwrt a try on some hardware that I ultimately concluded was insufficient for the job. Fairly soon after changing out my access point, I started getting complaints of Wi-Fi dropping in my household, especially when someone was trying to videoconference. I discovered that my AP was spontaneously rebooting, and the box was getting hot.

>> Most CPE devices these days rely on hardware accelerated packet forwarding to achieve their published specs. That's all about taking packets in one side and pushing them out the other as quickly as possible, with only minimal support from the CPU (likely, new connections get a NAT/firewall lookup, that's all). It has the advantages of speed and power efficiency, but unfortunately it is also incompatible with our debloating efforts. So debloated CPE will tend to run hotter and with lower peak throughput, which may be noticeable to cable and fibre users; VDSL (FTTC) users might have service of 80Mbps or less where this effect is less likely to matter.

>> It sounds like that AP had a very marginal thermal design which caused the hardware to overheat as soon as the CPU was under significant load, which it can easily be when a shaper and AQM are running on it at high throughput. The cure is to use better designed hardware, though you could also contemplate breaking the case open to cure the thermal problem directly. There are some known reliable models which could be collected into a list. As a rule of thumb, the ones based on ARM cores are likely to be designed with CPU performance more in mind than those with MIPS.

>> Cake has some features which can be used to support explicit classification and (de)prioritisation of traffic via firewall marking rules, either by rewriting the Diffserv field or by associating metadata with packets within the network stack (fwmark). This can be very useful for pushing Bittorrent or WinUpdate swarm traffic out of the way. But for most situations, the default flow-isolating behaviour already works pretty well, especially for ensuring that one computer's network load has only a bounded effect on any other. We can discuss that in more detail if that would be helpful.
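
To make that a bit more concrete, here is a minimal sketch of how such a Cake setup could be expressed on a Linux router. It only builds the `tc` command line and does not run anything. The interface name `eth0`, the 150 Mbit/s rate and the helper name `cake_command` are illustrative assumptions, not anything from this thread; the keywords themselves (`diffserv3`, `diffserv4`, `triple-isolate`, `dual-srchost`, `nat` and so on) are the ones listed in the tc-cake man page.

```python
import shlex

# Keywords documented in the tc-cake man page.
DIFFSERV_MODES = {"besteffort", "precedence", "diffserv3", "diffserv4", "diffserv8"}
ISOLATION_MODES = {
    "flowblind", "srchost", "dsthost", "hosts", "flows",
    "dual-srchost", "dual-dsthost", "triple-isolate",
}


def cake_command(dev, bandwidth_mbit, diffserv="diffserv3",
                 isolation="triple-isolate", nat=False):
    """Build (but do not run) a tc command that installs Cake as root qdisc.

    `cake_command` is a hypothetical helper for illustration only.
    The defaults mirror Cake's own defaults: diffserv3 and triple-isolate.
    """
    if bandwidth_mbit <= 0:
        raise ValueError("bandwidth must be positive")
    if diffserv not in DIFFSERV_MODES:
        raise ValueError(f"unknown Diffserv mode: {diffserv}")
    if isolation not in ISOLATION_MODES:
        raise ValueError(f"unknown isolation mode: {isolation}")
    args = ["tc", "qdisc", "replace", "dev", dev, "root", "cake",
            "bandwidth", f"{bandwidth_mbit}mbit", diffserv, isolation]
    if nat:
        # Look up the pre-NAT address so per-host fairness sees LAN hosts.
        args.append("nat")
    return args


if __name__ == "__main__":
    # Assumed example: egress shaping with four tins and per-source-host fairness.
    print(shlex.join(cake_command("eth0", 150, diffserv="diffserv4",
                                  isolation="dual-srchost", nat=True)))
```

Pushing Bittorrent-style traffic out of the way would additionally need firewall rules that set the DSCP or fwmark on those packets; that part is not shown here.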

> I'm primarily thinking of this week's version of the home router problem (;-))

> Because of the degree to which we're working from home and videoconferencing, a lot of low-price, medium-performance devices are suddenly too wimpy for their new role.

> A (very!) draft version is up in Google docs, at https://docs.google.com/document/d/1gWKp9HqTbuHLfgD59WU4KJ8Og3eHuBtIeC7BUK0Ju9w/edit?usp=sharing

> Using myself as the guinea-pig, running pfifo_fast was clearly bad, fq_codel was better, and cake was good with a newish Fedora and the stock Rogers router. It's been a while since I did rrul tests, and in any case, I think that to convince readers we need a very practical way of making it clear that they have a problem. I'm thinking that making VOIP fail might do the trick (;-))

> The hard part, IMHO, is constructing a test that immediately communicates the idea that the reader has a problem, and that CAKE addresses it.
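
One rough way to put a number on "you have a problem" is to compare ping times on an idle line with ping times while the line is saturated. The sketch below assumes the RTT samples have already been collected (for example by hand from ping output). The sample values and the 100 ms threshold are made up for illustration, not measurements or a standard, and `bloat_ms` and `verdict` are hypothetical helper names.

```python
from statistics import median


def bloat_ms(idle_rtts_ms, loaded_rtts_ms):
    """Return the increase in median RTT (ms) when the line is under load."""
    if not idle_rtts_ms or not loaded_rtts_ms:
        raise ValueError("need at least one sample for each condition")
    return median(loaded_rtts_ms) - median(idle_rtts_ms)


def verdict(increase_ms, threshold_ms=100.0):
    """Classify the added latency; the threshold is an illustrative assumption."""
    return "bloated" if increase_ms > threshold_ms else "ok"


if __name__ == "__main__":
    # Made-up sample values, NOT real measurements.
    idle = [18, 20, 19, 21, 20]
    loaded = [450, 1200, 980, 1100, 760]
    extra = bloat_ms(idle, loaded)
    print(f"median RTT grew by {extra} ms under load: {verdict(extra)}")
```

Running the same comparison before and after enabling Cake would give a before/after pair that is easy to explain to a non-technical reader.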

> Returning to the hardware question, https://evenroute.com/iqrv3 seems to be capable of handling up to ~300 Mbit/s connections, and my ISP only delivers 170 (and advertises 150, which is mildly surprising!)

> I just ordered one, so I'll have a 'plug in' example, along with reflashing my linksys for the umpty-thousandth time.

> --dave


>> I suspect not enough people are aware of the later efforts of the bufferbloat team, so I'm thinking of one or two articles, starting with LWN and an audience of aficionados.

>> The core community is aware of what we've done, but in my view we haven't converted "grandma". Grandma, as well as a whole bunch of ordinary engineers and partners of engineers, are dependent on debloated performance because they're working at home now, and competing with granddaughter playing video games while they're trying to hold a video call.

>> Right now, my colleagues at work suffer from more than a second of bloat-related lag. They therefore tend to speak over each other on con-calls, apologize, start again and talk over each other, again. After a little while, the picture becomes a distinctly silly one: a bunch of grown adults putting their hands up and waving, like little kids in school. No-one has called out “me, me, teacher” yet, but I expect it any time.

>> I propose we show the results in terms that we can explain to Grandma, specifically concentrating on functioning VOIP. I just upgraded to Fedora 31, and the networking is absolutely stock, so I make a perfect victim/guinea-pig (;-))

>> Who's interested?




> -- 
> David Collier-Brown,         | Always do right. This will gratify
> System Programmer and Author | some people and astonish the rest
> davecb@spamcop.net           |                      -- Mark Twain
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
