[Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.
David Lang
david at lang.hm
Fri Jul 25 17:03:38 EDT 2014
On Fri, 25 Jul 2014 14:37:34 -0400, Valdis.Kletnieks at vt.edu wrote:
> On Sat, 24 May 2014 10:02:53 -0400, "R." said:
>
>> Further, this function could be auto-scheduled or made enabled on
>> router boot up.
>
> Yeah, if such a thing worked, it would be good.
>
> (Note in the following that a big part of my *JOB* is doing "What could
> possibly go wrong?" analysis on mission-critical systems, which tends to
> color my viewpoint on projects. I still think the basic concept is good,
> just difficult to do, and am listing the obvious challenges for anybody
> brave enough to tackle it... :)
>
>> I must be missing something important which prevents this. What is it?
>
> There's a few biggies. The first is what the linux-kernel calls
> -ENOPATCH - nobody's written the code. The second is you need an upstream
> target someplace to test against. You need to deal with both the "server
> is unavailable due to a backhoe incident 2 time zones away" problem (which
> isn't *that* hard, just default to Something Not Obviously Bad(TM)), and
> "server is slashdotted" (which is a bit harder to deal with). Remember
> that there's some really odd corner cases to worry about - for instance,
> if there's a power failure in a town, then when the electric company
> restores power you're going to have every cerowrt box hit the server
> within a few seconds - all over the same uplink most likely. No good data
> can result from that... (Holy crap, it's been almost 3 decades since I
> first saw a Sun 3/280 server tank because 12 Sun 3/50s all rebooted over
> the network at once when building power was restored).
>
> And if you're in Izbekistan and the closest server netwise is at 60
> Hudson, the analysis to compute the correct values becomes....
> interesting.
>
> Dealing with non-obvious error conditions is also a challenge - a router
> may only boot once every few months. And if you happen to be booting just
> as a BGP routing flap is causing your traffic to take a vastly suboptimal
> path, you may end up encoding a vastly inaccurate setting and have it
> stuck there, causing suckage for non-obvious reasons for the
> non-technical, so you really don't want to enable auto-tuning unless you
> also have a good plan for auto-*RE*tuning....
Have the router record its finding, and then repeat the test
periodically, recording each new finding as well. If the new finding is
substantially different from the prior ones, schedule a retest 'soon'
(or default to the prior setting if it's bad enough). Otherwise, if
there aren't many samples, schedule a test 'soon'; if there are a lot of
samples, schedule a test in a while.
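
As a rough sketch of that retest logic (not tested on a router): the
function name, the thresholds for "substantially different", "bad enough"
and "many samples", and the intervals for 'soon' and 'in a while' are all
made-up assumptions, as is comparing against the median of the prior
findings.

```python
from statistics import median

# Every constant here is an illustrative assumption, not a measured value.
SOON_S = 3600             # 'soon': retest in an hour
LATER_S = 7 * 24 * 3600   # 'in a while': retest in a week
FEW_SAMPLES = 5           # below this, there "aren't many samples"
DIFF_RATIO = 0.25         # "substantially different" from prior findings
BAD_RATIO = 0.50          # "bad enough" to fall back to the prior setting


def next_step(history, new_kbps):
    """Return (setting_kbps, seconds_until_next_test, updated_history).

    history is the list of earlier bandwidth findings (positive numbers),
    new_kbps is the finding from the test that just ran.
    """
    updated = history + [new_kbps]      # always record the new finding
    if not history:
        return new_kbps, SOON_S, updated

    prior = median(history)
    deviation = abs(new_kbps - prior) / prior

    if deviation > BAD_RATIO:
        # Too far off to trust: keep the prior setting, retest soon.
        return prior, SOON_S, updated
    if deviation > DIFF_RATIO:
        # Substantially different: use it for now, but retest soon.
        return new_kbps, SOON_S, updated
    if len(updated) < FEW_SAMPLES:
        # Consistent, but not many samples yet.
        return new_kbps, SOON_S, updated
    # Consistent and plenty of samples: test again in a while.
    return new_kbps, LATER_S, updated
```

A finding taken during a routing flap would then land in the "bad enough"
branch and get retested soon instead of sticking for months.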
However, I think the big question is how much the tuning is required.
If a connection with BQL and fq_codel is 90% as good as a tuned setup,
default to untuned unless the user explicitly hits a button to measure
(and then a second button to accept the measurement).
If BQL and fq_codel by default are 70% as good as a tuned setup,
there's more space to argue that all setups must be tuned, but then the
question is how do they fare against an old, non-BQL, non-fq_codel setup?
If they are considerably better, it may still be worthwhile.
David Lang