[Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.

Sat Jul 26 17:45:59 EDT 2014

On Sat, 26 Jul 2014, Sebastian Moeller wrote:

> On Jul 26, 2014, at 22:39 , David Lang <david at lang.hm> wrote:
>
>> by how much tuning is required, I wasn't meaning how frequently to tune, but 
>> how close default settings can come to the performance of an expertly tuned 
>> setup.
>
> 	Good question.
>
>>
>> Ideally the tuning takes into account the characteristics of the hardware of 
>> the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, 
>> VLAN tagging, ethernet with jumbo packet support for example), then you have 
>> overhead from the encapsulation that you would ideally take into account when 
>> tuning things.
>>
>> the question I'm talking about below is how much do you lose compared to the 
>> ideal if you ignore this sort of thing and just assume that the wire is dumb 
>> and puts the bits on them as you send them? By dumb I mean don't even allow 
>> for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound 
>> connections by the timing of your acks, etc. Just run BQL and fq_codel and 
>> start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) 
>> and shrink them based on long-term passive observation of the sender.
>
> 	As data talks I just did a quick experiment with my ADSL2+ line at 
> home. The solid lines in the attached plot show the results for proper shaping 
> with SQM (shaping to 95% of DSL link rates of downstream and upstream while 
> taking the link layer properties, that is ATM encapsulation and per packet 
> overhead into account) the broken lines show the same system with just the 
> link layer adjustments and per packet overhead adjustments disabled, but still 
> shaping to 95% of link rate (this is roughly equivalent to 15% underestimation 
> of the packet size). The actual test is netperf-wrapper's RRUL (4 tcp streams 
> up, 4 tcp streams down while measuring latency with ping and UDP probes). As 
> you can see from the plot just getting the link layer encapsulation wrong 
> destroys latency under load badly. The host is ~52ms RTT away, and with 
> fq_codel the ping time per leg is increased by just one codel target of 5ms 
> each, resulting in a modest latency increase of ~10ms with proper shaping for a 
> total of ~65ms, with improper shaping RTTs increase to ~95ms (they almost 
> double), so RTT increases by ~43ms. Also note how the extremes for the broken 
> lines are much worse than for the solid lines. In short I would estimate that 
> a slight misjudgment (15%) results in almost 80% increase of latency under 
> load. In other words getting the rates right matters a lot. (I should also 
> note that in my setup there is a secondary router that limits RTT to max 
> 300ms, otherwise the broken lines might look even worse...)
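
For readers wondering why ignoring the ATM link layer amounts to underestimating the packet size: ATM carries data in fixed 53-byte cells with 48 bytes of payload each, and the last cell of a packet is padded out. A minimal Python sketch of that accounting follows. The 40-byte per-packet overhead is a hypothetical placeholder (the real value depends on the encapsulation used on a given line and is not stated in this thread), and the function name is made up for illustration.

```python
import math

ATM_CELL_BYTES = 53      # size of one ATM cell on the wire
ATM_PAYLOAD_BYTES = 48   # payload bytes carried per cell


def atm_wire_bytes(ip_packet_bytes, per_packet_overhead_bytes):
    """Bytes actually sent on an ATM link for one IP packet.

    per_packet_overhead_bytes is assumed to cover everything added below
    the IP layer (encapsulation headers plus the AAL5 trailer).
    """
    payload = ip_packet_bytes + per_packet_overhead_bytes
    # The final cell is padded, so always round up to whole cells.
    cells = math.ceil(payload / ATM_PAYLOAD_BYTES)
    return cells * ATM_CELL_BYTES


if __name__ == "__main__":
    ASSUMED_OVERHEAD = 40  # hypothetical value, for illustration only
    for size in (64, 576, 1500):
        wire = atm_wire_bytes(size, ASSUMED_OVERHEAD)
        print(f"{size:5d} byte packet -> {wire:5d} bytes on the wire "
              f"(x{wire / size:.2f})")
```

A shaper that counts only the IP packet size lets more data through than the link can actually carry, and the relative error is largest for small packets.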

what is the latency like without BQL and codel? the pre-bufferbloat version? 
(without any traffic shaping)

I agree that going from 65ms to 95ms seems significant, but if the stock version 
goes up above 1000ms, then I think we are talking about things that are 
'close'

assuming that latency under load without the improvements got >1000ms

slow-fast (in ms)
ideal=10
untuned=43
bloated > 1000

slow/fast
ideal = 1.25
untuned = 1.83
bloated > 19

fast/slow
ideal = 0.8
untuned = 0.55
bloated = 0.05

rather than looking at how much worse it is than the ideal, look at how much 
closer it is to the ideal than to the bloated version.
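
The comparison above can be reproduced with a few lines of Python. This is only a sketch: it takes the ~52ms unloaded RTT and the ~65ms / ~95ms loaded RTTs quoted earlier in the thread, and uses 1000ms for the bloated case, which is an assumed lower bound rather than a measurement. The function and key names are made up for illustration.

```python
def latency_summary(idle_ms, loaded_ms):
    """Compare RTT under load against the unloaded RTT."""
    return {
        "added_ms": loaded_ms - idle_ms,                    # extra delay under load
        "loaded_over_idle": round(loaded_ms / idle_ms, 2),  # slowdown factor
        "idle_over_loaded": round(idle_ms / loaded_ms, 2),  # fraction of idle speed kept
    }


IDLE_RTT_MS = 52  # unloaded RTT quoted in the thread

# Loaded RTTs: two measured values from the thread, one assumed bound.
CASES = {
    "ideal": 65,
    "untuned": 95,
    "bloated": 1000,  # assumption: "goes up above 1000ms", not measured
}

if __name__ == "__main__":
    for name, loaded_ms in CASES.items():
        print(name, latency_summary(IDLE_RTT_MS, loaded_ms))
```

Computed from the rounded RTTs, the added delay for the ideal case comes out as 13ms rather than the ~10ms quoted, since the figures in the thread are approximate.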

David Lang