[Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.

Sun Jul 27 07:17:16 EDT 2014

Hi David,

On Jul 27, 2014, at 02:49 , David Lang <david at lang.hm> wrote:

> On Sun, 27 Jul 2014, Sebastian Moeller wrote:
> 
>> On Jul 27, 2014, at 00:53 , David Lang <david at lang.hm> wrote:
>> 
>>> On Sun, 27 Jul 2014, Sebastian Moeller wrote:
>>> 
>>>> Hi David,
>>>> 
>>>> On Jul 26, 2014, at 23:45 , David Lang <david at lang.hm> wrote:
>>>> 
>>>>> On Sat, 26 Jul 2014, Sebastian Moeller wrote:
>>>>> 
>>>>>> On Jul 26, 2014, at 22:39 , David Lang <david at lang.hm> wrote:
>>>>>> 
>>>>>>> by how much tuning is required, I wasn't meaning how frequently to tune, but how close default settings can come to the performance of an expertly tuned setup.
>>>>>> 
>>>>>> 	Good question.
>>>>>> 
>>>>>>> 
>>>>>>> Ideally the tuning takes into account the characteristics of the hardware of the link layer. If it's IP encapsulated in something else (ATM, PPPoE, VPN, VLAN tagging, ethernet with jumbo packet support for example), then you have overhead from the encapsulation that you would ideally take into account when tuning things.
>>>>>>> 
>>>>>>> the question I'm talking about below is how much do you lose compared to the ideal if you ignore this sort of thing and just assume that the wire is dumb and puts the bits on them as you send them? By dumb I mean don't even allow for inter-packet gaps, don't measure the bandwidth, don't try to pace inbound connections by the timing of your acks, etc. Just run BQL and fq_codel and start the BQL sizes based on the wire speed of your link (Gig-E on the 3800) and shrink them based on long-term passive observation of the sender.
>>>>>> 
>>>>>> 	As data talks I just did a quick experiment with my ADSL2+ line at home. The solid lines in the attached plot show the results for proper shaping with SQM (shaping to 95% of the DSL link rates of downstream and upstream while taking the link layer properties, that is ATM encapsulation and per packet overhead, into account); the broken lines show the same system with just the link layer adjustments and per packet overhead adjustments disabled, but still shaping to 95% of link rate (this is roughly equivalent to a 15% underestimation of the packet size). The actual test is netperf-wrapper's RRUL (4 TCP streams up, 4 TCP streams down while measuring latency with ping and UDP probes). As you can see from the plot, just getting the link layer encapsulation wrong destroys latency under load badly. The host is ~52ms RTT away, and with fq_codel the ping time per leg is just increased by one codel target of 5ms each, resulting in a modest latency increase of ~10ms with proper shaping for a total of ~65ms; with improper shaping RTTs increase to ~95ms (they almost double), so RTT increases by ~43ms. Also note how the extremes for the broken lines are much worse than for the solid lines. In short I would estimate that a slight misjudgment (15%) results in an almost 80% increase of latency under load. In other words getting the rates right matters a lot. (I should also note that in my setup there is a secondary router that limits RTT to max 300ms, otherwise the broken lines might look even worse...)
>>>>> 
>>>>> what is the latency like without BQL and codel? the pre-bufferbloat version? (without any traffic shaping)
>>>> 
>>>> 	So I just disabled SQM and the plot looks almost exactly like the broken line plot I sent before (~95ms RTT up from 55ms unloaded, with single pings delayed for > 1000ms, just as with the broken line, with proper shaping even extreme pings stay < 100ms). But as I said before I need to run through my ISP supplied primary router (not just a dumb modem) that also tries to bound the latencies under load to some degree. Actually I just repeated the test connected directly to the primary router and get the same ~95ms average ping time with frequent extremes > 1000ms, so it looks like just getting the shaping wrong by 15% eradicates the buffer de-bloating efforts completely...
>>> 
>>> just so I understand this completely
>>> 
>>> you have
>>> 
>>> debloated box <-> ISP router <-> ADSL <-> Internet <-> debloated server?
>> 
>> 	Well more like:
>> 
>> 	Macbook with dubious bloat-state -> wifi to de-bloated cerowrt box that shapes the traffic -> ISP router -> ADSL -> internet -> server
>> 
>> I assume that Dave debloated these servers well, but it should not really matter as the problem is the buffers on both ends of the bottleneck ADSL link.
> 
> right, I was forgetting that unless you are the bottleneck, you aren't buffering anything and so debloating makes no difference. In a case like yours where you can't debloat the actual bottleneck, the best that you can do is to artificially become the bottleneck by shaping the traffic. but on the download side it's much harder.

Actually, all RRUL plots that Dave collected show that ingress shaping does work quite well on average. It will fail under a severe DoS, but let’s face it, these can only be mitigated by the ISP anyway…

> 
> What are we aiming for? something that will show the problem clearly so that fixes can be put in the right place? or a work-around to use in the meantime?

	Mmmh, I aim for decent internet connections for home users like myself. It would be great if ISPs could use their leverage on equipment manufacturers to implement the current state of the art solution in broadband gear; realistically, even if this started today we would still face a long transition time, so I am all for putting the smarts into home routers. At least the end user has enough incentive to put in the (small amount of) work required to mitigate bad buffer management...

> 
> I think both need to be pursued, but we need to be clear on what is being done for each one.

	I have no connection into telcos, ISPs, or OEMs, so all I can help with is getting the “work-around” in good shape and ready for deployment. Arguably convincing ISPs might be more important.

> 
> If having BQL+fq_codel with defaults would solve the problem if it was on the right routers, we need to show that.

	I think Dave has pretty much shown this. Note though that it is rather traffic shaping and fq_codel; BQL would be needed in the DSL drivers on both sides of the link.

> 
> Then, because we can't get the fixes on the right routers and need to work-around the problem by artificially becoming the bottleneck, we need to show that the 95% that we shape to is throwing away 5% of your capacity and make that clear to the users.

	I think if you google for “router qos” you will find plenty of pages already describing the rationale and the bandwidth sacrifice required, so that knowledge might already be public.

> 
> otherwise we will risk getting to the point where it will never get fixed because the ISPs will look at their routers and say that bufferbloat can't possibly be a problem as they never have large queues (because we are doing the workarounds).

	Honestly, for an ISP the best solution is us shaping our connections, as that reduces the worst case bandwidth use per user and might allow higher oversubscription. We need to find economic incentives for ISPs to implement BQL equivalents in the broadband gear. In theory it should give a competitive advantage to be able to advertise better gaming/VoIP suitability, but many users really have no real choice of ISP. I could imagine that with the big push away from circuit-switched telephony to VoIP, even for carriers, ISPs might get more interested in improving VoIP resilience and usability under load...

> 
> 
>>> and are you measuring the latency impact when uploading or downloading?
>> 
>> 	No I measure the impact of latency of saturating both up- and downlink, pretty much the worst case scenario.
> 
> I think we need to test this in each direction independently.

	Rich Brown has made a nice script to test that, betterspeedtest.sh at https://github.com/richb-hanover/CeroWrtScripts
For figuring out the required shaping point it is easier to work on both “legs” independently, but to assess worst case behavior I think both directions need to be saturated.
There is a pretty good description of a quick bufferbloat test on http://www.bufferbloat.net/projects/cerowrt/wiki/Quick_Test_for_Bufferbloat

> 
> Cerowrt can do a pretty good job of keeping the uplink from being saturated, but it can't do a lot for the downlink.

	Well, except it does. Downlink shaping is less reliable than uplink shaping, but most traffic sources, TCP or UDP, need to deal with the variable bandwidth of the internet anyway and implement some congestion control that treats packet loss as a congestion signal. So downlink shaping mostly works okay (even though I think Dave recommends shaping the downlink more aggressively than 95% of link rate).

> 
>>> 
>>> I think a lot of people would be happy with 95ms average pings on a loaded connection, even with occasional outliers.
>> 
>> 	No that is too low an aim, this still is not useable for real time applications, we should aim for base RTT plus 10ms. (For very slow links we need to cut some slack but for > 3Mbps 10ms should be achievable )
> 
> perfect is the enemy of good enough.

	Sure, but really, according to http://www.hh.se/download/18.70cf2e49129168da015800094780/7_7_delay.pdf we only have a 400ms budget for acceptable VoIP (I would love real psychophysics papers for that instead of Cisco marketing material), or 200ms one-way delay. With ~170ms RTT to the west coast (from the university wired network, so no ADSL delay involved) almost half of the budget is used up in a way that cannot be fixed easily. (It takes 66ms for light to travel the distance of half the earth’s circumference, or 132ms RTT; assuming c(fiber) = 0.7 * c(vacuum) it is rather 95ms one-way, or 190ms RTT.) With ~100ms RTT from each end there is barely enough time left for data processing and transcoding.
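
	As a rough sanity check of the propagation numbers above, here is a minimal Python sketch. The equatorial circumference and the 0.7 velocity factor are the round figures used above; the function name is made up for illustration.

```python
# Back-of-the-envelope propagation delay over half the earth's circumference.
C_VACUUM_KM_S = 299_792.458        # speed of light in vacuum
EARTH_CIRCUMFERENCE_KM = 40_075.0  # approximate equatorial circumference


def one_way_delay_ms(distance_km, velocity_factor=1.0):
    """Time for a signal to cover distance_km, in milliseconds."""
    return distance_km / (C_VACUUM_KM_S * velocity_factor) * 1000.0


half_way_km = EARTH_CIRCUMFERENCE_KM / 2
vacuum_ms = one_way_delay_ms(half_way_km)      # ideal case, c in vacuum
fiber_ms = one_way_delay_ms(half_way_km, 0.7)  # assumed c(fiber) = 0.7 c

print(f"vacuum: {vacuum_ms:.1f} ms one-way, {2 * vacuum_ms:.1f} ms RTT")
print(f"fiber:  {fiber_ms:.1f} ms one-way, {2 * fiber_ms:.1f} ms RTT")
```

Real paths are longer than the great-circle distance, so these are lower bounds.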

> 
> There's achievable if every router is tuned to exactly the right conditions and there's achievable for coarse settings that can be widely deployed. Get the second out while continuing to work on making the first easier.

	Okay, so that is easy: if you massively overshape, latency will be great, but bandwidth is compromised...

> 
> residential connections only come in a smallish number of sizes,

	Except that with, say, DSL there is often a wide corridor for the allowed sync speed, e.g. the 50Mbps down / 10Mbps up VDSL2 package of DT actually will synchronize in a corridor of 50 to 27Mbps and 10 to 5.5Mbps (numbers are approximately right). That is almost a factor of 2, too much for a one size fits all approach (say 90% of advertised speed).

> it shouldn't be too hard to do a few probes and guess which size is in use, then set the bandwidth to 90% of that standard size and you should be pretty good without further tuning.

	No, with ATM carriers (ADSL, some VDSL) the encapsulation overhead ranges from ~10% to >50% depending on packet size, so to get the bottleneck queue reliably under our control we would need to shape to ~50% of link speed, obviously a very hard sell. (And it is not easy to figure out whether the bottleneck link uses ATM or not, so there is no one size fits all.) We currently have no easy and quick way of detecting ATM link layers from cerowrt...
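
	To illustrate why the overhead depends so strongly on packet size, here is a small sketch of AAL5/ATM cell framing: each packet plus its per-packet overhead is padded up to a whole number of 48-byte cell payloads, and every cell costs 53 bytes on the wire. The 40 bytes of per-packet overhead (assumed to include the AAL5 trailer) are just an example value, as the real figure depends on the encapsulation in use, and the function names are hypothetical.

```python
ATM_CELL_PAYLOAD = 48  # payload bytes carried per ATM cell
ATM_CELL_SIZE = 53     # bytes on the wire per ATM cell (5 byte header)


def atm_wire_bytes(ip_len, per_packet_overhead=40):
    """Bytes on the wire for one IP packet of ip_len bytes.

    per_packet_overhead is an assumed example value, not a measured one.
    """
    # Ceiling division: a partially filled last cell is padded and sent whole.
    cells = -(-(ip_len + per_packet_overhead) // ATM_CELL_PAYLOAD)
    return cells * ATM_CELL_SIZE


def overhead_fraction(ip_len, per_packet_overhead=40):
    """Share of the wire bytes that is not IP payload."""
    wire = atm_wire_bytes(ip_len, per_packet_overhead)
    return (wire - ip_len) / wire


for size in (64, 100, 576, 1500):
    print(f"{size:5d} byte packet -> {atm_wire_bytes(size):5d} bytes on wire, "
          f"{overhead_fraction(size):.0%} overhead")
```

With these assumptions a full-size packet loses well under 20% to framing, while a small packet loses more than half.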

> 
>>> It's far better than sustained multi-second ping times which is what I've seen with stock setups.
>> 
>> 	True, but compared to multi seconds even <1000ms would be a really great improvement, but also not enough.
>> 
>>> 
>>> but if no estimate is this bad, how bad is it if you use as your estimate the 'rated' speed of your DSL (i.e. what the ISP claims they are providing you) instead of the fully accurate speed that includes accounting for ATM encapsulation?
>> 
>> 	Well ~95ms with outliers > 1000ms, just as bad as no estimate. I shaped 5% below rated speed as reported by the DSL modem, so disabling the ATM link layer adjustments (as shown in the broken lines in the plot) basically increased the effective shaped rate by ~13%, or to effectively 107% of line rate; your proposal would be line rate and no link layer adjustments, or effectively 110% of line rate. I do not feel like repeating this experiment right now, as I think the data so far shows that even with less misjudgment the bloat effect is fully visible. Not accounting for ATM framing carries a ~10% cost in link speed, as ATM packet size on the wire increases by >= ~10%.
> 
> so what if you shape to 90% of rated speed (no allowance for ATM vs other transports)?

	I have not done that, but the typical recommendation for ADSL links for shaping without taking the link layer peculiarities into account is 85% (which should work for large packets, but can easily melt down with lots of smallish packets, like VoIP calls). I repeat, there is no simple one-size-fits-all shaping that will solve the bufferbloat issue for most home users in an acceptable fashion. (And I am not talking perfect here, it simply is not good enough.) Note that 90% will just account for the 48-in-53 ATM transport cost, it will not take the increased per packet header into account.
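
	A quick sketch of what a fixed shaping percentage means on an ATM link when the shaper is unaware of the link layer. It assumes, as a simplification, that the shaper counts only IP packet bytes; the 40 byte per-packet overhead is again an assumed example value and the function names are made up. A result above 1.0 means the link is offered more than it can carry, so the queue builds in the modem again.

```python
ATM_CELL_PAYLOAD = 48  # payload bytes carried per ATM cell
ATM_CELL_SIZE = 53     # bytes on the wire per ATM cell


def atm_wire_bytes(ip_len, per_packet_overhead=40):
    """Bytes on the wire for one IP packet (overhead value is an assumption)."""
    cells = -(-(ip_len + per_packet_overhead) // ATM_CELL_PAYLOAD)
    return cells * ATM_CELL_SIZE


def wire_load(shaper_fraction, ip_len, per_packet_overhead=40):
    """Offered load as a fraction of line rate for a link-layer-unaware shaper."""
    return shaper_fraction * atm_wire_bytes(ip_len, per_packet_overhead) / ip_len


# 90% covers the bare 48-in-53 cell tax, but nothing else:
cell_tax_only = 0.90 * ATM_CELL_SIZE / ATM_CELL_PAYLOAD

print(f"90%, cell tax only:      {cell_tax_only:.3f}")
print(f"90%, 1500 byte packets:  {wire_load(0.90, 1500):.3f}")
print(f"85%, 1500 byte packets:  {wire_load(0.85, 1500):.3f}")
print(f"85%,  100 byte packets:  {wire_load(0.85, 100):.3f}")
```

Under these assumptions 85% just about holds for full-size packets, while small packets overload the link at any of these settings.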

> 
>>> It's also worth figuring out if this problem would remain in place if you didn't have to go through the ISP router and were running fq_codel on that router.
>> 
>> 	If the DSL modem were debloated, at least on upstream no shaping would be required any more; but that does not fix the need for downstream shaping (and bandwidth estimation) until the head end gear is debloated.
> 
> right, I was forgetting this earlier.
> 
>>> As long as fixing bufferbloat involves esoteric measurements and tuning, it's not going to be solved, but if it could be solved by people flashing openwrt onto their DSL router and then using the defaults, it could gain traction fairly quickly.
>> 
>> 	But as there are only very few DSL modems with open sources (especially of the DSL chips) this is just as esoteric ;) Really, if equipment manufacturers could be convinced to take these issues seriously and actually fix their gear, that would be best. But this does not look like it is happening on the fast track. (Even DOCSIS developer CableLabs punted on requiring codel or fq_codel in DOCSIS modems since they think that the required timestamps are too “expensive” on the device class they want to use for modems. They opted for PIE, much better than what we have right now but far away from my latency under load increase of 10ms...)
>> 
>>> 
>>>>> I agree that going from 65ms to 95ms seems significant, but if the stock version goes up above 1000ms, then I think we are talking about things that are 'close'
>>>> 
>>>> 	Well if we include outliers (and we should as enough outliers will degrade the FPS and voip suitability of an otherwise responsive system quickly) stock and improper shaping are in the >1000ms worst case range, while proper SQM bounds this to 100ms.
>>>> 
>>>>> 
>>>>> assuming that latency under load without the improvements got >1000ms
>>>>> 
>>>>> fast-slow (in ms)
>>>>> ideal = 10
>>>>> untuned = 43
>>>>> bloated > 1000
>>>> 
>>>> 	The sign seems off as fast < slow? I like this best ;)
>>> 
>>> yep, I reversed fast/slow in all of these
>>> 
>>>>> 
>>>>> fast/slow
>>>>> ideal = 1.25
>>>>> untuned = 1.83
>>>>> bloated > 19
>>>> 
>>>> 	But Fast < Slow and hence this ratio should be <0?
>>> 
>>> 1 not 0, but yes, this is really slow/fast
>>> 
>>>>> slow/fast
>>>>> ideal = 0.8
>>>>> untuned = 0.55
>>>>> bloated = 0.05
>>>>> 
>>>> 
>>>> 	and this >0?
>>> 
>>> and this is really fast/slow
>> 
>> 
>> 	What about taking the latency difference and relating it to a reference time, like say the time a photon would take to travel once around the equator, or the earth’s diameter?
> 
> how about latency difference scaled by the time to send one 1500 byte packet at the measured throughput?

	So you propose latency difference / time to send one full packet at the measured speed

Not sure: think of two de-bloated setups, one fast, one slow: for the slow link we get 10ms/long, for the fast link we get 10ms/short, so assuming that both keep the 10ms average latency increase, why should the two links show a different bloat measure?
I really think the raw latency difference is what we should convince the users to look at. All one-number measures are going to be too simplistic, but at least for the difference you can easily estimate the effect on RTTs for relevant traffic...
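
	To make the objection concrete, here is a small sketch of the proposed measure (latency difference divided by the time to send one 1500 byte packet at the measured throughput). The two link rates are arbitrary example values and the function names are made up.

```python
def serialization_time_ms(rate_bps, packet_bytes=1500):
    """Time to put one packet of packet_bytes on a link of rate_bps, in ms."""
    return packet_bytes * 8 / rate_bps * 1000.0


def scaled_latency_increase(latency_increase_ms, rate_bps, packet_bytes=1500):
    """Latency increase expressed in units of full-size packet times."""
    return latency_increase_ms / serialization_time_ms(rate_bps, packet_bytes)


# Two equally well de-bloated links, both with a 10 ms increase under load:
slow = scaled_latency_increase(10.0, 1_000_000)   # example: 1 Mbps link
fast = scaled_latency_increase(10.0, 50_000_000)  # example: 50 Mbps link

print(f"slow link: {slow:.2f} packet times")
print(f"fast link: {fast:.2f} packet times")
```

The same 10ms increase yields scores that differ by the ratio of the link rates, although the effect on the user's RTT is identical.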

> 
> This would factor out the data rate and would not be affected by long distance links.

	I am not convinced that people on a slow link can afford latency increases any better than people on a fast link. I actually think that it is the other way round. During the tuning process your measure might be helpful to find a good tradeoff between bandwidth and latency increase though.

Best Regards
	Sebastian

> 
> David Lang