[Cerowrt-devel] Ideas on how to simplify and popularize bufferbloat control for consideration.

Sebastian Moeller moeller0 at gmx.de
Sat Jul 26 19:08:08 EDT 2014


Hi David,


On Jul 27, 2014, at 00:23 , David Lang <david at lang.hm> wrote:
[...]
>>> I'm not worried about an implementation existing as much as the question of if it's on the routers/switches by default, and if it isn't, is the service simple enough to be able to avoid causing load on these devices and to avoid having any security vulnerabilities (or DDos potential)
>> 
>> 	But with Gargoyle the idea is to monitor a sparse ping stream to the closest host that responds and to interpret a sudden increase in RTT as a sign that the upstream's buffers are filling up, and to use this as a signal to throttle on the home router. My limited experience shows that quite often close hosts will respond to pings...
> 
> that measures latency, but how does it tell you bandwidth unless you are the only possible thing on the network and you measure what you are receiving?

	So the idea would be to start the ping probe with no traffic and then increase the traffic until the ping RTT increases; the usable bandwidth is roughly the rate at which the RTTs start to rise.
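
	In rough C this would look something like the following (just a sketch of the idea; set_shaper_rate_kbps() and measure_rtt_ms() are hypothetical helpers that would wrap tc and the sparse ping, and the 10 ms threshold is only a placeholder):

/* sketch: ramp the shaped rate up until the sparse-ping RTT inflates */

/* hypothetical helpers, standing in for tc and ping invocations */
extern void   set_shaper_rate_kbps(unsigned int kbps);
extern double measure_rtt_ms(void);

static unsigned int probe_usable_rate_kbps(unsigned int start_kbps,
                                           unsigned int step_kbps,
                                           unsigned int max_kbps)
{
        double baseline = measure_rtt_ms();      /* RTT of the idle link */
        unsigned int rate = start_kbps;

        while (rate + step_kbps <= max_kbps) {
                set_shaper_rate_kbps(rate + step_kbps);
                if (measure_rtt_ms() > baseline + 10.0)  /* buffers filling */
                        break;
                rate += step_kbps;
        }
        set_shaper_rate_kbps(rate);              /* back to the last good rate */
        return rate;
}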

[...]
>> 
>>> even fq_codel handles TCP differently
>> 
>> 	Does it? I thought UDP typically reacts differently to fq_codel's dropping strategy, but fq_codel does not differentiate between protocols (last time I looked at the code I came to that conclusion, but I am not very fluent in C, so I might simply be wrong here)
> 
> with TCP, the system can tell the difference between different connections to the same system, with UDP it needs to infer this from port numbers, this isn't as accurate and so the systems (fq_codel and routers) handle them in a slightly different way. This does affect the numbers.

	But that only affects the hashing into fq_codel bins? From http://lxr.free-electrons.com/source/net/sched/sch_fq_codel.c 
static unsigned int fq_codel_hash(const struct fq_codel_sched_data *q,
                                  const struct sk_buff *skb)
{
        struct flow_keys keys;
        unsigned int hash;

        skb_flow_dissect(skb, &keys);
        hash = jhash_3words((__force u32)keys.dst,
                            (__force u32)keys.src ^ keys.ip_proto,
                            (__force u32)keys.ports, q->perturbation);
        return ((u64)hash * q->flows_cnt) >> 32;
}

The way I read this is that it just uses the source and destination IPs and the ports, and all the protocol does is make sure that different-protocol connections between the same src/dst/port tuple end up in different bins, no? My C is bad, so I would not be amazed if my interpretation were wrong, but please show me where.
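
	And just to check that I read the last line correctly (a user-space illustration only, not a claim about the kernel beyond the quoted code): it maps the 32-bit hash onto [0, flows_cnt) by scaling rather than by a modulo, so with the default of 1024 flows the whole hash range gets spread evenly over the bins:

#include <stdint.h>
#include <stdio.h>

/* same reduction as the quoted return statement: scale the 32-bit hash
 * onto [0, flows_cnt) instead of taking it modulo flows_cnt */
static unsigned int hash_to_bin(uint32_t hash, uint32_t flows_cnt)
{
        return ((uint64_t)hash * flows_cnt) >> 32;
}

int main(void)
{
        printf("%u\n", hash_to_bin(0x00000000u, 1024));  /* -> 0    */
        printf("%u\n", hash_to_bin(0x80000000u, 1024));  /* -> 512  */
        printf("%u\n", hash_to_bin(0xffffffffu, 1024));  /* -> 1023 */
        return 0;
}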



> 
>>> so if we measure with UDP, does it really reflect the 'real world' of TCP?
>> 
>> 	But we care for UDP as well, no?
> 
> Yes, but the reality is that the vast majority of traffic is TCP, and that's what the devices are optimized to handle, so if we measure with UDP we may not get the same results as if we measure with TCP.
> 
> measuring with ICMP is different yet again.

	Yes, I have heard stories like that when I set out on my little project to detect ATM quantization from ping RTTs, but to my joy it looks like ICMP still gives reasonable measurements! Based on that data I would assume UDP to be even less exotic, and hence handled even less specially, and hence more like TCP?

> 
> Think of the router ASICs that handle the 'normal' traffic in the ASIC in the card, but 'unusual' traffic needs to be sent to the core CPU to be processed and is therefore MUCH slower

	Except that in my ICMP RTT measurements I still saw quantization steps in accordance with the expected best-case RTT per packet, showing that the slow processing is at least constant and hence easy to get rid of in the measurements...
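
	For illustration, the best-case staircase I was looking for can be computed like this (a sketch only; the 40 bytes of per-packet encapsulation overhead is just an assumption, the real value depends on LLC/SNAP vs. VC-mux, PPPoE and friends):

#include <stdio.h>

/* best-case serialization delay of one packet on an ATM-carried link:
 * payload plus overhead is segmented into 48-byte chunks, each sent as
 * a 53-byte cell, so the delay grows in 48-byte steps -- the staircase */
static double atm_tx_ms(unsigned int ip_bytes, double uplink_kbps)
{
        const unsigned int overhead = 40;        /* assumed encapsulation */
        unsigned int cells = (ip_bytes + overhead + 47) / 48;

        return cells * 53 * 8 / uplink_kbps;     /* bits / (kbit/s) == ms */
}

int main(void)
{
        /* on a 512 kbit/s uplink each extra cell adds ~0.83 ms, and a new
         * cell is needed for every additional 48 bytes of payload */
        for (unsigned int size = 64; size <= 256; size += 16)
                printf("%3u bytes -> %.2f ms\n", size, atm_tx_ms(size, 512.0));
        return 0;
}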


> 
>>>>> One thought I have is to require a high TTL on the packets for the services to respond to them. That way any abuse of the service would have to take place from very close on the network.
>>>>> 
>>>>> Ideally these services would only respond to senders that are directly connected, but until these services are deployed and enabled by default, there is going to be a need to be the ability to 'jump over' old equipment. This need will probably never go away completely.
>>>> 
>>>> 	But if we need to modify DSLAMs and CMTSs it would be much nicer if we could just ask nicely what the current negotiated bandwidths are ;)
>>> 
>>> negotiated bandwidth and effective bandwidth are not the same
>>> 
>>> what if you can't talk to the devices directly connected to the DSL line, but only to a router one hop on either side?
>> 
>> 	In my limited experience the typical bottleneck is the DSL line, so if we shape for that we are fine… Assume for a moment that the DSLAM uplink is so congested, because of oversubscription of the DSLAM, that it now constitutes the bottleneck. Then the available bandwidth for each user depends on the combined traffic of all users, not a situation we can reasonably shape for anyway (I would hope that ISPs monitor this situation and would remedy it by adding uplink capacity, so this hopefully is just a transient event).
> 
> for DSL you are correct, it's a point-to-point connection (star network topology), but we have other technologies used in homes that are shared-media bus topology networks. This includes cablemodems and wireless links.

	Well, yes, I understand, but again you would assume that the cable ISP tries to provision the system so that most users are happy, so congestion is not the rule? Even then, I think cable guarantees some minimum rates per user, no? With wireless it is worse, in that RF events outside of the ISP's and end user's control can ruin the day.

> 
>>> 
>>> for example, I can't buy (at least not for anything close to a reasonable price) a router to run at home that has a DSL port on it, so I will always have some device between me and the DSL.
>> 
>> 	http://wiki.openwrt.org/toh/tp-link/td-w8970 or
> 
> no 5GHz wireless?

	Could be, but it is definitely reasonably priced, probably cheap enough to use as a smart de-bloated DSL modem, so your main router would not need HTB traffic shaping on the uplink anymore. I might actually go that route since I really dislike my ISP's primary router, but I digress...

> 
>> http://www.traverse.com.au/products ?
> 
> I couldn't figure out where to buy one through their site.

	Maybe they only sell in AU; I guess I just wanted to be helpful.

> 
>> If you had the DSL modem in the router under cerowrts control you would not need to use a traffic shaper for your uplink, as you could apply the BQL ideas to the ADSL driver.
>> 
>>> 
>>> If you have a shared media (cable, wireless, etc), the negotiated speed is meaningless.
>> 
>> 	Not exactly meaningless, it gives you an upper bound...
> 
> true, but is an upper bound good enough? How close does the estimate need to be?

	If we end up recommending that people use, say, a binary search to find the best tradeoff (maximizing throughput while keeping the maximum latency increase under load bounded to, say, 10 ms), we should have an idea where to start, so a bit too large is fine as a starting point. Traditionally the recommendation was around 85% of the link rate, but that never came with a decent justification or data.
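
	A minimal sketch of such a binary search (latency_increase_ms_at() is a hypothetical helper that shapes to the given rate, loads the link, and returns the measured RTT increase over the idle baseline; start hi_kbps at the slightly-too-large sync rate and lo_kbps at something known to be safe; the 64 kbit/s stop resolution is arbitrary):

/* hypothetical helper: shape to the given rate, load the link, and
 * return the measured RTT increase over the idle baseline in ms */
extern double latency_increase_ms_at(unsigned int kbps);

static unsigned int find_shaper_rate_kbps(unsigned int lo_kbps,
                                          unsigned int hi_kbps)
{
        const double bound_ms = 10.0;            /* latency budget from above */

        while (hi_kbps - lo_kbps > 64) {         /* stop at 64 kbit/s resolution */
                unsigned int mid = lo_kbps + (hi_kbps - lo_kbps) / 2;

                if (latency_increase_ms_at(mid) <= bound_ms)
                        lo_kbps = mid;           /* still within the budget */
                else
                        hi_kbps = mid;           /* too fast, buffers fill up */
        }
        return lo_kbps;                          /* highest rate within budget */
}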

> 
> and does it matter if both sides are doing fq_codel or is this still in the mode of trying to control the far side indirectly?

	Yes, this is only relevant as long as both sides of the bottleneck link are not de-bloated. But it does not look like DSLAMs/CMTSs will change their old ways any time soon...

> 
>>> In my other location, I have a wireless link that is ethernet to the dish on the roof, I expect the other end is a similar setup, so I can never see the link speed directly (not to mention the fact that rain can degrade the effective link speed)
>> 
>> 	One more case for measuring the link speed continuously!
> 
> at what point does the measuring process interfere with the use of the link? or cause other upstream issues.

	If my measuring-by-sparse-stream idea works out, the answer to both questions is: not much ;)

> 
>>>>> Other requirements or restrictions?
>>>> 
>>>> 	I think the measurement should be fast and continuous…
>>> 
>>> Fast yes, because we want to impact the network as little as possible
>>> 
>>> continuous?? I'm not so sure. Do conditions really change that much?
>> 
>> 	You just gave an example above for changing link conditions, by shared media...
> 
> but can you really measure fast enough to handle shared media? at some point you need to give up measuring because by the time you have your measurement it's obsolete.

	So this is not going to work well on a wifi WLAN with wildly fluctuating rates (see Dave's upcoming make-wifi-fast project), but for a typical cable node, where congestion changes over the day as a function of people being at home, it might be fast enough.

> 
> If you look at networking with a tight enough timeframe, it's either idle or 100% utilized depending on if a bit is being sent at that instant, however a plot at that precision is worthless :-)

	Yes, I think a moving average over some time window would be required.
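
	Something as simple as an EWMA, fed once per measurement interval with the observed rate, would probably do (the 1/8 weight here is just an assumption, not a recommendation):

/* smooth the instantaneous rate samples with an exponentially
 * weighted moving average */
static double ewma_update(double avg, double sample)
{
        return avg + (sample - avg) / 8.0;
}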

> 
>>> And as I ask in the other thread, how much does it hurt if your estimates are wrong?
>> 
>> 	I think I sent a plot to that regard.
> 
> yep, our mails are crossing
> 
>>> 
>>> for wireless links the conditions are much more variable, but we don't really know what is going to work well there.
>> 
>> 	Wireless as in point 2 point links or in wifi?
> 
> both, point-to-point is variable based on weather, trees blowing in the wind, interference, etc. Wifi has a lot more congestion, so interference dominates everything else.

	So maybe that is a different kettle of fish then.

Best Regards
	Sebastian

> 
> David Lang



