[Cake] Long-RTT broken again
moeller0 at gmx.de
Tue Nov 3 12:11:30 EST 2015
On Nov 3, 2015, at 18:05 , Toke Høiland-Jørgensen <toke at toke.dk> wrote:
> Jonathan Morton <chromatix99 at gmail.com> writes:
>> Cake does the queue accounting in bytes, and calculates 15MB (by
>> default) as the upper limit. It’s *not* meant to be a packet buffer.
> Ah, good.
>> note the different behaviour of the upload and download streams in the
>> results given.
> This is not a result of ingress/egress shaping, though; the upstream and
> downstream shaping is done on each side of the bottleneck on separate
>> The only way this could behave like a “packet buffer” instead of a
>> byte-accounted queue is if there is a fixed size allocation per
>> packet, regardless of the size of said packet. There are hints that
>> this might actually be the case, and that the allocation is a hugely
>> wasteful (for an ack) 2KB. (This also means that it’s not a 10240
>> packet buffer, but about 7500.)
> Right, well, in that case fixing the calculation to use the actual
> packet size would make sense in any case?
Would it? I thought actually using the amount of “pinned” kernel memory would be more relevant: if an ACK packet pins 2KB, then it should be accounted as 2KB, IF the goal of the accounting is to avoid unscheduled OOM, no? And if something like Dave’s patch kicks in, which copies larger, mostly-empty skbs into smaller allocations, those packets should then be accounted at the smaller size. In any case, I believe that with default kernels there is a strong correlation between a packet-count limit and a byte-count limit…
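To make the distinction concrete, here is a small sketch (hypothetical, not cake's actual code) of the difference between accounting a queue by on-the-wire packet length and by the per-skb allocation that actually pins kernel memory. The 2KB-per-skb figure is the one hypothesized earlier in this thread:

```python
SKB_ALLOC = 2048          # assumed fixed kernel allocation per packet (2 KB)
BYTE_LIMIT = 15 * 2**20   # cake's default 15 MB byte limit

def packets_admitted(pkt_len, account_truesize):
    """How many packets of pkt_len fit under the byte limit."""
    cost = max(pkt_len, SKB_ALLOC) if account_truesize else pkt_len
    return BYTE_LIMIT // cost

# A 64-byte ACK accounted by wire length admits ~245k packets,
# pinning ~480 MB of real memory at 2 KB each; accounting by the
# allocation size caps it at 7680 packets (~15 MB actually pinned).
print(packets_admitted(64, account_truesize=False))  # 245760
print(packets_admitted(64, account_truesize=True))   # 7680
```

This is the sense in which a byte limit computed from wire lengths can still behave like an oversized packet buffer when the queue fills with small packets.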
>> But in a bidirectional TCP scenario with ECN, only about a third of
>> the packets should be acks (ignoring the relatively low number of ICMP
>> and UDP probes); ECN causes an ack to be sent immediately, but then
>> normal delayed-ack processing should resume. This makes 6KB allocated
>> per ~3KB transmitted. The effective buffer size is thus 7.5MB, which
>> is still compliant with the traditional rule of thumb (BDP /
>> sqrt(flows)), given that there are four bulk flows each way.
>> This effect is therefore not enough to explain the huge deficit Toke
>> measured. The calculus also changes by only a small factor if we
>> ignore delayed acks, making 8KB allocated per 3KB transmitted.
> Well, there are also several UDP measurement flows and a ping, which all
> send really small packets; so it's not just the acks.
>> So, again - what’s going on? Are there any clues in packet traces
>> with sequence analysis?
> Not really; the qdisc stats show ~6000 packets dropped over the 300
> second test; and lots and lots of overlimits. So my guess is that the
> queue is in fact overflowing.
> Will post a trace when I get a chance tomorrow.
>> I’ll put in a configurable memory limit anyway, but I really do want
>> to understand why this is happening.
> As I said before: a configurable limit is not a fix for this; we need
> the default behaviour to be sane.
As much as I push for a configurable limit, I fully agree that cake’s auto-tuning smarts should actually be smart and do the right thing.
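For illustration only, one plausible shape for such auto-tuning (NOT cake's actual algorithm; the interval, multiplier, floor and ceiling here are all invented for the sketch) would be to derive the byte limit from the shaped rate's bandwidth-delay product rather than using a fixed 15MB:

```python
def auto_limit(rate_bps, interval_s=0.1, floor_bytes=64 * 1024,
               ceil_bytes=15 * 2**20):
    # Bytes in flight over one assumed interval at the shaped rate.
    bdp = (rate_bps // 8) * interval_s
    # A few BDPs of queue, clamped to a sane floor and ceiling.
    return int(min(max(4 * bdp, floor_bytes), ceil_bytes))

print(auto_limit(10_000_000))     # 10 Mbit/s -> 500000 bytes (~0.5 MB)
print(auto_limit(1_000_000_000))  # 1 Gbit/s  -> capped at 15 MB
```

The point of the sketch is only that a rate-aware default would keep slow links from ever pinning anywhere near 15MB of skbs, regardless of how the per-packet accounting question is resolved.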