[Codel] fq_codel : interval servo

Dave Taht dave.taht at gmail.com
Fri Aug 31 12:59:17 EDT 2012


On Fri, Aug 31, 2012 at 9:23 AM, Eric Dumazet <eric.dumazet at gmail.com> wrote:
> On Fri, 2012-08-31 at 08:53 -0700, Rick Jones wrote:
>> On 08/30/2012 11:55 PM, Eric Dumazet wrote:
>> > On locally generated TCP traffic (host), we can override the 100 ms
>> > interval value using the more accurate RTT estimation maintained by TCP
>> > stack (tp->srtt)
>> >
>> > Datacenter workload benefits using shorter feedback (say if RTT is below
>> > 1 ms, we can react 100 times faster to a congestion)
>> >
>> > Idea from Yuchung Cheng.
>>
>> Mileage varies of course, but what are the odds of a datacenter's
>> end-system's NIC(s) being the bottleneck point?  Is it worth pinging a
>> couple additional cache lines around (looking at the v2 email, I'm
>> ass-u-me-ing that sk_protocol and tp->srtt are on different cache lines)?
>>
>
> Its certainly worth pinging additional cache lines.
>
> A host consume almost no cpu in qdisc layer (at least not in fq_codel)
>
> A router wont use this stuff (as skb->sk will be NULL)
>
>> If fq_codel is going to be a little bit pregnant and "layer violate" :)
>> why stop at TCP?
>
> Who said I would stop at TCP ? ;)

Heh. "Vith Codel I vill rule ze vorld!"

I have not, of late, been focused on TCP (I'm torn up about uTP,
actually), but more on the kinds of problems that occur in the
home gateway.

I realize that 10GigE and datacenter host based work is sexy and fun,
but getting stuff that runs well in today's 1-20Mbit environments is
my own priority, going up to 100Mbit, with something that can be
embedded in a SoC. The latest generation of SoCs all do QoS in
hardware... badly.

Secondly, finding things that would work on the head ends (CMTSes and
DSLAMs) also ranks way up there.

Fixing wifi follows that in priority...

While many of the things under tweak have commonality, I'm thinking
more and more it would be saner for me to fork off for a while and do
a "mfq_codel", which would let me A) test codel, fq_codel, and
mfq_codel enhancements on the same build and testbed, B) play with
stuff that works better at lower bandwidths, and C) play with wifi.

Two notes from my slow and often buggy work that I haven't mentioned
on this list yet:

1) The current fq_codel implementation wipes out the codel state
every time a queue is emptied. This happens rarely at 10GigE speeds,
quite often below 100Mbit. Keeping the state around helps a LOT on
queue depth - for example (with the rest of my buggy patchset) I end
up with a very nice avg 50 packet backlog at 100Mbit, low median,
stddev, etc, with 8 streams, and 70 or so with 150 streams...

"fixing" that just involved removing codel_init_vars from the "is this
a new queue" routine.
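
To make note 1 concrete, here is a rough userspace sketch, not the
kernel code. Every toy_* name and the reset_on_reactivate flag are
made up for illustration; the only thing it tries to show is the
difference between re-initializing the per-flow codel state when an
emptied queue becomes active again and leaving that state alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified per-flow codel state.
 * This is NOT the kernel's struct codel_vars. */
struct toy_codel_vars {
	uint32_t count;     /* drops since entering the dropping state */
	uint32_t lastcount; /* count at entry to the dropping state */
	bool dropping;      /* currently in the dropping state? */
};

struct toy_flow {
	struct toy_codel_vars cvars;
	unsigned int backlog_pkts; /* packets currently queued */
	bool active;               /* is the flow on the scheduler's list? */
};

/* Stand-in for the state wipe described in note 1. */
static void toy_codel_init_vars(struct toy_codel_vars *v)
{
	v->count = 0;
	v->lastcount = 0;
	v->dropping = false;
}

/* Enqueue one packet. If the flow had gone idle, this is the
 * "is this a new queue" path; reset_on_reactivate selects between
 * wiping the codel state (true) and keeping it (false). */
static void toy_flow_enqueue(struct toy_flow *f, bool reset_on_reactivate)
{
	if (!f->active) {
		f->active = true;
		if (reset_on_reactivate)
			toy_codel_init_vars(&f->cvars);
	}
	f->backlog_pkts++;
}

/* Dequeue one packet; the flow goes idle when its queue empties. */
static void toy_flow_dequeue(struct toy_flow *f)
{
	if (f->backlog_pkts > 0)
		f->backlog_pkts--;
	if (f->backlog_pkts == 0)
		f->active = false;
}
```

With reset_on_reactivate set to false, whatever drop count the flow
had built up survives a brief empty period, which is the behavior I
am describing above. Whether that is the right thing at all
bandwidths is still an open question.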

Some of the other patches I've thrown around are about seeing odd
behaviors on temporarily empty queues, like:

2) maxpacket can get set to the largest packet ever seen by a queue
(like a TSO packet)

....
                } else {
+                       stats->maxpacket = qdisc_pkt_len(skb);
                        vars->count = 1;
                        vars->rec_inv_sqrt = ~0U >> REC_INV_SQRT_SHIFT;
                }

So resetting it will A) set a fq_codel queue's maxpacket to the actual
packet size on that flow (acks, for example), which might make codel
more accurate on those sorts of streams... and B) pick up changes made
with ethtool: turning off gso/tso/etc was previously not reflected in
codel's estimates.

B) has occasionally caused a great deal of headscratching.
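
A small userspace sketch of why a stale maxpacket matters, assuming
(as I understand codel) that it declines to drop while the backlog is
no larger than maxpacket. The toy_* names and the byte counts are
invented for illustration; this is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-queue stats; only maxpacket
 * is modeled here. */
struct toy_stats {
	uint32_t maxpacket; /* largest packet length seen, in bytes */
};

/* High-water-mark behavior: maxpacket only ever grows. */
static void toy_note_packet(struct toy_stats *s, uint32_t len)
{
	if (len > s->maxpacket)
		s->maxpacket = len;
}

/* The reset from the patch fragment above: set maxpacket to the
 * length of the current packet. */
static void toy_reset_maxpacket(struct toy_stats *s, uint32_t len)
{
	s->maxpacket = len;
}

/* Assumed guard: with a backlog of at most one "max sized" packet,
 * codel does not consider dropping. */
static bool toy_backlog_too_small(const struct toy_stats *s,
				  uint32_t backlog_bytes)
{
	return backlog_bytes <= s->maxpacket;
}
```

After one 64KB-ish TSO packet, a queue that now carries only small
acks keeps treating a 30000 byte backlog as "too small to drop from"
until maxpacket is reset to the current packet size.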




>
>>
>> Is this change rectifying an "unfairness" with the existing fq_codel and
>> the 100ms for all when two TCP flows have very different srtts?
>>
>
> codel has to use a single interval value, and we use an average value.
> It seems to work quite well.
>
> fq_codel has the opportunity to get a per tcp flow interval value.
> And this should give better behavior.
>
>> Some perhaps overly paranoid questions:
>>
>> Does it matter that the value of tp->srtt at the time fq_codel dequeues
>> will not necessarily be the same as when that segment was queued?
>>
>
> It matters we use the last srtt value/estimation, which is done in this
> patch.
>
>> Is there any chance of the socket going away between the time the packet
>> was queued and the time it was dequeued? (Or tp->srtt becoming "undefined?")
>
> When skb->sk is non NULL, we hold a reference to the socket, it cannot
> disappear under us.
>
>
>
> _______________________________________________
> Codel mailing list
> Codel at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/codel



-- 
Dave Täht
http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-17 is out
with fq_codel!"


