From: Dave Taht <dave.taht@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: codel@lists.bufferbloat.net
Subject: Re: [Codel] fq_codel : interval servo
Date: Fri, 31 Aug 2012 09:59:17 -0700
Message-ID: <CAA93jw6hQhFpJjySqfRTS3DFLDKV+LPfLzqDU8JMZdJOBaJ2HQ@mail.gmail.com>
In-Reply-To: <1346430207.7996.11.camel@edumazet-glaptop>
On Fri, Aug 31, 2012 at 9:23 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2012-08-31 at 08:53 -0700, Rick Jones wrote:
>> On 08/30/2012 11:55 PM, Eric Dumazet wrote:
>> > On locally generated TCP traffic (host), we can override the 100 ms
>> > interval value using the more accurate RTT estimation maintained by TCP
>> > stack (tp->srtt)
>> >
>> > Datacenter workload benefits using shorter feedback (say if RTT is below
>> > 1 ms, we can react 100 times faster to a congestion)
>> >
>> > Idea from Yuchung Cheng.
>>
>> Mileage varies of course, but what are the odds of a datacenter's
>> end-system's NIC(s) being the bottleneck point? Is it worth pinging a
>> couple additional cache lines around (looking at the v2 email, I'm
>> ass-u-me-ing that sk_protocol and tp->srtt are on different cache lines)?
>>
>
> Its certainly worth pinging additional cache lines.
>
> A host consume almost no cpu in qdisc layer (at least not in fq_codel)
>
> A router wont use this stuff (as skb->sk will be NULL)
>
>> If fq_codel is going to be a little bit pregnant and "layer violate" :)
>> why stop at TCP?
>
> Who said I would stop at TCP ? ;)
Heh. "Vith Codel I vill rule ze vorld!"
I have not, of late, been focused on TCP (torn up about uTP,
actually), and have been working more on the kinds of problems that
occur in the home gateway.
I realize that 10GigE and datacenter host based work is sexy and fun,
but getting stuff that runs well in today's 1-20Mbit environments is
my own priority, going up to 100Mbit, with something that can be
embedded in a SoC. The latest generation of SoCs all do QoS in
hardware... badly.
Secondly, finding things that would work on the head ends (CMTSes and
DSLAMs) also ranks way up there.
Fixing wifi follows that in priority...
While many of the things under tweak have commonality, I'm thinking
more and more it would be saner for me to fork off for a while and do
a "mfq_codel", which would let me A) test codel, fq_codel, and
mfq_codel enhancements on the same build and testbed, B) play with
stuff that works better at lower bandwidths, and C) play with wifi.
Two notes from my slow and often buggy work that I haven't mentioned
on this list yet:
1) The current fq_codel implementation wipes out the codel state
every time a queue is emptied. This happens rarely at 10GigE speeds,
quite often below 100Mbit. Keeping the state around helps a LOT on
queue depth - for example (with the rest of my buggy patchset) I end
up with a very nice avg 50 packet backlog at 100Mbit, low median,
stddev, etc, with 8 streams, and with 150 streams, 70 or so...
"fixing" that just involved removing codel_init_vars from the "is this
a new queue" routine.
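
The effect described in 1) can be illustrated with a toy model. This is
only a sketch, not the kernel code: every name below (toy_codel_vars,
toy_flow, toy_flow_enqueue, the keep_state flag) is hypothetical and
stands in for the real per-flow state. It assumes the reset happens on
the first enqueue into an empty flow, as the text above describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for per-flow CoDel state. */
struct toy_codel_vars {
    uint32_t count;      /* drops since entering the dropping state */
    uint32_t lastcount;  /* count when the dropping state was entered */
    bool     dropping;   /* currently in the dropping state? */
    uint32_t drop_next;  /* time of the next scheduled drop */
};

struct toy_flow {
    struct toy_codel_vars cvars;
    unsigned int backlog_pkts;   /* packets currently queued */
};

/* Forget everything the control law has learned about this flow. */
static void toy_codel_init_vars(struct toy_codel_vars *v)
{
    memset(v, 0, sizeof(*v));
}

/*
 * Enqueue one packet.
 * keep_state == false: an empty flow is treated as brand new and its
 *                      CoDel state is wiped (the behaviour described).
 * keep_state == true:  the state survives the empty period (the change
 *                      described: the init call is simply skipped).
 */
static void toy_flow_enqueue(struct toy_flow *f, bool keep_state)
{
    assert(f != NULL);
    if (f->backlog_pkts == 0 && !keep_state)
        toy_codel_init_vars(&f->cvars);
    f->backlog_pkts++;
}

/* Dequeue one packet; the flow may become empty here. */
static void toy_flow_dequeue(struct toy_flow *f)
{
    assert(f != NULL && f->backlog_pkts > 0);
    f->backlog_pkts--;
}
```

At low bandwidths a flow drains to empty often, so with keep_state set
to false the drop count rarely gets a chance to build up between
bursts. With it set to true, the count carries over to the next burst.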
Some of the other patches I've thrown around are about seeing odd
behaviors on temporarily empty queues, like:
2) maxpacket can get set to the largest packet ever seen by a queue
(like a TSO packet)
 ....
 		} else {
+			stats->maxpacket = qdisc_pkt_len(skb);
 			vars->count = 1;
 			vars->rec_inv_sqrt = ~0U >> REC_INV_SQRT_SHIFT;
 		}
So resetting it will A) set maxpacket for a fq_codel queue to the
actual packet size (acks, for example), which might make codel more
accurate on those sorts of streams... and B) pick up changes made by
fiddling with ethtool to turn off gso/tso/etc, which were not being
reflected in codel's estimates.
B) has occasionally caused a great deal of headscratching.
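
Why a stale maxpacket matters can be sketched as below. This is a toy
model under stated assumptions, not the kernel code: the names
(toy_stats, toy_note_packet, toy_reset_maxpacket, toy_backlog_is_small)
are hypothetical, and it assumes maxpacket is used as a "backlog of at
most one maximum-size packet is fine" guard, in the way CoDel uses it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the statistics that hold maxpacket. */
struct toy_stats {
    uint32_t maxpacket;  /* largest packet length seen, in bytes */
};

/* Running maximum: the value only ever grows. */
static void toy_note_packet(struct toy_stats *s, uint32_t pkt_len)
{
    assert(s != NULL);
    if (pkt_len > s->maxpacket)
        s->maxpacket = pkt_len;
}

/* The reset from the patch fragment: take the current packet's size. */
static void toy_reset_maxpacket(struct toy_stats *s, uint32_t pkt_len)
{
    assert(s != NULL);
    s->maxpacket = pkt_len;
}

/*
 * Guard in the style of CoDel: a backlog no larger than one
 * maximum-size packet is never treated as a standing queue.
 */
static bool toy_backlog_is_small(const struct toy_stats *s,
                                 uint32_t backlog_bytes)
{
    assert(s != NULL);
    return backlog_bytes <= s->maxpacket;
}
```

For example, after one 65536 byte TSO packet, a backlog of twenty
66 byte acks (1320 bytes) still counts as "small". Once maxpacket is
reset to 66, the same backlog no longer does.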
>
>>
>> Is this change rectifying an "unfairness" with the existing fq_codel and
>> the 100ms for all when two TCP flows have very different srtts?
>>
>
> codel has to use a single interval value, and we use an average value.
> It seems to work quite well.
>
> fq_codel has the opportunity to get a per tcp flow interval value.
> And this should give better behavior.
>
>> Some perhaps overly paranoid questions:
>>
>> Does it matter that the value of tp->srtt at the time fq_codel dequeues
>> will not necessarily be the same as when that segment was queued?
>>
>
> It matters we use the last srtt value/estimation, which is done in this
> patch.
>
>> Is there any chance of the socket going away between the time the packet
>> was queued and the time it was dequeued? (Or tp->srtt becoming "undefined?")
>
> When skb->sk is non NULL, we hold a reference to the socket, it cannot
> disappear under us.
>
>
>
> _______________________________________________
> Codel mailing list
> Codel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/codel
--
Dave Täht
http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-17 is out
with fq_codel!"