* optimizing for very small bandwidths with fq_codel better?
@ 2013-05-02 22:07 Dave Taht
2013-05-03 5:01 ` [Cerowrt-devel] " Mikael Abrahamsson
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Dave Taht @ 2013-05-02 22:07 UTC (permalink / raw)
To: bloat-devel, cerowrt-devel, Jonathan Morton
Given some of the kerfuffle over bittorrent and voip traffic in
relation to the cablelabs report, and the effects of fq at
very high numbers of flows (100+) vs priority traffic like voip,
at very low (4Mbit and lower) bandwidths....
and no matter that the default hash on fq_codel is *very* robust, at
lower bandwidths some optimization is desirable, and people seem to
have a general need for control and classification that seems
insatiable...
And adding a classic three or four band shaper is a little difficult,
but using something like classic fq "weights" was, theoretically
not...
so, anyway, I sat down and fiddled with the tc command to try and
generate a set of filters that would scale better below 4Mbit,
deprioritize bittorrent and background traffic, do gaming and ef
traffic better, AND (most importantly) work without HTB when possible,
while still retaining the base simplicity and most of the advantages
of fq_codel. It's very short and can be applied to anything to play
with (although in my case I slow down my laptop's ethernet port to
100Mbit so I can fill the queues)
Very short, heavily commented, 14 line attempt:
http://pastebin.com/bRmW9YD3
It isn't done.
1) I think there's a bug in either the kernel or tc or me on tos matching,
2) and there may be a useful feature to add to tc for doing smarter
filtering when you have multiple sets of bins...
3) and it seems like you'd have to use iptables to match for torrents
and then use a fw match?
so perhaps there is someone more expert on tc than I out there or with
more patience to fiddle?
--
Dave Täht
Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [Cerowrt-devel] optimizing for very small bandwidths with fq_codel better?
2013-05-02 22:07 optimizing for very small bandwidths with fq_codel better? Dave Taht
@ 2013-05-03 5:01 ` Mikael Abrahamsson
2013-05-03 11:45 ` Dave Taht
2013-05-03 5:58 ` Jonathan Morton
2013-05-03 16:53 ` Juliusz Chroboczek
2 siblings, 1 reply; 6+ messages in thread
From: Mikael Abrahamsson @ 2013-05-03 5:01 UTC (permalink / raw)
To: Dave Taht; +Cc: bloat-devel, cerowrt-devel
On Thu, 2 May 2013, Dave Taht wrote:
> 1) I think there's a bug in either the kernel or tc or me on tos matching,
Taking a guess here...
The TOS byte is 8 bits. So EF is 46, which is 0x2e, and then you need to
left-shift it 2 bits because the DSCP is the most significant 6 bits, so you
get 0xb8 (if my early morning pre-breakfast hex calculations are correct). Try
matching on that and see if it works.
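The arithmetic above can be checked with a short Python sketch. The name dscp_to_tos is made up for illustration; the only assumption is that the DSCP sits in the upper six bits of the TOS byte.

```python
def dscp_to_tos(dscp: int) -> int:
    """Shift a 6-bit DSCP code point into the upper six bits of the TOS byte."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


EF = 46  # Expedited Forwarding code point, 0x2e
print(hex(dscp_to_tos(EF)))  # 0xb8
```

So a filter given 0x2e is comparing against the raw code point, not against the byte that appears on the wire.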
Some programs use the whole TOS byte, some just do the 6 DSCP bits. I
usually end up using tcpdump to see on the wire what the program actually
does.
--
Mikael Abrahamsson email: swmike@swm.pp.se
* Re: optimizing for very small bandwidths with fq_codel better?
2013-05-02 22:07 optimizing for very small bandwidths with fq_codel better? Dave Taht
2013-05-03 5:01 ` [Cerowrt-devel] " Mikael Abrahamsson
@ 2013-05-03 5:58 ` Jonathan Morton
2013-05-03 16:53 ` Juliusz Chroboczek
2 siblings, 0 replies; 6+ messages in thread
From: Jonathan Morton @ 2013-05-03 5:58 UTC (permalink / raw)
To: Dave Taht; +Cc: bloat-devel, cerowrt-devel
On 3 May, 2013, at 1:07 am, Dave Taht wrote:
> 1) I think there's a bug in either the kernel or tc or me on tos matching,
So this works:
tc filter add dev eth2 parent a: protocol ip prio 8 u32 match ip tos 0x2e fc flowid a:b
But this doesn't:
tc filter add dev eth2 parent a: protocol ip prio 10 u32 match ip tos 0x08 0xfc flowid a:b
I notice, near the end, that one has fc and the other has 0xfc.
- Jonathan Morton
* Re: [Cerowrt-devel] optimizing for very small bandwidths with fq_codel better?
2013-05-03 5:01 ` [Cerowrt-devel] " Mikael Abrahamsson
@ 2013-05-03 11:45 ` Dave Taht
0 siblings, 0 replies; 6+ messages in thread
From: Dave Taht @ 2013-05-03 11:45 UTC (permalink / raw)
To: Mikael Abrahamsson; +Cc: bloat-devel, cerowrt-devel
On Thu, May 2, 2013 at 10:01 PM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
> On Thu, 2 May 2013, Dave Taht wrote:
>
>> 1) I think there's a bug in either the kernel or tc or me on tos matching,
>
>
> Taking a guess here...
>
> The TOS byte is 8 bits. So EF is 46, which is 0x2e, and then you need to
> left-shift it 2 bits because it's the most significant 6 bits, you get 0xb8
> (if my early morning pre-breakfast hex calculations are correct). Try
> matching on that and see if it works.
THANK YOU. I've been making that mistake on EF now for 2 years, over
and over again, in everything. Thank you for breaking me of the habit.
> Some programs use the whole TOS byte, some just do the 6 DSCP bits. I
> usually end up using tcpdump to see on the wire what the program actually
> does.
In this case I am hopefully ignoring ECN bits by using 0xfc as the mask.
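A minimal sketch of what a 0xfc mask does to the comparison, assuming the filter performs a plain masked equality test on the TOS byte. The helper tos_matches is hypothetical and is not part of tc.

```python
DSCP_MASK = 0xFC  # upper six bits of the TOS byte: the DSCP
ECN_MASK = 0x03   # lower two bits of the TOS byte: ECN


def tos_matches(tos: int, value: int, mask: int = DSCP_MASK) -> bool:
    """Compare only the bits selected by mask, ignoring the rest."""
    return (tos & mask) == (value & mask)


# EF with any ECN marking still matches 0xb8 under the 0xfc mask.
print([tos_matches(0xB8 | ecn, 0xB8) for ecn in range(4)])
# The unshifted code point 0x2e does not.
print(tos_matches(0x2E, 0xB8))
```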
OK, so I put a new paste up, with the fixed EF classifier.
Fiddle at:
https://pastee.org/m5pxn
I also found the option to have an offset to a filter with a hash and
divisor - "baseclass". So I have arbitrarily fiddled with multiple
traffic types and then tossed the rest into a subset of fq_codel
queues. (I really am a big fan of better mixing, particularly with tcp
traffic).
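A rough model of how a hash, a divisor and a baseclass offset combine, as described above. This is only a sketch: crc32 stands in for whatever hash the kernel really uses, and the flow keys and class numbers are invented for illustration.

```python
import zlib


def pick_class(flow_key: bytes, divisor: int, baseclass: int) -> int:
    """Map a flow onto one of `divisor` classes starting at `baseclass`."""
    return baseclass + zlib.crc32(flow_key) % divisor


# Hypothetical layout: classes below 4 reserved for hand-classified traffic,
# everything else hashed across the 12 classes from 4 to 15.
flows = [b"10.0.0.2:5001>10.0.0.9:80", b"10.0.0.3:40000>10.0.0.9:443"]
for key in flows:
    print(key.decode(), "->", pick_class(key, divisor=12, baseclass=4))
```

Whatever the hash returns, the result stays inside the range set aside for unclassified traffic, so it cannot collide with the reserved classes.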
I have not, however, figured out how to classify via a u32 match and end
up with a hash, divisor, and baseclass offset (it seems to make sense
to have a bunch of queues for gaming/dns/interactive traffic).
My observation is that a LOT of traffic is seemingly marked CS1 (or
remarked as such, particularly on ingress from the internet), so that
if you wanted to pound torrent flat and torrent only, deeper packet
inspection might be required, and CS1 marked traffic could use a
couple of queues dedicated to it....
nor have I benchmarked this attempt....
>
> --
> Mikael Abrahamsson email: swmike@swm.pp.se
--
Dave Täht
Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
* Re: optimizing for very small bandwidths with fq_codel better?
2013-05-02 22:07 optimizing for very small bandwidths with fq_codel better? Dave Taht
2013-05-03 5:01 ` [Cerowrt-devel] " Mikael Abrahamsson
2013-05-03 5:58 ` Jonathan Morton
@ 2013-05-03 16:53 ` Juliusz Chroboczek
2013-05-03 22:15 ` Dave Taht
2 siblings, 1 reply; 6+ messages in thread
From: Juliusz Chroboczek @ 2013-05-03 16:53 UTC (permalink / raw)
To: Dave Taht; +Cc: bloat-devel, cerowrt-devel
tc qdisc add dev eth2 handle a root fq_codel flows 16 quantum 300
What's the deal with the small quantum? Less than the MTU?
-- Juliusz
* Re: optimizing for very small bandwidths with fq_codel better?
2013-05-03 16:53 ` Juliusz Chroboczek
@ 2013-05-03 22:15 ` Dave Taht
0 siblings, 0 replies; 6+ messages in thread
From: Dave Taht @ 2013-05-03 22:15 UTC (permalink / raw)
To: Juliusz Chroboczek; +Cc: bloat-devel, cerowrt-devel
On Fri, May 3, 2013 at 9:53 AM, Juliusz Chroboczek
<jch@pps.univ-paris-diderot.fr> wrote:
> tc qdisc add dev eth2 handle a root fq_codel flows 16 quantum 300
>
> What's the deal with the small quantum? Less than the MTU?
What this does is optimize for interactive traffic over bulk.
fq_codel, which is DRR based, has a "new" flow idea in it: packets
that exceed a given quantum never become new. So a 1500 byte packet
will be passed over 5 times before being delivered, while flows with
packets smaller than the quantum have a chance of being bumped into the
new flow queue and delivered most rapidly, depending on how "sparse"
they are.
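The "passed over 5 times" figure can be reproduced with a simplified model of DRR deficit accounting for one backlogged flow. This is a sketch, not the kernel code: it assumes a flow starts with one quantum of credit, may send while its deficit is positive, and gains one quantum each time it is skipped. The function name is made up.

```python
def rounds_skipped(quantum: int, packet_len: int, packets: int = 3) -> list:
    """Count how often a flow is passed over before each of its packets."""
    deficit = quantum  # a fresh flow starts with one quantum of credit
    skips_per_packet = []
    for _ in range(packets):
        skips = 0
        while deficit <= 0:
            # Out of credit: the scheduler moves on and tops the flow up.
            deficit += quantum
            skips += 1
        deficit -= packet_len  # sending may drive the deficit negative
        skips_per_packet.append(skips)
    return skips_per_packet


print(rounds_skipped(300, 1500))    # [0, 5, 5]
print(rounds_skipped(300, 100, 5))  # [0, 0, 0, 1, 0]
```

In this model a bulk flow of 1500 byte packets waits five rounds between packets at quantum 300, while a flow of 100 byte packets is skipped only once per three packets.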
It's a very old trick, going back as far as the venerable
wondershaper (which did this, but badly, resulting in packet
re-ordering on some kinds of traffic).
I note that what I'm doing here (with the oddball filters) is an
experiment in progress. There are some comments in the tc code as to
where I'd like it to go...
We get very good results from fq_codel with a quantum this size at
4Mbits, but below that bandwidth things are dicey. Prior to this
experiment I'd been using a 3 tier shaper with htb, but that too had
trouble getting below 2Mbits, and relied on classification in order to
push background traffic into the background queue.
This is an attempt to be able to run at the line rate (whatever it is)
(or under a single htb), at 2Mbit or lower (384k being the lowest).
We may also need to fiddle with fq_codel's target and/or interval in
order to work well at these rates.
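One possible reason target matters at these rates is how long a single full-size packet occupies the link. The sketch below works that out for the rates mentioned in this thread; the 1500 byte packet size is an assumption, and fq_codel's default target is 5 ms.

```python
def serialization_ms(packet_bytes: int, rate_bps: int) -> float:
    """Time in milliseconds to put one packet on a link of the given rate."""
    return packet_bytes * 8 * 1000 / rate_bps


for rate_bps in (4_000_000, 2_000_000, 384_000):
    print(rate_bps, serialization_ms(1500, rate_bps))
# 4000000 3.0
# 2000000 6.0
# 384000 31.25
```

At 2Mbit and below, one 1500 byte packet already takes longer to send than the default 5 ms target.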
For some detail as to what spurred this, please see the recent cablelabs report:
http://www.cablelabs.com/downloads/pubs/Active_Queue_Management_Algorithms_DOCSIS_3_0.pdf
Once you get above 4Mbit extensive fiddling with fq_codel's defaults
seems increasingly unnecessary.
We've also been fiddling with various forms of a "pfq_codel", which tries
to emulate pfifo_fast behavior in similar ways.
>
> -- Juliusz
--
Dave Täht
Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html