[Cerowrt-devel] fq_codel tuning on distros

Dave Taht dave.taht at gmail.com
Fri Oct 17 17:59:22 EDT 2014

On Fri, Oct 17, 2014 at 2:06 PM, Matt Taggart <matt at lackof.org> wrote:
> Hi,
> http://www.bufferbloat.net/projects/codel/wiki/Best_practices_for_benchmarking_Codel_and_FQ_Codel#Tuning-fq_codel
> explains that the default packet limit of 10000 is designed for 10GigE
> speeds and that for slower links it should be turned down. Is that still
> true? Looking at 3.16 source in fq_codel_init I see:
>   sch->limit = 10*1024;

This is the 10k packet limit.

My take on the fq_codel in fedora discussion was that if a machine has
enough memory to run systemd, it has more than enough memory to run
fq_codel with the default packet limit. :)

Most of our work here has been colored by the pain of streamlining
this stuff to fit into boxes with 32MB of ram or less, and run at
rates well below 50mbit. I don't feel any intense urges to fiddle with
anything in fq_codel above those speeds and memory sizes, which is
what fedora largely targets.

I haven't seen any real difference in benchmarks from fiddling with
the fq_codel quantum at the higher rates.
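For anyone who does want to experiment on a slower link, both the limit
and the quantum are settable at runtime with tc (interface name and
values here are illustrative, not a recommendation):

```shell
# Inspect the current qdisc and its parameters (eth0 is illustrative)
tc -s qdisc show dev eth0

# Override the default 10240-packet limit, e.g. for a slow uplink
tc qdisc replace dev eth0 root fq_codel limit 1000

# Shrink the quantum toward one MTU, the old advice for sub-100mbit links
tc qdisc replace dev eth0 root fq_codel limit 1000 quantum 300
```

Both commands require root and replace the root qdisc in place.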

Does anybody here running fq_codel on their desktops or servers
fiddle with any of its parameters?

Certainly we have a backlog of tweaks to the algorithm that could use
wider testing, which I expected to mostly fold into cake and then back
into fq_codel. It's all second- or third-order stuff.

But: I have longed for more field data from bleeding-edge desktop and
server users, which is why fedora would be a wonderful place to nail
down other things that could be optimized better, if needed.

I'm not on their relevant mailing lists, am busy today, and I tried to
dump what info I could into the lwn.net article so that those fiddling
with it over there might show up here or on the codel list for more
discussion. (someone feel free to join over there...)

IF fedora goes down this route... I also wish we could reduce
TSO/GSO sizes generally, have TCP small queues autosize to load, and
have BQL on everything, in addition to fq or fq_codel.

I do have some concerns with just blithely turning fq_codel on
everywhere. One is that savvy administrators tend to build complex
qdiscs - or, as in our case, tend to turn on SQM on a given interface -
and it would be annoying for a "systemd" to arbitrarily change it back
on some sort of reload.

But: the 99.99% of users not doing any tuning need a sane default.

In terms of where to turn a different qdisc on, systemd does not
strike me as the right place. I'd argue that you'd choose a default
sysctl qdisc once, at hardware detection time, and only change it when
the hardware changes.
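There is in fact a sysctl for the default qdisc as of kernel 3.12
(net.core.default_qdisc), which only affects qdiscs created after it
is set - existing interfaces keep whatever they have until their qdisc
is reattached. Something like:

```shell
# Make fq_codel the default root qdisc for subsequently created
# interfaces (kernel >= 3.12)
sysctl -w net.core.default_qdisc=fq_codel

# Persist it across reboots (filename here is illustrative)
echo 'net.core.default_qdisc=fq_codel' > /etc/sysctl.d/90-qdisc.conf
```

Note it does nothing retroactively, so a distro would still need a hook
at device-creation or boot time for already-up interfaces.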

At a rough cut... something like this pseudocode would do:

if (have_good_clocksource) { // I have no idea how to get this from userspace
   if (!am_forwarding && 10gigE) {
     // the circumstances under which sch_fq can be used effectively
     // are not thoroughly defined. For example, I think it's a bad idea
     // to use the giant quantums it uses at lower speeds, but I do like
     // what fq+pacing does, and love it scaling to millions of flows.
     // I'd like the srtt and pacing stuff to end up back in fq_codel actually
     enable sch_fq
   } else {
     // am_forwarding || have_wifi || non_bql_device || non_web_workload
     enable fq_codel
   }
} else {
   enable pfifo_fast
}
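On the clocksource check: the kernel does expose the active clocksource
through sysfs, so a userspace sketch (assuming the standard sysfs
layout) could be as simple as:

```shell
#!/bin/sh
# Read the kernel's currently active clocksource from sysfs.
CS_PATH=/sys/devices/system/clocksource/clocksource0/current_clocksource
if [ -r "$CS_PATH" ]; then
    clocksource=$(cat "$CS_PATH")
    echo "active clocksource: $clocksource"
else
    echo "no clocksource info available" >&2
fi
```

Deciding which values count as "good" (tsc, presumably, versus jiffies
or acpi_pm) is left as policy.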

>   q->flows_cnt = 1024;

There is some work showing we could improve the hash. I don't see any
reason to change the default number of queues.

Also, the odd slow-start results I got with a huge packet limit were
from, like, 2012 - prior to TCP small queues and some TSO/GSO offload
fixes. They should be revisited. I'll put it on my list.

> But I don't know what those correspond to. Are they sysfs tunable or only
> at compile time?

I would have liked it if there were a sysctl for the default options,
and if it were more inheritable across devices, much like ip_forwarding is.

> If Linux distros are going to turn on fq_codel by default, are these

Up until this morning I would never have believed it. Aside from the
folks doing routers, we'd heard nothing but crickets back from the
mainstream distros. See for example:


I was convinced we had to make cake into a completely backward
compatible equivalent of pfifo_fast, and we had to make it run as fast
as pfifo_fast.

Stephen's presentation must have been VERY convincing!

> reasonable values for the installed base (which I am assuming is mostly
> 1GigE)? What recommendations should the distro documentation make for
> tuning on various speeds?

I run with the defaults.

> I'm excited for this to go into distros, what needs to be done to make that
> easier?

Most of my stuff was constructed to fall into /etc/network/if-pre-up.d.
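A minimal sketch of that kind of hook (Debian-style ifupdown runs the
executables in that directory with $IFACE set before bringing the
interface up; the bare fq_codel with no parameters is illustrative):

```shell
#!/bin/sh
# /etc/network/if-pre-up.d/fq_codel (sketch) - attach fq_codel as the
# root qdisc before the interface comes up. IFACE is set by ifupdown.
[ -n "$IFACE" ] || exit 0
[ "$IFACE" = "lo" ] && exit 0
tc qdisc replace dev "$IFACE" root fq_codel || true
exit 0
```

The `|| true` keeps a missing tc binary or an odd device from blocking
the interface from coming up.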

I would like the folk doing the work to benchmark their results, be
convinced by them, and then be committed to working through whatever
problems are raised by users in the field. And/or subject themselves
to the pain of pfifo_fast, as mostafa did....

I certainly have done enough benchmarking personally to convince :me:
and the lovely, dedicated bunch of users we have here.

> Thanks,
> --
> Matt Taggart
> matt at lackof.org
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel

Dave Täht

