Since there seems to be some confusion about what aspect of bufferbloat fq_codel controls, perhaps somewhere in a HOWTO somebody should talk about how to combine tbf (for the upstream "cable modem" case) with fq_codel on cerowrt today. I think the contemplated mfq_codel idea mentioned below would be nice, but it would have to be parameterized, which codel need not be.

For an even better user experience, perhaps Luci could have a "wizard" that asks a simple question like:

is your upstream a cable modem with bufferbloat? ( ) yes ( ) no
what is its rated "uplink" bitrate? _____ Mb/sec

and then sets up tbf+fq_codel properly for people who want a simple setup. Reasonable tbf parameters should be easy to calculate given the rate.

-----Original Message-----
From: "Dave Taht"
Sent: Monday, August 20, 2012 3:12pm
To: "Sebastian Moeller"
Cc: cerowrt-devel@lists.bufferbloat.net, "Felix Fietkau"
Subject: [Cerowrt-devel] Coping with router memory limitations in fq_codel

Dear Sebastian:

In addition to your udp flooding DoS attack, I also attacked cero by using diffserv marking in netperf (-Y codepoint,codepoint) to saturate all 4 wifi fq_codel queues, and could likewise get the router to have memory allocation failures and ultimately crash in the same way you are crashing it. I can similarly do what you just did with rtp flooding.

You are correct that codel is tuned for tcp, and that fq_codel, by maintaining many queues, is even more susceptible to a tuned udp flooding attack on a memory limited device such as this.

I tried to cope with this in 3.3.8-10/11 by reducing the packet limits, which helped a lot. Unfortunately the settings I used then were below codel's reaction time, which invoked "interesting" tail drop behavior, so I arbitrarily doubled them in -17. To invoke more of the kind of problems you are encountering...

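To make the packet-limit concern above concrete, here is a rough sketch (in Python) of the worst-case memory that queued packets could pin if every wifi queue filled to its limit. It assumes each queued packet costs one 2048-byte slab object, which ignores skb bookkeeping overhead, and it uses the four wifi queues and the 64MB router mentioned in this thread. The 10240 figure is the stock fq_codel default packet limit; the smaller limits are hypothetical values for illustration, not the settings used in any cerowrt release.

```python
# Back-of-the-envelope estimate only; all constants are assumptions
# taken from (or loosely based on) the discussion in this thread.
SLAB_BYTES = 2048      # assumed cost per queued packet (one 2048-byte slab object)
WIFI_QUEUES = 4        # the four wifi hardware queues, each with its own fq_codel
ROUTER_RAM_MB = 64     # the 64MB 3700v2


def worst_case_queue_mb(packet_limit, queues=WIFI_QUEUES,
                        bytes_per_packet=SLAB_BYTES):
    """Return MiB held in packets if every queue fills to packet_limit."""
    return queues * packet_limit * bytes_per_packet / (1024 * 1024)


if __name__ == "__main__":
    # 10240 is the stock fq_codel default; the rest are hypothetical limits.
    for limit in (10240, 1000, 600, 300):
        mb = worst_case_queue_mb(limit)
        share = 100 * mb / ROUTER_RAM_MB
        print(f"limit {limit:>5}: {mb:6.1f} MiB in {WIFI_QUEUES} queues "
              f"(~{share:.0f}% of {ROUTER_RAM_MB} MB)")
```

Under these assumptions the stock limit could in principle pin more memory than the router has, which is consistent with a flood being able to trigger allocation failures unless the limits are cut down.
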
0) Since then I have been looking into ways to improve codel's reaction time that are in the ns2 model presently, also fixing an assumption about Newton's method that didn't hold in reverse, and also means to incorporate more aggressive codel behavior when queue limits are near to being exceeded. Unfortunately, as the memory pressure problem starts in the driver, it's not communicated up the stack to where it could be controlled better...

0) I would like to avoid having to determine if a queue is tcp or "other", and then having different kinds of drop strategies for each. That said, it seems possible to implement that...

1) A workaround of sorts for the 64MB 3700v2 has been to give up on named and get some memory back that way.

2) I believe, but am not sure, that Linux 3.6(5?) has some stuff in it to get skb memory allocations done more efficiently. Eric and I and Felix had talked about it; I don't know what was implemented.

3) It may be possible to improve how the memory allocations from the 2048 slab work in general. I imagine that half of memory is being wasted on big packets otherwise.

4) Some options for improving fq_codel for more memory constrained home environments:

4a) On the wifi front (as well as other devices with multiple hardware queues), I envision something like "mfq_codel", which would have an overall similar packet limit to a single fq_codel, but be able to deliver (and fair queue) packets to the underlying hardware queues independently.

4b) On the home to-ISP gateway qos front, a rate limited (tbf) mfq_codel with 2-4 queues would replace the complexity of hfsc or htb with a default qdisc that "just worked" without any scripting. It could be mildly more responsive (htb buffers up some data and has its own notion of time and quantums), thus cpu and memory usage would be lower than htb + multiple fq_codel queues. Getting something that scaled down to 10s of kbits and up to gigabits would be hard, tho.
HTB needs to be tuned when running lower or higher than its original operating range, presently, and that is where, in part, the simple_qos.sh effort is "stuck".

4c) Another thought would be to have a weighted packet (to handle classification) oriented sfq codel or qfq_codel rather than separate fq_codel queues that are each byte-aware... we have CPU to burn, but not memory...

On Mon, Aug 20, 2012 at 11:24 AM, Sebastian Moeller wrote:
> Hi Dave,
>
> so I went to play around with this a bit more. I turned to UDP flooding my cable modem through the router and this surely allows me to create enough load on the wndr3700v2 to cause the allocation errors and as a "bonus" also to drive the router to reboot (driven by the watchdog timer?). Here is the script I used over 5G wireless (from http://blog.ioshints.info/2008/03/udp-flood-in-perl.html)
>
> #!/usr/bin/perl

It would be nice to have a C or lua version of this sort of test.

> ##############
>
> # udp flood.
> ##############
>
> use Socket;
> use strict;
>
> if ($#ARGV != 3) {
> print "flood.pl