From: Dave Taht
To: Sebastian Moeller
Cc: cerowrt-devel@lists.bufferbloat.net, Felix Fietkau
Date: Mon, 20 Aug 2012 12:12:16 -0700
Subject: [Cerowrt-devel] Coping with router memory limitations in fq_codel

Dear Sebastian:

In addition to your udp flooding DoS attack, I attacked cero also by using diffserv marking in netperf (-Y codepoint,codepoint) to saturate
all 4 wifi fq_codel queues, and could also get the router into memory allocation failures and ultimately crash it in the same way you are crashing it. I can similarly do what you just did with rtp flooding.

You are correct that codel is tuned for tcp, and that fq_codel, by maintaining many queues, is even more susceptible to a tuned udp flooding attack on a memory limited device such as this.

I tried to cope with this in 3.3.8-10/11 by reducing the packet limits, which helped a lot. Unfortunately the settings I used then were below codel's reaction time, which invoked "interesting" tail drop behavior, so I arbitrarily doubled them in -17, only to invoke more of the kind of problems you are encountering...

0) Since then I have been looking into ways to improve codel's reaction time (these are in the ns2 model presently), fixing an assumption about Newton's method that didn't hold in reverse, and finding means to incorporate more aggressive codel behavior when queue limits are close to being exceeded. Unfortunately, as the memory pressure problem starts in the driver, it is not communicated up the stack to where it could be controlled better...

0) I would like to avoid having to determine whether a queue is tcp or "other", and then having different kinds of drop strategies for each. That said, it seems possible to implement that...

1) A workaround of sorts for the 64MB 3700v2 has been to give up on named and get some memory back that way.

2) I believe, but am not sure, that Linux 3.6(5?) has some stuff in it to get skb memory allocations done more efficiently. Eric, Felix, and I had talked about it; I don't know what was implemented.

3) It may be possible to improve how the memory allocations from the 2048 slab work in general. I imagine that half of memory is being wasted on big packets otherwise.

4) Some options for improving fq_codel for more memory constrained home environments:
4a) On the wifi front (as well as other devices with multiple hardware queues), I envision something like "mfq_codel", which would have an overall similar packet limit to a single fq_codel, but be able to deliver (and fair queue) packets to the underlying hardware queues independently.

4b) On the home-to-ISP gateway qos front, a rate limited (tbf) mfq_codel with 2-4 queues would replace the complexity of hfsc or htb with a default qdisc that "just worked" without any scripting. It could be mildly more responsive (htb buffers up some data and has its own notion of time and quantums), and cpu and memory usage would be lower than htb + multiple fq_codel queues. Getting something that scaled down to 10s of kbits and up to gigabits would be hard, tho. HTB presently needs to be tuned when running lower or higher than its original operating range, and that is where, in part, the simple_qos.sh effort is "stuck".

4c) Another thought would be to have a weighted (to handle classification), packet oriented sfq codel or qfq_codel rather than separate fq_codel queues that are each byte-aware... we have CPU to burn, but not memory...

On Mon, Aug 20, 2012 at 11:24 AM, Sebastian Moeller wrote:
> Hi Dave,
>
> so I went to play around with this a bit more. I turned to UDP flooding my cable modem through the router and this surely allows me to create enough load on the wndr3700v2 to cause the allocation errors and as a "bonus" also to drive the router to reboot (driven by the watchdog timer?). Here is the script I used over 5G wireless (from http://blog.ioshints.info/2008/03/udp-flood-in-perl.html)
>
> #!/usr/bin/perl

It would be nice to have a C or lua version of this sort of test.

> ##############
>
> # udp flood.
> ##############
>
> use Socket;
> use strict;
>
> if ($#ARGV != 3) {
> print "flood.pl