[Codel] fq_codel_drop vs a udp flood

Dave Taht dave.taht at gmail.com
Mon May 2 22:26:45 EDT 2016


On Sun, May 1, 2016 at 11:20 AM, Jonathan Morton <chromatix99 at gmail.com> wrote:
>
>> On 1 May, 2016, at 20:59, Eric Dumazet <eric.dumazet at gmail.com> wrote:
>>
>> fq_codel_drop() could drop _all_ packets of the fat flow, instead of a
>> single one.
>
> Unfortunately, that could have bad consequences if the “fat flow” happens to be a TCP in slow-start on a long-RTT path.  Such a flow is responsive, but on an order-of-magnitude longer timescale than may have been configured as optimum.
>
> The real problem is that fq_codel_drop() performs the same (excessive) amount of work to cope with a single unresponsive flow as it would for a true DDoS.  Optimising the search function is sufficient.

Don't think so.

I did some tests today (without the fq_codel batch drop patch yet).

When hit with a 900mbit flood, cake shaping down to 250mbit results
in nearly 100% cpu use in the ksoftirq1 thread on the apu2, and
150mbits of actual throughput (as measured by iperf3, which is now a
measurement I don't trust).

cake *does* hold the packet count down a lot better than fq_codel does.

fq_codel (pre eric's patch) basically goes to the configured limit and
stays there.
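
A toy model of why the queue pins at the limit under a flood, and what a
batch drop changes. This is only a sketch, not the kernel code: the single
queue, the tick-based arrival and departure rates, and the batch size of 64
are all assumptions chosen for illustration.

```python
from collections import deque

def simulate(limit, arrivals_per_tick, departures_per_tick, ticks, batch):
    """Toy model: one flooding flow feeding a queue with a packet limit.

    When the queue is at `limit`, `batch` packets are dropped from the head
    before the new packet is enqueued. Returns (final queue length,
    number of times the over-limit drop path ran).
    """
    queue = deque()
    drop_calls = 0
    for _ in range(ticks):
        for _ in range(arrivals_per_tick):
            if len(queue) >= limit:
                drop_calls += 1
                for _ in range(min(batch, len(queue))):
                    queue.popleft()
            queue.append(1)
        for _ in range(min(departures_per_tick, len(queue))):
            queue.popleft()
    return len(queue), drop_calls

# Hypothetical numbers: 9 packets in and 1 packet out per tick.
single_len, single_calls = simulate(1000, 9, 1, 1000, batch=1)
batch_len, batch_calls = simulate(1000, 9, 1, 1000, batch=64)
print("single drop:", single_len, "queued,", single_calls, "drop calls")
print("batch drop: ", batch_len, "queued,", batch_calls, "drop calls")
```

With one drop per over-limit packet, the drop path runs for nearly every
arriving packet and the queue never leaves the limit. With a batch, the same
path runs far less often, which is the cost the patch is aiming at.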

In both cases I will eventually get an error like this (in my babel
routed environment) that suggests that we're also not delivering
packets from other flows (arp?) with either fq_codel or cake in these
extreme conditions.

iperf3 -c 172.26.64.200 -u -b900Mbit -t 600

[  4]  47.00-48.00  sec   107 MBytes   895 Mbits/sec  13659
iperf3: error - unable to write to stream socket: No route to host

...

The results I get from iperf are a bit puzzling over the interval it
samples at - this is from a 100Mbit test (downshifting from 900mbit)

[ 15]  25.00-26.00  sec   152 KBytes  1.25 Mbits/sec  0.998 ms  29673/29692 (1e+02%)
[ 15]  26.00-27.00  sec   232 KBytes  1.90 Mbits/sec  1.207 ms  10235/10264 (1e+02%)
[ 15]  27.00-28.00  sec  72.0 KBytes   590 Kbits/sec  1.098 ms  19035/19044 (1e+02%)
[ 15]  28.00-29.00  sec  0.00 Bytes  0.00 bits/sec  1.098 ms  0/0 (-nan%)
[ 15]  29.00-30.00  sec  72.0 KBytes   590 Kbits/sec  1.044 ms  22468/22477 (1e+02%)
[ 15]  30.00-31.00  sec  64.0 KBytes   524 Kbits/sec  1.060 ms  13078/13086 (1e+02%)
[ 15]  31.00-32.00  sec  0.00 Bytes  0.00 bits/sec  1.060 ms  0/0 (-nan%)
^C[ 15]  32.00-32.66  sec  64.0 KBytes   797 Kbits/sec  1.050 ms  25420/25428 (1e+02%)

Not that I care all that much about how iperf is interpreting its drop
rate (I guess pulling apart the actual packet captures is in order).
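
For what it's worth, the lost/total counts in those interval lines can be
checked by hand. The sketch below recomputes the loss percentage and shows
that a value just under 100 prints as "1e+02" when formatted to two
significant digits. That iperf3 formats the field this way is an assumption
on my part, as is the 8192-byte datagram size, which is only inferred from
the counts matching the transfer column.

```python
# (lost, total) datagram counts copied from the iperf3 interval lines above.
samples = [
    (29673, 29692),
    (10235, 10264),
    (19035, 19044),
    (22468, 22477),
    (13078, 13086),
    (25420, 25428),
]

# Assumption inferred from the numbers, not stated anywhere in the output.
DATAGRAM_BYTES = 8192

def loss_percent(lost, total):
    """Exact loss percentage; NaN when no datagrams were counted."""
    return 100.0 * lost / total if total else float("nan")

for lost, total in samples:
    delivered = total - lost
    pct = loss_percent(lost, total)
    kbytes = delivered * DATAGRAM_BYTES / 1024
    print(f"{lost}/{total}: {pct:.3f}% lost, "
          f"shown as {'%.2g' % pct}% at two significant digits, "
          f"{delivered} delivered = {kbytes:.0f} KBytes")
```

So the loss is above 99.7% in every non-empty interval, and the 0/0
intervals give NaN simply because nothing at all was counted in that second.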

As for cake struggling to cope:

root@apu2:/home/d/git/tc-adv/tc# ./tc -s qdisc show dev enp2s0

qdisc cake 8018: root refcnt 9 bandwidth 100Mbit diffserv4 flows rtt 100.0ms raw
 Sent 219736818 bytes 157121 pkt (dropped 989289, overlimits 1152272 requeues 0)
 backlog 449646b 319p requeues 0
 memory used: 2658432b of 5000000b
 capacity estimate: 100Mbit
                  Bulk Best Effort       Video       Voice
  thresh       100Mbit   93750Kbit      75Mbit      25Mbit
  target         5.0ms       5.0ms       5.0ms       5.0ms
  interval     100.0ms     100.0ms     100.0ms     100.0ms
  pk_delay         0us       5.2ms        92us        48us
  av_delay         0us       5.1ms         4us         2us
  sp_delay         0us       5.0ms         4us         2us
  pkts               0     1146649          31          49
  bytes              0  1607004053        2258        8779
  way_inds           0           0           0           0
  way_miss           0          15           2           1
  way_cols           0           0           0           0
  drops              0      989289           0           0
  marks              0           0           0           0
  sp_flows           0           0           0           0
  bk_flows           0           1           0           0
  last_len           0        1514          66         138
  max_len            0        1514         110         487
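
A quick sanity check on those counters, as a sketch: the per-tin pkts values
appear to count every packet seen, including dropped ones (an assumption
that the arithmetic below supports), so sent + dropped + backlog should
equal the sum over the tins.

```python
# Values copied from the tc -s output above.
sent_pkts = 157121
dropped = 989289
backlog_pkts = 319
tin_pkts = {"Bulk": 0, "Best Effort": 1146649, "Video": 31, "Voice": 49}

total_seen = sum(tin_pkts.values())
accounted = sent_pkts + dropped + backlog_pkts
drop_fraction = dropped / total_seen

print("packets seen across tins:", total_seen)
print("sent + dropped + backlog:", accounted)
print(f"fraction dropped: {drop_fraction:.1%}")
```

The two totals agree, and all of the drops are in the Best Effort tin, which
is where the flood lands.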

...

But I am very puzzled as to why flow isolation would fail in the face
of this overload.

>  - Jonathan Morton
>



-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org

