[Codel] fq_codel_drop vs a udp flood

Dave Taht dave.taht at gmail.com
Tue May 3 01:21:34 EDT 2016


On Mon, May 2, 2016 at 7:26 PM, Dave Taht <dave.taht at gmail.com> wrote:
> On Sun, May 1, 2016 at 11:20 AM, Jonathan Morton <chromatix99 at gmail.com> wrote:
>>
>>> On 1 May, 2016, at 20:59, Eric Dumazet <eric.dumazet at gmail.com> wrote:
>>>
>>> fq_codel_drop() could drop _all_ packets of the fat flow, instead of a
>>> single one.
>>
>> Unfortunately, that could have bad consequences if the “fat flow” happens to be a TCP in slow-start on a long-RTT path.  Such a flow is responsive, but on an order-of-magnitude longer timescale than may have been configured as optimum.
>>
>> The real problem is that fq_codel_drop() performs the same (excessive) amount of work to cope with a single unresponsive flow as it would for a true DDoS.  Optimising the search function is sufficient.
>
> Don't think so.
>
> I did some tests today (not with the fq_codel batch drop patch yet).
>
> When hit with a 900Mbit flood, cake shaping down to 250Mbit results
> in nearly 100% CPU use in the ksoftirq1 thread on the apu2, and
> 150Mbits of actual throughput (as measured by iperf3, which is now a
> measurement I don't trust).
>
> cake *does* hold the packet count down a lot better than fq_codel does.
>
> fq_codel (pre eric's patch) basically goes to the configured limit and
> stays there.
>
> In both cases I will eventually get an error like this (in my babel
> routed environment) that suggests that we're also not delivering
> packets from other flows (arp?) with either fq_codel or cake in these
> extreme conditions.
>
> iperf3 -c 172.26.64.200 -u -b900Mbit -t 600
>
> [  4]  47.00-48.00  sec   107 MBytes   895 Mbits/sec  13659
> iperf3: error - unable to write to stream socket: No route to host
>
> ...
>
> The results I get from iperf are a bit puzzling over the interval it
> samples at - this is from a 100Mbit test (downshifting from 900mbit)
>
> [ 15]  25.00-26.00  sec   152 KBytes  1.25 Mbits/sec  0.998 ms
> 29673/29692 (1e+02%)
> [ 15]  26.00-27.00  sec   232 KBytes  1.90 Mbits/sec  1.207 ms
> 10235/10264 (1e+02%)
> [ 15]  27.00-28.00  sec  72.0 KBytes   590 Kbits/sec  1.098 ms
> 19035/19044 (1e+02%)
> [ 15]  28.00-29.00  sec  0.00 Bytes  0.00 bits/sec  1.098 ms  0/0 (-nan%)
> [ 15]  29.00-30.00  sec  72.0 KBytes   590 Kbits/sec  1.044 ms
> 22468/22477 (1e+02%)
> [ 15]  30.00-31.00  sec  64.0 KBytes   524 Kbits/sec  1.060 ms
> 13078/13086 (1e+02%)
> [ 15]  31.00-32.00  sec  0.00 Bytes  0.00 bits/sec  1.060 ms  0/0 (-nan%)
> ^C[ 15]  32.00-32.66  sec  64.0 KBytes   797 Kbits/sec  1.050 ms
> 25420/25428 (1e+02%)

OK, the above weirdness in calculating a "rate" is due to me sending
8k fragmented packets.

-l1470 fixed that.

> Not that I care all that much about how iperf is interpreting its drop
> rate (I guess pulling apart the actual caps is in order).
>
> As for cake struggling to cope:
>
> root at apu2:/home/d/git/tc-adv/tc# ./tc -s qdisc show dev enp2s0
>
> qdisc cake 8018: root refcnt 9 bandwidth 100Mbit diffserv4 flows rtt 100.0ms raw
>  Sent 219736818 bytes 157121 pkt (dropped 989289, overlimits 1152272 requeues 0)
>  backlog 449646b 319p requeues 0
>  memory used: 2658432b of 5000000b
>  capacity estimate: 100Mbit
>              Bulk    Best Effort     Video       Voice
>   thresh       100Mbit   93750Kbit      75Mbit      25Mbit
>   target         5.0ms       5.0ms       5.0ms       5.0ms
>   interval     100.0ms     100.0ms     100.0ms     100.0ms
>   pk_delay         0us       5.2ms        92us        48us
>   av_delay         0us       5.1ms         4us         2us
>   sp_delay         0us       5.0ms         4us         2us
>   pkts               0     1146649          31          49
>   bytes              0  1607004053        2258        8779
>   way_inds           0           0           0           0
>   way_miss           0          15           2           1
>   way_cols           0           0           0           0
>   drops              0      989289           0           0
>   marks              0           0           0           0
>   sp_flows           0           0           0           0
>   bk_flows           0           1           0           0
>   last_len           0        1514          66         138
>   max_len            0        1514         110         487
>
> ...
>
> But I am very puzzled as to why flow isolation would fail in the face
> of this overload.

And to simplify matters I got rid of the advanced qdiscs entirely,
switched back to htb+pfifo, and got the same ultimate result: the
test aborting...

Joy.

OK,

ethtool -s enp2s0 advertise 0x008 # 100mbit

Feeding packets in at 900Mbit into a 1000-packet fifo queue at 100Mbit
is predictably horrific... other flows get starved entirely, you
can't even type on the thing, and still eventually

[ 28]  28.00-29.00  sec  11.4 MBytes  95.7 Mbits/sec  0.120 ms
72598/80726 (90%)
[ 28]  29.00-30.00  sec  11.4 MBytes  95.7 Mbits/sec  0.119 ms
46187/54314 (85%)
[ 28] 189.00-190.00 sec  8.73 MBytes  73.2 Mbits/sec  0.162 ms
55276/61493 (90%)
[ 28] 190.00-191.00 sec  0.00 Bytes  0.00 bits/sec  0.162 ms  0/0 (-nan%)

vs:

[  4] 188.00-189.00 sec   105 MBytes   879 Mbits/sec  74614
iperf3: error - unable to write to stream socket: No route to host

Yea!  More people should do that to themselves. The system is bloody
useless with a full 1000-packet queue and way more useful with
fq_codel in this scenario...

But still, this ping should be surviving with fq_codel running against
one full-rate UDP flood, if it weren't for all the CPU being used up
throwing away packets. I think.

64 bytes from 172.26.64.200: icmp_seq=50 ttl=63 time=6.92 ms
64 bytes from 172.26.64.200: icmp_seq=52 ttl=63 time=7.15 ms
64 bytes from 172.26.64.200: icmp_seq=53 ttl=63 time=7.11 ms
64 bytes from 172.26.64.200: icmp_seq=55 ttl=63 time=6.68 ms
ping: sendmsg: No route to host
ping: sendmsg: No route to host
ping: sendmsg: No route to host

...

OK, tomorrow, Eric's new patch! A new, brighter day now that I've
burned this one melting 3 boxes into the ground. And perf.




-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org

