[Cake] cake infinite loop(?) with hfsc on one-armed router

Pete Heist pete at heistp.net
Fri Jan 4 17:10:07 EST 2019


> On Jan 4, 2019, at 10:34 PM, Toke Høiland-Jørgensen <toke at toke.dk> wrote:
> 
> Pete Heist <pete at heistp.net> writes:
> 
>> Ok, the lockup goes away if you use no-split-gso on the cake qdiscs for the default traffic (noted below in the drr and hfsc cases with "!!! must use no-split-gso here !!!"). Only I’d like my 600 μs back. :)
>> 
>> This smells of a bug Toke fixed on Sep 12, 2018 in 42e87f12ea5c390bf5eeb658c942bc810046160a, but then reverted in the next commit because it was fixed upstream. However, if I re-apply that commit, it still doesn’t fix it.
>> 
>> Perhaps there are more cases where skb_reset_mac_len(skb) needs to be called somewhere for VLAN support?
>> 
>> I managed to capture some output from what happens to hfsc:
>> 
>> [  683.864456] ------------[ cut here ]------------
>> [  683.869116] WARNING: CPU: 1 PID: 11 at net/sched/sch_hfsc.c:1427 0xf9ced4ef()
> 
> So this seems to be this line:
> 
> WARN_ON(next_time == 0);
> 
> See https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_hfsc.c#L1427
> 
> Which seems to indicate that HFSC can't find the next class to schedule.
> Not entirely sure why, nor why this only happens with CAKE as a qdisc.
> But I don't think it's actually an infinite loop that's causing it...


Ok, fwiw one doesn’t actually need a one-armed router or VLANs to reproduce this. Just do this:

tc qdisc add dev $IFACE root handle 1: hfsc default 1
tc class add dev $IFACE parent 1: classid 1:1 hfsc ls rate $RATE ul rate $RATE
tc qdisc add dev $IFACE parent 1:1 cake # add no-split-gso here, or else…
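
For comparing the failing setup against the no-split-gso workaround mentioned in the quoted message, the commands above could be wrapped roughly as follows. This is only a sketch: the function name hfsc_cake_cmds is hypothetical, and eth0 and 100mbit are placeholder values for $IFACE and $RATE, not values from the original report. It only prints the commands (a dry run); piping the output to sh as root would apply them.

```shell
#!/bin/sh
# Hypothetical helper: print the tc commands for an HFSC root with a single
# default class and a cake leaf. Nothing is executed here (dry run).
# Usage: hfsc_cake_cmds IFACE RATE [CAKE_OPTS...]
hfsc_cake_cmds() {
    iface=$1
    rate=$2
    shift 2
    opts=$*   # any remaining arguments are passed to cake verbatim

    # Remove an existing root qdisc, ignoring the error if there is none
    echo "tc qdisc del dev $iface root 2>/dev/null || true"
    echo "tc qdisc add dev $iface root handle 1: hfsc default 1"
    echo "tc class add dev $iface parent 1: classid 1:1 hfsc ls rate $rate ul rate $rate"
    echo "tc qdisc add dev $iface parent 1:1 cake${opts:+ $opts}"
}

# Placeholder interface and rate (assumptions, not from the report)
hfsc_cake_cmds eth0 100mbit                # variant reported to trigger the warning
hfsc_cake_cmds eth0 100mbit no-split-gso   # variant with the workaround
```

To apply one variant on a test machine, something like `hfsc_cake_cmds eth0 100mbit no-split-gso | sh` (run as root) should do, after which `dmesg` can be checked for the sch_hfsc.c warning.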

I’ve tried it on kernels up to 4.9.0-8, but nothing newer. It’s not much of a priority for me now that I have a workaround for it...

Pete


