[Cake] cake infinite loop(?) with hfsc on one-armed router

Pete Heist pete at heistp.net
Sat Jan 5 15:01:31 EST 2019


That first bug report looks decidedly similar to mine, but Toke would have to comment on the specifics. So far I see the patch to sch_codel.c you mentioned, plus a separate two-line patch that removes the warning in hfsc.c (https://patchwork.ozlabs.org/patch/933611/). It would be really good to confirm that the warning is truly bogus and that the author didn’t put it there for good reason, as Toke may have been thinking of a different way to fix hfsc.

Thanks for bringing this up! I see that I ought to search OpenWRT/kernel.org next time… :)

> On Jan 5, 2019, at 8:27 PM, Sebastian Moeller <moeller0 at gmx.de> wrote:
> 
> Dear all,
> 
> I am most likely wrong, but did you have a look at https://bugs.openwrt.org/index.php?do=details&task_id=1136 yet?
> Especially https://bugzilla.kernel.org/show_bug.cgi?id=109581 and https://www.spinics.net/lists/netdev/msg450655.html might be related to Pete's bug.
> Then again, I might be wrong as the whole flurry of emails went past my head quickly.
> 
> Best Regards
> 	Sebastian
> 
> 
>> On Jan 5, 2019, at 17:32, Toke Høiland-Jørgensen <toke at toke.dk> wrote:
>> 
>> Pete Heist <pete at heistp.net> writes:
>> 
>>>> On Jan 5, 2019, at 2:35 PM, Toke Høiland-Jørgensen <toke at toke.dk> wrote:
>>>> 
>>>> Pete Heist <pete at heistp.net> writes:
>>>> 
>>>>>> On Jan 5, 2019, at 2:10 PM, Toke Høiland-Jørgensen <toke at toke.dk> wrote:
>>>>>> 
>>>>>> Hmm, that's odd. Could you try adding this debugging line in
>>>>>> adjust_parent_qlen(), right before the sch->q.qlen += n line:
>>>>>> 
>>>>>> 		net_info_ratelimited("Adjusting parent qdisc %d with pkt += %d, len += %d",
>>>>>> 				     parentid, n, len);
>>>>>> 
>>>>>> And see if you actually get any of those lines in your dmesg?
>>>>> 
>>>>> I do see the messages twice, then not after that in the rest of the
>>>>> output...
>>>> 
>>>> Right. Looking at the HFSC code some more, I think the bug is actually
>>>> caused by another, but related, interaction between HFSC and CAKE.
>>>> 
>>>> Specifically, this line:
>>>> 
>>>> https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_hfsc.c#L1605
>>>> 
>>>> where HFSC checks whether the child queue len is 1, which it interprets
>>>> as the event that activates that queue. However, because CAKE splits the
>>>> packet, this check will fail, and the HFSC class will not be activated.
>>>> This also explains why you only see the bug with HFSC, and not with HTB
>>>> (although I do think that we still need to update the hierarchy).
>>>> 
>>>> The good news is that it is fairly simple to fix in HFSC. The bad news
>>>> is that it's something that's hard to work around from the out-of-tree
>>>> CAKE...
>>> 
>>> Aha, well, I wonder if we’ll see this problem with other qdiscs- maybe
>>> cbq, if I ever get a chance to try it (not hurrying yet). Ideally this
>>> interaction between qdiscs would be clarified somewhere, at some
>>> point. :)
>>> 
>>> Thanks a lot for doing the discovery though!
>> 
>> You're welcome, and thanks for your help :)
>> 
>>> We may not have hfsc+cake with GSO splitting on older kernels very
>>> soon, but what should we do with this? There’s nobody in MAINTAINERS
>>> for hfsc, so we may not get much of a response to any bug
>>> submissions...
>> 
>> $ ./scripts/get_maintainer.pl net/sched/sch_hfsc.c 
>> Jamal Hadi Salim <jhs at mojatatu.com> (maintainer:TC subsystem)
>> Cong Wang <xiyou.wangcong at gmail.com> (maintainer:TC subsystem)
>> Jiri Pirko <jiri at resnulli.us> (maintainer:TC subsystem)
>> "David S. Miller" <davem at davemloft.net> (maintainer:NETWORKING [GENERAL])
>> netdev at vger.kernel.org (open list:TC subsystem)
>> 
>> I'll submit a patch sometime next week, and also look into the qlen
>> adjustment for CAKE GSO splitting...
>> 
>> -Toke
>> _______________________________________________
>> Cake mailing list
>> Cake at lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cake
> 
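
For readers following the thread, the interaction Toke describes (HFSC activating a class only when the child's queue length is exactly 1 after an enqueue, while CAKE may turn one GSO packet into several) can be illustrated with a small userspace model. This is only a sketch and not kernel code: the struct and function names are hypothetical, the child qdisc is reduced to a packet counter, and the "was empty before enqueue" variant is just one possible way to make the check robust, not necessarily the patch Toke intends to submit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a child qdisc: it only tracks its packet count. */
struct child_queue {
	unsigned int qlen;
};

/* Hypothetical stand-in for an HFSC class with one child qdisc. */
struct hfsc_class_model {
	struct child_queue child;
	bool active;
};

/*
 * Models a child that may split one enqueued GSO packet into `segs`
 * packets. segs == 1 means the child does no splitting.
 */
static void child_enqueue(struct child_queue *q, unsigned int segs)
{
	assert(segs >= 1);
	q->qlen += segs;
}

/*
 * The check as described in the thread: the class is activated only if
 * the child's queue length is exactly 1 after the enqueue.
 */
static void enqueue_check_qlen_is_one(struct hfsc_class_model *cl,
				      unsigned int segs)
{
	child_enqueue(&cl->child, segs);
	if (cl->child.qlen == 1)
		cl->active = true;
}

/*
 * Assumed alternative: remember whether the child was empty before the
 * enqueue, so the result does not depend on how many packets it added.
 */
static void enqueue_check_was_empty(struct hfsc_class_model *cl,
				    unsigned int segs)
{
	bool was_empty = (cl->child.qlen == 0);

	child_enqueue(&cl->child, segs);
	if (was_empty)
		cl->active = true;
}
```

Calling enqueue_check_qlen_is_one() on an empty class with segs = 1 marks the class active, but with segs = 3 the queue length jumps from 0 to 3 and the class stays inactive even though packets are queued. enqueue_check_was_empty() activates the class in both cases.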


