[Cake] cake infinite loop(?) with hfsc on one-armed router
Pete Heist
pete at heistp.net
Sat Jan 5 15:01:31 EST 2019
That first bug report looks decidedly similar to mine, but Toke would have to comment on the specifics. So far I see the patch to sch_codel.c you mentioned and another two-liner that removes the warning in hfsc.c (https://patchwork.ozlabs.org/patch/933611/). It would be really good to know that the warning is truly bogus and wasn’t put there by the author for good reason, as Toke may have been thinking of a different way to fix hfsc.
Thanks for bringing this up! I see that I ought to search OpenWRT/kernel.org next time… :)
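
For anyone skimming the thread, here is a minimal userspace sketch of the activation problem Toke describes in the quoted text: a parent that treats "child qlen == 1 after enqueue" as the empty-to-non-empty event misses it when the child splits one packet into several segments. This is not kernel code. The struct, the function names and the segs_per_pkt field are hypothetical stand-ins, and the "was empty before enqueue" variant is only one plausible way to make the check robust, not necessarily what an eventual hfsc patch would do.

```c
#include <stdbool.h>

/* Hypothetical model of a child qdisc; not the kernel's struct Qdisc. */
struct child_qdisc {
    unsigned int qlen;         /* packets currently queued in the child */
    unsigned int segs_per_pkt; /* 1 = no splitting, >1 = GSO-style split */
};

/* A splitting child queues one entry per segment of the packet. */
static void child_enqueue(struct child_qdisc *q)
{
    q->qlen += q->segs_per_pkt;
}

/*
 * Parent-side check modelled on the behaviour described in the thread:
 * the class is activated only if the child holds exactly one packet
 * right after the enqueue.
 */
static bool activates_with_qlen_check(unsigned int segs_per_pkt)
{
    struct child_qdisc child = { .qlen = 0, .segs_per_pkt = segs_per_pkt };

    child_enqueue(&child);
    return child.qlen == 1;
}

/*
 * Assumed alternative: record whether the child was empty before the
 * enqueue, so the result does not depend on how many entries were added.
 */
static bool activates_with_was_empty_check(unsigned int segs_per_pkt)
{
    struct child_qdisc child = { .qlen = 0, .segs_per_pkt = segs_per_pkt };
    bool was_empty = (child.qlen == 0);

    child_enqueue(&child);
    return was_empty && child.qlen > 0;
}
```

With segs_per_pkt = 1 both checks report activation. With segs_per_pkt = 3 the first check reports no activation, which matches the class never being scheduled, while the second still does.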
> On Jan 5, 2019, at 8:27 PM, Sebastian Moeller <moeller0 at gmx.de> wrote:
>
> Dear all,
>
> I am most likely wrong, but did you have a look at https://bugs.openwrt.org/index.php?do=details&task_id=1136 yet?
> Especially https://bugzilla.kernel.org/show_bug.cgi?id=109581 and https://www.spinics.net/lists/netdev/msg450655.html might be related to Pete's bug.
> Then again, I might be wrong as the whole flurry of emails went past my head quickly.
>
> Best Regards
> Sebastian
>
>
>> On Jan 5, 2019, at 17:32, Toke Høiland-Jørgensen <toke at toke.dk> wrote:
>>
>> Pete Heist <pete at heistp.net> writes:
>>
>>>> On Jan 5, 2019, at 2:35 PM, Toke Høiland-Jørgensen <toke at toke.dk> wrote:
>>>>
>>>> Pete Heist <pete at heistp.net> writes:
>>>>
>>>>>> On Jan 5, 2019, at 2:10 PM, Toke Høiland-Jørgensen <toke at toke.dk> wrote:
>>>>>>
>>>>>> Hmm, that's odd. Could you try adding this debugging line in
>>>>>> adjust_parent_qlen(), right before the sch->q.qlen += n line:
>>>>>>
>>>>>> net_info_ratelimited("Adjusting parent qdisc %d with pkt += %d, len += %d\n",
>>>>>>                      parentid, n, len);
>>>>>>
>>>>>> And see if you actually get any of those lines in your dmesg?
>>>>>
>>>>> I do see the messages twice, then not after that in the rest of the
>>>>> output...
>>>>
>>>> Right. Looking at the HFSC code some more, I think the bug is actually
>>>> caused by another, but related, interaction between HFSC and CAKE.
>>>>
>>>> Specifically, this line:
>>>>
>>>> https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_hfsc.c#L1605
>>>>
>>>> where HFSC checks whether the child queue len is 1, which it interprets
>>>> as the event that activates that queue. However, because CAKE splits the
>>>> packet, this check will fail, and the HFSC class will not be activated.
>>>> This also explains why you only see the bug with HFSC, and not with HTB
>>>> (although I do think that we still need to update the hierarchy).
>>>>
>>>> The good news is that it is fairly simple to fix in HFSC. The bad news
>>>> is that it's something that's hard to work around from the out-of-tree
>>>> CAKE...
>>>
>>> Aha, well, I wonder if we’ll see this problem with other qdiscs- maybe
>>> cbq, if I ever get a chance to try it (not hurrying yet). Ideally this
>>> interaction between qdiscs would be clarified somewhere, at some
>>> point. :)
>>>
>>> Thanks a lot for doing the discovery though!
>>
>> You're welcome, and thanks for your help :)
>>
>>> We may not have hfsc+cake with GSO splitting on older kernels very
>>> soon, but what should we do with this? There’s nobody in MAINTAINERS
>>> for hfsc, so we may not get much of a response to any bug
>>> submissions...
>>
>> $ ./scripts/get_maintainer.pl net/sched/sch_hfsc.c
>> Jamal Hadi Salim <jhs at mojatatu.com> (maintainer:TC subsystem)
>> Cong Wang <xiyou.wangcong at gmail.com> (maintainer:TC subsystem)
>> Jiri Pirko <jiri at resnulli.us> (maintainer:TC subsystem)
>> "David S. Miller" <davem at davemloft.net> (maintainer:NETWORKING [GENERAL])
>> netdev at vger.kernel.org (open list:TC subsystem)
>>
>> I'll submit a patch sometime next week, and also look into the qlen
>> adjustment for CAKE GSO splitting...
>>
>> -Toke