From: Toke Høiland-Jørgensen
To: Pete Heist, Cake List
Subject: Re: [Cake] cake infinite loop(?) with hfsc on one-armed router
Date: Fri, 04 Jan 2019 22:34:34 +0100
Message-ID: <8736q8yumt.fsf@toke.dk>
In-Reply-To: <5482A3CA-9C36-4DDE-A858-24D8467F70C7@heistp.net>

Pete Heist writes:

> Ok, the lockup goes away if you use no-split-gso on the cake qdiscs for
> the default traffic (noted below in the drr and hfsc cases with "!!! must
> use no-split-gso here !!!"). Only I’d like my 600 μs back. :)
>
> This smells of a bug Toke fixed on Sep 12, 2018 in
> 42e87f12ea5c390bf5eeb658c942bc810046160a, but then reverted in the next
> commit because it was fixed upstream. However, if I re-apply that commit,
> it still doesn’t fix it.
>
> Perhaps there are more cases where skb_reset_mac_len(skb) needs to be
> called somewhere for VLAN support?
>
> I managed to capture some output from what happens to hfsc:
>
> [ 683.864456] ------------[ cut here ]------------
> [ 683.869116] WARNING: CPU: 1 PID: 11 at net/sched/sch_hfsc.c:1427
> 0xf9ced4ef()

So this seems to be this line:

	WARN_ON(next_time == 0);

See https://elixir.bootlin.com/linux/v3.16.7/source/net/sched/sch_hfsc.c#L1427

Which seems to indicate that HFSC can't find the next class to schedule.
Not entirely sure why, nor why this only happens with CAKE as a qdisc.
But I don't think it's actually an infinite loop that's causing it...

-Toke