From: Georgios Amanakis
Date: Fri, 6 Jul 2018 09:34:15 -0400
To: Toke Høiland-Jørgensen
Cc: Pete Heist, Cake List
Subject: Re: [Cake] lockup with cake and veth
References: <761C7004-247B-42B4-B56C-2527816826C7@heistp.net> <87y3eoa3wn.fsf@toke.dk>
In-Reply-To: <87y3eoa3wn.fsf@toke.dk>
List-Id: Cake - FQ_codel the next generation

Thank you both for the great work!
I will give it a try, too.

George

On Fri, Jul 6, 2018, 9:29 AM Toke Høiland-Jørgensen <toke@toke.dk> wrote:
Pete Heist <pete@heistp.net> writes:

> I don’t know if we want to call this an issue, but...
>
> I’m seeing a lockup with cake (and also sfq, but not either pfifo or
> fq_codel), when run over veth devices. Two network namespaces are
> created, one for client and one for server, each with one veth device.
> Netem is added as the root qdisc with a delay of 1ms, and a leaf qdisc
> may be added. Lockups occur on my box when the leaf qdisc is either
> cake or sfq, and I'm running flent’s tcp_ndown test with >= 4 download
> streams. Note that I happen to be running on a quad-core.
>
> - If no leaf qdisc is added below netem, no lockup occurs.
> - If either pfifo or fq_codel is added below netem, no lockup occurs.
> - If either cake or sfq is the leaf, the lockup occurs.
>
> The symptoms (lockup with >= 4 streams on a quad-core box), and the
> fact that it occurs with both cake and sfq, make me think that it may
> simply have to do with the code not being re-entrant, which may be the
> case for veth, and this is just by design? maybe something that we
> should consider fixing but wouldn’t be a show-stopper? But that should
> be confirmed.
>
> I’ll keep investigating, but am sharing the scripts I’m running
> meanwhile in case anyone else wants to look. See README.txt in the
> attached...

Thanks for investigating! I'll take a look later. The fact that it
happens with sfq as well means it's probably not cake-specific, though,
so I don't think we should hold off on the upstream submission until
we've figured it out. Using leaf qdiscs with netem has been dodgy for a
while IIRC...

-Toke
_______________________________________________
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake
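
For anyone who wants to reproduce the setup Pete describes without the attachment, here is a minimal sketch in Python that only builds (and prints) the ip/tc commands; it does not run them. The namespace names, interface names, addresses and the 10: handle are made up for illustration, and putting netem on both veth ends is an assumption. The actual scripts are in Pete's attachment and may well differ.

```python
def setup_commands(leaf_qdisc=None, delay="1ms"):
    """Build ip/tc commands for a two-namespace veth test bed.

    All names and addresses below are hypothetical placeholders,
    not taken from the original scripts.
    """
    client_ns, server_ns = "client", "server"
    client_if, server_if = "veth-c", "veth-s"

    cmds = [
        # One namespace for the client, one for the server.
        ["ip", "netns", "add", client_ns],
        ["ip", "netns", "add", server_ns],
        # A veth pair, with one end moved into each namespace.
        ["ip", "link", "add", client_if, "type", "veth",
         "peer", "name", server_if],
        ["ip", "link", "set", client_if, "netns", client_ns],
        ["ip", "link", "set", server_if, "netns", server_ns],
    ]

    ends = (
        (client_ns, client_if, "10.9.9.1/24"),
        (server_ns, server_if, "10.9.9.2/24"),
    )
    for ns, dev, addr in ends:
        in_ns = ["ip", "netns", "exec", ns]
        cmds.append(in_ns + ["ip", "addr", "add", addr, "dev", dev])
        cmds.append(in_ns + ["ip", "link", "set", dev, "up"])
        # netem as the root qdisc (assumed here to be on both ends).
        cmds.append(in_ns + ["tc", "qdisc", "add", "dev", dev, "root",
                             "handle", "1:", "netem", "delay", delay])
        if leaf_qdisc:
            # Optional leaf qdisc attached below netem's single class 1:1.
            cmds.append(in_ns + ["tc", "qdisc", "add", "dev", dev,
                                 "parent", "1:1", "handle", "10:",
                                 leaf_qdisc])
    return cmds


if __name__ == "__main__":
    # Print the commands for the cake-as-leaf case; run them as root by hand.
    for cmd in setup_commands("cake"):
        print(" ".join(cmd))
```

Calling setup_commands() with no leaf gives the netem-only case, and passing "pfifo" or "fq_codel" gives the other cases Pete reports as not locking up; "cake" or "sfq" give the cases where he sees the lockup.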