From: Dave Taht
Date: Thu, 22 Dec 2022 14:52:30 -0800
To: libreqos
Subject: [LibreQoS] Fwd: [PATCH 0/3] softirq: uncontroversial change
References: <20221222221244.1290833-1-kuba@kernel.org>
In-Reply-To: <20221222221244.1290833-1-kuba@kernel.org>

This is pretty neat.

---------- Forwarded message ---------
From: Jakub Kicinski
Date: Thu, Dec 22, 2022 at 2:40 PM
Subject: [PATCH 0/3] softirq: uncontroversial change
Cc: Jakub Kicinski

Catching up on LWN I ran across the article about softirq changes,
and then I noticed fresh patches in Peter's tree. So probably wise
for me to throw these out there.

My (can I say Meta's?) problem is the opposite to what the
RT-sensitive people complain about. In the current scheme, once
ksoftirqd is woken no network processing happens until it runs.

When networking gets overloaded - that's probably fair, the problem
is that we confuse latency tweaks with overload protection. We have
a need_resched() in the loop condition (which is a latency tweak).
Most often we defer to ksoftirqd because we're trying to be nice
and let user space respond quickly, not because there is an
overload. But the user space may not be nice, and sit on the CPU
for 10ms+. Also the sirq's "work allowance" is 2ms, which is
uncomfortably close to the timer tick, but that's another story.

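
For illustration only (this is not code from the patches or from kernel/softirq.c): a minimal userspace sketch of the current scheme as described above. Every name here (struct softirq_cpu, after_pass(), can_process_inline(), ksoftirqd_ran(), WORK_ALLOWANCE_US) is hypothetical, and the model deliberately ignores everything except the 2ms work allowance and the need_resched() check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; these are not kernel symbols. */
#define WORK_ALLOWANCE_US 2000 /* the 2ms "work allowance" mentioned in the text */

struct softirq_cpu {
    bool ksoftirqd_woken; /* set once pending work is deferred to the thread */
};

/*
 * Called when softirqs are still pending after one processing pass.
 * Returns true if processing may continue inline, false if the work
 * was handed to ksoftirqd instead.
 */
static bool after_pass(struct softirq_cpu *cpu, uint64_t elapsed_us,
                       bool resched_wanted)
{
    /* Loop condition: within the time budget and nobody wants the CPU. */
    if (elapsed_us < WORK_ALLOWANCE_US && !resched_wanted)
        return true;

    /* The same path is taken for a polite yield and for real overload. */
    cpu->ksoftirqd_woken = true;
    return false;
}

/* Current scheme: once ksoftirqd is woken, nothing is processed inline. */
static bool can_process_inline(const struct softirq_cpu *cpu)
{
    return !cpu->ksoftirqd_woken;
}

/* Inline processing resumes only after the thread has actually run. */
static void ksoftirqd_ran(struct softirq_cpu *cpu)
{
    cpu->ksoftirqd_woken = false;
}
```

In this model a single yield caused by resched_wanted blocks inline processing for as long as the scheduler takes to run ksoftirqd, which is the stall window the message describes.
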
We have a sirq latency tracker in our prod kernel which catches 8ms+
stalls of net Tx (packets queued to the NIC but there is no NAPI
cleanup within 8ms), and with these patches applied on 5.19 a fully
loaded web machine sees a drop in stalls from 1.8 stalls/sec to
0.16/sec. I also see a 50% drop in outgoing TCP retransmissions and
~10% drop in non-TLP incoming ones. This is not a network-heavy
workload so most of the rtx are due to scheduling artifacts.

The network latency in a datacenter is somewhere around 1000x lower
than the scheduling granularity (around 10us).

These patches (patch 2 is "the meat") change what we recognize as
overload. Instead of just checking whether "ksoftirqd is woken", the
code also caps how long we consider ourselves to be in overload, a
time limit which is different based on whether we yield due to real
resource exhaustion vs just hitting that need_resched().

I hope the core concept is not entirely idiotic. It'd be great if we
could get this in or fold an equivalent concept into ongoing work
from others, because due to various "scheduler improvements" every
time we upgrade the production kernel this problem is getting
worse :(

Jakub Kicinski (3):
  softirq: rename ksoftirqd_running() -> ksoftirqd_should_handle()
  softirq: avoid spurious stalls due to need_resched()
  softirq: don't yield if only expedited handlers are pending

 kernel/softirq.c | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

--
2.38.1

-- 
This song goes out to all the folk that thought Stadia would work:
https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
Dave Täht CEO, TekLibre, LLC
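
For illustration only (not taken from the patches): a minimal userspace sketch of the overload cap described in the cover letter, where "ksoftirqd is woken" only counts as overload for a limited time and the limit depends on why processing yielded. The two limit values, struct softirq_cpu_capped and defer_to_ksoftirqd() are assumptions made up for this sketch. ksoftirqd_should_handle() borrows the name from patch 1, but the body here is a guess at the concept, not the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Why inline softirq processing stopped. */
enum yield_reason {
    YIELD_RESCHED,   /* need_resched() was set: a latency tweak */
    YIELD_EXHAUSTED  /* the work budget really ran out: overload */
};

/* Hypothetical limits, chosen only to make the example concrete. */
#define RESCHED_LIMIT_US     2000
#define EXHAUSTED_LIMIT_US 100000

struct softirq_cpu_capped {
    bool ksoftirqd_woken;       /* work has been deferred to the thread */
    uint64_t overload_until_us; /* deferral is only honoured until then */
};

/* Hand work to ksoftirqd and record how long to treat this as overload. */
static void defer_to_ksoftirqd(struct softirq_cpu_capped *cpu,
                               uint64_t now_us, enum yield_reason why)
{
    uint64_t limit = (why == YIELD_RESCHED) ? RESCHED_LIMIT_US
                                            : EXHAUSTED_LIMIT_US;

    cpu->ksoftirqd_woken = true;
    cpu->overload_until_us = now_us + limit;
}

/* Leave the work to ksoftirqd only while the overload window is open. */
static bool ksoftirqd_should_handle(const struct softirq_cpu_capped *cpu,
                                    uint64_t now_us)
{
    return cpu->ksoftirqd_woken && now_us < cpu->overload_until_us;
}
```

With these assumed limits, a yield caused by need_resched() at t=1000us stops blocking inline processing at t=3000us even if ksoftirqd has not run yet, while a yield caused by real exhaustion keeps deferring until t=101000us.
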