From: Maxime Bizon
To: Toke Høiland-Jørgensen
Cc: Dave Taht, Cake List
Date: Thu, 23 Apr 2020 14:33:29 +0200
Subject: Re: [Cake] Advantages to tightly tuning latency

On Thursday, 23 Apr 2020 at 13:57:25 (+0200), Toke Høiland-Jørgensen wrote:

Hello Toke,

> That is awesome! Please make sure you include the AQL patch for ath10k,
> it really works wonders, as Dave demonstrated:
>
> https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-March/002721.html

Was it in 5.4? We try to stick to LTS kernels.

> We're working on that in kernel land - ever heard of XDP? On big-iron
> servers we have no issues pushing 10s and 100s of Gbps in software
> (well, the latter only given enough cores to throw at the problem :)).
> There's not a lot of embedded platforms support as of yet, but we do
> have some people in the ARM world working on that.
>
> Personally, I do see embedded platforms as an important (future) use
> case for XDP, though, in particular for CPEs. So I would be very
> interested in hearing details about your particular platform, and your
> DPDK solution, so we can think about what it will take to achieve the
> same with XDP. If you're interested in this, please feel free to reach
> out :)

Last time I looked at XDP, its primary use cases were "early drop" / "anti-DDoS".

In our case, each packet has to be routed and NATed, we have VLAN tags, and we also have MAP-E for IPv4 traffic. So the vanilla forwarding path does multiple rounds of RX/TX because of the tunneling.

TBH, the hard work in our optimized forwarding code is figuring out what modifications to apply to each packet. Whether the modifications and TX are then done by XDP or by hand-written C code in the kernel is more of a detail, even though using XDP is much cleaner of course.

What the kernel has always lacked is what DaveM once called the "grand unified flow cache": the ability to do a single lookup and decide what to do with the packet.
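A minimal sketch of that single-lookup idea, expressed as an XDP program. This is illustrative only: the flow_table map and the flow_key/flow_action structures are hypothetical, and VLAN, MAP-E and the actual NAT header rewriting are omitted; it is not the fast path described above.

/* Sketch: one hash-map lookup decides the packet's fate. */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

/* Hypothetical flow key/action; a real CPE fast path would key on the
 * full 5-tuple and carry NAT / MAP-E rewrite state as well. */
struct flow_key {
	__u32 saddr;
	__u32 daddr;
	__u8  protocol;
};

struct flow_action {
	__u32 out_ifindex;	/* egress interface chosen at lookup time */
};

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 65536);
	__type(key, struct flow_key);
	__type(value, struct flow_action);
} flow_table SEC(".maps");

SEC("xdp")
int xdp_flow_fwd(struct xdp_md *ctx)
{
	void *data     = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;

	struct ethhdr *eth = data;
	if ((void *)(eth + 1) > data_end)
		return XDP_PASS;
	if (eth->h_proto != bpf_htons(ETH_P_IP))
		return XDP_PASS;	/* VLAN and MAP-E handling omitted */

	struct iphdr *iph = (void *)(eth + 1);
	if ((void *)(iph + 1) > data_end)
		return XDP_PASS;

	struct flow_key key = {
		.saddr    = iph->saddr,
		.daddr    = iph->daddr,
		.protocol = iph->protocol,
	};

	/* The single lookup that decides what to do with the packet. */
	struct flow_action *act = bpf_map_lookup_elem(&flow_table, &key);
	if (!act)
		return XDP_PASS;	/* unknown flow: normal kernel path */

	/* Header rewrites (NAT, encapsulation) would happen here. */
	return bpf_redirect(act->out_ifindex, 0);
}

char _license[] SEC("license") = "GPL";

Entries would be populated from a slower control path (for example, from conntrack once a flow is established), which is essentially what a flow-offload infrastructure does.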
Instead, what we have today is the bridge forwarding table, the IP routing table (it used to be a cache), the netfilter conntrack lookup, and multiple rounds of those if you do tunneling.

Once you have this "flow table" infrastructure, it becomes easy to offload forwarding, either to real hardware or to software (for example, a dedicated CPU core running in polling mode).

The good news is that it seems nftables is building this:

https://wiki.nftables.org/wiki-nftables/index.php/Flowtable

I'm still using iptables, but it seems the features I was missing, like TCPMSS, are now in nft as well, so I will have a look.

> Setting aside the fact that those single-stream tests ought to die a
> horrible death, I do wonder if it would be feasible to do a bit of
> 'optimising for the test'? With XDP we do have the ability to steer
> packets between CPUs based on arbitrary criteria, and while it is not as
> efficient as hardware-based RSS it may be enough to achieve line rate
> for a single TCP flow?

You cannot do that kind of steering for a single TCP flow at those rates, because you will get out-of-order packets and kill TCP performance.

I do not consider those single-stream tests unrealistic; this is exactly what happens if, say, you buy a game on Steam and download it.

--
Maxime