From: Toke Høiland-Jørgensen
To: Maxime Bizon
Cc: Dave Taht, Cake List
Subject: Re: [Cake] Advantages to tightly tuning latency
Date: Thu, 23 Apr 2020 20:35:15 +0200
Message-ID: <871roe6of0.fsf@toke.dk>
In-Reply-To: <20200423173111.GL28541@sakura>

Maxime Bizon writes:

> On Thursday 23 Apr 2020 à 18:42:11 (+0200), Toke Høiland-Jørgensen wrote:
>
>> Didn't make it in until 5.5, unfortunately... :(
>>
>> I can try to produce a patch that you can manually apply on top of 5.4
>> if you're interested?
>
> I could do it, but the thing I'm more worried about is the lack of
> test coverage from everyone else.

Yeah, I guess you'd be on the hook for backporting any follow-ups
yourself if you do that; maybe better to wait for the next longterm
kernel release, then...

>> Anyhow, my larger point was that we really do want to enable such use
>> cases for XDP; but we are lacking the details of what exactly is missing
>> before we can get to something that's useful / deployable. So any
>> details you could share about what feature set you are supporting in
>> your own 'fast path' implementation would be really helpful. As would
>> details about the hardware platform you are using. You can send them
>> off-list if you don't want to make it public, of course :)
>
> there is no hardware specific feature used, it's all software

I meant more details of your SOC platform. You already said it's
ARM-based, so I guess the most important missing piece is which (Linux)
driver does the Ethernet device(s) use?

> imagine this "simple" setup, pretty much what anyone's home router is
> doing:
>
> with + inside, private IPv4 address
> with IPv6, vlan interface over
> with IPv4, MAP-E tunnel over
>
> then:
> - IPv6 routing between and
> - IPv4 routing + NAT between and
>
> iptables would be filled with usual rules, per interface ALLOW rules
> in FORWARD chain, DNAT rules in PREROUTING to access LAN from WAN...
>
> and then you want this to be fast :)
>
> What we do is build a "flow" table on top of conntrack, so with a
> single lookup we find the flow, the destination interface, and what
> modifications to apply to the packet (L3 address to change, encap to
> add/remove, etc etc)
>
> Then we do this lookup more or less early in RX path, on our oldest
> platform we even had to do this from the ethernet driver, and do TX
> from there too, skipping qdisc layer and allowing cache maintenance
> hacks (partial invalidation and wback)

This sounds pretty much like what you'd do with an XDP program: Packet
comes in -> XDP program runs, parses the headers, does a flow lookup,
modifies the packet and redirects it out the egress interface. All in
one go, kernel never even builds an skb for the packet.

You can build most of that with XDP today, but you'd need to implement
all the lookups yourself using BPF maps; having a hook into the kernel
conntrack / flow tables would help with that. I guess I should look into
what happened with that hook.

Oh, and we also need to solve queueing in XDP; it's all line rate at the
moment, which is obviously not ideal for a CPE :)

> nftables with flowtables seems to have developed something that
> could replace our flow cache, but I'm not sure if it can handle our
> tunneling scheme yet. It even has a notion of offloaded flow for
> hardware that can support it.

Well, the nice thing about XDP is that you can just implement any custom
encapsulation that is not supported by the kernel yourself :)

> If you add an XDP offload to it, with an option to do the
> lookup/modification/tx at the layer you want, depending on the
> performance you need, whether you want qdisc.. that would give you
> pretty much the same thing we use today, but with a cleaner design.

Yup, I think so. What does your current solution do with packets that
are destined for the WiFi interface, BTW? Just punt them to the regular
kernel path?
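
To make the single-lookup fast path discussed above a bit more concrete,
here is a minimal user-space sketch in Python. It is only a model of the
idea, not BPF code and not anyone's actual implementation: `FlowKey`,
`FlowAction`, `Packet` and `FlowCache` are hypothetical names made up for
illustration. A real XDP program would presumably keep the table in a BPF
map and redirect the frame rather than return an interface index.

```python
from dataclasses import dataclass
from typing import Dict, NamedTuple, Optional


class FlowKey(NamedTuple):
    """Hypothetical lookup key: ingress interface plus the usual 5-tuple."""
    in_ifindex: int
    proto: int
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


@dataclass(frozen=True)
class FlowAction:
    """What to do with packets of a known flow (illustrative fields only)."""
    out_ifindex: int
    new_src_ip: Optional[str] = None    # source NAT rewrite, if any
    new_src_port: Optional[int] = None
    encap: Optional[str] = None         # label standing in for a tunnel header


@dataclass
class Packet:
    in_ifindex: int
    proto: int
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    encap: Optional[str] = None

    def key(self) -> FlowKey:
        return FlowKey(self.in_ifindex, self.proto, self.src_ip,
                       self.src_port, self.dst_ip, self.dst_port)


class FlowCache:
    def __init__(self) -> None:
        self._flows: Dict[FlowKey, FlowAction] = {}

    def learn(self, key: FlowKey, action: FlowAction) -> None:
        """The slow path installs an entry once it has handled a flow."""
        self._flows[key] = action

    def fast_path(self, pkt: Packet) -> Optional[int]:
        """Single lookup; return the egress ifindex, or None to punt."""
        action = self._flows.get(pkt.key())
        if action is None:
            return None                 # unknown flow: take the regular path
        if action.new_src_ip is not None:
            pkt.src_ip = action.new_src_ip
        if action.new_src_port is not None:
            pkt.src_port = action.new_src_port
        if action.encap is not None:
            pkt.encap = action.encap
        return action.out_ifindex
```

In this model, the first packet of a flow misses (`fast_path` returns
`None`) and goes through the regular path, which then calls `learn`;
later packets with the same key are rewritten and forwarded after one
dictionary lookup.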

>> Depends on the TCP stack (I think).
>
> I guess Linux deals with OFO better, but unfortunately that's not the
> main OS used by our subscribers...

Yeah, you really should do something about that ;)

>> Steam is perhaps a bad example as that is doing something very much like
>> bittorrent AFAIK; but point taken, people do occasionally run
>> single-stream downloads and want them to be fast. I'm just annoyed that
>> this becomes the *one* benchmark people run, to the exclusion of
>> everything else that has a much larger impact on the overall user
>> experience :/
>
> that one is easy
>
> convince ookla to add some kind of "latency under load" metric, and
> have them report it as a big red flag when too high, and even better
> add scary messages like "this connection is not suitable for online
> gaming".
>
> subscribers will bug telco, then telco will bug SoC vendors

Heh. Easy in theory, yeah. I do believe people on this list have tried
to convince them; no luck thus far :/

-Toke