Cake - FQ_codel the next generation
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Maxime Bizon <mbizon@freebox.fr>
Cc: Dave Taht <dave.taht@gmail.com>, Cake List <cake@lists.bufferbloat.net>
Subject: Re: [Cake] Advantages to tightly tuning latency
Date: Thu, 23 Apr 2020 18:42:11 +0200	[thread overview]
Message-ID: <877dy66tng.fsf@toke.dk> (raw)
In-Reply-To: <20200423123329.GG28541@sakura>

Maxime Bizon <mbizon@freebox.fr> writes:

> On Thursday 23 Apr 2020 à 13:57:25 (+0200), Toke Høiland-Jørgensen wrote:
>
> Hello Toke,
>
>> That is awesome! Please make sure you include the AQL patch for ath10k,
>> it really works wonders, as Dave demonstrated:
>> 
>> https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-March/002721.html
>
> Was it in 5.4? We try to stick to LTS kernels.

Didn't make it in until 5.5, unfortunately... :(

I can try to produce a patch that you can manually apply on top of 5.4
if you're interested?

>> We're working on that in kernel land - ever heard of XDP? On big-iron
>> servers we have no issues pushing 10s and 100s of Gbps in software
>> (well, the latter only given enough cores to throw at the problem :)).
>> There's not a lot of embedded platforms support as of yet, but we do
>> have some people in the ARM world working on that.
>> 
>> Personally, I do see embedded platforms as an important (future) use
>> case for XDP, though, in particular for CPEs. So I would be very
>> interested in hearing details about your particular platform, and your
>> DPDK solution, so we can think about what it will take to achieve the
>> same with XDP. If you're interested in this, please feel free to reach
>> out :)
>
> Last time I looked at XDP, its primary use cases were "early drop" /
> "anti-DDoS".

Yeah, that's the obvious use case (i.e., the easiest to implement). But
we really want it to be a general-purpose acceleration layer where you
can selectively use only the kernel facilities you need - or even skip
some of them entirely and reimplement an optimised subset fitting your
use case.
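
To make the "early drop" model concrete, here is a toy sketch in plain
userspace C - not actual BPF, and the policy is made up for
illustration. A real XDP program would do this parsing in a BPF program
attached at the driver and return the kernel's XDP_DROP / XDP_PASS
verdicts instead:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the real XDP return codes (XDP_DROP / XDP_PASS). */
enum verdict { VERDICT_DROP, VERDICT_PASS };

/* Toy early-drop policy: drop runt frames and anything that is not
 * IPv4 or IPv6, pass the rest on to the full stack. */
static enum verdict early_drop(const uint8_t *pkt, size_t len)
{
    if (len < 14)                   /* shorter than an Ethernet header */
        return VERDICT_DROP;
    uint16_t ethertype = (uint16_t)(pkt[12] << 8 | pkt[13]);
    if (ethertype == 0x0800 || ethertype == 0x86DD) /* IPv4 / IPv6 */
        return VERDICT_PASS;
    return VERDICT_DROP;
}
```

The point is only the shape of it: a small verdict function that sees
the raw frame before the stack does, so the common case never pays for
the parts of the stack it doesn't need.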

> In our case, each packet has to be routed and NATed, we have VLAN
> tags, and we also have MAP-E for IPv4 traffic. So in the vanilla
> forwarding path, this does multiple rounds of RX/TX because of
> tunneling.
>
> TBH, the hard work in our optimized forwarding code is figuring out
> what modifications to apply to each packet. Now whether the
> modifications and TX are done by XDP or by hand-written C code in the
> kernel is more of a detail, even though using XDP is much cleaner of
> course.
>
> What the kernel always lacked is what DaveM once called the "grand
> unified flow cache": the ability to do a single lookup and decide
> what to do with the packet. Instead we have the bridge forwarding
> table, the IP routing table (which used to be a cache), the netfilter
> conntrack lookup, and multiple rounds of those if you do tunneling.
>
> Once you have this "flow table" infrastructure, it becomes easy to
> offload forwarding, either to real hardware or to software (for
> example, a dedicated CPU core in polling mode).
>
> The good news is that it seems nftables is building this:
>
> https://wiki.nftables.org/wiki-nftables/index.php/Flowtable
>
> I'm still using iptables, but it seems that the features I was missing
> like TCPMSS are now in nft also, so I will have a look.


I find it useful to think of XDP as a 'software offload' - i.e. a fast
path where you implement the most common functionality as efficiently as
possible and dynamically fall back to the full stack for the edge cases.
Enabling lookups in the flow table from XDP would be an obvious thing to
do, for instance. There were some patches going by to enable some kind
of lookup into conntrack at some point, but I don't recall the details.
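
For reference, the basic shape of a flowtable ruleset looks something
like the below - the interface names are placeholders, and the exact
syntax (e.g. `flow add` vs the older `flow offload`) varies between nft
versions, so check the wiki page above against your version:

```
table inet filter {
    flowtable ft {
        hook ingress priority 0
        devices = { eth0, eth1 }
    }
    chain forward {
        type filter hook forward priority 0; policy accept;
        ip protocol tcp flow add @ft
    }
}
```

Once a flow's first packets have traversed the full forward path and
been added to the flowtable, subsequent packets are forwarded from the
ingress hook without another conntrack/routing round trip.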

Anyhow, my larger point was that we really do want to enable such use
cases for XDP; but we are lacking the details of what exactly is missing
before we can get to something that's useful / deployable. So any
details you could share about what feature set you are supporting in
your own 'fast path' implementation would be really helpful. As would
details about the hardware platform you are using. You can send them
off-list if you don't want to make it public, of course :)

>> Setting aside the fact that those single-stream tests ought to die a
>> horrible death, I do wonder if it would be feasible to do a bit of
>> 'optimising for the test'? With XDP we do have the ability to steer
>> packets between CPUs based on arbitrary criteria, and while it is not as
>> efficient as hardware-based RSS it may be enough to achieve line rate
>> for a single TCP flow?
>
> You cannot do steering for a single TCP flow at those rates because
> you will get out-of-order packets and kill TCP performance.

Depends on the TCP stack (I think). 
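
Just to illustrate why steering normally keeps a flow on one CPU: the
usual approach (in RSS, and in XDP cpumap-style steering) hashes the
flow tuple, so every packet of a flow lands on the same core. A toy
sketch - FNV-1a is just a stand-in hash, and the names are made up:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in 32-bit FNV-1a hash; real NICs/kernels typically use
 * Toeplitz or jhash, but any stable hash gives the same property. */
static uint32_t fnv1a(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

struct flow_tuple {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* Same tuple -> same hash -> same CPU: no reordering within a flow,
 * but also no way to spread a single flow across cores this way. */
static unsigned pick_cpu(const struct flow_tuple *ft, unsigned ncpus)
{
    return fnv1a((const uint8_t *)ft, sizeof *ft) % ncpus;
}
```

Spreading one flow over several CPUs means replacing this per-flow
hashing with something that preserves ordering across cores, which is
where it gets hard.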

> I do not consider those single-stream tests to be unrealistic; this
> is exactly what happens if, say, you buy a game on Steam and download
> it.

Steam is perhaps a bad example as that is doing something very much like
bittorrent AFAIK; but point taken, people do occasionally run
single-stream downloads and want them to be fast. I'm just annoyed that
this becomes the *one* benchmark people run, to the exclusion of
everything else that has a much larger impact on the overall user
experience :/

-Toke

