Many ISPs need the kinds of quality shaping cake can do
From: Herbert Wolverson <herberticus@gmail.com>
Cc: libreqos@lists.bufferbloat.net, Simon Sundberg <Simon.Sundberg@kau.se>
Subject: Re: [LibreQoS] In BPF pping - so far
Date: Mon, 17 Oct 2022 09:59:36 -0500
Message-ID: <CA+erpM6rxD44HcNeQ14RK6p-_PxRQiKkteC8gBDYqyaJ34mHpQ@mail.gmail.com>
In-Reply-To: <87bkqatu61.fsf@toke.dk>


I have no doubt that logging is the biggest slow-down, followed by some dumb
things (e.g. I just significantly increased performance by not accidentally
copying addresses twice...). I'm honestly pleasantly surprised by how
performant the debug logging is!

In the short term, this is a fork. I'm not planning on keeping it that way,
but I'm early enough into the task that I need the freedom to really mess
things up without upsetting upstream. ;-) At some point very soon, I'll post
a temporary GitHub repo with the hacked and messy version in it, with a view
to getting more eyes on it before it transforms into something more generally
useful, with the more embarrassing "written in a hurry" code cleaned up.

The per-stream RTT buffer looks great. I'll definitely try to use that. I was
a little alarmed to discover that running clean-up on the kernel side is
practically impossible, making a management daemon a necessity (since the XDP
map is long-running, the packet timing is likely to be running whether or not
LibreQOS is actively reading from it). A ready-summarized buffer format makes
a LOT of sense. At least until I run out of memory. ;-)

Thanks,
Herbert

On Mon, Oct 17, 2022 at 9:13 AM Toke Høiland-Jørgensen <toke@toke.dk> wrote:

> [ Adding Simon to Cc ]
>
> Herbert Wolverson via LibreQoS <libreqos@lists.bufferbloat.net> writes:
>
> > Hey,
> >
> > I've had some pretty good success with merging xdp-pping (
> > https://github.com/xdp-project/bpf-examples/blob/master/pping/pping.h )
> > into xdp-cpumap-tc ( https://github.com/xdp-project/xdp-cpumap-tc ).
> >
> > I ported over most of the xdp-pping code, and then changed the entry
> point
> > and packet parsing code to make use of the work already done in
> > xdp-cpumap-tc (it's already parsed a big chunk of the packet, no need to
> do
> > it twice). Then I switched the maps to per-cpu maps, and had to pin them
> -
> > otherwise the two tc instances don't properly share data. Right now,
> output
> > is just stubbed - I've still got to port the perfmap output code.
> Instead,
> > I'm dumping a bunch of extra data to the kernel debug pipe, so I can see
> > roughly what the output would look like.
> >
> > With debug enabled and just logging I'm now getting about 4.9 Gbits/sec
> on
> > single-stream iperf between two VMs (with a shaper VM in the middle). :-)
>
> Just FYI, that "just logging" is probably the biggest source of
> overhead, then. What Simon found was that sending the data from kernel
> to userspace is one of the most expensive bits of epping, at least when
> the number of data points goes up (which it does as additional flows are
> added).
>
> > So my question: how would you prefer to receive this data? I'll have to
> > write a daemon that provides userspace control (periodic cleanup as well
> as
> > reading the performance stream), so the world's kinda our oyster. I can
> > stick to Kathie's original format (and dump it to a named pipe,
> perhaps?),
> > a condensed format that only shows what you want to use, an efficient
> > binary format if you feel like parsing that...
>
> It would be great if we could combine efforts a bit here so we don't
> fork the codebase more than we have to. I.e., if "upstream" epping and
> whatever daemon you end up writing can agree on data format etc that
> would be fantastic! Added Simon to Cc to facilitate this :)
>
> Briefly what I've discussed before with Simon was to have the ability to
> aggregate the metrics in the kernel (WiP PR [0]) and have a userspace
> utility periodically pull them out. What we discussed was doing this
> using an LPM map (which is not in that PR yet). The idea would be that
> userspace would populate the LPM map with the keys (prefixes) they
> wanted statistics for (in LibreQOS context that could be one key per
> customer, for instance). Epping would then do a map lookup into the LPM,
> and if it gets a match it would update the statistics in that map entry
> (keeping a histogram of latency values seen, basically). Simon's PR
> below uses this technique where userspace will "reset" the histogram
> every time it loads it by swapping out two different map entries when it
> does a read; this allows you to control the sampling rate from
> userspace, and you'll just get the data since the last time you polled.
>
> I was thinking that if we all can agree on the map format, then your
> polling daemon could be one userspace "client" for that, and the epping
> binary itself could be another; but we could keep compatibility between
> the two, so we don't duplicate effort.
>
> Similarly, refactoring of the epping code itself so it can be plugged
> into the cpumap-tc code would be a good goal...
>
> -Toke
>
> [0] https://github.com/xdp-project/bpf-examples/pull/59
>



Thread overview: 20+ messages
2022-10-16  1:59 Herbert Wolverson
2022-10-16  2:26 ` Robert Chacón
2022-10-17 14:13 ` Toke Høiland-Jørgensen
2022-10-17 14:59   ` Herbert Wolverson [this message]
2022-10-17 15:14   ` Simon Sundberg
2022-10-17 18:45     ` Herbert Wolverson
2022-10-17 20:33       ` Robert Chacón
2022-10-18 18:01         ` Herbert Wolverson
2022-10-19 13:44           ` Herbert Wolverson
2022-10-19 13:58             ` Dave Taht
2022-10-19 14:01               ` Herbert Wolverson
2022-10-19 14:05                 ` Herbert Wolverson
2022-10-19 14:48                   ` Robert Chacón
2022-10-19 15:49                     ` dan
2022-10-19 16:10                       ` Herbert Wolverson
2022-10-19 16:13                         ` Dave Taht
2022-10-19 16:13                           ` Dave Taht
2022-10-22 14:32                             ` Herbert Wolverson
2022-10-22 14:44                               ` Dave Taht
2022-10-22 14:47                               ` Robert Chacón
