* [LibreQoS] Before/After Performance Comparison (Monitoring Mode)
@ 2022-11-05 16:44 Robert Chacón
  2022-11-08 14:23 ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 8+ messages in thread
From: Robert Chacón @ 2022-11-05 16:44 UTC (permalink / raw)
  To: libreqos


I was hoping to add a monitoring mode which could be used before "turning
on" LibreQoS, ideally before v1.3 release. This way operators can really
see what impact it's having on end-user and network latency.

The simplest solution I can think of is to implement Monitoring Mode using
cpumap-pping as we already do - with plain HTB and leaf classes with no
CAKE qdisc applied, and with HTB and leaf class rates set to impossibly
high amounts (no plan enforcement). This would allow for before/after
comparisons of Nodes (Access Points). My only concern with this approach is
that HTB, even with rates set impossibly high, may not be truly
transparent. It would be pretty easy to implement though.
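
For concreteness, here's a rough sketch of the wide-open HTB tree I have in
mind (Python, placeholder interface and class ids - not the actual LibreQoS
code): every class gets rate == ceil == 10gbit, and no CAKE qdisc is
attached to the leaves.

import subprocess

IFACE = "eth1"                     # placeholder interface
NODES = {"AP_A": 2, "AP_B": 3}     # hypothetical AP name -> HTB minor id

def tc(args):
    subprocess.run(["tc"] + args.split(), check=True)

tc(f"qdisc replace dev {IFACE} root handle 1: htb default 1")
tc(f"class add dev {IFACE} parent 1: classid 1:1 htb rate 10gbit ceil 10gbit")

# One class per Access Point: rate == ceil means no borrowing and no plan
# enforcement, and we deliberately skip attaching a leaf qdisc.
for _ap, minor in NODES.items():
    tc(f"class add dev {IFACE} parent 1:1 classid 1:{minor} "
       f"htb rate 10gbit ceil 10gbit")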

Alternatively we could use ePPing
<https://github.com/xdp-project/bpf-examples/tree/master/pping> but I worry
about throughput and the possibility of latency tracking being slightly
different from cpumap-pping, which could limit the utility of a comparison.
We'd have to match IPs in a way that's a bit more involved here.

Thoughts?


* Re: [LibreQoS] Before/After Performance Comparison (Monitoring Mode)
  2022-11-05 16:44 [LibreQoS] Before/After Performance Comparison (Monitoring Mode) Robert Chacón
@ 2022-11-08 14:23 ` Toke Høiland-Jørgensen
  2022-11-08 15:44   ` Robert Chacón
  2022-11-08 16:02   ` Herbert Wolverson
  0 siblings, 2 replies; 8+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-11-08 14:23 UTC (permalink / raw)
  To: Robert Chacón, libreqos

Robert Chacón via LibreQoS <libreqos@lists.bufferbloat.net> writes:

> I was hoping to add a monitoring mode which could be used before "turning
> on" LibreQoS, ideally before v1.3 release. This way operators can really
> see what impact it's having on end-user and network latency.
>
> The simplest solution I can think of is to implement Monitoring Mode using
> cpumap-pping as we already do - with plain HTB and leaf classes with no
> CAKE qdisc applied, and with HTB and leaf class rates set to impossibly
> high amounts (no plan enforcement). This would allow for before/after
> comparisons of Nodes (Access Points). My only concern with this approach is
> that HTB, even with rates set impossibly high, may not be truly
> transparent. It would be pretty easy to implement though.
>
> Alternatively we could use ePPing
> <https://github.com/xdp-project/bpf-examples/tree/master/pping> but I worry
> about throughput and the possibility of latency tracking being slightly
> different from cpumap-pping, which could limit the utility of a comparison.
> We'd have to match IPs in a way that's a bit more involved here.
>
> Thoughts?

Well, this kind of thing is exactly why I think concatenating the two
programs (cpumap and pping) into a single BPF program was a mistake:
those are two distinct pieces of functionality, and you want to be able
to run them separately, as your "monitor mode" use case shows. The
overhead of parsing the packet twice is trivial compared to everything
else those apps are doing, so I don't think the gain is worth losing
that flexibility.

So I definitely think using the regular epping is the right thing to do
here. Simon is looking into improving its reporting so it can be
per-subnet using a user-supplied configuration file for the actual
subnets, which should hopefully make this feasible. I'm sure he'll chime
in here once he has something to test and/or with any questions that pop
up in the process.

Longer term, I'm hoping all of Herbert's other improvements to epping
reporting/formatting can make it into upstream epping, so LibreQoS can
just use that for everything :)

-Toke


* Re: [LibreQoS] Before/After Performance Comparison (Monitoring Mode)
  2022-11-08 14:23 ` Toke Høiland-Jørgensen
@ 2022-11-08 15:44   ` Robert Chacón
  2022-11-08 16:04     ` Herbert Wolverson
  2022-11-08 16:02   ` Herbert Wolverson
  1 sibling, 1 reply; 8+ messages in thread
From: Robert Chacón @ 2022-11-08 15:44 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: libreqos


Point taken!

Before receiving this email I had started work on it. It's on a branch on
GitHub now <https://github.com/LibreQoE/LibreQoS/tree/monitor-mode/v1.3>.
It uses cpumap-pping and keeps HTB, but overrides all HTB class and leaf
rates to be 10Gbps so that borrowing isn't taking place anywhere - so we
can be as transparent as possible.

I'll take another shot at monitoring mode with ePPing instead.

On Tue, Nov 8, 2022 at 7:23 AM Toke Høiland-Jørgensen <toke@toke.dk> wrote:

> Robert Chacón via LibreQoS <libreqos@lists.bufferbloat.net> writes:
>
> > I was hoping to add a monitoring mode which could be used before "turning
> > on" LibreQoS, ideally before v1.3 release. This way operators can really
> > see what impact it's having on end-user and network latency.
> >
> > The simplest solution I can think of is to implement Monitoring Mode
> using
> > cpumap-pping as we already do - with plain HTB and leaf classes with no
> > CAKE qdisc applied, and with HTB and leaf class rates set to impossibly
> > high amounts (no plan enforcement). This would allow for before/after
> > comparisons of Nodes (Access Points). My only concern with this approach
> is
> > that HTB, even with rates set impossibly high, may not be truly
> > transparent. It would be pretty easy to implement though.
> >
> > Alternatively we could use ePPing
> > <https://github.com/xdp-project/bpf-examples/tree/master/pping> but I
> worry
> > about throughput and the possibility of latency tracking being slightly
> > different from cpumap-pping, which could limit the utility of a
> comparison.
> > We'd have to match IPs in a way that's a bit more involved here.
> >
> > Thoughts?
>
> Well, this kind of thing is exactly why I think concatenating the two
> programs (cpumap and pping) into a single BPF program was a mistake:
> those are two distinct pieces of functionality, and you want to be able
> to run them separately, as your "monitor mode" use case shows. The
> overhead of parsing the packet twice is trivial compared to everything
> else those apps are doing, so I don't think the gain is worth losing
> that flexibility.
>
> So I definitely think using the regular epping is the right thing to do
> here. Simon is looking into improving its reporting so it can be
> per-subnet using a user-supplied configuration file for the actual
> subnets, which should hopefully make this feasible. I'm sure he'll chime
> in here once he has something to test and/or with any questions that pop
> up in the process.
>
> Longer term, I'm hoping all of Herbert's other improvements to epping
> reporting/formatting can make it into upstream epping, so LibreQoS can
> just use that for everything :)
>
> -Toke
>


-- 
Robert Chacón
CEO | JackRabbit Wireless LLC <http://jackrabbitwireless.com>
Dev | LibreQoS.io


* Re: [LibreQoS] Before/After Performance Comparison (Monitoring Mode)
  2022-11-08 14:23 ` Toke Høiland-Jørgensen
  2022-11-08 15:44   ` Robert Chacón
@ 2022-11-08 16:02   ` Herbert Wolverson
  2022-11-08 18:53     ` Simon Sundberg
  1 sibling, 1 reply; 8+ messages in thread
From: Herbert Wolverson @ 2022-11-08 16:02 UTC (permalink / raw)
  Cc: libreqos


I'd probably go with an HTB queue per target IP group, and not attach a
discipline to it - with only a ceiling set at the top. That'll do truly
minimal shaping, and you can still use cpumap-pping to get the data you
want. (The current branch I'm testing/working on also reports the local IP
address, which I'm finding pretty helpful). Otherwise, you're going to
make building both tools part of the setup process* and still have
to parse IP pairs for results. Hopefully, there's a decent Python
LPM Trie out there (to handle subnets and IPv6) to make that easier.
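
Even without a proper trie, a stdlib-only longest-prefix match would be
enough to get started - a quick sketch (subnets are made up), handles IPv4
and IPv6, and a real LPM trie would just make it faster at scale:

import ipaddress

# Configured subnets, parsed once and sorted most-specific-first.
SUBNETS = sorted(
    (ipaddress.ip_network(s) for s in ["100.64.1.0/24", "2001:db8:1::/48"]),
    key=lambda n: n.prefixlen,
    reverse=True,
)

def match_subnet(ip_str):
    """Return the most specific configured subnet containing ip_str, or None."""
    ip = ipaddress.ip_address(ip_str)
    for net in SUBNETS:
        if ip.version == net.version and ip in net:
            return net
    return None

# match_subnet("100.64.1.37") -> IPv4Network('100.64.1.0/24')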

I'm (obviously!) going to respectfully disagree with Toke on this one.
I didn't dive into cpumap-pping for fun; I tried *really hard* to work
with the original epping/xdp-pping. It's a great tool, really fantastic
work.
It's also not really designed for the same purpose.

The original Pollere pping is wonderful, but isn't going to scale - the
way it ingests packets isn't going to scale across multiple CPUs,
and having a core pegging 100% on a busy shaper box was
degrading overall performance. epping solves the scalability
issue wonderfully, and (rightly) remains focused on giving you
a complete report of all of the data it accessed while it was
running. If you want to run a monitoring session and see what's
going on, it's a *fantastic* way to do it - serious props there. I
benchmarked it at about 15 gbit/s on single-stream testing,
which is *really* impressive (no other BPF programs active,
no shaping).

The first issue I ran into is that stacking XDP programs isn't
all that well defined a process. You can make it work, but
it gets messy when both programs have setup/teardown
routines. I kinda, sorta managed to get the two running at
once, and it mostly worked. There *really* needs to be an
easier way that doesn't run headlong into Ubuntu's lovely
"you updated the kernel and tools, we didn't think you'd
need bpftool so we didn't include it" issues, adjusting scripts
until neither says "oops, there's already an XDP program
here! Bye!". I know that this is a pretty new thing, but the
tooling hasn't really caught up yet to make this a comfortable
process. I'm pretty sure I spent more time trying to run both
at once than it took to make a combined version that sort-of
ran. (I had a working version in an afternoon)

With the two literally concatenated (but compiled together),
it worked - but there was a noticeable performance cost. That's
where orthogonal design choices hit - epping/xdp-pping is
sampling everything (it can even go looking for DNS and ICMP!).
A QoE box *really* needs to go out of its way to avoid adding
any latency, otherwise you're self-defeating. A representative
sample is really all you need - while for epping's target,
a really detailed sample is what you need. When faced with
differing design goals like that, my response is always to
make a tool that very efficiently does what I need.

Combining the packet parsing** was the obvious low-hanging
fruit. It is faster, but not by very much. But I really hate
it when code repeats itself. It seriously set off my OCD
watching both find the IP header offset, determine protocol
(IPv4 vs IPv6), etc. Small performance win.

Bailing out as soon as we determine that we aren't looking
at a TCP packet was a big performance win. You can achieve
the same by carefully setting up the "config" for epping,
but there's not a lot of point in keeping the DNS/ICMP code
when it's not needed. Still a performance win, and not
needing to maintain a configuration (that will be the same
each time) makes setup easier.

Running by default on the TC (egress) rather than XDP
is a big win, too - but only after cpumap-tc has shunted
processing to the appropriate CPU. Now processing is
divided between CPUs, and cache locality is more likely
to happen - the packet we are reading is in the local
core's cache when cpumap-pping reads it, and there's
a decent chance it'll still be there (at least L2) by the time
it gets to the actual queue discipline.

Changing the reporting mechanism was a really big win,
in terms of performance and the tool aligning with what's
needed:
* Since xdp-cpumap has already done the work to determine
  that a flow belongs in TC handle X:Y - and mapping RTT
  performance to customer/circuit is *exactly* what we're
  trying to do - it just makes sense to take that value and
  use it as a key for the results.
* Since we don't care about every packet - rather, we want
  a periodic representative sample - we can use an efficient
  per TC handle circular buffer in which to store results.
* In turn, I realized that we could just *sample* rather than
  continually churning the circular buffer. So each flow's
  buffer has a capacity, and the monitor bails out once a flow
  buffer is full of RTT results. Really big performance win.
  "return" is a really fast call. :-) (The buffers are reset when
  read)
* Perfmaps are great, but I didn't want to require a daemon
  run (mapping the perfmap results) and in turn output
  results in a LibreQoS-friendly format when a much simpler
  mechanism gets the same result - without another program
  sitting handling the mmap'd performance flows all the time.
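
A userspace model of that sampling scheme, just to illustrate the idea
(Python, made-up names - the real thing lives in eBPF maps keyed by TC
handle): each handle gets a small fixed-capacity buffer, new samples are
ignored once it's full, and reading drains it.

CAPACITY = 16                 # RTT samples kept per TC handle per interval

rtt_buffers = {}              # (major, minor) TC handle -> list of RTTs (ms)

def record_rtt(handle, rtt_ms):
    buf = rtt_buffers.setdefault(handle, [])
    if len(buf) >= CAPACITY:
        return                # buffer full: bail out early, "return" is cheap
    buf.append(rtt_ms)

def drain(handle):
    """What the periodic reader would do: take the samples and reset."""
    return rtt_buffers.pop(handle, [])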

So the result was really fast and does exactly what I need.
It's not meant to be "better" than the original; for the original's
purpose, it's not great. For rapidly building QoE metrics on
a live shaper box, with absolutely minimal overhead and a
focus on sipping the firehose rather than trying to drink it
all - it's about right.

Philosophically, I've always favored tools that do exactly
what I need.

Likewise, if someone would like to come up with a really
good recipe that runs both rather than a combined
program - that'd be awesome. If it can match the
performance of cpumap-pping, I'll happily switch
BracketQoS to use it.

You're obviously welcome to any of the code; if it can help
the original projects, that's wonderful. Right now, I don't
have the time to come up with a better way of layering
XDP/TC programs!

* - I keep wondering if I shouldn't roll some .deb packages
and a configurator to make setup easier!

** - there *really* should be a standard flow dissector. The
Linux traffic shaper's dissector can handle VLAN tags and
an MPLS header. xdp-cpumap-tc handles VLANs with
aplomb and doesn't touch MPLS. epping calls out to the
xdp-project's dissector which appears to handle
VLANs and also doesn't touch MPLS.

Thanks,
Herbert

On Tue, Nov 8, 2022 at 8:23 AM Toke Høiland-Jørgensen via LibreQoS <
libreqos@lists.bufferbloat.net> wrote:

> Robert Chacón via LibreQoS <libreqos@lists.bufferbloat.net> writes:
>
> > I was hoping to add a monitoring mode which could be used before "turning
> > on" LibreQoS, ideally before v1.3 release. This way operators can really
> > see what impact it's having on end-user and network latency.
> >
> > The simplest solution I can think of is to implement Monitoring Mode
> using
> > cpumap-pping as we already do - with plain HTB and leaf classes with no
> > CAKE qdisc applied, and with HTB and leaf class rates set to impossibly
> > high amounts (no plan enforcement). This would allow for before/after
> > comparisons of Nodes (Access Points). My only concern with this approach
> is
> > that HTB, even with rates set impossibly high, may not be truly
> > transparent. It would be pretty easy to implement though.
> >
> > Alternatively we could use ePPing
> > <https://github.com/xdp-project/bpf-examples/tree/master/pping> but I
> worry
> > about throughput and the possibility of latency tracking being slightly
> > different from cpumap-pping, which could limit the utility of a
> comparison.
> > We'd have to match IPs in a way that's a bit more involved here.
> >
> > Thoughts?
>
> Well, this kind of thing is exactly why I think concatenating the two
> programs (cpumap and pping) into a single BPF program was a mistake:
> those are two distinct pieces of functionality, and you want to be able
> to run them separately, as your "monitor mode" use case shows. The
> overhead of parsing the packet twice is trivial compared to everything
> else those apps are doing, so I don't think the gain is worth losing
> that flexibility.
>
> So I definitely think using the regular epping is the right thing to do
> here. Simon is looking into improving its reporting so it can be
> per-subnet using a user-supplied configuration file for the actual
> subnets, which should hopefully make this feasible. I'm sure he'll chime
> in here once he has something to test and/or with any questions that pop
> up in the process.
>
> Longer term, I'm hoping all of Herbert's other improvements to epping
> reporting/formatting can make it into upstream epping, so LibreQoS can
> just use that for everything :)
>
> -Toke
>


* Re: [LibreQoS] Before/After Performance Comparison (Monitoring Mode)
  2022-11-08 15:44   ` Robert Chacón
@ 2022-11-08 16:04     ` Herbert Wolverson
  0 siblings, 0 replies; 8+ messages in thread
From: Herbert Wolverson @ 2022-11-08 16:04 UTC (permalink / raw)
  Cc: libreqos


Glancing at that, I love how simple it is. :-) I'll see if I can try it out
soon (I'm diving back into book writing for the day).

On Tue, Nov 8, 2022 at 9:45 AM Robert Chacón via LibreQoS <
libreqos@lists.bufferbloat.net> wrote:

> Point taken!
>
> Before receiving this email I had started work on it. It's on a branch on
> GitHub now <https://github.com/LibreQoE/LibreQoS/tree/monitor-mode/v1.3>.
> It uses cpumap-pping and keeps HTB, but overrides all HTB class and leaf
> rates to be 10Gbps so that borrowing isn't taking place anywhere - so we
> can be as transparent as possible.
>
> I'll take another shot at monitoring mode with ePPing instead.
>
> On Tue, Nov 8, 2022 at 7:23 AM Toke Høiland-Jørgensen <toke@toke.dk>
> wrote:
>
>> Robert Chacón via LibreQoS <libreqos@lists.bufferbloat.net> writes:
>>
>> > I was hoping to add a monitoring mode which could be used before
>> "turning
>> > on" LibreQoS, ideally before v1.3 release. This way operators can really
>> > see what impact it's having on end-user and network latency.
>> >
>> > The simplest solution I can think of is to implement Monitoring Mode
>> using
>> > cpumap-pping as we already do - with plain HTB and leaf classes with no
>> > CAKE qdisc applied, and with HTB and leaf class rates set to impossibly
>> > high amounts (no plan enforcement). This would allow for before/after
>> > comparisons of Nodes (Access Points). My only concern with this
>> approach is
>> > that HTB, even with rates set impossibly high, may not be truly
>> > transparent. It would be pretty easy to implement though.
>> >
>> > Alternatively we could use ePPing
>> > <https://github.com/xdp-project/bpf-examples/tree/master/pping> but I
>> worry
>> > about throughput and the possibility of latency tracking being slightly
>> > different from cpumap-pping, which could limit the utility of a
>> comparison.
>> > We'd have to match IPs in a way that's a bit more involved here.
>> >
>> > Thoughts?
>>
>> Well, this kind of thing is exactly why I think concatenating the two
>> programs (cpumap and pping) into a single BPF program was a mistake:
>> those are two distinct pieces of functionality, and you want to be able
>> to run them separately, as your "monitor mode" use case shows. The
>> overhead of parsing the packet twice is trivial compared to everything
>> else those apps are doing, so I don't think the gain is worth losing
>> that flexibility.
>>
>> So I definitely think using the regular epping is the right thing to do
>> here. Simon is looking into improving its reporting so it can be
>> per-subnet using a user-supplied configuration file for the actual
>> subnets, which should hopefully make this feasible. I'm sure he'll chime
>> in here once he has something to test and/or with any questions that pop
>> up in the process.
>>
>> Longer term, I'm hoping all of Herbert's other improvements to epping
>> reporting/formatting can make it into upstream epping, so LibreQoS can
>> just use that for everything :)
>>
>> -Toke
>>
>
>
> --
> Robert Chacón
> CEO | JackRabbit Wireless LLC <http://jackrabbitwireless.com>
> Dev | LibreQoS.io
>
>


* Re: [LibreQoS] Before/After Performance Comparison (Monitoring Mode)
  2022-11-08 16:02   ` Herbert Wolverson
@ 2022-11-08 18:53     ` Simon Sundberg
  2022-11-10 22:10       ` Dave Taht
  0 siblings, 1 reply; 8+ messages in thread
From: Simon Sundberg @ 2022-11-08 18:53 UTC (permalink / raw)
  To: herberticus; +Cc: libreqos

Hi,
Will just chime in with my own perspective on this ePPing (the name I've
given my eBPF-based pping internally) vs. xdp-cpumap-pping debate and
address some of the points mentioned.

First of all I want to say that I'm very impressed with all the work
Herbert has done with both xdp-cpumap-tc and xdp-cpumap-pping. There
seems to be very rapid progress and some very nice performance numbers
being presented. I'm also very happy that some of my work with ePPing
can benefit LibreQoS, even if I, like Toke, would hope that we could
perhaps benefit a bit more from each other's work.

That said, I can see some of the benefits of keeping cpumap-pping its
own thing and understand if that's the route you want to head down.
Regardless, I hope we can at least exchange some ideas and learn from
each other.

On Tue, 2022-11-08 at 10:02 -0600, Herbert Wolverson via LibreQoS
wrote:
> I'm (obviously!) going to respectfully disagree with Toke on this one.
> I didn't dive into cpumap-pping for fun; I tried *really hard* to work
> with the original epping/xdp-pping. It's a great tool, really fantastic work.
> It's also not really designed for the same purpose.

ePPing is very heavily inspired by Kathie's pping, and perhaps a bit
too much so at times. Allowing an ISP to monitor the latency its
customers experience is definitely a use case we would want to support
with ePPing, and we are working on some improvements to make it work
better for that (as Toke mentioned, we're currently looking at adding
some aggregation support instead of reporting individual RTT samples).
So I will definitely have a look at some of the changes Herbert has
made with cpumap-pping to see if it makes sense to implement some of
them as alternatives for ePPing as well.
>
> The original Pollere pping is wonderful, but isn't going to scale - the
> way it ingests packets isn't going to scale across multiple CPUs,
> and having a core pegging 100% on a busy shaper box was
> degrading overall performance. epping solves the scalability
> issue wonderfully, and (rightly) remains focused on giving you
> a complete report of all of the data it accessed while it was
> running. If you want to run a monitoring session and see what's
> going on, it's a *fantastic* way to do it - serious props there. I
> benchmarked it at about 15 gbit/s on single-stream testing,
> which is *really* impressive (no other BPF programs active,
> no shaping).
>
> The first issue I ran into is that stacking XDP programs isn't
> all that well defined a process. You can make it work, but
> it gets messy when both programs have setup/teardown
> routines. I kinda, sorta managed to get the two running at
> once, and it mostly worked. There *really* needs to be an
> easier way that doesn't run headlong into Ubuntu's lovely
> "you updated the kernel and tools, we didn't think you'd
> need bpftool so we didn't include it" issues, adjusting scripts
> until neither says "oops, there's already an XDP program
> here! Bye!". I know that this is a pretty new thing, but the
> tooling hasn't really caught up yet to make this a comfortable
> process. I'm pretty sure I spent more time trying to run both
> at once than it took to make a combined version that sort-of
> ran. (I had a working version in an afternoon)

I will admit that I don't have much experience with chaining XDP
programs, but libxdp has been designed to solve that. ePPing has used
libxdp to load its XDP program for a while now. But for that to
work together with xdp-cpumap-tc I guess xdp-cpumap-tc would need to be
modified to also use libxdp. I remember reading that there was some
other issue regarding how ePPing handled VLAN tags, but I don't recall
the details, although that seems like it should be possible to solve.
>
> With the two literally concatenated (but compiled together),
> it worked - but there was a noticeable performance cost. That's
> where orthogonal design choices hit - epping/xdp-pping is
> sampling everything (it can even go looking for DNS and ICMP!).
> A QoE box *really* needs to go out of its way to avoid adding
> any latency, otherwise you're self-defeating. A representative
> sample is really all you need - while for epping's target,
> a really detailed sample is what you need. When faced with
> differing design goals like that, my response is always to
> make a tool that very efficiently does what I need.
>
> Combining the packet parsing** was the obvious low-hanging
> fruit. It is faster, but not by very much. But I really hate
> it when code repeats itself. It seriously set off my OCD
> watching both find the IP header offset, determine protocol
> (IPv4 vs IPv6), etc. Small performance win.
>
> Bailing out as soon as we determine that we aren't looking
> at a TCP packet was a big performance win. You can achieve
> the same by carefully setting up the "config" for epping,
> but there's not a lot of point in keeping the DNS/ICMP code
> when it's not needed. Still a performance win, and not
> needing to maintain a configuration (that will be the same
> each time) makes setup easier.

Just want to clarify that ePPing does not support DNS (yet), even if we
may add it at some point. So for now it's just TCP and ICMP. ePPing can
easily ignore non-TCP traffic (it does so by default these days, you
have to explicitly enable tracking of ICMP traffic), and the runtime
overhead from the additional ICMP code should be minimal if ePPing is
not set up to track ICMP (the JIT-compilation ought to optimize it all
away with dead code elimination as those branches will never be hit
then).

That said, the additional code for different protocols of course adds to
the overall code complexity. Furthermore, it may make it a bit more
challenging to optimize ePPing for specific protocols, as I also try to
keep a somewhat common core which can work for all protocols we add (so
we don't end up with a completely separate branch of code for each
protocol).
>
> Running by default on the TC (egress) rather than XDP
> is a big win, too - but only after cpumap-tc has shunted
> processing to the appropriate CPU. Now processing is
> divided between CPUs, and cache locality is more likely
> to happen - the packet we are reading is in the local
> core's cache when cpumap-pping reads it, and there's
> a decent chance it'll still be there (at least L2) by the time
> it gets to the actual queue discipline.
>
> Changing the reporting mechanism was a really big win,
> in terms of performance and the tool aligning with what's
> needed:
> * Since xdp-cpumap has already done the work to determine
>   that a flow belongs in TC handle X:Y - and mapping RTT
>   performance to customer/circuit is *exactly* what we're
>   trying to do - it just makes sense to take that value and
>   use it as a key for the results.
> * Since we don't care about every packet - rather, we want
>   a periodic representative sample - we can use an efficient
>   per TC handle circular buffer in which to store results.
> * In turn, I realized that we could just *sample* rather than
>   continually churning the circular buffer. So each flow's
>   buffer has a capacity, and the monitor bails out once a flow
>   buffer is full of RTT results. Really big performance win.
>   "return" is a really fast call. :-) (The buffers are reset when
>   read)
> * Perfmaps are great, but I didn't want to require a daemon
>   run (mapping the perfmap results) and in turn output
>   results in a LibreQoS-friendly format when a much simpler
>   mechanism gets the same result - without another program
>   sitting handling the mmap'd performance flows all the time.
>
> So the result was really fast and does exactly what I need.
> It's not meant to be "better" than the original; for the original's
> purpose, it's not great. For rapidly building QoE metrics on
> a live shaper box, with absolutely minimal overhead and a
> focus on sipping the firehose rather than trying to drink it
> all - it's about right.

As already mentioned, we are working on aggregating RTT reports for
ePPing. Spitting out individual RTT samples as ePPing does now may be
useful in some cases, but can get rather overwhelming (both in terms of
overhead for ePPing itself and in terms of analyzing all those RTT
samples somehow). In some of our own tests we've had ePPing report over
125,000 RTT samples per second, which is of course a bit overkill.
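
The sort of aggregation we have in mind looks roughly like this (a sketch
of the idea in Python, not the actual ePPing code): accumulate samples per
key - subnet, flow group, whatever - and emit one summary line per
reporting interval instead of one line per sample.

import statistics
from collections import defaultdict

samples = defaultdict(list)        # key -> RTT samples (ms) this interval

def add_sample(key, rtt_ms):
    samples[key].append(rtt_ms)

def flush():
    """Called once per interval: print a summary per key, then reset."""
    for key, rtts in samples.items():
        rtts.sort()
        p95 = rtts[int(0.95 * (len(rtts) - 1))]
        print(f"{key} n={len(rtts)} min={rtts[0]:.2f} "
              f"median={statistics.median(rtts):.2f} p95={p95:.2f} "
              f"max={rtts[-1]:.2f}")
    samples.clear()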

I plan to take a bit closer look at all the optimizations Herbert has
done to see which can also be added to ePPing (at least as an option).
>
> Philosophically, I've always favored tools that do exactly
> what I need.

While I like the simplicity of this philosophy, you will end up with a
lot of very similar but slightly different tools if everyone uses sets
of tools that are tailored to their exact use case. In the long run
that seems a bit cumbersome to maintain, but of course maintaining a
more general tool has its own complexities.
>
> Likewise, if someone would like to come up with a really
> good recipe that runs both rather than a combined
> program - that'd be awesome. If it can match the
> performance of cpumap-pping, I'll happily switch
> BracketQoS to use it.

Long term I think this would be nice, but getting there might take
some time. ePPing + xdp-cpumap-tc would likely always have a bit more
overhead compared to xdp-cpumap-pping (due to, for example, parsing the
packets multiple times), but I don't think it should be impossible to
make that overhead relatively small compared to the overall work
xdp-cpumap-tc and ePPing are doing.
>
> You're obviously welcome to any of the code; if it can help
> the original projects, that's wonderful. Right now, I don't
> have the time to come up with a better way of layering
> XDP/TC programs!

Thanks for keeping this open source; I will definitely have a look at
the code and see if I can use some of it for ePPing.

With best regards, Simon Sundberg.

> * - I keep wondering if I shouldn't roll some .deb packages
> and a configurator to make setup easier!
>
> ** - there *really* should be a standard flow dissector. The
> Linux traffic shaper's dissector can handle VLAN tags and
> an MPLS header. xdp-cpumap-tc handles VLANs with
> aplomb and doesn't touch MPLS. epping calls out to the
> xdp-project's dissector which appears to handle
> VLANs and also doesn't touch MPLS.
>
> Thanks,
> Herbert
>
> On Tue, Nov 8, 2022 at 8:23 AM Toke Høiland-Jørgensen via LibreQoS
> <libreqos@lists.bufferbloat.net> wrote:
> > Robert Chacón via LibreQoS <libreqos@lists.bufferbloat.net> writes:
> >
> > > I was hoping to add a monitoring mode which could be used before
> > > "turning
> > > on" LibreQoS, ideally before v1.3 release. This way operators can
> > > really
> > > see what impact it's having on end-user and network latency.
> > >
> > > The simplest solution I can think of is to implement Monitoring
> > > Mode using
> > > cpumap-pping as we already do - with plain HTB and leaf classes
> > > with no
> > > CAKE qdisc applied, and with HTB and leaf class rates set to
> > > impossibly
> > > high amounts (no plan enforcement). This would allow for
> > > before/after
> > > comparisons of Nodes (Access Points). My only concern with this
> > > approach is
> > > that HTB, even with rates set impossibly high, may not be truly
> > > transparent. It would be pretty easy to implement though.
> > >
> > > Alternatively we could use ePPing
> > > <https://github.com/xdp-project/bpf-examples/tree/master/pping>
> > > but I worry
> > > about throughput and the possibility of latency tracking being
> > > slightly
> > > different from cpumap-pping, which could limit the utility of a
> > > comparison.
> > > We'd have to match IPs in a way that's a bit more involved here.
> > >
> > > Thoughts?
> >
> > Well, this kind of thing is exactly why I think concatenating the
> > two
> > programs (cpumap and pping) into a single BPF program was a
> > mistake:
> > those are two distinct pieces of functionality, and you want to be
> > able
> > to run them separately, as your "monitor mode" use case shows. The
> > overhead of parsing the packet twice is trivial compared to
> > everything
> > else those apps are doing, so I don't think the gain is worth
> > losing
> > that flexibility.
> >
> > So I definitely think using the regular epping is the right thing
> > to do
> > here. Simon is looking into improving its reporting so it can be
> > per-subnet using a user-supplied configuration file for the actual
> > subnets, which should hopefully make this feasible. I'm sure he'll
> > chime
> > in here once he has something to test and/or with any questions
> > that pop
> > up in the process.
> >
> > Longer term, I'm hoping all of Herbert's other improvements to
> > epping
> > reporting/formatting can make it into upstream epping, so LibreQoS
> > can
> > just use that for everything :)
> >
> > -Toke


* Re: [LibreQoS] Before/After Performance Comparison (Monitoring Mode)
  2022-11-08 18:53     ` Simon Sundberg
@ 2022-11-10 22:10       ` Dave Taht
  2022-11-11 14:23         ` Herbert Wolverson
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Taht @ 2022-11-10 22:10 UTC (permalink / raw)
  To: Simon Sundberg; +Cc: herberticus, libreqos, Felix Fietkau

A couple meta comments:

A) Most of this new stuff is not "open source" but "free" (libre)
software. I have kind of despaired at the degradation of the term
"Open", and the phrase "open source". Free is a lousy word also, and
"libre" the closest thing we have left to the original spirit of
sharing and mutual respect that the culture that birthed the modern
internet in the 90s, had.

I have never been able to pay people (aside from vectoring grants in
their direction), and am HUGE on always providing credit, because that
was the only thing I had to give in exchange for the huge amount of
craftsmanship required to build truly great software.

Sometimes, this means less code, rather than more! I'm rather proud of
toke & felix & michal's (and so many others') fq_codel for wifi
implementation *removing* a net 200 lines of code. (
https://lwn.net/Articles/705884/ )

I'd like us to be looking hard at qosify (
https://forum.openwrt.org/t/qosify-new-package-for-dscp-marking-cake/111789/
) as well, long term.

B) in trying to make tcp rtt sensing performant and always on, with
+1% more cpu... I find myself wondering where the 24% of 16 cores
we're spending at 11gbit is really going!! Cache bandwidth is
enormous... Dick Sites' recent book on tracing is well thumbed.

C) There are a ton of things (long term) that will impact future
processing, some merely interesting, others genuinely useful. Examples
of this include - sensing frequency and location of icmp "too big"
messages, quic's spin bit, feed forwarding the rtt stats into
dynamically shaping an instance to compensate for in-home wifi (does
anyone actually know wtf plume is measuring and doing for their 2.2B
valuation??), checking for correct tcp responses to ecn marks, and
detecting ddos attacks...

D) I would like us to always "upstream first", like redhat, and
openwrt. REALLY high on my list is being able to track and extend
"drop_reason" support in the kernel...


* Re: [LibreQoS] Before/After Performance Comparison (Monitoring Mode)
  2022-11-10 22:10       ` Dave Taht
@ 2022-11-11 14:23         ` Herbert Wolverson
  0 siblings, 0 replies; 8+ messages in thread
From: Herbert Wolverson @ 2022-11-11 14:23 UTC (permalink / raw)
  Cc: libreqos


Re A) - we could all benefit from switching to French, where "libre" and
"gratuit" are different words, with different connotations. English has
really mangled this one. "Free" can be "freedom" or "no cost", liberty
somehow became "freedom" in a lot of minds (to the chagrin of those of us
with Political Science degrees), and "open" can mean anything from "look
but don't touch" to "public domain". Ugh. We're stuck with the language we
have; Libre works for me.

Less code is generally a good thing. It can be a balance; e.g. more code
fleshing out an API usually means much less code using it.

Qosify:

I gave Qosify a once-over, very quickly (short on time). It'll be worth
going over it again in more detail. The lack of comments makes it
heavy-going. It seems to be focused on a smaller router at home, building
a simple queue (assuming local NAT) and attaching a Cake qdisc? The
"magic" I'm seeing is that the BPF program looks at flows to change DSCP
flags by connection bytes (I thought Cake was already classifying by
reading ConnTrack data?). I get the feeling I'm missing something.

Re B) - I've been wondering that, too. It's not an easy one to profile;
kernel-side profiling is hard enough without adding in eBPF! It's
definitely worth digging into.

C) is a whole other future discussion, and I feel I've talked enough about
D).

On Thu, Nov 10, 2022 at 4:10 PM Dave Taht <dave.taht@gmail.com> wrote:

> A couple meta comments:
>
> A) Most of this new stuff is not "open source" but "free" (libre)
> software. I have kind of despaired at the degradation of the term
> "Open", and the phrase "open source". Free is a lousy word also, and
> "libre" the closest thing we have left to the original spirit of
> sharing and mutual respect that the culture that birthed the modern
> internet in the 90s, had.
>
> I have never been able to pay people (aside from vectoring grants in
> their direction), and am HUGE on always providing credit, because that
> was the only thing I had to give in exchange for the huge amount of
> craftsmanship required to build truly great software.
>
> Sometimes, this means less code, rather than more! I'm rather proud of
> toke & felix & michal's (and so many others') fq_codel for wifi
> implementation *removing* a net 200 lines of code. (
> https://lwn.net/Articles/705884/ )
>
> I'd like us to be looking hard at qosify (
>
> https://forum.openwrt.org/t/qosify-new-package-for-dscp-marking-cake/111789/
> ) as well, long term.
>
> B) in trying to make tcp rtt sensing performant and always on, with
> +1% more cpu... I find myself wondering where the 24% of 16 cores
> we're spending at 11gbit is really going!! Cache bandwidth is
> enormous... Dick Sites' recent book on tracing is well thumbed.
>
> C) There are a ton of things (long term) that will impact future
> processing, some merely interesting, others genuinely useful. Examples
> of this include - sensing frequency and location of icmp "too big"
> messages, quic's spin bit, feed forwarding the rtt stats into
> dynamically shaping an instance to compensate for in-home wifi (does
> anyone actually know wtf plume is measuring and doing for their 2.2B
> valuation??), checking for correct tcp responses to ecn marks, and
> detecting ddos attacks...
>
> D) I would like us to always "upstream first", like redhat, and
> openwrt. REALLY high on my list is being able to track and extend
> "drop_reason" support in the kernel...
>

