CoDel AQM discussions
* [Codel] XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
@ 2025-09-09 10:32 Frantisek Borsik
  2025-09-09 20:25 ` [Codel] Re: [Cake] " David P. Reed
  0 siblings, 1 reply; 15+ messages in thread
From: Frantisek Borsik @ 2025-09-09 10:32 UTC (permalink / raw)
  To: Cake List, codel, bloat, Jeremy Austin via Rpm

Hello to all,

Looks interesting:
https://medium.com/@tom_84912/xdp2-this-changes-everything-at-least-for-ai-ml-infrastructure-850c1ba82771


All the best,

Frank

Frantisek (Frank) Borsik


*In loving memory of Dave Täht: *1965-2025

https://libreqos.io/2025/04/01/in-loving-memory-of-dave/


https://www.linkedin.com/in/frantisekborsik

Signal, Telegram, WhatsApp: +421919416714

iMessage, mobile: +420775230885

Skype: casioa5302ca

frantisek.borsik@gmail.com


* [Codel] Re: [Cake] XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-09 10:32 [Codel] XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released) Frantisek Borsik
@ 2025-09-09 20:25 ` David P. Reed
  2025-09-09 21:02   ` Frantisek Borsik
  0 siblings, 1 reply; 15+ messages in thread
From: David P. Reed @ 2025-09-09 20:25 UTC (permalink / raw)
  To: Frantisek Borsik; +Cc: Cake List, codel, bloat, Jeremy Austin via Rpm


Hi Frank -
 
I think it is interesting as a concept. A project I am advising has been using DPDK very effectively to get rid of the huge path and locking delays in the current Linux network stack. XDP2 could be supported in a ring3 (user) address space, achieving a similar result.
 
But I don't think XDP2 is going in that direction - so it may be stuck in the mess of kernel space networking. Adding eBPF has only made this more of a mess, by the way (and adding a new "compiler" that needs to be verified as safe for the kernel).

I will be watching how this evolves.
 
David
 

* [Codel] Re: [Cake] XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-09 20:25 ` [Codel] Re: [Cake] " David P. Reed
@ 2025-09-09 21:02   ` Frantisek Borsik
  2025-09-09 21:36     ` [Codel] Re: [Cake] " Tom Herbert
  0 siblings, 1 reply; 15+ messages in thread
From: Frantisek Borsik @ 2025-09-09 21:02 UTC (permalink / raw)
  To: David P. Reed; +Cc: Cake List, codel, bloat, Jeremy Austin via Rpm

Thanks a lot, David.

I have asked Tom if he wants to join us and he should be here to chat with
us now.

All the best,

Frank


* [Codel] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-09 21:02   ` Frantisek Borsik
@ 2025-09-09 21:36     ` Tom Herbert
  2025-09-10  8:54       ` [Codel] Re: [Bloat] " BeckW
                         ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Tom Herbert @ 2025-09-09 21:36 UTC (permalink / raw)
  To: Frantisek Borsik
  Cc: David P. Reed, Cake List, codel, bloat, Jeremy Austin via Rpm

On Tue, Sep 9, 2025, 5:03 PM Frantisek Borsik <frantisek.borsik@gmail.com>
wrote:

> Thanks a lot, David.
>
> I have asked Tom if he wants to join us and he should be here to chat with
> us now.
>
> On Tue, Sep 9, 2025 at 10:25 PM David P. Reed <dpreed@deepplum.com> wrote:
>
> > Hi Frank -
> >
> >
> >
> > I think it is interesting as a concept. A project I am advising has been
> > using DPDK very effectively to get rid of the huge path and locking delays
> > in the current Linux network stack. XDP2 could be supported in a ring3
> > (user) address space, achieving a similar result.
>

Hi David,

The idea is that you write the code in XDP2, it is compiled to DPDK or
eBPF, and the compiler handles the optimizations.


> >
> >
> >
> > But I don't think XDP2 is going that direction - so it may be stuck in
> > the mess of kernel space networking. Adding eBPF only has made this more
> > of a mess, by the way (and adding a new "compiler" that needs to be
> > verified as safe for the kernel).


Think of XDP2 as the generalization of XDP to go beyond just the kernel.
The idea is that users write their datapath code once and compile it to
run on whatever targets they have: DPDK, P4, other programmable hardware,
and yes, XDP/eBPF. It's really not limited to kernel networking.
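
As a rough illustration of the "write once, compile per target" idea, here
is a minimal Python sketch. It is not XDP2's actual API: ParseNode,
PARSE_PATH, BACKENDS, compile_datapath and the emitted step strings are all
hypothetical names made up for this example. The only target-specific fact
it relies on is that the eBPF verifier requires a bounds check before each
packet access, whereas a user-space backend is free to check the length once.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ParseNode:
    """One fixed-size header in a target-independent parse path (hypothetical)."""
    name: str
    header_len: int  # bytes


# Written once, with no reference to any particular target.
PARSE_PATH: List[ParseNode] = [
    ParseNode("ethernet", 14),
    ParseNode("ipv4", 20),  # assumes no IPv4 options, for simplicity
    ParseNode("udp", 8),
]


def lower_ebpf_style(path: List[ParseNode]) -> List[str]:
    """Emit a bounds check before every header access (verifier-friendly)."""
    steps: List[str] = []
    offset = 0
    for node in path:
        steps.append(f"bounds_check(end={offset + node.header_len})")
        steps.append(f"parse_{node.name}(offset={offset})")
        offset += node.header_len
    return steps


def lower_userspace_style(path: List[ParseNode]) -> List[str]:
    """Emit one up-front length check, then parse headers back to back."""
    total = sum(node.header_len for node in path)
    steps: List[str] = [f"bounds_check(end={total})"]
    offset = 0
    for node in path:
        steps.append(f"parse_{node.name}(offset={offset})")
        offset += node.header_len
    return steps


# Hypothetical backend registry; the keys are labels only.
BACKENDS: Dict[str, Callable[[List[ParseNode]], List[str]]] = {
    "ebpf": lower_ebpf_style,
    "dpdk": lower_userspace_style,
}


def compile_datapath(path: List[ParseNode], target: str) -> List[str]:
    """Lower the same source description with whichever backend is requested."""
    try:
        backend = BACKENDS[target]
    except KeyError:
        raise ValueError(f"no backend for target {target!r}") from None
    return backend(path)


if __name__ == "__main__":
    for target in BACKENDS:
        print(target, compile_datapath(PARSE_PATH, target))
```

The point of the sketch is only that PARSE_PATH never changes; supporting a
new target means adding an entry to BACKENDS.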

As for the name XDP2, when we created XDP, eXpress DataPath, my vision was
that it would be implementation agnostic. eBPF was the first instantiation
for practicality, but now ten years later I think we can realize the
initial vision.

Tom




* [Codel] Re: [Bloat] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-09 21:36     ` [Codel] Re: [Cake] " Tom Herbert
@ 2025-09-10  8:54       ` BeckW
  2025-09-10 13:59         ` Tom Herbert
  2025-09-13 18:33       ` [Codel] " David P. Reed
       [not found]       ` <FR2PPFEFD18174CA00474D0DC8DBDA3EE00DC0EA@FR2PPFEFD18174C.DEUP281.PROD.OUT LOOK.COM>
  2 siblings, 1 reply; 15+ messages in thread
From: BeckW @ 2025-09-10  8:54 UTC (permalink / raw)
  To: tom, frantisek.borsik; +Cc: dpreed, cake, codel, bloat, rpm

Interesting work! One problem with P4 is that networking hardware varies so much in the number of resources (queues, schedulers, policers, counters, table memory) that the code inevitably becomes tied to a particular system.
It will be difficult to abstract the peculiarities of systems (e.g., Broadcom 88800 vs. the Linux kernel) in a good way.

Wolfgang


* [Codel] Re: [Bloat] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-10  8:54       ` [Codel] Re: [Bloat] " BeckW
@ 2025-09-10 13:59         ` Tom Herbert
  2025-09-10 14:06           ` Tom Herbert
  0 siblings, 1 reply; 15+ messages in thread
From: Tom Herbert @ 2025-09-10 13:59 UTC (permalink / raw)
  To: BeckW; +Cc: Frantisek Borsik, dpreed, cake, codel, bloat, rpm

On Wed, Sep 10, 2025, 1:54 AM <BeckW@telekom.de> wrote:

> Interesting work! One problem of P4 is that the networking hardware varies
> so much in number of resources (queues, schedulers, policers, counters,
> table memory) that the code inevitably becomes tied to a certain system.
> It will be difficult to abstract the peculiarities of systems -- eg
> Broadcom 88800 vs linux kernel -- in a good way.
>

Hi Wolfgang,

Yes, the non-portability of P4 code between different architectures has
been raised as an issue. I believe this is a subset of the general problem
that we need to deal with differing resource constraints across targets
(such as different memory resources, hardware accelerators, table sizes,
etc.). It should be a problem of resource constraints, and not of
differences in core functionality supported by the targets.

I think there are three possibilities:

1) We create an image and attempt to resolve the resource constraints at
runtime (load time): determine the available resources and run with those.
It's possible that a HW configuration doesn't have sufficient resources to
meet the programmer's requirements, in which case we may have to inform the
user that the program can't run (e.g., if the programmer wants line-rate
encryption but there are no HW accelerators, then the program requirements
can't be met and the program can't run on that target).
2) Recompile the backend for the different targets. This is needed when the
executable is incompatible between targets. The goal is still to be
transparent to the user, but we still need to meet the requirements of the
application.
3) Change the source code if the hardware peculiarities are not transparent
to the program source. Obviously, this is the least preferred option, but if
there's no alternative then the goal here would be to isolate the
target-specific code as much as possible. Hopefully, the vast majority of
program code is target agnostic and there's just a little glue code for
each target.
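
To make option 1 concrete, here is a minimal Python sketch of a load-time
resource check. Everything in it is an assumption for illustration: the
Target class, the resource names ("table_entries", "crypto_accelerators"),
the numbers, and the load() function are hypothetical and do not come from
XDP2 or any real loader.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Target:
    """A deployment target and the resources it reports (hypothetical)."""
    name: str
    resources: Dict[str, int]


def unmet_requirements(required: Dict[str, int], target: Target) -> List[str]:
    """List every requirement the target cannot satisfy; empty means it fits."""
    problems: List[str] = []
    for resource, needed in sorted(required.items()):
        available = target.resources.get(resource, 0)  # missing counts as zero
        if available < needed:
            problems.append(f"{resource}: need {needed}, target has {available}")
    return problems


def load(required: Dict[str, int], target: Target) -> str:
    """Refuse to load, with an explanation, when requirements are not met."""
    problems = unmet_requirements(required, target)
    if problems:
        raise RuntimeError(f"cannot run on {target.name}: " + "; ".join(problems))
    return f"loaded on {target.name}"


# What the program asks for: made-up numbers for the example.
REQUIRED = {"table_entries": 100_000, "crypto_accelerators": 1}

SMARTNIC = Target("smartnic", {"table_entries": 1_000_000, "crypto_accelerators": 2})
PLAIN_HOST = Target("plain-host", {"table_entries": 1_000_000})

if __name__ == "__main__":
    print(load(REQUIRED, SMARTNIC))
    try:
        load(REQUIRED, PLAIN_HOST)
    except RuntimeError as exc:
        print(exc)
```

Here the same image loads on the target that has a crypto accelerator and is
rejected, with a reason, on the one that does not, which is the line-rate
encryption case described in option 1.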

Tom




* [Codel] Re: [Bloat] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-10 13:59         ` Tom Herbert
@ 2025-09-10 14:06           ` Tom Herbert
  0 siblings, 0 replies; 15+ messages in thread
From: Tom Herbert @ 2025-09-10 14:06 UTC (permalink / raw)
  To: BeckW; +Cc: Frantisek Borsik, dpreed, cake, codel, bloat, rpm, xdp2

Cc'ing the XDP2 list.

On Wed, Sep 10, 2025, 6:59 AM Tom Herbert <tom@herbertland.com> wrote:

>
>
> On Wed, Sep 10, 2025, 1:54 AM <BeckW@telekom.de> wrote:
>
>> Interesting work! One problem of P4 is that the networking hardware
>> varies so much in number of resources (queues, schedulers, policers,
>> counters, table memory) that the code inevitably becomes tied to a certain
>> system.
>> It will be difficult to abstract the peculiarities of systems -- eg
>> Broadcom 88800 vs linux kernel -- in a good way.
>>
>
> Hi Wolfgang
>
>>
> Yes, the non-portability of P4 code between different architectures has
> been raised as an issue. I believe this is subset of the general problem
> that we need to deal with differing resource constraints across targets
> (like different memory resources, hardware accelerators, table sizes, etc
> ). It should be a problem of resource constraints, and not differences in
> core functionality supported by the targets
>
> I think there's three possibilities:
>
> 1) We create an image and attempt to resolve the resource constraints at
> runtime (load time). Determine the resources and run with those. It's
> possible that a HW configuration doesn't have sufficient resources to meet
> the programmer's requirements in which case we may have to inform the user
> the program can't run (e.g.if the programmer wants line rate encryption,
> but there's no HW accelerators then the program requirements can't be met
> and the program can't run on that target).
> 2) Recompile the backend to the different targets. This is needed when the
> executable is incompatible between targets. The goal is to still be
> transparent to the user, but we still need to meet the requirement of the
> application.
> 3) Change the source code if the hardware peculiarities are not
> transparent to program source. Obviously, this is the least preferred
> option, but if there's no alternative then the goal here would be to
> isolate the target specific code as much as possible. Hopefully, the vast
> majority of program code is target agnostic and there's just a little glue
> code for each target.
>
> Tom
>
>
>
>> Wolfgang
>>
>> -----Ursprüngliche Nachricht-----
>> Von: Tom Herbert via Bloat <bloat@lists.bufferbloat.net>
>> Gesendet: Dienstag, 9. September 2025 23:37
>> An: Frantisek Borsik <frantisek.borsik@gmail.com>
>> Cc: David P. Reed <dpreed@deepplum.com>; Cake List <
>> cake@lists.bufferbloat.net>; codel@lists.bufferbloat.net; bloat <
>> bloat@lists.bufferbloat.net>; Jeremy Austin via Rpm <
>> rpm@lists.bufferbloat.net>
>> Betreff: [Bloat] Re: [Cake] Re: XDP2 is here - from one and only Tom
>> Herbert (almost to the date, 10 years after XDP was released)
>>
>> On Tue, Sep 9, 2025, 5:03 PM Frantisek Borsik <frantisek.borsik@gmail.com
>> >
>> wrote:
>>
>> > Thanks a lot, David.
>> >
>> > I have asked Tom if he wants to join us and he should be here to chat
>> > with us now.
>> >
>> > All the best,
>> >
>> > Frank
>> >
>> > Frantisek (Frank) Borsik
>> >
>> >
>> > *In loving memory of Dave Täht: *1965-2025
>> >
>> > https://deu01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flibr
>> > eqos.io%2F2025%2F04%2F01%2Fin-loving-memory-of-dave%2F&data=05%7C02%7C
>> > BeckW%40telekom.de%7C299d64b9b76b4d4cf88b08ddf039e105%7Cbde4dffc4b604c
>> > f68b04a5eeb25f5c4f%7C0%7C0%7C638930853609276702%7CUnknown%7CTWFpbGZsb3
>> > d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoi
>> > TWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=TAA6NdEkB3KQQt8UhKqs3ZKuLj
>> > N7A9h9J9FAjNRDDuU%3D&reserved=0
>> >
>> >
>> > https://deu01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.
>> > linkedin.com%2Fin%2Ffrantisekborsik&data=05%7C02%7CBeckW%40telekom.de%
>> > 7C299d64b9b76b4d4cf88b08ddf039e105%7Cbde4dffc4b604cf68b04a5eeb25f5c4f%
>> > 7C0%7C0%7C638930853609297031%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiO
>> > nRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%
>> > 3D%3D%7C0%7C%7C%7C&sdata=ucL7U%2FQ81ks09nVEGUbe%2FFq1rhJNLaegicv%2FLCV
>> > cIlg%3D&reserved=0
>> >
>> > Signal, Telegram, WhatsApp: +421919416714
>> >
>> > iMessage, mobile: +420775230885
>> >
>> > Skype: casioa5302ca
>> >
>> > frantisek.borsik@gmail.com
>> >
>> >
>> > On Tue, Sep 9, 2025 at 10:25 PM David P. Reed <dpreed@deepplum.com>
>> wrote:
>> >
>> > > Hi Frank -
>> > >
>> > >
>> > >
>> > > I think it is interesting as a concept. A project I am advising has
>> > > been using DPDK very effectively to get rid of the huge path and
>> > > locking
>> > delays
>> > > in the current Linux network stack. XDP2 could be supported in a
>> > > ring3
>> > > (user) address space, achieving a similar result.
>> >
>>
>> HI David,
>>
>> The idea is you could write the code in XDP2 and it would be compiled to
>> DPDK or eBPF and the compiler would handle the optimizations.
>>
>>
>> > >
>> > >
>> > >
>> > > But I don't think XDP2 is going that direction - so it may be
>> > > stuckinto the mess of kernel space networking. Adding eBPF only has
>> > > made this more
>> > of
>> > > a mess, by the way (and adding a new "compiler" that needs to be
>> > > veriried as safe for the kernel).
>>
>>
>> Think of XDP2 as the generalization of XDP to go beyond just the kernel.
>> The idea is that the user writes their datapath code once and they
>> compile it to run in whatever targets they have-- DPDK, P4, other
>> programmable hardware, and yes XDP/eBPF. It's really not limited to kernel
>> networking.
>>
>> As for the name XDP2, when we created XDP, eXpress DataPath, my vision
>> was that it would be implementation agnostic. eBPF was the first
>> instantiation for practicality, but now ten years later I think we can
>> realize the initial vision.
>>
>> Tom
>>
>>
>>
>> >
>> > > I will be watching how this evolves.
>> > >
>> > >
>> > >
>> > > David
>> > >
>> > >
>> > >
>> > > On Tuesday, September 9, 2025 06:32, "Frantisek Borsik" <
>> > > frantisek.borsik@gmail.com> said:
>> > >
>> > > > Hello to all,
>> > > >
>> > > > Looks interesting:
>> > > >
>> > >
>> > https://deu01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmedi
>> > um.com%2F%40tom_84912%2Fxdp2-this-changes-everything-at-least-for-ai-m
>> > l-infrastructure-850c1ba82771&data=05%7C02%7CBeckW%40telekom.de%7C299d
>> > 64b9b76b4d4cf88b08ddf039e105%7Cbde4dffc4b604cf68b04a5eeb25f5c4f%7C0%7C
>> > 0%7C638930853609308950%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWU
>> > sIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%
>> > 7C0%7C%7C%7C&sdata=RTLHTVFR540C8Psr73uiuRvkx1sSyjmlUIICEHFj0HA%3D&rese
>> > rved=0
>> > > >
>> > > >
>> > > > All the best,
>> > > >
>> > > > Frank
>> > > >
>> > > > Frantisek (Frank) Borsik
>> > > >
>> > > >
>> > > > *In loving memory of Dave Täht: *1965-2025
>> > > >
>> > > > https://libreqos.io/2025/04/01/in-loving-memory-of-dave/
>> > > >
>> > > >
>> > > > https://www.linkedin.com/in/frantisekborsik
>> > > >
>> > > > Signal, Telegram, WhatsApp: +421919416714
>> > > >
>> > > > iMessage, mobile: +420775230885
>> > > >
>> > > > Skype: casioa5302ca
>> > > >
>> > > > frantisek.borsik@gmail.com
>> > > > _______________________________________________
>> > > > Cake mailing list -- cake@lists.bufferbloat.net To unsubscribe
>> > > > send an email to cake-leave@lists.bufferbloat.net
>> > > >
>> > >
>> > _______________________________________________
>> > Cake mailing list -- cake@lists.bufferbloat.net To unsubscribe send an
>> > email to cake-leave@lists.bufferbloat.net
>> >
>> _______________________________________________
>> Bloat mailing list -- bloat@lists.bufferbloat.net To unsubscribe send an
>> email to bloat-leave@lists.bufferbloat.net
>>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Codel] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-09 21:36     ` [Codel] Re: [Cake] " Tom Herbert
  2025-09-10  8:54       ` [Codel] Re: [Bloat] " BeckW
@ 2025-09-13 18:33       ` David P. Reed
  2025-09-13 20:58         ` Tom Herbert
  2025-09-15  8:39         ` [Codel] Re: [Bloat] " BeckW
       [not found]       ` <FR2PPFEFD18174CA00474D0DC8DBDA3EE00DC0EA@FR2PPFEFD18174C.DEUP281.PROD.OUTLOOK.COM>
  2 siblings, 2 replies; 15+ messages in thread
From: David P. Reed @ 2025-09-13 18:33 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Frantisek Borsik, Cake List, codel, bloat, Jeremy Austin via Rpm


Tom -
 
An architecture-independent network framework independent of the OS kernel's peculiarities seems within reach (though a fair bit of work), and I think it would be a GOOD THING indeed. IMHO the Linux networking stack in the kernel is a horrific mess, and it doesn't have to be.
 
The reason it doesn't have to be is that there should be no reason it cannot run in ring3/userland, just like DPDK. And it should be built using "real-time" userland programming techniques (avoiding the generic Linux scheduler). The ONLY reason for involving the scheduler would be because there aren't enough cores. Linux was designed to be a uniprocessor Unix, and that just is no longer true at all. With hyperthreading, too, one need never abandon a processor's context in userspace to run some "userland" application.

This would rip a huge amount of kernel code out of the kernel (at least 50%, and probably more). The security issues of all those 3rd party network drivers would go away.

And the performance would be much higher for networking. (Running in ring 3, especially if you don't make system calls, carries no performance penalty, and interprocessor communication using shared memory has much lower latency than Linux IPC or mutexes.)
 
I like the idea of a compilation based network stack, at a slightly higher level than C. eBPF is NOT what I have in mind - it's an interpreter with high overhead. The language should support high-performance co-routining - shared memory, ideally. I don't think GC is a good thing. Rust might be a good starting point because its memory management is safe.
To me, some of what the base of DPDK is like is good stuff. However, it isn't architecturally neutral.

To me, the network stack should not be entangled with interrupt handling at all. "Polling" is far more performant under load. The only use for interrupts is when the network stack is completely idle. That would be, in userland, a "wait for interrupt" call (not a poll). Ideally, on recent Intel machines, a userspace version of MONITOR/MWAIT.
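
The poll-under-load, block-only-when-idle pattern described above can be sketched as follows. This is a minimal illustration, not anything from DPDK or XDP2: RxQueue, run_rx_loop, idle_spins and idle_timeout are hypothetical names, and a threading.Event stands in for the "wait for interrupt" (or MONITOR/MWAIT-style) wakeup.

```python
import threading
from collections import deque


class RxQueue:
    """Toy stand-in for a NIC receive ring (hypothetical, not a real driver API)."""

    def __init__(self):
        self._ring = deque()
        # Stands in for the "wait for interrupt" wakeup used only when idle.
        self._wakeup = threading.Event()

    def push(self, pkt):
        # Producer side: enqueue first, then signal, so a woken consumer
        # always finds the packet when it polls again.
        self._ring.append(pkt)
        self._wakeup.set()

    def poll(self):
        """Non-blocking receive: return a packet, or None if the ring is empty."""
        try:
            return self._ring.popleft()
        except IndexError:
            return None

    def wait_for_wakeup(self, timeout):
        """Block until signalled or until timeout; return True if signalled."""
        fired = self._wakeup.wait(timeout)
        self._wakeup.clear()
        return fired


def run_rx_loop(queue, handle, idle_spins=100, idle_timeout=0.01):
    """Poll while traffic flows; block only after the ring has stayed empty.

    Returns when a blocking wait times out, so the sketch terminates.
    """
    empty_polls = 0
    while True:
        pkt = queue.poll()
        if pkt is not None:
            handle(pkt)
            empty_polls = 0          # traffic is flowing: keep polling
            continue
        empty_polls += 1
        if empty_polls < idle_spins:
            continue                 # still in the busy-poll window
        if not queue.wait_for_wakeup(idle_timeout):
            return                   # idle for the whole timeout: give up
        empty_polls = 0              # woken up: go back to polling
```

Under load the loop never touches the blocking path; the wakeup primitive is consulted only after idle_spins consecutive empty polls. A real implementation would pin the loop to a core and replace the Event with a hardware or kernel wakeup mechanism.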

Now I know that Linus and his crew are really NOT gonna like this. Linus is still thinking like MINIX, a uniprocessor time-sharing system with rich OS functions in the kernel and doing "file" reads and writes to communicate with the kernel state. But it is a much more modern way to think of real-time IO in a modern operating system. (Windows and macOS are also Unix-like, uniprocessor monolithic kernel designs).

So, if XDP2 got away from the Linux kernel, it could be great.
BTW, io_uring, etc. are half-measures. They address getting away from interrupts toward polling, but they still make the mistake of keeping huge drivers in the kernel.
 
 
On Tuesday, September 9, 2025 17:36, "Tom Herbert" <tom@herbertland.com> said:









On Tue, Sep 9, 2025, 5:03 PM Frantisek Borsik <frantisek.borsik@gmail.com> wrote:

 Thanks a lot, David.

 I have asked Tom if he wants to join us and he should be here to chat with
 us now.

 All the best,

 Frank

 Frantisek (Frank) Borsik


 *In loving memory of Dave Täht: *1965-2025

[ https://libreqos.io/2025/04/01/in-loving-memory-of-dave/ ]( https://libreqos.io/2025/04/01/in-loving-memory-of-dave/ )


[ https://www.linkedin.com/in/frantisekborsik ]( https://www.linkedin.com/in/frantisekborsik )

 Signal, Telegram, WhatsApp: +421919416714

 iMessage, mobile: +420775230885

 Skype: casioa5302ca

[ frantisek.borsik@gmail.com ]( mailto:frantisek.borsik@gmail.com )


 On Tue, Sep 9, 2025 at 10:25 PM David P. Reed <[ dpreed@deepplum.com ]( mailto:dpreed@deepplum.com )> wrote:

 > Hi Frank -
 >
 >
 >
 > I think it is interesting as a concept. A project I am advising has been
 > using DPDK very effectively to get rid of the huge path and locking delays
 > in the current Linux network stack. XDP2 could be supported in a ring3
 > (user) address space, achieving a similar result.
HI David,
The idea is you could write the code in XDP2 and it would be compiled to DPDK or eBPF and the compiler would handle the optimizations.
 >
 >
 >
 > But I don't think XDP2 is going that direction - so it may be stuckinto
 > the mess of kernel space networking. Adding eBPF only has made this more of
 > a mess, by the way (and adding a new "compiler" that needs to be veriried
 > as safe for the kernel).
Think of XDP2 as the generalization of XDP to go beyond just the kernel. The idea is that the user writes their datapath code once and they compile it to run in whatever targets they have-- DPDK, P4, other programmable hardware, and yes XDP/eBPF. It's really not limited to kernel networking.
As for the name XDP2, when we created XDP, eXpress DataPath, my vision was that it would be implementation agnostic. eBPF was the first instantiation for practicality, but now ten years later I think we can realize the initial vision.
Tom

>
 > I will be watching how this evolves.
 >
 >
 >
 > David
 >
 >
 >
 > On Tuesday, September 9, 2025 06:32, "Frantisek Borsik" <
 > [ frantisek.borsik@gmail.com ]( mailto:frantisek.borsik@gmail.com )> said:
 >
 > > Hello to all,
 > >
 > > Looks interesting:
 > >
 > [ https://medium.com/@tom_84912/xdp2-this-changes-everything-at-least-for-ai-ml-infrastructure-850c1ba82771 ]( https://medium.com/@tom_84912/xdp2-this-changes-everything-at-least-for-ai-ml-infrastructure-850c1ba82771 )
 > >
 > >
 > > All the best,
 > >
 > > Frank
 > >
 > > Frantisek (Frank) Borsik
 > >
 > >
 > > *In loving memory of Dave Täht: *1965-2025
 > >
 > > [ https://libreqos.io/2025/04/01/in-loving-memory-of-dave/ ]( https://libreqos.io/2025/04/01/in-loving-memory-of-dave/ )
 > >
 > >
 > > [ https://www.linkedin.com/in/frantisekborsik ]( https://www.linkedin.com/in/frantisekborsik )
 > >
 > > Signal, Telegram, WhatsApp: +421919416714
 > >
 > > iMessage, mobile: +420775230885
 > >
 > > Skype: casioa5302ca
 > >
 > > [ frantisek.borsik@gmail.com ]( mailto:frantisek.borsik@gmail.com )
 > > _______________________________________________
 > > Cake mailing list -- [ cake@lists.bufferbloat.net ]( mailto:cake@lists.bufferbloat.net )
 > > To unsubscribe send an email to [ cake-leave@lists.bufferbloat.net ]( mailto:cake-leave@lists.bufferbloat.net )
 > >
 >
 _______________________________________________
 Cake mailing list -- [ cake@lists.bufferbloat.net ]( mailto:cake@lists.bufferbloat.net )
 To unsubscribe send an email to [ cake-leave@lists.bufferbloat.net ]( mailto:cake-leave@lists.bufferbloat.net )

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Codel] Re: [Bloat] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
       [not found]         ` <FR2PPFEFD18174CA00474D0DC8DBDA3EE00DC0EA@FR2PPFEFD18174C.DEUP281.PROD.OUTLOOK.COM>
@ 2025-09-13 18:35           ` David P. Reed
  0 siblings, 0 replies; 15+ messages in thread
From: David P. Reed @ 2025-09-13 18:35 UTC (permalink / raw)
  To: BeckW; +Cc: tom, frantisek.borsik, cake, codel, bloat, rpm


I agree. The best solution, I think, is to move all the device management outside the kernel, and to compile the device abstractions into code that runs in user-level isolation.
 
On Wednesday, September 10, 2025 04:54, BeckW@telekom.de said:



> Interesting work! One problem of P4 is that the networking hardware varies so much
> in number of resources (queues, schedulers, policers, counters, table memory) that
> the code inevitably becomes tied to a certain system.
> It will be difficult to abstract the peculiarities of systems -- eg Broadcom 88800
> vs linux kernel -- in a good way.
> 
> Wolfgang
> 
> -----Original Message-----
> From: Tom Herbert via Bloat <bloat@lists.bufferbloat.net>
> Sent: Tuesday, 9 September 2025 23:37
> To: Frantisek Borsik <frantisek.borsik@gmail.com>
> Cc: David P. Reed <dpreed@deepplum.com>; Cake List
> <cake@lists.bufferbloat.net>; codel@lists.bufferbloat.net; bloat
> <bloat@lists.bufferbloat.net>; Jeremy Austin via Rpm
> <rpm@lists.bufferbloat.net>
> Subject: [Bloat] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert
> (almost to the date, 10 years after XDP was released)
> 
> On Tue, Sep 9, 2025, 5:03 PM Frantisek Borsik
> <frantisek.borsik@gmail.com>
> wrote:
> 
> > Thanks a lot, David.
> >
> > I have asked Tom if he wants to join us and he should be here to chat
> > with us now.
> >
> > All the best,
> >
> > Frank
> >
> > Frantisek (Frank) Borsik
> >
> >
> > *In loving memory of Dave Täht: *1965-2025
> >
> > https://libreqos.io/2025/04/01/in-loving-memory-of-dave/
> >
> >
> > https://www.linkedin.com/in/frantisekborsik
> >
> > Signal, Telegram, WhatsApp: +421919416714
> >
> > iMessage, mobile: +420775230885
> >
> > Skype: casioa5302ca
> >
> > frantisek.borsik@gmail.com
> >
> >
> > On Tue, Sep 9, 2025 at 10:25 PM David P. Reed
> <dpreed@deepplum.com> wrote:
> >
> > > Hi Frank -
> > >
> > >
> > >
> > > I think it is interesting as a concept. A project I am advising has
> > > been using DPDK very effectively to get rid of the huge path and
> > > locking
> > delays
> > > in the current Linux network stack. XDP2 could be supported in a
> > > ring3
> > > (user) address space, achieving a similar result.
> >
> 
> HI David,
> 
> The idea is you could write the code in XDP2 and it would be compiled to DPDK or
> eBPF and the compiler would handle the optimizations.
> 
> 
> > >
> > >
> > >
> > > But I don't think XDP2 is going that direction - so it may be
> > > stuckinto the mess of kernel space networking. Adding eBPF only has
> > > made this more
> > of
> > > a mess, by the way (and adding a new "compiler" that needs to be
> > > veriried as safe for the kernel).
> 
> 
> Think of XDP2 as the generalization of XDP to go beyond just the kernel.
> The idea is that the user writes their datapath code once and they compile it to
> run in whatever targets they have-- DPDK, P4, other programmable hardware, and yes
> XDP/eBPF. It's really not limited to kernel networking.
> 
> As for the name XDP2, when we created XDP, eXpress DataPath, my vision was that it
> would be implementation agnostic. eBPF was the first instantiation for
> practicality, but now ten years later I think we can realize the initial vision.
> 
> Tom
> 
> 
> 
> >
> > > I will be watching how this evolves.
> > >
> > >
> > >
> > > David
> > >
> > >
> > >
> > > On Tuesday, September 9, 2025 06:32, "Frantisek Borsik" <
> > > frantisek.borsik@gmail.com> said:
> > >
> > > > Hello to all,
> > > >
> > > > Looks interesting:
> > > >
> > >
> > https://medium.com/@tom_84912/xdp2-this-changes-everything-at-least-for-ai-ml-infrastructure-850c1ba82771
> > > >
> > > >
> > > > All the best,
> > > >
> > > > Frank
> > > >
> > > > Frantisek (Frank) Borsik
> > > >
> > > >
> > > > *In loving memory of Dave Täht: *1965-2025
> > > >
> > > > > https://libreqos.io/2025/04/01/in-loving-memory-of-dave/
> > > >
> > > >
> > > > > https://www.linkedin.com/in/frantisekborsik
> > > >
> > > > Signal, Telegram, WhatsApp: +421919416714
> > > >
> > > > iMessage, mobile: +420775230885
> > > >
> > > > Skype: casioa5302ca
> > > >
> > > > frantisek.borsik@gmail.com
> > > > _______________________________________________
> > > > Cake mailing list -- cake@lists.bufferbloat.net To unsubscribe
> > > > send an email to cake-leave@lists.bufferbloat.net
> > > >
> > >
> > _______________________________________________
> > Cake mailing list -- cake@lists.bufferbloat.net To unsubscribe send an
> > email to cake-leave@lists.bufferbloat.net
> >
> _______________________________________________
> Bloat mailing list -- bloat@lists.bufferbloat.net To unsubscribe send an email to
> bloat-leave@lists.bufferbloat.net
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Codel] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-13 18:33       ` [Codel] " David P. Reed
@ 2025-09-13 20:58         ` Tom Herbert
  2025-09-14 18:00           ` David P. Reed
  2025-09-15  8:39         ` [Codel] Re: [Bloat] " BeckW
  1 sibling, 1 reply; 15+ messages in thread
From: Tom Herbert @ 2025-09-13 20:58 UTC (permalink / raw)
  To: David P. Reed
  Cc: Frantisek Borsik, Cake List, codel, bloat, Jeremy Austin via Rpm

On Sat, Sep 13, 2025 at 1:33 PM David P. Reed <dpreed@deepplum.com> wrote:
>
> Tom -
>
>
>
> An architecture-independent network framework independent of the OS
> kernel's peculiarities seems within reach (though a fair bit of work), and
> I think it would be a GOOD THING indeed. IMHO the Linux networking stack in
> the kernel is a horrific mess, and it doesn't have to be.

Hi David,

Agreed. But I want to encompass programmable HW in the solution scope.

>
>
>
> The reason it doesn't have to be is that there should be no reason it
> cannot run in ring3/userland, just like DPDK. And it should be built using
> "real-time" userland programming techniques (avoiding the generic Linux
> scheduler). The ONLY reason for involving the scheduler would be because
> there aren't enough cores. Linux was designed to be a uniprocessor Unix,
> and that just is no longer true at all. With hyperthreading, too, one need
> never abandon a processor's context in userspace to run some "userland"
> application.

XDP/eBPF gets us most of the way to that. I like the idea that eBPF is a
modern day take on micro kernels.

>
> This would rip a huge amount of kernel code out of the kernel (at least
> 50%, and probably more). The security issues of all those 3rd party network
> drivers would go away.

That's exactly the direction I believe the kernel should go. Rip out kernel
code and replace it with eBPF. The result is a malleable kernel, and pieces
of it become sub-programs that can be independently run in userspace or in
programmable hardware. That's also the segue to finally solving the kernel
offloads mess that we've had for twenty years (except for a couple of
exceptions, all the efforts for kernel offload have been flops).

>
> And the performance would be much higher for networking. (Running in
> ring 3, especially if you don't make system calls, carries no performance
> penalty, and interprocessor communication using shared memory has much
> lower latency than Linux IPC or mutexes.)

Yes, performance improves when code lives directly on top of the queue.
It's even higher performance running in HW.
>
>
>
> I like the idea of a compilation based network stack, at a slightly
> higher level than C. eBPF is NOT what I have in mind - it's an interpreter
> with high overhead. The language should support high-performance
> co-routining - shared memory, ideally. I don't think GC is a good thing.
> Rust might be a good starting point because its memory management is safe.

IMO, we should let the user pick the language they want to use. It's
feasible as long as the programming model is supported.

> To me, some of what the base of DPDK is like is good stuff. However, it
> isn't architecturally neutral.

Yes, there are some good things in DPDK to adopt. Some nice things from P4 as
well. XDP2 unifies them and takes the best ideas from them.

>
> To me, the network stack should not be entangled with interrupt handling
> at all. "Polling" is far more performant under load. The only use for
> interrupts is when the network stack is completely idle. That would be, in
> userland, a "wait for interrupt" call (not a poll). Ideally, on recent
> Intel machines, a userspace version of MONITOR/MWAIT.
>

That's part of the reason why high performance networking in user space is
so hard. We have spent inordinate amounts of time worrying about isolation
or APIs to HW. All that goes away when we run the stack on bare metal (what
we do in CPU-in-the-datapath).

> Now I know that Linus and his crew are really NOT gonna like this. Linus
> is still thinking like MINIX, a uniprocessor time-sharing system with rich
> OS functions in the kernel and doing "file" reads and writes to communicate
> with the kernel state. But it is a much more modern way to think of
> real-time IO in a modern operating system. (Windows and macOS are also
> Unix-like, uniprocessor monolithic kernel designs).

Just hide everything behind eBPF when in the kernel and they'll be happy.
Outside of the kernel they won't care.

>
> So, if XDP2 got away from the Linux kernel, it could be great.

Yep, we need to go beyond the kernel.

Tom


> BTW, io_uring, etc. are half-measures. They address getting away from
> interrupts toward polling, but they still make the mistake of keeping huge
> drivers in the kernel.
>
>
>
>
>
> On Tuesday, September 9, 2025 17:36, "Tom Herbert" <tom@herbertland.com>
said:
>
>
>
> On Tue, Sep 9, 2025, 5:03 PM Frantisek Borsik <frantisek.borsik@gmail.com>
wrote:
>>
>> Thanks a lot, David.
>>
>> I have asked Tom if he wants to join us and he should be here to chat
with
>> us now.
>>
>> All the best,
>>
>> Frank
>>
>> Frantisek (Frank) Borsik
>>
>>
>> *In loving memory of Dave Täht: *1965-2025
>>
>> https://libreqos.io/2025/04/01/in-loving-memory-of-dave/
>>
>>
>> https://www.linkedin.com/in/frantisekborsik
>>
>> Signal, Telegram, WhatsApp: +421919416714
>>
>> iMessage, mobile: +420775230885
>>
>> Skype: casioa5302ca
>>
>> frantisek.borsik@gmail.com
>>
>>
>> On Tue, Sep 9, 2025 at 10:25 PM David P. Reed <dpreed@deepplum.com>
wrote:
>>
>> > Hi Frank -
>> >
>> >
>> >
>> > I think it is interesting as a concept. A project I am advising has
been
>> > using DPDK very effectively to get rid of the huge path and locking
delays
>> > in the current Linux network stack. XDP2 could be supported in a ring3
>> > (user) address space, achieving a similar result.
>
> HI David,
> The idea is you could write the code in XDP2 and it would be compiled to
DPDK or eBPF and the compiler would handle the optimizations.
>
>>
>> >
>> >
>> >
>> > But I don't think XDP2 is going that direction - so it may be stuckinto
>> > the mess of kernel space networking. Adding eBPF only has made this
more of
>> > a mess, by the way (and adding a new "compiler" that needs to be
veriried
>> > as safe for the kernel).
>
> Think of XDP2 as the generalization of XDP to go beyond just the kernel.
The idea is that the user writes their datapath code once and they compile
it to run in whatever targets they have-- DPDK, P4, other programmable
hardware, and yes XDP/eBPF. It's really not limited to kernel networking.
> As for the name XDP2, when we created XDP, eXpress DataPath, my vision
was that it would be implementation agnostic. eBPF was the first
instantiation for practicality, but now ten years later I think we can
realize the initial vision.
> Tom
>>
>> >
>> > I will be watching how this evolves.
>> >
>> >
>> >
>> > David
>> >
>> >
>> >
>> > On Tuesday, September 9, 2025 06:32, "Frantisek Borsik" <
>> > frantisek.borsik@gmail.com> said:
>> >
>> > > Hello to all,
>> > >
>> > > Looks interesting:
>> > >
>> >
https://medium.com/@tom_84912/xdp2-this-changes-everything-at-least-for-ai-ml-infrastructure-850c1ba82771
>> > >
>> > >
>> > > All the best,
>> > >
>> > > Frank
>> > >
>> > > Frantisek (Frank) Borsik
>> > >
>> > >
>> > > *In loving memory of Dave Täht: *1965-2025
>> > >
>> > > https://libreqos.io/2025/04/01/in-loving-memory-of-dave/
>> > >
>> > >
>> > > https://www.linkedin.com/in/frantisekborsik
>> > >
>> > > Signal, Telegram, WhatsApp: +421919416714
>> > >
>> > > iMessage, mobile: +420775230885
>> > >
>> > > Skype: casioa5302ca
>> > >
>> > > frantisek.borsik@gmail.com
>> > > _______________________________________________
>> > > Cake mailing list -- cake@lists.bufferbloat.net
>> > > To unsubscribe send an email to cake-leave@lists.bufferbloat.net
>> > >
>> >
>> _______________________________________________
>> Cake mailing list -- cake@lists.bufferbloat.net
>> To unsubscribe send an email to cake-leave@lists.bufferbloat.net

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Codel] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-13 20:58         ` Tom Herbert
@ 2025-09-14 18:00           ` David P. Reed
  2025-09-14 18:38             ` Tom Herbert
  0 siblings, 1 reply; 15+ messages in thread
From: David P. Reed @ 2025-09-14 18:00 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Frantisek Borsik, Cake List, codel, bloat, Jeremy Austin via Rpm


Tom -
 
Well, we may have to disagree on whether eBPF is a good language/system for writing networking stacks or interfacing to hardware devices.
 
In case I wasn't particularly clear about eBPF, here's a summary of that concern.


Perhaps the biggest drawback (beyond the fact that it is a terrible language) is that it's basically "sugared kretprobes": 1) Its "abstractions" are whatever the Linux (or other OS) kernel code design allows to be "exported" within the kernel. You can't, for example, do co-routines or IPC that would be appropriate for a clean network stack; you get just what Linus and crew decide to export symbols for. 2) Sadly, eBPF is primarily maintained by folks who merely want "hooks" in the kernel for performance analysis (the original BPF was for programming packet-processing pipelines). Its use for describing the implementation of full network stacks in a clean way, down to and including the semantics of, say, 802.11 devices or 802.2 devices, is a masterful hack, but very much tied to the Linux kernel's control structure quirks.
 
As a guy who's been developing operating systems since 1970, starting with Multics, and networking protocol implementations since 1976, I would never have thought of eBPF, or any language designed to kludge with random APIs produced by a group like the Linux kernel developers, as a basis. There are so many alternatives that are far better than what eBPF is.
 
Go ahead and do whatever experiment you have planned. You don't need my approval to base it on eBPF.
 
Now, understand that I'm not saying DPDK is the answer either (nor is io_uring). It's got a number of design drawbacks, too. The main benefit is that it runs all the code in isolation from the monolithic kernel called Linux, and allows real-time execution in ring3 with interprocessor communication. I would throw most of the rest of DPDK away. I would think that your goals would be best suited by stepping back a bit.

David
 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Codel] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-14 18:00           ` David P. Reed
@ 2025-09-14 18:38             ` Tom Herbert
  0 siblings, 0 replies; 15+ messages in thread
From: Tom Herbert @ 2025-09-14 18:38 UTC (permalink / raw)
  To: David P. Reed
  Cc: Frantisek Borsik, Cake List, codel, bloat, Jeremy Austin via Rpm

On Sun, Sep 14, 2025 at 11:00 AM David P. Reed <dpreed@deepplum.com> wrote:
>
> Tom -
>
>
>
> Well, we may have to disagree on whether eBPF is a good language/system for writing networking stacks or interfacing to hardware devices.

David,

To be clear, XDP2 is NOT eBPF. Neither is it DPDK or P4 or any other
execution environment. Neither is it a domain specific language like
P4 or Restricted-C.

It IS a programming model and API that allows dataplane code to be
written in the language of the user's choice and compiled to their
different targets, including DPDK, eBPF, P4, FPGA, etc. If we give the
user choices about the frontend language and backend targets, then all
these pedantic debates about eBPF vs. P4 vs. DPDK vs. whatever pretty
much become moot. Also, the prospect of users having to maintain
multiple code bases because they need to run in different execution
environments goes away.
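
The "write once, compile to several targets" idea can be illustrated with a toy sketch. This is NOT the XDP2 API: PARSE_GRAPH, interpret and compile_to_python are hypothetical names, the two "targets" are merely an interpreter and a generated-code backend inside one Python process, and the IPv4 header is assumed to carry no options (fixed 20 bytes).

```python
# Hypothetical target-neutral description of a parse graph. Each node gives
# its header length and, where it has successors, the offset and size of the
# "next protocol" field plus a table mapping field values to next nodes.
PARSE_GRAPH = {
    "ethernet": {"length": 14, "next_off": 12, "next_len": 2,
                 "next": {0x0800: "ipv4"}},
    "ipv4": {"length": 20, "next_off": 9, "next_len": 1,   # assumes no options
             "next": {6: "tcp", 17: "udp"}},
    "tcp": {"length": 20, "next": {}},
    "udp": {"length": 8, "next": {}},
}


def interpret(graph, packet, start="ethernet"):
    """Target 1: walk the description directly for each packet."""
    path, offset, node = [], 0, start
    while node is not None:
        spec = graph[node]
        if offset + spec["length"] > len(packet):
            break                               # truncated header: stop
        path.append((node, offset))
        nxt = None
        if spec["next"]:
            lo = offset + spec["next_off"]
            key = int.from_bytes(packet[lo:lo + spec["next_len"]], "big")
            nxt = spec["next"].get(key)
        offset += spec["length"]
        node = nxt
    return path


def compile_to_python(graph, start="ethernet"):
    """Target 2: emit specialised source from the same description."""
    src = []
    for name, spec in graph.items():
        src.append(f"def parse_{name}(packet, offset, path):")
        src.append(f"    if offset + {spec['length']} > len(packet):")
        src.append("        return path")
        src.append(f"    path.append(({name!r}, offset))")
        for key, target in spec["next"].items():
            lo = spec["next_off"]
            hi = lo + spec["next_len"]
            src.append(f"    if int.from_bytes(packet[offset + {lo}:"
                       f"offset + {hi}], 'big') == {key}:")
            src.append(f"        return parse_{target}(packet, "
                       f"offset + {spec['length']}, path)")
        src.append("    return path")
    namespace = {}
    exec("\n".join(src), namespace)             # "compile" the generated code
    return lambda packet: namespace[f"parse_{start}"](packet, 0, [])
```

Both backends consume the same PARSE_GRAPH, so a change to the protocol description is made once. A real toolchain would emit code for actual targets rather than Python source; the sketch only shows the shape of the single-source, multiple-backend split.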

Tom

>
>
>
> In case I wasn't particularly clear about eBPF, here's a summary of that concern.
>
> Perhaps the biggest drawback (beyond the fact that it is a terrible language) is that it's basically "sugared kretprobes": 1) Its "abstractions" are whatever the Linux (or other OS) kernel code design allows to be "exported" within the kernel. You can't, for example, do co-routines or IPC that would be appropriate for a clean network stack; you get just what Linus and crew decide to export symbols for. 2) Sadly, eBPF is primarily maintained by folks who merely want "hooks" in the kernel for performance analysis (the original BPF was for programming packet-processing pipelines). Its use for describing the implementation of full network stacks in a clean way, down to and including the semantics of, say, 802.11 devices or 802.2 devices, is a masterful hack, but very much tied to the Linux kernel's control structure quirks.
>
>
>
> As a guy who's been developing operating systems since 1970, starting with Multics, and networking protocol implementations since 1976, I would never have thought of eBPF, or any language designed to kludge with random APIs produced by a group like the Linux kernel developers, as a basis. There are so many alternatives that are far better than what eBPF is.
>
>
>
> Go ahead and do whatever experiment you have planned. You don't need my approval to base it on eBPF.
>
>
>
> Now, understand that I'm not saying DPDK is the answer either (nor is io_uring). It's got a number of design drawbacks, too. The main benefit is that it runs all the code in isolation from the monolithic kernel called Linux, and allows real-time execution in ring3 with interprocessor communication. I would throw most of the rest of DPDK away. I would think that your goals would be best suited by stepping back a bit.
>
> David
>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Codel] Re: [Bloat] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-13 18:33       ` [Codel] " David P. Reed
  2025-09-13 20:58         ` Tom Herbert
@ 2025-09-15  8:39         ` BeckW
  2025-09-15 15:16           ` Stephen Hemminger
  1 sibling, 1 reply; 15+ messages in thread
From: BeckW @ 2025-09-15  8:39 UTC (permalink / raw)
  To: dpreed, tom; +Cc: frantisek.borsik, cake, codel, bloat, rpm

Programming networking hardware is a bit like programming 8-bit computers in the 1980s: the hardware is often too limited and varied to support useful abstractions. This is also true for CPU-based networking once you get into the >10 Gbps realm, when caching and pipelining architectures become relevant. Writing a network protocol compiler that produces efficient code for different NICs and different CPUs is a daunting task. And unlike with 8-bit computers, there are no simple metrics (compare 'you need at least 32 KB of RAM to run this code' with 'this NIC supports 4k queues with PIE, CoDel' or 'this CPU has 20 MB of Intel SmartCache').

eBPF is very close to what was described in this 1995 exokernel paper (https://pdos.csail.mit.edu/6.828/2008/readings/engler95exokernel.pdf). The idea of the exokernel was to have easily loadable, verified code in the kernel (e.g. the security-critical task of assigning a packet to a user's session) and leave the rest of the protocol (e.g. TCP retransmissions) to user space. AFAIK few people use eBPF like this, but it should be possible.
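
The split described above can be sketched as follows: a small trusted demultiplexer decides only which session owns a packet, and everything else runs as ordinary user-space code. The names KernelDemux and UserSpaceReceiver are hypothetical, and the per-segment sequence numbers are a simplification of TCP's byte sequence space; nothing here is taken from the exokernel paper's code or from eBPF.

```python
from collections import deque


class KernelDemux:
    """The small, trusted part: map a flow to the session that owns it."""

    def __init__(self):
        self._owners = {}            # flow 5-tuple -> per-session queue

    def bind(self, flow):
        """Give a session exclusive ownership of a flow; return its queue."""
        if flow in self._owners:
            raise ValueError("flow already owned by another session")
        queue = deque()
        self._owners[flow] = queue
        return queue

    def deliver(self, flow, segment):
        """Hand a segment to its owner; return False (drop) if nobody owns it."""
        queue = self._owners.get(flow)
        if queue is None:
            return False
        queue.append(segment)
        return True


class UserSpaceReceiver:
    """The untrusted part: protocol logic such as reordering lives here."""

    def __init__(self, queue):
        self._queue = queue
        self._next_seq = 0           # next segment index expected in order
        self._pending = {}           # out-of-order segments held back

    def read(self):
        """Return payloads that are now in order; hold anything after a gap."""
        while self._queue:
            seq, payload = self._queue.popleft()
            if seq >= self._next_seq:            # ignore stale duplicates
                self._pending.setdefault(seq, payload)
        ready = []
        while self._next_seq in self._pending:
            ready.append(self._pending.pop(self._next_seq))
            self._next_seq += 1
        return ready
```

Only KernelDemux needs to be verified for isolation between sessions. A bug in UserSpaceReceiver can corrupt that session's own stream but cannot read or steal another session's packets.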

eBPF manages the abstraction part well, but sacrifices a lot of performance, e.g. through the lack of aggressive batching of the kind VPP / fd.io does. With DPDK, you often find out that your NIC's hardware or driver doesn't support the function that you hoped to use, and you end up optimizing for a particular piece of hardware. Even if driver and hardware support a functionality, it may very well be that hardware resources are too limited for your particular use case. The abstraction is there, but your code is still hardware-specific.

Wolfgang

-----Original Message-----
From: David P. Reed <dpreed@deepplum.com>
Sent: Saturday, 13 September 2025 22:33
To: Tom Herbert <tom@herbertland.com>
Cc: Frantisek Borsik <frantisek.borsik@gmail.com>; Cake List <cake@lists.bufferbloat.net>; codel@lists.bufferbloat.net; bloat <bloat@lists.bufferbloat.net>; Jeremy Austin via Rpm <rpm@lists.bufferbloat.net>
Subject: [Bloat] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)


Tom -

An architecture-independent network framework independent of the OS kernel's peculiarities seems within reach (though a fair bit of work), and I think it would be a GOOD THING indeed. IMHO the Linux networking stack in the kernel is a horrific mess, and it doesn't have to be.

The reason it doesn't have to be is that there should be no reason it cannot run in ring3/userland, just like DPDK. And it should be built using "real-time" userland programming techniques (avoiding the generic Linux scheduler). The ONLY reason for involving the scheduler would be that there aren't enough cores. Linux was designed to be a uniprocessor Unix, and that just is no longer true at all. With hyperthreading, too, one need never abandon a processor's context in userspace to run some "userland" application.

This would rip a huge amount of code out of the kernel (at least 50%, and probably more). The security issues of all those 3rd party network drivers would go away.

And the performance would be much higher for networking (running in ring 3, especially if you don't do system calls, carries no performance penalty, and interprocessor communication using shared memory has much lower latency than Linux IPC or mutexes).

I like the idea of a compilation-based network stack, at a slightly higher level than C. eBPF is NOT what I have in mind - it's an interpreter with high overhead. The language should support high-performance co-routining - shared memory, ideally. I don't think GC is a good thing. Rust might be a good starting point because its memory management is safe.
To me, some of what is at the base of DPDK is good stuff. However, it isn't architecturally neutral.

To me, the network stack should not be entangled with interrupt handling at all. "Polling" is far more performant under load. The only use for interrupts is when the network stack is completely idle. That would be, in userland, a "wait for interrupt" call (not a poll); ideally, on recent Intel machines, a userspace version of MONITOR/MWAIT.
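
A minimal Python sketch of that "poll under load, wait only when idle" loop follows. It is an illustration under stated assumptions, not DPDK or kernel code: RxRing and receive are hypothetical names, a threading.Event stands in for the interrupt (or MONITOR/MWAIT) wakeup, and spin_limit and budget are made-up tuning knobs.

```python
import threading
from collections import deque
from typing import Callable, Deque, Dict, List


class RxRing:
    """Toy receive ring; the Event stands in for a hardware wakeup."""

    def __init__(self) -> None:
        self._ring: Deque[bytes] = deque()
        self._wakeup = threading.Event()

    def push(self, pkt: bytes) -> None:
        # Producer side: enqueue first, then signal, so a woken consumer
        # always finds the packet when it polls again.
        self._ring.append(pkt)
        self._wakeup.set()

    def poll(self, budget: int) -> List[bytes]:
        # Non-blocking: take at most `budget` packets that are already there.
        batch: List[bytes] = []
        while self._ring and len(batch) < budget:
            batch.append(self._ring.popleft())
        return batch

    def wait(self, timeout: float) -> None:
        # Blocking "wait for interrupt", used only when the ring is idle.
        self._wakeup.wait(timeout)
        self._wakeup.clear()


def receive(ring: RxRing, handle: Callable[[bytes], None], expected: int,
            spin_limit: int = 100, budget: int = 32) -> Dict[str, int]:
    """Handle `expected` packets; return how often we polled and waited."""
    stats = {"polls": 0, "waits": 0}
    idle_spins = 0
    received = 0
    while received < expected:
        batch = ring.poll(budget)
        stats["polls"] += 1
        if batch:
            idle_spins = 0
            for pkt in batch:
                handle(pkt)
            received += len(batch)
        else:
            idle_spins += 1
            if idle_spins >= spin_limit:   # idle long enough: stop spinning
                ring.wait(timeout=1.0)
                stats["waits"] += 1
                idle_spins = 0
    return stats
```

When packets are already queued, the loop never touches the wakeup path; it falls back to waiting only after spin_limit consecutive empty polls.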

Now I know that Linus and his crew are really NOT gonna like this. Linus is still thinking like MINIX, a uniprocessor time-sharing system with rich OS functions in the kernel and doing "file" reads and writes to communicate with the kernel state. But it is a much more modern way to think of real-time IO in a modern operating system. (Windows and macOS are also Unix-like, uniprocessor monolithic kernel designs).

So, if XDP2 got away from the Linux kernel, it could be great.
BTW, io_uring, etc. are half-measures. They address getting away from interrupts toward polling, but they still make the mistake of keeping huge drivers in the kernel.


On Tuesday, September 9, 2025 17:36, "Tom Herbert" <tom@herbertland.com> said:

On Tue, Sep 9, 2025, 5:03 PM Frantisek Borsik <frantisek.borsik@gmail.com> wrote:

 Thanks a lot, David.

 I have asked Tom if he wants to join us and he should be here to chat with us now.

 All the best,

 Frank


 On Tue, Sep 9, 2025 at 10:25 PM David P. Reed <dpreed@deepplum.com> wrote:

 > Hi Frank -
 >
 > I think it is interesting as a concept. A project I am advising has been
 > using DPDK very effectively to get rid of the huge path and locking delays
 > in the current Linux network stack. XDP2 could be supported in a ring3
 > (user) address space, achieving a similar result.

Hi David,
The idea is you could write the code in XDP2 and it would be compiled to DPDK or eBPF and the compiler would handle the optimizations.

 > But I don't think XDP2 is going that direction - so it may be stuck in
 > the mess of kernel space networking. Adding eBPF only has made this more of
 > a mess, by the way (and adding a new "compiler" that needs to be verified
 > as safe for the kernel).

Think of XDP2 as the generalization of XDP to go beyond just the kernel. The idea is that the user writes their datapath code once and they compile it to run in whatever targets they have: DPDK, P4, other programmable hardware, and yes, XDP/eBPF. It's really not limited to kernel networking.
As for the name XDP2, when we created XDP, eXpress DataPath, my vision was that it would be implementation agnostic. eBPF was the first instantiation for practicality, but now ten years later I think we can realize the initial vision.
Tom

 > I will be watching how this evolves.
 >
 > David
 >
 > On Tuesday, September 9, 2025 06:32, "Frantisek Borsik" <frantisek.borsik@gmail.com> said:
 >
 > > Hello to all,
 > >
 > > Looks interesting:
 > > https://medium.com/@tom_84912/xdp2-this-changes-everything-at-least-for-ai-ml-infrastructure-850c1ba82771
 > >
 > > All the best,
 > >
 > > Frank


* [Codel] Re: [Bloat] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-15  8:39         ` [Codel] Re: [Bloat] " BeckW
@ 2025-09-15 15:16           ` Stephen Hemminger
  2025-09-15 18:07             ` Frantisek Borsik
  0 siblings, 1 reply; 15+ messages in thread
From: Stephen Hemminger @ 2025-09-15 15:16 UTC (permalink / raw)
  To: BeckW--- via Bloat; +Cc: BeckW, dpreed, tom, frantisek.borsik, cake, codel, rpm

On Mon, 15 Sep 2025 08:39:48 +0000
BeckW--- via Bloat <bloat@lists.bufferbloat.net> wrote:

> Programming networking hardware is a bit like programming 8-bit computers in the 1980s: the hardware is often too limited and too varied to support useful abstractions. This is also true for CPU-based networking once you get into the >10 Gbps realm, where caching and pipelining architectures become relevant. Writing a network protocol compiler that produces efficient code for different NICs and different CPUs is a daunting task. And unlike with 8-bit computers, there are no simple metrics ('you need at least 32 KB of RAM to run this code' vs. 'this NIC supports 4k queues with PIE, CoDel', 'this CPU has 20 MB of Intel SmartCache').

The Linux kernel still lacks an easy way to set up many features in SmartNICs. DPDK has rte_flow, which allows direct
access to hardware flow processing. But DPDK lacks any reasonable form of shaper control.

> eBPF is very close to what was described in this 1995 exokernel paper (https://pdos.csail.mit.edu/6.828/2008/readings/engler95exokernel.pdf). The idea of the exokernel was to have easily loadable, verified code in the kernel (e.g. the security-critical task of assigning a packet to a user's session) and to leave the rest of the protocol (e.g. TCP retransmissions) to user space. AFAIK few people use eBPF like this, but it should be possible.
>
> eBPF manages the abstraction part well, but sacrifices a lot of performance, e.g. it lacks the aggressive batching that VPP / fd.io does. With DPDK, you often find out that your NIC's hardware or driver doesn't support the function you hoped to use, and you end up optimizing for one particular piece of hardware. Even if the driver and hardware support a feature, the hardware resources may well be too limited for your particular use case. The abstraction is there, but your code is still hardware-specific.

There were a few NICs that offloaded eBPF, but they never really went mainstream.

> -----Original Message-----
> From: David P. Reed <dpreed@deepplum.com>
> Sent: Saturday, 13 September 2025 22:33
> To: Tom Herbert <tom@herbertland.com>
> Cc: Frantisek Borsik <frantisek.borsik@gmail.com>; Cake List <cake@lists.bufferbloat.net>; codel@lists.bufferbloat.net; bloat <bloat@lists.bufferbloat.net>; Jeremy Austin via Rpm <rpm@lists.bufferbloat.net>
> Subject: [Bloat] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
> 
> 
> Tom -
> 
> An architecture-independent network framework independent of the OS kernel's peculiarities seems within reach (though a fair bit of work), and I think it would be a GOOD THING indeed. IMHO the Linux networking stack in the kernel is a horrific mess, and it doesn't have to be.
> 
> The reason it doesn't have to be is that there should be no reason it cannot run in ring3/userland, just like DPDK. And it should be built using "real-time" userland programming techniques (avoiding the generic Linux scheduler). The ONLY reason for involving the scheduler would be that there aren't enough cores. Linux was designed to be a uniprocessor Unix, and that just is no longer true at all. With hyperthreading, too, one need never abandon a processor's context in userspace to run some "userland" application.
>
> This would rip a huge amount of code out of the kernel (at least 50%, and probably more). The security issues of all those 3rd party network drivers would go away.
>
> And the performance would be much higher for networking (running in ring 3, especially if you don't do system calls, carries no performance penalty, and interprocessor communication using shared memory has much lower latency than Linux IPC or mutexes).
>
> I like the idea of a compilation-based network stack, at a slightly higher level than C. eBPF is NOT what I have in mind - it's an interpreter with high overhead. The language should support high-performance co-routining - shared memory, ideally. I don't think GC is a good thing. Rust might be a good starting point because its memory management is safe.
> To me, some of what is at the base of DPDK is good stuff. However, it isn't architecturally neutral.
>
> To me, the network stack should not be entangled with interrupt handling at all. "Polling" is far more performant under load. The only use for interrupts is when the network stack is completely idle. That would be, in userland, a "wait for interrupt" call (not a poll); ideally, on recent Intel machines, a userspace version of MONITOR/MWAIT.
> 
> Now I know that Linus and his crew are really NOT gonna like this. Linus is still thinking like MINIX, a uniprocessor time-sharing system with rich OS functions in the kernel and doing "file" reads and writes to communicate with the kernel state. But it is a much more modern way to think of real-time IO in a modern operating system. (Windows and macOS are also Unix-like, uniprocessor monolithic kernel designs).
> 
> So, if XDP2 got away from the Linux kernel, it could be great.
> BTW, io_uring, etc. are half-measures. They address getting away from interrupts toward polling, but they still make the mistake of keeping huge drivers in the kernel.

DPDK already supports use of XDP as a way to do userspace networking.
It is a good generic way to get packets in/out, but the dedicated userspace drivers allow
for more access to hardware. The XDP abstraction gets in the way of little things like programming
VLANs, etc.

The tradeoff is userspace networking works great for infrastructure, routers, switches, firewalls etc;
but userspace networking for network stacks to applications is hard to do, and loses the isolation
that the kernel provides.

> > I think it is interesting as a concept. A project I am advising has been
> > using DPDK very effectively to get rid of the huge path and locking delays
> > in the current Linux network stack. XDP2 could be supported in a ring3
> > (user) address space, achieving a similar result.
>
> Hi David,
> The idea is you could write the code in XDP2 and it would be compiled to DPDK or eBPF and the compiler would handle the optimizations.
>
> > But I don't think XDP2 is going that direction - so it may be stuck in
> > the mess of kernel space networking. Adding eBPF only has made this more of
> > a mess, by the way (and adding a new "compiler" that needs to be verified
> > as safe for the kernel).
>
> Think of XDP2 as the generalization of XDP to go beyond just the kernel. The idea is that the user writes their datapath code once and they compile it to run in whatever targets they have: DPDK, P4, other programmable hardware, and yes, XDP/eBPF. It's really not limited to kernel networking.
> As for the name XDP2, when we created XDP, eXpress DataPath, my vision was that it would be implementation agnostic. eBPF was the first instantiation for practicality, but now ten years later I think we can realize the initial vision.
> Tom


At this point, different network architectures are focused on different use cases.
The days of the one-size-fits-all networking of BSD Unix are over.


* [Codel] Re: [Bloat] Re: [Cake] Re: XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released)
  2025-09-15 15:16           ` Stephen Hemminger
@ 2025-09-15 18:07             ` Frantisek Borsik
  0 siblings, 0 replies; 15+ messages in thread
From: Frantisek Borsik @ 2025-09-15 18:07 UTC (permalink / raw)
  To: stephen; +Cc: BeckW--- via Bloat, BeckW, dpreed, tom, cake, codel, rpm

"There were a few NIC's that offloaded eBPF but they never really went
mainstream."

And even then, they were doing only 40 Gbps, like https://netronome.com, and
didn't even support full eBPF...

"They only support a pretty small subset of eBPF (in particular they don't
support the LPM map type, which was our biggest performance pain point),
and have a pretty cool user replaceable firmware system. They also don't
have the higher speeds - above 40 Gbps - where the offloading would be most
useful."
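
For readers who haven't met it: LPM is longest-prefix match, which the kernel exposes as the BPF_MAP_TYPE_LPM_TRIE map type. The Python sketch below only shows what such a lookup computes. It is an assumption-laden illustration, not the kernel's data structure: it uses one hash table per prefix length instead of a trie, handles IPv4 only, and LpmTable, insert and lookup are made-up names.

```python
import ipaddress
from typing import Dict, Optional


class LpmTable:
    """IPv4 longest-prefix match using one dict per prefix length."""

    def __init__(self) -> None:
        # prefix length -> {network address as int -> value}
        self._by_len: Dict[int, Dict[int, str]] = {}

    def insert(self, cidr: str, value: str) -> None:
        # IPv4Network raises ValueError if host bits are set in `cidr`.
        net = ipaddress.IPv4Network(cidr)
        table = self._by_len.setdefault(net.prefixlen, {})
        table[int(net.network_address)] = value

    def lookup(self, addr: str) -> Optional[str]:
        ip = int(ipaddress.IPv4Address(addr))
        # Try the most specific prefixes first; the first hit is the answer.
        for plen in sorted(self._by_len, reverse=True):
            mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            key = ip & mask
            if key in self._by_len[plen]:
                return self._by_len[plen][key]
        return None
```

With 10.0.0.0/8 and 10.1.0.0/16 both inserted, a lookup of 10.1.2.3 returns the /16 entry because it is the longer match. Each lookup costs up to one hash probe per distinct prefix length in the table, which hints at why this map type is a hot spot worth offloading.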

Btw, Tom will be at FLOSS Weekly tomorrow (Tuesday), 12:20 EDT / 11:20 CDT
/ 10:20 MDT / 9:20 PDT

https://www.youtube.com/live/OBW5twvmHOI


All the best,

Frank



end of thread, other threads:[~2025-09-15 18:07 UTC | newest]

Thread overview: 15+ messages
2025-09-09 10:32 [Codel] XDP2 is here - from one and only Tom Herbert (almost to the date, 10 years after XDP was released) Frantisek Borsik
2025-09-09 20:25 ` [Codel] Re: [Cake] " David P. Reed
2025-09-09 21:02   ` Frantisek Borsik
2025-09-09 21:36     ` [Codel] Re: [Cake] " Tom Herbert
2025-09-10  8:54       ` [Codel] Re: [Bloat] " BeckW
2025-09-10 13:59         ` Tom Herbert
2025-09-10 14:06           ` Tom Herbert
2025-09-13 18:33       ` [Codel] " David P. Reed
2025-09-13 20:58         ` Tom Herbert
2025-09-14 18:00           ` David P. Reed
2025-09-14 18:38             ` Tom Herbert
2025-09-15  8:39         ` [Codel] Re: [Bloat] " BeckW
2025-09-15 15:16           ` Stephen Hemminger
2025-09-15 18:07             ` Frantisek Borsik
     [not found]       ` <FR2PPFEFD18174CA00474D0DC8DBDA3EE00DC0EA@FR2PPFEFD18174C.DEUP281.PROD.OUTLOOK.COM>
     [not found]         ` <FR2PPFEFD18174CA00474D0DC8DBDA3EE00DC0EA@FR2PPFEFD18174C.DEUP281.PROD.OUTLOOK.COM>
2025-09-13 18:35           ` David P. Reed
