* [Cerowrt-devel] fq_pie for linux @ 2018-12-05 22:27 Dave Taht 2018-12-06 7:50 ` Toke Høiland-Jørgensen 0 siblings, 1 reply; 12+ messages in thread From: Dave Taht @ 2018-12-05 22:27 UTC (permalink / raw) To: cerowrt-devel https://github.com/gautamramk/FQ-PIE-for-Linux-Kernel/issues/2 -- Dave Täht CTO, TekLibre, LLC http://www.teklibre.com Tel: 1-831-205-9740 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Cerowrt-devel] fq_pie for linux 2018-12-05 22:27 [Cerowrt-devel] fq_pie for linux Dave Taht @ 2018-12-06 7:50 ` Toke Høiland-Jørgensen 2018-12-06 20:03 ` Dave Taht 2018-12-11 18:32 ` Aaron Wood 0 siblings, 2 replies; 12+ messages in thread From: Toke Høiland-Jørgensen @ 2018-12-06 7:50 UTC (permalink / raw) To: Dave Taht, cerowrt-devel Dave Taht <dave.taht@gmail.com> writes: > https://github.com/gautamramk/FQ-PIE-for-Linux-Kernel/issues/2 With all the variants of fq+AQM, maybe decoupling the FQ part and the AQM part would be worthwhile, instead of reimplementing it for each variant... -Toke ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Cerowrt-devel] fq_pie for linux 2018-12-06 7:50 ` Toke Høiland-Jørgensen @ 2018-12-06 20:03 ` Dave Taht 2018-12-06 19:13 ` David Lang 2018-12-11 18:32 ` Aaron Wood 1 sibling, 1 reply; 12+ messages in thread From: Dave Taht @ 2018-12-06 20:03 UTC (permalink / raw) To: Toke Høiland-Jørgensen; +Cc: Dave Taht, cerowrt-devel, edumazet Toke Høiland-Jørgensen <toke@toke.dk> writes: > Dave Taht <dave.taht@gmail.com> writes: > >> https://github.com/gautamramk/FQ-PIE-for-Linux-Kernel/issues/2 > > With all the variants of fq+AQM, maybe decoupling the FQ part and the > AQM part would be worthwhile, instead of reimplementing it for each > variant... I actually sat down to write a userspace implementation of the fq bits in C with a pluggable AQM a while back. I called it "drrp". I think that there are many applications today that do too much processing per packet, and end up with their recv socket buffer overflowing (ENOBUFS) and tail-dropping in the kernel. I've certainly seen this with babeld, in particular. So by putting an intervening layer around the udp recv call to keep calling that as fast as possible, and try to FQ and AQM the result, I thought we'd get better fairness between different flows over udp and a smarter means of shedding load when that was happening. Then... there was all this activity recently around other approaches to the udp problem in the kernel, and I gave up while that got sorted out. (I'd rather like a setsockopt that sorted packets in the recv socket buffer and head-dropped... ) While trying to work entirely from memory, using things like queue.h's TAILQ macros... it ended up looking almost exactly like eric's code (because that's so perfect! :)) and what I really needed (for babel) was a version that was BSD-licensed. So I figured I'd look hard at the freeBSD version and try to write from that... or a hard look at cake... and never got back to it. I guess we could ask eric if he'd mind if we just ported fq_codel to userspace and relicensed. 
I wouldn't mind Go and Rust versions while we're at it...

The only difference in what I wrote was that I never liked the "search all the queues on overload" bit in fq_codel and just did that inline. This looked to work well with the bulk dropper.

^ permalink raw reply [flat|nested] 12+ messages in thread
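The userspace design described in the message above (per-flow queues, a pluggable AQM, dropping on overload) can be sketched roughly as follows. This is not the "drrp" code, which does not appear in the thread; it is an illustrative Python sketch. The names `FairQueue`, `classify`, `aqm_should_drop`, `quantum` and `limit` are all assumptions, and remembering the fattest flow at enqueue time is only one possible reading of doing the overload search "inline".

```python
from collections import deque


class Flow:
    """Per-flow state: a FIFO of packets, a DRR byte deficit, and a backlog."""

    def __init__(self):
        self.packets = deque()
        self.deficit = 0
        self.backlog = 0      # bytes currently queued
        self.active = False   # True while the flow sits in the round-robin list


class FairQueue:
    """Deficit round robin over per-flow queues with a pluggable AQM hook.

    classify(pkt) returns a hashable flow key (hypothetical, caller supplied).
    aqm_should_drop(flow, pkt) returns True to drop at dequeue (hypothetical).
    quantum must be positive; limit is the maximum number of queued packets.
    """

    def __init__(self, classify, aqm_should_drop=None, quantum=1500, limit=1024):
        self.classify = classify
        self.aqm_should_drop = aqm_should_drop or (lambda flow, pkt: False)
        self.quantum = quantum
        self.limit = limit
        self.flows = {}
        self.rr = deque()     # backlogged flows in round-robin order
        self.count = 0
        self.drops = 0
        self.fattest = None   # approximate: only updated on enqueue

    def enqueue(self, pkt):
        flow = self.flows.setdefault(self.classify(pkt), Flow())
        flow.packets.append(pkt)
        flow.backlog += len(pkt)
        self.count += 1
        if not flow.active:
            flow.active = True
            flow.deficit = 0
            self.rr.append(flow)
        # Track the biggest flow as packets arrive instead of scanning all queues.
        if self.fattest is None or flow.backlog >= self.fattest.backlog:
            self.fattest = flow
        if self.count > self.limit:
            # Head-drop from the remembered fattest flow, else from this one.
            victim = self.fattest if self.fattest.packets else flow
            self._remove_head(victim)
            self.drops += 1

    def _remove_head(self, flow):
        pkt = flow.packets.popleft()
        flow.backlog -= len(pkt)
        self.count -= 1
        return pkt

    def dequeue(self):
        while self.rr:
            flow = self.rr[0]
            if not flow.packets:
                self.rr.popleft()          # flow drained: leave the rotation
                flow.active = False
                continue
            if flow.deficit < len(flow.packets[0]):
                flow.deficit += self.quantum
                self.rr.rotate(-1)         # out of credit: go to the back
                continue
            pkt = self._remove_head(flow)
            if self.aqm_should_drop(flow, pkt):
                self.drops += 1            # the AQM chose to drop this packet
                continue
            flow.deficit -= len(pkt)
            return pkt
        return None
```

In use, an application would call `enqueue()` straight after each receive and `dequeue()` when it has time to process a packet. A CoDel- or PIE-style policy could be passed as `aqm_should_drop` without touching the scheduler, which is the decoupling suggested earlier in the thread.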
* Re: [Cerowrt-devel] fq_pie for linux 2018-12-06 20:03 ` Dave Taht @ 2018-12-06 19:13 ` David Lang 2018-12-06 20:21 ` Dave Taht 0 siblings, 1 reply; 12+ messages in thread From: David Lang @ 2018-12-06 19:13 UTC (permalink / raw) To: Dave Taht; +Cc: Toke Høiland-Jørgensen, edumazet, cerowrt-devel [-- Attachment #1: Type: TEXT/PLAIN, Size: 1426 bytes --] On Thu, 6 Dec 2018, Dave Taht wrote: > Toke Høiland-Jørgensen <toke@toke.dk> writes: > >> Dave Taht <dave.taht@gmail.com> writes: >> >>> https://github.com/gautamramk/FQ-PIE-for-Linux-Kernel/issues/2 >> >> With all the variants of fq+AQM, maybe decoupling the FQ part and the >> AQM part would be worthwhile, instead of reimplementing it for each >> variant... > > I actually sat down to write a userspace implementation of the fq bits > in C with a pluggable AQM a while back. I called it "drrp". > > I think that there are many applications today that do too much > processing per packet, and end up with their recv socket buffer > overflowing (ENOBUFS) and tail-dropping in the kernel. I've certainly > seen this with babeld, in particular. > > So by putting an intervening layer around the udp recv call to keep > calling that as fast as possible, and try to FQ and AQM the result, I > thought we'd get better fairness between different flows over udp and a > smarter means of shedding load when that was happening. > > Then... there was all this activity recently around other approaches to > the udp problem in the kernel, and I gave up while that got sorted out. one of these is (IIRC) mmreceive, which lets the app get all the pending packets from the Linux UDP stack in one systemcall rather than having to make one syscall per packet. In rsyslog this is a significant benefit at high packet rates. David Lang ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Cerowrt-devel] fq_pie for linux 2018-12-06 19:13 ` David Lang @ 2018-12-06 20:21 ` Dave Taht 0 siblings, 0 replies; 12+ messages in thread From: Dave Taht @ 2018-12-06 20:21 UTC (permalink / raw) To: David Lang; +Cc: Dave Täht, cerowrt-devel, Eric Dumazet On Thu, Dec 6, 2018 at 12:13 PM David Lang <david@lang.hm> wrote: > > On Thu, 6 Dec 2018, Dave Taht wrote: > > > Toke Høiland-Jørgensen <toke@toke.dk> writes: > > > >> Dave Taht <dave.taht@gmail.com> writes: > >> > >>> https://github.com/gautamramk/FQ-PIE-for-Linux-Kernel/issues/2 > >> > >> With all the variants of fq+AQM, maybe decoupling the FQ part and the > >> AQM part would be worthwhile, instead of reimplementing it for each > >> variant... > > > > I actually sat down to write a userspace implementation of the fq bits > > in C with a pluggable AQM a while back. I called it "drrp". > > > > I think that there are many applications today that do too much > > processing per packet, and end up with their recv socket buffer > > overflowing (ENOBUFS) and tail-dropping in the kernel. I've certainly > > seen this with babeld, in particular. > > > > So by putting an intervening layer around the udp recv call to keep > > calling that as fast as possible, and try to FQ and AQM the result, I > > thought we'd get better fairness between different flows over udp and a > > smarter means of shedding load when that was happening. > > > > Then... there was all this activity recently around other approaches to > > the udp problem in the kernel, and I gave up while that got sorted out. > > one of these is (IIRC) mmreceive, which lets the app get all the pending packets > from the Linux UDP stack in one systemcall rather than having to make one > syscall per packet. In rsyslog this is a significant benefit at high packet > rates. Are you referring to recvmmsg? That's always been problematic (it would block), but I keep hoping it's been fixed. 
While I'm kvetching about UDP and userspace, I would love for the equivalent of high and low watermarks to work. select() can return when there is only 1 byte of space available, so you have to retry until there's room for a whole packet. I'm totally incapable of writing such a thing; I find these bits of the kernel impenetrable.

-- Dave Täht CTO, TekLibre, LLC http://www.teklibre.com Tel: 1-831-205-9740

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Cerowrt-devel] fq_pie for linux 2018-12-06 7:50 ` Toke Høiland-Jørgensen 2018-12-06 20:03 ` Dave Taht @ 2018-12-11 18:32 ` Aaron Wood 2018-12-11 18:37 ` Dave Taht 2018-12-11 18:38 ` Jonathan Morton 1 sibling, 2 replies; 12+ messages in thread From: Aaron Wood @ 2018-12-11 18:32 UTC (permalink / raw) To: Toke Høiland-Jørgensen; +Cc: Dave Taht, cerowrt-devel [-- Attachment #1: Type: text/plain, Size: 978 bytes --] On Wed, Dec 5, 2018 at 11:51 PM Toke Høiland-Jørgensen <toke@toke.dk> wrote: > Dave Taht <dave.taht@gmail.com> writes: > > > https://github.com/gautamramk/FQ-PIE-for-Linux-Kernel/issues/2 > > With all the variants of fq+AQM, maybe decoupling the FQ part and the > AQM part would be worthwhile, instead of reimplementing it for each > variant... > That's a great idea, Toke. There are a lot of places where I think it could work well, especially if it took a pluggable hash function for the hashing (at which point it's very general-purpose, and works on all sorts of different kinds of packets and workloads). That would let it be used for userspace VPN links (as an example), or within QUIC (or similar), where the kernel can't see the embedded flows that are hidden by the TLS encryption. And having it pluggable in the kernel would also allow IPSec to work without bloat (last I checked it was horribly bufferbloated, but that was ~5 years ago). [-- Attachment #2: Type: text/html, Size: 1424 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
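A pluggable hash function of the kind suggested above might look something like this in userspace. It is a sketch only: `make_classifier`, `crc32_hash`, `keyed_blake2_hash` and `inner_stream_key` are hypothetical names, and the packet is modelled as a plain dict with made-up `conn_id` and `stream_id` fields standing in for whatever inner flow identity a VPN or QUIC-like application can see after decryption.

```python
import hashlib
import zlib


def make_classifier(extract_key, hash_fn, nbuckets=1024):
    """Build a packet -> bucket function from two pluggable pieces.

    extract_key(pkt) returns the bytes that identify a flow.
    hash_fn(key) returns a non-negative integer.
    """
    def classify(pkt):
        return hash_fn(extract_key(pkt)) % nbuckets
    return classify


def crc32_hash(key: bytes) -> int:
    """Cheap, unkeyed hash; fine when flow keys are not attacker controlled."""
    return zlib.crc32(key)


def keyed_blake2_hash(secret: bytes):
    """Keyed hash, so that outsiders cannot easily aim flows at one bucket."""
    def hash_fn(key: bytes) -> int:
        digest = hashlib.blake2b(key, key=secret, digest_size=8).digest()
        return int.from_bytes(digest, "big")
    return hash_fn


def inner_stream_key(pkt: dict) -> bytes:
    """Hypothetical key extractor for an application that sees inner streams."""
    return b"%d:%d" % (pkt["conn_id"], pkt["stream_id"])
```

Swapping `crc32_hash` for `keyed_blake2_hash(secret)` changes nothing else in the queueing code, which is the point of making the hash a parameter.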
* Re: [Cerowrt-devel] fq_pie for linux 2018-12-11 18:32 ` Aaron Wood @ 2018-12-11 18:37 ` Dave Taht 2018-12-11 18:38 ` Dave Taht 2018-12-11 18:38 ` Jonathan Morton 1 sibling, 1 reply; 12+ messages in thread From: Dave Taht @ 2018-12-11 18:37 UTC (permalink / raw) To: Aaron Wood; +Cc: Toke Høiland-Jørgensen, cerowrt-devel On Tue, Dec 11, 2018 at 10:32 AM Aaron Wood <woody77@gmail.com> wrote: > > On Wed, Dec 5, 2018 at 11:51 PM Toke Høiland-Jørgensen <toke@toke.dk> wrote: >> >> Dave Taht <dave.taht@gmail.com> writes: >> >> > https://github.com/gautamramk/FQ-PIE-for-Linux-Kernel/issues/2 >> >> With all the variants of fq+AQM, maybe decoupling the FQ part and the >> AQM part would be worthwhile, instead of reimplementing it for each >> variant... > > > That's a great idea, Toke. There are a lot of places where I think it could work well, especially if it took a pluggable hash function for the hashing (at which point it's very general-purpose, and works on all sorts of different kinds of packets and workloads). That would let it be used for userspace VPN links (as an example), or within QUIC (or similar), where the kernel can't see the embedded flows that are hidden by the TLS encryption. > > And having it pluggable in the kernel would also allow IPSec to work without bloat (last I checked it was horribly bufferbloated, but that was ~5 years ago). ipsec terminating on the router was made to work beautifully with fq_codel with this commit, below. Before: http://www.taht.net/~d/ipsec_fq_codel/oldqos.png After: http://www.taht.net/~d/ipsec_fq_codel/newqos.png It's why we keep hoping to do the same thing to wireguard. commit 264b87fa617e758966108db48db220571ff3d60e Author: Andrew Collins <acollins@cradlepoint.com> Date: Wed Jan 18 14:04:28 2017 -0700 fq_codel: Avoid regenerating skb flow hash unless necessary The fq_codel qdisc currently always regenerates the skb flow hash. 
This wastes some cycles and prevents flow separation in cases where the traffic has been encrypted and can no longer be understood by the flow dissector. Change it to use the preexisting flow hash if one exists, and only regenerate if necessary. -- Dave Täht CTO, TekLibre, LLC http://www.teklibre.com Tel: 1-831-205-9740 ^ permalink raw reply [flat|nested] 12+ messages in thread
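The behaviour the commit message describes (keep a flow hash that already exists, compute one only when it is missing) can be illustrated with a toy model. This is not the kernel code: `Packet`, `dissect`, `get_flow_hash` and `encrypt` are hypothetical stand-ins, the "flow header" is simply the first four bytes, and the sketch assumes the hash was computed before the packet was encrypted.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    data: bytes
    flow_hash: Optional[int] = None   # cached hash, loosely like one kept on an skb


def dissect(pkt: Packet) -> int:
    """Stand-in flow dissector: hash the first four bytes as the flow header."""
    return zlib.crc32(pkt.data[:4])


def get_flow_hash(pkt: Packet) -> int:
    """Reuse the cached hash if present; only dissect when there is none."""
    if pkt.flow_hash is None:
        pkt.flow_hash = dissect(pkt)
    return pkt.flow_hash


def encrypt(pkt: Packet) -> Packet:
    """Toy 'tunnel': every packet gets the same outer header, hash carried over."""
    scrambled = bytes(b ^ 0x5A for b in pkt.data)
    return Packet(data=b"ESP:" + scrambled, flow_hash=pkt.flow_hash)
```

After `encrypt()`, running `dissect()` again would see only the shared outer header and put every flow in the same bucket, whereas `get_flow_hash()` still returns the distinct values computed from the cleartext.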
* Re: [Cerowrt-devel] fq_pie for linux 2018-12-11 18:37 ` Dave Taht @ 2018-12-11 18:38 ` Dave Taht 0 siblings, 0 replies; 12+ messages in thread From: Dave Taht @ 2018-12-11 18:38 UTC (permalink / raw) To: Aaron Wood; +Cc: Toke Høiland-Jørgensen, cerowrt-devel On Tue, Dec 11, 2018 at 10:37 AM Dave Taht <dave.taht@gmail.com> wrote: > > On Tue, Dec 11, 2018 at 10:32 AM Aaron Wood <woody77@gmail.com> wrote: > > > > On Wed, Dec 5, 2018 at 11:51 PM Toke Høiland-Jørgensen <toke@toke.dk> wrote: > >> > >> Dave Taht <dave.taht@gmail.com> writes: > >> > >> > https://github.com/gautamramk/FQ-PIE-for-Linux-Kernel/issues/2 > >> > >> With all the variants of fq+AQM, maybe decoupling the FQ part and the > >> AQM part would be worthwhile, instead of reimplementing it for each > >> variant... > > > > > > That's a great idea, Toke. There are a lot of places where I think it could work well, especially if it took a pluggable hash function for the hashing (at which point it's very general-purpose, and works on all sorts of different kinds of packets and workloads). That would let it be used for userspace VPN links (as an example), or within QUIC (or similar), where the kernel can't see the embedded flows that are hidden by the TLS encryption. I really would like us to have reference userspace versions. Also, in userspace, sse based hashing as in spookyhash or city hash might be faster than jenkins. > > > > And having it pluggable in the kernel would also allow IPSec to work without bloat (last I checked it was horribly bufferbloated, but that was ~5 years ago). > > ipsec terminating on the router was made to work beautifully with > fq_codel with this commit, below. > > Before: > > http://www.taht.net/~d/ipsec_fq_codel/oldqos.png > > After: > > http://www.taht.net/~d/ipsec_fq_codel/newqos.png > > It's why we keep hoping to do the same thing to wireguard. 
> > commit 264b87fa617e758966108db48db220571ff3d60e > Author: Andrew Collins <acollins@cradlepoint.com> > Date: Wed Jan 18 14:04:28 2017 -0700 > > fq_codel: Avoid regenerating skb flow hash unless necessary > > The fq_codel qdisc currently always regenerates the skb flow hash. > This wastes some cycles and prevents flow separation in cases where > the traffic has been encrypted and can no longer be understood by the > flow dissector. > > Change it to use the preexisting flow hash if one exists, and only > regenerate if necessary. -- Dave Täht CTO, TekLibre, LLC http://www.teklibre.com Tel: 1-831-205-9740 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Cerowrt-devel] fq_pie for linux 2018-12-11 18:32 ` Aaron Wood 2018-12-11 18:37 ` Dave Taht @ 2018-12-11 18:38 ` Jonathan Morton 2018-12-11 18:39 ` Dave Taht 2018-12-11 20:23 ` Toke Høiland-Jørgensen 1 sibling, 2 replies; 12+ messages in thread From: Jonathan Morton @ 2018-12-11 18:38 UTC (permalink / raw) To: Aaron Wood; +Cc: Toke Høiland-Jørgensen, cerowrt-devel > On 11 Dec, 2018, at 8:32 pm, Aaron Wood <woody77@gmail.com> wrote: > > With all the variants of fq+AQM, maybe decoupling the FQ part and the > AQM part would be worthwhile, instead of reimplementing it for each > variant... > > That's a great idea, Toke. There are a lot of places where I think it could work well, especially if it took a pluggable hash function for the hashing (at which point it's very general-purpose, and works on all sorts of different kinds of packets and workloads). That would let it be used for userspace VPN links (as an example), or within QUIC (or similar), where the kernel can't see the embedded flows that are hidden by the TLS encryption. > > And having it pluggable in the kernel would also allow IPSec to work without bloat (last I checked it was horribly bufferbloated, but that was ~5 years ago). I wonder if it's worth extracting the triple-isolate and set-associative hash logic from Cake for this purpose? The interface to COBALT is clean enough to be replaced by other AQMs relatively easily. - Jonathan Morton ^ permalink raw reply [flat|nested] 12+ messages in thread
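For readers unfamiliar with the term, a set-associative flow table can be sketched as below. This is a simplified illustration of the general technique, not an extract of Cake, whose actual implementation differs in detail; `SetAssociativeFlows`, `nsets`, `ways` and the `backlog` list are assumed names. The caller is expected to keep `backlog[slot]` up to date as packets are queued and dequeued.

```python
class SetAssociativeFlows:
    """Map flow hashes to queue slots, searching a small set before colliding."""

    def __init__(self, nsets=128, ways=8):
        self.nsets = nsets
        self.ways = ways
        self.tags = [None] * (nsets * ways)   # full hash that owns each slot
        self.backlog = [0] * (nsets * ways)   # packets queued per slot
        self.collisions = 0

    def lookup(self, flow_hash):
        base = (flow_hash % self.nsets) * self.ways
        slots = range(base, base + self.ways)
        # 1. The flow already owns a slot in its set.
        for i in slots:
            if self.tags[i] == flow_hash:
                return i
        # 2. Otherwise claim any idle slot in the set.
        for i in slots:
            if self.backlog[i] == 0:
                self.tags[i] = flow_hash
                return i
        # 3. Every slot is busy with another flow: accept a collision.
        self.collisions += 1
        return base + (flow_hash // self.nsets) % self.ways
```

With a plain direct-mapped table, two flows whose hashes land on the same bucket always share a queue. Here they share one only when all `ways` slots of the set are occupied by other backlogged flows.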
* Re: [Cerowrt-devel] fq_pie for linux 2018-12-11 18:38 ` Jonathan Morton @ 2018-12-11 18:39 ` Dave Taht 2018-12-11 20:23 ` Toke Høiland-Jørgensen 1 sibling, 0 replies; 12+ messages in thread From: Dave Taht @ 2018-12-11 18:39 UTC (permalink / raw) To: Jonathan Morton; +Cc: Aaron Wood, cerowrt-devel On Tue, Dec 11, 2018 at 10:38 AM Jonathan Morton <chromatix99@gmail.com> wrote: > > > On 11 Dec, 2018, at 8:32 pm, Aaron Wood <woody77@gmail.com> wrote: > > > > With all the variants of fq+AQM, maybe decoupling the FQ part and the > > AQM part would be worthwhile, instead of reimplementing it for each > > variant... > > > > That's a great idea, Toke. There are a lot of places where I think it could work well, especially if it took a pluggable hash function for the hashing (at which point it's very general-purpose, and works on all sorts of different kinds of packets and workloads). That would let it be used for userspace VPN links (as an example), or within QUIC (or similar), where the kernel can't see the embedded flows that are hidden by the TLS encryption. > > > > And having it pluggable in the kernel would also allow IPSec to work without bloat (last I checked it was horribly bufferbloated, but that was ~5 years ago). > > I wonder if it's worth extracting the triple-isolate and set-associative hash logic from Cake for this purpose? The interface to COBALT is clean enough to be replaced by other AQMs relatively easily. > > - Jonathan Morton Well, it would be nice if cake could reuse the existing skb hash for its main hash, as fq_codel does. -- Dave Täht CTO, TekLibre, LLC http://www.teklibre.com Tel: 1-831-205-9740 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Cerowrt-devel] fq_pie for linux 2018-12-11 18:38 ` Jonathan Morton 2018-12-11 18:39 ` Dave Taht @ 2018-12-11 20:23 ` Toke Høiland-Jørgensen 2018-12-11 20:35 ` Dave Taht 1 sibling, 1 reply; 12+ messages in thread From: Toke Høiland-Jørgensen @ 2018-12-11 20:23 UTC (permalink / raw) To: Jonathan Morton, Aaron Wood; +Cc: cerowrt-devel Jonathan Morton <chromatix99@gmail.com> writes: >> On 11 Dec, 2018, at 8:32 pm, Aaron Wood <woody77@gmail.com> wrote: >> >> With all the variants of fq+AQM, maybe decoupling the FQ part and the >> AQM part would be worthwhile, instead of reimplementing it for each >> variant... >> >> That's a great idea, Toke. There are a lot of places where I think it could work well, especially if it took a pluggable hash function for the hashing (at which point it's very general-purpose, and works on all sorts of different kinds of packets and workloads). That would let it be used for userspace VPN links (as an example), or within QUIC (or similar), where the kernel can't see the embedded flows that are hidden by the TLS encryption. >> >> And having it pluggable in the kernel would also allow IPSec to work >> without bloat (last I checked it was horribly bufferbloated, but that >> was ~5 years ago). > > I wonder if it's worth extracting the triple-isolate and > set-associative hash logic from Cake for this purpose? The interface > to COBALT is clean enough to be replaced by other AQMs relatively > easily. There's already a reusable FQ structure in the kernel (which is what the WiFi stack uses), which is partially modelled on Cake's tins. I had half a mind to try to have the two converge; Cake would shed some LOCs, and the WiFi stack could get set-associativity... -Toke ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [Cerowrt-devel] fq_pie for linux 2018-12-11 20:23 ` Toke Høiland-Jørgensen @ 2018-12-11 20:35 ` Dave Taht 0 siblings, 0 replies; 12+ messages in thread From: Dave Taht @ 2018-12-11 20:35 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Jonathan Morton, Aaron Wood, cerowrt-devel On Tue, Dec 11, 2018 at 12:23 PM Toke Høiland-Jørgensen <toke@toke.dk> wrote: > > Jonathan Morton <chromatix99@gmail.com> writes: > > >> On 11 Dec, 2018, at 8:32 pm, Aaron Wood <woody77@gmail.com> wrote: > >> > >> With all the variants of fq+AQM, maybe decoupling the FQ part and the > >> AQM part would be worthwhile, instead of reimplementing it for each > >> variant... > >> > >> That's a great idea, Toke. There are a lot of places where I think it could work well, especially if it took a pluggable hash function for the hashing (at which point it's very general-purpose, and works on all sorts of different kinds of packets and workloads). That would let it be used for userspace VPN links (as an example), or within QUIC (or similar), where the kernel can't see the embedded flows that are hidden by the TLS encryption. > >> > >> And having it pluggable in the kernel would also allow IPSec to work > >> without bloat (last I checked it was horribly bufferbloated, but that > >> was ~5 years ago). > > > > I wonder if it's worth extracting the triple-isolate and > > set-associative hash logic from Cake for this purpose? The interface > > to COBALT is clean enough to be replaced by other AQMs relatively > > easily. > > There's already a reusable FQ structure in the kernel (which is what the > WiFi stack uses), which is partially modelled on Cake's tins. I had half > a mind to try to have the two converge; Cake would shed some LOCs, and > the WiFi stack could get set-associativity... I'm totally not sold on the need for set-associativity. 
Recently, though, I started thinking about doing dynamic minimal perfect hashing, as most IP addresses (and, for that matter, MAC addresses) are pretty stable over the long term. If we can calculate a minimal perfect hash (see cmph, for example) fairly rapidly, the need for set associativity goes away... (but I don't have huge hopes for it as yet).

I'm also impressed with the early analysis of COBALT's AQM implementation. I would very much like, however, a close look at how much ack-filtering would benefit wifi.

And funding for next year is on my mind. Not sure how to wedge anything into nl.net's RFP, but...

And then there's the class-e stuff... busy, busy, busy. Fixing the internet never ends!

-- Dave Täht CTO, TekLibre, LLC http://www.teklibre.com Tel: 1-831-205-9740

^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2018-12-11 20:36 UTC | newest]

Thread overview: 12+ messages
2018-12-05 22:27 [Cerowrt-devel] fq_pie for linux Dave Taht
2018-12-06  7:50 ` Toke Høiland-Jørgensen
2018-12-06 20:03   ` Dave Taht
2018-12-06 19:13     ` David Lang
2018-12-06 20:21       ` Dave Taht
2018-12-11 18:32   ` Aaron Wood
2018-12-11 18:37     ` Dave Taht
2018-12-11 18:38       ` Dave Taht
2018-12-11 18:38     ` Jonathan Morton
2018-12-11 18:39       ` Dave Taht
2018-12-11 20:23       ` Toke Høiland-Jørgensen
2018-12-11 20:35         ` Dave Taht