* [Ecn-sane] complexity question @ 2019-08-26 4:28 Dave Taht 2019-08-26 14:42 ` Luca Muscariello 0 siblings, 1 reply; 4+ messages in thread From: Dave Taht @ 2019-08-26 4:28 UTC (permalink / raw) To: Luca MUSCARIELLO, ECN-Sane In my rant on nqb I misspoke on something, and I feel guilty (for the accidental sophistry) and want to express it better next time. https://mailarchive.ietf.org/arch/msg/tsvwg/hZGjm899t87YZl9JJUOWQq4KBsk I said: "Whether you have 1 queue or thousands in a fq'd system, the code is the same as is the "complexity" for all intended purposes." But I'm wrong about the "complexity" part of that statement, particularly if you are thinking about pure hardware. pie/codel are O(1) for purely marked traffic. For dropping, well, it's easier to reason about random drop probabilities and extrapolate out to some number of loops to bound at some value (?) where you just give up and deliver the packet, based on however much budget you have between packets in the hw. (we have so much budget in the bql and wifi world I've never cared) It's harder to reason about codel, but you can still have a bounded loop if you want one. fq_codel is selecting a queue to dequeue - so it's not O(1) for finding that queue. Selecting the right queue can take multiple loops through the whole queue spaces, based on the quantum, and then on top of that you have the drop decisionmaking, so you have best case (1), worst case (?) and average/median, whatever.... So if you wanted to put a bound on it (say, you were writing in ebpf or the hw) "for the time spent finding a packet to deliver", how would you calculate a good time to give up in any of these cases (pie, codel, fq_codel, pick another fq algo...), and just deliver a packet. (my gut says 6-11 loops btw and it's not tellling me why) But if you bounded the loop seeking the right queue what would happen? But if you bounded the loop, as to giving up on the drop decision what would happen? This is giving me flashbacks to "the benefit of drop tail" back in 2012-2014. -- Dave Täht CTO, TekLibre, LLC http://www.teklibre.com Tel: 1-831-205-9740 ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Ecn-sane] complexity question 2019-08-26 4:28 [Ecn-sane] complexity question Dave Taht @ 2019-08-26 14:42 ` Luca Muscariello 2019-08-28 2:57 ` Dave Taht 0 siblings, 1 reply; 4+ messages in thread From: Luca Muscariello @ 2019-08-26 14:42 UTC (permalink / raw) To: Dave Taht; +Cc: ECN-Sane [-- Attachment #1: Type: text/plain, Size: 3583 bytes --] We could reliably say that most of the cost comes from DRR. FQ based on virtual times, such as start time, would cost O(log N) where N is the number of packets in the queuing system. DRR as designed by Varghese provided a good approximation with O(1) cost. So you're not wrong Dave at least for DRR. But I don't see any other cost to add to the check list. Of course every algorithm can come with a different constant in terms of cost but that is really implementation dependent. SFQ in Linux is using Longest Queue Drop which brings back non constant delay cost because it has to search the longest queue, which give O(log F) (worst case) where F is the number of active flows (with at least one packet in the queue). Smaller than start time fair queuing. But DRR, as implemented in fq_codel, does not do that as there is a single AQM per queue. Which brings more cost in terms of memory but not in terms of time. I'm not sure about the DRR implementation in CAKE, but there should be no differences in terms of complexity. M. Shreedhar and G. Varghese, "Efficient fair queuing using deficit round-robin," in *IEEE/ACM Transactions on Networking*, vol. 4, no. 3, pp. 375-385, June 1996. doi: 10.1109/90.502236 https://www2.cs.duke.edu/courses/spring09/cps214/papers/drr.pdf On Mon, Aug 26, 2019 at 6:28 AM Dave Taht <dave.taht@gmail.com> wrote: > In my rant on nqb I misspoke on something, and I feel guilty (for the > accidental sophistry) and want to express it better next time. > > https://mailarchive.ietf.org/arch/msg/tsvwg/hZGjm899t87YZl9JJUOWQq4KBsk > > I said: > > "Whether you have 1 queue or thousands in a fq'd system, the code is > the same as is the "complexity" for all intended > purposes." > > But I'm wrong about the "complexity" part of that statement, > particularly if you are thinking about pure hardware. pie/codel are > O(1) for purely marked traffic. For dropping, well, it's easier to > reason about random drop probabilities and extrapolate out to some > number of loops to bound at some value (?) where you just give up and > deliver the packet, based on however much budget you have between > packets in the hw. (we have so much budget in the bql and wifi world > I've never cared) It's harder to reason about codel, but you can still > have a bounded loop if you want one. > > fq_codel is selecting a queue to dequeue - so it's not O(1) for > finding that queue. Selecting the right queue can take multiple loops > through the whole queue spaces, based on the quantum, and then on top > of that you have the drop decisionmaking, > so you have best case (1), worst case (?) and average/median, whatever.... > > So if you wanted to put a bound on it (say, you were writing in ebpf > or the hw) "for the time spent finding a packet to deliver", > how would you calculate a good time to give up in any of these cases > (pie, codel, fq_codel, pick another fq algo...), and just deliver a > packet. > > (my gut says 6-11 loops btw and it's not tellling me why) > > But if you bounded the loop seeking the right queue what would happen? > > But if you bounded the loop, as to giving up on the drop decision what > would happen? > > This is giving me flashbacks to "the benefit of drop tail" back in > 2012-2014. > > -- > > Dave Täht > CTO, TekLibre, LLC > http://www.teklibre.com > Tel: 1-831-205-9740 > _______________________________________________ > Ecn-sane mailing list > Ecn-sane@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/ecn-sane > [-- Attachment #2: Type: text/html, Size: 5090 bytes --] ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Ecn-sane] complexity question 2019-08-26 14:42 ` Luca Muscariello @ 2019-08-28 2:57 ` Dave Taht 2019-08-28 3:10 ` Dave Taht 0 siblings, 1 reply; 4+ messages in thread From: Dave Taht @ 2019-08-28 2:57 UTC (permalink / raw) To: Luca Muscariello; +Cc: ECN-Sane On Mon, Aug 26, 2019 at 7:43 AM Luca Muscariello <muscariello@ieee.org> wrote: > > We could reliably say that most of the cost comes from DRR. yep. Particularly as we tend to overuse small quantums and don't really have anything other than gut feel as to how to scale it. I'm not fond of - as we use in wifi - 300 as the quantum - as that's kind of expensive in terms of loops through the algo - however it does optimize for small packets really well, and as a wireless-n aggregate has a 64 packet (oft much less) limit to the number of packets in it makes sense to interleave as much as .you can. I am thinking we should scale this differently for ac and later. https://sci-hub.tw/10.1109/LCOMM.2012.081612.120744 > FQ based on virtual times, such as start time, would cost O(log N) where N is the number of packets in the queuing system. Hmm... and EDF? > DRR as designed by Varghese provided a good approximation with O(1) cost. So you're not wrong Dave > at least for DRR. But I don't see any other cost to add to the check list. The thing that I'm strugglng to express is the codel or pie bit - let me use pie. Assume we have a drop probability of 50%. (not doing wifi here...) You can say that on average you are going to drop every other packet, but probability could be coming up 1 for a potentially infinite number of tries. So, if we were running WAY closer to the hw and we had a budget of X ns to get the next packet onto the wire, we need to give up on dropping anything and just deliver a packet and hope we do more of the right thing next time. Now the act of putting in a budget actually heisenbugs the calc more. bql needs a bound (or used to) skb = dequeue_from_drr(); for(i = 0; skb && i < 6; i++) { if (had_to_drop(skb)) { skb = dequeue_from_drr(); } } You can accept (as we do in bql) a standing queue that's pretty tiny (2-3k at 100mbit, 36k or so at a gbit) you can try (in hw) to process several queues in parallel with, say a 3 (x?) deep pipeline you can do the dequeue and always have one "ready to go" you can process everything simultaneously (edf) and I used to know how qfq worked but now forget Did I miss any options here? (in hw?) How about using some ai technique that's likely to attract grant money? :) . > Of course every algorithm can come with a different constant in terms of cost but that is really implementation dependent. > > SFQ in Linux is using Longest Queue Drop which brings back non constant delay cost because > it has to search the longest queue, which give O(log F) (worst case) where F is the number of active flows (with at least one packet in the queue). > Smaller than start time fair queuing. > > But DRR, as implemented in fq_codel, does not do that as there is a single AQM per queue. > Which brings more cost in terms of memory but not in terms of time. > > I'm not sure about the DRR implementation in CAKE, but there should be no differences in terms of complexity. > > > M. Shreedhar and G. Varghese, "Efficient fair queuing using deficit round-robin," in > IEEE/ACM Transactions on Networking, vol. 4, no. 3, pp. 375-385, June 1996. doi: 10.1109/90.502236 > https://www2.cs.duke.edu/courses/spring09/cps214/papers/drr.pdf > > > On Mon, Aug 26, 2019 at 6:28 AM Dave Taht <dave.taht@gmail.com> wrote: >> >> In my rant on nqb I misspoke on something, and I feel guilty (for the >> accidental sophistry) and want to express it better next time. >> >> https://mailarchive.ietf.org/arch/msg/tsvwg/hZGjm899t87YZl9JJUOWQq4KBsk >> >> I said: >> >> "Whether you have 1 queue or thousands in a fq'd system, the code is >> the same as is the "complexity" for all intended >> purposes." >> >> But I'm wrong about the "complexity" part of that statement, >> particularly if you are thinking about pure hardware. pie/codel are >> O(1) for purely marked traffic. For dropping, well, it's easier to >> reason about random drop probabilities and extrapolate out to some >> number of loops to bound at some value (?) where you just give up and >> deliver the packet, based on however much budget you have between >> packets in the hw. (we have so much budget in the bql and wifi world >> I've never cared) It's harder to reason about codel, but you can still >> have a bounded loop if you want one. >> >> fq_codel is selecting a queue to dequeue - so it's not O(1) for >> finding that queue. Selecting the right queue can take multiple loops >> through the whole queue spaces, based on the quantum, and then on top >> of that you have the drop decisionmaking, >> so you have best case (1), worst case (?) and average/median, whatever.... >> >> So if you wanted to put a bound on it (say, you were writing in ebpf >> or the hw) "for the time spent finding a packet to deliver", >> how would you calculate a good time to give up in any of these cases >> (pie, codel, fq_codel, pick another fq algo...), and just deliver a >> packet. >> >> (my gut says 6-11 loops btw and it's not tellling me why) >> >> But if you bounded the loop seeking the right queue what would happen? >> >> But if you bounded the loop, as to giving up on the drop decision what >> would happen? >> >> This is giving me flashbacks to "the benefit of drop tail" back in 2012-2014. >> >> -- >> >> Dave Täht >> CTO, TekLibre, LLC >> http://www.teklibre.com >> Tel: 1-831-205-9740 >> _______________________________________________ >> Ecn-sane mailing list >> Ecn-sane@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/ecn-sane -- Dave Täht CTO, TekLibre, LLC http://www.teklibre.com Tel: 1-831-205-9740 ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Ecn-sane] complexity question 2019-08-28 2:57 ` Dave Taht @ 2019-08-28 3:10 ` Dave Taht 0 siblings, 0 replies; 4+ messages in thread From: Dave Taht @ 2019-08-28 3:10 UTC (permalink / raw) To: Luca Muscariello; +Cc: ECN-Sane On Tue, Aug 27, 2019 at 7:57 PM Dave Taht <dave.taht@gmail.com> wrote: > > On Mon, Aug 26, 2019 at 7:43 AM Luca Muscariello <muscariello@ieee.org> wrote: > > > > We could reliably say that most of the cost comes from DRR. > > yep. Particularly as we tend to overuse small quantums and don't > really have anything other than gut > feel as to how to scale it. I'm not fond of - as we use in wifi - 300 > as the quantum - as that's kind of > expensive in terms of loops through the algo - however it does > optimize for small packets really well, and as a wireless-n aggregate > has a 64 packet (oft much less) limit to the number of packets in it > makes sense to interleave as much as .you can. > > I am thinking we should scale this differently for ac and later. > > https://sci-hub.tw/10.1109/LCOMM.2012.081612.120744 > > > FQ based on virtual times, such as start time, would cost O(log N) where N is the number of packets in the queuing system. > > Hmm... and EDF? > > > DRR as designed by Varghese provided a good approximation with O(1) cost. So you're not wrong Dave > > at least for DRR. But I don't see any other cost to add to the check list. > > The thing that I'm strugglng to express is the codel or pie bit - let > me use pie. > > Assume we have a drop probability of 50%. (not doing wifi here...) > > You can say that on average you are going to drop every other packet, > but probability could be > coming up 1 for a potentially infinite number of tries. So, if we were > running WAY closer to the hw > and we had a budget of X ns to get the next packet onto the wire, we > need to give up on dropping > anything and just deliver a packet and hope we do more of the right > thing next time. > > Now the act of putting in a budget actually heisenbugs the calc more. > bql needs a bound (or used to) > > skb = dequeue_from_drr(); > for(i = 0; skb && i < 6; i++) { > if (had_to_drop(skb)) { > skb = dequeue_from_drr(); > } > } oops I hit send too early: skb = dequeue_from_drr(); for(i = 0; skb && i < 6; i++) { if (had_to_drop(skb)) skb = dequeue_from_drr(); else { i = 6; break; } } Or another way: now = gettime(); budget = now + (time_left_to_deliver_the_previous_packet) skb = dequeue_from_drr(); ok = false; while (budget > now && !ok && skb) if (had_to_drop(skb)) { // pseudocode that drops the skb if needed skb = dequeue_from_drr(); now = gettime(); } else ok = true; } time_left_to_deliver_the_previous_packet = calc_it_from_size_and_remaining_budget(skb); return skb; (I make no warranties for code I write in a browser, I'm hoping my intent is clear) > > You can accept (as we do in bql) a standing queue that's pretty tiny > (2-3k at 100mbit, 36k or so at a gbit) > you can try (in hw) to process several queues in parallel with, say a > 3 (x?) deep pipeline > you can do the dequeue and always have one "ready to go" > you can process everything simultaneously (edf) > > and I used to know how qfq worked but now forget > > Did I miss any options here? (in hw?) How about using some ai > technique that's likely to attract grant money? :) > . > > Of course every algorithm can come with a different constant in terms of cost but that is really implementation dependent. > > > > SFQ in Linux is using Longest Queue Drop which brings back non constant delay cost because > > it has to search the longest queue, which give O(log F) (worst case) where F is the number of active flows (with at least one packet in the queue). > > Smaller than start time fair queuing. > > > > But DRR, as implemented in fq_codel, does not do that as there is a single AQM per queue. > > Which brings more cost in terms of memory but not in terms of time. > > > > I'm not sure about the DRR implementation in CAKE, but there should be no differences in terms of complexity. > > > > > > M. Shreedhar and G. Varghese, "Efficient fair queuing using deficit round-robin," in > > IEEE/ACM Transactions on Networking, vol. 4, no. 3, pp. 375-385, June 1996. doi: 10.1109/90.502236 > > https://www2.cs.duke.edu/courses/spring09/cps214/papers/drr.pdf > > > > > > On Mon, Aug 26, 2019 at 6:28 AM Dave Taht <dave.taht@gmail.com> wrote: > >> > >> In my rant on nqb I misspoke on something, and I feel guilty (for the > >> accidental sophistry) and want to express it better next time. > >> > >> https://mailarchive.ietf.org/arch/msg/tsvwg/hZGjm899t87YZl9JJUOWQq4KBsk > >> > >> I said: > >> > >> "Whether you have 1 queue or thousands in a fq'd system, the code is > >> the same as is the "complexity" for all intended > >> purposes." > >> > >> But I'm wrong about the "complexity" part of that statement, > >> particularly if you are thinking about pure hardware. pie/codel are > >> O(1) for purely marked traffic. For dropping, well, it's easier to > >> reason about random drop probabilities and extrapolate out to some > >> number of loops to bound at some value (?) where you just give up and > >> deliver the packet, based on however much budget you have between > >> packets in the hw. (we have so much budget in the bql and wifi world > >> I've never cared) It's harder to reason about codel, but you can still > >> have a bounded loop if you want one. > >> > >> fq_codel is selecting a queue to dequeue - so it's not O(1) for > >> finding that queue. Selecting the right queue can take multiple loops > >> through the whole queue spaces, based on the quantum, and then on top > >> of that you have the drop decisionmaking, > >> so you have best case (1), worst case (?) and average/median, whatever.... > >> > >> So if you wanted to put a bound on it (say, you were writing in ebpf > >> or the hw) "for the time spent finding a packet to deliver", > >> how would you calculate a good time to give up in any of these cases > >> (pie, codel, fq_codel, pick another fq algo...), and just deliver a > >> packet. > >> > >> (my gut says 6-11 loops btw and it's not tellling me why) > >> > >> But if you bounded the loop seeking the right queue what would happen? > >> > >> But if you bounded the loop, as to giving up on the drop decision what > >> would happen? > >> > >> This is giving me flashbacks to "the benefit of drop tail" back in 2012-2014. > >> > >> -- > >> > >> Dave Täht > >> CTO, TekLibre, LLC > >> http://www.teklibre.com > >> Tel: 1-831-205-9740 > >> _______________________________________________ > >> Ecn-sane mailing list > >> Ecn-sane@lists.bufferbloat.net > >> https://lists.bufferbloat.net/listinfo/ecn-sane > > > > -- > > Dave Täht > CTO, TekLibre, LLC > http://www.teklibre.com > Tel: 1-831-205-9740 -- Dave Täht CTO, TekLibre, LLC http://www.teklibre.com Tel: 1-831-205-9740 ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2019-08-28 3:10 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-08-26 4:28 [Ecn-sane] complexity question Dave Taht 2019-08-26 14:42 ` Luca Muscariello 2019-08-28 2:57 ` Dave Taht 2019-08-28 3:10 ` Dave Taht
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox