[Ecn-sane] complexity question

Discussion of explicit congestion notification's impact on the Internet
 help / color / mirror / Atom feed

* [Ecn-sane] complexity question
@ 2019-08-26  4:28 Dave Taht
  2019-08-26 14:42 ` Luca Muscariello
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Taht @ 2019-08-26  4:28 UTC (permalink / raw)
  To: Luca MUSCARIELLO, ECN-Sane

In my rant on nqb I misspoke on something, and I feel guilty (for the
accidental sophistry) and want to express it better next time.

https://mailarchive.ietf.org/arch/msg/tsvwg/hZGjm899t87YZl9JJUOWQq4KBsk

I said:

"Whether you have 1 queue or thousands in a fq'd system, the code is
the same as is the "complexity" for all intended
purposes."

But I'm wrong about the "complexity" part of that statement,
particularly if you are thinking about pure hardware. pie/codel are
O(1) for purely marked traffic. For dropping, well, it's easier to
reason about random drop probabilities and extrapolate out to some
number of loops to bound at some value (?) where you just give up and
deliver the packet, based on however much budget you have between
packets in the hw. (we have so much budget in the bql and wifi world
I've never cared) It's harder to reason about codel, but you can still
have a bounded loop if you want one.

fq_codel is selecting a queue to dequeue - so it's not O(1) for
finding that queue.  Selecting the right queue can take multiple loops
through the whole queue spaces, based on the quantum, and then on top
of that you have the drop decisionmaking,
so you have best case (1), worst case (?) and average/median, whatever....

So if you wanted to put a bound on it (say, you were writing in ebpf
or the hw) "for the time spent finding a packet to deliver",
how would you calculate a good time to give up in any of these cases
(pie, codel, fq_codel, pick another fq algo...), and just deliver a
packet.

(my gut says 6-11 loops btw and it's not tellling me why)

But if you bounded the loop seeking the right queue what would happen?

But if you bounded the loop, as to giving up on the drop decision what
would happen?

This is giving me flashbacks to "the benefit of drop tail" back in 2012-2014.

-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Ecn-sane] complexity question
  2019-08-26  4:28 [Ecn-sane] complexity question Dave Taht
@ 2019-08-26 14:42 ` Luca Muscariello
  2019-08-28  2:57   ` Dave Taht
  0 siblings, 1 reply; 4+ messages in thread
From: Luca Muscariello @ 2019-08-26 14:42 UTC (permalink / raw)
  To: Dave Taht; +Cc: ECN-Sane

[-- Attachment #1: Type: text/plain, Size: 3583 bytes --]

We could reliably say that most of the cost comes from DRR.

FQ based on virtual times, such as start time, would cost O(log N) where N
is the number of packets in the queuing system.
DRR as designed by Varghese provided a good approximation with O(1) cost.
So you're not wrong Dave
at least for DRR. But I don't see any other cost to add to the check list.
Of course every algorithm can come with a different constant in terms of
cost but that is really implementation dependent.

SFQ in Linux is using Longest Queue Drop which brings back non constant
delay cost because
it has to search the longest queue, which give O(log F) (worst case) where
F is the number of active flows (with at least one packet in the queue).
Smaller than start time fair queuing.

But DRR, as implemented in fq_codel, does not do that as there is a single
AQM per queue.
Which brings more cost in terms of memory but not in terms of time.

I'm not sure about the DRR implementation in CAKE, but there should be no
differences in terms of complexity.

M. Shreedhar and G. Varghese, "Efficient fair queuing using deficit
round-robin," in
*IEEE/ACM Transactions on Networking*, vol. 4, no. 3, pp. 375-385, June
1996. doi: 10.1109/90.502236
https://www2.cs.duke.edu/courses/spring09/cps214/papers/drr.pdf

On Mon, Aug 26, 2019 at 6:28 AM Dave Taht <dave.taht@gmail.com> wrote:

> In my rant on nqb I misspoke on something, and I feel guilty (for the
> accidental sophistry) and want to express it better next time.
>
> https://mailarchive.ietf.org/arch/msg/tsvwg/hZGjm899t87YZl9JJUOWQq4KBsk
>
> I said:
>
> "Whether you have 1 queue or thousands in a fq'd system, the code is
> the same as is the "complexity" for all intended
> purposes."
>
> But I'm wrong about the "complexity" part of that statement,
> particularly if you are thinking about pure hardware. pie/codel are
> O(1) for purely marked traffic. For dropping, well, it's easier to
> reason about random drop probabilities and extrapolate out to some
> number of loops to bound at some value (?) where you just give up and
> deliver the packet, based on however much budget you have between
> packets in the hw. (we have so much budget in the bql and wifi world
> I've never cared) It's harder to reason about codel, but you can still
> have a bounded loop if you want one.
>
> fq_codel is selecting a queue to dequeue - so it's not O(1) for
> finding that queue.  Selecting the right queue can take multiple loops
> through the whole queue spaces, based on the quantum, and then on top
> of that you have the drop decisionmaking,
> so you have best case (1), worst case (?) and average/median, whatever....
>
> So if you wanted to put a bound on it (say, you were writing in ebpf
> or the hw) "for the time spent finding a packet to deliver",
> how would you calculate a good time to give up in any of these cases
> (pie, codel, fq_codel, pick another fq algo...), and just deliver a
> packet.
>
> (my gut says 6-11 loops btw and it's not tellling me why)
>
> But if you bounded the loop seeking the right queue what would happen?
>
> But if you bounded the loop, as to giving up on the drop decision what
> would happen?
>
> This is giving me flashbacks to "the benefit of drop tail" back in
> 2012-2014.
>
> --
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-205-9740
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane
>

[-- Attachment #2: Type: text/html, Size: 5090 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Ecn-sane] complexity question
  2019-08-26 14:42 ` Luca Muscariello
@ 2019-08-28  2:57   ` Dave Taht
  2019-08-28  3:10     ` Dave Taht
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Taht @ 2019-08-28  2:57 UTC (permalink / raw)
  To: Luca Muscariello; +Cc: ECN-Sane

On Mon, Aug 26, 2019 at 7:43 AM Luca Muscariello <muscariello@ieee.org> wrote:
>
> We could reliably say that most of the cost comes from DRR.

yep. Particularly as we tend to overuse small quantums and don't
really have anything other than gut
feel as to how to scale it. I'm not fond of - as we use in wifi - 300
as the quantum - as that's kind of
expensive in terms of loops through the algo - however it does
optimize for small packets really well, and as a wireless-n aggregate
has a 64 packet (oft much less) limit to the number of packets in it
makes sense to interleave as much as .you can.

I am thinking we should scale this differently for ac and later.

https://sci-hub.tw/10.1109/LCOMM.2012.081612.120744

> FQ based on virtual times, such as start time, would cost O(log N) where N is the number of packets in the queuing system.

Hmm... and EDF?

> DRR as designed by Varghese provided a good approximation with O(1) cost. So you're not wrong Dave
> at least for DRR. But I don't see any other cost to add to the check list.

The thing that I'm strugglng to express is the codel or pie bit - let
me use pie.

Assume we have a drop probability of 50%. (not doing wifi here...)

You can say that on average you are going to drop every other packet,
but probability could be
coming up 1 for a potentially infinite number of tries. So, if we were
running WAY closer to the hw
and we had a budget of X ns to get the next packet onto the wire, we
need to give up on dropping
anything and just deliver a packet and hope we do more of the right
thing next time.

Now the act of putting in a budget actually heisenbugs the calc more.
bql needs a bound (or used to)

skb = dequeue_from_drr();
for(i = 0; skb && i < 6; i++) {
    if (had_to_drop(skb)) {
    skb = dequeue_from_drr();
    }
}

You can accept (as we do in bql) a standing queue that's pretty tiny
(2-3k at 100mbit, 36k or so at a gbit)
you can try (in hw) to process several queues in parallel with, say a
3 (x?) deep pipeline
you can do the dequeue and always have one "ready to go"
you can process everything simultaneously (edf)

and I used to know how qfq worked but now forget

Did I miss any options here? (in hw?) How about using some ai
technique that's likely to attract grant money? :)
.
> Of course every algorithm can come with a different constant in terms of cost but that is really implementation dependent.
>
> SFQ in Linux is using Longest Queue Drop which brings back non constant delay cost because
> it has to search the longest queue, which give O(log F) (worst case) where F is the number of active flows (with at least one packet in the queue).
> Smaller than start time fair queuing.
>
> But DRR, as implemented in fq_codel, does not do that as there is a single AQM per queue.
> Which brings more cost in terms of memory but not in terms of time.
>
> I'm not sure about the DRR implementation in CAKE, but there should be no differences in terms of complexity.
>
>
> M. Shreedhar and G. Varghese, "Efficient fair queuing using deficit round-robin," in
> IEEE/ACM Transactions on Networking, vol. 4, no. 3, pp. 375-385, June 1996. doi: 10.1109/90.502236
> https://www2.cs.duke.edu/courses/spring09/cps214/papers/drr.pdf
>
>
> On Mon, Aug 26, 2019 at 6:28 AM Dave Taht <dave.taht@gmail.com> wrote:
>>
>> In my rant on nqb I misspoke on something, and I feel guilty (for the
>> accidental sophistry) and want to express it better next time.
>>
>> https://mailarchive.ietf.org/arch/msg/tsvwg/hZGjm899t87YZl9JJUOWQq4KBsk
>>
>> I said:
>>
>> "Whether you have 1 queue or thousands in a fq'd system, the code is
>> the same as is the "complexity" for all intended
>> purposes."
>>
>> But I'm wrong about the "complexity" part of that statement,
>> particularly if you are thinking about pure hardware. pie/codel are
>> O(1) for purely marked traffic. For dropping, well, it's easier to
>> reason about random drop probabilities and extrapolate out to some
>> number of loops to bound at some value (?) where you just give up and
>> deliver the packet, based on however much budget you have between
>> packets in the hw. (we have so much budget in the bql and wifi world
>> I've never cared) It's harder to reason about codel, but you can still
>> have a bounded loop if you want one.
>>
>> fq_codel is selecting a queue to dequeue - so it's not O(1) for
>> finding that queue.  Selecting the right queue can take multiple loops
>> through the whole queue spaces, based on the quantum, and then on top
>> of that you have the drop decisionmaking,
>> so you have best case (1), worst case (?) and average/median, whatever....
>>
>> So if you wanted to put a bound on it (say, you were writing in ebpf
>> or the hw) "for the time spent finding a packet to deliver",
>> how would you calculate a good time to give up in any of these cases
>> (pie, codel, fq_codel, pick another fq algo...), and just deliver a
>> packet.
>>
>> (my gut says 6-11 loops btw and it's not tellling me why)
>>
>> But if you bounded the loop seeking the right queue what would happen?
>>
>> But if you bounded the loop, as to giving up on the drop decision what
>> would happen?
>>
>> This is giving me flashbacks to "the benefit of drop tail" back in 2012-2014.
>>
>> --
>>
>> Dave Täht
>> CTO, TekLibre, LLC
>> http://www.teklibre.com
>> Tel: 1-831-205-9740
>> _______________________________________________
>> Ecn-sane mailing list
>> Ecn-sane@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/ecn-sane



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Ecn-sane] complexity question
  2019-08-28  2:57   ` Dave Taht
@ 2019-08-28  3:10     ` Dave Taht
  0 siblings, 0 replies; 4+ messages in thread
From: Dave Taht @ 2019-08-28  3:10 UTC (permalink / raw)
  To: Luca Muscariello; +Cc: ECN-Sane

On Tue, Aug 27, 2019 at 7:57 PM Dave Taht <dave.taht@gmail.com> wrote:
>
> On Mon, Aug 26, 2019 at 7:43 AM Luca Muscariello <muscariello@ieee.org> wrote:
> >
> > We could reliably say that most of the cost comes from DRR.
>
> yep. Particularly as we tend to overuse small quantums and don't
> really have anything other than gut
> feel as to how to scale it. I'm not fond of - as we use in wifi - 300
> as the quantum - as that's kind of
> expensive in terms of loops through the algo - however it does
> optimize for small packets really well, and as a wireless-n aggregate
> has a 64 packet (oft much less) limit to the number of packets in it
> makes sense to interleave as much as .you can.
>
> I am thinking we should scale this differently for ac and later.
>
> https://sci-hub.tw/10.1109/LCOMM.2012.081612.120744
>
> > FQ based on virtual times, such as start time, would cost O(log N) where N is the number of packets in the queuing system.
>
> Hmm... and EDF?
>
> > DRR as designed by Varghese provided a good approximation with O(1) cost. So you're not wrong Dave
> > at least for DRR. But I don't see any other cost to add to the check list.
>
> The thing that I'm strugglng to express is the codel or pie bit - let
> me use pie.
>
> Assume we have a drop probability of 50%. (not doing wifi here...)
>
> You can say that on average you are going to drop every other packet,
> but probability could be
> coming up 1 for a potentially infinite number of tries. So, if we were
> running WAY closer to the hw
> and we had a budget of X ns to get the next packet onto the wire, we
> need to give up on dropping
> anything and just deliver a packet and hope we do more of the right
> thing next time.
>
> Now the act of putting in a budget actually heisenbugs the calc more.
> bql needs a bound (or used to)
>
> skb = dequeue_from_drr();
> for(i = 0; skb && i < 6; i++) {
>     if (had_to_drop(skb)) {
>     skb = dequeue_from_drr();
>     }
> }

oops I hit send too early:

skb = dequeue_from_drr();
for(i = 0; skb && i < 6; i++) {
    if (had_to_drop(skb))
        skb = dequeue_from_drr();
     else {
        i = 6;
        break;
        }
}

Or another way:

now = gettime();
budget = now + (time_left_to_deliver_the_previous_packet)
skb = dequeue_from_drr();
ok = false;
while (budget > now && !ok && skb)
    if (had_to_drop(skb)) { // pseudocode that drops the skb if needed
        skb = dequeue_from_drr();
        now = gettime();
    }
    else
        ok = true;
}

time_left_to_deliver_the_previous_packet =
calc_it_from_size_and_remaining_budget(skb);
return skb;

(I make no warranties for code I write in a browser, I'm hoping my
intent is clear)

>
> You can accept (as we do in bql) a standing queue that's pretty tiny
> (2-3k at 100mbit, 36k or so at a gbit)
> you can try (in hw) to process several queues in parallel with, say a
> 3 (x?) deep pipeline
> you can do the dequeue and always have one "ready to go"
> you can process everything simultaneously (edf)
>
> and I used to know how qfq worked but now forget
>
> Did I miss any options here? (in hw?) How about using some ai
> technique that's likely to attract grant money? :)
> .
> > Of course every algorithm can come with a different constant in terms of cost but that is really implementation dependent.
> >
> > SFQ in Linux is using Longest Queue Drop which brings back non constant delay cost because
> > it has to search the longest queue, which give O(log F) (worst case) where F is the number of active flows (with at least one packet in the queue).
> > Smaller than start time fair queuing.
> >
> > But DRR, as implemented in fq_codel, does not do that as there is a single AQM per queue.
> > Which brings more cost in terms of memory but not in terms of time.
> >
> > I'm not sure about the DRR implementation in CAKE, but there should be no differences in terms of complexity.
> >
> >
> > M. Shreedhar and G. Varghese, "Efficient fair queuing using deficit round-robin," in
> > IEEE/ACM Transactions on Networking, vol. 4, no. 3, pp. 375-385, June 1996. doi: 10.1109/90.502236
> > https://www2.cs.duke.edu/courses/spring09/cps214/papers/drr.pdf
> >
> >
> > On Mon, Aug 26, 2019 at 6:28 AM Dave Taht <dave.taht@gmail.com> wrote:
> >>
> >> In my rant on nqb I misspoke on something, and I feel guilty (for the
> >> accidental sophistry) and want to express it better next time.
> >>
> >> https://mailarchive.ietf.org/arch/msg/tsvwg/hZGjm899t87YZl9JJUOWQq4KBsk
> >>
> >> I said:
> >>
> >> "Whether you have 1 queue or thousands in a fq'd system, the code is
> >> the same as is the "complexity" for all intended
> >> purposes."
> >>
> >> But I'm wrong about the "complexity" part of that statement,
> >> particularly if you are thinking about pure hardware. pie/codel are
> >> O(1) for purely marked traffic. For dropping, well, it's easier to
> >> reason about random drop probabilities and extrapolate out to some
> >> number of loops to bound at some value (?) where you just give up and
> >> deliver the packet, based on however much budget you have between
> >> packets in the hw. (we have so much budget in the bql and wifi world
> >> I've never cared) It's harder to reason about codel, but you can still
> >> have a bounded loop if you want one.
> >>
> >> fq_codel is selecting a queue to dequeue - so it's not O(1) for
> >> finding that queue.  Selecting the right queue can take multiple loops
> >> through the whole queue spaces, based on the quantum, and then on top
> >> of that you have the drop decisionmaking,
> >> so you have best case (1), worst case (?) and average/median, whatever....
> >>
> >> So if you wanted to put a bound on it (say, you were writing in ebpf
> >> or the hw) "for the time spent finding a packet to deliver",
> >> how would you calculate a good time to give up in any of these cases
> >> (pie, codel, fq_codel, pick another fq algo...), and just deliver a
> >> packet.
> >>
> >> (my gut says 6-11 loops btw and it's not tellling me why)
> >>
> >> But if you bounded the loop seeking the right queue what would happen?
> >>
> >> But if you bounded the loop, as to giving up on the drop decision what
> >> would happen?
> >>
> >> This is giving me flashbacks to "the benefit of drop tail" back in 2012-2014.
> >>
> >> --
> >>
> >> Dave Täht
> >> CTO, TekLibre, LLC
> >> http://www.teklibre.com
> >> Tel: 1-831-205-9740
> >> _______________________________________________
> >> Ecn-sane mailing list
> >> Ecn-sane@lists.bufferbloat.net
> >> https://lists.bufferbloat.net/listinfo/ecn-sane
>
>
>
> --
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-205-9740



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-08-28  3:10 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-26  4:28 [Ecn-sane] complexity question Dave Taht
2019-08-26 14:42 ` Luca Muscariello
2019-08-28  2:57   ` Dave Taht
2019-08-28  3:10     ` Dave Taht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox