[Cake] GSO peel behaviour tweaks

Cake - FQ_codel the next generation

* [Cake] GSO peel behaviour tweaks
From: Kevin Darbyshire-Bryant @ 2015-11-24  9:12 UTC
  To: cake

I've just pushed 2 commits related to GSO peeling behaviour to master.

The 1st tweak is at worst benign and at best removes a multiply and
compare for every packet enqueued.  I'd like to think the compiler's
optimiser would have done what I've done explicitly (in essence, check
that this is a GSO packet first, before thinking about peeling it), but
when I checked on x86_64 there was a definite difference in the
generated code.

The 2nd tweak is *not* benign.  In essence this forces peeling if either
ATM framing or packet overhead is specified.  Previously only ATM framing
forced peeling.  I think this is more correct but unfortunately it will
be slower.

Commits can be reverted - feel free :-)

* Re: [Cake] GSO peel behaviour tweaks
From: Dave Taht @ 2015-11-24 10:48 UTC
  To: Kevin Darbyshire-Bryant; +Cc: cake

I don't know what this used to look like but it is essentially wrong
in both (all?) versions.

-               q->peel_threshold = (q->rate_flags & CAKE_FLAG_ATM) ?
-                       0 : min(65535U, q->rate_bps >> 12);
+               q->peel_threshold = (q->rate_flags & CAKE_FLAG_ATM) ||
+                       q->rate_overhead ? 0 : min(65535U, q->rate_bps >> 12);

What we want to do is closer to:

A) start peeling once we start accruing or incurring delay in excess
of, say, 250usec.  At 1Mbit, this is basically peel always. At a gbit,
it's peel with roughly two offloads of 10 full-size packets each in
play. There are nuances vis-a-vis ack GRO stuff (served with a 300
quantum in fq_codel), and in the 10-100Mbit range...

A1) So doing nothing when the rate is unlimited is wrong
A2) Taking the current len * flows as a way to calculate it is wrong
A3) I don't know if this was ever "right". It doesn't need to be
perfect, but this is far from right...

While I am unfond of the rate estimator's overhead, it perhaps could
be used to calculate the peel threshold in a saner way...

B) always peel when we are trying to do accurate on-wire accounting.

As for the other patch...

In general, random pointer lookups into memory (like the skb_shinfo()
dereference behind skb_is_gso()) cost more than math, as the other two
params here are possibly part of a local cache hit already... and I
have no idea what the ratio of gso packets is, or how often you'd hit
the comparison... but see point A2 above...

-       if (unlikely((len * max_t(u32, b->bulk_flow_count, 1U) >
-                     q->peel_threshold && skb_is_gso(skb)))) {

+       if (unlikely(skb_is_gso(skb) &&
+               (len * max_t(u32, b->bulk_flow_count, 1U) >
+                     q->peel_threshold))) {

Dave Täht
Let's go make home routers and wifi faster! With better software!
https://www.gofundme.com/savewifi

* Re: [Cake] GSO peel behaviour tweaks
From: Sebastian Moeller @ 2015-11-24 10:52 UTC
  To: Kevin Darbyshire-Bryant; +Cc: cake

Hi Kevin,

On Nov 24, 2015, at 10:12 , Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:

> I've just pushed 2 commits related to GSO peeling behaviour to master.
> 
> The 1st tweak is at worst benign and at best removes a multiply and
> compare for every packet enqueued.  I'd like to think the compiler's
> optimiser would have done what I've done explicitly (in essence, check
> that this is a GSO packet first, before thinking about peeling it), but
> when I checked on x86_64 there was a definite difference in the
> generated code.
> 
> The 2nd tweak is *not* benign.  In essence this forces peeling if either
> ATM framing or packet overhead is specified.  Previously only ATM framing
> forced peeling.  I think this is more correct but unfortunately it will
> be slower.

	Why? Does cake not account for all the overhead that the de-composed aggregate will cause on the wire? If not, it should do that and keep the decision to peel or not to peel orthogonal, no?

Best Regards
	Sebastian

* Re: [Cake] GSO peel behaviour tweaks
From: Kevin Darbyshire-Bryant @ 2015-11-24 10:55 UTC
  To: Dave Taht; +Cc: cake

On 24/11/15 10:48, Dave Taht wrote:
> I don't know what this used to look like but it is essentially wrong
> in both (all?) versions.
>
> -               q->peel_threshold = (q->rate_flags & CAKE_FLAG_ATM) ?
> -                       0 : min(65535U, q->rate_bps >> 12);
> +               q->peel_threshold = (q->rate_flags & CAKE_FLAG_ATM) ||
> +                       q->rate_overhead ? 0 : min(65535U, q->rate_bps >> 12);
>
> What we want to do is closer to:
>
> A) start peeling once we start accruing or incurring delay in excess
> of, say, 250usec.  At 1Mbit, this is basically peel always. At a gbit,
> it's peel with roughly two offloads of 10 full-size packets each in
> play. There are nuances vis-a-vis ack GRO stuff (served with a 300
> quantum in fq_codel), and in the 10-100Mbit range...
>
> A1) So doing nothing when the rate is unlimited is wrong
> A2) Taking the current len * flows as a way to calculate it is wrong
> A3) I don't know if this was ever "right". It doesn't need to be
> perfect, but this is far from right...
>
> While I am unfond of the rate estimator's overhead, it perhaps could
> be used to calculate the peel threshold in a saner way...
>
> B) always peel when we are trying to do accurate on-wire accounting.
>
> As for the other patch...
>
> In general, random pointer lookups into memory (like the skb_shinfo()
> dereference behind skb_is_gso()) cost more than math, as the other two
> params here are possibly part of a local cache hit already... and I
> have no idea what the ratio of gso packets is, or how often you'd hit
> the comparison... but see point A2 above...
>
> -       if (unlikely((len * max_t(u32, b->bulk_flow_count, 1U) >
> -                     q->peel_threshold && skb_is_gso(skb)))) {
>
> +       if (unlikely(skb_is_gso(skb) &&
> +               (len * max_t(u32, b->bulk_flow_count, 1U) >
> +                     q->peel_threshold))) {

Both changes reverted

