* [Cake] GSO peel behaviour tweaks
@ 2015-11-24 9:12 Kevin Darbyshire-Bryant
2015-11-24 10:48 ` Dave Taht
2015-11-24 10:52 ` Sebastian Moeller
0 siblings, 2 replies; 4+ messages in thread
From: Kevin Darbyshire-Bryant @ 2015-11-24 9:12 UTC (permalink / raw)
To: cake
[-- Attachment #1: Type: text/plain, Size: 695 bytes --]
I've just pushed 2 commits related to GSO peeling behaviour to master.
1st tweak is at worst benign and at best removes a multiply compare for
every packet enqueued. I'd like to think the optimiser in the compiler
would have done what I've done explicitly (in essence check this is a
gso packet 1st before thinking about peeling it) but when I checked on
x86_64 there was a definite difference in produced code.
2nd tweak is *not* benign. In essence this forces peeling if either ATM
framing or packet overhead is specified. Previously only ATM framing
forced peeling. I think this is more correct but unfortunately will be
slower.
Commits can be reverted - feel free :-)
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4816 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Cake] GSO peel behaviour tweaks
2015-11-24 9:12 [Cake] GSO peel behaviour tweaks Kevin Darbyshire-Bryant
@ 2015-11-24 10:48 ` Dave Taht
2015-11-24 10:55 ` Kevin Darbyshire-Bryant
2015-11-24 10:52 ` Sebastian Moeller
1 sibling, 1 reply; 4+ messages in thread
From: Dave Taht @ 2015-11-24 10:48 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: cake
I don't know what this used to look like but it is essentially wrong
in both (all?) versions.
- q->peel_threshold = (q->rate_flags & CAKE_FLAG_ATM) ?
- 0 : min(65535U, q->rate_bps >> 12);
+ q->peel_threshold = (q->rate_flags & CAKE_FLAG_ATM) ||
+ q->rate_overhead ? 0 : min(65535U, q->rate_bps >> 12);
What we want to do is closer to:
A) start peeling once we start accruing or incurring delay in excess
of, say, 250usec. At 1Mbit, this is basically peel always. At a gbit,
it's peel with roughly two 10 full-size packet offloads in play. There
are nuances vs a vs ack GRO stuff (served with a 300 quantum in
fq_codel), and in the 10-100Mbit range...
A1) So doing nothing at a rate unlimited is wrong
A2) Taking the current len * flows as a way to calculate it is wrong
A3) I don't know if this was ever "right". It doesn't need to be
perfect, but this is far from right...
While I am unfond of the rate estimator's overhead, it perhaps could
be used to calculate the peel threshold in a saner way...
B) always peel when we are trying to do accurate on-wire accounting.
As for the other patch...
In general random pointer lookups into memory (like the skb->gro
pointer) cost more than math as the other two params here are possibly
part of a local cache hit already... and I have no idea what the ratio
is between gso packets and how often you'd hit the comparison... but
see point A2 above...
- if (unlikely((len * max_t(u32, b->bulk_flow_count, 1U) >
- q->peel_threshold && skb_is_gso(skb)))) {
+ if (unlikely(skb_is_gso(skb) &&
+ (len * max_t(u32, b->bulk_flow_count, 1U) >
+ q->peel_threshold))) {
Dave Täht
Let's go make home routers and wifi faster! With better software!
https://www.gofundme.com/savewifi
On Tue, Nov 24, 2015 at 10:12 AM, Kevin Darbyshire-Bryant
<kevin@darbyshire-bryant.me.uk> wrote:
> I've just pushed 2 commits related to GSO peeling behaviour to master.
>
> 1st tweak is at worst benign and at best removes a multiply compare for
> every packet enqueued. I'd like to think the optimiser in the compiler
> would have done what I've done explicitly (in essence check this is a
> gso packet 1st before thinking about peeling it) but when I checked on
> x86_64 there was a definite difference in produced code.
>
> 2nd tweak is *not* benign. In essence this forces peeling if either ATM
> framing or packet overhead is specified. Previously only ATM framing
> forced peeling. I think this is more correct but unfortunately will be
> slower.
>
> Commits can be reverted - feel free :-)
>
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Cake] GSO peel behaviour tweaks
2015-11-24 9:12 [Cake] GSO peel behaviour tweaks Kevin Darbyshire-Bryant
2015-11-24 10:48 ` Dave Taht
@ 2015-11-24 10:52 ` Sebastian Moeller
1 sibling, 0 replies; 4+ messages in thread
From: Sebastian Moeller @ 2015-11-24 10:52 UTC (permalink / raw)
To: Kevin Darbyshire-Bryant; +Cc: cake
Hi Kevin,
On Nov 24, 2015, at 10:12 , Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk> wrote:
> I've just pushed 2 commits related to GSO peeling behaviour to master.
>
> 1st tweak is at worst benign and at best removes a multiply compare for
> every packet enqueued. I'd like to think the optimiser in the compiler
> would have done what I've done explicitly (in essence check this is a
> gso packet 1st before thinking about peeling it) but when I checked on
> x86_64 there was a definite difference in produced code.
>
> 2nd tweak is *not* benign. In essence this forces peeling if either ATM
> framing or packet overhead is specified. Previously only ATM framing
> forced peeling. I think this is more correct but unfortunately will be
> slower.
Why? Does cake not account all the overhead that the de-composed aggregate will cause on the wire? If not, it should do that and keep the decision to peel or not-peel orthogonal, no?
Best Regards
Sebastian
>
> Commits can be reverted - feel free :-)
>
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [Cake] GSO peel behaviour tweaks
2015-11-24 10:48 ` Dave Taht
@ 2015-11-24 10:55 ` Kevin Darbyshire-Bryant
0 siblings, 0 replies; 4+ messages in thread
From: Kevin Darbyshire-Bryant @ 2015-11-24 10:55 UTC (permalink / raw)
To: Dave Taht; +Cc: cake
[-- Attachment #1: Type: text/plain, Size: 3066 bytes --]
On 24/11/15 10:48, Dave Taht wrote:
> I don't know what this used to look like but it is essentially wrong
> in both (all?) versions.
>
> - q->peel_threshold = (q->rate_flags & CAKE_FLAG_ATM) ?
> - 0 : min(65535U, q->rate_bps >> 12);
> + q->peel_threshold = (q->rate_flags & CAKE_FLAG_ATM) ||
> + q->rate_overhead ? 0 : min(65535U, q->rate_bps >> 12);
>
> What we want to do is closer to:
>
> A) start peeling once we start accruing or incurring delay in excess
> of, say, 250usec. At 1Mbit, this is basically peel always. At a gbit,
> it's peel with roughly two 10 full-size packet offloads in play. There
> are nuances vs a vs ack GRO stuff (served with a 300 quantum in
> fq_codel), and in the 10-100Mbit range...
>
> A1) So doing nothing at a rate unlimited is wrong
> A2) Taking the current len * flows as a way to calculate it is wrong
> A3) I don't know if this was ever "right". It doesn't need to be
> perfect, but this is far from right...
>
> While I am unfond of the rate estimator's overhead, it perhaps could
> be used to calculate the peel threshold in a saner way...
>
> B) always peel when we are trying to do accurate on-wire accounting.
>
> As for the other patch...
>
> In general random pointer lookups into memory (like the skb->gro
> pointer) cost more than math as the other two params here are possibly
> part of a local cache hit already... and I have no idea what the ratio
> is between gso packets and how often you'd hit the comparison... but
> see point A2 above...
>
> - if (unlikely((len * max_t(u32, b->bulk_flow_count, 1U) >
> - q->peel_threshold && skb_is_gso(skb)))) {
>
> + if (unlikely(skb_is_gso(skb) &&
> + (len * max_t(u32, b->bulk_flow_count, 1U) >
> + q->peel_threshold))) {
>
>
>
>
>
> Dave Täht
> Let's go make home routers and wifi faster! With better software!
> https://www.gofundme.com/savewifi
>
>
> On Tue, Nov 24, 2015 at 10:12 AM, Kevin Darbyshire-Bryant
> <kevin@darbyshire-bryant.me.uk> wrote:
>> I've just pushed 2 commits related to GSO peeling behaviour to master.
>>
>> 1st tweak is at worst benign and at best removes a multiply compare for
>> every packet enqueued. I'd like to think the optimiser in the compiler
>> would have done what I've done explicitly (in essence check this is a
>> gso packet 1st before thinking about peeling it) but when I checked on
>> x86_64 there was a definite difference in produced code.
>>
>> 2nd tweak is *not* benign. In essence this forces peeling if either ATM
>> framing or packet overhead is specified. Previously only ATM framing
>> forced peeling. I think this is more correct but unfortunately will be
>> slower.
>>
>> Commits can be reverted - feel free :-)
>>
>>
>> _______________________________________________
>> Cake mailing list
>> Cake@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cake
>>
Both changes reverted
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4816 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2015-11-24 10:55 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-24 9:12 [Cake] GSO peel behaviour tweaks Kevin Darbyshire-Bryant
2015-11-24 10:48 ` Dave Taht
2015-11-24 10:55 ` Kevin Darbyshire-Bryant
2015-11-24 10:52 ` Sebastian Moeller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox