On 24/11/15 10:48, Dave Taht wrote: > I don't know what this used to look like but it is essentially wrong > in both (all?) versions. > > - q->peel_threshold = (q->rate_flags & CAKE_FLAG_ATM) ? > - 0 : min(65535U, q->rate_bps >> 12); > + q->peel_threshold = (q->rate_flags & CAKE_FLAG_ATM) || > + q->rate_overhead ? 0 : min(65535U, q->rate_bps >> 12); > > What we want to do is closer to: > > A) start peeling once we start accruing or incurring delay in excess > of, say, 250usec. At 1Mbit, this is basically peel always. At a gbit, > it's peel with roughly two 10 full-size packet offloads in play. There > are nuances vs a vs ack GRO stuff (served with a 300 quantum in > fq_codel), and in the 10-100Mbit range... > > A1) So doing nothing at a rate unlimited is wrong > A2) Taking the current len * flows as a way to calculate it is wrong > A3) I don't know if this was ever "right". It doesn't need to be > perfect, but this is far from right... > > While I am unfond of the rate estimator's overhead, it perhaps could > be used to calculate the peel threshold in a saner way... > > B) always peel when we are trying to do accurate on-wire accounting. > > As for the other patch... > > In general random pointer lookups into memory (like the skb->gro > pointer) cost more than math as the other two params here are possibly > part of a local cache hit already... and I have no idea what the ratio > is between gso packets and how often you'd hit the comparison... but > see point A2 above... > > - if (unlikely((len * max_t(u32, b->bulk_flow_count, 1U) > > - q->peel_threshold && skb_is_gso(skb)))) { > > + if (unlikely(skb_is_gso(skb) && > + (len * max_t(u32, b->bulk_flow_count, 1U) > > + q->peel_threshold))) { > > > > > > Dave Täht > Let's go make home routers and wifi faster! With better software! > https://www.gofundme.com/savewifi > > > On Tue, Nov 24, 2015 at 10:12 AM, Kevin Darbyshire-Bryant > wrote: >> I've just pushed 2 commits related to GSO peeling behaviour to master. >> >> 1st tweak is at worst benign and at best removes a multiply compare for >> every packet enqueued. I'd like to think the optimiser in the compiler >> would have done what I've done explicitly (in essence check this is a >> gso packet 1st before thinking about peeling it) but when I checked on >> x86_64 there was a definite difference in produced code. >> >> 2nd tweak is *not* benign. In essence this forces peeling if either ATM >> framing or packet overhead is specified. Previously only ATM framing >> forced peeling. I think this is more correct but unfortunately will be >> slower. >> >> Commits can be reverted - feel free :-) >> >> >> _______________________________________________ >> Cake mailing list >> Cake@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/cake >> Both changes reverted