[Make-wifi-fast] [PATCH v5] mac80211: Switch to a virtual time-based airtime scheduler

Toke Høiland-Jørgensen toke at redhat.com
Tue Jan 7 05:43:47 EST 2020


John Yates <john at yates-sheets.org> writes:

> On Mon, Jan 6, 2020 at 10:54 AM Toke Høiland-Jørgensen <toke at redhat.com> wrote:
>> Yeah, we'd be doing the accumulation in 64bit values in any case; we're
>> talking about mainly multiplication here (the whole point of the
>> reciprocal stuff is to get the division out of the fast path). So how
>> big of an impact is one (or two) extra 64-bit multiplications going to
>> have on a 32bit platform?
>
> Top line: usually replacing 64 bit divide with multiply is a massive
> win.
>
> Many platforms make (32 bits * 32 bits) -> 64 bits quite cheap:
> - x86 has this as a single instruction: eax * edx -> eax:edx
> - arm has much the same, plus a variant that tacks on a 64 bit accumulation!
> - mips leaves the 64 bit product in a dedicated register; retrieval
> requires 2 instructions
> - ppc, being more "RISCy", has two instructions: mullo and mulhi
> (performs multiply twice!)

Ah, this is very useful, thanks :)

> Best case is when the compiler can recognize a 64 bit multiply as really
>
>   widen_32_to_64(left) x widen_32_to_64(right) -> 64_bit_product
>
> In such a case only one of the above multiply cases is necessary.  Otherwise
> one tends to get multiple partial products and double width additions.  Still,
> better than nearly any flavor of 64 bit divide.
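
To make the two cases concrete, here is a minimal C sketch. The helper
names (widen_mul32, mul64_low_partial) are made up for illustration and
are not from the patch. The first function is the form a compiler can
map onto a single widening multiply; the second spells out by hand
roughly what a 32-bit target has to do for a general 64x64 multiply
when only the low 64 bits of the product are kept.

```c
#include <stdint.h>

/* Hypothetical helper: both operands are known to be 32 bits wide, so
 * the cast tells the compiler that one 32x32 -> 64 widening multiply
 * is enough. */
static inline uint64_t widen_mul32(uint32_t left, uint32_t right)
{
	return (uint64_t)left * right;
}

/* Hypothetical helper: low 64 bits of a 64x64 multiply, built from
 * 32-bit halves. This needs one widening multiply, two plain 32-bit
 * multiplies and a double-width addition. */
static inline uint64_t mul64_low_partial(uint64_t a, uint64_t b)
{
	uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
	uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

	/* Low halves: full 64-bit partial product. */
	uint64_t lo = (uint64_t)a_lo * b_lo;

	/* Cross terms land in bits 32..63, so only their low 32 bits
	 * survive; a_hi * b_hi falls entirely above bit 63. */
	uint32_t cross = a_lo * b_hi + a_hi * b_lo;

	return lo + ((uint64_t)cross << 32);
}
```

Whether a given compiler actually emits the single-instruction form for
the first helper depends on the target and optimisation level, so this
is worth checking in the generated assembly.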

So going back to the original patch, we don't really need to use 64-bit
divides to compute the reciprocals; not sure what I was thinking there.
That leaves us with a single 32-bit divide whenever a station is
scheduled or unscheduled, and two 64-bit multiplications in
ieee80211_register_airtime().

If we assume no more than 8ms of airtime is being reported at a time, we
can use 2^19 as the divisor and keep the multiplication in 32 bits
without overflowing, which would keep the rounding error <10% for
weights <2^15. This should be enough for single-station weights, at
least. I think it could also be sufficient for the weight_sum for most
uses, actually, so we could start out with that and only fall back to
64-bit multiplication if it turns out people are pushing the weighted
fairness stuff to a point where this breaks?
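
As a rough sketch of the arithmetic (not the actual patch code; the
macro and function names below are invented, and airtime is assumed to
be reported in microseconds, capped at 8000): the reciprocal is
computed with one 32-bit divide, and the per-report scaling is then a
single 32-bit multiply. With weight 1 the reciprocal is 2^19, and
8000 * 2^19 = 4194304000, which still fits in 32 bits. For weights
below 2^15 the reciprocal is at least 16, so by my arithmetic
truncation loses less than 1/16 of its value, which is within the <10%
bound.

```c
#include <assert.h>
#include <stdint.h>

#define AIRTIME_RECIP_SHIFT	19		/* divisor is 2^19 */
#define AIRTIME_MAX_US		8000u		/* assumed cap: 8 ms per report */
#define AIRTIME_MAX_WEIGHT	(1u << 15)	/* weights must stay below this */

/* Slow path: one 32-bit divide, done when a weight is set or a
 * station is (un)scheduled. */
static uint32_t airtime_weight_recip(uint32_t weight)
{
	assert(weight > 0 && weight < AIRTIME_MAX_WEIGHT);
	return (1u << AIRTIME_RECIP_SHIFT) / weight;
}

/* Fast path: airtime / weight, scaled up by 2^19, using only a 32-bit
 * multiply. Cannot overflow while airtime_us <= AIRTIME_MAX_US,
 * because recip <= 2^19. */
static uint32_t airtime_scaled(uint32_t airtime_us, uint32_t recip)
{
	assert(airtime_us <= AIRTIME_MAX_US);
	return airtime_us * recip;
}
```

The result would still be accumulated into a 64-bit virtual time value;
only the multiplication itself stays in 32 bits.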

Johannes, WDYT? Also, what is a good place to document this?

-Toke


