[Make-wifi-fast] [RFC/RFT] mac80211: Switch to a virtual time-based airtime scheduler
Toke Høiland-Jørgensen
toke at redhat.com
Tue Apr 30 06:39:13 EDT 2019
Yibo Zhao <yiboz at codeaurora.org> writes:
> On 2019-04-21 05:15, Toke Høiland-Jørgensen wrote:
>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>
>>> On 2019-04-11 19:24, Toke Høiland-Jørgensen wrote:
>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>>
>>>>> On 2019-04-10 18:40, Toke Høiland-Jørgensen wrote:
>>>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>>>>
>>>>>>> On 2019-04-10 04:41, Toke Høiland-Jørgensen wrote:
>>>>>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>>>>>>
>>>>>>>>> On 2019-04-04 16:31, Toke Høiland-Jørgensen wrote:
>>>>>>>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>>>>>>>>
>>>>>>>>>>> On 2019-02-16 01:05, Toke Høiland-Jørgensen wrote:
>>>>>>>>>>>> This switches the airtime scheduler in mac80211 to use a
>>>>>>>>>>>> virtual
>>>>>>>>>>>> time-based
>>>>>>>>>>>> scheduler instead of the round-robin scheduler used before.
>>>>>>>>>>>> This
>>>>>>>>>>>> has
>>>>>>>>>>>> a
>>>>>>>>>>>> couple of advantages:
>>>>>>>>>>>>
>>>>>>>>>>>> - No need to sync up the round-robin scheduler in
>>>>>>>>>>>> firmware/hardware
>>>>>>>>>>>> with
>>>>>>>>>>>> the round-robin airtime scheduler.
>>>>>>>>>>>>
>>>>>>>>>>>> - If several stations are eligible for transmission we can
>>>>>>>>>>>> schedule
>>>>>>>>>>>> both of
>>>>>>>>>>>> them; no need to hard-block the scheduling rotation until
>>>>>>>>>>>> the
>>>>>>>>>>>> head
>>>>>>>>>>>> of
>>>>>>>>>>>> the
>>>>>>>>>>>> queue has used up its quantum.
>>>>>>>>>>>>
>>>>>>>>>>>> - The check of whether a station is eligible for transmission
>>>>>>>>>>>> becomes
>>>>>>>>>>>> simpler (in ieee80211_txq_may_transmit()).
>>>>>>>>>>>>
>>>>>>>>>>>> The drawback is that scheduling becomes slightly more
>>>>>>>>>>>> expensive,
>>>>>>>>>>>> as
>>>>>>>>>>>> we
>>>>>>>>>>>> need
>>>>>>>>>>>> to maintain an rbtree of TXQs sorted by virtual time. This
>>>>>>>>>>>> means
>>>>>>>>>>>> that
>>>>>>>>>>>> ieee80211_register_airtime() becomes O(logN) in the number of
>>>>>>>>>>>> currently
>>>>>>>>>>>> scheduled TXQs. However, hopefully this number rarely grows
>>>>>>>>>>>> too
>>>>>>>>>>>> big
>>>>>>>>>>>> (it's
>>>>>>>>>>>> only TXQs currently backlogged, not all associated stations),
>>>>>>>>>>>> so
>>>>>>>>>>>> it
>>>>>>>>>>>> shouldn't be too big of an issue.
>>>>>>>>>>>>
>>>>>>>>>>>> @@ -1831,18 +1830,32 @@ void
>>>>>>>>>>>> ieee80211_sta_register_airtime(struct
>>>>>>>>>>>> ieee80211_sta *pubsta, u8 tid,
>>>>>>>>>>>> {
>>>>>>>>>>>> struct sta_info *sta = container_of(pubsta, struct
>>>>>>>>>>>> sta_info,
>>>>>>>>>>>> sta);
>>>>>>>>>>>> struct ieee80211_local *local = sta->sdata->local;
>>>>>>>>>>>> + struct ieee80211_txq *txq = sta->sta.txq[tid];
>>>>>>>>>>>> u8 ac = ieee80211_ac_from_tid(tid);
>>>>>>>>>>>> - u32 airtime = 0;
>>>>>>>>>>>> + u64 airtime = 0, weight_sum;
>>>>>>>>>>>> +
>>>>>>>>>>>> + if (!txq)
>>>>>>>>>>>> + return;
>>>>>>>>>>>>
>>>>>>>>>>>> if (sta->local->airtime_flags & AIRTIME_USE_TX)
>>>>>>>>>>>> airtime += tx_airtime;
>>>>>>>>>>>> if (sta->local->airtime_flags & AIRTIME_USE_RX)
>>>>>>>>>>>> airtime += rx_airtime;
>>>>>>>>>>>>
>>>>>>>>>>>> + /* Weights scale so the unit weight is 256 */
>>>>>>>>>>>> + airtime <<= 8;
>>>>>>>>>>>> +
>>>>>>>>>>>> spin_lock_bh(&local->active_txq_lock[ac]);
>>>>>>>>>>>> +
>>>>>>>>>>>> sta->airtime[ac].tx_airtime += tx_airtime;
>>>>>>>>>>>> sta->airtime[ac].rx_airtime += rx_airtime;
>>>>>>>>>>>> - sta->airtime[ac].deficit -= airtime;
>>>>>>>>>>>> +
>>>>>>>>>>>> + weight_sum = local->airtime_weight_sum[ac] ?:
>>>>>>>>>>>> sta->airtime_weight;
>>>>>>>>>>>> +
>>>>>>>>>>>> + local->airtime_v_t[ac] += airtime / weight_sum;
>>> Hi Toke,
>>>
>>> I was porting this version of ATF design to my ath10k platform and
>>> found
>>> my old kernel version not supporting 64bit division. I'm wondering if
>>> it
>>> is necessary to use u64 for airtime and weight_sum here though I can
>>> find a solution for it. I think u32 might be enough. For airtime,
>>> u32_max / 256 = 7182219 us(718 ms). As for weight_sum, u32_max / 8092
>>> us
>>> = 130490, meaning we can support more than 130000 nodes with airtime
>>> weight 8092 us.
>>
>> As Felix said, we don't really want divides in the fast path at all.
>> And
>> since the divisors are constant, we should be able to just pre-compute
>> reciprocals and turn the whole thing into multiplications...
>>
>>> Another finding was when I configured two 11ac STAs with different
>>> airtime weight, such as 256 and 1024 meaning ratio is 1:4, the
>>> throughput ratio was not roughly matching the ratio. Could you please
>>> share your results? I am not sure if it is due to platform difference.
>>
>> Hmm, I tested them with ath9k where things seemed to work equivalently
>> to the DRR. Are you testing the same hardware with that? Would be a
>> good
>> baseline.
>>
>> I am on vacation until the end of the month, but can share my actual
>> test results once I get back...
> Hi Toke,
> I saw your commit in hostapd in
> http://patchwork.ozlabs.org/patch/1059334/
>
> For dynamic and limit mode described in above hostapd patch, do I need
> to change any code in this kernel patch or any other patches am I
> missing?
Nope, the kernel just exposes the API to set weights, hostapd does
everything else :)
> After a quick look at the hostapd patch, I guess all the efforts for
> both modes are done in hostapd. Correct me if I am wrong. :)
You are quite right!
-Toke
More information about the Make-wifi-fast
mailing list