[Make-wifi-fast] [RFC/RFT] mac80211: Switch to a virtual time-based airtime scheduler

Yibo Zhao yiboz at codeaurora.org
Tue Apr 30 05:45:00 EDT 2019


On 2019-04-21 05:15, Toke Høiland-Jørgensen wrote:
> Yibo Zhao <yiboz at codeaurora.org> writes:
> 
>> On 2019-04-11 19:24, Toke Høiland-Jørgensen wrote:
>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>> 
>>>> On 2019-04-10 18:40, Toke Høiland-Jørgensen wrote:
>>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>>> 
>>>>>> On 2019-04-10 04:41, Toke Høiland-Jørgensen wrote:
>>>>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>>>>> 
>>>>>>>> On 2019-04-04 16:31, Toke Høiland-Jørgensen wrote:
>>>>>>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>>>>>>> 
>>>>>>>>>> On 2019-02-16 01:05, Toke Høiland-Jørgensen wrote:
>>>>>>>>>>> This switches the airtime scheduler in mac80211 to use a
>>>>>>>>>>> virtual
>>>>>>>>>>> time-based
>>>>>>>>>>> scheduler instead of the round-robin scheduler used before.
>>>>>>>>>>> This
>>>>>>>>>>> has
>>>>>>>>>>> a
>>>>>>>>>>> couple of advantages:
>>>>>>>>>>> 
>>>>>>>>>>> - No need to sync up the round-robin scheduler in
>>>>>>>>>>> firmware/hardware
>>>>>>>>>>> with
>>>>>>>>>>>   the round-robin airtime scheduler.
>>>>>>>>>>> 
>>>>>>>>>>> - If several stations are eligible for transmission we can
>>>>>>>>>>> schedule
>>>>>>>>>>> both of
>>>>>>>>>>>   them; no need to hard-block the scheduling rotation until 
>>>>>>>>>>> the
>>>>>>>>>>> head
>>>>>>>>>>> of
>>>>>>>>>>> the
>>>>>>>>>>>   queue has used up its quantum.
>>>>>>>>>>> 
>>>>>>>>>>> - The check of whether a station is eligible for transmission
>>>>>>>>>>> becomes
>>>>>>>>>>>   simpler (in ieee80211_txq_may_transmit()).
>>>>>>>>>>> 
>>>>>>>>>>> The drawback is that scheduling becomes slightly more
>>>>>>>>>>> expensive,
>>>>>>>>>>> as
>>>>>>>>>>> we
>>>>>>>>>>> need
>>>>>>>>>>> to maintain an rbtree of TXQs sorted by virtual time. This
>>>>>>>>>>> means
>>>>>>>>>>> that
>>>>>>>>>>> ieee80211_register_airtime() becomes O(logN) in the number of
>>>>>>>>>>> currently
>>>>>>>>>>> scheduled TXQs. However, hopefully this number rarely grows 
>>>>>>>>>>> too
>>>>>>>>>>> big
>>>>>>>>>>> (it's
>>>>>>>>>>> only TXQs currently backlogged, not all associated stations),
>>>>>>>>>>> so
>>>>>>>>>>> it
>>>>>>>>>>> shouldn't be too big of an issue.
>>>>>>>>>>> 
>>>>>>>>>>> @@ -1831,18 +1830,32 @@ void
>>>>>>>>>>> ieee80211_sta_register_airtime(struct
>>>>>>>>>>> ieee80211_sta *pubsta, u8 tid,
>>>>>>>>>>>  {
>>>>>>>>>>>  	struct sta_info *sta = container_of(pubsta, struct 
>>>>>>>>>>> sta_info,
>>>>>>>>>>> sta);
>>>>>>>>>>>  	struct ieee80211_local *local = sta->sdata->local;
>>>>>>>>>>> +	struct ieee80211_txq *txq = sta->sta.txq[tid];
>>>>>>>>>>>  	u8 ac = ieee80211_ac_from_tid(tid);
>>>>>>>>>>> -	u32 airtime = 0;
>>>>>>>>>>> +	u64 airtime = 0, weight_sum;
>>>>>>>>>>> +
>>>>>>>>>>> +	if (!txq)
>>>>>>>>>>> +		return;
>>>>>>>>>>> 
>>>>>>>>>>>  	if (sta->local->airtime_flags & AIRTIME_USE_TX)
>>>>>>>>>>>  		airtime += tx_airtime;
>>>>>>>>>>>  	if (sta->local->airtime_flags & AIRTIME_USE_RX)
>>>>>>>>>>>  		airtime += rx_airtime;
>>>>>>>>>>> 
>>>>>>>>>>> +	/* Weights scale so the unit weight is 256 */
>>>>>>>>>>> +	airtime <<= 8;
>>>>>>>>>>> +
>>>>>>>>>>>  	spin_lock_bh(&local->active_txq_lock[ac]);
>>>>>>>>>>> +
>>>>>>>>>>>  	sta->airtime[ac].tx_airtime += tx_airtime;
>>>>>>>>>>>  	sta->airtime[ac].rx_airtime += rx_airtime;
>>>>>>>>>>> -	sta->airtime[ac].deficit -= airtime;
>>>>>>>>>>> +
>>>>>>>>>>> +	weight_sum = local->airtime_weight_sum[ac] ?:
>>>>>>>>>>> sta->airtime_weight;
>>>>>>>>>>> +
>>>>>>>>>>> +	local->airtime_v_t[ac] += airtime / weight_sum;
>> Hi Toke,
>> 
>> I was porting this version of ATF design to my ath10k platform and 
>> found
>> my old kernel version not supporting 64bit division. I'm wondering if 
>> it
>> is necessary to use u64 for airtime and weight_sum here though I can
>> find a solution for it. I think u32 might be enough. For airtime,
>> u32_max / 256 = 7182219 us(718 ms). As for weight_sum, u32_max / 8092 
>> us
>> = 130490, meaning we can support more than 130000 nodes with airtime
>> weight 8092 us.
> 
> As Felix said, we don't really want divides in the fast path at all. 
> And
> since the divisors are constant, we should be able to just pre-compute
> reciprocals and turn the whole thing into multiplications...
> 
>> Another finding was when I configured two 11ac STAs with different
>> airtime weight, such as 256 and 1024 meaning ratio is 1:4, the
>> throughput ratio was not roughly matching the ratio. Could you please
>> share your results? I am not sure if it is due to platform difference.
> 
> Hmm, I tested them with ath9k where things seemed to work equivalently
> to the DRR. Are you testing the same hardware with that? Would be a 
> good
> baseline.
> 
> I am on vacation until the end of the month, but can share my actual
> test results once I get back...
Hi Toke,
I saw your commit in hostapd in
http://patchwork.ozlabs.org/patch/1059334/

For dynamic and limit mode described in above hostapd patch, do I need 
to change any code in this kernel patch or any other patches am I 
missing?

After a quick look at the hostapd patch, I guess all the efforts for 
both modes are done in hostapd. Correct me if I am wrong. :)
> 
> -Toke

-- 
Yibo


More information about the Make-wifi-fast mailing list