[Make-wifi-fast] [RFC/RFT] mac80211: Switch to a virtual time-based airtime scheduler

Toke Høiland-Jørgensen toke at redhat.com
Tue Apr 30 06:39:13 EDT 2019


Yibo Zhao <yiboz at codeaurora.org> writes:

> On 2019-04-21 05:15, Toke Høiland-Jørgensen wrote:
>> Yibo Zhao <yiboz at codeaurora.org> writes:
>> 
>>> On 2019-04-11 19:24, Toke Høiland-Jørgensen wrote:
>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>> 
>>>>> On 2019-04-10 18:40, Toke Høiland-Jørgensen wrote:
>>>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>>>> 
>>>>>>> On 2019-04-10 04:41, Toke Høiland-Jørgensen wrote:
>>>>>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>>>>>> 
>>>>>>>>> On 2019-04-04 16:31, Toke Høiland-Jørgensen wrote:
>>>>>>>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>>>>>>>> 
>>>>>>>>>>> On 2019-02-16 01:05, Toke Høiland-Jørgensen wrote:
>>>>>>>>>>>> This switches the airtime scheduler in mac80211 to use a virtual
>>>>>>>>>>>> time-based scheduler instead of the round-robin scheduler used
>>>>>>>>>>>> before. This has a couple of advantages:
>>>>>>>>>>>> 
>>>>>>>>>>>> - No need to sync up the round-robin scheduler in firmware/hardware
>>>>>>>>>>>>   with the round-robin airtime scheduler.
>>>>>>>>>>>> 
>>>>>>>>>>>> - If several stations are eligible for transmission we can schedule
>>>>>>>>>>>>   all of them; no need to hard-block the scheduling rotation until
>>>>>>>>>>>>   the head of the queue has used up its quantum.
>>>>>>>>>>>> 
>>>>>>>>>>>> - The check of whether a station is eligible for transmission
>>>>>>>>>>>>   becomes simpler (in ieee80211_txq_may_transmit()).
>>>>>>>>>>>> 
>>>>>>>>>>>> The drawback is that scheduling becomes slightly more expensive, as
>>>>>>>>>>>> we need to maintain an rbtree of TXQs sorted by virtual time. This
>>>>>>>>>>>> means that ieee80211_sta_register_airtime() becomes O(log N) in the
>>>>>>>>>>>> number of currently scheduled TXQs. However, hopefully this number
>>>>>>>>>>>> rarely grows too big (it's only TXQs currently backlogged, not all
>>>>>>>>>>>> associated stations), so it shouldn't be too big of an issue.
>>>>>>>>>>>> 
>>>>>>>>>>>> @@ -1831,18 +1830,32 @@ void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid,
>>>>>>>>>>>>  {
>>>>>>>>>>>>  	struct sta_info *sta = container_of(pubsta, struct sta_info, sta);
>>>>>>>>>>>>  	struct ieee80211_local *local = sta->sdata->local;
>>>>>>>>>>>> +	struct ieee80211_txq *txq = sta->sta.txq[tid];
>>>>>>>>>>>>  	u8 ac = ieee80211_ac_from_tid(tid);
>>>>>>>>>>>> -	u32 airtime = 0;
>>>>>>>>>>>> +	u64 airtime = 0, weight_sum;
>>>>>>>>>>>> +
>>>>>>>>>>>> +	if (!txq)
>>>>>>>>>>>> +		return;
>>>>>>>>>>>> 
>>>>>>>>>>>>  	if (sta->local->airtime_flags & AIRTIME_USE_TX)
>>>>>>>>>>>>  		airtime += tx_airtime;
>>>>>>>>>>>>  	if (sta->local->airtime_flags & AIRTIME_USE_RX)
>>>>>>>>>>>>  		airtime += rx_airtime;
>>>>>>>>>>>> 
>>>>>>>>>>>> +	/* Weights scale so the unit weight is 256 */
>>>>>>>>>>>> +	airtime <<= 8;
>>>>>>>>>>>> +
>>>>>>>>>>>>  	spin_lock_bh(&local->active_txq_lock[ac]);
>>>>>>>>>>>> +
>>>>>>>>>>>>  	sta->airtime[ac].tx_airtime += tx_airtime;
>>>>>>>>>>>>  	sta->airtime[ac].rx_airtime += rx_airtime;
>>>>>>>>>>>> -	sta->airtime[ac].deficit -= airtime;
>>>>>>>>>>>> +
>>>>>>>>>>>> +	weight_sum = local->airtime_weight_sum[ac] ?: sta->airtime_weight;
>>>>>>>>>>>> +
>>>>>>>>>>>> +	local->airtime_v_t[ac] += airtime / weight_sum;
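
The accounting step in the quoted hunk can be modelled in userspace. This is a minimal sketch, not the mac80211 code: the toy_* names and both structs are hypothetical, and the per-station update (airtime divided by the station's own weight) is an assumption based on how weighted virtual-time schedulers usually work, since the quoted diff is cut off before that point.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model; these are not the mac80211 structures. */
struct toy_sta {
	uint64_t v_t;    /* per-station virtual time (assumed field) */
	uint32_t weight; /* airtime weight, 256 is the unit weight */
};

struct toy_sched {
	uint64_t v_t;        /* global virtual time */
	uint64_t weight_sum; /* sum of weights of backlogged stations, 0 if none */
};

static void toy_register_airtime(struct toy_sched *s, struct toy_sta *sta,
				 uint32_t airtime_us)
{
	uint64_t airtime, weight_sum;

	assert(sta->weight != 0);

	/* Scale so that a station with weight 256 advances 1 unit per us. */
	airtime = (uint64_t)airtime_us << 8;

	/* Fall back to the station's own weight when nothing is backlogged. */
	weight_sum = s->weight_sum ? s->weight_sum : sta->weight;

	/* Global clock advances by airtime shared over all active weight. */
	s->v_t += airtime / weight_sum;

	/* Assumption: the station's own clock advances by airtime / weight. */
	sta->v_t += airtime / sta->weight;
}
```

With weights 256 and 1024 and 1000 us charged to each station, the station clocks advance by 1000 and 250 units respectively. A scheduler that always serves the smallest virtual time would therefore give the heavier station roughly four times the airtime.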
>>> Hi Toke,
>>> 
>>> I was porting this version of the ATF design to my ath10k platform and
>>> found that my old kernel version does not support 64-bit division. I'm
>>> wondering whether it is necessary to use u64 for airtime and weight_sum
>>> here, though I can find a workaround for it. I think u32 might be
>>> enough. For airtime, u32_max / 256 = 16777215 us (about 16.8 s). As for
>>> weight_sum, u32_max / 8092 = 530767, meaning we can support more than
>>> 530000 nodes with an airtime weight of 8092.
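
The two u32 limits discussed above can be recomputed directly. This is a small sketch with hypothetical helper names (toy_max_airtime_us, toy_max_stations); it only checks the arithmetic and says nothing about whether u32 is the right choice for the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Largest airtime in us that still fits in a u32 after the << 8 scaling. */
static inline uint32_t toy_max_airtime_us(void)
{
	return UINT32_MAX >> 8;
}

/* Number of stations of one weight whose weight sum still fits in a u32. */
static inline uint32_t toy_max_stations(uint32_t weight)
{
	assert(weight != 0);
	return UINT32_MAX / weight;
}
```

With these definitions the scaled airtime limit comes to 16777215 us (about 16.8 s), and a weight of 8092 allows 530767 stations before the sum overflows.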
>> 
>> As Felix said, we don't really want divides in the fast path at all.
>> And since the divisors are constant, we should be able to just
>> pre-compute reciprocals and turn the whole thing into multiplications...
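
The reciprocal idea can be sketched as follows. This is an illustration under stated assumptions, not the code that went into mac80211: the toy_* names and the 32-bit fixed-point shift are hypothetical, and the scaled airtime is assumed to fit in a u32. The division runs once, when a weight is set; the per-packet path then needs only a 32x32-to-64-bit multiply and a shift.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RECIP_SHIFT 32 /* hypothetical fixed-point precision */

/* Slow path: run once whenever the weight changes. */
static inline uint64_t toy_weight_reciprocal(uint32_t weight)
{
	assert(weight != 0);
	return (UINT64_C(1) << TOY_RECIP_SHIFT) / weight;
}

/*
 * Fast path: approximates airtime / weight without a divide.
 * The product cannot overflow, since airtime < 2^32 and recip <= 2^32.
 */
static inline uint64_t toy_div_by_weight(uint32_t airtime, uint64_t recip)
{
	return ((uint64_t)airtime * recip) >> TOY_RECIP_SHIFT;
}
```

Because the reciprocal is rounded down, the result can be one less than the exact quotient when the weight is not a power of two. For power-of-two weights such as 256 or 1024 it is exact.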
>> 
>>> Another finding: when I configured two 11ac STAs with different
>>> airtime weights, such as 256 and 1024 (a 1:4 ratio), the throughput
>>> ratio did not roughly match the weight ratio. Could you please share
>>> your results? I am not sure if it is due to platform differences.
>> 
>> Hmm, I tested them with ath9k where things seemed to work equivalently
>> to the DRR. Are you testing the same hardware with that? Would be a
>> good baseline.
>> 
>> I am on vacation until the end of the month, but can share my actual
>> test results once I get back...
> Hi Toke,
> I saw your commit in hostapd in
> http://patchwork.ozlabs.org/patch/1059334/
>
> For the dynamic and limit modes described in the above hostapd patch,
> do I need to change any code in this kernel patch, or are there other
> patches I am missing?

Nope, the kernel just exposes the API to set weights, hostapd does
everything else :)

> After a quick look at the hostapd patch, I guess all the work for
> both modes is done in hostapd. Correct me if I am wrong. :)

You are quite right!

-Toke

