[Make-wifi-fast] [RFC/RFT] mac80211: Switch to a virtual time-based airtime scheduler

Toke Høiland-Jørgensen toke at redhat.com
Sat Apr 20 17:15:55 EDT 2019


Yibo Zhao <yiboz at codeaurora.org> writes:

> On 2019-04-11 19:24, Toke Høiland-Jørgensen wrote:
>> Yibo Zhao <yiboz at codeaurora.org> writes:
>> 
>>> On 2019-04-10 18:40, Toke Høiland-Jørgensen wrote:
>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>> 
>>>>> On 2019-04-10 04:41, Toke Høiland-Jørgensen wrote:
>>>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>>>> 
>>>>>>> On 2019-04-04 16:31, Toke Høiland-Jørgensen wrote:
>>>>>>>> Yibo Zhao <yiboz at codeaurora.org> writes:
>>>>>>>> 
>>>>>>>>> On 2019-02-16 01:05, Toke Høiland-Jørgensen wrote:
>>>>>>>>>> This switches the airtime scheduler in mac80211 to use a 
>>>>>>>>>> virtual
>>>>>>>>>> time-based
>>>>>>>>>> scheduler instead of the round-robin scheduler used before. 
>>>>>>>>>> This
>>>>>>>>>> has
>>>>>>>>>> a
>>>>>>>>>> couple of advantages:
>>>>>>>>>> 
>>>>>>>>>> - No need to sync up the round-robin scheduler in
>>>>>>>>>> firmware/hardware
>>>>>>>>>> with
>>>>>>>>>>   the round-robin airtime scheduler.
>>>>>>>>>> 
>>>>>>>>>> - If several stations are eligible for transmission we can
>>>>>>>>>> schedule
>>>>>>>>>> both of
>>>>>>>>>>   them; no need to hard-block the scheduling rotation until the
>>>>>>>>>> head
>>>>>>>>>> of
>>>>>>>>>> the
>>>>>>>>>>   queue has used up its quantum.
>>>>>>>>>> 
>>>>>>>>>> - The check of whether a station is eligible for transmission
>>>>>>>>>> becomes
>>>>>>>>>>   simpler (in ieee80211_txq_may_transmit()).
>>>>>>>>>> 
>>>>>>>>>> The drawback is that scheduling becomes slightly more 
>>>>>>>>>> expensive,
>>>>>>>>>> as
>>>>>>>>>> we
>>>>>>>>>> need
>>>>>>>>>> to maintain an rbtree of TXQs sorted by virtual time. This 
>>>>>>>>>> means
>>>>>>>>>> that
>>>>>>>>>> ieee80211_register_airtime() becomes O(logN) in the number of
>>>>>>>>>> currently
>>>>>>>>>> scheduled TXQs. However, hopefully this number rarely grows too
>>>>>>>>>> big
>>>>>>>>>> (it's
>>>>>>>>>> only TXQs currently backlogged, not all associated stations), 
>>>>>>>>>> so
>>>>>>>>>> it
>>>>>>>>>> shouldn't be too big of an issue.
>>>>>>>>>> 
>>>>>>>>>> @@ -1831,18 +1830,32 @@ void
>>>>>>>>>> ieee80211_sta_register_airtime(struct
>>>>>>>>>> ieee80211_sta *pubsta, u8 tid,
>>>>>>>>>>  {
>>>>>>>>>>  	struct sta_info *sta = container_of(pubsta, struct sta_info,
>>>>>>>>>> sta);
>>>>>>>>>>  	struct ieee80211_local *local = sta->sdata->local;
>>>>>>>>>> +	struct ieee80211_txq *txq = sta->sta.txq[tid];
>>>>>>>>>>  	u8 ac = ieee80211_ac_from_tid(tid);
>>>>>>>>>> -	u32 airtime = 0;
>>>>>>>>>> +	u64 airtime = 0, weight_sum;
>>>>>>>>>> +
>>>>>>>>>> +	if (!txq)
>>>>>>>>>> +		return;
>>>>>>>>>> 
>>>>>>>>>>  	if (sta->local->airtime_flags & AIRTIME_USE_TX)
>>>>>>>>>>  		airtime += tx_airtime;
>>>>>>>>>>  	if (sta->local->airtime_flags & AIRTIME_USE_RX)
>>>>>>>>>>  		airtime += rx_airtime;
>>>>>>>>>> 
>>>>>>>>>> +	/* Weights scale so the unit weight is 256 */
>>>>>>>>>> +	airtime <<= 8;
>>>>>>>>>> +
>>>>>>>>>>  	spin_lock_bh(&local->active_txq_lock[ac]);
>>>>>>>>>> +
>>>>>>>>>>  	sta->airtime[ac].tx_airtime += tx_airtime;
>>>>>>>>>>  	sta->airtime[ac].rx_airtime += rx_airtime;
>>>>>>>>>> -	sta->airtime[ac].deficit -= airtime;
>>>>>>>>>> +
>>>>>>>>>> +	weight_sum = local->airtime_weight_sum[ac] ?:
>>>>>>>>>> sta->airtime_weight;
>>>>>>>>>> +
>>>>>>>>>> +	local->airtime_v_t[ac] += airtime / weight_sum;
> Hi Toke,
>
> I was porting this version of ATF design to my ath10k platform and found 
> my old kernel version not supporting 64bit division. I'm wondering if it 
> is necessary to use u64 for airtime and weight_sum here though I can 
> find a solution for it. I think u32 might be enough. For airtime, 
> u32_max / 256 = 7182219 us(718 ms). As for weight_sum, u32_max / 8092 us 
> = 130490, meaning we can support more than 130000 nodes with airtime 
> weight 8092 us.

As Felix said, we don't really want divides in the fast path at all. And
since the divisors are constant, we should be able to just pre-compute
reciprocals and turn the whole thing into multiplications...

> Another finding was when I configured two 11ac STAs with different
> airtime weight, such as 256 and 1024 meaning ratio is 1:4, the
> throughput ratio was not roughly matching the ratio. Could you please
> share your results? I am not sure if it is due to platform difference.

Hmm, I tested them with ath9k where things seemed to work equivalently
to the DRR. Are you testing the same hardware with that? Would be a good
baseline.

I am on vacation until the end of the month, but can share my actual
test results once I get back...

-Toke


More information about the Make-wifi-fast mailing list