From mboxrd@z Thu Jan 1 00:00:00 1970
From: Toke Høiland-Jørgensen
To: Yibo Zhao
Cc: make-wifi-fast@lists.bufferbloat.net, linux-wireless@vger.kernel.org,
 Felix Fietkau, Rajkumar Manoharan, Kan Yan,
 linux-wireless-owner@vger.kernel.org
Date: Tue, 30 Apr 2019 12:39:13 +0200
Message-ID: <875zqvojzi.fsf@toke.dk>
Subject: Re: [Make-wifi-fast] [RFC/RFT] mac80211: Switch to a virtual
 time-based airtime scheduler

Yibo Zhao writes:

> On 2019-04-21 05:15, Toke Høiland-Jørgensen wrote:
>> Yibo Zhao writes:
>>
>>> On 2019-04-11 19:24, Toke Høiland-Jørgensen wrote:
>>>> Yibo Zhao writes:
>>>>
>>>>> On 2019-04-10 18:40, Toke Høiland-Jørgensen wrote:
>>>>>> Yibo Zhao writes:
>>>>>>
>>>>>>> On 2019-04-10 04:41, Toke Høiland-Jørgensen wrote:
>>>>>>>> Yibo Zhao writes:
>>>>>>>>
>>>>>>>>> On 2019-04-04 16:31, Toke Høiland-Jørgensen wrote:
>>>>>>>>>> Yibo Zhao writes:
>>>>>>>>>>
>>>>>>>>>>> On 2019-02-16 01:05, Toke Høiland-Jørgensen wrote:
>>>>>>>>>>>> This switches the airtime scheduler in mac80211 to use a virtual
>>>>>>>>>>>> time-based scheduler instead of the round-robin scheduler used
>>>>>>>>>>>> before. This has a couple of advantages:
>>>>>>>>>>>>
>>>>>>>>>>>> - No need to sync up the round-robin scheduler in firmware/hardware
>>>>>>>>>>>>   with the round-robin airtime scheduler.
>>>>>>>>>>>>
>>>>>>>>>>>> - If several stations are eligible for transmission we can schedule
>>>>>>>>>>>>   both of them; no need to hard-block the scheduling rotation until
>>>>>>>>>>>>   the head of the queue has used up its quantum.
>>>>>>>>>>>>
>>>>>>>>>>>> - The check of whether a station is eligible for transmission
>>>>>>>>>>>>   becomes simpler (in ieee80211_txq_may_transmit()).
>>>>>>>>>>>>
>>>>>>>>>>>> The drawback is that scheduling becomes slightly more expensive, as
>>>>>>>>>>>> we need to maintain an rbtree of TXQs sorted by virtual time. This
>>>>>>>>>>>> means that ieee80211_register_airtime() becomes O(logN) in the
>>>>>>>>>>>> number of currently scheduled TXQs. However, hopefully this number
>>>>>>>>>>>> rarely grows too big (it's only TXQs currently backlogged, not all
>>>>>>>>>>>> associated stations), so it shouldn't be too big of an issue.
>>>>>>>>>>>>
>>>>>>>>>>>> @@ -1831,18 +1830,32 @@ void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid,
>>>>>>>>>>>>  {
>>>>>>>>>>>>      struct sta_info *sta = container_of(pubsta, struct sta_info, sta);
>>>>>>>>>>>>      struct ieee80211_local *local = sta->sdata->local;
>>>>>>>>>>>> +    struct ieee80211_txq *txq = sta->sta.txq[tid];
>>>>>>>>>>>>      u8 ac = ieee80211_ac_from_tid(tid);
>>>>>>>>>>>> -    u32 airtime = 0;
>>>>>>>>>>>> +    u64 airtime = 0, weight_sum;
>>>>>>>>>>>> +
>>>>>>>>>>>> +    if (!txq)
>>>>>>>>>>>> +        return;
>>>>>>>>>>>>
>>>>>>>>>>>>      if (sta->local->airtime_flags & AIRTIME_USE_TX)
>>>>>>>>>>>>          airtime += tx_airtime;
>>>>>>>>>>>>      if (sta->local->airtime_flags & AIRTIME_USE_RX)
>>>>>>>>>>>>          airtime += rx_airtime;
>>>>>>>>>>>>
>>>>>>>>>>>> +    /* Weights scale so the unit weight is 256 */
>>>>>>>>>>>> +    airtime <<= 8;
>>>>>>>>>>>> +
>>>>>>>>>>>>      spin_lock_bh(&local->active_txq_lock[ac]);
>>>>>>>>>>>> +
>>>>>>>>>>>>      sta->airtime[ac].tx_airtime += tx_airtime;
>>>>>>>>>>>>      sta->airtime[ac].rx_airtime += rx_airtime;
>>>>>>>>>>>> -    sta->airtime[ac].deficit -= airtime;
>>>>>>>>>>>> +
>>>>>>>>>>>> +    weight_sum = local->airtime_weight_sum[ac] ?: sta->airtime_weight;
>>>>>>>>>>>> +
>>>>>>>>>>>> +    local->airtime_v_t[ac] += airtime / weight_sum;
>>> Hi Toke,
>>>
>>> I was porting this version of ATF design to my ath10k platform and found
>>> my old kernel version not supporting 64bit division. I'm wondering if it
>>> is necessary to use u64 for airtime and weight_sum here though I can
>>> find a solution for it. I think u32 might be enough. For airtime,
>>> u32_max / 256 = 7182219 us(718 ms). As for weight_sum, u32_max / 8092 us
>>> = 130490, meaning we can support more than 130000 nodes with airtime
>>> weight 8092 us.
>>
>> As Felix said, we don't really want divides in the fast path at all. And
>> since the divisors are constant, we should be able to just pre-compute
>> reciprocals and turn the whole thing into multiplications...
>>
>>> Another finding was when I configured two 11ac STAs with different
>>> airtime weight, such as 256 and 1024 meaning ratio is 1:4, the
>>> throughput ratio was not roughly matching the ratio. Could you please
>>> share your results? I am not sure if it is due to platform difference.
>>
>> Hmm, I tested them with ath9k where things seemed to work equivalently
>> to the DRR. Are you testing the same hardware with that? Would be a good
>> baseline.
>>
>> I am on vacation until the end of the month, but can share my actual
>> test results once I get back...
> Hi Toke,
> I saw your commit in hostapd in
> http://patchwork.ozlabs.org/patch/1059334/
>
> For dynamic and limit mode described in above hostapd patch, do I need
> to change any code in this kernel patch or any other patches am I
> missing?

Nope, the kernel just exposes the API to set weights, hostapd does
everything else :)

> After a quick look at the hostapd patch, I guess all the efforts for
> both modes are done in hostapd. Correct me if I am wrong. :)

You are quite right!

-Toke
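
The pre-computed reciprocal idea quoted above (replace the divide in the
fast path with a multiplication) can be sketched roughly as follows. This
is a minimal userspace illustration, not the mac80211 code: RECIP_SHIFT,
weight_reciprocal() and airtime_to_vt() are hypothetical names, and the
24-bit fixed-point precision is an assumption chosen so that the 64-bit
product cannot overflow for any u32 airtime after the << 8 scaling.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed-point precision: reciprocals carry 24 fractional bits. */
#define RECIP_SHIFT 24

/*
 * Slow path (hypothetical helper): run once whenever the weight sum
 * changes. Only a 32-bit division is needed here.
 */
static uint32_t weight_reciprocal(uint32_t weight_sum)
{
	assert(weight_sum != 0);
	return (UINT32_C(1) << RECIP_SHIFT) / weight_sum;
}

/*
 * Fast path (hypothetical helper): approximates
 * (airtime_us << 8) / weight_sum with a multiply and a shift.
 * The reciprocal is truncated, so the result can come out slightly
 * below the exact quotient.
 */
static uint64_t airtime_to_vt(uint32_t airtime_us, uint32_t recip)
{
	uint64_t scaled = (uint64_t)airtime_us << 8; /* unit weight is 256 */

	return (scaled * recip) >> RECIP_SHIFT;
}
```

With a weight sum of 256 the result is exact (1000 us of airtime advances
virtual time by 1000). With a weight sum that is not a power of two, such
as 256 + 1024 = 1280, the truncated reciprocal gives 199 where the exact
quotient is 200, so a real implementation would have to decide how much
precision it needs.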