Subject: Re: [RFC] mac80211: implement eBDP algorithm to fight bufferbloat
From: Nathaniel Smith
To: "John W. Linville"
Cc: bloat-devel@lists.bufferbloat.net, johannes@sipsolutions.net, linux-wireless@vger.kernel.org
Date: Wed, 16 Feb 2011 20:26:44 -0800

On Wed, Feb 16, 2011 at 5:49 PM, John W. Linville wrote:
> I'm sure this isn't ideal.  This should be combined with the ALT
> algorithm to yield the A* algorithm.  There are parameters that
> probably should be tunable (or at least better researched).  This may
> not be ideal for 802.11n -- it may even be detrimental to it.

Excellent!

General thoughts: I think you're wiping out any interrupt mitigation
any drivers are doing, because they'll never get filled up to even
their low water mark?
(http://article.gmane.org/gmane.linux.kernel.wireless.general/64843
has a scheme for adapting the eBDP idea to interrupt mitigation.)

It's important to keep in mind the distinction between:
 -- a host's total tx buffer
 -- the individual queues that make up that buffer

In Linux we have two queues in series: the net subsystem's Qdisc,
which feeds the driver's tx queue. It's this distinction that makes it
reasonable to shrink the tx queue down to really tiny sizes (a few
ms), because while a router needs a few hundred milliseconds (~one
RTT) of *total* buffering to absorb bursty packet arrivals, we want as
much of that buffering as possible to happen in the Qdisc, where AQM
and QoS can be applied. A* is an algorithm for estimating the right
total host buffer size.

(Of course, we might want to just disable the Qdisc buffering and move
everything inside the driver -- Felix Fietkau is considering this [1],
because when aggregating you really want a separate buffer per
destination STA, and there's no easy way to do that with the current
system -- but obviously that raises its own issues...)

[1] https://lists.bufferbloat.net/pipermail/bloat-devel/2011-February/000013.html
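To make the "keep the buffering in the Qdisc" point concrete, here's
the rough shape I have in mind (untested sketch; the "enqueued"
counter and the example_* names are made up, and the limit would come
from the service-time estimate, not a constant):

	#include <net/mac80211.h>

	/* hypothetical per-interface state: how many frames are sitting
	 * in the small driver-side queue right now */
	struct example_sdata {
		atomic_t enqueued;
	};

	/* tx path: count the frame, and once the small driver-side queue
	 * is full, stop the mac80211 queue so further packets back up in
	 * the Qdisc, where AQM and QoS can act on them */
	static void example_tx_admit(struct ieee80211_hw *hw,
				     struct example_sdata *sdata,
				     struct sk_buff *skb, int max_enqueued)
	{
		if (atomic_inc_return(&sdata->enqueued) >= max_enqueued)
			ieee80211_stop_queue(hw, skb_get_queue_mapping(skb));
	}

	/* tx-status path: the frame has been serviced, so make room and
	 * wake the queue again */
	static void example_tx_complete(struct ieee80211_hw *hw,
					struct example_sdata *sdata,
					struct sk_buff *skb, int max_enqueued)
	{
		if (atomic_dec_return(&sdata->enqueued) < max_enqueued)
			ieee80211_wake_queue(hw, skb_get_queue_mapping(skb));
	}

The important part is that nothing gets dropped here; the flow control
just pushes the standing queue back up into the Qdisc layer.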
> @@ -212,6 +216,11 @@ void ieee80211_tx_status(struct ieee80211_hw *hw, struct sk_buff *skb)
>                 if (memcmp(hdr->addr2, sta->sdata->vif.addr, ETH_ALEN))
>                         continue;
>
> +               atomic_dec(&sta->sdata->enqueued);
> +
> +               /* factor current tserv into service time estimate */
> +               ewma_add(&sta->sdata->tserv_ns_avg, ktime_to_ns(tserv));
> +

I think you're calculating the total time that the packet was resident
in the queue, and treating it like the time to service a single
packet. In my patch I also stored the current queue length at the time
the packet was enqueued, and then divided the time delta by the number
of packets that were serviced in that time.

> @@ -1323,6 +1325,20 @@ static int __ieee80211_tx(struct ieee80211_local *local,
>
>                 sdata = vif_to_sdata(info->control.vif);
>
> +               /* test for queue admission qualifications */
> +               tserv_ns_avg = ewma_read(&sdata->tserv_ns_avg);
> +               /* constants 2 msec and offset 5 should be tunable? */
> +               max_enqueued = 2 * NSEC_PER_MSEC / tserv_ns_avg + 5;

5 packets worth of fudge factor seems high. I can measure 15-20 ms
single-packet service times here just by turning down to 1 Mbps on an
uncontended network; the Li et al. paper you link has a graph
suggesting that on contended networks, 50-100 ms/packet is not
uncommon. (And even if I don't force my card to 1 Mbps, with my patch
I'm still seeing appropriate buffer sizes in the 1-2 packet range.) So
you may be unconditionally adding a few hundred milliseconds of
latency here.

They make a good point that you might want some extra space to absorb
short-term fluctuations in packet service times, but I think it'd make
more sense to do that by clamping the queue length to some minimum
value (2 packets?), and possibly bumping up the magic "2 ms" constant.
Or be even more clever and estimate the standard deviation of the
single-packet service times...
(http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#On-line_algorithm)

> +               if (atomic_read(&sdata->enqueued) > max_enqueued) {
> +                       /* silently drop */
> +                       dev_kfree_skb(skb);
> +                       return IEEE80211_TX_OK;
> +               }

Shouldn't you be stopping the queue, too? I think by leaving the queue
open and discarding everything, you effectively disable the net
layer's Qdisc buffering, which might also be a factor in your
observations of reduced bufferbloat :-).

-- Nathaniel
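P.S. In case it's useful, here's the shape of that on-line variance
calculation (plain userspace C just to illustrate -- a kernel version
would need fixed-point arithmetic, and all the names here are made
up):

	#include <math.h>

	/* Welford's on-line algorithm: running mean and variance of the
	 * single-packet service times, updated one sample at a time */
	struct tserv_stats {
		unsigned long n;   /* samples seen so far */
		double mean;       /* running mean (ns) */
		double m2;         /* sum of squared deviations from the mean */
	};

	static void tserv_stats_add(struct tserv_stats *s, double tserv_ns)
	{
		double delta = tserv_ns - s->mean;

		s->n++;
		s->mean += delta / s->n;
		s->m2 += delta * (tserv_ns - s->mean);
	}

	static double tserv_stats_stddev(const struct tserv_stats *s)
	{
		return s->n > 1 ? sqrt(s->m2 / (s->n - 1)) : 0.0;
	}

Then instead of a fixed "+ 5", the fudge factor could be something
like a couple of standard deviations' worth of packets on top of the
2 ms target.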