From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [RFC LOL OMG] pfifo_lat: qdisc that limits dequeueing based on estimated
 link latency
From: Eric Dumazet
To: "John W. Linville"
Cc: netdev@vger.kernel.org, bloat-devel@lists.bufferbloat.net
In-Reply-To: <1299102850-2883-1-git-send-email-linville@tuxdriver.com>
References: <20110228132341.194975v6ojrudl18@hayate.sektori.org> <1299102850-2883-1-git-send-email-linville@tuxdriver.com>
Date: Thu, 03 Mar 2011 13:51:15 +0100
Message-ID: <1299156675.2983.65.camel@edumazet-laptop>

On Wednesday, 2 March 2011 at 16:54 -0500, John W. Linville wrote:
> This is a qdisc based on the existing pfifo_fast code. The difference
> is that this qdisc limits the dequeue rate based on estimates of how
> many packets can be in-flight at a given time while maintaining a target
> link latency.
>
> This work is based on the eBDP documented in Section IV of "Buffer
> Sizing for 802.11 Based Networks" by Tianji Li, et al.
>
> http://www.hamilton.ie/tianji_li/buffersizing.pdf
>
> This implementation timestamps an skb as it dequeues it, then
> computes the service time when the frame is freed by the driver.
> An exponentially weighted moving average of per fragment service times
> is used to restrict queueing delays in hopes of achieving a target
> fragment transmission latency. The skb->deconstructor mechanism is
> abused in order to obtain packet service time estimates.
>
> Signed-off-by: John W. Linville
> ---
> I took a whack at reimplementing my eBDP patch at the qdisc level.
> Unfortunately, it doesn't seem to work very well and I'm at a loss
> as to why... :-( Comments welcome -- maybe I'm doing something really
> stupid in the math and just can't see it.
>
> The skb->deconstructor abuse includes adding a union member in the skb
> to record the qdisc->handle on the way out so that it can be used for
> accounting in the deconstructor -- thanks to Neil Horman for the
> suggestion!
>
> The reason I think this is an idea worth exploring is that existing
> qdisc code doesn't seem to account for the fact that the devices could
> be doing a lot of queueing behind them. Even Jussi's recent
> sch_fifo_ewma post doesn't seem to take into account how long the device
> holds-on to packets, which limits his ability to fight latency.
>
> Anyway, all comments appreciated!
>

Well, there are many issues in your patch.

The skb destructor cannot be used like that. Think about locking, and
about the various contexts in which drivers actually free skbs: from
hard interrupt, from softirq, or even _before_ the data is sent on the
wire.

qdisc_lookup(skb->dev, skb->qdhandle), for example, is only safe if run
with RTNL held. It's not meant to be used in the fast path at all, only
in management code.

Getting feedback when an skb is freed (with a notification of whether
it was delivered or dropped) is a recurring idea, so we might design a
stackable infrastructure.
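
To make the two ideas above concrete, here is a rough user-space C
sketch, not kernel code and not the code from the patch. It assumes a
per-packet stack of completion listeners that are told whether the
packet was delivered or dropped, and one listener that feeds an EWMA of
service times and derives an in-flight limit as target latency divided
by average service time. Every name in it (pkt_notifier, svc_estimator,
est_inflight_limit, and so on) is hypothetical, and locking and
execution context, which are the hard part in the kernel, are ignored
entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none of this is kernel API. */

enum pkt_status { PKT_DELIVERED, PKT_DROPPED };

/* One link in a per-packet stack of completion listeners. */
struct pkt_notifier {
    void (*notify)(struct pkt_notifier *self, enum pkt_status st,
                   uint64_t service_ns);
    struct pkt_notifier *next;
};

struct packet {
    uint64_t dequeue_ns;            /* timestamp taken at dequeue */
    struct pkt_notifier *notifiers; /* most recently pushed first */
};

/* EWMA of per-packet service time, weight 1 / 2^shift. */
struct svc_estimator {
    uint64_t avg_ns;    /* 0 means "no sample yet" */
    uint64_t target_ns; /* target queueing latency */
    unsigned int shift;
};

/* Listener that feeds an estimator; base must stay the first member. */
struct est_listener {
    struct pkt_notifier base;
    struct svc_estimator *est;
};

static void est_update(struct svc_estimator *e, uint64_t sample_ns)
{
    assert(e->shift < 32);
    if (e->avg_ns == 0)
        e->avg_ns = sample_ns; /* first sample seeds the average */
    else
        e->avg_ns = e->avg_ns - (e->avg_ns >> e->shift)
                  + (sample_ns >> e->shift);
}

/* Packets allowed in flight: target latency / average service time. */
static uint64_t est_inflight_limit(const struct svc_estimator *e)
{
    uint64_t n;

    if (e->avg_ns == 0)
        return 1; /* nothing measured yet, stay conservative */
    n = e->target_ns / e->avg_ns;
    return n ? n : 1;
}

static void est_notify(struct pkt_notifier *self, enum pkt_status st,
                       uint64_t service_ns)
{
    struct est_listener *l = (struct est_listener *)self;

    /* Assumption: only delivered packets say anything about the link. */
    if (st == PKT_DELIVERED)
        est_update(l->est, service_ns);
}

static void pkt_push_notifier(struct packet *p, struct pkt_notifier *n)
{
    n->next = p->notifiers;
    p->notifiers = n;
}

/* Called once when the packet is finished; runs every stacked listener. */
static void pkt_complete(struct packet *p, enum pkt_status st,
                         uint64_t now_ns)
{
    uint64_t service_ns = now_ns - p->dequeue_ns;
    struct pkt_notifier *n = p->notifiers;

    p->notifiers = NULL;
    while (n) {
        struct pkt_notifier *next = n->next;

        n->notify(n, st, service_ns);
        n = next;
    }
}
```

The delivered/dropped status is what lets a listener ignore frees that
happen before the data reaches the wire. Whether dropped packets should
contribute to the average is a design choice; the sketch simply skips
them.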