From: Jonathan Morton
To: "John W. Linville"
Cc: bloat@lists.bufferbloat.net
Date: Wed, 16 Mar 2011 00:01:41 +0200
Subject: Re: [Bloat] Random idea in reaction to all the discussion of TCP
 flavours - timestamps?
Message-Id: <219C7840-ED79-49EA-929D-96C5A6200401@gmail.com>
In-Reply-To: <20110315205146.GF2542@tuxdriver.com>
References: <4D7F4121.40307@freedesktop.org> <20110315175942.GA10064@goldfish>
 <1300212877.2087.2155.camel@tardy> <20110315183111.GB2542@tuxdriver.com>
 <29B06777-CC5F-4802-8727-B04F58CDA9E3@gmail.com>
 <20110315205146.GF2542@tuxdriver.com>

On 15 Mar, 2011, at 10:51 pm, John W. Linville wrote:

>>> If you don't throttle _both_
>>> the _enqueue_ and the _dequeue_, then you could be keeping a nice,
>>> near-empty tx queue on the host and still have a long, bloated queue
>>> building at the device.
>>
>> Don't devices at least let you query how full their queue is?
>
> I suppose it depends on what you mean?  Presumably drivers know that,
> or at least can figure it out.  The accuracy of that might depend on
> the exact mechanism, how often the tx rings are replenished, etc.
>
> However, I'm not aware of any API that would let something in the
> stack (e.g. a qdisc) query the device driver for the current device
> queue depth.  At least, I don't think Linux has one -- do other
> kernels/stacks provide that?

I get the impression that eBDP is supposed to work relatively close to
the device driver, rather than in the core network stack.  As such it's
not a qdisc, but instead manages a parameter used by a well-behaved
device driver.
(The number of well-behaved device drivers appears to be small at
present.)

So there's a queue in the qdisc, and there's a queue in the hardware,
and eBDP tries to make the latter smaller when possible, allowing the
former (which is potentially much more intelligent) to do more work.

There is a tradeoff with wireless devices: if the buffer is bigger, more
packets can be aggregated into a single timeslot and a greater packet
loss rate can be hidden by local retransmission, but the latency gets
bigger.  So bigger buffers are required when the network is running
fast, and smaller buffers when it is running slow.  Packets which don't
fit in the hardware buffer go to the qdisc instead.

Meanwhile the qdisc can re-order packets (e.g. SFQ) so that one packet
from each of a number of different flows is presented to the device in
turn.  This tends to increase fairness and smoothness, and makes the
delay on interactive traffic much less dependent on the queue length
occupied by bulk flows.  It can also detect congestion (e.g. nRED, SFB)
and mark packets to cause TCPs to back off.  But the qdisc can only
operate effectively, for both of these tasks, if the hardware buffers
are as small as possible.

In short:

- Network-stack queues can be large as long as they are smart.

- Hardware buffers can be dumb but should be as small as possible.

Knowing the occupancy of the hardware buffer is useful if the size of
the buffer cannot be changed, because it is then possible to simply
decline to fill the buffer more than a certain amount.  If you can also
assume that packets are sent in order of submission, or by some other
easy rule, then you can also infer the time that the oldest packet has
spent there, and use it to tune the future occupancy limit even if you
can't cancel the old packet.
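
The occupancy-limit idea can be sketched in a few lines of Python.  This
is only an illustration under stated assumptions, not eBDP and not any
existing kernel API: the class HwQueueLimiter, its parameters, and the
"halve when too slow, add one when fast enough" rule are all hypothetical
choices made for the example.  It assumes the hardware sends packets
strictly in submission order and reports each completion.

```python
from collections import deque


class HwQueueLimiter:
    """Hypothetical host-side limiter for a fixed-size, in-order tx buffer.

    The hardware buffer size cannot be changed, so the host simply
    declines to fill it beyond a self-imposed limit, and tunes that
    limit from the time packets are observed to spend in the buffer.
    """

    def __init__(self, hw_capacity, target_delay, min_limit=1):
        self.hw_capacity = hw_capacity    # fixed size of the device buffer
        self.target_delay = target_delay  # acceptable time in buffer (seconds)
        self.min_limit = min_limit        # never starve the device entirely
        self.limit = hw_capacity          # current self-imposed occupancy limit
        self.in_flight = deque()          # submission timestamps, oldest first

    def can_submit(self):
        # Packets that do not fit under the limit stay in the qdisc.
        return len(self.in_flight) < self.limit

    def submit(self, now):
        if not self.can_submit():
            return False
        self.in_flight.append(now)
        return True

    def oldest_age(self, now):
        # In-order transmission means the head of the deque is the oldest packet.
        return now - self.in_flight[0] if self.in_flight else 0.0

    def on_tx_complete(self, now):
        # Assumed policy: halve the limit when a packet waited too long,
        # otherwise creep back up by one towards the hardware capacity.
        sojourn = now - self.in_flight.popleft()
        if sojourn > self.target_delay:
            self.limit = max(self.min_limit, self.limit // 2)
        elif self.limit < self.hw_capacity:
            self.limit += 1
        return sojourn
```

A driver-side caller would check can_submit() before handing a packet to
the device and call on_tx_complete() from its completion path.  The old
packet is never cancelled here; only the future occupancy limit changes.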

Cancelling old packets is potentially desirable because it allows TCPs
and applications to retransmit (which they will do anyway) without fear
of exacerbating a wireless congestion collapse.  I do appreciate that
not all hardware will support this, however, and it should be totally
unnecessary for wired links.

 - Jonathan