Date: Thu, 12 Jul 2012 07:55:33 -0700
From: Tom Herbert
To: Eric Dumazet
Cc: nanditad@google.com, netdev@vger.kernel.org, codel@lists.bufferbloat.net, ncardwell@google.com, David Miller, mattmathis@google.com
Subject: Re: [Codel] [RFC PATCH v2] tcp: TCP Small Queues
In-Reply-To: <1342079487.3265.8245.camel@edumazet-glaptop>
References: <4FFDC985.6050805@hp.com> <1342050592.3265.8195.camel@edumazet-glaptop> <1342078459.3265.8244.camel@edumazet-glaptop> <20120712.003700.49235222504944712.davem@davemloft.net> <1342079487.3265.8245.camel@edumazet-glaptop>

On Thu, Jul 12, 2012 at 12:51 AM, Eric Dumazet wrote:
> On Thu, 2012-07-12 at 00:37 -0700, David Miller wrote:
>> From: Eric Dumazet
>> Date: Thu, 12 Jul 2012 09:34:19 +0200
>>
>> > On Thu, 2012-07-12 at 01:49 +0200, Eric Dumazet wrote:
>> >
>> >> The 10Gb receiver is a net-next kernel, but the 1Gb receiver is a 2.6.38
>> >> ubuntu kernel. They probably have very different TCP behavior.
>> >
>> >
>> > I tested TSQ on bnx2x and 10Gb links.
>> >
>> > I get full rate even using 65536 bytes for
>> > the /proc/sys/net/ipv4/tcp_limit_output_bytes tunable
>>
>> Great work Eric.
>
> Thanks !
>

This is indeed great work! A couple of comments...

Do you know if there are any qdiscs that function less efficiently when we are restricting the number of packets? For instance, will HTB work as expected in various configurations?

One extension to this work would be to make the limit dynamic and mostly eliminate the tunable. I'm thinking we might be able to correlate the limit to the BQL limit of the egress queue for the flow, if there is one.

Assuming all work-conserving qdiscs, the minimal amount of outstanding host data for a queue could be associated with the BQL limit of the egress NIC queue. We want to minimize the outstanding data so that:

    sum(data_of_tcp_flows_share_same_queue) > bql_limit_for_queue

So this could imply a per-flow limit of:

    tcp_limit = max(bql_limit - bql_inflight, one_packet)

For a single active connection on a queue, the tcp_limit is equal to the BQL limit. Once the BQL limit is hit in the NIC, we only need one packet outstanding per flow to maintain flow control. For fairness, we might need "one_packet" to actually be max GSO data.

Also, this disregards any latency of scheduling and running the tasklet; that might need to be taken into account as well.

Tom
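
The per-flow limit proposed above can be sketched in C roughly as follows. This is only an illustration of the arithmetic, not kernel code: the function name tsq_flow_limit() and its parameters are hypothetical, all values are assumed to be byte counts, and the clamp at zero is an added assumption to cover the case where bql_inflight momentarily exceeds bql_limit.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not an existing kernel function).
 * Computes tcp_limit = max(bql_limit - bql_inflight, one_packet),
 * with all quantities in bytes.
 */
static unsigned int tsq_flow_limit(unsigned int bql_limit,
                                   unsigned int bql_inflight,
                                   unsigned int one_packet)
{
    /* Room left under the BQL limit; clamp at zero so the
     * unsigned subtraction cannot wrap around. */
    unsigned int room = bql_inflight < bql_limit
                        ? bql_limit - bql_inflight
                        : 0;

    /* Never drop below one packet, so the flow always has
     * something outstanding to drive its flow control. */
    return room > one_packet ? room : one_packet;
}
```

For example, with a BQL limit of 100000 bytes and nothing in flight, a single flow would get the full 100000 bytes; with 99000 bytes already in flight and one_packet set to 1514, the result falls back to 1514. Passing the max GSO size (for instance 65536) as one_packet corresponds to the fairness variant mentioned above.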