From: "George B."
To: Jonathan Morton
Cc: bloat@lists.bufferbloat.net
Date: Tue, 31 May 2011 10:20:54 -0700
Subject: Re: [Bloat] philosophical question

On Mon, May 30, 2011 at 8:57 AM, Jonathan Morton wrote:

> If most of your clients are mobile, you should use a tcp congestion control algorithm such as Westwood+ which is designed for the task. This is designed to distinguish between congestion and random packet losses. It is much less aggressive at filling buffers than the default CUBIC.

Not only are they mobile, their behavior might be considered like that of a "thin" client in the context of "tcp-thin":

http://www.mjmwired.net/kernel/Documentation/networking/tcp-thin.txt

So the servers have two different sorts of connections. There are thousands of long-lived connections with only an occasional packet going back and forth; those streams are on lossy mobile networks. Then there are several hundred very "fat" and fast connections moving a lot of data. Sometimes a client might change from a "thin" stream to a "thick" stream if it must collect a lot of content. So maybe Westwood+ along with the "tcp-thin" settings in 2.6.38 would be a good idea.
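If we go that route, my understanding is that it comes down to a few sysctls (assuming the Westwood+ module, tcp_westwood, is available on these kernels; the thin-stream knobs are the ones described in tcp-thin.txt above):

  # load Westwood+ if it is not built into the kernel
  modprobe tcp_westwood
  # use Westwood+ for new connections
  sysctl -w net.ipv4.tcp_congestion_control=westwood
  # linear rather than exponential RTO backoff for thin streams
  sysctl -w net.ipv4.tcp_thin_linear_timeouts=1
  # fast retransmit after a single dupACK for thin streams
  sysctl -w net.ipv4.tcp_thin_dupack=1

The thin-stream knobs are system-wide when set that way; tcp-thin.txt also describes per-socket options (TCP_THIN_LINEAR_TIMEOUTS and TCP_THIN_DUPACK) if we wanted this behavior only on the mobile-facing sockets.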
Looking at one server this morning, I have about 27,000 TCP connections in ESTABLISHED state. Many (most) of these are "thin" flows to devices that exchange a packet only occasionally. The server has 16 cores talking to 2 GigE NICs with 8 queues each. There is about 40 meg of traffic flowing into the server from the network and the outbound bandwidth is about 8 meg/sec.

Much of that 8 meg (about 2.5 meg) is logging traffic going to a local log host via IPv6 + jumbo frames. The way this is configured, there are two NICs, eth0 and eth1, and three vlans; let's call them vlan 2 (front-end traffic), vlan 3 (logging traffic), and vlan 4 (backend traffic). Each vlan is configured on both NICs (e.g. eth0.2 and eth1.2 for vlan 2) and the pair is then bonded with balance-xor using the layer2+3 xmit hash policy, so a given flow should always hash to a given vlan interface on a particular NIC (rough commands in the P.S. below). So I have three bond interfaces talking to a multiqueue-aware vlan driver. This allows one processor to handle log traffic while a different processor handles front-end traffic and yet another handles a backend transaction, all at the same time.

The higher inbound-to-outbound ratio is backwards from the traditional server profile, but that is because the traffic that comes in gets compressed before being sent back out, and it is mostly text, which compresses nicely.

/proc/sys/net/ipv4/tcp_ecn is currently set to "2", meaning use ECN if the other end requests it but don't initiate connections with ECN. As I never initiate connections to the clients, that really isn't an issue: the client always initiates the connection, so if the client asks for ECN, it will be used.

So the exercise I am going through is trying to determine the best qdisc and where to put it: on the bond interfaces, on the vlan interfaces, or on the NICs? Something simple like SFQ would probably work OK (see the P.P.S. for the sort of thing I mean). I just want to make sure a single packet to the client can get through without a lot of delay in the face of a "fat" stream going somewhere else.

Currently it is using the default mq qdisc:

root@foo:~> tc -s qdisc show
qdisc mq 0: dev eth0 root
 Sent 5038416030858 bytes 1039959904 pkt (dropped 0, overlimits 0 requeues 24686)
 rate 0bit 0pps backlog 0b 0p requeues 24686
qdisc mq 0: dev eth1 root
 Sent 1380477553077 bytes 2131951760 pkt (dropped 0, overlimits 0 requeues 2934)
 rate 0bit 0pps backlog 0b 0p requeues 2934

So there have been no packets dropped and there is no backlog, and the path is clean all the way to the Internet without any congestion in my network (path capacity is currently about 5 times the current bandwidth utilization, and it is 10GigE all the way from the switch the server is connected to out to the Internet). Any congestion would be somewhere upstream from me.

Suggestions?

George
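P.S. For anyone who wants the bonding layout in command form, it boils down to roughly this (a sketch from memory, using vlan 2 / bond0 as the example, not copied from the running config):

  # balance-xor with the layer2+3 hash; three bonds for three vlans
  modprobe bonding max_bonds=3 mode=balance-xor xmit_hash_policy=layer2+3
  # vlan 2 subinterface on each NIC (same pattern for vlans 3 and 4)
  vconfig add eth0 2
  vconfig add eth1 2
  # bring the bond up and enslave the pair
  ifconfig bond0 up
  ifenslave bond0 eth0.2 eth1.2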
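P.P.S. By "something simple like SFQ" I mean roughly the following (again a sketch, not tested on this box). On the NICs it would be one SFQ per hardware tx queue under an mq root that has an explicit handle:

  # re-add mq with a handle so child qdiscs can be attached
  tc qdisc replace dev eth0 root handle 1: mq
  # one sfq per tx queue (parents 1:1 through 1:8, then repeat for eth1)
  tc qdisc add dev eth0 parent 1:1 handle 11: sfq perturb 10
  tc qdisc add dev eth0 parent 1:2 handle 12: sfq perturb 10
  ...

On a bond or vlan interface it would just be:

  tc qdisc add dev bond0 root sfq perturb 10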