From: Dave Taht
To: Török Edwin
Cc: cerowrt-devel@lists.bufferbloat.net, bloat
Date: Sat, 18 Aug 2012 10:07:45 -0700
Subject: Re: [Bloat] [Cerowrt-devel] cerowrt 3.3.8-17: nice latency improvements, some issues with bind

Thx again for the benchmarks on your hardware! Can I get you to go
one more time to the well? There's a subtle point to be made here,
which basically involves the difference between testing under lab
conditions and in the real world.

On Sat, Aug 18, 2012 at 2:38 AM, Török Edwin wrote:
> Baseline (only ping, no other traffic):               0.806/ 1.323/  8.753/ 1.333 ms
> no fq_codel on laptop, cerowrt defaults,  nttcp -t: 1.192/16.605/107.351/25.265 ms;  94 Mbps
> no fq_codel on laptop, cerowrt qlen_*=4,  nttcp -t: 1.285/25.108/105.519/22.607 ms; 107 Mbps
> no fq_codel on laptop, cerowrt qlen_*=12, nttcp -t: 2.195/24.277/131.490/21.161 ms; 127 Mbps

Stripping out some of the incremental steps will save you time on
benchmarking, so let's go with 3, 4, 12, 35, 100.

Wireless data is incredibly noisy, and I usually end up coping with
that by using CDF plots like this old one

http://www.teklibre.com/~d/bloat/hoqvssfqred.ps

built from tons and tons of voip-like pings

http://www.teklibre.com/~d/bloat/ping_log.ps   (also old)

but moving forward, we can do some stuff with this, so see below..

(To explain the first plot: sfqred was the predecessor to fq_codel,
and that plot showed a distinct advantage to optimizing for new
streams, an idea which ended up (more elegantly) in fq_codel. The
second plot shows the effect of a small bandwidth change on latency
when the underlying buffering was large. Yes, I need to get around to
newer plots, but we still have some analysis and optimization to do
on the underlying codel algorithm.)
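If you want to collect the same kind of data those plots came from,
a rough sketch of the measurement loop is below; the 172.30.42.1
address and the two-minute window are placeholders for whatever host
and duration you're actually testing against, not something from
your setup:

  # log voip-like pings for the whole run, in the background
  ping -i 0.2 -w 120 172.30.42.1 | tee ping_log.txt &

  # saturate the link while the pings run (repeat or lengthen as needed)
  nttcp -t 172.30.42.1

  # strip out just the RTTs afterwards; a sorted column is all a CDF plot needs
  grep -o 'time=[0-9.]*' ping_log.txt | cut -d= -f2 | sort -n > rtt_sorted.txt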
> no fq_codel on laptop, cerowrt defaults,   nttcp -r: 1.332/ 3.129/ 41.900/ 5.221 ms;  39 Mbps
> no fq_codel on laptop, cerowrt qlen_*=4,   nttcp -r: 1.514/ 3.205/  8.595/ 1.817 ms;  46 Mbps
> no fq_codel on laptop, cerowrt qlen_*=12,  nttcp -r: 2.025/ 5.173/ 16.890/ 3.763 ms;  81 Mbps
> no fq_codel on laptop, cerowrt qlen_*=35,  nttcp -r: 2.893/ 7.895/130.859/17.621 ms; 119 Mbps
> no fq_codel on laptop, cerowrt qlen_*=50,  nttcp -r: 0.951/ 7.810/ 47.646/ 6.428 ms; 131 Mbps
> no fq_codel on laptop, cerowrt qlen_*=100, nttcp -r: 5.149/ 8.766/ 14.371/ 2.191 ms; 128 Mbps
>
> To get twice the speed a qlen=11 is enough already, and to get all the speed back a qlen=35 is needed.

This is an incomplete conclusion. It is incomplete in that A) these
tests were done under laboratory conditions at the highest data rate
(MCS15); B) it was a single point-to-point link to an AP, which would
normally be handling more than one client; and C) it tests a single
full-throttle TCP stream, when typical websites and usage involve 70+
DNS lookups and 70 separate short streams.

I can live with B) and C) for now, although I'll note that running
the chrome benchmark while a full-blown stream test like yours runs
in the background, plus ping, is quite useful for looking at C).
Let's tackle A)...

> And here are the results with fq_codel on the laptop too (just nttcp -t as that's the one affected):
>
> fq_codel on laptop, cerowrt defaults,  nttcp -t:  1.248/12.960/108.490/16.733 ms;  90 Mbps
> fq_codel on laptop, cerowrt qlen_*=4,  nttcp -t:  1.205/10.843/ 76.983/12.460 ms; 105 Mbps
> fq_codel on laptop, cerowrt qlen_*=8,  nttcp -t:  4.034/16.088/ 98.611/17.050 ms; 120 Mbps
> fq_codel on laptop, cerowrt qlen_*=11, nttcp -t:  3.766/15.687/ 56.684/11.135 ms; 114 Mbps
> fq_codel on laptop, cerowrt qlen_*=35, nttcp -t: 11.360/26.742/ 48.051/ 7.489 ms; 113 Mbps

So, could you move your laptop to where it gets MCS4 on a fairly
reliable basis and repeat the tests? A wall or three will do it.

I will predict several things:

1) The bulk of the buffering problem is going to move to your laptop,
as it has weaker antennas than the wndrs. Most likely you will end up
with tx on the one side higher than rx on the other.

2) You will see much higher jitter and latency and much lower
throughput, and your results will get wildly more variable run to
run. (I tend to run tests for 2 minutes or longer and toss out the
first few seconds.)

3) The lower fixed buffering sizes in cero's qlens will start making
a lot more sense, but that may be hard to see because of 1) and 2).

The thing I honestly don't know is how well fq_codel reacts to sudden
bandwidth changes when the underlying device driver (the iwl in this
case) is overbuffered, or how well codel's "target" idea really works
in the wifi case in general. It would be nice to have some data on
it. (hint, hint) Some work was done on debloating the iwl last year;
I don't know if any of it made it into mainline.

Lastly, I put a version of Linux 3.6-rc2 up here:

http://snapon.lab.bufferbloat.net/~cero1/deb/

It has a fix to codel that was needed (I think, but I have not
checked whether it's already in 3.5.1), and it also incorporates
"TCP small queues", which reduces tcp-related buffering in pfifo_fast
enormously and helps on other qdiscs as well. Switching to it will
invalidate the testing you've done so far... (Another reason why I'm
reluctant to post graphs on codel/fq_codel right now is that good
stuff keeps happening above and below it in Linux.) So please don't
change your kernel out before trying that MCS4 test... and I make no
warranties about the reliability/usefulness of an rc2!
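For whoever repeats these runs: the laptop side of the experiment is
only a couple of commands. A rough sketch follows; wlan0 and the
queue length are placeholders, and treating cero's qlen_* knobs as
roughly equivalent to a plain interface txqueuelen is my assumption,
not something established in this thread:

  # put fq_codel on the laptop's wireless interface in place of pfifo_fast
  tc qdisc replace dev wlan0 root fq_codel

  # shrink the interface tx queue, analogous to the qlen_* settings on the AP
  ip link set dev wlan0 txqueuelen 12

  # confirm what's actually in effect, and watch drop/mark counts during a run
  tc -s qdisc show dev wlan0
  ip link show dev wlan0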
> Shouldn't wireless N be able to do 200 - 300 Mbps though? If I enable debugging in iwl4965 I see that it
> starts TX aggregation, so I'm not sure what's wrong (router or laptop?). With encryption off I can get at most 160 Mbps.

A UDP test will usually get you into the 270 Mbit range.

> iw dev sw10 station dump shows:
> ...
>         signal:         -56 [-60, -59] dBm
>         signal avg:     -125 [-65, -58] dBm
>         tx bitrate:     300.0 MBit/s MCS 15 40MHz short GI
>         rx bitrate:     300.0 MBit/s MCS 15 40MHz short GI
>
> On laptop:
>         tx bitrate:     300.0 Mbit/s MCS 15 40MHz short GI

In non-lab conditions you generally don't lock into a single rate.
The minstrel algorithm tries various strategies to get the packets
through, so you can get a grip on what's really happening by looking
at the rc_stats file for your particular device. Example here:

http://www.bufferbloat.net/projects/cerowrt/wiki/Minstrel_Wireless_Rate_Selection
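For anyone who hasn't dug those out before: on a mac80211 box the
per-rate minstrel statistics live in debugfs. A rough sketch (phy0
and wlan0 are placeholders, the exact path varies by kernel and
driver, and the iwl4965 may keep its own rate-scaling stats
elsewhere):

  # debugfs has to be mounted
  mount -t debugfs none /sys/kernel/debug 2>/dev/null

  # find the associated station(s)
  iw dev wlan0 station dump | grep Station

  # per-rate success/throughput table kept by minstrel / minstrel_ht
  cat /sys/kernel/debug/ieee80211/phy0/netdev:wlan0/stations/*/rc_stats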