Message-ID: <5038D974.1000901@etorok.net>
Date: Sat, 25 Aug 2012 16:56:04 +0300
From: Török Edwin
To: Dave Taht
Cc: cerowrt-devel@lists.bufferbloat.net, bloat
References: <502E064C.50305@etorok.net> <502F6279.1090708@etorok.net>
Subject: Re: [Bloat] [Cerowrt-devel] cerowrt 3.3.8-17: nice latency improvements, some issues with bind

On 08/18/2012 08:07 PM, Dave Taht wrote:
> Thx again for the benchmarks on your hardware! Can I get you to go one
> more time to the well?

Yes, but you have to wait until I have some time to do it.

> Stripping out the incremental steps some will save you some time
> on benchmarking, so lets go with 3,4,12,35,100. Wireless data is
> incredibly noisy and I usually end up going with cdf plots like this
> old one
>> To get twice the speed a qlen=11 is enough already, and to get all the speed back a qlen=35 is needed.
>
> This is an incomplete conclusion. It is incomplete in that A) these
> tests were done under laboratory conditions at the highest data rate
> (MCS15), and B), it was with a single point to point link to an AP
> which normally would be handling more than one client. C) it tests a
> single full throttle TCP stream when typical websites and usage
> involve 70+ dns lookups and 70 separate short streams.
>
> I can live with B and C) for now, although I note that the chrome
> benchmark while doing a full blown stream test as you are doing now in
> the background and ping is quite useful for looking at C. Let's tackle
> A...
>
>>
>> And here are the results with fq_codel on the laptop too (just nttcp -t as thats the one affected):
>>
>> fq_codel on laptop, cerowrt defaults, nttcp -t:   1.248/12.960/108.490/16.733 ms; 90 Mbps
>> fq_codel on laptop, cerowrt qlen_*=4, nttcp -t:   1.205/10.843/ 76.983/12.460 ms; 105 Mbps
>> fq_codel on laptop, cerowrt qlen_*=8, nttcp -t:   4.034/16.088/ 98.611/17.050 ms; 120 Mbps
>> fq_codel on laptop, cerowrt qlen_*=11, nttcp -t:  3.766/15.687/ 56.684/11.135 ms; 114 Mbps
>> fq_codel on laptop, cerowrt qlen_*=35, nttcp -t: 11.360/26.742/ 48.051/ 7.489 ms; 113 Mbps
>
> So, if you could move your laptop to where it gets MCS4 on a fairly
> reliable basis, and repeat the tests? a wall or three will do it.

I've put my laptop in a place where I got MCS4 on TX most of the time.
RX is MCS4 most of the time too, but it is switching to MCS5, 7, 11, 12 and back to MCS4 quite a lot.

> please don't change your kernel out before trying that test... (and I
> make no warranties about the reliability/usefulness of a rc2!)

Here are the results with fq_codel on the laptop, and the same 3.5.0 kernel:

qlen 100, nttcp -t:  5.966/57.104/192.017/26.674 ms; 52.2376 Mbps
qlen 35,  nttcp -t: 15.636/54.823/108.921/19.762 ms; 52.4675 Mbps
qlen 12,  nttcp -t:  4.768/29.439/132.924/27.159 ms; 51.2619 Mbps
qlen 4,   nttcp -t:  2.631/20.500/152.741/31.549 ms; 40.3949 Mbps
qlen def, nttcp -t:  2.010/21.851/317.085/49.323 ms; 35.8268 Mbps

qlen 100, nttcp -r: 23.225/44.101/142.835/21.181 ms; 36.6789 Mbps
qlen 35,  nttcp -r:  3.755/23.413/ 83.530/15.329 ms; 35.4602 Mbps
qlen 12,  nttcp -r:  4.318/10.251/ 96.773/12.008 ms; 31.1557 Mbps
qlen 4,   nttcp -r:  2.733/ 4.507/ 16.353/ 1.917 ms; 24.6688 Mbps
qlen def, nttcp -r:  2.119/ 4.999/ 64.968/ 7.275 ms; 27.3645 Mbps

Note that the laptop was on battery this time, so that may add some jitter (CPU freq switching, wifi power saving?),
but shouldn't matter for >10ms quantities.

Looks like the iwl4965 is somewhat bloated, with those 100ms+ latencies.

I don't know what happened there, but with the default qlen (2,3,3,3) I get the 317 ms max latency on TX,
whereas with qlen 4 I get 152 ms max latency. The average is also better with qlen 4.
The same observation goes for the RX side.

>
> I will predict several things:
>
> 1) the bulk of the buffering problem is going to move to your laptop,
> as it has weaker antennas than the wndrs. Most likely you will end up
> with tx on the one side higher than rx on the other.

Yes, the laptop TX latencies are worse.

>
> 2) you will see much higher jitter and latency and much lower
> throughput. Your results will also get wildly more variable run to
> run. (I tend to run tests for 2 minutes or longer and toss out the
> first few seconds)

On TX it is quite consistently in MCS4 (according to watch iw wlan0 station dump), but on RX it's jumping quite a lot.

>
> 3) The lower fixed buffering sizes on cero's qlens will start making a
> lot more sense, but it may be hard to see due to 1 and 2.

qlen 12 and 4 look good. The default looks worse though.

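For anyone who wants to compare these runs without eyeballing the columns, here is a minimal Python sketch that parses the MCS4 result lines above (ping min/avg/max/mdev plus nttcp throughput) and expresses each run relative to the default-qlen run. The function names and the ratio layout are my own choices for illustration, not part of nttcp or cerowrt, and each row is a single run, so the ratios are only indicative.

```python
import re

# Result lines copied from the MCS4 runs above (3.5.0 kernel, fq_codel on the laptop).
RESULTS = """\
qlen 100, nttcp -t: 5.966/57.104/192.017/26.674 ms; 52.2376 Mbps
qlen 35, nttcp -t: 15.636/54.823/108.921/19.762 ms; 52.4675 Mbps
qlen 12, nttcp -t: 4.768/29.439/132.924/27.159 ms; 51.2619 Mbps
qlen 4, nttcp -t: 2.631/20.500/152.741/31.549 ms; 40.3949 Mbps
qlen def, nttcp -t: 2.010/21.851/317.085/49.323 ms; 35.8268 Mbps
qlen 100, nttcp -r: 23.225/44.101/142.835/21.181 ms; 36.6789 Mbps
qlen 35, nttcp -r: 3.755/23.413/ 83.530/15.329 ms; 35.4602 Mbps
qlen 12, nttcp -r: 4.318/10.251/ 96.773/12.008 ms; 31.1557 Mbps
qlen 4, nttcp -r: 2.733/ 4.507/ 16.353/ 1.917 ms; 24.6688 Mbps
qlen def, nttcp -r: 2.119/ 4.999/ 64.968/ 7.275 ms; 27.3645 Mbps
"""

# One result line: qlen label, direction flag, ping min/avg/max/mdev, throughput.
LINE_RE = re.compile(
    r"qlen\s+(?P<qlen>\w+),\s+nttcp\s+(?P<mode>-[tr]):\s*"
    r"(?P<min>[\d.]+)/\s*(?P<avg>[\d.]+)/\s*(?P<max>[\d.]+)/\s*(?P<mdev>[\d.]+)\s*ms;\s*"
    r"(?P<mbps>[\d.]+)\s*Mbps"
)


def parse_results(text):
    """Return one dict per result line; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        row = {"qlen": m["qlen"], "mode": m["mode"]}
        for key in ("min", "avg", "max", "mdev", "mbps"):
            row[key] = float(m[key])
        rows.append(row)
    return rows


def relative_to_default(rows, mode):
    """Express each run of one direction as a ratio of the default-qlen run."""
    runs = [r for r in rows if r["mode"] == mode]
    base = next(r for r in runs if r["qlen"] == "def")
    return {
        r["qlen"]: {
            "avg_ratio": r["avg"] / base["avg"],
            "max_ratio": r["max"] / base["max"],
            "mbps_ratio": r["mbps"] / base["mbps"],
        }
        for r in runs
    }


if __name__ == "__main__":
    rows = parse_results(RESULTS)
    for mode in ("-t", "-r"):
        print(f"nttcp {mode} (ratios vs. default qlen)")
        for qlen, ratios in relative_to_default(rows, mode).items():
            print(
                f"  qlen {qlen:>3}: avg x{ratios['avg_ratio']:.2f}, "
                f"max x{ratios['max_ratio']:.2f}, "
                f"throughput x{ratios['mbps_ratio']:.2f}"
            )
```

A ratio below 1 for avg or max means lower latency than the default qlen; a ratio above 1 for throughput means more Mbps than the default.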
>
> The thing I don't honestly know is how well fq_codel reacts to sudden
> bandwidth changes when the underlying device driver (the iwl in this
> case) is overbuffered or how well codel's target idea really works in
> the wifi case in general. It would be nice to have some data on it.
> (hint, hint)

The bandwidth varies quite a lot on RX even if both the laptop and router are perfectly still.
So the -r numbers above should be what you are looking for.
If you want some other data let me know.

>
> Some work was done on debloating the iwl last year, I don't know if
> any of the work made it into mainline.
>
> Lastly, I put a version of Linux 3.6-rc2 up here.
>
> http://snapon.lab.bufferbloat.net/~cero1/deb/
>
> It has a fix to codel in it that was needed (I think but have not
> checked to see if it's in 3.5.1), and it also incorporates "TCP small
> queues", which reduces tcp-related buffering in pfifo_fast enormously,
> and helps on other qdiscs as well. Switching to it will invalidate the
> testing you've done so far...

I assume these are in the upstream 3.6-rc3 too, right?
Here is just one measurement done with 3.6-rc3 on the laptop and fq_codel (same location as above tests, approx MCS4):

qlen def, nttcp -t, 2.871/15.655/375.777/44.212 ms; 35.2776 Mbps
qlen def, nttcp -r, 1.406/ 3.434/ 12.763/ 1.649 ms; 24.3334 Mbps

It looks somewhat better.

>
> (another reason why I'm reluctant to post graphs on codel/fq_codel
> right now is that good stuff keeps happening above/below it in Linux),
>
>
>
>> Shouldn't wireless N be able to do 200 - 300 Mbps though? If I enable debugging in iwl4965 I see that it
>> starts TX aggregation, so not sure whats wrong (router or laptop?). With encryption off I can get at most 160 Mbps.
>
> A UDP test will get you in the 270Mbit range usually.

nttcp -T -u -D -n2000 gives ~180 Mbps at most, and with -r I can't make sense of it (looks like most gets dropped):

     Bytes  Real s   CPU s Real-MBit/s   CPU-MBit/s   Calls  Real-C/s   CPU-C/s
l    16384    0.08    0.00      1.6090   13107.2000       5     61.38  500000.0
1  8192000    0.08    0.04    845.8113    1820.6973    2003  25850.83   55646.6

>
>>
>> iw dev sw10 station dump shows:
>> ...
>> signal: -56 [-60, -59] dBm
>> signal avg: -125 [-65, -58] dBm
>> tx bitrate: 300.0 MBit/s MCS 15 40Mhz short GI
>> rx bitrate: 300.0 MBit/s MCS 15 40Mhz short GI
>>
>> On laptop:
>> tx bitrate: 300.0 Mbit/s MCS 15 40Mhz short GI
>
> In non-lab conditions you generally don't lock into a rate. The
> minstrel algorithm tries various strategies to get the packets
> through, so you can
> get a grip on what's really happening by looking at the rc_stats file
> for your particular device.
>
> example here:
>
>
> http://www.bufferbloat.net/projects/cerowrt/wiki/Minstrel_Wireless_Rate_Selection
>

I looked at the rc_stats file by cd-ing into the stations dir on the router.
After disabling/enabling the radio the stations subdir was gone though:

root@OpenWrt:~# ls /sys/kernel/debug/ieee80211/phy1/netdev\:sw10/stations/ -al
drwxr-xr-x 2 root root 0 Aug 25 10:28 .
drwxr-xr-x 3 root root 0 Aug 25 10:28 ..

So unfortunately I'm without an rc_stats now (until I reboot the router probably?).

Best regards,
--Edwin