[Bloat] [Cerowrt-devel] cerowrt 3.3.8-17: nice latency improvements, some issues with bind
Török Edwin
edwin+ml-cerowrt at etorok.net
Sat Aug 25 09:56:04 EDT 2012
On 08/18/2012 08:07 PM, Dave Taht wrote:
> Thx again for the benchmarks on your hardware! Can I get you to go one
> more time to the well?
Yes, but you have to wait until I have some time to do it.
> Stripping out some of the incremental steps will save you some time
> on benchmarking, so let's go with 3,4,12,35,100. Wireless data is
> incredibly noisy and I usually end up going with cdf plots like this
> old one
>> To get twice the speed a qlen=11 is enough already, and to get all the speed back a qlen=35 is needed.
>
> This is an incomplete conclusion. It is incomplete in that A) these
> tests were done under laboratory conditions at the highest data rate
> (MCS15), and B), it was with a single point to point link to an AP
> which normally would be handling more than one client. C) it tests a
> single full throttle TCP stream when typical websites and usage
> involve 70+ dns lookups and 70 separate short streams.
>
> I can live with B and C) for now, although I note that the chrome
> benchmark while doing a full blown stream test as you are doing now in
> the background and ping is quite useful for looking at C. Let's tackle
> A...
>
>>
>> And here are the results with fq_codel on the laptop too (just nttcp -t as that's the one affected):
>>
>> fq_codel on laptop, cerowrt defaults, nttcp -t: 1.248/12.960/108.490/16.733 ms; 90 Mbps
>> fq_codel on laptop, cerowrt qlen_*=4, nttcp -t: 1.205/10.843/ 76.983/12.460 ms; 105 Mbps
>> fq_codel on laptop, cerowrt qlen_*=8, nttcp -t: 4.034/16.088/ 98.611/17.050 ms; 120 Mbps
>> fq_codel on laptop, cerowrt qlen_*=11, nttcp -t: 3.766/15.687/ 56.684/11.135 ms; 114 Mbps
>> fq_codel on laptop, cerowrt qlen_*=35, nttcp -t: 11.360/26.742/ 48.051/ 7.489 ms; 113 Mbps
>
> So, if you could move your laptop to where it gets MCS4 on a fairly
> reliable basis, and repeat the tests? a wall or three will do it.
I've put my laptop in a place where I got MCS4 on TX most of the time.
RX is MCS4 most of the time too, but it is switching to MCS5, 7, 11, 12 and back to MCS4
quite a lot.
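
To put a number on "most of the time", one option is to sample the station dump repeatedly and tally the MCS index per direction. A minimal sketch in Python, assuming the bitrate lines look like the ones quoted further down in this mail ("tx bitrate: ... MCS 15 ..."); the function names and the MCS 4 sample line are made up for illustration:

```python
import re
from collections import Counter

# Matches e.g. "tx bitrate: 300.0 MBit/s MCS 15 40Mhz short GI"
MCS_RE = re.compile(r"^[ \t]*(tx|rx) bitrate:.*?\bMCS (\d+)", re.MULTILINE)

def mcs_in_dump(dump_text):
    """Return [(direction, mcs_index), ...] found in one station dump."""
    return [(d, int(i)) for d, i in MCS_RE.findall(dump_text)]

def tally_mcs(dumps):
    """Count how often each MCS index shows up per direction."""
    counts = {"tx": Counter(), "rx": Counter()}
    for dump in dumps:
        for direction, index in mcs_in_dump(dump):
            counts[direction][index] += 1
    return counts

# Illustrative samples only: the first has the same shape as the dump
# quoted in this thread, the second is a hypothetical MCS 4 reading.
samples = [
    "\ttx bitrate: 300.0 MBit/s MCS 15 40Mhz short GI\n"
    "\trx bitrate: 300.0 MBit/s MCS 15 40Mhz short GI\n",
    "\ttx bitrate: 81.0 MBit/s MCS 4 40Mhz\n"
    "\trx bitrate: 81.0 MBit/s MCS 4 40Mhz\n",
]
counts = tally_mcs(samples)
print(dict(counts["tx"]), dict(counts["rx"]))
```

In practice the dumps would come from running "iw dev wlan0 station dump" once a second or so during the test and collecting the output.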
> please don't change your kernel out before trying that test... (and I
> make no warranties about the reliability/usefulness of a rc2!)
Here are the results with fq_codel on the laptop, and same 3.5.0 kernel:
qlen 100, nttcp -t: 5.966/57.104/192.017/26.674 ms; 52.2376 Mbps
qlen 35, nttcp -t: 15.636/54.823/108.921/19.762 ms; 52.4675 Mbps
qlen 12, nttcp -t: 4.768/29.439/132.924/27.159 ms; 51.2619 Mbps
qlen 4, nttcp -t: 2.631/20.500/152.741/31.549 ms; 40.3949 Mbps
qlen def, nttcp -t: 2.010/21.851/317.085/49.323 ms; 35.8268 Mbps
qlen 100, nttcp -r: 23.225/44.101/142.835/21.181 ms; 36.6789 Mbps
qlen 35, nttcp -r: 3.755/23.413/ 83.530/15.329 ms; 35.4602 Mbps
qlen 12, nttcp -r: 4.318/10.251/ 96.773/12.008 ms; 31.1557 Mbps
qlen 4, nttcp -r: 2.733/ 4.507/ 16.353/ 1.917 ms; 24.6688 Mbps
qlen def, nttcp -r: 2.119/ 4.999/ 64.968/ 7.275 ms; 27.3645 Mbps
Note that the laptop was on battery this time, so that may add some jitter
(CPU freq switching, wifi power saving?), but shouldn't matter for >10ms quantities.
Looks like the iwl4965 is somewhat bloated, with those 100ms+ latencies.
I don't know what happened there, but with the default qlen (2,3,3,3) I get the 317 ms max latency,
whereas with qlen 4 I get 152 ms max latency on TX. The average is also better with qlen 4.
Same observation goes for the RX side.
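
For anyone who wants to compare the runs side by side, here is a minimal sketch in Python that parses result lines of the form used above. It assumes the four slash-separated numbers are ping's min/avg/max/mdev, which matches how the max and average are discussed here; the Run type and parse_runs name are made up for illustration, and only a few of the lines are repeated:

```python
import re
from typing import NamedTuple

class Run(NamedTuple):
    qlen: str        # "100", "35", ... or "def"
    direction: str   # "-t" (transmit) or "-r" (receive)
    rtt_min: float   # ms
    rtt_avg: float   # ms
    rtt_max: float   # ms
    rtt_mdev: float  # ms
    mbps: float

LINE_RE = re.compile(
    r"qlen\s+(\S+),\s*nttcp\s+(-[tr])[:,]\s*"
    r"([\d.]+)/\s*([\d.]+)/\s*([\d.]+)/\s*([\d.]+)\s*ms;\s*([\d.]+)\s*Mbps"
)

def parse_runs(text):
    """Turn result lines into Run records, skipping lines that don't match."""
    runs = []
    for m in LINE_RE.finditer(text):
        qlen, direction, *nums = m.groups()
        runs.append(Run(qlen, direction, *map(float, nums)))
    return runs

RESULTS = """\
qlen 100, nttcp -t: 5.966/57.104/192.017/26.674 ms; 52.2376 Mbps
qlen 4, nttcp -t: 2.631/20.500/152.741/31.549 ms; 40.3949 Mbps
qlen def, nttcp -t: 2.010/21.851/317.085/49.323 ms; 35.8268 Mbps
qlen 4, nttcp -r: 2.733/ 4.507/ 16.353/ 1.917 ms; 24.6688 Mbps
"""

runs = parse_runs(RESULTS)
for r in runs:
    print(f"{r.qlen:>4} {r.direction}  avg {r.rtt_avg:7.3f} ms  "
          f"max {r.rtt_max:7.3f} ms  {r.mbps:7.2f} Mbps")
```

Sorting the records by rtt_avg or mbps then makes the latency/throughput trade-off between qlen settings easier to read off.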
>
> I will predict several things:
>
> 1) the bulk of the buffering problem is going to move to your laptop,
> as it has weaker antennas than the wndrs. Most likely you will end up
> with tx on the one side higher than rx on the other.
Yes the laptop TX latencies are worse.
>
> 2) you will see much higher jitter and latency and much lower
> throughput. Your results will also get wildly more variable run to
> run. (I tend to run tests for 2 minutes or longer and toss out the
> first few seconds)
On TX it is quite consistently in MCS4 (according to watch iw wlan0 station dump),
but on RX it's jumping quite a lot.
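
The suggestion quoted above (run for a couple of minutes, toss out the first few seconds) and the earlier remark about cdf plots can be sketched roughly as follows in Python. This is only an illustration: the function names are made up, the sample values are placeholders rather than measurements, and the percentile uses the simple nearest-rank definition:

```python
import math

def trim_warmup(samples, warmup_s):
    """samples: [(t_seconds, rtt_ms), ...] in time order.
    Drop everything within the first warmup_s seconds of the run."""
    if not samples:
        return []
    t0 = samples[0][0]
    return [rtt for t, rtt in samples if t - t0 >= warmup_s]

def empirical_cdf(values):
    """Return [(x, fraction of samples <= x), ...] sorted by x."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

def percentile(values, p):
    """Nearest-rank percentile, with p in (0, 100]."""
    xs = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(xs)))
    return xs[rank - 1]

# Placeholder samples, NOT measured data: (seconds since start, RTT in ms)
samples = [(0.0, 80.0), (1.0, 60.0), (5.0, 20.0),
           (6.0, 30.0), (7.0, 10.0), (8.0, 40.0)]

steady = trim_warmup(samples, warmup_s=5)
print(empirical_cdf(steady))
print("median:", percentile(steady, 50), "p95:", percentile(steady, 95))
```

Plotting the (x, fraction) pairs for each qlen on one graph gives the kind of cdf comparison that is less sensitive to a single outlier than min/avg/max.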
>
> 3) The lower fixed buffering sizes on cero's qlens will start making a
> lot more sense, but it may be hard to see due to 1 and 2.
qlen 12 and 4 look good. The default looks worse though.
>
> The thing I don't honestly know is how well fq_codel reacts to sudden
> bandwidth changes when the underlying device driver (the iwl in this
> case) is overbuffered or how well codel's target idea really works in
> the wifi case in general. It would be nice to have some data on it.
> (hint, hint)
The bandwidth varies quite a lot on RX even if both the laptop and router
are perfectly still. So the -r numbers above should be what you are looking for.
If you want some other data let me know.
>
> Some work was done on debloating the iwl last year, I don't know if
> any of the work made it into mainline.
>
> Lastly, I put a version of Linux 3.6-rc2 up here.
>
> http://snapon.lab.bufferbloat.net/~cero1/deb/
>
> It has a fix to codel in it that was needed (I think but have not
> checked to see if it's in 3.5.1), and it also incorporates "TCP small
> queues", which reduces tcp-related buffering in pfifo_fast enormously,
> and helps on other qdiscs as well. Switching to it will invalidate the
> testing you've done so far...
I assume these are in the upstream 3.6-rc3 too, right?
Here is just one measurement done with 3.6-rc3 on the laptop and fq_codel
(same location as above tests, approx MCS4):
qlen def, nttcp -t, 2.871/15.655/375.777/44.212 ms; 35.2776 Mbps
qlen def, nttcp -r, 1.406/ 3.434/ 12.763/ 1.649 ms; 24.3334 Mbps
It looks somewhat better.
>
> (another reason why I'm reluctant to post graphs on codel/fq_codel
> right now is that good stuff keeps happening above/below it in Linux),
>
>
>
>> Shouldn't wireless N be able to do 200 - 300 Mbps though? If I enable debugging in iwl4965 I see that it
>> starts TX aggregation, so not sure what's wrong (router or laptop?). With encryption off I can get at most 160 Mbps.
>
> A UDP test will get you in the 270Mbit range usually.
nttcp -T -u -D -n2000 gives ~180 Mbps at most, and with -r I can't make sense of it (looks like most gets dropped):
     Bytes  Real s   CPU s  Real-MBit/s   CPU-MBit/s  Calls  Real-C/s   CPU-C/s
l    16384    0.08    0.00       1.6090   13107.2000      5     61.38  500000.0
1  8192000    0.08    0.04     845.8113    1820.6973   2003  25850.83   55646.6
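
A rough check of the "most gets dropped" reading, as a small Python sketch. It assumes the "l" row is the local (receiving) side and the "1" row is the remote (sending) side when -r is used, and takes the byte counts straight from the two rows above; the function name is made up:

```python
def loss_fraction(bytes_sent, bytes_received):
    """Fraction of the sent bytes that never reached the receiver."""
    return 1.0 - bytes_received / bytes_sent

# Assumption: "1" row = remote sender, "l" row = local receiver.
sent = 8192000      # Bytes column, "1" row
received = 16384    # Bytes column, "l" row

loss = loss_fraction(sent, received)
print(f"{loss:.1%} of the bytes did not arrive")
```

Under that assumption only about 0.2% of the UDP payload made it through, so the 845 Mbit/s figure would reflect how fast the sender handed data to its stack, not what crossed the air.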
>
>>
>> iw dev sw10 station dump shows:
>> ...
>> signal: -56 [-60, -59] dBm
>> signal avg: -125 [-65, -58] dBm
>> tx bitrate: 300.0 MBit/s MCS 15 40Mhz short GI
>> rx bitrate: 300.0 MBit/s MCS 15 40Mhz short GI
>>
>> On laptop:
>> tx bitrate: 300.0 Mbit/s MCS 15 40Mhz short GI
>
> In non-lab conditions you generally don't lock into a rate. The
> minstrel algorithm tries various strategies to get the packets
> through, so you can
> get a grip on what's really happening by looking at the rc_stats file
> for your particular device.
>
> example here:
>
>
> http://www.bufferbloat.net/projects/cerowrt/wiki/Minstrel_Wireless_Rate_Selection
>
I looked at the rc_stats file by cd-ing into the stations dir on the router. After disabling/enabling the radio
the stations subdir was gone though:
root@OpenWrt:~# ls /sys/kernel/debug/ieee80211/phy1/netdev\:sw10/stations/ -al
drwxr-xr-x 2 root root 0 Aug 25 10:28 .
drwxr-xr-x 3 root root 0 Aug 25 10:28 ..
So unfortunately I'm without an rc_stats now (until I reboot the router probably?).
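
To check whether any rc_stats files have reappeared without cd-ing around by hand, something like the following Python sketch could be used. It assumes the debugfs layout seen in the ls command above (phy*/netdev:*/stations/<station>/rc_stats); the function name is made up, and it returns an empty list when the stations directory is empty, as in the listing above:

```python
import glob
import os

def find_rc_stats(debugfs_root="/sys/kernel/debug/ieee80211"):
    """List rc_stats files for every station on every phy/netdev.
    Assumes the layout phy*/netdev:*/stations/<station>/rc_stats."""
    pattern = os.path.join(
        debugfs_root, "phy*", "netdev:*", "stations", "*", "rc_stats"
    )
    return sorted(glob.glob(pattern))

paths = find_rc_stats()
if paths:
    for path in paths:
        print(path)
else:
    print("no rc_stats files found (no associated stations, or debugfs not mounted)")
```

Since the per-station directories only exist for stations the driver currently knows about, an empty result may just mean the laptop has not re-associated yet.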
Best regards,
--Edwin