* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
From: Dave Taht @ 2012-05-21 3:48 UTC
To: Tobias Diedrich; +Cc: OpenWrt Development List, codel

Thx for the numbers!

Could you do a TCP_RR while under load from UDP_STREAM?

On Mon, May 21, 2012 at 1:31 AM, Tobias Diedrich
<ranma+openwrt@tdiedrich.de> wrote:
> Tobias Diedrich wrote:
>> Dave Taht wrote:
>> > In looking over the enormous stack of boards and drivers that openwrt
>> > supports, I see that many of the ethernet drivers don't yet support
>> > Linux 3.3's "Byte Queue Limits", which are discussed here:
>> >
>> > http://lwn.net/Articles/454390/
>> >
>> > It would be good if more did. They improve network performance in the
>> > general case enormously, particularly when a link is not connected at
>> > its peak wire speed.
>> >
>> > *Adding* support for BQL to an ethernet driver is trivial, here's an
>> > example of how.
>>
>> I tried adding BQL to the ramips ethernet driver, however I found
>> some interesting behaviour while doing
>> "root@OpenWrt:~# netperf -l 120 -t UDP_STREAM -H myserver"
>>
>> It looks like the bridging code still needs to implement this as well?
>>
>> netperf UDP_STREAM:
>> iface  limit_min  inflight  tx mbps  remote mbps  ping ms
>> eth0   0          ~15000    95.71    95.71        ~10ms
>> eth0   1000000    ~300000   177.98   23.28(*)     ~30ms
>> br0    0          ~15000    154.88   33.94(*)     ~120ms
>> br0    1000000    ~300000   170.92   25.57(*)     ~30ms
>>
>> (*) bwm-ng on the server showed ~100mbps incoming...
> [...]
>> Haven't tried codel yet...
>
> Turns out, it works nicely with codel, even with the bridge:
>
> netperf:  netperf -l 120 -t UDP_STREAM -H myserver
> fq_codel: tc qdisc add dev eth0 handle 1: root fq_codel target 5ms
>
> iface  eth0 qdisc  bql  inflight  tx mbps    sys time  ping ms
> eth0   pfifo_fast  no   n/a       182.98(*)  96.43s    ~30ms
> eth0   fq_codel    no   n/a       177.98(*)  96.09s    ~30ms
> eth0   pfifo_fast  yes  ~15000    95.71      42.73s    ~10ms
> eth0   fq_codel    yes  ~15000    95.19      51.52s    ~4ms
> br0    pfifo_fast  yes  ~15000    155.19(*)  94.24s    ~120ms
> br0    fq_codel    yes  ~15000    90.92      65.52s    ~4ms
>
> (*) 100mbit link after the switch, ifconfig eth0 shows no drops,
> so I'm assuming they are getting dropped by the switch.
>
> --
> Tobias PGP: http://8ef7ddba.uguu.de

--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
From: Rick Jones @ 2012-05-21 17:24 UTC
To: Dave Taht, Tobias Diedrich; +Cc: OpenWrt Development List, codel

On 05/20/2012 08:48 PM, Dave Taht wrote:
> Thx for the numbers!
>
> Could you do a TCP_RR while under load from UDP_STREAM?

If you want to generate pretty pictures while doing so, you can probably
tweak http://www.netperf.org/svn/netperf2/trunk/doc/examples/bloat.sh

rick jones

>
> On Mon, May 21, 2012 at 1:31 AM, Tobias Diedrich
> <ranma+openwrt@tdiedrich.de> wrote:
>> Tobias Diedrich wrote:
>>> Dave Taht wrote:
>>>> In looking over the enormous stack of boards and drivers that openwrt
>>>> supports, I see that many of the ethernet drivers don't yet support
>>>> Linux 3.3's "Byte Queue Limits", which are discussed here:
>>>>
>>>> http://lwn.net/Articles/454390/
>>>>
>>>> It would be good if more did. They improve network performance in the
>>>> general case enormously, particularly when a link is not connected at
>>>> its peak wire speed.
>>>>
>>>> *Adding* support for BQL to an ethernet driver is trivial, here's an
>>>> example of how.
>>>
>>> I tried adding BQL to the ramips ethernet driver, however I found
>>> some interesting behaviour while doing
>>> "root@OpenWrt:~# netperf -l 120 -t UDP_STREAM -H myserver"
>>>
>>> It looks like the bridging code still needs to implement this as well?
>>>
>>> netperf UDP_STREAM:
>>> iface  limit_min  inflight  tx mbps  remote mbps  ping ms
>>> eth0   0          ~15000    95.71    95.71        ~10ms
>>> eth0   1000000    ~300000   177.98   23.28(*)     ~30ms
>>> br0    0          ~15000    154.88   33.94(*)     ~120ms
>>> br0    1000000    ~300000   170.92   25.57(*)     ~30ms
>>>
>>> (*) bwm-ng on the server showed ~100mbps incoming...
>> [...]
>>> Haven't tried codel yet...
>>
>> Turns out, it works nicely with codel, even with the bridge:
>>
>> netperf:  netperf -l 120 -t UDP_STREAM -H myserver
>> fq_codel: tc qdisc add dev eth0 handle 1: root fq_codel target 5ms
>>
>> iface  eth0 qdisc  bql  inflight  tx mbps    sys time  ping ms
>> eth0   pfifo_fast  no   n/a       182.98(*)  96.43s    ~30ms
>> eth0   fq_codel    no   n/a       177.98(*)  96.09s    ~30ms
>> eth0   pfifo_fast  yes  ~15000    95.71      42.73s    ~10ms
>> eth0   fq_codel    yes  ~15000    95.19      51.52s    ~4ms
>> br0    pfifo_fast  yes  ~15000    155.19(*)  94.24s    ~120ms
>> br0    fq_codel    yes  ~15000    90.92      65.52s    ~4ms
>>
>> (*) 100mbit link after the switch, ifconfig eth0 shows no drops,
>> so I'm assuming they are getting dropped by the switch.
>>
>> --
>> Tobias PGP: http://8ef7ddba.uguu.de

* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
From: Tobias Diedrich @ 2012-05-21 21:49 UTC
To: Rick Jones; +Cc: OpenWrt Development List, codel

Rick Jones wrote:
> On 05/20/2012 08:48 PM, Dave Taht wrote:
> >Thx for the numbers!
> >
> >Could you do a TCP_RR while under load from UDP_STREAM?
>
> If you want to generate pretty pictures while doing so, you can
> probably tweak
> http://www.netperf.org/svn/netperf2/trunk/doc/examples/bloat.sh

How about this:
http://tdiedrich.de/~ranma/bufferbloat-rt3050/

--
Tobias PGP: http://8ef7ddba.uguu.de

* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
From: Rick Jones @ 2012-05-21 22:17 UTC
To: Tobias Diedrich; +Cc: OpenWrt Development List, codel

On 05/21/2012 02:49 PM, Tobias Diedrich wrote:
> Rick Jones wrote:
>> On 05/20/2012 08:48 PM, Dave Taht wrote:
>>> Thx for the numbers!
>>>
>>> Could you do a TCP_RR while under load from UDP_STREAM?
>>
>> If you want to generate pretty pictures while doing so, you can
>> probably tweak
>> http://www.netperf.org/svn/netperf2/trunk/doc/examples/bloat.sh
>
> How about this:
> http://tdiedrich.de/~ranma/bufferbloat-rt3050/

They look pretty I suppose, but it also looks like I've got the vrules
botched somehow. Though I cannot find the bug just yet in the
repository copy. The red vertical line should be at the start of the
UDP_STREAM test's results, and there should be a black one right after.
They shouldn't be at the ends of the _RR test. Did you tweak that bit
when you converted to a UDP_STREAM test?

The other thing is it appears the scaling to make rrdtool look like it
supports dual y-axes could use a bit of tweaking. I was pretty much
guessing there :(

rick

* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
From: Rick Jones @ 2012-05-21 22:20 UTC
To: Tobias Diedrich; +Cc: OpenWrt Development List, codel

By the way - one caveat about using UDP_STREAM - the demo mode output
will be the sending-side results, not the receiving side. So it could
be overstating what was actually making it through the path.

rick

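To put a number on that caveat, here is a minimal sketch that compares a sender-side rate with the rate reported at the receiver. The function name delivered_pct is made up for illustration; the two sample figures are the "tx mbps" and "remote mbps" values from the second eth0 row of the UDP_STREAM table earlier in the thread.

```shell
#!/bin/sh
# Sketch only: what share of the sender-side UDP_STREAM rate was
# reported at the receiver. delivered_pct is a hypothetical helper.

delivered_pct() {
    # $1 = sender-side Mbit/s, $2 = receiver-side Mbit/s
    awk -v tx="$1" -v rx="$2" 'BEGIN { printf "%.1f\n", 100 * rx / tx }'
}

# Second eth0 row of the table: tx 177.98, remote 23.28
delivered_pct 177.98 23.28
```

A result far below 100 means the sending-side figure says little about what actually crossed the path, which is why the receiver's numbers (or an independent counter such as bwm-ng) are worth recording alongside it.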
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
From: Tobias Diedrich @ 2012-05-21 23:09 UTC
To: Rick Jones; +Cc: OpenWrt Development List, codel

Rick Jones wrote:
> On 05/21/2012 02:49 PM, Tobias Diedrich wrote:
> >Rick Jones wrote:
> >>On 05/20/2012 08:48 PM, Dave Taht wrote:
> >>>Thx for the numbers!
> >>>
> >>>Could you do a TCP_RR while under load from UDP_STREAM?
> >>
> >>If you want to generate pretty pictures while doing so, you can
> >>probably tweak
> >>http://www.netperf.org/svn/netperf2/trunk/doc/examples/bloat.sh
> >
> >How about this:
> >http://tdiedrich.de/~ranma/bufferbloat-rt3050/
>
> They look pretty I suppose, but it also looks like I've got the
> vrules botched somehow. Though I cannot find the bug just yet in
> the repository copy. The red vertical line should be at the start
> of the UDP_STREAM test's results, and there should be a black one
> right after. They shouldn't be at the ends of the _RR test. Did
> you tweak that bit when you converted to a UDP_STREAM test?

Ah, yes, I botched the vrules.

> The other thing is it appears the scaling to make rrdtool look like
> it supports dual y-axes could use a bit of tweaking. I was pretty
> much guessing there :(

Well, I tweaked the scaling myself since I wasn't happy with the
original result either. :)

I reuploaded new images with correct vrules and your scaling.

Anything above 100Mbit can be assumed to be dropped here (although
only the bridge seems to drop, the gige mac gets backpressure from
the switch I think and just delays transmitting the next packet I
suppose).

I can do a TCP_STREAM test, but since the SoC lacks sufficient oomph
to saturate a 100Mbit link the results are going to be boring I
expect. I get about 3MiB/s, regardless of TCP_STREAM or TCP_SENDFILE.
Maybe TCP_SENDFILE would be a bit faster if the driver implemented
checksum offload.

--
Tobias PGP: http://8ef7ddba.uguu.de

* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
From: Rick Jones @ 2012-05-21 23:30 UTC
To: Tobias Diedrich; +Cc: OpenWrt Development List, codel

On 05/21/2012 04:09 PM, Tobias Diedrich wrote:
> Rick Jones wrote:
>> On 05/21/2012 02:49 PM, Tobias Diedrich wrote:
>>> Rick Jones wrote:
>>>> On 05/20/2012 08:48 PM, Dave Taht wrote:
>>>>> Thx for the numbers!
>>>>>
>>>>> Could you do a TCP_RR while under load from UDP_STREAM?
>>>>
>>>> If you want to generate pretty pictures while doing so, you can
>>>> probably tweak
>>>> http://www.netperf.org/svn/netperf2/trunk/doc/examples/bloat.sh
>>>
>>> How about this:
>>> http://tdiedrich.de/~ranma/bufferbloat-rt3050/
>>
>> They look pretty I suppose, but it also looks like I've got the
>> vrules botched somehow. Though I cannot find the bug just yet in
>> the repository copy. The red vertical line should be at the start
>> of the UDP_STREAM test's results, and there should be a black one
>> right after. They shouldn't be at the ends of the _RR test. Did
>> you tweak that bit when you converted to a UDP_STREAM test?
>
> Ah, yes, I botched the vrules.
>
>> The other thing is it appears the scaling to make rrdtool look like
>> it supports dual y-axes could use a bit of tweaking. I was pretty
>> much guessing there :(
>
> Well, I tweaked the scaling myself since I wasn't happy with the
> original result either. :)

I think my original ones had the unfortunate effect of putting lines on
top of one another. Yours seem to put them pretty far apart (at least
sometimes). We ought to be able to find some reasonable medium in there
somewhere. I'm thinking if latency is the metric of greatest interest,
we want that to have the full y axis, and then the peak bandwidth of the
STREAM test be about half-way up?

> I reuploaded new images with correct vrules and your scaling.
>
> Anything above 100Mbit can be assumed to be dropped here (although
> only the bridge seems to drop, the gige mac gets backpressure from
> the switch I think and just delays transmitting the next packet I
> suppose).
>
> I can do a TCP_STREAM test, but since the SoC lacks sufficient oomph
> to saturate a 100Mbit link the results are going to be boring I
> expect. I get about 3MiB/s, regardless of TCP_STREAM or TCP_SENDFILE.
> Maybe TCP_SENDFILE would be a bit faster if the driver implemented
> checksum offload.

I'm fine with folks using UDP_STREAM, so long as they are aware of the
issues involved.

rick

* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
From: Rick Jones @ 2012-05-22 1:22 UTC
To: Tobias Diedrich; +Cc: OpenWrt Development List, codel

On 05/21/2012 04:30 PM, Rick Jones wrote:
> I think my original ones had the unfortunate effect of putting lines on
> top of one another. Yours seem to put them pretty far apart (at least
> sometimes). We ought to be able to find some reasonable medium in there
> somewhere. I'm thinking if latency is the metric of greatest interest,
> we want that to have the full y axis, and then the peak bandwidth of the
> STREAM test be about half-way up?

I've tweaked the bloat.sh script in a couple ways. First, I changed how
I compute the scaling factor, to implement what I described above.
Second, I am using a negative value for the demo interval for the TCP_RR
test. This causes netperf to check if it is time to emit a result after
each transaction rather than after what it thought would be the number
of transactions in the interval. In that way the latency line is much
more robust in the face of a sudden bloating of the path. The effect on
transactions per second should be similar to that of enabling
histograms.

An example of a test across my 100 Mbit/s link to a laptop is attached.

happy benchmarking,

rick jones

[-- Attachment #2: bloat.png --]
[-- Type: image/png, Size: 33750 bytes --]

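The scaling rule described above (latency gets the full y axis, peak STREAM bandwidth sits about half-way up) can be sketched as a single multiplier applied to the bandwidth samples before plotting. This is only one possible reading, not the computation in bloat.sh; the name bw_scale and the sample values are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: factor that maps the peak STREAM bandwidth to half the
# height of a y axis sized for the largest latency sample.
# bw_scale is a hypothetical helper, not taken from bloat.sh.

bw_scale() {
    # $1 = largest latency on the y axis, $2 = peak bandwidth
    awk -v lat="$1" -v bw="$2" 'BEGIN { printf "%.4f\n", (lat / 2) / bw }'
}

# Example: 120 ms worst-case latency, 95 Mbit/s peak throughput.
# Each bandwidth sample would be multiplied by this factor.
bw_scale 120 95

# The negative demo interval is passed with netperf's global -D option,
# e.g. "-D -1" (assumes a netperf built with demo mode enabled).
```

With those sample inputs, a 95 Mbit/s reading is drawn at the same height as a 60 ms latency, leaving the upper half of the plot for latency spikes.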
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
From: Dave Taht @ 2012-05-22 1:29 UTC
To: Rick Jones; +Cc: Tobias Diedrich, OpenWrt Development List, codel

I would really like people to clearly mark when they are using
pfifo_fast, codel, and fq_codel.

Secondly, I note that for utterly best results it is useful to ALSO
have htb on ingress, set to a value only slightly lower than the rate
under test, with fq_codel attached to the bin(s) (an example of this
is in the deBloat repo on github - both ingress.sh and simple_qos.sh).

It would be nice if doing ingress was as simple as egress, maybe using
some sort of tbf + fq_codel....

Otherwise for some benchmarks... at 100Mbit, you will see TCP_STREAM
behavior holding the line at ~5ms, and TCP_MAERTS being in excess of
30ms, especially when pfifo_fast is on the other side. Or vice versa,
depending on where you are running the test.

On Tue, May 22, 2012 at 2:22 AM, Rick Jones <rick.jones2@hp.com> wrote:
> On 05/21/2012 04:30 PM, Rick Jones wrote:
>>
>> I think my original ones had the unfortunate effect of putting lines on
>> top of one another. Yours seem to put them pretty far apart (at least
>> sometimes). We ought to be able to find some reasonable medium in there
>> somewhere. I'm thinking if latency is the metric of greatest interest,
>> we want that to have the full y axis, and then the peak bandwidth of the
>> STREAM test be about half-way up?
>
>
> I've tweaked the bloat.sh script in a couple ways. First, I changed how I
> compute the scaling factor, to implement what I described above. Second, I
> am using a negative value for the demo interval for the TCP_RR test. This
> causes netperf to check if it is time to emit a result after each
> transaction rather than after what it thought would be the number of
> transactions in the interval. In that way the latency line is much more
> robust in the face of a sudden bloating of the path. The effect on
> transactions per second should be similar to that of enabling histograms.
> An example of a test across my 100 Mbit/s link to a laptop is attached.
>
> happy benchmarking,
>
> rick jones
>
> _______________________________________________
> Codel mailing list
> Codel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/codel
>

--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

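For readers who have not set up ingress shaping before, the following is a rough sketch of the htb-on-ingress arrangement described above, using an IFB device to turn ingress into something a qdisc can shape. It is not a copy of ingress.sh or simple_qos.sh from the deBloat repo; the function name ingress_cmds, the device names, the class id 1:10 and the 95mbit rate are all assumptions. The function only prints the commands so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch only: print (do not run) commands that redirect ingress traffic
# to an IFB device and shape it there with htb + fq_codel.
# ingress_cmds is a hypothetical helper; adjust names and rate to taste.

ingress_cmds() {
    # $1 = real interface, $2 = ifb device, $3 = rate slightly below link
    dev=$1 ifb=$2 rate=$3
    cat <<EOF
modprobe ifb numifbs=1
ip link set dev $ifb up
tc qdisc add dev $dev handle ffff: ingress
tc filter add dev $dev parent ffff: protocol all u32 match u32 0 0 action mirred egress redirect dev $ifb
tc qdisc add dev $ifb root handle 1: htb default 10
tc class add dev $ifb parent 1: classid 1:10 htb rate $rate
tc qdisc add dev $ifb parent 1:10 fq_codel
EOF
}

# Review the output first; to apply it, pipe it to sh as root:
#   ingress_cmds eth0 ifb0 95mbit | sh
ingress_cmds eth0 ifb0 95mbit
```

Keeping the htb rate a little under the tested link rate moves the queue onto the box running fq_codel rather than leaving it in whatever sits upstream.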
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
From: Dave Taht @ 2012-05-22 0:27 UTC
To: Tobias Diedrich; +Cc: codel, OpenWrt Development List

In looking over your test scripts and results, it seems possible you
have gso on.

ethtool -K the_device tso off
ethtool -K the_device gso off
ethtool -K the_device ufo off

Secondly, in the 100Mbit and below case, I have found BQL's estimates
to be persistently on the high side, and have generally found that a
byte queue limit of 3000 or 4500 produces optimal, consistent results.
Usually 1500 causes starvation.

YMMV.

On Mon, May 21, 2012 at 10:49 PM, Tobias Diedrich
<ranma+openwrt@tdiedrich.de> wrote:
> Rick Jones wrote:
>> On 05/20/2012 08:48 PM, Dave Taht wrote:
>> >Thx for the numbers!
>> >
>> >Could you do a TCP_RR while under load from UDP_STREAM?
>>
>> If you want to generate pretty pictures while doing so, you can
>> probably tweak
>> http://www.netperf.org/svn/netperf2/trunk/doc/examples/bloat.sh
>
> How about this:
> http://tdiedrich.de/~ranma/bufferbloat-rt3050/
>
> --
> Tobias PGP: http://8ef7ddba.uguu.de

--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net

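A fixed byte queue limit like the 3000 or 4500 suggested above is normally applied through the per-queue byte_queue_limits directory in sysfs, the same place the limit_min and inflight values in the earlier tables come from. The sketch below assumes a driver with BQL support; the helper names bql_dir and set_bql_max and the SYSFS_ROOT override (there only so the code can be tried against a scratch directory) are made up for illustration.

```shell
#!/bin/sh
# Sketch only: cap BQL on every TX queue of an interface.
# bql_dir, set_bql_max and SYSFS_ROOT are hypothetical names;
# on a real system SYSFS_ROOT is left unset and /sys is used.

bql_dir() {
    # $1 = interface, $2 = TX queue index
    printf '%s/class/net/%s/queues/tx-%s/byte_queue_limits\n' \
        "${SYSFS_ROOT:-/sys}" "$1" "$2"
}

set_bql_max() {
    # $1 = interface, $2 = limit in bytes (e.g. 3000 or 4500)
    for q in "${SYSFS_ROOT:-/sys}/class/net/$1/queues"/tx-*; do
        # Skip queues without BQL support (or an unmatched glob).
        [ -d "$q/byte_queue_limits" ] || continue
        echo "$2" > "$q/byte_queue_limits/limit_max"
    done
}

# Example (as root, on a device whose driver implements BQL):
#   set_bql_max eth0 3000
#   cat "$(bql_dir eth0 0)/inflight"
```

Writing limit_max rather than limit leaves BQL free to adapt below the cap, which seems closest to the intent of bounding estimates that run high.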
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
From: Tobias Diedrich @ 2012-05-29 13:00 UTC
To: Dave Taht; +Cc: codel, OpenWrt Development List

Dave Taht wrote:
> In looking over your test scripts and results, it seems possible you
> have gso on.

The driver didn't implement any advanced hardware features, so GSO
was unsupported.

Still, I've found the performance for this SoC is heavily limited by
memory bandwidth, and implementing scatter/gather support and hw
checksum offload improves TCP_STREAM performance greatly (about
doubled throughput).

Case in point, a second device with basically the same SoC, but
slightly faster (384MHz instead of 320MHz) and double the memory
bandwidth (two chips instead of one) reaches twice the speed of the
first device (and also doubles the speed when scatter/gather and
checksum offload is enabled).

I haven't done any further testing with fq_codel yet.

--
Tobias PGP: http://8ef7ddba.uguu.de

end of thread, other threads:[~2012-05-29 13:00 UTC | newest]

Thread overview: 11+ messages
     [not found] <CAA93jw6TcZGpxpS=DhuHCTWBn3uq1RbzugtJ3oJmA5zx9oDP-w@mail.gmail.com>
     [not found] ` <20120520212944.GK22418@yumi.tdiedrich.de>
     [not found]   ` <20120521003115.GO22418@yumi.tdiedrich.de>
2012-05-21  3:48     ` [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel) Dave Taht
2012-05-21 17:24       ` Rick Jones
2012-05-21 21:49         ` Tobias Diedrich
2012-05-21 22:17           ` Rick Jones
2012-05-21 22:20             ` Rick Jones
2012-05-21 23:09             ` Tobias Diedrich
2012-05-21 23:30               ` Rick Jones
2012-05-22  1:22                 ` Rick Jones
2012-05-22  1:29                   ` Dave Taht
2012-05-22  0:27           ` Dave Taht
2012-05-29 13:00             ` Tobias Diedrich