* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
[not found] ` <20120521003115.GO22418@yumi.tdiedrich.de>
@ 2012-05-21 3:48 ` Dave Taht
2012-05-21 17:24 ` Rick Jones
0 siblings, 1 reply; 11+ messages in thread
From: Dave Taht @ 2012-05-21 3:48 UTC (permalink / raw)
To: Tobias Diedrich; +Cc: OpenWrt Development List, codel
Thx for the numbers!
Could you do a TCP_RR while under load from UDP_STREAM?
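A minimal sketch of such a run, assuming netperf is installed locally and netserver is listening on the target. The host name, durations, warm-up delay, and log file name are placeholders, not values taken from this thread.

```shell
#!/bin/sh
# Sketch: run TCP_RR while a UDP_STREAM flood loads the same path.
# All values below are illustrative defaults (assumptions).
SERVER="${SERVER:-myserver}"
LOAD_SECS="${LOAD_SECS:-120}"     # how long the UDP flood runs
RR_SECS="${RR_SECS:-60}"          # how long the latency probe runs
WARMUP_SECS="${WARMUP_SECS:-10}"  # let the queue fill before probing
LOAD_LOG="${LOAD_LOG:-udp_stream.out}"

# Build the two netperf command lines (only standard -H/-t/-l options).
load_cmd() { echo "netperf -H $SERVER -t UDP_STREAM -l $LOAD_SECS"; }
rr_cmd()   { echo "netperf -H $SERVER -t TCP_RR -l $RR_SECS"; }

rr_under_load() {
    # Start the load in the background and remember its PID.
    $(load_cmd) > "$LOAD_LOG" 2>&1 &
    load_pid=$!
    sleep "$WARMUP_SECS"
    # Foreground request/response test; its output is the result of interest.
    $(rr_cmd)
    wait "$load_pid"
}

# Nothing touches the network unless RUN=1 is set explicitly.
if [ "${RUN:-0}" = "1" ]; then
    rr_under_load
fi
```

TCP_RR reports transactions per second; with one transaction outstanding at a time, the average round-trip time is roughly the reciprocal of that rate.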
On Mon, May 21, 2012 at 1:31 AM, Tobias Diedrich
<ranma+openwrt@tdiedrich.de> wrote:
> Tobias Diedrich wrote:
>> Dave Taht wrote:
>> > In looking over the enormous stack of boards and drivers that openwrt
>> > supports, I see that many of the ethernet drivers don't yet support
>> > Linux 3.3's "Byte Queue Limits", which are discussed here:
>> >
>> > http://lwn.net/Articles/454390/
>> >
>> > It would be good if more did. They improve network performance in the
>> > general case enormously, particularly when a link is not connected at
>> > its peak wire speed.
>> >
>> > *Adding* support for BQL to an ethernet driver is trivial; here's an
>> > example of how.
>>
>> I tried adding BQL to the ramips ethernet driver, however I found
>> some interesting behaviour while doing
>> "root@OpenWrt:~# netperf -l 120 -t UDP_STREAM -H myserver"
>>
>> It looks like the bridging code still needs to implement this as well?
>>
>> netperf UDP_STREAM:
>> iface  limit_min  inflight  tx mbps  remote mbps  ping ms
>> eth0   0          ~15000    95.71    95.71        ~10ms
>> eth0   1000000    ~300000   177.98   23.28(*)     ~30ms
>> br0    0          ~15000    154.88   33.94(*)     ~120ms
>> br0    1000000    ~300000   170.92   25.57(*)     ~30ms
>>
>> (*) bwm-ng on the server showed ~100mbps incoming...
> [...]
>> Haven't tried codel yet...
>
> Turns out, it works nicely with codel, even with the bridge:
>
> netperf: netperf -l 120 -t UDP_STREAM -H myserver
> fq_codel: tc qdisc add dev eth0 handle 1: root fq_codel target 5ms
>
> iface  eth0 qdisc  bql  inflight  tx mbps    sys time  ping ms
> eth0   pfifo_fast  no   n/a       182.98(*)  96.43s    ~30ms
> eth0   fq_codel    no   n/a       177.98(*)  96.09s    ~30ms
> eth0   pfifo_fast  yes  ~15000    95.71      42.73s    ~10ms
> eth0   fq_codel    yes  ~15000    95.19      51.52s    ~4ms
> br0    pfifo_fast  yes  ~15000    155.19(*)  94.24s    ~120ms
> br0    fq_codel    yes  ~15000    90.92      65.52s    ~4ms
>
> (*) 100mbit link after the switch, ifconfig eth0 shows no drops,
> so I'm assuming they are getting dropped by the switch.
>
> --
> Tobias PGP: http://8ef7ddba.uguu.de
--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
2012-05-21 3:48 ` [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel) Dave Taht
@ 2012-05-21 17:24 ` Rick Jones
2012-05-21 21:49 ` Tobias Diedrich
0 siblings, 1 reply; 11+ messages in thread
From: Rick Jones @ 2012-05-21 17:24 UTC (permalink / raw)
To: Dave Taht, Tobias Diedrich; +Cc: OpenWrt Development List, codel
On 05/20/2012 08:48 PM, Dave Taht wrote:
> Thx for the numbers!
>
> Could you do a TCP_RR while under load from UDP_STREAM?
If you want to generate pretty pictures while doing so, you can probably
tweak http://www.netperf.org/svn/netperf2/trunk/doc/examples/bloat.sh
rick jones
>
> On Mon, May 21, 2012 at 1:31 AM, Tobias Diedrich
> <ranma+openwrt@tdiedrich.de> wrote:
>> Tobias Diedrich wrote:
>>> Dave Taht wrote:
>>>> In looking over the enormous stack of boards and drivers that openwrt
>>>> supports, I see that many of the ethernet drivers don't yet support
>>>> Linux 3.3's "Byte Queue Limits", which are discussed here:
>>>>
>>>> http://lwn.net/Articles/454390/
>>>>
>>>> It would be good if more did. They improve network performance in the
>>>> general case enormously, particularly when a link is not connected at
>>>> its peak wire speed.
>>>>
>>>> *Adding* support for BQL to an ethernet driver is trivial; here's an
>>>> example of how.
>>>
>>> I tried adding BQL to the ramips ethernet driver, however I found
>>> some interesting behaviour while doing
>>> "root@OpenWrt:~# netperf -l 120 -t UDP_STREAM -H myserver"
>>>
>>> It looks like the bridging code still needs to implement this as well?
>>>
>>> netperf UDP_STREAM:
>>> iface  limit_min  inflight  tx mbps  remote mbps  ping ms
>>> eth0   0          ~15000    95.71    95.71        ~10ms
>>> eth0   1000000    ~300000   177.98   23.28(*)     ~30ms
>>> br0    0          ~15000    154.88   33.94(*)     ~120ms
>>> br0    1000000    ~300000   170.92   25.57(*)     ~30ms
>>>
>>> (*) bwm-ng on the server showed ~100mbps incoming...
>> [...]
>>> Haven't tried codel yet...
>>
>> Turns out, it works nicely with codel, even with the bridge:
>>
>> netperf: netperf -l 120 -t UDP_STREAM -H myserver
>> fq_codel: tc qdisc add dev eth0 handle 1: root fq_codel target 5ms
>>
>> iface  eth0 qdisc  bql  inflight  tx mbps    sys time  ping ms
>> eth0   pfifo_fast  no   n/a       182.98(*)  96.43s    ~30ms
>> eth0   fq_codel    no   n/a       177.98(*)  96.09s    ~30ms
>> eth0   pfifo_fast  yes  ~15000    95.71      42.73s    ~10ms
>> eth0   fq_codel    yes  ~15000    95.19      51.52s    ~4ms
>> br0    pfifo_fast  yes  ~15000    155.19(*)  94.24s    ~120ms
>> br0    fq_codel    yes  ~15000    90.92      65.52s    ~4ms
>>
>> (*) 100mbit link after the switch, ifconfig eth0 shows no drops,
>> so I'm assuming they are getting dropped by the switch.
>>
>> --
>> Tobias PGP: http://8ef7ddba.uguu.de
>
>
>
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
2012-05-21 17:24 ` Rick Jones
@ 2012-05-21 21:49 ` Tobias Diedrich
2012-05-21 22:17 ` Rick Jones
2012-05-22 0:27 ` Dave Taht
0 siblings, 2 replies; 11+ messages in thread
From: Tobias Diedrich @ 2012-05-21 21:49 UTC (permalink / raw)
To: Rick Jones; +Cc: OpenWrt Development List, codel
Rick Jones wrote:
> On 05/20/2012 08:48 PM, Dave Taht wrote:
> >Thx for the numbers!
> >
> >Could you do a TCP_RR while under load from UDP_STREAM?
>
> If you want to generate pretty pictures while doing so, you can
> probably tweak
> http://www.netperf.org/svn/netperf2/trunk/doc/examples/bloat.sh
How about this:
http://tdiedrich.de/~ranma/bufferbloat-rt3050/
--
Tobias PGP: http://8ef7ddba.uguu.de
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
2012-05-21 21:49 ` Tobias Diedrich
@ 2012-05-21 22:17 ` Rick Jones
2012-05-21 22:20 ` Rick Jones
2012-05-21 23:09 ` Tobias Diedrich
2012-05-22 0:27 ` Dave Taht
1 sibling, 2 replies; 11+ messages in thread
From: Rick Jones @ 2012-05-21 22:17 UTC (permalink / raw)
To: Tobias Diedrich; +Cc: OpenWrt Development List, codel
On 05/21/2012 02:49 PM, Tobias Diedrich wrote:
> Rick Jones wrote:
>> On 05/20/2012 08:48 PM, Dave Taht wrote:
>>> Thx for the numbers!
>>>
>>> Could you do a TCP_RR while under load from UDP_STREAM?
>>
>> If you want to generate pretty pictures while doing so, you can
>> probably tweak
>> http://www.netperf.org/svn/netperf2/trunk/doc/examples/bloat.sh
>
> How about this:
> http://tdiedrich.de/~ranma/bufferbloat-rt3050/
They look pretty I suppose, but it also looks like I've got the vrules
botched somehow. Though I cannot find the bug just yet in the
repository copy. The red vertical line should be at the start of the
UDP_STREAM test's results, and there should be a black one right after.
They shouldn't be at the ends of the _RR test. Did you tweak that bit
when you converted to a UDP_STREAM test?
The other thing is it appears the scaling to make rrdtool look like it
supports dual y-axes could use a bit of tweaking. I was pretty much
guessing there :(
rick
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
2012-05-21 22:17 ` Rick Jones
@ 2012-05-21 22:20 ` Rick Jones
2012-05-21 23:09 ` Tobias Diedrich
1 sibling, 0 replies; 11+ messages in thread
From: Rick Jones @ 2012-05-21 22:20 UTC (permalink / raw)
To: Tobias Diedrich; +Cc: OpenWrt Development List, codel
By the way - one caveat about using UDP_STREAM - the demo mode output
will be the sending-side results, not the receiving side. So it could
be overstating what was actually making it through the path.
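To put a number on that caveat: the ratio of the receive-side rate to the send-side rate gives the share of traffic that actually arrived. A small sketch using figures quoted earlier in the thread (177.98 Mbit/s reported by the sender, roughly 100 Mbit/s seen by bwm-ng on the server); the function name is made up for illustration.

```shell
#!/bin/sh
# delivered_pct SENT_MBPS RECEIVED_MBPS
# Prints the percentage of the sent traffic that reached the receiver.
delivered_pct() {
    awk -v tx="$1" -v rx="$2" 'BEGIN { printf "%.1f\n", 100 * rx / tx }'
}

# Sender-side figure from the table vs. the ~100 Mbit/s bwm-ng reading.
delivered_pct 177.98 100    # prints 56.2
```

In other words, a bit under half of what the sending side counted never made it across the 100 Mbit/s hop in that run.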
rick
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
2012-05-21 22:17 ` Rick Jones
2012-05-21 22:20 ` Rick Jones
@ 2012-05-21 23:09 ` Tobias Diedrich
2012-05-21 23:30 ` Rick Jones
1 sibling, 1 reply; 11+ messages in thread
From: Tobias Diedrich @ 2012-05-21 23:09 UTC (permalink / raw)
To: Rick Jones; +Cc: OpenWrt Development List, codel
Rick Jones wrote:
> On 05/21/2012 02:49 PM, Tobias Diedrich wrote:
> >Rick Jones wrote:
> >>On 05/20/2012 08:48 PM, Dave Taht wrote:
> >>>Thx for the numbers!
> >>>
> >>>Could you do a TCP_RR while under load from UDP_STREAM?
> >>
> >>If you want to generate pretty pictures while doing so, you can
> >>probably tweak
> >>http://www.netperf.org/svn/netperf2/trunk/doc/examples/bloat.sh
> >
> >How about this:
> >http://tdiedrich.de/~ranma/bufferbloat-rt3050/
>
> They look pretty I suppose, but it also looks like I've got the
> vrules botched somehow. Though I cannot find the bug just yet in
> the repository copy. The red vertical line should be at the start
> of the UDP_STREAM test's results, and there should be a black one
> right after. They shouldn't be at the ends of the _RR test. Did
> you tweak that bit when you converted to a UDP_STREAM test?
Ah, yes, I botched the vrules.
> The other thing is it appears the scaling to make rrdtool look like
> it supports dual y-axes could use a bit of tweaking. I was pretty
> much guessing there :(
Well, I tweaked the scaling myself since I wasn't happy with the
original result either. :)
I reuploaded new images with correct vrules and your scaling.
Anything above 100Mbit can be assumed to be dropped here (although
only the bridge seems to drop; I think the GigE MAC gets backpressure
from the switch and just delays transmitting the next packet).
I can do a TCP_STREAM test, but since the SoC lacks sufficient oomph
to saturate a 100Mbit link the results are going to be boring I
expect. I get about 3MiB/s, regardless of TCP_STREAM or TCP_SENDFILE.
Maybe TCP_SENDFILE would be a bit faster if the driver implemented
checksum offload.
--
Tobias PGP: http://8ef7ddba.uguu.de
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
2012-05-21 23:09 ` Tobias Diedrich
@ 2012-05-21 23:30 ` Rick Jones
2012-05-22 1:22 ` Rick Jones
0 siblings, 1 reply; 11+ messages in thread
From: Rick Jones @ 2012-05-21 23:30 UTC (permalink / raw)
To: Tobias Diedrich; +Cc: OpenWrt Development List, codel
On 05/21/2012 04:09 PM, Tobias Diedrich wrote:
> Rick Jones wrote:
>> On 05/21/2012 02:49 PM, Tobias Diedrich wrote:
>>> Rick Jones wrote:
>>>> On 05/20/2012 08:48 PM, Dave Taht wrote:
>>>>> Thx for the numbers!
>>>>>
>>>>> Could you do a TCP_RR while under load from UDP_STREAM?
>>>>
>>>> If you want to generate pretty pictures while doing so, you can
>>>> probably tweak
>>>> http://www.netperf.org/svn/netperf2/trunk/doc/examples/bloat.sh
>>>
>>> How about this:
>>> http://tdiedrich.de/~ranma/bufferbloat-rt3050/
>>
>> They look pretty I suppose, but it also looks like I've got the
>> vrules botched somehow. Though I cannot find the bug just yet in
>> the repository copy. The red vertical line should be at the start
>> of the UDP_STREAM test's results, and there should be a black one
>> right after. They shouldn't be at the ends of the _RR test. Did
>> you tweak that bit when you converted to a UDP_STREAM test?
>
> Ah, yes, I botched the vrules.
>
>> The other thing is it appears the scaling to make rrdtool look like
>> it supports dual y-axes could use a bit of tweaking. I was pretty
>> much guessing there :(
>
> Well, I tweaked the scaling myself since I wasn't happy with the
> original result either. :)
I think my original ones had the unfortunate effect of putting lines on
top of one another. Yours seem to put them pretty far apart (at least
sometimes). We ought to be able to find some reasonable medium in there
somewhere. I'm thinking if latency is the metric of greatest interest,
we want that to have the full y axis, and then the peak bandwidth of the
STREAM test be about half-way up?
> I reuploaded new images with correct vrules and your scaling.
>
> Anything above 100Mbit can be assumed to be dropped here (although
> only the bridge seems to drop, the gige mac gets backpressure from
> the switch I think and just delays transmitting the next packet I
> suppose).
>
> I can do a TCP_STREAM test, but since the SoC lacks sufficient oomph
> to saturate a 100Mbit link the results are going to be boring I
> expect. I get about 3MiB/s, regardless of TCP_STREAM or TCP_SENDFILE.
> Maybe TCP_SENDFILE would be a bit faster if the driver implemented
> checksum offload.
I'm fine with folks using UDP_STREAM, so long as they are aware of the
issues involved.
rick
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
2012-05-21 21:49 ` Tobias Diedrich
2012-05-21 22:17 ` Rick Jones
@ 2012-05-22 0:27 ` Dave Taht
2012-05-29 13:00 ` Tobias Diedrich
1 sibling, 1 reply; 11+ messages in thread
From: Dave Taht @ 2012-05-22 0:27 UTC (permalink / raw)
To: Tobias Diedrich; +Cc: codel, OpenWrt Development List
In looking over your test scripts and results, it seems possible you
have gso on. If so, try turning the offloads off:
ethtool -K the_device tso off
ethtool -K the_device gso off
ethtool -K the_device ufo off
Secondly, in the 100Mbit and below case, I have found BQL's estimates
to be persistently on the high side, and have generally found that a
byte queue limit of 3000 or 4500 produces optimal, consistent results.
Usually 1500 causes starvation. YMMV.
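For anyone wanting to try a fixed cap: BQL exposes its per-queue knobs under /sys/class/net/<dev>/queues/tx-<n>/byte_queue_limits/. A minimal sketch, assuming the driver has BQL support and the script runs as root; the function names and the SYSFS override are only for illustration. The second helper converts a byte count into the time needed to drain it at a given link rate, which shows why ~15000 bytes in flight adds about 1.2 ms at 100 Mbit/s while ~300000 bytes adds about 24 ms.

```shell
#!/bin/sh
# SYSFS can be overridden for testing; the default is the real sysfs tree.
SYSFS="${SYSFS:-/sys/class/net}"

# set_bql_limit_max DEVICE BYTES
# Caps BQL's limit on every transmit queue of DEVICE.
set_bql_limit_max() {
    for q in "$SYSFS/$1"/queues/tx-*/byte_queue_limits; do
        [ -d "$q" ] || continue      # skip if the glob matched nothing
        echo "$2" > "$q/limit_max"
    done
}

# drain_ms BYTES LINK_MBPS
# Time in milliseconds to transmit BYTES at LINK_MBPS megabits per second.
drain_ms() {
    awk -v b="$1" -v r="$2" 'BEGIN { printf "%.2f\n", b * 8 / (r * 1000) }'
}

drain_ms 15000 100     # prints 1.20
drain_ms 300000 100    # prints 24.00

# Example (as root): set_bql_limit_max eth0 3000
```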
On Mon, May 21, 2012 at 10:49 PM, Tobias Diedrich
<ranma+openwrt@tdiedrich.de> wrote:
> Rick Jones wrote:
>> On 05/20/2012 08:48 PM, Dave Taht wrote:
>> >Thx for the numbers!
>> >
>> >Could you do a TCP_RR while under load from UDP_STREAM?
>>
>> If you want to generate pretty pictures while doing so, you can
>> probably tweak
>> http://www.netperf.org/svn/netperf2/trunk/doc/examples/bloat.sh
>
> How about this:
> http://tdiedrich.de/~ranma/bufferbloat-rt3050/
>
> --
> Tobias PGP: http://8ef7ddba.uguu.de
--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
2012-05-21 23:30 ` Rick Jones
@ 2012-05-22 1:22 ` Rick Jones
2012-05-22 1:29 ` Dave Taht
0 siblings, 1 reply; 11+ messages in thread
From: Rick Jones @ 2012-05-22 1:22 UTC (permalink / raw)
To: Tobias Diedrich; +Cc: OpenWrt Development List, codel
On 05/21/2012 04:30 PM, Rick Jones wrote:
> I think my original ones had the unfortunate effect of putting lines on
> top of one another. Yours seem to put them pretty far apart (at least
> sometimes). We ought to be able to find some reasonable medium in there
> somewhere. I'm thinking if latency is the metric of greatest interest,
> we want that to have the full y axis, and then the peak bandwidth of the
> STREAM test be about half-way up?
I've tweaked the bloat.sh script in a couple ways. First, I changed how
I compute the scaling factor, to implement what I described above.
Second, I am using a negative value for the demo interval for the TCP_RR
test. This causes netperf to check if it is time to emit a result after
each transaction rather than after what it thought would be the number
of transactions in the interval. In that way the latency line is much
more robust in the face of a sudden bloating of the path. The effect on
transactions per second should be similar to that of enabling
histograms. An example of a test across my 100 Mbit/s link to a laptop
is attached.
happy benchmarking,
rick jones
[-- Attachment #2: bloat.png --]
[-- Type: image/png, Size: 33750 bytes --]
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
2012-05-22 1:22 ` Rick Jones
@ 2012-05-22 1:29 ` Dave Taht
0 siblings, 0 replies; 11+ messages in thread
From: Dave Taht @ 2012-05-22 1:29 UTC (permalink / raw)
To: Rick Jones; +Cc: Tobias Diedrich, OpenWrt Development List, codel
I would really like people to clearly mark when they are using pfifo_fast,
codel, and fq_codel.
Secondly, I note that for utterly best results it is useful to ALSO have
htb on ingress, set to a rate only slightly lower than the rate under
test, with fq_codel attached to the bin(s)
(an example of this is in the deBloat repo on github - both ingress.sh
and simple_qos.sh).
It would be nice if doing ingress was as simple as egress, maybe using some sort
of tbf + fq_codel....
Otherwise, for some benchmarks at 100Mbit, you will see TCP_STREAM
holding the line at ~5ms and TCP_MAERTS in excess of 30ms, especially
when pfifo_fast is on the other side. Or vice versa, depending on where
you are running the test.
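A rough sketch of the ingress setup described above: redirect incoming traffic to an IFB device, shape it with htb just under the link rate, and hang fq_codel off the htb class. This is not the deBloat ingress.sh; the device names, the 95mbit rate, and the class numbering are assumptions. The function only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Illustrative defaults (assumptions, adjust to the link under test).
DEV="${DEV:-eth0}"       # interface whose ingress is shaped
IFB="${IFB:-ifb0}"       # intermediate device that receives the redirect
RATE="${RATE:-95mbit}"   # slightly below a 100 Mbit/s link

# Print the command sequence; nothing is executed here.
ingress_cmds() {
    cat <<EOF
modprobe ifb
ip link set dev $IFB up
tc qdisc add dev $DEV handle ffff: ingress
tc filter add dev $DEV parent ffff: protocol all u32 match u32 0 0 action mirred egress redirect dev $IFB
tc qdisc add dev $IFB root handle 1: htb default 10
tc class add dev $IFB parent 1: classid 1:10 htb rate $RATE
tc qdisc add dev $IFB parent 1:10 fq_codel
EOF
}

ingress_cmds
```

To apply the printed commands as root, pipe them into a shell: `ingress_cmds | sh -e`.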
On Tue, May 22, 2012 at 2:22 AM, Rick Jones <rick.jones2@hp.com> wrote:
> On 05/21/2012 04:30 PM, Rick Jones wrote:
>>
>> I think my original ones had the unfortunate effect of putting lines on
>> top of one another. Yours seem to put them pretty far apart (at least
>> sometimes). We ought to be able to find some reasonable medium in there
>> somewhere. I'm thinking if latency is the metric of greatest interest,
>> we want that to have the full y axis, and then the peak bandwidth of the
>> STREAM test be about half-way up?
>
>
> I've tweaked the bloat.sh script in a couple ways. First, I changed how I
> compute the scaling factor, to implement what I described above. Second, I
> am using a negative value for the demo interval for the TCP_RR test. This
> causes netperf to check if it is time to emit a result after each
> transaction rather than after what it thought would be the number of
> transactions in the interval. In that way the latency line is much more
> robust in the face of a sudden bloating of the path. The effect on
> transactions per second should be similar to that of enabling histograms.
> An example of a test across my 100 Mbit/s link to a laptop is attached.
>
> happy benchmarking,
>
> rick jones
>
--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://www.bufferbloat.net
* Re: [Codel] BQL support in Ethernet drivers (and Kathie Nichols and Van Jacobson's new AQM, codel)
2012-05-22 0:27 ` Dave Taht
@ 2012-05-29 13:00 ` Tobias Diedrich
0 siblings, 0 replies; 11+ messages in thread
From: Tobias Diedrich @ 2012-05-29 13:00 UTC (permalink / raw)
To: Dave Taht; +Cc: codel, OpenWrt Development List
Dave Taht wrote:
> In looking over your test scripts and results, it seems possible you
> have gso on.
The driver didn't implement any advanced hardware features, so GSO
was unsupported.
Still, I've found the performance for this SoC is heavily limited by
memory bandwidth, and implementing scatter/gather support and hw
checksum offload improves TCP_STREAM performance greatly (about
doubled throughput).
Case in point, a second device with basically the same SoC, but slightly
faster (384MHz instead of 320MHz) and double the memory bandwidth
(two chips instead of one) reaches twice the speed of the first
device (and also doubles the speed when scatter/gather and checksum
offload are enabled).
I haven't done any further testing with fq_codel yet.
--
Tobias PGP: http://8ef7ddba.uguu.de