Cake - FQ_codel the next generation
* [Cake] Cake latency update
@ 2017-02-09 16:36 Pete Heist
  2017-02-09 20:52 ` Jonathan Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Pete Heist @ 2017-02-09 16:36 UTC (permalink / raw)
  To: cake


I’m seeing good latency results for Cake at lower MCS levels (graphs below), in case that wasn’t already known. These results are with half-duplex rate limiting at around 90% of iperf3 reported throughput. “Default” is LEDE’s default from the ath9k driver.
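As a rough sketch of the arithmetic behind that 90% figure: the shaper rate is the iperf3-reported throughput scaled by a fixed fraction. The function name and the 50 Mbit/s sample value are illustrative assumptions, not measurements from these tests.

```python
def shaper_rate_mbit(measured_mbit: float, fraction: float = 0.9) -> float:
    """Return a shaper rate set to a fraction of the measured throughput."""
    if measured_mbit <= 0:
        raise ValueError("measured throughput must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return measured_mbit * fraction


# Hypothetical example: iperf3 reports 50 Mbit/s at some fixed MCS level.
print(f"{shaper_rate_mbit(50.0):.1f} Mbit/s")
```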

In addition to these fixed MCS level tests, I’m also testing MCS level “steps” (step up and step down in the middle of the test), and a varying MCS level test to simulate what can happen in the field. Full results to follow later...

Pete

http://www.drhleny.cz/bufferbloat/mcstmp/mcs_latency.png <http://www.drhleny.cz/bufferbloat/mcstmp/mcs_latency.png>

http://www.drhleny.cz/bufferbloat/mcstmp/mcs_throughput.png <http://www.drhleny.cz/bufferbloat/mcstmp/mcs_throughput.png>



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Cake] Cake latency update
  2017-02-09 16:36 [Cake] Cake latency update Pete Heist
@ 2017-02-09 20:52 ` Jonathan Morton
  2017-02-10  8:04   ` Pete Heist
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Morton @ 2017-02-09 20:52 UTC (permalink / raw)
  To: Pete Heist; +Cc: cake


> On 9 Feb, 2017, at 18:36, Pete Heist <peteheist@gmail.com> wrote:
> 
> I’m seeing good latency results for Cake at lower MCS levels (graphs below), in case that wasn’t already known.

Yes - despite its complexity, Cake has always performed well on latency in comparison to other qdiscs.

I gather this time you’re comparing it against the mac80211 fq_codel, rather than a conventional qdisc stack?

 - Jonathan Morton


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Cake] Cake latency update
  2017-02-09 20:52 ` Jonathan Morton
@ 2017-02-10  8:04   ` Pete Heist
  2017-02-10  8:49     ` Jonathan Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Pete Heist @ 2017-02-10  8:04 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: cake


> On Feb 9, 2017, at 9:52 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>> On 9 Feb, 2017, at 18:36, Pete Heist <peteheist@gmail.com> wrote:
>> 
>> I’m seeing good latency results for Cake at lower MCS levels (graphs below), in case that wasn’t already known.
> 
> Yes - despite its complexity, Cake has always performed well on latency in comparison to other qdiscs.
> 
> I gather this time you’re comparing it against the mac80211 fq_codel, rather than a conventional qdisc stack?

Yes, this is on stock LEDE, and I hope to complete results on Chaos Calmer.

I haven’t always seen Cake’s latency numbers lower compared to fq_codel. If I take these two:

http://www.drhleny.cz/bufferbloat/fq_codel_fd-wifi-both_40mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_fd-wifi-both_40mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_fd-wifi-both_40mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_fd-wifi-both_40mbit/index.html>

FQ-CoDel did better in this case, but it was running on the AP (OM2P-HS, 520 MHz MIPS 74Kc). When run on higher-powered Intel routers, Cake basically tied fq_codel:

http://www.drhleny.cz/bufferbloat/cake_fd-eth-ap_100ms_40mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_fd-eth-ap_100ms_40mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_fd-eth-ap_target_5ms_interval_100ms_40mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_fd-eth-ap_target_5ms_interval_100ms_40mbit/index.html>

but then as you lower the bitrate, Cake’s performance on the rrul test seems to get better relative to fq_codel:

http://www.drhleny.cz/bufferbloat/mcstmp/mcs_latency.png <http://www.drhleny.cz/bufferbloat/mcstmp/mcs_latency.png>

I look forward to the throughput shifts being solved, where I see results like this:

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_100ms_80mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_100ms_80mbit/index.html>

This is the only thing so far that would keep me from recommending it for my ISP to test. Otherwise the results from Cake are promising… :)

Pete



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Cake] Cake latency update
  2017-02-10  8:04   ` Pete Heist
@ 2017-02-10  8:49     ` Jonathan Morton
  2017-02-10  9:21       ` Pete Heist
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Morton @ 2017-02-10  8:49 UTC (permalink / raw)
  To: Pete Heist; +Cc: cake


> On 10 Feb, 2017, at 10:04, Pete Heist <peteheist@gmail.com> wrote:
> 
> I look forward to the throughput shifts being solved, where I see results like this:
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_100ms_80mbit/index.html

That basically looks like it’s run out of CPU, so there’s hard choices to make over CPU allocation.  Cake isn’t responsible for that allocation, though it *might* be possible to optimise its use of the CPU a little further.

If you can obtain a CPU profile of that workload on that hardware, that might help to direct those efforts.

 - Jonathan Morton


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Cake] Cake latency update
  2017-02-10  8:49     ` Jonathan Morton
@ 2017-02-10  9:21       ` Pete Heist
  2017-02-10  9:29         ` Jonathan Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Pete Heist @ 2017-02-10  9:21 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: cake



> On Feb 10, 2017, at 9:49 AM, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
> 
>> On 10 Feb, 2017, at 10:04, Pete Heist <peteheist@gmail.com> wrote:
>> 
>> I look forward to the throughput shifts being solved, where I see results like this:
>> 
>> http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_100ms_80mbit/index.html
> 
> That basically looks like it’s run out of CPU, so there’s hard choices to make over CPU allocation.  Cake isn’t responsible for that allocation, though it *might* be possible to optimise its use of the CPU a little further.
> 
> If you can obtain a CPU profile of that workload on that hardware, that might help to direct those efforts.
> 
> - Jonathan Morton

I’d be surprised if that were a CPU problem in this case, as that test was run with Cake on a 2.4 GHz Core 2 Duo, not so new, but far more powerful than a typical embedded CPU. Here’s the CPU info:

http://www.drhleny.cz/bufferbloat/hostinfo/mbp_cpuinfo.txt <http://www.drhleny.cz/bufferbloat/hostinfo/mbp_cpuinfo.txt>

Here are the results at various bitrates (all half-duplex rate limiting on this CPU). I find it easiest to just open them in multiple browser tabs and keyboard shift between them to compare:

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_10mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_10mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_20mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_20mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_30mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_30mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_40mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_40mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_50mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_50mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_60mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_60mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_70mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_70mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_75mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_75mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_80mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_80mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_85mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_85mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_90mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_90mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_100mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth-ap_100mbit/index.html>

There are strange shifts at 30 Mbit, 40 Mbit and 70 Mbit, but I think this hardware should be able to handle those speeds. It’s interesting that the throughput shifts don’t seem to affect the latency.

Compare that to the results for HTB+fq_codel, which doesn’t show such shifts:

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_10mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_10mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_20mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_20mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_30mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_30mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_40mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_40mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_50mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_50mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_60mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_60mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_70mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_70mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_75mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_75mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_80mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_80mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_85mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_85mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_90mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_90mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_100mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth-ap_100mbit/index.html>

But if you still think that could be the CPU, I can try to get a CPU profile, if you can direct me on how to do that for Cake…

Pete



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Cake] Cake latency update
  2017-02-10  9:21       ` Pete Heist
@ 2017-02-10  9:29         ` Jonathan Morton
  2017-02-10 10:05           ` Pete Heist
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Morton @ 2017-02-10  9:29 UTC (permalink / raw)
  To: Pete Heist; +Cc: cake


> On 10 Feb, 2017, at 11:21, Pete Heist <peteheist@gmail.com> wrote:
> 
> Here are the results at various bitrates (all half-duplex rate limiting on this CPU).

Hold on a minute.  What does “half-duplex rate limiting” mean exactly?

 - Jonathan Morton


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Cake] Cake latency update
  2017-02-10  9:29         ` Jonathan Morton
@ 2017-02-10 10:05           ` Pete Heist
  2017-02-10 10:31             ` Jonathan Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Pete Heist @ 2017-02-10 10:05 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: cake



> On Feb 10, 2017, at 10:29 AM, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>> On 10 Feb, 2017, at 11:21, Pete Heist <peteheist@gmail.com> wrote:
>> 
>> Here are the results at various bitrates (all half-duplex rate limiting on this CPU).
> 
> Hold on a minute.  What does “half-duplex rate limiting” mean exactly?
> 
> - Jonathan Morton
> 

Yes, that’s a good point. I probably invented the phrase “half-duplex rate limiting”. :) It means that both the ingress and egress have been redirected over the same IFB device and QoS'd together. This seems to work better for the half-duplex nature of Wi-Fi, because then you can use soft rate limiting to control the queue and keep latency low while still allowing almost the full one-directional throughput. You made the suggestion earlier on the Cake list to try it, and it does work for me.
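To make the idea concrete, here is a minimal sketch of what such a setup can look like. It is not the contents of qos.sh, and it is untested on this hardware. The interface names, the rate, and the choice of `prio` as the egress root qdisc (used only as a place to hang the redirect filter) are all assumptions. The code only builds and prints the commands rather than running them.

```python
from typing import List


def half_duplex_commands(iface: str, ifb: str, rate: str) -> List[str]:
    """Build commands that shape ingress and egress of `iface` on one IFB."""
    match_all = "protocol all u32 match u32 0 0"
    redirect = f"action mirred egress redirect dev {ifb}"
    return [
        # Create the IFB device and bring it up.
        f"ip link add name {ifb} type ifb",
        f"ip link set dev {ifb} up",
        # Ingress: attach the ingress qdisc and redirect everything to the IFB.
        f"tc qdisc add dev {iface} handle ffff: ingress",
        f"tc filter add dev {iface} parent ffff: {match_all} {redirect}",
        # Egress: a classful root qdisc is assumed here only to hold the filter.
        f"tc qdisc add dev {iface} root handle 1: prio",
        f"tc filter add dev {iface} parent 1: {match_all} {redirect}",
        # Both directions now share one queue and one rate limit.
        f"tc qdisc add dev {ifb} root cake bandwidth {rate}",
    ]


# Hypothetical names and rate, for illustration only.
for command in half_duplex_commands("eth0", "ifb0", "40mbit"):
    print(command)
```

The point of the single shared Cake instance is that traffic in both directions competes for one rate limit, which is the sense in which the limiting is “half-duplex”.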

By the way, in case you want to see the qdisc setup, it’s there for each host under the qos_* sections on each page. The AP router is “mbp”, which I use for half-duplex limiting; for full-duplex limiting it’s done on both ends of the link, “mini” and “mbp”. And if you want to see the QoS setup script, it’s here:

http://www.drhleny.cz/bufferbloat/qos.sh <http://www.drhleny.cz/bufferbloat/qos.sh>

But what I hadn’t noticed before is that I so far haven’t seen the same throughput shifts in the so-called “full-duplex” rate limiting results, meaning I’m just limiting on the egress of both ends of the point-to-point link.

http://www.drhleny.cz/bufferbloat/cake_fd-eth-both_10mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_fd-eth-both_10mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_fd-eth-both_20mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_fd-eth-both_20mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_fd-eth-both_30mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_fd-eth-both_30mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_fd-eth-both_40mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_fd-eth-both_40mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_fd-eth-both_45mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_fd-eth-both_45mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_fd-eth-ap_70ms_40mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_fd-eth-ap_70ms_40mbit/index.html>

And “full-duplex limiting” for fq_codel:

http://www.drhleny.cz/bufferbloat/fq_codel_fd-eth-both_10mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_fd-eth-both_10mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_fd-eth-both_20mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_fd-eth-both_20mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_fd-eth-both_30mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_fd-eth-both_30mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_fd-eth-both_40mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_fd-eth-both_40mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_fd-eth-both_45mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_fd-eth-both_45mbit/index.html>

But what I _do_ see now in the full-duplex limiting results is not throughput shifts, but occasional latency shifts for individual flows, like in the 30 Mbit full-duplex Cake result:

http://www.drhleny.cz/bufferbloat/cake_fd-eth-both_30mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_fd-eth-both_30mbit/index.html>

and when I was testing lowering Cake’s rtt parameter I saw it (perhaps unrelated to changing the rtt parameter), so here’s rtt 70ms, bandwidth 40 Mbit:

http://www.drhleny.cz/bufferbloat/cake_fd-eth-ap_70ms_40mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_fd-eth-ap_70ms_40mbit/index.html>

I’m still parsing all of these results and haven’t figured everything out yet, so thanks for making me look at that again. It does appear that the throughput shifts may be related to rate limiting both egress and ingress over the same IFB device. However, I have not seen such throughput shifts for HTB+fq_codel when rate limited in the same way...

Pete



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Cake] Cake latency update
  2017-02-10 10:05           ` Pete Heist
@ 2017-02-10 10:31             ` Jonathan Morton
  2017-02-10 11:08               ` Pete Heist
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Morton @ 2017-02-10 10:31 UTC (permalink / raw)
  To: Pete Heist; +Cc: cake


> On 10 Feb, 2017, at 12:05, Pete Heist <peteheist@gmail.com> wrote:
> 
> It means that both the ingress and egress have been redirected over the same IFB device and QoS'd together.

Okay, I guessed as much but wanted to be sure.

I can’t think of any theoretical reason for these results.  Cake’s flow isolation should be robust enough to cope transparently with bidirectional traffic in half-duplex mode.  As you say, a C2D should easily be able to keep up, and at these modest rates I can even discount PCI bandwidth as a concern.  So I might need to try to reproduce it here.

Does the problem go away if you use a wired link with the same setup otherwise?  Or is that inconvenient to try?  I have some ath9k equipped machines, but they would need to be set up.

 - Jonathan Morton


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Cake] Cake latency update
  2017-02-10 10:31             ` Jonathan Morton
@ 2017-02-10 11:08               ` Pete Heist
  2017-02-10 11:35                 ` Sebastian Moeller
  0 siblings, 1 reply; 18+ messages in thread
From: Pete Heist @ 2017-02-10 11:08 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: cake


> On Feb 10, 2017, at 11:31 AM, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
> 
>> On 10 Feb, 2017, at 12:05, Pete Heist <peteheist@gmail.com> wrote:
>> 
>> It means that both the ingress and egress have been redirected over the same IFB device and QoS'd together.
> 
> Okay, I guessed as much but wanted to be sure.
> 
> I can’t think of any theoretical reason for these results.  Cake’s flow isolation should be robust enough to cope transparently with bidirectional traffic in half-duplex mode.  As you say, a C2D should easily be able to keep up, and at these modest rates I can even discount PCI bandwidth as a concern.  So I might need to try to reproduce it here.
> 
> Does the problem go away if you use a wired link with the same setup otherwise?  Or is that inconvenient to try?  I have some ath9k equipped machines, but they would need to be set up.

Not a problem. I’ll run a spread of Cake and fq_codel over Ethernet at various bandwidths. It will be through their Apple USB Ethernet adapters (used now for management), which are also connected through a switch, but I think that setup should be fine for this purpose. Should be done in an hour or so and we’ll see…

Pete


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Cake] Cake latency update
  2017-02-10 11:08               ` Pete Heist
@ 2017-02-10 11:35                 ` Sebastian Moeller
  2017-02-10 12:21                   ` Pete Heist
  0 siblings, 1 reply; 18+ messages in thread
From: Sebastian Moeller @ 2017-02-10 11:35 UTC (permalink / raw)
  To: Pete Heist; +Cc: Jonathan Morton, cake

Hi Pete,

> On Feb 10, 2017, at 12:08, Pete Heist <peteheist@gmail.com> wrote:
> 
> 
>> On Feb 10, 2017, at 11:31 AM, Jonathan Morton <chromatix99@gmail.com> wrote:
>> 
>> 
>>> On 10 Feb, 2017, at 12:05, Pete Heist <peteheist@gmail.com> wrote:
>>> 
>>> It means that both the ingress and egress have been redirected over the same IFB device and QoS'd together.
>> 
>> Okay, I guessed as much but wanted to be sure.
>> 
>> I can’t think of any theoretical reason for these results.  Cake’s flow isolation should be robust enough to cope transparently with bidirectional traffic in half-duplex mode.  As you say, a C2D should easily be able to keep up, and at these modest rates I can even discount PCI bandwidth as a concern.  So I might need to try to reproduce it here.
>> 
>> Does the problem go away if you use a wired link with the same setup otherwise?  Or is that inconvenient to try?  I have some ath9k equipped machines, but they would need to be set up.
> 
> Not a problem. I’ll run a spread of Cake and fq_codel over Ethernet at various bandwidths. It will be through their Apple USB Ethernet adapters (used now for management), which are also connected through a switch, but I think that setup should be fine for this purpose. Should be done in an hour or so and we’ll see…

	I believe the Apple USB dongles are Fast Ethernet only, at least the USB2 types I have available here. That would work for your tested bandwidths, but it will not allow you to test at what shaper rate things go pear-shaped… Also, if Wi-Fi creates a bit more CPU load than wired Ethernet, it _might_ make sense to concurrently exercise the Wi-Fi cards just to re-create the SIRQ load (but probably not as the first experiment ;) ).

Best Regards
	Sebastian 

> 
> Pete
> 
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Cake] Cake latency update
  2017-02-10 11:35                 ` Sebastian Moeller
@ 2017-02-10 12:21                   ` Pete Heist
  2017-02-12 12:43                     ` Pete Heist
  0 siblings, 1 reply; 18+ messages in thread
From: Pete Heist @ 2017-02-10 12:21 UTC (permalink / raw)
  To: Jonathan Morton, Sebastian Moeller; +Cc: cake



> On Feb 10, 2017, at 12:35 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
> Hi Pete,
> 
>> On Feb 10, 2017, at 12:08, Pete Heist <peteheist@gmail.com> wrote:
>> 
>> Not a problem. I’ll run a spread of Cake and fq_codel over Ethernet at various bandwidths. It will be through their Apple USB Ethernet adapters (used now for management), which are also connected through a switch, but I think that setup should be fine for this purpose. Should be done in an hour or so and we’ll see…
> 
> 	I believe the Apple USB dongles are Fast Ethernet only, at least the USB2 types I have available here. That would work for your tested bandwidths, but it will not allow you to test at what shaper rate things go pear-shaped… Also, if Wi-Fi creates a bit more CPU load than wired Ethernet, it _might_ make sense to concurrently exercise the Wi-Fi cards just to re-create the SIRQ load (but probably not as the first experiment ;) ).
> 
> Best Regards
> 	Sebastian 

Hi Sebastian, yes, they’re only 100 Mbit, but that’s enough to cover the rates where I was seeing the problem with Wi-Fi. Also in my test setup there are four nodes connected as described under Configuration #1:

http://www.drhleny.cz/bufferbloat/wifi_bufferbloat.html <http://www.drhleny.cz/bufferbloat/wifi_bufferbloat.html>

I’m running Cake on ‘mini’ and ‘mbp’, and the Wi-Fi radios are only on ‘om1’ and ‘om2’, so the CPU load shouldn’t be different for mini and mbp when connected directly via Ethernet, instead of via Ethernet and a Wi-Fi link, I suppose.

I think we just wanted to see if the throughput shifting would reproduce over Ethernet at the same rates, but so far it didn’t for me, although there are other anomalies that don’t look like the throughput shifts I sent before (there’s a throughput anomaly for Cake 20Mbit and latency anomalies for fq_codel 60Mbit and 90Mbit):

http://www.drhleny.cz/bufferbloat/cake_hd-eth_10mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_10mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_20mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_20mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_30mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_30mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_40mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_40mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_50mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_50mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_60mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_60mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_70mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_70mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_75mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_75mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_80mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_80mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_85mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_85mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_90mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_90mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_100mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_100mbit/index.html>

fq_codel:

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_10mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_10mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_20mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_20mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_30mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_30mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_40mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_40mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_50mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_50mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_60mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_60mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_70mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_70mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_75mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_75mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_80mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_80mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_85mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_85mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_90mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_90mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_100mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_100mbit/index.html>

So that suggests that the throughput shifting problem may also be somehow related to Wi-Fi. I’m still going to be testing Chaos Calmer, as well as two Ubiquiti NanoStation M5’s, though this will take some more time. We might learn some more from this, or if you can reproduce it with ath9k hardware that would be good too...

Thanks,
Pete



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Cake] Cake latency update
  2017-02-10 12:21                   ` Pete Heist
@ 2017-02-12 12:43                     ` Pete Heist
  2017-02-12 13:08                       ` Dave Taht
  0 siblings, 1 reply; 18+ messages in thread
From: Pete Heist @ 2017-02-12 12:43 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Dave Taht, cake


As an update on this, I now suspect a problem with either the Ethernet hardware or (more likely) sky2 driver on ‘mbp’, my 2007 MBP that acts as Flent server and where I’m often using a qdisc. I should have looked at dmesg earlier, as there are log entries like this:

-----
[  221.478753] eth0: hw csum failure
[  221.478756] CPU: 1 PID: 1890 Comm: netserver Tainted: G        W       4.8.0-37-generic #39-Ubuntu
[  221.478757] Hardware name: Apple Inc. MacBookPro4,1/Mac-F42C89C8, BIOS    MBP41.88Z.00C1.B03.0802271651 02/27/08
[  221.478762]  0000000000000286 000000003844a735 ffff9c293fd03ba8 ffffffffb5c30e12
[  221.478765]  ffff9c293a505000 ffffffffb66fb5c0 ffff9c293fd03bc0 ffffffffb5f7c028
[  221.478769]  ffff9c29399ea800 ffff9c293fd03be0 ffffffffb5f71f26 af75267500000000
[  221.478770] Call Trace:
[  221.478775]  <IRQ>  [<ffffffffb5c30e12>] dump_stack+0x63/0x81
[  221.478778]  [<ffffffffb5f7c028>] netdev_rx_csum_fault+0x38/0x40
[  221.478781]  [<ffffffffb5f71f26>] __skb_checksum_complete+0xb6/0xc0
…
[  226.478373] net_ratelimit: 386 callbacks suppressed
[  226.478378] eth0: hw csum failure
[  226.479523] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.8.0-37-generic #39-Ubuntu
[  226.479527] Hardware name: Apple Inc. MacBookPro4,1/Mac-F42C89C8, BIOS    MBP41.88Z.00C1.B03.0802271651 02/27/08
[  226.479533]  0000000000000286 f78e43dca42a09d0 ffff9c293fd03b88 ffffffffb5c30e12
[  226.479542]  ffff9c293a505000 ffffffffb66fb5c0 ffff9c293fd03ba0 ffffffffb5f7c028
[  226.479549]  ffff9c2932093b00 ffff9c293fd03bc0 ffffffffb5f71f26 46898f6100000000
[  226.479557] Call Trace:
[  226.479560]  <IRQ>  [<ffffffffb5c30e12>] dump_stack+0x63/0x81
[  226.479581]  [<ffffffffb5f7c028>] netdev_rx_csum_fault+0x38/0x40
-----

What’s interesting is that they only occur during testing, and only when QoS with rate limiting is applied (Cake, or HTB+X as well). It’s also interesting that they fall on exact 5 second intervals: not every 5 seconds, since sometimes the gap is 10 or 15 seconds, but always a multiple of 5. I went back and looked at my results, and realized that a very large number of the latency and throughput shifts I saw are also quantized to 5 second intervals. I don’t think that’s a coincidence.
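A quick way to check that kind of quantization is to pull the timestamps of the 'hw csum failure' lines out of dmesg and test whether the gaps between them sit near a multiple of 5 seconds. The following is only a sketch: the function names, the 1 second burst window, and the 0.05 second tolerance are assumptions chosen for illustration, and the sample lines are the two timestamps from the excerpt above.

```python
import re

# Matches lines such as "[  221.478753] eth0: hw csum failure".
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\S+: hw csum failure")


def csum_failure_times(dmesg_lines):
    """Extract kernel timestamps (seconds) of 'hw csum failure' lines."""
    times = []
    for line in dmesg_lines:
        found = STAMP.match(line)
        if found:
            times.append(float(found.group(1)))
    return times


def collapse_bursts(times, window=1.0):
    """Keep only the first event of each burst (events closer than `window`)."""
    kept = []
    for stamp in times:
        if not kept or stamp - kept[-1] >= window:
            kept.append(stamp)
    return kept


def gaps_on_grid(times, period=5.0, tolerance=0.05):
    """Return (gap, on_grid) for each pair of consecutive events.

    A gap is on the grid when it lies within `tolerance` seconds of a
    non-zero multiple of `period`.
    """
    result = []
    for earlier, later in zip(times, times[1:]):
        gap = later - earlier
        multiple = round(gap / period)
        on_grid = multiple >= 1 and abs(gap - multiple * period) <= tolerance
        result.append((gap, on_grid))
    return result


sample = [
    "[  221.478753] eth0: hw csum failure",
    "[  226.478373] net_ratelimit: 386 callbacks suppressed",
    "[  226.478378] eth0: hw csum failure",
]
events = collapse_bursts(csum_failure_times(sample))
for gap, on_grid in gaps_on_grid(events):
    print(f"gap {gap:.3f} s, on 5 s grid: {on_grid}")
```

The same gap test could be applied to the times at which the throughput or latency shifts appear in the Flent results, to see whether they line up with the log entries.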

I saw Dave posted something that he saw a similar 'hw csum failure' on raspi earlier in 2016:

https://github.com/raspberrypi/linux/issues/1371 <https://github.com/raspberrypi/linux/issues/1371>

but I’ve since also seen more reports of this over the years, with no clear solution.

Why I saw it more with Cake than other qdiscs I don’t know, but I think it’s safe to say there’s no point in you trying to reproduce this until I can get past this with my hardware, and also I’m likely going to have to do a re-run of all of my tests after this is sorted out.

Pete

> On Feb 10, 2017, at 1:21 PM, Pete Heist <peteheist@gmail.com> wrote:
> 
> 
>> On Feb 10, 2017, at 12:35 PM, Sebastian Moeller <moeller0@gmx.de <mailto:moeller0@gmx.de>> wrote:
>> 
>> Hi Pete,
>> 
>>> On Feb 10, 2017, at 12:08, Pete Heist <peteheist@gmail.com <mailto:peteheist@gmail.com>> wrote:
>>> 
>>> Not a problem. I’ll run a spread of Cake and fq_codel over Ethernet at various bandwidths. It will be through their Apple USB Ethernet adapters (used now for management), which are also connected through a switch, but I think that setup should be fine for this purpose. Should be done in an hour or so and we’ll see…
>> 
>> 	I believe the Apple USB dongles are Fast Ethernet only, at least the USB2 types I have available here. That would work for your tested bandwidths, but it will not allow you to test at what shaper rate things go pear-shaped… Also, if Wi-Fi creates a bit more CPU load than wired Ethernet, it _might_ make sense to concurrently exercise the Wi-Fi cards just to re-create the SIRQ load (but probably not as the first experiment ;) ).
>> 
>> Best Regards
>> 	Sebastian 
> 
> Hi Sebastian, yes, they’re only 100 Mbit, but that’s enough to cover the rates where I was seeing the problem with Wi-Fi. Also in my test setup there are four nodes connected as described under Configuration #1:
> 
> http://www.drhleny.cz/bufferbloat/wifi_bufferbloat.html <http://www.drhleny.cz/bufferbloat/wifi_bufferbloat.html>
> 
> I’m running Cake on ‘mini’ and ‘mbp’, and the Wi-Fi radios are only on ‘om1’ and ‘om2’, so the CPU load shouldn’t be different for mini and mbp when connected directly via Ethernet, instead of via Ethernet and a Wi-Fi link, I suppose.
> 
> I think we just wanted to see if the throughput shifting would reproduce over Ethernet at the same rates, but so far it didn’t for me, although there are other anomalies that don’t look like the throughput shifts I sent before (there’s a throughput anomaly for Cake 20Mbit and latency anomalies for fq_codel 60Mbit and 90Mbit):
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth_10mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_10mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth_20mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_20mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth_30mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_30mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth_40mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_40mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth_50mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_50mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth_60mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_60mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth_70mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_70mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth_75mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_75mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth_80mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_80mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth_85mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_85mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth_90mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_90mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/cake_hd-eth_100mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_100mbit/index.html>
> 
> fq_codel:
> 
> http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_10mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_10mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_20mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_20mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_30mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_30mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_40mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_40mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_50mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_50mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_60mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_60mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_70mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_70mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_75mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_75mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_80mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_80mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_85mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_85mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_90mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_90mbit/index.html>
> 
> http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_100mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_100mbit/index.html>
> 
> So that suggests that the throughput shifting problem may also be somehow related to Wi-Fi. I’m still going to be testing Chaos Calmer, as well as two Ubiquiti NanoStation M5’s, though this will take some more time. We might learn some more from this, or if you can reproduce it with ath9k hardware that would be good too...
> 
> Thanks,
> Pete
> 



* Re: [Cake] Cake latency update
  2017-02-12 12:43                     ` Pete Heist
@ 2017-02-12 13:08                       ` Dave Taht
  2017-02-12 14:12                         ` Pete Heist
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Taht @ 2017-02-12 13:08 UTC (permalink / raw)
  To: Pete Heist; +Cc: Jonathan Morton, Cake List

Disable offloads on the sky hardware and see what happens?

ethtool -K your_device gro off tso off gso off
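
ethtool expects the device name right after -K, followed by feature/value pairs. A minimal sketch that only prints the command (it does not run it); the function name offload_off_cmd and the device eth0 are placeholders of mine, not something from this thread:

```shell
#!/bin/sh
# Print (not run) the ethtool command that switches the given offloads off.
# Usage: offload_off_cmd DEVICE [FEATURE...]
offload_off_cmd() {
    dev=$1
    shift
    # With no features given, default to the three named in the message.
    if [ $# -eq 0 ]; then
        set -- gro tso gso
    fi
    cmd="ethtool -K $dev"
    for feature in "$@"; do
        cmd="$cmd $feature off"
    done
    printf '%s\n' "$cmd"
}

# eth0 is a placeholder device name.
offload_off_cmd eth0
```

Printing the command first makes it easy to eyeball the argument order before running it as root.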

How old is the OS on that hardware - offloads have always been tricksy.

as to why you might be seeing it more with cake, with this stuff on,
you are not necessarily checking every packet for checksums, and flows
are "finer" - more mixed up packets.

capturing these events with tcpdump at various points on the path might help.
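
A rough sketch of what such captures might look like, printed rather than executed. The interface names, the labels and the 192.0.2.10 address are placeholders; the full snap length (-s 0) is my assumption, chosen so that checksums can still be verified from the capture afterwards:

```shell
#!/bin/sh
# Print (not run) a tcpdump command for one capture point on the path.
# Usage: capture_cmd INTERFACE LABEL [HOST]
capture_cmd() {
    iface=$1
    label=$2
    host=${3:-}
    # -n: no name resolution, -s 0: full packets, -w: write a pcap file.
    cmd="tcpdump -n -i $iface -s 0 -w capture-$label.pcap"
    # Optionally restrict the capture to traffic to or from one host.
    if [ -n "$host" ]; then
        cmd="$cmd host $host"
    fi
    printf '%s\n' "$cmd"
}

capture_cmd eth0 sender
capture_cmd eth0 router 192.0.2.10
```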

Still, these are the kinds of baseline deployment issues that block
progress elsewhere. The whole first stage of the rocket has to succeed
in order to test the second. Doesn't matter how good your second stage
is, if you RUD the first.

Good digging!


* Re: [Cake] Cake latency update
  2017-02-12 13:08                       ` Dave Taht
@ 2017-02-12 14:12                         ` Pete Heist
  2017-02-12 16:37                           ` Pete Heist
  2017-02-12 16:41                           ` Jonathan Morton
  0 siblings, 2 replies; 18+ messages in thread
From: Pete Heist @ 2017-02-12 14:12 UTC (permalink / raw)
  To: Dave Taht; +Cc: Jonathan Morton, Cake List

> On Feb 12, 2017, at 2:08 PM, Dave Taht <dave.taht@gmail.com> wrote:
> 
> Disable offloads on the sky hardware and see what happens?
> 
> ethtool -K your_device gro off tso off gso off

I’d already had them disabled for testing in /etc/network/interfaces:

post-up ethtool -K eth0 tso off gso off gro off sg off

On a whim I tried _enabling_ offloads again but it happens in both cases.
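
One way to confirm which offloads are really in effect is to read back the state with ethtool -k (lowercase). A minimal sketch that filters that output for the four features from the post-up line; the function name offloads_still_on and the sample text are illustrative only, and the sample values are made up:

```shell
#!/bin/sh
# List the offload features reported as "on" in `ethtool -k` style
# output read from stdin. Only the four top-level features from the
# post-up line are checked; indented sub-features are ignored.
offloads_still_on() {
    awk -F': *' '
        /^[ \t]*(tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload|scatter-gather):/ {
            name = $1
            sub(/^[ \t]+/, "", name)
            if ($2 ~ /^on/) print name
        }'
}

# Sample text in the shape ethtool prints; the values are invented.
sample='Features for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: off
tcp-segmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: on'

printf '%s\n' "$sample" | offloads_still_on
# On a real host: ethtool -k eth0 | offloads_still_on
```

An empty result would mean all four are off.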

> How old is the OS on that hardware - offloads have always been tricksy.

Pretty new: Ubuntu 16.10 (GNU/Linux 4.8.0-37-generic x86_64)

> as to why you might be seeing it more with cake, with this stuff on,
> you are not necessarily checking every packet for checksums, and flows
> are "finer" - more mixed up packets.
> 
> capturing these events with tcpdump at various points on the path might help.
> 
> Still, these are the kinds of baseline deployment issues that block
> progress elsewhere. The whole first stage of the rocket has to succeed
> in order to test the second. Doesn't matter how good your second stage
> is, if you RUD the first.

It must be a challenge for you guys sometimes! Unless I can find an obvious solution soon it’s probably going to mean a hardware change for me. But there are only a few options I see with what’s available to me now:

1) Using my Apple USB Ethernet adapter for testing instead of just management. Not excited about that- no BQL? USB latency? fq_codel on this adapter over Ethernet reduces Flent RRUL average latency to a pretty solid 1ms, looks sufficient? (Perhaps no coincidence that USB 2.0 start-of-frame is sent every 1 ms.)

2) Using a 1.25 GHz Mac Mini PPC G4 I have lying around. I successfully ran fq_codel for ADSL on that box in the past, but at 5 / 0.5 Mbps. Accurate Flent results running Cake at 80 Mbps? Timer issues? Also I think no BQL support with the Sun GEM chipset: https://github.com/torvalds/linux/blob/master/drivers/net/ethernet/sun/sungem.c <https://github.com/torvalds/linux/blob/master/drivers/net/ethernet/sun/sungem.c>.

3) Using two of these for my routers instead: https://pcengines.ch/alix2d2.htm <https://pcengines.ch/alix2d2.htm>, which I’ll want to test later anyway. They’re not new. 500 MHz AMD Geode LX800. Pre-Obama (June 2008). Not even sure yet if I’ll rate limit properly at 80-90 Mbit with these.

Any opinion on a ‘best’ alternative among these? I’m leaning towards #1 for ease. Otherwise I’ll make my way, and may have to dig up some better hardware.

Pete



* Re: [Cake] Cake latency update
  2017-02-12 14:12                         ` Pete Heist
@ 2017-02-12 16:37                           ` Pete Heist
  2017-02-12 16:41                           ` Jonathan Morton
  1 sibling, 0 replies; 18+ messages in thread
From: Pete Heist @ 2017-02-12 16:37 UTC (permalink / raw)
  To: Dave Taht; +Cc: Jonathan Morton, Cake List

Or, I can do: ethtool -K eth0 tx off rx off

and disable the checksums entirely. That stops the messages, but unfortunately it doesn’t appear to be the end of the throughput shifts.

But this experience has made me want to look at all of this on other hardware, so that’s next.

> On Feb 12, 2017, at 3:12 PM, Pete Heist <peteheist@gmail.com> wrote:
> 
>> On Feb 12, 2017, at 2:08 PM, Dave Taht <dave.taht@gmail.com <mailto:dave.taht@gmail.com>> wrote:
>> 
>> Disable offloads on the sky hardware and see what happens?
>> 
>> ethtool -K your_device gro off tso off gso off
> 
> I’d already had them disabled for testing in /etc/network/interfaces:
> 
> post-up ethtool -K eth0 tso off gso off gro off sg off
> 
> On a whim I tried _enabling_ offloads again but it happens in both cases.
> 
>> How old is the OS on that hardware - offloads have always been tricksy.
> 
> Pretty new: Ubuntu 16.10 (GNU/Linux 4.8.0-37-generic x86_64)
> 
>> as to why you might be seeing it more with cake, with this stuff on,
>> you are not necessarily checking every packet for checksums, and flows
>> are "finer" - more mixed up packets.
>> 
>> capturing these events with tcpdump at various points on the path might help.
>> 
>> Still, these are the kinds of baseline deployment issues that block
>> progress elsewhere. The whole first stage of the rocket has to succeed
>> in order to test the second. Doesn't matter how good your second stage
>> is, if you RUD the first.
> 
> It must be a challenge for you guys sometimes! Unless I can find an obvious solution soon it’s probably going to mean a hardware change for me. But there are only a few options I see with what’s available to me now:
> 
> 1) Using my Apple USB Ethernet adapter for testing instead of just management. Not excited about that- no BQL? USB latency? fq_codel on this adapter over Ethernet reduces Flent RRUL average latency to a pretty solid 1ms, looks sufficient? (Perhaps no coincidence that USB 2.0 start-of-frame is sent every 1 ms.)
> 
> 2) Using a 1.25 GHz Mac Mini PPC G4 I have laying around. I successfully ran fq_codel for ADSL on that box in the past, but at 5 / 0.5 Mbps. Accurate Flent results running Cake at 80 Mbps? Timer issues? Also I think no BQL support with the Sun GEM chipset: https://github.com/torvalds/linux/blob/master/drivers/net/ethernet/sun/sungem.c <https://github.com/torvalds/linux/blob/master/drivers/net/ethernet/sun/sungem.c>.
> 
> 3) Using two of these for my routers instead: https://pcengines.ch/alix2d2.htm <https://pcengines.ch/alix2d2.htm>, which I’ll want to test later anyway. They’re not new. 500 MHz AMD Geode LX800. Pre-Obama (June 2008). Not even sure yet if I’ll rate limit properly at 80-90 Mbit with these.
> 
> Any opinion on a ‘best’ alternative among these? I’m leaning towards #1 for ease. Otherwise I’ll make my way, and may have to dig up some better hardware.
> 
> Pete
> 



* Re: [Cake] Cake latency update
  2017-02-12 14:12                         ` Pete Heist
  2017-02-12 16:37                           ` Pete Heist
@ 2017-02-12 16:41                           ` Jonathan Morton
  2017-02-12 17:15                             ` Pete Heist
  1 sibling, 1 reply; 18+ messages in thread
From: Jonathan Morton @ 2017-02-12 16:41 UTC (permalink / raw)
  To: Pete Heist; +Cc: Dave Taht, Cake List


> On 12 Feb, 2017, at 16:12, Pete Heist <peteheist@gmail.com> wrote:
> 
> 2) Using a 1.25 GHz Mac Mini PPC G4 I have laying around. I successfully ran fq_codel for ADSL on that box in the past, but at 5 / 0.5 Mbps. Accurate Flent results running Cake at 80 Mbps? Timer issues?

That should be absolutely fine at 80Mbps.  PowerPC Macs seem to have good hi-res timers in my experience, and the Sun GEM is a relatively good NIC.  The main limitation is that it’s PCI rather than PCIe, so it can’t reach the full GigE line rate, but it does go well above 100Mbps.

Don’t worry about BQL, since you’re not relying on NIC backpressure to control queuing.
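
For anyone who wants to look anyway, the kernel exposes per-queue BQL state under sysfs. A small sketch, assuming the usual /sys/class/net/<dev>/queues/tx-N/byte_queue_limits layout; bql_status and the SYSFS_NET override are names invented here for illustration. As far as I know the files can exist even when the driver never uses BQL, so a non-zero inflight under load is the more telling sign:

```shell
#!/bin/sh
# Print the BQL limit and in-flight byte count for each TX queue of a
# device. SYSFS_NET may be overridden for testing; by default it points
# at the real sysfs tree.
bql_status() {
    dev=$1
    root=${SYSFS_NET:-/sys/class/net}
    for q in "$root/$dev"/queues/tx-*; do
        bql="$q/byte_queue_limits"
        if [ -r "$bql/limit" ] && [ -r "$bql/inflight" ]; then
            printf '%s limit=%s inflight=%s\n' \
                "${q##*/}" "$(cat "$bql/limit")" "$(cat "$bql/inflight")"
        fi
    done
}

# eth0 is a placeholder; nothing is printed if the device or the BQL
# files are absent.
bql_status eth0
```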

 - Jonathan Morton



* Re: [Cake] Cake latency update
  2017-02-12 16:41                           ` Jonathan Morton
@ 2017-02-12 17:15                             ` Pete Heist
  2017-02-14 10:02                               ` Pete Heist
  0 siblings, 1 reply; 18+ messages in thread
From: Pete Heist @ 2017-02-12 17:15 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Cake List


> On Feb 12, 2017, at 5:41 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>> On 12 Feb, 2017, at 16:12, Pete Heist <peteheist@gmail.com> wrote:
>> 
>> 2) Using a 1.25 GHz Mac Mini PPC G4 I have laying around. I successfully ran fq_codel for ADSL on that box in the past, but at 5 / 0.5 Mbps. Accurate Flent results running Cake at 80 Mbps? Timer issues?
> 
> That should be absolutely fine at 80Mbps.  PowerPC Macs seem to have good hi-res timers in my experience, and the Sun GEM is a relatively good NIC.  The main limitation is that it’s PCI rather than PCIe, so it can’t reach the full GigE line rate, but it does go well above 100Mbps.
> 
> Don’t worry about BQL, since you’re not relying on NIC backpressure to control queuing.

Ok, good news, then that’s the way I’ll go! Thanks for that, more results later...

Pete



* Re: [Cake] Cake latency update
  2017-02-12 17:15                             ` Pete Heist
@ 2017-02-14 10:02                               ` Pete Heist
  0 siblings, 0 replies; 18+ messages in thread
From: Pete Heist @ 2017-02-14 10:02 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Cake List


> On Feb 12, 2017, at 6:15 PM, Pete Heist <peteheist@gmail.com> wrote:
> 
>> On Feb 12, 2017, at 5:41 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
>> 
>>> On 12 Feb, 2017, at 16:12, Pete Heist <peteheist@gmail.com> wrote:
>>> 
>>> 2) Using a 1.25 GHz Mac Mini PPC G4 I have laying around. I successfully ran fq_codel for ADSL on that box in the past, but at 5 / 0.5 Mbps. Accurate Flent results running Cake at 80 Mbps? Timer issues?
>> 
>> That should be absolutely fine at 80Mbps.  PowerPC Macs seem to have good hi-res timers in my experience, and the Sun GEM is a relatively good NIC.  The main limitation is that it’s PCI rather than PCIe, so it can’t reach the full GigE line rate, but it does go well above 100Mbps.
>> 
>> Don’t worry about BQL, since you’re not relying on NIC backpressure to control queuing.
> 
> Ok, good news, then that’s the way I’ll go! Thanks for that, more results later...


Replacing the MBP (sky2 Ethernet driver) with a PPC G4 with the Sun GEM appears to have fixed the class of five-second-throughput-shifts that I reported earlier. See the much smoother Cake results below. :) I’ll be re-running all of my other tests again.

It does raise a new question though. In the 10 Mbit and 20 Mbit Cake tests, one of the TCP download flows gets a bit starved relative to the average. I was seeing this earlier also before I moved to the PPC. I don’t see this at 30Mbit and above. There is also a latency variation in 75Mbit fq_codel, so I know that sometimes you can just say “it’s Wi-Fi”. I don’t have the time to check this on Ethernet now, but maybe later.
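
For readers trying to reproduce the low-rate cases, here is a generic sketch of how each qdisc can be rate limited with tc. It only prints the commands, it shapes plain egress on one interface, and it is not the half-duplex arrangement used for these tests; shaper_cmds and eth0 are placeholders:

```shell
#!/bin/sh
# Print (not run) tc commands that shape egress on DEV to RATE Mbit/s.
# Usage: shaper_cmds QDISC DEV RATE_MBIT
shaper_cmds() {
    qdisc=$1
    dev=$2
    rate=$3
    case $qdisc in
        cake)
            # Cake has a built-in shaper.
            printf 'tc qdisc replace dev %s root cake bandwidth %smbit\n' "$dev" "$rate"
            ;;
        fq_codel)
            # fq_codel has no shaper of its own, so an HTB class
            # supplies the rate limit in this sketch.
            printf 'tc qdisc replace dev %s root handle 1: htb default 1\n' "$dev"
            printf 'tc class add dev %s parent 1: classid 1:1 htb rate %smbit\n' "$dev" "$rate"
            printf 'tc qdisc add dev %s parent 1:1 fq_codel\n' "$dev"
            ;;
        *)
            printf 'unknown qdisc: %s\n' "$qdisc" >&2
            return 1
            ;;
    esac
}

# The two rates where one download flow was partially starved.
for rate in 10 20; do
    shaper_cmds cake eth0 "$rate"
    shaper_cmds fq_codel eth0 "$rate"
done
```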

Cake:

http://www.drhleny.cz/bufferbloat/cake_hd-eth_10mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_10mbit/index.html> (partial starvation for one tcp download flow)

http://www.drhleny.cz/bufferbloat/cake_hd-eth_20mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_20mbit/index.html> (partial starvation for one tcp download flow)

http://www.drhleny.cz/bufferbloat/cake_hd-eth_30mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_30mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_40mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_40mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_50mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_50mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_60mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_60mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_70mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_70mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_75mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_75mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_80mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_80mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_85mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_85mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_90mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_90mbit/index.html>

http://www.drhleny.cz/bufferbloat/cake_hd-eth_100mbit/index.html <http://www.drhleny.cz/bufferbloat/cake_hd-eth_100mbit/index.html>

fq_codel:

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_10mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_10mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_20mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_20mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_30mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_30mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_40mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_40mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_50mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_50mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_60mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_60mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_70mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_70mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_75mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_75mbit/index.html> (latency variation and something wrong in one udp flow)

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_80mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_80mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_85mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_85mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_90mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_90mbit/index.html>

http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_100mbit/index.html <http://www.drhleny.cz/bufferbloat/fq_codel_hd-eth_100mbit/index.html>
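
The result pages above follow a <qdisc>_hd-eth_<rate>mbit naming pattern. A sketch of how a sweep with the same names could be generated with Flent's RRUL test; it only prints the commands, and the server address, the 60 second length and the function names are assumptions for illustration, not the settings actually used:

```shell
#!/bin/sh
# Print (not run) one Flent RRUL invocation per qdisc and rate, titled
# to match the <qdisc>_hd-eth_<rate>mbit naming of the result pages.
# SERVER is a placeholder netperf host.
SERVER=${SERVER:-192.0.2.1}
RATES="10 20 30 40 50 60 70 75 80 85 90 100"

# Build the test name for one qdisc and rate.
test_name() {
    printf '%s_hd-eth_%smbit\n' "$1" "$2"
}

# Emit the full sweep: every rate for Cake, then every rate for fq_codel.
sweep_cmds() {
    for qdisc in cake fq_codel; do
        for rate in $RATES; do
            name=$(test_name "$qdisc" "$rate")
            printf 'flent rrul -l 60 -H %s -t %s\n' "$SERVER" "$name"
        done
    done
}

sweep_cmds
```

The shaper still has to be reconfigured between runs; this sketch covers only the naming and the Flent side.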

Pete




Thread overview: 18+ messages
2017-02-09 16:36 [Cake] Cake latency update Pete Heist
2017-02-09 20:52 ` Jonathan Morton
2017-02-10  8:04   ` Pete Heist
2017-02-10  8:49     ` Jonathan Morton
2017-02-10  9:21       ` Pete Heist
2017-02-10  9:29         ` Jonathan Morton
2017-02-10 10:05           ` Pete Heist
2017-02-10 10:31             ` Jonathan Morton
2017-02-10 11:08               ` Pete Heist
2017-02-10 11:35                 ` Sebastian Moeller
2017-02-10 12:21                   ` Pete Heist
2017-02-12 12:43                     ` Pete Heist
2017-02-12 13:08                       ` Dave Taht
2017-02-12 14:12                         ` Pete Heist
2017-02-12 16:37                           ` Pete Heist
2017-02-12 16:41                           ` Jonathan Morton
2017-02-12 17:15                             ` Pete Heist
2017-02-14 10:02                               ` Pete Heist
