[Cake] Cake3 - source code and some questions

Cake - FQ_codel the next generation

* [Cake] Cake3 - source code and some questions
@ 2015-04-12  9:39 Adrian Popescu
  2015-04-12  9:58 ` Jonathan Morton
  2015-04-12 10:24 ` Jonathan Morton
  0 siblings, 2 replies; 18+ messages in thread
From: Adrian Popescu @ 2015-04-12  9:39 UTC
  To: cake

Hello everyone,

Is cake3's source available for testing? Is there a way to test cake3 today?

Does cake3 solve the problems fq_codel was having with high bandwidth
and low latency connections? Does it still require tuning for low
bandwidth and high bandwidth with low latency?

Has cake3 been tested on 10 Gbps networks? Can cake3 be used in a
hierarchical setup, like htb?

Thanks,
Adrian

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Cake] Cake3 - source code and some questions
  2015-04-12  9:39 [Cake] Cake3 - source code and some questions Adrian Popescu
@ 2015-04-12  9:58 ` Jonathan Morton
  2015-04-12 10:24 ` Jonathan Morton
  1 sibling, 0 replies; 18+ messages in thread
From: Jonathan Morton @ 2015-04-12  9:58 UTC
  To: Adrian Popescu; +Cc: cake

To answer the later questions first:

Cake is designed for use at the internet edge, and therefore assumes
internet scale RTTs. It does not have any sort of tuning for datacentre
networks. But it does work and has a measurable effect on home LANs, even
though it's not specifically tuned for that.

If there is sufficient demand for cake's features on such networks, then a
flag could be added to provide appropriate tuning for low RTTs. Fq_codel
can already be tuned this way by adjusting the target and interval
parameters.
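
As a rough illustration of that kind of fq_codel tuning (a sketch only: the
interface name, and the choice of setting interval to the expected RTT with
target at 5% of it, mirroring the ratio of the 5 ms / 100 ms defaults, are
assumptions and not something prescribed in this thread):

```python
def fq_codel_tc_command(dev: str, rtt_ms: float) -> str:
    """Build a tc command line that retunes fq_codel for an expected RTT.

    Assumption: interval ~= expected RTT, target = 5% of interval.
    """
    interval_us = int(round(rtt_ms * 1000))
    target_us = max(1, interval_us // 20)  # 5% of interval, at least 1 us
    return (f"tc qdisc replace dev {dev} root fq_codel "
            f"target {target_us}us interval {interval_us}us")


if __name__ == "__main__":
    # Hypothetical LAN with a ~10 ms worst-case RTT on interface eth0.
    print(fq_codel_tc_command("eth0", 10.0))
```

The function only builds the command string; whether the resulting values
suit a given network still has to be measured.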

Cake does have tuning for low bandwidth links (increasing codel's target
and interval), and has been run (but not yet extensively tested) at 64kbps.
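
To see why the target has to grow at low bandwidth, it helps to compute how
long a single full-size packet occupies the link. The sketch below does that;
the 1514-byte frame size and the "1.5 packet times" rule are illustrative
assumptions, not necessarily the rule cake actually applies:

```python
def serialization_time_ms(packet_bytes: int, rate_bps: float) -> float:
    """Time needed to put one packet on the wire, in milliseconds."""
    return packet_bytes * 8 * 1000.0 / rate_bps


def scaled_target_ms(rate_bps: float, mtu_bytes: int = 1514,
                     default_target_ms: float = 5.0,
                     factor: float = 1.5) -> float:
    """Assumed rule of thumb: never let target drop below `factor`
    serialization times of a full-size packet."""
    return max(default_target_ms,
               factor * serialization_time_ms(mtu_bytes, rate_bps))


if __name__ == "__main__":
    for rate in (64e3, 1e6, 100e6):
        print(f"{rate:>12.0f} bit/s: packet time "
              f"{serialization_time_ms(1514, rate):8.3f} ms, "
              f"target {scaled_target_ms(rate):8.3f} ms")
```

At 64 kbps one such packet takes far longer than 5 ms to send, so a fixed
5 ms target could never be met.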

We have cake's code in a git repo, but I don't think we have anonymous pull
access to it. Toke?

- Jonathan Morton

* Re: [Cake] Cake3 - source code and some questions
  2015-04-12  9:39 [Cake] Cake3 - source code and some questions Adrian Popescu
  2015-04-12  9:58 ` Jonathan Morton
@ 2015-04-12 10:24 ` Jonathan Morton
  2015-04-12 12:33   ` Adrian Popescu
  1 sibling, 1 reply; 18+ messages in thread
From: Jonathan Morton @ 2015-04-12 10:24 UTC
  To: Adrian Popescu; +Cc: cake

> Can cake3 be used in a hierarchical setup, like htb?

This is a trickier question. Cake is designed to be as simple to configure
as possible, and a classful setup would work against that (it would
instantly triple the number of tc invocations required). However, it could
be used as a leaf qdisc with a separate classifier, if you really wanted
to. I have trouble imagining why, though.

To put it simply, we want to build the functionality for the most common
use cases into cake natively, especially when they don't do any harm to be
left switched on (by default) when not strictly needed.

- Jonathan Morton

* Re: [Cake] Cake3 - source code and some questions
  2015-04-12 10:24 ` Jonathan Morton
@ 2015-04-12 12:33   ` Adrian Popescu
  2015-04-12 18:57     ` Jonathan Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Adrian Popescu @ 2015-04-12 12:33 UTC
  To: Jonathan Morton; +Cc: cake

Thank you, Jonathan. High bandwidth home networks are becoming more
and more common. FTTH has very low latency of 1-2ms. fq_codel has
exhibited some weird behaviour, but I can't put my finger on it
because CPU usage wasn't a problem. Figuring out what's going on at
the kernel or fq_codel level can be complicated.

These high bandwidth connections with low latency are somewhat similar
to data centre networks. Some who co-locate their servers have 100 Mbps
of symmetric bandwidth outside of their network and they have 1 Gbps
or 10 Gbps within their network.

Setting up fq_codel properly can be difficult because the quantum, the
target and the interval need to be adjusted on high bandwidth & low
latency links. Figuring out if the changes have helped or hurt is
difficult because the network conditions can be different.

I can't wait to test cake3.

Regards,
Adrian

* Re: [Cake] Cake3 - source code and some questions
  2015-04-12 12:33   ` Adrian Popescu
@ 2015-04-12 18:57     ` Jonathan Morton
  2015-04-16 12:14       ` Adrian Popescu
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Morton @ 2015-04-12 18:57 UTC
  To: Adrian Popescu; +Cc: cake

This is a question worth discussing. There is a certain amount of
controversy over the actual meaning, utility and constraints of the target
parameter, although the interval parameter is fairly well understood as a
rough (order of magnitude) estimate of the prevailing RTT.

Note that even with FTTP, while the RTT to the head end may be unusually
low, the RTTs to interesting servers will still be in roughly the same
range as on a good quality ADSL link. This is especially true if the
interesting servers tend to be at the other end of the country/continent or
on the other side of an ocean. This variability is within Codel's capacity.

Due to the Diffserv and flow isolation features of cake, the latency
minimization feature provided by Codel also isn't as critical to tune as it
is when standalone, or with a lesser flow isolation system such as
fq_codel's collision prone hash function. I think this is sufficient to
make further tuning unnecessary up to 1 gigabit, whether on a LAN or over
the internet, and since I haven't seen any home affordable gear for more
than a gigabit yet - marketing tricks by Wi-Fi vendors aside - I don't
think it's worth thinking too hard about pushing that higher in the home
use case. Fq_codel also works quite well on a LAN already.
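
How collision-prone a plain hash is can be estimated with the birthday
problem. The sketch below assumes a perfectly uniform hash and fq_codel's
default of 1024 queues; both are simplifying assumptions for illustration:

```python
def collision_probability(flows: int, queues: int = 1024) -> float:
    """Probability that at least two of `flows` active flows share a queue,
    assuming each flow is hashed uniformly and independently."""
    if flows > queues:
        return 1.0  # pigeonhole: more flows than queues must collide
    p_no_collision = 1.0
    for i in range(flows):
        p_no_collision *= (queues - i) / queues
    return 1.0 - p_no_collision


if __name__ == "__main__":
    for n in (2, 10, 50, 100):
        print(f"{n:4d} flows -> P(collision) = {collision_probability(n):.3f}")
```

The probability climbs quickly with the number of simultaneous flows, which
is the motivation for a better flow isolation scheme.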

The difference in a datacentre is that typical native RTTs are measured in
microseconds, well outside the range that Codel is by default tuned for.
The bandwidths involved also mean that the standard 5ms target corresponds to a
large amount of buffered data. Additionally, we're inherently talking about
a wholly local environment, so there is no need to adapt to internet scale
RTTs.
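
The amount of data behind that statement is simple arithmetic: a standing
queue of `target` milliseconds at a given line rate holds rate x target
bits. A small sketch (the example rates are chosen for illustration):

```python
def standing_queue_bytes(rate_bps: float, target_ms: float = 5.0) -> float:
    """Bytes queued when the sojourn time equals `target_ms` at `rate_bps`."""
    return rate_bps * target_ms / 8000.0  # /1000 for ms, /8 for bits->bytes


if __name__ == "__main__":
    for label, rate in (("100 Mbit/s", 100e6),
                        ("1 Gbit/s", 1e9),
                        ("10 Gbit/s", 10e9)):
        print(f"{label:>10}: {standing_queue_bytes(rate) / 1e6:.4f} MB "
              f"buffered at a 5 ms target")
```

At 10 Gbit/s this works out to several megabytes, compared with tens of
kilobytes at 100 Mbit/s.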

For those cases where you do have a datacentre like environment connected
to an internet like environment, the solution is obvious. Deploy datacentre
tuned AQM (which might be fq_codel with altered parameters) within the
datacentre, and put cake at the gateway(s) to the internet. Job done.

- Jonathan Morton

* Re: [Cake] Cake3 - source code and some questions
  2015-04-12 18:57     ` Jonathan Morton
@ 2015-04-16 12:14       ` Adrian Popescu
  2015-04-16 13:25         ` Jonathan Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Adrian Popescu @ 2015-04-16 12:14 UTC
  To: Jonathan Morton; +Cc: cake

That answers my question. Will the changes to codel made by cake be
put into fq_codel?

* Re: [Cake] Cake3 - source code and some questions
  2015-04-16 12:14       ` Adrian Popescu
@ 2015-04-16 13:25         ` Jonathan Morton
  2015-04-16 13:48           ` Adrian Popescu
  2015-04-16 13:49           ` Sebastian Moeller
  0 siblings, 2 replies; 18+ messages in thread
From: Jonathan Morton @ 2015-04-16 13:25 UTC
  To: Adrian Popescu; +Cc: cake

> On 16 Apr, 2015, at 15:14, Adrian Popescu <adriannnpopescu@gmail.com> wrote:
> 
> Will the changes to codel made by cake be put into fq_codel?

This might be a more complex question than you realise.

The most likely feature of cake to be implemented in fq_codel might be the set-associative hash, since it’s almost a pure win.  That would almost be a cut-and-paste operation, but due to fq_codel’s de-facto status as a “standard candle” in research, it would need to be made configurable, at least to make turning it off easy.  And that isn’t really a “codel” feature change, since it influences the FQ layer exclusively.
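
For readers unfamiliar with the idea, here is a minimal sketch of a set-associative flow table. It is not cake's code: the 1024-queue / 8-way geometry, the class name and the fallback policy when a set is full are all assumptions made for illustration.

```python
class SetAssociativeFlowTable:
    """Map flows to queues; a flow may use any of `ways` queues in its set."""

    def __init__(self, num_queues: int = 1024, ways: int = 8):
        assert num_queues % ways == 0
        self.ways = ways
        self.num_sets = num_queues // ways
        # tags[q] is the flow currently owning queue q, or None if free.
        self.tags = [None] * num_queues

    def queue_for(self, flow) -> int:
        base = (hash(flow) % self.num_sets) * self.ways
        slots = range(base, base + self.ways)
        # 1. The flow already owns a queue in its set.
        for q in slots:
            if self.tags[q] == flow:
                return q
        # 2. Otherwise claim the first free queue in the set.
        for q in slots:
            if self.tags[q] is None:
                self.tags[q] = flow
                return q
        # 3. Set is full: share the first queue (a genuine collision).
        return base

    def release(self, flow) -> None:
        """Free the queue owned by `flow`, e.g. once it has drained."""
        base = (hash(flow) % self.num_sets) * self.ways
        for q in range(base, base + self.ways):
            if self.tags[q] == flow:
                self.tags[q] = None


if __name__ == "__main__":
    table = SetAssociativeFlowTable()
    # Integer flow IDs that all land in set 0 (128 sets with the defaults).
    print([table.queue_for(i * 128) for i in range(9)])
```

With a direct-mapped hash, any two flows with the same hash value share a queue; here up to `ways` such flows still get queues of their own, and only a further one has to share.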

The codel parameter tuning done by cake isn’t applicable to fq_codel, because the bandwidth information that this tuning relies on isn’t available (not even when it’s stacked with HTB).  That’s why cake defaults to something very like the standard codel parameters when the internal shaper is disabled (“unlimited” mode).  That in turn is one reason why those defaults are also used at “sufficiently high” bandwidths: there is then no sharp discontinuity in the behaviour when the bandwidth is increased beyond the link rate and on to infinity (unlimited mode actually works by setting the shaper to infinite bandwidth, i.e. zero time per byte).  The other reason, as I previously noted, is that the parameters depend on the total RTT as well as the packet rate.

Which leaves algorithmic changes to codel itself.  It’s certainly possible to drop these (fairly subtle) changes in, but we should probably spend some more time measuring the effects of these changes and finalising them.  We’re considering doing a major refactor of the code, which might make it harder to perform a drop-in replacement.  In any case, FQ does mean that codel’s precise behaviour is less critical than it might otherwise be, and there are valid arguments - such as the “standard candle” one - for leaving it alone.

 - Jonathan Morton

* Re: [Cake] Cake3 - source code and some questions
  2015-04-16 13:25         ` Jonathan Morton
@ 2015-04-16 13:48           ` Adrian Popescu
  2015-04-16 19:26             ` Dave Taht
  2015-04-16 13:49           ` Sebastian Moeller
  1 sibling, 1 reply; 18+ messages in thread
From: Adrian Popescu @ 2015-04-16 13:48 UTC
  To: Jonathan Morton; +Cc: cake

I've discovered there are other problems in the Linux networking stack
which don't seem to be related to fq_codel, qdiscs, AQM and HTB.

There are latency-inducing bugs in the Ethernet network drivers
of many network adapters, including e1000e, or in the kernel itself.
Some kernels are better. The newest ones have severe regressions in
this area.

I was under the impression there was a problem in codel or fq_codel that
led to very frequent latency micro-spikes of between 1 and 3
milliseconds. It also seemed to produce bigger latency spikes under
moderate load. No amount of tuning and disabling of offloads helped
with this.

Imagine having 2 milliseconds of latency to your ISP and having your
router induce between 3 and 5 milliseconds of latency for every flow.
It's not particularly helpful for low latency paths on high bandwidth
links. Adding more latency in both directions to high latency paths is
even worse.

These problems are the reason behind starting this thread. I believed
these problems to be related to fq_codel or to the codel algorithm
itself.

My question about porting these improvements to codel and fq_codel was
strictly about the tighter recovery, better invsqrt and other codel
enhancements mentioned on the wiki page.

The solution to this unstable latency will necessitate migration to
another platform without Linux. I'm aware cake and fq_codel won't fix
this problem on Linux.

* Re: [Cake] Cake3 - source code and some questions
  2015-04-16 13:48           ` Adrian Popescu
@ 2015-04-16 19:26             ` Dave Taht
  2015-04-22 21:02               ` Adrian Popescu
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Taht @ 2015-04-16 19:26 UTC
  To: Adrian Popescu; +Cc: cake

On Thu, Apr 16, 2015 at 6:48 AM, Adrian Popescu
<adriannnpopescu@gmail.com> wrote:
> I've discovered there are other problems in the Linux networking stack
> which don't seem to be related to fq_codel, qdiscs, AQM and HTB.
>
> There are latency-inducing bugs in the Ethernet network drivers
> of many network adapters, including e1000e, or in the kernel itself.
> Some kernels are better. The newest ones have severe regressions in
> this area.

There is a multiplicity of problems in doing real-time packet processing
on processors that generally cannot context switch in under 1000
cycles anymore, and on virtual machines, oh, my!

One of the biggest fixes for Linux networking was the BQL infrastructure, which
is in something like 24 drivers now (including the e1000e), but things like
TSO and GRO offloads remain a PITA.

There are still many drivers left to fix. One (mvneta) is really bugging me
of late....

> I was under the impression there was a problem in codel or fq_codel that
> led to very frequent latency micro-spikes of between 1 and 3
> milliseconds.

In my world it is sometimes hard to worry about stuff down in this latency
noise level. This is so far less than what these algorithms generally
solve in the first place, that I generally have treated it as noise.

That said, I remain sad that the linux-rt folk are so underfunded: as they
have the tools and expertise to try and get more consistent low latency
with full throughput.

> It also seemed to produce bigger latency spikes under
> moderate load. No amount of tuning and disabling of offloads helped
> with this.

Regrettably you are not providing enough details. Repeatable
tests and actual measurements are always helpful.

There is presently what is viewed as a regression by the Xen folk
involving the TCP small queues subsystem, which is being discussed
heavily on lkml.

What are you referring to specifically?

>
> Imagine having 2 milliseconds of latency to your ISP and having your
> router induce between 3 and 5 milliseconds of latency for every flow.
> It's not particularly helpful for low latency paths on high bandwidth
> links. Adding more latency in both directions to high latency paths is
> even worse.

Imagine having your router induce seconds or 10s of seconds latency
- the state of affairs on most edge devices today.

I DO care about latency and jitter to this level, but it is very hard to
isolate and measure.

>
> These problems are the reason behind starting this thread. I believed
> these problems to be related to fq_codel or to the codel algorithm
> itself.

Not enough detail: what exactly are you measuring, on what hardware,
using what tools?

> My question about porting these improvements to codel and fq_codel was
> strictly about the tighter recovery, better invsqrt and other codel

In the tree and in cerowrt for 2 years there have been multiple variants of the
algorithms under test, individually. Cake rolls up the best of these
attempts thus far, and each of those separate models remains in-tree
for further testing against all the other variables.

In no case have I cared one whit about sub-3ms worth of jitter. I was
mostly looking to get faster convergence, better utilization at longer
RTTs, better behavior at > 100 Mbit, and more filling of the pipe in
general.

I happen to not agree with Jonathan that the better invsqrt (cache) now in cake
accomplishes anything, but plan to test.

I do think that the better resumption stuff (which has one part that
corrects an error in Newton's method going in reverse) helps at >
100 Mbit, which was a speed we were not able to test effectively at in
our prior attempts before we had all these nice test tools.
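
For context on what the invsqrt is for: codel spaces its drops at
interval / sqrt(count), and 1/sqrt(count) can be tracked with one Newton
step each time count increases. The floating-point sketch below shows only
that forward iteration; it is an illustration with assumed function names,
not the fixed-point kernel code, and it does not model the resumption
correction mentioned above.

```python
def newton_step(inv_sqrt: float, count: int) -> float:
    """One Newton-Raphson step towards 1/sqrt(count)."""
    return inv_sqrt * (3.0 - count * inv_sqrt * inv_sqrt) / 2.0


def track_inv_sqrt(max_count: int) -> float:
    """Start exact at count=1, then take a single step per increment."""
    inv_sqrt = 1.0
    for count in range(2, max_count + 1):
        inv_sqrt = newton_step(inv_sqrt, count)
    return inv_sqrt


def drop_spacing_ms(inv_sqrt: float, interval_ms: float = 100.0) -> float:
    """Codel control law: time until the next drop is interval/sqrt(count)."""
    return interval_ms * inv_sqrt


if __name__ == "__main__":
    for count in (1, 2, 4, 16, 64):
        est = track_inv_sqrt(count)
        print(f"count={count:3d}  estimate={est:.4f}  "
              f"exact={count ** -0.5:.4f}  "
              f"next drop in {drop_spacing_ms(est):.1f} ms")
```

The estimate lags slightly behind the exact value for small counts and
tightens as count grows.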

> enhancements mentioned on the wiki page.

Well, at the sub-3ms level it is almost always about the device driver,
BQL, TCP small queues, and kernel context switch time.

There are two features of BQL I dislike: it uses a MIAD (rather than
AIMD) controller, and its buffering is additive across hardware
multiqueues (and devices have sprouted a lot of those of late, which
exhibit birthday problems).

I dislike TCP IW10 intensely. I feel the entire GRO/TSO/GSO concepts
are basically broken (and that we need to develop better hardware that
can deal with packets as packets again).

There is quite a large list of things to solve to get latencies lower
than 2ms that are *hard*. Perhaps userspace networking is an answer.

> The solution to this unstable latency will necessitate migration to
> another platform without Linux. I'm aware cake and fq_codel won't fix
> this problem on Linux.

Well, a huge goal here follows from the dual BSD/GPL licensing of cake: we hope
that someone will show up to port this version of the algorithms to another OS,
and by escaping the monoculture, we will learn more about how to do it
more right.

There is already a project starting to do a DPDK version.

Codel is in click, codel in ns2, fq_codel in ns3, but we do not have anyone
committed to doing cake in anything else at present. I keep hoping a BSD
expert will show up to do a pfsense version.

-- 
Dave Täht
Open Networking needs **Open Source Hardware**

https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67

* Re: [Cake] Cake3 - source code and some questions
  2015-04-16 19:26             ` Dave Taht
@ 2015-04-22 21:02               ` Adrian Popescu
  2015-04-23  0:45                 ` Stephen Hemminger
  2015-04-23  9:01                 ` Toke Høiland-Jørgensen
  0 siblings, 2 replies; 18+ messages in thread
From: Adrian Popescu @ 2015-04-22 21:02 UTC
  To: Dave Taht; +Cc: cake

Hello Dave,


On Thu, Apr 16, 2015 at 10:26 PM, Dave Taht <dave.taht@gmail.com> wrote:
>
>> I was under the impression there was a problem in codel or fq_codel that
>> led to very frequent latency micro-spikes of between 1 and 3
>> milliseconds.
>
> In my world it is sometimes hard to worry about stuff down in this latency
> noise level. This is so far less than what these algorithms generally
> solve in the first place, that I generally have treated it as noise.
>

These algorithms have been doing well on 1 gigabit links. However,
latency depends on the kernel in use.

>
>> It also seemed to produce bigger latency spikes under
>> moderate load. No amount of tuning and disabling of offloads helped
>> with this.
>
> Regrettably you are not providing enough details. Repeatable
> tests and actual measurements are always helpful.
>
> There is presently
> what is viewed as a regression by the Xen folk involving the tcp
> small queues subsystem, which is being discussed heavily on lkml.
>
> What are you referring to specifically?
>

e1000e (82574, dual port and quad port server e1000e adapters with
various Intel chips) is exhibiting latency that varies with the kernel
in use.

Local network ping latencies on Ubuntu 14.04 with its LTS 3.13 kernel
are always below 0.5 milliseconds when idle and lightly loaded. The
latest kernels from kernel.org are unable to match those latencies.
These modern kernels are always seeing latencies of 2-3 milliseconds.

Turning off all offloads on the involved ethernet e1000e network
interfaces doesn't help. This is all physical Intel hardware with
e1000e interfaces.

Two idle hosts on a local network should be able to ping each other
and get latencies which are always less than one millisecond. FreeBSD
doesn't have this problem.

>
> Imagine having your router induce seconds or 10s of seconds latency
> - the state of affairs on most edge devices today.
>
> I DO care about latency and jitter to this level, but it is very hard to
> isolate and measure.
>

2-3 milliseconds of jitter or extra latency each way wouldn't have
been a big deal if that represented the only problem a network has to
deal with all the time. Induced latency coupled with stupid RED
causing packet loss at the ISP leads to lower throughput. Induced
latency coupled with wireless packet loss and packet loss caused by
stupid RED is worse.

Adding 5 more milliseconds of latency (or more) during higher load can
be very bad if you're playing a game and the latency is at 95
milliseconds already.

Two people having a conversation over two such edge devices would get
about 10 milliseconds of induced RTT latency for no good reason. If
their own end devices induce such latency as well, they would probably
be seeing up to 12 milliseconds of latency only because of their
network. That would be in addition to all the network latency due to
the RTT.

This works properly on older kernels and there's no fq_codel problem.
cake is also affected on the new kernels. Maybe it's better to say the
AQM or shaper makes no difference. Disabling offloads merely reduces it
slightly while idling.

>>
>> These problems are the reason behind starting this thread. I believed
>> these problems to be related to fq_codel or to the codel algorithm
>> itself.
>
> Not enough detail, what exactly, are you measuring, on what hardware
> using what tools?

testing method:
- take two modern machines with e1000e NICs (sandy bridge, ivy bridge, haswell)
- one machine should run Ubuntu 14.04 with the LTS kernel 3.13
- for the control test, the other machine should also run Ubuntu 14.04
with the LTS kernel 3.13
- for the second test, the other machine should run kernel 3.18/3.19/4.0
- the two hosts should be very lightly loaded
- connect the machines through a gigabit switch

control test:
ping 172.16.0.1
result: response below 0.5 milliseconds

second test:
ping 172.16.0.1
result: response above 1 millisecond, sometimes 2 milliseconds or worse

Seeing worse latency under load (20-100 milliseconds) isn't uncommon.
I believe this to be a regression in the kernel or in the network
drivers.

* Re: [Cake] Cake3 - source code and some questions
  2015-04-22 21:02               ` Adrian Popescu
@ 2015-04-23  0:45                 ` Stephen Hemminger
  2015-04-23  9:01                 ` Toke Høiland-Jørgensen
  1 sibling, 0 replies; 18+ messages in thread
From: Stephen Hemminger @ 2015-04-23  0:45 UTC
  To: Adrian Popescu; +Cc: cake

On Thu, 23 Apr 2015 00:02:35 +0300
Adrian Popescu <adriannnpopescu@gmail.com> wrote:

> e1000e (82574, dual port and quad port server e1000e adapters with
> various Intel chips) is exhibiting latency that varies with the kernel
> in use.
> 
> Local network ping latencies on Ubuntu 14.04 with its LTS 3.13 kernel
> are always below 0.5 milliseconds when idle and lightly loaded. The
> latest kernels from kernel.org are unable to match those latencies.
> These modern kernels are always seeing latencies of 2-3 milliseconds.
> 
> Turning off all offloads on the involved ethernet e1000e network
> interfaces doesn't help. This is all physical Intel hardware with
> e1000e interfaces.
> 
> Two idle hosts on a local network should be able to ping each other
> and get latencies which are always less than one millisecond. FreeBSD
> doesn't have this problem.

These NICs have had a history of power management related issues.
I suspect some power management (maybe even in SMI) is turning off
parts of the chips and it is taking a long time to turn them back on.

* Re: [Cake] Cake3 - source code and some questions
  2015-04-22 21:02               ` Adrian Popescu
  2015-04-23  0:45                 ` Stephen Hemminger
@ 2015-04-23  9:01                 ` Toke Høiland-Jørgensen
  2015-04-23 10:56                   ` Adrian Popescu
  1 sibling, 1 reply; 18+ messages in thread
From: Toke Høiland-Jørgensen @ 2015-04-23  9:01 UTC
  To: Adrian Popescu; +Cc: cake

Adrian Popescu <adriannnpopescu@gmail.com> writes:

> Seeing worse latency under load (20-100 milliseconds) isn't uncommon.
> I believe this to be a regression in the kernel or in the network
> drivers.

I don't see this behaviour at all:

$ ls -l /sys/class/net/enp0s25/device/driver
0 lrwxrwxrwx 1 root root 0 Apr 21 16:05 /sys/class/net/enp0s25/device/driver -> ../../../bus/pci/drivers/e1000e/

$ ping 130.243.26.1 -c 100 # this is my default gateway
..snip...
--- 130.243.26.1 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 99000ms
rtt min/avg/max/mdev = 0.341/0.801/28.260/2.769 ms

$ cat /proc/loadavg 
9.29 8.43 5.30 14/508 6665

(yes, this is while running a cpu-hungry data processing application in
the background on all eight cores)

$ uname -a
Linux alrua-kau 3.19.3-3-ARCH #1 SMP PREEMPT Wed Apr 8 14:10:00 CEST 2015 x86_64 GNU/Linux

$  tc qdisc show dev enp0s25
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

(hmm, why am I running pfifo_fast?)

Repeating with sch_fq:

$ ping 130.243.26.1 -c 100
...snip...
--- 130.243.26.1 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 98998ms
rtt min/avg/max/mdev = 0.358/0.468/1.278/0.151 ms
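
For anyone wanting to compare runs like these automatically, the summary
line printed by ping can be parsed with a few lines of Python. This is only
a convenience sketch; the function name is made up, and it assumes the
"rtt min/avg/max/mdev" format shown in the output above:

```python
import re

_RTT_RE = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms")


def parse_ping_rtt(line: str) -> dict:
    """Extract min/avg/max/mdev (in ms) from ping's summary line."""
    match = _RTT_RE.search(line)
    if match is None:
        raise ValueError("not a ping rtt summary line: %r" % line)
    values = [float(g) for g in match.groups()]
    return dict(zip(("min", "avg", "max", "mdev"), values))


if __name__ == "__main__":
    # The two summary lines from the runs above.
    pfifo_fast = parse_ping_rtt(
        "rtt min/avg/max/mdev = 0.341/0.801/28.260/2.769 ms")
    sch_fq = parse_ping_rtt(
        "rtt min/avg/max/mdev = 0.358/0.468/1.278/0.151 ms")
    print("max rtt, pfifo_fast:", pfifo_fast["max"], "ms")
    print("max rtt, sch_fq:    ", sch_fq["max"], "ms")
```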

-Toke

* Re: [Cake] Cake3 - source code and some questions
  2015-04-23  9:01                 ` Toke Høiland-Jørgensen
@ 2015-04-23 10:56                   ` Adrian Popescu
  2015-04-23 11:01                     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 18+ messages in thread
From: Adrian Popescu @ 2015-04-23 10:56 UTC
  To: Toke Høiland-Jørgensen; +Cc: cake

Hello Toke,

Thanks to your experiment and your statement regarding CPU load on
your box during testing, I was able to fix the problem.

It looks like this problem was being caused by power saving. Something
changed between the older kernels and the newer ones. Changing the
power saving settings in the BIOS brings back latency below 0.5
milliseconds.

This might have an impact on some benchmarks which don't load up all CPU
cores or which don't need a lot of CPU power. This is certainly
something to keep an eye on when doing any kind of testing involving
really low latencies or network schedulers.


On Thu, Apr 23, 2015 at 12:01 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> Adrian Popescu <adriannnpopescu@gmail.com> writes:
>
>> Seeing worse latency under load (20-100 milliseconds) isn't uncommon.
>> I believe this to be a regression in the kernel or in the network
>> drivers.
>
> I don't see this behaviour at all:
>
> $ ls -l /sys/class/net/enp0s25/device/driver
> 0 lrwxrwxrwx 1 root root 0 Apr 21 16:05 /sys/class/net/enp0s25/device/driver -> ../../../bus/pci/drivers/e1000e/
>
> $ ping 130.243.26.1 -c 100 # this is my default gateway
> ..snip...
> --- 130.243.26.1 ping statistics ---
> 100 packets transmitted, 100 received, 0% packet loss, time 99000ms
> rtt min/avg/max/mdev = 0.341/0.801/28.260/2.769 ms
>
> $ cat /proc/loadavg
> 9.29 8.43 5.30 14/508 6665
>
> (yes, this is while running a cpu-hungry data processing application in
> the background on all eight cores)
>
> $ uname -a
> Linux alrua-kau 3.19.3-3-ARCH #1 SMP PREEMPT Wed Apr 8 14:10:00 CEST 2015 x86_64 GNU/Linux
>
> $  tc qdisc show dev enp0s25
> qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>
> (hmm, why am I running pfifo_fast?)
>
> Repeating with sch_fq:
>
> $ ping 130.243.26.1 -c 100
> ...snip...
> --- 130.243.26.1 ping statistics ---
> 100 packets transmitted, 100 received, 0% packet loss, time 98998ms
> rtt min/avg/max/mdev = 0.358/0.468/1.278/0.151 ms
>
> -Toke

* Re: [Cake] Cake3 - source code and some questions
  2015-04-23 10:56                   ` Adrian Popescu
@ 2015-04-23 11:01                     ` Toke Høiland-Jørgensen
  2015-04-23 11:05                       ` Adrian Popescu
  0 siblings, 1 reply; 18+ messages in thread
From: Toke Høiland-Jørgensen @ 2015-04-23 11:01 UTC (permalink / raw)
  To: Adrian Popescu; +Cc: cake

Adrian Popescu <adriannnpopescu@gmail.com> writes:

> Thanks to your experiment and your statement regarding CPU load on
> your box during testing, I was able to fix the problem.

Cool!

> It looks like this problem was being caused by power saving. Something
> changed between the older kernels and the newer ones. Changing the
> power saving settings in the BIOS brings back latency below 0.5
> milliseconds.

So is this the PCI bus power saving settings, or the CPU, or?

> This might have an impact on some benchmarks which don't load up all CPU
> cores or which don't need a lot of CPU power. This is certainly
> something to keep an eye on when doing any kind of testing involving
> really low latencies or network schedulers.

Yes, definitely. Having things be worse during idle is definitely not
optimal. I wonder if there's a kernel-level setting that can affect this?

-Toke

* Re: [Cake] Cake3 - source code and some questions
  2015-04-23 11:01                     ` Toke Høiland-Jørgensen
@ 2015-04-23 11:05                       ` Adrian Popescu
  2015-04-23 11:09                         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 18+ messages in thread
From: Adrian Popescu @ 2015-04-23 11:05 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: cake

The problems I was seeing were related to c-states.

PCI-E bus ASPM power saving is disabled for the e1000e network
interfaces. This can be observed using dmesg.

Perhaps using a CPU which is low power enough for a router would help
avoid the need for deep sleep power states and other things such as
speedstep.
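
To check which C-states a given machine exposes and how long each takes
to wake from, something like the following sketch may help. It assumes
the Linux cpuidle sysfs layout (a `name` and a `latency` file under
`/sys/devices/system/cpu/cpuN/cpuidle/stateM/`); the function name
`list_cstates` and its optional directory argument are made up for
illustration.

```shell
#!/bin/sh
# Print each idle state (C-state) of one CPU with its advertised exit
# latency in microseconds, as reported by the kernel's cpuidle driver.
# $1 (optional, hypothetical parameter): cpuidle directory to read;
# defaults to cpu0 on the running system.
list_cstates() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    for state in "$dir"/state*; do
        # Skip silently if cpuidle is not available (glob did not match).
        [ -r "$state/name" ] || continue
        printf '%s\t%s us\n' "$(cat "$state/name")" "$(cat "$state/latency")"
    done
}
```

Calling `list_cstates` with no argument prints one line per state for
cpu0; states with large latency values are the likely suspects when
ping times get worse on an idle box.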


On Thu, Apr 23, 2015 at 2:01 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> Adrian Popescu <adriannnpopescu@gmail.com> writes:
>
>> Thanks to your experiment and your statement regarding CPU load on
>> your box during testing, I was able to fix the problem.
>
> Cool!
>
>> It looks like this problem was being caused by power saving. Something
>> changed between the older kernels and the newer ones. Changing the
>> power saving settings in the BIOS brings back latency below 0.5
>> milliseconds.
>
> So is this the PCI bus power saving settings, or the CPU, or?
>
>> This might have an impact on some benchmarks which don't load up all CPU
>> cores or which don't need a lot of CPU power. This is certainly
>> something to keep an eye on when doing any kind of testing involving
>> really low latencies or network schedulers.
>
> Yes, definitely. Having things be worse during idle is definitely not
> optimal. I wonder if there's a kernel-level setting that can affect this?
>
> -Toke

* Re: [Cake] Cake3 - source code and some questions
  2015-04-23 11:05                       ` Adrian Popescu
@ 2015-04-23 11:09                         ` Toke Høiland-Jørgensen
  2015-04-23 11:13                           ` Jonathan Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Toke Høiland-Jørgensen @ 2015-04-23 11:09 UTC (permalink / raw)
  To: Adrian Popescu; +Cc: cake

Adrian Popescu <adriannnpopescu@gmail.com> writes:

> The problems I was seeing were related to c-states.

Ah, right, so you turned off power saving for the CPU entirely?

> Perhaps using a CPU which is low power enough for a router would help
> avoid the need for deep sleep power states and other things such as
> speedstep.

Well shouldn't it be possible to do something more intelligent with the
power saving to avoid this problem?

-Toke
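
One less drastic option than turning off CPU power saving in the BIOS
may be to leave the shallow C-states enabled and disable only the deep
ones at runtime. The sketch below assumes the kernel provides the
per-state `disable` file in the cpuidle sysfs tree; the function name
`limit_cstates`, its arguments, and any particular latency limit are
hypothetical. Other knobs that exist for the same purpose are the
`intel_idle.max_cstate=` and `processor.max_cstate=` boot parameters and
the PM QoS device `/dev/cpu_dma_latency`.

```shell
#!/bin/sh
# Disable every idle state whose advertised exit latency exceeds a limit,
# by writing 1 to its cpuidle "disable" file (needs root on a real system).
# $1: latency limit in microseconds (the value to use is a judgment call)
# $2 (optional, hypothetical parameter): root of the per-CPU sysfs tree
limit_cstates() {
    limit_us="$1"
    cpu_root="${2:-/sys/devices/system/cpu}"
    for state in "$cpu_root"/cpu[0-9]*/cpuidle/state*; do
        # Skip silently if cpuidle is not available (glob did not match).
        [ -r "$state/latency" ] || continue
        if [ "$(cat "$state/latency")" -gt "$limit_us" ]; then
            echo 1 > "$state/disable"
        fi
    done
}
```

For example, `limit_cstates 10` run as root would leave only states
that claim to wake up within 10 microseconds. The setting does not
survive a reboot, which makes it convenient for before/after ping
comparisons.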

* Re: [Cake] Cake3 - source code and some questions
  2015-04-23 11:09                         ` Toke Høiland-Jørgensen
@ 2015-04-23 11:13                           ` Jonathan Morton
  0 siblings, 0 replies; 18+ messages in thread
From: Jonathan Morton @ 2015-04-23 11:13 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: cake

C-states refer to various levels of sleep, which take time for the CPU to
wake up from. On modern Intel CPUs, changing frequency is virtually a free
action, so allowing it to use a lower frequency during idle should be fine.

- Jonathan Morton

* Re: [Cake] Cake3 - source code and some questions
  2015-04-16 13:25         ` Jonathan Morton
  2015-04-16 13:48           ` Adrian Popescu
@ 2015-04-16 13:49           ` Sebastian Moeller
  1 sibling, 0 replies; 18+ messages in thread
From: Sebastian Moeller @ 2015-04-16 13:49 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: cake

Hi Jonathan,

On Apr 16, 2015, at 15:25 , Jonathan Morton <chromatix99@gmail.com> wrote:

> 
>> On 16 Apr, 2015, at 15:14, Adrian Popescu <adriannnpopescu@gmail.com> wrote:
>> 
>> Will the changes to codel made by cake be put into fq_codel?
> 
> This might be a more complex question than you realise.
> 
> The most likely feature of cake to be implemented in fq_codel might be the set-associative hash, since it’s almost a pure win.  That would almost be a cut-and-paste operation, but due to fq_codel’s de-facto status as a “standard candle” in research, it would need to be made configurable, at least to make turning it off easy.  And that isn’t really a “codel” feature change, since it influences the FQ layer exclusively.
> 
> The codel parameter tuning done by cake isn’t applicable to fq_codel, because the bandwidth information that this tuning relies on isn’t available (not even when it’s stacked with HTB).  That’s why cake defaults to something very like the standard codel parameters when the internal shaper is disabled (“unlimited” mode),

	That makes me wonder: is there a way to specify “fixed” target and interval values for the codel part of cake, overriding the default automatic selection while still using the shaper? That might make for a compelling demonstration of the beauty of the automatic mode, to convince skeptics.

Best Regards
	Sebastian


> and that in turn is one reason why those defaults are also used at "sufficiently high” bandwidths, so that there isn’t a sharp discontinuity in the behaviour when the bandwidth is increased beyond the link rate and on to infinity (unlimited mode actually works by setting the shaper to infinite bandwidth, ie. zero time per byte).  The other reason, as I previously noted, is because the parameters depend on the total RTT as well as the packet rate.
> 
> Which leaves algorithmic changes to codel itself.  It’s certainly possible to drop these (fairly subtle) changes in, but we should probably spend some more time measuring the effects of these changes and finalising them.  We’re considering doing a major refactor of the code, which might make it harder to perform a drop-in replacement.  In any case, FQ does mean that codel’s precise behaviour is less critical than it might otherwise be, and there are valid arguments - such as the “standard candle” one - for leaving it alone.
> 
> - Jonathan Morton
> 
> _______________________________________________
> Cake mailing list
> Cake@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cake
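
One way to see why codel's parameters depend on bandwidth: at low
rates, a single full-size packet takes far longer to transmit than
codel's default 5 ms target, so the target cannot be met no matter how
short the queue is. The sketch below is only that arithmetic, not cake's
actual tuning rule; the 1500-byte packet size and the function name
`serialization_us` are assumptions for illustration.

```shell
#!/bin/sh
# Time needed to put one packet on the wire, in microseconds.
# $1: packet size in bytes, $2: link rate in bit/s
# Assumes the shell does 64-bit integer arithmetic (the intermediate
# product is too large for 32 bits).
serialization_us() {
    echo $(( $1 * 8 * 1000000 / $2 ))
}

serialization_us 1500 64000       # 64 kbit/s link
serialization_us 1500 100000000   # 100 Mbit/s link
```

At 64 kbit/s this gives 187500 us (187.5 ms) per packet, against 120 us
at 100 Mbit/s, which is why a shaper that knows its own rate is in a
position to scale target and interval while plain fq_codel is not.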


Thread overview: 18+ messages
2015-04-12  9:39 [Cake] Cake3 - source code and some questions Adrian Popescu
2015-04-12  9:58 ` Jonathan Morton
2015-04-12 10:24 ` Jonathan Morton
2015-04-12 12:33   ` Adrian Popescu
2015-04-12 18:57     ` Jonathan Morton
2015-04-16 12:14       ` Adrian Popescu
2015-04-16 13:25         ` Jonathan Morton
2015-04-16 13:48           ` Adrian Popescu
2015-04-16 19:26             ` Dave Taht
2015-04-22 21:02               ` Adrian Popescu
2015-04-23  0:45                 ` Stephen Hemminger
2015-04-23  9:01                 ` Toke Høiland-Jørgensen
2015-04-23 10:56                   ` Adrian Popescu
2015-04-23 11:01                     ` Toke Høiland-Jørgensen
2015-04-23 11:05                       ` Adrian Popescu
2015-04-23 11:09                         ` Toke Høiland-Jørgensen
2015-04-23 11:13                           ` Jonathan Morton
2015-04-16 13:49           ` Sebastian Moeller
