* [Cake] Cake3 - source code and some questions
@ 2015-04-12  9:39 Adrian Popescu
  2015-04-12  9:58 ` Jonathan Morton
  2015-04-12 10:24 ` Jonathan Morton
  0 siblings, 2 replies; 18+ messages in thread
From: Adrian Popescu @ 2015-04-12 9:39 UTC
To: cake

Hello everyone,

Is cake3's source available for testing? Is there a way to test cake3 today?

Does cake3 solve the problems fq_codel was having with high bandwidth and low latency connections? Does it still require tuning for low bandwidth and high bandwidth with low latency?

Has cake3 been tested on 10gbps networks?

Can cake3 be used in a hierarchical setup, like htb?

Thanks,
Adrian
* Re: [Cake] Cake3 - source code and some questions
From: Jonathan Morton @ 2015-04-12 9:58 UTC
To: Adrian Popescu; +Cc: cake

To answer the later questions first:

Cake is designed for use at the internet edge, and therefore assumes internet scale RTTs. It does not have any sort of tuning for datacentre networks. But it does work and has a measurable effect on home LANs, even though it's not specifically tuned for that.

If there is sufficient demand for cake's features on such networks, then a flag could be added to provide appropriate tuning for low RTTs. Fq_codel can already be tuned this way by adjusting the target and interval parameters.

Cake does have tuning for low bandwidth links (increasing codel's target and interval), and has been run (but not yet extensively tested) at 64kbps.

We have cake's code in a git repo, but I don't think we have anonymous pull access to it. Toke?

- Jonathan Morton
* Re: [Cake] Cake3 - source code and some questions
From: Jonathan Morton @ 2015-04-12 10:24 UTC
To: Adrian Popescu; +Cc: cake

> Can cake3 be used in a hierarchical setup, like htb?

This is a trickier question. Cake is designed to be as simple to configure as possible, and a classful setup would work against that (it would instantly triple the number of tc invocations required). However, it could be used as a leaf qdisc with a separate classifier, if you really wanted to. I have trouble imagining why, though.

To put it simply, we want to build the functionality for the most common use cases into cake natively, especially when they don't do any harm to be left switched on (by default) when not strictly needed.

- Jonathan Morton
* Re: [Cake] Cake3 - source code and some questions
From: Adrian Popescu @ 2015-04-12 12:33 UTC
To: Jonathan Morton; +Cc: cake

Thank you, Jonathan.

High bandwidth home networks are becoming more and more common. FTTH has very low latency of 1-2ms. fq_codel has exhibited some weird behaviour, but I can't put my finger on it because CPU usage wasn't a problem. Figuring out what's going on at the kernel or fq_codel level can be complicated.

These high bandwidth connections with low latency are somewhat similar to data centre networks. Some who co-locate their servers have 100mbps of symmetric bandwidth outside of their network and they have 1 gbps or 10 gbps within their network.

Setting up fq_codel properly can be difficult because the quantum, the target and the interval need to be adjusted on high bandwidth & low latency links. Figuring out if the changes have helped or hurt is difficult because the network conditions can be different.

I can't wait to test cake3.

Regards,
Adrian
* Re: [Cake] Cake3 - source code and some questions
From: Jonathan Morton @ 2015-04-12 18:57 UTC
To: Adrian Popescu; +Cc: cake

This is a question worth discussing. There is a certain amount of controversy over the actual meaning, utility and constraints of the target parameter, although the interval parameter is fairly well understood as a rough (order of magnitude) estimate of the prevailing RTT.

Note that even with FTTP, while the RTT to the head end may be unusually low, the RTTs to interesting servers will still be in roughly the same range as on a good quality ADSL link. This is especially true if the interesting servers tend to be at the other end of the country/continent or on the other side of an ocean. This variability is within Codel's capacity.

Due to the Diffserv and flow isolation features of cake, the latency minimization feature provided by Codel also isn't as critical to tune as it is when standalone, or with a lesser flow isolation system such as fq_codel's collision prone hash function. I think this is sufficient to make further tuning unnecessary up to 1 gigabit, whether on a LAN or over the internet, and since I haven't seen any home affordable gear for more than a gigabit yet - marketing tricks by Wi-Fi vendors aside - I don't think it's worth thinking too hard about pushing that higher in the home use case. Fq_codel also works quite well on a LAN already.

The difference in a datacentre is that typical native RTTs are measured in microseconds, well outside the range that Codel is by default tuned for. The bandwidths involved also mean that the standard 5ms target invokes a large amount of buffered data. Additionally, we're inherently talking about a wholly local environment, so there is no need to adapt to internet scale RTTs.

For those cases where you do have a datacentre like environment connected to an internet like environment, the solution is obvious. Deploy datacentre tuned AQM (which might be fq_codel with altered parameters) within the datacentre, and put cake at the gateway(s) to the internet. Job done.

- Jonathan Morton
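To put rough numbers on the remark that a 5ms target "invokes a large amount of buffered data" at datacentre bandwidths: the backlog that corresponds to a given queueing delay is simply link rate times delay. The sketch below is illustrative arithmetic only, not code from cake or codel; the function name `backlog_bytes` and the chosen link rates are assumptions made for the example.

```python
def backlog_bytes(rate_bps: int, target_us: int) -> int:
    """Bytes that must sit in the queue for a packet to wait target_us
    microseconds on a link draining at rate_bps bits per second."""
    return rate_bps * target_us // 8_000_000


if __name__ == "__main__":
    TARGET_US = 5000  # the standard 5 ms codel target mentioned in the thread
    MTU = 1500        # bytes; a typical Ethernet payload size (assumption)
    rates = [
        ("64 kbit/s", 64_000),
        ("100 Mbit/s", 100_000_000),
        ("1 Gbit/s", 1_000_000_000),
        ("10 Gbit/s", 10_000_000_000),
    ]
    for label, rate in rates:
        b = backlog_bytes(rate, TARGET_US)
        print(f"{label:>10}: {b:>9} bytes (about {b / MTU:.1f} full-size packets)")
```

At 10 Gbit/s the 5ms target corresponds to 6,250,000 bytes of queue, while at 64 kbit/s it is only 40 bytes, less than a single full-size packet. The second figure is consistent with the earlier statement that cake raises codel's target and interval on low bandwidth links.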
* Re: [Cake] Cake3 - source code and some questions
From: Adrian Popescu @ 2015-04-16 12:14 UTC
To: Jonathan Morton; +Cc: cake

That answers my question.

Will the changes to codel made by cake be put into fq_codel?
* Re: [Cake] Cake3 - source code and some questions
From: Jonathan Morton @ 2015-04-16 13:25 UTC
To: Adrian Popescu; +Cc: cake

> On 16 Apr, 2015, at 15:14, Adrian Popescu <adriannnpopescu@gmail.com> wrote:
>
> Will the changes to codel made by cake be put into fq_codel?

This might be a more complex question than you realise.

The most likely feature of cake to be implemented in fq_codel might be the set-associative hash, since it’s almost a pure win. That would almost be a cut-and-paste operation, but due to fq_codel’s de-facto status as a “standard candle” in research, it would need to be made configurable, at least to make turning it off easy. And that isn’t really a “codel” feature change, since it influences the FQ layer exclusively.

The codel parameter tuning done by cake isn’t applicable to fq_codel, because the bandwidth information that this tuning relies on isn’t available (not even when it’s stacked with HTB). That’s why cake defaults to something very like the standard codel parameters when the internal shaper is disabled (“unlimited” mode), and that in turn is one reason why those defaults are also used at “sufficiently high” bandwidths, so that there isn’t a sharp discontinuity in the behaviour when the bandwidth is increased beyond the link rate and on to infinity (unlimited mode actually works by setting the shaper to infinite bandwidth, i.e. zero time per byte). The other reason, as I previously noted, is that the parameters depend on the total RTT as well as the packet rate.

Which leaves algorithmic changes to codel itself. It’s certainly possible to drop these (fairly subtle) changes in, but we should probably spend some more time measuring the effects of these changes and finalising them. We’re considering doing a major refactor of the code, which might make it harder to perform a drop-in replacement.

In any case, FQ does mean that codel’s precise behaviour is less critical than it might otherwise be, and there are valid arguments - such as the “standard candle” one - for leaving it alone.

- Jonathan Morton
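The idea behind a set-associative hash, as opposed to a plain direct-mapped one, can be illustrated with a toy model. This is not cake's code and makes no claim about cake's actual table size, associativity, or behaviour when a set is full; the class name `SetAssociativeFlowTable`, the default sizes, the fallback policy, and the use of Python's built-in `hash` are all assumptions chosen for illustration.

```python
class SetAssociativeFlowTable:
    """Toy model: map flow keys to queue indices with a set-associative lookup."""

    def __init__(self, n_sets: int = 128, ways: int = 8):
        self.n_sets = n_sets
        self.ways = ways
        # tags[i] is the flow key currently occupying queue i, or None if free.
        self.tags = [None] * (n_sets * ways)
        self.collisions = 0

    def lookup(self, flow_key, hash_fn=hash) -> int:
        """Return the queue index for flow_key, claiming a free way if needed."""
        base = (hash_fn(flow_key) % self.n_sets) * self.ways
        free = None
        for i in range(base, base + self.ways):
            if self.tags[i] == flow_key:
                return i  # this flow already owns a queue in the set
            if self.tags[i] is None and free is None:
                free = i  # remember the first free way in the set
        if free is not None:
            self.tags[free] = flow_key
            return free
        # Every way in the set is taken: this sketch simply shares the first one.
        self.collisions += 1
        return base

    def release(self, index: int) -> None:
        """Mark a queue as free again, e.g. once it has drained."""
        self.tags[index] = None


if __name__ == "__main__":
    def worst_case(key):
        return 0  # force every flow into the same set

    for ways in (1, 8):
        table = SetAssociativeFlowTable(n_sets=128, ways=ways)
        for port in range(8):
            table.lookup(("10.0.0.1", port), worst_case)
        print(f"ways={ways}: {table.collisions} collisions for 8 flows in one set")
```

With `ways=1` the table behaves like a direct-mapped hash, so two flows that hash to the same slot must share a queue. With more ways, flows that land in the same set still get separate queues until the whole set is occupied, which is the sense in which such a scheme reduces collisions.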
* Re: [Cake] Cake3 - source code and some questions
From: Adrian Popescu @ 2015-04-16 13:48 UTC
To: Jonathan Morton; +Cc: cake

I've discovered there are other problems in the Linux networking stack which don't seem to be related to fq_codel, qdiscs, AQM and HTB.

There are latency-inducing bugs in the Ethernet drivers of many network adapters, including e1000e, or in the kernel itself. Some kernels are better. The newest ones have severe regressions in this area.

I was under the impression there's a problem in codel or fq_codel that led to very frequent latency micro-spikes of between 1 and 3 milliseconds. It also seemed to produce bigger latency spikes under moderate load. No amount of tuning and disabling of offloads helped with this.

Imagine having 2 milliseconds of latency to your ISP and having your router induce between 3 and 5 milliseconds of latency for every flow. It's not particularly helpful for low latency paths on high bandwidth links. Adding more latency in both directions to high latency paths is even worse.

These problems are the reason behind starting this thread. I believed these problems to be related to fq_codel or to the codel algorithm itself.

My question about porting these improvements to codel and fq_codel was strictly about the tighter recovery, better invsqrt and other codel enhancements mentioned on the wiki page.

The solution to this unstable latency will necessitate migration to another platform without Linux. I'm aware cake and fq_codel won't fix this problem on Linux.
* Re: [Cake] Cake3 - source code and some questions
From: Dave Taht @ 2015-04-16 19:26 UTC
To: Adrian Popescu; +Cc: cake

On Thu, Apr 16, 2015 at 6:48 AM, Adrian Popescu <adriannnpopescu@gmail.com> wrote:
> I've discovered there are other problems in the Linux networking stack
> which don't seem to be related to fq_codel, qdiscs, AQM and HTB.
>
> There are latency-inducing bugs in the Ethernet drivers
> of many network adapters, including e1000e, or in the kernel itself.
> Some kernels are better. The newest ones have severe regressions in
> this area.

There is a multiplicity of problems in doing real-time packet processing on processors that generally cannot context switch in under 1000 cycles anymore, and on virtual machines, oh, my!

One of the biggest fixes for linux networking was the BQL infrastructure, which is in something like 24 drivers now (including the e1000e) - but things like TSO and GRO offloads remain a PITA.

There are still many drivers left to fix. One (mvneta) is really bugging me of late....

> I was under the impression there's a problem in codel or fq_codel that
> led to very frequent latency micro-spikes of between 1 and 3
> milliseconds.

In my world it is sometimes hard to worry about stuff down in this latency noise level. This is so far less than what these algorithms generally solve in the first place, that I generally have treated it as noise.

That said, I remain sad that the linux-rt folk are so underfunded, as they have the tools and expertise to try and get more consistent low latency with full throughput.

> It also seemed to produce bigger latency spikes under
> moderate load. No amount of tuning and disabling of offloads helped
> with this.

Regrettably you are not providing enough details. Repeatable tests and actual measurements are always helpful.

There is presently what is viewed as a regression by the Xen folk involving the tcp small queues subsystem, which is being discussed heavily on lkml.

What are you referring to specifically?

> Imagine having 2 milliseconds of latency to your ISP and having your
> router induce between 3 and 5 milliseconds of latency for every flow.
> It's not particularly helpful for low latency paths on high bandwidth
> links. Adding more latency in both directions to high latency paths is
> even worse.

Imagine having your router induce seconds or 10s of seconds latency - the state of affairs on most edge devices today.

I DO care about latency and jitter to this level, but it is very hard to isolate and measure.

> These problems are the reason behind starting this thread. I believed
> these problems to be related to fq_codel or to the codel algorithm
> itself.

Not enough detail: what exactly are you measuring, on what hardware, using what tools?

> My question about porting these improvements to codel and fq_codel was
> strictly about the tighter recovery, better invsqrt and other codel

In the tree and in cerowrt for 2 years there have been multiple variants of the algorithms under test, individually. Cake rolls up the best of these attempts thus far, and each of those separate models remains in-tree for further testing against all the other variables.

In no case have I cared one whit about sub 3ms worth of jitter. I was mostly looking to get faster convergence, better utilization at longer RTTs, better behavior at > 100Mbit, and more filling of the pipe in general.

I happen to not agree with Jonathan that the better invsqrt (cache) now in cake accomplishes anything, but plan to test.

I do think that the better resumption stuff (which has one part that corrects an error in newton's method going in reverse) helps at > 100mbit, which was a speed we were not able to test effectively at in our prior attempts, before we had all these nice test tools.

> enhancements mentioned on the wiki page.

Well, at the sub 3ms level it is almost always about the device driver, BQL, tcp small queues, and kernel context switch time.

There are two features of BQL I dislike: it uses a MIAD (rather than AIMD) controller, and its buffering is additive across hardware multiqueues (and devices have sprouted a lot of those of late, which exhibit birthday problems).

I dislike TCP IW10 intensely. I feel the entire GRO/TSO/GSO concepts are basically broken (and that we need to develop better hardware that can deal with packets as packets again).

There is quite a large list of things to solve to get latencies lower than 2ms that are *hard*. Perhaps userspace networking is an answer.

> The solution to this unstable latency will necessitate migration to
> another platform without Linux. I'm aware cake and fq_codel won't fix
> this problem on Linux.

Well, a huge goal here comes from the dual BSD/GPL licensing of cake. We hope that someone will show up to port this version of the algorithms to another OS, and by escaping the monoculture, we will learn more about how to do it more right.

There is already a project starting to do a dpdk version.

Codel is in click, codel in ns2, fq_codel in ns3 - we do not have anyone committed to doing cake in anything else at present. Keep hoping a BSD expert will show up to do a pfsense version.

--
Dave Täht
Open Networking needs **Open Source Hardware**

https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67
* Re: [Cake] Cake3 - source code and some questions
From: Adrian Popescu @ 2015-04-22 21:02 UTC
To: Dave Taht; +Cc: cake

Hello Dave,

On Thu, Apr 16, 2015 at 10:26 PM, Dave Taht <dave.taht@gmail.com> wrote:
>> I was under the impression there's a problem in codel or fq_codel that
>> led to very frequent latency micro-spikes of between 1 and 3
>> milliseconds.
>
> In my world it is sometimes hard to worry about stuff down in this latency
> noise level. This is so far less than what these algorithms generally
> solve in the first place, that I generally have treated it as noise.

These algorithms have been doing well on 1 gigabit links. However, latency depends on the kernel used.

>> It also seemed to produce bigger latency spikes under
>> moderate load. No amount of tuning and disabling of offloads helped
>> with this.
>
> Regrettably you are not providing enough details. Repeatable
> tests and actual measurements are always helpful.
>
> There is presently
> what is viewed as a regression by the Xen folk involving the tcp
> small queues subsystem, which is being discussed heavily on lkml.
>
> What are you referring to specifically?

e1000e (82574, dual port and quad port server e1000e adapters with various Intel chips) is exhibiting latency that varies with the kernel used.

Local network ping latencies on Ubuntu 14.04 with its LTS 3.13 kernel are always below 0.5 milliseconds when idle and lightly loaded. The latest kernels from kernel.org are unable to match those latencies. These modern kernels are always seeing latencies of 2-3 milliseconds.

Turning off all offloads on the involved ethernet e1000e network interfaces doesn't help. This is all physical Intel hardware with e1000e interfaces.

Two idle hosts on a local network should be able to ping each other and get latencies which are always less than one millisecond. FreeBSD doesn't have this problem.

> Imagine having your router induce seconds or 10s of seconds latency
> - the state of affairs on most edge devices today.
>
> I DO care about latency and jitter to this level, but it is very hard to
> isolate and measure.

2-3 milliseconds of jitter or extra latency each way wouldn't have been a big deal if that represented the only problem a network has to deal with all the time. Induced latency coupled with stupid RED causing packet loss at the ISP leads to lower throughput. Induced latency coupled with wireless packet loss and packet loss caused by stupid RED is worse.

Adding 5 more milliseconds of latency (or more) during higher load can be very bad if you're playing a game and the latency is at 95 milliseconds already. Two people having a conversation over two such edge devices would get about 10 milliseconds of induced RTT latency for no good reason. If they are also using end devices which induce such latency, they would probably be seeing up to 12 milliseconds of latency purely because of their network. That would be in addition to all the network latency due to the RTT.

This works properly on older kernels and there's no fq_codel problem. cake is also affected on the new kernels. Maybe it's better to say the AQM or shaper makes no difference. Disabling offloads merely reduces it slightly while idling.

>> These problems are the reason behind starting this thread. I believed
>> these problems to be related to fq_codel or to the codel algorithm
>> itself.
>
> Not enough detail, what exactly, are you measuring, on what hardware
> using what tools?

testing method:
- take two modern e1000e machines (sandy bridge, ivy bridge, haswell)
- one machine should run Ubuntu 14.04 with LTS kernel 3.13
- the other machine should run Ubuntu 14.04 with LTS kernel 3.13 for the control test
- the other machine should run kernel 3.18/3.19/4.0 for the other test
- the two hosts should be very lightly loaded
- connect the machines through a gigabit switch

control test:
ping 172.16.0.1
result: response below 0.5 milliseconds

second test:
ping 172.16.0.1
result: response above 1 millisecond, sometimes 2 milliseconds or worse

Seeing worse latency under load (20-100 milliseconds) isn't uncommon. I believe this to be a regression in the kernel or in the network drivers.
* Re: [Cake] Cake3 - source code and some questions
From: Stephen Hemminger @ 2015-04-23 0:45 UTC
To: Adrian Popescu; +Cc: cake

On Thu, 23 Apr 2015 00:02:35 +0300 Adrian Popescu <adriannnpopescu@gmail.com> wrote:

> e1000e (82574, dual port and quad port server e1000e adapters with
> various Intel chips) is exhibiting varying latency based on the used
> kernel.
>
> Local network ping latencies on Ubuntu 14.04 with its LTS 3.13 kernel
> are always below 0.5 milliseconds when idle and lightly loaded. The
> latest kernels from kernel.org are unable to match those latencies.
> These modern kernels are always seeing latencies of 2-3 milliseconds.
>
> Turning off all offloads on the involved ethernet e1000e network
> interfaces doesn't help. This is all physical Intel hardware with
> e1000e interfaces.
>
> Two idle hosts on a local network should be able to ping each other
> and get latencies which are always less than one millisecond. FreeBSD
> doesn't have this problem.

These NICs have had a history of power management related issues. I suspect some power management (maybe even in SMI) is turning off parts of the chips, and they are taking a long time to turn back on.
* Re: [Cake] Cake3 - source code and some questions
From: Toke Høiland-Jørgensen @ 2015-04-23 9:01 UTC
To: Adrian Popescu; +Cc: cake

Adrian Popescu <adriannnpopescu@gmail.com> writes:

> Seeing worse latency under load (20-100 milliseconds) isn't uncommon.
> I believe this to be a regression in the kernel or in the network
> drivers.

I don't see this behaviour at all:

$ ls -l /sys/class/net/enp0s25/device/driver :(
0 lrwxrwxrwx 1 root root 0 Apr 21 16:05 /sys/class/net/enp0s25/device/driver -> ../../../bus/pci/drivers/e1000e/

$ ping 130.243.26.1 -c 100 # this is my default gateway
..snip...
--- 130.243.26.1 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 99000ms
rtt min/avg/max/mdev = 0.341/0.801/28.260/2.769 ms

$ cat /proc/loadavg
9.29 8.43 5.30 14/508 6665

(yes, this is while running a cpu-hungry data processing application in the background on all eight cores)

$ uname -a
Linux alrua-kau 3.19.3-3-ARCH #1 SMP PREEMPT Wed Apr 8 14:10:00 CEST 2015 x86_64 GNU/Linux

$ tc qdisc show dev enp0s25
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

(hmm, why am I running pfifo_fast?)

Repeating with sch_fq:

$ ping 130.243.26.1 -c 100
...snip...
--- 130.243.26.1 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 98998ms
rtt min/avg/max/mdev = 0.358/0.468/1.278/0.151 ms

-Toke
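When comparing runs like the two above, it can help to pull the numbers out of the ping summary line rather than read them by eye. The following is a minimal sketch that assumes the `rtt min/avg/max/mdev = ... ms` summary format shown in the message; other ping implementations print a differently worded summary, which this would not match. The function name `parse_ping_rtt` is an invention for this example, and the two sample lines are copied from the message above.

```python
import re

# Matches the summary line format shown in the thread.
_RTT_RE = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_ping_rtt(output: str) -> dict:
    """Extract min/avg/max/mdev (in milliseconds) from ping's summary line."""
    match = _RTT_RE.search(output)
    if match is None:
        raise ValueError("no rtt summary line found")
    names = ("min", "avg", "max", "mdev")
    return dict(zip(names, map(float, match.groups())))


if __name__ == "__main__":
    runs = {
        "pfifo_fast": "rtt min/avg/max/mdev = 0.341/0.801/28.260/2.769 ms",
        "sch_fq": "rtt min/avg/max/mdev = 0.358/0.468/1.278/0.151 ms",
    }
    for qdisc, line in runs.items():
        r = parse_ping_rtt(line)
        print(f"{qdisc:>10}: avg {r['avg']:.3f} ms, "
              f"max {r['max']:.3f} ms, mdev {r['mdev']:.3f} ms")
```

The minimum RTTs of the two runs are nearly identical; the difference is in the maximum and the deviation, which is why a single outlier-sensitive figure such as the average should not be read in isolation.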
* Re: [Cake] Cake3 - source code and some questions
From: Adrian Popescu @ 2015-04-23 10:56 UTC
To: Toke Høiland-Jørgensen; +Cc: cake

Hello Toke,

Thanks to your experiment and your statement regarding CPU load on your box during testing, I was able to fix the problem.

It looks like this problem was being caused by power saving. Something changed between the older kernels and the newer ones. Changing the power saving settings in the BIOS brings latency back below 0.5 milliseconds.

This might have an impact on some benchmarks which don't load up all CPU cores or which don't need a lot of CPU power. This is certainly something to keep an eye on when doing any kind of testing involving really low latencies or network schedulers.
* Re: [Cake] Cake3 - source code and some questions 2015-04-23 10:56 ` Adrian Popescu @ 2015-04-23 11:01 ` Toke Høiland-Jørgensen 2015-04-23 11:05 ` Adrian Popescu 0 siblings, 1 reply; 18+ messages in thread From: Toke Høiland-Jørgensen @ 2015-04-23 11:01 UTC (permalink / raw) To: Adrian Popescu; +Cc: cake Adrian Popescu <adriannnpopescu@gmail.com> writes: > Thanks to your experiment and your statement regarding CPU load on > your box during testing, I was able to fix the problem. Cool! > It looks like this problem was being caused by power saving. Something > changed between the older kernels and the newer ones. Changing the > power saving settings in the BIOS brings back latency below 0.5 > milliseconds. So is this the PCI bus power saving settings, or the CPU, or? > This might have an impact some benchmarks which don't load up all CPU > cores or which don't need a lot of CPU power. This is certainly > something to keep an eye on when doing any kind of testing involving > really low latencies or network schedulers. Yes, definitely. Having things be worse during idle is definitely not optimal. I wonder if there's a kernel-level setting that can affect this? -Toke ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cake] Cake3 - source code and some questions 2015-04-23 11:01 ` Toke Høiland-Jørgensen @ 2015-04-23 11:05 ` Adrian Popescu 2015-04-23 11:09 ` Toke Høiland-Jørgensen 0 siblings, 1 reply; 18+ messages in thread From: Adrian Popescu @ 2015-04-23 11:05 UTC (permalink / raw) To: Toke Høiland-Jørgensen; +Cc: cake The problems I was seeing were related to c-states. PCI-E bus ASPM power saving is disabled for the e1000e network interfaces. This can be observed using dmesg. Perhaps using a CPU which is low power enough for a router would help avoid the need of deep sleep power states and other things such as speedstep. On Thu, Apr 23, 2015 at 2:01 PM, Toke Høiland-Jørgensen <toke@toke.dk> wrote: > Adrian Popescu <adriannnpopescu@gmail.com> writes: > >> Thanks to your experiment and your statement regarding CPU load on >> your box during testing, I was able to fix the problem. > > Cool! > >> It looks like this problem was being caused by power saving. Something >> changed between the older kernels and the newer ones. Changing the >> power saving settings in the BIOS brings back latency below 0.5 >> milliseconds. > > So is this the PCI bus power saving settings, or the CPU, or? > >> This might have an impact some benchmarks which don't load up all CPU >> cores or which don't need a lot of CPU power. This is certainly >> something to keep an eye on when doing any kind of testing involving >> really low latencies or network schedulers. > > Yes, definitely. Having things be worse during idle is definitely not > optimal. I wonder if there's a kernel-level setting that can affect this? > > -Toke ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Cake] Cake3 - source code and some questions 2015-04-23 11:05 ` Adrian Popescu @ 2015-04-23 11:09 ` Toke Høiland-Jørgensen 2015-04-23 11:13 ` Jonathan Morton 0 siblings, 1 reply; 18+ messages in thread From: Toke Høiland-Jørgensen @ 2015-04-23 11:09 UTC (permalink / raw) To: Adrian Popescu; +Cc: cake Adrian Popescu <adriannnpopescu@gmail.com> writes: > The problems I was seeing were related to c-states. Ah, right, so you turned off power saving for the CPU entirely? > Perhaps using a CPU which is low power enough for a router would help > avoid the need of deep sleep power states and other things such as > speedstep. Well shouldn't it be possible to do something more intelligent with the power saving to avoid this problem? -Toke ^ permalink raw reply [flat|nested] 18+ messages in thread
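Note on the question of a smarter, kernel-level alternative to switching CPU power saving off in the BIOS: Linux exposes a PM QoS device node, /dev/cpu_dma_latency. A process writes a 32-bit latency bound in microseconds to it, and the kernel avoids idle states whose exit latency exceeds that bound for as long as the file descriptor stays open. The following is a minimal Python sketch of that mechanism, assuming root privileges. The names encode_latency and hold_cpu_latency, and the path parameter, are illustrative and not taken from the thread. Nobody in the thread tested whether this would have removed the latency Adrian measured.

```python
import contextlib
import os
import struct

PM_QOS_PATH = "/dev/cpu_dma_latency"  # Linux PM QoS device node


def encode_latency(max_latency_us: int) -> bytes:
    """Pack the latency bound as a native 32-bit signed integer."""
    if max_latency_us < 0:
        raise ValueError("latency bound must be non-negative")
    return struct.pack("=i", max_latency_us)


@contextlib.contextmanager
def hold_cpu_latency(max_latency_us: int = 0, path: str = PM_QOS_PATH):
    """Hold a C-state exit-latency bound while the with-block runs.

    The kernel drops the request when the descriptor is closed, so the
    file must stay open for the whole measurement. `path` is a parameter
    only so the sketch can be tried against a scratch file (an assumption
    made for illustration).
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, encode_latency(max_latency_us))
        yield
    finally:
        os.close(fd)
```

A latency test would then run inside `with hold_cpu_latency(0): ...`, which leaves the BIOS settings alone and restores normal power saving as soon as the block exits.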
* Re: [Cake] Cake3 - source code and some questions 2015-04-23 11:09 ` Toke Høiland-Jørgensen @ 2015-04-23 11:13 ` Jonathan Morton 0 siblings, 0 replies; 18+ messages in thread From: Jonathan Morton @ 2015-04-23 11:13 UTC (permalink / raw) To: Toke Høiland-Jørgensen; +Cc: cake [-- Attachment #1: Type: text/plain, Size: 245 bytes --] C-states refer to various levels of sleep, which take time for the CPU to wake up from. On modern Intel CPUs, changing frequency is virtually a free action, so allowing it to use a lower frequency during idle should be fine. - Jonathan Morton [-- Attachment #2: Type: text/html, Size: 284 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
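Note on the wake-up cost of C-states described above: on Linux, the cpuidle sysfs directory lists each idle state of a CPU together with its exit latency in microseconds. The following Python sketch reads those files for one CPU. The function name list_idle_states and its directory parameter are illustrative assumptions. The directory is absent on systems without a cpuidle driver, in which case the function returns an empty list.

```python
from pathlib import Path

# Per-CPU cpuidle directory; cpu0 is used as a representative core.
CPUIDLE_DIR = Path("/sys/devices/system/cpu/cpu0/cpuidle")


def list_idle_states(cpuidle_dir=CPUIDLE_DIR):
    """Return (name, exit_latency_us, disabled) for each idle state."""
    state_dirs = sorted(
        Path(cpuidle_dir).glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),  # numeric order: state2 before state10
    )
    states = []
    for state in state_dirs:
        name = (state / "name").read_text().strip()
        latency_us = int((state / "latency").read_text())
        disabled = (state / "disable").read_text().strip() == "1"
        states.append((name, latency_us, disabled))
    return states


if __name__ == "__main__":
    for name, latency_us, disabled in list_idle_states():
        flag = " (disabled)" if disabled else ""
        print(f"{name}: wake-up latency {latency_us} us{flag}")
```

Comparing the listed exit latencies with the measured ping times gives a rough idea of whether deep idle states are large enough to matter for a given test.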
* Re: [Cake] Cake3 - source code and some questions 2015-04-16 13:25 ` Jonathan Morton 2015-04-16 13:48 ` Adrian Popescu @ 2015-04-16 13:49 ` Sebastian Moeller 1 sibling, 0 replies; 18+ messages in thread From: Sebastian Moeller @ 2015-04-16 13:49 UTC (permalink / raw) To: Jonathan Morton; +Cc: cake Hi Jonathan, On Apr 16, 2015, at 15:25 , Jonathan Morton <chromatix99@gmail.com> wrote: > >> On 16 Apr, 2015, at 15:14, Adrian Popescu <adriannnpopescu@gmail.com> wrote: >> >> Will the changes to codel made by cake be put into fq_codel? > > This might be a more complex question than you realise. > > The most likely feature of cake to be implemented in fq_codel might be the set-associative hash, since it’s almost a pure win. That would almost be a cut-and-paste operation, but due to fq_codel’s de-facto status as a “standard candle” in research, it would need to be made configurable, at least to make turning it off easy. And that isn’t really a “codel” feature change, since it influences the FQ layer exclusively. > > The codel parameter tuning done by cake isn’t applicable to fq_codel, because the bandwidth information that this tuning relies on isn’t available (not even when it’s stacked with HTB). That’s why cake defaults to something very like the standard codel parameters when the internal shaper is disabled (“unlimited” mode), That makes me wonder, is there a way to specify “fixed” target and interval values for the codel part of cake sort of to override the default automatic selection while still using the shaper? This might make for a compelling demonstration of the beauty of the automatic mode to convince skeptics. Best Regards Sebastian > and that in turn is one reason why those defaults are also used at "sufficiently high” bandwidths, so that there isn’t a sharp discontinuity in the behaviour when the bandwidth is increased beyond the link rate and on to infinity (unlimited mode actually works by setting the shaper to infinite bandwidth, ie. zero time per byte). 
The other reason, as I previously noted, is because the parameters depend on the total RTT as well as the packet rate. > > Which leaves algorithmic changes to codel itself. It’s certainly possible to drop these (fairly subtle) changes in, but we should probably spend some more time measuring the effects of these changes and finalising them. We’re considering doing a major refactor of the code, which might make it harder to perform a drop-in replacement. In any case, FQ does mean that codel’s precise behaviour is less critical than it might otherwise be, and there are valid arguments - such as the “standard candle” one - for leaving it alone. > > - Jonathan Morton > > _______________________________________________ > Cake mailing list > Cake@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/cake ^ permalink raw reply [flat|nested] 18+ messages in thread
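Note on the "zero time per byte" remark quoted above: a shaper that charges each packet its length multiplied by a per-byte time becomes a no-op when that per-byte time is zero, which is one way to read why unlimited mode needs no special case. The following Python sketch is a toy model of that idea only. It is not cake's code, and ToyShaper, time_per_byte_ns and the use of None for "unlimited" are assumptions made for illustration.

```python
from typing import Optional

NSEC_PER_SEC = 1_000_000_000


def time_per_byte_ns(rate_bytes_per_sec: Optional[float]) -> float:
    """Serialisation time for one byte; None stands for unlimited."""
    if rate_bytes_per_sec is None:
        return 0.0  # infinite bandwidth, i.e. zero time per byte
    if rate_bytes_per_sec <= 0:
        raise ValueError("rate must be positive")
    return NSEC_PER_SEC / rate_bytes_per_sec


class ToyShaper:
    """Virtual-clock shaper: each packet pushes the next send time forward."""

    def __init__(self, rate_bytes_per_sec: Optional[float] = None):
        self.ns_per_byte = time_per_byte_ns(rate_bytes_per_sec)
        self.next_send_ns = 0.0

    def schedule(self, now_ns: float, packet_len: int) -> float:
        """Return the earliest time at which this packet may be sent."""
        send_at = max(now_ns, self.next_send_ns)
        # With ns_per_byte == 0 the clock never runs ahead of `now_ns`,
        # so packets are never delayed.
        self.next_send_ns = send_at + packet_len * self.ns_per_byte
        return send_at
```

With a rate of 1,000,000 bytes per second, two back-to-back 1500-byte packets are scheduled 1.5 ms apart. With no rate set, both are scheduled immediately.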
end of thread, other threads:[~2015-04-23 11:13 UTC | newest] Thread overview: 18+ messages 2015-04-12 9:39 [Cake] Cake3 - source code and some questions Adrian Popescu 2015-04-12 9:58 ` Jonathan Morton 2015-04-12 10:24 ` Jonathan Morton 2015-04-12 12:33 ` Adrian Popescu 2015-04-12 18:57 ` Jonathan Morton 2015-04-16 12:14 ` Adrian Popescu 2015-04-16 13:25 ` Jonathan Morton 2015-04-16 13:48 ` Adrian Popescu 2015-04-16 19:26 ` Dave Taht 2015-04-22 21:02 ` Adrian Popescu 2015-04-23 0:45 ` Stephen Hemminger 2015-04-23 9:01 ` Toke Høiland-Jørgensen 2015-04-23 10:56 ` Adrian Popescu 2015-04-23 11:01 ` Toke Høiland-Jørgensen 2015-04-23 11:05 ` Adrian Popescu 2015-04-23 11:09 ` Toke Høiland-Jørgensen 2015-04-23 11:13 ` Jonathan Morton 2015-04-16 13:49 ` Sebastian Moeller