* [Bloat] Initial tests with BBR in kernel 4.9
@ 2017-01-25 20:54 Hans-Kristian Bakke
  2017-01-25 21:00 ` Jonathan Morton
  ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Hans-Kristian Bakke @ 2017-01-25 20:54 UTC (permalink / raw)
To: bloat

Hi

Kernel 4.9 finally landed in Debian testing, so I could finally test BBR in a real-life environment that I have struggled to get any kind of performance out of.

The challenge at hand is UDP-based OpenVPN through Europe at around 35 ms RTT to my VPN provider, with plenty of bandwidth available at both ends and everything completely unknown in between. After tuning the UDP buffers up to make room for my 500 mbit/s symmetrical bandwidth at 35 ms, the download part seemed to work nicely at an unreliable 150 to 300 mbit/s, while the upload was stuck at 30 to 60 mbit/s.

Just by activating BBR, the bandwidth instantly shot up to around 150 mbit/s using a fat TCP test to a public iperf3 server located near my VPN exit point in the Netherlands. Replace BBR with CUBIC again and the performance is once again all over the place, ranging from very bad to bad, but never better than 1/3 of BBR's "steady state". In other words: instant win!

However, given the requirement of fq and pacing for BBR, and noticing that I am running pfifo_fast within a VM with a virtio NIC on a Proxmox VE host with fq_codel on all physical interfaces, I was surprised to see that it worked so well. I then replaced pfifo_fast with fq and the performance went right down to only 1-4 mbit/s from around 150 mbit/s. Removing fq again regained the performance at once.

I have some questions for you guys who know a lot more about these things than I do:
1. Do fq (and fq_codel) even work reliably in a VM? What is the best choice of default qdisc to use in a VM in general?
2. Why does BBR immediately "fix" all my issues with upload through that "unreliable" big-BDP link with pfifo_fast, when fq pacing is a requirement?
3. Could fq_codel on the physical host be the reason that it still works?
4. Does BBR _only_ work with fq pacing, or could fq_codel be used as a replacement?
5. Is BBR perhaps modified to do the right thing without having to change the qdisc in the current kernel 4.9?

Sorry for the long post, but this is an interesting topic!

Regards,
Hans-Kristian Bakke

^ permalink raw reply	[flat|nested] 31+ messages in thread
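For reference, switching a Linux host over to BBR with fq as described above amounts to something like the following. This is only a sketch: it assumes a reasonably recent Debian-style system, and the interface name ens18 is a placeholder for whatever NIC or virtio device is actually in use.

  modprobe tcp_bbr
  sysctl -w net.ipv4.tcp_congestion_control=bbr
  sysctl -w net.core.default_qdisc=fq       # applies to qdiscs attached after this point
  tc qdisc replace dev ens18 root fq        # switch an already-up interface immediately
  sysctl net.ipv4.tcp_congestion_control    # verify the congestion control in use
  tc qdisc show dev ens18                   # verify fq is attached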
* Re: [Bloat] Initial tests with BBR in kernel 4.9
  2017-01-25 20:54 [Bloat] Initial tests with BBR in kernel 4.9 Hans-Kristian Bakke
@ 2017-01-25 21:00 ` Jonathan Morton
  [not found]   ` <CAD_cGvHKw6upOCzDbHLZMSYdyuBHGyo4baaPqM7r=VvMzRFVtg@mail.gmail.com>
  2017-01-25 21:03 ` Hans-Kristian Bakke
  2017-01-25 22:01 ` Neal Cardwell
  2 siblings, 1 reply; 31+ messages in thread
From: Jonathan Morton @ 2017-01-25 21:00 UTC (permalink / raw)
To: Hans-Kristian Bakke; +Cc: bloat

> On 25 Jan, 2017, at 22:54, Hans-Kristian Bakke <hkbakke@gmail.com> wrote:
>
> 4. Do BBR _only_ work with fq pacing or could fq_codel be used as a replacement?

Without pacing, it is not BBR as specified.  Currently, only the fq qdisc implements pacing.

AFAIK, you need a working HPET for pacing to work correctly.  A low-precision timer would be a good explanation for low throughput under pacing.

 - Jonathan Morton

^ permalink raw reply	[flat|nested] 31+ messages in thread
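A quick way to confirm which qdisc is actually attached inside the guest, and whether fq is throttling (pacing) packets at all, is the following; ens18 is again just a placeholder for the VM's interface name.

  tc qdisc show dev ens18
  tc -s qdisc show dev ens18    # with fq attached, the statistics include a "throttled" packet count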
* [Bloat] Fwd: Initial tests with BBR in kernel 4.9
  [not found] ` <CAD_cGvHKw6upOCzDbHLZMSYdyuBHGyo4baaPqM7r=VvMzRFVtg@mail.gmail.com>
@ 2017-01-25 21:09 ` Hans-Kristian Bakke
  [not found]   ` <908CA0EF-3D84-4EB4-ABD8-3042668E842E@gmail.com>
  1 sibling, 0 replies; 31+ messages in thread
From: Hans-Kristian Bakke @ 2017-01-25 21:09 UTC (permalink / raw)
To: bloat

Thank you. Do I understand correctly that fq is really just hit and miss within a VM in general, then? Is there no advantage to the fair-queuing part even with a low-precision clock?

On 25 January 2017 at 22:00, Jonathan Morton <chromatix99@gmail.com> wrote:
> > On 25 Jan, 2017, at 22:54, Hans-Kristian Bakke <hkbakke@gmail.com> wrote:
> >
> > 4. Do BBR _only_ work with fq pacing or could fq_codel be used as a replacement?
>
> Without pacing, it is not BBR as specified.  Currently, only the fq qdisc implements pacing.
>
> AFAIK, you need a working HPET for pacing to work correctly.  A low-precision timer would be a good explanation for low throughput under pacing.
>
> - Jonathan Morton

^ permalink raw reply	[flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9
  [not found] ` <908CA0EF-3D84-4EB4-ABD8-3042668E842E@gmail.com>
@ 2017-01-25 21:13 ` Hans-Kristian Bakke
  2017-01-25 21:17   ` Jonathan Morton
  0 siblings, 1 reply; 31+ messages in thread
From: Hans-Kristian Bakke @ 2017-01-25 21:13 UTC (permalink / raw)
To: bloat

dmesg | grep HPET
[    0.000000] ACPI: HPET 0x00000000BFFE274F 000038 (v01 BOCHS  BXPCHPET 00000001 BXPC 00000001)
[    0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000

I do indeed seem to have an HPET in my VM. Does that mean that I should be able to use fq as intended, or could the HPET be some kind of virtualized device?

Regards,
Hans-Kristian

On 25 January 2017 at 22:09, Jonathan Morton <chromatix99@gmail.com> wrote:
> > On 25 Jan, 2017, at 23:05, Hans-Kristian Bakke <hkbakke@gmail.com> wrote:
> >
> > Do I understand correctly that fq is really just hit and miss within a VM in general then? Is there no advantage to the fair queing part even with a low-precision clock?
>
> First, check using dmesg or whatever that you do, or do not, have a working HPET within your VM.
>
> If this is a widespread problem, I could concoct a patch to sch_fq which compensates for it.  I already fixed the same problem when using part of sch_fq as a basis for part of sch_cake, and demonstrated correct operation on an old, slow PC without an HPET.
>
> - Jonathan Morton

^ permalink raw reply	[flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 21:13 ` [Bloat] " Hans-Kristian Bakke @ 2017-01-25 21:17 ` Jonathan Morton 2017-01-25 21:20 ` Hans-Kristian Bakke 0 siblings, 1 reply; 31+ messages in thread From: Jonathan Morton @ 2017-01-25 21:17 UTC (permalink / raw) To: Hans-Kristian Bakke; +Cc: bloat > On 25 Jan, 2017, at 23:13, Hans-Kristian Bakke <hkbakke@gmail.com> wrote: > > dmesg | grep HPET > [ 0.000000] ACPI: HPET 0x00000000BFFE274F 000038 (v01 BOCHS BXPCHPET 00000001 BXPC 00000001) > [ 0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000 > > I seem to indeed have a HPET in my VM. Does that mean that I should be able to use fq as intended or could the HPET be some kind of virtualized device? Try “dmesg | fgrep -i hpet” - that’ll also tell you whether you have drivers for your HPET device, and whether it is being used. - Jonathan Morton ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 21:17 ` Jonathan Morton @ 2017-01-25 21:20 ` Hans-Kristian Bakke 2017-01-25 21:26 ` Jonathan Morton 0 siblings, 1 reply; 31+ messages in thread From: Hans-Kristian Bakke @ 2017-01-25 21:20 UTC (permalink / raw) To: Jonathan Morton; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 1284 bytes --] This is the output from "dmesg | fgrep -i hpet": [ 0.000000] ACPI: HPET 0x00000000BFFE274F 000038 (v01 BOCHS BXPCHPET 00000001 BXPC 00000001) [ 0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000 [ 0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns [ 0.000000] hpet clockevent registered [ 0.362335] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0 [ 0.362339] hpet0: 3 comparators, 64-bit 100.000000 MHz counter [ 0.661731] rtc_cmos 00:00: alarms up to one day, y3k, 114 bytes nvram, hpet irqs On 25 January 2017 at 22:17, Jonathan Morton <chromatix99@gmail.com> wrote: > > > On 25 Jan, 2017, at 23:13, Hans-Kristian Bakke <hkbakke@gmail.com> > wrote: > > > > dmesg | grep HPET > > [ 0.000000] ACPI: HPET 0x00000000BFFE274F 000038 (v01 BOCHS BXPCHPET > 00000001 BXPC 00000001) > > [ 0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000 > > > > I seem to indeed have a HPET in my VM. Does that mean that I should be > able to use fq as intended or could the HPET be some kind of virtualized > device? > > Try “dmesg | fgrep -i hpet” - that’ll also tell you whether you have > drivers for your HPET device, and whether it is being used. > > - Jonathan Morton > > [-- Attachment #2: Type: text/html, Size: 2712 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9
  2017-01-25 21:20 ` Hans-Kristian Bakke
@ 2017-01-25 21:26   ` Jonathan Morton
  2017-01-25 21:29     ` Hans-Kristian Bakke
  0 siblings, 1 reply; 31+ messages in thread
From: Jonathan Morton @ 2017-01-25 21:26 UTC (permalink / raw)
To: Hans-Kristian Bakke; +Cc: bloat

> On 25 Jan, 2017, at 23:20, Hans-Kristian Bakke <hkbakke@gmail.com> wrote:
>
> [    0.000000] ACPI: HPET 0x00000000BFFE274F 000038 (v01 BOCHS  BXPCHPET 00000001 BXPC 00000001)
> [    0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000
> [    0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
> [    0.000000] hpet clockevent registered
> [    0.362335] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
> [    0.362339] hpet0: 3 comparators, 64-bit 100.000000 MHz counter
> [    0.661731] rtc_cmos 00:00: alarms up to one day, y3k, 114 bytes nvram, hpet irqs

Conspicuously absent here is a line saying “clocksource: Switched to clocksource hpet”.  That may be worth examining in more detail.

 - Jonathan Morton

^ permalink raw reply	[flat|nested] 31+ messages in thread
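The clocksource that is actually in use, and the alternatives the kernel knows about, can also be read directly from sysfs instead of digging through the boot log:

  cat /sys/devices/system/clocksource/clocksource0/current_clocksource
  cat /sys/devices/system/clocksource/clocksource0/available_clocksource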
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 21:26 ` Jonathan Morton @ 2017-01-25 21:29 ` Hans-Kristian Bakke 2017-01-25 21:31 ` Hans-Kristian Bakke 0 siblings, 1 reply; 31+ messages in thread From: Hans-Kristian Bakke @ 2017-01-25 21:29 UTC (permalink / raw) To: Jonathan Morton; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 1809 bytes --] Actually I think that is because it may be using the newer TSC: dmesg | grep clocksource [ 0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns [ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns [ 0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns [ 0.092665] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns [ 0.366429] clocksource: Switched to clocksource kvm-clock [ 0.378974] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns [ 1.666474] tsc: Refined TSC clocksource calibration: 3200.013 MHz [ 1.666479] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x2e20562a1bb, max_idle_ns: 440795285529 ns On 25 January 2017 at 22:26, Jonathan Morton <chromatix99@gmail.com> wrote: > > > On 25 Jan, 2017, at 23:20, Hans-Kristian Bakke <hkbakke@gmail.com> > wrote: > > > > [ 0.000000] ACPI: HPET 0x00000000BFFE274F 000038 (v01 BOCHS > BXPCHPET 00000001 BXPC 00000001) > > [ 0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000 > > [ 0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: > 0xffffffff, max_idle_ns: 19112604467 ns > > [ 0.000000] hpet clockevent registered > > [ 0.362335] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0 > > [ 0.362339] hpet0: 3 comparators, 64-bit 100.000000 MHz counter > > [ 0.661731] rtc_cmos 00:00: alarms up to one day, y3k, 114 bytes > nvram, hpet irqs > > Conspicuously absent here is a line saying “clocksource: Switched to > clocksource hpet”. That may be worth examining in more detail. > > - Jonathan Morton > > [-- Attachment #2: Type: text/html, Size: 2733 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 21:29 ` Hans-Kristian Bakke @ 2017-01-25 21:31 ` Hans-Kristian Bakke 2017-01-25 21:42 ` Jonathan Morton 2017-01-25 21:48 ` Eric Dumazet 0 siblings, 2 replies; 31+ messages in thread From: Hans-Kristian Bakke @ 2017-01-25 21:31 UTC (permalink / raw) To: Jonathan Morton; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 2324 bytes --] kvm-clock is a paravirtualized clock that seems to use the CPUs TSC capabilities if they exist. But it may not be perfect: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/chap-Virtualization_Host_Configuration_and_Guest_Installation_Guide-KVM_guest_timing_management.html On 25 January 2017 at 22:29, Hans-Kristian Bakke <hkbakke@gmail.com> wrote: > Actually I think that is because it may be using the newer TSC: > dmesg | grep clocksource > [ 0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff > max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns > [ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: > 0xffffffff, max_idle_ns: 7645519600211568 ns > [ 0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, > max_idle_ns: 19112604467 ns > [ 0.092665] clocksource: jiffies: mask: 0xffffffff max_cycles: > 0xffffffff, max_idle_ns: 7645041785100000 ns > [ 0.366429] clocksource: Switched to clocksource kvm-clock > [ 0.378974] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, > max_idle_ns: 2085701024 ns > [ 1.666474] tsc: Refined TSC clocksource calibration: 3200.013 MHz > [ 1.666479] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: > 0x2e20562a1bb, max_idle_ns: 440795285529 ns > > > On 25 January 2017 at 22:26, Jonathan Morton <chromatix99@gmail.com> > wrote: > >> >> > On 25 Jan, 2017, at 23:20, Hans-Kristian Bakke <hkbakke@gmail.com> >> wrote: >> > >> > [ 0.000000] ACPI: HPET 0x00000000BFFE274F 000038 (v01 BOCHS >> BXPCHPET 00000001 BXPC 00000001) >> > [ 0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000 >> > [ 0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: >> 0xffffffff, max_idle_ns: 19112604467 ns >> > [ 0.000000] hpet clockevent registered >> > [ 0.362335] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0 >> > [ 0.362339] hpet0: 3 comparators, 64-bit 100.000000 MHz counter >> > [ 0.661731] rtc_cmos 00:00: alarms up to one day, y3k, 114 bytes >> nvram, hpet irqs >> >> Conspicuously absent here is a line saying “clocksource: Switched to >> clocksource hpet”. That may be worth examining in more detail. >> >> - Jonathan Morton >> >> > [-- Attachment #2: Type: text/html, Size: 4046 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 21:31 ` Hans-Kristian Bakke @ 2017-01-25 21:42 ` Jonathan Morton 2017-01-25 21:48 ` Eric Dumazet 1 sibling, 0 replies; 31+ messages in thread From: Jonathan Morton @ 2017-01-25 21:42 UTC (permalink / raw) To: Hans-Kristian Bakke; +Cc: bloat > On 25 Jan, 2017, at 23:31, Hans-Kristian Bakke <hkbakke@gmail.com> wrote: > > kvm-clock is a paravirtualized clock that seems to use the CPUs TSC capabilities if they exist. But it may not be perfect: The key capability sch_fq needs for pacing is to generate timer interrupts at precise, sub-millisecond intervals. TSC doesn’t provide that; it’s just a register inside the CPU which can be read efficiently on demand. It is however possible that kvm-clock provides that capability via the underlying HPET hardware. - Jonathan Morton ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 21:31 ` Hans-Kristian Bakke 2017-01-25 21:42 ` Jonathan Morton @ 2017-01-25 21:48 ` Eric Dumazet 2017-01-25 22:03 ` Hans-Kristian Bakke 1 sibling, 1 reply; 31+ messages in thread From: Eric Dumazet @ 2017-01-25 21:48 UTC (permalink / raw) To: Hans-Kristian Bakke; +Cc: Jonathan Morton, bloat I do not know any particular issues with FQ in VM If you have a recent tc binary (iproute2 package) you can get some infos, as mentioned in this commit changelog : https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=fefa569a9d4bc4b7758c0fddd75bb0382c95da77 $ tc -s qd sh dev eth0 | grep latency 0 gc, 0 highprio, 32490767 throttled, 2382 ns latency On Wed, 2017-01-25 at 22:31 +0100, Hans-Kristian Bakke wrote: > kvm-clock is a paravirtualized clock that seems to use the CPUs TSC > capabilities if they exist. But it may not be perfect: > > > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/chap-Virtualization_Host_Configuration_and_Guest_Installation_Guide-KVM_guest_timing_management.html > > > On 25 January 2017 at 22:29, Hans-Kristian Bakke <hkbakke@gmail.com> > wrote: > Actually I think that is because it may be using the newer > TSC: > dmesg | grep clocksource > [ 0.000000] clocksource: kvm-clock: mask: > 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: > 881590591483 ns > [ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff > max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns > [ 0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: > 0xffffffff, max_idle_ns: 19112604467 ns > [ 0.092665] clocksource: jiffies: mask: 0xffffffff > max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns > [ 0.366429] clocksource: Switched to clocksource kvm-clock > [ 0.378974] clocksource: acpi_pm: mask: 0xffffff > max_cycles: 0xffffff, max_idle_ns: 2085701024 ns > [ 1.666474] tsc: Refined TSC clocksource calibration: > 3200.013 MHz > [ 1.666479] clocksource: tsc: mask: 0xffffffffffffffff > max_cycles: 0x2e20562a1bb, max_idle_ns: 440795285529 ns > > > > On 25 January 2017 at 22:26, Jonathan Morton > <chromatix99@gmail.com> wrote: > > > On 25 Jan, 2017, at 23:20, Hans-Kristian Bakke > <hkbakke@gmail.com> wrote: > > > > [ 0.000000] ACPI: HPET 0x00000000BFFE274F 000038 > (v01 BOCHS BXPCHPET 00000001 BXPC 00000001) > > [ 0.000000] ACPI: HPET id: 0x8086a201 base: > 0xfed00000 > > [ 0.000000] clocksource: hpet: mask: 0xffffffff > max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns > > [ 0.000000] hpet clockevent registered > > [ 0.362335] hpet0: at MMIO 0xfed00000, IRQs 2, 8, > 0 > > [ 0.362339] hpet0: 3 comparators, 64-bit > 100.000000 MHz counter > > [ 0.661731] rtc_cmos 00:00: alarms up to one day, > y3k, 114 bytes nvram, hpet irqs > > Conspicuously absent here is a line saying > “clocksource: Switched to clocksource hpet”. That may > be worth examining in more detail. > > - Jonathan Morton > > > > > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 21:48 ` Eric Dumazet @ 2017-01-25 22:03 ` Hans-Kristian Bakke 0 siblings, 0 replies; 31+ messages in thread From: Hans-Kristian Bakke @ 2017-01-25 22:03 UTC (permalink / raw) To: bloat [-- Attachment #1: Type: text/plain, Size: 4485 bytes --] Perhaps the mail didn't arrive properly, but the fq performance is okay now. I don't know why it was completely out for a couple of tests. It was most likely my mistake or some very bad timing for testing. I see that on my physical hosts tsc is also the default with hpet in the list of available clocksources, just as in the VM (where kvm-clock is a paravirtualized version of the host tsc) so the same behaviour is probably to be expected in the VM as on the physical hosts. As far as I know I have not seen that it is a requirement to actually run: echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource ... before using fq as most (physical) linux hosts use tsc as the default clock source today unless the kernel detects unreliabilities. On 25 January 2017 at 22:48, Eric Dumazet <eric.dumazet@gmail.com> wrote: > I do not know any particular issues with FQ in VM > > If you have a recent tc binary (iproute2 package) you can get some > infos, as mentioned in this commit changelog : > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/ > linux.git/commit/?id=fefa569a9d4bc4b7758c0fddd75bb0382c95da77 > > > $ tc -s qd sh dev eth0 | grep latency > 0 gc, 0 highprio, 32490767 throttled, 2382 ns latency > > > On Wed, 2017-01-25 at 22:31 +0100, Hans-Kristian Bakke wrote: > > kvm-clock is a paravirtualized clock that seems to use the CPUs TSC > > capabilities if they exist. But it may not be perfect: > > > > > > https://access.redhat.com/documentation/en-US/Red_Hat_ > Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_ > Installation_Guide/chap-Virtualization_Host_Configuration_and_Guest_ > Installation_Guide-KVM_guest_timing_management.html > > > > > > On 25 January 2017 at 22:29, Hans-Kristian Bakke <hkbakke@gmail.com> > > wrote: > > Actually I think that is because it may be using the newer > > TSC: > > dmesg | grep clocksource > > [ 0.000000] clocksource: kvm-clock: mask: > > 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: > > 881590591483 ns > > [ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff > > max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns > > [ 0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: > > 0xffffffff, max_idle_ns: 19112604467 ns > > [ 0.092665] clocksource: jiffies: mask: 0xffffffff > > max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns > > [ 0.366429] clocksource: Switched to clocksource kvm-clock > > [ 0.378974] clocksource: acpi_pm: mask: 0xffffff > > max_cycles: 0xffffff, max_idle_ns: 2085701024 ns > > [ 1.666474] tsc: Refined TSC clocksource calibration: > > 3200.013 MHz > > [ 1.666479] clocksource: tsc: mask: 0xffffffffffffffff > > max_cycles: 0x2e20562a1bb, max_idle_ns: 440795285529 ns > > > > > > > > On 25 January 2017 at 22:26, Jonathan Morton > > <chromatix99@gmail.com> wrote: > > > > > On 25 Jan, 2017, at 23:20, Hans-Kristian Bakke > > <hkbakke@gmail.com> wrote: > > > > > > [ 0.000000] ACPI: HPET 0x00000000BFFE274F 000038 > > (v01 BOCHS BXPCHPET 00000001 BXPC 00000001) > > > [ 0.000000] ACPI: HPET id: 0x8086a201 base: > > 0xfed00000 > > > [ 0.000000] clocksource: hpet: mask: 0xffffffff > > max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns > > > [ 0.000000] hpet clockevent registered > > > [ 
0.362335] hpet0: at MMIO 0xfed00000, IRQs 2, 8, > > 0 > > > [ 0.362339] hpet0: 3 comparators, 64-bit > > 100.000000 MHz counter > > > [ 0.661731] rtc_cmos 00:00: alarms up to one day, > > y3k, 114 bytes nvram, hpet irqs > > > > Conspicuously absent here is a line saying > > “clocksource: Switched to clocksource hpet”. That may > > be worth examining in more detail. > > > > - Jonathan Morton > > > > > > > > > > > > _______________________________________________ > > Bloat mailing list > > Bloat@lists.bufferbloat.net > > https://lists.bufferbloat.net/listinfo/bloat > > > [-- Attachment #2: Type: text/html, Size: 6940 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9
  2017-01-25 20:54 [Bloat] Initial tests with BBR in kernel 4.9 Hans-Kristian Bakke
  2017-01-25 21:00 ` Jonathan Morton
@ 2017-01-25 21:03 ` Hans-Kristian Bakke
  2017-01-25 22:01 ` Neal Cardwell
  2 siblings, 0 replies; 31+ messages in thread
From: Hans-Kristian Bakke @ 2017-01-25 21:03 UTC (permalink / raw)
To: bloat

I did some more testing with fq as a replacement for pfifo_fast and it now behaves just as well. It must have been some strange artifact. My questions still stand, however.

Regards,
Hans-Kristian

On 25 January 2017 at 21:54, Hans-Kristian Bakke <hkbakke@gmail.com> wrote:
> Hi
>
> Kernel 4.9 finally landed in Debian testing so I could finally test BBR in a real life environment that I have struggled with getting any kind of performance out of.
>
> The challenge at hand is UDP based OpenVPN through europe at around 35 ms rtt to my VPN-provider with plenty of available bandwith available in both ends and everything completely unknown in between. After tuning the UDP-buffers up to make room for my 500 mbit/s symmetrical bandwith at 35 ms the download part seemed to work nicely at an unreliable 150 to 300 mbit/s, while the upload was stuck at 30 to 60 mbit/s.
>
> Just by activating BBR the bandwith instantly shot up to around 150 mbit/s using a fat tcp test to a public iperf3 server located near my VPN exit point in the Netherlands. Replace BBR with qubic again and the performance is once again all over the place ranging from very bad to bad, but never better than 1/3 of BBRs "steady state". In other words "instant WIN!"
>
> However, seeing the requirement of fq and pacing for BBR and noticing that I am running pfifo_fast within a VM with virtio NIC on a Proxmox VE host with fq_codel on all physical interfaces, I was surprised to see that it worked so well.
> I then replaced pfifo_fast with fq and the performance went right down to only 1-4 mbit/s from around 150 mbit/s. Removing the fq again regained the performance at once.
>
> I have got some questions to you guys that know a lot more than me about these things:
> 1. Do fq (and fq_codel) even work reliably in a VM? What is the best choice for default qdisc to use in a VM in general?
> 2. Why do BBR immediately "fix" all my issues with upload through that "unreliable" big BDP link with pfifo_fast when fq pacing is a requirement?
> 3. Could fq_codel on the physical host be the reason that it still works?
> 4. Do BBR _only_ work with fq pacing or could fq_codel be used as a replacement?
> 5. Is BBR perhaps modified to do the right thing without having to change the qdisc in the current kernel 4.9?
>
> Sorry for long post, but this is an interesting topic!
>
> Regards,
> Hans-Kristian Bakke

^ permalink raw reply	[flat|nested] 31+ messages in thread
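For this kind of back-to-back comparison it can be convenient to let iperf3 pick the congestion control per test rather than flipping the system-wide sysctl between runs; recent iperf3 versions on Linux support this via -C/--congestion. The server name below is only a placeholder:

  iperf3 -c iperf.example.net -t 30 -C bbr
  iperf3 -c iperf.example.net -t 30 -C cubic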
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 20:54 [Bloat] Initial tests with BBR in kernel 4.9 Hans-Kristian Bakke 2017-01-25 21:00 ` Jonathan Morton 2017-01-25 21:03 ` Hans-Kristian Bakke @ 2017-01-25 22:01 ` Neal Cardwell 2017-01-25 22:02 ` Neal Cardwell ` (2 more replies) 2 siblings, 3 replies; 31+ messages in thread From: Neal Cardwell @ 2017-01-25 22:01 UTC (permalink / raw) To: Hans-Kristian Bakke; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 3540 bytes --] On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian Bakke <hkbakke@gmail.com> wrote: > Hi > > Kernel 4.9 finally landed in Debian testing so I could finally test BBR in > a real life environment that I have struggled with getting any kind of > performance out of. > > The challenge at hand is UDP based OpenVPN through europe at around 35 ms > rtt to my VPN-provider with plenty of available bandwith available in both > ends and everything completely unknown in between. After tuning the > UDP-buffers up to make room for my 500 mbit/s symmetrical bandwith at 35 ms > the download part seemed to work nicely at an unreliable 150 to 300 mbit/s, > while the upload was stuck at 30 to 60 mbit/s. > > Just by activating BBR the bandwith instantly shot up to around 150 mbit/s > using a fat tcp test to a public iperf3 server located near my VPN exit > point in the Netherlands. Replace BBR with qubic again and the performance > is once again all over the place ranging from very bad to bad, but never > better than 1/3 of BBRs "steady state". In other words "instant WIN!" > Glad to hear it. Thanks for the test report! > However, seeing the requirement of fq and pacing for BBR and noticing that > I am running pfifo_fast within a VM with virtio NIC on a Proxmox VE host > with fq_codel on all physical interfaces, I was surprised to see that it > worked so well. > I then replaced pfifo_fast with fq and the performance went right down to > only 1-4 mbit/s from around 150 mbit/s. Removing the fq again regained the > performance at once. > > I have got some questions to you guys that know a lot more than me about > these things: > 1. Do fq (and fq_codel) even work reliably in a VM? What is the best choice > for default qdisc to use in a VM in general? > Eric covered this one. We are not aware of specific issues with fq in VM environments. And we have tested that fq works sufficiently well on Google Cloud VMs. > 2. Why do BBR immediately "fix" all my issues with upload through that > "unreliable" big BDP link with pfifo_fast when fq pacing is a requirement? > For BBR, pacing is part of the design in order to make BBR more "gentle" in terms of the rate at which it sends, in order to put less pressure on buffers and keep packet loss lower. This is particularly important when a BBR flow is restarting from idle. In this case BBR starts with a full cwnd, and it counts on pacing to pace out the packets at the estimated bandwidth, so that the queue can stay relatively short and yet the pipe can be filled immediately. Running BBR without pacing makes BBR more aggressive, particularly in restarting from idle, but also in the steady state, where BBR tries to use pacing to keep the queue short. For bulk transfer tests with one flow, running BBR without pacing will likely cause higher queues and loss rates at the bottleneck, which may negatively impact other traffic sharing that bottleneck. > 3. Could fq_codel on the physical host be the reason that it still works? > Nope, fq_codel does not implement pacing. > 4. 
Do BBR _only_ work with fq pacing or could fq_codel be used as a > replacement? > Nope, BBR needs pacing to work correctly, and currently fq is the only Linux qdisc that implements pacing. > 5. Is BBR perhaps modified to do the right thing without having to change > the qdisc in the current kernel 4.9? > Nope. Linux 4.9 contains the initial public release of BBR from September 2016. And there have been no code changes since then (just expanded comments). Thanks for the test report! neal [-- Attachment #2: Type: text/html, Size: 5796 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
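Whether pacing is actually being applied to a given connection can also be inspected at the socket layer: ss reports the per-flow pacing rate, and with a new enough iproute2 on a BBR kernel it also shows BBR's bandwidth and min_rtt estimates. The address below is a placeholder for the iperf3 server:

  ss -tin dst 198.51.100.10
  # look for the "pacing_rate" field (and bbr info, if shown) for the test flow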
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 22:01 ` Neal Cardwell @ 2017-01-25 22:02 ` Neal Cardwell 2017-01-25 22:12 ` Hans-Kristian Bakke 2017-01-25 22:06 ` Steinar H. Gunderson 2017-01-25 22:38 ` Hans-Kristian Bakke 2 siblings, 1 reply; 31+ messages in thread From: Neal Cardwell @ 2017-01-25 22:02 UTC (permalink / raw) To: Hans-Kristian Bakke; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 527 bytes --] On Wed, Jan 25, 2017 at 5:01 PM, Neal Cardwell <ncardwell@google.com> wrote: > On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian Bakke <hkbakke@gmail.com> > wrote: > >> Hi >> >> Kernel 4.9 finally landed in Debian testing so I could finally test BBR >> in a real life environment that I have struggled with getting any kind of >> performance out of. >> > BTW, these kinds of posts are ideal for the bbr-dev list, if you have any future test reports or discussion items: https://groups.google.com/d/forum/bbr-dev Thanks, neal [-- Attachment #2: Type: text/html, Size: 1473 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 22:02 ` Neal Cardwell @ 2017-01-25 22:12 ` Hans-Kristian Bakke 0 siblings, 0 replies; 31+ messages in thread From: Hans-Kristian Bakke @ 2017-01-25 22:12 UTC (permalink / raw) To: Neal Cardwell; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 846 bytes --] Thank you for you comments. It makes sense to me now! I am not a member of that mailing list so feel free to repost or quote my experiences if you feel it's relevant. I will keep it in mind for the future though. On 25 January 2017 at 23:02, Neal Cardwell <ncardwell@google.com> wrote: > On Wed, Jan 25, 2017 at 5:01 PM, Neal Cardwell <ncardwell@google.com> > wrote: > >> On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian Bakke <hkbakke@gmail.com> >> wrote: >> >>> Hi >>> >>> Kernel 4.9 finally landed in Debian testing so I could finally test BBR >>> in a real life environment that I have struggled with getting any kind of >>> performance out of. >>> >> > BTW, these kinds of posts are ideal for the bbr-dev list, if you have any > future test reports or discussion items: > > https://groups.google.com/d/forum/bbr-dev > > Thanks, > neal > > [-- Attachment #2: Type: text/html, Size: 2268 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 22:01 ` Neal Cardwell 2017-01-25 22:02 ` Neal Cardwell @ 2017-01-25 22:06 ` Steinar H. Gunderson 2017-01-25 22:12 ` Eric Dumazet 2017-01-25 22:38 ` Hans-Kristian Bakke 2 siblings, 1 reply; 31+ messages in thread From: Steinar H. Gunderson @ 2017-01-25 22:06 UTC (permalink / raw) To: Neal Cardwell; +Cc: Hans-Kristian Bakke, bloat On Wed, Jan 25, 2017 at 05:01:04PM -0500, Neal Cardwell wrote: > Nope, BBR needs pacing to work correctly, and currently fq is the only > Linux qdisc that implements pacing. I really wish sch_fq was renamed sch_pacing :-) And of course that we had a single qdisc that was ideal for both end hosts and routers (especially since some machines act as both). /* Steinar */ -- Homepage: https://www.sesse.net/ ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9
  2017-01-25 22:06 ` Steinar H. Gunderson
@ 2017-01-25 22:12   ` Eric Dumazet
  2017-01-25 22:23     ` Steinar H. Gunderson
  0 siblings, 1 reply; 31+ messages in thread
From: Eric Dumazet @ 2017-01-25 22:12 UTC (permalink / raw)
To: Steinar H. Gunderson; +Cc: Neal Cardwell, bloat

On Wed, 2017-01-25 at 23:06 +0100, Steinar H. Gunderson wrote:
> On Wed, Jan 25, 2017 at 05:01:04PM -0500, Neal Cardwell wrote:
> > Nope, BBR needs pacing to work correctly, and currently fq is the only
> > Linux qdisc that implements pacing.
>
> I really wish sch_fq was renamed sch_pacing :-) And of course that we had a
> single qdisc that was ideal for both end hosts and routers (especially since
> some machines act as both).

Well, pacing is optional in sch_fq.

Only the FQ part is not optional.

So sch_fq is actually a proper name ;)

I have a few things on my plate:

1) Add a fallback to actually do pacing in TCP itself, if it detects that no
   pacing happens in a qdisc. This would be okay for devices that have very
   few local TCP flows (e.g. a router).

2) Add (optional) pacing to fq_codel. But this would come later.

^ permalink raw reply	[flat|nested] 31+ messages in thread
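Since pacing is a toggle on sch_fq, the two behaviours Eric describes can be compared directly by installing fq with or without pacing; the interface name is again just an example:

  tc qdisc replace dev ens18 root fq nopacing   # fair queuing only
  tc qdisc replace dev ens18 root fq pacing     # fair queuing plus pacing (the default)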
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 22:12 ` Eric Dumazet @ 2017-01-25 22:23 ` Steinar H. Gunderson 2017-01-25 22:27 ` Eric Dumazet 0 siblings, 1 reply; 31+ messages in thread From: Steinar H. Gunderson @ 2017-01-25 22:23 UTC (permalink / raw) To: Eric Dumazet; +Cc: Neal Cardwell, bloat On Wed, Jan 25, 2017 at 02:12:05PM -0800, Eric Dumazet wrote: > Well, pacing is optional in sch_fq. > > Only the FQ part is not optional. Yeah, but who cares about the FQ part for an end host =) > So sch_fq is actually a proper name ;) Hah, kernel devs ;-) /* Steinar */ -- Homepage: https://www.sesse.net/ ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 22:23 ` Steinar H. Gunderson @ 2017-01-25 22:27 ` Eric Dumazet 0 siblings, 0 replies; 31+ messages in thread From: Eric Dumazet @ 2017-01-25 22:27 UTC (permalink / raw) To: Steinar H. Gunderson; +Cc: Neal Cardwell, bloat On Wed, 2017-01-25 at 23:23 +0100, Steinar H. Gunderson wrote: > On Wed, Jan 25, 2017 at 02:12:05PM -0800, Eric Dumazet wrote: > > Well, pacing is optional in sch_fq. > > > > Only the FQ part is not optional. > > Yeah, but who cares about the FQ part for an end host =) Well, we do care a lot, many hosts are actually sending data at full speed. It is always unpleasant when a single rogue UDP flow can steal your NIC. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 22:01 ` Neal Cardwell 2017-01-25 22:02 ` Neal Cardwell 2017-01-25 22:06 ` Steinar H. Gunderson @ 2017-01-25 22:38 ` Hans-Kristian Bakke 2017-01-25 22:48 ` Neal Cardwell 2 siblings, 1 reply; 31+ messages in thread From: Hans-Kristian Bakke @ 2017-01-25 22:38 UTC (permalink / raw) To: Neal Cardwell; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 4133 bytes --] Actually.. the 1-4 mbit/s results with fq sporadically appears again as I keep testing but it is most likely caused by all the unknowns between me an my testserver. But still, changing to pfifo_qdisc seems to normalize the throughput again with BBR, could this be one of those times where BBR and pacing actually is getting hurt for playing nice in some very variable bottleneck on the way? On 25 January 2017 at 23:01, Neal Cardwell <ncardwell@google.com> wrote: > On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian Bakke <hkbakke@gmail.com> > wrote: > >> Hi >> >> Kernel 4.9 finally landed in Debian testing so I could finally test BBR >> in a real life environment that I have struggled with getting any kind of >> performance out of. >> >> The challenge at hand is UDP based OpenVPN through europe at around 35 ms >> rtt to my VPN-provider with plenty of available bandwith available in both >> ends and everything completely unknown in between. After tuning the >> UDP-buffers up to make room for my 500 mbit/s symmetrical bandwith at 35 ms >> the download part seemed to work nicely at an unreliable 150 to 300 mbit/s, >> while the upload was stuck at 30 to 60 mbit/s. >> >> Just by activating BBR the bandwith instantly shot up to around 150 >> mbit/s using a fat tcp test to a public iperf3 server located near my VPN >> exit point in the Netherlands. Replace BBR with qubic again and the >> performance is once again all over the place ranging from very bad to bad, >> but never better than 1/3 of BBRs "steady state". In other words "instant >> WIN!" >> > > Glad to hear it. Thanks for the test report! > > >> However, seeing the requirement of fq and pacing for BBR and noticing >> that I am running pfifo_fast within a VM with virtio NIC on a Proxmox VE >> host with fq_codel on all physical interfaces, I was surprised to see that >> it worked so well. >> I then replaced pfifo_fast with fq and the performance went right down to >> only 1-4 mbit/s from around 150 mbit/s. Removing the fq again regained the >> performance at once. >> >> I have got some questions to you guys that know a lot more than me about >> these things: >> > 1. Do fq (and fq_codel) even work reliably in a VM? What is the best >> choice for default qdisc to use in a VM in general? >> > > Eric covered this one. We are not aware of specific issues with fq in VM > environments. And we have tested that fq works sufficiently well on Google > Cloud VMs. > > >> 2. Why do BBR immediately "fix" all my issues with upload through that >> "unreliable" big BDP link with pfifo_fast when fq pacing is a requirement? >> > > For BBR, pacing is part of the design in order to make BBR more "gentle" > in terms of the rate at which it sends, in order to put less pressure on > buffers and keep packet loss lower. This is particularly important when a > BBR flow is restarting from idle. In this case BBR starts with a full cwnd, > and it counts on pacing to pace out the packets at the estimated bandwidth, > so that the queue can stay relatively short and yet the pipe can be filled > immediately. 
> > Running BBR without pacing makes BBR more aggressive, particularly in > restarting from idle, but also in the steady state, where BBR tries to use > pacing to keep the queue short. > > For bulk transfer tests with one flow, running BBR without pacing will > likely cause higher queues and loss rates at the bottleneck, which may > negatively impact other traffic sharing that bottleneck. > > >> 3. Could fq_codel on the physical host be the reason that it still works? >> > > Nope, fq_codel does not implement pacing. > > >> 4. Do BBR _only_ work with fq pacing or could fq_codel be used as a >> replacement? >> > > Nope, BBR needs pacing to work correctly, and currently fq is the only > Linux qdisc that implements pacing. > > >> 5. Is BBR perhaps modified to do the right thing without having to change >> the qdisc in the current kernel 4.9? >> > > Nope. Linux 4.9 contains the initial public release of BBR from September > 2016. And there have been no code changes since then (just expanded > comments). > > Thanks for the test report! > > neal > > [-- Attachment #2: Type: text/html, Size: 6818 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 22:38 ` Hans-Kristian Bakke @ 2017-01-25 22:48 ` Neal Cardwell 2017-01-25 23:04 ` Hans-Kristian Bakke 0 siblings, 1 reply; 31+ messages in thread From: Neal Cardwell @ 2017-01-25 22:48 UTC (permalink / raw) To: Hans-Kristian Bakke; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 4592 bytes --] On Wed, Jan 25, 2017 at 5:38 PM, Hans-Kristian Bakke <hkbakke@gmail.com> wrote: > Actually.. the 1-4 mbit/s results with fq sporadically appears again as I > keep testing but it is most likely caused by all the unknowns between me an > my testserver. But still, changing to pfifo_qdisc seems to normalize the > throughput again with BBR, could this be one of those times where BBR and > pacing actually is getting hurt for playing nice in some very variable > bottleneck on the way? > Possibly. Would you be able to take a tcpdump trace of each trial (headers only would be ideal), and post on a web site somewhere a pcap trace for one of the slow trials? For example: tcpdump -n -w /tmp/out.pcap -s 120 -i eth0 -c 1000000 & thanks, neal > > On 25 January 2017 at 23:01, Neal Cardwell <ncardwell@google.com> wrote: > >> On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian Bakke <hkbakke@gmail.com> >> wrote: >> >>> Hi >>> >>> Kernel 4.9 finally landed in Debian testing so I could finally test BBR >>> in a real life environment that I have struggled with getting any kind of >>> performance out of. >>> >>> The challenge at hand is UDP based OpenVPN through europe at around 35 >>> ms rtt to my VPN-provider with plenty of available bandwith available in >>> both ends and everything completely unknown in between. After tuning the >>> UDP-buffers up to make room for my 500 mbit/s symmetrical bandwith at 35 ms >>> the download part seemed to work nicely at an unreliable 150 to 300 mbit/s, >>> while the upload was stuck at 30 to 60 mbit/s. >>> >>> Just by activating BBR the bandwith instantly shot up to around 150 >>> mbit/s using a fat tcp test to a public iperf3 server located near my VPN >>> exit point in the Netherlands. Replace BBR with qubic again and the >>> performance is once again all over the place ranging from very bad to bad, >>> but never better than 1/3 of BBRs "steady state". In other words "instant >>> WIN!" >>> >> >> Glad to hear it. Thanks for the test report! >> >> >>> However, seeing the requirement of fq and pacing for BBR and noticing >>> that I am running pfifo_fast within a VM with virtio NIC on a Proxmox VE >>> host with fq_codel on all physical interfaces, I was surprised to see that >>> it worked so well. >>> I then replaced pfifo_fast with fq and the performance went right down >>> to only 1-4 mbit/s from around 150 mbit/s. Removing the fq again regained >>> the performance at once. >>> >>> I have got some questions to you guys that know a lot more than me about >>> these things: >>> >> 1. Do fq (and fq_codel) even work reliably in a VM? What is the best >>> choice for default qdisc to use in a VM in general? >>> >> >> Eric covered this one. We are not aware of specific issues with fq in VM >> environments. And we have tested that fq works sufficiently well on Google >> Cloud VMs. >> >> >>> 2. Why do BBR immediately "fix" all my issues with upload through that >>> "unreliable" big BDP link with pfifo_fast when fq pacing is a requirement? >>> >> >> For BBR, pacing is part of the design in order to make BBR more "gentle" >> in terms of the rate at which it sends, in order to put less pressure on >> buffers and keep packet loss lower. 
This is particularly important when a >> BBR flow is restarting from idle. In this case BBR starts with a full cwnd, >> and it counts on pacing to pace out the packets at the estimated bandwidth, >> so that the queue can stay relatively short and yet the pipe can be filled >> immediately. >> >> Running BBR without pacing makes BBR more aggressive, particularly in >> restarting from idle, but also in the steady state, where BBR tries to use >> pacing to keep the queue short. >> >> For bulk transfer tests with one flow, running BBR without pacing will >> likely cause higher queues and loss rates at the bottleneck, which may >> negatively impact other traffic sharing that bottleneck. >> >> >>> 3. Could fq_codel on the physical host be the reason that it still works? >>> >> >> Nope, fq_codel does not implement pacing. >> >> >>> 4. Do BBR _only_ work with fq pacing or could fq_codel be used as a >>> replacement? >>> >> >> Nope, BBR needs pacing to work correctly, and currently fq is the only >> Linux qdisc that implements pacing. >> >> >>> 5. Is BBR perhaps modified to do the right thing without having to >>> change the qdisc in the current kernel 4.9? >>> >> >> Nope. Linux 4.9 contains the initial public release of BBR from September >> 2016. And there have been no code changes since then (just expanded >> comments). >> >> Thanks for the test report! >> >> neal >> >> > [-- Attachment #2: Type: text/html, Size: 7910 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 22:48 ` Neal Cardwell @ 2017-01-25 23:04 ` Hans-Kristian Bakke 2017-01-25 23:31 ` Hans-Kristian Bakke 2017-01-25 23:33 ` Eric Dumazet 0 siblings, 2 replies; 31+ messages in thread From: Hans-Kristian Bakke @ 2017-01-25 23:04 UTC (permalink / raw) To: Neal Cardwell; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 4985 bytes --] I can do that. I guess I should do the capture from tun1 as that is the place that the tcp-traffic is visible? My non-virtual nic is only seeing OpenVPN encapsulated UDP-traffic. On 25 January 2017 at 23:48, Neal Cardwell <ncardwell@google.com> wrote: > On Wed, Jan 25, 2017 at 5:38 PM, Hans-Kristian Bakke <hkbakke@gmail.com> > wrote: > >> Actually.. the 1-4 mbit/s results with fq sporadically appears again as I >> keep testing but it is most likely caused by all the unknowns between me an >> my testserver. But still, changing to pfifo_qdisc seems to normalize the >> throughput again with BBR, could this be one of those times where BBR and >> pacing actually is getting hurt for playing nice in some very variable >> bottleneck on the way? >> > > Possibly. Would you be able to take a tcpdump trace of each trial (headers > only would be ideal), and post on a web site somewhere a pcap trace for one > of the slow trials? > > For example: > > tcpdump -n -w /tmp/out.pcap -s 120 -i eth0 -c 1000000 & > > thanks, > neal > > > >> >> On 25 January 2017 at 23:01, Neal Cardwell <ncardwell@google.com> wrote: >> >>> On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian Bakke <hkbakke@gmail.com> >>> wrote: >>> >>>> Hi >>>> >>>> Kernel 4.9 finally landed in Debian testing so I could finally test BBR >>>> in a real life environment that I have struggled with getting any kind of >>>> performance out of. >>>> >>>> The challenge at hand is UDP based OpenVPN through europe at around 35 >>>> ms rtt to my VPN-provider with plenty of available bandwith available in >>>> both ends and everything completely unknown in between. After tuning the >>>> UDP-buffers up to make room for my 500 mbit/s symmetrical bandwith at 35 ms >>>> the download part seemed to work nicely at an unreliable 150 to 300 mbit/s, >>>> while the upload was stuck at 30 to 60 mbit/s. >>>> >>>> Just by activating BBR the bandwith instantly shot up to around 150 >>>> mbit/s using a fat tcp test to a public iperf3 server located near my VPN >>>> exit point in the Netherlands. Replace BBR with qubic again and the >>>> performance is once again all over the place ranging from very bad to bad, >>>> but never better than 1/3 of BBRs "steady state". In other words "instant >>>> WIN!" >>>> >>> >>> Glad to hear it. Thanks for the test report! >>> >>> >>>> However, seeing the requirement of fq and pacing for BBR and noticing >>>> that I am running pfifo_fast within a VM with virtio NIC on a Proxmox VE >>>> host with fq_codel on all physical interfaces, I was surprised to see that >>>> it worked so well. >>>> I then replaced pfifo_fast with fq and the performance went right down >>>> to only 1-4 mbit/s from around 150 mbit/s. Removing the fq again regained >>>> the performance at once. >>>> >>>> I have got some questions to you guys that know a lot more than me >>>> about these things: >>>> >>> 1. Do fq (and fq_codel) even work reliably in a VM? What is the best >>>> choice for default qdisc to use in a VM in general? >>>> >>> >>> Eric covered this one. We are not aware of specific issues with fq in VM >>> environments. 
And we have tested that fq works sufficiently well on Google >>> Cloud VMs. >>> >>> >>>> 2. Why do BBR immediately "fix" all my issues with upload through that >>>> "unreliable" big BDP link with pfifo_fast when fq pacing is a requirement? >>>> >>> >>> For BBR, pacing is part of the design in order to make BBR more "gentle" >>> in terms of the rate at which it sends, in order to put less pressure on >>> buffers and keep packet loss lower. This is particularly important when a >>> BBR flow is restarting from idle. In this case BBR starts with a full cwnd, >>> and it counts on pacing to pace out the packets at the estimated bandwidth, >>> so that the queue can stay relatively short and yet the pipe can be filled >>> immediately. >>> >>> Running BBR without pacing makes BBR more aggressive, particularly in >>> restarting from idle, but also in the steady state, where BBR tries to use >>> pacing to keep the queue short. >>> >>> For bulk transfer tests with one flow, running BBR without pacing will >>> likely cause higher queues and loss rates at the bottleneck, which may >>> negatively impact other traffic sharing that bottleneck. >>> >>> >>>> 3. Could fq_codel on the physical host be the reason that it still >>>> works? >>>> >>> >>> Nope, fq_codel does not implement pacing. >>> >>> >>>> 4. Do BBR _only_ work with fq pacing or could fq_codel be used as a >>>> replacement? >>>> >>> >>> Nope, BBR needs pacing to work correctly, and currently fq is the only >>> Linux qdisc that implements pacing. >>> >>> >>>> 5. Is BBR perhaps modified to do the right thing without having to >>>> change the qdisc in the current kernel 4.9? >>>> >>> >>> Nope. Linux 4.9 contains the initial public release of BBR from >>> September 2016. And there have been no code changes since then (just >>> expanded comments). >>> >>> Thanks for the test report! >>> >>> neal >>> >>> >> > [-- Attachment #2: Type: text/html, Size: 8638 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
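Capturing on the tunnel interface is indeed the way to see the TCP flow itself, since the physical NIC only carries the encrypted UDP; the same capture command Neal suggested just needs to point at tun1:

  tcpdump -n -w /tmp/out.pcap -s 120 -i tun1 -c 1000000 &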
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 23:04 ` Hans-Kristian Bakke @ 2017-01-25 23:31 ` Hans-Kristian Bakke 2017-01-25 23:33 ` Eric Dumazet 1 sibling, 0 replies; 31+ messages in thread From: Hans-Kristian Bakke @ 2017-01-25 23:31 UTC (permalink / raw) To: Neal Cardwell; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 5484 bytes --] Okay, here is some captures. It took a while as I of course had to create a script for it as the public iperf3 server is not always free to do tests. I think there are enough context with the folder names to make sense of it. https://owncloud.proikt.com/index.php/s/eY6eZmjDlznar0N On 26 January 2017 at 00:04, Hans-Kristian Bakke <hkbakke@gmail.com> wrote: > I can do that. I guess I should do the capture from tun1 as that is the > place that the tcp-traffic is visible? My non-virtual nic is only seeing > OpenVPN encapsulated UDP-traffic. > > On 25 January 2017 at 23:48, Neal Cardwell <ncardwell@google.com> wrote: > >> On Wed, Jan 25, 2017 at 5:38 PM, Hans-Kristian Bakke <hkbakke@gmail.com> >> wrote: >> >>> Actually.. the 1-4 mbit/s results with fq sporadically appears again as >>> I keep testing but it is most likely caused by all the unknowns between me >>> an my testserver. But still, changing to pfifo_qdisc seems to normalize the >>> throughput again with BBR, could this be one of those times where BBR and >>> pacing actually is getting hurt for playing nice in some very variable >>> bottleneck on the way? >>> >> >> Possibly. Would you be able to take a tcpdump trace of each trial >> (headers only would be ideal), and post on a web site somewhere a pcap >> trace for one of the slow trials? >> >> For example: >> >> tcpdump -n -w /tmp/out.pcap -s 120 -i eth0 -c 1000000 & >> >> thanks, >> neal >> >> >> >>> >>> On 25 January 2017 at 23:01, Neal Cardwell <ncardwell@google.com> wrote: >>> >>>> On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian Bakke <hkbakke@gmail.com >>>> > wrote: >>>> >>>>> Hi >>>>> >>>>> Kernel 4.9 finally landed in Debian testing so I could finally test >>>>> BBR in a real life environment that I have struggled with getting any kind >>>>> of performance out of. >>>>> >>>>> The challenge at hand is UDP based OpenVPN through europe at around 35 >>>>> ms rtt to my VPN-provider with plenty of available bandwith available in >>>>> both ends and everything completely unknown in between. After tuning the >>>>> UDP-buffers up to make room for my 500 mbit/s symmetrical bandwith at 35 ms >>>>> the download part seemed to work nicely at an unreliable 150 to 300 mbit/s, >>>>> while the upload was stuck at 30 to 60 mbit/s. >>>>> >>>>> Just by activating BBR the bandwith instantly shot up to around 150 >>>>> mbit/s using a fat tcp test to a public iperf3 server located near my VPN >>>>> exit point in the Netherlands. Replace BBR with qubic again and the >>>>> performance is once again all over the place ranging from very bad to bad, >>>>> but never better than 1/3 of BBRs "steady state". In other words "instant >>>>> WIN!" >>>>> >>>> >>>> Glad to hear it. Thanks for the test report! >>>> >>>> >>>>> However, seeing the requirement of fq and pacing for BBR and noticing >>>>> that I am running pfifo_fast within a VM with virtio NIC on a Proxmox VE >>>>> host with fq_codel on all physical interfaces, I was surprised to see that >>>>> it worked so well. >>>>> I then replaced pfifo_fast with fq and the performance went right down >>>>> to only 1-4 mbit/s from around 150 mbit/s. Removing the fq again regained >>>>> the performance at once. 
>>>>> >>>>> I have got some questions to you guys that know a lot more than me >>>>> about these things: >>>>> >>>> 1. Do fq (and fq_codel) even work reliably in a VM? What is the best >>>>> choice for default qdisc to use in a VM in general? >>>>> >>>> >>>> Eric covered this one. We are not aware of specific issues with fq in >>>> VM environments. And we have tested that fq works sufficiently well on >>>> Google Cloud VMs. >>>> >>>> >>>>> 2. Why do BBR immediately "fix" all my issues with upload through that >>>>> "unreliable" big BDP link with pfifo_fast when fq pacing is a requirement? >>>>> >>>> >>>> For BBR, pacing is part of the design in order to make BBR more >>>> "gentle" in terms of the rate at which it sends, in order to put less >>>> pressure on buffers and keep packet loss lower. This is particularly >>>> important when a BBR flow is restarting from idle. In this case BBR starts >>>> with a full cwnd, and it counts on pacing to pace out the packets at the >>>> estimated bandwidth, so that the queue can stay relatively short and yet >>>> the pipe can be filled immediately. >>>> >>>> Running BBR without pacing makes BBR more aggressive, particularly in >>>> restarting from idle, but also in the steady state, where BBR tries to use >>>> pacing to keep the queue short. >>>> >>>> For bulk transfer tests with one flow, running BBR without pacing will >>>> likely cause higher queues and loss rates at the bottleneck, which may >>>> negatively impact other traffic sharing that bottleneck. >>>> >>>> >>>>> 3. Could fq_codel on the physical host be the reason that it still >>>>> works? >>>>> >>>> >>>> Nope, fq_codel does not implement pacing. >>>> >>>> >>>>> 4. Do BBR _only_ work with fq pacing or could fq_codel be used as a >>>>> replacement? >>>>> >>>> >>>> Nope, BBR needs pacing to work correctly, and currently fq is the only >>>> Linux qdisc that implements pacing. >>>> >>>> >>>>> 5. Is BBR perhaps modified to do the right thing without having to >>>>> change the qdisc in the current kernel 4.9? >>>>> >>>> >>>> Nope. Linux 4.9 contains the initial public release of BBR from >>>> September 2016. And there have been no code changes since then (just >>>> expanded comments). >>>> >>>> Thanks for the test report! >>>> >>>> neal >>>> >>>> >>> >> > [-- Attachment #2: Type: text/html, Size: 9753 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 23:04 ` Hans-Kristian Bakke 2017-01-25 23:31 ` Hans-Kristian Bakke @ 2017-01-25 23:33 ` Eric Dumazet 2017-01-25 23:41 ` Hans-Kristian Bakke 2017-01-25 23:47 ` Hans-Kristian Bakke 1 sibling, 2 replies; 31+ messages in thread From: Eric Dumazet @ 2017-01-25 23:33 UTC (permalink / raw) To: Hans-Kristian Bakke; +Cc: Neal Cardwell, bloat On Thu, 2017-01-26 at 00:04 +0100, Hans-Kristian Bakke wrote: > I can do that. I guess I should do the capture from tun1 as that is > the place that the tcp-traffic is visible? My non-virtual nic is only > seeing OpenVPN encapsulated UDP-traffic. > But is FQ installed at the point TCP sockets are ? You should give us "tc -s qdisc show xxx" so that we can check if pacing (throttling) actually happens. > On 25 January 2017 at 23:48, Neal Cardwell <ncardwell@google.com> > wrote: > On Wed, Jan 25, 2017 at 5:38 PM, Hans-Kristian Bakke > <hkbakke@gmail.com> wrote: > Actually.. the 1-4 mbit/s results with fq sporadically > appears again as I keep testing but it is most likely > caused by all the unknowns between me an my > testserver. But still, changing to pfifo_qdisc seems > to normalize the throughput again with BBR, could this > be one of those times where BBR and pacing actually is > getting hurt for playing nice in some very variable > bottleneck on the way? > > > Possibly. Would you be able to take a tcpdump trace of each > trial (headers only would be ideal), and post on a web site > somewhere a pcap trace for one of the slow trials? > > > For example: > > > tcpdump -n -w /tmp/out.pcap -s 120 -i eth0 -c 1000000 & > > > > thanks, > neal > > > > > On 25 January 2017 at 23:01, Neal Cardwell > <ncardwell@google.com> wrote: > On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian > Bakke <hkbakke@gmail.com> wrote: > Hi > > > Kernel 4.9 finally landed in Debian > testing so I could finally test BBR in > a real life environment that I have > struggled with getting any kind of > performance out of. > > > The challenge at hand is UDP based > OpenVPN through europe at around 35 ms > rtt to my VPN-provider with plenty of > available bandwith available in both > ends and everything completely unknown > in between. After tuning the > UDP-buffers up to make room for my 500 > mbit/s symmetrical bandwith at 35 ms > the download part seemed to work > nicely at an unreliable 150 to 300 > mbit/s, while the upload was stuck at > 30 to 60 mbit/s. > > > Just by activating BBR the bandwith > instantly shot up to around 150 mbit/s > using a fat tcp test to a public > iperf3 server located near my VPN exit > point in the Netherlands. Replace BBR > with qubic again and the performance > is once again all over the place > ranging from very bad to bad, but > never better than 1/3 of BBRs "steady > state". In other words "instant WIN!" > > > Glad to hear it. Thanks for the test report! > > However, seeing the requirement of fq > and pacing for BBR and noticing that I > am running pfifo_fast within a VM with > virtio NIC on a Proxmox VE host with > fq_codel on all physical interfaces, I > was surprised to see that it worked so > well. > I then replaced pfifo_fast with fq and > the performance went right down to > only 1-4 mbit/s from around 150 > mbit/s. Removing the fq again regained > the performance at once. > > > I have got some questions to you guys > that know a lot more than me about > these things: > 1. Do fq (and fq_codel) even work > reliably in a VM? What is the best > choice for default qdisc to use in a > VM in general? 
> > > Eric covered this one. We are not aware of > specific issues with fq in VM environments. > And we have tested that fq works sufficiently > well on Google Cloud VMs. > > 2. Why do BBR immediately "fix" all my > issues with upload through that > "unreliable" big BDP link with > pfifo_fast when fq pacing is a > requirement? > > > For BBR, pacing is part of the design in order > to make BBR more "gentle" in terms of the rate > at which it sends, in order to put less > pressure on buffers and keep packet loss > lower. This is particularly important when a > BBR flow is restarting from idle. In this case > BBR starts with a full cwnd, and it counts on > pacing to pace out the packets at the > estimated bandwidth, so that the queue can > stay relatively short and yet the pipe can be > filled immediately. > > > Running BBR without pacing makes BBR more > aggressive, particularly in restarting from > idle, but also in the steady state, where BBR > tries to use pacing to keep the queue short. > > > For bulk transfer tests with one flow, running > BBR without pacing will likely cause higher > queues and loss rates at the bottleneck, which > may negatively impact other traffic sharing > that bottleneck. > > 3. Could fq_codel on the physical host > be the reason that it still works? > > > Nope, fq_codel does not implement pacing. > > 4. Do BBR _only_ work with fq pacing > or could fq_codel be used as a > replacement? > > > Nope, BBR needs pacing to work correctly, and > currently fq is the only Linux qdisc that > implements pacing. > > 5. Is BBR perhaps modified to do the > right thing without having to change > the qdisc in the current kernel 4.9? > > > Nope. Linux 4.9 contains the initial public > release of BBR from September 2016. And there > have been no code changes since then (just > expanded comments). > > > Thanks for the test report! > > > neal > > > > > > > > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 31+ messages in thread
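For anyone reproducing this check, the statistics Eric asks for can be read per interface and watched live during a test run. The device names below are the ones used in this thread, and the counter names assume the sch_fq output format of kernels around 4.4/4.9:

    # Dump qdisc statistics for the interfaces that carry the TCP traffic
    tc -s qdisc show dev tun1
    tc -s qdisc show dev eth0

    # Refresh once a second while iperf3 runs; pacing (throttling) is
    # actually happening when the "throttled" counters keep increasing
    # and one or more flows are reported as throttled.
    watch -n 1 'tc -s qdisc show dev tun1'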
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 23:33 ` Eric Dumazet @ 2017-01-25 23:41 ` Hans-Kristian Bakke 2017-01-25 23:46 ` Eric Dumazet 2017-01-25 23:47 ` Hans-Kristian Bakke 1 sibling, 1 reply; 31+ messages in thread From: Hans-Kristian Bakke @ 2017-01-25 23:41 UTC (permalink / raw) To: Eric Dumazet; +Cc: Neal Cardwell, bloat [-- Attachment #1: Type: text/plain, Size: 9682 bytes --] I listed the qdiscs and put them with the captures, but the setup is: KVM VM: tun1 (qdisc: fq, OpenVPN UDP to dst 443) eth0 (qdisc: fq, local connection to internet) BBR always set Nics are using virtio and is connected to a Open vSwitch Physical host (Newest proxmox VE with kernel 4.4): fq_codel on all interfaces (default qdisc). 2 x gigabit in OVS bond LACP (with rebalancing every 10 sec) towards the switch Switch is then connected using a 4 x gigabit LACP to a Debian testing linux gateway (fq_codel on all nics). This gateway is using my own traffic shaper script (HTB/FQ_CODEL) based what I could find from your bufferbloat project on the internet and shapes the 500/500 fiber link to just within specs (only upload is touched, no inbound shaping/policing) The script is actually quite useful for it's simplicity and can be found here: https://github.com/hkbakke/tc-gen The VPN connection is terminated in netherlands on a gigabit VPN server with around 35 ms RTT. On 26 January 2017 at 00:33, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > On Thu, 2017-01-26 at 00:04 +0100, Hans-Kristian Bakke wrote: > > I can do that. I guess I should do the capture from tun1 as that is > > the place that the tcp-traffic is visible? My non-virtual nic is only > > seeing OpenVPN encapsulated UDP-traffic. > > > > But is FQ installed at the point TCP sockets are ? > > You should give us "tc -s qdisc show xxx" so that we can check if > pacing (throttling) actually happens. > > > > On 25 January 2017 at 23:48, Neal Cardwell <ncardwell@google.com> > > wrote: > > On Wed, Jan 25, 2017 at 5:38 PM, Hans-Kristian Bakke > > <hkbakke@gmail.com> wrote: > > Actually.. the 1-4 mbit/s results with fq sporadically > > appears again as I keep testing but it is most likely > > caused by all the unknowns between me an my > > testserver. But still, changing to pfifo_qdisc seems > > to normalize the throughput again with BBR, could this > > be one of those times where BBR and pacing actually is > > getting hurt for playing nice in some very variable > > bottleneck on the way? > > > > > > Possibly. Would you be able to take a tcpdump trace of each > > trial (headers only would be ideal), and post on a web site > > somewhere a pcap trace for one of the slow trials? > > > > > > For example: > > > > > > tcpdump -n -w /tmp/out.pcap -s 120 -i eth0 -c 1000000 & > > > > > > > > thanks, > > neal > > > > > > > > > > On 25 January 2017 at 23:01, Neal Cardwell > > <ncardwell@google.com> wrote: > > On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian > > Bakke <hkbakke@gmail.com> wrote: > > Hi > > > > > > Kernel 4.9 finally landed in Debian > > testing so I could finally test BBR in > > a real life environment that I have > > struggled with getting any kind of > > performance out of. > > > > > > The challenge at hand is UDP based > > OpenVPN through europe at around 35 ms > > rtt to my VPN-provider with plenty of > > available bandwith available in both > > ends and everything completely unknown > > in between. 
After tuning the > > UDP-buffers up to make room for my 500 > > mbit/s symmetrical bandwith at 35 ms > > the download part seemed to work > > nicely at an unreliable 150 to 300 > > mbit/s, while the upload was stuck at > > 30 to 60 mbit/s. > > > > > > Just by activating BBR the bandwith > > instantly shot up to around 150 mbit/s > > using a fat tcp test to a public > > iperf3 server located near my VPN exit > > point in the Netherlands. Replace BBR > > with qubic again and the performance > > is once again all over the place > > ranging from very bad to bad, but > > never better than 1/3 of BBRs "steady > > state". In other words "instant WIN!" > > > > > > Glad to hear it. Thanks for the test report! > > > > However, seeing the requirement of fq > > and pacing for BBR and noticing that I > > am running pfifo_fast within a VM with > > virtio NIC on a Proxmox VE host with > > fq_codel on all physical interfaces, I > > was surprised to see that it worked so > > well. > > I then replaced pfifo_fast with fq and > > the performance went right down to > > only 1-4 mbit/s from around 150 > > mbit/s. Removing the fq again regained > > the performance at once. > > > > > > I have got some questions to you guys > > that know a lot more than me about > > these things: > > 1. Do fq (and fq_codel) even work > > reliably in a VM? What is the best > > choice for default qdisc to use in a > > VM in general? > > > > > > Eric covered this one. We are not aware of > > specific issues with fq in VM environments. > > And we have tested that fq works sufficiently > > well on Google Cloud VMs. > > > > 2. Why do BBR immediately "fix" all my > > issues with upload through that > > "unreliable" big BDP link with > > pfifo_fast when fq pacing is a > > requirement? > > > > > > For BBR, pacing is part of the design in order > > to make BBR more "gentle" in terms of the rate > > at which it sends, in order to put less > > pressure on buffers and keep packet loss > > lower. This is particularly important when a > > BBR flow is restarting from idle. In this case > > BBR starts with a full cwnd, and it counts on > > pacing to pace out the packets at the > > estimated bandwidth, so that the queue can > > stay relatively short and yet the pipe can be > > filled immediately. > > > > > > Running BBR without pacing makes BBR more > > aggressive, particularly in restarting from > > idle, but also in the steady state, where BBR > > tries to use pacing to keep the queue short. > > > > > > For bulk transfer tests with one flow, running > > BBR without pacing will likely cause higher > > queues and loss rates at the bottleneck, which > > may negatively impact other traffic sharing > > that bottleneck. > > > > 3. Could fq_codel on the physical host > > be the reason that it still works? > > > > > > Nope, fq_codel does not implement pacing. > > > > 4. Do BBR _only_ work with fq pacing > > or could fq_codel be used as a > > replacement? > > > > > > Nope, BBR needs pacing to work correctly, and > > currently fq is the only Linux qdisc that > > implements pacing. > > > > 5. Is BBR perhaps modified to do the > > right thing without having to change > > the qdisc in the current kernel 4.9? > > > > > > Nope. Linux 4.9 contains the initial public > > release of BBR from September 2016. And there > > have been no code changes since then (just > > expanded comments). > > > > > > Thanks for the test report! 
> > > > > > neal > > > > > > > > > > > > > > > > > > _______________________________________________ > > Bloat mailing list > > Bloat@lists.bufferbloat.net > > https://lists.bufferbloat.net/listinfo/bloat > > > [-- Attachment #2: Type: text/html, Size: 14327 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
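As a point of reference, a VM configured the way it is described above (fq on the interfaces, BBR as congestion control) is usually set up with commands along these lines; this is a generic sketch, not the exact commands the poster used:

    # Inside the VM: fq gives BBR the pacing it expects
    sysctl -w net.core.default_qdisc=fq
    sysctl -w net.ipv4.tcp_congestion_control=bbr

    # Or attach fq explicitly to the interfaces carrying the TCP traffic
    tc qdisc replace dev eth0 root fq
    tc qdisc replace dev tun1 root fq

    # On the physical host (not the VM) the default can stay fq_codel
    sysctl -w net.core.default_qdisc=fq_codel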
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 23:41 ` Hans-Kristian Bakke @ 2017-01-25 23:46 ` Eric Dumazet 0 siblings, 0 replies; 31+ messages in thread From: Eric Dumazet @ 2017-01-25 23:46 UTC (permalink / raw) To: Hans-Kristian Bakke; +Cc: Neal Cardwell, bloat On Thu, 2017-01-26 at 00:41 +0100, Hans-Kristian Bakke wrote: > I listed the qdiscs and put them with the captures, but the setup is: > > KVM VM: > tun1 (qdisc: fq, OpenVPN UDP to dst 443) > eth0 (qdisc: fq, local connection to internet) I am not sure that it will work properly. My concern is that pacing might happen twice, depending if skb are orphaned or not between tun1 and eth0 Please give us the result from VM side : tc -s qdisc > BBR always set > Nics are using virtio and is connected to a Open vSwitch > > > Physical host (Newest proxmox VE with kernel 4.4): > fq_codel on all interfaces (default qdisc). 2 x gigabit in OVS bond > LACP (with rebalancing every 10 sec) towards the switch > > > Switch is then connected using a 4 x gigabit LACP to a Debian testing > linux gateway (fq_codel on all nics). This gateway is using my own > traffic shaper script (HTB/FQ_CODEL) based what I could find from your > bufferbloat project on the internet and shapes the 500/500 fiber link > to just within specs (only upload is touched, no inbound > shaping/policing) > > > The script is actually quite useful for it's simplicity and can be > found here: https://github.com/hkbakke/tc-gen > > > The VPN connection is terminated in netherlands on a gigabit VPN > server with around 35 ms RTT. > > On 26 January 2017 at 00:33, Eric Dumazet <eric.dumazet@gmail.com> > wrote: > > On Thu, 2017-01-26 at 00:04 +0100, Hans-Kristian Bakke wrote: > > I can do that. I guess I should do the capture from tun1 as > that is > > the place that the tcp-traffic is visible? My non-virtual > nic is only > > seeing OpenVPN encapsulated UDP-traffic. > > > > But is FQ installed at the point TCP sockets are ? > > You should give us "tc -s qdisc show xxx" so that we can > check if > pacing (throttling) actually happens. > > > > On 25 January 2017 at 23:48, Neal Cardwell > <ncardwell@google.com> > > wrote: > > On Wed, Jan 25, 2017 at 5:38 PM, Hans-Kristian Bakke > > <hkbakke@gmail.com> wrote: > > Actually.. the 1-4 mbit/s results with fq > sporadically > > appears again as I keep testing but it is > most likely > > caused by all the unknowns between me an my > > testserver. But still, changing to > pfifo_qdisc seems > > to normalize the throughput again with BBR, > could this > > be one of those times where BBR and pacing > actually is > > getting hurt for playing nice in some very > variable > > bottleneck on the way? > > > > > > Possibly. Would you be able to take a tcpdump trace > of each > > trial (headers only would be ideal), and post on a > web site > > somewhere a pcap trace for one of the slow trials? > > > > > > For example: > > > > > > tcpdump -n -w /tmp/out.pcap -s 120 -i eth0 -c > 1000000 & > > > > > > > > thanks, > > neal > > > > > > > > > > On 25 January 2017 at 23:01, Neal Cardwell > > <ncardwell@google.com> wrote: > > On Wed, Jan 25, 2017 at 3:54 PM, > Hans-Kristian > > Bakke <hkbakke@gmail.com> wrote: > > Hi > > > > > > Kernel 4.9 finally landed in > Debian > > testing so I could finally > test BBR in > > a real life environment that > I have > > struggled with getting any > kind of > > performance out of. 
> > > > > > The challenge at hand is UDP > based > > OpenVPN through europe at > around 35 ms > > rtt to my VPN-provider with > plenty of > > available bandwith available > in both > > ends and everything > completely unknown > > in between. After tuning the > > UDP-buffers up to make room > for my 500 > > mbit/s symmetrical bandwith > at 35 ms > > the download part seemed to > work > > nicely at an unreliable 150 > to 300 > > mbit/s, while the upload was > stuck at > > 30 to 60 mbit/s. > > > > > > Just by activating BBR the > bandwith > > instantly shot up to around > 150 mbit/s > > using a fat tcp test to a > public > > iperf3 server located near > my VPN exit > > point in the Netherlands. > Replace BBR > > with qubic again and the > performance > > is once again all over the > place > > ranging from very bad to > bad, but > > never better than 1/3 of > BBRs "steady > > state". In other words > "instant WIN!" > > > > > > Glad to hear it. Thanks for the test > report! > > > > However, seeing the > requirement of fq > > and pacing for BBR and > noticing that I > > am running pfifo_fast within > a VM with > > virtio NIC on a Proxmox VE > host with > > fq_codel on all physical > interfaces, I > > was surprised to see that it > worked so > > well. > > I then replaced pfifo_fast > with fq and > > the performance went right > down to > > only 1-4 mbit/s from around > 150 > > mbit/s. Removing the fq > again regained > > the performance at once. > > > > > > I have got some questions to > you guys > > that know a lot more than me > about > > these things: > > 1. Do fq (and fq_codel) even > work > > reliably in a VM? What is > the best > > choice for default qdisc to > use in a > > VM in general? > > > > > > Eric covered this one. We are not > aware of > > specific issues with fq in VM > environments. > > And we have tested that fq works > sufficiently > > well on Google Cloud VMs. > > > > 2. Why do BBR immediately > "fix" all my > > issues with upload through > that > > "unreliable" big BDP link > with > > pfifo_fast when fq pacing is > a > > requirement? > > > > > > For BBR, pacing is part of the > design in order > > to make BBR more "gentle" in terms > of the rate > > at which it sends, in order to put > less > > pressure on buffers and keep packet > loss > > lower. This is particularly > important when a > > BBR flow is restarting from idle. In > this case > > BBR starts with a full cwnd, and it > counts on > > pacing to pace out the packets at > the > > estimated bandwidth, so that the > queue can > > stay relatively short and yet the > pipe can be > > filled immediately. > > > > > > Running BBR without pacing makes BBR > more > > aggressive, particularly in > restarting from > > idle, but also in the steady state, > where BBR > > tries to use pacing to keep the > queue short. > > > > > > For bulk transfer tests with one > flow, running > > BBR without pacing will likely cause > higher > > queues and loss rates at the > bottleneck, which > > may negatively impact other traffic > sharing > > that bottleneck. > > > > 3. Could fq_codel on the > physical host > > be the reason that it still > works? > > > > > > Nope, fq_codel does not implement > pacing. > > > > 4. Do BBR _only_ work with > fq pacing > > or could fq_codel be used as > a > > replacement? > > > > > > Nope, BBR needs pacing to work > correctly, and > > currently fq is the only Linux qdisc > that > > implements pacing. > > > > 5. 
Is BBR perhaps modified > to do the > > right thing without having > to change > > the qdisc in the current > kernel 4.9? > > > > > > Nope. Linux 4.9 contains the initial > public > > release of BBR from September 2016. > And there > > have been no code changes since then > (just > > expanded comments). > > > > > > Thanks for the test report! > > > > > > neal > > > > > > > > > > > > > > > > > > > _______________________________________________ > > Bloat mailing list > > Bloat@lists.bufferbloat.net > > https://lists.bufferbloat.net/listinfo/bloat > > > > > ^ permalink raw reply [flat|nested] 31+ messages in thread
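If it turns out that pacing really is applied twice, one possible experiment (not something prescribed in the thread) is to keep pacing only at the layer where the TCP sockets live and leave the encapsulated UDP unpaced; fq's pacing can be switched off per interface with the nopacing option in iproute2:

    # Keep fq with pacing on the tunnel, where the TCP sockets are
    tc qdisc replace dev tun1 root fq

    # eth0 only carries the OpenVPN UDP stream: either run fq without pacing ...
    tc qdisc replace dev eth0 root fq nopacing

    # ... or use fq_codel there, which never paces
    tc qdisc replace dev eth0 root fq_codel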
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 23:33 ` Eric Dumazet 2017-01-25 23:41 ` Hans-Kristian Bakke @ 2017-01-25 23:47 ` Hans-Kristian Bakke 2017-01-25 23:53 ` Eric Dumazet 1 sibling, 1 reply; 31+ messages in thread From: Hans-Kristian Bakke @ 2017-01-25 23:47 UTC (permalink / raw) To: Eric Dumazet; +Cc: Neal Cardwell, bloat [-- Attachment #1: Type: text/plain, Size: 9975 bytes --] I did record the qdisc settings, but I didn't capture the stats, but throttling is definitively active when I watch the tc -s stats in realtime when testing (looking at tun1) tc -s qdisc show qdisc noqueue 0: dev lo root refcnt 2 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 qdisc fq 8007: dev eth0 root refcnt 2 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 quantum 3028 initial_quantum 15140 refill_delay 40.0ms Sent 1420855729 bytes 969198 pkt (dropped 134, overlimits 0 requeues 0) backlog 0b 0p requeues 0 124 flows (123 inactive, 0 throttled) 0 gc, 0 highprio, 3 throttled, 3925 ns latency, 134 flows_plimit qdisc fq 8008: dev tun1 root refcnt 2 limit 10000p flow_limit 100p buckets 1024 orphan_mask 1023 quantum 3000 initial_quantum 15000 refill_delay 40.0ms Sent 1031289740 bytes 741181 pkt (dropped 0, overlimits 0 requeues 0) backlog 101616b 3p requeues 0 16 flows (15 inactive, 1 throttled), next packet delay 351937 ns 0 gc, 0 highprio, 58377 throttled, 12761 ns latency On 26 January 2017 at 00:33, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > On Thu, 2017-01-26 at 00:04 +0100, Hans-Kristian Bakke wrote: > > I can do that. I guess I should do the capture from tun1 as that is > > the place that the tcp-traffic is visible? My non-virtual nic is only > > seeing OpenVPN encapsulated UDP-traffic. > > > > But is FQ installed at the point TCP sockets are ? > > You should give us "tc -s qdisc show xxx" so that we can check if > pacing (throttling) actually happens. > > > > On 25 January 2017 at 23:48, Neal Cardwell <ncardwell@google.com> > > wrote: > > On Wed, Jan 25, 2017 at 5:38 PM, Hans-Kristian Bakke > > <hkbakke@gmail.com> wrote: > > Actually.. the 1-4 mbit/s results with fq sporadically > > appears again as I keep testing but it is most likely > > caused by all the unknowns between me an my > > testserver. But still, changing to pfifo_qdisc seems > > to normalize the throughput again with BBR, could this > > be one of those times where BBR and pacing actually is > > getting hurt for playing nice in some very variable > > bottleneck on the way? > > > > > > Possibly. Would you be able to take a tcpdump trace of each > > trial (headers only would be ideal), and post on a web site > > somewhere a pcap trace for one of the slow trials? > > > > > > For example: > > > > > > tcpdump -n -w /tmp/out.pcap -s 120 -i eth0 -c 1000000 & > > > > > > > > thanks, > > neal > > > > > > > > > > On 25 January 2017 at 23:01, Neal Cardwell > > <ncardwell@google.com> wrote: > > On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian > > Bakke <hkbakke@gmail.com> wrote: > > Hi > > > > > > Kernel 4.9 finally landed in Debian > > testing so I could finally test BBR in > > a real life environment that I have > > struggled with getting any kind of > > performance out of. > > > > > > The challenge at hand is UDP based > > OpenVPN through europe at around 35 ms > > rtt to my VPN-provider with plenty of > > available bandwith available in both > > ends and everything completely unknown > > in between. 
After tuning the > > UDP-buffers up to make room for my 500 > > mbit/s symmetrical bandwith at 35 ms > > the download part seemed to work > > nicely at an unreliable 150 to 300 > > mbit/s, while the upload was stuck at > > 30 to 60 mbit/s. > > > > > > Just by activating BBR the bandwith > > instantly shot up to around 150 mbit/s > > using a fat tcp test to a public > > iperf3 server located near my VPN exit > > point in the Netherlands. Replace BBR > > with qubic again and the performance > > is once again all over the place > > ranging from very bad to bad, but > > never better than 1/3 of BBRs "steady > > state". In other words "instant WIN!" > > > > > > Glad to hear it. Thanks for the test report! > > > > However, seeing the requirement of fq > > and pacing for BBR and noticing that I > > am running pfifo_fast within a VM with > > virtio NIC on a Proxmox VE host with > > fq_codel on all physical interfaces, I > > was surprised to see that it worked so > > well. > > I then replaced pfifo_fast with fq and > > the performance went right down to > > only 1-4 mbit/s from around 150 > > mbit/s. Removing the fq again regained > > the performance at once. > > > > > > I have got some questions to you guys > > that know a lot more than me about > > these things: > > 1. Do fq (and fq_codel) even work > > reliably in a VM? What is the best > > choice for default qdisc to use in a > > VM in general? > > > > > > Eric covered this one. We are not aware of > > specific issues with fq in VM environments. > > And we have tested that fq works sufficiently > > well on Google Cloud VMs. > > > > 2. Why do BBR immediately "fix" all my > > issues with upload through that > > "unreliable" big BDP link with > > pfifo_fast when fq pacing is a > > requirement? > > > > > > For BBR, pacing is part of the design in order > > to make BBR more "gentle" in terms of the rate > > at which it sends, in order to put less > > pressure on buffers and keep packet loss > > lower. This is particularly important when a > > BBR flow is restarting from idle. In this case > > BBR starts with a full cwnd, and it counts on > > pacing to pace out the packets at the > > estimated bandwidth, so that the queue can > > stay relatively short and yet the pipe can be > > filled immediately. > > > > > > Running BBR without pacing makes BBR more > > aggressive, particularly in restarting from > > idle, but also in the steady state, where BBR > > tries to use pacing to keep the queue short. > > > > > > For bulk transfer tests with one flow, running > > BBR without pacing will likely cause higher > > queues and loss rates at the bottleneck, which > > may negatively impact other traffic sharing > > that bottleneck. > > > > 3. Could fq_codel on the physical host > > be the reason that it still works? > > > > > > Nope, fq_codel does not implement pacing. > > > > 4. Do BBR _only_ work with fq pacing > > or could fq_codel be used as a > > replacement? > > > > > > Nope, BBR needs pacing to work correctly, and > > currently fq is the only Linux qdisc that > > implements pacing. > > > > 5. Is BBR perhaps modified to do the > > right thing without having to change > > the qdisc in the current kernel 4.9? > > > > > > Nope. Linux 4.9 contains the initial public > > release of BBR from September 2016. And there > > have been no code changes since then (just > > expanded comments). > > > > > > Thanks for the test report! 
> > > > > > neal > > > > > > > > > > > > > > > > > > _______________________________________________ > > Bloat mailing list > > Bloat@lists.bufferbloat.net > > https://lists.bufferbloat.net/listinfo/bloat > > > [-- Attachment #2: Type: text/html, Size: 14675 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 23:47 ` Hans-Kristian Bakke @ 2017-01-25 23:53 ` Eric Dumazet 2017-01-25 23:56 ` Hans-Kristian Bakke 0 siblings, 1 reply; 31+ messages in thread From: Eric Dumazet @ 2017-01-25 23:53 UTC (permalink / raw) To: Hans-Kristian Bakke; +Cc: Neal Cardwell, bloat On Thu, 2017-01-26 at 00:47 +0100, Hans-Kristian Bakke wrote: > > > I did record the qdisc settings, but I didn't capture the stats, but > throttling is definitively active when I watch the tc -s stats in > realtime when testing (looking at tun1) > > > tc -s qdisc show > qdisc noqueue 0: dev lo root refcnt 2 > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > backlog 0b 0p requeues 0 > qdisc fq 8007: dev eth0 root refcnt 2 limit 10000p flow_limit 100p > buckets 1024 orphan_mask 1023 quantum 3028 initial_quantum 15140 > refill_delay 40.0ms > Sent 1420855729 bytes 969198 pkt (dropped 134, overlimits 0 requeues > 0) > backlog 0b 0p requeues 0 > 124 flows (123 inactive, 0 throttled) > 0 gc, 0 highprio, 3 throttled, 3925 ns latency, 134 flows_plimit You seem to hit the "flow_limit 100" maybe because all packets are going through a single encap flow. ( 134 drops ) > qdisc fq 8008: dev tun1 root refcnt 2 limit 10000p flow_limit 100p > buckets 1024 orphan_mask 1023 quantum 3000 initial_quantum 15000 > refill_delay 40.0ms > Sent 1031289740 bytes 741181 pkt (dropped 0, overlimits 0 requeues 0) > backlog 101616b 3p requeues 0 > 16 flows (15 inactive, 1 throttled), next packet delay 351937 ns > 0 gc, 0 highprio, 58377 throttled, 12761 ns latency > > Looks good, although latency seems a bit high, thanks ! > > > On 26 January 2017 at 00:33, Eric Dumazet <eric.dumazet@gmail.com> > wrote: > > On Thu, 2017-01-26 at 00:04 +0100, Hans-Kristian Bakke wrote: > > I can do that. I guess I should do the capture from tun1 as > that is > > the place that the tcp-traffic is visible? My non-virtual > nic is only > > seeing OpenVPN encapsulated UDP-traffic. > > > > But is FQ installed at the point TCP sockets are ? > > You should give us "tc -s qdisc show xxx" so that we can > check if > pacing (throttling) actually happens. > > > > On 25 January 2017 at 23:48, Neal Cardwell > <ncardwell@google.com> > > wrote: > > On Wed, Jan 25, 2017 at 5:38 PM, Hans-Kristian Bakke > > <hkbakke@gmail.com> wrote: > > Actually.. the 1-4 mbit/s results with fq > sporadically > > appears again as I keep testing but it is > most likely > > caused by all the unknowns between me an my > > testserver. But still, changing to > pfifo_qdisc seems > > to normalize the throughput again with BBR, > could this > > be one of those times where BBR and pacing > actually is > > getting hurt for playing nice in some very > variable > > bottleneck on the way? > > > > > > Possibly. Would you be able to take a tcpdump trace > of each > > trial (headers only would be ideal), and post on a > web site > > somewhere a pcap trace for one of the slow trials? > > > > > > For example: > > > > > > tcpdump -n -w /tmp/out.pcap -s 120 -i eth0 -c > 1000000 & > > > > > > > > thanks, > > neal > > > > > > > > > > On 25 January 2017 at 23:01, Neal Cardwell > > <ncardwell@google.com> wrote: > > On Wed, Jan 25, 2017 at 3:54 PM, > Hans-Kristian > > Bakke <hkbakke@gmail.com> wrote: > > Hi > > > > > > Kernel 4.9 finally landed in > Debian > > testing so I could finally > test BBR in > > a real life environment that > I have > > struggled with getting any > kind of > > performance out of. 
> > > > > > The challenge at hand is UDP > based > > OpenVPN through europe at > around 35 ms > > rtt to my VPN-provider with > plenty of > > available bandwith available > in both > > ends and everything > completely unknown > > in between. After tuning the > > UDP-buffers up to make room > for my 500 > > mbit/s symmetrical bandwith > at 35 ms > > the download part seemed to > work > > nicely at an unreliable 150 > to 300 > > mbit/s, while the upload was > stuck at > > 30 to 60 mbit/s. > > > > > > Just by activating BBR the > bandwith > > instantly shot up to around > 150 mbit/s > > using a fat tcp test to a > public > > iperf3 server located near > my VPN exit > > point in the Netherlands. > Replace BBR > > with qubic again and the > performance > > is once again all over the > place > > ranging from very bad to > bad, but > > never better than 1/3 of > BBRs "steady > > state". In other words > "instant WIN!" > > > > > > Glad to hear it. Thanks for the test > report! > > > > However, seeing the > requirement of fq > > and pacing for BBR and > noticing that I > > am running pfifo_fast within > a VM with > > virtio NIC on a Proxmox VE > host with > > fq_codel on all physical > interfaces, I > > was surprised to see that it > worked so > > well. > > I then replaced pfifo_fast > with fq and > > the performance went right > down to > > only 1-4 mbit/s from around > 150 > > mbit/s. Removing the fq > again regained > > the performance at once. > > > > > > I have got some questions to > you guys > > that know a lot more than me > about > > these things: > > 1. Do fq (and fq_codel) even > work > > reliably in a VM? What is > the best > > choice for default qdisc to > use in a > > VM in general? > > > > > > Eric covered this one. We are not > aware of > > specific issues with fq in VM > environments. > > And we have tested that fq works > sufficiently > > well on Google Cloud VMs. > > > > 2. Why do BBR immediately > "fix" all my > > issues with upload through > that > > "unreliable" big BDP link > with > > pfifo_fast when fq pacing is > a > > requirement? > > > > > > For BBR, pacing is part of the > design in order > > to make BBR more "gentle" in terms > of the rate > > at which it sends, in order to put > less > > pressure on buffers and keep packet > loss > > lower. This is particularly > important when a > > BBR flow is restarting from idle. In > this case > > BBR starts with a full cwnd, and it > counts on > > pacing to pace out the packets at > the > > estimated bandwidth, so that the > queue can > > stay relatively short and yet the > pipe can be > > filled immediately. > > > > > > Running BBR without pacing makes BBR > more > > aggressive, particularly in > restarting from > > idle, but also in the steady state, > where BBR > > tries to use pacing to keep the > queue short. > > > > > > For bulk transfer tests with one > flow, running > > BBR without pacing will likely cause > higher > > queues and loss rates at the > bottleneck, which > > may negatively impact other traffic > sharing > > that bottleneck. > > > > 3. Could fq_codel on the > physical host > > be the reason that it still > works? > > > > > > Nope, fq_codel does not implement > pacing. > > > > 4. Do BBR _only_ work with > fq pacing > > or could fq_codel be used as > a > > replacement? > > > > > > Nope, BBR needs pacing to work > correctly, and > > currently fq is the only Linux qdisc > that > > implements pacing. > > > > 5. 
Is BBR perhaps modified > to do the > > right thing without having > to change > > the qdisc in the current > kernel 4.9? > > > > > > Nope. Linux 4.9 contains the initial > public > > release of BBR from September 2016. > And there > > have been no code changes since then > (just > > expanded comments). > > > > > > Thanks for the test report! > > > > > > neal > > > > > > > > > > > > > > > > > > > _______________________________________________ > > Bloat mailing list > > Bloat@lists.bufferbloat.net > > https://lists.bufferbloat.net/listinfo/bloat > > > > > ^ permalink raw reply [flat|nested] 31+ messages in thread
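flow_limit is fq's per-flow packet budget (100 packets by default), so a single OpenVPN UDP flow carrying all the tunnelled traffic can run into it, which is what the flows_plimit drops above point at. A quick, illustrative way to confirm and relax it on the VM:

    # Watch the drop counters on eth0 during a test; if "flows_plimit"
    # keeps climbing, the per-flow limit is what is dropping packets
    tc -s qdisc show dev eth0

    # Give the single encapsulating flow a larger per-flow budget
    tc qdisc change dev eth0 root fq flow_limit 1000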
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 23:53 ` Eric Dumazet @ 2017-01-25 23:56 ` Hans-Kristian Bakke 2017-01-26 0:10 ` Eric Dumazet 0 siblings, 1 reply; 31+ messages in thread From: Hans-Kristian Bakke @ 2017-01-25 23:56 UTC (permalink / raw) To: Eric Dumazet; +Cc: Neal Cardwell, bloat [-- Attachment #1: Type: text/plain, Size: 13780 bytes --] These are just the fq settings as they get applied when having fq as default qdiscs. I guess there are room for improvements on those default settings depending on use case. For future reference: should I increase the limit on drops or is it okay as it is? On 26 January 2017 at 00:53, Eric Dumazet <eric.dumazet@gmail.com> wrote: > On Thu, 2017-01-26 at 00:47 +0100, Hans-Kristian Bakke wrote: > > > > > > I did record the qdisc settings, but I didn't capture the stats, but > > throttling is definitively active when I watch the tc -s stats in > > realtime when testing (looking at tun1) > > > > > > tc -s qdisc show > > qdisc noqueue 0: dev lo root refcnt 2 > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > backlog 0b 0p requeues 0 > > qdisc fq 8007: dev eth0 root refcnt 2 limit 10000p flow_limit 100p > > buckets 1024 orphan_mask 1023 quantum 3028 initial_quantum 15140 > > refill_delay 40.0ms > > Sent 1420855729 bytes 969198 pkt (dropped 134, overlimits 0 requeues > > 0) > > backlog 0b 0p requeues 0 > > 124 flows (123 inactive, 0 throttled) > > 0 gc, 0 highprio, 3 throttled, 3925 ns latency, 134 flows_plimit > > You seem to hit the "flow_limit 100" maybe because all packets are going > through a single encap flow. ( 134 drops ) > > > qdisc fq 8008: dev tun1 root refcnt 2 limit 10000p flow_limit 100p > > buckets 1024 orphan_mask 1023 quantum 3000 initial_quantum 15000 > > refill_delay 40.0ms > > Sent 1031289740 bytes 741181 pkt (dropped 0, overlimits 0 requeues 0) > > backlog 101616b 3p requeues 0 > > 16 flows (15 inactive, 1 throttled), next packet delay 351937 ns > > 0 gc, 0 highprio, 58377 throttled, 12761 ns latency > > > > > > Looks good, although latency seems a bit high, thanks ! > > > > > > On 26 January 2017 at 00:33, Eric Dumazet <eric.dumazet@gmail.com> > > wrote: > > > > On Thu, 2017-01-26 at 00:04 +0100, Hans-Kristian Bakke wrote: > > > I can do that. I guess I should do the capture from tun1 as > > that is > > > the place that the tcp-traffic is visible? My non-virtual > > nic is only > > > seeing OpenVPN encapsulated UDP-traffic. > > > > > > > But is FQ installed at the point TCP sockets are ? > > > > You should give us "tc -s qdisc show xxx" so that we can > > check if > > pacing (throttling) actually happens. > > > > > > > On 25 January 2017 at 23:48, Neal Cardwell > > <ncardwell@google.com> > > > wrote: > > > On Wed, Jan 25, 2017 at 5:38 PM, Hans-Kristian Bakke > > > <hkbakke@gmail.com> wrote: > > > Actually.. the 1-4 mbit/s results with fq > > sporadically > > > appears again as I keep testing but it is > > most likely > > > caused by all the unknowns between me an my > > > testserver. But still, changing to > > pfifo_qdisc seems > > > to normalize the throughput again with BBR, > > could this > > > be one of those times where BBR and pacing > > actually is > > > getting hurt for playing nice in some very > > variable > > > bottleneck on the way? > > > > > > > > > Possibly. Would you be able to take a tcpdump trace > > of each > > > trial (headers only would be ideal), and post on a > > web site > > > somewhere a pcap trace for one of the slow trials? 
> > > > > > > > > For example: > > > > > > > > > tcpdump -n -w /tmp/out.pcap -s 120 -i eth0 -c > > 1000000 & > > > > > > > > > > > > thanks, > > > neal > > > > > > > > > > > > > > > On 25 January 2017 at 23:01, Neal Cardwell > > > <ncardwell@google.com> wrote: > > > On Wed, Jan 25, 2017 at 3:54 PM, > > Hans-Kristian > > > Bakke <hkbakke@gmail.com> wrote: > > > Hi > > > > > > > > > Kernel 4.9 finally landed in > > Debian > > > testing so I could finally > > test BBR in > > > a real life environment that > > I have > > > struggled with getting any > > kind of > > > performance out of. > > > > > > > > > The challenge at hand is UDP > > based > > > OpenVPN through europe at > > around 35 ms > > > rtt to my VPN-provider with > > plenty of > > > available bandwith available > > in both > > > ends and everything > > completely unknown > > > in between. After tuning the > > > UDP-buffers up to make room > > for my 500 > > > mbit/s symmetrical bandwith > > at 35 ms > > > the download part seemed to > > work > > > nicely at an unreliable 150 > > to 300 > > > mbit/s, while the upload was > > stuck at > > > 30 to 60 mbit/s. > > > > > > > > > Just by activating BBR the > > bandwith > > > instantly shot up to around > > 150 mbit/s > > > using a fat tcp test to a > > public > > > iperf3 server located near > > my VPN exit > > > point in the Netherlands. > > Replace BBR > > > with qubic again and the > > performance > > > is once again all over the > > place > > > ranging from very bad to > > bad, but > > > never better than 1/3 of > > BBRs "steady > > > state". In other words > > "instant WIN!" > > > > > > > > > Glad to hear it. Thanks for the test > > report! > > > > > > However, seeing the > > requirement of fq > > > and pacing for BBR and > > noticing that I > > > am running pfifo_fast within > > a VM with > > > virtio NIC on a Proxmox VE > > host with > > > fq_codel on all physical > > interfaces, I > > > was surprised to see that it > > worked so > > > well. > > > I then replaced pfifo_fast > > with fq and > > > the performance went right > > down to > > > only 1-4 mbit/s from around > > 150 > > > mbit/s. Removing the fq > > again regained > > > the performance at once. > > > > > > > > > I have got some questions to > > you guys > > > that know a lot more than me > > about > > > these things: > > > 1. Do fq (and fq_codel) even > > work > > > reliably in a VM? What is > > the best > > > choice for default qdisc to > > use in a > > > VM in general? > > > > > > > > > Eric covered this one. We are not > > aware of > > > specific issues with fq in VM > > environments. > > > And we have tested that fq works > > sufficiently > > > well on Google Cloud VMs. > > > > > > 2. Why do BBR immediately > > "fix" all my > > > issues with upload through > > that > > > "unreliable" big BDP link > > with > > > pfifo_fast when fq pacing is > > a > > > requirement? > > > > > > > > > For BBR, pacing is part of the > > design in order > > > to make BBR more "gentle" in terms > > of the rate > > > at which it sends, in order to put > > less > > > pressure on buffers and keep packet > > loss > > > lower. This is particularly > > important when a > > > BBR flow is restarting from idle. In > > this case > > > BBR starts with a full cwnd, and it > > counts on > > > pacing to pace out the packets at > > the > > > estimated bandwidth, so that the > > queue can > > > stay relatively short and yet the > > pipe can be > > > filled immediately. 
> > > > > > > > > Running BBR without pacing makes BBR > > more > > > aggressive, particularly in > > restarting from > > > idle, but also in the steady state, > > where BBR > > > tries to use pacing to keep the > > queue short. > > > > > > > > > For bulk transfer tests with one > > flow, running > > > BBR without pacing will likely cause > > higher > > > queues and loss rates at the > > bottleneck, which > > > may negatively impact other traffic > > sharing > > > that bottleneck. > > > > > > 3. Could fq_codel on the > > physical host > > > be the reason that it still > > works? > > > > > > > > > Nope, fq_codel does not implement > > pacing. > > > > > > 4. Do BBR _only_ work with > > fq pacing > > > or could fq_codel be used as > > a > > > replacement? > > > > > > > > > Nope, BBR needs pacing to work > > correctly, and > > > currently fq is the only Linux qdisc > > that > > > implements pacing. > > > > > > 5. Is BBR perhaps modified > > to do the > > > right thing without having > > to change > > > the qdisc in the current > > kernel 4.9? > > > > > > > > > Nope. Linux 4.9 contains the initial > > public > > > release of BBR from September 2016. > > And there > > > have been no code changes since then > > (just > > > expanded comments). > > > > > > > > > Thanks for the test report! > > > > > > > > > neal > > > > > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > Bloat mailing list > > > Bloat@lists.bufferbloat.net > > > https://lists.bufferbloat.net/listinfo/bloat > > > > > > > > > > > > > [-- Attachment #2: Type: text/html, Size: 20031 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Bloat] Initial tests with BBR in kernel 4.9 2017-01-25 23:56 ` Hans-Kristian Bakke @ 2017-01-26 0:10 ` Eric Dumazet 0 siblings, 0 replies; 31+ messages in thread From: Eric Dumazet @ 2017-01-26 0:10 UTC (permalink / raw) To: Hans-Kristian Bakke; +Cc: Neal Cardwell, bloat On Thu, 2017-01-26 at 00:56 +0100, Hans-Kristian Bakke wrote: > These are just the fq settings as they get applied when having fq as > default qdiscs. I guess there are room for improvements on those > default settings depending on use case. > > > For future reference: should I increase the limit on drops or is it > okay as it is? For your use case, increasing flow_limit to 1000 or even 2000 on eth0 would be absolutely fine, since most of your traffic is going to be encapsulated. Note that this setup (vpn) was probably breaking back pressure (TCP Small Queues is relying on this), so adding FQ/pacing probably helps, even with Cubic. > > On 26 January 2017 at 00:53, Eric Dumazet <eric.dumazet@gmail.com> > wrote: > On Thu, 2017-01-26 at 00:47 +0100, Hans-Kristian Bakke wrote: > > > > > > I did record the qdisc settings, but I didn't capture the > stats, but > > throttling is definitively active when I watch the tc -s > stats in > > realtime when testing (looking at tun1) > > > > > > tc -s qdisc show > > qdisc noqueue 0: dev lo root refcnt 2 > > Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) > > backlog 0b 0p requeues 0 > > qdisc fq 8007: dev eth0 root refcnt 2 limit 10000p > flow_limit 100p > > buckets 1024 orphan_mask 1023 quantum 3028 initial_quantum > 15140 > > refill_delay 40.0ms > > Sent 1420855729 bytes 969198 pkt (dropped 134, overlimits 0 > requeues > > 0) > > backlog 0b 0p requeues 0 > > 124 flows (123 inactive, 0 throttled) > > 0 gc, 0 highprio, 3 throttled, 3925 ns latency, 134 > flows_plimit > > You seem to hit the "flow_limit 100" maybe because all packets > are going > through a single encap flow. ( 134 drops ) > > > qdisc fq 8008: dev tun1 root refcnt 2 limit 10000p > flow_limit 100p > > buckets 1024 orphan_mask 1023 quantum 3000 initial_quantum > 15000 > > refill_delay 40.0ms > > Sent 1031289740 bytes 741181 pkt (dropped 0, overlimits 0 > requeues 0) > > backlog 101616b 3p requeues 0 > > 16 flows (15 inactive, 1 throttled), next packet delay > 351937 ns > > 0 gc, 0 highprio, 58377 throttled, 12761 ns latency > > > > > > Looks good, although latency seems a bit high, thanks ! > > > > > > On 26 January 2017 at 00:33, Eric Dumazet > <eric.dumazet@gmail.com> > > wrote: > > > > On Thu, 2017-01-26 at 00:04 +0100, Hans-Kristian > Bakke wrote: > > > I can do that. I guess I should do the capture > from tun1 as > > that is > > > the place that the tcp-traffic is visible? My > non-virtual > > nic is only > > > seeing OpenVPN encapsulated UDP-traffic. > > > > > > > But is FQ installed at the point TCP sockets are ? > > > > You should give us "tc -s qdisc show xxx" so that > we can > > check if > > pacing (throttling) actually happens. > > > > > > > On 25 January 2017 at 23:48, Neal Cardwell > > <ncardwell@google.com> > > > wrote: > > > On Wed, Jan 25, 2017 at 5:38 PM, > Hans-Kristian Bakke > > > <hkbakke@gmail.com> wrote: > > > Actually.. the 1-4 mbit/s results > with fq > > sporadically > > > appears again as I keep testing > but it is > > most likely > > > caused by all the unknowns between > me an my > > > testserver. 
But still, changing to > > pfifo_qdisc seems > > > to normalize the throughput again > with BBR, > > could this > > > be one of those times where BBR > and pacing > > actually is > > > getting hurt for playing nice in > some very > > variable > > > bottleneck on the way? > > > > > > > > > Possibly. Would you be able to take a > tcpdump trace > > of each > > > trial (headers only would be ideal), and > post on a > > web site > > > somewhere a pcap trace for one of the slow > trials? > > > > > > > > > For example: > > > > > > > > > tcpdump -n -w /tmp/out.pcap -s 120 -i > eth0 -c > > 1000000 & > > > > > > > > > > > > thanks, > > > neal > > > > > > > > > > > > > > > On 25 January 2017 at 23:01, Neal > Cardwell > > > <ncardwell@google.com> wrote: > > > On Wed, Jan 25, 2017 at > 3:54 PM, > > Hans-Kristian > > > Bakke <hkbakke@gmail.com> > wrote: > > > Hi > > > > > > > > > Kernel 4.9 finally > landed in > > Debian > > > testing so I could > finally > > test BBR in > > > a real life > environment that > > I have > > > struggled with > getting any > > kind of > > > performance out > of. > > > > > > > > > The challenge at > hand is UDP > > based > > > OpenVPN through > europe at > > around 35 ms > > > rtt to my > VPN-provider with > > plenty of > > > available bandwith > available > > in both > > > ends and > everything > > completely unknown > > > in between. After > tuning the > > > UDP-buffers up to > make room > > for my 500 > > > mbit/s symmetrical > bandwith > > at 35 ms > > > the download part > seemed to > > work > > > nicely at an > unreliable 150 > > to 300 > > > mbit/s, while the > upload was > > stuck at > > > 30 to 60 mbit/s. > > > > > > > > > Just by activating > BBR the > > bandwith > > > instantly shot up > to around > > 150 mbit/s > > > using a fat tcp > test to a > > public > > > iperf3 server > located near > > my VPN exit > > > point in the > Netherlands. > > Replace BBR > > > with qubic again > and the > > performance > > > is once again all > over the > > place > > > ranging from very > bad to > > bad, but > > > never better than > 1/3 of > > BBRs "steady > > > state". In other > words > > "instant WIN!" > > > > > > > > > Glad to hear it. Thanks > for the test > > report! > > > > > > However, seeing > the > > requirement of fq > > > and pacing for BBR > and > > noticing that I > > > am running > pfifo_fast within > > a VM with > > > virtio NIC on a > Proxmox VE > > host with > > > fq_codel on all > physical > > interfaces, I > > > was surprised to > see that it > > worked so > > > well. > > > I then replaced > pfifo_fast > > with fq and > > > the performance > went right > > down to > > > only 1-4 mbit/s > from around > > 150 > > > mbit/s. Removing > the fq > > again regained > > > the performance at > once. > > > > > > > > > I have got some > questions to > > you guys > > > that know a lot > more than me > > about > > > these things: > > > 1. Do fq (and > fq_codel) even > > work > > > reliably in a VM? > What is > > the best > > > choice for default > qdisc to > > use in a > > > VM in general? > > > > > > > > > Eric covered this one. We > are not > > aware of > > > specific issues with fq in > VM > > environments. > > > And we have tested that > fq works > > sufficiently > > > well on Google Cloud VMs. > > > > > > 2. Why do BBR > immediately > > "fix" all my > > > issues with upload > through > > that > > > "unreliable" big > BDP link > > with > > > pfifo_fast when fq > pacing is > > a > > > requirement? 
> > > > > > > > > For BBR, pacing is part of > the > > design in order > > > to make BBR more "gentle" > in terms > > of the rate > > > at which it sends, in > order to put > > less > > > pressure on buffers and > keep packet > > loss > > > lower. This is > particularly > > important when a > > > BBR flow is restarting > from idle. In > > this case > > > BBR starts with a full > cwnd, and it > > counts on > > > pacing to pace out the > packets at > > the > > > estimated bandwidth, so > that the > > queue can > > > stay relatively short and > yet the > > pipe can be > > > filled immediately. > > > > > > > > > Running BBR without pacing > makes BBR > > more > > > aggressive, particularly > in > > restarting from > > > idle, but also in the > steady state, > > where BBR > > > tries to use pacing to > keep the > > queue short. > > > > > > > > > For bulk transfer tests > with one > > flow, running > > > BBR without pacing will > likely cause > > higher > > > queues and loss rates at > the > > bottleneck, which > > > may negatively impact > other traffic > > sharing > > > that bottleneck. > > > > > > 3. Could fq_codel > on the > > physical host > > > be the reason that > it still > > works? > > > > > > > > > Nope, fq_codel does not > implement > > pacing. > > > > > > 4. Do BBR _only_ > work with > > fq pacing > > > or could fq_codel > be used as > > a > > > replacement? > > > > > > > > > Nope, BBR needs pacing to > work > > correctly, and > > > currently fq is the only > Linux qdisc > > that > > > implements pacing. > > > > > > 5. Is BBR perhaps > modified > > to do the > > > right thing > without having > > to change > > > the qdisc in the > current > > kernel 4.9? > > > > > > > > > Nope. Linux 4.9 contains > the initial > > public > > > release of BBR from > September 2016. > > And there > > > have been no code changes > since then > > (just > > > expanded comments). > > > > > > > > > Thanks for the test > report! > > > > > > > > > neal > > > > > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > Bloat mailing list > > > Bloat@lists.bufferbloat.net > > > https://lists.bufferbloat.net/listinfo/bloat > > > > > > > > > > > > > > > ^ permalink raw reply [flat|nested] 31+ messages in thread
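To make Eric's suggested change survive a reboot on a Debian-style VM, one option is a post-up hook in /etc/network/interfaces; the stanza below is purely illustrative (the real interface may be configured statically), and the flow_limit value is simply the upper end of the range he suggests:

    # /etc/network/interfaces (fragment)
    auto eth0
    iface eth0 inet dhcp
        post-up tc qdisc replace dev eth0 root fq flow_limit 2000

Systems using systemd-networkd or another network manager would apply the same tc command from their own hook mechanism.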