[Bloat] Initial tests with BBR in kernel 4.9
Eric Dumazet
eric.dumazet at gmail.com
Wed Jan 25 18:53:05 EST 2017
On Thu, 2017-01-26 at 00:47 +0100, Hans-Kristian Bakke wrote:
>
>
> I did record the qdisc settings but didn't capture the stats. However,
> throttling is definitely active when I watch the tc -s stats in real
> time while testing (looking at tun1).
>
>
> tc -s qdisc show
> qdisc noqueue 0: dev lo root refcnt 2
> Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> qdisc fq 8007: dev eth0 root refcnt 2 limit 10000p flow_limit 100p
> buckets 1024 orphan_mask 1023 quantum 3028 initial_quantum 15140
> refill_delay 40.0ms
> Sent 1420855729 bytes 969198 pkt (dropped 134, overlimits 0 requeues 0)
> backlog 0b 0p requeues 0
> 124 flows (123 inactive, 0 throttled)
> 0 gc, 0 highprio, 3 throttled, 3925 ns latency, 134 flows_plimit
You seem to hit the "flow_limit 100" limit, maybe because all packets are
going through a single encap flow (134 drops).
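
If those drops matter, one possible workaround, offered only as a sketch
(the value 1000 is illustrative), is to raise the per-flow packet limit on
eth0 in place, leaving the other fq parameters untouched:

  tc qdisc change dev eth0 root fq flow_limit 1000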
> qdisc fq 8008: dev tun1 root refcnt 2 limit 10000p flow_limit 100p
> buckets 1024 orphan_mask 1023 quantum 3000 initial_quantum 15000
> refill_delay 40.0ms
> Sent 1031289740 bytes 741181 pkt (dropped 0, overlimits 0 requeues 0)
> backlog 101616b 3p requeues 0
> 16 flows (15 inactive, 1 throttled), next packet delay 351937 ns
> 0 gc, 0 highprio, 58377 throttled, 12761 ns latency
>
>
Looks good, although latency seems a bit high, thanks!
>
>
> On 26 January 2017 at 00:33, Eric Dumazet <eric.dumazet at gmail.com> wrote:
>
> On Thu, 2017-01-26 at 00:04 +0100, Hans-Kristian Bakke wrote:
> > I can do that. I guess I should do the capture from tun1, as that is
> > the place where the TCP traffic is visible? My non-virtual NIC is
> > only seeing OpenVPN-encapsulated UDP traffic.
> >
>
> But is FQ installed at the point where the TCP sockets are?
>
> You should give us "tc -s qdisc show xxx" so that we can check if
> pacing (throttling) actually happens.
>
>
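
If fq is not already the root qdisc on the interface that carries the TCP
sockets (tun1 in this setup), a minimal sketch of installing it and then
checking its pacing counters would be:

  tc qdisc replace dev tun1 root fq
  tc -s qdisc show dev tun1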
> > On 25 January 2017 at 23:48, Neal Cardwell <ncardwell at google.com> wrote:
> >
> > On Wed, Jan 25, 2017 at 5:38 PM, Hans-Kristian Bakke
> > <hkbakke at gmail.com> wrote:
> >
> > Actually... the 1-4 mbit/s results with fq sporadically appear again
> > as I keep testing, but that is most likely caused by all the unknowns
> > between me and my test server. Still, changing back to pfifo_fast
> > seems to normalize the throughput again with BBR. Could this be one
> > of those times where BBR and pacing actually get hurt for playing
> > nice in some very variable bottleneck on the way?
> >
> >
> > Possibly. Would you be able to take a tcpdump trace of each trial
> > (headers only would be ideal), and post on a web site somewhere a
> > pcap trace for one of the slow trials?
> >
> > For example:
> >
> > tcpdump -n -w /tmp/out.pcap -s 120 -i eth0 -c 1000000 &
> >
> >
> >
> > thanks,
> > neal
> >
> > On 25 January 2017 at 23:01, Neal Cardwell <ncardwell at google.com> wrote:
> >
> > On Wed, Jan 25, 2017 at 3:54 PM, Hans-Kristian Bakke
> > <hkbakke at gmail.com> wrote:
> > Hi
> >
> >
> > Kernel 4.9 finally landed in Debian testing, so I could finally test
> > BBR in a real-life environment that I have struggled to get any kind
> > of performance out of.
> >
> > The challenge at hand is UDP-based OpenVPN through Europe at around
> > 35 ms RTT to my VPN provider, with plenty of bandwidth available at
> > both ends and everything completely unknown in between. After tuning
> > the UDP buffers up to make room for my 500 mbit/s symmetrical
> > bandwidth at 35 ms, the download part seemed to work nicely at an
> > unreliable 150 to 300 mbit/s, while the upload was stuck at 30 to 60
> > mbit/s.
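
For reference, the bandwidth-delay product for that path works out to
roughly 500 Mbit/s * 0.035 s = 17.5 Mbit, or about 2.2 MB, so the socket
buffer ceilings need to be at least in that range. A minimal sketch of the
kind of tuning involved (the exact values here are illustrative, not the
ones used in this test):

  sysctl -w net.core.rmem_max=4194304
  sysctl -w net.core.wmem_max=4194304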
> >
> >
> > Just by activating BBR, the bandwidth instantly shot up to around
> > 150 mbit/s using a fat TCP test to a public iperf3 server located
> > near my VPN exit point in the Netherlands. Replace BBR with CUBIC
> > again and the performance is once again all over the place, ranging
> > from very bad to bad, but never better than 1/3 of BBR's "steady
> > state". In other words: "instant WIN!"
> >
> >
> > Glad to hear it. Thanks for the test report!
> >
> > However, seeing the requirement of fq and pacing for BBR, and
> > noticing that I am running pfifo_fast within a VM with a virtio NIC
> > on a Proxmox VE host with fq_codel on all physical interfaces, I was
> > surprised to see that it worked so well. I then replaced pfifo_fast
> > with fq and the performance went right down to only 1-4 mbit/s from
> > around 150 mbit/s. Removing the fq again regained the performance at
> > once.
> >
> > I have some questions for you guys who know a lot more than me about
> > these things:
> >
> > 1. Do fq (and fq_codel) even work reliably in a VM? What is the best
> > choice of default qdisc to use in a VM in general?
> >
> >
> > Eric covered this one. We are not aware of specific issues with fq
> > in VM environments. And we have tested that fq works sufficiently
> > well on Google Cloud VMs.
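
If fq is the qdisc you want inside the VM, it can also be made the default
for newly created interfaces via sysctl; a sketch (existing interfaces keep
their current qdisc until it is replaced):

  sysctl -w net.core.default_qdisc=fq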
> >
> > 2. Why does BBR immediately "fix" all my issues with upload through
> > that "unreliable" big-BDP link with pfifo_fast, when fq pacing is a
> > requirement?
> >
> >
> > For BBR, pacing is part of the design in order to make BBR more
> > "gentle" in terms of the rate at which it sends, in order to put
> > less pressure on buffers and keep packet loss lower. This is
> > particularly important when a BBR flow is restarting from idle. In
> > this case BBR starts with a full cwnd, and it counts on pacing to
> > pace out the packets at the estimated bandwidth, so that the queue
> > can stay relatively short and yet the pipe can be filled
> > immediately.
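
As a rough worked example with the numbers from this thread (illustrative,
not measured): at an estimated bandwidth of 150 Mbit/s and 35 ms RTT, the
BDP is about 650 KB, or roughly 440 full-size packets. Pacing spaces those
packets about 80 microseconds apart (1500 bytes * 8 / 150 Mbit/s) instead
of bursting the whole cwnd at line rate, which is how the bottleneck queue
stays short while the pipe still gets filled.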
> >
> >
> > Running BBR without pacing makes BBR more aggressive, particularly
> > in restarting from idle, but also in the steady state, where BBR
> > tries to use pacing to keep the queue short.
> >
> > For bulk transfer tests with one flow, running BBR without pacing
> > will likely cause higher queues and loss rates at the bottleneck,
> > which may negatively impact other traffic sharing that bottleneck.
> >
> > 3. Could fq_codel on the physical host be the reason that it still
> > works?
> >
> > Nope, fq_codel does not implement pacing.
> >
> > 4. Does BBR _only_ work with fq pacing, or could fq_codel be used
> > as a replacement?
> >
> > Nope, BBR needs pacing to work correctly, and currently fq is the
> > only Linux qdisc that implements pacing.
> >
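
A quick way to confirm that pacing is actually in effect, offered as a
sketch, is to look at the per-socket pacing rate and the fq "throttled"
counters during a transfer:

  ss -tin
  tc -s qdisc show dev tun1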
> > 5. Is BBR perhaps modified to do the right thing without having to
> > change the qdisc in the current kernel 4.9?
> >
> > Nope. Linux 4.9 contains the initial public release of BBR from
> > September 2016. And there have been no code changes since then (just
> > expanded comments).
> >
> >
> > Thanks for the test report!
> >
> >
> > neal