[Bloat] Mitigating bufferbloat at the receiver

Jonathan Morton chromatix99 at gmail.com
Thu Mar 10 15:29:28 PST 2011


So far I've seen plenty of talk about removing bufferbloat on the sending side (TCP congestion window, device/driver buffers) and in the network (AQM, ECN).  These are all good, but I'd like to talk about what we can do today at the receiver.

Once upon a time, I spent a few months living in the back of my parents' house.  By British standards, this was in the middle of absolutely nowhere, and the phone line quality was *dreadful*.  This meant that to stop my analogue modem dropping out more than once an hour, or spending painfully long periods retraining against a kind of noise that retraining really wasn't designed to cope with, I had to force it all the way down to 4800 baud - the lowest speed available without dropping back to ancient modulations that had no robustness designed in.

Needless to say, at 5kbps the bufferbloat problem in the ISP's modem bank was pretty bad - and this was in about 2003, so anyone who says this is a new problem is either lying or ignorant.  I soon got fed up enough to tune Linux's receive window limit down to about 4 packets, which was still several seconds' worth but allowed me to use the connection for more than one thing at once - handy when I wanted to experiment with Gentoo.
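(For the curious, a per-socket analogue of that kind of clamp looks roughly like the sketch below - the tuning above was done with the kernel's global receive window limit, and the 4-segment budget and 1460-byte MSS here are just illustrative numbers, not tuned values.)

/* Rough per-socket version of clamping the receive window to ~4 packets.
 * The segment count and MSS are illustrative only. */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return 1;
    }

    int rcvbuf = 4 * 1460;  /* about four full-sized segments */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf)) < 0)
        perror("setsockopt(SO_RCVBUF)");

    /* The kernel roughly doubles the requested value to cover its own
     * bookkeeping, so the advertised window stays in the same ballpark. */
    socklen_t len = sizeof(rcvbuf);
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &len);
    printf("effective receive buffer: %d bytes\n", rcvbuf);

    close(fd);
    return 0;
}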

Incidentally, I was already using a form of split-TCP, in that I had installed a web cache on the modem-equipped machine.  This meant that I only had to tune one TCP stack in order to get the vast majority of the benefits.  BitTorrent hadn't taken off at that point, so Web and FTP traffic were the main bandwidth users, and everything else was interactive and didn't need tuning.  Meanwhile, for uploads I turned on SFQ and left it at that, all the while wondering why ISPs didn't use SFQ in their modem banks.

Without receive window scaling, I didn't see any particular problem with congestion control per se.  I just saw that TCP was opening the congestion window to match the receive window, which was tuned for LANs and clean phone lines at that time, and that interactive packets had to wait behind the bulk traffic.  SFQ would have reduced the latency for interactive traffic and setting up new connections, while bulk traffic would still work as well as before.

Much more recently, I have more than once had to spend extended periods using a 3G modem (well, a tethered Nokia phone) as my primary connection (thank goodness that here in Finland, unlimited connections are actually available).  I soon discovered that the problem I had seen at 5kbps was just as bad at 500kbps.

By now receive window scaling was enabled by default in Linux, so the receive window and congestion window were both growing until they hit the end of the buffer in the 3G base station - which proved to be something like 30 seconds deep.  Furthermore, interactive traffic was *still* waiting behind the bulk traffic, indicating a plain drop-tail queue.

Read that again.  THIRTY SECONDS of latency under a single bulk TCP.
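(For scale: 500kbps times 30 seconds works out to roughly 1.9 megabytes of data for a single flow parked in the base station's buffer.)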

The practical effect of this was that I could watch the progress of Gentoo downloading source packages, and it would proceed smoothly at line speed for a while, and then abruptly stop - due to a dropped packet.  Many seconds later, it would suddenly jump ahead, whereupon it would usually continue smoothly for a little longer before abruptly stopping again.  This was while using a geographically local mirror.

I quickly concluded that since ISPs were *still* not using anything like SFQ despite the enormous cost of 3G base equipment, they were simply as dumb as rocks - and the lack of ECN also confirmed it.  So I started poking around to see what I could do about it, half-remembering the effect of cutting the receive window years before.

I've attached the kernel patch that I came up with, and which I've been running on my gateway box ever since (even though I have my ADSL back).  It measures the actual bandwidth of the flow (based on RTT and window size), calculates an appropriate window size, and then increments the advertised window towards it.  This got me down to about 1 second of buffering on the 3G link, which I considered basically acceptable (in comparison).  At higher bandwidths the permitted latency is lower - or, to put it another way, at lower latencies more bandwidth is made available.  The acceptable latency is also capped at 2 seconds as a safety valve.
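A much-simplified sketch of the idea follows, in plain user-space C rather than the actual patch code - the names, the fixed 1-second placeholder target and the 1/8 step size are illustrative only, not what blackpool.patch literally does.

/* Simplified sketch of the receive-window clamp - NOT the patch itself. */
#include <stdio.h>

static unsigned long clamp_window(unsigned long window_bytes,
                                  unsigned long rtt_ms,
                                  unsigned long mss)
{
    const unsigned long max_latency_ms = 2000;  /* the 2-second safety valve */
    unsigned long bw_bytes_per_ms, target_latency_ms, target_window;

    if (rtt_ms == 0)
        rtt_ms = 1;

    /* bandwidth ~= delivered window / RTT */
    bw_bytes_per_ms = window_bytes / rtt_ms;

    /* pick an acceptable latency for this bandwidth (higher bandwidth ->
     * tighter target), capped at the safety valve; fixed here for brevity */
    target_latency_ms = 1000;
    if (target_latency_ms > max_latency_ms)
        target_latency_ms = max_latency_ms;

    /* window that would keep the queueing delay near the target */
    target_window = bw_bytes_per_ms * target_latency_ms;
    if (target_window < 4 * mss)
        target_window = 4 * mss;    /* don't starve the flow entirely */

    /* step towards the target rather than jumping straight to it */
    if (window_bytes > target_window)
        window_bytes -= (window_bytes - target_window) / 8 + 1;
    else if (window_bytes < target_window)
        window_bytes += (target_window - window_bytes) / 8 + 1;

    return window_bytes;
}

int main(void)
{
    /* roughly the 3G case: ~30 s of queue at ~500 kbit/s; the RTT would
     * really fall as the queue drains, but is held constant for simplicity */
    unsigned long window = 1875000, rtt = 30000;

    for (int i = 0; i < 40; i++)
        window = clamp_window(window, rtt, 1448);
    printf("clamped receive window: ~%lu bytes\n", window);
    return 0;
}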

As it happens, 2 seconds of latency is pretty much the maximum for acceptable TCP setup performance.  This is because the initial RTO for TCP is supposed to be 3 seconds.  With 30 seconds of latency, TCP will *always* retransmit several times during the setup phase, even if no packets are actually lost.  So that's something to tell your packet-loss-obsessed router chums: forget packet loss, minimise retransmits - and explain why packet loss and retransmits are not synonymous!
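(Concretely: with a 3-second initial RTO that doubles on each timeout, a handshake whose reply is stuck behind roughly 30 seconds of queue gets retransmitted at about t=3, 9 and 21 seconds - three spurious retransmits - before the first response even arrives.)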

 - Jonathan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: blackpool.patch
Type: application/octet-stream
Size: 5176 bytes
Desc: not available
URL: <https://lists.bufferbloat.net/pipermail/bloat/attachments/20110311/7a07efe5/attachment.obj>

