[Bloat] The wireless problem in a nutshell

General list for discussing Bufferbloat

* [Bloat] The wireless problem in a nutshell
@ 2011-02-06 17:41 Dave Täht
  2011-02-07  8:56 ` Luca Dionisi
  2011-02-08 19:56 ` Juliusz Chroboczek
  0 siblings, 2 replies; 9+ messages in thread
From: Dave Täht @ 2011-02-06 17:41 UTC
  To: bloat

Packet loss on wireless is *bursty* and causes TCP to reset its
congestion window all over the place. Losing three acks in a row is
commonplace.

With a long Round Trip Time (RTT), such as from your house to youtube,
you can only rarely get even slightly close to good throughput over
wireless. Real packet loss rates can go as high as 100%. [1]

Even with a moderately lossy connection (3%), you can end up almost
permanently in slow start.

This is why my original (1998)[2] (and current) wireless architecture
always includes a proxy like squid or polipo on the last mile/household,
from the wire to the wireless. 

A congestion window reset doesn't hurt you if your RTT is 2ms. You ramp
up quickly on either side of the wireless connection.
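
As a rough illustration of why the ramp is quick at 2ms and slow at
70ms, here is a minimal sketch that counts how many round trips slow
start needs to regrow the window to one bandwidth-delay product. The
20 Mbit/s rate, the 1460-byte segment and the initial window of 3
segments are assumptions picked for the example, not figures from this
thread, and the model ignores delayed ACKs, ssthresh and further loss
during the ramp.

```python
import math

def slow_start_ramp(rate_bps, rtt_s, mss_bytes=1460, initial_window=3):
    """Return (round_trips, seconds) for slow start to grow the congestion
    window from initial_window segments up to one bandwidth-delay product.
    All default values are illustrative assumptions."""
    bdp_segments = math.ceil(rate_bps * rtt_s / 8 / mss_bytes)
    window, round_trips = initial_window, 0
    while window < bdp_segments:
        window *= 2          # slow start doubles the window every RTT
        round_trips += 1
    return round_trips, round_trips * rtt_s

for rtt_ms in (2, 70):
    trips, seconds = slow_start_ramp(20e6, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>2} ms: {trips} round trips, "
          f"{seconds * 1000:.0f} ms to refill the pipe")
```

Under these assumptions the short path is back at full rate after a
single round trip, while the 70ms path needs several round trips and
hundreds of milliseconds, during which another loss burst may well hit.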

In the piece that I'm still struggling to write - I call this concept
the long “U” - where the U describes the huge amounts of available
bandwidth on either side of the choke point - usually at the home or
business gateway.

People think, "oh, adding a cache is what you are doing with a proxy".
Yes, caching helps, but what you are also doing is dividing each TCP
stream into two pieces. The wireless piece is VERY short, and
congestion control then works correctly using existing techniques on
both the wired and the wireless portions of the connection.

The problem is, I never noticed until recently that everyone (else) was
trying to make long RTT paths work over wireless allllll the way across
the Internet!

And... To compensate, they were adding sufficient buffering inside the
wireless device AND retries - to get around real, local packet loss
rates that can lose hundreds of packets all at once, in a burst.

Which, as we all now know, clobbers latency.

That's the wireless problem in a nutshell[3]. 

If your local path is only 2ms, wireless + any given TCP algorithm
recovers from a packet loss burst, GREAT. There is no confusion between
wireless interference and congestion on the LFN. You have plenty of
bandwidth on both sides of the connection to recover. The proxy smooths
out the traffic on the wired AND wireless side.

RTT of 70ms... not so much. International links, forget about it. [4]

One positive aspect of this is that many routers support proxying -
polipo is common. And an even more positive aspect is that supplying a
proxy is well supported by browsers, via both DHCP and the WPAD
standard. (WPAD will even work over IPv6, and polipo can function as an
IPv4/IPv6 ALG gateway.)

Most overseas providers already use some level of transparent proxying. 

It doesn't help the scp upload problem much - but there are ssh proxies
out there too.

I personally regard the wireless packet loss burst problem as completely
intractable for long TCP links without using proxies - or insane amounts
of buffering.

The irony?

If everybody were to go and turn up a web proxy tomorrow - we'd make
bufferbloat vastly worse on both sides of the connection.

On the other hand, proxies make the problem tractable on both sides of
the connection - the wired side can apply ECN/SACK/DSACK, AQM, and
QoS - and the wireless side can do lots of stuff from the MAC layer on
up, including the same techniques, on its side.

Once you have a much smaller RTT, the wireless problem gets much simpler.

If there are those on this list that are not running a proxy, try one out - you'll be amazed.

-- 
Dave Taht
http://nex-6.taht.net

[1] There are large numbers of compensation mechanisms from the MAC
layer up that I'm not going to go into. They universally induce latency.
[2]
http://the-edge.blogspot.com/2010/10/who-invented-embedded-linux-based.html

We never documented the use of proxies. At the time, it was obvious -
everybody running a small business or home was using web proxies.

[3] Long distance wireless links and mesh networks have similar but
different problems

[4] Bursty packet loss doesn't affect non-TCP traffic as badly - UDP VoIP, in
particular, doesn't have an issue.


* Re: [Bloat] The wireless problem in a nutshell
  2011-02-06 17:41 [Bloat] The wireless problem in a nutshell Dave Täht
@ 2011-02-07  8:56 ` Luca Dionisi
  2011-02-07 11:31   ` Dave Täht
  2011-02-08 19:56 ` Juliusz Chroboczek
  1 sibling, 1 reply; 9+ messages in thread
From: Luca Dionisi @ 2011-02-07  8:56 UTC
  To: Dave Täht, bufferbloat list

On Sun, Feb 6, 2011 at 6:41 PM, Dave Täht <d@taht.net> wrote:
> I personally regard the wireless packet loss burst problem as completely
> intractable for long TCP links without using proxies - or insane amounts
> of buffering.

Is this particular aspect going to be different once nRED is complete?

I mean, if you were in a mesh network, primarily made of wireless
links, where the RTT easily becomes high, how could you make TCP links
behave correctly?

--Luca


* Re: [Bloat] The wireless problem in a nutshell
  2011-02-07  8:56 ` Luca Dionisi
@ 2011-02-07 11:31   ` Dave Täht
  2011-02-08 15:54     ` Justin McCann
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Täht @ 2011-02-07 11:31 UTC
  To: Luca Dionisi; +Cc: bufferbloat list

Luca Dionisi <luca.dionisi@gmail.com> writes:

> On Sun, Feb 6, 2011 at 6:41 PM, Dave Täht <d@taht.net> wrote:
>> I personally regard the wireless packet loss burst problem as completely
>> intractable for long TCP links without using proxies - or insane amounts
>> of buffering.
>
> Is this particular aspect going to be different once nRED is complete?

I would love for nRED to compensate for both variable bandwidth and
bursty packet loss. 

I also would like the Easter bunny to visit this year, world peace be
achieved, and all four Beatles get back together.

I would love it if someone could pull this bunny out of the hat. I'd
settle for acceptable buffering at acceptable latencies with acceptable
packet loss under worst case conditions with good queue management - the
last nRED can solve, the rest requires hacking a lot of
over-optimistically designed device drivers.

> I mean, if you were in a mesh network, primarily made of wireless
> links, where the RTT easily becomes high, how could you make TCP links
> behave correctly?

What I wrote about in the previous email was an in-home or final hop
wireless 802.11 network, and I tried to avoid the thorny issues of a
mesh network. I'll try, but I don't want my original point about "a
short RTT helps wireless TCP" to be lost. It's not well known.

You can apply all kinds of AQM (RED, etc) to either side of a one hop
wireless connection with a proxy to good effect. You can also apply
(some) level of buffering and/or remote acknowledgments to good
effect. (You have to anyway, I just tried to simplify out that part -
the various 802.11 standards about how/when/why 802.11 does retransmits
cover dozens of pages)

Moving on to mesh networks....

An encouraging thing about those is that you are carrying not one, but
dozens or hundreds of TCP streams. Assuming you are using some form of
weighted fair queuing, this means that a burst of lost packets
translates into a statistically small, random amount of packet loss
across many streams - which TCP can actually compensate for.
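
To make the arithmetic behind that concrete, here is a small sketch
that idealizes fair queuing as strict round-robin over equally weighted,
always-backlogged flows (an assumption; real weighted fair queuing is
messier) and counts how a single burst of consecutive drops is shared
out. The burst length of 50 packets and the flow counts are made-up
example values.

```python
from collections import Counter

def losses_per_flow(num_flows, burst_start, burst_len):
    """Packets leave the queue in strict round-robin order across num_flows
    flows; packets burst_start .. burst_start + burst_len - 1 are lost.
    Returns a Counter mapping flow index -> packets lost by that flow."""
    burst = range(burst_start, burst_start + burst_len)
    return Counter(seq % num_flows for seq in burst)

for flows in (1, 8, 100):
    lost = losses_per_flow(flows, burst_start=1000, burst_len=50)
    print(f"{flows:>3} flows: worst-hit flow loses "
          f"{max(lost.values())} of 50 dropped packets")
```

With one flow the whole burst lands on a single connection; with a
hundred flows no connection loses more than one packet, which is the
kind of loss TCP's fast retransmit is designed to handle.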

How long can the RTT get before TCP throughput drops? 

Good question.

That depends on your tolerance for latency and packet loss. You can
increase your RTT, and reduce packet drops, by using buffering at the
expense of latency. You can reduce your perceived latency for things
like DNS by locating caches throughout the network. You can/should put
proxies at certain average distances in the mesh (at one point I was
using BGP anycast, and later on I switched to DNS - I kind of regret
doing that - I wish babel did anycast)

A really negative thing about mesh networks is each hop is lossy. 

Anyway the discussion of bufferbloat has given me an idea regarding
using SACK more often that I'm dying to try out.

> --Luca

-- 
Dave Taht
http://nex-6.taht.net


* Re: [Bloat] The wireless problem in a nutshell
  2011-02-07 11:31   ` Dave Täht
@ 2011-02-08 15:54     ` Justin McCann
  0 siblings, 0 replies; 9+ messages in thread
From: Justin McCann @ 2011-02-08 15:54 UTC
  To: Dave Täht; +Cc: bufferbloat list

On Mon, Feb 7, 2011 at 6:31 AM, Dave Täht <d@taht.net> wrote:
>...
> How long can the RTT get before TCP throughput drops?
>
> Good question.
>
> That depends on your tolerance for latency and packet loss. You can
> increase your RTT, and reduce packet drops, by using buffering at the
> expense of latency. ...

There's a rule of thumb for the upper bound in two SIGCOMM-award
papers; I generally use the simple one from the Mathis paper.

bandwidth = (MSS / RTT) * (C / sqrt(loss))

   MSS = maximum segment size
   RTT = round trip time
   C = a constant between 0.87 and 1.31, depending on the ACK strategy
       and type of loss
   sqrt(loss) = square root of the probability of losing a packet

So, for a given connection, your expected throughput scales linearly
with the size of your packets, and inversely with the RTT and
sqrt(loss).
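
A minimal sketch of the rule of thumb above, written independently of
the script linked below. The default C = 1.22 is one value inside the
quoted range, and the 1460-byte MSS and the RTT values are assumptions
chosen to match the 2ms and 70ms cases discussed earlier in the thread.
Note that the model assumes steady, non-bursty loss, so for bursty
wireless loss it is an optimistic bound at best.

```python
import math

def mathis_throughput_bps(mss_bytes, rtt_s, loss, c=1.22):
    """Upper bound (MSS / RTT) * (C / sqrt(loss)), returned in bits/s.
    c=1.22 is an assumed value within the 0.87..1.31 range."""
    if not 0 < loss <= 1:
        raise ValueError("loss must be a probability in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss))

MSS = 1460  # bytes; assumed Ethernet-sized segment
for rtt_ms in (2, 70, 200):
    bound = mathis_throughput_bps(MSS, rtt_ms / 1000, loss=0.03)
    print(f"RTT {rtt_ms:>3} ms, 3% loss: at most {bound / 1e6:.2f} Mbit/s")
```

Because the bound is inversely proportional to RTT, the 2ms path gets
35 times the ceiling of the 70ms path at the same loss rate, and
quadrupling the loss rate halves the ceiling.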

I have an unverified/untested Python implementation of the equations
at http://www.cs.umd.edu/~jmccann/tcp_tput.py

I wonder how jumbo frames play into bufferbloat. When queues are in
terms of number of packets (not bytes), jumbo frames make the problem
even worse.
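
A quick back-of-the-envelope sketch of that effect: the time needed to
drain a full queue whose limit is counted in packets. The 1000-packet
limit and the 100 Mbit/s link rate are assumed example values, not
measurements.

```python
def queue_delay_ms(queue_packets, packet_bytes, link_bps):
    """Milliseconds needed to drain a full, packet-counted queue."""
    return queue_packets * packet_bytes * 8 / link_bps * 1000

QUEUE_LEN = 1000   # packets; assumed queue limit
LINK = 100e6       # bits per second; assumed link rate
for name, size in (("standard 1500-byte", 1500), ("jumbo 9000-byte", 9000)):
    delay = queue_delay_ms(QUEUE_LEN, size, LINK)
    print(f"{name} frames: up to {delay:.0f} ms of queue")
```

The same packet count holds six times as many bytes with 9000-byte
frames, so the worst-case queueing delay grows by the same factor.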

    Justin

* The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm,
by Matthew Mathis, Jeffrey Semke, Jamshid Mahdavi, and Teunis Ott, CCR
27(3), 1997. (pdf:
http://ccr.sigcomm.org/archive/1997/jul97/ccr-9707-mathis.pdf)
* Modeling TCP Throughput:  A Simple Model and Its Empirical
Validation, by Jitendra Padhye, Victor Firoiu, Don Towsley, and Jim
Kurose, Proc. of ACM SIGCOMM 1998. (pdf:
http://conferences.sigcomm.org/sigcomm/1998/tp/paper25.pdf)


* Re: [Bloat] The wireless problem in a nutshell
  2011-02-06 17:41 [Bloat] The wireless problem in a nutshell Dave Täht
  2011-02-07  8:56 ` Luca Dionisi
@ 2011-02-08 19:56 ` Juliusz Chroboczek
  2011-02-08 21:54   ` Dave Täht
  1 sibling, 1 reply; 9+ messages in thread
From: Juliusz Chroboczek @ 2011-02-08 19:56 UTC
  To: Dave Täht; +Cc: bloat

> what you are also doing is dividing the TCP streams into two pieces -
> the wireless piece is VERY short - and congestion control then works
> correctly using existing techniques on the wired and unwired portions
> of the connection.

You've reinvented a useful technique, called "split-TCP".  It's commonly
used by the "WAN accelerators" that you can buy in order to get Microsoft
protocols working across the Internet.

The downside of split-TCP, of course, is that it breaks e2e, and hence
fate sharing, i.e. it introduces a new point of failure and makes your
network more brittle.  The challenge is to make TCP efficient without
the need for split-TCP, which requires differentiating between
congestion-induced loss and wireless loss.

--Juliusz


* Re: [Bloat] The wireless problem in a nutshell
  2011-02-08 19:56 ` Juliusz Chroboczek
@ 2011-02-08 21:54   ` Dave Täht
  2011-02-10 18:39     ` Juliusz Chroboczek
  2011-02-11 14:50     ` Dave Täht
  0 siblings, 2 replies; 9+ messages in thread
From: Dave Täht @ 2011-02-08 21:54 UTC
  To: Juliusz Chroboczek; +Cc: bloat

Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr> writes:

>> what you are also doing is dividing the TCP streams into two pieces -
>> the wireless piece is VERY short - and congestion control then works
>> correctly using existing techniques on the wired and unwired portions
>> of the connection.
>
> You've reinvented a useful technique, called "split-TCP".  It's commonly
> used by the "WAN accelerators" that you can buy in order to get Microsoft
> protocols working across the Internet.

I'd been using this technique since 1996 or so, when I was working on a
hybrid wireless broadcast system (on channel 13), with modem uplinks. At
the time I just thought, "oh, proxies are useful to smooth and shorten
the path", rather than thinking it any great revelation - and I have
used it everywhere since.

I'm pretty sure it existed before then, but it has special applicability
to wireless. Hmm... The only Wikipedia article on it is in German... I
remember it coming up during the MosquitoNet project...

At the time I was more relieved by the discovery that there was indeed a
lower bound on asymmetric TCP/IP based connections - the size of TCP/IP
acks meant that despite the cable companies' intention (at the time) to
build a network that only had enough backchannel bandwidth for a "buy"
button, they were going to be forced to have some ratio between
downloads/uploads that was better than 20/1 in order to function at all.
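
A rough sketch of where a bound of that kind comes from: a bulk TCP
download has to send ACKs upstream, so the ratio of downstream to
upstream bytes cannot exceed the data carried per ACK divided by the
ACK size. The 40-byte ACK (IP plus TCP headers, no options), the packet
sizes and the ACK policies below are assumptions, and link-layer framing
is ignored, so the exact figure will differ from network to network.

```python
def max_down_up_ratio(data_packet_bytes, ack_bytes=40, segments_per_ack=1):
    """Largest downstream:upstream byte ratio a bulk TCP download can
    sustain when the only upstream traffic is bare ACKs.
    ack_bytes=40 assumes IPv4 + TCP headers with no options."""
    return data_packet_bytes * segments_per_ack / ack_bytes

for size, per_ack in ((576, 1), (1500, 1), (1500, 2)):
    ratio = max_down_up_ratio(size, segments_per_ack=per_ack)
    print(f"{size}-byte packets, 1 ACK per {per_ack}: about {ratio:.1f}:1")
```

Any link provisioned more asymmetrically than this runs out of upstream
capacity for ACKs before the downstream is full.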

A quick google for "split tcp" shows a chain of papers on it in the 00s
that are quite fascinating, extensive, and have some delightful math[1]
that describes the (desirable) behavior(s). There's even a proposed
generic implementation for the Linux kernel[2] and an ns2 simulation [3]

Great pointer, thanks.

> The downside of split-TCP, of course, is that it breaks e2e, and hence
> fate sharing, i.e. it introduces a new point of failure and makes your
> network more brittle.  The challenge is to make TCP efficient without
> the need for split-TCP, which requires differentiating between
> congestion-induced loss and wireless loss.

e2e has already gone the way of the dodo with NAT. With the exhaustion
of the IPv4 address space, we are heading towards a world of ALGs[4]
everywhere whether we like it or not.

So above I'd change either 

A) the word "challenge" for "impossibility"

Or 

B) The challenge is to make some replacement protocol for TCP efficient
that differentiates between congestion-induced loss and wireless loss.

-- 
Dave Taht
http://nex-6.taht.net

1: https://www1.ethz.ch/csg/people/karaliom/papers/ieeewcnc05.pdf

2: http://www.docstoc.com/docs/11900341/Implementing-Split-TCP-in-Linux-Kernel
3: http://www.cs.northwestern.edu/~ais/split_tcp.html
4: Application layer gateways


* Re: [Bloat] The wireless problem in a nutshell
  2011-02-08 21:54   ` Dave Täht
@ 2011-02-10 18:39     ` Juliusz Chroboczek
  2011-02-11 14:50     ` Dave Täht
  1 sibling, 0 replies; 9+ messages in thread
From: Juliusz Chroboczek @ 2011-02-10 18:39 UTC
  To: Dave Täht; +Cc: bloat

> A quick google for "split tcp" shows a chain of papers on it in the 00s

Yeah, that's the time when people thought they'd need to put satellite
links into their networks.  Thankfully, the availability of cheap fiber
has made a lot of that work obsolete.

--Juliusz


* Re: [Bloat] The wireless problem in a nutshell
  2011-02-08 21:54   ` Dave Täht
  2011-02-10 18:39     ` Juliusz Chroboczek
@ 2011-02-11 14:50     ` Dave Täht
  2011-02-11 20:12       ` Stuart Cheshire
  1 sibling, 1 reply; 9+ messages in thread
From: Dave Täht @ 2011-02-11 14:50 UTC
  To: Juliusz Chroboczek; +Cc: bloat

d@taht.net (Dave Täht) writes:

> Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr> writes:
>
>>> what you are also doing is dividing the TCP streams into two pieces -
>>> the wireless piece is VERY short - and congestion control then works
>>> correctly using existing techniques on the wired and unwired portions
>>> of the connection.
>>
>> You've reinvented a useful technique, called "split-TCP".  It's commonly
>> used by the "WAN accelerators" that you can buy in order to get Microsoft
>> protocols working across the Internet.
>
> I'd been using this technique since 1996 or so, when I was working on a
> hybrid wireless broadcast system (on channel 13), with modem uplinks. At
> the time I just thought, "oh, proxies are useful to smooth and shorten
> the path", rather than thinking it any great revelation - and I have
> used it everywhere since.
>
> I'm pretty sure it existed before then, but it has special applicability
> to wireless. 

"Socks" - which appeared sometime before 1992 [1] - counts as "before
then".

I've put some thought into the history of my use of proxies over
wireless - re-analyzing something that I'd incorporated into my gut back
in 1996.

The "split TCP" smoothing effect on the last mile I'd noticed *then* was
minor; the time savings were dominated by the latencies in the modem of
100ms or so. [2]

The MosquitoNet research[3] had clearly shown the disastrous effects of
un-error-corrected wireless on TCP.

When we did the wireless router thing in 98, 12km latencies were in the
2-12ms range, and I specifically chose a newer technology - the first
almost-standardized version of 802.11 - because it did a low level of
link layer error detection/correction.

Even then, we regarded 1-3% packet loss as acceptable.

Today, with 802.11n, we're trying to shove 6x as much data over the air
in the same timeslot. Bursty packet loss is a disaster to a device that
is advertised to have 300Mbit speeds across the internet. Thus: seconds
of buffering and attempts at QoS to channel the more timely packets, and
a net result of moving users past the moon and back again on a regular basis.

AND - in the case of using a proxy, instead - 

Now that the 802.11 wireless problem is in the last 30 meters, RTT is
less than 1ms and the smoothing effect is FAR more noticeable. Or, it
would be, if the wireless device makers hadn't already added absurd
amounts of buffering, and if we had decent AQM.

It's amazing how the changes in certain constants make certain formulas
more compelling. Let's divide C by 100, shall we? What happens?

I ran across Stuart Cheshire's work 4 times while time traveling.

Ruckus wireless has got bloat[4] - in 2007 he experienced 2.3 second
wireless latencies.

There's also a more formal version of his wonderful "it's the latency,
stupid" rant, with some good analogies[5]. To quote from his conclusion:

“As long as customers think that what they want is more throughput, and
 they don't care about latency, modem makers will continue to make design
 decisions that trade off worse latency for better throughput. 

 Modems are not the only problem here. In the near future we can expect
 to see big growth in areas such as ISDN, cable tv modems, ADSL modems
 and even wireless 'modems', all offering increases in bandwidth. If we
 don't also concentrate on improved bandwidth, we're not going to get it.

 [Elided] If we don't start caring about latency, we're going to find
 ourselves in a marketplace offering nothing but appallingly useless
 hardware.”

Written in 1996. Ah, irony, before my first cup of coffee.

Things that concern me:

1) Wireless Packet aggregation is now becoming more common, leading to
bursty behavior for short packets. What sorts of packets are being
aggregated? How much buffering is in place? I worry.

2) I think I can theorize that starting 8 TCP connections in a browser,
rather than 2, also "smooths" out bursty packet loss on the wireless
last hop.

3) Wild speculation - the packet loss that in-home wireless networks
have been experiencing has been a significant factor in keeping the
internet operational.

4) I am very encouraged by the various standards for enabling web
proxies by default: dnsmasq will supply one via DHCP, browsers still
look for WPAD.

In combination with reduced buffer sizes, proxies, and AQM on the
gateway side - I am thinking it would be possible to reduce pressure on
the wireless client and AP makers for bloating buffers in the first place.

What remains is to shift the market. [6]

I imagine, if, for example, Apple would use these techniques on their
tightly integrated wireless gateways and Apple TV - that it would
provide a competitive advantage.

The same goes for other manufacturers.

-- 
Dave Taht
http://nex-6.taht.net

1 http://en.wikipedia.org/wiki/SOCKS
2 Stuart Cheshire, Mary Baker "Metricom wireless experiences" 
  http://www.stuartcheshire.org/papers/wireless.ps
3 Mosquito net project has vanished from the net.
4 2007 http://www.stuartcheshire.org/papers/Ruckus-WiFi-Evaluation.pdf
5 1996 http://www.stuartcheshire.org/papers/LatencyQuest.ps
6 there is no rule 6


* Re: [Bloat] The wireless problem in a nutshell
  2011-02-11 14:50     ` Dave Täht
@ 2011-02-11 20:12       ` Stuart Cheshire
  0 siblings, 0 replies; 9+ messages in thread
From: Stuart Cheshire @ 2011-02-11 20:12 UTC
  To: Dave Täht; +Cc: Juliusz Chroboczek, bloat

On 11 Feb 2011, at 6:50, Dave Täht wrote:

> Modems are not the only problem here. In the near future we can expect
> to see big growth in areas such as ISDN, cable tv modems, ADSL modems
> and even wireless 'modems', all offering increases in bandwidth. If we
> don't also concentrate on improved bandwidth, we're not going to get it.

A strange copy-and-paste error got introduced somehow there. The paper says:

If we don’t also concentrate on improved latency, we’re not going to get it.

Stuart Cheshire <cheshire@apple.com>
* Wizard Without Portfolio, Apple Inc.
* www.stuartcheshire.org


