> the receiver advertises a large receive window, so the sender doesn't pause
> until there is that much data outstanding, or they get a timeout of a packet as
> a signal to slow down.

> and because you have a gig-E link locally, your machine generates traffic
> very rapidly, until all that data is 'in flight'. but it's really sitting in the buffer of
> a router trying to get through.

Hmm, then I have a quandary: I can easily solve the nasty bumpy upload graphs by keeping the advertised receive window on the server capped low, but then, paradoxically, there is no longer any sign of buffer bloat in the result, at least for the upload phase.

(The graph under the upload/download graphs for my results now shows almost no latency increase during the upload phase.)

Or, I can crank it back open again, serving people with fiber connections without having to run heaps of streams in parallel -- and then have people complain that the upload result is inefficient, or bumpy, vs what they expect.

And I can't offer an option, because the server receive window (I think) cannot be set on a case by case basis. You set it for all TCP and forget it.
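
On the per-connection question: as far as I know, Linux does let a process fix the receive buffer of an individual socket with the SO_RCVBUF socket option, which also switches off receive-buffer autotuning for that socket. A minimal Python sketch, assuming the speed test server could set this on its listening socket (the helper name and the 64 KiB figure are only illustrative, not taken from the real server):

```python
import socket

def make_capped_listener(host, port, rcvbuf_bytes):
    """Create a listening TCP socket with a fixed receive buffer.

    Hypothetical helper for illustration. The option is set before
    listen() so that accepted connections inherit it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Request a fixed receive buffer; the kernel may adjust the value
    # (Linux doubles it for bookkeeping and limits it to net.core.rmem_max).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    sock.bind((host, port))
    sock.listen(16)
    return sock

listener = make_capped_listener("127.0.0.1", 0, 64 * 1024)
effective = listener.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("effective receive buffer:", effective, "bytes")
listener.close()
```

Whether this is practical for a speed test is a separate matter: the server does not know the client's upload speed when the connection is accepted, so it would still have to guess which cap to apply.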

I suspect you guys are going to say the server should be left with a large max receive window, and let people complain to find out what their issue is.

BTW my setup is wired to a Billion 7800N, which is a DSL modem and router. I believe it is a Linux-based device (judging from the system log).

cheers,
-Justin

On Tue, Apr 21, 2015 at 2:47 PM, David Lang <david@lang.hm> wrote:
On Tue, 21 Apr 2015, jb wrote:

I've discovered something; perhaps you guys can explain it better or shed
some light on it.
It isn't specifically to do with buffer bloat but it is to do with TCP
tuning.

Attached are two pictures of my upload to the New York speed test server with 1
stream.
It doesn't make any difference if it is 1 stream or 8 streams, the picture
and behaviour remain the same.
I am 200ms from New York so it qualifies as a fairly long (but not very
fat) pipe.

The nice smooth one is with Linux tcp_rmem set to '4096 32768 65535' (on
the server)
The ugly bumpy one is with Linux tcp_rmem set to '4096 65535 67108864' (on
the server)

It actually doesn't matter what that last huge number is; once it goes much
above 65k, e.g. 128k or 256k or beyond, things get bumpy and ugly on the
upload speed.

Now as I understand this setting, it is the tcp receive window that Linux
advertises, and the last number sets the maximum size it can get to (for
one TCP stream).

For users with very fast upload speeds, they do not see an ugly bumpy
upload graph, it is smooth and sustained.
But for the majority of users (like me) with uploads less than 5 to 10 Mbit/s,
we frequently see the ugly graph.

The second tcp_rmem setting is how I have been running the speed test
servers.

Up to now I thought this was just the distance of the speedtest from the
interface: perhaps the browser was buffering a lot and didn't feed back
progress. But now I realise the bumpy one is actually being influenced by
the server receive window.

I guess my question is this: Why does ALLOWING a large receive window
appear to encourage problems with upload smoothness??

This implies that setting the receive window should be done on a connection
by connection basis: small for slow connections, large for high-speed,
long-distance connections.

This is classic bufferbloat

the receiver advertises a large receive window, so the sender doesn't pause until there is that much data outstanding, or they get a timeout of a packet as a signal to slow down.

and because you have a gig-E link locally, your machine generates traffic very rapidly, until all that data is 'in flight'. but it's really sitting in the buffer of a router trying to get through.

then when a packet times out, the sender slows down a smidge and retransmits it. But the old packet is still sitting in a queue, eating bandwidth. the packets behind it are also going to timeout and be retransmitted before your first retransmitted packet gets through, so you have a large slug of data that's being retransmitted, and the first of the replacement data can't get through until the last of the old (timed out) data is transmitted.

then when data starts flowing again, the sender again tries to fill up the window with data in flight.
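
To put rough numbers on this, here is a small Python sketch of how much delay a full window adds once it exceeds what the path can hold. It assumes a simplified model (the sender fills the whole window, the router buffer is big enough to hold the excess, and the queue drains at the bottleneck rate) and a hypothetical 1 Mbit/s upload with the 200 ms round trip mentioned earlier in the thread:

```python
def queue_delay_seconds(window_bytes, bottleneck_bps, rtt_seconds):
    """Rough delay added by data that cannot fit 'on the wire'.

    Simplified model: anything in flight beyond the bandwidth-delay
    product waits in the bottleneck buffer and drains at the link rate.
    """
    bdp_bytes = bottleneck_bps * rtt_seconds / 8
    queued_bytes = max(0.0, window_bytes - bdp_bytes)
    return queued_bytes * 8 / bottleneck_bps

# Assumed example values: 1 Mbit/s uplink, 200 ms round-trip time.
for window in (65535, 256 * 1024, 1024 * 1024):
    delay = queue_delay_seconds(window, 1_000_000, 0.2)
    print(f"{window:>8} byte window -> about {delay:.2f} s of queueing")
```

Under these assumptions a 64k window adds a few hundred milliseconds, while a 256k window already adds close to two seconds, which is well into the range where packets get retransmitted while the originals are still queued.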

In addition, if I cap it to 65k, for reasons of smoothness,
that means the bandwidth delay product will keep maximum speed per upload
stream quite low. So a symmetric or gigabit connection is going to need a
ton of parallel streams to see full speed.
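
The arithmetic behind that concern can be sketched in a few lines of Python. It uses the usual bound that a single TCP stream can move at most one window per round trip, with the 65535 byte cap and 200 ms RTT from this thread; the function names are just for illustration:

```python
import math

def max_stream_bps(window_bytes, rtt_seconds):
    """Upper bound on one TCP stream: at most one window per round trip."""
    return window_bytes * 8 / rtt_seconds

def streams_needed(target_bps, window_bytes, rtt_seconds):
    """Parallel streams required to reach target_bps under that bound."""
    return math.ceil(target_bps / max_stream_bps(window_bytes, rtt_seconds))

per_stream = max_stream_bps(65535, 0.2)
print(f"one stream: about {per_stream / 1e6:.2f} Mbit/s")
print("streams for 1 Gbit/s:", streams_needed(1e9, 65535, 0.2))
```

At 200 ms this works out to roughly 2.6 Mbit/s per stream, so a gigabit client at that distance would need several hundred streams. Clients closer to the server need proportionally fewer, since the bound scales with 1/RTT.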

Most puzzling is why would anything special be required on the Client -->
Server side of the equation
but nothing much appears wrong with the Server --> Client side, whether
speeds are very low (GPRS) or very high (gigabit).

but what window sizes are these clients advertising?


Note that also I am not yet sure if smoothness == better throughput. I have
noticed upload speeds for some people often being under their claimed sync
rate by 10 or 20% but I've no logs that show the bumpy graph is showing
inefficiency. Maybe.

If you were to do a packet capture on the server side, you would see that you have a bunch of packets that are arriving multiple times, but the first time "doesn't count" because the replacement is already on the way.

so your overall throughput is lower for two reasons

1. it's bursty, and there are times when the connection actually is idle (after you have a lot of timed out packets, the sender needs to ramp up its speed again)

2. you are sending some packets multiple times, consuming more total bandwidth for the same 'goodput' (effective throughput)

David Lang


help!


On Tue, Apr 21, 2015 at 12:56 PM, Simon Barber <simon@superduper.net> wrote:

One thing users understand is slow web access.  Perhaps translating the
latency measurement into 'a typical web page will take X seconds longer to
load', or even stating the impact as 'this latency causes a typical web
page to load slower, as if your connection was only YY% of the measured
speed.'

Simon




On April 19, 2015 1:54:19 PM Jonathan Morton <chromatix99@gmail.com>
wrote:

>>>> Frequency readouts are probably more accessible to the latter.

    The frequency domain more accessible to laypersons? I have my
doubts ;)

Gamers, at least, are familiar with “frames per second” and how that
corresponds to their monitor’s refresh rate.

      I am sure they can easily transform back into the time domain to get
the frame period ;) .  I am partly kidding, I think your idea is great in
that it is a truly positive value which could lend itself to being used in
ISP/router manufacturer advertising, and hence might work in the real world;
on the other hand I like to keep data as “raw” as possible (not that ^(-1)
is a transformation worthy of being called data massage).

The desirable range of latencies, when converted to Hz, happens to be
roughly the same as the range of desirable frame rates.

      Just to play devil's advocate, the interesting part is time or
saving time, so seconds or milliseconds are also intuitively understandable
and can be easily added ;)

Such readouts are certainly interesting to people like us.  I have no
objection to them being reported alongside a frequency readout.  But I
think most people are not interested in “time savings” measured in
milliseconds; they’re much more aware of the minute- and hour-level time
savings associated with greater bandwidth.

 - Jonathan Morton

_______________________________________________
Bloat mailing list
Bloat@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/bloat


