* [Bloat] Network computing article on bloat @ 2011-04-26 17:05 Dave Taht 2011-04-26 18:13 ` Dave Hart 0 siblings, 1 reply; 66+ messages in thread From: Dave Taht @ 2011-04-26 17:05 UTC (permalink / raw) To: bloat Not bad, although I can live without the title. Coins a new-ish phrase "insertion latency" http://www.networkcomputing.com/end-to-end-apm/bufferbloat-and-the-collapse-of-the-internet.php -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://the-edge.blogspot.com ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Network computing article on bloat 2011-04-26 17:05 [Bloat] Network computing article on bloat Dave Taht @ 2011-04-26 18:13 ` Dave Hart 2011-04-26 18:17 ` Dave Taht 0 siblings, 1 reply; 66+ messages in thread From: Dave Hart @ 2011-04-26 18:13 UTC (permalink / raw) To: Dave Taht; +Cc: bloat On Tue, Apr 26, 2011 at 17:05 UTC, Dave Taht <dave.taht@gmail.com> wrote: > Not bad, although I can live without the title. Coins a new-ish phrase > "insertion latency" > > http://www.networkcomputing.com/end-to-end-apm/bufferbloat-and-the-collapse-of-the-internet.php The piece ends with a paragraph claiming preventing packet loss is addressing a more fundamental problem which contributes to bufferbloat. As long as the writer and readers believe packet loss is an unmitigated evil, the battle is lost. More encouraging would have been a statement that packet loss is preferable to excessive queueing and a required TCP feedback signal when ECN isn't in play. Cheers, Dave Hart ^ permalink raw reply [flat|nested] 66+ messages in thread
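Dave Hart's point that loss (or an ECN mark) is the required feedback signal can be made concrete with a toy AIMD loop. This is purely a sketch: the path numbers and the "signal every N RTTs" schedule below are invented, and it is not any particular stack's algorithm. The only event that ever shrinks the window is a drop or a mark, and whatever the sender keeps in flight beyond the path's bandwidth-delay product is exactly the standing queue that bufferbloat is about.

    MSS = 1500                       # bytes per segment (assumed)
    RTT = 0.050                      # 50 ms path (assumed)
    RATE = 10e6                      # 10 Mbit/s bottleneck (assumed)
    BDP = RATE * RTT / 8             # bytes the path itself can hold in flight

    def final_state(rtts, signal_every=None):
        """Run a toy AIMD sender; return (cwnd, standing queue) in bytes."""
        cwnd = 2 * MSS
        for i in range(1, rtts + 1):
            if signal_every and i % signal_every == 0:
                cwnd = max(2 * MSS, cwnd / 2)   # halve on a drop or ECN mark
            else:
                cwnd += MSS                     # additive increase per RTT
        return cwnd, max(0.0, cwnd - BDP)       # the excess sits in the buffer

    print("congestion signal every 20 RTTs:", final_state(200, 20))
    print("no congestion signal at all:    ", final_state(200))

With no signal at all the window only grows, and everything above the BDP ends up queued at the bottleneck.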
* Re: [Bloat] Network computing article on bloat 2011-04-26 18:13 ` Dave Hart @ 2011-04-26 18:17 ` Dave Taht 2011-04-26 18:28 ` dave greenfield 2011-04-26 18:32 ` Wesley Eddy 0 siblings, 2 replies; 66+ messages in thread From: Dave Taht @ 2011-04-26 18:17 UTC (permalink / raw) To: bloat; +Cc: dave greenfield "Big Buffers Bad. Small Buffers Good." "*Some* packet loss is essential for the correct operation of the Internet" are two of the memes I try to propagate, in their simplicity. Even then there are so many qualifiers to both of those that the core message gets lost. On Tue, Apr 26, 2011 at 12:13 PM, Dave Hart <davehart@gmail.com> wrote: > On Tue, Apr 26, 2011 at 17:05 UTC, Dave Taht <dave.taht@gmail.com> wrote: >> Not bad, although I can live without the title. Coins a new-ish phrase >> "insertion latency" >> >> http://www.networkcomputing.com/end-to-end-apm/bufferbloat-and-the-collapse-of-the-internet.php > > The piece ends with a paragraph claiming preventing packet loss is > addressing a more fundamental problem which contributes to > bufferbloat. As long as the writer and readers believe packet loss is > an unmitigated evil, the battle is lost. More encouraging would have > been a statement that packet loss is preferable to excessive queueing > and a required TCP feedback signal when ECN isn't in play. > > Cheers, > Dave Hart > -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://the-edge.blogspot.com ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Network computing article on bloat 2011-04-26 18:17 ` Dave Taht @ 2011-04-26 18:28 ` dave greenfield 2011-04-26 18:32 ` Wesley Eddy 1 sibling, 0 replies; 66+ messages in thread From: dave greenfield @ 2011-04-26 18:28 UTC (permalink / raw) To: Dave Taht; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 2274 bytes --] Thanks, Dave. I'm actually NOT under the impression that packet loss is the dark lord incarnate. Yes, I too would have preferred a different title, but editors have the last say sometimes. Oh, and insertion latency or insertion loss isn't all that new. I've seen it used in switch and device design for several years. Call it what you will, but it's important that IT understands the amount of latency introduced by a given device into the data path. This isn't always widely discussed in WAN opt circles..... Dave PS Can we please have someone else jump in here who's name is NOT Dave! On Tue, Apr 26, 2011 at 9:17 PM, Dave Taht <dave.taht@gmail.com> wrote: > "Big Buffers Bad. Small Buffers Good." > > "*Some* packet loss is essential for the correct operation of the Internet" > > are two of the memes I try to propagate, in their simplicity. Even > then there are so many qualifiers to both of those that the core > message gets lost. > > > > On Tue, Apr 26, 2011 at 12:13 PM, Dave Hart <davehart@gmail.com> wrote: > > On Tue, Apr 26, 2011 at 17:05 UTC, Dave Taht <dave.taht@gmail.com> > wrote: > >> Not bad, although I can live without the title. Coins a new-ish phrase > >> "insertion latency" > >> > >> > http://www.networkcomputing.com/end-to-end-apm/bufferbloat-and-the-collapse-of-the-internet.php > > > > The piece ends with a paragraph claiming preventing packet loss is > > addressing a more fundamental problem which contributes to > > bufferbloat. As long as the writer and readers believe packet loss is > > an unmitigated evil, the battle is lost. More encouraging would have > > been a statement that packet loss is preferable to excessive queueing > > and a required TCP feedback signal when ECN isn't in play. > > > > Cheers, > > Dave Hart > > > > > > -- > Dave Täht > SKYPE: davetaht > US Tel: 1-239-829-5608 > http://the-edge.blogspot.com > -- --- Dave Greenfield Principal Strategic Technology Analytics Research. Analysis. Insight <dave@stanalytics.com>dave@stanalytics.com | 1-908-206-4114 Netmagdave | @Netmagdave Blogs: ZDNet <http://www.blogs.zdnet.com/greenfield> | Information Week<http://www.networkcomputing.com/author_profile.php?name=dgreenfield&page_no=1> [-- Attachment #2: Type: text/html, Size: 3879 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Network computing article on bloat 2011-04-26 18:17 ` Dave Taht 2011-04-26 18:28 ` dave greenfield @ 2011-04-26 18:32 ` Wesley Eddy 2011-04-26 19:37 ` Dave Taht ` (3 more replies) 1 sibling, 4 replies; 66+ messages in thread From: Wesley Eddy @ 2011-04-26 18:32 UTC (permalink / raw) To: bloat On 4/26/2011 2:17 PM, Dave Taht wrote: > "Big Buffers Bad. Small Buffers Good." > > "*Some* packet loss is essential for the correct operation of the Internet" > > are two of the memes I try to propagate, in their simplicity. Even > then there are so many qualifiers to both of those that the core > message gets lost. The second one is actually backwards; it should be "the Internet can operate correctly with some packet loss". -- Wes Eddy MTI Systems ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Network computing article on bloat 2011-04-26 18:32 ` Wesley Eddy @ 2011-04-26 19:37 ` Dave Taht 2011-04-26 20:21 ` Wesley Eddy 2011-04-27 7:43 ` Jonathan Morton ` (2 subsequent siblings) 3 siblings, 1 reply; 66+ messages in thread From: Dave Taht @ 2011-04-26 19:37 UTC (permalink / raw) To: Wesley Eddy; +Cc: bloat On Tue, Apr 26, 2011 at 12:32 PM, Wesley Eddy <wes@mti-systems.com> wrote: > On 4/26/2011 2:17 PM, Dave Taht wrote: >> >> "Big Buffers Bad. Small Buffers Good." >> >> "*Some* packet loss is essential for the correct operation of the >> Internet" >> >> are two of the memes I try to propagate, in their simplicity. Even >> then there are so many qualifiers to both of those that the core >> message gets lost. > > > The second one is actually backwards; it should be "the Internet can > operate correctly with some packet loss". > INCORRECT. See? We can't win, even amongst ourselves. The Internet *cannot operate correctly without packet loss*. RFC970, http://www.faqs.org/rfcs/rfc970.html > -- > Wes Eddy > MTI Systems > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat > -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://the-edge.blogspot.com ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Network computing article on bloat 2011-04-26 19:37 ` Dave Taht @ 2011-04-26 20:21 ` Wesley Eddy 2011-04-26 20:30 ` Constantine Dovrolis 2011-04-27 17:10 ` Bill Sommerfeld 0 siblings, 2 replies; 66+ messages in thread From: Wesley Eddy @ 2011-04-26 20:21 UTC (permalink / raw) To: Dave Taht; +Cc: bloat On 4/26/2011 3:37 PM, Dave Taht wrote: > On Tue, Apr 26, 2011 at 12:32 PM, Wesley Eddy<wes@mti-systems.com> wrote: >> On 4/26/2011 2:17 PM, Dave Taht wrote: >>> >>> "Big Buffers Bad. Small Buffers Good." >>> >>> "*Some* packet loss is essential for the correct operation of the >>> Internet" >>> >>> are two of the memes I try to propagate, in their simplicity. Even >>> then there are so many qualifiers to both of those that the core >>> message gets lost. >> >> >> The second one is actually backwards; it should be "the Internet can >> operate correctly with some packet loss". >> > INCORRECT. > > See? We can't win, even amongst ourselves. > > The Internet *cannot operate correctly without packet loss*. > > RFC970, http://www.faqs.org/rfcs/rfc970.html > Operating with infinite storage and operating without packet loss are two different things. Ideally, you may have a path with ample bandwidth such that packet losses don't occur and all connections are either application limited or receive window limited and congestion control never kicks in. In this case, there's no loss and the Internet clearly works. -- Wes Eddy MTI Systems ^ permalink raw reply [flat|nested] 66+ messages in thread
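A quick back-of-the-envelope sketch of Wes's receive-window-limited case (the window, RTT and link speed below are made-up numbers): if rwnd/RTT is below the bottleneck rate, the flow can never fill the link, the queue stays empty, and no loss is needed for things to work.

    def window_limited_rate(rwnd_bytes, rtt_s):
        return rwnd_bytes * 8 / rtt_s            # throughput ceiling, bits/s

    rwnd = 64 * 1024        # an unscaled 64 KB receive window
    rtt = 0.050             # 50 ms path
    bottleneck = 100e6      # 100 Mbit/s link

    ceiling = window_limited_rate(rwnd, rtt)
    print("window-limited ceiling: %.1f Mbit/s" % (ceiling / 1e6))  # ~10.5 Mbit/s
    print("able to congest the link?", ceiling > bottleneck)        # False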
* Re: [Bloat] Network computing article on bloat 2011-04-26 20:21 ` Wesley Eddy @ 2011-04-26 20:30 ` Constantine Dovrolis 2011-04-26 21:16 ` Dave Taht 2011-04-27 17:10 ` Bill Sommerfeld 1 sibling, 1 reply; 66+ messages in thread From: Constantine Dovrolis @ 2011-04-26 20:30 UTC (permalink / raw) To: bloat Thanks Wes - I was hoping that someone will make this point. btw, another common reason for lossless operation is the size of the flows. basically flows often finish before their window increases so much that they overflow their bottleneck's buffer. Plz spend some time to read the following paper: http://www.cc.gatech.edu/fac/Constantinos.Dovrolis/Papers/buffers-ton.pdf It is very relevant to the bufferbloat initiative and it shows clearly, I think, that statements like "Big Buffers Bad. Small Buffers Good." are crude oversimplifications that will cause even more confusion. regards Constantine On 4/26/2011 4:21 PM, Wesley Eddy wrote: > > > Operating with infinite storage and operating without packet loss are > two different things. > > Ideally, you may have a path with ample bandwidth such that packet > losses don't occur and all connections are either application limited or > receive window limitedand congestion control never kicks in. In this > case, there's no loss and the Internet clearly works. > -------------------------------------------------------------- Constantine Dovrolis, Associate Professor College of Computing, Georgia Institute of Technology 3346 KACB, 404-385-4205, dovrolis@cc.gatech.edu http://www.cc.gatech.edu/~dovrolis/ ^ permalink raw reply [flat|nested] 66+ messages in thread
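Constantine's flow-size point can be made concrete with a rough slow-start calculation (all parameters are assumptions, and real stacks differ in detail): the window doubles each RTT, so a flow has to carry at least this many bytes before its window can exceed BDP plus buffer and force a drop; shorter flows complete without ever overflowing the bottleneck.

    def bytes_before_first_drop(bdp_bytes, buffer_bytes, iw_bytes=3 * 1500):
        """Bytes a slow-starting flow delivers before its window can overflow."""
        limit = bdp_bytes + buffer_bytes
        cwnd, sent = iw_bytes, 0
        while cwnd <= limit:
            sent += cwnd
            cwnd *= 2               # slow-start doubling per RTT
        return sent

    bdp = int(10e6 * 0.050 / 8)     # 10 Mbit/s, 50 ms path -> 62500 bytes
    for buf in (16000, 64000, 256000):
        kb = bytes_before_first_drop(bdp, buf) // 1000
        print("buffer %6d B: flow must exceed ~%d kB to see a drop" % (buf, kb))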
* Re: [Bloat] Network computing article on bloat 2011-04-26 20:30 ` Constantine Dovrolis @ 2011-04-26 21:16 ` Dave Taht 0 siblings, 0 replies; 66+ messages in thread From: Dave Taht @ 2011-04-26 21:16 UTC (permalink / raw) To: Constantine Dovrolis; +Cc: bloat On Tue, Apr 26, 2011 at 2:30 PM, Constantine Dovrolis <dovrolis@cc.gatech.edu> wrote: > Thanks Wes - I was hoping that someone will make this point. > > btw, another common reason for lossless operation is the > size of the flows. basically flows often finish before their > window increases so much that they overflow their bottleneck's > buffer. We do tend to overuse TCP for short flows, like those of the core http protocol without 1.1 pipelining. However more uptake of 1.1's pipelining would lead to more correct and timely behavior in the presence of congestion, and longer flows in the general case. That said in an age of netflix and facetime, we have problems with big flows again. > > Plz spend some time to read the following paper: > http://www.cc.gatech.edu/fac/Constantinos.Dovrolis/Papers/buffers-ton.pdf The paper above appears to be testing against networks in the USA, and at speeds higher than 1Mbit. Did you try working internationally, at speeds closer to 128Kbit? > It is very relevant to the bufferbloat initiative and it shows clearly, > I think, that statements like "Big Buffers Bad. Small Buffers Good." > are crude oversimplifications that will cause even more confusion. I think they are crude simplifications, but they lead to slightly more correct conclusions in the general case than the alternatives. I would love to have a short elevator pitch that nailed the problem adequately. > > regards > > Constantine > > On 4/26/2011 4:21 PM, Wesley Eddy wrote: >> >> >> Operating with infinite storage and operating without packet loss are >> two different things. >> >> Ideally, you may have a path with ample bandwidth such that packet >> losses don't occur and all connections are either application limited or >> receive window limitedand congestion control never kicks in. In this >> case, there's no loss and the Internet clearly works. >> > > -------------------------------------------------------------- > Constantine Dovrolis, Associate Professor > College of Computing, Georgia Institute of Technology > 3346 KACB, 404-385-4205, dovrolis@cc.gatech.edu > http://www.cc.gatech.edu/~dovrolis/ > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat > -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://the-edge.blogspot.com ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Network computing article on bloat 2011-04-26 20:21 ` Wesley Eddy 2011-04-26 20:30 ` Constantine Dovrolis @ 2011-04-27 17:10 ` Bill Sommerfeld 2011-04-27 17:40 ` Wesley Eddy 1 sibling, 1 reply; 66+ messages in thread From: Bill Sommerfeld @ 2011-04-27 17:10 UTC (permalink / raw) To: Wesley Eddy; +Cc: bloat On Tue, Apr 26, 2011 at 13:21, Wesley Eddy <wes@mti-systems.com> wrote: > Ideally, you may have a path with ample bandwidth such that packet > losses don't occur and all connections are either application limited or > receive window limited and congestion control never kicks in. In this > case, there's no loss and the Internet clearly works. This situation is not really "ideal" because it indicates an unbalanced system -- you've probably spent too much on link bandwidth and not enough on end system performance. ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Network computing article on bloat 2011-04-27 17:10 ` Bill Sommerfeld @ 2011-04-27 17:40 ` Wesley Eddy 0 siblings, 0 replies; 66+ messages in thread From: Wesley Eddy @ 2011-04-27 17:40 UTC (permalink / raw) To: Bill Sommerfeld; +Cc: bloat On 4/27/2011 1:10 PM, Bill Sommerfeld wrote: > On Tue, Apr 26, 2011 at 13:21, Wesley Eddy<wes@mti-systems.com> wrote: >> Ideally, you may have a path with ample bandwidth such that packet >> losses don't occur and all connections are either application limited or >> receive window limited and congestion control never kicks in. In this >> case, there's no loss and the Internet clearly works. > > This situation is not really "ideal" because it indicates an > unbalanced system -- you've probably spent too much on link bandwidth > and not enough on end system performance. > > Many applications are inherently limited in max rate; e.g. VoIP and video streams with fixed-rate codecs, telemetry, etc. The elevator pitch should be that optimizing for low loss is harmful and needs to be balanced with optimizing latency. It should not be saying that loss is required. -- Wes Eddy MTI Systems ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Network computing article on bloat 2011-04-26 18:32 ` Wesley Eddy 2011-04-26 19:37 ` Dave Taht @ 2011-04-27 7:43 ` Jonathan Morton 2011-04-30 15:56 ` Henrique de Moraes Holschuh 2011-04-30 19:18 ` [Bloat] Goodput fraction w/ AQM vs bufferbloat Richard Scheffenegger 3 siblings, 0 replies; 66+ messages in thread From: Jonathan Morton @ 2011-04-27 7:43 UTC (permalink / raw) To: Wesley Eddy; +Cc: bloat On 26 Apr, 2011, at 9:32 pm, Wesley Eddy wrote: > On 4/26/2011 2:17 PM, Dave Taht wrote: >> "Big Buffers Bad. Small Buffers Good." >> >> "*Some* packet loss is essential for the correct operation of the Internet" >> >> are two of the memes I try to propagate, in their simplicity. Even >> then there are so many qualifiers to both of those that the core >> message gets lost. > > The second one is actually backwards; it should be "the Internet can > operate correctly with some packet loss". I would say, more accurately, that the *potential* for packet loss is necessary for correct Internet operation. This is the same as saying that the potential for bringing individual trains to an unscheduled halt is necessary to allow railways to operate safely. If one train is delayed, another train has to wait for it to clear the junction to avoid a collision. If the brakes fail, they are designed to bring the train to an immediate halt rather than face the possibility of not coming to a halt when later required to. If the signals fail, they automatically show Danger. When congestion control fails, packet loss is inevitable. Bigger buffers - the traditional "solution" to packet loss - only delay that fact slightly, and not even for very long. - Jonathan Morton ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Network computing article on bloat 2011-04-26 18:32 ` Wesley Eddy 2011-04-26 19:37 ` Dave Taht 2011-04-27 7:43 ` Jonathan Morton @ 2011-04-30 15:56 ` Henrique de Moraes Holschuh 2011-04-30 19:18 ` [Bloat] Goodput fraction w/ AQM vs bufferbloat Richard Scheffenegger 3 siblings, 0 replies; 66+ messages in thread From: Henrique de Moraes Holschuh @ 2011-04-30 15:56 UTC (permalink / raw) To: Wesley Eddy; +Cc: bloat On Tue, 26 Apr 2011, Wesley Eddy wrote: > On 4/26/2011 2:17 PM, Dave Taht wrote: > >"Big Buffers Bad. Small Buffers Good." > > > >"*Some* packet loss is essential for the correct operation of the Internet" > > > >are two of the memes I try to propagate, in their simplicity. Even > >then there are so many qualifiers to both of those that the core > >message gets lost. > > > The second one is actually backwards; it should be "the Internet can > operate correctly with some packet loss". Right now in the real world, it CANNOT operate correctly WITHOUT the use of aggressive packet loss to throttle back flows, or the queues just fill up to the brink, and then you start dropping all packets anyway. IMO, Dave's wording gets that point across a lot better. -- "One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie." -- The Silicon Valley Tarot Henrique Holschuh ^ permalink raw reply [flat|nested] 66+ messages in thread
* [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-04-26 18:32 ` Wesley Eddy ` (2 preceding siblings ...) 2011-04-30 15:56 ` Henrique de Moraes Holschuh @ 2011-04-30 19:18 ` Richard Scheffenegger 2011-05-05 16:01 ` Jim Gettys 3 siblings, 1 reply; 66+ messages in thread From: Richard Scheffenegger @ 2011-04-30 19:18 UTC (permalink / raw) To: bloat I'm curious, has anyone done some simulations to check whether the following qualitative statement holds true, and if so, what the quantitative effect is: With bufferbloat, the TCP congestion control reaction is unduly delayed. When it finally happens, the TCP stream is likely facing a "burst loss" event - multiple consecutive packets get dropped. Worse yet, the sender with the lowest RTT across the bottleneck will likely start to retransmit while the (tail-drop) queue is still overflowing. And a lost retransmission means a major setback in bandwidth (except for Linux with bulk transfers and SACK enabled), as the standard (RFC documented) behaviour asks for an RTO (1 sec nominally, 200-500 ms typically) to recover such a lost retransmission... The second part (more important as an incentive to the ISPs, actually): how does the fraction of goodput vs. throughput change when AQM schemes are deployed and TCP CC reacts in a timely manner? Small ISPs have to pay for their upstream volume, regardless of whether that is "real" work (goodput) or unnecessary retransmissions. When I was at a small cable ISP in Switzerland last week, sure enough bufferbloat was readily observable (17ms -> 220ms after 30 sec of a bulk transfer), but at first they had the "not our problem" view, until I started discussing burst loss / retransmissions / goodput vs throughput - with the last point being a real commercial incentive to them. (They promised to check if AQM would be available in the CPE / CMTS, and put latency bounds in their tenders going forward). Best regards, Richard ^ permalink raw reply [flat|nested] 66+ messages in thread
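One way to put numbers on the goodput-vs-throughput half of Richard's question is to measure the retransmitted-byte fraction straight from a capture. The sketch below assumes scapy is installed and uses a deliberately crude heuristic (any data segment whose starting sequence number was already seen for that flow counts as a retransmission); the capture filename is a placeholder.

    from collections import defaultdict
    from scapy.all import rdpcap, IP, TCP

    def goodput_fraction(pcap_file):
        seen = defaultdict(set)      # flow 4-tuple -> starting seqs already seen
        total = retrans = 0
        for pkt in rdpcap(pcap_file):
            if not (IP in pkt and TCP in pkt):
                continue
            payload = len(pkt[TCP].payload)
            if payload == 0:
                continue             # ignore pure ACKs
            flow = (pkt[IP].src, pkt[TCP].sport, pkt[IP].dst, pkt[TCP].dport)
            total += payload
            if pkt[TCP].seq in seen[flow]:
                retrans += payload   # same starting seq seen before: a retransmission
            seen[flow].add(pkt[TCP].seq)
        return 1.0 - float(retrans) / total if total else float('nan')

    print("goodput fraction: %.3f" % goodput_fraction("capture.pcap"))

The gap between that fraction and 1.0 is the volume an ISP hauls twice; comparing the figure with and without AQM at the bottleneck would speak directly to the commercial argument Richard is making.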
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-04-30 19:18 ` [Bloat] Goodput fraction w/ AQM vs bufferbloat Richard Scheffenegger @ 2011-05-05 16:01 ` Jim Gettys 2011-05-05 16:10 ` Stephen Hemminger 2011-05-06 4:18 ` [Bloat] Goodput fraction w/ AQM vs bufferbloat Fred Baker 0 siblings, 2 replies; 66+ messages in thread From: Jim Gettys @ 2011-05-05 16:01 UTC (permalink / raw) To: bloat On 04/30/2011 03:18 PM, Richard Scheffenegger wrote: > I'm curious, has anyone done some simulations to check if the > following qualitative statement holds true, and if, what the > quantitative effect is: > > With bufferbloat, the TCP congestion control reaction is unduely > delayed. When it finally happens, the tcp stream is likely facing a > "burst loss" event - multiple consecutive packets get dropped. Worse > yet, the sender with the lowest RTT across the bottleneck will likely > start to retransmit while the (tail-drop) queue is still overflowing. > > And a lost retransmission means a major setback in bandwidth (except > for Linux with bulk transfers and SACK enabled), as the standard (RFC > documented) behaviour asks for a RTO (1sec nominally, 200-500 ms > typically) to recover such a lost retransmission... > > The second part (more important as an incentive to the ISPs actually), > how does the fraction of goodput vs. throughput change, when AQM > schemes are deployed, and TCP CC reacts in a timely manner? Small ISPs > have to pay for their upstream volume, regardless if that is "real" > work (goodput) or unneccessary retransmissions. > > When I was at a small cable ISP in switzerland last week, surely > enough bufferbloat was readily observable (17ms -> 220ms after 30 sec > of a bulk transfer), but at first they had the "not our problem" view, > until I started discussing burst loss / retransmissions / goodput vs > throughput - with the latest point being a real commercial incentive > to them. (They promised to check if AQM would be available in the CPE > / CMTS, and put latency bounds in their tenders going forward). > I wish I had a good answer to your very good questions. Simulation would be interesting though real daa is more convincing. I haven't looked in detail at all that many traces to try to get a feel for how much bandwidth waste there actually is, and more formal studies like Netalyzr, SamKnows, or the Bismark project would be needed to quantify the loss on the network as a whole. I did spend some time last fall with the traces I've taken. In those, I've typically been seeing 1-3% packet loss in the main TCP transfers. On the wireless trace I took, I saw 9% loss, but whether that is bufferbloat induced loss or not, I don't know (the data is out there for those who might want to dig). And as you note, the losses are concentrated in bursts (probably due to the details of Cubic, so I'm told). I've had anecdotal reports (and some first hand experience) with much higher loss rates, for example from Nick Weaver at ICSI; but I believe in playing things conservatively with any numbers I quote and I've not gotten consistent results when I've tried, so I just report what's in the packet captures I did take. 
A phenomenon that could be occurring is that during congestion avoidance (until TCP loses its cookies entirely and probes for a higher operating point) TCP is carefully timing its packets to keep the buffers almost exactly full, so that competing flows (in my case, simple pings) are likely to arrive just when there is no buffer space to accept them, and therefore you see higher losses on them than you would on the single flow I've been tracing and getting loss statistics from. People who want to look into this further would be a great help. - Jim ^ permalink raw reply [flat|nested] 66+ messages in thread
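Jim's hypothesis is easy to poke at with a toy tail-drop simulation (every parameter below is invented, and this is a caricature rather than a model of Cubic): the bulk flow is ack-clocked, so its packets tend to arrive just after a departure has opened a slot, while unsynchronised probes arrive at arbitrary instants and mostly find the buffer still full.

    import random
    random.seed(1)

    B = 100                                    # buffer size in packets
    q = B                                      # bulk flow has already filled the buffer
    pending = 0                                # ack-clocked packets running a slot late
    stats = {"bulk": [0, 0], "probe": [0, 0]}  # class -> [sent, dropped]

    def arrive(cls):
        global q
        stats[cls][0] += 1
        if q < B:
            q += 1
        else:
            stats[cls][1] += 1                 # tail drop

    for t in range(200000):
        if q > 0:
            q -= 1                             # bottleneck drains one packet per slot
        while pending:                         # late bulk packets arrive first
            pending -= 1
            arrive("bulk")
        new = 1 + (1 if t % 50 == 0 else 0)    # ack-clocked refill + occasional window growth
        if random.random() < 0.9:              # usually the refill arrives right away...
            for _ in range(new):
                arrive("bulk")
        else:
            pending += new                     # ...sometimes it is a slot late
        if random.random() < 0.01:             # sparse, unsynchronised probe
            arrive("probe")

    for cls, (sent, dropped) in stats.items():
        print("%-5s loss: %5.1f%% of %d packets" % (cls, 100.0 * dropped / sent, sent))

In this toy the probes see a much higher drop rate than the bulk flow even though both face the same queue, which is the asymmetry Jim describes.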
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-05 16:01 ` Jim Gettys @ 2011-05-05 16:10 ` Stephen Hemminger 2011-05-05 16:30 ` Jim Gettys 2011-05-05 16:49 ` [Bloat] Burst Loss Neil Davies 2011-05-06 4:18 ` [Bloat] Goodput fraction w/ AQM vs bufferbloat Fred Baker 1 sibling, 2 replies; 66+ messages in thread From: Stephen Hemminger @ 2011-05-05 16:10 UTC (permalink / raw) To: Jim Gettys; +Cc: bloat On Thu, 05 May 2011 12:01:22 -0400 Jim Gettys <jg@freedesktop.org> wrote: > On 04/30/2011 03:18 PM, Richard Scheffenegger wrote: > > I'm curious, has anyone done some simulations to check if the > > following qualitative statement holds true, and if, what the > > quantitative effect is: > > > > With bufferbloat, the TCP congestion control reaction is unduely > > delayed. When it finally happens, the tcp stream is likely facing a > > "burst loss" event - multiple consecutive packets get dropped. Worse > > yet, the sender with the lowest RTT across the bottleneck will likely > > start to retransmit while the (tail-drop) queue is still overflowing. > > > > And a lost retransmission means a major setback in bandwidth (except > > for Linux with bulk transfers and SACK enabled), as the standard (RFC > > documented) behaviour asks for a RTO (1sec nominally, 200-500 ms > > typically) to recover such a lost retransmission... > > > > The second part (more important as an incentive to the ISPs actually), > > how does the fraction of goodput vs. throughput change, when AQM > > schemes are deployed, and TCP CC reacts in a timely manner? Small ISPs > > have to pay for their upstream volume, regardless if that is "real" > > work (goodput) or unneccessary retransmissions. > > > > When I was at a small cable ISP in switzerland last week, surely > > enough bufferbloat was readily observable (17ms -> 220ms after 30 sec > > of a bulk transfer), but at first they had the "not our problem" view, > > until I started discussing burst loss / retransmissions / goodput vs > > throughput - with the latest point being a real commercial incentive > > to them. (They promised to check if AQM would be available in the CPE > > / CMTS, and put latency bounds in their tenders going forward). > > > I wish I had a good answer to your very good questions. Simulation > would be interesting though real daa is more convincing. > > I haven't looked in detail at all that many traces to try to get a feel > for how much bandwidth waste there actually is, and more formal studies > like Netalyzr, SamKnows, or the Bismark project would be needed to > quantify the loss on the network as a whole. > > I did spend some time last fall with the traces I've taken. In those, > I've typically been seeing 1-3% packet loss in the main TCP transfers. > On the wireless trace I took, I saw 9% loss, but whether that is > bufferbloat induced loss or not, I don't know (the data is out there for > those who might want to dig). And as you note, the losses are > concentrated in bursts (probably due to the details of Cubic, so I'm told). > > I've had anecdotal reports (and some first hand experience) with much > higher loss rates, for example from Nick Weaver at ICSI; but I believe > in playing things conservatively with any numbers I quote and I've not > gotten consistent results when I've tried, so I just report what's in > the packet captures I did take. 
> > A phenomena that could be occurring is that during congestion avoidance > (until TCP loses its cookies entirely and probes for a higher operating > point) that TCP is carefully timing it's packets to keep the buffers > almost exactly full, so that competing flows (in my case, simple pings) > are likely to arrive just when there is no buffer space to accept them > and therefore you see higher losses on them than you would on the single > flow I've been tracing and getting loss statistics from. > > People who want to look into this further would be a great help. > - Jim I would not put a lot of trust in measuring loss with pings. I heard that some ISP's do different processing on ICMP's used for ping packets. They either prioritize them high to provide artificially good response (better marketing numbers); or prioritize them low since they aren't useful traffic. There are also filters that only allow N ICMP requests per second which means repeated probes will be dropped. -- ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-05 16:10 ` Stephen Hemminger @ 2011-05-05 16:30 ` Jim Gettys 2011-05-05 16:49 ` [Bloat] Burst Loss Neil Davies 1 sibling, 0 replies; 66+ messages in thread From: Jim Gettys @ 2011-05-05 16:30 UTC (permalink / raw) To: bloat On 05/05/2011 12:10 PM, Stephen Hemminger wrote: > On Thu, 05 May 2011 12:01:22 -0400 > Jim Gettys<jg@freedesktop.org> wrote: > >> On 04/30/2011 03:18 PM, Richard Scheffenegger wrote: >>> I'm curious, has anyone done some simulations to check if the >>> following qualitative statement holds true, and if, what the >>> quantitative effect is: >>> >>> With bufferbloat, the TCP congestion control reaction is unduely >>> delayed. When it finally happens, the tcp stream is likely facing a >>> "burst loss" event - multiple consecutive packets get dropped. Worse >>> yet, the sender with the lowest RTT across the bottleneck will likely >>> start to retransmit while the (tail-drop) queue is still overflowing. >>> >>> And a lost retransmission means a major setback in bandwidth (except >>> for Linux with bulk transfers and SACK enabled), as the standard (RFC >>> documented) behaviour asks for a RTO (1sec nominally, 200-500 ms >>> typically) to recover such a lost retransmission... >>> >>> The second part (more important as an incentive to the ISPs actually), >>> how does the fraction of goodput vs. throughput change, when AQM >>> schemes are deployed, and TCP CC reacts in a timely manner? Small ISPs >>> have to pay for their upstream volume, regardless if that is "real" >>> work (goodput) or unneccessary retransmissions. >>> >>> When I was at a small cable ISP in switzerland last week, surely >>> enough bufferbloat was readily observable (17ms -> 220ms after 30 sec >>> of a bulk transfer), but at first they had the "not our problem" view, >>> until I started discussing burst loss / retransmissions / goodput vs >>> throughput - with the latest point being a real commercial incentive >>> to them. (They promised to check if AQM would be available in the CPE >>> / CMTS, and put latency bounds in their tenders going forward). >>> >> I wish I had a good answer to your very good questions. Simulation >> would be interesting though real daa is more convincing. >> >> I haven't looked in detail at all that many traces to try to get a feel >> for how much bandwidth waste there actually is, and more formal studies >> like Netalyzr, SamKnows, or the Bismark project would be needed to >> quantify the loss on the network as a whole. >> >> I did spend some time last fall with the traces I've taken. In those, >> I've typically been seeing 1-3% packet loss in the main TCP transfers. >> On the wireless trace I took, I saw 9% loss, but whether that is >> bufferbloat induced loss or not, I don't know (the data is out there for >> those who might want to dig). And as you note, the losses are >> concentrated in bursts (probably due to the details of Cubic, so I'm told). >> >> I've had anecdotal reports (and some first hand experience) with much >> higher loss rates, for example from Nick Weaver at ICSI; but I believe >> in playing things conservatively with any numbers I quote and I've not >> gotten consistent results when I've tried, so I just report what's in >> the packet captures I did take. 
>> >> A phenomena that could be occurring is that during congestion avoidance >> (until TCP loses its cookies entirely and probes for a higher operating >> point) that TCP is carefully timing it's packets to keep the buffers >> almost exactly full, so that competing flows (in my case, simple pings) >> are likely to arrive just when there is no buffer space to accept them >> and therefore you see higher losses on them than you would on the single >> flow I've been tracing and getting loss statistics from. >> >> People who want to look into this further would be a great help. >> - Jim > I would not put a lot of trust in measuring loss with pings. > I heard that some ISP's do different processing on ICMP's used > for ping packets. They either prioritize them high to provide > artificially good response (better marketing numbers); or > prioritize them low since they aren't useful traffic. > There are also filters that only allow N ICMP requests per second > which means repeated probes will be dropped. I didn't use ping for my loss measurements above, but derived them from the traces themselves (using tstat: see: http://tstat.tlc.polito.it/index.shtml). Your explanation is part of why I don't use what I've seen when using ping for loss rates (though I have yet to actually see the behaviour of messing with priorities or preferentially dropping that many have claimed. Ping does often get processed on network gear slow paths, and it is believable that on loaded routers or broad band head end under load the pings might get dropped, classified or otherwise messed with. So I made sure to avoid that in the loss numbers I quote on the traces I looked at. It's also why I worked with Folkert Van Heusden last summer and fall to ensure that there was a TCP based ping program available (you can use options to httping http://www.vanheusden.com/httping/ to get an HTTP based ping using HTTP persistent connections), with one packet out and exactly one back, so it should be prioritised exactly as web traffic. So far, it and conventional ICMP ping have always returned effectively identical tests in the paths I've probed. How much of the anecdotal information of ISP's doing this or that with ICMP I'd believe is therefore not clear. But at least with httpping we can figure out what extent it may be true, and certainly care is in order on any measurements. Best regards, - jim - Jim ^ permalink raw reply [flat|nested] 66+ messages in thread
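For anyone who wants to reproduce the TCP-based probing Jim describes without installing anything, here is a minimal sketch in the same spirit (it times the three-way handshake rather than a full HTTP exchange, so it is not exactly what httping measures; the hostname is a placeholder):

    import socket, time

    def tcp_ping(host, port=80, count=5, timeout=2.0):
        """Return per-probe handshake RTTs in milliseconds (None for a failed probe)."""
        rtts = []
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.settimeout(timeout)
            t0 = time.time()
            try:
                s.connect((host, port))    # returns once the handshake completes
                rtts.append((time.time() - t0) * 1000.0)
            except socket.error:
                rtts.append(None)
            finally:
                s.close()
            time.sleep(1.0)
        return rtts

    for rtt in tcp_ping("www.example.com"):
        print("timeout" if rtt is None else "%.1f ms" % rtt)

Because the probe is an ordinary TCP SYN to port 80, it should be queued and forwarded like web traffic rather than like ICMP.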
* [Bloat] Burst Loss 2011-05-05 16:10 ` Stephen Hemminger 2011-05-05 16:30 ` Jim Gettys @ 2011-05-05 16:49 ` Neil Davies 2011-05-05 18:34 ` Jim Gettys ` (2 more replies) 1 sibling, 3 replies; 66+ messages in thread From: Neil Davies @ 2011-05-05 16:49 UTC (permalink / raw) To: Stephen Hemminger; +Cc: bloat On the issue of loss - we did a study of the UK's ADSL access network back in 2006 over several weeks, looking at the loss and delay that was introduced into the bi-directional traffic. We found that the delay variability (that bit left over after you've taken the effects of geography and line sync rates) was broadly the same over the half dozen locations we studied - it was there all the time to the same level of variance and that what did vary by time of day was the loss rate. We also found out, at the time much to our surprise - but we understand why now, that loss was broadly independent of the offered load - we used a constant data rate (with either fixed or variable packet sizes) . We found that loss rates were in the range 1% to 3% (which is what would be expected from a large number of TCP streams contending for a limiting resource). As for burst loss, yes it does occur - but it could be argued that this more the fault of the sending TCP stack than the network. This phenomenon was well covered in the academic literature in the '90s (if I remember correctly folks at INRIA lead the way) - it is all down to the nature of random processes and how you observe them. Back to back packets see higher loss rates than packets more spread out in time. Consider a pair of packets, back to back, arriving over a 1Gbit/sec link into a queue being serviced at 34Mbit/sec, the first packet being 'lost' is equivalent to saying that the first packet 'observed' the queue full - the system's state is no longer a random variable - it is known to be full. The second packet (lets assume it is also a full one) 'makes an observation' of the state of that queue about 12us later - but that is only 3% of the time that it takes to service such large packets at 34 Mbit/sec. The system has not had any time to 'relax' anywhere near to back its steady state, it is highly likely that it is still full. Fixing this makes a phenomenal difference on the goodput (with the usual delay effects that implies), we've even built and deployed systems with this sort of engineering embedded (deployed as a network 'wrap') that mean that end users can sustainably (days on end) achieve effective throughput that is better than 98% of (the transmission media imposed) maximum. What we had done is make the network behave closer to the underlying statistical assumptions made in TCP's design. Neil On 5 May 2011, at 17:10, Stephen Hemminger wrote: > On Thu, 05 May 2011 12:01:22 -0400 > Jim Gettys <jg@freedesktop.org> wrote: > >> On 04/30/2011 03:18 PM, Richard Scheffenegger wrote: >>> I'm curious, has anyone done some simulations to check if the >>> following qualitative statement holds true, and if, what the >>> quantitative effect is: >>> >>> With bufferbloat, the TCP congestion control reaction is unduely >>> delayed. When it finally happens, the tcp stream is likely facing a >>> "burst loss" event - multiple consecutive packets get dropped. Worse >>> yet, the sender with the lowest RTT across the bottleneck will likely >>> start to retransmit while the (tail-drop) queue is still overflowing. 
>>> >>> And a lost retransmission means a major setback in bandwidth (except >>> for Linux with bulk transfers and SACK enabled), as the standard (RFC >>> documented) behaviour asks for a RTO (1sec nominally, 200-500 ms >>> typically) to recover such a lost retransmission... >>> >>> The second part (more important as an incentive to the ISPs actually), >>> how does the fraction of goodput vs. throughput change, when AQM >>> schemes are deployed, and TCP CC reacts in a timely manner? Small ISPs >>> have to pay for their upstream volume, regardless if that is "real" >>> work (goodput) or unneccessary retransmissions. >>> >>> When I was at a small cable ISP in switzerland last week, surely >>> enough bufferbloat was readily observable (17ms -> 220ms after 30 sec >>> of a bulk transfer), but at first they had the "not our problem" view, >>> until I started discussing burst loss / retransmissions / goodput vs >>> throughput - with the latest point being a real commercial incentive >>> to them. (They promised to check if AQM would be available in the CPE >>> / CMTS, and put latency bounds in their tenders going forward). >>> >> I wish I had a good answer to your very good questions. Simulation >> would be interesting though real daa is more convincing. >> >> I haven't looked in detail at all that many traces to try to get a feel >> for how much bandwidth waste there actually is, and more formal studies >> like Netalyzr, SamKnows, or the Bismark project would be needed to >> quantify the loss on the network as a whole. >> >> I did spend some time last fall with the traces I've taken. In those, >> I've typically been seeing 1-3% packet loss in the main TCP transfers. >> On the wireless trace I took, I saw 9% loss, but whether that is >> bufferbloat induced loss or not, I don't know (the data is out there for >> those who might want to dig). And as you note, the losses are >> concentrated in bursts (probably due to the details of Cubic, so I'm told). >> >> I've had anecdotal reports (and some first hand experience) with much >> higher loss rates, for example from Nick Weaver at ICSI; but I believe >> in playing things conservatively with any numbers I quote and I've not >> gotten consistent results when I've tried, so I just report what's in >> the packet captures I did take. >> >> A phenomena that could be occurring is that during congestion avoidance >> (until TCP loses its cookies entirely and probes for a higher operating >> point) that TCP is carefully timing it's packets to keep the buffers >> almost exactly full, so that competing flows (in my case, simple pings) >> are likely to arrive just when there is no buffer space to accept them >> and therefore you see higher losses on them than you would on the single >> flow I've been tracing and getting loss statistics from. >> >> People who want to look into this further would be a great help. >> - Jim > > I would not put a lot of trust in measuring loss with pings. > I heard that some ISP's do different processing on ICMP's used > for ping packets. They either prioritize them high to provide > artificially good response (better marketing numbers); or > prioritize them low since they aren't useful traffic. > There are also filters that only allow N ICMP requests per second > which means repeated probes will be dropped. > > > > -- > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 66+ messages in thread
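The arithmetic behind Neil's back-to-back example, spelled out (assuming full-size 1500-byte packets):

    PKT = 1500 * 8                 # bits in a full-size packet
    fast, slow = 1e9, 34e6         # ingress and egress rates, bits/s

    gap = PKT / fast               # spacing of two back-to-back arrivals
    service = PKT / slow           # time the slow side needs to drain one packet

    print("arrival gap : %6.1f us" % (gap * 1e6))       # ~12 us
    print("service time: %6.1f us" % (service * 1e6))   # ~353 us
    print("the gap is %.1f%% of one service time" % (100 * gap / service))

The second packet samples the queue long before even one packet's worth of draining has happened, which is why its fate is so strongly correlated with the first packet's, and why spreading packets out in time breaks that correlation.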
* Re: [Bloat] Burst Loss 2011-05-05 16:49 ` [Bloat] Burst Loss Neil Davies @ 2011-05-05 18:34 ` Jim Gettys 2011-05-06 11:40 ` Sam Stickland 2011-05-08 12:42 ` Richard Scheffenegger 2 siblings, 0 replies; 66+ messages in thread From: Jim Gettys @ 2011-05-05 18:34 UTC (permalink / raw) To: bloat On 05/05/2011 12:49 PM, Neil Davies wrote: > On the issue of loss - we did a study of the UK's ADSL access network back in 2006 over several weeks, looking at the loss and delay that was introduced into the bi-directional traffic. > > We found that the delay variability (that bit left over after you've taken the effects of geography and line sync rates) was broadly > the same over the half dozen locations we studied - it was there all the time to the same level of variance and that what did vary by time of day was the loss rate. > > We also found out, at the time much to our surprise - but we understand why now, that loss was broadly independent of the offered load - we used a constant data rate (with either fixed or variable packet sizes) . > > We found that loss rates were in the range 1% to 3% (which is what would be expected from a large number of TCP streams contending for a limiting resource). > > As for burst loss, yes it does occur - but it could be argued that this more the fault of the sending TCP stack than the network. > > This phenomenon was well covered in the academic literature in the '90s (if I remember correctly folks at INRIA lead the way) - it is all down to the nature of random processes and how you observe them. > > Back to back packets see higher loss rates than packets more spread out in time. Consider a pair of packets, back to back, arriving over a 1Gbit/sec link into a queue being serviced at 34Mbit/sec, the first packet being 'lost' is equivalent to saying that the first packet 'observed' the queue full - the system's state is no longer a random variable - it is known to be full. The second packet (lets assume it is also a full one) 'makes an observation' of the state of that queue about 12us later - but that is only 3% of the time that it takes to service such large packets at 34 Mbit/sec. The system has not had any time to 'relax' anywhere near to back its steady state, it is highly likely that it is still full. > > Fixing this makes a phenomenal difference on the goodput (with the usual delay effects that implies), we've even built and deployed systems with this sort of engineering embedded (deployed as a network 'wrap') that mean that end users can sustainably (days on end) achieve effective throughput that is better than 98% of (the transmission media imposed) maximum. What we had done is make the network behave closer to the underlying statistical assumptions made in TCP's design. > > Neil Good point: in phone conversations with Van Jacobson, he made the point that we'd really like the hardware to allow scheduling of packet transmission to allow proper paceing of packets, to avoid clumping and smooth flow. - Jim > > > > On 5 May 2011, at 17:10, Stephen Hemminger wrote: > >> On Thu, 05 May 2011 12:01:22 -0400 >> Jim Gettys<jg@freedesktop.org> wrote: >> >>> On 04/30/2011 03:18 PM, Richard Scheffenegger wrote: >>>> I'm curious, has anyone done some simulations to check if the >>>> following qualitative statement holds true, and if, what the >>>> quantitative effect is: >>>> >>>> With bufferbloat, the TCP congestion control reaction is unduely >>>> delayed. 
When it finally happens, the tcp stream is likely facing a >>>> "burst loss" event - multiple consecutive packets get dropped. Worse >>>> yet, the sender with the lowest RTT across the bottleneck will likely >>>> start to retransmit while the (tail-drop) queue is still overflowing. >>>> >>>> And a lost retransmission means a major setback in bandwidth (except >>>> for Linux with bulk transfers and SACK enabled), as the standard (RFC >>>> documented) behaviour asks for a RTO (1sec nominally, 200-500 ms >>>> typically) to recover such a lost retransmission... >>>> >>>> The second part (more important as an incentive to the ISPs actually), >>>> how does the fraction of goodput vs. throughput change, when AQM >>>> schemes are deployed, and TCP CC reacts in a timely manner? Small ISPs >>>> have to pay for their upstream volume, regardless if that is "real" >>>> work (goodput) or unneccessary retransmissions. >>>> >>>> When I was at a small cable ISP in switzerland last week, surely >>>> enough bufferbloat was readily observable (17ms -> 220ms after 30 sec >>>> of a bulk transfer), but at first they had the "not our problem" view, >>>> until I started discussing burst loss / retransmissions / goodput vs >>>> throughput - with the latest point being a real commercial incentive >>>> to them. (They promised to check if AQM would be available in the CPE >>>> / CMTS, and put latency bounds in their tenders going forward). >>>> >>> I wish I had a good answer to your very good questions. Simulation >>> would be interesting though real daa is more convincing. >>> >>> I haven't looked in detail at all that many traces to try to get a feel >>> for how much bandwidth waste there actually is, and more formal studies >>> like Netalyzr, SamKnows, or the Bismark project would be needed to >>> quantify the loss on the network as a whole. >>> >>> I did spend some time last fall with the traces I've taken. In those, >>> I've typically been seeing 1-3% packet loss in the main TCP transfers. >>> On the wireless trace I took, I saw 9% loss, but whether that is >>> bufferbloat induced loss or not, I don't know (the data is out there for >>> those who might want to dig). And as you note, the losses are >>> concentrated in bursts (probably due to the details of Cubic, so I'm told). >>> >>> I've had anecdotal reports (and some first hand experience) with much >>> higher loss rates, for example from Nick Weaver at ICSI; but I believe >>> in playing things conservatively with any numbers I quote and I've not >>> gotten consistent results when I've tried, so I just report what's in >>> the packet captures I did take. >>> >>> A phenomena that could be occurring is that during congestion avoidance >>> (until TCP loses its cookies entirely and probes for a higher operating >>> point) that TCP is carefully timing it's packets to keep the buffers >>> almost exactly full, so that competing flows (in my case, simple pings) >>> are likely to arrive just when there is no buffer space to accept them >>> and therefore you see higher losses on them than you would on the single >>> flow I've been tracing and getting loss statistics from. >>> >>> People who want to look into this further would be a great help. >>> - Jim >> I would not put a lot of trust in measuring loss with pings. >> I heard that some ISP's do different processing on ICMP's used >> for ping packets. They either prioritize them high to provide >> artificially good response (better marketing numbers); or >> prioritize them low since they aren't useful traffic. 
>> There are also filters that only allow N ICMP requests per second >> which means repeated probes will be dropped. >> >> >> >> -- >> _______________________________________________ >> Bloat mailing list >> Bloat@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/bloat > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 66+ messages in thread
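A rough software sketch of the pacing idea Jim attributes to Van Jacobson (toy code: send_segment() is a placeholder, and a real implementation would live in the stack or, ideally, the NIC): instead of handing a whole window to the interface at once, space the segments so they leave at roughly cwnd per RTT.

    import time

    def send_segment(seg):
        pass                        # stand-in for the real transmit path

    def paced_send(segments, cwnd_bytes, rtt_s, mss=1500):
        interval = rtt_s / max(1, cwnd_bytes // mss)   # per-segment spacing
        for seg in segments:
            t0 = time.time()
            send_segment(seg)
            leftover = interval - (time.time() - t0)
            if leftover > 0:
                time.sleep(leftover)    # coarse software timers are exactly why
                                        # hardware help for this would be welcome

    paced_send([b"x" * 1460] * 10, cwnd_bytes=10 * 1460, rtt_s=0.050)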
* Re: [Bloat] Burst Loss 2011-05-05 16:49 ` [Bloat] Burst Loss Neil Davies 2011-05-05 18:34 ` Jim Gettys @ 2011-05-06 11:40 ` Sam Stickland 2011-05-06 11:53 ` Neil Davies 2011-05-08 12:42 ` Richard Scheffenegger 2 siblings, 1 reply; 66+ messages in thread From: Sam Stickland @ 2011-05-06 11:40 UTC (permalink / raw) To: Neil Davies; +Cc: Stephen Hemminger, bloat [-- Attachment #1: Type: text/plain, Size: 2604 bytes --] On 5 May 2011, at 17:49, Neil Davies <Neil.Davies@pnsol.com> wrote: > On the issue of loss - we did a study of the UK's ADSL access network back in 2006 over several weeks, looking at the loss and delay that was introduced into the bi-directional traffic. > > We found that the delay variability (that bit left over after you've taken the effects of geography and line sync rates) was broadly > the same over the half dozen locations we studied - it was there all the time to the same level of variance and that what did vary by time of day was the loss rate. > > We also found out, at the time much to our surprise - but we understand why now, that loss was broadly independent of the offered load - we used a constant data rate (with either fixed or variable packet sizes) . > > We found that loss rates were in the range 1% to 3% (which is what would be expected from a large number of TCP streams contending for a limiting resource). > > As for burst loss, yes it does occur - but it could be argued that this more the fault of the sending TCP stack than the network. > > This phenomenon was well covered in the academic literature in the '90s (if I remember correctly folks at INRIA lead the way) - it is all down to the nature of random processes and how you observe them. > > Back to back packets see higher loss rates than packets more spread out in time. Consider a pair of packets, back to back, arriving over a 1Gbit/sec link into a queue being serviced at 34Mbit/sec, the first packet being 'lost' is equivalent to saying that the first packet 'observed' the queue full - the system's state is no longer a random variable - it is known to be full. The second packet (lets assume it is also a full one) 'makes an observation' of the state of that queue about 12us later - but that is only 3% of the time that it takes to service such large packets at 34 Mbit/sec. The system has not had any time to 'relax' anywhere near to back its steady state, it is highly likely that it is still full. > > Fixing this makes a phenomenal difference on the goodput (with the usual delay effects that implies), we've even built and deployed systems with this sort of engineering embedded (deployed as a network 'wrap') that mean that end users can sustainably (days on end) achieve effective throughput that is better than 98% of (the transmission media imposed) maximum. What we had done is make the network behave closer to the underlying statistical assumptions made in TCP's design. How did you fix this? What alters the packet spacing? The network or the host? Sam [-- Attachment #2: Type: text/html, Size: 3594 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Burst Loss 2011-05-06 11:40 ` Sam Stickland @ 2011-05-06 11:53 ` Neil Davies 0 siblings, 0 replies; 66+ messages in thread From: Neil Davies @ 2011-05-06 11:53 UTC (permalink / raw) To: Sam Stickland; +Cc: Stephen Hemminger, bloat [-- Attachment #1: Type: text/plain, Size: 2862 bytes --] On 6 May 2011, at 12:40, Sam Stickland wrote: > > > On 5 May 2011, at 17:49, Neil Davies <Neil.Davies@pnsol.com> wrote: > >> On the issue of loss - we did a study of the UK's ADSL access network back in 2006 over several weeks, looking at the loss and delay that was introduced into the bi-directional traffic. >> >> We found that the delay variability (that bit left over after you've taken the effects of geography and line sync rates) was broadly >> the same over the half dozen locations we studied - it was there all the time to the same level of variance and that what did vary by time of day was the loss rate. >> >> We also found out, at the time much to our surprise - but we understand why now, that loss was broadly independent of the offered load - we used a constant data rate (with either fixed or variable packet sizes) . >> >> We found that loss rates were in the range 1% to 3% (which is what would be expected from a large number of TCP streams contending for a limiting resource). >> >> As for burst loss, yes it does occur - but it could be argued that this more the fault of the sending TCP stack than the network. >> >> This phenomenon was well covered in the academic literature in the '90s (if I remember correctly folks at INRIA lead the way) - it is all down to the nature of random processes and how you observe them. >> >> Back to back packets see higher loss rates than packets more spread out in time. Consider a pair of packets, back to back, arriving over a 1Gbit/sec link into a queue being serviced at 34Mbit/sec, the first packet being 'lost' is equivalent to saying that the first packet 'observed' the queue full - the system's state is no longer a random variable - it is known to be full. The second packet (lets assume it is also a full one) 'makes an observation' of the state of that queue about 12us later - but that is only 3% of the time that it takes to service such large packets at 34 Mbit/sec. The system has not had any time to 'relax' anywhere near to back its steady state, it is highly likely that it is still full. >> >> Fixing this makes a phenomenal difference on the goodput (with the usual delay effects that implies), we've even built and deployed systems with this sort of engineering embedded (deployed as a network 'wrap') that mean that end users can sustainably (days on end) achieve effective throughput that is better than 98% of (the transmission media imposed) maximum. What we had done is make the network behave closer to the underlying statistical assumptions made in TCP's design. > > How did you fix this? What alters the packet spacing? The network or the host? It is a device in the network, it sits at the 'edge' of the access network (at the ISP / Network Wholesaler boundary) - that resolves the downstream issue. Neil > > Sam [-- Attachment #2: Type: text/html, Size: 4145 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
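Neil doesn't describe the mechanism, but one simple way to picture a downstream respacer at that boundary is a per-line shaper that refuses to forward faster than the access line can drain; the sketch below uses invented numbers and is not a description of his device.

    def respace(arrival_times, line_rate_bps, pkt_bits=12000):
        """Release packets downstream no faster than the access line rate."""
        service = pkt_bits / float(line_rate_bps)
        departures, line_free_at = [], 0.0
        for t in arrival_times:
            start = max(t, line_free_at)     # wait until the line is free again
            line_free_at = start + service
            departures.append(line_free_at)
        return departures

    burst = [0.0] * 10                       # ten packets arriving back to back
    for i, d in enumerate(respace(burst, 8e6)):
        print("packet %d handed to the line at %.1f ms" % (i, d * 1000))

A burst that would have landed on the downstream buffer all at once is instead spread over time, which is the sort of respacing the preceding discussion is about.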
* Re: [Bloat] Burst Loss 2011-05-05 16:49 ` [Bloat] Burst Loss Neil Davies 2011-05-05 18:34 ` Jim Gettys 2011-05-06 11:40 ` Sam Stickland @ 2011-05-08 12:42 ` Richard Scheffenegger 2011-05-09 18:06 ` Rick Jones 2 siblings, 1 reply; 66+ messages in thread From: Richard Scheffenegger @ 2011-05-08 12:42 UTC (permalink / raw) To: Neil Davies, Stephen Hemminger; +Cc: bloat I'm not an expert in TSO / GSO and NIC driver design, but what I gathered is that with these schemes, and modern NICs that do scatter/gather DMA of dozens of "independent" header/data chunks directly from memory, the NIC will typically send out non-interleaved trains of segments all belonging to single TCP sessions. The implicit assumption is that these bursts of up to 180 segments (Intel supports 256kB of data per chain) can be absorbed by the buffer at the bottleneck and spread out in time there... From my perspective, having GSO / TSO "cycle" through all the different chains belonging to different sessions (so as not to introduce reordering at the sender) should already help pace the segments per session somewhat; a slightly more sophisticated DMA engine could check each of the chains for how much data is to be sent by them, and then clock an appropriate number of interleaved segments out... I do understand that this is "work" for a HW DMA engine and slows down GSO software implementations, but it may severely reduce the instantaneous rate of a single session, and thereby the impact of burst loss due to momentary buffer overload... (Let me know if I should draw a picture of the way I understand TSO / HW DMA is currently working, and where it could be improved upon.) Best regards, Richard ----- Original Message ----- > Back to back packets see higher loss rates than packets more spread out in > time. Consider a pair of packets, back to back, arriving over a 1Gbit/sec > link into a queue being serviced at 34Mbit/sec, the first packet being > 'lost' is equivalent to saying that the first packet 'observed' the queue > full - the system's state is no longer a random variable - it is known to > be full. The second packet (lets assume it is also a full one) 'makes an > observation' of the state of that queue about 12us later - but that is > only 3% of the time that it takes to service such large packets at 34 > Mbit/sec. The system has not had any time to 'relax' anywhere near to back > its steady state, it is highly likely that it is still full. > > Fixing this makes a phenomenal difference on the goodput (with the usual > delay effects that implies), we've even built and deployed systems with > this sort of engineering embedded (deployed as a network 'wrap') that mean > that end users can sustainably (days on end) achieve effective throughput > that is better than 98% of (the transmission media imposed) maximum. What > we had done is make the network behave closer to the underlying > statistical assumptions made in TCP's design. > > Neil > > > > > On 5 May 2011, at 17:10, Stephen Hemminger wrote: > >> On Thu, 05 May 2011 12:01:22 -0400 >> Jim Gettys <jg@freedesktop.org> wrote: >> >>> On 04/30/2011 03:18 PM, Richard Scheffenegger wrote: >>>> I'm curious, has anyone done some simulations to check if the >>>> following qualitative statement holds true, and if, what the >>>> quantitative effect is: >>>> >>>> With bufferbloat, the TCP congestion control reaction is unduely >>>> delayed. When it finally happens, the tcp stream is likely facing a >>>> "burst loss" event - multiple consecutive packets get dropped. 
Worse >>>> yet, the sender with the lowest RTT across the bottleneck will likely >>>> start to retransmit while the (tail-drop) queue is still overflowing. >>>> >>>> And a lost retransmission means a major setback in bandwidth (except >>>> for Linux with bulk transfers and SACK enabled), as the standard (RFC >>>> documented) behaviour asks for a RTO (1sec nominally, 200-500 ms >>>> typically) to recover such a lost retransmission... >>>> >>>> The second part (more important as an incentive to the ISPs actually), >>>> how does the fraction of goodput vs. throughput change, when AQM >>>> schemes are deployed, and TCP CC reacts in a timely manner? Small ISPs >>>> have to pay for their upstream volume, regardless if that is "real" >>>> work (goodput) or unneccessary retransmissions. >>>> >>>> When I was at a small cable ISP in switzerland last week, surely >>>> enough bufferbloat was readily observable (17ms -> 220ms after 30 sec >>>> of a bulk transfer), but at first they had the "not our problem" view, >>>> until I started discussing burst loss / retransmissions / goodput vs >>>> throughput - with the latest point being a real commercial incentive >>>> to them. (They promised to check if AQM would be available in the CPE >>>> / CMTS, and put latency bounds in their tenders going forward). >>>> >>> I wish I had a good answer to your very good questions. Simulation >>> would be interesting though real daa is more convincing. >>> >>> I haven't looked in detail at all that many traces to try to get a feel >>> for how much bandwidth waste there actually is, and more formal studies >>> like Netalyzr, SamKnows, or the Bismark project would be needed to >>> quantify the loss on the network as a whole. >>> >>> I did spend some time last fall with the traces I've taken. In those, >>> I've typically been seeing 1-3% packet loss in the main TCP transfers. >>> On the wireless trace I took, I saw 9% loss, but whether that is >>> bufferbloat induced loss or not, I don't know (the data is out there for >>> those who might want to dig). And as you note, the losses are >>> concentrated in bursts (probably due to the details of Cubic, so I'm >>> told). >>> >>> I've had anecdotal reports (and some first hand experience) with much >>> higher loss rates, for example from Nick Weaver at ICSI; but I believe >>> in playing things conservatively with any numbers I quote and I've not >>> gotten consistent results when I've tried, so I just report what's in >>> the packet captures I did take. >>> >>> A phenomena that could be occurring is that during congestion avoidance >>> (until TCP loses its cookies entirely and probes for a higher operating >>> point) that TCP is carefully timing it's packets to keep the buffers >>> almost exactly full, so that competing flows (in my case, simple pings) >>> are likely to arrive just when there is no buffer space to accept them >>> and therefore you see higher losses on them than you would on the single >>> flow I've been tracing and getting loss statistics from. >>> >>> People who want to look into this further would be a great help. >>> - Jim >> >> I would not put a lot of trust in measuring loss with pings. >> I heard that some ISP's do different processing on ICMP's used >> for ping packets. They either prioritize them high to provide >> artificially good response (better marketing numbers); or >> prioritize them low since they aren't useful traffic. >> There are also filters that only allow N ICMP requests per second >> which means repeated probes will be dropped. 
>> >> >> >> -- >> _______________________________________________ >> Bloat mailing list >> Bloat@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/bloat > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 66+ messages in thread
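A minimal sketch of the interleaving Richard describes above - round-robin across the per-session segment chains so that per-flow ordering is preserved but no single flow emits a long back-to-back train. The function and toy data structures are mine, purely illustrative:

    from collections import deque

    def interleave_chains(chains):
        """Emit one segment at a time from each per-flow TSO/GSO chain in turn.
        Order within each flow is preserved; long single-flow bursts are broken up."""
        queues = [deque(chain) for chain in chains]
        wire = []
        while any(queues):
            for q in queues:
                if q:
                    wire.append(q.popleft())
        return wire

    # Example: three TCP sessions, each handed to the NIC as one segment chain.
    a = [("A", i) for i in range(4)]
    b = [("B", i) for i in range(4)]
    c = [("C", i) for i in range(2)]
    print(interleave_chains([a, b, c]))
    # [('A',0), ('B',0), ('C',0), ('A',1), ('B',1), ('C',1), ('A',2), ('B',2), ...]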
* Re: [Bloat] Burst Loss 2011-05-08 12:42 ` Richard Scheffenegger @ 2011-05-09 18:06 ` Rick Jones 2011-05-11 8:53 ` Richard Scheffenegger 2011-05-12 16:31 ` [Bloat] Burst Loss Fred Baker 0 siblings, 2 replies; 66+ messages in thread From: Rick Jones @ 2011-05-09 18:06 UTC (permalink / raw) To: Richard Scheffenegger; +Cc: Stephen Hemminger, bloat On Sun, 2011-05-08 at 14:42 +0200, Richard Scheffenegger wrote: > I'm not an expert in TSO / GSO, and NIC driver design, but what I gathered > is, that with these schemes, and mordern NICs that do scatter/gather DMA of > dotzends of "independent" header/data chuncks directly from memory, the NIC > will typically send out non-interleaved trains of segments all belonging to > single TCP sessions. With the implicit assumption, that these burst of up to > 180 segments (Intel supports 256kB data per chain) can be absorped by the > buffer at the bottleneck and spread out in time there... > > From my perspective, having such GSO / TSO to "cycle" through all the > different chains belonging to different sessions (to not introduce > reordering at the sender even), should already help pace the segments per > session somewhat; a slightly more sophisticated DMA engine could check each > of the chains for how much data is to be sent by those, and then clock an > appropriate number of interleaved segmets out... I do understand that this > is "work" for a HW DMA engine and slows down GSO software implementations, > but may severly reduce the instantaneous rate of a single session, and > thereby the impact of burst loss to to momenary buffer overload... > > (Let me know if I should draw a picture of the way I understand TSO / HW DMA > is currently working, and where it could be improved upon): GSO/TSO can be thought of as a symptom of standards bodies (eg the IEEE) refusing to standardize an increase in frame sizes. Put another way, they are a "poor man's jumbo frames." Within the context of a given "priority" at least, NICs are setup/designed to do things in order. I too cannot claim to be a NIC designer, but suspect it would be a non-trivial, if straight-forward exercise to get a NIC to cycle through multiple GSO/TSO sends. Yes, they could probably (ab)use any prioritization support they have. NICs and drivers are accustomed to "in order" processing - grab packet, send packet, update status, lather, rinse, repeat (modulo some pre-fetching). Those rings aren't really amenable to "out of order" completion notifications, so the NIC would have to still do "in order" retirement of packets or the driver model will loose simplicity. As for the issue below, even if the NIC(s) upstream did interleave between two GSO'd sends, you are simply trading back-to-back frames of a single flow for back-to-back frames of different flows. And if there is only the one flow upstream of this bottleneck, whether GSO is on or not probably won't make a huge difference in the timing - only how much CPU is burned on the source host. > Best regards, > Richard > > > ----- Original Message ----- > > Back to back packets see higher loss rates than packets more spread out in > > time. Consider a pair of packets, back to back, arriving over a 1Gbit/sec > > link into a queue being serviced at 34Mbit/sec, the first packet being > > 'lost' is equivalent to saying that the first packet 'observed' the queue > > full - the system's state is no longer a random variable - it is known to > > be full. 
The second packet (lets assume it is also a full one) 'makes an > > observation' of the state of that queue about 12us later - but that is > > only 3% of the time that it takes to service such large packets at 34 > > Mbit/sec. The system has not had any time to 'relax' anywhere near to back > > its steady state, it is highly likely that it is still full. > > > > Fixing this makes a phenomenal difference on the goodput (with the usual > > delay effects that implies), we've even built and deployed systems with > > this sort of engineering embedded (deployed as a network 'wrap') that mean > > that end users can sustainably (days on end) achieve effective throughput > > that is better than 98% of (the transmission media imposed) maximum. What > > we had done is make the network behave closer to the underlying > > statistical assumptions made in TCP's design. > > > > Neil > > > > > > > > > > On 5 May 2011, at 17:10, Stephen Hemminger wrote: > > > >> On Thu, 05 May 2011 12:01:22 -0400 > >> Jim Gettys <jg@freedesktop.org> wrote: > >> > >>> On 04/30/2011 03:18 PM, Richard Scheffenegger wrote: > >>>> I'm curious, has anyone done some simulations to check if the > >>>> following qualitative statement holds true, and if, what the > >>>> quantitative effect is: > >>>> > >>>> With bufferbloat, the TCP congestion control reaction is unduely > >>>> delayed. When it finally happens, the tcp stream is likely facing a > >>>> "burst loss" event - multiple consecutive packets get dropped. Worse > >>>> yet, the sender with the lowest RTT across the bottleneck will likely > >>>> start to retransmit while the (tail-drop) queue is still overflowing. > >>>> > >>>> And a lost retransmission means a major setback in bandwidth (except > >>>> for Linux with bulk transfers and SACK enabled), as the standard (RFC > >>>> documented) behaviour asks for a RTO (1sec nominally, 200-500 ms > >>>> typically) to recover such a lost retransmission... > >>>> > >>>> The second part (more important as an incentive to the ISPs actually), > >>>> how does the fraction of goodput vs. throughput change, when AQM > >>>> schemes are deployed, and TCP CC reacts in a timely manner? Small ISPs > >>>> have to pay for their upstream volume, regardless if that is "real" > >>>> work (goodput) or unneccessary retransmissions. > >>>> > >>>> When I was at a small cable ISP in switzerland last week, surely > >>>> enough bufferbloat was readily observable (17ms -> 220ms after 30 sec > >>>> of a bulk transfer), but at first they had the "not our problem" view, > >>>> until I started discussing burst loss / retransmissions / goodput vs > >>>> throughput - with the latest point being a real commercial incentive > >>>> to them. (They promised to check if AQM would be available in the CPE > >>>> / CMTS, and put latency bounds in their tenders going forward). > >>>> > >>> I wish I had a good answer to your very good questions. Simulation > >>> would be interesting though real daa is more convincing. > >>> > >>> I haven't looked in detail at all that many traces to try to get a feel > >>> for how much bandwidth waste there actually is, and more formal studies > >>> like Netalyzr, SamKnows, or the Bismark project would be needed to > >>> quantify the loss on the network as a whole. > >>> > >>> I did spend some time last fall with the traces I've taken. In those, > >>> I've typically been seeing 1-3% packet loss in the main TCP transfers. 
> >>> On the wireless trace I took, I saw 9% loss, but whether that is > >>> bufferbloat induced loss or not, I don't know (the data is out there for > >>> those who might want to dig). And as you note, the losses are > >>> concentrated in bursts (probably due to the details of Cubic, so I'm > >>> told). > >>> > >>> I've had anecdotal reports (and some first hand experience) with much > >>> higher loss rates, for example from Nick Weaver at ICSI; but I believe > >>> in playing things conservatively with any numbers I quote and I've not > >>> gotten consistent results when I've tried, so I just report what's in > >>> the packet captures I did take. > >>> > >>> A phenomena that could be occurring is that during congestion avoidance > >>> (until TCP loses its cookies entirely and probes for a higher operating > >>> point) that TCP is carefully timing it's packets to keep the buffers > >>> almost exactly full, so that competing flows (in my case, simple pings) > >>> are likely to arrive just when there is no buffer space to accept them > >>> and therefore you see higher losses on them than you would on the single > >>> flow I've been tracing and getting loss statistics from. > >>> > >>> People who want to look into this further would be a great help. > >>> - Jim > >> > >> I would not put a lot of trust in measuring loss with pings. > >> I heard that some ISP's do different processing on ICMP's used > >> for ping packets. They either prioritize them high to provide > >> artificially good response (better marketing numbers); or > >> prioritize them low since they aren't useful traffic. > >> There are also filters that only allow N ICMP requests per second > >> which means repeated probes will be dropped. > >> > >> > >> > >> -- > >> _______________________________________________ > >> Bloat mailing list > >> Bloat@lists.bufferbloat.net > >> https://lists.bufferbloat.net/listinfo/bloat > > > > _______________________________________________ > > Bloat mailing list > > Bloat@lists.bufferbloat.net > > https://lists.bufferbloat.net/listinfo/bloat > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Burst Loss 2011-05-09 18:06 ` Rick Jones @ 2011-05-11 8:53 ` Richard Scheffenegger 2011-05-11 9:53 ` Eric Dumazet 2011-05-12 16:31 ` [Bloat] Burst Loss Fred Baker 1 sibling, 1 reply; 66+ messages in thread From: Richard Scheffenegger @ 2011-05-11 8:53 UTC (permalink / raw) To: rick.jones2; +Cc: Stephen Hemminger, bloat > Within the context of a given "priority" at least, NICs are > setup/designed to do things in order. I too cannot claim to be a NIC > designer, but suspect it would be a non-trivial, if straight-forward > exercise to get a NIC to cycle through multiple GSO/TSO sends. Yes, > they could probably (ab)use any prioritization support they have. > > NICs and drivers are accustomed to "in order" processing - grab packet, > send packet, update status, lather, rinse, repeat (modulo some > pre-fetching). Those rings aren't really amenable to "out of order" > completion notifications, so the NIC would have to still do "in order" > retirement of packets or the driver model will loose simplicity. > > As for the issue below, even if the NIC(s) upstream did interleave > between two GSO'd sends, you are simply trading back-to-back frames of a > single flow for back-to-back frames of different flows. And if there is > only the one flow upstream of this bottleneck, whether GSO is on or not > probably won't make a huge difference in the timing - only how much CPU > is burned on the source host. Well, the transmit descriptors (header + pointer to the data to be segmented) is in the hand of the hw driver... The hw driver could at least check if the current list of transmit descriptors is for different tcp sessions (or interspaced non-tcp traffic), and could interleave these descriptors (reorder them, before they are processed by hardware - while obviously maintaining relative ordering between the descriptors belonging to the same flow. Also, I think this feature could be utilized for pacing to some extent - interspace the (valid) traffic descriptors with descriptors that will cause "invalid" packets to be sent (ie. dst mac == src max; should be dropped by the first switch). It's been well known that properly paced traffic is much more resilient than traffic being sent in short bursts of wirespeed trains of packets. (TSO defeats the self-clocking of TCP with ACKs). Just a thought... Richard ----- Original Message ----- From: "Rick Jones" <rick.jones2@hp.com> To: "Richard Scheffenegger" <rscheff@gmx.at> Cc: "Neil Davies" <Neil.Davies@pnsol.com>; "Stephen Hemminger" <shemminger@vyatta.com>; <bloat@lists.bufferbloat.net> Sent: Monday, May 09, 2011 8:06 PM Subject: Re: [Bloat] Burst Loss > On Sun, 2011-05-08 at 14:42 +0200, Richard Scheffenegger wrote: >> I'm not an expert in TSO / GSO, and NIC driver design, but what I >> gathered >> is, that with these schemes, and mordern NICs that do scatter/gather DMA >> of >> dotzends of "independent" header/data chuncks directly from memory, the >> NIC >> will typically send out non-interleaved trains of segments all belonging >> to >> single TCP sessions. With the implicit assumption, that these burst of up >> to >> 180 segments (Intel supports 256kB data per chain) can be absorped by the >> buffer at the bottleneck and spread out in time there... 
>> >> From my perspective, having such GSO / TSO to "cycle" through all the >> different chains belonging to different sessions (to not introduce >> reordering at the sender even), should already help pace the segments per >> session somewhat; a slightly more sophisticated DMA engine could check >> each >> of the chains for how much data is to be sent by those, and then clock an >> appropriate number of interleaved segmets out... I do understand that >> this >> is "work" for a HW DMA engine and slows down GSO software >> implementations, >> but may severly reduce the instantaneous rate of a single session, and >> thereby the impact of burst loss to to momenary buffer overload... >> >> (Let me know if I should draw a picture of the way I understand TSO / HW >> DMA >> is currently working, and where it could be improved upon): > > GSO/TSO can be thought of as a symptom of standards bodies (eg the IEEE) > refusing to standardize an increase in frame sizes. Put another way, > they are a "poor man's jumbo frames." > > Within the context of a given "priority" at least, NICs are > setup/designed to do things in order. I too cannot claim to be a NIC > designer, but suspect it would be a non-trivial, if straight-forward > exercise to get a NIC to cycle through multiple GSO/TSO sends. Yes, > they could probably (ab)use any prioritization support they have. > > NICs and drivers are accustomed to "in order" processing - grab packet, > send packet, update status, lather, rinse, repeat (modulo some > pre-fetching). Those rings aren't really amenable to "out of order" > completion notifications, so the NIC would have to still do "in order" > retirement of packets or the driver model will loose simplicity. > > As for the issue below, even if the NIC(s) upstream did interleave > between two GSO'd sends, you are simply trading back-to-back frames of a > single flow for back-to-back frames of different flows. And if there is > only the one flow upstream of this bottleneck, whether GSO is on or not > probably won't make a huge difference in the timing - only how much CPU > is burned on the source host. > >> Best regards, >> Richard >> >> >> ----- Original Message ----- >> > Back to back packets see higher loss rates than packets more spread out >> > in >> > time. Consider a pair of packets, back to back, arriving over a >> > 1Gbit/sec >> > link into a queue being serviced at 34Mbit/sec, the first packet being >> > 'lost' is equivalent to saying that the first packet 'observed' the >> > queue >> > full - the system's state is no longer a random variable - it is known >> > to >> > be full. The second packet (lets assume it is also a full one) 'makes >> > an >> > observation' of the state of that queue about 12us later - but that is >> > only 3% of the time that it takes to service such large packets at 34 >> > Mbit/sec. The system has not had any time to 'relax' anywhere near to >> > back >> > its steady state, it is highly likely that it is still full. >> > >> > Fixing this makes a phenomenal difference on the goodput (with the >> > usual >> > delay effects that implies), we've even built and deployed systems with >> > this sort of engineering embedded (deployed as a network 'wrap') that >> > mean >> > that end users can sustainably (days on end) achieve effective >> > throughput >> > that is better than 98% of (the transmission media imposed) maximum. >> > What >> > we had done is make the network behave closer to the underlying >> > statistical assumptions made in TCP's design. 
>> > >> > Neil >> > >> > >> > >> > >> > On 5 May 2011, at 17:10, Stephen Hemminger wrote: >> > >> >> On Thu, 05 May 2011 12:01:22 -0400 >> >> Jim Gettys <jg@freedesktop.org> wrote: >> >> >> >>> On 04/30/2011 03:18 PM, Richard Scheffenegger wrote: >> >>>> I'm curious, has anyone done some simulations to check if the >> >>>> following qualitative statement holds true, and if, what the >> >>>> quantitative effect is: >> >>>> >> >>>> With bufferbloat, the TCP congestion control reaction is unduely >> >>>> delayed. When it finally happens, the tcp stream is likely facing a >> >>>> "burst loss" event - multiple consecutive packets get dropped. Worse >> >>>> yet, the sender with the lowest RTT across the bottleneck will >> >>>> likely >> >>>> start to retransmit while the (tail-drop) queue is still >> >>>> overflowing. >> >>>> >> >>>> And a lost retransmission means a major setback in bandwidth (except >> >>>> for Linux with bulk transfers and SACK enabled), as the standard >> >>>> (RFC >> >>>> documented) behaviour asks for a RTO (1sec nominally, 200-500 ms >> >>>> typically) to recover such a lost retransmission... >> >>>> >> >>>> The second part (more important as an incentive to the ISPs >> >>>> actually), >> >>>> how does the fraction of goodput vs. throughput change, when AQM >> >>>> schemes are deployed, and TCP CC reacts in a timely manner? Small >> >>>> ISPs >> >>>> have to pay for their upstream volume, regardless if that is "real" >> >>>> work (goodput) or unneccessary retransmissions. >> >>>> >> >>>> When I was at a small cable ISP in switzerland last week, surely >> >>>> enough bufferbloat was readily observable (17ms -> 220ms after 30 >> >>>> sec >> >>>> of a bulk transfer), but at first they had the "not our problem" >> >>>> view, >> >>>> until I started discussing burst loss / retransmissions / goodput vs >> >>>> throughput - with the latest point being a real commercial incentive >> >>>> to them. (They promised to check if AQM would be available in the >> >>>> CPE >> >>>> / CMTS, and put latency bounds in their tenders going forward). >> >>>> >> >>> I wish I had a good answer to your very good questions. Simulation >> >>> would be interesting though real daa is more convincing. >> >>> >> >>> I haven't looked in detail at all that many traces to try to get a >> >>> feel >> >>> for how much bandwidth waste there actually is, and more formal >> >>> studies >> >>> like Netalyzr, SamKnows, or the Bismark project would be needed to >> >>> quantify the loss on the network as a whole. >> >>> >> >>> I did spend some time last fall with the traces I've taken. In >> >>> those, >> >>> I've typically been seeing 1-3% packet loss in the main TCP >> >>> transfers. >> >>> On the wireless trace I took, I saw 9% loss, but whether that is >> >>> bufferbloat induced loss or not, I don't know (the data is out there >> >>> for >> >>> those who might want to dig). And as you note, the losses are >> >>> concentrated in bursts (probably due to the details of Cubic, so I'm >> >>> told). >> >>> >> >>> I've had anecdotal reports (and some first hand experience) with much >> >>> higher loss rates, for example from Nick Weaver at ICSI; but I >> >>> believe >> >>> in playing things conservatively with any numbers I quote and I've >> >>> not >> >>> gotten consistent results when I've tried, so I just report what's in >> >>> the packet captures I did take. 
>> >>> >> >>> A phenomena that could be occurring is that during congestion >> >>> avoidance >> >>> (until TCP loses its cookies entirely and probes for a higher >> >>> operating >> >>> point) that TCP is carefully timing it's packets to keep the buffers >> >>> almost exactly full, so that competing flows (in my case, simple >> >>> pings) >> >>> are likely to arrive just when there is no buffer space to accept >> >>> them >> >>> and therefore you see higher losses on them than you would on the >> >>> single >> >>> flow I've been tracing and getting loss statistics from. >> >>> >> >>> People who want to look into this further would be a great help. >> >>> - Jim >> >> >> >> I would not put a lot of trust in measuring loss with pings. >> >> I heard that some ISP's do different processing on ICMP's used >> >> for ping packets. They either prioritize them high to provide >> >> artificially good response (better marketing numbers); or >> >> prioritize them low since they aren't useful traffic. >> >> There are also filters that only allow N ICMP requests per second >> >> which means repeated probes will be dropped. >> >> >> >> >> >> >> >> -- >> >> _______________________________________________ >> >> Bloat mailing list >> >> Bloat@lists.bufferbloat.net >> >> https://lists.bufferbloat.net/listinfo/bloat >> > >> > _______________________________________________ >> > Bloat mailing list >> > Bloat@lists.bufferbloat.net >> > https://lists.bufferbloat.net/listinfo/bloat >> >> _______________________________________________ >> Bloat mailing list >> Bloat@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/bloat > > ^ permalink raw reply [flat|nested] 66+ messages in thread
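For what it's worth, the arithmetic implied by the "pad with throwaway frames" idea in Richard's message above works out as follows. This is a sketch under my own assumptions: pad frames the same size as the data frames, and a single flow being paced:

    # How many throwaway frames (dst MAC == src MAC, dropped at the first switch)
    # would need to be interleaved per data frame to pace one flow down from
    # line rate to a target rate, assuming pad and data frames are equal size.
    def pad_frames_per_data_frame(line_rate_bps, target_rate_bps):
        return line_rate_bps / target_rate_bps - 1

    print(pad_frames_per_data_frame(1e9, 100e6))   # 9.0   pad frames per data frame
    print(pad_frames_per_data_frame(1e9, 34e6))    # ~28.4 pad frames per data frame
    # Obvious cost: the pad frames still burn wire time and switch capacity on the
    # first hop, which is why pacing in the qdisc or with hardware timers is the
    # more usual answer.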
* Re: [Bloat] Burst Loss 2011-05-11 8:53 ` Richard Scheffenegger @ 2011-05-11 9:53 ` Eric Dumazet 2011-05-12 14:16 ` [Bloat] Publications Richard Scheffenegger 0 siblings, 1 reply; 66+ messages in thread From: Eric Dumazet @ 2011-05-11 9:53 UTC (permalink / raw) To: Richard Scheffenegger; +Cc: Stephen Hemminger, bloat Le mercredi 11 mai 2011 à 10:53 +0200, Richard Scheffenegger a écrit : > Well, the transmit descriptors (header + pointer to the data to be > segmented) is in the hand of the hw driver... > The hw driver could at least check if the current list of transmit > descriptors is for different tcp sessions > (or interspaced non-tcp traffic), and could interleave these descriptors > (reorder them, before they are processed > by hardware - while obviously maintaining relative ordering between the > descriptors belonging to the same flow. > > Also, I think this feature could be utilized for pacing to some extent - > interspace the (valid) traffic descriptors > with descriptors that will cause "invalid" packets to be sent (ie. dst mac > == src max; should be dropped by the first switch). It's been well known > that properly paced traffic is much more resilient than traffic being sent > in short bursts of wirespeed trains of packets. (TSO defeats the > self-clocking of TCP with ACKs). In French, we would say "Avoir le beurre et l'argent du beurre" ;) GSO is for high performance data xmits, usually in LAN. Dont expect NICS perform the hard/smart work for you. Of course hardware vendors claim they can do this, but this is mostly done with vendor specific methods, and you might spend a lot of time tuning hardware. If you want AQM, better use a well chosen qdisc setup (depending on the workload), and disable TSO/GSO. This will work well with all hardware, and presumably last for longer times (including hardware changes) ^ permalink raw reply [flat|nested] 66+ messages in thread
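A rough sketch of the kind of setup Eric suggests: turn off TSO/GSO on the interface and install a queueing discipline. The interface name and the choice of SFQ here are illustrative assumptions on my part, not a recommendation from this thread; pick a qdisc to match the workload:

    import subprocess

    def disable_offloads_and_add_qdisc(dev="eth0"):
        # Disable segmentation offloads so the stack emits normally sized frames.
        subprocess.run(["ethtool", "-K", dev, "tso", "off", "gso", "off"], check=True)
        # Install a (workload-dependent) qdisc; SFQ is just an example here.
        subprocess.run(["tc", "qdisc", "replace", "dev", dev, "root",
                        "sfq", "perturb", "10"], check=True)

    if __name__ == "__main__":
        disable_offloads_and_add_qdisc("eth0")   # needs root privileges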
* [Bloat] Publications 2011-05-11 9:53 ` Eric Dumazet @ 2011-05-12 14:16 ` Richard Scheffenegger 0 siblings, 0 replies; 66+ messages in thread From: Richard Scheffenegger @ 2011-05-12 14:16 UTC (permalink / raw) To: bloat Multimedia-unfriendly TCP Congestion Control and Home Gateway Queue Management http://caia.swin.edu.au/~gja/papers/mmsys2011-lstewart-p35.pdf Two-way TCP Connections: Old Problem, New Insight http://ccr.sigcomm.org/online/files/p6-v41n2b2-heussePS.pdf (actually, they look at two antiparallel TCP connections, not individual two-way TCP connections :). Both papers have something to say about buffer sizing in home CPE gear... Regards, Richard ^ permalink raw reply [flat|nested] 66+ messages in thread

* Re: [Bloat] Burst Loss 2011-05-09 18:06 ` Rick Jones 2011-05-11 8:53 ` Richard Scheffenegger @ 2011-05-12 16:31 ` Fred Baker 2011-05-12 16:41 ` Rick Jones 2011-05-13 5:00 ` Kevin Gross 1 sibling, 2 replies; 66+ messages in thread From: Fred Baker @ 2011-05-12 16:31 UTC (permalink / raw) To: rick.jones2; +Cc: Stephen Hemminger, bloat On May 9, 2011, at 11:06 AM, Rick Jones wrote: > GSO/TSO can be thought of as a symptom of standards bodies (eg the IEEE) > refusing to standardize an increase in frame sizes. Put another way, > they are a "poor man's jumbo frames." I'll agree, but only half; once the packets are transferred on the local wire, any jumbo-ness is lost. GSO/TSO mostly squeezes interframe gaps out of the wire and perhaps limits the amount of work the driver has to do. The real value of an end to end (IP) jumbo frame is that the receiving system experiences less interrupt load - a 9K frame replaces half a dozen 1500 byte frames, and as a result the receiver experiences 1/5 or 1/6 of the interrupts. Given that it has to save state, activate the kernel thread, and at least enqueue and perhaps acknowledge the received message, reducing interrupt load on the receiver makes it far more effective. This has the greatest effect on multi-gigabit file transfers. ^ permalink raw reply [flat|nested] 66+ messages in thread
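Fred's "1/5 or 1/6 of the interrupts" point is easy to check with a quick calculation. This sketch deliberately ignores interrupt coalescing (which Rick raises next) and counts one receiver notification per frame:

    # Frames per second needed to sustain a given rate with 1500-byte vs
    # 9000-byte frames (headers ignored for simplicity).
    def frames_per_second(rate_bps, frame_bytes):
        return rate_bps / (frame_bytes * 8)

    for rate in (1e9, 10e9):
        small = frames_per_second(rate, 1500)
        jumbo = frames_per_second(rate, 9000)
        print(f"{rate/1e9:.0f} Gbit/s: {small:,.0f} frames/s at 1500B, "
              f"{jumbo:,.0f} at 9000B (ratio {small/jumbo:.0f}x)")
    # 1 Gbit/s:  83,333 frames/s at 1500B,  13,889 at 9000B (ratio 6x)
    # 10 Gbit/s: 833,333 frames/s at 1500B, 138,889 at 9000B (ratio 6x)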
* Re: [Bloat] Burst Loss 2011-05-12 16:31 ` [Bloat] Burst Loss Fred Baker @ 2011-05-12 16:41 ` Rick Jones 2011-05-12 17:11 ` Fred Baker 2011-05-13 5:00 ` Kevin Gross 1 sibling, 1 reply; 66+ messages in thread From: Rick Jones @ 2011-05-12 16:41 UTC (permalink / raw) To: Fred Baker; +Cc: Stephen Hemminger, bloat On Thu, 2011-05-12 at 09:31 -0700, Fred Baker wrote: > On May 9, 2011, at 11:06 AM, Rick Jones wrote: > > > GSO/TSO can be thought of as a symptom of standards bodies (eg the IEEE) > > refusing to standardize an increase in frame sizes. Put another way, > > they are a "poor man's jumbo frames." > > I'll agree, but only half; once the packets are transferred on the > local wire, any jumbo-ness is lost. That is why I called them "poor man's" - he can't have everything :) > GSO/TSO mostly squeezes interframe gaps out of the wire and perhaps > limits the amount of work the driver has to do. The real value of an > end to end (IP) jumbo frame is that the receiving system experiences > less interrupt load - a 9K frame replaces half a dozen 1500 byte > frames, and as a result the receiver experiences 1/5 or 1/6 of the > interrupts. Given that it has to save state, activate the kernel > thread, and at least enqueue and perhaps acknowledge the received > message, reducing interrupt load on the receiver makes it far more > effective. This has the greatest effect on multi-gigabit file > transfers. Perhaps I'm trying to argue about the number of angels which can dance on the head of a pin, but isn't mitigating interrupt rates something that NICs and their drivers (and NAPI in the context of Linux) been doing for years? Or are you using "interrupt" to refer to the entire trip up the protocol stack and not just "interupts?" And then there is GRO/LRO. Of course as all the world is not bulk flows, one still has to write a nice, tight, stack and driver :) rick jones ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Burst Loss 2011-05-12 16:41 ` Rick Jones @ 2011-05-12 17:11 ` Fred Baker 0 siblings, 0 replies; 66+ messages in thread From: Fred Baker @ 2011-05-12 17:11 UTC (permalink / raw) To: rick.jones2; +Cc: Stephen Hemminger, bloat On May 12, 2011, at 9:41 AM, Rick Jones wrote: > Perhaps I'm trying to argue about the number of angels which can dance > on the head of a pin, but isn't mitigating interrupt rates something > that NICs and their drivers (and NAPI in the context of Linux) been > doing for years? > > Or are you using "interrupt" to refer to the entire trip up the protocol > stack and not just "interupts?" It's the stack up to the API to the application, which of course receives the data when it reads the socket. ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Burst Loss 2011-05-12 16:31 ` [Bloat] Burst Loss Fred Baker 2011-05-12 16:41 ` Rick Jones @ 2011-05-13 5:00 ` Kevin Gross 2011-05-13 14:35 ` Rick Jones 1 sibling, 1 reply; 66+ messages in thread From: Kevin Gross @ 2011-05-13 5:00 UTC (permalink / raw) To: bloat [-- Attachment #1: Type: text/plain, Size: 1571 bytes --] One of the principal reasons jumbo frames have not been standardized is due to latency concerns. I assume this group can appreciate the IEEE holding ground on this. For a short time, servers with gigabit NICs suffered but smarter NICs were developed (TSO, LRO, other TLAs) and OSs upgraded to support them and I believe it is no longer a significant issue. Kevin Gross On Thu, May 12, 2011 at 10:31 AM, Fred Baker <fred@cisco.com> wrote: > > On May 9, 2011, at 11:06 AM, Rick Jones wrote: > > > GSO/TSO can be thought of as a symptom of standards bodies (eg the IEEE) > > refusing to standardize an increase in frame sizes. Put another way, > > they are a "poor man's jumbo frames." > > I'll agree, but only half; once the packets are transferred on the local > wire, any jumbo-ness is lost. GSO/TSO mostly squeezes interframe gaps out of > the wire and perhaps limits the amount of work the driver has to do. The > real value of an end to end (IP) jumbo frame is that the receiving system > experiences less interrupt load - a 9K frame replaces half a dozen 1500 byte > frames, and as a result the receiver experiences 1/5 or 1/6 of the > interrupts. Given that it has to save state, activate the kernel thread, and > at least enqueue and perhaps acknowledge the received message, reducing > interrupt load on the receiver makes it far more effective. This has the > greatest effect on multi-gigabit file transfers. > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat > [-- Attachment #2: Type: text/html, Size: 1997 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Burst Loss 2011-05-13 5:00 ` Kevin Gross @ 2011-05-13 14:35 ` Rick Jones 2011-05-13 14:54 ` Dave Taht 2011-05-13 19:32 ` Denton Gentry 0 siblings, 2 replies; 66+ messages in thread From: Rick Jones @ 2011-05-13 14:35 UTC (permalink / raw) To: Kevin Gross; +Cc: bloat On Thu, 2011-05-12 at 23:00 -0600, Kevin Gross wrote: > One of the principal reasons jumbo frames have not been standardized > is due to latency concerns. I assume this group can appreciate the > IEEE holding ground on this. Thusfar at least, bloaters are fighting to eliminate 10s of milliseconds of queuing delay. I don't think this list is worrying about the tens of microseconds difference between the transmission time of a 9000 byte frame at 1 GbE vs a 1500 byte frame, or the single digit microseconds difference at 10 GbE. The "lets try to get onto the Top 500 list" crowd might, but official sanction for a 9000 byte MTU (or larger) doesn't mean it *must* be used. > For a short time, servers with gigabit NICs suffered but smarter NICs > were developed (TSO, LRO, other TLAs) and OSs upgraded to support them > and I believe it is no longer a significant issue. Are TSO and LRO going to be sufficient at 40 and 100 GbE? Cores aren't getting any faster. Only more plentiful. And while it isn't the strongest point in the world, one might even argue that the need to use TSO/LRO to achieve performance hinders new transport protocol adoption - the presence of NIC offloads for only TCP (or UDP) leaves a new transport protocol (perhaps SCTP) at a disadvantage. rick jones > Kevin Gross > > On Thu, May 12, 2011 at 10:31 AM, Fred Baker <fred@cisco.com> wrote: > > On May 9, 2011, at 11:06 AM, Rick Jones wrote: > > > GSO/TSO can be thought of as a symptom of standards bodies > (eg the IEEE) > > refusing to standardize an increase in frame sizes. Put > another way, > > they are a "poor man's jumbo frames." > > I'll agree, but only half; once the packets are transferred on > the local wire, any jumbo-ness is lost. GSO/TSO mostly > squeezes interframe gaps out of the wire and perhaps limits > the amount of work the driver has to do. The real value of an > end to end (IP) jumbo frame is that the receiving system > experiences less interrupt load - a 9K frame replaces half a > dozen 1500 byte frames, and as a result the receiver > experiences 1/5 or 1/6 of the interrupts. Given that it has to > save state, activate the kernel thread, and at least enqueue > and perhaps acknowledge the received message, reducing > interrupt load on the receiver makes it far more effective. > This has the greatest effect on multi-gigabit file transfers. > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat > > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 66+ messages in thread
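The serialization numbers Rick refers to, as a quick sketch:

    # Wire (serialization) time of a 1500-byte vs a 9000-byte frame.
    def wire_time_us(frame_bytes, rate_bps):
        return frame_bytes * 8 / rate_bps * 1e6

    for rate, name in ((1e9, "1 GbE"), (10e9, "10 GbE")):
        t1500 = wire_time_us(1500, rate)
        t9000 = wire_time_us(9000, rate)
        print(f"{name}: 1500B = {t1500:.1f} us, 9000B = {t9000:.1f} us, "
              f"difference = {t9000 - t1500:.1f} us")
    # 1 GbE:  1500B = 12.0 us, 9000B = 72.0 us, difference = 60.0 us
    # 10 GbE: 1500B = 1.2 us,  9000B = 7.2 us,  difference = 6.0 us
    # Each store-and-forward switch hop adds one such serialization time.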
* Re: [Bloat] Burst Loss 2011-05-13 14:35 ` Rick Jones @ 2011-05-13 14:54 ` Dave Taht 2011-05-13 20:03 ` [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) Kevin Gross ` (2 more replies) 2011-05-13 19:32 ` Denton Gentry 1 sibling, 3 replies; 66+ messages in thread From: Dave Taht @ 2011-05-13 14:54 UTC (permalink / raw) To: rick.jones2; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 1280 bytes --] On Fri, May 13, 2011 at 8:35 AM, Rick Jones <rick.jones2@hp.com> wrote: > On Thu, 2011-05-12 at 23:00 -0600, Kevin Gross wrote: > > One of the principal reasons jumbo frames have not been standardized > > is due to latency concerns. I assume this group can appreciate the > > IEEE holding ground on this. > > Thusfar at least, bloaters are fighting to eliminate 10s of milliseconds > of queuing delay. I don't think this list is worrying about the tens of > microseconds difference between the transmission time of a 9000 byte > frame at 1 GbE vs a 1500 byte frame, or the single digit microseconds > difference at 10 GbE. > Heh. With the first iteration of the bismark project I'm trying to get to where I have less than 30ms latency under load and have far larger problems to worry about than jumbo frames. I'll be lucky to manage 1/10th that (300ms) at this point. Not, incidentally that I mind the idea of jumbo frames. It seems silly to be saddled with default frame sizes that made sense in the 70s, and in an age where we will be seeing ever more packet encapsulation, reducing the header size as a ratio to data size strikes me as a very worthy goal. -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://the-edge.blogspot.com [-- Attachment #2: Type: text/html, Size: 1699 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
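Dave's point about the header-to-data ratio under encapsulation can be put in numbers. A sketch, assuming IPv4 + TCP without options (40 bytes), Ethernet header/FCS/preamble/IPG counted as 38 bytes on the wire, and a hypothetical 50-byte tunnel header as the extra encapsulation:

    ETH_WIRE = 38   # Ethernet header + FCS + preamble + inter-packet gap
    IP_TCP = 40     # IPv4 + TCP, no options

    def efficiency(mtu, encap=0):
        payload = mtu - IP_TCP - encap
        return payload / (mtu + ETH_WIRE)

    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {efficiency(mtu):.1%} plain, "
              f"{efficiency(mtu, encap=50):.1%} with 50B of encapsulation")
    # MTU 1500: 94.9% plain, 91.7% with 50B of encapsulation
    # MTU 9000: 99.1% plain, 98.6% with 50B of encapsulation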
* [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) 2011-05-13 14:54 ` Dave Taht @ 2011-05-13 20:03 ` Kevin Gross 2011-05-14 20:48 ` Fred Baker [not found] ` <-4629065256951087821@unknownmsgid> 2011-05-13 22:08 ` [Bloat] Burst Loss david 2 siblings, 1 reply; 66+ messages in thread From: Kevin Gross @ 2011-05-13 20:03 UTC (permalink / raw) To: bloat [-- Attachment #1: Type: text/plain, Size: 2529 bytes --] Do we think that bufferbloat is just a WAN problem? I work on live media applications for LANs and campus networks. I'm seeing what I think could be characterized as bufferbloat in LAN equipment. The timescales on 1 Gb Ethernet are orders of magnitude shorter and the performance problems caused are in many cases a bit different but root cause and potential solutions are, I'm hoping, very similar. Keeping the frame byte size small while the frame time has shrunk maintains the overhead at the same level. Again, this has been a conscious decision not a stubborn relic. Ethernet improvements have increased bandwidth by orders of magnitude. Do we really need to increase it by a couple percentage points more by reducing overhead for large payloads? The cost of that improved marginal bandwidth efficiency is a 6x increase in latency. Many applications would not notice an increase from 12 us to 72 us for a Gigabit switch hop. But on a large network it adds up, some applications are absolutely that sensitive (transaction processing, cluster computing, SANs) and (I thought I'd be preaching to the choir here) there's no way to ever recover the lost performance. Kevin Gross From: Dave Taht [mailto:dave.taht@gmail.com] Sent: Friday, May 13, 2011 8:54 AM To: rick.jones2@hp.com Cc: Kevin Gross; bloat@lists.bufferbloat.net Subject: Re: [Bloat] Burst Loss On Fri, May 13, 2011 at 8:35 AM, Rick Jones <rick.jones2@hp.com> wrote: On Thu, 2011-05-12 at 23:00 -0600, Kevin Gross wrote: > One of the principal reasons jumbo frames have not been standardized > is due to latency concerns. I assume this group can appreciate the > IEEE holding ground on this. Thusfar at least, bloaters are fighting to eliminate 10s of milliseconds of queuing delay. I don't think this list is worrying about the tens of microseconds difference between the transmission time of a 9000 byte frame at 1 GbE vs a 1500 byte frame, or the single digit microseconds difference at 10 GbE. Heh. With the first iteration of the bismark project I'm trying to get to where I have less than 30ms latency under load and have far larger problems to worry about than jumbo frames. I'll be lucky to manage 1/10th that (300ms) at this point. Not, incidentally that I mind the idea of jumbo frames. It seems silly to be saddled with default frame sizes that made sense in the 70s, and in an age where we will be seeing ever more packet encapsulation, reducing the header size as a ratio to data size strikes me as a very worthy goal. [-- Attachment #2: Type: text/html, Size: 8491 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
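To put Kevin's "on a large network it adds up" in numbers - the cumulative extra store-and-forward latency of jumbo frames across several Gigabit hops (a sketch; the hop counts are arbitrary examples and queueing is ignored):

    # Extra store-and-forward latency of a 9000-byte frame vs a 1500-byte frame,
    # accumulated over N Gigabit switch hops (serialization only).
    def extra_us(hops, rate_bps=1e9, big=9000, small=1500):
        return hops * (big - small) * 8 / rate_bps * 1e6

    for hops in (1, 3, 5):
        print(f"{hops} hop(s): +{extra_us(hops):.0f} us")
    # 1 hop(s): +60 us,  3 hop(s): +180 us,  5 hop(s): +300 us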
* Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) 2011-05-13 20:03 ` [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) Kevin Gross @ 2011-05-14 20:48 ` Fred Baker 2011-05-15 18:28 ` Jonathan Morton 2011-05-17 7:49 ` BeckW 0 siblings, 2 replies; 66+ messages in thread From: Fred Baker @ 2011-05-14 20:48 UTC (permalink / raw) To: Kevin Gross; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 5084 bytes --] On May 13, 2011, at 1:03 PM, Kevin Gross wrote: > Do we think that bufferbloat is just a WAN problem? I work on live media applications for LANs and campus networks. I'm seeing what I think could be characterized as bufferbloat in LAN equipment. The timescales on 1 Gb Ethernet are orders of magnitude shorter and the performance problems caused are in many cases a bit different but root cause and potential solutions are, I'm hoping, very similar. Bufferbloat is most noticeable on WANs, because they have longer delays, but yes, LAN equipment does the same thing. It shows up as extended delay or as an increase in loss rates. A lot of LAN equipment has very shallow buffers due to cost (LAN markets are very cost-sensitive). One myth with bufferbloat is that a reasonable solution is to make the buffer shallow; no, because when the queue fills you now have an increased loss rate, which shows up in timeout-driven retransmissions - you really want a deep buffer (for bursts and temporary surges) that you keep shallow using AQM techniques. > Keeping the frame byte size small while the frame time has shrunk maintains the overhead at the same level. Again, this has been a conscious decision not a stubborn relic. Ethernet improvements have increased bandwidth by orders of magnitude. Do we really need to increase it by a couple percentage points more by reducing overhead for large payloads? You might talk with the folks who do the LAN speed records. They generally view end-to-end jumbo frames as material to the achievement. It's not about changing the serialization delay; it's about changing the amount of processing at the endpoints. > The cost of that improved marginal bandwidth efficiency is a 6x increase in latency. Many applications would not notice an increase from 12 us to 72 us for a Gigabit switch hop. But on a large network it adds up, some applications are absolutely that sensitive (transaction processing, cluster computing, SANs) and (I thought I'd be preaching to the choir here) there's no way to ever recover the lost performance. Well, the extra delay is solvable in the transport. The question isn't really what the impact on the network is; it's what the requirements of the application are. For voice, if a voice sample is delayed 50 ms, the jitter buffer in the codec resolves that - microseconds are irrelevant. Video codecs generally keep at least three video frames in their jitter buffer; at 30 fps, that's 100 milliseconds of acceptable variation in delay. Where it gets dicey is in elastic applications (applications using transports with the characteristics of TCP) that are retransmitting or otherwise reacting in timeframes comparable to the RTT when the RTT is small, or in elastic applications in which the timeout-retransmission interval is on the order of hundreds of milliseconds to seconds (true of most TCPs) but the RTT is on the order of microseconds to milliseconds. In the former, a deep queue build-up can trigger a transmission that further builds the queue; in the latter, a hiccup can have dramatic side effects.
There is ongoing research on how best to do such things in data centers. My suspicion is that the right approach is something akin to 802.2 at the link layer, but with NACK retransmission - system A enumerates the data it sends to system B, and if system B sees a number skip it asks A to retransmit the indicated datagram. You might take a look at RFC 5401/5740/5776 for implementation suggestions. > Kevin Gross > > From: Dave Taht [mailto:dave.taht@gmail.com] > Sent: Friday, May 13, 2011 8:54 AM > To: rick.jones2@hp.com > Cc: Kevin Gross; bloat@lists.bufferbloat.net > Subject: Re: [Bloat] Burst Loss > > > > On Fri, May 13, 2011 at 8:35 AM, Rick Jones <rick.jones2@hp.com> wrote: > On Thu, 2011-05-12 at 23:00 -0600, Kevin Gross wrote: > > One of the principal reasons jumbo frames have not been standardized > > is due to latency concerns. I assume this group can appreciate the > > IEEE holding ground on this. > > Thusfar at least, bloaters are fighting to eliminate 10s of milliseconds > of queuing delay. I don't think this list is worrying about the tens of > microseconds difference between the transmission time of a 9000 byte > frame at 1 GbE vs a 1500 byte frame, or the single digit microseconds > difference at 10 GbE. > > Heh. With the first iteration of the bismark project I'm trying to get to where I have less than 30ms latency under load and have far larger problems to worry about than jumbo frames. I'll be lucky to manage 1/10th that (300ms) at this point. > > Not, incidentally that I mind the idea of jumbo frames. It seems silly to be saddled with default frame sizes that made sense in the 70s, and in an age where we will be seeing ever more packet encapsulation, reducing the header size as a ratio to data size strikes me as a very worthy goal. > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat [-- Attachment #2: Type: text/html, Size: 15595 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
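A toy sketch of the NACK-with-bounded-retention scheme Fred outlines above - loosely NORM-like; the class and its parameters are illustrative assumptions, not an implementation of RFC 5740:

    import time

    class NackSender:
        """Keep sent frames for a fixed retention interval; retransmit a frame only
        if the receiver NACKs its sequence number within that window."""
        def __init__(self, retention_s=0.004):      # ~4 ms, per Fred's example above
            self.retention_s = retention_s
            self.sent = {}                           # seq -> (timestamp, frame)
            self.next_seq = 0

        def send(self, frame, tx):
            seq = self.next_seq
            self.next_seq += 1
            self.sent[seq] = (time.monotonic(), frame)
            tx(seq, frame)
            self._expire()
            return seq

        def on_nack(self, seq, tx):
            entry = self.sent.get(seq)
            if entry:                                # still retained: link-level repair
                tx(seq, entry[1])
            # else: too old - leave recovery to the end-to-end (TCP) retransmission
            self._expire()

        def _expire(self):
            now = time.monotonic()
            for s in [s for s, (t, _) in self.sent.items() if now - t > self.retention_s]:
                del self.sent[s]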
* Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) 2011-05-14 20:48 ` Fred Baker @ 2011-05-15 18:28 ` Jonathan Morton 2011-05-15 20:49 ` Fred Baker 2011-05-17 7:49 ` BeckW 1 sibling, 1 reply; 66+ messages in thread From: Jonathan Morton @ 2011-05-15 18:28 UTC (permalink / raw) To: Fred Baker; +Cc: bloat On 14 May, 2011, at 11:48 pm, Fred Baker wrote: > My suspicion is that the right approach is something akin to 802.2 at the link layer, but with NACK retransmission - system A enumerates the data it sends to system B, and if system B sees a number skip it asks A to retransmit the indicated datagram. You might take a look at RFC 5401/5740/5776 for implementation suggestions. This sounds like "reliable datagram" semantics to me. It also sounds a lot like ARQ as used in amateur packet radio. I believe similar mechanisms are built into 802.11. The fundamental thing is that the sender must be able to know when sent frames can be flushed from the buffer because they don't need to be retransmitted. So if there's a NACK, there must also be an ACK - at which point the ACK serves the purpose of the NACK, as it does in TCP. The only alternative is a wall-time TTL, which is doable on single hops but requires careful design. Let's face it. UDP is unreliable by design - applications using it *must* anticipate and cope with dropped and delayed packets, either by exponential RTO or ARQ or NACK or FEC, all at the application layer. And, in a congested network, some UDP packets *will* be lost. TCP is reliable but needs to maintain appropriate window sizes - which it doesn't at present because a lossless network without ECN provides insufficient feedback (and AQM, which is required for good ECN signals, is usually absent), and in the quest for performance, the trend has been inexorably towards more aggressive window sizing (of which TCP-Fit is the latest example). At the receiver end, it is possible to restrain this trend by reducing the receive window. Unfortunately, it's useless to expect Ethernet switches to turn on ECN. They operate at a lower stack level than IP, so they will not modify the IP TOS headers. However, recent versions of Ethernet *do* support a throttling feedback mechanism, and this can and should be exploited to tell the edge host or router that ECN *might* be needed. Also, with throttling feedback throughout the LAN, the Ethernet can for practical purposes be treated as almost-reliable. This is *better* in terms of packet loss than ARQ or NACK, although if the Ethernet's buffers are large, it will still increase delay. (With small buffers, it will just decrease throughput to the capacity, which is fine.) - Jonathan ^ permalink raw reply [flat|nested] 66+ messages in thread
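Clamping the receive window from the receiving end, as Jonathan mentions, can be done per socket. A minimal sketch; the 64 KB figure is an arbitrary example, and the kernel may round or double the value requested:

    import socket

    def make_clamped_listener(port, rcvbuf=64 * 1024):
        """Listening socket whose accepted connections advertise a bounded
        receive window, limiting how much the sender can keep in flight."""
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Set before listen/accept so window scaling is negotiated against
        # the smaller buffer.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind(("", port))
        s.listen(5)
        return s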
* Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) 2011-05-15 18:28 ` Jonathan Morton @ 2011-05-15 20:49 ` Fred Baker 2011-05-16 0:31 ` Jonathan Morton 0 siblings, 1 reply; 66+ messages in thread From: Fred Baker @ 2011-05-15 20:49 UTC (permalink / raw) To: Jonathan Morton; +Cc: bloat On May 15, 2011, at 11:28 AM, Jonathan Morton wrote: > The fundamental thing is that the sender must be able to know when sent frames can be flushed from the buffer because they don't need to be retransmitted. So if there's a NACK, there must also be an ACK - at which point the ACK serves the purpose of the NACK, as it does in TCP. The only alternative is a wall-time TTL, which is doable on single hops but requires careful design. To a point. NORM holds a frame for possible retransmission for a stated period of time, and if retransmission isn't requested in that interval forgets it. So the ack isn't actually necessary; what is necessary is that the retention interval be long enough that a nack has a high probability of succeeding in getting the message through. A 100 Gbit interface can handle roughly 97,656 128-byte frames per millisecond (100G/(8*128*1000)). We're looking at something on the order of 18 bits (4 ms to retransmit without falling back to TCP) for a rational sequence number at 100 Gbps; 16 bits would be enough at 10 Gbps, and 12 bits would be enough at 1 Gbps. > ...recent versions of Ethernet *do* support a throttling feedback mechanism, and this can and should be exploited to tell the edge host or router that ECN *might* be needed. Also, with throttling feedback throughout the LAN, the Ethernet can for practical purposes be treated as almost-reliable. This is *better* in terms of packet loss than ARQ or NACK, although if the Ethernet's buffers are large, it will still increase delay. (With small buffers, it will just decrease throughput to the capacity, which is fine.) It increases the delay anyway. It just pushes the retention buffer to another place. What do you think the packet is doing during the "don't transmit" interval? Throughput never exceeds capacity. If I have a 10 Gbps link, I will never get more than 10 Gbps through it. Buffer fill rate is statistically predictable. With small buffers, the fill rate reaches the top sooner. They increase the probability that the buffers are full, which is to say the drop probability. Which puts us back to end-to-end retransmission, which is the worst case of what you were worried about. I'm not going to argue against letting retransmission go end to end; it's an endless debate. I'll simply note that several link layers, including but not limited to those you mention, find that applications using them work better if there is a high probability of retransmission in an interval on the order of the link RTT as opposed to the end-to-end RTT. You brought up data centers (aka variable delays in LAN networks); those have been heavily the province of Fibre Channel, which is a link-layer protocol with retransmission. Think about it. ^ permalink raw reply [flat|nested] 66+ messages in thread
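Fred's sequence-number sizing, worked through (assuming the same 128-byte frame size and 4 ms repair window he uses above):

    import math

    def seq_bits(rate_bps, window_s=0.004, frame_bytes=128):
        frames = rate_bps / (frame_bytes * 8) * window_s   # frames per repair window
        return frames, math.ceil(math.log2(frames))

    for rate, name in ((1e9, "1 Gbps"), (10e9, "10 Gbps"), (100e9, "100 Gbps")):
        frames, bits = seq_bits(rate)
        print(f"{name}: ~{frames:,.0f} frames per 4 ms -> {bits} bits")
    # 1 Gbps:   ~3,906 frames per 4 ms   -> 12 bits
    # 10 Gbps:  ~39,063 frames per 4 ms  -> 16 bits
    # 100 Gbps: ~390,625 frames per 4 ms -> 19 bits (i.e. on the order of 18)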
* Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) 2011-05-15 20:49 ` Fred Baker @ 2011-05-16 0:31 ` Jonathan Morton 2011-05-16 7:51 ` Richard Scheffenegger 0 siblings, 1 reply; 66+ messages in thread From: Jonathan Morton @ 2011-05-16 0:31 UTC (permalink / raw) To: Fred Baker; +Cc: bloat On 15 May, 2011, at 11:49 pm, Fred Baker wrote: > > On May 15, 2011, at 11:28 AM, Jonathan Morton wrote: >> The fundamental thing is that the sender must be able to know when sent frames can be flushed from the buffer because they don't need to be retransmitted. So if there's a NACK, there must also be an ACK - at which point the ACK serves the purpose of the NACK, as it does in TCP. The only alternative is a wall-time TTL, which is doable on single hops but requires careful design. > > To a point. NORM holds a frame for possible retransmission for a stated period of time, and if retransmission isn't requested in that interval forgets it. So the ack isn't actually necessary; what is necessary is that the retention interval be long enough that a nack has a high probability of succeeding in getting the message through. Okay, so because it can fall back to TCP's retransmit, the retention requirements can be relaxed. >> ...recent versions of Ethernet *do* support a throttling feedback mechanism, and this can and should be exploited to tell the edge host or router that ECN *might* be needed. Also, with throttling feedback throughout the LAN, the Ethernet can for practical purposes be treated as almost-reliable. This is *better* in terms of packet loss than ARQ or NACK, although if the Ethernet's buffers are large, it will still increase delay. (With small buffers, it will just decrease throughput to the capacity, which is fine.) > > It increases the delay anyway. It just pushes the retention buffer to another place. What do you think the packet is doing during the "don't transmit" interval? Most packets delayed by Ethernet throttling would, with small buffers, end up waiting in the sending host (or router). They thus spend more time in a potentially active queue instead of in a dumb one. But even if the host queue is dumb, the overall delay is no worse than with the larger Ethernet buffers. > Throughput never exceeds capacity. If I have a 10 GBPS link, I will never get more than 10 GBPS through it. Buffer fill rate is statistically predictable. With small buffers, the fill rate acheives the top sooner. They increase the probability that the buffers are full, which is to say the drop probability. Which puts us to an end to end retransmission, which is the worst case of what you were worried about. Let's suppose someone has generously provisioned an office with GigE throughout, using a two-level hierarchy of switches. Some dumb schmuck then schedules every single computer to run it's backups (to a single fileserver) at the same time. That's say 100 computers all competing for one GigE link to the fileserver. If the switches are fair, each computer should get 10Mbps - that's the capacity. With throttling, each computer sees the link closed 99% of the time. It can send at link rate for the remaining 1% of the time. On medium timescales, that looks like a 10Mbps bottleneck at the first link. So the throughput on that link equals the capacity, and hopefully the goodput is also thus. The only queue that is likely to overflow is the one on the sending computer, and one would hope there is enough feedback in a host's own TCP/IP stack to prevent that. 
Without throttling but with ARQ, NACK or whatever you want to call it, the host has no signal to tell it to slow down - so the throughput on the edge link is more than 10Mbps (but the goodput will be less). The buffer in the outer switch fills up - no matter how big or small it is - and starts dropping packets. The switch then won't ask for retransmission of packets it's just dropped, because it has nowhere to put them. The same process then repeats at the inner switch. Finally, the server sees the missing packets, and asks for the retransmission - but these requests have to be switched all the way back to the clients, because the missing packets aren't in the switches' buffers. It's therefore no better than a TCP SACK retransmission. So there you have a classic congested network scenario in which throttling solves the problem, but link-level retransmission can't. Where ARQ and/or NACK come in handy is where the link itself is unreliable, such as on WLANs (hence the use in amateur radio) and last-mile links. In that case, the reason for the packet loss is not a full receive buffer, so asking for a retransmission is not inherently self-defeating. > I'm not going to argue against letting retransmission go end to end; it's an endless debate. I'll simply note that several link layers, including but not limited to those you mention, find that applications using them work better if there is a high high probability of retransmission in an interval on the order of the link RTT as opposed to the end to end RTT. You brought up data centers (aka variable delays in LAN networks); those have been heavily the province of fiberchannel, which is a link layer protocol with retransmission. Think about it. What I'd like to see is a complete absence of need for retransmission on a properly built wired network. Obviously the capability still needs to be there to cope with the parts that aren't properly built or aren't wired, but TCP can do that. Throttling (in the form of Ethernet PAUSE) is simply the third possible method of signalling congestion in the network, alongside delay and loss - and it happens to be quite widely deployed already. - Jonathan ^ permalink raw reply [flat|nested] 66+ messages in thread
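The arithmetic behind Jonathan's backup scenario, for what it's worth (a sketch; ideal fair sharing and ideal PAUSE behaviour are assumed):

    # 100 clients on 1 GbE edge links all writing to one file server behind a
    # single 1 GbE uplink.  With fair sharing, each client's long-term rate is
    # the uplink capacity divided by the client count; with PAUSE-style
    # throttling each edge link is simply open for that fraction of the time.
    uplink_bps = 1e9
    clients = 100
    edge_bps = 1e9

    per_client = uplink_bps / clients
    duty_cycle = per_client / edge_bps

    print(f"per-client rate: {per_client/1e6:.0f} Mbit/s")   # 10 Mbit/s
    print(f"edge link open {duty_cycle:.0%} of the time")    # 1%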
* Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) 2011-05-16 0:31 ` Jonathan Morton @ 2011-05-16 7:51 ` Richard Scheffenegger 2011-05-16 9:49 ` Fred Baker 0 siblings, 1 reply; 66+ messages in thread From: Richard Scheffenegger @ 2011-05-16 7:51 UTC (permalink / raw) To: Jonathan Morton, Fred Baker; +Cc: bloat Jonathan, > What I'd like to see is a complete absence of need for retransmission on a > properly > built wired network. Obviously the capability still needs to be there to > cope with > the parts that aren't properly built or aren't wired, but TCP can do that. > Throttling > (in the form of Ethernet PAUSE) is simply the third possible method of > signalling > congestion in the network, alongside delay and loss - and it happens to be > quite > widely deployed already. Two comments: First, TCP can currently NOT deal properly with non-congestion loss (in other words, any loss will lead to a congestion control reaction - a reduction of the sending rate). TCP can only (mostly) deal with the recovery part in a hopefully timely fashion. In this area you'll find a high number of possible approaches, none of which is quite backwards-compatible with "standard" TCP. Second, you wouldn't want to deploy basic 802.3x to any network consisting of more than a single switch. If you do, you can run into an effect called congestion tree formation, where (simplified) the slowest receiver determines the global speed of your Ethernet network. 802.1Qbb is also prone to congestion trees, even though the probability is somewhat reduced provided all priority classes are being used. Unfortunately, most traffic is in the same 802.1p class... Adequate solutions (more complex than the FCP buffer-credit based congestion avoidance) like 802.1Qau / QCN are not available commercially afaik. (They need new NICs + new switches for the HW support.) But I agree, an L3 device should be able to distribute L2 congestion information into the L3 header (even though today, cheap generic Broadcom and perhaps even Realtek chipsets support ECN marking even when they are running as an L2 switch; special firmware is required, though - see the DCTCP papers). Best regards, Richard ^ permalink raw reply [flat|nested] 66+ messages in thread
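For reference, the switch-side marking trick Richard alludes to is very simple in the DCTCP scheme; a minimal sketch follows (the threshold and gain below are illustrative values, not the ones from the DCTCP papers):

  # DCTCP-style ECN marking at the switch, and the matching sender reaction.
  K_PACKETS = 20                            # marking threshold (illustrative)

  def switch_enqueue(queue_depth_pkts, packet):
      # Switch: set ECN CE on the packet if the instantaneous queue exceeds K,
      # instead of dropping it or asserting PAUSE on the ingress port.
      if queue_depth_pkts > K_PACKETS:
          packet["ecn_ce"] = True
      return packet

  def sender_cwnd_update(cwnd, alpha, frac_marked_this_rtt, g=1.0 / 16):
      # Sender: track the fraction of CE-marked ACKs with a moving average
      # and cut cwnd in proportion to it, once per RTT.
      alpha = (1 - g) * alpha + g * frac_marked_this_rtt
      return cwnd * (1 - alpha / 2), alpha

The congestion signal here stays per-flow and end-to-end (ECN in the L3/L4 headers), rather than per-port and hop-by-hop as with 802.3x, which is why it does not build congestion trees.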
* Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) 2011-05-16 7:51 ` Richard Scheffenegger @ 2011-05-16 9:49 ` Fred Baker 2011-05-16 11:23 ` [Bloat] Jumbo frames and LAN buffers Jim Gettys 2011-05-16 18:11 ` [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) Richard Scheffenegger 0 siblings, 2 replies; 66+ messages in thread From: Fred Baker @ 2011-05-16 9:49 UTC (permalink / raw) To: Richard Scheffenegger; +Cc: bloat On May 16, 2011, at 9:51 AM, Richard Scheffenegger wrote: > Second, you wouldn't want to deploy basic 802.3x to any network consisting of more than a single switch. actually, it's pretty common practice. Three layers, even. People build backbones, and then ring them with workgroup switches, and then put small switches on their desks. ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Jumbo frames and LAN buffers 2011-05-16 9:49 ` Fred Baker @ 2011-05-16 11:23 ` Jim Gettys 2011-05-16 13:15 ` Kevin Gross 2011-05-16 18:11 ` [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) Richard Scheffenegger 1 sibling, 1 reply; 66+ messages in thread From: Jim Gettys @ 2011-05-16 11:23 UTC (permalink / raw) To: bloat On 05/16/2011 05:49 AM, Fred Baker wrote: > On May 16, 2011, at 9:51 AM, Richard Scheffenegger wrote: > >> Second, you wouldn't want to deploy basic 802.3x to any network consisting of more than a single switch. > actually, it's pretty common practice. Three layers, even. People build backbones, and then ring them with workgroup switches, and then put small switches on their desks. > Not necessarily out of knowledge or desire (since it isn't usually controllable in the small switches you buy for home). It can cause trouble even in small environments as your house. http://virtualthreads.blogspot.com/2006/02/beware-ethernet-flow-control.html I know I'm at least three consumer switches deep, and it's not by choice. - Jim ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Jumbo frames and LAN buffers 2011-05-16 11:23 ` [Bloat] Jumbo frames and LAN buffers Jim Gettys @ 2011-05-16 13:15 ` Kevin Gross 2011-05-16 13:22 ` Jim Gettys 2011-05-16 18:36 ` Richard Scheffenegger 0 siblings, 2 replies; 66+ messages in thread From: Kevin Gross @ 2011-05-16 13:15 UTC (permalink / raw) To: bloat All the stand-alone switches I've looked at recently either do not support 802.3x or support it in the (desireable) manner described in the last paragraph of the linked blog post. I don't believe Ethernet flow control is a factor in current LANs. I'd be interested to know the specifics if anyone sees it differently. My understanding is that 802.1au, "lossless Ethernet", was designed primarily to allow Fibre Channel to be carried over 10 GbE so that SAN and LAN can share a common infrastructure in datacenters. I don't believe anyone intends for it to be enabled for traffic classes carrying TCP. Kevin Gross -----Original Message----- From: bloat-bounces@lists.bufferbloat.net [mailto:bloat-bounces@lists.bufferbloat.net] On Behalf Of Jim Gettys Sent: Monday, May 16, 2011 5:24 AM To: bloat@lists.bufferbloat.net Subject: Re: [Bloat] Jumbo frames and LAN buffers Not necessarily out of knowledge or desire (since it isn't usually controllable in the small switches you buy for home). It can cause trouble even in small environments as your house. http://virtualthreads.blogspot.com/2006/02/beware-ethernet-flow-control.html I know I'm at least three consumer switches deep, and it's not by choice. - Jim ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Jumbo frames and LAN buffers 2011-05-16 13:15 ` Kevin Gross @ 2011-05-16 13:22 ` Jim Gettys 2011-05-16 13:42 ` Kevin Gross [not found] ` <-854731558634984958@unknownmsgid> 2011-05-16 18:36 ` Richard Scheffenegger 1 sibling, 2 replies; 66+ messages in thread From: Jim Gettys @ 2011-05-16 13:22 UTC (permalink / raw) To: bloat On 05/16/2011 09:15 AM, Kevin Gross wrote: > All the stand-alone switches I've looked at recently either do not support > 802.3x or support it in the (desireable) manner described in the last > paragraph of the linked blog post. I don't believe Ethernet flow control is > a factor in current LANs. I'd be interested to know the specifics if anyone > sees it differently. Heh. Plug wireshark into current off the shelf cheap consumer switches intended for the home. You won't like what you see. And you have no way to manage them. I was quite surprised last fall when doing my home experiments to see 802.3 frames; I had been blissfully unaware of its existence, and had to go read up on it as a result. I don't think any of the enterprise switches are so brain damaged. So i suspect it's mostly lurking to cause trouble in home and small office environments, exactly where no-one will know what's going on. - Jim > My understanding is that 802.1au, "lossless Ethernet", was designed > primarily to allow Fibre Channel to be carried over 10 GbE so that SAN and > LAN can share a common infrastructure in datacenters. I don't believe anyone > intends for it to be enabled for traffic classes carrying TCP. > > Kevin Gross > > -----Original Message----- > From: bloat-bounces@lists.bufferbloat.net > [mailto:bloat-bounces@lists.bufferbloat.net] On Behalf Of Jim Gettys > Sent: Monday, May 16, 2011 5:24 AM > To: bloat@lists.bufferbloat.net > Subject: Re: [Bloat] Jumbo frames and LAN buffers > > Not necessarily out of knowledge or desire (since it isn't usually > controllable in the small switches you buy for home). It can cause > trouble even in small environments as your house. > > http://virtualthreads.blogspot.com/2006/02/beware-ethernet-flow-control.html > > I know I'm at least three consumer switches deep, and it's not by choice. > - Jim > > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Jumbo frames and LAN buffers 2011-05-16 13:22 ` Jim Gettys @ 2011-05-16 13:42 ` Kevin Gross 2011-05-16 15:23 ` Jim Gettys [not found] ` <-854731558634984958@unknownmsgid> 1 sibling, 1 reply; 66+ messages in thread From: Kevin Gross @ 2011-05-16 13:42 UTC (permalink / raw) To: bloat I would like to try this. Can you suggest specific equipment to look at. Due to integration and low port count, most of the cheap consumer stuff has surprisingly good layer-2 performance. I've tested a bunch of Linksys and other small/medium business 5 to 24 port gigabit switches. Since I measure latency, I expect I would have noticed if flow control were kicking in. Kevin Gross -----Original Message----- From: bloat-bounces@lists.bufferbloat.net [mailto:bloat-bounces@lists.bufferbloat.net] On Behalf Of Jim Gettys Sent: Monday, May 16, 2011 7:23 AM To: bloat@lists.bufferbloat.net Subject: Re: [Bloat] Jumbo frames and LAN buffers On 05/16/2011 09:15 AM, Kevin Gross wrote: > All the stand-alone switches I've looked at recently either do not support > 802.3x or support it in the (desireable) manner described in the last > paragraph of the linked blog post. I don't believe Ethernet flow control is > a factor in current LANs. I'd be interested to know the specifics if anyone > sees it differently. Heh. Plug wireshark into current off the shelf cheap consumer switches intended for the home. You won't like what you see. And you have no way to manage them. I was quite surprised last fall when doing my home experiments to see 802.3 frames; I had been blissfully unaware of its existence, and had to go read up on it as a result. I don't think any of the enterprise switches are so brain damaged. So i suspect it's mostly lurking to cause trouble in home and small office environments, exactly where no-one will know what's going on. - Jim ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Jumbo frames and LAN buffers 2011-05-16 13:42 ` Kevin Gross @ 2011-05-16 15:23 ` Jim Gettys 0 siblings, 0 replies; 66+ messages in thread From: Jim Gettys @ 2011-05-16 15:23 UTC (permalink / raw) To: bloat On 05/16/2011 09:42 AM, Kevin Gross wrote: > I would like to try this. Can you suggest specific equipment to look at. Due > to integration and low port count, most of the cheap consumer stuff has > surprisingly good layer-2 performance. I've tested a bunch of Linksys and > other small/medium business 5 to 24 port gigabit switches. Since I measure > latency, I expect I would have noticed if flow control were kicking in. I think I was using a D-Link DGS2208. (8 port consumer switch). I then went and looked at the spec sheets of some of the other consumer kit out there and found they all had the "feature" of 802.3 flow control. I may have been using iperf to tickle it, rather than ssh. I was also playing around with an old 100Mbps switch, as documented in my blog; I don't remember if I saw it there. - Jim > Kevin Gross > > -----Original Message----- > From: bloat-bounces@lists.bufferbloat.net > [mailto:bloat-bounces@lists.bufferbloat.net] On Behalf Of Jim Gettys > Sent: Monday, May 16, 2011 7:23 AM > To: bloat@lists.bufferbloat.net > Subject: Re: [Bloat] Jumbo frames and LAN buffers > > On 05/16/2011 09:15 AM, Kevin Gross wrote: >> All the stand-alone switches I've looked at recently either do not support >> 802.3x or support it in the (desireable) manner described in the last >> paragraph of the linked blog post. I don't believe Ethernet flow control > is >> a factor in current LANs. I'd be interested to know the specifics if > anyone >> sees it differently. > Heh. Plug wireshark into current off the shelf cheap consumer switches > intended for the home. You won't like what you see. And you have no > way to manage them. I was quite surprised last fall when doing my home > experiments to see 802.3 frames; I had been blissfully unaware of its > existence, and had to go read up on it as a result. > > I don't think any of the enterprise switches are so brain damaged. So i > suspect it's mostly lurking to cause trouble in home and small office > environments, exactly where no-one will know what's going on. > - Jim > > > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 66+ messages in thread
[parent not found: <-854731558634984958@unknownmsgid>]
* Re: [Bloat] Jumbo frames and LAN buffers [not found] ` <-854731558634984958@unknownmsgid> @ 2011-05-16 13:45 ` Dave Taht 0 siblings, 0 replies; 66+ messages in thread From: Dave Taht @ 2011-05-16 13:45 UTC (permalink / raw) To: Kevin Gross; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 2196 bytes --] On Mon, May 16, 2011 at 7:42 AM, Kevin Gross <kevin.gross@avanw.com> wrote: > I would like to try this. Can you suggest specific equipment to look at. > Due > to integration and low port count, most of the cheap consumer stuff has > surprisingly good layer-2 performance. I've tested a bunch of Linksys and > other small/medium business 5 to 24 port gigabit switches. Since I measure > latency, I expect I would have noticed if flow control were kicking in. > I would certainly appreciate more people looking at the switch in the wndr3700v2 we're using on the bismark project. I'm seeing some pretty deep buffering on it > > Kevin Gross > > -----Original Message----- > From: bloat-bounces@lists.bufferbloat.net > [mailto:bloat-bounces@lists.bufferbloat.net] On Behalf Of Jim Gettys > Sent: Monday, May 16, 2011 7:23 AM > To: bloat@lists.bufferbloat.net > Subject: Re: [Bloat] Jumbo frames and LAN buffers > > On 05/16/2011 09:15 AM, Kevin Gross wrote: > > All the stand-alone switches I've looked at recently either do not > support > > 802.3x or support it in the (desireable) manner described in the last > > paragraph of the linked blog post. I don't believe Ethernet flow control > is > > a factor in current LANs. I'd be interested to know the specifics if > anyone > > sees it differently. > > Heh. Plug wireshark into current off the shelf cheap consumer switches > intended for the home. You won't like what you see. And you have no > way to manage them. I was quite surprised last fall when doing my home > experiments to see 802.3 frames; I had been blissfully unaware of its > existence, and had to go read up on it as a result. > > I don't think any of the enterprise switches are so brain damaged. So i > suspect it's mostly lurking to cause trouble in home and small office > environments, exactly where no-one will know what's going on. > - Jim > > > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat > -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://the-edge.blogspot.com [-- Attachment #2: Type: text/html, Size: 3233 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Jumbo frames and LAN buffers 2011-05-16 13:15 ` Kevin Gross 2011-05-16 13:22 ` Jim Gettys @ 2011-05-16 18:36 ` Richard Scheffenegger 1 sibling, 0 replies; 66+ messages in thread From: Richard Scheffenegger @ 2011-05-16 18:36 UTC (permalink / raw) To: Kevin Gross, bloat Kevin, > My understanding is that 802.1au, "lossless Ethernet", was designed > primarily to allow Fibre Channel to be carried over 10 GbE so that SAN and > LAN can share a common infrastructure in datacenters. I don't believe > anyone > intends for it to be enabled for traffic classes carrying TCP. Well, QCN requires a L2 MAC sender, network and receiver cooperation (thus you need fancy "CNA" converged network adapters, to start using it - these would be reaction/reflection points; plus the congestion points - switches - would need HW support too; nothing one can buy today; higher-grade (carrier?) switches may have the reaction/reflection points built into them, and could use legacy 802.3x signalling outside the 802.1Qau cloud). The following may be too simplistic Once the hardware has a reaction point support, it classifies traffic, and calculates the per flow congestion of the path (with flow really being the classification rules by the sender), the intermediates / receiver sample the flow and return the congestion back to the sender - and within the sender, a token bucket-like rate limiter will adjust the sending rate of the appropriate flow(s) to adjust to the observed network conditions. http://www.stanford.edu/~balaji/presentations/au-prabhakar-qcn-description.pdf http://www.ieee802.org/1/files/public/docs2007/au-pan-qcn-details-053007.pdf The congestion control loop has a lot of similarities to TCP CC as you will note... Also, I haven't found out how fine-grained the classification is supposed to be (per L2 address pair? Group of flows? Which hashing then to use for mapping L2 flows into those groups between reaction/congestion/reflection points...). Anyway, for the here and now, this is pretty much esoteric stuff not relevant in this context :) Best regards, Richard ----- Original Message ----- From: "Kevin Gross" <kevin.gross@avanw.com> To: <bloat@lists.bufferbloat.net> Sent: Monday, May 16, 2011 3:15 PM Subject: Re: [Bloat] Jumbo frames and LAN buffers > All the stand-alone switches I've looked at recently either do not support > 802.3x or support it in the (desireable) manner described in the last > paragraph of the linked blog post. I don't believe Ethernet flow control > is > a factor in current LANs. I'd be interested to know the specifics if > anyone > sees it differently. > > My understanding is that 802.1au, "lossless Ethernet", was designed > primarily to allow Fibre Channel to be carried over 10 GbE so that SAN and > LAN can share a common infrastructure in datacenters. I don't believe > anyone > intends for it to be enabled for traffic classes carrying TCP. > > Kevin Gross > > -----Original Message----- > From: bloat-bounces@lists.bufferbloat.net > [mailto:bloat-bounces@lists.bufferbloat.net] On Behalf Of Jim Gettys > Sent: Monday, May 16, 2011 5:24 AM > To: bloat@lists.bufferbloat.net > Subject: Re: [Bloat] Jumbo frames and LAN buffers > > Not necessarily out of knowledge or desire (since it isn't usually > controllable in the small switches you buy for home). It can cause > trouble even in small environments as your house. > > http://virtualthreads.blogspot.com/2006/02/beware-ethernet-flow-control.html > > I know I'm at least three consumer switches deep, and it's not by choice. 
> - Jim > > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat > ^ permalink raw reply [flat|nested] 66+ messages in thread
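Since QCN comes up repeatedly in this subthread, here is a much-simplified sketch of its control loop as described in the slide decks linked above; all constants are illustrative placeholders, not values from 802.1Qau:

  # Congestion point (switch): sample the queue, compute feedback Fb.
  Q_EQ = 33          # desired queue operating point, frames (illustrative)
  W = 2.0            # weight on queue growth rate (illustrative)
  GD = 1.0 / 128     # gain for multiplicative rate decrease (illustrative)
  R_AI = 5e6         # additive increase step, bit/s (illustrative)

  def congestion_point_feedback(q_now, q_old):
      q_off = q_now - Q_EQ           # how far above the operating point we are
      q_delta = q_now - q_old        # how fast the queue is growing
      return -(q_off + W * q_delta)  # negative = congested, reflected to sender

  # Reaction point (sending NIC's rate limiter): rate cut multiplicatively on
  # congestion feedback, recovered gradually toward the old target otherwise.
  def reaction_point_update(rate, target_rate, fb):
      if fb < 0:
          target_rate = rate
          rate = rate * (1 - GD * min(abs(fb), 64))   # bounded decrease (<= 50%)
      else:
          target_rate = target_rate + R_AI            # probe upward
          rate = (rate + target_rate) / 2             # recover toward the target
      return rate, target_rate

As Richard notes, the loop is recognisably TCP-like, but it runs per L2 flow inside the NICs and switches, which is why it needs new hardware at both the reaction and congestion points.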
* Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) 2011-05-16 9:49 ` Fred Baker 2011-05-16 11:23 ` [Bloat] Jumbo frames and LAN buffers Jim Gettys @ 2011-05-16 18:11 ` Richard Scheffenegger 1 sibling, 0 replies; 66+ messages in thread From: Richard Scheffenegger @ 2011-05-16 18:11 UTC (permalink / raw) To: Fred Baker; +Cc: bloat Hi Fred, Yes, that's the common topology; however, 802.3x is often used only unidirectionally and with very limited effect, not bidirectionally. At least those are the default settings... (I wonder: if both ends of a link are RX-only, would flow control ever get triggered?) I know a number of deployments where globally enabling full flow control (as opposed to RX / TX only) led to fewer packet drops, but also sometimes massively reduced network bandwidth. This is what I meant when I said you don't want to deploy flow control in a multi-tier network topology, because of the congestion trees that form. Best regards, Richard ----- Original Message ----- From: "Fred Baker" <fred@cisco.com> To: "Richard Scheffenegger" <rscheff@gmx.at> Cc: "Jonathan Morton" <chromatix99@gmail.com>; <bloat@lists.bufferbloat.net> Sent: Monday, May 16, 2011 11:49 AM Subject: Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) On May 16, 2011, at 9:51 AM, Richard Scheffenegger wrote: > Second, you wouldn't want to deploy basic 802.3x to any network consisting > of more than a single switch. actually, it's pretty common practice. Three layers, even. People build backbones, and then ring them with workgroup switches, and then put small switches on their desks. ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) 2011-05-14 20:48 ` Fred Baker 2011-05-15 18:28 ` Jonathan Morton @ 2011-05-17 7:49 ` BeckW 2011-05-17 14:16 ` Dave Taht 1 sibling, 1 reply; 66+ messages in thread From: BeckW @ 2011-05-17 7:49 UTC (permalink / raw) To: bloat (I think) Fred wrote: > Well, the extra delay is solvable in the transport. The question isn't really what the impact on the > network is; it's what the requirements of the application are. For voice, if a voice sample is > delayed 50 ms the jitter buffer in the codec resolves that - microseconds are irrelevant. If you meant 50 microseconds, ignore the rest of this post. 50 milliseconds is a *long* time in VoIP. The total mouth-to-ear delay budget is only 150 ms. Adaptive jitter buffer algorithms choose a buffer size that is bigger than the observed delay variation. So the additional delay will be even higher than 50 ms. Big frames are a problem on slower upstream links, even if you strictly prioritize VoIP and don't use jumbo frames. Some DSL providers resort to using two ATM VCs, just to prevent TCP packets from delaying VoIP. Wolfgang Beck -- Deutsche Telekom Netzproduktion GmbH Zentrum Technik Einführung Heinrich-Hertz-Straße 3-7, 64295 Darmstadt +49 61516282832 (Tel.) http://www.telekom.com Deutsche Telekom Netzproduktion GmbH Aufsichtsrat: Timotheus Höttges (Vorsitzender) Geschäftsführung: Bruno Jacobfeuerborn (Vorsitzender), Albert Matheis, Klaus Peren Handelsregister: Amtsgericht Bonn HRB 14190 Sitz der Gesellschaft: Bonn USt-IdNr.: DE 814645262 Erleben, was verbindet. ^ permalink raw reply [flat|nested] 66+ messages in thread
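To put numbers on "big frames are a problem on slower upstream links", here is the serialization delay a single maximum-size frame adds ahead of a queued VoIP packet, for a few assumed upstream rates (the rates are illustrative, not taken from the post):

  # Serialization delay of one full-size frame at various link rates.
  for rate_bps in (512e3, 1e6, 1e9):
      for frame_bytes in (1500, 9000):
          delay_ms = frame_bytes * 8 / rate_bps * 1e3
          print("%10.0f bit/s, %5d B frame: %10.3f ms" % (rate_bps, frame_bytes, delay_ms))

  # 512 kbit/s: 1500 B -> ~23 ms, 9000 B -> ~141 ms (most of the 150 ms budget)
  # 1 Mbit/s:   1500 B -> 12 ms,  9000 B -> 72 ms
  # 1 Gbit/s:   1500 B -> 12 us,  9000 B -> 72 us

At DSL upstream rates even a single standard frame eats a noticeable slice of the mouth-to-ear budget, which is why the post mentions ISPs resorting to a separate ATM VC even with strict prioritization of VoIP.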
* Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) 2011-05-17 7:49 ` BeckW @ 2011-05-17 14:16 ` Dave Taht 0 siblings, 0 replies; 66+ messages in thread From: Dave Taht @ 2011-05-17 14:16 UTC (permalink / raw) To: BeckW; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 879 bytes --] On Tue, May 17, 2011 at 1:49 AM, <BeckW@telekom.de> wrote: > (I think) Fred wrote: > > Well, the extra delay is solvable in the transport. The question isn't > really what the impact on the > network is; it's what the requirements of > the application are. For voice, if a voice sample is > > delayed 50 ms the jitter buffer in the codec resolves that - microseconds > are irrelevant. > > If you meant 50 microseconds, ignore the rest of this post. > > 50 milliseconds is a *long* time in VoIP. The total mouth-to-ear delay > budget is only 150 ms. Adaptive jitter buffer algorithms choose a buffer > size that is bigger than the observed delay variation. So the additional > delay will be even higher than 50 ms. > > *10* ms in terms of jitter is a *long* time in voip. -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://the-edge.blogspot.com [-- Attachment #2: Type: text/html, Size: 1262 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
[parent not found: <-4629065256951087821@unknownmsgid>]
* Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) [not found] ` <-4629065256951087821@unknownmsgid> @ 2011-05-13 20:21 ` Dave Taht 2011-05-13 22:36 ` Kevin Gross 0 siblings, 1 reply; 66+ messages in thread From: Dave Taht @ 2011-05-13 20:21 UTC (permalink / raw) To: Kevin Gross; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 4323 bytes --] On Fri, May 13, 2011 at 2:03 PM, Kevin Gross <kevin.gross@avanw.com> wrote: > Do we think that bufferbloat is just a WAN problem? I work on live media > applications for LANs and campus networks. I'm seeing what I think could be > characterized as bufferbloat in LAN equipment. The timescales on 1 Gb > Ethernet are orders of magnitude shorter and the performance problems caused > are in many cases a bit different but root cause and potential solutions > are, I'm hoping, very similar. > > > > Keeping the frame byte size small while the frame time has shrunk maintains > the overhead at the same level. Again, this has been a conscious decision > not a stubborn relic. Ethernet improvements have increased bandwidth by > orders of magnitude. Do we really need to increase it by a couple percentage > points more by reducing overhead for large payloads? > > > > The cost of that improved marginal bandwidth efficiency is a 6x increase in > latency. Many applications would not notice an increase from 12 us to 72 us > for a Gigabit switch hop. But on a large network it adds up, some > applications are absolutely that sensitive (transaction processing, cluster > computing, SANs) and (I thought I'd be preaching to the choir here) there's > no way to ever recover the lost performance. > > > You are preaching to the choir here, but I note several things: Large frame sizes on 10GigE networks to other 10GigE networks is less of a problem than 10GigE to 10Mbit networks. I would hope/expect that frame would fragment in that case. Getting to where latencies are less than 10ms in the general case makes voip feasible again. I'm still at well over 300ms on bismark. Enabling higher speed stock market trades and live music exchange over a lan would be next on my list after getting below 10ms on the local switch/wireless interface! A lot of research points to widely enabling some form of fair queuing at the servers and switches to distribute the load at sane levels. (nagle, 89) I think few gig+e vendors are doing that in hardware, and it would be good to know who is and who isn't. For example, the switch I'm using on bismark has all sorts of wonderful QoS features such as fair queuing, but as best as I can tell they are not enabled, and I'm seeing buffering in the switch at well above 20ms.... It is astonishing that a switch chip this capable has reached the consumer marketplace... http://realtek.info/pdf/rtl8366s_8366sr_datasheet_vpre-1.4_20071022.pdf And depressing that so few of it's capabilities have software to configure them. > Kevin Gross > > > > *From:* Dave Taht [mailto:dave.taht@gmail.com] > *Sent:* Friday, May 13, 2011 8:54 AM > *To:* rick.jones2@hp.com > *Cc:* Kevin Gross; bloat@lists.bufferbloat.net > *Subject:* Re: [Bloat] Burst Loss > > > > > > On Fri, May 13, 2011 at 8:35 AM, Rick Jones <rick.jones2@hp.com> wrote: > > On Thu, 2011-05-12 at 23:00 -0600, Kevin Gross wrote: > > One of the principal reasons jumbo frames have not been standardized > > is due to latency concerns. I assume this group can appreciate the > > IEEE holding ground on this. > > Thusfar at least, bloaters are fighting to eliminate 10s of milliseconds > of queuing delay. 
I don't think this list is worrying about the tens of > microseconds difference between the transmission time of a 9000 byte > frame at 1 GbE vs a 1500 byte frame, or the single digit microseconds > difference at 10 GbE. > > > Heh. With the first iteration of the bismark project I'm trying to get to > where I have less than 30ms latency under load and have far larger problems > to worry about than jumbo frames. I'll be lucky to manage 1/10th that > (300ms) at this point. > > Not, incidentally that I mind the idea of jumbo frames. It seems silly to > be saddled with default frame sizes that made sense in the 70s, and in an > age where we will be seeing ever more packet encapsulation, reducing the > header size as a ratio to data size strikes me as a very worthy goal. > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat > > -- Dave Täht SKYPE: davetaht US Tel: 1-239-829-5608 http://the-edge.blogspot.com [-- Attachment #2: Type: text/html, Size: 6678 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) 2011-05-13 20:21 ` Dave Taht @ 2011-05-13 22:36 ` Kevin Gross 0 siblings, 0 replies; 66+ messages in thread From: Kevin Gross @ 2011-05-13 22:36 UTC (permalink / raw) To: bloat [-- Attachment #1: Type: text/plain, Size: 5732 bytes --] Even though jumbo frames are not standardized, most new network equipment supports them (though generally support is disabled by default). If you IPv4 route jumbo packets to a network that doesn't support them, the router will fragment for you. Under IPv6, it is the sender's responsibility to choose an MTU that is supported by all networks between source and destination. IPv6 routers do no fragmentation. Although consumer products are often dumbed down, it is not difficult to find switches with comprehensive QoS configurability. Weighted fair queuing is a popular scheme. Strict priority is a bit dangerous but useful for latency-critical applications. The IEEE has just ratified a credit-based algorithm called 802.1Qav. What I find is missing from all but the high-end equipment is configurability of buffering capacity and behavior. Bad buffering can burn an otherwise competent QoS implementation. In his talks, Jim Gettys claims that these QoS features do not fix bufferbloat - they just move the problem elsewhere. I generally agree with this, though I find that moving the problem elsewhere is sometimes a perfectly acceptable solution. Kevin Gross On Fri, May 13, 2011 at 2:21 PM, Dave Taht <dave.taht@gmail.com> wrote: > > On Fri, May 13, 2011 at 2:03 PM, Kevin Gross <kevin.gross@avanw.com> wrote: > >> Do we think that bufferbloat is just a WAN problem? I work on live media >> applications for LANs and campus networks. I'm seeing what I think could be >> characterized as bufferbloat in LAN equipment. The timescales on 1 Gb >> Ethernet are orders of magnitude shorter and the performance problems caused >> are in many cases a bit different but root cause and potential solutions >> are, I'm hoping, very similar. >> >> >> >> Keeping the frame byte size small while the frame time has shrunk >> maintains the overhead at the same level. Again, this has been a conscious >> decision not a stubborn relic. Ethernet improvements have increased >> bandwidth by orders of magnitude. Do we really need to increase it by a >> couple percentage points more by reducing overhead for large payloads? >> >> >> >> The cost of that improved marginal bandwidth efficiency is a 6x increase >> in latency. Many applications would not notice an increase from 12 us to 72 >> us for a Gigabit switch hop. But on a large network it adds up, some >> applications are absolutely that sensitive (transaction processing, cluster >> computing, SANs) and (I thought I'd be preaching to the choir here) there's >> no way to ever recover the lost performance. >> >> >> > > You are preaching to the choir here, but I note several things: > > Large frame sizes on 10GigE networks to other 10GigE networks is less of a > problem than 10GigE to 10Mbit networks. I would hope/expect that frame would > fragment in that case. > > Getting to where latencies are less than 10ms in the general case makes > voip feasible again. I'm still at well over 300ms on bismark. > > Enabling higher speed stock market trades and live music exchange over a > lan would be next on my list after getting below 10ms on the local > switch/wireless interface!
> > A lot of research points to widely enabling some form of fair queuing at > the servers and switches to distribute the load at sane levels. (nagle, 89) > I think few gig+e vendors are doing that in hardware, and it would be good > to know who is and who isn't. > > For example, the switch I'm using on bismark has all sorts of wonderful QoS > features such as fair queuing, but as best as I can tell they are not > enabled, and I'm seeing buffering in the switch at well above 20ms.... > > It is astonishing that a switch chip this capable has reached the consumer > marketplace... > > http://realtek.info/pdf/rtl8366s_8366sr_datasheet_vpre-1.4_20071022.pdf > > And depressing that so few of it's capabilities have software to configure > them. > >> Kevin Gross >> >> >> >> *From:* Dave Taht [mailto:dave.taht@gmail.com] >> *Sent:* Friday, May 13, 2011 8:54 AM >> *To:* rick.jones2@hp.com >> *Cc:* Kevin Gross; bloat@lists.bufferbloat.net >> *Subject:* Re: [Bloat] Burst Loss >> >> >> >> >> >> On Fri, May 13, 2011 at 8:35 AM, Rick Jones <rick.jones2@hp.com> wrote: >> >> On Thu, 2011-05-12 at 23:00 -0600, Kevin Gross wrote: >> > One of the principal reasons jumbo frames have not been standardized >> > is due to latency concerns. I assume this group can appreciate the >> > IEEE holding ground on this. >> >> Thusfar at least, bloaters are fighting to eliminate 10s of milliseconds >> of queuing delay. I don't think this list is worrying about the tens of >> microseconds difference between the transmission time of a 9000 byte >> frame at 1 GbE vs a 1500 byte frame, or the single digit microseconds >> difference at 10 GbE. >> >> >> Heh. With the first iteration of the bismark project I'm trying to get to >> where I have less than 30ms latency under load and have far larger problems >> to worry about than jumbo frames. I'll be lucky to manage 1/10th that >> (300ms) at this point. >> >> Not, incidentally that I mind the idea of jumbo frames. It seems silly to >> be saddled with default frame sizes that made sense in the 70s, and in an >> age where we will be seeing ever more packet encapsulation, reducing the >> header size as a ratio to data size strikes me as a very worthy goal. >> >> _______________________________________________ >> Bloat mailing list >> Bloat@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/bloat >> >> > > > -- > Dave Täht > SKYPE: davetaht > US Tel: 1-239-829-5608 > http://the-edge.blogspot.com > [-- Attachment #2: Type: text/html, Size: 8367 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Burst Loss 2011-05-13 14:54 ` Dave Taht 2011-05-13 20:03 ` [Bloat] Jumbo frames and LAN buffers (was: RE: Burst Loss) Kevin Gross [not found] ` <-4629065256951087821@unknownmsgid> @ 2011-05-13 22:08 ` david 2 siblings, 0 replies; 66+ messages in thread From: david @ 2011-05-13 22:08 UTC (permalink / raw) To: Dave Taht; +Cc: bloat [-- Attachment #1: Type: TEXT/Plain, Size: 926 bytes --] On Fri, 13 May 2011, Dave Taht wrote: > Not, incidentally that I mind the idea of jumbo frames. It seems silly to be > saddled with default frame sizes that made sense in the 70s, and in an age > where we will be seeing ever more packet encapsulation, reducing the header > size as a ratio to data size strikes me as a very worthy goal. The header-to-data size ratio is a small factor (but with a header of ~50 bytes, you don't save _that_ much); I thought the huge advantage of jumbo frames was eliminating the gap between packets. Back in the 1 Mb network days, this gap size was not significant (a few bits' worth), but as networks have gotten faster, the gap has not gotten smaller by the same ratio. You guys are probably closer to the raw numbers than I am, but what is the total throughput of a network (counting header data as throughput) for various packet sizes (64 byte, 1500 byte, 9000 byte)? David Lang [-- Attachment #2: Type: TEXT/PLAIN, Size: 140 bytes --] _______________________________________________ Bloat mailing list Bloat@lists.bufferbloat.net https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 66+ messages in thread
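A rough answer to the question above, counting the fixed per-frame wire overhead that never shrinks with link speed (8 bytes of preamble/SFD plus the 12-byte minimum inter-frame gap; the frame sizes below are full Ethernet frames including the 14-byte header and 4-byte FCS):

  # Fraction of raw line rate delivered for various Ethernet frame sizes.
  PER_FRAME_OVERHEAD = 8 + 12                # preamble/SFD + inter-frame gap, bytes
  for frame in (64, 1518, 9018):             # min frame, 1500-byte MTU, 9000-byte jumbo
      on_wire = frame + PER_FRAME_OVERHEAD
      payload = frame - 18                   # strip Ethernet header + FCS
      print("%5d B frame: %5.1f%% of line rate is frame bytes, %5.1f%% is L3 payload"
            % (frame, 100.0 * frame / on_wire, 100.0 * payload / on_wire))

  #   64 B: ~76% frame bytes, ~55% L3 payload
  # 1518 B: ~98.7% frame bytes, ~97.5% L3 payload
  # 9018 B: ~99.8% frame bytes, ~99.6% L3 payload

So jumbo frames buy roughly one percentage point of wire efficiency over the standard MTU for bulk traffic; the big losses are at minimum-size packets, where gap and preamble cost about a quarter of the line rate regardless of MTU.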
* Re: [Bloat] Burst Loss 2011-05-13 14:35 ` Rick Jones 2011-05-13 14:54 ` Dave Taht @ 2011-05-13 19:32 ` Denton Gentry 2011-05-13 20:47 ` Rick Jones 1 sibling, 1 reply; 66+ messages in thread From: Denton Gentry @ 2011-05-13 19:32 UTC (permalink / raw) To: rick.jones2, Kevin Gross; +Cc: bloat [-- Attachment #1: Type: text/plain, Size: 865 bytes --] On Fri, May 13, 2011 at 7:35 AM, Rick Jones <rick.jones2@hp.com> wrote: > > For a short time, servers with gigabit NICs suffered but smarter NICs > > were developed (TSO, LRO, other TLAs) and OSs upgraded to support them > > and I believe it is no longer a significant issue. > > Are TSO and LRO going to be sufficient at 40 and 100 GbE? Cores aren't > getting any faster. Only more plentiful. NICs seem to be responding by hashing incoming 5-tuples to distribute flows across cores. > And while it isn't the > strongest point in the world, one might even argue that the need to use > TSO/LRO to achieve performance hinders new transport protocol adoption - > the presence of NIC offloads for only TCP (or UDP) leaves a new > transport protocol (perhaps SCTP) at a disadvantage. True, and even UDP seems to be often blocked for anything other than DNS. [-- Attachment #2: Type: text/html, Size: 1340 bytes --] ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Burst Loss 2011-05-13 19:32 ` Denton Gentry @ 2011-05-13 20:47 ` Rick Jones 0 siblings, 0 replies; 66+ messages in thread From: Rick Jones @ 2011-05-13 20:47 UTC (permalink / raw) To: Denton Gentry; +Cc: bloat On Fri, 2011-05-13 at 12:32 -0700, Denton Gentry wrote: > NICs seem to be responding by hashing incoming 5-tuples to > distribute flows across cores. When I first kicked netperf out onto the Internet, when 10 Megabits/second was really fast, people started asking me "Why can't I get link-rate on a single-stream netperf test?" The answer was "Because you don't have enough CPU horsepower, but perhaps the next processor will." Then when 100BT happened, people asked me "Why can't I get link-rate on a single-stream netperf test?" And the answer was the same. Then when 1 GbE happened, people asked me "Why can't I get link-rate on a single-stream netperf test?" And the answer was the same, tweaked slightly to suggest they get a NIC with CKO. Then when 10 GbE happened people asked me "Why can't I get link-rate on a single-stream netperf test?" And the answer was "Because you don't have enough CPU, try a NIC with TSO and LRO." Based on the past 20 years I am quite confident that when 40 and 100 GbE NICs appear for end systems, I will again be asked "Why can't I get link-rate on a single-stream netperf test?" While indeed, the world is not just unidirectional bulk flows (if it were netperf and its request-response tests would never have come into being to replace ttcp), even after decades it is still something people seem to expect. There must be some value to high performance unidirectional transfer. Only now the cores aren't going to have gotten any faster, and spreading incoming 5-tuples across cores isn't going to help a single stream. So, the "answer" will likely end-up being to add still more complexity - either in the applications to use multiple streams, or to push the full stack into the NIC. Adde parvum parvo manus acervus erit. But, by Metcalf, we will have preserved the sacrosanct Ethernet maximum frame size. Crossing emails a bit, Kevin wrote about the 6X increase in latency. It is a 6X increase in *potential* latency *if* someone actually enables the larger MTU. And yes, the "We want to be on the Top 500 list" types do worry about latency and some perhaps even many of them use Ethernet instead of Infiniband (which does, BTW offer at least the illusion of a quite large MTU to IP), but a sanctioned way to run a larger MTU over Ethernet does not *force* them to use it if they want to make the explicit latency vs overhead trade-off. As it stands, those who do not worry about micro or nanoseconds are forced off the standard in the name of preserving something for those who do. (And with 100 GbE it would be nanosecond differences we would talking about - the 12 and 72 usec of 1 GbE become 120 and 720 nanoseconds at 100 GbE - the realm of a processor cache miss because memory latency hasn't and won't likely get much better either) And, are transaction or SAN latencies actually measured in microseconds or nanoseconds? If "transactions" are OLTP, those things are measured in milliseconds and even whole seconds (TPC), and spinning rust (yes, but not SSDs) still has latencies measured in milliseconds. 
rick jones > > And while it isn't the > strongest point in the world, one might even argue that the > need to use > TSO/LRO to achieve performance hinders new transport protocol > adoption - > the presence of NIC offloads for only TCP (or UDP) leaves a > new > transport protocol (perhaps SCTP) at a disadvantage. > > > True, and even UDP seems to be often blocked for anything other than > DNS. ^ permalink raw reply [flat|nested] 66+ messages in thread
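For reference, the transmission times quoted in this exchange work out as follows (simple serialization time only, ignoring preamble and inter-frame gap):

  # Serialization time of a 1500 B vs 9000 B frame at various Ethernet rates.
  for gbps in (1, 10, 40, 100):
      t1500_us = 1500 * 8 / (gbps * 1e9) * 1e6
      t9000_us = 9000 * 8 / (gbps * 1e9) * 1e6
      print("%3d GbE: 1500 B = %6.2f us, 9000 B = %6.2f us" % (gbps, t1500_us, t9000_us))

  #   1 GbE: 12 us / 72 us
  # 100 GbE: 0.12 us / 0.72 us  (the 120 ns / 720 ns figures mentioned above)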
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-05 16:01 ` Jim Gettys 2011-05-05 16:10 ` Stephen Hemminger @ 2011-05-06 4:18 ` Fred Baker 2011-05-06 15:14 ` richard 2011-05-08 12:34 ` Richard Scheffenegger 1 sibling, 2 replies; 66+ messages in thread From: Fred Baker @ 2011-05-06 4:18 UTC (permalink / raw) To: Jim Gettys; +Cc: bloat There are a couple of ways to approach this, and they depend on your network model. In general, if you assume that there is one bottleneck, losses occur in the queue at the bottleneck, and are each retransmitted exactly once (not necessary, but helps), goodput should approximate 100% regardless of the queue depth. Why? Because every packet transits the bottleneck once - if it is dropped at the bottleneck, the retransmission transits the bottleneck. So you are using exactly the capacity of the bottleneck. the value of a shallow queue is to reduce RTT, not to increase or decrease goodput. cwnd can become too small, however; if it is possible to set cwnd to N without increasing queuing delay, and cwnd is less than N, you're not maximizing throughput. When cwnd grows above N, it merely increases queuing delay, and therefore bufferbloat. If there are two bottlenecks in series, you have some probability that a packet transits one bottleneck and doesn't transit the other. In that case, there is probably an analytical way to describe the behavior, but it depends on a lot of factors including distributions of competing traffic. There are a number of other possibilities; imagine that you drop a packet, there is a sack, you retransmit it, the ack is lost, and meanwhile there is another loss. You could easily retransmit the retransmission unnecessarily, which reduces goodput. The list of silly possibilities goes on for a while, and we have to assume that each has some probability of happening in the wild. On May 5, 2011, at 9:01 AM, Jim Gettys wrote: > On 04/30/2011 03:18 PM, Richard Scheffenegger wrote: >> I'm curious, has anyone done some simulations to check if the following qualitative statement holds true, and if, what the quantitative effect is: >> >> With bufferbloat, the TCP congestion control reaction is unduely delayed. When it finally happens, the tcp stream is likely facing a "burst loss" event - multiple consecutive packets get dropped. Worse yet, the sender with the lowest RTT across the bottleneck will likely start to retransmit while the (tail-drop) queue is still overflowing. >> >> And a lost retransmission means a major setback in bandwidth (except for Linux with bulk transfers and SACK enabled), as the standard (RFC documented) behaviour asks for a RTO (1sec nominally, 200-500 ms typically) to recover such a lost retransmission... >> >> The second part (more important as an incentive to the ISPs actually), how does the fraction of goodput vs. throughput change, when AQM schemes are deployed, and TCP CC reacts in a timely manner? Small ISPs have to pay for their upstream volume, regardless if that is "real" work (goodput) or unneccessary retransmissions. >> >> When I was at a small cable ISP in switzerland last week, surely enough bufferbloat was readily observable (17ms -> 220ms after 30 sec of a bulk transfer), but at first they had the "not our problem" view, until I started discussing burst loss / retransmissions / goodput vs throughput - with the latest point being a real commercial incentive to them. (They promised to check if AQM would be available in the CPE / CMTS, and put latency bounds in their tenders going forward). 
>> > I wish I had a good answer to your very good questions. Simulation would be interesting though real data is more convincing. > > I haven't looked in detail at all that many traces to try to get a feel for how much bandwidth waste there actually is, and more formal studies like Netalyzr, SamKnows, or the Bismark project would be needed to quantify the loss on the network as a whole. > > I did spend some time last fall with the traces I've taken. In those, I've typically been seeing 1-3% packet loss in the main TCP transfers. On the wireless trace I took, I saw 9% loss, but whether that is bufferbloat-induced loss or not, I don't know (the data is out there for those who might want to dig). And as you note, the losses are concentrated in bursts (probably due to the details of Cubic, so I'm told). > > I've had anecdotal reports (and some first hand experience) with much higher loss rates, for example from Nick Weaver at ICSI; but I believe in playing things conservatively with any numbers I quote and I've not gotten consistent results when I've tried, so I just report what's in the packet captures I did take. > > A phenomenon that could be occurring is that during congestion avoidance (until TCP loses its cookies entirely and probes for a higher operating point) TCP is carefully timing its packets to keep the buffers almost exactly full, so that competing flows (in my case, simple pings) are likely to arrive just when there is no buffer space to accept them and therefore you see higher losses on them than you would on the single flow I've been tracing and getting loss statistics from. > > People who want to look into this further would be a great help. > - Jim > > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-06 4:18 ` [Bloat] Goodput fraction w/ AQM vs bufferbloat Fred Baker @ 2011-05-06 15:14 ` richard 2011-05-06 21:56 ` Fred Baker 2011-05-08 12:53 ` Richard Scheffenegger 0 siblings, 2 replies; 66+ messages in thread From: richard @ 2011-05-06 15:14 UTC (permalink / raw) To: Fred Baker; +Cc: bloat I'm wondering if we should look at the ratio of throughput to goodput instead of the absolute numbers. Yes, the goodput will be 100% but at what cost in actual throughput? And at what cost in total bandwidth? If every packet takes two attempts then the ratio will be 1/2 - 1 unit of goodput for two units of throughput (at least up to the choke-point). This is worst-case, so the ratio is likely to be something better than that: 3/4, 5/6, 99/100? Hmmm... maybe inverting the ratio and calling it something flashy (the bloaty rating???) might give us a lever in the media and with ISPs that is easier for the math challenged to understand. Higher is worse. Putting a number to this will also help those of us trying to get ISPs to understand that their Usage Based Bilking (UBB) won't address the real problem which is hidden in this ratio. The fact is, the choke point for much of this is the home router/firewall - and so that 1/2 ratio tells me the consumer is getting hosed for a technical problem. richard On Thu, 2011-05-05 at 21:18 -0700, Fred Baker wrote: > There are a couple of ways to approach this, and they depend on your network model. > > In general, if you assume that there is one bottleneck, losses occur in the queue at the bottleneck, > and are each retransmitted exactly once (not necessary, but helps), goodput should approximate 100% > regardless of the queue depth. Why? Because every packet transits the bottleneck once - if it is > dropped at the bottleneck, the retransmission transits the bottleneck. So you are using exactly > the capacity of the bottleneck. > > the value of a shallow queue is to reduce RTT, not to increase or decrease goodput. cwnd can become > too small, however; if it is possible to set cwnd to N without increasing queuing delay, and cwnd is > less than N, you're not maximizing throughput. When cwnd grows above N, it merely increases queuing > delay, and therefore bufferbloat. > > If there are two bottlenecks in series, you have some probability that a packet transits one > bottleneck and doesn't transit the other. In that case, there is probably an analytical way > to describe the behavior, but it depends on a lot of factors including distributions of competing > traffic. There are a number of other possibilities; imagine that you drop a packet, there is a > sack, you retransmit it, the ack is lost, and meanwhile there is another loss. You could easily > retransmit the retransmission unnecessarily, which reduces goodput. The list of silly possibilities > goes on for a while, and we have to assume that each has some probability of happening in the wild. > snip... richard -- Richard C. Pitt Pacific Data Capture rcpitt@pacdat.net 604-644-9265 http://digital-rag.com www.pacdat.net PGP Fingerprint: FCEF 167D 151B 64C4 3333 57F0 4F18 AF98 9F59 DD73 ^ permalink raw reply [flat|nested] 66+ messages in thread
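The proposed ratio is easy to make concrete: if each delivered packet has to be transmitted (1 + x) times on average, then goodput/throughput = 1/(1 + x), and the inverted "bloaty rating" is simply 1 + x (illustrative arithmetic only, not measurements):

  # Goodput-to-throughput ratio as a function of average retransmissions per packet.
  def goodput_fraction(x):
      # x = average number of extra transmissions per delivered packet
      return 1.0 / (1.0 + x)

  for x in (1.0, 0.10, 0.01, 0.001):
      print("x = %-6g goodput/throughput = %.3f  'bloaty rating' = %.3f"
            % (x, goodput_fraction(x), 1.0 + x))

  # x = 1 (every packet sent twice) gives the worst-case 1/2 above;
  # loss rates around 0.1-1% keep the ratio at 0.99 or better.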
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-06 15:14 ` richard @ 2011-05-06 21:56 ` Fred Baker 2011-05-06 22:10 ` Stephen Hemminger 2011-05-08 13:00 ` Richard Scheffenegger 1 sibling, 2 replies; 66+ messages in thread From: Fred Baker @ 2011-05-06 21:56 UTC (permalink / raw) To: richard; +Cc: bloat On May 6, 2011, at 8:14 AM, richard wrote: > If every packet takes two attempts then the ratio will be 1/2 - 1 unit > of goodput for two units of throughput (at least up to the choke-point). > This is worst-case, so the ratio is likely to be something better than > that: 3/4, 5/6, 99/100? I have a suggestion. Turn on tcpdump on your laptop. Download a web page with lots of images, such as a Google Images web page, and then download a humongous file. Scan through the output file for SACK messages; that will give you the places where the receiver (you) saw losses and tried to recover from them. > Putting a number to this will also help those of us trying to get ISPs > to understand that their Usage Based Bilking (UBB) won't address the > real problem which is hidden in this ratio. The fact is, the choke point > for much of this is the home router/firewall - and so that 1/2 ratio > tells me the consumer is getting hosed for a technical problem. I think you need to do some research there. A TCP session with 1% loss (your ratio being 1/100) has difficulty maintaining throughput; usual TCP loss rates are on the order of tenths to hundredths of a percent. ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-06 21:56 ` Fred Baker @ 2011-05-06 22:10 ` Stephen Hemminger 2011-05-07 16:39 ` Jonathan Morton 2011-05-08 13:00 ` Richard Scheffenegger 1 sibling, 1 reply; 66+ messages in thread From: Stephen Hemminger @ 2011-05-06 22:10 UTC (permalink / raw) To: Fred Baker; +Cc: bloat On Fri, 6 May 2011 14:56:01 -0700 Fred Baker <fred@cisco.com> wrote: > > On May 6, 2011, at 8:14 AM, richard wrote: > > If every packet takes two attempts then the ratio will be 1/2 - 1 unit > > of goodput for two units of throughput (at least up to the choke-point). > > This is worst-case, so the ratio is likely to be something better than > > that: 3/4, 5/6, 99/100? > > I have a suggestion. Turn on tcpdump on your laptop. Download a web page with lots of images, such as a Google Images web page, and then download a humongous file. Scan through the output file for SACK messages; that will give you the places where the receiver (you) saw losses and tried to recover from them. > > > Putting a number to this will also help those of us trying to get ISPs > > to understand that their Usage Based Bilking (UBB) won't address the > > real problem which is hidden in this ratio. The fact is, the choke point > > for much of this is the home router/firewall - and so that 1/2 ratio > > tells me the consumer is getting hosed for a technical problem. > > I think you need to do some research there. A TCP session with 1% loss (your ratio being 1/100) has difficulty maintaining throughput; usual TCP loss rates are on the order of tenths to hundredths of a percent. There is some good theoretical work which shows the relationship between throughput and loss. http://www.slac.stanford.edu/comp/net/wan-mon/thru-vs-loss.html Rate <= (MSS/RTT)*(1 / sqrt{p}) where: Rate: is the TCP transfer rate or throughput MSS: is the maximum segment size (fixed for each Internet path, typically 1460 bytes) RTT: is the round trip time (as measured by TCP) p: is the packet loss rate. It is interesting that a longer RTT, which can be an artifact of bloat in the queues, will hurt throughput in this case. ^ permalink raw reply [flat|nested] 66+ messages in thread
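Plugging illustrative numbers into that bound shows how much of the damage comes from the inflated RTT rather than from the loss itself (the MSS and loss rate below are assumptions; the 17 ms and 220 ms RTTs are the ones from the Swiss cable ISP example earlier in the thread):

  from math import sqrt

  def mathis_rate_bps(mss_bytes, rtt_s, loss_rate):
      # Upper bound on TCP throughput: Rate <= (MSS/RTT) * (1/sqrt(p))
      return (mss_bytes * 8 / rtt_s) * (1.0 / sqrt(loss_rate))

  mss, p = 1460, 0.001                      # 0.1% loss, assumed
  for rtt_ms in (17, 220):
      print("RTT %3d ms: <= %5.1f Mbit/s" % (rtt_ms, mathis_rate_bps(mss, rtt_ms / 1e3, p) / 1e6))

  # RTT  17 ms: <= ~21.7 Mbit/s
  # RTT 220 ms: <= ~1.7 Mbit/s  -- the bloated queue alone costs more than 10x in the bound.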
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-06 22:10 ` Stephen Hemminger @ 2011-05-07 16:39 ` Jonathan Morton 2011-05-08 0:15 ` Stephen Hemminger 0 siblings, 1 reply; 66+ messages in thread From: Jonathan Morton @ 2011-05-07 16:39 UTC (permalink / raw) To: Stephen Hemminger; +Cc: bloat On 7 May, 2011, at 1:10 am, Stephen Hemminger wrote: > Rate <= (MSS/RTT)*(1 / sqrt{p}) > > where: > Rate: is the TCP transfer rate or throughputd > MSS: is the maximum segment size (fixed for each Internet path, typically 1460 bytes) > RTT: is the round trip time (as measured by TCP) > p: is the packet loss rate. So if the loss rate is 1.0 (100%), the throughput is MSS/RTT. If the loss rate is 0, the throughput goes to infinity. That doesn't seem right to me. - Jonathan ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-07 16:39 ` Jonathan Morton @ 2011-05-08 0:15 ` Stephen Hemminger 2011-05-08 3:04 ` Constantine Dovrolis 0 siblings, 1 reply; 66+ messages in thread From: Stephen Hemminger @ 2011-05-08 0:15 UTC (permalink / raw) To: Jonathan Morton; +Cc: bloat On Sat, 7 May 2011 19:39:22 +0300 Jonathan Morton <chromatix99@gmail.com> wrote: > > On 7 May, 2011, at 1:10 am, Stephen Hemminger wrote: > > > Rate <= (MSS/RTT)*(1 / sqrt{p}) > > > > where: > > Rate: is the TCP transfer rate or throughputd > > MSS: is the maximum segment size (fixed for each Internet path, typically 1460 bytes) > > RTT: is the round trip time (as measured by TCP) > > p: is the packet loss rate. > > So if the loss rate is 1.0 (100%), the throughput is MSS/RTT. If the loss rate is 0, the throughput goes to infinity. That doesn't seem right to me. If loss rate is 0 there is no upper bound on TCP due to loss. There are other limits on TCP throughput like window size but not limits because of loss. ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-08 0:15 ` Stephen Hemminger @ 2011-05-08 3:04 ` Constantine Dovrolis 0 siblings, 0 replies; 66+ messages in thread From: Constantine Dovrolis @ 2011-05-08 3:04 UTC (permalink / raw) To: Stephen Hemminger; +Cc: bloat Hi, I suggest you look at the following paper for a more general version of this formula (equation 3), which includes the effect of limited capacity and/or limited receive-window: http://www.cc.gatech.edu/fac/Constantinos.Dovrolis/Papers/f235-he.pdf The paper also discusses common mistakes when this formula is used to predict the throughput of a TCP connection - the basic idea is that we cannot use the loss rate *before* the start of a TCP connection to predict what its throughput will be. A large TCP connection that is not limited by its receive-window can of course cause an increase in the loss rate of the path that it traverses (see sections 3.2 - 3.4) regards Constantine On 5/7/2011 8:15 PM, Stephen Hemminger wrote: > On Sat, 7 May 2011 19:39:22 +0300 > Jonathan Morton<chromatix99@gmail.com> wrote: > >> >> On 7 May, 2011, at 1:10 am, Stephen Hemminger wrote: >> >>> Rate<= (MSS/RTT)*(1 / sqrt{p}) >>> >>> where: >>> Rate: is the TCP transfer rate or throughputd >>> MSS: is the maximum segment size (fixed for each Internet path, typically 1460 bytes) >>> RTT: is the round trip time (as measured by TCP) >>> p: is the packet loss rate. >> >> So if the loss rate is 1.0 (100%), the throughput is MSS/RTT. If the loss rate is 0, the throughput goes to infinity. That doesn't seem right to me. > > If loss rate is 0 there is no upper bound on TCP due to loss. > There are other limits on TCP throughput like window size but not limits > because of loss. > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat -- Constantine -------------------------------------------------------------- Constantine Dovrolis, Associate Professor College of Computing, Georgia Institute of Technology 3346 KACB, 404-385-4205, dovrolis@cc.gatech.edu http://www.cc.gatech.edu/~dovrolis/ ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-06 21:56 ` Fred Baker 2011-05-06 22:10 ` Stephen Hemminger @ 2011-05-08 13:00 ` Richard Scheffenegger 1 sibling, 0 replies; 66+ messages in thread From: Richard Scheffenegger @ 2011-05-08 13:00 UTC (permalink / raw) To: Fred Baker, richard; +Cc: bloat Note that this will only give you a lower bound; the true losses that were addressed by the sender (ie. RTO retransmissions that got lost again) cannot, in principle, be discovered by a receiver-side trace; only a (reliable) sender-side trace will allow that. To the second point: only for simple Reno/NewReno does there exist a closed formula for estimating throughput based on random, non-Markov distributed losses; and more modern congestion control / loss recovery schemes will permit (more or less slightly) higher throughput, thus the formulas (ie. RFC 3448 states the one for Reno) will only serve as a (good) lower bound estimate. Again, increasing throughput at the cost of goodput is a bad proposition if you get charged by traffic volume (because what you really want is data delivered to the receiver, not dumped into the network for no good reason). Regards, Richard ----- Original Message ----- From: "Fred Baker" <fred@cisco.com> To: "richard" <richard@pacdat.net> Cc: <bloat@lists.bufferbloat.net> Sent: Friday, May 06, 2011 11:56 PM Subject: Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat > > On May 6, 2011, at 8:14 AM, richard wrote: >> If every packet takes two attempts then the ratio will be 1/2 - 1 unit >> of goodput for two units of throughput (at least up to the choke-point). >> This is worst-case, so the ratio is likely to be something better than >> that 3/4, 5/6, 99/100 ??? > > I have a suggestion. Turn on tcpdump on your laptop. Download a web page > with lots of images, such as a google images web page, and then download > a humongous file. Scan through the output file for SACK messages; that > will give you the places where the receiver (you) saw losses and tried to > recover from them. > >> Putting a number to this will also help those of us trying to get ISPs >> to understand that their Usage Based Bilking (UBB) won't address the >> real problem which is hidden in this ratio. The fact is, the choke point >> for much of this is the home router/firewall - and so that 1/2 ratio >> tells me the consumer is getting hosed for a technical problem. > > I think you need to do some research there. A TCP session with 1% loss > (your ratio being 1/100) has difficulty maintaining throughput; usual TCP > loss rates are on the order of tenths to hundredths of a percent. > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 66+ messages in thread
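To make Fred's trace-scanning suggestion and Richard's sender-side caveat concrete, here is a rough sketch that estimates the goodput/throughput ratio from a sender-side capture. It uses scapy (assumed to be installed) and a hypothetical capture.pcap; any data segment whose starting sequence number was already covered is counted as a retransmission. Sequence wrap, reordering, and lost retransmissions of retransmissions are ignored, so the result is only an estimate, and a receiver-side capture would, as Richard notes, only give a lower bound on the losses.

    from scapy.all import rdpcap, IP, TCP

    highest = {}        # flow -> highest sequence number seen so far
    total = good = 0    # bytes put on the wire vs. bytes of new data

    for pkt in rdpcap("capture.pcap"):        # hypothetical sender-side capture
        if IP not in pkt or TCP not in pkt:
            continue
        payload = len(bytes(pkt[TCP].payload))
        if payload == 0:
            continue                          # skip pure ACKs
        flow = (pkt[IP].src, pkt[TCP].sport, pkt[IP].dst, pkt[TCP].dport)
        total += payload
        if pkt[TCP].seq >= highest.get(flow, 0):
            good += payload                   # new data
            highest[flow] = pkt[TCP].seq + payload
        # else: sequence range already sent once -> counted only as throughput

    print("goodput/throughput ~ %.3f" % (good / float(total)))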
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-06 15:14 ` richard 2011-05-06 21:56 ` Fred Baker @ 2011-05-08 12:53 ` Richard Scheffenegger 1 sibling, 0 replies; 66+ messages in thread From: Richard Scheffenegger @ 2011-05-08 12:53 UTC (permalink / raw) To: richard, Fred Baker; +Cc: bloat I think a definition of terms would be in order. For me: goodput: number of bytes delivered at the receiver to the next upper layer application, per unit of time; throughput: number of bytes sent by the sender, into the network, per unit of time. Thus goodput can be a ratio (delivered bytes on the receiving application vs. data bytes sent by the sender's TCP), but by definition, only a completely loss-less, in-order stream of segments can ever hope of achieving that; any instance of fast recovery, retransmission timeout etc., and the goodput fraction will always be (much) less than 100%. (However, fringe effects like ssthresh reset for idle connections won't influence that fraction at all, but may lower the absolute values). Charging for volume without considering the goodput fraction is like overpaying - if the plumbing worked properly, you (end customer, small/medium ISP) would get charged for the real work you demanded of the network (data bytes delivered to a receiving application). Since the plumbing is broken, you get charged for the brokenness also (because only absolute data volume is counted), giving less than zero incentive to those who could fix the plumbing to do it. Exposing this brokenness is one of the nice properties of CONEX - upstream ISPs can be graded by the congestion they cause (or are willing to tolerate), and customers are empowered to make a conscious choice to use an ISP which may charge more (say 2%) per volume of data, but where the goodput fraction is at least a similar number of percentage points better... I.e. by properly tuning their AQM schemes. Best regards, Richard ----- Original Message ----- From: "richard" <richard@pacdat.net> To: "Fred Baker" <fredbakersba@gmail.com> Cc: <bloat@lists.bufferbloat.net> Sent: Friday, May 06, 2011 5:14 PM Subject: Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat > I'm wondering if we should look at the ratio of throughput to goodput > instead of the absolute numbers. > > Yes, the goodput will be 100% but at what cost in actual throughput? And > at what cost in total bandwidth? > > If every packet takes two attempts then the ratio will be 1/2 - 1 unit > of goodput for two units of throughput (at least up to the choke-point). > This is worst-case, so the ratio is likely to be something better than > that 3/4, 5/6, 99/100 ??? > > Hmmm... maybe inverting the ratio and calling it something flashy (the > bloaty rating???) might give us a lever in the media and with ISPs that > is easier for the math challenged to understand. Higher is worse. > > Putting a number to this will also help those of us trying to get ISPs > to understand that their Usage Based Bilking (UBB) won't address the > real problem which is hidden in this ratio. The fact is, the choke point > for much of this is the home router/firewall - and so that 1/2 ratio > tells me the consumer is getting hosed for a technical problem. > > richard > > On Thu, 2011-05-05 at 21:18 -0700, Fred Baker wrote: >> There are a couple of ways to approach this, and they depend on your >> network model.
>> >> In general, if you assume that there is one bottleneck, losses occur in >> the queue at the bottleneck, >> and are each retransmitted exactly once (not necessary, but helps), >> goodput should approximate 100% >> regardless of the queue depth. Why? Because every packet transits the >> bottleneck once - if it is >> dropped at the bottleneck, the retransmission transits the bottleneck. So >> you are using exactly >> the capacity of the bottleneck. >> >> the value of a shallow queue is to reduce RTT, not to increase or >> decrease goodput. cwnd can become >> too small, however; if it is possible to set cwnd to N without increasing >> queuing delay, and cwnd is >> less than N, you're not maximizing throughput. When cwnd grows above N, >> it merely increases queuing >> delay, and therefore bufferbloat. >> >> If there are two bottlenecks in series, you have some probability that a >> packet transits one >> bottleneck and doesn't transit the other. In that case, there is probably >> an analytical way >> to describe the behavior, but it depends on a lot of factors including >> distributions of competing >> traffic. There are a number of other possibilities; imagine that you >> drop a packet, there is a >> sack, you retransmit it, the ack is lost, and meanwhile there is another >> loss. You could easily >> retransmit the retransmission unnecessarily, which reduces goodput. The >> list of silly possibilities >> goes on for a while, and we have to assume that each has some >> probability of happening in the wild. >> > snip... > > richard > > -- > Richard C. Pitt Pacific Data Capture > rcpitt@pacdat.net 604-644-9265 > http://digital-rag.com www.pacdat.net > PGP Fingerprint: FCEF 167D 151B 64C4 3333 57F0 4F18 AF98 9F59 DD73 > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 66+ messages in thread
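A tiny worked illustration of the two viewpoints in this subthread, with made-up loss figures: counted at the sender, retransmitted bytes inflate throughput relative to goodput, while counted at the bottleneck egress every surviving byte is useful data, so the fraction stays near 1 as long as each drop is retransmitted exactly once, which is Fred's single-bottleneck argument.

    # Assumes a fraction `loss` of packets needs exactly one retransmission.
    for loss in (0.001, 0.01, 0.1, 0.5, 1.0):
        sender_fraction = 1.0 / (1.0 + loss)      # each lost packet is sent twice
        print("loss %5.1f%%  sender-side goodput/throughput ~ %.3f  "
              "bottleneck-egress fraction ~ 1.0" % (loss * 100, sender_fraction))
    # loss = 100% (richard's worst case) gives the 1/2 ratio he describes.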
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-06 4:18 ` [Bloat] Goodput fraction w/ AQM vs bufferbloat Fred Baker 2011-05-06 15:14 ` richard @ 2011-05-08 12:34 ` Richard Scheffenegger 2011-05-09 3:07 ` Fred Baker 1 sibling, 1 reply; 66+ messages in thread From: Richard Scheffenegger @ 2011-05-08 12:34 UTC (permalink / raw) To: Fred Baker, Jim Gettys; +Cc: bloat Hi Fred, Goodput can really only be measured at the sender; by definition, any retransmitted packet will reduce goodput vs throughput; in your example, where each segment is retransmitted once, goodput would be - at most - 0.5, not 1.0... IMHO defining the data volume after the bottleneck by itself as goodput is also a bit short-sighted, because a good fraction of that data may still be discarded by TCP for numerous reasons, ultimately (ie, legacy go-back-n RTO recovery by the sender)... Measuring at the receiver (or in-path network) side, on a SACK enabled session, will miss all the instances where the last (or a number of segments running up to and including the last) segment was lost, or where a retransmitted segment was lost twice. The former can be approximated by checking the RTOs (which would already require some heuristic to come up with a good approximation of what the sender's RTO timeout is likely to be - the IETF RFC prescribed 1 sec minRTO is virtually never used). The latter, where retransmitted segments are also lost, you can only infer indirectly about the sender's behavior from a receiver-side (or in-path) trace, again because lost retransmission detection is done by one stack (Linux), but not by the others, and RTOs can again not be evaded under all circumstances. But back to my original question: when looking at modern TCP stacks, with TSO, if the bufferbloat allows the sender's cwnd to grow beyond thresholds which allow the aggressive use of TSO (64kB or even 256kB of data allowed in the sender's cwnd), the effective sending rate of such a burst will be wirespeed (no interleaving segments of other sessions). As pointed out in other mails to this thread, if the bottleneck then has 1/10th the capacity of the sender's wire (and is potentially shared among multiple senders), at least 90% of all the sent data of such a TSO segment train will be dropped in a single burst of loss... With proper AQM, and some (single segment) loss earlier, cwnd may never grow to trigger TSO in that way, and the goodput (1 segment out of 64kB data, vs. 58kB out of 64kB data) is obviously shifted extremely to the scenario with AQM... So, qualitatively, an ISP with proper AQM should be able to have better goodput (downloads from upstream or uploads to upstream ISP); however, pricing is typically done on data volume exchanged - if goodput is lower, a correspondingly higher volume is necessary to achieve the same "real" data exchange. However, the next question becomes how to quantify this at large scale - if the monetary difference is, say, in the vicinity of 2-3% saved (average internet loss ratio), that accumulates to huge sums for small / medium ISPs (which get charged more per volume than large ISPs). If the quantitative difference is only 0.02-0.05%, say, then the incentive of enabling AQMs in small ISPs is not really there in monetary terms (and these ISPs would have to be motivated by other, typically much less strong incentives).
Best regards, Richard ----- Original Message ----- From: "Fred Baker" <fredbakersba@gmail.com> To: "Jim Gettys" <jg@freedesktop.org> Cc: <bloat@lists.bufferbloat.net> Sent: Friday, May 06, 2011 6:18 AM Subject: Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat > There are a couple of ways to approach this, and they depend on your > network model. > > In general, if you assume that there is one bottleneck, losses occur in > the queue at the bottleneck, and are each retransmitted exactly once (not > necessary, but helps), goodput should approximate 100% regardless of the > queue depth. Why? Because every packet transits the bottleneck once - if > it is dropped at the bottleneck, the retransmission transits the > bottleneck. So you are using exactly the capacity of the bottleneck. > > the value of a shallow queue is to reduce RTT, not to increase or decrease > goodput. cwnd can become too small, however; if it is possible to set cwnd > to N without increasing queuing delay, and cwnd is less than N, you're not > maximizing throughput. When cwnd grows above N, it merely increases > queuing delay, and therefore bufferbloat. > > If there are two bottlenecks in series, you have some probability that a > packet transits one bottleneck and doesn't transit the other. In that > case, there is probably an analytical way to describe the behavior, but it > depends on a lot of factors including distributions of competing traffic. > There are a number of other possibilities; imagine that you drop a packet, > there is a sack, you retransmit it, the ack is lost, and meanwhile there > is another loss. You could easily retransmit the retransmission > unnecessarily, which reduces goodput. The list of silly possibilities goes > on for a while, and we have to assume that each has some probability of > happening in the wild. > > > > On May 5, 2011, at 9:01 AM, Jim Gettys wrote: > >> On 04/30/2011 03:18 PM, Richard Scheffenegger wrote: >>> I'm curious, has anyone done some simulations to check if the following >>> qualitative statement holds true, and if, what the quantitative effect >>> is: >>> >>> With bufferbloat, the TCP congestion control reaction is unduely >>> delayed. When it finally happens, the tcp stream is likely facing a >>> "burst loss" event - multiple consecutive packets get dropped. Worse >>> yet, the sender with the lowest RTT across the bottleneck will likely >>> start to retransmit while the (tail-drop) queue is still overflowing. >>> >>> And a lost retransmission means a major setback in bandwidth (except for >>> Linux with bulk transfers and SACK enabled), as the standard (RFC >>> documented) behaviour asks for a RTO (1sec nominally, 200-500 ms >>> typically) to recover such a lost retransmission... >>> >>> The second part (more important as an incentive to the ISPs actually), >>> how does the fraction of goodput vs. throughput change, when AQM schemes >>> are deployed, and TCP CC reacts in a timely manner? Small ISPs have to >>> pay for their upstream volume, regardless if that is "real" work >>> (goodput) or unneccessary retransmissions. >>> >>> When I was at a small cable ISP in switzerland last week, surely enough >>> bufferbloat was readily observable (17ms -> 220ms after 30 sec of a bulk >>> transfer), but at first they had the "not our problem" view, until I >>> started discussing burst loss / retransmissions / goodput vs >>> throughput - with the latest point being a real commercial incentive to >>> them. 
(They promised to check if AQM would be available in the CPE / >>> CMTS, and put latency bounds in their tenders going forward). >>> >> I wish I had a good answer to your very good questions. Simulation would >> be interesting though real daa is more convincing. >> >> I haven't looked in detail at all that many traces to try to get a feel >> for how much bandwidth waste there actually is, and more formal studies >> like Netalyzr, SamKnows, or the Bismark project would be needed to >> quantify the loss on the network as a whole. >> >> I did spend some time last fall with the traces I've taken. In those, >> I've typically been seeing 1-3% packet loss in the main TCP transfers. >> On the wireless trace I took, I saw 9% loss, but whether that is >> bufferbloat induced loss or not, I don't know (the data is out there for >> those who might want to dig). And as you note, the losses are >> concentrated in bursts (probably due to the details of Cubic, so I'm >> told). >> >> I've had anecdotal reports (and some first hand experience) with much >> higher loss rates, for example from Nick Weaver at ICSI; but I believe in >> playing things conservatively with any numbers I quote and I've not >> gotten consistent results when I've tried, so I just report what's in the >> packet captures I did take. >> >> A phenomena that could be occurring is that during congestion avoidance >> (until TCP loses its cookies entirely and probes for a higher operating >> point) that TCP is carefully timing it's packets to keep the buffers >> almost exactly full, so that competing flows (in my case, simple pings) >> are likely to arrive just when there is no buffer space to accept them >> and therefore you see higher losses on them than you would on the single >> flow I've been tracing and getting loss statistics from. >> >> People who want to look into this further would be a great help. >> - Jim >> >> >> _______________________________________________ >> Bloat mailing list >> Bloat@lists.bufferbloat.net >> https://lists.bufferbloat.net/listinfo/bloat > > _______________________________________________ > Bloat mailing list > Bloat@lists.bufferbloat.net > https://lists.bufferbloat.net/listinfo/bloat ^ permalink raw reply [flat|nested] 66+ messages in thread
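A back-of-envelope version of the TSO burst scenario Richard describes, as a sketch with assumed numbers (the burst sizes, the 1:10 drain ratio, and the free queue space are all illustrative): the burst arrives at the bottleneck at line rate, the queue drains at a tenth of that rate, and whatever does not fit into the remaining space is tail-dropped in one go.

    def burst_drop_fraction(burst_bytes, free_queue_bytes, drain_ratio=0.1):
        """Fraction of a line-rate burst tail-dropped at a bottleneck that
        drains at drain_ratio times the arrival rate (fluid approximation)."""
        drained_during_burst = burst_bytes * drain_ratio
        accepted = min(burst_bytes, free_queue_bytes + drained_during_burst)
        return 1.0 - accepted / float(burst_bytes)

    # Bloated, nearly-full queue: only a few packets' worth of space left.
    print(burst_drop_fraction(64 * 1024, free_queue_bytes=3 * 1514))    # ~0.83
    # AQM marking early: cwnd never grows enough for a large TSO burst,
    # and the standing queue stays short.
    print(burst_drop_fraction(16 * 1024, free_queue_bytes=32 * 1024))   # 0.0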
* Re: [Bloat] Goodput fraction w/ AQM vs bufferbloat 2011-05-08 12:34 ` Richard Scheffenegger @ 2011-05-09 3:07 ` Fred Baker 0 siblings, 0 replies; 66+ messages in thread From: Fred Baker @ 2011-05-09 3:07 UTC (permalink / raw) To: Richard Scheffenegger; +Cc: bloat On May 8, 2011, at 5:34 AM, Richard Scheffenegger wrote: > Goodput can really only be measured at the sender; by definition, any retransmitted packet will reduce goodput vs throughput; In your example, where each segment is retransmitted once, goodput would be - at most - 0.5, not 1.0... IMHO defining the data volume after the bottleneck by itself as goodput is also a bit short-sighted, because a good fraction of that data may still be discarded by TCP for numerous reasons, ultimately (ie, legacy go-back-n RTO recovery by the sender)... Actually, I didn't say that every packet was retransmitted once. I said that every dropped packet was retransmitted once. And Goodput will never exceed the bit rate of the bottleneck in the path, apart from compression (which in effect applies a multiplier to the bottleneck bandwidth). > But back to my original question: When looking at modern TCP stacks, with TSO, if the bufferbloat allows the senders cwnd to grow beyond thresholds which allow the aggressive use of TSO (64kB or even 256kB of data allowed in the senders cwnd), the effective sending rate of such a burst will be wirespeed (no interleaving segments of other sessions). As pointed out in other mails to this thread, if the bottleneck has then 1/10th the capacity of the senders wire (and is potentially shared among multiple senders), at least 90% of all the sent data of such a TSO segment train will be dropped in a single burst of loss... With proper AQM, and some (single segment) loss earlier, cwnd may never grow to trigger TSO in that way, and the goodput (1 segment out of 64kB data, vs. 58kB out of 64kB data) is obviously shifted extremely to the scenario with AQM... Again, possibly, but not necessarily. If we have a constrained queue and are using tail drop, it is possible for a single burst sent to a full queue to be entirely lost. The question is, in the course of a file transfer, how many packets are lost. Before you make sweeping statements, I would strongly suggest that you mock up the situation and take a tcpdump. ^ permalink raw reply [flat|nested] 66+ messages in thread
* Re: [Bloat] Jumbo frames and LAN buffers
@ 2011-05-16 18:40 Richard Scheffenegger
0 siblings, 0 replies; 66+ messages in thread
From: Richard Scheffenegger @ 2011-05-16 18:40 UTC (permalink / raw)
To: Kevin Gross, bloat
Also found this:
http://www.stanford.edu/~balaji/papers/QCN.pdf
Jim, you may notice that the congestion feedback probability function looks
just like the basic RED marking function :)
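For readers who have not seen it, the basic RED marking curve being alluded to looks roughly like this (a sketch of the classic form, driven by the averaged queue length; the count-based spreading of marks that full RED adds is omitted):

    def red_mark_probability(avg_q, min_th, max_th, max_p=0.1):
        """Probability of marking/dropping an arriving packet under basic RED."""
        if avg_q < min_th:
            return 0.0
        if avg_q >= max_th:
            return 1.0        # classic (non-gentle) RED marks/drops everything here
        return max_p * (avg_q - min_th) / float(max_th - min_th)

    for q in (2, 5, 10, 15, 20):
        print(q, red_mark_probability(q, min_th=5, max_th=15))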
Regards,
Richard
----- Original Message -----
From: "Richard Scheffenegger" <rscheff@gmx.at>
To: "Kevin Gross" <kevin.gross@avanw.com>; <bloat@lists.bufferbloat.net>
Sent: Monday, May 16, 2011 8:36 PM
Subject: Re: [Bloat] Jumbo frames and LAN buffers
> Kevin,
>
>> My understanding is that 802.1au, "lossless Ethernet", was designed
>> primarily to allow Fibre Channel to be carried over 10 GbE so that SAN
>> and
>> LAN can share a common infrastructure in datacenters. I don't believe
>> anyone
>> intends for it to be enabled for traffic classes carrying TCP.
>
> Well, QCN requires an L2 MAC sender, network and receiver cooperation (thus
> you need fancy "CNA" converged network adapters, to start using it - these
> would be reaction/reflection points; plus the congestion points -
> switches - would need HW support too; nothing one can buy today;
> higher-grade (carrier?) switches may have the reaction/reflection points
> built into them, and could use legacy 802.3x signalling outside the
> 802.1Qau cloud).
>
> The following may be too simplistic
>
> Once the hardware has reaction point support, it classifies traffic, and
> calculates the per flow congestion of the path (with flow really being the
> classification rules by the sender), the intermediates / receiver sample
> the flow and return the congestion back to the sender - and within the
> sender, a token bucket-like rate limiter will adjust the sending rate of
> the appropriate flow(s) to adjust to the observed network conditions.
>
> http://www.stanford.edu/~balaji/presentations/au-prabhakar-qcn-description.pdf
> http://www.ieee802.org/1/files/public/docs2007/au-pan-qcn-details-053007.pdf
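As an aside on the token-bucket style limiter described above, here is a minimal sketch of the general idea; QCN's actual rate controller is considerably more involved (byte counters, separate fast-recovery and active-increase stages), so this is only an illustration of the shape of the mechanism, with an assumed feedback interface.

    import time

    class TokenBucket(object):
        """Minimal token-bucket rate limiter (illustrative only)."""

        def __init__(self, rate_bps, burst_bytes):
            self.rate = rate_bps / 8.0          # refill rate, bytes per second
            self.depth = float(burst_bytes)     # bucket depth
            self.tokens = float(burst_bytes)
            self.last = time.monotonic()

        def allow(self, nbytes):
            now = time.monotonic()
            self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if nbytes <= self.tokens:
                self.tokens -= nbytes
                return True                     # frame may be sent now
            return False                        # hold / queue the frame

        def on_congestion_feedback(self, severity):
            # QCN-like reaction: cut the rate multiplicatively when the
            # congestion point reports back (severity in [0, 1], assumed).
            self.rate *= (1.0 - 0.5 * severity)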
>
> The congestion control loop has a lot of similarities to TCP CC as you
> will note...
>
> Also, I haven't found out how fine-grained the classification is supposed
> to be (per L2 address pair? Group of flows? Which hashing then to use for
> mapping L2 flows into those groups between reaction/congestion/reflection
> points...).
>
>
> Anyway, for the here and now, this is pretty much esoteric stuff not
> relevant in this context :)
>
> Best regards,
> Richard
>
> ----- Original Message -----
> From: "Kevin Gross" <kevin.gross@avanw.com>
> To: <bloat@lists.bufferbloat.net>
> Sent: Monday, May 16, 2011 3:15 PM
> Subject: Re: [Bloat] Jumbo frames and LAN buffers
>
>
>> All the stand-alone switches I've looked at recently either do not
>> support
>> 802.3x or support it in the (desirable) manner described in the last
>> paragraph of the linked blog post. I don't believe Ethernet flow control
>> is
>> a factor in current LANs. I'd be interested to know the specifics if
>> anyone
>> sees it differently.
>>
>> My understanding is that 802.1au, "lossless Ethernet", was designed
>> primarily to allow Fibre Channel to be carried over 10 GbE so that SAN
>> and
>> LAN can share a common infrastructure in datacenters. I don't believe
>> anyone
>> intends for it to be enabled for traffic classes carrying TCP.
>>
>> Kevin Gross
>>
>> -----Original Message-----
>> From: bloat-bounces@lists.bufferbloat.net
>> [mailto:bloat-bounces@lists.bufferbloat.net] On Behalf Of Jim Gettys
>> Sent: Monday, May 16, 2011 5:24 AM
>> To: bloat@lists.bufferbloat.net
>> Subject: Re: [Bloat] Jumbo frames and LAN buffers
>>
>> Not necessarily out of knowledge or desire (since it isn't usually
>> controllable in the small switches you buy for home). It can cause
>> trouble even in small environments as your house.
>>
>> http://virtualthreads.blogspot.com/2006/02/beware-ethernet-flow-control.html
>>
>> I know I'm at least three consumer switches deep, and it's not by choice.
>> - Jim
>>
>>
>> _______________________________________________
>> Bloat mailing list
>> Bloat@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/bloat
>>
>
^ permalink raw reply [flat|nested] 66+ messages in thread