General list for discussing Bufferbloat
From: Jonathan Morton <chromatix99@gmail.com>
To: Fred Baker <fred@cisco.com>
Cc: bloat@lists.bufferbloat.net
Subject: Re: [Bloat] Jumbo frames and LAN buffers (was: RE:  Burst Loss)
Date: Mon, 16 May 2011 03:31:41 +0300
Message-ID: <2EEFB9D5-E9CC-4612-8D91-F6B382E3C2FB@gmail.com>
In-Reply-To: <5946BA6B-4E00-43AF-A8A2-17FB3769F37B@cisco.com>


On 15 May, 2011, at 11:49 pm, Fred Baker wrote:

> 
> On May 15, 2011, at 11:28 AM, Jonathan Morton wrote:
>> The fundamental thing is that the sender must be able to know when sent frames can be flushed from the buffer because they don't need to be retransmitted.  So if there's a NACK, there must also be an ACK - at which point the ACK serves the purpose of the NACK, as it does in TCP.  The only alternative is a wall-time TTL, which is doable on single hops but requires careful design.
> 
> To a point. NORM holds a frame for possible retransmission for a stated period of time, and if retransmission isn't requested in that interval forgets it. So the ack isn't actually necessary; what is necessary is that the retention interval be long enough that a nack has a high probability of succeeding in getting the message through.

Okay, so because it can fall back to TCP's retransmit, the retention requirements can be relaxed.
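
To make the timed-retention idea concrete, here is a minimal sketch of a sender-side buffer that forgets frames after a fixed interval and answers a NACK only while the frame is still held. This is an illustration of the scheme as described above, not NORM's actual implementation; the class name, method names and the 0.5 s interval are all assumptions of mine.

```python
from collections import OrderedDict


class RetentionBuffer:
    """Hypothetical sender-side buffer: hold frames for a fixed interval, no ACKs."""

    def __init__(self, retention_s):
        self.retention_s = retention_s
        self._frames = OrderedDict()  # seq -> (sent_at, payload), oldest first

    def sent(self, seq, payload, now):
        """Record a frame at the moment it is transmitted."""
        self._frames[seq] = (now, payload)

    def expire(self, now):
        """Forget every frame whose retention interval has elapsed."""
        while self._frames:
            seq, (sent_at, _) = next(iter(self._frames.items()))
            if now - sent_at < self.retention_s:
                break  # frames are in send order, so the rest are newer
            del self._frames[seq]

    def nack(self, seq, now):
        """Return the payload to retransmit, or None if it was already forgotten."""
        self.expire(now)
        entry = self._frames.get(seq)
        return entry[1] if entry is not None else None


buf = RetentionBuffer(retention_s=0.5)  # assumed interval, for illustration only
buf.sent(1, b"frame-1", now=0.0)
early = buf.nack(1, now=0.2)  # NACK arrives inside the retention interval
late = buf.nack(1, now=0.6)   # NACK arrives after the frame was forgotten
```

A NACK that arrives too late gets nothing back, which is exactly the case where recovery has to fall through to the end-to-end retransmit.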

>> ...recent versions of Ethernet *do* support a throttling feedback mechanism, and this can and should be exploited to tell the edge host or router that ECN *might* be needed.  Also, with throttling feedback throughout the LAN, the Ethernet can for practical purposes be treated as almost-reliable.  This is *better* in terms of packet loss than ARQ or NACK, although if the Ethernet's buffers are large, it will still increase delay.  (With small buffers, it will just decrease throughput to the capacity, which is fine.)
> 
> It increases the delay anyway. It just pushes the retention buffer to another place. What do you think the packet is doing during the "don't transmit" interval?

Most packets delayed by Ethernet throttling would, with small buffers, end up waiting in the sending host (or router).  They thus spend more time in a potentially active queue instead of in a dumb one.  But even if the host queue is dumb, the overall delay is no worse than with the larger Ethernet buffers.

> Throughput never exceeds capacity. If I have a 10 Gbps link, I will never get more than 10 Gbps through it. Buffer fill rate is statistically predictable. With small buffers, the fill rate achieves the top sooner. They increase the probability that the buffers are full, which is to say the drop probability. Which puts us to an end to end retransmission, which is the worst case of what you were worried about.

Let's suppose someone has generously provisioned an office with GigE throughout, using a two-level hierarchy of switches.  Some dumb schmuck then schedules every single computer to run its backups (to a single fileserver) at the same time.  That's, say, 100 computers all competing for one GigE link to the fileserver.  If the switches are fair, each computer should get 10Mbps - that's the capacity.

With throttling, each computer sees the link closed 99% of the time.  It can send at link rate for the remaining 1% of the time.  On medium timescales, that looks like a 10Mbps bottleneck at the first link.  So the throughput on that link equals the capacity, and hopefully the goodput is also thus.  The only queue that is likely to overflow is the one on the sending computer, and one would hope there is enough feedback in a host's own TCP/IP stack to prevent that.
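
The arithmetic behind those figures can be checked with a few lines. This is only a sketch of the fair-share calculation, assuming perfectly fair switches and ignoring framing overhead; the function name is hypothetical.

```python
def fair_share(link_rate_bps, senders):
    """Per-sender rate and duty cycle when a fair switch shares one link."""
    share_bps = link_rate_bps / senders
    # Fraction of time each sender can transmit at full link rate.
    duty_cycle = share_bps / link_rate_bps
    return share_bps, duty_cycle


# 100 hosts competing for one GigE link to the fileserver.
share, duty = fair_share(1_000_000_000, 100)
print(f"{share / 1e6:.0f} Mbps per host, link open {duty:.0%} of the time")
```

For the office example this gives 10 Mbps per host with the link open 1% of the time, matching the numbers above.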

Without throttling but with ARQ, NACK or whatever you want to call it, the host has no signal to tell it to slow down - so the throughput on the edge link is more than 10Mbps (but the goodput will be less).  The buffer in the outer switch fills up - no matter how big or small it is - and starts dropping packets.  The switch then won't ask for retransmission of packets it's just dropped, because it has nowhere to put them.  The same process then repeats at the inner switch.  Finally, the server sees the missing packets, and asks for the retransmission - but these requests have to be switched all the way back to the clients, because the missing packets aren't in the switches' buffers.  It's therefore no better than a TCP SACK retransmission.
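
A toy model shows the difference between the two cases. It is a deliberately crude sketch, not a model of real Ethernet PAUSE frames: time is divided into ticks, the switch forwards one frame per tick to the fileserver, every host offers one frame per tick, and the 64-frame buffer size is an arbitrary assumption. All names are hypothetical.

```python
def simulate(senders, buffer_frames, ticks, throttle):
    """Toy single-switch model: one frame leaves per tick, `senders` frames offered."""
    queue = sent = delivered = dropped = 0
    for _ in range(ticks):
        offered = senders                       # every host wants to send at link rate
        room = buffer_frames - queue
        admitted = min(offered, room)
        if throttle:
            sent += admitted                    # paused hosts keep the rest in their own queue
        else:
            sent += offered                     # hosts get no signal, so they keep sending
            dropped += offered - admitted       # the switch has nowhere to put the excess
        queue += admitted
        if queue:
            queue -= 1                          # egress link drains one frame per tick
            delivered += 1
    return {"sent": sent, "delivered": delivered, "dropped": dropped}


without_throttle = simulate(senders=100, buffer_frames=64, ticks=1000, throttle=False)
with_throttle = simulate(senders=100, buffer_frames=64, ticks=1000, throttle=True)
print("no throttling:", without_throttle)
print("throttling:   ", with_throttle)
```

Both runs deliver the same number of frames to the fileserver, because the bottleneck link is the limit either way. The difference is on the edge links: without throttling nearly everything sent is dropped at the switch, whereas with throttling the edge throughput settles at the bottleneck capacity and nothing is lost.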

So there you have a classic congested network scenario in which throttling solves the problem, but link-level retransmission can't.

Where ARQ and/or NACK come in handy is where the link itself is unreliable, such as on WLANs (hence the use in amateur radio) and last-mile links.  In that case, the reason for the packet loss is not a full receive buffer, so asking for a retransmission is not inherently self-defeating.

> I'm not going to argue against letting retransmission go end to end; it's an endless debate. I'll simply note that several link layers, including but not limited to those you mention, find that applications using them work better if there is a high probability of retransmission in an interval on the order of the link RTT as opposed to the end to end RTT. You brought up data centers (aka variable delays in LAN networks); those have been heavily the province of Fibre Channel, which is a link layer protocol with retransmission. Think about it.

What I'd like to see is a complete absence of need for retransmission on a properly built wired network.  Obviously the capability still needs to be there to cope with the parts that aren't properly built or aren't wired, but TCP can do that. Throttling (in the form of Ethernet PAUSE) is simply the third possible method of signalling congestion in the network, alongside delay and loss - and it happens to be quite widely deployed already.

 - Jonathan

