From: Collier-Brown, David (LNG-CAN)
Date: 2011-09-14 15:17 UTC
To: bloat
Cc: Collier-Brown, David (LNG-CAN), davecb
Subject: [Bloat] Hmmn, this is in part a capacity planning/management problem.

I was reading the latest article in LWN, and commented there, but part
of the comment may be relevant to the list...

-- reply to mlankhorst (subscriber, #52260) --

Changing the subject slightly, there's a subtle, underlying problem in
that when developing products and protocols, we tend to work with what's
easy, not what's important.

We work with the bandwidth/delay product because it's what we needed in
the short run, and we probably couldn't predict we'd need something more
at the time. We work with buffer sizes because that's dead easy.

What we need instead is to work in the delay, latency and/or service
time of the various components. It's easy to deal with performance
problems that are stated in time units and are fixed by varying the
times things take. It's insanely hard to deal with performance problems
when all we know is a volume in bytes. It's a bit like measuring the
performance of large versus small cargo containers when you don't know
if they're on a truck, a train or a ship!

If you expose any time-based metrics or tuneables in your investigation,
please highlight them. Anything that looks like delay or latency would
be seriously cool.

One needs very little to analyze this class of problems. Knowing the
service time of a packet, the number of packets, and the time between
packets is sufficient to build a tiny little mathematical model of the
thing you measured. From the model you can then predict what happens
when you improve or disimprove the system. More information allows for
more predictive models, of course, and eventually to my mathie friends
becoming completely unintelligible (;-))

--dave (davecb@spamcop.net) c-b
--
As you might guess, I'm a capacity planner, and might be able to help a
bit on the modeling side. Besides, I'm looking for a networking example
for my next book (;-))

--dave
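A minimal sketch of the kind of "tiny little mathematical model" described
above, assuming a single-queue, M/M/1-style approximation. The function name
and the example numbers are illustrative, not measurements from this thread:

    # Tiny capacity-planning model: given a per-packet service time and the
    # observed time between packets, estimate utilization and the time a
    # packet spends in the system. Assumes a single M/M/1-style queue.

    def tiny_model(service_time_s, interarrival_s):
        arrival_rate = 1.0 / interarrival_s        # packets per second offered
        service_rate = 1.0 / service_time_s        # packets per second the link can serve
        utilization = arrival_rate / service_rate  # rho; must stay below 1.0
        if utilization >= 1.0:
            return utilization, float("inf")       # overloaded: delay grows without bound
        # M/M/1 residence time: service time divided by the idle fraction.
        residence_s = service_time_s / (1.0 - utilization)
        return utilization, residence_s

    # "Improve or disimprove the system" is then just a matter of re-running
    # the model with different inputs:
    print(tiny_model(0.001, 0.0015))   # ~67% busy, ~3 ms in system
    print(tiny_model(0.001, 0.0011))   # ~91% busy, ~11 ms in system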
From: Jim Gettys
Date: 2011-09-20 16:14 UTC
To: Collier-Brown, David (LNG-CAN)
Cc: davecb, bloat
Subject: Re: [Bloat] Hmmn, this is in part a capacity planning/management problem.

On 09/14/2011 11:17 AM, Collier-Brown, David (LNG-CAN) wrote:
> I was reading the latest article in LWN, and commented there, but part
> of the comment may be relevant to the list...
>
> -- reply to mlankhorst (subscriber, #52260) --
> Changing the subject slightly, there's a subtle, underlying problem in
> that when developing products and protocols, we tend to work with what's
> easy, not what's important.
>
> We work with the bandwidth/delay product because it's what we needed in
> the short run, and we probably couldn't predict we'd need something more
> at the time. We work with buffer sizes because that's dead easy.
>
> What we need instead is to work in the delay, latency and/or service
> time of the various components. It's easy to deal with performance
> problems that are stated in time units and are fixed by varying the
> times things take. It's insanely hard to deal with performance problems
> when all we know is a volume in bytes. It's a bit like measuring the
> performance of large versus small cargo containers when you don't know
> if they're on a truck, a train or a ship!

You are exactly correct, and I certainly said so at the LPC wireless
meeting and on the Linux networking mailing lists recently. Bytes or
packets are not useful metrics on wireless at all, and naive even on
wired ethernet (due to contention and switches doing flow control).
Bytes are not interchangeable: a broadcast/multicast packet in 802.11
may cost as much as two orders of magnitude more than a unicast packet,
and how fast you can transmit any packet depends on the current
environment, which is also dynamically changing.

My naive view (before August) was that if you couldn't send a packet,
you should drop the rate; but this is usually wrong in practice. You
will take less time to transmit a packet at a higher rate, despite
multiple transmission attempts, as long as most packets get through
intact (and a packet sent at a lower rate spends longer on the air, so
noise gets a bigger shot at damaging it).

802.11n aggregation, in fact, allows you to send multiple packets in a
frame, and tells you which packets got through intact with a bitmask.
That raises the question of retransmitting the damaged packets without
doing massive reordering when there is packet loss.

Thankfully, Linux has a module called Minstrel, which dynamically
monitors the likely cost of sending a packet at various rates, so it
can make a much better than random guess about which transmission
strategy is best.

Andrew McGregor (one of the Minstrel authors) worked with Felix Fietkau
(the ath9k driver maintainer) and Dave Taht to greatly improve the
aggregation in the driver since the Quebec City IETF (this is part of
why things have been quiet, along with vacations). Buffering is now
about 1/3 what it was (worst case), and the infinite retry problems are
cured (we were observing ICMP packets taking 1.6 seconds to get across
the air under some circumstances). Our thanks to them. This code is in
RC6; I'm not sure it is in debloat-testing yet; I should ping John
Linville.
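To illustrate the block-ack bitmask mentioned above, here is a rough Python
sketch of picking out the sub-frames of an aggregate that were not
acknowledged, so only those are resent and in their original order. The
function and field names are invented for illustration; this is not the
actual mac80211 or ath9k logic:

    # Illustrative only: find the sub-frames of an 802.11n aggregate that were
    # NOT acknowledged in the block-ack bitmask, preserving their order so the
    # retransmission does not reorder the flow.

    def frames_to_retransmit(sent_seqs, ack_bitmap, start_seq):
        """sent_seqs: sequence numbers sent in this aggregate, in order.
        ack_bitmap: integer; bit i set means (start_seq + i) arrived intact."""
        lost = []
        for seq in sent_seqs:
            offset = (seq - start_seq) % 4096      # 802.11 sequence numbers wrap at 4096
            if not (ack_bitmap >> offset) & 1:
                lost.append(seq)
        return lost                                # resend these first, in order

    # Eight sub-frames sent; the bitmap says the 3rd and 6th were damaged.
    print(frames_to_retransmit(list(range(100, 108)), 0b11011011, 100))   # [102, 105]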
Using Minstrel, and estimating contention on the network, we may be able
to get to the point of a proper metric and short-term prediction of how
fast data may flow in an 802.11 network. That is going to be necessary
for any AQM to work properly.

> If you expose any time-based metrics or tuneables in your investigation,
> please highlight them. Anything that looks like delay or latency would
> be seriously cool.
>
> One needs very little to analyze this class of problems. Knowing the
> service time of a packet, the number of packets, and the time between
> packets is sufficient to build a tiny little mathematical model of the
> thing you measured. From the model you can then predict what happens
> when you improve or disimprove the system. More information allows for
> more predictive models, of course, and eventually to my mathie friends
> becoming completely unintelligible (;-))

Again, a look at Minstrel is a good thing to do. And you have to keep in
mind that the time to transmit something is constantly varying. You
can't compute a time to transmit something once and expect it to be
correct seconds later. You can't fool mother nature, or she bites back.
                                 - Jim

> --dave (davecb@spamcop.net) c-b
> --
> As you might guess, I'm a capacity planner, and might be able to help a
> bit on the modeling side. Besides, I'm looking for a networking example
> for my next book (;-))
>
> --dave
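A toy, Minstrel-flavoured calculation makes the rate argument concrete: track
a per-rate success probability with an exponentially weighted moving average
and prefer the rate with the least expected airtime per delivered packet, so
a high rate can win even when it needs retries. The rates, probabilities, and
weights below are made up for illustration; the real Minstrel code in
mac80211 is considerably more elaborate:

    # Toy, Minstrel-flavoured rate choice: prefer the rate with the least
    # expected airtime per *delivered* packet, and keep each rate's success
    # probability fresh with an exponentially weighted moving average, since
    # the channel keeps changing. Numbers here are invented.

    PACKET_BITS = 1500 * 8
    EWMA_WEIGHT = 0.75                  # fraction of the old estimate to keep

    prob = {6.0: 0.95, 24.0: 0.85, 54.0: 0.60}   # P(success) per rate in Mbit/s

    def update(rate, delivered, attempts):
        """Fold the latest interval's delivery ratio into the running estimate."""
        if attempts:
            prob[rate] = EWMA_WEIGHT * prob[rate] + (1 - EWMA_WEIGHT) * (delivered / attempts)

    def best_rate():
        # Expected airtime per delivered packet = tx_time / P(success), so a
        # high rate can win even though it needs more transmission attempts.
        def airtime_per_delivery(rate):
            return (PACKET_BITS / (rate * 1e6)) / max(prob[rate], 1e-6)
        return min(prob, key=airtime_per_delivery)

    print(best_rate())                       # 54.0: 0.22 ms / 0.60 beats 2 ms / 0.95 at 6 Mbit/s
    update(54.0, delivered=0, attempts=10)   # a bad interval at the top rate...
    update(54.0, delivered=0, attempts=10)   # ...and another; the estimate decays fast
    print(best_rate())                       # 24.0 now has the lower airtime per delivery

On real hardware the per-rate transmit time also includes preamble, ACKs, and
contention for the air, which is part of why the estimate has to be kept
continuously up to date rather than computed once.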
From: Collier-Brown, David (LNG-CAN)
Date: 2011-09-20 16:42 UTC
To: Jim Gettys
Cc: Collier-Brown, David (LNG-CAN), davecb, bloat
Subject: Re: [Bloat] Hmmn, this is in part a capacity planning/management problem.

-----Original Message-----
From: Jim Gettys [mailto:gettysjim@gmail.com] On Behalf Of Jim Gettys
Sent: Tuesday, September 20, 2011 12:14 PM
To: Collier-Brown, David (LNG-CAN)
Cc: bloat@lists.bufferbloat.net; davecb@spamcop.net
Subject: Re: [Bloat] Hmmn, this is in part a capacity planning/management problem.

[Jim Gettys' message quoted in full; trimmed -- see above in the thread]

> Again, a look at Minstrel is a good thing to do. And you have to keep
> in mind that the time to transmit something is constantly varying. You
> can't compute a time to transmit something once and expect it to be
> correct seconds later. You can't fool mother nature, or she bites back.

Indeed: even ordinary IP is hard for the serious modelers, because the
service time
 - bounces about constantly due to random events,
 - trends up and down over longer periods (seconds), and
 - originates in trains of packets, not individual packets.

The latter does bad things to the better models, so I have to think in
terms of simple, computationally cheap and fast models, constantly
applied.

In classic-IP terms, I'm probably more interested in delay than the
delay/bandwidth product, as computations using delay are less likely to
take you down the garden path. My garden has a Troll down at the end, so
you really don't want to go there...

Thanks for the pointers; I'm now off to dilute my ignorance of Minstrel
and retransmit-only-the-lost-parts. I think I remember the latter from
AppleTalk.

--dave
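One cheap way to realize the "simple, computationally cheap and fast models,
constantly applied" idea is to smooth the observed per-packet (or per-train)
service time and spacing with EWMAs and recompute the delay prediction on
every observation. The class and parameter names below are invented for
illustration, and the M/M/1-style formula is only a first approximation:

    # A deliberately cheap model, re-applied on every observation: smooth the
    # per-packet (or per-train) service time and spacing with EWMAs, then
    # predict delay from the current utilization. Trains of packets are
    # handled by feeding the model one train's averages at a time.

    class RollingDelayModel:
        def __init__(self, weight=0.9):
            self.weight = weight        # how much history to keep
            self.service_s = None       # smoothed per-packet service time
            self.gap_s = None           # smoothed inter-arrival (or inter-train) time

        def _ewma(self, old, new):
            return new if old is None else self.weight * old + (1 - self.weight) * new

        def observe(self, service_s, gap_s):
            """Feed one observation; return the current delay prediction in seconds."""
            self.service_s = self._ewma(self.service_s, service_s)
            self.gap_s = self._ewma(self.gap_s, gap_s)
            rho = self.service_s / self.gap_s
            if rho >= 1.0:
                return float("inf")     # overloaded: the queue grows without bound
            return self.service_s / (1.0 - rho)

    model = RollingDelayModel()
    for service, gap in [(0.001, 0.0020), (0.001, 0.0015), (0.003, 0.0015)]:
        print(model.observe(service, gap))   # the prediction tracks the shifting traffic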