From: Collier-Brown, David (LNG-CAN)
Date: 2011-09-14 15:17 UTC
To: bloat
Cc: Collier-Brown, David (LNG-CAN), davecb
Subject: [Bloat] Hmmn, this is in part a capacity planning/management problem.

I was reading the latest article in LWN, and commented there, but part
of the comment may be relevant to the list...

-- reply to mlankhorst (subscriber, #52260) --

Changing the subject slightly, there's a subtle, underlying problem in
that when developing products and protocols, we tend to work with what's
easy, not what's important.

We work with the bandwidth/delay product because it's what we needed in
the short run, and we probably couldn't predict we'd need something more
at the time. We work with buffer sizes because that's dead easy.

What we need instead is to work in the delay, latency and/or service
time of the various components. It's easy to deal with performance
problems that are stated in time units and are fixed by varying the
times things take. It's insanely hard to deal with performance problems
when all we know is a volume in bytes. It's a bit like measuring the
performance of large versus small cargo containers when you don't know
if they're on a truck, a train or a ship!

If you expose any time-based metrics or tuneables in your investigation,
please highlight them. Anything that looks like delay or latency would
be seriously cool.

One needs very little to analyze this class of problems. Knowing the
service time of a packet, the number of packets, and the time between
packets is sufficient to build a tiny little mathematical model of the
thing you measured. From the model you can then predict what happens
when you improve or disimprove the system. More information allows for
more predictive models, of course, and eventually to my mathie friends
becoming completely unintelligible (;-))

--dave (davecb@spamcop.net) c-b
--
As you might guess, I'm a capacity planner, and might be able to help a
bit on the modeling side. Besides, I'm looking for a networking example
for my next book (;-))

--dave
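A minimal sketch of the kind of "tiny little mathematical model" described
above, assuming a single-queue, M/M/1-style approximation. The function name
and the example numbers are illustrative, not measurements from this thread:

    # Tiny capacity-planning model: given a per-packet service time and the
    # observed time between packets, estimate utilization and the time a
    # packet spends in the system. Assumes a single M/M/1-style queue.

    def tiny_model(service_time_s, interarrival_s):
        arrival_rate = 1.0 / interarrival_s        # packets per second offered
        service_rate = 1.0 / service_time_s        # packets per second the link can serve
        utilization = arrival_rate / service_rate  # rho; must stay below 1.0
        if utilization >= 1.0:
            return utilization, float("inf")       # overloaded: delay grows without bound
        # M/M/1 residence time: service time divided by the idle fraction.
        residence_s = service_time_s / (1.0 - utilization)
        return utilization, residence_s

    # "Improve or disimprove the system" is then just a matter of re-running
    # the model with different inputs:
    print(tiny_model(0.001, 0.0015))   # ~67% busy, ~3 ms in system
    print(tiny_model(0.001, 0.0011))   # ~91% busy, ~11 ms in system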
From: Jim Gettys
Date: 2011-09-20 16:14 UTC
To: Collier-Brown, David (LNG-CAN)
Cc: davecb, bloat
Subject: Re: [Bloat] Hmmn, this is in part a capacity planning/management problem.

On 09/14/2011 11:17 AM, Collier-Brown, David (LNG-CAN) wrote:
> I was reading the latest article in LWN, and commented there, but part
> of the comment may be relevant to the list...
>
> -- reply to mlankhorst (subscriber, #52260) --
> Changing the subject slightly, there's a subtle, underlying problem in
> that when developing products and protocols, we tend to work with what's
> easy, not what's important.
>
> We work with the bandwidth/delay product because it's what we needed in
> the short run, and we probably couldn't predict we'd need something more
> at the time. We work with buffer sizes because that's dead easy.
>
> What we need instead is to work in the delay, latency and/or service
> time of the various components. It's easy to deal with performance
> problems that are stated in time units and are fixed by varying the
> times things take. It's insanely hard to deal with performance problems
> when all we know is a volume in bytes. It's a bit like measuring the
> performance of large versus small cargo containers when you don't know
> if they're on a truck, a train or a ship!

You are exactly correct, and I certainly said so at the LPC wireless
meeting and on the Linux networking mailing lists recently. Bytes or
packets are not useful metrics on wireless at all, and naive even on
wired ethernet (due to contention and switches doing flow control).
Bytes are not interchangeable: a broadcast/multicast packet in 802.11
may cost as much as two orders of magnitude more than a unicast packet,
and how fast you can transmit any packet depends on the current
environment, which is also dynamically changing.

My naive view (before August) was that if you couldn't send a packet,
you should drop the rate; but this is usually wrong in practice. You
will take less time to transmit a packet at a higher rate, despite
multiple transmission attempts, as long as most packets get through
intact (and a packet sent at a lower rate spends longer on the air, so
noise gets a bigger shot at damaging it).

802.11n aggregation, in fact, allows you to send multiple packets in a
frame, and tells you which packets got through intact with a bitmask.
That raises the question of retransmitting the damaged packets without
doing massive reordering when there is packet loss.

Thankfully, Linux has a module called Minstrel, which dynamically
monitors the likely cost of sending a packet at various rates, so it
can make a much better than random guess about which transmission
strategy is best.

Andrew McGregor (one of the Minstrel authors) worked with Felix Fietkau
(the ath9k driver maintainer) and Dave Taht to greatly improve the
aggregation in the driver since the Quebec City IETF (this is part of
why things have been quiet, along with vacations). Buffering is now
about 1/3 what it was (worst case), and the infinite retry problems are
cured (we were observing ICMP packets taking 1.6 seconds to get across
the air under some circumstances). Our thanks to them. This code is in
RC6; I'm not sure it is in debloat-testing yet; I should ping John
Linville.
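To illustrate the block-ack bitmask mentioned above, here is a rough Python
sketch of picking out the sub-frames of an aggregate that were not
acknowledged, so only those are resent and in their original order. The
function and field names are invented for illustration; this is not the
actual mac80211 or ath9k logic:

    # Illustrative only: find the sub-frames of an 802.11n aggregate that were
    # NOT acknowledged in the block-ack bitmask, preserving their order so the
    # retransmission does not reorder the flow.

    def frames_to_retransmit(sent_seqs, ack_bitmap, start_seq):
        """sent_seqs: sequence numbers sent in this aggregate, in order.
        ack_bitmap: integer; bit i set means (start_seq + i) arrived intact."""
        lost = []
        for seq in sent_seqs:
            offset = (seq - start_seq) % 4096      # 802.11 sequence numbers wrap at 4096
            if not (ack_bitmap >> offset) & 1:
                lost.append(seq)
        return lost                                # resend these first, in order

    # Eight sub-frames sent; the bitmap says the 3rd and 6th were damaged.
    print(frames_to_retransmit(list(range(100, 108)), 0b11011011, 100))   # [102, 105]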
Using Minstrel, and estimating contention on the network, we may be able
to get to the point of a proper metric and short-term prediction of how
fast data may flow in an 802.11 network. That is going to be necessary
for any AQM to work properly.

> If you expose any time-based metrics or tuneables in your investigation,
> please highlight them. Anything that looks like delay or latency would
> be seriously cool.
>
> One needs very little to analyze this class of problems. Knowing the
> service time of a packet, the number of packets, and the time between
> packets is sufficient to build a tiny little mathematical model of the
> thing you measured. From the model you can then predict what happens
> when you improve or disimprove the system. More information allows for
> more predictive models, of course, and eventually to my mathie friends
> becoming completely unintelligible (;-))

Again, a look at Minstrel is a good thing to do. And you have to keep in
mind that the time to transmit something is constantly varying. You
can't compute a time to transmit something once and expect it to be
correct seconds later. You can't fool mother nature, or she bites back.
                                 - Jim

> --dave (davecb@spamcop.net) c-b
> --
> As you might guess, I'm a capacity planner, and might be able to help a
> bit on the modeling side. Besides, I'm looking for a networking example
> for my next book (;-))
>
> --dave
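A toy, Minstrel-flavoured calculation makes the rate argument concrete: track
a per-rate success probability with an exponentially weighted moving average
and prefer the rate with the least expected airtime per delivered packet, so
a high rate can win even when it needs retries. The rates, probabilities, and
weights below are made up for illustration; the real Minstrel code in
mac80211 is considerably more elaborate:

    # Toy, Minstrel-flavoured rate choice: prefer the rate with the least
    # expected airtime per *delivered* packet, and keep each rate's success
    # probability fresh with an exponentially weighted moving average, since
    # the channel keeps changing. Numbers here are invented.

    PACKET_BITS = 1500 * 8
    EWMA_WEIGHT = 0.75                  # fraction of the old estimate to keep

    prob = {6.0: 0.95, 24.0: 0.85, 54.0: 0.60}   # P(success) per rate in Mbit/s

    def update(rate, delivered, attempts):
        """Fold the latest interval's delivery ratio into the running estimate."""
        if attempts:
            prob[rate] = EWMA_WEIGHT * prob[rate] + (1 - EWMA_WEIGHT) * (delivered / attempts)

    def best_rate():
        # Expected airtime per delivered packet = tx_time / P(success), so a
        # high rate can win even though it needs more transmission attempts.
        def airtime_per_delivery(rate):
            return (PACKET_BITS / (rate * 1e6)) / max(prob[rate], 1e-6)
        return min(prob, key=airtime_per_delivery)

    print(best_rate())                       # 54.0: 0.22 ms / 0.60 beats 2 ms / 0.95 at 6 Mbit/s
    update(54.0, delivered=0, attempts=10)   # a bad interval at the top rate...
    update(54.0, delivered=0, attempts=10)   # ...and another; the estimate decays fast
    print(best_rate())                       # 24.0 now has the lower airtime per delivery

On real hardware the per-rate transmit time also includes preamble, ACKs, and
contention for the air, which is part of why the estimate has to be kept
continuously up to date rather than computed once.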
From: Collier-Brown, David (LNG-CAN)
Date: 2011-09-20 16:42 UTC
To: Jim Gettys
Cc: Collier-Brown, David (LNG-CAN), davecb, bloat
Subject: Re: [Bloat] Hmmn, this is in part a capacity planning/management problem.

-----Original Message-----
From: Jim Gettys [mailto:gettysjim@gmail.com] On Behalf Of Jim Gettys
Sent: Tuesday, September 20, 2011 12:14 PM
To: Collier-Brown, David (LNG-CAN)
Cc: bloat@lists.bufferbloat.net; davecb@spamcop.net
Subject: Re: [Bloat] Hmmn, this is in part a capacity planning/management problem.

[Jim Gettys' message quoted in full; trimmed -- see above in the thread]

> Again, a look at Minstrel is a good thing to do. And you have to keep
> in mind that the time to transmit something is constantly varying. You
> can't compute a time to transmit something once and expect it to be
> correct seconds later. You can't fool mother nature, or she bites back.

Indeed: even ordinary IP is hard for the serious modelers, because the
service time
 - bounces about constantly due to random events,
 - trends up and down over longer periods (seconds), and
 - originates in trains of packets, not individual packets.

The latter does bad things to the better models, so I have to think in
terms of simple, computationally cheap and fast models, constantly
applied.

In classic-IP terms, I'm probably more interested in delay than the
delay/bandwidth product, as computations using delay are less likely to
take you down the garden path. My garden has a Troll down at the end, so
you really don't want to go there...

Thanks for the pointers; I'm now off to dilute my ignorance of Minstrel
and retransmit-only-the-lost-parts. I think I remember the latter from
AppleTalk.

--dave
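One cheap way to realize the "simple, computationally cheap and fast models,
constantly applied" idea is to smooth the observed per-packet (or per-train)
service time and spacing with EWMAs and recompute the delay prediction on
every observation. The class and parameter names below are invented for
illustration, and the M/M/1-style formula is only a first approximation:

    # A deliberately cheap model, re-applied on every observation: smooth the
    # per-packet (or per-train) service time and spacing with EWMAs, then
    # predict delay from the current utilization. Trains of packets are
    # handled by feeding the model one train's averages at a time.

    class RollingDelayModel:
        def __init__(self, weight=0.9):
            self.weight = weight        # how much history to keep
            self.service_s = None       # smoothed per-packet service time
            self.gap_s = None           # smoothed inter-arrival (or inter-train) time

        def _ewma(self, old, new):
            return new if old is None else self.weight * old + (1 - self.weight) * new

        def observe(self, service_s, gap_s):
            """Feed one observation; return the current delay prediction in seconds."""
            self.service_s = self._ewma(self.service_s, service_s)
            self.gap_s = self._ewma(self.gap_s, gap_s)
            rho = self.service_s / self.gap_s
            if rho >= 1.0:
                return float("inf")     # overloaded: the queue grows without bound
            return self.service_s / (1.0 - rho)

    model = RollingDelayModel()
    for service, gap in [(0.001, 0.0020), (0.001, 0.0015), (0.003, 0.0015)]:
        print(model.observe(service, gap))   # the prediction tracks the shifting traffic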