[Codel] RFC: Realtime Response Under Load (rrul) test specification

CoDel AQM discussions

* [Codel] RFC: Realtime Response Under Load (rrul) test specification
@ 2012-11-06 12:42 Dave Taht
  2012-11-06 13:42 ` [Codel] [Bloat] " Henrique de Moraes Holschuh
  2012-11-06 18:14 ` [Codel] " Rick Jones
  0 siblings, 2 replies; 8+ messages in thread
From: Dave Taht @ 2012-11-06 12:42 UTC
  To: bloat, codel, cerowrt-devel

I have been working on developing a specification for testing networks
more effectively for various side effects of bufferbloat, notably
gaming and voip performance, and especially web performance.... as
well as a few other things that concerned me, such as IPv6 behavior,
and the effects of packet classification.

A key goal is to be able to measure the quality of the user experience
while a network is otherwise busy, with complex stuff going on in the
background, but with a simple presentation of the results in the end,
in under 60 seconds.

While it's not done yet, it escaped into the wild today, and I might
as well solicit wider opinions on it, sooo... get the spec at:

https://github.com/dtaht/deBloat/blob/master/spec/rrule.doc?raw=true

Portions of the test are being prototyped in the netperf-wrappers repo
on github. The initial results of the rrul test on the several hotel
networks I've tried are "interesting". Example:
http://www.teklibre.com/~d/rrul2_conference.pdf

A major sticking point at the moment is to come up with an equivalent
of the chrome-benchmarks for measuring relative web page performance
with and without a network load, or to merely incorporate some
automated form of that benchmark into the overall test load.

The end goal is to have a complex, comprehensive benchmark of some
core networking issues that produces simple results, whether they are
delivered via a Java tool like ICSI's, via Flash on the web, or on the
command line via something like netperf.

Related resources:

netperf 2.6 or later running on a fairly nearby server
https://github.com/tohojo/netperf-wrapper
python-matplotlib

I look forward to your comments.

-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html


* Re: [Codel] [Bloat] RFC: Realtime Response Under Load (rrul) test specification
  2012-11-06 12:42 [Codel] RFC: Realtime Response Under Load (rrul) test specification Dave Taht
@ 2012-11-06 13:42 ` Henrique de Moraes Holschuh
  2012-11-06 13:56   ` Dave Taht
  2012-11-06 18:14 ` [Codel] " Rick Jones
  1 sibling, 1 reply; 8+ messages in thread
From: Henrique de Moraes Holschuh @ 2012-11-06 13:42 UTC
  To: Dave Taht; +Cc: codel, cerowrt-devel, bloat

On Tue, 06 Nov 2012, Dave Taht wrote:
> I have been working on developing a specification for testing networks
> more effectively for various side effects of bufferbloat, notably
> gaming and voip performance, and especially web performance.... as
> well as a few other things that concerned me, such as IPv6 behavior,
> and the effects of packet classification.

When it is reasonably complete, it would be nice to have it as an
informational or better yet, standards-track IETF RFC.  

IETF RFC non-experimental status allows us to require RRUL testing prior to
service acceptance, and even add it as one of the SLA metrics on public
tenders, which goes a long way into pushing anything into more widespread
usage.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


* Re: [Codel] [Bloat] RFC: Realtime Response Under Load (rrul) test specification
  2012-11-06 13:42 ` [Codel] [Bloat] " Henrique de Moraes Holschuh
@ 2012-11-06 13:56   ` Dave Taht
  2012-11-06 15:40     ` Wesley Eddy
  2012-11-06 18:58     ` [Codel] [Cerowrt-devel] " Michael Richardson
  0 siblings, 2 replies; 8+ messages in thread
From: Dave Taht @ 2012-11-06 13:56 UTC
  To: Henrique de Moraes Holschuh; +Cc: codel, cerowrt-devel, bloat

On Tue, Nov 6, 2012 at 2:42 PM, Henrique de Moraes Holschuh
<hmh@hmh.eng.br> wrote:
> On Tue, 06 Nov 2012, Dave Taht wrote:
>> I have been working on developing a specification for testing networks
>> more effectively for various side effects of bufferbloat, notably
>> gaming and voip performance, and especially web performance.... as
>> well as a few other things that concerned me, such as IPv6 behavior,
>> and the effects of packet classification.
>
> When it is reasonably complete, it would be nice to have it as an
> informational or better yet, standards-track IETF RFC.
>
> IETF RFC non-experimental status allows us to require RRUL testing prior to
> service acceptance, and even add it as one of the SLA metrics on public
> tenders, which goes a long way into pushing anything into more widespread
> usage.

It was my intent to write this as a real, standards-track RFC, and
also submit it as a prospective test to the ITU and other testing
bodies such as NIST, Underwriters Laboratories, Consumer Reports, and so
on.

However I:

A) got intimidated by the prospect of dealing with the rfc editor

B) Have some sticky problems with two aspects of the test methodology
(and that's just what I know about) which I am prototyping around.
Running the prototype tests on various real networks has had very
"interesting" results... (I do hope others try the prototype tests,
too, on their networks)

C) thought it would be clearer to write the shortest document possible
on this go-round.

D) Am not particularly fond of the "rrule" name. (suggestions?)

I now plan (after feedback) to produce and submit this as a standards
track RFC in the march timeframe.

It would give me great joy to have this test series included in
various SLA metrics, in the long run.

-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html


* Re: [Codel] [Bloat] RFC: Realtime Response Under Load (rrul) test specification
  2012-11-06 13:56   ` Dave Taht
@ 2012-11-06 15:40     ` Wesley Eddy
  2012-11-06 18:58     ` [Codel] [Cerowrt-devel] " Michael Richardson
  1 sibling, 0 replies; 8+ messages in thread
From: Wesley Eddy @ 2012-11-06 15:40 UTC
  To: Dave Taht; +Cc: codel, cerowrt-devel, Henrique de Moraes Holschuh, bloat

On 11/6/2012 8:56 AM, Dave Taht wrote:
> On Tue, Nov 6, 2012 at 2:42 PM, Henrique de Moraes Holschuh
> <hmh@hmh.eng.br> wrote:
>> On Tue, 06 Nov 2012, Dave Taht wrote:
>>> I have been working on developing a specification for testing networks
>>> more effectively for various side effects of bufferbloat, notably
>>> gaming and voip performance, and especially web performance.... as
>>> well as a few other things that concerned me, such as IPv6 behavior,
>>> and the effects of packet classification.
>>
>> When it is reasonably complete, it would be nice to have it as an
>> informational or better yet, standards-track IETF RFC.
>>
>> IETF RFC non-experimental status allows us to require RRUL testing prior to
>> service acceptance, and even add it as one of the SLA metrics on public
>> tenders, which goes a long way into pushing anything into more widespread
>> usage.
> 
> It was my intent to write this as a real, standards-track RFC, and
> also submit it as a prospective test to the ITU and other testing
> bodies such as NIST, Underwriters Laboratories, Consumer Reports, and so
> on.
> 
> However I:
> 
> A) got intimidated by the prospect of dealing with the rfc editor
> 
> B) Have some sticky problems with two aspects of the test methodology
> (and that's just what I know about) which I am prototyping around.
> Running the prototype tests on various real networks has had very
> "interesting" results... (I do hope others try the prototype tests,
> too, on their networks)
> 
> C) thought it would be clearer to write the shortest document possible
> on this go-round.
> D) Am not particularly fond of the "rrule" name. (suggestions?)
> 
> I now plan (after feedback) to produce and submit this as a standards
> track RFC in the march timeframe.
> 
> It would give me great joy to have this test series included in
> various SLA metrics, in the long run.
> 


Hi Dave, in my role as IETF TSV AD, I would be happy to help
you get this into the IETF.  Please note that you can't get
a Standards Track RFC published without a working group
adopting it or an AD sponsoring it.

This topic would be of interest for the IPPM and BMWG working
groups, and I know it is of interest to me as a TSV AD, so we
should be able to find a way to bring it in.

In fact, the timing is good, as FCC folks are at the IETF this
week talking about their vision for broadband test and
measurement architecture, and these tests may relate nicely
to that proposed work.

-- 
Wes Eddy
MTI Systems


* Re: [Codel] RFC: Realtime Response Under Load (rrul) test specification
  2012-11-06 12:42 [Codel] RFC: Realtime Response Under Load (rrul) test specification Dave Taht
  2012-11-06 13:42 ` [Codel] [Bloat] " Henrique de Moraes Holschuh
@ 2012-11-06 18:14 ` Rick Jones
  2012-11-09 10:34   ` Dave Taht
  1 sibling, 1 reply; 8+ messages in thread
From: Rick Jones @ 2012-11-06 18:14 UTC
  To: Dave Taht; +Cc: codel, cerowrt-devel, bloat

On 11/06/2012 04:42 AM, Dave Taht wrote:
> I have been working on developing a specification for testing networks
> more effectively for various side effects of bufferbloat, notably
> gaming and voip performance, and especially web performance.... as
> well as a few other things that concerned me, such as IPv6 behavior,
> and the effects of packet classification.
>
> A key goal is to be able to measure the quality of the user experience
> while a network is otherwise busy, with complex stuff going on in the
> background, but with a simple presentation of the results in the end,
> in under 60 seconds.

Would you like fries with that?

Snark aside, I think that being able to capture the state of the user 
experience in only 60 seconds is daunting at best.  Especially if this 
testing is going to run over the Big Bad Internet (tm) rather than in a 
controlled test lab.

> While it's not done yet, it escaped into the wild today, and I might
> as well solicit wider opinions on it, sooo... get the spec at:
>
> https://github.com/dtaht/deBloat/blob/master/spec/rrule.doc?raw=true

Github is serving that up as a plain text file, which then has Firefox 
looking to use gedit to look at the file, and gedit does not seem at all 
happy with it.  It was necessary to download the file and open it 
"manually" in LibreOffice.

> MUST run long enough to defeat bursty bandwidth optimizations such as
> PowerBoost and discard data from that interval.

I'll willingly display my ignorance, but for how long does PowerBoost 
and its cousins boost bandwidth?

I wasn't looking for PowerBoost, and given the thing being examined I 
wasn't seeing that, but recently when I was evaluating the network 
performance of something "out there" in the cloud (not my home cloud as 
it were though) I noticed performance spikes repeating at intervals 
which would require > 60 seconds to "defeat"

> MUST track and sum bi-directional throughput, using estimates for ACK
> sizes of ipv4, ipv6, and encapsulated ipv6 packets, udp and tcp_rr
> packets, etc.

Estimating the bandwidth consumed by ACKs and/or protocol headers, using 
code operating at user-space, is going to be guessing.  Particularly 
portable user-space.  While those things may indeed affect the user's 
experience, the user doesn't particularly care about ACKs or header 
sizes.  She cares how well the page loads or the call sounds.

> MUST have the test server(s) within 80ms of the testing client

Why?  Perhaps there is something stating that some number of nines worth 
of things being accessed are within 80ms of the user.  If there is, that 
should be given in support of the requirement.

> This portion of the test will take your favorite website as a target
> and show you how much it will slow down, under load.

Under load on the website itself, or under load on one's link.  I 
ass-u-me the latter, but that should be made clear.  And while the 
chances of the additional load on a web site via this testing is likely 
epsilon, there is still the matter of its "optics" if you will - how it 
looks.  Particularly if there is going to be something distributed with 
a default website coded into it.

Further, websites are not going to remain static, so there will be the 
matter of being able to compare results over time.  Perhaps that can be 
finessed with the "unloaded" (again I assume relative to the link of 
interest/test) measurement.

rick jones


* Re: [Codel] [Cerowrt-devel] [Bloat] RFC: Realtime Response Under Load (rrul) test specification
  2012-11-06 13:56   ` Dave Taht
  2012-11-06 15:40     ` Wesley Eddy
@ 2012-11-06 18:58     ` Michael Richardson
  1 sibling, 0 replies; 8+ messages in thread
From: Michael Richardson @ 2012-11-06 18:58 UTC
  To: Dave Taht; +Cc: codel, cerowrt-devel, Henrique de Moraes Holschuh, bloat


Dave Taht <dave.taht@gmail.com> wrote:
    DT> However I:

    DT> A) got intimidated by the prospect of dealing with the rfc
    DT> editor

Been there, done that. I could volunteer to run xml2rfc and get
your document posted.  I think that this might be taken up by the BMWG.

    DT> B) Have some sticky problems with two aspects of the test
    DT> methodology (and that's just what I know about) which I am
    DT> prototyping around.  Running the prototype tests on various real
    DT> networks has had very "interesting" results... (I do hope others
    DT> try the prototype tests, too, on their networks)

Others might be able to help with this.

    DT> I now plan (after feedback) to produce and submit this as a
    DT> standards track RFC in the march timeframe.

    DT> It would give me great joy to have this test series included in
    DT> various SLA metrics, in the long run.

good. Can I help?
I could create the -00 document from your .doc file?
I'll fork your dtaht repo.... I think that I can also
debianize your debloat.sh script, which I think is also in that repo.

-- 
Michael Richardson
-on the road-


* Re: [Codel] RFC: Realtime Response Under Load (rrul) test specification
  2012-11-06 18:14 ` [Codel] " Rick Jones
@ 2012-11-09 10:34   ` Dave Taht
  2012-11-09 17:57     ` Rick Jones
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Taht @ 2012-11-09 10:34 UTC
  To: Rick Jones; +Cc: codel, cerowrt-devel, bloat

On Tue, Nov 6, 2012 at 7:14 PM, Rick Jones <rick.jones2@hp.com> wrote:
> On 11/06/2012 04:42 AM, Dave Taht wrote:
>>
>> I have been working on developing a specification for testing networks
>> more effectively for various side effects of bufferbloat, notably
>> gaming and voip performance, and especially web performance.... as
>> well as a few other things that concerned me, such as IPv6 behavior,
>> and the effects of packet classification.
>>
>> A key goal is to be able to measure the quality of the user experience
>> while a network is otherwise busy, with complex stuff going on in the
>> background, but with a simple presentation of the results in the end,
>> in under 60 seconds.
>
>
> Would you like fries with that?

and a shake!

>
> Snark aside, I think that being able to capture the state of the user
> experience in only 60 seconds is daunting at best.  Especially if this

Concur.

> testing is going to run over the Big Bad Internet (tm) rather than in a
> controlled test lab.

In my testing of this scheme, on networks ranging in size and
quality from a 4-hop mesh network to the Internet, at random hotels
throughout the US and EU with baseline RTTs up to 200ms, and in lab
testing at multiple other locations, I was generally able to
generate a load that had "interesting" side effects inside of 40
seconds, and generally sooner.

My suggestion, as always, is for you (and others) to simply try
out the prototypes that are in the netperf-wrapper git repo, and see
what you can see, learn what you can learn, and feed back what you
can.

In the longer term I would certainly like the simplest test for
latency under load added directly to netperf. The VOIP test is also
nifty.

>
>
>> While it's not done yet, it escaped into the wild today, and I might
>> as well solicit wider opinions on it, sooo... get the spec at:
>>
>> https://github.com/dtaht/deBloat/blob/master/spec/rrule.doc?raw=true
>
>
> Github is serving that up as a plain text file, which then has Firefox
> looking to use gedit to look at the file, and gedit does not seem at all
> happy with it.  It was necessary to download the file and open it "manually"
> in LibreOffice.

Sorry. The original was in emacs org mode. Shall I put that up instead?

>
>> MUST run long enough to defeat bursty bandwidth optimizations such as
>> PowerBoost and discard data from that interval.
>
>
> I'll willingly display my ignorance, but for how long does PowerBoost and
> its cousins boost bandwidth?
>
> I wasn't looking for PowerBoost, and given the thing being examined I wasn't
> seeing that, but recently when I was evaluating the network performance of
> something "out there" in the cloud (not my home cloud as it were though) I
> noticed performance spikes repeating at intervals which would require > 60
> seconds to "defeat"

I too have seen oddball spikes - for example, older forms of SFQ do a
hash perturbation every 10 seconds and totally wipe out many TCP
connections by doing so.

I regard your problem detailed above as an edge case, compared to the
much more gross effects this benchmark generates.

Certainly being able to run the tests for longer intervals and capture
traffic would be useful for network engineers.
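
To illustrate the spec's "discard data from that interval" requirement,
here is a minimal sketch in Python. It assumes throughput is sampled as
(seconds, Mbit/s) pairs and that the length of the boost window is known
or guessed. The sample format, the function name, and the 10-second
warmup are all hypothetical; this thread does not say how long
PowerBoost actually lasts.

```python
from statistics import mean

def steady_state_mean(samples, warmup_s):
    """Mean throughput after dropping samples from the first warmup_s seconds.

    samples:  list of (seconds_since_start, mbit_per_s) tuples (assumed format).
    warmup_s: assumed length of the boost interval; not a value from the spec.
    """
    kept = [rate for t, rate in samples if t >= warmup_s]
    if not kept:
        raise ValueError("test too short: no samples remain after the warmup")
    return mean(kept)

# Made-up numbers: a 30 Mbit/s burst for 10 s, then 12 Mbit/s for 50 s.
samples = [(t, 30.0 if t < 10 else 12.0) for t in range(60)]
naive = mean(rate for _, rate in samples)          # burst inflates the average
trimmed = steady_state_mean(samples, warmup_s=10)  # burst discarded
print(naive, trimmed)
```

With these made-up samples the untrimmed average overstates the
sustained rate, which is the effect the requirement is meant to remove.
A periodic spike like the one Rick describes would need a different
filter, or a longer run.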


>
>> MUST track and sum bi-directional throughput, using estimates for ACK
>> sizes of ipv4, ipv6, and encapsulated ipv6 packets, udp and tcp_rr
>> packets, etc.
>
>
> Estimating the bandwidth consumed by ACKs and/or protocol headers, using
> code operating at user-space, is going to be guessing.  Particularly
> portable user-space.  While those things may indeed affect the user's
> experience, the user doesn't particularly care about ACKs or header sizes.
> She cares how well the page loads or the call sounds.

I feel an "optimum" ack overhead should be calculated, vs the actual
(which is impossible)

>> MUST have the test server(s) within 80ms of the testing client
>
>
> Why?  Perhaps there is something stating that some number of nines worth of
> things being accessed are within 80ms of the user.  If there is, that should
> be given in support of the requirement.

Con-US distance. Despite me pushing the test to 200ms, I have a great
deal more confidence it will work consistently at 80ms.

Can make this a "SHOULD" if you like.

>> This portion of the test will take your favorite website as a target
>> and show you how much it will slow down, under load.
>
>
> Under load on the website itself, or under load on one's link.  I ass-u-me
> the latter, but that should be made clear.  And while the chances of the
> additional load on a web site via this testing is likely epsilon, there is
> still the matter of its "optics" if you will - how it looks.  Particularly
> if there is going to be something distributed with a default website coded
> into it.
>
> Further, websites are not going to remain static, so there will be the
> matter of being able to compare results over time.  Perhaps that can be
> finessed with the "unloaded" (again I assume relative to the link of
> interest/test) measurement.

A core portion of the test really is comparing unloaded vs loaded
performance of the same place, in the same test, over the course of
about a minute.

And as these two baseline figures are kept, those can be compared for
any given website from any given location, over history, and changes
in the underlying network.

> rick jones



-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html


* Re: [Codel] RFC: Realtime Response Under Load (rrul) test specification
  2012-11-09 10:34   ` Dave Taht
@ 2012-11-09 17:57     ` Rick Jones
  0 siblings, 0 replies; 8+ messages in thread
From: Rick Jones @ 2012-11-09 17:57 UTC (permalink / raw)
  To: Dave Taht; +Cc: codel, cerowrt-devel, bloat

>> Github is serving that up as a plain text file, which then has Firefox
>> looking to use gedit to look at the file, and gedit does not seem at all
>> happy with it.  It was necessary to download the file and open it "manually"
>> in LibreOffice.
>
> Sorry. The original was in emacs org mode. Shall I put that up instead?

Just make sure the file has the correct MIME type (?) associated with it 
and I think it will be fine.

>> Estimating the bandwidth consumed by ACKs and/or protocol headers, using
>> code operating at user-space, is going to be guessing.  Particularly
>> portable user-space.  While those things may indeed affect the user's
>> experience, the user doesn't particularly care about ACKs or header sizes.
>> She cares how well the page loads or the call sounds.
>
> I feel an "optimum" ack overhead should be calculated, vs the actual
> (which is impossible)

Well, keep in mind that there will be cases where the two will be rather 
divergent.  Consider a request/response sort of exchange.  For excessive 
simplicity assume a netperf TCP_RR test.  Presumably for the single byte 
case, there will be no ACKs - they will all be piggy-backed on the 
segments carrying the requests and responses.  But now suppose there was 
a little think time in there - say to do a disc I/O or a back-end query 
or whatnot.  That may or may not make the response to the request or the 
next request after a response come after the stack's standalone ACK 
interval, which is a value we will not know up in user space.

Now make the responses longer and cross the MSS threshold - say 
something like 8KB.  We might ass-u-me an ACK-every-two-MSS, and we can 
get the MSS from user space (at least under *nix) but we will not know 
if GRO is present, enabled, or even effective from up at user space. 
And if GRO is working, rather than sending something like 5 or 6 ACKs 
for that 8KB the stack will have sent just one.
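
A minimal sketch of that arithmetic, in Python, may help show how far
the estimate moves with the assumptions. The 1448-byte MSS (1500-byte
MTU, IPv4, TCP timestamps), the function names, and the header sizes are
assumptions for illustration, and link-layer framing is ignored. None of
this can detect GRO; it only shows the range a user-space guess has to
cover.

```python
import math

def estimate_acks(payload_bytes, mss, segments_per_ack=2):
    """Guess the number of standalone ACKs for one response.

    segments_per_ack=2 models the ACK-every-two-MSS assumption;
    segments_per_ack=1 models one ACK per segment.
    """
    segments = math.ceil(payload_bytes / mss)
    return math.ceil(segments / segments_per_ack)

def ack_bytes(n_acks, ip_header=20, tcp_header=20, tcp_options=12):
    """Bytes on the wire for n_acks pure ACKs, excluding link-layer framing.

    Defaults assume IPv4 without IP options and TCP timestamps (12 bytes
    of options with padding); pass ip_header=40 for IPv6.
    """
    return n_acks * (ip_header + tcp_header + tcp_options)

response = 8 * 1024   # the 8KB response from the example above
mss = 1448            # assumed value; not read from the socket here

every_two = estimate_acks(response, mss)                      # delayed ACKs
every_one = estimate_acks(response, mss, segments_per_ack=1)  # ACK per segment
print(every_two, every_one, ack_bytes(every_two), ack_bytes(every_two, ip_header=40))
```

Under these assumptions the 8KB response spans six segments, so the
estimate is six ACKs at one per segment, three with ACK-every-two-MSS,
and a single ACK if GRO coalesces the whole response. The "optimum"
figure and the actual one can therefore differ by several times.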

>>> MUST have the test server(s) within 80ms of the testing client
>>
>>
>> Why?  Perhaps there is something stating that some number of nines worth of
>> things being accessed are within 80ms of the user.  If there is, that should
>> be given in support of the requirement.
>
> Con-US distance. Despite me pushing the test to 200ms, I have a great
> deal more confidence it will work consistently at 80ms.
>
> Can make this a "SHOULD" if you like.

MUST or SHOULD, either way you should... include the reason for the 
requirement/request.

>
>>> This portion of the test will take your favorite website as a target
>>> and show you how much it will slow down, under load.
>>
>>
>> Under load on the website itself, or under load on one's link.  I ass-u-me
>> the latter, but that should be made clear.  And while the chances of the
>> additional load on a web site via this testing is likely epsilon, there is
>> still the matter of its "optics" if you will - how it looks.  Particularly
>> if there is going to be something distributed with a default website coded
>> into it.
>>
>> Further, websites are not going to remain static, so there will be the
>> matter of being able to compare results over time.  Perhaps that can be
>> finessed with the "unloaded" (again I assume relative to the link of
>> interest/test) measurement.
>
> A core portion of the test really is comparing unloaded vs loaded
> performance of the same place, in the same test, over the course of
> about a minute.
>
> And as these two baseline figures are kept, those can be compared for
> any given website from any given location, over history, and changes
> in the underlying network.

Adding further clarity on specifically *what* is presumed to be 
unloaded/loaded and calling-out the assumption that the web server being 
accessed will itself have uniform loading for the duration of the test 
would be goodness.

David Collier-Brown mentioned "stretch factor" - the ratio of the 
unloaded vs loaded delay (assuming I've interpreted what he wrote 
correctly).   Comparing stretch factors (as one is tweaking things) 
still calls for a rather consistent-over-time baseline doesn't it? (what 
David referred to as the "normal service time") If I target webserver 
foo.com on Monday, and on Monday I see an unloaded-network latency to it 
of 100ms and loaded of 200ms that would be a stretch factor of 2 yes? 
If I then look again on Tuesday, having made some change to my network 
under test that causes it to add only 75 ms, if unloaded access to 
webserver foo.com is for some reason 50 ms I will have a stretch factor 
of 2.5. That is something which will need to be kept in mind.
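
The Monday/Tuesday numbers above can be worked through in a few lines of
Python. This is only a sketch of the arithmetic; the function name is
made up for illustration and "stretch factor" is taken to mean loaded
latency divided by unloaded latency, as interpreted above.

```python
def stretch_factor(unloaded_ms, loaded_ms):
    """Ratio of loaded to unloaded latency (assumed definition)."""
    if unloaded_ms <= 0:
        raise ValueError("unloaded latency must be positive")
    return loaded_ms / unloaded_ms

# Monday: 100 ms unloaded, 200 ms loaded, so the load adds 100 ms.
monday = stretch_factor(100, 200)

# Tuesday: the network under test now adds only 75 ms under load,
# but the unloaded latency to foo.com happens to be 50 ms.
tuesday = stretch_factor(50, 50 + 75)

print(monday, tuesday)
print("added delay:", 200 - 100, "ms vs", (50 + 75) - 50, "ms")
```

The added delay improved from 100 ms to 75 ms while the stretch factor
got worse, from 2 to 2.5, purely because the baseline moved. Reporting
the absolute added delay alongside the ratio is one way to keep that
visible.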

rick


