* [Codel] RFC: Realtime Response Under Load (rrul) test specification
       [not found] <mailman.3.1352232001.18990.codel@lists.bufferbloat.net>
@ 2012-11-06 20:52 ` David Collier-Brown
  2012-11-09 10:21   ` Dave Taht
  0 siblings, 1 reply; 6+ messages in thread

From: David Collier-Brown @ 2012-11-06 20:52 UTC (permalink / raw)
  To: codel

Dave Taht wrote:
> I have been working on developing a specification for testing networks
> more effectively for various side effects of bufferbloat, notably
> gaming and voip performance, and especially web performance.... as
> well as a few other things that concerned me, such as IPv6 behavior,
> and the effects of packet classification.
>
> A key goal is to be able to measure the quality of the user experience
> while a network is otherwise busy, with complex stuff going on in the
> background, but with a simple presentation of the results in the end,
> in under 60 seconds.

Rick Jones <rick.jones2@hp.com> replied:
| Would you like fries with that?
|
| Snark aside, I think that being able to capture the state of the user
| experience in only 60 seconds is daunting at best. Especially if
| this testing is going to run over the Big Bad Internet (tm) rather
| than in a controlled test lab.

> This portion of the test will take your favorite website as a target
> and show you how much it will slow down, under load.

| Under load on the website itself, or under load on one's link? I
| ass-u-me the latter, but that should be made clear. And while the
| chances of the additional load on a web site via this testing are
| likely epsilon, there is still the matter of its "optics" if you will
| - how it looks. Particularly if there is going to be something
| distributed with a default website coded into it.

This, counterintuitive as it might sound, is what will make the exercise
work: an indication, as a ratio (a non-dimensional measure), of how much
the response time of a known site is degraded by the network going into
queue delay.

We're assuming a queuing centre, the website, that is running at a
steady speed and load throughout the short test, and is NOT the
bottleneck. When we increase the load on the network, it becomes the
bottleneck, a queue builds up, and the degradation is directly
proportional to the delay the network adds.

A traditional measure in capacity planning is quite similar to what you
describe: the "stretch factor" is the ratio of the sitting-in-a-queue
delay to the normal service time of the network. When it's above 1,
you're spending as much time twiddling your thumbs as you are doing
work, and each additional bit of load will increase the delay and the
ratio dramatically.

I don't know if this will reproduce, but drawn as a curve against load,
the ratio you describe will look like a hockey stick:

............................./
3.........................../
.........................../
........................../
2......................../
......................../
......................./
1....................-
._________----------
0....5....10....15....20....25

Ratio is the Y-axis, load is the X, and the periods are supposed to be
blank spaces (;-))

At loads 1-18 or so, the ratio is < 1 and grows quite slowly. Above 20,
the ratio is >> 1 and grows very rapidly, and without bound.

The results will look like this, and the graphic-equalizer display will
tell the reader where the big components of the slowness are coming
from. Pretty classic capacity planning, by folks like Gunther.
Of course, if the web site you're measuring gets DDoSed in the middle of
the test, Your Mileage May Vary!

--dave
--
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
davecb@spamcop.net           |                      -- Mark Twain
(416) 223-8968

^ permalink raw reply	[flat|nested] 6+ messages in thread
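(To make the hockey stick above concrete, here is a minimal Python
sketch of the stretch factor under the textbook M/M/1 queueing model.
The model choice and the capacity figure are illustrative assumptions -
David's curve is hand-drawn, not derived from this formula.)

    # Stretch factor under an M/M/1 queue: time spent waiting divided by
    # time spent being served, Wq/S = rho / (1 - rho) at utilization rho.
    # Small at light load; grows without bound as rho approaches 1.

    def stretch_factor(load, capacity):
        rho = load / capacity
        if rho >= 1.0:
            return float("inf")   # saturated: the queue grows without bound
        return rho / (1.0 - rho)

    if __name__ == "__main__":
        capacity = 36.0   # chosen so the knee lands near load 20, as sketched
        for load in range(0, 30, 3):
            print(f"load {load:2d}: stretch {stretch_factor(load, capacity):6.2f}")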
* Re: [Codel] RFC: Realtime Response Under Load (rrul) test specification
  2012-11-06 20:52 ` [Codel] RFC: Realtime Response Under Load (rrul) test specification David Collier-Brown
@ 2012-11-09 10:21   ` Dave Taht
  0 siblings, 0 replies; 6+ messages in thread

From: Dave Taht @ 2012-11-09 10:21 UTC (permalink / raw)
  To: davecb; +Cc: codel, cerowrt-devel, bloat

On Tue, Nov 6, 2012 at 9:52 PM, David Collier-Brown <davec-b@rogers.com> wrote:
> Dave Taht wrote:
>> I have been working on developing a specification for testing networks
>> more effectively for various side effects of bufferbloat, notably
>> gaming and voip performance, and especially web performance.... as
>> well as a few other things that concerned me, such as IPv6 behavior,
>> and the effects of packet classification.
>>
>> A key goal is to be able to measure the quality of the user experience
>> while a network is otherwise busy, with complex stuff going on in the
>> background, but with a simple presentation of the results in the end,
>> in under 60 seconds.
>
> Rick Jones <rick.jones2@hp.com> replied:
> | Would you like fries with that?
> |
> | Snark aside, I think that being able to capture the state of the user
> | experience in only 60 seconds is daunting at best. Especially if
> | this testing is going to run over the Big Bad Internet (tm) rather
> | than in a controlled test lab.
>
>> This portion of the test will take your favorite website as a target
>> and show you how much it will slow down, under load.
>
> | Under load on the website itself, or under load on one's link? I
> | ass-u-me the latter, but that should be made clear. And while the
> | chances of the additional load on a web site via this testing are
> | likely epsilon, there is still the matter of its "optics" if you will
> | - how it looks. Particularly if there is going to be something
> | distributed with a default website coded into it.
>
> This, counterintuitive as it might sound, is what will make the exercise
> work: an indication, as a ratio (a non-dimensional measure), of how much
> the response time of a known site is degraded by the network going into
> queue delay.

Exactly! The core comparison of this test is unloaded vs. loaded
behavior of a network, which is to a large extent independent of the
underlying raw bandwidth. I should work harder at bringing this out in
the document.

I note that the central component of the benchmark really is web
performance without and then with load, as exemplified by the short
video here:

http://gettys.wordpress.com/2012/02/01/bufferbloat-demonstration-videos/

with the dozens of DNS lookups and short TCP streams that entails.
Regrettably, emulating that behavior is hard, so being able to A/B a
random website while under the kinds of loads generated by rrul is a
key intent (see the sketch after this message).

While there are interesting factoids to be gained from the behavior of
the elephantine TCP flows in relation to each other, it's the behavior
of the thinner flows that matters the most.

> We're assuming a queuing centre, the website, that is running at a
> steady speed and load throughout the short test, and is NOT the
> bottleneck. When we increase the load on the network, it becomes the
> bottleneck, a queue builds up, and the degradation is directly
> proportional to the delay the network adds.
>
> A traditional measure in capacity planning is quite similar to what you
> describe: the "stretch factor" is the ratio of the sitting-in-a-queue
> delay to the normal service time of the network. When it's above 1,
> you're spending as much time twiddling your thumbs as you are doing
> work, and each additional bit of load will increase the delay and the
> ratio dramatically.

I like the stretch factor concept, a lot.

> I don't know if this will reproduce, but drawn as a curve against load,
> the ratio you describe will look like a hockey stick:
>
> ............................./
> 3.........................../
> .........................../
> ........................../
> 2......................../
> ......................../
> ......................./
> 1....................-
> ._________----------
> 0....5....10....15....20....25
>
> Ratio is the Y-axis, load is the X, and the periods are supposed to be
> blank spaces (;-))
>
> At loads 1-18 or so, the ratio is < 1 and grows quite slowly. Above 20,
> the ratio is >> 1 and grows very rapidly, and without bound.
>
> The results will look like this, and the graphic-equalizer display will
> tell the reader where the big components of the slowness are coming
> from. Pretty classic capacity planning, by folks like Gunther.
>
> Of course, if the web site you're measuring gets DDoSed in the middle of
> the test, Your Mileage May Vary!
>
> --dave
> --
> David Collier-Brown,         | Always do right. This will gratify
> System Programmer and Author | some people and astonish the rest
> davecb@spamcop.net           |                      -- Mark Twain
> (416) 223-8968
> _______________________________________________
> Codel mailing list
> Codel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/codel

--
Dave Täht
Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html

^ permalink raw reply	[flat|nested] 6+ messages in thread
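(A toy illustration of the unloaded-vs-loaded A/B comparison described
above: time a page fetch on an idle link, then again while background
load runs, and report the dimensionless ratio. The URL and the single
urllib fetch are stand-ins - a real page load involves the dozens of
DNS lookups and short TCP streams mentioned in the reply.)

    # Time one page fetch idle, one under load; report loaded/unloaded.
    import time
    import urllib.request

    def fetch_seconds(url):
        start = time.monotonic()
        with urllib.request.urlopen(url, timeout=30) as resp:
            resp.read()
        return time.monotonic() - start

    if __name__ == "__main__":
        url = "http://example.com/"   # stand-in for "your favorite website"
        unloaded = fetch_seconds(url)
        input("Start a background load (e.g. rrul), then press Enter... ")
        loaded = fetch_seconds(url)
        print(f"unloaded {unloaded:.3f}s, loaded {loaded:.3f}s, "
              f"ratio {loaded / unloaded:.2f}")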
* [Codel] RFC: Realtime Response Under Load (rrul) test specification
@ 2012-11-06 12:42 Dave Taht
  2012-11-06 18:14 ` Rick Jones
  0 siblings, 1 reply; 6+ messages in thread

From: Dave Taht @ 2012-11-06 12:42 UTC (permalink / raw)
  To: bloat, codel, cerowrt-devel

I have been working on developing a specification for testing networks
more effectively for various side effects of bufferbloat, notably
gaming and voip performance, and especially web performance.... as
well as a few other things that concerned me, such as IPv6 behavior,
and the effects of packet classification.

A key goal is to be able to measure the quality of the user experience
while a network is otherwise busy, with complex stuff going on in the
background, but with a simple presentation of the results in the end,
in under 60 seconds.

While it's not done yet, it escaped into the wild today, and I might as
well solicit wider opinions on it, sooo... get the spec at:

https://github.com/dtaht/deBloat/blob/master/spec/rrule.doc?raw=true

Portions of the test are being prototyped in the netperf-wrapper repo
on github. The initial results of the rrul test on several hotel
networks I've tried it on are "interesting". Example:

http://www.teklibre.com/~d/rrul2_conference.pdf

A major sticking point at the moment is coming up with an equivalent of
the Chrome benchmarks for measuring relative web page performance with
and without a network load, or merely incorporating some automated form
of that benchmark into the overall test load.

The end goal is to have a complex, comprehensive benchmark of some core
networking issues that produces simple results, whether via a Java tool
like ICSI's, via Flash on the web, or at the command line via something
like netperf.

Related resources:

netperf 2.6 or later running on a fairly nearby server
https://github.com/tohojo/netperf-wrapper
python-matplotlib

I look forward to your comments.

--
Dave Täht
Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html

^ permalink raw reply	[flat|nested] 6+ messages in thread
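(For the flavor of the core mechanic before reading the spec: sample
round-trip latency while bulk flows occupy the link, and compare
against the idle RTT. A crude sketch - it shells out to the system ping
and pulls repeatedly from an assumed URL, where the real rrul test
drives netperf streams in both directions plus classified flows.)

    # Sample RTT via the system ping while bulk downloads load the link.
    import subprocess
    import threading
    import urllib.request

    def bulk_download(url, stop):
        while not stop.is_set():
            try:
                with urllib.request.urlopen(url, timeout=30) as resp:
                    while resp.read(65536) and not stop.is_set():
                        pass
            except OSError:
                pass   # transient fetch errors just restart the flow

    def ping_ms(host):
        out = subprocess.run(["ping", "-c", "1", host],
                             capture_output=True, text=True).stdout
        for token in out.split():          # find "time=12.3" in the output
            if token.startswith("time="):
                return float(token[5:])
        return None                        # output format varies by platform

    if __name__ == "__main__":
        host, url = "example.com", "http://example.com/"   # assumed targets
        print("idle RTT:", ping_ms(host), "ms")
        stop = threading.Event()
        flows = [threading.Thread(target=bulk_download, args=(url, stop))
                 for _ in range(4)]        # rrul uses 4 flows per direction
        for f in flows:
            f.start()
        for _ in range(10):
            print("loaded RTT:", ping_ms(host), "ms")
        stop.set()
        for f in flows:
            f.join()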
* Re: [Codel] RFC: Realtime Response Under Load (rrul) test specification
  2012-11-06 12:42 Dave Taht
@ 2012-11-06 18:14 ` Rick Jones
  2012-11-09 10:34   ` Dave Taht
  0 siblings, 1 reply; 6+ messages in thread

From: Rick Jones @ 2012-11-06 18:14 UTC (permalink / raw)
  To: Dave Taht; +Cc: codel, cerowrt-devel, bloat

On 11/06/2012 04:42 AM, Dave Taht wrote:
> I have been working on developing a specification for testing networks
> more effectively for various side effects of bufferbloat, notably
> gaming and voip performance, and especially web performance.... as
> well as a few other things that concerned me, such as IPv6 behavior,
> and the effects of packet classification.
>
> A key goal is to be able to measure the quality of the user experience
> while a network is otherwise busy, with complex stuff going on in the
> background, but with a simple presentation of the results in the end,
> in under 60 seconds.

Would you like fries with that?

Snark aside, I think that being able to capture the state of the user
experience in only 60 seconds is daunting at best. Especially if this
testing is going to run over the Big Bad Internet (tm) rather than in a
controlled test lab.

> While it's not done yet, it escaped into the wild today, and I might as
> well solicit wider opinions on it, sooo... get the spec at:
>
> https://github.com/dtaht/deBloat/blob/master/spec/rrule.doc?raw=true

Github is serving that up as a plain text file, which then has Firefox
looking to use gedit to look at the file, and gedit does not seem at all
happy with it. It was necessary to download the file and open it
"manually" in LibreOffice.

> MUST run long enough to defeat bursty bandwidth optimizations such as
> PowerBoost and discard data from that interval.

I'll willingly display my ignorance, but for how long do PowerBoost and
its cousins boost bandwidth?

I wasn't looking for PowerBoost, and given the thing being examined I
wasn't seeing it, but recently, when I was evaluating the network
performance of something "out there" in the cloud (not my home cloud,
as it were), I noticed performance spikes repeating at intervals which
would require > 60 seconds to "defeat".

> MUST track and sum bi-directional throughput, using estimates for ACK
> sizes of ipv4, ipv6, and encapsulated ipv6 packets, udp and tcp_rr
> packets, etc.

Estimating the bandwidth consumed by ACKs and/or protocol headers,
using code operating in user space, is going to be guessing -
particularly portable user-space code. While those things may indeed
affect the user's experience, the user doesn't particularly care about
ACKs or header sizes. She cares how well the page loads or the call
sounds.

> MUST have the test server(s) within 80ms of the testing client

Why? Perhaps there is something stating that some number of nines'
worth of things being accessed are within 80ms of the user. If there
is, that should be given in support of the requirement.

> This portion of the test will take your favorite website as a target
> and show you how much it will slow down, under load.

Under load on the website itself, or under load on one's link? I
ass-u-me the latter, but that should be made clear. And while the
chances of the additional load on a web site via this testing are
likely epsilon, there is still the matter of its "optics" if you will
- how it looks. Particularly if there is going to be something
distributed with a default website coded into it.

Further, websites are not going to remain static, so there will be the
matter of being able to compare results over time.
Perhaps that can be finessed with the "unloaded" (again, I assume
relative to the link of interest/test) measurement.

rick jones

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: [Codel] RFC: Realtime Response Under Load (rrul) test specification
  2012-11-06 18:14 ` Rick Jones
@ 2012-11-09 10:34   ` Dave Taht
  2012-11-09 17:57     ` Rick Jones
  0 siblings, 1 reply; 6+ messages in thread

From: Dave Taht @ 2012-11-09 10:34 UTC (permalink / raw)
  To: Rick Jones; +Cc: codel, cerowrt-devel, bloat

On Tue, Nov 6, 2012 at 7:14 PM, Rick Jones <rick.jones2@hp.com> wrote:
> On 11/06/2012 04:42 AM, Dave Taht wrote:
>>
>> I have been working on developing a specification for testing networks
>> more effectively for various side effects of bufferbloat, notably
>> gaming and voip performance, and especially web performance.... as
>> well as a few other things that concerned me, such as IPv6 behavior,
>> and the effects of packet classification.
>>
>> A key goal is to be able to measure the quality of the user experience
>> while a network is otherwise busy, with complex stuff going on in the
>> background, but with a simple presentation of the results in the end,
>> in under 60 seconds.
>
> Would you like fries with that?

and a shake!

> Snark aside, I think that being able to capture the state of the user
> experience in only 60 seconds is daunting at best.

Concur.

> Especially if this testing is going to run over the Big Bad Internet
> (tm) rather than in a controlled test lab.

In my testing of this scheme - on networks ranging in size and quality
from a 4-hop mesh network, to the internet, to random hotels throughout
the US and EU at baseline RTTs up to 200ms, to lab testing at multiple
other locations - I was generally able to generate a load that had
"interesting" side effects inside of 40 seconds, and usually sooner.

My suggestion, as always, is for you (and others) to simply try out the
prototypes that are in the netperf-wrapper git repo, see what you can
see, learn what you can learn, and feed back what you can.

In the longer term I would certainly like the simplest test for latency
under load added directly to netperf. The VOIP test is also nifty.

>> While it's not done yet, it escaped into the wild today, and I might as
>> well solicit wider opinions on it, sooo... get the spec at:
>>
>> https://github.com/dtaht/deBloat/blob/master/spec/rrule.doc?raw=true
>
> Github is serving that up as a plain text file, which then has Firefox
> looking to use gedit to look at the file, and gedit does not seem at all
> happy with it. It was necessary to download the file and open it
> "manually" in LibreOffice.

Sorry. The original was in emacs org mode. Shall I put that up instead?

>> MUST run long enough to defeat bursty bandwidth optimizations such as
>> PowerBoost and discard data from that interval.
>
> I'll willingly display my ignorance, but for how long do PowerBoost and
> its cousins boost bandwidth?
>
> I wasn't looking for PowerBoost, and given the thing being examined I
> wasn't seeing it, but recently, when I was evaluating the network
> performance of something "out there" in the cloud (not my home cloud,
> as it were), I noticed performance spikes repeating at intervals which
> would require > 60 seconds to "defeat".

I too have seen oddball spikes - for example, older forms of sfq do a
permutation every 10 seconds and totally wipe out many tcp connections
by doing so.

I regard your problem detailed above as an edge case, compared to the
much grosser effects this benchmark generates. Certainly being able to
run the tests for longer intervals and capture traffic would be useful
for network engineers.
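(One way to honor the "discard data from that interval" requirement
quoted above: drop the first seconds of a per-second throughput trace
before averaging. The 30-second boost in the toy trace and the
40-second warmup are guesses - as the spikes Rick describes show, some
burst mechanisms may need well over 60 seconds to defeat.)

    # Average throughput after discarding a warmup window, so a burst
    # mechanism like PowerBoost doesn't inflate the steady-state figure.

    def steady_state_mbps(samples, warmup_s):
        """samples: list of (seconds_since_start, mbps) pairs."""
        tail = [mbps for t, mbps in samples if t >= warmup_s]
        if not tail:
            raise ValueError("test ran shorter than the warmup window")
        return sum(tail) / len(tail)

    if __name__ == "__main__":
        # Toy trace: 30s at a boosted 20 Mbit/s, then the true 5 Mbit/s.
        trace = [(t, 20.0 if t < 30 else 5.0) for t in range(60)]
        print("naive mean:      ", sum(m for _, m in trace) / len(trace))
        print("after 40s warmup:", steady_state_mbps(trace, warmup_s=40))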
>> MUST track and sum bi-directional throughput, using estimates for ACK
>> sizes of ipv4, ipv6, and encapsulated ipv6 packets, udp and tcp_rr
>> packets, etc.
>
> Estimating the bandwidth consumed by ACKs and/or protocol headers,
> using code operating in user space, is going to be guessing -
> particularly portable user-space code. While those things may indeed
> affect the user's experience, the user doesn't particularly care about
> ACKs or header sizes. She cares how well the page loads or the call
> sounds.

I feel an "optimum" ACK overhead should be calculated, vs. the actual
(which is impossible to know).

>> MUST have the test server(s) within 80ms of the testing client
>
> Why? Perhaps there is something stating that some number of nines'
> worth of things being accessed are within 80ms of the user. If there
> is, that should be given in support of the requirement.

Continental-US distance. Despite me pushing the test to 200ms, I have a
great deal more confidence it will work consistently at 80ms.

Can make this a "SHOULD" if you like.

>> This portion of the test will take your favorite website as a target
>> and show you how much it will slow down, under load.
>
> Under load on the website itself, or under load on one's link? I
> ass-u-me the latter, but that should be made clear. And while the
> chances of the additional load on a web site via this testing are
> likely epsilon, there is still the matter of its "optics" if you will
> - how it looks. Particularly if there is going to be something
> distributed with a default website coded into it.
>
> Further, websites are not going to remain static, so there will be the
> matter of being able to compare results over time. Perhaps that can be
> finessed with the "unloaded" (again, I assume relative to the link of
> interest/test) measurement.

A core portion of the test really is comparing unloaded vs. loaded
performance of the same place, in the same test, over the course of
about a minute.

And as these two baseline figures are kept, they can be compared for
any given website from any given location, over history, and across
changes in the underlying network.

> rick jones

--
Dave Täht
Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html

^ permalink raw reply	[flat|nested] 6+ messages in thread
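(A sketch of the "optimum" ACK overhead suggested above: one delayed
ACK per two MSS-sized segments, each a bare header. The MSS and header
constants assume Ethernet framing with TCP timestamps and no other
options - and, as Rick notes in his reply below, piggybacking and GRO
make the actual count diverge from any such estimate.)

    # "Optimum" reverse-path ACK bytes for a one-way bulk transfer:
    # one delayed ACK per two full segments, each ACK an IP header plus
    # a TCP header carrying a 12-byte timestamp option.

    ACK_BYTES = {"ipv4": 20 + 20 + 12, "ipv6": 40 + 20 + 12}

    def optimum_ack_bytes(payload_bytes, mss=1448, family="ipv4"):
        segments = -(-payload_bytes // mss)   # ceiling division
        acks = -(-segments // 2)              # one ACK per two segments
        return acks * ACK_BYTES[family]

    if __name__ == "__main__":
        transferred = 10 * 1024 * 1024        # a 10 MB bulk transfer
        for fam in ("ipv4", "ipv6"):
            overhead = optimum_ack_bytes(transferred, family=fam)
            print(f"{fam}: ~{overhead} ACK bytes "
                  f"({100 * overhead / transferred:.2f}% of payload)")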
* Re: [Codel] RFC: Realtime Response Under Load (rrul) test specification
  2012-11-09 10:34 ` Dave Taht
@ 2012-11-09 17:57   ` Rick Jones
  0 siblings, 0 replies; 6+ messages in thread

From: Rick Jones @ 2012-11-09 17:57 UTC (permalink / raw)
  To: Dave Taht; +Cc: codel, cerowrt-devel, bloat

>> Github is serving that up as a plain text file, which then has Firefox
>> looking to use gedit to look at the file, and gedit does not seem at all
>> happy with it. It was necessary to download the file and open it
>> "manually" in LibreOffice.
>
> Sorry. The original was in emacs org mode. Shall I put that up instead?

Just make sure the file has the correct MIME type associated with it
and I think it will be fine.

>> Estimating the bandwidth consumed by ACKs and/or protocol headers,
>> using code operating in user space, is going to be guessing -
>> particularly portable user-space code. While those things may indeed
>> affect the user's experience, the user doesn't particularly care about
>> ACKs or header sizes. She cares how well the page loads or the call
>> sounds.
>
> I feel an "optimum" ACK overhead should be calculated, vs. the actual
> (which is impossible to know).

Well, keep in mind that there will be cases where the two will be
rather divergent.

Consider a request/response sort of exchange. For excessive simplicity,
assume a netperf TCP_RR test. Presumably, for the single-byte case,
there will be no standalone ACKs - they will all be piggy-backed on the
segments carrying the requests and responses. But now suppose there is
a little think time in there - say, to do a disc I/O or a back-end
query or whatnot. That may or may not make the response to the request,
or the next request after a response, come after the stack's standalone
ACK interval, which is a value we will not know up in user space.

Now make the responses longer, crossing the MSS threshold - say
something like 8KB. We might ass-u-me an ACK every two MSS, and we can
get the MSS from user space (at least under *nix), but we will not know
from user space whether GRO is present, enabled, or even effective. And
if GRO is working, rather than sending something like 5 or 6 ACKs for
that 8KB, the stack will have sent just one.

>>> MUST have the test server(s) within 80ms of the testing client
>>
>> Why? Perhaps there is something stating that some number of nines'
>> worth of things being accessed are within 80ms of the user. If there
>> is, that should be given in support of the requirement.
>
> Continental-US distance. Despite me pushing the test to 200ms, I have a
> great deal more confidence it will work consistently at 80ms.
>
> Can make this a "SHOULD" if you like.

MUST or SHOULD, either way you should... include the reason for the
requirement/request.

>>> This portion of the test will take your favorite website as a target
>>> and show you how much it will slow down, under load.
>>
>> Under load on the website itself, or under load on one's link? I
>> ass-u-me the latter, but that should be made clear. And while the
>> chances of the additional load on a web site via this testing are
>> likely epsilon, there is still the matter of its "optics" if you will
>> - how it looks. Particularly if there is going to be something
>> distributed with a default website coded into it.
>>
>> Further, websites are not going to remain static, so there will be the
>> matter of being able to compare results over time. Perhaps that can be
>> finessed with the "unloaded" (again, I assume relative to the link of
>> interest/test) measurement.
> A core portion of the test really is comparing unloaded vs. loaded
> performance of the same place, in the same test, over the course of
> about a minute.
>
> And as these two baseline figures are kept, they can be compared for
> any given website from any given location, over history, and across
> changes in the underlying network.

Adding further clarity on specifically *what* is presumed to be
unloaded/loaded, and calling out the assumption that the web server
being accessed will itself have uniform loading for the duration of
the test, would be goodness.

David Collier-Brown mentioned the "stretch factor" - the ratio of the
unloaded vs. loaded delay (assuming I've interpreted what he wrote
correctly). Comparing stretch factors (as one is tweaking things) still
calls for a rather consistent-over-time baseline, doesn't it? (What
David referred to as the "normal service time".)

If I target webserver foo.com on Monday, and on Monday I see an
unloaded-network latency to it of 100ms and a loaded latency of 200ms,
that would be a stretch factor of 2, yes? If I then look again on
Tuesday, having made some change to my network under test that causes
it to add only 75ms, and unloaded access to webserver foo.com is for
some reason 50ms, I will have a stretch factor of 2.5. That is
something which will need to be kept in mind.

rick

^ permalink raw reply	[flat|nested] 6+ messages in thread
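(Rick's Monday/Tuesday caveat, in executable form: with a delay ratio,
a moving unloaded baseline can make a network that adds less absolute
delay score worse. A tiny sketch using his figures:)

    # Stretch factor taken as loaded/unloaded delay: the same metric can
    # worsen even when the absolute added delay improves, if the
    # unloaded baseline shifts between runs.

    def stretch(unloaded_ms, added_ms):
        return (unloaded_ms + added_ms) / unloaded_ms

    monday = stretch(100, 100)   # 100ms baseline, +100ms under load -> 2.0
    tuesday = stretch(50, 75)    # 50ms baseline, +75ms under load -> 2.5
    print(f"Monday stretch: {monday:.1f}, Tuesday stretch: {tuesday:.1f}")
    print("Tuesday added less delay (75ms vs 100ms) yet scores worse, so")
    print("the unloaded baseline must be reported alongside the ratio.")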
end of thread, other threads:[~2012-11-09 17:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <mailman.3.1352232001.18990.codel@lists.bufferbloat.net>
2012-11-06 20:52 ` [Codel] RFC: Realtime Response Under Load (rrul) test specification David Collier-Brown
2012-11-09 10:21   ` Dave Taht
2012-11-06 12:42 Dave Taht
2012-11-06 18:14 ` Rick Jones
2012-11-09 10:34   ` Dave Taht
2012-11-09 17:57     ` Rick Jones