[Cerowrt-devel] [aqm] chrome web page benchmarker fixed
dpreed at reed.com
Fri Apr 18 14:48:08 EDT 2014
Why is the DNS PLR so high? 1% is pretty depressing.
Also, it seems odd to eliminate 19% of the content retrieval because the tail is fat and long rather than short. Wouldn't it be better to have 1000 servers?
On Friday, April 18, 2014 2:15pm, "Greg White" <g.white at CableLabs.com> said:
> We used the 25k object size for a short time back in 2012 until we had
> resources to build a more advanced model (appendix A). I did a bunch of
> captures of real web pages back in 2011 and compared the object size
> statistics to models that I'd seen published. Lognormal didn't seem to be
> *exactly* right, but it wasn't a bad fit to what I saw. I've attached a
> The choice of 4 servers was based somewhat on logistics, and also on a
> finding that across our data set, the average web page retrieved 81% of
> its resources from the top 4 servers. Increasing to 5 servers only
> increased that percentage to 84%.
> The choice of RTTs also came from the web traffic captures. I saw
> RTTmin=16ms, RTTmean=53.8ms, RTTmax=134ms.
> Much of this can be found in
> In many of the cases that we've simulated, the packet drop probability is
> less than 1% for DNS packets. In our web model, there are a total of 4
> servers, so 4 DNS lookups assuming none of the addresses are cached. If
> PLR = 1%, there would be a 3.9% chance of losing one or more DNS packets
> (with a resulting ~5 second additional delay on load time). I've probably
> oversimplified this, but Kathie N. and I made the call that it would be
> significantly easier to just do this math than to build a dns
> implementation in ns2. We've open sourced the web model (it's on Kathie's
> web page and will be part of ns2.36) with an encouragement to the
> community to improve on it. If you'd like to port it to ns3 and add a dns
> model, that would be fantastic.
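The 3.9% figure quoted above can be checked with a few lines of Python. This is only a minimal sketch: it assumes the 4 DNS lookups are lost independently at the same rate, and the function name `prob_any_loss` and the flat 5 second penalty per page load are illustrative assumptions, not part of the ns2 web model.

```python
def prob_any_loss(plr: float, lookups: int) -> float:
    """Probability that at least one of `lookups` independent packets is lost."""
    return 1.0 - (1.0 - plr) ** lookups


# Values taken from the message above: PLR = 1%, 4 uncached DNS lookups.
p = prob_any_loss(0.01, 4)
print(f"P(at least one DNS loss) = {p:.2%}")

# Assumption: a page load pays one flat ~5 s penalty if any lookup loses a packet.
print(f"Expected added load time = {p * 5.0:.3f} s")
```

The sketch ignores retried queries that are lost in turn, so it is a rough check of the arithmetic and not a DNS model.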
> On 4/17/14, 3:07 PM, "Dave Taht" <dave.taht at gmail.com> wrote:
> >On Thu, Apr 17, 2014 at 12:01 PM, William Chan (陈智昌)
> ><willchan at chromium.org> wrote:
> >> Speaking as the primary Chromium developer in charge of this relevant
> >> I would like to caution putting too much trust in the numbers
> >>generated. Any
> >> statistical claims about the numbers are probably unreasonable to make.
> >Sigh. Other benchmarks such as the apache ("ab") benchmark
> >are primarily designed as stress testers for web servers, not as realistic
> >traffic. Modern web traffic has such a high level of dynamism in it
> >that static web page loads, along any distribution, seem insufficient,
> >passive analysis of aggregated traffic "feels" incorrect relative to the
> >sorts of home and small business traffic I've seen, and so on.
> >Famous papers, such as this one:
> >Seem possibly irrelevant to draw conclusions from given the kind
> >of data they analysed and proceeding from an incorrect model or
> >gut feel for how the network behaves today seems to be foolish.
> >Even the most basic of tools, such as httping, had three basic bugs
> >that I found in a few minutes of trying to come up with some basic
> >behaviors yesterday:
> >Those are going to be a lot easier to fix than diving into the chromium
> >There are very few tools worth trusting, and I am always dubious
> >of papers that publish results with unavailable tools and data. The only
> >tools I have any faith in for network analysis are netperf,
> >tcpdump and xplot.org, and to a large extent wireshark. Toke and I have
> >been tearing apart d-itg and I hope to one day be able to add that to
> >my trustable list... but better tools are needed!
> >Tools that I don't have a lot of faith in include that, iperf, anything
> >in java or other high level languages, speedtest.net, and things like
> >Have very little faith in ns2, slightly more in ns3, and I've been meaning
> >to look over the mininet and other simulators whenever I got some spare
> >time; the mininet results stanford gets seem pretty reasonable and I
> >adore their reproducing results effort. Haven't explored ndt, keep meaning
> >> Reasons:
> >> * We don't actively maintain this code. It's behind the command line
> >> They are broken. The fact that it still results in numbers on the
> >> extension is an example of where unmaintained code doesn't have the UI
> >> disabled, even though the internal workings of the code fail to
> >> correct operation. We haven't disabled it because, well, it's
> >As I mentioned I was gearing up for a hacking run...
> >The vast majority of results I look at are actually obtained via
> >looking at packet captures. I mostly use benchmarks as abstractions
> >to see if they make some sense relative to the captures and tend
> >to compare different benchmarks against each other.
> >I realize others don't go into that level of detail, so you have given
> >fair warning! In our case we used the web page benchmarker as
> >a means to try and rapidly acquire some typical distributions of
> >get and tcp stream requests from things like the alexa top 1000,
> >and as a way to A/B different aqm/packet scheduling setups.
> >... but the only easily publishable results were from the benchmark
> >and we (reluctantly) only published one graph from all the work that
> >went into it 2+ years back and used it as a test driver for the famous
> >ietf video, comparing two identical boxes running it at the same time
> >under different network conditions:
> >from what I fiddled with today, it is at least still useful for that?
> >moving on...
> >The web model in the cablelabs work doesn't look much like my captures,
> >in addition to not modeling dns at all, and using a smaller IW than google
> >it looks like this:
> >>> Model single user web page download as follows:
> >>> - Web page modeled as single HTML page + 100 objects spread evenly
> >>> across 4 servers. Web object sizes are currently fixed at 25 kB
> >>> whereas the initial HTML page is 100 kB. Appendix A provides an
> >>> alternative page model that may be explored in future work.
> >Where what I see is a huge number of objects that fit into a single
> >iw10 slow start episode and some level of pipelining on larger stuff, so
> >that a large number of object sizes of less than 7k, with a lightly
> >tailed distribution outside of that, makes more sense.
> >(I'm not staring at appendix A right now, I'm under the impression
> > it was better)
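To make the iw10 point concrete, here is a rough sketch of how many round trips an idealized slow start needs to deliver an object of a given size. The assumptions are mine: an MSS of 1460 bytes, a congestion window that doubles every RTT, no loss, no receive-window limit, and 1 kB taken as 1000 bytes. The function name `slow_start_rounds` is hypothetical.

```python
def slow_start_rounds(size_bytes: int, iw_segments: int = 10, mss: int = 1460) -> int:
    """Round trips of data transfer needed under idealized, loss-free slow start."""
    if size_bytes <= 0:
        return 0
    cwnd = iw_segments * mss  # bytes that may be sent in the first round
    sent = 0
    rounds = 0
    while sent < size_bytes:
        sent += cwnd
        cwnd *= 2  # idealized doubling per RTT
        rounds += 1
    return rounds


# Object sizes mentioned in the thread: ~7 kB, 25 kB and 100 kB.
for size in (7_000, 25_000, 100_000):
    print(size, slow_start_rounds(size, iw_segments=10), slow_start_rounds(size, iw_segments=3))
```

Under these assumptions a 7 kB object completes in a single round with IW10, while the 25 kB objects of the fixed-size model need a second one.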
> >I certainly would like more suggestions for models and types
> >of web traffic, as well as simulation of https + pfs traffic,
> >spdy, quic, etc....
> >>> - Server RTTs set as follows (20 ms, 30 ms, 50 ms, 100 ms).
> >Server RTTs from my own web history tend to be lower than 50ms.
> >>> - Initial HTTP GET to retrieve a moderately sized object (100 kB
> >>> page) from server 1.
> >An initial GET to google fits into iw10 - it's about 7k.
> >>> - Once initial HTTP GET completes, initiate 24 simultaneous HTTP GETs
> >>> (via separate TCP connections), 6 connections each to 4 different
> >>> server nodes
> >I usually don't see more than 15, and certainly not 25k sized objects.
> >>> - Once each individual HTTP GET completes, initiate a subsequent GET
> >>> to the same server, until 25 objects have been retrieved from each
> >>> server.
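The quoted page model can be written down as a small data structure to see what it implies in aggregate. This is a sketch built only from the numbers quoted above (1 HTML page of 100 kB, 100 objects of 25 kB, 4 servers, 6 connections per server). The class and property names are hypothetical and do not come from the ns2 code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PageModel:
    """Fixed-size page model as quoted in the thread (names are illustrative)."""
    servers: int = 4
    objects: int = 100
    object_kb: int = 25
    html_kb: int = 100
    conns_per_server: int = 6

    @property
    def objects_per_server(self) -> int:
        # Objects are spread evenly across the servers.
        return self.objects // self.servers

    @property
    def total_connections(self) -> int:
        return self.servers * self.conns_per_server

    @property
    def total_kb(self) -> int:
        return self.html_kb + self.objects * self.object_kb

    @property
    def max_sequential_gets(self) -> int:
        # The busiest connection to a server must issue at least this many GETs.
        return math.ceil(self.objects_per_server / self.conns_per_server)


model = PageModel()
print(model.total_connections, model.objects_per_server,
      model.total_kb, model.max_sequential_gets)
```

Changing `object_kb` or `conns_per_server` gives a quick way to compare this model against the smaller objects and lower connection counts reported from the packet captures.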
> >> * We don't make sure to flush all the network state in between runs, so
> >> you're using that option, don't trust it to work.
> >The typical scenario we used was a run against dozens or hundreds of urls,
> >capturing traffic, while varying network conditions.
> >Regarded the first run as the most interesting.
> >Can exit the browser and restart after a run like that.
> >At the moment, merely plan to use the tool primarily to survey various
> >web sites and load times while doing packet captures. Hope was
> >to get valid data from the network portion of the load, tho...
> >> * If you have an advanced Chromium setup, this definitely does not
> >>work. I
> >> advise using the benchmark extension only with a separate Chromium
> >> for testing purposes. Our flushing of sockets, caches, etc does not
> >> work correctly when you use the Chromium multiprofile feature and also
> >> to flush lots of our other network caches.
> >> * No one on Chromium really believes the time to paint numbers that we
> >> output :) It's complicated. Our graphics stack is complicated. The time
> >I actually care only about time-to-full layout as that's a core network
> >> when Blink thinks it painted to when the GPU actually blits to the
> >> cannot currently be corroborated with any high degree of accuracy from
> >> within our code.
> >> * It has not been maintained since 2010. It is quite likely there are
> >> other subtle inaccuracies here.
> >> In short, while you can expect it to give you a very high level
> >> understanding of performance issues, I advise against placing
> >> confidence in the accuracy of the numbers generated by the benchmark
> >> extension. The fact that numbers are produced by the extension should
> >>not be
> >> treated as evidence that the extension actually functions correctly.
> >OK, noted. Still delighted to be able to have a simple load generator
> >that exercises the browsers and generates some results, however
> >> Cheers.
> >> On Thu, Apr 17, 2014 at 10:49 AM, Dave Taht <dave.taht at gmail.com>
> >>> Getting a grip on real web page load time behavior in an age of
> >>> sharded websites,
> >>> services
> >>> and cdns against how a modern browser behaves is very, very hard.
> >>> it turns out if you run
> >>> google-chrome --enable-benchmarking --enable-net-benchmarking
> >>> (Mac users have to embed these options in their startup script - see
> >>> http://www.chromium.org/developers/how-tos/run-chromium-with-flags
> >>> enable developer options and install and run the chrome web page
> >>> benchmarker,
> >>> (
> >>> )
> >>> that it works (at least for me, on a brief test of the latest
> >>> linux.
> >>> Can someone try windows and mac?)
> >>> You can then feed in a list of urls to test against, and post
> >>> the resulting .csv file to your heart's content. We used to use this
> >>> benchmark a lot while trying to characterise typical web behaviors
> >>> under aqm and packet scheduling systems under load. Running
> >>> it simultaneously with a rrul test or one of the simpler tcp upload
> >>> download
> >>> tests in the rrul suite was often quite interesting.
> >>> It turned out the doc has been wrong a while as to the name of the
> >>> command line option. I was gearing up mentally for having to look at
> >>> the source....
> >>> http://code.google.com/p/chromium/issues/detail?id=338705
> >>> /me happy
> >>> --
> >>> Dave Täht
> >>> Heartbleed POC on wifi campus networks with EAP auth:
> >>> http://www.eduroam.edu.au/advisory.html
> >>> _______________________________________________
> >>> aqm mailing list
> >>> aqm at ietf.org
> >>> https://www.ietf.org/mailman/listinfo/aqm
> >Dave Täht
> Cerowrt-devel mailing list
> Cerowrt-devel at lists.bufferbloat.net