[Bloat] [Cerowrt-devel] [aqm] chrome web page benchmarker fixed

dpreed at reed.com
Fri Apr 18 11:48:08 PDT 2014


Why is the DNS PLR so high?  1% is pretty depressing.
 
Also, it seems odd to eliminate 19% of the content retrieval because the tail is fat and long rather than short.  Wouldn't it be better to have 1000 servers?
 
 


On Friday, April 18, 2014 2:15pm, "Greg White" <g.white at CableLabs.com> said:



> Dave,
> 
> We used the 25k object size for a short time back in 2012 until we had
> resources to build a more advanced model (appendix A).  I did a bunch of
> captures of real web pages back in 2011 and compared the object size
> statistics to models that I'd seen published.  Lognormal didn't seem to be
> *exactly* right, but it wasn't a bad fit to what I saw.  I've attached a
> CDF.
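> 
> (For the curious, the sort of fit I mean can be sketched in a few lines
> of python with scipy. This is illustrative only, not the analysis code
> I actually used, and the sample sizes below are made up:)
> 
>     import numpy as np
>     from scipy import stats
> 
>     # made-up object sizes in bytes, standing in for capture data
>     sizes = np.array([900, 1200, 2300, 3400, 4400, 7100, 25000, 51000])
> 
>     # fit a lognormal with location fixed at zero, as usual for sizes
>     shape, loc, scale = stats.lognorm.fit(sizes, floc=0)
>     print("sigma = %.2f, median = %.0f bytes" % (shape, scale))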
> 
> The choice of 4 servers was based somewhat on logistics, and also on a
> finding that across our data set, the average web page retrieved 81% of
> its resources from the top 4 servers.  Increasing to 5 servers only
> increased that percentage to 84%.
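> 
> (Again, just to illustrate the computation, a toy sketch; the server
> names and counts here are invented, not our data set:)
> 
>     from collections import Counter
> 
>     # one entry per retrieved resource, naming the server it came from
>     servers = ["cdn1", "cdn1", "cdn1", "origin", "origin", "ads",
>                "cdn2", "cdn1", "origin", "cdn3", "cdn2", "origin"]
>     counts = Counter(servers).most_common()
>     total = sum(c for _, c in counts)
>     for n in (4, 5):
>         top = sum(c for _, c in counts[:n])
>         print("top %d servers: %.0f%% of resources"
>               % (n, 100.0 * top / total))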
> 
> The choice of RTTs also came from the web traffic captures. I saw
> RTTmin=16ms, RTTmean=53.8ms, RTTmax=134ms.
> 
> Much of this can be found in
> https://tools.ietf.org/html/draft-white-httpbis-spdy-analysis-00
> 
> In many of the cases that we've simulated, the packet drop probability is
> less than 1% for DNS packets.  In our web model, there are a total of 4
> servers, so 4 DNS lookups assuming none of the addresses are cached. If
> PLR = 1%, there would be a 3.9% chance of losing one or more DNS packets
> (with a resulting ~5 second additional delay on load time).  I've probably
> oversimplified this, but Kathie N. and I made the call that it would be
> significantly easier to just do this math than to build a dns
> implementation in ns2.  We've open sourced the web model (it's on Kathie's
> web page and will be part of ns2.36) with an encouragement to the
> community to improve on it.  If you'd like to port it to ns3 and add a dns
> model, that would be fantastic.
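> 
> (The arithmetic, if anyone wants to check it; the loss rate and the
> four-lookup count are the assumptions stated above:)
> 
>     # probability of losing at least one of 4 independent DNS queries
>     p = 0.01                    # per-packet loss rate
>     print(1 - (1 - p) ** 4)     # ~0.039, i.e. the 3.9% quoted above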
> 
> -Greg
> 
> 
> On 4/17/14, 3:07 PM, "Dave Taht" <dave.taht at gmail.com> wrote:
> 
> >On Thu, Apr 17, 2014 at 12:01 PM, William Chan (陈智昌)
> ><willchan at chromium.org> wrote:
> >> Speaking as the primary Chromium developer in charge of this relevant
> >>code,
> >> I would like to caution putting too much trust in the numbers
> >>generated. Any
> >> statistical claims about the numbers are probably unreasonable to make.
> >
> >Sigh. Other benchmarks, such as the apache ("ab") benchmark, are
> >primarily designed as stress testers for web servers, not as
> >generators of realistic traffic. Modern web traffic has such a high
> >degree of dynamism that static web page loads along any distribution
> >seem insufficient; passive analysis of aggregated traffic "feels"
> >incorrect relative to the sorts of home and small business traffic
> >I've seen; and so on.
> >
> >Famous papers, such as this one:
> >
> >http://ccr.sigcomm.org/archive/1995/jan95/ccr-9501-leland.pdf
> >
> >seem possibly irrelevant to draw conclusions from, given the kind
> >of data they analysed; and proceeding from an incorrect model or
> >gut feel for how the network behaves today seems foolish.
> >
> >Even the most basic of tools, such as httping, had three basic bugs
> >that I found in a few minutes of trying to come up with some basic
> >behaviors yesterday:
> >
> >https://lists.bufferbloat.net/pipermail/bloat/2014-April/001890.html
> >
> >Those are going to be a lot easier to fix than diving into the chromium
> >codebase!
> >
> >There are very few tools worth trusting, and I am always dubious
> >of papers that publish results with unavailable tools and data. The only
> >tools I have any faith in for network analysis are netperf,
> >netperf-wrapper,
> >tcpdump and xplot.org, and to a large extent wireshark. Toke and I have
> >been tearing apart d-itg and I hope to one day be able to add that to
> >my trustable list... but better tools are needed!
> >
> >Tools that I don't have a lot of faith in include that one, iperf,
> >anything written in java or other high level languages, speedtest.net,
> >and things like shaperprobe.
> >
> >I have very little faith in ns2, slightly more in ns3, and I've been
> >meaning to look over mininet and the other simulators whenever I get
> >some spare time; the mininet results stanford gets seem pretty
> >reasonable and I adore their reproducing-results effort. I haven't
> >explored ndt; I keep meaning to...
> >
> >> Reasons:
> >> * We don't actively maintain this code. It's behind the command line
> >>flags.
> >> They are broken. The fact that it still results in numbers on the
> >>benchmark
> >> extension is an example of where unmaintained code doesn't have the UI
> >> disabled, even though the internal workings of the code fail to
> >>guarantee
> >> correct operation. We haven't disabled it because, well, it's
> >>unmaintained.
> >
> >As I mentioned I was gearing up for a hacking run...
> >
> >The vast majority of results I look at are actually obtained via
> >looking at packet captures. I mostly use benchmarks as abstractions
> >to see if they make some sense relative to the captures and tend
> >to compare different benchmarks against each other.
> >
> >I realize others don't go into that level of detail, so you have given
> >fair warning! In our case we used the web page benchmarker as
> >a means to try and rapidly acquire some typical distributions of
> >get and tcp stream requests from things like the alexa top 1000,
> >and as a way to A/B different aqm/packet scheduling setups.
> >
> >... but the only easily publishable results were from the benchmark
> >itself,
> >and we (reluctantly) only published one graph from all the work that
> >went into it 2+ years back and used it as a test driver for the famous
> >ietf video, comparing two identical boxes running it at the same time
> >under different network conditions:
> >
> >https://www.bufferbloat.net/projects/cerowrt/wiki/Bloat-videos#IETF-demo-side-by-side-of-a-normal-cable-modem-vs-fq_codel
> >
> >from what I fiddled with today, it is at least still useful for that?
> >
> >moving on...
> >
> >The web model in the cablelabs work doesn't look much like my
> >captures; in addition to not modeling dns at all, it uses a smaller
> >IW than google. It looks like this:
> >
> >>> Model single user web page download as follows:
> >
> >>> - Web page modeled as single HTML page + 100 objects spread evenly
> >>> across 4 servers. Web object sizes are currently fixed at 25 kB each,
> >>> whereas the initial HTML page is 100 kB. Appendix A provides an
> >>> alternative page model that may be explored in future work.
> >
> >Whereas what I see is a huge amount of stuff that fits into a single
> >iw10 slow start episode, plus some level of pipelining on larger
> >stuff, so a large number of objects of less than 7k, with a lightly
> >tailed distribution beyond that, makes more sense.
> >
> >(I'm not staring at appendix A right now, I'm under the impression
> > it was better)
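> >
> >(If it helps, the kind of distribution I mean can be faked in a few
> >lines; the proportions and ranges here are my gut feel from captures,
> >not measured parameters:)
> >
> >    import numpy as np
> >
> >    np.random.seed(1)
> >    # mostly sub-7k objects, with a light tail of larger ones
> >    small = np.random.randint(200, 7000, size=90)
> >    large = np.random.randint(7000, 60000, size=10)
> >    sizes = np.concatenate([small, large])
> >    print("median %d bytes, mean %d bytes"
> >          % (np.median(sizes), sizes.mean()))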
> >
> >I certainly would like more suggestions for models and types
> >of web traffic, as well as simulation of https + pfs traffic,
> >spdy, quic, etc....
> >
> >>> - Server RTTs set as follows (20 ms, 30 ms, 50 ms, 100 ms).
> >
> >Server RTTs from my own web history tend to be lower than 50ms.
> >
> >>> - Initial HTTP GET to retrieve a moderately sized object (100 kB HTML
> >>> page) from server 1.
> >
> >An initial GET to google fits into iw10 - it's about 7k.
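> >
> >(Back of the envelope, with a typical ~1448-byte MSS assumed:)
> >
> >    mss = 1448       # typical payload per segment on ethernet
> >    iw = 10          # iw10: initial congestion window, in segments
> >    print(iw * mss)  # ~14.5 kB in the first flight; ~7k fits easily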
> >
> >>> - Once initial HTTP GET completes, initiate 24 simultaneous HTTP GETs
> >>> (via separate TCP connections), 6 connections each to 4 different
> >>> server nodes
> >
> >I usually don't see more than 15, and certainly not 25k-sized objects.
> >
> >>> - Once each individual HTTP GET completes, initiate a subsequent GET
> >>> to the same server, until 25 objects have been retrieved from each
> >>> server.
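> >
> >(To make the scale of that model concrete, a quick toy calculation in
> >python; these are just the numbers quoted above, nothing measured:)
> >
> >    html = 100 * 1024            # initial page, bytes
> >    objects = 100                # 25 per server, 4 servers
> >    size = 25 * 1024             # bytes per object
> >    conns = 24                   # 6 connections x 4 servers
> >    total = html + objects * size
> >    print("%.2f MB over %d connections" % (total / 1e6, conns))
> >    print("%.1f objects per connection" % (objects / float(conns)))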
> >
> >
> >> * We don't make sure to flush all the network state in between runs, so
> >>if
> >> you're using that option, don't trust it to work.
> >
> >The typical scenario we used was a run against dozens or hundreds of urls,
> >capturing traffic, while varying network conditions.
> >
> >We regarded the first run as the most interesting.
> >
> >You can exit the browser and restart after a run like that.
> >
> >At the moment, I plan to use the tool primarily to survey various
> >web sites and load times while doing packet captures. The hope was
> >to get valid data from the network portion of the load, tho...
> >
> >> * If you have an advanced Chromium setup, this definitely does not
> >>work. I
> >> advise using the benchmark extension only with a separate Chromium
> >>profile
> >> for testing purposes. Our flushing of sockets, caches, etc does not
> >>actually
> >> work correctly when you use the Chromium multiprofile feature and also
> >>fails
> >> to flush lots of our other network caches.
> >
> >noted.
> >
> >
> >> * No one on Chromium really believes the time to paint numbers that we
> >> output :) It's complicated. Our graphics stack is complicated. The time
> >>from
> >
> >I actually care only about time-to-full layout as that's a core network
> >effect...
> >
> >> when Blink thinks it painted to when the GPU actually blits to the
> >>screen
> >> cannot currently be corroborated with any high degree of accuracy from
> >> within our code.
> >
> >> * It has not been maintained since 2010. It is quite likely there are
> >>many
> >> other subtle inaccuracies here.
> >
> >Grok.
> >
> >> In short, while you can expect it to give you a very high level
> >> understanding of performance issues, I advise against placing
> >>non-trivial
> >> confidence in the accuracy of the numbers generated by the benchmark
> >> extension. The fact that numbers are produced by the extension should
> >>not be
> >> treated as evidence that the extension actually functions correctly.
> >
> >OK, noted. Still delighted to be able to have a simple load generator
> >that exercises the browsers and generates some results, however
> >dubious.
> >
> >>
> >> Cheers.
> >>
> >>
> >> On Thu, Apr 17, 2014 at 10:49 AM, Dave Taht <dave.taht at gmail.com> wrote:
> >>>
> >>> Getting a grip on real web page load time behavior in an age of
> >>> sharded websites, dozens of dns lookups, javascript, and fairly
> >>> random behavior in ad services and cdns, against how a modern
> >>> browser behaves, is very, very hard.
> >>>
> >>> it turns out if you run
> >>>
> >>> google-chrome --enable-benchmarking --enable-net-benchmarking
> >>>
> >>> (Mac users have to embed these options in their startup script - see
> >>>  http://www.chromium.org/developers/how-tos/run-chromium-with-flags )
> >>>
> >>> enable developer options and install and run the chrome web page
> >>> benchmarker
> >>> (https://chrome.google.com/webstore/detail/page-benchmarker/channimfdomahekjcahlbpccbgaopjll?hl=en)
> >>>
> >>> that it works (at least for me, on a brief test of the latest
> >>> chrome, on linux. Can someone try windows and mac?)
> >>>
> >>> You can then feed in a list of urls to test against, and post process
> >>> the resulting .csv file to your heart's content. We used to use this
> >>> benchmark a lot while trying to characterise typical web behaviors
> >>> under aqm and packet scheduling systems under load. Running it
> >>> simultaneously with a rrul test or one of the simpler tcp upload or
> >>> download tests in the rrul suite was often quite interesting.
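> >>>
> >>> (For the post processing, something like this python sketch works;
> >>> note the column name "load_time_ms" is a guess on my part - check
> >>> the actual header the extension writes:)
> >>>
> >>>     import csv
> >>>
> >>>     # "load_time_ms" is hypothetical; inspect the csv header first
> >>>     with open("benchmark.csv") as f:
> >>>         rows = list(csv.DictReader(f))
> >>>     times = sorted(float(r["load_time_ms"]) for r in rows)
> >>>     print("median load time: %.0f ms" % times[len(times) // 2])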
> >>>
> >>> It turned out the doc has been wrong for a while as to the name of
> >>> the second command line option. I was gearing up mentally for
> >>> having to look at the source....
> >>>
> >>> http://code.google.com/p/chromium/issues/detail?id=338705
> >>>
> >>> /me happy
> >>>
> >>> --
> >>> Dave Täht
> >>>
> >>> Heartbleed POC on wifi campus networks with EAP auth:
> >>> http://www.eduroam.edu.au/advisory.html
> >>>
> >>> _______________________________________________
> >>> aqm mailing list
> >>> aqm at ietf.org
> >>> https://www.ietf.org/mailman/listinfo/aqm
> >>
> >>
> >
> >
> >
> >--
> >Dave Täht
> >
> >NSFW:
> >https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
> >
> 
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>