Why is the DNS PLR so high? 1% is pretty depressing. Also, it seems odd to eliminate 19% of the content retrieval because the tail is fat and long rather than short. Wouldn't it be better to have 1000 servers?

On Friday, April 18, 2014 2:15pm, "Greg White" said:

> Dave,
>
> We used the 25k object size for a short time back in 2012 until we had resources to build a more advanced model (appendix A). I did a bunch of captures of real web pages back in 2011 and compared the object size statistics to models that I'd seen published. Lognormal didn't seem to be *exactly* right, but it wasn't a bad fit to what I saw. I've attached a CDF.
>
> The choice of 4 servers was based somewhat on logistics, and also on a finding that across our data set, the average web page retrieved 81% of its resources from the top 4 servers. Increasing to 5 servers only increased that percentage to 84%.
>
> The choice of RTTs also came from the web traffic captures. I saw RTTmin=16ms, RTTmean=53.8ms, RTTmax=134ms.
>
> Much of this can be found in https://tools.ietf.org/html/draft-white-httpbis-spdy-analysis-00
>
> In many of the cases that we've simulated, the packet drop probability is less than 1% for DNS packets. In our web model, there are a total of 4 servers, so 4 DNS lookups assuming none of the addresses are cached. If PLR = 1%, there would be a 3.9% chance of losing one or more DNS packets (with a resulting ~5 second additional delay on load time). I've probably oversimplified this, but Kathie N. and I made the call that it would be significantly easier to just do this math than to build a DNS implementation in ns2. We've open sourced the web model (it's on Kathie's web page and will be part of ns2.36) with an encouragement to the community to improve on it. If you'd like to port it to ns3 and add a DNS model, that would be fantastic.
>
> -Greg
>
>
> On 4/17/14, 3:07 PM, "Dave Taht" wrote:
>
> >On Thu, Apr 17, 2014 at 12:01 PM, William Chan (陈智昌) wrote:
> >> Speaking as the primary Chromium developer in charge of this relevant code, I would like to caution putting too much trust in the numbers generated. Any statistical claims about the numbers are probably unreasonable to make.
> >
> >Sigh. Other benchmarks, such as the apache ("ab") benchmark, are primarily designed as stress testers for web servers, not as realistic traffic. Modern web traffic has such a high level of dynamism in it that static web page loads along any distribution seem insufficient, passive analysis of aggregated traffic "feels" incorrect relative to the sorts of home and small business traffic I've seen, and so on.
> >
> >Famous papers, such as this one:
> >
> >http://ccr.sigcomm.org/archive/1995/jan95/ccr-9501-leland.pdf
> >
> >seem possibly irrelevant to draw conclusions from, given the kind of data they analysed, and proceeding from an incorrect model or a gut feel for how the network behaves today seems foolish.
> >
> >Even the most basic of tools, such as httping, had three basic bugs that I found in a few minutes of trying to come up with some basic behaviors yesterday:
> >
> >https://lists.bufferbloat.net/pipermail/bloat/2014-April/001890.html
> >
> >Those are going to be a lot easier to fix than diving into the chromium codebase!
> >
> >There are very few tools worth trusting, and I am always dubious of papers that publish results with unavailable tools and data.
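(Coming back to my DNS question at the top: Greg's 3.9% figure is just the binomial arithmetic on four uncached lookups. A quick Python sketch for reference; the ~5 second penalty is Greg's stated resolver-retry figure, and folding it in as an average delay is my own simplification, not his:)

    # Probability that at least one of the uncached DNS lookups loses a packet,
    # given a per-packet loss rate plr.
    def p_any_dns_loss(plr, lookups=4):
        return 1.0 - (1.0 - plr) ** lookups

    plr = 0.01                    # 1% packet loss rate
    p_loss = p_any_dns_loss(plr)  # 1 - 0.99**4 = 0.0394, i.e. ~3.9%, matching Greg
    # Treating Greg's ~5 s retry timeout as the penalty whenever a loss occurs
    # (a simplification) gives the average added load time:
    print(f"P(lose >= 1 DNS packet) = {p_loss:.2%}, avg added load time ~ {p_loss * 5.0:.2f} s")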
> >The only tools I have any faith in for network analysis are netperf, netperf-wrapper, tcpdump and xplot.org, and to a large extent wireshark. Toke and I have been tearing apart d-itg and I hope to one day be able to add that to my trustable list... but better tools are needed!
> >
> >Tools that I don't have a lot of faith in include that, iperf, anything written in java or other high level languages, speedtest.net, and things like shaperprobe.
> >
> >I have very little faith in ns2, slightly more in ns3, and I've been meaning to look over mininet and other simulators whenever I get some spare time; the mininet results Stanford gets seem pretty reasonable, and I adore their reproducing-results effort. Haven't explored ndt, keep meaning to...
> >
> >> Reasons:
> >> * We don't actively maintain this code. It's behind the command line flags. They are broken. The fact that it still results in numbers on the benchmark extension is an example of where unmaintained code doesn't have the UI disabled, even though the internal workings of the code fail to guarantee correct operation. We haven't disabled it because, well, it's unmaintained.
> >
> >As I mentioned, I was gearing up for a hacking run...
> >
> >The vast majority of results I look at are actually obtained via looking at packet captures. I mostly use benchmarks as abstractions to see if they make some sense relative to the captures, and tend to compare different benchmarks against each other.
> >
> >I realize others don't go into that level of detail, so you have given fair warning! In our case we used the web page benchmarker as a means to try and rapidly acquire some typical distributions of GET and TCP stream requests from things like the Alexa top 1000, and as a way to A/B different aqm/packet scheduling setups.
> >
> >... but the only easily publishable results were from the benchmark itself, and we (reluctantly) only published one graph from all the work that went into it 2+ years back, using it as a test driver for the famous IETF video comparing two identical boxes running it at the same time under different network conditions:
> >
> >https://www.bufferbloat.net/projects/cerowrt/wiki/Bloat-videos#IETF-demo-side-by-side-of-a-normal-cable-modem-vs-fq_codel
> >
> >From what I fiddled with today, it is at least still useful for that?
> >
> >Moving on...
> >
> >The web model in the cablelabs work doesn't look much like my captures, in addition to not modeling DNS at all and using a smaller IW than Google. It looks like this:
> >
> >>> Model single user web page download as follows:
> >>>
> >>> - Web page modeled as single HTML page + 100 objects spread evenly across 4 servers. Web object sizes are currently fixed at 25 kB each, whereas the initial HTML page is 100 kB. Appendix A provides an alternative page model that may be explored in future work.
> >
> >Whereas what I see is a huge amount of stuff that fits into a single IW10 slow-start episode, plus some level of pipelining on larger stuff, so a large number of object sizes of less than 7k, with a lightly tailed distribution outside of that, makes more sense.
> >
> >(I'm not staring at appendix A right now; I'm under the impression it was better.)
> >
> >I certainly would like more suggestions for models and types of web traffic, as well as simulation of https + PFS traffic, spdy, quic, etc....
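(On the object-size piece of that: swapping the model's fixed 25 kB objects for a lognormal draw, along the lines of Greg's fit, is a small change. A minimal Python sketch; the mu/sigma values are placeholders I picked so the median lands near the ~7k I keep seeing, not parameters fitted to Greg's capture CDF:)

    import random

    # Placeholder lognormal parameters (assumed, not fitted): exp(8.85) ~ 7 kB median.
    MU, SIGMA = 8.85, 1.5

    def draw_object_sizes(n_objects=100, seed=1):
        """Sample object sizes in bytes instead of using a fixed 25 kB."""
        rng = random.Random(seed)
        return [int(rng.lognormvariate(MU, SIGMA)) for _ in range(n_objects)]

    sizes = sorted(draw_object_sizes())
    print(f"median ~ {sizes[len(sizes) // 2]} B, max ~ {sizes[-1]} B, total ~ {sum(sizes) / 1e6:.1f} MB")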
> >>> - Server RTTs set as follows (20 ms, 30 ms, 50 ms, 100 ms).
> >
> >Server RTTs from my own web history tend to be lower than 50ms.
> >
> >>> - Initial HTTP GET to retrieve a moderately sized object (100 kB HTML page) from server 1.
> >
> >An initial GET to google fits into iw10 - it's about 7k.
> >
> >>> - Once initial HTTP GET completes, initiate 24 simultaneous HTTP GETs (via separate TCP connections), 6 connections each to 4 different server nodes.
> >
> >I usually don't see more than 15, and certainly not 25k-sized objects.
> >
> >>> - Once each individual HTTP GET completes, initiate a subsequent GET to the same server, until 25 objects have been retrieved from each server.
> >
> >> * We don't make sure to flush all the network state in between runs, so if you're using that option, don't trust it to work.
> >
> >The typical scenario we used was a run against dozens or hundreds of urls, capturing traffic, while varying network conditions.
> >
> >We regarded the first run as the most interesting. You can exit the browser and restart after a run like that.
> >
> >At the moment, I merely plan to use the tool primarily to survey various web sites and load times while doing packet captures. The hope was to get valid data from the network portion of the load, though...
> >
> >> * If you have an advanced Chromium setup, this definitely does not work. I advise using the benchmark extension only with a separate Chromium profile for testing purposes. Our flushing of sockets, caches, etc does not actually work correctly when you use the Chromium multiprofile feature and also fails to flush lots of our other network caches.
> >
> >Noted.
> >
> >> * No one on Chromium really believes the time to paint numbers that we output :) It's complicated. Our graphics stack is complicated. The time from when Blink thinks it painted to when the GPU actually blits to the screen cannot currently be corroborated with any high degree of accuracy from within our code.
> >
> >I actually care only about time to full layout, as that's a core network effect...
> >
> >> * It has not been maintained since 2010. It is quite likely there are many other subtle inaccuracies here.
> >
> >Grok.
> >
> >> In short, while you can expect it to give you a very high level understanding of performance issues, I advise against placing non-trivial confidence in the accuracy of the numbers generated by the benchmark extension. The fact that numbers are produced by the extension should not be treated as evidence that the extension actually functions correctly.
> >
> >OK, noted. Still delighted to be able to have a simple load generator that exercises the browsers and generates some results, however dubious.
> >
> >> Cheers.
> >>
> >> On Thu, Apr 17, 2014 at 10:49 AM, Dave Taht wrote:
> >>>
> >>> Getting a grip on real web page load time behavior in an age of sharded websites, dozens of dns lookups, javascript, and fairly random behavior in ad services and cdns against how a modern browser behaves is very, very hard.
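(For concreteness, the model bullets above work out to 101 requests and about 2.6 MB per page load: one 100 kB HTML GET plus 100 objects of 25 kB, 25 from each of 4 servers over 6 connections apiece. A bookkeeping-only Python sketch; no TCP slow start, loss, or AQM is modeled, and the round-robin connection assignment is my own assumption for illustration:)

    HTML_SIZE = 100_000                  # bytes, initial page from server 1
    OBJECT_SIZE = 25_000                 # bytes, each of the 100 objects
    SERVER_RTTS_MS = [20, 30, 50, 100]   # per the model (unused here; timing not modeled)
    OBJECTS_PER_SERVER = 25
    CONNS_PER_SERVER = 6

    def build_schedule():
        """Return (server, connection, size) for every request the model makes."""
        reqs = [("server1", "conn0", HTML_SIZE)]
        for s in range(1, len(SERVER_RTTS_MS) + 1):
            for i in range(OBJECTS_PER_SERVER):
                reqs.append((f"server{s}", f"conn{i % CONNS_PER_SERVER}", OBJECT_SIZE))
        return reqs

    sched = build_schedule()
    print(f"{len(sched)} requests, {sum(size for _, _, size in sched) / 1e6:.1f} MB total")
    # -> 101 requests, 2.6 MB total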
> >>> It turns out that if you run
> >>>
> >>> google-chrome --enable-benchmarking --enable-net-benchmarking
> >>>
> >>> (Mac users have to embed these options in their startup script - see http://www.chromium.org/developers/how-tos/run-chromium-with-flags ), enable developer options, and install and run the chrome web page benchmarker ( https://chrome.google.com/webstore/detail/page-benchmarker/channimfdomahekjcahlbpccbgaopjll?hl=en ), it works (at least for me, on a brief test of the latest chrome, on linux. Can someone try windows and mac?)
> >>>
> >>> You can then feed in a list of urls to test against, and post-process the resulting .csv file to your heart's content. We used to use this benchmark a lot while trying to characterise typical web behaviors under aqm and packet scheduling systems under load. Running it simultaneously with a rrul test, or one of the simpler tcp upload or download tests in the rrul suite, was often quite interesting.
> >>>
> >>> It turned out the doc has been wrong for a while as to the name of the second command line option. I was gearing up mentally for having to look at the source....
> >>>
> >>> http://code.google.com/p/chromium/issues/detail?id=338705
> >>>
> >>> /me happy
> >>>
> >>> --
> >>> Dave Täht
> >>>
> >>> Heartbleed POC on wifi campus networks with EAP auth:
> >>> http://www.eduroam.edu.au/advisory.html
> >>>
> >>> _______________________________________________
> >>> aqm mailing list
> >>> aqm@ietf.org
> >>> https://www.ietf.org/mailman/listinfo/aqm
> >
> >--
> >Dave Täht
> >
> >NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
> >
> >_______________________________________________
> >aqm mailing list
> >aqm@ietf.org
> >https://www.ietf.org/mailman/listinfo/aqm
>
> _______________________________________________
> Cerowrt-devel mailing list
> Cerowrt-devel@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/cerowrt-devel
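(P.S. If anyone repeats the benchmarker run Dave describes, post-processing the .csv is only a few lines of Python. The "load_time_ms" column name below is a guess for illustration, since the extension's exact output format isn't documented in this thread; check the header of the file your run actually produces:)

    import csv
    import statistics

    def summarize(path, time_col="load_time_ms"):
        """Print median/p95/max page load time from the benchmarker's .csv output."""
        with open(path, newline="") as f:
            rows = list(csv.DictReader(f))
        times = sorted(float(r[time_col]) for r in rows if r.get(time_col))
        print(f"{len(times)} loads: median {statistics.median(times):.0f} ms, "
              f"p95 {times[int(0.95 * (len(times) - 1))]:.0f} ms, max {times[-1]:.0f} ms")

    # summarize("benchmark_results.csv")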