From: Dave Taht
To: Greg White
Cc: William Chan (陈智昌), aqm@ietf.org, cerowrt-devel@lists.bufferbloat.net, bloat
Date: Fri, 18 Apr 2014 12:05:15 -0700
Subject: Re: [Cerowrt-devel] [aqm] chrome web page benchmarker fixed
List-Id: Development issues regarding the cerowrt test router project

On Fri, Apr 18, 2014 at 11:15 AM, Greg
White wrote:

> Dave,
>
> We used the 25k object size for a short time back in 2012 until we had
> resources to build a more advanced model (appendix A). I did a bunch of
> captures of real web pages back in 2011 and compared the object size
> statistics to models that I'd seen published. Lognormal didn't seem to be
> *exactly* right, but it wasn't a bad fit to what I saw. I've attached a
> CDF.

That does seem a bit large on the initial 20%. Hmm.

There is a second major case, where you are moving around on the same web
property, and hopefully many core portions of the web page(s), such as the
css and javascript, basic logos and other images, are cached.

Caching is handled in two ways: one is to explicitly mark the data as
cacheable for a certain period; the other is an if-modified-since request,
which costs RTTs for the setup and the query. I am under the impression
that we generally see a lot more of the latter than the former these days.

> The choice of 4 servers was based somewhat on logistics, and also on a
> finding that across our data set, the average web page retrieved 81% of
> its resources from the top 4 servers. Increasing to 5 servers only
> increased that percentage to 84%.
>
> The choice of RTTs also came from the web traffic captures. I saw
> RTTmin=16ms, RTTmean=53.8ms, RTTmax=134ms.

Did you get a median? My own stats are probably skewed lower from my being
in California, and from doing some tests from places like isc.org in
Redwood City, which is insanely well co-located.

> Much of this can be found in
> https://tools.ietf.org/html/draft-white-httpbis-spdy-analysis-00

Thx!

> In many of the cases that we've simulated, the packet drop probability is
> less than 1% for DNS packets. In our web model, there are a total of 4

I think we have the ability to get a better number for dns loss now.

> servers, so 4 DNS lookups assuming none of the addresses are cached.
> If PLR = 1%, there would be a 3.9% chance of losing one or more DNS
> packets (with a resulting ~5 second additional delay on load time). I've
> probably oversimplified this, but Kathie N. and I made the call that it
> would be significantly easier to just do this math than to build a dns
> implementation in ns2.

The specific thing I've been concerned about was not the probability of a
dns loss - although, as you note, the consequences are huge - but the
frequency and cost of a cache miss and the resulting fill.

This is a very simple namebench test against the alexa top 1000:

http://snapon.lab.bufferbloat.net/~d/namebench/namebench_2014-03-20_1255.html

This is a more comprehensive one, taken against my own recent web history
file:

http://snapon.lab.bufferbloat.net/~d/namebench/namebench_2014-03-24_1541.html

Both of these were taken against the default SQM system in cerowrt, behind
a cable modem, so you can pretty safely assume the ~20ms (middle) knee in
the curve is basically the physical RTT to the nearest upstream DNS server.
And since it's a benchmark, I don't generally believe the relative hit
ratios vis-a-vis "normal traffic", but I do think the baseline RTT, and the
knees in the curves for the cost of a miss and fill, are relevant. (It's
also not clear to me whether all cable modems run a local dns server.)

Recently Simon Kelly added support for gathering hit and miss statistics to
dnsmasq 2.69. They can be obtained via a simple dns lookup, as answers to
queries of class CHAOS and type TXT in domain bind. The domain names are
cachesize.bind, insertions.bind, evictions.bind, misses.bind, hits.bind,
auth.bind and servers.bind.
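For the curious, there's no magic in these queries - it's an ordinary DNS
packet with QCLASS set to CHAOS (3) and QTYPE set to TXT (16). A little
python sketch of the wire format (the function name and the 127.0.0.1
resolver address are mine, and the send step is wrapped so it's harmless to
run without a local dnsmasq):

```python
import socket
import struct

def build_chaos_txt_query(name: str, txid: int = 0x1234) -> bytes:
    """Build a DNS query for a CHAOS-class TXT record, e.g. cachesize.bind."""
    # Header: id, flags (RD set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    # QTYPE=TXT (16), QCLASS=CH (3).
    return header + qname + struct.pack("!HH", 16, 3)

if __name__ == "__main__":
    query = build_chaos_txt_query("cachesize.bind")
    print("query bytes:", query.hex())
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(2.0)
        sock.sendto(query, ("127.0.0.1", 53))  # assumes dnsmasq listens locally
        reply, _ = sock.recvfrom(512)
        print("reply bytes:", reply.hex())
    except OSError:
        print("no local resolver answered (fine for a dry run)")
```

dig does the same thing with less typing, of course.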
An example command to query this, using the dig utility, would be:

  dig +short chaos txt cachesize.bind

It would be very interesting to see the differences between dnsmasq without
DNSSEC, with DNSSEC, and with DNSSEC plus --dnssec-check-unsigned (checking
for proof of non-existence) - we've been a bit concerned about the overhead
of the last in particular.

Getting more elaborate stats (hit, miss, and fill costs) is under
discussion.

> We've open sourced the web model (it's on Kathie's
> web page and will be part of ns2.36) with an encouragement to the
> community to improve on it. If you'd like to port it to ns3 and add a dns
> model, that would be fantastic.

As part of the google summer of code I am signed up to mentor a student,
with Tom, for the *codel-related bits in ns3, and I certainly plan to get
my fingers dirty in the cablelabs drop. There was also a very encouraging
patch set distributed recently for tcp-cubic with hystart support, as well
as a halfway decent 802.11 mac emulation.

As usual I have no funding, personally, to tackle the job, but I'll do what
I can anyway. It would be wonderful to finally have all the ns2 and ns3
code mainlined for more people to use.

-- 
Dave Täht