Subject: [Bloat] review: Deployment of RITE mechanisms, in use-case trial testbeds report part 1
From: Dave Täht
To: aqm@ietf.org, bloat@lists.bufferbloat.net
Date: Sat, 27 Feb 2016 11:04:41 -0800

On 2/26/16 3:23 AM, De Schepper, Koen (Nokia - BE) wrote:
> Hi Wes,
>
> Just to let you know that we are still working on AQMs that support scalable (L4S) TCPs.
> We could present some of our latest results (if there will be a meeting in Buenos Aires, otherwise in Berlin?)
>
> * Performance of HTTP Adaptive Video Streaming (HAS) with different TCP's and AQMs
>   o HAS is currently ~30% of Internet traffic, but no AQM testing so far has included it

I am aware of several unpublished studies. There was also a Stanford study from several years back comparing 1-3 HAS flows, which I have long wanted to see repeated against these AQM technologies:

https://reproducingnetworkresearch.wordpress.com/2014/06/03/cs244-14-confused-timid-and-unstable-picking-a-video-streaming-rate-is-hard/

>   o the results are very poor with a particular popular AQM

Define "very poor"?

> Presenter: Inton Tsang
> Duration: 10mins
> Draft: Comparative testing of draft-ietf-aqm-pie-01, draft-ietf-aqm-fq-codel-04, draft-briscoe-aqm-dualq-coupled
>
> For experiment write-up, see Section 3 of https://riteproject.files.wordpress.com/2015/12/rite-deliverable-3-3-public1.pdf

At the risk of sounding grumpy and pedantic (partially driven by trying to read through 58 pages of text on a b/w kindle before I switched to something higher-res):

No source code was released, so the experiments cannot be reproduced.

The diagrams of the topology, etc., were quite helpful, and the insight into the actual complexity of BT's network was quite useful. I would love to know the actual usage of the various QoS queues in terms of real traffic...

Has DCTCP's problem of not responding properly to drop been fixed in Linux yet (sec 3.3.2)? That remains a scary bug...

More detailed comments:

1) The tests exercised download traffic only and seem to have assumed that the uplink would never be a bottleneck. Home connections are asymmetric, and increasingly so - Comcast, for example, is now shipping links with 75 Mbit down and 5.5 Mbit up. It is not clear what the uplink rate of the emulated or real network in this paper is.
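To put a rough number on why the uplink matters even in a pure download test, here is a back-of-the-envelope calculation. The assumptions are mine, not the deliverable's: ~1500-byte data segments, delayed ACKs (one ACK per two segments), and roughly 66 bytes on the wire per ACK.

    # Uplink bandwidth consumed by ACKs alone for a saturated 75 Mbit/s download.
    # Assumptions (mine, not the paper's): 1500-byte segments, delayed ACKs
    # (one per two segments), ~66 bytes on the wire per ACK.
    down_bps = 75e6
    seg_bytes = 1500
    ack_bytes = 66

    segments_per_sec = down_bps / 8 / seg_bytes   # ~6250
    acks_per_sec = segments_per_sec / 2           # ~3125
    ack_bps = acks_per_sec * ack_bytes * 8        # ~1.65 Mbit/s

    print(f"ACK traffic: {ack_bps/1e6:.2f} Mbit/s "
          f"({100*ack_bps/5.5e6:.0f}% of a 5.5 Mbit uplink)")

Under those assumptions, roughly 30% of a 5.5 Mbit uplink goes to ACKs before anyone in the house uploads a single photo - which is why running the tests with the uplink unspecified and idle worries me.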
2) Section 2 would have been tons more interesting had it evaluated the effects of other workloads against actual "twitch" games such as any of the Quake series, Call of Duty, or Starcraft. The game chosen was quite uninteresting from a "needing low latency and jitter to win" standpoint.

Applying interesting background network workloads while game bots competed, as in http://sscaitournament.com/, would have been great fun! Injecting random jitter and delay into the network stack is what Karl Auerbach has been doing to the competitors at the DARPA robotics competitions, and he has probably been the cause of quite a few of those real-life robots falling over (I can hear his evil chuckles now).

Artificially applying a wifi loss rate of 1.5% is a far cry from what actually happens on wifi nowadays, where what you typically see is large spikes in L2 delay.

3) As for Section 3: it uses a "bottleneck of 20Mbps and RTT of 20ms, ten TCP long file download flows as background traffic and one HAS request, which launch two TCP flows." I will try to restrain myself, but:

A) In a cable modem scenario, the typical uplink varies from 2 Mbit to 4 Mbit on a 20 Mbit link. I am not aware of many 20 Mbit symmetric links, aside from FIOS. I would love symmetry on home links - is BT deploying that??

B) This is a workload that bears very little resemblance to any reality I can think of, and it seems expressly designed to show the burst tolerance and additional latency induced by PIE and DualQ to advantage, with fq_codel at a local minimum.

For the sake of science, varying the number of long-running downloads from, say, 2, 4, 8 and 16 would have provided a more rounded picture. Still, I am really hard pressed to think of any case where home HAS traffic will be competing with more than 2-3 long-running full-rate downloads in your typical home. I HAVE seen 3 Android devices attempt to update themselves all at once, but that's it...

Using bittorrent to generate 5-15 flows down, perhaps? Even then, most real-world torrents are not running very fast, there is uTP-based congestion control, and torrenters spend way more time in the upload phase than in the download phase.

Using web access bursts (10-100 flows lasting 2-8 seconds), chock full of DNS accesses, seems to be a good test against HAS traffic. I see there is a section doing that, but it also runs against 5 background flows, and I'm pretty sure there will be a pause between bursts. And although "The web requests followed an exponential arrival process, while the size of the downloaded files were designed to represent actual Internet web objects following a Pareto distribution with size between 1 Kbyte and 1 Mbyte" is documented - I could do the math from that distribution times the number of files (a sketch follows below) - I would prefer that the total size of the web transfer be documented, along with its time to complete. Nor is it clear whether DNS lookups are enabled or counted. Unless I missed it somewhere?

If you are going to measure two loads - one HAS, one web - presenting the results for both, for contrast, seems logical.
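Here is the sketch of that math. The shape parameter is not stated in the text I have, so alpha = 1.2 below is purely my assumption (a value commonly used in web object-size models); the 1 KB and 1 MB bounds are from the deliverable, and the 100 requests/second comes from the experiment quoted just below.

    # Rough estimate of the offered web load, assuming (my assumption, the
    # deliverable doesn't say) a bounded Pareto with shape alpha = 1.2 over
    # [1 KB, 1 MB], at the 100 requests/second used in the Fig 3.22 run.
    alpha = 1.2          # assumed shape parameter
    L, H = 1e3, 1e6      # 1 Kbyte .. 1 Mbyte bounds from the deliverable
    req_per_sec = 100

    # Mean of a bounded Pareto on [L, H] with shape alpha != 1:
    mean_bytes = (alpha * L**alpha / ((alpha - 1) * (1 - (L / H)**alpha))
                  * (L**(1 - alpha) - H**(1 - alpha)))

    offered_mbps = mean_bytes * req_per_sec * 8 / 1e6
    print(f"mean object ~{mean_bytes/1e3:.1f} KB, "
          f"offered web load ~{offered_mbps:.1f} Mbit/s of a 20 Mbit link")

Under those assumptions that works out to roughly 4.5 KB per object and ~3.6 Mbit/s of offered web traffic - but the real numbers depend entirely on the unstated shape parameter, which is exactly why I would prefer the totals and completion times be documented rather than derived.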
"Figure 3.22 shows that HAS-DCTCP can maintain its performance even with the additional 100 web traffic request per second as background traffic. It consistently delivered segments of 2056 Kbps encoding bit rate. In contrast, under the same network condition, the additional web traffic deteriorate further the performance of HAS-CTCP as the delivered segment alternated qualities between 688 Kbps and 991 Kbps encoding rates. This had a lower overall quality when compared with the previous experiment, which without the additional web traffic could achieved a sustainable delivery of segment with 991 Kbps encoding rates, as shown in Figure 3.18"

This kind of stuff really gets under my skin. What was the web page completion time? Which matters more to the users of the household: a slight decline in video quality, or a web page load? Sacrificing all other network applications on the crucible of 4k video is really not my goal in life; I would like to see videoconferencing, gaming, and the web in particular work well in the presence of HAS. Please, please, please put web page completion times in this document.

Also, that sort of web traffic would probably cycle on a 30-60 second basis at most - people just can't read that fast! Secondly, a search engine query often precedes an actual web page lookup.

(I note that web page size growth has slowed dramatically since this WG started; it is now well below the projections we had a few years ago (7 Mbit by 20XX, more like 4 now, but I'd have to go redo the math - growth went from exponential-looking in 2012 to linear now). There is also some data presented recently by some Googlers at netconf 1.1 showing the impact and commonality of DNS failures across their subscriber base.)

I freely confess that maybe I'm out of touch - that maybe having perfect television quality while waiting seconds for web pages to load is what most internet users want, and they deserve to get it, good and hard, along with their supersized mcgreasy burgers and calming drugs delivered to their door.

C) And there is no reverse traffic whatsoever in Section 3. In your typical HAS video-streaming home scenario, the worst behavior I can think of would be bittorrent generated *on the upload*, but even the simplest case - a single upload happening at all - tends towards starving the ACK streams and slowing the ramp-up of the video download flows. Which was not tested. Sigh. How long have I banged on this drum?

D) It is unproven from a QoE perspective (I think) that having a video stream change quality on a "better average" basis is of benefit to the user - never pausing "for buffering", and consistent quality, seem saner goals. I notice when the quality changes down, but rarely when it changes up; "buffering" I seriously notice. I would have liked reference data for drop-tail versus the various experiments in this paper. Typical amounts of head-end buffering on DOCSIS 3.0 Arris CMTSes seem to be in the 800 ms range at 20 Mbit.

E) A workload that I would like to see tested with some rigor is a family of four: one doing a big upload (facebook/instagram), another browsing the web, another doing a phone call or videoconference, and a fourth attempting to watch a movie at the highest possible resolution. The bar for the last has moved beyond SD or HD quality in the past few years - 18 Mbit is a number I keep seeing more and more. Someone with a 20 Mbit link IS going to try for the highest quality, and showing the dynamic range of that (1-18 Mbit) would be more interesting than 1 Mbit to 2, as in this paper. (A rough sketch of a test driver for this workload is in the P.S. below.)

I would also welcome testing 2-3 HAS downloads against these AQMs, attempting those rates, along the lines of the Stanford study I mentioned first.

I petered out before reading Sections 4 and 5. I will try to get to them this week.
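P.S. For concreteness, here is a rough sketch (mine, not from the deliverable) of what a "family of four" test driver could look like. It assumes netperf and curl are installed, a netserver and a web server are reachable at a hypothetical test host, and the host name, file names, rates and timings are all placeholders to be tuned.

    #!/usr/bin/env python3
    # Sketch of a "family of four" workload: one big upload, one web browser,
    # one VoIP-ish flow, and one HAS-style download capped at ~18 Mbit/s.
    # Hypothetical host and object names; adjust to the testbed at hand.
    import subprocess, time

    SERVER = "testserver.example.net"   # placeholder test host
    DURATION = 300                      # seconds

    procs = [
        # 1: long-running upload (photo/video backup style)
        subprocess.Popen(["netperf", "-H", SERVER, "-t", "TCP_STREAM",
                          "-l", str(DURATION)]),
        # 2: VoIP/videoconference stand-in: request/response latency probe
        subprocess.Popen(["netperf", "-H", SERVER, "-t", "UDP_RR",
                          "-l", str(DURATION)]),
        # 3: HAS stand-in: a download rate-limited to ~18 Mbit/s (~2.2 MB/s)
        subprocess.Popen(["curl", "-s", "-o", "/dev/null",
                          "--limit-rate", "2200k",
                          f"http://{SERVER}/bigfile.bin"]),
    ]

    # 4: web browsing: a burst of small fetches every ~45 seconds
    end = time.time() + DURATION
    while time.time() < end:
        burst = [subprocess.Popen(["curl", "-s", "-o", "/dev/null",
                                   f"http://{SERVER}/obj{i}.html"])
                 for i in range(20)]
        for p in burst:
            p.wait()
        time.sleep(45)

    for p in procs:
        p.wait()

The interesting outputs would be the RR latency of the videoconference stand-in, the completion time of each web burst, and whether the rate-capped download can actually hold 18 Mbit/s - all measured together, not one at a time.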