[Bloat] [aqm] review: Deployment of RITE mechanisms, in use-case trial testbeds report part 1

Wed Mar 2 06:26:03 EST 2016

Hi Dave,

Thanks for reviewing our results. I probably should not have referred to this document, as it indeed contains preliminary results and does not show all experiments. More elaborate results will be presented.

Just to be clear: we agree that FQ_ is currently the only network "fix" that makes sure every flow gets an equal and stable rate. But I think you agree that FQ_ has some disadvantages too (mainly how to correctly identify "flows", still requiring a classic self-queue, and not allowing deviation both up and down from the fair rate).

The goal of L4S is to solve the problem in the end-system, allowing a neutral, transport-layer-independent and simple network. And note that our rate fairness benchmark is FQ_! If we can reach results that are close to FQ_ (disregarding its disadvantages here), we have achieved our goal. We don't expect L4S to be better than FQ_ for its default rate fairness, except where FQ_ cannot fix the classic TCP problems in the end-system.

Regarding the results in this document, the absolute quality selected is indeed skewed by both the Windows CTCP and DCTCP achieving a higher throughput than their Linux Reno and DCTCP counterparts. So it is not the absolute values of quality that should be compared, but rather the stability of the quality. In the border cases where the available bitrate is close to the switchover points, the quality of DCTCP HAS stays stable at the expected quality, whereas FQ_ and PIE start fluctuating more heavily: PIE due to classic TCP rate variability and PIE's bigger queue, and FQ_ because the HAS flows cannot regain lost scheduling opportunities during TCP time-outs and restarts (referred to as "the peaks are cut off").
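
To illustrate why rate variability matters near a switchover point, here is a minimal sketch. It assumes a purely rate-based client that picks the highest encoding rate not exceeding the measured throughput; this rule, the function names and the two throughput traces are made up for illustration and are not the client logic used in our experiments. Only the three encoding rates are taken from the report text quoted further down.

```python
# Encoding bitrates (kbps) as quoted from the report in this thread.
LADDER_KBPS = [688, 991, 2056]


def select_quality(throughput_kbps, ladder=LADDER_KBPS):
    """Pick the highest encoding rate not exceeding the measured throughput.

    Hypothetical rate-based rule; real HAS clients also use buffer level,
    smoothing and safety margins.
    """
    candidates = [rate for rate in ladder if rate <= throughput_kbps]
    # Fall back to the lowest rate when even that one does not fit.
    return candidates[-1] if candidates else ladder[0]


def count_switches(throughput_samples):
    """Count quality changes over a sequence of per-segment throughputs."""
    picks = [select_quality(t) for t in throughput_samples]
    return sum(1 for prev, cur in zip(picks, picks[1:]) if prev != cur)


# Two made-up traces with the same mean (2100 kbps), just above the
# 2056 kbps switchover point.
stable = [2100, 2090, 2110, 2100, 2095, 2105]
variable = [2300, 1900, 2400, 1800, 2250, 1950]

print("switches with stable throughput:  ", count_switches(stable))
print("switches with variable throughput:", count_switches(variable))
```

Both traces offer the same average rate, but only the variable one crosses the switchover point, so only it causes quality changes. That is the effect we mean by comparing stability rather than absolute quality.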

Of course, feel free to reproduce our results: Windows Server is freely available for evaluation, including the IIS server; Linux has DCTCP and the AQMs (the immediate step function for DCTCP can be made with RED, so there is no need for DualQ); and other (open source) adaptive streaming servers can be used as well.

More inline too...

Regards,
Koen.

> -----Original Message-----
> From: aqm [mailto:aqm-bounces at ietf.org] On Behalf Of EXT Dave Täht
> Sent: zaterdag 27 februari 2016 20:05
> To: aqm at ietf.org; bloat at lists.bufferbloat.net
> Subject: [aqm] review: Deployment of RITE mechanisms, in use-case trial
> testbeds report part 1
> 
> 
> 
> 
> On 2/26/16 3:23 AM, De Schepper, Koen (Nokia - BE) wrote:
> > Hi Wes,
> >
> > Just to let you know that we are still working on AQMs that support
> scalable (L4S) TCPs.
> > We could present some of our latest results (if there will be a
> meeting in Buenos Aires, otherwise in Berlin?)
> >
> > * Performance of HTTP Adaptive Video Streaming (HAS) with different
> TCP's and AQMs
> >    o HAS is currently ~30% of Internet traffic, but no AQM testing so
> far has included it
> 
> I am aware of several unpublished studies. There was also something that
> compared 1-3 HAS flows from several years back from stanford that I've
> longed to be repeated against these aqm technologies.
> 
> https://reproducingnetworkresearch.wordpress.com/2014/06/03/cs244-14-
> confused-timid-and-unstable-picking-a-video-streaming-rate-is-hard/
> 
> >    o the results are very poor with a particular popular AQM
> 
> Define "very poor". ?
> 

"Very poor" compared to what can be expected from a perfect per-flow scheduler in the network ;-). Of course it is not necessarily due to FQ_'s performance, but rather to the combination of FQ_ and Classic TCP's shortcomings. If something is not functioning well in the end-systems, it cannot always be fixed in the network alone. So this is certainly not a criticism of the FQ_ implementation.

> > Presenter: Inton Tsang
> > Duration: 10mins
> > Draft: Comparative testing of draft-ietf-aqm-pie-01, draft-ietf-aqm-
> fq-codel-04, draft-briscoe-aqm-dualq-coupled
> 
> 
> 
> >
> > For experiment write-up, see Section 3 of
> https://riteproject.files.wordpress.com/2015/12/rite-deliverable-3-3-
> public1.pdf
> 
> At the risk of sounding grumpy and pedantic, partially driven by trying
> to read through 58 pages of text on a b/w kindle (before I switched to
> something higher res)

I appreciate your effort in reviewing the full document, but I mentioned section 3 as the relevant section here.
Note that this deliverable is not like a peer-reviewed publication, but a write-up of research at a certain moment in time, and is therefore mostly work in progress. It has been reviewed before by independent experts assigned by the EU, and some of their comments were in line with yours.

> 
> No source, can't reproduce.
> 
> The diagrams of the topology, etc, were quite helpful. Insight into the
> actual complexity of BT's network was quite useful. I would love to know
> the actual usage of the various QoS queues in terms of real traffic...
> 
> Has DCTCP's problem of not responding properly to drop been fixed in
> linux yet? (sec 3.3.2). That remains a scary bug....
> 
> More detailed comments:
> 
> The tests tested download traffic only and seem to have assumed that the
> uplink would never have been a bottleneck. Home connections are
> asymmetric and increasingly so - comcast, for example is now putting out
> links with 75Mbit down, 5.5Mbit up. It is not clear what the uplink rate
> of the emulated or real network is in this paper.
> 
> 2) Section 2 - would have been tons more interesting had it evaluated
> the effects of other workloads against actual "twitch" games such as any
> of the quake series, call of duty, or starcraft. The game chosen was
> quite uninteresting on the "needing a low latency and jitter to win"
> front.
> 
> Applying interesting background network workloads while game bots
> competed as in here http://sscaitournament.com/ would have been great
> fun! - injecting random jitter and delay into the network stack has been
> what Karl Auerbach has been doing to the competitors at the DARPA robot
> competitions, and he's probably been the cause of quite a few of those
> real-life robots falling over (I can hear his evil chuckles now).
> 
> Artificially applying a wifi loss rate of 1.5% is a far cry from what
> actually happens on wifi, where what you see is more typically large
> spikes in l2 delay, nowadays.
> 
> 3) As for section 3 - it uses a "bottleneck of 20Mbps and RTT of 20ms,
> ten TCP long file download flows as background traffic and one HAS
> request, which launch two TCP flows." I will try to restrain myself,
> but:
> 
> A) In a cable modem scenario, the typical uplink varies from 2mbit to
> 4mbit on a 20mbit link. I am not aware of many 20mbit symmetric links,
> aside from FIOS. I would love symmetry on home links, is BT deploying
> that??

We define these experiments with the purpose of analyzing and improving, not necessarily of resembling realistic scenarios. Of course we try to cover the range of today's realistic cases, and ultimately evaluation of realistic test cases is needed as well. But when too many mechanisms are mixed together, we find the result too complex to analyze. The current Internet works correctly most of the time; it is only the short moments of coincidental conditions that make users experience the Internet as unreliable. We tried to create these conditions in our test cases frequently enough that we can evaluate the impact of the different mechanisms (AQMs, schedulers and CCs). Also, detecting and solving "none-used" problems allows new applications to become realistic...

> 
> B) this is a workload that has very little resemblance to any reality I
> can think of and seems expressly designed to show the burst tolerance
> and additional latency induced by pie and dualQ to an advantage, and
> fq_codel at a local minimum.
> 
> For the sake of science, varying the number of long running downloads
> from, say, 2,4,8 and 16, would have provided a more rounded picture.
> Still I am really hard pressed to think of any case where home HAS
> traffic will be competing with more than 2-3 long running full rate
> downloads in your typical home. I HAVE seen 3 android devices attempt to
> update themselves all at once, but that's it....
> 
> Using bittorrent to generate 5-15 flows down, perhaps? Even then, most
> real world torrents are not running very fast, there's utp based
> congestion control, and torrenters spend way more time in the upload
> phase than the download.
> 
> ...
> 
> Using web access bursts (10-100 flows lasting for 2-8 seconds), chock
> full of dns accesses, seems to be a good test against HAS traffic. I see
> there is a section doing that, but it is against 5 background flows,
> also and I'm pretty sure there will be a pause between bursts, and
> although the "The web requests followed an exponential arrival process,
> while the size of the downloaded files were designed to represent actual
> Internet web objects following a Pareto distribution with size between 1
> Kbyte and 1 Mbyte" is documented...
> 
> I could do the math assuming that distribution * the number of files...
> but I would prefer the total size of the web transfer be documented, and
> it's time to complete a load presented... nor is it clear if dns lookups
> are enabled or counted. Unless I missed it somewhere? If you are going
> to measure two loads - one HAS, one web, presenting the results for both
> for contrast, seems logical.
> 
> "Figure 3.22 shows that HAS-DCTCP can maintain its performance even with
> the additional 100 web traffic request per second as background traffic.
> It consistently delivered segments of 2056 Kbps encoding
> bit rate. In contrast, under the same network condition, the additional
> web traffic deteriorate further the performance of HAS-CTCP as the
> delivered segment alternated qualities between 688 Kbps and
> 991 Kbps encoding rates. This had a lower overall quality when compared
> with the previous experiment, which without the additional web traffic
> could achieved a sustainable delivery of segment with 991 Kbps
> encoding rates, as shown in Figure 3.18"
> 
> This kind of stuff really gets under my skin. What was the web page
> completion time? Which matters more to the users of the household? a
> slight decline in video quality, or a web page load?
> 
> Sacrificing all other network applications on the crucible of 4k video
> is really not my goal in life, I would like to see videoconferencing,
> gaming, web, in particular, work well in the presence of HAS.
> 
> Please, please, please, put web page completion times in this document?

For our purpose, we prefer the statistical mix of downloads with different sizes and the representation of completion times for these different sizes. Again, the aim is to create sporadic events frequently enough to study them. This representation makes it possible to clearly compare the impact of the different mechanisms, as we repeat the same scenarios over the different combinations of AQMs, schedulers and CCs.
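
As a rough sketch of what such a statistical mix looks like, the snippet below generates web requests with exponential inter-arrival times and bounded Pareto object sizes between 1 Kbyte and 1 Mbyte, as described in the report text you quoted. The Pareto shape parameter (1.2), the reading of 1 Kbyte as 1000 bytes, the seed and all function names are assumptions for illustration only; they are not taken from the deliverable, and this is not our actual traffic generator.

```python
import random


def bounded_pareto(rng, shape, low, high):
    """Draw one sample from a Pareto distribution truncated to [low, high]
    using the inverse CDF."""
    u = rng.random()  # uniform in [0, 1)
    return low / (1.0 - u * (1.0 - (low / high) ** shape)) ** (1.0 / shape)


def bounded_pareto_mean(shape, low, high):
    """Analytical mean of the bounded Pareto distribution (shape != 1)."""
    norm = low ** shape / (1.0 - (low / high) ** shape)
    return norm * shape / (shape - 1.0) * (low ** (1.0 - shape) - high ** (1.0 - shape))


def generate_requests(rate_per_s, duration_s, shape=1.2,
                      low=1_000, high=1_000_000, seed=1):
    """Return a list of (arrival_time_s, size_bytes) tuples.

    Arrivals follow a Poisson process (exponential gaps); sizes are bounded
    Pareto. shape=1.2 is an assumed value, not one from the report.
    """
    rng = random.Random(seed)
    t = 0.0
    requests = []
    while True:
        t += rng.expovariate(rate_per_s)
        if t >= duration_s:
            break
        requests.append((t, bounded_pareto(rng, shape, low, high)))
    return requests


requests = generate_requests(rate_per_s=100, duration_s=60)
mean_size = bounded_pareto_mean(1.2, 1_000, 1_000_000)
offered_load_bps = 100 * mean_size * 8  # requests/s * bytes * bits per byte

print("requests generated:", len(requests))
print("mean object size (bytes):", round(mean_size))
print("offered web load (Mbit/s):", round(offered_load_bps / 1e6, 2))
```

With a distribution like this, most objects are small while a few are large, which is why we report completion times per size rather than a single total.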

> 
> ...
> 
> Also that sort of web traffic would probably cycle on a 30-60 second
> basis at most - people just can't read that fast! Secondly, a search
> engine query often precedes an actual web page lookup.
> 
>  (I note that web page size growth has slowed dramatically since this wg
> started, it is now well below the projections we had a few years ago
> (7mbit by 20XX, more like 4 now, but I'd have to go redo the math -
> growth went from exponential-looking in 2012 to linear now...). There's
> also some data presented recently by some googlers at netconf 1.1 that
> showed the impact and commonality of dns failures across their
> subscriber base.....
> 
> I freely confess that maybe i'm out of touch, that maybe having perfect
> television quality while waiting seconds for web pages to load, is what
> most internet users want, and they deserve to get it, good and hard,
> along with their supersized mcgreasy burgers and calming drugs delivered
> to their door.
> 
> C) And no reverse traffic whatsoever in section three.
> 
> In your typical HAS video streaming home scenario, the worst behaviors I
> can think of would be bittorrent generated *on the upload*, but the
> simplest case of a single upload happening at all - tends towards
> starving the ack streams and ramp up times on the video download flows.
> Which was not tested. Sigh. How long have I banged on this drum?
> 
> D) It is unproven from a QOE perspective (I think) that having a video
> stream change qualities on a "better average" basis is of benefit to the
> user - seeing a stream never pause "for buffering", and consistent
> quality, seems saner. I notice when the quality changes down, but rarely
> up. "buffering" - that I seriously notice.
> 
> I would have liked reference data for drop tail vs the various
> experiments in this paper. Typical amounts of head end buffering on
> docsis 3.0 arris cmtss seem to be in the 800ms range at 20mbit.
> 
> E) A workload that I would like to see tested with some rigor is a
> family of four - one doing a big upload (facebook/instagram), another
> browsing the web, another doing a phone call or videoconference, and a
> fourth attempting to watch a movie at the highest possible resolution.
> 
> The bar for the last has moved to SD or HD quality in the past few years
> - 18mbits is a number I keep seeing more and more. Someone with a 20Mbit
> link IS going to try for the highest quality, and showing the dynamic
> range of that (1-18mbits) would be more interesting than 1mbit to 2, as
> in this paper.
> 
> I also would welcome 2-3 HAS testing on downloads against these AQMs,
> attempting those rates, along the lines of the stanford thing I
> mentioned first.
> 
> I petered out before reading sections 4 and 5. I will try to get to it
> this week.
> 