* Re: [Ecn-sane] abc congestion control on time varying wireless links
2019-12-11 19:54 ` Dave Taht
@ 2019-12-11 20:10 ` Dave Taht
2019-12-11 20:12 ` [Ecn-sane] [Bloat] " Jonathan Morton
2019-12-11 21:18 ` [Ecn-sane] " David P. Reed
2 siblings, 0 replies; 9+ messages in thread
From: Dave Taht @ 2019-12-11 20:10 UTC (permalink / raw)
To: Prateesh Goyal
Cc: bloat, ECN-Sane, Hari Balakrishnan, Mohammad Alizadeh, Make-Wifi-fast
On Wed, Dec 11, 2019 at 11:54 AM Dave Taht <dave.taht@gmail.com> wrote:
>
> On Wed, Dec 11, 2019 at 11:19 AM Prateesh Goyal <g.pratish@gmail.com> wrote:
> >
> > Adding Hari, Mohammad
> >
> > On Wed, Dec 11, 2019 at 2:17 PM Dave Taht <dave.taht@gmail.com> wrote:
> >>
> >> https://arxiv.org/pdf/1905.03429.pdf
> >>
> >> the principal item of interest is section 3.1.2 where the accelerate
> >> and brake concepts and math are described.
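(An aside for anyone who hasn't read the paper: the accelerate/brake rule of section 3.1.2 boils down to something like the sketch below. The variable names and the eta/delta defaults are my paraphrase of the paper's math, not the authors' code.)

```python
# Hedged sketch of the ABC "accelerate/brake" router rule from section
# 3.1.2 of the paper (arXiv:1905.03429). Variable names and the default
# eta/delta values are my paraphrase of the math, not the authors' code.

def abc_mark_fraction(mu, q, cr, eta=0.95, delta=0.133):
    """Fraction of packets the router should mark 'accelerate'.

    mu    -- currently measured link capacity (packets/sec)
    q     -- current queue length (packets)
    cr    -- sender's current rate as seen at the router (packets/sec)
    eta   -- target utilization, slightly below 1
    delta -- queue-draining time constant (seconds)
    """
    # Target rate: track capacity while draining any standing queue.
    tr = max(eta * mu - q / delta, 0.0)
    # The sender transmits 2 packets per 'accelerate' ack and 0 per
    # 'brake' ack, so a marking fraction f yields a next-RTT rate of
    # about 2*f*cr. Solving for f:
    return min(1.0, tr / (2.0 * cr)) if cr > 0 else 1.0
```

The router marks each packet 'accelerate' with this probability and 'brake' otherwise; because the sender sends two packets per accelerate ack and none per brake ack, the rate converges toward the target within roughly one RTT.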
>
> What we have now is a string of conflicts of interest over the values
> of the ECN bits, in part based
> on the characteristics of the underlying link technologies.
>
> The DC folk want a more immediate, multi-bit signal, which L4S is
> kind of targeted at (and SCE also
> applies). I haven't seen any data yet on how well DCTCP- or SCE-style
> signaling can work on wildly RTT-varying links, although it's been
> pitched at the LTE direction, not at wifi.
>
> The ABC concept hasn't been tried in a DC-like environment, and while
> it shows some good results in both the LTE and wifi simulations, it
> was not compared against the fq_codel-based solution currently in
> Linux wifi, nor against the minstrel rate controller.
>
> I have plenty of data on how fq_codel + RFC3168 ECN currently works on
> wifi. I like to think it's pretty good, but it's still pretty slow to
> respond with just RFC3168 marking or drop.
>
> This is yet another one of those cases where unified sets of
> benchmarks would help.
>
> And then there's, like, the actual deployment on actual devices... I
> just did a string of benchmarks, tethered to my new moto 6e phone. You
> saturate the download, and nearly ALL other traffic (icmp and udp) in
> the upstream direction gets starved out.
>
(I'll post these at some point)
Kan Yan, Toke, and a multitude of others have committed AQL
(airtime queue limits) for the QCA ath10k 802.11ac chip to
the Linux kernel, and it should be appearing in mainline
and in OpenWrt soon if it hasn't already. (It already worked on the
mt76, and I'm hoping we can make it work on the iwl devices, notably
the new ax ones.)
https://lore.kernel.org/linux-wireless/20191119060610.76681-5-kyan@google.com/
Kan's data and post about it:
https://drive.google.com/corp/drive/folders/14OIuQEHOUiIoNrVnKprj6rBYFNZ0Coif
The raw trace, parsed data in csv format and plots can be found here:
https://drive.google.com/open?id=1Mg_wHu7elYAdkXz4u--42qGCVE1nrILV
All tests are done with 2 TCP download sessions that oversubscribe
the link bandwidth.
With AQL on, the mean sojourn time is about 20000 us, matching the
default codel "target".
With AQL off, the mean sojourn time is less than 4 us even though the
latency is off the charts, just as we expected: fq_codel with mac80211
alone is not effective for drivers with deep firmware/hardware queues.
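To spell out why AQL matters for codel: codel acts on per-packet sojourn time in the queue it manages, so a deep firmware queue below it absorbs the backlog and hides it. A toy version of the trigger logic makes this visible (constants match the thread's discussion, but the code is an illustration, not the mac80211 implementation):

```python
# Toy illustration of the CoDel trigger. With a deep firmware queue
# below the qdisc, the managed queue's sojourn time stays near zero and
# this logic never fires, however bad the real air latency gets. AQL
# caps the airtime queued below, pushing the standing queue back up
# where this logic can see it.

TARGET_US = 20_000     # 20 ms wifi codel target discussed above
INTERVAL_US = 100_000  # 100 ms grace interval (illustrative)

def codel_should_drop(sojourn_us, first_above_us, now_us):
    """Return (drop?, updated first_above_us) for one dequeued packet."""
    if sojourn_us < TARGET_US:
        return False, 0                     # below target: reset tracking
    if first_above_us == 0:
        return False, now_us + INTERVAL_US  # start the grace interval
    return now_us >= first_above_us, first_above_us
```

With AQL off, the measured sojourn time is ~4 us at the qdisc even when air latency is huge, so the first branch always wins and codel never signals.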
Kan followed up with some 10 ms vs 20 ms codel target data:
> Apologies for the late reply. Here are the test results with the target set to 10 ms.
> The trace for the sojourn time:
> https://drive.google.com/open?id=1MEy_wbKKdl22yF17hZaGzpv3uOz6orTi
>
> Flent test for 20 ms target time vs 10 ms target time:
> https://drive.google.com/open?id=1leIWe0-L0XE78eFvlmRJlNmYgbpoH8xZ
At which point a debate kicked off on the make-wifi-fast list about using
the 10 ms target on wifi, particularly with multiple stations transmitting.
https://lists.bufferbloat.net/pipermail/make-wifi-fast/2019-December/002605.html
To me, the arrival of AQL, and the applicability of various AQM
technologies to 802.11ac devices, is kind of a whole new debate, one
that we simply do not have enough data on.
--
Make Music, Not War
Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-435-0729
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Ecn-sane] [Bloat] abc congestion control on time varying wireless links
2019-12-11 19:54 ` Dave Taht
2019-12-11 20:10 ` Dave Taht
@ 2019-12-11 20:12 ` Jonathan Morton
2019-12-12 21:31 ` Dave Taht
2019-12-11 21:18 ` [Ecn-sane] " David P. Reed
2 siblings, 1 reply; 9+ messages in thread
From: Jonathan Morton @ 2019-12-11 20:12 UTC (permalink / raw)
To: Dave Taht
Cc: Prateesh Goyal, Hari Balakrishnan, ECN-Sane, Mohammad Alizadeh, bloat
> On 11 Dec, 2019, at 9:54 pm, Dave Taht <dave.taht@gmail.com> wrote:
>
> The DC folk want a more immediate, multi-bit signal, which L4S is
> kind of targeted at (and SCE also
> applies). I haven't seen any data yet on how well DCTCP- or SCE-style
> signaling can work on wildly RTT-varying links, although it's been
> pitched at the LTE direction, not at wifi.
It turns out that a Codel marking strategy for SCE, with modified parameters of course, works well for tolerating bursty and aggregating links. The RED-ramp and step-function strategies do not - and they're equally bad if the same test scenario is applied to DCTCP or TCP Prague.
The difference is not small; switching from RED to Codel improves goodput from 1/8th of nominal link capacity to 80%, when a rough model of wifi characteristics is inserted into our usual Internet-path scenario.
We're currently exploring how best to set the extra set of Codel parameters involved.
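To make the contrast concrete, here is a rough sketch of the step and RED-ramp marking shapes (thresholds here are placeholders, not our experimental parameters):

```python
# Rough sketch of the step and RED-ramp marking shapes mentioned above.
# Thresholds are placeholders, not the parameters from these experiments.
import random

def step_mark(sojourn_us, thresh_us=1_000):
    # L4S-style step: mark everything past a fixed shallow threshold.
    # A wifi aggregation burst crosses the threshold all at once, so
    # whole bursts get marked and the sender backs off drastically.
    return sojourn_us > thresh_us

def red_ramp_mark(sojourn_us, lo_us=1_000, hi_us=10_000):
    # RED-style ramp: marking probability rises linearly across the ramp.
    if sojourn_us <= lo_us:
        return False
    if sojourn_us >= hi_us:
        return True
    return random.random() < (sojourn_us - lo_us) / (hi_us - lo_us)

# A Codel-style SCE marker instead waits for the sojourn time to stay
# above target for a full interval before marking, so a transient
# aggregation burst is forgiven rather than punished.
```

That forgiveness of transient bursts is plausibly why the Codel strategy tolerates bursty, aggregating links where the other two do not.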
- Jonathan Morton
* Re: [Ecn-sane] abc congestion control on time varying wireless links
2019-12-11 19:54 ` Dave Taht
2019-12-11 20:10 ` Dave Taht
2019-12-11 20:12 ` [Ecn-sane] [Bloat] " Jonathan Morton
@ 2019-12-11 21:18 ` David P. Reed
2019-12-11 21:30 ` David P. Reed
2 siblings, 1 reply; 9+ messages in thread
From: David P. Reed @ 2019-12-11 21:18 UTC (permalink / raw)
To: Dave Taht
Cc: Prateesh Goyal, Hari Balakrishnan, ECN-Sane, Mohammad Alizadeh, bloat
I will not be gentle here. The authors deserve my typical peer-review feedback as an expert in the field of wireless protocols and congestion. (Many of you on the list are as well, I know, and may have different reviews.) But I'm very troubled by this paper's claims. It's technically interesting, but seriously flawed, enough that I would send it back for more work before publication (not that my opinion matters these days).
A separate perspective from me on the paper.
1) There is a problem in the very wording of the paper's title. WiFi is not at all a "time varying wireless link." Nor is it obvious that a time-varying link is even a good approximate model of WiFi LANs. What do I mean here?
a. WiFi is not a link. In its typical deployment (non-peer-to-peer) it is a hub that is multiplexed by many wireless links that share the same spatial channel, but follow different paths.
b. WiFi's spatial shared wireless channel's temporal behavior is not modeled by a single scalar variable called "speed" or "error rate" that is varying over a range over time.
c. Congestion is typically queueing delay on a shared FIFO queue. In the AP-STA operation described, when delays happen they are not at all characterized by a single shared FIFO queue. In fact each packet travels twice through the air, each time in a highly correlated temporal distribution, and each packet travels through two FIFO queues, plus a strange CSMA exponential-backoff queue. This is NOT congestion in any real sense.
2) the paper doesn't present any data whatever regarding actual observed channel behaviors, or even actual observed effects.
a. Indoor propagation of OFDM signals is complicated. I've done actual measurements, and continue to carry them out. But many others have as well. The sources of variability over time, and the time constants of that variability, are not well characterized in the literature at all. My dear friend Ted Rappaport is an expert on *outdoor* microwave and mmwave propagation, and has done lots of measurements there. But not indoor, where such things as rotating fans, moving people, floor and ceiling elements, etc. all affect the propagation of OFDM signals in ways that do vary, but not according to any model that has been characterized sufficiently to build, say, an ns2 simulation.
b. The indoor behavior of signals at the MAC layer is highly variable due to many effects, not all physical (for example, microwave noise that affects the time spent waiting for a "clear" channel before a station can transmit; this can vary a lot). Also, in a multi-user dwelling or an enterprise office/campus, other WiFi traffic causes nontrivial delay at the MAC layer. The problem here is that this "interference" (not radio interference at all, but MAC-layer variability) is not slowly varying in any sense. That this is modelable by a congestion control mechanism of any sort is not clear.
c. Driving all of this is the mix of application traffic in a "local area" (the physical region around the access point, and the upstream network to which the access point connects). Not all of this traffic is anything like a simple distribution. In fact, it's time varying across many time scales. For example, Netflix video is typically TCP with controlled bursts (buffer filling) separated by relatively long quiet periods. These bursts can use up all available airtime. In contrast, web traffic for one "page" often involves many independent HTTP streams (or soon HTTP/3-over-UDP streams, being rolled out at scale by Google on all its services) involving tens or even hundreds of distinct remote sites, where response time is critical (lag under load is unacceptable).
3. the paper alludes to, but doesn't really characterize, the issue of "fairness" very well. Fairness isn't Harrison Bergeron style exact matching of bits delivered among all pairs. Instead, it really amounts to allocation of latency degradation (due to excess queueing) among independent applications sharing the medium. In other words, it is more like "non-starvation", except where the applications themselves may actually back off their load when resources are reduced, to be "friendly".
I am afraid that this pragmatic issue, the real goal of congestion control, is poorly discussed in the paper, yet it is the crucial measure of a good congestion control scheme. Throughput is entirely secondary to avoiding starvation unless the starvation can be proved to be inherent in the load presented.
Now I will say the mechanism presented may well be quite useful, but I think such mechanisms should not just be presented in the technical literature *as if it were obvious that they are useful*, at least in some typical real-world situations.
In other words, before launching into solving a problem, one needs to research and characterize the problem being solved. Preferably this research will produce good experimentally valid models.
We saw back in the early 1970s a huge volume of theoretical work from some famous people (Bob Gallager of MIT is a good example) where packet networks were evaluated under Poisson arrival loads, and asserted to be good. It turns out that there are NO real-world networks that have anything like Poisson arrival processes. The only reason Poisson arrival processes are interesting is that they are mathematically trivial to analyze in closed form without simulation.
But work on time-shared operating system schedulers in the 1960's (at MIT, in the Multics project, but also at Berkeley and other places) had already demonstrated that user requests are not at all Poisson. In fact, so far from Poisson that any scheduler that assumed Poisson arrivals was dreadful in practice. Adding epicycles to Poisson arrivals fixed nothing, but produced even richer "closed form" solutions, and a vast literature of research results in departments focused on scheduling theory around the US.
The same has been true of Gallager and his theory students. Poisson random arrivals infest the literature, while measurement-driven, practical research in networking has been despised.
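A toy FIFO simulation makes the point concrete: two arrival processes with identical average rate, one Poisson and one bursty, produce very different queueing (all numbers here are illustrative, not measurements):

```python
# Toy FIFO queue comparing Poisson arrivals to a bursty process with the
# same average rate. All numbers are illustrative, not measurements.
import random

def mean_wait(interarrivals, service=1.0):
    """Mean wait in a single-server FIFO with deterministic service."""
    clock = depart = total_wait = 0.0
    for gap in interarrivals:
        clock += gap                  # arrival time of this packet
        start = max(clock, depart)    # service starts when server is free
        total_wait += start - clock
        depart = start + service
    return total_wait / len(interarrivals)

random.seed(1)
n, rate = 10_000, 0.8
poisson = [random.expovariate(rate) for _ in range(n)]     # mean gap 1.25
bursty = [12.5 if i % 10 == 0 else 0.0 for i in range(n)]  # same mean gap
# Identical 80% average load, yet the bursty trace waits far longer: it
# averages exactly 4.5 time units of wait, the Poisson trace much less.
```

A scheduler or congestion controller tuned for the Poisson trace would badly mispredict the bursty one, which is the heart of the objection above.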
It's time to focus on the Science of actual real networks, wireless ones in the real world, and simulations validated against real world situations (as scientists do when they have to model the real world).
I'm very, very sad to see this kind of publication, which is not science, but just a mathematical game played based on a hunch about wireless behavior that is not grounded in measurements or characteristic applications. In contrast, the reality-centered work being done by people like the bloat project, while not so academically abstract, is the state of the art.
A proper title would be "A random congestion control method on an imaginary artificial network that might, if we are lucky, be somewhat like a WiFi network, but honestly we never actually looked at one in the wild."