[Ecn-sane] [tsvwg] Comments on L4S drafts

Dave Taht dave at taht.net
Fri Jun 14 17:44:31 EDT 2019


This thread is using unconventional quoting markers, so it is hard to follow.

Luca Muscariello <muscariello at ieee.org> writes:

> On Fri, Jun 7, 2019 at 8:10 PM Bob Briscoe <ietf at bobbriscoe.net>
> wrote:
>
>     
>         
>
>     I'm afraid there are not the same pressures to cause rapid
>     roll-out at all, cos it's flakey now, jam tomorrow. (Actually
>     ECN-DualQ-SCE has a much greater problem - complete starvation of
>     SCE flows - but we'll come on to that in Q4.)

Answering that statement is the only reason why I popped up here.
More below.

>     I want to say at this point, that I really appreciate all the
>     effort you've been putting in, trying to find common ground. 

I am happy to see this thread happen also, and I plan to otherwise
stay out of it.

>     
>     In trying to find a compromise, you've taken the fire that is
>     really aimed at the inadequacy of underlying SCE protocol - for
>     anything other than FQ.

The SCE idea does, indeed, work best with FQ in a world of widely
varying congestion-control ideas, as explored in the recent paper
"Fifty Shades of Congestion Control":

https://arxiv.org/pdf/1903.03852.pdf

>     If the primary SCE proponents had
>     attempted to articulate a way to use SCE in a single queue or a
>     dual queue, as you have, that would have taken my fire. 

I have no faith in single or dual queues with ECN either, given that
anyone along the path can scribble on the relevant bits. However...
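To be concrete about "the relevant bits": the ECN field is just the two low-order bits of the IP TOS/Traffic Class byte (RFC 3168), so any router or middlebox on the path can rewrite them. A minimal sketch of the four codepoints (the function names here are mine, purely illustrative):

```python
# The ECN field: two bits in the IP header (RFC 3168).
NOT_ECT = 0b00  # not ECN-capable
ECT_1   = 0b01  # ECN-capable transport (1) -- the codepoint SCE/L4S reuse
ECT_0   = 0b10  # ECN-capable transport (0)
CE      = 0b11  # congestion experienced

def is_ect(ecn_bits: int) -> bool:
    """An AQM may mark (rather than drop) only ECT packets."""
    return ecn_bits in (ECT_0, ECT_1)

def mark_ce(ecn_bits: int) -> int:
    """Any hop can overwrite the field, e.g. set CE on an ECT packet."""
    return CE if is_ect(ecn_bits) else ecn_bits
```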

>     
>         
>         But regardless, the queue-building from classic ECN-capable endpoints that
> only get 1 congestion signal per RTT is what I understand as the main
> downside of the tradeoff if we try to use ECN-capability as the dualq
> classifier.  Does that match your understanding?
>
>     This is indeed a major concern of mine (not as major as the
>     starvation of SCE explained under Q4, but we'll come to that).

I think I missed a portion of this thread. Starvation is impossible:
a flow is reduced to no less than a cwnd of 2 (non-BBR) or 4 (BBR).

Your own work points out a general problem: with too many flows,
sub-packet windows are needed, which causes excessive CE marking, and
that, so far as I know, remains an unsolved problem.

https://arxiv.org/pdf/1904.07598.pdf
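The sub-packet-window problem follows from back-of-envelope arithmetic: with a cwnd floor of 2 segments, a flow cannot send slower than 2*MSS/RTT, so once the flow count exceeds link capacity divided by that floor rate, no amount of CE marking can bring the aggregate down, and the AQM is forced into persistent marking or dropping. A rough sketch, with illustrative numbers (the link rate, MSS, and RTT below are assumptions, not measurements):

```python
def min_flow_rate(min_cwnd_segments: int, mss_bytes: int, rtt_s: float) -> float:
    """Lowest rate (bits/s) a flow can run at when cwnd is clamped at a floor."""
    return min_cwnd_segments * mss_bytes * 8 / rtt_s

def max_responsive_flows(link_bps: float, min_cwnd: int = 2,
                         mss: int = 1500, rtt: float = 0.05) -> int:
    """Beyond this many flows, marking alone cannot control the queue."""
    return int(link_bps // min_flow_rate(min_cwnd, mss, rtt))

# e.g. a 1 Mbit/s link at 50 ms RTT: the floor rate is 2*1500*8/0.05
# = 480 kbit/s per flow, so only two flows can stay responsive; a third
# cannot be slowed below its floor, whatever the AQM signals.
```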

This is also easily demonstrated by experiment, and it is the primary
reason why, even with fq_codel in the field, we generally turned off
ECN support at low bitrates until the first major release of sch_cake.

I had an open question outstanding about the 10% probability threshold
at which sch_pie converts from marking to dropping; it remains
unresolved.

As for what level of compatibility with classic transports is possible
in a single queue with an SCE-capable sender and receiver, that remains
to be seen. Only the bits have been defined as yet; two approaches are
being tried in public so far.

>     
>     Fine-grained (DCTCP-like) and coarse-grained (Cubic-like)
>     congestion controls need to be isolated, but I don't see how,
>     unless their packets are tagged for separate queues. Without a
>     specific fine/coarse identifier, we're left with having to re-use
>     other identifiers:
>     
>     * You've tried to use ECN vs Not-ECN. But that still lumps two
>       large incompatible groups (fine ECN and coarse ECN) together. 
>     * The only alternative that would serve this purpose is the flow
>       identifier at layer-4, because it isolates everything from
>       everything else. FQ is where SCE started, and that seems to be
>       as far as it can go.
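For reference, the fine/coarse distinction above is concretely a difference in response function: a classic (coarse) sender treats one CE mark per RTT like a loss and halves cwnd, while a DCTCP-like (fine) sender scales its reduction by the fraction of packets marked. A sketch of both responses, using the usual textbook constants (the function names are mine, purely illustrative):

```python
def classic_response(cwnd: float, any_ce: bool) -> float:
    """Coarse: at most one multiplicative decrease per RTT if any CE seen,
    otherwise additive increase."""
    return cwnd / 2 if any_ce else cwnd + 1

def dctcp_response(cwnd: float, alpha: float, marked_frac: float,
                   g: float = 1 / 16):
    """Fine: EWMA of the per-RTT mark fraction; reduce cwnd in proportion.
    Returns the new (cwnd, alpha)."""
    alpha = (1 - g) * alpha + g * marked_frac
    return cwnd * (1 - alpha / 2), alpha
```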

Actually, I was seeking a solution (and had been, for going on 5 years)
to the "too many flows not getting out of slow start fast enough"
problem, which you can see at any congested airport, public space,
small office, or coffee shop nowadays. The vast majority of traffic
there does not consist of long-duration, high-rate flows.

Even if you eliminate the wireless retries and rate changes and put in
a good fq_codel AQM, the traffic in such a large shared environment is
mostly flows with no need for congestion control at all (DNS, VoIP,
etc.), or flows in slow start, hammering away at ever-increasing delays
until the user stops hitting the reload button.

Others have different goals and outlooks in this project and I'm
not really part of that.

I would very much like to see both approaches tried against a normal
traffic mix in a shared environment like that.

Some good potential solutions include dropping the slower parts of the
internet back to IW4 and/or using techniques like initial spreading,
both of which are good ideas and interact well with SCE's more
immediate response curve; paced chirping does as well.
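Initial spreading is simple to sketch: instead of emitting the initial window as a back-to-back burst, the sender paces it evenly across one RTT. A minimal illustration (the function name is mine, not from any deployed stack):

```python
def spread_schedule(iw_segments: int, rtt_s: float) -> list:
    """Send times (seconds from now) that pace the initial window
    evenly across one RTT instead of bursting it out at once."""
    interval = rtt_s / iw_segments
    return [i * interval for i in range(iw_segments)]

# IW4 over a 100 ms RTT: one segment every 25 ms, instead of a
# 4-segment burst that a short queue may have to absorb at once.
```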

>
>     Should we burn the last unicorn for a capability needed on
>     "carrier-scale" boxes, but which requires FQ to work? Perhaps yes
>     if there was no alternative. But there is: L4S.

The core of the internet is simply overprovisioned, with fairly short
queues. DCTCP itself did not deploy in very many places that I know of.

Could you define exactly what "carrier-scale" means?

>     
>     
>
> I have trouble understanding why all traffic ends up classified as
> either Cubic-like or DCTCP-like.
> If we know this is not true today, I fail to understand why it
> should be the case in the future.
> It is also difficult to predict now how applications will change in
> the future in terms of the traffic mix they'll generate.
> I feel we are moving towards more customized transport services
> with less predictable patterns.
>
> I do not see, for instance, much discussion of RTC traffic, or of
> how the dualQ system behaves when the input traffic does not respond
> as expected of the two types of sources dualQ assumes.
>
> If my application uses simulcast or multi-stream techniques, I can
> have several video streams on the same link that, as far as I
> understand, will see significant latency in the classic queue,
> unless my app starts cheating by marking packets to get into the
> priority queue.
>
> In both cases, i.e. whether my RTC app cheats or not, I do not
> understand how the parametrization of the dualQ scheduler can cope
> with traffic that behaves differently from what was assumed when
> tuning the parameters.
> For instance, one instantiation of dualQ based on WRR sets the
> weights to 1:16. This necessarily has to change when RTC traffic is
> present. How?
>
> Is the assumption that a trusted marker is used, as in typical
> diffserv deployments, or that a policer identifies and punishes
> cheating applications?
>
> BTW I'd love to understand how dualQ is supposed to work under more
> general traffic assumptions.
>
> Luca

