From: Dave Taht
Date: Tue, 18 Jun 2019 18:33:31 -0700
To: Bob Briscoe
Cc: Luca Muscariello, ECN-Sane <ecn-sane@lists.bufferbloat.net>, tsvwg@ietf.org
Subject: Re: [Ecn-sane] [tsvwg] Comments on L4S drafts

I simply have one question. Is the code for the modified dctcp and dualpi in the l4steam repos on github ready for independent testing?

On Tue, Jun 18, 2019, 6:15 PM Bob Briscoe <ietf@bobbriscoe.net> wrote:
Luca,

I'm still preparing a (long) reply to Jake's earlier (long) response. But I'll take time out to quickly clear this point up inline...

On 14/06/2019 21:10, Luca Muscariello wrote:

On Fri, Jun 7, 2019 at 8:10 PM Bob Briscoe <ietf@bobbriscoe.net> wrote:

I'm afraid there are not the same pressures to cause rapid roll-out at all, cos it's flakey now, jam tomorrow. (Actually ECN-DualQ-SCE has a much greater problem - complete starvation of SCE flows - but we'll come on to that in Q4.)

I want to say at this point, that I really appreciate all the effort you've been putting in, trying to find common ground.

In trying to find a compromise, you've taken the fire that is really aimed at the inadequacy of the underlying SCE protocol - for anything other than FQ. If the primary SCE proponents had attempted to articulate a way to use SCE in a single queue or a dual queue, as you have, that would have taken my fire.

But regardless, the queue-building from classic ECN-capable endpoints that only get 1 congestion signal per RTT is what I understand as the main downside of the tradeoff if we try to use ECN-capability as the dualq classifier. Does that match your understanding?
This is indeed a major concern of mine (not as major as the starvation of SCE explained under Q4, but we'll come to that).

Fine-grained (DCTCP-like) and coarse-grained (Cubic-like) congestion controls need to be isolated, but I don't see how, unless their packets are tagged for separate queues. Without a specific fine/coarse identifier, we're left with having to re-use other identifiers:
  • You've tried to use ECN vs Not-ECN. But that still lumps two large incompatible groups (fine ECN and coarse ECN) together.
  • The only alternative that would serve this purpose is the flow identifier at layer-4, because it isolates everything from everything else. FQ is where SCE started, and that seems to be as far as it can go.
Should we burn the last unicorn for a capability needed on "carrier-scale" boxes, but which requires FQ to work? Perhaps yes if there was no alternative. But there is: L4S.
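
To make the identifier question concrete, here is a rough sketch in Python of the two classification choices being compared. It is purely illustrative: the only part taken from the L4S drafts is that ECT(1), plus CE by default, selects the low-latency queue; the function names and structure are made up for this example.

    # Illustrative sketch: how the choice of identifier decides which
    # flows share the low-latency (L) queue in a dual-queue AQM.

    ECN_NOT_ECT = 0b00   # not ECN-capable
    ECN_ECT1    = 0b01   # the L4S identifier (scalable, fine-grained CCs)
    ECN_ECT0    = 0b10   # classic RFC 3168 ECN (coarse-grained CCs)
    ECN_CE      = 0b11   # Congestion Experienced

    def classify_l4s(ecn_bits):
        # L4S-style classifier: only ECT(1) (and, by default, CE) goes to
        # the L queue, so classic ECN flows stay in the C queue.
        return "L" if ecn_bits in (ECN_ECT1, ECN_CE) else "C"

    def classify_ecn_vs_not_ecn(ecn_bits):
        # "ECN vs Not-ECN" classifier: lumps fine-grained ECN (DCTCP-like)
        # and coarse-grained ECN (Cubic-like with classic ECN) into the
        # same low-latency queue, which is the problem described above.
        return "L" if ecn_bits != ECN_NOT_ECT else "C"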


I have trouble understanding why all traffic ends up being classified as either Cubic-like or DCTCP-like.
If we know that this is not true today, I fail to understand why this should be the case in the future.
It is also difficult to predict now how applications will change in the future in terms of the traffic mix they'll generate.
I feel like we'd be moving towards more customized transport services with less predictable patterns.

I do not see for instance much discussion about the presence of RTC traffic and how the dualQ system behaves when the
input traffic does not respond as expected by the 2-types of sources assumed by dualQ.
I'm sorry for using "Cubic-like" and "DCTCP-like", but I was trying (obviously unsuccessfully) to be clearer than using 'Classic' and 'Scalable'.

"Classic" means traffic driven by congestion controls designe= d to coexist in the same queue with Reno (TCP-friendly), which necessarily makes it unscalable, as explained below.

The definition of a scalable congestion control concerns the power b in the relationship between window, W, and the fraction of congestion signals, p (ECN or drop), under stable conditions:
    W = k / p^b
where k is a constant (or in some cases a function of other parameters such as RTT).
    If b >= 1 the CC is scalable.
    If b < 1 it is not (i.e. Classic).
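
A quick worked example of that formula, inverted to give p for a given window. The constants are only the usual textbook approximations (roughly k = 1.22, b = 1/2 for Reno-friendly CCs and k = 2, b = 1 for DCTCP), so treat the numbers as illustrative:

    # W = k / p^b, inverted to p = (k / W)^(1/b).
    # Expected congestion signals per RTT is then p * W.

    def signals_per_rtt(W, k, b):
        p = (k / W) ** (1.0 / b)   # fraction of packets marked/dropped
        return p * W

    for W in (30, 300, 3000):      # window in packets, i.e. flow rate scaling up
        reno  = signals_per_rtt(W, k=1.22, b=0.5)   # unscalable, b < 1
        dctcp = signals_per_rtt(W, k=2.0,  b=1.0)   # scalable,   b = 1
        print(f"W={W:5d}  Reno-friendly: {reno:.4g}/RTT   DCTCP-like: {dctcp:.3g}/RTT")

    # Reno-friendly: ~0.05, ~0.005, ~0.0005 signals per RTT as the window
    # grows; DCTCP-like stays at ~2 per RTT regardless of scale.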

"Scalable" does not exclude RTC traffic. For instance the L4S variant of SCReAM that Ingemar just talked about is scalable ("DCTCP-like"), because it has b =3D 1.

I used "Cubic-like" 'cos there's more Cubic than Reno= on the current Internet. Over Internet paths with typical BDP, Cubic is always in its Reno-friendly mode, and therefore also just as unscalable as Reno, with b =3D 1/2 (inversely proportional to the square-root). Even in its proper Cubic mode on high BDP paths, Cubic is still unscalable with b =3D 0.75.

As flow rate scales up, the increase-decrease sawteeth of unscalable CCs get very large and very infrequent, so the control becomes extremely slack during dynamics. Whereas the sawteeth of scalable CCs stay invariant and tiny at any scale, keeping control tight, queuing low and utilization high. See the example of Cubic & DCTCP at Slide 5 here:
https://www.files.netdevconf.org/f/4ebdcdd6f94547ad8b77/?dl=1

Also, there's a useful plot of when Cubic switches to Reno mode on the last slide.
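
A back-of-envelope version of the same point in absolute time, since it is the time between signals that makes the control slack. All numbers are illustrative: 1500 B packets, 10 ms base RTT, and an idealized Reno sawtooth that halves on each signal and takes ~W/2 round trips to recover:

    def reno_sawtooth_seconds(rate_bps, rtt_s, mtu_bytes=1500):
        # Window (in packets) needed to fill the link, and the ~W/2 round
        # trips an idealized Reno-friendly sawtooth takes to climb back.
        W = rate_bps * rtt_s / (mtu_bytes * 8)
        return (W / 2) * rtt_s

    for rate in (100e6, 1e9, 10e9):
        t = reno_sawtooth_seconds(rate, rtt_s=0.010)
        print(f"{rate/1e6:6.0f} Mb/s, 10 ms RTT: one sawtooth every {t:5.1f} s")

    # ~0.4 s at 100 Mb/s, ~4 s at 1 Gb/s, ~42 s at 10 Gb/s between signals,
    # whereas a scalable CC keeps signalling every round trip or two at any rate.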


If my application is using simulcast or multi-stream techniques I can have several video streams in the same link that, as far as I understand,
will get significant latency in the classic queue.

You are talking as if you think that queuing delay is caused by the buffer. You haven't said what your RTC congestion control is (gcc perhaps?). Whatever, assuming it's TCP-friendly, even in a queue on its own, it will need to induce about 1 additional base RTT of queuing delay to maintain full utilization.

In the coupled dualQ AQM, the classic queue runs a state-of-the-art classic AQM (PI2 in our implementation) with a target delay of 15ms. With any less, your classic congestion controlled streams would under-utilize the link.
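
For concreteness, the shape of a PI2-style update is roughly the following. The 15 ms target is the one mentioned above; the gains and update interval are placeholders rather than the defaults in the dualq draft:

    TARGET  = 0.015          # classic queue delay target: 15 ms
    TUPDATE = 0.016          # controller update interval (s), placeholder
    ALPHA, BETA = 0.1, 1.0   # integral / proportional gains, placeholders

    p_prime = 0.0            # base congestion level from the PI controller
    prev_qdelay = 0.0

    def pi2_update(qdelay):
        # Called every TUPDATE with the current classic-queue delay (seconds).
        global p_prime, prev_qdelay
        p_prime += ALPHA * (qdelay - TARGET) + BETA * (qdelay - prev_qdelay)
        p_prime = min(max(p_prime, 0.0), 1.0)
        prev_qdelay = qdelay
        # PI2 squares the base value to get the Classic drop/mark probability,
        # which lets one linear controller cover Reno-like and Cubic-like CCs.
        return p_prime ** 2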

Unless my app starts cheating by marking packets to get into the priority queue.
There are two misconceptions here about the DualQ Coupled AQM that I need to correct.

1/ As above, if a classic CC can't build ~1 base RTT of queue in the classic buffer, it badly under-utilizes. So if you 'cheat' by directing traffic from a queue-building CC into the low latency queue with a shallow ECN threshold, you'll just massively under-utilize the capacity.

2/ Even if it were a strict priority scheduler it wouldn't determine the scheduling under all normal traffic conditions. The coupling between the AQMs dominates the scheduler. I'll explain next...


In both cases, i.e. whether my RTC app is cheating or not, I do not understand how the parametrization of the dualQ scheduler
can cope with traffic that behaves in a different way to what is assumed while tuning parameters.
For instance, in one instantiation of dualQ based on WRR the weights are set to 1:16. This has to necessarily
change when RTC traffic is present. How?

The coupling simply applies congestion signals from the C queue across into the L queue, as if the C flows were L flows. So, the L flows leave sufficient space for however many C flows there are. Then, in all the gaps that the L traffic leaves, any work-conserving scheduler can be used to serve the C queue.

The WRR scheduler is only there in case of overload or unresponsive L traffic; to prevent the Classic queue starving.
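
Again just a sketch of those two points, reusing p_prime from the PI2 sketch above. The coupling factor and the weight are illustrative; the draft's normative pseudocode is the authority here:

    K = 2.0   # coupling factor between the C and L queues, illustrative

    def l_queue_marking(p_prime, p_l_native):
        # The coupling "copies" congestion from the C queue into the L queue:
        # L flows are marked as if the C flows were L flows, so they leave
        # room for however many C flows there are. The L queue's own shallow
        # threshold marking (p_l_native) dominates only if the L queue builds.
        return max(min(K * p_prime, 1.0), p_l_native)

    def dequeue(l_q, c_q, l_run, weight_c=1.0/16):
        # Very simplified conditional priority: serve L first, but after a
        # long enough run of L packets give C one slot, so C cannot starve
        # under overload or unresponsive L traffic. With normal responsive
        # traffic the coupling above, not this weight, sets the shares.
        if c_q and (not l_q or l_run * weight_c >= 1.0):
            return c_q.pop(0), 0
        if l_q:
            return l_q.pop(0), l_run + 1
        return (c_q.pop(0), 0) if c_q else (None, l_run)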



Is the assumption that a trusted marker is used as in typical diffserv deployments
or that a policer identifies and punishes cheating applications?
As explained, if a classic flow cheats, it will get very low throughput. So it has no incentive to cheat.

There's still the possibility of bugs/accidents/malice. The need for general Internet flows to be responsive to congestion is also vulnerable to bugs/accidents/malice, but it hasn't needed policing.

Nonetheless, in Low Latency DOCSIS, we have implemented a queue protection function that maintains a queuing score per flow. Then, any packets from high-scoring flows that would cause the queue to exceed a threshold delay, are redirected to the classic queue instead. For well-behaved flows the state that holds the score ages out between packets, so only ill-behaved flows hold flow-state long term.
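
A rough sketch of that mechanism follows. The scoring input, constants and thresholds are all made up for illustration; the real algorithm is the one specified for Low Latency DOCSIS:

    AGE_RATE    = 1000.0   # score drained per second while a flow behaves (illustrative)
    QDELAY_THR  = 0.001    # L-queue delay (s) above which sanctioning applies (illustrative)
    SCORE_LIMIT = 2.0      # score above which a flow counts as ill-behaved (illustrative)

    scores = {}            # flow-id -> (score, time of last packet)

    def classify_with_protection(flow_id, pkt_score, l_qdelay, now):
        # pkt_score is this packet's contribution to the flow's queuing score,
        # e.g. something that grows with the current marking probability.
        score, last = scores.get(flow_id, (0.0, now))
        score = max(0.0, score - AGE_RATE * (now - last))   # ages out between packets
        score += pkt_score
        if score == 0.0:
            scores.pop(flow_id, None)   # well-behaved flows keep no long-term state
        else:
            scores[flow_id] = (score, now)
        # Redirect only packets of high-scoring flows, and only when they
        # would push the low-latency queue past the delay threshold.
        if score > SCORE_LIMIT and l_qdelay > QDELAY_THR:
            return "C"                  # into the Classic queue instead
        return "L"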

Queue protection might not be needed, but it's as well to have it in case. It can be disabled.


BTW I'd love to understand how dualQ is supposed to work under more general traffic assumptions.
Coexistence with Reno is a general requirement for long-running Internet traffic. That's really all we depend on. That also covers RTC flows in the C queue that average to similar throughput as Reno but react more smoothly.

The L traffic can be similarly heterogeneous - part of the L4S experiment is to see how broad that will stretch to. It can certainly accommodate other lighter traffic like VoIP, DNS, flow startups, transactional, etc, etc.


BBR (v1) is a good example of something different that wasn't designed to coexist with Reno. It sort-of avoided too many problems by being primarily used for app-limited flows. It does its RTT probing on much longer timescales than typical sawtoothing congestion controls, running on a model of the link between times, so it doesn't fit the formulae above.

For BBRv2 we're promised that the non-ECN side of it will coexist with existing Internet traffic, at least above a certain loss level. Without having seen it I can't be sure, but I assume that implies it will fit the formulae above in some way.


PS. I believe all the above is explained in the three L4S Internet drafts, which we've taken a lot of trouble over. I don't really want to have to keep explaining it longhand in response to each email. So I'd prefer questions to be of the form "In section X of draft Y, I don't understand Z". Then I can devote my time to improving the drafts.

Alternatively, there are useful papers of various lengths on the L4S landing page at:
https://riteproject.eu/dctth/#papers


Cheers



Bob



Luca


-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/