I simply have one question. Is the code for the modified dctcp and dualpi in the l4steam repos on github ready for independent testing?

On Tue, Jun 18, 2019, 6:15 PM Bob Briscoe wrote:

> Luca,
>
> I'm still preparing a (long) reply to Jake's earlier (long) response. But
> I'll take time out to quickly clear this point up inline...
>
> On 14/06/2019 21:10, Luca Muscariello wrote:
>
> On Fri, Jun 7, 2019 at 8:10 PM Bob Briscoe wrote:
>
>> I'm afraid there are not the same pressures to cause rapid roll-out at
>> all, cos it's flakey now, jam tomorrow. (Actually ECN-DualQ-SCE has a much
>> greater problem - complete starvation of SCE flows - but we'll come on to
>> that in Q4.)
>>
>> I want to say at this point that I really appreciate all the effort
>> you've been putting in, trying to find common ground.
>>
>> In trying to find a compromise, you've taken the fire that is really
>> aimed at the inadequacy of the underlying SCE protocol - for anything
>> other than FQ. If the primary SCE proponents had attempted to articulate
>> a way to use SCE in a single queue or a dual queue, as you have, that
>> would have taken my fire.
>>
>> But regardless, the queue-building from classic ECN-capable endpoints
>> that only get 1 congestion signal per RTT is what I understand as the
>> main downside of the tradeoff if we try to use ECN-capability as the
>> dualq classifier. Does that match your understanding?
>>
>> This is indeed a major concern of mine (not as major as the starvation of
>> SCE explained under Q4, but we'll come to that).
>>
>> Fine-grained (DCTCP-like) and coarse-grained (Cubic-like) congestion
>> controls need to be isolated, but I don't see how, unless their packets
>> are tagged for separate queues. Without a specific fine/coarse
>> identifier, we're left with having to re-use other identifiers:
>>
>> - You've tried to use ECN vs Not-ECN. But that still lumps two large
>>   incompatible groups (fine ECN and coarse ECN) together.
>> - The only alternative that would serve this purpose is the flow
>>   identifier at layer-4, because it isolates everything from everything
>>   else. FQ is where SCE started, and that seems to be as far as it can go.
>>
>> Should we burn the last unicorn for a capability needed on
>> "carrier-scale" boxes, but which requires FQ to work? Perhaps yes if
>> there were no alternative. But there is: L4S.
>
> I have a problem understanding why all traffic ends up being classified
> as either Cubic-like or DCTCP-like.
> If we know that this is not true today, I fail to understand why this
> should be the case in the future.
> It is also difficult to predict now how applications will change in the
> future in terms of the traffic mix they'll generate.
> I feel like we'd be moving towards more customized transport services
> with less predictable patterns.
>
> I do not see, for instance, much discussion about the presence of RTC
> traffic and how the dualQ system behaves when the input traffic does not
> respond in the way expected of the two types of sources assumed by dualQ.
>
> I'm sorry for using "Cubic-like" and "DCTCP-like", but I was trying
> (obviously unsuccessfully) to be clearer than using 'Classic' and
> 'Scalable'.
>
> "Classic" means traffic driven by congestion controls designed to coexist
> in the same queue with Reno (TCP-friendly), which necessarily makes it
> unscalable, as explained below.
>
> The definition of a scalable congestion control concerns the power b in
> the relationship between the window, W, and the fraction of congestion
> signals, p (ECN or drop), under stable conditions:
>
>     W = k / p^b
>
> where k is a constant (or in some cases a function of other parameters
> such as RTT).
> If b >= 1 the CC is scalable.
> If b < 1 it is not (i.e. Classic).
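A minimal numeric sketch of this relationship (an illustration under textbook assumptions, not code from the thread or the drafts): taking the usual approximations k ≈ sqrt(3/2) with b = 1/2 for a Reno-friendly CC and k = 2 with b = 1 for a DCTCP-like CC, the expected number of congestion signals per RTT is roughly p * W, which stays constant for the scalable case but falls away as a classic flow's window grows.

    # Sketch only: invert W = k / p^b to get p, then estimate signals per RTT as p * W.
    # Constants are the textbook approximations noted above, used purely for illustration.
    def signals_per_rtt(W, b, k):
        p = (k / W) ** (1.0 / b)      # congestion-signal fraction at window W
        return p * W                  # expected signals per round trip

    for W in (30, 300, 3000):         # window in packets, i.e. flow rate scaling up
        reno  = signals_per_rtt(W, b=0.5, k=1.5 ** 0.5)   # classic / unscalable
        dctcp = signals_per_rtt(W, b=1.0, k=2.0)          # scalable
        print(f"W={W:5d}  Reno-like: {reno:8.5f} signals/RTT   DCTCP-like: {dctcp:3.1f} signals/RTT")

With these assumed constants, at W = 3000 packets a Reno-like flow sees a congestion signal only about once every 2000 RTTs, while a DCTCP-like flow keeps seeing about 2 per RTT at any scale; this is the "very large and very infrequent sawteeth" point made below.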
> "Scalable" does not exclude RTC traffic. For instance, the L4S variant of
> SCReAM that Ingemar just talked about is scalable ("DCTCP-like"), because
> it has b = 1.
>
> I used "Cubic-like" 'cos there's more Cubic than Reno on the current
> Internet. Over Internet paths with typical BDP, Cubic is always in its
> Reno-friendly mode, and therefore also just as unscalable as Reno, with
> b = 1/2 (inversely proportional to the square root). Even in its proper
> Cubic mode on high-BDP paths, Cubic is still unscalable, with b = 0.75.
>
> As flow rate scales up, the increase-decrease sawteeth of unscalable CCs
> get very large and very infrequent, so the control becomes extremely slack
> during dynamics, whereas the sawteeth of scalable CCs stay invariant and
> tiny at any scale, keeping control tight, queuing low and utilization
> high. See the example of Cubic & DCTCP at Slide 5 here:
> https://www.files.netdevconf.org/f/4ebdcdd6f94547ad8b77/?dl=1
>
> Also, there's a useful plot of when Cubic switches to Reno mode on the
> last slide.
>
> If my application is using simulcast or multi-stream techniques I can
> have several video streams on the same link that, as far as I understand,
> will get significant latency in the classic queue.
>
> You are talking as if you think that queuing delay is caused by the
> buffer. You haven't said what your RTC congestion control is (gcc
> perhaps?). Whatever it is, assuming it's TCP-friendly, even in a queue on
> its own, it will need to induce about 1 additional base RTT of queuing
> delay to maintain full utilization.
>
> In the coupled dualQ AQM, the classic queue runs a state-of-the-art
> classic AQM (PI2 in our implementation) with a target delay of 15 ms.
> With any less, your classic congestion-controlled streams would
> under-utilize the link.
>
> Unless my app starts cheating by marking packets to get into the priority
> queue.
>
> There are two misconceptions here about the DualQ Coupled AQM that I need
> to correct.
>
> 1/ As above, if a classic CC can't build ~1 base RTT of queue in the
> classic buffer, it badly under-utilizes. So if you 'cheat' by directing
> traffic from a queue-building CC into the low latency queue with a
> shallow ECN threshold, you'll just massively under-utilize the capacity.
>
> 2/ Even if it were a strict priority scheduler, it wouldn't determine the
> scheduling under all normal traffic conditions. The coupling between the
> AQMs dominates the scheduler. I'll explain next...
>
> In both cases, i.e. whether my RTC app is cheating or not, I do not
> understand how the parametrization of the dualQ scheduler can cope with
> traffic that behaves differently from what was assumed when tuning the
> parameters. For instance, in one instantiation of dualQ based on WRR the
> weights are set to 1:16. This necessarily has to change when RTC traffic
> is present. How?
>
> The coupling simply applies congestion signals from the C queue across
> into the L queue, as if the C flows were L flows. So the L flows leave
> sufficient space for however many C flows there are. Then, in all the
> gaps that the L traffic leaves, any work-conserving scheduler can be used
> to serve the C queue.
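A rough sketch of that coupling (an illustration following the structure described in the coupled-AQM draft and papers, not the l4steam code; the squaring of the PI2 output and the default coupling factor of 2 are taken from there): PI2 on the Classic queue computes a base probability p'; Classic packets are dropped or marked with probability p_C = p'^2, while the L queue is ECN-marked with at least p_CL = k * p'. With the window laws above (W_C ≈ sqrt(1.5)/p' and W_L ≈ 2/(k*p')), the per-flow windows of the two queue types come out roughly equal, which is how the L flows "leave space" for the C flows without any strict-priority effect.

    # Sketch only: coupled DualQ probabilities, names as in the aqm-dualq-coupled draft.
    K_COUPLING = 2.0                               # default coupling factor in the draft

    def dualq_probs(p_prime, p_l_native):
        """p_prime: base probability from PI2 on the Classic queue.
        p_l_native: marking prob from the L queue's own shallow delay threshold."""
        p_c  = min(p_prime ** 2, 1.0)              # Classic queue: squared ("PI2")
        p_cl = min(K_COUPLING * p_prime, 1.0)      # coupled signal applied to the L queue
        p_l  = max(p_l_native, p_cl)               # L queue marks with the larger of the two
        return p_c, p_l

    # Illustrative numbers only: suppose PI2 has settled at p' = 0.1
    p_c, p_l = dualq_probs(p_prime=0.1, p_l_native=0.05)
    print(f"Classic drop/mark prob {p_c:.3f}, L4S mark prob {p_l:.3f}")
    print(f"approx Reno-like window {(1.5 / p_c) ** 0.5:.1f}, approx DCTCP-like window {2 / p_l:.1f}")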
> The WRR scheduler is only there in case of overload or unresponsive L
> traffic, to prevent the Classic queue starving.
>
> Is the assumption that a trusted marker is used as in typical diffserv
> deployments, or that a policer identifies and punishes cheating
> applications?
>
> As explained, if a classic flow cheats, it will get very low throughput,
> so it has no incentive to cheat.
>
> There's still the possibility of bugs/accidents/malice. The need for
> general Internet flows to be responsive to congestion is also vulnerable
> to bugs/accidents/malice, but it hasn't needed policing.
>
> Nonetheless, in Low Latency DOCSIS, we have implemented a queue
> protection function that maintains a queuing score per flow. Then, any
> packets from high-scoring flows that would cause the queue to exceed a
> threshold delay are redirected to the classic queue instead. For
> well-behaved flows the state that holds the score ages out between
> packets, so only ill-behaved flows hold flow-state long term.
>
> Queue protection might not be needed, but it's as well to have it in
> case. It can be disabled.
>
> BTW I'd love to understand how dualQ is supposed to work under more
> general traffic assumptions.
>
> Coexistence with Reno is a general requirement for long-running Internet
> traffic. That's really all we depend on. That also covers RTC flows in
> the C queue that average a similar throughput to Reno but react more
> smoothly.
>
> The L traffic can be similarly heterogeneous - part of the L4S experiment
> is to see how broadly that will stretch. It can certainly accommodate
> other, lighter traffic like VoIP, DNS, flow startups, transactional, etc,
> etc.
>
> BBR (v1) is a good example of something different that wasn't designed to
> coexist with Reno. It has sort-of avoided causing too many problems by
> being primarily used for app-limited flows. It does its RTT probing on
> much longer timescales than typical sawtoothing congestion controls,
> running on a model of the link between times, so it doesn't fit the
> formulae above.
>
> For BBRv2 we're promised that the non-ECN side of it will coexist with
> existing Internet traffic, at least above a certain loss level. Without
> having seen it I can't be sure, but I assume that implies it will fit the
> formulae above in some way.
>
> PS. I believe all the above is explained in the three L4S Internet
> drafts, which we've taken a lot of trouble over. I don't really want to
> have to keep explaining it longhand in response to each email, so I'd
> prefer questions to be of the form "In section X of draft Y, I don't
> understand Z". Then I can devote my time to improving the drafts.
>
> Alternatively, there are useful papers of various lengths on the L4S
> landing page at:
> https://riteproject.eu/dctth/#papers
>
> Cheers
>
> Bob
>
> Luca
>
> --
> ________________________________________________________________
> Bob Briscoe                              http://bobbriscoe.net/
>
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane