Discussion of explicit congestion notification's impact on the Internet
* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
       [not found] <HE1PR07MB4425603844DED8D36AC21B67C2110@HE1PR07MB4425.eurprd07.prod.outlook.com>
@ 2019-06-14 18:27 ` Holland, Jake
       [not found]   ` <HE1PR07MB4425E0997EE8ADCAE2D4C564C2E80@HE1PR07MB4425.eurprd07.prod.outlook.com>
  0 siblings, 1 reply; 59+ messages in thread
From: Holland, Jake @ 2019-06-14 18:27 UTC (permalink / raw)
  To: Ingemar Johansson S, Bob Briscoe; +Cc: tsvwg

Hi Ingemar,
(bcc: ecn-sane, to keep them apprised on the discussion).

Thanks for chiming in on this.  A few comments inline:

On 2019-06-08, 12:46, "Ingemar Johansson S" <ingemar.s.johansson@ericsson.com> wrote:
> Up until now it has been quite a challenge to make ECN happen, I believe
> that part of the reason has been that ECN is not judged to give a large
> enough gain. 

Could you elaborate on this point?  

I haven't been sure how to think about the claims in the l4s drafts that
operators will deploy it rapidly because of performance.

Based on past analyses (e.g. the classic ECN rollout case study in RFC
8170 [1]), I thought network operators had a very "safety first" outlook on
these things, and that rapid deployment for performance benefits seemed
like wishful thinking.

But I'd be interested to know more about why that view might be mistaken.

> Besides this, L4S has the nice
> property that it has potential to allow for faster rate increase when link
> capacity increases.

I think section 3.4 of RFC 8257 says the rate increase would be the
same:
https://tools.ietf.org/html/rfc8257#section-3.4
   A DCTCP sender grows its congestion window in the same way as
   conventional TCP.
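(For context, a minimal sketch of the RFC 8257 response this quote refers to: the scalable part of DCTCP is in its *reduction*, which is proportional to the smoothed marking fraction, while growth is the same +1 MSS per RTT as conventional TCP. Variable names are mine; g = 1/16 is the RFC's suggested gain.)

```python
# Illustrative sketch of the DCTCP response in RFC 8257 (not normative code).
# Growth matches conventional TCP; only the reduction scales with the
# fraction of CE-marked packets seen over the last observation window.

def dctcp_update(cwnd, alpha, marked, acked, g=1.0 / 16):
    """One observation window (~1 RTT): returns (new_cwnd, new_alpha)."""
    frac = marked / acked if acked else 0.0
    alpha = (1 - g) * alpha + g * frac           # EWMA of marking fraction
    if marked:
        cwnd = max(1.0, cwnd * (1 - alpha / 2))  # proportional reduction
    else:
        cwnd += 1.0                              # same growth as conventional TCP
    return cwnd, alpha
```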

I guess this is referring to the paced chirping for rapid growth idea
presented last time?
https://datatracker.ietf.org/meeting/104/materials/slides-104-iccrg-implementing-the-prague-requirements-in-tcp-for-l4s-01#page=20

I'm a little unclear on how safe this can be made, but I agree it seems
useful if it can work well.

Do you think the L4S benefits will still be sufficient if this point
about faster growth doesn't hold up (and/or could be replicated regardless
of L4S), or is it critical to providing sufficient benefit in 3GPP?

(Note: I'm not taking a position on this point, just asking about how
much this point matters to the 3GPP support, as you see it.)

> I see many applications that can benefit greatly from L4S, besides AR/VR,
> there is also an increased interest in the deployment of remote control
> capabilities for vehicles such as cars, trucks and drones, all of which
> require low latency video streaming.

Remote control over the internet instead of a direct radio link is an
interesting use case.  Do you happen to know the research about delay
parameters that make the difference between viable or not viable for
RC?

This touches on one of the reasons I've been skeptical that the benefits
will drive a rapid deployment--in most of the use cases I've come up with,
it seems like reducing delay from ~200-500ms down to ~15-30ms (as seems
achievable even for single queue with classic AQM) would give almost
all the same benefits as reducing from ~15-30ms down to 1ms.

Of course, there's a difference in that last 14-29ms, but for instance
for gaming reaction time it's well under the thresholds that make a
difference for humans (the low end of which is at 45ms, according to
[2]), so it seems like the value in that market would be captured by
classic ECN, and therefore since classic ECN deployment hasn't caught
on yet, I had to conclude that the performance gains to enable that
market aren't sufficient to drive wide adoption.

So I'm curious to know more about the use cases that get over that
hump from an operator's point of view, and what you've seen that leads
you to believe the additional gains of L4S will make the difference
on those use cases where classic ECN wasn't adequate.

> My bottomline is that I believe L4S provides with a clear benefit that is
> large enough to be more widely accepted in 3GPP. SCE is as I see it more
> like something that is just a minor enhancement to ECN and is therefore much
> harder to sell in to 3GPP.   

Thanks, this is good to know.

To me one benefit of SCE over L4S is that it seems safer to avoid
relying on an ambiguous signal (namely a CE that we don't know which
kind of AQM set it) in a control system, while still providing high-
fidelity info about the network device congestion, where available.

I agree that it's not completely clear exactly how the congestion
controllers can capitalize on that info, but to me it still seems worth
considering.

So although I'll support L4S if it really covers all the safety issues
and performs better, I'd be more comfortable with the signaling if
there's a way to make SCE do the same job, especially if the endpoint
implementation is simpler to get robustly deployed.

So really, I'm hoping for a bakeoff to decide this, because one of my
concerns is that L4S still doesn't have an implementation that does
all the things the drafts say are needed for safety on the internet,
even though the initial proof of concept demoing the performance
gains was presented 7 years ago.  It's good that it's getting closer,
but the long implementation cycle (which still doesn't have all the
features required by the drafts) is a concern for me from the
"running code" point of view.

From this point of view, it's possible that a parallel track might get
further faster, especially if it doesn't need the same special cases
to be safe, which is part of why I've been tentatively supportive.

And although I can see how the queue classification is a major issue
that could make the difference, especially with the very promising
dualq proposal, it also seems true that in addition to CPEs, there are
promising avenues for carrier-scale FQ systems (e.g [3], [4]) that could
solve that.  It makes me think that even if SCE only gets low-latency
with FQ and otherwise causes no harm, it's not clear it'll be a slower
path to ubiquitous deployment (and by the way, this approach also would
handle the opt-in access control problem).

Of course, this will presumably collapse to one answer at some point,
but I'll argue that it's worthwhile to give a good look to the alternate
proposal...

Anyway, thanks for the comments, I think it's good to see more
discussion on this.

Best regards,
Jake

[1] Appendix A.1 RFC 8170 https://tools.ietf.org/html/rfc8170#appendix-A.1
[2] https://ojs.bibsys.no/index.php/NIK/article/view/9 says 45ms
[3] http://ppv.elte.hu/
[4] https://ieeexplore.ieee.org/document/8419697



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
       [not found]   ` <HE1PR07MB4425E0997EE8ADCAE2D4C564C2E80@HE1PR07MB4425.eurprd07.prod.outlook.com>
@ 2019-06-19 12:59     ` Bob Briscoe
  0 siblings, 0 replies; 59+ messages in thread
From: Bob Briscoe @ 2019-06-19 12:59 UTC (permalink / raw)
  To: Holland, Jake; +Cc: Ingemar Johansson S, tsvwg

[-- Attachment #1: Type: text/plain, Size: 19964 bytes --]

Jake & Ingemar,

On 16/06/2019 11:07, Ingemar Johansson S wrote:
> Hi Jake + all
>
> Please see inline
>
> /Ingemar
>
>
>> -----Original Message-----
>> From: Holland, Jake <jholland@akamai.com>
>> Sent: den 14 juni 2019 20:28
>> To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>; Bob Briscoe
>> <ietf@bobbriscoe.net>
>> Cc: tsvwg@ietf.org
>> Subject: Re: [tsvwg] Comments on L4S drafts
>>
>> Hi Ingemar,
>> (bcc: ecn-sane, to keep them apprised on the discussion).
>>
>> Thanks for chiming in on this.  A few comments inline:
>>
>> On 2019-06-08, 12:46, "Ingemar Johansson S"
>> <ingemar.s.johansson@ericsson.com>  wrote:
>>> Up until now it has been quite a challenge to make ECN happen, I
>>> believe that part of the reason has been that ECN is not judged to
>>> give a large enough gain.
>> Could you elaborate on this point?
>>
>> I haven't been sure how to think about the claims in the l4s drafts that operators
>> will deploy it rapidly because of performance.
>>
>> Based on past analyses (e.g. the classic ECN rollout case study in RFC
>> 8170 [1]), I thought network operators had a very "safety first" outlook on these
>> things, and that rapid deployment for performance benefits seemed like wishful
>> thinking.
[BB] The ECN rollout case study in RFC 8170 is not a useful example. It 
ends hoping there will be some client roll-out (written before Apple's 
decision) and doesn't even mention that network roll-out would still be 
needed subsequently. So it gives no insight into what causes network 
operator resistance.
>> But I'd be interested to know more about why that view might be mistaken.
> [IJ] I believe that it is easy to end up in a lot of speculation. I don't
> believe that the safety-first thinking makes much sense; yes, it has
> sometimes been used as a counter-argument. Part of the problem is perhaps
> that ECN was introduced into 3GPP for VoLTE. And then when ECN is proposed
> for its original use in 3GPP (= generic transport-protocol-agnostic
> feature) it gets hard to make it stick. With that said, ECN is supported
> in both LTE and NR standards (TS36.300, TS38.300). It is however rarely
> deployed. One could speculate about the reasons; I believe that one big
> reason can be that traditional ECN does not show a large enough delta
> improvement to make it worthwhile. I can of course be wrong, I don't
> possess a crystal ball 😊
[BB] My experience on this comes from years inside BT. The last 15 were 
after ECN was standardized, and for the last few years I was in BT's 
tech strategy team, regularly making business cases for various 
improvements. And talking with folks from other operators, of course.

When I quantified the performance benefit of classic ECN, it was 
embarrassing. You only got significant benefits in an under-provisioned 
network, which most operators avoid for obvious other reasons. Classic 
ECN gave next-to-no benefit with long-running flows. The more 
significant benefit for short transactional flows was primarily due to 
avoiding the timeout when the last packet of a flow was dropped. I 
figured that could be solved e2e, and indeed in 2012 the tail loss probe 
was proposed to solve that problem. The remaining benefit was mostly due 
to not losing SYNs and to a lesser extent SYN/ACKs, but classic ECN 
couldn't be used on SYNs anyway. In comparison the potential risks on 
the cost side dominated.

Finally, for a large network improvement project it is nearly impossible 
to squeeze the cash needed out of the relatively small budgets assigned 
for regular network improvements. None of the access equipment supported 
a modern AQM or ECN, so we would have had to tender for new designs. To 
persuade vendors to spend that sort of money, you need a budget line in 
a project that is buying kit for a new service with a projected revenue 
stream (e.g. a new sports service, a VR product, etc). That means your 
performance improvement has to be necessary for that product.

An alternative would have been to show that the performance improvement 
would gain sales from competing ISPs for long enough to pay for the 
costs of the improvement, but that's much harder to argue convincingly.

>>> Besides this, L4S has the nice
>>> property that it has potential to allow for faster rate increase when
>>> link capacity increases.
>> I think section 3.4 of RFC 8257 says the rate increase would be the
>> same:
>> https://tools.ietf.org/html/rfc8257#section-3.4
>>     A DCTCP sender grows its congestion window in the same way as
>>     conventional TCP.
>>
>> I guess this is referring to the paced chirping for rapid growth idea presented last
>> time?
>> https://datatracker.ietf.org/meeting/104/materials/slides-104-iccrg-
>> implementing-the-prague-requirements-in-tcp-for-l4s-01#page=20
>>
>> I'm a little unclear on how safe this can be made, but I agree it seems useful if it
>> can work well.
> [IJ] Yes, DCTCP uses traditional additive increase. I have personally done
> a few experiments in this area, nothing that is good enough to show as the
> experiments were very limited. One possible idea can be to make the
> bandwidth probing in BBR(v2) more aggressive. And there may also be
> possibilities with paced chirping too.
[BB] Note that paced chirping is not the differentiator here. It doesn't 
depend on ECN, nor L4S-ECN, nor SCE-ECN for that matter. It is 
delay-based, and potentially applicable to any e2e technology.

The differentiator that L4S provides (and perhaps SCE if all the 
problems were fixed) is the introduction of scalable congestion control 
(like DCTCP), which induces frequent signalling at a rate per RTT that 
remains invariant as flow rate scales.

Support for a transition to scalable CC is as important as cutting 
latency. Aside from being able to scale flow rate indefinitely,...

...it also solves the problem of rapidly detecting when more capacity 
has become available. If you normally get 2 signals per RTT (like 
DCTCP), you can tell there's available capacity after 2 or 3 RTT. If you 
get 1 signal every few hundred RTTs (like Cubic), you cannot tell 
there's available capacity for a thousand or so RTTs. That in itself is 
useful... You don't need to do the seeking with paced chirping, which 
is just one attempt to get up to capacity faster and with less 
overshoot.
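(The back-of-envelope arithmetic behind this point can be sketched as follows. It is my own illustration, not from the thread, and it assumes the classic square-root steady-state model, in which a Reno/Cubic-style flow sees roughly one signal per sawtooth, i.e. about p·W signals per RTT with p ≈ 1.5/W², versus a fixed ~2 signals per RTT for a scalable CC.)

```python
# Rough sketch: spacing between congestion signals, in RTTs, for a
# classic CC vs a scalable CC.  Classic spacing grows with the window;
# scalable spacing is invariant as flow rate scales.

def reno_rtts_between_signals(cwnd_pkts):
    p = 1.5 / cwnd_pkts ** 2          # classic steady-state loss/mark probability
    signals_per_rtt = p * cwnd_pkts   # ~= 1.5 / W
    return 1.0 / signals_per_rtt      # ~= W / 1.5 RTTs between signals

def scalable_rtts_between_signals(signals_per_rtt=2.0):
    return 1.0 / signals_per_rtt      # 0.5 RTT, regardless of window

# Example: a 10 Gb/s flow at 20 ms RTT with 1500 B packets has W ~= 16,667
# packets, so classic spacing is ~11,000 RTTs vs 0.5 RTT for a scalable CC.
```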


>> Do you think the L4S benefits will still be sufficient if this point about faster
>> growth doesn't hold up (and/or could be replicated regardless of L4S), or is it
>> critical to providing sufficient benefit in 3GPP?
> [IJ] No, I don't believe that it is critical, it is definitely a welcome bonus if it is possible.
>
>> (Note: I'm not taking a position on this point, just asking about how much this
>> point matters to the 3GPP support, as you see it.)
[BB] Ingemar has described the New Radio meetings to me where he's tried 
to propose ECN in the RLC layer. Like the swathes of other proposals, he 
was given 2 minutes, to persuade primarily radio people, who have 
already seen all the work that went into ECN for VoLTE not being taken up.

5G has promised extremely low latency. It is currently planning to do 
that with 'old school' QoS - by limiting throughput into reserved 
capacity. But that doesn't scale to apps that want high bandwidth and 
low latency. That's when the NR working group will start listening more 
carefully to ECN-based solutions.

>>> I see many applications that can benefit greatly from L4S, besides
>>> AR/VR, there is also an increased interest in the deployment of remote
>>> control capabilities for vehicles such as cars, trucks and drones, all
>>> of which require low latency video streaming.
>> Remote control over the internet instead of a direct radio link is an interesting
>> use case.  Do you happen to know the research about delay parameters that
>> make the difference between viable or not viable for RC?
>> This touches on one of the reasons I've been skeptical that the benefits will drive
>> a rapid deployment--in most of the use cases I've come up with, it seems like
>> reducing delay from ~200-500ms down to ~15-30ms (as seems achievable even
>> for single queue with classic AQM) would give almost all the same benefits as
>> reducing from ~15-30ms down to 1ms.
> [IJ] The thing I like with L4S is that it reduces standing queues down to
> almost zero, which gives a very fast reaction time when throughput drops.
> In addition L4S gives frequent signals of congestion, which makes it
> easier for a congestion control algorithm to know when it is close to the
> congestion knee.
[BB] I did some research on motion-to-photon latency a while ago, with 
others. It was for VR/AR, but it translates to similar apps. Quoting:

    MTP Latency:  AR/VR developers generally agree that MTP latency
       becomes imperceptible below about 20 ms [Carmack13].  However,
       some research has concluded that MTP latency must be less than
       17 ms for sensitive users [MTP-Latency-NASA].  Experience has shown
       that standards bodies tend to set demanding quality levels, while
       motivated humans often happily adapt to lower quality although
       they struggle with more demanding tasks.  Therefore, we must be
       clear that this 20 ms requirement is designed to enable immersive
       interaction for the same wide range of tasks that people are used
       to undertaking locally.
...
    For a summary of numerous references concerning the limit of human
    perception of delay see the thesis of Raaen [Raaen16].
    (Citations as in https://tools.ietf.org/html/draft-han-iccrg-arvr-transport-problem-01)


Let's say 20ms is too pedantic and you've got 50ms round trip MTP budget 
(John Carmack says that 50ms feels responsive, but the slight lag is 
still subtly unnatural, and merely defers the onset of VR-sickness).

We projected [latency budget:
https://tools.ietf.org/html/draft-han-iccrg-arvr-transport-problem-01#appendix-A.1.2]
that, with some expected advances, it should be possible to get the total 
of all delays except propagation and queuing down to about 13ms.

If one subtracts the delays you just stated for queuing, you get the 
following left for propagation:


                  target for      non-      queuing   left for 2-way   reach in
                  'responsive'    network             propagation      fibre
  2nd gen. AQM    50ms            - 13ms    - 30ms    = 7ms            700km (440 miles)
  3rd gen. AQM    50ms            - 13ms    - 1ms     = 36ms           3600km (2250 miles)
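(The reach figures can be checked with a quick calculation, assuming the common rule of thumb that light in fibre propagates at roughly 2/3 c, i.e. about 200 km per ms; function and constant names are mine.)

```python
# Quick check of the latency-budget table above.
FIBRE_KM_PER_MS = 200.0  # approx. one-way propagation speed in fibre (2/3 c)

def reach_km(mtp_budget_ms, non_network_ms, queuing_ms):
    """One-way fibre distance that fits in the remaining two-way budget."""
    two_way_ms = mtp_budget_ms - non_network_ms - queuing_ms
    return two_way_ms / 2 * FIBRE_KM_PER_MS

print(reach_km(50, 13, 30))  # 2nd gen AQM: 700.0 km
print(reach_km(50, 13, 1))   # 3rd gen AQM: 3600.0 km
```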


5 times greater reach means responsive interaction between Los Angeles 
and Atlanta, rather than just Los Angeles and Phoenix.

For communicating with a data centre, 5 times greater reach means 
equivalent coverage from 25 times fewer sites (coverage area is the 
square of reach). Concentration of sites is surely a very important cost 
factor for a CDN.


Note that, for real-time comms you need to watch the 99 or 99.9 
percentile, not just median. See these percentiles on a log-scale at 
slide 24 here:
https://www.files.netdevconf.org/f/4ebdcdd6f94547ad8b77/?dl=1
This was under rather extreme load (600 web sessions per second - see 
slide for details).


Whatever, @Jake, I think you will agree that SCE's aim is to cut queuing 
to similarly low levels. So arguing that we don't need such low delay 
also argues against SCE.


>> Of course, there's a difference in that last 14-29ms, but for instance for gaming
>> reaction time it's well under the thresholds that make a difference for humans
>> (the low end of which is at 45ms, according to [2]), so it seems like the value in
>> that market would be captured by classic ECN, and therefore since classic ECN
>> deployment hasn't caught on yet, I had to conclude that the performance gains
>> to enable that market aren't sufficient to drive wide adoption.
>>
>> So I'm curious to know more about the use cases that get over that hump from
>> an operator's point of view, and what you've seen that leads you to believe the
>> additional gains of L4S from will make the difference on those use cases where
>> classic ECN wasn't adequate.
> [IJ] I guess for this part, there need to be more input from operators
[BB] When Kjetil [2] says 45ms is good enough for today's games, I'd 
trust that. But you can't burn all that with queuing - if I had aimed 
for 45ms not 50 ms above, I'd have been left with 2ms for propagation.

When I showed Kjetil the demo of L4S using finger-gestures to pan and 
zoom cloud-rendered video, he agreed that humans are much more sensitive 
to the lag that the eye sees between their real hand controlling a 
movement and seeing the thing move under their hand. It depends how much 
freedom we want to give game developers to explore new user interfaces 
and delivery platforms (e.g. a Wii interacting with cloud-rendering).

>
>>> My bottomline is that I believe L4S provides with a clear benefit that
>>> is large enough to be more widely accepted in 3GPP. SCE is as I see it
>>> more like something that is just a minor enhancement to ECN and is therefore
>> much
>>> harder to sell in to 3GPP.
>> Thanks, this is good to know.
>>
>> To me one benefit of SCE over L4S is that it seems safer to avoid relying on an
>> ambiguous signal (namely a CE that we don't know which kind of AQM set it) in a
>> control system, while still providing high- fidelity info about the network device
>> congestion, where available.
>>
>> I agree that it's not completely clear exactly how the congestion controllers can
>> capitalize on that info, but to me it still seems worth considering.
>>
>> So although I'll support L4S if it really covers all the safety issues and performs
>> better, I'd be more comfortable with the signaling if there's a way to make SCE
>> do the same job, especially if the endpoint implementation is simpler to get
>> robustly deployed.
[BB] How much is this a case of, "There aren't any problems with the SCE 
endpoint because we haven't thought about the problems yet"?

As well as the straightforward engineering showstoppers that I have 
highlighted (which I'll repeat 1-by-1 in later emails), there's also 
algorithmic stuff that hasn't even been identified yet in SCE, let alone 
addressed theoretically, let alone implemented.

For instance, a shift to fine-grained signals also shifts the smoothing 
from the network to the sender. That means the sender has to smooth the 
SCE signal and not the CE. So you have to deal with the cases where the 
two controllers interact and one overtakes the other. I don't believe 
stability is understood in such a system (you can be pessimistic when 
slowing down, but you also have to ensure stability when speeding up).
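(To make the smoothing-shift concrete, here is a purely hypothetical sketch; the class, gain, and back-off rule are mine, not from any SCE draft. It only illustrates the structural point: the sender smooths the fine-grained SCE signal itself, while CE still demands an immediate classic response, and the two responses coexist in one controller.)

```python
# Hypothetical sender with two interacting responses: a smoothed, gentle
# reaction to per-packet SCE marks, and an immediate halving on CE.
# Nothing here claims the combination is stable -- that is the open question.

class SceSender:
    def __init__(self, cwnd=100.0, gain=1.0 / 16):
        self.cwnd = cwnd
        self.gain = gain
        self.sce_frac = 0.0   # sender-side EWMA of SCE marking fraction

    def on_ack(self, sce_marked, ce_marked):
        # Smoothing has moved from the network AQM to the sender.
        self.sce_frac += self.gain * ((1.0 if sce_marked else 0.0) - self.sce_frac)
        if ce_marked:
            self.cwnd = max(2.0, self.cwnd / 2)   # classic, unsmoothed CE response
        elif sce_marked:
            # gentle per-ACK back-off driven by the smoothed signal
            self.cwnd = max(2.0, self.cwnd - self.sce_frac / 2.0)
```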

It's not as if SCE can just ride on the back of the CC research we and 
others have already done - it also introduces its own new research 
problems.
>>
>> So really, I'm hoping for a bakeoff to decide this, because one of my concerns is
>> that L4S still doesn't have an implementation that does all the things the drafts
>> say are needed for safety on the internet, even though the initial proof of
>> concept demoing the performance gains was presented 7 years ago.
[BB] It was Jul 2015 (nearly 4 years ago, not 7).
>> It's good
>> that it's getting closer, but the long implementation cycle (which still doesn't
>> have all the features required by the drafts) is a concern for me from the
>> "running code" point of view.
[BB] The SCE endpoint will need all the features required by the drafts 
as well, unless it is going to rely solely on FQ.

I should also add that we (the L4S proponents) never envisaged that we 
would have to do all the endpoint stuff. We were all from 
network-focused companies. Although we all had background in congestion 
control for video, we weren't allowed/expected to do such work on 
company time.

What we didn't realize was that researchers aren't getting funding to do 
such work these days (those that haven't been collected by Google). So 
we eventually had to grasp the nettle and find ways to do the endpoint 
stuff ourselves.

For instance, on a personal note, CableLabs is only funding a fraction 
of my working week, and will only pay for time on tasks in my contract, 
which are nearly all about the network aspects. I am self-funding nearly 
all work I do on end-system stuff.


>>
>> On this point of view, it's possible that a parallel track might get further faster,
>> especially if it doesn't need the same special cases to be safe, which is part of
>> why I've been tentatively supportive.
[BB] Let me first see if I can get the SCE proponents to address the 
show-stoppers that I have highlighted. By remaining silent, they seem to 
have convinced everyone that these show-stoppers don't exist.

If necessary, it sounds like it would help to address the only 
outstanding concern with L4S (classic ECN fall-back), irrespective of 
whether we think the problem actually exists or will ever exist.

>>
>> And although I can see how the queue classification is a major issue that could
>> make the difference, especially with the very promising dualq proposal, it also
>> seems true that in addition to CPEs, there are promising avenues for carrier-
>> scale FQ systems (e.g [3], [4]) that could solve that.  It makes me think that even
>> if SCE only gets low-latency with FQ and otherwise causes no harm, it's not clear
>> it'll be a slower path to ubiquitous deployment (and by the way, this approach
>> also would handle the opt-in access control problem).
[BB] You (@Jake) are right to point out that different people have 
different ideas of what they think might happen in the future. However, 
I think it is a bit of a stretch to imagine that ubiquitous deployment 
of FQ might happen...

FQ assumes L4 headers are accessible, which assumes the Internet is an 
unencrypted L3 network. In 4G and 5G the eNodeB or gNodeB where ECN 
would need to be marked is an L2 node. A node deeper into the network has 
already compressed, tunnelled and encapsulated the IP headers. So how 
would FQ here access L4 port numbers? It can't do the cake trick of 
creating an artificial bottleneck where the IP header is accessible, 
because this concerns radio capacity, which varies hugely and continually.

Not to mention... all my other unanswered points about where SCE doesn't 
work at all, e.g.

  * all tunnels will have to propagate the ECT(1) codepoint, when the
    spec saying this isn't even out of WGLC yet,
  * and the optional TCP option for AccECN will be needed to feed back
    ECT(1), when no major OS is going to implement the TCP option,
    because they don't want to handle all the pain of middlebox mangling,
  * and... <my other 3 points that I'll get to in later emails>


If the last unicorn goes to a solution that will rarely work, and 
becomes renowned as ineffective and unconvincing, we will have wasted 
the last unicorn.

>> Of course, this will presumably collapse to one answer at some point, but I'll
>> argue that it's worthwhile to give a good look to the alternate proposal...
>>
>> Anyway, thanks for the comments, I think it's good to see more discussion on
>> this.
[BB] Having alternative(s) is very important, even if strawmen. Proper 
discussion is good too - I've been close enough to this that I can 
identify problems very quickly, but the wider community needs discussion 
time to get steeped in it all.

Thank you very much for all the time you're putting into this.
Cheers


Bob
>>
>> Best regards,
>> Jake
>>
>> [1] Appendix A.1 RFC 8170 https://tools.ietf.org/html/rfc8170#appendix-A.1
>> [2] https://ojs.bibsys.no/index.php/NIK/article/view/9 says 45ms
>> [3] http://ppv.elte.hu/
>> [4] https://ieeexplore.ieee.org/document/8419697
>>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/


[-- Attachment #2: Type: text/html, Size: 27821 bytes --]


* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-25 16:14                                                   ` De Schepper, Koen (Nokia - BE/Antwerp)
@ 2019-07-26 13:10                                                     ` Pete Heist
  0 siblings, 0 replies; 59+ messages in thread
From: Pete Heist @ 2019-07-26 13:10 UTC (permalink / raw)
  To: De Schepper, Koen (Nokia - BE/Antwerp); +Cc: ecn-sane, tsvwg


> On Jul 25, 2019, at 12:14 PM, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> 
> We have the testbed running our reference kernel version 3.19 with the
> drop patch. Let me know if you want to see the difference in behavior
> between the “good” DCTCP and the “deteriorated” DCTCP in the latest
> kernels too. There were several issues introduced which made DCTCP both
> more aggressive, and currently less aggressive. It calls for better
> regression tests (for Prague at least) to make sure its behavior is not
> changed too drastically by new updates. If enough people are interested,
> we can organize a session in one of the available rooms.
>  
> Pete, Jonathan,
>  
> Also, for testing your tests further, let me know when you are available.

Regarding testing, we now have a five node setup in our test environment running a mixture of tcp-prague and dualq kernels to cover the scenarios Jon outlined earlier. With what little time we’ve had for it this week, we’ve only done some basic tests, and seem to be seeing behavior similar to what we saw at the hackathon, but we can discuss specific results following IETF 105.

Our intention is to coordinate a public effort to create reproducible test scenarios for L4S using flent. Details to follow post-conference. We do feel it’s important that all of our Linux testing be on modern 5.1+ kernels, as the 3.19 series was end of life as of May 2015 (https://lwn.net/Articles/643934/), so we'll try to keep up to date with any patches you might have for the newer kernels.

Overall, I think we’ve improved the cooperation between the teams this week (from zero to a little bit :), which should hopefully help move both projects along...


* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-25 21:17                                                 ` Bob Briscoe
@ 2019-07-25 22:00                                                   ` Sebastian Moeller
  0 siblings, 0 replies; 59+ messages in thread
From: Sebastian Moeller @ 2019-07-25 22:00 UTC (permalink / raw)
  To: Bob Briscoe
  Cc: De Schepper, Koen (Nokia - BE/Antwerp),
	Black, David, ecn-sane, tsvwg, Dave Taht

Dear Bob,

thanks for your time and insight. More comments below. I will try to follow your style.

> On Jul 25, 2019, at 23:17, Bob Briscoe <ietf@bobbriscoe.net> wrote:
> 
> Sebastien,
> 
> Sry, I sent that last reply too early, and not bottom posted. Both corrected below (tagged [BB]):
> 
> 
> On 25/07/2019 16:51, Bob Briscoe wrote:
>> Sebastien,
>> 
>> 
>> On 21/07/2019 16:48, Sebastian Moeller wrote:
>>> Dear Bob, 
>>> 
>>> 
>>>> On Jul 21, 2019, at 21:14, Bob Briscoe <ietf@bobbriscoe.net>
>>>>  wrote:
>>>> 
>>>> Sebastien,
>>>> 
>>>> On 21/07/2019 17:08, Sebastian Moeller wrote:
>>>> 
>>>>> Hi Bob,
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Jul 21, 2019, at 14:30, Bob Briscoe <ietf@bobbriscoe.net>
>>>>>> 
>>>>>>  wrote:
>>>>>> 
>>>>>> David,
>>>>>> 
>>>>>> On 19/07/2019 21:06, Black, David wrote:
>>>>>> 
>>>>>> 
>>>>>>> Two comments as an individual, not as a WG chair:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> Mostly, they're things that an end-host algorithm needs
>>>>>>>> to do in order to behave nicely, that might be good things anyways
>>>>>>>> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
>>>>>>>> work well w/ small RTT, be robust to reordering).  I am curious which
>>>>>>>> ones you think are too rigid ... maybe they can be loosened?
>>>>>>>> 
>>>>>>>> 
>>>>>>> [1] I have profoundly objected to L4S's RACK-like requirement (use time to detect loss, and in particular do not use 3DupACK) in public on multiple occasions, because in reliable transport space, that forces use of TCP Prague, a protocol with which we have little to no deployment or operational experience.  Moreover, that requirement raises the bar for other protocols in a fashion that impacts endpoint firmware, and possibly hardware in some important (IMHO) environments where investing in those changes delivers little to no benefit.  The environments that I have in mind include a lot of data centers.  Process wise, I'm ok with addressing this objection via some sort of "controlled environment" escape clause text that makes this RACK-like requirement inapplicable in a "controlled environment" that does not need that behavior (e.g., where 3DupACK does not cause problems and is not expected to cause problems).
>>>>>>> 
>>>>>>> For clarity, I understand the multi-lane link design rationale behind the RACK-like requirement and would agree with that requirement in a perfect world ... BUT ... this world is not perfect ... e.g., 3DupACK will not vanish from "running code" anytime soon.
>>>>>>> 
>>>>>>> 
>>>>>> As you know, we have been at pains to address every concern about L4S that has come up over the years, and I thought we had addressed this one to your satisfaction.
>>>>>> 
>>>>>> The reliable transports you are concerned about require ordered delivery by the underlying fabric, so they can only ever exist in a controlled environment. In such a controlled environment, your ECT1+DSCP idea (below) could be used to isolate the L4S experiment from these transports and their firmware/hardware constraints.
>>>>>> 
>>>>>> On the public Internet, the DSCP commonly gets wiped at the first hop. So requiring a DSCP as well as ECT1 to separate off L4S would serve no useful purpose: it would still lead to ECT1 packets without the DSCP sent from scalable congestion controls (which is behind Jonathan's concern in response to you).
>>>>>> 
>>>>>> 
>>>>> 	And this is why IPv4's protocol field / IPv6's next header field are the classifier you actually need... You are changing a significant portion of TCP's observable behavior, so it can be argued that TCP-Prague is TCP by name only; this "classifier" still lives in the IP header, so no deeper layers need to be accessed; this is non-leaky in that the classifier is unambiguously present independent of the value of the ECN bits; and it is also compatible with SCE-style ECN signaling. Since I believe the most/only likely roll-out of L4S is going to be at the ISP's access nodes (BRAS/BNG/CMTS/whatever), middleboxes should not be an insurmountable problem, as ISPs control their own middleboxes and often even the CPEs, so protocol ossification is not going to be a showstopper for this part of the roll-out.
>>>>> 
>>>>> Best Regards
>>>>> 	Sebastian
>>>>> 
>>>>> 
>>>>> 
>>>> I think you've understood this from reading an abbreviated description of the requirement on the list, rather than the spec. The spec solely says:
>>>> 	A scalable congestion control MUST detect loss by counting in time-based units
>>>> That's all. No more, no less. 
>>>> 
>>>> People call this the "RACK requirement", purely because the idea came from RACK. There is no requirement to do RACK, and the requirement applies to all transports, not just TCP.
>>>> 
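[Editor's note] The difference between the quoted time-based requirement and classic 3-DupACK detection can be sketched as follows. This is an illustrative sketch with assumed parameter values, not the spec's or TCP Prague's actual algorithm; all names are hypothetical.

```python
def dupack_lost(dupacks, thresh=3):
    # Classic detection: declare loss after 3 duplicate ACKs,
    # regardless of how much time has passed.
    return dupacks >= thresh

def time_based_lost(send_time, now, srtt, reorder_window_factor=0.25):
    # RACK-like detection: a packet overtaken by later ACKs is declared
    # lost only once it has been outstanding longer than the smoothed RTT
    # plus a reordering window (here a fraction of srtt; real
    # implementations adapt this window).
    return (now - send_time) > srtt * (1 + reorder_window_factor)
```

With srtt = 100 ms and a 25% window, a packet is only declared lost after 125 ms, however many duplicate ACKs arrive, which is what gives a link some latitude to deliver out of order.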
>>> 	Fair enough, but my argument was not really about RACK at all, it more-so applies to the linear response to CE-marks that ECT(1) promises in the L4S approach. You are making changes to TCP's congestion controller that make it cease to be "TCP-friendly" (for arguably good reasons). So why insist on pretending that this is still TCP? So give it a new protocol ID already and all your classification needs are solved. As a bonus you do not need to use the same signal (CE) to elicit two different responses, but you could use the re-gained ECT(1) code point similarly to SCE to put the new fine-grained congestion signal into... while using CE in the RFC3168 compliant sense.
> 
> [BB] The protocol ID identifies the wire protocol, not the congestion control behaviour. If we had used a different protocol ID for each congestion control behaviour, we'd have run out of protocol IDs long ago (semi serious ;)


	[SM] Yes, I know, but you are proposing a massively incompatible "congestion control behaviour" for L4S that is not TCP-friendly, otherwise you would not need to deal with isolating your new style flows from the rest. For convenience (and since most of the other components are TCP-like) you package the whole thing as a congestion control module for TCP. My argument is, do not do that.
	As an aside, with this approach you are still at the mercy of OS and router manufacturers (okay, Linux should be easy, but what is the plan of attack to get L4S behaviour into Windows' TCP implementation?). To me it seems your best bet would be to create a library for UDP that will do your L4S-type response on top of UDP (you get resequencing tolerance for free ;) ); as long as you supply that library for all important OSes, application writers can opt in without the need for OSes to change. But that is an aside.
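[Editor's note] The "linear response to CE-marks" under discussion, versus the RFC 3168 classic response, can be sketched roughly as below. This is a simplified illustration in the spirit of DCTCP (RFC 8257); variable names and the per-RTT framing are assumptions, not the actual kernel code.

```python
def classic_ce_response(cwnd):
    # RFC 3168 / Reno-style: any CE mark in an RTT halves the window.
    return cwnd / 2.0

def dctcp_update_alpha(alpha, marked_fraction, g=1.0 / 16):
    # DCTCP-style EWMA of the fraction of CE-marked packets per RTT.
    return (1 - g) * alpha + g * marked_fraction

def dctcp_ce_response(cwnd, alpha):
    # Scalable response: reduce in proportion to the marking level
    # rather than by a fixed half.
    return cwnd * (1 - alpha / 2.0)
```

Under light marking (small alpha) the scalable sender barely backs off, which is why the two behaviours cannot safely share a classic RFC 3168 queue without some classifier, the point at issue in this subthread.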



> 
> This is a re-run of a debate that has already been had (in Jul 2015 - Nov 2016), which is recorded in the appendix of ecn-l4s-id here:
> https://tools.ietf.org/html/draft-ietf-tsvwg-ecn-l4s-id-07#appendix-B.4

	[SM] Read it there, I just believe that the final choice of identifier was not the optimal one (I know this is all about trade-offs, I just happen to have different priorities than the L4S project; IMHO all the power to L4S as long as it does stay opt-in and has ZERO side-effects on existing internet users).


> Quoted and annotated below:
> 
>> B.4.  Protocol ID
>> 
>>    It has been suggested that a new ID in the IPv4 Protocol field or the
>>    IPv6 Next Header field could identify L4S packets.  However this
>>    approach is ruled out by numerous problems:
>> 
>>    o  A new protocol ID would need to be paired with the old one for
>>       each transport (TCP, SCTP, UDP, etc.);

	[SM] That is somewhat weak, as a) you are currently only pushing a TCP version, and b) you might want a UDP version anyway (see above); how many applications use anything but TCP or UDP?

>> 
>>    o  In IPv6, there can be a sequence of Next Header fields, and it
>>       would not be obvious which one would be expected to identify a
>>       network service like L4S;
>> 
> In particular, the protocol ID / next header stays next to the upper layer header as a PDU gets encapsulated, possibly many times. So the protocol ID is not necessarily (rarely?) in the outer, particularly in IPv6, and it might be encrypted in IPSec.

	[SM] So, at a peering/transit point, which encapsulations are actually realistic? I would have thought that more or less raw IP packets are required to make the necessary routing decisions at a network's edge, same argument holds for the internet access links. At which points besides the ingress and egress of a network do you expect queueing to happen routinely? From my limited experience it really is at ingress/egress/transit, so which other hops will actually be realistic targets for an L4S-AQM?
	I also am not yet convinced that ISPs will really want to signal that their peering/transits are under-sized, so I am dubious that these will ever get L4S/SCE style signaling (but I hope I am overly pessimistic here).


> 
>>    o  A new protocol ID would rarely provide an end-to-end service,
>>       because It is well-known that new protocol IDs are often blocked
>>       by numerous types of middlebox;

	[SM] Yes, that is the strongest of these four arguments, at least to my layman's eyes.


>> 
>>    o  The approach is not a solution for AQMs below the IP layer;
>> 
>> 
> That last point means that the protocol ID is not designed to always propagate to the outer on encap and back from the outer on decap, whereas the ECN field is (and it's the only field that is).

	[SM] Fair enough, as indicated above, I am not really seeing hops that deal in non-IP packets to actually ever use L4S/SCE type signalling, so is that really a big problem?


> 
> more....
>>> 
>>> 
>>> 
>>>> It then means that a packet with ECT1 in the IP field can be forwarded without resequencing (no requirement - it just /can/ be).
>>>> 
>>> 	Packets always "can" be forwarded without resequencing, the question is whether the end-points are going to like that... 
>>> And IMHO even RACK with its at-maximum one-RTT reordering window gives intermediate hops not much to work with; without knowing the full RTT a cautious hop might allow itself one retransmission slot (so its own contribution to the RTT), but as far as I can tell they do that already. And tracking the RTT will require keeping per-flow statistics, which also seems like it can get computationally expensive quickly... (I probably misunderstand how RACK works, but I fail to see how it will really allow more re-ordering, but that is also orthogonal to the L4S issues I try to raise).
>>> 
> [BB] No-one's suggesting reordering degree will adapt to measured RTT at run-time. 

	[SM] I know, as that would defeat the purpose, but that also puts severe limits on how much re-ordering budget a given link actually has.

> 
> See the original discussion on this point here:
> Vicious or Virtuous circle? Adapting reordering window to reordering degree
> 
> In summary, the uncertainty for the network is a feature not a bug. It means it has to keep reordering degree lower than the lowest likely RTT (or some fraction of it) that is expected for that link technology at the design stage. This will keep reordering low, but not unnecessarily low (i.e. not 3 packets at the link rate).
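[Editor's note] The design-stage arithmetic behind this point can be sketched numerically. All numbers below are illustrative assumptions, not values from any draft.

```python
def reorder_budget_s(lowest_expected_rtt_s, fraction=0.5):
    # Time a link may hold a packet out of order, fixed at design time
    # as a fraction of the lowest RTT expected for that link technology.
    return lowest_expected_rtt_s * fraction

def dupack_budget_s(packet_size_bits, link_rate_bps, dupack_thresh=3):
    # Equivalent time budget implied by the classic 3-DupACK rule:
    # three packet serialization times at the link rate.
    return dupack_thresh * packet_size_bits / link_rate_bps
```

For example, with a 10 ms lowest expected RTT the time-based budget is 5 ms, whereas three 1500-byte packets on a 1 Gb/s link amount to only 36 microseconds, illustrating "not 3 packets at the link rate".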

	[SM] As I state above, a given link realistically will only be allowed one of its own local RTTs' worth of re-ordering (other links might re-order as well, so no link can claim the full e2e RTT's worth of re-ordering all for itself). So all I can see is, for each link, one or (if the link feels lucky) two re-transmit opportunities before the link needs to stall to resequence packets again. Now, that might already be enough (and a sufficiently "batchy" link might transfer more than 3 packets in one haul).
	I naively thought that a link would only ever stall those flows with out-of-order packets and happily fill its upstream pipe with packets from unaffected flows, but that seems not to be happening.


> 
>>> 
>>>> This is a network layer 'unordered delivery' property, so it's appropriate to flag at the IP layer. 
>>>> 
>>> 	But at that point you are multiplexing multiple things into the poor ECT(1) codepoint, the promise of a certain "linear" back-off behavior on encountered congestion AND an "allow relaxed ordering" ("detect loss by counting in time-based units" does not seem to be fully equivalent to a generic tolerance to 'unordered delivery' as far as I understand). That seems asking too much of a simple number...
> [BB] In a purist sense, it is a valid architectural criticism that we overload one codepoint with two architecturally distinct functions:
> 	• low queuing delay
> 	• low resequencing delay
> But then, one has to consider the value vs cost of 2 independent identifiers for two things that are unlikely to ever need to be distinguished. If an app wants low delay, would it want only low queuing delay and not low resequencing delay? 

	[SM] Sorry, I can well envision apps that do not care about "low queuing delay" but would be happy to give laxer reordering requirements to the network (like a bulk data transfer, that just wants to keep pushing packets through). Is that unrealistic? 

> 
> You could contrive a case where the receiver is memory-challenged and needs the network to do the resequencing.

	Well, packets are sent in sequence, so the idea is not to burden the network with undue work, but rather to faithfully transmit what the endpoints send. 
(On a tangent, somewhere else you argued against FQ as it will take the dynamic packet spacing decisions away from the sending endpoint, but surely changing the order of packets is a far more grave intervention than just changing the interpacket intervals, no?)

> But it's not a reasonable expectation for the network to do a function that will cause HoL blocking for other applications in the process of helping you with your memory problems.
> 
> Given we are header-bit-challenged, it would not be unreasonable for the WG to decide to conflate these two architectural identifiers into one.
> 
> 
> Bob
> 
>>> 
>>> Best Regards
>>> 	Sebastian
>>> 
>>> 
>>>> 
>>>> Bob
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> ________________________________________________________________
>>>> Bob Briscoe                               
>>>> 
>>>> http://bobbriscoe.net/
>>> _______________________________________________
>>> Ecn-sane mailing list
>>> 
>>> Ecn-sane@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/ecn-sane
>> 
>> -- 
>> ________________________________________________________________
>> Bob Briscoe                               
>> http://bobbriscoe.net/
> 
> -- 
> ________________________________________________________________
> Bob Briscoe                               
> http://bobbriscoe.net/


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-25 20:51                                               ` Bob Briscoe
@ 2019-07-25 21:17                                                 ` Bob Briscoe
  2019-07-25 22:00                                                   ` Sebastian Moeller
  0 siblings, 1 reply; 59+ messages in thread
From: Bob Briscoe @ 2019-07-25 21:17 UTC (permalink / raw)
  To: Sebastian Moeller
  Cc: De Schepper, Koen (Nokia - BE/Antwerp),
	Black, David, ecn-sane, tsvwg, Dave Taht

[-- Attachment #1: Type: text/plain, Size: 10231 bytes --]

Sebastien,

Sry, I sent that last reply too early, and not bottom posted. Both 
corrected below (tagged [BB]):


On 25/07/2019 16:51, Bob Briscoe wrote:
> Sebastien,
>
>
> On 21/07/2019 16:48, Sebastian Moeller wrote:
>> Dear Bob,
>>
>>> On Jul 21, 2019, at 21:14, Bob Briscoe<ietf@bobbriscoe.net>  wrote:
>>>
>>> Sebastien,
>>>
>>> On 21/07/2019 17:08, Sebastian Moeller wrote:
>>>> Hi Bob,
>>>>
>>>>
>>>>
>>>>> On Jul 21, 2019, at 14:30, Bob Briscoe<ietf@bobbriscoe.net>
>>>>>   wrote:
>>>>>
>>>>> David,
>>>>>
>>>>> On 19/07/2019 21:06, Black, David wrote:
>>>>>
>>>>>> Two comments as an individual, not as a WG chair:
>>>>>>
>>>>>>
>>>>>>> Mostly, they're things that an end-host algorithm needs
>>>>>>> to do in order to behave nicely, that might be good things anyways
>>>>>>> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
>>>>>>> work well w/ small RTT, be robust to reordering).  I am curious which
>>>>>>> ones you think are too rigid ... maybe they can be loosened?
>>>>>>>
>>>>>> [1] I have profoundly objected to L4S's RACK-like requirement (use time to detect loss, and in particular do not use 3DupACK) in public on multiple occasions, because in reliable transport space, that forces use of TCP Prague, a protocol with which we have little to no deployment or operational experience.  Moreover, that requirement raises the bar for other protocols in a fashion that impacts endpoint firmware, and possibly hardware in some important (IMHO) environments where investing in those changes delivers little to no benefit.  The environments that I have in mind include a lot of data centers.  Process wise, I'm ok with addressing this objection via some sort of "controlled environment" escape clause text that makes this RACK-like requirement inapplicable in a "controlled environment" that does not need that behavior (e.g., where 3DupACK does not cause problems and is not expected to cause problems).
>>>>>>
>>>>>> For clarity, I understand the multi-lane link design rationale behind the RACK-like requirement and would agree with that requirement in a perfect world ... BUT ... this world is not perfect ... e.g., 3DupACK will not vanish from "running code" anytime soon.
>>>>>>
>>>>> As you know, we have been at pains to address every concern about L4S that has come up over the years, and I thought we had addressed this one to your satisfaction.
>>>>>
>>>>> The reliable transports you are concerned about require ordered delivery by the underlying fabric, so they can only ever exist in a controlled environment. In such a controlled environment, your ECT1+DSCP idea (below) could be used to isolate the L4S experiment from these transports and their firmware/hardware constraints.
>>>>>
>>>>> On the public Internet, the DSCP commonly gets wiped at the first hop. So requiring a DSCP as well as ECT1 to separate off L4S would serve no useful purpose: it would still lead to ECT1 packets without the DSCP sent from scalable congestion controls (which is behind Jonathan's concern in response to you).
>>>>>
>>>> 	And this is why IPv4's protocol field / IPv6's next header field are the classifier you actually need... You are changing a significant portion of TCP's observable behavior, so it can be argued that TCP-Prague is TCP by name only; this "classifier" still lives in the IP header, so no deeper layers need to be accessed; this is non-leaky in that the classifier is unambiguously present independent of the value of the ECN bits; and it is also compatible with SCE-style ECN signaling. Since I believe the most/only likely roll-out of L4S is going to be at the ISP's access nodes (BRAS/BNG/CMTS/whatever), middleboxes should not be an insurmountable problem, as ISPs control their own middleboxes and often even the CPEs, so protocol ossification is not going to be a showstopper for this part of the roll-out.
>>>>
>>>> Best Regards
>>>> 	Sebastian
>>>>
>>>>
>>> I think you've understood this from reading an abbreviated description of the requirement on the list, rather than the spec. The spec solely says:
>>> 	A scalable congestion control MUST detect loss by counting in time-based units
>>> That's all. No more, no less.
>>>
>>> People call this the "RACK requirement", purely because the idea came from RACK. There is no requirement to do RACK, and the requirement applies to all transports, not just TCP.
>> 	Fair enough, but my argument was not really about RACK at all, it more-so applies to the linear response to CE-marks that ECT(1) promises in the L4S approach. You are making changes to TCP's congestion controller that make it cease to be "TCP-friendly" (for arguably good reasons). So why insist on pretending that this is still TCP? So give it a new protocol ID already and all your classification needs are solved. As a bonus you do not need to use the same signal (CE) to elicit two different responses, but you could use the re-gained ECT(1) code point similarly to SCE to put the new fine-grained congestion signal into... while using CE in the RFC3168 compliant sense.

[BB] The protocol ID identifies the wire protocol, not the congestion 
control behaviour. If we had used a different protocol ID for each 
congestion control behaviour, we'd have run out of protocol IDs long ago 
(semi serious ;)

This is a re-run of a debate that has already been had (in Jul 2015 - 
Nov 2016), which is recorded in the appendix of ecn-l4s-id here:
https://tools.ietf.org/html/draft-ietf-tsvwg-ecn-l4s-id-07#appendix-B.4
Quoted and annotated below:

> B.4.  Protocol ID
>
>     It has been suggested that a new ID in the IPv4 Protocol field or the
>     IPv6 Next Header field could identify L4S packets.  However this
>     approach is ruled out by numerous problems:
>
>     o  A new protocol ID would need to be paired with the old one for
>        each transport (TCP, SCTP, UDP, etc.);
>
>     o  In IPv6, there can be a sequence of Next Header fields, and it
>        would not be obvious which one would be expected to identify a
>        network service like L4S;

In particular, the protocol ID / next header stays next to the upper 
layer header as a PDU gets encapsulated, possibly many times. So the 
protocol ID is not necessarily (rarely?) in the outer, particularly in 
IPv6, and it might be encrypted in IPSec.

>     o  A new protocol ID would rarely provide an end-to-end service,
>        because It is well-known that new protocol IDs are often blocked
>        by numerous types of middlebox;
>
>     o  The approach is not a solution for AQMs below the IP layer;

That last point means that the protocol ID is not designed to always 
propagate to the outer on encap and back from the outer on decap, 
whereas the ECN field is (and it's the only field that is).

more....
>>
>>
>>> It then means that a packet with ECT1 in the IP field can be forwarded without resequencing (no requirement - it just it /can/ be).
>> 	Packets always "can" be forwarded without resequencing, the question is whether the end-points are going to like that...
>> And IMHO even RACK with its at-maximum one-RTT reordering window gives intermediate hops not much to work with; without knowing the full RTT a cautious hop might allow itself one retransmission slot (so its own contribution to the RTT), but as far as I can tell they do that already. And tracking the RTT will require keeping per-flow statistics, which also seems like it can get computationally expensive quickly... (I probably misunderstand how RACK works, but I fail to see how it will really allow more re-ordering, but that is also orthogonal to the L4S issues I try to raise).
[BB] No-one's suggesting reordering degree will adapt to measured RTT at 
run-time.

See the original discussion on this point here:
Vicious or Virtuous circle? Adapting reordering window to reordering 
degree 
<https://mailarchive.ietf.org/arch/msg/tcpm/QOhMjHEo2kbHGInH8eFEsXbdwkA>

In summary, the uncertainty for the network is a feature not a bug. It 
means it has to keep reordering degree lower than the lowest likely RTT 
(or some fraction of it) that is expected for that link technology at 
the design stage. This will keep reordering low, but not 
unnecessarily low (i.e. not 3 packets at the link rate).

>>
>>> This is a network layer 'unordered delivery' property, so it's appropriate to flag at the IP layer.
>> 	But at that point you are multiplexing multiple things into the poor ECT(1) codepoint, the promise of a certain "linear" back-off behavior on encountered congestion AND an "allow relaxed ordering" ("detect loss by counting in time-based units" does not seem to be fully equivalent to a generic tolerance to 'unordered delivery' as far as I understand). That seems asking too much of a simple number...
[BB] In a purist sense, it is a valid architectural criticism that we 
overload one codepoint with two architecturally distinct functions:

  * low queuing delay
  * low resequencing delay

But then, one has to consider the value vs cost of 2 independent 
identifiers for two things that are unlikely to ever need to be 
distinguished. If an app wants low delay, would it want only low queuing 
delay and not low resequencing delay?

You could contrive a case where the receiver is memory-challenged and 
needs the network to do the resequencing. But it's not a reasonable 
expectation for the network to do a function that will cause HoL 
blocking for other applications in the process of helping you with your 
memory problems.

Given we are header-bit-challenged, it would not be unreasonable for the 
WG to decide to conflate these two architectural identifiers into one.


Bob

>>
>> Best Regards
>> 	Sebastian
>>
>>>
>>> Bob
>>>
>>>
>>>
>>> -- 
>>> ________________________________________________________________
>>> Bob Briscoe
>>> http://bobbriscoe.net/
>> _______________________________________________
>> Ecn-sane mailing list
>> Ecn-sane@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/ecn-sane
>
> -- 
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/


[-- Attachment #2: Type: text/html, Size: 14261 bytes --]


* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-21 20:48                                             ` Sebastian Moeller
@ 2019-07-25 20:51                                               ` Bob Briscoe
  2019-07-25 21:17                                                 ` Bob Briscoe
  0 siblings, 1 reply; 59+ messages in thread
From: Bob Briscoe @ 2019-07-25 20:51 UTC (permalink / raw)
  To: Sebastian Moeller
  Cc: De Schepper, Koen (Nokia - BE/Antwerp),
	Black, David, ecn-sane, tsvwg, Dave Taht

[-- Attachment #1: Type: text/plain, Size: 8382 bytes --]

Sebastien,

The protocol ID identifies the wire protocol, not the congestion control 
behaviour. If we had used a different protocol ID for each congestion 
control behaviour, we'd have run out of protocol IDs long ago (semi 
serious ;)

This is a re-run of a debate that has already been had (in Jul 2015 - 
Nov 2016), which is recorded in the appendix of ecn-l4s-id here:
https://tools.ietf.org/html/draft-ietf-tsvwg-ecn-l4s-id-07#appendix-B.4
Quoted and annotated below:

> B.4.  Protocol ID
>
>     It has been suggested that a new ID in the IPv4 Protocol field or the
>     IPv6 Next Header field could identify L4S packets.  However this
>     approach is ruled out by numerous problems:
>
>     o  A new protocol ID would need to be paired with the old one for
>        each transport (TCP, SCTP, UDP, etc.);
>
>     o  In IPv6, there can be a sequence of Next Header fields, and it
>        would not be obvious which one would be expected to identify a
>        network service like L4S;

In particular, the protocol ID / next header stays next to the upper 
layer header as a PDU gets encapsulated, possibly many times. So the 
protocol ID is not necessarily (rarely?) in the outer, particularly in 
IPv6, and it might be encrypted in IPSec.

>     o  A new protocol ID would rarely provide an end-to-end service,
>        because It is well-known that new protocol IDs are often blocked
>        by numerous types of middlebox;
>
>     o  The approach is not a solution for AQMs below the IP layer;

That last point means that the protocol ID is not designed to always 
propagate to the outer on encap and back from the outer on decap, 
whereas the ECN field is (and it's the only field that is).




Bob

On 21/07/2019 16:48, Sebastian Moeller wrote:
> Dear Bob,
>
>> On Jul 21, 2019, at 21:14, Bob Briscoe <ietf@bobbriscoe.net> wrote:
>>
>> Sebastien,
>>
>> On 21/07/2019 17:08, Sebastian Moeller wrote:
>>> Hi Bob,
>>>
>>>
>>>
>>>> On Jul 21, 2019, at 14:30, Bob Briscoe <ietf@bobbriscoe.net>
>>>>   wrote:
>>>>
>>>> David,
>>>>
>>>> On 19/07/2019 21:06, Black, David wrote:
>>>>
>>>>> Two comments as an individual, not as a WG chair:
>>>>>
>>>>>
>>>>>> Mostly, they're things that an end-host algorithm needs
>>>>>> to do in order to behave nicely, that might be good things anyways
>>>>>> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
>>>>>> work well w/ small RTT, be robust to reordering).  I am curious which
>>>>>> ones you think are too rigid ... maybe they can be loosened?
>>>>>>
>>>>> [1] I have profoundly objected to L4S's RACK-like requirement (use time to detect loss, and in particular do not use 3DupACK) in public on multiple occasions, because in reliable transport space, that forces use of TCP Prague, a protocol with which we have little to no deployment or operational experience.  Moreover, that requirement raises the bar for other protocols in a fashion that impacts endpoint firmware, and possibly hardware in some important (IMHO) environments where investing in those changes delivers little to no benefit.  The environments that I have in mind include a lot of data centers.  Process wise, I'm ok with addressing this objection via some sort of "controlled environment" escape clause text that makes this RACK-like requirement inapplicable in a "controlled environment" that does not need that behavior (e.g., where 3DupACK does not cause problems and is not expected to cause problems).
>>>>>
>>>>> For clarity, I understand the multi-lane link design rationale behind the RACK-like requirement and would agree with that requirement in a perfect world ... BUT ... this world is not perfect ... e.g., 3DupACK will not vanish from "running code" anytime soon.
>>>>>
>>>> As you know, we have been at pains to address every concern about L4S that has come up over the years, and I thought we had addressed this one to your satisfaction.
>>>>
>>>> The reliable transports you are concerned about require ordered delivery by the underlying fabric, so they can only ever exist in a controlled environment. In such a controlled environment, your ECT1+DSCP idea (below) could be used to isolate the L4S experiment from these transports and their firmware/hardware constraints.
>>>>
>>>> On the public Internet, the DSCP commonly gets wiped at the first hop. So requiring a DSCP as well as ECT1 to separate off L4S would serve no useful purpose: it would still lead to ECT1 packets without the DSCP sent from scalable congestion controls (which is behind Jonathan's concern in response to you).
>>>>
>>> 	And this is why IPv4's protocol field / IPv6's next header field are the classifier you actually need... You are changing a significant portion of TCP's observable behavior, so it can be argued that TCP-Prague is TCP by name only; this "classifier" still lives in the IP header, so no deeper layers need to be accessed; this is non-leaky in that the classifier is unambiguously present independent of the value of the ECN bits; and it is also compatible with SCE-style ECN signaling. Since I believe the most/only likely roll-out of L4S is going to be at the ISP's access nodes (BRAS/BNG/CMTS/whatever), middleboxes should not be an insurmountable problem, as ISPs control their own middleboxes and often even the CPEs, so protocol ossification is not going to be a showstopper for this part of the roll-out.
>>>
>>> Best Regards
>>> 	Sebastian
>>>
>>>
>> I think you've understood this from reading an abbreviated description of the requirement on the list, rather than the spec. The spec solely says:
>> 	A scalable congestion control MUST detect loss by counting in time-based units
>> That's all. No more, no less.
>>
>> People call this the "RACK requirement", purely because the idea came from RACK. There is no requirement to do RACK, and the requirement applies to all transports, not just TCP.
> 	Fair enough, but my argument was not really about RACK at all, it more-so applies to the linear response to CE-marks that ECT(1) promises in the L4S approach. You are making changes to TCP's congestion controller that make it cease to be "TCP-friendly" (for arguably good reasons). So why insist on pretending that this is still TCP? So give it a new protocol ID already and all your classification needs are solved. As a bonus you do not need to use the same signal (CE) to elicit two different responses, but you could use the re-gained ECT(1) code point similarly to SCE to put the new fine-grained congestion signal into... while using CE in the RFC3168 compliant sense.
>
>
>> It then means that a packet with ECT1 in the IP field can be forwarded without resequencing (no requirement - it just /can/ be).
> 	Packets always "can" be forwarded without resequencing, the question is whether the end-points are going to like that...
> And IMHO even RACK with its at-maximum one-RTT reordering window gives intermediate hops not much to work with; without knowing the full RTT a cautious hop might allow itself one retransmission slot (so its own contribution to the RTT), but as far as I can tell they do that already. And tracking the RTT will require keeping per-flow statistics, which also seems like it can get computationally expensive quickly... (I probably misunderstand how RACK works, but I fail to see how it will really allow more re-ordering, but that is also orthogonal to the L4S issues I try to raise).
>
>> This is a network layer 'unordered delivery' property, so it's appropriate to flag at the IP layer.
> 	But at that point you are multiplexing multiple things into the poor ECT(1) codepoint, the promise of a certain "linear" back-off behavior on encountered congestion AND an "allow relaxed ordering" ("detect loss by counting in time-based units" does not seem to be fully equivalent to a generic tolerance to 'unordered delivery' as far as I understand). That seems asking too much of a simple number...
>
> Best Regards
> 	Sebastian
>
>>
>>
>>
>> Bob
>>
>>
>>
>> -- 
>> ________________________________________________________________
>> Bob Briscoe
>> http://bobbriscoe.net/
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/


[-- Attachment #2: Type: text/html, Size: 10743 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-22 19:48                                                 ` Pete Heist
@ 2019-07-25 16:14                                                   ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-26 13:10                                                     ` Pete Heist
  0 siblings, 1 reply; 59+ messages in thread
From: De Schepper, Koen (Nokia - BE/Antwerp) @ 2019-07-25 16:14 UTC (permalink / raw)
  To: Pete Heist
  Cc: Jonathan Morton, Bob Briscoe, ecn-sane, Black,  David, tsvwg, Dave Taht

[-- Attachment #1: Type: text/plain, Size: 2624 bytes --]

All,

We have the testbed running our reference kernel version 3.19 with the drop patch. Let me know if you want to see the difference in behavior between the “good” DCTCP and the “deteriorated” DCTCP in the latest kernels too. Several issues were introduced that made DCTCP at first more aggressive, and currently less aggressive. It calls for better regression tests (for Prague at least) to make sure its behavior is not changed too drastically by new updates. If enough people are interested, we can organize a session in one of the available rooms.

Pete, Jonathan,

Also, for taking your tests further, let me know when you are available.

Koen.

From: Pete Heist <pete@heistp.net>
Sent: Monday, July 22, 2019 9:48 PM
To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>
Cc: Jonathan Morton <chromatix99@gmail.com>; Bob Briscoe <in@bobbriscoe.net>; ecn-sane@lists.bufferbloat.net; Black, David <David.Black@dell.com>; tsvwg@ietf.org; Dave Taht <dave@taht.net>
Subject: Re: [Ecn-sane] [tsvwg] Comments on L4S drafts


On Jul 22, 2019, at 2:15 PM, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com<mailto:koen.de_schepper@nokia-bell-labs.com>> wrote:

- related to the flent testing, you might have expected to find big differences, but both measurements showed exactly the same results. I understood you need to extend your tools to include more measurement parameters that were missing compared to ours.

On this point, this morning the ability to start multiple ping flows with different tos values for each was already added to flent (thanks to Toke), so that we can measure inter-flow latency separately for the classic and L4S queues. We added a few related plots to use this new feature.

Since 104, development and testing of SCE has been our focus, but work on testing and interop with L4S has begun. We have built the TCP Prague and sch_dualpi2 repos for use in our testbed. Some documentation on setup could be helpful: which kernels from which repos need to be deployed in which part of a dumbbell setup, and what configuration, if any, is needed (like new sysctls or sysctl values). We have added some documentation to the README of our repo (https://github.com/chromi/sce/).

To editorialize a bit, I think we’re both aware that testing congestion control can take time and care. I believe that together we can figure out how to improve congestion control for people that use the Internet, and the different ways that they use it. We’ll try to think about them first and foremost. :)


[-- Attachment #2: Type: text/html, Size: 5851 bytes --]


* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-19 23:42                                         ` Dave Taht
@ 2019-07-24 16:21                                           ` Dave Taht
  0 siblings, 0 replies; 59+ messages in thread
From: Dave Taht @ 2019-07-24 16:21 UTC (permalink / raw)
  To: Wesley Eddy
  Cc: Dave Taht, De Schepper, Koen (Nokia - BE/Antwerp), ecn-sane, tsvwg

On Fri, Jul 19, 2019 at 4:42 PM Dave Taht <dave.taht@gmail.com> wrote:
>
> On Fri, Jul 19, 2019 at 3:09 PM Wesley Eddy <wes@mti-systems.com> wrote:
> >
> > Hi Dave, thanks for clarifying, and sorry if you're getting upset.
>
> There have been a few other disappointments this ietf. I'd hoped bbrv2
> would land for independent testing. Didn't.
>
> https://github.com/google/bbr
>
> I have some "interesting" patches for bbrv1 but felt it would be saner
> to wait for the most current version (or for the bbrv2 authors to
> have the small rfc3168 baseline patch I'd requested tested by them
> rather than I), to bother redoing that series of tests and publishing.

The bbrv2 code did indeed land yesterday (and - joy!) was accompanied
by test scripts for repeatable results. The iccrg preso was
impressive. thank you, thank you. It's going to take a while to
retrofit my suggested simpler rfc3168 ecn handling, and/or sce, but not
as long as until next ietf.

> I'd asked if the dctcp and dualpi code on github was stable enough to
> be independently tested. No reply.

In poking through the most current git trees, I see this commit
finally installed *sane behavior in response to loss*
into dctcp, which it didn't have before.

commit aecfde23108b8e637d9f5c5e523b24fb97035dc3
Author: Koen De Schepper <koen.de_schepper@nokia-bell-labs.com>
Date:   Thu Apr 4 12:24:02 2019 +0000
    tcp: Ensure DCTCP reacts to losses
...

Which explains a few things. Now I get to throw out 8 years of test
results and start over. And throw out most of yours, also. Please note
that seeing a bug of this magnitude fixed gives me joy. Perhaps many
issues I saw were due to this, not theory/spec failures. This brings
up another issue I'll start a new subject line for.
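To make the significance of that commit concrete: before it, DCTCP's window reduction was driven only by the CE-mark fraction; the fix ensures a Reno-style multiplicative decrease also happens on actual loss. A toy model of the two responses (my sketch, not the kernel code):

```python
def dctcp_on_ce(cwnd, alpha):
    """DCTCP reduction for an RTT with CE marks: proportional to the
    EWMA'd fraction alpha of marked packets (alpha in [0, 1])."""
    return cwnd * (1 - alpha / 2)

def on_loss(cwnd):
    """Reno-style multiplicative decrease on loss, which the cited
    commit ensures DCTCP also performs."""
    return cwnd / 2

# With a small alpha (lightly marked queue) the CE response is gentle,
# while loss still triggers the full halving:
gentle = dctcp_on_ce(100, 0.1)  # ~95 packets
severe = on_loss(100)           # 50 packets
```

Without the loss path, a DCTCP flow traversing a drop-only (non-ECN) bottleneck would barely back off at all, which is consistent with the anomalies described above.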

This commit looks to make a dent in the GRO issue I've raised periodically:

commit e3058450965972e67cc0e5492c08c4cdadafc134
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Apr 11 05:55:23 2019 -0700
    dctcp: more accurate tracking of packets delivery

    After commit e21db6f69a95 ("tcp: track total bytes delivered with ECN CE marks")
    core TCP stack does a very good job tracking ECN signals.

    The "sender's best estimate of CE information" Yuchung mentioned in his
    patch is indeed the best we can do.

    DCTCP can use tp->delivered_ce and tp->delivered to not duplicate the logic,
    and use the existing best estimate.

    This solves some problems, since current DCTCP logic does not deal
    with losses and/or GRO or ack aggregation very well.

...

Still it's hard to mark multiple packets in a gso/gro bundle - cake
does gso splitting by default, dualpi
does not. Has tso/gro been enabled or disabled for others' tests so far?
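One way to see the GSO/GRO problem: an AQM marking at the qdisc sees one super-packet where the wire carries dozens of segments, so a single mark decision covers the whole bundle unless it is split first (as cake does). A rough simulation of the granularity difference (segment counts and probabilities are illustrative assumptions, not measurements):

```python
import random

random.seed(1)       # deterministic for illustration
p = 0.05             # AQM mark probability per qdisc-level packet
BUNDLE_SEGS = 44     # wire segments in a ~64 KB GSO super-packet
N_BUNDLES = 10_000

# Split (cake-style): every wire segment gets its own marking decision.
split = [random.random() < p for _ in range(N_BUNDLES * BUNDLE_SEGS)]

# Unsplit: one decision per super-packet covers all of its segments.
unsplit = []
for _ in range(N_BUNDLES):
    marked = random.random() < p
    unsplit.extend([marked] * BUNDLE_SEGS)

rate_split = sum(split) / len(split)
rate_unsplit = sum(unsplit) / len(unsplit)
# Long-run mark *rates* are comparable, but the unsplit signal arrives
# in bursts of 44 identical marks -- much coarser per-RTT feedback for
# a controller (like DCTCP) that averages the marked fraction each RTT.
```

This is why the tso/gro setting matters for comparing results: it changes the effective resolution of the congestion signal, independently of the AQM under test.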

> The SCE folk did freeze and document a release worth testing.

But it looks to me like they were missing both of these commits.

> I did some testing on wifi at battlemesh but it's too noisy (but the
> sources of "noise" were important) and too obviously "ecn is not the
> wifi problem"
>
> I didn't know there was an "add a delay based option to cubic patch"
> until last week.
>
> So anyway, I do retain hope, maybe after this coming week and some
> more hackathoning, it might be possible to start getting reproducible
> and repeatable results from more participants in this controversy.
> Having to sit through another half-dozen presentations with
> irreproducible results is not something I look forward to, and I'm
> glad I don't have to.
>
> > When we're talking about keeping very small queues, then RTT is lost as
> > a congestion indicator (since there is no queue depth to modulate as a
> > congestion signal into the RTT).  We have indicators that include drop,
> > RTT, and ECN (when available).  Using rate of marks rather than just
> > binary presence of marking gives a finer-grained signal.  SCE is also
> > providing a multi-level indication, so that's another way to get more
> > "ENOB" into the samples of congestion being fed to the controllers.
>
> While this is extremely well said, RTT is NOT lost as a congestion
> indicator, it just becomes finer grained.
>
> While I'm reading tea-leaves... there's been a lot of stuff landing in
> the linux kernel from google around edf scheduling for tcp and the
> hardware enabled pacing qdiscs. So I figure they are now in the nsec
> category on their stuff but not ready to be talking.
>
> > Marking (whether classic ECN, mark-rate, or multi-level marking) is
> > needed since with small queues there's lack of congestion information in
> > the RTT.
>
> small queues *and isochronous, high speed, wired connections*.
>
> What will it take to get the ecn and especially l4s crowd to take a
> hard look at actual wireless or wifi packet captures? I mean, y'all
> are sitting staring into your laptops for a week, doing wifi. Would it
> hurt to test more actual transports during
> that time?

I do keep hoping someone will attempt to publish some wifi results. I guess
that might end up being me, next time around.

>
> How many ISPs would still be in business if wifi didn't exist, only {X}G?
>
> the wifi at the last ietf sucked...
>
> Can't even get close to 5ms latencies on any form of wireless/wifi.
>
> Anyway, I long ago agreed that multiple marks (of some sort) per rtt
> made sense (see my position statements on ecn-sane),
> but of late I've been leaning more towards really good pacing,  rtt
> and chirping with minimal marking required on
> "small queues *and isochronous, high speed, wired connections*.
>
> >
> > To address one question you repeated a couple times:
> >
> > > Is there any chance we'll see my conception of the good ietf process
> > > enforced on the L4S and SCE processes by the chairs?
> >
> > We look for working group consensus.  So far, we saw consensus to adopt
> > as a WG item for experimental track, and have been following the process
> > for that.
>
> Well, given the announcement of docsis low latency, and the size of
> the fq_codel deployment,
> and the l4s/sce drafts, we are light-years beyond anything I'd
> consider to be "experimental" in the real world.
>
> Would recognizing this reality and somehow converting this to a
> standards track debate within the ietf help anything?
>
> Would getting this out of tsvwg and restarting aqmwg help any?
>
> I was, up until all this blew up in december, planning on starting the
> process for an rfc8289bis and rfc8290bis on the standards track.
>
> >
> > On the topic of gaming the system by falsely setting the L4S ID, that
> > might need to be discussed a little bit more, since now that you mention
> > it, the docs don't seem to very directly address it yet.
>
> to me this has always been a game theory deal killer for l4s (and
> diffserv, intserv, etc). You cannot ask for
> more priority, only less. While I've been recommending books from
> kleinrock lately, another one that
> I think everyone in this field should have is:
>
> https://www.amazon.com/Theory-Games-Economic-Behavior-Commemorative-ebook/dp/B00AMAZL4I/ref=sr_1_1?keywords=theory+of+games+and+economic+behavior&qid=1563579161&s=gateway&sr=8-1
>
> I've read it countless times (and can't claim to have understood more
> than a tiny percentage of it). I wasn't aware
> until this moment there was a kindle edition.
>
> > I can only
> > speak for myself, but assumed a couple things internally, such as (1)
> > this is getting enabled in specific environments, (2) in less controlled
> > environments, an operator enabling it has protections in place for
> > getting admission or dealing with bad behavior, (3) there could be
> > further development of audit capabilities such as in CONEX, etc.  I
> > guess it could be good to hear more about what others were thinking on this.
>
> I think there was "yet another queue" suggested for detected bad behavior.
>
> >
> > > So I should have said - "tosses all normal ("classic") flows into a
> > > single and higher latency queue when a greedy normal flow is present"
> > > ... "in the dualpi" case? I know it's possible to hang a different
> > > queue algo on the "normal" queue, but
> > > to this day I don't see the need for the l4s "fast lane" in the first
> > > place, nor a cpu efficient way of doing the right things with the
> > > dualpi or curvyred code. What I see, is, long term, that special bit
> > > just becomes a "fast" lane for any sort of admission controlled
> > > traffic the ISP wants to put there, because the dualpi idea fails on
> > > real traffic.
> >
> > Thanks; this was helpful for me to understand your position.
>
> Groovy.
>
> I recently ripped ecn support out of fq_codel entirely, in
> the fq_codel_fast tree. saved some cpu, still measuring (my real objective
> is to make that code multicore),
>
> another branch also has the basic sce support, and will have more
> after jon settles on a ramp and single queue fallbacks in
> sch_cake. btw, if anyone cares, there's more than a few flent test
> servers scattered around the internet now that
> do some variant of sce for others to play with....
>
> >
> >
> > > Well if the various WGs would exit that nice hotel, and form a
> > > diaspora over the city in coffee shops and other public spaces, and do
> > > some tests of your latest and greatest stuff, y'all might get a more
> > > accurate viewpoint of what you are actually accomplishing. Take a look
> > > at what BBR does, take a look at what IW10 does, take a look at what
> > > browsers currently do.
> >
> > All of those things come up in the meetings, and frequently there is
> > measurement data shown and discussed.  It's always welcome when people
> > bring measurements, data, and experience.  The drafts and other
> > contributions are here so that anyone interested can independently
> > implement and do the testing you advocate and share results.  We're all
> > on the same team trying to make the Internet better.
>
> Skip a meeting. Try the internet in Bali. Or africa. Or south america.
> Or on a boat, Or do an interim
> in places like that.
>
> >
> >
>
>
> --
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-205-9740



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740


* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-22 18:15                                               ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-22 18:33                                                 ` Dave Taht
  2019-07-22 19:48                                                 ` Pete Heist
@ 2019-07-23 10:33                                                 ` Sebastian Moeller
  2 siblings, 0 replies; 59+ messages in thread
From: Sebastian Moeller @ 2019-07-23 10:33 UTC (permalink / raw)
  To: De Schepper, Koen (Nokia - BE/Antwerp)
  Cc: Jonathan Morton, Bob Briscoe, Black, David, ecn-sane, tsvwg, Dave Taht

Hi Koen,


> On Jul 22, 2019, at 20:15, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> 
> Jonathan,
> 
> I'm a bit surprised to read what I read here... I had the impression that we were on a much better level of understanding during the hackathon and that :
> 
> - we both agreed that the latest updates in the Linux kernels had quite some impact on DCTCP's performance (burstiness) that both you and we are working on. Our testbed also showed it had the same impact on DualPI2 and FQ-Codel (yes, we do understand FQ_Codel and did extensively compare DualQ with it since the beginning of L4S).
> - the current TCP-Prague we have in the public GitHub, which is DCTCP using accurate ECN and ect(1) and is drop compliant with Reno, is what SCE can use as well, and whatever you called SCE-TCP can be used for L4S, as (what I showed you mathematically) it actually works exactly according to DCTCP's law of 1/p, because it is DCTCP with some simple pacing tweaks you did. I thought we agreed that there is no difference in the congestion control part, and we want the same thing, and the only difference is how to use the code-point.
> - related to the testbed setups, we have several running, the first since 2013. We support all kernel versions since 3.19 up to the latest 5.2-rc5. We have demonstrated L4S since 2015 in IETF93 and the L4S BoF with real equipment and software that is still the same as we use today.
> - the testbed I brought (5 laptops and a switch that got broken during travel and I had to replace in the nearest shop), I had to install during the hackathon from scratch from our public GitHub (I arrived only at 14:00 on Saturday) which we made immediately available for you guys to put the flent testing tools on.
> - related to the flent testing, you might have expected to find big differences, but both measurements showed exactly the same results. I understood you need to extend your tools to include more measurement parameters that were missing compared to ours.
> - we planned to complete your test list during this week and maybe best that we jointly report on the outcome of those to avoid different interpretations again.
> - anybody who had interest in L4S could have evaluated it since we made our DUALPI2 code available in 2015 (actually many did).

	Well, at IETF 104 there was a promise on the lists of VMs with both endpoints for an L4S system, which as far as I can tell never materialized, and which made me refrain from testing... And I believe I did ask/propose the VM thing on this very list and got no response.

> (To Dave Taht: if you wanted to evaluate DualPI2 you had plenty of opportunity, 4 years by now. I find it weird that suddenly you were not able to install a qdisc in Linux. Even if you wanted us to set up a testbed for you, you could have asked us.)
> 
> Maybe some good news too, we also had a (first time right) successful accurate ECN interop test between our Linux TCP-Prague and FreeBSD Reno (acc-ecn implementation provided by Richard Scheffenegger).
> 
> I hope these accusations of incompetence can stop now, and that we get to the point of finally getting a future-looking, low-latency Internet deployed.

	??? sorry to be so negative, but the "getting [...] deployed" part is out of our control. 

> Anybody else who doubts the performance/robustness of L4S, let me know and we can arrange a test session this week.

	Not that it counts for much, but I am not convinced that L4S will reach either its stated performance or its robustness goals under adversarial conditions and long RTTs. I am looking forward to the outcome of this week's testing (and hope my concerns will prove unfounded).


Best Regards
	Sebastian


> 
> Koen.
> 
> 
> -----Original Message-----
> From: Jonathan Morton <chromatix99@gmail.com> 
> Sent: Sunday, July 21, 2019 6:01 PM
> To: Bob Briscoe <in@bobbriscoe.net>
> Cc: Sebastian Moeller <moeller0@gmx.de>; De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>; Black, David <David.Black@dell.com>; ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org; Dave Taht <dave@taht.net>
> Subject: Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
> 
>> On 21 Jul, 2019, at 7:53 am, Bob Briscoe <in@bobbriscoe.net> wrote:
>> 
>> Both teams brought their testbeds, and as of yesterday evening, Koen and Pete Heist had put the two together and started the tests Jonathan proposed. Usual problems: latest Linux kernel being used has introduced a bug, so need to wind back. But progressing.
>> 
>> Nonetheless, altho it's included in the tests, I don't see the particular concern with this 'Cake' scenario. How can "L4S flows crowd out more reactive RFC3168 flows" in "an RFC3168-compliant FQ-AQM". Whenever it would be happening, FQ would prevent it.
>> 
>> To ensure we're not continually being blown into the weeds, I thought the /only/ concern was about RFC3168-compliant /single-queue/ AQMs.
> 
> I drew up a list of five network topologies to test, each with the SCE set of tests and tools, but using mostly L4S network components and focused on L4S performance and robustness.
> 
> 
> 1: L4S sender -> L4S middlebox (bottleneck) -> L4S receiver.
> 
> This is simply a sanity check to make sure the tools worked.  Actually we fell over even at this stage yesterday, because we discovered problems in the system Bob and Koen had brought along to demo.  These may or may not be improved today; we'll see.
> 
> 
> 2: L4S sender -> FQ-AQM middlebox (bottleneck) -> L4S middlebox -> L4S receiver.
> 
This is the most favourable-to-L4S topology that incorporates a non-L4S component that we could easily come up with.  Apparently the L4S folks are also relatively unfamiliar with Codel, which is now the most widely deployed AQM in the world, and this would help to validate that L4S transports respond reasonably to it.
> 
> 
> 3: L4S sender -> single-AQM middlebox (bottleneck) -> L4S middlebox -> L4S receiver.
> 
> This is the topology of most concern, and is obtained from topology 2 by simply changing a parameter on our middlebox.
> 
> 
> 4: L4S sender -> ECT(1) mangler -> L4S middlebox (bottleneck) -> L4S receiver.
> 
> Exploring what happens if an adversary tries to game the system.  We could also try an ECT(0) mangler or a Not-ECT mangler, in the same spirit.
> 
> 
> 5: L4S sender -> L4S middlebox (bottleneck 1) -> Dumb FIFO (bottleneck 2) -> FQ-AQM middlebox (bottleneck 3) -> L4S receiver.
> 
> This is Sebastian's scenario.  We did have some discussion yesterday about the propensity of existing senders to produce line-rate bursts occasionally, and the way these bursts could collect in *all* of the queues at successively decreasing bottlenecks.  This is a test which explores that scenario and measures its effects, and is highly relevant to best consumer practice on today's Internet.
> 
> 
> Naturally, we have tried the equivalent of most of the above scenarios on our SCE testbed already.  The only one we haven't explicitly tried out is #5; I think we'd need to use all of Pete's APUs plus at least one of my machines to set it up, and we were too tired for that last night.
> 
> - Jonathan Morton



* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-22 18:15                                               ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-22 18:33                                                 ` Dave Taht
@ 2019-07-22 19:48                                                 ` Pete Heist
  2019-07-25 16:14                                                   ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-23 10:33                                                 ` Sebastian Moeller
  2 siblings, 1 reply; 59+ messages in thread
From: Pete Heist @ 2019-07-22 19:48 UTC (permalink / raw)
  To: De Schepper, Koen (Nokia - BE/Antwerp)
  Cc: Jonathan Morton, Bob Briscoe, ecn-sane, Black, David, tsvwg, Dave Taht

[-- Attachment #1: Type: text/plain, Size: 1537 bytes --]


> On Jul 22, 2019, at 2:15 PM, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> 
> - related to the flent testing, you might have expected to find big differences, but both measurements showed exactly the same results. I understood you need to extend your tools to include more measurement parameters that were missing compared to ours.

On this point, this morning the ability to start multiple ping flows with different tos values for each was already added to flent (thanks to Toke), so that we can measure inter-flow latency separately for the classic and L4S queues. We added a few related plots to use this new feature.

Since 104, development and testing of SCE has been our focus, but work on testing and interop with L4S has begun. We have built the TCP Prague and sch_dualpi2 repos for use in our testbed. Some documentation on setup could be helpful: which kernels from which repos need to be deployed in which part of a dumbbell setup, and what configuration, if any, is needed (like new sysctls or sysctl values). We have added some documentation to the README of our repo (https://github.com/chromi/sce/ <https://github.com/chromi/sce/>).

To editorialize a bit, I think we’re both aware that testing congestion control can take time and care. I believe that together we can figure out how to improve congestion control for people that use the Internet, and the different ways that they use it. We’ll try to think about them first and foremost. :)


[-- Attachment #2: Type: text/html, Size: 2091 bytes --]


* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-22 18:15                                               ` De Schepper, Koen (Nokia - BE/Antwerp)
@ 2019-07-22 18:33                                                 ` Dave Taht
  2019-07-22 19:48                                                 ` Pete Heist
  2019-07-23 10:33                                                 ` Sebastian Moeller
  2 siblings, 0 replies; 59+ messages in thread
From: Dave Taht @ 2019-07-22 18:33 UTC (permalink / raw)
  To: De Schepper, Koen (Nokia - BE/Antwerp)
  Cc: Jonathan Morton, Bob Briscoe, ecn-sane, Black, David, tsvwg, Dave Taht

Koen:

To be utterly clear, the principal barrier to my evaluating dualpi at
any point was the patent. It still is - has the DCO issue been resolved?
But I did look at it and run it after all this blew up, and it's part
of my testbeds.


* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-21 16:00                                             ` Jonathan Morton
  2019-07-21 16:12                                               ` Sebastian Moeller
@ 2019-07-22 18:15                                               ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-22 18:33                                                 ` Dave Taht
                                                                   ` (2 more replies)
  1 sibling, 3 replies; 59+ messages in thread
From: De Schepper, Koen (Nokia - BE/Antwerp) @ 2019-07-22 18:15 UTC (permalink / raw)
  To: Jonathan Morton, Bob Briscoe
  Cc: Sebastian Moeller, Black, David, ecn-sane, tsvwg, Dave Taht

Jonathan,

I'm a bit surprised to read what I read here... I had the impression that we were on a much better level of understanding during the hackathon and that :

- we both agreed that the latest updates in the Linux kernels had quite some impact on DCTCP's performance (burstiness) that both you and we are working on. Our testbed also showed it had the same impact on DualPI2 and FQ-Codel (yes, we do understand FQ_Codel and did extensively compare DualQ with it since the beginning of L4S).
- the current TCP-Prague we have in the public GitHub, which is DCTCP using accurate ECN and ect(1) and is drop compliant with Reno, is what SCE can use as well, and whatever you called SCE-TCP can be used for L4S, as (what I showed you mathematically) it actually works exactly according to DCTCP's law of 1/p, because it is DCTCP with some simple pacing tweaks you did. I thought we agreed that there is no difference in the congestion control part, and we want the same thing, and the only difference is how to use the code-point.
- related to the testbed setups, we have several running, the first since 2013. We support all kernel versions since 3.19 up to the latest 5.2-rc5. We have demonstrated L4S since 2015 in IETF93 and the L4S BoF with real equipment and software that is still the same as we use today.
- the testbed I brought (5 laptops and a switch that got broken during travel and I had to replace in the nearest shop), I had to install during the hackathon from scratch from our public GitHub (I arrived only at 14:00 on Saturday) which we made immediately available for you guys to put the flent testing tools on.
- related to the flent testing, you might have expected to find big differences, but both measurements showed exactly the same results. I understood you need to extend your tools to include more measurement parameters that were missing compared to ours.
- we planned to complete your test list during this week and maybe best that we jointly report on the outcome of those to avoid different interpretations again.
- anybody who had interest in L4S could have evaluated it since we made our DUALPI2 code available in 2015 (actually many did). (To Dave Taht: if you wanted to evaluate DualPI2 you had plenty of opportunity, 4 years by now. I find it weird that suddenly you were not able to install a qdisc in Linux. Even if you wanted us to set up a testbed for you, you could have asked us.)
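One of the points above invokes DCTCP's "law of 1/p": for marking probability p, a DCTCP-style controller sustains a window proportional to 1/p, whereas Reno's scales as 1/sqrt(p). A back-of-the-envelope numeric sketch (constants simplified; see RFC 8257 and the DCTCP analysis papers for the exact forms):

```python
import math

def reno_window(p):
    """Classic Reno steady-state model: W ~ sqrt(3 / (2 p)) packets
    at loss/mark probability p."""
    return math.sqrt(3 / (2 * p))

def dctcp_window(p):
    """DCTCP steady state scales as ~ 2 / p (constant simplified)."""
    return 2 / p

# At a 1% mark probability the two laws differ by over an order of
# magnitude, which is why identical CE marking cannot treat both
# controller families fairly without separate queues or scaling.
w_reno = reno_window(0.01)    # ~12 packets
w_dctcp = dctcp_window(0.01)  # 200 packets
```

This is the quantitative core of the whole ECT(1) classification debate: the network must know which response law a flow follows before it can mark at the right rate.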

Maybe some good news too, we also had a (first time right) successful accurate ECN interop test between our Linux TCP-Prague and FreeBSD Reno (acc-ecn implementation provided by Richard Scheffenegger).

I hope these accusations of incompetence can stop now, and that we get to the point of finally getting a future-looking, low-latency Internet deployed. Anybody else who doubts the performance/robustness of L4S, let me know and we can arrange a test session this week.

Koen.


-----Original Message-----
From: Jonathan Morton <chromatix99@gmail.com> 
Sent: Sunday, July 21, 2019 6:01 PM
To: Bob Briscoe <in@bobbriscoe.net>
Cc: Sebastian Moeller <moeller0@gmx.de>; De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>; Black, David <David.Black@dell.com>; ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org; Dave Taht <dave@taht.net>
Subject: Re: [Ecn-sane] [tsvwg] Comments on L4S drafts

> On 21 Jul, 2019, at 7:53 am, Bob Briscoe <in@bobbriscoe.net> wrote:
> 
> Both teams brought their testbeds, and as of yesterday evening, Koen and Pete Heist had put the two together and started the tests Jonathan proposed. Usual problems: latest Linux kernel being used has introduced a bug, so need to wind back. But progressing.
> 
> Nonetheless, altho it's included in the tests, I don't see the particular concern with this 'Cake' scenario. How can "L4S flows crowd out more reactive RFC3168 flows" in "an RFC3168-compliant FQ-AQM". Whenever it would be happening, FQ would prevent it.
> 
> To ensure we're not continually being blown into the weeds, I thought the /only/ concern was about RFC3168-compliant /single-queue/ AQMs.

I drew up a list of five network topologies to test, each with the SCE set of tests and tools, but using mostly L4S network components and focused on L4S performance and robustness.


1: L4S sender -> L4S middlebox (bottleneck) -> L4S receiver.

This is simply a sanity check to make sure the tools worked.  Actually we fell over even at this stage yesterday, because we discovered problems in the system Bob and Koen had brought along to demo.  These may or may not be improved today; we'll see.


2: L4S sender -> FQ-AQM middlebox (bottleneck) -> L4S middlebox -> L4S receiver.

This is the most favourable-to-L4S topology that incorporates a non-L4S component that we could easily come up with.  Apparently the L4S folks are also relatively unfamiliar with Codel, which is now the most widely deployed AQM in the world, and this would help to validate that L4S transports respond reasonably to it.


3: L4S sender -> single-AQM middlebox (bottleneck) -> L4S middlebox -> L4S receiver.

This is the topology of most concern, and is obtained from topology 2 by simply changing a parameter on our middlebox.


4: L4S sender -> ECT(1) mangler -> L4S middlebox (bottleneck) -> L4S receiver.

Exploring what happens if an adversary tries to game the system.  We could also try an ECT(0) mangler or a Not-ECT mangler, in the same spirit.


5: L4S sender -> L4S middlebox (bottleneck 1) -> Dumb FIFO (bottleneck 2) -> FQ-AQM middlebox (bottleneck 3) -> L4S receiver.

This is Sebastian's scenario.  We did have some discussion yesterday about the propensity of existing senders to produce line-rate bursts occasionally, and the way these bursts could collect in *all* of the queues at successively decreasing bottlenecks.  This is a test which explores that scenario and measures its effects, and is highly relevant to best consumer practice on today's Internet.
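Scenario 5 can be reasoned about with a simple fluid model: a line-rate burst entering a chain of successively slower links leaves residue in every queue along the path, not only at the slowest hop. A toy calculation (rates and burst size are illustrative assumptions, not measurements):

```python
def queue_residue(burst_bytes, in_rate, out_rate):
    """Bytes still queued at a hop once a burst arriving at in_rate
    (bytes/s) has fully arrived, draining at out_rate (bytes/s).
    Simple fluid approximation: residue = burst * (1 - out/in)."""
    if in_rate <= out_rate:
        return 0.0
    arrival_time = burst_bytes / in_rate
    return burst_bytes - out_rate * arrival_time

# A 64 KB line-rate burst traversing 1 Gbit/s -> 100 Mbit/s -> 50 Mbit/s
# (converted to bytes/s):
burst = 64_000.0
residue_hop1 = queue_residue(burst, 1e9 / 8, 100e6 / 8)   # ~57.6 KB queued
residue_hop2 = queue_residue(burst, 100e6 / 8, 50e6 / 8)  # ~32 KB queued
# Each successively slower hop retains part of the burst, so *all* the
# queues along the path inflate at once, not just the final bottleneck.
```

Even in this crude model, a single sender burst leaves tens of kilobytes standing in two separate queues simultaneously, which is exactly the multi-bottleneck effect the test is designed to measure.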


Naturally, we have tried the equivalent of most of the above scenarios on our SCE testbed already.  The only one we haven't explicitly tried out is #5; I think we'd need to use all of Pete's APUs plus at least one of my machines to set it up, and we were too tired for that last night.
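As an aside, the single parameter change between topologies 2 and 3 could look like the following on a Linux-based middlebox (a hedged sketch only; the interface name and rate are placeholders, not the actual testbed configuration): swapping fq_codel for plain codel turns the FQ-AQM into a single-queue RFC3168 AQM at the same bottleneck.

```shell
# Topology 2: FQ-AQM middlebox -- per-flow queues, Codel AQM per flow
tc qdisc replace dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb rate 50mbit   # emulated bottleneck
tc qdisc add dev eth0 parent 1:10 fq_codel ecn

# Topology 3: single-queue AQM middlebox -- same bottleneck, one shared queue
tc qdisc del dev eth0 root
tc qdisc replace dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb rate 50mbit
tc qdisc add dev eth0 parent 1:10 codel ecn
```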

 - Jonathan Morton

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]  Comments on L4S drafts
  2019-07-19 15:37                                 ` Dave Taht
  2019-07-19 18:33                                   ` Wesley Eddy
@ 2019-07-22 16:28                                   ` Bless, Roland (TM)
  1 sibling, 0 replies; 59+ messages in thread
From: Bless, Roland (TM) @ 2019-07-22 16:28 UTC (permalink / raw)
  To: Dave Taht, De Schepper, Koen (Nokia - BE/Antwerp)
  Cc: ecn-sane, Sebastian Moeller, tsvwg

Hi Dave and all,

[sorry, I'm a bit behind all the recent discussion, however...]
I agree on several points here:
1) burning ECT(1) for L4S is less beneficial than using ECT(1)
   as a different kind of congestion signal as proposed in SCE
2) L4S could also use the very same signal, probably in
   addition to an L4S DSCP.
3) I don't think that we need to couple a particular SCE
   implementation to the ECT(1) usage.

Regards,
 Roland

Am 19.07.19 um 17:37 schrieb Dave Taht:
> "De Schepper, Koen (Nokia - BE/Antwerp)"
> <koen.de_schepper@nokia-bell-labs.com> writes:
> 
>> Hi Sebastian,
>>
>> To save people from reading through the long mail, I think the main point I want to make is:
>>  "Indeed, having common-Qs supported is one of my requirements. That's
> 
> It's the common-q with AQM **+ ECN** that's the sticking point. I'm
> perfectly satisfied with the behavior of every ietf approved single
> queued AQM without ecn enabled. Let's deploy more of those!
> 
>> why I want to keep the discussion on that level: is there consensus
>> that low latency is only needed for a per flow FQ system with an AQM
>> per flow?"
> 
> Your problem statement elides the ECN bit.
> 
> If there is any one point that I'd like to see resolved about L4S
> vs SCE, it's having a vote on the use of ECT(1) as an e2e
> identifier.
> 
> The poll I took in my communities (after trying really hard for years to
> get folk to take a look at the architecture without bias), ran about
> 98% against the L4S usage of ect(1), in the lwn article and in every
> private conversation since.
> 
> The SCE proposal, which uses this half a bit as an additional congestion
> signal supplied by the AQM, is vastly superior.
> 
> If we could somehow create a neutral poll in the general networking
> community outside the ietf (nanog, bsd, linux, dcs, bigcos, routercos,
> ISPs small and large) , and do it much like your classic "vote for a
> political measure" thing, with a single point/counterpoint section,
> maybe we'd get somewhere.
> 
>>
>> If there is this consensus, this means that we can use SCE and that
>> from now on, all network nodes have to implement per flow queuing with
>> an AQM per flow.
> 
> There is no "we" here, and this is not a binary set of choices.
> 
> In particular conflating "low latency" really confounds the subject
> matter, and has for years. FQ gives "low latency" for the vast
> majority of flows running below their fair share. L4S promises "low
> latency" for a rigidly defined set of congestion controls in a
> specialized queue, and otherwise tosses all flows into a higher latency
> queue when one flow is greedy.
> 
> The "ultra low queuing latency *for all*" marketing claptrap that L4S
> had at one point really stuck in my craw.
> 
> 0) There is a "we" that likes L4S in all its complexity and missing
> integrated running code that demands total ECN deployment on one
> physical medium (so far), a change to the definition of ECN itself, and
> uses up ect(1) e2e instead of a dscp.
> 
> 1) There is a "we" that has a highly deployed fq+aqm that happens to
> have an ECN response, that is providing some of the lowest latencies
> ever seen, live on the internet, across multiple physical mediums.
> 
> With a backward compatible proposal to do better, that uses up ect(1) as
> an additional congestion notifier by the AQM.
> 
> 2) There is a VERY large (silent) majority that wants nothing to do with
> ECN at all and long ago fled the ietf, and works on things like RTT and
> other metrics that don't need anything extra at the IP layer.
> 
> 3) There is a vastly larger majority that has never even heard of AQM,
> much less ECN, and doesn't care.
> 
>> If there is no consensus, we cannot use SCE and need to use L4S.
> 
> No.
> 
> If there is no consensus, we just keep motoring on with the existing
> pie (with drop) deployments, and fq_codel/fq_pie/sch_cake more or less
> as is... and continued refinement of transports and more research.
> 
> We've got a few billion devices that could use just what we got to get
> orders of magnitude improvements in network delay.
> 
> And:
> 
> If there is consensus on fq+aqm+sce - ECN remains *optional*
> which is an outcome I massively support, also.
> 
> So repeating this:
> 
>> If there is this consensus, this means that we can use SCE and that
>> from now on, all network nodes have to implement per flow queuing with
>> an AQM per flow.
> 
> It's not a binary choice as you lay it out.
> 
> 1) Just getting FIFO queue sizes down to something reasonable - would be
> GREAT. It still blows my mind that CMTSes still have 700ms of buffering at
> 100Mbit, 8 years into this debate.
> 
> 2) only the network nodes most regularly experiencing human visible
> congestive events truly need any form of AQM or FQ. In terms of what I
> observe, thats:
> 
> ISP uplinks
> Wifi (at ISP downlink speeds > 40Mbit)
> 3G/4G/5G
> ISP downlinks
> Other in-home devices like ethernet over powerline
> 
> I'm sure others in the DC and interconnects see things differently.
> 
> I know I'm weird, but I'd like to eliminate congestion *humans* see,
> rather than what skynet sees. Am I the only one that thinks this way?
> 
> 3) we currently have a choice between multiple single queue, *non ECN*
> enabled aqms that DO indeed work - pretty well - without any ECN support
> enabled - pie, red, dualpi without using the ect identifier, cake
> (cobalt). We never got around to making codel work better on a single
> queue because we didn't see the point, but what's in cobalt could go
> there if anyone cares.
> 
> We have a couple very successful fq+aqm combinations, *also*, that
> happen to have an RFC3168 ECN response.
> 
> 4) as for ECN enabled AQMs - single queued, dual q'd, or FQ'd, there's
> plenty of problems remaining with all of them and their transports, that
> make me very dubious about internet-wide deployment. Period. No matter
> what happens here, I am going to keep discouraging the linux distros as
> a whole to turn it on without first addressing the long list of items in
> the ecn-sane design group's work list.
> 
> ....
> 
> So to me, it goes back to slamming the door shut, or not, on L4S's usage
> of ect(1) as a too easily gamed e2e identifier. As I don't think it and
> all the dependent code and algorithms can possibly scale past a single
> physical layer tech, I'd like to see it move to a DSCP codepoint, worst
> case... and certainly remain "experimental" in scope until anyone
> independent can attempt to evaluate it. 
> 
> second door I'd like to slam shut is redefining CE to be a weaker signal
> of congestion as L4S does. I'm willing to write a whole bunch of
> standards track RFCs obsoleting the experimental RFCs allowing this, if
> that's what it takes. Bufferbloat is still a huge problem! Can we keep
> working on fixing that?
> 
> third door I'd like to see open is the possibilities behind SCE.
> 
> Lastly:
> 
> I'd really like all the tcp-go-fast-at-any-cost people to take a year off to
> dogfood their designs, and go live somewhere with a congested network to
> deal with daily, like a railway or airport, or on 3G network on a
> sailboat or beach somewhere. It's not a bad life... REALLY.
> 
> In fact, it's WAY cheaper than attending 3 ietf conferences a year.
> 
> Enjoy Montreal!
> 
> Sincerely,
> 
> Dave Taht
> From my sailboat in Alameda
> 



* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-21 19:14                                           ` Bob Briscoe
@ 2019-07-21 20:48                                             ` Sebastian Moeller
  2019-07-25 20:51                                               ` Bob Briscoe
  0 siblings, 1 reply; 59+ messages in thread
From: Sebastian Moeller @ 2019-07-21 20:48 UTC (permalink / raw)
  To: Bob Briscoe
  Cc: tsvwg, Black, David, ecn-sane, Dave Taht, De Schepper,
	Koen (Nokia - BE/Antwerp)

Dear Bob, 

> On Jul 21, 2019, at 21:14, Bob Briscoe <ietf@bobbriscoe.net> wrote:
> 
> Sebastian,
> 
> On 21/07/2019 17:08, Sebastian Moeller wrote:
>> Hi Bob,
>> 
>> 
>> 
>>> On Jul 21, 2019, at 14:30, Bob Briscoe <ietf@bobbriscoe.net>
>>>  wrote:
>>> 
>>> David,
>>> 
>>> On 19/07/2019 21:06, Black, David wrote:
>>> 
>>>> Two comments as an individual, not as a WG chair:
>>>> 
>>>> 
>>>>> Mostly, they're things that an end-host algorithm needs
>>>>> to do in order to behave nicely, that might be good things anyways
>>>>> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
>>>>> work well w/ small RTT, be robust to reordering).  I am curious which
>>>>> ones you think are too rigid ... maybe they can be loosened?
>>>>> 
>>>> [1] I have profoundly objected to L4S's RACK-like requirement (use time to detect loss, and in particular do not use 3DupACK) in public on multiple occasions, because in reliable transport space, that forces use of TCP Prague, a protocol with which we have little to no deployment or operational experience.  Moreover, that requirement raises the bar for other protocols in a fashion that impacts endpoint firmware, and possibly hardware in some important (IMHO) environments where investing in those changes delivers little to no benefit.  The environments that I have in mind include a lot of data centers.  Process wise, I'm ok with addressing this objection via some sort of "controlled environment" escape clause text that makes this RACK-like requirement inapplicable in a "controlled environment" that does not need that behavior (e.g., where 3DupACK does not cause problems and is not expected to cause problems).
>>>> 
>>>> For clarity, I understand the multi-lane link design rationale behind the RACK-like requirement and would agree with that requirement in a perfect world ... BUT ... this world is not perfect ... e.g., 3DupACK will not vanish from "running code" anytime soon.
>>>> 
>>> As you know, we have been at pains to address every concern about L4S that has come up over the years, and I thought we had addressed this one to your satisfaction.
>>> 
>>> The reliable transports you are concerned about require ordered delivery by the underlying fabric, so they can only ever exist in a controlled environment. In such a controlled environment, your ECT1+DSCP idea (below) could be used to isolate the L4S experiment from these transports and their firmware/hardware constraints.
>>> 
>>> On the public Internet, the DSCP commonly gets wiped at the first hop. So requiring a DSCP as well as ECT1 to separate off L4S would serve no useful purpose: it would still lead to ECT1 packets without the DSCP sent from scalable congestion controls (which is behind Jonathan's concern in response to you).
>>> 
>> 	And this is why IPv4's protocol field / IPv6's next header field is the classifier you actually need... You are changing a significant portion of TCP's observable behavior, so it can be argued that TCP Prague is TCP in name only. This "classifier" still lives in the IP header, so no deeper layers need to be accessed; it is non-leaky, in that the classifier is unambiguously present independent of the value of the ECN bits; and it is also compatible with SCE-style ECN signaling. Since I believe the most (or only) likely roll-out of L4S is going to be at the ISPs' access nodes (BRAS/BNG/CMTS/whatever), middleboxes should not be an insurmountable problem, as ISPs control their own middleboxes and often even the CPEs, so protocol ossification is not going to be a showstopper for this part of the roll-out.
>> 
>> Best Regards
>> 	Sebastian
>> 
>> 
> I think you've understood this from reading an abbreviated description of the requirement on the list, rather than the spec. The spec solely says:
> 	A scalable congestion control MUST detect loss by counting in time-based units
> That's all. No more, no less. 
> 
> People call this the "RACK requirement", purely because the idea came from RACK. There is no requirement to do RACK, and the requirement applies to all transports, not just TCP.

	Fair enough, but my argument was not really about RACK at all; it applies more to the linear response to CE marks that ECT(1) promises in the L4S approach. You are making changes to TCP's congestion controller that make it cease to be "TCP-friendly" (for arguably good reasons). So why insist on pretending that this is still TCP? Give it a new protocol ID already and all your classification needs are solved. As a bonus, you would not need to use the same signal (CE) to elicit two different responses, but could use the regained ECT(1) code point similarly to SCE to carry the new fine-grained congestion signal, while using CE in the RFC3168-compliant sense.
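The distinction being drawn here can be made concrete. A classic RFC 3168 sender halves its window on any CE mark in a round trip, while a scalable (DCTCP/Prague-style) sender reduces in proportion to the fraction of CE-marked packets. A minimal sketch, greatly simplified (real implementations track alpha with a per-RTT EWMA as in RFC 8257):

```python
def classic_response(cwnd: float, ce_marked: bool) -> float:
    """RFC 3168 behaviour: any CE mark in an RTT => halve the window."""
    return cwnd / 2 if ce_marked else cwnd

def scalable_response(cwnd: float, alpha: float) -> float:
    """DCTCP-style behaviour: reduce in proportion to the EWMA fraction
    of CE-marked packets (alpha in [0, 1])."""
    return cwnd * (1 - alpha / 2)

# A light marking rate barely dents a scalable flow, but halves a classic one:
print(classic_response(100.0, True))         # 50.0
print(scalable_response(100.0, 0.0625))      # 96.875
```

This is why feeding L4S-style CE marks to a classic flow (or vice versa) produces a rate imbalance: the same signal elicits two very different responses.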


> 
> It then means that a packet with ECT1 in the IP field can be forwarded without resequencing (no requirement; it just /can/ be).

	Packets always "can" be forwarded without resequencing; the question is whether the end-points are going to like that... 
And IMHO even RACK, with its at most one-RTT reordering window, gives intermediate hops not much to work with. Without knowing the full RTT, a cautious hop might allow itself one retransmission slot (i.e. its own contribution to the RTT), but as far as I can tell they do that already. And tracking the RTT will require keeping per-flow statistics, which also seems like it can get computationally expensive quickly... (I probably misunderstand how RACK works, but I fail to see how it will really allow more reordering; that is also orthogonal to the L4S issues I am trying to raise.)
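For reference, the time-based detection under discussion works roughly as follows: a packet is declared lost once a packet sent after it has been delivered and a small reordering window has elapsed, instead of waiting for three duplicate ACKs. A toy sketch (the min_rtt/4 window follows the RACK draft's default; this is not actual kernel code):

```python
def rack_lost(send_time: float, newest_delivered_send_time: float,
              now: float, min_rtt: float) -> bool:
    """A packet is deemed lost if a packet sent later was already delivered
    and the reordering window (min_rtt/4 per the RACK draft) has expired."""
    reo_wnd = min_rtt / 4
    return (newest_delivered_send_time > send_time
            and now >= send_time + min_rtt + reo_wnd)

# With min_rtt = 100 ms, a packet sent at t=0 is not yet lost at t=0.12 s ...
print(rack_lost(0.0, 0.01, 0.12, 0.1))   # False
# ... but is at t=0.13 s, once min_rtt + reo_wnd (125 ms) has passed.
print(rack_lost(0.0, 0.01, 0.13, 0.1))   # True
```

The reordering tolerance is thus bounded by elapsed time rather than by a packet count, which is what lets a sender survive mild resequencing without spurious retransmits.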

> This is a network layer 'unordered delivery' property, so it's appropriate to flag at the IP layer. 

	But at that point you are multiplexing multiple things into the poor ECT(1) codepoint: the promise of a certain "linear" back-off behavior on encountered congestion AND an "allow relaxed ordering" property ("detect loss by counting in time-based units" does not seem to be fully equivalent to a generic tolerance of 'unordered delivery', as far as I understand). That seems to be asking too much of a simple number...

Best Regards
	Sebastian

> 
> 
> 
> 
> Bob
> 
> 
> 
> -- 
> ________________________________________________________________
> Bob Briscoe                               
> http://bobbriscoe.net/



* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-21 16:08                                         ` Sebastian Moeller
@ 2019-07-21 19:14                                           ` Bob Briscoe
  2019-07-21 20:48                                             ` Sebastian Moeller
  0 siblings, 1 reply; 59+ messages in thread
From: Bob Briscoe @ 2019-07-21 19:14 UTC (permalink / raw)
  To: Sebastian Moeller
  Cc: tsvwg, Black, David, ecn-sane, Dave Taht, De Schepper,
	Koen (Nokia - BE/Antwerp)


Sebastian,

On 21/07/2019 17:08, Sebastian Moeller wrote:
> Hi Bob,
>
>
>> On Jul 21, 2019, at 14:30, Bob Briscoe <ietf@bobbriscoe.net> wrote:
>>
>> David,
>>
>> On 19/07/2019 21:06, Black, David wrote:
>>> Two comments as an individual, not as a WG chair:
>>>
>>>> Mostly, they're things that an end-host algorithm needs
>>>> to do in order to behave nicely, that might be good things anyways
>>>> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
>>>> work well w/ small RTT, be robust to reordering).  I am curious which
>>>> ones you think are too rigid ... maybe they can be loosened?
>>> [1] I have profoundly objected to L4S's RACK-like requirement (use time to detect loss, and in particular do not use 3DupACK) in public on multiple occasions, because in reliable transport space, that forces use of TCP Prague, a protocol with which we have little to no deployment or operational experience.  Moreover, that requirement raises the bar for other protocols in a fashion that impacts endpoint firmware, and possibly hardware in some important (IMHO) environments where investing in those changes delivers little to no benefit.  The environments that I have in mind include a lot of data centers.  Process wise, I'm ok with addressing this objection via some sort of "controlled environment" escape clause text that makes this RACK-like requirement inapplicable in a "controlled environment" that does not need that behavior (e.g., where 3DupACK does not cause problems and is not expected to cause problems).
>>>
>>> For clarity, I understand the multi-lane link design rationale behind the RACK-like requirement and would agree with that requirement in a perfect world ... BUT ... this world is not perfect ... e.g., 3DupACK will not vanish from "running code" anytime soon.
>> As you know, we have been at pains to address every concern about L4S that has come up over the years, and I thought we had addressed this one to your satisfaction.
>>
>> The reliable transports you are concerned about require ordered delivery by the underlying fabric, so they can only ever exist in a controlled environment. In such a controlled environment, your ECT1+DSCP idea (below) could be used to isolate the L4S experiment from these transports and their firmware/hardware constraints.
>>
>> On the public Internet, the DSCP commonly gets wiped at the first hop. So requiring a DSCP as well as ECT1 to separate off L4S would serve no useful purpose: it would still lead to ECT1 packets without the DSCP sent from scalable congestion controls (which is behind Jonathan's concern in response to you).
> 	And this is why IPv4's protocol field / IPv6's next header field is the classifier you actually need... You are changing a significant portion of TCP's observable behavior, so it can be argued that TCP Prague is TCP in name only. This "classifier" still lives in the IP header, so no deeper layers need to be accessed; it is non-leaky, in that the classifier is unambiguously present independent of the value of the ECN bits; and it is also compatible with SCE-style ECN signaling. Since I believe the most (or only) likely roll-out of L4S is going to be at the ISPs' access nodes (BRAS/BNG/CMTS/whatever), middleboxes should not be an insurmountable problem, as ISPs control their own middleboxes and often even the CPEs, so protocol ossification is not going to be a showstopper for this part of the roll-out.
>
> Best Regards
> 	Sebastian
>
I think you've understood this from reading an abbreviated description of 
the requirement on the list, rather than the spec. The spec solely says:

	A scalable congestion control MUST detect loss by counting in time-based units

That's all. No more, no less.

People call this the "RACK requirement", purely because the idea came 
from RACK. There is no requirement to do RACK, and the requirement 
applies to all transports, not just TCP.

It then means that a packet with ECT1 in the IP field can be forwarded 
without resequencing (no requirement; it just /can/ be). This is a 
network layer 'unordered delivery' property, so it's appropriate to flag 
at the IP layer.




Bob



-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/




* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
       [not found]                                         ` <5D34803D.50501@erg.abdn.ac.uk>
@ 2019-07-21 16:43                                           ` Black, David
  0 siblings, 0 replies; 59+ messages in thread
From: Black, David @ 2019-07-21 16:43 UTC (permalink / raw)
  To: gorry, Bob Briscoe; +Cc: ecn-sane, tsvwg

Bob,

Pulling relevant text to the top ...

> > As you know, we have been at pains to address every concern about L4S
> > that has come up over the years, and I thought we had addressed this
> > one to your satisfaction.

Truth be told, "acquiescence" would be a more accurate word than "satisfaction."  I can live with the current plans, but I would not describe myself as satisfied with them.

> > The reliable transports you are concerned about require ordered
> > delivery by the underlying fabric, so they can only ever exist in a
> > controlled environment. In such a controlled environment, your
> > ECT1+DSCP idea (below) could be used to isolate the L4S experiment
> > from these transports and their firmware/hardware constraints.

There appears to be a lack of understanding here.  The protocols in question, RoCEv2 in particular, have some reordering tolerance, but not as good as TCP's.  Current requirements for ordered delivery are in the same general area as TCP with 3DupACK, which is not constrained to controlled environments.

> > On the public Internet, the DSCP commonly gets wiped at the first hop.
> > So requiring a DSCP as well as ECT1 to separate off L4S would serve no
> > useful purpose: it would still lead to ECT1 packets without the DSCP
> > sent from scalable congestion controls (which is behind Jonathan's
> > concern in response to you).

We're on the same page here, as I also wrote the following (although a stronger word than "subtleties" would have been better in 20/20 hindsight):

> >> traffic (there are some subtleties here, e.g., interaction
> >> with operator bleaching of DSCPs to zero at network boundaries).

On to the two requests.

> > Please confirm:
> > a) that your RACK concern only applies in controlled environments, and
> > ECT1+DSCP resolves it

No, twice.  I hope that’s clearer now from what Gorry, Michael, and I have posted.

As stated in the past, and moreover in this email thread, I can accept some sort of "controlled environment" text as a compromise means of moving the experiment forward:

> >> Process wise, I'm ok with addressing this objection via some sort of
> >> "controlled environment" escape clause text that makes this RACK-like
> >> requirement inapplicable in a "controlled environment" that does not
> >> need that behavior (e.g., where 3DupACK does not cause problems and
> >> is not expected to cause problems).

Moving on to the next topic:

> > b) on the public Internet, we currently have one issue to address:
> > single-queue RFC3168 AQMs,
> > and if we can resolve that, ECT1 alone would be acceptable as an L4S
> > identifier.

In addition to a), there is now the desire of SCE to use ECT(1) at similar scope.

Thanks, --David
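
For concreteness, the "Acceptable" classification policy discussed in this thread (L4S queue admission gated on both ECT(1) and an operator-configured DSCP list) is a simple two-field check. A hypothetical sketch, where the function name and example DSCP value are illustrative and not drawn from any draft:

```python
ECT1 = 0b01  # ECN field value for ECT(1)

def select_queue(ecn: int, dscp: int, l4s_dscps: set) -> str:
    """Operator policy: ECT(1) alone is not enough; the packet's DSCP must
    also be on the configured DSCPs-for-L4S list to reach the L4S queue."""
    if ecn == ECT1 and dscp in l4s_dscps:
        return "l4s"
    return "classic"

allow = {42}                            # hypothetical operator list
print(select_queue(ECT1, 42, allow))    # l4s
print(select_queue(ECT1, 0, allow))     # classic: DSCP bleached to zero
```

Note how DSCP bleaching at a network boundary silently demotes such traffic to the classic queue, which is the subtlety flagged above.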

> -----Original Message-----
> From: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
> Sent: Sunday, July 21, 2019 11:10 AM
> To: Bob Briscoe
> Cc: Black, David; Wesley Eddy; Dave Taht; De Schepper, Koen (Nokia -
> BE/Antwerp); ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
> Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts
> 
> 
> [EXTERNAL EMAIL]
> 
> I'd like to add to this what I understand as an individual ... see inline.
> 
> On 21/07/2019, 08:30, Bob Briscoe wrote:
> > David,
> >
> > On 19/07/2019 21:06, Black, David wrote:
> >> Two comments as an individual, not as a WG chair:
> >>
> >>> Mostly, they're things that an end-host algorithm needs
> >>> to do in order to behave nicely, that might be good things anyways
> >>> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
> >>> work well w/ small RTT, be robust to reordering).  I am curious which
> >>> ones you think are too rigid ... maybe they can be loosened?
> >> [1] I have profoundly objected to L4S's RACK-like requirement (use
> >> time to detect loss, and in particular do not use 3DupACK) in public
> >> on multiple occasions, because in reliable transport space, that
> >> forces use of TCP Prague, a protocol with which we have little to no
> >> deployment or operational experience.  Moreover, that requirement
> >> raises the bar for other protocols in a fashion that impacts endpoint
> >> firmware, and possibly hardware in some important (IMHO) environments
> >> where investing in those changes delivers little to no benefit.  The
> >> environments that I have in mind include a lot of data centers.
> >> Process wise, I'm ok with addressing this objection via some sort of
> >> "controlled environment" escape clause text that makes this RACK-like
> >> requirement inapplicable in a "controlled environment" that does not
> >> need that behavior (e.g., where 3DupACK does not cause problems and
> >> is not expected to cause problems).
> >>
> >> For clarity, I understand the multi-lane link design rationale behind
> >> the RACK-like requirement and would agree with that requirement in a
> >> perfect world ... BUT ... this world is not perfect ... e.g., 3DupACK
> >> will not vanish from "running code" anytime soon.
> > As you know, we have been at pains to address every concern about L4S
> > that has come up over the years, and I thought we had addressed this
> > one to your satisfaction.
> >
> > The reliable transports you are concerned about require ordered
> > delivery by the underlying fabric, so they can only ever exist in a
> > controlled environment. In such a controlled environment, your
> > ECT1+DSCP idea (below) could be used to isolate the L4S experiment
> > from these transports and their firmware/hardware constraints.
> >
> > On the public Internet, the DSCP commonly gets wiped at the first hop.
> > So requiring a DSCP as well as ECT1 to separate off L4S would serve no
> > useful purpose: it would still lead to ECT1 packets without the DSCP
> > sent from scalable congestion controls (which is behind Jonathan's
> > concern in response to you).
> >
> >
> It would always be possible to have taken an approach that required a
> DSCP to use the "alternative ECN semantic". This option was debated
> when L4S was first discussed. The WG draft decided against that
> approach, and instead chose to use an ECT(1) codepoint. That I recall
> was analysed in depth.
> 
> This does not preclude someone from classifying on a DSCP (such as the
> suggested NQB) to also choose which ECN treatment to use (should that be
> useful for some reason, e.g. because the traffic is low rate). To me, at
> least, it is important to allow traffic with DSCP markings to utilise the
> AQM ECN treatments.
> 
> >>>> So to me, it goes back to slamming the door shut, or not, on L4S's
> >>>> usage
> >>>> of ect(1) as a too easily gamed e2e identifier. As I don't think it
> >>>> and
> >>>> all the dependent code and algorithms can possibly scale past a single
> >>>> physical layer tech, I'd like to see it move to a DSCP codepoint,
> >>>> worst
> >>>> case... and certainly remain "experimental" in scope until anyone
> >>>> independent can attempt to evaluate it.
> >>> That seems good to discuss in regard to the L4S ID draft.  There is a
> >>> section (5.2) there already discussing DSCP, and why it alone isn't
> >>> feasible.  There's also more detailed description of the relation and
> >>> interworking in
> >>> https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-diffserv-02
> >> [2] We probably should pay more attention to that draft.  One of the
> >> things that I think is important in that draft is a requirement that
> >> operators can enable/disable L4S behavior of ECT(1) on a per-DSCP
> >> basis - the rationale for that functionality starts with incremental
> >> deployment.   This technique may also have the potential to provide a
> >> means for L4S and SCE to coexist via use of different DSCPs for L4S
> >> vs. SCE traffic (there are some subtleties here, e.g., interaction
> >> with operator bleaching of DSCPs to zero at network boundaries).
> >>
> >> To be clear on what I have in mind:
> >>     o Unacceptable: All traffic marked with ECT(1) goes into the L4S
> >> queue, independent of what DSCP it is marked with.
> That is what has been described in the WG drafts since they entered
> TSVWG. I don't recall any suggested change to that decision until just now.
> >>     o Acceptable:  There's an operator-configurable list of DSCPs
> >> that support an L4S service - traffic marked with ECT(1) goes into
> >> the L4S queue if and only if that traffic is also marked with a DSCP
> >> that is on the operator's DSCPs-for-L4S list.
> That was always possible under the "alternative ECN markings", but I
> understood the purpose was to facilitate an Internet experiment.
> > Please confirm:
> > a) that your RACK concern only applies in controlled environments, and
> > ECT1+DSCP resolves it
> That seems more than obviously needed to me. There is a lot of traffic
> that uses some notion of timeliness for retransmission. Designing such a
> transport to be robust is tricky, but we're already exploring that for
> TCP and QUIC.
> 
> On the other hand, I have many times urged caution in creating
> assumptions that it would be OK for Internet paths to somehow now allow
> more reordering. I'd like to see that happen, but I don't think this
> recommendation is appropriate.
> > b) on the public Internet, we currently have one issue to address:
> > single-queue RFC3168 AQMs,
> > and if we can resolve that, ECT1 alone would be acceptable as an L4S
> > identifier.
> >
> > I am trying to focus the issues list, which I would hope you would
> > support, even without your chair hat on.
> >
> >
> >
> > Bob
> >
> Gorry
> >>
> >> Reminder: This entire message is posted as an individual, not as a WG
> >> chair.
> >>
> >> Thanks, --David
> >>
> >>> -----Original Message-----
> >>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Wesley Eddy
> >>> Sent: Friday, July 19, 2019 2:34 PM
> >>> To: Dave Taht; De Schepper, Koen (Nokia - BE/Antwerp)
> >>> Cc: ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
> >>> Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts
> >>>
> >>>
> >>> [EXTERNAL EMAIL]
> >>>
> >>> On 7/19/2019 11:37 AM, Dave Taht wrote:
> >>>> It's the common-q with AQM **+ ECN** that's the sticking point. I'm
> >>>> perfectly satisfied with the behavior of every ietf approved single
> >>>> queued AQM without ecn enabled. Let's deploy more of those!
> >>> Hi Dave, I'm just trying to make sure I'm reading into your message
> >>> correctly ... if I'm understanding it, then you're not in favor of
> >>> either SCE or L4S at all?  With small queues and without ECN, loss
> >>> becomes the only congestion signal, which is not desirable, IMHO, or am
> >>> I totally misunderstanding something?
> >>>
> >>>
> >>>> If we could somehow create a neutral poll in the general networking
> >>>> community outside the ietf (nanog, bsd, linux, dcs, bigcos, routercos,
> >>>> ISPs small and large) , and do it much like your classic "vote for a
> >>>> political measure" thing, with a single point/counterpoint section,
> >>>> maybe we'd get somewhere.
> >>> While I agree that would be really useful, it's kind of an "I want a
> >>> pony" statement.  As a TSVWG chair where we're doing this work, we've
> >>> been getting inputs from people that have a foot in many of the
> >>> communities you mention, but always looking for more.
> >>>
> >>>
> >>>> In particular conflating "low latency" really confounds the subject
> >>>> matter, and has for years. FQ gives "low latency" for the vast
> >>>> majority of flows running below their fair share. L4S promises "low
> >>>> latency" for a rigidly defined set of congestion controls in a
> >>>> specialized queue, and otherwise tosses all flows into a higher
> >>>> latency
> >>>> queue when one flow is greedy.
> >>> I don't think this is a correct statement.  Packets have to be from a
> >>> "scalable congestion control" to get access to the L4S queue.  There
> >>> are
> >>> some draft requirements for using the L4S ID, but they seem pretty
> >>> flexible to me.  Mostly, they're things that an end-host algorithm
> >>> needs
> >>> to do in order to behave nicely, that might be good things anyways
> >>> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
> >>> work well w/ small RTT, be robust to reordering).  I am curious which
> >>> ones you think are too rigid ... maybe they can be loosened?
> >>>
> >>> Also, I don't think the "tosses all flows into a higher latency queue
> >>> when one flow is greedy" characterization is correct.  The other queue
> >>> is for classic/non-scalable traffic, and not necessarily higher latency
> >>> for a given flow, nor is winding up there related to whether another
> >>> flow is greedy.
> >>>
> >>>
> >>>> So to me, it goes back to slamming the door shut, or not, on L4S's
> >>>> usage
> >>>> of ect(1) as a too easily gamed e2e identifier. As I don't think it
> >>>> and
> >>>> all the dependent code and algorithms can possibly scale past a single
> >>>> physical layer tech, I'd like to see it move to a DSCP codepoint,
> >>>> worst
> >>>> case... and certainly remain "experimental" in scope until anyone
> >>>> independent can attempt to evaluate it.
> >>> That seems good to discuss in regard to the L4S ID draft.  There is a
> >>> section (5.2) there already discussing DSCP, and why it alone isn't
> >>> feasible.  There's also more detailed description of the relation and
> >>> interworking in
> >>> https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-diffserv-02
> >>>
> >>>
> >>>> I'd really like all the tcp-go-fast-at-any-cost people to take a year
> >>>> off to dogfood their designs, and go live somewhere with a congested
> >>>> network to deal with daily, like a railway or airport, or on a 3G
> >>>> network on a sailboat or beach somewhere. It's not a bad life... REALLY.
> >>>>
> >>> Fortunately, at least in the IETF, I don't think there have been
> >>> initiatives in the direction of going fast at any cost in recent
> >>> history, and they would be unlikely to be well accepted if there were!
> >>> That is at least one place that there seems to be strong consensus.
> >>>
> >> _______________________________________________
> >> Ecn-sane mailing list
> >> Ecn-sane@lists.bufferbloat.net
> >> https://lists.bufferbloat.net/listinfo/ecn-sane
> >


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-21 16:00                                             ` Jonathan Morton
@ 2019-07-21 16:12                                               ` Sebastian Moeller
  2019-07-22 18:15                                               ` De Schepper, Koen (Nokia - BE/Antwerp)
  1 sibling, 0 replies; 59+ messages in thread
From: Sebastian Moeller @ 2019-07-21 16:12 UTC (permalink / raw)
  To: Jonathan Morton
  Cc: Bob Briscoe, De Schepper, Koen (Nokia - BE/Antwerp),
	Black, David, ecn-sane, tsvwg, Dave Taht

Dear Jonathan,

many thanks, these are exactly the tests I am curious about. Excellent work, now I am super curious about the results!



> On Jul 21, 2019, at 18:00, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>> On 21 Jul, 2019, at 7:53 am, Bob Briscoe <in@bobbriscoe.net> wrote:
>> 
>> Both teams brought their testbeds, and as of yesterday evening, Koen and Pete Heist had put the two together and started the tests Jonathan proposed. Usual problems: latest Linux kernel being used has introduced a bug, so need to wind back. But progressing.
>> 
>> Nonetheless, although it's included in the tests, I don't see the particular concern with this 'Cake' scenario. How can "L4S flows crowd out more reactive RFC3168 flows" in "an RFC3168-compliant FQ-AQM"? Whenever it would be happening, FQ would prevent it.
>> 
>> To ensure we're not continually being blown into the weeds, I thought the /only/ concern was about RFC3168-compliant /single-queue/ AQMs.
> 
> I drew up a list of five network topologies to test, each with the SCE set of tests and tools, but using mostly L4S network components and focused on L4S performance and robustness.
> 
> 
> 1: L4S sender -> L4S middlebox (bottleneck) -> L4S receiver.
> 
> This is simply a sanity check to make sure the tools worked.  Actually we fell over even at this stage yesterday, because we discovered problems in the system Bob and Koen had brought along to demo.  These may or may not be improved today; we'll see.
> 
> 
> 2: L4S sender -> FQ-AQM middlebox (bottleneck) -> L4S middlebox -> L4S receiver.
> 
> This is the most favourable-to-L4S topology that incorporates a non-L4S component that we could easily come up with.  Apparently the L4S folks are also relatively unfamiliar with Codel, which is now the most widely deployed AQM in the world, and this would help to validate that L4S transports respond reasonably to it.
> 
> 
> 3: L4S sender -> single-AQM middlebox (bottleneck) -> L4S middlebox -> L4S receiver.
> 
> This is the topology of most concern, and is obtained from topology 2 by simply changing a parameter on our middlebox.
> 
> 
> 4: L4S sender -> ECT(1) mangler -> L4S middlebox (bottleneck) -> L4S receiver.
> 
> Exploring what happens if an adversary tries to game the system.  We could also try an ECT(0) mangler or a Not-ECT mangler, in the same spirit.
> 
> 
> 5: L4S sender -> L4S middlebox (bottleneck 1) -> Dumb FIFO (bottleneck 2) -> FQ-AQM middlebox (bottleneck 3) -> L4S receiver.
> 
> This is Sebastian's scenario.  We did have some discussion yesterday about the propensity of existing senders to produce line-rate bursts occasionally, and the way these bursts could collect in *all* of the queues at successively decreasing bottlenecks.  This is a test which explores that scenario and measures its effects, and is highly relevant to best consumer practice on today's Internet.

	double plus!



> 
> 
> Naturally, we have tried the equivalent of most of the above scenarios on our SCE testbed already.  The only one we haven't explicitly tried out is #5; I think we'd need to use all of Pete's APUs plus at least one of my machines to set it up, and we were too tired for that last night.

	Thanks for doing this!

Best Regards
	Sebastian


> 
> - Jonathan Morton


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-21 12:30                                       ` Bob Briscoe
@ 2019-07-21 16:08                                         ` Sebastian Moeller
  2019-07-21 19:14                                           ` Bob Briscoe
       [not found]                                         ` <5D34803D.50501@erg.abdn.ac.uk>
  1 sibling, 1 reply; 59+ messages in thread
From: Sebastian Moeller @ 2019-07-21 16:08 UTC (permalink / raw)
  To: Bob Briscoe
  Cc: Black, David, Wesley Eddy, Dave Taht, De Schepper,
	Koen (Nokia - BE/Antwerp),
	ecn-sane, tsvwg

Hi Bob,


> On Jul 21, 2019, at 14:30, Bob Briscoe <ietf@bobbriscoe.net> wrote:
> 
> David,
> 
> On 19/07/2019 21:06, Black, David wrote:
>> Two comments as an individual, not as a WG chair:
>> 
>>> Mostly, they're things that an end-host algorithm needs
>>> to do in order to behave nicely, that might be good things anyways
>>> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
>>> work well w/ small RTT, be robust to reordering).  I am curious which
>>> ones you think are too rigid ... maybe they can be loosened?
>> [1] I have profoundly objected to L4S's RACK-like requirement (use time to detect loss, and in particular do not use 3DupACK) in public on multiple occasions, because in reliable transport space, that forces use of TCP Prague, a protocol with which we have little to no deployment or operational experience.  Moreover, that requirement raises the bar for other protocols in a fashion that impacts endpoint firmware, and possibly hardware in some important (IMHO) environments where investing in those changes delivers little to no benefit.  The environments that I have in mind include a lot of data centers.  Process wise, I'm ok with addressing this objection via some sort of "controlled environment" escape clause text that makes this RACK-like requirement inapplicable in a "controlled environment" that does not need that behavior (e.g., where 3DupACK does not cause problems and is not expected to cause problems).
>> 
>> For clarity, I understand the multi-lane link design rationale behind the RACK-like requirement and would agree with that requirement in a perfect world ... BUT ... this world is not perfect ... e.g., 3DupACK will not vanish from "running code" anytime soon.
> As you know, we have been at pains to address every concern about L4S that has come up over the years, and I thought we had addressed this one to your satisfaction.
> 
> The reliable transports you are concerned about require ordered delivery by the underlying fabric, so they can only ever exist in a controlled environment. In such a controlled environment, your ECT1+DSCP idea (below) could be used to isolate the L4S experiment from these transports and their firmware/hardware constraints.
> 
> On the public Internet, the DSCP commonly gets wiped at the first hop. So requiring a DSCP as well as ECT1 to separate off L4S would serve no useful purpose: it would still lead to ECT1 packets without the DSCP sent from a scalable congestion control (which is behind Jonathan's concern in response to you).

	And this is why IPv4's protocol field / IPv6's next header field is the classifier you actually need... You are changing a significant portion of TCP's observable behavior, so it can be argued that TCP-Prague is TCP by name only. This "classifier" still lives in the IP header, so no deeper layers need to be accessed; it is non-leaky, in that the classifier is unambiguously present independent of the value of the ECN bits; and it is also compatible with SCE-style ECN signaling. Since I believe the most (or only) likely roll-out of L4S is going to be at ISPs' access nodes (BRAS/BNG/CMTS/whatever), middleboxes should not be an insurmountable problem, as ISPs control their own middleboxes and often even the CPEs, so protocol ossification is not going to be a showstopper for this part of the roll-out.
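For illustration only (this sketch is mine, not Sebastian's, and the protocol number used for "TCP Prague" is a purely hypothetical assumption, since no such value is assigned): classifying on the protocol / next-header field rather than on ECT(1) would look roughly like this.

```python
# Hypothetical sketch of protocol-field classification; IPPROTO_PRAGUE is
# invented for illustration and is NOT an assigned protocol number.

IPPROTO_TCP = 6       # assigned
IPPROTO_PRAGUE = 145  # hypothetical: a distinct next-header for TCP-Prague

def select_queue(ip_protocol: int, ecn_bits: int) -> str:
    """Pick the dual queue purely from the IP protocol / next-header field.

    The ECN bits play no part in the decision, so the classifier stays
    unambiguous even if a middlebox mangles ECT(1)/ECT(0)/CE.
    """
    return "L4S" if ip_protocol == IPPROTO_PRAGUE else "classic"

# An ECT(1)-mangled classic TCP flow still lands in the classic queue:
assert select_queue(IPPROTO_TCP, 0b01) == "classic"
# A Prague flow is recognised regardless of its current ECN marking:
assert select_queue(IPPROTO_PRAGUE, 0b11) == "L4S"
```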

Best Regards
	Sebastian



> 
> 
>>>> So to me, it goes back to slamming the door shut, or not, on L4S's usage
>>>> of ect(1) as a too easily gamed e2e identifier. As I don't think it and
>>>> all the dependent code and algorithms can possibly scale past a single
>>>> physical layer tech, I'd like to see it move to a DSCP codepoint, worst
>>>> case... and certainly remain "experimental" in scope until anyone
>>>> independent can attempt to evaluate it.
>>> That seems good to discuss in regard to the L4S ID draft.  There is a
>>> section (5.2) there already discussing DSCP, and why it alone isn't
>>> feasible.  There's also more detailed description of the relation and
>>> interworking in
>>> https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-diffserv-02
>> [2] We probably should pay more attention to that draft.  One of the things that I think is important in that draft is a requirement that operators can enable/disable L4S behavior of ECT(1) on a per-DSCP basis - the rationale for that functionality starts with incremental deployment.   This technique may also have the potential to provide a means for L4S and SCE to coexist via use of different DSCPs for L4S vs. SCE traffic (there are some subtleties here, e.g., interaction with operator bleaching of DSCPs to zero at network boundaries).
>> 
>> To be clear on what I have in mind:
>> 	o Unacceptable: All traffic marked with ECT(1) goes into the L4S queue, independent of what DSCP it is marked with.
>> 	o Acceptable:  There's an operator-configurable list of DSCPs that support an L4S service - traffic marked with ECT(1) goes into the L4S queue if and only if that traffic is also marked with a DSCP that is on the operator's DSCPs-for-L4S list.
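As a sketch (mine, with illustrative names and an invented example DSCP, not values from any draft), the "acceptable" rule above amounts to:

```python
ECT1 = 0b01  # ECN field codepoint ECT(1), per RFC 3168 numbering

def l4s_queue_admit(ecn: int, dscp: int, dscps_for_l4s: frozenset) -> bool:
    """ECT(1) selects the L4S queue only when paired with a listed DSCP."""
    return ecn == ECT1 and dscp in dscps_for_l4s

operator_list = frozenset({0x2A})  # invented example entry on the operator's list

assert l4s_queue_admit(ECT1, 0x2A, operator_list)      # ECT(1) + listed DSCP
assert not l4s_queue_admit(ECT1, 0x00, operator_list)  # DSCP bleached to zero
assert not l4s_queue_admit(0b10, 0x2A, operator_list)  # ECT(0) traffic
```

Note the bleaching subtlety David mentions: a boundary that rewrites DSCPs to zero silently demotes ECT(1) traffic to the classic queue under this rule.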
> Please confirm:
> a) that your RACK concern only applies in controlled environments, and ECT1+DSCP resolves it
> b) on the public Internet, we currently have one issue to address: single-queue RFC3168 AQMs,
> and if we can resolve that, ECT1 alone would be acceptable as an L4S identifier.
> 
> I am trying to focus the issues list, which I would hope you would support, even without your chair hat on.
> 
> 
> 
> Bob
> 
>> 
>> Reminder: This entire message is posted as an individual, not as a WG chair.
>> 
>> Thanks, --David
>> 
>>> -----Original Message-----
>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Wesley Eddy
>>> Sent: Friday, July 19, 2019 2:34 PM
>>> To: Dave Taht; De Schepper, Koen (Nokia - BE/Antwerp)
>>> Cc: ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
>>> Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts
>>> 
>>> 
>>> [EXTERNAL EMAIL]
>>> 
>>> On 7/19/2019 11:37 AM, Dave Taht wrote:
>>>> It's the common-q with AQM **+ ECN** that's the sticking point. I'm
>>>> perfectly satisfied with the behavior of every ietf approved single
>>>> queued AQM without ecn enabled. Let's deploy more of those!
>>> Hi Dave, I'm just trying to make sure I'm reading into your message
>>> correctly ... if I'm understanding it, then you're not in favor of
>>> either SCE or L4S at all?  With small queues and without ECN, loss
>>> becomes the only congestion signal, which is not desirable, IMHO, or am
>>> I totally misunderstanding something?
>>> 
>>> 
>>>> If we could somehow create a neutral poll in the general networking
>>>> community outside the ietf (nanog, bsd, linux, dcs, bigcos, routercos,
>>>> ISPs small and large) , and do it much like your classic "vote for a
>>>> political measure" thing, with a single point/counterpoint section,
>>>> maybe we'd get somewhere.
>>> While I agree that would be really useful, it's kind of an "I want a
>>> pony" statement.  As a TSVWG chair where we're doing this work, we've
>>> been getting inputs from people that have a foot in many of the
>>> communities you mention, but always looking for more.
>>> 
>>> 
>>>> In particular conflating "low latency" really confounds the subject
>>>> matter, and has for years. FQ gives "low latency" for the vast
>>>> majority of flows running below their fair share. L4S promises "low
>>>> latency" for a rigidly defined set of congestion controls in a
>>>> specialized queue, and otherwise tosses all flows into a higher latency
>>>> queue when one flow is greedy.
>>> I don't think this is a correct statement.  Packets have to be from a
>>> "scalable congestion control" to get access to the L4S queue.  There are
>>> some draft requirements for using the L4S ID, but they seem pretty
>>> flexible to me.  Mostly, they're things that an end-host algorithm needs
>>> to do in order to behave nicely, that might be good things anyways
>>> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
>>> work well w/ small RTT, be robust to reordering).  I am curious which
>>> ones you think are too rigid ... maybe they can be loosened?
>>> 
>>> Also, I don't think the "tosses all flows into a higher latency queue
>>> when one flow is greedy" characterization is correct.  The other queue
>>> is for classic/non-scalable traffic, and not necessarily higher latency
>>> for a given flow, nor is winding up there related to whether another
>>> flow is greedy.
>>> 
>>> 
>>>> So to me, it goes back to slamming the door shut, or not, on L4S's usage
>>>> of ect(1) as a too easily gamed e2e identifier. As I don't think it and
>>>> all the dependent code and algorithms can possibly scale past a single
>>>> physical layer tech, I'd like to see it move to a DSCP codepoint, worst
>>>> case... and certainly remain "experimental" in scope until anyone
>>>> independent can attempt to evaluate it.
>>> That seems good to discuss in regard to the L4S ID draft.  There is a
>>> section (5.2) there already discussing DSCP, and why it alone isn't
>>> feasible.  There's also more detailed description of the relation and
>>> interworking in
>>> https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-diffserv-02
>>> 
>>> 
>>>> I'd really like all the tcp-go-fast-at-any-cost people to take a year off
>>>> to dogfood their designs, and go live somewhere with a congested network
>>>> to deal with daily, like a railway or airport, or on a 3G network on a
>>>> sailboat or beach somewhere. It's not a bad life... REALLY.
>>>> 
>>> Fortunately, at least in the IETF, I don't think there have been
>>> initiatives in the direction of going fast at any cost in recent
>>> history, and they would be unlikely to be well accepted if there were!
>>> That is at least one place that there seems to be strong consensus.
>>> 
>> _______________________________________________
>> Ecn-sane mailing list
>> Ecn-sane@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/ecn-sane
> 
> -- 
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/
> 
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-21 11:53                                           ` Bob Briscoe
  2019-07-21 15:33                                             ` Sebastian Moeller
@ 2019-07-21 16:00                                             ` Jonathan Morton
  2019-07-21 16:12                                               ` Sebastian Moeller
  2019-07-22 18:15                                               ` De Schepper, Koen (Nokia - BE/Antwerp)
  1 sibling, 2 replies; 59+ messages in thread
From: Jonathan Morton @ 2019-07-21 16:00 UTC (permalink / raw)
  To: Bob Briscoe
  Cc: Sebastian Moeller, De Schepper, Koen (Nokia - BE/Antwerp),
	Black, David, ecn-sane, tsvwg, Dave Taht

> On 21 Jul, 2019, at 7:53 am, Bob Briscoe <in@bobbriscoe.net> wrote:
> 
> Both teams brought their testbeds, and as of yesterday evening, Koen and Pete Heist had put the two together and started the tests Jonathan proposed. Usual problems: latest Linux kernel being used has introduced a bug, so need to wind back. But progressing.
> 
> Nonetheless, although it's included in the tests, I don't see the particular concern with this 'Cake' scenario. How can "L4S flows crowd out more reactive RFC3168 flows" in "an RFC3168-compliant FQ-AQM"? Whenever it would be happening, FQ would prevent it.
> 
> To ensure we're not continually being blown into the weeds, I thought the /only/ concern was about RFC3168-compliant /single-queue/ AQMs.

I drew up a list of five network topologies to test, each with the SCE set of tests and tools, but using mostly L4S network components and focused on L4S performance and robustness.


1: L4S sender -> L4S middlebox (bottleneck) -> L4S receiver.

This is simply a sanity check to make sure the tools worked.  Actually we fell over even at this stage yesterday, because we discovered problems in the system Bob and Koen had brought along to demo.  These may or may not be improved today; we'll see.


2: L4S sender -> FQ-AQM middlebox (bottleneck) -> L4S middlebox -> L4S receiver.

This is the most favourable-to-L4S topology that incorporates a non-L4S component that we could easily come up with.  Apparently the L4S folks are also relatively unfamiliar with Codel, which is now the most widely deployed AQM in the world, and this would help to validate that L4S transports respond reasonably to it.


3: L4S sender -> single-AQM middlebox (bottleneck) -> L4S middlebox -> L4S receiver.

This is the topology of most concern, and is obtained from topology 2 by simply changing a parameter on our middlebox.


4: L4S sender -> ECT(1) mangler -> L4S middlebox (bottleneck) -> L4S receiver.

Exploring what happens if an adversary tries to game the system.  We could also try an ECT(0) mangler or a Not-ECT mangler, in the same spirit.


5: L4S sender -> L4S middlebox (bottleneck 1) -> Dumb FIFO (bottleneck 2) -> FQ-AQM middlebox (bottleneck 3) -> L4S receiver.

This is Sebastian's scenario.  We did have some discussion yesterday about the propensity of existing senders to produce line-rate bursts occasionally, and the way these bursts could collect in *all* of the queues at successively decreasing bottlenecks.  This is a test which explores that scenario and measures its effects, and is highly relevant to best consumer practice on today's Internet.
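The arithmetic behind that burst concern can be put in rough numbers with a toy model (my own, assuming idealised FIFOs, no cross traffic, and bottleneck rates that only decrease along the path, as in topology 5):

```python
def per_stage_queue_delay_ms(burst_bytes: int, rates_mbps: list) -> list:
    """Extra queueing delay (ms) the last byte of a line-rate burst picks up
    at each successively slower bottleneck."""
    delays, prev_drain_ms = [], 0.0  # the sender emits the burst at line rate
    for r in rates_mbps:
        drain_ms = burst_bytes * 8 / (r * 1e6) * 1e3  # serialisation time at r
        delays.append(drain_ms - prev_drain_ms)       # queue added by this stage
        prev_drain_ms = drain_ms
    return delays

# A 64 KB burst through hypothetical 1000 -> 100 -> 90 Mbit/s bottlenecks:
# every stage adds some standing queue, and the total telescopes to the
# burst's serialisation time at the slowest link (~5.8 ms here).
stages = per_stage_queue_delay_ms(64 * 1024, [1000, 100, 90])
```

The point the model makes is that each bottleneck in the chain contributes its own share of delay, so the burst is felt in *all* of the queues, not just the slowest one.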


Naturally, we have tried the equivalent of most of the above scenarios on our SCE testbed already.  The only one we haven't explicitly tried out is #5; I think we'd need to use all of Pete's APUs plus at least one of my machines to set it up, and we were too tired for that last night.

 - Jonathan Morton

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-21 11:53                                           ` Bob Briscoe
@ 2019-07-21 15:33                                             ` Sebastian Moeller
  2019-07-21 16:00                                             ` Jonathan Morton
  1 sibling, 0 replies; 59+ messages in thread
From: Sebastian Moeller @ 2019-07-21 15:33 UTC (permalink / raw)
  To: Bob Briscoe
  Cc: Jonathan Morton, De Schepper, Koen (Nokia - BE/Antwerp),
	Black, David, ecn-sane, tsvwg, Dave Taht

Hi Bob,

I hope you had an enjoyable holiday.

> On Jul 21, 2019, at 13:53, Bob Briscoe <in@bobbriscoe.net> wrote:
> 
> Sebastian,
> 
> On 19/07/2019 23:03, Sebastian Moeller wrote:
>> Hi Jonathan,
>> 
>> 
>> 
>>> On Jul 19, 2019, at 22:44, Jonathan Morton <chromatix99@gmail.com> wrote:
>>> So I'm pleased to hear that the L4S team will be at the hackathon with a demo setup.  Hopefully we will be able to obtain comparative test results, using the same test scripts as we use on SCE, and also insert an RFC-3168 single queue AQM into their network to demonstrate what actually happens in that case.  I think that the results will be illuminating for all concerned.
>> 	What I really would like to see is how L4S endpoints will deal with post-bottleneck ingress shaping by an RFC3168-compliant FQ-AQM. I know the experts here deem this not even a theoretical concern, but I really, really want to see data that L4S flows will not crowd out the more reactive RFC3168 flows in that situation. This is the set-up quite a number of latency-sensitive end-users actually use to "debloat" the internet, and it would be nice to have real data showing that this is not a concern.
> Both teams brought their testbeds, and as of yesterday evening, Koen and Pete Heist had put the two together and started the tests Jonathan proposed. Usual problems: latest Linux kernel being used has introduced a bug, so need to wind back. But progressing.

	Great!

> 
> Nonetheless, although it's included in the tests, I don't see the particular concern with this 'Cake' scenario.

	This is not a "cake" scenario, but rather an sqm-scripts scenario; for a number of years we have directed latency-sensitive users to use ingress and egress traffic shaping to keep latency-under-load increase in check. To make things easy we offer an exemplary set of scripts under the name sqm-scripts (see https://github.com/tohojo/sqm-scripts) that make it easy to create and test this approach (we also integrated it nicely into OpenWrt to make it even simpler to get decent de-bufferbloating configured for home networks). We implemented the general approach of an FQ-AQM as post-bottleneck shaper with HFSC+fq_codel (since retired), with HTB+fq_codel, and also with cake, but the whole approach predates cake's existence. Now, cake takes most of these ideas to a new level (e.g. operating as an ingress shaper to actually shape the ingress rate instead of the shaper's egress rate), but it is not the case that this approach requires cake.


> How can "L4S flows crowd out more reactive RFC3168 flows" in "an RFC3168-compliant FQ-AQM"? Whenever it would be happening, FQ would prevent it.

	I have heard this repeatedly, but I want to see hard data instead of theoretical considerations, please. Especially since nobody bothered to think about post-bottleneck ingress shaping before I brought it up, this certainly was not considered during the design of L4S; so if it is not a problem, just demonstrate this to shut me up ;).
	So, to be clear, the scenario I want tested is something like the following:

1) Internet: the test servers connected with say 10 times the true bottleneck rate

2) "true bottleneck": say 100 Mbps / 40 Mbps (using a relatively dumb, over-buffered traffic shaper, like most ISPs seem to do, so buffering for at least >=300ms per direction)

3) post-bottleneck ingress&egress flow-fair shaping: say 90/36 Mbps.

What I want to see is that, with that set-up and bi-directional saturating traffic comprising both RFC3168 and L4S flows, each flow still sees roughly its fair share of the bandwidth. I fear that L4S, with its linear CE response, will react more slowly to AQM signals and hence will successively eat a bit of the bandwidth share of the RFC3168 flows that throttle back on receiving a CE mark. I hope my fears are overblown, but in the current state it was not easy enough to actually test that myself.
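A crude harness for exploring exactly that fear might look like the following. This is my own toy model, not a validated L4S result: both flows see one shared marking probability per round trip, the classic flow halves on any marked round, and the scalable flow shrinks linearly with the marking fraction. In the FQ-AQM setup described above, per-flow scheduling and per-flow marking would constrain any drift the model shows, which is exactly what the requested test should measure.

```python
import random

def final_windows(rounds: int = 3000, p: float = 0.01, seed: int = 7):
    """Congestion windows (packets) after `rounds` RTTs of shared marking."""
    random.seed(seed)
    classic, scalable = 50.0, 50.0
    for _ in range(rounds):
        # RFC 3168 response: halve once per RTT that carries any CE mark.
        if random.random() < 1 - (1 - p) ** classic:
            classic = max(classic / 2, 1.0)
        classic += 1
        # DCTCP/Prague-style response: reduce in proportion to the mark rate.
        scalable = max(scalable * (1 - p / 2), 1.0) + 1
    return classic, scalable

classic_w, scalable_w = final_windows()
```

With these arbitrary parameters the scalable window settles well above the classic one, i.e. the known single-shared-signal unfairness; how much of that survives behind a per-flow-fair ingress shaper is the open question.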


> 
> To ensure we're not continually being blown into the weeds, I thought the /only/ concern was about RFC3168-compliant /single-queue/ AQMs.

	I believe I have been clear that my concern is the effect of under-responsive L4S flows on flow fairness with a post-bottleneck ingress FQ-AQM system. So, no: compatibility with an "RFC3168-compliant /single-queue/ AQM" is not the only concern. Especially since I know that there is a community out there using post-bottleneck ingress FQ-AQM to keep latency-under-load increase under control, who would be less than impressed if L4S destroyed the effectiveness of their "solution". Really, I wonder why the L4S project did not reach out to this community during the design phase, since these users could be your natural supporters, assuming your solution scratches their itches sufficiently well.

Best Regards
	Sebastian

> 
> 
> 
> Bob
> 
>> 
>> Best Regards
>> 	Sebastian
>> 
>> 
>> 
>>> - Jonathan Morton
>>> _______________________________________________
>>> Ecn-sane mailing list
>>> Ecn-sane@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/ecn-sane
>> _______________________________________________
>> Ecn-sane mailing list
>> Ecn-sane@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/ecn-sane
> 
> -- 
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-19 20:06                                     ` Black, David
  2019-07-19 20:44                                       ` Jonathan Morton
  2019-07-21 12:30                                       ` Bob Briscoe
@ 2019-07-21 12:30                                       ` Scharf, Michael
  2 siblings, 0 replies; 59+ messages in thread
From: Scharf, Michael @ 2019-07-21 12:30 UTC (permalink / raw)
  To: Black, David, Wesley Eddy, Dave Taht, De Schepper,
	Koen (Nokia - BE/Antwerp)
  Cc: ecn-sane, tsvwg

One comment, also with no hat on...

> -----Original Message-----
> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Black, David
> Sent: Friday, July 19, 2019 10:06 PM
> To: Wesley Eddy <wes@mti-systems.com>; Dave Taht <dave@taht.net>; De
> Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-
> labs.com>
> Cc: ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
> Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts
> 
> Two comments as an individual, not as a WG chair:
> 
> > Mostly, they're things that an end-host algorithm needs
> > to do in order to behave nicely, that might be good things anyways
> > without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
> > work well w/ small RTT, be robust to reordering).  I am curious which
> > ones you think are too rigid ... maybe they can be loosened?
> 
> [1] I have profoundly objected to L4S's RACK-like requirement (use time to
> detect loss, and in particular do not use 3DupACK) in public on multiple
> occasions

... and I have asked in public to remove the RACK requirement, too.

> because in reliable transport space, that forces use of TCP Prague,
> a protocol with which we have little to no deployment or operational
> experience.  Moreover, that requirement raises the bar for other protocols
> in a fashion that impacts endpoint firmware, and possibly hardware in some
> important (IMHO) environments where investing in those changes delivers
> little to no benefit.  The environments that I have in mind include a lot of data
> centers.  Process wise, I'm ok with addressing this objection via some sort of
> "controlled environment" escape clause text that makes this RACK-like
> requirement inapplicable in a "controlled environment" that does not need
> that behavior (e.g., where 3DupACK does not cause problems and is not
> expected to cause problems).

Also, note that the work on RACK is ongoing in TCPM. While there seems to be plenty of deployment expertise, it is perfectly possible that issues will be discovered in future. And we are pre-WGLC in TCPM, i.e., even the specification of RACK could still change.

In general, having one experiment list requirements on the outcome of another, still-ongoing experiment is a bad idea and should be avoided, IMHO. I have also mentioned this in the past and will not change my mind so easily. Historically, the IETF has had good experience with bottom-up, modular protocol mechanisms and running code, rather than top-down architectures.

Michael

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-19 20:06                                     ` Black, David
  2019-07-19 20:44                                       ` Jonathan Morton
@ 2019-07-21 12:30                                       ` Bob Briscoe
  2019-07-21 16:08                                         ` Sebastian Moeller
       [not found]                                         ` <5D34803D.50501@erg.abdn.ac.uk>
  2019-07-21 12:30                                       ` Scharf, Michael
  2 siblings, 2 replies; 59+ messages in thread
From: Bob Briscoe @ 2019-07-21 12:30 UTC (permalink / raw)
  To: Black, David, Wesley Eddy, Dave Taht, De Schepper,
	Koen (Nokia - BE/Antwerp)
  Cc: ecn-sane, tsvwg

David,

On 19/07/2019 21:06, Black, David wrote:
> Two comments as an individual, not as a WG chair:
>
>> Mostly, they're things that an end-host algorithm needs
>> to do in order to behave nicely, that might be good things anyways
>> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
>> work well w/ small RTT, be robust to reordering).  I am curious which
>> ones you think are too rigid ... maybe they can be loosened?
> [1] I have profoundly objected to L4S's RACK-like requirement (use time to detect loss, and in particular do not use 3DupACK) in public on multiple occasions, because in reliable transport space, that forces use of TCP Prague, a protocol with which we have little to no deployment or operational experience.  Moreover, that requirement raises the bar for other protocols in a fashion that impacts endpoint firmware, and possibly hardware in some important (IMHO) environments where investing in those changes delivers little to no benefit.  The environments that I have in mind include a lot of data centers.  Process wise, I'm ok with addressing this objection via some sort of "controlled environment" escape clause text that makes this RACK-like requirement inapplicable in a "controlled environment" that does not need that behavior (e.g., where 3DupACK does not cause problems and is not expected to cause problems).
>
> For clarity, I understand the multi-lane link design rationale behind the RACK-like requirement and would agree with that requirement in a perfect world ... BUT ... this world is not perfect ... e.g., 3DupACK will not vanish from "running code" anytime soon.
As you know, we have been at pains to address every concern about L4S 
that has come up over the years, and I thought we had addressed this one 
to your satisfaction.

The reliable transports you are concerned about require ordered 
delivery by the underlying fabric, so they can only ever exist in a 
controlled environment. In such a controlled environment, your ECT1+DSCP 
idea (below) could be used to isolate the L4S experiment from these 
transports and their firmware/hardware constraints.

On the public Internet, the DSCP commonly gets wiped at the first hop. 
So requiring a DSCP as well as ECT1 to separate off L4S would serve no 
useful purpose: it would still lead to ECT1 packets without the DSCP 
being sent from scalable congestion controls (which is behind Jonathan's 
concern in response to you).
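For concreteness, the two classification policies being debated can be sketched roughly as follows. This is illustrative only: the `L4S_DSCPS` value and function names are made up, the ECT(1) encoding is from RFC 3168.

```python
# Rough sketch of the two classifier policies under discussion.
# L4S_DSCPS is a hypothetical operator-configured DSCPs-for-L4S list.

ECT1 = 0b01        # ECN-field codepoint ECT(1), per RFC 3168
L4S_DSCPS = {46}   # illustrative operator list, not from any draft

def classify_ect1_only(ecn):
    """'Unacceptable' policy: ECT(1) alone selects the L4S queue."""
    return "L4S" if ecn == ECT1 else "classic"

def classify_ect1_plus_dscp(ecn, dscp):
    """'Acceptable' policy: ECT(1) AND a listed DSCP select the L4S queue."""
    return "L4S" if ecn == ECT1 and dscp in L4S_DSCPS else "classic"

# The objection above: if the DSCP is bleached to 0 at the first hop,
# the combined classifier routes scalable-CC traffic to the classic queue.
print(classify_ect1_plus_dscp(ECT1, 0))   # classic
```

The sketch makes the failure mode plain: the combined classifier fails toward the classic queue as soon as any hop wipes the DSCP.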


>>> So to me, it goes back to slamming the door shut, or not, on L4S's usage
>>> of ect(1) as a too easily gamed e2e identifier. As I don't think it and
>>> all the dependent code and algorithms can possibly scale past a single
>>> physical layer tech, I'd like to see it move to a DSCP codepoint, worst
>>> case... and certainly remain "experimental" in scope until anyone
>>> independent can attempt to evaluate it.
>> That seems good to discuss in regard to the L4S ID draft.  There is a
>> section (5.2) there already discussing DSCP, and why it alone isn't
>> feasible.  There's also more detailed description of the relation and
>> interworking in
>> https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-diffserv-02
> [2] We probably should pay more attention to that draft.  One of the things that I think is important in that draft is a requirement that operators can enable/disable L4S behavior of ECT(1) on a per-DSCP basis - the rationale for that functionality starts with incremental deployment.   This technique may also have the potential to provide a means for L4S and SCE to coexist via use of different DSCPs for L4S vs. SCE traffic (there are some subtleties here, e.g., interaction with operator bleaching of DSCPs to zero at network boundaries).
>
> To be clear on what I have in mind:
> 	o Unacceptable: All traffic marked with ECT(1) goes into the L4S queue, independent of what DSCP it is marked with.
> 	o Acceptable:  There's an operator-configurable list of DSCPs that support an L4S service - traffic marked with ECT(1) goes into the L4S queue if and only if that traffic is also marked with a DSCP that is on the operator's DSCPs-for-L4S list.
Please confirm:
a) that your RACK concern only applies in controlled environments, and 
ECT1+DSCP resolves it
b) on the public Internet, we currently have one issue to address: 
single-queue RFC3168 AQMs,
and if we can resolve that, ECT1 alone would be acceptable as an L4S 
identifier.

I am trying to focus the issues list, which I would hope you would 
support, even without your chair hat on.



Bob

>
> Reminder: This entire message is posted as an individual, not as a WG chair.
>
> Thanks, --David
>
>> -----Original Message-----
>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Wesley Eddy
>> Sent: Friday, July 19, 2019 2:34 PM
>> To: Dave Taht; De Schepper, Koen (Nokia - BE/Antwerp)
>> Cc: ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
>> Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts
>>
>>
>> [EXTERNAL EMAIL]
>>
>> On 7/19/2019 11:37 AM, Dave Taht wrote:
>>> It's the common-q with AQM **+ ECN** that's the sticking point. I'm
>>> perfectly satisfied with the behavior of every ietf approved single
>>> queued AQM without ecn enabled. Let's deploy more of those!
>> Hi Dave, I'm just trying to make sure I'm reading into your message
>> correctly ... if I'm understanding it, then you're not in favor of
>> either SCE or L4S at all?  With small queues and without ECN, loss
>> becomes the only congestion signal, which is not desirable, IMHO, or am
>> I totally misunderstanding something?
>>
>>
>>> If we could somehow create a neutral poll in the general networking
>>> community outside the ietf (nanog, bsd, linux, dcs, bigcos, routercos,
>>> ISPs small and large) , and do it much like your classic "vote for a
>>> political measure" thing, with a single point/counterpoint section,
>>> maybe we'd get somewhere.
>> While I agree that would be really useful, it's kind of an "I want a
>> pony" statement.  As a TSVWG chair where we're doing this work, we've
>> been getting inputs from people that have a foot in many of the
>> communities you mention, but always looking for more.
>>
>>
>>> In particular conflating "low latency" really confounds the subject
>>> matter, and has for years. FQ gives "low latency" for the vast
>>> majority of flows running below their fair share. L4S promises "low
>>> latency" for a rigidly defined set of congestion controls in a
>>> specialized queue, and otherwise tosses all flows into a higher latency
>>> queue when one flow is greedy.
>> I don't think this is a correct statement.  Packets have to be from a
>> "scalable congestion control" to get access to the L4S queue.  There are
>> some draft requirements for using the L4S ID, but they seem pretty
>> flexible to me.  Mostly, they're things that an end-host algorithm needs
>> to do in order to behave nicely, that might be good things anyways
>> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
>> work well w/ small RTT, be robust to reordering).  I am curious which
>> ones you think are too rigid ... maybe they can be loosened?
>>
>> Also, I don't think the "tosses all flows into a higher latency queue
>> when one flow is greedy" characterization is correct.  The other queue
>> is for classic/non-scalable traffic, and not necessarily higher latency
>> for a given flow, nor is winding up there related to whether another
>> flow is greedy.
>>
>>
>>> So to me, it goes back to slamming the door shut, or not, on L4S's usage
>>> of ect(1) as a too easily gamed e2e identifier. As I don't think it and
>>> all the dependent code and algorithms can possibly scale past a single
>>> physical layer tech, I'd like to see it move to a DSCP codepoint, worst
>>> case... and certainly remain "experimental" in scope until anyone
>>> independent can attempt to evaluate it.
>> That seems good to discuss in regard to the L4S ID draft.  There is a
>> section (5.2) there already discussing DSCP, and why it alone isn't
>> feasible.  There's also more detailed description of the relation and
>> interworking in
>> https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-diffserv-02
>>
>>
>>> I'd really all the tcp-go-fast-at-any-cost people to take a year off to
>>> dogfood their designs, and go live somewhere with a congested network
>> to
>>> deal with daily, like a railway or airport, or on 3G network on a
>>> sailboat or beach somewhere. It's not a bad life... REALLY.
>>>
>> Fortunately, at least in the IETF, I don't think there have been
>> initiatives in the direction of going fast at any cost in recent
>> history, and they would be unlikely to be well accepted if there were!
>> That is at least one place that there seems to be strong consensus.
>>
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-19 22:03                                         ` Sebastian Moeller
  2019-07-20 21:02                                           ` Dave Taht
@ 2019-07-21 11:53                                           ` Bob Briscoe
  2019-07-21 15:33                                             ` Sebastian Moeller
  2019-07-21 16:00                                             ` Jonathan Morton
  1 sibling, 2 replies; 59+ messages in thread
From: Bob Briscoe @ 2019-07-21 11:53 UTC (permalink / raw)
  To: Sebastian Moeller, Jonathan Morton
  Cc: De Schepper, Koen (Nokia - BE/Antwerp),
	Black, David, ecn-sane, tsvwg, Dave Taht

Sebastian,

On 19/07/2019 23:03, Sebastian Moeller wrote:
> Hi Jonathan,
>
>
>
>> On Jul 19, 2019, at 22:44, Jonathan Morton <chromatix99@gmail.com> wrote:
>> So I'm pleased to hear that the L4S team will be at the hackathon with a demo setup.  Hopefully we will be able to obtain comparative test results, using the same test scripts as we use on SCE, and also insert an RFC-3168 single queue AQM into their network to demonstrate what actually happens in that case.  I think that the results will be illuminating for all concerned.
> 	What I really would like to see, how L4S endpoints will deal with post-bottleneck ingress shaping by an RFC3168 -compliant FQ-AQM. I know the experts here deems this not even a theoretical concern, but I really really want to see data, that L4S flows will not crowd out the more reactive RFC3168 flows in that situation. This is the set-up quite a number of latency sensitive end-users actually use to "debloat" the internet and it would be nice to have real data showing that this is not a concern.
Both teams brought their testbeds, and as of yesterday evening, Koen and 
Pete Heist had put the two together and started the tests Jonathan 
proposed. Usual problems: latest Linux kernel being used has introduced 
a bug, so need to wind back. But progressing.

Nonetheless, although it's included in the tests, I don't see the 
particular concern with this 'Cake' scenario. How could "L4S flows crowd 
out more reactive RFC3168 flows" in "an RFC3168-compliant FQ-AQM"? 
Wherever that started to happen, FQ would prevent it.
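The FQ argument can be made concrete with a toy deficit-round-robin dequeue, a sketch in the spirit of fq_codel/Cake rather than their actual code; the `quantum` value and flow names are illustrative.

```python
# Toy deficit-round-robin dequeue over per-flow queues, illustrating why
# one aggressive flow cannot starve others under FQ: each flow may send
# at most roughly one `quantum` of bytes per scheduling round.
from collections import deque

def drr_round(flows, quantum=1514):
    """flows: dict name -> deque of packet sizes; returns bytes sent per flow."""
    sent = {}
    for name, q in flows.items():
        deficit, out = quantum, 0
        while q and q[0] <= deficit:
            out += q[0]             # dequeue packets while deficit allows
            deficit -= q.popleft()
        sent[name] = out
    return sent

flows = {"greedy": deque([1514] * 100), "sparse": deque([100])}
print(drr_round(flows))   # greedy is capped at one quantum this round
```

However the greedy flow builds its own queue, the sparse flow's packets still go out every round, which is the isolation property being relied on above.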

To ensure we're not continually being blown into the weeds, I thought 
the /only/ concern was about RFC3168-compliant /single-queue/ AQMs.



Bob

>
> Best Regards
> 	Sebastian
>
>
>
>> - Jonathan Morton

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/



* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-19 22:03                                         ` Sebastian Moeller
@ 2019-07-20 21:02                                           ` Dave Taht
  2019-07-21 11:53                                           ` Bob Briscoe
  1 sibling, 0 replies; 59+ messages in thread
From: Dave Taht @ 2019-07-20 21:02 UTC (permalink / raw)
  To: Sebastian Moeller
  Cc: Jonathan Morton, Black, David, tsvwg, ecn-sane, De Schepper,
	Koen (Nokia - BE/Antwerp)

Sebastian Moeller <moeller0@gmx.de> writes:

> Hi Jonathan,
>
>
>
>> On Jul 19, 2019, at 22:44, Jonathan Morton <chromatix99@gmail.com> wrote:
>> 
>>> On 19 Jul, 2019, at 4:06 pm, Black, David <David.Black@dell.com> wrote:
>>> 
>>> To be clear on what I have in mind:
>>> 	o Unacceptable: All traffic marked with ECT(1) goes into the L4S queue, independent of what DSCP it is marked with.
>>> 	o Acceptable: There's an operator-configurable list of DSCPs
>>> that support an L4S service - traffic marked with ECT(1) goes into
>>> the L4S queue if and only if that traffic is also marked with a
>>> DSCP that is on the operator's DSCPs-for-L4S list.
>> 
>> I take it, in the latter case, that this increases the cases in
>> which L4S endpoints would need to detect that they are not receiving
>> L4S signals, but RFC-3168 signals.  The current lack of such a
>> mechanism therefore remains concerning.  For comparison, SCE
>> inherently retains such a mechanism by putting the RFC-3168 and
>> high-fidelity signals on different ECN codepoints.
>> 
>> So I'm pleased to hear that the L4S team will be at the hackathon
>> with a demo setup.  Hopefully we will be able to obtain comparative
>> test results, using the same test scripts as we use on SCE, and also
>> insert an RFC-3168 single queue AQM into their network to
>> demonstrate what actually happens in that case.  I think that the
>> results will be illuminating for all concerned.
>
> 	What I really would like to see, how L4S endpoints will deal
> with post-bottleneck ingress shaping by an RFC3168 -compliant
> FQ-AQM. I know the experts here deems this not even a theoretical
> concern, but I really really want to see data, that L4S flows will not
> crowd out the more reactive RFC3168 flows in that situation. This is
> the set-up quite a number of latency sensitive end-users actually use
> to "debloat" the internet and it would be nice to have real data
> showing that this is not a concern.

+10

>
> Best Regards
> 	Sebastian
>
>
>
>> 
>> - Jonathan Morton


* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-19 22:09                                       ` Wesley Eddy
@ 2019-07-19 23:42                                         ` Dave Taht
  2019-07-24 16:21                                           ` Dave Taht
  0 siblings, 1 reply; 59+ messages in thread
From: Dave Taht @ 2019-07-19 23:42 UTC (permalink / raw)
  To: Wesley Eddy
  Cc: Dave Taht, De Schepper, Koen (Nokia - BE/Antwerp), ecn-sane, tsvwg

On Fri, Jul 19, 2019 at 3:09 PM Wesley Eddy <wes@mti-systems.com> wrote:
>
> Hi Dave, thanks for clarifying, and sorry if you're getting upset.

There have been a few other disappointments this ietf. I'd hoped bbrv2
would land for independent testing. Didn't.

https://github.com/google/bbr

I have some "interesting" patches for bbrv1 but felt it would be saner
to wait for the most current version (or for the bbrv2 authors to
have the small rfc3168 baseline patch I'd requested tested by them
rather than by me) before redoing that series of tests and publishing.

I'd asked if the dctcp and dualpi code on github was stable enough to
be independently tested. No reply.

The SCE folk did freeze and document a release worth testing.

I did some testing on wifi at battlemesh, but the data is too noisy (though
the sources of "noise" were important) and shows too obviously that "ecn
is not the wifi problem".

I didn't know there was an "add a delay based option to cubic patch"
until last week.

So anyway, I do retain hope, maybe after this coming week and some
more hackathoning, it might be possible to start getting reproducible
and repeatable results from more participants in this controversy.
Having to sit through another half-dozen presentations with
irreproducible results is not something I look forward to, and I'm
glad I don't have to.

> When we're talking about keeping very small queues, then RTT is lost as
> a congestion indicator (since there is no queue depth to modulate as a
> congestion signal into the RTT).  We have indicators that include drop,
> RTT, and ECN (when available).  Using rate of marks rather than just
> binary presence of marking gives a finer-grained signal.  SCE is also
> providing a multi-level indication, so that's another way to get more
> "ENOB" into the samples of congestion being fed to the controllers.

While this is extremely well said, RTT is NOT lost as a congestion
indicator; it just becomes finer grained.

While I'm reading tea-leaves... there's been a lot of stuff landing in
the linux kernel from google around edf scheduling for tcp and the
hardware enabled pacing qdiscs. So I figure they are now in the nsec
category on their stuff but not ready to be talking.

> Marking (whether classic ECN, mark-rate, or multi-level marking) is
> needed since with small queues there's lack of congestion information in
> the RTT.

small queues *and isochronous, high speed, wired connections*.

What will it take to get the ecn and especially l4s crowd to take a
hard look at actual wireless or wifi packet captures? I mean, y'all
are sitting staring into your laptops for a week, doing wifi. Would it
hurt to test more actual transports during
that time?

How many ISPs would still be in business if wifi didn't exist, only {X}G?

the wifi at the last ietf sucked...

Can't even get close to 5ms latencies on any form of wireless/wifi.

Anyway, I long ago agreed that multiple marks (of some sort) per rtt
made sense (see my position statements on ecn-sane),
but of late I've been leaning more towards really good pacing,  rtt
and chirping with minimal marking required on
"small queues *and isochronous, high speed, wired connections*".

>
> To address one question you repeated a couple times:
>
> > Is there any chance we'll see my conception of the good ietf process
> > enforced on the L4S and SCE processes by the chairs?
>
> We look for working group consensus.  So far, we saw consensus to adopt
> as a WG item for experimental track, and have been following the process
> for that.

Well, given the announcement of docsis low latency, and the size of
the fq_codel deployment,
and the l4s/sce drafts, we are light-years beyond anything I'd
consider to be "experimental" in the real world.

Would recognizing this reality and somehow converting this to a
standards track debate within the ietf help anything?

Would getting this out of tsvwg and restarting aqmwg help any?

I was, up until all this blew up in december, planning on starting the
process for an rfc8289bis and rfc8290bis on the standards track.

>
> On the topic of gaming the system by falsely setting the L4S ID, that
> might need to be discussed a little bit more, since now that you mention
> it, the docs don't seem to very directly address it yet.

to me this has always been a game theory deal killer for l4s (and
diffserv, intserv, etc). You cannot ask for
more priority, only less. While I've been recommending books from
kleinrock lately, another one that
I think everyone in this field should have is:

https://www.amazon.com/Theory-Games-Economic-Behavior-Commemorative-ebook/dp/B00AMAZL4I/ref=sr_1_1?keywords=theory+of+games+and+economic+behavior&qid=1563579161&s=gateway&sr=8-1

I've read it countless times (and can't claim to have understood more
than a tiny percentage of it). I wasn't aware
until this moment there was a kindle edition.

> I can only
> speak for myself, but assumed a couple things internally, such as (1)
> this is getting enabled in specific environments, (2) in less controlled
> environments, an operator enabling it has protections in place for
> getting admission or dealing with bad behavior, (3) there could be
> further development of audit capabilities such as in CONEX, etc.  I
> guess it could be good to hear more about what others were thinking on this.

I think there was "yet another queue" suggested for detected bad behavior.

>
> > So I should have said - "tosses all normal ("classic") flows into a
> > single and higher latency queue when a greedy normal flow is present"
> > ... "in the dualpi" case? I know it's possible to hang a different
> > queue algo on the "normal" queue, but
> > to this day I don't see the need for the l4s "fast lane" in the first
> > place, nor a cpu efficient way of doing the right things with the
> > dualpi or curvyred code. What I see, is, long term, that special bit
> > just becomes a "fast" lane for any sort of admission controlled
> > traffic the ISP wants to put there, because the dualpi idea fails on
> > real traffic.
>
> Thanks; this was helpful for me to understand your position.

Groovy.

I recently ripped ecn support out of fq_codel entirely, in
the fq_codel_fast tree. It saved some cpu; still measuring (my real
objective is to make that code multicore).

Another branch also has the basic sce support, and will have more
after jon settles on a ramp and single queue fallbacks in
sch_cake. Btw, if anyone cares, there's more than a few flent test
servers scattered around the internet now that
do some variant of sce for others to play with....

>
>
> > Well if the various WGs would exit that nice hotel, and form a
> > diaspora over the city in coffee shops and other public spaces, and do
> > some tests of your latest and greatest stuff, y'all might get a more
> > accurate viewpoint of what you are actually accomplishing. Take a look
> > at what BBR does, take a look at what IW10 does, take a look at what
> > browsers currently do.
>
> All of those things come up in the meetings, and frequently there is
> measurement data shown and discussed.  It's always welcome when people
> bring measurements, data, and experience.  The drafts and other
> contributions are here so that anyone interested can independently
> implement and do the testing you advocate and share results.  We're all
> on the same team trying to make the Internet better.

Skip a meeting. Try the internet in Bali. Or Africa. Or South America.
Or on a boat. Or do an interim
in places like that.

>
>


-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740


* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-19 20:03                                     ` Dave Taht
@ 2019-07-19 22:09                                       ` Wesley Eddy
  2019-07-19 23:42                                         ` Dave Taht
  0 siblings, 1 reply; 59+ messages in thread
From: Wesley Eddy @ 2019-07-19 22:09 UTC (permalink / raw)
  To: Dave Taht
  Cc: Dave Taht, De Schepper, Koen (Nokia - BE/Antwerp), ecn-sane, tsvwg

Hi Dave, thanks for clarifying, and sorry if you're getting upset.

When we're talking about keeping very small queues, then RTT is lost as 
a congestion indicator (since there is no queue depth to modulate as a 
congestion signal into the RTT).  We have indicators that include drop, 
RTT, and ECN (when available).  Using rate of marks rather than just 
binary presence of marking gives a finer-grained signal.  SCE is also 
providing a multi-level indication, so that's another way to get more 
"ENOB" into the samples of congestion being fed to the controllers.

Marking (whether classic ECN, mark-rate, or multi-level marking) is 
needed since with small queues there's lack of congestion information in 
the RTT.
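The "rate of marks" point maps directly onto the DCTCP estimator (RFC 8257, Section 3.3): the sender keeps an EWMA of the fraction of marked ACKs and scales its window reduction by it, instead of halving on any single mark. A minimal sketch, with illustrative function names:

```python
# DCTCP-style use of mark *rate* as a fine-grained congestion signal.
G = 1.0 / 16   # EWMA gain g suggested by RFC 8257

def update_alpha(alpha, marked, total):
    """EWMA of the fraction of CE-marked ACKs (RFC 8257 Section 3.3)."""
    frac = marked / total
    return (1 - G) * alpha + G * frac

def reduce_cwnd(cwnd, alpha):
    """Cut cwnd in proportion to the marking level, not by a fixed half."""
    return max(1.0, cwnd * (1 - alpha / 2))

alpha = update_alpha(0.0, marked=4, total=16)   # 25% of ACKs marked
print(alpha, reduce_cwnd(100.0, alpha))
```

A lightly marked window produces a small alpha and a small cut, while persistent 100% marking drives alpha toward 1 and the cut toward a Reno-style halving, which is the extra "ENOB" being described.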

To address one question you repeated a couple times:

> Is there any chance we'll see my conception of the good ietf process
> enforced on the L4S and SCE processes by the chairs?

We look for working group consensus.  So far, we saw consensus to adopt 
as a WG item for experimental track, and have been following the process 
for that.

On the topic of gaming the system by falsely setting the L4S ID, that 
might need to be discussed a little bit more, since now that you mention 
it, the docs don't seem to very directly address it yet.  I can only 
speak for myself, but assumed a couple things internally, such as (1) 
this is getting enabled in specific environments, (2) in less controlled 
environments, an operator enabling it has protections in place for 
getting admission or dealing with bad behavior, (3) there could be 
further development of audit capabilities such as in CONEX, etc.  I 
guess it could be good to hear more about what others were thinking on this.


> So I should have said - "tosses all normal ("classic") flows into a
> single and higher latency queue when a greedy normal flow is present"
> ... "in the dualpi" case? I know it's possible to hang a different
> queue algo on the "normal" queue, but
> to this day I don't see the need for the l4s "fast lane" in the first
> place, nor a cpu efficient way of doing the right things with the
> dualpi or curvyred code. What I see, is, long term, that special bit
> just becomes a "fast" lane for any sort of admission controlled
> traffic the ISP wants to put there, because the dualpi idea fails on
> real traffic.

Thanks; this was helpful for me to understand your position.


> Well if the various WGs would exit that nice hotel, and form a
> diaspora over the city in coffee shops and other public spaces, and do
> some tests of your latest and greatest stuff, y'all might get a more
> accurate viewpoint of what you are actually accomplishing. Take a look
> at what BBR does, take a look at what IW10 does, take a look at what
> browsers currently do.

All of those things come up in the meetings, and frequently there is 
measurement data shown and discussed.  It's always welcome when people 
bring measurements, data, and experience.  The drafts and other 
contributions are here so that anyone interested can independently 
implement and do the testing you advocate and share results.  We're all 
on the same team trying to make the Internet better.




* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-19 20:44                                       ` Jonathan Morton
@ 2019-07-19 22:03                                         ` Sebastian Moeller
  2019-07-20 21:02                                           ` Dave Taht
  2019-07-21 11:53                                           ` Bob Briscoe
  0 siblings, 2 replies; 59+ messages in thread
From: Sebastian Moeller @ 2019-07-19 22:03 UTC (permalink / raw)
  To: Jonathan Morton
  Cc: Black, David, tsvwg, ecn-sane, Dave Taht, De Schepper,
	Koen (Nokia - BE/Antwerp)

Hi Jonathan,



> On Jul 19, 2019, at 22:44, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>> On 19 Jul, 2019, at 4:06 pm, Black, David <David.Black@dell.com> wrote:
>> 
>> To be clear on what I have in mind:
>> 	o Unacceptable: All traffic marked with ECT(1) goes into the L4S queue, independent of what DSCP it is marked with.
>> 	o Acceptable:  There's an operator-configurable list of DSCPs that support an L4S service - traffic marked with ECT(1) goes into the L4S queue if and only if that traffic is also marked with a DSCP that is on the operator's DSCPs-for-L4S list.
> 
> I take it, in the latter case, that this increases the cases in which L4S endpoints would need to detect that they are not receiving L4S signals, but RFC-3168 signals.  The current lack of such a mechanism therefore remains concerning.  For comparison, SCE inherently retains such a mechanism by putting the RFC-3168 and high-fidelity signals on different ECN codepoints.
> 
> So I'm pleased to hear that the L4S team will be at the hackathon with a demo setup.  Hopefully we will be able to obtain comparative test results, using the same test scripts as we use on SCE, and also insert an RFC-3168 single queue AQM into their network to demonstrate what actually happens in that case.  I think that the results will be illuminating for all concerned.

	What I really would like to see is how L4S endpoints will deal with post-bottleneck ingress shaping by an RFC3168-compliant FQ-AQM. I know the experts here deem this not even a theoretical concern, but I really, really want to see data that L4S flows will not crowd out the more reactive RFC3168 flows in that situation. This is the set-up quite a number of latency-sensitive end-users actually use to "debloat" the internet, and it would be nice to have real data showing that this is not a concern.

Best Regards
	Sebastian



> 
> - Jonathan Morton



* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-19 18:33                                   ` Wesley Eddy
  2019-07-19 20:03                                     ` Dave Taht
  2019-07-19 20:06                                     ` Black, David
@ 2019-07-19 21:49                                     ` Sebastian Moeller
  2 siblings, 0 replies; 59+ messages in thread
From: Sebastian Moeller @ 2019-07-19 21:49 UTC (permalink / raw)
  To: Wesley Eddy
  Cc: Dave Taht, De Schepper, Koen (Nokia - BE/Antwerp), ecn-sane, tsvwg



> On Jul 19, 2019, at 20:33, Wesley Eddy <wes@mti-systems.com> wrote:
> 
> On 7/19/2019 11:37 AM, Dave Taht wrote:
> [...]
>> In particular conflating "low latency" really confounds the subject
>> matter, and has for years. FQ gives "low latency" for the vast
>> majority of flows running below their fair share. L4S promises "low
>> latency" for a rigidly defined set of congestion controls in a
>> specialized queue, and otherwise tosses all flows into a higher latency
>> queue when one flow is greedy.
> 
> I don't think this is a correct statement.  Packets have to be from a "scalable congestion control" to get access to the L4S queue.  

	With the current proposal, a packet only needs to set the ECT(1) codepoint; there is _no_ checking whether a "scalable congestion control" is operational on this flow. Even worse, every CE-marked packet will be put into the L4S queue; the latter is a consequence of the currently preferred choice of using ECT(1) as the L4S classifying bit. Sure, the queue protection feature might help to demote flows not playing along with the L4S rules back into the RFC3168 queue, but queue protection is advertised as optional....
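The classification being objected to can be sketched like this: a simplification of the DualQ coupled-AQM classifier (ignoring optional queue protection), using the RFC 3168 ECN-field encoding; the function name is illustrative.

```python
# Simplified sketch of the proposed dual-queue classifier: because a CE
# mark could have overwritten either ECT(0) or ECT(1) upstream (the
# original codepoint is lost), CE must be steered to the L queue as well.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11   # RFC 3168 encoding

def dualq_classify(ecn):
    """L queue gets ECT(1) and CE; C (classic) queue gets the rest."""
    return "L" if ecn in (ECT1, CE) else "C"

# Hence a classic RFC 3168 flow whose packet was CE-marked upstream also
# lands in the L queue -- with no check that a scalable CC is in use.
print(dualq_classify(CE))   # L
```

This is the ambiguity driving the objection: classification is by codepoint only, so nothing at the queue distinguishes a scalable sender from an RFC 3168 sender whose packet happened to be CE-marked.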


> There are some draft requirements for using the L4S ID, but they seem pretty flexible to me.  Mostly, they're things that an end-host algorithm needs to do in order to behave nicely,

	Except there is no real enforcement or measurement of whether flows "behave nicely", at least as far as I can see.


> [...]
> 
>> So to me, it goes back to slamming the door shut, or not, on L4S's usage
>> of ect(1) as a too easily gamed e2e identifier. As I don't think it and
>> all the dependent code and algorithms can possibly scale past a single
>> physical layer tech, I'd like to see it move to a DSCP codepoint, worst
>> case... and certainly remain "experimental" in scope until anyone
>> independent can attempt to evaluate it.
> 
> That seems good to discuss in regard to the L4S ID draft.  There is a section (5.2) there already discussing DSCP, and why it alone isn't feasible.  There's also more detailed description of the relation and interworking in https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-diffserv-02

	IMHO a new protocol ID is the solution:
See https://tools.ietf.org/html/draft-ietf-tsvwg-ecn-l4s-id-07#appendix-B.4

'B.4.  Protocol ID


   It has been suggested that a new ID in the IPv4 Protocol field or the
   IPv6 Next Header field could identify L4S packets.  However this
   approach is ruled out by numerous problems:

   o  A new protocol ID would need to be paired with the old one for
      each transport (TCP, SCTP, UDP, etc.);

   o  In IPv6, there can be a sequence of Next Header fields, and it
      would not be obvious which one would be expected to identify a
      network service like L4S;

   o  A new protocol ID would rarely provide an end-to-end service,
      because It is well-known that new protocol IDs are often blocked
      by numerous types of middlebox;

   o  The approach is not a solution for AQMs below the IP layer;"


None of these points are show stoppers, IMHO:
1) Especially since in all likelihood only two new protocol IDs will be needed, "AIAD TCP" and "AIAD UDP".
2) The IPv6 issue is a bit of a red herring, as the next header field typically seems to contain the exact same number as IPv4's protocol field, and chained headers are probably rare. Also, if the primary next header is not of an L4S type, simply treating the flow as RFC3168-compliant seems like a safe option.
3) Okay, that is a challenge, but if L4S is worth its salt, it will offer enough incentives to overcome this hurdle; otherwise, why waste ECT(1) on something that the market/the network community does not seem to want?
4)
Me: "Doctor it hurts if I put an AQM below the IP layer."
Physician: "Do not do that then!"
Honestly, how is an AQM below the IP layer (i.e., L1/L2) going to act on IP's ECN codepoints, as L4S requires, yet fail to look at the protocol/next-header field?

This would be a really clean solution to the problems with L4S's currently proposed, badly fitting classifier, and it would solve all the interoperability issues with the rest of the current internet.
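
As an illustration, a dual-queue classifier keyed on the protocol/next-header field could be as simple as the following sketch. The two L4S protocol numbers are hypothetical, since nothing has been assigned; 253/254 are the IANA experimental values, used as stand-ins:

```c
/* Hypothetical protocol numbers -- nothing has been assigned; 253/254
 * are the IANA experimental values, used here only as stand-ins. */
#define PROTO_L4S_TCP 253
#define PROTO_L4S_UDP 254

enum queue { QUEUE_CLASSIC, QUEUE_L4S };

/* Classify on the IPv4 Protocol / IPv6 (first) Next Header value.
 * Anything that is not one of the new L4S protocol IDs -- including a
 * chained IPv6 extension header -- is treated as RFC 3168 traffic and
 * goes to the classic queue, which is the safe default. */
static enum queue classify(unsigned char proto)
{
    if (proto == PROTO_L4S_TCP || proto == PROTO_L4S_UDP)
        return QUEUE_L4S;
    return QUEUE_CLASSIC;
}
```

Note that such a classifier never has to walk an IPv6 extension-header chain: if the first next header is not an L4S protocol ID, falling back to classic/RFC 3168 treatment is safe.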

[...]


Best Regards
	Sebastian

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-19 20:06                                     ` Black, David
@ 2019-07-19 20:44                                       ` Jonathan Morton
  2019-07-19 22:03                                         ` Sebastian Moeller
  2019-07-21 12:30                                       ` Bob Briscoe
  2019-07-21 12:30                                       ` Scharf, Michael
  2 siblings, 1 reply; 59+ messages in thread
From: Jonathan Morton @ 2019-07-19 20:44 UTC (permalink / raw)
  To: Black, David
  Cc: Wesley Eddy, Dave Taht, De Schepper, Koen (Nokia - BE/Antwerp),
	ecn-sane, tsvwg

> On 19 Jul, 2019, at 4:06 pm, Black, David <David.Black@dell.com> wrote:
> 
> To be clear on what I have in mind:
> 	o Unacceptable: All traffic marked with ECT(1) goes into the L4S queue, independent of what DSCP it is marked with.
> 	o Acceptable:  There's an operator-configurable list of DSCPs that support an L4S service - traffic marked with ECT(1) goes into the L4S queue if and only if that traffic is also marked with a DSCP that is on the operator's DSCPs-for-L4S list.

I take it, in the latter case, that this increases the cases in which L4S endpoints would need to detect that they are not receiving L4S signals, but RFC-3168 signals.  The current lack of such a mechanism therefore remains concerning.  For comparison, SCE inherently retains such a mechanism by putting the RFC-3168 and high-fidelity signals on different ECN codepoints.
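
For context on that last point, the two-bit ECN field admits only four codepoints, and the two proposals assign ECT(1) incompatible meanings. A sketch of the interpretations (per my reading of the respective drafts):

```c
/* The two-bit ECN field (RFC 3168) and how each scheme reads it.
 * Values are the real codepoints; the ECT(1) comments contrast the
 * RFC 3168, SCE, and L4S interpretations. */
enum ecn_codepoint {
    ECN_NOT_ECT = 0x0, /* not ECN-capable (all schemes agree) */
    ECN_ECT1    = 0x1, /* RFC 3168: equivalent to ECT(0);
                        * SCE: AQM-applied "some congestion experienced";
                        * L4S: sender-applied request for the L4S queue */
    ECN_ECT0    = 0x2, /* ECN-capable transport (all schemes agree) */
    ECN_CE      = 0x3, /* congestion experienced: RFC 3168's (and SCE's)
                        * strong signal; L4S's high-fidelity mark */
};
```

Because SCE's high-fidelity signal rides on ECT(1) in the network-to-sender direction, an SCE endpoint can always tell an RFC 3168 CE mark apart from the mild signal; an L4S endpoint receiving CE cannot tell which kind of AQM applied it.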

So I'm pleased to hear that the L4S team will be at the hackathon with a demo setup.  Hopefully we will be able to obtain comparative test results, using the same test scripts as we use on SCE, and also insert an RFC-3168 single queue AQM into their network to demonstrate what actually happens in that case.  I think that the results will be illuminating for all concerned.

 - Jonathan Morton

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-19 18:33                                   ` Wesley Eddy
  2019-07-19 20:03                                     ` Dave Taht
@ 2019-07-19 20:06                                     ` Black, David
  2019-07-19 20:44                                       ` Jonathan Morton
                                                         ` (2 more replies)
  2019-07-19 21:49                                     ` Sebastian Moeller
  2 siblings, 3 replies; 59+ messages in thread
From: Black, David @ 2019-07-19 20:06 UTC (permalink / raw)
  To: Wesley Eddy, Dave Taht, De Schepper, Koen (Nokia - BE/Antwerp)
  Cc: ecn-sane, tsvwg

Two comments as an individual, not as a WG chair:

> Mostly, they're things that an end-host algorithm needs
> to do in order to behave nicely, that might be good things anyways
> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
> work well w/ small RTT, be robust to reordering).  I am curious which
> ones you think are too rigid ... maybe they can be loosened?

[1] I have profoundly objected to L4S's RACK-like requirement (use time to detect loss, and in particular do not use 3DupACK) in public on multiple occasions, because in reliable transport space, that forces use of TCP Prague, a protocol with which we have little to no deployment or operational experience.  Moreover, that requirement raises the bar for other protocols in a fashion that impacts endpoint firmware, and possibly hardware in some important (IMHO) environments where investing in those changes delivers little to no benefit.  The environments that I have in mind include a lot of data centers.  Process wise, I'm ok with addressing this objection via some sort of "controlled environment" escape clause text that makes this RACK-like requirement inapplicable in a "controlled environment" that does not need that behavior (e.g., where 3DupACK does not cause problems and is not expected to cause problems).  

For clarity, I understand the multi-lane link design rationale behind the RACK-like requirement and would agree with that requirement in a perfect world ... BUT ... this world is not perfect ... e.g., 3DupACK will not vanish from "running code" anytime soon.

> > So to me, it goes back to slamming the door shut, or not, on L4S's usage
> > of ect(1) as a too easily gamed e2e identifier. As I don't think it and
> > all the dependent code and algorithms can possibly scale past a single
> > physical layer tech, I'd like to see it move to a DSCP codepoint, worst
> > case... and certainly remain "experimental" in scope until anyone
> > independent can attempt to evaluate it.
> 
> That seems good to discuss in regard to the L4S ID draft.  There is a
> section (5.2) there already discussing DSCP, and why it alone isn't
> feasible.  There's also more detailed description of the relation and
> interworking in
> https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-diffserv-02

[2] We probably should pay more attention to that draft.  One of the things that I think is important in that draft is a requirement that operators can enable/disable L4S behavior of ECT(1) on a per-DSCP basis - the rationale for that functionality starts with incremental deployment.   This technique may also have the potential to provide a means for L4S and SCE to coexist via use of different DSCPs for L4S vs. SCE traffic (there are some subtleties here, e.g., interaction with operator bleaching of DSCPs to zero at network boundaries).

To be clear on what I have in mind:
	o Unacceptable: All traffic marked with ECT(1) goes into the L4S queue, independent of what DSCP it is marked with.
	o Acceptable:  There's an operator-configurable list of DSCPs that support an L4S service - traffic marked with ECT(1) goes into the L4S queue if and only if that traffic is also marked with a DSCP that is on the operator's DSCPs-for-L4S list.
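
A minimal sketch of the acceptable variant (the function name and the operator list are illustrative, and where CE-marked packets go is a separate question omitted here):

```c
#include <stddef.h>

enum queue { QUEUE_CLASSIC, QUEUE_L4S };

#define ECN_ECT1 0x1

/* ECT(1) alone is not enough: the packet's DSCP must also be on the
 * operator-configured DSCPs-for-L4S list, otherwise the packet gets
 * classic treatment. */
static enum queue classify(unsigned char dscp, unsigned char ecn,
                           const unsigned char *l4s_dscps, size_t n)
{
    if (ecn == ECN_ECT1)
        for (size_t i = 0; i < n; i++)
            if (l4s_dscps[i] == dscp)
                return QUEUE_L4S;
    return QUEUE_CLASSIC;
}
```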

Reminder: This entire message is posted as an individual, not as a WG chair.

Thanks, --David

> -----Original Message-----
> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Wesley Eddy
> Sent: Friday, July 19, 2019 2:34 PM
> To: Dave Taht; De Schepper, Koen (Nokia - BE/Antwerp)
> Cc: ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
> Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts
> 
> 
> [EXTERNAL EMAIL]
> 
> On 7/19/2019 11:37 AM, Dave Taht wrote:
> > It's the common-q with AQM **+ ECN** that's the sticking point. I'm
> > perfectly satisfied with the behavior of every ietf approved single
> > queued AQM without ecn enabled. Let's deploy more of those!
> 
> Hi Dave, I'm just trying to make sure I'm reading into your message
> correctly ... if I'm understanding it, then you're not in favor of
> either SCE or L4S at all?  With small queues and without ECN, loss
> becomes the only congestion signal, which is not desirable, IMHO, or am
> I totally misunderstanding something?
> 
> 
> > If we could somehow create a neutral poll in the general networking
> > community outside the ietf (nanog, bsd, linux, dcs, bigcos, routercos,
> > ISPs small and large) , and do it much like your classic "vote for a
> > political measure" thing, with a single point/counterpoint section,
> > maybe we'd get somewhere.
> 
> While I agree that would be really useful, it's kind of an "I want a
> pony" statement.  As a TSVWG chair where we're doing this work, we've
> been getting inputs from people that have a foot in many of the
> communities you mention, but always looking for more.
> 
> 
> > In particular conflating "low latency" really confounds the subject
> > matter, and has for years. FQ gives "low latency" for the vast
> > majority of flows running below their fair share. L4S promises "low
> > latency" for a rigidly defined set of congestion controls in a
> > specialized queue, and otherwise tosses all flows into a higher latency
> > queue when one flow is greedy.
> 
> I don't think this is a correct statement.  Packets have to be from a
> "scalable congestion control" to get access to the L4S queue.  There are
> some draft requirements for using the L4S ID, but they seem pretty
> flexible to me.  Mostly, they're things that an end-host algorithm needs
> to do in order to behave nicely, that might be good things anyways
> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
> work well w/ small RTT, be robust to reordering).  I am curious which
> ones you think are too rigid ... maybe they can be loosened?
> 
> Also, I don't think the "tosses all flows into a higher latency queue
> when one flow is greedy" characterization is correct.  The other queue
> is for classic/non-scalable traffic, and not necessarily higher latency
> for a given flow, nor is winding up there related to whether another
> flow is greedy.
> 
> 
> > So to me, it goes back to slamming the door shut, or not, on L4S's usage
> > of ect(1) as a too easily gamed e2e identifier. As I don't think it and
> > all the dependent code and algorithms can possibly scale past a single
> > physical layer tech, I'd like to see it move to a DSCP codepoint, worst
> > case... and certainly remain "experimental" in scope until anyone
> > independent can attempt to evaluate it.
> 
> That seems good to discuss in regard to the L4S ID draft.  There is a
> section (5.2) there already discussing DSCP, and why it alone isn't
> feasible.  There's also more detailed description of the relation and
> interworking in
> https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-diffserv-02
> 
> 
> > I'd really like all the tcp-go-fast-at-any-cost people to take a year off to
> > dogfood their designs, and go live somewhere with a congested network to
> > deal with daily, like a railway or airport, or on a 3G network on a
> > sailboat or beach somewhere. It's not a bad life... REALLY.
> >
> >
> Fortunately, at least in the IETF, I don't think there have been
> initiatives in the direction of going fast at any cost in recent
> history, and they would be unlikely to be well accepted if there were!
> That is at least one place that there seems to be strong consensus.
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-19 18:33                                   ` Wesley Eddy
@ 2019-07-19 20:03                                     ` Dave Taht
  2019-07-19 22:09                                       ` Wesley Eddy
  2019-07-19 20:06                                     ` Black, David
  2019-07-19 21:49                                     ` Sebastian Moeller
  2 siblings, 1 reply; 59+ messages in thread
From: Dave Taht @ 2019-07-19 20:03 UTC (permalink / raw)
  To: Wesley Eddy
  Cc: Dave Taht, De Schepper, Koen (Nokia - BE/Antwerp), ecn-sane, tsvwg

On Fri, Jul 19, 2019 at 11:33 AM Wesley Eddy <wes@mti-systems.com> wrote:
>
> On 7/19/2019 11:37 AM, Dave Taht wrote:
> > It's the common-q with AQM **+ ECN** that's the sticking point. I'm
> > perfectly satisfied with the behavior of every ietf approved single
> > queued AQM without ecn enabled. Let's deploy more of those!
>
> Hi Dave, I'm just trying to make sure I'm reading into your message
> correctly ... if I'm understanding it, then you're not in favor of
> either SCE or L4S at all?

I am not in favor of internet scale deployment of *ECN* at this time.
For controlled networks it can make sense. I have, indeed, done so.

Of the two proposals for making ECN safer and more useful, SCE
struck me as superior when it appeared, while L4S has looked
totally undeployable for a half dozen reasons since it appeared, and
perpetually worse as more and more details and flaws fell out of the
architecture documents and were 'documented' rather than treated as
the showstoppers they were.

>  With small queues and without ECN, loss
> becomes the only congestion signal

RTT... BBR...

>, which is not desirable,

packet loss we know works with all the protocols we have on the
internet, not just tcp, and thus it is the most important congestion
indicator we have. Until fq_codel's essentially accidental deployment
of ecn-enablement, and apple then turning it on universally, we had
essentially no field data aside from those crazies (like me) that
fully deployed it on their corporate networks.

I do rather like SCE's addition of two new congestion signals and
retention of CE as a very strong one. I'd *really* like it,
additionally, if treating "drop and mark" as an even stronger
congestion indicator also became a thing.

And I'd like it if we did more transport level work (as is finally
happening) on just about everything and *dogfooded* the results on
real home and small business networks (as I do), and ran real
benchmarks with real loads concurrent, before unleashing such a change
to the internet.

> IMHO, or am
> I totally misunderstanding something?

Has it not been clear all these years that I don't care much for ECN
in the first place? Nor do the designers of codel? Nor everyone burned
by it the first time? That ecn advocacy is limited to a very small
passionate number of folk in the ietf?

Do any of the "ecn side" actually dogfood any of their ecn stuff, day
in and day out? I encouraged y'all years ago to convince one uni, one
lab, one reasonably large scale enterprise to go all-in on ecn, and
that has not happened? still?

Look at how much of that sort of testing went into ipv6 before it
started to deploy...

every time I give a talk to the more general networking public -
people that should know what I'm talking about - I have to go explain
ecn, in enormous detail.

One of the most basic side-effects of ecn enablement is that I also
had to ecn-enable the babel protocol so it doesn't get starved out on
slower links. This points to bad side effects on every non-tcp-enabled
routing protocol.

>
> > If we could somehow create a neutral poll in the general networking
> > community outside the ietf (nanog, bsd, linux, dcs, bigcos, routercos,
> > ISPs small and large) , and do it much like your classic "vote for a
> > political measure" thing, with a single point/counterpoint section,
> > maybe we'd get somewhere.
>
> While I agree that would be really useful, it's kind of an "I want a
> pony" statement.  As a TSVWG chair where we're doing this work, we've
> been getting inputs from people that have a foot in many of the
> communities you mention, but always looking for more.

Speaking as someone very fed up with the ietf, that did try to leave a
few months back - there is one sadly optional ietf process I like -
"running code, & two interoperable implementations", that I wish had
been applied to the entire l4s process long before it got to this
point.

public Ns2 and ns3 models of pie and codel were required in the aqm
group. So was independent testing.

In the L4S process we'd also made the strong suggestion that the L4S
team go the openwrt route, just as we did for fq_codel, to be able to
look at real world problems we encountered there, like TSO/GRO batching
and non-tcp applications. We still don't have anything even close to
that. L4S is essentially at a pre-2011 state in terms of its real
effects on real networks and legacy applications.

Wanting that basic stuff, *running* long before it is standardized is
not "I want a pony", it's "you want a unicorn".

"doing the work" includes doing basic stuff like that. to me it's
utterly required to have done that work before inflicting it on even
the tiniest portion of the internet. I have no idea why some ietfers
don't seem to get this.

Anyway, I'm on the verge of losing my temper again, and I really
should just stay clear of these discussions, and steer clear of the
meetings, and try to just read summary reports and code. I rather
liked the early SCE results that went by on some thread here or
another in the past week or two; even the single queue ones looked
promising, and the FQ one was to die for.....

I'm looking forward, as I've always said throughout these processes,
for *RUNNING CODE* and a chance to independently evaluate the various
new ideas on real gear. My personal and principal goal is to make wifi
(and other wireless internet tech)  work better, or at least - not
work worse - that what has already been deployed in 10s of millions in
the fq_codel for wifi work.

I would like it very much if the tsvwg chairs decided to enforce the
"running code, two interoperable implementations, and independent
testability" requirements that I have - and that the old ietf I used
to like used to have - on both L4S and SCE, and the transport mods
under test - and even then the ect(1) dispute needs to be resolved
soon.

Is there any chance we'll see my conception of the good ietf process
enforced on the L4S and SCE processes by the chairs?

I'd sleep better to then focus on what I do best, which is blowing up
ideas in the real world and making them good enough to use across the
internet.

>
>
> > In particular conflating "low latency" really confounds the subject
> > matter, and has for years. FQ gives "low latency" for the vast
> > majority of flows running below their fair share. L4S promises "low
> > latency" for a rigidly defined set of congestion controls in a
> > specialized queue, and otherwise tosses all flows into a higher latency
> > queue when one flow is greedy.
>
> I don't think this is a correct statement.  Packets have to be from a
> "scalable congestion control" to get access to the L4S queue.  There are

No, they just have to mark the right bit.

No requirement to be from a scalable congestion control is *enforceable*.

So I'd never say "packets have to be from a scalable congestion
control", but rather "they have to set the right bit".

as for the other part, I'd re-say:

"and otherwise toss all  "normal" (classic) flows into a higher
latency classic queue when one normal flow is greedy."

I don't think "have to be from a scalable congestion control" is a
correct statement. What part about how any application can, from
userspace, set:

    const int ds = 0x01;        /* ECN field = ECT(1) -- "Yea! let's abuse L4S!" */
    rc = setsockopt(s, IPPROTO_IPV6, IPV6_TCLASS, &ds, sizeof(ds));

is unclear?
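
Filled out into a self-contained sketch (Linux sockets API; how the ECN bits behave under IPV6_TCLASS can vary by OS and by transport, so take this as illustrative):

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask the kernel for ECT(1) on an ordinary, unprivileged UDP socket
 * and return the traffic class it reports back, or -1 on error. */
static int set_ect1_tclass(void)
{
    int s = socket(AF_INET6, SOCK_DGRAM, 0);
    if (s < 0)
        return -1;

    const int ds = 0x01;  /* low two bits of tclass = ECT(1) */
    if (setsockopt(s, IPPROTO_IPV6, IPV6_TCLASS, &ds, sizeof(ds)) != 0) {
        close(s);
        return -1;
    }

    int tclass = -1;
    socklen_t len = sizeof(tclass);
    (void)getsockopt(s, IPPROTO_IPV6, IPV6_TCLASS, &tclass, &len);
    close(s);
    return tclass;
}
```

No privileges, no congestion-control requirements: any userspace application can request the codepoint.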

> some draft requirements for using the L4S ID, but they seem pretty
> flexible to me.  Mostly, they're things that an end-host algorithm needs
> to do in order to behave nicely, that might be good things anyways
> without regard to L4S in the network (coexist w/ Reno, avoid RTT bias,
> work well w/ small RTT, be robust to reordering).  I am curious which
> ones you think are too rigid ... maybe they can be loosened?

no, I don't think they are rigid enough to actually work against
mixed, real workloads!

> Also, I don't think the "tosses all flows into a higher latency queue
> when one flow is greedy" characterization is correct.  The other queue
> is for classic/non-scalable traffic, and not necessarily higher latency

"Classic" is *normal* traffic. Roughly 100% of the traffic that exists
today falls into that queue.

So I should have said - "tosses all normal ("classic") flows into a
single and higher latency queue when a greedy normal flow is present"
... "in the dualpi" case? I know it's possible to hang a different
queue algo on the "normal" queue, but
to this day I don't see the need for the l4s "fast lane" in the first
place, nor a cpu efficient way of doing the right things with the
dualpi or curvyred code. What I see, is, long term, that special bit
just becomes a "fast" lane for any sort of admission controlled
traffic the ISP wants to put there, because the dualpi idea fails on
real traffic.

In my future public statements on this I'm going to give up entirely
on the newspeak.

> for a given flow, nor is winding up there related to whether another
> flow is greedy.

I'm not sure if we were talking about the same thing, but I agree that
what I wrote above was originally unclear, especially if you're wedded
to the dualq concept.

>
> > So to me, it goes back to slamming the door shut, or not, on L4S's usage
> > of ect(1) as a too easily gamed e2e identifier. As I don't think it and
> > all the dependent code and algorithms can possibly scale past a single
> > physical layer tech, I'd like to see it move to a DSCP codepoint, worst
> > case... and certainly remain "experimental" in scope until anyone
> > independent can attempt to evaluate it.
>
> That seems good to discuss in regard to the L4S ID draft.  There is a
> section (5.2) there already discussing DSCP, and why it alone isn't
> feasible.  There's also more detailed description of the relation and
> interworking in
> https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-diffserv-02

It's kind of a showstopping problem, I think, for anything but a well
controlled network.

Ship some code, do some tests, let some other people at it, get some
real results, starting with flent's rrul tests.

>
>
> > I'd really like all the tcp-go-fast-at-any-cost people to take a year off to
> > dogfood their designs, and go live somewhere with a congested network to
> > deal with daily, like a railway or airport, or on a 3G network on a
> > sailboat or beach somewhere. It's not a bad life... REALLY.
> >
> Fortunately, at least in the IETF, I don't think there have been
> initiatives in the direction of going fast at any cost in recent
> history, and they would be unlikely to be well accepted if there were!
> That is at least one place that there seems to be strong consensus.

Well if the various WGs would exit that nice hotel, and form a
diaspora over the city in coffee shops and other public spaces, and do
some tests of your latest and greatest stuff, y'all might get a more
accurate viewpoint of what you are actually accomplishing. Take a look
at what BBR does, take a look at what IW10 does, take a look at what
browsers currently do.

IETF design and testing is overly driven by overly simple tests, and
not enough by real world traffic effects.

I'm not coming to this meeting and I'm not on the tsvwg list.

I'd wanted the ecn-sane list to be a nice quiet spot to be able to
think clearly about how to fix the enormous fq_codel deployment -
particularly on wifi - if we had to - far more than I'd wanted to get
embroiled in the l4s debate.

Is there any chance we'll see my conception of the good ietf process
enforced on both the L4S and SCE processes by the chairs?


>
>
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-19 15:37                                 ` Dave Taht
@ 2019-07-19 18:33                                   ` Wesley Eddy
  2019-07-19 20:03                                     ` Dave Taht
                                                       ` (2 more replies)
  2019-07-22 16:28                                   ` Bless, Roland (TM)
  1 sibling, 3 replies; 59+ messages in thread
From: Wesley Eddy @ 2019-07-19 18:33 UTC (permalink / raw)
  To: Dave Taht, De Schepper, Koen (Nokia - BE/Antwerp); +Cc: ecn-sane, tsvwg

On 7/19/2019 11:37 AM, Dave Taht wrote:
> It's the common-q with AQM **+ ECN** that's the sticking point. I'm
> perfectly satisfied with the behavior of every ietf approved single
> queued AQM without ecn enabled. Let's deploy more of those!

Hi Dave, I'm just trying to make sure I'm reading into your message 
correctly ... if I'm understanding it, then you're not in favor of 
either SCE or L4S at all?  With small queues and without ECN, loss 
becomes the only congestion signal, which is not desirable, IMHO, or am 
I totally misunderstanding something?


> If we could somehow create a neutral poll in the general networking
> community outside the ietf (nanog, bsd, linux, dcs, bigcos, routercos,
> ISPs small and large) , and do it much like your classic "vote for a
> political measure" thing, with a single point/counterpoint section,
> maybe we'd get somewhere.

While I agree that would be really useful, it's kind of an "I want a 
pony" statement.  As a TSVWG chair where we're doing this work, we've 
been getting inputs from people that have a foot in many of the 
communities you mention, but always looking for more.


> In particular conflating "low latency" really confounds the subject
> matter, and has for years. FQ gives "low latency" for the vast
> majority of flows running below their fair share. L4S promises "low
> latency" for a rigidly defined set of congestion controls in a
> specialized queue, and otherwise tosses all flows into a higher latency
> queue when one flow is greedy.

I don't think this is a correct statement.  Packets have to be from a 
"scalable congestion control" to get access to the L4S queue.  There are 
some draft requirements for using the L4S ID, but they seem pretty 
flexible to me.  Mostly, they're things that an end-host algorithm needs 
to do in order to behave nicely, that might be good things anyways 
without regard to L4S in the network (coexist w/ Reno, avoid RTT bias, 
work well w/ small RTT, be robust to reordering).  I am curious which 
ones you think are too rigid ... maybe they can be loosened?

Also, I don't think the "tosses all flows into a higher latency queue 
when one flow is greedy" characterization is correct.  The other queue 
is for classic/non-scalable traffic, and not necessarily higher latency 
for a given flow, nor is winding up there related to whether another 
flow is greedy.


> So to me, it goes back to slamming the door shut, or not, on L4S's usage
> of ect(1) as a too easily gamed e2e identifier. As I don't think it and
> all the dependent code and algorithms can possibly scale past a single
> physical layer tech, I'd like to see it move to a DSCP codepoint, worst
> case... and certainly remain "experimental" in scope until anyone
> independent can attempt to evaluate it.

That seems good to discuss in regard to the L4S ID draft.  There is a 
section (5.2) there already discussing DSCP, and why it alone isn't 
feasible.  There's also more detailed description of the relation and 
interworking in 
https://tools.ietf.org/html/draft-briscoe-tsvwg-l4s-diffserv-02


> I'd really like all the tcp-go-fast-at-any-cost people to take a year off to
> dogfood their designs, and go live somewhere with a congested network to
> deal with daily, like a railway or airport, or on a 3G network on a
> sailboat or beach somewhere. It's not a bad life... REALLY.
>
Fortunately, at least in the IETF, I don't think there have been 
initiatives in the direction of going fast at any cost in recent 
history, and they would be unlikely to be well accepted if there were!  
That is at least one place that there seems to be strong consensus.



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-19  9:06                               ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-19 15:37                                 ` Dave Taht
@ 2019-07-19 17:59                                 ` Sebastian Moeller
  1 sibling, 0 replies; 59+ messages in thread
From: Sebastian Moeller @ 2019-07-19 17:59 UTC (permalink / raw)
  To: De Schepper, Koen (Nokia - BE/Antwerp)
  Cc: Holland, Jake, Jonathan Morton, ecn-sane, tsvwg

Hi Koen,



> On Jul 19, 2019, at 11:06, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> 
> Hi Sebastian,
> 
> To avoid people to read through the long mail, I think the main point I want to make is:
> "Indeed, having common-Qs supported is one of my requirements. That's why I want to keep the discussion on that level: is there consensus that low latency is only needed for a per flow FQ system with an AQM per flow?"
> 
> If there is this consensus, this means that we can use SCE and that from now on, all network nodes have to implement per flow queuing with an AQM per flow.

	Well, stated with this exclusivity I would say this is wrong. As always, only a few nodes along a path actually develop queues in the first place, and only those need to implement a competent AQM. As a data point from real life, I employ an fq-shaper for both ingress and egress traffic on my CPE, and almost all of my latency-under-load issues improved to a level where I do not care anymore; of the remaining issues, most are/were caused by my ISP's peerings/transits to the other endpoint of a connection running "hot". And as stated in this thread already, I do not see any of our proposals reaching the transit/peering routers, for lack of a monetary incentive for those that would need to operate AQMs on such devices.
	On monetary incentives, I add that, even though it is not one of L4S's stated goals, it looks like a reasonable match for the "special services" exemption carved out in the EU's network neutrality regulations. I do not want to go into a political discussion about special services here, but just note that this is one option for ISPs to monetize a special low-latency service tier (as L4S aims to deliver); but even in that case the ISPs are at best incentivized to build L4S-empowered links into their own data centers and for paid peerings, which still does not address the issue of general peering/transit routers IMHO.

> If there is no consensus, we cannot use SCE and need to use L4S.

	I am always very wary of these kind on "tertium non datur" arguments, as if L4S and SCE would be the only options to tackle the issue (sure those are the two alternatives in the table right now, but that is a different argument).

> 
> For all the other detailed discussion topics, see [K] inline:
> 
> Regards,
> Koen.
> 
> -----Original Message-----
> From: Sebastian Moeller <moeller0@gmx.de> 
> Sent: Thursday, July 18, 2019 12:40 AM
> To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>
> Cc: Holland, Jake <jholland@akamai.com>; Jonathan Morton <chromatix99@gmail.com>; ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
> Subject: Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
> 
> Dear Koen,
> 
> 
>> On Jul 10, 2019, at 11:00, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> [...]
>>>> Are you saying that even if a scalable FQ can be implemented in high-volume aggregated links at the same cost and difficulty as dualq, there's a reason not to use FQ?
>> 
>> FQ for "per-user" isolation in access equipment has clearly an extra cost, not? If we need to implement FQ "per-flow" on top, we need 2 levels of FQ (per-user and per-user-flow, so from thousands to millions of queues). Also, I haven’t seen DC switches coming with an FQ AQM...
> 
> 	I believe there is work available demonstrating that a) millions of concurrently active flows might be overly pessimistic (even for peering routers) and b) IMHO it is far from settled that these big transit/peering routers will employ any of the schemes we are cooking up here. For b) I argue that both L4S "linear" CE-marking and SCE linear ECT(1) marking will give a clear signal of overload that a big ISP might not want to explicitly tell its customers...
> 
> [K] a) indeed, if queues can be dynamically allocated you could settle with less, not sure if dynamic allocation is compatible with high speed implementations. Anyway, any additional complexity is additional cost (and cycles, energy, heat dissipation, ...). Of course everything can be done...

	Great that we agree here, this is all about trade-offs.

> b) I don't agree that ECN is a signal of overload.

	Rereading RFC 3168, I believe a CE mark is merited only if the packet would otherwise be dropped. IMHO that will most likely be caused by the node running out of some limited resource (bandwidth and/or CPU cycles); sure, it can be a policy decision as well, but I fail to see how such subtleties matter in our discussion.
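
	[Editor's aside: the RFC 3168 rule paraphrased above can be sketched in a few lines. This is illustrative Python, not code from any real AQM; the probabilistic congestion decision is an assumption standing in for whatever logic (RED, CoDel, ...) the node actually runs.]

```python
import random

def aqm_decision(packet_ect, congestion_prob, rng=random.random):
    """Sketch of the RFC 3168 rule: when an AQM decides a packet
    needs a congestion action, an ECT packet is CE-marked instead
    of being dropped.  Returns 'forward', 'mark', or 'drop'."""
    if rng() >= congestion_prob:
        return "forward"  # no congestion action for this packet
    return "mark" if packet_ect else "drop"

# With zero congestion probability everything is forwarded untouched;
# at probability 1, ECT packets are marked where non-ECT ones are dropped.
print(aqm_decision(True, 0.0))   # forward
print(aqm_decision(True, 1.0))   # mark
print(aqm_decision(False, 1.0))  # drop
```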


> It is a natural part of feedback to tell greedy TCP that it reached its full capacity. Excessive drop/latency

	Well, excessive latency often correlates with overload, but IMHO is not causally linked to it (and hence I believe all schemes trying to deduce overload from latency increases under load are _not_ looking at the right measure).


> is the signal of overload and an ECN-capable AQM switches from ECN to drop anyway in overload conditions.  

	This is the extreme situation, like in L4S when the 20ms queue limit gets exceeded and head- or tail-dropping starts?

> Excessive drop and latency can also be measured today, not?

	Well, only if you have a reasonable prior for what the drop rate and latency variation are under normal conditions. And even then one needs time to get the measurement error down to the desired level; in other words, that seems sub-optimal for a tight control loop.
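
	[Editor's aside: a back-of-envelope sketch of the "one needs time" point, assuming an i.i.d. binomial loss model and illustrative drop rates; the function name is made up.]

```python
import math

def probes_needed(p, rel_err):
    """Probes needed so that the standard error of an estimated
    drop rate p (i.i.d. binomial loss model) is rel_err * p."""
    return math.ceil(p * (1 - p) / (rel_err * p) ** 2)

# Pinning down a 1% drop rate to +/-10% (one standard error) already
# takes roughly ten thousand probes; a 0.1% drop rate takes ~100k --
# far too slow for a tight per-RTT control loop.
n_1pct = probes_needed(0.01, 0.1)
n_01pct = probes_needed(0.001, 0.1)
print(n_1pct, n_01pct)
```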

> Running a few probes can tell customers the same with or without ECN, and capacity is measured simply with speedtests.

	Running probes is a) harder than it seems (as the probes should run against the servers of interest) and b) requires probes sent over the reverse path as well (so one needs looking-glass servers close to the endpoints of interest). And speedtests are a whole different can of worms: most end-user-accessible speedtests severely under-report the details necessary to actually assess a link's properties even at rest, IMHO.

> 
>> 
>>>> Is there a use case where it's necessary to avoid strict isolation if strict isolation can be accomplished as cheaply?
>> 
>> Even if as cheaply, as long as there is no reliable flow identification, it clearly has side effects. Many homeworkers are using a VPN tunnel, which is only one flow encapsulating maybe dozens.
> 
> 	Fair enough, but why do you see a problem with treating this multiplexed flow like any other flow? After all, it was the endpoints' conscious decision to masquerade as a single flow, so why assume special treatment? It is not that intermediate hops have any insight into the multiplexing, so why expect them to cater to it?
> 
> [K] Because the design of VPN tunnels had as a main goal to maintain a secure/encrypted connection between clients and servers, trying to minimize the overhead on clients and servers by using a single TCP/UDP connection. I don't think the single flow was chosen to get treated as one flow's budget of throughput. This "feature" didn't exist at that (pre-FQ) time.

	Well, in pre-FQ times there was no guarantee whatsoever, so claiming this is an insurmountable problem seems a bit naive to me. Using IPv6 flow labels or multiple flows are both options for dealing with an FQ world. I see this as a non-serious strawman argument.

> 
>> Drop and ECN (if implemented correctly) are tunnel agnostic.
> 
> 	Exactly, and that is true for each identified flow as well, so fq does not diminish this, but rather builds on top of it.
> 
> [K] True for flows within a tunnel, but the point was that FQ treats the aggregated tunnel as a single flow compared to other single flows.

	And so does L4S... (modulo queue protection, but that seems to act only on packet ingress and leave already-queued packets alone). But yes, tunneling has side effects; don't do it if you dislike these.

> 
>> Also how flows are identified might evolve (new transport protocols, encapsulations, ...?).
> 
> 	You are jesting, surely, new protocols? We are in this kerfuffle because you claim that a new protocol signaling a linear CE-marking response would be made of unobtainium, so you want to abuse an underused ECN code point as a classifier. If new protocols are an option, just bite the bullet and give tcp-reno a new protocol number and use that as your L4S classifier; problem solved in a nice and clean fashion.
> 
> [K] Indeed, it is hardly possible to deploy new protocols in practice, but I hope we can make it more possible in the future, not less possible... Maybe utopian, but at least we should try to learn from past mistakes.

	So, why not use a new protocol for the L4S behaviour then? If L4S truly is the bee's knees, it will drive adoption of the new protocol; and if not, that also tells us something about the market's assessment of L4S's promises.

> 
>> Also if strict flow isolation could be done correctly, it has additional issues related to missed scheduling opportunities,
> 
> 	Please elaborate: how would an intermediate hop know about the desires of the endpoints here? As far as I can tell such hops have their own ideas about optimal scheduling that they will enforce independent of what the endpoints deem optimal (by necessity, as most endpoints will desire highest priority for their packets).
> 
> [K] That network nodes cannot know what the end-systems want is exactly the point. FQ just assumes everybody should have the same throughput,

	Which has the great advantage of being predictable by the enduser.

> and makes an exception for single packets (to undo the most flagrant disadvantage of this strategy).

	Sorry, IMHO this one-packet rule assures forward progress for all flows and is a feature, not a kludge. But I guess I am missing something in your argument; care to elaborate?


>  But again, I don't want to let the discussion get distracted by arguing pro or con FQ. I think we have to live with both now.
> 
> [...]
> 
>>>> Anyway, to me this discussion is about the tradeoffs between the 2 proposals.  It seems to me SCE has some safety advantages that should not be thrown away lightly, 
>> 
>> I appreciate the efforts of trying to improve L4S, but nobody working on L4S for years now sees a way that SCE can work on a non-FQ system.
> 
> 	That is a rather peculiar argument, especially given that both you and Bob, major forces behind the L4S approach, seem to have philosophical issues with FQ?
> 
> [K] I think I am realistic enough to accept the pros and cons and the existence of both. I think wanting only FQ is as philosophical as wanting no FQ at all.

	Nobody wants you to switch your design away from DualQ or whatever else you might prefer, as long as your choice does not have side effects on the rest of the internet; use a real classifier instead of trying to press ECT(1) into service where a full bit is required, and the issue is solved. My point is, again: I already use an FQ system on my CPE, which gets me quite close to what L4S promises, but without redesigning most of the internet. So from my perspective FQ has proven itself already; now the newcomer L4S needs to demonstrate sufficient improvement over the existing FQ solution to merit the non-backward-compatible changes it mandates. And I do want to see a fair competition between the options (and will happily switch to L4S if it proves to be superior) under fair conditions.

> 
>> For me (and I think many others) it is a no-go to only support FQ. Unfortunately we only have half a bit free, 
> 
> 	??? Again, you elaborately state the options in the L4S RFC and then just converge on the one which is most convenient, but also not the best match for your requirements.
> 
> [K] Indeed, having common-Qs supported is one of my requirements.

	Misunderstanding here: I am not talking about DualQ/common-Q or mandating FQ everywhere, but about the fact that you committed to (ab)using ECT(1) as your "classifier" of choice even though this has severe side effects...


> That's why I want to keep the discussion on that level: is there consensus that low latency is only needed for a per flow FQ system with an AQM per flow?

	This is a strawman argument as far as I am concerned, as all I want is that L4S be orthogonal to the existing internet. As the L4S RFCs verbosely describe, there are other options for the required classification, so why insist upon using ECT(1)?

> 
>> and we need to choose how to use it. Would you choose for the existing ECN switches that cannot be upgraded (are there any?) or for all future non-FQ systems.
>> 
>>>> so if the performance can be made equivalent, it would be good to know about it before committing the codepoint.
>> 
>> The performance in FQ is clearly equivalent, but for a common-Q behavior, only L4S can work.
>> As far as I understood the SCE-LFQ proposal is actually a slower FQ implementation (an FQ in DualQ disguise 😉), so I think not really a better alternative than pure FQ. Also its single AQM on the bulk queue will undo any isolation, as a coupled AQM is stronger than any scheduler, including FQ.
> 
> 	But how would the bulk queue actually care, being dedicated to bulk flows? This basically just uses a single CoDel instance for all flows in the bulk queue, exactly the situation CoDel was designed for, if I recall correctly. Sure, this will run into problems with unresponsive flows, but no more than DualQ with or without queue protection (you can steer misbehaving flows into the "classic" queue, but this will just change which flows suffer most of the collateral damage of that unresponsive flow, IMHO).
> 
> [K] As far as I recall, CoDel works best for a single flow.

	As would any other AQM on a single queue... The point is that the AQM really wants to target those flows that cause most of the traffic (as throttling those will cause the most immediate reduction in ingress rate at the AQM hop). FQ presents those flows on a platter; single-queue AQMs rely on stochastic truths, like the likelihood of marking/dropping a flow's packets being proportional to that flow's fraction of the packets in the queue. As far as I can tell DualQ works on exactly the same (stochastic marking) principle and hence also will work best for a single flow (sure, due to the higher marking probability this might not be as pronounced as with RED and CoDel, but theoretically it is still there). I might be confused about DualQ, so please correct me if my assumption is wrong.
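
	[Editor's aside: the proportionality claim is easy to check with a toy simulation. Illustrative Python; the flow names, shares, and marking probability are made up, and per-packet independent marking is a simplification of any real AQM.]

```python
import random

def mark_counts(queue, mark_prob, rng):
    """Single-queue AQM sketch that marks each packet independently
    with probability mark_prob; returns marks per flow."""
    counts = {}
    for flow in queue:
        if rng.random() < mark_prob:
            counts[flow] = counts.get(flow, 0) + 1
    return counts

# Flow A contributes 90% of the queued packets, flow B only 10%.
queue = ["A"] * 9000 + ["B"] * 1000
random.Random(2).shuffle(queue)

marks = mark_counts(queue, 0.2, random.Random(1))
share = marks["A"] / (marks["A"] + marks["B"])
print(marks, round(share, 2))  # flow A collects roughly 90% of the marks
```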


> For a stateless AQM like a step function using only per-packet sojourn time, a common AQM over FQs indeed works like an FQ with an AQM per queue. Applying a stateless AQM to Classic traffic (like sojourn-time RED without smoothing) will have an impact on its performance. Adding common state for all bulk-queue AQMs will disable the FQ effect. Anyway, the sequential scan at dequeue is the main reason why LFQ will struggle to get traction in high-speed equipment.

	I believe this to be directed at Jonathan, so no comment from my side.

Best Regards
	Sebastian

> 
> 
> Best Regards
> 	Sebastian Moeller


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]     Comments on L4S drafts
  2019-07-19  9:06                               ` De Schepper, Koen (Nokia - BE/Antwerp)
@ 2019-07-19 15:37                                 ` Dave Taht
  2019-07-19 18:33                                   ` Wesley Eddy
  2019-07-22 16:28                                   ` Bless, Roland (TM)
  2019-07-19 17:59                                 ` Sebastian Moeller
  1 sibling, 2 replies; 59+ messages in thread
From: Dave Taht @ 2019-07-19 15:37 UTC (permalink / raw)
  To: De Schepper, Koen (Nokia - BE/Antwerp); +Cc: Sebastian Moeller, ecn-sane, tsvwg

"De Schepper, Koen (Nokia - BE/Antwerp)"
<koen.de_schepper@nokia-bell-labs.com> writes:

> Hi Sebastian,
>
> To spare people reading through the long mail, I think the main point I want to make is:
>  "Indeed, having common-Qs supported is one of my requirements. That's

It's the common-q with AQM **+ ECN** that's the sticking point. I'm
perfectly satisfied with the behavior of every ietf approved single
queued AQM without ecn enabled. Let's deploy more of those!

> why I want to keep the discussion on that level: is there consensus
> that low latency is only needed for a per flow FQ system with an AQM
> per flow?"

Your problem statement elides the ECN bit.

If there is any one point that I'd like to see resolved about L4S
vs SCE, it's having a vote on its use of ECT(1) as an e2e
identifier.

The poll I took in my communities (after trying really hard for years to
get folk to take a look at the architecture without bias), ran about
98% against the L4S usage of ect(1), in the lwn article and in every
private conversation since.

The SCE proposal for this half a bit, as an additional congestion
signal supplied by the AQM, is vastly superior.

If we could somehow create a neutral poll in the general networking
community outside the ietf (nanog, bsd, linux, dcs, bigcos, routercos,
ISPs small and large) , and do it much like your classic "vote for a
political measure" thing, with a single point/counterpoint section,
maybe we'd get somewhere.

>
> If there is this consensus, this means that we can use SCE and that
> from now on, all network nodes have to implement per flow queuing with
> an AQM per flow.

There is no "we" here, and this is not a binary set of choices.

In particular conflating "low latency" really confounds the subject
matter, and has for years. FQ gives "low latency" for the vast
majority of flows running below their fair share. L4S promises "low
latency" for a rigidly defined set of congestion controls in a
specialized queue, and otherwise tosses all flows into a higher latency
queue when one flow is greedy.

The "ultra low queuing latency *for all*" marketing claptrap that l4S
had at one point really stuck in my craw.

0) There is a "we" that likes L4S in all its complexity and missing
integrated running code that demands total ECN deployment on one
physical medium (so far), a change to the definition of ECN itself, and
uses up ect(1) e2e instead of a dscp.

1) There is a "we" that has a highly deployed fq+aqm that happens to
have an ECN response, that is providing some of the lowest latencies
ever seen, live on the internet, across multiple physical mediums.

With a backward compatible proposal to do better, that uses up ect(1) as
an additional congestion notifier by the AQM.

2) There is a VERY large (silent) majority that wants nothing to do with
ECN at all and long ago fled the ietf, and works on things like RTT and
other metrics that don't need anything extra at the IP layer.

3) There is a vastly larger majority that has never even heard of AQM,
much less ECN, and doesn't care.

> If there is no consensus, we cannot use SCE and need to use L4S.

No.

If there is no consensus, we just keep motoring on with the existing
pie (with drop) deployments, and fq_codel/fq_pie/sch_cake more or less
as is... and continued refinement of transports and more research.

We've got a few billion devices that could use just what we got to get
orders of magnitude improvements in network delay.

And:

If there is consensus on fq+aqm+sce - ECN remains *optional*
which is an outcome I massively support, also.

So repeating this:

> If there is this consensus, this means that we can use SCE and that
> from now on, all network nodes have to implement per flow queuing with
> an AQM per flow.

It's not a binary choice as you lay it out.

1) Just getting FIFO queue sizes down to something reasonable - would be
GREAT. It still blows my mind that CMTSes still have 700ms of buffering at
100Mbit, 8 years into this debate.

2) only the network nodes most regularly experiencing human visible
congestive events truly need any form of AQM or FQ. In terms of what I
observe, thats:

ISP uplinks
Wifi (at ISP downlink speeds > 40Mbit)
3G/4G/5G
ISP downlinks
Other in-home devices like ethernet over powerline

I'm sure others in the DC and interconnects see things differently.

I know I'm weird, but I'd like to eliminate congestion *humans* see,
rather than what skynet sees. Am I the only one that thinks this way?

3) we currently have a choice between multiple single queue, *non ECN*
enabled aqms that DO indeed work - pretty well - without any ECN support
enabled - pie, red, dualpi without using the ect identifier, cake
(cobalt). We never got around to making codel work better on a single
queue because we didn't see the point, but what's in cobalt could go
there if anyone cares.

We have a couple very successful fq+aqm combinations, *also*, that
happen to have an RFC3168 ECN response.

4) as for ECN enabled AQMs - single queued, dual q'd, or FQ'd, there's
plenty of problems remaining with all of them and their transports, that
make me very dubious about internet-wide deployment. Period. No matter
what happens here, I am going to keep discouraging the linux distros as
a whole to turn it on without first addressing the long list of items in
the ecn-sane design group's work list.

...

So to me, it goes back to slamming the door shut, or not, on L4S's usage
of ect(1) as a too easily gamed e2e identifier. As I don't think it and
all the dependent code and algorithms can possibly scale past a single
physical layer tech, I'd like to see it move to a DSCP codepoint, worst
case... and certainly remain "experimental" in scope until anyone
independent can attempt to evaluate it. 

second door I'd like to slam shut is redefining CE to be a weaker signal
of congestion as L4S does. I'm willing to write a whole bunch of
standards track RFCs obsoleting the experimental RFCs allowing this, if
that's what it takes. Bufferbloat is still a huge problem! Can we keep
working on fixing that?

third door I'd like to see open is the possibilities behind SCE.

Lastly:

I'd really like all the tcp-go-fast-at-any-cost people to take a year off to
dogfood their designs, and go live somewhere with a congested network to
deal with daily, like a railway or airport, or on 3G network on a
sailboat or beach somewhere. It's not a bad life... REALLY.

In fact, it's WAY cheaper than attending 3 ietf conferences a year.

Enjoy Montreal!

Sincerely,

Dave Taht
From my sailboat in Alameda

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-17 22:40                             ` Sebastian Moeller
@ 2019-07-19  9:06                               ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-19 15:37                                 ` Dave Taht
  2019-07-19 17:59                                 ` Sebastian Moeller
  0 siblings, 2 replies; 59+ messages in thread
From: De Schepper, Koen (Nokia - BE/Antwerp) @ 2019-07-19  9:06 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: Holland, Jake, Jonathan Morton, ecn-sane, tsvwg

Hi Sebastian,

To spare people reading through the long mail, I think the main point I want to make is:
 "Indeed, having common-Qs supported is one of my requirements. That's why I want to keep the discussion on that level: is there consensus that low latency is only needed for a per flow FQ system with an AQM per flow?"

If there is this consensus, this means that we can use SCE and that from now on, all network nodes have to implement per flow queuing with an AQM per flow.
If there is no consensus, we cannot use SCE and need to use L4S.

For all the other detailed discussion topics, see [K] inline:

Regards,
Koen.

-----Original Message-----
From: Sebastian Moeller <moeller0@gmx.de> 
Sent: Thursday, July 18, 2019 12:40 AM
To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>
Cc: Holland, Jake <jholland@akamai.com>; Jonathan Morton <chromatix99@gmail.com>; ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
Subject: Re: [Ecn-sane] [tsvwg] Comments on L4S drafts

Dear Koen,


> On Jul 10, 2019, at 11:00, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
[...]
>>> Are you saying that even if a scalable FQ can be implemented in high-volume aggregated links at the same cost and difficulty as dualq, there's a reason not to use FQ?
> 
> FQ for "per-user" isolation in access equipment clearly has an extra cost, no? If we need to implement FQ "per-flow" on top, we need 2 levels of FQ (per-user and per-user-flow, so from thousands to millions of queues). Also, I haven’t seen DC switches coming with an FQ AQM...

	I believe there is work available demonstrating that a) millions of concurrently active flows might be overly pessimistic (even for peering routers) and b) IMHO it is far from settled that these big transit/peering routers will employ any of the schemes we are cooking up here. For b) I argue that both L4S "linear" CE-marking and SCE linear ECT(1) marking will give a clear signal of overload that a big ISP might not want to explicitly give its customers...

[K] a) indeed, if queues can be dynamically allocated you could settle with less, not sure if dynamic allocation is compatible with high speed implementations. Anyway, any additional complexity is additional cost (and cycles, energy, heat dissipation, ...). Of course everything can be done...
b) I don't agree that ECN is a signal of overload. It is a natural part of feedback to tell greedy TCP that it reached its full capacity. Excessive drop/latency is the signal of overload and an ECN-capable AQM switches from ECN to drop anyway in overload conditions.  Excessive drop and latency can also be measured today, not? Running a few probes can tell customers the same with or without ECN, and capacity is measured simply with speedtests.

> 
>>> Is there a use case where it's necessary to avoid strict isolation if strict isolation can be accomplished as cheaply?
> 
> Even if as cheaply, as long as there is no reliable flow identification, it clearly has side effects. Many homeworkers are using a VPN tunnel, which is only one flow encapsulating maybe dozens.

	Fair enough, but why do you see a problem with treating this multiplexed flow like any other flow? After all, it was the endpoints' conscious decision to masquerade as a single flow, so why assume special treatment? It is not that intermediate hops have any insight into the multiplexing, so why expect them to cater to it?

[K] Because the design of VPN tunnels had as a main goal to maintain a secure/encrypted connection between clients and servers, trying to minimize the overhead on clients and servers by using a single TCP/UDP connection. I don't think the single flow was chosen to get treated as one flow's budget of throughput. This "feature" didn't exist at that (pre-FQ) time.

> Drop and ECN (if implemented correctly) are tunnel agnostic.

	Exactly, and that is true for each identified flow as well, so fq does not diminish this, but rather builds on top of it.

[K] True for flows within a tunnel, but the point was that FQ treats the aggregated tunnel as a single flow compared to other single flows.

> Also how flows are identified might evolve (new transport protocols, encapsulations, ...?).

	You are jesting, surely, new protocols? We are in this kerfuffle because you claim that a new protocol signaling a linear CE-marking response would be made of unobtainium, so you want to abuse an underused ECN code point as a classifier. If new protocols are an option, just bite the bullet and give tcp-reno a new protocol number and use that as your L4S classifier; problem solved in a nice and clean fashion.

[K] Indeed, it is hardly possible to deploy new protocols in practice, but I hope we can make it more possible in the future, not less possible... Maybe utopian, but at least we should try to learn from past mistakes.

> Also if strict flow isolation could be done correctly, it has additional issues related to missed scheduling opportunities,

	Please elaborate: how would an intermediate hop know about the desires of the endpoints here? As far as I can tell such hops have their own ideas about optimal scheduling that they will enforce independent of what the endpoints deem optimal (by necessity, as most endpoints will desire highest priority for their packets).

[K] That network nodes cannot know what the end-systems want is exactly the point. FQ just assumes everybody should have the same throughput, and makes an exception for single packets (to undo the most flagrant disadvantage of this strategy). But again, I don't want to let the discussion get distracted by arguing pro or con FQ. I think we have to live with both now.

[...]

>>> Anyway, to me this discussion is about the tradeoffs between the 2 proposals.  It seems to me SCE has some safety advantages that should not be thrown away lightly, 
> 
> I appreciate the efforts of trying to improve L4S, but nobody working on L4S for years now sees a way that SCE can work on a non-FQ system.

	That is a rather peculiar argument, especially given that both you and Bob, major forces behind the L4S approach, seem to have philosophical issues with FQ?

[K] I think I am realistic enough to accept the pros and cons and the existence of both. I think wanting only FQ is as philosophical as wanting no FQ at all.

> For me (and I think many others) it is a no-go to only support FQ. Unfortunately we only have half a bit free, 

	??? Again, you elaborately state the options in the L4S RFC and then just converge on the one which is most convenient, but also not the best match for your requirements.

[K] Indeed, having common-Qs supported is one of my requirements. That's why I want to keep the discussion on that level: is there consensus that low latency is only needed for a per flow FQ system with an AQM per flow?

> and we need to choose how to use it. Would you choose for the existing ECN switches that cannot be upgraded (are there any?) or for all future non-FQ systems.
> 
>>> so if the performance can be made equivalent, it would be good to know about it before committing the codepoint.
> 
> The performance in FQ is clearly equivalent, but for a common-Q behavior, only L4S can work.
> As far as I understood the SCE-LFQ proposal is actually a slower FQ implementation (an FQ in DualQ disguise 😉), so I think not really a better alternative than pure FQ. Also its single AQM on the bulk queue will undo any isolation, as a coupled AQM is stronger than any scheduler, including FQ.

	But how would the bulk queue actually care, being dedicated to bulk flows? This basically just uses a single CoDel instance for all flows in the bulk queue, exactly the situation CoDel was designed for, if I recall correctly. Sure, this will run into problems with unresponsive flows, but no more than DualQ with or without queue protection (you can steer misbehaving flows into the "classic" queue, but this will just change which flows suffer most of the collateral damage of that unresponsive flow, IMHO).

[K] As far as I recall, CoDel works best for a single flow. For a stateless AQM like a step function using only per-packet sojourn time, a common AQM over FQs indeed works like an FQ with an AQM per queue. Applying a stateless AQM to Classic traffic (like sojourn-time RED without smoothing) will have an impact on its performance. Adding common state for all bulk-queue AQMs will disable the FQ effect. Anyway, the sequential scan at dequeue is the main reason why LFQ will struggle to get traction in high-speed equipment.


Best Regards
	Sebastian Moeller

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-10  9:00                           ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-10 13:14                             ` Dave Taht
@ 2019-07-17 22:40                             ` Sebastian Moeller
  2019-07-19  9:06                               ` De Schepper, Koen (Nokia - BE/Antwerp)
  1 sibling, 1 reply; 59+ messages in thread
From: Sebastian Moeller @ 2019-07-17 22:40 UTC (permalink / raw)
  To: De Schepper, Koen (Nokia - BE/Antwerp)
  Cc: Holland, Jake, Jonathan Morton, ecn-sane, tsvwg

Dear Koen,


> On Jul 10, 2019, at 11:00, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
[...]
>>> Are you saying that even if a scalable FQ can be implemented in high-volume aggregated links at the same cost and difficulty as dualq, there's a reason not to use FQ?
> 
> FQ for "per-user" isolation in access equipment clearly has an extra cost, no? If we need to implement FQ "per-flow" on top, we need 2 levels of FQ (per-user and per-user-flow, so from thousands to millions of queues). Also, I haven’t seen DC switches coming with an FQ AQM...

	I believe there is work available demonstrating that a) millions of concurrently active flows might be overly pessimistic (even for peering routers) and b) IMHO it is far from settled that these big transit/peering routers will employ any of the schemes we are cooking up here. For b) I argue that both L4S "linear" CE-marking and SCE linear ECT(1) marking will give a clear signal of overload that a big ISP might not want to explicitly give its customers...

> 
>>> Is there a use case where it's necessary to avoid strict isolation if strict isolation can be accomplished as cheaply?
> 
> Even if as cheaply, as long as there is no reliable flow identification, it clearly has side effects. Many homeworkers are using a VPN tunnel, which is only one flow encapsulating maybe dozens.

	Fair enough, but why do you see a problem with treating this multiplexed flow like any other flow? After all, it was the endpoints' conscious decision to masquerade as a single flow, so why assume special treatment? It is not that intermediate hops have any insight into the multiplexing, so why expect them to cater to it?

> Drop and ECN (if implemented correctly) are tunnel agnostic.

	Exactly, and that is true for each identified flow as well, so fq does not diminish this, but rather builds on top of it.


> Also how flows are identified might evolve (new transport protocols, encapsulations, ...?).

	You are jesting, surely, new protocols? We are in this kerfuffle because you claim that a new protocol signaling a linear CE-marking response would be made of unobtainium, so you want to abuse an underused ECN code point as a classifier. If new protocols are an option, just bite the bullet and give tcp-reno a new protocol number and use that as your L4S classifier; problem solved in a nice and clean fashion.

> Also if strict flow isolation could be done correctly, it has additional issues related to missed scheduling opportunities,

	Please elaborate: how would an intermediate hop know about the desires of the endpoints here? As far as I can tell such hops have their own ideas about optimal scheduling that they will enforce independent of what the endpoints deem optimal (by necessity, as most endpoints will desire highest priority for their packets).

[...]

>>> Anyway, to me this discussion is about the tradeoffs between the 2 proposals.  It seems to me SCE has some safety advantages that should not be thrown away lightly, 
> 
> I appreciate the efforts of trying to improve L4S, but nobody working on L4S for years now sees a way that SCE can work on a non-FQ system.

	That is a rather peculiar argument, especially given that both you and Bob, major forces behind the L4S approach, seem to have philosophical issues with FQ?

> For me (and I think many others) it is a no-go to only support FQ. Unfortunately we only have half a bit free, 

	??? Again, you elaborately state the options in the L4S RFC and then just converge on the one which is most convenient, but also not the best match for your requirements.

> and we need to choose how to use it. Would you choose for the existing ECN switches that cannot be upgraded (are there any?) or for all future non-FQ systems.
> 
>>> so if the performance can be made equivalent, it would be good to know about it before committing the codepoint.
> 
> The performance in FQ is clearly equivalent, but for a common-Q behavior, only L4S can work.
> As far as I understood the SCE-LFQ proposal is actually a slower FQ implementation (an FQ in DualQ disguise 😉), so I think not really a better alternative than pure FQ. Also its single AQM on the bulk queue will undo any isolation, as a coupled AQM is stronger than any scheduler, including FQ.

	But how would the bulk queue actually care, being dedicated to bulk flows? This basically just uses a single codel instance for all flows in the bulk queue, exactly the situation codel was designed for, if I recall correctly. Sure, this will run into problems with unresponsive flows, but not any more than DualQ with or without queue protection (you can steer misbehaving flows into the "classic" queue, but this will just change which flows suffer most of the collateral damage from that unresponsive flow, IMHO).


Best Regards
	Sebastian Moeller

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-10 13:14                             ` Dave Taht
@ 2019-07-10 17:32                               ` De Schepper, Koen (Nokia - BE/Antwerp)
  0 siblings, 0 replies; 59+ messages in thread
From: De Schepper, Koen (Nokia - BE/Antwerp) @ 2019-07-10 17:32 UTC (permalink / raw)
  To: Dave Taht; +Cc: Holland, Jake, Jonathan Morton, ecn-sane, tsvwg

[-- Attachment #1: Type: text/plain, Size: 21448 bytes --]

Hi Dave,

Sorry for your lunch, but maybe I’ve cut away too much context, as I think some of your responses are not really about the discussion point.

In general I see that we both agree that FQ has pros and cons, and is deployed and useful, so no need for further discussion on FQ. The actual discussion is on whether we still need to support low latency on non-FQ systems, or whether low latency is only a privilege of FQ systems.

>>> so if the performance can be made equivalent, it would be good to know about it before committing the codepoint.
>>
>>The performance in FQ is clearly equivalent,
>
>Huh?

My point was that SCE on FQ can give equivalent results as L4S on FQ, and I think everyone agrees here too.
But I want to make clear that SCE is only working with FQ with an AQM per Q:

>> but for a common-Q behavior, only L4S can work. As far as I understood the SCE-LFQ proposal is actually
>> a slower FQ implementation (an FQ in DualQ disguise 😉), so I think not really a better alternative than
>> pure FQ. Also its single AQM on the bulk queue will undo any isolation, as a coupled AQM is stronger than
>> any scheduler, including FQ. Don't underestimate the power of congestion control 😉. The ultimate proof
>> is in the DualQ Coupled AQM where congestion control can beat a priority scheduler. If you want FQ to
>> have effect, you need to have an AQM per FQ... The authors will notice this when they implement an AQM
>> on top of it. I saw the current implementation works only in taildrop mode. But I think it is very good that
>> the SCE proponents are very motivated to try with this speed to improve L4S. I'm happy to be proven wrong,
>> but up to now I don't see any promising improvements to justify delay for L4S, only the above alternative
>> compromise. Agreed that we can continue exploring alternative proposal in parallel though.
>
> I cannot parse this extreme set of assumptions and declarations. "taildrop mode??"

Context: Common-Q behavior is one common Q or set of common Qs (like DualQ) with one
coupled AQM which doesn’t want to identify every flow, but only traffic classes (Classic or L4S).

If you re-read the section again with this context, you will better understand that this is not about FQ (we agree that
both L4S and SCE work) but about the LFQ (light-weight-FQ) proposal that seems to claim to be a DualQ, but is
actually an FQ which needs more time to select a packet at dequeue. It also has a common AQM on top of all bulk
virtual-FQ-queues. As you probably agree, you need an AQM per queue if you want to benefit from FQ or congestion
control will take over and FQ behaves like a single Q. This is especially important if the congestion controls are not
compatible, because you need to identify the traffic classes to give a differentiated AQM treatment to the different
classes, hence the need for L4S...
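The claim above that "a coupled AQM is stronger than any scheduler" rests on the DualQ coupling law. A minimal sketch of that coupling follows; the k=2 default and all names are illustrative, not taken from this thread:

```python
def coupled_probabilities(p_prime: float, k: float = 2.0) -> tuple:
    """DualQ coupling sketch: one base probability p' drives both queues.

    The L4S queue marks linearly (k * p'); the Classic queue drops/marks
    with the square (p'^2). Since Reno/Cubic's rate goes roughly as
    1/sqrt(p) and a scalable (Prague-style) flow's rate goes as 1/p,
    the square-vs-linear coupling approximately equalizes throughput,
    which is how the congestion control "beats" the priority scheduler.
    """
    p_l4s = min(1.0, k * p_prime)        # linear marking for L4S queue
    p_classic = min(1.0, p_prime ** 2)   # squared drop/mark for Classic queue
    return p_l4s, p_classic
```

For example, a base probability of 0.1 yields 20% L4S marking but only 1% Classic drop, so both flow types settle at comparable rates despite the very different signal intensities.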

I hope this clarifies,
Koen.



From: Dave Taht <dave.taht@gmail.com>
Sent: Wednesday, July 10, 2019 3:15 PM
To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>
Cc: Holland, Jake <jholland@akamai.com>; Jonathan Morton <chromatix99@gmail.com>; ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
Subject: Re: [Ecn-sane] [tsvwg] Comments on L4S drafts

I keep trying to stay out of this conversation being yellow about ecn in the first place, in any form. I would like to stress that
ecn-sane was formed by the group of folk that were concerned about having accidentally masterminded the world's biggest fq + aqm
deployment, and the only one with ecn support, which happens

In the case of wifi, the deployment is now in the 10s of millions, and doing hordes of good - latencies measured in the 10s of ms rather than 10s of seconds.

I have seen no numbers on how well l4s will make it over to wifi as yet, nor any discussion, and I would rather like more pieces of the l4s solution to land sufficiently integrated for testing using tools like flent, and over far more than just an isochronous mac layer like dsl or docsis. Given the size of a txop in wifi (5.3ms), and how far back we have
to put the AQM and FQ components today (2 txops), I don't think many of either SCE or L4S concepts will work well on wifi... but in general
I prefer not to make assertions or assumptions until real-world testing can commence.

I am presently at the battlemesh conference trying to get a bit of real-world data.

A big problem wifi and 3g have is too many retransmits at the mac layer, not congestion controlled. Any signalling gets there late, and it's
better to drop a bunch of packets when you hit a bunch of retransmits, in general. IMHO.

On Wed, Jul 10, 2019 at 2:05 AM De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com<mailto:koen.de_schepper@nokia-bell-labs.com>> wrote:
Hi Jake,

>> I agree the key question for this discussion is about how best to get low latency for the internet.
Thanks

>> under the L4S approach for ECT(1), we can achieve it with either dualq or fq at the bottleneck, but under the SCE approach we can only do it with fq at the bottleneck.
Correct

>> we agree that in neither case can very low latency be achieved with a classic single queue with classic bandwidth-seeking traffic
Correct, not without compromising latency for Prague or throughput/utilization/stability/drop for Reno/Cubic

>> Are you saying that even if a scalable FQ can be implemented in high-volume aggregated links at the same cost and difficulty as dualq, there's a reason not to use FQ?

FQ for "per-user" isolation in access equipment has clearly an extra cost, not?

I've argued in the past that hashing is a bog standard part of most network cards and switches already.

"extra cost" should be measured by actual measurements. Usually when you do those, you find it's another variable entirely costing you the most
cpu/circuits.


If we need to implement FQ "per-flow" on top, we need 2 levels of FQ (per-user and per-user-flow, so from thousands to millions of queues). Also, I haven’t seen DC switches coming with an FQ AQM...

Meh. Most of the time the instantaneous number of queues, for some definition of instantaneous, is in the low hundreds for rates up to
10GigE. We don't have a lot of data for bigger pipes.

I haven't seen any DC switches that support anything other than RED or AFD, and DC folk overprovision anyway.


>> Is there a use case where it's necessary to avoid strict isolation if strict isolation can be accomplished as cheaply?

Even if as cheaply, as long as there is no reliable flow identification, it clearly has side effects. Many homeworkers are using a VPN tunnel, which is only one flow encapsulating maybe dozens.

This is true. For a local endpoint for a vpn from a router fq_codel long ago gained support for doing the hashing & FQ before entering the tunnel.

This works only with in-kernel ipsec transports although I've been trying to get it added to wireguard for a long time now.

 It of course doesn't apply to the whole path, but when applied at the home gateway router (bottleneck link), works rather well.

Here are two examples of that mechanism in play.

http://www.taht.net/~d/ipsec_fq_codel/oldqos.png

http://www.taht.net/~d/ipsec_fq_codel/newqos.png

Drop and ECN (if implemented correctly) are tunnel agnostic. Also how flows are identified might evolve (new transport protocols, encapsulations, ...?). Also if strict flow isolation could be done correctly, it has additional issues related to missed scheduling opportunities, besides it is a hard-coded throughput policy (and even mice size = 1 packet). On the other hand, flow isolation has benefits too, so hard to rule out one of them, not?

The packet dissector in linux is quite robust, the one in BSD, less so.

A counterpoint to the entire ECN debate (l4s or sce) that I'd like to make at more length is that it can and does hurt non ecn'd flows, particularly at lower
bandwidths when you cannot reduce cwnd below 2 and the link is thus saturated. ARP can starve. IS-IS fails. batman - lacking an IP header - can starve.
babel, lacking ecn support, can start to fail. And so on.


>> Also, I think if the SCE position is "low latency can only be achieved with FQ", that's different from "forcing only FQ on the internet", provided the fairness claims hold up, right?  (Classic single queue AQMs may still have a useful place in getting pretty-good latency in the cheapest hardware, like maybe PIE with marking.)

Are you saying that the real good stuff can only be for FQ 😉? Fairness between a flow getting only one signal and another getting 2 is an issue, right? The one with the 2 signals can either ignore one, listen half to both, or try to smooth both signals to find the average loudest one? Again, safety or performance needs to be chosen. PIE or PI2 is optimal for Classic traffic and good to couple congestion to Prague traffic, but Prague traffic needs a separate Q and an immediate step to get the "good stuff" working. Otherwise it will also overshoot, respond sluggishly, etc...

>> Anyway, to me this discussion is about the tradeoffs between the 2 proposals.  It seems to me SCE has some safety advantages that should not be thrown away lightly,

I appreciate the efforts of trying to improve L4S, but nobody working on L4S for years now sees a way that SCE can work on a non-FQ system. For me (and I think many others) it is a no-go to only support FQ. Unfortunately we only have half a bit free, and we need to choose how to use it. Would you choose the existing ECN switches that cannot be upgraded (are there any?) or all future non-FQ systems?


>> so if the performance can be made equivalent, it would be good to know about it before committing the codepoint.

The performance in FQ is clearly equivalent,

Huh?

but for a common-Q behavior, only L4S can work. As far as I understood the SCE-LFQ proposal is actually a slower FQ implementation (an FQ in DualQ disguise 😉), so I think not really a better alternative than pure FQ. Also its single AQM on the bulk queue will undo any isolation, as a coupled AQM is stronger than any scheduler, including FQ. Don't underestimate the power of congestion control 😉. The ultimate proof is in the DualQ Coupled AQM where congestion control can beat a priority scheduler. If you want FQ to have effect, you need to have an AQM per FQ... The authors will notice this when they implement an AQM on top of it. I saw the current implementation works only in taildrop mode. But I think it is very good that the SCE proponents are very motivated to try with this speed to improve L4S. I'm happy to be proven wrong, but up to now I don't see any promising improvements to justify delay for L4S, only the above alternative compromise. Agreed that we can continue exploring alternative proposal in parallel though.

I cannot parse this extreme set of assumptions and declarations. "taildrop mode??"

As for promising improvements in general, there is a 7 year old deployment, running code, of something that we've shown to work well in a variety
of network scenarios, with 10x-100x improvements in network latency, at roughly 100% in linux overall, widely used in wifi and in many, many SQM/Qos systems and containers, with basic rfc3168 ecn enabled... and a proposal for a backward compatible way of enhancing that still more being explored. The embedded hardware pipeline
for future implementations of this tech is full - it would take 3+ years to make a course change....

vs something that still has no real-world deployment data at all, that changes the definition of ecn, that has no public ns2 or ns3 model (?), no testing aside from a few
very specific benchmarks, and so on...

I do hope the coding competition heats up more, with more running code that others can explore, most of all. I long ago tired of the endless debates, as everyone knows,
and I do kind of wish I wasn't burning lunch on this email instead of setting up a test at battlemesh.

I note also that my leaning, in an fq_codel'd world (were it to stay such), was to enable more RTT-based CCs like BBR to work more often in an RTT mode. Originally, to me, the SCE idea was a way to trigger a faster switch to congestion avoidance, as most of my captures taken from overused APs in
restaurants, cafes, train stations etc. show stuff in slow start to be the biggest problem. And regardless, an initial CE, right now, is a strong indicator that fq_codel is present, so
an RTT-based tcp can then start to happen, and a good one would not have many further marks after the first.

A big difference in our outlooks, I guess, is that my viewpoint is that most of the congestion is at the edges of the network and I don't care all that
much about big iron or switches, and I don't think either can afford much aqm tech at all in the first place. Not dual queues, not fqs.

Were L4S not to deploy (using ect1 as a marker - btw, I think CS5 might be a better candidate as it goes into the wifi VI queue), and a fq_pie/fq_codel/sch_cake
world to remain predominant, well, we might get somewhere, faster, where it counted.

Koen.


-----Original Message-----
From: Holland, Jake <jholland@akamai.com<mailto:jholland@akamai.com>>
Sent: Monday, July 8, 2019 10:56 PM
To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com<mailto:koen.de_schepper@nokia-bell-labs.com>>; Jonathan Morton <chromatix99@gmail.com<mailto:chromatix99@gmail.com>>
Cc: ecn-sane@lists.bufferbloat.net<mailto:ecn-sane@lists.bufferbloat.net>; tsvwg@ietf.org<mailto:tsvwg@ietf.org>
Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts

Hi Koen,

I'm a bit confused by this response.

I agree the key question for this discussion is about how best to get low latency for the internet.

If I'm reading your message correctly, you're saying that under the L4S approach for ECT(1), we can achieve it with either dualq or fq at the bottleneck, but under the SCE approach we can only do it with fq at the bottleneck.

(I think I understand and roughly agree with this claim, subject to some caveats.  I just want to make sure I've got this right so far, and that we agree that in neither case can very low latency be achieved with a classic single queue with classic bandwidth-seeking
traffic.)

Are you saying that even if a scalable FQ can be implemented in high-volume aggregated links at the same cost and difficulty as dualq, there's a reason not to use FQ?  Is there a use case where it's necessary to avoid strict isolation if strict isolation can be accomplished as cheaply?

Also, I think if the SCE position is "low latency can only be achieved with FQ", that's different from "forcing only FQ on the internet", provided the fairness claims hold up, right?  (Classic single queue AQMs may still have a useful place in getting pretty-good latency in the cheapest hardware, like maybe PIE with
marking.)

Anyway, to me this discussion is about the tradeoffs between the
2 proposals.  It seems to me SCE has some safety advantages that should not be thrown away lightly, so if the performance can be made equivalent, it would be good to know about it before committing the codepoint.

Best regards,
Jake

On 2019-07-08, 03:26, "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com<mailto:koen.de_schepper@nokia-bell-labs.com>> wrote:

    Hi Jonathan,

    From your responses below, I have the impression you think this discussion is about FQ (flow/fair queuing). Fair queuing is used today where strict isolation is wanted, like between subscribers, and by extension (if possible and preferred) on a per transport layer flow, like in Fixed CPEs and Mobile networks. No discussion about this, and assuming we have and still will have an Internet which needs to support both common queues (like DualQ is intended) and FQs, I think the only discussion point is how we want to migrate to an Internet that optimally supports Low Latency.

    This leads us to the question L4S or SCE?

    If we want to support low latency for both common queues and FQs we "NEED" L4S, if we need to support it only for FQs, we "COULD" use SCE too, and if we want to force the whole Internet to use only FQs, we "SHOULD" use SCE 😉. If your goal is to force only FQs in the Internet, then let this be clear... I assume we need a discussion on another level in that case (and to be clear, it is not a goal I can support)...

    Koen.


    -----Original Message-----
    From: Jonathan Morton <chromatix99@gmail.com<mailto:chromatix99@gmail.com>>
    Sent: Friday, July 5, 2019 10:51 AM
    To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com<mailto:koen.de_schepper@nokia-bell-labs.com>>
    Cc: Bob Briscoe <ietf@bobbriscoe.net<mailto:ietf@bobbriscoe.net>>; ecn-sane@lists.bufferbloat.net<mailto:ecn-sane@lists.bufferbloat.net>; tsvwg@ietf.org<mailto:tsvwg@ietf.org>
    Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts

    > On 5 Jul, 2019, at 9:46 am, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com<mailto:koen.de_schepper@nokia-bell-labs.com>> wrote:
    >
    >>> 2: DualQ can be defeated by an adversary, destroying its ability to isolate L4S traffic.
    >
    > Before jumping to another point, let's close down your original issue. Since you didn't mention, I assume that you agree with the following, right?
    >
    >        "You cannot defeat a DualQ" (at least no more than a single Q)

    I consider forcibly degrading DualQ to single-queue mode to be a defeat.  However…

    >>> But that's exactly the problem.  Single queue AQM does not isolate L4S traffic from "classic" traffic, so the latter suffers from the former's relative aggression in the face of AQM activity.
    >
    > With L4S a single queue can differentiate between Classic and L4S traffic. That's why it knows exactly how to treat the traffic. For Non-ECT and ECT(0) square the probability, and for ECT(1) don't square, and it works exactly like a DualQ, but then without the latency isolation. Both types get the same throughput, AND delay. See the PI2 paper, which is exactly about a single Q.

    Okay, this is an important point: the real assertion is not that DualQ itself is needed for L4S to be safe on the Internet, but for differential AQM treatment to be present at the bottleneck.  Defeating DualQ only destroys L4S' latency advantage over "classic" traffic.  We might actually be making progress here!
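Koen's single-queue rule quoted above ("square the probability for Non-ECT and ECT(0), don't square for ECT(1)") can be sketched as follows; the function shape and names are mine, not lifted from the PI2 paper:

```python
import random

def pi2_treatment(codepoint: str, p: float, rng=random.random) -> str:
    """Differential treatment in a single queue, per the PI2 idea.

    p is the linear probability from the PI controller. Classic traffic
    (Non-ECT and ECT(0)) sees p squared; L4S traffic (ECT(1)) sees p
    directly, matching each congestion control's response curve.
    """
    if codepoint == "ECT(1)":                  # L4S: linear marking
        return "CE" if rng() < p else codepoint
    if codepoint == "ECT(0)":                  # Classic ECN: squared, mark
        return "CE" if rng() < p * p else codepoint
    return "drop" if rng() < p * p else codepoint  # Non-ECT: squared, drop
```

At p = 0.1, an L4S packet is marked 10% of the time while a Classic packet is marked or dropped only 1% of the time, which is the "exactly like a DualQ, but without the latency isolation" behaviour described above.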

    > I agree you cannot isolate in a single Q, and this is why L4S is better than SCE, because it tells the AQM what to do, even if it has a single Q. SCE needs isolation, L4S not.

    Devil's advocate time.  What if, instead of providing differential treatment WRT CE marking, PI2 instead applied both marking strategies simultaneously - the higher rate using SCE, and the lower rate using CE?  Classic traffic would see only the latter; L4S could use the former.
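That "both marking strategies simultaneously" idea could look roughly like the sketch below; p_sce > p_ce, and all names are hypothetical:

```python
import random

def two_signal_mark(codepoint: str, p_ce: float, p_sce: float,
                    rng=random.random) -> str:
    """One queue, two parallel signals, as proposed above.

    CE is applied at the lower rate p_ce, and every flow responds to it;
    the SCE signal (remarking ECT(0) to ECT(1)) is applied at the higher
    rate p_sce, and only SCE-aware senders respond to it.
    """
    if rng() < p_ce:
        return "CE"                      # the slow, universal signal
    if codepoint == "ECT(0)" and rng() < p_sce:
        return "ECT(1)"                  # the fast, SCE-only signal
    return codepoint
```

Classic senders never see the high-rate signal, so they get the familiar sparse CE feedback; SCE-aware senders get the dense early signal on top of it.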

    > We tried years ago similar things like needed for SCE, and found that it can't work. For throughput fairness you need the squared relation between the 2 signals, but with SCE, you need to apply both signals in parallel, because you don't know the sender type.

    Yes, that's exactly what we do - and it does work.

    >   - So either the sender needs to ignore CE if it gets SCE, or ignore SCE if you get CE. The first is dangerous if you have multiple bottlenecks, and the second is defeating the purpose of SCE. Any other combination leads to unfairness (double response).

    This is a false dichotomy.  We quickly realised both of those options were unacceptable, and sought a third way.

    SCE senders apply a reduced CE response when also responding to parallel SCE feedback, roughly in line with ABE, on the grounds that responding to SCE does some of the necessary reduction already.  The reduced response is still a Multiplicative Decrease, so it fits with normal TCP congestion control principles.
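A rough sketch of that reduced response follows; the beta constants and step size are illustrative placeholders in the spirit of ABE, not values from the SCE drafts:

```python
def on_ce(cwnd: float, sce_active: bool) -> float:
    """Multiplicative decrease on CE, reduced when parallel SCE feedback
    is already doing part of the reduction (ABE-style). Still an MD."""
    beta = 0.7 if sce_active else 0.5    # illustrative constants
    return cwnd * beta

def on_sce(cwnd: float, sce_fraction: float) -> float:
    """Gentle proportional reduction driven by the fraction of acked
    segments carrying SCE feedback (illustrative step size)."""
    return cwnd * (1.0 - 0.125 * sce_fraction)
```

The key property is that the CE response remains multiplicative in both cases, so the combined behaviour stays within normal TCP congestion control principles rather than ignoring either signal outright.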

    >   - you separate the signals in queue depth, first applying SCE and later CE, as you originally proposed, but that results in starvation for SCE.

    Yes, although this approach gives the best performance for SCE when used with flow isolation, or when all flows are known to be SCE-aware.  So we apply this strategy in those cases, and move the SCE marking function up to overlap CE marking specifically for single queues.

    It has been suggested that single queue AQMs are rare in any case, but this approach covers that corner case.

    > Add on top that SCE makes it impossible to use DualQ, as you cannot differentiate the traffic types.

    SCE is designed around not *needing* to differentiate the traffic types.  Single queues have known disadvantages, and SCE doesn't worsen them.

    Meanwhile, we have proposed LFQ to cover the DualQ use case.  I'd be interested in hearing a principled critique of it.

     - Jonathan Morton



_______________________________________________
Ecn-sane mailing list
Ecn-sane@lists.bufferbloat.net<mailto:Ecn-sane@lists.bufferbloat.net>
https://lists.bufferbloat.net/listinfo/ecn-sane


--

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

[-- Attachment #2: Type: text/html, Size: 34027 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-04 13:45         ` Bob Briscoe
@ 2019-07-10 17:03           ` Holland, Jake
  0 siblings, 0 replies; 59+ messages in thread
From: Holland, Jake @ 2019-07-10 17:03 UTC (permalink / raw)
  To: Bob Briscoe; +Cc: Luca Muscariello, ecn-sane, tsvwg

Hi Bob,

<JH>Responses inline...</JH>

From: Bob Briscoe <ietf@bobbriscoe.net>
Date: 2019-07-04 at 06:45

Nonetheless, when an unresponsive flow(s) is consuming some capacity, and a responsive flow(s) takes the total over the available capacity, then both are responsible in proportion to their contribution to the queue, 'cos the unresponsive flow didn't respond (it didn't even try to).

This is why it's OK to have a small unresponsive flow, but it becomes less and less OK to have a larger and larger unresponsive flow. 

<JH>
Right, this is a big part of the point I'm trying to make here.
Some of the video systems are sending a substantial-sized flow which
is not responsive at the transport layer.

However, that doesn't mean it's entirely unresponsive.  These often
do respond in practice at the application layer, but by observing
some quality of experience threshold from the video rendering.

Part of this quality of experience signal comes from the delay
fluctuation caused by the queueing delay when the link is overloaded,
but running the video through a low-latency queue would remove that
fluctuation, and thus change it from something that would cut over
to a lower bit-rate or remove the video into something that wouldn't.

At the same time, the app benefits from removing that fluctuation--
it gets to deliver a better video quality successfully.  When its
owners test it comparatively, they'll find they have an incentive
to add the marking, and their customers will have an incentive to
adopt that solution over solutions that don't, leading to an arms
race that progressively pushes out more of the responsive traffic.

My claim is that the lack of admission control is what makes this
arms race possible, by removing an important source of backpressure
on apps like this relative to today's internet (or one that does a
stricter fair-share-based degradation at bottlenecks).
</JH>

There's no bandwidth benefit. 
There's only latency benefit, and then the only benefits are:
• the low latency behaviour of yourself and other flows behaving like you
• and, critically, isolation from those flows not behaving well like you. 
Neither give an incentive to mismark - you get nothing if you don't behave. And there's a disincentive for 'Classic' TCP flows to mismark, 'cos they badly underutilize without a queue.

<JH>
It's typical for Non-responsive flows to get benefits from lower
latency.

I actually do (with caveats) agree that flows that don't respond
to transient congestion should be fine, as long as they use no more
than their fair share of the capacity.  However, by removing the
backpressure without adding something to prevent them from using
more than their fair share, it sets up perverse incentives that
push the ecosystem toward congestion collapse.

The Queue protection mechanism you mentioned might be sufficient
for this, but I'm having a hard time understanding the claim that
it's not required.

It seems to me in practice it will need to be used whenever there's
a chance that non-responsive flows can consume more than their
share, which chance we can reasonably expect will grow naturally
if L4S deployment grows.
</JH>

1/ The q-prot mechanism certainly has the disadvantage that it has to access L4 headers. But it is much more lightweight than FQ.

...

That's probably not understandable. Let me write it up properly - with some explanatory pictures and examples.

<JH>
I thought it was a reasonable summary and thanks for the
quick explanation (not to discourage writing it up properly,
which would also be good).

In short, it sounds to me like if we end up agreeing that Q
protection is required in L4S with dualq (a point currently
disputed, acked), and if the lfq draft holds up to scrutiny
(also a point to be determined), then it means:

The per-bucket complexity overhead comparison for the 2 proposed
architectures (L4S vs. SCE-based) would be 1 int per hash bucket
for dualq, vs. 2 ints + 1 bit per hash bucket for lfq.  And if so,
these overhead comparisons at the bottleneck queues can be taken
as a roughly fair comparison to weigh against other considerations.

Does that sound approximately correct?
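To make that comparison concrete, the per-bucket state would amount to something like the following; the field names are hypothetical, only the counts (1 int vs. 2 ints + 1 bit) come from the summary above:

```python
from dataclasses import dataclass

@dataclass
class DualQBucket:
    """Queue-protection state per hash bucket in DualQ: one integer."""
    congestion_credit: int = 0

@dataclass
class LFQBucket:
    """LFQ state per hash bucket: two integers plus one bit."""
    backlog: int = 0
    deficit: int = 0
    sparse_flag: bool = False
```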

Best regards,
Jake
</JH>



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-10  9:00                           ` De Schepper, Koen (Nokia - BE/Antwerp)
@ 2019-07-10 13:14                             ` Dave Taht
  2019-07-10 17:32                               ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-17 22:40                             ` Sebastian Moeller
  1 sibling, 1 reply; 59+ messages in thread
From: Dave Taht @ 2019-07-10 13:14 UTC (permalink / raw)
  To: De Schepper, Koen (Nokia - BE/Antwerp)
  Cc: Holland, Jake, Jonathan Morton, ecn-sane, tsvwg

[-- Attachment #1: Type: text/plain, Size: 18327 bytes --]

I keep trying to stay out of this conversation being yellow about ecn in
the first place, in any form. I would like to stress that
ecn-sane was formed by the group of folk that were concerned about having
accidentally masterminded the world's biggest fq + aqm
deployment, and the only one with ecn support, which happens

In the case of wifi, the deployment is now in the 10s of millions, and
doing hordes of good - latencies measured in the 10s of ms rather than 10s
of seconds.

I have seen no numbers on how well l4s will make it over to wifi as yet,
nor any discussion, and I would rather like more pieces of the l4s solution
to land sufficiently integrated for testing using tools like flent, and
over far more than just an isochronous mac layer like dsl or docsis. Given
the size of a txop in wifi (5.3ms), and how far back we have
to put the AQM and FQ components today (2 txops), I don't think many of
either SCE or L4S concepts will work well on wifi... but in general
I prefer not to make assertions or assumptions until real-world testing can
commence.

I am presently at the battlemesh conference trying to get a bit of
real-world data.

A big problem wifi and 3g have is too many retransmits at the mac layer,
not congestion controlled. Any signalling gets there late, and it's
better to drop a bunch of packets when you hit a bunch of retransmits, in
general. IMHO.

On Wed, Jul 10, 2019 at 2:05 AM De Schepper, Koen (Nokia - BE/Antwerp) <
koen.de_schepper@nokia-bell-labs.com> wrote:

> Hi Jake,
>
> >> I agree the key question for this discussion is about how best to get
> low latency for the internet.
> Thanks
>
> >> under the L4S approach for ECT(1), we can achieve it with either dualq
> or fq at the bottleneck, but under the SCE approach we can only do it with
> fq at the bottleneck.
> Correct
>
> >> we agree that in neither case can very low latency be achieved with a
> classic single queue with classic bandwidth-seeking traffic
> Correct, not without compromising latency for Prague or
> throughput/utilization/stability/drop for Reno/Cubic
>
> >> Are you saying that even if a scalable FQ can be implemented in
> high-volume aggregated links at the same cost and difficulty as dualq,
> there's a reason not to use FQ?
>


> FQ for "per-user" isolation in access equipment has clearly an extra cost,
> not?


I've argued in the past that hashing is a bog standard part of most network
cards and switches already.

"extra cost" should be measured by actual measurements. Usually when you do
those, you find it's another variable entirely costing you the most
cpu/circuits.


If we need to implement FQ "per-flow" on top, we need 2 levels of FQ
> (per-user and per-user-flow, so from thousands to millions of queues).
> Also, I haven’t seen DC switches coming with an FQ AQM...
>

Meh. Most of the time the instantaneous number of queues for some
definition of instantaneous, is in the low hundreds for rates up to
10GigE. We don't have a lot of data for bigger pipes.

I haven't seen any DC switches that support anything other than RED or AFD,
and DC folk overprovision anyway.



> >> Is there a use case where it's necessary to avoid strict isolation if
> strict isolation can be accomplished as cheaply?
>
> Even if as cheaply, as long as there is no reliable flow identification,
> it clearly has side effects. Many homeworkers are using a VPN tunnel, which
> is only one flow encapsulating maybe dozens.


This is true. For a local endpoint for a vpn from a router fq_codel long
ago gained support for doing the hashing & FQ before entering the tunnel.

This works only with in-kernel ipsec transports although I've been trying
to get it added to wireguard for a long time now.

 It of course doesn't apply to the whole path, but when applied at the home
gateway router (bottleneck link), works rather well.

Here are two examples of that mechanism in play.

http://www.taht.net/~d/ipsec_fq_codel/oldqos.png

http://www.taht.net/~d/ipsec_fq_codel/newqos.png

Drop and ECN (if implemented correctly) are tunnel agnostic. Also how flows
> are identified might evolve (new transport protocols, encapsulations,
> ...?). Also if strict flow isolation could be done correctly, it has
> additional issues related to missed scheduling opportunities, besides it is
> a hard-coded throughput policy (and even mice size = 1 packet). On the
> other hand, flow isolation has benefits too, so hard to rule out one of
> them, not?
>

The packet dissector in linux is quite robust, the one in BSD, less so.

A counterpoint to the entire ECN debate (L4S or SCE) that I'd like to make
at more length is that ECN can and does hurt non-ECN flows, particularly at
lower bandwidths, where you cannot reduce cwnd below 2 and the link is thus
saturated. ARP can starve. IS-IS fails. BATMAN - lacking an IP header - can
starve. Babel, lacking ECN support, can start to fail. And so on.
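The low-bandwidth failure mode is simple arithmetic: a TCP sender will not
reduce cwnd below about 2 segments, so neither drops nor CE marks can push
a flow below roughly 2*MSS/RTT. A sketch with illustrative numbers:

```python
def min_tcp_rate_bps(mss_bytes=1500, rtt_s=0.1, cwnd_floor=2):
    # A TCP sender won't reduce cwnd below ~2 segments, so its floor
    # sending rate is cwnd_floor * MSS / RTT, no matter how hard the
    # AQM signals (drop or CE) - the congestion signal stops working.
    return cwnd_floor * mss_bytes * 8 / rtt_s

# At 100 ms RTT one flow's floor is ~240 kbps, so a handful of flows
# saturate a 1 Mbps link and marking can't reduce the standing queue.
floor = min_tcp_rate_bps()
link_bps = 1_000_000
flows_to_saturate = link_bps / floor
print(floor, flows_to_saturate)
```

Once that point is reached, anything that can't win a slot in the saturated
queue (ARP, routing protocol hellos) simply starves.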


> >> Also, I think if the SCE position is "low latency can only be achieved
> with FQ", that's different from "forcing only FQ on the internet", provided
> the fairness claims hold up, right?  (Classic single queue AQMs may still
> have a useful place in getting pretty-good latency in the cheapest
> hardware, like maybe PIE with marking.)
>
> Are you saying that the real good stuff can only be for FQ 😉? Fairness
> between a flow getting only one signal and another getting 2 is an issue,
> right? The one with the 2 signals can either ignore one, listen half to
> both, or try to smooth both signals to find the average loudest one? Again
> safety or performance needs to be chosen. PIE or PI2 is optimal for Classic
> traffic and good to couple congestion to Prague traffic, but Prague traffic
> needs a separate Q and an immediate step to get the "good stuff" working.
> Otherwise it will also overshoot, respond sluggish, etc...
>
> >> Anyway, to me this discussion is about the tradeoffs between the 2
> proposals.  It seems to me SCE has some safety advantages that should not
> be thrown away lightly,
>
> I appreciate the efforts of trying to improve L4S, but nobody working on
> L4S for years now see a way that SCE can work on a non-FQ system. For me
> (and I think many others) it is a no-go to only support FQ. Unfortunately
> we only have half a bit free, and we need to choose how to use it. Would
> you choose for the existing ECN switches that cannot be upgraded (are there
> any?) or for all future non-FQ systems.
>
>


> >> so if the performance can be made equivalent, it would be good to know
> about it before committing the codepoint.
>
> The performance in FQ is clearly equivalent,


Huh?


> but for a common-Q behavior, only L4S can work. As far as I understood the
> SCE-LFQ proposal is actually a slower FQ implementation (an FQ in DualQ
> disguise 😉), so I think not really a better alternative than pure FQ. Also
> its single AQM on the bulk queue will undo any isolation, as a coupled AQM
> is stronger than any scheduler, including FQ. Don't underestimate the power
> of congestion control 😉. The ultimate proof is in the DualQ Coupled AQM
> where congestion control can beat a priority scheduler. If you want FQ to
> have effect, you need to have an AQM per FQ... The authors will notice this
> when they implement an AQM on top of it. I saw the current implementation
> works only in taildrop mode. But I think it is very good that the SCE
> proponents are very motivated to try with this speed to improve L4S. I'm
> happy to be proven wrong, but up to now I don't see any promising
> improvements to justify delay for L4S, only the above alternative
> compromise. Agreed that we can continue exploring alternative proposal in
> parallel though.
>
>
I cannot parse this extreme set of assumptions and declarations. "taildrop
mode??"

As for promising improvements in general: there is a seven-year-old
deployment, with running code, of something we've shown to work well in a
variety of network scenarios, with 10x-100x improvements in network latency,
deployed at roughly 100% in Linux overall, widely used in wifi and in many,
many SQM/QoS systems and containers, with basic RFC 3168 ECN enabled... and
a proposal for a backward-compatible way of enhancing it still further being
explored. The embedded hardware pipeline for future implementations of this
tech is full - it would take 3+ years to make a course change....

vs something that still has no real-world deployment data at all, that
changes the definition of ECN, that has no public ns-2 or ns-3 model (?),
and no testing aside from a few very specific benchmarks, and so on...

I do hope the coding competition heats up more, most of all with more
running code that others can explore. I long ago tired of the endless
debates, as everyone knows, and I do kind of wish I weren't burning lunch
on this email instead of setting up a test at Battlemesh.

I note also that my leaning - in an fq_codel'd world, were it to stay such -
was to enable more RTT-based CCs like BBR to work more often in an RTT mode.
Originally, to me, the SCE idea was a way to trigger a faster switch to
congestion avoidance - most of my captures, taken from overloaded APs in
restaurants, cafes, train stations, etc., show traffic stuck in slow start
to be the biggest problem. And regardless, an initial CE mark, right now, is
a strong indicator that fq_codel is present, so an RTT-based TCP can then
take over; a good one would see few further marks after the first.

A big difference in our outlooks, I guess, is that my viewpoint is that most
of the congestion is at the edges of the network. I don't care all that much
about big iron or switches, and I don't think either can afford much AQM
tech at all in the first place. Not dual queues, not FQs.

Were L4S not to deploy (using ECT(1) as a marker - btw, I think CS5 might be
a better candidate, as it goes into the wifi VI queue), and were an
fq_pie/fq_codel/sch_cake world to remain predominant, we might get
somewhere faster, where it counted.

> Koen.
>
>
> -----Original Message-----
> From: Holland, Jake <jholland@akamai.com>
> Sent: Monday, July 8, 2019 10:56 PM
> To: De Schepper, Koen (Nokia - BE/Antwerp) <
> koen.de_schepper@nokia-bell-labs.com>; Jonathan Morton <
> chromatix99@gmail.com>
> Cc: ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
> Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts
>
> Hi Koen,
>
> I'm a bit confused by this response.
>
> I agree the key question for this discussion is about how best to get low
> latency for the internet.
>
> If I'm reading your message correctly, you're saying that under the L4S
> approach for ECT(1), we can achieve it with either dualq or fq at the
> bottleneck, but under the SCE approach we can only do it with fq at the
> bottleneck.
>
> (I think I understand and roughly agree with this claim, subject to some
> caveats.  I just want to make sure I've got this right so far, and that we
> agree that in neither case can very low latency be achieved with a classic
> single queue with classic bandwidth-seeking
> traffic.)
>
> Are you saying that even if a scalable FQ can be implemented in
> high-volume aggregated links at the same cost and difficulty as dualq,
> there's a reason not to use FQ?  Is there a use case where it's necessary
> to avoid strict isolation if strict isolation can be accomplished as
> cheaply?
>
> Also, I think if the SCE position is "low latency can only be achieved
> with FQ", that's different from "forcing only FQ on the internet", provided
> the fairness claims hold up, right?  (Classic single queue AQMs may still
> have a useful place in getting pretty-good latency in the cheapest
> hardware, like maybe PIE with
> marking.)
>
> Anyway, to me this discussion is about the tradeoffs between the
> 2 proposals.  It seems to me SCE has some safety advantages that should
> not be thrown away lightly, so if the performance can be made equivalent,
> it would be good to know about it before committing the codepoint.
>
> Best regards,
> Jake
>
> On 2019-07-08, 03:26, "De Schepper, Koen (Nokia - BE/Antwerp)" <
> koen.de_schepper@nokia-bell-labs.com> wrote:
>
>     Hi Jonathan,
>
>     From your responses below, I have the impression you think this
> discussion is about FQ (flow/fair queuing). Fair queuing is used today
> where strict isolation is wanted, like between subscribers, and by
> extension (if possible and preferred) on a per transport layer flow, like
> in Fixed CPEs and Mobile networks. No discussion about this, and assuming
> we have and still will have an Internet which needs to support both common
> queues (like DualQ is intended) and FQs, I think the only discussion point
> is how we want to migrate to an Internet that supports optimally Low
> Latency.
>
>     This leads us to the question L4S or SCE?
>
>     If we want to support low latency for both common queues and FQs we
> "NEED" L4S, if we need to support it only for FQs, we "COULD" use SCE too,
> and if we want to force the whole Internet to use only FQs, we "SHOULD" use
> SCE 😉. If your goal is to force only FQs in the Internet, then let this be
> clear... I assume we need a discussion on another level in that case (and
> to be clear, it is not a goal I can support)...
>
>     Koen.
>
>
>     -----Original Message-----
>     From: Jonathan Morton <chromatix99@gmail.com>
>     Sent: Friday, July 5, 2019 10:51 AM
>     To: De Schepper, Koen (Nokia - BE/Antwerp) <
> koen.de_schepper@nokia-bell-labs.com>
>     Cc: Bob Briscoe <ietf@bobbriscoe.net>; ecn-sane@lists.bufferbloat.net;
> tsvwg@ietf.org
>     Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts
>
>     > On 5 Jul, 2019, at 9:46 am, De Schepper, Koen (Nokia - BE/Antwerp) <
> koen.de_schepper@nokia-bell-labs.com> wrote:
>     >
>     >>> 2: DualQ can be defeated by an adversary, destroying its ability
> to isolate L4S traffic.
>     >
>     > Before jumping to another point, let's close down your original
> issue. Since you didn't mention, I assume that you agree with the
> following, right?
>     >
>     >        "You cannot defeat a DualQ" (at least no more than a single Q)
>
>     I consider forcibly degrading DualQ to single-queue mode to be a
> defeat.  However…
>
>     >>> But that's exactly the problem.  Single queue AQM does not isolate
> L4S traffic from "classic" traffic, so the latter suffers from the former's
> relative aggression in the face of AQM activity.
>     >
>     > With L4S a single queue can differentiate between Classic and L4S
> traffic. That's why it knows exactly how to treat the traffic. For Non-ECT
> and ECT(0) square the probability, and for ECT(1) don't square, and it
> works exactly like a DualQ, but then without the latency isolation. Both
> types get the same throughput, AND delay. See the PI2 paper, which is
> exactly about a single Q.
>
>     Okay, this is an important point: the real assertion is not that DualQ
> itself is needed for L4S to be safe on the Internet, but for differential
> AQM treatment to be present at the bottleneck.  Defeating DualQ only
> destroys L4S' latency advantage over "classic" traffic.  We might actually
> be making progress here!
>
>     > I agree you cannot isolate in a single Q, and this is why L4S is
> better than SCE, because it tells the AQM what to do, even if it has a
> single Q. SCE needs isolation, L4S not.
>
>     Devil's advocate time.  What if, instead of providing differential
> treatment WRT CE marking, PI2 instead applied both marking strategies
> simultaneously - the higher rate using SCE, and the lower rate using CE?
> Classic traffic would see only the latter; L4S could use the former.
>
>     > We tried years ago similar things like needed for SCE, and found
> that it can't work. For throughput fairness you need the squared relation
> between the 2 signals, but with SCE, you need to apply both signals in
> parallel, because you don't know the sender type.
>
>     Yes, that's exactly what we do - and it does work.
>
>     >   - So either the sender needs to ignore CE if it gets SCE, or
> ignore SCE if you get CE. The first is dangerous if you have multiple
> bottlenecks, and the second is defeating the purpose of SCE. Any other
> combination leads to unfairness (double response).
>
>     This is a false dichotomy.  We quickly realised both of those options
> were unacceptable, and sought a third way.
>
>     SCE senders apply a reduced CE response when also responding to
> parallel SCE feedback, roughly in line with ABE, on the grounds that
> responding to SCE does some of the necessary reduction already.  The
> reduced response is still a Multiplicative Decrease, so it fits with normal
> TCP congestion control principles.
>
>     >   - you separate the signals in queue dept, first applying SCE and
> later CE, as you originally proposed, but that results in starvation for
> SCE.
>
>     Yes, although this approach gives the best performance for SCE when
> used with flow isolation, or when all flows are known to be SCE-aware.  So
> we apply this strategy in those cases, and move the SCE marking function up
> to overlap CE marking specifically for single queues.
>
>     It has been suggested that single queue AQMs are rare in any case, but
> this approach covers that corner case.
>
>     > Add on top that SCE makes it impossible to use DualQ, as you cannot
> differentiate the traffic types.
>
>     SCE is designed around not *needing* to differentiate the traffic
> types.  Single queues have known disadvantages, and SCE doesn't worsen them.
>
>     Meanwhile, we have proposed LFQ to cover the DualQ use case.  I'd be
> interested in hearing a principled critique of it.
>
>      - Jonathan Morton
>
>
>
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane
>


-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

[-- Attachment #2: Type: text/html, Size: 22205 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-08 20:55                         ` Holland, Jake
  2019-07-10  0:10                           ` Jonathan Morton
@ 2019-07-10  9:00                           ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-10 13:14                             ` Dave Taht
  2019-07-17 22:40                             ` Sebastian Moeller
  1 sibling, 2 replies; 59+ messages in thread
From: De Schepper, Koen (Nokia - BE/Antwerp) @ 2019-07-10  9:00 UTC (permalink / raw)
  To: Holland, Jake, Jonathan Morton; +Cc: ecn-sane, tsvwg

Hi Jake,

>> I agree the key question for this discussion is about how best to get low latency for the internet.
Thanks

>> under the L4S approach for ECT(1), we can achieve it with either dualq or fq at the bottleneck, but under the SCE approach we can only do it with fq at the bottleneck.
Correct

>> we agree that in neither case can very low latency be achieved with a classic single queue with classic bandwidth-seeking traffic
Correct, not without compromising latency for Prague or throughput/utilization/stability/drop for Reno/Cubic

>> Are you saying that even if a scalable FQ can be implemented in high-volume aggregated links at the same cost and difficulty as dualq, there's a reason not to use FQ?

FQ for "per-user" isolation in access equipment clearly has an extra cost, no? If we need to implement FQ "per-flow" on top of that, we need 2 levels of FQ (per-user and per-user-flow, so from thousands to millions of queues). Also, I haven't seen DC switches coming with an FQ AQM...

>> Is there a use case where it's necessary to avoid strict isolation if strict isolation can be accomplished as cheaply?

Even if it is as cheap: as long as there is no reliable flow identification, it clearly has side effects. Many homeworkers use a VPN tunnel, which is only one flow encapsulating maybe dozens. Drop and ECN (if implemented correctly) are tunnel agnostic. Also, how flows are identified might evolve (new transport protocols, encapsulations, ...?). And even if strict flow isolation could be done correctly, it has additional issues related to missed scheduling opportunities; besides, it is a hard-coded throughput policy (and even a mouse flow can be as small as 1 packet). On the other hand, flow isolation has benefits too, so it is hard to rule out either one, no?

>> Also, I think if the SCE position is "low latency can only be achieved with FQ", that's different from "forcing only FQ on the internet", provided the fairness claims hold up, right?  (Classic single queue AQMs may still have a useful place in getting pretty-good latency in the cheapest hardware, like maybe PIE with marking.)

Are you saying that the real good stuff can only be for FQ 😉? Fairness between a flow getting only one signal and another getting 2 is an issue, right? The one with the 2 signals can either ignore one, listen half to both, or try to smooth both signals to find the average loudest one? Again, safety or performance needs to be chosen. PIE or PI2 is optimal for Classic traffic and good for coupling congestion to Prague traffic, but Prague traffic needs a separate queue and an immediate step response to get the "good stuff" working. Otherwise it will also overshoot, respond sluggishly, etc...

>> Anyway, to me this discussion is about the tradeoffs between the 2 proposals.  It seems to me SCE has some safety advantages that should not be thrown away lightly, 

I appreciate the efforts to improve L4S, but nobody who has worked on L4S for years now sees a way that SCE can work on a non-FQ system. For me (and I think many others) it is a no-go to only support FQ. Unfortunately we only have half a bit free, and we need to choose how to use it. Would you choose the existing ECN switches that cannot be upgraded (are there any?), or all future non-FQ systems?

>> so if the performance can be made equivalent, it would be good to know about it before committing the codepoint.

The performance in FQ is clearly equivalent, but for a common-queue behavior, only L4S can work. As far as I understood, the SCE-LFQ proposal is actually a slower FQ implementation (an FQ in DualQ disguise 😉), so not really a better alternative than pure FQ. Also, its single AQM on the bulk queue will undo any isolation, as a coupled AQM is stronger than any scheduler, including FQ. Don't underestimate the power of congestion control 😉. The ultimate proof is in the DualQ Coupled AQM, where congestion control can beat a priority scheduler. If you want FQ to have an effect, you need an AQM per FQ... The authors will notice this when they implement an AQM on top of it. I saw the current implementation works only in taildrop mode. But I think it is very good that the SCE proponents are so motivated to improve L4S this quickly. I'm happy to be proven wrong, but up to now I don't see any promising improvements that justify delaying L4S, only the above alternative compromise. Agreed that we can continue exploring alternative proposals in parallel, though.
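For context on the coupling referred to here: in the DualQ Coupled AQM
drafts, a single internal probability p' drives both queues, with the
Classic side squared so that a Reno/Cubic 1/sqrt(p) response and a scalable
1/p response end up at matching rates. A hedged sketch of that coupling law
(the coupling factor k below is illustrative, not a recommended value):

```python
def dualq_coupled_probs(p_prime, k=2.0):
    # Coupled signalling as in the DualQ Coupled AQM drafts: the Classic
    # queue drops/marks with p_C = p'^2 (matching Reno/Cubic's 1/sqrt(p)
    # rate response), while the L4S queue CE-marks with p_L = k * p',
    # capped at 1. One controller output steers both traffic types.
    p_classic = p_prime ** 2
    p_l4s = min(1.0, k * p_prime)
    return p_classic, p_l4s

pc, pl = dualq_coupled_probs(0.1)
print(pc, pl)
```

Because the scalable flows' marking probability rises linearly while the
Classic probability rises quadratically, the coupling, not the scheduler,
is what balances throughput between the two queues.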

Koen.


-----Original Message-----
From: Holland, Jake <jholland@akamai.com> 
Sent: Monday, July 8, 2019 10:56 PM
To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>; Jonathan Morton <chromatix99@gmail.com>
Cc: ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts

Hi Koen,

I'm a bit confused by this response.

I agree the key question for this discussion is about how best to get low latency for the internet.

If I'm reading your message correctly, you're saying that under the L4S approach for ECT(1), we can achieve it with either dualq or fq at the bottleneck, but under the SCE approach we can only do it with fq at the bottleneck.

(I think I understand and roughly agree with this claim, subject to some caveats.  I just want to make sure I've got this right so far, and that we agree that in neither case can very low latency be achieved with a classic single queue with classic bandwidth-seeking
traffic.)

Are you saying that even if a scalable FQ can be implemented in high-volume aggregated links at the same cost and difficulty as dualq, there's a reason not to use FQ?  Is there a use case where it's necessary to avoid strict isolation if strict isolation can be accomplished as cheaply?

Also, I think if the SCE position is "low latency can only be achieved with FQ", that's different from "forcing only FQ on the internet", provided the fairness claims hold up, right?  (Classic single queue AQMs may still have a useful place in getting pretty-good latency in the cheapest hardware, like maybe PIE with
marking.)

Anyway, to me this discussion is about the tradeoffs between the
2 proposals.  It seems to me SCE has some safety advantages that should not be thrown away lightly, so if the performance can be made equivalent, it would be good to know about it before committing the codepoint.

Best regards,
Jake

On 2019-07-08, 03:26, "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com> wrote:

    Hi Jonathan,
    
    From your responses below, I have the impression you think this discussion is about FQ (flow/fair queuing). Fair queuing is used today where strict isolation is wanted, like between subscribers, and by extension (if possible and preferred) on a per transport layer flow, like in Fixed CPEs and Mobile networks. No discussion about this, and assuming we have and still will have an Internet which needs to support both common queues (like DualQ is intended) and FQs, I think the only discussion point is how we want to migrate to an Internet that supports optimally Low Latency.
    
    This leads us to the question L4S or SCE?
    
    If we want to support low latency for both common queues and FQs we "NEED" L4S, if we need to support it only for FQs, we "COULD" use SCE too, and if we want to force the whole Internet to use only FQs, we "SHOULD" use SCE 😉. If your goal is to force only FQs in the Internet, then let this be clear... I assume we need a discussion on another level in that case (and to be clear, it is not a goal I can support)...
    
    Koen.
    
    
    -----Original Message-----
    From: Jonathan Morton <chromatix99@gmail.com> 
    Sent: Friday, July 5, 2019 10:51 AM
    To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>
    Cc: Bob Briscoe <ietf@bobbriscoe.net>; ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
    Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts
    
    > On 5 Jul, 2019, at 9:46 am, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
    > 
    >>> 2: DualQ can be defeated by an adversary, destroying its ability to isolate L4S traffic.
    > 
    > Before jumping to another point, let's close down your original issue. Since you didn't mention, I assume that you agree with the following, right?
    > 
    >        "You cannot defeat a DualQ" (at least no more than a single Q)
    
    I consider forcibly degrading DualQ to single-queue mode to be a defeat.  However…
    
    >>> But that's exactly the problem.  Single queue AQM does not isolate L4S traffic from "classic" traffic, so the latter suffers from the former's relative aggression in the face of AQM activity.
    > 
    > With L4S a single queue can differentiate between Classic and L4S traffic. That's why it knows exactly how to treat the traffic. For Non-ECT and ECT(0) square the probability, and for ECT(1) don't square, and it works exactly like a DualQ, but then without the latency isolation. Both types get the same throughput, AND delay. See the PI2 paper, which is exactly about a single Q.
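The single-queue PI2 treatment in the quoted paragraph can be sketched as
follows; this is a hedged illustration of the squaring rule, not the PI2
reference code (the PI controller that produces p' is omitted):

```python
import random

def pi2_signal(ecn_codepoint, p_prime, rng=random.random):
    # Single-queue PI2 rule as described above: the PI controller
    # computes a base probability p'. Classic packets (Non-ECT, ECT(0))
    # are dropped or CE-marked with probability p'^2, while L4S packets
    # (ECT(1)) are CE-marked with probability p' (not squared).
    if ecn_codepoint == "ECT(1)":
        return "CE" if rng() < p_prime else "pass"
    p_classic = p_prime ** 2
    if rng() < p_classic:
        return "CE" if ecn_codepoint == "ECT(0)" else "drop"
    return "pass"
```

With, say, p' = 0.1, Classic flows see ~1% drop/mark while ECT(1) flows see
~10% marking; given the 1/sqrt(p) vs 1/p rate responses, both end up at
roughly the same throughput and delay, which is the claim being made here.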
    
    Okay, this is an important point: the real assertion is not that DualQ itself is needed for L4S to be safe on the Internet, but for differential AQM treatment to be present at the bottleneck.  Defeating DualQ only destroys L4S' latency advantage over "classic" traffic.  We might actually be making progress here!
    
    > I agree you cannot isolate in a single Q, and this is why L4S is better than SCE, because it tells the AQM what to do, even if it has a single Q. SCE needs isolation, L4S not.
    
    Devil's advocate time.  What if, instead of providing differential treatment WRT CE marking, PI2 instead applied both marking strategies simultaneously - the higher rate using SCE, and the lower rate using CE?  Classic traffic would see only the latter; L4S could use the former.
    
    > We tried years ago similar things like needed for SCE, and found that it can't work. For throughput fairness you need the squared relation between the 2 signals, but with SCE, you need to apply both signals in parallel, because you don't know the sender type. 
    
    Yes, that's exactly what we do - and it does work.
    
    > 	- So either the sender needs to ignore CE if it gets SCE, or ignore SCE if you get CE. The first is dangerous if you have multiple bottlenecks, and the second is defeating the purpose of SCE. Any other combination leads to unfairness (double response).
    
    This is a false dichotomy.  We quickly realised both of those options were unacceptable, and sought a third way.
    
    SCE senders apply a reduced CE response when also responding to parallel SCE feedback, roughly in line with ABE, on the grounds that responding to SCE does some of the necessary reduction already.  The reduced response is still a Multiplicative Decrease, so it fits with normal TCP congestion control principles.
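The reduced CE response described here can be sketched as below; the 0.75
backoff factor is an illustrative assumption roughly in the spirit of ABE,
not the actual SCE implementation's value:

```python
def cwnd_after_ce(cwnd, sce_active, beta_classic=0.5, beta_reduced=0.75):
    # On a CE mark, an SCE-aware sender that is already responding to
    # parallel SCE feedback applies a milder multiplicative decrease,
    # since the SCE response has done part of the reduction already.
    # Both branches remain a multiplicative decrease, so standard TCP
    # congestion-control reasoning still applies.
    beta = beta_reduced if sce_active else beta_classic
    return max(2.0, cwnd * beta)

print(cwnd_after_ce(100, False))  # 50.0 - classic halving
print(cwnd_after_ce(100, True))   # 75.0 - reduced, ABE-style backoff
```

The key property is that CE is never ignored in either state, which avoids
the double-response and multiple-bottleneck hazards from the quoted
dichotomy.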
    
    > 	- you separate the signals in queue dept, first applying SCE and later CE, as you originally proposed, but that results in starvation for SCE.
    
    Yes, although this approach gives the best performance for SCE when used with flow isolation, or when all flows are known to be SCE-aware.  So we apply this strategy in those cases, and move the SCE marking function up to overlap CE marking specifically for single queues.
    
    It has been suggested that single queue AQMs are rare in any case, but this approach covers that corner case.
    
    > Add on top that SCE makes it impossible to use DualQ, as you cannot differentiate the traffic types.
    
    SCE is designed around not *needing* to differentiate the traffic types.  Single queues have known disadvantages, and SCE doesn't worsen them.
    
    Meanwhile, we have proposed LFQ to cover the DualQ use case.  I'd be interested in hearing a principled critique of it.
    
     - Jonathan Morton
    
    


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-08 20:55                         ` Holland, Jake
@ 2019-07-10  0:10                           ` Jonathan Morton
  2019-07-10  9:00                           ` De Schepper, Koen (Nokia - BE/Antwerp)
  1 sibling, 0 replies; 59+ messages in thread
From: Jonathan Morton @ 2019-07-10  0:10 UTC (permalink / raw)
  To: Holland, Jake; +Cc: De Schepper, Koen (Nokia - BE/Antwerp), ecn-sane, tsvwg

[-- Attachment #1: Type: text/plain, Size: 2866 bytes --]

> On 8 Jul, 2019, at 11:55 pm, Holland, Jake <jholland@akamai.com> wrote:
> 
> Also, I think if the SCE position is "low latency can only be
> achieved with FQ", that's different from "forcing only FQ on the
> internet", provided the fairness claims hold up, right?  (Classic
> single queue AQMs may still have a useful place in getting
> pretty-good latency in the cheapest hardware, like maybe PIE with
> marking.)

In support of this viewpoint, here are some illustrative time-series graphs showing SCE behaviour in a variety of contexts.  These are all simple two-flow tests plus a sparse latency probe flow, conducted using Flent, over a 50Mbps, 80ms RTT path under lab conditions.

First let's get the FQ case out of the way, with Reno-SCE competing against plain old Reno.  Here you can see Reno's classic sawtooth, while FQ keeps the latency of sparse flows sharing the link low; the novelty is that Reno-SCE is successfully using almost all of the capacity left on the table by plain Reno's sawtooth.  This is basically ideal behaviour, enabled by FQ.



If we then disable FQ and run the same test, we find that Reno-SCE yields very politely to plain Reno, again using only leftover capacity.  From earlier comments, I gather that a similar graph was seen by the L4S team at some point in their development.  Here we can see some small delay spikes, just before AQM activates to cut the plain Reno flow down.



Conversely, if we begin the SCE marking ramp only when CE marking also begins, we get good fairness between the two flows, in the same manner as with a conventional AQM - because both flows are mostly receiving only conventional AQM signals.  The delay spikes also reflect that fact, and a significant amount of capacity goes unused.  I gather that this scenario was also approximately seen during L4S development.



Our solution - which required only a few days' thought and calculation to define - is to make the SCE ramp straddle the AQM activation threshold, for single-queue situations only.  The precise extent of straddling is configurable to suit different network situations; here is the one that works best for this scenario.  Fairness between the two flows remains good; mostly the CE marks are going to the plain Reno flow, while the SCE flow is using the remaining capacity fairly effectively.  Notice however that the delay plateaus due to the weakened SCE signalling:
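The straddling ramp can be sketched as a pair of marking functions over
queueing delay; the thresholds and ramp width below are illustrative
assumptions, not the tested configuration:

```python
def marking_probs(qdelay_ms, ce_threshold_ms=5.0,
                  sce_ramp_start_ms=2.5, sce_ramp_end_ms=7.5):
    # SCE marking probability ramps linearly from 0 to 1 across a window
    # that straddles the conventional AQM (CE) activation threshold, so
    # part of the SCE signal arrives before Classic AQM action begins
    # and part after. The ramp extent is configurable per deployment.
    if qdelay_ms <= sce_ramp_start_ms:
        p_sce = 0.0
    elif qdelay_ms >= sce_ramp_end_ms:
        p_sce = 1.0
    else:
        p_sce = ((qdelay_ms - sce_ramp_start_ms)
                 / (sce_ramp_end_ms - sce_ramp_start_ms))
    ce_active = qdelay_ms >= ce_threshold_ms  # conventional AQM step
    return p_sce, ce_active
```

Below the AQM threshold, SCE-aware flows already back off gently; above it,
both signals act, which is why the plain-Reno flow absorbs most of the CE
marks while the SCE flow keeps using the remaining capacity.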



Compare this to single-queue SCE vs SCE performance in a single queue, using the basic SCE ramp which lies entirely below the AQM threshold:



And with the straddling ramp:



And with the SCE ramp entirely above the threshold:



And, finally, the *real* ideal situation - SCE vs SCE with FQ:



I hope this reassures various people that we do, in fact, know what we're doing over here.

 - Jonathan Morton


[-- Attachment #2.1: Type: text/html, Size: 5164 bytes --]

[-- Attachment #2.2: PastedGraphic-1.png --]
[-- Type: image/png, Size: 134772 bytes --]

[-- Attachment #2.3: PastedGraphic-2.png --]
[-- Type: image/png, Size: 142683 bytes --]

[-- Attachment #2.4: PastedGraphic-3.png --]
[-- Type: image/png, Size: 165646 bytes --]

[-- Attachment #2.5: PastedGraphic-4.png --]
[-- Type: image/png, Size: 171245 bytes --]

[-- Attachment #2.6: PastedGraphic-5.png --]
[-- Type: image/png, Size: 171513 bytes --]

[-- Attachment #2.7: PastedGraphic-6.png --]
[-- Type: image/png, Size: 172226 bytes --]

[-- Attachment #2.8: PastedGraphic-7.png --]
[-- Type: image/png, Size: 187005 bytes --]

[-- Attachment #2.9: PastedGraphic-8.png --]
[-- Type: image/png, Size: 140303 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-08 10:26                       ` De Schepper, Koen (Nokia - BE/Antwerp)
@ 2019-07-08 20:55                         ` Holland, Jake
  2019-07-10  0:10                           ` Jonathan Morton
  2019-07-10  9:00                           ` De Schepper, Koen (Nokia - BE/Antwerp)
  0 siblings, 2 replies; 59+ messages in thread
From: Holland, Jake @ 2019-07-08 20:55 UTC (permalink / raw)
  To: De Schepper, Koen (Nokia - BE/Antwerp), Jonathan Morton; +Cc: ecn-sane, tsvwg

Hi Koen,

I'm a bit confused by this response.

I agree the key question for this discussion is about how best to
get low latency for the internet.

If I'm reading your message correctly, you're saying that under the
L4S approach for ECT(1), we can achieve it with either dualq or fq
at the bottleneck, but under the SCE approach we can only do it with
fq at the bottleneck.

(I think I understand and roughly agree with this claim, subject to
some caveats.  I just want to make sure I've got this right so
far, and that we agree that in neither case can very low latency be
achieved with a classic single queue with classic bandwidth-seeking
traffic.)

Are you saying that even if a scalable FQ can be implemented in
high-volume aggregated links at the same cost and difficulty as
dualq, there's a reason not to use FQ?  Is there a use case where
it's necessary to avoid strict isolation if strict isolation can be
accomplished as cheaply?

Also, I think if the SCE position is "low latency can only be
achieved with FQ", that's different from "forcing only FQ on the
internet", provided the fairness claims hold up, right?  (Classic
single queue AQMs may still have a useful place in getting
pretty-good latency in the cheapest hardware, like maybe PIE with
marking.)

Anyway, to me this discussion is about the tradeoffs between the
2 proposals.  It seems to me SCE has some safety advantages that
should not be thrown away lightly, so if the performance can be
made equivalent, it would be good to know about it before
committing the codepoint.

Best regards,
Jake

On 2019-07-08, 03:26, "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com> wrote:

    Hi Jonathan,
    
    From your responses below, I have the impression you think this discussion is about FQ (flow/fair queuing). Fair queuing is used today where strict isolation is wanted, like between subscribers, and by extension (where possible and preferred) per transport-layer flow, as in fixed CPEs and mobile networks. There is no discussion about this; assuming we have, and will still have, an Internet that needs to support both common queues (as DualQ is intended) and FQs, I think the only discussion point is how we want to migrate to an Internet that optimally supports low latency.
    
    This leads us to the question L4S or SCE?
    
    If we want to support low latency for both common queues and FQs, we "NEED" L4S; if we need to support it only for FQs, we "COULD" use SCE too; and if we want to force the whole Internet to use only FQs, we "SHOULD" use SCE 😉. If your goal is to force only FQs on the Internet, then let this be clear... I assume we need a discussion on another level in that case (and, to be clear, it is not a goal I can support)...
    
    Koen.
    
    
    -----Original Message-----
    From: Jonathan Morton <chromatix99@gmail.com> 
    Sent: Friday, July 5, 2019 10:51 AM
    To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>
    Cc: Bob Briscoe <ietf@bobbriscoe.net>; ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
    Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts
    
    > On 5 Jul, 2019, at 9:46 am, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
    > 
    >>> 2: DualQ can be defeated by an adversary, destroying its ability to isolate L4S traffic.
    > 
    > Before jumping to another point, let's close down your original issue. Since you didn't mention, I assume that you agree with the following, right?
    > 
    >        "You cannot defeat a DualQ" (at least no more than a single Q)
    
    I consider forcibly degrading DualQ to single-queue mode to be a defeat.  However…
    
    >>> But that's exactly the problem.  Single queue AQM does not isolate L4S traffic from "classic" traffic, so the latter suffers from the former's relative aggression in the face of AQM activity.
    > 
    > With L4S a single queue can differentiate between Classic and L4S traffic. That's why it knows exactly how to treat the traffic. For Non-ECT and ECT(0), square the probability; for ECT(1), don't square. It works exactly like a DualQ, but without the latency isolation. Both types get the same throughput AND delay. See the PI2 paper, which is exactly about a single Q.
    
    Okay, this is an important point: the real assertion is not that DualQ itself is needed for L4S to be safe on the Internet, but that differential AQM treatment must be present at the bottleneck.  Defeating DualQ only destroys L4S's latency advantage over "classic" traffic.  We might actually be making progress here!
    
    > I agree you cannot isolate in a single Q, and this is why L4S is better than SCE, because it tells the AQM what to do, even if it has a single Q. SCE needs isolation, L4S not.
    
    Devil's advocate time.  What if, instead of providing differential treatment WRT CE marking, PI2 instead applied both marking strategies simultaneously - the higher rate using SCE, and the lower rate using CE?  Classic traffic would see only the latter; L4S could use the former.
    
    > Years ago we tried things similar to what SCE needs, and found that it can't work. For throughput fairness you need the squared relation between the two signals, but with SCE you need to apply both signals in parallel, because you don't know the sender type.
    
    Yes, that's exactly what we do - and it does work.
    
    > 	- So either the sender needs to ignore CE if it gets SCE, or ignore SCE if you get CE. The first is dangerous if you have multiple bottlenecks, and the second is defeating the purpose of SCE. Any other combination leads to unfairness (double response).
    
    This is a false dichotomy.  We quickly realised both of those options were unacceptable, and sought a third way.
    
    SCE senders apply a reduced CE response when also responding to parallel SCE feedback, roughly in line with ABE, on the grounds that responding to SCE does some of the necessary reduction already.  The reduced response is still a Multiplicative Decrease, so it fits with normal TCP congestion control principles.
    
    > 	- you separate the signals in queue depth, first applying SCE and later CE, as you originally proposed, but that results in starvation for SCE.
    
    Yes, although this approach gives the best performance for SCE when used with flow isolation, or when all flows are known to be SCE-aware.  So we apply this strategy in those cases, and move the SCE marking function up to overlap CE marking specifically for single queues.
    
    It has been suggested that single queue AQMs are rare in any case, but this approach covers that corner case.
    
    > Add on top that SCE makes it impossible to use DualQ, as you cannot differentiate the traffic types.
    
    SCE is designed around not *needing* to differentiate the traffic types.  Single queues have known disadvantages, and SCE doesn't worsen them.
    
    Meanwhile, we have proposed LFQ to cover the DualQ use case.  I'd be interested in hearing a principled critique of it.
    
     - Jonathan Morton
    
    


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-05  8:51                     ` Jonathan Morton
@ 2019-07-08 10:26                       ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-08 20:55                         ` Holland, Jake
  0 siblings, 1 reply; 59+ messages in thread
From: De Schepper, Koen (Nokia - BE/Antwerp) @ 2019-07-08 10:26 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Bob Briscoe, ecn-sane, tsvwg

Hi Jonathan,

From your responses below, I have the impression you think this discussion is about FQ (flow/fair queuing). Fair queuing is used today where strict isolation is wanted, like between subscribers, and by extension (where possible and preferred) per transport-layer flow, as in fixed CPEs and mobile networks. There is no discussion about this; assuming we have, and will still have, an Internet that needs to support both common queues (as DualQ is intended) and FQs, I think the only discussion point is how we want to migrate to an Internet that optimally supports low latency.

This leads us to the question L4S or SCE?

If we want to support low latency for both common queues and FQs, we "NEED" L4S; if we need to support it only for FQs, we "COULD" use SCE too; and if we want to force the whole Internet to use only FQs, we "SHOULD" use SCE 😉. If your goal is to force only FQs on the Internet, then let this be clear... I assume we need a discussion on another level in that case (and, to be clear, it is not a goal I can support)...

Koen.


-----Original Message-----
From: Jonathan Morton <chromatix99@gmail.com> 
Sent: Friday, July 5, 2019 10:51 AM
To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>
Cc: Bob Briscoe <ietf@bobbriscoe.net>; ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts

> On 5 Jul, 2019, at 9:46 am, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> 
>>> 2: DualQ can be defeated by an adversary, destroying its ability to isolate L4S traffic.
> 
> Before jumping to another point, let's close down your original issue. Since you didn't mention, I assume that you agree with the following, right?
> 
>        "You cannot defeat a DualQ" (at least no more than a single Q)

I consider forcibly degrading DualQ to single-queue mode to be a defeat.  However…

>>> But that's exactly the problem.  Single queue AQM does not isolate L4S traffic from "classic" traffic, so the latter suffers from the former's relative aggression in the face of AQM activity.
> 
> With L4S a single queue can differentiate between Classic and L4S traffic. That's why it knows exactly how to treat the traffic. For Non-ECT and ECT(0), square the probability; for ECT(1), don't square. It works exactly like a DualQ, but without the latency isolation. Both types get the same throughput AND delay. See the PI2 paper, which is exactly about a single Q.

Okay, this is an important point: the real assertion is not that DualQ itself is needed for L4S to be safe on the Internet, but that differential AQM treatment must be present at the bottleneck.  Defeating DualQ only destroys L4S's latency advantage over "classic" traffic.  We might actually be making progress here!

> I agree you cannot isolate in a single Q, and this is why L4S is better than SCE, because it tells the AQM what to do, even if it has a single Q. SCE needs isolation, L4S not.

Devil's advocate time.  What if, instead of providing differential treatment WRT CE marking, PI2 instead applied both marking strategies simultaneously - the higher rate using SCE, and the lower rate using CE?  Classic traffic would see only the latter; L4S could use the former.

> Years ago we tried things similar to what SCE needs, and found that it can't work. For throughput fairness you need the squared relation between the two signals, but with SCE you need to apply both signals in parallel, because you don't know the sender type.

Yes, that's exactly what we do - and it does work.

> 	- So either the sender needs to ignore CE if it gets SCE, or ignore SCE if you get CE. The first is dangerous if you have multiple bottlenecks, and the second is defeating the purpose of SCE. Any other combination leads to unfairness (double response).

This is a false dichotomy.  We quickly realised both of those options were unacceptable, and sought a third way.

SCE senders apply a reduced CE response when also responding to parallel SCE feedback, roughly in line with ABE, on the grounds that responding to SCE does some of the necessary reduction already.  The reduced response is still a Multiplicative Decrease, so it fits with normal TCP congestion control principles.

> 	- you separate the signals in queue depth, first applying SCE and later CE, as you originally proposed, but that results in starvation for SCE.

Yes, although this approach gives the best performance for SCE when used with flow isolation, or when all flows are known to be SCE-aware.  So we apply this strategy in those cases, and move the SCE marking function up to overlap CE marking specifically for single queues.

It has been suggested that single queue AQMs are rare in any case, but this approach covers that corner case.

> Add on top that SCE makes it impossible to use DualQ, as you cannot differentiate the traffic types.

SCE is designed around not *needing* to differentiate the traffic types.  Single queues have known disadvantages, and SCE doesn't worsen them.

Meanwhile, we have proposed LFQ to cover the DualQ use case.  I'd be interested in hearing a principled critique of it.

 - Jonathan Morton


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-04 11:54           ` Bob Briscoe
  2019-07-04 12:24             ` Jonathan Morton
@ 2019-07-05  9:48             ` Luca Muscariello
  1 sibling, 0 replies; 59+ messages in thread
From: Luca Muscariello @ 2019-07-05  9:48 UTC (permalink / raw)
  To: Bob Briscoe; +Cc: Holland, Jake, ecn-sane, tsvwg

[-- Attachment #1: Type: text/plain, Size: 15396 bytes --]

I asked a relatively simple question that Jake got right, so I'm not
alone in my own bubble.
I asked Greg White this months ago, and his answer was that
unresponsive traffic x is assumed to be small, and that the queue
protection mechanism will ensure that.

You, Bob, sent me a reference, which I checked again; it considers the
other extreme case, where x is very large.

I have not found anything that covers the most realistic case, which is
case (c) in my previous message: when x just varies normally, with
unresponsive traffic that varies just like in today's networks.

The normal case is the one I'm most interested in. This is something
that, in LLD for Cable, may enter the access network of many cable
subscribers: surfing the web, using WebEx while working from home,
doing normal things.
The "Destruction testing" document :) is not very useful for my use
case and specific question.

We have been promised sub-millisecond latency up to the 99th percentile,
and I have simple technical questions.
LLD is a new initiative in the public IETF forum, so it is normal that
people start asking questions.
No need to get blood pressure above the threshold.

If LLD were a cable-industry-only thing, using a PHB and private
marking, all this discussion might be irrelevant in this forum,
but this is not the case.



On Thu, Jul 4, 2019 at 1:55 PM Bob Briscoe <ietf@bobbriscoe.net> wrote:

> Luca,
>
>
> On 19/06/2019 14:02, Luca Muscariello wrote:
>
> Jake,
>
> Yes, that is one scenario that I had in mind.
> Your response reassures me that my message was not totally unreadable.
>
> My understanding was
> - There are incentives to mark packets  if they get privileged treatment
> because of that marking. This is similar to the diffserv model with all the
> consequences in terms of trust.
>
> [BB] I'm afraid this is a common misunderstanding. We have gone to great
> lengths to ensure that the coupled dualQ does not give any privilege, by
> separating out latency from throughput, so:
>
>    - It solely isolates traffic that gives /itself/ low latency from
>    traffic that doesn't.
>    - It is very hard to get any throughput advantage from the mechanism,
>    relative to a FIFO (see further down this email).
>
> The phrase "relative to a FIFO" is important. In a FIFO, it is of course
> possible for flows to take more throughput than others. We see that as a
> feature of the Internet not a bug. But we accept that some might disagree...
>
> So those that want equal flow rates can add per-flow bandwidth policing,
> e.g. AFD, to the coupled dualQ. But that should be (and now can be) a
> separate policy choice.
>
> An important advance of the coupled dualQ is to cut latency without
> interfering with throughput.
>
>
> - Unresponsive traffic in particular (gaming, voice, video etc.) has
> incentives to mark. Assuming there is x% of unresponsive traffic in the
> priority queue, it is non-trivial to guess how the system works.
> - in particular it is easy to see the extreme cases,
>                (a) x is very small, assuming the system is stable, the
> overall equilibrium will not change.
>                (b) x is very large so the dctcp like sources fall back to
> cubic like and the systems behave almost like a single FIFO.
>                (c) in all other cases x varies according to the
> unresponsive sources' rates.
>                     Several different equilibria may exist, some of which
> may include oscillations. Including oscillations of all fallback
> mechanisms.
> The reason I'm asking is that these cases are not discussed in the I-D
> documents or in the references, despite these are very common use cases.
>
> [BB] This has all already been explained and discussed at length during
> various IETF meetings. I had an excellent student (Henrik Steen) act as a
> "red-team" guy. His challenge was: Can you contrive a mis-marking strategy
> with unresponsive traffic to cause any more harm than in a FIFO? We wanted
> to make sure that introducing a priority scheduler could not be exploited
> as a significant new attack vector.
>
> Have you looked at his thesis - the [DualQ-Test
> <https://tools.ietf.org/html/draft-ietf-tsvwg-aqm-dualq-coupled-09#ref-DualQ-Test>]
> reference at the end of this subsection of the Security Considerations in
> the aqm-dualq-coupled draft:
>  4.1.3.  Protecting against Unresponsive ECN-Capable Traffic
> <https://tools.ietf.org/html/draft-ietf-tsvwg-aqm-dualq-coupled-09#section-4.1.3>
> ?
> (we ruled evaluation results out of scope of this already over-long draft
> - instead giving references).
>
> Firstly, when unresponsive traffic < link rate, counter-intuitively it
> doesn't matter which queue it classifies itself into. Any responsive
> traffic in either or both queues still shares out the remaining capacity as
> if the unresponsive traffic had subtracted from the overall capacity (like
> a FIFO).
>
> Beyond that, Henrik tested whether the persistent overload mechanism that
> switches off any distinction between the queues (code in the reference
> Linux implementation
> <https://github.com/L4STeam/sch_dualpi2_upstream/blob/master/net/sched/sch_dualpi2.c>,
> pseudocode and explanation in Appendix A.2
> <https://tools.ietf.org/html/draft-ietf-tsvwg-aqm-dualq-coupled-09#appendix-A.2>)
> left any room for mis-marked traffic to gain an advantage before the
> switch-over. There was a narrow region in which unresponsive traffic
> mismarked as ECN could strengthen its attack relative to the same attack on
> the Classic queue without mismarking.
>
> I presented a one-slide summary of Henrik's experiment here in 2017 in
> IETF tcpm
> <https://datatracker.ietf.org/meeting/99/materials/slides-99-tcpm-ecn-adding-explicit-congestion-notification-ecn-to-tcp-control-packets-02#page=12>
> .
> I tried to make the legends self-explanatory as long as you work at it,
> but shout if you need it explained.
> Each column of plots shows attack traffic at increasing fractions of the
> link rate; from 70% to 200%.
>
> Try to spot the difference between the odd columns and the even columns -
> they're just a little different in the narrow window either side of 100% -
> a sharp kink instead of a smooth kink.
> I included log-scale plots of the bottom end of the range to magnify the
> difference.
>
> Yes, the system oscillates around the switch-over point, but you can see
> from the tcpm slide that the oscillations are also there in the 3rd column
> (which emulates the same switch-over in a FIFO). So we haven't added a new
> problem.
>
> In summary, the advantage of mismarking was small and it was hard for the
> attacker not to trip the dualQ into overload state when it applies the same
> drop level in either queue. And that was when the victim traffic was just a
> predictable long-running flow. With normal less predictable victim traffic,
> I cannot think how to get this attack to be effective.
>
>
> If we add the queue protection mechanism, all unresponsive  flows that are
> caught cheating are registered in a blacklist and always scheduled in the
> non-priority queue.
>
> [BB]
> 1/ Queue protection is an alternative to overload protection, not an
> addition.
>
>    - The Linux implementation solely uses the overload mechanism, which
>    is sufficient to prevent the priority scheduler amplifying a mismarking
>    attack (whether ECN or DSCP).
>    - The DOCSIS implementation use per-flow queue protection instead.
>
> 2/ Aligned incentives
>
> The coupled dualQ with just overload protection ensures incentives are
> aligned so that, normal developers won't intentionally mismark traffic. As
> explained at the start of this email:
>
> the DualQ solely isolates traffic that gives /itself/ low latency from
> traffic that doesn't. Low latency solely depends on the traffic's own
> behaviour. Traffic doesn't /get/ anything from the low latency queue, so
> there's no point mismarking to get into it.
>
> However, incentives only address rational behaviour, not accidents and
> malice. That's why DOCSIS operators asked for Q protection - to protect
> against something accidentally or deliberately introducing bursty or
> excessive traffic into the low latency queue.
>
> The Linux code is sufficient under normal circumstances though. There are
> already other mechanisms that deal with the worms, trojans, etc. that might
> launch these attacks.
>
> 3/ DOCSIS Q protection does not black-list flows.
>
> It redirects certain /packets/ from those flows with the highest queuing
> scores into the Classic queue, only if those packets would otherwise risk a
> threshold delay for the low latency queue being exceeded.
>
> If a flow has a temporary wobble, some of its packets get redirected to
> protect the low latency queue, but if it gets back on track, then there's
> just no further packet redirection.
>
> If that happens, unresponsive flows will get a service quality that is
> worse than if a single FIFO were used for all flows.
>
> 4/ Slight punishment is a feature, not a bug
>
> If an unresponsive flow is well-paced and not contributing to queuing, it
> will accumulate only a low queuing score, and experience no redirected
> packets.
>
> If it is contributing to queuing and it is mismarking itself, then Q Prot
> will redirect some of its packets, and the continual reordering will
> (intentionally) give it worse service quality. This deliberate slight
> punishment gives developers a slight incentive to mark their flows
> correctly.
>
> I could explain more about the queuing score (I think I already did for
> you on these lists), but it's all in Annex P of the DOCSIS spec
> <https://specification-search.cablelabs.com/CM-SP-MULPIv3.1>. and I'm
> trying to write a stand-alone document about it at the moment.
>
>
>
> Using a flow blacklist brings back the complexity that dualq is supposed
> to remove compared to flow-isolation by flow-queueing.
> It seems to me that the blacklist is actually necessary to make dualq work
> under the assumption that x is small,
>
> [BB] As above, the Linux implementation works and aligns incentives
> without Q Prot, which is merely an optional additional protection against
> accidents and malice.
>
> (and there's no flow black-list).
>
>
> because in the other cases the behavior
> of the dualq system is unspecified and likely subject to instabilities,
> i.e. potentially different kind of oscillations.
>
>
> I do find the tone of these emails rather disheartening. We've done all
> this work that we think is really cool. And all we get in return is
> criticism in an authoritative tone as if it is backed by experiments. But
> so far it is not. There seems to be a presumption that we are not
> professional and we are somehow not to be trusted to have done a sound job.
>
> Yes, I'm sure mistakes can be found in our work. But it would be nice if
> the tone of these emails could become more constructive. Possibly even some
> praise. There seems to be a presumption of disrespect that I'm not used to,
> and I would rather it stopped.
>
> Sorry for going silent recently - had too much backlog. I'm working my way
> backwards through this thread. Next I'll reply to Jake's email, which is,
> as always, perfectly constructive.
>
> Cheers
>
>
> Bob
>
> Luca
>
>
>
>
>
> On Tue, Jun 18, 2019 at 9:25 PM Holland, Jake <jholland@akamai.com> wrote:
>
>> Hi Bob and Luca,
>>
>> Thank you both for this discussion, I think it helped crystallize a
>> comment I hadn't figured out how to make yet, but was bothering me.
>>
>> I’m reading Luca’s question as asking about fixed-rate traffic that does
>> something like a cutoff or downshift if loss gets bad enough for long
>> enough, but is otherwise unresponsive.
>>
>> The dualq draft does discuss unresponsive traffic in 3 of the sub-
>> sections in section 4, but there's a point that seems sort of swept
>> aside without comment in the analysis to me.
>>
>> The referenced paper[1] from that section does examine the question
>> of sharing a link with unresponsive traffic in some detail, but the
>> analysis seems to bake in an assumption that there's a fixed amount
>> of unresponsive traffic, when in fact for a lot of the real-life
>> scenarios for unresponsive traffic (games, voice, and some of the
>> video conferencing) there's some app-level backpressure, in that
>> when the quality of experience goes low enough, the user (or a qoe
>> trigger in the app) will often change the traffic demand at a higher
>> layer than a congestion controller (by shutting off video, for
>> instance).
>>
>> The reason I mention it is because it seems like unresponsive
>> traffic has an incentive to mark L4S and get low latency.  It doesn't
>> hurt, since it's a fixed rate and not bandwidth-seeking, so it's
>> perfectly happy to massively underutilize the link. And until the
>> link gets overloaded it will no longer suffer delay when using the
>> low latency queue, whereas in the classic queue queuing delay provides
>> a noticeable degradation in the presence of competing traffic.
>>
>> I didn't see anywhere in the paper that tried to check the quality
>> of experience for the UDP traffic as non-responsive traffic approached
>> saturation, except by inference that loss in the classic queue will
>> cause loss in the LL queue as well.
>>
>> But letting unresponsive flows get away with pushing out more classic
>> traffic and removing the penalty that classic flows would give it seems
>> like a risk that would result in more use of this kind of unresponsive
>> traffic marking itself for the LL queue, since it just would get lower
>> latency almost up until overload.
>>
>> Many of the apps that send unresponsive traffic would benefit from low
>> latency and isolation from the classic traffic, so it seems a mistake
>> to claim there's no benefit, and it furthermore seems like there's
>> systematic pressures that would often push unresponsive apps into this
>> domain.
>>
>> If that line of reasoning holds up, the "rather specific" phrase in
>> section 4.1.1 of the dualq draft might not turn out to be so specific
>> after all, and could be seen as downplaying the risks.
>>
>> Best regards,
>> Jake
>>
>> [1] https://riteproject.files.wordpress.com/2018/07/thesis-henrste.pdf
>>
>> PS: This seems like a consequence of the lack of access control on
>> setting ECT(1), and maybe the queue protection function would address
>> it, so that's interesting to hear about.
>>
>> But I thought the whole point of dualq over fq was that fq state couldn't
>> scale properly in aggregating devices with enough expected flows sharing
>> a queue?  If this protection feature turns out to be necessary, would that
>> advantage be gone?  (Also: why would one want to turn this protection off
>> if it's available?)
>>
>>
>>
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane
>
>
> --
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/
>
>

[-- Attachment #2: Type: text/html, Size: 21092 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-05  6:46                   ` De Schepper, Koen (Nokia - BE/Antwerp)
@ 2019-07-05  8:51                     ` Jonathan Morton
  2019-07-08 10:26                       ` De Schepper, Koen (Nokia - BE/Antwerp)
  0 siblings, 1 reply; 59+ messages in thread
From: Jonathan Morton @ 2019-07-05  8:51 UTC (permalink / raw)
  To: De Schepper, Koen (Nokia - BE/Antwerp); +Cc: Bob Briscoe, ecn-sane, tsvwg

> On 5 Jul, 2019, at 9:46 am, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> 
>>> 2: DualQ can be defeated by an adversary, destroying its ability to isolate L4S traffic.
> 
> Before jumping to another point, let's close down your original issue. Since you didn't mention, I assume that you agree with the following, right?
> 
>        "You cannot defeat a DualQ" (at least no more than a single Q)

I consider forcibly degrading DualQ to single-queue mode to be a defeat.  However…

>>> But that's exactly the problem.  Single queue AQM does not isolate L4S traffic from "classic" traffic, so the latter suffers from the former's relative aggression in the face of AQM activity.
> 
> With L4S a single queue can differentiate between Classic and L4S traffic. That's why it knows exactly how to treat the traffic. For Non-ECT and ECT(0), square the probability; for ECT(1), don't square. It works exactly like a DualQ, but without the latency isolation. Both types get the same throughput AND delay. See the PI2 paper, which is exactly about a single Q.

Okay, this is an important point: the real assertion is not that DualQ itself is needed for L4S to be safe on the Internet, but that differential AQM treatment must be present at the bottleneck.  Defeating DualQ only destroys L4S's latency advantage over "classic" traffic.  We might actually be making progress here!
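
For concreteness, the single-queue PI2 treatment described above might be sketched like this (a rough illustration with assumed names, not code from the PI2 paper; the key point is the squared probability for Classic traffic):

```python
import random

def pi2_mark_or_drop(ecn: str, p: float) -> str:
    """Single-queue PI2 treatment, as a sketch.

    p is the base (linear) probability from the PI controller.
    Classic traffic (Non-ECT / ECT(0)) sees p squared; scalable
    ECT(1) traffic sees the linear p, giving throughput fairness
    between the two response laws without any queue isolation.
    """
    if ecn == "ECT(1)":                 # scalable (L4S-style) sender
        return "CE" if random.random() < p else "FORWARD"
    if ecn == "ECT(0)":                 # classic ECN sender
        return "CE" if random.random() < p * p else "FORWARD"
    # Non-ECT: same squared probability, but drop instead of mark
    return "DROP" if random.random() < p * p else "FORWARD"
```

With p = 0.1, for instance, an ECT(1) packet is marked 10% of the time while a Classic packet is marked or dropped only 1% of the time: the squared relation between the two signals that this thread keeps returning to.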

> I agree you cannot isolate in a single Q, and this is why L4S is better than SCE, because it tells the AQM what to do, even if it has a single Q. SCE needs isolation, L4S not.

Devil's advocate time.  What if, instead of providing differential treatment WRT CE marking, PI2 instead applied both marking strategies simultaneously - the higher rate using SCE, and the lower rate using CE?  Classic traffic would see only the latter; L4S could use the former.

> Years ago we tried things similar to what SCE needs, and found that it can't work. For throughput fairness you need the squared relation between the two signals, but with SCE you need to apply both signals in parallel, because you don't know the sender type.

Yes, that's exactly what we do - and it does work.

> 	- So either the sender needs to ignore CE if it gets SCE, or ignore SCE if you get CE. The first is dangerous if you have multiple bottlenecks, and the second is defeating the purpose of SCE. Any other combination leads to unfairness (double response).

This is a false dichotomy.  We quickly realised both of those options were unacceptable, and sought a third way.

SCE senders apply a reduced CE response when also responding to parallel SCE feedback, roughly in line with ABE, on the grounds that responding to SCE does some of the necessary reduction already.  The reduced response is still a Multiplicative Decrease, so it fits with normal TCP congestion control principles.
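
A sketch of that combined response (the constants here are illustrative assumptions, not values from the SCE drafts):

```python
def cwnd_after_feedback(cwnd: float, saw_ce: bool, sce_fraction: float) -> float:
    """Sender window update combining CE and SCE feedback (sketch).

    sce_fraction is the fraction of acked segments carrying SCE
    feedback over the last RTT.  A CE mark still causes a
    Multiplicative Decrease, but a reduced (ABE-style) one when SCE
    feedback is also being acted on, since the SCE response has
    already done part of the reduction.
    """
    if saw_ce:
        beta = 0.7 if sce_fraction > 0 else 0.5   # assumed MD factors
        return cwnd * beta
    # No CE this RTT: a gentle trim proportional to SCE marking.
    return cwnd * (1.0 - 0.05 * sce_fraction)     # assumed gain
```

Both branches remain multiplicative, so the control loop stays within conventional TCP congestion-control principles.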

> 	- you separate the signals in queue depth, first applying SCE and later CE, as you originally proposed, but that results in starvation for SCE.

Yes, although this approach gives the best performance for SCE when used with flow isolation, or when all flows are known to be SCE-aware.  So we apply this strategy in those cases, and move the SCE marking function up to overlap CE marking specifically for single queues.

It has been suggested that single queue AQMs are rare in any case, but this approach covers that corner case.

> Add on top that SCE makes it impossible to use DualQ, as you cannot differentiate the traffic types.

SCE is designed around not *needing* to differentiate the traffic types.  Single queues have known disadvantages, and SCE doesn't worsen them.

Meanwhile, we have proposed LFQ to cover the DualQ use case.  I'd be interested in hearing a principled critique of it.

 - Jonathan Morton


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]  Comments on L4S drafts
  2019-07-04 17:54                   ` Bob Briscoe
@ 2019-07-05  8:26                     ` Jonathan Morton
  0 siblings, 0 replies; 59+ messages in thread
From: Jonathan Morton @ 2019-07-05  8:26 UTC (permalink / raw)
  To: Bob Briscoe; +Cc: De Schepper, Koen (Nokia - BE/Antwerp), ecn-sane, tsvwg

> On 4 Jul, 2019, at 8:54 pm, Bob Briscoe <ietf@bobbriscoe.net> wrote:

> You are assuming that the one thing we haven't done yet (fall-back to TCP-friendly on detection of classic ECN) won't work, whereas all the problems you have not addressed yet with SCE will work.

This is whataboutism.  Please don't.

We have a complete end-to-end implementation of SCE, which not only works but is safe-by-design in today's Internet, as outlined not only in the I-Ds we submitted this week, but also below.

> I believe that using this to enable fine-grained congestion control would still rely on the semantics of the SCE style of signalling still. Correct?

Yes, although the fine detail of these semantics has changed since the first I-D in light of implementation experience.  I do suggest reading the new version.

> 	• Q1. Does SCE require per-flow scheduling?

SCE does not require per-flow scheduling.

It does work *better* with per-flow scheduling, but that's also true of most types of existing traffic.

> 		• If so, how do you expect it to be supported on L2 links, where not even the IP header is accessible, let alone L4?

While this question is moot, may I ask how you expect the ECN field to be used when the IP header is inaccessible?  I'm sure either DCTCP or SCE-like principles can be applied to an L2 flow, but it would not be through ECN per se.

> 		• If not, how does it work? 

In the first place, SCE flows work transparently with existing dumb and CE-marking infrastructure, and behave in an RFC-3168 compliant manner in that case.  So no special preparations in the network are required merely to allow SCE endpoints to be deployed safely.  We consider this one of SCE's key advantages over L4S.

We have now implemented and at least briefly tested a way to mark SCE in a single-queue bottleneck while retaining fairness versus non-SCE traffic.  It requires only an adjustment to a detail of the way SCE marking is done at that node - that is, altering the relationship between CE and SCE marking - and does not increase implementation complexity even there.  The tradeoff is that SCE's benefit is diluted because SCE flows may receive unnecessary CE marks, but it does achieve fairness (for example) between plain Reno and Reno-SCE.

You might wish to read the submitted draft outlining our initial test results.  They do in fact focus on single-queue behaviour, both with single flows and with two similar or dissimilar flows competing, and should thus answer additional questions you may have on this topic.  We are still refining this, of course.

> 	• Q2. How do you address the lack of ECT(1) feedback in TCP, given no-one is implementing the AccECN TCP option? And even if they did, do you have measurements on how few middleboxes / proxies, etc will allow traversal?

Our experimental reference implementation uses the former NS bit in the TCP header as an ESCE feedback mechanism.  NS is unused because Nonce Sum was never deployed, but because Nonce Sum was specified in an RFC, we expect it will traverse the Internet quite well.  Additionally, the reuse of NS in another role also associated with ECT(1) seems poetic.  Controlled tests over public Internet paths, as well as more extensively in lab conditions, have been carried out successfully.

Disruption of either SCE or ESCE signals is tolerated by design, because in extremis SCE flows still respond to CE marks and packet drops appropriately for effective congestion control.

We expect to publish an I-D covering the above shortly.

Cursory examination of QUIC indicates that it already has a mechanism specified for detailed ECN feedback, and naturally this can also support SCE.

> 	• Q3. How do you address all the tunnel decapsulators that will black-hole ECT(1) marking of the outer? Do you have measurements of how much of a blockage to progress this will be?

I imagine a blackhole of ECT(1) would also be problematic for L4S.  I would consider such tunnels RFC-ignorant (i.e. buggy), because ECT(1) is expressly permitted by RFC 3168 in the same circumstances where ECT(0) is.  We have not encountered any such problems ourselves.

In any case, the precise effects will depend on the nature of the blackhole.  If they change ECT(1) to ECT(0) or Not-ECT, then SCE flows will not receive SCE information and will therefore behave like RFC-3168 flows do.  If the affected packets are dropped, then TCP should be able to recover from that.

> 	• Q4. How do you address the interaction of the two timescale dynamics in the SCE congestion control?

Which two timescale dynamics are you referring to?

> 	• Q5. Can out-of-order tolerance be relaxed on links supporting SCE? (not a problem as such, but a lack of one of L4S's advantages)

We consider that aspect of L2 link design to be orthogonal to SCE.  Most transports currently deployed should be able to cope with microsecond-level reordering on multi-millisecond Internet paths without triggering unnecessary retransmissions.

> {Note 1}: Implementation complexity is only a small part of the objections to FQ.

We are still waiting for a good explanation of these objections.  So far, we are aware only of the well-known vulnerability to "gaming" by employing more flows than necessary - but we also have defences against that, which we plan to add to a future version of the LFQ draft.  These defences are semantically similar to the dual host-flow fairness currently deployed in Cake, but with a more hardware-friendly algorithm.

 - Jonathan Morton


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-04 14:03                 ` Jonathan Morton
  2019-07-04 17:54                   ` Bob Briscoe
@ 2019-07-05  6:46                   ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-05  8:51                     ` Jonathan Morton
  1 sibling, 1 reply; 59+ messages in thread
From: De Schepper, Koen (Nokia - BE/Antwerp) @ 2019-07-05  6:46 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Bob Briscoe, ecn-sane, tsvwg

>> 2: DualQ can be defeated by an adversary, destroying its ability to isolate L4S traffic.

Before jumping to another point, let's close out your original issue. Since you didn't mention it, I assume you agree with the following, right?

        "You cannot defeat a DualQ" (at least no more than a single Q)


>> But that's exactly the problem.  Single queue AQM does not isolate L4S traffic from "classic" traffic, so the latter suffers from the former's relative aggression in the face of AQM activity.

With L4S, a single queue can differentiate between Classic and L4S traffic, so it knows exactly how to treat each type: for Non-ECT and ECT(0) square the marking probability, and for ECT(1) don't square. It then works exactly like a DualQ, but without the latency isolation: both types get the same throughput AND delay. See the PI2 paper, which is exactly about a single Q.
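As I understand the PI2 paper, that marking decision reduces to a couple of lines (the function name is mine, for illustration):

```python
def signal_probability(p, ect1):
    """PI2-style single-queue marking sketch: the AQM's internal
    probability p is applied linearly to ECT(1) (L4S) packets and
    squared for Non-ECT/ECT(0) (Classic) packets, which balances
    throughput between the square-root (Reno/Cubic) and linear
    (DCTCP-style) congestion response laws."""
    return p if ect1 else p * p

# With p = 0.1: Classic traffic sees a 1% drop/mark probability,
# while L4S traffic sees 10% shallow ECT(1) marks.
```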

I agree you cannot isolate in a single Q, and this is why L4S is better than SCE: it tells the AQM what to do even with a single Q. SCE needs isolation; L4S does not.
We tried mechanisms similar to what SCE needs years ago, and found that they can't work. For throughput fairness you need the squared relation between the 2 signals, but with SCE you have to apply both signals in parallel, because you don't know the sender type.
	- Either the sender ignores CE when it gets SCE, or ignores SCE when it gets CE. The first is dangerous if there are multiple bottlenecks, and the second defeats the purpose of SCE. Any other combination leads to unfairness (a double response).
	- Or you separate the signals in queue depth, first applying SCE and later CE, as you originally proposed, but that results in starvation for SCE.
	- Or you apply SCE less aggressively than CE, but that makes it useless, as it creates a bigger queue for SCE, and CE would kick in first anyway.

Add on top that SCE makes it impossible to use DualQ, as you cannot differentiate the traffic types.
So this is why I think L4S is the best solution. Why would you try an alternative if it cannot work?

Koen.



-----Original Message-----
From: Jonathan Morton <chromatix99@gmail.com> 
Sent: Thursday, July 4, 2019 4:03 PM
To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>
Cc: Bob Briscoe <ietf@bobbriscoe.net>; ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts

> On 4 Jul, 2019, at 4:43 pm, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> 
> So conclusion:   a DualQ works exactly the same as any other single Q AQM supporting ECN !!
> Try it, and you'll see...

But that's exactly the problem.  Single queue AQM does not isolate L4S traffic from "classic" traffic, so the latter suffers from the former's relative aggression in the face of AQM activity.  This isolation is the very reason why something like DualQ is proposed, so the fact that it can be defeated into this degraded single-queue mode is a genuine problem.

May I direct you to our LFQ draft, published yesterday, for what we consider to be a much more robust approach, yet with similar hardware requirements to DualQ?  I'd be interested in hearing feedback.

 - Jonathan Morton

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]  Comments on L4S drafts
  2019-07-04 14:03                 ` Jonathan Morton
@ 2019-07-04 17:54                   ` Bob Briscoe
  2019-07-05  8:26                     ` Jonathan Morton
  2019-07-05  6:46                   ` De Schepper, Koen (Nokia - BE/Antwerp)
  1 sibling, 1 reply; 59+ messages in thread
From: Bob Briscoe @ 2019-07-04 17:54 UTC (permalink / raw)
  To: Jonathan Morton, De Schepper, Koen (Nokia - BE/Antwerp); +Cc: ecn-sane, tsvwg

[-- Attachment #1: Type: text/plain, Size: 3839 bytes --]

Jonathan,

On 04/07/2019 15:03, Jonathan Morton wrote:
>> On 4 Jul, 2019, at 4:43 pm, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
>>
>> So conclusion:   a DualQ works exactly the same as any other single Q AQM supporting ECN !!
>> Try it, and you'll see...
> But that's exactly the problem.  Single queue AQM does not isolate L4S traffic from "classic" traffic, so the latter suffers from the former's relative aggression in the face of AQM activity.
You are assuming that the one thing we haven't done yet (fall-back to 
TCP-friendly on detection of classic ECN) won't work, whereas all the 
problems you have not addressed yet with SCE will work.

>   This isolation is the very reason why something like DualQ is proposed, so the fact that it can be defeated into this degraded single-queue mode is a genuine problem.
>
> May I direct you to our LFQ draft, published yesterday, for what we consider to be a much more robust approach, yet with similar hardware requirements to DualQ?  I'd be interested in hearing feedback.
I will certainly read. I assume you are aware that implementation 
complexity is only a small part of the objections to FQ. {Note 1}

I believe that using this to enable fine-grained congestion control 
would still rely on the semantics of the SCE style of signalling still. 
Correct?

So, for the third time of asking, can you or someone please respond to 
the 5 points that will be problematic for SCE (I listed them on 11 Mar 
2019 on tsvwg@ietf.org re-pasted from bloat@ to you & DaveT the day 
after you posted the first draft). You will not get anywhere in the IETF 
without addressing serious problems that people raise with your proposal.

I don't need to tell you that the Internet is a complex place to 
introduce anything new, especially into IP itself. If you cannot solve 
/all/ these problems, it will save everyone a lot of time if you just 
say so.

I have repeated bullets summarizing each question below (I've removed 
the one about re-purposing the receive window, which DaveT wished hadn't 
been mentioned, and added Q4 which I asked more recently). You may wish 
to start a new thread to answer some of the more substantive ones. They 
are roughly ranked in order of seriousness with Q1-3 being show-stoppers.

  * Q1. Does SCE require per-flow scheduling?
      o If so, how do you expect it to be supported on L2 links, where
        not even the IP header is accessible, let alone L4?
      o If not, how does it work?
  * Q2. How do you address the lack of ECT(1) feedback in TCP, given
    no-one is implementing the AccECN TCP option? And even if they did,
    do you have measurements on how few middleboxes / proxies, etc will
    allow traversal?
  * Q3. How do you address all the tunnel decapsulators that will
    black-hole ECT(1) marking of the outer? Do you have measurements of
    how much of a blockage to progress this will be?
  * Q4. How do you address the interaction of the two timescale dynamics
    in the SCE congestion control?
  * Q5. Can out-of-order tolerance be relaxed on links supporting SCE?
    (not a problem as such, but a lack of one of L4S's advantages)


{Note 1}: Implementation complexity is only a small part of the 
objections to FQ. One major reason is in Q1 above. I have promised a 
write-up of all the other reasons for why per-flow scheduling is not a 
desirable goal even if it can be achieved with low complexity. I've got 
it half written (as a tech report, not an Internet Draft), but it's on 
hold while other stuff takes priority for me (not least an awkwardly 
timed family vacation starting tomorrow for 10 days).


Cheers



Bob




>
>   - Jonathan Morton

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/


[-- Attachment #2: Type: text/html, Size: 5347 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-04 13:43               ` De Schepper, Koen (Nokia - BE/Antwerp)
@ 2019-07-04 14:03                 ` Jonathan Morton
  2019-07-04 17:54                   ` Bob Briscoe
  2019-07-05  6:46                   ` De Schepper, Koen (Nokia - BE/Antwerp)
  0 siblings, 2 replies; 59+ messages in thread
From: Jonathan Morton @ 2019-07-04 14:03 UTC (permalink / raw)
  To: De Schepper, Koen (Nokia - BE/Antwerp); +Cc: Bob Briscoe, ecn-sane, tsvwg

> On 4 Jul, 2019, at 4:43 pm, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> 
> So conclusion:   a DualQ works exactly the same as any other single Q AQM supporting ECN !!
> Try it, and you'll see...

But that's exactly the problem.  Single queue AQM does not isolate L4S traffic from "classic" traffic, so the latter suffers from the former's relative aggression in the face of AQM activity.  This isolation is the very reason why something like DualQ is proposed, so the fact that it can be defeated into this degraded single-queue mode is a genuine problem.

May I direct you to our LFQ draft, published yesterday, for what we consider to be a much more robust approach, yet with similar hardware requirements to DualQ?  I'd be interested in hearing feedback.

 - Jonathan Morton

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-06-19  4:24       ` Holland, Jake
  2019-06-19 13:02         ` Luca Muscariello
@ 2019-07-04 13:45         ` Bob Briscoe
  2019-07-10 17:03           ` Holland, Jake
  1 sibling, 1 reply; 59+ messages in thread
From: Bob Briscoe @ 2019-07-04 13:45 UTC (permalink / raw)
  To: Holland, Jake; +Cc: Luca Muscariello, ecn-sane, tsvwg

[-- Attachment #1: Type: text/plain, Size: 9750 bytes --]

Jake,

On 19/06/2019 05:24, Holland, Jake wrote:
> Hi Bob and Luca,
>
> Thank you both for this discussion, I think it helped crystallize a
> comment I hadn't figured out how to make yet, but was bothering me.
>
> I’m reading Luca’s question as asking about fixed-rate traffic that does
> something like a cutoff or downshift if loss gets bad enough for long
> enough, but is otherwise unresponsive.
>
> The dualq draft does discuss unresponsive traffic in 3 of the sub-
> sections in section 4, but there's a point that seems sort of swept
> aside without comment in the analysis to me.
>
> The referenced paper[1] from that section does examine the question
> of sharing a link with unresponsive traffic in some detail, but the
> analysis seems to bake in an assumption that there's a fixed amount
> of unresponsive traffic, when in fact for a lot of the real-life
> scenarios for unresponsive traffic (games, voice, and some of the
> video conferencing) there's some app-level backpressure, in that
> when the quality of experience goes low enough, the user (or a qoe
> trigger in the app) will often change the traffic demand at a higher
> layer than a congestion controller (by shutting off video, for
> instance).
>
> The reason I mention it is because it seems like unresponsive
> traffic has an incentive to mark L4S and get low latency.  It doesn't
> hurt, since it's a fixed rate and not bandwidth-seeking, so it's
> perfectly happy to massively underutilize the link. And until the
> link gets overloaded it will no longer suffer delay when using the
> low latency queue, whereas in the classic queue queuing delay provides
> a noticeable degradation in the presence of competing traffic.
It is very much intentional to allow unresponsive traffic in the L queue 
if it is not contributing to queuing.

You're right that the title of S.4.1.3 sounds like there's a presumption 
that all unresponsive ECN traffic is bad. Sorry that was not the 
intention. Elsewhere the drafts do say that a reasonable amount of 
smoothly paced unresponsive traffic is OK alongside any responsive traffic.

(I've just posted an -09 rev, but I'll post a draft-10 that fixes that, 
hopefully before the Monday cut-off).

If you're talking about where unresponsive traffic is mentioned in 
4.1.1, I think that's OK, 'cos that's in the context of saturated 
congestion marking (when it's not OK to be unresponsive).



>
> I didn't see anywhere in the paper that tried to check the quality
> of experience for the UDP traffic as non-responsive traffic approached
> saturation, except by inference that loss in the classic queue will
> cause loss in the LL queue as well.
Yeah, in the context of Henrik's thesis (your [1]), "unresponsive" was 
used as a byword for "attack traffic". But that shouldn't be taken to 
mean unresponsive is considered evil for L4S in general.

Indeed, Low Latency DOCSIS started from the assumption of using a low 
latency queue for unresponsive traffic (games, VoIP, etc), then added 
responsive L4S traffic into the same queue later.

You may have seen the draft about assigning a DSCP for 
Non-Queue-Building (NQB) traffic for that purpose (as with L4S and 
unlike Diffserv, this codepoint solely describes the traffic's 
behaviour, not what it wants or needs).
     https://tools.ietf.org/html/draft-white-tsvwg-nqb-02
And there are references in ecn-l4s-id to other identifiers that could 
be used to get unresponsive traffic into the low latency queue (DOCSIS 
classifies EF and NQB as low latency by default).

We don't want ECN to be the only way to get into the L queue, 'cos we 
don't want to encourage mismarking as 'ECN' when a flow is not actually 
going to respond to ECN.

>
> But letting unresponsive flows get away with pushing out more classic
> traffic and removing the penalty that classic flows would give it seems
> like a risk that would result in more use of this kind of unresponsive
> traffic marking itself for the LL queue, since it just would get lower
> latency almost up until overload.
As explained to Luca, it's counter-intuitive, but responsive flows 
(either C or L) use the same share of capacity irrespective of which 
queue any unresponsive traffic is in. Think of it as the unresponsive 
traffic subtracting capacity from the aggregate (because both queues can 
use the whole aggregate), then the coupling sharing out what's left. The 
coupling makes it like a FIFO from a bandwidth perspective.
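That subtraction intuition fits in a few lines (a toy model of the claim, not the draft's algorithm; the function name is mine):

```python
def responsive_share(link_capacity, unresponsive_rate, n_responsive):
    """Toy model of the coupled-DualQ bandwidth claim: unresponsive
    traffic below the link rate simply subtracts from the aggregate,
    and the responsive flows split the remainder -- whichever queue
    the unresponsive traffic classified itself into."""
    remaining = max(link_capacity - unresponsive_rate, 0.0)
    return remaining / n_responsive

# 100 Mb/s link, one 30 Mb/s unresponsive flow, 2 responsive flows:
# each responsive flow gets 35 Mb/s, in either queue arrangement.
```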

You can try this with the tool you mentioned that you had downloaded. 
There's a slider to add unresponsive traffic to either queue.

So it's fine if unresponsive traffic doesn't cause any queuing itself. 
It can happily use the L queue. This was a very important design goal, 
but we write about it circumspectly in the IETF drafts, 'cos talk about 
allowing unresponsive traffic can trigger political correctness 
arguments. (Oops, am I writing on an IETF list?)

Nonetheless, when an unresponsive flow(s) is consuming some capacity, 
and a responsive flow(s) takes the total over the available capacity, 
then both are responsible in proportion to their contribution to the 
queue, 'cos the unresponsive flow didn't respond (it didn't even try to).

This is why it's OK to have a small unresponsive flow, but it becomes 
less and less OK to have a larger and larger unresponsive flow.

BTW, the proportion of blame for the queue is what the queuing score 
represents in the DOCSIS queue protection algo. It's quite simple but 
subtle. See your PS at the end. Right now I'm going to get on with 
writing about that in a proper doc, rather than in an email.


>
> Many of the apps that send unresponsive traffic would benefit from low
> latency and isolation from the classic traffic, so it seems a mistake
> to claim there's no benefit, and it furthermore seems like there's
> systematic pressures that would often push unresponsive apps into this
> domain.
There's no bandwidth benefit.
There's only latency benefit, and then the only benefits are:

  * the low latency behaviour of yourself and other flows behaving like you
  * and, critically, isolation from those flows not behaving well like you.

Neither gives an incentive to mismark - you get nothing if you don't 
behave. And there's a disincentive for 'Classic' TCP flows to mismark, 
'cos they badly underutilize without a queue.

(See also reply to Luca addressing accidents and malice, which lie 
outside control by incentives).

>
> If that line of reasoning holds up, the "rather specific" phrase in
> section 4.1.1 of the dualq draft might not turn out to be so specific
> after all, and could be seen as downplaying the risks.
Yup, as said, will fix the phrasing in 4.1.3. But I'm not going to touch 
4.1.1 without better understanding what the problem is there.

>
> Best regards,
> Jake
>
> [1] https://riteproject.files.wordpress.com/2018/07/thesis-henrste.pdf
>
> PS: This seems like a consequence of the lack of access control on
> setting ECT(1), and maybe the queue protection function would address
> it, so that's interesting to hear about.
Yeah, I'm trying to write about that next. But if you extract Appendix P 
from the DOCSIS 3.1 spec it's explained pretty well already and openly 
available.

However, I want it to be clear that Q Prot is not /necessary/ for L4S - 
and it's also got wider applicability, I think.

> But I thought the whole point of dualq over fq was that fq state couldn't
> scale properly in aggregating devices with enough expected flows sharing
> a queue?  If this protection feature turns out to be necessary, would that
> advantage be gone?  (Also: why would one want to turn this protection off
> if it's available?)
1/ The q-prot mechanism certainly has the disadvantage that it has to 
access L4 headers. But it is much more lightweight than FQ.

There's no queue state per flow. The flow-state is just a number that 
represents its own expiry time - a higher queuing score pushes out the 
expiry time further. If it has expired when the next packet of the flow 
arrives, it just starts from now, like a new flow, otherwise it adds to 
the existing expiry time. Long-running L4S flows don't hold on to 
flow-state between most packets - it usually expires reasonably early in 
the gap between the packets of a normal flow, then it can be recycled 
for packets from any other flows that arrive in between. So only 
misbehaving flows hold flow state persistently.

The subtle part is the queuing score. It uses the internal variable from 
the AQM that drives the ECN marking probability - call it p (between 0 
and 1 in floating point). And it takes the size of each arriving packet 
of a flow and scales by the value of p on arrival. This would accumulate 
a number which would rise at the so-called congestion-rate of the flow, 
i.e. the rate at which the flow is causing congestion (the rate at which 
it is sending bytes that are ECN marked or dropped).

However, rather than just doing that, the queuing score is also 
normalized into time units (to represent the expiry time of the flow 
state, as above). That's possible by just dividing by a constant that 
represents the acceptable congestion-rate per flow (rounded up to an 
integer power of 2 for efficiency). A nice property of the linear 
scaling of L4S is that this number is a constant for any link rate.

That's probably not understandable. Let me write it up properly - with 
some explanatory pictures and examples.
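Pending that write-up, here is my reading of the bookkeeping in code form (the function name and the normalizing constant are assumptions for the sketch, not values from the DOCSIS spec):

```python
def update_expiry(now, expiry, pkt_bytes, p, norm_bytes_per_sec=12500.0):
    """Sketch of the DOCSIS-style queuing score described above.

    Each arriving packet adds pkt_bytes * p to the flow's score, so
    the score grows at the flow's congestion-rate (the rate at which
    it sends bytes that get marked or dropped).  Dividing by an
    acceptable per-flow congestion-rate (norm_bytes_per_sec, an
    assumed constant) converts the score into time units, so the
    whole flow state is just one expiry time."""
    if expiry < now:
        expiry = now        # state had expired: start afresh, like a new flow
    return expiry + pkt_bytes * p / norm_bytes_per_sec

# A flow whose packets are never marked (p == 0) keeps expiry == now,
# so its state can be recycled almost immediately; only persistently
# marked (misbehaving) flows push their expiry out and hold state.
```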


Bob

>
>
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/


[-- Attachment #2: Type: text/html, Size: 12392 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg]   Comments on L4S drafts
  2019-07-04 12:24             ` Jonathan Morton
@ 2019-07-04 13:43               ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-04 14:03                 ` Jonathan Morton
  0 siblings, 1 reply; 59+ messages in thread
From: De Schepper, Koen (Nokia - BE/Antwerp) @ 2019-07-04 13:43 UTC (permalink / raw)
  To: Jonathan Morton, Bob Briscoe; +Cc: ecn-sane, tsvwg

Jonathan,

>> 2: DualQ can be defeated by an adversary, destroying its ability to isolate L4S traffic.

Not correct. DualQ works no differently from any (single-Q) FIFO, which can be defeated by non-responsive traffic.
It does not even matter what type of traffic the adversary sends (L4S or Classic drop/mark), as the adversary pushes away the responsive traffic only via the congestion signal it invokes in the AQM (drop, or Classic or L4S marking). The switch to drop for all traffic from 25% onwards prevents ECN flows from gaining a benefit under overload caused by non-responsive flows. This mechanism also protects Classic ECN single-Q AQMs, as defined in the ECN RFCs.

So conclusion:   a DualQ works exactly the same as any other single Q AQM supporting ECN !!
Try it, and you'll see...

Koen.

-----Original Message-----
From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Jonathan Morton
Sent: Thursday, July 4, 2019 2:24 PM
To: Bob Briscoe <ietf@bobbriscoe.net>
Cc: ecn-sane@lists.bufferbloat.net; tsvwg@ietf.org
Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts

> On 4 Jul, 2019, at 2:54 pm, Bob Briscoe <ietf@bobbriscoe.net> wrote:
> 
> The phrase "relative to a FIFO" is important. In a FIFO, it is of course possible for flows to take more throughput than others. We see that as a feature of the Internet not a bug. But we accept that some might disagree...

Chalk me up as among those who consider "no worse than a FIFO" to not be very reassuring.  As is well documented and even admitted in L4S drafts, L4S flows tend to squash "classic" flows in a FIFO.

So the difficulty here is twofold:

1: DualQ or FQ is needed to make L4S coexist with existing traffic, and

2: DualQ can be defeated by an adversary, destroying its ability to isolate L4S traffic.

I'll read your reply to Jake when it arrives.

 - Jonathan Morton


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-07-04 11:54           ` Bob Briscoe
@ 2019-07-04 12:24             ` Jonathan Morton
  2019-07-04 13:43               ` De Schepper, Koen (Nokia - BE/Antwerp)
  2019-07-05  9:48             ` Luca Muscariello
  1 sibling, 1 reply; 59+ messages in thread
From: Jonathan Morton @ 2019-07-04 12:24 UTC (permalink / raw)
  To: Bob Briscoe; +Cc: Luca Muscariello, Holland, Jake, ecn-sane, tsvwg

> On 4 Jul, 2019, at 2:54 pm, Bob Briscoe <ietf@bobbriscoe.net> wrote:
> 
> The phrase "relative to a FIFO" is important. In a FIFO, it is of course possible for flows to take more throughput than others. We see that as a feature of the Internet not a bug. But we accept that some might disagree...

Chalk me up as among those who consider "no worse than a FIFO" to not be very reassuring.  As is well documented and even admitted in L4S drafts, L4S flows tend to squash "classic" flows in a FIFO.

So the difficulty here is twofold:

1: DualQ or FQ is needed to make L4S coexist with existing traffic, and

2: DualQ can be defeated by an adversary, destroying its ability to isolate L4S traffic.

I'll read your reply to Jake when it arrives.

 - Jonathan Morton


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-06-19 13:02         ` Luca Muscariello
@ 2019-07-04 11:54           ` Bob Briscoe
  2019-07-04 12:24             ` Jonathan Morton
  2019-07-05  9:48             ` Luca Muscariello
  0 siblings, 2 replies; 59+ messages in thread
From: Bob Briscoe @ 2019-07-04 11:54 UTC (permalink / raw)
  To: Luca Muscariello, Holland, Jake; +Cc: ecn-sane, tsvwg

[-- Attachment #1: Type: text/plain, Size: 13699 bytes --]

Luca,


On 19/06/2019 14:02, Luca Muscariello wrote:
> Jake,
>
> Yes, that is one scenario that I had in mind.
> Your response comforts me that my message was not totally unreadable.
>
> My understanding was
> - There are incentives to mark packets  if they get privileged 
> treatment because of that marking. This is similar to the diffserv 
> model with all the consequences in terms of trust.
[BB] I'm afraid this is a common misunderstanding. We have gone to great 
lengths to ensure that the coupled dualQ does not give any privilege, by 
separating out latency from throughput, so:

  * It solely isolates traffic that gives /itself/ low latency from
    traffic that doesn't.
  * It is very hard to get any throughput advantage from the mechanism,
    relative to a FIFO (see further down this email).

The phrase "relative to a FIFO" is important. In a FIFO, it is of course 
possible for flows to take more throughput than others. We see that as a 
feature of the Internet not a bug. But we accept that some might disagree...

So those that want equal flow rates can add per-flow bandwidth policing, 
e.g. AFD, to the coupled dualQ. But that should be (and now can be) a 
separate policy choice.

An important advance of the coupled dualQ is to cut latency without 
interfering with throughput.


> - Unresponsive traffic in particular (gaming, voice, video etc.) has 
> incentives to mark. Assuming there is x% of unresponsive traffic in 
> the priority queue, it is non-trivial to guess how the system works.
> - in particular it is easy to see the extreme cases,
>                (a) x is very small, assuming the system is stable, the 
> overall equilibrium will not change.
>                (b) x is very large so the dctcp like sources fall back 
> to cubic like and the systems behave almost like a single FIFO.
>                (c) in all other cases x varies according to the 
> unresponsive sources' rates.
>                     Several different equilibria may exist, some of 
> which may include oscillations. Including oscillations of all 
> fallback  mechanisms.
> The reason I'm asking is that these cases are not discussed in the I-D 
> documents or in the references, even though these are very common use cases.
[BB] This has all already been explained and discussed at length during 
various IETF meetings. I had an excellent student (Henrik Steen) act as 
a "red-team" guy. His challenge was: Can you contrive a mis-marking 
strategy with unresponsive traffic to cause any more harm than in a 
FIFO? We wanted to make sure that introducing a priority scheduler could 
not be exploited as a significant new attack vector.

Have you looked at his thesis - the [DualQ-Test 
<https://tools.ietf.org/html/draft-ietf-tsvwg-aqm-dualq-coupled-09#ref-DualQ-Test>] 
reference at the end of this subsection of the Security Considerations 
in the aqm-dualq-coupled draft:
4.1.3. Protecting against Unresponsive ECN-Capable Traffic 
<https://tools.ietf.org/html/draft-ietf-tsvwg-aqm-dualq-coupled-09#section-4.1.3> 
?
(we ruled evaluation results out of scope of this already over-long 
draft - instead giving references).

Firstly, when unresponsive traffic < link rate, counter-intuitively it 
doesn't matter which queue it classifies itself into. Any responsive 
traffic in either or both queues still shares out the remaining capacity 
as if the unresponsive traffic had subtracted from the overall capacity 
(like a FIFO).
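This first point can be illustrated with a toy model (my own sketch, not 
taken from the draft or the thesis): as long as the unresponsive traffic's 
rate is below the link rate, the responsive flows simply share whatever 
capacity is left over, the same as they would in a FIFO, regardless of 
which queue the unresponsive traffic classified itself into.

```python
def responsive_share(link_capacity, unresponsive_rate, n_responsive):
    """Toy model: long-running responsive flows converge to an equal
    share of whatever capacity the unresponsive traffic leaves over,
    as they would in a FIFO. Rates in any consistent unit (e.g. Mb/s)."""
    if unresponsive_rate >= link_capacity:
        return 0.0  # persistent overload: responsive flows are starved
    return (link_capacity - unresponsive_rate) / n_responsive

# 100 Mb/s link, 30 Mb/s of unresponsive traffic, 7 responsive flows:
# each responsive flow converges to (100 - 30) / 7 = 10 Mb/s,
# whichever queue the 30 Mb/s classified itself into.
per_flow = responsive_share(100.0, 30.0, 7)
```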

Beyond that, Henrik tested whether the persistent overload mechanism 
that switches off any distinction between the queues (code in the 
reference Linux implementation 
<https://github.com/L4STeam/sch_dualpi2_upstream/blob/master/net/sched/sch_dualpi2.c>, 
pseudocode and explanation in Appendix A.2 
<https://tools.ietf.org/html/draft-ietf-tsvwg-aqm-dualq-coupled-09#appendix-A.2>) 
left any room for mis-marked traffic to gain an advantage before the 
switch-over. There was a narrow region in which unresponsive traffic 
mismarked as ECN could strengthen its attack relative to the same attack 
on the Classic queue without mismarking.

I presented a one-slide summary of Henrik's experiment here in 2017 in 
IETF tcpm 
<https://datatracker.ietf.org/meeting/99/materials/slides-99-tcpm-ecn-adding-explicit-congestion-notification-ecn-to-tcp-control-packets-02#page=12>.
I tried to make the legends self-explanatory, as long as you work at them, 
but shout if you need them explained.
Each column of plots shows attack traffic at increasing fractions of the 
link rate; from 70% to 200%.

Try to spot the difference between the odd columns and the even columns 
- they're just a little different in the narrow window either side of 
100% - a sharp kink instead of a smooth kink.
I included log-scale plots of the bottom end of the range to magnify the 
difference.

Yes, the system oscillates around the switch-over point, but you can see 
from the tcpm slide that the oscillations are also there in the 3rd 
column (which emulates the same switch-over in a FIFO). So we haven't 
added a new problem.

In summary, the advantage of mismarking was small, and it was hard for 
the attacker not to trip the dualQ into its overload state, in which it 
applies the same drop level in either queue. And that was when the victim 
traffic was just a predictable long-running flow. With normal less 
predictable victim traffic, I cannot think how to get this attack to be 
effective.


> If we add the queue protection mechanism, all unresponsive flows that 
> are caught cheating are registered in a blacklist and always scheduled 
> in the non-priority queue.
[BB]
1/ Queue protection is an alternative to overload protection, not an 
addition.

  * The Linux implementation solely uses the overload mechanism, which
    is sufficient to prevent the priority scheduler amplifying a
    mismarking attack (whether ECN or DSCP).
  * The DOCSIS implementation uses per-flow queue protection instead.

2/ Aligned incentives

The coupled dualQ with just overload protection ensures incentives are 
aligned, so normal developers won't intentionally mismark traffic. 
As explained at the start of this email:

    the DualQ solely isolates traffic that gives /itself/ low latency
    from traffic that doesn't. Low latency solely depends on the
    traffic's own behaviour. Traffic doesn't /get/ anything from the low
    latency queue, so there's no point mismarking to get into it.

However, incentives only address rational behaviour, not accidents and 
malice. That's why DOCSIS operators asked for Q protection - to protect 
against something accidentally or deliberately introducing bursty or 
excessive traffic into the low latency queue.

The Linux code is sufficient under normal circumstances though. There 
are already other mechanisms that deal with the worms, trojans, etc. 
that might launch these attacks.

3/ DOCSIS Q protection does not black-list flows.

It redirects certain /packets/ from those flows with the highest queuing 
scores into the Classic queue, and only if those packets would otherwise 
risk exceeding a threshold delay for the low latency queue.

If a flow has a temporary wobble, some of its packets get redirected to 
protect the low latency queue, but if it gets back on track, then 
there's just no further packet redirection.
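The per-packet behaviour described above can be sketched roughly as follows. 
This is a deliberate simplification, not the actual Annex P algorithm: the 
constants, names and the scoring rule here are invented for illustration. 
The key properties it mimics are that the score ages out between packets of 
a well-paced flow, and that individual packets (never whole flows) are 
redirected, and only when the low latency queue's delay is at risk.

```python
from collections import defaultdict

# Invented constants, for illustration only.
SCORE_THRESHOLD = 100.0   # flows above this are candidates for redirection
DELAY_THRESHOLD_MS = 1.0  # LL-queue delay a packet must not risk exceeding
AGING_RATE = 50.0         # score units drained per second between packets

scores = defaultdict(float)  # per-flow queuing score
last_seen = {}               # per-flow time of previous packet (seconds)

def classify(flow_id, pkt_contribution_ms, predicted_ll_delay_ms, now_s):
    """Return 'LL' or 'Classic' for one packet of a flow."""
    # Age the score out between packets: for well-paced flows it stays
    # near zero, so their state effectively disappears between packets.
    if flow_id in last_seen:
        scores[flow_id] = max(0.0, scores[flow_id]
                              - AGING_RATE * (now_s - last_seen[flow_id]))
    last_seen[flow_id] = now_s
    scores[flow_id] += pkt_contribution_ms  # bursty flows build score fast

    # Redirect this packet only - no blacklist. If the flow gets back on
    # track, its score ages out and redirection simply stops.
    if (scores[flow_id] > SCORE_THRESHOLD
            and predicted_ll_delay_ms > DELAY_THRESHOLD_MS):
        return 'Classic'
    return 'LL'
```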

> If that happens, unresponsive flows will get a service quality that is 
> worse than if using a single FIFO for all flows.
4/ Slight punishment is a feature, not a bug

If an unresponsive flow is well-paced and not contributing to queuing, 
it will accumulate only a low queuing score, and experience no 
redirected packets.

If it is contributing to queuing and it is mismarking itself, then Q 
Prot will redirect some of its packets, and the continual reordering 
will (intentionally) give it worse service quality. This deliberate 
slight punishment gives developers a slight incentive to mark their 
flows correctly.

I could explain more about the queuing score (I think I already did for 
you on these lists), but it's all in Annex P of the DOCSIS spec 
<https://specification-search.cablelabs.com/CM-SP-MULPIv3.1>, and I'm 
trying to write a stand-alone document about it at the moment.


>
> Using a flow blacklist brings back the complexity that dualq is 
> supposed to remove compared to flow-isolation by flow-queueing.
> It seems to me that the blacklist is actually necessary to make dualq 
> work under the assumption that x is small,
[BB] As above, the Linux implementation works and aligns incentives 
without Q Prot, which is merely an optional additional protection 
against accidents and malice.

(and there's no flow black-list).


> because in the other cases the behavior
> of the dualq system is unspecified and likely subject to 
> instabilities, i.e. potentially different kind of oscillations.

I do find the tone of these emails rather disheartening. We've done all 
this work that we think is really cool. And all we get in return is 
criticism in an authoritative tone as if it is backed by experiments. 
But so far it is not. There seems to be a presumption that we are not 
professional and we are somehow not to be trusted to have done a sound job.

Yes, I'm sure mistakes can be found in our work. But it would be nice if 
the tone of these emails could become more constructive. Possibly even 
some praise. There seems to be a presumption of disrespect that I'm not 
used to, and I would rather it stopped.

Sorry for going silent recently - had too much backlog. I'm working my 
way backwards through this thread. Next I'll reply to Jake's email, 
which is, as always, perfectly constructive.

Cheers


Bob

> Luca
>
>
>
>
> On Tue, Jun 18, 2019 at 9:25 PM Holland, Jake <jholland@akamai.com 
> <mailto:jholland@akamai.com>> wrote:
>
>     Hi Bob and Luca,
>
>     Thank you both for this discussion, I think it helped crystallize a
>     comment I hadn't figured out how to make yet, but was bothering me.
>
>     I’m reading Luca’s question as asking about fixed-rate traffic
>     that does
>     something like a cutoff or downshift if loss gets bad enough for long
>     enough, but is otherwise unresponsive.
>
>     The dualq draft does discuss unresponsive traffic in 3 of the sub-
>     sections in section 4, but there's a point that seems sort of swept
>     aside without comment in the analysis to me.
>
>     The referenced paper[1] from that section does examine the question
>     of sharing a link with unresponsive traffic in some detail, but the
>     analysis seems to bake in an assumption that there's a fixed amount
>     of unresponsive traffic, when in fact for a lot of the real-life
>     scenarios for unresponsive traffic (games, voice, and some of the
>     video conferencing) there's some app-level backpressure, in that
>     when the quality of experience goes low enough, the user (or a qoe
>     trigger in the app) will often change the traffic demand at a higher
>     layer than a congestion controller (by shutting off video, for
>     instance).
>
>     The reason I mention it is because it seems like unresponsive
>     traffic has an incentive to mark L4S and get low latency.  It doesn't
>     hurt, since it's a fixed rate and not bandwidth-seeking, so it's
>     perfectly happy to massively underutilize the link. And until the
>     link gets overloaded it will no longer suffer delay when using the
>     low latency queue, whereas in the classic queue queuing delay provides
>     a noticeable degradation in the presence of competing traffic.
>
>     I didn't see anywhere in the paper that tried to check the quality
>     of experience for the UDP traffic as non-responsive traffic approached
>     saturation, except by inference that loss in the classic queue will
>     cause loss in the LL queue as well.
>
>     But letting unresponsive flows get away with pushing out more classic
>     traffic and removing the penalty that classic flows would give it
>     seems
>     like a risk that would result in more use of this kind of unresponsive
>     traffic marking itself for the LL queue, since it just would get lower
>     latency almost up until overload.
>
>     Many of the apps that send unresponsive traffic would benefit from low
>     latency and isolation from the classic traffic, so it seems a mistake
>     to claim there's no benefit, and it furthermore seems like there's
>     systematic pressures that would often push unresponsive apps into this
>     domain.
>
>     If that line of reasoning holds up, the "rather specific" phrase in
>     section 4.1.1 of the dualq draft might not turn out to be so specific
>     after all, and could be seen as downplaying the risks.
>
>     Best regards,
>     Jake
>
>     [1] https://riteproject.files.wordpress.com/2018/07/thesis-henrste.pdf
>
>     PS: This seems like a consequence of the lack of access control on
>     setting ECT(1), and maybe the queue protection function would address
>     it, so that's interesting to hear about.
>
>     But I thought the whole point of dualq over fq was that fq state
>     couldn't
>     scale properly in aggregating devices with enough expected flows
>     sharing
>     a queue?  If this protection feature turns out to be necessary,
>     would that
>     advantage be gone?  (Also: why would one want to turn this
>     protection off
>     if it's available?)
>
>
>
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/


[-- Attachment #2: Type: text/html, Size: 19085 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-06-19  4:24       ` Holland, Jake
@ 2019-06-19 13:02         ` Luca Muscariello
  2019-07-04 11:54           ` Bob Briscoe
  2019-07-04 13:45         ` Bob Briscoe
  1 sibling, 1 reply; 59+ messages in thread
From: Luca Muscariello @ 2019-06-19 13:02 UTC (permalink / raw)
  To: Holland, Jake; +Cc: Bob Briscoe, tsvwg, ecn-sane

[-- Attachment #1: Type: text/plain, Size: 5469 bytes --]

Jake,

Yes, that is one scenario that I had in mind.
Your response reassures me that my message was not totally unreadable.

My understanding was
- There are incentives to mark packets if they get privileged treatment
because of that marking. This is similar to the diffserv model with all the
consequences in terms of trust.
- Unresponsive traffic in particular (gaming, voice, video etc.) has
incentives to mark. Assuming there is x% of unresponsive traffic in the
priority queue, it is non trivial to guess how the system works.
- in particular it is easy to see the extreme cases,
               (a) x is very small, assuming the system is stable, the
overall equilibrium will not change.
               (b) x is very large so the dctcp like sources fall back to
cubic like and the systems behave almost like a single FIFO.
               (c) in all other cases x varies according to the
unresponsive sources' rates.
                    Several different equilibria may exist, some of which
may include oscillations. Including oscillations of all fallback
mechanisms.
The reason I'm asking is that these cases are not discussed in the I-D
documents or in the references, even though these are very common use cases.

If we add the queue protection mechanism, all unresponsive flows that are
caught cheating are registered in a blacklist and always scheduled in the
non-priority queue.
If that happens, unresponsive flows will get a service quality that is worse
than if using a single FIFO for all flows.

Using a flow blacklist brings back the complexity that dualq is supposed to
remove compared to flow-isolation by flow-queueing.
It seems to me that the blacklist is actually necessary to make dualq work
under the assumption that x is small, because in the other cases the
behavior
of the dualq system is unspecified and likely subject to instabilities,
i.e. potentially different kind of oscillations.

Luca




On Tue, Jun 18, 2019 at 9:25 PM Holland, Jake <jholland@akamai.com> wrote:

> Hi Bob and Luca,
>
> Thank you both for this discussion, I think it helped crystallize a
> comment I hadn't figured out how to make yet, but was bothering me.
>
> I’m reading Luca’s question as asking about fixed-rate traffic that does
> something like a cutoff or downshift if loss gets bad enough for long
> enough, but is otherwise unresponsive.
>
> The dualq draft does discuss unresponsive traffic in 3 of the sub-
> sections in section 4, but there's a point that seems sort of swept
> aside without comment in the analysis to me.
>
> The referenced paper[1] from that section does examine the question
> of sharing a link with unresponsive traffic in some detail, but the
> analysis seems to bake in an assumption that there's a fixed amount
> of unresponsive traffic, when in fact for a lot of the real-life
> scenarios for unresponsive traffic (games, voice, and some of the
> video conferencing) there's some app-level backpressure, in that
> when the quality of experience goes low enough, the user (or a qoe
> trigger in the app) will often change the traffic demand at a higher
> layer than a congestion controller (by shutting off video, for
> instance).
>
> The reason I mention it is because it seems like unresponsive
> traffic has an incentive to mark L4S and get low latency.  It doesn't
> hurt, since it's a fixed rate and not bandwidth-seeking, so it's
> perfectly happy to massively underutilize the link. And until the
> link gets overloaded it will no longer suffer delay when using the
> low latency queue, whereas in the classic queue queuing delay provides
> a noticeable degradation in the presence of competing traffic.
>
> I didn't see anywhere in the paper that tried to check the quality
> of experience for the UDP traffic as non-responsive traffic approached
> saturation, except by inference that loss in the classic queue will
> cause loss in the LL queue as well.
>
> But letting unresponsive flows get away with pushing out more classic
> traffic and removing the penalty that classic flows would give it seems
> like a risk that would result in more use of this kind of unresponsive
> traffic marking itself for the LL queue, since it just would get lower
> latency almost up until overload.
>
> Many of the apps that send unresponsive traffic would benefit from low
> latency and isolation from the classic traffic, so it seems a mistake
> to claim there's no benefit, and it furthermore seems like there's
> systematic pressures that would often push unresponsive apps into this
> domain.
>
> If that line of reasoning holds up, the "rather specific" phrase in
> section 4.1.1 of the dualq draft might not turn out to be so specific
> after all, and could be seen as downplaying the risks.
>
> Best regards,
> Jake
>
> [1] https://riteproject.files.wordpress.com/2018/07/thesis-henrste.pdf
>
> PS: This seems like a consequence of the lack of access control on
> setting ECT(1), and maybe the queue protection function would address
> it, so that's interesting to hear about.
>
> But I thought the whole point of dualq over fq was that fq state couldn't
> scale properly in aggregating devices with enough expected flows sharing
> a queue?  If this protection feature turns out to be necessary, would that
> advantage be gone?  (Also: why would one want to turn this protection off
> if it's available?)
>
>
>

[-- Attachment #2: Type: text/html, Size: 6373 bytes --]


* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-06-19  1:15     ` Bob Briscoe
  2019-06-19  1:33       ` Dave Taht
@ 2019-06-19  4:24       ` Holland, Jake
  2019-06-19 13:02         ` Luca Muscariello
  2019-07-04 13:45         ` Bob Briscoe
  1 sibling, 2 replies; 59+ messages in thread
From: Holland, Jake @ 2019-06-19  4:24 UTC (permalink / raw)
  To: Bob Briscoe, Luca Muscariello; +Cc: tsvwg, ecn-sane

Hi Bob and Luca,

Thank you both for this discussion, I think it helped crystallize a
comment I hadn't figured out how to make yet, but was bothering me.

I’m reading Luca’s question as asking about fixed-rate traffic that does
something like a cutoff or downshift if loss gets bad enough for long
enough, but is otherwise unresponsive.

The dualq draft does discuss unresponsive traffic in 3 of the sub-
sections in section 4, but there's a point that seems sort of swept
aside without comment in the analysis to me.

The referenced paper[1] from that section does examine the question
of sharing a link with unresponsive traffic in some detail, but the
analysis seems to bake in an assumption that there's a fixed amount
of unresponsive traffic, when in fact for a lot of the real-life
scenarios for unresponsive traffic (games, voice, and some of the
video conferencing) there's some app-level backpressure, in that
when the quality of experience goes low enough, the user (or a qoe
trigger in the app) will often change the traffic demand at a higher
layer than a congestion controller (by shutting off video, for
instance).

The reason I mention it is because it seems like unresponsive
traffic has an incentive to mark L4S and get low latency.  It doesn't
hurt, since it's a fixed rate and not bandwidth-seeking, so it's
perfectly happy to massively underutilize the link. And until the
link gets overloaded it will no longer suffer delay when using the
low latency queue, whereas in the classic queue queuing delay provides
a noticeable degradation in the presence of competing traffic.

I didn't see anywhere in the paper that tried to check the quality
of experience for the UDP traffic as non-responsive traffic approached
saturation, except by inference that loss in the classic queue will
cause loss in the LL queue as well.

But letting unresponsive flows get away with pushing out more classic
traffic and removing the penalty that classic flows would give it seems
like a risk that would result in more use of this kind of unresponsive
traffic marking itself for the LL queue, since it just would get lower
latency almost up until overload.

Many of the apps that send unresponsive traffic would benefit from low
latency and isolation from the classic traffic, so it seems a mistake
to claim there's no benefit, and it furthermore seems like there's
systematic pressures that would often push unresponsive apps into this
domain.

If that line of reasoning holds up, the "rather specific" phrase in
section 4.1.1 of the dualq draft might not turn out to be so specific
after all, and could be seen as downplaying the risks.

Best regards,
Jake

[1] https://riteproject.files.wordpress.com/2018/07/thesis-henrste.pdf

PS: This seems like a consequence of the lack of access control on
setting ECT(1), and maybe the queue protection function would address
it, so that's interesting to hear about.

But I thought the whole point of dualq over fq was that fq state couldn't
scale properly in aggregating devices with enough expected flows sharing
a queue?  If this protection feature turns out to be necessary, would that
advantage be gone?  (Also: why would one want to turn this protection off
if it's available?)




* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-06-19  1:15     ` Bob Briscoe
@ 2019-06-19  1:33       ` Dave Taht
  2019-06-19  4:24       ` Holland, Jake
  1 sibling, 0 replies; 59+ messages in thread
From: Dave Taht @ 2019-06-19  1:33 UTC (permalink / raw)
  To: Bob Briscoe; +Cc: Luca Muscariello, ECN-Sane, tsvwg

[-- Attachment #1: Type: text/plain, Size: 10166 bytes --]

I simply have one question. Is the code for the modified dctcp and dualpi
in the l4steam repos on github ready for independent testing?

On Tue, Jun 18, 2019, 6:15 PM Bob Briscoe <ietf@bobbriscoe.net> wrote:

> Luca,
>
> I'm still preparing a (long) reply to Jake's earlier (long) response. But
> I'll take time out to quickly clear this point up inline...
>
> On 14/06/2019 21:10, Luca Muscariello wrote:
>
>
> On Fri, Jun 7, 2019 at 8:10 PM Bob Briscoe <ietf@bobbriscoe.net> wrote:
>
>>
>>  I'm afraid there are not the same pressures to cause rapid roll-out at
>> all, cos it's flakey now, jam tomorrow. (Actually ECN-DualQ-SCE has a much
>> greater problem - complete starvation of SCE flows - but we'll come on to
>> that in Q4.)
>>
>> I want to say at this point, that I really appreciate all the effort
>> you've been putting in, trying to find common ground.
>>
>> In trying to find a compromise, you've taken the fire that is really
>> aimed at the inadequacy of underlying SCE protocol - for anything other
>> than FQ. If the primary SCE proponents had attempted to articulate a way to
>> use SCE in a single queue or a dual queue, as you have, that would have
>> taken my fire.
>>
>> But regardless, the queue-building from classic ECN-capable endpoints that
>> only get 1 congestion signal per RTT is what I understand as the main
>> downside of the tradeoff if we try to use ECN-capability as the dualq
>> classifier.  Does that match your understanding?
>>
>> This is indeed a major concern of mine (not as major as the starvation of
>> SCE explained under Q4, but we'll come to that).
>>
>> Fine-grained (DCTCP-like) and coarse-grained (Cubic-like) congestion
>> controls need to be isolated, but I don't see how, unless their packets are
>> tagged for separate queues. Without a specific fine/coarse identifier,
>> we're left with having to re-use other identifiers:
>>
>>    - You've tried to use ECN vs Not-ECN. But that still lumps two large
>>    incompatible groups (fine ECN and coarse ECN) together.
>>    - The only alternative that would serve this purpose is the flow
>>    identifier at layer-4, because it isolates everything from everything else.
>>    FQ is where SCE started, and that seems to be as far as it can go.
>>
>> Should we burn the last unicorn for a capability needed on
>> "carrier-scale" boxes, but which requires FQ to work? Perhaps yes if there
>> was no alternative. But there is: L4S.
>>
>>
> I have trouble understanding why all traffic ends up being classified as
> either Cubic-like or DCTCP-like.
> If we know that this is not true today I fail to understand why this
> should be the case in the future.
> It is also difficult to predict now how applications will change in the
> future in terms of the traffic mix they'll generate.
> I feel like we'd be moving towards more customized transport services with
> less predictable patterns.
>
> I do not see for instance much discussion about the presence of RTC
> traffic and how the dualQ system behaves when the
> input traffic does not respond as expected by the 2-types of sources
> assumed by dualQ.
>
> I'm sorry for using "Cubic-like" and "DCTCP-like", but I was trying
> (obviously unsuccessfully) to be clearer than using 'Classic' and
> 'Scalable'.
>
> "Classic" means traffic driven by congestion controls designed to coexist
> in the same queue with Reno (TCP-friendly), which necessarily makes it
> unscalable, as explained below.
>
> The definition of a scalable congestion control concerns the power b in
> the relationship between window, W and the fraction of congestion signals,
> p (ECN or drop) under stable conditions:
>     W = k / p^b
> where k is a constant (or in some cases a function of other parameters
> such as RTT).
>     If b >= 1 the CC is scalable.
>     If b < 1 it is not (i.e. Classic).
>
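[Editorial aside on the formula just quoted: the practical consequence of the 
exponent b can be seen by computing the number of congestion signals per 
round trip, p*W. This is a sketch derived directly from W = k / p^b with 
k = 1 for simplicity; the function name is mine.]

```python
def signals_per_rtt(W, b, k=1.0):
    """Given a congestion control obeying W = k / p^b, invert to get the
    congestion-signal fraction p for window W, then return p*W: the
    average number of congestion signals seen per round trip."""
    p = (k / W) ** (1.0 / b)
    return p * W

# Scalable CC (b = 1, DCTCP-like): p*W = k, invariant with flow rate,
# so control stays tight at any scale.
# Classic CC (b = 1/2, Reno-like): p*W = k**2 / W, which shrinks as the
# window grows - sawteeth become huge and infrequent at scale.
```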
> "Scalable" does not exclude RTC traffic. For instance the L4S variant of
> SCReAM that Ingemar just talked about is scalable ("DCTCP-like"), because
> it has b = 1.
>
> I used "Cubic-like" 'cos there's more Cubic than Reno on the current
> Internet. Over Internet paths with typical BDP, Cubic is always in its
> Reno-friendly mode, and therefore also just as unscalable as Reno, with b =
> 1/2 (inversely proportional to the square-root). Even in its proper Cubic
> mode on high BDP paths, Cubic is still unscalable with b = 0.75.
>
> As flow rate scales up, the increase-decrease sawteeth of unscalable CCs
> get very large and very infrequent, so the control becomes extremely slack
> during dynamics. Whereas the sawteeth of scalable CCs stay invariant and
> tiny at any scale, keeping control tight, queuing low and utilization high.
> See the example of Cubic & DCTCP at Slide 5 here:
> https://www.files.netdevconf.org/f/4ebdcdd6f94547ad8b77/?dl=1
>
> Also, there's a useful plot of when Cubic switches to Reno mode on the
> last slide.
>
>
> If my application is using simulcast or multi-stream techniques I can have
> several video streams in the same link,  that, as far as I understand,
> will get significant latency in the classic queue.
>
>
> You are talking as if you think that queuing delay is caused by the
> buffer. You haven't said what your RTC congestion control is (gcc
> perhaps?). Whatever, assuming it's TCP-friendly, even in a queue on its
> own, it will need to induce about 1 additional base RTT of queuing delay to
> maintain full utilization.
>
> In the coupled dualQ AQM, the classic queue runs a state-of-the-art
> classic AQM (PI2 in our implementation) with a target delay of 15ms. With
> any less, your classic congestion controlled streams would under-utilize
> the link.
>
> Unless my app starts cheating by marking packets to get into the priority
> queue.
>
> There's two misconceptions here about the DualQ Coupled AQM that I need to
> correct.
>
> 1/ As above, if a classic CC can't build ~1 base RTT of queue in the
> classic buffer, it badly underutiizes. So if you 'cheat' by directing
> traffic from a queue-building CC into the low latency queue with a shallow
> ECN threshold, you'll just massively under-utilize the capacity.
>
> 2/ Even if it were a strict priority scheduler it wouldn't determine the
> scheduling under all normal traffic conditions. The coupling between the
> AQMs dominates the scheduler. I'll explain next...
>
>
> In both cases, i.e. my RTC app is cheating or not, I do not understand how
> the parametrization of the dualQ scheduler
> can cope with traffic that behaves in a different way to what is assumed
> while tuning parameters.
> For instance, in one instantiation of dualQ based on WRR the weights are
> set to 1:16. This necessarily has to
> change when RTC traffic is present. How?
>
>
> The coupling simply applies congestion signals from the C queue across
> into the L queue, as if the C flows were L flows. So, the L flows leave
> sufficient space for however many C flows there are. Then, in all the gaps
> that the L traffic leaves, any work-conserving scheduler can be used to
> serve the C queue.
>
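[Editorial aside: the coupling just described can be sketched numerically, 
following the structure in Appendix A of the aqm-dualq-coupled draft - the 
base AQM outputs a probability p', the Classic queue applies p_C = p'^2, and 
the L queue is marked with the larger of its own native shallow-threshold 
probability and the coupled probability k*p'. The variable names below are 
mine; k = 2 is the draft's default coupling factor.]

```python
K = 2.0  # default coupling factor from the dualq-coupled draft

def dualq_probabilities(p_base, p_l_native):
    """Return (classic_prob, l4s_prob) given the base AQM output p_base
    and the L queue's own native marking probability p_l_native."""
    # Squaring counters the ~1/sqrt(p) response of Classic (Reno-friendly)
    # congestion controls, so C and L flows get similar throughput.
    p_classic = p_base ** 2
    # Congestion in the C queue is 'coupled' across to the L flows, so
    # they leave room for however many Classic flows there are.
    p_coupled = min(1.0, K * p_base)
    p_l4s = max(p_l_native, p_coupled)
    return p_classic, p_l4s
```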
> The WRR scheduler is only there in case of overload or unresponsive L
> traffic; to prevent the Classic queue starving.
>
>
>
> Is the assumption that a trusted marker is used as in typical diffserv
> deployments
> or that a policer identifies and punishes cheating applications?
>
> As explained, if a classic flow cheats, it will get very low throughput. So
> it has no incentive to cheat.
>
> There's still the possibility of bugs/accidents/malice. The need for
> general Internet flows to be responsive to congestion is also vulnerable to
> bugs/accidents/malice, but it hasn't needed policing.
>
> Nonetheless, in Low Latency DOCSIS, we have implemented a queue protection
> function that maintains a queuing score per flow. Then, any packets from
> high-scoring flows that would cause the queue to exceed a threshold delay,
> are redirected to the classic queue instead. For well-behaved flows the
> state that holds the score ages out between packets, so only ill-behaved
> flows hold flow-state long term.
>
> Queue protection might not be needed, but it's as well to have it in case.
> It can be disabled.
>
>
> BTW I'd love to understand how dualQ is supposed to work under more
> general traffic assumptions.
>
> Coexistence with Reno is a general requirement for long-running Internet
> traffic. That's really all we depend on. That also covers RTC flows in the
> C queue that average to similar throughput as Reno but react more smoothly.
>
> The L traffic can be similarly heterogeneous - part of the L4S experiment
> is to see how broad that will stretch to. It can certainly accommodate
> other lighter traffic like VoIP, DNS, flow startups, transactional, etc,
> etc.
>
>
> BBR (v1) is a good example of something different that wasn't designed to
> coexist with Reno. It sort-of avoided causing too many problems by being
> used for app-limited flows. It does its RTT probing on much longer
> timescales than typical sawtoothing congestion controls, running on a model
> of the link between times, so it doesn't fit the formulae above.
>
> For BBRv2 we're promised that the non-ECN side of it will coexist with
> existing Internet traffic, at least above a certain loss level. Without
> having seen it I can't be sure, but I assume that implies it will fit the
> formulae above in some way.
>
>
> PS. I believe all the above is explained in the three L4S Internet drafts,
> which we've taken a lot of trouble over. I don't really want to have to
> keep explaining it longhand in response to each email. So I'd prefer
> questions to be of the form "In section X of draft Y, I don't understand
> Z". Then I can devote my time to improving the drafts.
>
> Alternatively, there's useful papers of various lengths on the L4S landing
> page at:
> https://riteproject.eu/dctth/#papers
>
>
> Cheers
>
>
>
> Bob
>
>
>
> Luca
>
>
>
>
> --
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/
>
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane
>

[-- Attachment #2: Type: text/html, Size: 15430 bytes --]


* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-06-14 20:10   ` [Ecn-sane] [tsvwg] " Luca Muscariello
  2019-06-14 21:44     ` Dave Taht
@ 2019-06-19  1:15     ` Bob Briscoe
  2019-06-19  1:33       ` Dave Taht
  2019-06-19  4:24       ` Holland, Jake
  1 sibling, 2 replies; 59+ messages in thread
From: Bob Briscoe @ 2019-06-19  1:15 UTC (permalink / raw)
  To: Luca Muscariello; +Cc: Holland, Jake, tsvwg, ecn-sane

[-- Attachment #1: Type: text/plain, Size: 9763 bytes --]

Luca,

I'm still preparing a (long) reply to Jake's earlier (long) response. 
But I'll take time out to quickly clear this point up inline...

On 14/06/2019 21:10, Luca Muscariello wrote:
>
> On Fri, Jun 7, 2019 at 8:10 PM Bob Briscoe <ietf@bobbriscoe.net 
> <mailto:ietf@bobbriscoe.net>> wrote:
>
>
>      I'm afraid there are not the same pressures to cause rapid
>     roll-out at all, cos it's flakey now, jam tomorrow. (Actually
>     ECN-DualQ-SCE has a much greater problem - complete starvation of
>     SCE flows - but we'll come on to that in Q4.)
>
>     I want to say at this point, that I really appreciate all the
>     effort you've been putting in, trying to find common ground.
>
>     In trying to find a compromise, you've taken the fire that is
>     really aimed at the inadequacy of underlying SCE protocol - for
>     anything other than FQ. If the primary SCE proponents had
>     attempted to articulate a way to use SCE in a single queue or a
>     dual queue, as you have, that would have taken my fire.
>
>>     But regardless, the queue-building from classic ECN-capable endpoints that
>>     only get 1 congestion signal per RTT is what I understand as the main
>>     downside of the tradeoff if we try to use ECN-capability as the dualq
>>     classifier.  Does that match your understanding?
>     This is indeed a major concern of mine (not as major as the
>     starvation of SCE explained under Q4, but we'll come to that).
>
>     Fine-grained (DCTCP-like) and coarse-grained (Cubic-like)
>     congestion controls need to be isolated, but I don't see how,
>     unless their packets are tagged for separate queues. Without a
>     specific fine/coarse identifier, we're left with having to re-use
>     other identifiers:
>
>       * You've tried to use ECN vs Not-ECN. But that still lumps two
>         large incompatible groups (fine ECN and coarse ECN) together.
>       * The only alternative that would serve this purpose is the flow
>         identifier at layer-4, because it isolates everything from
>         everything else. FQ is where SCE started, and that seems to be
>         as far as it can go.
>
>     Should we burn the last unicorn for a capability needed on
>     "carrier-scale" boxes, but which requires FQ to work? Perhaps yes
>     if there was no alternative. But there is: L4S.
>
>
> I have trouble understanding why all traffic ends up being 
> classified as either Cubic-like or DCTCP-like.
> If we know that this is not true today, I fail to understand why this 
> should be the case in the future.
> It is also difficult to predict now how applications will change in 
> the future in terms of the traffic mix they generate.
> I feel we'd be moving towards more customized transport services 
> with less predictable patterns.
>
> I also do not see much discussion of the presence of RTC 
> traffic, or of how the dualQ system behaves when the
> input traffic does not respond in the way expected of the two types 
> of sources assumed by dualQ.
I'm sorry for using "Cubic-like" and "DCTCP-like", but I was trying 
(obviously unsuccessfully) to be clearer than using 'Classic' and 
'Scalable'.

"Classic" means traffic driven by congestion controls designed to 
coexist in the same queue with Reno (TCP-friendly), which necessarily 
makes it unscalable, as explained below.

The definition of a scalable congestion control concerns the power b in 
the relationship between window, W and the fraction of congestion 
signals, p (ECN or drop) under stable conditions:
     W = k / p^b
where k is a constant (or in some cases a function of other parameters 
such as RTT).
     If b >= 1 the CC is scalable.
     If b < 1 it is not (i.e. Classic).

"Scalable" does not exclude RTC traffic. For instance the L4S variant of 
SCReAM that Ingemar just talked about is scalable ("DCTCP-like"), 
because it has b = 1.
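To make the distinction concrete, here is a rough numerical check (a sketch; the k values are illustrative, not the real protocol constants): with W = k / p^b, the congestion signals seen per round trip are p * W = k * p^(1-b), which stays constant for b = 1 but dwindles for Reno's b = 1/2 as rates scale up.

```python
def signals_per_rtt(p, k, b):
    """Congestion signals seen per round trip by a CC obeying W = k / p**b."""
    w = k / p ** b       # steady-state window, in packets
    return p * w         # = k * p**(1 - b)

# Reno-like (b = 1/2): as marking probability p falls while rate scales up,
# signals per RTT dwindle, so the control gets slack.
assert signals_per_rtt(1e-2, 1.22, 0.5) > 10 * signals_per_rtt(1e-6, 1.22, 0.5)

# DCTCP-like (b = 1): signals per RTT stay invariant at any scale.
assert abs(signals_per_rtt(1e-2, 2.0, 1.0) - signals_per_rtt(1e-6, 2.0, 1.0)) < 1e-9
```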

I used "Cubic-like" 'cos there's more Cubic than Reno on the current 
Internet. Over Internet paths with typical BDP, Cubic is always in its 
Reno-friendly mode, and therefore also just as unscalable as Reno, with 
b = 1/2 (inversely proportional to the square-root). Even in its proper 
Cubic mode on high BDP paths, Cubic is still unscalable with b = 0.75.

As flow rate scales up, the increase-decrease sawteeth of unscalable CCs 
get very large and very infrequent, so the control becomes extremely 
slack during dynamics. Whereas the sawteeth of scalable CCs stay 
invariant and tiny at any scale, keeping control tight, queuing low and 
utilization high. See the example of Cubic & DCTCP at Slide 5 here:
https://www.files.netdevconf.org/f/4ebdcdd6f94547ad8b77/?dl=1

Also, there's a useful plot of when Cubic switches to Reno mode on the 
last slide.
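As a back-of-envelope illustration of how slack an unscalable CC becomes at scale (1500 B packets assumed; a sketch, not a simulation): a Reno-like flow needs roughly W/2 round trips to regrow its window after a halving, and W grows with the flow rate.

```python
def reno_recovery_rtts(rate_pps, rtt_s):
    """Round trips a Reno-like CC needs to regrow its window after a
    halving: roughly W/2, where W = rate * RTT is the window in packets."""
    w = rate_pps * rtt_s
    return w / 2

# 10 Mb/s with 1500 B packets (~833 pkt/s) at 20 ms RTT: ~8 RTTs to recover.
# Scale the same path to 10 Gb/s (~833,000 pkt/s): ~8,300 RTTs (~2.8 min),
# during which the control is extremely slack.
assert reno_recovery_rtts(833, 0.02) < 10
assert reno_recovery_rtts(833_000, 0.02) > 5_000
```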

>
> If my application uses simulcast or multi-stream techniques, I can 
> have several video streams on the same link that, as far as I 
> understand,
> will get significant latency in the classic queue.

You are talking as if you think that queuing delay is caused by the 
buffer. You haven't said what your RTC congestion control is (gcc 
perhaps?). Whatever, assuming it's TCP-friendly, even in a queue on its 
own, it will need to induce about 1 additional base RTT of queuing delay 
to maintain full utilization.
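A quick sanity check of the ~1 base RTT figure, using the classic rule of thumb that a Reno-style sawtooth halves its window on congestion (a sketch, measuring everything in units of the BDP):

```python
def utilization_after_halving(buffer_rtts):
    """Fraction of the link still used at the Reno sawtooth trough,
    assuming the peak window = BDP + buffer and the window then halves.
    Everything is in units of the BDP (= 1 base RTT of in-flight data)."""
    peak = 1.0 + buffer_rtts     # window at the top of the sawtooth
    trough = peak / 2.0          # window right after a halving
    return min(1.0, trough)      # < 1.0 means the link goes idle

# ~1 base RTT of queue at the peak keeps the link full through the trough...
assert utilization_after_halving(1.0) == 1.0
# ...while a shallow buffer (e.g. a shallow ECN threshold) under-utilizes.
assert utilization_after_halving(0.2) < 1.0
```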

In the coupled dualQ AQM, the classic queue runs a state-of-the-art 
classic AQM (PI2 in our implementation) with a target delay of 15ms. 
With any less, your classic congestion controlled streams would 
under-utilize the link.

> Unless my app starts cheating by marking packets to get into the 
> priority queue.
There's two misconceptions here about the DualQ Coupled AQM that I need 
to correct.

1/ As above, if a classic CC can't build ~1 base RTT of queue in the 
classic buffer, it badly under-utilizes. So if you 'cheat' by directing 
traffic from a queue-building CC into the low latency queue with a 
shallow ECN threshold, you'll just massively under-utilize the capacity.

2/ Even if it were a strict priority scheduler it wouldn't determine the 
scheduling under all normal traffic conditions. The coupling between the 
AQMs dominates the scheduler. I'll explain next...

>
> In both cases, i.e. whether my RTC app is cheating or not, I do not 
> understand how the parametrization of the dualQ scheduler
> can cope with traffic that behaves differently from what was 
> assumed while tuning the parameters.
> For instance, in one instantiation of dualQ based on WRR, the weights 
> are set to 1:16.  This necessarily has to
> change when RTC traffic is present. How?

The coupling simply applies congestion signals from the C queue across 
into the L queue, as if the C flows were L flows. So, the L flows leave 
sufficient space for however many C flows there are. Then, in all the 
gaps that the L traffic leaves, any work-conserving scheduler can be 
used to serve the C queue.
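A minimal sketch of that coupling, as I read the DualQ Coupled AQM draft (the base AQM output p' is squared for the Classic queue and multiplied by a coupling factor for the L queue; the constant is the draft's recommended default, but treat the code as illustrative):

```python
import math

K = 2.0  # coupling factor; the draft's recommended default

def coupled_probs(p_prime):
    """Base AQM output p' drives both queues: the Classic drop/mark
    probability is p' squared, the L4S marking is linearly coupled."""
    p_c = p_prime ** 2              # Classic (C) queue
    p_cl = min(1.0, K * p_prime)    # coupled L4S (L) marking
    return p_c, p_cl

# Squared vs linear means p_cl = K * sqrt(p_c): the L queue receives
# congestion signals "as if" its flows were Classic flows, which is what
# makes the L flows leave sufficient space for the C flows.
p_c, p_cl = coupled_probs(0.05)
assert math.isclose(p_cl, K * math.sqrt(p_c))
```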

The WRR scheduler is only there in case of overload or unresponsive L 
traffic; to prevent the Classic queue starving.


>
> Is the assumption that a trusted marker is used as in typical diffserv 
> deployments
> or that a policer identifies and punishes cheating applications?
As explained, if a classic flow cheats, it will get very low throughput. 
So it has no incentive to cheat.

There's still the possibility of bugs/accidents/malice. The need for 
general Internet flows to be responsive to congestion is also vulnerable 
to bugs/accidents/malice, but it hasn't needed policing.

Nonetheless, in Low Latency DOCSIS, we have implemented a queue 
protection function that maintains a queuing score per flow. Then, any 
packets from high-scoring flows that would cause the queue to exceed a 
threshold delay, are redirected to the classic queue instead. For 
well-behaved flows the state that holds the score ages out between 
packets, so only ill-behaved flows hold flow-state long term.

Queue protection might not be needed, but it's as well to have it in 
case. It can be disabled.
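A hypothetical sketch of that per-flow scoring idea (names, structure and constants are illustrative, not the Low Latency DOCSIS specification):

```python
# Hypothetical per-flow queuing score: charged for queuing delay caused,
# aged down between packets so well-behaved flows' state expires.
AGE_RATE = 1000.0    # score units forgiven per second (illustrative)
THRESHOLD = 2000.0   # score above which packets are redirected (illustrative)

class FlowScore:
    def __init__(self):
        self.score = 0.0
        self.last = 0.0

    def on_packet(self, now, queue_delay_us):
        # Age the score down for the time elapsed since the last packet...
        self.score = max(0.0, self.score - AGE_RATE * (now - self.last))
        self.last = now
        # ...then charge this packet for the queuing it contributes.
        self.score += queue_delay_us
        # True -> redirect this packet to the Classic queue.
        return self.score > THRESHOLD

f = FlowScore()
assert f.on_packet(0.000, 500) is False   # well-paced: under threshold
assert f.on_packet(0.001, 5000) is True   # bursty: score accumulates
```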

>
> BTW I'd love to understand how dualQ is supposed to work under more 
> general traffic assumptions.
Coexistence with Reno is a general requirement for long-running Internet 
traffic. That's really all we depend on. That also covers RTC flows in 
the C queue that average to similar throughput as Reno but react more 
smoothly.

The L traffic can be similarly heterogeneous - part of the L4S 
experiment is to see how broad that will stretch to. It can certainly 
accommodate other lighter traffic like VoIP, DNS, flow startups, 
transactional, etc, etc.


BBR (v1) is a good example of something different that wasn't designed 
to coexist with Reno. It sort-of avoided too many problems by being 
primarily used for app-limited flows. It does its RTT probing on much 
longer timescales than typical sawtoothing congestion controls, running 
on a model of the link between times, so it doesn't fit the formulae above.

For BBRv2 we're promised that the non-ECN side of it will coexist with 
existing Internet traffic, at least above a certain loss level. Without 
having seen it I can't be sure, but I assume that implies it will fit 
the formulae above in some way.


PS. I believe all the above is explained in the three L4S Internet 
drafts, which we've taken a lot of trouble over. I don't really want to 
have to keep explaining it longhand in response to each email. So I'd 
prefer questions to be of the form "In section X of draft Y, I don't 
understand Z". Then I can devote my time to improving the drafts.

Alternatively, there's useful papers of various lengths on the L4S 
landing page at:
https://riteproject.eu/dctth/#papers


Cheers



Bob


>
> Luca
>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/


[-- Attachment #2: Type: text/html, Size: 14535 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-06-14 20:10   ` [Ecn-sane] [tsvwg] " Luca Muscariello
@ 2019-06-14 21:44     ` Dave Taht
  2019-06-19  1:15     ` Bob Briscoe
  1 sibling, 0 replies; 59+ messages in thread
From: Dave Taht @ 2019-06-14 21:44 UTC (permalink / raw)
  To: Luca Muscariello; +Cc: Bob Briscoe, ecn-sane, tsvwg


This thread's use of unconventional quoting markers makes it hard to follow.

Luca Muscariello <muscariello@ieee.org> writes:

> On Fri, Jun 7, 2019 at 8:10 PM Bob Briscoe <ietf@bobbriscoe.net>
> wrote:
>
>     
>         
>
>     I'm afraid there are not the same pressures to cause rapid
>     roll-out at all, cos it's flakey now, jam tomorrow. (Actually
>     ECN-DualQ-SCE has a much greater problem - complete starvation of
>     SCE flows - but we'll come on to that in Q4.)

Answering that statement is the only reason why I popped up here.
more below.

>     I want to say at this point, that I really appreciate all the
>     effort you've been putting in, trying to find common ground. 

I am happy to see this thread happen also, and I do plan to
stay out of it.

>     
>     In trying to find a compromise, you've taken the fire that is
>     really aimed at the inadequacy of underlying SCE protocol - for
>     anything other than FQ.

The SCE idea does, indeed, work best with FQ in a world of widely
varying congestion control ideas, as explored in the recent paper
"Fifty Shades of Congestion Control":

https://arxiv.org/pdf/1903.03852.pdf

>     If the primary SCE proponents had
>     attempted to articulate a way to use SCE in a single queue or a
>     dual queue, as you have, that would have taken my fire. 

I have no faith in single or dual queues with ECN either, due to
how anyone can scribble on the relevant bits, however...

>     
>         
>         But regardless, the queue-building from classic ECN-capable endpoints that
> only get 1 congestion signal per RTT is what I understand as the main
> downside of the tradeoff if we try to use ECN-capability as the dualq
> classifier.  Does that match your understanding?
>
>     This is indeed a major concern of mine (not as major as the
>     starvation of SCE explained under Q4, but we'll come to that).

I think I missed a portion of this thread. Starvation is impossible:
you are reduced to no less than cwnd 2 (non-BBR) or cwnd 4 (BBR).

Your own work points out a general problem: with too many flows you
need sub-packet windows, and the excessive CE marking that results
remains, so far as I know, an unsolved problem.

https://arxiv.org/pdf/1904.07598.pdf
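A rough back-of-envelope for where sub-packet windows start to bite (1500 B packets assumed; the constants and the helper are illustrative, not from the paper):

```python
MSS_BITS = 1500 * 8  # illustrative full-size packet

def flows_before_subpacket_window(link_bps, rtt_s, min_cwnd=2):
    """How many competing flows fit on a link before the fair per-flow
    cwnd would have to fall below min_cwnd packets -- the floor below
    which TCP can't reduce, so the AQM can only mark/drop ever harder."""
    bdp_packets = link_bps * rtt_s / MSS_BITS
    return int(bdp_packets / min_cwnd)

# At 1 Mb/s and 20 ms RTT the BDP is under 2 packets, so even one flow
# is pinned at the cwnd floor -- hence disabling ECN at low bitrates.
assert flows_before_subpacket_window(1_000_000, 0.02) <= 2
# At 100 Mb/s the same path accommodates dozens of flows comfortably.
assert flows_before_subpacket_window(100_000_000, 0.02) > 50
```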

This is easily demonstrated via experiment, also, and the primary reason
why, even with FQ_codel in the field, we generally have turned off ecn
support at low bitrates until the first major release of sch_cake.

I had an open question outstanding about the 10% figure sch_pie uses
for switching from marking to drop, which remains unresolved.

As for what level of compatibility with classic transports is possible
in a single queue with an SCE-capable receiver and sender, that
remains to be seen. Only the bits have been defined as yet. Two
approaches are being tried in public so far.

>     
>     Fine-grained (DCTCP-like) and coarse-grained (Cubic-like)
>     congestion controls need to be isolated, but I don't see how,
>     unless their packets are tagged for separate queues. Without a
>     specific fine/coarse identifier, we're left with having to re-use
>     other identifiers:
>     
>     * You've tried to use ECN vs Not-ECN. But that still lumps two
>       large incompatible groups (fine ECN and coarse ECN) together. 
>     * The only alternative that would serve this purpose is the flow
>       identifier at layer-4, because it isolates everything from
>       everything else. FQ is where SCE started, and that seems to be
>       as far as it can go.

Actually, I was seeking a solution (and had been, for going on 5 years)
to the "too many flows not getting out of slow start fast enough"
problem, which you can see at any congested airport, public space,
small office, or coffeeshop nowadays. The vast majority of traffic
there does not consist of long-duration, high-rate flows.

Even if you eliminate the wireless retries and rate changes and put in a
good fq_codel aqm, the traffic in such a large shared environment is
mostly flows lacking a need for congestion control at all (dns, voip,
etc), or in slow start, hammering away at ever increasing delays in
those environments until the user stops hitting the reload button.

Others have different goals and outlooks in this project and I'm
not really part of that.

I would rather like to see both approaches tried in an environment
that had a normal mix of traffic in a shared environment like that.

Some good potential solutions include reducing the slower bits of the
internet back to IW4 and/or using things like initial spreading, both of
which are good ideas and interact well with SCE's more immediate
response curve, paced chirping also.

>
>     Should we burn the last unicorn for a capability needed on
>     "carrier-scale" boxes, but which requires FQ to work? Perhaps yes
>     if there was no alternative. But there is: L4S.

The core of the internet is simply overprovisioned, with fairly short
queues. DCTCP itself did not deploy in very many places that I know of.

could you define exactly what carrier scale means?

>     
>     
>
> I have trouble understanding why all traffic ends up being
> classified as either Cubic-like or DCTCP-like. 
> If we know that this is not true today, I fail to understand why this
> should be the case in the future. 
> It is also difficult to predict now how applications will change in
> the future in terms of the traffic mix they generate.
> I feel we'd be moving towards more customized transport services
> with less predictable patterns.
>
> I also do not see much discussion of the presence of RTC
> traffic, or of how the dualQ system behaves when the 
> input traffic does not respond in the way expected of the two types
> of sources assumed by dualQ.
>
> If my application uses simulcast or multi-stream techniques, I can
> have several video streams on the same link that, as far as I
> understand,
> will get significant latency in the classic queue. Unless my app
> starts cheating by marking packets to get into the priority queue.
>
> In both cases, i.e. whether my RTC app is cheating or not, I do not
> understand how the parametrization of the dualQ scheduler 
> can cope with traffic that behaves differently from what was
> assumed while tuning the parameters. 
> For instance, in one instantiation of dualQ based on WRR, the weights
> are set to 1:16. This necessarily has to 
> change when RTC traffic is present. How?
>
> Is the assumption that a trusted marker is used as in typical diffserv
> deployments
> or that a policer identifies and punishes cheating applications?
>
> BTW I'd love to understand how dualQ is supposed to work under more
> general traffic assumptions.
>
> Luca

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
  2019-06-07 18:07 ` Bob Briscoe
@ 2019-06-14 20:10   ` Luca Muscariello
  2019-06-14 21:44     ` Dave Taht
  2019-06-19  1:15     ` Bob Briscoe
  0 siblings, 2 replies; 59+ messages in thread
From: Luca Muscariello @ 2019-06-14 20:10 UTC (permalink / raw)
  To: Bob Briscoe; +Cc: Holland, Jake, tsvwg, ecn-sane

[-- Attachment #1: Type: text/plain, Size: 3533 bytes --]

On Fri, Jun 7, 2019 at 8:10 PM Bob Briscoe <ietf@bobbriscoe.net> wrote:

>
>  I'm afraid there are not the same pressures to cause rapid roll-out at
> all, cos it's flakey now, jam tomorrow. (Actually ECN-DualQ-SCE has a much
> greater problem - complete starvation of SCE flows - but we'll come on to
> that in Q4.)
>
> I want to say at this point, that I really appreciate all the effort
> you've been putting in, trying to find common ground.
>
> In trying to find a compromise, you've taken the fire that is really aimed
> at the inadequacy of underlying SCE protocol - for anything other than FQ.
> If the primary SCE proponents had attempted to articulate a way to use SCE
> in a single queue or a dual queue, as you have, that would have taken my
> fire.
>
> But regardless, the queue-building from classic ECN-capable endpoints that
> only get 1 congestion signal per RTT is what I understand as the main
> downside of the tradeoff if we try to use ECN-capability as the dualq
> classifier.  Does that match your understanding?
>
> This is indeed a major concern of mine (not as major as the starvation of
> SCE explained under Q4, but we'll come to that).
>
> Fine-grained (DCTCP-like) and coarse-grained (Cubic-like) congestion
> controls need to be isolated, but I don't see how, unless their packets are
> tagged for separate queues. Without a specific fine/coarse identifier,
> we're left with having to re-use other identifiers:
>
>    - You've tried to use ECN vs Not-ECN. But that still lumps two large
>    incompatible groups (fine ECN and coarse ECN) together.
>    - The only alternative that would serve this purpose is the flow
>    identifier at layer-4, because it isolates everything from everything else.
>    FQ is where SCE started, and that seems to be as far as it can go.
>
> Should we burn the last unicorn for a capability needed on "carrier-scale"
> boxes, but which requires FQ to work? Perhaps yes if there was no
> alternative. But there is: L4S.
>
>
I have trouble understanding why all traffic ends up being classified as
either Cubic-like or DCTCP-like.
If we know that this is not true today, I fail to understand why this
should be the case in the future.
It is also difficult to predict now how applications will change in the
future in terms of the traffic mix they generate.
I feel we'd be moving towards more customized transport services with
less predictable patterns.

I also do not see much discussion of the presence of RTC traffic, or of
how the dualQ system behaves when the
input traffic does not respond in the way expected of the two types of
sources assumed by dualQ.

If my application uses simulcast or multi-stream techniques, I can have
several video streams on the same link that, as far as I understand,
will get significant latency in the classic queue. Unless my app starts
cheating by marking packets to get into the priority queue.

In both cases, i.e. whether my RTC app is cheating or not, I do not
understand how the parametrization of the dualQ scheduler
can cope with traffic that behaves differently from what was assumed
while tuning the parameters.
For instance, in one instantiation of dualQ based on WRR, the weights
are set to 1:16.  This necessarily has to
change when RTC traffic is present. How?

Is the assumption that a trusted marker is used as in typical diffserv
deployments
or that a policer identifies and punishes cheating applications?

BTW I'd love to understand how dualQ is supposed to work under more general
traffic assumptions.

Luca

[-- Attachment #2: Type: text/html, Size: 4610 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2019-07-26 13:10 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <HE1PR07MB4425603844DED8D36AC21B67C2110@HE1PR07MB4425.eurprd07.prod.outlook.com>
2019-06-14 18:27 ` [Ecn-sane] [tsvwg] Comments on L4S drafts Holland, Jake
     [not found]   ` <HE1PR07MB4425E0997EE8ADCAE2D4C564C2E80@HE1PR07MB4425.eurprd07.prod.outlook.com>
2019-06-19 12:59     ` Bob Briscoe
2019-06-05  0:01 [Ecn-sane] " Holland, Jake
2019-06-07 18:07 ` Bob Briscoe
2019-06-14 20:10   ` [Ecn-sane] [tsvwg] " Luca Muscariello
2019-06-14 21:44     ` Dave Taht
2019-06-19  1:15     ` Bob Briscoe
2019-06-19  1:33       ` Dave Taht
2019-06-19  4:24       ` Holland, Jake
2019-06-19 13:02         ` Luca Muscariello
2019-07-04 11:54           ` Bob Briscoe
2019-07-04 12:24             ` Jonathan Morton
2019-07-04 13:43               ` De Schepper, Koen (Nokia - BE/Antwerp)
2019-07-04 14:03                 ` Jonathan Morton
2019-07-04 17:54                   ` Bob Briscoe
2019-07-05  8:26                     ` Jonathan Morton
2019-07-05  6:46                   ` De Schepper, Koen (Nokia - BE/Antwerp)
2019-07-05  8:51                     ` Jonathan Morton
2019-07-08 10:26                       ` De Schepper, Koen (Nokia - BE/Antwerp)
2019-07-08 20:55                         ` Holland, Jake
2019-07-10  0:10                           ` Jonathan Morton
2019-07-10  9:00                           ` De Schepper, Koen (Nokia - BE/Antwerp)
2019-07-10 13:14                             ` Dave Taht
2019-07-10 17:32                               ` De Schepper, Koen (Nokia - BE/Antwerp)
2019-07-17 22:40                             ` Sebastian Moeller
2019-07-19  9:06                               ` De Schepper, Koen (Nokia - BE/Antwerp)
2019-07-19 15:37                                 ` Dave Taht
2019-07-19 18:33                                   ` Wesley Eddy
2019-07-19 20:03                                     ` Dave Taht
2019-07-19 22:09                                       ` Wesley Eddy
2019-07-19 23:42                                         ` Dave Taht
2019-07-24 16:21                                           ` Dave Taht
2019-07-19 20:06                                     ` Black, David
2019-07-19 20:44                                       ` Jonathan Morton
2019-07-19 22:03                                         ` Sebastian Moeller
2019-07-20 21:02                                           ` Dave Taht
2019-07-21 11:53                                           ` Bob Briscoe
2019-07-21 15:33                                             ` Sebastian Moeller
2019-07-21 16:00                                             ` Jonathan Morton
2019-07-21 16:12                                               ` Sebastian Moeller
2019-07-22 18:15                                               ` De Schepper, Koen (Nokia - BE/Antwerp)
2019-07-22 18:33                                                 ` Dave Taht
2019-07-22 19:48                                                 ` Pete Heist
2019-07-25 16:14                                                   ` De Schepper, Koen (Nokia - BE/Antwerp)
2019-07-26 13:10                                                     ` Pete Heist
2019-07-23 10:33                                                 ` Sebastian Moeller
2019-07-21 12:30                                       ` Bob Briscoe
2019-07-21 16:08                                         ` Sebastian Moeller
2019-07-21 19:14                                           ` Bob Briscoe
2019-07-21 20:48                                             ` Sebastian Moeller
2019-07-25 20:51                                               ` Bob Briscoe
2019-07-25 21:17                                                 ` Bob Briscoe
2019-07-25 22:00                                                   ` Sebastian Moeller
     [not found]                                         ` <5D34803D.50501@erg.abdn.ac.uk>
2019-07-21 16:43                                           ` Black, David
2019-07-21 12:30                                       ` Scharf, Michael
2019-07-19 21:49                                     ` Sebastian Moeller
2019-07-22 16:28                                   ` Bless, Roland (TM)
2019-07-19 17:59                                 ` Sebastian Moeller
2019-07-05  9:48             ` Luca Muscariello
2019-07-04 13:45         ` Bob Briscoe
2019-07-10 17:03           ` Holland, Jake

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox