[Ecn-sane] sce materials from ietf

Discussion of explicit congestion notification's impact on the Internet

* [Ecn-sane] sce materials from ietf
@ 2019-11-29 20:08 Dave Taht
  2019-11-29 21:20 ` [Ecn-sane] [Bloat] " Jonathan Morton
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Dave Taht @ 2019-11-29 20:08 UTC (permalink / raw)
  To: ECN-Sane, bloat

there are no minutes posted.

https://datatracker.ietf.org/meeting/106/materials/slides-106-tsvwg-sessa-81-some-congestion-experienced-00

https://datatracker.ietf.org/meeting/106/materials/slides-106-tcpm-some-congestion-experienced-in-tcp-00
-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

* Re: [Ecn-sane] [Bloat] sce materials from ietf
  2019-11-29 20:08 [Ecn-sane] sce materials from ietf Dave Taht
@ 2019-11-29 21:20 ` Jonathan Morton
  2019-11-29 23:10   ` alex.burr
  2019-11-29 22:55 ` [Ecn-sane] " Rodney W. Grimes
  2019-11-30  2:37 ` Dave Taht
  2 siblings, 1 reply; 27+ messages in thread
From: Jonathan Morton @ 2019-11-29 21:20 UTC (permalink / raw)
  To: Dave Taht; +Cc: ECN-Sane, bloat

> On 29 Nov, 2019, at 10:08 pm, Dave Taht <dave.taht@gmail.com> wrote:
> 
> there are no minutes posted.
> 
> https://datatracker.ietf.org/meeting/106/materials/slides-106-tsvwg-sessa-81-some-congestion-experienced-00
> 
> https://datatracker.ietf.org/meeting/106/materials/slides-106-tcpm-some-congestion-experienced-in-tcp-00

Those should both be the same slide deck.  We got squeezed out of over half our expected talk time in TSVWG, and didn't get to present at all in TCPM (though we were able to comment from the floor on related AccECN matters) - we were just expecting to focus on one or two slides from the deck there.

I think we got two big takeaways from IETF-106:

First, that L4S is really floundering in their fundamental problems and has not yet been able to demonstrate any genuine solutions to them, and instead they're trying to bog us down in process.  However, there is growing interest in SCE, despite the relatively small footprint we still have officially.

Second, I gained a couple of key insights that I think will help to solve SCE's remaining shortcomings.  If we can apply them successfully by Vancouver, we'll be able to stand up and say not only that SCE meets *all* of the Prague Requirements, while L4S is currently missing two of them, but that we've also solved the single-queue problem.  I'm deliberately leaving the technical details vague until we've done some testing, but I will say that the name we've come up with is amusing.

Now if you'll excuse me, I need to fix a computer…

 - Jonathan Morton

* Re: [Ecn-sane] sce materials from ietf
  2019-11-29 20:08 [Ecn-sane] sce materials from ietf Dave Taht
  2019-11-29 21:20 ` [Ecn-sane] [Bloat] " Jonathan Morton
@ 2019-11-29 22:55 ` Rodney W. Grimes
  2019-11-29 22:58   ` Rodney W. Grimes
  2019-11-30  2:37 ` Dave Taht
  2 siblings, 1 reply; 27+ messages in thread
From: Rodney W. Grimes @ 2019-11-29 22:55 UTC (permalink / raw)
  To: Dave Taht; +Cc: ECN-Sane, bloat

> there are no minutes posted.
> 
> https://datatracker.ietf.org/meeting/106/materials/slides-106-tsvwg-sessa-81-some-congestion-experienced-00
> 
> https://datatracker.ietf.org/meeting/106/materials/slides-106-tcpm-some-congestion-experienced-in-tcp-00

The above 2 decks are identical.  Jonathan did not get any time
during tsvwg, so I reposted the whole deck to tcpm, in which I
also did not get any time to present.

BUT, and that is an all-caps BUT, good stuff happened for SCE
forward progress during the meetings nonetheless.  We did
in fact get an announcement that we have asked for adoption of
draft-morton-tsvwg-sce, with a 25 hand count on who has read
the draft, which by my rough estimate was more than 1/4 of
the room.

During the tcpm session the issues around allocation of bit 7
for AccECN may have been worked out: that draft (AccECN) is becoming
a proposed standard, which can do the IANA allocation, and Mira
at least continues to affirm that bit 7 can be used for other
purposes after an AccECN negotiation failure when it falls back
to RFC3168 ECN, so we (SCE) believe we do have a path forward on
our alternate use for bit 7.

The tsvwg chairs and the working group itself now need to discuss
the two-experiment problem, the conflicts and compatibilities between
the two, and just how to deal with the situation.

YOUR (that being all the list members of ecn-sane, and the larger
bufferbloat community) input and help are highly desired in
this process.

The SCE team's position is that L4S is fundamentally flawed in
its use of the ECT(1) code point as a "Traffic Classifier", since
that leads to the end nodes telling the network the traffic is
special, aka treat me differently than any other traffic, and
is likely to lead to abuse, which may possibly lead to bleaching
of the code point, which would be bad for everyone.

It would be much nicer to use this last code point, 1/2 a bit,
for a high fidelity signal from the network to the receiver
of the level of congestion, in a way that is fully backwards
compatible with RFC3168.

We (the SCE team) also feel that L4S is overly complex and continues
to grow in complexity as problems with it are exposed.  (Recently it
has become apparent that protection from RFC3168 behavior is needed,
and thus a new proposal and a new chunk of code are being
developed to deal with this issue.)

-- 
Rod Grimes                                                 rgrimes@freebsd.org

* Re: [Ecn-sane] sce materials from ietf
  2019-11-29 22:55 ` [Ecn-sane] " Rodney W. Grimes
@ 2019-11-29 22:58   ` Rodney W. Grimes
  0 siblings, 0 replies; 27+ messages in thread
From: Rodney W. Grimes @ 2019-11-29 22:58 UTC (permalink / raw)
  To: Rodney W. Grimes; +Cc: Dave Taht, ECN-Sane, bloat

> > there are no minutes posted.
> > 
> > https://datatracker.ietf.org/meeting/106/materials/slides-106-tsvwg-sessa-81-some-congestion-experienced-00
> > 
> > https://datatracker.ietf.org/meeting/106/materials/slides-106-tcpm-some-congestion-experienced-in-tcp-00
> 
> The above 2 decks are identical.  Jonathan did not get any time
> during tsvwg, so I reposted the whole deck to tcpm, in which I
> also did not get any time to present.

Correction, Jonathan did get 6 minutes, iirc.  

-- 
Rod Grimes                                                 rgrimes@freebsd.org

* Re: [Ecn-sane] [Bloat] sce materials from ietf
  2019-11-29 21:20 ` [Ecn-sane] [Bloat] " Jonathan Morton
@ 2019-11-29 23:10   ` alex.burr
  2019-11-30  1:39     ` Jonathan Morton
  0 siblings, 1 reply; 27+ messages in thread
From: alex.burr @ 2019-11-29 23:10 UTC (permalink / raw)
  To: Dave Taht, Jonathan Morton; +Cc: ECN-Sane, bloat

On Friday, November 29, 2019, 9:51:21 PM GMT, Jonathan Morton <chromatix99@gmail.com> wrote:

> Second, I gained a couple of key insights that I think will help to solve SCE's remaining shortcomings.  If we can apply them successfully by 
> Vancouver, we'll be able to stand up and say not only that SCE meets *all* of the Prague Requirements, while L4S is currently missing two of them,
> but that we've also solved the single-queue problem.  I'm deliberately leaving the technical details vague until we've done some testing, but I will say 
> that the name we've come up with is amusing.

I don't see what you gain by going after the Prague requirements. They're internal requirements for a TCP that would fulfill the L4S goals if classified into the L4S side of a DualQ AQM: 'Packet Identification' means that the L4S AQM can identify L4S supporting flows. This seems like a distraction from your main pitch to me. It would seem better to compare against the actual goals of L4S (AFAICT, low latency at the 99th percentile, in the presence of Reno-compatible flows, with some fairness requirement which I increasingly don't understand).

Alex

* Re: [Ecn-sane] [Bloat] sce materials from ietf
  2019-11-29 23:10   ` alex.burr
@ 2019-11-30  1:39     ` Jonathan Morton
  2019-11-30  7:27       ` Sebastian Moeller
  0 siblings, 1 reply; 27+ messages in thread
From: Jonathan Morton @ 2019-11-30  1:39 UTC (permalink / raw)
  To: alex.burr; +Cc: Dave Taht, ECN-Sane, bloat

> I don't see what you gain by going after the Prague requirements. They're internal requirements for a TCP that would fulfill the L4S goals if classified into the L4S side of a DualQ AQM: 'Packet Identification' means that the L4S AQM can identify L4S supporting flows. This seems like a distraction from your main pitch to me. It would seem better to compare against the actual goals of L4S (AFAICT, low latency at the 99th percentile, in the presence of Reno-compatible flows, with some fairness requirement which I increasingly don't understand).

We're certainly not treating the Prague Requirements as our only goals.  We just looked over them and realised we do sufficiently well on them already to compare favourably against L4S.  They are failing on their own merits.  Like it or not, we are somewhat in competition with them in IETF space, so this sort of comparison should help to bolster our standing.

A brief summary of the Prague Requirements:

1: Packet Identifier.

We ID ourselves as RFC-3168 compliant using ECT(0), because we are.

L4S has to identify itself more specifically to the network, because it is *not* RFC-3168 compliant.  It additionally relies on AQMs in the network understanding this distinction, which at present none do.  We would much prefer that they use a DSCP for this purpose, but at present they use ECT(1).
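
For readers less familiar with the codepoints being argued over, here is a minimal sketch of how the two ECN bits and the DSCP are read from the IPv4 TOS (or IPv6 Traffic Class) byte. The codepoint values are those of RFC 3168; the function names are illustrative only and are not taken from either the SCE or the L4S code.

```python
# The ECN field is the low two bits of the IPv4 TOS / IPv6 Traffic Class byte;
# the DSCP is the upper six bits. Codepoint values are from RFC 3168.
ECN_NAMES = {
    0b00: "Not-ECT",
    0b01: "ECT(1)",
    0b10: "ECT(0)",
    0b11: "CE",
}


def ecn_codepoint(tos: int) -> str:
    """Return the RFC 3168 name of the ECN codepoint carried in a TOS byte."""
    return ECN_NAMES[tos & 0b11]


def dscp(tos: int) -> int:
    """Return the 6-bit DSCP carried in a TOS byte."""
    return (tos >> 2) & 0x3F


if __name__ == "__main__":
    for tos in (0x00, 0x01, 0x02, 0x03):
        print(f"tos=0x{tos:02x} dscp={dscp(tos)} ecn={ecn_codepoint(tos)}")
```

In these terms, the disagreement is over what ECT(1) should mean: a sender-set classifier (L4S) or a network-set congestion signal (SCE), while a DSCP would leave the ECN field alone.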

2: Accurate ECN Feedback.

We use a spare bit in the header of TCP acks to feed back SCE marks, and the existing ECE/CWR mechanism from RFC-3168 unchanged for CE marks.  The SCE feedback is "accurate" but not "reliable", because it can tolerate large errors (as much as 100% relative) without departing the control loop.  The scheme is very simple and straightforward to implement at the receiver, and interpret at the sender.

L4S uses AccECN to give CE mark feedback that is both "accurate" and "reliable".  It is a somewhat complex specification which takes over three TCP header bits, including the two used for RFC-3168 feedback.
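
As a rough illustration of where these feedback bits live, the sketch below pulls the three relevant flags out of a raw TCP header. It assumes the "spare bit" is the "bit 7" Rod mentions elsewhere in this thread (the bit just above CWR in the 16-bit word holding the data offset and flags); the key name `bit7` and the function name are hypothetical and not taken from the SCE patches.

```python
import struct

# Masks within the 16-bit word at byte offset 12 of the TCP header
# (data offset, reserved bits, then flags).
FLAG_BIT7 = 0x0100  # assumption: the spare bit discussed in this thread
FLAG_CWR = 0x0080   # RFC 3168 Congestion Window Reduced
FLAG_ECE = 0x0040   # RFC 3168 ECN-Echo


def tcp_ecn_flags(tcp_header: bytes) -> dict:
    """Extract the ECN-related flag bits from a raw TCP header."""
    if len(tcp_header) < 14:
        raise ValueError("need at least 14 bytes of TCP header")
    (word,) = struct.unpack("!H", tcp_header[12:14])
    return {
        "bit7": bool(word & FLAG_BIT7),
        "cwr": bool(word & FLAG_CWR),
        "ece": bool(word & FLAG_ECE),
    }


if __name__ == "__main__":
    # A 20-byte header with data offset 5, ACK set, and the spare bit set.
    header = bytes(12) + struct.pack("!H", (5 << 12) | FLAG_BIT7 | 0x0010) + bytes(6)
    print(tcp_ecn_flags(header))
```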

3: TCP-friendly response to packet loss.

Both SCE and L4S do this without difficulty.

4: TCP-friendly response to RFC-3168 CE marking.

SCE does this by design, retaining the existing feedback mechanism for CE marks and implementing an RFC-8511 (ABE) compliant response in each of the TCP algorithms presented so far.  We can do this easily because CE and SCE information from the network is unambiguous.

L4S presently does not do this, largely because CE marks from RFC-3168 AQMs are not easily distinguished from CE marks from an L4S AQM.  They seem to be working on some sort of solution, but it has not yet been demonstrated to work, and their paper describing it leaves a lot of open questions (including tuning constants).  That we saw no demonstration of it at IETF-106 (indeed they even skipped over their planned talk on it in a side session dedicated to L4S) suggests to me that they found flaws that were difficult to overcome at short notice, and possibly even managed to look bad next to our demonstration of jitter tolerance at the Hackathon.

This point has always been the main difference between L4S and SCE.

5: Reduced RTT dependence

This is a mathematically interesting requirement which, at present, neither L4S nor SCE meets.

Fundamentally, any two flows following the same congestion-signal response which makes average cwnd dependent solely on marking probability, and which share the same bottleneck queue and AQM and therefore experience the same marking probability, will converge to the same average cwnd and their relative throughputs will therefore be inversely proportional to their RTTs.  This adequately describes both the pure AIMD response of Reno, and the so-called 1/p response of DCTCP (which TCP Prague apes slavishly).

The steady-state cwnd formula for CUBIC, however, is a function of both p(CE) and RTT, such that its throughput should be proportional to the reciprocal quartic root of RTT, rather than linearly reciprocal.  This assumes that CUBIC is not in its Reno compatibility regime, of course.  So CUBIC is the standard to beat, or at least match, for this requirement.

As I mentioned, I have an idea for how to do this.  I've seen no evidence that the L4S team have any equivalent ideas; again, they're failing their own requirements.
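
To put numbers on the two scaling laws described above, here is a minimal sketch that compares the throughput advantage of a short-RTT flow over a long-RTT flow sharing one queue. The exponents 1 (Reno, DCTCP) and 1/4 (CUBIC outside its Reno-compatible regime) come from the text; the 10 ms and 100 ms RTTs are assumed values chosen only for illustration, and the function name is hypothetical.

```python
def throughput_ratio(rtt_short_ms: float, rtt_long_ms: float, exponent: float) -> float:
    """Throughput of the short-RTT flow divided by that of the long-RTT flow,
    assuming throughput is proportional to RTT ** (-exponent) at equal
    marking probability."""
    return (rtt_long_ms / rtt_short_ms) ** exponent


# cwnd depends only on marking probability: throughput ~ 1/RTT.
reno_like = throughput_ratio(10, 100, 1.0)

# CUBIC steady state: throughput ~ RTT ** (-1/4).
cubic_like = throughput_ratio(10, 100, 0.25)

if __name__ == "__main__":
    print(f"Reno/DCTCP-style advantage: {reno_like:.2f}x")
    print(f"CUBIC-style advantage:      {cubic_like:.2f}x")
```

With these assumed RTTs, the short flow gets 10 times the throughput under the 1/RTT law but only about 1.78 times under the fourth-root law, which is why CUBIC is the bar to clear here.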

6: Scale down to fractional effective cwnd.

We technically achieve this with our preferred choice of pacing parameters, reducing send rate to 80% of a segment per RTT at the min-cwnd of 2 segments.  We could easily do better with different pacing ratios.

L4S have apparently implemented a packet size adjustment.  We haven't tried it out yet, but we'll take their word for it for the moment.  There's no inherent technical reason why we couldn't do the same.

7: Reordering tolerance on time basis (ie. RACK).

Both SCE and L4S inherit this capability from the Linux TCP stack, so it's not a problem.  FreeBSD also has a RACK compliant TCP stack which is being stabilised.

Other criteria which we are actively considering are listed in, for example, RFC-5033.  That makes a fun read if you're a masochist; I wonder about Pete sometimes.  :-)

 - Jonathan Morton

* Re: [Ecn-sane] sce materials from ietf
  2019-11-29 20:08 [Ecn-sane] sce materials from ietf Dave Taht
  2019-11-29 21:20 ` [Ecn-sane] [Bloat] " Jonathan Morton
  2019-11-29 22:55 ` [Ecn-sane] " Rodney W. Grimes
@ 2019-11-30  2:37 ` Dave Taht
  2019-11-30  3:10   ` Rodney W. Grimes
  2 siblings, 1 reply; 27+ messages in thread
From: Dave Taht @ 2019-11-30  2:37 UTC (permalink / raw)
  To: ECN-Sane, bloat

there are a multitude of papers posted for the buffer sizing workshop

http://buffer-workshop.stanford.edu/papers/paper23.pdf was interesting.

On Fri, Nov 29, 2019 at 12:08 PM Dave Taht <dave.taht@gmail.com> wrote:
>
> there are no minutes posted.
>
> https://datatracker.ietf.org/meeting/106/materials/slides-106-tsvwg-sessa-81-some-congestion-experienced-00
>
> https://datatracker.ietf.org/meeting/106/materials/slides-106-tcpm-some-congestion-experienced-in-tcp-00
> --
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-205-9740



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

* Re: [Ecn-sane] sce materials from ietf
  2019-11-30  2:37 ` Dave Taht
@ 2019-11-30  3:10   ` Rodney W. Grimes
  0 siblings, 0 replies; 27+ messages in thread
From: Rodney W. Grimes @ 2019-11-30  3:10 UTC (permalink / raw)
  To: Dave Taht; +Cc: ECN-Sane, bloat

> there are a multitude of papers posted for the buffer sizing workshop
> 
> http://buffer-workshop.stanford.edu/papers/paper23.pdf was interesting.

Would be nice to get them to add ECN (SCE) to the mix of their tests.

> On Fri, Nov 29, 2019 at 12:08 PM Dave Taht <dave.taht@gmail.com> wrote:
> >
> > there are no minutes posted.
> >
> > https://datatracker.ietf.org/meeting/106/materials/slides-106-tsvwg-sessa-81-some-congestion-experienced-00
> >
> > https://datatracker.ietf.org/meeting/106/materials/slides-106-tcpm-some-congestion-experienced-in-tcp-00
> > --
> >
> > Dave Täht
-- 
Rod Grimes                                                 rgrimes@freebsd.org

* Re: [Ecn-sane] [Bloat] sce materials from ietf
  2019-11-30  1:39     ` Jonathan Morton
@ 2019-11-30  7:27       ` Sebastian Moeller
  2019-11-30 14:32         ` Jonathan Morton
  0 siblings, 1 reply; 27+ messages in thread
From: Sebastian Moeller @ 2019-11-30  7:27 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: alex.burr, ECN-Sane, bloat

Hi Jonathan,


> On Nov 30, 2019, at 02:39, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>> I don't see what you gain by going after the Prague requirements. They're internal requirements for a TCP that would fulfill the L4S goals if classified into the L4S side of a DualQ AQM: 'Packet Identification' means that the L4S AQM can identify L4S supporting flows. This seems like a distraction from your main pitch to me. It would seem better to compare against the actual goals of L4S (AFAICT, low latency at the 99th percentile, in the presence of Reno-compatible flows, with some fairness requirement which I increasingly don't understand).
> 
> We're certainly not treating the Prague Requirements as our only goals.  We just looked over them and realised we do sufficiently well on them already to compare favourably against L4S.  They are failing on their own merits.  Like it or not, we are somewhat in competition with them in IETF space, so this sort of comparison should help to bolster our standing.
> 
> A brief summary of the Prague Requirements:
> 
> 
> 1: Packet Identifier.
> 
> We ID ourselves as RFC-3168 compliant using ECT(0), because we are.
> 
> L4S has to identify itself more specifically to the network, because it is *not* RFC-3168 compliant.  It additionally relies on AQMs in the network understanding this distinction, which at present none do.  We would much prefer that they use a DSCP for this purpose, but at present they use ECT(1).
> 
> 
> 2: Accurate ECN Feedback.
> 
> We use a spare bit in the header of TCP acks to feed back SCE marks, and the existing ECE/CWR mechanism from RFC-3168 unchanged for CE marks.  The SCE feedback is "accurate" but not "reliable", because it can tolerate large errors (as much as 100% relative) without departing the control loop.  The scheme is very simple and straightforward to implement at the receiver, and interpret at the sender.
> 
> L4S uses AccECN to give CE mark feedback that is both "accurate" and "reliable".  It is a somewhat complex specification which takes over three TCP header bits, including the two used for RFC-3168 feedback.

Question: How feasible would it be for any SCE-aware transport protocol to evaluate AccECN?  This might make sense if viewed not from a technical but from an IETF politics perspective.
I personally believe that if the ECN feedback were really important, it should be packaged into TCP data, as the payload has some delivery guarantees, while ACKs are effectively best effort (tangent: this is why I consider ACK filtering/compression abominations which should be counted against any guarantee the contract with the traffic carrier entails, not that this helps end customers).


> 
> 
> 3: TCP-friendly response to packet loss.
> 
> Both SCE and L4S do this without difficulty.
> 
> 
> 4: TCP-friendly response to RFC-3168 CE marking.
> 
> SCE does this by design, retaining the existing feedback mechanism for CE marks and implementing an RFC-8511 (ABE) compliant response in each of the TCP algorithms presented so far.  We can do this easily because CE and SCE information from the network is unambiguous.
> 
> L4S presently does not do this, largely because CE marks from RFC-3168 AQMs are not easily distinguished vice CE marks from an L4S AQM.  They seem to be working on some sort of solution, but it has not yet been demonstrated to work, and their paper describing it leaves a lot of open questions (including tuning constants).  That we saw no demonstration of it at IETF-106 (indeed they even skipped over their planned talk on it in a side session dedicated to L4S) suggests to me that they found flaws that were difficult to overcome at short notice, and possibly even managed to look bad next to our demonstration of jitter tolerance at the Hackathon.

	I fear that they will come up with something that in reality will a) be opt-out, that is, they will assume L4S-style feedback until reluctantly convinced that the bottleneck marker is RFC-3168 compliant, and hence will b) trigger too late and c) trigger too rarely to be actually helpful in reality, but might show a good enough effort to push L4S past issue #16.

> 
> This point has always been the main difference between L4S and SCE.
> 
> 
> 5: Reduced RTT dependence
> 
> This is a mathematically interesting requirement which, at present, neither L4S nor SCE meets.
> 
> Fundamentally, any two flows following the same congestion-signal response which makes average cwnd dependent solely on marking probability, and which share the same bottleneck queue and AQM and therefore experience the same marking probability, will converge to the same average cwnd and their relative throughputs will therefore be inversely proportional to their RTTs.  This adequately describes both the pure AIMD response of Reno, and the so-called 1/p response of DCTCP (which TCP Prague apes slavishly).
> 
> The steady-state cwnd formula for CUBIC, however, is a function of both p(CE) and RTT, such that its throughput should be proportional to the reciprocal quartic root of RTT, rather than linearly reciprocal.  This assumes that CUBIC is not in its Reno compatibility regime, of course.  So CUBIC is the standard to beat, or at least match, for this requirement.

	"Funny" story, looking at figure 6 of Høiland-Jørgensen T, Hurtig P, Brunstrom A (2015) The Good, the Bad and the WiFi: Modern AQMs in a residential setting. Computer Networks 89:90–106. shows clearly that a) single queue Pie (the AQM L4S inflicts upon at least the standard compliant traffic) causes worse RTT dependence than pfifo_fast and that fq_codel actually does (mostly) better, so by avoiding FQ like the devil, the L4S team shoots their own foot. 


Best Regards
	Sebastian

> 
> As I mentioned, I have an idea for how to do this.  I've seen no evidence that the L4S team have any equivalent ideas; again, they're failing their own requirements.
> 
> 
> 6: Scale down to fractional effective cwnd.
> 
> We technically achieve this with our preferred choice of pacing parameters, reducing send rate to 80% of a segment per RTT at the min-cwnd of 2 segments.  We could easily do better with different pacing ratios.
> 
> L4S have apparently implemented a packet size adjustment.  We haven't tried it out yet, but we'll take their word for it for the moment.  There's no inherent technical reason why we couldn't do the same.
> 
> 
> 7: Reordering tolerance on time basis (ie. RACK).
> 
> Both SCE and L4S inherit this capability from the Linux TCP stack, so it's not a problem.  FreeBSD also has a RACK compliant TCP stack which is being stabilised.
> 
> 
> Other criteria which we are actively considering are listed in, for example, RFC-5033.  That makes a fun read if you're a masochist; I wonder about Pete sometimes.  :-)
> 
> - Jonathan Morton


* Re: [Ecn-sane] [Bloat] sce materials from ietf
  2019-11-30  7:27       ` Sebastian Moeller
@ 2019-11-30 14:32         ` Jonathan Morton
  2019-11-30 15:42           ` Sebastian Moeller
  2019-11-30 22:17           ` Carsten Bormann
  0 siblings, 2 replies; 27+ messages in thread
From: Jonathan Morton @ 2019-11-30 14:32 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: alex.burr, ECN-Sane, bloat

>> 2: Accurate ECN Feedback.
>> 
>> We use a spare bit in the header of TCP acks to feed back SCE marks, and the existing ECE/CWR mechanism from RFC-3168 unchanged for CE marks.  The SCE feedback is "accurate" but not "reliable", because it can tolerate large errors (as much as 100% relative) without departing the control loop. The scheme is very simple and straightforward to implement at the receiver, and interpret at the sender.
>> 
>> L4S uses AccECN to give CE mark feedback that is both "accurate" and "reliable".  It is a somewhat complex specification which takes over three TCP header bits, including the two used for RFC-3168 feedback.
> 
> Question: How feasible would it be for any SCE-aware transport protocol to evaluate AccECN?  This might make sense if viewed not from a technical but from an IETF politics perspective.
> I personally believe that if the ECN feedback were really important, it should be packaged into TCP data, as the payload has some delivery guarantees, while ACKs are effectively best effort (tangent: this is why I consider ACK filtering/compression abominations which should be counted against any guarantee the contract with the traffic carrier entails, not that this helps end customers).

It would be *possible* to use AccECN for SCE feedback, but only because the distinction between ECT(0) and ECT(1) is fed back in a TCP option.  SCE also has no use for the "accurate" CE feedback for which the ECE/CWR bits are replaced; if that three-bit field lay somewhere else, it could conceivably have been used for SCE feedback instead.

There are unfortunate problems with introducing new TCP options, in that some overzealous firewalls block traffic which uses them.  This would be a deployment hazard for SCE, which merely using a spare header flag avoids.  So instead we are still planning to use the spare bit - which happens to be one that AccECN also uses, but AccECN negotiates in such a way that SCE can safely use it even with an AccECN capable partner.

>> 4: TCP-friendly response to RFC-3168 CE marking.
>> 
>> SCE does this by design, retaining the existing feedback mechanism for CE marks and implementing an RFC-8511 (ABE) compliant response in each of the TCP algorithms presented so far.  We can do this easily because CE and SCE information from the network is unambiguous.
>> 
>> L4S presently does not do this, largely because CE marks from RFC-3168 AQMs are not easily distinguished vice CE marks from an L4S AQM.  They seem to be working on some sort of solution, but it has not yet been demonstrated to work, and their paper describing it leaves a lot of open questions (including tuning constants).  That we saw no demonstration of it at IETF-106 (indeed they even skipped over their planned talk on it in a side session dedicated to L4S) suggests to me that they found flaws that were difficult to overcome at short notice, and possibly even managed to look bad next to our demonstration of jitter tolerance at the Hackathon.
> 
> 	I fear that they will come up with something that in reality will a) be opt-out, that is, they will assume L4S-style feedback until reluctantly convinced that the bottleneck marker is RFC-3168 compliant, and hence will b) trigger too late and c) trigger too rarely to be actually helpful in reality, but might show a good enough effort to push L4S past issue #16.

I'm sure they will, and we will of course point out these shortcomings as they occur, so as to count them against issue #16.  Conversely, if they do manage to make it fail-safe, it is highly likely that their scheme will give false positives on real Internet paths and fail to switch into L4S mode, impairing their performance in other ways.

>> 5: Reduced RTT dependence
>> 
>> This is a mathematically interesting requirement which, at present, neither L4S nor SCE meets.
>> 
>> Fundamentally, any two flows following the same congestion-signal response which makes average cwnd dependent solely on marking probability, and which share the same bottleneck queue and AQM and therefore experience the same marking probability, will converge to the same average cwnd and their relative throughputs will therefore be inversely proportional to their RTTs.  This adequately describes both the pure AIMD response of Reno, and the so-called 1/p response of DCTCP (which TCP Prague apes slavishly).
>> 
>> The steady-state cwnd formula for CUBIC, however, is a function of both p(CE) and RTT, such that its throughput should be proportional to the reciprocal quartic root of RTT, rather than linearly reciprocal.  This assumes that CUBIC is not in its Reno compatibility regime, of course.  So CUBIC is the standard to beat, or at least match, for this requirement.
> 
> 	"Funny" story, looking at figure 6 of Høiland-Jørgensen T, Hurtig P, Brunstrom A (2015) The Good, the Bad and the WiFi: Modern AQMs in a residential setting. Computer Networks 89:90–106. shows clearly that a) single queue Pie (the AQM L4S inflicts upon at least the standard compliant traffic) causes worse RTT dependence than pfifo_fast and that fq_codel actually does (mostly) better, so by avoiding FQ like the devil, the L4S team shoots their own foot. 

Right, and we can easily explain why this happens.  A dumb FIFO adds a more-or-less constant delay to both competing flows, effectively reducing their RTT ratio towards unity.  Even at the short effective queue lengths proposed by L4S, the example they give in the Prague Requirements is of a 100ms versus 1ms baseline path, lengthened to 101ms versus 2ms by a 1ms queue.  This reduces a 100:1 ratio to 50.5:1.
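
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the ratio calculation: the function name is hypothetical, and the 100 ms queue in the last case is an assumed figure standing in for a deep dumb FIFO, not a measurement.

```python
def effective_rtt_ratio(base_long_ms: float, base_short_ms: float, queue_ms: float) -> float:
    """RTT ratio of two flows once a shared queue adds the same delay to both."""
    return (base_long_ms + queue_ms) / (base_short_ms + queue_ms)


if __name__ == "__main__":
    # No queue: the baseline 100:1 ratio.
    print(effective_rtt_ratio(100, 1, 0))
    # A 1 ms queue, as in the Prague Requirements example: 101 ms versus 2 ms.
    print(effective_rtt_ratio(100, 1, 1))
    # Assumed 100 ms of FIFO delay: the ratio collapses towards unity.
    print(effective_rtt_ratio(100, 1, 100))
```

The deeper the shared queue, the closer the ratio gets to 1, which is the effect a dumb FIFO has on RTT fairness, at the obvious cost of latency.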

The FQ example is, however, of the network enforcing fairness, rather than informing the endpoints of the corrections they need to make to resolve unfairness.  We really like FQ, of course, but it's not feasible to deploy it everywhere, so we have to ensure reasonable competition between flows sharing a single queue.  We've already started testing one such idea…

 - Jonathan Morton

* Re: [Ecn-sane] [Bloat] sce materials from ietf
  2019-11-30 14:32         ` Jonathan Morton
@ 2019-11-30 15:42           ` Sebastian Moeller
  2019-11-30 17:11             ` Jonathan Morton
  2019-11-30 22:17           ` Carsten Bormann
  1 sibling, 1 reply; 27+ messages in thread
From: Sebastian Moeller @ 2019-11-30 15:42 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: alex.burr, ECN-Sane, bloat

Hi Jonathan,

thanks, more below.

> On Nov 30, 2019, at 15:32, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>>> 2: Accurate ECN Feedback.
>>> 
>>> We use a spare bit in the header of TCP acks to feed back SCE marks, and the existing ECE/CWR mechanism from RFC-3168 unchanged for CE marks.  The SCE feedback is "accurate" but not "reliable", because it can tolerate large errors (as much as 100% relative) without departing the control loop. The scheme is very simple and straightforward to implement at the receiver, and interpret at the sender.
>>> 
>>> L4S uses AccECN to give CE mark feedback that is both "accurate" and "reliable".  It is a somewhat complex specification which takes over three TCP header bits, including the two used for RFC-3168 feedback.
>> 
>> Question: How feasible would it be for any SCE aware transport protocol to evaluate AccECN?  This might make sense if not viewed from a technical but from a ietf politics perspective?
>> I personally believe, that if the ECN feedback woukd e really important it should be packeged into TCP data as the payload has some delivery guarantees, while ACKs are effectively best effort (tangent: and this is why I consider ACK filtering/compression as abominations which should be counted against any guarantee the contract with the traffic-carrier entails, not that this helps end customers).
> 
> It would be *possible* to use AccECN for SCE feedback, but only because the distinction between ECT(0) and ECT(1) is fed back in a TCP option.  SCE also has no use for the "accurate" CE feedback for which the ECE/CWR bits are replaced; if that three-bit field lay somewhere else, it could conceivably have been used for SCE feedback instead.

	I guess a "proper" solution would be to get a decently sized counter and just accumulate the SCE marks the receiver side saw, similar to the acknowledgment counter, to gain reasonable robustness against lost/filtered ACK packets. I naively would just try to get access to the 16 bit URG field ;) (ducks and runs....)

> 
> There are unfortunate problems with introducing new TCP options, in that some overzealous firewalls block traffic which uses them.  This would be a deployment hazard for SCE, which merely using a spare header flag avoids.  So instead we are still planning to use the spare bit - which happens to be one that AccECN also uses, but AccECN negotiates in such a way that SCE can safely use it even with an AccECN capable partner.

	Fair enough, I was mainly concerned about politics here.

> 
>>> 4: TCP-friendly response to RFC-3168 CE marking.
>>> 
>>> SCE does this by design, retaining the existing feedback mechanism for CE marks and implementing an RFC-8511 (ABE) compliant response in each of the TCP algorithms presented so far.  We can do this easily because CE and SCE information from the network is unambiguous.
>>> 
>>> L4S presently does not do this, largely because CE marks from RFC-3168 AQMs are not easily distinguished vice CE marks from an L4S AQM.  They seem to be working on some sort of solution, but it has not yet been demonstrated to work, and their paper describing it leaves a lot of open questions (including tuning constants).  That we saw no demonstration of it at IETF-106 (indeed they even skipped over their planned talk on it in a side session dedicated to L4S) suggests to me that they found flaws that were difficult to overcome at short notice, and possibly even managed to look bad next to our demonstration of jitter tolerance at the Hackathon.
>> 
>> 	I fear that they will come up with something that in reality will a) by opt-out, that is they will assume L4S-style feedback until reluctantly convinced that the bottleneck marker is rfc3160-compliant and hence will b) trigger too late c) trigger to rarely to be actually helpful in reality, but might show a good enough effort to push L4S past issue #16.
> 
> I'm sure they will, and we will of course point out these shortcomings as they occur, so as to count them against issue #16.  

	That might be a bad position to be in, though (if one party only gives negative feedback, no matter how justified, it will generate a residual feeling of a lack of good-faith cooperation); I would have preferred it if the requirements had been discussed beforehand.

> Conversely, if they do manage to make it fail-safe, it is highly likely that their scheme will give false positives on real Internet paths and fail to switch into L4S mode, impairing their performance in other ways.

	Yes, so far they always err to the advantage of L4S and justify this with "but, latency"; if one buys the latency justification, cautiously defaulting to rfc3168 behaviour becomes obviously sub-optimal. So far none of the chairs has put down the "first, do no harm" hammer (and I doubt they will).

> 
>>> 5: Reduced RTT dependence
>>> 
>>> This is a mathematically interesting requirement which, at present, neither L4S nor SCE meets.
>>> 
>>> Fundamentally, any two flows following the same congestion-signal response which makes average cwnd dependent solely on marking probability, and which share the same bottleneck queue and AQM and therefore experience the same marking probability, will converge to the same average cwnd and their relative throughputs will therefore be inversely proportional to their RTTs.  This adequately describes both the pure AIMD response of Reno, and the so-called 1/p response of DCTCP (which TCP Prague apes slavishly).
>>> 
>>> The steady-state cwnd formula for CUBIC, however, is a function of both p(CE) and RTT, such that its throughput should be proportional to the reciprocal quartic root of RTT, rather than linearly reciprocal.  This assumes that CUBIC is not in its Reno compatibility regime, of course.  So CUBIC is the standard to beat, or at least match, for this requirement.
>> 
>> 	"Funny" story, looking at figure 6 of Høiland-Jørgensen T, Hurtig P, Brunstrom A (2015) The Good, the Bad and the WiFi: Modern AQMs in a residential setting. Computer Networks 89:90–106. shows clearly that a) single queue Pie (the AQM L4S inflicts upon at least the standard compliant traffic) causes worse RTT dependence than pfifo_fast and that fq_codel actually does (mostly) better, so by avoiding FQ like the devil, the L4S team shoots their own foot. 
> 
> Right, and we can easily explain why this happens.  A dumb FIFO adds a more-or-less constant delay to both competing flows, effectively reducing their RTT ratio towards unity.  Even at the short effective queue lengths proposed by L4S, the example they give in the Prague Requirements is of a 100ms versus 1ms baseline path, lengthened to 101ms versus 2ms by a 1ms queue.  This reduces a 100:1 ratio to 50.5:1.

	Well, that is my point: the default is no AQM and just a FIFO, so "Reduced RTT dependence" is a euphemism, as the chosen solution actually makes the RTT dependence worse in reality in the first place ;)

> 
> The FQ example is, however, of the network enforcing fairness, rather than informing the endpoints of the corrections they need to make to resolve unfairness.

	Yes, and the figure shows that even with fq, fairness is still sub-optimal, but short of measuring and keeping an individual interval/target pair for each flow there is not much that can be done (a shorter control loop will simply be better equipped to sweep up bandwidth freed when longer-RTT flows scale their bandwidth down, no?)

>  We really like FQ, of course, but it's not feasible to deploy it everywhere,

	But realistically, notwithstanding L4S' grand design and ambitions, all we are talking about is the shapers at the ISP/end-customer boundary, and there we have no proof that fq is not feasible?


> so we have to ensure reasonable competition between flows sharing a single queue.  We've already started testing one such idea…

	That, if it works, certainly will make the "fq is too costly" crowd easier to convince. So the best of luck!

Best Regards
	Sebastian

> 
> - Jonathan Morton



* Re: [Ecn-sane] [Bloat] sce materials from ietf
  2019-11-30 15:42           ` Sebastian Moeller
@ 2019-11-30 17:11             ` Jonathan Morton
  2019-12-02  5:38               ` Dave Taht
  0 siblings, 1 reply; 27+ messages in thread
From: Jonathan Morton @ 2019-11-30 17:11 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: alex.burr, ECN-Sane, bloat

> On 30 Nov, 2019, at 5:42 pm, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
>>> I fear that they will come up with something that in reality will a) by opt-out, that is they will assume L4S-style feedback until reluctantly convinced that the bottleneck marker is rfc3160-compliant and hence will b) trigger too late c) trigger to rarely to be actually helpful in reality, but might show a good enough effort to push L4S past issue #16.
>> 
>> I'm sure they will, and we will of course point out these shortcomings as they occur, so as to count them against issue #16.  
> 
> 	That might be bad position to be in though (if one party only gives negative feed-back no matter how justified it will generate a residual feeling of lack of good faith cooperation), I would have preferred if the requirements would have bee discussed before.
> 
>> Conversely, if they do manage to make it fail-safe, it is highly likely that their scheme will give false positives on real Internet paths and fail to switch into L4S mode, impairing their performance in other ways.
> 
> 	Yes, so far they always err on the advantage of L4S, and justify this with "but, latency" and if one buys the latency justification cautiously default to rfc3168 becomes obviously sub-optimal, and so far none of the chairs put down the "first, do no harm" hammer (and I doubt they will). 

We do have a political ally in the form of David Black.  As one of the authors of RFC-3168, he has a natural desire to defend his work.  At Singapore I believe he mostly spoke from the floor, but he is also advocating for SCE behind the scenes.  He's actually quite encouraged by the situation at present, in which L4S were seen to bluster for 2+ hours without actually moving very much forward, while we were able to present some new work in a very limited time.

I got the impression that failing to close most of L4S' open issues at Singapore is politically damaging to them.  This is a substantial list of problems opened at Montreal, as blockers for their WGLC on publishing L4S drafts as experimental RFCs.  They had all the time in the world to talk about solutions to the major showstopper problems, but were only able to concede a point that maybe tying RACK to the ECT(1) codepoint is better written as a SHOULD instead of a MUST.  That lack of progress was noticed at the WG Chair level; I think they may have been giving them the rope to hang themselves, so to speak.  I think they had a slide up at the side session, showing massive unfairness between L4S and "classic" flows, for a full half-hour - and they somehow thought that was *helpful* to their case!

I'm reasonably sure some industry attendees also noticed this - Stuart Cheshire (of Apple) in particular.  Apple have been on the front lines of enabling ECN deployment in practice in recent years.  He invited me, one of the ICCRG chairs, and Bob Briscoe - among others - to dinner, where we discussed some technical distinctions and Bob demonstrated a fundamental misunderstanding of control theory.

And we will have more ammunition at Vancouver.  It remains to be seen how much progress they'll make…

 - Jonathan Morton


* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-11-30 14:32         ` Jonathan Morton
  2019-11-30 15:42           ` Sebastian Moeller
@ 2019-11-30 22:17           ` Carsten Bormann
  2019-11-30 22:23             ` Jonathan Morton
  1 sibling, 1 reply; 27+ messages in thread
From: Carsten Bormann @ 2019-11-30 22:17 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Sebastian Moeller, ECN-Sane, bloat

On Nov 30, 2019, at 15:32, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
> There are unfortunate problems with introducing new TCP options, in that some overzealous firewalls block traffic which uses them.  This would be a deployment hazard for SCE, which merely using a spare header flag avoids.  So instead we are still planning to use the spare bit - which happens to be one that AccECN also uses, but AccECN negotiates in such a way that SCE can safely use it even with an AccECN capable partner.

This got me curious:  Do you have any evidence that firewalls are friendlier to new flags than to new options?

Grüße, Carsten



* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-11-30 22:17           ` Carsten Bormann
@ 2019-11-30 22:23             ` Jonathan Morton
  2019-12-01 16:35               ` Sebastian Moeller
  0 siblings, 1 reply; 27+ messages in thread
From: Jonathan Morton @ 2019-11-30 22:23 UTC (permalink / raw)
  To: Carsten Bormann; +Cc: Sebastian Moeller, ECN-Sane, bloat

> On 1 Dec, 2019, at 12:17 am, Carsten Bormann <cabo@tzi.org> wrote:
> 
>> There are unfortunate problems with introducing new TCP options, in that some overzealous firewalls block traffic which uses them.  This would be a deployment hazard for SCE, which merely using a spare header flag avoids.  So instead we are still planning to use the spare bit - which happens to be one that AccECN also uses, but AccECN negotiates in such a way that SCE can safely use it even with an AccECN capable partner.
> 
> This got me curious:  Do you have any evidence that firewalls are friendlier to new flags than to new options?

Mirja Kuhlewind said as much during the TCPM session we attended, and she ought to know.  There appear to have been several studies performed on this subject; reserved TCP flags tend to get ignored pretty well, but unknown TCP options tend to get either stripped or blocked.

This influenced the design of AccECN as well; in an early version it would have used only a TCP option and left the TCP flags alone.  When it was found that firewalls would often interfere with this, the three-bit field in the TCP flags area was cooked up.

 - Jonathan Morton


* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-11-30 22:23             ` Jonathan Morton
@ 2019-12-01 16:35               ` Sebastian Moeller
  2019-12-01 16:54                 ` Jonathan Morton
  2019-12-01 17:30                 ` Rodney W. Grimes
  0 siblings, 2 replies; 27+ messages in thread
From: Sebastian Moeller @ 2019-12-01 16:35 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Carsten Bormann, ECN-Sane, bloat

Hi Jonathan,


> On Nov 30, 2019, at 23:23, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>> On 1 Dec, 2019, at 12:17 am, Carsten Bormann <cabo@tzi.org> wrote:
>> 
>>> There are unfortunate problems with introducing new TCP options, in that some overzealous firewalls block traffic which uses them.  This would be a deployment hazard for SCE, which merely using a spare header flag avoids.  So instead we are still planning to use the spare bit - which happens to be one that AccECN also uses, but AccECN negotiates in such a way that SCE can safely use it even with an AccECN capable partner.
>> 
>> This got me curious:  Do you have any evidence that firewalls are friendlier to new flags than to new options?
> 
> Mirja Kuhlewind said as much during the TCPM session we attended, and she ought to know.  There appear to have been several studies performed on this subject; reserved TCP flags tend to get ignored pretty well, but unknown TCP options tend to get either stripped or blocked.
> 
> This influenced the design of AccECN as well; in an early version it would have used only a TCP option and left the TCP flags alone.  When it was found that firewalls would often interfere with this, the three-bit field in the TCP flags area was cooked up.


	Belt and suspenders, eh? But realistically, the idea of using an accumulating SCE counter to allow for a lossy reverse ACK path seems sort of okay (after all, TCP relies on the same, so there would be a nice symmetry).
I really wonder whether SCE could not, in addition to its current bit, borrow the URG pointer field in cases when it is not used, or not fully used (if the MSS is smaller than 64K there might be a few bits left over; with an MTU < 2000 I would expect that ~5 bits might still be usable in that rare case). I might be completely out to lunch here, but boy, a nice rarely used contiguous 16-bit field in the TCP header, what kind of mischief one could arrange with that ;) Looking at the AccECN draft, I see that my idea is not terribly original... But hey, for SCE an additional higher-fidelity SCE counter might be a nice addition, assuming that URG(0) with an urgent pointer > 0 will not be bleached/rejected by uninitiated TCP stacks/middleboxes...
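
As a rough illustration of the accumulating-counter idea, the Python sketch below assumes a hypothetical wrapping counter carried in whatever spare bits are available; none of the class or function names come from the SCE or AccECN drafts. The receiver increments the counter for each SCE-marked packet and echoes it in every ACK, and the sender recovers the number of new marks as a modular difference, so dropped ACKs do not lose marks as long as fewer than 2^bits marks arrive between two surviving ACKs.

```python
def spare_urgent_bits(max_offset):
    """Bits of a 16-bit field left over if offsets stay below max_offset."""
    return 16 - (max_offset - 1).bit_length()


class MarkCounterReceiver:
    """Hypothetical receiver side: counts SCE marks in a wrapping counter."""

    def __init__(self, bits):
        self.mask = (1 << bits) - 1
        self.count = 0

    def on_data(self, sce_marked):
        if sce_marked:
            self.count = (self.count + 1) & self.mask
        return self.count  # value echoed in the ACK for this packet


class MarkCounterSender:
    """Hypothetical sender side: turns echoed counter values into mark deltas."""

    def __init__(self, bits):
        self.modulus = 1 << bits
        self.last_seen = 0
        self.total_marks = 0

    def on_ack(self, echoed_count):
        # The modular difference stays correct across counter wrap-around.
        delta = (echoed_count - self.last_seen) % self.modulus
        self.last_seen = echoed_count
        self.total_marks += delta
        return delta


def run_demo(packets=100, keep_every=10):
    bits = spare_urgent_bits(2000)  # 5 bits for offsets below 2000
    rx, tx = MarkCounterReceiver(bits), MarkCounterSender(bits)
    marks_sent = 0
    for i in range(packets):
        marked = (i % 3 == 0)  # arbitrary marking pattern for the demo
        marks_sent += marked
        echoed = rx.on_data(marked)
        if i % keep_every == keep_every - 1:  # only 1 ACK in 10 survives
            tx.on_ack(echoed)
    return marks_sent, tx.total_marks
```

In this demo nine out of ten ACKs are discarded and the 5-bit counter wraps once, yet `run_demo()` returns the same mark count on both sides. The obvious limitation is ambiguity once 32 or more marks fall between two surviving ACKs.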

Best Regards
	Sebastian




> 
> - Jonathan Morton
> 



* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-12-01 16:35               ` Sebastian Moeller
@ 2019-12-01 16:54                 ` Jonathan Morton
  2019-12-01 19:03                   ` Sebastian Moeller
  2019-12-01 17:30                 ` Rodney W. Grimes
  1 sibling, 1 reply; 27+ messages in thread
From: Jonathan Morton @ 2019-12-01 16:54 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: Carsten Bormann, ECN-Sane, bloat

> On 1 Dec, 2019, at 6:35 pm, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
> Belt and suspenders, eh? But realistically, the idea of using an accumulating SCE counter to allow for a lossy reverse ACK path seems sort of okay (after all TCP relies on the same, so there would be a nice symmetry ).

Sure, we did think of several schemes that used a counter.  But when it came down to actually implementing it, we decided to try the simplest possible solution first and see how well it worked in practice.  It turned out to work very well, and can recover cleanly from as much as 100% relative feedback error caused by ack loss:

If less feedback is observed by the sender than intended by the AQM, growth will continue and the AQM will increase its marking to compensate, ultimately resorting to a CE mark.  This is, incidentally, exactly what happens if the receiver *or* sender are completely SCE-ignorant, and looks very much like RFC-3168 behaviour, which is entirely intentional.

If feedback is systematically doubled by the time it reaches the sender, perhaps through faulty ack filtering on the return path, it will back off more than intended, the bottleneck queue will empty, and AQM feedback will consequently reduce or cease entirely.  Only a very serious fault would re-inject ESCE feedback once SCE marking has completely ceased, so the sender will then grow back towards the correct cwnd after a relatively small negative excursion.

The above represents both extremes of 100% relative error in the feedback, which is shown to be safe and reasonably tolerable.  Smaller errors due to random ack loss are more likely, and consequently easier to tolerate in a closed negative-feedback control loop.
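
The argument above can be explored with a toy closed-loop model. The Python sketch below is only an illustration under invented assumptions: the linear marking ramp, the per-mark reduction, the CE threshold and the halving on CE are all made-up parameters, not the behaviour of any real SCE implementation. `feedback_gain` scales the marks the sender sees relative to what the AQM sent: 0.0 means the feedback is erased entirely, 1.0 is faithful, 2.0 is systematically doubled.

```python
def simulate(feedback_gain, rtts=2000, bdp=100.0, ramp=20.0,
             ce_threshold=40.0, per_mark_cut=0.1):
    """Return the queue depth (in packets) seen at the start of each RTT."""
    cwnd = bdp
    queues = []
    for _ in range(rtts):
        queue = max(0.0, cwnd - bdp)
        queues.append(queue)
        if queue > ce_threshold:
            # Safety valve: a CE mark triggers a conventional multiplicative cut.
            cwnd *= 0.5
            continue
        p_sce = min(1.0, queue / ramp)            # toy linear SCE marking ramp
        marks_sent = p_sce * cwnd                 # expected marks this RTT
        marks_seen = feedback_gain * marks_sent   # distorted on the ACK path
        # Additive growth of one packet per RTT, small reduction per mark seen.
        cwnd = max(1.0, cwnd + 1.0 - per_mark_cut * marks_seen)
    return queues


if __name__ == "__main__":
    for gain in (0.0, 0.5, 1.0, 2.0):
        q = simulate(gain)
        print(f"gain {gain:3.1f}: max queue {max(q):5.1f}, final queue {q[-1]:5.2f}")
```

In this model the queue stays bounded for every gain: with no feedback the flow falls back to a CE-driven sawtooth, while halved or doubled feedback merely shifts the equilibrium queue up or down, which is qualitatively the behaviour described above.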

 - Jonathan Morton


* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-12-01 16:35               ` Sebastian Moeller
  2019-12-01 16:54                 ` Jonathan Morton
@ 2019-12-01 17:30                 ` Rodney W. Grimes
  2019-12-01 19:17                   ` Sebastian Moeller
  1 sibling, 1 reply; 27+ messages in thread
From: Rodney W. Grimes @ 2019-12-01 17:30 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: Jonathan Morton, ECN-Sane, bloat

> Hi Jonathan,
> 
> 
> > On Nov 30, 2019, at 23:23, Jonathan Morton <chromatix99@gmail.com> wrote:
> > 
> >> On 1 Dec, 2019, at 12:17 am, Carsten Bormann <cabo@tzi.org> wrote:
> >> 
> >>> There are unfortunate problems with introducing new TCP options, in that some overzealous firewalls block traffic which uses them.  This would be a deployment hazard for SCE, which merely using a spare header flag avoids.  So instead we are still planning to use the spare bit - which happens to be one that AccECN also uses, but AccECN negotiates in such a way that SCE can safely use it even with an AccECN capable partner.
> >> 
> >> This got me curious:  Do you have any evidence that firewalls are friendlier to new flags than to new options?
> > 
> > Mirja Kuhlewind said as much during the TCPM session we attended, and she ought to know.  There appear to have been several studies performed on this subject; reserved TCP flags tend to get ignored pretty well, but unknown TCP options tend to get either stripped or blocked.
> > 
> > This influenced the design of AccECN as well; in an early version it would have used only a TCP option and left the TCP flags alone.  When it was found that firewalls would often interfere with this, the three-bit field in the TCP flags area was cooked up.
> 
> 
> 	Belt and suspenders, eh? But realistically, the idea of using an accumulating SCE counter to allow for a lossy reverse ACK path seems sort of okay (after all TCP relies on the same, so there would be a nice symmetry ).
> I really wonder whether SCE could not, in addition to its current bit, borrow the URG pointer field in cases when it is not used, or not fully used (if the MSS is smaller than 64K there might be a few bits leftover, with an MTU < 2000 I would expect that ~5 bits might still be usable in that rate case). I might be completely of to lunch here, but boy a nice rarely used contiguous 16bit field in the TCP header, what kind of mischief one could arrange with that ;) Looking at the AccECN draft, I see that my idea is not terribly original... But, hey for SCE having an additional higher fidelity SCE counter might be a nice addition, assuming URG(0), urgent pointer > 0 will not bleached/rejected by uninitiated TCP stacks/middleboxes...

We need to fix the ACK issues rather than continue to work around them.  Ack thinning is good, as long as it does not cause information loss.  There is no draft/RFC on this; one needs to be written that explains that you cannot just ignore all the bits: you have to preserve the reserved bits, so you can only thin ACKs whose reserved bits are the same.  Jonathan already fixed Cake (I think that is the one that has ACK thinning) to not collapse ACKs that have different bit 7 values.

Note that I consider the arrival time of the ACKs to also be information; RACK, for instance, uses that, so in the case of RACK any thinning could be considered bad.  BUT I'll settle for not tossing reserved bit changes away as a "good enough" step forward that should be simple to implement (2 gate delay xor/or function).
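
A minimal sketch of that thinning rule, in Python, might look like the following. The `Ack` type and its `reserved_bits` field are hypothetical stand-ins for a parsed TCP header, and this is not Cake's actual code: an older ACK is only replaced by its successor when the XOR of their reserved flag bits is zero, so every change in those bits survives thinning.

```python
from dataclasses import dataclass


@dataclass
class Ack:
    ack_no: int         # cumulative acknowledgment number
    reserved_bits: int  # hypothetical: reserved TCP flag bits as an integer


def thin_acks(pending):
    """Collapse runs of queued pure ACKs without losing reserved-bit changes."""
    kept = []
    for ack in pending:
        same_bits = bool(kept) and (kept[-1].reserved_bits ^ ack.reserved_bits) == 0
        if same_bits and ack.ack_no >= kept[-1].ack_no:
            # Successor carries the same flag bits: the older ACK is redundant.
            kept[-1] = ack
        else:
            # Flag bits differ (or queue is empty): keep both ACKs.
            kept.append(ack)
    return kept


if __name__ == "__main__":
    queue = [Ack(1, 0), Ack(2, 0), Ack(3, 1), Ack(4, 1), Ack(5, 0)]
    print(thin_acks(queue))
```

For the five queued ACKs in the usage example, the ACKs numbered 2, 4 and 5 are kept, preserving the 0, 1, 0 sequence of flag values while still removing two packets.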

> 	Sebastian
> > - Jonathan Morton
-- 
Rod Grimes                                                 rgrimes@freebsd.org


* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-12-01 16:54                 ` Jonathan Morton
@ 2019-12-01 19:03                   ` Sebastian Moeller
  2019-12-01 19:27                     ` Jonathan Morton
  0 siblings, 1 reply; 27+ messages in thread
From: Sebastian Moeller @ 2019-12-01 19:03 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Carsten Bormann, ECN-Sane, bloat

Hi Jonathan,


> On Dec 1, 2019, at 17:54, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>> On 1 Dec, 2019, at 6:35 pm, Sebastian Moeller <moeller0@gmx.de> wrote:
>> 
>> Belt and suspenders, eh? But realistically, the idea of using an accumulating SCE counter to allow for a lossy reverse ACK path seems sort of okay (after all TCP relies on the same, so there would be a nice symmetry ).
> 
> Sure, we did think of several schemes that used a counter.  But when it came down to actually implementing it, we decided to try the simplest possible solution first and see how well it worked in practice.  

	+1; simplicity has its own elegance.

> It turned out to work very well, and can recover cleanly from as much as 100% relative feedback error caused by ack loss:
> 
> If less feedback is observed by the sender than intended by the AQM, growth will continue and the AQM will increase its marking to compensate, ultimately resorting to a CE mark.  

	Well, that seems undesirable?

> This is, incidentally, exactly what happens if the receiver *or* sender are completely SCE-ignorant, and looks very much like RFC-3168 behaviour, which is entirely intentional.
> 
> If feedback is systematically doubled by the time it reaches the sender, perhaps through faulty ack filtering on the return path, it will back off more than intended, the bottleneck queue will empty, and AQM feedback will consequently reduce or cease entirely.  Only a very serious fault would re-inject ESCE feedback once SCE marking has completely ceased, so the sender will then grow back towards the correct cwnd after a relatively small negative excursion.

	Am I right to assume that the fault tolerance requires a relatively steady ACK stream though?

> 
> The above represents both extremes of 100% relative error in the feedback, which is shown to be safe and reasonably tolerable.

	Great that the current simple scheme is safe (and for my pie-in-the-sky "let's hijack the URG pointer" scheme that is essential, since there are valid existing users of the URG mechanism; at least Google tells me that both ftp and telnet are candidates, but these seem rare enough that giving these 16+1 bits something else to do might be fun).

>  Smaller errors due to random ack loss are more likely, and consequently easier to tolerate in a closed negative-feedback control loop.

	Fair enough.

Best Regards
	Sebastian

> 
> - Jonathan Morton



* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-12-01 17:30                 ` Rodney W. Grimes
@ 2019-12-01 19:17                   ` Sebastian Moeller
  2019-12-02  5:10                     ` Dave Taht
  0 siblings, 1 reply; 27+ messages in thread
From: Sebastian Moeller @ 2019-12-01 19:17 UTC (permalink / raw)
  To: Rodney W. Grimes; +Cc: Jonathan Morton, ECN-Sane, bloat

Hi Rodney,


> On Dec 1, 2019, at 18:30, Rodney W. Grimes <4bone@gndrsh.dnsmgr.net> wrote:
> 
>> Hi Jonathan,
>> 
>> 
>>> On Nov 30, 2019, at 23:23, Jonathan Morton <chromatix99@gmail.com> wrote:
>>> 
>>>> On 1 Dec, 2019, at 12:17 am, Carsten Bormann <cabo@tzi.org> wrote:
>>>> 
>>>>> There are unfortunate problems with introducing new TCP options, in that some overzealous firewalls block traffic which uses them.  This would be a deployment hazard for SCE, which merely using a spare header flag avoids.  So instead we are still planning to use the spare bit - which happens to be one that AccECN also uses, but AccECN negotiates in such a way that SCE can safely use it even with an AccECN capable partner.
>>>> 
>>>> This got me curious:  Do you have any evidence that firewalls are friendlier to new flags than to new options?
>>> 
>>> Mirja Kuhlewind said as much during the TCPM session we attended, and she ought to know.  There appear to have been several studies performed on this subject; reserved TCP flags tend to get ignored pretty well, but unknown TCP options tend to get either stripped or blocked.
>>> 
>>> This influenced the design of AccECN as well; in an early version it would have used only a TCP option and left the TCP flags alone.  When it was found that firewalls would often interfere with this, the three-bit field in the TCP flags area was cooked up.
>> 
>> 
>> 	Belt and suspenders, eh? But realistically, the idea of using an accumulating SCE counter to allow for a lossy reverse ACK path seems sort of okay (after all TCP relies on the same, so there would be a nice symmetry ).
>> I really wonder whether SCE could not, in addition to its current bit, borrow the URG pointer field in cases when it is not used, or not fully used (if the MSS is smaller than 64K there might be a few bits leftover, with an MTU < 2000 I would expect that ~5 bits might still be usable in that rate case). I might be completely of to lunch here, but boy a nice rarely used contiguous 16bit field in the TCP header, what kind of mischief one could arrange with that ;) Looking at the AccECN draft, I see that my idea is not terribly original... But, hey for SCE having an additional higher fidelity SCE counter might be a nice addition, assuming URG(0), urgent pointer > 0 will not bleached/rejected by uninitiated TCP stacks/middleboxes...
> 
> We need to fix the ACK issues rather than continue to work around it.  Ack thinning is good, as long as it does not cause information loss.  There is no draft/RFC on this, one needs to be written that explains you can not just ignore all the bits, you have to preserve the reserve bits, so you can only thin if they are the same.  Jonathan already fixed Cake (I think that is the one that has ACK thinning) to not collapse ACK's that have different bit 7 values.

	Well, I detest ACK thinning and believe that the network should not try to second-guess the user's traffic (dropping/marking on reaching capacity is acceptable, but the kind of silent ACK thinning some DOCSIS ISPs perform seems actively user-hostile). But thinning or no thinning, the accumulative signaling is how the ACK stream deals with (reasonably) lossy paths, and I think any additional signaling via pure ACK packets should simply be tolerant of unexpected losses. I fully agree that if ACK thinning is performed it really should be careful not to lose information when doing its job, but SCE hopefully can deal with whatever is out in the field today (I am looking at you, DOCSIS uplinks...), no?

> 
> Note that I consider the time of the arriving ACKS to also be informaition, RACK for instance uses that, so in the case of RACK any thinning could be considered bad.  

	I am with you here: if the end-points decided to exchange packets, the network should do its best to deliver these. That is orthogonal to the question of whether an ACK rate of one ACK per two MSS-sized packets is ideal for all/most applications.

> BUTT I'll settle for not tossing reserved bit changes away as a "good enough" step forward that should be simple to implement (2 gate delay xor/or function).

	Fair enough; the question is more what behavior happens out in the field, and whether any other bit could be toggled ACK by ACK to reduce the likelihood of an ACK filter triggering.

Best Regards
	Sebastian


> 
>> 	Sebastian
>>> - Jonathan Morton
> -- 
> Rod Grimes                                                 rgrimes@freebsd.org



* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-12-01 19:03                   ` Sebastian Moeller
@ 2019-12-01 19:27                     ` Jonathan Morton
  2019-12-01 19:32                       ` Sebastian Moeller
  0 siblings, 1 reply; 27+ messages in thread
From: Jonathan Morton @ 2019-12-01 19:27 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: Carsten Bormann, ECN-Sane, bloat

> On 1 Dec, 2019, at 9:03 pm, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
>> If less feedback is observed by the sender than intended by the AQM, growth will continue and the AQM will increase its marking to compensate, ultimately resorting to a CE mark.  
> 
> Well, that seems undesirable?

As a safety valve, getting a CE mark is greatly preferable to losing congestion control entirely, or incurring a packet loss as the other alternative congestion signal.  It would only happen if the SCE signal or feedback were seriously disrupted or entirely erased - the latter being the *normal* state of affairs when either endpoint is not SCE aware in the first place.

> Am I right to assume that the fault tolerance requires a relative steady ACK stream though?

It only needs to be sufficient to keep the TCP stream flowing.  If the acks are bursty, that's a separate problem in which it doesn't really matter if they're all present or not.  And technically, the one-bit feedback mechanism is capable of precisely reflecting a sparse sequence of SCE marks using just two acks per mark.

> I fully agree that if ACK thinning is performed it really should be careful to not loose information when doing its job, but SCE hopefully can deal with whatever is out in the field today (I am looking at you DOCSIS uplinks...), no?

Right, that's the essence of the above discussion about relative feedback error, which is the sort of thing that random ack loss or unprincipled ack thinning is likely to introduce.

Meanwhile, an ack filter that avoids dropping acks in which the reserved flag bits differ from its successor will not lose any information in the one-bit scheme.  This is what's implemented in Cake (except that not all the reserved bits are covered yet, only the one we use).
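
To illustrate why that rule loses nothing, here is a minimal sketch in Python. It is not Cake's code. It assumes a hypothetical model in which the receiver acks every segment, echoes the mark in a single flag, and the sender credits all bytes newly acknowledged by an ack to that ack's flag value; the names Ack, thin and marked_bytes are made up for the example, and it pretends all acks are sitting in the queue at once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ack:
    ack_no: int   # cumulative acknowledgement number, in bytes
    flag: bool    # hypothetical one-bit echo carried in a reserved TCP flag


def thin(queue):
    """Drop an ack only if its successor carries the same flag value."""
    kept = []
    for cur, nxt in zip(queue, queue[1:]):
        if cur.flag != nxt.flag:
            kept.append(cur)       # flag changes at the successor: deliver it
    if queue:
        kept.append(queue[-1])     # the newest ack is always delivered
    return kept


def marked_bytes(acks, start=0):
    """Sender view: bytes newly acked by a flagged ack count as marked."""
    total, last = 0, start
    for a in acks:
        if a.flag:
            total += a.ack_no - last
        last = a.ack_no
    return total


MSS = 1000
flags = [False, False, True, False, False, False, True, True, False]
acks = [Ack((i + 1) * MSS, f) for i, f in enumerate(flags)]
thinned = thin(acks)

print(len(acks), len(thinned))                      # 9 5
print(marked_bytes(acks), marked_bytes(thinned))    # 3000 3000
print(marked_bytes([acks[-1]]))                     # 0: keep-only-newest loses the marks
```

In this model the isolated mark costs exactly two delivered acks (the one before it and the one carrying it), and a run of identical flags collapses into its last ack without changing the sender's byte count.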

 - Jonathan Morton

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-12-01 19:27                     ` Jonathan Morton
@ 2019-12-01 19:32                       ` Sebastian Moeller
  2019-12-01 20:30                         ` Jonathan Morton
  0 siblings, 1 reply; 27+ messages in thread
From: Sebastian Moeller @ 2019-12-01 19:32 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Carsten Bormann, ECN-Sane, bloat

Hi Jonathan,



> On Dec 1, 2019, at 20:27, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>> On 1 Dec, 2019, at 9:03 pm, Sebastian Moeller <moeller0@gmx.de> wrote:
>> 
>>> If less feedback is observed by the sender than intended by the AQM, growth will continue and the AQM will increase its marking to compensate, ultimately resorting to a CE mark.  
>> 
>> Well, that seems undesirable?
> 
> As a safety valve, getting a CE mark is greatly preferable to losing congestion control entirely, or incurring a packet loss as the other alternative congestion signal.

	Well, yes, I fully agree, I was referring to the "less feedback is observed by the sender than intended" part; I think it is great that SCE is safe by design in this regard.


>  It would only happen if the SCE signal or feedback were seriously disrupted or entirely erased - the latter being the *normal* state of affairs when either endpoint is not SCE aware in the first place.
> 
>> Am I right to assume that the fault tolerance requires a relative steady ACK stream though?
> 
> It only needs to be sufficient to keep the TCP stream flowing.  If the acks are bursty, that's a separate problem in which it doesn't really matter if they're all present or not.  And technically, the one-bit feedback mechanism is capable of precisely reflecting a sparse sequence of SCE marks using just two acks per mark.
> 
>> I fully agree that if ACK thinning is performed it really should be careful to not loose information when doing its job, but SCE hopefully can deal with whatever is out in the field today (I am looking at you DOCSIS uplinks...), no?
> 
> Right, that's the essence of the above discussion about relative feedback error, which is the sort of thing that random ack loss or unprincipled ack thinning is likely to introduce.

	"unprincipled ack thinning" nice description.


> 
> Meanwhile, an ack filter that avoids dropping acks in which the reserved flag bits differ from its successor will not lose any information in the one-bit scheme.  This is what's implemented in Cake (except that not all the reserved bits are covered yet, only the one we use).

	So, to show my lack of knowledge, basically a pure change in sequence number is acceptable, any other differences should trigger ACK conservation instead of filtering?

Best Regards
	Sebastian

> 
> - Jonathan Morton
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-12-01 19:32                       ` Sebastian Moeller
@ 2019-12-01 20:30                         ` Jonathan Morton
  0 siblings, 0 replies; 27+ messages in thread
From: Jonathan Morton @ 2019-12-01 20:30 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: Carsten Bormann, ECN-Sane, bloat

> On 1 Dec, 2019, at 9:32 pm, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
>> Meanwhile, an ack filter that avoids dropping acks in which the reserved flag bits differ from its successor will not lose any information in the one-bit scheme.  This is what's implemented in Cake (except that not all the reserved bits are covered yet, only the one we use).
> 
> So, to show my lack of knowledge, basically a pure change in sequence number is acceptable, any other differences should trigger ACK conservation instead of filtering?

You are broadly correct, in that a pure advance of acked sequence number effectively obsoletes the earlier ack and it is therefore safe (and even arguably beneficial) to drop it.  However a *duplicate* ack should *not* be dropped, because that may be required to trigger Fast Retransmission in the absence of SACK.

Cake's ack filter is a bit more sophisticated than that, in that it can also accept certain harmless changes within TCP options.  I believe Timestamps and SACK get special handling along these lines; Timestamps can always change, SACK gets equivalent "pure superset" logic to detect when the old ack is completely covered by the new one.  Other options not specifically handled are treated as disqualifying.

All this only occurs in two consecutive packets which are both acks for the same connection and which are both waiting for a delivery opportunity in the queue.  An earlier ack is never delayed just to see if it can be combined with a later one.  The result is a better use of limited capacity to carry useful payloads, without having to rely on dropping acks by AQM action (which Codel is actually rather bad at).
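
The rules above can be sketched roughly as follows, in Python. This is an illustration of the description in this message, not a transcription of Cake's implementation: the Ack record, the field names and the exact SACK coverage test are assumptions, sequence-number wraparound is ignored, and Timestamps are simply left out of the model because they are allowed to differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ack:
    ack_no: int                             # cumulative ack number
    reserved_flags: int = 0                 # reserved TCP flag bits
    sack_blocks: frozenset = frozenset()    # {(left_edge, right_edge), ...}
    other_options: tuple = ()               # options the filter does not understand


def sack_covered(old, new):
    """Assumed "pure superset" test: each old SACK block is either already
    cumulatively acked by `new` or lies inside one of its SACK blocks."""
    for left, right in old.sack_blocks:
        inside = any(nl <= left and right <= nr for nl, nr in new.sack_blocks)
        if right > new.ack_no and not inside:
            return False
    return True


def may_drop(old, new):
    """May `old` be dropped because `new`, queued right behind it for the
    same connection, makes it obsolete?"""
    if new.ack_no <= old.ack_no:
        return False        # duplicate ack: may be needed for Fast Retransmit
    if old.reserved_flags != new.reserved_flags:
        return False        # flag change carries information (e.g. SCE feedback)
    if old.other_options or new.other_options:
        return False        # unrecognised options are disqualifying
    return sack_covered(old, new)


print(may_drop(Ack(1000), Ack(2000)))                     # True
print(may_drop(Ack(1000), Ack(1000)))                     # False
print(may_drop(Ack(1000, reserved_flags=1), Ack(2000)))   # False
```

Nothing here delays the older ack: the check would only run when both acks are already waiting in the queue.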

 - Jonathan Morton

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ecn-sane] [Bloat]     sce materials from ietf
  2019-12-01 19:17                   ` Sebastian Moeller
@ 2019-12-02  5:10                     ` Dave Taht
  2019-12-02  7:18                       ` Sebastian Moeller
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Taht @ 2019-12-02  5:10 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: Rodney W. Grimes, Jonathan Morton, ECN-Sane, bloat

Sebastian Moeller <moeller0@gmx.de> writes:

> Hi Rodney,
>
>
>> On Dec 1, 2019, at 18:30, Rodney W. Grimes <4bone@gndrsh.dnsmgr.net> wrote:
>> 
>>> Hi Jonathan,
>>> 
>>> 
>>>> On Nov 30, 2019, at 23:23, Jonathan Morton <chromatix99@gmail.com> wrote:
>>>> 
>>>>> On 1 Dec, 2019, at 12:17 am, Carsten Bormann <cabo@tzi.org> wrote:
>>>>> 
>>>>>> There are unfortunate problems with introducing new TCP options,
>>>>>> in that some overzealous firewalls block traffic which uses
>>>>>> them.  This would be a deployment hazard for SCE, which merely
>>>>>> using a spare header flag avoids.  So instead we are still
>>>>>> planning to use the spare bit - which happens to be one that
>>>>>> AccECN also uses, but AccECN negotiates in such a way that SCE
>>>>>> can safely use it even with an AccECN capable partner.
>>>>> 
>>>>> This got me curious:  Do you have any evidence that firewalls are friendlier to new flags than to new options?
>>>> 
>>>> Mirja Kuhlewind said as much during the TCPM session we attended,
>>>> and she ought to know.  There appear to have been several studies
>>>> performed on this subject; reserved TCP flags tend to get ignored
>>>> pretty well, but unknown TCP options tend to get either stripped
>>>> or blocked.
>>>> 
>>>> This influenced the design of AccECN as well; in an early version
>>>> it would have used only a TCP option and left the TCP flags alone.
>>>> When it was found that firewalls would often interfere with this,
>>>> the three-bit field in the TCP flags area was cooked up.
>>> 
>>> 
>>> 	Belt and suspenders, eh? But realistically, the idea of using
>>> an accumulating SCE counter to allow for a lossy reverse ACK path
>>> seems sort of okay (after all TCP relies on the same, so there
>>> would be a nice symmetry ).
>>> I really wonder whether SCE could not, in addition to its current
>>> bit, borrow the URG pointer field in cases when it is not used, or
>>> not fully used (if the MSS is smaller than 64K there might be a few
>>> bits leftover, with an MTU < 2000 I would expect that ~5 bits might
>>> still be usable in that rate case). I might be completely of to
>>> lunch here, but boy a nice rarely used contiguous 16bit field in
>>> the TCP header, what kind of mischief one could arrange with that
>>> ;) Looking at the AccECN draft, I see that my idea is not terribly
>>> original... But, hey for SCE having an additional higher fidelity
>>> SCE counter might be a nice addition, assuming URG(0), urgent
>>> pointer > 0 will not bleached/rejected by uninitiated TCP
>>> stacks/middleboxes...
>> 
>> We need to fix the ACK issues rather than continue to work around
>> it.  Ack thinning is good, as long as it does not cause information
>> loss.  There is no draft/RFC on this, one needs to be written that
>> explains you can not just ignore all the bits, you have to preserve
>> the reserve bits, so you can only thin if they are the same.
>> Jonathan already fixed Cake (I think that is the one that has ACK
>> thinning) to not collapse ACK's that have different bit 7 values.
>
> 	Well, I detest ACK thinning and believe that the network
> should not try to second guess the users traffic (dropping/marking on
> reaching capacity is acceptable, but the kind of silent ACK thinning
> some DOCSIS ISPs perform seems actively user-hostile). But thinning or
> no thinning, the accumulative signaling is how the ACK stream deals
> with (reasonably) lossy paths, and I think any additional signaling
> via pure ACK packets should simply be tolerant to unexpected losses. I
> fully agree that if ACK thinning is performed it really should be
> careful to not loose information when doing its job, but SCE hopefully
> can deal with whatever is out in the field today (I am looking at you
> DOCSIS uplinks...), no?

I happen to not be huge on ack thinning either, but the effect
on highly asymmetric networks was pretty convincing, and having
to handle fewer acks at the sender is a potential goodness also.

http://blog.cerowrt.org/post/ack_filtering/

At the time we did it, I thought it could be made even better:
if we allowed more droppable packets to accumulate on each
round, it would both be "fairer" and be able to "drop more"
over each round.

https://github.com/dtaht/sch_cake/issues/104

Never got around to it.

I'd much rather have fewer highly asymmetric networks, and have the
endpoint tcps do the thinning (which is what more or less happens
with GSO), but....

secondly, I note that "ack prioritization" is a very common thing in
multiple shapers I've looked at, starting with wondershaper and in many
others (including dd-wrt). A lot of these are *wrong*; wondershaper, for
example, only recognized 64 byte acks. I think more than a few modems
do ack prioritization rather than "thinning".

thirdly, protocols such as QUIC are already sending fewer
acknowledgements per packet than most TCPs do, which is a good thing.

fourthly, I've been meaning to try thinning on wifi for a while. Wifi
has a problem in that only a fixed number of packets can fit
in a txop and everything in a txop is usually sent reliably. 

Here's 5 days worth of data from one of my sites. It's not hugely
loaded in the uplink direction, but roughly 11% of all packets are
dropped. 


qdisc cake 8007: dev eth0 root refcnt 9 bandwidth 9Mbit diffserv3 triple-isolate nat nowash ack-filter split-gso rtt 100.0ms noatm overhead 18 mpu 64 
 Sent 13088217784 bytes 96513781 pkt (dropped 12173093, overlimits 155529797 requeues 558) 
 backlog 0b 0p requeues 558
 memory used: 1144944b of 4Mb
 capacity estimate: 9Mbit
 min/max network layer size:           28 /    1500
 min/max overhead-adjusted size:       64 /    1518
 average network hdr offset:           14

                   Bulk  Best Effort        Voice
  thresh      562496bit        9Mbit     2250Kbit
  target         32.3ms        5.0ms        8.1ms
  interval      127.3ms      100.0ms      103.1ms
  pk_delay        4.7ms        2.0ms        709us
  av_delay        1.3ms        162us         69us
  sp_delay         50us          3us          3us
  backlog            0b           0b           0b
  pkts           150501    108280345       256028
  bytes       146280265  13846693704     40682021
  way_inds          181      7552458        26288
  way_miss         6579      1383844        20861
  way_cols            0            0            0
  drops             125         2682            0
  marks             171          277            0
  ack_drop            0     12170286            0
  sp_flows            2            5            0
  bk_flows            0            2            0
  un_flows            0            0            0
  max_len          4542        28766         2988
  quantum           300          300          300
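
As a cross-check of the "roughly 11%" figure, the counters above can be added up in a few lines of Python. The numbers are copied from the output; the reading that the per-tin "pkts" counters include packets that were later dropped is an assumption, though the totals are consistent with it.

```python
# Counters copied from the tc output above.
sent_pkts = 96513781
dropped = 12173093
tin_pkts = {"Bulk": 150501, "Best Effort": 108280345, "Voice": 256028}
tin_drops = {"Bulk": 125, "Best Effort": 2682, "Voice": 0}
tin_ack_drop = {"Bulk": 0, "Best Effort": 12170286, "Voice": 0}

enqueued = sum(tin_pkts.values())       # every packet that entered a tin
aqm_drops = sum(tin_drops.values())     # drops by the AQM proper
ack_drops = sum(tin_ack_drop.values())  # drops by the ack filter

# The per-tin counters add up to the qdisc totals.
assert enqueued == sent_pkts + dropped
assert aqm_drops + ack_drops == dropped

drop_share = dropped / enqueued
ack_share_of_drops = ack_drops / dropped
print(f"dropped: {drop_share:.1%} of all packets")
print(f"ack filter: {ack_share_of_drops:.2%} of all drops")
```

So nearly all of the dropped packets are acks removed by the filter; only 2807 packets were dropped by the AQM itself.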

>
>> 
>> Note that I consider the time of the arriving ACKS to also be
>> informaition, RACK for instance uses that, so in the case of RACK
>> any thinning could be considered bad.
>
> 	I am with you here, if the end-points decided to exchange
> packets the network should do its best to deliver these. That is
> orthogonal to the question whether a every-two-MSS packets ACK rate is
> ideal for all/most applications.
>
>> BUTT I'll settle for not tossing reserved bit changes away as a
>> "good enough" step forward that should be simple to implement (2
>> gate delay xor/or function).
>
> 	Fair enough, question is more, what behavior happens out in
> the field, and could any other bit be toggled ACK by ACK to reduce the
> likelihood of an ACK filte to trigger?
>
> Best Regards
> 	Sebastian
>
>
>> 
>>> 	Sebastian
>>>> - Jonathan Morton
>> -- 
>> Rod Grimes                                                 rgrimes@freebsd.org
>
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-11-30 17:11             ` Jonathan Morton
@ 2019-12-02  5:38               ` Dave Taht
  2019-12-02  7:54                 ` Sebastian Moeller
  2019-12-02  9:57                 ` Pete Heist
  0 siblings, 2 replies; 27+ messages in thread
From: Dave Taht @ 2019-12-02  5:38 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Sebastian Moeller, ECN-Sane, bloat

Jonathan Morton <chromatix99@gmail.com> writes:

>> On 30 Nov, 2019, at 5:42 pm, Sebastian Moeller <moeller0@gmx.de> wrote:
>> 
>>>> I fear that they will come up with something that in reality will
>>> a) by opt-out, that is they will assume L4S-style feedback until
>>> reluctantly convinced that the bottleneck marker is
>>> rfc3160-compliant and hence will b) trigger too late c) trigger to
>>> rarely to be actually helpful in reality, but might show a good
>>> enough effort to push L4S past issue #16.
>>> 
>>> I'm sure they will, and we will of course point out these shortcomings as they occur, so as to count them against issue #16.  
>> 
>> 	That might be bad position to be in though (if one party only
>> gives negative feed-back no matter how justified it will generate a
>> residual feeling of lack of good faith cooperation), I would have
>> preferred if the requirements would have bee discussed before.
>> 
>>> Conversely, if they do manage to make it fail-safe, it is highly
>>> likely that their scheme will give false positives on real Internet
>>> paths and fail to switch into L4S mode, impairing their performance
>>> in other ways.
>> 
>> 	Yes, so far they always err on the advantage of L4S, and
>> justify this with "but, latency" and if one buys the latency

I do hate watching y'all continually concede the "latency" point and
have to argue on the "chosen ground" of single or dualq about
long-running tcp flows. 

"fq" already achieves "ultra-low latency" for nearly all flows,
especially including flows in the presence of bursts, short flows that
never get out of slow start (e.g. most of them), simple malicious
traffic, and so on.

The need for any additional marking "stuff" is lessened, particularly as
fq enables RTT based tcps such as BBR to work better in the first place.

codel has a target of 5ms where pie is 16ms. Neither achieves these,
though both come close; the second "classic" queue in dualq, however,
is 3x worse than any given queue in fq_codel.

the gross unfairness of spitting l4s marked packets through a rfc3168
compliant link is made much worse when your flows are short.

l4s and sce both seem to have an issue in aborting slow start
too early at this point.

lastly, overusage of ecn in either system can bloat up a link.

I'm happy to see the two primary approaches making that less
disastrous, with some code arriving for shortening the MSS or "sub
packet windows", but I'd still like to see cwnd 1 and the other fallback
methods in rfc3168 implemented, and there are still adversaries to
deal with (see rfc3168 sec 7)

>> justification cautiously default to rfc3168 becomes obviously
>> sub-optimal, and so far none of the chairs put down the "first, do
>> no harm" hammer (and I doubt they will).

> I got the impression that failing to close most of L4S' open issues at
> Singapore is politically damaging to them.  This is a substantial list
> of problems opened at Montreal, as blockers for their WGLC on
> publishing L4S drafts as experimental RFCs.  They had all the time in

well, I'd like to file at least one more so we can get a real
rrul test comparison done.

> the world to talk about solutions to the major showstopper problems,
> but were only able to concede a point that maybe tying RACK to the
> ECT(1) codepoint is better written as a SHOULD instead of a MUST.
> That lack of progress was noticed at the WG Chair level; I think they
> may have been giving them the rope to hang themselves, so to speak.  I
> think they had a slide up at the side session, showing massive
> unfairness between L4S and "classic" flows, for a full half-hour - and
> they somehow thought that was *helpful* to their case!

I think there is at least a small segment of the audience that thinks
that prioritizing the internet for data coming from the ISP's or mobile
operator's DC is a very good thing.

It doesn't matter to that mindset how disadvantageous RTT-unfairness
may be in the long term (think vpn backhauls from cell towers as one
example, or coast to coast data transfers).

Me, I love how FQ brings true RTT fairness to the internet, offering
a shot at equal bandwidth to sites near and far, bringing the world
closer together.

> I'm reasonably sure some industry attendees also noticed this - Stuart
> Cheshire (of Apple) in particular.  Apple have been on the front lines
> of enabling ECN deployment in practice in recent years.  He invited
> me, one of the ICCRG chairs, and Bob Briscoe - among others - to
> dinner, where we discussed some technical distinctions and Bob
> demonstrated a fundamental misunderstanding of control theory.

I am glad y'all got through dinner alive.

> And we will have more ammunition at Vancouver.  It remains to be seen how much progress they'll make…
>
>  - Jonathan Morton
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ecn-sane] [Bloat]     sce materials from ietf
  2019-12-02  5:10                     ` Dave Taht
@ 2019-12-02  7:18                       ` Sebastian Moeller
  0 siblings, 0 replies; 27+ messages in thread
From: Sebastian Moeller @ 2019-12-02  7:18 UTC (permalink / raw)
  To: Dave Taht; +Cc: Rodney W. Grimes, Jonathan Morton, ECN-Sane, bloat

Hi Dave,

On December 2, 2019 6:10:51 AM GMT+01:00, Dave Taht <dave@taht.net> wrote:
>Sebastian Moeller <moeller0@gmx.de> writes:
>
>> Hi Rodney,
>>
>>
>>> On Dec 1, 2019, at 18:30, Rodney W. Grimes <4bone@gndrsh.dnsmgr.net> wrote:
>>> 
>>>> Hi Jonathan,
>>>> 
>>>> 
>>>>> On Nov 30, 2019, at 23:23, Jonathan Morton <chromatix99@gmail.com> wrote:
>>>>> 
>>>>>> On 1 Dec, 2019, at 12:17 am, Carsten Bormann <cabo@tzi.org> wrote:
>>>>>> 
>>>>>>> There are unfortunate problems with introducing new TCP options,
>>>>>>> in that some overzealous firewalls block traffic which uses
>>>>>>> them.  This would be a deployment hazard for SCE, which merely
>>>>>>> using a spare header flag avoids.  So instead we are still
>>>>>>> planning to use the spare bit - which happens to be one that
>>>>>>> AccECN also uses, but AccECN negotiates in such a way that SCE
>>>>>>> can safely use it even with an AccECN capable partner.
>>>>>> 
>>>>>> This got me curious:  Do you have any evidence that firewalls are friendlier to new flags than to new options?
>>>>> 
>>>>> Mirja Kuhlewind said as much during the TCPM session we attended,
>>>>> and she ought to know.  There appear to have been several studies
>>>>> performed on this subject; reserved TCP flags tend to get ignored
>>>>> pretty well, but unknown TCP options tend to get either stripped
>>>>> or blocked.
>>>>> 
>>>>> This influenced the design of AccECN as well; in an early version
>>>>> it would have used only a TCP option and left the TCP flags alone.
>>>>> When it was found that firewalls would often interfere with this,
>>>>> the three-bit field in the TCP flags area was cooked up.
>>>> 
>>>> 
>>>> 	Belt and suspenders, eh? But realistically, the idea of using
>>>> an accumulating SCE counter to allow for a lossy reverse ACK path
>>>> seems sort of okay (after all TCP relies on the same, so there
>>>> would be a nice symmetry ).
>>>> I really wonder whether SCE could not, in addition to its current
>>>> bit, borrow the URG pointer field in cases when it is not used, or
>>>> not fully used (if the MSS is smaller than 64K there might be a few
>>>> bits leftover, with an MTU < 2000 I would expect that ~5 bits might
>>>> still be usable in that rate case). I might be completely of to
>>>> lunch here, but boy a nice rarely used contiguous 16bit field in
>>>> the TCP header, what kind of mischief one could arrange with that
>>>> ;) Looking at the AccECN draft, I see that my idea is not terribly
>>>> original... But, hey for SCE having an additional higher fidelity
>>>> SCE counter might be a nice addition, assuming URG(0), urgent
>>>> pointer > 0 will not bleached/rejected by uninitiated TCP
>>>> stacks/middleboxes...
>>> 
>>> We need to fix the ACK issues rather than continue to work around
>>> it.  Ack thinning is good, as long as it does not cause information
>>> loss.  There is no draft/RFC on this, one needs to be written that
>>> explains you can not just ignore all the bits, you have to preserve
>>> the reserve bits, so you can only thin if they are the same.
>>> Jonathan already fixed Cake (I think that is the one that has ACK
>>> thinning) to not collapse ACK's that have different bit 7 values.
>>
>> 	Well, I detest ACK thinning and believe that the network
>> should not try to second guess the users traffic (dropping/marking on
>> reaching capacity is acceptable, but the kind of silent ACK thinning
>> some DOCSIS ISPs perform seems actively user-hostile). But thinning or
>> no thinning, the accumulative signaling is how the ACK stream deals
>> with (reasonably) lossy paths, and I think any additional signaling
>> via pure ACK packets should simply be tolerant to unexpected losses. I
>> fully agree that if ACK thinning is performed it really should be
>> careful to not loose information when doing its job, but SCE hopefully
>> can deal with whatever is out in the field today (I am looking at you
>> DOCSIS uplinks...), no?
>
>I happen to not be huge on ack thinning either, but the effect
>on highly assymetric networks was pretty convincing, and having
>to handle less acks at the sender a potential goodness also.
>
>http://blog.cerowrt.org/post/ack_filtering/
>
>At the time we did I thought it could be made even better,
>if we allowed more droppable packets to accumulate on each
>round, it would both be "fairer" and be able to "drop more"
>over each round.
>
>https://github.com/dtaht/sch_cake/issues/104
>
>Never got around to it.
>
>I'd much rather have fewer highly assymmetric networks, 

         +1, I will not hold my breath though on getting this anytime soon... GPON by default is asymmetric (2:1) and full duplex DOCSIS requires costly plant changes and got moved from the 3.1 spec to 4, and the DSLs simply have no symmetric Bandplans I know of (well G.fast has, but that only helps if the G.fast uplink is not running over GPON). But then GPON's 2:1 ratio would already be most of the way to symmetry...

>and the
>endpoint tcps do the thinning (which is what more or less happens
>with GSO), but....

         Yep, the endpoints basically should be in control of the ACK rate, but also should be considerate.


>
>secondly, I note that "ack prioritization" is a very common thing in
>multiple shapers I've looked at, starting with wondershaper and in many
>others (including dd-wrt). A lot of these are *wrong*, wondershaper, for
>example, only recognized 64 byte acks. I think more than a few modems
>do ack prioritization rather than "thinning".

         I believe indiscriminate ACK boosting to be the wrong thing for a tiered prioritization scheme, as ACKs should have the same priority as the rest of the flow. But for the fidelity of the feedback loop, less delay for ACKs seems benign, no?


>
>thirdly, protocols such as QUIC are already sending less
>acknowlegements per packet than most TCP do, which is a a good thing.

         Again, +1, the endpoints should know best.


>
>fourthly, I've been meaning to try thinning on wifi for a while. Wifi
>has a problem in that only a fixed number of packets can fit
>in a txop and everything in a txop is usually sent reliably. 
>
>Here's 5 days worth of data from one of my sites. It's not hugely
>loaded in the uplink direction, but roughly 11% of all packets are
>dropped. 

        Almost all of those were ACKs though, I guess I see why you consider it unwise to hoist these over the wifi link only to filter them at your edge router....

Best Regards
        Sebastian


>
>
>qdisc cake 8007: dev eth0 root refcnt 9 bandwidth 9Mbit diffserv3 triple-isolate nat nowash ack-filter split-gso rtt 100.0ms noatm overhead 18 mpu 64 
> Sent 13088217784 bytes 96513781 pkt (dropped 12173093, overlimits 155529797 requeues 558) 
> backlog 0b 0p requeues 558
> memory used: 1144944b of 4Mb
> capacity estimate: 9Mbit
> min/max network layer size:           28 /    1500
> min/max overhead-adjusted size:       64 /    1518
> average network hdr offset:           14
>
>                   Bulk  Best Effort        Voice
>  thresh      562496bit        9Mbit     2250Kbit
>  target         32.3ms        5.0ms        8.1ms
>  interval      127.3ms      100.0ms      103.1ms
>  pk_delay        4.7ms        2.0ms        709us
>  av_delay        1.3ms        162us         69us
>  sp_delay         50us          3us          3us
>  backlog            0b           0b           0b
>  pkts           150501    108280345       256028
>  bytes       146280265  13846693704     40682021
>  way_inds          181      7552458        26288
>  way_miss         6579      1383844        20861
>  way_cols            0            0            0
>  drops             125         2682            0
>  marks             171          277            0
>  ack_drop            0     12170286            0
>  sp_flows            2            5            0
>  bk_flows            0            2            0
>  un_flows            0            0            0
>  max_len          4542        28766         2988
>  quantum           300          300          300
>
>>
>>> 
>>> Note that I consider the time of the arriving ACKS to also be
>>> informaition, RACK for instance uses that, so in the case of RACK
>>> any thinning could be considered bad.
>>
>> 	I am with you here, if the end-points decided to exchange
>> packets the network should do its best to deliver these. That is
>> orthogonal to the question whether a every-two-MSS packets ACK rate is
>> ideal for all/most applications.
>>
>>> BUTT I'll settle for not tossing reserved bit changes away as a
>>> "good enough" step forward that should be simple to implement (2
>>> gate delay xor/or function).
>>
>> 	Fair enough, question is more, what behavior happens out in
>> the field, and could any other bit be toggled ACK by ACK to reduce the
>> likelihood of an ACK filte to trigger?
>>
>> Best Regards
>> 	Sebastian
>>
>>
>>> 
>>>> 	Sebastian
>>>>> - Jonathan Morton
>>> -- 
>>> Rod Grimes                                                 rgrimes@freebsd.org
>>
>> _______________________________________________
>> Bloat mailing list
>> Bloat@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/bloat

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-12-02  5:38               ` Dave Taht
@ 2019-12-02  7:54                 ` Sebastian Moeller
  2019-12-02  9:57                 ` Pete Heist
  1 sibling, 0 replies; 27+ messages in thread
From: Sebastian Moeller @ 2019-12-02  7:54 UTC (permalink / raw)
  To: Dave Taht, Jonathan Morton; +Cc: ECN-Sane, bloat

Hi Dave,


On December 2, 2019 6:38:11 AM GMT+01:00, Dave Taht <dave@taht.net> wrote:
>Jonathan Morton <chromatix99@gmail.com> writes:
>
>>> On 30 Nov, 2019, at 5:42 pm, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> 
>>>>> I fear that they will come up with something that in reality will
>>>> a) by opt-out, that is they will assume L4S-style feedback until
>>>> reluctantly convinced that the bottleneck marker is
>>>> rfc3160-compliant and hence will b) trigger too late c) trigger to
>>>> rarely to be actually helpful in reality, but might show a good
>>>> enough effort to push L4S past issue #16.
>>>> 
>>>> I'm sure they will, and we will of course point out these shortcomings as they occur, so as to count them against issue #16.  
>>> 
>>> 	That might be bad position to be in though (if one party only
>>> gives negative feed-back no matter how justified it will generate a
>>> residual feeling of lack of good faith cooperation), I would have
>>> preferred if the requirements would have bee discussed before.
>>> 
>>>> Conversely, if they do manage to make it fail-safe, it is highly
>>>> likely that their scheme will give false positives on real Internet
>>>> paths and fail to switch into L4S mode, impairing their performance
>>>> in other ways.
>>> 
>>> 	Yes, so far they always err on the advantage of L4S, and
>>> justify this with "but, latency" and if one buys the latency
>
>I do hate watching y'all continually concede the "latency" point and

Well, they have pretty graphs to wave around, hitting their PR target of ~1ms queueing delay. I had a trawl through the bake-off data, but all SCE results I found show noticeably larger delays (not bad by any standards, but a much harder sell than the simple <1ms story). I might not have looked too carefully though?



>have to argue on the "chosen ground" of single or dualq about
>long-running tcp flows. 
>
>"fq" already achieves "ultra-low latency" for nearly all flows,
>especially including flows in the presence of bursts, short flows that
>never get out of slow start (e.g. most of them), simple malicious
>traffic, and so on.
>
>The need for any additional marking "stuff" is lessened, particularly as
>fq enables RTT based tcps such as BBR to work better in the first place.
>
>codel has a target of 5ms where pie is 16ms. 

         In the dual queue draft the reference latency sits at 15ms without a good rationale. I would love to see dual queue results at short RTT with a theoretically justified 5 ms target, to see whether that counteracts the equitable sharing failure.


>Neither achieve these,
>but both come close, but the second "classic" queue in dualq is 3x
>worse
>than any given queue in fq_codel.

         My take on that is: Bob must have realized early on that equal sharing is very much not optimal; if you know more about the relative importance of the flows, you can do better. (He ignores that, as the other side of the coin, you can easily do worse: simply permute the latencies and bandwidths of the optimal asymmetric set, but I digress.) And now he struggles with getting this information in the first place and with distributing it to the hops that are in a position to act on it. The informational ConEx RFC 6789 is a great example of that quixotic quest, where Bob proposes to punish flows based on their total experienced congestion on the full path.
In short, FQ is not perfect, but neither is anything goes, and FQ has fewer catastrophic failure modes. The one it has, sensitivity to flow count, a) can be remedied by an appropriate flow definition (maybe just a 2-tuple for backbone routers, or even by AS on peering routers, already recognised in rfc2914), and b) is shared by anything goes, which is equally sensitive to the same condition. I really hate the line of argument "since we cannot come up with a perfect solution, let's do nothing": that is not how evolution operates.
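
To make the flow-definition point concrete, here is a minimal sketch (not taken from any real FQ implementation) of how the choice of flow key changes the number of queues and the per-host share. The packet list, the function names and the idealised "every queue gets an equal share" model are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical packets: (src_ip, dst_ip, protocol, src_port, dst_port).
packets = [
    ("10.0.0.1", "192.0.2.10", 6, 40001, 443),
    ("10.0.0.1", "192.0.2.10", 6, 40002, 443),
    ("10.0.0.1", "192.0.2.10", 6, 40003, 443),
    ("10.0.0.2", "192.0.2.10", 6, 50001, 443),
]

def five_tuple(pkt):
    """Classic per-connection flow key."""
    return pkt

def two_tuple(pkt):
    """Coarser key: one queue per (source, destination) address pair."""
    src_ip, dst_ip, _proto, _sport, _dport = pkt
    return (src_ip, dst_ip)

def queue_count(pkts, flow_key):
    """Number of distinct queues an idealised FQ scheduler would create."""
    return len({flow_key(p) for p in pkts})

def per_host_share(pkts, flow_key):
    """Link fraction per source host if every queue gets an equal share."""
    queues = {flow_key(p) for p in pkts}
    share = 1 / len(queues)
    hosts = Counter()
    for q in queues:
        hosts[q[0]] += share  # q[0] is the source address for both keys
    return dict(hosts)

print(queue_count(packets, five_tuple), per_host_share(packets, five_tuple))
print(queue_count(packets, two_tuple), per_host_share(packets, two_tuple))
```

With the 5-tuple key the host that opens three connections ends up with three of the four queues; with the 2-tuple key both hosts get one queue each, so the scheduler's state and the sharing outcome no longer depend on how many connections a host opens.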


>
>the gross unfairness of spitting l4s marked packets through a rfc3168
>compliant link is made much worse when your flows are short.
>
>both l4s and sce both seem to have an issue in aborting slow start
>too early at this point.
>
>lastly, overusage of ecn in either system can bloat up a link.
      
       According to Koen, PIE is stricter than codel in bringing down the hammer on flows not reacting fast enough, though, so the level of ECN bloat might be a function of the employed AQM, no?


>
>I'm happy to see the two primary approaches to making that less
>disastrous by seeing some code arrive for shortening the MSS or "sub
>packet windows", but i'd still like to see cwnd 1 and the other
>fallback
>methods in rfc3168 implemented, and there are still adversaries to
>deal with (see rfc3168 sec 7)
>
>>> justification cautiously default to rfc3168 becomes obviously
>>> sub-optimal, and so far none of the chairs put down the "first, do
>>> no harm" hammer (and I doubt they will).
>
>> I got the impression that failing to close most of L4S' open issues
>at
>> Singapore is politically damaging to them.  This is a substantial
>list
>> of problems opened at Montreal, as blockers for their WGLC on
>> publishing L4S drafts as experimental RFCs.  They had all the time in
>
>well, I'd like to file at least one more so we can get a real
>rrul test comparison done.
>
>> the world to talk about solutions to the major showstopper problems,
>> but were only able to concede a point that maybe tying RACK to the
>> ECT(1) codepoint is better written as a SHOULD instead of a MUST.
>> That lack of progress was noticed at the WG Chair level; I think they
>> may have been giving them the rope to hang themselves, so to speak. 
>I
>> think they had a slide up at the side session, showing massive
>> unfairness between L4S and "classic" flows, for a full half-hour -
>and
>> they somehow thought that was *helpful* to their case!
>
>I think there is at least a small segment of the audience that thinks
>that prioritizing the internet for data coming from the ISP's or mobile
>operator's DC is a very good thing.

        But also one that can quickly put you in the hot seat with the regulator if the ISP charges extra...

>
>It doesn't matter how (think vpn backhauls from cell towers as one
>example, or coast to coast data transfers) long term disadvantageous
>RTT-unfairness may be to that mindset.

        This is why I hope for more push back from the IETF community....

>
>Me, I love how FQ brings true RTT fairness to the internet, offering
>a shot at equal bandwidth to sites near and far, bringing the world
>closer together.

         It effortlessly seems to simply do a number of things pretty well, is conceptually simple to understand, and its behavior is easily predictable; what is not to love about it, except the apparent lack of a fast in-silicon implementation....


Best Regards
        Sebastian


>
>
>> I'm reasonably sure some industry attendees also noticed this -
>Stuart
>> Cheshire (of Apple) in particular.  Apple have been on the front
>lines
>> of enabling ECN deployment in practice in recent years.  He invited
>> me, one of the ICCRG chairs, and Bob Briscoe - among others - to
>> dinner, where we discussed some technical distinctions and Bob
>> demonstrated a fundamental misunderstanding of control theory.
>
>I am glad y'all got through dinner alive.
>
>> And we will have more ammunition at Vancouver.  It remains to be seen
>how much progress they'll make…
>>
>>  - Jonathan Morton
>> _______________________________________________
>> Bloat mailing list
>> Bloat@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/bloat

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ecn-sane] [Bloat]   sce materials from ietf
  2019-12-02  5:38               ` Dave Taht
  2019-12-02  7:54                 ` Sebastian Moeller
@ 2019-12-02  9:57                 ` Pete Heist
  1 sibling, 0 replies; 27+ messages in thread
From: Pete Heist @ 2019-12-02  9:57 UTC (permalink / raw)
  To: Dave Taht; +Cc: Jonathan Morton, ECN-Sane, bloat

> On Dec 2, 2019, at 6:38 AM, Dave Taht <dave@taht.net> wrote:
> 
> I do hate watching y'all continually concede the "latency" point and
> have to argue on the "chosen ground" of single or dualq about
> long-running tcp flows. 

I don’t think we’ve conceded that. It would be possible to run more tests on a LAN with a shorter Codel interval and show ~1ms TCP RTT, and perhaps we should. I don’t think there's any fundamental reason why one signaling mechanism or the other should ultimately be any better in this regard.
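
As a rough sketch of what "a shorter Codel interval" could mean in numbers, the snippet below derives a target from an interval, assuming the common rule of thumb that the target is about 5% of the interval (the defaults of 100 ms and 5 ms fit that ratio). The helper names, the 20 ms LAN interval and the device name "eth0" are hypothetical; the command string is only built and printed, not executed, and a 1 ms target does not by itself guarantee ~1 ms TCP RTT.

```python
def codel_target_us(interval_us, percent=5):
    """Return a CoDel target in microseconds as a percentage of the interval."""
    return interval_us * percent // 100

def fq_codel_command(dev, interval_us, percent=5):
    """Build a tc command string for fq_codel with the derived target."""
    target_us = codel_target_us(interval_us, percent)
    return (f"tc qdisc replace dev {dev} root fq_codel "
            f"interval {interval_us}us target {target_us}us")

# Default-like setting: 100 ms interval gives a 5 ms target.
print(codel_target_us(100_000))  # 5000

# Hypothetical LAN setting: 20 ms interval gives a 1 ms target.
print(fq_codel_command("eth0", 20_000))
```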

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2019-12-02  9:57 UTC | newest]

Thread overview: 27+ messages
2019-11-29 20:08 [Ecn-sane] sce materials from ietf Dave Taht
2019-11-29 21:20 ` [Ecn-sane] [Bloat] " Jonathan Morton
2019-11-29 23:10   ` alex.burr
2019-11-30  1:39     ` Jonathan Morton
2019-11-30  7:27       ` Sebastian Moeller
2019-11-30 14:32         ` Jonathan Morton
2019-11-30 15:42           ` Sebastian Moeller
2019-11-30 17:11             ` Jonathan Morton
2019-12-02  5:38               ` Dave Taht
2019-12-02  7:54                 ` Sebastian Moeller
2019-12-02  9:57                 ` Pete Heist
2019-11-30 22:17           ` Carsten Bormann
2019-11-30 22:23             ` Jonathan Morton
2019-12-01 16:35               ` Sebastian Moeller
2019-12-01 16:54                 ` Jonathan Morton
2019-12-01 19:03                   ` Sebastian Moeller
2019-12-01 19:27                     ` Jonathan Morton
2019-12-01 19:32                       ` Sebastian Moeller
2019-12-01 20:30                         ` Jonathan Morton
2019-12-01 17:30                 ` Rodney W. Grimes
2019-12-01 19:17                   ` Sebastian Moeller
2019-12-02  5:10                     ` Dave Taht
2019-12-02  7:18                       ` Sebastian Moeller
2019-11-29 22:55 ` [Ecn-sane] " Rodney W. Grimes
2019-11-29 22:58   ` Rodney W. Grimes
2019-11-30  2:37 ` Dave Taht
2019-11-30  3:10   ` Rodney W. Grimes
