[Ecn-sane] robustness against attack?

Discussion of explicit congestion notification's impact on the Internet

* [Ecn-sane] robustness against attack?
@ 2019-03-24 22:50 Sebastian Moeller
  2019-03-25  7:16 ` Mikael Abrahamsson
  0 siblings, 1 reply; 13+ messages in thread
From: Sebastian Moeller @ 2019-03-24 22:50 UTC (permalink / raw)
  To: ecn-sane

Here is a comment from the tsvwg mailing list thread "[tsvwg] Questions and comments on draft-ietf-tsvwg-ecn-l4s-id-06" by G. Fairhurst:

"Section 8. I think there should be some discussion on what happens if an attacker introduces ECT(1) rogue packets can it influence the method, other than an attack which seeks to induce congestion? "

From my layman's perspective this is the killer argument against the dualQ approach and for fair-queueing; IMHO only fq will be able to (stochastically) isolate rogue flows... (Okay, if the attacker randomizes port numbers he/she will also do considerable harm to an fq AQM, but at least it will take more than one flow.) I might be overly optimistic about fq and unfairly negative about dualQ/LLLLS, but the idea of fully trusting the end-points to play fair (as far as I can tell dualQ will only tail-drop once its queue passes a configured threshold) seems overly optimistic to me. This reminds one of the difference between cooperative and preemptive multitasking: while the former has the potential for higher performance, all general-purpose OSes went for the latter... Anyway, since I am far away from this field I would not be amazed if I were just re-hashing old arguments here, but still a thought is a thought, and uttering even a silly thought can result in me learning something ;)
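A minimal sketch of the stochastic-isolation argument, assuming a hash of the 5-tuple selects one of a fixed number of queues. The bucket count, the CRC32 hash, and the names `queue_index` and `queues_touched` are illustrative assumptions, not taken from any particular fq implementation. A single rogue flow stays confined to one bucket, whereas an attacker who randomizes the source port spreads across many buckets:

```python
import random
import zlib

NUM_QUEUES = 1024  # hypothetical number of hash buckets


def queue_index(src_ip, dst_ip, proto, src_port, dst_port, salt=0):
    """Map a 5-tuple to a queue number with a salted hash (CRC32 here for brevity)."""
    key = f"{salt}|{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % NUM_QUEUES


def queues_touched(packets):
    """Return the set of queues that a sequence of packets (5-tuples) lands in."""
    return {queue_index(*pkt) for pkt in packets}


# One unresponsive flow: every packet carries the same 5-tuple.
single_flow = [("198.51.100.7", "203.0.113.9", 17, 40000, 443)] * 10_000

# Same sender, but the source port is randomized per packet.
rng = random.Random(1)
sprayed = [
    ("198.51.100.7", "203.0.113.9", 17, rng.randrange(1024, 65536), 443)
    for _ in range(10_000)
]

print("queues hit by one flow:     ", len(queues_touched(single_flow)))
print("queues hit by sprayed ports:", len(queues_touched(sprayed)))
```

The first case harms only the flows that happen to share its bucket; the second defeats the isolation, but requires the attacker to present itself as many flows.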

Best Regards
	Sebastian


* Re: [Ecn-sane] robustness against attack?
  2019-03-24 22:50 [Ecn-sane] robustness against attack? Sebastian Moeller
@ 2019-03-25  7:16 ` Mikael Abrahamsson
  2019-03-25  7:54   ` [Ecn-sane] FQ in the core Dave Taht
                     ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Mikael Abrahamsson @ 2019-03-25  7:16 UTC (permalink / raw)
  To: Sebastian Moeller; +Cc: ecn-sane

On Sun, 24 Mar 2019, Sebastian Moeller wrote:

> From my layman's perspective this is the killer argument against the 
> dualQ approach and for fair-queueing, IMHO only fq will be able to

Do people on this email list think we're trying to trick you when we're 
saying that FQ won't be available anytime soon on a lot of platforms that 
need this kind of AQM?

Since there is always demand for implementations, can we get an ASIC/NPU 
implementation of FQ_CODEL done by someone who claims it's no problem?

Personally I believe we need both. FQ is obviously superior to anything 
else most of the time, but FQ is not making itself into the kind of 
devices it needs to get into for the bufferbloat situation to improve, so 
now what?

Claiming to have a superior solution that is too expensive to go into 
relevant devices, is that proposal still relevant as an alternative to a 
different solution that actually is making itself into silicon?

Again, FQ is superior, but what good is it if it's not being used?

We need to have this discussion and come up with a joint understanding of 
the world, otherwise we're never going to get anywhere.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* [Ecn-sane] FQ in the core
  2019-03-25  7:16 ` Mikael Abrahamsson
@ 2019-03-25  7:54   ` Dave Taht
  2019-03-25  9:17     ` Luca Muscariello
                       ` (2 more replies)
  2019-03-25  8:34   ` [Ecn-sane] robustness against attack? Jonathan Morton
  2019-03-25  8:46   ` Sebastian Moeller
  2 siblings, 3 replies; 13+ messages in thread
From: Dave Taht @ 2019-03-25  7:54 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: Sebastian Moeller, ecn-sane

I don't really have time to debate this today.

Since you forked this conversation back to FQ I need to state a few things.

1) SCE is (we think) compatible with existing single queue AQMs. CE
should not be exerted in this case, just drop. Note that this is also
what L4S wants to do with the "normal" queue (I refuse to call it
classic).

2) SCE is optional. A transport that has a more aggressive behavior,
like dctcp, should fall back to being tcp-friendly if it
sees no SCE marks and only CE or drop.

3) At 100Gbit speeds some form of multi-queue oft seems needed (and
this is in part why folk want to relax ordering requirements). So some
form of multiple queuing is generally the case. At the higher speeds,
DCs usually overprovision anyway.

4) The biggest cpu overhead for any of this stuff is per-tenant (in
the dc) or per-customer shaping. This benefits a lot from a hardware
assist (see senic). I've done quite a bit of DC work in the past 2
years (rather than home routers), and have had a hard look at the
underlying substrates for a few multi-tenant implementations....

5) "dualq" hasn't tried to address the fact that most 10Gbit and
higher cards have 8 or more hardware queues in the first place.

6) Companies like Preseem are shipping transparent bridges that do
fq_codel/cake on customer traffic.

I've long been in periodic negotiations with makers of "big iron" like,
for example, the new 128 core Huawei box and others I cannot talk about
at the moment, to get so far as an existence proof.

So I'd like to kill the meme that SCE requires FQ, at least, for now,
until after we do more tests.

As for FQ everywhere, well, I'd like that, but it's not needed in
devices that already have sufficient multiplexing.
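A minimal sketch of the fallback described in point 2, assuming a window-based sender that receives one of four per-ACK signals. The class name `Sender`, the signal labels, and the step sizes are hypothetical; the thread itself notes that the exact SCE response is still being worked out. The only property illustrated is that a path which never sets SCE leaves the sender with the conventional CE/drop response:

```python
class Sender:
    """Toy congestion window model; all constants are illustrative assumptions."""

    MIN_CWND = 2.0

    def __init__(self, cwnd=100.0):
        self.cwnd = cwnd          # congestion window, in packets
        self.sce_seen = False     # has this path ever delivered an SCE mark?

    def on_ack(self, signal):
        """signal is one of 'none', 'sce', 'ce', 'drop' (hypothetical labels)."""
        if signal == "sce":
            # Assumed gentle response to the finer-grained signal.
            self.sce_seen = True
            self.cwnd = max(self.MIN_CWND, self.cwnd - 1.0)
        elif signal in ("ce", "drop"):
            # Conventional multiplicative decrease, used with or without SCE.
            self.cwnd = max(self.MIN_CWND, self.cwnd / 2.0)
        else:
            # Additive increase of roughly one packet per round trip.
            self.cwnd += 1.0 / self.cwnd


legacy_path = Sender()
legacy_path.on_ack("ce")      # a single-queue AQM that only knows CE or drop
print(legacy_path.cwnd, legacy_path.sce_seen)
```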

On Mon, Mar 25, 2019 at 8:16 AM Mikael Abrahamsson <swmike@swm.pp.se> wrote:
>
> On Sun, 24 Mar 2019, Sebastian Moeller wrote:
>
> > From my layman's perspective this is the killer argument against the
> > dualQ approach and for fair-queueing, IMHO only fq will be able to
>
> Do people on this email list think we're trying to trick you when we're
> saying that FQ won't be available anytime soon on a lot of platforms that
> need this kind of AQM?
>
> Since there is always demand for implementations, can we get an ASIC/NPU
> implementation of FQ_CODEL done by someone who claims it's no problem?
>
> Personally I believe we need both. FQ is obviously superior to anything
> else most of the time, but FQ is not making itself into the kind of
> devices it needs to get into for the bufferbloat situation to improve, so
> now what?
>
> Claiming to have a superior solution that is too expensive to go into
> relevant devices, is that proposal still relevant as an alternative to a
> different solution that actually is making itself into silicon?
>
> Again, FQ is superior, but what good is it if it's not being used?
>
> We need to have this discussion and come up with a joint understanding of
> the world, otherwise we're never going to get anywhere.
>
> --
> Mikael Abrahamsson    email: swmike@swm.pp.se
> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane

-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740


* Re: [Ecn-sane] FQ in the core
  2019-03-25  7:54   ` [Ecn-sane] FQ in the core Dave Taht
@ 2019-03-25  9:17     ` Luca Muscariello
  2019-03-25  9:52       ` Sebastian Moeller
  2019-03-25  9:23     ` Sebastian Moeller
  2019-03-25 15:43     ` Mikael Abrahamsson
  2 siblings, 1 reply; 13+ messages in thread
From: Luca Muscariello @ 2019-03-25  9:17 UTC (permalink / raw)
  To: Dave Taht; +Cc: Mikael Abrahamsson, ecn-sane

We've had this discussion multiple times.

- You do not need FQ everywhere and depending on
the case you can do approximations of that.

- What I believe you always need is flow-awareness.

- There are already implementations of dual-queue systems in DC switches
such as the Cisco Nexus 9k.

- The dualQ system in DOCSIS is the wrong way to implement a flow-aware
system with two queues, for many reasons including, but not limited to,
the fact that dualQ, in order to work, makes assumptions about the
behaviour of the end-points.
A flow-aware queuing system should not be sensitive to some sort of
compliance of the end-points.
You cannot trust the end-points and the protection system should not
discriminate good vs bad based on a badge carried by the packets.

- I also believe dualQ fails to achieve its goal under current and future
traffic patterns.

- The approach described in this paper also works for 2 queues only and
makes no assumption about the end-points.
It does not require marking at all.

James Roberts et al.
Minimizing the overhead in implementing flow-aware networking.
In Proceedings of the 2005 ACM symposium on Architecture for networking and
communications systems (ANCS '05).
DOI: https://doi.org/10.1145/1095890.1095912
https://team.inria.fr/rap/files/2013/12/KMOR05a.pdf

- There is not one TCP; there is no one single transport in the wild.
There are many and there will be many more.
The DOCSIS specs make the assumption that to be a good guy you must be
part of one Church. The one specified.
This is a very religious assumption about end-points' behaviour.
All the others are bad guys. Am I the only one who sees this as deeply wrong?

- Almost 10 years ago I built an FQ prototype on an NPU based on Cavium with
Alcatel-Lucent (the team was based in Paris).
That was supposed to be an ALU7750 target. The problem was that not a
single ISP was even aware of the topic. Nobody was asking for it.
It's a chicken-and-egg problem, Mikael. If you do not ask loudly, nobody
builds it.

- In 2010 I also built, with Ikanos, an FQ prototype in their MIPS-based SoC:
32 queues for the France Telecom Livebox,
and it worked. Ikanos was later acquired by Qualcomm and that SoC is not
used anymore in favour of Broadcom.
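To make "flow-aware with two queues and no marking" concrete, here is a deliberately simplified sketch. It is NOT the algorithm of the paper cited above; it only assumes that the scheduler can count, per flow, how many packets it currently holds. The class name `TwoQueueFlowAware` and its methods are hypothetical. A packet whose flow has no backlog goes to the priority queue; everything else goes to the bulk queue, so the classification depends on observed behaviour rather than on a badge in the header:

```python
from collections import defaultdict, deque


class TwoQueueFlowAware:
    """Two FIFOs; a packet is prioritised only if its flow has nothing queued."""

    def __init__(self):
        self.priority = deque()
        self.bulk = deque()
        self.backlog = defaultdict(int)  # flow id -> packets currently queued

    def enqueue(self, flow_id, packet):
        # Flows that already have packets waiting are treated as bulk.
        target = self.bulk if self.backlog[flow_id] > 0 else self.priority
        self.backlog[flow_id] += 1
        target.append((flow_id, packet))

    def dequeue(self):
        # Strict priority: serve the priority FIFO first, then the bulk FIFO.
        source = self.priority if self.priority else self.bulk
        if not source:
            return None
        flow_id, packet = source.popleft()
        self.backlog[flow_id] -= 1
        if self.backlog[flow_id] == 0:
            del self.backlog[flow_id]  # forget idle flows
        return flow_id, packet


q = TwoQueueFlowAware()
for seq in (1, 2, 3):
    q.enqueue("bulk-download", seq)   # greedy flow builds a backlog
q.enqueue("voip", "a")                # sparse flow arrives behind it
order = [q.dequeue() for _ in range(4)]
print(order)
```

A greedy flow gains priority only for the first packet of each burst. As with fq, a sender that fabricates many flow identifiers can still claim more than its share, which is why such schemes are usually combined with some per-flow fairness on the bulk side.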

Hope this helps to make progress in the discussion

Luca







* Re: [Ecn-sane] FQ in the core
  2019-03-25  9:17     ` Luca Muscariello
@ 2019-03-25  9:52       ` Sebastian Moeller
  0 siblings, 0 replies; 13+ messages in thread
From: Sebastian Moeller @ 2019-03-25  9:52 UTC (permalink / raw)
  To: Luca Muscariello; +Cc: Dave Täht, ecn-sane

Hi Luca.

> On Mar 25, 2019, at 10:17, Luca Muscariello <luca.muscariello@gmail.com> wrote:
> 
> We've had this discussion multiple times.
> 
> - You do not need FQ everywhere and depending on 
> the case you can do approximations of that.
> 
> - What I believe you always need is flow-awareness.
> 
> - There are already implementations of dual-queue systems in DC switches such as the Cisco nexus 9k.
> 
> - The dualQ system in docsis is the wrong way to implement a flow-aware system with two queues. 
> For many reasons including, but not limited to, the fact that dualQ, in order to work, makes assumptions about the behaviour of the end-points.
> A flow-aware queuing system should not be sensitive to some sort of compliance of the end-points.

+1


> You cannot trust the end-points and the protection system should not discriminate good vs bad based on a badge carried by the packets.

Exactly my sentiment: "speak softly, and carry a big stick" or "Doveryai, no proveryai". In any case, do not give special treatment just because someone asks nicely; in other words, "Pedo mellon a minno" considered harmful.


> 
> - I also believe dualQ fails to achieve its goal under current and future traffic patterns.

	I would very much like to test that, maybe the VMs intended as one of the hackathon's goals will make playing with that easier.

> 
> - The approach described in this paper also works for 2 queues only and makes no assumption about the end-points.
> It does not require marking at all.
> 
> James Roberts et al.
> Minimizing the overhead in implementing flow-aware networking. 
> In Proceedings of the 2005 ACM symposium on Architecture for networking and communications systems (ANCS '05). 
> DOI: https://doi.org/10.1145/1095890.1095912
> https://team.inria.fr/rap/files/2013/12/KMOR05a.pdf
> 
> -  There is not one TCP, there is no one single transport in the wild. There are many and there will be many more.
> The docsis specs makes the assumption that to be a good guy you must be part of one Church. The one specified.
> This is a very religious assumption about end-points' behaviour.
> All the others are bad guys. I'm the only one who sees this as deeply wrong?

	I tried to make this point in the past: LLLLS seems to accidentally position itself as the be-all and end-all of internet transport, and if we know one thing about the future it is its crappy predictability in the long run...

> 
> - Almost 10 years ago I built a FQ prototype on an NPU based on Cavium with Alcatel-Lucent (the team was based in Paris). 
> That was supposed to be an ALU7750 target. The problem was that not a single ISP was even aware of the topic. Nobody was asking for it.
> It's a chicken and egg problem Mikael. If you do not ask loudly nobody builds it.

	How could something like this be priced? On a port-by-port basis, like vectoring for VDSL2? Then ISPs could still offer it as an extra they could sell to latency-sensitive customers.


> 
> - I also built with Ikanos in 2010, a FQ prototype in their MIPS based SoC. 32 queues for the France Telecom livebox
> and it worked. Ikanos was later acquired by Qualcomm and that SoC is not used anymore in favour of Broadcom.

	Ah, this is why ikanos disappeared...


> 
> Hope this helps to make progress in the discussion

	All quite interesting, grazie!

Best Regards
	Sebastian




* Re: [Ecn-sane] FQ in the core
  2019-03-25  7:54   ` [Ecn-sane] FQ in the core Dave Taht
  2019-03-25  9:17     ` Luca Muscariello
@ 2019-03-25  9:23     ` Sebastian Moeller
  2019-03-25 15:43     ` Mikael Abrahamsson
  2 siblings, 0 replies; 13+ messages in thread
From: Sebastian Moeller @ 2019-03-25  9:23 UTC (permalink / raw)
  To: Dave Täht; +Cc: Mikael Abrahamsson, ecn-sane

Hi Dave,



> On Mar 25, 2019, at 08:54, Dave Taht <dave.taht@gmail.com> wrote:
> 
> I don't really have time to debate this today.

	By all means put this on the back-burner until tomorrow....


> 
> Since you forked this conversation back to FQ I need to state a few things.
> 
> 1) SCE is (we think) compatible with existing single queue AQMs. CE
> should not be exerted in this case, just drop. Note that this is also
> what L4S wants to do with the "normal" queue (I refuse to call it
> classic).

	Call it the internet-queue then?

> 
> 2) SCE is optional. A transport that has a more aggressive behavior,
> like dctcp, should fall back to being tcp-friendly if it
> sees no SCE marks and only CE or drop.

	And this is where LLLLS faces challenges, as it needs to use heuristics to throttle back to tcp-friendliness, which will also be interesting once LLLLS starts to try to follow BBR in potentially ignoring dropped packets...

> 
> 3) At 100Gbit speeds some form of multi-queue oft seems needed. (and
> this is in part why folk want to relax ordering requirements). So some
> form of multiple queuing is generally the case. At the higher speeds,
> DC's usually overprovision anyway.

	Okay, that should allow calculating a proposal for a minimum re-ordering window? I ask because I believe that RACK should allow for a minimum re-ordering window so that transit ARQ can actually work efficiently without needing many stalls.


> 4) The biggest cpu overhead for any of this stuff is per-tenant (in
> the dc) or per customer shaping. This benefits a lot from a hardware
> assist. (see senic). I've done quite a bit of DC work in the past 2
> years (rather than home routers), and have had a hard look at the
> underlying substrates for a few multi-tenant implementations....

	Can you actually talk about this?

> 
> 5) "dualq" hasn't tried to address the fact that most 10Gbit and
> higher cards have 8 or more hardware queues in the first place.
> 
> 6) Companies like Preseem are shipping transparent bridges that do
> fq_codel/cake on customer traffic.
> 
> I've long been in periodic negotiations with makers of "big iron" like,
> for example, the new 128 core Huawei box and others I cannot talk about
> at the moment, to get so far as an existence proof.
> 
> So I'd like to kill the meme that SCE requires FQ, at least, for now,
> until after we do more tests.

	My point here is not that fq is required, but rather that single queue AQMs seem easier to abuse as they assume full cooperation by all participating flows.


> 
> As for FQ everywhere, well, I'd like that, but it's not needed in
> devices that already have sufficient multiplexing.

	That would be aggregation networks and transit/peering points, no? That still leaves the edge...


Again, please ignore until the IETF meeting is over.

Best Regards
	Sebastian





* Re: [Ecn-sane] FQ in the core
  2019-03-25  7:54   ` [Ecn-sane] FQ in the core Dave Taht
  2019-03-25  9:17     ` Luca Muscariello
  2019-03-25  9:23     ` Sebastian Moeller
@ 2019-03-25 15:43     ` Mikael Abrahamsson
  2 siblings, 0 replies; 13+ messages in thread
From: Mikael Abrahamsson @ 2019-03-25 15:43 UTC (permalink / raw)
  To: Dave Taht; +Cc: Sebastian Moeller, ecn-sane

On Mon, 25 Mar 2019, Dave Taht wrote:

> 4) The biggest cpu overhead for any of this stuff is per-tenant (in
> the dc) or per customer shaping. This benefits a lot from a hardware

Agreed, I'd say a typical deployment will allow 4-8 queues per 
tenant. If you need to shape customers then you need a per-customer queue, 
and typically these linecards will have enough queues to do 4-8 per 
customer.

This rules out FQ, but it does allow doing things like WRED/PIE or 
something else on these few queues. So if we can skip bringing FQ back 
into the discussion all the time, I agree we can have a productive path 
forward that might actually have a good possibility of going into hardware.

A lot of deployments I've seen do bidirectional shaping in the "BNG", 
which will have one of these linecards with 128k queues per 10G port. ISPs 
will put many thousands of customers on this kind of port. There is no 
flow identification machinery to put things into queues, but it can 
probably match on bits in the header to put traffic into different queues.

So this is where PIE and L4S come from (I imagine): from the 
side of "what can we do in this kind of hw". So who do we know who knows 
more about ASIC/NPU design who can help us with that?
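A minimal sketch of "match on bits in the header", assuming the classifier looks only at the two-bit ECN field in the low bits of the IPv4 TOS / IPv6 Traffic Class byte. The codepoint values are those of RFC 3168; the queue names and the function `select_queue` are hypothetical, and the rule of sending ECT(1) and CE to the low-latency queue is this sketch's reading of the L4S identifier draft discussed in this thread:

```python
ECN_MASK = 0b11

# ECN codepoints as defined in RFC 3168.
NOT_ECT = 0b00
ECT1 = 0b01
ECT0 = 0b10
CE = 0b11


def select_queue(tos_byte):
    """Pick one of two queues from header bits alone, with no per-flow state."""
    ecn = tos_byte & ECN_MASK
    return "low_latency" if ecn in (ECT1, CE) else "classic"


print(select_queue(0x00))  # Not-ECT
print(select_queue(0x02))  # ECT(0)
print(select_queue(0x01))  # ECT(1)
```

Nothing in this classifier verifies that the sender behaves as the marking implies: any host can set ECT(1), which is the robustness question the thread started with.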

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: [Ecn-sane] robustness against attack?
  2019-03-25  7:16 ` Mikael Abrahamsson
  2019-03-25  7:54   ` [Ecn-sane] FQ in the core Dave Taht
@ 2019-03-25  8:34   ` Jonathan Morton
  2019-03-25  8:53     ` Jonathan Morton
  2019-03-25 15:23     ` Mikael Abrahamsson
  2019-03-25  8:46   ` Sebastian Moeller
  2 siblings, 2 replies; 13+ messages in thread
From: Jonathan Morton @ 2019-03-25  8:34 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: Sebastian Moeller, ecn-sane

> On 25 Mar, 2019, at 8:16 am, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
> 
> Do people on this email list think we're trying to trick you when we're saying that FQ won't be available anytime soon on a lot of platforms that need this kind of AQM?

Well, I don't.  I recognise that most high-capacity links will end up with single-queue AQM, because that's what's already out there in hardware (though it's rarely turned on so far).  I'm still keen to see good FQ used where feasible, and in ways that make local sense.

That's why I've put some effort into making SCE play nicely with single-queue AQMs, since our initial conversation on that point where I was still assuming AIAD response to SCE.  That is, I now have non-AIAD SCE responses which should (theoretically) converge to an RTT-fair state over a single queue.  (One of them is the DCTCP response, which L4S folks should be intimately familiar with by now.)  If you'll recall, my initial workaround was simply to 

Experimentation will be needed to check whether my theorising actually works in practice, but I'm not exactly ignoring the problem.

 - Jonathan Morton


* Re: [Ecn-sane] robustness against attack?
  2019-03-25  8:34   ` [Ecn-sane] robustness against attack? Jonathan Morton
@ 2019-03-25  8:53     ` Jonathan Morton
  2019-03-25  9:40       ` Sebastian Moeller
  2019-03-25 15:23     ` Mikael Abrahamsson
  1 sibling, 1 reply; 13+ messages in thread
From: Jonathan Morton @ 2019-03-25  8:53 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: Sebastian Moeller, ecn-sane

> On 25 Mar, 2019, at 9:34 am, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
> If you'll recall, my initial workaround was simply to 

…not implement SCE on single-queue middleboxes, and rely on the known-good CE response in that case.  But if we can show that putting SCE there too is safe, that's even better.

 - Jonathan Morton



* Re: [Ecn-sane] robustness against attack?
  2019-03-25  8:53     ` Jonathan Morton
@ 2019-03-25  9:40       ` Sebastian Moeller
  0 siblings, 0 replies; 13+ messages in thread
From: Sebastian Moeller @ 2019-03-25  9:40 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Mikael Abrahamsson, ecn-sane

Hi Jonathan,


> On Mar 25, 2019, at 09:53, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>> On 25 Mar, 2019, at 9:34 am, Jonathan Morton <chromatix99@gmail.com> wrote:
>> 
>> If you'll recall, my initial workaround was simply to 
> 
> …not implement SCE on single-queue middleboxes, and rely on the known-good CE response in that case.  

	Which reduces the problem to getting middle-boxes to use ECN at all ;)


> But if we can show that putting SCE there too is safe, that's even better.

	Given that the above might be harder than desired, it might be a good idea to immediately aim a bit higher than pure-CE ECN.

Best Regards
	Sebastian

> 
> - Jonathan Morton
> 



* Re: [Ecn-sane] robustness against attack?
  2019-03-25  8:34   ` [Ecn-sane] robustness against attack? Jonathan Morton
  2019-03-25  8:53     ` Jonathan Morton
@ 2019-03-25 15:23     ` Mikael Abrahamsson
  2019-03-25 22:53       ` David P. Reed
  1 sibling, 1 reply; 13+ messages in thread
From: Mikael Abrahamsson @ 2019-03-25 15:23 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: ecn-sane

On Mon, 25 Mar 2019, Jonathan Morton wrote:

>> On 25 Mar, 2019, at 8:16 am, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
>>
>> Do people on this email list think we're trying to trick you when we're saying that FQ won't be available anytime soon on a lot of platforms that need this kind of AQM?
>
> Well, I don't.  I recognise that most high-capacity links will end up 
> with single-queue AQM, because that's what's already out there in 
> hardware (though it's rarely turned on so far).  I'm still keen to see 
> good FQ used where feasible, and in ways that make local sense.

Ok, so can we please drop the "FQ" part of the conversation for the next 
months, and argue on few-queue systems and how to come up with things that 
are friendly to implement in hardware?

Just to state again what I have said several times:

Devices such as high speed residential gateways, BNGs, CMTSs etc, they 
will not get FQ anytime in the next 5-10 years (or someone will have to 
prove me wrong).

So please stop arguing about the wonderfulness of FQ. Yes, fine, it's 
great, but it's also not applicable to lots of places where we need to 
de-bloat.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se


* Re: [Ecn-sane] robustness against attack?
  2019-03-25 15:23     ` Mikael Abrahamsson
@ 2019-03-25 22:53       ` David P. Reed
  0 siblings, 0 replies; 13+ messages in thread
From: David P. Reed @ 2019-03-25 22:53 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: Jonathan Morton, ecn-sane

The only latency-under-load mechanism other than FQ that can work is "no (absolute minimal) queueing". That's fine as a goal.

Unfortunately, I would suggest that the whole concept of ECN/SCE has to be rethought from the ground up if the goal is "no queueing", because ECN and SCE are currently defined only when a queue has built up, which of course means that latency has built up.

Now, of course, throughput is completely independent of queueing delay (except when there are a lot of erasure errors on the links, in which case modest queueing can perhaps enhance aggregate throughput).

When the whole point of things is to minimize queueing delay through whatever links turn out to be bottlenecks, by getting flows to be throttled by lowering cwnd or source rate or whatever, the ONLY way to do this is to get early feedback as queueing just begins to build.

(Of course, I am one of those people who constantly point out that classes of service have no meaning, really, unless one precisely defines the queue management in terms of flows, not individual packets).

I really worry that this discussion is going off the rails due to a lack of understanding of queueing theory and control theory.
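To make the "early feedback" point concrete, here is a minimal sketch contrasting a step threshold, which stays silent until a standing queue exists, with a ramp that starts signalling as soon as any queueing delay appears. This is not how ECN or SCE are defined; both function names and all parameter values are assumptions for illustration.

```python
def threshold_mark_probability(sojourn_ms, threshold_ms=5.0):
    """Step marking: no signal at all until the queue delay hits the threshold."""
    return 1.0 if sojourn_ms >= threshold_ms else 0.0


def ramp_mark_probability(sojourn_ms, start_ms=0.0, full_ms=5.0):
    """Ramp marking: probability grows linearly from start_ms to full_ms.

    With start_ms at (or near) zero, senders get feedback as queueing
    just begins to build, rather than after latency has accumulated.
    """
    if sojourn_ms <= start_ms:
        return 0.0
    if sojourn_ms >= full_ms:
        return 1.0
    return (sojourn_ms - start_ms) / (full_ms - start_ms)
```

Under these assumed parameters, a packet that waited 2.5 ms is never marked by the step function but is marked with probability 0.5 by the ramp, so the control loop can react before the queue is fully built.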


* Re: [Ecn-sane] robustness against attack?
  2019-03-25  7:16 ` Mikael Abrahamsson
  2019-03-25  7:54   ` [Ecn-sane] FQ in the core Dave Taht
  2019-03-25  8:34   ` [Ecn-sane] robustness against attack? Jonathan Morton
@ 2019-03-25  8:46   ` Sebastian Moeller
  2 siblings, 0 replies; 13+ messages in thread
From: Sebastian Moeller @ 2019-03-25  8:46 UTC (permalink / raw)
  To: Mikael Abrahamsson; +Cc: ecn-sane

Hi Mikael,



> On Mar 25, 2019, at 08:16, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
> 
> On Sun, 24 Mar 2019, Sebastian Moeller wrote:
> 
>> From my layman's perspective this is the killer argument against the dualQ approach and for fair-queueing, IMHO only fq will be able to
> 
> Do people on this email list think we're trying to trick you when we're saying that FQ won't be available anytime soon on a lot of platforms that need this kind of AQM?

	I do not claim to speak for people on this list, but really just for myself.

> 
> Since there is always demand for implementations, can we get an ASIC/NPU implementation of FQ_CODEL done by someone who claims it's no problem?

	Out of my area of expertise and interests, sorry.

> 
> Personally I believe we need both. FQ is obviously superior to anything else most of the time, but FQ is not making itself into the kind of devices it needs to get into for the bufferbloat situation to improve, so now what?

	So, I have mostly given up on ISPs in this matter; there simply seems to be no economic incentive I can see for ISPs to spend any money on improving the latency-under-load behavior of the typical choke points (access links, aggregation network, transit/peering connections). IMHO even the very company-friendly LLLLS project does not really offer any noticeable incentives for ISPs to change; that is, short of the ill-defined reordering tolerance, there seems to be nothing in it that might directly affect an ISP's bottom line. I say ill-defined, as RACK does not seem to give the re-ordering resistance promises that the link-layer people would need (I would expect a minimum re-ordering time independent of RTT for this to be useful for both sides).
	And I have seen how badly this line of reasoning played out with DOCSIS 3.1's PIE ... To recap, DOCSIS 3.1 rejected fq in favor of a single-queue AQM in order to keep CPE cost down and not hinder deployment at scale, but DOCSIS 3.1 is still not rolled out at scale, and CPEs often sport dual-core Intel Atoms well capable of shaping at the 50-100 Mbps uplink speeds that ISPs offer (heck, these should have enough punch to also run fq-AQM on the up to 1 Gbps downlink of modern DOCSIS 3.1 plans). In other words, I fail to see how this line of reasoning was valid the last time around, and I fail to see how this is going to play out differently this time. I do see an opportunity for ISPs to offer "gamer-ready" low-latency router-modems that do all the fancy AQM on the CPE side; as a special item at an extra cost this might actually work economically... Keep the core network as latency-agnostic as it is now, but sell latency-optimized AQM to interested customers, and then pass the cost of the hardware required to perform this shaping on to the same customers who will profit from it. That at least looks like something that ISPs might earn something from and that will make customers happy (aka win-win).

> 
> Claiming to have a superior solution that is too expensive to go into relevant devices, is that proposal still relevant as an alternative to a different solution that actually is making itself into silicon?

	So I applaud adding at least a reasonably competent single-queue AQM to ISP gear, but from my vantage point this will not magically make everything snappy and well for latency-conscious end-users, and hence will not replace competent AQM at the client side, except that it might serve as a "backstop" to improve the worst-case latency-under-load increase (even though LLLLS's worst case of 250ms is not that impressive).

> 
> Again, FQ superior, but what good is it if it's not being used?

	Good point, but I want to avoid that something like LLLLS, with its proposed heuristic for detecting RFC-compliant ECN-marking AQMs, will destroy the well-tuned latency-under-load performance of the ingress shaper at my home, which uses fq and competent AQM to keep latency at bay even at saturating loads. Again, I do not want a half-arsed, optimised-for-low-cost (let's face it, this all boils down to money and who pays) solution like LLLLS to screw things up. I note that this hinges on the leaky classifier proposed and is not an argument against LLLLS and its goals.

> 
> We need to have this discussion and come up with a joint understanding of the world, otherwise we're never going to get anywhere.

	Fair enough. I note again that I am from outside the field and just represent an opinionated end-point, and I approach the issue from that vantage point; please do not expect me to reason for ISPs, where I have no reliable insight into the economic or engineering challenges.

Best Regards
	Sebastian

> 
> -- 
> Mikael Abrahamsson    email: swmike@swm.pp.se



end of thread, other threads:[~2019-03-25 22:53 UTC | newest]

Thread overview: 13+ messages
2019-03-24 22:50 [Ecn-sane] robustness against attack? Sebastian Moeller
2019-03-25  7:16 ` Mikael Abrahamsson
2019-03-25  7:54   ` [Ecn-sane] FQ in the core Dave Taht
2019-03-25  9:17     ` Luca Muscariello
2019-03-25  9:52       ` Sebastian Moeller
2019-03-25  9:23     ` Sebastian Moeller
2019-03-25 15:43     ` Mikael Abrahamsson
2019-03-25  8:34   ` [Ecn-sane] robustness against attack? Jonathan Morton
2019-03-25  8:53     ` Jonathan Morton
2019-03-25  9:40       ` Sebastian Moeller
2019-03-25 15:23     ` Mikael Abrahamsson
2019-03-25 22:53       ` David P. Reed
2019-03-25  8:46   ` Sebastian Moeller
