From: Luca Muscariello
Date: Mon, 25 Mar 2019 10:17:06 +0100
To: Dave Taht
Cc: Mikael Abrahamsson, ecn-sane@lists.bufferbloat.net
Subject: Re: [Ecn-sane] FQ in the core

We've had this discussion multiple times.

- You do not need FQ everywhere and, depending on the case, you can do approximations of it.

- What I believe you always need is flow-awareness.

- There are already implementations of dual-queue systems in DC switches such as the Cisco Nexus 9k.

- The dualQ system in DOCSIS is the wrong way to implement a flow-aware system with two queues, for many reasons including, but not limited to, the fact that dualQ, in order to work, makes assumptions about the behaviour of the end-points. A flow-aware queuing system should not be sensitive to some sort of compliance of the end-points. You cannot trust the end-points, and the protection system should not discriminate good vs bad based on a badge carried by the packets.

- I also believe dualQ fails to achieve its goal under current and future traffic patterns.

- The approach described in this paper also works with only 2 queues and makes no assumption about the end-points. It does not require marking at all.

  James Roberts et al. Minimizing the overhead in implementing flow-aware networking. In Proceedings of the 2005 ACM Symposium on Architecture for Networking and Communications Systems (ANCS '05).
  DOI: https://doi.org/10.1145/1095890.1095912
  https://team.inria.fr/rap/files/2013/12/KMOR05a.pdf

- There is not one TCP; there is no single transport in the wild. There are many and there will be many more. The DOCSIS specs make the assumption that to be a good guy you must be part of one Church: the one specified. This is a very religious assumption about end-points' behaviour. All the others are bad guys. Am I the only one who sees this as deeply wrong?

- Almost 10 years ago I built an FQ prototype on an NPU based on Cavium with Alcatel-Lucent (the team was based in Paris). That was supposed to be an ALU7750 target. The problem was that not a single ISP was even aware of the topic. Nobody was asking for it. It's a chicken-and-egg problem, Mikael. If you do not ask loudly, nobody builds it.

- I also built, with Ikanos in 2010, an FQ prototype in their MIPS-based SoC: 32 queues for the France Telecom Livebox, and it worked. Ikanos was later acquired by Qualcomm and that SoC is not used anymore in favour of Broadcom.

Hope this helps to make progress in the discussion.

Luca

On Mon, Mar 25, 2019 at 8:55 AM Dave Taht wrote:

> I don't really have time to debate this today.
>
> Since you forked this conversation back to FQ I need to state a few things.
>
> 1) SCE is (we think) compatible with existing single queue AQMs. CE
> should not be exerted in this case, just drop. Note that this is also
> what L4S wants to do with the "normal" queue (I refuse to call it
> classic).
>
> 2) SCE is optional. A transport that has a more aggressive behavior,
> like dctcp, should fall back to being tcp-friendly if it
> sees no SCE marks and only CE or drop.
>
> 3) At 100Gbit speeds some form of multi-queue oft seems needed. (and
> this is in part why folk want to relax ordering requirements). So some
> form of multiple queuing is generally the case. At the higher speeds,
> DC's usually overprovision anyway.
>
> 4) The biggest cpu overhead for any of this stuff is per-tenant (in
> the dc) or per customer shaping. This benefits a lot from a hardware
> assist. (see senic). I've done quite a bit of DC work in the past 2
> years (rather than home routers), and have had a hard look at the
> underlying substrates for a few multi-tenant implementations....
>
> 4) "dualq" hasn't tried to address the fact that most 10Gbit and
> higher cards have 8 or more hardware queues in the first place.
>
> 5) Companies like preseem are shipping transparent bridges that do
> fq_codel/cake on customer traffic.
>
> I've long been in periodic negotiations with makers of "big iron" like,
> for example, the new 128 core Huawei box and others I cannot talk about
> at the moment, to get so far as an existence proof.
>
> So I'd like to kill the meme that SCE requires FQ, at least, for now,
> until after we do more tests.
>
> As for FQ everywhere, well, I'd like that, but it's not needed in
> devices that already have sufficient multiplexing.
>
> On Mon, Mar 25, 2019 at 8:16 AM Mikael Abrahamsson wrote:
> >
> > On Sun, 24 Mar 2019, Sebastian Moeller wrote:
> >
> > > From my layman's perspective this is the killer argument against the
> > > dualQ approach and for fair-queueing, IMHO only fq will be able to
> >
> > Do people on this email list think we're trying to trick you when we're
> > saying that FQ won't be available anytime soon on a lot of platforms that
> > need this kind of AQM?
> >
> > Since there is always demand for implementations, can we get an ASIC/NPU
> > implementation of FQ_CODEL done by someone who claims it's no problem?
> >
> > Personally I believe we need both. FQ is obviously superior to anything
> > else most of the time, but FQ is not making itself into the kind of
> > devices it needs to get into for the bufferbloat situation to improve, so
> > now what?
> >
> > Claiming to have a superior solution that is too expensive to go into
> > relevant devices, is that proposal still relevant as an alternative to a
> > different solution that actually is making itself into silicon?
> >
> > Again, FQ superior, but what good is it if it's not being used?
> >
> > We need to have this discussion and come up with a joint understanding of
> > the world, otherwise we're never going to get anywhere.
> >
> > --
> > Mikael Abrahamsson email: swmike@swm.pp.se
> > _______________________________________________
> > Ecn-sane mailing list
> > Ecn-sane@lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/ecn-sane
>
> --
>
> Dave Täht
> CTO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-831-205-9740
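
To make the "two queues, flow-aware, no marking" point in Luca's message concrete, here is a minimal sketch in Python. It is not the algorithm from the Roberts et al. paper and not any vendor's implementation. The class name, the 1500-byte `sparse_limit_bytes` threshold, and the strict-priority service order are all assumptions made for illustration. The only thing it tries to show is that packets can be steered between two queues from the observed per-flow backlog alone, without trusting any badge set by the end-point.

```python
from collections import deque


class TwoQueueFlowAware:
    """Two FIFO queues; classification uses only each flow's observed backlog."""

    def __init__(self, sparse_limit_bytes=1500):
        # Hypothetical threshold: flows holding at most this many queued
        # bytes are treated as sparse and served from the priority queue.
        self.sparse_limit = sparse_limit_bytes
        self.priority = deque()
        self.bulk = deque()
        self.backlog = {}       # flow_id -> bytes queued in either queue
        self.bulk_backlog = {}  # flow_id -> bytes queued in the bulk queue

    def enqueue(self, flow_id, size):
        queued = self.backlog.get(flow_id, 0)
        in_bulk = self.bulk_backlog.get(flow_id, 0)
        # A flow with packets still waiting in the bulk queue stays there,
        # so packets of one flow are never reordered.
        if in_bulk == 0 and queued + size <= self.sparse_limit:
            self.priority.append((flow_id, size))
        else:
            self.bulk.append((flow_id, size))
            self.bulk_backlog[flow_id] = in_bulk + size
        self.backlog[flow_id] = queued + size

    def dequeue(self):
        # Strict priority: serve the sparse-flow queue first (an assumption).
        if self.priority:
            flow_id, size = self.priority.popleft()
        elif self.bulk:
            flow_id, size = self.bulk.popleft()
            self.bulk_backlog[flow_id] -= size
            if self.bulk_backlog[flow_id] == 0:
                del self.bulk_backlog[flow_id]
        else:
            return None
        self.backlog[flow_id] -= size
        if self.backlog[flow_id] == 0:
            del self.backlog[flow_id]
        return flow_id, size


q = TwoQueueFlowAware()
for _ in range(3):
    q.enqueue("bulk", 1500)   # a flow that builds a standing backlog
q.enqueue("voip", 200)        # a sparse flow arriving behind it
order = [q.dequeue() for _ in range(4)]
```

In this toy run the sparse flow's packet leaves ahead of the two packets the backlogged flow had pushed into the bulk queue, and no packet header was consulted to decide that. A real design would also need per-flow state limits and some protection against a flood of many small flows starving the bulk queue, neither of which this sketch attempts.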
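
Point 2 in Dave's message (SCE is optional, and a more aggressive transport should fall back to being tcp-friendly when it sees only CE or drop) can be sketched as a per-round-trip window update. This is a rough illustration under stated assumptions, not the SCE proposal's specified response and not dctcp's actual algorithm: the `Feedback` type, the `next_cwnd` function, and the 0.9 `sce_backoff` factor are hypothetical, and the window is counted in segments.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """Congestion signals seen during one round trip (hypothetical type)."""
    sce_marked: bool = False
    ce_marked: bool = False
    lost: bool = False


def next_cwnd(cwnd, fb, sce_backoff=0.9):
    """Return the next congestion window, in segments.

    sce_backoff is an assumed illustrative factor, not a value taken
    from any specification.
    """
    if fb.lost or fb.ce_marked:
        # CE or drop: classic tcp-friendly multiplicative decrease.
        return max(1.0, cwnd / 2)
    if fb.sce_marked:
        # SCE only: a gentler reduction than the CE/drop response.
        return max(1.0, cwnd * sce_backoff)
    # No congestion signal: additive increase of one segment per round trip.
    return cwnd + 1.0


# On a path whose single-queue AQM never sets SCE, the sender only ever
# takes the first or last branch, i.e. it behaves like a conventional TCP.
cwnd = 20.0
for fb in (Feedback(), Feedback(ce_marked=True), Feedback(lost=True)):
    cwnd = next_cwnd(cwnd, fb)
```

The design choice worth noting is that the fallback needs no negotiation: the absence of SCE marks is itself the signal, so the same sender code runs unchanged behind FQ, behind a single-queue AQM, or behind a plain drop-tail queue.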