From: Luca Muscariello
Date: Thu, 23 Apr 2020 14:29:30 +0200
To: Toke Høiland-Jørgensen
Cc: Maxime Bizon, Dave Taht, Cake List
Subject: Re: [Cake] Advantages to tightly tuning latency
In-Reply-To: <87o8ri76u2.fsf@toke.dk>
List-Id: Cake - FQ_codel the next generation


On Thu, Apr 23, 2020 at 1:57 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> Maxime Bizon <mbizon@freebox.fr> writes:
>
> > On Wednesday 22 Apr 2020 at 07:48:43 (-0700), Dave Taht wrote:
> >
> > Hello,
> >
> >> > Free has been using SFQ since 2005 (if I remember well).
> >> > They announced the wide deployment of SFQ in the free.fr newsgroup.
> >> > Wi-Fi in the free.fr router was not as good though.
> >>
> >> They're working on it. :)
> >
> > yes indeed.
> >
> > Switching to softmac approach, so now mac80211 will do rate control
> > and scheduling (using wake_tx_queue model).
> >
> > for 5 GHz, we use ath10k
>
> That is awesome! Please make sure you include the AQL patch for ath10k,
> it really works wonders, as Dave demonstrated:
>
> https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-March/002721.html
>
> >> I am very, very happy for y'all. Fiber has always been the sanest
> >> thing. Is there an SFP+ GPON card yet that I can plug into a
> >> conventional open source router?
> >
> > FYI Free.fr uses 10G-EPON, not GPON.
> >
> > Also, most deployments use an additional piece of terminal equipment
> > called an "ONT" or "ONU" that handles the PON part and exposes an Ethernet
> > port where the operator CPE is plugged in. So we are back to the early
> > days of DSL, where the hardest part (scheduling) is done inside a
> > black box. That makes it easier to replace the operator CPE with your
> > own standard Ethernet router though.
> >
> > At least SOCs with integrated PON (supporting all flavours
> > GPON/EPON/..) are starting to be deployed. Nothing available in
> > open source.
> >
> > Also note that it's not just kernel drivers, you also need a higher-level
> > OAM stack to make that work, and there are a lot of existing
> > standards, DPOE (EPON), OMCI (GPON)... all with interop challenges.
>
> It always bugged me that there was no open source support for these
> esoteric protocols and standards. It would seem like an obvious place to
> pool resources, but I guess proprietary vendors are going to keep doing
> their thing :/
>
> >> > The challenge becomes to keep up with these link rates in software
> >> > as there is a lot of hardware offloading.
> >
> > Yes that's our pain point, because that's what the SOC vendors
> > deliver and you need to use that because there is no alternative.
> >
> > It's not entirely the SOC vendors' fault though.
> >
> > 15 years ago, your average SOC's CPU would be something like a 200 MHz
> > MIPS; the standard Linux forwarding path (softirq => routing+netfilter =>
> > qdisc) was too slow for this, with too much cache footprint/overhead. So
> > every vendor started building alternative forwarding paths in their
> > hardware and never looked back.
> >
> > Nowadays, the baseline SOC CPU would be an ARM Cortex-A53 at ~1 GHz, which
> > with a non-crappy network driver and internal fabric should be able
> > to route 1 Gbit/s with out-of-the-box kernel forwarding.
> >
> > But that's too late. SOC vendors compete against each other, and the
> > big telcos need a way to tell which SOC is better to make a buying
> > decision. So synthetic benchmarks have become the norm, and since
> > everybody was able to fill their pipe with 1500-byte packets,
> > benchmarks have moved to unrealistic 64-byte packets (so-called
> > wirespeed).

Yes, I'm not working on these kinds of platforms anymore,
but I do remember the pain.
Hardware offloading may also behave unexpectedly
for stateful offloads. A flow starts in a slow path and
then moves to the fast path in hardware.
Out-of-order delivery at this stage can be nasty for a TCP connection.
A packet loss is even worse.
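
A toy model of that slow-path to fast-path handoff is sketched below. It is only an illustration under made-up assumptions: the latencies, the send interval and the function name `arrival_order` are hypothetical, and it assumes nothing drains or holds back the slow-path queue when the flow is moved to hardware.

```python
from typing import List


def arrival_order(n_packets: int,
                  send_interval_us: int,
                  slow_latency_us: int,
                  fast_latency_us: int,
                  offload_after: int) -> List[int]:
    """Return packet sequence numbers in the order they reach the receiver.

    Packets with seq < offload_after cross the (slower) software path;
    later packets cross the (faster) hardware path. All values are
    illustrative, not measurements from any real platform.
    """
    arrivals = []
    for seq in range(n_packets):
        latency = slow_latency_us if seq < offload_after else fast_latency_us
        # Arrival time = time the packet was sent + time spent in its path.
        arrivals.append((seq * send_interval_us + latency, seq))
    arrivals.sort()  # order by arrival time (ties broken by sequence number)
    return [seq for _, seq in arrivals]


if __name__ == "__main__":
    # Six packets sent 10 us apart; the flow is offloaded after 3 packets.
    print(arrival_order(6, 10, slow_latency_us=100, fast_latency_us=5,
                        offload_after=3))
```

With these numbers the packets sent through the fast path overtake the ones still queued in the slow path, so the receiver sees 3, 4, 5 before 0, 1, 2. To TCP that looks like a burst of out-of-order segments, which can trigger duplicate ACKs and spurious retransmissions.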

> > If you don't have hardware acceleration for forwarding, you don't
> > exist in those benchmarks and will not sell your chipset. Also they
> > invested so much in their alternative network stack that it's
> > difficult to stop (huge R&D teams). That being said, they do have a
> > point, when speeds go above 1 Gbit/s, the kernel becomes the bottleneck.
> >
> > For the Free.fr 10 Gbit/s offer, we had to develop an alternative
> > (software) forwarding path using a polling-mode model (DPDK style),
> > otherwise our albeit powerful ARM Cortex-A72 at 2 GHz could not forward
> > more than 2 Gbit/s.
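
For a sense of scale, the packet rates behind the 64-byte versus 1500-byte benchmarks quoted above can be worked out with a few lines of arithmetic. This is a sketch assuming standard Ethernet framing (20 bytes of preamble, start-of-frame delimiter and inter-frame gap per frame, 18 bytes of header and FCS, no VLAN tag); the function names are made up for the example.

```python
# Assumptions (not from the thread): plain Ethernet framing, no VLAN tag.
WIRE_OVERHEAD_BYTES = 20   # 7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap
ETH_HEADER_AND_FCS = 18    # 14-byte Ethernet header + 4-byte FCS


def packets_per_second(line_rate_bps: float, frame_bytes: int) -> float:
    """Maximum frames per second for frames of `frame_bytes` (header + payload + FCS)."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return line_rate_bps / bits_on_wire


def ns_per_packet(line_rate_bps: float, frame_bytes: int) -> float:
    """Time budget per packet, in nanoseconds, needed to sustain line rate."""
    return 1e9 / packets_per_second(line_rate_bps, frame_bytes)


if __name__ == "__main__":
    frames = (("64-byte frames", 64),
              ("1500-byte payload", 1500 + ETH_HEADER_AND_FCS))
    for rate_name, rate in (("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)):
        for label, frame in frames:
            print(f"{rate_name}, {label}: "
                  f"{packets_per_second(rate, frame):,.0f} pps, "
                  f"{ns_per_packet(rate, frame):.1f} ns per packet")
```

At 1 Gbit/s this gives roughly 1.49 million packets per second for 64-byte frames against about 81 thousand for full-size ones, and at 10 Gbit/s the budget for a 64-byte frame is about 67 ns, which is why per-packet overhead in the forwarding path matters so much in these tests.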

> We're working on that in kernel land - ever heard of XDP? On big-iron
> servers we have no issues pushing 10s and 100s of Gbps in software
> (well, the latter only given enough cores to throw at the problem :)).
> There's not a lot of embedded platform support as of yet, but we do
> have some people in the ARM world working on that.
>
> Personally, I do see embedded platforms as an important (future) use
> case for XDP, though, in particular for CPEs. So I would be very
> interested in hearing details about your particular platform, and your
> DPDK solution, so we can think about what it will take to achieve the
> same with XDP. If you're interested in this, please feel free to reach
> out :)
>
> > And going multicore/RSS does not fly when the test case is a
> > single-stream TCP session, which is what most speedtest applications
> > do (Ookla only recently added a multi-connection test).
>
> Setting aside the fact that those single-stream tests ought to die a
> horrible death, I do wonder if it would be feasible to do a bit of
> 'optimising for the test'? With XDP we do have the ability to steer
> packets between CPUs based on arbitrary criteria, and while it is not as
> efficient as hardware-based RSS it may be enough to achieve line rate
> for a single TCP flow?


Toke, yes, I was implicitly thinking about XDP, but I have not
yet read about any experience using it in CPEs.

DPDK, netmap and kernel bypass in general may be an option, but
you lose all qdiscs.
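
The single-stream problem mentioned above can be illustrated with a toy model of hash-based steering. This is not how any particular NIC or XDP program does it: real RSS typically uses a Toeplitz hash and an indirection table, and CRC32, `FlowKey`, `cpu_for_flow` and `load_per_cpu` are stand-ins chosen for the sketch.

```python
import zlib
from collections import Counter
from typing import Iterable, NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical 5-tuple identifying a flow."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int = 6  # TCP


def cpu_for_flow(flow: FlowKey, n_cpus: int) -> int:
    """Toy stand-in for RSS: hash the 5-tuple and map it to a CPU index.

    CRC32 is used only because it is deterministic and in the standard
    library; it is an assumption of this sketch, not what hardware uses.
    """
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % n_cpus


def load_per_cpu(packets: Iterable[FlowKey], n_cpus: int) -> Counter:
    """Count how many packets each CPU would process under flow hashing."""
    load = Counter()
    for flow in packets:
        load[cpu_for_flow(flow, n_cpus)] += 1
    return load


if __name__ == "__main__":
    n_cpus = 4
    single = FlowKey("192.0.2.1", "198.51.100.7", 40000, 443)
    # One TCP stream: every packet carries the same 5-tuple.
    print("single flow:", dict(load_per_cpu([single] * 1000, n_cpus)))
    # Many streams: different source ports give different hash inputs.
    many = [single._replace(src_port=40000 + i) for i in range(1000)]
    print("many flows: ", dict(load_per_cpu(many, n_cpus)))
```

Since every packet of a single TCP stream carries the same 5-tuple, flow hashing sends all of them to one CPU however many cores exist; only a test with many connections gives the hash anything to spread.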


> -Toke
