From: Toke Høiland-Jørgensen
To: Eric Dumazet, netdev@vger.kernel.org
Cc: cake@lists.bufferbloat.net, Dave Taht
Subject: Re: [Cake] [PATCH net-next v3] Add Common Applications Kept Enhanced (cake) qdisc
Date: Wed, 25 Apr 2018 18:55:26 +0200
Message-ID: <87tvrz5ipt.fsf@toke.dk>
In-Reply-To: <8bae2ee1-efcc-1571-2a30-5b7779de2c88@gmail.com>
References: <20180425134249.21300-1-toke@toke.dk> <878t9b5n0q.fsf@toke.dk> <6bc11ded-028f-6c8f-964e-a569b4e10813@gmail.com> <8736zj6zj2.fsf@toke.dk> <8bae2ee1-efcc-1571-2a30-5b7779de2c88@gmail.com>

Eric Dumazet writes:

> On 04/25/2018 09:06 AM, Toke Høiland-Jørgensen wrote:
>> Eric Dumazet writes:
>>
>>> On 04/25/2018 08:22 AM, Toke Høiland-Jørgensen wrote:
>>>> Eric Dumazet writes:
>>>
>>>>> What performance number do you get on a 10Gbit NIC for example ?
>>>>
>>>> Single-flow throughput through 2 hops on a 40Gbit connection (with CAKE
>>>> in unlimited mode vs pfifo_fast on the router):
>>>>
>>>> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to testbed-40g-2 () port 0 AF_INET : demo
>>>> Recv   Send    Send
>>>> Socket Socket  Message  Elapsed
>>>> Size   Size    Size     Time     Throughput
>>>> bytes  bytes   bytes    secs.    10^6bits/sec
>>>>
>>>>  87380  16384  16384    10.00    18840.40
>>>>
>>>> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to testbed-40g-2 () port 0 AF_INET : demo
>>>> Recv   Send    Send
>>>> Socket Socket  Message  Elapsed
>>>> Size   Size    Size     Time     Throughput
>>>> bytes  bytes   bytes    secs.    10^6bits/sec
>>>>
>>>>  87380  16384  16384    10.00    24804.77
>>>
>>> CPU performance would be interesting here. (netperf -Cc)
>>
>>
>> $ sudo tc qdisc replace dev ens2 root cake
>> $ netperf -cC -H 10.70.2.2
>> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.70.2.2 () port 0 AF_INET : demo
>> Recv   Send    Send                          Utilization       Service Demand
>> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
>> Size   Size    Size     Time     Throughput  local    remote   local   remote
>> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>>
>>  87380  16384  16384    10.00    15450.35    13.35    6.68     0.849   0.283
>>
>> $ sudo tc qdisc del dev ens2 root
>> $ netperf -cC -H 10.70.2.2
>> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.70.2.2 () port 0 AF_INET : demo
>> Recv   Send    Send                          Utilization       Service Demand
>> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
>> Size   Size    Size     Time     Throughput  local    remote   local   remote
>> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>>
>>  87380  16384  16384    10.00    36414.23    8.20     14.30    0.221   0.257
>>
>>
>> (In this test I'm running netperf on the machine that was a router
>> before, which is why the base throughput is higher; the other machine
>> runs out of CPU on the sender side).
>
> We can see here the high cost of forcing software GSO :/
>
> Really, this should be done only :
> 1) If requested by the admin ( tc .... gso ....)
>
> 2) If packet size is above a threshold.
>    The threshold could be set by the admin, and/or based on a fraction of the bandwidth parameter.
>
> I totally understand why you prefer to segment yourself for < 100 Mbit links.
>
> But this makes no sense on 10Gbit+

Well, as I said, 10Gbit+ links are not really the target audience ;)

We did actually have a threshold at some point, but it was removed
because it didn't work well (I'm not sure of the details, perhaps
someone else will chime in).

However, I'm fine with adding a flag, as long as peeling defaults to on,
at least when the shaper is active (to properly account for packet
overhead we really need to see every packet that goes out on the wire).

Would that be acceptable?

-Toke
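
For reference, the threshold idea from the quoted suggestion (segment only when a packet is large relative to the configured bandwidth) could be sketched roughly as below. This is only an illustration under stated assumptions, not code from the patch: the names `should_peel` and `serialization_time_us`, the `max_delay_us` budget of 1000 us, and the convention that a rate of 0 means "unlimited mode" are all hypothetical.

```python
from typing import Optional


def serialization_time_us(packet_bytes: int, rate_bps: int) -> float:
    """Time in microseconds that a packet occupies a link of rate_bps."""
    return packet_bytes * 8 / rate_bps * 1e6


def should_peel(packet_bytes: int,
                rate_bps: int,
                max_delay_us: float = 1000.0,   # hypothetical budget
                admin_override: Optional[bool] = None) -> bool:
    """Decide whether to split a GSO super-packet into segments.

    admin_override models an explicit admin flag; None means "not set".
    rate_bps == 0 is used here (as an assumption) for unlimited mode.
    """
    if admin_override is not None:
        return admin_override
    if rate_bps <= 0:
        # No shaper configured: no rate to compare against, do not split.
        return False
    # Split only if the unsplit packet would hold the link too long.
    return serialization_time_us(packet_bytes, rate_bps) > max_delay_us


if __name__ == "__main__":
    gso_bytes = 65536  # a typical maximum GSO super-packet size
    for rate in (100_000_000, 10_000_000_000):
        print(rate, should_peel(gso_bytes, rate))
```

With these numbers a 64 KB super-packet takes about 5.2 ms to serialize at 100 Mbit/s, so it would be split, while at 10 Gbit/s it takes about 52 us and would be left intact. Note that a rule like this does not give the "peeling on by default whenever the shaper is active" behaviour asked for above; it only models the threshold alternative.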