From mboxrd@z Thu Jan  1 00:00:00 1970
Content-Type: multipart/mixed;
 boundary="Apple-Mail=_71F48B7A-E9FA-42A9-BC88-66479C1C7E42"
Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\))
From: Sebastian Moeller
Date: Mon, 29 Jun 2015 20:24:52 +0200
Message-Id: <1F11B5EB-247D-481B-8946-FE2476465891@gmx.de>
References: <8B853F1C-DE5D-4F3D-88CC-CB8DA2D3E8B1@gmx.de>
 <04331509-F163-4184-90B4-8589073AFD62@gmx.de>
 <09BA156C-460D-4794-A082-33E805F3D6FD@gmx.de>
 <5436B48C-0803-46DA-B355-14E917A5BB37@gmx.de>
 <4E002218-174D-44F9-91A0-C7F34B9E83C7@gmx.de>
 <87pp4eomfx.fsf@alrua-karlstad.karlstad.toke.dk>
 <92199704-0522-447A-887A-1EE0E6AE4421@gmx.de>
To: Dave Täht
Cc: "cerowrt-devel@lists.bufferbloat.net"
Subject: Re: [Cerowrt-devel] performance numbers from WRT1200AC (Re: Latest
 build test - new sqm-scripts seem to work; "cake overhead 40" didn't)
X-Mailer: Apple Mail (2.1878.6)
List-Id: Development issues regarding the cerowrt test router project
X-List-Received-Date: Mon, 29 Jun 2015 18:25:17 -0000

--Apple-Mail=_71F48B7A-E9FA-42A9-BC88-66479C1C7E42
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=utf-8

Hi List,

On Jun 29, 2015, at 18:44, Dave Taht wrote:

> On Mon, Jun 29, 2015 at 6:42 AM, Sebastian Moeller wrote:
>> Hi Mikael, hi Jonathan,
>>
>>> [...]
>>>
>>> These are the results from 50M and 500M, also including 50up and 50down that I added to my test suite script.
>>>
>>> http://swm.pp.se/aqm/rrul_150629-cake-4.tar
>>>
>>
>> Now both ingress and egress are up to roughly 455Mbps from roughly 360 with cake just playing leaf qdisc for HTB. This looks even better than before…
>
> 350 *usec* induced delay on the 50mbit rrul_be test. w00t!
>
> Most of the tests compare well with the reference rangeley data now. I
> would like a 900mbit soft shaped result.

Make sure to use the correct per-packet overhead of 18 + 20: on gigabit ethernet the inter-frame gap and preamble cost 20 bytes worth of data. So with an MTU of 1500 there is no issue (900 * (1538/1518) = 911.85770751 Mbps), but the smaller the packets get, the larger the error. The shaper saturates the link at the MTU where:

900 * ((MTU + 38) / (MTU + 18)) = 1000
(MTU + 38) / (MTU + 18) = 1000 / 900
MTU + 38 = (10/9) * (MTU + 18)
MTU + 38 = (10/9) * MTU + (10/9) * 18
38 - (10/9) * 18 = (10/9) * MTU - (9/9) * MTU
38 - (10/9) * 18 = (1/9) * MTU
MTU = (38 - ((10/9) * 18)) * 9 = 162

So for TCP/IPv4 with MSS < 122 (MTU 162 minus 40 bytes of IPv4 and TCP headers) the shaper will not keep the ethernet hardware queues empty…

On the other hand, shaping at

1000 / (88/64) = 727.272727273 Mbps

should make sure that even at the minimal packet size of 64 bytes the shaper would still keep the ethernet queues “empty-ish”. If the 1200ac can shape at 900 I would rather specify the correct overhead, though.

To make things a bit trickier, depending on the interface used the kernel will already account for the standard ethernet header without the frame check sequence, so I would guess that in the 900Mbps soft shaper on ethN scenario one would need to add a per-packet overhead of 24 bytes. If someone in the know could double check that reasoning I would be much obliged…

Best Regards
	Sebastian

>
> 1.2ms at 500mbit. Less of a w00t. Possible it is coming from elsewhere
> on that path (fq or fq_codel on the server and client?)
>
> cake currently peels at 1ms / flows (more or less)... NAPI is an
> issue... hw mq an issue...
>
> There are a half dozen things in the mvneta driver I would try to
> reduce its latency more. The easy ones:
>
> reduce this to 16:
>
> netif_napi_add(dev, &pp->napi, mvneta_poll, NAPI_POLL_WEIGHT);
>
> Reduce this to 24: (this will also reduce the max outstanding stuff in
> the driver by a LOT, but is still not BQL!)
>
> /* Max number of allowed TCP segments for software TSO */
> #define MVNETA_MAX_TSO_SEGS 100
>
> Both of these will improve read side latency at the cost of more sirqs.
>
> I do not know what reducing these will do, and would test both of the
> above separately.
>
> /* Coalescing */
> #define MVNETA_TXDONE_COAL_PKTS 1
> #define MVNETA_RX_COAL_PKTS 32
> #define MVNETA_RX_COAL_USEC 100
>
> As for cake itself, eric dumazet told us we don't need atomic ops in it,
> and peeling at an even lower threshold has some appeal (to me, anyway)
>
> attached is a patch for that, put it in your feeds/cero/kmod_sched_cake/patches
> directory, rebuild (make package/kmod-sched-cake/{clean,compile,install})
>
> (bump up the makefile rel number also, if you want)
>
>> Best Regards
>> Sebastian
>
> -- 
> Dave Täht
> worldwide bufferbloat report:
> http://www.dslreports.com/speedtest/results/bufferbloat
> And:
> What will it take to vastly improve wifi for everyone?
> https://plus.google.com/u/0/explore/makewififast

--Apple-Mail=_71F48B7A-E9FA-42A9-BC88-66479C1C7E42
Content-Disposition: attachment;
 filename=0001-Rid-unneeded-atomic-ops-and-reduce-peeling-threshold.patch
Content-Type: text/x-patch;
 name="0001-Rid-unneeded-atomic-ops-and-reduce-peeling-threshold.patch"
Content-Transfer-Encoding: 7bit

From 46be609e95474e9db856b5e12756d4a7568adf42 Mon Sep 17 00:00:00 2001
From: Dave Taht
Date: Mon, 29 Jun 2015 09:38:00 -0700
Subject: [PATCH] Rid unneeded atomic ops and reduce peeling threshold

---
 sch_cake.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/sch_cake.c b/sch_cake.c
index 80e1cb2..9a358b9 100644
--- a/sch_cake.c
+++ b/sch_cake.c
@@ -121,7 +121,7 @@ struct cake_fqcd_sched_data {
 
 	struct codel_params cparams;
 	u32 drop_overlimit;
-	atomic_t flow_count;
+	u32 flow_count;
 
 	struct list_head new_flows; /* list of new flows */
 	struct list_head old_flows; /* list of old flows */
@@ -427,7 +427,7 @@ static int cake_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 	 * Split GSO aggregates if they're likely to impair flow isolation
 	 * or if we need to know individual packet sizes for framing overhead.
 	 */
-	if(unlikely((len * max(atomic_read(&fqcd->flow_count), 1)) > q->peel_threshold && skb_is_gso(skb)))
+	if(unlikely((len * max(fqcd->flow_count, 1U)) > q->peel_threshold && skb_is_gso(skb)))
 	{
 		struct sk_buff *segs, *nskb;
 		netdev_features_t features = netif_skb_features(skb);
@@ -477,7 +477,7 @@ static int cake_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 	/* flowchain */
 	if(list_empty(&flow->flowchain)) {
 		list_add_tail(&flow->flowchain, &fqcd->new_flows);
-		atomic_inc(&fqcd->flow_count);
+		fqcd->flow_count+=1;
 		flow->deficit = fqcd->quantum;
 		flow->dropped = 0;
 	}
@@ -615,7 +615,7 @@ retry:
 				list_move_tail(&flow->flowchain, &fqcd->old_flows);
 			} else {
 				list_del_init(&flow->flowchain);
-				atomic_dec(&fqcd->flow_count);
+				fqcd->flow_count-=1;
 			}
 			goto begin;
 		}
@@ -966,7 +966,7 @@ static void cake_reconfigure(struct Qdisc *sch)
 		if(q->buffer_limit < 65536)
 			q->buffer_limit = 65536;
 
-		q->peel_threshold = (q->rate_flags & CAKE_FLAG_ATM) ? 0 : min(65535U, q->rate_bps >> 10);
+		q->peel_threshold = (q->rate_flags & CAKE_FLAG_ATM) ? 0 : min(65535U, q->rate_bps >> 12);
 	} else {
 		q->buffer_limit = 1 << 20;
 		q->peel_threshold = 0;
@@ -1083,7 +1083,7 @@ static int cake_init(struct Qdisc *sch, struct nlattr *opt)
 	fqcd->perturbation = prandom_u32();
 	INIT_LIST_HEAD(&fqcd->new_flows);
 	INIT_LIST_HEAD(&fqcd->old_flows);
-	atomic_set(&fqcd->flow_count, 0);
+	fqcd->flow_count = 0;
 	/* codel_params_init(&fqcd->cparams); */
 
 	fqcd->flows = cake_zalloc(fqcd->flows_cnt * sizeof(struct cake_fqcd_flow));
-- 
1.9.1

--Apple-Mail=_71F48B7A-E9FA-42A9-BC88-66479C1C7E42
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii

>

--Apple-Mail=_71F48B7A-E9FA-42A9-BC88-66479C1C7E42--
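
The per-packet overhead arithmetic in Sebastian's reply can be re-checked with a small sketch. It is only an illustration of the reasoning in the message, not part of cake or sqm-scripts. The function names and the default arguments (18 bytes accounted by the shaper, 38 bytes actually spent on the wire, 40 bytes of IPv4 plus TCP headers) are assumptions chosen to mirror the numbers quoted in the mail.

```python
# Sketch of the shaper overhead arithmetic from the message above.
# All names and defaults here are illustrative assumptions.

def wire_rate_mbps(shaped_mbps, mtu, accounted=18, actual=38):
    """Rate seen on the wire when the shaper counts `accounted` bytes of
    per-packet overhead but the link really spends `actual` bytes."""
    return shaped_mbps * (mtu + actual) / (mtu + accounted)


def break_even_mtu(shaped_mbps, line_mbps, accounted=18, actual=38):
    """Packet size at which the shaped traffic exactly fills the line.
    Solves shaped * (mtu + actual) = line * (mtu + accounted) for mtu."""
    return (shaped_mbps * actual - line_mbps * accounted) / (line_mbps - shaped_mbps)


def safe_shaper_rate_mbps(line_mbps, min_packet=64, on_wire=88):
    """Shaper rate that stays below line rate even for minimum-size packets,
    using the 88/64 ratio given in the message."""
    return line_mbps * min_packet / on_wire


if __name__ == "__main__":
    mtu = break_even_mtu(900, 1000)
    print("wire rate at MTU 1500:", wire_rate_mbps(900, 1500), "Mbps")
    print("break-even MTU:", mtu, "-> TCP/IPv4 MSS:", mtu - 40)
    print("safe shaper rate:", safe_shaper_rate_mbps(1000), "Mbps")
```

Changing `accounted` to the overhead actually configured on the shaper shows how far the break-even packet size moves; with `accounted` equal to `actual` the wire rate equals the shaped rate for every packet size.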