From: Dave Taht
Date: Thu, 23 Aug 2018 13:15:29 -0700
To: Mikael Abrahamsson
Cc: Rosen Penev, bloat
Subject: Re: [Bloat] [Cerowrt-devel] beating the drum for BQL
List-Id: General list for discussing Bufferbloat

I should also point out that the routing latency numbers in those
blog entries were measured on very high end Intel hardware. It would be
good to re-run those sorts of tests on the armada and others for 1, 10,
100, and 1000 routes.

Clever complicated algorithms have a tendency to bloat icache and cost
more than they are worth, fairly often, on hardware that typically has
32k i/d caches and a small L2. BQL's XMIT_MORE is one example - while
on the surface it looked like a win, it cost too much on the ar71xx to
use. Similarly I worry about the new rx batching code (
https://lwn.net/SubscriberLink/763056/f9a20ec24b8d29dd/ ) which looks
*GREAT* - on *Intel* - although I *think* it will be a win everywhere
this time. I tend to think a smaller NAPI value would help, and
sometimes I think about revisiting NAPI itself. (And I'm perfectly
willing to wait until OpenWrt does the rest of the port for mips to 4.19
before fiddling with it... or longer. I could use a dayjob.)

Still, it's been the rx side of Linux that has been increasingly
worrisome of late, and anything that can be done there for any chip
seems like a goodness.

On the mvneta front... I've worked on that driver... oh... if I could
get a shot at ripping out all the bloat in it and seeing what happened...

On the Marvell front...
yes, they tend to produce hardware that runs too hot. I too rather like
the chipset, and it's become my default hw for most things in the
midrange.

Lastly... there are still billions of slower ISP links left in the
world to fix, with hardware that now costs well under 40 bucks. The
EdgeRouter X is 50 bucks (sans wifi) and good to ~180 Mbps for inbound
shaping presently.

Can we get those edge connections fixed???

On Thu, Aug 23, 2018 at 11:21 AM Dave Taht wrote:
>
> One of the things not readily evident in trying to scale up, is the
> cost of even the most basic routing table lookup. A lot of good work
> in this area landed in linux 4.1 and 4.2 (see a couple posts here:
> https://vincent.bernat.im/en/blog/2017-performance-progression-ipv4-route-lookup-linux
> )
>
> Lookup time for even the smallest number of routes is absolutely
> miserable for IPv6 -
> https://vincent.bernat.im/en/blog/2017-ipv6-route-lookup-linux
>
> I think one of the biggest driving factors of the whole TSO/GRO thing
> is due to trying to get smaller packets through this phase of the
> kernel, and not that they are so much more efficient at the card
> itself. Given the kerfuffle over here (
> https://github.com/systemd/systemd/issues/9725 ) I'd actually like to
> come up with a way to move the linux application socket buffers to the
> post-lookup side of the routing table. We spend a lot of extra time
> bloating up superpackets just so they are cheaper to route.
>
> TCAMs are expensive as hell, but the addition of even a small one,
> readily accessible to userspace or from the kernel, might help in the
> general case. I've actually oft wished to be able to offload these
> sort of lookups into higher level algorithms and languages like
> python, as a general purpose facility. Hey, if we can have giant GPUs,
> why can't our cpus have tcams?
>
> programmable TCAM support got enabled in a recent (Mellanox?) product.
> Can't find the link at the moment. TCAMs, of course, are where big fat
> dedicated routers and switches shine, over linux - and even arp table
> lookups are expensive in linux, though I'm not sure if anyone has
> looked lately.

-- 
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619