From: Dave Taht
To: Stephen Hemminger
Cc: Aaron Wood, cake@lists.bufferbloat.net, bloat
Date: Wed, 27 Apr 2016 12:50:27 -0700
Subject: Re: [Cake] [Bloat] are anyone playing with dpdk and vpp?
List-Id: Cake - FQ_codel the next generation

Not really relevant to this thread, probably, was this very good
article on scaling linux to many cores:

https://blog.acolyer.org/2016/04/26/the-linux-scheduler-a-decade-of-wasted-cores/

I still like the idea of making single threaded cpus better, but only
the millcomputer even comes close to trying, effectively.

On Wed, Apr 27, 2016 at 12:45 PM, Dave Taht wrote:
> On Wed, Apr 27, 2016 at 12:32 PM, Stephen Hemminger wrote:
>> DPDK gets impressive performance on large systems (like 14M packets/sec per
>> core), but not convinced on smaller systems.
>
> My take on dpdk has been mostly that it's a great way to heat data
> centers. Still I would really like to see these advanced algorithms
> (cake, pie, fq_codel, htb) tested on it at these higher speeds.
>
> And I still have great hope for cheap, FPGA-assisted designs that
> could one day be turned into asics, but not as much as I did last year
> when I first started fiddling with the meshsr onenetswitch. I really
> wish I could find a few good EE's to tackle making something fq_codel
> like work on the netfpga project, the proof of concept verilog already
> exists for DRR and AQM technologies.
>
>> Performance depends on having good CPU cache. I get poor performance on Atom
>> etc.
>
> I had hoped that the rangeley class atoms would do better on dpdk, as
> they do I/O direct to cache. I am not sure which processors that is
> actually in, anymore.
>
>> Also driver support is limited (mostly 10G and above)
>
> Well, as we push end-user class devices to 1GigE, we are having issues
> with overuse of offloads to get there, and in terms
> of PPS, certainly pushing small packets is becoming a problem, on
> ethernet and wifi. I would like to see a 100 dollar router that could
> do full PPS at that speed, feeding fiber and going over 802.11ac, and
> we are quite far from there. I see, for example, that meraki is using
> click (I think) to push more processing into userspace.
>
> Also the time for a packet to transit linux from read to write is
> "interesting". Last I looked it was something like 42 function calls
> in the path to "get there", and some of my benchmarks on both the c2
> and apu2 are showing that that time is significant enough for fq_codel
> to start kicking in to compensate. (which is kind of cool to see the
> packet processing adapt to the cpu load, actually - and I still long
> for timestamping on rx directly to adapt ever better)
>
> I have also acquired a mild dislike for seeing stuff like this:
>
> where the tx and rx rings are cleaned up in the same thread and there
> is only one interrupt line for both.
>
>  51:         18      59244     253350     314273   PCI-MSI 1572865-edge   enp3s0-TxRx-0
>  52:          5     484274     141746     197260   PCI-MSI 1572866-edge   enp3s0-TxRx-1
>  53:          9     152225      29943     436749   PCI-MSI 1572867-edge   enp3s0-TxRx-2
>  54:         22      54327     299670     360356   PCI-MSI 1572868-edge   enp3s0-TxRx-3
>  56:     525343     513165    2355680     525593   PCI-MSI 2097152-edge   ath10k_pci
>
> and the ath10k only uses one interrupt. Maybe I'm wrong on my
> assumptions, I'd think in today's multi-core environment that
> processing tx and rx separately might be a win. (?)
>
> I keep hoping for on-board assist for routing table lookups on
> something - your classic cam - for example.
> I saw today that there has been some work on getting source specific
> routing into dpdk, which makes me happy -
>
> https://www.ietf.org/proceedings/95/slides/slides-95-hackathon-18.pdf
>
> which is, incidentally, where I found the reference to the vpp stuff.
>
> https://www.ietf.org/blog/author/jari/
>
>>
>> On Wed, Apr 27, 2016 at 12:28 PM, Aaron Wood wrote:
>>>
>>> I'm looking at DPDK for a project, but I think I can make substantial
>>> gains with just AF_PACKET + FANOUT and SO_REUSEPORT. It's not clear to my
>>> yet how much DPDK is going to gain over those (and those can go a long way
>>> on higher-powered platforms).
>>>
>>> On lower-end systems, I'm more suspicious of the memory bus (and the cache
>>> in particular), than I am the raw CPU power.
>>>
>>> -Aaron
>>>
>>> On Wed, Apr 27, 2016 at 11:57 AM, Dave Taht wrote:
>>>>
>>>> https://fd.io/technology seems to have come a long way.
>>>>
>>>> --
>>>> Dave Täht
>>>> Let's go make home routers and wifi faster! With better software!
>>>> http://blog.cerowrt.org
>>>> _______________________________________________
>>>> Bloat mailing list
>>>> Bloat@lists.bufferbloat.net
>>>> https://lists.bufferbloat.net/listinfo/bloat
>>>
>>> _______________________________________________
>>> Cake mailing list
>>> Cake@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cake
>>>
>>
>
> --
> Dave Täht
> Let's go make home routers and wifi faster! With better software!
> http://blog.cerowrt.org

-- 
Dave Täht
Let's go make home routers and wifi faster! With better software!
http://blog.cerowrt.org
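
For anyone wanting numbers behind "full PPS" at 1GigE and the "14M packets/sec per core" figure quoted above, here is a rough back-of-envelope sketch. It assumes standard Ethernet framing: 64-byte minimum frames plus 8 bytes of preamble/SFD and a 12-byte inter-frame gap on the wire. The function names are made up for illustration and nothing here is measured on real hardware.

```python
# Back-of-envelope line-rate arithmetic for Ethernet.
# Assumptions (standard framing, not taken from the thread):
PREAMBLE_SFD = 8      # preamble + start-of-frame delimiter, in bytes
INTERFRAME_GAP = 12   # minimum gap between frames, in bytes
MIN_FRAME = 64        # minimum Ethernet frame including FCS, in bytes


def line_rate_pps(link_bps, frame_bytes=MIN_FRAME):
    """Maximum frames per second a link can carry at a given frame size."""
    wire_bits = (frame_bytes + PREAMBLE_SFD + INTERFRAME_GAP) * 8
    return link_bps / wire_bits


def per_packet_budget_ns(link_bps, frame_bytes=MIN_FRAME):
    """Time available to process each frame when running at line rate."""
    return 1e9 / line_rate_pps(link_bps, frame_bytes)


if __name__ == "__main__":
    for name, bps in (("1GigE", 1e9), ("10GigE", 10e9)):
        pps = line_rate_pps(bps)
        budget = per_packet_budget_ns(bps)
        print(f"{name}: {pps / 1e6:.2f} Mpps, {budget:.1f} ns per packet")
```

Under those assumptions, minimum-size frames at 1GigE come to about 1.49 Mpps, or roughly 672 ns per packet, and the same arithmetic at 10GigE gives about 14.88 Mpps, which is in line with the per-core figure quoted for large systems.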