Date: Fri, 24 Aug 2018 09:05:38 +0200 (CEST)
From: Mikael Abrahamsson <swmike@swm.pp.se>
To: Dave Taht
cc: Rosen Penev, bloat
Subject: Re: [Bloat] [Cerowrt-devel] beating the drum for BQL

On Thu, 23 Aug 2018, Dave Taht wrote:

> I should also point out that the kinds of routing latency numbers in
> those blog entries were on very high-end Intel hardware. It would be
> good to re-run those sorts of tests on the Armada and others for
> 1, 10, 100, 1000 routes. Clever, complicated algorithms have a tendency
> to bloat the icache and, fairly often, cost more than they are worth on
> hardware that typically has 32k i/d caches and a small L2.

My testing has been on OpenWrt with 4.14 on Intel x86-64. Looking at how
the box behaves, I'd say it's limited by context-switch / interrupt load,
and not actually by the CPU being busy doing "hard work".

All of the fast routing implementations (snabbswitch, FD.io/VPP, etc.)
take CPU cores and devices away from Linux and run a busy-loop, polling
most of the time and never context switching, which means the L1 cache is
never churned. This is how they get fast.

I see potential to do "XDP offload" of forwarding here, basically doing a
similar job to what a hardware packet accelerator does. Then we could
potentially optimise forwarding using lessons learnt from those other
projects. We need to keep the bufferbloat work in mind while doing this,
though, so we don't reintroduce that problem. A minimal sketch of the XDP
idea follows below my signature.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se
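To make the "XDP offload" idea concrete: a minimal sketch of what such a
forwarding program could look like. This is purely illustrative, not code
from any of the projects mentioned; the map name tx_port, the fixed egress
slot 0, and the attach interface below are made up. A real forwarder would
parse headers and consult the kernel FIB per packet (e.g. with
bpf_fib_lookup()) rather than blindly redirecting everything:

    /* xdp_fwd_sketch.c -- illustrative only */
    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    /* Devmap holding egress interfaces; slot 0 is filled in from
     * userspace with the ifindex we want to transmit on. */
    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP);
        __uint(max_entries, 1);
        __type(key, __u32);
        __type(value, __u32);
    } tx_port SEC(".maps");

    SEC("xdp")
    int xdp_forward(struct xdp_md *ctx)
    {
        /* Hand the frame straight to the egress device's driver,
         * bypassing the normal skb path. With flags 0, a failed
         * map lookup returns XDP_ABORTED and the packet is dropped. */
        return bpf_redirect_map(&tx_port, 0, 0);
    }

    char _license[] SEC("license") = "GPL";

You would attach it with something like "ip link set dev eth0 xdp obj
xdp_fwd_sketch.o sec xdp" (interface name assumed) and populate tx_port
from userspace, e.g. with bpftool map update. Because the redirect happens
in the driver before an skb is ever allocated, this gets some of the
"never leave the hot path" benefit the kernel-bypass stacks get, while
staying under Linux's control.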