[Cerowrt-devel] an option for a new platform?

Dave Taht dave.taht at gmail.com
Tue Dec 16 04:02:43 EST 2014

On Sat, Dec 13, 2014 at 3:02 PM, David P. Reed <dpreed at reed.com> wrote:
> Anyone measured what is the actual bottleneck in 300 Mb/s shaping?  On an
> Intel platform you can measure a running piece of code pretty accurately.

Regrettably the MIPS profiler has been broken for as long as I can remember.
I just tried compiling it in again and it failed on an illegal op. That is
not unfixable, but after that you have only short intervals to poke at stuff
before running out of available memory on anything.

> I ask because it is not obvious a cpu needs to touch much of a frame to do
> shaping, so it seems more likely that the driver and memory management
> structures are the bottleneck.

I pointed to a specific instruction pattern that was a problem, but I do
think it is a more general problem with caching and context switching.
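
To make the point about how little of a frame shaping has to touch
concrete, here is a minimal token-bucket sketch in Python. It is an
illustration only, not the shaper cerowrt actually runs; the class name,
the rate and the burst size are made up for the example. The only
per-packet input is the frame length.

```python
class TokenBucket:
    """Minimal token-bucket shaper: it sees packet lengths, never payloads."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8.0
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last = 0.0                   # time of the last refill, in seconds

    def _refill(self, now):
        # Add tokens for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        self.last = now

    def try_send(self, pkt_len, now):
        """Return True if a packet of pkt_len bytes may leave at time `now`."""
        self._refill(now)
        if self.tokens >= pkt_len:
            self.tokens -= pkt_len
            return True
        return False


# Hypothetical numbers: 300Mbit rate, two full-size frames of burst.
tb = TokenBucket(rate_bps=300_000_000, burst_bytes=3000)
first = tb.try_send(1500, now=0.0)
second = tb.try_send(1500, now=0.0)
third = tb.try_send(1500, now=0.0)   # bucket is empty, packet must wait
later = tb.try_send(1500, now=0.0001)
```

The arithmetic per packet is trivial, which is why the cost is more likely
to sit in the driver, the queue structures and the cache misses around them.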

A recent benchmark of a dual core 1+GHz Atom hit 300Mbit of shaping
before bottlenecking in ksoftirqd. So even a low end Atom is better
than the MIPS 24K, and most of the other work points to just plain
running out of 32KB of cache as the key limitation.
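
For a sense of scale, a back-of-the-envelope sketch of the per-packet
budget at 300Mbit. It ignores Ethernet framing overhead (preamble,
inter-frame gap) and assumes a single core does all the work; the function
names and the 1GHz clock are assumptions for the example, not measurements.

```python
def per_packet_budget_ns(rate_bps, pkt_bytes):
    """Nanoseconds available per packet when forwarding at rate_bps."""
    return pkt_bytes * 8 * 1_000_000_000 / rate_bps


def cycles_per_packet(rate_bps, pkt_bytes, cpu_hz):
    """CPU cycles available per packet on one core clocked at cpu_hz."""
    return pkt_bytes * 8 * cpu_hz / rate_bps


big_ns = per_packet_budget_ns(300_000_000, 1500)    # full-size frames
small_ns = per_packet_budget_ns(300_000_000, 64)    # minimum-size frames
big_cycles = cycles_per_packet(300_000_000, 1500, 1_000_000_000)
print(big_ns, small_ns, big_cycles)
```

With full-size frames there are tens of microseconds per packet; with
minimum-size frames the budget drops to under two microseconds, at which
point a handful of cache misses per packet eats most of it.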

The right place to do the work of inbound and outbound shaping
is in the switch, I think, but that needs a fq_codel enabled switch
in the first place...

... and despite waiting patiently for that to show, no sign of life
from a single silicon vendor yet. I have all the pieces to build
a pretty good fq'd and ecn'd switch now - DRR from NetFPGA, tri-mode Ethernet
MACs from a variety of vendors, timer code from elsewhere, scheduling
from the scenic project...

and someone just built the very simple zynq FPGA interface to
4 ethernet macs with all the needed IP:


except that I'd rewrite the Ethernet device entirely to bolt in what is
really needed. And I gave away my two ZedBoards to worthy
developers and have to get another.
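
For anyone unfamiliar with the DRR piece mentioned above, a minimal
deficit round robin sketch in Python. This is the textbook scheme, not the
NetFPGA implementation; the function name, the quantum and the queue
contents are made up for the example.

```python
from collections import deque


def drr(queues, quantum, rounds):
    """Run deficit round robin over `queues` (deques of packet lengths).

    Returns the send order as a list of (queue_index, packet_length).
    """
    deficits = [0] * len(queues)
    sent = []
    for _ in range(rounds):
        for i, q in enumerate(queues):
            if not q:
                deficits[i] = 0      # idle queues do not bank credit
                continue
            deficits[i] += quantum
            # Send packets while the head of the queue fits in the deficit.
            while q and q[0] <= deficits[i]:
                pkt = q.popleft()
                deficits[i] -= pkt
                sent.append((i, pkt))
            if not q:
                deficits[i] = 0
    return sent


# Hypothetical flows: one sending big packets, one sending small ones.
flows = [deque([1500, 1500]), deque([500, 500, 500, 500])]
order = drr(flows, quantum=1500, rounds=2)
print(order)
```

Each flow gets the same number of bytes per round regardless of packet
size, and the per-packet work is a compare and a subtract, which is what
makes it attractive to do in hardware.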

> But it is really easy to write very slow code in a machine with limited
> cache. So maybe that is it.

Context switches suck on modern arches, far more than I realized
until recently. The software routers are spinning, polling, in userspace.

Sucks on power, works on latency.

> On a multi core intel arch machine these days it is a surprising fact that a
> single core can't use up more than about 25 percent of a socket's memory
> cycles so to get full i/o speed you need to be running your code on 4 cores
> or more... this kind of thing can really limit achievable speed of a poorly
> threaded design.  Architectural optimization needs more than llvm and clean
> code. You need to think about the whole software pipeline. Debian may not be
> great out of the box for this reason - it was never designed for routing
> throughput.

The work on software routing (Click, Vandervecken, DPDK, OpenDaylight)
appears to point towards doing it all in userspace as the way to go on
most current architectures.

Regrettably the userspace stacks are not very open, or complete, at present.

> On Dec 12, 2014, Dave Taht <dave.taht at gmail.com> wrote:
>> There was a review of that hardware that showed it couldn't push more
>> than 600Mbit natively (without shaping). I felt that the ethernet
>> driver could be improved significantly after looking it over, but
>> didn't care for the 600mbit as a starting point to have to improve
>> from.
>> Not ruling it out, though! It met quite a few requirements we have.
>> On Thu, Dec 11, 2014 at 11:33 PM, Erkki Lintunen <ebirdie at iki.fi> wrote:
>>> Hello,
>>> while enjoying and reading another thread from the list...
>>>> -------- Forwarded Message --------
>>>> Subject: Re: [Cerowrt-devel] how is everyone's uptime?
>>>> Date: Thu, 11 Dec 2014 16:42:37 -0800
>>>> From: Dave Taht <dave.taht at gmail.com>
>>> [snip]
>>>> But frankly, I would prefer for most of the chaos there to subside and
>>>> to find
>>>> a new, additional platform, to be working on before resuming work,
>>>> that can do inbound shaping at up to 300mbit. And
>>>> to be more openwrt compatible in whatever we do, whatever that is.
>>> this reminded me that another day I passed a web-page of a platform and
>>> in the hope this has not been on the list yet passing it forward.
>>> <http://www.pcengines.ch/apu.htm>
>>> An interesting tidbit in the platform is the choice of firmware, I
>>> think. Haven't seen any board yet with the similar choice by the
>>> manufacturer. With a quick summing from the vendor part catalog, the
>>> platform is sub 200 EUR (238 USD in current exchange rate) for an about
>>> working assembly of 3x 1GbE, 4G ram, 1G flash, 802.11a/b/g/n radio...
>>> I can't say anything how capable the hw might be for the stated inbound
>>> shaping performance. I have had an ALIX board from their previous
>>> generation for years and it's been humming nicely though I haven't pushed
>>> it to its envelope.
>>> Best
>>> Erkki
> -- Sent from my Android device with K-@ Mail. Please excuse my brevity.

Dave Täht

