From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dave.taht@gmail.com>
Received: from mail-ia0-f170.google.com (mail-ia0-f170.google.com
	[209.85.210.170]) (using TLSv1 with cipher RC4-SHA (128/128 bits))
	(Client CN "smtp.gmail.com",
	Issuer "Google Internet Authority" (verified OK))
	by huchra.bufferbloat.net (Postfix) with ESMTPS id D6D0321F199;
	Thu, 20 Dec 2012 01:13:19 -0800 (PST)
Received: by mail-ia0-f170.google.com with SMTP id i1so2653069iaa.1
	for <multiple recipients>; Thu, 20 Dec 2012 01:13:19 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	bh=TH9cMd15ssn9gamwwRUiJO4X1U0RrojddWaufDZXqIM=;
	b=yKwqOj/6K4N9SLHlQm3mk+7rLlo88Uf46Fss3okeP6W+ce2OO9BrZI+K+IO9g7kn4G
	p/IfH4n4HYbPd3cS6hppQlxdq7jGyoHyhxDVmvuZh4ZURX7uAaRZhymCSWwx30pRLjbP
	HpxeO1TN9llzC1mfRs1X4wiRkST7fhG5tdw2IGZykvL3ef0+mULgZJz1dy9cG4L7QKpa
	FlUHVZfoqtcug5IL+a0yQz9wAfIW/o4N4T568eirnUPFE7vs5UzA37tRkXm/m4a4IeSJ
	rxwoK68qL8ZCJBoKC1NAdGjgGcwCt3619DT2u3WsyB3Su7Hnk2bOcXxrD7OvUEts6Gyd
	+dbg==
MIME-Version: 1.0
Received: by 10.50.56.139 with SMTP id a11mr4648174igq.86.1355994799073; Thu,
	20 Dec 2012 01:13:19 -0800 (PST)
Received: by 10.64.135.39 with HTTP; Thu, 20 Dec 2012 01:13:18 -0800 (PST)
In-Reply-To: <20121220081737.1A681800037@ip-64-139-1-69.sjc.megapath.net>
References: <dave.taht@gmail.com>
	<CAA93jw7h8d5Y8r+_DwR_cF9F7TEWzgZuoYvfOWp5B=1=pGx=BQ@mail.gmail.com>
	<20121220081737.1A681800037@ip-64-139-1-69.sjc.megapath.net>
Date: Thu, 20 Dec 2012 04:13:18 -0500
Message-ID: <CAA93jw5ahao=o4Efp6zDqgFaQoJGLxv0k=v3rVnNskFB=c9oGA@mail.gmail.com>
From: Dave Taht <dave.taht@gmail.com>
To: Hal Murray <hmurray@megapathdsl.net>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: codel@lists.bufferbloat.net,
	bloat-devel <bloat-devel@lists.bufferbloat.net>,
	cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Codel] hardware hacking on fq_codel in FPGA form at 10GigE
X-BeenThere: codel@lists.bufferbloat.net
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: CoDel AQM discussions <codel.lists.bufferbloat.net>
List-Unsubscribe: <https://lists.bufferbloat.net/options/codel>,
	<mailto:codel-request@lists.bufferbloat.net?subject=unsubscribe>
List-Archive: <https://lists.bufferbloat.net/pipermail/codel>
List-Post: <mailto:codel@lists.bufferbloat.net>
List-Help: <mailto:codel-request@lists.bufferbloat.net?subject=help>
List-Subscribe: <https://lists.bufferbloat.net/listinfo/codel>,
	<mailto:codel-request@lists.bufferbloat.net?subject=subscribe>
X-List-Received-Date: Thu, 20 Dec 2012 09:13:21 -0000

On Thu, Dec 20, 2012 at 3:17 AM, Hal Murray <hmurray@megapathdsl.net> wrote:
>
> If I was going to do something like that, I'd build a small/simple CPU and do
> the work in microcode.

There are two ppc 440 cpus already onboard the 10GigE device, I think.
It's a REALLY NICE fpga.

http://netfpga.org/10G_specs.html

http://www.xilinx.com/support/documentation/data_sheets/ds100.pdf

If we really wanted to get a jump on the high end:

http://www.hitechglobal.com/boards/100gig.htm

>
>> implementing {n,e,s}fq_codel onboard looks very feasible
>
> How many lines of assembler code would it take?

I could do a dump of the current code into any given assembly
language. It's not a lot, but there are a lot of out of band
functions.

> How many registers do you need?  Do you need any memory other than queues?
> Maybe counters?

The total overhead for fq_codel is presently 1024*64 bytes for 1024
flows, and 4-8k of pointer overhead (32 or 64 bit). I would argue for
such a device to hash to 64k flows, or heck, higher. And the per-flow
overhead can be reduced a lot in a dedicated device.
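
To put rough numbers on the above, here is a back-of-envelope sketch. It
assumes 64 bytes of state per flow and one pointer per flow (which is how I
read the 4-8k figure); the helper name and the 64k-flow case are only
illustrative, and a dedicated device would presumably need less per flow.

```python
# Back-of-envelope memory estimate for fq_codel flow state.
# From the text: 64 bytes of state per flow, 1024 flows today.
# Assumption: one pointer per flow (4 bytes on 32 bit, 8 bytes on 64 bit).

PER_FLOW_BYTES = 64


def flow_table_bytes(flows, pointer_bytes):
    """Return (per-flow state bytes, pointer overhead bytes)."""
    return flows * PER_FLOW_BYTES, flows * pointer_bytes


if __name__ == "__main__":
    for flows in (1024, 65536):
        for ptr in (4, 8):
            state, ptrs = flow_table_bytes(flows, ptr)
            print(f"{flows:6d} flows, {ptr * 8}-bit pointers: "
                  f"{state // 1024} KiB state + {ptrs // 1024} KiB pointers")
```

So hashing to 64k flows at the current per-flow size would be about 4 MiB of
state, which is why shrinking the per-flow overhead matters.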

What of that needs to be on-board the fpga, and what can sit off-board, is a
fairly good question. The sfq/codel queue management stuff sits nicely
in parallel with getting the packets, so that's an obvious second
bus/cache arch...

>> The only thing that is seriously serial about fq_codel is shooting the
>> biggest flow when the queue limit is exceeded, and that could be made
>> embarrassingly parallel with enough gates. There are no doubt other tricky
>> issues.
>
> Would it be better to do the fq work in the main CPU and let the FPGA grab

Well there are a few things that would benefit from moving directly
into hardware - the 5 tuple hash, for example.
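
As a minimal sketch of what a 5 tuple hash does: pack (src, dst, proto,
sport, dport) into a key, hash it, and reduce it to a flow bucket. FNV-1a is
used here only because it is short enough to show in full; it is a stand-in,
not the hash the existing code uses, and the function names and the perturb
parameter are made up for the example. IPv4 only.

```python
import ipaddress
import struct

FLOWS = 1024  # current fq_codel flow count, per the numbers above


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a, used here as a stand-in hash function."""
    h = 0x811C9DC5
    for b in data:
        h ^= b
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


def flow_bucket(src, dst, proto, sport, dport, perturb=0, flows=FLOWS):
    """Map an IPv4 5-tuple to a flow bucket in [0, flows)."""
    key = struct.pack(
        "!I4s4sBHH",
        perturb,  # hypothetical salt so bucket assignment can be varied
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
        proto,
        sport,
        dport,
    )
    return fnv1a_32(key) % flows


if __name__ == "__main__":
    # TCP (proto 6) from 10.0.0.1:12345 to 10.0.0.2:80
    print(flow_bucket("10.0.0.1", "10.0.0.2", 6, 12345, 80))
```

It is a fixed-size key and a pure function of the packet headers, which is
what makes it a good candidate for gates rather than cpu cycles.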

> packets from some shared  data structure in memory?

The problem that I would like to beat is that TSO/GSO seem to be
necessary on the host processor to reduce the interrupt count to
sanity at 10GigE. A goal here would be to allow for TSO generation
(and GRO receive) to hand off to the board, but for the board to
interleave and aqm packets from there to the wire. Rather than a tx
descriptor ring you'd have a tx descriptor list and tx completion ring
so that you could send streams out of order.

> Can you work out a
> memory structure that doesn't need locks?

The enqueue and dequeue algorithms are entirely decoupled, with the
exception of the error handling phase (out of queue space). One
thought would be to track packet count on enqueue (this is more
"sfq"-like than fq_codel-like), which still has a tiny lock...
:grumble:
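
A toy sketch of that one coupled step, assuming per-flow byte backlogs and a
total packet limit. The class and method names are invented for the example,
dropping from the head of the fattest flow is my simplification, and the
codel and round-robin logic on the dequeue side is left out entirely.

```python
from collections import deque


class FlowQueues:
    """Toy per-flow queues; packets are represented by their length."""

    def __init__(self, flows=1024, limit=10240):
        self.queues = [deque() for _ in range(flows)]
        self.backlog = [0] * flows  # bytes queued per flow
        self.limit = limit          # total packets allowed
        self.count = 0              # total packets queued

    def enqueue(self, bucket, pkt_len):
        """Queue a packet; return the bucket dropped from, or None."""
        self.queues[bucket].append(pkt_len)
        self.backlog[bucket] += pkt_len
        self.count += 1
        if self.count > self.limit:
            return self.drop_from_biggest()
        return None

    def drop_from_biggest(self):
        """The serial part: scan every flow for the largest backlog."""
        victim = max(range(len(self.backlog)), key=self.backlog.__getitem__)
        pkt_len = self.queues[victim].popleft()
        self.backlog[victim] -= pkt_len
        self.count -= 1
        return victim

    def dequeue(self, bucket):
        """Pop the head packet of one flow (no codel, no scheduling)."""
        pkt_len = self.queues[bucket].popleft()
        self.backlog[bucket] -= pkt_len
        self.count -= 1
        return pkt_len
```

The linear scan in drop_from_biggest() is the part that touches every flow's
state, and it is where the shared packet count comes in; in hardware the scan
could be a comparator tree instead of a loop.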

>
>
> --
> These are my opinions.  I hate spam.
>
>
>



-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html