[Bloat] Designer of a new HW gadget wishes to avoid bufferbloat

Michael Spacefalcon msokolov at ivan.Harhan.ORG
Fri Oct 26 22:49:48 EDT 2012


Jonathan Morton <chromatix99 at gmail.com> wrote:

> Okay, so you can fudge the write pointer to do a different kind of tail
> drop. That's great for dealing with a physically full buffer on packet
> arrival.
>
> What you need to do for head drop is the same kind of pointer manipulation
> at the read end. You only do it between IP packets,

Yup, that's what I was thinking - give the drain side logic the
ability to skip packet N+1 when it's done transmitting the cells of
packet N, and start transmitting the cells of packet N+2.

> so you may want to make
> the decision of whether to do so in advance, so as to avoid having to
> insert an idle cell while that decision is made.

Of course, my whole point in moving from a CPU-mediated implementation
to an all-FPGA one is to eliminate the possibility of bandwidth
underutilization or unnecessary latency because the L2 converter logic
is "thinking about it".

Having the head drop decision made while the last cell of the previous
in-progress packet is being transmitted seems like a reasonable
approach to me.  IOW, as we transmit the last cell of packet N, decide
whether we'll do packet N+1 or N+2 next.
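To make that decision point concrete, here is a toy software model of it. This is purely illustrative: the function name, the occupancy/threshold inputs, and the drop criterion are all my own assumptions, not the actual FPGA design; the point is only that the choice between N+1 and N+2 is computed while packet N's terminal cell is still going out, so no idle cell is needed.

```python
# Hypothetical sketch of the drain-side decision point: while the last
# (terminal) cell of packet N is being transmitted, choose whether the
# next packet sent is N+1 or N+2 (i.e. head-drop N+1). The occupancy
# threshold is an illustrative stand-in for a real AQM policy.

def next_packet_index(n, occupancy, capacity, threshold=0.5):
    """Decide, at the terminal cell of packet n, which packet to send next.

    occupancy/capacity is the fill level of the cell ring buffer; if it
    exceeds the threshold we head-drop packet n+1 and jump to n+2.
    """
    if occupancy > threshold * capacity:
        return n + 2   # head-drop packet n+1
    return n + 1       # normal case: transmit the next packet

# A nearly full buffer triggers a skip; a nearly empty one does not.
assert next_packet_index(7, occupancy=900, capacity=1000) == 9
assert next_packet_index(7, occupancy=100, capacity=1000) == 8
```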

> Half a second at bottleneck rate is plenty. At the higher link rate the
> buffer will fill more gradually, so you can still absorb a half second
> burst in practice, if your inbound link rate is 2Mbps.

OK, so it sounds like we'll be in good shape with an EP3C5 or EP3C10
FPGA.  (Those are the lowest-end Cyclone III parts with 46 KiB of RAM
inside.)
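As a back-of-the-envelope sanity check on those RAM figures (the SDSL bottleneck rate below is an assumed example, not a figure from this thread; only the 2 Mbps inbound rate and the 46 KiB of on-chip RAM come from the discussion above):

```python
# Rough sizing check for "half a second at bottleneck rate".

def buffer_bytes(bottleneck_bps, seconds=0.5):
    """Bytes needed to hold `seconds` worth of traffic at the bottleneck rate."""
    return bottleneck_bps * seconds / 8

ram_bytes = 46 * 1024  # EP3C5/EP3C10 on-chip RAM

# At an assumed 384 kbit/s SDSL rate, half a second is 24000 bytes:
assert buffer_bytes(384_000) == 24000.0
assert buffer_bytes(384_000) < ram_bytes  # fits comfortably

# Half a second at the full 2 Mbit/s inbound rate would be 125000 bytes
# and would not fit -- but the buffer drains at the bottleneck rate
# while it fills, which is why sizing at bottleneck rate is what matters.
assert buffer_bytes(2_000_000) > ram_bytes
```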

> You should definitely think ahead to how you will implement head drop
> eventually. Changing that part of the system later might be more painful
> than you anticipate.

Well, I do for sure need to plan the FPGA logic and RAM resource usage
with this subsequent addition in mind.  The main thing I'll need to
add to my logic in order to implement this head drop mechanism is a
way for the drain side logic to know where the packet boundaries are.

In the simplistic HDLC->SDSL logic function I have in mind right now,
the logic on the drain side of the cell ring buffer literally won't
need to know what a packet is: it will only see cells queued up for
transmission.  But this won't be good enough for head drop: the latter
has to happen on the drain side, and because we need to drop whole
packets, not partial, there will need to be additional communication
from the packet-aware fill side logic.

I'm thinking of implementing an additional ring buffer, with the same
fill and drain sides, whose pointers would increment once per packet
rather than once per cell as the main buffer's do.  The data elements
stored in this auxiliary ring buffer would be packet start pointers
into the main cell ring buffer, written by the fill logic.  The drain
logic, when running normally (not dropping any packets), would simply
increment its read pointer for the auxiliary buffer whenever it
transmits a cell with the "terminal cell" bit set.  But when
implementing head drop, the pointers read from this auxiliary buffer
tell it how far to advance the main (cell) read pointer to skip a
whole packet.
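A minimal software model of that two-ring-buffer scheme, to make the pointer bookkeeping concrete. The class layout, capacities, and field names are illustrative assumptions, not the actual FPGA register design; what it models is exactly the mechanism described: a main cell ring, an auxiliary ring of packet-start indices written by the fill side, an aux read pointer bumped at each terminal cell, and head drop as a jump of the main read pointer to the next stored packet start.

```python
# Software model of the dual ring buffers: cells plus packet starts.
from collections import namedtuple

Cell = namedtuple("Cell", ["payload", "terminal"])  # terminal-cell bit

class CellRing:
    def __init__(self, cell_capacity=16, pkt_capacity=8):
        self.cells = [None] * cell_capacity
        self.starts = [None] * pkt_capacity   # auxiliary ring: packet starts
        self.cw = self.cr = 0                 # main write/read pointers
        self.pw = self.pr = 0                 # auxiliary write/read pointers

    def fill_packet(self, cells):
        """Fill side: record the packet's start, then enqueue its cells."""
        self.starts[self.pw % len(self.starts)] = self.cw
        self.pw += 1
        for c in cells:
            self.cells[self.cw % len(self.cells)] = c
            self.cw += 1

    def drain_cell(self):
        """Drain side: emit one cell; bump aux read pointer on terminal cells."""
        c = self.cells[self.cr % len(self.cells)]
        self.cr += 1
        if c.terminal:
            self.pr += 1
        return c

    def head_drop(self):
        """Between packets only: skip the whole next queued packet."""
        self.pr += 1
        self.cr = self.starts[self.pr % len(self.starts)]

# Queue three 2-cell packets, send packet 0, head-drop packet 1,
# and verify the drain side resumes at packet 2's first cell.
rb = CellRing()
for p in range(3):
    rb.fill_packet([Cell((p, 0), False), Cell((p, 1), True)])
assert rb.drain_cell().payload == (0, 0)
assert rb.drain_cell().payload == (0, 1)   # terminal cell of packet 0
rb.head_drop()                             # drop packet 1 unsent
assert rb.drain_cell().payload == (2, 0)   # packet 2 starts next
```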

> Aside from that, starting with a simple tail drop queue will validate the
> rest of the system and give you a point of comparison.

That's what I had in mind.  The "validate the rest of the system" part
is quite critical - there will be quite a bit of complex logic to
debug and make work...

> The next step might
> be to test the head drop mechanism using a trivial rule such as dropping
> alternate packets if the queue is more than half full.

Good idea.
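For bring-up purposes, that trivial rule is small enough to sketch in a few lines. This is a toy software version under my own naming assumptions (queue length, capacity, toggle state), evaluated only at packet boundaries as Jonathan describes:

```python
# Toy version of the suggested bring-up rule: when the queue is more
# than half full at a packet boundary, head-drop every other packet.

def trivial_aqm(queue_len, capacity, drop_toggle):
    """Return (drop_this_packet, new_toggle), evaluated between packets.

    Above half full, the toggle flips on every decision, so alternate
    packets are dropped; below half full nothing is dropped and the
    toggle resets.
    """
    if queue_len * 2 > capacity:
        return drop_toggle, not drop_toggle
    return False, False

# Below half full: nothing is dropped.
assert trivial_aqm(10, 100, True) == (False, False)

# Above half full: every other packet is dropped.
toggle, drops = True, []
for _ in range(4):
    drop, toggle = trivial_aqm(80, 100, toggle)
    drops.append(drop)
assert drops == [True, False, True, False]
```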

> It's a valid model for decision making, since you make decisions at
> discrete points in time corresponding to packet boundaries. If you come up
> with some specific objections, feel free to bring them up here.

No objections, I just haven't fully internalized this stuff yet.  But
it looks like my basic logic design will be OK, so we can revisit the
implementation of the actual AQM policy after we do all the other
legwork: getting the basic logic design to compile into an FPGA image,
building the physical board, bringing up all the hardware functionality
of that board which isn't relevant to this discussion, and getting the
L2 converter logic function to work at the basic level.

Albert Rafetseder <albert.rafetseder at univie.ac.at> wrote:

> I.e., you rule out ATM cell loss and reordering! :-)

Yup.

> I'm suggesting a *ring* buffer.

Of course; all of the hardware buffers I've discussed in this thread
are ring buffers.  I don't know of any other way to implement a
continuous fill-drain buffer in hardware.

> Let me try to sketch what I mean (view in your favorite fixed-space font).

My xterm window always uses the same fixed-width font. :-)

> ^r is the read pointer (output to SDSL, head of the queue), ^w is the
> write pointer (input from HDLC, tail). Pointers can only go right, but
> all addressing is done modulo the buffer size, so you start over on
> the left side if you were ever to point past the buffer.

Yes, I know how ring buffers work. :-)

> Currently, the read pointer processes "2", whereas the write pointer
> will overwrite "A" next.
>
> 0123456789ABCDEF
>   ^r      ^w
>
> Now the read pointer has progressed a bit, but the write pointer was
> much faster:
>
> xyz3456789rstuvw
>    ^w ^r

Yup, you are describing my design.

> Obviously, ^w must not overtake ^r.

Yes, exactly.

> The trick is now to push the read pointer forward a bit instead of
> holding back the write pointer.

Can't do that: the ^r pointer will normally point at some cell in the
middle of a packet being transmitted.  Jerking that pointer forward
will cause us to transmit garbage consisting of partial packets, which
is the last thing we want.

Jonathan's idea of implementing head drop on the drain (read) side
instead of the fill (write) side makes a lot more sense to me.  That
head drop mechanism would be controlled by the AQM policy, which is a
kind of optional add-on.  But we still have to have tail drop (or my
"modified tail drop") as the backup packet drop mechanism that
protects the integrity of the ring buffer.  I guess the trick is to
design the AQM policy such that it does most of the job and the
out-of-buffer packet drop mechanism is rarely exercised.

> The sketch above applies to both the list of addresses of cell starts
> and the actual cell buffer. (You could also implement both ring
> buffers as a single one, as a linked list: Store the address of the
> start cell of the next packet in front of the cells of the current
> packet. To head-drop packets, dereference the pointer multiple times.
> ASCII diagrams on request :-)

See my reply to Jonathan's comments above.  Does that clarify how I
envision the buffers being structured?
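For completeness, Albert's single-buffer linked-list variant can also be sketched in a few lines of software. The layout and helper names below are my own illustrative choices: each packet's cells are prefixed by the index where the next packet's prefix will live, so head-dropping k packets is k pointer dereferences.

```python
# Software sketch of the in-buffer linked-list variant.

def build_linked_buffer(packets):
    """Flatten packets into one list; each packet is prefixed by the
    index of the following packet's prefix slot."""
    buf = []
    for cells in packets:
        next_start = len(buf) + 1 + len(cells)  # prefix slot + this packet
        buf.append(next_start)
        buf.extend(cells)
    return buf

def head_drop(buf, read_ptr, k):
    """Dereference the next-packet pointer k times from read_ptr."""
    for _ in range(k):
        read_ptr = buf[read_ptr]
    return read_ptr

packets = [["a1", "a2"], ["b1"], ["c1", "c2", "c3"]]
buf = build_linked_buffer(packets)
# Packet prefix slots land at indices 0, 3, and 5:
assert buf[0] == 3 and buf[3] == 5
# Head-dropping two packets from the front lands on packet "c":
assert buf[head_drop(buf, 0, 2) + 1] == "c1"
```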

SF


