[Bloat] Sender-side Buffers and the Case for Multimedia Adaptation, from ACM Queue

Mon Oct 15 08:37:51 EDT 2012

On the off chance you've not all seen this, a new article was announced
on Queue, "Sender-side Buffers and the Case for Multimedia Adaptation",
at http://queue.acm.org/detail.cfm?id=2381998&ref=fullrss

This is a discussion of bufferbloat-aware buffer and queue management
at the application level, something I'm personally interested in
because I'm a capacity planner.

I'd be especially interested in hearing from the list if the authors
have missed anything...

--dave

On 10/14/2012 03:00 PM, bloat-request at lists.bufferbloat.net wrote:
> Today's Topics:
> 
>    1. Designer of a new HW gadget wishes to avoid bufferbloat
>       (Michael Spacefalcon)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Sun, 14 Oct 2012 03:41:57 GMT
> From: msokolov at ivan.Harhan.ORG (Michael Spacefalcon)
> To: bloat at lists.bufferbloat.NET
> Subject: [Bloat] Designer of a new HW gadget wishes to avoid
> 	bufferbloat
> Message-ID: <1210140341.AA29907 at ivan.Harhan.ORG>
> 
> Hello esteemed anti-bufferbloat folks,
> 
> I am designing a new networking-related *hardware* gadget, and I wish
> to design it in such a way that it won't be guilty of bufferbloat.  I am
> posting on this mailing list in order to solicit some buffering and
> bloat-related design advice.
> 
> The HW gadget I am designing will be an improved-performance successor
> to this OSHW design:
> 
> http://ifctfvax.Harhan.ORG/OpenWAN/OSDCU/
> 
> The device targets a vanishingly small audience of those few wretched
> souls who are still voluntarily using SDSL, i.e., deliberately paying
> more per month for less bandwidth.  (I am one of those wretched souls,
> and my reasons have to do with a very precious non-portable IPv4
> address block assignment that is inseparably tied to its associated
> 384 kbps SDSL circuit.)
> 
> What my current OSDCU board does (the new one is intended to do the
> exact same thing, but better) is convert SDSL to V.35/HDLC.  My own
> SDSL line (the one with the precious IPv4 block) is served via a Nokia
> D50 DSLAM operated by what used to be Covad, and to the best of my
> knowledge the same holds for all other still-remaining SDSL lines in
> the USA-occupied territories, now that the last CM DSLAM operator has
> bitten the dust.  The unfortunate thing about the Nokia/Covad flavor of
> SDSL is that the bit stream sent toward the CPE (and expected from the
> CPE in return) is that abomination called ATM.  Hence my hardware
> device is essentially a converter between ATM cells on the SDSL side
> and HDLC packets on the V.35 side.
> 
> On my current OSDCU board the conversion is mediated by the CPU, which
> has to handle every packet and manage its reassembly from or chopping
> into ATM cells.  The performance sucks, unfortunately.  I am now
> designing a new version in which the entire Layer 2 conversion
> function will be implemented in a single FPGA.  The CPU will stay out
> of the data path, and the FPGA will contain two independent and
> autonomous logic functions: HDLC->SDSL and SDSL->HDLC bit stream
> reformatters.
> 
> The SDSL->HDLC direction involves no bufferbloat issues: I can set
> things up so that no received packet ever has to be dropped, and the
> greatest latency that may be experienced by any packet is the HDLC
> side (DSU->DTE router) transmission time of the longest packet size
> allowed by the static configuration - and I can statically prove that
> both conditions I've just stated will be satisfied given a rather
> small buffer of only M+1 ATM cells, where M is the maximum packet size
> set by the static configuration, translated into ATM cells.  (For IPv4
> packets of up to 1500 octets, including the IPv4 header, using the
> standard RFC 1483 encapsulation, M=32.)
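
The M=32 figure can be checked with a short sketch. It assumes the
routed LLC/SNAP form of RFC 1483 (8-octet header) plus the 8-octet AAL5
trailer; the function name aal5_cells is made up for illustration and
does not come from the original post.

```python
import math

ATM_PAYLOAD = 48      # payload octets carried by one ATM cell
LLC_SNAP_HEADER = 8   # RFC 1483 LLC/SNAP header for routed IPv4 (assumed encapsulation)
AAL5_TRAILER = 8      # AAL5 CPCS trailer (UU, CPI, length, CRC-32)

def aal5_cells(ip_len: int) -> int:
    """Number of ATM cells needed to carry one IP packet of ip_len octets."""
    pdu = ip_len + LLC_SNAP_HEADER + AAL5_TRAILER
    return math.ceil(pdu / ATM_PAYLOAD)   # AAL5 pads up to a whole cell

M = aal5_cells(1500)
print(M)   # 32
```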
> 
> However, the HDLC->SDSL direction is the tricky one in terms of
> bufferbloat issues, and that's the one I am soliciting advice for.
> Unlike the SDSL->HDLC direction, HDLC->SDSL can't be designed in such
> a way that no packets will ever have to be dropped.  Aside from the
> infamous cell tax (the Nokia SDSL frame structure imposes 6 octets of
> overhead, including both cell headers and SDSL-specific crud, for
> every 48 octets of payload), which is data-independent, the ATM creep
> imposes some data-dependent overhead: the padding of every AAL5 packet
> to the next-up multiple of 48 octets, and the RFC 1483 headers and
> trailers which are longer than their Frame Relay counterparts on the
> HDLC/V.35 side of the DSU.  Both of the latter need to be viewed as
> data-dependent overhead because both are incurred per packet, rather
> than per octet of bulk payload, and thus penalize small packets more
> than large ones.
> 
> Just to clarify, I can set the bit rate on the V.35 side to whatever
> I want (put a trivial programmable clock divider in the FPGA), and I
> can set different bit rates for the DSU->router and router->DSU
> directions.  (Setting the bit rate for the DSU->router direction to at
> least the SDSL bit rate times 1.07 is part of the trick for ensuring
> that the SDSL->HDLC direction can never overflow its tiny buffer.)
> Strictly speaking, one could set the bit rate for the router->DSU
> direction of the V.35 interface so low that no matter what the router
> sends, that packet stream will always fit on the SDSL side without a
> packet ever having to be dropped.  However, because the worst case
> expansion in the HDLC->SDSL direction is so high (in one hypothetical
> case I've considered, UDP packets with 5 octets of payload, such that
> each IPv4 packet is 33 octets long, the RFC 1490->1483 expansion is
> 2.4x *before* the cell tax!), setting the clock so slow that even a
> continuous HDLC line rate stream of worst-case packets will fit is not
> a serious proposition.
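
One way to arrive at the 2.4x figure is sketched below. The post does
not spell out its accounting, so the HDLC-side overhead here (2-octet
Q.922 address, control, NLPID, 2-octet FCS, one shared flag) is an
assumption, and HDLC bit stuffing is ignored.

```python
import math

ATM_PAYLOAD = 48
RFC1483_OVERHEAD = 8 + 8   # LLC/SNAP header + AAL5 trailer
# Assumed HDLC-side accounting: 2-octet Q.922 address, control, NLPID,
# 2-octet FCS, and one flag shared between back-to-back frames.
RFC1490_OVERHEAD = 2 + 1 + 1 + 2 + 1

def expansion(ip_len: int, cell_octets: int = ATM_PAYLOAD) -> float:
    """Octets sent on the SDSL side per octet received on the HDLC side."""
    cells = math.ceil((ip_len + RFC1483_OVERHEAD) / ATM_PAYLOAD)
    return cells * cell_octets / (ip_len + RFC1490_OVERHEAD)

print(expansion(33))       # 96 / 40, i.e. 2.4x before the cell tax
print(expansion(33, 54))   # 108 / 40, i.e. 2.7x with 54-octet Nokia cells
```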
> 
> Thus I have to design the HDLC->SDSL logic function in the FPGA with
> the expectation that the packet stream it receives from the HDLC side
> may be such that it exceeds the line capacity on the SDSL side, and
> because the attached V.35 router "has the right" to send a continuous
> line rate stream of such packets, a no-drop policy would require an
> infinite buffer in the DSU.  Whatever finite buffer size I implement,
> my logic will have to be prepared for the possibility of that buffer
> filling up, and has to have a policy for dropping packets.  What I am
> soliciting from the bufferbloat-experienced minds of this list is some
> advice with the sizing of my HDLC->SDSL buffer and the choice of the
> packet dropping policy.
> 
> Because the smallest indivisible unit of transmission on the SDSL side
> (the output side of the HDLC->SDSL logic function in question) is one
> ATM cell (48 octets of payload + 6 octets of overhead, averaged over
> the rigidly repeating SDSL frame structure), one sensible way to
> structure the buffer would be to provide enough FPGA RAM resources to
> hold a certain number of ATM cells, call it N.  Wire it up as a ring
> buffer, such that the HDLC Rx side adds ATM cells at the tail, while
> the SDSL Tx side takes ATM cells from the head.  With this design the
> simplest packet drop policy would be in the form of a latency limit: a
> configurable register in the FPGA would set the maximum allowed
> latency in ATM cells, call it L.  At the beginning of each incoming
> packet, the HDLC Rx logic would check the number of ATM cells queued
> up in the buffer, waiting for SDSL Tx: if that number exceeds L, drop
> the incoming packet, otherwise accept it, adding more cells to the
> tail of the queue as the bits trickle in from V.35.  The constrainst
> on L is that L+M (the max packet size in ATM cells) must never exceed
> N (the number of cells that the HW is capable of storing).
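
The admission rule described above can be modelled in software as
follows. This is only a behavioural sketch, not FPGA logic: the class
and method names are hypothetical, and a packet's cells are added all
at once rather than trickling in from V.35.

```python
class CellRing:
    """Behavioural model of the ring-of-cells buffer with a latency limit."""

    def __init__(self, n_cells: int, max_packet_cells: int, latency_limit: int):
        # The constraint from the post: L + M must never exceed N.
        if latency_limit + max_packet_cells > n_cells:
            raise ValueError("need L + M <= N")
        self.n = n_cells
        self.m = max_packet_cells
        self.limit = latency_limit
        self.queued = 0          # cells waiting for SDSL Tx

    def offer_packet(self, cells: int) -> bool:
        """Decide at the start of a packet; True means accepted."""
        if cells > self.m:
            raise ValueError("packet larger than configured maximum")
        if self.queued > self.limit:
            return False         # tail drop: queue already holds more than L cells
        self.queued += cells     # simplification: cells arrive all at once
        return True

    def transmit_cell(self) -> None:
        """SDSL Tx takes one cell from the head, if any."""
        if self.queued:
            self.queued -= 1

ring = CellRing(n_cells=64, max_packet_cells=32, latency_limit=32)
print(ring.offer_packet(32), ring.offer_packet(32), ring.offer_packet(1))
```

Because a packet is only admitted when at most L cells are queued, and
it adds at most M cells, the occupancy can never exceed L+M, which is
why N >= L+M is sufficient.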
> 
> If I choose the design just described, I know what M is (32 for the
> standard IPv4 usage), and L would be a configuration parameter, but N
> affects the HW design, i.e., I need to know how many FPGA RAM blocks
> I should reserve.  And because I need N >= L+M, in order to decide on
> the N for my HW design, I need to have some idea of what would be a
> reasonable value for L.
> 
> L is the maximum allowed HDLC->SDSL packet latency measured in ATM
> cells, which directly translates into milliseconds for each given SDSL
> kbps tier, of which there are only 5: 192, 384, 768, 1152 and 1536.
> At 384 kbps, one ATM cell (which has to be reckoned as 54 octets
> rather than 53 because of Nokia SDSL) is 1.125 ms; scale accordingly
> for other kbps tiers.  A packet of 1500 octets (32 ATM cells) will
> take 36 ms to transmit - or just 9 ms at the top SDSL tier of 1536
> kbps.  With the logic design proposed above, the HDLC->SDSL latency of
> every packet (from the moment the V.35 router starts transmitting that
> packet on the HDLC interface to the moment its first cell starts Tx on
> the physical SDSL pipe) will be exactly known to the logic in the FPGA
> the moment the packet begins to arrive from the V.35 port: it
> will be simply equal to the number of ATM cells in the Tx queue at
> that moment.  My proposed logic design will drop the packet if that
> latency measure exceeds a set threshold, or allow it through
> otherwise.  My questions to the list are:
> 
> a) would it be a good packet drop policy, or not?
> 
> b) if it is a good policy, what would be a reasonable value for the
>    latency threshold L?  (In ATM cells or in ms, I can convert :)
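
The cell-to-millisecond conversion quoted above works out as follows
for all five tiers. The helper names are invented for this sketch, and
the 36 ms budget in the last line is just the post's own 1500-octet
example, not a recommended value for L.

```python
import math

CELL_BITS = 54 * 8                      # one cell reckoned as 54 octets on Nokia SDSL
TIERS_KBPS = (192, 384, 768, 1152, 1536)

def cell_time_ms(kbps: int) -> float:
    """Transmission time of one cell; bits / (kbit/s) gives milliseconds."""
    return CELL_BITS / kbps

def cells_for_latency(ms: float, kbps: int) -> int:
    """Largest whole number of cells that fits in a latency budget of ms."""
    return math.floor(ms * kbps / CELL_BITS)

for kbps in TIERS_KBPS:
    print(kbps, cell_time_ms(kbps), 32 * cell_time_ms(kbps))

print(cells_for_latency(36, 384))       # 32 cells
```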
> 
> The only major downside I can see with the approach I've just outlined
> is that it is a tail drop.  I've heard it said in the bufferbloat
> community that tail drop is bad and head drop is better.  However,
> implementing head drop or any other policy besides tail drop with the
> HW logic design outlined above would be very difficult: if the buffer
> is physically structured as a queue of ATM cells, rather than packets,
> then deleting a packet from the middle of the queue (it does no good
> to abort the transmission of a packet already started, hence head drop
> effectively becomes middle drop in terms of ATM cells) becomes quite a
> challenge.
> 
> Another approach I have considered (actually my first idea, before I
> came up with the ring-of-cells buffer idea above) is to have a more
> old-fashioned correspondence of 1 buffer = 1 packet.  Size each buffer
> in the HW for the expected max number of cells M (e.g., a 2 KiB HW RAM
> block would allow M<=42), and have some fixed number of these packet
> buffers, say, 2, 4 or 8.  Each buffer would have a "fill level"
> register associated with it, giving the number of ready-to-Tx cells in
> it, so the SDSL Tx block can still begin transmitting a packet before
> it's been fully received from HDLC Rx.  (In the very unlikely case
> that SDSL Tx is faster than HDLC Rx, SDSL Tx can always put idle cells
> in the middle of a packet, which ATM allows.)  Advantage over the
> ring-of-cells approach: head-drop turned middle-drop becomes easy:
> simply drop the complete buffer right after the head (the one whose Tx
> is already in progress.)  Disadvantage: less of a direct relationship
> between the packet drop policy and the latency equivalent of the
> buffered-up ATM cells for Tx.
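
The "head-drop turned middle-drop" idea can be modelled like this.
The post does not say when the drop is triggered; this sketch assumes
it happens when a new packet arrives and all buffers are occupied. The
class and method names are hypothetical.

```python
from collections import deque

class PacketBuffers:
    """Behavioural model of a fixed pool of one-packet buffers."""

    def __init__(self, n_buffers: int):
        if n_buffers < 2:
            raise ValueError("need at least the in-flight buffer plus one more")
        self.n = n_buffers
        self.queue = deque()     # index 0 is the packet whose Tx is in progress

    def enqueue(self, packet):
        """Add a packet; return the packet dropped to make room, or None."""
        dropped = None
        if len(self.queue) == self.n:
            # Assumed trigger: all buffers full. Drop the oldest packet
            # that has not started transmitting (right after the head).
            dropped = self.queue[1]
            del self.queue[1]
        self.queue.append(packet)
        return dropped

    def tx_done(self):
        """The head packet finished transmitting; free its buffer."""
        return self.queue.popleft() if self.queue else None

bufs = PacketBuffers(4)
for name in "abcd":
    bufs.enqueue(name)
print(bufs.enqueue("e"), list(bufs.queue))   # b ['a', 'c', 'd', 'e']
```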
> 
> Which approach would the bufferbloat experts here recommend?
> 
> TIA for reading my ramblings and for any technical advice,
> 
> Michael Spacefalcon,
> retro-telecom nut
> 
> 
> End of Bloat Digest, Vol 22, Issue 6
> ************************************
> 

-- 
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
davecb at spamcop.net           |                      -- Mark Twain
(416) 223-8968