[Bloat] some comments on draft-ietf-tsvwg-byte-pkt-congest-10.txt

Dave Taht dave.taht at gmail.com
Mon Jun 17 15:40:52 EDT 2013


This draft "updates" RFC2309 which has already been obsoleted by one
of it's original authors and a replacement draft (
http://tools.ietf.org/html/draft-baker-aqm-recommendation-00 ) is in
progress. Be that as it may...

This draft starts off on the wrong foot and proceeds downhill
rapidly. I am glad that someone is trying to update the BCP for their
current RED usage, but in general I think it is incorrect to
extrapolate from RED's behavior to that of other AQM algorithms in
many cases, so...

"   This document provides recommendations of best current practice for
   dropping or marking packets using any active queue management (AQM)
   algorithm, such as random early detection (RED), BLUE, pre-congestion
   notification (PCN), etc. "

By excluding DRR, SFQ, SQF, Codel, FQ_Codel, PIE and others, and
attempting to generalize from experiences with RED to all AQM
technologies, it does its potential readers a disservice.

I'd change the first sentence to:

"   This document provides recommendations of best current practice for
   dropping or marking packets using the RED active queue management
(AQM) algorithm, using packet drop and congestion notification"

From that, it is possible to generalize the following, although the
backing argument is suspect, which I'll get into later...

" We give three strong recommendations: (1)
   packet size should be taken into account when transports read and
   respond to congestion indications, (2) packet size should not be
   taken into account when network equipment creates congestion signals
   (marking, dropping), and therefore (3) in the specific case of RED,
   the byte-mode packet drop variant that drops fewer small packets
   should not be used.  "
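
For anyone who doesn't have the RED variants paged in, here is a
minimal sketch of the distinction the draft is legislating; the names
and constants are mine, illustrative only, not real RED code:

MAX_PKT = 1500  # bytes; illustrative maximum packet size

def drop_prob_packet_mode(p_base, pkt_len):
    # Packet-mode drop: every packet sees the same drop probability,
    # regardless of its size.
    return p_base

def drop_prob_byte_mode(p_base, pkt_len):
    # Byte-mode drop: drop probability scales with packet size, so
    # small packets (ACKs, VoIP, DNS) are preferentially spared.
    # This is the variant the draft deprecates.
    return p_base * pkt_len / MAX_PKT

Here p_base stands for the drop probability RED computes from its
smoothed estimate of the queue length.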

"This memo updates RFC 2309 to deprecate
   deliberate preferential treatment of small packets in AQM algorithms."

As RFC2309 itself is being obsoleted, we're going around in circles here.

Before tackling byte-pkt-congest-10 directly, a couple of asides...

...snip snip...

RFC2309 refers to this 1994 paper:

"On the Self-Similar Nature of Ethernet Traffic (Extended Version)"

I would certainly like to see the model and analysis of this paper
repeated against modern traffic patterns. Has anyone done this? I
loved this paper when it came out...

The last paragraph in section 2 of RFC2309 I heartily agree with:

"      In short, scheduling algorithms and queue management should be
       seen as complementary, not as replacements for each other."

And if we can agree that AQM = active queue *length* management, and
can come up with a name for FQ+AQM hybrids that works for people (SQM,
smart queue management?) so we know what we're talking about, certain
bits in section 3 get easier to deal with.

Because of the overload on the term AQM, I'm going to use SQM
throughout what I write below.

As one last general example of problems with RFC2309, some of the
references are to protocols so ancient and non-deployed as to render
the relevant arguments moot:

"      voice and video, and also multicast bulk data transport [SRM96].
      If no action is taken, such unresponsive flows could lead to a new
      congestive collapse.

      In general, all UDP-based streaming applications should
      incorporate effective congestion avoidance mechanisms.  For
      example, recent research has shown the possibility of
      incorporating congestion avoidance mechanisms such as Receiver-
      driven Layered Multicast (RLM) within UDP-based streaming
      applications such as packet video [McCanne96] [Bolot94].  Further
      research and development on ways to accomplish congestion
      avoidance for streaming applications will be very important."

It would be nice for a new draft of some sort to be relevant to new
stuff like webrtc, rather than to non-deployed 20-year-old protocols.

...ok back to the pkt-congest draft...

"   Consensus has emerged over the years concerning the first stage: if
   queues cannot be measured in time, whether they should be measured in
   bytes or packets.  Section 2.1 of this memo records this consensus in
   the RFC Series.  In summary the choice solely depends on whether the
   resource is congested by bytes or packets."

Measuring queues in time is totally feasible, as shown by PIE and
codel, including implementations on weak hardware such as MIPS. Last
week I got fq_codel running on a Raspberry Pi and a BeagleBone with no
observable hit on CPU usage vs pfifo_fast... I'll port it to an
Arduino if that's what it takes to make the point. Anybody got a 68020
or slower to play with?
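
For the record, "measured in time" costs one timestamp per packet and
a subtraction at dequeue. A minimal sketch in the style of codel's
sojourn time (illustrative only, not the real algorithm):

import time
from collections import deque

TARGET = 0.005  # seconds; codel's default target is 5ms
queue = deque()

def enqueue(pkt):
    # Stamp each packet with its arrival time.
    queue.append((time.monotonic(), pkt))

def dequeue():
    arrived, pkt = queue.popleft()
    sojourn = time.monotonic() - arrived  # time spent in the queue
    if sojourn > TARGET:
        # codel would count this toward entering its dropping state;
        # the control law is omitted here.
        pass
    return pkt

That is cheap enough for just about any CPU that can forward packets
at all.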

So an alternative formulation that makes sense is:

"  When queues cannot be measured in time, should they be measured in
   bytes or packets?"

And I'd drop the rest of this:

" Section 2.1 of this memo records this consensus in
   the RFC Series.  In summary the choice solely depends on whether the
   resource is congested by bytes or packets."


Moving on:

"   This memo updates [RFC2309] to deprecate deliberate preferential
   treatment of small packets in AQM algorithms.  It recommends that (1)
   packet size should be taken into account when transports read
   congestion indications, (2) not when network equipment writes them.
   This memo also adds to the congestion control principles enumerated
   in BCP 41 [RFC2914]."

s/AQM/the RED/g across the draft

Still... to get to the meat of my own complaint with the draft and the
philosophy espoused within, I'll pull out two example paragraphs and
try to make my argument...

"However, at the transport layer, TCP congestion control is a widely
   deployed protocol that doesn't scale with packet size.  To date this
   hasn't been a significant problem because most TCP implementations
   have been used with similar packet sizes.  But, as we design new
   congestion control mechanisms, this memo recommends that we should
   build in scaling with packet size rather than assuming we should
   follow TCP's example."

"Although many control packets happen to be
   small, the alternative of network equipment favouring all small
   packets would be dangerous.  That would create perverse incentives to
   split data transfers into smaller packets."

Packets were (at least originally) "a small group or package of anything".

I view the incentives to create larger packets as *far, far, far, far
more perverse* than the incentives to create smaller ones. There will
always be header overhead, there will always be small signalling
packets, and there will always be bulk data that can be broken up into
packets of smaller size, and that gets in the way of more interactive
traffic if it isn't.

In my talks of late, I toss a 64GB memory stick across the room at
someone (wow! check out that bandwidth! 64GB/sec!!) and then challenge
someone to try and read the last byte of a file on it sequentially.

(hint: most USB sticks barely do 8MB/sec, so that last byte is
64GB / 8MB/sec = 8000 seconds, well over two hours, away)

If it were up to me I'd have held the internet's MTU to 576 bytes
until everybody was running at greater than 1Mbit, and the ratio
between the largest packet and the smallest held to less than 10:1
for eternity.

Big packets affect latency, badly. They are also subject to much
higher potential rates of error. At 1500 bytes, header overhead, even
with ipv6, is pretty minimal, and few gains can be had by making
packets bigger. Jumbo frames are accepted in only a few circumstances,
etc.
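
To put a number on the latency point: serialization delay is packet
size divided by link rate, so a single 1500 byte packet occupies a
1Mbit link for 12ms (1500 * 8 / 1,000,000 seconds), and every small
interactive packet queued behind it eats that as added delay. The
error-rate point is the same sort of arithmetic: at a fixed bit error
rate b, a packet of L bits survives with probability (1-b)^L, which
falls off as packets grow.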

So the argument and conclusions about larger packet sizes that
permeate this document are the opposite of the argument I'd make,
throughout.

So, as one example:

"6. Security Considerations

   This memo recommends that queues do not bias drop probability towards
   small packets as this creates a perverse incentive for transports to
   break down their flows into tiny segments.  One of the benefits of
   implementing AQM was meant to be to remove this perverse incentive
   that drop-tail queues gave to small packets."

I think the author has got the intended statement wrong ("away from"
rather than "towards"?)...

And my take on it is that drop tail's behavior towards small packets
was indeed very desirable, and should be retained in an SQM, to keep
latencies low on overload and to create incentives for right-sizing
packets in general as per their actual transport needs.

"In
   summary, it says that making drop probability depend on the size of
   the packets that bits happen to be divided into simply encourages the
   bits to be divided into smaller packets. "

YEA!

 " Byte-mode drop would
   therefore irreversibly complicate any attempt to fix the Internet's
   incentive structures."

s/complicate/enhance/

The document refers to things like RFC5690, and I long ago lost hope for ECN.

The document then makes some blanket statements that aren't backed by
any data that I'm aware of.

Section 6

"   In practice, transports cannot all be trusted to respond to
   congestion.  So another reason for recommending that queues do not
   bias drop probability towards small packets is to avoid the
   vulnerability to small packet DDoS attacks that would otherwise
   result.  One of the benefits of implementing AQM was meant to be to
   remove drop-tail's DoS vulnerability to small packets, so we
   shouldn't add it back again."

I am only aware of a few small-packet attacks (most UDP DNS attacks
actually try to do amplification). A strict DDoS based on small
packets can be made effective against drop tail and somewhat effective
against RED.

Most attacks I've looked at are actually MUCH less effective against
an SQM like fq_codel, given the random hash and equal service
guarantees. I would certainly like to take a hard look at the attack
tools that led to Robust RED; everything in section 6 seems highly
theoretical and needs proof.
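
A minimal sketch of why, assuming simplified flow queueing (real
fq_codel uses 1024 queues, a random per-boot hash seed, and a
byte-based DRR quantum, none of which are shown faithfully here):

import random

NUM_QUEUES = 1024
SEED = random.getrandbits(32)  # attacker can't predict queue placement
queues = [[] for _ in range(NUM_QUEUES)]
next_q = 0

def enqueue(five_tuple, pkt):
    # Flows hash into separate queues, so a flood lands in its own
    # bucket(s) instead of crowding out everyone else.
    queues[hash((SEED, five_tuple)) % NUM_QUEUES].append(pkt)

def dequeue():
    # Round-robin over non-empty queues: however hard one flow (or one
    # botnet source) pushes, it only gets its share of the link.
    global next_q
    for i in range(NUM_QUEUES):
        q = queues[(next_q + i) % NUM_QUEUES]
        if q:
            next_q = (next_q + i + 1) % NUM_QUEUES
            return q.pop(0)
    return None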

Lastly:

Section 5.2 needs to be thought about in the age of wireless, which is
often TXOP-congested (the per-transmit-opportunity airtime overhead
dominates, so the cost is per transmission opportunity rather than per
byte or per packet) rather than congested by either of the two
resources identified, and this is kind of pressing.

"5.2. Bit- & Packet-congestible Network


   The position is much less clear-cut if the Internet becomes populated
   by a more even mix of both packet-congestible and bit-congestible
   resources (see Appendix B.2).  This problem is not pressing, because
   most Internet resources are designed to be bit-congestible before
   packet processing starts to congest (see Section 1.1)."


