[Bloat] Thoughts on Stochastic Fair Blue

Jonathan Morton chromatix99 at gmail.com
Sun Apr 3 14:03:53 EDT 2011


On 3 Apr, 2011, at 7:35 pm, Juliusz Chroboczek wrote:

> Sorry for the delay, but I wanted to think this over.
> 
>> My second observation is that marking and dropping both happen at the
>> tail of the queue, not the head.  This delays the congestion information
>> reaching the receiver, and from there to the sender, by the length of
>> the queue.
> 
> It's very difficult to drop at the head of the queue in SFB, since we'd
> need to find a suitable packet that's in the same flow.  Since SFB makes
> heroic efforts to keep the queue size small, this shouldn't make much of
> a difference.
> 
> Your suggestion is most certainly valid for plain BLUE.

If the queue is overfull, then the efforts at proactive congestion control have failed and tail-dropping is fine.

I was thinking more of the probabilistic marking/dropping that occurs under normal conditions, which should happen at the head of the queue to minimise feedback latency.  The dequeue path has to decrement the bucket counters anyway, so moving the probability check there shouldn't add any real overhead.
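
To make this concrete, here is a toy sketch in plain C of a single bucket's dequeue path with the marking decision moved to the head.  The structures and names are invented for illustration - this is not the real sch_sfb code:

    #include <stdbool.h>
    #include <stdlib.h>

    struct pkt {
        struct pkt *next;
        bool ecn_marked;
    };

    struct bucket {
        struct pkt *head, *tail;
        unsigned qlen;
        double p;                    /* BLUE marking probability */
    };

    /* Dequeue from the head, applying the probabilistic mark here
     * instead of at enqueue.  The qlen decrement already happens on
     * this path, so the extra cost is one random draw - and the
     * congestion signal leaves with the very next packet instead of
     * sitting behind a whole queue's worth of delay. */
    struct pkt *bucket_dequeue(struct bucket *b)
    {
        struct pkt *skb = b->head;

        if (!skb)
            return NULL;
        b->head = skb->next;
        if (!b->head)
            b->tail = NULL;
        b->qlen--;
        if ((double)rand() / RAND_MAX < b->p)
            skb->ecn_marked = true; /* or drop and take the next one */
        return skb;
    }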

>> Another observation is that packets are not re-ordered by SFB, which
>> (given the Bloom filter) is potentially a missed opportunity.
> 
> What's your suggestion?
> 
>> However, they can be re-ordered in the current implementation by
>> using child qdiscs such as SFQ
> 
> I don't see how that buys you anything.  SFB is very aggressive with
> dropping packets when under congestion, and the packet drop happens
> *before* the child qdisc gets a chance to see the packet; hence, putting
> SFQ below SFB won't buy you much, it'll just reorder packets in the
> small queue that SFB allows.  Or am I missing something?

To an analogue modem or a GPRS connection, even a single default-sized SFB bucket can take a significant time to drain.  This gets worse if there are several independent flows.  (And this is not necessarily abuse by the end-user, but a legitimate result of going out of range of the 3G signal while actively using the connection.)

Consider the extreme case, where you have a dozen flows each filling their bucket with a dozen packets, and then a DNS reply packet arrives.  Without packet reordering, the DNS packet must wait for 144 packets to get out of its way, which could take tens of seconds, so the DNS resolver will time out.  With reordering, it has to wait for at most 12 packets (possibly fewer still with DRR, which I haven't investigated in detail), which is fast enough that the connection, though very sluggish, is still functional.
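
As a back-of-envelope check - the 1500-byte packets and the 56kbps modem speed are my assumptions, not figures from above:

    144 packets * 1500 bytes * 8 bits/byte = 1,728,000 bits
    1,728,000 bits / 56,000 bit/s         ~= 31 seconds

     12 packets * 1500 bytes * 8 bits/byte =  144,000 bits
       144,000 bits / 56,000 bit/s        ~= 2.6 seconds

A typical resolver retries after a few seconds and gives up long before 31 have passed, so this is precisely the difference between a dead lookup and a slow one.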

Under a less extreme scenario, suppose I'm downloading an iApp update to my phone, and meanwhile it decides to check for new email - IMAP being a relatively chatty request-response protocol, without large amounts of data being transferred.  Then the train goes under a bridge, suddenly I'm on GPRS, and the tower's queue fills up with the iApp download.  With packet reordering, the IMAP session keeps going fast enough to download a new email in a few seconds, and then happily gets out of the way.  Without it, the iApp download monopolises the connection and the IMAP job takes minutes to complete.  Email is a canonical low-bandwidth application; users reasonably expect it to Just Work.

Bear in mind also that my 3G data subscription, though unlimited in per-month traffic, is capped at 500Kbps instantaneous, and is therefore only one order of magnitude faster than an analogue modem.  At that speed packet reordering wouldn't have as dramatic an effect on functionality as at modem speeds, but it would still make the connection feel better to use.  Given that aggregating IP packets into over-the-air frames is normal practice, reordering would effectively put one packet from each of several flows into each frame - a nice form of interleaving that also helps to reduce the impact of transitory unreachability.

If you kept a list of packets in each bucket, rather than just a count of them, then you could simply iterate over the buckets (in all rows) when dequeueing, doing a *better* job than SFQ because you have a different hash salt in each row.  You would then need to delete each dequeued packet's entry from the relevant bucket in every row - this *is* extra overhead, but it's probably no greater than the cost of passing each packet down to SFQ.
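
For concreteness, here is a self-contained C sketch of that data structure.  All names and sizes are invented, and it round-robins over a single row for brevity - since every row already partitions the flows, one row is enough for SFQ-style fairness, though walking all rows as suggested would smooth out hash collisions further:

    #include <stddef.h>

    #define ROWS    8
    #define BUCKETS 16

    /* Each packet is linked into one bucket per row.  The links are
     * doubly linked so that removal from the non-dequeuing rows is
     * O(1); slot[] caches the per-row bucket index computed by the
     * salted hashes at enqueue time (hashing not shown). */
    struct pkt {
        struct pkt *prev[ROWS], *next[ROWS];
        unsigned slot[ROWS];
    };

    struct bucket {
        struct pkt *head, *tail;
        unsigned qlen;
    };

    struct sfb {
        struct bucket row[ROWS][BUCKETS];
        unsigned rr;                 /* round-robin cursor, row 0 */
    };

    /* Append p to its bucket in every row. */
    void link_all_rows(struct sfb *q, struct pkt *p)
    {
        for (unsigned r = 0; r < ROWS; r++) {
            struct bucket *b = &q->row[r][p->slot[r]];

            p->prev[r] = b->tail;
            p->next[r] = NULL;
            if (b->tail)
                b->tail->next[r] = p;
            else
                b->head = p;
            b->tail = p;
            b->qlen++;
        }
    }

    /* Unlink p from its bucket in every row - the extra overhead
     * mentioned above, O(ROWS) pointer updates per packet. */
    void unlink_all_rows(struct sfb *q, struct pkt *p)
    {
        for (unsigned r = 0; r < ROWS; r++) {
            struct bucket *b = &q->row[r][p->slot[r]];

            if (p->prev[r])
                p->prev[r]->next[r] = p->next[r];
            else
                b->head = p->next[r];
            if (p->next[r])
                p->next[r]->prev[r] = p->prev[r];
            else
                b->tail = p->prev[r];
            b->qlen--;
        }
    }

    /* Take one packet from the next non-empty bucket of row 0. */
    struct pkt *sfb_fair_dequeue(struct sfb *q)
    {
        for (unsigned i = 0; i < BUCKETS; i++) {
            unsigned s = (q->rr + i) % BUCKETS;
            struct pkt *p = q->row[0][s].head;

            if (p) {
                q->rr = (s + 1) % BUCKETS;
                unlink_all_rows(q, p);
                return p;
            }
        }
        return NULL;                 /* queue is empty */
    }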

 - Jonathan



