[Bloat] philosophical question

Mon May 30 11:29:44 EDT 2011

On Mon, May 30, 2011 at 5:25 AM, Dave Taht <dave.taht at gmail.com> wrote:
>
>
> On Sun, May 29, 2011 at 10:24 PM, George B. <georgeb at gmail.com> wrote:
>>
>> Ok, say I have a network with no over subscription in my net.
>
> I'd love to see one of those. Can I get on it?

Well, we currently have some potential for microburst oversubscription
inside the data center, but not much of it.  A 48-port GigE switch has
40G of uplink, and the switches aren't fully populated yet.  The
bottlenecks are currently where we might have 25 front-end servers
talking on GigE to a backend server with 20G.  So there is some
potential for internal microburst oversubscription, but that's beyond
the scope of this discussion.
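
As a rough worked example of those numbers (a sketch only: the helper
name is made up for illustration, and it assumes every port sends at
line rate at the same instant, which is the worst case for a
microburst):

```python
def oversubscription_ratio(ports: int, port_gbps: float, uplink_gbps: float) -> float:
    """Worst-case offered load divided by the capacity it has to fit into."""
    return (ports * port_gbps) / uplink_gbps


# A fully populated 48-port GigE switch with 40G of uplink.
switch_ratio = oversubscription_ratio(ports=48, port_gbps=1.0, uplink_gbps=40.0)

# 25 front-end servers on GigE all talking to one backend server with 20G.
backend_ratio = oversubscription_ratio(ports=25, port_gbps=1.0, uplink_gbps=20.0)

print(f"switch: {switch_ratio:.2f}:1, backend: {backend_ratio:.2f}:1")
# prints "switch: 1.20:1, backend: 1.25:1"
```

Both ratios are only slightly above 1:1, which is why the exposure is
limited to short bursts rather than sustained overload.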

>>
>> I have
>> 10G to the internet but am only using about 2G of that.  This is the
>> server side of a network talking to millions of clients.  The clients
>> in this case are on "lossy" wireless networks where packet loss is not
>> an indication of congestion so much as it is an indication that the
>> client moved 15 feet behind a pole and had poor network connectivity
>> for a few minutes.
>>
> Or is using multicast.

Multicast is a fact of life that we are going to have to learn to
live with.  Better to get the gear to handle it properly, in my
opinion.

>> The idea being that in today's internet, packet loss is not a good
>> indication of congestion.  Often it just means that the radio signal
>> has been briefly interrupted.  What I need is something that can tell
>> the difference between real congestion and radio loss.  ECN seems to
>> be the way forward in that respect.
>>
> Yes. When it works. Which is rarely.

I have enabled ECN (I have been following the various bufferbloat
discussions for a while) on a couple of machines and also on my own
machine (to see whether it causes any problems while browsing), without
any problems so far.  "Back in the day" when ECN first came out on
Linux, it was enabled by default and caused all sorts of issues with
sites that simply dropped packets with any of the ECN bits set.  So far
I haven't run into any issues with ECN set on my Windows laptop.  Once
I am convinced that setting those bits isn't going to cause problems, I
will roll it out more generally.  But if networks upstream from us
clear those bits anyway, I'm not sure what difference it will make.
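
For the Linux machines, a minimal sketch of checking the current
setting.  It assumes the standard net.ipv4.tcp_ecn sysctl exposed under
/proc, with the value meanings as I understand them from the kernel's
ip-sysctl documentation; the function names are made up for
illustration, and the Windows side is not covered here:

```python
from pathlib import Path

# Meanings of net.ipv4.tcp_ecn, paraphrased from the kernel's ip-sysctl docs.
TCP_ECN_MODES = {
    0: "ECN disabled",
    1: "ECN requested on outgoing connections and accepted on incoming ones",
    2: "ECN accepted on incoming connections only, not requested on outgoing ones",
}


def describe_tcp_ecn(raw: str) -> str:
    """Translate the raw sysctl text (e.g. '2\\n') into a description."""
    value = int(raw.strip())
    return TCP_ECN_MODES.get(value, f"unrecognized value {value}")


def read_tcp_ecn(path: str = "/proc/sys/net/ipv4/tcp_ecn") -> str:
    """Read the sysctl from /proc; report politely if it is not there."""
    sysctl = Path(path)
    if not sysctl.exists():
        return "tcp_ecn sysctl not available on this system"
    return describe_tcp_ecn(sysctl.read_text())


print(read_tcp_ecn())
```

A server that should actively negotiate ECN with its clients would
need the value 1; the value 2 only answers peers that ask for it.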

There is also one fairly small subnet in the overall network where I
have enabled "random-detect ecn" with a policy map on a potentially
oversubscribed link.  But that is the only router in the network that
even supports ECN.  I have sent an inquiry to the manufacturer of the
rest of the gear about supporting ECN with their WRED implementation
but haven't heard anything from them on the subject.
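
To make it concrete what I am asking the vendors for, here is a
simplified sketch of a RED-style decision with ECN marking.  This is
not the router's actual implementation: the thresholds, the linear
probability ramp, and the function name are assumptions for
illustration only.

```python
import random


def wred_action(avg_queue, min_th, max_th, max_p, ect, rng=random.random):
    """Return 'forward', 'mark', or 'drop' for one arriving packet.

    avg_queue -- averaged queue depth (packets)
    min_th    -- below this depth nothing is marked or dropped
    max_th    -- at or above this depth every packet is dropped
    max_p     -- mark/drop probability just below max_th
    ect       -- True if the packet's ECN field says the flow is ECN-capable
    """
    if avg_queue < min_th:
        return "forward"
    if avg_queue >= max_th:
        # Queue is too deep for marking to help; drop regardless of ECN.
        return "drop"
    # Probability ramps linearly from 0 at min_th up to max_p at max_th.
    probability = max_p * (avg_queue - min_th) / (max_th - min_th)
    if rng() < probability:
        # ECN-capable flows get a congestion mark instead of a loss.
        return "mark" if ect else "drop"
    return "forward"


print(wred_action(avg_queue=25, min_th=10, max_th=40, max_p=0.1, ect=True))
```

The only difference ECN makes is in the middle region: the same
packet that would have been dropped early is marked and delivered.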

>> But assuming my network, as a server of content is not over
>> subscribed, what would you suggest as the best qdisc for such a
>> traffic profile? In other words, I am looking at this from the server
>> aspect rather than from the client aspect.
>>
>
> Ah, ok. This was discussed in this loooong thread:
>
> https://lists.bufferbloat.net/pipermail/bloat/2011-March/000272.html
>
> Some form of fair queuing distributes the load to the ultimate end nodes
> better.

Ok, as we are using Linux (mostly) for the servers talking to the
clients, it shouldn't be much of an issue to put into place.  Thanks
for the pointer to the thread; I will watch as things develop and
see how things go.
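
For my own notes, the basic idea of fair queuing as I understand it,
as a toy sketch.  This is plain per-flow round-robin counted in
packets, not any particular Linux qdisc, and the class and flow names
are made up for illustration:

```python
from collections import OrderedDict, deque


class RoundRobinScheduler:
    """Serve one packet per flow in turn so no single flow hogs the link."""

    def __init__(self):
        self.flows = OrderedDict()  # flow id -> deque of queued packets

    def enqueue(self, flow_id, packet):
        self.flows.setdefault(flow_id, deque()).append(packet)

    def dequeue(self):
        if not self.flows:
            return None
        # Take the flow at the front of the rotation.
        flow_id, queue = next(iter(self.flows.items()))
        packet = queue.popleft()
        del self.flows[flow_id]
        if queue:
            # Flow still has packets: move it to the back of the rotation.
            self.flows[flow_id] = queue
        return packet


sched = RoundRobinScheduler()
for pkt in ("a1", "a2", "a3"):
    sched.enqueue("client-a", pkt)
sched.enqueue("client-b", "b1")

order = [sched.dequeue() for _ in range(4)]
print(order)  # prints ['a1', 'b1', 'a2', 'a3']
```

With a single FIFO, client-b would have waited behind all three of
client-a's packets; here it is served second.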

> As for which packet scheduler to choose for that? Don't know, I'm just
> trying to get to where we can actually test stuff on the edge gateways at
> this point.

Yeah, what I am most interested in are things like
smartphones/laptops/tablets, not only on WiFi but also on 3G/4G
networks. Those things are pulling a lot of traffic these days and the
network can be lossy at times.  From my own analysis of traffic
captures, it is fairly easy to see when a device that is "on the move"
changes cell towers.  You get a burst of resends and often some out of
order packets and then things settle down for a while.  This isn't
such a big deal if you have only a few mobile clients, but sites that
cater to mobile content might have millions of such clients connected
at any given time with many of them in a state where they have
marginal connectivity or are in the process of moving between towers.
So the TCP notion that "packet loss == congestion" doesn't apply in
those networks.  With those, packet loss is just packet loss and
shouldn't be treated as congestion.  This is why I think it is so
important to get ECN working across the Internet.  But even with ECN
capable end points, if the routers in the middle are not capable of
using ECN to signal congestion and simply drop packets, there is
always a question of why the packet was lost.
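
To spell out what the end points can and cannot learn, here is a small
sketch of the ECN field itself: the two low-order bits of the IPv4 TOS
(or IPv6 traffic class) byte, with the codepoints as defined in RFC
3168.  The helper names are made up for illustration.

```python
# ECN codepoints from RFC 3168 (two low-order bits of the TOS byte).
ECN_CODEPOINTS = {
    0b00: "Not-ECT",  # sender is not using ECN
    0b10: "ECT(0)",   # ECN-capable transport
    0b01: "ECT(1)",   # ECN-capable transport
    0b11: "CE",       # congestion experienced, set by a router
}


def ecn_codepoint(tos_byte: int) -> str:
    """Name the ECN codepoint carried in a TOS / traffic class byte."""
    return ECN_CODEPOINTS[tos_byte & 0b11]


def clear_ecn(tos_byte: int) -> int:
    """What a middlebox does when it zeroes the ECN bits in transit."""
    return tos_byte & 0b11111100


marked = 0x03  # a packet on which a router has set CE
print(ecn_codepoint(marked))             # prints "CE"
print(ecn_codepoint(clear_ecn(marked)))  # prints "Not-ECT"
```

A packet that arrives with CE set tells the receiver explicitly that a
router was congested.  A packet that never arrives, or one whose bits
were cleared on the way, tells it nothing about why.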

We need to hammer on our vendors a bit and get them properly
supporting ECN to signal congestion on ECN aware flows.

> Dave Täht

Thanks, all!

g