[Cerowrt-devel] [Bloat] DC behaviors today

Thu Dec 14 03:22:20 EST 2017

On Wed, 13 Dec 2017, Jonathan Morton wrote:

> Ten times average demand estimated at time of deployment, and struggling 
> badly with peak demand a decade later, yes.  And this is the 
> transportation industry, where a decade is a *short* time - like less 
> than a year in telecoms.

I've worked in ISPs since 1999 or so. I've been at startups and I've been 
at established ISPs.

Traffic growth is kind of an S curve. When you're adding customers you can 
easily see 100%-300% growth per year (or more). Then, after the market 
becomes saturated, growth comes from increased per-customer usage, and for 
the past 20 years or so this has been in the neighbourhood of 20-30% per 
year.
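
To put those growth rates in perspective, here is a rough sketch of how 
quickly traffic doubles at a constant compound rate. The function name is 
just illustrative, and real growth is of course not perfectly constant from 
year to year.

```python
import math

def doubling_time_years(annual_growth):
    """Years for traffic to double at a constant compound annual growth rate.

    annual_growth is a fraction, e.g. 0.25 for 25% per year.
    """
    return math.log(2) / math.log(1 + annual_growth)

# Saturated-market rates (20-30%) versus customer-acquisition rates (100-300%).
for rate in (0.20, 0.30, 1.00, 3.00):
    years = doubling_time_years(rate)
    print(f"{rate:.0%} per year -> traffic doubles in about {years:.1f} years")
```

At 20-30% per year the doubling time works out to somewhere between roughly 
2.6 and 3.8 years, versus a year or less while you're still adding customers.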

When running a network that congests for parts of the day, it's hard to 
tell what "Quality of Experience" your customers will have. I've heard 
horror stories from the 90s where a then-large US ISP was running an OC3 
(155 megabit/s) full most of the day. So someone said "oh, we need to 
upgrade this", and after a while they did, to 2xOC3. Great, right? No, 
after that upgrade both OC3s were completely congested. OK, then upgrade 
to OC12 (622 megabit/s). After that upgrade, evidently that link was not 
congested for a few hours of the day, and of course needed more upgrades.

So at the places I've been, I've advocated for planning rules that say 
that when the link is peaking at 5 minute averages of more than 50% of 
link capacity, an upgrade needs to be ordered. This 50% number can be 
larger if the link aggregates a larger number of customers, because 
typically your "statistical overbooking" varies less the more customers 
participate.
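
A minimal sketch of that planning rule, plus a toy illustration of why the 
threshold can rise with more customers. The function names and sample 
numbers are made up for illustration, and the second helper assumes 
customers are independent and either idle or active with the same 
probability, which real traffic only approximates.

```python
import math

def needs_upgrade(five_min_avg_bps, capacity_bps, threshold=0.50):
    """True if the busiest 5-minute average exceeds threshold * capacity."""
    if not five_min_avg_bps:
        return False
    return max(five_min_avg_bps) > threshold * capacity_bps

def relative_spread(n_customers, p_active):
    """Std deviation / mean of the number of simultaneously active customers.

    Assumes independent on/off customers (a binomial model), which is a
    simplification of real traffic.
    """
    return math.sqrt((1 - p_active) / (n_customers * p_active))

# Hypothetical 5-minute averages (bit/s) on a 10G link.
samples = [3.1e9, 4.4e9, 5.6e9, 4.9e9]
print(needs_upgrade(samples, 10e9))                  # peak is 56% of capacity
print(needs_upgrade(samples, 10e9, threshold=0.60))  # looser rule for a big aggregate

# Relative variation in demand shrinks as more customers share the link.
for n in (100, 10_000):
    print(n, round(relative_spread(n, 0.1), 3))
```

Under that toy model, going from 100 to 10,000 customers cuts the relative 
variation by a factor of ten, which is the intuition behind allowing a 
higher threshold on links that aggregate more customers.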

The devices on these links do not do per-flow anything. They might have 
10G or 100G links to/from them with many, many millions of flows, and it's 
all NPU forwarding. Typically they might do DiffServ-based queueing and 
WRED to mitigate excessive buffering. Today, they typically don't even do 
ECN marking (which I have advocated for, but there is not much support 
from other ISPs in this mission).
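
For anyone not familiar with WRED, here is a rough sketch of the classic 
drop curve with per-class thresholds. This is the textbook RED ramp, not any 
particular vendor's implementation; the class names and threshold values 
are hypothetical, and a real device would feed it a smoothed (moving 
average) queue depth computed in hardware.

```python
def wred_drop_probability(avg_queue, min_th, max_th, max_p):
    """Textbook RED/WRED drop probability for a given average queue depth.

    Below min_th nothing is dropped, at or above max_th everything is
    dropped, and in between the probability ramps linearly up to max_p.
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)

# Hypothetical per-class profiles: (min_th, max_th, max_p), depths in packets.
# The "weighted" part is simply that each class gets its own curve.
PROFILES = {
    "best_effort": (20, 40, 0.10),
    "assured": (30, 40, 0.05),
}

avg_queue = 30
for name, (min_th, max_th, max_p) in PROFILES.items():
    p = wred_drop_probability(avg_queue, min_th, max_th, max_p)
    print(f"{name}: drop probability {p:.3f} at average depth {avg_queue}")
```

Note that nothing here is per flow: the decision depends only on the class 
of the packet and the average depth of the queue it lands in.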

Now, on the customer access line it's a completely different matter. 
Typically people build with BRAS or similar, where (tens of) thousands of 
customers might sit on a (very expensive) access card with hundreds of 
thousands of queues per NPU. This still leaves just a few queues per 
customer, unfortunately. So these do not do per-flow anything either. This 
is where PIE comes in: devices like these can do PIE in the NPU fairly 
easily, because it's kind of like WRED.

So back to the capacity issue. Since these devices typically aren't good 
at assuring per-customer access to the shared medium (backbone links), 
it's easier to just make sure the backbone links are not regularly full. 
This doesn't mean you're going to have 10x capacity all the time, it 
probably means you're going to be bouncing between 25-70% utilization of 
your links (for the normal case, because you need spare capacity to handle 
events that increase traffic temporarily, plus handle loss of capacity in 
case of a link fault). The upgrade might be to add another link, or a 
higher-speed interface tier, bringing the utilization down to typically 
half or a quarter of what you had before.
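
As a back-of-the-envelope sketch of that sawtooth: an upgrade divides peak 
utilization by the capacity factor, and compound growth then pushes it back 
up toward the upgrade threshold. The function names, the 25% growth rate and 
the 50% threshold below are illustrative assumptions, not a rule for any 
particular network.

```python
import math

def utilization_after_upgrade(utilization, capacity_factor):
    """Peak utilization once capacity has been multiplied by capacity_factor."""
    return utilization / capacity_factor

def years_until_threshold(utilization, threshold, annual_growth):
    """Years of compound growth before utilization reaches the threshold."""
    if utilization >= threshold:
        return 0.0
    return math.log(threshold / utilization) / math.log(1 + annual_growth)

peak = 0.50    # link is peaking at the upgrade threshold
growth = 0.25  # assumed per-customer traffic growth per year

# Adding a second link doubles capacity; a higher tier might quadruple it.
for factor in (2, 4):
    after = utilization_after_upgrade(peak, factor)
    years = years_until_threshold(after, 0.50, growth)
    print(f"x{factor} capacity: peak drops to {after:.1%}, "
          f"back at 50% in about {years:.1f} years")
```

With these assumed numbers, doubling capacity buys roughly three years 
before the link peaks at 50% again, and quadrupling buys roughly six.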

-- 
Mikael Abrahamsson    email: swmike at swm.pp.se