[Cerowrt-devel] [Bloat] DC behaviors today

Mikael Abrahamsson swmike at swm.pp.se
Mon Dec 18 03:11:40 EST 2017


On Sun, 17 Dec 2017, Benjamin Cronce wrote:

> This is an interesting topic to me. Over the past 5+ years, I've been 
> reading about GPON fiber aggregators (GPON chassis, for lack of a proper
> term) with 400Gb-1Tb/s of uplink, 1-2Tb/s line-cards, and enough GPON 
> ports for several thousand customers.

Yep, they're available if people want to pay for them.

> because they eat up support phone time. They eventually went to a
> non-oversubscribed flat model. He told me that the GPON chassis plugs
> straight into the core router. I asked him about GPON port shared bandwidth
> and the GPON uplink. He said they will not over-subscribe a GPON port, so
> all ONTs on the port can use 100% of their provisioned rate, and they will
> not place more provisioned bandwidth on a single GPON chassis than what
> their uplink can support.

Yes, it makes sense to have no or very small oversubscription of the GPON 
port. The aggregation uplink is another beast. In order to do statistical 
oversubscription, you need lots of customers so the statistics turn into 
an average that doesn't fluctuate too much.
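
A quick back-of-envelope sketch of that averaging effect (my toy model, 
nothing measured): if you treat each customer as an independent on/off 
source, the relative fluctuation of the aggregate shrinks roughly as 
1/sqrt(N), which is why a small aggregation point can't be oversubscribed 
nearly as hard as a big one.

import random
import statistics

def aggregate_demand(n_customers, p_active=0.05, rate_mbps=150):
    """One sample of total instantaneous demand (Mb/s) from independent on/off sources."""
    return sum(rate_mbps for _ in range(n_customers) if random.random() < p_active)

for n in (10, 100, 1000, 10000):
    samples = [aggregate_demand(n) for _ in range(2000)]
    mean = statistics.mean(samples)
    cv = statistics.pstdev(samples) / mean   # relative fluctuation, roughly 1/sqrt(n)
    print(f"{n:>5} customers: mean {mean:9.0f} Mb/s, std/mean {cv:.3f}")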

> loss (not a long time to sample, but was sampled at a rate of 10pps against
> their speed-test server), but 20-40ms pings. I tested this during off
> hours, like 1am.

Yep, sounds like they configured their queues correctly then.

Historically we had 600ms of buffering in backbone platforms at the end of 
the '90s and the beginning of the 2000s; that has slowly eroded over time, 
so now the "gold standard" is down to 30-50ms of buffer. There are 
platforms that have basically no buffer at all, such as 64-port 100G 
equipment with 12-24 megabytes of buffer. These are typically for DC 
deployment, where RTTs are extremely low and special congestion-control 
algorithms are used. They don't work very well for ISP deployments.
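
To put those buffer numbers side by side (simple arithmetic, nothing 
vendor-specific):

def buffer_ms(buffer_bytes, rate_bps):
    """Milliseconds of line-rate traffic the buffer can absorb."""
    return buffer_bytes * 8 / rate_bps * 1000

# The whole 24 MB shared buffer, drained by a single congested 100G port:
print(f"{buffer_ms(24e6, 100e9):.2f} ms")                      # ~1.9 ms

# What the 30-50 ms "gold standard" implies for one 100G backbone port:
for ms in (30, 50):
    gbytes = ms / 1000 * 100e9 / 8 / 1e9
    print(f"{ms} ms at 100 Gb/s needs ~{gbytes:.2f} GB of buffer")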

> to 1Gb/s. I asked them about this. Yes, the 1Gb tier was also not
> over-subscribed. I'm not sure if some lone customer pretty much got their
> own GPON port or they had some WDM-PON linecards.

Sounds like it; not oversubscribing gigabit customers on GPON sounds 
expensive.
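
The arithmetic behind that, using the standard G.984 GPON line rates (the 
150/150 tier is the one you mention further down; WDM-PON or XGS-PON 
line cards would change these numbers):

GPON_DOWN_GBPS = 2.488
GPON_UP_GBPS = 1.244

def max_customers_without_oversub(tier_down_gbps, tier_up_gbps):
    """Customers per PON port if every ONT must be able to run at full rate at once."""
    return min(int(GPON_DOWN_GBPS // tier_down_gbps),
               int(GPON_UP_GBPS // tier_up_gbps))

print(max_customers_without_oversub(0.150, 0.150))  # 150/150 tier -> 8 per port
print(max_customers_without_oversub(1.0, 1.0))      # 1G/1G tier   -> 1 per port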

> I'm currently paying about $40/m for 150/150 for a "dedicated" 
> connection. I'm currently getting about 1ms+-0.1ms pings to my ISP's 
> speedtest server 24/7.  If I do a ping flood, I can get my avg ping down 
> near 0.12ms. I assume this is because of GPON scheduling. Of course I 
> only test this against their speedtest server and during off hours.

Yes, that sounds like the GPON scheduler indeed.
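
A rough way to see where that ~1ms versus ~0.12ms difference can come from 
(my assumption about the mechanism, not something I know about your OLT): 
the ONT can only transmit upstream in grants that recur on some cycle, so 
an isolated ping waits on average half a grant interval, while a flood 
keeps a grant pending almost continuously.

import random

def avg_wait_us(grant_us, n=100000):
    """Mean wait for packets arriving at uniformly random points in a grant cycle."""
    return sum(random.uniform(0, grant_us) for _ in range(n)) / n

for grant_us in (125, 500, 1000):
    print(f"grant every {grant_us:4d} us -> average upstream wait ~{avg_wait_us(grant_us):.0f} us")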

> to be load-balanced by some lower-bits in the IP address. This gave a total

Load-balancing can be done in a lot of different ways. In ISP-speak this 
is typically called "hashing"; some do it on L3 information, some on 
L3/L4, and some look even deeper into the packet.
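
A minimal sketch of the idea (real routers do this in hardware with 
vendor-specific fields and seeds; the "lower bits of the IP address" 
scheme you describe is just one special case):

import hashlib

LINKS = 6

def pick_link(src_ip, dst_ip, proto=6, src_port=0, dst_port=0):
    """Hash the L3/L4 five-tuple and map the flow onto one of the parallel links."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % LINKS

print(pick_link("198.51.100.7", "203.0.113.9", 6, 50123, 443))  # a flow always maps to the same link
print(pick_link("198.51.100.8", "203.0.113.9", 6, 50123, 443))  # different flows spread across links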

> of 6 links. The network admin told me that any given link had enough
> bandwidth provisioned, that if all 5 other links were down, that one link
> would have a 95th percentile below 80% during peak hours, and customers
> should be completely unaffected.

That's a hefty margin. I'd say the prudent approach is to make sure you 
can handle a single fault without customer degradation. However, transit 
isn't that expensive, so it might make sense.
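
For comparison, the headroom in those two planning rules (100G per link is 
my assumption, just to make the numbers concrete):

def max_peak_gbps(n_links, link_gbps, failures, target_util):
    """Highest peak traffic that keeps the surviving links under the utilization target."""
    return (n_links - failures) * link_gbps * target_util

LINKS, LINK_GBPS = 6, 100

# Their rule: any single surviving link stays under 80% -> peak below 80 Gb/s.
print(max_peak_gbps(LINKS, LINK_GBPS, failures=LINKS - 1, target_util=0.8))
# A single-fault rule would already allow peaks up to 400 Gb/s on the same links.
print(max_peak_gbps(LINKS, LINK_GBPS, failures=1, target_util=0.8))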

> what... 300Gb/s or so split among 32 customers? As far as I can tell,
> last mile bandwidth is a solved problem short of incompetence, greed, or 
> extreme circumstances.

Sure, it's not a technical problem. We have the technology. It's a money, 
politics, law, regulation and will problem. So yes, what you said.

> Ahh yes. Statistical over-subscription was the topic. This works well for
> backbone providers where they have many peering links with a heavy mix of
> flows. Level 3 has a blog where they were showing off a 10Gb link where
> below the 95th percentile, the link had zero total packets lost and a
> queuing delay of less than 0.1ms. But above 80%, suddenly loss and jitter
> went up with a hockey-stick curve. Then they showed a 400Gb link. It was at
> 98% utilization for the 95th percentile and it had zero total packets lost
> and a max queuing delay of 0.01ms with an average of 0.00ms.

Yes, this is still as true today as it was in 2002 when this presentation 
was done:

https://internetdagarna.se/arkiv/2002/15-bygganat/id2002-peter-lothberg.pdf

(slides 44-47). The speeds have only increased, but the premise is still 
the same.
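
Both the hockey-stick and the "a fatter pipe can run hotter" effect fall 
out of even the simplest queueing model. An M/M/1 sketch (real traffic is 
burstier, so the knee comes earlier in practice, but the shape and the 
scaling with capacity are the same):

PACKET_BITS = 1500 * 8

def mean_queue_delay_ms(link_bps, utilization):
    """M/M/1 mean time spent waiting in queue: rho / (mu * (1 - rho))."""
    mu = link_bps / PACKET_BITS          # packets per second the link can serve
    return utilization / (mu * (1 - utilization)) * 1000

for util in (0.5, 0.8, 0.95, 0.98, 0.995):
    d10 = mean_queue_delay_ms(10e9, util)
    d400 = mean_queue_delay_ms(400e9, util)
    print(f"utilization {util:5.1%}: 10G ~{d10:7.4f} ms   400G ~{d400:7.4f} ms")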

> There was a major European IX that had a blog about bandwidth planning and
> over-provisioning. They had a 95th percentile in the many-terabits, and
> they said they could always predict peak bandwidth to within 1%
> for any given day. Given a large mix of flow types, statistics is very good.

Indeed, the bigger the aggregation, the more closely the statistics show 
the same behaviour every day.

> On a slightly different topic, I wonder what trunk providers are using for
> AQMs. My ISP was under a massive DDOS some time in the past year and I use
> a Level 3 looking glass from Chicago, which showed only a 40ms delta
> between the pre-hop and hitting my ISP, where it was normally about 11ms
> for that link. You could say about 30ms of buffering was going on. The
> really interesting thing is I was only getting about 5-10Mb/s, which means
> there was virtually zero free bandwidth, but I had almost no packet-loss. I
> called my ISP shortly after the issue started and that's when they told me
> they were under a DDOS and were at 100% trunk, and they said they were
> going to have their trunk bandwidth increased shortly. 5 minutes later, the
> issue was gone. About 30 minutes later I was called back and told the DDOS
> was still on-going, they just upgraded to enough bandwidth to soak it all.
> I found it very interesting that a DDOS large enough to effectively kill
> 95% of my provisioned bandwidth and increase my ping 30ms over normal, did
> not seem to affect packet-loss almost at all. It was well under 0.1%. Is
> this due to the statistical nature of large links or did Level 3 have an
> AQM to my ISP?

This is interesting. I thought about this for several minutes, but I 
can't come up with an explanation for this behaviour, at least not from 
the typical kind of DDoS that's going around. If there was some kind of 
DDoS mitigation equipment put into the mix, that might explain what you 
were seeing.

-- 
Mikael Abrahamsson    email: swmike at swm.pp.se

