[Cake] inbound cake or fq_codel shaping fails on cable on netflix reno

Mon Jul 23 02:50:00 EDT 2018

On 22 July 2018 at 08:40, Arie <nospam at ariekanarie.nl> wrote:
> You're now the third person where I've seen the hping3 trick work, I'm
> pretty pleased it's with Mr Bufferbloat himself ;)
>
> I figured this out by accident some time ago. I used fq_codel and later cake
> to keep people happy at a small LAN party where 10 people shared a 40Mbit
> DOCSIS connection. Year after year we kept renting the same vacation homes,
> with the same old Cisco EPC3295 cable modems. We used to bring some weird
> pfsense box with tons of custom game-based rules to keep latencies under
> control, but we just replaced that with an openwrt box running fq_codel one
> day, way simpler and better latencies (thank you and the other bufferbloat
> people!)
>
> Suddenly one year, my openwrt/cake box was no longer able to keep the
> latency under control and people started complaining. I noticed that while
> an upload was running, the latency was fine (despite someone hogging the
> downstream with big downloads). As soon as the upload stopped, the big
> download started to cause ping spikes again.
> After some testing, I was able to use the hping3 trick to send the minimum
> needed upstream traffic to keep pings low, LAN party saved.
>
> Meanwhile on my home connection I used a similar DOCSIS modem. I'd always
> been able to just shape my connection close to the advertised rates. One
> day, latencies (and DSLReports bufferbloat score) got bad. Interestingly,
> flent RRUL results reported lower latencies during the test run than during
> the idle period before and after the test. Again I could use the same hping3
> trick to "fix" it. I've asked the bufferbloat mailing list to see if anyone
> knew what was going on, but nothing came of it.
> My ISP kept pushing new DOCSIS modems, so I took my chances despite it using
> a puma 6 chipset (TG2492LG). This one is fine without the hping trick, just
> like my old modem used to be.
>
> Here's what I learned about some cable modems with my particular ISP (Ziggo,
> the Netherlands) in my specific region:
> Cisco EPC3212 (DOCSIS 3.0 8x4), used to work fine, now gets big latency
> spikes regardless of the shaped rate.
> Technicolor 7200 (DOCSIS 3.0 8x4), still works fine.
> Arris  TG2492LG (DOCSIS 3.0 24x8), shaping works just fine, latency is under
> control, but it has a puma 6 chip which causes latency spikes in TCP and
> ICMP packets. UDP does not seem to be affected.
>
>
> On 22 July 2018 at 00:13, Dave Taht <dave.taht at gmail.com> wrote:
>>
>> wow. That is the best dslreports test result for cable I have ever had.
>>
>> with hping3 -2 -d 0 -s 10080 -k -p 80 -i u150 96.120.89.153
>>
>> http://www.dslreports.com/speedtest/36209937
>>
>> Without:
>>
>> http://www.dslreports.com/speedtest/36210095
>>
>> You say a different cablemodem does better without hping3 running? which?
>> :)
>>
>> Most of my production gear is based on an older arris modem (which is
>> quite good), most of my test gear is a bunch of netgear (free) modems
>> and service I got free from my time working for comcast.
>>
>> I haven't got around to springing for a docsis 3.1 modem yet (they are
>> awfully pricy).
>> On Sat, Jul 21, 2018 at 2:37 PM Dave Taht <dave.taht at gmail.com> wrote:
>> >
>> > or, another way we might look at it is there is very little we can do
>> > as the cmts has to have data in it in order to burst schedule the mac
>> > for the next string of packets, much like how fq_codel for wifi has
>> > "one in the hardware, one ready to go", a cmts has at least one in the
>> > hardware (per channel? a multiple? what?).
>> >
>> > or I could be on drugs entirely. And this thread did start with
>> > fast.com misbehaving badly regardless of the shaper in place or not,
>> > which is not what I'm looking at now.  I need to setup a 45ms rtt
>> > test...
>> >
>> > anyway, as per your suggestion, the latency gets MUCH better with your
>> > hping3 idea running, which implies that we've  been fooled all along
>> > by the rrul test. On the other hand, I think this will hurt other
>> > cable modems on the same wire. On the gripping hand, I'm happier
>> > knowing that with a busier network, docsis cable, when shaped, gets
>> > better, and that I should junk my existing test cablemodem due to the
>> > persistent spikes I see.
>> >
>> > I wonder if it's the sent path or the return path shattering latency
>> > so well? I wonder if hping3 would count against your badwidth cap?
>> >
>> > going back to trying to figure out why fast.com is so gnarly
>> > On Sat, Jul 21, 2018 at 2:18 PM Arie <nospam at ariekanarie.nl> wrote:
>> > >
>> > > I had a similar issue with my previous cable modem, whatever I shaped
>> > > to didn't matter, I still had long delays. I "fixed" it by continuously
>> > > sending a stream of empty UDP packets upstream:
>> > >
>> > > hping3 -2 -d 0 -s 10080 -k -p 80 -i u150 IP-OF-FIRST-OUTSIDE-CABLE-HOP-HERE
>> > >
>> > > On 21 July 2018 at 22:36, Dave Taht <dave.taht at gmail.com> wrote:
>> > >>
>> > >> This is my "inbound trying to shape a cable connection" smoking gun.
>> > >> The delay curve is the same
>> > >> shaping the 110mbit cmts down to 85mbit OR 55mbit.
>> > >>
>> > >> _______________________________________________
>> > >> Cake mailing list
>> > >> Cake at lists.bufferbloat.net
>> > >> https://lists.bufferbloat.net/listinfo/cake
>> > >>
>> > >
>> >
>> >
>> > --
>> >
>> > Dave Täht
>> > CEO, TekLibre, LLC
>> > http://www.teklibre.com
>> > Tel: 1-669-226-2619

I can independently confirm the "hping trick" for DOCSIS. In my case
this wasn't a latency concern, but an accidental workaround for
crippling packet loss on a remote connection. Also I used fping :)

Warning: history lesson / rant.

In Australia we had the very peculiar circumstance of the privatised
incumbent telco Telstra having overbuilt both its own copper telephone
network and its competitor's once-threatening HFC network (Optus) with
its own HFC to protect its landline telephone profits. Now, two
decades later, the government has established a new monopoly wholesale
network operator (NBN Co) which, after a subsequent change of
government, has effectively agreed to acquire all three of the above
networks from the existing operators.

Cable TV never gained much traction in Australia for a number of
reasons, mostly a healthy free-to-air broadcast industry and the rapid
advance of satellite TV around the same time that the HFC networks
were being rolled out. The monopoly pay TV retailer Foxtel prefers to
install a satellite dish rather than run a coax drop to access their
leased spectrum on each of the HFC networks. So only about 1/3 of
premises in Australia are passed by HFC, only about 2/3 of those are
connected, and of those only about half had an active service of any
kind over the coax.

The proud new owners of two HFC networks and one copper network
decided that each area will get a different form of access depending
on the existing networks, the so-called "multi-technology mix" / MTM.
The Optus HFC network was found to have very large segments, and given
that its coverage is almost completely duplicated by the other HFC
network, it has been abandoned. The Telstra HFC network mentioned
above with ~1/3 of covered premises having an active service is to
replace the copper telephone network where it exists. And where there
is only copper, most will get vectored VDSL2 deployed as FTTN. A lucky
few have an overbuilt GPON FTTP network from the previous government's
rollout (aborted due to cost except for new communities), and weird
gaps between these technology boundaries are now being plugged with
GPON-backhauled unvectored VDSL2 FTTC with reverse-powered DPUs. Oh,
and apartments get vectored VDSL2 FTTB. And in regional areas they've
built a TD-LTE network for fixed wireless. And they deployed two
geostationary bent-pipe satellites and a network of ground stations
for rural areas. Connections via this alphabet soup of technologies
are all supposed to be unified, wholesaled to retail ISPs as a single
product construct (except for satellite, and now due to capacity
constraints fixed wireless also has extra pricing asterisks). So this
is a project with unprecedented scope and complexity within Australia,
and is unsurprisingly running behind schedule and over budget.

Back to HFC/DOCSIS. For the most part, Australia conforms to European
standards which means a 65MHz upstream bandwidth split for HFC, much
more generous than the 42MHz split in North America. The incumbent
operator uses EuroDOCSIS 3.0, with 4 upstream channels fitting
comfortably in the top, clean part of the upstream spectrum. The new
operator taking over the network has installed their CMTS in parallel,
operating on the lower half of the usable upstream spectrum with, as I
recall, 3 upstream channels. The incumbent operator offers a maximum
speed tier of 120/5.5. The new operator offers 100/40 with less
upstream bandwidth available, and that bandwidth occupies the noisiest
end of the spectrum.
They are also expanding the network to 100% coverage, installing drops
to existing premises on an opt-out basis so that they can decommission
the copper telephone network as soon as possible. And of course, they
are aggressively onboarding customers with migration agreements in
place as the government-sanctioned monopoly fixed line network
operator. In order to get the target speeds out of the small amount of
noisy spectrum, they are being very aggressive with modulation etc.

The problem is, they are moving too fast and tripping over themselves.
With all the new drops and new subscribers, the noise floor in the
upstream is rising rapidly. In the early stages of the rollout this
was managed acceptably: an audit was undertaken, modelling performed,
nodes split, and the plant swept for noise ingress, all before releasing
the 'new' network to subscribers and around the same time that the
bulk of new drops were being added. Then the bean counters demanded
more revenue sooner, and they got greedy by rapidly releasing the
network in a large number of areas that were not foreseen to require
much more work to connect the missing premises. Literally the only
work performed on the plant prior to releasing most of these areas was
to replace the existing node with a segmentable node (but not actually
segmented). Technicians were sent out to perform every installation,
even where the drop already existed and only a new modem was needed
(mostly to babysit IT issues provisioning the modems, as they are
using BSoD/MPLS L2VPN to deliver their wholesale-only service). This
scheme quickly caught up with them, and a brewing PR nightmare
culminated in the indefinite suspension of all new connections to the
network. Now, nearly a year later, they have only re-released a small
amount of the network. They are back to plan A and properly auditing
and upgrading the plant *before* activating new subscribers on their
noise-sensitive spectrum. Once the upgrades are complete, the
incumbent operator has vacated, and DOCSIS 3.1 is enabled across the
footprint (they have been installing D3.1 modems and are using
D3.1-capable CMTS chassis), they will be left with a very clean and
performant network after a rough transition.

TL;DR

D3.0 modem with upstream channel bonding, remote location without
ability to power cycle. A noise ingress fault develops on the primary
upstream channel. The CMTS is aware of the fault and the network is
now operating in DOCSIS 3.0 Partial Service mode, however CMs
connected before the fault are still impacted. Very high (don't
remember, something like 20%) packet loss when the link is unloaded,
however under load packet loss almost disappears. Upstream data
requests are sent on the primary upstream channel, and are getting
lost. As the CMTS is aware of the fault, it is not issuing data grants
on the noisy channel so the actual packet transmission is successful
if it gets that far.

If there is still data in the CM's upstream queue when an upstream
packet is sent, the request for the next grant is "piggy-backed" onto
that packet instead of being sent as a separate request. If you send a
constant stream of packets upstream at an interval shorter than the
request-grant cycle, all of these requests will be piggy-backed with
your dummy data on the clean channel :)
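
A minimal Python sketch of such a dummy stream, for illustration only.
The thread itself used hping3 and fping, not this code. The function
name, its parameters and the optional DSCP marking are assumptions made
for the example, and time.sleep() will not reliably hold an interval as
short as the 150us used with hping3 above.

```python
import socket
import time


def send_dummy_stream(host, port, interval_s, count, dscp=None):
    """Send `count` empty UDP datagrams to (host, port), one per `interval_s`.

    Returns the number of datagrams handed to the kernel.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        if dscp is not None:
            # DSCP occupies the upper six bits of the IPv4 TOS byte.
            sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)
        sent = 0
        next_send = time.monotonic()
        for _ in range(count):
            # Zero-length payload: only the IP and UDP headers go on the wire.
            sock.sendto(b"", (host, port))
            sent += 1
            # Pace against an absolute schedule so sleep jitter does not add up.
            next_send += interval_s
            delay = next_send - time.monotonic()
            if delay > 0:
                time.sleep(delay)
        return sent
    finally:
        sock.close()
```

For example, send_dummy_stream("192.0.2.1", 80, 0.001, 1000, dscp=8)
would send one empty datagram per millisecond for about a second,
marked CS1. 192.0.2.1 is a documentation address standing in for the
first hop beyond the cable modem.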

I wish I had done some more measurements and more accurately quantified
the minimum possible packet size and the longest interval needed to
achieve this. I was also deprioritising the dummy packets with
DSCP/cake so there was no impact to my actual traffic, although of
course the dummy data is still going upstream over the coax, which is
the most congestion-prone link as far as my operator is concerned. It
would also be nice to quantify the impact of this trick on RTT/jitter,
although for my case I don't really care so long as it's <10ms.
(Reordering on upstream bonded DOCSIS 3.0 is an issue, however.)
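
To put a rough number on how much dummy data that is, here is a
back-of-the-envelope calculation in Python for the hping3 settings
quoted above (empty UDP payload, one packet every 150us). It assumes
IPv4 without options and that each packet is counted as a padded
64-byte minimum Ethernet frame; how a given operator actually accounts
for small packets is not known from this thread. The function name and
its parameters are made up for the example.

```python
def dummy_stream_load(interval_us, payload_bytes=0, min_frame_bytes=64):
    """Estimate the upstream load of a fixed-interval UDP stream.

    Returns (packets_per_second, ip_bits_per_second, frame_bits_per_second).
    """
    IP_UDP_HEADERS = 20 + 8      # IPv4 header without options + UDP header
    ETHERNET_OVERHEAD = 14 + 4   # Ethernet header + FCS

    pps = 1_000_000 / interval_us
    ip_bytes = IP_UDP_HEADERS + payload_bytes
    # Short frames are padded up to the Ethernet minimum.
    frame_bytes = max(ip_bytes + ETHERNET_OVERHEAD, min_frame_bytes)
    return pps, pps * ip_bytes * 8, pps * frame_bytes * 8


pps, ip_bps, frame_bps = dummy_stream_load(150)
print(f"{pps:.0f} packets/s, {ip_bps / 1e6:.2f} Mbit/s at the IP layer, "
      f"{frame_bps / 1e6:.2f} Mbit/s as padded Ethernet frames")
```

Under those assumptions a 150us interval works out to about 6667
packets/s, or roughly 1.5 Mbit/s of IP traffic and 3.4 Mbit/s once
padded to minimum-size frames, so a longer interval is worth testing
if it still keeps the requests piggy-backed.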

Myself, I am now on a shiny new 100/40 FTTdp/C connection which is
flawless in terms of latency, loss, jitter, reordering. The network
operator / wholesaler polices traffic, my retailer shapes with a
somewhat reasonably sized FIFO in the downstream, and I shape ingress
w/ cake to 99% link capacity to keep steady-state latency under
control during downloads. Not much I can realistically do about
downstream latency spikes, unless Cisco have some FQ that I can
persuade my retailer to implement on their ASR9k in place of the FIFO.

Regards,
Ryan