From: Jonathan Morton
Date: Fri, 29 Jun 2018 15:22:15 +0300
To: Jonas Mårtensson
Cc: bloat
Subject: Re: [Bloat] powerboost and sqm

> On 29 Jun, 2018, at 1:56 pm, Jonas Mårtensson wrote:
>
> So, what do you think:
>
> - Are the latency spikes real? The fact that they disappear with sqm suggests so, but what could cause such short spikes? Is it related to the powerboost?

I think I can explain what's going on here. TL;DR: the ISP is still using a dumb FIFO, but has resized it to something reasonable. The latency spikes result from an interaction between several systems.

"PowerBoost" is just a trade name for the increasingly common practice of configuring a credit-mode shaper (typically a token bucket filter, or TBF for short) with a very large bucket. The shaper refills that bucket at a fixed rate (100Mbps in your case), up to a specified maximum, and drains it proportionately for every byte sent over the link. Packets are sent when there are enough tokens in the bucket to cover them in full, and not before. There may be a second TBF with a much smaller bucket to enforce a limit on the burst rate (say 300Mbps), but let's assume that isn't used here, so the true limit is the 1Gbps link rate.
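To make the mechanism concrete, here's a toy simulation of such a shaper. It's a sketch under assumed numbers, not the ISP's actual implementation: the 20MB bucket size is my invention, while the 100Mbps fill rate and 1Gbps line rate come from the discussion above.

#!/usr/bin/env python3
# Toy model of a credit-mode shaper (TBF) with a very large bucket.
# Assumed numbers: 20MB bucket (invented); 100Mbps fill rate and
# 1Gbps line rate as discussed above.

FILL = 100e6 / 8        # token refill rate: 100Mbps, in bytes/sec
LINE = 1e9 / 8          # packet arrival rate: 1Gbps, in bytes/sec
CAP  = 20e6             # bucket size in bytes (assumed: ~1.6s of credit)
PKT  = 1500             # packet size in bytes

tokens   = CAP          # the bucket filled up while the link sat idle
t_bucket = 0.0          # last time tokens were credited
prev_dep = 0.0          # departure time of the previous packet

for i in range(50_000):                      # ~75MB of arrivals
    arrive = i * PKT / LINE                  # back-to-back at line rate
    ready  = max(arrive, prev_dep)           # FIFO: wait for predecessor
    tokens = min(CAP, tokens + (ready - t_bucket) * FILL)
    t_bucket = ready
    if tokens < PKT:                         # bucket empty: queue until
        ready += (PKT - tokens) / FILL       # enough tokens accumulate
        tokens   = PKT
        t_bucket = ready
    tokens  -= PKT
    prev_dep = ready
    if i % 5_000 == 0:
        print(f"t={arrive:5.3f}s  queueing delay={ready - arrive:7.4f}s")

With these numbers the queueing delay is zero for roughly the first 0.18s (the boost phase), then climbs steadily, because this toy queue is infinite; a real FIFO is finite and overflows instead, as described next.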
Until the bucket empties, packets are sent over the link as soon as they arrive, so the observed latency is minimal and the throughput converges on a high value. The moment the bucket empties, however, packets are queued, and throughput instantaneously drops to 100Mbps. The queue quickly fills up and overflows; packets are then dropped, and TCP rightly interprets this as its cue to back off.

The maximum inter-flow induced latency is consistently about 125ms. This is roughly what you'd expect from a dumb FIFO sized to 1x BDP (at 100Mbps, 125ms of queue is about 1.5MB), and it's *much* better than typical ISP configurations to date. I'd still much rather have the sub-millisecond induced latencies that Cake achieves, but this is a win for the average Joe Oblivious.

So why the spikes? TCP backs off when it sees the packet losses, and it continues to do so until the queue drains enough to stop losing packets. That leaves TCP transmitting at less than the shaped rate, so the TBF starts filling its bucket with the leftovers, and latency returns to minimum because the queue is empty. TCP then gradually grows its congestion window again to probe the path; different TCPs do this in different patterns, but it'll generally take much more than one RTT at this bandwidth. Windows, I believe, increases cwnd by one packet per RTT (i.e. it's Reno-like).

So by the time TCP gets back up to 100Mbps, the TBF has stored quite a few spare tokens in its bucket. These now start to be spent, while TCP continues probing *past* 100Mbps, still not seeing the true limit. By the time the bucket empties, TCP is transmitting considerably faster than 100Mbps, such that the *average* throughput since the last loss is *exactly* 100Mbps. Enforcing that average is what a TBF is designed to do; strictly speaking it starts with an empty bucket rather than a full one, but the bucket usually fills up before the user gets around to measuring anything.

Then the TBF runs out of spare tokens and slams on the brakes: the queue rapidly fills and overflows, packets are lost, TCP reels, retransmits, and backs off again. Rinse, repeat.

> - Would you enable sqm on this connection? By doing so I miss out on the higher rate for the first few seconds.

Yes, I would absolutely use SQM here. It'll both iron out those latency spikes and reduce packet loss, and what's more, it'll prevent congestion-related latency and loss from affecting any but the provoking flow(s).

IMHO, the benefits of PowerBoost are illusory. When you've got 100Mbps steady-state, tripling that for a short period is simply not perceptible in most applications. Even web browsing, which typically involves transfers smaller than the bucket, is limited by latency rather than bandwidth once you get above a couple of Mbps. For a real performance benefit - say, speeding up large software updates - the extra bandwidth needs to be available for minutes, not seconds, so that a gigabyte or more can be transferred at the higher speed.
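Some rough arithmetic illustrates the point, reusing the same assumed 20MB bucket as in the sketch above, with the boost running at the full 1Gbps line rate:

#!/usr/bin/env python3
# Back-of-envelope numbers behind "the benefits of PowerBoost are
# illusory".  Assumed 20MB bucket; boost at the 1Gbps line rate,
# 100Mbps steady shaped rate.

STEADY = 100e6 / 8     # bytes/sec
BOOST  = 1e9 / 8       # bytes/sec
CAP    = 20e6          # bytes of burst credit (assumption)

# Sending at BOOST drains the bucket at (BOOST - STEADY) net, so the
# boost phase delivers this many bytes before the bucket runs dry:
burst = CAP * BOOST / (BOOST - STEADY)       # ~22MB in ~0.18s

def xfer(size):
    """Transfer time in seconds, with the boost available."""
    if size <= burst:
        return size / BOOST
    return burst / BOOST + (size - burst) / STEADY

for mb in (2, 1000):   # a fat web page vs. a large software update
    boosted, flat = xfer(mb * 1e6), mb * 1e6 / STEADY
    print(f"{mb:5d} MB: {boosted:6.2f}s boosted vs {flat:6.2f}s flat")

Under these assumptions, the 2MB page saves about 0.14s, which is swamped by the several round trips of DNS, TCP and TLS handshakes that dominate page loads; the 1GB update saves about 1.6s out of 80, roughly 2%. Neither is perceptible.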
> What are the actual downsides of not enabling sqm in this case?

Those latency spikes would be seen by latency-sensitive applications as jitter, which is one of the most insidious problems for a realtime interactive system to cope with. They also coincide with momentary spikes in packet loss (which, unfortunately, is not represented in dslreports' graphs), which are likewise difficult to cope with.

That means your VoIP or videoconference session will glitch and drop out periodically whenever a bulk transfer is started up by some background application (Steam, Windows Update) or by someone else in your house (a visiting niece bulk-uploads an SD card full of holiday photos to Instagram; your wife cues up Netflix for the evening) - unless it has deliberately increased its own latency with internal buffering and redundant transmissions to compensate.

It also means your online game session, under similar circumstances, will occasionally fail to show you an enemy move in time for you to react to it, or will even delay or fail to register your own actions because the packets notifying the server of them were queued and/or lost. Sure, 125ms is a far cry from the multiple seconds we often see, but it's still problematic for gamers: it corresponds to an update rate of just 8Hz, when they're running their monitors at 144Hz and their mice at 1000Hz.

 - Jonathan Morton