From: Jonathan Morton
Date: Fri, 29 Jun 2018 15:22:15 +0300
To: Jonas Mårtensson
Cc: bloat
Subject: Re: [Bloat] powerboost and sqm

> On 29 Jun, 2018, at 1:56 pm, Jonas Mårtensson wrote:
>
> So, what do you think:
>
> - Are the latency spikes real? The fact that they disappear with sqm suggests so, but what could cause such short spikes? Is it related to the powerboost?

I think I can explain what's going on here. TL;DR: the ISP is still using a dumb FIFO, but has resized it to something reasonable. The latency spikes result from an interaction between several systems.

"PowerBoost" is just a trade name for the increasingly common practice of configuring a credit-mode shaper (typically a token bucket filter, or TBF for short) with a very large bucket. The shaper refills that bucket at a fixed rate (100Mbps in your case), up to a specified maximum, and drains it proportionately for every byte sent over the link. Packets are sent when there are enough tokens in the bucket to cover them in full, and not before. There may be a second TBF with a much smaller bucket to enforce a limit on the burst rate (say 300Mbps), but let's assume that isn't used here, so the true limit is the 1Gbps link rate.
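To make the mechanism concrete, here's a toy simulation of such a shaper. It's a sketch under assumed numbers, not the ISP's actual implementation: the 20MB bucket size is my invention, while the 100Mbps fill rate and 1Gbps line rate come from the discussion above.

#!/usr/bin/env python3
# Toy model of a credit-mode shaper (TBF) with a very large bucket.
# Assumed numbers: 20MB bucket (invented); 100Mbps fill rate and
# 1Gbps line rate as discussed above.

FILL = 100e6 / 8        # token refill rate: 100Mbps, in bytes/sec
LINE = 1e9 / 8          # packet arrival rate: 1Gbps, in bytes/sec
CAP  = 20e6             # bucket size in bytes (assumed: ~1.6s of credit)
PKT  = 1500             # packet size in bytes

tokens   = CAP          # the bucket filled up while the link sat idle
t_bucket = 0.0          # last time tokens were credited
prev_dep = 0.0          # departure time of the previous packet

for i in range(50_000):                      # ~75MB of arrivals
    arrive = i * PKT / LINE                  # back-to-back at line rate
    ready  = max(arrive, prev_dep)           # FIFO: wait for predecessor
    tokens = min(CAP, tokens + (ready - t_bucket) * FILL)
    t_bucket = ready
    if tokens < PKT:                         # bucket empty: queue until
        ready += (PKT - tokens) / FILL       # enough tokens accumulate
        tokens   = PKT
        t_bucket = ready
    tokens  -= PKT
    prev_dep = ready
    if i % 5_000 == 0:
        print(f"t={arrive:5.3f}s  queueing delay={ready - arrive:7.4f}s")

With these numbers the queueing delay is zero for roughly the first 0.18s (the boost phase), then climbs steadily, because this toy queue is infinite; a real FIFO is finite and overflows instead, as described next.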
Until the bucket empties, packets are sent over the link as soon as they arrive, so the observed latency is minimal and the throughput converges on a high value. The moment the bucket empties, however, packets are queued, and throughput instantaneously drops to 100Mbps. The queue quickly fills up and overflows; packets are then dropped, and TCP rightly interprets this as its cue to back off.

The maximum inter-flow induced latency is consistently about 125ms. This is roughly what you'd expect from a dumb FIFO sized to 1x BDP (at 100Mbps, 125ms of queue is about 1.5MB), and it's *much* better than typical ISP configurations to date. I'd still much rather have the sub-millisecond induced latencies that Cake achieves, but this is a win for the average Joe Oblivious.

So why the spikes? TCP backs off when it sees the packet losses, and it continues to do so until the queue drains enough to stop losing packets. That leaves TCP transmitting at less than the shaped rate, so the TBF starts filling its bucket with the leftovers, and latency returns to minimum because the queue is empty. TCP then gradually grows its congestion window again to probe the path; different TCPs do this in different patterns, but it'll generally take much more than one RTT at this bandwidth. Windows, I believe, increases cwnd by one packet per RTT (i.e. it's Reno-like).

So by the time TCP gets back up to 100Mbps, the TBF has stored quite a few spare tokens in its bucket. These now start to be spent, while TCP continues probing *past* 100Mbps, still not seeing the true limit. By the time the bucket empties, TCP is transmitting considerably faster than 100Mbps, such that the *average* throughput since the last loss is *exactly* 100Mbps. Enforcing that average is what a TBF is designed to do; strictly speaking it starts with an empty bucket rather than a full one, but the bucket usually fills up before the user gets around to measuring anything.

Then the TBF runs out of spare tokens and slams on the brakes: the queue rapidly fills and overflows, packets are lost, TCP reels, retransmits, and backs off again. Rinse, repeat.

> - Would you enable sqm on this connection? By doing so I miss out on the higher rate for the first few seconds.

Yes, I would absolutely use SQM here. It'll both iron out those latency spikes and reduce packet loss, and what's more, it'll prevent congestion-related latency and loss from affecting any but the provoking flow(s).

IMHO, the benefits of PowerBoost are illusory. When you've got 100Mbps steady-state, tripling that for a short period is simply not perceptible in most applications. Even web browsing, which typically involves transfers smaller than the bucket, is limited by latency rather than bandwidth once you get above a couple of Mbps. For a real performance benefit - say, speeding up large software updates - the extra bandwidth needs to be available for minutes, not seconds, so that a gigabyte or more can be transferred at the higher speed.
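Some rough arithmetic illustrates the point, reusing the same assumed 20MB bucket as in the sketch above, with the boost running at the full 1Gbps line rate:

#!/usr/bin/env python3
# Back-of-envelope numbers behind "the benefits of PowerBoost are
# illusory".  Assumed 20MB bucket; boost at the 1Gbps line rate,
# 100Mbps steady shaped rate.

STEADY = 100e6 / 8     # bytes/sec
BOOST  = 1e9 / 8       # bytes/sec
CAP    = 20e6          # bytes of burst credit (assumption)

# Sending at BOOST drains the bucket at (BOOST - STEADY) net, so the
# boost phase delivers this many bytes before the bucket runs dry:
burst = CAP * BOOST / (BOOST - STEADY)       # ~22MB in ~0.18s

def xfer(size):
    """Transfer time in seconds, with the boost available."""
    if size <= burst:
        return size / BOOST
    return burst / BOOST + (size - burst) / STEADY

for mb in (2, 1000):   # a fat web page vs. a large software update
    boosted, flat = xfer(mb * 1e6), mb * 1e6 / STEADY
    print(f"{mb:5d} MB: {boosted:6.2f}s boosted vs {flat:6.2f}s flat")

Under these assumptions, the 2MB page saves about 0.14s, which is swamped by the several round trips of DNS, TCP and TLS handshakes that dominate page loads; the 1GB update saves about 1.6s out of 80, roughly 2%. Neither is perceptible.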
> What are the actual downsides of not enabling sqm in this case?

Those latency spikes would be seen by latency-sensitive applications as jitter, which is one of the most insidious problems for a realtime interactive system to cope with. They also coincide with momentary spikes in packet loss (which, unfortunately, is not represented in dslreports' graphs), which are likewise difficult to cope with.

That means your VoIP or videoconference session will glitch and drop out periodically whenever a bulk transfer is started up by some background application (Steam, Windows Update) or by someone else in your house (a visiting niece bulk-uploads an SD card full of holiday photos to Instagram; your wife cues up Netflix for the evening) - unless it has deliberately increased its own latency with internal buffering and redundant transmissions to compensate.

It also means your online game session, under similar circumstances, will occasionally fail to show you an enemy move in time for you to react to it, or will even delay or fail to register your own actions because the packets notifying the server of them were queued and/or lost. Sure, 125ms is a far cry from the multiple seconds we often see, but it's still problematic for gamers: it corresponds to an update rate of just 8Hz, when they're running their monitors at 144Hz and their mice at 1000Hz.

 - Jonathan Morton