From: Luca Muscariello
Date: Tue, 27 Nov 2018 14:37:29 +0100
To: Jonathan Morton
Cc: Mikael Abrahamsson, "Bless, Roland (TM)", bloat
Subject: Re: [Bloat] when does the CoDel part of fq_codel help in the real world?

Another bit to this.

A router queue is supposed to serve packets no matter what is running at the controlled end-point: BBR, Cubic, or anything else. So delay-based congestion controllers still get hurt in today's Internet, unless they can get their portion of buffer at the line card.

FQ creates incentives for end-points to send traffic in a smoother way, because the reward to the application is immediate and measurable. But the end-point does not know in advance whether FQ is there or not.

So, going back to sizing the link buffer: the rule I mentioned applies, and it allows one to get the best completion times over a wider range of RTTs.

If you, Mikael, don't want more than a 10 ms buffer, how do you achieve that? You change the behaviour of the source and hope that flow isolation is available. If you just cut the buffer down to 10 ms and do nothing else, the only thing you get is a short queue, and you may throw away half of your link capacity.

On Tue, Nov 27, 2018 at 1:17 PM Jonathan Morton <chromatix99@gmail.com> wrote:

> > On 27 Nov, 2018, at 1:21 pm, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
> >
> > It's complicated. I've had people throw in my face that I need 2xBDP in buffer size to smooth things out.
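[Aside: the 2xBDP and 10 ms figures being argued over are easy to make concrete with a little arithmetic. The sketch below is illustrative only; the 100 Mbit/s link rate and 100 ms RTT are assumed values, not numbers from the thread.]

```python
# Bandwidth-delay-product arithmetic behind the buffer-sizing debate.
# Assumed values: a 100 Mbit/s link and a 100 ms path RTT.

def bdp_bytes(rate_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to keep the pipe full."""
    return rate_bps * rtt_s / 8

RATE = 100e6   # 100 Mbit/s link rate (assumption)
RTT = 0.100    # 100 ms path RTT (assumption)

one_bdp = bdp_bytes(RATE, RTT)
print(f"1xBDP buffer: {one_bdp / 1e6:.3f} MB")        # just fills the pipe
print(f"2xBDP buffer: {2 * one_bdp / 1e6:.3f} MB")    # the rule quoted to Mikael
print(f"10 ms buffer: {bdp_bytes(RATE, 0.010) / 1e6:.3f} MB")
```

At this rate a 10 ms buffer is one tenth of the BDP of a 100 ms flow, which is the capacity-loss risk Luca points at: when a loss-based sender halves its cwnd at a drop-tail bottleneck, the pipe runs partly idle unless the buffer can absorb roughly one BDP.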
> > Personally I don't want more than 10 ms of buffer (max), and I don't see why I should need more than that, even if transfers are running over hundreds of ms of light-speed-in-medium induced delay between the communicating systems.
>
> I think we can agree that the ideal CC algo would pace packets out smoothly at exactly the path capacity, neither building a queue at the bottleneck nor leaving capacity on the table.
>
> Actually achieving that in practice turns out to be difficult, because there's no general way to discover the path capacity in advance. AQMs like Codel, in combination with ECN, get us a step closer by explicitly informing each flow when it is exceeding that capacity while the queue is still reasonably short. FQ also helps, by preventing flows with imperfectly managed congestion windows from inadvertently interfering with each other.
>
> So with the presently deployed state of the art, we have cwnds oscillating around reasonably short queue lengths, backing off sharply in response to occasional signals, then probing back upwards when that signal goes away for a while. It's a big improvement over dumb drop-tail FIFOs, but it's still some distance from the ideal. That's because the information injected by the bottleneck AQM is a crude binary state.
>
> I do not include DCTCP in the deployed state of the art, because it is not deployable in the RFC-compliant Internet; it is effectively incompatible with Codel in particular, because it wrongly interprets CE marks and is thus noncompliant with the ECN RFC.
>
> However, I agree with DCTCP's goal of achieving finer-grained control of the cwnd, through AQMs providing more nuanced information about the state of the path capacity and/or bottleneck queue. An implementation that made use of ECT(1) instead of changing the meaning of CE marks would remain RFC-compliant, and could get "sufficiently close" to the ideal described above.
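[Aside: the Codel behaviour described above, signalling a flow only once the standing queue has exceeded a small target for a whole interval and then signalling at a rate that grows with sqrt(count), can be sketched as a toy state machine. This is a simplified model, not the Linux implementation; the constants are the RFC 8289 defaults.]

```python
import math

# Toy sketch of CoDel's control law (RFC 8289 flavour).
# should_signal() returns True when the AQM should drop or CE-mark a
# packet, given the packet's queue sojourn time and the current clock.

TARGET = 0.005    # 5 ms: acceptable standing-queue delay
INTERVAL = 0.100  # 100 ms: on the order of a worst-case RTT

class CodelState:
    def __init__(self):
        self.first_above = None  # when sojourn first exceeded TARGET
        self.dropping = False    # inside a signalling episode?
        self.count = 0           # signals sent in this episode
        self.next_signal = 0.0   # scheduled time of the next signal

    def should_signal(self, sojourn: float, now: float) -> bool:
        if sojourn < TARGET:
            # Queue drained below target: leave the signalling state.
            self.first_above = None
            self.dropping = False
            return False
        if self.first_above is None:
            self.first_above = now
        if not self.dropping:
            # Only start signalling after a full INTERVAL above TARGET.
            if now - self.first_above >= INTERVAL:
                self.dropping = True
                self.count = 1
                self.next_signal = now + INTERVAL / math.sqrt(self.count)
                return True
            return False
        if now >= self.next_signal:
            # Signal faster and faster until the queue comes down.
            self.count += 1
            self.next_signal += INTERVAL / math.sqrt(self.count)
            return True
        return False
```

With ECN the "signal" is a CE mark rather than a drop, which is the binary feedback Jonathan calls crude: the sender learns only "too fast", never by how much.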
>
>  - Jonathan Morton
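[Aside: the flow isolation both posters rely on is, at its core, hashing each flow onto its own queue and serving backlogged queues round-robin. A minimal sketch follows; fq_codel additionally runs deficit round-robin byte accounting and a Codel instance per queue, both omitted here, and integer flow IDs stand in for real 5-tuples.]

```python
from collections import deque

# Toy flow-queueing scheduler: hash flows onto queues, serve the
# backlogged queues round-robin so one fat flow cannot starve others.

NUM_QUEUES = 1024  # fq_codel's default queue count

def flow_queue(flow_id) -> int:
    """Map a flow identifier (here an int standing in for a 5-tuple) to a queue."""
    return hash(flow_id) % NUM_QUEUES

class FQ:
    def __init__(self):
        self.queues = {}       # queue index -> deque of packets
        self.active = deque()  # round-robin order of backlogged queues

    def enqueue(self, flow_id, packet):
        idx = flow_queue(flow_id)
        q = self.queues.setdefault(idx, deque())
        if not q:
            self.active.append(idx)  # queue just became backlogged
        q.append(packet)

    def dequeue(self):
        if not self.active:
            return None
        idx = self.active.popleft()
        pkt = self.queues[idx].popleft()
        if self.queues[idx]:
            self.active.append(idx)  # still backlogged: back of the line
        return pkt
```

This is why FQ rewards smooth senders regardless of buffer size: a flow's packets wait only behind its own queue, not behind every other flow's backlog.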