Discussion of explicit congestion notification's impact on the Internet
From: Bob Briscoe <research@bobbriscoe.net>
To: Dave Taht <dave.taht@gmail.com>,
	"Mohit P. Tahiliani" <tahiliani@nitk.edu.in>,
	Asad Sajjad Ahmed <me@asadsa.com>
Cc: ECN-Sane <ecn-sane@lists.bufferbloat.net>
Subject: Re: [Ecn-sane] paper idea: praising smaller packets
Date: Mon, 27 Sep 2021 15:50:14 +0100
Message-ID: <a0aef109-e681-6d02-363c-ab65fc2134c4@bobbriscoe.net>
In-Reply-To: <CAA93jw68BLggvcjqgYPmz8sdUsfMD2CpoMoKsQt3uz2HKmkT+g@mail.gmail.com>

Dave,

On 26/09/2021 21:08, Dave Taht wrote:
> ... an exploration of smaller mss sizes in response to persistent congestion
>
> This is in response to two declarative statements in here that I've
> long disagreed with,
> involving NOT shrinking the mss, and not trying to do pacing...

I would still avoid shrinking the MSS, 'cos you don't know if the 
congestion constraint is the CPU, in which case you'll make congestion 
worse. But we'll have to differ on that if you disagree.

I don't think that paper said don't do pacing. In fact, it says "...pace 
the segments at less than one per round trip..."

Whatever, that paper was the problem statement, with just some ideas on
how we were going to solve it.
After that, Asad (added to the distro) did his whole Master's thesis on
this - I suggest you look at his thesis and code (pointers below).

Also soon after he'd finished, changes to BBRv2 were introduced to 
reduce queuing delay with large numbers of flows. You might want to take 
a look at that too:
https://datatracker.ietf.org/meeting/106/materials/slides-106-iccrg-update-on-bbrv2#page=10

>
> https://www.bobbriscoe.net/projects/latency/sub-mss-w.pdf
>
> Otherwise, for a change, I largely agree with Bob.
>
> "No amount of AQM twiddling can fix this. The solution has to fix TCP."
>
> "nearly all TCP implementations cannot operate at less than two packets per RTT"

Back to Asad's Master's thesis, we found that just pacing out the
packets wasn't enough. There's a very brief summary, in 4 bullets, of the
things we found we had to do in this section of our write-up for netdev:
https://bobbriscoe.net/projects/latency/tcp-prague-netdev0x13.pdf#subsubsection.3.1.6
And I've highlighted a couple of unexpected things that cropped up below.

Asad's full thesis:
               Ahmed, A., "Extending TCP for Low Round Trip Delay",
               Master's Thesis, Uni Oslo, August 2019,
               <https://www.duo.uio.no/handle/10852/70966>.
Asad's thesis presentation:
     https://bobbriscoe.net/presents/1909submss/present_asadsa.pdf

Code:
     https://bitbucket.org/asadsa/kernel420/src/submss/
Despite significant changes to basic TCP design principles, the diffs
were not that large.

A number of tricky problems came up.

* For instance, simple pacing with less than one ACK per RTT wasn't that
simple. Whenever there were bursts from cross-traffic, the consequent
burst in your own flow kept repeating in subsequent rounds. We realized
this was because you never have a real ACK clock (you always set the next
send time based on previous send times). So we set up the next send time
but then re-adjusted it if/when the next ACK did actually arrive.

* The additive increase of one segment was the other main problem. When 
you have such a small window, multiplicative decrease scales fine, but 
an additive increase of 1 segment is a huge jump in comparison, when 
cwnd is a fraction of a segment. "Logarithmically scaled additive 
increase" was our solution to that (basically, every time you set 
ssthresh, alter the additive increase constant using a formula that 
scales logarithmically with ssthresh, so it's still roughly 1 for the 
current Internet scale).
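
To put rough numbers on the two bullets above, here is a minimal Python sketch. It is not Asad's code and not the formulas from the thesis: the names (pacing_interval, relative_jump, SubMssPacer) are made up for illustration, and the rule used to re-adjust the send time on ACK arrival is only one plausible reading of "re-adjusted it if/when the next ACK did actually arrive". The logarithmically scaled increase itself is not reproduced here, since its formula is not given in this mail.

```python
def pacing_interval(rtt_s: float, cwnd_segments: float) -> float:
    """Seconds between departures so that cwnd segments leave per RTT.

    With cwnd below one segment this is longer than the RTT,
    i.e. less than one segment per round trip.
    """
    if rtt_s <= 0 or cwnd_segments <= 0:
        raise ValueError("rtt_s and cwnd_segments must be positive")
    return rtt_s / cwnd_segments


def relative_jump(cwnd_segments: float, increase_segments: float = 1.0) -> float:
    """Size of an additive increase relative to the current window."""
    if cwnd_segments <= 0:
        raise ValueError("cwnd_segments must be positive")
    return increase_segments / cwnd_segments


class SubMssPacer:
    """Hypothetical pacer: timer-based send time, re-anchored on a real ACK."""

    def __init__(self, rtt_s: float, cwnd_segments: float) -> None:
        self.rtt_s = rtt_s
        self.interval = pacing_interval(rtt_s, cwnd_segments)
        self.next_send = None

    def on_send(self, now: float) -> None:
        # Tentative schedule, derived only from the previous send time.
        self.next_send = now + self.interval

    def on_ack(self, now: float) -> None:
        # Assumption: once the ACK really arrives, wait only the part of
        # the interval that exceeds one RTT, measured from the ACK. A late
        # ACK (queue built by cross-traffic) then pushes the next send
        # later instead of replaying the old timer-based schedule.
        self.next_send = now + max(0.0, self.interval - self.rtt_s)


if __name__ == "__main__":
    # cwnd of a quarter segment: +1 segment is a 400% jump.
    print(relative_jump(0.25))
    # cwnd of 100 segments: +1 segment is a 1% step.
    print(relative_jump(100.0))

    pacer = SubMssPacer(rtt_s=0.1, cwnd_segments=0.5)
    pacer.on_send(0.0)
    print(pacer.next_send)   # tentative: one interval after the send
    pacer.on_ack(0.13)       # ACK arrives 30 ms later than the base RTT
    print(pacer.next_send)   # re-anchored to the actual ACK arrival
```

The contrast between relative_jump(0.25) and relative_jump(100.0) is the reason a fixed additive increase of one segment does not scale down to sub-segment windows, while a multiplicative decrease does.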

What became of Asad's work?
Although the code finally worked pretty well {1}, we decided not to pursue
it further 'cos a minimum cwnd actually gives a trickle of throughput
protection against unresponsive flows (with the downside that it
increases queuing delay). That's not to say this isn't worth working on
further, but there was more to do to make it bulletproof, and we were
in two minds about how important it was, so it worked its way down our
priority list.

{Note 1: From memory, there was an outstanding problem with one flow 
remaining dominant if you had step-ECN marking, which we worked out was 
due to the logarithmically scaled additive increase, but we didn't work 
on it further to fix it.}



Bob


-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/

