revolutions per minute - a new metric for measuring responsiveness
* [Rpm] receive window bug fix
@ 2023-06-03  8:03 Dave Taht
  2023-06-03 18:20 ` Aaron Wood
  2023-06-03 18:56 ` rjmcmahon
  0 siblings, 2 replies; 4+ messages in thread
From: Dave Taht @ 2023-06-03  8:03 UTC (permalink / raw)
  To: bloat, Rpm

these folk do good work, and I loved the graphs

https://blog.cloudflare.com/unbounded-memory-usage-by-tcp-for-receive-buffers-and-how-we-fixed-it/

-- 
Podcast: https://www.linkedin.com/feed/update/urn:li:activity:7058793910227111937/
Dave Täht CSO, LibreQos


* Re: [Rpm] receive window bug fix
  2023-06-03  8:03 [Rpm] receive window bug fix Dave Taht
@ 2023-06-03 18:20 ` Aaron Wood
  2023-06-03 19:15   ` rjmcmahon
  2023-06-03 18:56 ` rjmcmahon
  1 sibling, 1 reply; 4+ messages in thread
From: Aaron Wood @ 2023-06-03 18:20 UTC (permalink / raw)
  To: Dave Taht; +Cc: Rpm, bloat

This is good work!  I love reading their posts on scale like this.

It’s wild to me that the Linux kernel has (apparently) never implemented
shrinking the receive window, or handling the case of userspace starting a
large transfer and then just not ever reading it…  the latter is less
surprising, I guess, because that’s an application bug that you probably
would catch separately, and would be focused on fixing in the application
layer…

-Aaron



* Re: [Rpm] receive window bug fix
  2023-06-03  8:03 [Rpm] receive window bug fix Dave Taht
  2023-06-03 18:20 ` Aaron Wood
@ 2023-06-03 18:56 ` rjmcmahon
  1 sibling, 0 replies; 4+ messages in thread
From: rjmcmahon @ 2023-06-03 18:56 UTC (permalink / raw)
  To: Dave Taht; +Cc: bloat, Rpm

> these folk do good work, and I loved the graphs
> 
> https://blog.cloudflare.com/unbounded-memory-usage-by-tcp-for-receive-buffers-and-how-we-fixed-it/

Very cool. Thanks for sharing.

I've been considering adding stress tests to iperf 2. Looks like
Cloudflare has at least two:

Small reads & writes with a short delay, to stress receive window
processing, per the post linked above:

   At the sending host, run a TCP program with an infinite loop, sending
   1500B packets, with a 1 ms delay between each send.
   At the receiving host, run a TCP program with an infinite loop,
   reading 1B at a time, with a 1 ms delay between each read.
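
A minimal sketch of that sender/receiver pair in Python, for anyone who
wants to try it before it lands in a tool. The function names
(paced_sender, slow_reader, run_pair) are made up for illustration, the
loops are bounded by a count instead of running forever, and both ends run
over loopback TCP in one process. Note that a 1500B socket write is not
necessarily one 1500B packet on the wire.

```python
import socket
import threading
import time


def paced_sender(sock, payload_size=1500, delay_s=0.001, count=20):
    """Write `count` payloads of `payload_size` bytes, pausing after each.

    The described test loops forever; `count` bounds it for this sketch.
    """
    payload = b"x" * payload_size
    sent = 0
    for _ in range(count):
        sock.sendall(payload)
        sent += payload_size
        time.sleep(delay_s)
    return sent


def slow_reader(sock, read_size=1, delay_s=0.001, count=20):
    """Read `read_size` bytes at a time, pausing after each read."""
    received = 0
    for _ in range(count):
        data = sock.recv(read_size)
        if not data:  # peer closed the connection
            break
        received += len(data)
        time.sleep(delay_s)
    return received


def run_pair(count=20):
    """Run the sender against the slow reader over loopback TCP."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks one
    port = listener.getsockname()[1]
    result = {}
    sender_done = threading.Event()

    def serve():
        conn, _ = listener.accept()
        with conn:
            result["received"] = slow_reader(conn, count=count)
            # Keep the socket open until the sender finishes, so closing
            # with unread data does not reset the connection mid-send.
            sender_done.wait()

    reader_thread = threading.Thread(target=serve)
    reader_thread.start()
    try:
        with socket.create_connection(("127.0.0.1", port)) as client:
            try:
                result["sent"] = paced_sender(client, count=count)
            finally:
                sender_done.set()
            reader_thread.join()
    finally:
        listener.close()
    return result
```

With the defaults the sender writes 30000 bytes while the reader consumes
20, so nearly everything sits in the receive buffer, which is the point of
the test.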

And then, rx buffer limit tests, from 
https://blog.cloudflare.com/optimizing-tcp-for-high-throughput-and-low-latency/

   1. reads as fast as it can, for five seconds (this is called fast
      mode, and it opens up the window)
   2. calculates 5% of the high watermark of the bytes read during any
      previous one second
   3. for each second of the next 15 seconds (this is called slow mode):
      - reads that 5% number of bytes, then stops reading
      - sleeps for the remainder of that particular second
      - most of the second consists of no reading at all
   4. steps 1-3 are repeated in a loop three times, so the entire run is
      60 seconds

   This has the effect of highlighting any issues in the handling of
   packets when the buffers repeatedly hit the limit.
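
The fast/slow cycle above could be sketched roughly as follows. This is
only an illustration, not Cloudflare's code: run_profile and its
parameters are hypothetical names, and the reader is a caller-supplied
read_some(max_bytes) callback so the clock and sleep can be swapped out
for testing. A blocking read can overrun a tick boundary, which this
sketch does not try to correct.

```python
import time


def run_profile(read_some, now=time.monotonic, sleep=time.sleep,
                fast_s=5, slow_s=15, cycles=3, percent=5, tick_s=1.0):
    """Drive a reader through the fast/slow cycle; return a per-tick log.

    read_some(max_bytes) is a caller-supplied (hypothetical) callback that
    reads up to max_bytes and returns the number of bytes actually read.
    """
    high_watermark = 0
    log = []
    for _ in range(cycles):
        # Step 1: fast mode, read as fast as possible for fast_s ticks.
        for _ in range(fast_s):
            end = now() + tick_s
            got = 0
            while now() < end:
                got += read_some(65536)
            high_watermark = max(high_watermark, got)
            log.append(("fast", got))
        # Step 2: take 5% of the most bytes read in any previous tick.
        budget = high_watermark * percent // 100
        # Step 3: slow mode, read the budget and then idle out each tick.
        for _ in range(slow_s):
            start = now()
            got = 0
            while got < budget:
                n = read_some(budget - got)
                if n == 0:  # nothing more to read (e.g. peer closed)
                    break
                got += n
            log.append(("slow", got))
            remaining = tick_s - (now() - start)
            if remaining > 0:
                sleep(remaining)
    return log
```

With a connected socket, read_some could be something like
lambda n: len(sock.recv(n)); the defaults then give the 60 second run
described above.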

Curious about any other traffic scenarios driven by socket read/write 
behaviors that could be useful. Or any others that might apply to WiFi 
aggregation.

I'm also wondering whether there is a way to generalize these types of
send/read/delay graphs with a parametric command line.
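
One possible shape for such a command line, sketched in Python with
argparse. Everything here is hypothetical: the --step and --repeat
options, the op:size:delay syntax, and the IoStep type are invented for
illustration and are not existing iperf 2 options.

```python
import argparse
import re
from dataclasses import dataclass

_UNIT_SECONDS = {"us": 1e-6, "ms": 1e-3, "s": 1.0}


@dataclass(frozen=True)
class IoStep:
    op: str         # "read" or "write"
    size: int       # bytes per syscall
    delay_s: float  # pause after the syscall, in seconds


def parse_step(text):
    """Parse one 'op:size:delay' step, e.g. 'write:1500:1ms'."""
    m = re.fullmatch(r"(read|write):(\d+):(\d+(?:\.\d+)?)(us|ms|s)", text)
    if m is None:
        raise argparse.ArgumentTypeError(
            f"bad step {text!r}, expected op:size:delay")
    op, size, value, unit = m.groups()
    return IoStep(op, int(size), float(value) * _UNIT_SECONDS[unit])


def build_parser():
    parser = argparse.ArgumentParser(description="hypothetical traffic profile")
    parser.add_argument("--step", type=parse_step, action="append", default=[],
                        help="op:size:delay, repeatable; steps run in order")
    parser.add_argument("--repeat", type=int, default=1,
                        help="how many times to run the step list")
    return parser
```

The small read/write test would then be "--step write:1500:1ms" on the
sender and "--step read:1:1ms" on the receiver. The fast/slow test does
not fit this syntax as is, since its read size depends on a measured
high watermark.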

Bob


* Re: [Rpm] receive window bug fix
  2023-06-03 18:20 ` Aaron Wood
@ 2023-06-03 19:15   ` rjmcmahon
  0 siblings, 0 replies; 4+ messages in thread
From: rjmcmahon @ 2023-06-03 19:15 UTC (permalink / raw)
  To: Aaron Wood; +Cc: Dave Taht, Rpm, bloat

I think better tooling can help, and I am always interested in
suggestions on what to add to iperf 2 for better coverage.

I've thought it would be good for iperf 2 to support some sort of graph
which drives socket read/write/delays, versus the simplistic pattern of
AFAP (as fast as possible). It for sure stresses things differently, even
in drivers. I've seen huge delays in some 10G drivers where some UDP
packets seem to get stuck in queues and where the e2e latency is driven by
the socket write rates rather than the network delays. This is most
obvious using burst patterns, where the last packet of a latency burst is
coupled to the first packet of the subsequent burst. The coupling between
the syscalls and network performance is nonobvious and sometimes hard to
believe.

We've been adding more "traffic profile" knobs for socket testing and
have many of the latency metrics incorporated. Most users don't use these
knobs, and they seem to be hard to generalize. Cloudflare seems to have
crafted specific tests after obtaining knowledge of causality.

Bob

PS. As a side note, I'm now being asked how to generate "AI loads" into
switch fabrics, though there it probably won't be based upon socket
syscalls but maybe on io_uring - not sure.


end of thread, other threads:[~2023-06-03 19:15 UTC | newest]

