<div dir="ltr">Just an FYI,<br><br>iperf 2 uses a 4 usec delay for TCP and a 100 usec delay for UDP to fill the token bucket. We thought about providing a knob for this but decided not to. We figured a busy-wait CPU thread wasn't a big deal given the trend toward many CPU cores. The threaded design works well for this. We also support fq-pacing and isochronous traffic using clock_nanosleep() to schedule the writes. We'll probably add Markov chain support but that's not critical and may not affect actionable engineering. We found isoch to be a useful traffic profile, at least for our WiFi testing. I'm going to add support for TCP_NOTSENT_LOWAT for select()/write() based transmissions. I'm doubtful this is very useful, as scheduling events based on time seems better. We'll probably use it for unit testing WiFi aggregation and see if it helps there or not. I'll see if it aligns with the OWD measurements. <br><br>On queue depth, we use two techniques. The most obvious is to measure the end-to-end delay and use rx histograms, getting all the samples without averaging. The second, internal for us only, is using network telemetry and mapping all the clock domains to the GPS domain. At any moment in time, the end-to-end path can be inspected to see where every packet is. <br><br>Our automated testing is focused on unit tests and is used to statistically monitor code changes (which come at a high rate and apply to a broad range of chips), so the requirements can be very different from those of a network or service provider.<br><br>Agreed that the number of knobs and reactive components is a challenge. One must also assume non-linearity, which becomes obvious after a few direct measurements (i.e. no averaging). The challenge of statistical reproducibility is always there. 
We find Monte Carlo techniques can be useful only when they are proven to be statistically reproducible.<br><br>Bob<br><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Jul 17, 2021 at 4:29 PM Aaron Wood <<a href="mailto:woody77@gmail.com">woody77@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Mon, Jul 12, 2021 at 1:32 PM Ben Greear <<a href="mailto:greearb@candelatech.com" target="_blank">greearb@candelatech.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">UDP is better for getting actual packet latency, for sure. TCP is typical-user-experience-latency though,<br>
so it is also useful.<br>
<br>
I'm interested in the test and visualization side of this. If there were a way to give engineers<br>
a good real-time look at a complex real-world network, then they have something to go on while trying<br>
to tune various knobs in their network to improve it.<br></blockquote><div><br></div><div>I've always liked the smoke-ping visualization, although a single graph is only really useful for a single pair of endpoints (or a single segment, maybe). But I can see using a repeated set of graphs (Tufte has some examples) that can represent an overview of pairwise collections of latency+loss:</div><div><a href="https://www.edwardtufte.com/bboard/images/0003Cs-8047.GIF" target="_blank">https://www.edwardtufte.com/bboard/images/0003Cs-8047.GIF</a><br></div><div><a href="https://www.edwardtufte.com/tufte/psysvcs_p2" target="_blank">https://www.edwardtufte.com/tufte/psysvcs_p2</a><br></div><div><br></div><div>These work for understanding because the tiled graphs are all identically constructed, and the reader first learns how to read a single tile, and then learns the pattern of which tiles represent which measurements.</div><div><br></div><div>Further, they are opinionated. In the second link above, the y axis is not based on the measured data, but on standardized expected values, which (I think) is key to quick readability. You never need to read the axes. It's much like setting up gauges such that "nominal" is always at the same indicator position for all graphs (e.g. straight up). At a glance, you can see if things are "correct" or not.</div><div><br></div><div>That tiling arrangement wouldn't be great for showing interrelationships (although it may give you a good historical view of correlated behavior). 
One thought is to overlay a network graph diagram (graph of all network links) with small "sparkline" type graphs.</div><div><br></div><div>For a more physical-based network graph, I could see visualizing the queue depth for each egress port (max value over a time window of X, or percentage of time at max depth).</div><div><br></div><div>Taken together, the timewise correlation could be useful (which peers are having problems communicating, and which ports between them are impacted?).</div><div><br></div><div>I think getting good data about queue depth may be the hard part, especially catching transients and the duty cycle / pulse-width of the load (and then converting that to a number). Back when I discovered 5 years ago that the iperf application-level pacing granularity was too coarse, I called them "millibursts", and it may have been dtaht who pointed out that link utilization is always 0% or 100%, and it's just a matter of the PWM of the packet rate that makes it look like something in between.</div><div><a href="https://burntchrome.blogspot.com/2016/09/iperf3-and-microbursts.html" target="_blank">https://burntchrome.blogspot.com/2016/09/iperf3-and-microbursts.html</a><br></div><div><br></div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
I'll let others try to figure out how to build and tune the knobs, but the data acquisition and<br>
visualization is something we might try to accomplish. I have a feeling I'm not the<br>
first person to think of this, however....probably someone already has done such<br>
a thing.<br>
<br>
Thanks,<br>
Ben<br>
<br>
On 7/12/21 1:04 PM, Bob McMahon wrote:<br>
> I believe end hosts' TCP stats are insufficient, as shown by the "failed" congestion control mechanisms over the last decades. I think Jaffe pointed this out in <br>
> 1979 though he was using what's been deemed on this thread as "spherical cow queueing theory."<br>
> <br>
> "Flow control in store-and-forward computer networks is appropriate for decentralized execution. A formal description of a class of "decentralized flow control <br>
> algorithms" is given. The feasibility of maximizing power with such algorithms is investigated. On the assumption that communication links behave like M/M/1 <br>
> servers it is shown that no "decentralized flow control algorithm" can maximize network power. Power has been suggested in the literature as a network <br>
> performance objective. It is also shown that no objective based only on the users' throughputs and average delay is decentralizable. Finally, a restricted class <br>
> of algorithms cannot even approximate power."<br>
> <br>
> <a href="https://ieeexplore.ieee.org/document/1095152" rel="noreferrer" target="_blank">https://ieeexplore.ieee.org/document/1095152</a><br>
> <br>
> Did Jaffe make a mistake?<br>
> <br>
> Also, it's been observed that latency is non-parametric in its distributions and computing gaussians per the central limit theorem for OWD feedback loops <br>
> isn't effective. How does one design a control loop around things that are non-parametric? It also raises the question, what are the feed forward knobs that can <br>
> actually help?<br>
> <br>
> Bob<br>
> <br>
> On Mon, Jul 12, 2021 at 12:07 PM Ben Greear <<a href="mailto:greearb@candelatech.com" target="_blank">greearb@candelatech.com</a>> wrote:<br>
> <br>
> Measuring one or a few links provides a bit of data, but seems like if someone is trying to understand<br>
> a large and real network, then the OWD between point A and B needs to just be input into something much<br>
> more grand. Assuming real-time OWD data exists between 100 to 1000 endpoint pairs, has anyone found a way<br>
> to visualize this in a useful manner?<br>
> <br>
> Also, considering something better than ntp may not really scale to 1000+ endpoints, maybe round-trip<br>
> time is the only viable way to get this type of data. In that case, maybe clever logic could use things<br>
> like trace-route to get some idea of how long it takes to get 'onto' the internet proper, and so estimate<br>
> the last-mile latency. My assumption is that the last-mile latency is where most of the pervasive<br>
> asymmetric network latencies would exist (or just ping 8.8.8.8 which is 20ms from everywhere due to<br>
> $magic).<br>
> <br>
> Endpoints could also triangulate a bit if needed, using some anchor points in the network<br>
> under test.<br>
> <br>
> Thanks,<br>
> Ben<br>
> <br>
> On 7/12/21 11:21 AM, Bob McMahon wrote:<br>
> > iperf 2 supports OWD and gives full histograms for TCP write to read, TCP connect times, latency of packets (with UDP), latency of "frames" with<br>
> > simulated video traffic (TCP and UDP), xfer times of bursts with low duty cycle traffic, and TCP RTT (sampling based.) It also has support for sampling (per<br>
> > interval reports) down to 100 usecs if configured with --enable-fastsampling, otherwise the fastest sampling is 5 ms. We've released all this as open source.<br>
> ><br>
> > OWD only works if the end realtime clocks are synchronized using a "machine level" protocol such as IEEE 1588 (PTP). Sadly, *most data centers don't provide a<br>
> > sufficient level of clock accuracy and the GPS pulse per second* to colo and VM customers.<br>
> ><br>
> > <a href="https://iperf2.sourceforge.io/iperf-manpage.html" rel="noreferrer" target="_blank">https://iperf2.sourceforge.io/iperf-manpage.html</a><br>
> ><br>
> > Bob<br>
> ><br>
> > On Mon, Jul 12, 2021 at 10:40 AM David P. Reed <<a href="mailto:dpreed@deepplum.com" target="_blank">dpreed@deepplum.com</a>> wrote:<br>
> ><br>
> ><br>
> > On Monday, July 12, 2021 9:46am, "Livingood, Jason" <<a href="mailto:Jason_Livingood@comcast.com" target="_blank">Jason_Livingood@comcast.com</a>> said:<br>
> ><br>
> > > I think latency/delay is becoming seen to be as important certainly, if not a more direct proxy for end user QoE. This is all still evolving and I have to say is a super interesting & fun thing to work on. :-)<br>
> ><br>
> > If I could manage to sell one idea to the management hierarchy of communications industry CEOs (operators, vendors, ...) it is this one:<br>
> ><br>
> > "It's the end-to-end latency, stupid!"<br>
> ><br>
> > And I mean, by end-to-end, latency to complete a task at a relevant layer of abstraction.<br>
> ><br>
> > At the link level, it's packet send to packet receive completion.<br>
> ><br>
> > But at the transport level including retransmission buffers, it's datagram (or message) origination until the acknowledgement arrives for that message being<br>
> > delivered after whatever number of retransmissions, freeing the retransmission buffer.<br>
> ><br>
> > At the WWW level, it's mouse click to display update corresponding to completion of the request.<br>
> ><br>
> > What should be noted is that lower level latencies don't directly predict the magnitude of higher-level latencies. But longer lower level latencies almost<br>
> > always amplify higher level latencies. Often non-linearly.<br>
> ><br>
> > Throughput is very, very weakly related to these latencies, in contrast.<br>
> ><br>
> > The amplification process has to do with the presence of queueing. Queueing is ALWAYS bad for latency, and throughput only helps if it is in exactly the<br>
> > right place (the so-called input queue of the bottleneck process, which is often a link, but not always).<br>
> ><br>
> > Can we get that slogan into Harvard Business Review? Can we get it taught in Managerial Accounting at HBS? (which does address logistics/supply chain queueing).<br>
> ><br>
> ><br>
> ><br>
> ><br>
> ><br>
> ><br>
> ><br>
> > This electronic communication and the information and any files transmitted with it, or attached to it, are confidential and are intended solely for the use of<br>
> > the individual or entity to whom it is addressed and may contain information that is confidential, legally privileged, protected by privacy laws, or otherwise<br>
> > restricted from disclosure to anyone else. If you are not the intended recipient or the person responsible for delivering the e-mail to the intended recipient,<br>
> > you are hereby notified that any use, copying, distributing, dissemination, forwarding, printing, or copying of this e-mail is strictly prohibited. If you<br>
> > received this e-mail in error, please return the e-mail to the sender, delete it from your computer, and destroy any printed copy of it.<br>
> <br>
> <br>
> -- <br>
> Ben Greear <<a href="mailto:greearb@candelatech.com" target="_blank">greearb@candelatech.com</a>><br>
> Candela Technologies Inc <a href="http://www.candelatech.com" rel="noreferrer" target="_blank">http://www.candelatech.com</a><br>
> <br>
> <br>
<br>
<br>
-- <br>
Ben Greear <<a href="mailto:greearb@candelatech.com" target="_blank">greearb@candelatech.com</a>><br>
Candela Technologies Inc <a href="http://www.candelatech.com" rel="noreferrer" target="_blank">http://www.candelatech.com</a><br>
<br>
_______________________________________________<br>
Bloat mailing list<br>
<a href="mailto:Bloat@lists.bufferbloat.net" target="_blank">Bloat@lists.bufferbloat.net</a><br>
<a href="https://lists.bufferbloat.net/listinfo/bloat" rel="noreferrer" target="_blank">https://lists.bufferbloat.net/listinfo/bloat</a><br>
</blockquote></div></div>
</blockquote></div>
<br>