[Rpm] rpm meeting notes from last week

Mon Mar 28 12:56:38 EDT 2022

First, again, sorry for the top-post!

I really, *really* appreciate your email and your feedback, Sebastian!
It's great to work with people like you!!

I will write more as I have time -- ${DAY_JOB} is busy today!! Thanks
for everything!

Will

On Mon, Mar 28, 2022 at 4:41 AM Sebastian Moeller <moeller0 at gmx.de> wrote:
>
> Hi Will,
>
> I just had a look at the draft: my expected run times assume bi-directional saturation, while the draft recommends two sets of uni-directional saturation. So my estimates are off not by roughly a factor of 2 but more like a factor of 4. I will try to read the Go code to see what it intends to do here, but I am not a fluent Go reader (@dayjob I use MATLAB, so far, far away from C-ish languages).
>
> Regards
>         Sebastian
>
>
>
>
>
> > On Mar 28, 2022, at 09:58, Sebastian Moeller via Rpm <rpm at lists.bufferbloat.net> wrote:
> >
> > Hi Will,
> >
> >
> > first, if belatedly, let me express my excitement about finding your project. It is great that Apple distributes something like this in their recent OSs, but for the "rest of us" that does not help (pun intended).
> >
> >
> >> On Mar 28, 2022, at 00:26, Will Hawkins <hawkinsw at obs.cr> wrote:
> >>
> >> On Sun, Mar 27, 2022 at 2:34 PM Sebastian Moeller via Rpm
> >> <rpm at lists.bufferbloat.net> wrote:
> >>>
> >>> Dear RPMers,
> >>>
> >>> I just managed to install and test goresponsiveness on an Ubuntu 20.04 LTS Linux host (where the default Go is a bit on the old side), and I am not at all confident that the test runs long enough either to saturate the link or to measure the saturating load with sufficient precision; IMHO most likely neither...
> >>>
> >>> Here is the text of an issue I posted to the nice goresponsiveness project:
> >>>
> >>> The total measurement duration of the test against Apple's servers appears quite short. It would be nice if the duration could be controlled from the client (or, if that is not according to the spec, if the servers could be configured to measure for longer durations):
> >>>
> >>> user at ubuntu20.4LTS:~/CODE/goresponsiveness$ time ./networkQuality --config mensura.cdn-apple.com --port 443 --path /api/v1/gm/config --timeout 60
> >>> 03-27-2022 17:51:51 UTC Go Responsiveness to mensura.cdn-apple.com:443...
> >>> Download:  87.499 Mbps ( 10.937 MBps), using 12 parallel connections.
> >>> Upload:    26.531 Mbps (  3.316 MBps), using 12 parallel connections.
> >>> Total RTTs measured: 5
> >>> RPM:   829
> >>>
> >>> real    0m5.504s
> >>> user    0m1.277s
> >>> sys     0m1.225s
> >>>
> >>>
> >>> ~6 seconds, that is even shorter than run-of-the-mill speedtests (which IMHO tend to clock in at ~10 seconds per direction)... This might or might not result in a saturating load, but it most certainly will result in an imprecise throughput measurement...
> >>>
> >>> Experience with bufferbloat measurements seems to indicate that >= 20-30 seconds are required to really saturate a link.
> >>>
> >>> Trying to confirm this with flent/netperf:
> >>>
> >>> date ; ping -c 10 netperf-eu.bufferbloat.net ; ./run-flent --ipv4 -l 30 -H netperf-eu.bufferbloat.net rrul_var --remote-metadata=root at 192.168.42.1 --test-parameter=cpu_stats_hosts=root at 192.168.42.1 --step-size=.05 --socket-stats --test-parameter bidir_streams=8 --test-parameter markings=0,0,0,0,0,0,0,0 --test-parameter ping_hosts=1.1.1.1 -D . -t IPv4_SQM_cake_layer-cake_LLA-ETH_OH34_U097pct36000of36998K-D090pct105000of116797K_work-horse-eth0_2_TurrisOmnia-TurrisOS.5.3.6-pppoe-wan-eth2.7_2_bridged-BTHH5A-OpenWrt-r18441-ba7cee05ed-Hvt-VDSL100_2_netperf-eu.bufferbloat.net --log-file
> >>>
> >>> [...]
> >>>
> >>> Ping (ms) ICMP 1.1.1.1 (extra)       :        12.20        12.10        21.91 ms              914
> >>> Ping (ms) avg                        :        26.11          N/A          N/A ms              914
> >>> Ping (ms)::ICMP                      :        25.30        25.10        35.11 ms              914
> >>> Ping (ms)::UDP 0 (0)                 :        25.42        25.43        33.43 ms              914
> >>> Ping (ms)::UDP 1 (0)                 :        25.36        25.32        33.77 ms              914
> >>> Ping (ms)::UDP 2 (0)                 :        26.99        26.88        36.13 ms              914
> >>> Ping (ms)::UDP 3 (0)                 :        25.27        25.20        34.42 ms              914
> >>> Ping (ms)::UDP 4 (0)                 :        25.26        25.22        33.32 ms              914
> >>> Ping (ms)::UDP 5 (0)                 :        27.99        27.85        37.20 ms              914
> >>> Ping (ms)::UDP 6 (0)                 :        27.95        27.89        36.29 ms              914
> >>> Ping (ms)::UDP 7 (0)                 :        25.42        25.32        34.56 ms              914
> >>> TCP download avg                     :        11.93          N/A          N/A Mbits/s         914
> >>> TCP download sum                     :        95.44          N/A          N/A Mbits/s         914
> >>> TCP download::0 (0)                  :        11.85        12.26        13.97 Mbits/s         914
> >>> TCP download::1 (0)                  :        11.81        12.28        13.81 Mbits/s         914
> >>> TCP download::2 (0)                  :        12.13        12.31        16.28 Mbits/s         914
> >>> TCP download::3 (0)                  :        12.04        12.31        13.76 Mbits/s         914
> >>> TCP download::4 (0)                  :        11.78        12.28        13.83 Mbits/s         914
> >>> TCP download::5 (0)                  :        12.07        12.30        14.44 Mbits/s         914
> >>> TCP download::6 (0)                  :        11.78        12.29        13.67 Mbits/s         914
> >>> TCP download::7 (0)                  :        11.98        12.30        13.92 Mbits/s         914
> >>> TCP totals                           :       126.34          N/A          N/A Mbits/s         914
> >>> TCP upload avg                       :         3.86          N/A          N/A Mbits/s         914
> >>> TCP upload sum                       :        30.90          N/A          N/A Mbits/s         914
> >>> [...]
> >>>
> >>> This is a bidirectional test that only reports actual TCP goodput; neither the latency probes nor the reverse ACK traffic is accounted for, and yet:
> >>> TCP download sum: 95.44
> >>> TCP upload sum: 30.90
> >>>
> >>> which compared to the actual shape settings of 105/36 (with 34 bytes overhead) seems sane:
> >>> theoretical maximal throughput:
> >>> IPv4
> >>> 105 * ((1500-8-20-20-12)/(1500+26)) = 99.08 Mbps
> >>> 36 * ((1500-8-20-20-12)/(1500+26)) = 33.97 Mbps
> >>> IPv6
> >>> 105 * ((1500-8-40-20-12)/(1500+26)) = 97.71 Mbps
> >>> 36 * ((1500-8-40-20-12)/(1500+26)) = 33.50 Mbps
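The goodput arithmetic above can be reproduced with a short sketch. The function name and the reading of the individual terms (8 bytes PPPoE, 20 or 40 bytes IP header, 20 bytes TCP header, 12 bytes TCP options, 26 bytes of remaining per-packet framing) are assumptions made for illustration; the thread itself only gives the raw numbers.

```python
def goodput_mbps(shaper_rate_mbps, ip_header_bytes, mtu=1500, pppoe_bytes=8,
                 tcp_header_bytes=20, tcp_options_bytes=12, framing_bytes=26):
    """Scale the shaper rate by the payload-to-wire-size ratio of a full packet.

    The meaning attached to each byte count is an assumed interpretation of
    the constants in the formula quoted above.
    """
    payload = mtu - pppoe_bytes - ip_header_bytes - tcp_header_bytes - tcp_options_bytes
    on_wire = mtu + framing_bytes
    return shaper_rate_mbps * payload / on_wire


for label, ip_header in (("IPv4", 20), ("IPv6", 40)):
    for rate in (105, 36):
        value = goodput_mbps(rate, ip_header)
        print(f"{label} {rate:>3} Mbps shaper: {value:.2f} Mbps goodput")
```

Keeping the byte counts as parameters makes it easy to redo the estimate for a link with different encapsulation.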
> >>>
> >>> Compared to these numbers, the reported 87.499/26.531 do not seem either saturating or precise, probably neither... I do wonder how reliable the RPM number is going to be if the "saturate the link" part apparently failed?
> >>>
> >>> Question: is full saturation actually intended here, or is "sufficiently high load" the goal (if the latter, what constitutes sufficiently high)?
> >>
> >> According to the spec, you may not get completely full saturation. The
> >> definition of saturation per the spec is fairly specific: See 4.1.4 of
> >> https://github.com/network-quality/draft-ietf-ippm-responsiveness/blob/master/draft-ietf-ippm-responsiveness.txt.
> >
> >       Thanks for the pointer... I am not sure I am convinced by that logic regarding saturation. I think I like the "add more flows" approach to generating load, but I am not sure the 4 second increments are long enough. Aiming for a 20 second run time also seems like a decent goal, but soldiering on in case the results are not stable at that point seems more helpful than leaving that undefined, no?
> >
> > Also, I wish the test would report something like the raw delays in addition to the RPMs. Personally, I think reporting delay/RPM without load also seems worthwhile, as that basically indicates how much a specific link could actually gain from traffic-shaping/competent AQM...
> >
> >
> >
> >>
> >> We are still working through some of the timing issues in the
> >> goresponsiveness client and doing our best.
> >
> >       Ah, that seems to be the case. From my understanding, each additional set of 4 flows will increase the run time by up to 4 seconds, so with 12 parallel connections I would expect a run time between 8+1 and 8+4 seconds, so 9-12, which is a bit above the reported 5.5 seconds (goresponsiveness results above),
> > and similarly for the results below:
> > 20 flows -> expected: 17-20 seconds actual:  9.7
> > 24 flows -> expected: 21-24 seconds actual:  12.9
> >
> > So it seems like the run time is roughly half of what the draft seems to recommend, no?
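The run-time expectation above can be written down as a small sketch. It encodes only the reading given in this message (flows are added in sets of 4, and each set extends the run by between 1 and 4 seconds); the function name and parameters are hypothetical, and this is not taken from the draft or from the goresponsiveness code.

```python
def expected_runtime_range(flows, flows_per_step=4, step_seconds=4):
    """Return (shortest, longest) expected run time in seconds.

    Assumes every completed step took its full duration and the final
    step ended after at least one second.
    """
    steps = flows // flows_per_step
    longest = steps * step_seconds
    shortest = (steps - 1) * step_seconds + 1
    return shortest, longest


# Flow counts and wall-clock times reported in this thread.
observed = {12: 5.5, 20: 9.7, 24: 12.9}
for flows, actual in observed.items():
    shortest, longest = expected_runtime_range(flows)
    print(f"{flows} flows: expected {shortest}-{longest} s, actual {actual} s, "
          f"actual/longest = {actual / longest:.2f}")
```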
> >
> >
> >> Thank you for your
> >> patience and helpful, constructive and kind feedback.
> >
> >       Sorry, on second reading my mail is neither helpful, constructive, nor kind, but appears rather rude. I want to apologize for this, as this was not my intent. I very much appreciate your project and actually just want to understand it better.
> >
> >
> >>
> >> Will
> >>
> >>
> >>>
> >>> Regards
> >>>       Sebastian
> >>>
> >>> P.S.: When disabling my traffic shaper I get:
> >>>
> >>> user at ubuntu20.4LTS:~/CODE/flent$ time ../goresponsiveness/networkQuality --config mensura.cdn-apple.com --port 443 --path /api/v1/gm/config --timeout 60
> >>> 03-27-2022 18:20:06 UTC Go Responsiveness to mensura.cdn-apple.com:443...
> >>> Download:  99.390 Mbps ( 12.424 MBps), using 12 parallel connections.
> >>> Upload:    27.156 Mbps (  3.395 MBps), using 20 parallel connections.
> >>> Total RTTs measured: 5
> >>> RPM:   512
> >>>
> >>> real    0m9.728s
> >>> user    0m1.763s
> >>> sys     0m1.660s
> >>> user at ubuntu20.4LTS:~/CODE/flent$ time ../goresponsiveness/networkQuality --config mensura.cdn-apple.com --port 443 --path /api/v1/gm/config --timeout 60
> >>> 03-27-2022 18:20:27 UTC Go Responsiveness to mensura.cdn-apple.com:443...
> >>> Download:  97.608 Mbps ( 12.201 MBps), using 12 parallel connections.
> >>> Upload:    27.375 Mbps (  3.422 MBps), using 24 parallel connections.
> >>> Total RTTs measured: 5
> >>> RPM:   366
> >>>
> >>> real    0m12.926s
> >>> user    0m2.330s
> >>> sys     0m2.442s
> >>>
> >>> Which for the download approaches/exceeds the theoretical limit* yet for the upload still falls a bit short of the 8 flow flent result
> >>>
> >>>
> >>> *) This is nothing particularly unique to goresponsiveness; most multi-stream tests fail to actually report robust and reliable values for throughput, even though this "ain't rocket science". Instead of trying to discount the TCP start-up phase somehow, simply run the test long enough that the start-up does not dominate the result and just divide total transported volume by total duration... Yes, that is going to slightly underestimate the total link capacity, but that seems IMHO considerably more benign/useful than reporting 99.39 of 99.08 possible, or rather of 97.71 possible, since the actual test uses IPv6 on my link...
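A toy illustration of the "total volume divided by total duration" approach follows. The linear 2 second ramp-up and the 100 Mbps link are made-up model parameters, not measurements; the point is only that the underestimate shrinks as the test runs longer.

```python
def average_throughput_mbps(total_bytes, duration_s):
    """Plain average: total transported volume over total duration."""
    return total_bytes * 8 / duration_s / 1e6


def transferred_bytes(duration_s, link_mbps=100.0, ramp_s=2.0):
    """Hypothetical model: rate rises linearly for ramp_s seconds, then stays at link rate."""
    ramp = min(duration_s, ramp_s)
    full = max(duration_s - ramp_s, 0.0)
    # Area under the linear ramp plus the area at full rate, in megabits.
    megabits = link_mbps * (ramp * ramp / (2 * ramp_s) + full)
    return megabits * 1e6 / 8


for duration in (5, 10, 30):
    average = average_throughput_mbps(transferred_bytes(duration), duration)
    print(f"{duration:>2} s test: {average:.1f} Mbps average on a 100 Mbps link")
```

In this model the average always stays below the link rate, so it cannot report more than the link can carry.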
> >>>
> >>>
> >>>
> >>>> On Mar 27, 2022, at 16:25, Dave Taht via Rpm <rpm at lists.bufferbloat.net> wrote:
> >>>>
> >>>> Sasha has re-organized the meeting notes we've been taking which will
> >>>> be permanently up here:
> >>>>
> >>>> https://docs.google.com/document/d/19K14vrBLNvX_KOdRMK_s0SCYTnj70IJ-dQubL37kQgw/edit
> >>>>
> >>>> The public meetings continue at 10AM PDT Tuesdays.
> >>>>
> >>>> One thing came up in that meeting that bears further reflection: The
> >>>> structure of the responsiveness test is e2e, and differentiating
> >>>> between the wifi hop and/or sourcing tests from the router is hard.
> >>>> In the latter case, we are asking a router not only to saturate its
> >>>> network connection, but to do it using crypto, which is a mighty task
> >>>> to ask of a router whose job is primarily routing (with offloads).
> >>>>
> >>>> A thought might be to attempt p2p tests between, say, an iOS phone and
> >>>> an OS X laptop, which would exercise the router as a wifi router. A
> >>>> docker container for the server would be straightforward. (It's still
> >>>> hard for me to trust containers or VMs to run fast or accurately
> >>>> enough, too.)
> >>>>
> >>>> The former case is still e2e but with the ICMP unreachable idea I described in wtbb.
> >>>>
> >>>> --
> >>>> I tried to build a better future, a few times:
> >>>> https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org
> >>>>
> >>>> Dave Täht CEO, TekLibre, LLC
> >>>> _______________________________________________
> >>>> Rpm mailing list
> >>>> Rpm at lists.bufferbloat.net
> >>>> https://lists.bufferbloat.net/listinfo/rpm
> >>>
> >
>