[Cake] Simple metrics

Tue Nov 28 17:41:47 EST 2017

Pete Heist <peteheist at gmail.com> writes:

>> On Nov 28, 2017, at 7:15 PM, Dave Taht <dave at taht.net> wrote:
>> 
>> Pete Heist <peteheist at gmail.com> writes:
>> 
>>>    On Nov 27, 2017, at 7:28 PM, Jonathan Morton <chromatix99 at gmail.com> wrote:
>>> 
>>>    An important factor when designing the test is the difference between
>>>    intra-flow and inter-flow induced latencies, as well as the baseline
>>>    latency.
>>> 
>>>    In general, AQM by itself controls intra-flow induced latency, while flow
>>>    isolation (commonly FQ) controls inter-flow induced latency. I consider the
>>>    latter to be more important to measure.
>>> 
>>> Intra-flow induced latency should also be important for web page load time and
>>> websockets, for example. Maybe not as important as inter-flow, because there
>>> you’re talking about how voice, videoconferencing and other interactive apps
>>> work together with other traffic, which is what people are affected by the most
>>> when it doesn’t work.
>>> 
>>> I don’t think it’s too much to include one public metric for each. People are
>>> used to “upload” and “download”, maybe they’d one day get used to “reactivity”
>>> and “interactivity”, or some more accessible terms.
>> 
>> Well, what I proposed was using a pfifo as the reference
>> standard, and "FQ" as one metric name for the ratio pfifo 1000/newstuff.
>> 
>> That normalizes any test we come up with.
>
> So one could have 6 FQ on the intra-flow latency test and 4 FQ on the inter-flow latency test, for example, because it’s always a factor of pfifo 1000’s result on whatever test is run?

Yep. Using 1000 for the FIFO queue length also pleases me, given all the
academic work done at 50 or 100.

It would even work for changes in TCP RTT measurements, although "FQ" is
sort of a bad name here. I'd be open to another name. LQ (latency
quotient)? LS (latency stress)?
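
As a minimal sketch of the normalization, assuming each test yields one
latency figure in milliseconds and that a higher quotient means "better
than pfifo 1000": the function name and the sample numbers below are
hypothetical, made up for illustration, not measured results.

```python
def latency_quotient(pfifo_result_ms: float, candidate_result_ms: float) -> float:
    """Return the pfifo 1000 result divided by the candidate's result.

    Both arguments are assumed to come from the same test run against the
    same link, once with pfifo (limit 1000) and once with the qdisc under
    test. A value above 1.0 means the candidate induced less latency.
    """
    if pfifo_result_ms <= 0 or candidate_result_ms <= 0:
        raise ValueError("latency results must be positive")
    return pfifo_result_ms / candidate_result_ms


# Hypothetical numbers: pfifo 1000 shows 120 ms, the candidate shows 20 ms.
score = latency_quotient(120.0, 20.0)
print(f"{score:.1f}")  # 6.0
```

Because the score is a ratio against the same reference, the same function
would apply to either the intra-flow or the inter-flow test.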

>
>>>    Baseline latency is a factor of the underlying network topology, and is
>>>    the type of latency most often measured. It should be measured in the
>>>    no-load condition, but the choice of remote endpoint is critical. Large ISPs
>>>    could gain an unfair advantage if they can provide a qualifying endpoint
>>>    within their network, closer to the last mile links than most realistic
>>>    Internet services. Conversely, ISPs are unlikely to endorse a measurement
>>>    scheme which places the endpoints too far away from them.
>>> 
>>>    One reasonable possibility is to use DNS lookups to randomly-selected gTLDs
>>>    as the benchmark. There are gTLD DNS servers well-placed in essentially all
>>>    regions of interest, and effective DNS caching is a legitimate means for an
>>>    ISP to improve their customers' internet performance. Random lookups
>>>    (especially of domains which are known to not exist) should defeat the
>>>    effects of such caching.
>>> 
>>>    Induced latency can then be measured by applying a load and comparing the
>>>    new latency measurement to the baseline. This load can simultaneously be
>>>    used to measure available throughput. The tests on dslreports offer a decent
>>>    example of how to do this, but it would be necessary to standardise the
>>>    load.
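
The comparison described above can be sketched as a subtraction of two
sets of samples. This is only an illustration: the function name is
hypothetical, the use of the median as the summary statistic is an
assumption (the thread does not specify one), and the sample values are
invented.

```python
from statistics import median
from typing import Sequence


def induced_latency_ms(baseline_samples_ms: Sequence[float],
                       loaded_samples_ms: Sequence[float]) -> float:
    """Return median latency under load minus median no-load latency.

    baseline_samples_ms: latency samples taken with the link idle.
    loaded_samples_ms: samples to the same endpoint while the load runs.
    """
    if not baseline_samples_ms or not loaded_samples_ms:
        raise ValueError("both sample sets must be non-empty")
    return median(loaded_samples_ms) - median(baseline_samples_ms)


# Invented samples: about 21 ms idle, about 65 ms under load.
print(induced_latency_ms([20.0, 21.0, 22.0], [60.0, 65.0, 70.0]))  # 44.0
```
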
>>> 
>>> It would be good to know what an average worst case heavy load is on a typical
>>> household Internet connection and standardize on that. Windows updates for
>>> example can be pretty bad (many flows).
>> 
>> My mental reference has always been family of four -
>> 
>> Mom in a videoconference
>> Dad surfing the web
>> Son playing a game
>> Daughter uploading to YouTube
>> 
>> (pick your gender neutral roles at will)
>> 
>> + Torrenting or Dropbox or Windows Update or Steam or …
>
> That sounds like a pretty reasonable family maximum.
>
>> A larger scale reference might be a company of 30 people.
>
> I’m only speculating that an average active company user generates less traffic than an average active home user, depending on the line of work of course.
>
> Could there be a single test that’s independent of scale and intelligently
> exercises the connection until the practical limits of its rrul-related
> variables are known? I think that’s what would make testing much easier. I
> realize I'm conflating the concept of a simple testing metric with this idea.