[Bloat] [ippm] Fwd: New Version Notification for draft-cpaasch-ippm-responsiveness-00.txt
Toerless Eckert
tte at cs.fau.de
Tue Sep 21 16:50:03 EDT 2021
Dear authors
Thanks for the draft
a) Can you please update the name of the draft so that people who remember RPM will find it?
Something like:
draft-cpaasch-ippm-rpm-bufferbloat-metric-00
Round-trips Per Minute (RPM) under load - a Metric for bufferbloat.
b) The draft does not mention, or at least does not have a
separate section discussing, where the server against which the test is run is located.
It should have such a section. I can think of at least two key options:
- the server used for the service in question (e.g.: where the content comes from),
- a server at a well-defined location in the access network provider.
c) I fear that b) leads to the biggest current issue with the metric:
The longer the path is, such as the full path to a server, the more useful the
metric is for the user. But the user will effectively get a per-service metric.
To make this more fun for the authors: Imagine the AppleTV server nodes have a worse
path to a particular user than the Netflix servers. Or vice versa.
If we just use a path to some fixed point in the access provider,
then we take away the user's ability to beat up their OTT services to
improve their paths.
If we use only a path toward the service, it will be harder to
hit on the service provider, if the service provider is bad.
So, obviously, I would like to have all three RPMs: to Netflix, to AppleTV,
and to a well-defined server in Comcast. Then I can triangulate where
my bufferbloat problem is.
d) Worse yet, without having seen more example numbers (a reference pointing
to some good collected RPM numbers would be excellent), my
concern is that instead of fixing bufferbloat on paths, we would simply
encourage OTTs to co-locate servers with the access provider's own measurement
point, aka: as close to the subscriber as possible.
e) To solve d), maybe two ideas:
- only (lRPM - iRPM) is relevant to improving bufferbloat, where
lRPM would be your current RPM (e.g.: under (l)oaded condition),
and iRPM is the idle RPM. This still does not take away from the
fact that a path with more queuing hops or higher queue loads
will fare worse than the path with shorter physical propagation latency,
but it does make the metric significantly more focused on queueing,
and should help a lot when we compare services that might not
have servers in the user's metro area.
- lRPM/m - RPM under load per mile (roughly).
- Measure idle RTT in units of msec (iRTT)
- Measure load RTT in units of msec (lRTT)
- Just take iRTT as a measure of the path length.
Normalizing it absolutely is not of first-order
importance; we are primarily interested in relative numbers,
and this keeps the example calculation simple.
- the RTT increase because of queueing is (lRTT - iRTT).
- (lRTT - iRTT) / iRTT is therefore something like queuing RTT
per path stretch. I think this is the relative number we want.
- RPM = iRTT / (lRTT - iRTT) * 1000 turns this into a
number that increases with the desired non-bufferbloat performance,
with enough significant digits before the decimal point.
- Example:
idle RTT: 5msec, loaded RTT: 20 msec => 333 RPM
idle RTT: 10msec, loaded RTT: 20 msec => 1000 RPM
idle RTT: 15msec, loaded RTT: 20 msec => 3000 RPM
This nicely shows how the RPM will go up when the physical
path itself gets longer, but the relevant load RTT stays
the same.
idle RTT: 5msec, loaded RTT: 20 msec => 333 RPM
idle RTT: 10msec, loaded RTT: 40 msec => 333 RPM
idle RTT: 15msec, loaded RTT: 60 msec => 333 RPM
This nicely shows that we can have servers at different
physical distance and get the same RPM number, when the
bufferbloat is the same, e.g.: 15 msec worth of bufferbloat
for every 5msec propagation latency segment.
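
The calculation in e) can be sketched in a few lines of Python. This is only
an illustration of the formula proposed in this mail, not anything defined in
the draft; the function name normalized_rpm and the handling of a path with no
measurable queueing (returning infinity) are assumptions made for the example.

```python
def normalized_rpm(idle_rtt_ms: float, loaded_rtt_ms: float) -> float:
    """Proposed metric: iRTT / (lRTT - iRTT) * 1000.

    idle_rtt_ms   -- iRTT, round-trip time on the idle path, in msec
    loaded_rtt_ms -- lRTT, round-trip time under load, in msec
    """
    if idle_rtt_ms <= 0:
        raise ValueError("idle RTT must be positive")
    queueing_ms = loaded_rtt_ms - idle_rtt_ms  # RTT increase due to queueing
    if queueing_ms <= 0:
        # Assumption: no measurable queueing means an unbounded score.
        return float("inf")
    return idle_rtt_ms / queueing_ms * 1000


if __name__ == "__main__":
    # The (idle RTT, loaded RTT) pairs from the two example sets above.
    samples = [(5, 20), (10, 20), (15, 20), (5, 20), (10, 40), (15, 60)]
    for idle, loaded in samples:
        value = int(normalized_rpm(idle, loaded))  # truncate as in the examples
        print(f"idle RTT: {idle} msec, loaded RTT: {loaded} msec => {value} RPM")
```

Truncating to whole numbers reproduces the example figures above: the result
depends only on the ratio of queueing delay to idle RTT, not on the absolute
path length.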
f) I can see how you do NOT want the type of metric I am
proposing, because it only focuses on the bufferbloat
factor, and you may want to stick to the full experience of
the user, where unmistakably the propagation latency cannot
be ignored, but to repeat from above:
If we do not use a metric that fairly treats paths of different
propagation latencies as the same wrt. performance, I am
quite persuaded we will continue to just see big services
win out, because they can more easily afford to get closer
to the user with their (rented/time-shared/owned) servers.
Aka: Right now RPM is a metric that will specifically
make it easier for one of the big providers of streaming,
such as that of the authors, to position themselves better
against smaller services streaming from further away.
Cheers
Toerless