[Ecn-sane] rtt-fairness question

Rodney W. Grimes 4bone at gndrsh.dnsmgr.net
Tue Apr 19 19:55:06 EDT 2022


> David's last point reminds me of a time-sharing system I once worked on. We
> adjusted the scheduling so tasks that needed lower latency got priority and
> we deliberately increased latency for tasks that users assumed would take a
> while :-)))

To some extent that concept still exists in modern system schedulers,
such as giving a priority boost to processes that appear to be highly
interactive; a bullet item from FreeBSD's sched_ule man page:

	Interactivity heuristics that detect interactive applications
	and schedules them preferentially under high load.

I believe there is similar functionality in the sched_4bsd scheduler,
as well as the Linux scheduler.
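
Very roughly, the gist of such a heuristic (a toy sketch, not the actual ULE
code; the names here are made up) is to score a thread by how much it sleeps
versus how much it runs, and to queue the low-scoring ones first under load:

/*
 * Toy interactivity score: 0 = fully interactive, 100 = pure CPU hog.
 * Loosely modeled on the sleep/run ratio idea, not the real kernel code.
 */
#define SCORE_MAX 100

struct task_hist {
        unsigned long sleep_ticks;      /* time spent voluntarily blocked */
        unsigned long run_ticks;        /* time spent on-CPU */
};

static int
interact_score(const struct task_hist *t)
{
        unsigned long s = t->sleep_ticks, r = t->run_ticks;

        if (s == 0 && r == 0)
                return SCORE_MAX / 2;           /* no history yet: neutral */
        if (s >= r)                             /* sleeps more than it runs */
                return (int)(SCORE_MAX / 2 * r / s);
        return (int)(SCORE_MAX - SCORE_MAX / 2 * s / r);
}

A scheduler could then, for example, put tasks scoring below some threshold
on a higher-priority run queue whenever the system is loaded.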

Regards,
Rod

> 
> v
> 
> 
> On Tue, Apr 19, 2022 at 4:40 PM David P. Reed <dpreed at deepplum.com> wrote:
> 
> > Sebastian - all your thoughts here seem reasonable.
> >
> >
> >
> > I would point out only two things:
> >
> >
> >
> > 1) 100 ms. is a magic number for human perception. It's basically the
> > order of magnitude of humans' ability to respond to unpredictable events
> > outside the human. That's why it is magic. Now humans can actually perceive
> > intervals much, much shorter (depending on how we pay attention), but
> > usually it is by comparing two events' time ordering. We can even
> > synchronize to external, predictable events with finer resolution (as in
> > Jazz improv or just good chamber music playing).  A century of careful
> > scientific research supports this, not just one experiment. Which is why
> > one should take it seriously as a useful target. (the fact that one can
> > achieve it across the planet with digital signalling networks makes it a
> > desirable goal for anything interactive between a human and any entity, be
> > it computer or human). If one can do better, of course, that's great. I
> > like that from my home computer I can get lots of places in under 8 msec
> > (15 msec RTT).
> >
> >
> >
> > 2) given that a particular heavily utilized link might be shared for paths
> > where the light-speed-in-fiber round trip for active flows varies by an
> > order of magnitude, why does one try to make fair RTT (as opposed to all
> > other possible metrics on each flow) among flows? It doesn't make any sense
> > to me why. Going back to human interaction times, it makes sense to me that
> > you might want to be unfair so that most flows get faster than 200 ms. RTT,
> > for example, penalizing those who are really close to each other anyway.
> >
> > If the RTT is already low because congestion has been controlled, you
> > can't make it lower. Basically, the ideal queue state is < 1 packet in the
> > bottleneck outbound queues, no matter what the RTT through that queue is.
> >
> >
> >
> >
> >
> >
> >
> > On Thursday, April 14, 2022 5:25pm, "Sebastian Moeller" <moeller0 at gmx.de>
> > said:
> >
> > > Just indulge me here for a few crazy ideas ;)
> > >
> > > > On Apr 14, 2022, at 18:54, David P. Reed <dpreed at deepplum.com> wrote:
> > > >
> > > > Am I to assume, then, that routers need not pay any attention to RTT to
> > > achieve RTT-fairness?
> > >
> > > Part of RTT-bias seems caused by the simple fact that tight control
> > loops work
> > > better than sloppy ones ;)
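> > >
> > > (Rough rule of thumb for why: the classic Mathis et al. approximation for a
> > > Reno-style loop gives throughput ~ (MSS / RTT) * sqrt(3 / (2p)), so at the
> > > same drop/mark probability p a flow with half the RTT ends up with roughly
> > > twice the throughput.)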
> > >
> > > There seem to be three ways to try to remedy that to some degree:
> > > 1) the daft one:
> > > define a reference RTT (larger than typically encountered) and have all
> > TCPs
> > > respond as if encountering that delay -> until the path RTT exceeds that
> > > reference, TCP things should be reasonably fair
> > >
> > > 2) the flows communicate with the bottleneck honestly:
> > > if flows would communicate their RTT to the bottleneck the bottleneck
> > could
> > > partition its resources such that signaling (mark/drop) and buffer size
> > is
> > > bespoke per-flow. In theory that can work, but relies on either the RTT
> > > information being non-gameably linked to the protocol's operation* or
> > everybody
> > > being fully veridical and honest
> > > *) think a protocol that will only work if the best estimate of the RTT
> > is
> > > communicated between the two sides continuously
> > >
> > > 3) the router being verbose:
> > > If routers communicate the fill-state of their queue (global or per-flow
> > does not
> > > matter all that much) flows in theory can do a better job at not putting
> > way too
> > > much data in flight remedying the cost of drops/marks that affects high
> > RTT flows
> > > more than the shorter ones. (The router has little incentive to lie
> > here, if it
> > > wanted to punish a flow it would be easier to simply drop its packets
> > and be done
> > > with).
> > >
> > >
> > > IMHO 3, while theoretically the least effective of the three, is the only
> > one that
> > > has a reasonable chance of being employed... or rather is already
> > deployed in the
> > > form of ECN (with mild effects).
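> > >
> > > To make 1) a bit more concrete (purely a toy sketch; the constant and the
> > > function name are made up): scale TCP's additive increase by the ratio of
> > > the measured RTT to the reference RTT, so that a short-RTT flow grows its
> > > window no faster in wall-clock time than a flow that really sits at the
> > > reference RTT:
> > >
> > > /* Toy Reno-style per-ACK increase normalized to a reference RTT.
> > >    rtt_us is the smoothed path RTT in microseconds, cwnd in packets. */
> > > #define RTT_REF_US 100000                /* pretend 100 ms reference */
> > >
> > > static double
> > > cwnd_after_ack(double cwnd, unsigned int rtt_us)
> > > {
> > >         double scale = (double)rtt_us / RTT_REF_US;
> > >
> > >         if (scale > 1.0)            /* paths slower than the reference */
> > >                 scale = 1.0;        /* just behave like plain Reno     */
> > >         return cwnd + scale / cwnd; /* instead of the usual 1/cwnd     */
> > > }
> > >
> > > A 10 ms flow then adds a full packet to its window only about every ten of
> > > its own RTTs, i.e. at roughly the pace a real 100 ms flow sees, which is
> > > the "respond as if encountering that delay" part of 1).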
> > >
> > > > How does a server or client (at the endpoint) adjust RTT so that it is
> > fair?
> > >
> > > See 1) above, but who in their right mind would actually implement
> > something like
> > > that (TCP Prague did that, but IMHO never in earnest, just to
> > "address" the
> > > L4S bullet point RTT-bias reduction).
> > >
> > > > Now RTT, technically, is just the sum of the instantaneous queue
> > lengths in
> > > bytes along the path and the reverse path, plus a fixed wire-level
> > delay. And
> > > routers along any path do not have correlated queue sizes.
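> > > > (In symbols, roughly: RTT(t) ~= D_prop + sum over queues i of Q_i(t) / C_i,
> > > > with Q_i the backlog in bytes at queue i on the forward or reverse path and
> > > > C_i the rate at which that queue drains.)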
> > > >
> > > > It seems to me that RTT adjustment requires collective real-time
> > cooperation
> > > among all-or-most future users of that path. The path is partially
> > shared by many
> > > servers and many users, none of whom directly speak to each other.
> > > >
> > > > And routers have very limited memory compared to their
> > throughput-RTdelay
> > > product. So calculating the RTT using spin bits and UIDs for packets
> > seems a bit
> > > much to expect all routers to do.
> > >
> > > If posed like this, I guess the better question is, what can/should
> > routers be
> > > expected to do here: either equitably share their queues or share queues
> > > inequitably such that throughput is equitable. From a pure router point
> > > of
> > > view, the first seems "fairest", but as fq_codel and cake show, within
> > reason
> > > equitable capacity sharing is possible (so not perfectly and not for
> > every
> > > possible RTT spread).
> > >
> > > >
> > > > So, what process measures the cross-interactions among all the users
> > of all
> > > the paths, and what control-loop (presumably stable and TCP-compatible)
> > actually
> > > converges to RTT fairness IRL?
> > >
> > > Theoretically nothing, in reality on a home link FQ+competent AQM goes a
> > long way
> > > in that direction.
> > >
> > >
> > > >
> > > > Today, the basis of congestion control in the Internet is that each
> > router is
> > > a controller of all endpoint flows that share a link, and each router is
> > free to
> > > do whatever it takes to reduce its queue length to near zero as an
> > average on all
> > > timescales larger than about 1/10 of a second (a magic number that is
> > directly
> > > derived from measured human brain time resolution).
> > >
> > > The typical caveat applies: be suspicious of too-round numbers.... 100ms is in
> > no way
> > > magic and also not "correct"; it is, however, a decent description of
> > reaction times
> > > in a number of perceptual tasks that can be mis-interpreted as showing
> > things like
> > > the brain runs at 10Hz or similar...
> > >
> > >
> > > >
> > > > So, for any two machines separated by less than 1/10 of a light-second
> > in
> > > distance, the total queueing delay has to stabilize in about 1/10 of a
> > second.
> > > (I'm using a light-second in a fiber medium, not free-space, as the
> > speed of light
> > > in fiber is a lot slower than the speed of light on microwaves, as Wall
> > Street has
> > > recently started recognizing and investing in).
> > > >
> > > > I don't see how RTT-fairness can be achieved by some set of bits in
> > the IP
> > > header. You can't shorten RTT below about 2/10 of a second in that
> > desired system
> > > state. You can only "lengthen" RTT by delaying packets in source or
> > endpoint
> > > buffers, because it's unreasonable to manage all the routers.
> > > >
> > > > And the endpoints that share a path can't talk to each other and reach
> > a
> > > decision in on the order of 2/10 of a second.
> > > >
> > > > So at the very highest level, what is RTT-fairness's objective function
> > > optimizing, and how can it work?
> > > >
> > > > Can it be done without any change to routers?
> > >
> > > Well, the goal here seems to be to undo the RTT-dependence of throughput so a
> > router can
> > > equalize per flow throughput and thereby (from its own vantage point)
> > enforce RTT
> > > independence, within the amount of memory available. And that already
> > works today
> > > for all identifiable flows, but apparently at a computational cost that
> > larger
> > routers do not want to pay. But you knew all that.
> > >
> > >
> > > >
> > > >
> > > >
> > > >
> > > > On Tuesday, April 12, 2022 3:07pm, "Michael Welzl" <michawe at ifi.uio.no
> > >
> > > said:
> > > >
> > > >
> > > >
> > > > On Apr 12, 2022, at 8:52 PM, Sebastian Moeller <moeller0 at gmx.de>
> > > wrote:
> > > > Question: is QUIC actually using the spin bit as an essential part of
> > the
> > > protocol?
> > > > The spec says it's optional:
> > > https://www.rfc-editor.org/rfc/rfc9000.html#name-latency-spin-bit
> > > > Otherwise endpoints might just game this if faking their RTT at a
> > router
> > > yields an advantage...
> > > > This was certainly discussed in the QUIC WG. Probably perceived as an
> > unclear
> > > incentive, but I didn't really follow this.
> > > > Cheers,
> > > > Michael
> > > >
> > > > This is why pping's use of tcp timestamps is elegant, little incentive
> > for
> > > the endpoints to fudge....
> > > >
> > > > Regards
> > > > Sebastian
> > > >
> > > >
> > > > On 12 April 2022 18:00:15 CEST, Michael Welzl <michawe at ifi.uio.no>
> > > wrote:
> > > > Hi,
> > > > Who or what are you objecting against? At least nothing that I
> > described
> > > does what you suggest.
> > > > BTW, just as a side point, for QUIC, routers can know the RTT today -
> > using
> > > the spin bit, which was designed for that specific purpose.
> > > > Cheers,
> > > > Michael
> > > >
> > > >
> > > > On Apr 12, 2022, at 5:51 PM, David P. Reed <dpreed at deepplum.com>
> > > wrote:
> > > > I strongly object to congestion control *in the network* attempting to
> > > measure RTT (which is an end-to-end comparative metric). Unless the
> > current RTT is
> > > passed in each packet a router cannot enforce fairness. Period.
> > > >
> > > > Today, by packet drops and fair marking, information is passed to the
> > sending
> > > nodes (eventually) about congestion. But the router can't know RTT today.
> > > >
> > > > The result of *requiring* RTT fairness would be to have the random
> > bottleneck
> > > router (chosen because it is the slowest forwarder on a contended path)
> > become the
> > > endpoint controller.
> > > >
> > > > That's the opposite of an "end-to-end resource sharing protocol".
> > > >
> > > > Now, I'm not saying it is impossible - what I'm saying is that it asks all
> > > endpoints to register with an "Internet-wide" RTT real-time tracking and
> > control
> > > service.
> > > >
> > > > This would be the technical equivalent of an ITU central control point.
> > > >
> > > > So, either someone will invent something I cannot imagine (a
> > distributed,
> > > rapid-convergence algorithm that reflects to *every potential user* of
> > a shared
> > > router along the current path the RTT's of ALL other users (and
> > potential users).
> > > >
> > > > IMHO, the wish for RTT fairness is like saying that the entire solar
> > system's
> > > gravitational pull should be equalized so that all planets and asteroids
> > have fair
> > > access to 1G gravity.
> > > >
> > > >
> > > > On Friday, April 8, 2022 2:03pm, "Michael Welzl" <michawe at ifi.uio.no>
> > > said:
> > > >
> > > > Hi,
> > > > FWIW, we have done some analysis of fairness and convergence of DCTCP
> > in:
> > > > Peyman Teymoori, David Hayes, Michael Welzl, Stein Gjessing:
> > "Estimating an
> > > Additive Path Cost with Explicit Congestion Notification", IEEE
> > Transactions on
> > > Control of Network Systems, 8(2), pp. 859-871, June 2021. DOI
> > > 10.1109/TCNS.2021.3053179
> > > > Technical report (longer version):
> > > >
> > >
> > https://folk.universitetetioslo.no/michawe/research/publications/NUM-ECN_report_2019.pdf
> > > > and there's also some in this paper, which first introduced our LGC
> > > mechanism:
> > > > https://ieeexplore.ieee.org/document/7796757
> > > > See the technical report on page 9, section D: a simple trick can
> > improve
> > > DCTCP's fairness (if that's really the mechanism to stay with...
> > > I'm getting quite happy with the results we get with our LGC scheme :-)
> > > )
> > > >
> > > > Cheers,
> > > > Michael
> > > >
> > > > On Apr 8, 2022, at 6:33 PM, Dave Taht <dave.taht at gmail.com> wrote:
> > > > I have managed to drop most of my state regarding the state of various
> > > > dctcp-like solutions. At one level it's good to have not been keeping
> > > > up, washing my brain clean, as it were. For some reason or another I
> > > > went back to the original paper last week, and have been pounding
> > > > through this one again:
> > > >
> > > > Analysis of DCTCP: Stability, Convergence, and Fairness
> > > >
> > > > "Instead, we propose subtracting ?/2 from the window size for each
> > > marked ACK,
> > > > resulting in the following simple window update equation:
> > > >
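> > > > As a rough sketch of that per-marked-ACK variant (alpha being DCTCP's usual
> > > > running estimate of the marked fraction; toy code with made-up names, not
> > > > the paper's notation):
> > > >
> > > > static double
> > > > dctcp_cwnd_after_ack(double cwnd, double alpha, int ce_marked)
> > > > {
> > > >         cwnd += 1.0 / cwnd;             /* keep the normal additive increase */
> > > >         if (ce_marked)
> > > >                 cwnd -= alpha / 2.0;    /* shed alpha/2 per marked ACK */
> > > >         return cwnd < 2.0 ? 2.0 : cwnd; /* never collapse below 2 packets */
> > > > }
> > > >
> > > > Over a window with marked fraction F that adds up to a decrease of roughly
> > > > F * W * alpha / 2 packets per window (W being the window size in packets),
> > > > spread across the ACKs rather than taken as one cut per RTT.
> > > >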
> > > > One result of which I was most proud recently was demonstrating
> > > > perfect rtt fairness in a range of 20ms to 260ms with fq_codel
> > > > (https://forum.mikrotik.com/viewtopic.php?t=179307) - and I'm pretty
> > > > interested in 2-260ms, but haven't got around to it.
> > > >
> > > > Now, one early result from the sce vs l4s testing I recall was severe
> > > > latecomer convergence problems - something like 40s to come into flow
> > > > balance - but I can't remember what presentation, paper, or rtt that
> > > > was from.
> > > >
> > > > Another one has been various claims towards some level of rtt
> > > > unfairness being ok, but not the actual ratio, nor (going up to the
> > > > paper's proposal above) whether that method had been tried.
> > > >
> > > > My opinion has long been that any form of marking should look more
> > > > closely at the observed RTT than any fixed rate reduction method, and
> > > > compensate the paced rate to suit. But that's presently just reduced
> > > > to an opinion, not having kept up with progress on prague, dctcp-sce,
> > > > or bbrv2. As one example of ignorance, are 2 packets still paced back
> > > > to back? DRR++ + early marking seems to lead to one packet being
> > > > consistently unmarked and the other marked.
> > > >
> > > > --
> > > > I tried to build a better future, a few times:
> > > > https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org
> > > >
> > > > Dave Täht CEO, TekLibre, LLC
> > > > _______________________________________________
> > > > Ecn-sane mailing list
> > > > Ecn-sane at lists.bufferbloat.net
> > > > https://lists.bufferbloat.net/listinfo/ecn-sane
> > > >
> > > > --
> > > > Sent from my Android device with K-9 Mail. Please excuse my brevity.
> > > >
> > >
> > >
> > _______________________________________________
> > Ecn-sane mailing list
> > Ecn-sane at lists.bufferbloat.net
> > https://lists.bufferbloat.net/listinfo/ecn-sane
> >
> 
> 
> -- 
> Please send any postal/overnight deliveries to:
> Vint Cerf
> 1435 Woodhurst Blvd
> McLean, VA 22102
> 703-448-0965
> 
> until further notice

> _______________________________________________
> Ecn-sane mailing list
> Ecn-sane at lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/ecn-sane
> 
-- 
Rod Grimes                                                 rgrimes at freebsd.org

