Right, I understand that under RFC 3168 behavior the sender would react differently to ECE markings than L4S flows would, but I guess I don't understand why a sender willing to misclassify traffic with ECT(1) wouldn't also choose to react non-normatively to ECE markings.

On the rest, I think we agree.

Kyle

On Thu, Jul 25, 2019 at 3:26 PM Holland, Jake wrote:

> Hi Kyle,
>
> I almost agree, except that the concern is not about classic flows.
>
> I agree (with caveats) with what Bob and Greg have said before: ordinary
> classic flows don't have an incentive to mis-mark if they'll be responding
> normally to CE, because a classic flow will back off too aggressively and
> starve itself if it's getting CE marks from the LL queue.
>
> That said, I had a message where I tried to express something similar to
> the concerns I think you just raised, with regard to a different category
> of flow:
> https://mailarchive.ietf.org/arch/msg/tsvwg/bUu7pLmQo6BhR1mE2suJPPluW3Q
>
> So I agree with the concerns you've raised here, and I want to +1 that
> aspect of it, while also correcting that I don't think these concerns apply
> to ordinary classic flows, but rather to flows that use application-level
> quality metrics to change bit-rates instead of responding at the transport
> level.
>
> For those flows (which seem to include some of today's video conferencing
> traffic), I expect they really would see an advantage by mis-marking
> themselves, and they will require policing that imposes a policy decision.
> Given that, I agree that I don't see a simple alternative to FQ for flows
> originating outside the policer's trust domain when the network is fully
> utilized.
>
> I hope that makes at least a little sense.
>
> Best regards,
> Jake
>
> From: Kyle Rose
> Date: 2019-07-23 at 11:13
> To: Bob Briscoe
> Cc: "ecn-sane@lists.bufferbloat.net", tsvwg IETF list, "David P. Reed"
> Subject: Re: [tsvwg] [Ecn-sane] per-flow scheduling
>
> On Mon, Jul 22, 2019 at 9:44 AM Bob Briscoe wrote:
>
> Folks,
>
> As promised, I've pulled together and uploaded the main architectural
> arguments about per-flow scheduling that cause concern:
>
> Per-Flow Scheduling and the End-to-End Argument
>
> It runs to 6 pages of reading. But I tried to make the time readers will
> have to spend worth it.
>
> Before reading the other responses (poisoning my own thinking), I wanted
> to offer my own reaction. In the discussion of figure 1, you seem to imply
> that there's some obvious choice of bin packing for the flows involved, but
> that can't be right. What if the dark green flow has deadlines? Why should
> that be the one that gets only leftover bandwidth? I'll return to this
> point in a bit.
>
> The tl;dr summary of the paper seems to be that the L4S approach leaves
> the allocation of limited bandwidth up to the endpoints, while FQ
> arbitrarily enforces equality in the presence of limited bandwidth; but in
> reality the bottleneck device needs to make *some* choice when there's a
> shortage and flows don't respond. That requires some choice of policy.
>
> In FQ, the chosen policy is to make sure every flow has the ability to get
> low latency for itself, but, in the absence of some other kind of trusted
> signaling, to allocate an equal proportion of the available bandwidth to
> each flow.
> ISTM this is the best you can do in an adversarial environment, because
> anything else can be gamed to get a more-than-equal share (and depending on
> how "flow" is defined, even this can be gamed by opening up more flows; but
> this is not a problem unique to FQ).
>
> In L4S, the policy is to assume one queue is well-behaved and one is not,
> and to use the ECT(1) codepoint as a classifier to get into one or the
> other. But policy choice doesn't end there: in an uncooperative or
> adversarial environment, you can easily get into a situation in which the
> bottleneck has to apply policy to several unresponsive flows in the
> supposedly well-behaved queue. Note that this doesn't even have to involve
> bad actors misclassifying on purpose: it could be two uncooperative 200 Mbps
> VR flows competing for 300 Mbps of bandwidth. In this case, L4S falls back
> to classic, which with DualQ means every flow, not just the uncooperative
> ones, suffers. As a user, I don't want my small, responsive flows to suffer
> when uncooperative actors decide to exceed the bottleneck bandwidth (BBW).
>
> Getting back to figure 1, how do you choose the right allocation? With the
> proposed use of ECT(1) as classifier, you have exactly one bit available to
> decide which queue, and therefore which policy, applies to a flow. Should
> all the classic flows get assigned whatever is left after the L4S flows are
> allocated bandwidth? That hardly seems fair to classic flows. But let's say
> this policy is implemented. It then escapes me how this is any different
> from the trust problems facing end-to-end DSCP/QoS: why wouldn't everyone
> just classify their classic flows as L4S, forcing everything to be treated
> as classic and getting access to a (greater) share of the overall BBW? Then
> we're left both with a spent ECT(1) codepoint and a need for FQ or some
> other queuing policy to arbitrate between flows, without any bits with
> which to implement the high-fidelity congestion signal required to achieve
> low latency without getting squeezed out.
>
> The bottom line is that I see no way to escape the necessity of something
> FQ-like at bottlenecks outside of the sender's trust domain. If FQ can't be
> done in backbone-grade hardware, then the only real answer is pipes in the
> core big enough to force the bottleneck to live somewhere closer to the
> edge, where FQ does scale.
>
> Note that, in a perfect world, FQ wouldn't trigger at all because there
> would always be enough bandwidth for everything users wanted to do, but in
> the real world it seems like the best you can possibly do in the absence of
> trusted information about how to prioritize traffic. IMO, it's better to
> think of FQ as a last-ditch measure indicating to the operator that they're
> gonna need a bigger pipe than as a steady-state bandwidth allocator.
>
> Kyle
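
For concreteness, the "equal proportion of the available bandwidth to each flow" policy discussed above is roughly what a deficit round robin (DRR) scheduler provides, and it also shows why an unresponsive flow under FQ mostly hurts only itself. The sketch below is illustrative only and is not taken from any draft or deployed queue; the names (DRRScheduler, Flow, QUANTUM_BYTES) and the constants are made up for the example.

from collections import deque

QUANTUM_BYTES = 1500  # per-visit byte allowance per flow (illustrative value)


class Flow:
    """One flow's FIFO of queued packet sizes (in bytes)."""

    def __init__(self, flow_id):
        self.flow_id = flow_id
        self.packets = deque()
        self.deficit = 0  # bytes this flow may still send on its current visit


class DRRScheduler:
    """Deficit Round Robin: every backlogged flow gets an equal byte quantum
    per round, so no flow can grab more than its share of a busy link."""

    def __init__(self):
        self.flows = {}        # flow_id -> Flow
        self.active = deque()  # round-robin order of backlogged flows

    def enqueue(self, flow_id, packet_bytes):
        flow = self.flows.get(flow_id)
        if flow is None:
            flow = self.flows[flow_id] = Flow(flow_id)
        if not flow.packets:
            self.active.append(flow)  # flow just became backlogged
        flow.packets.append(packet_bytes)

    def dequeue(self):
        """Serve the next backlogged flow for one visit and return the list of
        (flow_id, packet_bytes) it transmits, or None if all queues are empty."""
        while self.active:
            flow = self.active.popleft()
            flow.deficit += QUANTUM_BYTES
            sent = []
            # Send head-of-line packets while the accumulated deficit covers them.
            while flow.packets and flow.packets[0] <= flow.deficit:
                size = flow.packets.popleft()
                flow.deficit -= size
                sent.append((flow.flow_id, size))
            if flow.packets:
                self.active.append(flow)  # still backlogged: visit again next round
            else:
                flow.deficit = 0          # idle flows carry no credit forward
            if sent:
                return sent
        return None


if __name__ == "__main__":
    sched = DRRScheduler()
    # An unresponsive flow dumps 100 packets; a responsive flow keeps a short queue.
    for _ in range(100):
        sched.enqueue("unresponsive", 1500)
    for _ in range(5):
        sched.enqueue("responsive", 1500)

    order = []
    while (burst := sched.dequeue()) is not None:
        order.extend(flow_id for flow_id, _ in burst)

    # The responsive flow's 5 packets all depart within the first 10 dequeues,
    # even though the unresponsive flow has 20x as much traffic queued.
    print(order[:12])

A production FQ AQM such as fq_codel layers per-queue AQM (CoDel) and new/old flow lists on top of this kind of equal-share scheduling; only the scheduling part being argued about in this thread is sketched here.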