[Ecn-sane] [tsvwg] Comments on L4S drafts

Fri Jul 19 13:59:38 EDT 2019

Hi Koen,

> On Jul 19, 2019, at 11:06, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper at nokia-bell-labs.com> wrote:
> 
> Hi Sebastian,
> 
> To spare people from reading through the long mail, I think the main point I want to make is:
> "Indeed, having common-Qs supported is one of my requirements. That's why I want to keep the discussion on that level: is there consensus that low latency is only needed for a per flow FQ system with an AQM per flow?"
> 
> If there is this consensus, this means that we can use SCE and that from now on, all network nodes have to implement per flow queuing with an AQM per flow.

	Well, in this exclusivity I would say this is wrong. As always, only a few nodes along a path actually develop queues in the first place, and only those need to implement a competent AQM. As a data point from real life, I employ an fq-shaper for both ingress and egress traffic on my CPE, and almost all of my latency-under-load issues improved to a level where I do not care anymore; of the remaining issues, most are/were caused by my ISP's peerings/transits to the other endpoint of a connection running "hot". And as stated in this thread already, I do not see any of our proposals reaching the transit/peering routers, for lack of a monetary incentive for those that would need to operate AQMs on such devices.
	On monetary incentives, I add that, even though it is not one of L4S's stated goals, L4S looks like a reasonable match for the "special services" exemption carved out in the EU's network neutrality regulations. I do not want to go into a political discussion about special services here, but just note that this is one option for ISPs to monetize a special low-latency service tier (as L4S aims to deliver). Even in that case, though, the ISPs are at best incentivized to build L4S-empowered links into their own data centers and for paid peerings; this still does not address the issue of general peering/transit routers, IMHO.

> If there is no consensus, we cannot use SCE and need to use L4S.

	I am always very wary of this kind of "tertium non datur" argument, as if L4S and SCE were the only options to tackle the issue (sure, those are the two alternatives on the table right now, but that is a different argument).

> 
> For all the other detailed discussion topics, see [K] inline:
> 
> Regards,
> Koen.
> 
> -----Original Message-----
> From: Sebastian Moeller <moeller0 at gmx.de> 
> Sent: Thursday, July 18, 2019 12:40 AM
> To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper at nokia-bell-labs.com>
> Cc: Holland, Jake <jholland at akamai.com>; Jonathan Morton <chromatix99 at gmail.com>; ecn-sane at lists.bufferbloat.net; tsvwg at ietf.org
> Subject: Re: [Ecn-sane] [tsvwg] Comments on L4S drafts
> 
> Dear Koen,
> 
> 
>> On Jul 10, 2019, at 11:00, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper at nokia-bell-labs.com> wrote:
> [...]
>>>> Are you saying that even if a scalable FQ can be implemented in high-volume aggregated links at the same cost and difficulty as dualq, there's a reason not to use FQ?
>> 
>> FQ for "per-user" isolation in access equipment has clearly an extra cost, not? If we need to implement FQ "per-flow" on top, we need 2 levels of FQ (per-user and per-user-flow, so from thousands to millions of queues). Also, I haven’t seen DC switches coming with an FQ AQM...
> 
> 	I believe there is work available demonstrating that a) millions of concurrently active flows might be overly pessimistic (even for peering routers) and b) IMHO it is far from settled that these big transit/peering routers will employ any of the schemes we are cooking up here. For b) I argue that both L4S "linear" CE-marking and SCE linear ECT(1) marking will give a clear signal of overload that a big ISP might not want to explicitly tell its customers...
> 
> [K] a) indeed, if queues can be dynamically allocated you could settle with less, not sure if dynamic allocation is compatible with high speed implementations. Anyway, any additional complexity is additional cost (and cycles, energy, heat dissipation, ...). Of course everything can be done...

	Great that we agree here, this is all about trade-offs.

> b) I don't agree that ECN is a signal of overload.

	Rereading RFC3168, I believe a CE mark is merited only if the packet would be dropped otherwise. IMHO that will most likely be caused by the node running out of some limited resource (bandwidth and/or CPU cycles); sure, it can be a policy decision as well, but I fail to see how such subtleties matter in our discussion.

> It is a natural part of feedback to tell greedy TCP that it reached its full capacity. Excessive drop/latency

	Well, excessive latency often correlates with overload, but IMHO is not causally linked (and hence I believe all schemes trying to deduce overload from latency-under-load-increases are _not_ looking at the right measure).

> is the signal of overload and an ECN-capable AQM switches from ECN to drop anyway in overload conditions.  

	This is the extreme situation, like in L4S when the 20ms queue limit gets exceeded and head- or tail-dropping starts?

> Excessive drop and latency can also be measured today, not?

	Well, only if you have a reasonable prior for what drop rate and latency variation are under normal conditions. And even then one needs time to get the measurement error down to the desired level; in other words, that seems sub-optimal for a tight control loop.
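
	To put a rough number on the measurement time mentioned above, here is a minimal sketch. It assumes packet drops are independent with a fixed probability, which is a simplification since real drops tend to be bursty. The function names and the example figures are illustrative assumptions, not taken from any L4S or SCE draft.

```python
import math


def packets_needed(drop_rate, rel_error):
    """Packets to observe so that the estimated drop rate has the given
    relative standard error, assuming independent (Bernoulli) drops."""
    if not 0.0 < drop_rate < 1.0 or rel_error <= 0.0:
        raise ValueError("need 0 < drop_rate < 1 and rel_error > 0")
    # Standard error of a proportion estimate: sqrt(p * (1 - p) / n).
    # Setting it equal to rel_error * p and solving for n gives:
    return math.ceil((1.0 - drop_rate) / (drop_rate * rel_error ** 2))


def seconds_needed(drop_rate, rel_error, packets_per_second):
    """Observation time at a given (hypothetical) packet rate."""
    return packets_needed(drop_rate, rel_error) / packets_per_second
```

	Under these assumptions, pinning down a 1% drop rate to within 10% relative error takes roughly ten thousand packets, i.e. on the order of ten seconds at 1000 packets per second, which is far longer than a typical round-trip time.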

> Running a few probes can tell customers the same with or without ECN, and capacity is measured simply with speedtests.

	Running probes a) is harder than it seems (as the probes should run against the servers of interest) and b) requires probes sent over the reverse path as well (so one needs looking glass servers close to the endpoints of interest). And speedtests are a whole different can of worms... most end-user accessible speedtests severely under-report the details necessary to actually be able to assess a link's properties even at rest, IMHO.

> 
>> 
>>>> Is there a use case where it's necessary to avoid strict isolation if strict isolation can be accomplished as cheaply?
>> 
>> Even if as cheaply, as long as there is no reliable flow identification, it clearly has side effects. Many homeworkers are using a VPN tunnel, which is only one flow encapsulating maybe dozens.
> 
> 	Fair enough, but why do you see a problem of treating this multiplexed flow as different from any other flow? After all, it was the endpoints' conscious decision to masquerade as a single flow, so why assume special treatment? It is not that intermediate hops have any insight into the multiplexing, so why expect them to cater for this?
> 
> [K] Because the design of VPN tunnels had as a main goal to maintain a secure/encrypted connection between clients and servers, trying to minimize the overhead on clients and servers by using a single TCP/UDP connection. I don't think the single flow was chosen to get treated as one flow's budget of throughput. This "feature" didn't exist at that (pre-FQ) time.

	Well, in pre-FQ times there was no guarantee whatsoever, so claiming this is an insurmountable problem seems a bit naive to me. For one, using IPv6 flow labels or multiple flows are both options to deal with an FQ world. I see this as a non-serious strawman argument.

> 
>> Drop and ECN (if implemented correctly) are tunnel agnostic.
> 
> 	Exactly, and that is true for each identified flow as well, so fq does not diminish this, but rather builds on top of it.
> 
> [K] True for flows within a tunnel, but the point was that FQ treats the aggregated tunnel as a single flow compared to other single flows.

	And so does L4S... (modulo queue protection, but that will only act on packet ingress, as it seems to leave the already queued packets alone). But yes, tunneling has side-effects; don't do it if you dislike these.

> 
>> Also how flows are identified might evolve (new transport protocols, encapsulations, ...?).
> 
> 	You are jesting surely, new protocols? We are in this kerfuffle because you claim that a new protocol to signal a linear CE-marking response is made of unobtainium, so you want to abuse an underused ECN code point as a classifier. If new protocols are an option, just bite the bullet and give tcp-reno a new protocol number and use this for your L4S classifier; problem solved in a nice and clean fashion.
> 
> [K] Indeed, it is hardly possible to deploy new protocols in practice, but I hope we can make it more possible in the future, not less possible... Maybe utopian, but at least we should try to learn from past mistakes.

	So, why not use a new protocol for L4S behaviour then? If L4S truly is the bee's knees then it will drive adoption of the new protocol, and if not, that also tells us something about the market's assessment of L4S's promises.

> 
>> Also if strict flow isolation could be done correctly, it has additional issues related to missed scheduling opportunities,
> 
> 	Please elaborate how an intermediate hop would know about the desires of the endpoints here. As far as I can tell, such hops have their own ideas about optimal scheduling that they will enforce independent of what the endpoints deem optimal (by necessity, as most endpoints will desire highest priority for their packets).
> 
> [K] That network nodes cannot know what the end-systems want is exactly the point. FQ just assumes everybody should have the same throughput,

	Which has the great advantage of being predictable by the enduser.

> and makes an exception for single packets (to undo the most flagrant disadvantage of this strategy).

	Sorry, IMHO this one-packet rule assures forward progress for all flows and is a feature, not a kludge. But I guess I am missing something in your argument, care to elaborate?

>  But again, I don't want to let the discussion get distracted by arguing pro or con FQ. I think we have to live with both now.
> 
> [...]
> 
>>>> Anyway, to me this discussion is about the tradeoffs between the 2 proposals.  It seems to me SCE has some safety advantages that should not be thrown away lightly, 
>> 
>> I appreciate the efforts of trying to improve L4S, but nobody who has been working on L4S for years now sees a way that SCE can work on a non-FQ system.
> 
> 	That is a rather peculiar argument, especially given that both you and Bob, major forces in the L4S approach, seem to have philosophical issues with fq?
> 
> [K] I think I am realistic to accept pro's and con's and existence of both. I think wanting only FQ is as philosophical as wanting no FQ at all.

	Nobody wants you to switch your design away from dualQ or whatever you might want, as long as your choice does not have side-effects on the rest of the internet; use a real classifier instead of trying to press ECT(1) into service where a full bit is required, and the issue is solved. My point is, again, I already use an fq-system on my CPE which gets me quite close to what L4S promises, but without necessarily redesigning most of the internet. So from my perspective FQ has proved itself already; now the newcomer L4S will need to demonstrate sufficient improvements over the existing FQ solution to merit the required non-backward-compatible changes it mandates. And I do want to see a fair competition between the options under fair conditions (and will happily switch to L4S if it proves to be superior).

> 
>> For me (and I think many others) it is a no-go to only support FQ. Unfortunately we only have half a bit free, 
> 
> 	??? Again you elaborately state the options in the L4S RFC and just converge on the one which is most convenient, but also not the best match for your requirements.
> 
> [K] Indeed, having common-Qs supported is one of my requirements.

	Misunderstanding here: I am not talking about dualQ/common-Q or mandating FQ everywhere, but about the fact that you committed to (ab)using ECT(1) as your "classifier" of choice even though this has severe side-effects...

> That's why I want to keep the discussion on that level: is there consensus that low latency is only needed for a per flow FQ system with an AQM per flow?

	This is a strawman argument, as far as I am concerned, as all I want is that L4S be orthogonal to the existing internet. As the L4S RFCs verbosely describe, there are other options for the required classification, so why insist upon using ECT(1)?

> 
>> and we need to choose how to use it. Would you choose for the existing ECN switches that cannot be upgraded (are there any?) or for all future non-FQ systems.
>> 
>>>> so if the performance can be made equivalent, it would be good to know about it before committing the codepoint.
>> 
>> The performance in FQ is clearly equivalent, but for a common-Q behavior, only L4S can work.
>> As far as I understood the SCE-LFQ proposal is actually a slower FQ implementation (an FQ in DualQ disguise 😉), so I think not really a better alternative than pure FQ. Also its single AQM on the bulk queue will undo any isolation, as a coupled AQM is stronger than any scheduler, including FQ.
> 
> 	But how would the bulk queue actually care, being dedicated to bulk flows? This basically just uses a single codel instance for all flows in the bulk queue, exactly the situation codel was designed for, if I recall correctly. Sure, this will run into problems with unresponsive flows, but not any more than DualQ with or without queue protection (you can steer misbehaving flows into the "classic" queue, but this will just change which flows will suffer most of the collateral damage of that unresponsive flow, IMHO).
> 
> [K] As far as I recall, CoDel works best for a single flow.

	As does any other AQM on a single queue... The point is that the AQM really, really wants to target those flows that cause most of the traffic (as throttling those will cause the most immediate reduction in ingress rate for the AQM hop). FQ presents those flows on a platter; single-queue AQMs rely on stochastic truths, like the likelihood of marking/dropping a flow's packets being proportional to the fraction of packets of this flow in the queue. As far as I can tell, DualQ works on exactly the same (stochastic marking) principle and hence will also work best for a single flow (sure, due to the higher marking probability this might not be as pronounced as with RED and codel, but theoretically it will still be there). I might be confused by DualQ, so please correct me if my assumption is wrong.
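
	The "stochastic truth" above can be illustrated with a minimal sketch. It assumes a toy single queue in which every packet is marked independently with the same probability; it is not a model of codel, RED, or DualQ. The function name and the flow labels are hypothetical.

```python
import random
from collections import Counter


def mark_shares(queue, mark_prob, seed=0):
    """Mark each packet in a single shared queue independently with
    probability mark_prob and return each flow's share of all marks.

    queue is a sequence of flow identifiers, one entry per packet.
    Flows that receive no mark at all are absent from the result.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    marks = Counter()
    for flow_id in queue:
        if rng.random() < mark_prob:
            marks[flow_id] += 1
    total = sum(marks.values())
    if total == 0:
        return {}
    return {flow: count / total for flow, count in marks.items()}


# One heavy flow and one sparse flow sharing the same queue.
queue = ["bulk"] * 9000 + ["sparse"] * 1000
shares = mark_shares(queue, mark_prob=0.1)
```

	In this toy setup the heavy flow ends up with a share of the marks close to its share of the packets, but the sparse flow is still hit in proportion to its own share, since a single-queue AQM has no way to single out the flow responsible for the load.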

> For a stateless AQM like a step using only per packet sojourn time, a common AQM over FQs is indeed working as an FQ with an AQM per queue. Applying a stateless AQM for Classic traffic (like sojourn-time RED without smoothing) will have an impact on its performance. Adding common state for all bulk queue AQMs will disable the FQ effect. Anyway, the sequential scan at dequeue is the main reason why LFQ will find it hard to get traction in high-speed equipment.

	I believe this to be directed at Jonathan, so no comment from my side.

Best Regards
	Sebastian

> 
> 
> Best Regards
> 	Sebastian Moeller