From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 22 Jun 2019 18:09:37 -0400 (EDT)
From: "David P. Reed" <dpreed@deepplum.com>
To: "Jonathan Morton" <chromatix99@gmail.com>
Cc: "Brian E Carpenter", "ecn-sane@lists.bufferbloat.net", "tsvwg IETF list"
Message-ID: <1561241377.4026977@apps.rackspace.com>
In-Reply-To: <71EF351D-AFBF-4C92-B6B9-7FD695B68815@gmail.com>
Subject: Re: [Ecn-sane] [tsvwg] per-flow scheduling

Good point, but saturation is not congestion, in my view. The optimal state of a single bottleneck link is a queue of length <= 1, and that can be maintained even under full load by endpoint rate control. (Note that the "Little's Law" result in queueing theory is for a special case of queuing due to uncontrolled Poisson random arrivals. The traffic you refer to, downloading an update, is about as far from Poisson-random as possible: it's neither random nor Poisson.)

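For reference, and in my notation rather than anything quoted from the thread, the relation being invoked is

    L = \lambda W

where L is the average number of packets at the link (queue plus the one in service), \lambda is the average arrival rate, and W is the average time a packet spends there. Holding a fully loaded bottleneck (\lambda close to the service rate \mu) at L <= 1 therefore amounts to holding W near one service time, W ~ 1/\mu, which is exactly the job being handed to endpoint rate control.
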
But I take your point that there will be bottlenecked links that are special, and thus congestion control comes into play on those links.

Per-flow scheduling is appropriate on a shared link. However, the end-to-end argument would suggest that the network not try to divine which flows get preferred.

And beyond the end-to-end argument, there's a practical problem: since the ideal state of a shared link is to have no local backlog in the queue, the information needed to schedule "fairly" isn't in the queue backlog itself. If there is only one packet, what's to schedule?

In fact, what the ideal queueing discipline would do is send signals to the endpoints that provide information as to what each flow's appropriate share is, and/or how far its current share is from what's fair. A somewhat less optimal discipline to achieve fairness might also drop packets beyond the fair share. Dropping is the best way to signal the endpoint that it is trying to overuse the link.

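As a toy illustration of what "each flow's appropriate share" could mean, here is a minimal max-min fair-share computation over measured per-flow demand rates. It is only a sketch under my own assumptions - the function name, the bits-per-second units and the example numbers are invented for illustration - not a description of any particular router's algorithm.

    def max_min_fair_shares(demands_bps, capacity_bps):
        """Compute max-min fair shares for flows with the given demand rates.

        Flows demanding less than an equal split keep their demand; the
        leftover capacity is re-divided among the remaining, greedier flows.
        """
        shares = {}
        remaining = dict(demands_bps)
        spare = capacity_bps
        while remaining:
            equal_split = spare / len(remaining)
            satisfied = {f: d for f, d in remaining.items() if d <= equal_split}
            if not satisfied:
                for f in remaining:          # everyone wants more: split evenly
                    shares[f] = equal_split
                break
            for f, d in satisfied.items():
                shares[f] = d                # small flows keep what they ask for
                spare -= d
                del remaining[f]
        return shares

    # Example: a 100 Mbit/s link shared by one small flow and two bulk flows.
    print(max_min_fair_shares({"dns": 1e6, "bulk1": 80e6, "bulk2": 80e6}, 100e6))
    # {'dns': 1000000.0, 'bulk1': 49500000.0, 'bulk2': 49500000.0}

A discipline could then mark or drop a flow's packets whenever its measured rate exceeds the share this returns, which is the "drop packets beyond the fair share" option above.
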
Merely re-ordering the packets on a link is just not very effective at achieving fairness.

Now, what to do to determine the fair share if the average number of packets in a saturated link's queue is <= 1?

Well, presumably the flows have definable average rates, so good statistical estimators of the average rate of each flow exist. For example, a table with a moving average of the byte rate of each flow can be maintained, which provides wonderful information about the recent history of capacity used by each flow. This allows signalling of overuse to the endpoints of each flow, either by dropping packets or by some small number of bits (e.g. ECN marking). Underuse can also be signalled.

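A minimal sketch of such a table, purely to make the idea concrete; the flow key, the EWMA time constant and the signalling threshold are my own assumptions here, not a description of any deployed queueing discipline.

    import time
    from collections import defaultdict

    class FlowRateTable:
        """Exponentially weighted moving average of each flow's byte rate."""

        def __init__(self, tau_seconds=0.1):
            self.tau = tau_seconds          # time constant of the moving average
            self.rate = defaultdict(float)  # flow key -> bytes per second
            self.last_seen = {}             # flow key -> time of previous packet

        def update(self, flow, nbytes, now=None):
            now = time.monotonic() if now is None else now
            if flow not in self.last_seen:
                self.last_seen[flow] = now  # first packet: nothing to average yet
                return self.rate[flow]
            dt = max(now - self.last_seen[flow], 1e-6)
            alpha = min(dt / self.tau, 1.0)
            # Decay the old estimate and mix in this packet's instantaneous rate.
            self.rate[flow] = (1 - alpha) * self.rate[flow] + alpha * (nbytes / dt)
            self.last_seen[flow] = now
            return self.rate[flow]

    def should_signal(table, flow, fair_share_bps):
        """Mark (ECN) or drop when a flow's estimated rate exceeds its fair share."""
        return table.rate[flow] * 8 > fair_share_bps

Read the other way - an estimated rate well below the fair share - the same table is the "underuse" signal mentioned above.
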
We know (from pragmatic observation) that flows defined as source-host/dest-host pairs have relatively stable demand over many packets and many RTTs. More specific flows (say, defined using TCP and UDP ports, the "extended header" that maybe should have been included in the IP layer) are less stable in some protocol usages we see.

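Concretely, the two flow definitions being contrasted are just two choices of key for such a table. A sketch with hypothetical packet fields (the dictionary keys are mine, not any particular parser's):

    def host_pair_key(pkt):
        # Coarse flow: all traffic between the same two hosts shares one entry.
        return (pkt["src_ip"], pkt["dst_ip"])

    def five_tuple_key(pkt):
        # Fine-grained flow: one entry per transport connection, roughly what
        # fq_codel hashes by default.
        return (pkt["src_ip"], pkt["dst_ip"], pkt["proto"],
                pkt.get("src_port"), pkt.get("dst_port"))
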
However, what's clear is that something like fq_codel can be created that does not need to accumulate long queues on links to have an effect - obviously queues still result from transient overload, but one wants to avoid requiring queueing to build up in order to balance traffic fairly.

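The piece of CoDel that makes this possible is that it keys its decision on how long packets have been sitting in the queue rather than on how many are queued. Below is a stripped-down sketch of that idea; the 5 ms target and 100 ms interval are CoDel's published defaults, but the control law is deliberately simplified (the real algorithm also spaces out successive drops), so treat it as an illustration rather than the actual fq_codel code.

    import time

    TARGET = 0.005    # 5 ms: tolerable standing queue delay
    INTERVAL = 0.100  # 100 ms: how long the delay must persist before acting

    class SojournMarker:
        """Signal congestion when queue *delay*, not queue length, stays high."""

        def __init__(self):
            self.first_above = None  # when sojourn time first exceeded TARGET

        def on_dequeue(self, enqueue_time, now=None):
            """Return True if this packet should be ECN-marked or dropped."""
            now = time.monotonic() if now is None else now
            sojourn = now - enqueue_time
            if sojourn < TARGET:
                self.first_above = None  # queue is draining fast enough
                return False
            if self.first_above is None:
                self.first_above = now   # tolerate a brief burst
                return False
            return (now - self.first_above) >= INTERVAL
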
Ideally, in a situation where some definition of fairness needs to be implemented among flows competing for a limited resource, the queueing should build up at the source endpoint as much as possible, and yet the point where the conflict occurs may be deep within the network of networks. However, once we start being able to balance loads across many independent paths (rather than assuming that there is exactly one path per flow), the definition of "fairness" becomes FAR more complicated, because the competing flows may not be between a particular source and destination pair - instead the traffic in each flow can be managed across different sets of multipaths.

So the end-to-end approach would suggest moving most of the scheduling back to the endpoints of each flow, with the role of the routers being to extract information about the competing flows that are congesting the network and to forward those signals (via drops or marking) to the endpoints. That's because, in the end-to-end argument as it applies here, the router cannot do the entire function of managing congestion or priority.

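The endpoint half of that division of labor can be sketched as nothing more than a rate controller driven entirely by the signals the network forwards - additive increase while the path looks clear, multiplicative decrease on a drop or an ECN mark. The constants below are illustrative, not a real transport implementation.

    class EndpointRateController:
        """Adjust a flow's sending rate from congestion signals (loss or ECN mark).

        The network only supplies the signal; the endpoint owns the decision of
        how much to send, which is the scheduling that has moved to the edge.
        """

        def __init__(self, initial_bps=1e6, increase_bps=1e5, decrease_factor=0.5):
            self.rate_bps = initial_bps
            self.increase_bps = increase_bps
            self.decrease_factor = decrease_factor

        def on_ack(self, ecn_marked=False, lost=False):
            if ecn_marked or lost:
                self.rate_bps *= self.decrease_factor  # back off: we overused our share
            else:
                self.rate_bps += self.increase_bps     # probe for more of the share
            return self.rate_bps
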
The case of the "uplink" from a residential network is a common, but not general, case. Once a residence has multiple uplinks (say, in a mesh neighborhood network, or just in a complex wireless situation where multiple access points are reachable), the idea of localizing priority or congestion management in one box below the IP layer no longer works well at all.

On Saturday, June 22, 2019 4:47pm, "Jonathan Morton" <chromatix99@gmail.com> said:

> > On 22 Jun, 2019, at 10:50 pm, David P. Reed <dpreed@deepplum.com> wrote:
> >
> > Pragmatic networks (those that operate in the real world) do not choose to
> > operate with shared links in a saturated state. That's known in the phone
> > business as the Mother's Day problem. You want to have enough capacity for
> > the rare near-overload to never result in congestion.
>
> This is most likely true for core networks. However, I know of several
> real-world networks and individual links which, in practice, are regularly
> in a saturated and/or congested state.
>
> Indeed, the average Internet consumer's ADSL or VDSL last-mile link becomes
> saturated for a noticeable interval, every time his operating system or game
> vendor releases an update. In my case, I share a 3G/4G tower's airtime with
> whatever variable number of subscribers to the same network happen to be in
> the area on any given day; today, during midsummer weekend, that number is
> considerably inflated compared to normal, and my available link bandwidth is
> substantially impacted as a result, indicating congestion.
>
> I did not see anything in your argument specifically about per-flow
> scheduling for the simple purpose of fairly sharing capacity between flows
> and/or between subscribers, and minimising the impact of elephants on mice.
> Empirical evidence suggests that it makes the network run more smoothly.
> Does anyone have a concrete refutation?
>
> - Jonathan Morton