From: G B
To: Benjamin Cronce
Cc: bloat
Date: Sun, 21 Jun 2015 10:53:47 -0700
Subject: Re: [Bloat] TCP congestion detection - random thoughts

There should also be a way to track the "ack backlog". By that I mean, if you can see that the packets being ACKed were sent 10 seconds ago, and consistently so, you should be able to determine that you are likely (10 seconds - real RTT - processing delay) deep in buffers somewhere. If you back off on the number of packets in flight and that ack backlog doesn't change much, then the congestion is probably not related to your specific flow; it is likely due to aggregate congestion somewhere in the path. It could be a congested peering point, a PoP, a busy distant end, whatever. But if backing off DOES significantly reduce the ack backlog (ACKs now arrive for packets sent only 5 seconds ago rather than 10), then you have a notion that the flow is a significant contributor to the total backlog. Exactly what one would do with that information is the question, I guess.

Is the backlog consistent across all flows or just one? If it is consistent across all flows, then the source of buffering is very close to you. If it is wildly different, it is likely somewhere in the path of that particular flow. And looking at the document linked concerning CDG, I see they take that into account: if I back off but the RTT doesn't decrease, then my flow is not a significant contributor to the delay.
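A rough, illustrative sketch of that back-off test (Python; the class, the method names, and the 25% threshold are all assumptions for illustration, not anything from an existing stack):

# Illustrative sketch of the "ack backlog" test described above; all
# names and the 25% threshold are hypothetical, not from any real stack.

class AckBacklogProbe:
    def __init__(self):
        self.min_rtt = float("inf")   # best (least-queued) RTT seen so far
        self.send_times = {}          # seq -> time the segment was sent

    def on_send(self, seq, now):
        self.send_times[seq] = now

    def on_ack(self, seq, now):
        """Return the estimated queueing backlog carried by this ACK."""
        sent = self.send_times.pop(seq, None)
        if sent is None:
            return None
        rtt = now - sent
        self.min_rtt = min(self.min_rtt, rtt)
        # Backlog ~= how much of the measured RTT is sitting in buffers:
        # the measured RTT minus the "real" (minimum observed) RTT.
        return rtt - self.min_rtt


def flow_contributes(backlog_before, backlog_after, threshold=0.25):
    """After backing off the packets in flight, did the backlog shrink?

    If it barely moves, the queueing is mostly aggregate congestion
    elsewhere on the path; if it shrinks noticeably, this flow is a
    significant contributor to the total backlog.
    """
    if backlog_before <= 0:
        return False
    return (backlog_before - backlog_after) / backlog_before > threshold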
The problem with the algorithm, to my mind, is that finding the size of "the queue" for any particular flow is practically impossible, because each flow will have its own specific amount of buffering along the path. You also get into things like asymmetric routing, where the reply path might not be the same as the send path (a multihomed transit provider, or an end node sending reply traffic over a different peer than the one the traffic in the other direction arrives on), or (worse) ECMP being done across peers on a packet-by-packet rather than flow-based basis. At that point it is impossible to really profile the path.

So if I were designing such an algorithm, I would try to determine: Is the delay consistent across all flows? Is the delay consistent even within a single flow? When I reduce my rate, does the backlog drop? Exactly what I would do with that information would require more thought.

On Sun, Jun 21, 2015 at 9:19 AM, Benjamin Cronce wrote:

> Just a random Sunday morning thought that has probably already been
> thought of before, but I can't recall hearing it before.
>
> My understanding of most TCP congestion control algorithms is that they
> primarily watch for drops, but drops are signaled by the receiving party
> via ACKs. The issue with this is that TCP keeps pushing more data into
> the window until a drop is signaled, even if the received rate does not
> increase. What if the sending TCP also monitored the rate at which data
> is received and backed off cramming more segments into the window if the
> received rate did not increase?
>
> Two things are needed to measure this: the RTT, which is part of TCP
> statistics already, and the rate at which bytes are ACKed. If you double
> the number of segments being sent but, within a time frame relative to
> the RTT, do not see a meaningful increase in the rate at which bytes are
> being ACKed, you may want to back off.
>
> It just seems to me that if you have a 50 ms RTT and 10 seconds of
> bufferbloat, TCP is cramming data down the path without a care in the
> world about how quickly data is actually getting ACKed; it's just
> waiting for the first segment to get dropped, which would never happen
> in an infinitely buffered network.
>
> TCP should be able to keep state that tracks the minimum RTT and the
> maximum ACK rate. Between these two, it should not be able to go over
> the max path rate except when probing for a new max or min. Min RTT is
> probably a good target because path latency should be relatively static;
> however, path free bandwidth is not static. The desirable number of
> segments in flight would need to change but would be bounded by the max.
>
> Of course, Nagle-type algorithms can mess with this, because when ACKs
> occur is no longer based entirely on when a segment is received, but
> also on some additional amount of time. If you assume that Nagle will
> coalesce N segments into a single ACK, then you need to add to the RTT
> the amount of time, at the current packet rate, until you expect another
> ACK, assuming N segments will be coalesced. This would be even more
> important for low-latency, low-bandwidth paths. Coalescing information
> could be assumed, negotiated, or inferred. Negotiated would be best.
>
> Anyway, just some random Sunday thoughts.
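For concreteness, here is a minimal sketch of what the quoted min-RTT / ACK-rate idea might look like (Python; the one-RTT measurement interval, the 5% margin, and all names are illustrative assumptions, not part of any existing TCP implementation):

# Minimal sketch of the quoted idea: only keep growing the window while
# the rate at which bytes are ACKed keeps growing with it. The one-RTT
# measurement interval and the 5% margin are illustrative assumptions.

class AckRateProbe:
    def __init__(self, srtt):
        self.srtt = srtt              # smoothed RTT estimate (seconds)
        self.min_rtt = srtt           # lowest RTT seen: the path latency
        self.max_ack_rate = 0.0       # highest bytes-ACKed-per-second seen
        self.acked_bytes = 0
        self.window_start = None

    def on_ack(self, nbytes, now):
        """Accumulate ACKed bytes; return a rate sample once per ~RTT."""
        if self.window_start is None:
            self.window_start = now
        self.acked_bytes += nbytes
        elapsed = now - self.window_start
        if elapsed < self.srtt:        # measure over roughly one RTT
            return None
        rate = self.acked_bytes / elapsed
        self.acked_bytes = 0
        self.window_start = now
        self.max_ack_rate = max(self.max_ack_rate, rate)
        return rate


def should_back_off(rate_before_growth, rate_after_growth, margin=1.05):
    """If adding segments in flight did not raise the ACK rate by a
    meaningful margin, the extra segments are only building queue."""
    return rate_after_growth < rate_before_growth * margin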