* [Bloat] http/2
  From: Kartik Agaram @ 2015-03-06 21:38 UTC
  To: bloat; +Cc: Jordan Peacock

Has HTTP/2[1] been discussed on this list?[2] I've been thinking about bufferbloat as I read the spec, and had a couple of questions that weren't answered in the FAQ[3]:

1. HTTP/2 reduces the number of connections per webpage. Assume for a second that all players instantaneously adopt HTTP/2 and so reduce their buffer sizes everywhere. Latencies will improve and there'll be less congestion. Now back to the real world with people building websites, trying to improve performance of websites and devices all over the place. Will bufferbloat stay eradicated, or will the gains be temporary?

2. More generally, is there any technical way for bufferbloat to stay solved? Or is it an inevitable tragedy of the commons dynamic that we just have to live with and make temporary dents in?

3. Has there been discussion of solving bufferbloat at the TCP layer, by making buffers harder to fill up? I'm thinking of heuristics like disallowing a single site from using 80% of the buffer, thereby leaving some slack available for other bursty requirements.

I'm sure these questions are quite naive. Pointers to further reading greatly appreciated.

Kartik
http://akkartik.name/about

[1] https://insouciant.org/tech/http-slash-2-considerations-and-tradeoffs
[2] Google search on "site:https://lists.bufferbloat.net" didn't turn up anything, and I get "permission denied" when trying to access the downloadable archives at https://lists.bufferbloat.net/pipermail/bloat.
[3] https://gettys.wordpress.com/bufferbloat-faq
* Re: [Bloat] http/2
  From: Jonathan Morton @ 2015-03-12 15:02 UTC
  To: Kartik Agaram; +Cc: Jordan Peacock, bloat

I think you may be conflating several different buffers which exist in different places and are controlled by different means. I'll try to illustrate this using a single scenario with which I'm personally familiar: a half-megabit 3G connection without AQM.

Status quo is that loading a web page with many resources on it is unreliable. Early connections succeed and become established, the congestion window opens, the buffer in the 3G tower begins to fill up, inducing several seconds of latency, and subsequent DNS lookups and TCP handshakes tend to time out. End result: often, half the images on the page are broken.

Status quo is also that a single big, continuous download (such as a software update) is capable of inducing 45 seconds of latency on the same connection, making it virtually impossible to do anything else with it concurrently. This corresponds to several megabytes of dumb buffering in the tower AND several megabytes of TCP receive window AND several megabytes of TCP congestion window. Lose any one of those three things and the induced latency disappears. But it's there, with a single connection.

As far as bufferbloat is concerned, HTTP/2 just converts the first situation into the second one. If images and other resources are loaded from the same server as the base page, as they should be, then they'll load more reliably. But any resource loaded externally (even just sharded off) will become less reliable, if anything, in the presence of bufferbloat, because a separate connection still has to be made per host server.

If the queue in the tower were less dumb, then TCP would be given congestion signals as the queue began to fill up. In that situation, HTTP/2 helps, because there are fewer connections that need to receive that signal for it to be effective.

- Jonathan Morton
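A quick back-of-the-envelope check of those figures, as a rough sketch (the half-megabit rate is the nominal link speed above; the queue sizes are round numbers, not measurements):

    # Induced latency is just the standing queue divided by the drain rate.
    LINK_RATE_BPS = 0.5e6  # nominal half-megabit 3G link

    def induced_latency_s(queue_bytes):
        """Delay added by a standing queue that drains at LINK_RATE_BPS."""
        return queue_bytes * 8 / LINK_RATE_BPS

    def queue_for_latency_bytes(latency_s):
        """Queue size needed to produce a given induced latency."""
        return latency_s * LINK_RATE_BPS / 8

    print(queue_for_latency_bytes(45) / 1e6)  # ~2.8 MB, i.e. "several megabytes"
    print(induced_latency_s(3e6))             # a 3 MB standing queue adds ~48 s

So 45 seconds of induced latency and "several megabytes" of buffering are mutually consistent at that link rate.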
* Re: [Bloat] http/2
  From: Narseo Vallina Rodriguez @ 2015-03-12 18:18 UTC
  To: Jonathan Morton; +Cc: Jordan Peacock, Kartik Agaram, bloat

Hi Jonathan

> Status quo is that loading a web page with many resources on it is
> unreliable. Early connections succeed and become established, the congestion
> window opens, the buffer in the 3G tower begins to fill up, inducing several
> seconds of latency, and subsequent DNS lookups and TCP handshakes tend to
> time out. End result: often, half the images on the page are broken.

The way you're describing this specific part, it sounds to me more like a control-plane latency issue (i.e., the time for the RNC to allocate a radio channel to the client by promoting it from IDLE/FACH to DCH) than a buffer-size issue (the buffers are actually introduced, both on the handset and at the RNC/eNB, to deal with that C-plane latency):

https://www.qualcomm.com/media/documents/files/qualcomm-research-latency-in-hspa-data-networks.pdf
* Re: [Bloat] http/2
  From: Jonathan Morton @ 2015-03-12 18:39 UTC
  To: Narseo Vallina Rodriguez; +Cc: Kartik Agaram, Jordan Peacock, bloat

On 12 Mar 2015 20:18, "Narseo Vallina Rodriguez" <narseo@icsi.berkeley.edu> wrote:
>
> Hi Jonathan
>
> > Status quo is that loading a web page with many resources on it is
> > unreliable. Early connections succeed and become established, the congestion
> > window opens, the buffer in the 3G tower begins to fill up, inducing several
> > seconds of latency, and subsequent DNS lookups and TCP handshakes tend to
> > time out. End result: often, half the images on the page are broken.
>
> The way you're describing this specific part, it sounds to me more like a
> control-plane latency issue (i.e., the time for the RNC to allocate a
> radio channel to the client by promoting it from IDLE/FACH to DCH)
> than a buffer-size issue (the buffers are actually introduced, both on
> the handset and at the RNC/eNB, to deal with that C-plane latency)
>
> https://www.qualcomm.com/media/documents/files/qualcomm-research-latency-in-hspa-data-networks.pdf

No, that's backwards. The first connection is the most reliable, because the link isn't loaded yet, and trying to make later connections times out because the buffers are full from the first ones, still in progress. If C-plane latency was the problem, the symptoms would be reversed - unless the system is inexplicably reverting to the idle state between packets in a continuous stream, and I refuse to believe it's that dumb without firm evidence.

Unloaded latency on this link is on the order of 100ms.

- Jonathan Morton
* Re: [Bloat] http/2
  From: Narseo Vallina Rodriguez @ 2015-03-12 18:56 UTC
  To: Jonathan Morton; +Cc: Kartik Agaram, Jordan Peacock, bloat

> > > Status quo is that loading a web page with many resources on it is
> > > unreliable. Early connections succeed and become established, the congestion
> > > window opens, the buffer in the 3G tower begins to fill up, inducing several
> > > seconds of latency, and subsequent DNS lookups and TCP handshakes tend to
> > > time out. End result: often, half the images on the page are broken.
> >
> > The way you're describing this specific part, it sounds to me more like a
> > control-plane latency issue (i.e., the time for the RNC to allocate a
> > radio channel to the client by promoting it from IDLE/FACH to DCH)
> > than a buffer-size issue (the buffers are actually introduced, both on
> > the handset and at the RNC/eNB, to deal with that C-plane latency)
> >
> > https://www.qualcomm.com/media/documents/files/qualcomm-research-latency-in-hspa-data-networks.pdf
>
> No, that's backwards. The first connection is the most reliable, because the
> link isn't loaded yet, and trying to make later connections times out
> because the buffers are full from the first ones, still in progress. If
> C-plane latency was the problem, the symptoms would be reversed - unless the
> system is inexplicably reverting to the idle state between packets in a
> continuous stream, and I refuse to believe it's that dumb without firm
> evidence.
>
> Unloaded latency on this link is on the order of 100ms.

It depends, and I'm not sure we're now on the same page :). Control-plane latency can affect more than you think, and the control-plane dynamics can be very complex, including promotions and demotions between UMTS channels and HS(D/U)PA(+) channels, which also increase user-plane latency. The latter case matters more during long flows, as a result of fairness policies implemented by the RNC: the number of HSPA channels is limited (each HSPA category has a defined number of channels using TDM).

The most common demotion (or inactivity) timeout from DCH to FACH/IDLE is 6 seconds in most mobile operators, and it is triggered even if a TCP connection is kept alive but no packet was transmitted during that interval. The timeout can be lower for some operators with more aggressive configurations, larger for others that are more conservative (at the expense of draining the phone's battery), or even 0 s for operators and handsets supporting "Fast Dormancy". If the handset is demoted, then the next packet will suffer the control-plane latency again, which is on the order of 1 to 2 seconds depending on signaling congestion at the RNC, SNR, and the 3GPP standard. There's a lot of evidence of these dynamics.
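As a toy model of the state-machine effect described above (the 6-second demotion timer and the 1-2 second promotion delay are the figures from this message; everything else, including the function itself, is purely illustrative):

    # Extra delay seen by the first packet sent after an idle gap, given a
    # DCH inactivity timer and a C-plane promotion delay.
    DCH_INACTIVITY_TIMEOUT_S = 6.0  # typical demotion timer cited above
    PROMOTION_DELAY_S = 1.5         # IDLE/FACH -> DCH promotion, roughly 1-2 s

    def extra_latency_after_idle(idle_gap_s):
        """C-plane delay added to the first packet after idle_gap_s of silence."""
        if idle_gap_s >= DCH_INACTIVITY_TIMEOUT_S:
            return PROMOTION_DELAY_S  # channel was released; must be re-promoted
        return 0.0                    # still on DCH; no promotion needed

    print(extra_latency_after_idle(2.0))   # 0.0 - the stream never went idle
    print(extra_latency_after_idle(10.0))  # 1.5 - paid only after a long idle gap

In this simplified model a continuously flowing transfer never pays the promotion cost; only traffic that resumes after an idle gap does.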
* Re: [Bloat] http/2
  From: Jonathan Morton @ 2015-03-12 19:07 UTC
  To: Narseo Vallina Rodriguez; +Cc: Kartik Agaram, Jordan Peacock, bloat

Tell me: does a Nokia E70 support fast dormancy? And how does a setup latency of less than 2 seconds translate into 45 seconds of latency under continuous load, with continuous, smooth packet delivery? Which evaporates to half a second if I clamp the TCP receive window down to a sane value?

Those are the facts I established years ago. They're still true today with a newer handset/dongle tethered.

- Jonathan Morton
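For reference, "clamping the TCP receive window" can be done per application roughly like this (a minimal sketch assuming a Linux host; the 32 KB figure is illustrative, not the value Jonathan used):

    import socket

    # Cap the socket's receive buffer so the advertised TCP receive window
    # stays small; on Linux the kernel uses (roughly double) this value as
    # the ceiling for the window it advertises. Set it before connect().
    RCV_BUF_BYTES = 32 * 1024  # illustrative value only

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, RCV_BUF_BYTES)
    s.connect(("example.com", 80))
    print("effective receive buffer:",
          s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))

System-wide, a similar effect can be approximated on Linux by lowering the maximum value in the net.ipv4.tcp_rmem sysctl, though that affects every connection on the host.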
* Re: [Bloat] http/2
  From: Narseo Vallina Rodriguez @ 2015-03-12 19:28 UTC
  To: Jonathan Morton; +Cc: Kartik Agaram, Jordan Peacock, bloat

> Tell me: does a Nokia E70 support fast dormancy? And how does a setup latency
> of less than 2 seconds translate into 45 seconds of latency under continuous
> load, with continuous, smooth packet delivery? Which evaporates to half a
> second if I clamp the TCP receive window down to a sane value?

Well, I'm not saying that you're not right. I'm just saying that there are many more dynamics at the control plane that you cannot control, as they are transparent to the handset. In the particular case you're describing, it's very likely a buffer issue, but it could also be a network issue, as you could be connected through an old APN (gateway).

The E70 does not have fast dormancy, and it does not support HSPA. It's a basic UMTS phone from 2005, which is not very representative today. The latency you're describing could be caused by signaling overload or poor radio conditions, but also, very likely, by buffers as you pointed out. Are there any other buffers on the RIL interface of the handset? How do they behave? I know some operators use a leaky bucket.

In any case, I was referring to a general scenario, and the way you framed it initially - "first packet", DNS, etc. - sounded like a control-plane case. That's why I said in the previous email that I was not sure we're on the same page.
* Re: [Bloat] http/2
  From: Jonathan Morton @ 2015-03-12 19:42 UTC
  To: Narseo Vallina Rodriguez; +Cc: Kartik Agaram, Jordan Peacock, bloat

I have sometimes noticed that the first packet after an idle period gets additional latency on 3G. It's not much, maybe a quarter second. I'm not worried about that, and it doesn't really cause any problems for me. I expect that's the C-plane latency you're on about.

I was talking specifically about what happens when traffic is already flowing - NOT idle - and then an additional, concurrent flow wants to start up. If it goes to a different server, it's likely to start with a DNS lookup, and resolvers tend to have remarkably short timeouts these days. A few seconds of induced latency, due to plain old bufferbloat, is enough to make it fail.

- Jonathan Morton
* Re: [Bloat] http/2
  From: Mikael Abrahamsson @ 2015-03-15 7:23 UTC
  To: Narseo Vallina Rodriguez; +Cc: bloat

On Thu, 12 Mar 2015, Narseo Vallina Rodriguez wrote:

> Control-plane latency can affect more than you think, and the
> control-plane dynamics can be very complex, including promotions and
> demotions between UMTS channels and HS(D/U)PA(+) channels, which also
> increase user-plane latency. The latter case matters more during long
> flows, as a result of fairness policies implemented by the RNC: the
> number of HSPA channels is limited (each HSPA category has a defined
> number of channels using TDM).

Ok, I understand you're trying to get this right, but I don't see this as the most probable explanation for the use case described. Most of the time in this use case, the HSPA channels get properly established after approximately one second, and they stay up until the transfer is done.

One RNC vendor I know fairly well would have 400 packets of buffering in the GGSN->RNC->eNodeB->handset direction. I don't know about the others. With half a megabit/s of buffer drain, that means a maximum of 10 seconds of buffering, if my calculations are correct. There can potentially be buffering in the GGSN/SGSN as well.

This is if everything is working perfectly. If there are other problems, the drain rate might be slower than half a megabit/s, and that might induce further latency.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se
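Checking that arithmetic, assuming full-size 1500-byte packets (the message doesn't state a packet size):

    PACKETS = 400                 # downstream buffering in the RNC path
    PACKET_BYTES = 1500           # assumed MTU-sized packets
    DRAIN_RATE_BPS = 0.5e6        # half a megabit per second

    queue_bits = PACKETS * PACKET_BYTES * 8        # 4.8 Mbit of queued data
    max_buffering_s = queue_bits / DRAIN_RATE_BPS  # ~9.6 s, i.e. "max 10 seconds"
    print(max_buffering_s)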
* Re: [Bloat] http/2
  From: Rich Brown @ 2015-03-12 18:05 UTC
  To: Kartik Agaram; +Cc: Jordan Peacock, bloat

Hi Kartik,

Thanks for the questions.

> Has HTTP/2[1] been discussed on this list?[2] I've been thinking about
> bufferbloat as I read the spec, and had a couple of questions that weren't
> answered in the FAQ[3]:
>
> 1. HTTP/2 reduces the number of connections per webpage. Assume for a second
> that all players instantaneously adopt HTTP/2 and so reduce their buffer
> sizes everywhere. Latencies will improve and there'll be less congestion. Now
> back to the real world with people building websites, trying to improve
> performance of websites and devices all over the place. Will bufferbloat stay
> eradicated, or will the gains be temporary?

A bufferbloat algorithm (fq_codel, or another SQM - smart queue management - scheme) is required to minimize the queue that builds up at *any* bottleneck in a network. This occurs frequently at the home router at the edge of the network, but can appear anywhere. Wherever a queue begins to build up in a network, optimal performance demands some kind of SQM.

HTTP/2 may well help by opening fewer connections, but the SQM in the router will still be in effect. If the smaller number of HTTP/2 requests from the browser doesn't create a queue, then SQM won't even become active. But if the browser traffic does manage to generate a queue, the router's SQM will keep it under control.

I want to emphasize that point: SQM doesn't force any fixed allocation of bandwidth, packet rate, etc. It actually measures the queue length (in msec) for each traffic flow. If all the packets are whistling through without any congestion, every sender gets the full rate of the link. SQM only becomes active when there *is* congestion, and it throttles those flows that are sending the most traffic, to preserve the link capacity for the "little flows" that are time-sensitive.

> 2. More generally, is there any technical way for bufferbloat to stay solved?
> Or is it an inevitable tragedy of the commons dynamic that we just have to
> live with and make temporary dents in?

Yes, it will stay solved. No, there's no tragedy of the commons. (Great question, though.) The SQM algorithm only examines packets within a single router, so multiple routers are essentially independent. There's no central communication required - it's all local to a router.

In fact, the "tragedy" of solving bufferbloat is that it needs to be solved *everywhere*. That is to say that *every* router, cell phone, DSLAM, cable modem (home and head-end), personal computer OS, and other piece of equipment on the planet needs to be updated. This is the hard part.

> 3. Has there been discussion of solving bufferbloat at the TCP layer, by
> making buffers harder to fill up? I'm thinking of heuristics like disallowing
> a single site from using 80% of the buffer, thereby leaving some slack
> available for other bursty requirements.

I am personally not hopeful for this kind of approach:

a) The TCP algorithm in hosts isn't easily made aware of congestion elsewhere in the network, so it can't react to that congestion;

b) there aren't a lot of tested proposals (beyond dropping packets) to make things better;

c) it suffers from exactly the same problem as solving bufferbloat elsewhere - it needs to be rolled out in every piece of gear. (We can't even attract the attention of vendors - Apple, Microsoft, most routing gear, etc. - to implement the already-solved algorithms that improve bufferbloat. Sigh.)

> I'm sure these questions are quite naive. Pointers to further reading greatly
> appreciated.
>
> Kartik
> http://akkartik.name/about
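A toy sketch of the behaviour Rich describes - reacting to how long packets have actually waited in each flow's queue, rather than enforcing fixed shares. This is a deliberately simplified illustration, not the real fq_codel algorithm (which adds an interval-based drop schedule and deficit round-robin between flows):

    import time
    from collections import defaultdict, deque

    TARGET_SOJOURN_S = 0.005     # 5 ms target queueing delay (illustrative)

    queues = defaultdict(deque)  # one FIFO per flow, keyed e.g. by 5-tuple hash

    def enqueue(flow_id, packet):
        queues[flow_id].append((time.monotonic(), packet))

    def dequeue(flow_id):
        """Return the next packet for this flow, or None if there is nothing
        to send or the packet is dropped.

        Only a flow whose own queue has built up sees drops (or ECN marks);
        flows whose packets are forwarded promptly are left alone, so no
        fixed bandwidth allocation is ever imposed.
        """
        if not queues[flow_id]:
            return None
        enqueued_at, packet = queues[flow_id].popleft()
        sojourn = time.monotonic() - enqueued_at
        if sojourn > TARGET_SOJOURN_S:
            return None  # drop: signals the heaviest senders to slow down
        return packet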
* Re: [Bloat] http/2
  From: David Lang @ 2015-03-15 7:13 UTC
  To: Kartik Agaram; +Cc: Jordan Peacock, bloat

On Fri, 6 Mar 2015, Kartik Agaram wrote:

> 3. Has there been discussion of solving bufferbloat at the TCP layer, by
> making buffers harder to fill up? I'm thinking of heuristics like
> disallowing a single site from using 80% of the buffer, thereby leaving
> some slack available for other bursty requirements.

There are already multiple solutions at the TCP layer.

1. TCP slows down if it gets ECN responses, so it won't fill up buffers.
   (Not everything implements ECN, and some firewall-type devices strip out
   or lie about ECN data.)

2. TCP slows down if packets get dropped. Not all dropping needs to wait
   until the buffers are full.

   2a. fq_codel does exactly this: it drops packets when there is
       congestion, before the buffers fill up.

The problem is just getting these changes into new equipment/software and then replacing the equipment/software in the field.

David Lang
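Both of the signals David lists trigger the same sender-side reaction, roughly sketched below (a simplification; real TCP stacks implement this in the kernel with many refinements such as slow start and loss recovery):

    # Additive-increase / multiplicative-decrease: the sender backs off on any
    # congestion signal (a drop or an ECN echo) and probes gently otherwise.
    cwnd = 10.0  # congestion window in segments, illustrative starting value

    def on_ack(ecn_echo, loss_detected):
        global cwnd
        if loss_detected or ecn_echo:
            cwnd = max(cwnd / 2.0, 1.0)  # back off: stop filling the buffer
        else:
            cwnd += 1.0 / cwnd           # congestion avoidance: probe slowly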