From: Jonathan Morton
To: Michael Welzl
Cc: bloat Mainlinglist
Date: Sun, 24 Aug 2014 02:29:50 +0300
Subject: Re: [Bloat] sigcomm wifi
Message-Id: <8651E326-171F-472F-9456-920A9E43367D@gmail.com>
List-Id: General list for discussing Bufferbloat

I've done some reading on how wifi actually works, and what mechanisms the latest variants use to improve performance. It might be helpful to summarise my understanding here - biased towards the newer variants, since they are by now widely deployed.

First a note on the variants themselves:

802.11 without suffix is obsolete and no longer in use.
802.11a was the original 5GHz band version, giving 54Mbps in 20MHz channels.
802.11b was the first "affordable" version, using 2.4GHz and giving 11Mbps in 20MHz channels.
802.11g brought the 802.11a modulation schemes and (theoretical) performance to the 2.4GHz band.
802.11n is optionally dual-band. Aggregation, 40MHz channels, single-target MIMO.
802.11ac is 5GHz only. More aggregation, 80 & 160MHz channels, multi-target MIMO. Rationalised options, dropping many 'n' features that are more trouble than they're worth. Coexists nicely with older 20MHz-channel equipment, and nearby APs with overlapping spectrum.

My general impression is that 802.11ac makes a serious effort to improve matters in heavily-congested, many-clients scenarios, which was where earlier variants had the most trouble. If you're planning to set up or go to a major conference, the best easy thing you can do is get 'ac' equipment all round - if nothing else, it's guaranteed to support the 5GHz band. Of course, we're not just considering the easy solutions.

Now for some technical details:

The wireless spectrum is fundamentally a shared-access medium. It also has the complication of being noisy and having various path-loss mechanisms, and of the "hidden node" problem where one client might not be able to hear another client's transmission, even though both are in range of the AP. Thus wifi uses a CSMA/CA algorithm as follows:

1) Listen for competing carrier. If heard, back off and retry later. (Listening is continuous, and detected preambles are used to infer the time-length of packets when the data modulation is unreadable.)
2) Perform an RTS/CTS handshake. If CTS doesn't arrive, back off and retry later.
3) Transmit, and await acknowledgement. If no ack, back off and retry later, possibly using different modulation.

This can be compared to Ethernet's CSMA/CD algorithm:

1) Listen for competing carrier. If heard, back off and retry later.
2) Transmit, listening for collision with a competing transmission. If collision, back off and retry later.

In both cases, the backoff is random and exponentially increasing, to reduce the chance of repeated collisions.

The 2.4GHz band is chock-full of noise sources, from legacy 802.11b/g equipment to cordless phones, Bluetooth, and even microwave ovens - which generate the best part of a kilowatt of RF energy, but somehow manage to contain the vast majority of it within the cavity. It's also a relatively narrow band, with only three completely separate 20MHz channels available in most of the world (four in Japan).

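To make the CSMA/CA steps and the random, exponentially increasing backoff described above a bit more concrete, here is a minimal Python sketch. The function names, the window sizes (cw_min=15 and cw_max=1023 slots) and the retry limit are illustrative assumptions rather than anything stated above, and real hardware does all of this in firmware with precise timing.

```python
import random

def backoff_slots(attempt, cw_min=15, cw_max=1023, rng=random):
    """Pick a random wait (in slots) from a window that doubles per failed attempt.

    cw_min and cw_max are assumed, illustrative window limits.
    """
    window = min((cw_min + 1) * (2 ** attempt) - 1, cw_max)
    return rng.randint(0, window)

def try_send(channel_busy, cts_received, ack_received, max_attempts=7, rng=random):
    """Walk the three CSMA/CA steps; each argument is a callable returning a bool.

    Returns (delivered, attempts_used, total_backoff_slots).
    """
    total_slots = 0
    for attempt in range(max_attempts):
        # Step 1: carrier sense. Step 2: RTS/CTS. Step 3: transmit and wait for ack.
        # Failing any step means backing off and starting over.
        if channel_busy() or not cts_received() or not ack_received():
            total_slots += backoff_slots(attempt, rng=rng)
            continue
        return True, attempt + 1, total_slots
    return False, max_attempts, total_slots
```

Calling try_send with a channel that is always idle and a handshake that always succeeds delivers on the first attempt with no backoff; a channel that is always busy exhausts max_attempts, drawing from a larger window each time.
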
The crowding of the 2.4GHz band isn't a massive concern for home use, but consumers still notice the effects surprisingly often. Perhaps they live in an apartment block with lots of devices and APs crowded together in an unmanaged mess. Perhaps they have a large home to themselves, but a bunch of noisy equipment reduces the effective range and reliability of their network. It's not uncommon to hear about networks that drop out whenever the phone rings, thanks to an old cordless phone.

The 5GHz band is much less crowded. There are several channels which are shared with weather radar, so wifi equipment can't use those unless it is capable of detecting the radar transmissions, but even without those there are far more 20MHz channels available. There's also much less legacy equipment using it - even 802.11a is relatively uncommon (and is fairly benign in behaviour). The downside is that 5GHz doesn't propagate as far, or as easily through walls.

Wider bandwidth channels can be used to shorten the time taken for each transmission. However, this effect is not linear, because the RTS/CTS handshake and preamble are fixed overheads (since they must be transmitted at a low speed to ensure that all clients can hear them), taking the same length of time regardless of any other enhancements. This implies that in seriously geographically-congested scenarios, 20MHz channels (and lots of APs to use them all) are still the most efficient. MIMO can still be used to beneficial effect in these situations.

Multi-target MIMO allows an AP to transmit to several clients simultaneously, without requiring the clients to support MIMO themselves. This requires the AP's antennas and radios to be dynamically reconfigured for beamforming - giving each client a clear version of its own signal and a null for the other signals - which is a tricky procedure. APs that do implement this well are highly valuable in congested situations.

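As a rough illustration of the earlier point that wider channels shorten transmissions less than linearly, the sketch below models airtime as a fixed overhead (handshake plus preamble) plus the payload time at the data rate. All numbers are hypothetical, and it assumes the data rate scales in proportion to channel width, which is only approximately true.

```python
def airtime_us(payload_bytes, data_rate_mbps, fixed_overhead_us):
    """Airtime in microseconds: fixed overhead plus payload bits at the data rate.

    Bits divided by Mbit/s gives microseconds directly.
    """
    return fixed_overhead_us + payload_bytes * 8 / data_rate_mbps

def speedup(payload_bytes, base_rate_mbps, width_factor, fixed_overhead_us):
    """How much faster one transmission gets when the channel is width_factor times wider."""
    narrow = airtime_us(payload_bytes, base_rate_mbps, fixed_overhead_us)
    wide = airtime_us(payload_bytes, base_rate_mbps * width_factor, fixed_overhead_us)
    return narrow / wide

# Hypothetical figures: 1500-byte packet, 50 Mbps in 20MHz, 100us fixed overhead.
print(speedup(1500, 50, 4, 100))  # 80MHz instead of 20MHz: 2.125x, not 4x
print(speedup(1500, 50, 4, 0))    # with no fixed overhead the gain would be the full 4x
```

With these made-up figures, quadrupling the channel width only roughly doubles the per-packet speed, while occupying four times the spectrum. That is the arithmetic behind preferring many 20MHz channels where the spectrum is congested.
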
Single-target MIMO allows higher bandwidth between one client at a time and the AP. Both the AP and the client must support MIMO for this to work. There are physical constraints which limit the ability of handheld devices to support MIMO. In general, this form of MIMO improves throughput in the home, but is not very useful in congested situations. High individual throughput is not what's needed in a crowded arena; rather, reliable if slow individual throughput, reasonable latency, and high aggregate throughput.

Choosing the most effective radio bandwidth and modulation is a difficult problem. The Minstrel algorithm seems to be an effective solution for general traffic. Some manual constraints may be appropriate in some circumstances, such as reducing the maximum radio bandwidth (trading throughput of one AP against coexistence with other APs) and increasing the modulation rate of management broadcasts (reducing per-packet overhead).

Packet aggregation allows several IP packets to be combined into a single wireless transmission. This avoids performing the CSMA/CA steps repeatedly, which is a considerable overhead. There are several types of packet aggregation - the type adopted by 802.11ac allows individual IP packets within a transmission to be link-layer acknowledged separately, so that a minor corruption doesn't require retransmission of the entire aggregate. By contrast, 802.11n also supported a version which did require that, in exchange for a slightly lower overhead.

Implicit in the packet-aggregation system is the problem of collecting packets to aggregate. Each transmission is between the AP and one client, so the packets aggregated by the AP all have to be for the same client. (The client can assume that all packets go to the AP.) A fair-queueing algorithm could have the effect of forming per-client queues, so several suitable packets could easily be located in such a queue.
In a straight FIFO queue, however, packets for the same client are likely to be separated in the queue and thus difficult to find. It is therefore *obviously* in the AP's interest to implement a fair-queueing algorithm based on client MAC address, even if it does nothing else to manage congestion.

NB: if a single aggregate could be intended to be heard by more than one client, then the complexity of multi-target beamforming MIMO would not be necessary. This is how I infer the strict one-to-one nature of data transmissions, as distinct from management broadcasts.

On 23 Aug, 2014, at 10:26 pm, Michael Welzl wrote:

>>> because of the "function" i wrote above: the more you retry, the more you need to buffer when traffic continuously arrives because you're stuck trying to send a frame again.
>>
>> huh, I'm missing something here, retrying sends would require you to buffer more when sending.
>
> aren't you saying the same thing as I ? Sorry else, I might have expressed it confusingly somehow

There should be enough buffering to allow effective aggregation, but as little as possible on top of that. I don't know how much aggregation can be done, but I assume that there is a limit, and that it's not especially high in terms of full-length packets. After all, tying up the channel for long periods of time is unfair to other clients - a typical latency/throughput tradeoff.

Equally clearly, in a heavily congested scenario the AP benefits from having a lot of buffer divided among a large number of clients, but each client should have only a small buffer.

>> If people are retrying when they really don't need to, that cuts down on the available airtime.
>
> Yes

Given that TCP retries on loss, and UDP protocols are generally loss-tolerant to a degree, there should therefore be a limit on how hard the link-layer stuff tries to get each individual packet through.
Minstrel appears to be designed around a time limit for that sort of thing, which seems sane - and they explicitly talk about TCP retransmit timers in that context.

With that said, link-layer retries are a valid mechanism to minimise unnecessarily lost packets. It's also not new - bus/hub Ethernet does this on collision detection. What Ethernet doesn't have is the link-layer ack, so there's an additional set of reasons why a backoff-and-retry might happen in wifi.

Modern wifi variants use packet aggregation to improve efficiency. This only works when there are multiple packets to send at a time from one place to a specific other place - which is more likely when the link is congested. In the event of a retry, it makes sense to aggregate newly buffered packets with the original ones, to reduce the number of negotiation and retry cycles.

>> But if you have continual transmissions taking place, so you have a hard time getting a chance to send your traffic, then you really do have congestion and should be dropping packets to let the sender know that it shouldn't try to generate as much.
>
> Yes; but the complexity that I was pointing at (but maybe it's a simple parameter, more like a 0 or 1 situation in practice?) lies in the word "continual". How long do you try before you decide that the sending TCP should really think it *is* congestion? To really optimize the behavior, that would have to depend on the RTT, which you can't easily know.

There are TCP congestion algorithms which explicitly address this (eg. Westwood+), by reacting only a little to individual drops, but reacting more rapidly if drops occur frequently. In principle they should also react quickly to ECN, because that is never triggered by random noise loss alone.

 - Jonathan Morton
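
As an illustration of the per-client queueing argument made earlier (queueing by client MAC address makes aggregation candidates trivial to find), here is a minimal Python sketch. The class name, the round-robin service order and the max_aggregate limit are assumptions made for the example; they are not taken from any real AP firmware or driver.

```python
from collections import OrderedDict, deque

class PerClientQueues:
    """Hypothetical AP-side queueing: one FIFO per client MAC, served round-robin."""

    def __init__(self, max_aggregate=4):
        self.queues = OrderedDict()         # MAC -> deque of packets, in service order
        self.max_aggregate = max_aggregate  # assumed cap on packets per transmission

    def enqueue(self, mac, packet):
        self.queues.setdefault(mac, deque()).append(packet)

    def next_aggregate(self):
        """Return (mac, packets) for the next client in rotation, or None if idle."""
        if not self.queues:
            return None
        mac, queue = self.queues.popitem(last=False)
        count = min(self.max_aggregate, len(queue))
        batch = [queue.popleft() for _ in range(count)]
        if queue:
            # Still backlogged: this client goes to the back of the rotation.
            self.queues[mac] = queue
        return mac, batch
```

Every packet in a returned batch is for the same client by construction, so no searching through a shared FIFO is needed, and capping the batch size keeps one client from tying up the channel for long.
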