From: Michael Welzl
To: David Lang
Cc: bloat
Subject: Re: [Bloat] sigcomm wifi
Date: Mon, 25 Aug 2014 10:01:21 +0200
Message-Id: <8C18A920-D0A8-4052-85EC-FF6D6039FD53@ifi.uio.no>

>> Yep... I remember a neat paper from colleagues at Trento University that piggybacked TCP's ACKs on link layer ACKs, thereby avoiding the collisions between TCP's ACKs and other data packets - really nice. Not sure if it wasn't just simulations, though.
>
> that's a neat hack, but I don't see it working, except when one end of the wireless link is also the endpoint of the TCP connection (and then only for ACKs from that device)
>
> so in a typical wifi environment, it would be one less transmission from the laptop, no change to the AP.
>
> But even with that, doesn't TCP try to piggyback the ACK on the next packet of data anyway? So unless it's a purely one-way dataflow, this still wouldn't help.

Yes, but of course many dataflows are indeed one-way - HTTP typically sends a GET/PUT and not much else.

>>> but can the firmware really tell the difference between quality degradation due to interference and collisions with other transmitters?
>>
>> Well, with heuristics it can, sort of. As a simple example from one older mechanism, consider: multiple consecutive losses are *less* likely to come from random collisions than from link noise. That sort of thing. Minstrel worked best in our tests, using tables of rates that worked well / didn't work well in the past:
>> http://heim.ifi.uio.no/michawe/research/publications/wowmom2012.pdf
>
> the question is whether this is deployed in any commodity OS stacks. If not, it could only help on the AP, and we are better off just locking the speeds there.

I thought it's widely deployed but I really don't know - I'm sure others on this list do? What I do know is that it was (is) a part of madwifi.
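(As an aside, here is a toy illustration of the kind of heuristic I mean - purely illustrative Python, not what Minstrel or any real driver actually does, and the threshold is made up:

    def classify_loss(recent_acks, run_threshold=3):
        """Guess why the latest transmission failed.

        recent_acks: list of booleans, newest last; True means the
        frame was ACKed.  A long run of consecutive losses points to
        link noise, an isolated loss points to a collision.
        """
        run = 0
        for acked in reversed(recent_acks):
            if acked:
                break
            run += 1
        return "noise" if run >= run_threshold else "collision"

    # A rate controller could then lower the PHY rate only for suspected
    # noise, and simply retry at the same rate for suspected collisions.

The real mechanisms are of course statistical rather than a single threshold.)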
>>>>> retries of packets that the OS has given up on (including when the user has closed the app that sent them)
>>>>>
>>>>> Ideally we want the wifi layer to be just like the wired layer, buffer only what's needed to get it on the air without 'dead air' (where the driver is waiting for the OS to give it more data); at that point, we can do the retries from the OS as appropriate.
>>>>>
>>>>>> I have two questions: 1) is my characterization roughly correct? 2) have people investigated the downsides (negative effect on TCP) of buffering *too little* in wireless equipment? (I suspect so?) Finding where "too little" begins could give us a better idea of what the ideal buffer length should really be.
>>>>>
>>>>> too little buffering will reduce the throughput as a result of unused airtime.
>>>>
>>>> so that's a function of, at least: 1) incoming traffic rate; 2) no. of retries * ( f(MAC behavior; number of other senders trying) ).
>>>
>>> incoming to the AP, you mean?
>>
>> incoming to whoever is sending and would be retrying - mostly the AP, yes.
>
> terminology issue here
>
> a receiver is never going to be retrying, it has nothing to retry. It's the sender that keeps track of what it's sent and retries if it doesn't get an ack.

Sorry, I must have expressed myself very unclearly. I am of course talking about a sender. With "incoming" I meant incoming to the buffer of the device that's sending, from the upstream (or up-stack) sender.

>>> It also matters whether you are worrying about aggregate throughput of a lot of users, or per-connection throughput for a single user.
>>>
>>> From a sender's point of view, if it takes 100 time units to send a packet, and 1-5 time units to queue the next packet for transmission, you lose a few percent of your possible airtime and there's very little concern.
>>>
>>> but if it takes 10 time units to send the packet and 1-5 time units to queue the next packet, you have just lost a lot of potential bandwidth.
>>>
>>> But from the point of view of the aggregate, these gaps just give someone else a chance to transmit and have very little effect on the amount of traffic arriving at the AP.
>>>
>>> I was viewing things from the point of view of the app on the laptop.
>>
>> Yes... I agree, and that's the more common + more reasonable way to think about it. I tend to think upstream, which of course is far less common, but maybe even more problematic. Actually I suspect the following: things get seriously bad when a lot of senders are sending upstream together. This isn't really happening much in practice - BUT when we have a very, very large number of hosts connected in a conference-style situation, all the HTTP GETs and SMTP messages and whatnot *do* create lots of collisions, a situation that isn't really too common (and maybe not envisioned / parametrized for), and that's why things often get so bad. (At least one of the reasons.)
>
> the thing is that in the high-density environment, there's not that much the AP can do; most of the problem is related to the mobile endpoints and what they decide to do.

True! (Though, as you say, limiting the allowed physical rates on the AP probably helps.)
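(To illustrate why the conference case degrades so quickly, a crude back-of-envelope in Python - it ignores backoff, capture and everything a proper DCF analysis would model, and the per-slot transmit probability is just an assumption:

    def collision_prob(n_stations, p_tx=0.02):
        """Chance that a transmission overlaps with at least one other
        station's transmission, if every station independently transmits
        with probability p_tx in a given slot."""
        return 1 - (1 - p_tx) ** (n_stations - 1)

    for n in (5, 50, 500):
        print(n, round(collision_prob(n), 2))
    # prints roughly: 5 0.08, 50 0.63, 500 1.0

The exact numbers mean nothing, but the trend is the point: the collision probability explodes with the number of contending stations.)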
>>>>> But at the low data rates involved, the system would have to be extremely busy for this to add up to a significant amount of time if even one packet at a time is buffered.
>>>>
>>>>> You are also conflating the effect of the driver/hardware buffering with it doing retries.
>>>>
>>>> because of the "function" I wrote above: the more you retry, the more you need to buffer when traffic continuously arrives, because you're stuck trying to send a frame again.
>>>
>>> huh, I'm missing something here - retrying sends would require you to buffer more when sending.
>>
>> aren't you saying the same thing as I am? Sorry if not, I might have expressed it confusingly somehow
>
> as I said above, the machine receiving packets doesn't need to buffer them, because it has no need to re-send them. It's the machine sending packets that needs to keep track of what's been sent in case it needs to re-send it.

Sure, that was a plain misunderstanding.

> But this cache of recently sent packets is separate from a queue of packets waiting to be sent.
>
> the size of the buffer used to track what's been sent isn't a problem. The bufferbloat problem is around the size of the queue for packets waiting to be sent.

This confuses me. Why do you even need a cache of recently sent packets?

Anyway, what I am talking about *is* the size of the queue for packets waiting to be sent - and not only due to aggregation, but also link layer retransmits. Per device, at the link layer, packets (frames, really) are sent in sequence AFAIK, so any frame that has been sent but not yet acknowledged, and that has to be resent when it isn't acknowledged, holds up all other frames to that same destination.

>>> If people are retrying when they really don't need to, that cuts down on the available airtime.
>>
>> Yes
>>
>>> But if you have continual transmissions taking place, so you have a hard time getting a chance to send your traffic, then you really do have congestion and should be dropping packets to let the sender know that it shouldn't try to generate as much.
>>
>> Yes; but the complexity that I was pointing at (but maybe it's a simple parameter, more like a 0 or 1 situation in practice?) lies in the word "continual". How long do you try before you decide that the sending TCP should really think it *is* congestion? To really optimize the behavior, that would have to depend on the RTT, which you can't easily know.
>
> Again, I think you are mixing two different issues here.

No, I think you misunderstand me -

> 1. waiting for a pause in everyone else's transmissions so that you can transmit without _knowing_ that you are going to clobber someone
>
> Even this can get tricky: is that station you are hearing faintly trying to transmit to an AP near you, so you should be quiet? Or is it transmitting to a station far enough away from you that you can go ahead and transmit your packet to your AP without interfering with it?

You mean the normal CSMA/CA procedure ( + RTS/CTS)? Sure, that's tricky in itself, but I wasn't talking about that.

> 2. your transmission getting clobbered so the packet doesn't get through, where you need to wait 'long enough' to decide that it's not going to be acknowledged and try again.

I was always only talking about that second bit. I'm sure I wasn't clear enough in writing and I'm sorry for that.
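(To put very rough numbers on that second bit - how long one frame's retries can hold up the queue behind it, and therefore how much buffering the "function" above asks for - a back-of-envelope in Python; every constant is made up purely for illustration:

    frame_bits = 1500 * 8
    phy_rate_bps = 6e6                 # assume a low legacy rate as the worst case
    per_attempt_s = frame_bits / phy_rate_bps + 0.002   # plus assumed MAC/backoff overhead
    retries = 7                        # an assumed retry limit

    head_of_line_s = retries * per_attempt_s     # how long one frame can block the queue
    arrival_rate_pps = 500                       # assumed packet arrival rate from the stack
    backlog_pkts = arrival_rate_pps * head_of_line_s

    print(round(head_of_line_s * 1000), "ms of blocking ->", round(backlog_pkts), "packets queued")
    # roughly: 28 ms of blocking -> 14 packets queued

Whether 28 ms of head-of-line blocking is harmless or terrible depends entirely on the RTT of the flows behind it, which is exactly the part the link layer can't easily know.)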
> This is a case where a local proxy server can actually make a big difference to you. The connections between your mobile devices and the local proxy server have a short RTT, so all timeouts can be nice and short, and then the proxy deals with the long-RTT connections out to the Internet.

Adding a proxy to these considerations only complicates them: it's a hard enough trade-off when we just ask ourselves: how large should a buffer for the sake of link layer retransmissions be? (Which is closely related to the question: how often should a link layer try to retransmit before giving up?) That's what my emails were about. I suspect that we don't have a good answer to even these questions, and I suspect that we'd be better off having something dynamic than fixed default values.

Cheers,
Michael
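P.S.: Just to make "something dynamic" slightly more concrete, here is the kind of rule I have in mind - pure speculation on my part, not an existing mechanism, and the fraction is arbitrary:

    def retry_budget_s(rtt_estimate_s, fraction=0.25):
        """Stop link-layer retransmissions once we've spent more than
        some fraction of the (estimated) end-to-end RTT on them, so
        that TCP's own loss response isn't delayed for too long."""
        return fraction * rtt_estimate_s

    def max_retries(rtt_estimate_s, per_attempt_s):
        # e.g. with a 100 ms RTT estimate and 4 ms per attempt: 6 retries
        return max(1, int(retry_budget_s(rtt_estimate_s) / per_attempt_s))

How the link layer would get a usable RTT estimate is of course the open question.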