From: Bob McMahon <bob.mcmahon@broadcom.com>
Date: Mon, 9 Oct 2017 15:02:05 -0700
To: Simon Barber <simon@superduper.net>
Cc: David Reed <dpreed@reed.com>, make-wifi-fast@lists.bufferbloat.net,
 Johannes Berg
Subject: Re: [Make-wifi-fast] less latency, more filling... for wifi

Not sure how to determine when one-way latency is above round trip. Iperf
traffic for latency uses UDP, where nothing is coming back. For TCP, the
iperf client will report a sampled RTT per the network stack (on operating
systems that support this).

One idea - have two traffic streams, one TCP and one UDP, and use a
higher-level script (e.g. via python) to poll data from each and perform
the compare. Though I'm not sure this would give you what you're looking
for.
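
Something like this, as a rough, untested sketch - it assumes iperf 2 with
enhanced reporting (-e) on both ends and just scrapes the human-readable
output, so the server address and the regexes below are placeholders that
would need adjusting to your build's exact output format:

    # Sketch only: run a UDP and a TCP iperf stream in parallel and
    # compare the one-way delay samples against half the sampled TCP RTT.
    import re
    import subprocess
    import threading

    SERVER = "192.168.1.10"  # placeholder address of the iperf server

    def poll(cmd, pattern, out):
        # Launch an iperf client and collect every value pattern matches.
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                universal_newlines=True)
        for line in proc.stdout:
            m = re.search(pattern, line)
            if m:
                out.append(float(m.group(1)))

    owd, rtt = [], []
    threads = [
        threading.Thread(target=poll, args=(
            ["iperf", "-c", SERVER, "-u", "-e", "-i", "1", "-t", "10"],
            r"([0-9.]+) ms", owd)),   # UDP: one-way delay (synched clocks)
        threading.Thread(target=poll, args=(
            ["iperf", "-c", SERVER, "-e", "-i", "1", "-t", "10"],
            r"RTT=([0-9.]+)", rtt)),  # TCP: stack-sampled RTT
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    if owd and rtt:
        print("mean one-way delay: %.3f ms" % (sum(owd) / len(owd)))
        print("mean RTT/2:         %.3f ms" % (sum(rtt) / 2 / len(rtt)))

Note that depending on the setup the UDP one-way delay may be printed by
the server rather than the client, in which case that poller would scrape
the server's output instead.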

Bob

On Mon, Oct 9, 2017 at 2:44 PM, Simon Barber <simon@superduper.net> wrote:

> Very nice - I'm using iperf 3.2 and always have to figure packets per
> second by combining packet size and bandwidth. This will be much easier.
> Also, direct reporting of one-way latency variance above minimum round
> trip would be very useful.
>
> Simon
>
> On Oct 9, 2017, at 2:04 PM, Bob McMahon <bob.mcmahon@broadcom.com> wrote:
>
> Hi,
>
> Not sure if this is helpful, but we've added end-to-end latency
> measurements for UDP traffic in iperf 2.0.10. It does require the clocks
> to be synched. I use a Spectracom TSync PCIe card with either an
> oven-controlled oscillator or a GPS-disciplined one, then use Precision
> Time Protocol to distribute the clock over IP multicast. For Linux, the
> traffic threads are set to realtime scheduling to minimize the latency
> added by thread scheduling.
>
> I'm also in the process of implementing a very simple isochronous option
> where the iperf client (tx) accepts a frames-per-second command line
> value (e.g. 60) as well as a log-normal distribution for the input, to
> somewhat simulate variable bit rates. On the iperf receiver I'm
> considering implementing an underflow/overflow counter against the
> expected frames per second.
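>
> The client-side pacing would look roughly like this (a minimal sketch of
> the idea, not the actual iperf code - the destination address and the
> log-normal parameters are placeholders):
>
>     import random
>     import socket
>     import time
>
>     FPS = 60                      # frames per second (command line value)
>     MU, SIGMA = 9.0, 0.5          # log-normal parameters for frame size
>     DST = ("192.168.1.10", 5001)  # placeholder UDP receiver
>     MTU_PAYLOAD = 1470            # datagram payload size
>
>     # On Linux, realtime scheduling trims scheduler-added jitter
>     # (needs privileges), e.g.:
>     #   os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(50))
>
>     sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
>     next_tx = time.time()
>     for _ in range(FPS * 10):  # run for ~10 seconds
>         # draw this frame's size from the log-normal distribution
>         size = max(1, int(random.lognormvariate(MU, SIGMA)))
>         payload = b"\x00" * size
>         # burst the frame out as MTU-sized datagrams
>         for off in range(0, size, MTU_PAYLOAD):
>             sock.sendto(payload[off:off + MTU_PAYLOAD], DST)
>         next_tx += 1.0 / FPS
>         time.sleep(max(0.0, next_tx - time.time()))
>
> A receiver counting frame arrivals per interval against the expected FPS
> gives the underflow/overflow counter mentioned above.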
>
> Latency does seem to be a significant metric. So is power consumption.
>
> Comments welcome.
>
> Bob
>
> On Mon, Oct 9, 2017 at 1:41 PM, <dpreed@reed.com> wrote:
>
>> It's worth setting a stretch latency goal that is in principle
>> achievable.
>>
>> I get the sense that the wireless group obsesses over maximum channel
>> utilization rather than excellent latency. This is where it's important
>> to put latency as a primary goal, and utilization as the secondary goal,
>> rather than vice versa.
>>
>> It's easy to get at this by observing that the minimum latency on the
>> shared channel is achieved by round-robin scheduling of packets that are
>> of sufficient size that per-packet overhead doesn't dominate.
>>
>> So only aggregate when there are few contenders for the channel, or the
>> packets are quite small compared to the per-packet overhead. When there
>> are more contenders, still aggregate small packets, but only those that
>> are actually waiting. But large packets shouldn't be aggregated.
>>
>> Multicast should be avoided by higher-level protocols for the most part,
>> and the latency of multicast should be a non-issue. In wireless, it's
>> kind of a dumb idea anyway, given that stations have widely varying
>> propagation characteristics. Do just enough to support DHCP and so
>> forth.
>>
>> It's so much fun for the hardware designers to throw in stuff that only
>> helps in marketing benchmarks (like getting a few percent on throughput
>> in lab conditions that never happen in the field) that it is tempting
>> for OS driver writers to use those features (like deep queues and
>> offload processing bells and whistles). But the real issue to be solved
>> is the turn-taking "bloat" that comes from too much attempt to
>> aggregate, to handle the "sole transmitter to dedicated receiver" case,
>> etc.
>>
>> I use 10 GigE in my house. I don't use it because I want to do 10 Gig
>> file transfers all day and measure them. I use it because (properly
>> managed) it gives me *low latency*. That low latency is what matters,
>> not throughput. My average load, if spread out across 24 hours, could be
>> handled by 802.11b for the entire house.
>>
>> We are soon going to have 802.11ax in the home. That's approximately 10
>> Gb/sec, but wireless. No TV streaming can fill it. It's not for
>> continuous isochronous traffic at all.
>>
>> What it is for is *low latency*. So if the adapters and the drivers
>> won't give me that low latency, what good is 10 Gb/sec at all? This is
>> true for 802.11ac as well.
>>
>> We aren't building dragsters fueled with nitro, built to run down a 1/4
>> mile of track but unable to steer.
>>
>> Instead, we want to be able to connect musical instruments in an
>> electronic symphony, where timing is everything.
>>
>>
>> On Monday, October 9, 2017 4:13pm, "Dave Taht" <dave.taht@gmail.com>
>> said:
>>
>> > There were five ideas I'd wanted to pursue at some point. I'm not
>> > presently on linux-wireless, nor do I have time to pay attention right
>> > now - but I'm enjoying that thread passively.
>> >
>> > To get those ideas "out there" again:
>> >
>> > * Adding a fixed-length fq'd queue for multicast.
>> >
>> > * Reducing retransmits at low rates.
>> >
>> > See the recent paper:
>> >
>> > "Resolving Bufferbloat in TCP Communication over IEEE 802.11n WLAN by
>> > Reducing MAC Retransmission Limit at Low Data Rate" (I'd paste a link,
>> > but for some reason that doesn't work well)
>> >
>> > Even with their simple bi-modal model it worked pretty well.
>> >
>> > It also reduces contention with "bad" stations more automagically.
>> >
>> > * Less buffering at the driver.
>> >
>> > Presently (ath9k) there are two to three aggregates stacked up at the
>> > driver.
>> >
>> > With a good estimate for how long it will take to service one, forming
>> > another within that deadline seems feasible, so you only need to have
>> > one in the hardware itself.
>> >
>> > Simple example: you have data in the hardware projected to take a
>> > minimum of 4ms to transmit. Don't form a new aggregate and submit it
>> > to the hardware for 3.5ms.
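>> >
>> > In rough Python, the gating idea looks like this (a sketch only - the
>> > names are made up and the real logic would live in the driver):
>> >
>> >     import time
>> >
>> >     GUARD = 0.0005        # form the next aggregate 0.5 ms early
>> >     hw_busy_until = 0.0   # projected time the hardware queue drains
>> >
>> >     def maybe_submit(form_aggregate, submit, est_airtime):
>> >         # Only form and submit a new aggregate when the hardware is
>> >         # within GUARD seconds of running dry; until then, let
>> >         # packets accumulate so the aggregate is formed as late as
>> >         # possible with the freshest traffic.
>> >         global hw_busy_until
>> >         now = time.monotonic()
>> >         if now < hw_busy_until - GUARD:
>> >             return False   # e.g. 4 ms queued: do nothing for 3.5 ms
>> >         agg = form_aggregate()
>> >         submit(agg)
>> >         hw_busy_until = max(now, hw_busy_until) + est_airtime(agg)
>> >         return True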
>> >
>> > I know full well that a "good" estimate is hard, and things like
>> > MU-MIMO complicate things. Still, I'd like to get below 20ms of
>> > latency within the driver, and this is one way to get there.
>> >
>> > * Reducing the size of a txop under contention.
>> >
>> > If you have 5 stations getting blasted away at 5ms each, and one that
>> > only wants 1ms worth of traffic "soon", temporarily reducing the size
>> > of the txop for everybody so you can service more stations faster
>> > seems useful.
>> >
>> > * Merging ACs when sane to do so.
>> >
>> > Sane aggregation in general works better than prioritizing does, as
>> > shown in "Ending the Anomaly".
>> >
>> > --
>> >
>> > Dave Täht
>> > CEO, TekLibre, LLC
>> > http://www.teklibre.com
>> > Tel: 1-669-226-2619
>> > _______________________________________________
>> > Make-wifi-fast mailing list
>> > Make-wifi-fast@lists.bufferbloat.net
>> > https://lists.bufferbloat.net/listinfo/make-wifi-fast
>>
>> _______________________________________________
>> Make-wifi-fast mailing list
>> Make-wifi-fast@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/make-wifi-fast
>
> _______________________________________________
> Make-wifi-fast mailing list
> Make-wifi-fast@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/make-wifi-fast