From: Simon Barber
Date: Mon, 9 Oct 2017 14:44:02 -0700
To: Bob McMahon
Cc: David Reed, make-wifi-fast@lists.bufferbloat.net, Johannes Berg
Subject: Re: [Make-wifi-fast] less latency, more filling... for wifi

Very nice - I'm using iperf 3.2 and always have to figure packets per second by combining packet size and bandwidth. This will be much easier. Also, direct reporting of one-way latency variance above minimum round trip would be very useful.
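
(For reference, the arithmetic being replaced: 100 Mbit/s of 1470-byte UDP payloads works out to 100e6 / (1470 * 8) ≈ 8,500 packets per second.)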

Simon

On Oct 9, 2017, at 2:04 PM, Bob McMahon <bob.mcmahon@broadcom.com> wrote:

Hi,

Not sure if this is helpful, but we've added end-to-end latency measurements for UDP traffic in iperf 2.0.10. It does require the clocks to be synchronized. I use a Spectracom TSync PCIe card with either an oven-controlled oscillator or a GPS-disciplined one, then use the Precision Time Protocol (PTP) to distribute the clock over IP multicast. For Linux, the traffic threads are set to realtime scheduling to minimize the latency added by thread scheduling.

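On Linux the clock distribution side can be handled by a PTP daemon such as linuxptp's ptp4l; the realtime-scheduling side is roughly the following sketch (not iperf's actual code - the priority value is arbitrary, and CAP_SYS_NICE or root is required):

    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    /* Move the calling traffic thread onto the SCHED_FIFO realtime class
     * so it is not delayed behind ordinary timesharing threads. */
    static void make_thread_realtime(void)
    {
        struct sched_param sp;
        memset(&sp, 0, sizeof(sp));
        sp.sched_priority = 40;  /* arbitrary RT priority; Linux allows 1-99 */
        int err = pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
        if (err != 0)
            fprintf(stderr, "realtime scheduling unavailable: %s\n", strerror(err));
    }
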
I'm also in the process of implementing a very simple isochronous option where the iperf client (tx) accepts a frames-per-second command line value (e.g. 60) as well as a log-normal distribution for the input, to somewhat simulate variable bit rates. On the iperf receiver, I'm considering implementing an underflow/overflow counter against the expected frames per second.

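A sender loop for such an isochronous option might look like the sketch below (illustrative only - the helper name and distribution parameters are invented, not iperf's interface). Absolute deadlines keep the 60 Hz tick from drifting as sleep overruns accumulate:

    #include <math.h>
    #include <stdlib.h>
    #include <time.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    extern void send_frame(size_t nbytes);   /* hypothetical UDP frame writer */

    /* Draw one log-normal sample via the Box-Muller transform. */
    static double lognormal_sample(double mu, double sigma)
    {
        double u1 = (rand() + 1.0) / ((double)RAND_MAX + 2.0);   /* uniform (0,1) */
        double u2 = (rand() + 1.0) / ((double)RAND_MAX + 2.0);
        double z  = sqrt(-2.0 * log(u1)) * cos(2.0 * M_PI * u2); /* std normal */
        return exp(mu + sigma * z);
    }

    /* Send one variable-sized frame per tick, e.g. fps = 60. */
    void isoch_sender(int fps, double mu, double sigma)
    {
        struct timespec next;
        clock_gettime(CLOCK_MONOTONIC, &next);
        long step_ns = 1000000000L / fps;        /* ~16.67 ms at 60 fps */
        for (;;) {
            send_frame((size_t)lognormal_sample(mu, sigma));
            next.tv_nsec += step_ns;             /* advance the absolute deadline */
            if (next.tv_nsec >= 1000000000L) {
                next.tv_sec  += 1;
                next.tv_nsec -= 1000000000L;
            }
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        }
    }
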
Latency does seem to be a significant metric. So is power consumption.

Comments welcome.

Bob

On Mon, Oct 9, 2017 at 1:41 PM, <dpreed@reed.com> wrote:
It's worth setting a stretch latency goal that is in principle achievable.

I get the sense that the wireless group obsesses over maximum channel utilization rather than excellent latency. This is where it's important to put latency as the primary goal and utilization as the secondary goal, rather than vice versa.

It's easy to get at this by observing that the minimum latency on the shared channel is achieved by round-robin scheduling of packets that are of sufficient size that per-packet overhead doesn't dominate.

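(Concretely: with N contending stations whose packets each take time t to serve, round-robin bounds any station's wait at (N - 1) * t, the best worst case a shared channel can offer once packets are large enough to amortize the per-packet overhead.)
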
So only aggregate when there are few contenders for the channel, or the packets are quite small compared to the per-packet overhead. When there are more contenders, still aggregate small packets, but only those that are actually waiting. But large packets shouldn't be aggregated.

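That policy reads naturally as a predicate; here is a minimal sketch, with an invented size threshold:

    #include <stdbool.h>
    #include <stddef.h>

    #define SMALL_FRAME_BYTES 256  /* invented: below this, per-packet overhead dominates */

    /* Aggregate freely on an uncontended channel; under contention,
     * aggregate only small frames that are already waiting. */
    static bool should_aggregate(int contenders, size_t frame_len, bool already_queued)
    {
        if (contenders <= 1)
            return true;
        return frame_len <= SMALL_FRAME_BYTES && already_queued;
    }
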
Multicast should be avoided by higher-level protocols for the most part, and the latency of multicast should be a non-issue. In wireless, it's kind of a dumb idea anyway, given that stations have widely varying propagation characteristics. Do just enough to support DHCP and so forth.

It's so much fun for the hardware designers to throw in stuff that only helps in marketing benchmarks (like gaining a few percent of throughput in lab conditions that never happen in the field) that it is tempting for OS driver writers to use those features (like deep queues and offload-processing bells and whistles). But the real issue to be solved is the turn-taking "bloat" that comes from trying too hard to aggregate, to handle the "sole transmitter to dedicated receiver" case, etc.

I use 10 GigE in my house. I don't use it because I want to do 10-gig file transfers all day and measure them. I use it because (properly managed) it gives me *low latency*. That low latency is what matters, not throughput. My average load, if spread out across 24 hours, could be handled by 802.11b for the entire house.

We are soon going to have 802.11ax in the home. That's approximately 10 Gb/sec, but wireless. No TV streaming can fill it. It's not for continuous isochronous traffic at all.

What it is for is *low latency*. So if the adapters and the drivers won't give me that low latency, what good is 10 Gb/sec at all? This is true for 802.11ac as well.

We aren't building dragsters fueled with nitro, built to run down 1/4 mile of track but unable to steer.

Instead, we want to be able to connect musical instruments in an electronic symphony, where timing is everything.

On Monday, October 9, 2017 4:13pm, "Dave Taht" <dave.taht@gmail.com> said:

> There were five ideas I'd wanted to pursue at some point. I'm not
> presently on linux-wireless, nor do I have time to pay attention right
> now - but I'm enjoying that thread passively.
>
> To get those ideas "out there" again:
>
> * Adding a fixed-length fq'd queue for multicast.
>
> * Reducing retransmits at low rates
>
> See the recent paper:
>
> "Resolving Bufferbloat in TCP = Communication over IEEE 802.11 n WLAN by
> Reducing MAC = Retransmission Limit at Low Data Rate" (I'd paste a link
> but for some reason that doesn't work well)
>
> Even with their simple bi-modal model it worked pretty well.
>
> It also reduces contention with "bad" stations more automagically.
>
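The bi-modal rule itself is tiny - a sketch with invented thresholds, not the paper's exact numbers:

    /* Cap MAC retransmissions when a station is stuck at a low PHY rate,
     * so it cannot monopolize airtime re-sending slow frames. */
    static int mac_retry_limit(int phy_rate_mbps)
    {
        return (phy_rate_mbps <= 12) ? 2 : 10;  /* low rate: give up early */
    }
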
> * Less buffering at the driver.
>
> Presently (ath9k) there are two to three aggregates stacked up at the driver.
>
> With a good estimate for how long it will take to service one, forming
> another within that deadline seems feasible, so you only need to have
> one in the hardware itself.
>
> Simple example: you have data in the hardware projected to take a
> minimum of 4ms to transmit. Don't form a new aggregate and submit it
> to the hardware for 3.5ms.
>
> I know full well that a "good" estimate is hard, and things like
> MU-MIMO complicate things. Still, I'd like to get below 20ms of
> latency within the driver, and this is one way to get there.
>
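In code, the gate might look like this sketch; the 0.5 ms guard mirrors the 4 ms / 3.5 ms example above:

    #include <stdbool.h>
    #include <stdint.h>

    /* Build the next aggregate only when the hardware is within one guard
     * interval of draining what it already holds, so at most one finished
     * aggregate waits in the hardware at a time. */
    static bool may_form_next_aggregate(uint64_t now_us, uint64_t hw_drain_done_us)
    {
        const uint64_t guard_us = 500;  /* start building ~0.5 ms before drain */
        return now_us + guard_us >= hw_drain_done_us;
    }
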
> * Reducing the size of a txop under contention
>
> if you have 5 stations getting blasted away at 5ms each, and one that
> only wants 1ms worth of traffic, "soon", temporarily reducing the size
> of the txop for everybody so you can service more stations faster,
> seems useful.
>
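One way to express that - a sketch whose clamp bounds echo the 1 ms and 5 ms figures above:

    #include <stdint.h>

    /* Shrink everyone's txop as the number of stations with queued traffic
     * grows, trading some aggregation efficiency for faster turn-taking. */
    static uint32_t txop_limit_us(int active_stations)
    {
        if (active_stations < 1)
            active_stations = 1;
        uint32_t t = 5000 / (uint32_t)active_stations;
        return (t < 1000) ? 1000 : t;   /* clamp to [1 ms, 5 ms] */
    }
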
> * Merging ACs when sane to do so
>
> sane aggregation in general works better than prioritizing does, as
> shown in "Ending the Anomaly".
>
> --
>
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619


_______________________________________________
Make-wifi-fast mailing list
Make-wifi-fast@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/make-wifi-fast