From: Pete Heist
Date: Thu, 3 Jan 2019 23:06:28 +0100
To: Toke Høiland-Jørgensen
Cc: Jonathan Morton, Cake List
Subject: Re: [Cake] dual-src/dsthost unfairness, only with bi-directional traffic

I have a simpler setup now to remove some variables; both hosts are APU2s running Debian 9.6, kernel 4.9.0-8:

apu2a (iperf3 client) <-- default VLAN --> apu2b (iperf3 server)

Both have cake at 100mbit only on egress, with dual-srchost on the client and dual-dsthost on the server. With this setup (and probably previous ones, I just didn’t test it this way), bi-directional fairness with these flow counts works (rates in Mbit/s; a rough sketch of the qdisc and iperf3 commands follows the numbers):

IP1 8-flow TCP up: 46.4
IP2 1-flow TCP up: 47.3
IP1 8-flow TCP down: 46.8
IP2 1-flow TCP down: 46.7
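
Roughly, the rig looks like the sketch below; the interface name, the two source addresses (aliases on apu2a), the port numbers and the 60 s duration are illustrative placeholders rather than my exact values:

  # apu2a (client): shape egress, isolate flows by source host
  tc qdisc replace dev eth0 root cake bandwidth 100mbit dual-srchost
  # apu2b (server): shape egress, isolate flows by destination host
  tc qdisc replace dev eth0 root cake bandwidth 100mbit dual-dsthost

  # apu2b: one iperf3 server per concurrent test, on separate ports
  for p in 5201 5202 5203 5204; do iperf3 -s -p $p & done

  # apu2a: IP1 sends 8 flows each way, IP2 sends 1 flow each way
  # (-B binds the source address, -P sets the flow count, -R reverses direction)
  iperf3 -c apu2b -p 5201 -B $IP1 -P 8 -t 60 &
  iperf3 -c apu2b -p 5202 -B $IP1 -P 8 -t 60 -R &
  iperf3 -c apu2b -p 5203 -B $IP2 -P 1 -t 60 &
  iperf3 -c apu2b -p 5204 -B $IP2 -P 1 -t 60 -R &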

but with the originally reported flow counts it’s still about as imbalanced as before:

IP1 8-flow TCP up: 82.9
IP2 1-flow TCP up: 10.9
IP1 1-flow TCP down: 10.8
IP2 8-flow TCP down: 83.3

and now with ack-filter on both ends (not much change; see the sketch after these numbers):

IP1 8-flow TCP up: 82.8
IP2 1-flow TCP up: 10.9
IP1 1-flow TCP down: 10.5
IP2 8-flow TCP down: 83.2
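
The ack-filter runs only change the qdisc keywords; with the same placeholder interface name:

  # append cake's ACK filter on both hosts
  tc qdisc replace dev eth0 root cake bandwidth 100mbit dual-srchost ack-filter   # apu2a
  tc qdisc replace dev eth0 root cake bandwidth 100mbit dual-dsthost ack-filter   # apu2b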

Before I go further, what I’m seeing with this rig is that when “interplanetary” is used and the number of iperf3 TCP flows goes above the number of CPUs minus one (in my case, 4 cores), the UDP send rate starts dropping. This only happens with interplanetary for some reason, but such as it is, I’ve changed my tests to pit 8 UDP flows against 1 TCP flow instead, giving the UDP senders more CPU, as this seems to work much better. All tests except the last are with “interplanetary”.
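
(For clarity: “interplanetary” is cake’s longest RTT keyword, which stretches the AQM target/interval so far that it effectively stops dropping or marking. The UDP variant, with the same placeholders as before, and noting that iperf3’s -b limit applies per stream:)

  # cake with AQM effectively disabled via the extreme-RTT keyword
  tc qdisc replace dev eth0 root cake bandwidth 100mbit dual-srchost interplanetary

  # 8 parallel UDP flows; -b is per stream, so 6M x 8 gives the 48 Mbit aggregate
  iperf3 -c apu2b -u -b 6M -P 8 -t 60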

UDP upload competition (looks = good):

IP1 1-flow TCP up: 48.6
IP2 8-flow UDP 48-mbit up: 48.2 (0% loss)

UDP download competition (some imbalance, maybe a difference in how iperf3 reverse mode works?):

IP1 8-flow UDP 48-mbit down: 43.1 (0% loss)
IP2 1-flow TCP down: 53.4 (0% loss)
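
(In iperf3 reverse mode the server generates the traffic, so the UDP pacing happens on apu2b rather than apu2a, which could plausibly account for a small asymmetry. The download runs just add the reverse flag:)

  # with -R, apu2b transmits and apu2a receives
  iperf3 -c apu2b -u -b 6M -P 8 -t 60 -R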

All four at once (looks like the previous two tests combined without impacting one another, which is good):

IP1 1-flow TCP up: 47.7
IP2 8-flow UDP 48-mbit up: 48.2 (0% loss)
IP1 8-flow UDP 48-mbit down: 43.3 (0% loss)
IP2 1-flow TCP down: 52.3

All four at once, up IPs flipped (less fair):

IP1 8-flow UDP 48-mbit up: 37.7 (0% loss)
IP2 1-flow TCP up: 57.9
IP1 8-flow UDP 48-mbit down: 38.9 (0% loss)
IP2 1-flow TCP down: 56.3

All four at once, interplanetary off again, to double-check it, and yes, UDP gets punished in this case:

IP1 1-flow TCP up: 60.6
IP2 8-flow UDP 48-mbit up: 6.7 (86% loss)
IP1 8-flow UDP 48-mbit down: 2.9 (94% loss)
IP2 1-flow TCP down: 63.1

So have we learned something from this? Yes, fairness improves when the 8-flow clients use UDP instead of TCP, but by turning AQM off we’re also testing a very different scenario, one that’s not too realistic. Does this prove that the cause of the problem is TCP ack traffic?
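
One way to probe the ack theory would be to watch cake’s per-tin counters on both hosts during a run; with ack-filter active, the ack_drop counter shows how many pure ACKs were filtered (interface name again a placeholder):

  # dump cake statistics during a run; look at drops, marks and ack_drop
  tc -s qdisc show dev eth0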

Thanks again for the help on this. After a whole day on it, I’ll have to shift gears tomorrow to FreeNet router changes. I’ll show them the progress on Monday, so of course I’d like to have a great host fairness story for Cake, as this is one of the main reasons to use it instead of fq_codel, but perhaps this will get sorted out before then. :)

I agree with George that we’ve been through this before, and also with how he explained it in his latest email, but there have been many changes to Cake since we tested in 2017, so this could be a regression. I’m almost sure I tested this exact scenario back then, and would not have put 8 up / 8 down on one IP and 1 up / 1 down on the other, which for some reason works fairly.

FWIW, I also reproduced it in flent between the same APU2s used above, to be sure iperf3 wasn’t somehow causing it:

https://www.heistp.net/downloads/fairness_8_1/
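
A simplified single-direction run along these lines reproduces the 8-flow upload side; tcp_nup and upload_streams are stock flent names, the host is a placeholder:

  # 8 upload streams against the cake-shaped link, 60 s
  flent tcp_nup -H apu2b -l 60 --test-parameter upload_streams=8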

