From: Pete Heist
Date: Fri, 7 Apr 2017 11:37:49 +0200
To: Jonathan Morton
Cc: Cake List, Toke Høiland-Jørgensen, Dave Täht
Subject: Re: [Cake] flow isolation for ISPs
List-Id: Cake - FQ_codel the next generation

> On Apr 7, 2017, at 10:28 AM, Jonathan Morton <chromatix99@gmail.com> wrote:

>> On 7 Apr, 2017, at 11:13, Pete Heist <peteheist@gmail.com> wrote:

>>> On Apr 6, 2017, at 11:26 AM, Pete Heist <peteheist@gmail.com> wrote:

>>>> On Apr 6, 2017, at 11:11 AM, Jonathan Morton <chromatix99@gmail.com> wrote:

>>>> On 6 Apr, 2017, at 11:27, Pete Heist <peteheist@gmail.com> wrote:

>>>>> There is a table mapping each member ID to a list of MAC addresses for that member, so if there could somehow be fairness based on that table and by MAC address, that could solve it, but I don’t see how it could be implemented.

>>>> One option would be to use HTB with FLOWER filters to sort out the subscribers into classes, and use Cake or fq_codel as a child qdisc per class. Remember that Cake can be used in “unlimited” mode to rely on an external shaping source.
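
A minimal sketch of how the HTB + flower + per-class Cake layout suggested above might be generated from the member table. The interface name, the rate, the member IDs and the MAC addresses are made-up placeholders, matching on source MAC is only one possible choice, and the script just builds tc command strings without running them. The exact tc syntax should be checked against the installed iproute2 and sch_cake versions.

```python
def tc_commands(dev, members, rate="10mbit", child="cake unlimited"):
    """Build tc command strings: one HTB class per member, one flower
    filter per MAC address, and one child qdisc per class."""
    cmds = [f"tc qdisc add dev {dev} root handle 1: htb"]
    for member_id, macs in sorted(members.items()):
        # tc class IDs are hexadecimal; minor number 0 is not usable,
        # so member IDs are assumed to start at 1.
        classid = f"1:{member_id:x}"
        cmds.append(
            f"tc class add dev {dev} parent 1: classid {classid} htb rate {rate}"
        )
        for mac in macs:
            cmds.append(
                f"tc filter add dev {dev} parent 1: protocol all prio 1 "
                f"flower src_mac {mac} classid {classid}"
            )
        # Cake in "unlimited" mode leaves the shaping to the HTB class.
        cmds.append(f"tc qdisc add dev {dev} parent {classid} {child}")
    return cmds


# Hypothetical member table: member ID -> list of MAC addresses.
MEMBERS = {
    1: ["02:00:00:00:00:01"],
    2: ["02:00:00:00:00:02", "02:00:00:00:00:03"],
}

if __name__ == "__main__":
    for cmd in tc_commands("eth0", MEMBERS):
        print(cmd)
```

Passing `child="fq_codel"` would give the fq_codel variant of the same layout.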

>> One more thought, would it be possible for Cake to optionally include the packet’s mark in the hash?

>> I know it’s additional functionality, and another keyword, but it could get you out of the business of supporting the myriad ways people might want to do flow isolation, and you’d still have a catch-all answer for such cases.

>> There could be a keyword ‘hash-mark’, let’s say, which first includes the mark in the hash, then goes on to deal with any other flow isolation keywords as usual. So for example if I have ‘hash-mark’ and ‘dual-srchost’, the hash is first on the mark, then by source host, then by flow. I could set the mark to be the member number with iptables.
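
As a rough illustration of setting the mark to the member number with iptables, the sketch below turns a member table into MARK rules keyed on source MAC address. The member table is hypothetical, and the choice of the mangle table's FORWARD chain is an assumption (the mac match is only usable where the source MAC is still known, such as PREROUTING, FORWARD and INPUT). The script only builds the rule strings.

```python
def mark_rules(members, chain="FORWARD"):
    """Build one iptables rule per MAC address, marking matching packets
    with the owning member's ID."""
    rules = []
    for member_id, macs in sorted(members.items()):
        for mac in macs:
            rules.append(
                f"iptables -t mangle -A {chain} -m mac --mac-source {mac} "
                f"-j MARK --set-mark {member_id}"
            )
    return rules


# Hypothetical member table: member ID -> list of MAC addresses.
MEMBERS = {
    1: ["02:00:00:00:00:01"],
    2: ["02:00:00:00:00:02", "02:00:00:00:00:03"],
}

if __name__ == "__main__":
    for rule in mark_rules(MEMBERS):
        print(rule)
```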

> That isn’t really how hashing works; there is no “first, second, third” structure, just an accumulation of entropy which is all mashed together. In order to run the triple-isolation algorithm at all, I have to take separate hashes of the relevant host addresses, alongside the general 5-tuple hash.

> However, it would be possible to use the “mark” directly as one of the host identifiers which triple-isolate operates on to provide that layer of fairness. That’s probably what you meant.

> Since this wouldn’t unduly complicate the configuration interface, it could be a feasible way of adding this functionality for modest installations, up to a strict maximum of 1024 subscribers (and a recommended maximum somewhat below that).
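
To illustrate the distinction drawn above, here is a rough sketch, not Cake's actual code. The flow hash mixes the whole 5-tuple into a single number with no ordering between fields, while host fairness needs its own separate identifier per packet; in the proposal the mark would stand in for that host identifier. The `Packet` fields, the `use_mark` switch, the CRC32 hash and the table size of 1024 are all assumptions made for the example.

```python
import zlib
from collections import namedtuple

Packet = namedtuple("Packet", "src dst sport dport proto mark")

TABLE_SIZE = 1024  # assumed, echoing the 1024-subscriber maximum quoted above


def bucket(*fields):
    """Hash any number of fields into one table index. The result carries
    no 'first, second, third' structure; the fields are simply mixed."""
    data = "|".join(str(f) for f in fields).encode()
    return zlib.crc32(data) % TABLE_SIZE


def classify(pkt, use_mark=False):
    """Return (host_id, flow_id): two separate identifiers rather than
    one combined hash."""
    flow_id = bucket(pkt.src, pkt.dst, pkt.sport, pkt.dport, pkt.proto)
    if use_mark:
        host_id = pkt.mark % TABLE_SIZE  # mark used directly as the host identifier
    else:
        host_id = bucket(pkt.src)  # separate hash of the source host address
    return host_id, flow_id
```

With `use_mark=True`, packets from different source addresses that carry the same mark end up with the same `host_id`, which is what would let a member with several devices be treated as one host.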

Ok, I’m still getting familiar with how triple-isolate is implemented. For example, I was surprised in my test setup that no fairness is enforced when four client IPs connect to a single server IP, but I understand from this discussion (https://github.com/dtaht/sch_cake/issues/46) that that is actually what is expected. We would probably use dual-srchost and dual-dsthost in the backhaul, which seems to work very well, and there we have the information to specify that in both directions. (Also, there is no NAT to deal with at this level.)

Just to see if I understand the marking proposal, here’s the behavior I would expect: if there are two TCP flows (on egress) with mark 1 and one with mark 2, that together saturate the link, the measured rate of the two flows with mark 1 will add up to the rate of the single flow with mark 2. Is that right? And would you still add a keyword to specify that the mark should be used at all?
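
The expectation described above can be written out as a small calculation. This is only the idealised outcome as stated in the question, assuming every flow is backlogged and that the link is split equally between marks first and then equally between the flows sharing a mark; it is not a measurement, and the 100 Mbit/s link rate is an arbitrary example figure.

```python
from collections import Counter


def expected_rates(link_rate, flow_marks):
    """Split link_rate equally between the distinct marks, then equally
    between the flows carrying each mark. Returns one rate per flow."""
    flows_per_mark = Counter(flow_marks)
    per_mark = link_rate / len(flows_per_mark)
    return [per_mark / flows_per_mark[mark] for mark in flow_marks]


# Two flows with mark 1 and one flow with mark 2 on a 100 Mbit/s link.
print(expected_rates(100.0, [1, 1, 2]))  # [25.0, 25.0, 50.0]
```

The two mark 1 flows together get 50, the same as the single mark 2 flow.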

I’m not sure where the 1024 limit comes from, but it would probably be fine in our case as of now, with 800 members. Even in the future, I don’t think occasional collisions would be a big problem, and I think there are things we could do to minimize them.
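
How often members would collide depends on how the mark is mapped onto the available slots, which this thread does not specify. As a hedged back-of-the-envelope sketch, assuming 1024 slots, the first function estimates sharing under a plain uniform random hash with no collision avoidance, and the second counts sharing when the member number is used directly as the slot index. Both mappings are assumptions for illustration, not a description of what Cake does.

```python
from collections import Counter


def expected_sharing_members(members, slots=1024):
    """Expected number of members that share a slot with at least one
    other member, if each member lands in a uniformly random slot."""
    p_alone = (1 - 1 / slots) ** (members - 1)
    return members * (1 - p_alone)


def direct_index_sharing(member_ids, slots=1024):
    """Number of members sharing a slot when slot = member_id % slots."""
    counts = Counter(member_id % slots for member_id in member_ids)
    return sum(c for c in counts.values() if c > 1)


if __name__ == "__main__":
    print(expected_sharing_members(800))
    print(direct_index_sharing(range(1, 801)))
```

One way to minimize collisions, under these assumptions, would be to keep member numbers dense and below the slot count so that direct indexing never wraps around.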

>> It looks like the mark could be obtained from the ‘mark’ field of the sk_buff struct, but I don’t know the validity of the field in various cases. For example, I don’t think I can set the mark on ingress before it reaches a qdisc on an IFB device.

> It has been suggested, in the context of using the “mark” for Diffserv purposes, that Linux’ conntrack facility could preserve the mark between directions of flow. Cake can already query conntrack for NAT awareness.

That would be nice for the future, but for now I guess this wouldn’t work on ingress. It shouldn’t be much of a problem in the backhaul though, because we’re the ones sending the downstream traffic, and we can set the marks on that.

Overall, I think this could be a nice feature. Let me know if I can help in some way and thank you for your feedback. :)
