From: Jonathan Morton
Date: Thu, 17 Aug 2017 02:55:17 +0300
To: Cong Xu
Cc: Cake List <cake@lists.bufferbloat.net>
Subject: Re: [Cake] flow isolation with ipip

Cake makes use of Linux's "packet dissecting" infrastructure. If the latter knows about the tunnelling protocol, Cake should naturally see the IP addresses and port numbers of the inner payload rather than those of the outer tunnel.

I don't know, however, precisely which tunnels are supported. At minimum, don't ever expect encrypted tunnels to behave this way!
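
For illustration, a minimal sketch of attaching Cake in place of the HTB/sfq hierarchy shown below (eth0 and 5gbit are placeholders, not values from that setup):

# Attach Cake as the root qdisc. "flows" selects pure per-flow
# isolation, which relies on the kernel's flow dissector to find the
# inner headers of tunnels it understands; the default
# "triple-isolate" would additionally enforce per-host fairness.
tc qdisc replace dev eth0 root cake bandwidth 5gbit flows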

- Jonathan Morton

On 18 Jun 2017 21:13, "Cong Xu" <davidxu06@gmail.com> wrote:

Hi,

I wonder if cake's flow isolation works with the ipip tunnel? I hope to guarantee a fair share of the network among containers/VMs on the same host. Thus, I used sfq/fq attached to tc classes created in advance, to provide both shaping and scheduling. The scripts look roughly like this (assume two containers hosting iperf clients run on the same host; one container sends 100 parallel streams via -P 100 to an iperf server running on another host, the other sends 10 parallel streams with -P 10):

tc qdisc add dev $NIC root handle 1: htb default 2
tc class add dev $NIC parent 1: classid 1:1 htb rate ${NIC_RATE}mbit burst 1m cburst 1m
tc class add dev $NIC parent 1:1 classid 1:2 htb rate ${RATE1}mbit ceil ${NIC_RATE}mbit burst 1m cburst 1m
tc class add dev $NIC parent 1:1 classid 1:3 htb rate ${RATE2}mbit ceil ${NIC_RATE}mbit burst 1m cburst 1m
tc qdisc add dev $NIC parent 1:2 handle 2 sfq perturb 10
tc qdisc add dev $NIC parent 1:3 handle 3 sfq perturb 10
tc filter add ...
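
For concreteness, a hypothetical u32 filter of the kind the elided line might contain (10.0.0.2/32 is a placeholder for one container's veth IP):

tc filter add dev $NIC parent 1: protocol ip prio 1 u32 match ip src 10.0.0.2/32 flowid 1:2

A matching rule would steer the other container's traffic into class 1:3.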

It works well: each container running iperf gets almost the same bandwidth regardless of the number of flows. (Without sfq, the container sending 100 streams achieves much higher bandwidth than the one sending only 10.)

-------------- simultaneous 2 unlimited (100 conns vs 10 conns) -------------
job "big-unlimited-client" created
job "small-unlimited-client" created
-------------- unlimited server <-- unlimited client (100 conns) -------------
[SUM]   0.00-50.01  sec  24.9 GBytes  4.22 Gbits/sec  16874   sender
[SUM]   0.00-50.01  sec  24.8 GBytes  4.21 Gbits/sec          receiver

-------------- unlimited server <-- unlimited client (10 conns) -------------
[SUM]   0.00-50.00  sec  24.4 GBytes  4.19 Gbits/sec  13802   sender
[SUM]   0.00-50.00  sec  24.4 GBytes  4.19 Gbits/sec          receiver

However, if ipip is enabled, sfq does not work anymore.

-------------- simultaneous 2 unlimited (100 conns vs 10 conns) -------------
job "big-unlimited-client" created
job "small-unlimited-client" created
-------------- unlimited server <-- unlimited client (100 conns) -------------
[SUM]   0.00-50.00  sec  27.2 GBytes  4.67 Gbits/sec  391278  sender
[SUM]   0.00-50.00  sec  27.1 GBytes  4.65 Gbits/sec          receiver

-------------- unlimited server <-- unlimited client (10 conns) -------------
[SUM]   0.00-50.00  sec  6.85 GBytes  1.18 Gbits/sec  64153   sender
[SUM]   0.00-50.00  sec  6.82 GBytes  1.17 Gbits/sec          receiver

The reason is that, with the ipip tunnel, the src/dst IP addresses are the same for all flows: they are the src/dst IPs of the host NICs rather than the veth IPs of the individual containers/VMs, and the outer header of an ipip packet carries no port numbers. I verified this by capturing the traffic on the NIC and analyzing it with Wireshark. The real src/dst IPs of the containers/VMs are only visible on the tunnel device (e.g. tunl0). In theory, this could be solved by setting up the tc classes and sfq on tunl0 instead of the host NIC; I tried that, but unfortunately it did not work either. fq fails for the same reason, since both sfq and fq use the same flow classifier (src/dst IPs and ports). So I wonder whether cake works with an ipip tunnel or not.
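
To illustrate what the capture shows, each ipip packet on the NIC is laid out roughly like this (addresses are placeholders):

outer IP: hostA -> hostB, protocol 4 (IPIP)  <- identical for every flow
  inner IP: vethX -> vethY                   <- per-container addresses
    TCP: sport -> dport                      <- per-flow ports

A classifier that stops at the outer header therefore hashes all tunnelled traffic into a single flow, which matches the results above.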

I would appreciate any help you can provide based on your expertise. Thanks.

Regards,
Cong

_______________________________________________
Cake mailing list
Cake@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/cake