<div dir="auto">I am so happy to see this patch land finally!</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">---------- Forwarded message ---------<br>From: <strong class="gmail_sendername" dir="auto">Jason Xing</strong> <span dir="auto"><<a href="mailto:kerneljasonxing@gmail.com">kerneljasonxing@gmail.com</a>></span><br>Date: Mon, Jun 17, 2024, 8:48 AM<br>Subject: Re: [PATCH net-next v2] virtio_net: add support for Byte Queue Limits<br>To: Jiri Pirko <<a href="mailto:jiri@resnulli.us">jiri@resnulli.us</a>><br>Cc: <<a href="mailto:netdev@vger.kernel.org">netdev@vger.kernel.org</a>>, <<a href="mailto:davem@davemloft.net">davem@davemloft.net</a>>, <<a href="mailto:edumazet@google.com">edumazet@google.com</a>>, <<a href="mailto:kuba@kernel.org">kuba@kernel.org</a>>, <<a href="mailto:pabeni@redhat.com">pabeni@redhat.com</a>>, <<a href="mailto:mst@redhat.com">mst@redhat.com</a>>, <<a href="mailto:jasowang@redhat.com">jasowang@redhat.com</a>>, <<a href="mailto:xuanzhuo@linux.alibaba.com">xuanzhuo@linux.alibaba.com</a>>, <<a href="mailto:virtualization@lists.linux.dev">virtualization@lists.linux.dev</a>>, <<a href="mailto:ast@kernel.org">ast@kernel.org</a>>, <<a href="mailto:daniel@iogearbox.net">daniel@iogearbox.net</a>>, <<a href="mailto:hawk@kernel.org">hawk@kernel.org</a>>, <<a href="mailto:john.fastabend@gmail.com">john.fastabend@gmail.com</a>>, <<a href="mailto:dave.taht@gmail.com">dave.taht@gmail.com</a>>, <<a href="mailto:hengqi@linux.alibaba.com">hengqi@linux.alibaba.com</a>><br></div><br><br>On Mon, Jun 17, 2024 at 5:15 PM Jiri Pirko <<a href="mailto:jiri@resnulli.us" target="_blank" rel="noreferrer">jiri@resnulli.us</a>> wrote:<br>
>
> Fri, Jun 14, 2024 at 11:54:04AM CEST, kerneljasonxing@gmail.com wrote:
> >Hello Jiri,
> >
> >On Thu, Jun 13, 2024 at 1:08 AM Jiri Pirko <jiri@resnulli.us> wrote:
> >>
> >> From: Jiri Pirko <jiri@nvidia.com>
> >>
> >> Add support for Byte Queue Limits (BQL).
> >>
> >> Tested on a qemu-emulated virtio_net device with 1, 2 and 4 queues.
> >> Tested with fq_codel and pfifo_fast. Super netperf with 50 threads is
> >> running in the background. Netperf TCP_RR results:
> >>
> >> NOBQL FQC 1q: 159.56 159.33 158.50 154.31 avg: 157.925
> >> NOBQL FQC 2q: 184.64 184.96 174.73 174.15 avg: 179.62
> >> NOBQL FQC 4q: 994.46 441.96 416.50 499.56 avg: 588.12
> >> NOBQL PFF 1q: 148.68 148.92 145.95 149.48 avg: 148.2575
> >> NOBQL PFF 2q: 171.86 171.20 170.42 169.42 avg: 170.725
> >> NOBQL PFF 4q: 1505.23 1137.23 2488.70 3507.99 avg: 2159.7875
> >> BQL FQC 1q: 1332.80 1297.97 1351.41 1147.57 avg: 1282.4375
> >> BQL FQC 2q: 768.30 817.72 864.43 974.40 avg: 856.2125
> >> BQL FQC 4q: 945.66 942.68 878.51 822.82 avg: 897.4175
> >> BQL PFF 1q: 149.69 151.49 149.40 147.47 avg: 149.5125
> >> BQL PFF 2q: 2059.32 798.74 1844.12 381.80 avg: 1270.995
> >> BQL PFF 4q: 1871.98 4420.02 4916.59 13268.16 avg: 6119.1875
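
For reference, the fq_codel/pfifo_fast ("FQC"/"PFF") and 1/2/4-queue
combinations above can be set up roughly as follows; the interface name
eth0 is an assumption, not taken from the patch:

DEV=eth0
ethtool -L "$DEV" combined 4                 # 1, 2 or 4 TX/RX queue pairs
tc qdisc replace dev "$DEV" root fq_codel    # "FQC" rows
# For the "PFF" rows, return to the default root (mq with pfifo_fast per
# TX queue):
# tc qdisc del dev "$DEV" root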
> >
> >I couldn't reproduce such a huge improvement when running multiple tests
> >between two VMs. I'm pretty sure the BQL feature is working, but the
> >numbers look the same with and without BQL.
> >
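One way to sanity-check that the BQL accounting is really engaging is to
watch the per-queue byte_queue_limits counters while traffic is running:
with the driver hooks in place, "inflight" goes non-zero under load and
"limit" adapts, whereas without them both stay at 0. A minimal sketch,
assuming the interface is named eth0:

DEV=eth0
for q in /sys/class/net/$DEV/queues/tx-*/byte_queue_limits; do
    echo "== $q =="
    grep . $q/inflight $q/limit $q/limit_min $q/limit_max
done
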
> >VM 1 (client):
> >16 cpus, x86_64, 4 queues, the latest net-next kernel with/without
> >this patch, pfifo_fast, napi_tx=true, napi_weight=128
> >
> >VM 2 (server):
> >16 cpus, aarch64, 4 queues, the latest net-next kernel without this
> >patch, pfifo_fast
> >
> >What the 'ping' command shows between the two VMs is: rtt
> >min/avg/max/mdev = 0.233/0.257/0.300/0.024 ms
> >
> >I started 50 netperf instances to talk to the other side with the following command:
> >#!/bin/bash
> >
> ># background TCP_RR runs, one netserver control port per instance
> >for i in $(seq 5000 5050);
> >do
> >    netperf -p $i -H [ip addr] -l 60 -t TCP_RR -- -r 64,64 > /dev/null 2>&1 &
> >done
> >
> >The results are around 30423.62 txkB/s. If I remove '-r 64,64', the
> >results are still much the same.
> >
> You have to stress the line with parallel TCP_STREAM instances (50 in my
> case). For consistent results, use -p portnum,locport to specify the
> local port.
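
A sketch of that setup (the server address and port numbers below are
illustrative, not from this thread): saturate the link with background
TCP_STREAM instances, then measure with TCP_RR using a fixed local
control port so repeated runs are comparable:

SERVER=192.168.0.2
# 50 parallel TCP_STREAM instances to keep the TX queues full.
for i in $(seq 1 50); do
    netperf -H $SERVER -t TCP_STREAM -l 60 > /dev/null 2>&1 &
done
# TCP_RR with a pinned local control port (-p netserver_port,local_port).
netperf -H $SERVER -p 12865,50001 -t TCP_RR -l 60 -- -r 64,64
wait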

Thanks. Even though the individual TCP_RR results still vary from run to
run, the total across all instances shows a big improvement under these
conditions:
With BQL, the throughput is 2159.17
Without BQL, it's 1099.33

Please feel free to add the tag:
Tested-by: Jason Xing <kerneljasonxing@gmail.com>

Thanks,
Jason
</div>