From: "David P. Reed" <dpreed@reed.com>
Date: Sat, 11 Oct 2014 00:20:43 -0400
To: David Lang <david@lang.hm>
Cc: cerowrt-devel@lists.bufferbloat.net, Jesper Dangaard Brouer
Subject: Re: [Cerowrt-devel] bulk packet transmission

I do know that. I would say that benchmarks rarely match the real-world
problems of real systems - they come from sources like academia and
technical marketing departments. My job for the last few years has been
looking at systems with dozens of processors across 2 and 4 sockets and
multiple 10 GigE adapters.

There are few benchmarks that look like real workloads. And even smaller
systems do very poorly compared to what is possible. Linux is slowly
getting better, but not so much in the network area at scale. That would
take a plan and a rethinking, beyond incremental tweaks. My opinion ...
ymmv.

On Oct 10, 2014, David Lang <david@lang.hm> wrote:
> I've been watching Linux kernel development for a long time, and they add
> locks only when benchmarks show that a lock is causing a bottleneck. They
> don't just add them because they can.
>
> They do also spend a lot of time working to avoid locks.
>
> One thing that you are missing is that you are thinking of the TCP/IP
> system as a single thread of execution, but there's far more going on than
> that, especially when you have multiple NICs and cores and have lots of
> interrupts going on.
>
> Each TCP/IP stream is not a separate queue of packets in the kernel;
> instead, the details of what threads exist are just a table of
> information. The packets are all put in a small number of queues to be
> sent out, and the low-level driver picks the next packet to send from
> these queues without caring about what TCP/IP stream it's from.
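(To make that queue structure concrete: a minimal, made-up sketch in C of
the shape being described - not actual kernel code. Every stream's send
path lands in the same FIFO, and the driver pops whatever is first without
ever learning which stream it came from.)

    struct pkt {
        struct pkt *next;
        /* headers, payload, ... */
    };

    struct txq {                  /* init: head = NULL; tailp = &head */
        struct pkt *head;
        struct pkt **tailp;       /* points at the last 'next' slot */
    };

    /* Called from any stream's send path; stream identity is not kept. */
    static void txq_enqueue(struct txq *q, struct pkt *p)
    {
        p->next = NULL;
        *q->tailp = p;
        q->tailp = &p->next;
    }

    /* The low-level driver just takes whatever is first in line. */
    static struct pkt *txq_dequeue(struct txq *q)
    {
        struct pkt *p = q->head;

        if (p && !(q->head = p->next))
            q->tailp = &q->head;  /* queue went empty; reset the tail */
        return p;
    }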
> David Lang
>
> On Fri, 10 Oct 2014, dpreed@reed.com wrote:
>
>> The best approach to dealing with "locking overhead" is to stop thinking
>> that if locks are good, more locking (finer-grained locking) is better.
>> OS designers (and Linux designers in particular) are still putting in way
>> too much locking. I deal with this in my day job (we support systems with
>> very large numbers of CPUs, and because of the "fine grained" locking
>> obsession, the parallelized capacity is limited). If you do a thoughtful
>> design of your network code, you don't need lots of locking - because
>> TCP/IP streams don't have to interact much - they are quite independent.
>> But instead OS designers spend all their time thinking about doing "one
>> thing at a time".
>>
>> There are some really good ideas out there (e.g. RCU), but you have to
>> think about the big picture of networking to understand how to use them.
>> I'm not impressed with the folks who do the Linux networking stacks.
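(A sketch of what RCU buys in this picture, with invented names and in
kernel-style C - not from any real stack. Lookups on the packet hot path
take no lock at all, so independent streams never contend; only the rare
writers serialize among themselves:)

    #include <linux/rcupdate.h>
    #include <linux/spinlock.h>
    #include <linux/types.h>

    struct conn {
        u32 id;
        struct conn __rcu *next;
    };

    static struct conn __rcu *conn_table[256]; /* hypothetical hash table */
    static DEFINE_SPINLOCK(conn_write_lock);   /* taken by writers only */

    /* Hot path: no lock, no write to any shared cacheline, so it scales
     * with core count. The entry may only be used inside the read-side
     * critical section. */
    static bool conn_exists(u32 id)
    {
        struct conn *c;
        bool found = false;

        rcu_read_lock();
        for (c = rcu_dereference(conn_table[id & 255]); c;
             c = rcu_dereference(c->next)) {
            if (c->id == id) {
                found = true;
                break;
            }
        }
        rcu_read_unlock();
        return found;
    }

    /* Rare path: writers serialize with each other, never with readers. */
    static void conn_insert(struct conn *c)
    {
        spin_lock(&conn_write_lock);
        /* c is not yet visible to readers; a plain write is fine here */
        c->next = conn_table[c->id & 255];
        rcu_assign_pointer(conn_table[c->id & 255], c);
        spin_unlock(&conn_write_lock);
    }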
>> On Thursday, October 9, 2014 3:48pm, "Dave Taht" <dave.taht@gmail.com> said:
>>
>>> I have some hope that the skb->xmit_more API could be used to make
>>> aggregating packets in wifi on an AP saner. (My vision for it was that
>>> the overlying qdisc would set xmit_more while it still had packets
>>> queued up for a given station, and then stop and switch to the next.
>>> But the rest of the infrastructure ended up pretty closely tied to
>>> BQL....)
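(The driver half of that API is tiny - roughly the following, where the
mydrv_* names are made up and stand in for a real driver's ring handling.
Posting a descriptor is a cheap memory write; skb->xmit_more just says
"more is coming", so the expensive doorbell MMIO waits for the end of the
burst:)

    static netdev_tx_t mydrv_start_xmit(struct sk_buff *skb,
                                        struct net_device *dev)
    {
        struct mydrv_ring *ring = netdev_priv(dev); /* hypothetical ring */

        mydrv_post_descriptor(ring, skb);  /* cheap: write into DMA ring */

        /* Ring the (expensive, serializing) doorbell only when the stack
         * says the burst is over, or the ring is about to fill up. */
        if (!skb->xmit_more || mydrv_ring_nearly_full(ring))
            mydrv_ring_doorbell(ring);

        return NETDEV_TX_OK;
    }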
>>>
>>> Jesper just wrote a nice piece about it also:
>>> http://netoptimizer.blogspot.com/2014/10/unlocked-10gbps-tx-wirespeed-smallest.html
>>>
>>> It was nice to fool around at 10GigE for a while! And netperf-wrapper
>>> scales to this speed also! :wow:
>>>
>>> I do worry that once sch_fq and fq_codel support is added there will be
>>> side effects. I would really like - now that there are all these people
>>> profiling things at this level - to see profiles including those qdiscs.
>>>
>>> /me goes grumbling back to thinking about wifi.
>>>
>>> On Thu, Oct 9, 2014 at 12:40 PM, David Lang <david@lang.hm> wrote:
>>> > lwn.net has an article about a set of new patches that avoid some
>>> > locking overhead by transmitting multiple packets at once.
>>> >
>>> > It doesn't work for things with multiple queues (like fq_codel) in its
>>> > current iteration, but it sounds like something that should be looked
>>> > at and watched for latency-related issues.
>>> >
>>> > http://lwn.net/Articles/615238/
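(On the stack side, those patches amount to roughly this shape - again
with invented helper names standing in for the real qdisc hooks: pull a
small burst off the queue, and mark every packet but the last with
xmit_more so the driver above holds its doorbell until the burst is done:)

    /* my_qdisc_dequeue()/my_qdisc_peek() are hypothetical stand-ins for
     * the real qdisc entry points; return-value handling is omitted. */
    static void qdisc_xmit_burst(struct Qdisc *q, struct net_device *dev,
                                 int budget)
    {
        struct sk_buff *skb;

        while (budget-- > 0 && (skb = my_qdisc_dequeue(q)) != NULL) {
            /* more left in this burst? let the driver defer its doorbell */
            skb->xmit_more = (budget > 0 && my_qdisc_peek(q) != NULL);
            dev->netdev_ops->ndo_start_xmit(skb, dev);
        }
    }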
>>> >
>>> > David Lang
>>> > _______________________________________________
>>> > Cerowrt-devel mailing list
>>> > Cerowrt-devel@lists.bufferbloat.net
>>> > https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>>
>>> --
>>> Dave Täht
>>>
>>> https://www.bufferbloat.net/projects/make-wifi-fast
>>> _______________________________________________
>>> Cerowrt-devel mailing list
>>> Cerowrt-devel@lists.bufferbloat.net
>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel

--
Sent from my Android device with K-@ Mail. Please excuse my brevity.