From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
Sender: aenertia@aenertia.net
References: <87ppfijfjc.fsf@toke.dk> <4FF4917C-1B6D-4D5F-81B6-5FC177F12BFC@gmail.com>
 <4DA71387-6720-4A2F-B462-2E1295604C21@gmail.com> <0DB9E121-7073-4DE9-B7E2-73A41BCBA1D1@gmail.com>
From: Joel Wirāmu Pauling
Date: Tue, 2 Sep 2014 22:05:21 +1200
To: Jonathan Morton
Content-Type: text/plain; charset=UTF-8
Cc: "cerowrt-devel@lists.bufferbloat.net", bloat
Subject: Re: [Cerowrt-devel] [Bloat] Comcast upped service levels -> WNDR3800 can't cope...
List-Id: Development issues regarding the cerowrt test router project

On a somewhat related note - I've just received my NZ/AU Region Almond+ which is an arm9 Dual core router based on the Cortina CSC SoC :

https://www.cortina-systems.com/product/digital-home-processors/16-products/996-cs7542-cs7522

More details :

On 2 September 2014 21:27, Jonathan Morton wrote:
>
> On 2 Sep, 2014, at 1:14 am, Aaron Wood wrote:
>
>>> For the purposes of shaping, the CPU shouldn't need to touch the majority of the payload - only the headers, which are relatively small. The bulk of the payload should DMA from one NIC to RAM, then DMA back out of RAM to the other NIC. It has to do that anyway to route them, and without shaping there'd be more of them to handle. The difference might be in the data structures used by the shaper itself, but I think those are also reasonably compact. It doesn't even have to touch userspace, since it's not acting as the endpoint as my PowerBook was during my tests.
>>
>> In an ideal case, yes. But is that how this gets managed? (I have no idea, I'm certainly not a kernel developer).
>
> It would be monumentally stupid to integrate two GigE MACs onto an SoC, and then to call it a "network processor", without adequate DMA support.
> I don't think Atheros are that stupid.
>
> Here's a more detailed datasheet:
> http://pdf.datasheetarchive.com/indexerfiles/Datasheets-SW6/DSASW00118777.pdf
>
> "Another memory factor is the ability to support multiple I/O operations in parallel via the WNPU's various ports. The on-chip SRAM in AR7100 WNPUs has 5 ports that enable simultaneous access to and from five sources: the two gigabit Ethernet ports, the PCI port, the USB 2.0 port and the MIPS processor."
>
> It's a reasonable question, however, whether the driver uses that support properly. Mainline Linux kernel code seems to support the SoC but not the Ethernet; if it were just a minor variant of some other Atheros hardware, I'd have expected to see it integrated into one of the existing drivers. Or maybe it is, and my greps just aren't showing it.
>
> At minimum, however, there are MMIO ranges reported for each MAC during OpenWRT's boot sequence. That's where the ring buffers are. The most the CPU has to do is read each packet from RAM and write it into those buffers, or vice versa for receive - I think that's what my PowerBook has to do. Ideally, a bog-standard DMA engine would take over that simple duty. Either way, that's something that has to happen whether it's shaped or not, so it's unlikely to be our problem.
>
> The same goes for the wireless MACs, incidentally. These are standard ath9k mini-PCI cards, and the drivers *are* in mainline. There shouldn't be any surprises with them.
>
>> If the packet data is getting moved about from buffer to buffer (for instance to do the htb calculations?) could that substantially change the processing load?
>
> The qdiscs only deal with packet and socket headers, not the full packet data. Even then, they largely pass pointers around, inserting the headers into linked lists rather than copying them into arrays.
> I believe a lot of attention has been directed at cache-friendliness in this area, and the MIPS caches are of conventional type.
>
>>> Which brings me back to the timers, and other items of black magic.
>>
>> Which would point to under-utilizing the processor core, while still having high load? (I'm not seeing that, I'm curious if that would be the case).
>
> It probably wouldn't manifest as high system load. Rather, poor timer resolution or latency would show up as excessive delays between packets, during which the CPU is idle. The packet egress times may turn out to be quantised - that would be a smoking gun, if detectable.
>
>>> Incidentally, transfer speed benchmarks involving wireless will certainly be limited by the wireless link. I assume that's not a factor here.
>>
>> That's the usual suspicion. But these are RF-chamber, short-range lab setups where the radios are running at full speed in perfect environments...
>
> Sure. But even turbocharged 'n' gear tops out at 450Mbps signalling, and much less than that is available even theoretically for TCP/IP throughput. My point is that you're probably not running *your* tests over wireless.
>
>> What this makes me realize is that I should go instrument the cpu stats with each of the various operating modes:
>>
>> * no shaping, anywhere
>> * egress shaping
>> * egress and ingress shaping at various limited levels:
>>     * 10Mbps
>>     * 20Mbps
>>     * 50Mbps
>>     * 100Mbps
>
> Smaller increments at the high end of the range may prove to be useful. I would expect the CPU usage to climb nonlinearly (busy-waiting) if there's a bottleneck in a peripheral device, such as the PCI bus. The way the kernel classifies that usage may also be revealing.
>
>> Heck, what about running HTB simply from a 1ms timer instead of from a data driven timer?
>
> That might be what's already happening. We have to figure out that before we can work out a solution.
>
> - Jonathan Morton
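
An aside on the "smoking gun" mentioned in the quoted text: if packet egress times are quantised by a coarse timer, the gaps between consecutive packets should cluster at whole multiples of the timer tick. Below is a minimal Python sketch of that check. It is an illustration only, not something that was run in this thread: the function names, the microsecond units, the 1 ms candidate tick and the 5% tolerance are all assumptions, and the timestamps would have to come from a capture taken on the wire, however that is obtained.

```python
def inter_packet_gaps(timestamps_us):
    """Gaps between consecutive egress timestamps (integer microseconds)."""
    return [b - a for a, b in zip(timestamps_us, timestamps_us[1:])]


def fraction_on_tick(gaps_us, tick_us, tolerance=0.05):
    """Fraction of gaps within tolerance*tick of a non-zero multiple of tick.

    tick_us is a hypothetical candidate timer period, e.g. 1000 for 1 ms.
    """
    if not gaps_us:
        return 0.0
    hits = 0
    for gap in gaps_us:
        multiple = round(gap / tick_us)
        # Gaps much shorter than one tick round to zero and never count.
        if multiple >= 1 and abs(gap - multiple * tick_us) <= tolerance * tick_us:
            hits += 1
    return hits / len(gaps_us)


if __name__ == "__main__":
    # Made-up data: packets released only on 1 ms boundaries.
    quantised = [0, 1000, 3000, 4000, 6000]
    # Made-up data: evenly paced packets, 120 us apart.
    smooth = [i * 120 for i in range(50)]

    print(fraction_on_tick(inter_packet_gaps(quantised), 1000))  # 1.0
    print(fraction_on_tick(inter_packet_gaps(smooth), 1000))     # 0.0
```

With a 5% tolerance, gaps spread at random would still land "on tick" roughly a tenth of the time, so only a score well above that would hint at timer quantisation, and it would be worth trying more than one candidate tick.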