Lets make wifi fast again!
* [Make-wifi-fast] Fwd: Wifi Memory limits in small platforms
       [not found]                                           ` <CAA93jw4=13D-+WHLYPiV4NPqeVJwrLJe=nkr+a9D9Cqvq49pEQ@mail.gmail.com>
@ 2019-08-22 13:22                                             ` Dave Taht
  2019-08-22 14:59                                             ` [Make-wifi-fast] " Dave Taht
       [not found]                                             ` <dcb92eaf-928e-f909-981d-c2baf74fbc90@newmedia-net.de>
  2 siblings, 0 replies; 13+ messages in thread
From: Dave Taht @ 2019-08-22 13:22 UTC (permalink / raw)
  To: Make-Wifi-fast

---------- Forwarded message ---------
From: Dave Taht <dave.taht@gmail.com>
Date: Thu, Aug 22, 2019 at 6:15 AM
Subject: Wifi Memory limits in small platforms
To: Sebastian Gottschall <s.gottschall@newmedia-net.de>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>, Dave Taht
<dave@taht.net>, Cake List <cake@lists.bufferbloat.net>, Battle of the
Mesh Mailing List <battlemesh@ml.ninux.org>


It's very good to know how much folk have been struggling to keep
things from OOMing on 32MB platforms. I'd like to hope that the
unified memory management in cake (vs a collection of QoS qdiscs) and
the new fq_codel for wifi stuff (cutting it down to 1 alloc from four)
help massively on this issue, but until today I was unaware of how
much the field may have been patching things out.

The default 32MB memory limit in fq_codel comes from Google's
concerns about 10GigE networking. 4MB is the limit in openwrt,
which is suitable for ~1Gbit, and is sort of there due to 802.11ac's
maximum (impossible to hit) of a txop that large.

Something as small as 256K is essentially about 128 full size packets
(and often, acks from an ethernet device's rx ring eat 2k).

The structure of the new fq_codel for wifi subsystem is "one in the
hardware, one ready to go, and the rest accumulating". I
typically see about 13-20 packets in an aggregate. 256k strikes me as
a bit small.
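
To make the arithmetic above concrete, here is a minimal sketch in C. It
assumes every queued packet is charged roughly 2 KiB of buffer memory
(the 2k figure mentioned for acks). The function names and the
ASSUMED_TRUESIZE_BYTES constant are made up for illustration and are not
part of mac80211 or fq_codel.

```c
/* Assumption: each queued packet is charged about 2 KiB of buffer
 * memory (its "truesize"), the figure quoted in the message. */
#define ASSUMED_TRUESIZE_BYTES 2048u

/* How many packets fit under a byte limit at the assumed truesize. */
static unsigned int packets_under_limit(unsigned int limit_bytes)
{
	return limit_bytes / ASSUMED_TRUESIZE_BYTES;
}

/* How many complete aggregates those packets can form. */
static unsigned int aggregates_under_limit(unsigned int limit_bytes,
					   unsigned int pkts_per_aggregate)
{
	if (pkts_per_aggregate == 0)
		return 0;
	return packets_under_limit(limit_bytes) / pkts_per_aggregate;
}
```

Under these assumptions a 256K limit holds 128 packets, which is about 6
aggregates of 20 packets (or 9 of 13), and that budget has to cover "one
in the hardware, one ready to go" for every station.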

I haven't checked, but does this patch still exist in openwrt/dd-wrt?
It had helped a lot when under memory pressure from
a lot of small packets.

https://github.com/dtaht/cerowrt-3.10/blob/master/target/linux/generic/patches-3.10/657-qdisc_reduce_truesize.patch

Arguably this could be made more aggressive, but it massively reduced
memory burdens at the time I did it when
flooding the device, or having lots of acks, and while it cost cpu it
saved on ooming.

There are two other dubious things in the fq_codel for wifi stack
presently. Right now the codel target is set too high for p2p use
(20ms, where 6ms seems more right), and it also flips up to a really
high target and interval AND turns off ecn when there are more than a
few stations available (rather than "active"). That's an overly
conservative figure we used back when we had major issues with
powersave and multicast. I'd hoped we could cut it back to normal after
we got another round of research funding and feedback from the field
(which didn't happen, and we never got around to making it configurable,
and being 25x better than it was before seemed "enough").

I was puzzled at battlemesh as to why I had dropping at about 50ms
delay rather than ecn, and thought it was something
else, and this morning I'm thinking that folk have been reducing the
memlimit to 256k rather....


-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740


* Re: [Make-wifi-fast] Wifi Memory limits in small platforms
       [not found]                                           ` <CAA93jw4=13D-+WHLYPiV4NPqeVJwrLJe=nkr+a9D9Cqvq49pEQ@mail.gmail.com>
  2019-08-22 13:22                                             ` [Make-wifi-fast] Fwd: Wifi Memory limits in small platforms Dave Taht
@ 2019-08-22 14:59                                             ` Dave Taht
       [not found]                                             ` <dcb92eaf-928e-f909-981d-c2baf74fbc90@newmedia-net.de>
  2 siblings, 0 replies; 13+ messages in thread
From: Dave Taht @ 2019-08-22 14:59 UTC (permalink / raw)
  To: Sebastian Gottschall
  Cc: Toke Høiland-Jørgensen, Dave Taht, Cake List, Make-Wifi-fast

People have a tendency to try and construct very complicated QoS
systems and then try to run them in limited memory. We see a lot of 6
or 7 class hfsc + sfq or fq_codel systems, which can accumulate
hundreds or thousands of packets before doing a drop. A stress test
like ping -f -s 1500 and ping -f -s 64 hitting every defined queue
(somehow) should be run to make sure you don't OOM, and even
then, things like acks coming off an ethernet eat 2k when they are
only 64 bytes in size.
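
As a rough illustration of why small packets hurt so much, the sketch
below estimates how much memory a backlog pins down. The 2048-byte
truesize is the "2k" figure from the text; the 1000-packet backlog per
class is only an assumption for the example, and the function names are
hypothetical.

```c
/* Bytes of buffer memory pinned by a backlog, given the per-packet
 * truesize that is actually charged for each queued packet. */
static unsigned long backlog_memory(unsigned long npackets,
				    unsigned long truesize)
{
	return npackets * truesize;
}

/* How many times more memory a packet costs than the bytes it carries. */
static unsigned long overhead_factor(unsigned long truesize,
				     unsigned long wire_len)
{
	if (wire_len == 0)
		return 0;
	return truesize / wire_len;
}
```

With these numbers a 64-byte ack costs 32 times its size, and seven
classes each holding 1000 such packets would pin about 14 MB, which is a
large share of a 32MB box.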

This patch (not even compile tested) might take some of the memory
pressure off when being flooded in this way. While
it costs cpu, given a choice between ooming and slowing down, I'd
rather slow down. I should have done tbf too.

I wonder if the large number of queues we've seen people try to use
with hfsc has been one of the sources of
frequent reports of flakiness? How many queues do people actually try
to create with it in the field... Certainly sfq's default of 127
packets is better than 1000...

Reducing the truesize could be added to cake and tbf also, as well as
to the fq_codel for wifi code, on small platforms.

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 433f2190960f..c0777ce4a259 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -1535,7 +1535,8 @@ hfsc_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free)
        struct hfsc_class *cl;
        int uninitialized_var(err);
        bool first;
-
+       if (sch->q.qlen > 128)
+               skb = skb_reduce_truesize(skb);
        cl = hfsc_classify(skb, sch, &err);
        if (cl == NULL) {
                if (err & __NET_XMIT_BYPASS)
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 7bcf20ef9145..40a27392f88e 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -584,6 +584,9 @@ static int htb_enqueue(struct sk_buff *skb, struct Qdisc *sch,
        struct htb_sched *q = qdisc_priv(sch);
        struct htb_class *cl = htb_classify(skb, sch, &ret);

+       if (sch->q.qlen > 128)
+               skb = skb_reduce_truesize(skb);
+
        if (cl == HTB_DIRECT) {
                /* enqueue to helper queue */
                if (q->direct_queue.qlen < q->direct_qlen) {
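
The helper skb_reduce_truesize() is not shown in this thread (it comes
from the linked cerowrt patch), so the following is only a userspace
model of the general idea: when a buffer is much larger than the data
it holds, copy the data into a tight allocation and free the big one,
trading cpu for memory. The struct, the function name and the 2x
threshold are all assumptions made for this sketch, not kernel code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a packet buffer; not the kernel's sk_buff. */
struct model_pkt {
	unsigned char *data;	/* heap buffer holding the packet */
	size_t len;		/* bytes of packet data actually used */
	size_t truesize;	/* bytes allocated for the buffer */
};

/*
 * If the buffer is more than twice as large as the data it carries,
 * copy the data into a tight allocation and free the oversized one.
 * On allocation failure the packet is returned unchanged.
 */
static struct model_pkt *model_reduce_truesize(struct model_pkt *pkt)
{
	unsigned char *tight;

	if (pkt->len == 0 || pkt->truesize <= 2 * pkt->len)
		return pkt;

	tight = malloc(pkt->len);
	if (!tight)
		return pkt;	/* keep the oversized buffer rather than drop */

	memcpy(tight, pkt->data, pkt->len);
	free(pkt->data);
	pkt->data = tight;
	pkt->truesize = pkt->len;
	return pkt;
}
```

In this model a 64-byte ack sitting in a 2048-byte buffer shrinks to 64
bytes, while a 1500-byte packet in the same buffer is left alone because
the copy would save too little to be worth the cpu.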




-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740


* Re: [Make-wifi-fast] Wifi Memory limits in small platforms
       [not found]                                             ` <dcb92eaf-928e-f909-981d-c2baf74fbc90@newmedia-net.de>
@ 2019-08-22 17:03                                               ` Dave Taht
  2019-08-22 17:37                                                 ` Sebastian Gottschall
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Taht @ 2019-08-22 17:03 UTC (permalink / raw)
  To: Sebastian Gottschall
  Cc: Dave Taht, Toke Høiland-Jørgensen, Cake List,
	Battle of the Mesh Mailing List, make-wifi-fast

Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:

> Am 22.08.2019 um 15:15 schrieb Dave Taht:
>> It's very good to know how much folk have been struggling to keep
>> things from OOMing on 32MB platforms. I'd like to hope that the
>> unified memory management in cake (vs a collection of QoS qdiscs) and
>> the new fq_codel for wifi stuff (cutting it down to 1 alloc from four)
>> help, massively on this issue, but until today I was unaware of how
>> much the field may have been patching things out.
>>
>> The default 32MB memory limits in fq_codel comes from the stressing
>> about 10GigE networking from google. 4MB is limit in openwrt,
>> which is suitable for ~1Gbit, and is sort of there  due to 802.11ac's
>> maximum (impossible to hit) of a txop that large.

I did kind of conflate "qos + fq_codel" vs wifi in this message. It
looks like yer staying with me. 

>> Something as small as 256K is essentially about 128 full size packets
>> (and often, acks from an ethernet device's rx ring eat 2k).
>
> What I miss in mac80211 is the following option: "fq_codel = off".
> It's essential, and I will definitely work on a patch to deal with this
> for low-memory 802.11n platforms.

Well, it would be my hope that turning it off would A) not help that
much on memory or cpu and B) show such a dramatic reduction in
multi-station performance that you'd immediately turn it on again.

I try to encourage folk to run the rtt_fair tests in flent when
twiddling with wifi. Those really show how bad things are when you
don't have ATF + FQ + Per station aggregation and lots of
clients. Single threaded tests are misleading.

I gave a good demo of why this is (was!), here: https://www.youtube.com/watch?v=Rb-UnHDw02o&t=1551s

and there's more in the "Ending the Anomaly" paper. Perversely though,
now that we can do 25x latency reductions and 2.5x more throughput,
more memory is needed to achieve those goals in some cases, which
is part of my concern about chopping things down to 256k here.

>
>>
>> The structure of the new fq_codel for wifi subsystem is "one in the
>> hardware, one ready to go, and the rest accumulating". I
>> typically see about 13-20 packets in an aggregate. 256k strikes me as
>> a bit small.
> According to the rules, 256k is used for HT only, and if VHT is involved
> the limit of 4 MB is used.
> But now comes the point: all 802.11ac platforms have 64 MB of RAM or
> more, but ath10k chipsets use
> about 40 MB of shared memory. So, hmm, we are hitting the wall
> again. Most 802.11ac routers have 128 MB, but some (notably
> D-Link) have just 64 MB.

Ugh.

Is it just the mips boxes with so little ram? All the arm routers I have
have at least 128, some as much as 512.

Yes, having a wifi chip that can theoretically have 4MB in transit
with so little ram is problematic.

Dear dlink: don't do that. It hurts when you do that.

>>
>> I haven't checked, but does this patch still exist in openwrt/dd-wrt?
>> It had helped a lot when under memory pressure from
>> a lot of small packets.
>>
>> https://github.com/dtaht/cerowrt-3.10/blob/master/target/linux/generic/patches-3.10/657-qdisc_reduce_truesize.patch
>>
>> Arguably this could be made more aggressive, but it massively reduced
>> memory burdens at the time I did it when
>> flooding the device, or having lots of acks, and while it cost cpu it
>> saved on ooming.
> Hmm, let me check -> nope, it's at least not in my tree. But it will be soon :-)

Well, I sent along a mildly improved version of the idea.

I can really see some sort of "test my qos" script that attempts
to flood every queue on the system. And wider adoption of
cake which is lighter weight than the alternatives.

One idea in cake was that we'd hoped to capture the most typical
qos setups with "models". It's very easy to add a new model
(besteffort, diffserv3, diffserv4) (it's a lookup table and bandwidth
allocation call), but lacking feedback on more typical QoS constructs
from the field, that's where it ended. When we started the project,
I figured we'd end up with 20+ models before the end.

It would be good to get a tc class dump or output from more typical
QoS Setups.

In sqm and cake...
we have a terrible tendency to tell people "no, just use the defaults!
they work! trust us!"... 

who generally don't believe us and want to keep doing things the
way they always have.

In more than a few circumstances they are right, but we don't understand
what they are trying to do.

As one case that cake doesn't handle, at least some iptv setups are
visible as a strict priority queue over everything else, below which you
do everything else, so the tv stream never, ever, drops a packet.

We didn't do that, but could *easily* add an iptv model to shape
inbound better - if we knew more about how Free, FT, etc. construct
their packets.

Similarly some folk in this world want strict priority for EF.



* Re: [Make-wifi-fast] Wifi Memory limits in small platforms
  2019-08-22 17:03                                               ` Dave Taht
@ 2019-08-22 17:37                                                 ` Sebastian Gottschall
  2019-08-22 18:23                                                   ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 13+ messages in thread
From: Sebastian Gottschall @ 2019-08-22 17:37 UTC (permalink / raw)
  To: Dave Taht
  Cc: Dave Taht, Toke Høiland-Jørgensen, Cake List,
	Battle of the Mesh Mailing List, make-wifi-fast


Am 22.08.2019 um 19:03 schrieb Dave Taht:
> Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:
>
>> Am 22.08.2019 um 15:15 schrieb Dave Taht:
>>> It's very good to know how much folk have been struggling to keep
>>> things from OOMing on 32MB platforms. I'd like to hope that the
>>> unified memory management in cake (vs a collection of QoS qdiscs) and
>>> the new fq_codel for wifi stuff (cutting it down to 1 alloc from four)
>>> help, massively on this issue, but until today I was unaware of how
>>> much the field may have been patching things out.
>>>
>>> The default 32MB memory limits in fq_codel comes from the stressing
>>> about 10GigE networking from google. 4MB is limit in openwrt,
>>> which is suitable for ~1Gbit, and is sort of there  due to 802.11ac's
>>> maximum (impossible to hit) of a txop that large.
> I did kind of conflate "qos + fq_codel" vs wifi in this message. It
> looks like yer staying with me.
>
>>> Something as small as 256K is essentially about 128 full size packets
>>> (and often, acks from an ethernet device's rx ring eat 2k).
>> what i miss in mac80211 is the following option "fq_codel = off"
>> its essential and i will definitly work on a patch to deal with this
>> way for low memory 802.11n platforms.
> Well, it would be my hope that turning it off would A) not help that
> much on memory or cpu and B) show such a dramatic reduction in
> multi-station performance that you'd immediately turn it on again.
Isn't it better to have a working platform with less performance than a
crashing platform with no performance?
I mean, I can use older mac80211 versions without that issue on a
typical NanoStation 2/5, which is often used just as a CPE device.

But current mac80211 versions (current meaning the last 2-3 years) are
just unstable and run out of memory after a while.
The only thing that helped was cutting down the memory limit of fq_codel
inside mac80211.
I also have another fancy test unit, a Linksys WRT400 with 32 MB of
RAM and two ath9k-based wifi chipsets. There is no hope of it running
stably for even 5 minutes with a single connection under load (my crash
test runs an HDTV IPTV stream converted to unicast using a
stateless EoIP tunnel).

> I try to encourage folk to run the rtt_fair tests in flent when
> twiddling with wifi. Those really shows how bad things are when you
> don't have ATF + FQ + Per station aggregation and lots of
> clients. Single threaded tests are misleading.
I know, but even single-threaded tests don't work well on such
devices, so there is no need to talk about the benefits of ATF, fq_codel
etc. But there is a need to talk about making its use configurable,
which would also allow disabling it if required. If you just have a CPE
device with PPPoE running on it, which is common for WISPs,
there is no need for much fair queuing; that is a task for the
access point. Another typical use for devices like the NanoStation,
Rocket, Bullet etc. is simple point-to-point long-range links.
My assumption is that this is the main use for such high-gain
devices.
So we aren't talking about a typical cool and fancy AP. We are talking
about compatibility with low-end devices without running out of
resources. I'm a typical programmer from the 80s: keep it as small,
simple and resource-efficient as possible. These coding standards should
still be considered today, even if I don't write Tetris clones anymore
that run in 512-byte boot sectors using the MS-DOS built-in debug
assembler program.
>
> I gave a good demo of why this is (was!), here: https://www.youtube.com/watch?v=Rb-UnHDw02o&t=1551s
>
> and there's more in the ending the anomaly paper. Perversely though,
> now that we can do 25x latency reductions and 2.5x more throughput,
> more memory is needed to achieve those goals in some cases, which
> is part of my concern about chopping things down to 256k here.
>>> The structure of the new fq_codel for wifi subsystem is "one in the
>>> hardware, one ready to go, and the rest accumulating". I
>>> typically see about 13-20 packets in an aggregate. 256k strikes me as
>>> a bit small.
>> from the rules its that 256 is used for ht only and if vht is involved
>> the limit of 4mb is used.
>> but now comes the point. all 802.11ac platforms having 64mb ram or
>> more. but ath10k chipsets are using
>> about 40 mb of shared memory. so mmh we are hitting the wall
>> again. most routers have 128 mb with 802.11ac, but some (noticable
>> dlink) have just 64mb
> Ugh.
>
> Is it just the mips boxes with so little ram? All the arm routers I have
> have at least 128, some as much as 512.
You got it: all the MIPS routers. Most problematic are the TP-Link WR841
(and similar series) and UBNT devices, of course.
These are 802.11 devices but come with just 32 MB of RAM. There are
others too, of course, and I love to maintain all the older devices
for the community. Newer ARM-based devices we really don't need to
care about. Broadcom ARM CPUs come with chipsets which are either not
supported by linux/mac80211 at all
or, for newer chipsets using brcmfmac, just badly supported (but the
original Broadcom proprietary driver is unstable too, of course).
All other models based on QCA IPQ8064 etc. come with 256 MB
or more, so we really only need to take care of ath9k and ath10k
(soon maybe ath11k).
Everything else doesn't matter. The Linksys wrtXXXX series has a mac80211
driver, but Marvell stopped maintaining it at a point where it was still
shit and unstable, and it is mainly based on a binary firmware blob.


>
> Yes, having a wifi chip that can theoretically have 4MB in transit
> with so little ram is problematic.
>
> Dear dlink: don't do that. It hurts when you do that.
>
I talked a lot with D-Link about this issue, but D-Link's solution was
just switching to a cheaper MediaTek MIPS-based platform. Now we have
more RAM, but a featureless chipset.
Same for TP-Link.
>>> I haven't checked, but does this patch still exist in openwrt/dd-wrt?
>>> It had helped a lot when under memory pressure from
>>> a lot of small packets.
>>>
>>> https://github.com/dtaht/cerowrt-3.10/blob/master/target/linux/generic/patches-3.10/657-qdisc_reduce_truesize.patch
>>>
>>> Arguably this could be made more aggressive, but it massively reduced
>>> memory burdens at the time I did it when
>>> flooding the device, or having lots of acks, and while it cost cpu it
>>> saved on ooming.
>> mmh let me check -> nope its at least not in my tree. but will be soon :-)
> Well, I sent along a mildly improved version of the idea.
>
> I can really see some sort of "test my qos" script that attempts
> to flood every queue on the system. And wider adoption of
> cake which is lighter weight than the alterntives.
>
> one idea that's in cake was that: we'd hoped to capture the most typical
> qos setups with it with "models". It's very easy to add a new model
> (besteffort, diffserv3, diffserv4) (it's a lookup table and bandwidth
> allocation call), but lacking feedback on more typical QoS constructs
> from the field, that's where it ended. When we started the project,
> I figured we'd end up with 20+ models before the end.
>
> It would be good to get a tc class dump or output from more typical
> QoS Setups.
>
> In sqm and cake...
> we have a terrible tendency to tell people "no, just use the defaults!
> they work! trust us!"...
Yeah, I know that feeling. But I can never trust the users. They always
do what they think is good for them,
and everyone thinks he knows better since he read something on
Google / Reddit.
>
> who generally don't believe us and want to keep doing things the
> way they always have.
>
> In more than a few circumstances they are right, but we don't understand
> what they are trying to do.
>
> As one case that cake doesn't handle, at least some iptv setups are
> visible as a strict priority queue over everything else, below which you
> do everything else, so the tv stream never, ever, drops a packet.
As I mentioned before, my solution for IPTV is layer 2 tunneling to get
rid of multicast issues; it also converts everything to a single
connection.
I use an RFC-compliant Ethernet-over-IP tunnel for this, which is not
upstream in Linux but is in FreeBSD. There was a driver for kernel 2.4
around many years ago, and I have maintained it up
to the latest kernel. It's robust, handles fragmentation and has just 12
bytes of overhead.
>
> We didn't do that, but could *easily* add an iptv model to shape
> inbound better - if we knew more about how free, FT etc, construct
> their packets.

Inbound they are marked with TOS. Typical internet traffic has 0, of
course; IPTV has X and voice has Y (don't ask me for the numbers, I
don't have them in mind right now).
But for DHCP leases you need to mark your own packets with another DSCP,
otherwise the ISP returns no IP. I don't know why this was done, but
it has to be handled.
Normally Orange ships black boxes as routers, and to get it working with
free systems, some people reverse engineered that shit. My conclusion is
that it's some sort of
obfuscation to avoid third-party hardware, since the EU regulated the
ISPs in a way that forced them to allow third-party products, which
they still try to avoid (refusing support for internet problems etc.).

> Similarly some folk in this world want strict priority for EF.
>


* Re: [Make-wifi-fast] Wifi Memory limits in small platforms
  2019-08-22 17:37                                                 ` Sebastian Gottschall
@ 2019-08-22 18:23                                                   ` Toke Høiland-Jørgensen
  2019-08-22 18:56                                                     ` Dave Taht
  0 siblings, 1 reply; 13+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-08-22 18:23 UTC (permalink / raw)
  To: Sebastian Gottschall, Dave Taht
  Cc: Dave Taht, Cake List, Battle of the Mesh Mailing List, make-wifi-fast

Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:

> Am 22.08.2019 um 19:03 schrieb Dave Taht:
>> Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:
>>
>>> Am 22.08.2019 um 15:15 schrieb Dave Taht:
>>>> It's very good to know how much folk have been struggling to keep
>>>> things from OOMing on 32MB platforms. I'd like to hope that the
>>>> unified memory management in cake (vs a collection of QoS qdiscs) and
>>>> the new fq_codel for wifi stuff (cutting it down to 1 alloc from four)
>>>> help, massively on this issue, but until today I was unaware of how
>>>> much the field may have been patching things out.
>>>>
>>>> The default 32MB memory limits in fq_codel comes from the stressing
>>>> about 10GigE networking from google. 4MB is limit in openwrt,
>>>> which is suitable for ~1Gbit, and is sort of there  due to 802.11ac's
>>>> maximum (impossible to hit) of a txop that large.
>> I did kind of conflate "qos + fq_codel" vs wifi in this message. It
>> looks like yer staying with me.
>>
>>>> Something as small as 256K is essentially about 128 full size packets
>>>> (and often, acks from an ethernet device's rx ring eat 2k).
>>> what i miss in mac80211 is the following option "fq_codel = off"
>>> its essential and i will definitly work on a patch to deal with this
>>> way for low memory 802.11n platforms.
>> Well, it would be my hope that turning it off would A) not help that
>> much on memory or cpu and B) show such a dramatic reduction in
>> multi-station performance that you'd immediately turn it on again.
> isnt it better to have a working platform with less performance than a 
> crashing platform with no performance?
> i mean i can user older mac80211 versions without that issue on a 
> typical nanostation 2/5 which is often used just as CPE device

So before the queueing patches to mac80211, the maximum packet queue
size for ath9k was 3MB in total, or 2.2MB if only a single AC was used
on the WiFi link (that's 128 packets in the driver + 1000 in the
pfifo_fast qdisc * 2074 bytes for the truesize of a full-size packet).
Whereas now the default is 4MB for a non-vht device. So it's not
actually that big of a difference, and as you've already discovered the
defaults can be changed.
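
The figures above can be checked with a short calculation. The sketch
below uses only the numbers given in the paragraph (128 driver packets,
1000 pfifo_fast packets, 2074 bytes of truesize). One assumption is
mine: that the "3MB in total" case corresponds to all four 802.11
access categories each holding 128 packets in the driver. The names are
hypothetical.

```c
#define FULL_PKT_TRUESIZE	2074UL	/* truesize of a full-size packet */
#define DRIVER_QUEUE_PKTS	128UL	/* per access category (assumed) */
#define PFIFO_FAST_PKTS		1000UL	/* qdisc queue length */

/* Worst-case bytes queued before the mac80211 queueing patches,
 * for a given number of access categories in use. */
static unsigned long legacy_queue_bytes(unsigned long active_acs)
{
	return (active_acs * DRIVER_QUEUE_PKTS + PFIFO_FAST_PKTS) *
	       FULL_PKT_TRUESIZE;
}
```

A single AC gives 2,339,472 bytes (about 2.2 MiB) and four ACs give
3,135,888 bytes (about 3 MiB), so under this reading the 4MB default is
roughly a third larger than the old worst case.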

Would it be helpful to add support for setting the memory limit in
hostapd (to avoid having to patch the kernel default)?

> but with current mac80211 versions (current means last 2-3 years). they 
> are just unstable and running out of memory after a while
> the only thing which helped was cutting of the memory limit of fq_codel 
> inside mac80211
> i also have another fancy testunit which is a linksys wrt400 with 32 mb 
> ram and 2 ath9k based wifi chipsets. no hope here for running stable
> for only 5 minutes even with a single connection under load (my crashing 
> test is running a hdtv iptv stream converted to unicast using a 
> stateless eoip tunnel)
>
>> I try to encourage folk to run the rtt_fair tests in flent when
>> twiddling with wifi. Those really shows how bad things are when you
>> don't have ATF + FQ + Per station aggregation and lots of
>> clients. Single threaded tests are misleading.
> i know but even single threaded tests arent working good on such 
> devices. so there is no need to talk about the benefits of atf,fq_codel etc.
> but there is need to talk about configurable use of it which also allows 
> to disable it if required.

Disabling the fq part won't actually gain you much in terms of memory
usage, though, as most of it is packet memory which is already
configurable.

The one exception to this is the static overhead of 'struct fq_flow', of
which mac80211 currently allocates 4k. That's 300k of memory which is
currently not configurable. But that could be fixed :)
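
For a sense of scale, the two numbers above (4k flows, 300k of memory)
imply roughly 75 bytes per flow. The sketch below just does that
division and shows what a smaller table would cost. The 75-byte figure
is derived from the message, not taken from the real sizeof(struct
fq_flow), which depends on the architecture and kernel version; the
function names are hypothetical.

```c
/* Static memory used by a flow table of nflows entries. */
static unsigned long flow_table_bytes(unsigned long nflows,
				      unsigned long bytes_per_flow)
{
	return nflows * bytes_per_flow;
}

/* Per-flow size implied by a known total and flow count. */
static unsigned long implied_bytes_per_flow(unsigned long total_bytes,
					    unsigned long nflows)
{
	if (nflows == 0)
		return 0;
	return total_bytes / nflows;
}
```

Under that assumption, cutting the table from 4096 to 1024 flows would
shrink the static overhead from 300 KiB to 75 KiB.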

-Toke


* Re: [Make-wifi-fast] Wifi Memory limits in small platforms
  2019-08-22 18:23                                                   ` Toke Høiland-Jørgensen
@ 2019-08-22 18:56                                                     ` Dave Taht
  2019-08-22 19:37                                                       ` [Make-wifi-fast] [Battlemesh] " Toke Høiland-Jørgensen
                                                                         ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Dave Taht @ 2019-08-22 18:56 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Sebastian Gottschall, Dave Taht, Cake List,
	Battle of the Mesh Mailing List, Make-Wifi-fast

On Thu, Aug 22, 2019 at 11:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:
>
> > Am 22.08.2019 um 19:03 schrieb Dave Taht:
> >> Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:
> >>
> >>> Am 22.08.2019 um 15:15 schrieb Dave Taht:
> >>>> It's very good to know how much folk have been struggling to keep
> >>>> things from OOMing on 32MB platforms. I'd like to hope that the
> >>>> unified memory management in cake (vs a collection of QoS qdiscs) and
> >>>> the new fq_codel for wifi stuff (cutting it down to 1 alloc from four)
> >>>> help, massively on this issue, but until today I was unaware of how
> >>>> much the field may have been patching things out.
> >>>>
> >>>> The default 32MB memory limits in fq_codel comes from the stressing
> >>>> about 10GigE networking from google. 4MB is limit in openwrt,
> >>>> which is suitable for ~1Gbit, and is sort of there  due to 802.11ac's
> >>>> maximum (impossible to hit) of a txop that large.
> >> I did kind of conflate "qos + fq_codel" vs wifi in this message. It
> >> looks like yer staying with me.
> >>
> >>>> Something as small as 256K is essentially about 128 full size packets
> >>>> (and often, acks from an ethernet device's rx ring eat 2k).
> >>> what i miss in mac80211 is the following option "fq_codel = off"
> >>> its essential and i will definitly work on a patch to deal with this
> >>> way for low memory 802.11n platforms.
> >> Well, it would be my hope that turning it off would A) not help that
> >> much on memory or cpu and B) show such a dramatic reduction in
> >> multi-station performance that you'd immediately turn it on again.
> > isnt it better to have a working platform with less performance than a
> > crashing platform with no performance?
> > i mean i can user older mac80211 versions without that issue on a
> > typical nanostation 2/5 which is often used just as CPE device
>
> So before the queueing patches to mac80211, the maximum packet queue
> size for ath9k was 3MB in total, or 2.2MB if only a single AC was used
> on the WiFi link (that's 128 packets in the driver + 1000 in the
> pfifo_fast qdisc * 2074 bytes for the truesize of a full-size packet).
> Whereas now the default is 4MB for a non-vht device. So it's not
> actually that big of a difference, and as you've already discovered the
> defaults can be changed.
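
The figures in the quoted paragraph above can be reproduced with a few lines of arithmetic. This is only a sketch: the packet counts and the 2074-byte truesize come from the message, while the assumption that "3MB in total" means four access categories, each with its own 128-packet driver queue in front of one shared 1000-packet qdisc, is mine.

```python
TRUESIZE = 2074        # bytes, truesize of a full-size packet (from the message)
DRIVER_QUEUE = 128     # packets queued in the ath9k driver (from the message)
QDISC_LIMIT = 1000     # packets in the pfifo_fast qdisc (from the message)
ACCESS_CATEGORIES = 4  # assumption: one driver queue per 802.11 access category
NEW_DEFAULT = 4 * 1024 * 1024  # the 4MB default for a non-vht device


def queue_bytes(driver_queues: int) -> int:
    """Worst-case packet memory with the given number of driver queues in use."""
    return (driver_queues * DRIVER_QUEUE + QDISC_LIMIT) * TRUESIZE


def mib(n: int) -> float:
    """Convert bytes to MiB."""
    return n / (1024 * 1024)


single_ac = queue_bytes(1)                # 2339472 bytes
all_acs = queue_bytes(ACCESS_CATEGORIES)  # 3135888 bytes

print(f"single AC: {mib(single_ac):.2f} MiB")   # 2.23
print(f"all ACs:   {mib(all_acs):.2f} MiB")     # 2.99
print(f"new default / old maximum: {NEW_DEFAULT / all_acs:.2f}x")
```

Read this way, the old worst case and the new default are within about a third of each other, which matches the "not actually that big of a difference" remark.
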
>
> Would it be helpful to add support for setting the memory limit in
> hostapd (to avoid having to patch the kernel default)?

Hmm. I guess exposing that via netlink, etc. is a good idea. Me, I just
write to the /sys/kernel/debug/*/*/aqm files.

btw:

qos_map, in my mind, for APs at this point, should default to the best
effort queue only. Not sure how to set that in openwrt (I just patched
it out of the kernel). 4 queues with 4 ready to go is a lot, and I have
some ugly pics from when I tested it at battlemesh that I should get
around to publishing.

As for that sys file...

I'd rather like to expose target and interval, stop disabling ecn
dynamically, and have something closer to an ewma for fiddling with the
target in the first place.....

/me hides

> > but with current mac80211 versions (current means last 2-3 years). they
> > are just unstable and running out of memory after a while
> > the only thing which helped was cutting of the memory limit of fq_codel
> > inside mac80211
> > i also have another fancy testunit which is a linksys wrt400 with 32 mb
> > ram and 2 ath9k based wifi chipsets. no hope here fonr running stable
> > for only 5 minutes even with a single connection under load (my crashing
> > test is running a hdtv iptv stream converted to unicast using a
> > stateless eoip tunnel)
> >
> >> I try to encourage folk to run the rtt_fair tests in flent when
> >> twiddling with wifi. Those really shows how bad things are when you
> >> don't have ATF + FQ + Per station aggregation and lots of
> >> clients. Single threaded tests are misleading.
> > i know but even single threaded tests arent working good on such
> > devices. so there is no need to talk about the benefits of atf,fq_codel etc.
> > but there is need to talk about configurable use of it which also allows
> > to disable it if required.

I 110% agree that a system that can stay up for years is much better
than one that is fast for 5 minutes!

However, I'd like a chance, in collaborating with you on your upcoming
patches, to try to narrow down crash bugs to various subsystems and to
get some benchmarks done that I simply couldn't do anymore at the
financial conclusion of the make-wifi-fast and cake projects.

I think I have a lot of gear that is dd-wrt compatible - apu2,
wndr3700s, 3800s....

The reduce truesize patch had helped a lot at the time (2012). There
were all kinds of flaky bugs that disappeared.

The new drop monitor patchset looks WONDERFUL for seeing more about
packet drop behavior in the stack, but it's a 5.3(?) feature only.

I note that I run 18.06.1 on my 32MB pico and nanostations on the
lupin campus, but I run no gui and few additional applications at all
(except babel, snmpd, netperf, and the other core needed daemons). My
uptimes are principally governed by power failures. I can't remember
the last "crash, crash" I had, and I do track memory leaks (none).
That said, I'm painfully aware that I should probably give dd-wrt and
openwrt 19.x some testing just to make sure there are no regressions,
but have been reluctant to get involved again without more partners in
crime, because the scars from deploying 18.x widely are only beginning
to heal... and only last week did the needed babel 1.9 upgrade arrive
so I can finally redeploy ipv6 universally. I fear my current
reliability metrics are so good because I took down ipv6 last year....

Pico:

root@pool2:~# free
             total         used         free       shared      buffers
Mem:         28480        23796         4684           92         1868
-/+ buffers:              21928         6552
Swap:            0            0            0

root@pool2:~# uptime
 11:38:09 up 43 days, 21:37,  load average: 0.04, 0.03, 0.04

Same workload over here, on a wndr3800, almost exactly the same config

root@couch:~# free
             total       used       free     shared    buffers     cached
Mem:         60320      22872      37448         68       1960       6120
-/+ buffers/cache:      14792      45528
Swap:            0          0          0
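
For anyone comparing these dumps with other boxes, the "-/+ buffers" line is derived from the "Mem:" line by treating buffers (and page cache, where it is shown) as reclaimable. A small sketch of that arithmetic, using only the numbers printed above; the helper name is made up for illustration:

```python
def adjusted(used: int, free: int, buffers: int, cached: int = 0):
    """Return (used, free) in kB with buffers and page cache counted as free."""
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


# Values in kB, copied from the two 'free' outputs above.
pico = adjusted(used=23796, free=4684, buffers=1868)
wndr3800 = adjusted(used=22872, free=37448, buffers=1960, cached=6120)

print(pico)      # (21928, 6552)
print(wndr3800)  # (14792, 45528)

for name, total, (used, _free) in (("pico", 28480, pico),
                                   ("wndr3800", 60320, wndr3800)):
    print(f"{name}: {100 * used / total:.1f}% of RAM in non-reclaimable use")
```

Both results match the "-/+" lines in the dumps, so the two machines can be compared on the adjusted figures rather than on the raw "used" column.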


>
> Disabling the fq part won't actually gain you much in terms of memory
> usage, though, as most of it is packet memory which is already
> configurable.
>
> The one exception to this is the static overhead of 'struct fq_flow', of
> which mac80211 currently allocates 4k. That's 300k of memory which is
> currently not configurable. But that could be fixed :)
>
> -Toke
--

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Make-wifi-fast] [Battlemesh] Wifi Memory limits in small platforms
  2019-08-22 18:56                                                     ` Dave Taht
@ 2019-08-22 19:37                                                       ` Toke Høiland-Jørgensen
  2019-08-22 20:10                                                         ` [Make-wifi-fast] [Cake] " Sebastian Moeller
  2019-08-22 20:30                                                       ` [Make-wifi-fast] " Sebastian Gottschall
  2019-08-22 20:32                                                       ` [Make-wifi-fast] fq_codel_fast crash/lockup Sebastian Gottschall
  2 siblings, 1 reply; 13+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-08-22 19:37 UTC (permalink / raw)
  To: Dave Taht
  Cc: Cake List, Sebastian Gottschall, Make-Wifi-fast, Dave Taht,
	Battle of the Mesh Mailing List

Dave Taht <dave.taht@gmail.com> writes:

> On Thu, Aug 22, 2019 at 11:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:
>>
>> > Am 22.08.2019 um 19:03 schrieb Dave Taht:
>> >> Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:
>> >>
>> >>> Am 22.08.2019 um 15:15 schrieb Dave Taht:
>> >>>> It's very good to know how much folk have been struggling to keep
>> >>>> things from OOMing on 32MB platforms. I'd like to hope that the
>> >>>> unified memory management in cake (vs a collection of QoS qdiscs) and
>> >>>> the new fq_codel for wifi stuff (cutting it down to 1 alloc from four)
>> >>>> help, massively on this issue, but until today I was unaware of how
>> >>>> much the field may have been patching things out.
>> >>>>
>> >>>> The default 32MB memory limits in fq_codel comes from the stressing
>> >>>> about 10GigE networking from google. 4MB is limit in openwrt,
>> >>>> which is suitable for ~1Gbit, and is sort of there  due to 802.11ac's
>> >>>> maximum (impossible to hit) of a txop that large.
>> >> I did kind of conflate "qos + fq_codel" vs wifi in this message. It
>> >> looks like yer staying with me.
>> >>
>> >>>> Something as small as 256K is essentially about 128 full size packets
>> >>>> (and often, acks from an ethernet device's rx ring eat 2k).
>> >>> what i miss in mac80211 is the following option "fq_codel = off"
>> >>> its essential and i will definitly work on a patch to deal with this
>> >>> way for low memory 802.11n platforms.
>> >> Well, it would be my hope that turning it off would A) not help that
>> >> much on memory or cpu and B) show such a dramatic reduction in
>> >> multi-station performance that you'd immediately turn it on again.
>> > isnt it better to have a working platform with less performance than a
>> > crashing platform with no performance?
>> > i mean i can user older mac80211 versions without that issue on a
>> > typical nanostation 2/5 which is often used just as CPE device
>>
>> So before the queueing patches to mac80211, the maximum packet queue
>> size for ath9k was 3MB in total, or 2.2MB if only a single AC was used
>> on the WiFi link (that's 128 packets in the driver + 1000 in the
>> pfifo_fast qdisc * 2074 bytes for the truesize of a full-size packet).
>> Whereas now the default is 4MB for a non-vht device. So it's not
>> actually that big of a difference, and as you've already discovered the
>> defaults can be changed.
>>
>> Would it be helpful to add support for setting the memory limit in
>> hostapd (to avoid having to patch the kernel default)?
>
> hmm. I guess exposing that via netlink, etc is a good idea. Me I just
> write the sys/kernel/debug/*/*/aqm files.

It already is, and you can set it through iw (as I pointed out
up-thread):

iw phy phy0 set txq memory_limit 2097152

But it's not supported in hostapd, so you have to do that manually as it
is now.
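
Until hostapd can do it, the manual step could be scripted at boot. Below is a minimal sketch that only wraps the iw invocation shown above; the helper names are hypothetical, and whether 2 MiB is the right value for a given board is left open.

```python
import subprocess
from typing import List


def txq_memory_limit_cmd(phy: str, limit_bytes: int) -> List[str]:
    """Build the argv for 'iw phy <phy> set txq memory_limit <bytes>'."""
    if limit_bytes <= 0:
        raise ValueError("memory limit must be a positive number of bytes")
    return ["iw", "phy", phy, "set", "txq", "memory_limit", str(limit_bytes)]


def apply_txq_memory_limit(phy: str, limit_bytes: int) -> None:
    """Run the command; needs root and an iw build that supports 'set txq'."""
    subprocess.run(txq_memory_limit_cmd(phy, limit_bytes), check=True)


# 2 MiB, the same value as in the example above.
cmd = txq_memory_limit_cmd("phy0", 2 * 1024 * 1024)
print(" ".join(cmd))  # iw phy phy0 set txq memory_limit 2097152
```

Calling apply_txq_memory_limit("phy0", 2 * 1024 * 1024) from an init script would then set the limit for that phy; the setting has to be reapplied after each boot, since nothing here persists it.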

> btw:
>
> qos_map in my mind, for APs at this point, should default to the best
> effort queue only. Not sure how to set that in openwrt (I just patched
> it out of the kernel).

Think it's possible to set this in hostapd config; haven't tried it...

-Toke

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Make-wifi-fast] [Cake] [Battlemesh] Wifi Memory limits in small platforms
  2019-08-22 19:37                                                       ` [Make-wifi-fast] [Battlemesh] " Toke Høiland-Jørgensen
@ 2019-08-22 20:10                                                         ` Sebastian Moeller
  0 siblings, 0 replies; 13+ messages in thread
From: Sebastian Moeller @ 2019-08-22 20:10 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Dave Täht, Cake List, Battle of the Mesh Mailing List,
	Make-Wifi-fast



> On Aug 22, 2019, at 21:37, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> 
> Dave Taht <dave.taht@gmail.com> writes:
> 
>> On Thu, Aug 22, 2019 at 11:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>> 
>>> Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:
>>> 
>>>> Am 22.08.2019 um 19:03 schrieb Dave Taht:
>>>>> Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:
>>>>> 
>>>>>> Am 22.08.2019 um 15:15 schrieb Dave Taht:
>>>>>>> It's very good to know how much folk have been struggling to keep
>>>>>>> things from OOMing on 32MB platforms. I'd like to hope that the
>>>>>>> unified memory management in cake (vs a collection of QoS qdiscs) and
>>>>>>> the new fq_codel for wifi stuff (cutting it down to 1 alloc from four)
>>>>>>> help, massively on this issue, but until today I was unaware of how
>>>>>>> much the field may have been patching things out.
>>>>>>> 
>>>>>>> The default 32MB memory limits in fq_codel comes from the stressing
>>>>>>> about 10GigE networking from google. 4MB is limit in openwrt,
>>>>>>> which is suitable for ~1Gbit, and is sort of there  due to 802.11ac's
>>>>>>> maximum (impossible to hit) of a txop that large.
>>>>> I did kind of conflate "qos + fq_codel" vs wifi in this message. It
>>>>> looks like yer staying with me.
>>>>> 
>>>>>>> Something as small as 256K is essentially about 128 full size packets
>>>>>>> (and often, acks from an ethernet device's rx ring eat 2k).
>>>>>> what i miss in mac80211 is the following option "fq_codel = off"
>>>>>> its essential and i will definitly work on a patch to deal with this
>>>>>> way for low memory 802.11n platforms.
>>>>> Well, it would be my hope that turning it off would A) not help that
>>>>> much on memory or cpu and B) show such a dramatic reduction in
>>>>> multi-station performance that you'd immediately turn it on again.
>>>> isnt it better to have a working platform with less performance than a
>>>> crashing platform with no performance?
>>>> i mean i can user older mac80211 versions without that issue on a
>>>> typical nanostation 2/5 which is often used just as CPE device
>>> 
>>> So before the queueing patches to mac80211, the maximum packet queue
>>> size for ath9k was 3MB in total, or 2.2MB if only a single AC was used
>>> on the WiFi link (that's 128 packets in the driver + 1000 in the
>>> pfifo_fast qdisc * 2074 bytes for the truesize of a full-size packet).
>>> Whereas now the default is 4MB for a non-vht device. So it's not
>>> actually that big of a difference, and as you've already discovered the
>>> defaults can be changed.
>>> 
>>> Would it be helpful to add support for setting the memory limit in
>>> hostapd (to avoid having to patch the kernel default)?
>> 
>> hmm. I guess exposing that via netlink, etc is a good idea. Me I just
>> write the sys/kernel/debug/*/*/aqm files.
> 
> It already is, and you can set it through iw (as I pointed out
> up-thread):
> 
> iw phy phy0 set txq memory_limit 2097152
> 
> But it's not supported in hostapd, so you have to do that manually as it
> is now.
> 
>> btw:
>> 
>> qos_map in my mind, for APs at this point, should default to the best
>> effort queue only. Not sure how to set that in openwrt (I just patched
>> it out of the kernel).
> 
> Think it's possible to set this in hostapd config; haven't tried it...

	I believe that OpenWrt's hostapd does not support that feature, at least it did not last year when I looked...

Best Regards
	Sebastian

> 
> -Toke


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Make-wifi-fast] Wifi Memory limits in small platforms
  2019-08-22 18:56                                                     ` Dave Taht
  2019-08-22 19:37                                                       ` [Make-wifi-fast] [Battlemesh] " Toke Høiland-Jørgensen
@ 2019-08-22 20:30                                                       ` Sebastian Gottschall
  2019-08-22 23:39                                                         ` Dave Taht
  2019-08-22 20:32                                                       ` [Make-wifi-fast] fq_codel_fast crash/lockup Sebastian Gottschall
  2 siblings, 1 reply; 13+ messages in thread
From: Sebastian Gottschall @ 2019-08-22 20:30 UTC (permalink / raw)
  To: Dave Taht, Toke Høiland-Jørgensen
  Cc: Dave Taht, Cake List, Battle of the Mesh Mailing List, Make-Wifi-fast


>>> but with current mac80211 versions (current means last 2-3 years). they
>>> are just unstable and running out of memory after a while
>>> the only thing which helped was cutting of the memory limit of fq_codel
>>> inside mac80211
>>> i also have another fancy testunit which is a linksys wrt400 with 32 mb
>>> ram and 2 ath9k based wifi chipsets. no hope here fonr running stable
>>> for only 5 minutes even with a single connection under load (my crashing
>>> test is running a hdtv iptv stream converted to unicast using a
>>> stateless eoip tunnel)
>>>
>>>> I try to encourage folk to run the rtt_fair tests in flent when
>>>> twiddling with wifi. Those really shows how bad things are when you
>>>> don't have ATF + FQ + Per station aggregation and lots of
>>>> clients. Single threaded tests are misleading.
>>> i know but even single threaded tests arent working good on such
>>> devices. so there is no need to talk about the benefits of atf,fq_codel etc.
>>> but there is need to talk about configurable use of it which also allows
>>> to disable it if required.
> I 110% agree that a system that can stay up for years is much better
> than one that is fast for 5 minutes!
>
> However I'd like a chance, in collaborating with you and your upcoming
> patches - to try and narrow
> down crash bugs to various subsystems and be able to get some
> benchmarks done that I simply
> couldn't do anymore at the financial conclusion of the make-wifi-fast
> and cake projects.
>
> I think I have a lot of gear that is dd-wrt compatible - apu2,
> wndr3700s, 3800s....
If it's a v4, those have 128 MB (I have them too), and the apu2 has 2 GB.
So it only gets really interesting if you pick one of the bad ones with
32 MB of RAM, which are still commonly used by "freifunk".
> The reduce truesize patch had helped a lot at the time (2012). There
> were all kinds of flaky bugs that disappeared.
I tested it, and it helped to make ethernet unavailable. It worked for
wifi interfaces, but eth0 and eth1 on my ipq8064-based test board did
not work anymore: no dhcp lease, no ping, though I was able to capture
inbound packets. (QoS was not even enabled while testing, so no cake,
fq_codel etc., just the standard sfq scheduler.) So I reverted it and
everything worked again.
>
> the new drop monitor patchset looks WONDERFUL for seeing more about
> packet drop behavior in the stack, but
> it's a 5.3(?) feature only.
i love backporting :-)
>
> I note that I run 18.06.1 on my 32MB pico and nanostations on the
> lupin campus, but I run no gui, few additional applications at all
> (except babel, snmpd, netperf, and the other core needed daemons).  My
> uptimes are principally governed by power failures. I can't remember
> the last  "crash, crash" I had, and I do track memory leaks (none).
> That said, I'm painfully aware that I should probably give dd-wrt and
> openwrt 19.x some testing just to make sure there's no regressions,
> but have been reluctant to get involved again without more partners in
> crime, because the scars from deploying 18.x widely are only beginning
> to heal... and only last week did the needed babel 1.9 upgrade arrive
> so I can finally redeploy ipv6 universally. I fear my current
> reliability metrics are so good because I took down ipv6 last year....
My usual workaround for memory problems is also to disable http. I have
some of these nanostations in the field running just hostapd, snmp and
syslog; everything else is disabled because of the OOM problems. It was
never a real crash, just OOM. I have never played with babel, though;
ospf etc. all work out of the box, based on quagga on low-end devices
and frr on bigger ones.

>
> Pico:
>
> root@pool2:~# free
>               total         used         free       shared      buffers
> Mem:         28480        23796         4684           92         1868
> -/+ buffers:              21928         6552
> Swap:            0            0            0
>
> root@pool2:~# uptime
>   11:38:09 up 43 days, 21:37,  load average: 0.04, 0.03, 0.04
>
> Same workload over here, on a wndr3800, almost exactly the same config
>
> root@couch:~# free
>               total       used       free     shared    buffers     cached
> Mem:         60320      22872      37448         68       1960       6120
> -/+ buffers/cache:      14792      45528
> Swap:            0          0          0

NS2

root@TRO1:~# free
              total        used        free      shared  buff/cache   available
Mem:          29124       19228        3552           0        6344        7752
Swap:             0           0           0

wndr3700v4

root@DD-WRT:~# free
              total        used        free      shared  buff/cache   available
Mem:         125884       23048       92940           0        9896       99824
Swap:             0           0           0
root@DD-WRT:~#


>
>> Disabling the fq part won't actually gain you much in terms of memory
>> usage, though, as most of it is packet memory which is already
>> configurable.
>>
>> The one exception to this is the static overhead of 'struct fq_flow', of
>> which mac80211 currently allocates 4k. That's 300k of memory which is
>> currently not configurable. But that could be fixed :)
>>
>> -Toke

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Make-wifi-fast] fq_codel_fast crash/lockup
  2019-08-22 18:56                                                     ` Dave Taht
  2019-08-22 19:37                                                       ` [Make-wifi-fast] [Battlemesh] " Toke Høiland-Jørgensen
  2019-08-22 20:30                                                       ` [Make-wifi-fast] " Sebastian Gottschall
@ 2019-08-22 20:32                                                       ` Sebastian Gottschall
  2 siblings, 0 replies; 13+ messages in thread
From: Sebastian Gottschall @ 2019-08-22 20:32 UTC (permalink / raw)
  To: Dave Taht, Toke Høiland-Jørgensen
  Cc: Dave Taht, Cake List, Battle of the Mesh Mailing List, Make-Wifi-fast

if you mind.

running on arm kernel htb+fq_codel_fast

INFO: rcu_preempt self-detected stall on CPU
        0-...: (1 GPs behind) idle=0ab/140000000000001/0 softirq=2280/2285 fqs=5984
         (t=6000 jiffies g=211 c=210 q=565)
Task dump for CPU 0:
tc              R running      0  1024    890 0x00000002
Backtrace:
[<8001b71c>] (dump_backtrace) from [<8001b9c0>] (show_stack+0x18/0x1c)
  r7:8051ca40 r6:8050f908 r5:0000037a r4:87a8f400
[<8001b9a8>] (show_stack) from [<8005bad0>] (sched_show_task+0xe0/0xe8)
[<8005b9f0>] (sched_show_task) from [<8005cf20>] (dump_cpu_task+0x40/0x44)
  r5:00000000 r4:00000000
[<8005cee0>] (dump_cpu_task) from [<80078534>] (rcu_dump_cpu_stacks+0x74/0xac)
  r5:00000000 r4:8051ca40
[<800784c0>] (rcu_dump_cpu_stacks) from [<8007ba20>] (rcu_check_callbacks+0x2e8/0x854)
  r9:8051ca40 r8:8050e100 r7:068d8000 r6:00000235 r5:8050f9c0 r4:804fed40
[<8007b738>] (rcu_check_callbacks) from [<8007df44>] (update_process_times+0x44/0x6c)
  r10:00000000 r9:00000000 r8:00000000 r7:0000001d r6:87a8f400 r5:00000000
  r4:ffffe000
[<8007df00>] (update_process_times) from [<80089a08>] (tick_periodic+0xb0/0xb8)
  r7:0000001d r6:7fffffff r5:86dd9800 r4:8050e144
[<80089958>] (tick_periodic) from [<80089a44>] (tick_handle_periodic+0x34/0xac)
  r5:86dd9800 r4:ffffffff
[<80089a10>] (tick_handle_periodic) from [<8001dc04>] (twd_handler+0x38/0x40)
  r9:00000000 r8:86dd9800 r7:0000001d r6:87801cc0 r5:805214bc r4:00000001
[<8001dbcc>] (twd_handler) from [<80074c8c>] (handle_percpu_devid_irq+0x7c/0x94)
  r5:805214bc r4:805036c0
[<80074c10>] (handle_percpu_devid_irq) from [<80071214>] (__handle_domain_irq+0xb8/0xd8)
  r9:00000000 r8:87804800 r7:00000001 r6:804fcfc4 r5:00000000 r4:00000000
[<8007115c>] (__handle_domain_irq) from [<80009388>] (gic_handle_irq+0x50/0x7c)
  r9:00000000 r8:b0101100 r7:b0100100 r6:b010010c r5:87147ac8 r4:8050fb20
[<80009338>] (gic_handle_irq) from [<80009ed4>] (__irq_svc+0x54/0x90)
Exception stack(0x87147ac8 to 0x87147b10)
7ac0:                   8696e090 00000000 0000001f 00000020 86740000 87147b78
7ae0: 86680094 86680094 87147b78 00000000 00000000 87147b24 87147b28 87147b18
7b00: 7f4cb1ec 80017a28 20000013 ffffffff
  r9:00000000 r8:87147b78 r7:87147afc r6:ffffffff r5:20000013 r4:80017a28
[<800179d8>] (_raw_spin_lock_bh) from [<7f4cb1ec>] (fq_codel_dump_stats+0x68/0xd8 [sch_fq_codel_fast])
[<7f4cb184>] (fq_codel_dump_stats [sch_fq_codel_fast]) from [<802786c0>] (tc_fill_qdisc+0x1d0/0x284)
  r5:86740000 r4:870e0c00
[<802784f0>] (tc_fill_qdisc) from [<80278a78>] (qdisc_notify.isra.0+0xec/0x138)
  r10:86740000 r9:869cb800 r8:870d6308 r7:870d6306 r6:8052b780 r5:00000400
  r4:870e0c00
[<8027898c>] (qdisc_notify.isra.0) from [<80278afc>] (notify_and_destroy+0x38/0x50)
  r10:8052b780 r9:00000000 r8:80285a50 r7:869cba00 r6:00000000 r5:8696e000
  r4:869cb800
[<80278ac4>] (notify_and_destroy) from [<80278da8>] (qdisc_graft+0x294/0x360)
  r4:86740000
[<80278b14>] (qdisc_graft) from [<8027a424>] (tc_modify_qdisc+0x4a0/0x4f4)
  r10:8052b780 r9:00000000 r8:8696e000 r7:00010100 r6:870e1000 r5:870d6300
  r4:86740000
[<80279f84>] (tc_modify_qdisc) from [<802688b0>] (rtnetlink_rcv_msg+0x16c/0x1e4)
  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:870d7b00 r5:870d6300
  r4:000000f0
[<80268744>] (rtnetlink_rcv_msg) from [<802a37e4>] (netlink_rcv_skb+0x64/0xc8)
  r8:00000000 r7:870d7b00 r6:870d6300 r5:80268744 r4:870d7b00
[<802a3780>] (netlink_rcv_skb) from [<8026873c>] (rtnetlink_rcv+0x2c/0x34)
  r7:870d7b00 r6:87224000 r5:0000003c r4:870d7b00
[<80268710>] (rtnetlink_rcv) from [<802a3218>] (netlink_unicast+0x158/0x20c)
  r5:0000003c r4:878a8400
[<802a30c0>] (netlink_unicast) from [<802a3610>] (netlink_sendmsg+0x344/0x38c)
  r8:024000c0 r7:87224000 r6:0000003c r5:870d7b00 r4:87147f4c
[<802a32cc>] (netlink_sendmsg) from [<8023f680>] (___sys_sendmsg+0x1e8/0x22c)
  r10:87147e28 r9:87147e28 r8:00000000 r7:00000000 r6:8751ae00 r5:00000000
  r4:87147f4c
[<8023f498>] (___sys_sendmsg) from [<8023fd88>] (__sys_sendmsg+0x44/0x68)
  r10:00000000 r9:87146000 r8:80009724 r7:00000128 r6:00000000 r5:7e950ccc
  r4:8751ae00
[<8023fd44>] (__sys_sendmsg) from [<8023fdbc>] (SyS_sendmsg+0x10/0x14)
  r6:00000000 r5:00000000 r4:00000000
[<8023fdac>] (SyS_sendmsg) from [<80009560>] (ret_fast_syscall+0x0/0x48)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Make-wifi-fast] Wifi Memory limits in small platforms
  2019-08-22 20:30                                                       ` [Make-wifi-fast] " Sebastian Gottschall
@ 2019-08-22 23:39                                                         ` Dave Taht
  2019-08-23  6:25                                                           ` Sebastian Gottschall
  2019-08-23  6:48                                                           ` [Make-wifi-fast] [Cake] " Sebastian Moeller
  0 siblings, 2 replies; 13+ messages in thread
From: Dave Taht @ 2019-08-22 23:39 UTC (permalink / raw)
  To: Sebastian Gottschall
  Cc: Dave Taht, Toke Høiland-Jørgensen, Cake List,
	Battle of the Mesh Mailing List, Make-Wifi-fast

Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:

>>>> but with current mac80211 versions (current means last 2-3 years). they
>>>> are just unstable and running out of memory after a while
>>>> the only thing which helped was cutting of the memory limit of fq_codel
>>>> inside mac80211
>>>> i also have another fancy testunit which is a linksys wrt400 with 32 mb
>>>> ram and 2 ath9k based wifi chipsets. no hope here fonr running stable
>>>> for only 5 minutes even with a single connection under load (my crashing
>>>> test is running a hdtv iptv stream converted to unicast using a
>>>> stateless eoip tunnel)
>>>>
>>>>> I try to encourage folk to run the rtt_fair tests in flent when
>>>>> twiddling with wifi. Those really shows how bad things are when you
>>>>> don't have ATF + FQ + Per station aggregation and lots of
>>>>> clients. Single threaded tests are misleading.
>>>> i know but even single threaded tests arent working good on such
>>>> devices. so there is no need to talk about the benefits of atf,fq_codel etc.
>>>> but there is need to talk about configurable use of it which also allows
>>>> to disable it if required.
>> I 110% agree that a system that can stay up for years is much better
>> than one that is fast for 5 minutes!
>>
>> However I'd like a chance, in collaborating with you and your upcoming
>> patches - to try and narrow
>> down crash bugs to various subsystems and be able to get some
>> benchmarks done that I simply
>> couldn't do anymore at the financial conclusion of the make-wifi-fast
>> and cake projects.
>>
>> I think I have a lot of gear that is dd-wrt compatible - apu2,
>> wndr3700s, 3800s....
> if its v4, these are having 128 mb (i have them too).

These are from the cerowrt era, so, 32 or 64MB of ram.

> and apu2 has 2
> gb. so its getting real interesting
> if you choose such a bad one with 32 mb ram which are still commonly
> used by "freifunk"

One thing we can start doing more 'round here is to boot the x86 boxes
with mem=32M or something similar (40% larger due to 64 bits? no idea,
maybe look at free mem on a similar config) to see what shows up.

For example, one of my APU2s has dual ath9k/ath10k cards, which is a
reasonable sim of one of your configs.

>> The reduce truesize patch had helped a lot at the time (2012). There
>> were all kinds of flaky bugs that disappeared.
> i tested and it helped to make ethernet unavailable. it worked for

thx for making me chortle in sad empathy.

> wifi interfaces. but the eth0 and eth1 on my ipq8064 based
> testboard did not work anymore. no dhcp lease, no ping. but i was able
> to capture inbound packets. (qos was not even enabled while testing,
> so no cake, fq_code letc. just standard sfq scheduler)
> so i reverted and all worked again

OK. Thx for trying. There have been so many bugs in gso/gro and hardware
offloads that I figure that's why the patch was dropped over time.

Is cake's gso-splitting working on that same hardware? I'm not sure
to what extent that reduces packet size these days.

I'll try that again on x86, maybe it needed to pullskb....

>>
>> the new drop monitor patchset looks WONDERFUL for seeing more about
>> packet drop behavior in the stack, but
>> it's a 5.3(?) feature only.
> i love backporting :-)

I used to but these days I'm content to work out of net-next x.y.0-rc4
or later. I get more sleep that way. Oh, wait, it just hit that....

>>
>> I note that I run 18.06.1 on my 32MB pico and nanostations on the
>> lupin campus, but I run no gui, few additional applications at all
>> (except babel, snmpd, netperf, and the other core needed daemons).  My
>> uptimes are principally governed by power failures. I can't remember
>> the last  "crash, crash" I had, and I do track memory leaks (none).
>> That said, I'm painfully aware that I should probably give dd-wrt and
>> openwrt 19.x some testing just to make sure there's no regressions,
>> but have been reluctant to get involved again without more partners in
>> crime, because the scars from deploying 18.x widely are only beginning
>> to heal... and only last week did the needed babel 1.9 upgrade arrive
>> so I can finally redeploy ipv6 universally. I fear my current
>> reliability metrics are so good because I took down ipv6 last year....
> my workaround with memory problems is also disabling http normally. i
> have some of these nanostations in the field
>
> just running hostapd, snmp, syslog. but anything else is disabled due
> the oom problematics. it never was a real crash.
>
> but oom. but i never played with babel. ospf etc. all working out of
> the box based on quagga on low end devices and frr on bigger ones.
>
>>
>> Pico:
>>
>> root@pool2:~# free
>>               total         used         free       shared      buffers
>> Mem:         28480        23796         4684           92         1868
>> -/+ buffers:              21928         6552
>> Swap:            0            0            0
>>
>> root@pool2:~# uptime
>>   11:38:09 up 43 days, 21:37,  load average: 0.04, 0.03, 0.04
>>
>> Same workload over here, on a wndr3800, almost exactly the same config
>>
>> root@couch:~# free
>>               total       used       free     shared    buffers     cached
>> Mem:         60320      22872      37448         68       1960       6120
>> -/+ buffers/cache:      14792      45528
>> Swap:            0          0          0
>
> NS2
>
> root@TRO1:~# free
>
>               total        used        free      shared  buff/cache   available
> Mem:          29124       19228        3552           0        6344        7752
> Swap:             0           0           0

It looks like you are running even less stuff than I am. And this
machine is running with 256k bufs?
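
To put a rough number on "less stuff", here is a minimal sketch that compares the two 32MB boxes using only the figures from the `free` outputs quoted above. It assumes the older busybox output counts buffers in "used" (hence the "-/+ buffers" row), while the newer format already excludes buff/cache. The Pico figure still includes page cache, because that output has no "cached" column, so the gap may be overstated.

```python
# Figures are in KiB and copied from the `free` outputs quoted in this thread.
# Assumption: old-style free includes buffers in "used", so the
# "-/+ buffers" row is the closest comparable number for the Pico.
PICO_TOTAL = 28480
PICO_USED_EXCL_BUFFERS = 21928  # pool2, "-/+ buffers" row

# Assumption: new-style free reports "used" without buff/cache.
NS2_TOTAL = 29124
NS2_USED = 19228  # TRO1


def used_fraction(used_kib: int, total_kib: int) -> float:
    """Return the share of total memory that is in use."""
    return used_kib / total_kib


pico_share = used_fraction(PICO_USED_EXCL_BUFFERS, PICO_TOTAL)
ns2_share = used_fraction(NS2_USED, NS2_TOTAL)
gap_kib = PICO_USED_EXCL_BUFFERS - NS2_USED

print(f"Pico used: {pico_share:.1%}")
print(f"NS2 used:  {ns2_share:.1%}")
print(f"NS2 uses {gap_kib} KiB less than the Pico")
```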

> wndr3700v4
>
> root@DD-WRT:~# free
>               total        used        free      shared  buff/cache   available
> Mem:         125884       23048       92940           0        9896       99824
> Swap:             0           0           0
> root@DD-WRT:~#
>
>
>>
>>> Disabling the fq part won't actually gain you much in terms of memory
>>> usage, though, as most of it is packet memory which is already
>>> configurable.
>>>
>>> The one exception to this is the static overhead of 'struct fq_flow', of
>>> which mac80211 currently allocates 4k. That's 300k of memory which is
>>> currently not configurable. But that could be fixed :)
>>>
>>> -Toke
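
As a back-of-the-envelope check of the figures quoted above, the sketch below works out the per-flow cost they imply. It assumes "4k" means 4096 flows and "300k" means 300 KiB; neither is spelled out in the thread. The smaller flow counts are hypothetical and only show how the static overhead would scale if the number became configurable.

```python
# Assumptions (not stated in the thread): "4k" = 4096 flows, "300k" = 300 KiB.
FLOWS = 4096
STATIC_OVERHEAD_BYTES = 300 * 1024


def implied_bytes_per_flow(total_bytes: int, flows: int) -> float:
    """Per-flow cost implied by a total static allocation."""
    return total_bytes / flows


def static_overhead(flows: int, bytes_per_flow: float) -> int:
    """Total static allocation for a given number of flows."""
    return int(flows * bytes_per_flow)


per_flow = implied_bytes_per_flow(STATIC_OVERHEAD_BYTES, FLOWS)
print(f"implied per-flow cost: {per_flow:.0f} bytes")

# 1024 and 256 are hypothetical flow counts, chosen only for illustration.
for flows in (4096, 1024, 256):
    kib = static_overhead(flows, per_flow) / 1024
    print(f"{flows:5d} flows -> {kib:6.1f} KiB")
```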
>> --
>>
>> Dave Täht
>> CTO, TekLibre, LLC
>> http://www.teklibre.com
>> Tel: 1-831-205-9740
>>


* Re: [Make-wifi-fast] Wifi Memory limits in small platforms
  2019-08-22 23:39                                                         ` Dave Taht
@ 2019-08-23  6:25                                                           ` Sebastian Gottschall
  2019-08-23  6:48                                                           ` [Make-wifi-fast] [Cake] " Sebastian Moeller
  1 sibling, 0 replies; 13+ messages in thread
From: Sebastian Gottschall @ 2019-08-23  6:25 UTC (permalink / raw)
  To: Dave Taht
  Cc: Dave Taht, Toke Høiland-Jørgensen, Cake List,
	Battle of the Mesh Mailing List, Make-Wifi-fast


On 23.08.2019 at 01:39, Dave Taht wrote:
> Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:
>
>>>>> but with current mac80211 versions (current means last 2-3 years). they
>>>>> are just unstable and running out of memory after a while
>>>>> the only thing which helped was cutting of the memory limit of fq_codel
>>>>> inside mac80211
>>>>> i also have another fancy testunit which is a linksys wrt400 with 32 mb
>>>>> ram and 2 ath9k based wifi chipsets. no hope here for running stable
>>>>> for only 5 minutes even with a single connection under load (my crashing
>>>>> test is running a hdtv iptv stream converted to unicast using a
>>>>> stateless eoip tunnel)
>>>>>
>>>>>> I try to encourage folk to run the rtt_fair tests in flent when
>>>>>> twiddling with wifi. Those really shows how bad things are when you
>>>>>> don't have ATF + FQ + Per station aggregation and lots of
>>>>>> clients. Single threaded tests are misleading.
>>>>> i know but even single threaded tests arent working good on such
>>>>> devices. so there is no need to talk about the benefits of atf,fq_codel etc.
>>>>> but there is need to talk about configurable use of it which also allows
>>>>> to disable it if required.
>>> I 110% agree that a system that can stay up for years is much better
>>> than one that is fast for 5 minutes!
>>>
>>> However I'd like a chance, in collaborating with you and your upcoming
>>> patches - to try and narrow
>>> down crash bugs to various subsystems and be able to get some
>>> benchmarks done that I simply
>>> couldn't do anymore at the financial conclusion of the make-wifi-fast
>>> and cake projects.
>>>
>>> I think I have a lot of gear that is dd-wrt compatible - apu2,
>>> wndr3700s, 3800s....
>> if its v4, these are having 128 mb (i have them too).
> These are from the cerowrt era, so, 32 or 64MB of ram.
>
>> and apu2 has 2
>> gb. so its getting real interesting
>> if you choose such a bad one with 32 mb ram which are still commonly
>> used by "freifunk"
> One thing we can start doing more 'round here is to boot the x86 boxes
> with mem=32MB or something similar (40% larger due to 64 bits? no idea,
> maybe look at free mem on a similar config) to see what shows up.
>
> For example, one of my APU2s has dual ath9/ath10k cards which is
> a reasonable sim of one of your configs.
Since x64 has a lot more different configurations, the kernels are much
bigger (drivers, drivers, drivers). I'm sure it will not work with just
32 MB.
>
>>> The reduce truesize patch had helped a lot at the time (2012). There
>>> were all kinds of flaky bugs that disappeared.
>> i tested and it helped to make ethernet unavailable. it worked for
> thx for making me chortle in sad empathy.
>
>> wifi interfaces. but the eth0 and eth1 on my ipq8064 based
>> testboard did not work anymore. no dhcp lease, no ping. but i was able
>> to capture inbound packets. (qos was not even enabled while testing,
>> so no cake, fq_codel etc. just standard sfq scheduler)
>> so i reverted and all worked again
> OK. Thx for trying. there have been so many bugs in gso/gro and hardware
> offloads that I figure that that's why the patch was dropped over time.
>
> is cake's gso-splitting working on that same hardware? I'm not sure
> to what extent that reduces packet size or not these days.
Cake works, yes, but I have not checked gso-splitting explicitly. It
just worked.
>
> I'll try that again on x86, maybe it needed to pullskb....
It could be hardware-specific, but who knows.
>
>>> Pico:
>>>
>>> root@pool2:~# free
>>>                total         used         free       shared      buffers
>>> Mem:         28480        23796         4684           92         1868
>>> -/+ buffers:              21928         6552
>>> Swap:            0            0            0
>>>
>>> root@pool2:~# uptime
>>>    11:38:09 up 43 days, 21:37,  load average: 0.04, 0.03, 0.04
>>>
>>> Same workload over here, on a wndr3800, almost exactly the same config
>>>
>>> root@couch:~# free
>>>                total       used       free     shared    buffers     cached
>>> Mem:         60320      22872      37448         68       1960       6120
>>> -/+ buffers/cache:      14792      45528
>>> Swap:            0          0          0
>> NS2
>>
>> root@TRO1:~# free
>>
>>               total        used        free      shared  buff/cache   available
>> Mem:          29124       19228        3552           0        6344        7752
>> Swap:             0           0           0
> It looks like you are running even less stuff than I am. And this
> machine is running with 256k bufs?
Yes, but it may also depend on what you're running. I mean, OpenWrt and
dd-wrt are very different. The webserver etc. in dd-wrt might be more
lightweight. I do not use Lua or any other slow component; it's all
written in C, including the web code.

That's my process list:

   PID USER       VSZ STAT COMMAND
     1 root      1172 S    /sbin/init
     2 root         0 SW   [kthreadd]
     3 root         0 SW   [ksoftirqd/0]
     4 root         0 SW   [kworker/0:0]
     5 root         0 SW<  [kworker/0:0H]
     6 root         0 SW   [kworker/u2:0]
     7 root         0 SW<  [khelper]
     8 root         0 SW<  [writeback]
    10 root         0 SW<  [crypto]
    12 root         0 SW<  [bioset]
    64 root         0 SW<  [kblockd]
    65 root         0 SW   [kswapd0]
    66 root         0 SW   [kworker/0:1]
   108 root         0 SW   [fsnotify_mark]
   120 root         0 SW<  [deferwq]
   121 root         0 SW   [kworker/u2:2]
   503 root      1176 S    /sbin/hotplug2 --set-rules-file /etc/hotplug2.rules --persistent
   524 root      1856 S    watchdog
   553 root         0 SW<  [cfg80211]
   574 root      1792 S    /sbin/wlanled -l generic_0:-94 -l generic_1:-80 -l generic_11:-73 -l generic_7:-65
   777 root      3780 S    hostapd -B -P /var/run/ath0_hostapd.pid /tmp/ath0_hostap.conf
   951 root      1812 S    wland
  1025 root       904 S    cron
  1083 root      1544 S    resetbutton
  1086 root      1856 S    process_monitor
  1217 root      1376 S    syslogd -Z -L -R 192.168.0.117
  1224 root      1376 S    klogd
  1341 root      1112 S    mactelnetd
  1456 root      1224 S    dropbear -b /tmp/loginprompt -r /tmp/root/.ssh/ssh_host_rsa_key -p 22 -a
  4449 root      3692 S    httpd -p 80
 10770 root      1172 S    dnsmasq -u root -g root --conf-file=/tmp/dnsmasq.conf
 29786 root      1292 R    dropbear -b /tmp/loginprompt -r /tmp/root/.ssh/ssh_host_rsa_key -p 22 -a
 29787 root      1376 S    -sh
 29799 root      1376 R    ps


>
>> wndr3700v4
>>
>> root@DD-WRT:~# free
>>               total        used        free      shared  buff/cache   available
>> Mem:         125884       23048       92940           0        9896       99824
>> Swap:             0           0           0
>> root@DD-WRT:~#
>>


* Re: [Make-wifi-fast] [Cake] Wifi Memory limits in small platforms
  2019-08-22 23:39                                                         ` Dave Taht
  2019-08-23  6:25                                                           ` Sebastian Gottschall
@ 2019-08-23  6:48                                                           ` Sebastian Moeller
  1 sibling, 0 replies; 13+ messages in thread
From: Sebastian Moeller @ 2019-08-23  6:48 UTC (permalink / raw)
  To: Dave Taht; +Cc: Sebastian Gottschall, Cake List, Make-Wifi-fast



> On Aug 23, 2019, at 01:39, Dave Taht <dave@taht.net> wrote:
> 
> Sebastian Gottschall <s.gottschall@newmedia-net.de> writes:
> 
>>>>> but with current mac80211 versions (current means last 2-3 years). they
>>>>> are just unstable and running out of memory after a while
>>>>> the only thing which helped was cutting of the memory limit of fq_codel
>>>>> inside mac80211
>>>>> i also have another fancy testunit which is a linksys wrt400 with 32 mb
>>>>> ram and 2 ath9k based wifi chipsets. no hope here for running stable
>>>>> for only 5 minutes even with a single connection under load (my crashing
>>>>> test is running a hdtv iptv stream converted to unicast using a
>>>>> stateless eoip tunnel)
>>>>> 
>>>>>> I try to encourage folk to run the rtt_fair tests in flent when
>>>>>> twiddling with wifi. Those really shows how bad things are when you
>>>>>> don't have ATF + FQ + Per station aggregation and lots of
>>>>>> clients. Single threaded tests are misleading.
>>>>> i know but even single threaded tests arent working good on such
>>>>> devices. so there is no need to talk about the benefits of atf,fq_codel etc.
>>>>> but there is need to talk about configurable use of it which also allows
>>>>> to disable it if required.
>>> I 110% agree that a system that can stay up for years is much better
>>> than one that is fast for 5 minutes!
>>> 
>>> However I'd like a chance, in collaborating with you and your upcoming
>>> patches - to try and narrow
>>> down crash bugs to various subsystems and be able to get some
>>> benchmarks done that I simply
>>> couldn't do anymore at the financial conclusion of the make-wifi-fast
>>> and cake projects.
>>> 
>>> I think I have a lot of gear that is dd-wrt compatible - apu2,
>>> wndr3700s, 3800s....
>> if its v4, these are having 128 mb (i have them too).
> 
> These are from the cerowrt era, so, 32 or 64MB of ram.

	I believe we only used the wndr3700v2 (64MB) and the wndr3800 (128MB); at least those were the recommended ones. I also remember that initially it was quite easy to make these OOM with a simple UDP flood using randomized ports. That is, until we used fq_codel's limit keyword to restrict the maximum number of queued packets. This experience also carried over into cake and culminated in the memlimit keyword. It seems I completely missed the addition of the "memory_limit BYTES" keyword to fq_codel, which seems a better fit for our needs than the "limit 1001" we currently use. (Why 1001 instead of 1000? Simply to be able to see quickly whether this is our limit or something the user set; people tend to leave the last digit alone when playing with these parameters ;).)
I guess I have not bothered to repeat that test since fq_codel became the default qdisc in OpenWrt...
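
To illustrate why a byte-based limit seems the better fit, here is a rough sketch of how much memory a packet-count limit can pin. It assumes about 2 KiB of kernel memory per queued packet, in line with the "256K is about 128 full size packets" estimate earlier in the thread. The 64 KiB case is a hypothetical GSO superpacket, included only to show that a packet count does not bound memory by itself.

```python
# Assumption: ~2 KiB of kernel memory per queued packet, following the
# "256K is about 128 full size packets" estimate earlier in the thread.
ASSUMED_TRUESIZE = 2048


def worst_case_bytes(packet_limit: int, truesize: int = ASSUMED_TRUESIZE) -> int:
    """Memory pinned when a packet-count limit is completely full."""
    return packet_limit * truesize


def packets_that_fit(memory_limit_bytes: int, truesize: int = ASSUMED_TRUESIZE) -> int:
    """Number of packets that fit under a byte-based limit."""
    return memory_limit_bytes // truesize


RAM_32MB = 32 * 1024 * 1024

# "limit 1001" with ordinary packets under the assumption above.
ordinary = worst_case_bytes(1001)
# Hypothetical: every queued packet is a 64 KiB GSO superpacket.
superpackets = worst_case_bytes(1001, truesize=64 * 1024)

print(f"limit 1001, 2 KiB packets:  {ordinary / 1024:.0f} KiB")
print(f"limit 1001, 64 KiB packets: {superpackets / 1024:.0f} KiB")
print(f"exceeds a 32MB box: {superpackets > RAM_32MB}")
print(f"packets under a 256 KiB byte limit: {packets_that_fit(256 * 1024)}")
```

A byte limit caps the total regardless of how large individual packets turn out to be, which is the property a packet count lacks.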

Best Regards
	Sebastian


> 
>> and apu2 has 2
>> gb. so its getting real interesting
>> if you choose such a bad one with 32 mb ram which are still commonly
>> used by "freifunk"
> 
> One thing we can start doing more 'round here is to boot the x86 boxes
> with mem=32MB or something similar (40% larger due to 64 bits? no idea,
> maybe look at free mem on a similar config) to see what shows up. 
> 
> For example, one of my APU2s has dual ath9/ath10k cards which is
> a reasonable sim of one of your configs. 
> 
>>> The reduce truesize patch had helped a lot at the time (2012). There
>>> were all kinds of flaky bugs that disappeared.
>> i tested and it helped to make ethernet unavailable. it worked for
> 
> thx for making me chortle in sad empathy.
> 
>> wifi interfaces. but the eth0 and eth1 on my ipq8064 based
>> testboard did not work anymore. no dhcp lease, no ping. but i was able
>> to capture inbound packets. (qos was not even enabled while testing,
>> so no cake, fq_codel etc. just standard sfq scheduler)
>> so i reverted and all worked again
> 
> OK. Thx for trying. there have been so many bugs in gso/gro and hardware
> offloads that I figure that that's why the patch was dropped over time.
> 
> is cake's gso-splitting working on that same hardware? I'm not sure
> to what extent that reduces packet size or not these days.
> 
> I'll try that again on x86, maybe it needed to pullskb....
> 
>>> 
>>> the new drop monitor patchset looks WONDERFUL for seeing more about
>>> packet drop behavior in the stack, but
>>> it's a 5.3(?) feature only.
>> i love backporting :-)
> 
> I used to but these days I'm content to work out of net-next x.y.0-rc4
> or later. I get more sleep that way. Oh, wait, it just hit that....
> 
>>> 
>>> I note that I run 18.06.1 on my 32MB pico and nanostations on the
>>> lupin campus, but I run no gui, few additional applications at all
>>> (except babel, snmpd, netperf, and the other core needed daemons).  My
>>> uptimes are principally governed by power failures. I can't remember
>>> the last  "crash, crash" I had, and I do track memory leaks (none).
>>> That said, I'm painfully aware that I should probably give dd-wrt and
>>> openwrt 19.x some testing just to make sure there's no regressions,
>>> but have been reluctant to get involved again without more partners in
>>> crime, because the scars from deploying 18.x widely are only beginning
>>> to heal... and only last week did the needed babel 1.9 upgrade arrive
>>> so I can finally redeploy ipv6 universally. I fear my current
>>> reliability metrics are so good because I took down ipv6 last year....
>> my workaround with memory problems is also disabling http normally. i
>> have some of these nanostations in the field
>> 
>> just running hostapd, snmp, syslog. but anything else is disabled due
>> the oom problematics. it never was a real crash.
>> 
>> but oom. but i never played with babel. ospf etc. all working out of
>> the box based on quagga on low end devices and frr on bigger ones.
>> 
>>> 
>>> Pico:
>>> 
>>> root@pool2:~# free
>>>              total         used         free       shared      buffers
>>> Mem:         28480        23796         4684           92         1868
>>> -/+ buffers:              21928         6552
>>> Swap:            0            0            0
>>> 
>>> root@pool2:~# uptime
>>>  11:38:09 up 43 days, 21:37,  load average: 0.04, 0.03, 0.04
>>> 
>>> Same workload over here, on a wndr3800, almost exactly the same config
>>> 
>>> root@couch:~# free
>>>              total       used       free     shared    buffers     cached
>>> Mem:         60320      22872      37448         68       1960       6120
>>> -/+ buffers/cache:      14792      45528
>>> Swap:            0          0          0
>> 
>> NS2
>> 
>> root@TRO1:~# free
>> 
>>               total        used        free      shared  buff/cache   available
>> Mem:          29124       19228        3552           0        6344        7752
>> Swap:             0           0           0
> 
> It looks like you are running even less stuff than I am. And this
> machine is running with 256k bufs?
> 
>> wndr3700v4
>> 
>> root@DD-WRT:~# free
>>               total        used        free      shared  buff/cache   available
>> Mem:         125884       23048       92940           0        9896       99824
>> Swap:             0           0           0
>> root@DD-WRT:~#
>> 
>> 
>>> 
>>>> Disabling the fq part won't actually gain you much in terms of memory
>>>> usage, though, as most of it is packet memory which is already
>>>> configurable.
>>>> 
>>>> The one exception to this is the static overhead of 'struct fq_flow', of
>>>> which mac80211 currently allocates 4k. That's 300k of memory which is
>>>> currently not configurable. But that could be fixed :)
>>>> 
>>>> -Toke
>>> --
>>> 
>>> Dave Täht
>>> CTO, TekLibre, LLC
>>> http://www.teklibre.com
>>> Tel: 1-831-205-9740
>>> 



end of thread, other threads:[~2019-08-23  6:48 UTC | newest]

Thread overview: 13+ messages
     [not found] <CAA93jw4FDjYjxxStyzMu8eCz_6Ezcumg-iZaYeM3kBZ5da8MBQ@mail.gmail.com>
     [not found] ` <fc64c772-d85c-deba-a0e4-4e590cfb76ee@newmedia-net.de>
     [not found]   ` <CAA93jw4Fm=uY08A3XHWh3d=OKNfraCeYHwFGtAdPH7a51vG6WA@mail.gmail.com>
     [not found]     ` <384866b4-4c91-cf2c-c267-ee4036e5fbf7@newmedia-net.de>
     [not found]       ` <87wof7sriw.fsf@toke.dk>
     [not found]         ` <6782ec15-30eb-63b0-f54f-376c5e6b840b@newmedia-net.de>
     [not found]           ` <87tvabsp99.fsf@toke.dk>
     [not found]             ` <74bccc2b-b805-255f-b6a7-83ade9af6765@newmedia-net.de>
     [not found]               ` <87r25fsn70.fsf@toke.dk>
     [not found]                 ` <b52b087d-c21c-e190-1bc7-a06e5fe6305f@newmedia-net.de>
     [not found]                   ` <54438C64-C613-438E-9CB9-6C6D0C5EAFA0@gmail.com>
     [not found]                     ` <87sgpvflo4.fsf@taht.net>
     [not found]                       ` <87wof6rf7t.fsf@toke.dk>
     [not found]                         ` <7656FCDE-C590-4B0C-B191-B9FAC928A762@gmail.com>
     [not found]                           ` <CAA93jw4sEE_oQsX66xLkE+YUv=wM7AchfpUspC0y_Bf2nLdVOQ@mail.gmail.com>
     [not found]                             ` <5eb4c395-c718-2d28-65a7-9762cf8d5bea@newmedia-net.de>
     [not found]                               ` <47AD5102-B66F-44A5-AADE-D167ECB94A61@gmx.de>
     [not found]                                 ` <1d772664-b6cc-a528-9725-96a431032875@newmedia-net.de>
     [not found]                                   ` <87v9uqea3x.fsf@taht.net>
     [not found]                                     ` <87tvaap57q.fsf@toke.dk>
     [not found]                                       ` <CAA93jw6f0kedxwoN-ER3W1QKeg0sMxVCy6YYk_gRbrVwhD42jQ@mail.gmail.com>
     [not found]                                         ` <5bbd2b81-9846-3a7a-130c-0f59e04fd2d1@newmedia-net.de>
     [not found]                                           ` <CAA93jw4=13D-+WHLYPiV4NPqeVJwrLJe=nkr+a9D9Cqvq49pEQ@mail.gmail.com>
2019-08-22 13:22                                             ` [Make-wifi-fast] Fwd: Wifi Memory limits in small platforms Dave Taht
2019-08-22 14:59                                             ` [Make-wifi-fast] " Dave Taht
     [not found]                                             ` <dcb92eaf-928e-f909-981d-c2baf74fbc90@newmedia-net.de>
2019-08-22 17:03                                               ` Dave Taht
2019-08-22 17:37                                                 ` Sebastian Gottschall
2019-08-22 18:23                                                   ` Toke Høiland-Jørgensen
2019-08-22 18:56                                                     ` Dave Taht
2019-08-22 19:37                                                       ` [Make-wifi-fast] [Battlemesh] " Toke Høiland-Jørgensen
2019-08-22 20:10                                                         ` [Make-wifi-fast] [Cake] " Sebastian Moeller
2019-08-22 20:30                                                       ` [Make-wifi-fast] " Sebastian Gottschall
2019-08-22 23:39                                                         ` Dave Taht
2019-08-23  6:25                                                           ` Sebastian Gottschall
2019-08-23  6:48                                                           ` [Make-wifi-fast] [Cake] " Sebastian Moeller
2019-08-22 20:32                                                       ` [Make-wifi-fast] fq_codel_fast crash/lockup Sebastian Gottschall
