From: Dave Taht
Date: Tue, 18 Sep 2018 17:32:22 -0700
To: Juliusz Chroboczek
Cc: babel-users@lists.alioth.debian.org, Make-Wifi-fast, ecn-sane@lists.bufferbloat.net
Subject: Re: [Make-wifi-fast] [Babel-users] reducing delays in wifi mcast queues
References: <87efdqcs28.wl-jch@irif.fr> <8736u6cpyu.wl-jch@irif.fr>
In-Reply-To: <8736u6cpyu.wl-jch@irif.fr>

On Tue, Sep 18, 2018 at 5:04 PM Juliusz Chroboczek wrote:
>
> > Recently I tried to deploy a few babel 1.8.2 nodes with the latest
> > openwrt, which I had to back out rapidly because I was dropping so many
> > babel packets under contention.
>
> That's interesting.  Could I please see a log?

I will be more rigorous while upgrading to 1.8.3 tomorrow. I'm not sure
what sort of log you would like. Would

echo dump | nc :1 32123

every 4 sec suit? The other log I was creating was of ip route show
every 10 sec, while collecting the usual flent stats of course. tcpdump?

The most effective thing I've done to show "evolution" has been to
take a movie of babelweb...

> > A patch to universally enable babel ecn in net.c "solves" this problem,
>
> Interesting.  AFAIK, ECN is only considered by AQM queues, so this implies
> there's a queue in the way that's dropping Babel packets.

There's fq_codel on every queue, which does FQ, and codel assumes
everything is at least moderately TCP-friendly (and/or reasonably
responsive to ECN marks).

My easy test (other than a field deployment) is to try to pump, say,
100 flent-driven TCP flows through an otherwise reliable 100Mbit link
for a few minutes.
Routes get lost, hellos get lost, and eventually the link gets cut off
from the net entirely, even if it's the only link.

I've been planning on repeating that test formally since early August;
your 1.8.3 announcement caught me at a good time.

> Perhaps this
> queue could be convinced to treat Babel packets specially without having
> to hack around it using ECN?

So this goes to a deep philosophical question also. I would not mind
if there were a setsockopt for UDP, like the existing TCP_NOTSENT_LOWAT,
to provide some backpressure.

Routing is a special case: for Babel and OSPF, adding ECN is an
option. For IS-IS, which does not run over IP, not so.

> Or perhaps, if we know which queue that is,
> we could modify Babel's packet scheduling to be more AQM friendly?

How would you describe babel's packet scheduling now?

CS6 on wifi tends to end up in the VO or VI queues.

fq_codel by itself on ethernet doesn't pay attention to diffserv.

cake has support for diffserv markings and reserves up to 25% of the
bandwidth for higher priority flows. It's harder to get it to do bad
things unless you attack it with 100 CS4-marked TCP flows...

As for being AQM friendly, a better way to put it would be being
TCP-friendly, I guess. Never put in more than you can expect to get
out. The fq_codel algorithm in the linux mac80211 stack currently
defaults to 20ms as a target local delay. So: dump packets in there at
a rate of no more than 20ms each (short-term burst of 100ms), relative
to whatever bandwidth can be achieved vs. the other flows.

Randomizing the order in which routes are sent out might help, as
might repeating critical routes (like hellos with default gateways in
them). I don't know what else.

Perhaps we need to revisit the mcast queue driver on this round of the
mac80211 work. It's just really observable now...

BTW: The OSX version of fq_codel (which has been on by default for
wifi for a version or two) uses different targets for the VO queue.
Not clear how it does mcast.
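
To make the pacing idea above a bit more concrete, here is a rough
sketch. It reads "20ms each, short-term burst of 100ms" as "at most one
packet per 20ms, with up to 100ms worth (five packets) allowed
back-to-back", which is only one possible reading. The names pace()
and shuffled() are hypothetical and are not anything in babeld.

```python
import random


def pace(arrivals_ms, interval_ms=20, burst_ms=100):
    """Return the earliest send time (in ms) for each queued packet.

    arrivals_ms must be sorted in ascending order. The steady rate is
    one packet per interval_ms; up to burst_ms worth of packets may go
    out back-to-back after an idle period. Both defaults are
    assumptions taken from the figures quoted in this thread.
    """
    tolerance_ms = burst_ms - interval_ms
    next_slot_ms = 0  # when the next packet is due at the steady rate
    send_times = []
    for now in arrivals_ms:
        # Send immediately if within the burst allowance, else wait.
        send_at = max(now, next_slot_ms - tolerance_ms)
        next_slot_ms = max(next_slot_ms, send_at) + interval_ms
        send_times.append(send_at)
    return send_times


def shuffled(routes, rng=random):
    """Return the routes in random order, leaving the input untouched."""
    order = list(routes)
    rng.shuffle(order)
    return order
```

With ten packets all queued at t=0, pace([0] * 10) gives
[0, 0, 0, 0, 0, 20, 40, 60, 80, 100]: five go out at once, and the
rest follow at one per 20ms.
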
daves-Air-3:~ d$ netstat -I en0 -qq
en0:
     [ sched: FQ_CODEL qlength: 0/128 ]
     [ pkts: 0 bytes: 0 dropped pkts: 50 bytes: 6129 ]
=====================================================
     [ pri: VO (1) srv_cl: 0x400180 quantum: 600 drr_max: 8 ]
     [ queued pkts: 0 bytes: 0 ]
     [ dequeued pkts: 2652 bytes: 272144 ]
     [ budget: 0 target qdelay: 10.00 msec update interval:100.00 msec ]
     [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ]
     [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ]
     [ flows total: 0 new: 0 old: 0 ]
     [ throttle on: 0 off: 0 drop: 0 ]
=====================================================
     [ pri: VI (2) srv_cl: 0x380100 quantum: 3000 drr_max: 6 ]
     [ queued pkts: 0 bytes: 0 ]
     [ dequeued pkts: 0 bytes: 0 ]
     [ budget: 0 target qdelay: 10.00 msec update interval:100.00 msec ]
     [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ]
     [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ]
     [ flows total: 0 new: 0 old: 0 ]
     [ throttle on: 0 off: 0 drop: 0 ]
=====================================================
     [ pri: BE (7) srv_cl: 0x0 quantum: 1500 drr_max: 4 ]
     [ queued pkts: 0 bytes: 0 ]
     [ dequeued pkts: 147577 bytes: 42979533 ]
     [ budget: 0 target qdelay: 10.00 msec update interval:100.00 msec ]
     [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ]
     [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ]
     [ flows total: 0 new: 0 old: 0 ]
     [ throttle on: 0 off: 0 drop: 0 ]
=====================================================
     [ pri: BK (8) srv_cl: 0x100080 quantum: 1500 drr_max: 2 ]
     [ queued pkts: 0 bytes: 0 ]
     [ dequeued pkts: 1312 bytes: 249257 ]
     [ budget: 0 target qdelay: 10.00 msec update interval:100.00 msec ]
     [ flow control: 0 feedback: 0 stalls: 0 failed: 0 ]
     [ drop overflow: 0 early: 0 memfail: 0 duprexmt:0 ]
     [ flows total: 0 new: 0 old: 0 ]
     [ throttle on: 0 off: 0 drop: 0 ]

>
> -- Juliusz

-- 
Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619
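
P.S. For anyone following the ECN and diffserv part of this thread,
here is a small sketch of how the two markings share one byte. The
DSCP value occupies the top six bits of the IPv6 traffic class and the
ECN field the bottom two, so CS6 with ECT(0) comes out as 0xC2. This
is not the net.c patch mentioned above. traffic_class() and
mark_socket() are hypothetical helper names, and mark_socket() assumes
a Linux IPv6 UDP socket.

```python
import socket

CS6 = 48       # DSCP class selector 6 (RFC 2474)
NOT_ECT = 0b00 # sender does not support ECN
ECT0 = 0b10    # ECN-capable transport, codepoint ECT(0) (RFC 3168)


def traffic_class(dscp, ecn):
    """Combine a 6-bit DSCP and a 2-bit ECN field into one byte."""
    if not 0 <= dscp < 64:
        raise ValueError("DSCP must fit in 6 bits")
    if not 0 <= ecn < 4:
        raise ValueError("ECN must fit in 2 bits")
    return (dscp << 2) | ecn


def mark_socket(sock, tclass):
    """Set the traffic class on an IPv6 socket (assumes Linux)."""
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_TCLASS, tclass)
```

A sender that marks its packets ECT(0) is promising to slow down when
it sees congestion marks, so setting the bits is only half of the job.
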