From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-x144.google.com (mail-lf1-x144.google.com [IPv6:2a00:1450:4864:20::144]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by lists.bufferbloat.net (Postfix) with ESMTPS id 8ACDD3B2A4 for ; Fri, 8 Nov 2019 20:22:50 -0500 (EST) Received: by mail-lf1-x144.google.com with SMTP id y186so47860lfa.1 for ; Fri, 08 Nov 2019 17:22:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=wPXyqZBACPt2T8JabE4s0igf2vH/Qf9iXz5fOc/g31A=; b=KaCfKXPf9tclQGg3xbw0RP8zIY19/2N7+HspE9hY+ndU+h4disr4tT5Bf0aNZolhSQ /mkmKrGdP45uwLYAo7glcfUp5zsMVCkXcwzx3Jdqt10N4Cvw7tyFA8ogAB9j7MnlTSZb MyajoVYU6YLhXDNS38lDuVj2l6W0TCRRFvJfXmAJEiM+qMpcov7zMPMRQFRXof1Z3kHH FbI9yoL0lCtsztv2jCASaiyumAdWzv3p1LwNo3t+zbFfwyOqx/dzVdwWYJdANEnkQIba kj0Ga/10Bl9C62V8e9LwreSMqBg90IH55OGLfaxJYXPqvrNRvJl31RH4+pN+yV3jFFQo w+dg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=wPXyqZBACPt2T8JabE4s0igf2vH/Qf9iXz5fOc/g31A=; b=cpNlwNSLpqENNPTSpQcMD7HzAOtsofQ9gOGgUcqF5RUP+W1HRE43zPuKU12VJ3HP43 Y+OZmHbZnGDeq/pLxd7kewchEeOBaIVO0czi5goxybKF32c2QCaAclmBJ4nZwmRZP/zh c5L6Rpa9NmE4p5kB2GfhKbWfD/83Tsv9z90SKMKH02yOy9uAPlSQE5v6kI0OmOdwk8xC mtBPxwnVgm/EsNYeFCb5HPYW3VlQ08Vsp4v2DhH6cCJ3uMUTb06HyX/AR+5wkWcCPDFw sxqKybc2OeSouWap+syB6XARkbYbgtB9wS2bea4727OBMBAn2Ye27Ud0bpOEClXU4oXV nQyw== X-Gm-Message-State: APjAAAWpOgpZe+K5RHjd9f17ZiibwEqSjS/EiWBnXMT/IQAoMupFr3Lb 9Tj8eiKufz/Lufwrdp74xaQjAfngGIgkEKjgr7OjCA== X-Google-Smtp-Source: APXvYqzcFLo69QaprJxtW7QMFzk0LEtt8oG4AUIf8h9IGzZQ4qgLpbLfpusrfQ5D8khH8QL1egEqTvs1kOD48rXrM1Q= X-Received: by 2002:a19:5010:: with SMTP id e16mr2851614lfb.49.1573262569148; Fri, 08 Nov 2019 17:22:49 -0800 (PST) MIME-Version: 1.0 References: <157182473951.150713.7978051149956899705.stgit@toke.dk> <157182474287.150713.12867638269538730397.stgit@toke.dk> <1a2eb096119c9029e67caf797564d6511c8803a7.camel@sipsolutions.net> <87a796fxgd.fsf@toke.dk> <874kzefwt3.fsf@toke.dk> <300bf0146db6c0d5890699b3911d35174d28c9c0.camel@sipsolutions.net> In-Reply-To: <300bf0146db6c0d5890699b3911d35174d28c9c0.camel@sipsolutions.net> From: Kan Yan Date: Fri, 8 Nov 2019 17:22:38 -0800 Message-ID: To: Johannes Berg Cc: =?UTF-8?B?VG9rZSBIw7hpbGFuZC1Kw7hyZ2Vuc2Vu?= , linux-wireless@vger.kernel.org, Make-Wifi-fast , ath10k@lists.infradead.org, John Crispin , Lorenzo Bianconi , Felix Fietkau , Rajkumar Manoharan , Kevin Hayes Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Subject: Re: [Make-wifi-fast] [PATCH v6 3/4] mac80211: Implement Airtime-based Queue Limit (AQL) X-BeenThere: make-wifi-fast@lists.bufferbloat.net X-Mailman-Version: 2.1.20 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Nov 2019 01:22:50 -0000 It is most likely just insufficient locking. active_txq_lock is per AC, can't protect local->aql_total_pending_airtime against racing conditions: void ieee80211_sta_update_pending_airtime(...) { spin_lock_bh(&local->active_txq_lock[ac]); ... local->aql_total_pending_airtime -=3D tx_airtime; ... spin_unlock_bh(&local->active_txq_lock[ac]); } After changing it to atomic_t, no more aql_total_pending_airtime underflow so far :). Using atomic operation should also help reduce CPU overhead due to lock contention, as ieee80211_sta_update_pending_airtime() is often called from the tx completion routine triggered by interrupts, often in a different core than where __ieee80211_schedule_txq() is running. I will post a new version a bit later if the test goes well. Regards, Kan On Fri, Nov 8, 2019 at 3:17 AM Johannes Berg wr= ote: > > On Fri, 2019-11-08 at 12:10 +0100, Toke H=C3=B8iland-J=C3=B8rgensen wrote= : > > > Right, bugger. I was thinking maybe there's a case where skbs can be > > cloned (and retain the tx_time_est field) and then released twice? > > They could be cloned, but I don't see how that'd be while *inside* the > stack and then they get reported twice - unless the driver did something > like that? > > I mean, TCP surely does that for example, but it's before we even get to > mac80211. > > > Or > > maybe somewhere that steps on the skb->cb field in some other way? > > Couldn't find anything obvious on a first perusal of the TX path code, > > but maybe you could think of something? > > No, sorry. But I also didn't actually look at the driver at all. > > > Otherwise I guess we'll be forced to go and do some actual, > > old-fashioned debugging ;) > > :) > > johannes >