From mboxrd@z Thu Jan 1 00:00:00 1970
From: Toke Høiland-Jørgensen
To: David Miller
Cc: netdev@vger.kernel.org, cake@lists.bufferbloat.net
Date: Thu, 25 Jun 2020 22:12:07 +0200
Message-ID: <159311592714.207748.900920527922661905.stgit@toke.dk>
In-Reply-To: <159311592607.207748.5904268231642411759.stgit@toke.dk>
References: <159311592607.207748.5904268231642411759.stgit@toke.dk>
User-Agent: StGit/0.23
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Subject: [Cake] [PATCH net 1/3] sch_cake: don't try to reallocate or unshare skb unconditionally
List-Id: Cake - FQ_codel the next generation

From: Ilya Ponetayev

cake_handle_diffserv() tries to linearize mac and network header parts
of skb and to make it writable unconditionally.
In some cases it leads to full skb reallocation, which reduces
throughput and increases CPU load. Measurements of IPv4 forwarding +
NAPT were conducted on a MIPS router with a 580 MHz single-core CPU. It
appears that on kernel 4.9 skb_try_make_writable() reallocates the skb
if it was allocated in the Ethernet driver via the so-called 'build skb'
method from the page cache (this was first noticed through a strange
increase in the kmalloc-2048 slab).

Obtain the DSCP value via a read-only skb_header_pointer() call, and
keep linearization only for DSCP bleaching or ECN CE setting. And, as an
additional optimisation, skip diffserv parsing entirely if it is not
needed by the current configuration.

Fixes: c87b4ecdbe8d ("sch_cake: Make sure we can write the IP header before changing DSCP bits")
Signed-off-by: Ilya Ponetayev
[ fix a few style issues, reflow commit message ]
Signed-off-by: Toke Høiland-Jørgensen
---
 net/sched/sch_cake.c | 41 ++++++++++++++++++++++++++++++-----------
 1 file changed, 30 insertions(+), 11 deletions(-)

diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
index 60f8ae578819..cae006bef565 100644
--- a/net/sched/sch_cake.c
+++ b/net/sched/sch_cake.c
@@ -1553,30 +1553,49 @@ static unsigned int cake_drop(struct Qdisc *sch, struct sk_buff **to_free)
 
 static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash)
 {
-	int wlen = skb_network_offset(skb);
+	const int offset = skb_network_offset(skb);
+	u16 *buf, buf_;
 	u8 dscp;
 
 	switch (tc_skb_protocol(skb)) {
 	case htons(ETH_P_IP):
-		wlen += sizeof(struct iphdr);
-		if (!pskb_may_pull(skb, wlen) ||
-		    skb_try_make_writable(skb, wlen))
+		buf = skb_header_pointer(skb, offset, sizeof(buf_), &buf_);
+		if (unlikely(!buf))
 			return 0;
 
-		dscp = ipv4_get_dsfield(ip_hdr(skb)) >> 2;
-		if (wash && dscp)
+		/* ToS is in the second byte of iphdr */
+		dscp = ipv4_get_dsfield((struct iphdr *)buf) >> 2;
+
+		if (wash && dscp) {
+			const int wlen = offset + sizeof(struct iphdr);
+
+			if (!pskb_may_pull(skb, wlen) ||
+			    skb_try_make_writable(skb, wlen))
+				return 0;
+
 			ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, 0);
+		}
+
 		return dscp;
 
 	case htons(ETH_P_IPV6):
-		wlen += sizeof(struct ipv6hdr);
-		if (!pskb_may_pull(skb, wlen) ||
-		    skb_try_make_writable(skb, wlen))
+		buf = skb_header_pointer(skb, offset, sizeof(buf_), &buf_);
+		if (unlikely(!buf))
 			return 0;
 
-		dscp = ipv6_get_dsfield(ipv6_hdr(skb)) >> 2;
-		if (wash && dscp)
+		/* Traffic class is in the first and second bytes of ipv6hdr */
+		dscp = ipv6_get_dsfield((struct ipv6hdr *)buf) >> 2;
+
+		if (wash && dscp) {
+			const int wlen = offset + sizeof(struct ipv6hdr);
+
+			if (!pskb_may_pull(skb, wlen) ||
+			    skb_try_make_writable(skb, wlen))
+				return 0;
+
 			ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, 0);
+		}
+
 		return dscp;
 
 	case htons(ETH_P_ARP):