From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wi0-f177.google.com (mail-wi0-f177.google.com [209.85.212.177])
        (using TLSv1 with cipher RC4-SHA (128/128 bits))
        (Client CN "smtp.gmail.com", Issuer "Google Internet Authority" (verified OK))
        by huchra.bufferbloat.net (Postfix) with ESMTPS id 2407421F14B
        for ; Sat, 8 Dec 2012 09:53:59 -0800 (PST)
Received: by mail-wi0-f177.google.com with SMTP id hm2so269736wib.10
        for ; Sat, 08 Dec 2012 09:53:57 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=message-id:date:from:user-agent:mime-version:to:subject:references
        :in-reply-to:content-type;
        bh=qGXtJuZRfzh7yBZlJAi+FaGOEzbccst7vcEZetgXSww=;
        b=TR0eBtkLar3r/zy3ERZrwS70Hl4clu1lVRUDhjEn7tSK+xpb1niL0SWO8LE1pCtiuf
        cfvxeyVSgTZZmLPzEQT8x6gnGEw/54IH5mRYneSYv1iYBoYXY0ColVZ5y8fGtpdc4L95
        V92UeevU9/AHwIOxOcfsTJ7fx9W0n+LiSwhDHzrH7191/bXqY1l4PaLg47ves8JdMhnw
        IAyV9sY78XMBY+dJFkWDBy6dPtSUCYg3frorUf5Su6pls5TDziw8Pcq9PtPfBs/gI6t4
        D6vhnjFx0Wq8Y+luZdsqmx7AUjBahM5mMBW3iUozGqzi5ugqmkAfCNmkhyqh7/7lmYV7
        WyBA==
Received: by 10.180.97.137 with SMTP id ea9mr3861687wib.13.1354989237717;
        Sat, 08 Dec 2012 09:53:57 -0800 (PST)
Received: from ?IPv6:2001:630:c2:3103::2aa7:14c?
        ([2001:630:c2:3103::2aa7:14c])
        by mx.google.com with ESMTPS id bz12sm3510540wib.5.2012.12.08.09.53.56
        (version=TLSv1/SSLv3 cipher=OTHER);
        Sat, 08 Dec 2012 09:53:57 -0800 (PST)
Message-ID: <50C37EB3.2060806@gmail.com>
Date: Sat, 08 Dec 2012 17:53:55 +0000
From: Robert Bradley
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: cerowrt-devel@lists.bufferbloat.net
References:
In-Reply-To:
Content-Type: multipart/mixed; boundary="------------030605060808010802010300"
Subject: Re: [Cerowrt-devel] cerowrt-3.6.9-3 test release
X-BeenThere: cerowrt-devel@lists.bufferbloat.net
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: Development issues regarding the cerowrt test router project
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 08 Dec 2012 17:54:00 -0000

This is a multi-part message in MIME format.
--------------030605060808010802010300
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 07/12/12 08:20, Dave Taht wrote:
> Also, some new unaligned exception traps have surfaced with ipv6. They
> are not bad - only about 1200 on a 60 second rrul test with 24.5/5.5
> ceroshaper, but enough to dramatically affect ipv6 latencies, and
> affect fq_codel. (see attached)

I have now tracked down possible causes of these new traps. In the
IPv6 stack in particular, there's a tendency to access the flow label
and traffic class fields by casting the header structure to __be32*
and dereferencing it. Needless to say, this produces a lot of
unaligned access traps whenever the header is not 4-byte aligned.

There's a possible issue with setting the IPv6 ECN bits, but that
should affect the Sugarland build too. There are also a few
*(__be32*) dereferences in net/ipv6/route.c and net/ipv6/ip6_tunnel.c
that are new to the 3.6 kernel.

The attached patches should hopefully fix all of the new traps.
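For anyone who wants to see the pattern outside the kernel, here is a rough userspace sketch of what the ECN patch does. It is not kernel code: the helper names (load_u32_unaligned, store_u32_unaligned, ip6_set_ce) and the ECN_CE/ECN_MASK macros are made up for illustration, and memcpy stands in for __get_unaligned_cpu32/__put_unaligned_cpu32. The idea is simply to copy the first 32-bit word of the header out, modify it, and copy it back, rather than dereferencing a cast pointer.

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <stdint.h>
#include <string.h>     /* memcpy */

/* Hypothetical stand-ins for the kernel's INET_ECN_CE and INET_ECN_MASK. */
#define ECN_CE   3u
#define ECN_MASK 3u

/* Read a 32-bit word from an address that may not be 4-byte aligned.
 * memcpy has no alignment requirement, so this is safe on strict CPUs. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit word to an address that may not be 4-byte aligned. */
static void store_u32_unaligned(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Mark an IPv6 header as Congestion Experienced.
 * hdr points at the first byte of the header (version, traffic class,
 * flow label, in network byte order). Returns 0 and leaves the header
 * untouched if the packet is not ECN-capable, 1 otherwise. */
static int ip6_set_ce(unsigned char *hdr)
{
    uint32_t word = load_u32_unaligned(hdr);

    /* The ECN field is the low two bits of the traffic class,
     * i.e. bits 20-21 of the first word in host order. */
    if (((ntohl(word) >> 20) & ECN_MASK) == 0)
        return 0;

    word |= htonl(ECN_CE << 20);
    store_u32_unaligned(hdr, word);
    return 1;
}
```

Calling ip6_set_ce() on a header that starts one byte into a uint32_t array (so it is definitely misaligned) behaves the same as on an aligned one, which is the property the patches rely on.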
-- 
Robert Bradley

--------------030605060808010802010300
Content-Type: text/x-patch; name="924-ecn-alignment-fixes.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="924-ecn-alignment-fixes.patch"

diff --git a/include/net/inet_ecn.h b/include/net/inet_ecn.h
index 2fa1469..0b6a7fd 100644
--- a/include/net/inet_ecn.h
+++ b/include/net/inet_ecn.h
@@ -111,15 +111,21 @@ struct ipv6hdr;
 
 static inline int IP6_ECN_set_ce(struct ipv6hdr *iph)
 {
+	__be32 dsfield;
 	if (INET_ECN_is_not_ect(ipv6_get_dsfield(iph)))
 		return 0;
-	*(__be32*)iph |= htonl(INET_ECN_CE << 20);
+	dsfield = __get_unaligned_cpu32((__be32*)iph);
+	dsfield |= htonl(INET_ECN_CE << 20);
+	__put_unaligned_cpu32(dsfield, (__be32*)iph);
 	return 1;
 }
 
 static inline void IP6_ECN_clear(struct ipv6hdr *iph)
 {
-	*(__be32*)iph &= ~htonl(INET_ECN_MASK << 20);
+	__be32 dsfield;
+	dsfield = __get_unaligned_cpu32((__be32*)iph);
+	dsfield &= ~htonl(INET_ECN_MASK << 20);
+	__put_unaligned_cpu32(dsfield, (__be32*)iph);
 }
 
 static inline void ipv6_copy_dscp(unsigned int dscp, struct ipv6hdr *inner)

--------------030605060808010802010300
Content-Type: text/x-patch; name="925-ipv6-alignment-fixes.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="925-ipv6-alignment-fixes.patch"

diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 19e7aa0..6cbd2b7 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -492,7 +492,7 @@ int datagram_recv_ctl(struct sock *sk, struct msghdr *msg, struct sk_buff *skb)
 		put_cmsg(msg, SOL_IPV6, IPV6_TCLASS, sizeof(tclass), &tclass);
 	}
 
-	if (np->rxopt.bits.rxflow && (*(__be32 *)nh & IPV6_FLOWINFO_MASK)) {
+	if (np->rxopt.bits.rxflow && (__get_unaligned_cpu32((__be32 *)nh) & IPV6_FLOWINFO_MASK)) {
 		__be32 flowinfo = __get_unaligned_cpu32((__be32 *)nh) & IPV6_FLOWINFO_MASK;
 		put_cmsg(msg, SOL_IPV6, IPV6_FLOWINFO, sizeof(flowinfo), &flowinfo);
 	}
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 5b2d63e..b6689e3 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -221,7 +221,9 @@ int ip6_xmit(struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
 	if (hlimit < 0)
 		hlimit = ip6_dst_hoplimit(dst);
 
-	*(__be32 *)hdr = htonl(0x60000000 | (tclass << 20)) | fl6->flowlabel;
+	__put_unaligned_cpu32(htonl(0x60000000 | (tclass << 20))
+			      | fl6->flowlabel,
+			      (__be32*)hdr);
 
 	hdr->payload_len = htons(seg_len);
 	hdr->nexthdr = proto;
@@ -272,7 +274,7 @@ int ip6_nd_hdr(struct sock *sk, struct sk_buff *skb, struct net_device *dev,
 	skb_put(skb, sizeof(struct ipv6hdr));
 	hdr = ipv6_hdr(skb);
 
-	*(__be32*)hdr = htonl(0x60000000);
+	__put_unaligned_cpu32(htonl(0x60000000), (__be32*)hdr);
 
 	hdr->payload_len = htons(len);
 	hdr->nexthdr = proto;
@@ -1635,8 +1637,9 @@ int ip6_push_pending_frames(struct sock *sk)
 	skb_reset_network_header(skb);
 	hdr = ipv6_hdr(skb);
 
-	*(__be32*)hdr = fl6->flowlabel |
-		     htonl(0x60000000 | ((int)np->cork.tclass << 20));
+	__put_unaligned_cpu32(fl6->flowlabel |
+		     htonl(0x60000000 | ((int)np->cork.tclass << 20)),
+		     (__be32*)hdr);
 
 	hdr->hop_limit = np->cork.hop_limit;
 	hdr->nexthdr = proto;
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 9a1d5fe..455ea7f 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -997,7 +997,7 @@ static int ip6_tnl_xmit2(struct sk_buff *skb,
 	skb_push(skb, sizeof(struct ipv6hdr));
 	skb_reset_network_header(skb);
 	ipv6h = ipv6_hdr(skb);
-	*(__be32*)ipv6h = fl6->flowlabel | htonl(0x60000000);
+	__put_unaligned_cpu32(fl6->flowlabel | htonl(0x60000000), (__be32*)ipv6h);
 	dsfield = INET_ECN_encapsulate(0, dsfield);
 	ipv6_change_dsfield(ipv6h, ~INET_ECN_MASK, dsfield);
 	ipv6h->hop_limit = t->parms.hop_limit;
@@ -1103,9 +1103,9 @@ ip6ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	dsfield = ipv6_get_dsfield(ipv6h);
 	if (t->parms.flags & IP6_TNL_F_USE_ORIG_TCLASS)
-		fl6.flowlabel |= (*(__be32 *) ipv6h & IPV6_TCLASS_MASK);
+		fl6.flowlabel |= (__get_unaligned_cpu32((__be32 *) ipv6h) & IPV6_TCLASS_MASK);
 	if (t->parms.flags & IP6_TNL_F_USE_ORIG_FLOWLABEL)
-		fl6.flowlabel |= (*(__be32 *) ipv6h & IPV6_FLOWLABEL_MASK);
+		fl6.flowlabel |= (__get_unaligned_cpu32((__be32 *) ipv6h) & IPV6_FLOWLABEL_MASK);
 	if (t->parms.flags & IP6_TNL_F_USE_ORIG_FWMARK)
 		fl6.flowi6_mark = skb->mark;
 
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 3e350eb..c2787a1 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1117,7 +1117,7 @@ void ip6_update_pmtu(struct sk_buff *skb, struct net *net, __be32 mtu,
 	fl6.flowi6_flags = 0;
 	fl6.daddr = iph->daddr;
 	fl6.saddr = iph->saddr;
-	fl6.flowlabel = (*(__be32 *) iph) & IPV6_FLOWINFO_MASK;
+	fl6.flowlabel = __get_unaligned_cpu32((__be32 *) iph) & IPV6_FLOWINFO_MASK;
 
 	dst = ip6_route_output(net, NULL, &fl6);
 	if (!dst->error)
@@ -1145,7 +1145,7 @@ void ip6_redirect(struct sk_buff *skb, struct net *net, int oif, u32 mark)
 	fl6.flowi6_flags = 0;
 	fl6.daddr = iph->daddr;
 	fl6.saddr = iph->saddr;
-	fl6.flowlabel = (*(__be32 *) iph) & IPV6_FLOWINFO_MASK;
+	fl6.flowlabel = __get_unaligned_cpu32((__be32 *) iph) & IPV6_FLOWINFO_MASK;
 
 	dst = ip6_route_output(net, NULL, &fl6);
 	if (!dst->error)

--------------030605060808010802010300--