From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Taht
To: bloat
Date: Tue, 13 Sep 2011 14:51:57 -0700
Subject: [Bloat] Fwd: bridge should flood non-IPv4-multicast ethernet frames
In-Reply-To: <20110913200027.GQ28007@angus.ind.WPI.EDU>
References: <20110913200027.GQ28007@angus.ind.WPI.EDU>
List-Id: General list for discussing Bufferbloat

I think this possibly explains why I ended up routing IPv6 everywhere
rather than bridging it.

---------- Forwarded message ----------
From: Chuck Anderson
Date: Tue, Sep 13, 2011 at 1:00 PM
Subject: bridge should flood non-IPv4-multicast ethernet frames
To: netdev@vger.kernel.org

When the bridge code grew multicast snooping capability (currently
IPv4/IGMPv2-only as I understand it), it stopped flooding non-IPv4
multicast ethernet frames.  This breaks the capability to bridge any
non-IPv4 protocols that also use multicast ethernet frames, such as
IPv6 and IS-IS, while the bridge snooping capability remains enabled
(it appears to be enabled by default, at least in the RHEL 6 vendor
kernel).  I noticed this when IPv6 neighbor discovery (ND) and router
advertisement (RA) packets weren't making it to a KVM guest via br0 on
the host, breaking IPv6 connectivity to the guest.  This type of thing
is a common bug in vendors' multicast snooping implementations, but I
was surprised to discover that Linux has the same bug.  See RFC 4541,
section 1, last paragraph, and section 2.1.2, paragraph 4:

http://tools.ietf.org/html/rfc4541.html

I believe the relevant code is in br_device.c:

        if (is_broadcast_ether_addr(dest))
                br_flood_deliver(br, skb);
        else if (is_multicast_ether_addr(dest)) {
                if (unlikely(netpoll_tx_running(dev))) {
                        br_flood_deliver(br, skb);
                        goto out;
                }
                if (br_multicast_rcv(br, NULL, skb)) {
                        kfree_skb(skb);
                        goto out;
                }

                mdst = br_mdb_get(br, skb);
                if (mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb))
                        br_multicast_deliver(mdst, skb);
                else
                        br_flood_deliver(br, skb);
        } else if ((dst = __br_fdb_get(br, dest)) != NULL)
                br_deliver(dst->dst, skb);
        else
                br_flood_deliver(br, skb);

is_multicast_ether_addr() only checks whether the lowest bit of the
first octet is 1, i.e. it matches any multicast ethernet address.  That
check alone isn't sufficient.  There also needs to be a check that the
ethernet frame is in one of the well-known formats for the particular
protocol for which snooping is supported, IPv4 being the only one
supported by Linux bridging so far.

I see a few ways to fix this:

1. IPv4 multicast always uses multicast ethernet addresses in the
format 01:00:5E:xx:xx:xx.  Insert a check that the dest address matches
01:00:5E:xx:xx:xx, and otherwise always flood the frame so that we
don't break bridging of non-IPv4-multicast frames.  Something like this
pseudocode:

        if (is_broadcast_ether_addr(dest))
                br_flood_deliver(br, skb);
        else if (is_ipv4_multicast_ether_addr(dest)) {
...

static inline int is_ipv4_multicast_ether_addr(const u8 *addr)
{
        return (addr[0] == 0x01 && addr[1] == 0x00 && addr[2] == 0x5e);
}

2. Check that the Ethertype is 0x0800 (IPv4), and if it is not, always
flood the frame so that we don't break bridging of non-IPv4-multicast
frames.

3. Do both of the above.

The key point is that IPv6 multicast frames (33:33:xx:xx:xx:xx), along
with any other ethernet multicast frames that aren't supported by the
current bridge snooping code, should always be flooded unconditionally.
IS-IS, for example, uses 01:80:C2:00:00:14 and 01:80:C2:00:00:15.

Thoughts?

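To make option 3 concrete, here is a minimal userspace sketch of the combined check, not kernel code and not a proposed patch. The names is_multicast_mac(), is_ipv4_multicast_mac(), frame_is_snoopable() and SKETCH_ETHERTYPE_IPV4 are made up for illustration, and the EtherType is assumed to be already converted to host byte order. It is also slightly stricter than the pseudocode in option 1: it requires the top bit of the fourth octet to be clear, since the RFC 1112 mapping only uses 01:00:5E:00:00:00 through 01:00:5E:7F:FF:FF.

```c
#include <stdbool.h>
#include <stdint.h>

/* EtherType for IPv4, host byte order (name is hypothetical). */
#define SKETCH_ETHERTYPE_IPV4 0x0800

/* True if the group bit (lowest bit of the first octet) is set.
 * This mirrors what the post says is_multicast_ether_addr() checks. */
static inline bool is_multicast_mac(const uint8_t *addr)
{
        return (addr[0] & 0x01) != 0;
}

/* True only for 01:00:5E:00:00:00 .. 01:00:5E:7F:FF:FF, the range
 * that IPv4 multicast groups map to (assumption: RFC 1112 mapping). */
static inline bool is_ipv4_multicast_mac(const uint8_t *addr)
{
        return addr[0] == 0x01 && addr[1] == 0x00 && addr[2] == 0x5e &&
               (addr[3] & 0x80) == 0;
}

/* Option 3: hand the frame to IPv4 snooping only if both the
 * destination MAC and the EtherType say "IPv4 multicast".
 * A caller would flood every other multicast frame. */
static inline bool frame_is_snoopable(const uint8_t *dest, uint16_t ethertype)
{
        return is_ipv4_multicast_mac(dest) &&
               ethertype == SKETCH_ETHERTYPE_IPV4;
}
```

With this predicate, an IPv6 ND frame to 33:33:00:00:00:01 or an IS-IS frame to 01:80:C2:00:00:14 is still multicast by is_multicast_mac(), but frame_is_snoopable() returns false for it, so it would take the flood path.
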
--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://the-edge.blogspot.com