General list for discussing Bufferbloat
From: Dave Taht <dave.taht@gmail.com>
To: bloat <bloat@lists.bufferbloat.net>
Subject: [Bloat] Fwd: bridge should flood non-IPv4-multicast ethernet frames
Date: Tue, 13 Sep 2011 14:51:57 -0700
Message-ID: <CAA93jw5q-UET1NH3Cu15UbH6T_kqNCUdW_-yzYqPPu5nre+xTA@mail.gmail.com>
In-Reply-To: <20110913200027.GQ28007@angus.ind.WPI.EDU>

I think this may explain why I ended up routing IPv6 everywhere
rather than bridging it.


---------- Forwarded message ----------
From: Chuck Anderson <cra@wpi.edu>
Date: Tue, Sep 13, 2011 at 1:00 PM
Subject: bridge should flood non-IPv4-multicast ethernet frames
To: netdev@vger.kernel.org


When the bridge code grew multicast snooping capability (currently
IPv4/IGMPv2-only as I understand it), it stopped flooding non-IPv4
multicast ethernet frames.  This breaks the capability to bridge any
non-IPv4 protocols that also use multicast ethernet frames, such as
IPv6 and IS-IS, while the bridge snooping capability remains enabled
(it appears to be default enabled at least in the RHEL 6 vendor
kernel).  I noticed this when IPv6 neighbor discovery (ND) and router
advertisement (RA) packets weren't making it to a KVM guest via br0 on
the host, breaking IPv6 connectivity to the guest.  This type of thing
is a common bug in vendors' multicast snooping implementations, but
I was surprised to discover that Linux has the same bug.  See RFC
4541, section 1, last paragraph, and section 2.1.2, paragraph 4:

http://tools.ietf.org/html/rfc4541.html

I believe the relevant code is in br_device.c:

       if (is_broadcast_ether_addr(dest))
               br_flood_deliver(br, skb);
       else if (is_multicast_ether_addr(dest)) {
               if (unlikely(netpoll_tx_running(dev))) {
                       br_flood_deliver(br, skb);
                       goto out;
               }
               if (br_multicast_rcv(br, NULL, skb)) {
                       kfree_skb(skb);
                       goto out;
               }

               mdst = br_mdb_get(br, skb);
               if (mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb))
                       br_multicast_deliver(mdst, skb);
               else
                       br_flood_deliver(br, skb);
       } else if ((dst = __br_fdb_get(br, dest)) != NULL)
               br_deliver(dst->dst, skb);
       else
               br_flood_deliver(br, skb);

is_multicast_ether_addr() only checks whether the lowest bit of the
first octet is 1, i.e. it matches any multicast ethernet address.  That
check alone isn't sufficient.  There also needs to be a check that the
ethernet frame is in one of the well-known formats for the particular
protocol for which snooping is supported, IPv4 being the only one
supported by Linux bridging so far.
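
To illustrate why that test is too broad, here is a minimal userspace
sketch (not kernel code).  The function and array names are
hypothetical; the check itself is the group-bit test described above.
IPv4 multicast, IPv6 multicast and IS-IS destination addresses all
pass it, so the address test alone cannot tell them apart.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the test described above: the I/G (group) bit is the
 * least-significant bit of the first octet of the destination MAC.
 * Hypothetical name; not the kernel helper itself. */
static bool sketch_is_multicast(const uint8_t addr[6])
{
    return (addr[0] & 0x01) != 0;
}

/* Example destination addresses, chosen for illustration. */
static const uint8_t sketch_ipv4_mcast[6] = { 0x01, 0x00, 0x5e, 0x00, 0x00, 0x01 };
static const uint8_t sketch_ipv6_mcast[6] = { 0x33, 0x33, 0x00, 0x00, 0x00, 0x01 };
static const uint8_t sketch_isis_l1[6]    = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x14 };
static const uint8_t sketch_unicast[6]    = { 0x52, 0x54, 0x00, 0x12, 0x34, 0x56 };
```

Calling sketch_is_multicast() on the first three arrays returns true
and on the last returns false, which is exactly the ambiguity: the
snooping path is entered for protocols it knows nothing about.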

I see a few ways to fix this:

1.  IPv4 multicast always uses multicast ethernet addresses in the
format 01:00:5E:xx:xx:xx.  Insert a check that the dest address
matches 01:00:5E:xx:xx:xx, and otherwise always flood the frame so
that non-IPv4 multicast frames are still bridged.  Something like
this pseudocode:

       if (is_broadcast_ether_addr(dest))
               br_flood_deliver(br, skb);
       else if (is_ipv4_multicast_ether_addr(dest)) {

...

static inline int is_ipv4_multicast_ether_addr(const u8 *addr)
{
       return (addr[0] == 0x01 && addr[1] == 0x00 && addr[2] == 0x5e);
}


2. Check that the Ethertype is 0x0800 (IPv4), and if it is not, always
flood the frame so that non-IPv4 multicast frames are still bridged.
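
A rough userspace sketch of the Ethertype test in option 2 might look
like the following.  It assumes an untagged Ethernet II frame held in
a plain byte buffer, so the Ethertype sits at offset 12; a
VLAN-tagged frame would need extra handling.  The function and macro
names are hypothetical and this is not how the bridge code accesses
an skb.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SK_ETH_HDR_LEN    14      /* dest (6) + src (6) + type (2) */
#define SK_ETHERTYPE_IPV4 0x0800

/* Returns true only if the frame is long enough to carry an
 * Ethernet header and its type field says IPv4.
 * Assumption: untagged Ethernet II framing. */
static bool sk_frame_is_ipv4(const uint8_t *frame, size_t len)
{
    if (len < SK_ETH_HDR_LEN)
        return false;

    /* The type field is big-endian on the wire. */
    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);

    return ethertype == SK_ETHERTYPE_IPV4;
}
```

An IPv6 frame (Ethertype 0x86DD) fails this test and would therefore
be flooded rather than handed to the IPv4-only snooping logic.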

3. Do both of the above, the key point being that IPv6 multicast
frames (33:33:xx:xx:xx:xx), along with any other ethernet multicast
frames that aren't supported by the current bridge snooping code,
should always be flooded unconditionally.  IS-IS for example uses
01:80:C2:00:00:14 and 01:80:C2:00:00:15.
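
Option 3 could be sketched in userspace roughly as below.  This is
only an illustration under stated assumptions: the caller has already
established that the frame is multicast and not broadcast, the frame
is untagged Ethernet II in a plain byte buffer, and all names
(mc3_action, mc3_classify and so on) are hypothetical.  Flooding
frames that are too short to parse is an arbitrary choice made for
the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum mc3_action { MC3_FLOOD, MC3_SNOOP };

/* Same prefix test as the pseudocode in option 1. */
static bool mc3_dest_is_ipv4_mcast(const uint8_t *dest)
{
    return dest[0] == 0x01 && dest[1] == 0x00 && dest[2] == 0x5e;
}

/* Only frames that look like IPv4 multicast in both the destination
 * address and the Ethertype go to the snooping path; everything
 * else is flooded unconditionally. */
static enum mc3_action mc3_classify(const uint8_t *frame, size_t len)
{
    if (len < 14)
        return MC3_FLOOD;

    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);

    if (mc3_dest_is_ipv4_mcast(frame) && ethertype == 0x0800)
        return MC3_SNOOP;

    return MC3_FLOOD;
}
```

With this rule, frames to 33:33:xx:xx:xx:xx or 01:80:C2:00:00:14 are
always flooded, and so is a frame that has an IPv4 multicast
destination address but a non-IPv4 Ethertype.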

Thoughts?



-- 
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
http://the-edge.blogspot.com
