Historic archive of defunct list bismark-devel@lists.bufferbloat.net
From: d@taht.net (Dave Täht)
To: bismark-devel <bismark-devel@lists.bufferbloat.net>
Subject: [Bismark-devel] ECN bug found, fixed
Date: Tue, 15 Mar 2011 11:33:02 -0600
Message-ID: <877hc0i90x.fsf_-_@cruithne.co.teklibre.org>
In-Reply-To: <1300164166.2649.70.camel@edumazet-laptop> (Eric Dumazet's message of "Tue, 15 Mar 2011 05:42:46 +0100")


We think we've found, and are in the process of fixing, a bug in ECN
handling in Linux. I'm curious whether you had ECN turned on at all in
your initial testing, and which qdisc you were using.



Eric Dumazet <eric.dumazet@gmail.com> writes:

> On Monday, 14 March 2011 at 21:24 +0100, Eric Dumazet wrote:
>
> Removing the CC to the bloat lists for now and adding David Miller to the thread.
>
>> On Monday, 14 March 2011 at 21:55 +0200, Jonathan Morton wrote:
>> > On 14 Mar, 2011, at 9:26 pm, Dave Täht wrote:
>> > 
>> > > Over the weekend, Dan Siemons uncovered a possible bad interaction
>> > > between ECN and the default pfifo_fast qdisc in Linux.
>> > > 
>> > > http://www.coverfire.com/archives/2011/03/13/pfifo_fast-and-ecn/
>> > 
>> > This seems to be more complicated than it appears.  It looks as though
>> > Linux has re-used the LSB of the old TOS field for some "link local"
>> > flag which is used by routing.
>> > 
>> > It's not immediately obvious whether pfifo_fast is using this new
>> > interpretation though.  If it isn't, the fix should be to remove the
>> > RTO_ONLINK bit from the mask it's using on the tos field.  The other
>> > half of the mask correctly excludes the ECN bits from the field.
>> > 
>> 
>> CCing netdev, where the Linux network developers can take a look.
>> 
>> I would say that this analysis is wrong:
>> 
>> 1) ECN uses the two low-order bits of the TOS byte
>> 
>> 2) pfifo_fast uses skb->priority
>> 
>> 
>> skb->priority = rt_tos2priority(iph->tos);
>> 
>> #define IPTOS_TOS_MASK            0x1E
>> #define IPTOS_TOS(tos)            ((tos)&IPTOS_TOS_MASK)
>> 
>> static inline char rt_tos2priority(u8 tos)
>> {
>> 	return ip_tos2prio[IPTOS_TOS(tos)>>1];
>> }
>> 
>> There is no interference between the two mechanisms, unless the
>> sysadmin messed things up (skb_edit)
>> 
>> 
>
> David, it seems the second entry of ip_tos2prio is wrong:
>
> #define TC_PRIO_BESTEFFORT              0
> #define TC_PRIO_FILLER                  1
> #define TC_PRIO_BULK                    2
> #define TC_PRIO_INTERACTIVE_BULK        4
> #define TC_PRIO_INTERACTIVE             6
> #define TC_PRIO_CONTROL                 7
>
> #define TC_PRIO_MAX                     15
>
> net/ipv4/route.c:170:#define ECN_OR_COST(class) TC_PRIO_##class
>
> const __u8 ip_tos2prio[16] = {
> 	TC_PRIO_BESTEFFORT,   /* 0 : for flow without ECN */
> 	ECN_OR_COST(FILLER), /* 1 : flow with ECN */
> 	...
> };
>
>
>
>
> This means ECN-enabled flows get TC_PRIO_FILLER (what the hell is
> that?)
>
> pfifo_fast has:
>
> static const u8 prio2band[TC_PRIO_MAX+1] =
> 	{ 1, 2, 2, 2, 1, 2, 0, 0 , 1, 1, 1, 1, 1, 1, 1, 1 };
>
> So a non-ECN flow goes to band 1, while an ECN-enabled one goes to
> band 2 (!).  Thus, ECN-enabled flows risk being dropped more often
> than non-ECN flows. That's not fair...
>
> What do you think?
>
> Thanks
>
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 6ed6603..fabfe81 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -171,7 +171,7 @@ static struct dst_ops ipv4_dst_ops = {
>  
>  const __u8 ip_tos2prio[16] = {
>  	TC_PRIO_BESTEFFORT,
> -	ECN_OR_COST(FILLER),
> +	ECN_OR_COST(BESTEFFORT),
>  	TC_PRIO_BESTEFFORT,
>  	ECN_OR_COST(BESTEFFORT),
>  	TC_PRIO_BULK,
>
>
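
To make the mapping above concrete, here is a minimal userspace sketch,
not kernel code. It assumes only the constants quoted in this thread and
the first two entries of ip_tos2prio, before and after the patch; the
rest of the table is elided above, so the helper refuses any other index.
The names tos2prio_before, tos2prio_after and band_for_tos are
hypothetical and exist only for illustration. Note that IPTOS_TOS_MASK
(0x1E) covers bit 1 of the TOS byte, which is also the upper ECN bit, so
a TOS byte of 0x02 (the ECT(0) codepoint in RFC 3168) selects entry 1.

```c
#include <stdint.h>

/* Constants as quoted in the thread. */
#define TC_PRIO_BESTEFFORT 0
#define TC_PRIO_FILLER     1
#define TC_PRIO_MAX        15
#define IPTOS_TOS_MASK     0x1E
#define IPTOS_TOS(tos)     ((tos) & IPTOS_TOS_MASK)

/* pfifo_fast priority-to-band map, as quoted in the thread. */
static const uint8_t prio2band[TC_PRIO_MAX + 1] =
	{ 1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1 };

/*
 * Hypothetical two-entry tables: only entries 0 and 1 of ip_tos2prio,
 * before and after the patch. The remaining entries are elided in the
 * mail and are deliberately not reproduced here.
 */
static const uint8_t tos2prio_before[2] = { TC_PRIO_BESTEFFORT, TC_PRIO_FILLER };
static const uint8_t tos2prio_after[2]  = { TC_PRIO_BESTEFFORT, TC_PRIO_BESTEFFORT };

/*
 * Hypothetical helper: return the pfifo_fast band for a TOS byte, or -1
 * when the TOS byte selects a table entry this sketch does not model.
 */
static int band_for_tos(const uint8_t table[2], uint8_t tos)
{
	unsigned int idx = IPTOS_TOS(tos) >> 1;

	if (idx > 1)
		return -1;
	return prio2band[table[idx]];
}
```

With these definitions, band_for_tos(tos2prio_before, 0x00) gives band 1
and band_for_tos(tos2prio_before, 0x02) gives band 2, matching the
unfairness described above; with tos2prio_after both give band 1.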

-- 
Dave Taht
http://nex-6.taht.net
