From: Ketan Kulkarni
To: Eric Dumazet, Yuchung Cheng
Cc: cerowrt-devel
Date: Mon, 14 Jan 2013 10:13:44 +0530
Subject: Re: [Cerowrt-devel] TFO crashes cerowrt 3.7.1-1
References: <50F32981.9080404@openwrt.org>

Thanks Eric and Yuchung for taking care of the patch. I will test a few more TFO cases as well once this patch is built into cero.

Thanks,
Ketan

On Jan 14, 2013 9:37 AM, "Eric Dumazet" <edumazet@google.com> wrote:
>
> Quite frankly I would just remove the BUG_ON()
>
> diff --git a/net/core/request_sock.c b/net/core/request_sock.c
> index c31d9e8..4425148 100644
> --- a/net/core/request_sock.c
> +++ b/net/core/request_sock.c
> @@ -186,8 +186,6 @@ void reqsk_fastopen_remove(struct sock *sk, struct request_sock *req,
>         struct fastopen_queue *fastopenq =
>             inet_csk(lsk)->icsk_accept_queue.fastopenq;
>
> -       BUG_ON(!spin_is_locked(&sk->sk_lock.slock) && !sock_owned_by_user(sk));
> -
>         tcp_sk(sk)->fastopen_rsk = NULL;
>         spin_lock_bh(&fastopenq->lock);
>         fastopenq->qlen--;
>
>
>
> On Sun, Jan 13, 2013 at 7:05 PM, Eric Dumazet <edumazet@google.com> wrote:
>>
>> Oh well yes, this doesn't quite work on !SMP.
>>
>> And this kind of bug is frequent....
>>
>> See the following example:
>>
>> commit b9980cdcf2524c5fe15d8cbae9c97b3ed6385563
>> Author: Hugh Dickins <hughd@google.com>
>> Date:   Wed Feb 8 17:13:40 2012 -0800
>>
>>     mm: fix UP THP spin_is_locked BUGs
>>
>>     Fix CONFIG_TRANSPARENT_HUGEPAGE=y CONFIG_SMP=n CONFIG_DEBUG_VM=y
>>     CONFIG_DEBUG_SPINLOCK=n kernel: spin_is_locked() is then always false,
>>     and so triggers some BUGs in Transparent HugePage codepaths.
>>
>>     asm-generic/bug.h mentions this problem, and provides a WARN_ON_SMP(x);
>>     but being too lazy to add VM_BUG_ON_SMP, BUG_ON_SMP, WARN_ON_SMP_ONCE,
>>     VM_WARN_ON_SMP_ONCE, just test NR_CPUS != 1 in the existing VM_BUG_ONs.
>>
>>     Signed-off-by: Hugh Dickins <hughd@google.com>
>>     Cc: Andrea Arcangeli <aarcange@redhat.com>
>>     Cc: <stable@vger.kernel.org>
>>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index b3ffc21..91d3efb 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -2083,7 +2083,7 @@ static void collect_mm_slot(struct mm_slot *mm_slot)
>>  {
>>         struct mm_struct *mm = mm_slot->mm;
>>
>> -       VM_BUG_ON(!spin_is_locked(&khugepaged_mm_lock));
>> +       VM_BUG_ON(NR_CPUS != 1 && !spin_is_locked(&khugepaged_mm_lock));
>>
>>
>>
>>
>> On Sun, Jan 13, 2013 at 1:39 PM, Felix Fietkau <nbd@openwrt.org> wrote:
>>>
>>> On 2013-01-13 7:03 PM, Eric Dumazet wrote:
>>> > I suspect a bug in the spin_is_locked() implementation on your arch, as
>>> > the socket lock should be held at this point.
>>> I don't think this is an arch implementation bug, this probably happens
>>> on all !SMP systems. See this bit from include/linux/spinlock_up.h:
>>>
>>> #define arch_spin_is_locked(lock)   ((void)(lock), 0)
>>>
>>> - Felix
>>>
>>
>
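
For anyone following the thread, here is a minimal user-space sketch of why the assertion fires on !SMP. It is not kernel code. The macro has the same shape as the UP stub Felix quoted. Everything else is an assumption made for illustration: struct fake_lock, naive_check_fires(), guarded_check_fires(), demo_up_assertions(), and a local NR_CPUS fixed at 1. The "naive" form mirrors the spin_is_locked() half of the removed BUG_ON(). The "guarded" form mirrors the NR_CPUS != 1 test from the mm commit quoted above.

```c
#include <assert.h>

/* Assumption: model a uniprocessor build by fixing NR_CPUS at 1. */
#define NR_CPUS 1

/* Same shape as the UP stub quoted from include/linux/spinlock_up.h:
 * it evaluates its argument and always reports "not locked". */
#define arch_spin_is_locked(lock) ((void)(lock), 0)

/* Hypothetical placeholder type; a UP spinlock carries no lock state. */
struct fake_lock {
    int unused;
};

/* Returns 1 when an assertion of the form BUG_ON(!spin_is_locked(l))
 * would trigger. With the UP stub this is always 1, even for a lock
 * the caller believes it holds. */
int naive_check_fires(struct fake_lock *l)
{
    return !arch_spin_is_locked(l);
}

/* Returns 1 when the guarded form
 * VM_BUG_ON(NR_CPUS != 1 && !spin_is_locked(l)) would trigger.
 * With NR_CPUS == 1 the first operand is false, so this never fires. */
int guarded_check_fires(struct fake_lock *l)
{
    return NR_CPUS != 1 && !arch_spin_is_locked(l);
}

/* Exercise both forms on a lock the caller conceptually holds. */
void demo_up_assertions(void)
{
    struct fake_lock held = { 0 };

    assert(naive_check_fires(&held) == 1);   /* false positive on UP */
    assert(guarded_check_fires(&held) == 0); /* guard suppresses it */
}
```

Calling demo_up_assertions() from a small main() should pass silently. Note that the guard only hides the false positive on UP; it does not make the check meaningful there, which is consistent with Eric's suggestion to simply drop the BUG_ON().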
