From: Dave Taht
To: Ketan Kulkarni
Cc: Eric Dumazet, Yuchung Cheng, cerowrt-devel
Date: Mon, 14 Jan 2013 14:50:18 -0500
Subject: Re: [Cerowrt-devel] TFO crashes cerowrt 3.7.1-1
References: <50F32981.9080404@openwrt.org>
List-Id: Development issues regarding the cerowrt test router project

On Mon, Jan 14, 2013 at 1:14 AM, Dave Taht wrote:
> I am so buried as to only
be able to do new builds of cero once a week.
>
> Can the bad behavior be duplicated on some other sort of single-core
> processor, like x86? Or merely boot up an x86 box in single-processor
> mode?
>
> I'll try to get a new release out next Sunday.

I lied. Crash bugs bother me a lot. A release of cerowrt with the BUG_ON
removed for TFO is now up at:

There are no other changes from Cerowrt-3.7.2-1. Those playing with it
should enable TFO in polipo as per this thread and also fiddle with
various settings for the gethostbyname option in polipo.

I did not look into the presumably separate DNS lookup issue, nor the
multicast issue also mentioned on this thread.

A new, cleaned-up version of the ar71xx unaligned access code arrived in
openwrt head (thx nbd!). It addresses some new stuff and leaves out some
stuff in the existing cerowrt patch set for unaligned access, notably a
bunch of ipv6 stuff that inspired the patch in the first place. I retain
concerns re the checksum code in both versions.

There were multiple other (mostly ipv6 related) changes to openwrt over
the weekend, which made pulling that stuff forward into this quick
snapshot release of cero too risky. I would also prefer that the two
differing unaligned patches be merged cleanly and pushed up to openwrt.

So hopefully the TFO portion of this bug thread is resolved, and there
are 3 other bugs left to look at separately...

>
> On Sun, Jan 13, 2013 at 8:43 PM, Ketan Kulkarni wrote:
>> Thanks Eric and Yuchung for taking care of the patch. I will test a few more
>> TFO cases as well once this patch is built in cero.
>>
>> Thanks,
>> Ketan
>>
>> On Jan 14, 2013 9:37 AM, "Eric Dumazet" wrote:
>>>
>>> Quite frankly I would just remove the BUG_ON()
>>>
>>> diff --git a/net/core/request_sock.c b/net/core/request_sock.c
>>> index c31d9e8..4425148 100644
>>> --- a/net/core/request_sock.c
>>> +++ b/net/core/request_sock.c
>>> @@ -186,8 +186,6 @@ void reqsk_fastopen_remove(struct sock *sk, struct request_sock *req,
>>>         struct fastopen_queue *fastopenq =
>>>             inet_csk(lsk)->icsk_accept_queue.fastopenq;
>>>
>>> -       BUG_ON(!spin_is_locked(&sk->sk_lock.slock) && !sock_owned_by_user(sk));
>>> -
>>>         tcp_sk(sk)->fastopen_rsk = NULL;
>>>         spin_lock_bh(&fastopenq->lock);
>>>         fastopenq->qlen--;
>>>
>>>
>>> On Sun, Jan 13, 2013 at 7:05 PM, Eric Dumazet wrote:
>>>>
>>>> Oh well yes, this doesn't quite work on !SMP.
>>>>
>>>> And this kind of bug is frequent....
>>>>
>>>> See following example :
>>>>
>>>> commit b9980cdcf2524c5fe15d8cbae9c97b3ed6385563
>>>> Author: Hugh Dickins
>>>> Date: Wed Feb 8 17:13:40 2012 -0800
>>>>
>>>>     mm: fix UP THP spin_is_locked BUGs
>>>>
>>>>     Fix CONFIG_TRANSPARENT_HUGEPAGE=y CONFIG_SMP=n CONFIG_DEBUG_VM=y
>>>>     CONFIG_DEBUG_SPINLOCK=n kernel: spin_is_locked() is then always false,
>>>>     and so triggers some BUGs in Transparent HugePage codepaths.
>>>>
>>>>     asm-generic/bug.h mentions this problem, and provides a WARN_ON_SMP(x);
>>>>     but being too lazy to add VM_BUG_ON_SMP, BUG_ON_SMP, WARN_ON_SMP_ONCE,
>>>>     VM_WARN_ON_SMP_ONCE, just test NR_CPUS != 1 in the existing VM_BUG_ONs.
>>>>
>>>>     Signed-off-by: Hugh Dickins
>>>>     Cc: Andrea Arcangeli
>>>>     Cc:
>>>>     Signed-off-by: Andrew Morton
>>>>     Signed-off-by: Linus Torvalds
>>>>
>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>> index b3ffc21..91d3efb 100644
>>>> --- a/mm/huge_memory.c
>>>> +++ b/mm/huge_memory.c
>>>> @@ -2083,7 +2083,7 @@ static void collect_mm_slot(struct mm_slot *mm_slot)
>>>>  {
>>>>         struct mm_struct *mm = mm_slot->mm;
>>>>
>>>> -       VM_BUG_ON(!spin_is_locked(&khugepaged_mm_lock));
>>>> +       VM_BUG_ON(NR_CPUS != 1 && !spin_is_locked(&khugepaged_mm_lock));
>>>>
>>>>
>>>> On Sun, Jan 13, 2013 at 1:39 PM, Felix Fietkau wrote:
>>>>>
>>>>> On 2013-01-13 7:03 PM, Eric Dumazet wrote:
>>>>> > I suspect a bug in the spin_is_locked() implementation on your arch,
>>>>> > as the socket lock should be held at this point.
>>>>> I don't think this is an arch implementation bug, this probably happens
>>>>> on all !SMP systems. See this bit from include/linux/spinlock_up.h:
>>>>>
>>>>> #define arch_spin_is_locked(lock) ((void)(lock), 0)
>>>>>
>>>>> - Felix
>>>>>

-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
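
For anyone reading this thread later, here is a minimal user-space sketch
in C of why the removed check trips on uniprocessor builds. It is not
kernel code. The demo_ names, the demo_spinlock_t type and DEMO_NR_CPUS are
hypothetical stand-ins; the only things borrowed from the thread are the
shape of the spinlock_up.h macro Felix quoted, the shape of the BUG_ON
condition Eric removed, and the NR_CPUS != 1 guard from Hugh Dickins's
commit.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's spinlock type. */
typedef struct { int slock; } demo_spinlock_t;

/* Same shape as the !SMP macro quoted from include/linux/spinlock_up.h:
 * the argument is evaluated and discarded, and the result is always 0,
 * whatever the lock's state. */
#define demo_up_spin_is_locked(lock) ((void)(lock), 0)

/* Hypothetical CPU count for a uniprocessor build. */
#define DEMO_NR_CPUS 1

/* Condition with the same shape as the removed TFO BUG_ON:
 * true means the assertion would fire. */
bool demo_tfo_check_fires(demo_spinlock_t *lock, bool owned_by_user)
{
    return !demo_up_spin_is_locked(lock) && !owned_by_user;
}

/* Condition with the same shape as the guarded VM_BUG_ON in the THP fix:
 * the lock test is skipped entirely when there is only one CPU. */
bool demo_guarded_check_fires(demo_spinlock_t *lock)
{
    return DEMO_NR_CPUS != 1 && !demo_up_spin_is_locked(lock);
}
```

With a lock marked as held (slock = 1) and owned_by_user false,
demo_tfo_check_fires() still returns true, which matches the crash
discussed above. The guarded form returns false on a one-CPU build.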