Development issues regarding the cerowrt test router project
From: Dave Taht <dave.taht@gmail.com>
To: Maciej Soltysiak <maciej@soltysiak.com>
Cc: cerowrt-devel@lists.bufferbloat.net
Subject: Re: [Cerowrt-devel] TFO crashes cerowrt 3.7.1-1
Date: Sun, 13 Jan 2013 22:11:34 -0800
Message-ID: <CAA93jw6gUP-nLCNa81Z_dKM=1nGZhYn15iuZGJtR7pnn+n+qvQ@mail.gmail.com>
In-Reply-To: <CAMZR1YBs0w1r9T1S7A-Uin34tGMrY3y7oh0WqiZZ-SaXTeQMNA@mail.gmail.com>

This is a different issue than TFO, so I am taking the TFO folks off the Cc list.

On Fri, Jan 4, 2013 at 12:42 PM, Maciej Soltysiak <maciej@soltysiak.com> wrote:
> I am seeing something strange here, with polipo related to TFO but also DNS.

I have had polipo's internal DNS resolver mess up on multiple occasions,
exactly along the lines you describe. There is a bug for it in the
cerowrt bug database, as best I recall.

I have never tracked down why it happens.

> When I just took 3.7.1-1 and set my windows 7 laptop to use gw.home.lan:8123
> as http proxy it didn't work. What I observed was:
> A) after quite a while polipo's response to browser was 504 Host
> www.osnews.com lookup failed: Timeout
> b) this error in ssh console: Host osnews.com lookup failed: Timeout
> (131072)
> c) Disabling TFO by adding option useTCPFastOpen 'false' to config 'polipo'
> 'general' works around the problem
> d) Alternatively, you can keep TFO enabled in polipo but change option
> 'dnsUseGethostbyname' from 'reluctantly' to 'true' (!)
> This is very weird, because TFO is TCP and the DNS queries fired off by
> polipo are UDP:
> root@OpenWrt:/tmp/log# tcpdump -n -v -vv -vvv -x -X -s 1500 -i lo
> 20:21:56.160245 IP (tos 0x0, ttl 64, id 50129, offset 0, flags [DF], proto

No, it's not weird. uClibc and polipo interact here in some way that
nobody has pinned down. It has always seemed to me that it might be a bug
in polipo's internal DNS resolver on MIPS...

> UDP (17), length 60)
> 127.0.0.1.47304 > 127.0.0.1.53: [bad udp cksum 0xfe3b -> 0xd17f!] 55396+ A?

The bad checksum issue probably doesn't matter.
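
As a rough check on that, the sketch below recomputes the UDP checksum
for the first A query from the hex bytes in the quoted dump. The helper
name ones_complement_sum and the variable names are illustrative, not
taken from tcpdump or polipo.

```python
import struct

def ones_complement_sum(data):
    """16-bit one's-complement sum of data (padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

# First query packet (IP header + UDP header + DNS payload), copied from
# the hex columns of the quoted tcpdump output.
PACKET_HEX = """
4500 003c c3d1 4000 4011 78dd 7f00 0001
7f00 0001 b8c8 0035 0028 fe3b d864 0100
0001 0000 0000 0000 0377 7777 066f 736e
6577 7303 636f 6d00 0001 0001
"""
packet = bytes.fromhex("".join(PACKET_HEX.split()))

src, dst = packet[12:16], packet[16:20]   # IPv4 source and destination
udp = packet[20:]                         # assumes a 20-byte IP header (IHL = 5)
udp_len = struct.unpack("!H", udp[4:6])[0]

# Pseudo-header: source, destination, zero byte, protocol 17 (UDP), UDP length.
pseudo = src + dst + struct.pack("!BBH", 0, 17, udp_len)
pseudo_sum = ones_complement_sum(pseudo)

# Full checksum: complement of the sum over pseudo-header plus the UDP
# datagram with its checksum field set to zero.
udp_zeroed = udp[:6] + b"\x00\x00" + udp[8:]
expected = 0xFFFF ^ ones_complement_sum(pseudo + udp_zeroed)
on_wire = struct.unpack("!H", udp[6:8])[0]

print("on wire  0x%04x" % on_wire)
print("pseudo   0x%04x" % pseudo_sum)
print("expected 0x%04x" % expected)
```

The expected value comes out as 0xd17f, the same value tcpdump reports,
and the on-wire value 0xfe3b equals the sum over the pseudo-header alone.
That would be consistent with the sender leaving the checksum to be
completed later (checksum offload) on the loopback path, although that
reading is an inference and not something confirmed in this thread.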

However, an actual tcpdump capture file would be useful, so that we can
look at the format of the DNS query.
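
Until a capture file is available, the query can at least be decoded by
hand from the hex dump. A minimal sketch, assuming the 32 bytes after the
IP and UDP headers of the first quoted packet are the complete DNS
message. parse_dns_query is a hypothetical helper, not part of polipo,
and it does not handle name compression, which a plain query like this
one does not use.

```python
import struct

# DNS payload of the first quoted query: the bytes after the 20-byte IP
# header and the 8-byte UDP header.
QUERY_HEX = (
    "d864 0100 0001 0000 0000 0000 0377 7777 066f 736e "
    "6577 7303 636f 6d00 0001 0001"
)
payload = bytes.fromhex(QUERY_HEX.replace(" ", ""))

def parse_dns_query(msg):
    """Decode the fixed header and the first question of a DNS query."""
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!6H", msg[:12]
    )
    labels = []
    pos = 12
    while True:
        length = msg[pos]          # length byte of the next label
        pos += 1
        if length == 0:            # zero-length label ends the name
            break
        labels.append(msg[pos:pos + length].decode("ascii"))
        pos += length
    qtype, qclass = struct.unpack("!2H", msg[pos:pos + 4])
    return {
        "id": ident,
        "flags": flags,
        "qdcount": qdcount,
        "ancount": ancount,
        "nscount": nscount,
        "arcount": arcount,
        "name": ".".join(labels),
        "qtype": qtype,
        "qclass": qclass,
        "length": pos + 4,         # bytes consumed by header + question
    }

query = parse_dns_query(payload)
for key, value in query.items():
    print(key, value)
```

On these bytes the fields come out as ID 55396, flags 0x0100 (recursion
desired), one question for www.osnews.com, type 1 (A), class 1 (IN),
which matches what tcpdump printed. At this level the query looks well
formed, so a capture of the failing case would still be needed to see
where the resolver goes wrong.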

> www.osnews.com. (32)

> 0x0000: 4500 003c c3d1 4000 4011 78dd 7f00 0001 E..<..@.@.x.....
> 0x0010: 7f00 0001 b8c8 0035 0028 fe3b d864 0100 .......5.(.;.d..
> 0x0020: 0001 0000 0000 0000 0377 7777 066f 736e .........www.osn
> 0x0030: 6577 7303 636f 6d00 0001 0001 ews.com.....
> 20:21:56.160319 IP (tos 0x0, ttl 64, id 50130, offset 0, flags [DF], proto
> UDP (17), length 60)
> 127.0.0.1.47304 > 127.0.0.1.53: [bad udp cksum 0xfe3b -> 0xd164!] 55396+
> AAAA? www.osnews.com. (32)
> 0x0000: 4500 003c c3d2 4000 4011 78dc 7f00 0001 E..<..@.@.x.....
> 0x0010: 7f00 0001 b8c8 0035 0028 fe3b d864 0100 .......5.(.;.d..
> 0x0020: 0001 0000 0000 0000 0377 7777 066f 736e .........www.osn
> 0x0030: 6577 7303 636f 6d00 001c 0001 ews.com.....
> 20:21:56.169942 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP
> (17), length 123)
> 127.0.0.1.53 > 127.0.0.1.47304: [bad udp cksum 0xfe7a -> 0x5f73!] 55396 q:
> A? www.osnews.com. 1/2/0 www.osnews.com. [29m3s] A 74.86.31.159 ns:
> osnews.com. [29m3s] NS ns2.swelter.net., osnews.com. [29m3s] NS
> ns1.swelter.net. (95)
> 0x0000: 4500 007b 0000 4000 4011 3c70 7f00 0001 E..{..@.@.<p....
> 0x0010: 7f00 0001 0035 b8c8 0067 fe7a d864 8180 .....5...g.z.d..
> 0x0020: 0001 0001 0002 0000 0377 7777 066f 736e .........www.osn
> 0x0030: 6577 7303 636f 6d00 0001 0001 c00c 0001 ews.com.........
> 0x0040: 0001 0000 06cf 0004 4a56 1f9f c010 0002 ........JV......
> 0x0050: 0001 0000 06cf 0011 036e 7332 0773 7765 .........ns2.swe
> 0x0060: 6c74 6572 036e 6574 00c0 1000 0200 0100 lter.net........
> 0x0070: 0006 cf00 0603 6e73 31c0 40 ......ns1.@
> 20:21:56.173901 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP
> (17), length 135)
> 127.0.0.1.53 > 127.0.0.1.47304: [bad udp cksum 0xfe86 -> 0x8ecb!] 55396 q:
> AAAA? www.osnews.com. 1/2/0 www.osnews.com. [54m44s] AAAA
> 2607:f0d0:1002:62::3 ns: osnews.com. [29m3s] NS ns1.swelter.net.,
> osnews.com. [29m3s] NS ns2.swelter.net. (107)
> 0x0000: 4500 0087 0000 4000 4011 3c64 7f00 0001 E.....@.@.<d....
> 0x0010: 7f00 0001 0035 b8c8 0073 fe86 d864 8180 .....5...s...d..
> 0x0020: 0001 0001 0002 0000 0377 7777 066f 736e .........www.osn
> 0x0030: 6577 7303 636f 6d00 001c 0001 c00c 001c ews.com.........
> 0x0040: 0001 0000 0cd4 0010 2607 f0d0 1002 0062 ........&......b
> 0x0050: 0000 0000 0000 0003 c010 0002 0001 0000 ................
> 0x0060: 06cf 0011 036e 7331 0773 7765 6c74 6572 .....ns1.swelter
> 0x0070: 036e 6574 00c0 1000 0200 0100 0006 cf00 .net............
> 0x0080: 0603 6e73 32c0 4c ..ns2.L
> This is the only DNS traffic I saw during the attempts. The tcpdumps have
> udp bad checksum but when I disabled TFO in polipo, the UDP where still bad
> checksum but they worked.

I hesitate to draw a connection between TFO and the DNS failures. What
I saw was that polipo would work for a while and then start failing on
DNS traffic. Like you, my workaround was to use gethostbyname
(which unfortunately clobbers performance).

As fond as I am of split-TCP solutions, I never poked into this further
at the time...

It's probably a really simple off-by-one error in the DNS code inside
polipo. Perhaps a packet capture will get us closer. Is there an
active mailing list for polipo?


> Really weird.
> p.s. UPNP still works for port forwarding negotiation as it did in 3.6.11-4
> I still couldn't get the UPNP/SSDP broadcasts (udp to 239.255.255.250) to
> being forwarded between se00 and sw00/sw10. Last time it worked was ~3.3.8.

OK, yet another issue.

The routing cache got eliminated between 3.3 and 3.6, and there were
all sorts of changes to it over the last 6 releases that have been
bothersome.

Or perhaps I did something stupid regarding IGMP. (Is it even on?)

> I'm starting not to question why it doesn't work, I'm starting to wonder why
> it did work then ;-)
> Regards,
> Maciej
> On Fri, Jan 4, 2013 at 6:33 PM, Dave Taht <dave.taht@gmail.com> wrote:
>>
>> On Fri, Jan 4, 2013 at 9:27 AM, Eric Dumazet <edumazet@google.com> wrote:
>> > Sorry, could you give us a copy of the panic stack trace ?
>>
>> I will get a serial console up on a wndr3800 by sunday. (sorry, just
>> landed in california, am in disarray)
>>
>> The latest dev build of cero for the wndr3800 and wndr3700v2 is at:
>>
>> http://snapon.lab.bufferbloat.net/~cero2/cerowrt/wndr/3.7.1-1/
>>
>> --
>> Dave Täht
>>
>> Fixing bufferbloat with cerowrt:
>> http://www.teklibre.com/cerowrt/subscribe.html
>> _______________________________________________
>> Cerowrt-devel mailing list
>> Cerowrt-devel@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>
>



-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
