On Mon, Mar 23, 2015 at 3:31 PM, John Knight <John.Knight@belkin.com> wrote:

Hi,

 

We use dnsmasq 2.55 in our Linksys routers.  We have generally had few problems with dnsmasq, but recently one of our customers reported a failure that did not recover.


I have seen a DNS failure over IPv6 in dnsmasq like this (2-3 days of uptime, then spiraling off into an unresponsive loop) as recently as 2.72. Since IPv6 has now been rolled out universally across Comcast's services and this is (to my knowledge) the only crash bug dnsmasq has anymore, I have been meaning for some time to get the 2.73rc1 release candidate up and running full time in a mixed IPv4/IPv6 environment and to pound it flat with queries. I hope to deploy that candidate in the lab in a week or so. More eyeballs would be helpful in finally tracking it down. What I typically do is pound through the Alexa top 1 million, repeatedly, and also run namebench; other tests are feasible. I have some hope that the edns0 bugfix in this release candidate actually fixes it.
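
A query loop of this kind could be sketched roughly as follows. This is not the actual test harness, only an illustration: it assumes the host list has already been loaded into a Python list, and it goes through the system resolver via `socket.getaddrinfo`, so it only exercises dnsmasq if the test machine's DNS server is the router. The names `resolve_both`, `hammer`, and `slow_after` are made up for this example.

```python
import socket
import time
from collections import Counter


def resolve_both(name):
    """Look up a name through the system resolver."""
    # AF_UNSPEC lets the resolver return both IPv4 and IPv6 addresses.
    return socket.getaddrinfo(name, None, socket.AF_UNSPEC, socket.SOCK_STREAM)


def hammer(names, resolve=resolve_both, rounds=1, slow_after=2.0):
    """Resolve every name `rounds` times and tally ok / slow / failed lookups.

    `slow_after` (seconds) is an arbitrary threshold chosen for illustration.
    """
    stats = Counter()
    for _ in range(rounds):
        for name in names:
            start = time.monotonic()
            try:
                resolve(name)
            except OSError:
                # socket.gaierror (lookup failure) is a subclass of OSError.
                stats["failed"] += 1
                continue
            elapsed = time.monotonic() - start
            stats["slow" if elapsed > slow_after else "ok"] += 1
    return stats
```

Calling `hammer(names, rounds=3)` returns a `Counter` of outcomes. A sudden jump in `failed` or `slow` after days of clean runs would be the symptom to look for. A local caching resolver on the test machine may absorb repeated queries, so that would need to be disabled or bypassed for the load to reach the router.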

The five years of dnsmasq releases since 2.55 have brought many bug fixes (see the release notes) and upgrades (notably DNSSEC), so it is well worth upgrading your userbase even if a fix for this bug remains difficult to find.

Folks on your side of the firmware might also benefit from seeing Jim Gettys give his "insecurity in home embedded devices" talk, filmed here:

http://cyber.law.harvard.edu/events/luncheon/2014/06/gettys

and updated considerably since, with new data:

https://gettys.wordpress.com/2014/10/06/bufferbloat-and-other-challenges/
 

 

On the customer’s router, the maximum number of DHCP clients is 50.  The customer happens to be a hotel, experienced a high usage rate, and exceeded the 50-client DHCP limit.  As expected, the 50 active clients received their IP addresses and those beyond 50 did not.  However, the customer noticed that DNS stopped working when this happened.  CPU usage went to 50% (top reported 49.4% for dnsmasq).  This may be 100% of a single CPU, as this is a dual-CPU router.  The environment is mixed IPv4 and IPv6.  I believe we use dnsmasq only for DNS on IPv6, while IPv4 uses both DHCP and DNS from dnsmasq.  The customer goes on to say that DHCP renewals were not being handled either.  The customer has also added that they sometimes see the issue with only 30 or so active DHCP users.  Apparently it requires a good number of users before it happens, and it does not occur immediately; sometimes it takes a few days or more.
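
The "50% may be 100% of a single CPU" reading can be checked with a line of arithmetic. The sketch below assumes that top on this router reports each process as a share of total capacity across both CPUs, which the report does not confirm; `per_core_percent` is a name made up for this example.

```python
def per_core_percent(total_percent, cores):
    """Convert a whole-system CPU percentage into a share of one core.

    Assumes the tool normalizes against all cores combined.
    """
    return total_percent * cores


# Figures from the report: 49.4% logged by top on a dual-CPU router.
share = per_core_percent(49.4, 2)
print(f"{share:.1f}% of one core")
```

Under that assumption, the logged figure corresponds to one core being almost fully occupied, which is what a busy loop in a single thread would look like.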

 

I looked through some of your release notes and noticed similarities with some bug reports, but did not see an exact match.  Does any of this make sense to you?  We tried reproducing this in our lab and were unsuccessful in simulating the scenario.  We realize that our dnsmasq version is pretty old and that a lot of fixes may have been applied since version 2.55.  If there is a good chance that the issues seen have been fixed in a later release, we will consider upgrading to a newer version.  If you could comment on the likelihood of a fix already being made, I would appreciate it.  Since we cannot reproduce the problem, we will likely have to apply a newer version (hopefully very stable) and have the customer try it, something we would prefer to avoid unless we have some confidence that the problem may already be fixed in the newer release.

 

Thank you for your time.  I look forward to hearing from you.

 

Best Regards and thanks,

 

John Knight

 

JOHN KNIGHT 
Senior Software Engineer 

Belkin International 

O +1 949 238 4543
M +1 949 351 1020 
Belkin-WeMo-Linksys

 

 


_______________________________________________
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss




--
Dave Täht
Let's make wifi fast, less jittery and reliable again!

https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb