For almost 48 hours now we've been experiencing sporadic DNS outages for A and CNAME records. I have mostly been tracking the issue via ping and nslookup on Windows.
Lookups for 'www' (A), 'img1' (CNAME), and 'store' (A) have come back as not found (Windows ping or nslookup says it just can't find the host), and on one online DNS tester I even saw an NXDOMAIN response.
I'm pretty sure that somewhere in the DNS 'chain' there is a cached NXDOMAIN response that is still being served after 46 hours.
I've even seen a case where, using nslookup, I looked up the CNAME record img1.example.com 10 times within 5 seconds and got both negative and positive responses from the same Verizon DNS server within a second.
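The repeated-lookup test described above can be sketched in Python roughly as follows. This is only an illustration under assumptions: `tally_lookups` is a made-up helper name, and it goes through the operating system's resolver via `socket.gethostbyname`, so unlike nslookup it cannot target one specific DNS server and may be affected by local caching.

```python
import socket
from collections import Counter

def tally_lookups(hostname, attempts=10, resolve=socket.gethostbyname):
    """Resolve hostname repeatedly and count successes vs. failures.

    `resolve` is injectable so the loop can be exercised without a network.
    """
    results = Counter()
    for _ in range(attempts):
        try:
            resolve(hostname)
            results["found"] += 1
        except OSError:
            # socket.gaierror (a subclass of OSError) is raised when the
            # name cannot be resolved.
            results["not found"] += 1
    return results
```

Calling `tally_lookups("img1.example.com")` returns a counter of "found" and "not found" results; a mix of both within a few seconds would match the flapping behaviour described above.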
Like I said, this has been happening for 48 hours now. Each 'outage' lasts only a few minutes, but it has been seen from at least 4 different geographical locations/networks.
I thought the bad record would have cleared itself out by now, but I'm hoping that by finding the offending DNS server I can at least try to contact its operators, or find out whose fault it is.
Answers to obvious questions
- DNS is currently hosted at GoDaddy and has not been changed at all
- The domain has been active on GoDaddy's hosted DNS (ns41.domaincontrol.com) for 3 years
- The problem has been observed on several different networks: Verizon DSL, Comcast cable, Verizon EVDO, and the site24x7 website
- It even happens with CNAME records pointing to Amazon S3 (i.e. 100% not a webserver problem and 100% a DNS problem)
- I'm not an expert, but the problem has been confirmed by two people who know more than I do. One thinks the most likely issue is a cached NXDOMAIN response somewhere.
Should we just wait up to 4 days before changing DNS providers? Is there a tool of some sort to trace where the DNS response is coming from and find the actual server that is caching the NXDOMAIN response, or perhaps a service that tests hundreds/thousands of DNS servers for their responses?
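To make the last question concrete, here is a rough sketch of the kind of check I have in mind: send the same query directly to a list of resolvers and record whether each one answers NOERROR or NXDOMAIN. It uses only the Python standard library and builds the DNS packet by hand. The function names (`build_query`, `parse_rcode`, `check_resolver`, `survey`) are invented for this example, and it does not validate the transaction ID or handle TCP fallback, so treat it as a starting point rather than a proper tool.

```python
import socket
import struct

# Common DNS response codes (low 4 bits of the header flags field).
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}

def build_query(name, qtype=1, txid=0x1234):
    """Build a minimal DNS query packet (qtype 1 = A, 5 = CNAME)."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # Question: encoded name, terminating zero byte, type, class IN.
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)

def parse_rcode(response):
    """Return the response code name from a raw DNS response."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")

def check_resolver(resolver_ip, name, timeout=3.0):
    """Ask one resolver over UDP and return its response code name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (resolver_ip, 53))
            response, _ = sock.recvfrom(512)
        except OSError:
            # Covers timeouts and unreachable servers.
            return "TIMEOUT/ERROR"
    return parse_rcode(response)

def survey(resolver_ips, name):
    """Query every resolver in the list and map its IP to the result."""
    return {ip: check_resolver(ip, name) for ip in resolver_ips}
```

For example, `survey(["8.8.8.8", "4.2.2.2"], "img1.example.com")` would return a dictionary mapping each resolver to its response code; the two addresses are just well-known public resolvers used for illustration, and the list could be extended with the ISP resolvers where the failures were seen.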