Increased Query Load on Root Name Servers
[apologies for duplicates]
Dear colleagues,
The RIPE NCC experienced an increased query load on K-root and other root name servers. Please find some analysis and graphs on RIPE Labs describing this increase:
http://labs.ripe.net/Members/wnagele/increased-query-load-on-root-name-serve...
Please note that we did not see any noticeable degradation of any Internet services caused by this.
Kind Regards,
Mirjam Kuehne
RIPE NCC
[apologies for duplicates]
Dear colleagues,
We did some more analysis of the recent increase in query load on K-root and other root name servers. Please read on RIPE Labs:
http://labs.ripe.net/Members/wnagele/analysis-of-increased-query-load-on-roo...
Kind Regards,
Mirjam Kuehne
RIPE NCC
On 07/11/2011 07:02, Mirjam Kuehne wrote:
[apologies for duplicates]
Dear colleagues,
We did some more analysis of the recent increase in query load on K-root and other root name servers. Please read on RIPE Labs:
http://labs.ripe.net/Members/wnagele/analysis-of-increased-query-load-on-roo...
This analysis is interesting from the traffic standpoint, but doesn't seem to answer one of the questions that I had, which is what caused the sudden increase? Historically this kind of thing has happened in the case of a misconfiguration for the name service for a popular domain, but (unless I missed it, and if so apologies) the question of, "Was <domain> misconfigured?" isn't answered in this paper.
I'm not (necessarily) asking you to "name and shame" the domain in question, but I still believe that it's an interesting question on several levels.
Meanwhile congrats to the K staff for handling this issue so adroitly.
Doug
--
Nothin' ever doesn't change, but nothin' changes much.
-- OK Go
Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price. :) http://SupersetSolutions.com/
On 12/07/2011 02:41, Doug Barton wrote:
On 07/11/2011 07:02, Mirjam Kuehne wrote:
[apologies for duplicates]
Dear colleagues,
We did some more analysis of the recent increase in query load on K-root and other root name servers. Please read on RIPE Labs:
http://labs.ripe.net/Members/wnagele/analysis-of-increased-query-load-on-roo...
This analysis is interesting from the traffic standpoint, but doesn't seem to answer one of the questions that I had, which is what caused the sudden increase? Historically this kind of thing has happened in the case of a misconfiguration for the name service for a popular domain, but (unless I missed it, and if so apologies) the question of, "Was <domain> misconfigured?" isn't answered in this paper.
Hi Doug,
We don't have all the answers, but it appears not to be related to a misconfigured zone. The zone looked (and still looks) like this:
<domain>.com.      7200 IN SOA ns1.<nsdomain>. root.ns1.<domain>.com. 20091027 28800 600 604800 86400
<domain>.com.      300  IN A   <ipv4_1>
<domain>.com.      300  IN A   <ipv4_2>
<domain>.com.      7200 IN NS  ns1.<nsdomain>.
<domain>.com.      7200 IN NS  ns2.<nsdomain>.
<domain>.com.      7200 IN NS  ns3.<nsdomain>.
<domain>.com.      7200 IN NS  ns4.<nsdomain>.
www.<domain>.com.  300  IN A   <ipv4_1>
www.<domain>.com.  300  IN A   <ipv4_2>
<domain>.com.      7200 IN SOA ns1.<nsdomain>. root.ns1.<domain>.com. 20091027 28800 600 604800 86400
As mentioned in the article, we have several indications that this was caused by a botnet. It is unlikely this was a reflector attack with spoofed source addresses, as there are some 60,000 unique source IPs per hour in the queries for this specific domain. For targeted spoofing I'd expect this number to be very low; for random spoofing I'd expect it to be far higher.
If you have any clue or indication on things we could further investigate, let us know, here or on RIPE Labs.
best regards,
Emile Aben
RIPE NCC
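The spoofing argument above rests on counting distinct source addresses per hour for queries under one domain. The following is a minimal sketch of that kind of count, not the RIPE NCC's actual tooling. It assumes a hypothetical input of (unix_timestamp, source_ip, qname) tuples already extracted from a capture; the function name, the example.com names, and the 192.0.2.x addresses are placeholders for illustration only.

```python
from collections import defaultdict


def unique_sources_per_hour(records, suffix):
    """Count distinct source IPs per UTC hour for queries at or below `suffix`.

    `records` is an iterable of (unix_ts, src_ip, qname) tuples; this input
    shape is an assumption made for the sketch.
    """
    suffix = suffix.lower().rstrip(".")
    buckets = defaultdict(set)
    for ts, src, qname in records:
        name = qname.lower().rstrip(".")
        # Match the domain itself or any name beneath it, on a label boundary.
        if name == suffix or name.endswith("." + suffix):
            hour_start = int(ts) // 3600 * 3600
            buckets[hour_start].add(src)
    return {hour: len(ips) for hour, ips in sorted(buckets.items())}


# Hypothetical sample records (documentation addresses and names).
records = [
    (1309278521, "192.0.2.1", "www.example.com."),
    (1309278522, "192.0.2.1", "example.com"),
    (1309278523, "192.0.2.2", "WWW.EXAMPLE.COM"),
    (1309278524, "192.0.2.3", "notexample.com"),
    (1309282200, "192.0.2.4", "www.example.com"),
]
per_hour = unique_sources_per_hour(records, "example.com")
```

Read this way, a handful of sources per hour would point toward targeted spoofing, and a count approaching the number of queries would point toward random spoofing; a figure such as 60,000 per hour sits between the two.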
On 07/12/2011 00:20, Emile Aben wrote:
We don't have all the answers, but it appears not to be related to a misconfigured zone
Thank you for satisfying my idle curiosity. :) I did not mean to imply that your report was in any way deficient at describing what you think the problem was actually caused by.
My curiosity about this particular issue was raised for 2 reasons, one being (as I said previously) history of previous incidents. The other is given that if this were a DDOS attempt it's a rather weak one (on several levels) I can't help finding that unlikely. (Which again, is not a criticism of your analysis, merely a disturbing lack of pieces falling neatly into previously-known patterns.)
I did note this from your scrubbed zone file:
<domain>.com. 7200 IN NS ns1.<nsdomain>.
<domain>.com. 7200 IN NS ns2.<nsdomain>.
<domain>.com. 7200 IN NS ns3.<nsdomain>.
<domain>.com. 7200 IN NS ns4.<nsdomain>.
Are we to conclude from that that <nsdomain> is different from <domain>.com? If so, and <nsdomain> is misconfigured somehow, that would start to look more like misconfiguration patterns that we've seen in the past; particularly if <nsdomain> is not in COM, and therefore the COM zone has no glue for those hostnames.
I also note that 2 hours seems to be a ridiculously short TTL for NS records, which would seem to put a little more weight on the "possible misconfiguration" side of the balance. One could imagine a moderately popular game site receiving the CN equivalent of being slashdotted, and previously-painless minor misconfigurations suddenly causing much larger problems.
hth,
Doug
On 13/07/2011 02:43, Doug Barton wrote:
On 07/12/2011 00:20, Emile Aben wrote:
We don't have all the answers, but it appears not to be related to a misconfigured zone
Thank you for satisfying my idle curiosity. :) I did not mean to imply that your report was in any way deficient at describing what you think the problem was actually caused by. My curiosity about this particular issue was raised for 2 reasons, one being (as I said previously) history of previous incidents. The other is given that if this were a DDOS attempt it's a rather weak one (on several levels) I can't help finding that unlikely. (Which again, is not a criticism of your analysis, merely a disturbing lack of pieces falling neatly into previously-known patterns.)
I agree that this is a bit strange; my guess would be that this was a test of capabilities of some kind. It is not too strange to pick the root servers as a target, since they are relatively well instrumented.
I did note this from your scrubbed zone file:
<domain>.com. 7200 IN NS ns1.<nsdomain>.
<domain>.com. 7200 IN NS ns2.<nsdomain>.
<domain>.com. 7200 IN NS ns3.<nsdomain>.
<domain>.com. 7200 IN NS ns4.<nsdomain>.
Are we to conclude from that that <nsdomain> is different from <domain>.com? If so, and <nsdomain> is misconfigured somehow, that would
Yes, it was a different domain, not in COM. We asked folks that operate COM and they didn't see the same query-storm for this domain though. If these were all 'normal' resolvers dealing with a misconfigured zone, I'd expect them to follow the delegation chain. Also, when spot-checking some 20 source IPs for these queries, we didn't find that these did any queries to K-root other than for things in <domain>.com. But again, we don't have the definitive answer and are not excluding any possible explanation, so thanks for inquiring deeper into this.
start to look more like misconfiguration patterns that we've seen in the past; particularly if <nsdomain> is not in COM, and therefore the COM zone has no glue for those hostnames.
I also note that 2 hours seems to be a ridiculously short TTL for NS records, which would seem to put a little more weight on the "possible misconfiguration" side of the balance. One could imagine a moderately popular game site receiving the CN equivalent of being slashdotted, and previously-painless minor misconfigurations suddenly causing much larger problems.
I just looked at the query load for www.<domain>.com on 20110628, and before 16:28 UTC (0:28 China Standard Time) we have 2 queries for this domain, then it all starts:
#queries  timestamp
       1  1309252434
       1  1309274472
    8603  1309278521
    9630  1309278522
   11277  1309278523
   14123  1309278524
   12271  1309278525
   12457  1309278526
   12118  1309278527
   12369  1309278528
   12234  1309278529
   12402  1309278530
   12202  1309278531
   12469  1309278532
   12138  1309278533
   12149  1309278534
... (continues to be in the range of 10-12k queries per second for a while)
So either the misconfiguration started at around 16:28 UTC, or this wasn't a misconfiguration. The third possibility, already misconfigured plus CN-slashdotted, I think is not impossible but unlikely, both because it was past midnight at the ASes that were a major source of queries, and because of the very sudden increase in load.
regards,
Emile Aben
RIPE NCC
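The per-second counts above can be checked against the stated wall-clock times with a few lines of Python. This is only a sketch: the first five (queries, timestamp) pairs are copied from the message, while the function names and the threshold of 1000 queries per second used to mark the onset are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=8))  # China Standard Time is UTC+8

# (queries, unix timestamp) pairs copied from the listing in the message.
SAMPLE = [
    (1, 1309252434),
    (1, 1309274472),
    (8603, 1309278521),
    (9630, 1309278522),
    (11277, 1309278523),
]


def find_onset(samples, threshold=1000):
    """Return the first timestamp whose per-second count reaches `threshold`.

    The threshold is an assumed cut-off, not a value from the analysis.
    """
    for count, ts in sorted(samples, key=lambda s: s[1]):
        if count >= threshold:
            return ts
    return None


def fmt(ts, tz=timezone.utc):
    """Format a unix timestamp as a wall-clock string in the given zone."""
    return datetime.fromtimestamp(ts, tz).strftime("%Y-%m-%d %H:%M:%S")


onset = find_onset(SAMPLE)
print(fmt(onset), "UTC")       # 2011-06-28 16:28:41 UTC
print(fmt(onset, CST), "CST")  # 2011-06-29 00:28:41 CST
```

The jump from 1 query to more than 8,000 within a single second is what makes a gradual popularity surge a poor fit for these numbers.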
On 07/13/2011 02:32, Emile Aben wrote:
Yes, it was a different domain, not in COM. We asked folks that operate COM and they didn't see the same query-storm for this domain though. If these were all 'normal' resolvers dealing with a misconfigured zone, I'd expect them to follow the delegation chain. Also, when spot-checking some 20 source IPs for these queries, we didn't find that these did any queries to K-root other than for things in <domain>.com.
Interesting stuff! Thanks for once again indulging my curiosity. I always find it interesting when more data makes a problem muddier instead of clearer. :)
Doug
participants (3): Doug Barton, Emile Aben, Mirjam Kuehne