Re: [dns-wg] [dns-operations] Additional information about the RIPE NCC reverse DNS issue
Doug,
At 2017-03-18 18:34:25 -0700 Doug Barton <dougb@dougbarton.email> wrote:
On 03/18/2017 08:46 AM, Anand Buddhdev wrote:
Dear colleagues,
This is a follow-up to our message of Friday about issues with some reverse delegations.
After doing a thorough analysis, and with the help of ARIN staff, we found more issues with our zonelet generation code
Can you say more about the benefit of this "zonelet" system vs. ARIN simply delegating the appropriate zones to you, and you managing them like any other DNS zone?
I do appreciate you keeping the community informed about the causes of the outage, but it seems that at least part of the root cause is that you're operating what sounds like a fairly fragile system in the first place, with (de facto) insufficient validity checking.
I was at the RIPE NCC when we adopted the zonelet approach, although I haven't worked for them for over a decade.

The zonelet system was designed to allow reverse DNS for IPv4 space that was originally assigned to one RIR but was later partially migrated to other RIRs. This happened when LACNIC and AfriNIC were formed, although I think that an audit was done at the time and so space was moved around between all 5 RIRs.

The problem is that we could have a delegation like 999.in-addr.arpa going to the RIPE NCC and then 888.999.in-addr.arpa being managed by ARIN... but want 888.999.in-addr.arpa to point to the **address holder's** name servers, not **ARIN's** name servers.

So ARIN needs a way to get the information about the name servers to the RIPE NCC somehow (and RIPE NCC to LACNIC, and so on). Zonelets are used for this, which is basically just the NS records needed, probably picked up using SSH.

I think that we discussed using dynamic DNS (DDNS) for this at the time, but decided that the simplest & best solution was zonelets.

DNAME could be used, but it would involve an extra lookup for resolvers, right? (DNAME was pretty new when zonelets were adopted, and I don't know that BIND 8 supported them, which was still the most popular DNS server at that time.)

My guess is that the bugs are probably more due to ancient Perl code than an overly-complicated system for exchanging this information. Heck, it's possible that the bugs are due to MY ancient Perl code, although I really don't remember who wrote or tested the code....

Cheers,
--
Shane
On 03/19/2017 08:09 AM, Shane Kerr wrote:
Doug,
At 2017-03-18 18:34:25 -0700 Doug Barton <dougb@dougbarton.email> wrote:
On 03/18/2017 08:46 AM, Anand Buddhdev wrote:
Dear colleagues,
This is a follow-up to our message of Friday about issues with some reverse delegations.
After doing a thorough analysis, and with the help of ARIN staff, we found more issues with our zonelet generation code
Can you say more about the benefit of this "zonelet" system vs. ARIN simply delegating the appropriate zones to you, and you managing them like any other DNS zone?
I do appreciate you keeping the community informed about the causes of the outage, but it seems that at least part of the root cause is that you're operating what sounds like a fairly fragile system in the first place, with (de facto) insufficient validity checking.
I was at the RIPE NCC when we adopted the zonelet approach, although I haven't worked for them for over a decade.
The zonelet system was designed to allow reverse DNS for IPv4 space that was originally assigned to one RIR but was later partially migrated to other RIRs. This happened when LACNIC and AfriNIC were formed, although I think that an audit was done at the time and so space was moved around between all 5 RIRs.
The problem is that we could have a delegation like 999.in-addr.arpa going to the RIPE NCC and then 888.999.in-addr.arpa being managed by ARIN... but want 888.999.in-addr.arpa to point to the **address holder's** name servers, not **ARIN's** name servers.
So ARIN needs a way to get the information about the name servers to the RIPE NCC somehow (and RIPE NCC to LACNIC, and so on). Zonelets are used for this, which is basically just the NS records needed, probably picked up using SSH.
I think that we discussed using dynamic DNS (DDNS) for this at the time, but decided that the simplest & best solution was zonelets.
DNAME could be used, but it would involve an extra lookup for resolvers, right? (DNAME was pretty new when zonelets were adopted, and I don't know that BIND 8 supported them, which was still the most popular DNS server at that time.)
My guess is that the bugs are probably more due to ancient Perl code than an overly-complicated system for exchanging this information. Heck, it's possible that the bugs are due to MY ancient Perl code, although I really don't remember who wrote or tested the code....
Thank you, Shane for the explanation, which makes perfect sense.

RIPE folks, the operational answer to this problem would seem to be having ARIN implement a sanity check such that if more than N% of the information is changed in a given pass that humans need to get involved to approve the change. I had a lovely chat with John Curran about that on NANOG, which you can see starting here:

https://mailman.nanog.org/pipermail/nanog/2017-March/090626.html

Short version, they won't do anything differently unless you specifically ask them to.

We all make mistakes, and I have no doubts that y'all have done your best to find/fix the bugs that created the most recent problem. But I've used similar sanity check systems in the past with good success. Everyone makes mistakes, and there is no shame to a "belt and braces" approach to critical infrastructure like this. I hope that you'll consider it.

Doug
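For illustration only, the "more than N% changed" check described above could be sketched roughly as follows. This is not ARIN's or the RIPE NCC's code; the snapshot representation (zone name mapped to a set of name server names), the function names, and the 10% default threshold are all assumptions made for the example.

```python
from typing import Dict, FrozenSet

# Hypothetical snapshot format: delegated zone name -> set of NS targets.
Delegations = Dict[str, FrozenSet[str]]


def changed_fraction(old: Delegations, new: Delegations) -> float:
    """Fraction of zones, over the union of both snapshots, that were
    added, removed, or had their NS set changed."""
    zones = set(old) | set(new)
    if not zones:
        return 0.0
    changed = sum(1 for zone in zones if old.get(zone) != new.get(zone))
    return changed / len(zones)


def needs_human_approval(old: Delegations, new: Delegations,
                         max_fraction: float = 0.10) -> bool:
    """True when the pass changes more than max_fraction of the zones,
    in which case the update should be held for a person to review."""
    return changed_fraction(old, new) > max_fraction
```

A publishing job would call `needs_human_approval(previous, candidate)` before replacing the published data, and keep serving the previous snapshot while a held update waits for review. Counting over the union of both snapshots means that a pass which silently drops most delegations also trips the check.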
Hi,

On Sun, Mar 19, 2017 at 12:36:33PM -0700, Doug Barton wrote:
RIPE folks, the operational answer to this problem would seem to be having ARIN implement a sanity check such that if more than N% of the information is changed in a given pass that humans need to get involved to approve the change.
This makes sense (... and is exactly what our "update the list of zones on our secondary servers" system does, for funny reasons :-) )

So, +1 for adding such a check :-)

Gert Doering
-- NetMaster
--
have you enabled IPv6 on something today...?

SpaceNet AG
Joseph-Dollinger-Bogen 14
D-80807 Muenchen
Tel: +49 (0)89/32356-444
Vorstand: Sebastian v. Bomhard
Aufsichtsratsvors.: A. Grundner-Culemann
HRB: 136055 (AG Muenchen)
USt-IdNr.: DE813185279
Shane Kerr <shane@time-travellers.org> wrote:
DNAME could be used, but it would involve an extra lookup for resolvers, right?
Yes, and it would also require a change of name for the delegated zones which I suspect would be a bigger problem in practice :-)

There's some discussion and examples of using DNAME for reverse DNS in https://tools.ietf.org/html/draft-ietf-dnsop-rfc2317bis (which expired mainly because I wasn't sure how to resolve the open questions, and I lacked feedback)

Tony.
--
f.anthony.n.finch <dot@dotat.at> http://dotat.at/ - I xn--zr8h punycode
Thames, Dover, Wight, Portland, Plymouth: West or southwest 5 to 7, occasionally gale 8 at first. Moderate or rough. Rain at first, then showers. Moderate or good, occasionally poor at first.
Dear DNS working group,

Shane's explanation of the zonelet exchange system is correct. This system was adopted by the RIRs in the early days of resource transfers. The RIR with the majority of address space in a given IPv4 XXX/8 of address space is responsible for running the corresponding XXX.in-addr.arpa zone. If any of this address space is registered in another RIR, then the majority RIR needs to get delegation information from that RIR, and this is done by importing "zonelets", which are similar to zone files, and contain NS and DS records, and perhaps glue records for in-bailiwick name server names.

The original code for this at the RIPE NCC was indeed written in Perl. However, that code is not in use any more. It has since been replaced with newer code, for a variety of reasons. The newer code still produces and consumes zonelets for exchanging delegation information with other RIRs.

The zonelet system is quite simple in many ways, and I can appreciate why it was chosen back in the day. However, it is pull-based, and so delegation information takes time to propagate. In the event of an error, it similarly takes a while before correct information can be republished.

Shane mentioned the use of DNAME records, but I don't think it's the right solution for this case. DNAME records alias a name and everything below it to another name. But here, we don't quite want aliases. We just want the NS and DS records of delegations from other RIRs merged into the parent zone we operate.

We are working with the other RIRs to look at the system, to see if we can make it more robust, and perhaps faster, so that delegation information can be exchanged more quickly, and in the event of errors, also corrected more quickly.

Regards,
Anand Buddhdev
RIPE NCC
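As a rough sketch of the kind of validity checking an importer could apply when merging a zonelet into the parent zone: the real zonelet format is not shown in this thread, so the three-field "owner type rdata" line format below is an assumption, as are the function name and the allowed-type list (NS, DS, and address records for glue, following the description above).

```python
from typing import List, Tuple

# Record types a zonelet is described as carrying: NS, DS, and glue.
ALLOWED_TYPES = {"NS", "DS", "A", "AAAA"}

Record = Tuple[str, str, str]  # (owner, type, rdata)


def merge_zonelet(parent_zone: str, zonelet_text: str) -> List[Record]:
    """Parse a simplified zonelet and return its records, rejecting
    anything that does not belong in the given parent zone.

    Assumed line format (hypothetical): "<owner> <type> <rdata>",
    with ';' starting a comment and no TTL or class fields.
    """
    parent = parent_zone.lower().rstrip(".")
    records: List[Record] = []
    for raw in zonelet_text.splitlines():
        line = raw.split(";", 1)[0].strip()
        if not line:
            continue  # blank line or comment only
        fields = line.split(None, 2)
        if len(fields) != 3:
            raise ValueError("malformed line: %r" % raw)
        owner, rtype, rdata = fields
        owner = owner.lower().rstrip(".")
        rtype = rtype.upper()
        if rtype not in ALLOWED_TYPES:
            raise ValueError("unexpected record type %s" % rtype)
        if not owner.endswith("." + parent):
            raise ValueError("%s is not below %s" % (owner, parent))
        records.append((owner, rtype, rdata))
    return records
```

Raising on the first bad line, rather than skipping it, means a damaged zonelet is rejected as a whole and the previously imported data can stay in place until someone looks at it.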
Hi,

On Mon, Mar 20, 2017 at 04:12:02PM +0100, Anand Buddhdev wrote:
The zonelet system is quite simple in many ways, and I can appreciate why it was chosen back in the day. However, it is pull-based, and so delegation information takes time to propagate. In the event of an error, it similarly takes a while before correct information can be republished.
pull-based can always be sped up by sending trigger messages. Like, AXFR and NOTIFY :-)

Gert Doering
-- NetMaster
--
have you enabled IPv6 on something today...?

SpaceNet AG
Joseph-Dollinger-Bogen 14
D-80807 Muenchen
Tel: +49 (0)89/32356-444
Vorstand: Sebastian v. Bomhard
Aufsichtsratsvors.: A. Grundner-Culemann
HRB: 136055 (AG Muenchen)
USt-IdNr.: DE813185279
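The general idea of a pull loop that can be woken early by a trigger could look something like the sketch below. It only illustrates the pattern: it does not speak DNS NOTIFY or AXFR, and the class name, the `fetch` callback, and the interval parameter are made up for the example.

```python
import threading
from typing import Callable


class TriggeredPuller:
    """Pull on a fixed interval, but pull again right away when notified."""

    def __init__(self, fetch: Callable[[], None], interval_seconds: float):
        self._fetch = fetch              # hypothetical callback that does one pull
        self._interval = interval_seconds
        self._wake = threading.Event()   # set by notify() to cut the wait short
        self._stop = threading.Event()

    def notify(self) -> None:
        """Called when the publisher signals that new data is available."""
        self._wake.set()

    def stop(self) -> None:
        self._stop.set()
        self._wake.set()                 # unblock the loop so it can exit

    def run(self) -> None:
        while not self._stop.is_set():
            self._fetch()
            # Sleep until the interval elapses or a trigger arrives.
            self._wake.wait(timeout=self._interval)
            self._wake.clear()
```

The periodic pull stays in place as a fallback in case a trigger is lost, which is also how NOTIFY relates to the SOA refresh timer for ordinary zone transfers.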
participants (5)
- Anand Buddhdev
- Doug Barton
- Gert Doering
- Shane Kerr
- Tony Finch