#
# $Id: DNSmigration,v 1.7 2008/04/27 17:38:31 jim Exp $
#

Recommended DNS Administration Procedure for Domain Name Migration

Fernando Garcia

Abstract

The configuration and maintenance of DNS zones offer many degrees of
freedom and thus several opportunities for making mistakes. One of
the most complex and error prone situations that can arise is moving
a domain name from one set of name servers to another. This can
happen for a variety of reasons: technical, administrative and
procedural. These problems can be even more awkward when the domain
name concerns reverse lookups of IP addresses: for instance when a
renumbering exercise is under way. One common reason for renumbering
is when an organisation changes its Internet Service Provider. An
orderly sequence of changes has to be carefully co-ordinated to
ensure the minimum disruption to important services such as those
offered by email and web servers.

Since this is not a situation which network or systems
administrators need to handle very often, it is painfully easy to
make mistakes when undertaking the migration process. These mistakes
can have serious consequences for the services that depend on DNS
service.
This document is offers a step by step guide to system and network
     ^^^^^^^^^^^^^^^^^^
==> document offers
administrators, so they can have a problem free migration with full
continuity of all services.

Although the procedures described here do not require in-depth
knowledge of the DNS protocol and its implementations, it is
necessary to have some understanding of DNS fundamentals, notably
name server configuration. The examples shown are specific to BIND
but should be readily applied to other DNS implementations.

1. Conventions used in this document

Domain names used in this document are for explanatory purposes only
and should not be expected to lead to useful information in real
life [RFC2606].

2. Slave Server Migration

The simplest form of name server migration is to move from one slave
(secondary) server to another. In this case, the main effort is
co-ordinating the reconfiguration of the name servers. For this
example a new slave name server, ns1.example.com,
is added for the example.com zone.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
==> rather "is going to replace slave.example.com"

2.1 Make new server authoritative for the zone

For BIND, this is simply a matter of adding a new zone{} statement
to the new name server's configuration file /etc/named.conf. [It is
assumed that this file has already been created and a minimal but
properly configured name server has been set up and is running.] A
suitable zone statement would resemble:

    zone "example.com" {
        type slave;
        file "slave-zones/example.com";
        masters { 10.1.2.3; };
    };

This instructs the new name server to be a slave server for the
example.com zone and answer authoritatively for it. The server will
check the zone's SOA record on the master server at IP address
10.1.2.3 and transfer the zone whenever it is updated on the master
server. A copy of the transferred zone will be stored in the file
slave-zones/example.com.

Once /etc/named.conf has been updated, the configuration file should
be checked with BIND9's named-checkconf utility. This should ensure
the file contains no syntax errors or other mistakes which could
leave the name server in an undefined or unknown state. Some form of
version control should also be used at this point: /etc/named.conf
should be committed to the organisation's preferred version control
system, typically CVS or Subversion.

The name server should now be forced to read the updated
/etc/named.conf. This would normally be done with an rndc reconfig:

    % rndc reconfig

The name server's logs should be checked to ensure that it has
picked up the new configuration information and successfully
transferred the example.com zone from the master server at 10.1.2.3.
The obvious difficulties that could occur at this point should be
readily identified from the name server's logs: some form of file
system access permission issues; connectivity problems; a
non-authoritative master server; or incorrect name server access
controls on the master server. A discussion on how to resolve these
problems is out of scope for this document.

2.2 Check the new server is authoritative for the zone

Once the new slave server is answering authoritatively for
example.com, this should be verified with a DNS query utility such
as dig:

    % dig @10.9.8.7 example.com soa +norecurse
           ^^^^^^^^
==> 10.40.5.2 ? Which server is queried and which server is
    responding? Update text below accordingly...

    ; <<>> DiG 9.4.2 <<>> @10.40.5.2 example.com
                           ^^^^^^^^^
    ;; global options:  printcmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26413
    ;; flags: qr aa ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2

    ;; QUESTION SECTION:
    ;example.com.                   IN      SOA

    ;; ANSWER SECTION:
    example.com.            3600    IN      SOA     master.example.com. hostmaster.example.com. 2007032700 86400 7200 3600000 172800

    ;; AUTHORITY SECTION:
    example.com.            3600    IN      NS      master.example.com.
    example.com.            3600    IN      NS      slave.example.com.

    ;; ADDITIONAL SECTION:
    master.example.com.     3600    IN      A       10.1.2.3
    slave.example.com.      3600    IN      A       10.11.12.13

    ;; Query time: 4 msec
    ;; SERVER: 10.9.8.7#53(10.40.5.2)
               ^^^^^^^^    ^^^^^^^^^
    ;; WHEN: Sun Mar 27 21:47:15 2007
    ;; MSG SIZE  rcvd: 131

Note that the new slave server's IP address (10.9.8.7) is queried
directly instead of dig defaulting to whatever name server would be
used for routine DNS queries. The value of the returned SOA record
should be identical to that found on the existing authoritative
servers for the zone, master.example.com and slave.example.com.

It is also important to check that the AA (Authoritative Answer) bit
is set in the response from the new slave server. This is indicated
by the "aa" entry in the flags header shown in the above output from
dig.

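When several servers have to be verified, the check for the "aa" flag
can be scripted. The following is a minimal sketch in Python, assuming
the dig output has already been captured as text and contains a
";; flags:" header line like the one shown above. The function names
header_flags and is_authoritative are illustrative only and are not
part of dig or BIND.

```python
import re


def header_flags(dig_output):
    """Return the set of flags found on the ';; flags:' line of dig output."""
    # Allow leading spaces or tabs in case the captured output was indented.
    match = re.search(r"^[ \t]*;; flags:([^;]*);", dig_output, re.MULTILINE)
    if match is None:
        raise ValueError("no ';; flags:' line found in dig output")
    return set(match.group(1).split())


def is_authoritative(dig_output):
    """True if the AA (Authoritative Answer) bit was set in the response."""
    return "aa" in header_flags(dig_output)


if __name__ == "__main__":
    sample = ";; flags: qr aa ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1\n"
    print(sorted(header_flags(sample)))
    print(is_authoritative(sample))
```

A response without "aa" in its flags, for example one relayed by a
resolver, would make is_authoritative return False.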
The +norecurse option is used to clear the RD bit
(Recursion Denied)
           ^^^^^^
==> Desired
on the query. This shouldn't be necessary, especially when querying
up-to-date name servers. However some DNS implementations such as
BIND8 sometimes erroneously set the AA bit on answers. This may
arise when the server also acts as a resolver. It can pass on the AA
bit in the answer from a remote authoritative server when it
resolves the query even though it was not authoritative for that
data. If the new slave server suffers from this broken behaviour, it
can lead the operator to assume that the new slave server is
correctly configured with authoritative data even when it is not.
Querying the new server with the RD bit cleared should ensure that
the AA bit is only set on responses from servers that
geunuinely are authoritative.
^^^^^^^^^^^^^^
==> are genuinely

If possible, the next check would be to query the new name server
from different locations on the network. This could identify any
connectivity problems or firewall issues that might prevent general
access to the new slave server for DNS lookups. It may be advisable
to check that EDNS0 queries work properly: some firewalls and
middleware boxes are known to erroneously prevent these queries and
responses. The +bufsize option to dig will generate EDNS0 queries.
For example:

    % dig @10.9.8.7 example.com soa +bufsize=1024 +norecurse

2.3 Make the new slave server visible

The next step is to make the newly-added slave server visible to the
DNS. This is a two stage process. First the zone is updated. Then
the delegation is updated, if necessary.

2.3.1 Updating the zone

This is straightforward. An NS record for the new slave server is
added to the example.com zone. If an address record (A or AAAA
record) is needed for the new slave server, it should be added at
this point. When the name of the new slave server is within the
example.com zone, its address records must be added to the
example.com zone.
In this example, an address record for ns1.example.com is needed
because this name lives in the example.com zone. Before the update,
the zone file on the master server would resemble:

    example.com.            IN  NS  master.example.com.
    example.com.            IN  NS  slave.example.com.
    ...
    master.example.com.     IN  A   10.1.2.3
    slave.example.com.      IN  A   10.11.12.13

After the update, the NS and
glue records would look like:
^^^^
==> it is not "glue", it's in-zone data! Instead of "glue", rather
    "A/AAAA"

    example.com.            IN  NS  master.example.com.
    example.com.            IN  NS  slave.example.com.
    example.com.            IN  NS  ns1.example.com.
    ...
    master.example.com.     IN  A   10.1.2.3
    slave.example.com.      IN  A   10.11.12.13
    ns1.example.com.        IN  A   10.9.8.7

Since the example.com zone has been updated, the zone's SOA record
would be given a new serial number. The modified zone file would
presumably be committed to the version control system. BIND9's
named-checkzone should also be used to check for errors in the
updated zone file:

    master% named-checkzone example.com example.com

If all is well, the next step would be to make the master server
load the new zone file with rndc:

    master% rndc reload example.com

The master server's logs should be checked at this point. These
should show that the server has successfully loaded the updated
example.com zone. The logs should also indicate that it has sent out
NOTIFY messages to the slave servers and that those slave servers
have picked up the new version of the zone with a zone transfer.

Ultra-cautious DNS administrators will use dig at this point to
query the
master server
^^^^^^^^^^^^^
==> master and slave servers
to ensure that the correct data
is now being served:
^^
==> "are" ?

    % dig @10.1.2.3 example.com ns

Aside from mistakes when updating the example.com zone, likely
errors at this stage will concern possible connectivity problems
such as a routing or network fault or perhaps an incorrect name
server or firewall access control list.
As before, the logs should indicate what went wrong and resolution
of those faults is beyond the scope of this document.

With BIND 9.4, it is also possible to make an authoritative name
server send NOTIFY messages with rndc:

    master% rndc notify example.com

It is also possible to make a slave server schedule an immediate SOA
serial number check on the master server without waiting for the
zone's refresh interval to be reached:

    slave% rndc refresh example.com

A slave server can be forced to transfer the zone from the master
server irrespective of the value of the zone's SOA serial number.
This tends to be used when the master and slave servers have
inconsistent serial numbers caused by administrative error.

    slave% rndc retransfer example.com

The notify, refresh and retransfer features offered by rndc are not
needed for the process described here. They are just explained in
case a DNS administrator makes a mistake and needs an easy way of
correcting that error.

2.3.2 Update the delegation

The new name server, ns1.example.com, is now answering queries for
the example.com zone and is listed in its NS RRset. However the job
is not yet done. The parent zone, in this case .com, needs to be
updated. The parent zone contains delegation information for
example.com and this has to reflect the NS records and any relevant
glue records that are present in the child zone, example.com.

How this update gets performed depends on the parent zone. In the
case of .com, this involves the registrar sending an EPP transaction
to the registry to update its database. Most TLDs operate on this
model. In general registrars provide a web-based interface to allow
their customers to update their delegations, renew domain name
registrations, amend contact data and so on. Some registries offer
other mechanisms such as email-based text templates. As will be seen
later, RIRs use email-based templates for updating delegation
information for reverse zones assigned to their LIRs.
However, these updates can also be done through web-based systems
like RIPE NCC's LIR portal service.

Since there are so many options, it is impractical to describe them
in detail here. In simple terms, the DNS administrator can log in to
some web page and select the option that allows delegation
information to be updated. At this point the name of the new server,
ns1.example.com, is added.
Its IP address will have to be added as a
       ^^^^^^^
glue record too. The .com zone needs to advertise the IP address of
     ^^^^^^
ns1.example.com even though this under the example.com zone cut.
                              ^^+is
==> Comment: if ns1.example.com. has got an IPv6 address, update the
    paragraph above accordingly.

After the update has been entered into the registry database, the
new DNS data get added to the parent zone and propagated to its name
servers. The time for this change to take effect varies. Some
registries do this immediately or within 10 to 15 minutes. Others
generate a new zone file every few hours or perhaps just once or
twice a day. DNS administrators need to take account of these
propagation delays when moving name servers for a zone.

2.4 Delete the old slave server
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
==> I would suggest to rename this section as:
    "Remove the old slave server from the name servers list"

Once the parent zone has been updated and the new delegation
information propagated to its name servers, the old name server can
be removed. The process is essentially the reverse of the steps
above. First, the NS record for the old server, slave.example.com,
is removed from the example.com zone on the master server. Version
control and checks as described earlier should be carried out. The A
record for slave.example.com should be removed at this point.
However if that name is used elsewhere -- say as the target of an MX
or CNAME record -- the A record should remain.

With the NS record deleted from the example.com zone, the delegation
should be updated.
The procedure outlined in 2.3.2 would be followed, except that the
NS record for slave.example.com and any relevant glue records are
removed this time.

It is important to ensure that the glue records for a retired name
server are removed from the parent. If this is not done,
discrepancies can arise. For instance, the DNS administrator for
example.com adds a new A or AAAA record for slave.example.com but
there's a stale glue record for slave.example.com in the .com zone
which points at the old IP address of the now long gone name server.

2.5 Switch off the old name server
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
==> Would suggest to rename this section as:
    "Stop serving the DNS zone on the old slave server"

The final step of the process is to
switch off the old name server or
^^^^^^^^^^^^^^^^
reconfigure it to stop serving the example.com zone.

------------------------------------------------------------------------
==> Generally speaking, the real action is just "reconfiguring the old
    slave to stop serving the zone". Any further action (such as
    switching off the server) is specific to some contexts (for
    example, if the slave was serving only that zone... and not being
    used as a recursive name server... And yes, even if such recursive
    functionality is discouraged on an authoritative nameserver, it is
    not forbidden).

    So I would rather prefer rewriting the sentence above as:

    "The final step of the process is to reconfigure the old server to
    stop serving the example.com zone. The server may even be switched
    off if it does not serve other DNS zones."
------------------------------------------------------------------------

This should not be done immediately. First of all, other name
servers may still be caching the old NS and address records for
slave.example.com. These may not expire from the caches for some
time, as determined by the TTL values in either the example.com or
.com zones for slave.example.com. It is possible of course to reduce
these TTLs so that the old server can be
switched off sooner. However this will require further changes
^^^^^^^^^^^^
to the zone and its parent delegation: for instance to reduce the
TTL for the NS and any address records for the old name server. This
introduces extra steps into the migration process and may not be
worth the effort. In the final analysis, the old name server
shouldn't be
switched off after the migration for at least the TTL value for the
^^^^^^^^^^^^
server that may have been cached anywhere.
==> Same remarks as to "switch off" vs "stop serving the zone"

Even after a reasonable transition period of a few days, it is
always possible for the old name server to still receive queries for
example.com. The most likely cause of this is forwarding name
servers that have been configured with the IP address of the old
name server. Checking the name server's query logs will indicate if
the old server receives such queries. If it does, the logs will
identify the source of these queries. The DNS administrator can then
arrange for these systems to be fixed or at least investigate why
they are still sending queries to the wrong name server.

It would be prudent to ensure that no harm would result from
shutting down the old name server or stopping it from serving the
example.com zone before taking that course of action.

Removing the zone is simply the reverse of the procedure described
in 2.1 above. However this applies to the old slave server,
slave.example.com, and not the new one, ns1.example.com, that is now
in use. The zone{} statement for example.com in /etc/named.conf on
slave.example.com is removed and the usual version control process
and configuration file checks with named-checkconf applied. The name
server is made to re-read its configuration file with rndc and the
server's logs checked to ensure the change has been carried out.
Finally, any old copies of the zone file can be removed if
necessary.

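Checking the old server's query logs for remaining example.com
traffic can be scripted. The following is a minimal sketch in Python.
It assumes log lines of the general form
"client 192.0.2.1#53124: query: example.com IN SOA"; the exact query
log format differs between BIND versions, so the regular expression
is an assumption that would need adjusting to the local logs, and the
function name sources_for_zone is hypothetical.

```python
import re
from collections import Counter

# Assumed line shape: "... client <address>#<port>...: query: <name> <class> <type>"
QUERY_RE = re.compile(
    r"client (?P<addr>[0-9a-fA-F.:]+)#\d+.*?query: (?P<name>\S+) "
)


def sources_for_zone(log_lines, zone):
    """Count queries per client address for names at or below `zone`."""
    zone = zone.rstrip(".").lower()
    counts = Counter()
    for line in log_lines:
        match = QUERY_RE.search(line)
        if match is None:
            continue  # not a query log line in the assumed format
        name = match.group("name").rstrip(".").lower()
        if name == zone or name.endswith("." + zone):
            counts[match.group("addr")] += 1
    return counts


if __name__ == "__main__":
    lines = [
        "client 192.0.2.1#53124: query: example.com IN SOA",
        "client 192.0.2.1#53125: query: www.example.com IN A",
        "client 198.51.100.7#40000: query: example.net IN A",
    ]
    for addr, count in sources_for_zone(lines, "example.com").most_common():
        print(addr, count)
```

Any address that keeps appearing in the result is a candidate for a
forwarder or resolver that is still configured with the old server's
IP address.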
2.6 Documentation

Once the new slave server or servers are in use, the organisation's
DNS documentation should be updated. This should reflect the details
of the slave servers: where they are located; what software they
run; contact details for whoever is responsible for them;
information about change procedures, change windows and any service
level agreements that are in place.

2.7 Observations

Although this procedure may be over-cautious, it does ensure that
stable DNS service is maintained throughout the procedure. First,
the new server is configured to serve the zone and then tested. This
makes sure it can communicate with the zone's master server, there
are no connectivity or firewall issues and that the new server is
performing satisfactorily. It is only at this point that a go/no go
decision is made about adding the new server to the zone's NS RRset.
Further checks can then be made to ensure the server is properly
handling queries from other name
servers on the network. If anything goes wrong, the NS record can be
^^^^^^^
==> and clients...
removed. Everything then continues as normal on the existing DNS
infrastructure for the zone.

Once the new server is handling queries satisfactorily, the
delegation information in the parent zone is updated. At this point
the new slave server is fully activated.

The old slave server can be removed once it no longer receives
queries for the zone. This can be verified by checking its query
logs. Other name servers may be holding on to stale delegation
information that is still to expire from their caches. Or they may
have been configured to forward to the old slave server. Deleting
that zone from the slave is simply a matter of removing the
corresponding zone{} statement from its named.conf file once it is
considered safe for the slave server to no longer serve that zone.
Another approach might be to switch off the slave server,
provided of course it is not serving other zones or
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
being queried by stub resolvers.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
==> This remark is important, that's why I proposed to insert it
    earlier (above) in the text.

Although this procedure could be carried out in parallel whenever
several slave servers have to be changed, this is probably unwise.
It is simpler and less error-prone for the servers to be migrated
one at a time rather than in bulk. That way, there is less
co-ordination to be done at any given moment, so there is less
chance of something going wrong or being overlooked. On the other
hand, this can mean that changing several name servers takes longer
than might be desirable.

In principle, the same procedure could be followed for moving a
zone's master server. However there are some additional steps that
need to be taken for that and these are discussed in the next
section.

3 Moving a master server

The safest way to move the master server for a zone is to configure
a new name server at the new location. Perhaps the easiest way to do
this is to instantiate the new server with the configuration data --
zone files, named.conf, version control logs, ticketing system
references -- from the backups/archives of the current master
server.

Once the new master server is set up, it should be checked that it
is answering authoritatively for example.com and that it has the
same data that is found on the existing master server for the zone.
This cannot be
one by transferring the zone from both servers and comparing
^^^
==> done
the zone files because there is no guarantee the two files will be
identical. The files should of course have the same DNS data but
there is no DNS protocol requirement for that data to be presented
in a single, canonical order when a zone transfer is performed.
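One way to compare the data rather than the files is to treat each
transferred zone as a set of records, so that ordering no longer
matters. The following is a minimal sketch in Python. It assumes each
transfer has already been saved as text with one record per line in
the "owner ttl class type rdata" layout that dig prints for a zone
transfer; the function names zone_records and compare_zones are
illustrative, and the normalisation is deliberately simplistic (for
example, it does not fold the case of names inside the rdata).

```python
def zone_records(text):
    """Parse 'owner ttl class type rdata' lines into a set of tuples."""
    records = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and dig's comment lines
        owner, ttl, rrclass, rrtype, rdata = line.split(None, 4)
        records.add((
            owner.lower(),            # owner names are case-insensitive
            int(ttl),
            rrclass.upper(),
            rrtype.upper(),
            " ".join(rdata.split()),  # collapse runs of whitespace
        ))
    return records


def compare_zones(old_text, new_text):
    """Return (only_on_old, only_on_new) as sorted lists of records."""
    old, new = zone_records(old_text), zone_records(new_text)
    return sorted(old - new), sorted(new - old)


if __name__ == "__main__":
    old = ("example.com. 3600 IN NS master.example.com.\n"
           "master.example.com. 3600 IN A 10.1.2.3\n")
    new = ("master.example.com. 3600 IN A 10.1.2.3\n"
           "example.com. 3600 IN NS master.example.com.\n")
    print(compare_zones(old, new))
```

Because a set is used, the SOA record that appears at both the start
and the end of a zone transfer is counted once. Two empty lists mean
the servers hold the same records, whatever order they were sent in.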
It should be remembered that a zone transfer replicates zone data
and it is not a file transfer protocol. Although ensuring the same
DNS data exists is the most important consideration, the actual
master file can be important too. For example it may contain
comments that would not be visible when a zone transfer is
performed.

The new master server should be checked to ensure it is answering
correctly and authoritatively for example.com. Zone transfers and
EDNS0 queries should be used to ensure that there are no
connectivity problems or access control lists which prevent the new
server being
      ^^^^^^
==> "from being"?
used. The checks illustrated in Section 2.2 can be applied here.

Once the new server is running correctly, there is a potential
problem. There are now two servers which are master for the zone and
these can get out of synchronisation with each other. Updates made
to the zone on one will not be seen by the other. The simplest way
to avoid this is to only permit updates on one server, the new
master server, or to decide no changes can be made to the zone until
the new master server is fully activated. This may mean temporarily
preventing dynamic updates or other changes to the zone.

3.1 SOA Record Values

Before the new master server is brought on-line, the zone's SOA
record should have a sensible expire interval: 30 days is
reasonable. Short expire intervals are unwise in general. They are
especially unwise when the master server is moved. If the slave
servers cannot contact the master server within one expire interval,
they will no longer answer authoritatively for the zone, creating a
lame delegation. A suitably long expire interval allows plenty of
time to correct any master-slave connectivity issues before they can
cause serious trouble.

3.2 Notification of the Slave Servers

With the new server answering authoritatively for the zone, all the
slave servers for the zone need to be reconfigured.
They should be told to use the IP address(es) of the new server for
zone transfers and SOA refresh checks. This will be done by some
out-of-band means. Usually the DNS administrator will inform the DNS
administrators of the slave servers about the migration by email.
They will then update their name servers' configurations with the
details of the new server.

If the master server for example.com was being moved from 10.1.2.3
to 172.16.1.1, the appropriate zone{} statement in the slave
server's named.conf would look like:

    zone "example.com" {
        type slave;
        file "slave-zones/example.com";
        masters { 172.16.1.1; 10.1.2.3; };
    };

As always, version control, a ticketing system and a check with
named-checkconf would be used before the slave server loads the new
named.conf file. The above entry would tell the slave server to try
the new IP address (172.16.1.1) as well as the old one when checking
for fresh versions of example.com and transferring them.

At this point it is crucial that both the old and new master servers
for example.com have identical versions of the zone. If not, slave
servers could be picking up different versions of example.com from
either server, resulting in inconsistent data being published in the
DNS. This is why it is safest to prevent any changes to the
example.com zone until all the slave servers are using the new
master server.

Updates to the slave server configurations should be carefully
planned. The operators of the slave servers may have change windows,
perhaps just once a week or month, where name server configurations
can be altered. The slave server administrators may have their own
ticketing systems and procedures for making those changes. So the
DNS administrator should take account of these and give plenty of
notice, perhaps as long as a month, of the impending move of the
zone's master server.
There may be service level agreements with the slave service
providers which document the procedure and turn-round times for
responding to such change requests.

Putting the IP addresses of both servers in the masters{} clause of
the zone{} statement is a fail-safe. If the new server is
unreachable for some reason, the zone can still be checked and
transferred from the old master server, assuming of course the old
master server still has an up to date copy of the zone. And if the
new master server is misbehaving, it can be switched off
immediately. The slave servers will automatically fall back to using
the IP address of the old master server for their checks.

3.3 Activating the new master server

The logs on the slave servers and the new master server should be
checked to ensure that the new master server, not the old one, is
being used for zone transfers and refresh checks. Once this has been
verified, the slave servers can be reconfigured to only use the new
master server. That simply means a repeat of the process in Section
3.2, except this time to remove the old IP address (10.1.2.3) from
the masters{} clause of the relevant zone{} statement. From this
point onwards, the new master server will be the only place for any
updates to the zone to be performed.

3.4 Update the example.com zone

With the slave servers now using the new master server, the zone
file will probably have to be updated. If the zone's master server
has been renamed, that should be shown in the SOA record's MNAME
field. The zone's NS RRset may have to be updated too and any
"glue" for the new master server added.
^^^^^
------------------------------------------------------------------------
==> once again, it is not "glue", it is in-zone data! If the new
    master was already published in the DNS zone, its IP address(es)
    will remain unchanged. Otherwise, (an) A/AAAA entr{y,ies} will be
    added.
    OTOH, I think you should mention as early as in the paragraph
    above, the "hidden master option", in which case, nothing would
    be updated in the SOA record.
------------------------------------------------------------------------

In the example used for this document, the new master server retains
the same name, master.example.com, but gets renumbered to a new IP
address, 172.16.1.1. The relevant updates to the zone file would be
similar to the following:

    example.com. 3600 SOA master.example.com. hostmaster.example.com. (
                          2007032701  ; serial number
                          86400       ; refresh (24 hours)
                          7200        ; retry (2 hours)
                          3600000     ; expire (1000 hours)
                          172800 )    ; minimum (2 days)
    example.com.            IN  NS  master.example.com.
    example.com.            IN  NS  ns1.example.com.
    ...
    master.example.com.     IN  A   172.16.1.1
    ns1.example.com.        IN  A   10.9.8.7

As before, changes to the example.com zone should be linked to the
organisation's ticketing system, its change control procedure and
checks with named-checkzone before the name server loads the new
zone file. Once loaded, the name server's logs should be inspected
to make sure that the new zone propagates to the zone's slave
servers.

Stealth master servers are commonly used. These are master servers
^^^^^^^
==> AFAIK, it's the term "Hidden" which is rather used with masters.
    For slaves, it is indeed "stealth" which is used. See:
    http://www.oreillynet.com/pub/a/network/excerpt/dnsbindcook_ch07/
    (search for "stealth")
that are not listed in the zone's NS RRset or SOA record's MNAME
field or in any parent zone's delegation information. This means the
location of the master name server is not readily disclosed in the
zone file and it is not found in the DNS metadata. The name and
address of that server are only known to the authoritative slave
servers for the zone and any utilities that are permitted to send
dynamic updates to the server. If a
stealth master configuration is in place, there
^^^^^^^
==> hidden
will be no need to update the zone's NS RRset or SOA MNAME field.

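The serial numbers used in the examples (2007032700, then 2007032701)
appear to follow the common YYYYMMDDnn convention. The following is a
minimal sketch in Python of how the next serial might be chosen under
that assumption. The function name next_serial is hypothetical and
local convention may well differ.

```python
from datetime import date

MAX_SERIAL = 2**32 - 1  # SOA serial numbers are unsigned 32-bit values


def next_serial(current, today):
    """Return a serial greater than `current`, preferring YYYYMMDD00 for today."""
    todays_base = int(today.strftime("%Y%m%d")) * 100
    # Use today's first serial if it is newer, otherwise just add one.
    candidate = max(todays_base, current + 1)
    if candidate > MAX_SERIAL:
        raise ValueError("serial number would exceed 32 bits")
    return candidate


if __name__ == "__main__":
    print(next_serial(2007032700, date(2007, 3, 27)))
```

With more than 100 changes in one day the serial simply runs on into
the next day's range; it still increases, which is what the slave
servers rely on.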
3.5 Update the delegation

If the master server was part of the delegation information in the
parent zone, that zone will need to be updated too. The name and
address of the new master server
will need to be in the parent zone.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
==> Only in case of "glue" and that's not the general case.
An outline of the procedure to be followed for updating the parent
zone is given in Section 2.3.2. There is no need to update the
delegation if a
stealth master configuration is used: the name and
^^^^^^^
==> hidden
address of the master server will not be in the parent zone and do
not need to be added to it.

3.6 Update tools and clients

Any tools or procedures which are used to modify the zone contents
will have to be updated and tested for use on the new server. These
changes will have to be documented. Clients which issue dynamic
updates -- for example a DHCP server -- may need to be reconfigured
to use the master server's
new IP address.
       ^^^^^^^^
==> addresse(s).
However they may be able to find the new location of the master
server from the MNAME field in the zone's SOA record. Dynamic update
tools like nsupdate will do this automatically.

3.7 Decommission the old master server

Finally, the procedure described in Section 2.5 can be used to stop
the old master server from serving the zone. There will be
additional steps to take however. The organisation's documentation
should be updated to reflect the change of location for the master
server. Administrative measures should be put in place to ensure no
updates for the zone get made on the old server. These could include
blocking or rejecting any dynamic update requests, preventing the
old master zone file from being modified, training for DNS
operations staff or perhaps all of these measures.

4 Renumbering

When hosts get renumbered or renamed, it is not always the case that
this can be done by simply updating the DNS. This is of course a
major component of the renaming or renumbering.
But it is by no means the only one. Any changes to the DNS should be
carefully co-ordinated and integrated with the other aspects of the
renumbering or renaming exercise. These can include the
reconfiguration and testing of web servers and mail systems, changes
to router and firewall configurations, updates to application-level
access control lists, server downtime and the replacement of X.509
certificates.

Changing the name or IP address of a host can be straightforward.
It's unlikely anyone cares what name or IP address is assigned by a
DHCP server to some host. All that is likely to matter in that sort
of example is that the forward and reverse entries match up: i.e. a
reverse lookup of the IP address returns a name that when looked up
returns that IP address. Even that might not be essential, though it
is of course desirable.

There are examples where the names and addresses of hosts are
significant, particularly servers. POP and IMAP clients will want to
connect to the appropriate server. Similarly, mail clients will have
to connect to the correct SMTP or ESMTP server when delivering or
submitting email. This paradigm obviously extends to web servers,
database servers and so on. When these sorts of critical network
resources get moved or renamed, some careful planning is needed.

For illustrative purposes below, the mail server for example.com,
oldbox.example.com on 10.20.30.40, is going to be moved to
newbox.example.com which has IP address 172.16.0.1. Before the DNS
is updated, it can be assumed that the new mail server has been
tested thoroughly and any accompanying addressing or numbering
requirements have been resolved: for instance, the installation of
new X.509 certificates.

4.1 Changing the forward zone

Before this change, there will be MX and A records like the ones
below in the example.com zone file:

    example.com.            3600  MX  10 oldbox.example.com.
    ...
    oldbox.example.com.     3600  A   10.20.30.40

Some time before the switch to the new mail server, the TTL values
for the above resource records should be reduced to a few minutes.
This will ensure old and possibly stale data does not live for more
than a few minutes in the caches of other name servers. The minimum
lead time for this change is one refresh interval (24 hours in
Section 3.4) plus the maximum TTL value for these resource records:
1 hour. This means that the TTL should be decreased no later than 25
hours before the planned switch.

The rationale for this is that a slave server may have performed its
refresh check just before the zone was updated. So if it does not
support NOTIFY (or doesn't get any NOTIFY messages from the master
server), the slave will take one refresh interval before it does
another refresh check and picks up a new copy of the zone. Assuming
that the zone transfer takes no time, the slave server will still be
responding to lookups for example.com's MX records with data that
has a one hour TTL value for one refresh interval after the zone was
updated.

A ticket should be opened and the zone file updated in the usual
manner with version control and zone file syntax checks with
named-checkzone. Even though only the TTLs for the records below are
modified, the zone's SOA record serial number should be updated
according to local convention. In the example below, the updated MX
and A records now have a TTL of 5 minutes.
Not that no other data for
^^^
==> Note
these records is changed
              ^^are?

    example.com.            300  MX  10 oldbox.example.com.
    ...
    oldbox.example.com.     300  A   10.20.30.40

The new zone is loaded on the master server and propagated to the
slave servers in the usual way. Subsequent lookups for the MX
records for example.com will only be cached for 5 minutes unless
they'd already been cached from an earlier lookup. This means that
there will probably be more DNS queries for example.com's MX records
because this
          ^^ese?
data does not get cached for as long as it used to. However within 25 hours of this zone update, no name server should be caching these MX records for longer than 5 minutes. Any data that was cached prior to the change should have expired from name server caches by then.

This now means there is a 5 minute window for changing the mail server to the new system on a new IP address. At this point, the old mail server should be stopped from accepting inbound email. This will prevent it from processing incoming mail and storing it in mailboxes that have probably been migrated to the new server. With the old mail server not accepting email, all the mailboxes and other configuration data can be moved to the new server. Describing how this would be done is beyond the scope of this document.

The new mail server can now be started. Once that has been done, the DNS can be updated. The MX and A records above would be replaced with the entries shown below:

    example.com.         300 MX 10 newbox.example.com.
    ...
    newbox.example.com.  300 A     172.16.0.1

As before, the zone file should undergo version control and a check with named-checkzone. It should also get a new SOA serial number. Once this zone has propagated to the slave servers for example.com, incoming mail should start to be delivered to the new server. This is one example where the use of NOTIFY is a great benefit to get the slave servers to quickly converge on the new version of the zone. This could also be achieved by reducing the SOA record's refresh interval but this should only be necessary for legacy DNS implementations that do not implement NOTIFY. Those legacy implementations are likely to be so out of date that they need to be replaced anyway.

Note that the TTLs for the MX and A records above are still set to 5 minutes. This is to allow for a quick back-out in case of unforeseen problems with the new servers.
The DNS administrator can revert to the old data and the old mail server can be restarted if there are problems. Since the DNS data for the new server has a short TTL, this will not be cached for too long by other name servers, reducing the potential for mail to go astray by being delivered to the new server when it is not performing correctly.

Once mail operations have bedded down, the example.com zone file can be updated once more and the above MX and A records given more suitable TTL values.

4.2 Reverse zones

For reverse zones, things can be somewhat simpler. The DNS administrator can prepare the reverse zones and populate them with PTR records before the corresponding IP addresses are actually used. The zone file for 16.172.in-addr.arpa could contain the following PTR record:

    1.0.16.172.in-addr.arpa.  300 PTR newbox.example.com.

Note that the PTR has the same TTL as the A record in the forward zone. This is not a DNS protocol requirement. It is an administrative convenience so that reverse lookups of 172.16.0.1 expire from the world's name server caches no later than the corresponding A record in example.com would.

As with the forward zone, the reverse zone should be updated and propagated to its slave servers no later than 1 refresh interval plus the PTR TTL value before the new IP address is used. The SOA record for 16.172.in-addr.arpa is not shown here, but it can be assumed that its refresh interval is 8 hours and the PTR record previously had a TTL of 1 hour. [There is no requirement for SOA timer values and TTLs in reverse zones to be identical to forward zones or vice versa.]

The rationale for the short TTL is the same as that given in Section 4.1: the data won't hang around caches for long if the change has to be backed out for some reason.

In practice, reverse DNS entries are rarely critical to email operations (or any server operations in general).
However some anti-spam defences use the absence of a working reverse DNS entry as a good rule of thumb for identifying a spam source.

As with the forward zone, the PTR record in the 16.172.in-addr.arpa zone can be given a more reasonable TTL once everyone is happy that the new mail server is performing satisfactorily.

4.4 Other servers

The same procedure can be applied to other network services that get moved or renumbered. Well in advance of the change, the TTL values on the corresponding resource records in the DNS should be reduced to a few minutes. This should be done to allow for any already cached data for those entries to expire from the caches of any other name servers. They will then look up the names again and get the same responses, albeit with much shorter TTLs. This will mean more traffic to the name servers for the forward and reverse zones while the migration is in progress.

The reduced TTL provides the window for switching from one server to another. While this switch is actually in progress, it may be necessary to stop the old server, for instance to ensure transactions are not sent to the wrong database or email delivered to the old mail store instead of the new one.

Once the new server is ready, the DNS should be updated with the new addressing and naming data. This should still have a short TTL in case the new system has to be backed out. The back-out can then be done without the new DNS data staying in name server caches for too long.

Finally, once the new server is shown to be working correctly, the TTLs for the new DNS data can be increased to more normal values.

4.5 Changes to service infrastructure

Changes to DNS are often coupled to other changes to an organisation's service infrastructure. For example the replacement of web servers or mail servers may involve a change of ISP, some renumbering and DNS migration, or some combination of these.
A common scenario is that the new web site does not become visible until the DNS is updated and the delegation information in the parent zone has been updated too. This can put the organisation at the mercy of several external parties: for instance the old and new DNS service providers and ISPs, registrars, parent registry and so on. There is likely to be limited direct control over those parties, so careful co-ordination and communication will be needed to synchronise and sequence each part of the update process with any procedures or maintenance windows that these parties have.

If any delays or errors occur during the DNS change, the resulting problems may be severe. These changes of course tend to take place at weekends or other quiet periods when key technical staff may not be available.

Minimising the number of parties in a given change or migration reduces the risk of something going wrong and simplifies any recovery or back-out procedures when a problem occurs. As far as is practically possible, changes to the DNS delegation should be performed separately from changes to the other service infrastructure. Then, for the service infrastructure changes, the zone administrator is free to manage the zone contents without being dependent on other parties.
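
When planning any of the changes in this section, the timing rule from Section 4.1 (one SOA refresh interval plus the old TTL before the switch) can be worked out mechanically. The Python below is only a minimal sketch of that arithmetic. The function name and the switch date are invented for illustration; the intervals are the 24 hour refresh and 1 hour TTL of Section 4.1, and the 8 hour SOA interval and 1 hour TTL assumed for the reverse zone in Section 4.2.

```python
from datetime import datetime, timedelta


def latest_ttl_reduction(switch_time, refresh_interval, old_ttl):
    """Return the latest time at which the zone with the reduced TTL
    must be loaded on the master server.

    A slave that misses NOTIFY may serve the old zone for up to one
    refresh interval, and caches may then hold that answer for the
    old TTL, so both are subtracted from the planned switch time.
    """
    return switch_time - (refresh_interval + old_ttl)


# Hypothetical switch time, chosen only for this example.
switch = datetime(2008, 5, 10, 6, 0)

# Forward zone: 24 hour refresh interval, 1 hour old TTL.
forward_deadline = latest_ttl_reduction(
    switch, timedelta(hours=24), timedelta(hours=1))

# Reverse zone: 8 hour interval, 1 hour old TTL.
reverse_deadline = latest_ttl_reduction(
    switch, timedelta(hours=8), timedelta(hours=1))

print("forward zone TTL must be lowered by", forward_deadline)
print("reverse zone TTL must be lowered by", reverse_deadline)
```

With these figures the forward zone deadline falls 25 hours before the switch and the reverse zone deadline 9 hours before it. These are the absolute minimum lead times, so in practice the TTLs would normally be lowered well ahead of them.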