Hi,
I noticed a typo that I mentioned earlier was still present in RIPE-823.
https://www.ripe.net/publications/docs/ripe-823/#aggressive-nsec-caching
says: RFC8189, whereas it should say RFC8198.
Is this something that can be fixed perhaps?
--
Marco
Hello.
Under the section discussing Ingress Filtering you failed to discuss the
issue of fragment filtering.
A very common and powerful DDoS attack is UDP fragment attack:
https://ddos-guard.net/en/terms/ddos-attack-types/udp-fragmentation-flood
The common thing many ISPs as well as enterprises do to mitigate the
attack is to block all fragments which on most servers has almost no
effect. But on DNS and VPN servers, blocking fragments is fatal and
therefore a warning needs to be put into the doc that UDP fragments
should *never* be blocked to DNS servers - even when under fragment
attack. See:
https://puck.nether.net/pipermail/cisco-nsp/2023-December/108992.html
for further details.
Regards,
Hank
01 May '24
Hello fellow task force members,
RIPE officially published our recommendations as a RIPE document today.
With that, I consider our work on the task force done.
Thank you all very much!
I am working with three public resolver operators to put together a 25
minute session at the DNS working group at the upcoming RIPE meeting. We
will be comparing the recommendations in the task force document with
their actual operations, and hopefully digging into the differences or
details in an interesting way.
I think there will be more work in this area, whether that is in RIPE or
in other forums, and I look forward to any chance to work with any of
you in the future.
Cheers,
--
Shane
Fellow Task Force members,
I am going to be requesting that the RIPE NCC edit the attached
document, and then hopefully Mirjam will approve it for publication as a
RIPE document.
Tim, I made the text changes you requested. I'll make a separate reply
about the TTL though, which I did not change. 😉 Maarten, I removed the
Cloudflare statement as an example of a recursive operator policy
statement since they do not explicitly intend to follow the RFC's
recommendations.
I think that's it. Thank you all!
Cheers,
--
Shane
Fellow Task Force members,
I have updated the document based on the feedback and text from Farzaneh
and John.
If you spot anything at this point, please let me know. I plan on
turning the document over to the RIPE NCC for editing on 2024-03-29 (29
March 2024). I know that it is not perfect, but I'd like to get it
published at some point. 😉
I included all changes except for the cache flushing suggestion:
* Added description of the problems with ANY replies.
* Added a section about name server identification.
* Added the proposal to document as much about blocklists as possible,
even if source of the blocklists themselves cannot be published.
* Added a section about NTA transparency.
Again, thank you all for the work you've put in.
Cheers,
--
Shane
# DNS Resolver Recommendations
About the DNS Resolver Best Common Practice Task Force
https://www.ripe.net/participate/ripe/tf/dns-resolver-best-common-practice-…
## Terminology
* Open Resolver: A DNS resolver that accepts queries from any client.
Often the result of misconfiguration.
* Public Resolver: A resolver intentionally configured to be an open
resolver.
## Introduction
### What Is This Document? Who Is It For?
This document presents recommendations and best current practices for
operating DNS resolvers, both public and non-public ones. It covers
technical aspects of operations and provides best practice
recommendations for data management, with a particular focus on user
privacy, security, and resilience.
The document serves as guidance for the wider Internet community,
offering input to:
* Those running public DNS resolver services, and
* Those who want to make informed choices between such services.
Its purpose is to provide clear guidance and promote effective
practices in DNS resolver operation.
The intended audience is not the entire DNS community. Advice here is
probably not useful for operators of authoritative servers, domain
registrars, and so on. It is also not meant to be an introductory or
educational document. There are many documents which cover the basics
of DNS and the roles of organizations in it; a good overview is:
Addressing the challenges of modern DNS - a comprehensive tutorial
by van der Toorn et al.
https://ris.utwente.nl/ws/files/282427879/1_s2.0_S1574013722000132_main.pdf
The document does not consider how to measure adherence to these
recommendations. So it is not intended to be used for certification,
although certification created based on the principles here is
possible.
### How Is This Document Organized?
This document has a number of sections, and specific recommendations
in each section. The intent is for each recommendation to have clear
guidance at the top, and then background and discussion related to the
recommendation afterwards. Each recommendation indicates whether it is
mostly for operators of public resolvers or for operators of any
resolver.
### About Recommendation Text
This is not a standards document, and does not propose any way to
measure compliance or interoperability. It does use words like
"should" or "may be" throughout. These are meant to be interpreted in
the usual English sense, and not as IETF-style RFC 2119 jargon.
## System and Network Hardening
### Infrastructure considerations
Running any Internet service requires attention to the infrastructure
used to operate it. This section discusses various approaches that can
be used to run a DNS resolver. Everything applies to both public and
non-public DNS resolvers.
#### Bare metal or public cloud
All DNS resolver software can run either on dedicated servers (rented
or colocated), or in virtualized clouds, or in a combination of those.
Every approach has pros and cons. Most of these are not specific to
running DNS resolvers, but some of them are.
**Running DNS resolver instances as OS level daemons on bare metal
hosts:**
Pros:
- Performance: Bare metal servers have direct access to the
underlying hardware, and can offer superior performance/cost
balance by avoiding the overhead associated with virtualization.
Moreover, you have full control over the server's configurations,
down to the hardware level, which can be beneficial for
performance and cost optimization once you understand your
typical workload during peak hours.
- Data Security: Since you are in control of the physical servers,
there is no risk of data leakage that can occur due to
vulnerabilities in multi-tenant virtualization platforms,
including CPU cache-based side-channel vulnerabilities. It could
be argued that attacks targeting such issues are rare, and their
impact on a DNS resolver service is low, but potential breaches
may have significant privacy impact. It is advised to evaluate
this against your organisation's risk model, or to discuss this
with your information security compliance experts.
- Predictability: Because there is no virtualization layer and no
"noisy neighbours" on the host, the performance of your servers is
more predictable.
Cons:
- Cost of failure: If you pick a hardware configuration that is not
optimal for the workload of your DNS resolver, you may need to
upgrade and replace hardware components afterwards. Ways to reduce
this risk include renting servers instead of buying them, carrying
out load testing with data similar to production workloads, and
providing limited beta access to the service before it fully
enters the production phase.
- Scalability: Scaling up with physical servers means acquiring or
renting, installing, and configuring new hardware, which will take
more time than provisioning new virtual servers in a cloud
environment. Moreover, most cloud environments will provide you
with cluster autoscaling features, which are difficult to achieve
on bare metal.
- Maintenance: You will be responsible for all server maintenance
tasks, including hardware issues, which can require significant
effort and specific expertise.
- Redundancy: Setting up high availability and disaster recovery
strategies can be more complex and time consuming compared to the
cloud, where these features are often provided as value added
products. See the Redundancy section for more details.
**Running DNS resolver instances in containers in a public cloud:**
Pros:
- Scalability: Clouds excel at scaling applications. You can scale
up and down rapidly based on load, which is important for a DNS
resolver that needs to handle variable query loads. In case of
regional or geographically distributed resolvers, in every region
where the resolver would be deployed, daily periodicity is likely
to be observed, for example peak hour is likely to occur around
19:00 local time, and off-peak hours may begin at around
01:00-03:00. In a situation like that, using cluster autoscaling
features and tools, you can run fewer instances at night and
more instances throughout the day, which may help to optimize your
cloud hosting costs.
- Fault Tolerance and High Availability: Most clouds have built-in
strategies, features, and products for handling node failures,
which can increase your service's availability.
- Deployment and Management: Cloud providers offer built-in methods
to deploy and manage applications, which can simplify operations
and reduce the likelihood of human errors if your infrastructure
management department is already familiar with these tools.
- Cost: While this largely depends on your specific usage, cloud
services can sometimes be more cost-effective than managing your
own physical servers, especially when you consider the total cost
of ownership, including power, cooling, and maintenance.
Cons:
- Performance: The virtualization layer of public clouds can impact
performance. While this certainly could be mitigated through
scaling the number of virtual hosts, the cost would also increase
accordingly.
- Complexity: Advanced cloud technologies are complex systems which
come with a steep learning curve. Without prior experience,
properly configuring and managing a cloud-based compute cluster
can be challenging.
- Cost Variability: While the cloud can be cheaper, it can also be
more expensive if not properly managed. Costs can rise
unexpectedly based on traffic. Make sure to always set some limits
on how much may be spent on hosting in the cloud control panel,
and to set up notifications to be sent to you when these
thresholds are about to be triggered.
- Multi-tenancy Risks: In a public cloud environment, the "noisy
neighbour" problem could potentially affect your service's
performance. Additionally, even though cloud providers take steps
to isolate tenant environments, vulnerabilities could potentially
expose sensitive data (see the previous section for a detailed
explanation).
**Additional considerations**
- In today's environments, Kubernetes and Terraform are sometimes
used as a substitute for cloud APIs when it comes to production
services' management. When running a DNS resolver in a Kubernetes
cluster on top of a public cloud environment, all the pros and
cons of the public cloud apply; basically, Kubernetes becomes your
public cloud provider. If you have significant prior experience
running services in Kubernetes in production, you may successfully
replicate your experience with the DNS resolver software.
Otherwise, we would advise against Kubernetes in this case.
- The only reason we may find to run a DNS resolver in a Kubernetes
cluster on top of self-hosted dedicated servers is when you have
significant hands-on experience with Kubernetes and it is natural
for you to manage applications this way. Otherwise, running DNS
resolver daemons in containers brings little, if any, benefit.
Autoscaling features are not available to you in this case, and
neither horizontal nor vertical pod autoscaling is of any use,
because DNS resolver software typically scales in-host by itself
just fine.
- When designing a cluster of resolvers for autoscaling, keep in
mind that newly spawned resolver machines need to populate their
resolver cache before they are fully useful. Your DNS
resolver software may provide cache replication mechanisms.
Otherwise, it is safe to overprovision clusters somewhat under
heavy load, and to discard excess instances once all the caches
are populated and the average load of a compute instance
decreases. In addition, it may be worthwhile to consider sharing
cache data between instances.
- It is always advised to prefer environments your infrastructure
management team is familiar with.
### Software considerations
#### Open Source
**Recommendation**: Choose any well-maintained DNS software you are
comfortable using. Regardless of which software you choose, ensure you
have somewhere to go for support. In the case of open source software,
consider providing financial support to ensure continued development.
Some open source maintainers take donations, while others offer
support contracts.
There are both open source and proprietary implementations of DNS
resolver software. Mixing these is also possible, for example, by
using proprietary extensions with open source software or deploying
open source software modified in-house.
General observations:
- Software licensing is orthogonal to software security. Neither is
proprietary software less secure on principle nor are
contributions by "unknown" developers more of a risk in open
source.
Benefits of open source:
- Open source allows for inspection, independent auditing, and
troubleshooting.
- Open source can avoid vendor lock-in.
- Open source can aid internet standards development.
Widely-deployed open source implementations allow proponents of
standards drafts to contribute proof of concept implementations
without permission or cooperation of vendors.
Drawbacks of open source:
- Both open source and proprietary software require skilled
maintenance, which has costs. Proprietary licensed software or
appliances typically come with license fees to cover these. In
contrast, open source licenses decouple usage by operators from
monetary compensation to developers. It is up to operators to
consider the financial sustainability of continued maintenance of
the open source DNS software they depend upon.
Please also consider deploying different software implementations to
ensure diversity, as discussed in the diversity section below.
### Networking considerations
#### IPv4 and IPv6
**If available, both IPv4 and IPv6 must be deployed.**
Large parts of the authoritative DNS are only accessible via IPv4, so
the resolver must be able to originate IPv4 queries. Authoritative DNS
that is only accessible via IPv6 is very rare.
Depending on the connectivity of clients, a resolver may be IPv4-only,
IPv6-only, or support IPv4 and IPv6.
#### Addressing
**Using multiple IP addresses for the service address should be
considered.**
Using 2 or more IPv4 addresses and 2 or more IPv6 addresses from
different RIRs provides resilience against a failure at one RIR,
whether in governance, security, or technical operations. Note that
support for multiple addresses for recursive resolvers varies, and
some clients perform poorly if any address does not respond normally.
There is no need to pick an IPv4 address with all octets the same,
like 2.2.2.2 or 11.11.11.11.
**Publishing a list of back-end addresses used for resolving should be
considered.**
Publishing a list of back-end addresses used for resolving can be
useful for other network & DNS operators (for example, geo-IP
location, making sure data is getting to correct places, and so on).
#### High Availability
This can be considered in terms of local and global scope.
##### Local scope
Inside a single location/region, such as an office, campus, or small
ISP network, the main availability concern is that a resolver is
always reachable. Client systems can be configured with multiple
resolver addresses, but the failover behaviour of stub resolvers to a
second address can be painful. Ideally the primary address is highly
available and such fallback rarely required. How much effort is put
into ensuring this is true should probably scale in line with the
number of users, or sensitivity of the clients using that resolver to
delayed resolution.
There are several ways to promote high availability of an individual
resolver address, such as dedicated load balancing equipment, or
network techniques like VRRP, or IP anycast. These generally have in
common a pool of recursive servers and the means to direct queries to
them when a health check has determined them to be capable of
answering those queries.
Dedicated load balancing solutions are available as free or
commercial products, in hardware or software. These typically own
the resolver IP address and forward queries to the currently
available instances of a pool of recursive servers.
VRRP makes the resolver IP address available on multiple servers and
is often used to provide automatic failover between two. A pool of
recursive servers using this technique must reside in the same
broadcast domain.
IP anycast in the local scope typically involves a pool of recursive
servers advertising a route to a shared resolver IP address into a
routing protocol. This can be configured in failover or load-sharing
configurations. A load sharing configuration typically requires
network equipment able to balance traffic to a destination over equal
cost paths (ECMP). A pool of recursive servers using this technique
can be distributed in different parts of the network.
##### Global scope
The same concerns as for local service availability are present in the
global scope, with the added issue that DNS resolution over long
distances may be slow. Practically speaking, only multiple resolver
addresses, or IP anycast are useful strategies here. The motivations
for finding better failover solutions than multiple resolver addresses
have been covered above.
IP anycast in the global scope means routing the same IP prefix to
more than one location. This can provide effective solutions for
failover and, when optimally configured, for routing client queries to
the topologically least distant recursive server location. IP anycast
in the global scope requires the use of globally routable prefixes. If
a separate prefix is to be used for anycasting, usually this means a
/24 in IPv4 and a /48 in IPv6, as those are the smallest sizes that
will be widely propagated in BGP. A common practice is to use a
covering prefix (/23 in IPv4 or /47 in IPv6) for fallback, and a
more-specific prefix (/24 or /48) for the traffic. The more-specific
prefix can then be withdrawn to send traffic to a backup site; this
will happen automatically if the site is disconnected from routing.
[RFC7094](https://www.rfc-editor.org/rfc/rfc7094.html) discusses
anycast architecture in detail, including references to various other
RFCs that discuss anycast in general and its use with DNS in particular.
[RFC4786](https://datatracker.ietf.org/doc/html/rfc4786) discusses
operation of anycast services.
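The covering-prefix fallback described above relies on longest-prefix matching: routers prefer the more-specific route while it is announced and fall back to the covering route once it is withdrawn. The following is a minimal sketch of that selection logic, not a model of BGP itself. The prefixes come from the benchmarking range and are chosen only for illustration, and `best_route` is a hypothetical helper.

```python
import ipaddress


def best_route(destination, announced):
    """Longest-prefix match: return the most specific announced prefix
    that contains the destination, or None if nothing matches."""
    addr = ipaddress.ip_address(destination)
    matches = [p for p in map(ipaddress.ip_network, announced) if addr in p]
    return max(matches, key=lambda p: p.prefixlen, default=None)


# Illustrative prefixes only (assumptions, not from the document).
COVERING = "198.18.0.0/23"  # fallback prefix, announced from the backup site
SPECIFIC = "198.18.0.0/24"  # traffic prefix, announced from the primary site
SERVICE = "198.18.0.53"     # hypothetical resolver service address

# While the /24 is announced, it wins.
print(best_route(SERVICE, [COVERING, SPECIFIC]))
# After the /24 is withdrawn, traffic follows the covering /23.
print(best_route(SERVICE, [COVERING]))
```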
##### Generally
Operators of a globally scoped recursive service are encouraged to
also adopt the local scope recommendations in each of the locations
where the service is provisioned.
Though the above deals with the shortcomings of reliance on stub
resolver failover between a list of addresses, those recommendations
shouldn’t be seen as an exclusive alternative. Multiple resolver
addresses, where each is provisioned using differing failover
strategies, can provide a resolver of last resort and further improved
resilience.
#### Ingress Filtering
**Ingress Filtering to follow BCP 38 should be deployed.**
DNS normally uses UDP traffic, which makes it a common vector of both
[reflection](https://en.wikipedia.org/wiki/Reflection_attack) and
[amplification](https://www.cisa.gov/news-events/alerts/2014/01/17/udp-based…
attacks. To minimize the amount of spoofed traffic that a resolver
responds to, the network should be configured as recommended in
[BCP 38](https://www.rfc-editor.org/rfc/rfc2827.html).
#### RPKI Sign Advertised Routes
**Route Advertisements should be signed using RPKI.**
Using RPKI to sign any route advertisements - either toward
authoritative servers or toward DNS clients - is straightforward to do
and will reduce the impact of BGP misconfigurations and some BGP
hijacking attempts.
RPKI validation is also possible, although the effort is greater. It
is possible that the hosting provider or the transit provider for your
service validates BGP; asking and making this part of your selection
criteria is reasonable.
#### (D)DoS measures
Denial-of-Service (DoS) attacks, both distributed (DDoS) and not, are a
threat to any Internet service. Network operators for a service
providing any DNS service must be prepared for large amounts of attack
traffic.
In addition to attacks on the service itself, a resolver may be used
both as an attack reflector and as an attack amplifier.
Active monitoring of network and service usage, careful logging, and a
security team that is able to respond to problem reports are necessary.
Mitigation techniques will include filtering or rate-limiting traffic,
both on the authoritative and client side of the resolver.
### Capacity planning
#### Server capacity
If using a model that is easy to scale (cloud based, or Kubernetes
based, or similar), then getting server capacity correct is largely a
question of budgeting. If using a less-flexible model (bare metal for
example), then under-estimating will mean problems delivering service.
Hardware performance varies widely, as does operating system and
resolver performance. Some lab testing will be necessary to estimate
the number of systems needed.
#### Network capacity
Since DNS is mostly UDP-based, it is often easy to generate large
amounts of spoofed traffic to and from DNS servers. DNS traffic is
small compared to application traffic (videos and other content), but
still significant. Authoritative server operators often build their
networks and servers to handle 10 times their normal load. Recursive
server operators may need to do the same, although if the service only
accepts traffic from IP addresses that cannot be spoofed (for example
users within a network that is operated by the same company) then this
can be reduced, for example to 3 times normal load. To estimate
expected load, the best approach is to examine historical usage for
the actual expected users of the system.
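As a rough illustration of the headroom figures above, the arithmetic can be sketched as follows. The function name and the example peak rate are hypothetical; the 10x and 3x multipliers are the ones given in the text and should be treated as starting points rather than fixed rules.

```python
def provisioned_qps(peak_qps: float, spoofable_sources: bool = True) -> float:
    """Return the query rate to provision for, given an observed peak.

    Multipliers come from the text above: 10x where source addresses can
    be spoofed, 3x where the resolver only accepts traffic from clients
    whose addresses cannot be spoofed.
    """
    headroom = 10 if spoofable_sources else 3
    return peak_qps * headroom


# Hypothetical historical peak of 20,000 queries per second.
print(provisioned_qps(20_000))                           # resolver reachable by anyone
print(provisioned_qps(20_000, spoofable_sources=False))  # closed network
```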
### Resilience
#### System Diversity
In addition to the software considerations above, operators should
consider whether to use different server implementations to provide
service. This allows continued operation if a critical vulnerability
is found in one implementation, by shifting traffic to other
implementations.
Placing resolvers and control systems in different physical locations
will allow continued operation in the event of a disaster or other
problem that impacts a single location. In addition, ensuring diverse
connectivity to other networks will prevent single points of failure
on the network side. Ensuring network diversity may take some care, as
it is not always obvious what fate is shared between any given paths;
this may be physical, virtual, or organizational, and may sometimes be
hidden.
#### Security
In addition to the DNS-specific security considerations, normal
security best practices for any Internet service should be followed,
including updating software regularly, patching software as
soon as possible for any known security vulnerabilities, following
CERT announcements, and so on.
#### Certification
It may be useful or required for an organization to follow specific
certifications, such as ISO or ITIL. These can be government-defined
or industry-defined. For end users there is typically not much direct
value, but business customers will often look for services that are
operated by organizations meeting such standards.
## DNS configuration knobs
The DNS is an old protocol that has a lot of settings that can be
tweaked. This section reviews these and provides recommendations on
which should be used for a resolver.
### DNSSEC validation
**DNSSEC validation should be enabled.**
For: All DNS resolver operators.
DNSSEC validation is the best way to ensure that the answers returned
are those published by the owner of the domain name being queried.
The root KSK must be updated when it changes. While
[RFC5011](https://www.rfc-editor.org/rfc/rfc5011.html) defines an
automated way to do this, a resolver operator will probably either
manage this trust anchor directly or have it updated via OS updates.
[RFC9364](https://www.rfc-editor.org/rfc/rfc9364.html) provides a lot
of useful information, and links to further documents about DNSSEC.
However, operators usually do not need to know the details, and can
simply ensure that DNSSEC validation is enabled in their software.
Resolver software that does not support DNSSEC validation should be
avoided.
### DNS Transport Protocols
**UDP and TCP must be supported.**
For: All DNS resolver operators.
UDP is what most clients use, and TCP is necessary for DNS answers
that are too large for a single UDP packet.
[RFC7766](https://www.rfc-editor.org/rfc/rfc7766.html) explains why
TCP is necessary in more detail.
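To illustrate why both transports are needed: when an answer does not fit in a UDP response, the server sets the TC (truncated) bit in the DNS header and the client is expected to retry over TCP. The sketch below only inspects the header of a raw DNS message; `is_truncated` is a hypothetical helper, and no network access is involved.

```python
import struct

TC_FLAG = 0x0200  # "truncated" bit in the 16-bit DNS header flags field


def is_truncated(message: bytes) -> bool:
    """Return True if a raw DNS message has the TC bit set, meaning the
    client should repeat the query over TCP."""
    if len(message) < 12:
        raise ValueError("DNS header is 12 bytes; message is too short")
    _msg_id, flags = struct.unpack("!HH", message[:4])
    return bool(flags & TC_FLAG)


# Hand-built example headers: ID, flags, and four section counts.
truncated = struct.pack("!HHHHHH", 0x1234, 0x8380, 1, 0, 0, 0)  # QR, TC, RD, RA
complete = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 1, 0, 0)   # QR, RD, RA
print(is_truncated(truncated), is_truncated(complete))
```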
### Packet Fragmentation Avoidance
**Servers should be configured to avoid fragmentation.**
For: All DNS resolver operators.
Packet fragmentation can cause issues with DNS over UDP, especially
over IPv6. These issues can be minimized by choosing implementations
that set IP options to avoid this, and by taking care with EDNS0
message sizes.
Recommendations are available in
[draft-ietf-dnsop-avoid-fragmentation](https://datatracker.ietf.org/doc/draf….
### Encrypted DNS
**At least one of DNS-over-TLS (DoT), DNS-over-HTTPS (DoH), and
DNS-over-QUIC (DoQ) should be supported.**
For: All DNS resolver operators.
DoT, DoH, and DoQ are different technologies that all provide an
encrypted channel between the client and the resolver.
DoT is the oldest, and provides encrypted DNS using TLS. DoH uses HTTP
over TLS as a way to transmit queries and answers, and is widely
supported by web browsers. DoQ is the newest, and provides advanced
features such as separate streams for each query, avoiding the "head
of line" blocking problem common with all protocols layered on top of
TCP (such as DoT and DoH).
- DoT
- [RFC7858](https://www.rfc-editor.org/rfc/rfc7858.html)
- DoH
- [RFC8484](https://www.rfc-editor.org/rfc/rfc8484.html)
- DoQ
- [RFC9250](https://www.rfc-editor.org/rfc/rfc9250.html)
**Discovery of DNS Designated Resolvers**
There are new mechanisms that allow DNS clients to use DNS records to
discover encrypted DNS configurations. Resolvers should publish DNS
records to assist clients finding encrypted resolvers.
- Discovery of Designated Resolvers
- [RFC9462](https://www.rfc-editor.org/rfc/rfc9462.html)
QUESTION: Do we need to publish certificates in other ways than via the
DDR mechanisms?
### QNAME Minimization
**QNAME minimization should be enabled.**
For: All DNS resolver operators.
Using QNAME minimization, a resolver does not send the full name that
it is trying to resolve to authoritative servers higher in the DNS
hierarchy. So, for example, when querying "atlas.ripe.net" the servers
for ".net" would only be asked for "ripe.net". This improves privacy
for the end user querying the name.
[RFC7816](https://www.rfc-editor.org/rfc/rfc7816.html) covers QNAME
minimization.
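The "atlas.ripe.net" example can be sketched as a list of the names a minimizing resolver would reveal at each level of the hierarchy. This is a simplification that assumes a zone cut at every label; real implementations follow actual zone cuts and may differ in the query types they use. `minimized_qnames` is a hypothetical helper.

```python
def minimized_qnames(qname: str) -> list[str]:
    """Return the names asked at each step, from the servers closest to
    the root down to the servers for the full name."""
    labels = qname.rstrip(".").split(".")
    # Add one label at a time, starting from the rightmost label.
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]


# Root servers see "net.", the ".net" servers see "ripe.net.", and only
# the "ripe.net" servers see the full name.
print(minimized_qnames("atlas.ripe.net"))
```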
### Aggressive NSEC caching
**Aggressive NSEC caching may be enabled.**
For: Public resolver operators.
"Aggressive NSEC caching", meaning negative caching based on NSEC and
NSEC3 values, can reduce traffic greatly. It is an important protection
against random subdomain attacks.
This style of caching takes advantage of the way that NSEC and NSEC3
records cover a range of names in a zone. A resolver can know that a
query falls within such a range without sending any further queries,
by remembering the NSEC or NSEC3 records that it has seen as answers
to earlier queries.
Aggressive NSEC caching is almost always a good idea. However enabling
this is less important for DNS resolver operators who have a close
relationship with users, since they can stop attacks by blocking users
or otherwise directly dealing with the source of abusive queries.
[RFC8198](https://www.rfc-editor.org/rfc/rfc8198.html) describes
negative caching in detail.
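The range check at the heart of this technique can be sketched for plain NSEC records as below. It is a simplified illustration: it uses DNS canonical name ordering (labels compared right to left, case-insensitively), ignores NSEC3 hashing, wildcards, and DNSSEC validation, and does not check that the query name is inside the zone. The record names are made up.

```python
def canonical_key(name: str) -> tuple[bytes, ...]:
    """Sort key for DNS canonical ordering: lowercase labels, compared
    from the rightmost label to the leftmost."""
    labels = name.rstrip(".").lower().split(".")
    return tuple(label.encode("ascii") for label in reversed(labels))


def nsec_covers(qname: str, owner: str, next_name: str) -> bool:
    """Return True if a cached NSEC record (owner -> next_name) proves
    that qname does not exist."""
    q, lo, hi = map(canonical_key, (qname, owner, next_name))
    if lo < hi:
        return lo < q < hi
    # The last NSEC in a zone points back to the apex, so the range wraps.
    return q > lo


# Cached from an earlier answer: apple.example. NSEC cherry.example.
print(nsec_covers("banana.example.", "apple.example.", "cherry.example."))
print(nsec_covers("date.example.", "apple.example.", "cherry.example."))
```

A resolver holding that one NSEC record could answer any query for a name between "apple" and "cherry" without contacting the authoritative servers.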
### ANY Queries
**ANY query responses should be limited.**
For: All resolver operators.
Public or large-scale resolvers should be exceptionally careful with
queries of type ANY, which return all records at a given name. If
a resolver replies with all of the records cached for a given name,
the response can be much larger than for a single record type. Strict
limits should be enforced on volumes of such queries to prevent
amplification abuse, or truncation should be applied to prevent
spoofed redirections.
[RFC8482](https://www.rfc-editor.org/rfc/rfc8482.html) describes
several approaches to limiting ANY responses.
### Local Root
**Local root should be used.**
For: Public resolver operators.
Running a local root has several benefits, but it is an additional
component to maintain. For public resolver operators this is
definitely worth the cost, but other resolver operators may choose to
simply send all queries to the well-distributed root name servers.
[RFC8806](https://www.rfc-editor.org/rfc/rfc8806.html) describes local
root, including several example configurations.
In the future it will be possible to use ZONEMD to validate the copy
of the root zone obtained before using it. This is currently available
for the root zone.
[RFC8976](https://www.rfc-editor.org/rfc/rfc8976.html) describes ZONEMD.
### DNS Cookies
**Interoperable DNS Cookies may be supported.**
For: Public resolver operators.
DNS cookies provide some improved security over plain UDP, and are
designed to be more lightweight than TCP. If more than one server is
responding for a given IP address, then the Server Secret must be
shared by all servers, and the answer must be constructed in a
consistent manner by all server implementations.
Since client-side support for DNS cookies is not very widespread, and
since managing server-side secrets involves some work, the costs may
outweigh the benefits for some non-public resolver operators.
[RFC7873](https://www.rfc-editor.org/rfc/rfc7873.html) describes DNS
cookies, and [RFC9018](https://www.rfc-editor.org/rfc/rfc9018.html)
standardizes shared DNS cookies.
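The requirement that every server behind one address shares the Server Secret and builds cookies the same way can be sketched as follows. This is an illustration only: it uses truncated HMAC-SHA256 from the Python standard library as a stand-in for the hash, whereas RFC9018 specifies SipHash-2-4, so the output is not interoperable. The field layout (version, reserved bytes, timestamp, hash) is written from memory of RFC9018 and should be checked against it. All names and values are hypothetical.

```python
import hashlib
import hmac
import ipaddress
import struct


def server_cookie(secret: bytes, client_cookie: bytes,
                  client_ip: bytes, timestamp: int) -> bytes:
    """Build a 16-byte server cookie: version, reserved, timestamp, hash.
    HMAC-SHA256 is a stand-in here, NOT the hash RFC9018 specifies."""
    # Version 1, three reserved zero bytes, then a 32-bit timestamp.
    prefix = struct.pack("!B3xI", 1, timestamp)
    mac = hmac.new(secret, client_cookie + prefix + client_ip,
                   hashlib.sha256).digest()
    return prefix + mac[:8]


# Hypothetical inputs: two servers behind one anycast address.
shared_secret = b"example-shared-secret"
client_cookie = bytes(range(8))
client_ip = ipaddress.ip_address("192.0.2.1").packed
now = 1_700_000_000

cookie_a = server_cookie(shared_secret, client_cookie, client_ip, now)
cookie_b = server_cookie(shared_secret, client_cookie, client_ip, now)
cookie_c = server_cookie(b"a-different-secret", client_cookie, client_ip, now)
print(cookie_a == cookie_b, cookie_a == cookie_c)
```

With the same secret, either server accepts a cookie issued by the other; with different secrets, a client moving between servers would see its cookie rejected.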
### TTL Recommendations
**TTL limits may be adjusted.**
For: All DNS resolver operators.
Software typically defaults to a maximum stored TTL of 1 or 2 days.
A lower TTL will mean removing rarely-used records that have long TTL,
and should not have much operational impact from a CPU or network
point of view.
It is possible to set a minimum TTL in many implementations. This is a
violation of the DNS protocol, although it may be useful to reduce load
from records with very low TTL (less than 5 seconds).
Note that software may set different maximum and minimum TTL
independent of the results that the resolver returns. That may have a
significant impact on queries as well, but resolver operators cannot
influence that.
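The effect of maximum and minimum TTL limits on what a resolver stores can be sketched as a simple clamp. The function name and defaults are hypothetical: the default maximum of one day mirrors the typical software default mentioned above, and the default minimum of zero leaves low TTLs untouched.

```python
def clamp_ttl(ttl: int, min_ttl: int = 0, max_ttl: int = 86400) -> int:
    """Return the TTL a resolver would store after applying its limits."""
    return max(min_ttl, min(ttl, max_ttl))


print(clamp_ttl(604800))        # a one-week TTL is capped at one day
print(clamp_ttl(2, min_ttl=5))  # a very low TTL is raised to the minimum
print(clamp_ttl(300))           # values inside the limits are unchanged
```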
### TTL-based Record Pre-Fetch
**TTL record pre-fetch should be enabled when available.**
For: All DNS resolver operators.
Some resolvers have the ability to look up a record before it has
expired from cache, in order to refresh the value and extend the TTL.
This way there is never a time when the records are missing from the
cache. This is not currently standardized, but a form of this was
proposed in the IETF as
[DNS Hammer](https://datatracker.ietf.org/doc/html/draft-wkumari-dnsop-hammer-03).
We recommend turning this feature on if available.
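The pre-fetch decision can be sketched roughly as follows, assuming a resolver that refreshes a record when a cache hit arrives during the last part of its TTL. The `CacheEntry` type, the `should_prefetch` name, and the 10% threshold are assumptions for illustration; actual implementations choose their own trigger.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    original_ttl: int  # TTL in seconds when the record was cached
    stored_at: float   # time the record was cached, in seconds


def should_prefetch(entry: CacheEntry, now: float, threshold: float = 0.1) -> bool:
    """Return True if a cache hit at `now` should trigger a refresh.

    threshold is the (hypothetical) fraction of the original TTL that
    must remain for the resolver to start refreshing the record.
    """
    remaining = entry.stored_at + entry.original_ttl - now
    return 0 < remaining <= entry.original_ttl * threshold


entry = CacheEntry(original_ttl=300, stored_at=0.0)
print(should_prefetch(entry, now=100.0))  # plenty of TTL left
print(should_prefetch(entry, now=280.0))  # inside the last 10% of the TTL
```

Because the refresh is only triggered by a cache hit, rarely-used records still expire normally and only popular records are kept warm.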
### EDNS Client Subnet (ECS)
**ECS may be enabled.**
For: All DNS resolver operators.
EDNS Client Subnet (ECS) allows the resolver to include information
about the IP address of the client querying it when sending messages
to authoritative servers. This may allow authoritative servers to
provide different answers which are more appropriate for the client.
However, ECS will increase the amount of cache space required by
resolvers, may reduce DNS performance, and may have privacy
implications.
A resolver operator that has clients that are limited to a specific
region may see no benefit. A resolver operator that has a widely
distributed anycast network may not have much benefit from ECS, since
the locations that initiate the query will be close to the client. But
a resolver operator that answers client queries only from a few
locations, and expects clients to come from a wide area, may provide
better service for end-users by supporting ECS.
EDNS client subnet is described in
[RFC7871](https://www.rfc-editor.org/rfc/rfc7871.html), an
informational RFC.
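The cache-space cost of ECS can be illustrated with a small sketch of a cache key that includes the client's truncated subnet. The function name `ecs_cache_key` is hypothetical, and the prefix lengths of 24 bits for IPv4 and 56 bits for IPv6 are assumed here as privacy-minded truncation values; an operator may choose differently.

```python
import ipaddress


def ecs_cache_key(qname: str, qtype: str, client_ip: str,
                  v4_bits: int = 24, v6_bits: int = 56) -> tuple:
    """Build a cache key that separates answers per client subnet."""
    addr = ipaddress.ip_address(client_ip)
    bits = v4_bits if addr.version == 4 else v6_bits
    # strict=False zeroes the host bits, so only the subnet is kept.
    subnet = ipaddress.ip_network(f"{addr}/{bits}", strict=False)
    return (qname.lower(), qtype, str(subnet))


print(ecs_cache_key("Example.com.", "A", "192.0.2.17"))
print(ecs_cache_key("example.com.", "A", "198.51.100.9"))
print(ecs_cache_key("example.com.", "AAAA", "2001:db8:1:2::1"))
```

Without ECS the key would be just the name and type, so a single cache entry would serve every client. With ECS, each distinct client subnet can create its own entry for the same name.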
### Extended DNS Errors
**Extended DNS errors should be enabled.**
For: All DNS resolver operators.
DNS traditionally provides very broad error reporting, SERVFAIL being
the most common. This makes diagnosing and fixing problems difficult.
Extended DNS errors provide extra information about failures, for
example expired DNSSEC signatures. They also allow resolver operators
to report administrative reasons for DNS failures, such as blocks due
to legal requirements.
[RFC8914](https://www.rfc-editor.org/rfc/rfc8914.html) defines
extended DNS errors.
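For readers who want to see what an extended DNS error looks like on the wire, the sketch below encodes and decodes the EDNS option as we understand it from RFC 8914: option code 15, a 16-bit INFO-CODE, and optional UTF-8 EXTRA-TEXT. The helper names `encode_ede` and `decode_ede` are hypothetical, and only a few INFO-CODE values are listed.

```python
import struct

EDE_OPTION_CODE = 15  # EDNS option code for Extended DNS Errors

# A few INFO-CODE values from RFC 8914 (not the full registry).
INFO_CODES = {
    "Signature Expired": 7,
    "Blocked": 15,
    "Censored": 16,
    "Filtered": 17,
    "Prohibited": 18,
}


def encode_ede(info_code: int, extra_text: str = "") -> bytes:
    """Encode one EDE option: OPTION-CODE, OPTION-LENGTH, INFO-CODE, EXTRA-TEXT."""
    text = extra_text.encode("utf-8")
    return struct.pack("!HHH", EDE_OPTION_CODE, 2 + len(text), info_code) + text


def decode_ede(option: bytes) -> tuple:
    """Decode one EDE option into (info_code, extra_text)."""
    code, length, info_code = struct.unpack("!HHH", option[:6])
    if code != EDE_OPTION_CODE or length != len(option) - 4:
        raise ValueError("not a well-formed EDE option")
    return info_code, option[6:].decode("utf-8")


option = encode_ede(INFO_CODES["Signature Expired"], "expired")
print(option.hex(), decode_ede(option))
```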
### Negative Trust Anchors
**Negative trust anchors may be deployed.**
For: All DNS resolver operators.
Negative trust anchors (NTA) allow a resolver operator to handle a
case where an authoritative server has a DNSSEC problem and becomes
inaccessible. They basically disable DNSSEC checking for a domain.
When this is warranted is difficult to know with certainty, and it
usually requires some manual checking. Since DNSSEC validation is now
widespread, DNSSEC failures on the authoritative side will impact many
resolvers.
Because of these reasons this document does not recommend NTA, but
also does not recommend that a deployment avoid NTA if it makes sense
for that environment.
Negative trust anchors are documented in
[RFC7646](https://www.rfc-editor.org/rfc/rfc7646.html).
### DNS Error Reporting
**DNS error reporting may be enabled.**
For: All DNS resolver operators.
DNS error reporting is a way for resolver operators to let
authoritative operators know about problems in authoritative servers
or zones. It provides little direct value for the resolver operators,
but over time should improve the overall quality of the DNS ecosystem.
It is neither widely deployed nor standardized, but hopefully will be
both soon. Resolver operators are encouraged to enable DNS error
reporting when it is available.
DNS error reporting is proposed in
[draft-ietf-dnsop-dns-error-reporting](https://datatracker.ietf.org/doc/draf….
### Trust Anchor Reporting
**Trust anchor reporting should be enabled.**
For: All DNS resolver operators.
Trust anchor reporting is a way for resolver operators to convey their
DNSSEC trust anchor configuration to the operator of the zone that it
is for. For most resolvers this is only the root zone. This
information is intended to be used during a root KSK rollover to
ensure that it is safe to proceed. In the future ICANN is planning an
algorithm roll for the root KSK, and this information could be
helpful. Resolver operators are encouraged to enable trust anchor
reporting.
[RFC8145](https://www.rfc-editor.org/rfc/rfc8145.html) covers trust
anchor reporting, including both of the available mechanisms.
### Name Server Identification
**Servers should be configured to identify themselves.**
For: All DNS resolver operators.
Large resolver operations, especially publicly available resolvers,
should support an obvious in-band method that lets users discover
which node has answered their query. This improves troubleshooting
significantly, and may be useful for research and testing purposes.
NSID (Name Server Identifier) is ideal for this, though
“CH TXT id.server” support is also reasonable. Geographic
hints should be provided in this data, though specific host data is
optional for arrays of servers in clusters. IATA codes have
traditionally been used for naming points-of-presence, though this is
at the discretion of the operator.
[RFC5001](https://www.rfc-editor.org/rfc/rfc5001.html) describes NSID.
[RFC4892](https://www.rfc-editor.org/rfc/rfc4892.html) describes name
server identification in general, and documents the pre-NSID
approaches.
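As a rough illustration of both identification methods, the sketch below builds the wire format of a single query that asks for `id.server` in the CH class as a TXT record and also carries an empty NSID option in the OPT record. The function name `build_id_query`, the query ID, and the advertised UDP payload size of 1232 are assumptions; sending the query and parsing the reply are left out.

```python
import struct


def build_id_query(query_id: int = 0x1234, with_nsid: bool = True) -> bytes:
    """Build a DNS query for id.server CH TXT, optionally requesting NSID."""
    arcount = 1 if with_nsid else 0
    # Header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, arcount)
    labels = (b"id", b"server")
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    question = qname + struct.pack("!HH", 16, 3)  # QTYPE=TXT, QCLASS=CH
    if not with_nsid:
        return header + question
    # NSID is EDNS option code 3 and is sent empty in queries.
    nsid_option = struct.pack("!HH", 3, 0)
    # OPT RR: root name, TYPE=41, CLASS=UDP payload size, TTL=0, RDLENGTH.
    opt_rr = b"\x00" + struct.pack("!HHIH", 41, 1232, 0, len(nsid_option)) + nsid_option
    return header + question + opt_rr


query = build_id_query()
print(len(query), query.hex())
```

The resulting bytes could be sent over UDP to port 53 of the resolver being examined. A node that supports either method would identify itself in the TXT answer, in the NSID option of its response, or both.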
## Privacy, Filtering, Transparency
### Privacy & anonymity
Operators are advised to apply
[RFC8932](https://www.rfc-editor.org/rfc/rfc8932.html)
"Recommendations for DNS Privacy Service Operators" as follows:
1. its operational and policy guidance related to DNS encrypted
transports and data handling, by applying all "Threat mitigations"
(thereby meeting its level of "minimally compliant") and
additionally by applying the "Optimizations" on EDNS Client Subnet
listed in section 5.3.1.
2. its framework on a Recursive operator Privacy Statement, by
publishing a privacy statement on their website that is compliant
with Section 6.
#### Logging considerations
1. Public privacy policy: DNS resolvers are recommended to publish
their privacy policies transparently on their website. This can be a
brief privacy commitment, or it can be more elaborate about how the
privacy policy was made. (See for example
[Cloudflare's
statement](https://developers.cloudflare.com/1.1.1.1/privacy/public-dns-res…
or [Quad9's privacy page](https://www.quad9.net/service/privacy/).)
Such policies should explicitly mention the sampling rate of DNS
queries/responses that are kept, and whether these are anonymized.
2. Third party access to personal data: it seems that the only
critical personal data that DNS resolvers collect are IP addresses
and the queries that are resolved. The other meta data collected
can be used to have an understanding of for example which user
accessed which website which can reveal information about a
person’s health, lifestyle and other personal preferences (we call
this profiling). For example, resolving the website for alcoholics
anonymous may tell you something about the health of a person
behind an IP address. IP addresses are personally identifiable
information. Follow the applicable privacy laws or privacy
principles when receiving third-party requests for access. Resolvers
should only comply with such requests after balancing the legitimate
third-party interest against other fundamental rights.
3. Access to data for researchers: how it is done, who has access and
who can request access, how the resolver makes a decision to give
access (validated and credible researchers, what they can access
and other issues)
4. Data minimization: do not collect personal information not needed
for critical operations. Only retain or use what is being asked
(the query). If collecting data to make the service more private
and secure, explain the rationale for each piece of data (data
collection purpose)
5. Encryption: If data is encrypted, explain how it has been encrypted
(DoH, DoT, or so on).
6. Data security and retention: when to delete the data and how it is
stored
#### Advertisement Policy
If there is any advertising from the service, the policy should be
published as well as how it can potentially affect the users' privacy.
### Filtering and blocking
#### Block Lists
Resolvers can be directed to block or modify answers in various ways.
Blocklists may be provided by governments, communities, or other
parties (for example security firms).
Response Policy Zone (RPZ) allows a way to both document specific
modifications that resolvers will make to DNS answers, and send the
rules to resolvers. This allows updates to occur very quickly. If RPZ
or some other high-speed blocking technology is used, the parties
supplying these sources must be highly trusted, as changes to
blocklists will usually immediately impact user queries.
RPZ is not standardized, but there is an IETF draft,
[draft-vixie-dnsop-dns-rpz](https://datatracker.ietf.org/doc/draft-vixie-dns….
#### Legal blocking
**Legal requests and blocking and filtering laws:** DNS resolvers
should not filter content and block access to web-services. When the
local law requires blocking, and the law applies to the resolver, the
resolver should transparently disclose a list of blocked websites and
services, when possible (disclosing such a list may not be allowed by
law or regulation). Similarly, the resolver should disclose the source
of such block lists, when possible.
If it is not possible to disclose the source of blocklists, operators
should try to be as transparent as possible about how they receive
those blocklists, based on what criteria, and how they mitigate errors
and false positives. Disclosing which organizations operators interact
with, how they liaise, and so on, can help users understand the impact
on the service provided.
If possible, resolvers should provide information about blocked
responses via the Extended DNS Error with the Blocked, Censored,
Filtered, or Prohibited code - whichever applies best - along with a
text why the response was blocked, censored, filtered, or prohibited.
[RFC 8914](https://www.rfc-editor.org/rfc/rfc8914.html#section-4.16)
provides information about the meanings of the different codes.
**Community governance of blocklists:** blocklists, if mandatory, have
to be audited and assessed by third parties and there should be a
right to appeal for those blocked. The Internet community can vet the
blocklists from time to time to avoid blocking access to websites that
are mistakenly blocked. During crisis - when mistakes can have drastic
effects on accessing a critical service - preferably filtering and
blocking should not be used.
#### Opt-in/Opt-out Mechanisms
End users may choose to use a DNS resolver that filters specific kinds
of traffic. For example, they may wish to avoid potential malware web
sites. Or resolver operators may be required to default to filtering
but be allowed to provide an unfiltered DNS resolver service.
Depending on the specific requirements, a resolver service may publish
different IP addresses and what type of filtering applies to each
address. It is also possible to perform client authentication and
authorization, using IP-based authentication, TSIG keys, or
client-side TLS certificates.
### Transparency
DNS resolvers usually provide transparency reports once a year. The
reports inform the public about disclosure of user information and
removal of content required by law enforcement and other government
agencies.
Transparency reports should (to the extent that the law allows)
indicate which government agencies and law enforcement agencies
request access on what basis.
It should also be clear from the transparency reports what kind of
data has been requested and if content removal and content blocking
have been requested. Categories of data include: Content Data, Basic
Subscriber Data, Other Non-Content Data and Content Blocking.
#### Voluntary certificates and standards
Some DNS resolvers opt for obtaining certificates in security and
privacy. Some also undertake audits on their privacy practices. See
for example: https://www.cloudflare.com/trust-hub/compliance-resources/
#### Negative Trust Anchor reporting
Negative Trust Anchors (NTA) are discussed in the previous section on
DNS configurations. If NTAs are present in the resolver, they should
be published with as much detail as possible about them. This includes
reasons for insertion, dates of activation and expected removal dates,
or a published review date or cycle for when NTAs should be actively
examined for deletion if such fine-grained information cannot be
shared.
NTAs are equivalent to a security fault, and may even be more
significant than a block event as they remove expected trust behavior
with limited signal of that trust downgrade ("limited" because few if
any clients care about those response bits changing).
#### Human rights considerations
DNS resolvers can opt for declaring their understanding of their
responsibilities regarding human rights from the Universal Declaration
of Human Rights. Specifically, Quad9 mentions rights to freedoms
without distinction made on the basis of country, no interference with
privacy, the right to freedom of opinion and expression, the right to
peaceful assembly, and the right to freely participate in the cultural
life of the community.
See
[Quad9's Human Rights
Considerations](https://www.quad9.net/privacy/human-rights-considerations/)
for the full statement.
Quad9's statement also cites human rights instruments other than the
[UDHR](https://www.un.org/en/universal-declaration-human-rights/), such
as Articles 8 and 9 of Resolution 42/15 of the United Nations Human
Rights Council on the right to privacy in the digital age of 26
September 2019, which more directly define the responsibilities of the
private sector toward the furtherance of human rights in modern terms.
Quad9 also follows the Guidelines for Human Rights Protocol and
Architecture Considerations of the Human Rights Protocol Considerations
Research Group at the Internet Research Task Force.
The latest version of the IRTF
[Guidelines for
HRPC](https://tools.ietf.org/html/draft-irtf-hrpc-guidelines) may be
considered for all network operators.
## Appendix A: Why Did RIPE Write This Document?
There is increasing concern that large open DNS resolvers will become
centralised points of DNS operations on the Internet. In order to
address this, the European Commission issued the
[DNS4EU](https://hadea.ec.europa.eu/calls-proposals/equipping-backbone-networks-high-performance-and-secure-dns-resolution-infrastructures-works_en)
proposal. However, such an initiative could lead to centralised
guidance or regulation which might interfere with the decentralised
way the Internet infrastructure works - including the DNS. See for
reference the
[RIPE NCC Open House
discussion](https://labs.ripe.net/author/chrisb/dns4eu-ripe-ncc-open-house-…
on this topic.
Rather than attempting to respond to the EC proposal or organize
specific DNS resolver deployments, the RIPE community has decided that
it is best able to provide advice and guidance. The RIPE Community is
well positioned to provide a set of Best Current Practices that
operators of Open DNS Resolvers will be encouraged to subscribe to.
Fellow Task Force members,
We had a Zoom session last week, and discussed all the feedback on the
document.
I made a bunch of edits (listed below) and think the document is
basically ready to be sent to the RIPE NCC for editing and/or to Mirjam
for review. Please have a look and let me know what you think!
One minor outstanding thing is that Andronikos said that he would track
down a reference about BIND connection tracking recommendations, which I
think is this:
https://kb.isc.org/docs/aa-01183
But I cannot remember in what context this was, or where we wanted to
put it in the document. Please help clue me in!
Anyway, I made the following edits based on our conversation:
* Added a section explaining that this document does not use RFC 2119
language.
* Reduced the strength of recommendation for aggressive NSEC caching
from "should" to "may".
* Reduced the strength of recommendation for interoperable DNS cookies
from "should" to "may".
* Changed recommendation to disclose the list of blocked websites and
services to recognize that it is not always possible.
* Added recommendation to disclose the source of block lists when
possible.
* Reworked RPZ section to be a generic block list section, although
RPZ is mentioned as a specific technology for distributing such
lists.
* Added recommendation to include DNS Extended Error codes for blocked
traffic.
* Replaced the section on anycasting with David Knight's text, which
covers high availability in general and anycasting as a single case
within that.
* ZONEMD is now in production, so the section mentioning that it is
being deployed was updated.
* Noted that privacy policies should mention the sampling rate of DNS
queries/responses kept.
* Moved the ad policy out of logging considerations.
* Removed the phrase "this is usually enabled by default" describing
DNSSEC validation, to avoid the chance that resolver operators may
mistakenly overlook this if it is not enabled.
* De-emphasized ALL DNS resolver operators.
* Changed recommendation from all of DoT+DoH+DoQ to one of them.
* Removed some stray text.
* Removed suggestions that a lower maximum TTL may reduce cache size
or save memory.
* Increased the strength of the recommendation for trust anchor
reporting from "may" to "should".
Thank you all for your efforts here. We're almost ready for champagne!
Cheers,
--
Shane
## DNS Resolver Recommendations
About the DNS Resolver Best Common Practice Task Force
https://www.ripe.net/participate/ripe/tf/dns-resolver-best-common-practice-…
## Terminology
* Open Resolver: A DNS resolver that accepts queries from any client.
Often the result of misconfiguration.
* Public Resolver: A resolver intentionally configured to be an open
resolver.
## Introduction
### What Is This Document? Who Is It For?
This document presents recommendations and best current practices for
operating DNS resolvers, both public and non-public ones. It covers
technical aspects of operations and provides best practice
recommendations for data management, with a particular focus on user
privacy, security, and resilience.
The document serves as guidance for the wider Internet community,
offering input to:
* Those running public DNS resolver services, and
* Those who want to make informed choices between such services.
Its purpose is to provide clear guidance and promote effective
practices in DNS resolver operation.
The intended audience is not the entire DNS community. Advice here is
probably not useful for operators of authoritative servers, domain
registrars, and so on. It is also not meant to be an introductory or
educational document. There are many documents which cover the basics
of DNS and the roles of organizations in it; a good overview is:
Addressing the challenges of modern DNS - a comprehensive tutorial
by van der Toorn et al.
https://ris.utwente.nl/ws/files/282427879/1_s2.0_S1574013722000132_main.pdf
The document does not consider how to measure adherence to these
recommendations. So it is not intended to be used for certification,
although certification created based on the principles here is
possible.
### How Is This Document Organized?
This document has a number of sections, and specific recommendations
in each section. The intent is for each recommendation to have clear
guidance at the top, and then background and discussion related to the
recommendation afterwards. Each recommendation indicates whether it is
mostly for operators of public resolvers or for operators of any
resolver.
### About Recommendation Text
This is not a standards document, and does not propose any way to
measure compliance or interoperability. It does use words like
"should" or "may be" throughout. These are meant to be interpreted in
the usual English sense, and not as IETF-style RFC 2119 jargon.
## System and Network Hardening
### Infrastructure considerations
Running any Internet service requires attention to the infrastructure
used to operate it. This section discusses various approaches that can
be used to run a DNS resolver. Everything applies to both public and
non-public DNS resolvers.
#### Bare metal or public cloud
All DNS resolver software can run either on dedicated servers (rented
or colocated), or in virtualized clouds, or in a combination of those.
Every approach has pros and cons. Most of these are not specific to
running DNS resolvers, however, some of them are.
**Running DNS resolver instances as OS level daemons on bare metal
hosts:**
Pros:
- Performance: Bare metal servers have direct access to the
underlying hardware, and can offer superior performance/cost
balance by avoiding the overhead associated with virtualization.
Moreover, you have full control over the server's configurations,
down to the hardware level, which can be beneficial for
performance and cost optimization once you understand your typical
workload during peak hours.
- Data Security: Since you are in control of the physical servers,
there is no risk of data leakage that can occur due to
vulnerabilities in multi-tenant virtualization platforms,
including CPU cache-based side-channel vulnerabilities. It could
be argued that attacks targeting such issues are rare, and their
impact on a DNS resolver service is low, but potential breaches
may have significant privacy impact. It is advised to evaluate
this against your organisation's risk model, or to discuss this
with your information security compliance experts.
- Predictability: Because there is no virtualization layer and no
"noisy neighbours" on the host, the performance of your servers is
more predictable.
Cons:
- Cost of failure: If you pick a hardware configuration that is not
optimal for the workload of your DNS resolver, you may need to
upgrade and replace hardware components afterwards. Ways to reduce
this risk include renting servers instead of buying them, carrying
out load testing with data similar to production workloads, and
providing limited beta access to the service before it fully
enters the production phase.
- Scalability: Scaling up with physical servers means acquiring or
renting, installing, and configuring new hardware, which will take
more time than provisioning new virtual servers in a cloud
environment. Moreover, most cloud environments will provide you
with cluster autoscaling features, which can hardly be achieved
on bare metal.
- Maintenance: You will be responsible for all server maintenance
tasks, including hardware issues, which can require significant
effort and specific expertise.
- Redundancy: Setting up high availability and disaster recovery
strategies can be more complex and time consuming compared to the
cloud, where these features are often provided as value added
products. See the Redundancy section for more details.
**Running DNS resolver instances in containers in a public cloud:**
Pros:
- Scalability: Clouds excel at scaling applications. You can scale
up and down rapidly based on load, which is important for a DNS
resolver that needs to handle variable query loads. Regional or
geographically distributed resolvers are likely to see a daily
cycle in every region where they are deployed: for example, the
peak hour is likely to occur around 19:00 local time, and off-peak
hours may begin at around 01:00-03:00. In that situation, cluster
autoscaling features and tools let you run fewer instances at
night and more instances throughout the day, which may help to
optimize your cloud hosting costs.
- Fault Tolerance and High Availability: Most clouds have built-in
strategies, features, and products for handling node failures,
which can increase your service's availability.
- Deployment and Management: Cloud providers offer built-in methods
to deploy and manage applications, which can simplify operations
and reduce the likelihood of human errors if your infrastructure
management department is already familiar with these tools.
- Cost: While this largely depends on your specific usage, cloud
services can sometimes be more cost-effective than managing your
own physical servers, especially when you consider the total cost
of ownership, including power, cooling, and maintenance.
Cons:
- Performance: The virtualization layer of public clouds can impact
performance. While this certainly could be mitigated through
scaling the number of virtual hosts, the cost would also increase
accordingly.
- Complexity: Advanced cloud technologies are complex systems which
come with a steep learning curve. Without prior experience,
properly configuring and managing a cloud-based compute cluster
can be challenging.
- Cost Variability: While the cloud can be cheaper, it can also be
more expensive if not properly managed. Costs can rise
unexpectedly based on traffic. Make sure to always set some limits
on how much may be spent on hosting in the cloud control panel,
and to set up notifications to be sent to you when these
thresholds are about to be triggered.
- Multi-tenancy Risks: In a public cloud environment, the "noisy
neighbour" problem could potentially affect your service's
performance. Additionally, even though cloud providers take steps
to isolate tenant environments, vulnerabilities could potentially
expose sensitive data (see the previous section for a detailed
explanation).
**Additional considerations**
- In today's environments, Kubernetes and Terraform are sometimes
used as a substitute for cloud APIs when it comes to production
services' management. When running a DNS resolver in a Kubernetes
cluster on top of a public cloud environment, all the pros and
cons of the public cloud apply; basically, Kubernetes becomes your
public cloud provider. If you have significant prior experience
running services in Kubernetes in production, you may successfully
replicate your experience with the DNS resolver software.
Otherwise, we would advise against Kubernetes in this case.
- The only reason we may find to run a DNS resolver in a Kubernetes
cluster on top of self-hosted dedicated servers is when you have
significant hands-on experience with Kubernetes and it is natural
for you to manage applications this way. Otherwise, running DNS
resolver daemons in containers brings little, if any, benefit.
Autoscaling features are not available to you in this case, and
neither horizontal nor vertical pod autoscaling is of any use,
because DNS resolver software typically scales in-host by itself
just fine.
- When designing a cluster of resolvers for autoscaling, keep in
mind that newly spawned resolver machines would need to populate
resolver cache first before they are fully useful. Your DNS
resolver software may provide cache replication mechanisms.
Otherwise, it is safe to overprovision clusters somewhat under
heavy load, and to discard excess instances once all the caches
are populated and the average load of a compute instance
decreases. In addition, it may be worthwhile to consider sharing
cache data between instances.
- It is always advised to prefer environments your infrastructure
management team is familiar with.
### Software considerations
#### Open Source
**Recommendation**: Choose any well-maintained DNS software you are
comfortable using. Regardless of which software you choose, ensure you
have somewhere to go for support. In the case of open source software,
consider providing financial support to ensure continued development.
Some open source maintainers take donations, while others offer
support contracts.
There are both open source and proprietary implementations of DNS
resolver software. Mixing these is also possible, for example, by
using proprietary extensions with open source software or deploying
open source software modified in-house.
General observations:
- Software licensing is orthogonal to software security. Neither is
proprietary software less secure on principle nor are
contributions by "unknown" developers more of a risk in open
source.
Benefits of open source:
- Open source allows for inspection, independent auditing, and
troubleshooting.
- Open source can avoid vendor lock-in.
- Open source can aid internet standards development.
Widely-deployed open source implementations allow proponents of
standards drafts to contribute proof of concept implementations
without permission or cooperation of vendors.
Drawbacks of open source:
- Both open source and proprietary software require skilled
maintenance, which has costs. Proprietary licensed software or
appliances typically come with license fees to cover these. In
contrast, open source licenses decouple usage by operators from
monetary compensation to developers. It is up to operators to
consider the financial sustainability of continued maintenance of
the open source DNS software they depend upon.
Please also consider deploying different software implementations to
ensure diversity, as discussed in the diversity section below.
### Networking considerations
#### IPv4 and IPv6
**If available, both IPv4 and IPv6 must be deployed.**
Large parts of the authoritative DNS are only accessible via IPv4, so
the resolver must be able to originate IPv4 queries. Authoritative DNS
that is only accessible via IPv6 is very rare.
Depending on the connectivity of clients, a resolver may be IPv4-only,
IPv6-only, or support IPv4 and IPv6.
#### Addressing
**Using multiple IP addresses for the service address should be
considered.**
Using 2 or more IPv4 addresses and 2 or more IPv6 addresses from
different RIRs provides resilience against a failure at an RIR, whether
in governance, security, or technology. Note that support for multiple
addresses for recursive resolvers varies and some clients perform
poorly if any address does not respond normally.
There is no need to pick an IPv4 address with all octets the same,
like 2.2.2.2 or 11.11.11.11.
**Publishing a list of back-end addresses used for resolving should be
considered.**
Publishing a list of back-end addresses used for resolving can be
useful for other network & DNS operators (for example, geo-IP
location, making sure data is getting to correct places, and so on).
#### High Availability
This can be considered in terms of local and global scope.
##### Local scope
Inside a single location/region, such as an office, campus, or small
ISP network, the main availability concern is that a resolver is
always reachable. Client systems can be configured with multiple
resolver addresses, but the failover behaviour of stub resolvers to a
second address can be painful. Ideally the primary address is highly
available and such fallback rarely required. How much effort is put
into ensuring this is true should probably scale in line with the
number of users, or sensitivity of the clients using that resolver to
delayed resolution.
There are several ways to promote high availability of an individual
resolver address, such as dedicated load balancing equipment, or
network techniques like VRRP, or IP anycast. These generally have in
common a pool of recursive servers and the means to direct queries to
them when a health check has determined them to be capable of
answering those queries.
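What these techniques have in common can be sketched as a pool filtered by a health check, followed by a choice among the healthy members. Everything named here (`healthy_pool`, `pick_server`, the server labels, and the CRC32-based selection) is an assumption for illustration; a real load balancer, VRRP daemon, or routing protocol implements this in its own way.

```python
import zlib
from typing import Callable, Iterable, List


def healthy_pool(servers: Iterable[str],
                 is_healthy: Callable[[str], bool]) -> List[str]:
    """Keep only the recursive servers that currently pass the health check."""
    return [server for server in servers if is_healthy(server)]


def pick_server(pool: List[str], client_key: str) -> str:
    """Choose a server for a client, spreading load across the healthy pool."""
    if not pool:
        raise RuntimeError("no healthy resolver available")
    # A stable hash keeps a given client on the same server while the
    # pool is unchanged; the choice of CRC32 is purely illustrative.
    return pool[zlib.crc32(client_key.encode("utf-8")) % len(pool)]


servers = ["resolver-a", "resolver-b", "resolver-c"]
pool = healthy_pool(servers, lambda name: name != "resolver-b")
print(pool, pick_server(pool, "192.0.2.10"))
```

In practice the health check would send a real DNS query to each server and compare the answer with an expected result, rather than only testing that the host is up.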
Dedicated free or commercially produced, hardware or software load
balancing solutions are available. These typically own the resolver IP
address and forward queries to the currently available instances of a
pool of recursive servers.
VRRP enables a technique to make the resolver IP address available on
multiple servers, often used to provide automatic failover between
two. A pool of recursive servers using this technique must reside in
the same broadcast domain.
IP anycast in the local scope typically involves a pool of recursive
servers advertising a route to a shared resolver IP address into a
routing protocol. This can be configured in failover or load-sharing
configurations. A load sharing configuration typically requires
network equipment able to balance traffic to a destination over equal
cost paths (ECMP). A pool of recursive servers using this technique
can be distributed in different parts of the network.
##### Global scope
The same concerns as for local service availability are present in the
global scope, with the added issue that DNS resolution over long
distances may be slow. Practically speaking, only multiple resolver
addresses, or IP anycast are useful strategies here. The motivations
for finding better failover solutions than multiple resolver addresses
have been covered above.
IP anycast in the global scope means routing the same IP prefix to
more than one location. This can provide effective solutions for
failover and, when optimally configured, for routing client queries to
the topologically least distant recursive server location. IP anycast
in the global scope requires the use of globally routable prefixes. If
a separate prefix is to be used for anycasting, usually this means a
/24 in IPv4 and a /48 in IPv6, as those are the smallest sizes that
will be widely propagated in BGP. A common practice is to use a
covering prefix (/23 in IPv4 or /47 in IPv6) for fallback, and a
more-specific prefix (/24 or /48) for the traffic. The more-specific
prefix can then be withdrawn to send traffic to a backup site; this
will happen automatically if the site is disconnected from routing.
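The covering-prefix practice relies on longest-prefix matching, which can be illustrated with a small sketch. The prefixes and site names below are illustrative assumptions, and `best_route` is a hypothetical helper that models route selection only, not BGP itself.

```python
import ipaddress


def best_route(destination, routes):
    """Pick the site announcing the longest prefix that covers destination.

    routes maps prefix strings to site names; returns None if nothing matches.
    """
    addr = ipaddress.ip_address(destination)
    matches = []
    for prefix, site in routes.items():
        net = ipaddress.ip_network(prefix)
        if net.version == addr.version and addr in net:
            matches.append((net.prefixlen, site))
    if not matches:
        return None
    return max(matches, key=lambda item: item[0])[1]


# Example addresses only: a /24 for traffic plus a covering /23 for fallback.
routes = {
    "198.51.100.0/23": "backup-site",
    "198.51.100.0/24": "primary-site",
}
resolver_ip = "198.51.100.53"

before = best_route(resolver_ip, routes)   # more-specific /24 wins
del routes["198.51.100.0/24"]              # primary withdraws its /24
after = best_route(resolver_ip, routes)    # traffic falls back to the /23
```

While both prefixes are announced the /24 attracts the traffic; once it is withdrawn, the same resolver address is still reachable through the covering /23.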
[RFC7094](https://www.rfc-editor.org/rfc/rfc7094.html) discusses
anycast architecture in detail, including references to various other
RFCs which discuss anycast in general and DNS in particular.
[RFC4786](https://datatracker.ietf.org/doc/html/rfc4786) discusses
operation of anycast services.
##### Generally
Operators of a globally scoped recursive service are encouraged to
also adopt the local scope recommendations in each of the locations
where the service is provisioned.
Though the above deals with the shortcomings of reliance on stub
resolver failover between a list of addresses, those recommendations
shouldn’t be seen as an exclusive alternative. Multiple resolver
addresses, where each is provisioned using differing failover
strategies, can provide a resolver of last resort and further improved
resilience.
#### Ingress Filtering
**Ingress Filtering to follow BCP 38 should be deployed.**
DNS normally uses UDP traffic, which makes it a common vector of both
[reflection](https://en.wikipedia.org/wiki/Reflection_attack) and
[amplification](https://www.cisa.gov/news-events/alerts/2014/01/17/udp-based…
attacks. To minimize the amount of spoofed traffic that a resolver
responds to, the network should be configured as recommended in
[BCP 38](https://www.rfc-editor.org/rfc/rfc2827.html).
#### RPKI Sign Advertised Routes
**Route Advertisements should be signed using RPKI**
Using RPKI to sign any route advertisements - either toward
authoritative servers or toward DNS clients - is straightforward to do
and will reduce the impact of BGP misconfigurations and some BGP
hijacking attempts.
RPKI validation is also possible, although the effort is greater. It
is possible that the hosting provider or the transit provider for your
service validates BGP; asking and making this part of your selection
criteria is reasonable.
#### (D)DoS measures
Denial-of-Service (DoS) attacks, both distributed (DDoS) and not, are a
threat to any Internet service. Network operators for a service
providing any DNS service must be prepared for large amounts of attack
traffic.
In addition to attacks on the service itself, a resolver may be used
both as an attack reflector and as an attack amplifier.
Active monitoring of network and service usage, careful logging, and a
security team that is able to respond to problem reports are necessary.
Mitigation techniques will include filtering or rate-limiting traffic,
both on the authoritative and client side of the resolver.
### Capacity planning
#### Server capacity
If using a model that is easy to scale (cloud based, or Kubernetes
based, or similar), then getting server capacity correct is largely a
question of budgeting. If using a less-flexible model (bare metal for
example), then under-estimating will mean problems delivering service.
Hardware performance varies widely, as does operating system and
resolver performance. Some lab testing will be necessary to estimate
the number of systems needed.
#### Network capacity
Since DNS is mostly UDP-based, it is often easy to generate large
amounts of spoofed traffic to and from DNS servers. DNS traffic is
small compared to application traffic (videos and other content), but
still significant. Authoritative server operators often build their
networks and servers to handle 10 times their normal load. Recursive
server operators may need to do the same, although if the service only
accepts traffic from IP addresses that cannot be spoofed (for example
users within a network that is operated by the same company) then this
can be reduced, for example to 3 times normal load. To estimate
expected load, the best approach is to examine historical usage for
the actual expected users of the system.
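The sizing rule of thumb above can be turned into a rough calculation. This is only a sketch: the function names are hypothetical, the per-server figure must come from your own lab testing, and the multipliers are simply the 10x and 3x examples from the text.

```python
import math


def required_capacity_qps(historical_peak_qps, sources_can_be_spoofed=True):
    """Apply the headroom multiplier from the text to the observed peak load."""
    factor = 10 if sources_can_be_spoofed else 3
    return historical_peak_qps * factor


def servers_needed(historical_peak_qps, per_server_qps, sources_can_be_spoofed=True):
    """Number of servers, rounding up, given a lab-measured per-server rate."""
    target = required_capacity_qps(historical_peak_qps, sources_can_be_spoofed)
    return math.ceil(target / per_server_qps)


# Illustrative numbers only: 20,000 qps observed peak, 50,000 qps per server.
public_service = servers_needed(20_000, 50_000, sources_can_be_spoofed=True)
internal_service = servers_needed(20_000, 50_000, sources_can_be_spoofed=False)
```

The result covers raw query capacity only; redundancy requirements (for example, surviving the loss of a site) would be added on top.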
### Resilience
#### System Diversity
In addition to the software considerations above, operators should
consider whether to use different server implementations to provide
service. This allows continued operation if a critical vulnerability
is found in one implementation, by shifting traffic to other
implementations.
Placing resolvers and control systems in different physical locations
will allow continued operation in the event of a disaster or other
problem that impacts a single location. In addition, ensuring diverse
connectivity to other networks will prevent single points of failure
on the network side. Ensuring network diversity may take some care, as
it is not always obvious what fate is shared between any given path;
this may be physical, virtual, or organizational, and may sometimes be
hidden.
#### Security
In addition to the DNS-specific security considerations, normal
security best practices for any Internet service should be followed,
including updating software regularly, patching software as
soon as possible for any known security vulnerabilities, following
CERT announcements and so on.
#### Certification
It may be useful or required for an organization to follow specific
certifications, such as ISO or ITIL. These can be government-defined
or industry-defined. For end users there is typically not much direct
value, but business customers will often look for services that are
operated by organizations meeting such standards.
## DNS configuration knobs
The DNS is an old protocol that has a lot of settings that can be
tweaked. This section reviews these and provides recommendations on
which should be used for a resolver.
### DNSSEC validation
**DNSSEC validation should be enabled.**
For: All DNS resolver operators.
DNSSEC validation is the best way to ensure that the answers returned
are those from the owner of the domain name being queried.
The root KSK must be updated when it changes. While
[RFC5011](https://www.rfc-editor.org/rfc/rfc5011.html) defines an
automated way to do this, a resolver operator will probably either
manage this trust anchor directly or have it updated via OS updates.
[RFC9364](https://www.rfc-editor.org/rfc/rfc9364.html) provides a lot
of useful information, and links to further documents about DNSSEC.
However, operators usually do not need to know the details, and can
simply ensure that DNSSEC validation is enabled in their software.
Resolver software that does not support DNSSEC validation should be
avoided.
### DNS Transport Protocols
**UDP and TCP must be supported.**
For: All DNS resolver operators.
UDP is what most clients use, and TCP is necessary for DNS answers
that are too large for a single UDP packet.
[RFC7766](https://www.rfc-editor.org/rfc/rfc7766.html) explains why
TCP is necessary in more detail.
### Packet Fragmentation Avoidance
**Servers should be configured to avoid fragmentation.**
For: All DNS resolver operators.
Packet fragmentation can cause issues with DNS over UDP, especially
over IPv6. These issues can be minimized by choosing implementations
that set IP options to avoid this, and by taking care with EDNS0
message sizes.
Recommendations are available in
[draft-ietf-dnsop-avoid-fragmentation](https://datatracker.ietf.org/doc/draf….
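As an illustration of "taking care with EDNS0 message sizes", the sketch below builds the EDNS0 OPT pseudo-record that advertises a UDP payload size. The value 1232 is an assumption chosen for the example (it is a commonly used conservative size), not a figure from this document; the helper name is hypothetical.

```python
import struct

OPT_RR_TYPE = 41  # EDNS0 OPT pseudo-record type


def build_opt_rr(udp_payload_size=1232, dnssec_ok=False):
    """Build an OPT record with no options.

    Layout: root owner name (one zero byte), TYPE=41, CLASS carries the
    advertised UDP payload size, TTL carries extended RCODE, version and
    flags (DO bit is 0x8000 in the low 16 bits), then RDLENGTH=0.
    """
    ttl_field = 0x8000 if dnssec_ok else 0
    return struct.pack("!BHHIH", 0, OPT_RR_TYPE, udp_payload_size, ttl_field, 0)


opt = build_opt_rr()
```

Advertising a size below the path MTU means larger answers are truncated and retried over TCP instead of being fragmented at the IP layer.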
### Encrypted DNS
**At least one of DNS-over-TLS (DoT), DNS-over-HTTPS (DoH), and
DNS-over-QUIC (DoQ) should be supported.**
For: All DNS resolver operators.
DoT, DoH, and DoQ are different technologies that all provide an
encrypted channel between the client and the resolver.
DoT is the oldest, and provides encrypted DNS using TLS. DoH uses HTTP
over TLS as a way to transmit queries and answers, and is widely
supported by web browsers. DoQ is the newest, and provides advanced
features such as separate streams for each query, avoiding the "head
of line" blocking problem common with all protocols layered on top of
TCP (such as DoT and DoH).
- DoT
- [RFC7858](https://www.rfc-editor.org/rfc/rfc7858.html)
- DoH
- [RFC8484](https://www.rfc-editor.org/rfc/rfc8484.html)
- DoQ
- [RFC9250](https://www.rfc-editor.org/rfc/rfc9250.html)
**Discovery of DNS Designated Resolvers**
There are new mechanisms that allow DNS clients to use DNS records to
discover encrypted DNS configurations. Resolvers should publish DNS
records to assist clients finding encrypted resolvers.
- Discovery of Designated Resolvers
- [RFC9462](https://www.rfc-editor.org/rfc/rfc9462.html)
QUESTION: Do we need to publish certificates in other ways than via
the DDR mechanisms?
### QNAME Minimization
**QNAME minimization should be enabled.**
For: All DNS resolver operators.
Using QNAME minimization, a resolver does not send the full name that
it is trying to resolve to authoritative servers higher in the DNS
hierarchy. So, for example, when querying "atlas.ripe.net" the servers
for ".net" would only be asked for "ripe.net". This improves privacy
for the end user querying the name.
[RFC7816](https://www.rfc-editor.org/rfc/rfc7816.html) covers QNAME
minimization.
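The "atlas.ripe.net" example can be sketched in a few lines. This simplified illustration assumes a zone cut at every label, which is not true in general; real resolvers follow the algorithm in the RFC, and `minimized_names` is a hypothetical helper.

```python
def minimized_names(qname):
    """List the names a minimizing resolver would ask for, top level first.

    Simplifying assumption: every label boundary is a zone cut.
    """
    labels = [label for label in qname.rstrip(".").split(".") if label]
    return [".".join(labels[-n:]) + "." for n in range(1, len(labels) + 1)]


steps = minimized_names("atlas.ripe.net")
# The root servers see only "net.", the ".net" servers see only "ripe.net.",
# and only the "ripe.net" servers see the full name.
```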
### Aggressive NSEC caching
**Aggressive NSEC caching may be enabled.**
For: Public resolver operators.
"Aggressive NSEC caching", meaning negative caching based on NSEC and
NSEC3 values, can reduce traffic greatly. It is an important
protection against random subdomain attacks.
This style of caching takes advantage of the way that NSEC and NSEC3
records cover a range of names in a zone. A resolver can know that a
query falls within such a range without sending any further queries,
by remembering the NSEC or NSEC3 records that it has seen as answers
to earlier queries.
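The range check described above can be sketched for plain NSEC records. This is a simplified illustration: it assumes ASCII names, ignores wildcards and NSEC3 hashing, and presumes the cached NSEC record has already been DNSSEC-validated. The function names and example names are assumptions.

```python
def canonical_key(name):
    """Sort key for canonical DNS name order: labels compared right to left,
    case-insensitively, as byte strings."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    return tuple(label.encode("ascii") for label in reversed(labels))


def covered_by_nsec(qname, owner, next_name):
    """True if qname falls strictly between an NSEC owner and its next name,
    meaning the zone has already proven that qname does not exist."""
    q, low, high = canonical_key(qname), canonical_key(owner), canonical_key(next_name)
    if low < high:
        return low < q < high
    # The last NSEC in a zone points back to the apex: it covers every name
    # inside the zone that sorts after the owner.
    return q[:len(high)] == high and q > low


# A cached record "albatross.example.com. NSEC elephant.example.com."
# lets the resolver answer NXDOMAIN for cat.example.com without a new query.
hit = covered_by_nsec("cat.example.com", "albatross.example.com", "elephant.example.com")
```

Random-subdomain attack queries mostly fall into such gaps, so they can be answered from cache instead of being forwarded to the authoritative servers.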
Aggressive NSEC caching is almost always a good idea. However, enabling
this is less important for DNS resolver operators who have a close
relationship with users, since they can stop attacks by blocking users
or otherwise directly dealing with the source of abusive queries.
[RFC8198](https://www.rfc-editor.org/rfc/rfc8198.html) describes
negative caching in detail.
### Local Root
**Local root should be used.**
For: Public resolver operators.
Running a local root has several benefits, but it is an additional
component to maintain. For public resolver operators this is
definitely worth the cost, but other resolver operators may choose to
simply send all queries to the well-distributed root name servers.
[RFC8806](https://www.rfc-editor.org/rfc/rfc8806.html) describes local
root, including several example configurations.
In the future it will be possible to use ZONEMD to validate the copy
of the root zone obtained before using it. This is currently available
for the root zone.
[RFC8976](https://www.rfc-editor.org/rfc/rfc8976.html) describes ZONEMD.
### DNS Cookies
**Interoperable DNS Cookies may be supported.**
For: Public resolver operators.
DNS cookies provide some improved security over plain UDP, and are
designed to be more lightweight than TCP. If more than one server is
responding for a given IP address, then the Server Secret must be
shared by all servers, and the answer must be constructed in a
consistent manner by all server implementations.
Since client-side support for DNS cookies is not very widespread, and
since managing server-side secrets involves some work, the costs may
outweigh the benefits for some non-public resolver operators.
[RFC7873](https://www.rfc-editor.org/rfc/rfc7873.html) describes DNS
cookies, and [RFC9018](https://www.rfc-editor.org/rfc/rfc9018.html)
standardizes shared DNS cookies.
### TTL Recommendations
**TTL limits may be adjusted.**
For: All DNS resolver operators.
Software typically defaults to a maximum stored TTL of 1 or 2 days.
A lower TTL will mean removing rarely-used records that have long TTL,
and should not have much operational impact from a CPU or network
point of view.
It is possible to set a minimum TTL in many implementations. This is a
violation of the DNS protocol, although it may be useful to reduce load
from records with very low TTL (less than 5 seconds).
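The effect of maximum and minimum TTL limits can be sketched as a simple clamp applied before a record is cached. The default of one day and the parameter names are assumptions for illustration; actual option names and defaults differ between resolver implementations.

```python
def clamp_ttl(ttl, min_ttl=0, max_ttl=86_400):
    """Return the TTL the cache would store, bounded by the configured limits."""
    return max(min_ttl, min(ttl, max_ttl))


week_long = clamp_ttl(604_800)         # capped at the one-day maximum
very_short = clamp_ttl(1, min_ttl=5)   # raised to the minimum (protocol violation)
typical = clamp_ttl(300)               # unchanged
```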
Note that software may set different maximum and minimum TTL
independent of the results that the resolver returns. That may have a
significant impact on queries as well, but resolver operators cannot
influence that.
### TTL-based Record Pre-Fetch
**TTL record pre-fetch should be enabled when available.**
For: All DNS resolver operators.
Some resolvers have the ability to look up a record before it has
expired from cache, in order to refresh the value and extend the TTL.
This way there is never a time when the records are missing from the
cache. This is not currently standardized, but a form of this was
proposed in the IETF as
[DNS
Hammer](https://datatracker.ietf.org/doc/html/draft-wkumari-dnsop-hammer-03).
We recommend turning this feature on if available.
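The pre-fetch decision can be sketched as below. Because the behaviour is not standardized, this is only one plausible policy: the 10% threshold and the function name are assumptions, and implementations may trigger the refresh differently.

```python
def should_prefetch(original_ttl, remaining_ttl, threshold=0.10):
    """Decide, on a cache hit, whether to refresh the record in the background.

    The record is still served from cache; the refresh just starts early,
    once only a small fraction of the original TTL remains.
    """
    if original_ttl <= 0:
        return False
    return 0 < remaining_ttl <= original_ttl * threshold


refresh_now = should_prefetch(original_ttl=3600, remaining_ttl=200)
not_yet = should_prefetch(original_ttl=3600, remaining_ttl=1800)
```

Tying the check to cache hits means only names that clients are still asking for get refreshed, so rarely used records are allowed to expire normally.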
### EDNS Client Subnet (ECS)
**ECS may be enabled.**
For: All DNS resolver operators.
EDNS Client Subnet (ECS) allows the resolver to include information
about the IP address of the client querying it when sending messages
to authoritative servers. This may allow authoritative servers to
provide different answers which are more appropriate for the client.
However, ECS will increase the amount of cache space required by
resolvers, may reduce DNS performance, and may have privacy
implications.
A resolver operator that has clients that are limited to a specific
region may see no benefit. A resolver operator that has a widely
distributed anycast network may not have much benefit from ECS, since
the locations that initiate the query will be close to the client. But
a resolver operator that answers client queries only from a few
locations, and expects clients to come from a wide area, may provide
better service for end-users by supporting ECS.
EDNS client subnet is described in
[RFC7871](https://www.rfc-editor.org/rfc/rfc7871.html), an
informational RFC.
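To limit the privacy impact, a resolver sends only a truncated client prefix rather than the full address. The sketch below shows that truncation using /24 for IPv4 and /56 for IPv6, the maximum lengths suggested in RFC7871; the helper name and example addresses are assumptions.

```python
import ipaddress


def ecs_prefix(client_ip, v4_bits=24, v6_bits=56):
    """Return the truncated network that would be placed in the ECS option."""
    addr = ipaddress.ip_address(client_ip)
    bits = v4_bits if addr.version == 4 else v6_bits
    # strict=False zeroes the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{addr}/{bits}", strict=False))


v4_example = ecs_prefix("192.0.2.77")
v6_example = ecs_prefix("2001:db8:1234:5678::1")
```

Shorter prefixes improve privacy and cache efficiency at the cost of less precise answers from the authoritative server.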
### Extended DNS Errors
**Extended DNS errors should be enabled.**
For: All DNS resolver operators.
DNS traditionally provides very broad error reporting, SERVFAIL being
the most common. This makes diagnosing and fixing problems difficult.
Extended DNS errors provide extra information about failures, for
example expired DNSSEC signatures. They also allow resolver operators
to report administrative reasons for DNS failures, such as blocks due
to legal requirements.
[RFC8914](https://www.rfc-editor.org/rfc/rfc8914.html) defines
extended DNS errors.
### Negative Trust Anchors
**Negative trust anchors may be deployed.**
For: All DNS resolver operators.
Negative trust anchors (NTA) allow a resolver operator to handle a
case where an authoritative server has a DNSSEC problem and becomes
inaccessible. They basically disable DNSSEC checking for a domain.
When this is warranted is difficult to know with certainty, and
usually requires some manual checking. Since DNSSEC validation is now
widespread, DNSSEC failures on the authoritative side will impact many
resolvers.
For these reasons this document does not recommend NTA, but
also does not recommend that a deployment avoid NTA if it makes sense
for that environment.
Negative trust anchors are documented in
[RFC7646](https://www.rfc-editor.org/rfc/rfc7646.html).
### DNS Error Reporting
**DNS error reporting may be enabled.**
For: All DNS resolver operators.
DNS error reporting is a way for resolver operators to let
authoritative operators know about problems in authoritative servers
or zones. It provides little direct value for the resolver operators,
but over time should improve the overall quality of the DNS ecosystem.
It is neither widely deployed nor standardized, but hopefully will be
both soon. Resolver operators are encouraged to enable DNS error
reporting when it is available.
DNS error reporting is proposed in
[draft-ietf-dnsop-dns-error-reporting](https://datatracker.ietf.org/doc/draf….
### Trust Anchor Reporting
**Trust anchor reporting should be enabled.**
For: All DNS resolver operators.
Trust anchor reporting is a way for resolver operators to convey their
DNSSEC trust anchor configuration to the operator of the zone that it
is for. For most resolvers this is only the root zone. This
information is intended to be used during a root KSK rollover to
ensure that it is safe to proceed. In the future ICANN is planning an
algorithm roll for the root KSK, and this information could be
helpful. Resolver operators are encouraged to enable trust anchor
reporting.
[RFC8145](https://www.rfc-editor.org/rfc/rfc8145.html) covers trust
anchor reporting, in both possibilities available.
## Privacy, Filtering, Transparency
### Privacy & anonymity
Operators are advised to apply
[RFC8932](https://www.rfc-editor.org/rfc/rfc8932.html)
"Recommendations for DNS Privacy Service Operators" as follows:
1. its operational and policy guidance related to DNS encrypted
transports and data handling, by applying all "Threat mitigations"
(thereby meeting its level of "minimally compliant") and
additionally by applying the "Optimizations" on EDNS Client Subnet
listed in section 5.3.1.
2. its framework on a Recursive operator Privacy Statement, by
publishing a privacy statement on their website that is compliant
with Section 6.
#### Logging considerations
1. Public privacy policy: DNS resolvers are recommended to publish
their privacy policies transparently on their website. It can be a
brief privacy commitment as well or be more elaborate on how the
privacy policy was made. (See for example
[Cloudflare's
statement](https://developers.cloudflare.com/1.1.1.1/privacy/public-dns-res…
or [Quad9's privacy page](https://www.quad9.net/service/privacy/).)
Such policies should explicitly mention the sampling rate of DNS
queries/responses that are kept, and whether these are anonymized.
2. Third party access to personal data: it seems that the only
critical personal data that DNS resolvers collect are IP addresses
and the queries that are resolved. The other meta data collected
can be used to have an understanding of for example which user
accessed which website which can reveal information about a
person’s health, lifestyle and other personal preferences (we call
this profiling). For example, resolving the website for alcoholics
anonymous may tell you something about the health of a person
behind an IP address. IP addresses are personally identifiable
information. Follow the applicable privacy laws or privacy
principles when receiving third party requests to access. Resolvers
should only comply with such requests when balancing legitimate
third party interest with other fundamental rights.
3. Access to data for researchers: how it is done, who has access and
who can request access, how the resolver makes a decision to give
access (validated and credible researchers, what they can access
and other issues)
4. Data minimization: do not collect personal information not needed
for critical operations. Only retain or use what is being asked
(the query). If collecting data to make the service more private
and secure, explain the rationale for each piece of data (data
collection purpose)
5. Encryption: If data is encrypted, explain how it has been encrypted
(DoH, DoT, or so on).
6. Data security and retention: when to delete the data and how it is
stored
#### Advertisement Policy
If there is any advertising from the service, the policy should be
published as well as how it can potentially affect the users' privacy.
### Filtering and blocking
#### Block Lists
Resolvers can be directed to block or modify answers in various ways.
Blocklists may be provided by governments, communities, or other
parties (for example security firms).
Response Policy Zone (RPZ) allows a way to both document specific
modifications that resolvers will make to DNS answers, and send the
rules to resolvers. This allows updates to occur very quickly. If RPZ
or some other high-speed blocking technology is used, the parties
supplying these sources must be highly trusted, as changes to
blocklists will usually immediately impact user queries.
RPZ is not standardized, but there is an IETF draft,
[draft-vixie-dnsop-dns-rpz](https://datatracker.ietf.org/doc/draft-vixie-dns….
#### Legal blocking
**Legal requests and blocking and filtering laws:** DNS resolvers
should not filter content and block access to web-services. When the
local law requires blocking, and the law applies to the resolver, the
resolver should transparently disclose a list of blocked websites and
services, when possible (disclosing such a list may not be allowed by
law or regulation). Similarly, the resolver should disclose the source
of such block lists, when possible.
If possible, resolvers should provide information about blocked
responses via the Extended DNS Error with the Blocked, Censored,
Filtered, or Prohibited code - whichever applies best - along with a
text why the response was blocked, censored, filtered, or prohibited.
[RFC 8914](https://www.rfc-editor.org/rfc/rfc8914.html#section-4.16)
provides information about the meanings of the different codes.
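As a sketch of what such a response carries, the code below encodes an Extended DNS Error option with one of the four codes named above. The numeric values are those registered by RFC 8914; the helper name and the example text are assumptions, and a real resolver would attach the option to the OPT record through its own configuration rather than by hand.

```python
import struct

EDE_OPTION_CODE = 15  # EDNS0 option code for Extended DNS Errors

# INFO-CODE values for the policy-related errors
INFO_CODES = {
    "Blocked": 15,
    "Censored": 16,
    "Filtered": 17,
    "Prohibited": 18,
}


def build_ede_option(reason, extra_text=""):
    """Encode an EDE option: OPTION-CODE, OPTION-LENGTH, INFO-CODE, EXTRA-TEXT."""
    text = extra_text.encode("utf-8")
    payload = struct.pack("!H", INFO_CODES[reason]) + text
    return struct.pack("!HH", EDE_OPTION_CODE, len(payload)) + payload


option = build_ede_option("Filtered", "malware")
```

The free-text field is where the resolver can explain why the response was blocked, censored, filtered, or prohibited.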
**Community governance of blocklists:** blocklists, if mandatory, have
to be audited and assessed by third parties and there should be a
right to appeal for those blocked. The Internet community can vet the
blocklists from time to time to avoid blocking access to websites that
are mistakenly blocked. During a crisis - when mistakes can have drastic
effects on accessing a critical service - preferably filtering and
blocking should not be used.
#### Opt-in/Opt-out Mechanisms
End users may choose to use a DNS resolver that filters specific kinds
of traffic. For example, they may wish to avoid potential malware web
sites. Or resolver operators may be required to default to filtering
but allowed to provide an unfiltered DNS resolver service.
Depending on the specific requirements, a resolver service may publish
different IP addresses and what type of filtering applies to each
address. It is also possible to perform client authentication and
authorization, using IP-based authentication, TSIG keys, or
client-side TLS certificates.
### Transparency
DNS resolvers usually provide transparency reports once a year. The
reports inform the public about disclosure of user information and
removal of content required by law enforcement and other government
agencies.
Transparency reports should (to the extent that the law allows)
indicate which government agencies and law enforcement agencies
request access on what basis.
It should also be clear from the transparency reports what kind of
data has been requested and if content removal and content blocking
have been requested. Categories of data include: Content Data, Basic
Subscriber Data, Other Non-Content Data and Content Blocking.
#### Voluntary certificates and standards
Some DNS resolvers opt for obtaining certificates in security and
privacy. Some also undertake audits on their privacy practices. See
for example: https://www.cloudflare.com/trust-hub/compliance-resources/
#### Human rights considerations
DNS resolvers can opt for declaring their understanding of their
responsibilities regarding human rights from the Universal Declaration
of Human Rights. Specifically, Quad9 mentions rights to freedoms
without distinction made on the basis of country, no interference with
privacy, the right to freedom of opinion and expression, the right to
peaceful assembly, and the right to freely participate in the cultural
life of the community.
See
[Quad9's Human Rights
Considerations](https://www.quad9.net/privacy/human-rights-considerations/)
for the full statement.
It also invokes human rights instruments other than the
[UDHR](https://www.un.org/en/universal-declaration-human-rights/), such
as Articles 8 and 9 of Resolution 42/15 of the United Nations Human
Rights Council on the right to privacy in the digital age of 26
September 2019, which more directly define the responsibilities of the
private sector toward the furtherance of human rights in modern terms.
They also follow the Guidelines for Human Rights Protocol and
Architecture Consideration of the Human Rights Protocol Considerations
Research Group at Internet Research Task Force.
The latest version of the IRTF
[Guidelines for
HRPC](https://tools.ietf.org/html/draft-irtf-hrpc-guidelines) may be
considered for all network operators.
## Appendix A: Why Did RIPE Write This Document?
There is increasing concern that large open DNS resolvers will become
centralised points of DNS operations on the Internet. In order to
address this, the European Commission issued the
[DNS4EU](https://hadea.ec.europa.eu/calls-proposals/equipping-backbone-networks-high-performance-and-secure-dns-resolution-infrastructures-works_en)
proposal. However, such an initiative could lead to centralised
guidance or regulation which might interfere with the decentralised
way the Internet infrastructure works - including the DNS. See for
reference the
[RIPE NCC Open House
discussion](https://labs.ripe.net/author/chrisb/dns4eu-ripe-ncc-open-house-…
on this topic.
Rather than attempting to respond to the EC proposal or organize
specific DNS resolver deployments, the RIPE community has decided that
it is best able to provide advice and guidance. The RIPE Community is
well positioned to provide a set of Best Current Practices that
operators of Open DNS Resolvers will be encouraged to subscribe to.
DNS Resolver Recommendations
About the DNS Resolver Best Common Practice Task Force
https://www.ripe.net/participate/ripe/tf/dns-resolver-best-common-practice-…
## Terminology
* Open Resolver: A DNS resolver that accepts queries from any client.
Often the result of misconfiguration.
* Public Resolver: A resolver intentionally configured to be an open
resolver.
## Introduction
### What Is This Document? Who Is It For?
This document presents recommendations and best current practices for
operating DNS resolvers, both public and non-public ones. It covers
technical aspects of operations and provides best practice
recommendations for data management, with a particular focus on user
privacy, security, and resilience.
The document serves as guidance for the wider Internet community,
offering input to:
* Those running public DNS resolver services, and
* Those who want to make informed choices between such services.
Its purpose is to provide clear guidance and promote effective
practices in DNS resolver operation.
The intended audience is not the entire DNS community. Advice here is
probably not useful for operators of authoritative servers, domain
registrars, and so on. It is also not meant to be an introductory or
educational document. There are many documents which cover the basics
of DNS and the roles of organizations in it; a good overview is:
Addressing the challenges of modern DNS - a comprehensive tutorial
by van der Toorn et al.
https://ris.utwente.nl/ws/files/282427879/1_s2.0_S1574013722000132_main.pdf
The document does not consider how to measure adherence to these
recommendations. So it is not intended to be used for certification,
although certification created based on the principles here is
possible.
### How Is This Document Organized?
This document has a number of sections, and specific recommendations
in each section. The intent is for each recommendation to have clear
guidance at the top, and then background and discussion related to the
recommendation afterwards. Each recommendation indicates whether it is
mostly for operators of public resolvers or for operators of any
resolver.
### About Recommendation Text
This is not a standards document, and does not propose any way to
measure compliance or interoperability. It does use words like
"should" or "may be" throughout. These are meant to be interpreted in
the usual English sense, and not as IETF-style RFC 2119 jargon.
## System and Network Hardening
### Infrastructure considerations
Running any Internet service requires attention to the infrastructure
used to operate it. This section discusses various approaches that can
be used to run a DNS resolver. Everything applies to both public and
non-public DNS resolvers.
#### Bare metal or public cloud
All DNS resolver software can run either on dedicated servers (rented
or colocated), or in virtualized clouds, or in a combination of those.
Every approach has pros and cons. Most of these are not specific to
running DNS resolvers, however, some of them are.
**Running DNS resolver instances as OS level daemons on bare metal
hosts:**
Pros:
- Performance: Bare metal servers have direct access to the
underlying hardware, and can offer superior performance/cost
balance by avoiding the overhead associated with virtualization.
Moreover, you have full control over the server's configurations,
down to the hardware level, which can be beneficial for
performance and cost optimization once you get the understanding
of your typical work load during peak hours.
- Data Security: Since you are in control of the physical servers,
there is no risk of data leakage that can occur due to
vulnerabilities in multi-tenant virtualization platforms,
including CPU cache-based side-channel vulnerabilities. It could
be argued that attacks targeting such issues are rare, and their
impact on a DNS resolver service is low, but potential breaches
may have significant privacy impact. It is advised to evaluate
this against your organisation's risk model, or to discuss this
with your information security compliance experts.
- Predictability: Because there is no virtualization layer and no
"noisy neighbours" on the host, the performance of your servers is
more predictable.
Cons:
- Cost of failure: If you pick hardware configuration that is not
optimal for the workload of your DNS resolver, you may need to
upgrade and replace hardware components afterwards. Ways to reduce
this risk include renting servers instead of buying them, carrying
out load testing with data similar to production workloads, and
providing limited beta access to the service before it fully
enters the production phase.
- Scalability: Scaling up with physical servers means acquiring or
renting, installing, and configuring new hardware, which will take
more time than provisioning new virtual servers in a cloud
environment. Moreover, most cloud environments will provide you
with cluster autoscaling features, which are difficult to achieve
on bare metal.
- Maintenance: You will be responsible for all server maintenance
tasks, including hardware issues, which can require significant
effort and specific expertise.
- Redundancy: Setting up high availability and disaster recovery
strategies can be more complex and time consuming compared to the
cloud, where these features are often provided as value added
products. See the Redundancy section for more details.
**Running DNS resolver instances in containers in a public cloud:**
Pros:
- Scalability: Clouds excel at scaling applications. You can scale
up and down rapidly based on load, which is important for a DNS
resolver that needs to handle variable query loads. In case of
regional or geographically distributed resolvers, in every region
where the resolver would be deployed, daily periodicity is likely
to be observed, for example peak hour is likely to occur around
19:00 local time, and off-peak hours may begin at around
01:00-03:00. In a situation like that, using cluster autoscaling
features and tools, you can run fewer instances at night and
more instances throughout the day, which may help to optimize your
cloud hosting costs.
- Fault Tolerance and High Availability: Most clouds have built-in
strategies, features, and products for handling node failures,
which can increase your service's availability.
- Deployment and Management: Cloud providers offer built-in methods
to deploy and manage applications, which can simplify operations
and reduce the likelihood of human errors if your infrastructure
management department is already familiar with these tools.
- Cost: While this largely depends on your specific usage, cloud
services can sometimes be more cost-effective than managing your
own physical servers, especially when you consider the total cost
of ownership, including power, cooling, and maintenance.
Cons:
- Performance: The virtualization layer of public clouds can impact
performance. While this certainly could be mitigated through
scaling the number of virtual hosts, the cost would also increase
accordingly.
- Complexity: Advanced cloud technologies are complex systems which
come with a steep learning curve. Without prior experience,
properly configuring and managing a cloud-based compute cluster
can be challenging.
- Cost Variability: While the cloud can be cheaper, it can also be
more expensive if not properly managed. Costs can rise
unexpectedly based on traffic. Make sure to always set some limits
on how much may be spent on hosting in the cloud control panel,
and to set up notifications to be sent to you when these
thresholds are about to be triggered.
- Multi-tenancy Risks: In a public cloud environment, the "noisy
neighbour" problem could potentially affect your service's
performance. Additionally, even though cloud providers take steps
to isolate tenant environments, vulnerabilities could potentially
expose sensitive data (see the previous section for a detailed
explanation).
**Additional considerations**
- In today's environments, Kubernetes and Terraform are sometimes
used as a substitute for cloud APIs when it comes to production
services' management. When running a DNS resolver in a Kubernetes
cluster on top of a public cloud environment, all the pros and
cons of the public cloud apply; basically, Kubernetes becomes your
public cloud provider. If you have significant prior experience
running services in Kubernetes in production, you may successfully
replicate your experience with the DNS resolver software.
Otherwise, we would advise against Kubernetes in this case.
- The only reason we may find to run a DNS resolver in a Kubernetes
cluster on top of self-hosted dedicated servers is when you have
significant hands-on experience with Kubernetes and it is natural
for you to manage applications this way. Otherwise, running DNS
resolver daemons in containers brings little, if any, benefit.
Autoscaling features are not available to you in this case, and
neither horizontal nor vertical pod autoscaling is of any use,
because DNS resolver software typically scales in-host by itself
just fine.
- When designing a cluster of resolvers for autoscaling, keep in
mind that newly spawned resolver machines would need to populate
resolver cache first before they are fully useful. Your DNS
resolver software may provide cache replication mechanisms.
Otherwise, it is safe to overprovision clusters somewhat under
heavy load, and to discard excess instances once all the caches
are populated and the average load of a compute instance
decreases. In addition, it may be worthwhile to consider sharing
cache data between instances.
- It is always advised to prefer environments your infrastructure
management team is familiar with.
### Software considerations
#### Open Source
**Recommendation**: Choose any well-maintained DNS software you are
comfortable using. Regardless of which software you choose, ensure you
have somewhere to go for support. In the case of open source software,
consider providing financial support to ensure continued development.
Some open source maintainers take donations, while others offer
support contracts.
There are both open source and proprietary implementations of DNS
resolver software. Mixing these is also possible, for example, by
using proprietary extensions with open source software or deploying
open source software modified in-house.
General observations:
- Software licensing is orthogonal to software security. Neither is
proprietary software less secure on principle nor are
contributions by "unknown" developers more of a risk in open
source.
Benefits of open source:
- Open source allows for inspection, independent auditing, and
troubleshooting.
- Open source can avoid vendor lock-in.
- Open source can aid internet standards development.
Widely-deployed open source implementations allow proponents of
standards drafts to contribute proof of concept implementations
without permission or cooperation of vendors.
Drawbacks of open source:
- Both open source and proprietary software require skilled
maintenance, which has costs. Proprietary licensed software or
appliances typically come with license fees to cover these. In
contrast, open source licenses decouple usage by operators from
monetary compensation to developers. It is up to operators to
consider the financial sustainability of continued maintenance of
the open source DNS software they depend upon.
Please also consider deploying different software implementations to
ensure diversity, as discussed in the diversity section below.
### Networking considerations
#### IPv4 and IPv6
**If available, both IPv4 and IPv6 must be deployed.**
Large parts of the authoritative DNS are only accessible via IPv4, so
the resolver must be able to originate IPv4 queries. Authoritative DNS
that is only accessible via IPv6 is very rare.
Depending on the connectivity of clients, a resolver may be IPv4-only,
IPv6-only, or support IPv4 and IPv6.
#### Addressing
**Using multiple IP addresses for the service address should be
considered.**
Using 2 or more IPv4 addresses and 2 or more IPv6 addresses from
different RIRs will provide resilience against a failure at one RIR,
whether in governance, security, or technical operations. Note that
support for multiple addresses for recursive resolvers varies and some
clients perform poorly if any address does not respond normally.
There is no need to pick an IPv4 address with all octets the same,
like 2.2.2.2 or 11.11.11.11.
**Publishing a list of back-end addresses used for resolving should be
considered.**
Publishing a list of back-end addresses used for resolving can be
useful for other network & DNS operators (for example, geo-IP
location, making sure data is getting to correct places, and so on).
#### High Availability
This can be considered in terms of local and global scope.
##### Local scope
Inside a single location/region, such as an office, campus, or small
ISP network, the main availability concern is that a resolver is
always reachable. Client systems can be configured with multiple
resolver addresses, but the failover behaviour of stub resolvers to a
second address can be painful. Ideally the primary address is highly
available and such fallback rarely required. How much effort is put
into ensuring this is true should probably scale in line with the
number of users, or sensitivity of the clients using that resolver to
delayed resolution.
There are several ways to promote high availability of an individual
resolver address, such as dedicated load balancing equipment, or
network techniques like VRRP, or IP anycast. These generally have in
common a pool of recursive servers and the means to direct queries to
them when a health check has determined them to be capable of
answering those queries.
Dedicated free or commercially produced, hardware or software load
balancing solutions are available. These typically own the resolver IP
address and forward queries to the currently available instances of a
pool of recursive servers.
VRRP enables a technique to make the resolver IP address available on
multiple servers, often used to provide automatic failover between
two. A pool of recursive servers using this technique must reside in
the same broadcast domain.
IP anycast in the local scope typically involves a pool of recursive
servers advertising a route to a shared resolver IP address into a
routing protocol. This can be configured in failover or load-sharing
configurations. A load sharing configuration typically requires
network equipment able to balance traffic to a destination over equal
cost paths (ECMP). A pool of recursive servers using this technique
can be distributed in different parts of the network.
##### Global scope
The same concerns as for local service availability are present in the
global scope, with the added issue that DNS resolution over long
distances may be slow. Practically speaking, only multiple resolver
addresses, or IP anycast are useful strategies here. The motivations
for finding better failover solutions than multiple resolver addresses
have been covered above.
IP anycast in the global scope means routing the same IP prefix to
more than one location. This can provide effective solutions for
failover and, when optimally configured, for routing client queries to
the topologically least distant recursive server location. IP anycast
in the global scope requires the use of globally routable prefixes. If
a separate prefix is to be used for anycasting, usually this means a
/24 in IPv4 and a /48 in IPv6, as those are the smallest sizes that
will be widely propagated in BGP. A common practice is to use a
covering prefix (/23 in IPv4 or /47 in IPv6) for fallback, and a
more-specific prefix (/24 or /48) for the traffic. The more-specific
prefix can then be withdrawn to send traffic to a backup site; this
will happen automatically if the site is disconnected from routing.
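The covering and more-specific prefix arrangement can be sketched with Python's standard `ipaddress` module. This is only an illustration of longest-prefix-match behaviour, not a routing implementation; the prefixes come from the IPv6 documentation range, and the site names and the `best_route` helper are assumptions made for the example.

```python
import ipaddress

def best_route(destination, routes):
    """Return the (prefix, site) pair with the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, site) for net, site in routes.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda item: item[0].prefixlen)

# Hypothetical prefixes taken from the 2001:db8::/32 documentation range.
covering = ipaddress.ip_network("2001:db8::/47")  # fallback announcement
specific = ipaddress.ip_network("2001:db8::/48")  # carries traffic normally

routes = {covering: "backup-site", specific: "primary-site"}
resolver_address = "2001:db8::53"

# While both prefixes are announced, the more-specific one wins.
normal = best_route(resolver_address, routes)

# Withdrawing the more-specific prefix leaves only the covering prefix.
del routes[specific]
failover = best_route(resolver_address, routes)
```

Here `normal` selects the primary site and `failover` selects the backup site, without the client's resolver address changing.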
[RFC7094](https://www.rfc-editor.org/rfc/rfc7094.html) discusses
anycast architecture in detail, including references to various other
RFCs which discuss anycast in general and DNS in particular.
[RFC4786](https://datatracker.ietf.org/doc/html/rfc4786) discusses
operation of anycast services.
##### Generally
Operators of a globally scoped recursive service are encouraged to
also adopt the local scope recommendations in each of the locations
where the service is provisioned.
Though the above deals with the shortcomings of reliance on stub
resolver failover between a list of addresses, those recommendations
shouldn’t be seen as an exclusive alternative. Multiple resolver
addresses, where each is provisioned using differing failover
strategies, can provide a resolver of last resort and further improved
resilience.
#### Ingress Filtering
**Ingress Filtering to follow BCP 38 should be deployed.**
DNS normally uses UDP traffic, which makes it a common vector of both
[reflection](https://en.wikipedia.org/wiki/Reflection_attack) and
[amplification](https://www.cisa.gov/news-events/alerts/2014/01/17/udp-based…
attacks. To minimize the amount of spoofed traffic that a resolver
responds to, the network should be configured as recommended in
[BCP 38](https://www.rfc-editor.org/rfc/rfc2827.html).
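The core check behind this kind of ingress filtering is whether a packet's source address belongs to a prefix that is legitimately reachable through the interface it arrived on. The sketch below shows that check in isolation; in practice it is done in routers or firewalls rather than in application code, and the prefixes and the `source_is_permitted` helper are hypothetical.

```python
import ipaddress

def source_is_permitted(source, allowed_prefixes):
    """Return True if the source address falls inside one of the allowed prefixes."""
    addr = ipaddress.ip_address(source)
    for prefix in allowed_prefixes:
        net = ipaddress.ip_network(prefix)
        # Only compare addresses and prefixes of the same IP version.
        if net.version == addr.version and addr in net:
            return True
    return False

# Hypothetical prefixes assigned to the clients behind one interface
# (documentation ranges, used here only as placeholders).
customer_prefixes = ["192.0.2.0/24", "2001:db8:1::/48"]

legitimate = source_is_permitted("192.0.2.10", customer_prefixes)
spoofed = source_is_permitted("203.0.113.5", customer_prefixes)
```

A packet like the second one, claiming a source outside the prefixes expected on that interface, would be dropped before it reaches the resolver.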
#### RPKI Sign Advertised Routes
**Route Advertisements should be signed using RPKI**
Using RPKI to sign any route advertisements - either toward
authoritative servers or toward DNS clients - is straightforward to do
and will reduce the impact of BGP misconfigurations and some BGP
hijacking attempts.
RPKI validation is also possible, although the effort is greater. It
is possible that the hosting provider or the transit provider for your
service validates BGP; asking and making this part of your selection
criteria is reasonable.
#### (D)DoS measures
Denial-of-Service (DoS) attacks, both distributed (DDoS) and not, are a
threat to any Internet service. Network operators for a service
providing any DNS service must be prepared for large amounts of attack
traffic.
In addition to attacks on the service itself, a resolver may be used
both as an attack reflector and as an attack amplifier.
Active monitoring of network and service usage, careful logging, and a
security team that is able to respond to problem reports are necessary.
Mitigation techniques will include filtering or rate-limiting traffic,
both on the authoritative and client side of the resolver.
### Capacity planning
#### Server capacity
If using a model that is easy to scale (cloud based, or Kubernetes
based, or similar), then getting server capacity correct is largely a
question of budgeting. If using a less-flexible model (bare metal for
example), then under-estimating will mean problems delivering service.
Hardware performance varies widely, as does operating system and
resolver performance. Some lab testing will be necessary to estimate
the number of systems needed.
#### Network capacity
Since DNS is mostly UDP-based, it is often easy to generate large
amounts of spoofed traffic to and from DNS servers. DNS traffic is
small compared to application traffic (videos and other content), but
still significant. Authoritative server operators often build their
networks and servers to handle 10 times their normal load. Recursive
server operators may need to do the same, although if the service only
accepts traffic from IP addresses that cannot be spoofed (for example
users within a network that is operated by the same company) then this
can be reduced, for example to 3 times normal load. To estimate
expected load, the best approach is to examine historical usage for
the actual expected users of the system.
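The arithmetic above can be written down as a small sketch. The multipliers are the 10x and 3x figures from the text; the function name and the sample query rates are hypothetical and stand in for an operator's own historical measurements.

```python
def required_capacity(peak_qps, sources_spoofable=True):
    """Scale an observed peak load by the headroom factor discussed in the text."""
    headroom = 10 if sources_spoofable else 3
    return peak_qps * headroom

# Hypothetical daily peak query rates (queries per second) from past usage.
historical_peaks = [18_000, 21_500, 19_200]
observed_peak = max(historical_peaks)

# A service reachable from arbitrary (spoofable) sources.
public_target = required_capacity(observed_peak)

# A service that only accepts traffic from its own, non-spoofable network.
internal_target = required_capacity(observed_peak, sources_spoofable=False)
```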
### Resilience
#### System Diversity
In addition to the software considerations above, operators should
consider whether to use different server implementations to provide
service. This allows continued operation if a critical vulnerability
is found in one implementation, by shifting traffic to other
implementations.
Placing resolvers and control systems in different physical locations
will allow continued operation in the event of a disaster or other
problem that impacts a single location. In addition, ensuring diverse
connectivity to other networks will prevent single points of failure
on the network side. Ensuring network diversity may take some care, as
it is not always obvious what fate is shared between any given path;
this may be physical, virtual, or organizational, and may sometimes be
hidden.
#### Security
In addition to the DNS-specific security considerations, normal
security best practices for any Internet service should be followed,
including updating software regularly, patching software as
soon as possible for any known security vulnerabilities, following
CERT announcements and so on.
#### Certification
It may be useful or required for an organization to follow specific
certifications, such as ISO or ITIL. These can be government-defined
or industry-defined. For end users there is typically not much direct
value, but business customers will often look for services that are
operated by organizations meeting such standards.
## DNS configuration knobs
The DNS is an old protocol that has a lot of settings that can be
tweaked. This section reviews these and provides recommendations on
which should be used for a resolver.
### DNSSEC validation
**DNSSEC validation should be enabled.**
For: All DNS resolver operators.
DNSSEC validation is the best way to ensure that the answers returned
are those from the owner of the domain name being queried.
The root KSK must be updated when it changes. While
[RFC5011](https://www.rfc-editor.org/rfc/rfc5011.html) defines an
automated way to do this, a resolver operator will probably either
manage this trust anchor directly or have it updated via OS updates.
[RFC9364](https://www.rfc-editor.org/rfc/rfc9364.html) provides a lot
of useful information, and links to further documents about DNSSEC.
However, operators usually do not need to know the details, and can
simply ensure that DNSSEC validation is enabled in their software.
Resolver software that does not support DNSSEC validation should be
avoided.
### DNS Transport Protocols
**UDP and TCP must be supported.**
For: All DNS resolver operators.
UDP is what most clients use, and TCP is necessary for DNS answers
that are too large for a single UDP packet.
[RFC7766](https://www.rfc-editor.org/rfc/rfc7766.html) explains why
TCP is necessary in more detail.
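A client learns that an answer did not fit in a UDP packet from the TC (truncation) bit in the DNS header, and is then expected to retry over TCP. The sketch below only shows how that bit can be read from a raw message; `example_header` is a hypothetical helper that builds bare headers for demonstration, and no network traffic is involved.

```python
import struct

TC_FLAG = 0x0200  # TC (truncation) bit within the 16-bit DNS header flags field

def is_truncated(message):
    """Return True if the DNS message header has the TC bit set."""
    if len(message) < 12:
        raise ValueError("a DNS message needs at least a 12-byte header")
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & TC_FLAG)

def example_header(flags, query_id=0x1234):
    """Build a bare 12-byte DNS header (demonstration helper only)."""
    return struct.pack("!HHHHHH", query_id, flags, 0, 0, 0, 0)

truncated_reply = example_header(0x8380)  # QR + TC + RD + RA
complete_reply = example_header(0x8180)   # QR + RD + RA

# A stub or forwarder would repeat the query over TCP only in the first case.
retry_over_tcp = is_truncated(truncated_reply)
```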
### Packet Fragmentation Avoidance
**Servers should be configured to avoid fragmentation.**
For: All DNS resolver operators.
Packet fragmentation can cause issues with DNS over UDP, especially
over IPv6. These issues can be minimized by choosing implementations
that set IP options to avoid this, and by taking care with EDNS0
message sizes.
Recommendations are available in
[draft-ietf-dnsop-avoid-fragmentation](https://datatracker.ietf.org/doc/draf….
### Encrypted DNS
**At least one of DNS-over-TLS (DoT), DNS-over-HTTPS (DoH), and
DNS-over-QUIC (DoQ) should be supported.**
For: All DNS resolver operators.
DoT, DoH, and DoQ are different technologies that all provide an
encrypted channel between the client and the resolver.
DoT is the oldest, and provides encrypted DNS using TLS. DoH uses HTTP
over TLS as a way to transmit queries and answers, and is widely
supported by web browsers. DoQ is the newest, and provides advanced
features such as separate streams for each query, avoiding the "head
of line" blocking problem common with all protocols layered on top of
TCP (such as DoT and DoH).
- DoT
- [RFC7858](https://www.rfc-editor.org/rfc/rfc7858.html)
- DoH
- [RFC8484](https://www.rfc-editor.org/rfc/rfc8484.html)
- DoQ
- [RFC9250](https://www.rfc-editor.org/rfc/rfc9250.html)
**Discovery of DNS Designated Resolvers**
There are new mechanisms that allow DNS clients to use DNS records to
discover encrypted DNS configurations. Resolvers should publish DNS
records to assist clients finding encrypted resolvers.
- Discovery of Designated Resolvers
- [RFC9462](https://www.rfc-editor.org/rfc/rfc9462.html)
QUESTION: Do we need to publish certificates in other ways than via the
DDR mechanisms?
### QNAME Minimization
**QNAME minimization should be enabled.**
For: All DNS resolver operators.
Using QNAME minimization, a resolver does not send the full name that
it is trying to resolve to authoritative servers higher in the DNS
hierarchy. So, for example, when querying "atlas.ripe.net" the servers
for ".net" would only be asked for "ripe.net". This improves privacy
for the end user querying the name.
[RFC7816](https://www.rfc-editor.org/rfc/rfc7816.html) covers QNAME
minimization.
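The idea can be sketched as the sequence of names a minimizing resolver would reveal at each level of the hierarchy. This is a simplification under stated assumptions: real implementations also choose which query type to send and may limit the number of steps, and the `minimized_qnames` helper is hypothetical.

```python
def minimized_qnames(name):
    """List the names asked at each delegation level, shortest first."""
    labels = name.rstrip(".").split(".")
    # Start with the rightmost label and add one label per step.
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]

steps = minimized_qnames("atlas.ripe.net")
# The root servers see only "net.", the ".net" servers see only "ripe.net.",
# and only the servers for "ripe.net" see the full name.
```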
### Aggressive NSEC caching
**Aggressive NSEC caching may be enabled.**
For: Public resolver operators.
"Aggressive NSEC caching", meaning negative caching based on NSEC and
NSEC3 values, can reduce traffic greatly. It is important to protect
against random subdomain attacks.
This style of caching takes advantage of the way that NSEC and NSEC3
records cover a range of names in a zone. A resolver can know that a
query falls within such a range without sending any further queries,
by remembering the NSEC or NSEC3 records that it has seen as answers
to earlier queries.
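For plain NSEC, the range check can be sketched as a comparison in canonical DNS name order (labels compared right to left, case-insensitively). The sketch below assumes simple ASCII names without escaped characters, ignores NSEC3 (which works on hashed names), and skips the DNSSEC validation and TTL handling a real resolver needs; all names and helper functions are hypothetical.

```python
def canonical_key(name):
    """Sort key for canonical DNS ordering: lowercased labels, rightmost first."""
    stripped = name.rstrip(".").lower()
    labels = stripped.split(".") if stripped else []
    return tuple(label.encode("ascii") for label in reversed(labels))

def covered_by_nsec(qname, owner, next_name):
    """Return True if qname falls strictly between an NSEC owner and its next name."""
    q, o, n = canonical_key(qname), canonical_key(owner), canonical_key(next_name)
    if o < n:
        return o < q < n
    # The last NSEC in a zone points back to the apex: it covers every
    # in-zone name that sorts after its owner.
    return q[:len(n)] == n and q > o

# A cached NSEC record "alpha.example.com. NSEC delta.example.com." shows
# that nothing exists between those two names.
cached_owner, cached_next = "alpha.example.com.", "delta.example.com."
can_answer_locally = covered_by_nsec("beta.example.com.", cached_owner, cached_next)
must_query = not covered_by_nsec("echo.example.com.", cached_owner, cached_next)
```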
Aggressive NSEC caching is almost always a good idea. However, enabling
this is less important for DNS resolver operators who have a close
relationship with users, since they can stop attacks by blocking users
or otherwise directly dealing with the source of abusive queries.
[RFC8198](https://www.rfc-editor.org/rfc/rfc8198.html) describes
negative caching in detail.
### Local Root
**Local root should be used.**
For: Public resolver operators.
Running a local root has several benefits, but it is an additional
component to maintain. For public resolver operators this is
definitely worth the cost, but other resolver operators may choose to
simply send all queries to the well-distributed root name servers.
[RFC8806](https://www.rfc-editor.org/rfc/rfc8806.html) describes local
root, including several example configurations.
In the future it will be possible to use ZONEMD to validate the copy
of the root zone obtained before using it. This is currently available
for the root zone.
[RFC8976](https://www.rfc-editor.org/rfc/rfc8976.html) describes ZONEMD.
### DNS Cookies
**Interoperable DNS Cookies may be supported.**
For: Public resolver operators.
DNS cookies provide some improved security over plain UDP, and are
designed to be more lightweight than TCP. If more than one server is
responding for a given IP address, then the Server Secret must be
shared by all servers, and the answer must be constructed in a
consistent manner by all server implementations.
Since client-side support for DNS cookies is not very widespread, and
since managing server-side secrets involves some work, the costs may
outweigh the benefits for some non-public resolver operators.
[RFC7873](https://www.rfc-editor.org/rfc/rfc7873.html) describes DNS
cookies, and [RFC9018](https://www.rfc-editor.org/rfc/rfc9018.html)
standardizes shared DNS cookies.
### TTL Recommendations
**TTL limits may be adjusted.**
For: All DNS resolver operators.
Software typically defaults to a maximum stored TTL of 1 or 2 days.
A lower TTL will mean removing rarely-used records that have long TTL,
and should not have much operational impact from a CPU or network
point of view.
It is possible to set a minimum TTL in many implementations. This is a
violation of the DNS protocol, although it may be useful to reduce load
from records with very low TTL (less than 5 seconds).
Note that software may set different maximum and minimum TTL
independent of the results that the resolver returns. That may have a
significant impact on queries as well, but resolver operators cannot
influence that.
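The effect of these limits on a single cached record can be sketched as a simple clamp. The default of one day mirrors the typical maximum mentioned above; the function name and parameter names are hypothetical and do not correspond to any particular resolver's configuration options.

```python
def cache_ttl(record_ttl, min_ttl=0, max_ttl=86400):
    """Clamp a record's TTL (in seconds) to the resolver's configured limits."""
    return max(min_ttl, min(record_ttl, max_ttl))

# A one-week TTL is capped at the one-day maximum.
capped = cache_ttl(604800)

# A 2-second TTL is raised to a 5-second minimum (a protocol violation,
# as noted above, that some operators accept to reduce load).
raised = cache_ttl(2, min_ttl=5)
```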
### TTL-based Record Pre-Fetch
**TTL record pre-fetch should be enabled when available.**
For: All DNS resolver operators.
Some resolvers have the ability to look up a record before it has
expired from cache, in order to refresh the value and extend the TTL.
This way there is never a time when the records are missing from the
cache. This is not currently standardized, but a form of this was
proposed in the IETF as
[DNS
Hammer](https://datatracker.ietf.org/doc/html/draft-wkumari-dnsop-hammer-03).
We recommend turning this feature on if available.
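One possible trigger rule, sketched under the assumption that a refresh is started when a cached record is queried while only a small fraction of its original TTL remains. The 10% threshold and the function name are illustrative assumptions; implementations differ in how and when they prefetch.

```python
def should_prefetch(original_ttl, remaining_ttl, threshold=0.1):
    """Return True if a still-valid record is close enough to expiry to refresh."""
    if original_ttl <= 0:
        return False
    return 0 < remaining_ttl <= original_ttl * threshold

# A record with a one-hour TTL and five minutes left is refreshed early.
refresh_now = should_prefetch(3600, 300)

# The same record with half of its lifetime left is served from cache as-is.
refresh_later = should_prefetch(3600, 1800)
```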
### EDNS Client Subnet (ECS)
**ECS may be enabled.**
For: All DNS resolver operators.
EDNS Client Subnet (ECS) allows the resolver to include information
about the IP address of the client querying it when sending messages
to authoritative servers. This may allow authoritative servers to
provide different answers which are more appropriate for the client.
However, ECS will increase the amount of cache space required by
resolvers, may reduce DNS performance, and may have privacy
implications.
A resolver operator that has clients that are limited to a specific
region may see no benefit. A resolver operator that has a widely
distributed anycast network may not have much benefit from ECS, since
the locations that initiate the query will be close to the client. But
a resolver operator that answers client queries only from a few
locations, and expects clients to come from a wide area, may provide
better service for end-users by supporting ECS.
EDNS client subnet is described in
[RFC7871](https://www.rfc-editor.org/rfc/rfc7871.html), an
informational RFC.
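To limit the privacy impact, a resolver sends only a truncated client prefix rather than the full address. The sketch below shows that truncation with Python's `ipaddress` module; the defaults of 24 bits for IPv4 and 56 bits for IPv6 follow the maximum lengths that RFC7871 recommends, while the function name and the example addresses are hypothetical.

```python
import ipaddress

def ecs_prefix(client_address, v4_bits=24, v6_bits=56):
    """Return the truncated network that would be sent in the ECS option."""
    addr = ipaddress.ip_address(client_address)
    bits = v4_bits if addr.version == 4 else v6_bits
    # strict=False zeroes the host bits instead of raising an error.
    return ipaddress.ip_network(f"{addr}/{bits}", strict=False)

v4_scope = ecs_prefix("192.0.2.77")
v6_scope = ecs_prefix("2001:db8:1234:5678::1")
```

Since cached answers are kept per client prefix, shorter prefixes reduce both the amount of client information revealed and the extra cache space needed, at the cost of less precise answers from authoritative servers.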
### Extended DNS Errors
**Extended DNS errors should be enabled.**
For: All DNS resolver operators.
DNS traditionally provides very broad error reporting, SERVFAIL being
the most common. This makes diagnosing and fixing problems difficult.
Extended DNS errors provide extra information about failures, for
example expired DNSSEC signatures. They also allow resolver operators
to report administrative reasons for DNS failures, such as blocks due
to legal requirements.
[RFC8914](https://www.rfc-editor.org/rfc/rfc8914.html) defines
extended DNS errors.
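On the wire, an extended DNS error is an EDNS option (option code 15) carrying a 16-bit info code followed by optional UTF-8 text. The sketch below encodes and decodes that option on its own, outside of a full DNS message; the table lists only a few of the info codes registered by RFC8914, and the helper names are hypothetical.

```python
import struct

EDE_OPTION_CODE = 15  # EDNS option code assigned to Extended DNS Errors

# A small subset of the info codes defined in RFC 8914.
EDE_CODES = {
    6: "DNSSEC Bogus",
    7: "Signature Expired",
    15: "Blocked",
    16: "Censored",
    17: "Filtered",
    18: "Prohibited",
}

def encode_ede(info_code, extra_text=""):
    """Encode an EDE option: option code, length, info code, extra text."""
    data = struct.pack("!H", info_code) + extra_text.encode("utf-8")
    return struct.pack("!HH", EDE_OPTION_CODE, len(data)) + data

def decode_ede(option):
    """Decode an EDE option into (info code, description, extra text)."""
    code, length = struct.unpack("!HH", option[:4])
    if code != EDE_OPTION_CODE or length < 2 or len(option) < 4 + length:
        raise ValueError("not a well-formed EDE option")
    (info_code,) = struct.unpack("!H", option[4:6])
    text = option[6:4 + length].decode("utf-8")
    return info_code, EDE_CODES.get(info_code, "unknown"), text

wire = encode_ede(15, "blocked by policy")
decoded = decode_ede(wire)
```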
### Negative Trust Anchors
**Negative trust anchors may be deployed.**
For: All DNS resolver operators.
Negative trust anchors (NTA) allow a resolver operator to handle a
case where an authoritative server has a DNSSEC problem and becomes
inaccessible. They basically disable DNSSEC checking for a domain.
When this is warranted is difficult to know with certainty, and
usually requires some manual checking. Since DNSSEC validation is now
widespread, DNSSEC failures on the authoritative side will impact many
resolvers.
For these reasons this document does not recommend NTA, but
also does not recommend that a deployment avoid NTA if it makes sense
for that environment.
Negative trust anchors are documented in
[RFC7646](https://www.rfc-editor.org/rfc/rfc7646.html).
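Since a negative trust anchor applies to a domain and everything beneath it, the lookup a resolver performs can be sketched as a label-wise suffix match. The names and helper functions below are hypothetical, and a real deployment would also give each NTA a limited lifetime so that validation resumes once the zone is repaired.

```python
def name_labels(name):
    """Split a domain name into lowercased labels, ignoring a trailing dot."""
    stripped = name.rstrip(".").lower()
    return stripped.split(".") if stripped else []

def under_nta(qname, negative_trust_anchors):
    """Return True if qname is at or below any configured negative trust anchor."""
    q = name_labels(qname)
    for anchor in negative_trust_anchors:
        a = name_labels(anchor)
        # Compare whole labels so "notbroken.example" does not match "broken.example".
        if len(a) <= len(q) and q[len(q) - len(a):] == a:
            return True
    return False

ntas = ["broken.example"]
skip_validation = under_nta("www.broken.example.", ntas)
still_validated = not under_nta("notbroken.example.", ntas)
```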
### DNS Error Reporting
**DNS error reporting may be enabled.**
For: All DNS resolver operators.
DNS error reporting is a way for resolver operators to let
authoritative operators know about problems in authoritative servers
or zones. It provides little direct value for the resolver operators,
but over time should improve the overall quality of the DNS ecosystem.
It is neither widely deployed nor standardized, but hopefully will be
both soon. Resolver operators are encouraged to enable DNS error
reporting when it is available.
DNS error reporting is proposed in
[draft-ietf-dnsop-dns-error-reporting](https://datatracker.ietf.org/doc/draf….
### Trust Anchor Reporting
**Trust anchor reporting should be enabled.**
For: All DNS resolver operators.
Trust anchor reporting is a way for resolver operators to convey their
DNSSEC trust anchor configuration to the operator of the zone that it
is for. For most resolvers this is only the root zone. This
information is intended to be used during a root KSK rollover to
ensure that it is safe to proceed. In the future ICANN is planning an
algorithm roll for the root KSK, and this information could be
helpful. Resolver operators are encouraged to enable trust anchor
reporting.
[RFC8145](https://www.rfc-editor.org/rfc/rfc8145.html) covers trust
anchor reporting, including both of the mechanisms it defines.
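One of the RFC8145 mechanisms signals the configured key tags through the query name itself: the resolver sends a query for a label made of `_ta-` followed by the key tags as four-digit hexadecimal values in ascending order. The sketch below only builds that name; the helper is hypothetical, and the tags 19036 and 20326 are used as sample values (they correspond to the 2010 and 2017 root KSKs).

```python
def ta_query_name(key_tags, zone="."):
    """Build the "_ta-" query name that reports the given trust anchor key tags."""
    if not key_tags:
        raise ValueError("at least one key tag is needed")
    tags = "-".join(f"{tag:04x}" for tag in sorted(set(key_tags)))
    zone = zone if zone.endswith(".") else zone + "."
    suffix = "" if zone == "." else zone
    return f"_ta-{tags}.{suffix}"

# A resolver trusting a single root key.
single = ta_query_name([20326])

# A resolver holding two root keys, as during a rollover.
during_rollover = ta_query_name([20326, 19036])
```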
## Privacy, Filtering, Transparency
### Privacy & anonymity
Operators are advised to apply
[RFC8932](https://www.rfc-editor.org/rfc/rfc8932.html)
"Recommendations for DNS Privacy Service Operators" as follows:
1. its operational and policy guidance related to DNS encrypted
transports and data handling, by applying all "Threat mitigations"
(thereby meeting its level of "minimally compliant") and
additionally by applying the "Optimizations" on EDNS Client Subnet
listed in section 5.3.1.
2. its framework on a Recursive operator Privacy Statement, by
publishing a privacy statement on their website that is compliant
with Section 6.
#### Logging considerations
1. Public privacy policy: DNS resolvers are recommended to publish
their privacy policies transparently on their website. This can be a
brief privacy commitment, or it can elaborate on how the
privacy policy was made. (See for example
[Cloudflare's
statement](https://developers.cloudflare.com/1.1.1.1/privacy/public-dns-res…
or [Quad9's privacy page](https://www.quad9.net/service/privacy/).)
Such policies should explicitly mention the sampling rate of DNS
queries/responses that are kept, and whether these are anonymized.
2. Third party access to personal data: the only critical personal
data that DNS resolvers collect appear to be IP addresses
and the queries that are resolved. The other metadata collected
can be used to understand, for example, which user
accessed which website, which can reveal information about a
person’s health, lifestyle, and other personal preferences (we call
this profiling). For example, resolving the website for Alcoholics
Anonymous may tell you something about the health of a person
behind an IP address. IP addresses are personally identifiable
information. Follow the applicable privacy laws or privacy
principles when receiving third party requests for access. Resolvers
should only comply with such requests after balancing legitimate
third party interest against other fundamental rights.
3. Access to data for researchers: how it is done, who has access and
who can request access, how the resolver makes a decision to give
access (validated and credible researchers, what they can access
and other issues)
4. Data minimization: do not collect personal information not needed
for critical operations. Only retain or use what is being asked
(the query). If collecting data to make the service more private
and secure, explain the rationale for each piece of data (data
collection purpose)
5. Encryption: If data is encrypted, explain how it has been encrypted
(DoH, DoT, and so on).
6. Data security and retention: when to delete the data and how it is
stored
#### Advertisement Policy
If there is any advertising from the service, the policy should be
published as well as how it can potentially affect the users' privacy.
### Filtering and blocking
#### Block Lists
Resolvers can be directed to block or modify answers in various ways.
Blocklists may be provided by governments, communities, or other
parties (for example security firms).
Response Policy Zone (RPZ) allows a way to both document specific
modifications that resolvers will make to DNS answers, and send the
rules to resolvers. This allows updates to occur very quickly. If RPZ
or some other high-speed blocking technology is used, the parties
supplying these sources must be highly trusted, as changes to
blocklists will usually immediately impact user queries.
RPZ is not standardized, but there is an IETF draft,
[draft-vixie-dnsop-dns-rpz](https://datatracker.ietf.org/doc/draft-vixie-dns….
#### Legal blocking
**Legal requests and blocking and filtering laws:** DNS resolvers
should not filter content and block access to web-services. When the
local law requires blocking, and the law applies to the resolver, the
resolver should transparently disclose a list of blocked websites and
services, when possible (disclosing such a list may not be allowed by
law or regulation). Similarly, the resolver should disclose the source
of such block lists, when possible.
If possible, resolvers should provide information about blocked
responses via the Extended DNS Error with the Blocked, Censored,
Filtered, or Prohibited code - whichever applies best - along with a
text explaining why the response was blocked, censored, filtered, or prohibited.
[RFC 8914](https://www.rfc-editor.org/rfc/rfc8914.html#section-4.16)
provides information about the meanings of the different codes.
**Community governance of blocklists:** blocklists, if mandatory, have
to be audited and assessed by third parties and there should be a
right to appeal for those blocked. The Internet community can vet the
blocklists from time to time to avoid blocking access to websites that
are mistakenly blocked. During a crisis - when mistakes can have drastic
effects on accessing a critical service - preferably filtering and
blocking should not be used.
#### Opt-in/Opt-out Mechanisms
End users may choose to use a DNS resolver that filters specific kinds
of traffic. For example, they may wish to avoid potential malware web
sites. Or resolver operators may be required to default to filtering
but be allowed to provide an unfiltered DNS resolver service.
Depending on the specific requirements, a resolver service may publish
different IP addresses and what type of filtering applies to each
address. It is also possible to perform client authentication and
authorization, using IP-based authentication, TSIG keys, or
client-side TLS certificates.
### Transparency
DNS resolvers usually provide transparency reports once a year. The
reports inform the public about disclosure of user information and
removal of content required by law enforcement and other government
agencies.
Transparency reports should (to the extent that the law allows)
indicate which government agencies and law enforcement agencies
request access on what basis.
It should also be clear from the transparency reports what kind of
data has been requested and if content removal and content blocking
have been requested. Categories of data include: Content Data, Basic
Subscriber Data, Other Non-Content Data and Content Blocking.
#### Voluntary certificates and standards
Some DNS resolvers opt for obtaining certificates in security and
privacy. Some also undertake audits on their privacy practices. See
for example: https://www.cloudflare.com/trust-hub/compliance-resources/
#### Human rights considerations
DNS resolvers can opt for declaring their understanding of their
responsibilities regarding human rights from the Universal Declaration
of Human Rights. Specifically, Quad9 mentions rights to freedoms
without distinction made on the basis of country, no interference with
privacy, the right to freedom of opinion and expression, the right to
peaceful assembly, and the right to freely participate in the cultural
life of the community.
See
[Quad9's Human Rights
Considerations](https://www.quad9.net/privacy/human-rights-considerations/)
for the full statement.
The statement also invokes human rights instruments other than the
[UDHR](https://www.un.org/en/universal-declaration-human-rights/), such
as Articles 8 and 9 of Resolution 42/15 of the United Nations Human
Rights Council on the right to privacy in the digital age of 26
September 2019, which more directly define the responsibilities of the
private sector toward the furtherance of human rights in modern terms.
Quad9 also follows the Guidelines for Human Rights Protocol and
Architecture Considerations of the Human Rights Protocol Considerations
Research Group at the Internet Research Task Force.
The latest version of the IRTF
[Guidelines for
HRPC](https://tools.ietf.org/html/draft-irtf-hrpc-guidelines) may be
considered for all network operators.
## Appendix A: Why Did RIPE Write This Document?
There is increasing concern that large open DNS resolvers will become
centralised points of DNS operations on the Internet. In order to
address this, the European Commission issued the
[DNS4EU](https://hadea.ec.europa.eu/calls-proposals/equipping-backbone-networks-high-performance-and-secure-dns-resolution-infrastructures-works_en)
proposal. However, such an initiative could lead to centralised
guidance or regulation which might interfere with the decentralised
way the Internet infrastructure works - including the DNS. See for
reference the
[RIPE NCC Open House
discussion](https://labs.ripe.net/author/chrisb/dns4eu-ripe-ncc-open-house-…
on this topic.
Rather than attempting to respond to the EC proposal or organize
specific DNS resolver deployments, the RIPE community has decided that
it is best able to provide advice and guidance. The RIPE Community is
well positioned to provide a set of Best Current Practices that
operators of Open DNS Resolvers will be encouraged to subscribe to.
Colleagues,
Here is a draft of the RIPE DNS Resolver Best Common Practices document
that we have been working on.
My intention is to collect feedback during the RIPE 87 meeting and
afterwards, and either publish a RIPE document or another draft based on
that.
I'm going to send a copy of this to the DNS working group list, and
mention that to the RIPE list and the cooperation working group list.
Cheers,
--
Shane
# DNS Resolver Recommendations
About the DNS Resolver Best Common Practice Task Force
https://www.ripe.net/participate/ripe/tf/dns-resolver-best-common-practice-…
## Terminology
* Open Resolver: A DNS resolver that accepts queries from any client.
Often the result of misconfiguration.
* Public Resolver: A resolver intentionally configured to be an open
resolver.
## Introduction
### What Is This Document? Who Is It For?
This document presents recommendations and best current practices for
operating DNS resolvers, both public and non-public ones. It covers
technical aspects of operations and provides best practice
recommendations for data management, with a particular focus on user
privacy, security, and resilience.
The document serves as guidance for the wider Internet community,
offering input to:
* Those running public DNS resolver services, and
* Those who want to make informed choices between such services.
Its purpose is to provide clear guidance and promote effective
practices in DNS resolver operation.
The intended audience is not the entire DNS community. Advice here is
probably not useful for operators of authoritative servers, domain
registrars, and so on. It is also not meant to be an introductory or
educational document. There are many documents which cover the basics
of DNS and the roles of organizations in it; a good overview is:
Addressing the challenges of modern DNS - a comprehensive tutorial
by van der Toorn et al.
https://ris.utwente.nl/ws/files/282427879/1_s2.0_S1574013722000132_main.pdf
The document does not consider how to measure adherence to these
recommendations. So it is not intended to be used for certification,
although certification created based on the principles here is
possible.
### How Is This Document Organized?
This document has a number of sections, and specific recommendations
in each section. The intent is for each recommendation to have clear
guidance at the top, and then background and discussion related to the
recommendation afterwards. Each recommendation indicates whether it is
mostly for operators of public resolvers or for operators of any
resolver.
## System and Network Hardening
### Infrastructure considerations
Running any Internet service requires attention to the infrastructure
used to operate it. This section discusses various approaches that can
be used to run a DNS resolver. Everything applies to both public and
non-public DNS resolvers.
#### Bare metal or public cloud
All DNS resolver software can run either on dedicated servers (rented
or colocated), or in virtualized clouds, or in a combination of those.
Every approach has pros and cons. Most of these are not specific to
running DNS resolvers, however, some of them are.
**Running DNS resolver instances as OS level daemons on bare metal
hosts:**
Pros:
- Performance: Bare metal servers have direct access to the
underlying hardware, and can offer superior performance/cost
balance by avoiding the overhead associated with virtualization.
Moreover, you have full control over the server's configurations,
down to the hardware level, which can be beneficial for
performance and cost optimization once you understand your typical
workload during peak hours.
- Data Security: Since you are in control of the physical servers,
there is no risk of data leakage that can occur due to
vulnerabilities in multi-tenant virtualization platforms,
including CPU cache-based side-channel vulnerabilities. It could
be argued that attacks targeting such issues are rare, and their
impact on a DNS resolver service is low, but potential breaches
may have significant privacy impact. It is advised to evaluate
this against your organisation's risk model, or to discuss this
with your information security compliance experts.
- Predictability: Because there is no virtualization layer and no
"noisy neighbours" on the host, the performance of your servers is
more predictable.
Cons:
- Cost of failure: If you pick a hardware configuration that is not
optimal for the workload of your DNS resolver, you may need to
upgrade and replace hardware components afterwards. Ways to reduce
this risk include renting servers instead of buying them, carrying
out load testing with data similar to production workloads, and
providing limited beta access to the service before it fully
enters the production phase.
- Scalability: Scaling up with physical servers means acquiring or
renting, installing, and configuring new hardware, which will take
more time than provisioning new virtual servers in a cloud
environment. Moreover, most cloud environments will provide you
with cluster autoscaling features, which can hardly be achieved
on bare metal.
- Maintenance: You will be responsible for all server maintenance
tasks, including hardware issues, which can require significant
effort and specific expertise.
- Redundancy: Setting up high availability and disaster recovery
strategies can be more complex and time consuming compared to the
cloud, where these features are often provided as value added
products. See the Redundancy section for more details.
**Running DNS resolver instances in containers in a public cloud:**
Pros:
- Scalability: Clouds excel at scaling applications. You can scale
up and down rapidly based on load, which is important for a DNS
resolver that needs to handle variable query loads. For regional or
geographically distributed resolvers, each region where the
resolver is deployed is likely to show a daily cycle: for example,
the peak hour is likely to occur around 19:00 local time, and
off-peak hours may begin at around 01:00-03:00. In that situation,
cluster autoscaling features and tools let you run fewer instances
at night and more instances throughout the day, which may help to
optimize your cloud hosting costs.
- Fault Tolerance and High Availability: Most clouds have built-in
strategies, features, and products for handling node failures,
which can increase your service's availability.
- Deployment and Management: Cloud providers offer built-in methods
to deploy and manage applications, which can simplify operations
and reduce the likelihood of human errors if your infrastructure
management department is already familiar with these tools.
- Cost: While this largely depends on your specific usage, cloud
services can sometimes be more cost-effective than managing your
own physical servers, especially when you consider the total cost
of ownership, including power, cooling, and maintenance.
Cons:
- Performance: The virtualization layer of public clouds can impact
performance. While this certainly could be mitigated through
scaling the number of virtual hosts, the cost would also increase
accordingly.
- Complexity: Advanced cloud technologies are complex systems which
come with a steep learning curve. Without prior experience,
properly configuring and managing a cloud-based compute cluster
can be challenging.
- Cost Variability: While the cloud can be cheaper, it can also be
more expensive if not properly managed. Costs can rise
unexpectedly based on traffic. Make sure to always set some limits
on how much may be spent on hosting in the cloud control panel,
and to set up notifications to be sent to you when these
thresholds are about to be triggered.
- Multi-tenancy Risks: In a public cloud environment, the "noisy
neighbour" problem could potentially affect your service's
performance. Additionally, even though cloud providers take steps
to isolate tenant environments, vulnerabilities could potentially
expose sensitive data (see the previous section for a detailed
explanation).
**Additional considerations**
- In today's environments, Kubernetes and Terraform are sometimes
used as a substitute for cloud APIs when it comes to production
services' management. When running a DNS resolver in a Kubernetes
cluster on top of a public cloud environment, all the pros and
cons of the public cloud apply; basically, Kubernetes becomes your
public cloud provider. If you have significant prior experience
running services in Kubernetes in production, you may successfully
replicate your experience with the DNS resolver software.
Otherwise, we would advise against Kubernetes in this case.
- The only reason we can see to run a DNS resolver in a Kubernetes
cluster on top of self-hosted dedicated servers is when you have
significant hands-on experience with Kubernetes and it is natural
for you to manage applications this way. Otherwise, running DNS
resolver daemons in containers brings little, if any, benefit.
Autoscaling features are not available to you in this case, and
neither horizontal nor vertical pod autoscaling is of any use,
because DNS resolver software typically scales in-host by itself
just fine.
- When designing a cluster of resolvers for autoscaling, keep in
mind that newly spawned resolver machines would need to populate
resolver cache first before they are fully useful. Your DNS
resolver software may provide cache replication mechanisms.
Otherwise, it is safe to overprovision clusters somewhat under
heavy load, and to discard excess instances once all the caches
are populated and the average load of a compute instance
decreases. In addition, it may be worthwhile to consider sharing
cache data between instances.
- It is always advised to prefer environments your infrastructure
management team is familiar with.
### Software considerations
#### Open Source
**Recommendation**: Choose any well-maintained DNS software you are
comfortable using. Regardless of which software you choose, ensure you
have somewhere to go for support. In the case of open source software,
consider providing financial support to ensure continued development.
Some open source maintainers take donations, while others offer
support contracts.
There are both open source and proprietary implementations of DNS
resolver software. Mixing these is also possible, for example, by
using proprietary extensions with open source software or deploying
open source software modified in-house.
General observations:
- Software licensing is orthogonal to software security. Neither is
proprietary software less secure on principle nor are
contributions by "unknown" developers more of a risk in open
source.
Benefits of open source:
- Open source allows for inspection, independent auditing, and
troubleshooting.
- Open source can avoid vendor lock-in.
- Open source can aid internet standards development.
Widely-deployed open source implementations allow proponents of
standards drafts to contribute proof of concept implementations
without permission or cooperation of vendors.
Drawbacks of open source:
- Both open source and proprietary software require skilled
maintenance, which has costs. Proprietary licensed software or
appliances typically come with license fees to cover these. In
contrast, open source licenses decouple usage by operators from
monetary compensation to developers. It is up to operators to
consider the financial sustainability of continued maintenance of
the open source DNS software they depend upon.
Please also consider deploying different software implementations to
ensure diversity, as discussed in the diversity section below.
### Networking considerations
#### IPv4 and IPv6
**If available, both IPv4 and IPv6 must be deployed.**
Large parts of the authoritative DNS are only accessible via IPv4, so
the resolver must be able to originate IPv4 queries. Authoritative DNS
that is only accessible via IPv6 is very rare.
Depending on the connectivity of clients, a resolver may be IPv4-only,
IPv6-only, or support IPv4 and IPv6.
#### Addressing
**Using multiple IP addresses for the service address should be
considered.**
Using 2 or more IPv4 addresses and 2 or more IPv6 addresses from
different RIRs provides resilience against a failure at one RIR,
whether in governance, security, or technical operations. Note that
support for multiple
addresses for recursive resolvers varies and some clients perform
poorly if any address does not respond normally.
There is no need to pick an IPv4 address with all octets the same,
like 2.2.2.2 or 11.11.11.11.
**Publishing a list of back-end addresses used for resolving should be
considered.**
Publishing a list of back-end addresses used for resolving can be
useful for other network & DNS operators (for example, geo-IP
location, making sure data is getting to correct places, and so on).
#### Anycasting
**Anycasting may be considered.**
Anycasting means routing the same IP prefix to more than one location.
As mentioned above for addressing, client support for multiple
addresses is not always good; with anycasting you can use a single IP
address and have redundancy from different sites. This will often
allow you to place sites close to the user - although it is tricky to
get optimal routing with BGP.
For a resolver service with a single site there is no benefit. For a
resolver service with multiple sites, it may be better to configure
clients with different IP addresses rather than use anycasting.
[RFC7094](https://www.rfc-editor.org/rfc/rfc7094.html) discusses
anycast in detail, including references to various other RFCs which
discuss anycasting in general and for DNS in particular.
If a separate prefix is to be used for anycasting, usually this means
a /24 in IPv4 and a /48 in IPv6, as those are the smallest sizes that
will be widely propagated in BGP. A common practice is to use a
covering prefix (/23 in IPv4 or /47 in IPv6) for fallback, and a
more-specific prefix (/24 or /48) for the traffic. The more-specific
prefix can then be withdrawn to send traffic to a backup site; this
will happen automatically if the site is disconnected from routing.
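
The covering and more-specific prefix arrangement relies on longest-prefix matching in routing. The following is a minimal Python sketch of that behaviour, not a model of BGP itself. The prefixes and the site names `primary-site` and `backup-site` are hypothetical and chosen only for illustration.

```python
import ipaddress

def best_route(destination, routes):
    """Return the site announced for the longest prefix containing destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, site) for net, site in routes.items() if addr in net]
    if not matches:
        return None
    # Longest-prefix match: the most specific announcement wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]

# Hypothetical announcements; prefixes and site names are illustrative only.
covering = ipaddress.ip_network("198.51.100.0/23")
more_specific = ipaddress.ip_network("198.51.100.0/24")
routes = {covering: "backup-site", more_specific: "primary-site"}

before = best_route("198.51.100.53", routes)

# Withdrawing the more-specific prefix leaves only the covering prefix,
# so the same service address is now routed to the backup site.
del routes[more_specific]
after = best_route("198.51.100.53", routes)
```

Here `before` is the primary site and `after` is the backup site, without any change on the client side.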
#### Ingress Filtering
**Ingress Filtering to follow BCP 38 should be deployed.**
DNS normally uses UDP traffic, which makes it a common vector of both
[reflection](https://en.wikipedia.org/wiki/Reflection_attack) and
[amplification](https://www.cisa.gov/news-events/alerts/2014/01/17/udp-based…
attacks. To minimize the amount of spoofed traffic that a resolver
responds to, the network should be configured as recommended in
[BCP 38](https://www.rfc-editor.org/rfc/rfc2827.html).
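
The core check behind BCP 38 is that a packet arriving from a customer-facing interface must carry a source address from the prefixes assigned to that customer. The sketch below shows that check in Python for illustration only. In practice it is done in routers or firewalls, and the prefixes used here are documentation ranges, not real assignments.

```python
import ipaddress

def source_is_valid(source, customer_prefixes):
    """Return True if the source address belongs to one of the assigned prefixes."""
    addr = ipaddress.ip_address(source)
    return any(addr in net for net in customer_prefixes)

# Hypothetical prefixes assigned to the network behind one interface.
customer_prefixes = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8:1::/48"),
]

accepted = source_is_valid("192.0.2.10", customer_prefixes)
dropped = not source_is_valid("203.0.113.5", customer_prefixes)  # spoofed source
```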
#### RPKI Sign Advertised Routes
**Route Advertisements should be signed using RPKI.**
Using RPKI to sign any route advertisements - either toward
authoritative servers or toward DNS clients - is straightforward to do
and will reduce the impact of BGP misconfigurations and some BGP
hijacking attempts.
RPKI validation is also possible, although the effort is greater. It
is possible that the hosting provider or the transit provider for your
service validates BGP; asking and making this part of your selection
criteria is reasonable.
#### (D)DoS measures
Denial-of-Service (DoS) attacks, both distributed (DDoS) and not, are a
threat to any Internet service. Network operators for a service
providing any DNS service must be prepared for large amounts of attack
traffic.
In addition to attacks on the service itself, a resolver may be used
both as an attack reflector and as an attack amplifier.
Active monitoring of network and service usage, careful logging, and a
security team that is able to respond to problem reports are necessary.
Mitigation techniques will include filtering or rate-limiting traffic,
both on the authoritative and client side of the resolver.
### Capacity planning
#### Server capacity
If using a model that is easy to scale (cloud based, or Kubernetes
based, or similar), then getting server capacity correct is largely a
question of budgeting. If using a less-flexible model (bare metal for
example), then under-estimating will mean problems delivering service.
Hardware performance varies widely, as does operating system and
resolver performance. Some lab testing will be necessary to estimate
the number of systems needed.
#### Network capacity
Since DNS is mostly UDP-based, it is often easy to generate large
amounts of spoofed traffic to and from DNS servers. DNS traffic is
small compared to application traffic (videos and other content), but
still significant. Authoritative server operators often build their
networks and servers to handle 10 times their normal load. Recursive
server operators may need to do the same, although if the service only
accepts traffic from IP addresses that cannot be spoofed (for example
users within a network that is operated by the same company) then this
can be reduced, for example to 3 times normal load. To estimate
expected load, the best approach is to examine historical usage for
the actual expected users of the system.
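
As a rough illustration of how the headroom multipliers translate into a server count, here is a small Python sketch. The query rates are invented for the example. The per-server figure stands in for whatever your own lab testing measures.

```python
import math

def servers_needed(normal_qps, headroom, per_server_qps):
    """Servers required to carry normal load multiplied by a headroom factor."""
    return math.ceil(normal_qps * headroom / per_server_qps)

# Hypothetical inputs: 40,000 qps normal load, 50,000 qps per server in lab tests.
public_service = servers_needed(40_000, 10, 50_000)   # clients may be spoofed
closed_service = servers_needed(40_000, 3, 50_000)    # clients cannot be spoofed
```

With these assumed numbers the public service needs 8 servers and the closed service needs 3, before adding any spares for redundancy.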
### Resilience
#### System Diversity
In addition to the software considerations above, operators should
consider whether to use different server implementations to provide
service. This allows continued operation if a critical vulnerability
is found in one implementation, by shifting traffic to other
implementations.
Placing resolvers and control systems in different physical locations
will allow continued operation in the event of a disaster or other
problem that impacts a single location. In addition, ensuring diverse
connectivity to other networks will prevent single points of failure
on the network side. Ensuring network diversity may take some care, as
it is not always obvious what fate is shared between any given path;
this may be physical, virtual, or organizational, and may sometimes be
hidden.
#### Security
In addition to the DNS-specific security considerations, normal
security best practices for any Internet service should be followed,
including updating software regularly, patching software as
soon as possible for any known security vulnerabilities, following
CERT announcements and so on.
#### Certification
It may be useful or required for an organization to follow specific
certifications, such as ISO or ITIL. These can be government-defined
or industry-defined. For end users there is typically not much direct
value, but business customers will often look for services that are
operated by organizations meeting such standards.
## DNS configuration knobs
The DNS is an old protocol that has a lot of settings that can be
tweaked. This section reviews these and provides recommendations on
which should be used for a resolver.
### DNSSEC validation
**DNSSEC validation should be enabled.**
For: All DNS resolver operators.
DNSSEC validation is the best way to ensure that the answers returned
are the ones published by the owner of the domain name being queried.
The root KSK must be updated when it changes. While
[RFC5011](https://www.rfc-editor.org/rfc/rfc5011.html) defines an
automated way to do this, a resolver operator will probably either
manage this trust anchor directly or have it updated via OS updates.
[RFC9364](https://www.rfc-editor.org/rfc/rfc9364.html) provides a lot
of useful information, and links to further documents about DNSSEC.
However, operators usually do not need to know the details, and can
simply ensure that DNSSEC validation is enabled in their software;
this is usually enabled by default.
Resolver software that does not support DNSSEC validation should be
avoided.
### DNS Transport Protocols
**UDP and TCP must be supported.**
For: All DNS resolver operators.
UDP is what most clients use, and TCP is necessary for DNS answers
that are too large for a single UDP packet.
[RFC7766](https://www.rfc-editor.org/rfc/rfc7766.html) explains why
TCP is necessary in more detail.
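
When an answer does not fit in a UDP response, the server sets the TC (truncation) bit in the DNS header and the client is expected to retry over TCP. The Python sketch below only shows how that bit is detected. The `header` helper builds bare 12-byte headers for demonstration and is not a full DNS message encoder.

```python
import struct

TC_FLAG = 0x0200  # truncation bit in the 16-bit DNS header flags field

def is_truncated(message):
    """Return True if the DNS message has the TC bit set."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    _message_id, flags = struct.unpack("!HH", message[:4])
    return bool(flags & TC_FLAG)

def header(flags):
    """Build a bare DNS header with all section counts zero (demo only)."""
    return struct.pack("!HHHHHH", 0x1234, flags, 0, 0, 0, 0)

QR, RD, RA = 0x8000, 0x0100, 0x0080
truncated_reply = header(QR | TC_FLAG | RD | RA)
complete_reply = header(QR | RD | RA)

retry_over_tcp = is_truncated(truncated_reply)
```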
### Packet Fragmentation Avoidance
**Servers should be configured to avoid fragmentation.**
For: All DNS resolver operators.
Packet fragmentation can cause issues with DNS over UDP, especially
over IPv6. These issues can be minimized by choosing implementations
that set IP options to avoid this, and by taking care with EDNS0
message sizes.
Recommendations are available in
[draft-ietf-dnsop-avoid-fragmentation](https://datatracker.ietf.org/doc/draf….
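
To see how EDNS0 message sizes relate to fragmentation, the following Python sketch computes the largest DNS payload that fits in one unfragmented UDP packet for a given path MTU. The MTU values are assumptions for the example and are not taken from the draft. IP options and IPv6 extension headers are ignored.

```python
IPV4_HEADER = 20  # bytes, without IP options
IPV6_HEADER = 40  # bytes, fixed header without extension headers
UDP_HEADER = 8    # bytes

def max_dns_payload(path_mtu, ipv6):
    """Largest DNS message that fits in a single packet of size path_mtu."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return path_mtu - ip_header - UDP_HEADER

# 1280 is the minimum MTU every IPv6 link must support.
ipv6_minimum = max_dns_payload(1280, ipv6=True)
# 1500 is a common Ethernet MTU, assumed here for illustration.
ipv4_ethernet = max_dns_payload(1500, ipv6=False)
```

An advertised EDNS0 buffer size at or below such a value keeps UDP responses within one packet on paths with that MTU.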
### Encrypted DNS
**DNS-over-TLS (DoT), DNS-over-HTTPS (DoH), and DNS-over-QUIC (DoQ)
should be supported.**
For: All DNS resolver operators.
DoT, DoH, and DoQ are different technologies that all provide an
encrypted channel between the client and the resolver.
DoT is the oldest, and provides encrypted DNS using TLS. DoH uses HTTP
over TLS as a way to transmit queries and answers, and is widely
supported by web browsers. DoQ is the newest, and provides advanced
features such as separate streams for each query, avoiding the "head
of line" blocking problem common with all protocols layered on top of
TCP (such as DoT and DoH).
- DoT
- [RFC7858](https://www.rfc-editor.org/rfc/rfc7858.html)
- DoH
- [RFC8484](https://www.rfc-editor.org/rfc/rfc8484.html)
- DoQ
- [RFC9250](https://www.rfc-editor.org/rfc/rfc9250.html)
**Discovery of DNS Designated Resolvers**
There are new mechanisms that allow DNS clients to use DNS records to
discover encrypted DNS configurations. Resolvers should publish DNS
records to assist clients in finding encrypted resolvers.
- Discovery of Designated Resolvers
- [RFC9462](https://www.rfc-editor.org/rfc/rfc9462.html)
QUESTION: Do we need to publish certificates in other ways than via the
DDR mechanisms?
### QNAME Minimization
**QNAME minimization should be enabled.**
For: All DNS resolver operators.
Using QNAME minimization, a resolver does not send the full name that
it is trying to resolve to authoritative servers higher in the DNS
hierarchy. So, for example, when querying "atlas.ripe.net" the servers
for ".net" would only be asked for "ripe.net". This improves privacy
for the end user querying the name.
[RFC7816](https://www.rfc-editor.org/rfc/rfc7816.html) covers QNAME
minimization.
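
The "atlas.ripe.net" example can be sketched in Python as the sequence of names a minimizing resolver would send while walking down the hierarchy. This is a simplification that assumes every label is a zone cut and ignores caching and query type handling, which real implementations take into account.

```python
def minimized_qnames(name):
    """Names sent at each step, from the root servers down to the final zone."""
    labels = name.rstrip(".").split(".")
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]

steps = minimized_qnames("atlas.ripe.net")
# steps[0] is sent to the root servers, steps[1] to the ".net" servers,
# and only the "ripe.net" servers see the full name in steps[2].
```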
### Aggressive NSEC caching
**Aggressive NSEC caching should be enabled.**
For: Public resolver operators.
"Aggressive NSEC caching", meaning negative caching based on NSEC and
NSEC3 values, can reduce traffic greatly. It is an important protection
against random subdomain attacks.
This style of caching takes advantage of the way that NSEC and NSEC3
records cover a range of names in a zone. A resolver can know that a
query falls within such a range without sending any further queries,
by remembering the NSEC or NSEC3 records that it has seen as answers
to earlier queries.
Aggressive NSEC caching is almost always a good idea. However, enabling
this is less important for DNS resolver operators who have a close
relationship with users, since they can stop attacks by blocking users
or otherwise directly dealing with the source of abusive queries.
[RFC8198](https://www.rfc-editor.org/rfc/rfc8198.html) describes
negative caching in detail.
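
The range check at the heart of this technique can be sketched in Python as follows. It covers plain NSEC only (NSEC3 works on hashed names), uses the canonical DNS name ordering, and assumes the queried name is already known to be inside the zone. The zone and record names are hypothetical.

```python
def canonical_key(name):
    """Sort key for canonical DNS ordering: lowercased labels, rightmost first."""
    labels = name.rstrip(".").lower().split(".")
    return tuple(label.encode("ascii") for label in reversed(labels))

def nsec_covers(owner, next_name, qname):
    """Return True if a cached NSEC (owner -> next_name) proves qname is absent."""
    low, high, query = (canonical_key(n) for n in (owner, next_name, qname))
    if low < high:
        return low < query < high
    # The last NSEC in a zone points back to the apex, so it covers
    # every in-zone name that sorts after its owner.
    return query > low

# Hypothetical cached record: "alpha.example. NSEC delta.example."
answer_from_cache = nsec_covers("alpha.example.", "delta.example.", "bravo.example.")
must_query = not nsec_covers("alpha.example.", "delta.example.", "echo.example.")
```

A resolver holding that record can answer a query for "bravo.example." with a negative response from cache, while "echo.example." still requires an upstream query.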
### Local Root
**Local root should be used.**
For: Public resolver operators.
Since the root zone is DNSSEC signed,
Running a local root has several benefits, but it is an additional
component to maintain. For public resolver operators this is
definitely worth the cost, but other resolver operators may choose to
simply send all queries to the well-distributed root name servers.
[RFC8806](https://www.rfc-editor.org/rfc/rfc8806.html) describes local
root, including several example configurations.
In the future it will be possible to use ZONEMD to validate the copy
of the root zone obtained before using it. This is currently being
deployed for the root zone, but not yet available.
[RFC8976](https://www.rfc-editor.org/rfc/rfc8976.html) describes ZONEMD.
### DNS Cookies
**Interoperable DNS Cookies should be supported.**
For: Public resolver operators.
DNS cookies provide some improved security over plain UDP, and are
designed to be more lightweight than TCP. If more than one server is
responding for a given IP address, then the Server Secret must be
shared by all servers, and the answer must be constructed in a
consistent manner by all server implementations.
Since client-side support for DNS cookies is not very widespread, and
since managing server-side secrets involves some work, the costs may
outweigh the benefits for some non-public resolver operators.
[RFC7873](https://www.rfc-editor.org/rfc/rfc7873.html) describes DNS
cookies, and [RFC9018](https://www.rfc-editor.org/rfc/rfc9018.html)
standardizes shared DNS cookies.
### TTL Recommendations
**TTL limits may be adjusted.**
For: All DNS resolver operators.
Software typically defaults to a maximum stored TTL of 1 or 2 days.
This may be lowered to reduce the cache size. A lower TTL will mean
removing rarely-used records that have long TTL, and should not have
much operational impact from a CPU or network point of view, but may
save memory.
It is possible to set a minimum TTL in many implementations. This is a
violation of the DNS protocol, although it may be useful to reduce load
from records with very low TTL (less than 5 seconds).
Note that software may set different maximum and minimum TTL
independent of the results that the resolver returns. That may have a
significant impact on queries as well, but resolver operators cannot
influence that.
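
The effect of maximum and minimum TTL limits on a cached record can be shown with a short Python sketch. The function name and default values are illustrative assumptions and do not correspond to any particular resolver's configuration options.

```python
def clamp_ttl(ttl, min_ttl=0, max_ttl=86400):
    """Apply cache TTL limits to the TTL received from an authoritative server."""
    return max(min_ttl, min(ttl, max_ttl))

one_week_record = clamp_ttl(604800)         # capped at the 1 day maximum
short_lived_record = clamp_ttl(2, min_ttl=5)  # raised to an assumed 5 second minimum
typical_record = clamp_ttl(300)             # within limits, left unchanged
```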
### TTL-based Record Pre-Fetch
**TTL record pre-fetch should be enabled when available.**
For: All DNS resolver operators.
Some resolvers have the ability to look up a record before it has
expired from cache, in order to refresh the value and extend the TTL.
This way there is never a time when the records are missing from the
cache. This is not currently standardized, but a form of this was
proposed in the IETF as
[DNS
Hammer](https://datatracker.ietf.org/doc/html/draft-wkumari-dnsop-hammer-03).
We recommend turning this feature on if available.
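
One possible pre-fetch policy is to refresh a record when a client asks for it and only a small fraction of its original TTL remains. The Python sketch below assumes a 10% threshold, which is a hypothetical value. Implementations differ in how and when they trigger a refresh.

```python
def should_prefetch(original_ttl, remaining_ttl, threshold=0.1):
    """Return True if a cache hit should also trigger a background refresh."""
    if original_ttl <= 0:
        return False
    return 0 < remaining_ttl <= original_ttl * threshold

near_expiry = should_prefetch(3600, 300)    # 300 s left of 3600 s
still_fresh = should_prefetch(3600, 1800)   # half the TTL remains
```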
### EDNS Client Subnet (ECS)
**ECS may be enabled.**
For: All DNS resolver operators.
EDNS Client Subnet (ECS) allows the resolver to include information
about the IP address of the client querying it when sending messages
to authoritative servers. This may allow authoritative servers to
provide different answers which are more appropriate for the client.
However, ECS will increase the amount of cache space required by
resolvers, may reduce DNS performance, and may have privacy
implications.
A resolver operator that has clients that are limited to a specific
region may see no benefit. A resolver operator that has a widely
distributed anycast network may not have much benefit from ECS, since
the locations that initiate the query will be close to the client. But
a resolver operator that answers client queries only from a few
locations, and expects clients to come from a wide area, may provide
better service for end-users by supporting ECS.
EDNS client subnet is described in
[RFC7871](https://www.rfc-editor.org/rfc/rfc7871.html), an
informational RFC.
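
To limit the privacy impact, a resolver that enables ECS usually sends only a truncated client prefix rather than the full address. The Python sketch below illustrates the truncation. The default lengths of 24 bits for IPv4 and 56 bits for IPv6 are assumptions chosen to match the lengths we understand RFC7871 to suggest, and should be checked against the RFC and local policy.

```python
import ipaddress

def ecs_subnet(client_ip, v4_prefix=24, v6_prefix=56):
    """Return the truncated client subnet that would be placed in the ECS option."""
    addr = ipaddress.ip_address(client_ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False zeroes the host bits instead of raising an error.
    return ipaddress.ip_network(f"{addr}/{prefix}", strict=False)

v4_subnet = str(ecs_subnet("192.0.2.77"))
v6_subnet = str(ecs_subnet("2001:db8:1234:5678::1"))
```

Each distinct subnet may need its own cache entry for a given name, which is where the extra cache space mentioned above comes from.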
### Extended DNS Errors
**Extended DNS errors should be enabled.**
For: All DNS resolver operators.
DNS traditionally provides very broad error reporting, SERVFAIL being
the most common. This makes diagnosing and fixing problems difficult.
Extended DNS errors provide extra information about failures, for
example expired DNSSEC signatures. They also allow resolver operators
to report administrative reasons for DNS failures, such as blocks due
to legal requirements.
[RFC8914](https://www.rfc-editor.org/rfc/rfc8914.html) defines
extended DNS errors.
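
An extended DNS error is carried in an EDNS0 option whose data is a 16-bit info code followed by optional UTF-8 text. The Python sketch below decodes that option data. The table lists only a few of the codes defined in RFC8914, and the example payload is made up.

```python
import struct

# A small subset of the info codes registered by RFC8914.
EDE_CODES = {
    6: "DNSSEC Bogus",
    7: "Signature Expired",
    15: "Blocked",
    16: "Censored",
    17: "Filtered",
    18: "Prohibited",
}

def parse_ede(option_data):
    """Decode EDE option data into (info_code, meaning, extra_text)."""
    if len(option_data) < 2:
        raise ValueError("EDE option data must contain a 2-byte info code")
    (info_code,) = struct.unpack("!H", option_data[:2])
    extra_text = option_data[2:].decode("utf-8", errors="replace")
    return info_code, EDE_CODES.get(info_code, "Unknown"), extra_text

# Hypothetical option data as it might accompany a SERVFAIL response.
example = struct.pack("!H", 7) + b"signature expired for example.com"
decoded = parse_ede(example)
```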
### Negative Trust Anchors
**Negative trust anchors may be deployed.**
For: All DNS resolver operators.
Negative trust anchors (NTA) allow a resolver operator to handle a
case where an authoritative server has a DNSSEC problem and becomes
inaccessible. They basically disable DNSSEC checking for a domain.
When this is warranted is difficult to know with certainty, and will
usually require some manual checking. Since DNSSEC validation is now
widespread, DNSSEC failures on the authoritative side will impact many
resolvers.
For these reasons this document does not recommend NTA, but
also does not recommend that a deployment avoid NTA if it makes sense
for that environment.
Negative trust anchors are documented in
[RFC7646](https://www.rfc-editor.org/rfc/rfc7646.html).
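
Conceptually, a negative trust anchor switches off validation for a domain and everything below it. The following Python sketch shows that lookup with a hypothetical NTA list. Real resolvers also attach an expiry time to each NTA, which is omitted here.

```python
def name_labels(name):
    """Split a domain name into lowercased labels, ignoring a trailing dot."""
    return tuple(name.rstrip(".").lower().split("."))

def under_nta(qname, negative_trust_anchors):
    """Return True if qname is at or below any configured negative trust anchor."""
    query = name_labels(qname)
    for anchor in negative_trust_anchors:
        labels = name_labels(anchor)
        if len(labels) <= len(query) and query[len(query) - len(labels):] == labels:
            return True
    return False

# Hypothetical NTA for a domain whose DNSSEC signatures are known to be broken.
ntas = {"broken.example."}
skip_validation = under_nta("www.broken.example.", ntas)
still_validated = not under_nta("notbroken.example.", ntas)
```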
### DNS Error Reporting
**DNS error reporting may be enabled.**
For: All DNS resolver operators.
DNS error reporting is a way for resolver operators to let
authoritative operators know about problems in authoritative servers
or zones. It provides little direct value for the resolver operators,
but over time should improve the overall quality of the DNS ecosystem.
It is neither widely deployed nor standardized, but hopefully will be
both soon. Resolver operators are encouraged to enable DNS error
reporting when it is available.
DNS error reporting is proposed in
[draft-ietf-dnsop-dns-error-reporting](https://datatracker.ietf.org/doc/draf….
### Trust Anchor Reporting
**Trust anchor reporting may be enabled.**
For: All DNS resolver operators.
Trust anchor reporting is a way for resolver operators to convey their
DNSSEC trust anchor configuration to the operator of the zone that it
is for. For most resolvers this is only the root zone. This
information is intended to be used during a root KSK rollover to
ensure that it is safe to proceed. In the future ICANN is planning an
algorithm roll for the root KSK, and this information could be
helpful. Resolver operators are encouraged to enable trust anchor
reporting.
[RFC8145](https://www.rfc-editor.org/rfc/rfc8145.html) covers trust
anchor reporting, including both of the available mechanisms.
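
One of the two mechanisms reports key tags by sending a query whose name encodes them. The other uses an EDNS option. The Python sketch below builds such a query name based on our reading of RFC8145: "_ta-" followed by the key tags as four-digit hexadecimal values in ascending order, separated by hyphens. Treat the exact format as an assumption to verify against the RFC.

```python
def key_tag_qname(key_tags, zone="."):
    """Build the query name used to report trust anchor key tags for a zone."""
    tags = "-".join(f"{tag:04x}" for tag in sorted(key_tags))
    label = f"_ta-{tags}"
    if zone == ".":
        return label + "."
    return f"{label}.{zone.rstrip('.')}."

# 20326 is the key tag of the root KSK introduced in 2017.
single_anchor = key_tag_qname([20326])
# Two key tags, as a resolver might report during a rollover.
two_anchors = key_tag_qname([17476, 4369])
```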
## Privacy, Filtering, Transparency
### Privacy & anonymity
Operators are advised to apply
[RFC8932](https://www.rfc-editor.org/rfc/rfc8932.html)
"Recommendations for DNS Privacy Service Operators" as follows:
1. its operational and policy guidance related to DNS encrypted
transports and data handling, by applying all "Threat mitigations"
(thereby meeting its level of "minimally compliant") and
additionally by applying the "Optimizations" on EDNS Client Subnet
listed in section 5.3.1.
2. its framework on a Recursive operator Privacy Statement, by
publishing a privacy statement on their website that is compliant
with Section 6.
#### Logging considerations
1. Public privacy policy: DNS resolvers are recommended to publish
their privacy policies transparently on their website. This can be a
brief privacy commitment, or a more elaborate explanation of how the
privacy policy was made. (See for example
[Cloudflare's
statement](https://developers.cloudflare.com/1.1.1.1/privacy/public-dns-res…
or [Quad9's privacy page](https://www.quad9.net/service/privacy/).)
2. Third party access to personal data: the only critical personal
data that DNS resolvers collect appear to be IP addresses and the
queries that are resolved. The other metadata collected can be used
to understand, for example, which user accessed which website,
which can reveal information about a person’s health, lifestyle and
other personal preferences (we call this profiling). For example,
a query for the website of Alcoholics Anonymous may tell you
something about the health of the person behind an IP address. IP
addresses are personally identifiable information. Follow the
applicable privacy laws or privacy principles when receiving third
party requests for access. Resolvers should only comply with such
requests after balancing the legitimate third party interest
against other fundamental rights.
3. Access to data for researchers: explain how it is done, who has
access and who can request access, and how the resolver decides to
give access (validated and credible researchers, what they can
access, and other issues).
4. Data minimization: do not collect personal information not needed
for critical operations. Only retain or use what is being asked
(the query). If collecting data to make the service more private
and secure, explain the rationale for each piece of data (data
collection purpose).
5. Ad policy and encryption: explain the ad policy and how it can
potentially affect users' privacy. If data is encrypted, explain
how it has been encrypted (DoH, DoT, and so on).
6. Data security and retention: explain when the data is deleted and
how it is stored.
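
As one possible way to apply data minimization to query logs, client IP addresses can be truncated before they are written. The Python sketch below shows the idea. The function name and the prefix lengths (/24 for IPv4, /48 for IPv6) are assumptions chosen for this example, not values recommended by this document.

```python
import ipaddress


def truncate_ip(address: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Zero the host bits of a client address before it is logged."""
    ip = ipaddress.ip_address(address)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets ip_network() accept an address with host bits set.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


# Documentation addresses, used here as stand-ins for real clients.
print(truncate_ip("192.0.2.123"))
print(truncate_ip("2001:db8:1234:5678::1"))
```

Truncation reduces, but does not remove, the ability to link queries to a person, so it complements rather than replaces a retention policy.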
### Filtering and blocking
#### Legal blocking
**Legal requests and blocking and filtering laws:** DNS resolvers
should not filter content or block access to web services. When the
local law requires blocking, and the law applies to the resolver, the
resolver should transparently disclose a list of blocked websites and
services.
**Community governance of blocklists:** blocklists, if mandatory, have
to be audited and assessed by third parties and there should be a
right to appeal for those blocked. The Internet community can vet the
blocklists from time to time to avoid blocking access to websites that
are mistakenly blocked. During a crisis, when mistakes can have
drastic effects on access to a critical service, filtering and
blocking should preferably not be used.
#### RPZ-based filtering
Response Policy Zone (RPZ) provides a way both to document the specific
modifications that resolvers will make to DNS answers, and to send the
rules to resolvers. Resolvers can be directed to block or modify
answers in various ways. Blocklists may be provided by governments,
communities, or other parties (for example security firms) using RPZ.
This allows updates to occur very quickly. These sources must be highly
trusted, as changes to blocklists will usually immediately impact user
queries.
RPZ is not standardized, but there is an IETF draft,
[draft-vixie-dnsop-dns-rpz](https://datatracker.ietf.org/doc/draft-vixie-dns….
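
To make the idea of "rules that block or modify answers" more concrete, here is a heavily simplified Python sketch of a policy lookup by query name. It only models exact and wildcard name triggers with three actions, and the table contents, the action names as plain strings, and the function name are all assumptions for this example. Real RPZ implementations support more trigger types and have precedence rules that are not shown here.

```python
from typing import Optional

# Hypothetical policy table: trigger name -> action.
POLICY = {
    "malware.example": "NXDOMAIN",        # pretend the name does not exist
    "*.malware.example": "NXDOMAIN",      # same for everything below it
    "ads.example": "NODATA",              # name exists, but return no records
    "good.malware.example": "PASSTHRU",   # explicit exception to the wildcard
}


def lookup_action(qname: str, policy: dict[str, str]) -> Optional[str]:
    """Return the policy action for qname, or None to answer normally."""
    name = qname.rstrip(".").lower()
    # An exact match wins over any wildcard.
    if name in policy:
        return policy[name]
    # Otherwise try wildcards, from the closest parent upwards.
    labels = name.split(".")
    for i in range(1, len(labels)):
        wildcard = "*." + ".".join(labels[i:])
        if wildcard in policy:
            return policy[wildcard]
    return None


print(lookup_action("www.malware.example.", POLICY))
print(lookup_action("good.malware.example.", POLICY))
print(lookup_action("example.org.", POLICY))
```

Because a single new entry in such a table changes answers for all users at once, the sketch also shows why the source of the rules needs to be highly trusted.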
#### Opt-in/Opt-out Mechanisms
End users may choose to use a DNS resolver that filters specific kinds
of traffic. For example, they may wish to avoid potential malware web
sites. Alternatively, resolver operators may be required to filter by
default but be allowed to provide an unfiltered DNS resolver service.
Depending on the specific requirements, a resolver service may publish
different IP addresses and what type of filtering applies to each
address. It is also possible to perform client authentication and
authorization, using IP-based authentication, TSIG keys, or
client-side TLS certificates.
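
The approach of publishing different IP addresses for different filtering can be sketched as a simple mapping from the service address a query arrived on to a filtering profile. In the Python example below, the addresses (taken from documentation ranges), the profile names, and the choice of a filtered default are all assumptions for illustration.

```python
import ipaddress

# Hypothetical published service addresses and the filtering applied on each.
_PROFILES = {
    ipaddress.ip_address(address): profile
    for address, profile in {
        "192.0.2.1": "unfiltered",
        "192.0.2.2": "malware-filtered",
        "2001:db8::1": "unfiltered",
        "2001:db8::2": "malware-filtered",
    }.items()
}


def profile_for(listener_address: str, default: str = "malware-filtered") -> str:
    """Return the filtering profile for the address a query was received on."""
    # Parsing the address makes different textual forms of IPv6 compare equal.
    return _PROFILES.get(ipaddress.ip_address(listener_address), default)


print(profile_for("192.0.2.1"))
print(profile_for("2001:0db8::2"))
```

Users then opt in or out of filtering simply by configuring the corresponding address, without any per-client state on the resolver.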
### Transparency
DNS resolvers usually provide transparency reports once a year. The
reports inform the public about disclosure of user information and
removal of content required by law enforcement and other government
agencies.
Transparency reports should (to the extent that the law allows)
indicate which government agencies and law enforcement agencies
request access on what basis.
It should also be clear from the transparency reports what kind of
data has been requested and if content removal and content blocking
have been requested. Categories of data include: Content Data, Basic
Subscriber Data, Other Non-Content Data and Content Blocking.
#### Voluntary certificates and standards
Some DNS resolvers opt to obtain security and privacy certifications.
Some also undertake audits of their privacy practices. See
for example: https://www.cloudflare.com/trust-hub/compliance-resources/
#### Human rights considerations
DNS resolvers can opt for declaring their understanding of their
responsibilities regarding human rights from the Universal Declaration
of Human Rights. Specifically, Quad9 mentions rights to freedoms
without distinction made on the basis of country, no interference with
privacy, the right to freedom of opinion and expression, the right to
peaceful assembly, and the right to freely participate in the cultural
life of the community.
See
[Quad9's Human Rights
Considerations](https://www.quad9.net/privacy/human-rights-considerations/)
for the full statement.
The statement also invokes human rights instruments other than the
[UDHR](https://www.un.org/en/universal-declaration-human-rights/), such
as Articles 8 and 9 of Resolution 42/15 of the United Nations Human
Rights Council on the right to privacy in the digital age of 26
September 2019, which more directly define the responsibilities of the
private sector toward the furtherance of human rights in modern terms.
Quad9 also follows the Guidelines for Human Rights Protocol and
Architecture Considerations of the Human Rights Protocol Considerations
Research Group at the Internet Research Task Force.
The latest version of the IRTF
[Guidelines for
HRPC](https://tools.ietf.org/html/draft-irtf-hrpc-guidelines) may be
considered by all network operators.
## Appendix A: Why Did RIPE Write This Document?
There is increasing concern that large open DNS resolvers will become
centralised points of DNS operations on the Internet. In order to
address this, the European Commission issued the
[DNS4EU](https://hadea.ec.europa.eu/calls-proposals/equipping-backbone-networks-high-performance-and-secure-dns-resolution-infrastructures-works_en)
proposal. However, such an initiative could lead to centralised
guidance or regulation which might interfere with the decentralised
way the Internet infrastructure works - including the DNS. See for
reference the
[RIPE NCC Open House
discussion](https://labs.ripe.net/author/chrisb/dns4eu-ripe-ncc-open-house-…
on this topic.
Rather than attempting to respond to the EC proposal or organize
specific DNS resolver deployments, the RIPE community has decided that
it is best able to provide advice and guidance. The RIPE Community is
well positioned to provide a set of Best Current Practices that
operators of Open DNS Resolvers will be encouraged to subscribe to.
Draft Minutes - DNS Resolver Best Common Practice Task Force - 26 September 2023
by Boris Duval 01 Nov '23
Dear TF members,
Here are the minutes from our last call.
Cheers,
Boris
***
Tuesday, 26 September 2023 17:00 (UTC+2)
Attendees: Maarten Aertsen, Farzaneh Badii, Shane Kerr, Andronikos Kyriakou, Tim Wicinski, János Zsakó
Scribe: Boris Duval
-- RFC 8932
Maarten drafted a text referring to RFC 8932 (DNS Privacy Service Operators), mentioning that the task force's advice was to apply this RFC at a specific level (minimum compliance). The text also deals with how to structure a meaningful privacy statement, a topic that has been raised at previous meetings. As far as QNAME minimization is concerned, this RFC should also cover it.
-- Dealing with obsolete references
The task force discussed how to deal with references in the document that might be obsolete in the future and decided to add a reader warning to this effect. This warning could also cover other topics such as staff training. Maarten volunteered to draft this text.
-- Certifications
The task force discussed referencing ISO 27001 or equivalent standards as best practice. The task force agreed in principle, but still needs to work out how to reference them in the document.
-- Human rights principles
Farzaneh told the task force that Cloudflare and Quad9 were mentioning human rights principles in their privacy practices. The task force agreed to point out that some operators have made commitments to human rights in general in the document. Farzaneh volunteered to draft this text.
-- Audits
Farzaneh said that some operators, such as Cloudflare, have been audited for privacy. The task force felt that it would be good to make a reference to this in the document, but that operators shouldn't be pushed to do audits because of certain limitations. Farzaneh volunteered to draft this text.
-- Wording of the filtering section
Farzaneh worked on a new wording for the filtering section, which she will send to the task force for review.
-- Blocking practices
Farzaneh spoke about blocking practices, referring to MLATs (Mutual Legal Assistance Treaties) - requests to block another country. The task force said that this could be mentioned in the document at a high level, pointing to a few examples. Farzaneh volunteered to draft this text.
-- Transparency reports
Farzaneh mentioned transparency reports for operators. The task force agreed that they could be mentioned as a best practice.
-- Community blocklist
Farzaneh raised the topic of blocklists and asked whether the task force should make any recommendations.
Recommendations could include that blocklists should be monitored by third parties, that blocked individuals should have the right to appeal, and that periodic reviews should be carried out to ensure that no one is blocked by mistake. The level of detail regarding blocklists should be defined before being incorporated into the document.
Hello all,
I made a couple of PRs with updated text for the recommendation document,
and sent a message for each PR to the RIPE Forum about them.
I won't repeat everything here, but the PRs are at:
https://github.com/DNS-Resolver-BCP-TF/Resolver-Recommendations/pull/30
https://github.com/DNS-Resolver-BCP-TF/Resolver-Recommendations/pull/31
Please have a look! You can comment on the PR, reply to the Forum
message, or send me a message directly if you don't want to embarrass me
in front of other people. 😉
I'll be making a pass over our other sections soon to get them into
shape, although those are probably very close to final shape.
Cheers,
--
Shane