On Thu, Apr 21, 2022 at 2:57 PM, Dave Lawrence <tale@dd.org> wrote:

Edward Lewis writes:

Unrelated, or just less correlated than you might otherwise imagine?

Unrelated.

I'd studied an event which made it apparent that resolvers vastly ignored the long TTLs in play. At the time I learned that two popular strains of resolver code had an "internal" limit on the TTL they would accept [which is old news now], meaning any authoritative server operator who was banking on long TTLs to lower traffic was not going to realize any benefit.

The "ignored the long TTLs" goes right to the heart of the question though.

If you tell me something akin to "it looks like TTLs under six hours impact query refresh rate but then starts a rapid decline to where almost no TTLs are honored for longer than 24 hours" then I'd say that does not indicate TTLs and traffic are unrelated, just related in a smaller window than those favoring long TTLs would hope.



Well, in addition to the "standard" 6 hour[0], 1 day[1], 1 week[2] and other[3] MAX_TTL caps, it is also worth remembering that the TTL is the **maximum** time that a record should be cached[4], not an instruction to "cache for exactly this time". Records can be evicted from the cache (or otherwise not used) for all sorts of reasons before the TTL expires, including running low on memory, more specific ECS, delegation revalidation, restarts, etc.
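As a rough illustration of what a MAX_TTL cap does to a long authoritative TTL, here is a minimal Python sketch. The function names and the dictionary are hypothetical; the cap values are just the 6 hour, 1 day and 1 week figures mentioned above, not values read from any resolver's configuration.

```python
# Illustrative only: the caps below are the durations cited in this message.
MAX_TTL_CAPS = {
    "6 hour cap": 6 * 3600,
    "1 day cap": 24 * 3600,
    "1 week cap": 7 * 24 * 3600,
}


def effective_ttl(authoritative_ttl, max_ttl):
    """Longest time (seconds) a resolver with this cap could keep the record."""
    return min(authoritative_ttl, max_ttl)


def min_refreshes_per_week(authoritative_ttl, max_ttl):
    """Lower bound on refetches per week from one busy cache; early eviction only adds more."""
    week = 7 * 24 * 3600
    return week // effective_ttl(authoritative_ttl, max_ttl)


two_days = 2 * 24 * 3600
for label, cap in MAX_TTL_CAPS.items():
    print(label, effective_ttl(two_days, cap), min_refreshes_per_week(two_days, cap))
```

With a 2 day authoritative TTL, the resolver with the 6 hour cap still refetches at least 28 times a week, so the long TTL buys nothing there beyond what a 6 hour TTL would.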

I know that y'all already know this, but it's worth repeating because I quite frequently see people (especially in enterprise environments) doing something like:
$ dig www.example.com @192.0.2.1
seeing that the TTL is e.g. 5h37m, and then assuming that their resolution will continue to work for at least five and a half hours…
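To make the "TTL is only an upper bound" point concrete, here is a toy Python cache. It is a sketch under stated assumptions: the TinyCache class, its LRU eviction policy and the two-entry capacity are all made up for illustration, and real resolvers evict for many more reasons than this.

```python
from collections import OrderedDict


class TinyCache:
    """Toy cache: the TTL is an upper bound; capacity pressure can evict earlier."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()  # name -> absolute expiry time in seconds

    def put(self, name, ttl, now):
        self._entries[name] = now + ttl
        self._entries.move_to_end(name)
        # Evict least recently used entries once the cache is over capacity.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)

    def remaining_ttl(self, name, now):
        expiry = self._entries.get(name)
        if expiry is None or expiry <= now:
            self._entries.pop(name, None)  # missing or expired
            return None
        self._entries.move_to_end(name)  # mark as recently used
        return expiry - now


cache = TinyCache(max_entries=2)
cache.put("www.example.com", ttl=5 * 3600 + 37 * 60, now=0)  # 5h37m
cache.put("a.example.net", ttl=300, now=10)
cache.put("b.example.net", ttl=300, now=20)  # pushes the oldest entry out

print(cache.remaining_ttl("www.example.com", now=30))  # None: evicted early
print(cache.remaining_ttl("a.example.net", now=30))    # 280
```

The record with 5h37m on the clock is gone after 20 seconds, which is exactly why the TTL shown by dig says nothing about how long the answer will stay cached.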

A while back there was a good presentation (I think at an OARC meeting) where someone set a long TTL and then timed how quickly it was evicted from a bunch of open public resolvers. I cannot easily find the presentation at the moment, but a surprising number of records disappeared before the TTL reached 0.

W


[0]: https://developers.google.com/speed/public-dns/faq#:~:text=generally%20limited%20to%20six%20hours
[1]: https://github.com/NLnetLabs/unbound/blob/a0feea393a3a7f0ab0f88b3e1aa7a92cee0e0bb8/util/config_file.c#L168
[2]: https://github.com/isc-projects/bind9/blob/09dccf29b4eb46e133c35acfc84acab37403866e/bin/named/config.c#L170
[3]: https://blog.cloudflare.com/refresh-stale-dns-records-on-1-1-1-1/#:~:text=1.1.1.1%20caches%20DNS%20entries%20for%20up%20to%203%20hours (although, according to 'dig' at least, the cap is actually at least 1 day?)
[4]: Modulo cases where the data cannot be authoritatively refreshed when the TTL expires (see serve-stale).



Conditional correlation != unrelated.
