On Thu, Nov 09, 2017 at 04:28:03PM +0100, Tim Bruijnzeels wrote:
On 7 Nov 2017, at 23:11, Job Snijders via db-wg <db-wg@ripe.net> wrote:
I would also welcome an investigation into alternative approaches (some not-via-WHOIS replication mechanism); perhaps something over HTTPS can be done? Either way, something more robust would be useful.
We recently developed and implemented a standard for something similar for RPKI: https://datatracker.ietf.org/doc/rfc8182/
I believe this approach can be useful here as well. Without going into all the RPKI specifics, it works a little something like this:
Starting points:
= The state of the rpki repository (or whois) at a given point in time can be represented by a ‘snapshot’
  - This snapshot is “immutable”, therefore it may be cached indefinitely and we can give it a unique URL and deliver it through a distributed CDN
= The delta between two consecutive snapshots is also “immutable” data, so again we can cache it, give it a unique URL and distribute it
= We can publish a notification file (which should NOT be cached) that points to:
  - the CURRENT snapshot
  - a list of deltas (each for 1 increment); the total size of the deltas MUST NOT exceed the size of the snapshot
Clients can then just poll the notification file and work out for themselves whether a list of deltas is available to them, or whether they need to get the latest snapshot instead.
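To make the client-side decision concrete, here is a minimal sketch in Python. It assumes the notification file has already been fetched and parsed into a dict; the dict layout and the function name plan_update are made up for illustration (RFC 8182 itself uses XML), so treat it as a sketch rather than a reference implementation.

```python
def plan_update(notification, local_session_id, local_serial):
    """Decide what a client should fetch next.

    `notification` is a hypothetical parsed notification file:
      {"session_id": str, "serial": int,
       "snapshot": {"uri": str},
       "deltas": [{"serial": int, "uri": str}, ...]}
    Returns a tuple (action, list_of_uris).
    """
    fetch_snapshot = ("snapshot", [notification["snapshot"]["uri"]])

    # No local state yet, or the server started a new session:
    # the deltas do not apply to us, so start from the snapshot.
    if local_serial is None or notification["session_id"] != local_session_id:
        return fetch_snapshot

    if notification["serial"] == local_serial:
        return ("up-to-date", [])

    # A server serial behind ours should not happen; resync to be safe.
    if notification["serial"] < local_serial:
        return fetch_snapshot

    # We need one delta for every increment between our serial and the
    # server's. If any is missing from the list, fall back to the snapshot.
    by_serial = {d["serial"]: d["uri"] for d in notification["deltas"]}
    needed = range(local_serial + 1, notification["serial"] + 1)
    if all(serial in by_serial for serial in needed):
        return ("deltas", [by_serial[serial] for serial in needed])
    return fetch_snapshot
```

Note that the server does nothing per client here: every client reads the same small file and does the comparison itself.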
Yes, we use a session_id and hashes of referenced files for additional checks (details in the RFC).
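As a rough illustration of the hash check, a client could compare the digest of each downloaded file with the hash listed for it in the notification file. The sketch below assumes a hex-encoded SHA-256 hash, which is what RFC 8182 uses; the function name verify_download is made up for this example.

```python
import hashlib


def verify_download(body: bytes, expected_sha256_hex: str) -> bool:
    """Return True if `body` matches the hash from the notification file.

    Assumes the hash is a hex-encoded SHA-256 digest; the comparison
    ignores the case of the hex digits.
    """
    actual = hashlib.sha256(body).hexdigest()
    return actual == expected_sha256_hex.strip().lower()
```

A client that gets False back would discard the file and retry, or fall back to the snapshot, rather than apply it.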
The idea behind this design was that we wanted to minimise the impact on the server. In a chatty protocol (like rsync, which is still used in RPKI) the server and client need to work out their differences together to determine what needs to be transferred. This is fine in one-on-one relations, but when a server needs to serve a multitude of clients this doesn’t scale. We want to be able to parallelise as much as we can (Amdahl’s law), so we push the computational burden to the clients. The server just needs a one-off investment to create the snapshot, the delta and the latest notification, which it can then offload. Using HTTPS allows us to leverage one of the many, many CDNs out there. This problem has been solved in the industry, so we do not need to invent our own infrastructure for this.
Note that in the case of RPKI the protocol is XML based. This made sense because it leveraged existing definitions in the RPKI space that were also XML based. For whois it may make more sense to look at JSON and/or RDAP.
Please let me know if you see merit in this kind of ‘delta’ protocol in the whois space.
Yes, I think there may be merit in replacing NRTM, and DELTA would certainly be a good source of inspiration. Would it be fair to ask for a two-pronged approach? DELTA-WHOIS + WHOIS-END-OF-BLURP markings? How much work (or complexity?) is involved for RIPE NCC to develop a marking that is sent to the client at the end of a '-g' query that also had '-k' enabled?

Kind regards,

Job