Thanks for the extensive note, Denis, and thanks Cynthia for being first-responder. I wanted to jump in on a specific subthread.

On Tue, Apr 06, 2021 at 06:38:29PM +0200, Cynthia Revström via db-wg wrote:
> > Questions:
> >
> > - Should the database software do any checks on the existence/reachability of the url as part of the update, with an error if the check fails?
>
> I would say yes, as this is not a new concept to the DB; I believe this is already done with domain objects.
I disagree on this one point. What is the RIPE DB supposed to do when it discovers one state or another? Should the URIs be probed from many vantage points and the results compared? Once you try to monitor whether something is up or down, it quickly becomes complicated. The content that the 'geofeed:' attribute value references lives outside the RIPE DB, so the RIPE DB software should not be crawling it. All the RIPE NCC's DB software needs to check is whether the string's syntax conforms to the HTTPS URI scheme.
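For what it's worth, here is a minimal sketch in Python of the kind of check I mean, pure string validation with no network access whatsoever (the function name is my own invention, not anything in the DB software):

    from urllib.parse import urlparse

    def is_valid_geofeed_uri(value: str) -> bool:
        """Accept only values whose syntax conforms to the HTTPS URI
        scheme: no probing, no crawling, no reachability checks."""
        try:
            parsed = urlparse(value)
        except ValueError:
            # e.g. a malformed bracketed IPv6 literal in the host part
            return False
        # The scheme must be https and a host must be present.
        return parsed.scheme == "https" and bool(parsed.netloc)

So is_valid_geofeed_uri("https://example.com/geofeed.csv") passes, while "ftp://example.com/x" or a bare hostname fails; nothing about whether the URL actually resolves or responds is, or should be, part of the check.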
> > - Should the RIPE NCC do any periodic repeat checks on the continued existence/reachability of the url?
>
> I would say that checking once a month or so could be fine, as long as it just results in a nudge email. Like, don't enforce it, but nudge people if it is down.
It seems an unnecessary burden on the RIPE NCC to check whether a given website is up or down. What is such nudging supposed to accomplish? It might end up being busywork if done by an individual RIR.
> > - Should the RIPE NCC do any periodic checks on the content structure of the csv file referenced by the url?
>
> I don't have a strong opinion either way here, but I feel like that is not really something the NCC is responsible for checking. If the NCC should check, then my comments about the repeat reachability checks above apply here too.
The RIPE NCC should not check random URIs; they are not the GeoIP police ;-)
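That said, operators who publish a geofeed can easily sanity-check their own files before pointing a 'geofeed:' attribute at them. A rough, purely illustrative sketch of what a structural check of a single RFC 8805 line could look like (the function name and the exact fields checked are my own assumptions):

    import csv
    import ipaddress

    def looks_like_geofeed_line(line: str) -> bool:
        """Rough structural check of one geofeed CSV line per RFC 8805:
        prefix,country,region,city[,postal]. Comment and blank lines
        pass through unchecked."""
        line = line.strip()
        if not line or line.startswith("#"):
            return True
        fields = next(csv.reader([line]))
        try:
            # First field must be an IP address or prefix.
            ipaddress.ip_network(fields[0], strict=True)
        except ValueError:
            return False
        if len(fields) > 1 and fields[1]:
            # Second field, when present, is an ISO 3166-1 alpha-2 code.
            country = fields[1]
            if len(country) != 2 or not country.isalpha():
                return False
        return True

But that is a job for the publisher (or for consumers of the feed), not for the RIR.

Kind regards,

Job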