Re: [routing-wg] rsync://rpki.ripe.net rsyncd limits set too low?

16 Feb 2022

Hi Job.
...
On 16 Feb 2022, at 15:05, Job Snijders via routing-wg <routing-wg@ripe.net> wrote:
Hi all,
I noticed the RIPE NCC RRDP service (https://rrdp.ripe.net/) became
unreachable at 2022-02-16 13:34:10 UTC+0 (and still is down).
Ouch. Fallback to rsync due to a DNS misconfiguration (which should have
recovered).
...
This RRDP outage event should not pose an issue for most RPKI
validators, because most RPKI cache implementations (which follow best
practices) will attempt to synchronize via rsync when RRDP is
unavailable.
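
The fallback behaviour described above can be sketched roughly as follows. This is only an illustration, not how any particular validator implements it: the function names are hypothetical, the RRDP notification file name is an assumption, and only the rsync URI is taken from this thread.

```python
import subprocess
import urllib.request
from typing import Callable, Sequence, Tuple

# The rsync URI appears in this thread; the RRDP notification file name
# is an assumption made for illustration.
RRDP_NOTIFY_URL = "https://rrdp.ripe.net/notification.xml"
RSYNC_URI = "rsync://rpki.ripe.net/repository/"


def fetch_rrdp() -> None:
    """Fetch the RRDP notification file; raises on network or HTTP errors."""
    with urllib.request.urlopen(RRDP_NOTIFY_URL, timeout=30) as resp:
        resp.read()


def fetch_rsync(dest: str = "./rpki-cache/") -> None:
    """Mirror the repository over rsync; raises if rsync exits non-zero."""
    subprocess.run(
        ["rsync", "--no-motd", "-rt", RSYNC_URI, dest],
        check=True,
        timeout=600,
    )


def sync_with_fallback(
    transports: Sequence[Tuple[str, Callable[[], None]]]
) -> str:
    """Try each transport in order; return the name of the first that works."""
    errors = []
    for name, fetch in transports:
        try:
            fetch()
            return name
        except Exception as exc:  # any failure moves on to the next transport
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all transports failed: " + "; ".join(errors))
```

A caller would pass the transports in preference order, for example `sync_with_fallback([("rrdp", fetch_rrdp), ("rsync", fetch_rsync)])`, so rsync is only contacted once RRDP has failed.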
However, it seems RIPE NCC adjusted the default rsyncd settings and
lowered the concurrent connection count from 200 (which already is too
low for RPKI Repository Servers) to 150?
    $ rsync --no-motd -rt rsync://rpki.ripe.net/repository/
    @ERROR: max connections (150) reached -- try again later
    rsync error: error starting client-server protocol (code 5) at main.c(1666) [Receiver=3.1.2]
I'm not familiar with the RIPE RPKI RSYNC service architecture, so the
above error could be misleading: perhaps there is a load balancer
distributing TCP sessions across multiple backends, each backend
configured to serve up to 150 clients? Or perhaps there is a single
rsyncd instance (in which case 150 definitely is too low).
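
On the client side, one way to cope with the "max connections" rejection shown above is to retry with exponential backoff. The sketch below is an assumption about how a client might do this, not something any validator is known to implement: the helper names, attempt count and delays are all hypothetical. It matches on the error text and exit code 5 from the output above.

```python
import random
import subprocess
import time
from typing import Callable, List


def rsync_once(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run rsync once and capture its output as text."""
    return subprocess.run(cmd, capture_output=True, text=True)


def is_max_connections_error(result: subprocess.CompletedProcess) -> bool:
    """True if the daemon refused the session because its limit was reached."""
    output = (result.stdout or "") + (result.stderr or "")
    return result.returncode == 5 and "@ERROR: max connections" in output


def rsync_with_backoff(
    cmd: List[str],
    attempts: int = 5,          # hypothetical default
    base_delay: float = 2.0,    # seconds; hypothetical default
    run: Callable[[List[str]], subprocess.CompletedProcess] = rsync_once,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry only on 'max connections' rejections; return True on success."""
    for attempt in range(attempts):
        result = run(cmd)
        if result.returncode == 0:
            return True
        if not is_max_connections_error(result):
            return False  # a different failure: retrying is unlikely to help
        if attempt < attempts - 1:
            # Exponential backoff plus up to 1s of jitter, so that many
            # clients do not all reconnect at the same moment.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
    return False
```

For example, `rsync_with_backoff(["rsync", "--no-motd", "-rt", "rsync://rpki.ripe.net/repository/", "./rpki-cache/"])` would retry up to five times. The jitter matters here because an RRDP outage sends many validators to rsync at roughly the same time.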
We have described our rsync infrastructure extensively in earlier messages
(e.g. [0]). There are multiple instances behind a load-balancer. The current
storage is on NFS, which has a performance limitation: it peaked at about 80K
operations/second (2m average).

We will follow up with a more detailed post-mortem.

Kind regards,
Ties

[0]: https://www.ripe.net/ripe/mail/archives/routing-wg/2021-June/004351.html

Ties de Kock