RPKI: how to migrate an entire industry from RSYNC to RRDP?
Hi all,

Some might be wondering what the deal is with RSYNC and RRDP. Why is it critical to continue to support RSYNC in the mid-term? What's the industry's plan to migrate from RSYNC to RRDP?

TL;DR
- All RIRs need to support both RSYNC and RRDP until at least 2024.
- All RRDP-capable validators need to make sure they are fully backwards compatible with RSYNC, until RIRs no longer observe RSYNC traffic.

Some background:
----------------

The development of the RPKI technology stack began more than a decade ago in the IETF. From the start, RSYNC was the preferred synchronisation protocol between RPKI publication servers (such as 'rpki.ripe.net') and Relying Parties (the likes of Telia, NTT, Amazon, me, and you). RSYNC was picked for a number of reasons: it was available & easy.

A core concept of the RPKI technology stack is that RPKI objects can be transported via any means: carrier pigeon (RFC 2549 ;-), USB stick, FTP, RSYNC, ... anything! This is possible because RPKI exclusively relies on 'object security': the RPKI objects themselves contain all information that is required to perform X.509-based validation.

As time went by, a second approach was developed to synchronize RPs to fresh data generated by CAs. It was recognized that where RSYNC servers need to calculate the 'difference' between what the client has and what the server has (right after the RSYNC client connects), such data could also be generated a priori and stored in static files. Pre-generating such a 'journal' of all changes in a repository is considered to be far more efficient than calculating it on the fly. The RRDP protocol has many appealing properties!

In 2017 the RRDP protocol was published as an IETF RFC (RFC 8182), as a 'nice to have' synchronization protocol for the RPKI. Since then, more and more Publication Server operators and RP software implementers have worked to support the new RRDP protocol alongside the old RSYNC protocol.

With two options available, how to migrate?
-------------------------------------------

This Gordian Knot has two aspects: all deployed RPKI repository operators have to support RRDP, and all deployed Relying Parties have to support RRDP. This means that for a successful transition, for a moment in time, all stakeholders have to support both RSYNC and RRDP. The industry has not yet reached that point. I expect this to happen sometime in 2023.

At the time of writing, not all Relying Party software and not all RPKI Repository Operators support RRDP. Based on various communications in the IETF it is clear everyone is working towards implementing RRDP support. However, producing a safe implementation of RRDP is not a trivial task; it takes time. As RSYNC existed first and RRDP came later, everyone should be allowed ample time to make the transition.

While everyone waits for everyone else to deploy RRDP-capable software, the trick is to PREFER synchronizing via RRDP (and if it fails, try RSYNC).

Relevant Internet-Draft: https://datatracker.ietf.org/doc/draft-ietf-sidrops-prefer-rrdp/

This concept is somewhat analogous to 'Happy Eyeballs': for a period of time many considered it advantageous for global IPv6 deployment to give IPv6 just a little bit of a head start compared to IPv4. People knew that purposefully degrading IPv4 would not motivate people to embrace IPv6.

Also, preferring RRDP (and only using RSYNC in case of RRDP failure) makes life easier for RPKI Repository operators: it should be possible for them to temporarily disable the RRDP webserver and expect clients to use RSYNC instead. Knowing that clients will gracefully fall back to RSYNC lowers the barrier to deploying RRDP.
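
To make the 'prefer RRDP, fall back to RSYNC' idea concrete, here is a minimal sketch in Python. It is not taken from any validator or from the Internet-Draft: the names Repository, FetchError and sync_repository, the example URIs, and the idea of passing the two transports in as plain functions are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class FetchError(Exception):
    """Hypothetical error a fetcher raises when a sync attempt fails."""


@dataclass
class Repository:
    """Hypothetical description of one RPKI publication point."""
    rsync_uri: str
    rrdp_uri: Optional[str] = None  # None: the repository offers no RRDP


def sync_repository(
    repo: Repository,
    fetch_rrdp: Callable[[str], None],
    fetch_rsync: Callable[[str], None],
) -> str:
    """Try RRDP first; use RSYNC only if RRDP is absent or fails.

    Returns the name of the transport that succeeded. A FetchError from
    the RSYNC attempt is deliberately not caught: at that point there is
    nothing left to fall back to.
    """
    if repo.rrdp_uri is not None:
        try:
            fetch_rrdp(repo.rrdp_uri)
            return "rrdp"
        except FetchError:
            # RRDP is preferred, not required: fall through to RSYNC.
            pass
    fetch_rsync(repo.rsync_uri)
    return "rsync"


if __name__ == "__main__":
    def rrdp_down(uri: str) -> None:
        # Simulates an operator who temporarily disabled the RRDP webserver.
        raise FetchError(f"cannot reach {uri}")

    def rsync_ok(uri: str) -> None:
        # Stand-in for a real RSYNC transfer; does nothing here.
        pass

    repo = Repository(
        rsync_uri="rsync://rpki.example.net/repo/",
        rrdp_uri="https://rpki.example.net/rrdp/notification.xml",
    )
    print(sync_repository(repo, rrdp_down, rsync_ok))
```

The point of the sketch is the ordering: RRDP always gets the first attempt, and an RRDP outage on the server side degrades to RSYNC instead of leaving the client without data.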

Once all RP implementations have embraced the 'prefer RRDP' strategy, and those implementations have trickled down into the hands of network operators and been deployed in the field, Repository Operators will observe fewer and fewer clients connecting to the RSYNC service and more and more syncing via RRDP, to the point where it becomes self-evident that publication via RSYNC can maybe even be decommissioned altogether.

TL;DR - general availability of software which prefers RRDP over RSYNC, combined with patience, should be a sufficient plan to migrate! :)

Current status
--------------

Current versions of OpenBSD rpki-client support RSYNC only. The team is actively working to also support RRDP. The hope is to release a stable version later this year. OpenBSD supports releases for one year, thus any deprecation of RSYNC services should be postponed at least until Spring 2023 to avoid disenfranchising existing deployments in the field.

The RIPE NCC Validator RRDP implementation is broken. It is trivial for any RRDP Repository Operator to remotely crash the entire RIPE NCC validator process. Luckily the software is almost End-Of-Life and soon won't be relevant anymore.

Current versions of Routinator are unable to fall back to RSYNC. However, in November 2020 the team indicated they would fix this in the next release (which has yet to happen). Perhaps somewhat counterintuitively, an inability to fall back to RSYNC makes migrating towards RRDP harder for the industry. For a peer-reviewed scientific publication on this phenomenon see: https://dl.acm.org/doi/10.1145/3419394.3423622

I've not inspected how other validators behave, and we should keep in mind that proprietary rsync-only validator implementations might exist that this community is not aware of!

How to move forward in the short term?
--------------------------------------

RIPE NCC must support RSYNC as a first-class service. The RPKI via RSYNC is a LIVE production system.
The global Internet routing system relies on the Trust Anchors to publish correctly. Six months from report to resolution, as has been suggested, is too long a waiting period for this type of issue.

There are known negative interactions between inconsistent RPKI publications & BGP routing in the Default-Free Zone. This affects all Internet users. It is not possible to get hundreds of operators to migrate to RRDP overnight, but it is possible for RIPE NCC to improve RSYNC publication (while the world very slowly moves to RRDP).

Kind regards,

Job

Hello all,

I would like to offer some additional context to this story with some specific data, without any judgement on the content itself.

Currently the RPKI has 35 repositories serving RPKI data. All of them serve data over both rsync and RRDP, except one. The software that this one remaining Certificate Authority (CA) uses – rpkid – has support for RRDP, but it appears the CA operator has not enabled this option, thus serving data over rsync only. Other CAs that use the same software serve data over RRDP just fine.

Although LACNIC has recently introduced support for RRDP, they have not yet published a new Trust Anchor Locator (TAL) file with an HTTPS URI. An example of the format can be found in the RIPE NCC's TAL, where two URIs are available to access the root certificate, one HTTPS and one rsync:

https://tal.rpki.ripe.net/ripe-ncc.tal

The result for now is that Relying Party software will download the LACNIC root certificate over rsync, but the rest of the repository over RRDP.

As mentioned below, all Relying Party software has support for RRDP except one: rpki-client. However, RRDP support is coming along; you can track the progress here, for example:

https://marc.info/?l=openbsd-tech&w=2&r=1&s=rrdp

To get an understanding of the current market share of Relying Party software, I can refer to an APNIC blog post written by the author of the aforementioned peer-reviewed paper. Figure 4 shows RP implementation popularity based on RRDP user-agent strings, with further details of rpki-client usage under the graph, as this is the only implementation using rsync:

https://blog.apnic.net/2021/03/22/rpki-relying-party-synchronization-behavio...

This is an example of the status and connections that the Relying Party software Routinator establishes right now.
As you can see, just two rsync connections to RPKI repositories remain, one of which is for a single file:

https://routinator-demo.aws.nlnetlabs.nl/status

My conclusion is that if LACNIC publishes an updated TAL with an HTTPS URI and the one remaining CA operator enables RRDP, the vast majority of Relying Party software would no longer need rsync in any way, shape or form. I'm not going to make any predictions about how long the tail is, but I do hope this provides some insight into where we are today.

Kind regards,

Alex

On 13 Apr 2021, at 00:24, Job Snijders via routing-wg <routing-wg@ripe.net> wrote:
[...]

Participants (2): Alex Band, Job Snijders