Re: [routing-wg] Adding "::" notation to RIPE DB

13 Nov 2022

From: Netmaster (exAS286) <netmaster@as286.net>
Sent: 09 November 2022 21:01
To: James Bensley; routing-wg@ripe.net
Subject: RE: [routing-wg] Adding "::" notation to RIPE DB
...
James Bensley wrote on Wednesday, November 9, 2022 5:42 PM:
...
Only when RIPE DB users start to update their AS-SET members field and add
a source tag would operators with legacy IRRd versions start to have issues,
and then they would have to update. But before that happens we can publicise
the upcoming changes and give people a fair warning, we can try to provide
resources on what the changes are, why they are beneficial [...]
...
How many *years* you think is a "fair" warning before others have to accept,
that their own tooling breaks? (There's more than bgpq4 and IRRd and code
doing !i queries).
Two, six or more than ten?
Don't get me wrong, I like the idea to ensure, the right AS-SET is being
used and I know the pain, if not. Getting AS-SETs unique across all RRs or
being able to clearly identify the right RR to use would be great. (Even
though I would assume adoption might take -after the warning period- still
many years for a significant amount of updated AS-SETs to be seen.)
...
Breaking things just "because it helps" might not be the best choice
...
I think we should try extra hard to maintain backward compatibility and only
if most agree, there's no other way and breaking things is the only way to
go forward and this saves the world, then it should/might happen.
Thanks for your reply Markus.

I don't see how ten years is the correct order of magnitude here, or why this needs to "save the world". Do you have some justification for this claim?

I will provide some thoughts in the opposite direction:

1. We're not talking about updating something like a life support system or auto pilot system which can result in an immediate loss of life. We're talking about updating some software which is important to the daily operations of network operators, which an operator could live without if it totally exploded (it might be painful, even extremely painful, but they could, and that is an important difference).

2. Software related to the daily operation of the Internet already breaks from time to time, and most operators are able to implement temporary fixes until long-term ones can be implemented. For example, we already have the current problem being discussed, that rogue AS-SETs result in invalid prefix lists being generated. This means I can't update the prefix list facing an existing peer automatically, but it doesn't mean the peering stops working or that my whole AS goes offline. The peering stays up, but with the existing prefix list, which I can update manually if they announce/withdraw something before we can fix the automation tooling. Equally, I can't generate prefix lists for new peers due to rogue AS-SETs, but I can set a prefix limit for the session, apply some basic AS path checks like no Tier 1 ASNs, no bogons, check RPKI, and so on. If they are small enough I can even write the prefix list by hand. It's not ideal, but my network doesn't stop working and nor do my peerings. We’re talking about a similar impact but due to reversed circumstances: a network can’t process IRR data because the response contains a source tag, not because it is missing a source tag.

3. As I've already said, we wouldn't actually be breaking anything if we were to implement the initially [1] proposed change. Things start to break IF, AND ONLY IF, an operator updates their AS-SET or Route-Set to use the "::" notation AND their peer or upstream doesn't update their tooling. These networks could actually notify their upstreams/peers proactively and say “hey, we’re going to add source notation to our IRR data in 3 months, please prepare for this”.

4. You seem to be focusing on the operator who doesn't update their tooling as being the victim. If a network makes use of the "::" notation in their AS-SET, and their upstream who runs IRRd v2 can no longer automatically update their prefix lists towards that customer network, then I think the customer is the victim here. The customer is paying the upstream for a service and the upstream isn't guaranteeing that they are actually using the correct information to deliver the service, even though it is an option available to them. At this point, IMO, the customer network should be pressuring the upstream to upgrade their IRRd.

[1] My initial proposal was to make this suggestion supported by default so that as many networks as possible get the benefit of this suggestion. However, a non-breaking method has been proposed:
...
From: Alexander Zubkov <green@qrator.net>
Sent: 10 November 2022 08:07
To: Netmaster (exAS286)
Cc: Jared Mauch; James Bensley; routing-wg@ripe.net
Subject: Re: [routing-wg] Adding "::" notation to RIPE DB
Maybe we can add some standard mechanism to identify the supported "features" by the IRR server? So client before sending the request can figure out whether the server supports source-tagged as-sets (like bgpq4 now checks if the server supports !a queries). Then the client can add some "flag" in the query that it is willing to receive source-tagged as-sets. On the server side we can strip source prefixes in the reply if the client did not identify its will to receive them. In that case legacy tools should not break.
Refactoring this slightly: we could add a new non-default query type (to the RIPE DB + whois client, and to IRRd), and if clients don't explicitly use the new query, source tags are stripped from the reply data. This way the client and server don't both need to track if the other supports source tags.
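To make the stripping idea concrete, here is a minimal sketch in Python. It is not taken from IRRd or the RIPE whois server; the function names and the `client_wants_source_tags` flag are hypothetical, and it assumes a source tag is simply everything before the first "::" in an as-set member name. It deliberately ignores route-set members, where an IPv6 prefix can itself contain "::".

```python
def strip_source_tag(member: str) -> str:
    """Return an as-set member name without a leading "SOURCE::" tag, if any.

    Assumption: the tag is everything before the first "::". Not safe for
    route-set members, where IPv6 prefixes may contain "::" themselves.
    """
    _source, separator, name = member.partition("::")
    return name if separator else member


def render_members(members: list[str], client_wants_source_tags: bool = False) -> list[str]:
    """Build the members list for a reply.

    Hypothetical behaviour: only clients that used the new, non-default query
    type (modelled here as a boolean) receive source tags; everyone else gets
    the legacy, untagged names.
    """
    if client_wants_source_tags:
        return list(members)
    return [strip_source_tag(m) for m in members]


members = ["RIPE::AS-65534:AS-EXAMPLE", "AS65533"]
print(render_members(members))                                 # legacy query
print(render_members(members, client_wants_source_tags=True))  # new query type
```

With this shape the server never has to detect what the client supports; the client opts in per query and anything else gets the old format.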

I can see two problems though: firstly, this would never become the default, unless we set a due date after which source tags are included in the default query types of these tools.

Secondly, how should the objects be represented in the RIPE DB? If I have a member in my AS-SET which is “RIPE::AS-65534:AS-EXAMPLE” which is an object in the RIPE DB, and a legacy whois client queries the RIPE DB for my AS-SET, they would get the following in the list of members “AS-65534:AS-EXAMPLE”, which isn’t actually an object in the RIPE DB. When the same legacy client then tries to query for that specific object, RIPE would need to know to look for both “AS-65534:AS-EXAMPLE” and “RIPE::AS-65534:AS-EXAMPLE”. The same applies when creating or renaming objects, the RIPE DB would need to know that if I create an object called “RIPE::AS-65534:AS-EXAMPLE” no one else is allowed to create “AS-65534:AS-EXAMPLE” and vice versa. Are there more problems I’ve missed if the RIPE DB returns different responses to different clients?
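One way to picture that requirement is a registry that keys every object by its untagged name, so the tagged and untagged spellings always resolve to the same entry. The sketch below is purely illustrative: `AsSetRegistry` and its methods are hypothetical, it is an in-memory toy for a single source, and it says nothing about how the RIPE DB actually stores objects.

```python
from typing import Optional


class AsSetRegistry:
    """Toy single-source registry where NAME and SOURCE::NAME are one object."""

    def __init__(self, source: str = "RIPE") -> None:
        self.source = source
        # Maps the untagged name to the name exactly as it was registered.
        self._objects: dict[str, str] = {}

    def _bare(self, name: str) -> str:
        """Drop this registry's own "SOURCE::" prefix, if present."""
        prefix = self.source + "::"
        return name[len(prefix):] if name.startswith(prefix) else name

    def create(self, name: str) -> None:
        """Register an object, refusing if either spelling already exists."""
        bare = self._bare(name)
        if bare in self._objects:
            raise ValueError(f"{name} conflicts with existing {self._objects[bare]}")
        self._objects[bare] = name

    def lookup(self, name: str) -> Optional[str]:
        """Resolve a tagged or untagged query to the registered name."""
        return self._objects.get(self._bare(name))


registry = AsSetRegistry()
registry.create("RIPE::AS-65534:AS-EXAMPLE")
print(registry.lookup("AS-65534:AS-EXAMPLE"))  # a legacy client's untagged query
try:
    registry.create("AS-65534:AS-EXAMPLE")     # the untagged twin is rejected
except ValueError as err:
    print(err)
```

Renames would need the same check, and the sketch does not address what to do with objects that already exist under both spellings.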

Cheers,
James.