Greetings Denis, All,
Yes, it was a very long message :-) Well, maybe not, if we keep in mind how long you have worked on and thought about the RIPE Database.
I obviously don't agree with everything you wrote, though I can agree with most of it.
2023-04 seems a bad idea to me, but at least it doesn't prevent anyone from continuing to register their assignments if they wish to do so. This proposal sounds like a "less effort for everyone" proposal and, for me, even if it's unintended, a way to increase opacity.
Is it enough for a public registry to have just the association between address space and its direct members? -- I don't believe so.
Some LIRs are not registering their assignments (violating current policy, right?), so we change/update the policy to make their lack of action part of the policy? It sounds very wrong to increase compliance levels artificially by changing the rules.
I see "arguments opposing the proposal" = none. I would like to disagree. The quality of publicly available registration data is likely to decrease if this proposal goes through.
Regards,
Carlos
On Mon, 25 Sep 2023, denis walker wrote:
Colleagues
I want to look at the bigger picture here. I apologise again for another long email. There are many issues here that this community has ignored for too long. So I hope some of you will at least read through to the end, think about what I say and comment...maybe even support the general idea...
Although this has been a discussion with only a handful of people it has raised some interesting points. Many followers may have missed the significance of some of these points or perhaps not thought deeply about them. These include (in no particular order):
- Different registration requirements for IPv4 and IPv6
- Differences in the way IPv4 and IPv6 have been allocated and assigned over time
- Block size (fixed or random)
- Retro fitting of features
- Different levels of adherence to policy by resource holders
- Voluntary nature of supplying some details
- No consistent approach to supplied data
- Confusion for some resource holders about what data to publish
- Effort required to maintain data in the RIPE Database
- Volatility of some fast changing data
- Privacy
- Customer confidentiality
- Public interest
- Public registry
- Registering public networks
- Addresses defined as free text (sometimes including name)
This is a lot of issues wrapped around one policy proposal. This proposal will not address all, or even most, of these issues. I don't believe this is the right way forward. But what is the root problem here and how can we address it?
There are also some other points to consider. At recent RIPE Meetings some prominent members of this community have told me in the strongest possible terms that there is no way in hell that they are going to list any of their customers' details in the public RIPE Database. No matter what any policy says. Commercial confidentiality seems to be a very sensitive issue for some resource holders. Of course this is a valid concern. But it needs to be balanced. Policy needs consensus, but when we have a consensus all resource holders must follow it. That is the only way a self regulating industry can work.
Another reason of concern is the alignment of handling both IPv4 and IPv6 registrations in the RIPE Database. Where we have two systems that are managed in different ways, there are of course two ways they can be aligned. We can dumb down the IPv4 data to the level of IPv6. Or we can raise the IPv6 data to the level of IPv4. Everyone is focused on the dumbing down option. No one has even considered moving in the other direction. I have never understood why the IPv6 registration policy was not written with the same requirements in mind as the IPv4 in the first instance. Maybe at the time the automation options available then were not as extensive as they are today. Computer power and bandwidth were certainly not comparable to what they are today. Changes to the RIPE Database data model, interfaces, technology and design would make it possible to raise the level of IPv6 information available in the public registry to the same level as IPv4.
At the heart of this issue is a public registry. But what is that in 2023? What does it mean? What should be in it? Who is it for? How do we achieve a three way balance between commercial sensitivity, public need and privacy? These are the sort of questions I was hoping the RIPE Database Requirements Task Force would answer when they started their work. The end result was a little disappointing. They didn't answer any of these questions. They focussed most of their attention looking backwards. Many of us know the history. We want to know how to move forwards. These types of proposals are not the right way forward. So where should we be heading? I believe we need a new Task Force to do what I thought the last one would do. To determine the business requirements for the RIPE Database as a public registry in the 2020s and beyond. To answer these fundamental questions. To establish the registration requirements for a public registry that we can have a consensus on and everyone will accept and apply.
Daniel said at the BOFF in Iceland, "It's time to stop tinkering around the edges of the RIPE Database". But that is exactly what these policy proposals are doing. Here we are trying to retrofit an IPv6 construct onto IPv4. Straight away assignment-size had to be dropped as it won't fit with the way IPv4 assignments are made or how they could be retrospectively aggregated. Knowing the blocksize has nothing to do with HD ratios and further allocations. It tells you nothing about how many assignments have been made from the aggregate, 1 or 100. It exists for IPv6 for other reasons. The same reasons we need for IPv4 but can't achieve, because the two systems are not the same.
We need to start with a full, forward looking Business Requirements document for the RIPE Database, based on accepted business analysis procedures. We can follow that with a Technical Requirements document outlining how things should be done. Not at the level of defining technology or software design; that is for the NCC engineers to determine. This should include the outline design of the data model and interfaces to commercial IPAM systems. Syncing bits of your internal data, as defined necessary for a public registry, with a database really isn't the problem in 2023. There should be no labour intensive work here. It doesn't matter if the RIPE Database has 5m or 50m or 500m assignment data sets in it, as long as they contain the data defined by the requirements to serve as a balanced public registry. No one should be manually entering this data. No one is going to read this raw data. We can build tools to provide information from this data in a human understandable format. In terms of registration requirements there should be little or no distinction between IPv4 and IPv6. But that doesn't mean we take the lowest common level.
In case anyone is in any doubt, I am suggesting a redesign and rebuild of the RIPE Database, based on an updated understanding of what is needed to maintain and operate a public registry for all stakeholders. I know none of the RIPE community nor the RIPE WG chairs nor the RIPE NCC membership (who pay for it) nor the RIPE NCC executive board or senior management has any appetite for this. In the past whenever I have brought up this subject I have been totally ignored. Replying to emails where I have mentioned this, people have noticeably answered other points and cut out any reference to redesigning the RIPE Database. Many people have gone to extraordinary lengths to avoid even having this conversation. Seriously guys, the time has come to have this conversation. Daniel tried to start it at that BOFF. The RIPE community has just let it drop...again.
The current design of the RIPE Database data model and software is about 25 years old. It was a big waterfall project with a big bang release and switch over from version 2 in April 2001. Aspects of the design, including having all data stored in untouched, human readable, text blocks, even predate this. We have had two major rewrites of the software in this time, in C and then Java. But the underlying design was not changed at all. Much of it is no longer fit for purpose. This attempt to retrofit aggregations from IPv6 to IPv4 highlights some of the cracks. It gets harder and harder to make significant changes to this system over time. Assigning a whole allocation, for example, cuts to the core of the software design and data model. Just to make this one change would be a very disruptive process for all users. Even if we decide today to set up a new task force to determine the business requirements, then the technical requirements, then redesign and rebuild in small agile chunks, we won't have a new system for at least 5 years. By then we will be working with a 30 year old data model and system design. That is the age of dinosaurs in the IT world. Do we really want to wait until it breaks before we do anything? Calm, collective consideration is a better working model than panic, reactive mode. We are long overdue for this.
It does not need to be done again in one huge step. It can be done incrementally. Use agile not waterfall methods. The whole system can be easily broken down into subsystems which can be worked on independently and deployed without massive disruption. I'll give some of my own thoughts and ideas on how some of this can be done.
Task Force 1 to determine the business requirements of the RIPE Database as a public registry.
Task Force 2 to determine the technical requirements of the RIPE Database as a public registry.
Redesigned data model dropping the old fashioned requirement to have all data stored in untouched text blocks and be human readable. Stored data should be machine parsable and processable. Tools and interfaces can be provided to offer information based on the stored data or raw data for further machine processing.
Accommodate new business models including the acceptance of investors and commercial RIRs operating below the RIPE NCC.
Interfaces to commercial IPAM systems so all the required data can be uploaded and synced without human effort.
Expand the LIR Portal to a system of user accounts for anyone who enters data into the database and identified/verified power users who consume the data.
Notifications are basically an audit trail of changes to your data. This should be configured through the user accounts. No need for it to be spread throughout the entire database at the data set level. There are millions of attributes with duplicated email addresses all over the data. This has no public interest value at all and should not be public data.
We should design a new authorisation and authentication scheme, also configured through the user accounts. Again details about the security of your data have no public interest value and should not be public data. I don't know of any other web based system that publishes so much information about how you secure and protect your data.
The basic data is composed of hierarchical sets of IP addresses. But only abuse contacts use inheritance. All contact and management data should be inherited. That again could remove millions of items of duplicated, redundant data. Structure of contact and identification data should also be redesigned with privacy and confidentiality in mind.
Resource holder and End User name and address details should be properly formatted rather than free text.
Requirements for user registration details in a public registry could be re-evaluated and re-designed from the bottom up with a three way balance of privacy, confidentiality and public interest in mind.
Language and characterisation of data can be re-evaluated for the whole data set.
Routing data could be better structured with usage in mind. Tools could be built in to provide the structured data needed by those who use this data.
Geolocation data could be built in rather than relying on external files.
Basic, anonymous queries could be limited to bare bones data with no PII. More detailed data could be provided only to verified query users, with accounts, with different levels of detail.
Historical data could be subject to a one time post processing to remove PII from public view but still allow anonymised cross referencing that researchers and investigators can do now with the PII data.
The whole dataset should be organisation centric. Every piece of data entered into the database should be directly or indirectly linked to an organisation described in the dataset. There is no reason to allow anonymous or orphaned data to be entered.
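To make the inheritance point above a bit more concrete, here is a rough sketch in Python of how a contact might be resolved for any block in a hierarchy. This is only an illustration under my own assumptions: the names Block, admin_contact and effective_contact are hypothetical and do not describe how the RIPE Database works today. The idea is that a block without its own contact takes the contact of the closest covering block that has one.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    """Hypothetical record: one address block in the hierarchy."""
    prefix: ipaddress.IPv4Network
    admin_contact: Optional[str] = None  # None means "inherit from the parent"


def effective_contact(target: ipaddress.IPv4Network,
                      blocks: List[Block]) -> Optional[str]:
    """Return the contact of the most specific covering block that sets one."""
    covering = [b for b in blocks
                if b.admin_contact is not None and target.subnet_of(b.prefix)]
    if not covering:
        return None
    # The longest prefix length is the closest ancestor (or the block itself).
    return max(covering, key=lambda b: b.prefix.prefixlen).admin_contact


# Illustrative data using documentation address space (assumed values).
blocks = [
    Block(ipaddress.ip_network("192.0.2.0/24"), "ORG-EXAMPLE-ADMIN"),
    Block(ipaddress.ip_network("192.0.2.0/28")),                     # inherits
    Block(ipaddress.ip_network("192.0.2.128/25"), "CUSTOMER-ADMIN"), # overrides
]
```

With data like this, the /28 needs no contact of its own. It resolves to the contact of the /24 above it, while anything inside the /25 resolves to the contact set there. That is the duplication that inheritance could remove.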
All changes of this nature could be made independently and gradually introduced. But we do need a road map based on a bigger picture so we know where we are heading. Especially for the core changes.
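As an example of how small some of these steps could be, here is a rough sketch in Python of the one time post processing of historical data mentioned above. It is an assumption on my part, not a description of anything the RIPE NCC does. The function name pseudonym and the key handling are hypothetical. It replaces a personal value with a keyed hash, so the same person always maps to the same token and cross referencing still works, but the value itself is no longer on public view.

```python
import hashlib
import hmac


def pseudonym(value: str, secret_key: bytes) -> str:
    """Map a personal value (e.g. an email address) to a stable opaque token.

    A keyed hash (HMAC-SHA256) is used so that tokens cannot be reproduced
    by simply hashing guessed values without the key.
    """
    normalised = value.strip().lower()
    digest = hmac.new(secret_key, normalised.encode("utf-8"),
                      hashlib.sha256).hexdigest()
    return "anon-" + digest[:16]


# Illustrative only: a real key would be generated and held privately.
key = b"example-key-held-by-the-registry"

token_a = pseudonym("Jane.Doe@example.net", key)
token_b = pseudonym("jane.doe@example.net ", key)  # same person, different formatting
token_c = pseudonym("john.roe@example.net", key)
```

Here token_a and token_b are identical, so two historical objects that referenced the same person can still be linked, while token_c is different. Whether a truncated keyed hash is strong enough for this purpose is exactly the kind of question the technical requirements would need to answer.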
If there is one thing I want you to consider from this message it is this: id nunc, aliquo tempore postea fit numquam
I am not well known for my language skills so let me say it in English: do it now, sometime later becomes never
cheers
denis
co-chair DB-WG