Hi,
Sorry for the late reply, I didn't see that you had replied.
The reason I think the last deletion should be the cut-off point is that it feels quite natural: you deleted the object, so it should no longer exist.
Also, given that a decent number of individuals now have resources from the NCC (either directly or through a sponsoring LIR), I feel like we should consider whether organisation objects should really have history. This feels even more important given that full legal names are required (afaik) and will then be published.
This might be a separate discussion, but I think it is important to consider the pros and cons of this.
-Cynthia
On Tue, Apr 5, 2022 at 10:50 AM denis walker <ripedenis@gmail.com> wrote:
Hi Cynthia
I understand your concerns but feel I need to clarify the points you made. Most people don't understand the internal workings of the database. That is not surprising as it is quite complex. I spent 14 years with the RIPE NCC as a holistic engineer and analyst working with the database. I know it inside out from all possible angles. I wrote the spec for historical queries in 2013. We had no requirements; it was just an idea that someone thought might be useful. To get something up and running quickly I went for the low-hanging fruit. Because of the way objects are managed internally within the database, stopping at the most recent deletion point for an object's history was simpler. It was a completely arbitrary, technical constraint. It had no privacy reasoning behind it. When we released this service in 2013, it was reviewed by the legal team and they did not ask for this constraint for any reason. (The legal team is currently doing a new review of the issue.)
So let's look at exactly what this constraint means. Currently we provide some history of operational and corporate data objects. These objects are all heavily redacted to remove anything that is considered to be personal data. We do not provide any history of objects that are considered to be mostly personal or security data. Any object can have multiple versions and each version can have multiple instances. An object may be created (v1), updated several times, deleted, re-created (v2), updated, deleted, re-created (v3) and updated again. The arbitrary constraint means only v3 data will be available, with all its updates. None of the data for v1 and v2 is made available. By dropping this arbitrary constraint, all the history of this object, v1, v2 and v3, will be available. All the data for v1 and v2 will be redacted in the same way as v3 to remove any personal data. There is no difference in the data for v1, v2 and v3. It is all operational and corporate data with personal data removed. As we have generally accepted that it is ok to release the historical, non-personal data for v3, it should also be ok to release the historical, non-personal data for v1 and v2. If there are any privacy concerns over the v1 and v2 data, then we would have exactly the same privacy concerns over the v3 data that we currently provide.
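To illustrate the v1/v2/v3 example above, here is a minimal Python sketch. It is only a model of the behaviour described, not how the RIPE Database is implemented: the event list, the dates and the function name `history_since_last_deletion` are all made up for illustration.

```python
# Hypothetical model: an object's life as an ordered list of (date, action).
# Dates and names are invented; this is not the RIPE Database's code.
events = [
    ("2005-01-10", "create"),  # v1
    ("2008-03-02", "update"),
    ("2010-06-15", "delete"),
    ("2012-02-01", "create"),  # v2
    ("2014-09-09", "update"),
    ("2016-11-30", "delete"),
    ("2020-05-05", "create"),  # v3
    ("2021-07-07", "update"),
]


def history_since_last_deletion(events):
    """Model of the current constraint: keep only events after the last delete."""
    last_delete = -1
    for index, (_, action) in enumerate(events):
        if action == "delete":
            last_delete = index
    return events[last_delete + 1:]


visible_now = history_since_last_deletion(events)
visible_without_constraint = list(events)

print(len(visible_now))                 # 2 (the v3 create and its update)
print(len(visible_without_constraint))  # 8 (v1, v2 and v3)
```

In this model, an object whose last event is a delete returns an empty list, so nothing is visible until it is re-created.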
This technical change does not change the object types that we allow historical data to be queried for. Nor does it change any of the attribute types within those objects that we redact. We simply provide a complete set rather than a partial set of the operational and corporate data we currently offer.
Let's look at some examples of what this means. Many objects have never been deleted. Some of these have been around since this version of the RIPE Database model was released in 2001. If you query for the history of these objects you will get the full 21 years of history. Many assignments are given out to end users, then deleted and re-assigned to another end user. This can happen multiple times. Such a prefix will have many versions v1...vn. Only the history of vn is currently available, and only if this version still exists in the 'live' database. For many researchers of address space usage or routing issues, as well as abuse investigators and brokers, the history before vn is useful data. Also, when a resource is transferred or consolidated, the allocation object is deleted and re-created by the RIPE NCC. So an object for which you could see 21 years of history last week, and which has since been transferred, now shows only 1 week of history. There is also the occasional case of an operator who accidentally deletes the wrong assignment and does not have an up-to-date copy of the object as it was just before it was deleted. They currently have to ask the RIPE NCC to supply them with the most recent version. Without this arbitrary constraint they could just re-create it, then look up the history themselves.
There is also the issue of NRTM. Some users have been operating an NRTM stream for years, even decades. They have the un-redacted versions of the history of all the resource objects since they started streaming the updates. Redacting personal data only started a few years ago. In many cases they have the full history, regardless of how many times the object has been deleted and re-created. Anyone who starts an NRTM stream now will start to build up their own history of redacted, operational data objects that will remain in their own database after objects are deleted and re-created. That is an accepted practice.
This constraint really is totally arbitrary. It is rooted in technical expedience in getting the service started. It has nothing to do with privacy. There really is no good reason to keep this arbitrary constraint. Either we provide historical operational data or we don't. Offering a partial data set based on random events makes no sense.
cheers
denis
co-chair DB-WG
On Mon, 4 Apr 2022 at 23:33, Cynthia Revström <me@cynthia.re> wrote:
Hi,
I am sorry for the delayed response but I object to removing that constraint.
It feels problematic to me from a privacy perspective, and it feels like the last deletion point is a fair balance between providing useful info and not providing too much info.
I don't think this restriction should be removed, at least not without a very good reason to do so.
-Cynthia
On Thu, Mar 31, 2022 at 1:32 PM denis walker via db-wg <db-wg@ripe.net> wrote:
Colleagues
When I wrote the spec for historical queries, almost 10 years ago, I included an arbitrary constraint to only show history back to the most recent deletion point. This was to get something in production quickly and see how useful it was. Over the years several people have asked for this arbitrary constraint to be removed. No one has objected to removing it. The co-chairs therefore recommend that this arbitrary constraint be removed. The co-chairs now ask the RIPE NCC to produce an impact analysis and implementation plan to remove it. We will then seek a final approval from the community on the plan.
cheers
denis
co-chair DB-WG