Hi Guys
Just to add a bit of context. Historical queries currently show the history of a resource object only as far back as its most recent creation date. So if an object is created, modified several times, deleted, re-created and then modified several more times, all you can query is the modification history back to that re-creation point. There is no community policy regarding this. It was an arbitrary decision made (probably by me at the time) when we specified the historical query feature a few years ago. We were asked for this feature, but there were no requirements other than 'please show us some history'. From an implementation point of view this was the 'low hanging fruit' that let us get a feature developed in a reasonable amount of time. The full history of resource objects going back decades is still available in the database.
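To make the cut-off concrete, here is a minimal Python sketch of the behaviour described above. The Version record, its field names and the visible_history function are all made up for illustration; this is not how the database actually stores or serves versions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    revision: int    # hypothetical sequence number of the stored version
    operation: str   # assumed to be "create", "modify" or "delete"


def visible_history(versions):
    """Return the versions from the most recent 'create' onwards.

    Everything before that point stays in storage but is not returned.
    """
    last_create = None
    for index, version in enumerate(versions):
        if version.operation == "create":
            last_create = index
    if last_create is None:
        return []
    return versions[last_create:]


# created, modified, deleted, re-created, modified twice
full_history = [
    Version(1, "create"),
    Version(2, "modify"),
    Version(3, "delete"),
    Version(4, "create"),
    Version(5, "modify"),
    Version(6, "modify"),
]

visible = visible_history(full_history)
print([v.revision for v in visible])  # [4, 5, 6]
```

Revisions 1 to 3 are still in full_history, which mirrors the point that the older data exists but is simply not exposed by the query.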
Roll the clock forward and GDPR kicks in. It now seems that much of the data IS considered to be 'personal'. Even nic-hdls are considered to be personal data, as they can be used to identify an individual. There is also the issue that nic-hdls have been reused over the years. So a nic-hdl referenced in a resource object that existed in, say, 2008 may still exist in the database today, but it may now refer to a different person.
There is also the issue that the database has never differentiated between personal and corporate data. PERSON and ROLE objects are interchangeable and can be used in exactly the same way in the same places. ORGANISATION objects may also contain personal rather than corporate data. Any attribute that holds name, address, phone, email, nic-hdl or organisation data may hold personal data. So all of this has to be removed from any historical data that is made available. There have never even been any best common practice guidelines saying where personal and corporate data should or should not be used, except for "abuse-c:". When we implemented that, we clearly stated that it should NOT contain personal data and that it will always be made publicly available without any obfuscation.
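As a rough illustration of what that removal does to an object, here is a small Python sketch. The attribute list is only my guess at the kind of attributes involved, not the actual filter applied to historical queries, and the object is modelled simply as a list of (attribute, value) pairs with example values.

```python
# Illustrative assumption: attributes treated as possibly personal.
# This is NOT the real list used by the historical query filter.
PERSONAL_ATTRIBUTES = {
    "person", "role", "address", "phone", "e-mail",
    "nic-hdl", "admin-c", "tech-c", "org", "org-name",
}


def strip_personal(obj):
    """Drop every attribute that may hold personal data."""
    return [(attr, value) for attr, value in obj
            if attr not in PERSONAL_ATTRIBUTES]


# A made-up historical version of a resource object.
historical = [
    ("inetnum", "192.0.2.0 - 192.0.2.255"),
    ("netname", "EXAMPLE-NET"),
    ("org", "ORG-EX1-EXAMPLE"),
    ("admin-c", "AB123-EXAMPLE"),
    ("tech-c", "CD456-EXAMPLE"),
    ("status", "ASSIGNED PA"),
]

bare = strip_personal(historical)
for attr, value in bare:
    print(f"{attr}: {value}")
```

Only inetnum, netname and status survive here. Every reference to another object is gone.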
What you end up with is a bare-bones object. For people trying to research historical links between objects that existed at some point in the past, all those links have been removed. If we move towards removing personal data from the database, say by deprecating the PERSON object, then nic-hdls will no longer identify any individual. In that case there would be no need to remove them from historical data. Addresses, emails and phone numbers would still have to be removed. To preserve the links between objects in historical data, these could be replaced by unique keys. In a one-time update, all addresses, emails, phone numbers etc. could be assigned a unique key, which can be added to versions of objects in the database as metadata. When a historical object is queried, the key could be shown in place of the actual data. The keys could also be made searchable. That would allow researchers to follow these links across objects and time.
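To show what I mean by unique keys, here is a minimal Python sketch under some assumptions of my own. The KeyRegistry class, the "KEY-000001" format, the normalisation rule and the set of substituted attributes are all hypothetical. A simple counter is used rather than a hash of the value, so a key cannot be turned back into the original data by guessing likely values.

```python
import itertools


class KeyRegistry:
    """Hands out one stable, opaque key per distinct value (hypothetical)."""

    def __init__(self):
        self._keys = {}
        self._counter = itertools.count(1)

    def key_for(self, value):
        # Assumed normalisation: ignore case and collapse whitespace.
        normalised = " ".join(value.lower().split())
        if normalised not in self._keys:
            self._keys[normalised] = f"KEY-{next(self._counter):06d}"
        return self._keys[normalised]


# Illustrative assumption: attributes whose values get replaced by keys.
SUBSTITUTED = {"address", "phone", "e-mail"}


def pseudonymise(obj, registry):
    """Show the key in place of the actual data for substituted attributes."""
    return [(attr, registry.key_for(value) if attr in SUBSTITUTED else value)
            for attr, value in obj]


def search(key, versions):
    """Return the labels of all stored versions that carry the given key."""
    return [label for label, obj in versions
            if any(value == key for _, value in obj)]


registry = KeyRegistry()

# Two made-up objects from different years that shared a contact email.
v2008 = [("inetnum", "192.0.2.0 - 192.0.2.255"), ("e-mail", "noc@example.net")]
v2015 = [("aut-num", "AS64496"), ("e-mail", "NOC@example.net")]

history = [
    ("inetnum@2008", pseudonymise(v2008, registry)),
    ("aut-num@2015", pseudonymise(v2015, registry)),
]

shared_key = registry.key_for("noc@example.net")
print(history[0][1])
print(search(shared_key, history))
```

The researcher never sees the email address, but searching on the key still shows that both objects pointed at the same contact.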
Just a few thoughts....
cheers
denis
co-chair DB-WG