Hi George

On 11/07/2016 05:43, George Michaelson wrote:
(this may have been fixed, but this is how I understand things are today)
As currently implemented, the internal DB history of an object is tied to the fundamental DB object ID, which is (I believe) a monotonically rising/unique identifier.
correct
If an object is DEL and then ADD rather than UPD, the associated history of that object is disconnected. It is not lost (it still exists inside the RIPE SQL model), but it has detached from the apparent resource, because the DEL/ADD sequence creates a new object with a new fundamental ID.
yes and no. DEL/ADD does create a new 'object' with a new (internal) object id (and other internal flags like version number are also reset). But these separate physical objects are still connected through the primary key. I would see this more as an event in the life of a logical object, identified by the pkey, rather than a disconnect.
I say fundamental ID, as distinct from the apparent unique primary key. The primary key of an INETNUM is not what drives the history mechanism: it's the specific object ID, which is internal.
'mechanism' is an interesting choice of word here. If you construct a mechanism you can make it do whatever your requirements are. If you actually want to be driven by the internal object id then that is what you build. The current implementation of object history (unless it has changed recently) was arbitrarily built on this object id. You can only look at the history of a currently existing object. This object has an internal object id. You are presented with all the versions of the object that has this object id. So you get the history back to the most recent deletion/(re)creation point. But this was a 'first iteration' design constraint. If you want the full history of, say, an inetnum with a particular address range (the pkey of the object) the 'mechanism' can be built to traverse the internal object ids (deletion points) and follow the full trail of all objects with the same pkey. This is quite easy to do. The community just needs to agree on what they want.
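To illustrate the difference between the two lookups described above, here is a minimal sketch. It assumes a hypothetical list of version rows; the field names (`object_id`, `pkey`, `sequence`, `timestamp`, `body`) and the sample data are invented for illustration and are not the actual RIPE Database schema.

```python
from collections import namedtuple

# Hypothetical version row; field names are illustrative, not the real schema.
Version = namedtuple("Version", "object_id pkey sequence timestamp body")

def history_by_object_id(rows, object_id):
    """Versions of one internal object id only, i.e. history back to the
    most recent deletion/(re)creation point."""
    return sorted((r for r in rows if r.object_id == object_id),
                  key=lambda r: r.sequence)

def history_by_pkey(rows, pkey):
    """Full trail: every version of every internal object sharing the pkey,
    ordered in time across the deletion points."""
    return sorted((r for r in rows if r.pkey == pkey),
                  key=lambda r: (r.timestamp, r.object_id, r.sequence))

# Invented sample data: one inetnum deleted and re-created, plus an unrelated one.
rows = [
    Version(101, "192.0.2.0 - 192.0.2.255", 1, 1000, "created"),
    Version(101, "192.0.2.0 - 192.0.2.255", 2, 2000, "updated"),
    Version(101, "192.0.2.0 - 192.0.2.255", 3, 3000, "deleted"),
    Version(205, "192.0.2.0 - 192.0.2.255", 1, 4000, "re-created"),
    Version(205, "192.0.2.0 - 192.0.2.255", 2, 5000, "updated"),
    Version(300, "198.51.100.0 - 198.51.100.255", 1, 1500, "created"),
]

if __name__ == "__main__":
    for v in history_by_pkey(rows, "192.0.2.0 - 192.0.2.255"):
        print(v.timestamp, v.object_id, v.sequence, v.body)
```

Looking up object id 205 alone returns only the two versions after the re-creation, while the pkey lookup returns all five versions, with the deletion showing up as one event in the trail.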
This is one aspect of history-in-whois which is leading APNIC to consider a design for a JSON-based approach, which presents the history of objects over time, irrespective of the DEL/ADD gaps.
This can be done with a very easy tweak to the current implementation.
I believe this needs to be part of your problem statement because as it stands, the history mechanism implies something about an object which may not be complete.
The mechanism does not imply anything. It is what it was designed to be (as stated above). If the community wants something different then specify different requirements.
Additionally, we are considering a model which ties history to the start-end pairs of a range, alongside start-end pairs of dates. This is because (unlike prefix/len) it's reasonably simple to construct canonical lists of the entire set of objects which span a time-window, and have an end of range bigger than the start, and a start of range less than the end, which encompasses the overlaps, splits and joins of that range.
I believe the RIPE NCC may already have looked at this.
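The range and time-window test George describes can be sketched as a pair of interval-overlap checks. This is only an illustration under stated assumptions: the `Record` layout, the use of plain years for dates, inclusive bounds, and the sample rows are all invented here.

```python
import ipaddress
from collections import namedtuple

# Hypothetical record: address range as integers, validity as inclusive years.
Record = namedtuple("Record", "first last valid_from valid_to label")

def ip(text):
    """Convert a dotted-quad string to an integer for range comparison."""
    return int(ipaddress.ip_address(text))

def overlaps(a_start, a_end, b_start, b_end):
    """Two inclusive intervals overlap when each one ends at or after
    the point where the other starts."""
    return a_end >= b_start and a_start <= b_end

def spanning(records, first, last, t_from, t_to):
    """Records that overlap both the queried address range and time window."""
    return [r for r in records
            if overlaps(r.first, r.last, first, last)
            and overlaps(r.valid_from, r.valid_to, t_from, t_to)]

# Invented sample: a /24 that is later split into two /25s, plus an unrelated range.
records = [
    Record(ip("192.0.2.0"), ip("192.0.2.255"), 2010, 2011, "whole /24"),
    Record(ip("192.0.2.0"), ip("192.0.2.127"), 2012, 2014, "lower /25"),
    Record(ip("192.0.2.128"), ip("192.0.2.255"), 2012, 2014, "upper /25"),
    Record(ip("198.51.100.0"), ip("198.51.100.255"), 2010, 2014, "unrelated"),
]

if __name__ == "__main__":
    hits = spanning(records, ip("192.0.2.0"), ip("192.0.2.127"), 2011, 2013)
    print([r.label for r in hits])
```

Querying the lower half of the /24 across the split returns both the original /24 and the lower /25, which is the kind of split that a prefix/len key alone would not link.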
A large amount of the data is affected by changing referenced objects. As others have noted, the RIPE NCC region has a significant issue in current data privacy legislation, which limits what can be seen. Noting that, we feel that the use cases of the history probably do need to reflect the history of changes of associated person, role and maintainer objects. Luckily the DB internal history mechanism tracks this, so we think we can take our basic tabular approach for start-end date and range pairs, and augment it to make the complete (JSON) block of a resource, and the specific state of associated records at that time. Most of this is pre-computable data, which will occupy disk (or database) but is not inherently complex in itself. There may be a lot of repetition (denormalized) but disk (and, within limits, memory) is cheap.
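A rough sketch of what such a pre-computed, denormalized block could look like follows. Everything in it is an assumption made for illustration: the `(timestamp, body)` version lists, the `refs` key, the handles `EXAMPLE-MNT` and `EXAMPLE-ROLE`, and the output layout. Which referenced attributes may actually be published is a separate question that this sketch does not address.

```python
import json

def state_at(versions, t):
    """Return the body of the latest version with timestamp <= t, or None."""
    current = None
    for ts, body in sorted(versions, key=lambda v: v[0]):
        if ts <= t:
            current = body
    return current

def snapshot(resource_versions, referenced, t):
    """Build one denormalized block: the resource as it was at time t,
    plus each referenced object as it was at that same time."""
    resource = state_at(resource_versions, t)
    if resource is None:
        return None
    refs = {key: state_at(referenced.get(key, []), t)
            for key in resource.get("refs", [])}
    return {"at": t, "resource": resource, "referenced": refs}

# Invented sample data; timestamps are small integers for readability.
resource_versions = [
    (1, {"inetnum": "192.0.2.0 - 192.0.2.255", "refs": ["EXAMPLE-MNT"]}),
    (5, {"inetnum": "192.0.2.0 - 192.0.2.255",
         "refs": ["EXAMPLE-MNT", "EXAMPLE-ROLE"]}),
]
referenced = {
    "EXAMPLE-MNT": [(1, {"descr": "old maintainer text"}),
                    (4, {"descr": "new maintainer text"})],
    "EXAMPLE-ROLE": [(3, {"role": "Example NOC"})],
}

if __name__ == "__main__":
    print(json.dumps(snapshot(resource_versions, referenced, 2), indent=2))
```

The repetition is deliberate: each block carries its own copy of the referenced state, trading disk for a lookup that needs no joins at query time.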
I believe the RIPE NCC may already have looked at this as well. But let me add the caveat that privacy laws will prevent making any historical personal data available from the RIPE Database even if it is still held internally for the legitimate purposes of the database. However, a lot of 'information' can still be provided about related objects without breaking the privacy laws. But then we are into my favourite topic of the data model and services that provide interpreted information from the database rather than raw data that can be misleading. Services like RIPEstat for example.

cheers
denis
-George