Re: [db-wg] NWI-2 - displaying history for DB objects where available

12 Jul 2016

Hi George

On 11/07/2016 05:43, George Michaelson wrote:
...
> (this may have been fixed, but this is how I understand things are
> today)
> As currently implemented, the internal DB history of an object is
> tied to the fundamental DB object ID, which is (I believe) a
> monotonically rising/unique identifier.
correct
...
> If an object is DEL and then
> ADD rather than UPD, the associated history of that object is
> disconnected: Not lost, it exists inside the RIPE SQL model but it
> has detached from the apparent resource, because the DEL/ADD sequence
> creates a new object with a new fundamental ID.
yes and no. DEL/ADD does create a new 'object' with a new (internal) 
object id (and other internal flags like version number are also reset). 
But these separate physical objects are still connected through the 
primary key. I would see this more as an event in the life of a logical 
object, identified by the pkey, rather than a disconnect.
...
> I say fundamental ID, distinct from the apparent unique primary key.
> The primary key of an INETNUM is not what drives the history
> mechanism: it's the specific object ID which is internal.
'mechanism' is an interesting choice of word here. If you construct a 
mechanism you can make it do whatever your requirements are. If you 
actually want to be driven by the internal object id then that is what 
you build. The current implementation of object history (unless it has 
changed recently) was arbitrarily built on this object id. You can only 
look at the history of a currently existing object. This object has an 
internal object id. You are presented with all the versions of the 
object that has this object id. So you get the history back to the most 
recent deletion/(re)creation point. But this was a 'first iteration' 
design constraint.

If you want the full history of, say, an inetnum with a particular 
address range (the pkey of the object) the 'mechanism' can be built to 
traverse the internal object ids (deletion points) and follow the full 
trail of all objects with the same pkey. This is quite easy to do. The 
community just needs to agree on what they want.
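
To illustrate the difference between the two behaviours, here is a minimal
Python sketch. The record layout (`Version` and its fields) and the function
names are hypothetical and for illustration only. They are not the actual
RIPE Database schema or code.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical record layout, for illustration only; this is NOT the
# actual RIPE Database schema.
@dataclass(frozen=True)
class Version:
    object_id: int   # internal id, a new one is assigned on every (re)creation
    pkey: str        # e.g. the address range of an inetnum
    seq: int         # version number, reset on (re)creation
    op: str          # "ADD", "UPD" or "DEL"
    timestamp: int   # time of the change (any sortable value)


def history_by_object_id(versions: List[Version], pkey: str) -> List[Version]:
    """First-iteration behaviour: only the versions that share the internal
    id of the currently existing object."""
    same_key = [v for v in versions if v.pkey == pkey]
    if not same_key:
        return []
    latest = max(same_key, key=lambda v: v.timestamp)
    if latest.op == "DEL":
        return []  # no currently existing object, so no history is shown
    return sorted((v for v in same_key if v.object_id == latest.object_id),
                  key=lambda v: v.seq)


def history_by_pkey(versions: List[Version], pkey: str) -> List[Version]:
    """Full trail: every version with this pkey, across deletion points."""
    return sorted((v for v in versions if v.pkey == pkey),
                  key=lambda v: v.timestamp)


KEY = "192.0.2.0 - 192.0.2.255"
log = [
    Version(101, KEY, 1, "ADD", 10),
    Version(101, KEY, 2, "UPD", 20),
    Version(101, KEY, 3, "DEL", 30),   # deletion point
    Version(205, KEY, 1, "ADD", 40),   # re-created with a new internal id
    Version(205, KEY, 2, "UPD", 50),
]

current_only = history_by_object_id(log, KEY)   # stops at the re-creation
full_trail = history_by_pkey(log, KEY)          # follows the whole trail
```

With this sample log, `current_only` holds only the two versions of object
205, while `full_trail` holds all five events, including the deletion point.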
...
> This is one aspect of history-in-whois which is leading APNIC to
> consider a design for a JSON based approach, which presents the
> history of objects over time, irrespective of the ADD/DEL/ADD gaps.
This can be done with a very easy tweak to the current implementation.
...
> I believe this needs to be part of your problem statement because as
> it stands, the history mechanism implies something about an object
> which may not be complete.
The mechanism does not imply anything. It is what it was designed to be 
(as stated above). If the community wants something different then 
specify different requirements.
...
> Additionally, we are considering a model which ties history to the
> start-end pairs of a range, alongside start-end pairs of dates. This
> is because (unlike prefix/len) it's reasonably simple to construct
> canonical lists of the entire set of objects which span a
> time-window, and have an end of range bigger than the start, and a
> start of range less than the end, which encompasses the overlaps,
> splits and joins of that range.
I believe the RIPE NCC may already have looked at this.
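
The start-end selection described in the quoted text can be sketched as a
plain interval-overlap test applied twice, once to the address range and
once to the dates. This is only a sketch under assumptions: the `Record`
layout, the integer day numbers and the function names are hypothetical,
and both intervals are treated as inclusive at each end.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import List


# Hypothetical record layout: one row per period during which a range
# existed. All bounds are inclusive.
@dataclass(frozen=True)
class Record:
    first_addr: int
    last_addr: int
    valid_from: int   # e.g. a day number
    valid_to: int


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """Two inclusive intervals overlap if each starts before the other ends."""
    return a_start <= b_end and a_end >= b_start


def select(records: List[Record], first_addr: int, last_addr: int,
           t_from: int, t_to: int) -> List[Record]:
    """Records touching the queried range at some point in the time window."""
    return [r for r in records
            if overlaps(r.first_addr, r.last_addr, first_addr, last_addr)
            and overlaps(r.valid_from, r.valid_to, t_from, t_to)]


def addr(text: str) -> int:
    return int(ip_address(text))


records = [
    Record(addr("192.0.2.0"), addr("192.0.2.255"), 0, 99),       # original range
    Record(addr("192.0.2.0"), addr("192.0.2.127"), 100, 199),    # lower half after a split
    Record(addr("192.0.2.128"), addr("192.0.2.255"), 100, 199),  # upper half after a split
    Record(addr("198.51.100.0"), addr("198.51.100.255"), 0, 199),  # unrelated range
]

# Everything that touched 192.0.2.0 - 192.0.2.255 between day 50 and day 150.
across_split = select(records, addr("192.0.2.0"), addr("192.0.2.255"), 50, 150)
# Only the upper half of the range, before the split happened.
upper_before = select(records, addr("192.0.2.128"), addr("192.0.2.255"), 0, 99)
```

In this sample, `across_split` picks up the original range and both halves
of the split, and `upper_before` finds only the original range, because the
halves did not exist yet in that window.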
...
> A large amount of the data is affected by changing referenced
> objects. As others have noted, the RIPE NCC region has a significant
> issue in current data privacy legislation which guides limits on what
> can be seen. Noting that, we feel that the use cases of the history
> probably do need to reflect the history of changes of associated
> person, role and maintainer objects. Luckily the DB internal history
> mechanism tracks this, so we think we can take our basic tabular
> approach for start-end date and range pairs, and augment it to make
> the complete (JSON) block of a resource, and the specific state of
> associated records at that time. Most of this is pre-computable data,
> which will occupy disk (or database) but is not inherently complex in
> itself. There may be a lot of repetition (denormalized) but disk (and
> within limits memory) is cheap.
I believe the RIPE NCC may already have looked at this as well. But let 
me add the caveat that privacy laws will prevent making any historical 
personal data available from the RIPE Database even if it is still held 
internally for the legitimate purposes of the database.

However, a lot of 'information' can still be provided about related 
objects without breaking the privacy laws. But then we are into my 
favourite topic of the data model and services that provide interpreted 
information from the database rather than raw data that can be 
misleading. Services like RIPEstat for example.

cheers
denis
...
> -George