On Sun, Apr 05, 2020 at 12:56:48PM +0200, Marcus Stoegbauer wrote:
thanks a lot for collecting the information and the post-mortem, and apologies for nitpicking, but:
Following an update to our internal registry software on 1 April at 18:16 (UTC+2), 2,669 ROAs were deleted from Provider Independent (PI) address assignments. [..] Affected users with alerts set up in the LIR Portal received a notification email on 31 March at 22:23, stating that their ROAs were missing. [..] Our engineers were able to reinstate all of the missing ROAs by 13:15 on 2 April.
The timeline does not completely match up here. I assume the users received alarms on April 1 after the update of the internal registry software?
I've constructed the following timeline based on my own validator data archives and the notification I received for my PI block. The minute timestamps differ slightly from the above because NTT's validators run roughly every 15 minutes and I'm basing this on those snapshots.

2020-04-01T16:32Z - 2,666 VRPs disappeared from the RPKI.
2020-04-01T21:23Z - Affected users with alerts set up in the LIR Portal received a notification stating that their ROAs were missing.
2020-04-02T11:32Z - The missing VRPs returned.

It is interesting that the NCC's outreach about the incident (or their ROA state alert notification emails) seems to have triggered not just a rebound to the original level (which was expected because of the undelete action), but also a small increase in new RPKI ROA creation.

Date         VRP Count
2020-03-29   80,915
2020-03-30   80,933
2020-03-31   81,016
2020-04-01   81,089
2020-04-02   78,626
2020-04-03   81,530
2020-04-04   81,655
2020-04-05   81,732

Kind regards,

Job
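For readers who want to check the rebound against the figures above, here is a minimal sketch that computes the day-over-day change in VRP count. The counts are copied from the table; the names `vrp_counts` and `daily_deltas` are illustrative only and not part of any validator tooling.

```python
# Daily VRP counts copied from the table in the message above.
vrp_counts = [
    ("2020-03-29", 80915),
    ("2020-03-30", 80933),
    ("2020-03-31", 81016),
    ("2020-04-01", 81089),
    ("2020-04-02", 78626),
    ("2020-04-03", 81530),
    ("2020-04-04", 81655),
    ("2020-04-05", 81732),
]


def daily_deltas(counts):
    """Return (date, count, change vs. previous day) for each day after the first."""
    return [
        (date, count, count - prev_count)
        for (_, prev_count), (date, count) in zip(counts, counts[1:])
    ]


# Print one line per day: date, count, and signed change from the day before.
for date, count, delta in daily_deltas(vrp_counts):
    print(f"{date}  {count:>6,}  {delta:+,}")
```

On these numbers the count for 2020-04-03 (81,530) is 441 above the count for 2020-04-01 (81,089), which is a larger step than the daily increases seen in the days before the incident.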