Some historical RIPEstat data will not be available from 24 September
Dear colleagues,

As we shared back in May, we have been working to reduce our data centre footprint by managing the data for RIS and RIPEstat more efficiently[1]. We have experienced some problems which mean that parts of the historical RIPEstat data will not be available for an extended period.

To reduce our data centre usage, we have been exporting RIS-related datasets from the database that contains our largest datasets (HBase) to a new environment, which allows us to decommission our old Hadoop cluster. During the migration, we encountered timeouts and memory limitations with old servers and software that held historical data.

Here we faced a difficult choice: we could either extend our data centre contracts at considerable additional cost, or we could accept that certain historical data would not be available to users for an extended period. We decided on the second approach.

It is important to be clear that we haven’t lost anything here. However, this does mean that we will not be able to provide aggregated historical data for a number of API calls and their corresponding widgets for some time after we complete our migration. These include: as-path-length, as-routing-consistency, asn-neighbours, asn-neighbours-history, bgp-update-activity, bgp-state, bgp-updates, bgplay, ris-peerings, routing-status, visibility.

As mentioned, this has been a difficult process. We may still encounter other issues when we make the switch to the new environment tomorrow (24 September). My team is doing their utmost to keep any disruption to a minimum, but there are no guarantees everything will go to plan.

We will be looking at the most efficient way to rebuild access to the historical data in the coming months, and we will share more details in an article on RIPE Labs soon.
Kind regards,

Felipe Victolla Silveira
Chief Technology Officer
RIPE NCC

[1] For more on this, refer to the following:
- Reducing our Data Centre Footprint (RIPE Labs): https://labs.ripe.net/author/felipe_victolla_silveira/reducing-the-ripe-nccs...
- RIPE 88 Technology Update (slides 10-20): https://ripe88.ripe.net/wp-content/uploads/presentations/89-RIPE-88-RIPE-NCC...
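For anyone scripting against the calls listed in the announcement, the following is a minimal sketch of how a client might flag queries that could come back without historical data. It is not RIPE NCC code. The helper names (`build_url`, `may_lack_history`) are hypothetical, the `starttime` parameter is accepted by several calls such as bgp-updates but not necessarily by all of them, and treating anything before the start of the current UTC day as "historical" is an assumption made for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Data calls named in the announcement (spelling "visibility" as the API does).
AFFECTED_CALLS = frozenset({
    "as-path-length", "as-routing-consistency", "asn-neighbours",
    "asn-neighbours-history", "bgp-update-activity", "bgp-state",
    "bgp-updates", "bgplay", "ris-peerings", "routing-status", "visibility",
})

BASE_URL = "https://stat.ripe.net/data"  # public RIPEstat Data API base


def build_url(call, resource, starttime=None):
    """Build a RIPEstat Data API URL (hypothetical helper, no network access).

    `starttime` is optional; whether a given call honours it should be
    checked against that call's documentation.
    """
    params = {"resource": resource}
    if starttime is not None:
        params["starttime"] = starttime.strftime("%Y-%m-%dT%H:%M")
    return f"{BASE_URL}/{call}/data.json?{urlencode(params)}"


def may_lack_history(call, starttime, now=None):
    """Return True if the call is on the affected list and the query starts
    before the current UTC day (an assumed definition of "historical").

    Both `starttime` and `now` must be timezone-aware datetimes.
    """
    now = now or datetime.now(timezone.utc)
    today_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return call in AFFECTED_CALLS and starttime < today_start


if __name__ == "__main__":
    start = datetime(2024, 9, 1, tzinfo=timezone.utc)
    print(build_url("bgp-updates", "AS3333", start))
    print(may_lack_history("bgp-updates", start))
```

A client could call `may_lack_history` before issuing a request and, when it returns True, treat an empty or partial response as expected rather than as an error.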
Hi Felipe,

Could you possibly clarify what you mean by 'historical' in this context? Your RIPE Labs article <https://labs.ripe.net/author/felipe_victolla_silveira/reducing-the-ripe-nccs-data-centre-footprint/> implies this could mean data older than 30 days, but unless I've misunderstood something, you were already serving pre-2021 data from S3 in advance of the last few months.

Does this mean there will effectively be a gap in the available data, or that all data older than 30 days will not be available?

Kind regards,
Josh

On Mon, 23 Sept 2024 at 12:39, Felipe Silveira <fvictolla@ripe.net> wrote:
----- To unsubscribe from this mailing list or change your subscription options, please visit: https://mailman.ripe.net/mailman3/lists/mat-wg.ripe.net/ As we have migrated to Mailman 3, you will need to create an account with the email matching your subscription before you can change your settings. More details at: https://www.ripe.net/membership/mail/mailman-3-migration/
Hi Josh,

Thanks for your inquiry. In our message, we did not consider the definition of historical data we used when communicating about RIPE Atlas. In this case for the RIPEstat datasets, when referring to "historical data," we are considering data that is not about the current state of the internet/current day. We will make sure to clarify this in future communications.

Following up on Felipe’s last email [0], we encountered some issues during the data migration which we are now working on. As mentioned, no data will be lost, but historical routing data available through the API calls will not be available while we derive it from the source dataset.

We plan to share more details about the migration on RIPE Labs once we have clearer insights into the impact and restoration efforts.

Kind regards,

Ties de Kock
Specialist software engineer
RIPE NCC

[0] https://mailman.ripe.net/archives/list/ncc-services-wg@ripe.net/thread/4CMKB...

On Mon, 23 Sept 2024 at 13:55, Joshua Levett via mat-wg <mat-wg@ripe.net> wrote:
participants (3)
- Felipe Silveira
- Joshua Levett
- Ties de Kock