Hi all,

I wanted to raise a related question (apologies if this is not the right place to do so): I’ve found that one of the most effective ways to extract data for a specific date is through the BigQuery table containing the traceroute results. However, in its current form, the table isn’t partitioned by date (or at all), which makes it quite cumbersome to query, as each query scans the entire dataset (currently over 66 TB).

Would it be possible to make the RIPE Atlas BigQuery tables partitioned by date (e.g., using start_time or a similar field)?  Partitioning would greatly reduce query costs and make it much easier for users to work with the data, especially when focusing on specific time ranges. For example, M-Lab partitions its BigQuery tables by date, which makes their dataset much easier to work with efficiently.
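
As a rough illustration of the kind of query this would help, here is a minimal Python sketch that builds a one-day query. The table name is a placeholder (not the real RIPE Atlas table), and I am assuming `start_time` is a TIMESTAMP column. On a table partitioned by `DATE(start_time)`, a filter like this should let BigQuery read only that day's partition instead of the whole table.

```python
from datetime import date, timedelta


def one_day_query(table: str, day: date) -> str:
    """Build a SQL query selecting rows whose start_time falls on `day` (UTC)."""
    next_day = day + timedelta(days=1)
    # Half-open interval [day, next_day) so no row is counted on two days.
    return (
        f"SELECT *\n"
        f"FROM `{table}`\n"
        f"WHERE start_time >= TIMESTAMP('{day.isoformat()}')\n"
        f"  AND start_time < TIMESTAMP('{next_day.isoformat()}')"
    )


# Placeholder table name, used only for illustration.
print(one_day_query("my-project.atlas.traceroute", date(2025, 5, 1)))
```

On the current unpartitioned table the same query still works, but it is billed for a full scan.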

Thanks again for your support and for making this data accessible!

Best,
Loqman 

On Jun 2, 2025, at 11:20 AM, Francesco Iannuzzelli <fiannuzzelli@ripe.net> wrote:


Hi Vasileios,
We are glad that you found a solution. In cases like this we can assist you in various ways, for example by providing a bulk archive to download.
There would be some limitations, as the bulk archive could only be generated by measurement ID, or by measurement type in a specific date range, but it would still be fully comprehensive and much more efficient than querying the API.
Please let us know if you need further help.
Kind regards,
Francesco

On 02/06/2025 01:33, Vasileios Giotsas wrote:
Hi again folks.

For anyone having similar issues, I managed to address most of them by using curl and forcing HTTP version 1.0.
I also fixed a bug in my API query parameters (the `start` parameter for `measurements/{pk}/results` has the same effect as the `start_time__lt` for `measurements/` queries). The updated script is here, maybe it can be useful to others too:





From: Vasileios Giotsas <giotsas@hotmail.com>
Sent: Saturday, May 31, 2025 8:06 PM
To: Giovane C. M. Moura via ripe-atlas <ripe-atlas@ripe.net>
Subject: [atlas] Best way to download bulk data older than a month?
 
I'm trying to download the RIPE Atlas traceroutes for a day that is older than one month.

Given that the FTP repository publishes only results from the last month, I need to use the API.
So what I'm doing is listing the measurement IDs for a given date and then, for each ID, doing a GET for the results of that ID.

The code is here:
https://github.com/vgiotsas/RIPEAtlas_Queries/blob/main/download_atlas_traceroutes.py
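
For readers who just want the shape of the two-step approach, here is a minimal sketch that only builds the request URLs. It is not the linked script. The `start_time__lt` and `start` parameters are the ones mentioned in this thread; the base URL, the `type` filter, the `stop` parameter and the use of Unix timestamps as values are assumptions based on my reading of the API, and the helper names are made up for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumed base URL of the RIPE Atlas v2 API.
API = "https://atlas.ripe.net/api/v2"


def day_bounds(day: str) -> tuple[int, int]:
    """Return Unix timestamps for the start and end of a UTC day (YYYY-MM-DD)."""
    start = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    start_ts = int(start.timestamp())
    return start_ts, start_ts + 86400


def list_url(day: str) -> str:
    """Step 1: URL listing traceroute measurements that started before the day ended."""
    _, stop_ts = day_bounds(day)
    params = {"type": "traceroute", "start_time__lt": stop_ts}
    return f"{API}/measurements/?{urlencode(params)}"


def results_url(msm_id: int, day: str) -> str:
    """Step 2: URL fetching one measurement's results restricted to that day."""
    start_ts, stop_ts = day_bounds(day)
    params = {"start": start_ts, "stop": stop_ts}
    return f"{API}/measurements/{msm_id}/results/?{urlencode(params)}"


print(list_url("2025-05-01"))
print(results_url(1001, "2025-05-01"))  # 1001 is just an example ID
```

Note that the listing filter shown here does not exclude measurements that had already stopped before the day began, so some per-ID requests may return nothing.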

However, some downloads often time out, and when I try to resume them using the Range header, the API doesn't seem to support it: the download starts over and fails again partway through.
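
The symptom can be checked from the response itself: a server that honors `Range` replies with 206 Partial Content and a `Content-Range` header, whereas a plain 200 means the header was ignored and the body starts again from byte zero. A minimal sketch of that check, with helper names made up for illustration:

```python
def resume_headers(offset: int) -> dict:
    """Request headers asking the server to send the body from byte `offset` onward."""
    return {"Range": f"bytes={offset}-"}


def range_honored(status_code: int, headers: dict) -> bool:
    """True if the response is partial, i.e. the server honored the Range header."""
    # Header names are case-insensitive, so compare in lower case.
    has_content_range = any(name.lower() == "content-range" for name in headers)
    return status_code == 206 and has_content_range


# A 200 reply to a ranged request means the download restarted from the beginning.
print(range_honored(200, {"Content-Type": "application/json"}))
print(range_honored(206, {"Content-Range": "bytes 1000-1999/5000"}))
```

If the check returns False, appending to the partial file would corrupt it, so the file has to be rewritten from the start.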

Is there a better way to do such bulk downloads? Or at least to resume failed downloads? 

Thanks! 

-----
To unsubscribe from this mailing list or change your subscription options, please visit: https://mailman.ripe.net/mailman3/lists/ripe-atlas.ripe.net/
As we have migrated to Mailman 3, you will need to create an account with the email matching your subscription before you can change your settings. 
More details at: https://www.ripe.net/membership/mail/mailman-3-migration/