Hi,

I'm another researcher who uses quite a bit of the historical data held in these services, and I appreciate the commitment to keeping this data available where possible.

The Labs article states: "For the RIPEstat use-case, we make the data available in a variety of ways which takes up about 800 TB of storage space."
This reads to me as if there's a lot of (potentially unnecessary?) data duplication, which is why proposal 2 sounds sensible to me. I would imagine it's possible to reconstruct some or all of the served formats from the underlying data, so for older data, would producing these on the fly / converting formats on request be feasible? (There's a toy sketch of the kind of thing I mean below.) Also, is there a way to get a breakdown of which data formats are the most storage-intensive, or which parts of services like RIPEstat use the most storage?
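
To be clear about what I'm picturing, here is a purely illustrative sketch (the file name, record layout and function are all made up, not anything from RIPEstat's internals): keep one canonical compressed dump per dataset and render a served format like JSON at request time instead of storing it alongside the raw data.

    # Purely illustrative; the file name, record layout and helper are hypothetical.
    import bz2
    import csv
    import json

    def render_json_on_the_fly(canonical_path):
        """Turn a bz2-compressed CSV dump into a JSON response at request time."""
        with bz2.open(canonical_path, mode="rt", newline="") as fh:
            rows = list(csv.DictReader(fh))
        return json.dumps({"data": rows})

    if __name__ == "__main__":
        # Write a tiny fake dump so the sketch runs end to end.
        with bz2.open("sample-dump.csv.bz2", mode="wt", newline="") as fh:
            fh.write("prefix,origin_asn,first_seen\n193.0.0.0/21,3333,2010-01-01\n")
        print(render_json_on_the_fly("sample-dump.csv.bz2"))

The point being that paying a bit of CPU per request for rarely accessed historical data might be a worthwhile trade against keeping every pre-rendered variant on disk.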

I imagine there probably aren't that many use-cases where instant access to historic data is needed, so making access to older data slower/tiered (and hence cheaper) doesn't seem like a problem. That said, I'm looking at it very much from a research perspective, so I could be way off the mark there.

Kind regards,
Josh