
Daniel Karrenberg wrote:
[Sorry to be repetitive at the frequency of 1/year]
This is experimental data which is immutable once acquired; read: it does not change anymore, ever. It is also not read very often, and less so the older it gets. It also needs no high-speed access; we have copies of it in databases for that.
The costs of keeping it around are roughly: equipment + ops time.
It appears that the particular equipment (NetApp filer), which is easy to operate, is becoming too expensive. It might be a good idea to invest a little ops time in cheap storage boxes for stuff that does not need filer-quality storage.
ATA disks are available at about half a Euro per gigabyte in units of 300 GB. Thus it is possible to put up about a terabyte of storage in any simple Wintel box: slightly more without redundancy, slightly less with software RAID. Any (old) Wintel box will be fine. The equipment cost will be negligible: EUR 600/terabyte if you use old Wintel boxes, otherwise add the cost of the simplest Wintel box. If we build a couple of them and operate them all the same way, the ops cost will not be too high. One does not even need RAID. Just build two of them and have a cron job rsyncing between them for full hot redundancy. Name them cheapfiler-1 and cheapfiler-1-copy, and make the copy read-only to users. Make as many as we need. Spread them around for physical redundancy.
Not rocket science. What's the problem?
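For illustration only, the cron-plus-rsync mirror described above could be sketched roughly as follows. The host name cheapfiler-1-copy comes from the message; the /data path, the function names and the script location in the crontab comment are assumptions made up for this sketch, and it presumes rsync can reach the copy box over ssh without a password prompt.

```python
import subprocess

def build_rsync_command(src_dir, copy_host, copy_dir):
    """Return the rsync argument list that mirrors src_dir onto the copy box."""
    # A trailing slash on the source means "copy the contents of this directory".
    src = src_dir.rstrip("/") + "/"
    dest = f"{copy_host}:{copy_dir.rstrip('/')}/"
    # -a preserves permissions, times and symlinks. --delete is left out on
    # purpose: the data is immutable, so a file removed by mistake on the
    # primary should survive on the copy.
    return ["rsync", "-a", src, dest]

def mirror(src_dir, copy_host, copy_dir):
    """Run the mirror once; raises CalledProcessError if rsync fails."""
    subprocess.run(build_rsync_command(src_dir, copy_host, copy_dir), check=True)

# Hypothetical crontab entry on cheapfiler-1, running a script that calls
# mirror("/data", "cheapfiler-1-copy", "/data") every night at 03:00:
#   0 3 * * * /usr/bin/python3 /usr/local/bin/mirror.py
```

Leaving out --delete trades a little wasted space for protection against accidental removals, which seems to fit data that never changes once written.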
I more-or-less agree. For 12000 Euros you can put 11 TB in a rack-mounted box (prices from alternate.nl, a month or so ago):

 Qty  Unit (EUR)  Total (EUR)  Item
   1    2.499,00     2.499,00  Procase C4EE case, 24x3.5" drives, 950 Watt PSU
  22      359,00     7.898,00  Hitachi Deskstar 7K500 SATA hard disk, 500 GB, 8,5 ms, 16 MB, 7200 RPM
   2      189,00       378,00  Promise SATA II 150 SX8 SATA controller, PCI-X 64-bit 133 MHz, 8xSATA, 150 MB/s
   1      459,00       459,00  Tyan Thunder K8S Pro (S2882G3NR), 2x Opteron, PCI-X, 8xDDR-SDRAM, ATI Rage XL, 2x Gigabit LAN
   2      199,00       398,00  AMD Opteron 244 1.8 GHz CPU
   1      549,00       549,00  Kingston DIMM 2 GB 333 MHz DDR (PC2700)
                   ----------
                    12.181,00  Total

The bottleneck here will probably be the dual Gigabit Ethernet controllers, rather than the disks, if we are using it for backup.

This is the *worst-case* cost, mind you. If we were to build a system to back up RIS raw data, it could be as simple as putting a USB drive on a desk next to someone's workstation (I vote for Arife, since her workstation is in the corner and not likely to get bumped by someone walking by) - total cost, 359 Euros:

http://www.alternate.nl/html/shop/productDetails.html?artno=A9UE03&

Which is a bit extreme, but it shows what we could do if we decided to optimise for cost/performance instead of risk- and work-aversion.

--
Shane

p.s. With 11 TB I could achieve my dream of putting all of RIS data on-line. :)
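As a quick sanity check on the figures in the parts list above, the arithmetic can be redone in a few lines. This is only a sketch: the variable names are made up here, capacity is counted as raw decimal gigabytes, and no allowance is made for RAID or filesystem overhead.

```python
# (description, quantity, unit price in EUR), copied from the parts list
parts = [
    ("Procase C4EE case", 1, 2499),
    ("Hitachi Deskstar 7K500 500 GB disk", 22, 359),
    ("Promise SATA II 150 SX8 controller", 2, 189),
    ("Tyan Thunder K8S Pro board", 1, 459),
    ("AMD Opteron 244 CPU", 2, 199),
    ("Kingston 2 GB DDR DIMM", 1, 549),
]
DISK_GB = 500      # capacity per disk, decimal GB
DISK_COUNT = 22    # number of disks in the list

total_eur = sum(qty * price for _, qty, price in parts)
raw_gb = DISK_GB * DISK_COUNT
eur_per_gb = total_eur / raw_gb

print(total_eur, raw_gb, round(eur_per_gb, 2))  # prints: 12181 11000 1.11
```

So the whole box comes to roughly EUR 1.11 per raw gigabyte, against about EUR 0.72 per gigabyte for the disks alone (359 / 500).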