Building A Backblaze Storage Pod
A while ago I saw the Backblaze storage pod and was impressed.
Like many others I thought:
- I want one
- Wouldn’t it work great with ZFS?
- The hardware sucks
Building one
A while ago I used to build and sell Linux storage servers with media streaming software and easy setup (before all the ready-rolled ones came out), so the software side is not a major challenge. Backblaze have released a template for the cases and a list of the other components, and coincidentally a good friend is a Mechanical Design Engineer who can work on it for me. The cost of the cases drops precipitously if you buy in bulk, and he is looking at making them flat-packable and easy to assemble by folding the edges together, so I will take the plunge and buy a load. If anyone wants to get hold of some, please get in touch.
ZFS
As soon as you see so many disks in a case like that, it’s hard not to think of Sun’s Thumper and ZFS.
I’ve blogged about ZFS before and given talks on it. With so many disks to fail (either noisily or silently), data loss is inevitable, and worse, you may not even be alerted. ZFS would solve this, or at least ensure you know about it. Backblaze use custom application logic to work around this, using Tomcat and HTTPS.
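To make the "silent failure" point concrete, here is a minimal sketch of the general idea of checksumming every block and verifying it on read. It is a toy, not ZFS’s actual implementation: the `ChecksummedStore` class, its dictionaries and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib


class ChecksummedStore:
    """Toy block store that keeps a SHA-256 digest for every block (hypothetical)."""

    def __init__(self):
        self.blocks = {}     # block id -> bytes (stands in for the disks)
        self.checksums = {}  # block id -> hex digest, kept apart from the data

    def write(self, block_id, data):
        self.blocks[block_id] = data
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()

    def read(self, block_id):
        data = self.blocks[block_id]
        # Verify on every read so corruption is reported, not returned.
        if hashlib.sha256(data).hexdigest() != self.checksums[block_id]:
            raise IOError(f"checksum mismatch on block {block_id}")
        return data


store = ChecksummedStore()
store.write(0, b"backup data")
store.blocks[0] = b"backup dat\x00"  # simulate a silent corruption on disk
try:
    store.read(0)
except IOError as err:
    print(err)
```

A plain read of the corrupted block would have handed back bad data without complaint; with the checksum the failure becomes a noisy one, which is the property I care about here.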
It’s Not Highly Available
A chap at Sun has a critique here that is totally spot on, and he makes a few great points about subtle changes in Sun’s design to accommodate vibration, noise and electromagnetic radiation. In many ways the hardware is inadequate and does not have the uptime characteristics of a device in Sun’s range. That said, an individual device from Sun is not as HA as, say, an EMC SAN (with mirrored write cache, dual SPs etc.), as it too relies mostly on commodity hardware. For five-nines availability you need to decide how you will protect against device failure anyway; the Backblaze devices just fail faster. That’s your trade-off.
It won’t be fast
That is largely a function of the disks and the controllers; if performance is a problem you could get a better motherboard, better disks and faster controllers, and perhaps eschew the port multipliers too. A very cool new feature of ZFS (L2ARC / Hybrid Storage Pools) allows an SSD to be used as a second-level cache, which would help. On Linux, dm-cache (or here) could probably achieve something similar.
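As a rough back-of-the-envelope illustration of that trade-off, the sketch below works out what five nines allows in downtime and how many independent copies of a less reliable device it would take to get there. The 99% per-device figure and the assumption that devices fail independently are mine, purely for illustration; real failures are often correlated.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability):
    """Expected downtime per year for a given availability (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined_availability(single, copies):
    """Availability of `copies` independent replicas where any one suffices."""
    return 1.0 - (1.0 - single) ** copies


def copies_needed(single, target):
    """Smallest number of independent replicas that meets the target."""
    copies = 1
    while combined_availability(single, copies) < target:
        copies += 1
    return copies


print(round(downtime_minutes_per_year(0.99999), 2))  # about 5.26 minutes a year
print(copies_needed(0.99, 0.99999))                  # 3 copies of a 99% device
```

Under those assumptions, three cheap pods that are each up 99% of the time get past five nines, which is why the redundancy scheme matters more than the quality of any single box.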
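The second-level cache idea can be sketched as a toy model: a small, fast RAM tier that spills its evicted entries into a larger SSD tier, with the slow disks only touched when both miss. The `TwoTierCache` class and its slot counts are hypothetical, and ZFS’s real implementation differs in the details.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy LRU cache: RAM tier in front of an SSD tier in front of slow disks."""

    def __init__(self, ram_slots, ssd_slots, backing):
        self.ram = OrderedDict()  # first-level cache (fastest)
        self.ssd = OrderedDict()  # second-level cache (slower, larger)
        self.ram_slots = ram_slots
        self.ssd_slots = ssd_slots
        self.backing = backing    # dict standing in for the spinning disks
        self.disk_reads = 0

    def _put_ram(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)
        if len(self.ram) > self.ram_slots:
            # Demote the least recently used RAM entry to the SSD tier.
            old_key, old_value = self.ram.popitem(last=False)
            self.ssd[old_key] = old_value
            self.ssd.move_to_end(old_key)
            if len(self.ssd) > self.ssd_slots:
                self.ssd.popitem(last=False)

    def read(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)
            return self.ram[key]
        if key in self.ssd:
            value = self.ssd.pop(key)  # SSD hit: promote back to RAM
        else:
            self.disk_reads += 1       # both tiers missed: go to disk
            value = self.backing[key]
        self._put_ram(key, value)
        return value


disks = {n: f"block-{n}" for n in range(5)}
cache = TwoTierCache(ram_slots=1, ssd_slots=2, backing=disks)
for n in (0, 1, 0):
    cache.read(n)
print(cache.disk_reads)  # the second read of block 0 is served from the SSD tier
```

The point of the extra tier is simply that a miss in RAM no longer has to mean a seek on a slow disk behind a port multiplier.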
How can you make it HA?
This is really another blog post (and a few weeks’ work hacking out the ideas), but I can think of several ways of doing what Backblaze do in their software stack by exporting files (via NFS, SMB or another protocol) or block devices (ATAoE, iSCSI, NBD etc.) in a robust manner.
I have ordered some of the port multipliers, got my friend working on the case and will buy the sundry bits over the next few days.
This is one of my 101 goals.