Digging into the STH archives (Q2 2010!), you will see posts on the "Big WHS". That project was one of the genesis projects for what is now STH and ran from 2009-2011.
That was actually one of the things that led me over here, either linked by someone on smallnetbuilder or something I read before deciding I needed to sign up.
If you look at the evolution and the key learnings from that activity:
- Remember, if you have a 30-drive array and drives fail at a 5% AFR, you can plan for at least one drive failure per year. When that happens and the array is healing, large arrays may take a long time to recover.
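To put a number on that, here is a quick back-of-envelope sketch, assuming independent failures (which real arrays only approximate):

```python
# Failure math for a 30-drive array at a 5% annual failure rate.
# Assumes failures are independent, which is optimistic in practice.
n_drives = 30
afr = 0.05  # 5% annual failure rate per drive

expected_failures_per_year = n_drives * afr
p_at_least_one = 1 - (1 - afr) ** n_drives

print(f"Expected failures/year: {expected_failures_per_year:.1f}")  # 1.5
print(f"P(at least one failure in a year): {p_at_least_one:.0%}")   # ~79%
```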
If you wanted to build a 30(ish) drive bay system, here is my advice after going through the Big WHS evolution that eventually led to the STH you see today:
Oh trust me, I'm completely and utterly sold on the idea that SAS expanders are THE way to go, certainly longer term and for any larger scale-out. My primary question was: can I at least get by for six months to several years doing things my way? Is there any reason to believe that the electrical complexity of 8 SAS drives in a case gets that much worse at 12 or 16 drives? Because in my mind, more wires are just more wires; they are what they are in any case. [EDIT:] Also, again in my mind: "plan for one drive failure per year". I wasn't willing to spend too much just to be able to hot swap one drive failure per year, when that was perceived as the primary benefit.
It's actually even cheaper than I outlined. My reason for listing prices was partly so that (apparently "ha ha" now) anyone who wanted to could clone my design. I already have something like five Antec Basiq 350s lying around from years ago, along with the ATX cases, so my out-of-pocket costs to move into SAS territory were literally just going to be the HBA, expander, and cables. From there, it was a matter of seeing how far I could take that, and where the soft upper limit sits: the point where you can keep adding drives at the same minimal cost, or even a falling cost per drive, before prices climb back up on the 'Enterprise' side of the U-shaped curve.
My largest planned scale-out was going to be the lesser of "what I can run off one 20 amp circuit's common plug power" and somewhere around 48-64 drives, because SnapRAID has a maximum recommended setup of 42 data + 6 parity drives, plus additional separate drives not under SnapRAID (such as for much smaller files, e.g. millions of audio sample files; SnapRAID prefers big data blocks per volume). I thought 30 drives was a fair guess; I wasn't sure about 48 or more.
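Here is a rough sketch of that power budget; the per-drive and base-system wattages are my guesses for typical 3.5" drives, not measurements:

```python
# "What I can run off one 20 amp circuit" vs the SnapRAID cap.
# All wattages below are assumptions, not measurements.
circuit_watts = 120 * 20 * 0.8   # 20A at 120V, 80% continuous-load rule
system_base_watts = 250          # assumed: board, CPU, HBA, expander, fans
watts_per_drive = 10             # assumed steady-state per 3.5" drive
# (spin-up surge is ~25-30W per drive, so staggered spin-up is assumed)

max_drives_by_power = (circuit_watts - system_base_watts) // watts_per_drive
snapraid_cap = 42 + 6            # recommended max: 42 data + 6 parity

print(f"Power-limited drive count: {max_drives_by_power:.0f}")          # ~167
print(f"Practical cap: {min(max_drives_by_power, snapraid_cap):.0f}")   # 48
```

In other words, under these assumptions the circuit isn't the binding constraint; SnapRAID's recommended limit is what caps the build first.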
It's my responsibility to have something that can both rapidly migrate data to tape (simply to preserve it for the future) and follow a disk scale-out plan that keeps data online if we are actively editing, doing VFX, doing color grading, whatever, even for multiple projects. Both SnapRAID and SAS scale out right up to 48 drives, which is a lot of damn storage for a single server. The sole thing I've been trying to figure out is the "bridging NAS" to take care of my needs from now until I can set things up "properly" AND need the 100+ TB category beyond what the two servers will fairly easily provide.
One semi-related question: does Supermicro standardize backplanes and let you upgrade them in the future? Could I buy a 24-bay hot swap case and change the backplane from SAS2 to SAS3 later, or am I stuck with the backplane it came with?
- Even shucking external hard drives for 8TB WD drives, 30x 8TB hard drives will cost you $6000. Against that, $700 up front is not bad.
- Remember 30+ spinning disks create a lot of vibration/ heat/ noise. You do not want to be near them.
- For reference, I now have ZFS servers that I can use as replication targets for Proxmox ZFS. All Proxmox ZFS is set up as two-drive mirrors. FreeNAS is used mostly for bulk storage with RAIDZ2. I can then use ZFS send/ receive to push backups to one another (a rough sketch of that step follows below). I do have one Ceph pool in the STH hosting cluster. That was a nightmare with <5 nodes in the cluster. Ceph is super cool, but it is also harder to get running well due to complexity.
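For anyone curious, that replication step is roughly the following. Pool, dataset, snapshot, and host names here are placeholders, and both snapshots are assumed to already exist (created beforehand with `zfs snapshot`):

```python
import subprocess

# Hypothetical names; adjust to your pools and hosts.
src = "tank/vmdata"
prev_snap = f"{src}@backup-prev"   # already present on the target
cur_snap = f"{src}@backup-cur"     # the new snapshot to ship
dest_host = "backup-nas"
dest_fs = "backuppool/vmdata"

# Equivalent of: zfs send -i <prev> <cur> | ssh <host> zfs receive -F <dest>
send = subprocess.Popen(
    ["zfs", "send", "-i", prev_snap, cur_snap],
    stdout=subprocess.PIPE,
)
recv = subprocess.Popen(
    ["ssh", dest_host, "zfs", "receive", "-F", dest_fs],
    stdin=send.stdout,
)
send.stdout.close()  # let the receiver see EOF when send finishes
recv.communicate()
```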
I fully appreciate what you are trying to do. Check the STH archives. I went through the process when 60TB was "big" using 1TB/ 2TB drives. STH did largely spawn out of my desire to help others after that experience.
Thanks for the single positive word of enthusiasm. ^O_O^ I was hoping people would be happier about my out-of-the-box thinking, or consider it something worth cloning, so it's a bit of a downer so far. That said, I still think I was right in my original plan to have separate servers, which lets me postpone the need to consolidate 20-30 drives for longer and thus not have to buy the big case until later. A 36TB tape prep server and a 64-80TB primary NAS are plenty to buffer even massive opportunistic video shoots, right up to Red Weapon 8K footage and 16-camera mocap stage data, if I keep the tapes swapped and writing nonstop from the tape prep server, which I can probably manage three times per day (before school, after getting home, and before bed) without much annoyance.
To your points: #1, the drives will be bought progressively, not up front (and my goal is primarily to migrate data to tape when possible anyway, because the heavier editing/access doesn't have to happen right away; I just need to save the raw video footage without it corrupting or becoming lost until we can afford to deal with it). #2, it's going in the laundry room; it's fine. #3, I don't quite understand all that yet, but in the future I can learn about even better solutions than SnapRAID (more automated ones, for instance). My primary current plan is to bring back mirrored 2.5" laptop drives from a video shoot and copy them onto the main NAS the moment we are back (to empty them), plus the secondary NAS. A tape starts writing right away, and we are able to run out and get more data if needed, because we have mirrored disk systems with a backup to tape starting immediately, so the laptop drives can be wiped without much fear.
My old server sounds like what you want to do... a full ATX tower with hot swap adapters for the drives. It holds 12 3.5" HDDs in front, with two 2.5" bays for the mirrored boot drives.
The build using consumer parts worked, but ended up costing more than getting a server chassis did. Those adapter bays add up fast.
Keep in mind that a PSU failure can take out everything attached to it. Saving a few bucks here can cost you a box full of drives later.
So, you want lots of storage. Just how much are we talking about? How fast does it need to be? What network speed? It sounds like just a couple of clients. Big arrays at 1Gb/s are easy; 10Gb/s gets more difficult.
You mentioned being able to grow storage as needed. Consider ZFS mirror pools (RAID10).
It sounds like at least some of your storage needs are for a staging area for tape backup. I would use ZFS just for its checksums to make sure the data isn't corrupted while in transit.
Yes, I'd considered adding hot swap adapters to an ATX case. One thing I liked about "my" way was that if the whole project bites the dust, every single thing I've bought can be repurposed as normal desktop gear; there's not as much you can do with a 24-bay backplane case, I mean. Though I think by splitting my servers (not consolidating into one) I can "get by" with a maximum of 8-12 drives per case for separate purposes, and it will be okay. By the time I need a SAS expander case, it should hopefully already be several years in the future.
To the PSU failure point, my assumption (if I'm wrong PLEASE correct me) is that anything with proper protection circuitry should avoid that..? I've seen it happen with garbage Chinese PSUs, but I assumed any "real" brand (Antec, Corsair, Seasonic) wouldn't kill everything attached. If there is a risk, then yes, I need to reconsider. (But by the same mindset, is there any risk of server PSUs doing that, and what's the lifespan of a used PSU anyway? I was under the impression I shouldn't expect a used PSU to last another 5 years.)
My planned storage was dynamic in the sense that I didn't know how much I would need or when, but I wanted to be sure I could scale it out to meet needs. The minimum was "enough to buffer multi-camera 8K cinema footage" and 16-camera motion capture (so I'd estimated several TB per day possible) and rapidly migrate it to Ultrium LTO-6 tapes, keeping the tape machine working overtime (about 7.5TB/day, or 2.5TB max throughput with triple-redundant mirroring). The maximum was if we start editing all of that in real time, doing VFX, and/or working on multiple projects; in the last case, some money would be coming in to help fund the drive expansion.

But the idea was to have a plan in place so that we could just grow the storage monolith without dicking around worrying about data corruption, bit rot, and dead drives. What caught me flat-footed years ago was having incoming footage, nowhere to put it, and no time to research how to properly protect it beyond simple mirroring (I even had two mirrorsets die on me, with the same data corrupt on both, before realizing it). Just like one external USB drive on your computer is fine but 20 on my PC turned into an implosion, I didn't want to be painted into a corner again. So I guesstimated 300TB as an upper figure, though more could be possible; for reference, Star Wars 7 had a total dataset of over 1PB.
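Sanity-checking those tape numbers, using LTO-6's published native figures of 2.5TB at about 160MB/s:

```python
# Can three tape swaps per day actually hit 7.5TB/day?
tape_capacity_tb = 2.5   # LTO-6 native capacity
native_mb_s = 160        # LTO-6 native (uncompressed) write speed
tapes_per_day = 3

hours_per_tape = tape_capacity_tb * 1e6 / native_mb_s / 3600
daily_tb = tape_capacity_tb * tapes_per_day

print(f"Write time per tape: {hours_per_tape:.1f} h")       # ~4.3 h
print(f"Throughput at 3 swaps/day: {daily_tb} TB/day")      # 7.5 TB/day
```

So three full tapes is roughly 13 hours of writing per day, which fits the before-school/after-home/before-bed swap schedule with room to spare.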
How fast? Not very, but accessing full drive throughput from several workstations at once would be nice. One workstation might be fed a file at 150MB/s from a single storage drive (which would then be worked on via local flash, then re-exported back to the server once or twice during the day), but the goal was to make that possible for three to six workstations at once later on (three people, but each with one realtime PC and one processing/slave PC), hence the desire to go to 10GbE before too long. Saturating 10Gb is not required (it would be nice, but not required), but transfer speeds over 1Gb are probably needed, which forces 10GbE so the network is not the bottleneck. Playing with Fibre Channel or InfiniBand may still happen, but the simplicity of just sticking with Ethernet will probably win out in the end.
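The rough math that forces 10GbE (raw wire rates, ignoring protocol overhead):

```python
# Aggregate client demand vs link capacity, from the numbers above.
per_client_mb_s = 150    # one file stream per workstation
clients = 6              # upper end: three people x two machines each

aggregate_mb_s = per_client_mb_s * clients   # 900 MB/s
gbe_mb_s = 1e9 / 8 / 1e6                     # ~125 MB/s wire rate
ten_gbe_mb_s = 10e9 / 8 / 1e6                # ~1250 MB/s wire rate

print(f"Aggregate demand: {aggregate_mb_s} MB/s")
print(f"1GbE ceiling: {gbe_mb_s:.0f} MB/s (less than even one client)")
print(f"10GbE ceiling: {ten_gbe_mb_s:.0f} MB/s (fits with headroom)")
```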
Do you need fast space for editing, or cold storage for completed projects? What you seem to be describing is mostly storage that rarely needs to be powered up, i.e. finish a project, store it to disk, then basically forget it. For years I attempted to keep every bit, and along the way realised that after a year or two I simply have no interest in the old data. So: what are your real storage needs?
I'm not sure what is wrong with creating mirrored vdevs with ZFS, two disks at a time, using whatever disks are the best price per TB for cold storage. "I need to buy X drives of Y size now against a future need" is usually a poor financial decision.
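As a sketch of that buy-as-you-go approach (the prices are made-up examples, and "tank" is a placeholder pool name):

```python
# Pick the best $/TB pair for the next mirror vdev.
candidates = {  # size_tb: street_price_usd (assumed examples only)
    4: 90,
    8: 160,
    12: 280,
}

size, price = min(candidates.items(), key=lambda kv: kv[1] / kv[0])
print(f"Best $/TB right now: {size}TB at ${price} (${price / size:.0f}/TB)")

# Growing the pool is then one command per pair of disks, e.g.:
#   zpool add tank mirror /dev/disk-a /dev/disk-b
```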
Similarly, for performance, NVMe is obscenely hard to beat as scratch disk. While building striped SATA/SAS SSDs is doable, it seems a lot cheaper and easier to just stage your working data onto an NVMe drive in your workstation and then offload it to slow disk as required.
I suspect that you could scope down to something that meets your needs this year and worry about next year when a) you have more money, b) you have a specific need, and c) disk prices have been driven down lower.
At the beginning it's mostly going to be an offline archive. It WILL come out again, but it's an archive, not a backup, and it's on tape because tape wins on cost per gig, doesn't need the drives regularly powered up (it can sit for 5-10 years if needed), makes offsite backup easy, and there's no dropping a box full of hard drives and ruining both copies of your mirrorset (ask dummy me how I discovered the fragility of hard drives and why cardboard boxes are not a suitable way to move them to the car! a dropped Ultrium tape in its case won't even care), etc. I'm totally set on that plan right now: migration of video files to tape, with data mostly staying on disk just long enough to serve as a mirror until the tape writes are verified as good.
The second stage (which will come, I'm not sure when, but it will come) is needing more data online for active editing, VFX, color grading, and other work. This will store snapshots and backups (not archives) to tape as well for the duration of a project. At the finish of a project, the work files we choose will be stuffed into deep cold storage. It's foolish to dump anything representing a lot of work like that, but keeping it "alive" on spinning rust is annoyingly expensive, so tape is the way. Lost work is lost money; the cost of long-term storage on tape is FAR less than on disk, especially once you start talking about replacing disks every 5 years or so. It's worth saving and it's worth storing right, just without overpaying for storage.
Spinning disks for archival storage still have issues: needing to be powered up and maintained, usually migrated to new disks after 5 years, vulnerability to lightning strikes, all sorts of things. Tapes you stick in a climate-controlled storage facility offsite, and they can literally come back 30 years later. No stiction seizing the drive, no bit rot from not getting a scrub every 3 weeks, none of it. It's not a bad idea to migrate data from one set of tapes to a newer set (and a newer standard, i.e. LTO-6 to LTO-8) every 5-10 years, but if you don't, it isn't the end of the world. So I'm pretty sold on Ultrium tape for multiple purposes: large media library archive, deep archive, and backups/snapshots.
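To back the cost claim with rough numbers: a 10-year comparison where every price is an assumed example, not a quote, and the tape drive's up-front cost is ignored:

```python
# 10-year cost per TB for cold storage, tape vs disk.
# All prices are illustrative assumptions; plug in current street prices.
lto6_tape_usd, lto6_tb = 25, 2.5   # assumed ~$25 per LTO-6 cartridge
disk_usd, disk_tb = 200, 8         # assumed ~$200 per 8TB drive
disk_buys_in_10y = 2               # replace spinning rust ~every 5 years

tape_cost_per_tb = lto6_tape_usd / lto6_tb
disk_cost_per_tb = disk_usd * disk_buys_in_10y / disk_tb

print(f"Tape: ${tape_cost_per_tb:.0f}/TB over 10 years")   # ~$10/TB
print(f"Disk: ${disk_cost_per_tb:.0f}/TB over 10 years")   # ~$50/TB
# (ignores the tape drive itself, power, and enclosures)
```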
Drives and tapes will definitely only be bought on an as-needed basis, with both the tape media library and the primary NAS grown as necessary for the need at hand. Minimizing cost overhead per drive and "having a scale-out plan" is mostly about the luxury of not having to think much in the future: if I'm working 16 hours a day between school and film shoots, I just buy drives and slap them in the case, let SnapRAID do its thing, and swap Ultrium tapes three times a day. No thinking, no stress, just follow the plan.
NVMe will likely be used in future workstations. The plan is to load the files to be worked on early in the day (before someone starts), making transfer rates a little less important (if they aren't already on the drive), then once or twice per day (i.e. over lunch and after work) snapshot the working data back to the NAS server, which can happen in the background. It then gets mirrored to the second NAS and written to tape religiously in some revolving strategy in case we have to roll something back (the workstations themselves probably having either RAID1 or at least a local HDD storing data separately from the flash).
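A minimal sketch of what that twice-daily snapshot step could look like; the paths are placeholders, and it assumes the NAS share is already mounted, rsync is installed, and the "latest" symlink exists after the first run:

```python
import datetime
import subprocess

work_dir = "/scratch/nvme/project/"       # local NVMe working set (placeholder)
nas_dir = "/mnt/nas/project-snapshots/"   # mounted NAS target (placeholder)

stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M")
dest = f"{nas_dir}{stamp}/"

# --link-dest hard-links unchanged files against the previous snapshot,
# so twice-daily snapshots stay cheap on NAS space.
subprocess.run(
    ["rsync", "-a", f"--link-dest={nas_dir}latest", work_dir, dest],
    check=True,
)
# Repoint "latest" at the snapshot we just made.
subprocess.run(["ln", "-sfn", dest, f"{nas_dir}latest"], check=True)
```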
To the last comment, the whole point was hyperscalability.
Starting at 16 drives included drives I already have that would go into the tape prep server, and it was part of a server consolidation plan that I now don't think I'm going to do, because the added complexity doesn't have any critical benefits at this moment but does have some downsides.