Mapping the future - planning a scale-out from 3 drives to 48 / controller advice

So maybe you've seen some of my posts in other subforums. The micro version is that I'm forced to start on a shoestring with something that can be upgraded over time, preferably without switching to a completely different system and doing a data migration with lots of exporting and importing.

The software I'm planning to use is SnapRAID. My reason for choosing it is the Drobo-like flexibility: I can upgrade any drive to a larger size at any time, or add drives at any time. (FreeNAS/ZFS wants me to buy matched sets of drives as single vdevs and makes piecemeal upgrades more difficult.) I can also upgrade the server-side hardware - motherboard, controllers and such - without any forced internal data migration like a hardware RAID change might involve. I originally planned for a future scale-out capacity of 300TB, but in reality it could be more or less. I was going to build an 8TB test box, start at 32TB for production work, and probably upgrade that rapidly to 64TB. Beyond 64TB I'm not sure when it will happen, but I'm planning for it: in case opportunities present themselves that require me to rapidly grow the back end, I'd like to have a plan on the shelf and know exactly what to do.
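For reference, here's roughly what that looks like on the SnapRAID side - a minimal snapraid.conf sketch where all the mount points and drive names are hypothetical placeholders, with one parity drive and a few data drives (more parity lines would get added as the array grows):

```
# Minimal sketch - every path and name here is a placeholder
parity /mnt/parity1/snapraid.parity

# Keep copies of the content file on several different disks
content /var/snapraid/snapraid.content
content /mnt/disk1/snapraid.content
content /mnt/disk2/snapraid.content

data d1 /mnt/disk1/
data d2 /mnt/disk2/
data d3 /mnt/disk3/

exclude *.tmp
exclude /lost+found/
```

Growing the array is just another data line plus a 'snapraid sync'; swapping in a replacement drive is a 'snapraid fix' run against the new disk - which is exactly the piecemeal upgrade behavior I'm after.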

I am at least aware of things like SAS expanders, SAS without expanders, SATA multilane, and SATA port multipliers - not very well, mind you, but I know what exists. I mostly plan to use consumer-grade SATA hard drives now and in the future; they're good enough for Backblaze and Google. A future SSD to help cache things might be enterprise grade (due to total-lifetime-bytes-written limits if I plan to burn through petabytes of processing), or at least something like an Intel 750 NVMe, but the questions here are mostly about the hard drive array.
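As a sanity check on the write-endurance point, this is the back-of-the-envelope math I'm using - the daily write rate and TBW ratings below are made-up placeholders, not real specs:

```python
# Back-of-the-envelope SSD endurance estimate.
# All numbers below are placeholder assumptions, not measured values.

daily_writes_tb = 2.0     # assumed TB written to the cache SSD per day
consumer_tbw    = 400     # e.g. a typical consumer NVMe rating (TB written)
enterprise_tbw  = 8000    # e.g. a write-intensive enterprise drive rating

for label, tbw in [("consumer", consumer_tbw), ("enterprise", enterprise_tbw)]:
    years = tbw / daily_writes_tb / 365
    print(f"{label}: ~{years:.1f} years at {daily_writes_tb} TB/day")
```

If the real workload ends up anywhere near petabytes, the consumer drive burns through its rating in well under a year, which is why I'm eyeing something beefier for the cache layer.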

My priorities are:
1) Drive space - until I have 'enough' space, performance is not as important.
2) Performance - after there's plenty of free space, I might upgrade to improve it.
3) Reliability/uptime - once there's plenty of space and it's fast enough, this could mean adding redundancy (perhaps migrating drives to some kind of SAN nodes and keeping an entire spare node or two ready to plug in if anything goes wrong - or simply a prebuilt backup server and swapping the drives over).
4) Minimizing time spent on sysadmin duties - as close to a 'set it and forget it' self-managing array as we can get. Just periodically swap drives when they die or get upgraded, and the rest takes care of itself automatically.

So at first it's just about the minimum cost overhead per drive (which seems like it will go up past 20-some drives because 8+ channel controllers are more expensive per port) - think common 2- or 4-port PCIe x1 cards - and being happy saturating 1-gigabit Ethernet. The second stage is trying to make those existing drives faster, which might involve some internal drive migration, setting up RAID 0 stripes, and SSD caching, while trying to get close to swamping 10-gigabit Ethernet. Stage 3 makes me want some kind of total-system failover or clustering option, so something like a dying HBA or motherboard doesn't stop the work; whatever drives are already in there might have to be easily swappable to a backup system. Stage 4 is trying to make that automatic - failure detection, automatic failover, etc. (No idea if that requires special controllers or hardware, but if so, that's when it becomes relevant.)
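Rough numbers for the stage 1 vs stage 2 targets, assuming a ballpark sequential speed per drive (actual speeds vary by drive and workload):

```python
# Rough throughput targets - the per-drive speed is an assumption.

one_gbe_mbs = 1000 / 8      # ~125 MB/s usable on 1GbE, ignoring protocol overhead
ten_gbe_mbs = 10000 / 8     # ~1250 MB/s on 10GbE
hdd_seq_mbs = 180           # assumed sequential MB/s for a 7200rpm SATA drive

print(f"Drives to saturate 1GbE : {one_gbe_mbs / hdd_seq_mbs:.1f}")   # under one drive
print(f"Drives to saturate 10GbE: {ten_gbe_mbs / hdd_seq_mbs:.1f}")   # ~7 drives in parallel
```

Since SnapRAID stores files whole on individual disks rather than striping, a single read stream only goes as fast as one drive - which is why stage 2 pulls in RAID 0 stripes or SSD caching to chase the 10GbE number.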


Can anyone give me some quick and dirty guidelines about where the sweet spots might be, and when certain upgrades become necessary? For instance, when should I look at SAS controllers (maybe at 16 drives and definitely at 24-48, say)? Does multilane SATA introduce reliability problems? What kinds of loads make an SSD useful as a cache, and so on? I realize this is kind of open-ended and wordy, but I'm not sure where else to start. :) I have a plan in my head (which was to just start with 4-port PCIe x1 cards and max out a motherboard) but I know it may not work as planned (issues of bus contention, shared PCIe lane bandwidth and similar).
 

StammesOpfer

Active Member
Mar 15, 2016
If you know you are going to grow like that, then just start with a 24- or 36-bay Supermicro (or other brand) chassis with SAS2 expander backplanes. You only need a single SAS2 8i PCIe card then. This can all be purchased used from eBay ($300-500). That's very cheap compared to what you will spend on drives, very robust, and will give you a lot of room before you have to add another chassis.
 
And that will let me hook up normal consumer SATA drives? (I don't understand SAS at all yet; I'm only just reading about it now.) What specifically should I be searching for to find that? Also, how is power handled for those - do they need special PSUs (I assume not included at those prices), or what?

My original plan figured I would 'probably' end up at 16 drives, with 24 drives 'entirely possible', and the thought that I'd be periodically upgrading drive sizes in place over the years - but since SnapRAID supports up to 48 drives, I figured I might as well think ahead. Would a pair of 24-drive chassis use/need two SAS cards then, or how does that work?

If someone can link me to a 'For Dummies' guide, I'd appreciate it. :p This is an order of magnitude cheaper than I thought SAS would cost, so yes, that puts it back on the map as a scale-out option. (I'll always be looking for ex-server deals like that.)
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
Make sure it's a SAS2 (or SAS3, for that matter) backplane and yes, high-capacity SATA drives will work. Don't mix and match SAS and SATA, although there is debate over this.

With a 24-bay chassis (a Supermicro 846, for example) you can use 1 HBA for all 24 drives.
You could technically use a Supermicro 847 and run 1 SAS cable to the front backplane and 1 SAS cable to the rear, all on 1 HBA.

Depending on your performance, redundancy, etc. goals, you may want to use dual expanders, multiple HBAs, and so on.
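Rough math on why one HBA and a single x4 uplink is fine for capacity-first use, assuming SAS2 at 6Gb/s per lane (treat these as ballpark figures, not a benchmark):

```python
# Per-drive bandwidth behind a single SAS2 x4 wide port - ballpark only.

lane_mbs   = 600               # SAS2: 6 Gb/s per lane ~= 600 MB/s after 8b/10b encoding
uplink_mbs = 4 * lane_mbs      # one x4 wide port (a single SFF-8087/8088 cable)

for drives in (24, 36, 48):    # 846, 847, or two chassis daisy-chained
    print(f"{drives} drives: ~{uplink_mbs / drives:.0f} MB/s each if all stream at once")
```

In practice all drives rarely stream at the same time, so the shared uplink mostly matters during parity syncs, scrubs, and rebuilds rather than day-to-day use.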
 

StammesOpfer

Active Member
Mar 15, 2016
T_Minus covered it. No 'for dummies' guide that I know of, but read here and look over at r/homelab too. Spend some time reading, and if you have specific questions, feel free to ask.
 
There was literally a 'SAS for Dummies' guide on the internet previously, but the page where you could claim it for free seems to have vanished after a company buyout - I was hoping someone might literally have it to share. :p Announcing SAS SANs for Dummies book, LSI edition - StorageIOblog

Concerning mixing and matching SAS and SATA - I was curious what the problem actually is, especially if they hold separate 'stripes' of data (like keeping the SAS drives in a separate stripe of data, or what have you)?

I'm still curious how it works with more than one cable to the SAS expander, and about things like dual redundant HBAs with failover, if that's doable. Is it possible to connect two SEPARATE servers to the same chassis of SAS drives? (Even if only one is on at a time, for failover... or alternately controlling different drives in the same chassis - say a SnapRAID system and a FreeNAS ZFS system, each with its own drives.)
 

StammesOpfer

Active Member
Mar 15, 2016
I personally have run SAS and SATA drives on the same expander. It can work, but it may not work under all circumstances. If you want two-card redundant SAS, then you need to be looking at a dual-expander, dual-port setup.

For example, the Supermicro part numbers are:
BPN-SAS2-846EL1 - single expander
BPN-SAS2-846EL2 - dual expander

As far as two separate computers accessing a single backplane/expander - that isn't what this is designed for.
The manual does a pretty good job of explaining the valid configurations.
https://www.supermicro.com/manuals/other/BPN-SAS2-846EL.pdf
 
I'm slowly learning as I go. Someone else did something like what I'd considered - simply running SAS cables from a second server to individual drives in the hot-swap chassis, since they can be longer than SATA cables (I forgot they can be up to 10m). Or running a separate SAS card in the second server without expanders - just an 8088 to quad 8087 type cable to a set of four drives along one side, with all the other drives coming off the main expander.

Basically I just want to have all my drives in one place (if I pony up for a nice hot-swap case in the future), while potentially slowly migrating from one server to another - i.e. starting on SnapRAID and eventually playing with a stripe of 4-8 matched drives under FreeNAS ZFS, expanding the latter over time and shrinking the former as data migrates over, or using one as the backup for the mainline system, which would later be FreeNAS once I understand it better. Plan B was wondering if dual SAS cables would let me run redundant servers capable of accessing the same set of drives (any problem in Server A, just power it down and power up Server B), but my understanding of the wiring suggests that even if it would work, it would require SAS drives with dual ports (because everything is normally cross-linked as it is anyway - one input cable, to an expander, going to all drives). Obviously I couldn't power both up at the same time. But I realized a quick cable pull and swap while the server is down achieves the same goal if a second server is ready to go with a SAS card, and it prevents an accidental dual power-up.

One thing I saw clarified part of the issue - SAS uses different signaling voltages on the cable despite the matching physical connector. So I was mostly curious whether it's possible to, say, take a typical 4-lane 8088 connector coming out of a card and run it to 3x SATA drives and 1x SAS drive. Supposedly each lane is separate, so I would assume yes, but I don't KNOW.

I'm still not understanding how SAS daisy chaining works - whether you just take any free port out of one expander and run it to a whole other expander, and whether you'd get extra bottlenecks sharing 24 more drives via 1 port with another 24 upstream of that, and so on.

I'm still weighing whether there is any purpose to building a case around SATA port multipliers vs. a SAS-only setup. It will partly come down to cost, because under SnapRAID I can always upgrade the controllers separately without touching any data on the drives.