> If the hard drives max out at about 200 MB/s, then a single SAS2 lane should be good for about 2.5 of them, or 3 if you're okay with a minor bottleneck. If anything is going to notice bottlenecks the most, it's the SSDs. You may want to consider keeping those on a separate path that goes straight back to an HBA, since they will easily max out that link.

A lot for me to unpack here; I'll have to reread more carefully to understand everything. But if it helps, the more complicated SAS2 drive system I'm currently planning would have about 16 hard drives, 2-4 SSDs, several optical drives, and an LTO tape drive. Maybe 24 hard drives as an absolute maximum if I later converged two servers into one, for temporary use like mirroring one drive to several others. I'm still trying to understand ports vs. lanes and how best to connect things up: expanders, breakout cables, direct attachment to the HBA, more than one HBA.
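To make the lane math concrete, here's a rough back-of-the-envelope sketch; the 500 MB/s usable-per-lane and 200 MB/s-per-drive figures are assumptions, not measurements:

```python
# Rough SAS2 bandwidth arithmetic (all figures are assumptions)
SAS2_LANE_NOMINAL = 600    # MB/s per 6 Gb/s lane after 8b/10b encoding
SAS2_LANE_USABLE = 500     # rough real-world figure once protocol overhead bites
HDD_MBPS = 200             # fast spinning disk, sequential
LANES_PER_WIDE_PORT = 4    # a typical SFF-8087/8088 connector carries 4 lanes

print(SAS2_LANE_USABLE / HDD_MBPS)    # 2.5 drives per lane with no bottleneck
print(SAS2_LANE_NOMINAL / HDD_MBPS)   # 3.0 if you accept a minor squeeze

# 16 HDDs all streaming at once vs. one 4-lane wide port:
demand = 16 * HDD_MBPS                          # 3200 MB/s
port = LANES_PER_WIDE_PORT * SAS2_LANE_USABLE   # 2000 MB/s
print(f"{demand} MB/s demanded vs {port} MB/s per wide port")
```

By that arithmetic, a single expander hanging off one wide port could bottleneck the full 16-drive set if everything streams at once, which is part of why the SSDs are better off on their own path.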
> It's sounding like if you have to run separate software or hardware RAID on top of it to get decent performance plus pool the drives, it's not actually saving you a whole lot of complexity, and might just be adding another point of failure. Plus, if you back it with a big array, wouldn't you need your parity drives to be as large as the combined space of the array? You'd have to use yet another array for that. If the extra parity drives aren't contributing to throughput, you're also leaving free performance on the table. And if you're backing it with a RAID anyway, my understanding is that you lose SnapRAID's advantage of being able to recover partial data when too many disks fail. It seems like what you actually want is just simple replication from one array to another?

I suppose it's vague if you don't know it. SnapRAID is like a software RAID, except that instead of doing data protection from the bottom up (i.e., working on data blocks underneath the filesystem), it works from the top down: it creates parity files within the filesystem, on separate drives, so that if anything is corrupted or lost it can be restored.
SnapRAID isn't used for RAID performance; it's used for file protection. The idea is that you set up something else underneath it for performance, like hardware RAID0 or a software RAID0/5/6 system, or you just use fast drives like SSDs.
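To make the top-down part concrete, here's a minimal snapraid.conf sketch; all mount points and drive names are hypothetical:

```
# Hypothetical snapraid.conf: parity is just an ordinary file
# on a dedicated disk, not a block layer under the filesystem
parity /mnt/parity1/snapraid.parity
2-parity /mnt/parity2/snapraid.2-parity

# Content files record what each snapshot contained; keep several copies
content /var/snapraid/snapraid.content
content /mnt/disk1/snapraid.content

# Data disks keep their own normal filesystems; SnapRAID only reads them
data d1 /mnt/disk1/
data d2 /mnt/disk2/
data d3 /mnt/disk3/
```

Protection then happens as a scheduled `snapraid sync` rather than in real time, which is exactly why it gets paired with something else for performance.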
I don't want to shut down creativity or exploration, but you should be aware that what you're planning is most likely going to be more complex, and a worse end result, than just learning a bit more about ZFS. The thing is, "overhead" isn't much of an issue when the alternative is "all the disks I use as SnapRAID parity contribute nothing to disk throughput or capacity". As you said, you want to "design and build systems which aren't painted into a corner again" - but creating a system with this many separate layers, instead of using a single vertical solution that does the job better and more reliably, is exactly how you work yourself into a corner. Storage stacks accumulating more and more layers of cruft over the years is exactly why ZFS exists in the first place.
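For comparison, the single-layer version of a 16-drive pool might look like this in ZFS; the device names are hypothetical and the vdev layout is just one reasonable choice:

```
# Hypothetical pool: two 8-disk RAIDZ2 vdevs in one pool.
# Parity is striped across all disks, so every spindle contributes
# to throughput and (minus the parity overhead) to capacity.
zpool create tank \
  raidz2 sda sdb sdc sdd sde sdf sdg sdh \
  raidz2 sdi sdj sdk sdl sdm sdn sdo sdp
```

One command gets you pooling, striped parity, checksumming, and self-healing scrubs, with no separate RAID or parity layer to babysit.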