Optimal VSAN Configuration Suggestions


masterchief07

New Member
Feb 18, 2020
29
0
1
Very shortly and VERY luckily I'll be walking into 4 custom built AMD EPYC 9334 SP5 systems each with 256GB DDR5 4800, Tyan S8050GM4NE-2T, 10 x Samsung 990 Pro 2TB (2 on board, 8 on ASUS PCIE 5 Card w/ mobo bifurcation). That being said, I'd like to set up a VSAN (or similar) environment to maximize the performance of these 4 servers for a medium-size business environment. There is an Exchange VM, a DC, a file server, several proprietary application servers (database heavy), and an RDS Collection.

As of this moment, my bottleneck will be the interconnecting switch and the NICs, both 10Gbit, though I could easily aggregate links and add more NICs via the PCIe slots.
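
To put rough numbers on that bottleneck, here is a minimal sketch comparing one node's aggregate local NVMe bandwidth with its network links. The 7 GB/s per-drive figure is an assumed placeholder rather than a measurement, the function names are made up for illustration, and protocol overhead and vSAN replication traffic are ignored.

```python
# Rough comparison of local NVMe bandwidth vs. network bandwidth per node.
# ASSUMPTION: the per-drive throughput used below is a placeholder, not a measurement.


def gbit_to_gbyte_per_s(gbit_per_s: float) -> float:
    """Convert a link speed from Gbit/s to GB/s, ignoring protocol overhead."""
    return gbit_per_s / 8.0


def local_to_network_ratio(drives: int, drive_gbyte_per_s: float,
                           link_gbit_per_s: float, links: int = 1) -> float:
    """Return how many times the node's aggregate drive bandwidth exceeds its network bandwidth."""
    if links < 1 or link_gbit_per_s <= 0:
        raise ValueError("need at least one link with a positive speed")
    local_bw = drives * drive_gbyte_per_s
    network_bw = links * gbit_to_gbyte_per_s(link_gbit_per_s)
    return local_bw / network_bw


if __name__ == "__main__":
    ASSUMED_DRIVE_GBYTE_PER_S = 7.0  # hypothetical sequential figure per drive
    for links in (1, 2, 4):
        ratio = local_to_network_ratio(10, ASSUMED_DRIVE_GBYTE_PER_S, 10, links)
        print(f"{links} x 10Gbit: local drives outrun the network by about {ratio:.0f}x")
```

Even with several aggregated 10Gbit links, the drives in one node would outrun the network by a wide margin under these assumptions.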

My primary concern though is performance as I would really like to take advantage of the hardware and from what I have read, not doing this correctly can crush and pretty much render useless an all flash solution like this. Thing is, I've seen a lot of do this, do that, and most of the threads are dealing with enterprise grade storage. While I'm certain the 990s will be more than adequate, I'd like to make sure that whichever direction I choose is optimal.

So, I am open to any and all recommendations/guidance/etc/etc.

Thanks in advance!
 

ano

Well-Known Member
Nov 7, 2022
654
272
63
You don't want 990 Pros for vSAN; you want enterprise drives.

New vSAN (ESA) = all NVMe
Older vSAN (OSA) = can mix

Apart from that, have fun. You will probably get some HCL messages, but it should work fine.

vSAN is surprisingly fast to set up.
 

masterchief07

So that's the thing. Across the 4 servers, there will be 40 x 990 Pros. I'm only running in this direction because I'm lucky enough to be in the right place/right time. But I don't have the capital to dish out for enterprise drives so I'd like to maximize what I'll have. Are you suggesting that the 990's will be an issue performance wise?
 

ano

You do not want them.

I'm guessing the board supports bifurcation. Enterprise NVMe is dirt cheap on eBay these days.
 

BoredSysadmin

Not affiliated with Maxell
Mar 2, 2019
1,053
437
83
My suggestion is to use it as a playground or lab environment. Maybe, at most, run a Dev environment.
This hardware isn't suited to run production in a medium-sized business environment.
Not unless you don't value your job stability.
Also, the amount of memory is meager compared to flash capacity.
Plus, your primary concern shouldn't be performance but stability.
 

masterchief07

Well, I guess stability would be a no-brainer. I say performance only because the hardware is so excessive, performance-wise.

So if I were able to increase memory and swap the 990s for enterprise NVMe, would you agree that would be enough to make this a production-ready config?
 

ano

Hard to say.

Remember that cooling NVMe is important. Then yes-ish.

We usually run at least 0.5TB of RAM per vSAN node. Your layout is a bit weird, considering how expensive vSAN/VMware is.
We usually go with 1 CPU with the fastest cores (if used with SPLA and other products licensed like that), 0.5-1.5TB of RAM, and 25 or 100Gbps networking.

I'm guessing your DB apps are the heavy load? The other stuff mentioned will not make those servers break a sweat.
 

masterchief07

ano said: "guessing your db apps is the heavy load? because other stuff mentioned will not make that server break a sweat"
Yeah, exactly.

"Why 990 pro?"
Just what's part of the bundle, so figured I'd try to take advantage if reasonable.

I'm exploring my options right now, though. Clearly this would not be suitable in its current form, so I'm going to scour eBay for some hardware. Appreciate the replies and insight thus far!
 

masterchief07

By the way, aside from the memory allocation for vSAN (I understand the limitation there), why would the 990s be less than ideal? I would assume endurance and power loss protection, or the lack thereof, correct?
 

BoredSysadmin

masterchief07 said: "...stability would be a no brainer..."
I like your optimism. A few years ago, I built a 3-node vSAN at home on 100% unsupported, free hardware with 100% SATA SSDs. This was before ESA. After I tweaked it a bit (my hardware was below minimum spec), it ran fine. Then, in a moment of pure "genius," I moved my slow Mac Mini-based NVR software onto this vSAN as a VM. Its performance got much better, but all those video stream writes helped kill one of the SSDs. It happened without any warning, and one day my cluster was simply dead. I couldn't even connect to vCenter, since it was stored on the vSAN volume and was dead too.
Again, my two humble cents: don't mix custom-built/consumer hardware with enterprise production. But hey, you do you.
 

ano

990 Pros are not enterprise drives. The low DWPD is probably the smallest problem. Read up on consumer vs. enterprise drives: SLC caching, PLP, etc.
 

zachj

Active Member
Apr 17, 2019
159
104
43
Problems with consumer drives:
  1. Power loss protection
  2. Limited endurance
  3. Unpredictable steady-state performance (due to slc caching)
  4. Lack of support for enterprise self encrypting drive standards
  5. Not hot swappable
  6. Unless you take pains to label every one of them you’re going to have a hell of a time figuring out which one needs to be removed when it is broken
Bluntly nobody who works for a genuine medium- or large-enterprise IT shop would ever use consumer gear for production. Not because it won’t work. Not because it’ll fail early. Not because it’s hard. It’s because it’s indefensible if it ever broke that you knowingly used hardware for a purpose that was outside of its design parameters—it’s essentially a fireable offense.

Enterprises also intentionally nurse hardware beyond the oem warranty by purchasing 3rd party warranties (ex: park place). They rely on 3rd parties to maintain spare parts inventories and to provide on-demand replacements (so they don’t have cash tied up in hundreds of thousands of dollars of stuff sitting on shelves). And more. None of which is really compatible with consumer gear, which runs the risk of having to source spare parts on eBay if you can even find them.

Either Facebook or Google famously did this and you can see pictures of their stuff from 20+ years ago. But they started doing it before they were a legitimate medium-sized enterprise, they had four orders of magnitude more hardware than you’re going to have (meaning they could afford to have a 10% failure rate with no impact to production) and it’s worth pointing out that neither of them still do it today—if it was such a brilliant idea then why’d they stop?

Long story short: 990s will probably work but what happens to you when you find out the hard way that they don’t?
 

zachj

Also worth pointing out: software license costs are going to absolutely kill you. With VMware’s announcement that they’re no longer selling perpetual licenses, you’re going to be paying monthly for 128 cores of vSphere and vSAN. Make damn sure you need it!

A four-node cluster with 128 cores and 1TB RAM is a very lopsided config. It’s highly unlikely, in my opinion, that the workloads you describe would in aggregate be able to occupy all 128 cores without running into crazy memory starvation; you’re either 4x too small on memory or 4x too big on cores. And the problem with that Tyan motherboard is that getting more RAM is going to be majorly expensive because it’s only got 8 slots; 128GB/256GB DIMMs are majorly expensive.
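
The lopsidedness is easy to check with quick arithmetic. A minimal sketch, assuming 32 cores per node (which is what 128 cores across 4 nodes implies) and 256GB of RAM per node as described in the first post; the `Cluster` class is made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical helper describing a homogeneous cluster."""
    nodes: int
    cores_per_node: int
    ram_gb_per_node: int

    @property
    def total_cores(self) -> int:
        return self.nodes * self.cores_per_node

    @property
    def total_ram_gb(self) -> int:
        return self.nodes * self.ram_gb_per_node

    @property
    def ram_gb_per_core(self) -> float:
        return self.total_ram_gb / self.total_cores


if __name__ == "__main__":
    as_planned = Cluster(nodes=4, cores_per_node=32, ram_gb_per_node=256)
    four_times_ram = Cluster(nodes=4, cores_per_node=32, ram_gb_per_node=1024)
    for label, c in (("as planned", as_planned), ("4x the RAM", four_times_ram)):
        print(f"{label}: {c.total_cores} cores, {c.total_ram_gb} GB RAM, "
              f"{c.ram_gb_per_core:.0f} GB per core")
```

Whether a given GB-per-core ratio is enough depends on the workloads, so treat the output as a starting point for sizing rather than a verdict.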

If you’re getting all this for free then maybe it’s still a good deal. But if you’re paying for it I’d be seriously rethinking the plan…
 

Rahvin9999

Active Member
Jan 14, 2016
135
86
28
Rotterdam, The Netherlands
zachj said it all.

If you still want to go down this path, go with OSA and put some Optane drives (P4800X/P5900X) in front of the Samsung 990 Pro drives.

I have something similar running in a setup:
3 nodes, each with dual Xeon Gold 6148, 768GB RAM and 2 OSA disk groups, each made up of an Optane P4800X and 4x Samsung PM983.
The Samsung PM983 is basically a Samsung 970 EVO Plus with some enterprise features bolted on.
This works and performs quite well even though it is 10Gbit.

If I try to run ESA on this using the PM983s, it half works, but performance is terrible and vSAN throws random errors every now and then.
I have tried ESA with more modern consumer-grade SSDs on newer server hardware, but the results are the same.

Networking-wise, if you can, go for fibre; there is a noticeable difference in latency for certain workloads. And go 10Gb+.
 

hmw

Active Member
Apr 29, 2019
581
231
43
masterchief07 said: "So, I am open to any and all recommendations/guidance/etc/etc."
tl;dr - enterprise SSDs are tested and validated on enterprise workloads like cluster file systems and databases, in the absence of TRIM availability and on RAID. They are validated on enterprise platforms with an eye for steady state performance. It MAY be possible to use consumer SSDs but only if you know and understand the workload and platform

_________________________________________________________________________________

Enterprise SSDs are not rocket-science level wonders of tech, they use the same NAND as consumer SSDs, just that they have a few hardware features and their firmware has subtle differences. However, those differences have outsized effects on your enterprise workloads. If you choose wisely you can get away with using consumer SSDs with a few tweaks, but I don't think Samsung's 990 Pro counts as one of those SSDs

The whole point of enterprise SSDs is to give you anywhere from 1 DWPD (drive writes per day) of endurance to something like 10 DWPD. They achieve that via two things: firmware and overprovisioning (sadly, you cannot put enterprise firmware onto a consumer SSD).
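
For reference, DWPD can be derived from a drive's rated TBW (terabytes written) over its warranty period. A small sketch of the usual conversion; the 1200 TBW and 5-year warranty values in the example are assumed inputs for illustration, so check the actual datasheet before relying on them.

```python
DAYS_PER_YEAR = 365


def dwpd_from_tbw(tbw_tb: float, capacity_tb: float, warranty_years: float) -> float:
    """Drive writes per day implied by a TBW rating over the warranty period."""
    if capacity_tb <= 0 or warranty_years <= 0:
        raise ValueError("capacity and warranty period must be positive")
    return tbw_tb / (capacity_tb * warranty_years * DAYS_PER_YEAR)


def tbw_for_dwpd(dwpd: float, capacity_tb: float, warranty_years: float) -> float:
    """TBW a drive would need in order to sustain the given DWPD for the whole warranty."""
    return dwpd * capacity_tb * warranty_years * DAYS_PER_YEAR


if __name__ == "__main__":
    # ASSUMED example inputs: a 2TB drive rated for 1200 TBW with a 5-year warranty.
    print(f"implied DWPD: {dwpd_from_tbw(1200, 2, 5):.2f}")
    print(f"TBW needed for 1 DWPD: {tbw_for_dwpd(1, 2, 5):.0f} TB")
```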

Enterprise SSD firmware works to do steady state garbage collection and is tested on workloads where TRIM is not used as often (or even never used at all). It's tested and validated exactly on your use case - vSAN, Ceph, HDFS, PostgreSQL, Oracle etc. The firmware is tweaked so that there's minimal write amplification even in the absence of TRIM and with pathological workloads like syslog collectors. The firmware also takes into account the drive may never be power cycled more than a dozen times

Consumer SSD firmware is tested on boot traces of Windows. It's tested with application load times for common Windows apps. And it is tweaked to provide fast sequential bursts of speed, win CrystalDiskMark (no joke), utilize the SLC cache to the fullest and rely heavily on TRIM to 'reset' the drive. And the consumer firmware knows that the drive can be power cycled often. The garbage collector isn't steady state but works less often so as not to slow the drive down. And it then kicks in with 100ms ~ 500ms pauses, unacceptable in the enterprise world but perfectly okay if you're on your consumer desktop

The impact of all this is that consistent steady-state writes of small blocks will cause a lot more write amplification in a consumer SSD, and wear leveling will become a concern much sooner than the anticipated drive lifetime. Since you cannot load enterprise firmware, you can try to get consumer SSDs that are known to have firmware similar to their enterprise brethren (e.g. Micron's MX500). While this may have been commonplace 5 years ago, it is now an exceptional occurrence.

The other is overprovisioning (OP). An enterprise SSD usually has 25% to 33% of the capacity for wear leveling. E.g. a 2TB enterprise SSD is going to be either sold as 1.2TB or then it will be a 2TB SSD with 600GB dedicated to OP. This is something you can change for very specific consumer SSDs. Micron's Storage Executive software allows you to overprovision MX500 SSDs and you can set a 4TB SSD to be 3.2TB - or even 2TB.
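
The OP arithmetic for the examples above can be sketched as follows. Two conventions are shown, since OP is sometimes quoted against raw capacity and sometimes against usable capacity; the function names are made up for illustration, and capacities are in GB to keep the numbers round.

```python
def reserved_fraction(raw_gb: float, usable_gb: float) -> float:
    """Share of the raw capacity held back as OP."""
    if raw_gb <= 0 or not 0 < usable_gb <= raw_gb:
        raise ValueError("need 0 < usable_gb <= raw_gb")
    return (raw_gb - usable_gb) / raw_gb


def op_ratio(raw_gb: float, usable_gb: float) -> float:
    """OP expressed relative to usable capacity, another common convention."""
    if raw_gb <= 0 or not 0 < usable_gb <= raw_gb:
        raise ValueError("need 0 < usable_gb <= raw_gb")
    return (raw_gb - usable_gb) / usable_gb


if __name__ == "__main__":
    # The two resize examples mentioned in the post: 4TB -> 3.2TB and 2TB -> 1.2TB.
    for raw, usable in ((4000, 3200), (2000, 1200)):
        print(f"{raw} GB -> {usable} GB: "
              f"{reserved_fraction(raw, usable):.0%} of raw reserved, "
              f"{op_ratio(raw, usable):.0%} relative to usable")
```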

I have a 2TB MX500 SSD in my NVR (which doesn't do TRIM and does an endless amount of small writes) OP'ed to 1.2TB, the Write Amplification is now 12.2 and the firmware claims remaining lifetime is 85%. If I had to use new SSDs in my NVR I would just grab Micron 5300 MAX and use those instead of trying to OP some consumer level MX500s
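
For anyone unfamiliar with the write amplification figure: it is the ratio of data physically written to NAND to data written by the host. A minimal sketch follows; the counter values are hypothetical, and real drives expose these counters through vendor-specific SMART attributes that are not shown here.

```python
def write_amplification(nand_tb_written: float, host_tb_written: float) -> float:
    """Write amplification factor: NAND writes divided by host writes."""
    if host_tb_written <= 0:
        raise ValueError("host writes must be positive")
    return nand_tb_written / host_tb_written


def host_writes_until_worn(nand_endurance_tb: float, waf: float) -> float:
    """Host-side TB that can be written before a given NAND write budget is used up."""
    if waf <= 0:
        raise ValueError("write amplification must be positive")
    return nand_endurance_tb / waf


if __name__ == "__main__":
    # HYPOTHETICAL counters: 10 TB written by the host, 122 TB written to NAND.
    waf = write_amplification(122.0, 10.0)
    print(f"write amplification: {waf:.1f}")
    # With that WAF, a hypothetical 1220 TB NAND budget covers this many host TB:
    print(f"host TB before wear-out: {host_writes_until_worn(1220.0, waf):.0f}")
```

The higher the write amplification, the less host data a drive can absorb before its NAND wears out, which is why small steady writes without TRIM are so hard on consumer firmware.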
 

mackintire

New Member
Apr 15, 2024
1
0
1
The last Samsung SSDs that would work were the 830 Pro SSDs. The 820 and 830 Pro SSDs in the larger sizes far exceeded their TBW per day before blowing up, getting into the PBW range. The 970 Pro, 980 Pro and 990 Pro NVMe M.2 drives are faster but nowhere near as resilient. One last note: the 990 Pro has half the TBW per day of the 980 Pro NVMe drives.