VM storage setup


Deci

Active Member
Feb 15, 2015
I am currently looking around and trying to spec up a new Solaris/ZFS machine to act as a fibre channel storage server providing the backend storage for ~16 virtual hosts. As this covers multiple aspects (the main one being disk related) I felt this was the most appropriate place to put it; if that is not the case, feel free to move it.

The current setup, which will be moved to a lower-load application, consists of the following (and has been running 8 hosts):

Dual Xeon E5530 CPUs
40GB RAM
Dual LSI 9200-8e HBA cards in IT mode
Intel (LSI) 8-port RAID card running JBOD for the SSD/RAM disks
Dual Dell MD1000 15-drive enclosures/expanders
2-port ACard SATA RAM disk for ZIL, populated with 8x 2GB sticks
4x Intel 530 series 240GB SSDs as L2ARC
30x WD Red 2TB drives for bulk storage (configured as 15x mirrors)
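
For reference, that layout corresponds roughly to the following pool structure (the device names here are placeholders rather than the real ones in the box):

    # 15 two-way mirror vdevs for bulk storage, plus dedicated log and cache devices
    zpool create tank mirror disk01 disk02 mirror disk03 disk04 mirror disk05 disk06
    # ...and so on for the remaining mirror pairs (zpool add tank mirror dXX dYY), then:
    zpool add tank log ramdisk0                  # ACard RAM disk as the ZIL/SLOG
    zpool add tank cache ssd0 ssd1 ssd2 ssd3     # the 4x Intel 530s as L2ARC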


This system has been fine for a while, however it is starting to show its limits when the virtual machines get IO heavy.

My current thinking is along the following lines, and I would appreciate any input on ways to better optimise the hardware side of things when I am looking to replace it in a few months' time.

Supermicro X10SRL-F motherboard
Intel Xeon E5-1620 v3 CPU
128GB RAM (8x 16GB ECC)
8x Samsung 850 EVO 1TB SSDs for L2ARC
42x (at a minimum) 2TB drives - I would like to go with more, but ultimately it comes down to the case/cases that end up being used
1x ZIL device - unsure what to go with here
LSI HBA cards as required for internal/external connectivity, as determined by the case choices

For the ZIL, I was thinking of the Samsung SV843, as it appears to have reasonably low latency for a standard 2.5" SSD and is high endurance. Depending on case choice, though, a second-hand Dell (Marvell) 8GB WAM or a 365GB ioDrive2 are at a similar price point and offer lower latency - are there any similarly priced options that would give better performance?
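
Whichever device wins out, my understanding is that swapping the SLOG over later is non-destructive on a reasonably recent pool version - roughly this, with placeholder device names:

    # add the new log device, check it shows healthy, then retire the old one
    zpool add tank log new_slog_dev
    zpool status tank
    zpool remove tank old_slog_dev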

For the case, I have not settled on which cases/JBOD enclosures to use as yet, but I am liking the Supermicro 45-drive one ( http://www.supermicro.com.tw/products/chassis/4U/847/SC847E1C-R1K28JBOD.cfm ) in combination with a 2U 24x 2.5" case to hold the rest of the server components and the ZIL/cache disks (this has the added benefit of allowing additional cache disks to be installed later on). However, someone has put me onto the Chenbro 48/60-drive and modular 36-drive 4U chassis, and I am enquiring about pricing for those to see how they stack up compared to the Supermicro offerings ( Chenbro - Products ) - if you know of any other high-density options I would be interested in checking them out.

Disk-wise, for high IO I know pure SSD would be best, however with the data storage capacity I require that is impractically expensive - at this stage I can't justify $20,000+ (AU) just on disks, and it would leave little room for expansion. That leaves me having to rely on throwing a large number of spinning disks at it, along with large amounts of cache to free them up for writes.

Should I continue with the WD Red drives? They have given me zero issues in the current system and I would happily use them again, unless there is something better to use.
Has anyone else on here done a similarly sized setup before?
How would you go about setting up the vdevs? Mirrors? 6-8 disk Z2s?
Would you change any part of the rest of the setup or would you leave it as is?
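
To make the vdev question concrete, the two layouts I'm weighing across 42 drives would look something like this (again, placeholder device names):

    # Option A: 21x 2-way mirrors - highest random IOPS, 50% space efficiency
    zpool create tank mirror d01 d02 mirror d03 d04 mirror d05 d06
    # ...and so on for the remaining 18 pairs (zpool add tank mirror dXX dYY)

    # Option B: 7x 6-disk raidz2 - roughly two-thirds space efficiency, far fewer vdevs
    zpool create tank raidz2 d01 d02 d03 d04 d05 d06
    zpool add tank raidz2 d07 d08 d09 d10 d11 d12
    # ...and so on for the remaining 5 groups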
 

MiniKnight

Well-Known Member
Mar 30, 2012
NYC
So you have 30TB of usable storage now and 16 VMs. How active is all of that data? What percentage of it is writes? What is slowing you down now?
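
(If you haven't already, the read/write split and per-vdev load are easy to sample on the storage head with something like:

    # per-vdev bandwidth and IOPS, sampled every 5 seconds
    zpool iostat -v tank 5
    # ARC hit/miss counters - arcstat.pl where available, or the raw kstats
    kstat -n arcstats | egrep 'hits|misses'

substituting whatever the pool is actually called for "tank".)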

The Reds are slow, but you're right, you probably aren't going to solve this just by adding spindles.

More RAM is good. I'd be tempted to get a dual socket board given what you're doing. More PCIe lanes and more RAM slots.

The 8TB of L2ARC SSD space is quite a lot for that number of hard drives.
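
One thing worth checking is the RAM cost of indexing that much L2ARC: every L2ARC record needs a header held in ARC, and while the exact per-record size varies between ZFS versions, figures on the order of 100-200 bytes are commonly quoted. As rough arithmetic, 8TB of L2ARC filled with 8K records is around a billion records, i.e. somewhere around 100-200GB of RAM just for headers, which won't fit alongside everything else in 128GB; filled with 128K records it's closer to 6-12GB, which is fine. So how much of that 8TB is actually usable depends heavily on the recordsize/volblocksize of the VM volumes.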

I have a play array with two 16+2 RAID-Z2 vdevs using 4TB drives. I only have 4TB of SSD (thanks to the STH specials) and 256GB RAM. I bought two of those Supermicro SC847s and have a 2U controller for a total of 6U of space in the datacenter. I also added a few RAID 10 SSD arrays for higher-I/O workloads, but with 4TB of L2ARC I don't hit the spindles hard for what we're doing.

Using 10/40GbE not FC though. Maybe that's different.
 

Deci

Active Member
Feb 15, 2015
Possibly a difference in terminology, but it's currently 8 host machines in an HP c7000 blade centre running 50+ virtuals, talking back to the box over fibre channel. The idea is to have capacity in the new box to handle a doubling of the host machines (this box does nothing but storage; the blades handle the rest). Would you still suggest increasing the RAM?

Those virtual machines consist of everything from database machines and email servers to terminal servers, document/file storage and web servers - basically a primary copy of everything runs from it. At the moment, read and write performance drops off during larger sustained file reads/writes (such as backups and restores that go to an entirely separate storage machine) that aren't coming from cache; those then affect the other machines, and they occur from each machine every couple of hours.

I could possibly make a separate 6-8 disk pool that's entirely SSD to move the email/databases onto; that would free up a chunk of spindle resources for the rest of the pool, but it limits their growth a bit.
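
If the split-pool route looks worthwhile, it's a fairly small job - something along these lines, with placeholder device names:

    # dedicated all-SSD pool for the database/email volumes, 3x mirrors from 6 SSDs
    zpool create fastpool mirror s1 s2 mirror s3 s4 mirror s5 s6
    # then migrate the relevant zvols/LUNs across (zfs send/receive, or a storage migration on the VM side)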

I would have figured having only two Z2 vdevs would hurt small writes and any reads that bypass cache.
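
My understanding is that random IO which misses cache scales with the number of vdevs rather than the number of disks: two wide Z2 vdevs behave more like two spindles for small random reads (a few hundred IOPS in total), whereas 15+ mirror pairs give roughly an order of magnitude more - hence the current pool being laid out as mirrors.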
 

Chuckleb

Moderator
Mar 5, 2013
Minnesota
Are you saturating the network at all? You have a lot of machines going to a single host. Can you run multiple filers to increase network and compute capacity? This is what we use Lustre for, even with 5400RPM disks.
 

Chuckleb

Moderator
Mar 5, 2013
Minnesota
Let me revise my thoughts a bit: I'm thinking of suggesting multiple 24-bay shells spread over a storage area network, versus one monolithic large server that is then constrained by the pipes out of it.
 

Jeggs101

Well-Known Member
Dec 29, 2010
It begs the question: what speed fiber? @Chuckleb has a great point. You easily have enough to saturate 4Gb or 8Gb, and possibly more. Maybe that's part of what @MiniKnight was getting at in suggesting you look at where the bottleneck is.

8x cache drives putting out 500+ MB/s each will require a fast pipe back.
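
As a quick sanity check on the numbers: 8 drives at 500MB/s is roughly 4GB/s of potential cache read bandwidth, while dual 4Gb FC tops out around 800MB/s combined, dual 8Gb around 1.6GB/s, and a single 10GbE link around 1.25GB/s - so the fabric becomes the ceiling long before the cache tier does.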
 

mrkrad

Well-Known Member
Oct 13, 2012
ESXi is going to have reliability issues and latency issues with any home-built solution based on SATA drives! For production I strongly suggest a serious SAN solution that is ESXi certified.
 

Deci

Active Member
Feb 15, 2015
Let me revise my thoughts a bit: I'm thinking of suggesting multiple 24-bay shells spread over a storage area network, versus one monolithic large server that is then constrained by the pipes out of it.
So 2-3 clustered storage nodes behind an MDT (if my reading on how Lustre works is right)? However, it doesn't support FC from what I have seen, meaning it would rely on Ethernet-based protocols/switches? And 3-4 physical machines are required to implement it? But it's still limited by the bandwidth to/from the MDT, as all traffic passes through that point (or am I mistaken)?
 

Deci

Active Member
Feb 15, 2015
It begs the question: what speed fiber? @Chuckleb has a great point. You easily have enough to saturate 4Gb or 8Gb, and possibly more. Maybe that's part of what @MiniKnight was getting at in suggesting you look at where the bottleneck is.

8x cache drives putting out 500+ MB/s each will require a fast pipe back.
Dual-path 4Gb currently; the intention was to move the new storage node to dual 8Gb into the blade chassis, or to move to iSCSI over 10Gbit. The intention of the large cache isn't so much about getting 500MB/s out of all the drives as having a large buffer of quick read access to commonly read data, instead of waiting for it to come from spindles - looking at ARC stats on the current storage, 1 in 4 requests are missing the ARC.
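
For anyone wanting to check the same thing on their own box, the ARC hit/miss ratio can be pulled with something like (arcstat.pl where it's available, otherwise the raw kstat counters):

    # sampled ARC statistics every 10 seconds
    arcstat.pl 10
    # or pull the cumulative hit/miss counters straight from kstat
    kstat -n arcstats | egrep 'hits|misses'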
 

Deci

Active Member
Feb 15, 2015
ESXi is going to have reliability issues and latency issues with any home-built solution based on SATA drives! For production I strongly suggest a serious SAN solution that is ESXi certified.
This setup has been running stable for quite a while, with most drive requests being reported at 4-5ms within VMware's IO stats; apart from raw IO speed/capacity it has given very little trouble.
 

Chuckleb

Moderator
Mar 5, 2013
Minnesota
Sorry for the red herring on the Lustre - I don't think there are ESX drivers, and I didn't know about the FC part. Yes, multiple storage nodes directly accessed from the compute/servers, with the MDT being separate. Ideally your data would be striped across multiple nodes so you do not have the bottleneck at a single node. It looks like you should definitely look at the stats to try and isolate the bottleneck, though.