Shared storage options for VMware ESXi

mephisto

Member
Nov 6, 2013
36
3
8
London
Hey guys,

Currently I have a couple of Dell MD3200 (SAS) arrays plugged into a few ESXi hosts, which did the job back in the day but are now a performance bottleneck. Being SAS-attached also limits how many hosts we can connect, and they take proprietary rebranded disks, so I can't use whatever drives I like. My infrastructure is all VMware ESXi with Windows guests, ranging from MS IIS and MS Exchange servers to file servers, RDS servers, Citrix servers and some SQL servers. I've got about 10 hosts with a mix of dual-socket 4- and 6-core CPUs, so it's time to refresh the hosts soon with newer dual-socket 20-core boxes and more RAM.

The aim now is to go for software-defined storage, where we can use a regular 12-bay server like a Dell R515 or R510 to start testing. There are 3 paths from here:
  1. Convert existing regular servers to SDS, use the internal storage in the box (2U 12-bay R515/R510 servers) and share it over NFS or iSCSI on 10GbE. I would like to start with 2 boxes so data is fully synced between them; if one goes down, the other takes over with no downtime. I can do this with pfSense, for example, using virtual IPs: one device holds the virtual IP in standby, and once the main pfSense fails, the virtual IP moves over and everything keeps working from the secondary box with no downtime.
  2. Use 1U servers as head nodes, connected via SAS cables to a JBOD. I don't have any 1U servers currently, but would most likely get Dell R630s or a Supermicro equivalent with plenty of memory slots, use an HBA and connect to a JBOD (most likely Supermicro). Some say this approach is good because I can hook up 2x 1U servers to the same JBOD and run in high availability: if one head fails, the other picks up the work. The downside is that you need SAS drives only, as SATA won't work in a dual-path setup. My concern is: what happens if the backplane of the JBOD fails? That is a single point of failure, and I had this problem in the past with one Dell MD3200: dual controllers but one SAS backplane, the backplane failed and the whole box was down.
  3. Hyperconverged. Not entirely sure about this option yet, as you may have too many eggs in one basket. There are a few different approaches, like VMware vSAN, which seems to write parity across different nodes, pooling every node into one single pool of storage. So if you lose one host you can rebuild the data from parity on another host, as the servers work in a RAID-5-like way? I may be wrong here, but that is what I understood. Nutanix and StarWind work differently, doing mirroring, so one host has exactly the same data as another and VMs work at "local" storage speed. What about when I want to expand beyond 3 nodes? So far as I can see it replicates the data across 2 nodes, then mirrors up to 3 hosts. So with every host I add, do I just keep replicating, adding more compute power but not actually adding storage?
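To put rough numbers on the replication vs. parity question in option 3, here's a quick back-of-the-envelope sketch (the node count and disk sizes are made-up figures, and real vSAN/Nutanix placement policies are more nuanced than this):

```shell
#!/bin/sh
# Hypothetical cluster: 4 nodes, 12 TB of raw disk in each (made-up figures).
NODES=4
RAW_PER_NODE_TB=12
RAW_TB=$((NODES * RAW_PER_NODE_TB))

# Mirror / 2-copy replication (the Nutanix/StarWind style): every block is
# stored twice, so usable space is raw/2. Adding a node DOES add usable
# capacity, just always at 50% efficiency.
MIRROR_TB=$((RAW_TB / 2))

# RAID-5-like parity striped across nodes (roughly the vSAN erasure-coding
# idea): about one node's worth of capacity goes to parity, so efficiency
# improves as nodes are added.
PARITY_TB=$(( (NODES - 1) * RAW_PER_NODE_TB ))

echo "raw=${RAW_TB}TB mirror=${MIRROR_TB}TB parity=${PARITY_TB}TB"
```

So mirroring doesn't stop adding capacity as you add hosts; it just caps efficiency at 50%, while parity schemes get more space-efficient with more nodes.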

I've been thinking for a long time about using ZFS. I tried Nexenta in the past but their HCL is rubbish; hardware support was not good at all. Recently ZFS on Linux has matured a lot, and several companies are offering their own version of Ubuntu/Debian with ZFS and management on top. Nothing new here, as lots of players seem to be doing the same thing: taking open-source software and selling it as their own. Lots of people talk highly of FreeBSD + HAST + ZFS as way more mature than on Linux, but I'm not sure that's still the case now in late 2016. Lots of posts are from 2013/2014, and a lot has changed in 2-3 years. What commercial options are out there for FreeBSD? I know TrueNAS/FreeNAS and others do it, but considering my past experience with FreeNAS, I would not go there. I don't believe it is a good option, but you guys may think it is, and I'm happy to change my judgement.

Some solutions I've found/tested/enquired about:
  • I've been testing Zetavault (they used to be called DNUK), which is just Ubuntu + ZFS with an average-looking interface, but it works okay so far as I can see on my test server.
  • Open-E JovianDSS is Debian + ZFS; the interface looks sleek but has fewer features. I've been battling with them to get an LSI 2008 HBA (Dell PERC H200 flashed with IT firmware) to work properly, as their interface doesn't display my disks, so I can't use it. Their support has been trying to get it working for 5 days now.
  • StarWind Virtual SAN - seems more targeted towards Hyper-V than VMware, as it integrates nicely with Windows/Hyper-V for a hyperconverged system, but with VMware it would need to run on bare metal, as I don't feel like running a virtual machine to store my data; that means a standalone box serving NFS/iSCSI. They make bold claims about reorganising writes in memory and then writing them sequentially to spinning disks, similar to what ZFS does every 5 seconds. They call this taming the "IO blender": several VMs' IOPS reach the storage as random IOPS, and they cache them to write as one single sequential stream. I would need a bare-metal server for this, with Windows installed and then the StarWind software on top. That means I need a hardware RAID controller, as I would not trust Windows software RAID; it was always rubbish. I would rather use ZFS. Another claim by StarWind is that you can use RAID-5/6 for storage and, because of the way they cache write data, get better performance than RAID-10. How do you guys see this? I think it's a bold claim, not to say BS.
  • Nutanix/NetApp/Tintri/Nimble Storage - all out of the question: proprietary hardware and expensive. Last time I checked in the UK, a Nutanix entry-level single box would cost £25,000. Not a flipping way! It would need to take my kids to school and wash my dishes to justify that price tag.
  • EMC ScaleIO - seems neat; I like that they sell just the software without hardware. I'm waiting for quotes.
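For what it's worth, the write-coalescing idea StarWind describes is at least plausible in principle: a write-back cache can buffer random writes in RAM and flush them in ascending offset order, which is near-sequential for spinning disks. A trivial illustration with made-up offsets:

```shell
#!/bin/sh
# Byte offsets of writes arriving from several VMs -- effectively random
# by the time they reach shared storage (the values are made up).
OFFSETS="914304 4096 529408 12288 65536"

# A write-back cache can hold them in RAM and flush in ascending order,
# turning a random pattern into a near-sequential one for spinning disks.
SORTED=$(printf '%s\n' $OFFSETS | sort -n | xargs)
echo "$SORTED"
```

Whether that actually beats RAID-10 in practice is a different question, of course.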

There are some other things around, like Btrfs, which seems too experimental at the moment, though for some reason Netgear is selling appliances based on it, and lots of companies are taking Ubuntu + ZFS, rebranding it as their own and claiming it's the holy grail. Also, I have doubts about 10GbE; it has some latency, so how bad would it be in my case with 10 hosts and about 100 VMs? I heard Mellanox and their InfiniBand 10/40Gb cards have much lower latency. As we are moving to network storage instead of DAS, I'm a bit concerned about the latency issues we could have over 10GbE, but it may well be negligible.

I'm also considering going all-flash for the storage box. The Samsung SM863 SSDs that I've been using in RAID-5 have been fairly stable and good performers, so I may use them in a larger array, or perhaps Intel SSDs, as they are only 10-20% more expensive than Samsung but have an amazing track record.

What are your thoughts on this?

Thanks!
 

gea

Well-Known Member
Dec 31, 2010
2,535
856
113
DE
Nexenta is based on Illumos (the free Solaris fork), so you can use their HCL.
You are mainly safe with Intel server-class mainboards, Intel NICs and LSI HBAs.

An option besides Nexenta is OmniOS, another free and stable Solaris fork.
For a cluster you can use RSF-1 from High-Availability.com on both.

For OmniOS I am working on a management option for Netraid.
This is ZFS pool mirroring over the network via iSCSI. I am working on a menu
to make settings easier and allow auto-failover with a virtual IP.
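Roughly, the idea looks like this on illumos/OmniOS with COMSTAR (device names, the LU GUID and the IP are placeholders; this is an outline only, not a tested recipe):

```shell
# --- Node B: export a zvol over iSCSI (COMSTAR) ---
zfs create -V 1T tank/netmirror              # backing zvol; size is an example
stmfadm create-lu /dev/zvol/rdsk/tank/netmirror
stmfadm add-view <LU-GUID>                   # GUID printed by create-lu
itadm create-target

# --- Node A: attach the remote LUN, then mirror a local disk with it ---
iscsiadm add discovery-address <node-B-IP>
iscsiadm modify discovery --sendtargets enable
devfsadm -i iscsi                            # make the new LUN visible
# c1t0d0 = local disk, c2t...d0 = the iSCSI disk (both placeholders)
zpool create netpool mirror c1t0d0 c2t<iscsi-lun>d0
```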

SSD-only pools with Samsung PM/SM or Intel DC drives are perfect.

For 10G Ethernet, use SFP+ (lower latency than Base-T).
A newer option is Intel X710 Ethernet (40G QSFP+ / 4 x 10G).
 

mephisto

I had some problems with LSI 2008 and Nexenta a couple of years ago when I last tested it; it could be better now, as you said. To be honest, I didn't like it much, as the community was pretty much nonexistent for Nexenta; topics were just there gathering dust, and it seemed no one cared.

I had looked into OmniOS and I liked it, but I can't spend the time setting it up from the CLI at the moment. I need a GUI so I can focus on the higher level of what I need to deliver as infrastructure. That is where paid support and a GUI would come in.

What paid options involving Illumos code are out there? I've even considered Solaris, as the licence seems to be about £1,000 per year, but I don't want to set it up from scratch on the CLI. Are there any proper paid options? napp-it, last time I used it about 2 years ago, was quite flaky as well; I was testing on Solaris Express 11.

I may go SFP+ for the 10Gb network; I'm not sure yet about switch pricing for Mellanox, which may break the budget.

Thanks!
 

dswartz

Active Member
Jul 14, 2011
445
37
28
scaleio is interesting, but from what i can see, you need a bunch of nodes before the performance is any good (e.g. not just 3-4)...
 

mephisto

Oh, I didn't know Gea is the napp-it developer. I'm planning to give it another go this time. How many people are behind the development of napp-it?
 

whitey

Moderator
Jun 30, 2014
2,770
866
113
38
Oh, I didn't know Gea is the napp-it developer. I'm planning to give it another go this time. How many people are behind the development of napp-it?
I think Gea develops napp-it with a slim crew, but I will let him speak to that or confirm/deny.

As for the underpinnings/guts of what REALLY drives or is the heart of napp-it... well, that is just good old *nix Illumos code "for the most part". So you get the best-of-breed, brightest innovators of ZFS from Delphix, Joyent, Nexenta, etc. all contributing to its advancement.
 

mephisto

By the way, I noticed you are doing PCIe passthrough to present your disks to your FreeNAS OS, right? I believe you are doing the "hyperconverged" thing, as they would call it these days. How is it working out for you?
 

whitey

By the way, I noticed you are doing PCIe passthrough to present your disks to your FreeNAS OS, right? I believe you are doing the "hyperconverged" thing, as they would call it these days. How is it working out for you?
Correct, workin' like a dream. I also have some vSAN storage available in my 3-node vSphere 6.x cluster.
 

mephisto

Correct, workin' like a dream. I also have some vSAN storage available in my 3-node vSphere 6.x cluster.
Are you running one instance of FreeNAS on each host? Any HA system in place?

Do you mean VMware vSAN? I'm planning to start testing it as well, but the prices so far have put me off a bit.
 

whitey

I have two FreeNAS VT-d AIO instances dedicated in two of my three hosts, with replication set up between them via the FreeNAS GUI to keep data in sync every 12 hours. No HA; RSF-1 is garbage from what I have heard from friends in the industry who have tried it on storage distros such as Nexenta/OmniOS.
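Under the hood that GUI replication is just periodic ZFS snapshot send/receive, roughly like this (pool/dataset names and snapshot labels below are placeholders, not my actual setup):

```shell
# On the primary box: take a snapshot (names/labels are examples).
zfs snapshot tank/vmstore@auto-20161001
# Incremental send of everything since the previously replicated snapshot,
# received (rolled back to the common snapshot if needed) on the standby:
zfs send -i tank/vmstore@auto-20160930 tank/vmstore@auto-20161001 \
    | ssh standby zfs receive -F tank/vmstore
```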

Yes, I am using VMware vSAN; the VMUG Eval is your friend for cost-conscious folks.
 

mephisto

Hmm, I'm not sure it is that bad, as so many distros are using RSF-1; some others use DRBD as well, which works differently on Ubuntu/Debian, so perhaps that's worth considering.

I mean expensive for my clients; I want to start doing some testing to see if I can justify asking my clients to pay for it. I'm now even thinking of doing Hyper-V instead of VMware because of the licensing costs, but it seems Windows Server 2016 with its per-core licensing will get really expensive, so no more Datacenter edition with unlimited VMs on a server with 40 cores for a flat price.
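For reference on the 2016 licensing maths: Windows Server 2016 is sold in two-core packs, with a minimum of 8 cores licensed per processor and 16 per server (worth double-checking against Microsoft's current terms; the host sizes below are just my examples):

```shell
#!/bin/sh
# Cores that must be licensed: at least 8 per processor, at least 16 per
# server, then every physical core; licences come in 2-core packs.
license_packs() {
    sockets=$1
    cores_per_socket=$2
    per_cpu=$cores_per_socket
    [ "$per_cpu" -lt 8 ] && per_cpu=8     # 8-core minimum per processor
    total=$((sockets * per_cpu))
    [ "$total" -lt 16 ] && total=16       # 16-core minimum per server
    echo $((total / 2))                   # number of 2-core packs
}

license_packs 2 6    # old dual 6-core host: hits the 16-core floor
license_packs 2 20   # new dual 20-core host
```

So a refreshed dual 20-core host needs 2.5x the base licences of an old dual 6-core one, before CALs.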

Perhaps the way out will be KVM to keep costs down in the future.
 

whitey

If I could dig up my buddy's RSF-1 horror story from years ago out of Gmail, I might post it here. Let's just say the "HA feature" wasn't so HA-ish and took down EVERYTHING.

KVM is a solid platform as well; just what flavour of KVM do you want? i.e. OpenStack, RHEV, oVirt, libvirt, etc.

All boils down to what features you "require" vs. what's "nice to have".

Still gonna have to choose your storage platform :-D If you stray over there, Ceph or Gluster may be in your future, or maybe even Btrfs. I prefer my storage with a splash of ZFS every time though... just preference; it's a proven filesystem to me, with an insanely rich feature set and rock-solid, super-smart developers behind it.
 

dswartz

It requires 3 nodes to work, I believe because it distributes parity across them.
When I said "a bunch", I meant more like a dozen or so. Also, it's not a parity scheme but a distributed, two-copy scheme. It works fine with as few as 3, but you will find the performance disappointing...
 

dwright1542

Active Member
Dec 26, 2015
364
70
28
47
I've tried the big 3 hyperconverged options with ESXi:

1. StarWind: major Windows/ESXi bottleneck. I have FusionIO cards that will only put out 60k IOPS (300k native) with SW. I've been working with them for nearly 10 months now, with no solution in sight.
2. StorMagic: right now it has a single-CPU iSCSI stack, which is what limits IOPS. I got 40k IOPS per node on the FusionIO cards. Apparently it will be multithreaded by year's end. This seems a more realistic possibility, since they know exactly what the issue is.
3. DataCore: yuk. Expensive, and all the issues of StarWind.

I'm actually doing okay right now with StarWind and PernixData FVP. However, if StorMagic gets their stuff figured out first, I may head that way.
 

darklight

New Member
Oct 20, 2015
16
4
3
37
Greetings, sirs.

I just wanted to clarify some things and give an update. Right now StarWind is working on improving its VMware hyper-converged concept, which means that we will run inside a Linux VM very soon. It is exactly the same thing that HP VSA and Nutanix are doing right now. Windows Server inside a StarWind VM will still remain an option for VMware, since Windows-based technologies like Hyper-V/SOFS/whatever are native to StarWind.
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,009
4,991
113
Greetings, sirs.

I just wanted to clarify some things and give an update. Right now StarWind is working on improving its VMware hyper-converged concept, which means that we will run inside a Linux VM very soon. It is exactly the same thing that HP VSA and Nutanix are doing right now. Windows Server inside a StarWind VM will still remain an option for VMware, since Windows-based technologies like Hyper-V/SOFS/whatever are native to StarWind.
Do you work at StarWind?
 

dwright1542

Greetings, sirs.

I just wanted to clarify some things and give an update. Right now StarWind is working on improving its VMware hyper-converged concept, which means that we will run inside a Linux VM very soon. It is exactly the same thing that HP VSA and Nutanix are doing right now. Windows Server inside a StarWind VM will still remain an option for VMware, since Windows-based technologies like Hyper-V/SOFS/whatever are native to StarWind.
Interesting. Just a few weeks ago on the official SW forums there was no such update. Is there a timeframe?
 

darklight

Do you work at StarWind?
Yes. Sorry for not mentioning this in the previous post.

Interesting. Just a few weeks ago on the official SW forums there was no such update. Is there a timeframe?
We are not disclosing such details on our forums until new features are fully tested. The approximate ETA for the StarWind Linux VSA (beta) is October-December 2016.
 