vSphere 6, EMC ScaleIO, and Rockstor


stupidcomputers

New Member
May 27, 2013
Seattle, WA
Has anyone else tried a vSphere 6 + ScaleIO + NAS setup? I am having huge success with it so far. It has already survived two disk failures: my last two Seagate 3TB drives gave up under load and the system never skipped a beat.
I replaced them with 1TB drives and the system recovered and rebalanced.


Integration with vCenter is nice.


Impressed with the performance on a 1GbE network. It beats my previous NAS4Free all-in-one vSphere 5.5 setup.

My ultimate goal was a clustered setup that can survive a whole-host failure.


Any suggestions on a filer? I need SMB and NFS. Rockstor is currently working for me with a raw-mapped 16TB LUN. It's nice to have snapshot ability on the filer.
 

Dajinn

Active Member
Jun 2, 2015
Oh man, that GUI looks nice...way better than the clusterf*** that StarWind vSAN is.

With ScaleIO you basically lose one node's worth of storage for the fault tolerance, right?
 

Dajinn

Also, how did you get vSphere? I was contemplating signing up for the $200/year package to get access to their stuff like vSAN and so on.
 

stupidcomputers

All data in ScaleIO is mirrored. Based on what I am seeing, 20TB raw gives you 10TB usable, similar to RAID 10. The nice thing is you don't lose any space to local RAID on each node: disks are added to the cluster individually. I was able to use both SATA and SAS disks in my setup.
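To make the mirroring math concrete, here is a back-of-envelope sketch (my own arithmetic, not ScaleIO's exact accounting; the spare parameter just models capacity a pool might reserve for rebuilds):

```python
# Rough capacity math for a two-copy mirrored pool like ScaleIO's.
# spare_fraction is a hypothetical knob for rebuild headroom.

def usable_capacity_tb(raw_tb, spare_fraction=0.0):
    """Every chunk is stored twice, so usable space is half of raw,
    after subtracting any capacity reserved as spare."""
    return raw_tb * (1 - spare_fraction) / 2

print(usable_capacity_tb(20))        # 20 TB raw -> 10.0 TB usable
print(usable_capacity_tb(20, 0.10))  # reserving 10% spare -> 9.0 TB
```

Unlike per-node RAID, that halving is the only capacity cost; there are no extra parity disks burned inside each node.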

Signing up for the package is one way to get vSphere....

I discounted vSAN because it didn't support any kind of data integrity checking or checksumming. Maybe they will have it in version 3? The EMC solution has a background scanner that periodically checks each block for corruption. No need for ZFS anymore.
 

Radjesh

New Member
Sep 10, 2015
stupidcomputers said: "All data in ScaleIO is mirrored. Based on what I am seeing, 20TB raw will give you 10TB usable space, similar to RAID 10. ... No need for ZFS anymore."

And what about performance? I mean, how many IOPS are you getting with such a setup?
 

dswartz

Active Member
Jul 14, 2011
stupidcomputers said: "The EMC solution has a background scanner that periodically checks each block for corruption. No need for ZFS anymore."

I was curious about this. Maybe I missed it, but reading the various ScaleIO docs, I don't see where they explain how the corruption scan works. If it is just reading the data (to make sure nothing has gone bad unnoticed because a block is infrequently referenced), that is not a replacement for ZFS, since disks can and do silently return bad data (and controllers, buses, etc. can corrupt valid data before returning it, or even before writing it).
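The point can be shown in a few lines: a device-level read can "succeed" while returning the wrong bytes, and only a checksum computed above the device (the end-to-end check ZFS does) exposes it. A toy sketch:

```python
import hashlib

def checksum(data: bytes) -> str:
    """Checksum computed above the device, ZFS-style."""
    return hashlib.sha256(data).hexdigest()

# Write path: store the data along with its end-to-end checksum.
block = b"important payload"
stored_sum = checksum(block)

# Silent corruption: the device flips a bit but still reports success.
corrupted = bytearray(block)
corrupted[0] ^= 0x01
read_back = bytes(corrupted)

# The read "succeeded" at the device level, yet the end-to-end
# checksum mismatch is the only thing that reveals the bad data.
print(checksum(read_back) == stored_sum)  # False -> corruption detected
```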
 

stupidcomputers

Once I get my system a bit more settled in, I will try to run some benchmarks.

Here is an excerpt from the user guide on the background scan. Yes, it's not inline like ZFS, but it's better than nothing.

https://www.emc.com/collateral/technical-documentation/scaleio-user-guide.pdf

Background Device Scanner
The Background Device Scanner ("scanner") enhances the resilience of your ScaleIO system by constantly searching for, and fixing, device errors before they can affect your system. This provides greater data reliability than the media's checksum scheme alone provides. The scanner seeks out corrupted sectors of the devices in the pool, provides SNMP reporting about errors found, and keeps statistics about its operation. When a scan is completed, the process starts again, thus adding constant protection to your system. You can set the scan rate (default: 1 MB/second per device), which limits the bandwidth allowed for scanning, and choose from the following scan modes:

• Device only mode — The scanner uses the device's internal checksum mechanism to validate the primary and secondary data. If a read succeeds on both devices, no action is taken. If a faulty area is read, an error is generated. If a read fails on one device, the scanner attempts to correct the faulty device with the data from the good device. If the fix succeeds, the error-fixes counter is increased; if it fails, a device error is issued. Note: A similar algorithm is performed every time an application read fails on the primary device. If the read fails on both devices, the scanner skips to the next storage block.

• Data comparison mode (only available if zero padding is enabled) — The scanner performs the same algorithm as above, with the following additions: after successful reads of primary and secondary, the scanner calculates and compares their checksums. If the comparison fails, the compare-errors counter is increased and the scanner attempts to overwrite the secondary device with the data from the primary device. If this fails, a device error is issued.

The scanning function is enabled and disabled (default) at the Storage Pool level, and this setting affects all devices in the Storage Pool. You can make these changes at any time, and you can add/remove volumes and devices while the scanner is enabled. When adding a device to a Storage Pool in which the scanner is enabled, scanning starts about 30 seconds after the device is added.
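As a toy model of the two scan modes described in the excerpt (my own simplification, not EMC's code), each block is held as a primary and a secondary copy, and a failed device read is modeled as None:

```python
# Toy model of ScaleIO's background scanner decision logic.
# primary/secondary are the two copies of a block; None means the
# device read failed (a medium error the drive itself reported).

def scan_block(primary, secondary, compare_data=False):
    """Return (new_primary, new_secondary, status) after one scan pass."""
    if primary is None and secondary is None:
        return primary, secondary, "device-error"      # both reads failed: skip block
    if primary is None:
        return secondary, secondary, "fixed-primary"   # rebuild from the good copy
    if secondary is None:
        return primary, primary, "fixed-secondary"
    if compare_data and primary != secondary:
        # Data comparison mode: checksums disagree, overwrite
        # the secondary with the primary's data.
        return primary, primary, "compare-error-fixed"
    return primary, secondary, "ok"

print(scan_block(b"A", b"A"))                     # -> (b'A', b'A', 'ok')
print(scan_block(None, b"A"))                     # -> (b'A', b'A', 'fixed-primary')
print(scan_block(b"A", b"B", compare_data=True))  # -> (b'A', b'A', 'compare-error-fixed')
```

Note the asymmetry the guide implies: in a data mismatch the primary always wins, so this protects against a stale or torn secondary, not against a primary that silently returned wrong bytes.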
 

Radjesh

Dajinn said: "Oh man that GUI looks nice...way better than the clusterf*** that StarWind vSan is.. With ScaleIO you basically lose one node's worth of storage right for the fault tolerance?"

And I actually like clusters, and I like f*** LOL. It looks waaay better compared to other web GUIs.

 

Jake Sullivan

New Member
Oct 9, 2015
I'm in the process of obtaining gear to run a similar setup (over InfiniBand). Did you ever get a chance to gather performance data?
 

whitey

Moderator
Jun 30, 2014
Evening gents...well almost morning...I just peeked at Rockstor under the covers and I must say the dashboard is DOPE. Not sure I'd trust my data to 'the new kid on the block' in Btrfs, but it seems to be kickin' ass and taking names so far. The screenshot shows the Rockstor interface with a pool of two S3700s in RAID 1, shared out over 10G NFS to my vSphere 6.0 U1 infra, doing a Storage vMotion between the Rockstor NFS datastore and my ZFS hybrid NFS datastore. Wonder if iSCSI/InfiniBand are on the way?

Anyways...on to testing replication between two Rockstor appliances...wish me luck!
 


whitey

Apparently replication between Rockstor appliances is 'the broke' currently — some sort of regression was introduced. BUMMER/lame.


Feedback from their forum:

@whitey Yes, not so good I'm afraid. There is a critical issue open for Appliance to Appliance replication but it may not be down to anything specifically Rockstor related. I know it's on the cards for the next major milestone though. Anything you can find out / debug would be great though.

There has been some recent possibly related activity on the btrfs mailing list re replication (btrfs send receive):-
[PATCH 2/2] fstests: btrfs, test sending snapshots received from other filesystems

Perhaps this issue could benefit from having a forum thread of its own?
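While the appliance-to-appliance feature is broken, the underlying btrfs send/receive can be driven by hand. A sketch that just assembles the command pipeline (the hostnames, paths, and snapshot layout below are made up for illustration, not Rockstor's actual scheme):

```python
# Hand-rolled stand-in for Rockstor's appliance replication:
# take a read-only snapshot, then stream it with
# btrfs send | ssh <host> btrfs receive.
# All paths and hosts here are hypothetical examples.

def build_replication_cmds(subvol, snap_name, dest_host, dest_dir):
    """Return the three commands to run; the send's stdout is
    piped into the ssh/receive command."""
    snap_path = f"{subvol}/.snapshots/{snap_name}"
    return [
        ["btrfs", "subvolume", "snapshot", "-r", subvol, snap_path],
        ["btrfs", "send", snap_path],                      # stdout piped to...
        ["ssh", dest_host, "btrfs", "receive", dest_dir],  # ...this command
    ]

for cmd in build_replication_cmds("/mnt/pool1/share", "daily-1",
                                  "backup-box", "/mnt/pool2"):
    print(" ".join(cmd))
```

For subsequent runs, `btrfs send -p <previous-snapshot> <new-snapshot>` would stream only the incremental difference, which is essentially what the appliance feature automates.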
 


whitey

OK, I 'WAS' diggin' Rockstor, but I am starting to see the cracks now. WTF is this nonsense? I can't even clone a snapshot through the web interface...why even put the feature in the web interface if it's not working?



Anyone else encounter this? It seems so simple, and together with the send/recv regression I am kind of in an SMH moment here. I knew ZoL and Rockstor (ScaleIO up next, need more resources MUHAHAHAHAHA) needed some maturity, but the initial offerings look to hold promise over the long haul if the dev effort doesn't fracture.

Now I just wish we had a GOOD/SOLID ZoL web frontend. Kinda makes me sad...someone please prove me wrong. I know Gea has napp-it for ZoL, but last I checked, the ZoL napp-it looked a lil' barebones versus the Illumos-based one, though I am sure the gap will close over time.
 


whitey

Wow, that's impressive: a responsive maintainer (the main dev) respectfully responding to me and thanking me for identifying a bug.

snapshot clone button is unresponsive · Issue #939 · rockstor/rockstor-core · GitHub

@whitey and @phillxnet, it's fair to say that there is a regression in Rockstor code for sure. I wouldn't be so fast and speculative about the upstream BTRFS code though. At least not until I test the behavior myself in our current (4.2.1) kernel.

As @phillxnet said in his comment, fixing replication is very important and is slated in the current milestone. So please stay tuned and we'll update this forum thread and the github issue as we get to work.
 

knightx2

New Member
Mar 16, 2017
Please Help!!!

I've been trying to play around with ScaleIO for the last 3 months, to no avail. The only version I can get free from EMC is ScaleIO v2.0, and it's extremely fussy when I try it on Ubuntu 16.04.

Could you please do a write-up with step-by-step instructions on how you set it up? I've been struggling to get this going, and reading the official guides did not help. I keep hitting errors over things that should be completely trivial.

My home lab is 3 ESXi 6.5 nodes (supported by ScaleIO v2.0). I did try ScaleIO for ESXi, but it did not work: none of the SDCs were recognized. Along the way it messed up a bunch of things on my ESXi hosts, to the point that I had to remove every trace of ScaleIO for my other configs and VMs to work.

So my latest attempt is to set it up in Ubuntu 16.04 VMs, as I figured it would be isolated and could be restarted from scratch if things go south.

Could you help me with a guide to how you set up ScaleIO on vSphere using the Rockstor NAS? I'd really appreciate it. I'm not looking for anything fancy: I have 3 ESXi nodes, each with 3 disks, and I just want the minimal 1-2 disk failure protection. I have a 1GbE switch connecting all 3 nodes. I have a few SSD/NVMe drives as well for cache, but at this point I'd be happy if it recognized all the SDSs and SDCs.

Thanks in advance
 

whitey

I think you are mixing up ScaleIO and Rockstor. They are two completely different projects. I did dabble with Rockstor a year or so back but found it not mature enough for my use case. ScaleIO I have only seen discussed in threads around here; I know the product at a high level (SDS, out of the EMC portfolio) but can't be much help beyond that.

Someone around here who has deployed ScaleIO will chime in, give it time.