There were many requests that I start a new thread for the SAN software ScaleIO.
For the record, I do not work for EMC. I have many SAN arrays and fabrics in our configuration:
HP EVA
HP 3PAR
NetApp
EMC CLARiiON line (CX3/4/VNX)
XIOTech
DataCore
ScaleIO
Cisco and Brocade switches (and one fabric has both running in compatibility mode - ick)
We bought this software to validate that the system will work in a production setting. We have plans to scale out to ~1 PB (spinning disk) of raw storage to replace older SAN arrays, but politics may change that...
The application we are running on this is very write-intensive: a massive queue-based system handling millions of files in the 2K-16K size range. We also have 76 DFS servers (all VMs) on this array.
The software has basically 4 components:
1) management - install/gui/SRM
2) metadata managers
3) servers that are part of the storage pool
4) client software
Any and all parts of this can run on Windows and *nix. We are almost strictly a Windows shop, so our preferred platform is Windows.
I will skip over #1, as I find it the most lacking. I will add more on it later...
MetaData Manager (MDM for short) - you must have two of these plus a TB (tie-breaker) server. These can be VMs, servers in the pools, or clients; it does not matter where they are located. However, they should be close to the pool, because there are many changes that need to be made fast. In our configuration, 3 of the 8 servers have a dual role.
Servers (SDS for short) are part of the storage pool (targets). In the configuration that I manage, we have 8 Dell servers with 4 x 800 GB Intel S3700 SSDs in each. The controllers are configured in pass-through mode for them. We have a total of 23.3 TB of raw storage and a 32 TB license, which gives us a maximum usable of 16 TB. The servers are also connected to our network over dual 40 Gb links. ScaleIO does not (yet) support RDMA; when that happens, latency will only go lower.
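A quick sanity check on those capacity numbers. This is my own sketch, not vendor math: the factor of two assumes ScaleIO's standard two-copy (mesh mirroring) protection, which is consistent with the 32 TB license translating into a 16 TB usable ceiling.

```python
# Rough capacity math for the pool above (a sketch, not vendor math).
# Assumption: ScaleIO keeps two copies of every chunk (mesh mirroring),
# so usable space is about half of raw -- consistent with the 32 TB
# license mapping to a 16 TB usable maximum.

def usable_tb(raw_tb: float, license_tb: float, copies: int = 2) -> float:
    """Usable capacity, limited by raw space or license cap, whichever is lower."""
    return min(raw_tb, license_tb) / copies

# License ceiling: 32 TB licensed -> 16 TB usable maximum
print(usable_tb(32.0, 32.0))    # 16.0

# What the current hardware actually yields: 23.3 TB raw
print(usable_tb(23.3, 32.0))    # ~11.65 TB usable today
```

So the license leaves headroom to roughly grow the raw pool to 32 TB before usable capacity tops out at 16 TB.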
Clients (SDC for short) are the initiators. This is basically a software HBA that runs over a proprietary protocol. The 8 servers attached to our configuration are 2 x 4-node Hyper-V clusters. I am using the BE network IPs for the clients.
All in all, having configured this system from scratch, there are a few things that should be clear:
1) ScaleIO depends on things being statically addressed; it does not like a DHCP environment. In my case, all of the ports used have static DHCP reservations.
2) The install tool is very confusing to use; however, if you want flawless upgrades, you need to use it. I was able to install the entire environment manually in under an hour; after trying to set up the environment using their tool for 6 hours, I just gave up.
3) You should not mix OSes (Windows/*nix) within a storage pool.
The rest is easy; it is just like working on a SAN array. Creating the storage pool is simple: on the SDS servers, create (but do not format) partitions and assign each a drive letter, then add them to the storage pool, create LUNs, and get ready to present them to the SDCs. Most of the work here is command-line driven; there is very little GUI. The lack of a real GUI for management is not a real disadvantage.
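For a flavor of that command-line work, the volume-create-and-map flow looks roughly like the sketch below. I am writing these from memory, so treat the exact flag names, pool names, and IPs as approximate and verify against `scli --help` on your version:

```
# Log in to the MDM, carve a volume out of a pool, present it to a client
scli --login --username admin

scli --add_volume --protection_domain_name pd1 --storage_pool_name pool1 \
     --size_gb 512 --volume_name queue_vol01

scli --map_volume_to_sdc --volume_name queue_vol01 --sdc_ip 10.0.0.21

scli --query_all_volumes
```

Once mapped, the volume shows up on the SDC like any other block device.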
On the SDC, you install the software and point it to the primary MDM. You can then receive any LUN from the pools that that MDM manages.
The real bonus of ScaleIO is that you can add and remove disks and servers (as long as you are under your license cap) without any downtime for the customer.
We ran a failed-server test; it took 10 minutes for the system to rebalance and restore data integrity across the array.
Adding a new server took an hour for the system to rebalance the data; removing a server also took an hour.
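To put those rebalance windows in perspective, here is some back-of-the-envelope arithmetic. The even-spread and 50% utilization figures are my assumptions, not measurements:

```python
# Back-of-the-envelope throughput implied by the rebalance timings above.
# Assumptions (mine, not measured): data is spread evenly across the
# 8 nodes, and the pool is roughly 50% utilized.

RAW_TB = 23.3
NODES = 8
UTILIZATION = 0.5  # assumed fill level

# Share of data each node holds, in TB
data_per_node_tb = RAW_TB / NODES * UTILIZATION  # ~1.46 TB

def rebuild_rate_gb_s(data_tb: float, minutes: float) -> float:
    """Aggregate throughput (GB/s) needed to re-copy data_tb in `minutes`."""
    return data_tb * 1000.0 / (minutes * 60.0)

# Failed node re-protected in ~10 minutes
print(round(rebuild_rate_gb_s(data_per_node_tb, 10), 2))  # ~2.43 GB/s aggregate

# Node add/remove rebalanced in ~60 minutes
print(round(rebuild_rate_gb_s(data_per_node_tb, 60), 2))  # ~0.4 GB/s aggregate
```

Even under these rough assumptions, the 10-minute rebuild implies multi-GB/s aggregate copy traffic across the cluster, which is why the dual 40 Gb links per server matter.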
As far as latency is concerned, I do not have real numbers, and where we are in the product cycle, I cannot run any good tests. What we noticed is how the behavior of the total system changed while running: DFS adds/deletes (millions of link changes per day) happen faster; several issues we had in the past with system updates (think of an MSMQ-like system) now fly; and domain DFS updates (with hundreds of thousands of links, way beyond spec) now complete without issue when the PDC is rebooted.
While it can be said that we could have used a different system, overall I am impressed with the flexibility of ScaleIO. It can create huge LUNs (1 PB; not tested, but scary from a data integrity perspective). The ability to do a hardware refresh while live is very impressive.
More to come later.
Chris