Distributed block storage.


Martin Jørgensen

New Member
Jun 3, 2015
What do you use for distributed block storage?


GlusterFS is awesome, but it's really best for object storage; if you have VMs with databases on it they perform poorly, at least that's what the website tells me.


Ceph is really awesome for both object and block storage.
But hypervisor support is weak; Proxmox is only just getting it now in the new beta.


What I'm looking for is shared storage for the hypervisors that is HA, scalable, and truly open source.


My idea for starting a small hosting company is to begin with minimal hardware and scale up:


2 hypervisors (oVirt, Proxmox, or another)


Shared block storage that is HA and scalable.


I don't want to go the OpenStack way.
 

Jeggs101

Well-Known Member
Dec 29, 2010
Well, Ceph now works with OpenStack, so that's big. Also, with it in the kernel of most big distros, you can do LXD/LXC with it easily.
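For example, a minimal sketch of the kernel RBD path (the pool and image names here are made up, and the LXD line assumes a release that ships the Ceph storage driver):

# create a pool and a 10 GiB image on the Ceph side
ceph osd pool create vmstore 128
rbd create vmstore/vm-disk1 --size 10240

# map it with the in-kernel RBD client; it shows up as a plain block device
rbd map vmstore/vm-disk1
mkfs.ext4 /dev/rbd0
mount /dev/rbd0 /mnt/vm-disk1

# with a recent LXD, the same pool can back containers directly
lxc storage create ceph-pool ceph ceph.osd.pool_name=vmstore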
 

cesmith9999

Well-Known Member
Mar 26, 2013
ScaleIO - requires 3 nodes (4+ recommended)
Windows Server 2016 using SOFS (iSCSI) - Storage Spaces Direct (4+ nodes)

Chris
 

JeffroMart

Member
Jun 27, 2014
I set up a small Ceph cluster just using CentOS 7.1 VMs on ESXi 6, and the speed was pretty terrible. I didn't tweak anything; I just followed a simple how-to on getting it up and running, so I can't really say it's optimized or anything like that at this point.

All the MON/OSD VMs are running on the same VMware host node and sharing the same storage, which is probably not the best way to test, but I would expect something a little better than this:

Mount of the CEPH share:

dd if=/dev/zero of=test bs=1M count=10000 conv=fdatasync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 181.107 s, 57.9 MB/s

Local storage performance inside one of the MON nodes running CEPH:

dd if=/dev/zero of=test bs=1M count=10000 conv=fdatasync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 10.4648 s, 1.0 GB/s

Again, it's not tuned or anything like that, and with only about 8 hours of setup/config I am in no way a Ceph expert yet.

Something seems wrong to me in the config, at least I would hope :)

Anyone else have a running setup and could comment on performance/configuration?

Here are the same results using GlusterFS, set up in VMs with OSNEXUS, over an NFS share mounted in VMware:

dd if=/dev/zero of=test bs=1M count=10000 conv=fdatasync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 51.8063 s, 202 MB/s

This is a pretty simple test; I would like to do further benchmarking with bonnie++ or fio, I just haven't had the time yet.
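Something along these lines is roughly what I have in mind (the mount point and file name are just placeholders):

# large sequential writes, comparable to the dd run above
fio --name=ceph-seq-write --filename=/mnt/ceph/test.fio --rw=write --bs=1M \
    --size=10G --ioengine=libaio --direct=1 --iodepth=16

# small random writes, which dd never exercises
fio --name=ceph-rand-write --filename=/mnt/ceph/test.fio --rw=randwrite --bs=4k \
    --size=10G --ioengine=libaio --direct=1 --iodepth=16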
 

TuxDude

Well-Known Member
Sep 17, 2011
You say all of the ceph hosts are running on the same storage - what is that storage? If it is spinning disk then you should expect a large performance hit; spreading the IO across a bunch of hosts on the same disk will turn dd's usually sequential workload into a random one.
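A quick way to see the effect (the directory is just a placeholder for wherever the backing disk is mounted) is to compare one sequential writer against several writing to the same spindle at once:

# one sequential stream - the disk sees a nice streaming pattern
fio --name=one-writer --directory=/data --rw=write --bs=1M --size=2G \
    --ioengine=libaio --direct=1 --numjobs=1

# four streams at once - the single disk now seeks between four files,
# which is roughly what several OSD VMs on one datastore do to it
fio --name=four-writers --directory=/data --rw=write --bs=1M --size=2G \
    --ioengine=libaio --direct=1 --numjobs=4 --group_reporting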
 

JeffroMart

Member
Jun 27, 2014
You say all of the ceph hosts are running on the same storage - what is that storage? If it is spinning disk then you should expect a large performance hit; spreading the IO across a bunch of hosts on the same disk will turn dd's usually sequential workload into a random one.
Yes, the local media is spinning disks in a RAID array. I thought there was a good chance this would impact performance; I appreciate the info!

Does anyone have a Ceph setup on a few separate machines who could share setup/performance details? I'd just like to get an idea before I waste time playing around with it on several machines if it's not really worth it.
 

BackupProphet

Well-Known Member
Jul 2, 2014
Stavanger, Norway
olavgg.com
I use HDFS/Hadoop as distributed block storage, mostly because it has a real consensus algorithm. The only thing I really miss is the capacity saving from parity calculations over the blocks like Facebook has: Saving capacity with HDFS RAID | Engineering Blog | Facebook Code | Facebook

A database can never be used efficiently on top of a distributed filesystem; you should rather use a proper distributed database.
 