No vSAN - Ceph or GlusterFS these days?


MiniKnight

Well-Known Member
Mar 30, 2012
3,072
973
113
NYC
I have a project where I was flat out told no budget for vSAN and no VMware. The hypervisor mix will be some Hyper-V and some oVirt KVM. We are looking at containers, but we're not ready to go that route just yet. Nothing huge being hosted, <20 nodes and no I/O-crazy VMs. Just trying to consolidate a bunch of homegrown tools.

A year ago, I would've just done GlusterFS and been done with it. These days there is so much going on around Ceph that I'm thinking it has more momentum.

Does anyone who has evaluated both have an opinion? If I can skip doing the POC, that'd be great. I want to do shared storage and have the compute nodes all be part of the storage cluster. I then want all of the backups to hit big ZFS shared storage.
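To make it concrete, the GlusterFS version of what I'm picturing would be something like the sketch below, driving the gluster CLI from Python. The hostnames (node1-3) and the brick path are placeholders, not our real boxes.

import subprocess

NODES = ["node1", "node2", "node3"]   # placeholder compute/storage hosts
BRICK = "/data/brick1/vmstore"        # placeholder brick path on each node

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Join the peers into one trusted pool (run from node1).
for node in NODES[1:]:
    run(["gluster", "peer", "probe", node])

# Create and start a 3-way replicated volume spanning all compute nodes.
bricks = [f"{n}:{BRICK}" for n in NODES]
run(["gluster", "volume", "create", "vmstore", "replica", "3", *bricks])
run(["gluster", "volume", "start", "vmstore"])

The Ceph equivalent obviously means mons and OSDs on those same hosts instead, which is part of why I'm asking.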
 

Patriot

Moderator
Apr 18, 2011
1,450
789
113
Ceph is not ready for a production environment. It is not performant, and it falls down easily.
 
  • Like
Reactions: T_Minus

Patrick

Administrator
Staff member
Dec 21, 2010
12,511
5,792
113
Fairly pertinent as I am trying to decide between Ceph and Gluster tonight.
 
  • Like
Reactions: Chuntzu

Patriot

Moderator
Apr 18, 2011
1,450
789
113
Did you personally have a bad experience with it or something?

I was reading a fairly data-rich overview by CERN that offered what I thought was a very promising review of the operational sanity of Ceph.

https://iopscience.iop.org/1742-6596/513/4/042047/pdf/1742-6596_513_4_042047.pdf
Meh, peak performance can be shiny, but rebalancing performance is painful... They mention this but don't go into any detail. Latency can be over 1s, often even during normal periods, due to the synchronous nature of the writes.

When scaling big you need to look not just at peak performance but at degraded performance, as you will most likely always have a drive or two down when you are at the multi-PB level.

Ceph at CERN: A Year in the Life of a Petabyte-Scale Block Storage Service » OpenStack Open Source Cloud Computing Software
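If you want to see that tail for yourself, just time a stream of small synchronous writes while a rebalance is running. A rough sketch with the python-rados bindings; the pool name and write count are placeholders:

import time
import rados

POOL = "rbd"            # placeholder pool name
N = 500                 # number of test writes
payload = b"\0" * 4096  # 4 KiB per write

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx(POOL)

samples = []
for i in range(N):
    t0 = time.time()
    ioctx.write_full(f"latprobe-{i}", payload)  # synchronous write, acked by the replicas
    samples.append(time.time() - t0)

samples.sort()
print("median %.1f ms, p99 %.1f ms, max %.1f ms" % (
    samples[len(samples) // 2] * 1000,
    samples[int(len(samples) * 0.99)] * 1000,
    samples[-1] * 1000))

ioctx.close()
cluster.shutdown()

Run it once while the cluster is healthy and again while an OSD is backfilling; the gap between the two p99 numbers is the degraded performance I'm talking about.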
 
  • Like
Reactions: T_Minus

Patriot

Moderator
Apr 18, 2011
1,450
789
113
So I just watched that whole video, and I think CERN's bottom line is a lot different from what you were suggesting.
It just works.
I know their test cluster was 150TB, but I wonder what the machines actually were...
I think there is a definite minimum cluster size to get acceptable large-block performance, with the knowledge that IOPS, latency, and small-block will always suck.

The problem the architecture has with latency is the synchronous double write... and if you offload the journal to an SSD, it then writes to the SSD, then to the volume, then to the replicas... doubling it again.
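Back-of-the-envelope version of that write path, with all the latency numbers made up just to show the shape of it:

# Toy model of the synchronous, journaled write path. The client ack waits
# for the primary's journal write and for the slowest replica to journal too.
# All figures below are illustrative, not measurements.
journal_write_ms = 0.5   # SSD journal write
data_write_ms    = 8.0   # backing disk write, flushed later
network_hop_ms   = 0.3   # one replication hop
replicas         = 3     # primary + 2 copies

# Primary journals locally and forwards to the replicas in parallel;
# the ack to the client waits for the slowest of those journal writes.
client_ack_ms = max(journal_write_ms,
                    network_hop_ms + journal_write_ms + network_hop_ms)
print(f"client sees roughly {client_ack_ms:.1f} ms per synchronous write")

# Each copy is still written twice (journal, then data partition), so the
# cluster does replicas * 2 device writes for every one client write.
print(f"device writes per client write: {replicas * 2}")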

Looks like we need to rebuild the cluster on Debian and see if we can get any better performance. Kinda surprised they are not using their homegrown Scientific Linux.
 
  • Like
Reactions: T_Minus

vanfawx

Active Member
Jan 4, 2015
365
67
28
45
Vancouver, Canada
Ceph is great when you can dedicate the hardware to being Ceph OSD nodes. You get into trouble when you want them to be compute nodes as well. The general recommendation with Ceph is to dedicate 1GHz of CPU per OSD. You also should have (though most will say need) SSDs to use as journals. If you use erasure coding, Ceph is even MORE CPU hungry.
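If you want to put numbers on that rule of thumb for a hyperconverged box, the math looks roughly like this. The node specs and the erasure-coding multiplier here are made-up examples, not benchmarks:

# Rough CPU budget for co-locating OSDs and VMs on one node, using the
# ~1 GHz-per-OSD rule of thumb. All node specs below are hypothetical.
cores       = 16     # physical cores in the node
clock_ghz   = 2.4    # base clock
osds        = 12     # one OSD per data disk
ghz_per_osd = 1.0    # rule of thumb for replicated pools
ec_factor   = 2.0    # assume roughly double the CPU for erasure-coded pools

total_ghz   = cores * clock_ghz
ceph_ghz    = osds * ghz_per_osd
ceph_ghz_ec = ceph_ghz * ec_factor

print(f"node total      : {total_ghz:.0f} GHz")
print(f"OSDs, replicated: {ceph_ghz:.0f} GHz  -> {total_ghz - ceph_ghz:.0f} GHz left for VMs")
print(f"OSDs, EC        : {ceph_ghz_ec:.0f} GHz -> {total_ghz - ceph_ghz_ec:.0f} GHz left for VMs")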

And as has been mentioned, rebalancing due to the addition or removal of OSDs sucks. In the past, you'd need to have done lots of testing and tuning to get rebalancing to not cause horrible performance issues. Now it seems like they've been getting their defaults to better handle this out of the box.
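The knobs most people reach for to keep a rebalance from stomping on client I/O are the backfill/recovery throttles. Something like the sketch below; the option names are the standard Ceph OSD ones, but the values are just a "go slow" example, not a recommendation for every cluster:

import subprocess

# Inject conservative recovery settings into every running OSD.
subprocess.run(
    ["ceph", "tell", "osd.*", "injectargs",
     "--osd-max-backfills 1 "          # concurrent backfills per OSD
     "--osd-recovery-max-active 1 "    # concurrent recovery ops per OSD
     "--osd-recovery-op-priority 1"],  # deprioritize recovery vs client ops
    check=True)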

Ceph used as an object store or block store (for RBD) is the best use case. When I tried CephFS, there were many, many, many performance issues, mostly revolving around the MDS (metadata server). They have a great vision of a sharded directory tree spread across many metadata servers, but the current reality is a single active MDS with failover to hot standbys.

I think my long-winded point is that Ceph is great if you are running at scale (PBs). If you are looking at a smaller cluster (even up to hundreds of TBs), it's more difficult to achieve good, consistent performance.

What is the current state of Gluster like? I haven't touched it since the 3.x days, quite a bit before the Red Hat acquisition.
 

cesmith9999

Well-Known Member
Mar 26, 2013
1,417
468
83
ScaleIO works, and it is simple to set up and configure.

I have one in production that does 90% writes... with very little latency...

Chris
 
  • Like
Reactions: T_Minus

Patriot

Moderator
Apr 18, 2011
1,450
789
113
cesmith9999 said:
With 32 × 800GB SSDs and 80Gb of Ethernet capacity per server, I will let you guess why.

Chris

So you have the potential for good throughput; that tells me nothing about latency.
There are also 800GB shit drives and 800GB good drives.

Why don't you start a new thread and tell us about ScaleIO?
 
  • Like
Reactions: MiniKnight

MiniKnight

Well-Known Member
Mar 30, 2012
3,072
973
113
NYC
Yea I'd love to hear more about what you are doing. Agree on latency. New thread!

BTW --- for that thread --- how big is that ScaleIO cluster? Are you using the "free" edition?