ZFS mirror across servers


rthorntn

Member
Apr 28, 2011
Hi,

Thanks for taking a look at my post!

I have a couple of the mITX Supermicro c2550 boards and want to achieve low-cost redundant clustering at home.

8TB disks are expensive and I have two at this point. I would like to put one in each c2550 server and synchronise the data across Ethernet, i.e. mirroring over Ethernet.

Is this possible using Solaris 11.3 (I believe the c2550 boards work now)? If it will work, can I take advantage of all the funky ZFS resiliency features to prevent silent data corruption through bit rot (the disks are rated at <1 unrecoverable error in 10^14 bits read), or would I need two mirrored disks per c2550 for that?

Thanks again.

Richard
 

Patrick

Administrator
Staff member
Dec 21, 2010
Richard,

Usually clustering will require 3+ nodes. The reason is that if there is a network issue and both nodes are still up but cannot talk to each other, it is very hard to figure out which node's changes are valid. A third node provides a quorum: whichever side of the split still holds a majority keeps running.

Are you looking for a solution to basically keep a copy of all changed data on the second server (e.g. zfs send/receive or rsync could work), actual RAID 1 mirroring across the servers (writes happen to both simultaneously), or a clustered system (scale out to, say, 5-100 nodes one day)?
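
For the first option, the send/receive route is conceptually just a snapshot and a stream; a minimal sketch (pool, dataset and host names are placeholders):

    # snapshot the dataset, then stream it to the second box over SSH
    zfs snapshot tank/data@copy1
    zfs send tank/data@copy1 | ssh server2 zfs receive backuppool/data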
 

rthorntn

Member
Apr 28, 2011
Thanks Patrick!

Ah OK, three nodes to handle split-brain decisions :)

I guess it might help everyone if I state my requirements:

Home network
6TB of data stored at this point
Plex server
Hypervisor to spin up research VMs (prefer vSphere)
N+1 disk redundancy but server redundancy would be cool also
ZFS (prefer Solaris)
NFS
Solution capable of correcting bit-rot
CPU assisted AES disk encryption
Easy storage capacity upgrades
Very happy in the Unix CLI

I have a few servers:
2 x 6-core ATX
1 x c2750 mITX
4 x c2550 mITX
1 x 4-core ATX

I have a bunch of SSDs, 2TB drives, 10GbE cards and SAS HBAs. I like overkill if it's running on fairly energy-efficient hardware. The minimum number of servers that lets me recover fairly quickly from a hardware failure would be great; it helps me learn, and I don't need amazing performance.

I think I would like to have Plex running in Docker at least, but maybe in a container or separate VM. The Plex requirement probably means Solaris running on the c2550 for storage and a separate server for vSphere (it doesn't look like people are having much luck running Solaris on vSphere with a SAS HBA in passthrough, and I'm not sure I could get Plex running with Solaris's Linux support).

Hope that helps and thanks again!

Richard
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,519
5,827
113
How fast is your data growing? Where do you project it to be in 24 months?

The reason I am asking is that I would probably think about server-level redundancy and a backup solution. For example, I have a pretty nifty ZFS server. I use RAID 1 on the disks there with L2ARC and ZIL devices. I then have another NAS box whose sole purpose is being a backup target.

The benefit I get is building a bigger/better main storage server and having a bit more redundancy with the dedicated backup target.
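
In practice a dedicated backup target like that is fed by scheduled incremental sends; a minimal sketch, assuming a source pool tank, a backup pool backup, and a host called backupnas (all names illustrative):

    # one-time full copy to seed the backup target
    zfs snapshot tank@base
    zfs send tank@base | ssh backupnas zfs receive -F backup/tank
    # afterwards, ship only the delta between the last two snapshots
    zfs snapshot tank@daily1
    zfs send -i tank@base tank@daily1 | ssh backupnas zfs receive backup/tank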
 

rthorntn

Member
Apr 28, 2011
Thanks again Patrick!

I think it will grow at about 2TB per year.

Do you use Solaris ZFS? Is your NAS box another DIY server backing up to JBOD/RAID?

Where do all your "services" live (Plex in my case)?

Richard
 

whitey

Moderator
Jun 30, 2014
There are no issues with VT-d (pass-through) of an LSI HBA to a Solaris/Illumos/Linux/FreeBSD storage appliance VM. Dunno where you got that sentiment from.

There WAS an odd issue where we found a Linux kernel regression on LSI 2008 chipsets, and I documented the hell out of it in several threads here and over on the Rockstor forum.
 

unwind-protect

Active Member
Mar 7, 2016
Is this mostly for the case where one server blows up completely, destroying all drives?

And you want it faster than sending ZFS snapshots can do?

One kinda-straightforward way is to base the filesystem on the active server on devices that are not the raw disks directly. Instead, each "disk" that ZFS sees is a RAID 1 device, backed twice: once locally and once over iSCSI to the other server.

Obviously, if there is ever a connection problem you have the mother of all resyncs to do. And the backup machine can at best make casual read-only use of the filesystem (with ZFS, probably not even that).
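
A rough sketch of that layout on the active server, assuming Solaris initiator commands and that the other box already exports an iSCSI LUN (addresses and device names are placeholders):

    # point the iSCSI initiator at the partner server and enable discovery
    iscsiadm add discovery-address 192.168.1.2:3260
    iscsiadm modify discovery --sendtargets enable
    # build the pool so each vdev is a mirror of one local disk and
    # one iSCSI-backed disk (use the device name that format(1M) shows)
    zpool create tank mirror c1t0d0 c0t600144F0XXXXd0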

It isn't quite clear to me what kind of scenario you want to prepare against.
 

gea

Well-Known Member
Dec 31, 2010
The intention is not clear to me either.

You usually start with a regular storage server, where you can replicate a ZFS filesystem to another ZFS filesystem on the same or another server. This is the usual procedure.

BTW
Rethink the archive disks. They are a bad idea for any RAID; performance on a resilver can be a disaster.

You can then think about virtualising services, either via zones on Solaris, or you can use ESXi as a bare-metal virtualiser and virtualise everything including the storage VM. This gives you more flexibility regarding your VMs (they can be anything from BSD and OSX to Linux, Solaris and Windows) and very fast recovery after a crash (under a minute for the storage VM from an ESXi OVA template).
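
For the zones route, defining and booting a service zone is only a few commands; a minimal sketch (zone name and path are just examples, and whether a given app runs inside is a separate question):

    # define, install and boot a native Solaris zone for a service
    zonecfg -z svczone 'create; set zonepath=/zones/svczone'
    zoneadm -z svczone install
    zoneadm -z svczone boot
    zlogin svczone        # get a shell inside the zone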

Server mirroring or building RAID-Z arrays over servers is a very special use case.
You should only think of it when you really need a huge capacity, a recovery option for realtime data (where you cannot allow a delay of, say, 1-5 minutes like with replication), or an HA solution where you want to survive a whole system failure (server + storage). In such a case you can:

- use 2+ ZFS storage servers, or a storage head + 2 or more storage nodes with disks
- create a pool of the same size on every node and create an iSCSI target on the pool, up to the pool size
- on the storage head (which can be combined with a node), create an iSCSI initiator and build a pool from a mirror or RAID-Z over the targets; this allows cheap nodes with SATA disks to achieve a cheap but fast petabyte box

If the head fails, you can use an initiator on another server to import the disks/pool/data and keep services up with current data.
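
On the Solaris/Illumos side that maps to COMSTAR on the nodes and the plain iSCSI initiator on the head; a minimal sketch with illustrative names, sizes and addresses (the GUIDs are placeholders):

    # on each storage node: enable COMSTAR, back a LUN with a zvol, export it
    svcadm enable stmf
    svcadm enable -r svc:/network/iscsi/target:default
    zfs create -V 900G nodepool/lun0
    stmfadm create-lu /dev/zvol/rdsk/nodepool/lun0
    stmfadm add-view 600144f0XXXXXXXX    # use the LU GUID printed by create-lu
    itadm create-target
    # on the head: discover every node's target, then build RAID-Z across them
    iscsiadm add discovery-address 192.168.1.11:3260
    iscsiadm modify discovery --sendtargets enable
    zpool create bigpool raidz c0tXXX1d0 c0tXXX2d0 c0tXXX3d0   # iSCSI LUN device names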
 