ZFS server problems


Harold Robinson

New Member
Jan 9, 2017
5
0
1
60
Good evening everyone. I am Harold Robinson. I have an OmniOS and napp-it installation. It has one ZFS pool of 10 drives: 2 drives are parity and 8 are data.

The server is running just fine. Now for the problem: I naively installed OmniOS on a flash drive. Now I am seeing write errors to the USB drive as well as "Media not available" errors.

I am guessing I am dealing with a bad USB drive.

The console appears to be locked up, and on top of that, I will need to shut this storage down so I can move it.

How can I recover from a problem like this once the storage server has been moved to its new home?
 

gea

Well-Known Member
Dec 31, 2010
3,162
1,195
113
DE
Simply reinstall OmniOS (preferably on a 30 GB+ SSD) + napp-it.
If this is a critical setup, prefer something like an Intel S35x0-80 (very reliable, with power-loss protection).

Now just import the pool.
If you want to keep your napp-it settings, restore the folder /var/web-gui/_log/*
either manually or via backup job + User > restore
 

Harold Robinson

Thanks for your quick answer. Since this is just a boot drive, I am going to replace the dead USB drive with two 500 GB laptop drives I have on hand, and I will be mirroring them. That should make the system a bit more reliable. Although I have to say, it has been rock solid with the exception of what happened with the boot drive. I thought I was being slick by using USB drives. What a mistake.

I know I don't have any choice, but since the system will not have been shut down properly, is there anything I should be aware of when I import? Should I expect any delays; will it take a long time to import? The storage has a bunch of VMs attached via NFS. I will of course be shutting all of those down properly.

Will a "chkfsck" be run automatically, or is ZFS so robust that it just magically recovers with no data loss?

I will also lose my NFS configuration. I am guessing that I will not be able to recover it, as I don't have a backup of the boot drive.
 

Harold Robinson

One more quick question. In order to save time with the hardware repair, could I preinstall OmniOS on the replacement drives before arriving on site, or should I wait until after I have installed them in the server?
 

gea

You can install OmniOS/napp-it on another machine and then move the boot disk (use SATA/AHCI). It will boot without problems in most cases. The only issue may be the NIC settings, as they are remembered and can cause problems.

As I offer preconfigured images, you can either clone one of these or at least use the prepare script that I use to prepare these images, which deletes all NIC settings. Call the following prior to the move.

perl /var/web-gui/data/tools/other/prepare_image.pl
On the next bootup, napp-it automatically configures all NICs in DHCP mode.
 

Harold Robinson

Gea,

Thanks for your reply. In my previous post I had some additional questions for you. Could you look at that post, dated 1/10/2017 at 5:25 pm?
 

gea

Thanks for your quick answer. Since this is just a boot drive, I am going to replace the dead USB drive with two 500 GB laptop drives I have on hand, and I will be mirroring them. That should make the system a bit more reliable. Although I have to say, it has been rock solid with the exception of what happened with the boot drive. I thought I was being slick by using USB drives. What a mistake.
Typically USB sticks are not as robust as SSDs, but there are good ones around.
But indeed, I would always prefer a SATA SSD (or disk). On a production machine I would always prefer something like an Intel S35x0-80 GB, a very reliable enterprise SSD with power-loss protection.

I know I don't have any choice, but since the system will not have been shut down properly, is there anything I should be aware of when I import? Should I expect any delays; will it take a long time to import? The storage has a bunch of VMs attached via NFS. I will of course be shutting all of those down properly.
A pool import only requires that all disks are available, on any controller. It usually takes only the few seconds needed to read some data from each disk.
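
As a rough illustration of the import step, here is a minimal shell sketch. The pool name "tank" is a placeholder, not the poster's actual pool name, and the `run` helper with its `DRY_RUN` switch is just a convenience invented for this example so the commands can be reviewed before anything is executed. Since the old installation never exported the pool, a plain import may be refused; `-f` forces it in that case.

```shell
#!/bin/sh
# Sketch only: "tank" is a hypothetical pool name; substitute your own.
POOL="${POOL:-tank}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

import_pool() {
    run zpool import            # no arguments: list pools that can be imported
    run zpool import -f "$1"    # -f because the pool was never exported
    run zpool status "$1"       # check that every disk shows ONLINE
}

import_pool "$POOL"
```

Run it once as-is to see the commands, then with `DRY_RUN=0` on the freshly installed system once all ten disks are attached.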

Will a "chkfsck" be run automatically, or is ZFS so robust that it just magically recovers with no data loss?
There is no fsck or chkdsk on ZFS. Due to its copy-on-write behaviour it is crash resistant: a write is either valid or discarded. There is a scrub function that reads all data, checks the checksums and repairs damaged blocks from redundancy, but this function is there to fight silent data corruption.
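
If you want the extra reassurance after the move, a scrub can be started by hand. The sketch below assumes a pool called "tank" (a placeholder) and uses the same invented `run`/`DRY_RUN` helper idea to print the commands instead of executing them by default. It is optional; nothing in the reply says a scrub is required after an unclean shutdown.

```shell
#!/bin/sh
# Sketch only: "tank" is a hypothetical pool name.
POOL="${POOL:-tank}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

scrub_pool() {
    run zpool scrub "$1"        # start reading and verifying all data
    run zpool status -v "$1"    # show scrub progress and any files with errors
}

scrub_pool "$POOL"
```

The scrub runs in the background, so `zpool status -v` can be repeated later to see the result.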

I will also lose my NFS configuration. I am guessing that I will not be able to recover it, as I don't have a backup of the boot drive.
NFS configuration is mainly a matter of setting the ZFS property sharenfs=on on the filesystem.
Optionally you can restrict access as a share option based on client IP; beyond that there are only very few other options that are needed or really make sense.
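
A minimal sketch of what that could look like from the shell. The filesystem name "tank/vmstore" and the 192.168.1.0/24 network are made-up examples, and the `run`/`DRY_RUN` helper is only there so the commands are printed rather than executed by default. The restricted variant uses the illumos share option syntax as I understand it; check it against the share_nfs man page on your system.

```shell
#!/bin/sh
# Sketch only: filesystem name and network below are hypothetical examples.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

share_nfs() {
    fs="$1"
    opts="${2:-on}"                       # default: share with default options
    run zfs set sharenfs="$opts" "$fs"    # the share setting is stored in the pool
    run zfs get sharenfs "$fs"            # show the value now in effect
}

# Simple case: share to everyone.
share_nfs tank/vmstore

# Restricted case (assumed syntax): read/write and root access for one subnet.
share_nfs tank/vmstore "rw=@192.168.1.0/24,root=@192.168.1.0/24"
```

Because sharenfs is a property of the ZFS filesystem rather than a file on the boot disk, it is worth running `zfs get sharenfs` after the import to see whether the old setting is already there.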