ext4 on JBOD drives, but the file system keeps getting corrupted


robk

New Member
Jan 1, 2022
7
0
1
I have a DDN StorageScaler 8460 set up as a JBOD and connected to two different bare-metal hosts, one running CentOS 7 and one running CentOS 8. Each drive has an ext4 filesystem on LVM, but I noticed some drives were showing input/output errors that were fixed with e2fsck. I'm wondering if I am causing these errors somehow. I generally just reboot the boxes separately as needed. Is there some procedure I should be following to avoid introducing errors when one partition/filesystem is mounted on two different hosts?

I did notice, after e2fsck on the CentOS 7 host threw a version error, that I had created the filesystem on about 10 of the drives from the CentOS 8 box. But I don't think that's the issue, since the filesystem errors also appear on other drives.
 

dswartz

Active Member
Jul 14, 2011
610
79
28
Maybe I'm misunderstanding you. It sounds like you're trying to mount the same filesystem on 2 hosts at the same time? If so, you'd need a cluster-aware FS.
 

robk

That is correct, I'm mounting the same filesystem on two different hosts at the same time.

Thanks for the tip, I'll have to look into cluster filesystems, I'm not familiar with them.
 

dswartz

You can't share data, but you can run cluster-aware LVM, which lets you create LVs on a VG that *is* visible to more than one host at a time. But only one host at a time can access each LV.
 

robk

That doesn't seem right. You can't share data? I'm sharing data right now, and I think my corruption issue may be caused by filesystem differences between CentOS 7 and 8. It seems like filesystems created on the CentOS 8 machine are incompatible with e2fsck on CentOS 7, and probably with some other tools. I need to see if there is a way to effectively downgrade them to CentOS 7 and be done with 8.
 

dswartz

Poor phrasing on my part. I don't mean sharing like SMB or NFS. Modern filesystems keep all kinds of metadata in memory, which makes it impossible to mount a disk or partition on more than one client at a time without corrupting things. You are 100% guaranteed to get filesystem corruption that way. That's why cluster-aware filesystems were created. The reason they aren't more widely used is that cluster-safe operations are not free: there is always a performance cost involved somewhere...
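
To make the in-memory metadata point concrete, here is a toy sketch in Python. It is not how ext4 works internally; `SharedDisk` and `Host` are made-up names, and the "filesystem" is just a free-block bitmap that each host copies into memory when it mounts. It only illustrates why two uncoordinated writers end up overwriting each other.

```python
class SharedDisk:
    """Toy block device: an on-disk free-block bitmap plus data blocks."""

    def __init__(self, nblocks):
        self.bitmap = [False] * nblocks  # True means "block is allocated"
        self.blocks = [None] * nblocks


class Host:
    """Toy host that caches the bitmap at mount time and never re-reads it,
    the way a non-cluster filesystem assumes it is the only user of the disk."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk
        self.cached_bitmap = list(disk.bitmap)  # private in-memory copy

    def write_file(self, data):
        # Allocate from the cached view, which may already be stale.
        idx = self.cached_bitmap.index(False)
        self.cached_bitmap[idx] = True
        self.disk.blocks[idx] = (self.name, data)
        # Flush this host's idea of the bitmap, clobbering the other host's.
        self.disk.bitmap = list(self.cached_bitmap)
        return idx


def demo():
    disk = SharedDisk(4)
    host_a = Host("hostA", disk)  # both hosts "mount" before anything is written
    host_b = Host("hostB", disk)
    block_a = host_a.write_file("plot-1")
    block_b = host_b.write_file("plot-2")  # hostB still believes every block is free
    return block_a, block_b, disk.blocks[block_a]


if __name__ == "__main__":
    print(demo())
```

Both hosts pick the same block, so the second write silently replaces the first. A cluster-aware filesystem avoids this by making hosts take a lock and refresh their view before allocating, which is where the performance cost comes from.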
 

MBastian

Active Member
Jul 17, 2016
205
59
28
Düsseldorf, Germany
Let's take a step back. What do you intend to do with the shared filesystem? For example, even a cluster-aware filesystem won't help you if you intend to write to an SQL database from two machines.
 

robk

Maybe I'm misreading you, but I took "you can't share data" to mean that even with a cluster-aware filesystem, it wouldn't be possible to share data.

It's for a Chia JBOD setup, so I just need to fill the drives with data once, and then they are pretty much read-only as the plots get queried. It has actually held up surprisingly well so far, given its current configuration.
 

acquacow

Well-Known Member
Feb 15, 2017
787
439
63
42
You can't share something like ext4 between two hosts that both have it mounted read/write. One host can be read-only if you wish, but if both mount it read/write, nothing handles distributed lock management between the hosts. You're going to get journal and file corruption all the time.
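
The single-writer rule described above can be written down as a small check. This is only a sketch: the host names, the filesystem identifiers, and the idea of collecting each host's mounts into one inventory dictionary are all assumptions made for illustration, not something from this thread's setup.

```python
from collections import defaultdict


def find_multi_writer_filesystems(mounts_by_host):
    """Return {filesystem_id: [hosts]} for every filesystem that more than
    one host has mounted read/write.

    mounts_by_host maps a host name to a list of (filesystem_id, mode)
    pairs, where mode is "rw" or "ro". Using the filesystem UUID as the id
    is a sensible choice, since device names can differ between hosts.
    """
    writers = defaultdict(set)
    for host, mounts in mounts_by_host.items():
        for fs_id, mode in mounts:
            if mode == "rw":
                writers[fs_id].add(host)
    return {fs_id: sorted(hosts) for fs_id, hosts in writers.items() if len(hosts) > 1}


if __name__ == "__main__":
    # Hypothetical inventory: plots-01 is read/write on both hosts (unsafe),
    # plots-02 has a single writer and one read-only mount.
    inventory = {
        "centos7-host": [("uuid-plots-01", "rw"), ("uuid-plots-02", "ro")],
        "centos8-host": [("uuid-plots-01", "rw"), ("uuid-plots-02", "rw")],
    }
    print(find_multi_writer_filesystems(inventory))
```

Anything this returns is a filesystem where two hosts can write without coordinating with each other.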

You need a central server running NFS shared between the two hosts, or need to run a clustered filesystem, or use a solution like DRBD to mirror data between the hosts.

I wouldn't trust any of the data you shared between the two hosts to be free of corruption.
 

robk

What if I mount the filesystem read-only on both hosts? Do you think I'd still get corruption?
 

acquacow

Both in RO mode would be fine. I do that in production a lot. As long as only a single host can write to it, you're fine. It's done a lot in the Fibre Channel/iSCSI SAN world for rapid failover.
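
If you go the read-only route, it may be worth confirming on each host that the mount really is `ro`. Here is a minimal Python sketch that parses text in the `/proc/mounts` format (device, mount point, filesystem type, options, then two numeric fields). The mount point and device names in the example are made up.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text,
    or None if it is not mounted. The last matching line wins, since a later
    mount on the same path hides an earlier one. Mount points containing
    spaces appear escaped (as \\040) in /proc/mounts and are not unescaped here."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            found = fields[3].split(",")
    return found


def is_read_only(mounts_text, mount_point):
    """True if mount_point is mounted with the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    if options is None:
        raise ValueError(f"{mount_point} is not mounted")
    return "ro" in options


if __name__ == "__main__":
    # Hypothetical sample; on a real host, read the text from /proc/mounts.
    sample = (
        "/dev/mapper/vg_plots-lv01 /mnt/plots01 ext4 ro,relatime 0 0\n"
        "/dev/mapper/vg_plots-lv02 /mnt/plots02 ext4 rw,relatime 0 0\n"
    )
    for path in ("/mnt/plots01", "/mnt/plots02"):
        print(path, "read-only" if is_read_only(sample, path) else "READ/WRITE")
```

On a live system you would pass `open("/proc/mounts").read()` as the first argument and run the check on both hosts.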