ext4 on JBOD drives, but the file system keeps getting corrupted

robk · Jan 1, 2022

I have a DDN StorageScaler 8460 set up as a JBOD connected to two different bare metal hosts, one CentOS 7 and one CentOS 8. Each drive is an LVM volume formatted as ext4, but I noticed some drives were showing input/output errors that were fixed with e2fsck. I'm wondering if I am causing these errors somehow. I generally just reboot the boxes separately as needed. Is there some procedure I should be following to avoid introducing errors when one partition/filesystem is mounted on two different hosts?

After e2fsck on the CentOS 7 host threw a version error, I noticed that I had created the filesystem on about 10 drives using the CentOS 8 box. But I don't think that's the issue, since the filesystem errors also appear on drives other than those.
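One way to test whether a feature mismatch explains the e2fsck version error is to compare the features recorded in the superblock (the `Filesystem features:` line printed by `dumpe2fs -h`) against what the older tools understand. Below is a minimal Python sketch of that comparison. The sample header text and the `OLDER_TOOL_FEATURES` set are illustrative assumptions, not values taken from these drives or from a specific e2fsprogs release.

```python
from typing import List, Set

# Illustrative header in the style of `dumpe2fs -h` output (an assumption,
# not captured from the drives discussed in this thread).
SAMPLE_HEADER = """\
Filesystem volume name:   <none>
Filesystem features:      has_journal ext_attr dir_index filetype extent 64bit flex_bg sparse_super large_file huge_file metadata_csum
Filesystem state:         clean
"""

# Hypothetical list of features the older e2fsck understands (an assumption).
OLDER_TOOL_FEATURES = {
    "has_journal", "ext_attr", "dir_index", "filetype", "extent", "64bit",
    "flex_bg", "sparse_super", "large_file", "huge_file",
}


def parse_features(header: str) -> Set[str]:
    """Return the feature names from the 'Filesystem features:' line."""
    for line in header.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Filesystem features":
            return set(value.split())
    return set()  # no features line found


def unsupported_features(fs_features: Set[str], tool_features: Set[str]) -> List[str]:
    """Features present on the filesystem that the tool does not know."""
    return sorted(fs_features - tool_features)


if __name__ == "__main__":
    print(unsupported_features(parse_features(SAMPLE_HEADER), OLDER_TOOL_FEATURES))
```

Any feature this reports would point to a tooling mismatch on that drive. It would not explain corruption on drives formatted from the CentOS 7 host.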

dswartz · Jan 2, 2022

Maybe I'm misunderstanding you. It sounds like you're trying to mount the same filesystem on 2 hosts at the same time? If so, you'd need a cluster-aware FS.

robk · Jan 2, 2022

That is correct, I'm mounting the same filesystem on two different hosts at the same time.

Thanks for the tip, I'll have to look into cluster filesystems, I'm not familiar with them.

dswartz · Jan 2, 2022

You can't share data, but you can run cluster-aware LVM, which lets you create LVs on a VG that *is* visible to more than one host at a time. But only one host at a time can access each LV.

robk · Jan 2, 2022

That doesn't seem right, you can't share data? I'm sharing data right now, and I think my corruption issue may be caused by CentOS 7 and 8 filesystem differences. It seems like partitions created on the CentOS 8 machine are incompatible with the CentOS 7 e2fsck and probably some other tools. I need to see if there is a way to effectively downgrade them to CentOS 7 and be done with 8.

dswartz · Jan 2, 2022

Poor phrasing on my part. I don't mean sharing like SMB or NFS. Modern filesystems have all kinds of in-memory metadata and such that makes it impossible to mount a disk or partition on more than one client at a time without corrupting things. You are 100% guaranteed to get filesystem corruption that way. That's why cluster-aware filesystems were created. The reason they aren't more widely used is that cluster-safe operations are not free - there is always a performance cost involved somewhere...

MBastian · Jan 2, 2022

Let's take a step back. What do you intend to do with the shared filesystem? For example, even a cluster-aware filesystem won't help you if you intend to write to a SQL database from two machines.

robk · Jan 3, 2022

Maybe I'm misreading you, but I took "you can't share data" as meaning that even with a cluster-aware filesystem, it wouldn't be possible to share data.

It's for a Chia JBOD setup, so I just need to fill the drives up with data once, and then they are pretty much read-only as the plots get queried. It's actually held up surprisingly well so far given its current configuration.

acquacow · Jan 3, 2022

You can't share something like ext between two hosts both in RW mode. One host can be read only if you wish, but if both are mounted read/write, you don't have anything to handle distributed lock management between the hosts. You're going to get journal and file corruption all the time.

You need a central server running NFS shared between the two hosts, or need to run a clustered filesystem, or use a solution like DRBD to mirror data between the hosts.

I wouldn't trust any of that data that you shared between the two hosts to not have lots of corruption.
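To find which devices are currently mounted read/write on both hosts, the contents of `/proc/mounts` from each host can be compared. A minimal Python sketch, assuming the text of each host's `/proc/mounts` has been copied into a string and that a shared device shows up under the same path on both hosts (for example an LVM device-mapper name). The function names here are hypothetical.

```python
from typing import Dict, List


def rw_mounts(mounts_text: str) -> Dict[str, str]:
    """Map device -> mount point for every entry mounted read/write.

    Expects lines in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    result: Dict[str, str] = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, _fstype, options = fields[:4]
        if "rw" in options.split(","):
            result[device] = mountpoint
    return result


def writable_on_both(host_a_mounts: str, host_b_mounts: str) -> List[str]:
    """Devices that appear read/write in both hosts' mount tables."""
    return sorted(set(rw_mounts(host_a_mounts)) & set(rw_mounts(host_b_mounts)))
```

Any device this returns is mounted read/write on both hosts at once, which is the situation described above as unsafe for ext4.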

MBastian · Jan 3, 2022

acquacow said:
You can't share something like ext between two hosts both in RW mode. One host can be read only if you wish.

Even that is risky, as the host with the read-only mount does not know when its caches and indexes are becoming invalid.

robk · Jan 3, 2022

What if I mount the filesystem as read-only on both hosts, think I'll still get corruption?

acquacow · Jan 4, 2022

Both in RO mode would be fine. I do that in production a lot. As long as only a single host can write to it, you're fine. It's done a lot in the Fibre Channel/iSCSI SAN world for rapid fail-over.
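A read-only ext4 mount may still replay the journal if the filesystem was not unmounted cleanly, so read-only mounts of shared storage are often combined with the `noload` option. A minimal Python sketch that only builds the `mount` argument list, without running anything. The device and mount point names are hypothetical, and whether `noload` is appropriate for a given filesystem is an assumption to check first, since skipping replay on an unclean filesystem can expose inconsistencies.

```python
from typing import List


def readonly_mount_argv(device: str, mountpoint: str,
                        skip_journal: bool = True) -> List[str]:
    """Build (but do not execute) a read-only ext4 mount command."""
    options = ["ro"]
    if skip_journal:
        # noload: do not load or replay the ext4 journal at mount time
        options.append("noload")
    return ["mount", "-t", "ext4", "-o", ",".join(options), device, mountpoint]


if __name__ == "__main__":
    # Hypothetical device and mount point, for illustration only.
    print(" ".join(readonly_mount_argv("/dev/vg01/plots", "/mnt/plots01")))
```

The resulting list could be passed to `subprocess.run` as root on each host once the drives have been filled and unmounted cleanly by the single writer.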
