Hi, just wanted to get some insight if possible. I'm running TrueNAS SCALE virtualized in Proxmox and have been since around April, maybe a little longer. I'm using a Supermicro X9SRH-7F motherboard, which has an onboard LSI 2308 SAS controller connected to a BPN-SAS846-EL1 backplane with IBM ssg h0vh600 SAS 2.5 inch 600GB drives: 6 drives in a pool with 1 vdev, ZFS RAIDZ2. The data on the server is not mission critical, so if my methods have resulted in bad data it's not a big deal; I have 3 other copies of all the data on different drives in a PC and on externals.
Anyway, about a week ago TrueNAS emailed me that my pool was degraded, and I found 2 drives faulted and the rest degraded. When I checked the status I had multiple read and write errors but no checksum errors. I swapped out the faulted drives for spare drives of the same make and size, resilvered them, and ran a scrub and SMART tests, and both came back fine. In less than 24 hours the same thing happened: 2 faulted drives, different ones this time, and 4 degraded. I ran SMART tests and a scrub again with zero problems. The scrub didn't even list any files with errors. After all the tests finished I cleared the ZFS errors again so the pool went back to online and OK status, and then 3 hours later the same thing happened. At this point 2 different drives again showed as faulted, so I was pretty much at the point of thinking I have a controller, cable, or backplane issue.
I didn't have a different reverse breakout cable to use, so I installed my LSI 9207-8i, which has the same SAS 2308 chipset, passed it through from Proxmox, and connected it with a mini SAS to mini SAS cable. It has run error free for 6 days now, so I'd say the backplane is fine and the problem is down to either the onboard LSI controller or the reverse breakout cable. I just received a new reverse breakout cable from StarTech; I have no clue what brand the other cable was. I have no real idea of any particular brand a guy might want to use, but new and different is a good start.
I have now installed the new reverse breakout cable and am waiting to see if I have any issues. I am more or less looking for some thoughts on whether there is anything else I can check. I also wanted an opinion on using this motherboard: if I determine the onboard LSI HBA is the issue, would you trust that it won't cause any other problems even if it isn't in use? Also, yes, I know the virtualized thing is not the most ideal situation, but I haven't had any issues for months, and I hadn't done any updates on the day of the original problem to Proxmox, TrueNAS, or any of the apps I have in TrueNAS.
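While waiting on the new cable, one way to keep an eye on this would be to tally the READ/WRITE/CKSUM columns from `zpool status` and flag any device that shows read or write errors with a clean checksum count, which is the pattern described above. Below is a minimal Python sketch of that idea. The sample text, the pool name `tank`, the device names, the counts, and the helper names `parse_errors` and `io_only_errors` are all made up for illustration; they are not output from this system. Counts are kept as strings because `zpool status` may abbreviate large values.

```python
import re

# Hypothetical `zpool status` excerpt; pool name, devices and counts are invented.
SAMPLE = """\
  pool: tank
 state: DEGRADED
config:

	NAME        STATE     READ WRITE CKSUM
	tank        DEGRADED     0     0     0
	  raidz2-0  DEGRADED     0     0     0
	    sda     FAULTED      3    12     0  too many errors
	    sdb     DEGRADED     1     4     0  too many errors
	    sdc     ONLINE       0     0     0

errors: No known data errors
"""

# A config row: name, a known state keyword, then the three error counters.
ROW = re.compile(
    r"^\s+(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)"
)

def parse_errors(text):
    """Return {name: (state, read, write, cksum)} for each config row."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            name, state, read, write, cksum = m.groups()
            rows[name] = (state, read, write, cksum)
    return rows

def io_only_errors(rows):
    """Names with nonzero read or write errors but a zero checksum count."""
    return [
        name
        for name, (_state, read, write, cksum) in rows.items()
        if (read != "0" or write != "0") and cksum == "0"
    ]

if __name__ == "__main__":
    for name in io_only_errors(parse_errors(SAMPLE)):
        print(name)
```

On a real system the sample text could be replaced with the output of `subprocess.run(["zpool", "status"], capture_output=True, text=True).stdout`, run on a schedule so the counters are logged before they get cleared.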
Sorry for the long-winded post, and thanks for any insight anyone has. This is just my hobby in my downtime. I am not in the IT sector, but I rather enjoy trying to run different servers and troubleshoot things.