Got my replacement backplane and it came with the metal shield/clip for J0 ripped off, lol. I could maybe have soldered it back on myself, or even still gotten the connection in, but meh, I paid for a fully functional and intact backplane, not jank... I've already got it packed up to return and ordered a second one from the same eBay seller. Going to have to wait till next week and hope the next one comes undamaged. Ordering parts off eBay can be a tedious experience, that's for sure.
Thinking I may put my original backplane and triple HBA setup into the system to at least get all the disks connected for badblocks testing, since that script will probably take a week+. Didn't think I'd face so many setbacks with my newer backplane, sheesh!
I've been playing around with my original failing BPN-SAS2-846EL to try and figure out what the heck is wrong with it, but I'm not having much luck. I attached one of the high-power fan wall fans to the back of the backplane to blow air directly on the SAS expander chip, thinking maybe it was an overheating issue. With the fan blowing on it I was able to use the backplane just fine. I removed the fan, waited a while, and then no disks could be detected any longer. TrueNAS was spitting errors like crazy. I realize now this same scenario happened previously: everything booted and was working fine, but then I removed a disk and TrueNAS could no longer tell whether the disk was there or not, so I got a constant flood of errors about being unable to find the drive path, basically.
I felt like perhaps I had found the root cause, since removing airflow caused the backplane to eventually stop working. However, after testing a few reboots with the fan blowing onto the expander heatsink at full blast, it still had the same issues, so I guess not. I truly wonder what could be wrong with the backplane. The SAS expander? Bad caps? (They all look fine.) Like I said, even if I end up never using it I'd love to figure out what the heck the root cause is.
I also experimented with adding high-volume airflow to the HBA, just in case the HBA was somehow the issue, but that didn't make a difference either.
I wonder if something else on the backplane is thermal related. I can't think of anything but the expander, though.
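One way to pin down exactly when the disks drop off, so it can be lined up against when the fan was added or removed, would be a small logger like the one below. This is only a rough sketch under assumptions: it guesses that whole disks show up under /dev as sda/sdab (Linux-based TrueNAS) or da0/ada0 (FreeBSD-based TrueNAS), and the function names are made up for illustration.

```python
import glob
import os
import re
import time
from datetime import datetime

# Assumption: whole-disk names look like sda/sdab (Linux) or da0/ada0 (FreeBSD).
WHOLE_DISK = re.compile(r"^(sd[a-z]+|a?da\d+)$")


def whole_disks(names):
    """Keep only whole-disk names, dropping partitions such as sda1 or da0p2."""
    return {n for n in names if WHOLE_DISK.match(n)}


def diff_snapshots(previous, current):
    """Return (appeared, disappeared) as sorted lists of device names."""
    return sorted(current - previous), sorted(previous - current)


def snapshot(dev_dir="/dev"):
    """List the whole disks currently present under dev_dir."""
    names = (os.path.basename(p) for p in glob.glob(os.path.join(dev_dir, "*")))
    return whole_disks(names)


def watch(interval_s=30):
    """Poll forever and print a timestamped line whenever the disk set changes."""
    previous = snapshot()
    stamp = datetime.now().isoformat(timespec="seconds")
    print(f"{stamp} start: {len(previous)} disks")
    while True:
        time.sleep(interval_s)
        current = snapshot()
        appeared, disappeared = diff_snapshots(previous, current)
        if appeared or disappeared:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} now {len(current)} disks, +{appeared} -{disappeared}")
        previous = current
```

Calling `watch()` from a shell session and noting the time the fan goes on or off would at least show whether the dropouts follow the airflow change by a consistent delay or happen at random.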
As a fun experiment I took the heatsink off my HBA (AOC-S3008L-L8e+), which was very easy: just push out the plastic pins holding it on. The thermal paste had turned into rock-hard cement. I cleaned it up and put on new thermal paste. The HBA still works just fine and shows the same heartbeat LED status as ever, so I don't know if it will actually make a difference, but I figured why not. I've read some bad stories of HBAs overheating and causing inline data corruption issues. From looking at the HBA documentation, it appears I have, specifically, the AOC-S3008L-L8e+ model, which can report its temperature to IPMI somehow? The documentation states:
Code:
On systems that support IPMI software detection of multiple add-on cards, each
add-on card's PCI-E slot must have an I²C multiplexer chip. Check the motherboard
manual for each PCI-E slot. Install the AOC-S3008L-L8e+ in a PCI-E slot with a
multiplexer chip for correct software detection.
Referring to the block diagram for my X11SPI-TF, I don't believe I have that required I²C multiplexer chip? Checking the manual for the motherboard, I can't see anything that sounds like it would let me get that working. I'd love to be able to monitor the thermals for the HBA, so if I'm missing something here or if someone has a solution I'm all ears. I see others have attempted this without much success.
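On the HBA temperature question: if the BMC does expose a reading for the card, it should appear alongside the other temperature sensors in the pipe-separated table that `ipmitool sensor` prints. A minimal sketch for pulling the temperature rows out of that table is below. The sample rows are made up for illustration, and the sensor name `AOC_SAS Temp` in particular is hypothetical. Whether this board reports the card at all is exactly the open question.

```python
import subprocess


def parse_sensor_table(text):
    """Parse pipe-separated `ipmitool sensor` rows into {name: degrees_C}.

    Only rows whose unit column is "degrees C" and whose reading is numeric
    are kept; rows reporting "na" are skipped.
    """
    temps = {}
    for line in text.splitlines():
        cols = [c.strip() for c in line.split("|")]
        if len(cols) < 3 or cols[2] != "degrees C":
            continue
        try:
            temps[cols[0]] = float(cols[1])
        except ValueError:
            continue  # e.g. "na" when the sensor has no reading
    return temps


def read_bmc_temps():
    """Run `ipmitool sensor` locally (needs ipmitool installed and BMC access)."""
    result = subprocess.run(
        ["ipmitool", "sensor"], capture_output=True, text=True, check=True
    )
    return parse_sensor_table(result.stdout)


# Hypothetical sample rows, made up for illustration only.
SAMPLE = """\
CPU Temp         | 45.000     | degrees C  | ok    | 0.000 | 0.000
PCH Temp         | 52.000     | degrees C  | ok    | 0.000 | 0.000
AOC_SAS Temp     | na         | degrees C  | na    | na    | na
FAN1             | 1500.000   | RPM        | ok    | 300.000 | 500.000
"""

print(parse_sensor_table(SAMPLE))
```

A row that exists but only ever reads "na" would suggest the BMC knows about the sensor slot but is not getting data from the card, which is a different situation from the row being absent entirely.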
I also see that the BPN-SAS2-846EL backplane itself has three fan connections. It would be nice to use those to free up connections on the motherboard. However, there doesn't appear to be any way to control the fan speeds when they're connected that way. I saw some posts and topics on this while googling, and it looked like most people gave up on it as a possibility. It would be nice to be able to have the fans ramp up based on disk drive heat.
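For the "ramp up based on disk heat" idea, the logic itself is simple enough to sketch, even though it would only help with fans on headers that can actually be controlled (so not the backplane headers as described above). The thresholds, duty percentages, and function name here are all assumptions picked for illustration. Reading the disk temperatures and applying the resulting duty cycle are deliberately left out.

```python
def fan_duty_for_temps(disk_temps_c, low_c=35.0, high_c=45.0,
                       min_duty=30, max_duty=100):
    """Map the hottest disk temperature to a fan duty cycle in percent.

    At or below low_c the fans idle at min_duty, at or above high_c they
    run at max_duty, and in between the duty ramps linearly. With no
    readings at all, fail safe and return max_duty.
    """
    if not disk_temps_c:
        return max_duty
    hottest = max(disk_temps_c)
    if hottest <= low_c:
        return min_duty
    if hottest >= high_c:
        return max_duty
    fraction = (hottest - low_c) / (high_c - low_c)
    return round(min_duty + fraction * (max_duty - min_duty))


# Example: the hottest of three disks sits halfway up the ramp.
print(fan_duty_for_temps([36, 40, 38]))
```

Driving the curve from the hottest disk rather than the average is a conservative choice: one badly cooled bay is enough to raise the fan speed.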
I was wondering if I could somehow pry the heatsink off the backplane controller to see if there was dusty, dried-out thermal paste under it, but it doesn't appear to be attached in a removable way, so I don't think that's an option. Is the heatsink glued/soldered directly onto/over the chip? I tried to google as much as I could on faulty backplanes but didn't come up with much. I'm interested in exploring ways to troubleshoot the backplane further, just for my own knowledge to be honest. If anyone has ideas let me know! I do see there's a mini-USB connection on the backplane, which apparently is for use with a mini-USB-to-RJ45 connection (from reading SuperMicro FAQs), but I couldn't find info on anyone using this to diagnose an issue. I'd be interested to understand how that connection works and if I could do it myself.