Wanted: 9207-8i (or equivalent) in Europe

Mastakilla

Member
Jul 23, 2019
36
8
8
Hi all,

Where can I buy one of these HBAs from a reliable source in Europe? This would already be my 3rd HBA in less than a year, so I've seen first hand the kind of crap that is generally being sold...

I would prefer one that is brand new / original (if those still exist) or that comes straight out of a datacenter. No China imports or 2nd-hand cards from a desktop, please...

Thanks!

Fyi:
More details on my struggles with my current HBA here:
 

lhibou

Member
Jun 12, 2019
46
19
8
Damn, that's some crap luck. I've had nothing but good experiences with used HP H220s from China so far, but I don't buy the cheapest one available. I think I paid about 45 USD each from a vendor, and I'm 99% certain they are decommissioned server pulls.
 

Mastakilla

For the first one I paid about 50 euro, but I screwed that one up myself (I ran it without active cooling and I accidentally broke off a resistor). Although it still worked, I didn't trust my data with it anymore.
For the second one I paid about 65 euro. The seller was open about buying them from Hong Kong, but he did tell me he had good experience with his vendor. Not sure if it was just bad luck or if it really is a crappy product. He's not actively selling anymore atm and hasn't responded to my messages so far (it's only been 2 days though).

I'm now thinking of buying a 90 euro one from eBay.de... An LSI 2308 based one this time, as they should be "less old". Hopefully I'll be luckier this time...
 

gb00s

Active Member
Jul 25, 2018
570
193
43
Malta
I can't confirm the 'China crap' you are talking about.

I bought 4 HP H220s in China after the 3 used Dell PERC H310s in IT mode that I had bought from here all failed. Lots of problems with failed drives and pools. Lots of time spent on investigations. At some point I just stopped complaining and exchanged all of them. Now I am working with the 'China crap' and I don't have any issues with disks and pools. They are all still working fine.

On the other hand, you are buying used stuff, so ....
 

Mastakilla

Perhaps it's the type instead of the country of origin then? My current HBA is also a Dell PERC H310 and is having problems similar to the ones yours had. The problem is that it is very hard to know what's "new" and what's "so-called-new" or 2nd hand.
 

Mastakilla

I just bought a 2nd-hand "Broadcom 9207-8i SAS2308 6G SATA SAS HBA PCIe x8 3.0 LSI RAID IT Dell 0VGXKD" for 100 euro.

All the "so-called-new" HBAs come from the UK or from outside the EU, and it would probably take ages to get them shipped to the EU around this time.

My pool is already degraded and I need a solution fast :/ Hope it works, as even for this price, they don't accept returns...
 

Mastakilla

So the issue isn't the HBA, as even after replacing my Dell H310 with an LSI 9207-8i, I still have the data corruption issue... :(
 

gb00s

You change HBAs like I change my undergarments.

Just realized you are using a 3900X for a NAS, with a 'server' kind of motherboard with UDIMMs (ECC) for ZFS. :rolleyes: :oops: I don't know if it's relevant, but your RAM has CTD in the model number, while the ASRock support site only lists CTDQ. A model-number difference like that mattered a lot on my Asus server board, where one module was RDIMM and the other LRDIMM. It may not matter in your case.
 

Mastakilla

I certainly wish I would never have to replace my HBA ever, but unfortunately that hasn't been the case... :confused:

I'm not sure where you found that the M391A4G43MB1-CTD are RDIMM or LRDIMM. As far as I could find, both of them are the same and they are both UDIMM.

Here are some sources:

This website says that M391A4G43MB1-CTD is also known as M391A4G43MB1-CTDQ.

@fridgespacer also says they are the same.

@ReturnedSword confirms that M391A4G43MB1-CTD are UDIMM

Also on the Samsung website they are confirmed to be UDIMM. No mention of the M391A4G43MB1-CTDQ at all on the Samsung website...
 

gb00s

... I'm not sure where you found that the M391A4G43MB1-CTD are RDIMM or LRDIMM. As far as I could find, both of them are the same and they are both UDIMM ...
I did not say they are LRDIMMs and RDIMMs. In my case there was a difference in the model number which led to RDIMMs versus LRDIMMs. In your case both are UDIMMs. Don't worry. I was just pointing to the possible difference between CTD and CTDQ, which may lead to some problems between board and memory, and then down to ZFS. You won't be the first to have weird issues with Ryzen/Threadripper on 'server' boards and ZFS.
 

Mastakilla

Thanks a lot for trying to help!! Much appreciated!!

I also suspected from the start that the memory could be the issue, which is why I went as far as:
  • validating that ECC error reporting works, by overclocking my memory (and yes, it does work and ECC memory errors are reported when they occur)
  • extremely underclocking the memory below SPD settings (and then the issues still occur)
That reassured me quite a bit that it is not memory related, although I am aware that this perhaps isn't a 100% assurance.
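
For anyone who wants to repeat the ECC reporting check, here is a minimal sketch of how the corrected / uncorrected error counters could be read. It assumes a Linux system with the EDAC driver loaded, which exposes per-controller `ce_count` and `ue_count` files under `/sys/devices/system/edac/mc`; it does not apply as-is to FreeBSD-based FreeNAS. The function name `read_edac_counts` is made up for this example.

```python
from pathlib import Path


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce": int|None, "ue": int|None}} from EDAC sysfs.

    "ce" is the corrected-error count, "ue" the uncorrected-error count.
    An empty dict means the directory is missing (no EDAC driver, or not Linux).
    """
    counts = {}
    root = Path(edac_root)
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for key, filename in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                entry[key] = int((mc / filename).read_text().strip())
            except (OSError, ValueError):
                entry[key] = None  # counter missing or unreadable
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    result = read_edac_counts()
    if not result:
        print("No EDAC memory controllers found")
    for name, entry in result.items():
        print(f"{name}: corrected={entry['ce']} uncorrected={entry['ue']}")
```

If the corrected count goes up while the memory is deliberately overclocked, that suggests error reporting works end to end; a count that stays at zero does not by itself prove the memory is fine.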

I am considering trying my desktop's non-ECC memory in my NAS. This memory isn't on the QVL at all (it's G.Skill PC3600 memory), but I guess if I underclock it a bit, it could still be a good test.

Also, I know that this setup can work properly, as I ran FreeNAS without issues for more than a year (with monthly scrubs) before the issues started occurring...
 

gb00s

I'm far from an experienced ZFS user, but even if ZFS should work without ECC memory, it's 'some kind of recommended' to use it with ECC memory.

Look, I had lots of problems with my wife's company and their dedicated servers running on Hetzner with Ryzen 3 CPUs on ASRock server boards and ECC memory. The internet is full of these issues. Constant unexplained reboots of the servers. Even Hetzner didn't know what the issues were, even with a lot of manpower on their investigation. We were not alone. We moved back to Intel CPUs and have no issues. Less peak power, but reliable performance.
 

Mastakilla

I'm certainly not planning on using my desktop's non-ECC memory in my NAS permanently :) Only as a temporary test, to see if it solves the filesystem data corruption issue / reboots. I know ZFS prefers ECC memory and I certainly plan to keep on using ECC memory once I get this shit solved... ;)

I've seen quite some reports of strange instability or other problems when using Ryzen on ASRock Rack server motherboards at
ASRock Rack X470D4U2-2T (a lengthy thread of which I read almost everything)
But as far as I could see, most of those reports were related to DOA products, incompatible PSUs / memory or wrong BIOS settings. I'm also seeing a lot of people successfully using these motherboards in their servers without stability issues.

I tried tackling / testing all of the above mentioned issues one by one during my troubleshooting, but so far I wasn't able to resolve it. Again, I do know it can be stable, as it ran perfectly stable for about a year... I hope I can find the problem and solve it someday, but I'm getting pretty tired of this shit though :mad: I've already wasted so much time and also money on trying to resolve this...
 

Mastakilla

fyi:
It seems like the root cause was a broken CPU. After replacing the CPU (first temporarily with my desktop's Ryzen, later by RMAing it for a new Ryzen 3600), the problems are gone.

So the problem was not caused by TrueNAS <-> AMD incompatibility. Although perhaps not perfect, my experience with TrueNAS <-> AMD compatibility has not been a bad one.

It is a bit concerning that the data corruption wasn't detected by anything at the moment it happened. Only the already-corrupted data got detected afterwards, by scrub. But as I wasn't even able to trigger any PCIe AER errors in Linux or Windows either (I tried this using my Optane instead of the HBA), I am not sure exactly which part of the CPU was broken and whether it is a TrueNAS issue, a platform (AMD) issue or perhaps a combination...
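
Since scrub was the only thing that noticed the corruption, an independent file-level check can be a useful second opinion while troubleshooting. Below is a minimal sketch, not a replacement for a ZFS scrub: it records SHA-256 digests of all files under a directory and later reports files whose digest changed. The function names (`sha256_of`, `build_manifest`, `find_mismatches`) are made up for this example, and it assumes the files are not supposed to change between the two runs. Keep in mind that on a faulty CPU the hashing itself can go wrong, so a mismatch only says something is off, not where.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_mismatches(root, manifest):
    """Return sorted names from the manifest that are now missing or differ."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

Typical use would be to call `build_manifest` on a known-good copy of the data, store the result somewhere off the suspect machine (for example as JSON), and run `find_mismatches` after each round of hardware changes.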