What are your experiences with NEMIX RAM?


alex_stief

Well-Known Member
May 31, 2016
884
312
63
39
It's not even that cheap I think? Just the cheapest (or sometimes only?) RDIMM on Newegg.
I'd pick used RAM from reputable brands over this, if I really had no way to get it new for a reasonable price.
 

erock

Member
Jul 19, 2023
84
17
8
Quoted from an earlier post:
"Waitaminute... during all this time of trying to diagnose cryptic and unexplainable errors and resets, we have been dealing with Nemix RAM?
Because f them. If you still can, return that memory."
My goal for using NEMIX is to accelerate some experimental software work that I may or may not continue, while minimizing cost. My expertise is scientific computing and CFD, not computing hardware, so bear with me as I travel into this new domain. I may still be able to return all of my NEMIX, but my issues have been resolved without changing the RAM. I also have some Samsung RAM and have used this to test whether NEMIX RAM was the root of any issue.

That said I am at the point in my experimental project where I need to make a call on continuing with NEMIX or buying “good stuff”. I haven’t had any issues with NEMIX RAM yet and I think it would be great for people like me doing DIY stuff to collect some data on NEMIX in a thread like this. I can’t find a complete online database of RAM benchmarks that include data on bad cells, consistency and long-term stability (let me know if you know of one).

My non-expert take on this topic so far is that RAM chips seem to be made by the same handful of companies (Micron, Hynix, etc.) and these budget RAM makers take some shortcuts with testing when manufacturing the sticks. Therefore, there will be more inconsistencies, and DIY testing of the RAM within the return window is critical. But at half the price it is tempting if the return process is reasonable, and from a statistical perspective the budget RAM may make sense for non-critical machines. I don’t have much data on the NEMIX return process, the percentage of bad sticks relative to name brands, or long-term stability, so it is hard to assess objectively. I am also skeptical of online reviews.

I am running Memtest86 on an H11DSi board with BIOS 2.4 and NEMIX Supermicro “compatible” RAM. I will share the results when the tests are done.
 

unwind-protect

Active Member
Mar 7, 2016
424
157
43
Boston
I forgot to mention that although our 4x32 GB sets didn't work out, the machines in question do run flawlessly now with random other Nemix RAM.

I second the notion that they are actually manufactured on the same lines as other brands' RAM. Obviously not tested as thoroughly.
 

erock

Quoted from an earlier post:
"Löl... there is a back story?

I recently saw on eBay a boat load of Cisco-labelled 2666 DDR4 RDIMM 32 GB which was Micron. Bought a few, no issues even after the nasty tests I described above. Really cheap, seller obviously wanted them just gone."
2666 DDR4 RDIMM 32 GB won’t work for my use case. I need to fill all 16 RAM slots in 4 compute nodes (H11DSi + 2x7f52) to maximize memory bandwidth and get the full benefit of 8-channel memory for parallel CFD MPI applications, and I prefer 3200 RDIMM. I also want to stay at or below 256GB per node. The best price I could find so far with a return policy is NEMIX ($35 per stick). We will see how well this batch of NEMIX does with Memtest86. More to come on that.
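To put rough numbers on the constraints described above, here is a minimal Python sketch. It assumes 8 memory channels per socket with one DIMM per channel and a 64-bit data path per channel; the function names are made up for illustration. These are theoretical peaks only, and real CFD codes will see a fraction of them.

```python
# Rough theoretical peak DRAM bandwidth and capacity for a dual-socket node.
# Assumptions (not from the thread): 8 channels per socket, 1 DIMM per channel,
# 64-bit (8-byte) data path per DDR4 channel, ECC bits not counted.
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(mt_per_s: int, channels: int) -> float:
    """Theoretical peak bandwidth in GB/s (10^9 bytes/s) for one socket."""
    return mt_per_s * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


def node_capacity_gb(dimms: int, gb_per_dimm: int) -> int:
    """Total memory in GB when every slot holds the same size DIMM."""
    return dimms * gb_per_dimm


if __name__ == "__main__":
    for speed in (2666, 3200):
        per_socket = peak_bandwidth_gbs(speed, channels=8)
        print(f"DDR4-{speed}: {per_socket:.1f} GB/s per socket, "
              f"{2 * per_socket:.1f} GB/s per dual-socket node")
    # Filling all 16 slots: 16GB sticks stay at the 256GB cap, 32GB sticks do not.
    print("16 x 16GB =", node_capacity_gb(16, 16), "GB")
    print("16 x 32GB =", node_capacity_gb(16, 32), "GB")
```

Under these assumptions, filling all 16 slots with 32GB sticks would double the per-node capacity past the 256GB target, while 16GB sticks land exactly on it.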
 

alex_stief

Memtest86 is not the right tool, at least the free version.
I don't know how exactly the paid version operates. I think it can be used for ECC, but the question remains how the load is applied.

The better and free way to do it has already been posted. Apply heavy load on memory and CPU cores simultaneously, e.g. with stress-ng, and monitor memory errors with EDAC. SuperDoctor also has a section where single- and multi-bit errors are counted.
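As a rough illustration of the monitoring half of that approach, here is a minimal Python sketch that reads the corrected/uncorrected error totals the Linux EDAC subsystem exposes in sysfs. It assumes an EDAC driver is loaded and that the counters live under /sys/devices/system/edac/mc; the function name is made up for this example, and this is not the script referenced earlier in the thread.

```python
from pathlib import Path

# Standard Linux EDAC sysfs location; only present when an EDAC driver is loaded.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_totals(root: Path = EDAC_ROOT) -> dict[str, tuple[int, int]]:
    """Return {controller: (corrected, uncorrected)} for each mcN directory."""
    totals: dict[str, tuple[int, int]] = {}
    if not root.is_dir():
        return totals
    for mc in sorted(root.glob("mc[0-9]*")):
        ce = int((mc / "ce_count").read_text().strip())
        ue = int((mc / "ue_count").read_text().strip())
        totals[mc.name] = (ce, ue)
    return totals


if __name__ == "__main__":
    counts = read_edac_totals()
    if not counts:
        print("No EDAC memory controllers found (is an EDAC driver loaded?)")
    for name, (ce, ue) in counts.items():
        print(f"{name}: CE={ce} UE={ue}")
```

Run it before and after a load test; any increase in CE or UE during the run is what you are looking for.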
 

Stephan

Well-Known Member
Apr 21, 2017
945
714
93
Germany
memtest86 = PassMark version, available in free and paid versions; the free one has limited functionality, but enough for this purpose.

memtest86+ = Open Source version, no good because I assume Intel NDA preventing any implementation of desirable functionality like reading memory controllers' error counters.

Windows memdiag = same basic testing as memtest86+, no good.

If this is some biggish home lab with 4 EPYC 1 TB of RAM or some small company system, money spent on the memtest86 paid version is money very well spent, because otherwise you will be looking FOREVER to diagnose errors nobody else seems to have. And if something isn't working, you send them your logs and they will troubleshoot or point you towards a BIOS that is known to work. By now they must have a sizeable Supermicro EPYC KB history.

Put another way, there are a BOAT LOAD of DIMMs out there which are only marginally working. ECC will correct the errors. Put them under load and/or heat them up, and they will fail beyond what ECC can handle. There are other failure modes too: the memory controller in the CPU itself gone bad, a speck of dust or a cat's footprint on some signal pads, bent pins in the CPU socket.
 

erock

I have been running Memtest86 now for 36 hours on one of my H11DSi+2x7f52 nodes with 256G of NEMIX RAM (MEM-DR416L-HL01-ER32 16GB Memory Compatible With Supermicro from Newegg) and made it through pass 1 and 50% of pass 2 with no errors reported (running the default tests with 4 passes).

Help me choose the next steps of this memory testing adventure (based on all the recommendations so far):
(1) Let Memtest86 (free version) finish and then choose a better test.
(2) Kill Memtest86 (free version), then give the Pro version a try.
(3) Kill Memtest86 (free version), then run the more robust stress-ng/rasdaemon method.

I am leaning toward option (3) since it seems that this will get me to where I want to be sooner rather than later.

My objective is to identify bad sticks ASAP. I e-mailed NEMIX yesterday via the Newegg system and they responded saying that if I encounter any problems with testing they will replace the sticks.

Also, for the stress-ng test, if high temperature is the critical factor leading to memory errors, this may indicate a poor cooling design on my part. Is there a DIMM temperature beyond which we should blame the cooling system instead of the RAM?
 

nexox

Well-Known Member
May 3, 2023
700
289
63
Stephan said: "I assume Intel NDA preventing any implementation of desirable functionality like reading memory controllers' error counters."
Other open source software (e.g. the Linux kernel) can do that stuff with no problem, so it's probably not an NDA issue. I skimmed their GitHub page a week or two ago and the real issue seems to be that there's a lot of slightly different hardware to support and not a lot of volunteer developers to write and maintain that code.
 

erock

Here is an update on testing NEMIX Supermicro compatible RAM, available on Newegg and Amazon with a return policy and a lifetime warranty claim (16GB DDR4 RDIMM ECC 3200, MEM-DR46LD-ER32). For review, I am using three H11DSi mobos with 2x7f52 on 2 boards and 2x7302 on the other, and I plan on expanding this cluster of machines. BIOS version is 2.1 on all but one board with 7f52x2.

Rasdaemon and stress-ng
I ran rasdaemon with stress-ng, using the bash scripts provided by Stephan in the post referenced above, on 82% of memory over 24 hr for 3 computational nodes (each node has 256GB). I may be able to go higher than 82% but need more time to explore this. I set the “ncore” variable in Stephan’s bash script to 30 on these 32 core machines. So this approach exercises the equivalent of 13.12 of the 16 sticks on each node (0.82 x 16). Rasdaemon reported no memory errors, and the CE (corrected errors) and UE (uncorrected errors) columns produced using Stephan’s watcher script showed all zeros (i.e. no errors).
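The coverage figure above is simple arithmetic, sketched here in Python with a made-up helper name. Note one assumption that the number glosses over: on systems that interleave addresses across memory channels, the exercised 82% is probably spread over all DIMMs rather than leaving particular sticks completely untouched, so "13.12 sticks" is best read as an equivalent amount of memory.

```python
def coverage(total_gb: float, fraction: float, dimm_gb: float) -> tuple[float, float]:
    """Return (GB exercised, equivalent number of DIMMs) for a tested fraction."""
    tested_gb = total_gb * fraction
    return tested_gb, tested_gb / dimm_gb


if __name__ == "__main__":
    # Values from the post: 256GB per node, 82% tested, 16GB sticks.
    tested_gb, dimm_equiv = coverage(total_gb=256, fraction=0.82, dimm_gb=16)
    print(f"{tested_gb:.2f} GB exercised, about {dimm_equiv:.2f} of 16 DIMMs")
```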

Memtest86(free)
I ran memtest86 (free) on an H11DSi board with 7f52x2 and BIOS version 2.4 using the default setup for a single core. This test ran for over 35 hours and made it past pass 1 and 50% through pass 2 with no errors reported. The test then became very slow, only moving 2% over 5 hours, so I stopped it. Memtest86 (free) froze at ~1.5hr on the other two boards with BIOS 2.1. These slowness/freezing issues are well documented for Supermicro boards (see link above). Resolving them will require time invested with the Pro version and working with PassMark tech support.

memtester
I ran a single pass of memtester (memtester 200G 1) on each node with no errors reported.

Results Summary
Given that the rasdaemon/stress-ng approach is considered the best immediately available tool by several posters for identifying memory problems under load, I think one could conclude that my 3 batches of NEMIX Supermicro compatible RAM show no evidence of poor quality.

It is hard to find this RAM in bulk at reasonable prices with a lifetime warranty claim. Supermicro is out of stock for 16GB tested name-brand sticks. Similar name-brand Hynix RAM is available on eBay in large quantities from tm_space for $63 per stick with no lifetime warranty claim. However, the price from NEMIX is currently $35 per stick on Amazon and Newegg and large batches can be purchased. OWC also provides equivalent RAM on Amazon for $39 per stick, available in large batches. If OWC provides better customer service this is a path to consider (I may try a couple of these sticks). Several other sites that offer name-brand RAM sticks do not have enough supply to satisfy my use case (if you find one please share).
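For a sense of scale, here is a small Python sketch of what the per-stick prices quoted above add up to when every slot is filled. It assumes 16 sticks per node and the three nodes tested so far; the names are made up for illustration and the prices are only the ones listed in this post.

```python
# Per-stick prices (USD) quoted in the post; anything else would be a guess.
PRICES_USD = {"NEMIX": 35, "OWC": 39, "Hynix (eBay)": 63}
STICKS_PER_NODE = 16  # all slots filled on an H11DSi


def cluster_cost(price_per_stick: float, nodes: int,
                 sticks_per_node: int = STICKS_PER_NODE) -> float:
    """Total RAM cost for a cluster with every slot populated."""
    return price_per_stick * nodes * sticks_per_node


if __name__ == "__main__":
    nodes = 3  # number of nodes tested in the post
    base = cluster_cost(PRICES_USD["NEMIX"], nodes)
    for brand, price in PRICES_USD.items():
        total = cluster_cost(price, nodes)
        print(f"{brand}: ${total:,.0f} total, {total / base:.2f}x NEMIX")
```

Changing `nodes` shows how quickly the gap grows as the cluster expands.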

I see no evidence indicating I should pay double to more than triple for name-brand RAM for my use case. But by all means one should be ready to go with tools like memtester, Memtest86, rasdaemon and stress-ng, and exercise return and warranty policies. If I were an IT professional whose job it was to manage multiple machines for angry mobs of office workers, I would definitely go for name-brand RAM for peace of mind. However, for a small experimental cluster that requires lots of RAM sticks, has a narrow application focus and is used by a small team of scientists comfortable with testing and replacing hardware, I think the cheaper stuff should be considered with eyes wide open, as it may be a key factor in reaching objectives given time and budget constraints.

Moving forward with any RAM, I plan on using Stephan’s scripts with rasdaemon and stress-ng. This will allow me to have awareness of memory errors during real-life applications and take action when necessary. You can map the rasdaemon EDAC paths to physical DIMM names using the following link, which will make it easier to identify a bad stick:

Monitoring ECC memory on Linux with rasdaemon
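Independently of the linked guide, the kernel's EDAC sysfs tree can also be read directly to see per-DIMM error counts next to whatever labels are currently registered. The following Python sketch assumes the EDAC driver exposes per-DIMM files (dimm_label, dimm_location, dimm_ce_count, dimm_ue_count) under /sys/devices/system/edac/mc/mc*/dimm*; whether they exist, and whether the labels are meaningful, depends on the driver and on the labels having been configured. The function names are made up for this example.

```python
from pathlib import Path

# Standard Linux EDAC sysfs location; per-DIMM entries depend on the driver.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def _read(path: Path) -> str:
    """Return stripped file contents, or an empty string if the file is absent."""
    return path.read_text().strip() if path.is_file() else ""


def read_dimm_errors(root: Path = EDAC_ROOT) -> list[dict]:
    """List per-DIMM label, location and error counts found under root."""
    rows = []
    if not root.is_dir():
        return rows
    for dimm in sorted(root.glob("mc[0-9]*/dimm[0-9]*")):
        rows.append({
            "path": f"{dimm.parent.name}/{dimm.name}",
            "label": _read(dimm / "dimm_label"),
            "location": _read(dimm / "dimm_location"),
            "ce": int(_read(dimm / "dimm_ce_count") or 0),
            "ue": int(_read(dimm / "dimm_ue_count") or 0),
        })
    return rows


if __name__ == "__main__":
    for row in read_dimm_errors():
        flag = "  <-- check this stick" if row["ce"] or row["ue"] else ""
        print(f'{row["path"]} [{row["label"] or "no label"}] '
              f'CE={row["ce"]} UE={row["ue"]}{flag}')
```

Once the labels match the silkscreen names on the board, a nonzero count points straight at the slot to pull.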

I plan on setting up these configuration files for all machines, which requires systematically removing and installing DIMMs (annoying but I think useful in the long run). I may get Memtest86 Pro but am not yet seeing the advantage of this relative to rasdaemon and stress-ng for my use case.

I will add additional updates to this post if I encounter any errors or performance issues. Thank you RolloZ170, Stephan and alex_stief for educating me on approaches for testing RAM.
 

erock

Here is a quick update on this thread. I decided to use NEMIX for most of my project and tested dozens of 16GB sticks (~$32-35 per stick). Apart from two bad sticks, which NEMIX replaced without any issues (I had to pay $10 to return the RAM), I have not encountered any memory issues even while running very high workloads over multiple days, and performance seems comparable to my Samsung RAM with similar specs. Crucial is offering similar Supermicro RAM at $54 per stick and I plan on experimenting with this for my next build to get a second perspective on relative performance: Micron 16GB DDR4-3200 RDIMM 2Rx8 CL22 | MTA18ASF2G72PDZ-3G2R | Crucial.com. Does anyone have experience with returning bad sticks to Crucial in case I encounter issues?