NVMe heatsink / cooling

MrCalvin

IT consultant, Denmark
Aug 22, 2016
75
15
8
49
Denmark
www.wit.dk
I would like to hear your opinions and ideas on NVMe cooling solutions; the most obvious is a heatsink, I guess.
First of all, is it necessary?
In the reviews I've read of NVMe drives with pre-installed heatsinks, e.g. the WD Black SN750, the heatsink didn't seem to make much difference to performance (kind of disappointing). Endurance might be another matter :p
E.g. from eteknix.com
I have a pretty well ventilated system without a high-end graphics card, so it's easy for the drive to keep up with things in my setup. Without the heatsink, the drive came in at 53 degrees Celsius and 490K IOPS. In comparison, with the heatsink attached, the drive only reached 47 degrees and delivered 500K IOPS.
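To put the quoted figures in proportion, a quick calculation helps. This is a minimal sketch: the variable names are mine, the numbers are the ones in the quote above.

```python
# Figures quoted from the eteknix review above.
no_sink = {"temp_c": 53, "iops": 490_000}
with_sink = {"temp_c": 47, "iops": 500_000}

# Absolute temperature drop and relative IOPS gain with the heatsink fitted.
temp_drop_c = no_sink["temp_c"] - with_sink["temp_c"]
iops_gain_pct = (with_sink["iops"] - no_sink["iops"]) / no_sink["iops"] * 100

print(f"Temperature drop: {temp_drop_c} degC")
print(f"IOPS gain: {iops_gain_pct:.1f} %")
```

So the heatsink buys a few degrees but only around two percent more IOPS in that review.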
Also, the NAND chips and the controller seem to have different cooling needs: the former like the heat, the latter doesn't. Can anyone contribute knowledge about this (confirm/deny)?

Solutions?
I've been experimenting with a custom-made heatsink, but the fixing is the difficult part. Using strips works reasonably well, but they take up some space and can easily bend the PCB, which puts stress on the chips' solder joints and in the worst case breaks them over time!
And of course I insert thermal pads in between.
IMG_20190712_143259720.jpg IMG_20190712_143245534.jpg IMG_20190712_143237640.jpg IMG_20190712_143229077.jpg

The WD Black heatsink and fixing:
WD-Black-SN750-1TB-Photo-details-heatsink.jpg
That got me thinking about 3D printing something similar.
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
1,395
502
113
The prevailing wisdom seems to be that NAND flips faster when it's hot and retains data better when it's cold (some people even say it helps improve flash endurance); ergo, warm NAND isn't something to worry about, assuming it's plugged in and data retention isn't your priority. However, like you, I suspect this is mostly old-wifey hearsay until I see some proof of it. I've had a cursory look but never found anything to convince me it was true.

Controllers are almost always the hot-spot in an SSD, and some of them definitely do get incredibly toasty (IIRC I saw a review of a PCIe 4.0 SSD recently that regularly hit >100 °C and throttled as a result, albeit only under extremely heavy sequential writes), which a heatsink can help with.

FWIW I'm in the process of buying some of the EK-M.2 heatsinks, not because I'm overly worried about the heat my P4101s will generate (they'll be in a spot without a huge amount of airflow but they'll almost never be under any heavy load), but mostly because they're not expensive and really pretty. They use steel clips and two slabs of aluminium, so as long as your SSD is relatively even (i.e. any small differences in the height of the components can be forgiven by squishing the thermal pad a bit more in places) they're fine.

If you know your NVMe drive will come under repeated heavy loads and don't want to risk the controller throttling, you can splurge on a swanky heatsink like the above, or cheap out and stick a small MOSFET or BGA heatsink on the controller alone using thermal adhesive.
 

MrCalvin

I would like to see your review of the EK-M.2 heatsink when you receive it, especially regarding the distance between the heatsink and the controller/NAND chips. There doesn't seem to be much spring effect and there is no height adjustment, but perhaps the thermal pads are rather thick, e.g. 2-3 mm. For optimal heat dissipation, though, the thinner the thermal pad the better; I use 0.5 mm and 1 mm myself.
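The "thinner is better" point follows from the usual conduction formula R = t / (k * A). Below is a rough sketch of what that means for the pad thicknesses mentioned; the 6 W/mK conductivity, the 10 mm x 10 mm contact patch and the 3 W of heat are all assumptions picked for illustration, and contact resistance and pad compression are ignored.

```python
def pad_resistance_k_per_w(thickness_mm: float, conductivity_w_mk: float, area_mm2: float) -> float:
    """Conduction-only thermal resistance of a flat pad: R = t / (k * A)."""
    thickness_m = thickness_mm / 1000.0
    area_m2 = area_mm2 / 1_000_000.0
    return thickness_m / (conductivity_w_mk * area_m2)

CONDUCTIVITY_W_MK = 6.0  # assumed pad rating
AREA_MM2 = 10.0 * 10.0   # hypothetical controller contact patch
POWER_W = 3.0            # hypothetical heat flowing through the pad

for thickness_mm in (0.5, 1.0, 2.0, 3.0):
    r = pad_resistance_k_per_w(thickness_mm, CONDUCTIVITY_W_MK, AREA_MM2)
    print(f"{thickness_mm:.1f} mm pad: {r:.2f} K/W, about {r * POWER_W:.1f} K across the pad")
```

Resistance scales linearly with thickness, so under these assumptions a 2 mm pad costs four times the temperature drop of a 0.5 mm pad of the same material.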

I bought some aluminium heatsinks for, as I remember, about $1 each on AliExpress; that's the aluminium heatsink in the picture. I need them for servers, so they don't have to be pretty, and I need a lot.
 

EffrafaxOfWug

From what I've read the thermal pads are 0.5mm and 1mm - the thin one usually meant to go on the bottom where there aren't any ICs and the thick one to go on the top. I'll only be able to tell once I've got all the kit whether it fits nicely or not, or if one could get away with using two 0.5mm thermal pads, but again that's going to depend greatly on the SSD design anyway.

If you're using these entirely in a server scenario, just make sure your M.2s get some degree of airflow and I'm pretty sure you'll be fine, heatsink or no. If your readings say the controllers are overheating then try something like these (I don't know if they still make them; I bought a bag for a song aeons ago and only ever used a few at a time for cooling hot ICs).
 

MrCalvin

Maybe the 3M 8815 thermal tape is the way to go, as used by the BCC9 you linked to. I'll see if I can get some and try it out :)
 

MrCalvin

Been doing some testing.

Test setup:

Disk: NVMe SAMSUNG MZVLB256HAHQ-00000
Thermal pad 1: Arctic 0.5 mm ACTPD00001A, 6 W/mK
Thermal pad 2: Wakefield-Vette ulTIMiFlux Gold, 5 W/mK
Heatsink: custom aluminium, rather small.
Airflow: some, but minimal; about the level I would expect in a cabinet with the fans turned down to a minimum. The fan was running at only 1000 RPM and from a fairly large distance.
Benchmark software: Linux dd, writing and reading 100 GB
Readings: SMART info, sensor 1 (secondary chip) and sensor 2 (the NVMe controller)

I only applied pads to the controller and the secondary chip, not the NAND. The heatsink didn't touch the NAND chips.
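For anyone wanting to log the same two sensors, here is a minimal sketch that pulls them out of smartctl's JSON output. It assumes smartmontools 7.0 or newer (for the `-j` flag) and that the NVMe health log appears under `nvme_smart_health_information_log` with a `temperature_sensors` list, which is how I understand the JSON layout; the sample string is hypothetical and was not captured from the drive in this thread.

```python
import json
import subprocess

def parse_nvme_temps(smartctl_json: str) -> dict:
    """Extract temperatures (degC) from `smartctl -j -a` output for an NVMe drive."""
    data = json.loads(smartctl_json)
    log = data.get("nvme_smart_health_information_log", {})
    temps = {"composite": log.get("temperature")}
    # Not every drive reports the optional per-sensor values.
    for index, value in enumerate(log.get("temperature_sensors", []), start=1):
        temps[f"sensor{index}"] = value
    return temps

def read_nvme_temps(device: str = "/dev/nvme0") -> dict:
    """Run smartctl (usually needs root) and parse its JSON output."""
    # smartctl's exit status is a bitmask, so a non-zero code is not treated as fatal.
    result = subprocess.run(
        ["smartctl", "-j", "-a", device],
        capture_output=True, text=True, check=False,
    )
    return parse_nvme_temps(result.stdout)

# Hypothetical sample, for illustration only.
SAMPLE = '{"nvme_smart_health_information_log": {"temperature": 46, "temperature_sensors": [43, 46]}}'
print(parse_nvme_temps(SAMPLE))
```

Calling `read_nvme_temps()` in a loop while dd runs would give the maximum readings per sensor.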


BASELINE, without heatsink:

WRITE, max reading:
Sensor 1: 49 °C
Sensor 2: 75 °C (I stopped after 50 GB; 75 °C is enough for me and it was still climbing, so I skipped the read test)

With heatsink and Arctic thermal pad:
WRITE, max reading:
Sensor 1: 43 °C
Sensor 2: 46 °C

READ, max reading: (I actually ran the read test 3 x 100 GB, just to be sure it didn't climb further)
Sensor 1: 48 °C
Sensor 2: 49 °C

With heatsink and ulTIMiFlux thermal pad:
WRITE, max reading:
Sensor 1: 43 °C
Sensor 2: 47 °C

READ, max reading:
Sensor 1: 50 °C
Sensor 2: 50 °C

Conclusion
First of all, I've always been skeptical about review results for non-DIY NVMe heatsinks and their limited performance. Often they show no more than about a 10 °C improvement, and it seemed wrong to me that they didn't perform better.
My test confirms my assumption. If you are thorough in mounting the thermal pad and heatsink and use a thin, high-quality pad, you can get much better results.
Secondly, I was surprised to see the Arctic pad perform that well. I expected their specified thermal conductivity of 6 W/mK to be just marketing crap, but it actually seems to be true! The best "professional" pad I could find on Digikey was the ulTIMiFlux (5 W/mK), and it performed equally or maybe a little worse.
I expected the baseline would eventually climb to 80 °C, 90 °C...? So my "small" heatsink with limited ventilation gave me an improvement of about 30 °C as measured (75 °C vs. 46-47 °C on sensor 2), and perhaps up to 50 °C had the baseline kept climbing.
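The measured differences can be tabulated quickly. This is just a sketch over the maximum readings listed above; the dictionary layout and function name are mine, and the baseline only has write figures because that run was stopped at 50 GB.

```python
# Maximum readings (degC) reported in the test above.
readings = {
    "baseline":   {"write": {"sensor1": 49, "sensor2": 75}},
    "arctic":     {"write": {"sensor1": 43, "sensor2": 46}, "read": {"sensor1": 48, "sensor2": 49}},
    "ultimiflux": {"write": {"sensor1": 43, "sensor2": 47}, "read": {"sensor1": 50, "sensor2": 50}},
}

def write_improvement(pad: str, sensor: str) -> int:
    """Baseline minus heatsink reading for the write test, in degC."""
    return readings["baseline"]["write"][sensor] - readings[pad]["write"][sensor]

for pad in ("arctic", "ultimiflux"):
    for sensor in ("sensor1", "sensor2"):
        print(f"{pad:10s} {sensor}: {write_improvement(pad, sensor)} degC cooler than baseline")
```

The controller (sensor 2) benefits far more than the secondary chip, and the two pads end up within a degree of each other.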

I attached the heatsink with a rubber band (a slice of bicycle tube). I'm looking for some 16-18 mm latex tube for better long-term tension; I'm not sure the bicycle tube will keep the same elastic tension over time.
 


MrCalvin

As it turns out, the bikers have gone all crazy about latex bicycle tubes! So they are easy to get and rather cheap. I got a racing tube for €11 / $13.
I'm rather confident it'll keep the tension for the whole life of the NVMe disk :)
Latex.jpg

Update 31/12-2020:
It turns out the latex tube is rather vulnerable to UV light, which quickly "dissolves" it. A 40 mm heat-shrink "band" to cover it seems to be a good solution. Why not just use heat-shrink, you may ask? In my experience it doesn't keep its "tension" over time.
8187.jpg
 

Spartacus

Well-Known Member
May 27, 2019
781
326
63
Austin, TX
Ha, fair. From what I've read the flash does better running hotter, so I think I'm gonna stick a thermal pad on the underside of the controller and call it done. It's idle like 99% of the time anyway.
 

EffrafaxOfWug

MrCalvin said:
As it turns out, the bikers have gone all crazy about latex bicycle tubes! So they are easy to get and rather cheap. I got a racing tube for €11 / $13.
I'm rather confident it'll keep the tension for the whole life of the NVMe disk :)

Even better than that, it looks like you've stuck the heatsinks on with bubble gum for maximum MacGyvering points :)
 

EffrafaxOfWug

So, the UK is in the middle of the same heatwave as the rest of Europe (although today is a good ten degrees cooler than Thursday), my SSDs are toasty and my M.2 heatsinks have arrived, so I figured it was best to tack my findings onto this thread. As I said above, I mostly just think the EK-M2s are really pretty.

The test bed is my new build based on the X470D4U which has two M2 slots to the east of the CPU. According to the block diagram these aren't entirely equal - one is PCIe 3.0 x2 and the other is PCIe 2.0 x4 - but the SSDs I'm using, the Intel P4101 1TBs, aren't fast enough to saturate that bandwidth anyway. If you're curious about the rest of the build I've been detailing it from onwards.

From a quick bit of experimentation I can confirm that peak heat is generated by doing large sequential writes, so I modified the fio seq write test to make it bigger. Assuming the drive is capable of sustaining writes at the full ~600 MB/s, a runtime of 300 s would mean a maximum file size of about 175 GB, so a 256 GB file should be sufficient.
Code:
[sequential-write]
rw=write
size=512G
direct=1
directory=/mnt/fio-test
numjobs=5
group_reporting
name=sequential-write-direct
bs=8k
runtime=300
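The file-size estimate is simple arithmetic, sketched below for the ~600 MB/s figure and for the drive's quoted 660 MB/s write speed. The helper name is mine, and it assumes the drive holds a constant rate for the whole runtime.

```python
def data_written_gb(rate_mb_per_s: float, runtime_s: float) -> float:
    """Decimal gigabytes written at a constant rate over the runtime."""
    return rate_mb_per_s * runtime_s / 1000.0

RUNTIME_S = 300  # matches runtime=300 in the job file

for rate in (600, 660):
    gb = data_written_gb(rate, RUNTIME_S)
    gib = gb * 1e9 / 2**30  # the same amount expressed in binary GiB
    print(f"{rate} MB/s for {RUNTIME_S} s -> {gb:.0f} GB ({gib:.0f} GiB)")
```

Either way the total stays well under 256 GB, so the test is bounded by the runtime rather than by the file size.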
I then partitioned and formatted the entire drive as ext4, mounted it at /mnt/fio-test and ran several tests, keeping a record of temperatures immediately before and after each test. Each test was run three times and averaged, waiting another 300 s between runs for temperatures to return to where they were at the start of the first test; ambient temperature is currently a pretty toasty 28 degrees, with the controller temperature stable at 33-34 degrees when idle without any airflow. In a parallel session, I ran a temperature check every 30 s in case the controller started throttling and temperatures dropped; as far as the readings showed, it didn't. Temperatures generally plateaued 2 minutes after the test had started.

One oddity of the EK design I'm not sure about is that the hot plate has a hole in it underneath the logo, as can be seen in this picture, which on my (and I imagine most) SSDs will be smack in the middle of the controller, which probably isn't optimal for cooling.
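The 30 s temperature check running alongside fio could be sketched roughly like this. The reader is passed in as a function so any source can be plugged in; the trace below is made-up illustration data, not the measurements from this test, and the plateau rule (three readings within 1 degree) is my own choice.

```python
import time
from typing import Callable, List, Optional

def poll_temperature(
    read_temp: Callable[[], float],
    samples: int,
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Collect `samples` readings, pausing `interval_s` between them."""
    readings = []
    for i in range(samples):
        readings.append(read_temp())
        if i < samples - 1:
            sleep(interval_s)
    return readings

def plateau_index(readings: List[float], window: int = 3, tolerance: float = 1.0) -> Optional[int]:
    """Index of the first sample that starts `window` readings within `tolerance` degC."""
    for start in range(len(readings) - window + 1):
        chunk = readings[start:start + window]
        if max(chunk) - min(chunk) <= tolerance:
            return start
    return None

# Hypothetical trace (degC, one value per 30 s); not measured data.
fake_trace = iter([33, 41, 45, 47, 48, 48, 48, 47, 48, 48])
trace = poll_temperature(lambda: next(fake_trace), samples=10, sleep=lambda s: None)
print(trace, plateau_index(trace))
```

A sustained drop in temperature while the write is still running would be the hint that the controller has started throttling.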

The forums don't appear to have bbcode for tables enabled so here's the info in plaintext format. The tests were run in this chronological order on a brand new drive (just over 1GB of data was written to it prior to the test). I had to reboot to remove and re-add the drive as I don't know how to get M2 hotswap to work yet.

No heatsink, no direct airflow. You could have fried an egg on the controller if you had a really tiny egg and didn't mind getting egg all over your nice new SSD.
Code:
Run  Start Temp. (°C)  Seq. Write (kB/s)  IOPS    Max Temp. (°C)
0    33                807783             100972  68
1    33                786790             98348   65
2    33                792257             99032   67
No heatsink, low airflow from a nearby 80mm fan at 600rpm. This is to attempt to mimic the airflow it'll get when in the case as it'll be obscured by cables, etc.
Code:
Run  Start Temp. (°C)  Seq. Write (kB/s)  IOPS    Max Temp. (°C)
0    34                870609             108826  50
1    33                875097             109387  48
2    33                850959             106369  48
EK-M2 heatsink, no direct airflow.
Code:
Run  Start Temp. (°C)  Seq. Write (kB/s)  IOPS    Max Temp. (°C)
0    31                830794             103849  47
1    34                854658             106832  48
2    33                855464             106932  47
EK-M2 heatsink, low airflow from a nearby 80mm fan at 600rpm.
Code:
Run  Start Temp. (°C)  Seq. Write (kB/s)  IOPS    Max Temp. (°C)
0    34                833895             104236  43
1    33                832780             104097  43
2    30                855980             106997  42
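For an at-a-glance comparison, the per-run figures from the four tables can be averaged per configuration. This is a small sketch with the numbers copied from the tables; the labels and function name are mine.

```python
from statistics import mean

# (start temp degC, seq. write in the tables' kB/s, IOPS, max temp degC) per run.
results = {
    "no heatsink, no airflow":  [(33, 807783, 100972, 68), (33, 786790, 98348, 65), (33, 792257, 99032, 67)],
    "no heatsink, low airflow": [(34, 870609, 108826, 50), (33, 875097, 109387, 48), (33, 850959, 106369, 48)],
    "EK-M2, no airflow":        [(31, 830794, 103849, 47), (34, 854658, 106832, 48), (33, 855464, 106932, 47)],
    "EK-M2, low airflow":       [(34, 833895, 104236, 43), (33, 832780, 104097, 43), (30, 855980, 106997, 42)],
}

def summarise(runs):
    """Mean write rate, mean IOPS and mean peak temperature over the runs."""
    return {
        "write": mean(r[1] for r in runs),
        "iops": mean(r[2] for r in runs),
        "max_temp": mean(r[3] for r in runs),
    }

for label, runs in results.items():
    s = summarise(runs)
    print(f"{label:26s} {s['write']:9.0f} kB/s  {s['iops']:7.0f} IOPS  {s['max_temp']:.1f} degC")
```

With only three runs per configuration, differences of a couple of percent in write speed are probably within run-to-run noise.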
So, over 3TB of writes later, a few interesting findings.
  • The drive repeatedly exceeded its supposed 660MB/s speed for writes, with the slowest write being 750MiB/s and the fastest being 835MiB/s
  • The slowest speeds observed were in the first test with almost zero cooling for the controller. I suspected the controller might be throttling here although it doesn't indicate it in the SMART log.
  • Absolute fastest write speeds were on the second test - no heatsink but nominal airflow. This lends credence to the theory of hot NAND being faster (although I'm not 100% convinced)
  • The EK-M2 heatsinks seem to do a good job of keeping the drives cool although it's not really of any appreciable performance benefit.
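One caveat on the MiB/s figures: they depend on whether the bandwidth column is read as decimal kB/s or as KiB/s. The sketch below converts the slowest and fastest runs both ways and checks the bandwidth-to-IOPS ratio. It assumes fio's default of treating bs=8k as 8192 bytes, in which case a ratio of almost exactly 8 hints that the column is really KiB/s; which label a given fio version prints is something I have not verified here.

```python
def mib_per_s(value: float, unit_bytes: int) -> float:
    """Convert a bandwidth figure to MiB/s given the size of its unit in bytes."""
    return value * unit_bytes / 2**20

# (reported bandwidth, IOPS) for the slowest and fastest runs in the tables.
runs = {"slowest": (786790, 98348), "fastest": (875097, 109387)}

for label, (bandwidth, iops) in runs.items():
    ratio = bandwidth / iops  # bandwidth units per I/O; about 8 for 8k blocks
    print(
        f"{label}: ratio {ratio:.3f}, "
        f"{mib_per_s(bandwidth, 1000):.0f} MiB/s if kB/s, "
        f"{mib_per_s(bandwidth, 1024):.0f} MiB/s if KiB/s"
    )
```

Under either reading the drive is comfortably above its 660 MB/s rating, so the conclusion in the first bullet stands.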