U.2 drives showing very high temperature under load but SMART reporting normal

akkk44

New Member
Mar 27, 2018
23
0
1
34
Greetings folks, I recently got my hands on a KIOXIA CD6 U.2 SSD. When I plug it into the system (Tyan B8252), the NVME_SSD_1 temperature sensor reads about 35℃, and SMART shows about the same value.

However, the temperature went sky high within a few seconds once I put some load on the SSD. It went from 33 all the way to 70, triggering the system alert and causing all the fans to spin at full speed. Meanwhile, the temperature reported by the drive itself in SMART was still under 40 degrees.

Once the load is removed, the temperature reading in the BMC drops back to 33 in just a few seconds. It is hard to believe that this is a real temperature reading, given that it rises and drops so rapidly.
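A rough back-of-the-envelope check of that intuition, as a sketch only. The drive mass (0.10 kg), the aluminium-like specific heat (900 J/(kg*K)), and the "no heat lost to airflow" simplification are all assumptions made for illustration, not figures from the drive's datasheet. The 19 W figure is the CD6 TDP quoted later in the thread.

```python
# How long would it take to heat the *whole* drive from 33 C to 70 C?
# ASSUMPTIONS (illustrative, not measured): 0.10 kg of aluminium-like
# material, specific heat 900 J/(kg*K), and zero heat lost to airflow.

def seconds_to_heat(mass_kg: float, specific_heat: float,
                    delta_t: float, power_w: float) -> float:
    """Time for power_w watts to raise mass_kg by delta_t kelvin, ignoring losses."""
    return mass_kg * specific_heat * delta_t / power_w

delta_t = 70 - 33   # rise seen on the BMC sensor, in kelvin
power_w = 19        # CD6 TDP figure quoted in the thread, in watts
t = seconds_to_heat(0.10, 900.0, delta_t, power_w)
print(f"{t:.0f} s")  # prints "175 s"
```

Under those assumptions the drive body would need minutes, not seconds, to warm that much. That is consistent with the reading not reflecting the temperature of the drive as a whole, although a sensor sitting directly on a small hot component could still swing quickly.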

My questions are: why are the temp reported by the drive and the temp reported by the BMC so different? Could the temp reported by the BMC actually come from a part on the drive backplane? And finally, is there a way to disable the sensor? (I tried the Supermicro sdr del command; it did not work.)
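One way to put the two readings side by side is sketched below. It assumes the BMC reading is available as a line in the `NAME | 35 degrees C | ok` layout printed by `ipmitool sdr`, and that the drive reading comes from `smartctl -j -a`, whose JSON is assumed to carry the Celsius value under `temperature.current`. The sample inputs are hand-written to mirror the numbers in this post, not captured output, and the 15℃ tolerance is an arbitrary illustrative threshold.

```python
import json
import re

def bmc_celsius(sdr_line: str) -> float:
    """Extract the reading from an 'ipmitool sdr'-style line."""
    m = re.search(r"\|\s*(-?\d+(?:\.\d+)?)\s*degrees C", sdr_line)
    if m is None:
        raise ValueError(f"no temperature found in: {sdr_line!r}")
    return float(m.group(1))

def smart_celsius(smartctl_json: str) -> float:
    """Read temperature.current (Celsius) from 'smartctl -j -a' style JSON."""
    return float(json.loads(smartctl_json)["temperature"]["current"])

def disagree(bmc_c: float, smart_c: float, tolerance_c: float = 15.0) -> bool:
    """True when the two sensors differ by more than tolerance_c degrees."""
    return abs(bmc_c - smart_c) > tolerance_c

# Hand-written sample inputs shaped like the readings described above.
sdr_line = "NVME_SSD_1       | 70 degrees C      | ok"
smart_json = '{"temperature": {"current": 38}}'

bmc = bmc_celsius(sdr_line)
smart = smart_celsius(smart_json)
print(f"BMC {bmc:.0f} C, SMART {smart:.0f} C, disagree={disagree(bmc, smart)}")
```

Logging both values once a second while the load starts and stops would show whether the BMC sensor tracks the drive's own reading at all or moves independently of it.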
 

ano

Well-Known Member
Nov 7, 2022
655
273
63
Most solid-state drives under load or benchmark loads will ramp temps up like crazy; think CPU with a tiny cooler.

Most do not like to run hot, so they call for fans.
 

akkk44

The temp reported by the BMC started at ~ -30℃ and rapidly grew to 70℃. I tested an Intel P4510 4TB U.2 SSD, which has a TDP rated at 15W, while the TDP of the CD6 is 19W.
The Intel drive works totally fine and stays at a stable ~40℃ under load. Would 4W of TDP really make this much of a difference?
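As a rough sanity check on that question, here is a minimal steady-state sketch. It assumes a 25℃ ambient, that both drives sit in the same airflow with the same thermal resistance, and that each drive actually dissipates its full rated TDP under load. None of these is confirmed in the thread; they only serve to show how the temperature rise scales with power.

```python
# Steady state: rise above ambient is roughly power * thermal resistance,
# so with equal thermal resistance the rise scales linearly with power.
# ASSUMPTIONS (illustrative): 25 C ambient, identical cooling for both drives.

def scaled_temp(ref_temp_c: float, ref_power_w: float,
                new_power_w: float, ambient_c: float = 25.0) -> float:
    """Estimate a drive's temperature by scaling a reference drive's rise."""
    rise = ref_temp_c - ambient_c
    return ambient_c + rise * new_power_w / ref_power_w

# Reference: Intel P4510 at ~40 C and 15 W; target: CD6 at 19 W.
est = scaled_temp(40.0, 15.0, 19.0)
print(f"{est:.0f} C")  # prints "44 C"
```

Under those assumptions the extra 4W would account for only a few degrees, nowhere near a jump to 70℃.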
 

akkk44

The shell of the drive remained cold to the touch, even while the temp was reporting 70℃.
 

Bjorn Smith

Well-Known Member
Sep 3, 2019
877
485
63
49
r00t.dk
I have to agree with @ano. When I had U.2 drives in my servers, they were also running very hot, so you definitely need a fan pushing air over the disks.

But 70 degrees might not be too much for the drive to handle, so perhaps you need to tweak the sensor range?

The specs say:

Case Surface Temperature (Operating): 0°C to 70°C

So perhaps that means the internals of the drive can be higher?
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,649
2,065
113
For a server with constant access and drive utilization, you want to be sure to actively cool them, i.e. in a server chassis or a home desktop style case with fans blowing over them constantly.

However, I run some 2.5" / U.2 NVMe enterprise Intel drives (4500/4510) in my home desktop (not for server/VMs) with 2 silent intake fans, and I have never had a problem. Average temp is 18°C; max is 34°C.

If you benchmark them then yes, their temps will go way up even in a desktop.

When I ran P3700 NVMe drives I had to have fans on them; they idle hotter, and I wasn't comfortable letting them stay hot to the touch all the time.
 

akkk44

I "solved" this problem by attaching the drive to a U.2-to-PCIe adapter card, and the BMC no longer gets those wild temp readings. Now all sensors read about 38℃ idle and 44℃ under load, which is way more consistent than before.

I am now pretty happy with that. I will report back if anything goes wrong.
 

Bjorn Smith

Sounds like a defective sensor inside the drive cage, or the sensor is reporting temperatures "outside" the drive while the drive is reporting internal temperatures.
 

akkk44

I tried putting an Intel P4510 U.2 SSD (which is listed in the AVL of the barebone system) in the same drive bay and it behaves normally. My guess is that some weird compatibility issue may be playing a role here... Things get pretty wild in the server world. I can't even disable that BMC sensor.