Hardware Failures in 2022 - Post yours!

Patrick

Administrator
Staff member
Dec 21, 2010
12,382
5,534
113
Another year, another thread. Links to the long-lived 2017, 2020, and 2021 threads. Feel free to post your biggest hardware failures of 2022 in this thread.

Kicking this one off, two Seagate Exos X12 12TB drives that failed at the same time out of eight in the NAS.
Seagate Exos X12 12TB Hard Drives 2.jpg

We will have another one soon, and will also be putting a few of these on the main site. Someone pointed out that we do not discuss hardware failures enough.
 

i386

Well-Known Member
Mar 18, 2016
3,406
1,149
113
33
Germany
Kicking this one off, two Seagate Exos X12 12TB drives that failed at the same time out of eight in the NAS.
Did you check if they had firmware updates available?
"Publicly available" firmware updates are one of the reasons I like the exos hdds :D
 

Patrick

Administrator
Staff member
Dec 21, 2010
12,382
5,534
113
Did you check if they had firmware updates available?
"Publicly available" firmware updates are one of the reasons I like the exos hdds :D
Sadly, not that kind of failure.
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,353
1,806
113
CA
It's 2022. Will we see the "you should use different brands to be safe" replies? :D

Look forward to learning more about those two failures.
 
  • Like
Reactions: Patrick

Y0s

Member
Feb 25, 2021
35
7
8
Do enterprise printers count? Neither of our two HP DesignJet Z6200 42-inch poster printers survived the return to the office and restart of conferences, both exploding in literal red ink! I guess don't leave inkjets idle?
 

Malvineous

New Member
Sep 28, 2018
10
6
3
Do old servers count? I had a Dell R720 power supply fail last week. There was a 'pop', followed a minute later by the smell of burning plastic, and then another minute later by smoke coming out the back.

Unfortunately the iDRAC on the machine had long since crashed and I hadn't had a chance to power cycle it, so I couldn't use it to check whether the PSU had actually failed. Nothing switched off or showed warnings, so I couldn't work out what had broken. I was about to go through the DRACs in each server one by one to check, but then smoke started wafting out the back of the rack. Not wanting to take any chances, I cut the power (and the UPS power) to get everything switched off quickly. It's a small rack against a wall under a desk, so it's not the easiest to get access to, and I would have needed time to pull it out, which I didn't feel I had with smoke suddenly appearing!

Eventually I found the culprit PSU and replaced it, and the server is working fine again. In the photo of the failed PSU, the bubbly grey part touching the green bit (in the middle of the photo) isn't meant to be all bubbly like that. I haven't had a chance to properly disassemble it yet. It looks like it may be a filter capacitor that shorted internally, but I can't be certain until I take a proper look.

dell.r720.psu_failure.jpg
 

Jason Antes

Active Member
Feb 28, 2020
209
70
28
Twin Cities
Had an old 2TB SAS drive in my backup system die at power on (it's only on for a week once a month to do D2D2T backups) and just had a P420 cache battery fail on my ESXi server. HDD replaced from my spare inventory and a new battery ordered.
 

NablaSquaredG

Well-Known Member
Aug 17, 2020
697
311
63
IDK if that counts...

Supermicro 8048B-TRFT (Quad Socket Xeon E7)

Issue:
One of the memory risers has factory-installed fan cables nearby.
The fan cable can wrap around a RAM slot, so the slot gets caught in it.

Because you always need a lot of force to insert those memory risers anyway (Supermicro went with the reference design, which is a bazillion-pin edge connector), you don't realise that you just yanked the RAM slot.

Really bad design, like the rest of the server too...
 

Andrewpaulb

New Member
Jan 19, 2022
1
0
1
Well, we had 2x ST16000NM002G 16TB Exos Enterprise SAS drives fail one after the other, out of 14 drives. At least we had a RAID 6 in place...
 
Railgun

Member
Jul 28, 2018
37
13
8
Did you check if they had firmware updates available?
"Publicly available" firmware updates are one of the reasons I like the exos hdds :D
The shipped FW, et03, was replaced by et04. That simply masked the issue. There was a self-admitted manufacturing issue from Seagate around these drives. I replaced two sets of 36…twice. The second set, while better behaved in the boxes they were in, were failing under the hood. Easily the worst disks I’ve ever had the displeasure of dealing with.

The “new” replacements are now 14TB drives derated to 12TB. I’m not holding my breath on these either.
 
rune-san

Member
Feb 7, 2014
79
17
8
I just had my last OCZ Vertex 4 SSD fail, after spending the last 10 years as read/write cache for batch jobs in various servers. That little 120GB SSD had over 800TB of writes to it in the end, way WAY more than it was rated for. Overall these OCZ SSDs weren't anything great, but while the previous 6 or so died around 3 years ago, these last 2, failing about 2 months apart, gave way more than what the specifications rated them for, and I can definitely say I got my money's worth.
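
For a rough sense of scale, the figures above can be turned into full drive writes. This is only a sketch: it assumes decimal units (1 TB = 1000 GB) and that the 800TB was spread evenly over the 10 years, and the function name is made up for illustration.

```python
def drive_writes(total_written_tb: float, capacity_gb: float) -> float:
    """Number of times the full drive capacity was written (decimal units)."""
    return total_written_tb * 1000 / capacity_gb

# Figures from the post: 800TB written to a 120GB drive over about 10 years.
full_writes = drive_writes(800, 120)
per_day = full_writes / (10 * 365)  # assumes an even spread, ignores leap days
print(f"{full_writes:.0f} full drive writes, about {per_day:.1f} per day")
```

Under those assumptions the drive was overwritten several thousand times in total.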
 
  • Like
Reactions: tinfoil3d

RageBone

Active Member
Jul 11, 2017
584
145
43
My 550W EVGA power supply went out with a loud bang and tripped the breaker.
Currently in the process of RMAing it.
 

Terry Kennedy

Well-Known Member
Jun 25, 2015
1,123
574
113
New York City
www.glaver.org
Do enterprise printers count? Neither of our two HP DesignJet Z6200 42-inch poster printers survived the return to the office and restart of conferences, both exploding in literal red ink! I guess don't leave inkjets idle?
Large format printers need regular exercise. My experience is with Epson - I have a SC-P10000 and a P6000 here at home. I run a printless nozzle check daily in the dry season and print a full nozzle check around once a week. The P10K has 8000 print nozzles on a $2500 printhead, so you really don't want it to develop clogs.

As far as leaks go, that ink is nearly impossible to get out once it gets on something. An older 9500 series leaked ink on me and my hands looked like a Smurf for several weeks. The P10K holds about 2 gallons :eek: of ink - 700 ml in each of 10 cartridges plus all of the ink in the lines and head (it takes about 100 ml of each color to do the initial charge of the ink system).
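
A quick back-of-the-envelope check of the 2 gallon figure, as a sketch. It assumes US liquid gallons and that the roughly 100 ml initial charge per color sits in the lines and head on top of ten full cartridges.

```python
ML_PER_US_GALLON = 3785.41  # US liquid gallon, assumed unit

cartridges = 10
cartridge_ml = 700          # per cartridge, from the post
charge_ml_per_color = 100   # approximate initial charge per color, from the post

total_ml = cartridges * (cartridge_ml + charge_ml_per_color)
total_gal = total_ml / ML_PER_US_GALLON
print(f"{total_ml} ml is about {total_gal:.1f} US gallons")
```

So the cartridges alone account for 7 liters, and the ink in the lines and head brings the total to roughly 8 liters.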
 
  • Wow
  • Like
Reactions: tinfoil3d and Y0s

Y0s

Member
Feb 25, 2021
35
7
8
Large format printers need regular exercise...As far as leaks go, that ink is nearly impossible to get out once it gets on something.
Yeah, lesson learnt, though it's looking like we're just going to write off both expensive printers and use a printing service instead. The staff are asking for cloth posters for ease of transport.
 

Silly Valley Serf

New Member
Jul 31, 2022
4
2
3
Silicon Valley
First-time poster, long-time lurker.

I had a NAS running XigmaNAS in an old PC tower case, running on a Supermicro C3758 motherboard. One day this spring, the main CPU just quit working. The IPMI was still responsive, but nothing I tried could get the main CPU to work again. And of course the mobo was long out of warranty.

Prior to the failure I had obtained a used Supermicro SC836 to move everything into, so I bought a new SM Xeon D-2123 motherboard to go with it. The RAM and the SSD from the old NAS dropped right in, and it booted right away. I put the drives from the old NAS chassis in, and the ZFS pools came right up too! I ran scrubs on them and was relieved to find no errors.

The only problem with this new server is that it's too noisy for my office, and I don't have a rack, or an Ethernet drop, in the garage yet.
 
Fritz

Well-Known Member
Apr 6, 2015
2,971
994
113
68
Just had a Supermicro outer rail fail on me and had to catch the server on its way to the floor. Seems the BBs all fell out, leaving nothing to hold in the inner part that the inner rail slides into. I have no idea what caused this to happen. I used an HD magnet to pick up all the little balls from the carpet but couldn't find at least half of them. I'm wondering if maybe this rail was a ticking time bomb that decided to go boom today. I examined the other rail and all the BBs are intact. :confused: