R420 - IDRAC not responding upon reboot

dougmckee

New Member
Jan 25, 2023
1
0
1
I shut down all services for a service interruption and moved all of my enclaves to another location. Upon bringing everything back up, one of my R420 servers would not respond. There was nothing on the LCD panel on the front. We hooked up a monitor to the server to troubleshoot, and it was asking for F1 to continue. After that, ESXi comes up fine along with all services, but I still cannot access my iDRAC console.

Any ideas on the cause? I wasn't sure if the CMOS battery went bad or if there was something else I could look into.
 

Renat

Member
Jun 8, 2016
57
19
8
41
The iDRAC is dead, or at least there is a problem with the NAND chip that holds its firmware, so the controller (the iDRAC runs on its own RISC CPU) can't boot.

If the iDRAC is an external module (on the R420 it should be external, or maybe that is only the Enterprise license card with the port on it... I don't remember), you can just replace that module.

If it is internal (on the motherboard), the only option is to replace the motherboard.
Of course, you can try the restore procedure first: put only the ".bin" file from the iDRAC package on an SD card, and insert the card into the dedicated iDRAC SD card slot.

P.S. On HP there is a simple way to rebind the NAND chip, but on Dell a restore demands much more experience and equipment.
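
Before writing the controller off, it may be worth ruling out a plain network problem. Below is a minimal sketch, not a Dell tool: it only checks whether anything answers on the TCP ports an iDRAC usually listens on. The port list (443 for the web UI, 22 for SSH) assumes default settings, and the function names are made up for this example.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def probe(host: str, ports=(443, 22), timeout: float = 3.0) -> dict:
    """Map each port to whether it accepted a connection.

    The default ports are an assumption (iDRAC web UI and SSH with
    default settings); pass your own tuple if they were changed.
    """
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Calling `probe("192.0.2.10")` (a placeholder address, substitute the iDRAC's own IP) returns something like `{443: False, 22: False}` when nothing answers. If every port is closed while the host OS is up and the cabling and VLAN are known good, that is consistent with a controller that never booted, though it does not prove it.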
 
Jun 9, 2023
24
6
3
USA
www.youtube.com
Ugh got a growing stack of dead R320, R420 servers due to idrac issues... Kinda sucks it takes the motherboard with them. How hard is the fix if the restore procedure doesn't work?
 

Renat

The R420 doesn't have the SD slot (for the recovery procedure; only the full versions, R630/R730, have it), so this may be the only way available.
fohdeesha answered on the forum, and there are several posts on Reddit.

Step 0: connect to JTAG and check that the boot process hangs while trying to mount some partition
Step 1: determine the model of the 4G NAND chip on the board (different server generations use different models with different ball layouts)
Step 2: get a NAND chip and reball it
Step 3: via JTAG and the iDRAC's simple command interface, upload the firmimg.dX file from the firmware package (payload folder) and recover the file system

Step 2 may well cost more than a new board (or even a next-generation server, the R430).
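
For Step 3, the firmimg.dX image first has to be pulled out of the firmware package. Here is a minimal sketch of that part only, assuming the package can be opened as a zip archive and keeps the image under a payload folder. Neither assumption is verified here, and the function names and match pattern are hypothetical.

```python
import fnmatch
import zipfile
from pathlib import Path

# Assumed layout: members named like "payload/firmimg.d7" somewhere in the archive.
IMAGE_PATTERN = "*payload/firmimg.d*"


def _normalized(name: str) -> str:
    """Lower-case the member name and use forward slashes for matching."""
    return name.replace("\\", "/").lower()


def _is_image(name: str) -> bool:
    """True for file entries (not directories) that match IMAGE_PATTERN."""
    norm = _normalized(name)
    return not norm.endswith("/") and fnmatch.fnmatchcase(norm, IMAGE_PATTERN)


def find_firmware_images(package_path):
    """List archive members that look like payload/firmimg.dX."""
    with zipfile.ZipFile(package_path) as archive:
        return [name for name in archive.namelist() if _is_image(name)]


def extract_firmware_images(package_path, dest_dir):
    """Copy every matching image into dest_dir and return the written paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(package_path) as archive:
        for name in archive.namelist():
            if not _is_image(name):
                continue
            # Keep only the file name so nothing is written outside dest_dir.
            target = dest / name.replace("\\", "/").rsplit("/", 1)[-1]
            target.write_bytes(archive.read(name))
            written.append(target)
    return written
```

If `find_firmware_images` returns an empty list, the package is probably laid out differently than assumed, and it is easier to open it by hand in an archive tool and look for the payload folder.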
 
Jun 9, 2023
Ah thought the slot in the back was for idrac stuff. I wonder if the connectors are the same as a R720. It would be funny jerry rigging a R720 front panel to a R420. :p

Yeah I watched the YouTube videos on the more hands on repair route. I'm starting to think Dell servers are just ewaste. I've been able to recover a R720 that failed to update properly. Another one just straight up died. I think I have an R330 with a firmware bug that I may be able to fix if I buy the idrac module, but tempted to trash it as well... I just can't believe they would build such a critical system into the board with no redundancy.
 

Renat

But users are part of the problem too!
200 HP and Dell servers (from LGA2011 to LGA3647), and nobody wants to update them at least once every 6 months.

This NAND chip is the same kind as in phones: after many writes the chip is dead.
Only the HP Blade, at least, has this chip on a separate board, so there you only need to change that board.
 
Jun 9, 2023
I'm good at killing supermicro systems as well. Been avoiding HP since I hear their firmware can be a pain. I'm surprised they don't do things like dual BIOS like consumer boards. Especially on critical chips. Does make me wonder how many writes the growing pile of R420's I have, have seen.