HP ProLiant DL380 iLO4 NAND Flash

TheMrDec

New Member
Jan 21, 2022
8
3
3
I am new to the forums, so forgive me if I have misplaced this thread. I recently had a "FREE" DL380 fall into my hands that had been in some less-than-ideal operating conditions. I figured it couldn't be too difficult to repair, and I was mostly correct. I was able to clean up the system board and get the PDU functional, and the board gets power! The only issue now is that the iLO4 embedded memory has the dreaded "init_fail to media write-verify" bug, and I am nowhere near satisfied with HP's solution of trying to reset the NAND and, failing that, just throwing out the board. Since I am fairly adept with board repair, I figured I would try replacing the NAND chip itself and brute-forcing the initial 0-wipe to get a good flash chip on the board, but I can't find anything on this: no mention of anyone trying it before my excursion into this project of a server.

Before someone mentions it: yes, I know I can get a new board for like $300 USD. I am not concerned with cost but rather with the amount of good hardware going to waste over what I assume is a QSPI flash chip that probably costs pennies.

Has anyone seen any information involving chip data or someone identifying the offending IC on the system board or is this something that just gets overlooked due to the low board cost?

Any leads are appreciated. If I don't get anything, I will start probing this board manually and post my results if I find anything useful.
 

TheMrDec

Do you need working ilo to use the motherboard itself? :oops:
I don't think so... but also yes? There seem to be issues with the HBA if iLO isn't working properly, and I plan on using this to replace my PowerEdge T420, so having the out-of-band controller is kind of necessary. I do prefer iDRAC, but the computing power that I am lacking is starting to show, and this 380 came fully kitted.

I am also just incredibly unsatisfied with a half-finished job and want to do as much as I can to get this board back to 100%.


I started out as a microcontroller and FPGA guy so throwing out a whole PCB because one component failed is just not OK in my book.
 

RolloZ170

Well-Known Member
Apr 24, 2016
1,935
501
113
55
Has anyone seen any information involving chip data or someone identifying the offending IC on the system board or is this something that just gets overlooked due to the low board cost?
The NAND is a BGA chip (Hynix e-NAND), usually located on the underside near the iLO chip.
Which DL380 gen?
 

TheMrDec

The NAND is a BGA chip (Hynix e-NAND), usually located on the underside near the iLO chip.
Which DL380 gen?
I found an SK hynix DRAM module next to the iLO, but I didn't see any NAND. I haven't removed the system board from the sled yet, but I probably should have done that prior to posting. I was incredibly frustrated by the absolute swarm of forum posts just throwing in the towel and buying new boards.

It's a Gen9 380 with an iLO4 module.
 

RolloZ170

I found an SK hynix DRAM module
What is printed on it? Write it down and google it.
I haven't removed the system board from the sled yet but I probably should have done that prior to posting
Soldering a BGA chip is not something everyone can do at home.
If a specialist replaces the NAND, what fails next? Risk a server going down for just $300?
 

TheMrDec

What is printed on it? Write it down and google it.
Like I said above, it's a DRAM module. DDR3.

Soldering a BGA chip is not something everyone can do at home.
If a specialist replaces the NAND, what fails next? Risk a server going down for just $300?
Again, above: "I started out as a microcontroller and FPGA guy"

I have ample experience designing and assembling boards, as well as diagnosing and repairing them, including reflowing BGA chips. I am more than confident that I can repair this board back to a reliable state. I just don't have any datasheets for the modules, and I didn't see anything other than the DRAM module that looked even remotely like it could be the issue.

I was simply throwing a net to see if I could find someone who had seen something useful about these chips.
 

TheMrDec

That pair is likely the culprit: a hub controller IC (likely the "SD card" controller) and a NAND flash IC. Now to figure out where to source that flash chip.
 

TheMrDec

Yeah, nah. I've sourced the module from a place called UTSOURCE. I've had good luck with them in the past. I will get this swapped out and post a progress update once I am done.
 

Skud

Active Member
Jan 3, 2012
139
73
28
I think this is a cool project!!

That being said......

I've encountered a lot of these systems. There was a bug in the older iLO code which prematurely wore out the NAND. However, it's my understanding that the NAND is only used for Intelligent Provisioning. I don't think I've ever used Intelligent Provisioning on these servers, and I'd be surprised if it even supported deployment with a recent OS.

Other than the "iLO has failed a self-test" message on boot and the subsequent NAND errors in the iLO logs, I've never had any issues using the iLO, including full graphical KVM.

Riley
 

TheMrDec

If that is really all the NAND is used for, then I likely have about a thousand other issues to work through on this board. The iLO is constantly losing sensor data and can't reliably tell what hardware is and isn't installed. I had assumed that the iLO ROM was stored on the NAND chip and that whatever issue there was with the chip was causing the unpredictable behavior. If that assumption is incorrect, then I probably have heavy voltage drops somewhere that I have yet to locate.

I am still going to replace the chip once it gets here, but if that doesn't work, I might have my work cut out for me.

I really wish I had access to a boardview so that I could know where to poke to get the info that I need, but I assume HP has that under lock and key. I can't wait for R2R to pick up some steam already.
 

Skud

Yeah, I think you may have other issues then. Out of 25-30 DL360 and DL380 Gen8s, most have failed NAND and they've remained perfectly functional. Even the iLO logs are retained across power cycles, so I don't think it uses that NAND chip for them.
 

seanneko

New Member
Feb 1, 2022
9
1
3
Did you make any progress with this?

I've got a DL380p Gen8 which I believe has a working but corrupt NAND. It was working fine until I went to update Intelligent Provisioning but had a power loss part way through the process. Ever since then iLO reports:

Embedded Flash/SD-CARD Controller firmware revision 2.10.00 Embedded media manager failed media attach

I don't think this is the same as the usual problem of NAND going into read-only mode. It seems like it can't mount the partition or something (presumably because it's corrupt). I've tried the process of formatting the NAND but I don't think it's actually performing the format at all. After iLO restarts and I check the logs there's nothing in there to indicate that a format happened.

If there was a way that I could directly zero the NAND without doing so through the iLO software I reckon I could probably get it to work again.
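
For reference, a minimal sketch of what "directly zeroing" would look like from a host, assuming the NAND could somehow be exposed as a raw block device or image file (nothing in this thread establishes a way to do that outside iLO). The function name `zero_fill` and any path passed to it are hypothetical, and the sketch has only been reasoned about against an ordinary file.

```python
import os

def zero_fill(path, chunk_size=1024 * 1024):
    """Overwrite every byte of an existing file or block device with zeros.

    Assumption: `path` is something the host can open read/write, e.g. a
    dump file or a device node. Returns the number of bytes written.
    """
    with open(path, "r+b") as f:
        # Seeking to the end reports the size for regular files and,
        # on Linux, for block devices as well.
        total = f.seek(0, os.SEEK_END)
        f.seek(0)
        zeros = bytes(chunk_size)
        written = 0
        while written < total:
            n = min(chunk_size, total - written)
            f.write(zeros[:n])
            written += n
        # Push the data out of Python's and the OS's buffers.
        f.flush()
        os.fsync(f.fileno())
    return written
```

This is destructive by design, so it would only make sense against a target that has already been written off or backed up.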
 

TheMrDec

Unfortunately, no. I got the NAND chip off and probed voltages while I was in there, and noticed that ALL of my rails were looping between their respective high and low voltages. I spent about 8 hours total chasing a gremlin and never did find it. The project has been benched until this summer, when I have more time. I will say that the NAND was certainly not being written properly. I bodged up a little circuit to cycle through and wipe the NAND, but I don't really recommend this, as I think it heavily damaged the NAND. Seems like these boards are just unreliable.
 

seanneko

Fair enough, I know the feeling.

I'm fairly confident that I can remove the NAND chip and solder on a new one, but like you said in your first post, I don't really want to spend the time and effort when I don't know whether it will actually work.

Given that I have a suspicion that my current NAND chip may actually work but just has corrupt contents, I wonder if the format function in iLO doesn't truly format the device but expects it to already contain a partition table...
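
If a raw dump of the chip were ever available, one rough way to test that suspicion would be to look for the standard on-disk signatures. This is only a sketch: whether iLO uses an MBR or GPT layout on the embedded NAND at all is an assumption, as are the 512-byte sector size and the helper name `detect_partition_table`.

```python
SECTOR = 512  # assumed logical sector size

def detect_partition_table(image_path):
    """Return "gpt", "mbr", or None for a raw image file.

    GPT is checked first because a GPT disk also carries a protective
    MBR with the same 0x55 0xAA signature in sector 0.
    """
    with open(image_path, "rb") as f:
        data = f.read(2 * SECTOR)
    # A GPT header starts with the ASCII signature "EFI PART" at LBA 1.
    if data[SECTOR:SECTOR + 8] == b"EFI PART":
        return "gpt"
    # An MBR ends sector 0 with the bytes 0x55 0xAA. A bare FAT boot
    # sector carries the same bytes, so "mbr" is a hint, not proof.
    if len(data) >= SECTOR and data[510:512] == b"\x55\xaa":
        return "mbr"
    return None
```

A result of `None` on a dump would at least be consistent with the idea that the format routine expects a partition table that is no longer there, though it would not prove it.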