Problem solved!
A short recap; see also the summary in this thread's post #18.
After updating the server with the latest SPP and populating box3 with 8 SATA HDDs,
the first service request was sent from the server and HPE support called me.
Before this I had used the server with two SSDs without any trouble.
Two service requests later, HPE support started to take this issue seriously.
Doing all sorts of HDD/SSD rotation (see post #5) didn't reveal anything except
that the bug was triggered by populating box3 with all 8 disks.
The server was green/ok on all parts; the only symptom was the annoying service request.
The HPE custom ESXi image was a bit more helpful, since it at least reported
which disk the server thought was causing trouble (a false positive).
But the web interface is a bit buggy and not reliable, since wbem/sfcbd is unstable.
And again it was not consistent: everything reported green/ok, yet different disks
were randomly flagged as failing. That points to a false-positive bug.
HPE support first replaced the backplane, cables and P440ar. Bug still present.
Then they replaced the motherboard, and now everything is ok.
Original first post follows:
This is the second time the server has been booted, and each time it has sent a service request to HPE.
The error message that is sent to HPE is "iLO4_300_DriveStatusChanged_Failed"
But I can't figure out what's wrong.
In the iLO4 service event log I have the following:
Event id is 300, indicating Physical Disc Drive Service Event.
Event Category is HPQSA0300
This first happened after installing the latest SPP (everything went ok).
Also the only thing installed at the moment is the ESXi 6.5 HPE Custom image.
It boots off the internal USB (official HP USB key).
There's no error to be found on the system.
The P440ar is green and ok (firmware 5.04)
And the same for the Wellsburg AHCI controller.
And all 8 connected drives are green/ok.
There are 6 of the MM1000GBKAL with firmware HPGE in original trays.
And 2 Intel DC S3710 with firmware G2010110, also in original trays.
Any idea what's wrong?
I guess HPE will call me in a day or two (like last time).
But it would be nice to have something to help me understand where this error is coming from.
UPDATE:
In the service event details on HPE.com I found this:
"RecommendedActions:
A hard drive has experienced a failure.
Check for known FW issues with the drive FW rev and if none are found
proceed with drive replacement using spare part number undefined."
The problem is that I can't find any faulty drive.
And in the description, the server doesn't know either:
"Failing Part: Drive Type: Drive Model number: Drive Serial Number: Drive FW Rev:
Drive Spare P/N: undefined Drive Location: Physical Volume Port: Box: 0 Bay: 0
Failing Part Location: The failing hard drive is installed at Physical Volume Port: Box: 0 Bay: 0
of the array controller installed in Smart Array P440ar RAID in Slot 0 of the server XXX a ProLiant DL380 Gen9 with serial number nnnnnnn."
I've replaced the server name and serial number with XXX and nnnnnnn, but the other values that are missing are just that, missing.
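To make it easier to see how little the event actually says, here is a small Python sketch that splits the flat description quoted above into label/value pairs and lists the fields that are blank or "undefined". The label list is copied from the quoted text; the function names (parse_event_fields, missing_fields) and the rule for what counts as "missing" are my own assumptions for illustration, not anything HPE provides.

```python
import re

# Field labels exactly as they appear in the quoted service event description.
LABELS = [
    "Failing Part",
    "Drive Type",
    "Drive Model number",
    "Drive Serial Number",
    "Drive FW Rev",
    "Drive Spare P/N",
    "Drive Location",
    "Physical Volume Port",
    "Box",
    "Bay",
]


def parse_event_fields(text):
    """Split a flat 'Label: value Label: value' string into a dict (hypothetical helper)."""
    # Try longer labels first so a short label never shadows a longer one.
    pattern = "|".join(re.escape(label) for label in sorted(LABELS, key=len, reverse=True))
    matches = list(re.finditer(f"({pattern}):", text))
    fields = {}
    for i, match in enumerate(matches):
        # A value runs from the end of its label to the start of the next label.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        fields[match.group(1)] = text[match.end():end].strip()
    return fields


def missing_fields(fields):
    """Return labels whose value is blank or the literal 'undefined' (assumed rule)."""
    return [label for label, value in fields.items() if value in ("", "undefined")]


description = (
    "Failing Part: Drive Type: Drive Model number: Drive Serial Number: "
    "Drive FW Rev: Drive Spare P/N: undefined Drive Location: "
    "Physical Volume Port: Box: 0 Bay: 0"
)

fields = parse_event_fields(description)
for label in missing_fields(fields):
    print("no value reported for:", label)
```

Box and Bay do carry a value (0), so the sketch does not flag them, but nothing else in the event identifies a drive.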
Is this a known error with the firmware from the SPP?
Or do I have an HDD failure I just can't find?