The quest for the HGST UltraStar SN260 firmware updates...


ElCoyote_

Active Member
Jul 22, 2016
248
190
43
Hi @LodeRunner, your drives show firmware names with the prefixes KNEC and KNEG. As far as I know, genuine HGST SN260 drives use the firmware prefix KNGN, and the Cisco OEM firmware uses the prefix KNCC. Since your drives are KNE*, I think they will reject any non-KNE firmware by default.
I'd try to track down the OEM and see if you can download a v122 firmware from them.
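To make the prefix idea concrete, here is a minimal Python sketch. The table holds only the two prefixes named above; anything else (including KNE*) is reported as unknown. The function names and the 4-letter prefix length are my own assumptions for illustration, not anything HGST documents.

```python
# Prefix table built only from the prefixes mentioned in this thread.
KNOWN_PREFIXES = {
    "KNGN": "genuine HGST",
    "KNCC": "Cisco OEM",
}

def classify_firmware(revision: str) -> str:
    """Map a firmware revision such as 'KNGND110' to its likely origin."""
    prefix = revision[:4].upper()
    return KNOWN_PREFIXES.get(prefix, "unknown / other OEM")

def same_prefix(current: str, candidate: str) -> bool:
    """Assumption: a drive only accepts images sharing its 4-letter prefix."""
    return current[:4].upper() == candidate[:4].upper()

print(classify_firmware("KNGND110"))
print(same_prefix("KNGND110", "KNGND122"))
```

If `same_prefix` returns False for the image you want to load, I would expect the drive to refuse it, but that is a guess based on the reports here.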
 

LodeRunner

Active Member
Apr 27, 2019
568
244
43
Well, after some more digging, I was apparently wrong about these being unbranded. I failed to note the EMC part number: 118000588-03, DPN 0TS1478, possibly for a Gen 3 PowerMax, and the downloads for PowerMax appear to be restricted. Ugh. Does anyone know how to use sg_utils or nvme-cli to force a firmware upgrade? I'm willing to sacrifice one of my SN200s to that experiment.
 

ServerUser

New Member
Aug 29, 2020
6
3
3
I've got hold of some of these SN200s, but the write performance is somewhat sketchy, which I think is probably firmware related. They are HUSMR7619BDP3Y1 (Sales P/N: 0TS1355) with no other markings or serials for another vendor that I can see, but they are running an odd firmware, KNGWD110, and I can't find a reference to KNGWD anywhere. They won't take the standard KNGND firmware.

Anyone ever come across the KNGWD anywhere? I don't really want to relegate them to just boot drives as I have a project in mind for them.
 

dreunion61

New Member
Nov 22, 2023
24
7
3
Did you manage to get it working on the SN100?

I recently ordered a HGST 3.2TB HUSPR3232ADP301(SN100) and I couldn't find any public links available for Firmware Downloads (except the Links here and your Post).
Yes, that firmware package is compatible. I'm running the latest firmware, and the drive has been running 24/7 for over a year now.
 

mytime34

New Member
Aug 20, 2013
8
1
1
HGST/WDC Ultrastar SN200 Recovery Guide – Persistent Internal Error / Diagnostic State / Missing Namespace

I wanted to document this entire recovery process because these drives can look completely dead while still being recoverable.

Recovered Drive:

  • HGST/WDC Ultrastar SN200 7.68TB
  • Model: HUSMR7676BDP3Y1
  • Original firmware: KNGND110
Original Symptoms:

  • Linux repeatedly logged:
    "resetting controller due to persistent internal error"
  • Controller appeared/disappeared every ~4.7 seconds
  • No namespaces existed
  • No nvmeXn1 devices
  • BIOS could not see the drive
  • Windows initially did not expose storage
  • nvme-cli firmware activation failed
  • Drive appeared stuck in recovery/diagnostic/SBL mode
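For anyone who wants to confirm the same reset loop on their own drive, here is a rough Python sketch that measures the time between reset messages in kernel log lines. It assumes the usual `[seconds.micros]` dmesg timestamp prefix. The sample lines are made up for illustration and are not copied from my logs.

```python
import re

RESET_MSG = "resetting controller due to persistent internal error"
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")

def reset_intervals(lines):
    """Return the gaps in seconds between consecutive reset messages."""
    times = []
    for line in lines:
        if RESET_MSG in line:
            match = TIMESTAMP.match(line)
            if match:
                times.append(float(match.group(1)))
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]

# Illustrative sample only; the timestamps are invented.
sample = [
    "[  100.000000] nvme nvme0: resetting controller due to persistent internal error",
    "[  104.700000] nvme nvme0: resetting controller due to persistent internal error",
    "[  109.400000] nvme nvme0: resetting controller due to persistent internal error",
]
print(reset_intervals(sample))
```

A steady, repeating interval points at a controller that keeps restarting the same way, rather than random link drops.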
Initial Linux Environment:

  • Dell PowerEdge R740XD
  • Ubuntu Linux
  • nvme-cli installed
  • HGST HDM installed
Linux consistently showed:

  • PCIe enumeration worked
  • Controller object existed
  • NVMe admin queues initialized repeatedly
  • Firmware subsystem partially alive
Intermittent successful commands:

  • nvme id-ctrl
  • nvme fw-log
  • fw-download transport
Valid identify data repeatedly returned:

  • SN: SDM0000882DA
  • Model: HUSMR7676BDP3Y1
  • FW: KNGND110
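A small Python sketch of how the identify data can be compared across repeated reads. It assumes the JSON form of the identify output (something like `nvme id-ctrl /dev/nvme0 --output-format=json`, where the device path is just an example) and assumes the keys `sn`, `mn` and `fr` for serial, model and firmware revision. The sample JSON below is hand-written from the values above, not captured tool output.

```python
import json

def identity_from_id_ctrl(json_text):
    """Pull serial, model and firmware revision out of id-ctrl JSON text."""
    data = json.loads(json_text)
    # Identify strings are often space padded, so strip them.
    return {key: str(data.get(key, "")).strip() for key in ("sn", "mn", "fr")}

def is_stable(reads):
    """True if every successful read returned the same identity."""
    return len(reads) > 0 and all(read == reads[0] for read in reads)

# Hand-written sample using the values listed above.
sample = '{"sn": "SDM0000882DA        ", "mn": "HUSMR7676BDP3Y1", "fr": "KNGND110"}'
reads = [identity_from_id_ctrl(sample) for _ in range(3)]
print(reads[0], is_stable(reads))
```

Consistent identify data across the reset cycles is what told me the controller was still answering sensibly.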
Important discovery:
The controller itself was NOT dead.

Firmware Package Findings:
KNGND122.bin was NOT a raw firmware image.

It was a packaged enterprise firmware bundle containing:

  • FWHEADER.bin
  • PROC0-15.bin
  • SECURITY.bin
  • FCC.bin
  • StringTable.csv.gz
Extracted strings strongly suggested diagnostic/recovery behavior:

  • "SYS: Go into SBL mode"
  • "SYS: Crash Occurred"
  • "Overlay Init Done"
  • "Error: Invalid Overlay"
This strongly suggested:

  • recovery firmware state
  • corrupted operational runtime state
  • failed namespace/FTL initialization
  • NOT dead hardware
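If you want to look for similar strings in your own firmware file, this is roughly what a `strings`-style scan does, sketched in Python. The minimum length of 6 and the sample bytes are my own choices for illustration; the sample blob is not taken from the real bundle.

```python
import re

def printable_strings(blob: bytes, min_len: int = 6):
    """Return runs of printable ASCII of at least min_len characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(blob)]

# Made-up bytes that embed two of the strings quoted above.
sample_blob = (
    b"\x00\x01SYS: Go into SBL mode\x00\xff\xfeab\x00"
    b"Error: Invalid Overlay\x00"
)
for text in printable_strings(sample_blob):
    print(text)
```

To scan a real file, read it with `open(path, "rb").read()` and pass the bytes in. Compressed parts such as a `.gz` member would need decompressing first.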
Linux Attempts That DID NOT Fix It:

  • nvme fw-download
  • nvme fw-activate
  • namespace creation commands
  • APST disable
  • ASPM disable
  • PCIe secondary bus reset
  • HDM on Linux
Interesting Linux Findings:

  • fw-download transport actually succeeded once
  • firmware activation failed with invalid image
  • Linux HDM could never enumerate the device
  • PCIe bridge reset triggered GHES fatal hardware errors
  • Controller repeatedly initialized:
    "56/0/0 default/read/poll queues"
Critical Hardware Change:
The major breakthrough came after:

  • moving the drive OUT of the Dell server
  • moving it OUT of Linux
  • placing it into a Windows workstation
  • using a direct PCIe 3.0 U.2 adapter card
Hardware used:

  • AMD 7950X workstation
  • direct PCIe 3.0 U.2 adapter card
  • no enterprise backplane/riser complexity
This appeared to provide:

  • cleaner PCIe initialization
  • direct controller access
  • better compatibility with HGST tooling
  • less PLX/backplane interference
Critical Windows Discovery:
Windows Device Manager successfully detected:
"WD Ultrastar SN2xx PCIe SSD Controller"

This was HUGE because Linux HDM never successfully enumerated the drive.

Required Software:

  • HGST Device Manager (HDM) 3.4
  • Administrator CMD/PowerShell
Successful HDM Scan:
Command:
hdm scan

HDM successfully detected:

  • NVMe controller
  • firmware slots
  • stable controller UID
  • stable device path
Initial Firmware Slot State:
Running Firmware Version = KNGND110 (Loaded from Slot 5)

Firmware slots:

  • Slot 1 (Read-only)
  • Slot 2
  • Slot 3
  • Slot 4
  • Slot 5
All showed KNGND110.
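If you are logging HDM output between attempts, a tiny Python helper like this can pull the running version and slot out of the status line. It only assumes the line looks exactly like the one quoted above; I have not checked other HDM versions.

```python
import re

RUNNING = re.compile(
    r"Running Firmware Version\s*=\s*(\S+)\s*\(Loaded from Slot (\d+)\)"
)

def parse_running(line):
    """Return (version, slot) from the HDM status line, or None if absent."""
    match = RUNNING.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

print(parse_running("Running Firmware Version = KNGND110 (Loaded from Slot 5)"))
```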

Critical Discovery:
Slot 5 appears to behave like:

  • recovery slot
  • fallback slot
  • degraded runtime state
The drive was trapped booting from Slot 5.

THE RECOVERY PROCESS:
This was the key fix.

Step 1:
Activate Slot 2:

hdm manage-firmware --activate --slot 2 -a @nvme0

IMPORTANT:
Do a FULL shutdown after each slot change.
NOT reboot.

Command:
shutdown /s /t 0

Wait ~30 seconds before powering back on.

Result:

  • controller behavior improved slightly
  • BIOS still inconsistent
Step 2:
Activate Slot 3:

hdm manage-firmware --activate --slot 3 -a @nvme0

Again:
FULL shutdown afterward.

Major behavior changes occurred:

  • BIOS started detecting the drive
  • Windows became much more stable
  • controller reconnect storms mostly stopped
  • namespaces began partially initializing
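The two steps above follow the same pattern, so here it is as a short Python sketch that only builds the command list and does not run anything. The `@nvme0` target is the one from my machine and may differ on yours. The helper name is made up for this post.

```python
def slot_switch_plan(slots, target="@nvme0"):
    """Build the activate-then-full-shutdown sequence for each slot."""
    plan = []
    for slot in slots:
        plan.append(f"hdm manage-firmware --activate --slot {slot} -a {target}")
        # Full power-off, not a reboot. Wait about 30 seconds before powering on.
        plan.append("shutdown /s /t 0")
    return plan

for command in slot_switch_plan([2, 3]):
    print(command)
```

Run each activate command from an Administrator prompt, shut down, power back on, and check the drive state before moving to the next slot.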
Final Stable State:
Eventually Slot 4 became the healthiest operational slot.

Final stable behavior:

  • BIOS fully sees drive
  • namespaces restored
  • full 7.68TB visible
  • stable controller enumeration
  • stable PCIe link
  • Windows Disk Management detects drive normally
Final HDM State:

  • Running Firmware Version = KNGND110 (Loaded from Slot 4)
  • Namespace Count = 1
  • Capacity = 7681501126656
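A quick sanity check of that capacity figure in Python, in case anyone wonders why Windows shows a smaller number. The 512-byte sector size is an assumption on my part; I did not record the LBA format.

```python
CAPACITY_BYTES = 7681501126656  # value reported by HDM above

tb = CAPACITY_BYTES / 10**12    # decimal terabytes, as marketed
tib = CAPACITY_BYTES / 2**40    # binary tebibytes, as Windows displays
sectors_512, remainder = divmod(CAPACITY_BYTES, 512)  # assumes 512-byte LBAs

print(f"{tb:.2f} TB, {tib:.2f} TiB, {sectors_512} sectors, remainder {remainder}")
```

So the full 7.68TB is there; the roughly 6.99 figure is the same capacity counted in binary units.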
Important Lessons Learned:

  1. These drives can look COMPLETELY dead while still recoverable.
  2. Missing namespaces do NOT mean dead NAND.
  3. BIOS invisibility does NOT mean dead controller.
  4. Linux recovery tools were insufficient in this case.
  5. Windows HDM was the major breakthrough.
  6. Firmware slot switching was the real fix.
  7. Slot 5 appears to be a recovery/fallback runtime state on these drives.
  8. Direct PCIe access mattered enormously.
Firmware Update Notes:
KNGND122.bin could NOT be directly loaded using the current HDM workflow.

Commands like:
hdm manage-firmware --load --file "C:\KNGND122.bin"

currently fail with:
"Required command parameter is missing --load"

Still investigating:

  • exact firmware package workflow
  • whether newer tooling is required
  • whether diagnostic-clear workflow is mandatory before loading
Current recommendation:
If your SN200:

  • loops resetting
  • has no namespace
  • BIOS cannot see it
  • Linux repeatedly resets controller
DO NOT immediately assume it is dead.

Try:

  1. Windows
  2. HGST HDM
  3. Direct PCIe U.2 adapter
  4. Firmware slot switching
  5. Full shutdowns between slot changes
That combination was ultimately the recovery breakthrough.
 