Question for Acquacow.
We have four HP DL380 servers, each with two 6.4 TB SX350 cards installed (the Cisco UCS version). Last week, on two different servers about 6 hours apart, each server lost one of its cards.
The Windows event log showed the same pattern on both servers.
Have you ever seen these types of errors? The servers don't report any power issues. They both run the same workload, so I'm wondering what could have caused a software crash.
Server A - 9/6/2019 12:43:38 AM
fioerr Cisco UCS 6400GB SanDisk ioMemory SX350 (7,2): 12303 Watchdog fired - request stuck issued 106953125 us ago last completion 59000000 us ago
Server B - 9/5/2019 1:05:57 PM
-fioerr Cisco UCS 6400GB SanDisk ioMemory SX350 (7,2): 12303 Watchdog fired - request stuck issued 56140625 us ago last completion 53046875 us ago
The rest of the errors logged by iomemory_vsl4_mc were the same on both servers.
- Cisco UCS 6400GB SanDisk ioMemory SX350 (7,2): Stuck req
then
- fioerr Cisco UCS 6400GB SanDisk ioMemory SX350 (7,2): 12318 Failing channel (0x1000).
Last error:
fioerr ioMemory driver: 06156 fct0: Notification received that this device has failed.
Then Disk Manager reports the drive as offline. Do the next messages relate to the card trying to take a crash dump?
- Cisco UCS 6400GB SanDisk ioMemory SX350 (7,2): Groomer for data log is tearing down
- Cisco UCS 6400GB SanDisk ioMemory SX350 (7,2): Groomer for data log halted.
-fiowrn Cisco UCS 6400GB SanDisk ioMemory SX350 (7,2): Error saving metadata. Reattach will require rebuild (-12).
-Cisco UCS 6400GB SanDisk ioMemory SX350 (7,2): BEGIN CSR DUMP
Cisco UCS 6400GB SanDisk ioMemory SX350 (7,2): CSR DUMP: oDQCAP8+ZmlvaW5mIENpc2NvIFVDUyA2NDAwR0IgU2FuRGlzayBpb01lbW9yeSBTWDM1MCAoNywy
(followed by 100-plus more CSR DUMP lines)
Last entry:
-Cisco UCS 6400GB SanDisk ioMemory SX350 (7,2): END CSR DUMP
Card info:
Adapter: ioMono (driver 4.3.5)
Cisco UCS 6400GB SanDisk ioMemory SX350, Product Number FIOS64002, SN:Fxxxxxx
PCIe Power limit threshold: 24.75W
Connected ioMemory modules:
fct0: 07:00.0, Product Number FIOS64002, SN:xxxxxxxxxxxx
fct0 Attached
ioMemory Adapter Controller, Product Number FIOS64002, SN:xxxxxxxxxxx
PCI:07:00.0, Slot Number:2
Firmware v8.9.9, rev 20170222 Public
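
In case it helps when reading the two watchdog entries: the values are in microseconds ("us"), so on Server A the stuck request had been issued roughly 107 seconds before the watchdog fired, and on Server B roughly 56 seconds. Below is a rough Python sketch for pulling those numbers out of the event text. The regex and the function name watchdog_seconds are made up for illustration, and it assumes the message format is exactly as pasted above.

```python
import re

# The two watchdog lines as pasted from the Windows event log above.
LOG_LINES = [
    "fioerr Cisco UCS 6400GB SanDisk ioMemory SX350 (7,2): 12303 Watchdog fired"
    " - request stuck issued 106953125 us ago last completion 59000000 us ago",
    "fioerr Cisco UCS 6400GB SanDisk ioMemory SX350 (7,2): 12303 Watchdog fired"
    " - request stuck issued 56140625 us ago last completion 53046875 us ago",
]

# Assumed pattern, based only on the two messages above.
WATCHDOG_RE = re.compile(r"issued (\d+) us ago last completion (\d+) us ago")


def watchdog_seconds(line):
    """Return (issued_s, last_completion_s), or None if the line does not match."""
    match = WATCHDOG_RE.search(line)
    if match is None:
        return None
    # Convert microseconds to seconds.
    return tuple(int(value) / 1_000_000 for value in match.groups())


for name, line in zip(("Server A", "Server B"), LOG_LINES):
    print(name, watchdog_seconds(line))
```
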