HGST 7K4000 HUS724030ALS640 3TB SAS shows errors in Linux


Sundar

New Member
Oct 31, 2018
19
3
3
I am a ServeTheHome noob. I read posts all the time, but have rarely posted before.

My setup:
  • SGI Rackable SE3016 bought May-2016
  • LSI SAS 9200-8e HBA bought May-2016
  • Brand new SuperMicro CSE-836E16-R92JBD with BPN-SAS2-836EL1 backplane bought May-2020
I have 7 (SATA) disks in my SGI Rackable chassis and they have been working fine.
I have put 5 (SATA) disks in my new Supermicro JBOD chassis and all of them work fine.
Both chassis are connected to the same LSI 9200-8e HBA.

I recently bought a HGST 7K4000 HUS724030ALS640 3TB SAS disk on eBay.

Although the disk is detected on the Linux host, I see errors in the kernel log and in smartctl, and the kernel reports the capacity as zero. Needless to say, dd from the disk fails immediately. sg_format of the generic SCSI device also fails.

Seller's description on eBay was as follows:
"
Erased (DoD 5220.22 M Compliant) and tested good (Not formatted, partitioned, or allocated). Sold as used and in working condition. There may be drives that have up to 25 bad sectors. There may be writing or markings on the drive or the drive may vary slightly (color and/or label may differ) from picture but the model numbers will be the same
"
Seller is harddrivesonly with fairly good reviews.

I have contacted the seller, but I wanted guidance on:
  1. Is this a clear indication that the disk is defective (to be replaced / returned)?
  2. Is there some special configuration I need with the mpt2sas / mpt3sas driver on Linux to see SAS drives (rather than SATA)? This is my first experience with a SAS drive.
Details below


Kernel log:
Jul 01 18:21:44 fili kernel: scsi 1:0:14:0: Direct-Access HITACHI HUS72403CLAR3000 C370 PQ: 0 ANSI: 6
Jul 01 18:21:44 fili kernel: scsi 1:0:14:0: SSP: handle(0x0019), sas_addr(0x5000cca0586e4ee1), phy(12), device_name(0xa0cc0050e14e6e58)
Jul 01 18:21:44 fili kernel: scsi 1:0:14:0: enclosure logical id (0x5003048001dc17bf), slot(0)
Jul 01 18:21:44 fili kernel: scsi 1:0:14:0: qdepth(254), tagged(1), scsi_level(7), cmd_que(1)
Jul 01 18:21:44 fili kernel: scsi 1:0:14:0: Power-on or device reset occurred
Jul 01 18:21:44 fili kernel: sd 1:0:14:0: Attached scsi generic sg15 type 0
Jul 01 18:21:44 fili kernel: sd 1:0:14:0: [sdn] Spinning up disk...
Jul 01 18:21:44 fili kernel: end_device-1:1:7: add: handle(0x0019), sas_addr(0x5000cca0586e4ee1)
Jul 01 18:22:26 fili kernel: .........................................not responding...
Jul 01 18:22:26 fili kernel: sd 1:0:14:0: [sdn] Read Capacity(16) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 01 18:22:26 fili kernel: sd 1:0:14:0: [sdn] Sense Key : Hardware Error [current] [descriptor]
Jul 01 18:22:26 fili kernel: sd 1:0:14:0: [sdn] Add. Sense: Logical unit failed self-test
Jul 01 18:22:26 fili kernel: sd 1:0:14:0: [sdn] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 01 18:22:26 fili kernel: sd 1:0:14:0: [sdn] Sense Key : Hardware Error [current] [descriptor]
Jul 01 18:22:26 fili kernel: sd 1:0:14:0: [sdn] Add. Sense: Logical unit failed self-test
Jul 01 18:22:26 fili kernel: sd 1:0:14:0: [sdn] 0 512-byte logical blocks: (0 B/0 B)
Jul 01 18:22:26 fili kernel: sd 1:0:14:0: [sdn] 0-byte physical blocks
Jul 01 18:22:30 fili kernel: sd 1:0:14:0: [sdn] Write Protect is off
Jul 01 18:22:30 fili kernel: sd 1:0:14:0: [sdn] Mode Sense: cf 00 10 08
Jul 01 18:22:48 fili kernel: sd 1:0:14:0: [sdn] Write cache: disabled, read cache: enabled, supports DPO and FUA
Jul 01 18:22:48 fili kernel: sd 1:0:14:0: [sdn] Unit Not Ready
Jul 01 18:22:48 fili kernel: sd 1:0:14:0: [sdn] Sense Key : Hardware Error [current] [descriptor]
Jul 01 18:22:48 fili kernel: sd 1:0:14:0: [sdn] Add. Sense: Logical unit failed self-test
Jul 01 18:22:48 fili kernel: sd 1:0:14:0: [sdn] Read Capacity(16) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 01 18:22:48 fili kernel: sd 1:0:14:0: [sdn] Sense Key : Hardware Error [current] [descriptor]
Jul 01 18:22:48 fili kernel: sd 1:0:14:0: [sdn] Add. Sense: Logical unit failed self-test
Jul 01 18:22:48 fili kernel: sd 1:0:14:0: [sdn] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 01 18:22:48 fili kernel: sd 1:0:14:0: [sdn] Sense Key : Hardware Error [current] [descriptor]
Jul 01 18:22:48 fili kernel: sd 1:0:14:0: [sdn] Add. Sense: Logical unit failed self-test
Jul 01 18:22:58 fili kernel: sd 1:0:14:0: [sdn] Attached SCSI disk
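
For anyone skimming a log like the one above, here is a minimal Python sketch that tallies the sense data the kernel reports per device. The sample lines are copied from the log (timestamps trimmed); the names LOG and summarize_sense are only illustrative, and the regular expressions assume the sd driver message layout shown here, which may differ on other kernel versions.

```python
import re
from collections import Counter

# Sample lines copied from the kernel log above (timestamps and host name trimmed).
LOG = """\
sd 1:0:14:0: [sdn] Read Capacity(16) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 1:0:14:0: [sdn] Sense Key : Hardware Error [current] [descriptor]
sd 1:0:14:0: [sdn] Add. Sense: Logical unit failed self-test
sd 1:0:14:0: [sdn] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sd 1:0:14:0: [sdn] Sense Key : Hardware Error [current] [descriptor]
sd 1:0:14:0: [sdn] Add. Sense: Logical unit failed self-test
sd 1:0:14:0: [sdn] 0 512-byte logical blocks: (0 B/0 B)
"""

# Assumed message layout: "[dev] Sense Key : <key> [current] ..." and
# "[dev] Add. Sense: <text>", as printed in the log above.
SENSE_KEY = re.compile(r"\[(\w+)\] Sense Key : (.+?) \[")
ADD_SENSE = re.compile(r"\[(\w+)\] Add\. Sense: (.+)$")


def summarize_sense(log_text):
    """Count (device, kind, message) occurrences in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        for kind, pattern in (("sense_key", SENSE_KEY), ("add_sense", ADD_SENSE)):
            match = pattern.search(line)
            if match:
                counts[(match.group(1), kind, match.group(2).strip())] += 1
    return counts


summary = summarize_sense(LOG)
for (dev, kind, msg), n in sorted(summary.items()):
    print(f"{dev} {kind}: {msg} x{n}")
```

The same function could be fed the output of journalctl -k or dmesg; a device that reports Hardware Error on every Read Capacity attempt stands out quickly in the tally.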



smartctl -i output:
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.7.0] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: HITACHI
Product: HUS72403CLAR3000
Revision: C370
Compliance: SPC-4
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Logical block size: 512 bytes
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000cca0586e4ee0
Serial number: P9HYNMMW
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Wed Jul 1 18:26:17 2020 PDT
device Test Unit Ready [medium or hardware error (serious)]
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.



smartctl -T permissive -x /dev/sdn output:
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-5.7.0] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: HITACHI
Product: HUS72403CLAR3000
Revision: C370
Compliance: SPC-4
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Logical block size: 512 bytes
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000cca0586e4ee0
Serial number: P9HYNMMW
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Wed Jul 1 18:31:40 2020 PDT
device Test Unit Ready [medium or hardware error (serious)]
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Disabled or Not Supported
Read Cache is: Enabled
Writeback Cache is: Disabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature: 32 C
Drive Trip Temperature: 60 C

Manufactured in week 17 of year 2015
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 23
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 1488
Vendor (Seagate) cache information
Blocks sent to initiator = 0

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:      19539        0         0     19539        254       9280.166           0
write:         0        0         0         0       1185      21031.896           0
verify:        0        0         0         0       1210          0.000           0

Non-medium error count: 0

SMART Self-test log
Num  Test              Status      segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                   number   (hours)
# 1  Background short  Completed         -     34430              - [-   -    -]

Long (extended) Self Test duration: 6 seconds [0.1 minutes]

Background scan results log
Status: scan is active
Accumulated power on time, hours:minutes 34504:05 [2070245 minutes]
Number of background scans performed: 480, scan progress: 0.00%
Number of background medium scans performed: 480

Protocol Specific port log page for SAS SSP
relative target port id = 1
generation code = 1
number of phys = 1
phy identifier = 0
attached device type: expander device
attached reason: loss of dword synchronization
reason: unknown
negotiated logical link rate: phy enabled; 6 Gbps
attached initiator port: ssp=0 stp=0 smp=0
attached target port: ssp=0 stp=0 smp=1
SAS address = 0x5000cca0586e4ee1
attached SAS address = 0x5003048001dc17bf
attached phy identifier = 12
Invalid DWORD count = 0
Running disparity error count = 0
Loss of DWORD synchronization = 0
Phy reset problem = 0
Phy event descriptors:
Invalid word count: 0
Running disparity error count: 0
Loss of dword synchronization count: 0
Phy reset problem count: 0
relative target port id = 2
generation code = 1
number of phys = 1
phy identifier = 1
attached device type: no device attached
attached reason: unknown
reason: power on
negotiated logical link rate: phy enabled; unknown
attached initiator port: ssp=0 stp=0 smp=0
attached target port: ssp=0 stp=0 smp=0
SAS address = 0x5000cca0586e4ee2
attached SAS address = 0x0
attached phy identifier = 0
Invalid DWORD count = 0
Running disparity error count = 0
Loss of DWORD synchronization = 0
Phy reset problem = 0
Phy event descriptors:
Invalid word count: 0
Running disparity error count: 0
Loss of dword synchronization count: 0
Phy reset problem count: 0
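
As a quick sanity check on the usage figures in the smartctl output above, this small Python sketch converts the reported power-on time into minutes and years. The two input numbers are copied from the "Accumulated power on time" line; the 365.25-day year is just a convenient approximation.

```python
# Values copied from "Accumulated power on time, hours:minutes 34504:05" above.
hours, minutes = 34504, 5

# smartctl also prints the total in minutes; recompute it to cross-check.
total_minutes = hours * 60 + minutes

# Rough conversion to years of continuous operation (365.25-day year assumed).
years = hours / (24 * 365.25)

print(f"total minutes: {total_minutes}")
print(f"approx. years powered on: {years:.2f}")
```

The recomputed minute count matches the 2070245 figure smartctl prints, and the drive has spent a little under four years powered on.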


lsscsi -tg output:
[0:0:0:0] disk usb:2-1.3:1.0 /dev/sda /dev/sg0
[1:0:0:0] disk sas:0x5001940000d2c200 /dev/sdb /dev/sg1
[1:0:1:0] disk sas:0x5001940000d2c201 /dev/sdc /dev/sg2
[1:0:2:0] disk sas:0x5001940000d2c202 /dev/sdd /dev/sg3
[1:0:3:0] disk sas:0x5001940000d2c203 /dev/sde /dev/sg4
[1:0:4:0] disk sas:0x5001940000d2c207 /dev/sdf /dev/sg5
[1:0:5:0] disk sas:0x5001940000d2c20f /dev/sdg /dev/sg6
[1:0:6:0] enclosu sas:0x5001940000d2c23e - /dev/sg7
[1:0:7:0] disk sas:0x5003048001dc17af /dev/sdh /dev/sg8
[1:0:8:0] disk sas:0x5003048001dc17b3 /dev/sdi /dev/sg9
[1:0:9:0] disk sas:0x5003048001dc17b7 /dev/sdj /dev/sg10
[1:0:10:0] disk sas:0x5003048001dc17b8 /dev/sdk /dev/sg11
[1:0:11:0] disk sas:0x5003048001dc17ba /dev/sdl /dev/sg12
[1:0:12:0] disk sas:0x5003048001dc17bb /dev/sdm /dev/sg13
[1:0:13:0] enclosu sas:0x5003048001dc17bd - /dev/sg14
[1:0:14:0] disk sas:0x5000cca0586e4ee1 /dev/sdn /dev/sg15



sg_readcap -l /dev/sg15 output:
Read Capacity results:
Protection: prot_en=0, p_type=0, p_i_exponent=0
Logical block provisioning: lbpme=0, lbprz=0
Last logical block address=5860533167 (0x15d50a3af), Number of logical blocks=5860533168
Logical block length=512 bytes
Logical blocks per physical block exponent=0
Lowest aligned logical block address=0
Hence:
Device size: 3000592982016 bytes, 2861588.5 MiB, 3000.59 GB
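
The "Hence" line follows directly from the last LBA and the block length that sg_readcap reports. A short Python sketch of the arithmetic, using only the numbers shown above (the variable names are mine):

```python
# Values reported by sg_readcap above.
last_lba = 5860533167
block_len = 512  # bytes

# LBAs are zero-based, so the block count is last LBA + 1.
num_blocks = last_lba + 1
size_bytes = num_blocks * block_len

size_mib = size_bytes / 2**20   # binary mebibytes
size_gb = size_bytes / 10**9    # decimal gigabytes

print(num_blocks)
print(size_bytes)
print(round(size_mib, 1))
print(round(size_gb, 2))
```

These reproduce the 3000592982016 bytes, 2861588.5 MiB, and 3000.59 GB printed by sg_readcap, and the block count agrees with the one sg_format shows from Mode Sense.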


sg_format -F /dev/sg15 output:
HITACHI HUS72403CLAR3000 C370 peripheral_type: disk [0x0]
<< supports protection information>>
Unit serial number: P9HYNMMW
LU name: 5000cca0586e4ee0
Mode Sense (block descriptor) data, prior to changes:
Mode Sense (block descriptor) data, prior to changes:
<<< longlba flag set (64 bit lba) >>>
Number of blocks=5860533168 [0x15d50a3b0]
Block size=512 [0x200]

A FORMAT UNIT will commence in 15 seconds
ALL data on /dev/sg15 will be DESTROYED
Press control-C to abort

A FORMAT UNIT will commence in 10 seconds
ALL data on /dev/sg15 will be DESTROYED
Press control-C to abort

A FORMAT UNIT will commence in 5 seconds
ALL data on /dev/sg15 will be DESTROYED
Press control-C to abort
format unit:
Descriptor format, current; Sense key: Hardware Error
<<<Sense data overflow>>>
Additional sense: Diagnostic failure on component [0x90]
Descriptor type: Information: >> descriptor too short
00 00 00 00 00 00 00 00 00 00
Descriptor type: Sense key specific: Actual retry count: 0
Descriptor type: Field replaceable unit code: 0x0
Descriptor type: Block commands: Incorrect Length Indicator (ILI) clear
Descriptor type: Vendor specific [0x80]
f1 18
Descriptor type: Vendor specific [0x81Format unit command: Medium or hardware error
FORMAT UNIT failed
try '-v' for more information
 

Dreece

Active Member
Jan 22, 2019
503
160
43
I can't recall ever seeing this kind of issue; possibly corrupt firmware on the drive? Do you have another HBA/RAID controller on hand? That would confirm whether the error is the drive itself and not some other issue such as a backplane, cabling, controller firmware, or hardware defect.

Though it does indeed appear to be a return job, I'm afraid to say.
 

Dreece

Active Member
Jan 22, 2019
503
160
43
PS. Regarding whether you need special configuration at the Linux driver level: not at all. It is pretty straightforward plug-and-play, unless one has been poking (not just peeking) around at the system configuration level. Functioning SAS hardware setups should just present the drives to the OS without issue.

One thing you could do is try GParted by booting from a Uboot CD/DVD; that way you at least rule out potential configuration errors at the OS end.
 

Sundar

New Member
Oct 31, 2018
19
3
3
Thanks for your reply.
Unfortunately, I have only one SAS HBA. I could buy a cheap HBA like an LSI 9200-8i, but it would be of no use to me long term, and it would cost as much as the HDD in question. I also cannot try the HGST SAS HDD in my Rackable enclosure, even though I have free slots, because that enclosure only accepts SATA disks.
 

Sundar

New Member
Oct 31, 2018
19
3
3
The seller responded asking me to initiate a return.
I will try buying a different SAS disk, and try that.
 

Sundar

New Member
Oct 31, 2018
19
3
3
Thanks @Dreece
You're right - it probably makes sense to buy a cheap LSI 9200-8i to test any future SAS drives, taking out any variables related to the enclosure being used.

BTW, since I am new to SAS drives: if I used an "internal" HBA like the LSI 9200-8i, I know I will need an SFF-8087 cable to connect the HBA to the SAS drive. How about power for the drive? Will I be able to use a standard SATA power connector on the SAS drive? Most links seem to say that there is little physical difference between the SATA and SAS connectors except for the lockout pin.
 

Dreece

Active Member
Jan 22, 2019
503
160
43
Sure, it just depends on the cable you select.

For example, with this one you can plug SATA power connectors straight into the back of the plug:



You can also buy cables with a Molex power connector:



All kinds of options, just go whichever way fits into your build.