Are my "new" drives good or bad?

nle

Member
Oct 24, 2012
201
11
18
Hi all, I recently purchased 7 HGST 4TB SAS drives after discovering @BLinux's thread in "Great deals". I need six for my zpool, but bought one extra.

I’ve used them to upgrade my Dell R710 from 2TB SATA drives to these 4TB SAS drives. I run an OmniOS VM with the SAS controller passed through. All the drives are in a single raidz2 zpool.

A few of the drives have issues:
  • One drive that seems to work-ish, but refuses to complete the long S.M.A.R.T test. (I've pulled that one)
  • One drive that shows quite a few errors, but seems to have stabilized.
  • Two drives with few “errors”.
  • (Please see below for all the disk data.)
Since I only have one SAS card and it’s on a live system, I have limited options to “burn them in”/test.

The way I’ve tried to test and verify the drives is by:
  1. Pulling each existing 2TB SATA drive and replacing it with a 4TB drive. That forces a resilver that took around 10 hours.
  2. Running long S.M.A.R.T tests.
  3. Running multiple regular scrubs.
  4. Filling the zpool with semi-random data and then verifying it again via this “disk-filltest” script (took around 30 hours).
  5. Running a full scrub with no errors (~30 hours), but S.M.A.R.T threw an error on one drive (not sure how bad it is?):
Code:
zpool status
  pool: datapool
 state: ONLINE
  scan: scrub repaired 0 in 35h34m with 0 errors on Fri Jun  8 10:45:01 2018
config:

    NAME                        STATE     READ WRITE CKSUM
    datapool                    ONLINE       0     0     0
      raidz2-0                  ONLINE       0     0     0
        c14t5000CCA25D52E475d0  ONLINE       0     0     0
        c11t5000CCA25D50CB0Dd0  ONLINE       0     0     0
        c15t5000CCA25D52E731d0  ONLINE       0     0     0
        c12t5000CCA25D52E6B1d0  ONLINE       0     0     0
        c16t5000CCA25D5294A5d0  ONLINE       0     0     0
        c13t5000CCA25D52D9C1d0  ONLINE       0     0     0

errors: No known data errors

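A minimal sketch of how the per-device counters in the `zpool status` output above could be checked automatically after each scrub. It assumes the column layout shown above (NAME, STATE, READ, WRITE, CKSUM) and plain integer counters; the function names are made up for illustration, and abbreviated counters such as "1.2K" are not handled.

```python
def parse_zpool_errors(status_text):
    """Return {device_name: (read, write, cksum)} from `zpool status` text."""
    devices = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True  # device rows follow the header row
            continue
        if in_config:
            if line.startswith("errors:"):
                break  # end of the config table
            # A device row is: name, state, then three integer counters.
            if len(fields) >= 5 and all(f.isdigit() for f in fields[2:5]):
                devices[fields[0]] = tuple(int(f) for f in fields[2:5])
    return devices


def devices_with_errors(devices):
    """Keep only the devices where any counter is non-zero."""
    return {name: counts for name, counts in devices.items() if any(counts)}


# Excerpt of the output posted above.
SAMPLE = """\
config:

    NAME                        STATE     READ WRITE CKSUM
    datapool                    ONLINE       0     0     0
      raidz2-0                  ONLINE       0     0     0
        c14t5000CCA25D52E475d0  ONLINE       0     0     0

errors: No known data errors
"""

counters = parse_zpool_errors(SAMPLE)
print(devices_with_errors(counters))
```

A clean pool only shows that ZFS did not see any read, write, or checksum errors; the drive's own counters still need to be checked separately.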

The drive with the most errors (SN: K4HGL6ZB) returned a bunch of errors after resilvering. I pulled it and used another drive instead, then later tested it in another slot. This time the resilvering went fine, and it also passes a long S.M.A.R.T test. Could it be that the drive was badly seated in the bay?

Drive info:
(I’m putting the ones with errors first)

Code:
Vendor:               HGST
Product:              HUS726040ALS214
Revision:             MS00
Compliance:           SPC-4
User Capacity:        4,000,787,030,016 bytes [4.00 TB]
Logical block size:   512 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca25d522048
Serial number:        K4HG54ZB
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Sat Jun  2 11:17:57 2018 CEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     36 C
Drive Trip Temperature:        55 C

Manufactured in week 51 of year 2016
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  3
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  222
Elements in grown defect list: 127

Vendor (Seagate) cache information
  Blocks sent to initiator = 321994889363456

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0       4290         26.509           0
write:         0      312         0       312      25792       1443.841          37
verify:        0        0         0         0       1928          0.000           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Failed in segment -->       7      29        1335917654 [0x3 0x5d 0x1]
# 2  Background long   Failed in segment -->       7      24        1345541393 [0x3 0x5d 0x1]
# 3  Background short  Failed in segment -->       3      15                 - [-   -    -]
# 4  Background long   Failed in segment -->       3      14                 - [-   -    -]
# 5  Background long   Failed in segment -->       3      13                 - [-   -    -]

Long (extended) Self Test duration: 6 seconds [0.1 minutes]
Code:
Vendor:               HGST
Product:              HUS726040ALS214
Revision:             MS00
Compliance:           SPC-4
User Capacity:        4,000,787,030,016 bytes [4.00 TB]
Logical block size:   512 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca25d52e474
Serial number:        K4HGL6ZB
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Tue Jun  5 10:50:44 2018 CEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     34 C
Drive Trip Temperature:        55 C

Manufactured in week 51 of year 2016
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  6
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  22
Elements in grown defect list: 77

Vendor (Seagate) cache information
  Blocks sent to initiator = 705321592946688

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        6         0         6      20691       4882.801           6
write:         0      315         0       315      12424       2875.993           0
verify:        0        0         0         0       1559          0.000           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                   -      31                 - [-   -    -]

Long (extended) Self Test duration: 6 seconds [0.1 minutes]
Code:
Vendor:               HGST
Product:              HUS726040ALS214
Revision:             MS00
Compliance:           SPC-4
User Capacity:        4,000,787,030,016 bytes [4.00 TB]
Logical block size:   512 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca25d52d9c0
Serial number:        K4HGKHWB
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Tue Jun  5 10:50:43 2018 CEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     35 C
Drive Trip Temperature:        55 C

Manufactured in week 51 of year 2016
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  2
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  7
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 424789848096768

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        3         0         3      16911       6459.455           0
write:         0        0         0         0       5782       1517.858           0
verify:        0        0         0         0       1213          0.000           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                   -      34                 - [-   -    -]

Long (extended) Self Test duration: 6 seconds [0.1 minutes]
Code:
Vendor:               HGST
Product:              HUS726040ALS214
Revision:             MS00
Compliance:           SPC-4
User Capacity:        4,000,787,030,016 bytes [4.00 TB]
Logical block size:   512 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca25d52e730
Serial number:        K4HGLDMB
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Tue Jun  5 10:50:45 2018 CEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     37 C
Drive Trip Temperature:        55 C

Manufactured in week 51 of year 2016
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  6
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  13
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 513317093244928

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        5         0         5      47010      14936.730           0
write:         0        0         0         0      17654       1511.105           0
verify:        0        0         0         0       2059          0.000           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                   -      96                 - [-   -    -]
# 2  Background short  Completed                   -       1                 - [-   -    -]

Long (extended) Self Test duration: 6 seconds [0.1 minutes]
Code:
Vendor:               HGST
Product:              HUS726040ALS214
Revision:             MS00
Compliance:           SPC-4
User Capacity:        4,000,787,030,016 bytes [4.00 TB]
Logical block size:   512 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca25d50cb0c
Serial number:        K4HEEEHB
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Tue Jun  5 10:50:42 2018 CEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     35 C
Drive Trip Temperature:        55 C

Manufactured in week 51 of year 2016
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  5
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  12
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 483625380347904

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0      29838      11544.655           0
write:         0        0         0         0      14616       1570.023           0
verify:        0        0         0         0        896          0.000           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                   -      70                 - [-   -    -]

Long (extended) Self Test duration: 6 seconds [0.1 minutes]
Code:
Vendor:               HGST
Product:              HUS726040ALS214
Revision:             MS00
Compliance:           SPC-4
User Capacity:        4,000,787,030,016 bytes [4.00 TB]
Logical block size:   512 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca25d52e6b0
Serial number:        K4HGLBLB
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Tue Jun  5 10:50:43 2018 CEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     36 C
Drive Trip Temperature:        55 C

Manufactured in week 51 of year 2016
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  5
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  11
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 449035173363712

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0      22886       8174.324           0
write:         0        0         0         0      10759       1528.826           0
verify:        0        0         0         0        867          0.000           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                   -      47                 - [-   -    -]

Long (extended) Self Test duration: 6 seconds [0.1 minutes]
Code:
Vendor:               HGST
Product:              HUS726040ALS214
Revision:             MS00
Compliance:           SPC-4
User Capacity:        4,000,787,030,016 bytes [4.00 TB]
Logical block size:   512 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca25d5294a4
Serial number:        K4HGDX6B
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Tue Jun  5 10:50:45 2018 CEST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     36 C
Drive Trip Temperature:        55 C

Manufactured in week 51 of year 2016
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  2
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  8
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 471423462146048

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0      25896       9863.903           0
write:         0        0         0         0      14611       1569.825           0
verify:        0        0         0         0       1033          0.000           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                   -      59                 - [-   -    -]

Long (extended) Self Test duration: 6 seconds [0.1 minutes]
My main question to you guys is:
Based on this info, which of these drives could I trust (and which should I return/exchange)?

(On a side note. I’m in the EU and the seller in the US, so the shipping is quite expensive. The heavier/larger package, the more expensive, etc.)

I don’t have much experience with buying “used” drives, so all input/advice is much appreciated!
 

BLinux

cat lover server enthusiast
Jul 7, 2016
2,539
979
113
artofserver.com
i would return K4HG54ZB for sure. K4HGL6ZB seems less of an issue and depending on how much return is going to cost you, may or may not be worth returning.

first, I would contact Scott. Share with him these errors. He should send you replacement drives for them with no additional cost to you. Now, see if they want the defective drives back. If not, then you have nothing to lose and all to gain. If they do want the defective drives back, then you need to consider the cost (and ask if they would pay for return shipping?) to you for shipping 1 or 2 drives back. although i'm in the US, so it is easier for me, Scott took care of any defective drives I got with a good one with no hassle at all.
 

pricklypunter

Well-Known Member
Nov 10, 2015
1,608
471
83
Canada
Pretty much any disk that displays a growing list of smart errors is a question mark imo, it's not going to get any better from there. I would return both K4HG54ZB and K4HGL6ZB, if it's reasonable to do so :)
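One way to check whether a drive's error counters are actually growing, rather than just non-zero, is to compare two snapshots taken some time apart. The sketch below assumes the counters have already been collected into dictionaries. The "earlier" values are taken from the K4HGL6ZB output in the first post, while the "later" values are invented purely for illustration.

```python
def growing_counters(earlier, later):
    """Return {counter: (old, new)} for every counter that increased."""
    return {
        name: (earlier.get(name, 0), value)
        for name, value in later.items()
        if value > earlier.get(name, 0)
    }


# Values reported for K4HGL6ZB in the first post.
earlier = {"grown_defects": 77, "read_uncorrected": 6, "write_uncorrected": 0}
# HYPOTHETICAL later snapshot, not real data from this thread.
later = {"grown_defects": 80, "read_uncorrected": 6, "write_uncorrected": 0}

print(growing_counters(earlier, later))
```

An empty result between two snapshots would match the "seems to have stabilized" description; any entry in the result means the drive is still getting worse.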
 

Terry Kennedy

Well-Known Member
Jun 25, 2015
1,068
508
113
New York City
www.glaver.org
nle said: "(On a side note. I’m in the EU and the seller in the US, so the shipping is quite expensive. The heavier/larger package, the more expensive, etc.)"
I have never dealt with that seller, but if the only reason he wants the drives back is so that people can't claim they have bad drives and then keep using them, you could offer to send only the logic boards back as proof that you aren't using the drives. That would save you a good deal on shipping: they could probably go in a padded envelope by regular mail, rather than as a parcel.
nle said: "I don’t have too much experience from buying “used” drives, so all input/advice is much appreciated!"
I reject any SAS drives that have non-zero "Elements in grown defect list". I'd be particularly suspicious of these, as the "Blocks sent to initiator" doesn't seem to correlate with the number of power-on hours implied by "Accumulated start-stop cycles", so I suspect these drives may have had their SMART data erased at some point.
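The rejection rule above can be sketched as a small script over saved `smartctl` output. This assumes the SAS report layout shown in the first post; the function names and the returned keys are made up for illustration, and the sample is an excerpt of the K4HGL6ZB report.

```python
import re


def summarize_sas_smart(report):
    """Pull a few health fields out of one smartctl report for a SAS drive."""
    serial = re.search(r"^Serial number:\s+(\S+)", report, re.M)
    grown = re.search(r"^Elements in grown defect list:\s+(\d+)", report, re.M)
    # The last column of each error counter row is "Total uncorrected errors".
    uncorrected = sum(
        int(row.group(1).split()[-1])
        for row in re.finditer(r"^(?:read|write|verify):[ \t]+(.+)$", report, re.M)
    )
    failed = re.findall(r"^#\s*\d+\s+Background \w+\s+Failed", report, re.M)
    return {
        "serial": serial.group(1) if serial else None,
        "grown_defects": int(grown.group(1)) if grown else None,
        "uncorrected_errors": uncorrected,
        "failed_self_tests": len(failed),
    }


def reject(summary):
    """Rule from the post above: a non-zero grown defect list means reject."""
    return bool(summary["grown_defects"])


# Excerpt of the K4HGL6ZB report from the first post.
REPORT = """\
Serial number:        K4HGL6ZB
Elements in grown defect list: 77
read:          0        6         0         6      20691       4882.801           6
write:         0      315         0       315      12424       2875.993           0
verify:        0        0         0         0       1559          0.000           0
# 1  Background long   Completed                   -      31                 - [-   -    -]
"""

summary = summarize_sas_smart(REPORT)
print(summary["serial"], "reject" if reject(summary) else "keep")
```

Note that `reject` treats a report with no grown defect line as "keep", so a report that failed to parse should be checked by hand.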
 

nle

Thanks for the replies, much appreciated.

It seems like there is a consensus about returning two of the drives. And good tip about the logic boards.

@Terry Kennedy Would you care to elaborate a bit about "Blocks sent to initiator"? I can't seem to find much about it. @BLinux, do you know anything about it? How were your drives in that regard?

My drives came sealed in anti-static bags that were a bit too big, but it felt legit.
 

nle

I sent Scott an e-mail yesterday asking what he suggests I do with the two defective drives. I had already notified him about two weeks ago about one of the bad drives, and told him that I would e-mail him once I had tested the rest of the drives thoroughly.

We'll see when he replies.
 

Terry Kennedy

nle said: "@Terry Kennedy Would you care to elaborate a bit about "Blocks sent to initiator"?"
Never mind. It is a bug in smartmontools that assumes that if a drive has a log page 37, that page must be in Seagate format (log pages 30 and above are reserved for OEM-specific use). So smartmontools is reading the HGST data (which is about completely different stuff) and mis-interpreting it as Seagate data. I may send a patch in to the maintainer if I get motivated enough to do something about it.

For reference: HGST log page 37 and Seagate log page 37 definitions (both 1-page PDF files).
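To illustrate the kind of mistake described above, here is a small sketch that contrasts "the page exists, so decode it as Seagate" with a check on the vendor string first. This is not smartmontools code: the function names and return labels are hypothetical, and it assumes the page number 37 is hexadecimal (0x37).

```python
# Assumption: "log page 37" in the post above means 0x37.
CACHE_LOG_PAGE = 0x37


def naive_decoder(supported_pages, vendor):
    """Mirrors the assumption described above: page present => Seagate layout."""
    if CACHE_LOG_PAGE in supported_pages:
        return "seagate_cache"
    return None


def vendor_aware_decoder(supported_pages, vendor):
    """Only apply the Seagate layout when the drive reports itself as Seagate."""
    if CACHE_LOG_PAGE not in supported_pages:
        return None
    if vendor.strip().upper() == "SEAGATE":
        return "seagate_cache"
    return "vendor_specific_unknown"


# The drives in this thread report "Vendor: HGST".
print(naive_decoder({0x37}, "HGST"))
print(vendor_aware_decoder({0x37}, "HGST"))
```

With the vendor check in place, an HGST drive would no longer get a "Vendor (Seagate) cache information" section with meaningless "Blocks sent to initiator" numbers.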
 