[Update: Seller Complaints Accumulating] HGST Ultrastar He10 - 10TB @ $129.95

heromode

Active Member
May 25, 2020
139
83
28
Code:
root@aeronas:~ # which smartctl -a /dev/da2
/usr/local/sbin/smartctl
-a: Command not found.
/dev/da2: Command not found.
nano /usr/local/sbin/smartctl

(to see if it's ascii or binary)

# ls -la /usr/sbin/smartctl
-rwxr-xr-x 1 root root 896912 Mar 8 2021 /usr/sbin/smartctl
 

heromode

I understand that the mechanical portion of the drive is the same. From what I've been reading, sparsely, it appears that the SAS side of the controller's SCSI doesn't have any uniformity, and a lot of things are hidden from SMART, especially if they are OEM drives.
If that is the case, I will be contacting the smartctl devs and crying until I get my helium readings!
 

pr1malr8ge

Member
Nov 27, 2017
63
20
8
40
Well out of curiosity I called HGST support to ask if th
nano /usr/local/sbin/smartctl

(to see if it's ascii or binary)

# ls -la /usr/sbin/smartctl
-rwxr-xr-x 1 root root 896912 Mar 8 2021 /usr/sbin/smartctl
Btw, I just got off the phone with WD/HGST, asking if there was a way to check helium levels given my M/N and S/N, and they said NO.
But below is the nano output and your ls -la.

Code:
^?ELF^B^A^A     ^@^@^@^@^@^@^@^@^B^@>^@^A^@^@^@��%^@^@^@^@^@@^@^@^@^@^@^@^@8n
^@Smartctl open device: %s failed: %s
^@afterselect,off^@Option -t pending,N (N=%d) must have 0 <= N <= 65535
^@Option -s aam,N must have 0 <= N <= 254
^@-identify^@ioctl[,N], ataioctl[,N], scsiioctl[,N], nvmeioctl[,N]^@protocol^@SET FEATURES [Disable SATA featur>
^@%d^@No information found^@. %u:%u %n^@%u-%u %n^@. 15:8 Must be set to 0x80^@ 50 Capabilities^@ 57-58 Current >
^@APM disable failed: %s
^@Write cache reordering %sable failed: %s
^@ATA IDLE command failed: %s
^@SMART Extended Comprehensive Error Log (GP Log 0x03) not supported

^@Read SMART Extended Self-test Log failed

^@Selective Self-tests/Logging not supported

^@Logical Sectors Written^@Pending Error Count^@naa^@, zeroed^@Zoned Device:     %s
^@Unknown(0x%04x)^@Transport Type:   Parallel, %s
^@ATA/ATAPI-4 X3T13/1153D revision 6^@ATA8-ACS T13/1699-D revision 3c^@ATA8-ACS T13/1699-D revision 3f^@ACS-2 T>
^@SATA NCQ Send and Receive log^@logged_count^@ at LBA = 0x%08x = %u^@Invalid Error Log index = 0x%02x (T13/132>

^@over_limit_count^@0x%02x%s%s (rev %d) ==
^@R_ERR response for host-to-device non-data FIS^@$Id: dev_interface.cpp 5115 2020-11-09 22:07:22Z chrfranke $$>
^@Buffer not set for DATA IN/OUT command^@JMB39x: Restore original sector (%szero filled)
^@JMB39x: Zero filling original data
^@jmb_get_sector_type(data) == 1^@0x%04x:0x%04x^@smartmontools home page: https://www.smartmontools.org/
^@-v 170,raw48,Grown_Failing_Block_Ct -v 171,raw48,Program_Fail_Count -v 172,raw48,Erase_Fail_Count -v 173,raw4>
THIS DRIVE MAY OR MAY NOT BE AFFECTED,
see the following web pages for details:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/207951en
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=632758^@CC3[5-9A-Z]^@A firmware update for this drive may be >
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/218171en^@ST3(250[68]2|32062|40062|50063|75064)0NS^@-v 9,msec24>
^@Firmware Version:                   %s
                                              [ Read 2240 lines ]
^G Get Help     ^O Write Out    ^W Where Is     ^K Cut Text     ^J Justify      ^C Cur Pos      M-U Undo
^X Exit         ^R Read File    ^\ Replace      ^U Paste Text   ^T To Spell     ^_ Go To Line   M-E Redo
Code:
root@aeronas:~ # ls -la /usr/local/sbin/smartctl
-r-xr-xr-x  1 root  wheel  685560 Apr 14 23:09 /usr/local/sbin/smartctl
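For anyone who would rather not open a binary in nano: the `^?ELF` at the top of that dump is the ELF magic number (byte 0x7f followed by "ELF"), so the same "ascii or binary" question can be answered by reading the first few bytes. A minimal sketch, assuming Python is available on the box; the function name and the four labels are made up for illustration.

```python
ELF_MAGIC = b"\x7fELF"  # 0x7f is what nano displays as ^?


def classify_file(path):
    """Guess whether a file is an ELF binary, a script, other binary, or text.

    Only the first 512 bytes are inspected, so this is a heuristic.
    """
    with open(path, "rb") as fh:
        head = fh.read(512)
    if head.startswith(ELF_MAGIC):
        return "elf"
    if head.startswith(b"#!"):
        return "script"
    if b"\x00" in head:
        return "binary"
    return "text"
```

Something like `classify_file("/usr/local/sbin/smartctl")` should come back as "elf" for a compiled smartctl, which is what the nano output above already suggests.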
 

pr1malr8ge

WOW!!

that is real news.
Not for sure. However, the more I've been digging into this, the more I think it's just that SAS drives, or SCSI in general, don't report much more to SMART than some basic info.
As I said, with the same HBA and expander talking to the 3TB HGST SATA drives, it gives full output, so it's not a function of BSD vs Linux but rather SAS vs SATA.

Code:
root@aeronas:~ # smartctl -x /dev/da9
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p14 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi/HGST Ultrastar 7K4000
Device Model:     Hitachi HUS724030ALE641
Serial Number:    P8G8KPWR
LU WWN Device Id: 5 000cca 22cc3e56f
Firmware Version: MJ8OA5F0
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Apr 18 14:30:04 2022 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Disabled
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (   24) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (   1) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO-R--   100   100   016    -    0
  2 Throughput_Performance  P-S---   136   136   054    -    80
  3 Spin_Up_Time            POS---   119   119   024    -    521 (Average 522)
  4 Start_Stop_Count        -O--C-   100   100   000    -    23
  5 Reallocated_Sector_Ct   PO--CK   100   100   005    -    0
  7 Seek_Error_Rate         PO-R--   100   100   067    -    0
  8 Seek_Time_Performance   P-S---   121   121   020    -    34
  9 Power_On_Hours          -O--C-   099   099   000    -    10741
 10 Spin_Retry_Count        PO--C-   100   100   060    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    23
192 Power-Off_Retract_Count -O--CK   100   100   000    -    314
193 Load_Cycle_Count        -O--C-   100   100   000    -    314
194 Temperature_Celsius     -O----   181   181   000    -    33 (Min/Max 15/38)
196 Reallocated_Event_Count -O--CK   100   100   000    -    0
197 Current_Pending_Sector  -O---K   100   100   000    -    0
198 Offline_Uncorrectable   ---R--   100   100   000    -    0
199 UDMA_CRC_Error_Count    -O-R--   200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x03       GPL     R/O      1  Ext. Comprehensive SMART error log
0x04       GPL     R/O      7  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x08       GPL     R/O      2  Power Conditions log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x12       GPL     R/O      1  SATA NCQ Non-Data log
0x20       GPL     R/O      1  Streaming performance log [OBS-8]
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x80       GPL     R/W     63  Host vendor specific log
0x81-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xb2       GPL     VS      63  Device vendor specific log
0xc8       GPL     VS     617  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     10584         -
# 2  Extended offline    Completed without error       00%     10498         -
# 3  Short offline       Completed without error       00%     10418         -
# 4  Short offline       Completed without error       00%     10178         -
# 5  Extended offline    Completed without error       00%     10091         -
# 6  Short offline       Completed without error       00%     10023         -
# 7  Short offline       Completed without error       00%     10013         -
# 8  Short offline       Completed without error       00%      9846         -
# 9  Extended offline    Completed without error       00%      9758         -
#10  Short offline       Completed without error       00%      9678         -
#11  Short offline       Completed without error       00%      9510         -
#12  Extended offline    Completed without error       00%      9422         -
#13  Short offline       Completed without error       00%      9342         -
#14  Short offline       Completed without error       00%      9176         -
#15  Extended offline    Completed without error       00%      9088         -
#16  Short offline       Completed without error       00%      9043         -
#17  Short offline       Completed without error       00%      9010         -
#18  Short offline       Completed without error       00%      8770         -
#19  Extended offline    Completed without error       00%      8682         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       256 (0x0100)
Device State:                        SMART Off-line Data Collection executing in background (4)
Current Temperature:                    33 Celsius
Power Cycle Min/Max Temperature:     25/35 Celsius
Lifetime    Min/Max Temperature:     15/38 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/60 Celsius
Min/Max Temperature Limit:           -40/70 Celsius
Temperature History Size (Index):    128 (90)

Index    Estimated Time   Temperature Celsius
  91    2022-04-18 12:23    33  **************
...    ..(126 skipped).    ..  **************
  90    2022-04-18 14:30    33  **************

SCT Error Recovery Control:
           Read:      1 (0.1 seconds)
          Write:      1 (0.1 seconds)

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
0x01  =====  =               =  ===  == General Statistics (rev 2) ==
0x01  0x008  4              23  ---  Lifetime Power-On Resets
0x01  0x018  6     41564556434  ---  Logical Sectors Written
0x01  0x020  6       735317484  ---  Number of Write Commands
0x01  0x028  6     42947793217  ---  Logical Sectors Read
0x01  0x030  6       299172214  ---  Number of Read Commands
0x03  =====  =               =  ===  == Rotating Media Statistics (rev 1) ==
0x03  0x008  4           10739  ---  Spindle Motor Power-on Hours
0x03  0x010  4           10739  ---  Head Flying Hours
0x03  0x018  4             314  ---  Head Load Events
0x03  0x020  4               0  ---  Number of Reallocated Logical Sectors
0x03  0x028  4               0  ---  Read Recovery Attempts
0x03  0x030  4               0  ---  Number of Mechanical Start Failures
0x04  =====  =               =  ===  == General Errors Statistics (rev 1) ==
0x04  0x008  4               0  ---  Number of Reported Uncorrectable Errors
0x04  0x010  4               0  ---  Resets Between Cmd Acceptance and Completion
0x05  =====  =               =  ===  == Temperature Statistics (rev 1) ==
0x05  0x008  1              33  ---  Current Temperature
0x05  0x010  1              32  N--  Average Short Term Temperature
0x05  0x018  1              31  N--  Average Long Term Temperature
0x05  0x020  1              38  ---  Highest Temperature
0x05  0x028  1              15  ---  Lowest Temperature
0x05  0x030  1              36  N--  Highest Average Short Term Temperature
0x05  0x038  1              25  N--  Lowest Average Short Term Temperature
0x05  0x040  1              35  N--  Highest Average Long Term Temperature
0x05  0x048  1              25  N--  Lowest Average Long Term Temperature
0x05  0x050  4               0  ---  Time in Over-Temperature
0x05  0x058  1              60  ---  Specified Maximum Operating Temperature
0x05  0x060  4               0  ---  Time in Under-Temperature
0x05  0x068  1               0  ---  Specified Minimum Operating Temperature
0x06  =====  =               =  ===  == Transport Statistics (rev 1) ==
0x06  0x008  4              31  ---  Number of Hardware Resets
0x06  0x010  4              12  ---  Number of ASR Events
0x06  0x018  4               0  ---  Number of Interface CRC Errors
                                |||_ C monitored condition met
                                ||__ D supports DSN
                                |___ N normalized value

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0009  2            0  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            0  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS
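Since the whole question is whether attribute 22 shows up, here is a rough sketch of pulling the ATA attribute table out of saved `smartctl -x` text and looking for ID 22. It assumes the column layout shown above; `parse_attributes` and `find_helium` are hypothetical helper names, and the attribute name is not checked, only the ID. It will not help on SAS drives that print no attribute table at all.

```python
import re

# One row of the "Vendor Specific SMART Attributes" table as printed above:
# ID, name, six flag characters, VALUE, WORST, THRESH, FAIL, RAW_VALUE.
ATTR_ROW = re.compile(
    r"^\s*(\d{1,3})\s+(\S+)\s+([A-Z-]{6})\s+(\d{3})\s+(\d{3})\s+(\d{3})\s+(\S+)\s+(.*)$"
)


def parse_attributes(text):
    """Return {attribute_id: fields} for every attribute row found in text."""
    attrs = {}
    for line in text.splitlines():
        m = ATTR_ROW.match(line)
        if not m:
            continue
        attrs[int(m.group(1))] = {
            "name": m.group(2),
            "flags": m.group(3),
            "value": int(m.group(4)),
            "worst": int(m.group(5)),
            "thresh": int(m.group(6)),
            "fail": m.group(7),
            "raw": m.group(8).strip(),
        }
    return attrs


def find_helium(attrs):
    """Return the row for attribute ID 22 if the drive reported one, else None."""
    return attrs.get(22)
```

Run against the 7K4000 output above, `find_helium` would return None, since that table has no ID 22 row.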
 

heromode

That ID 22 data still does exist in the SAS controller PCB. No matter what the WD/HGST guys are saying, the correct place to ask about this would be the smartmontools dev team.

Helium-filled drives are among the most significant technological advancements in HDD technology in the past decade. If that parameter exists in the SAS HDD PCB IC ROM, then it will be pulled out of there, one way or the other. I think this discussion will yet lead to positive results. I will certainly contact the smartmontools team once I get my drives and can confirm this.
 

heromode

If this parameter doesn't exist in SAS drives, well, then it doesn't exist in SATA drives either, and the Helium_Level reading is another scam just like the SMR scam. Either way, I think this discussion is very significant.

If the Helium_Level reading is real, it will be pulled out of the SAS drives one way or the other.
 

pr1malr8ge

Another issue I've run into is the SMART background long test.
Long story short: when I first got the drives, I took my working TrueNAS down to run some tests first, as I didn't have space to test any of the drives with the current machine running. I figured that since they had already gone through the infant mortality period and the little bit of SMART data available showed no errors, I would run a short test, which completed, and then decided to run a long test. When it came back at over a day, I decided I didn't want to have my TrueNAS down that long. So I shut down the system, put all of the old drives back in, and then proceeded to swap in a single 10TB drive at a time to expand my pool.
With all of that said, after all of the drives had been put in service I noticed the background long test was not completing, and this was about a week after the last drive went into service.

With that, I started a new test and canceled the old test.

I watched the new test progress, and then it reset to 0.0% and never showed complete, other than saying it's waiting on a BMS interval timer to expire.
I'm wondering if all 6 drives have to scan one at a time, and only when all 6 are done will it show completed?
However, more than one drive is scanning at a time, so that shouldn't be it.
Side note: there is nothing I can do to abort the current one in progress. -X returns
Abort self test failed [unsupported field in scsi command]

Code:
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Aborted (by user command)   -   28417                 - [-   -    -]
# 2  Background long   Self test in progress ...   -     NOW                 - [-   -    -]
# 3  Background short  Completed                   -   28250                 - [-   -    -]

Long (extended) Self-test duration: 65535 seconds [1092.2 minutes]
Code:
Self-test results page  [0x10]
  Parameter code = 1, accumulated power-on hours = 28417
    self-test code: background extended [2]
    self-test result: aborted by SEND DIAGNOSTIC [1]
  Parameter code = 2, accumulated power-on hours = 0
    self-test code: background extended [2]
    self-test result: self test in progress [15]
  Parameter code = 3, accumulated power-on hours = 28250
    self-test code: background short [1]
    self-test result: completed without error [0]

Background scan results page  [0x15]
  Status parameters:
    Accumulated power on minutes: 1708151 [h:m  28469:11]
    Status: background scan enabled, none active (waiting for BMS interval timer to expire)
    Number of background scans performed: 169
    Background medium scan progress: 0.00 %
    Number of background medium scans performed: 169
Here is da3, which shows what da2 did when it was "scanning":

Code:
Self-test results page  [0x10]
  Parameter code = 1, accumulated power-on hours = 28395
    self-test code: background extended [2]
    self-test result: aborted by SEND DIAGNOSTIC [1]
  Parameter code = 2, accumulated power-on hours = 0
    self-test code: background extended [2]
    self-test result: self test in progress [15]
  Parameter code = 3, accumulated power-on hours = 28249
    self-test code: background short [1]
    self-test result: completed without error [0]

Background scan results page  [0x15]
  Status parameters:
    Accumulated power on minutes: 1706858 [h:m  28447:38]
    Status: background medium scan is active
    Number of background scans performed: 168
    Background medium scan progress: 93.08 %
    Number of background medium scans performed: 168
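To compare drives without eyeballing each dump, the "Background scan results page" text can be reduced to a few fields. This is only a sketch that parses the text layout shown in the two outputs above; `parse_background_scan` is a made-up helper, and the "is active" check is an assumption based on the wording of the two status lines seen here.

```python
import re


def parse_background_scan(text):
    """Extract status, progress and scan count from a background scan page dump."""
    # "Status parameters:" does not match because the colon must follow "Status".
    status = re.search(r"^\s*Status:\s*(.+)$", text, re.M)
    progress = re.search(r"Background medium scan progress:\s*([\d.]+)\s*%", text)
    scans = re.search(r"Number of background scans performed:\s*(\d+)", text)
    status_text = status.group(1).strip() if status else None
    return {
        "status": status_text,
        "progress_pct": float(progress.group(1)) if progress else None,
        "scans_performed": int(scans.group(1)) if scans else None,
        "active": bool(status_text and "is active" in status_text),
    }
```

Fed the da3 text above it should report an active scan at 93.08 %, while the da2 text reports an idle scan at 0.00 % waiting on the interval timer. Note that this page is separate from the self-test results page printed just before it.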
 

heromode

You can run a long test while the drives are still in use; there is no need to take them offline. Just start the test and take note of the estimated time. Your system can still read and write to them at the same time.
 

heromode

If I remember correctly, a long 'offline' test doesn't mean your drives are really offline during the test. It just means they might not have 100% performance during the test, but they are still perfectly usable by your system while the test is running.

Edit: I think a BMS, or Background Medium Scan, is a Seagate thing, which is impossible to disable. It will run every time you either power on or even reset your computer. I remember it took about 18 hours to complete with my 8x3TB Seagate drives, during which time they ran hot and it was a general pain. Another reason never to use Seagate drives.
 

heromode

Code:
Background scan results page  [0x15]
  Status parameters:
    Accumulated power on minutes: 1706858 [h:m  28447:38]
    Status: background medium scan is active
    Number of background scans performed: 168
    Background medium scan progress: 93.08 %
    Number of background medium scans performed: 168
This is not a smartctl function, it's a hard-coded Seagate process, and I remember spending weeks trying to disable it with every Seagate 'SeaChest' utility and any other tool that exists on the internet, only to find it is impossible. It will run every time you reset or power on your computer, no matter what.

Again, the BMS or Background Medium Scan is a proprietary Seagate process that cannot be stopped, and if you reboot it will start again and will not stop until it reaches 100%. It has nothing to do with a smartctl self-test at all. There is some 'interval timer', but again, it's all hard-coded into Seagate drives and cannot be disabled in any way.
 

pr1malr8ge

I don't have Seagate drives anywhere.

As for the long tests, I understand they can be run with the drives online and in use.
As I said, I started the first one when I did a quick test of the drives by themselves in the chassis. I then shut the chassis down, put my 12 3TB drives back in the system, rebooted the machine, and then went through and replaced a single drive at a time with the 10TBs in the first 6-disk vdev to expand the pool. However, I didn't let the initial long test finish. After all 6 10TBs were in, I used -X to stop the test I had started and ran a new long test, but they appear to never complete.
 

heromode

Don't have seagate drives anywhere.
Well then I don't know. I had Seagate NL-SAS (nearline SAS) drives that did just this, and it has nothing to do with smartctl, so doing a smartctl -X, or whatever the command is to stop a smartctl long or short self-test, will not affect a BMS at all. Plus the BMS will always start again on a reboot and run until it reaches 100%, and cannot be stopped or disabled at all.

Try searching the internet for BMS and Seagate and you should find some info.

Edit: with that weird Seagate line in your smartctl -x output, plus this BMS scan, you've got some weird stuff going on here. I think you need to start doubting everything now and really find out what kind of drives you are running.

Edit: well now, it's my bad as well. Notice you have HUS drives, not HUH. HUS = no helium. You don't have helium drives at all, and my guess here is, without knowing, that you have some Hitachi drives that are based on Seagate drives, because of the BMS stuff.
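The HUS vs HUH rule of thumb above is easy to apply to the "Device Model" line from smartctl. A tiny sketch, assuming only that rule; it is not an official model-number decoder, and `looks_like_helium_model` is a made-up name.

```python
def looks_like_helium_model(device_model):
    """Apply the thread's rule of thumb: model numbers starting with HUH are helium.

    The model number is taken to be the last whitespace-separated token,
    so a vendor prefix such as "Hitachi" is ignored.
    """
    tokens = device_model.split()
    if not tokens:
        return False
    return tokens[-1].upper().startswith("HUH")
```

For the drive in the smartctl -x output earlier, `looks_like_helium_model("Hitachi HUS724030ALE641")` gives False.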
 

pr1malr8ge

I think I've managed to get myself into a bit of a runaround situation.
So, the one long test that has been endlessly running as far as the report is concerned isn't actually running. It's a remnant log entry. Because I started the test but didn't let it finish, it just stays in the log, but it's not actually running and really isn't important. I started a new -t short and watched it show a new status as it ran, then show completed in the #1 spot, and then I started a -t long, which is now showing progress.
I think the line showing the long test as 65535 seconds [1092.2 minutes] will stay there as a reference at all times, and that was what was tripping me up. I was thinking it was a progress indicator and should be counting down.
However, running the test again now shows a status line with the percentage remaining. It's a reverse counter, so it goes from 100% down to 0% rather than from 0% to 100%.
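
A small sketch of the reverse-counter reading described here, in Python: it turns smartctl's "N% of test remaining" status line into a percent-completed figure. The function name percent_complete is made up for illustration, and the regex assumes the exact wording of the status line as posted in this thread. Also note that 65535 is 0xFFFF, the largest value a 16-bit field can hold, so the reported duration may be a capped figure rather than a real estimate; that is an assumption, not something confirmed in this thread.

```python
import re


def percent_complete(status_line: str) -> int:
    """Convert smartctl's 'N% of test remaining' into percent completed.

    Hypothetical helper: assumes the wording seen in this thread's output.
    """
    match = re.search(r"(\d+)% of test remaining", status_line)
    if match is None:
        raise ValueError(f"no remaining-percentage found in: {status_line!r}")
    remaining = int(match.group(1))
    # The drive counts down from 100 to 0, so invert it.
    return 100 - remaining


# The duration value reported for the long test is the 16-bit maximum.
LONG_TEST_SECONDS = 65535

line = "Self-test execution status:             99% of test remaining"
print(percent_complete(line), "percent done")
print(LONG_TEST_SECONDS == 0xFFFF)
```

So a line reading "99% of test remaining" means the test has only just started, not that it is nearly finished.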

Now, as far as the BMS side of things goes, your Reddit re-post makes sense. Since it's running in the background, I'm not worried about it.
Nothing like trying to figure something out and mistaking one thing for another.

Now we just need to figure out why we can't poll the vendor-specific information, because I would very much like to know the helium levels too, as an indicator of when to replace a drive.

Code:
Self-test execution status:             100% of test remaining
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Self test in progress ...   -     NOW                 - [-   -    -]
# 2  Background short  Completed                   -   28470                 - [-   -    -]
# 3  Background long   Aborted (by user command)   -   28417                 - [-   -    -]
# 4  Background long   Self test in progress ...   -     NOW                 - [-   -    -]
# 5  Background short  Completed                   -   28250                 - [-   -    -]

Long (extended) Self-test duration: 65535 seconds [1092.2 minutes]
Code:
Self-test execution status:             99% of test remaining
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Self test in progress ...   -     NOW                 - [-   -    -]
# 2  Background short  Completed                   -   28470                 - [-   -    -]
# 3  Background long   Aborted (by user command)   -   28417                 - [-   -    -]
# 4  Background long   Self test in progress ...   -     NOW                 - [-   -    -]
# 5  Background short  Completed                   -   28250                 - [-   -    -]

Long (extended) Self-test duration: 65535 seconds [1092.2 minutes]
 

heromode

#4 is a smartctl long self test that you started.
#3 is that same test you aborted. It will remain in logs forever.

You don't have a helium drive. Your Hitachi drive's model number starts with HUS, not HUH. The second H in HUH = helium.
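
The prefix rule stated here can be sketched as a quick Python check. This is only an illustration of the rule as given in this post (HUH = helium, HUS = not); the function name is hypothetical, and it assumes the model string may carry a vendor word in front, as sg_logs prints it.

```python
def is_helium_model(model: str) -> bool:
    """Return True if any token of the model string starts with 'HUH'.

    Hypothetical helper based only on the naming rule quoted in this thread:
    the second H in 'HUH' marks a helium-filled drive, 'HUS' does not.
    """
    return any(token.upper().startswith("HUH") for token in model.split())


for model in ("HUH721010AL42C0", "Hitachi HUS72403"):
    print(model, "->", "helium" if is_helium_model(model) else "not helium")
```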

2022-04-19 01.42.40 documents.westerndigital.com a0b5467c875c.png

Once you allow this long self-test to complete, it will become "# 1 Background long Completed". Don't interrupt it, because that can affect the resale value of the drive.

heromode

No, all my drives are HUH721010AL42C0s, the 10TBs that is.
The 3TBs are HUS and SATA. I think you probably skimmed over that.
RIGHT :)

And your 3TB drives are the ones with BMS. I admit that at the beginning of all this I never checked your model numbers; I just assumed they were all HUH. My bad.

Now go check the helium levels on your HUH drives :)
 

pr1malr8ge

No, the BMS is being reported on the 10TB drives. I'm using sg_logs -a on /dev/da2 through /dev/da7 (which are my 10TB drives); da8 through da13 are the 3TB SATA drives, and they do not report the BMS.
And all of this started because the 10TB drives will not show helium levels or the vendor-specific outputs.

Code:
root@aeronas:~ # sg_logs -a /dev/da8
    ATA       Hitachi HUS72403  A5F0

Supported log pages  [0x0]:
    0x00        Supported log pages [sp]
    0x10        Self test results [str]
    0x2f        Informational exceptions [ie]

Self-test results page  [0x10]
  Parameter code = 1, accumulated power-on hours = 10584
    self-test code: background short [1]
    self-test result: completed without error [0]
  Parameter code = 2, accumulated power-on hours = 10498
    self-test code: background extended [2]
    self-test result: completed without error [0]

Informational Exceptions page  [0x2f]
  IE asc = 0x0, ascq = 0x0
    Current temperature = 33 C
    Threshold temperature = 0 C  [common extension]
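
To compare drives on whether they list a background scan page at all, the "Supported log pages" section of sg_logs -a output like the one above can be parsed. A rough Python sketch follows. The function name is hypothetical, the parsing assumes the layout shown here, and 0x15 as the Background Scan Results page code comes from the SCSI standard as I understand it, not from this thread.

```python
import re
from typing import Set

# Assumption: 0x15 is the Background Scan Results log page code.
BACKGROUND_SCAN_PAGE = 0x15


def supported_log_pages(sg_logs_output: str) -> Set[int]:
    """Collect page codes from indented '0xNN  Name [abbr]' lines.

    Hypothetical helper; only handles the simple layout shown in this thread.
    """
    pages = set()
    for line in sg_logs_output.splitlines():
        match = re.match(r"\s+0x([0-9a-fA-F]{2})\s+\S", line)
        if match:
            pages.add(int(match.group(1), 16))
    return pages


# Sample copied from the da8 (3TB SATA) output in this post.
sample = """\
Supported log pages  [0x0]:
    0x00        Supported log pages [sp]
    0x10        Self test results [str]
    0x2f        Informational exceptions [ie]
"""

pages = supported_log_pages(sample)
print("background scan page listed:", BACKGROUND_SCAN_PAGE in pages)
```

Run against each drive's saved output, this would show which devices advertise the page and which, like da8 here, do not.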