Weird SCSI errors in TrueNAS console

Octopuss

Active Member
Jun 30, 2019
412
62
28
Czech republic
I am running badblocks on four new disks through SSH on my NAS, and the console is constantly being spammed with the messages below (the screenshot was taken with just one disk being tested; with all four disks running there is four times as much):
1642081947240.png

The server: TrueNAS 12 (latest version as of today) running in ESXi 6.7, with an LSI 2308-based HBA card (flashed to the correct firmware for this use case; I don't remember the details) with passthrough properly configured.
Connected: three SATA disks as the current pool and four new disks that are being tested.
The new disks are SAS HGST HUS726040AL4210.

Badblocks seems to be happily running on all four new disks in parallel.
1642081956308.png

Does anyone know what those errors are, and whether they are harmless or something is wrong? I think I saw error 22 in the console before, with the setup I've been using for the past two years, but the SCSI messages are most probably new.
 

TRACKER

Active Member
Jan 14, 2019
182
56
28
It seems disk da4 is "giving up the ghost" :)
You probably don't see any errors in the badblocks scan because the HDD is able to recover from them internally (for example by remapping sectors).
Check the SMART status, especially the parameters related to bad blocks.
It should be reflected in some of the values there.
 

Octopuss

Read the beginning of the post carefully. The screenshot was taken with just one instance of badblocks running. The errors just multiply with more disks in the mix.
I am pretty sure everything is ok, because short and long SMART tests finished without problems and there are no errors in SMART results either.
1642085506948.png
edit: There are no SMART values reported for SAS disks.

The seller did a surface test of the disks too, and with his positive feedback the chances of him sending me FIVE faulty disks are slim to none.

I am inclined to believe the errors in the console are harmless and could maybe be somehow related to the virtualized environment. But it's still annoying to see them and not be able to figure out where they come from.
 

TRACKER

Yes, right, it is a SAS disk.
Those don't show the complete set of SMART parameters (that's a pity).
I don't think it is because of the virtualized environment.
You have a couple of SAS disks connected and are running a badblocks check on them, right?
Well, why then is only da4 showing errors?
 

TRACKER

You did not explain that you have errors for the other disks.
In your screenshot only da4 is listed.
Anyway, it is not normal to have these errors during any kind of scan, regardless of the environment.
The issue might be with the cabling, but since we cannot see the SMART values, that is just a guess.

P.S. Let's keep religion out of the discussion :)
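
The cabling guess can be checked to some extent against the per-phy link error counters that SAS disks report in the "Protocol Specific port" log page (a dump of that page appears later in this thread). Below is a minimal Python sketch, assuming the log text has the same layout as that dump; the function name `phy_link_errors` and the sample string are illustrative only, and non-zero counters would be a hint rather than proof of a link problem.

```python
import re

# Counter names as they appear in the log dump posted in this thread.
PHY_COUNTERS = (
    "Invalid DWORD count",
    "Running disparity error count",
    "Loss of DWORD synchronization count",
    "Phy reset problem count",
)

def phy_link_errors(log_text):
    """Return {phy identifier: {counter name: value}} parsed from the log text."""
    phys = {}
    current = None
    for line in log_text.splitlines():
        stripped = line.strip()
        # Matches "phy identifier = N" but not "attached phy identifier = N".
        m = re.match(r"phy identifier = (\d+)$", stripped)
        if m:
            current = int(m.group(1))
            phys[current] = {}
            continue
        for name in PHY_COUNTERS:
            prefix = name + " = "
            if current is not None and stripped.startswith(prefix):
                phys[current][name] = int(stripped[len(prefix):])
    return phys

# Excerpt of the Protocol Specific port page from the dump in this thread.
sample = """\
relative target port id = 1
  phy identifier = 0
    negotiated logical link rate: 6 Gbps
    attached phy identifier = 0
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization count = 0
    Phy reset problem count = 0
relative target port id = 2
  phy identifier = 1
    attached phy identifier = 0
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization count = 0
    Phy reset problem count = 0
"""

phys = phy_link_errors(sample)
# Keep only phys where at least one counter is non-zero.
suspect = {phy: c for phy, c in phys.items() if any(c.values())}
print(suspect)
```

For the disk in this thread all four counters are zero on both phys, so this particular check gives no evidence of a link-level fault.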
 

Octopuss

Ah, there we go!

Code:
    HGST      HUS726040AL4210   A980

Supported log pages  [0x0]:
    0x00        Supported log pages [sp]
    0x02        Write error [we]
    0x03        Read error [re]
    0x05        Verify error [ve]
    0x06        Non medium [nm]
    0x0d        Temperature [temp]
    0x0e        Start-stop cycle counter [sscc]
    0x0f        Application client [ac]
    0x10        Self test results [str]
    0x15        Background scan results [bsr]
    0x18        Protocol specific port [psp]
    0x19        General Statistics and Performance [gsp]
    0x1a        Power condition transitions [pct]
    0x2f        Informational exceptions [ie]
    0x30        Performance counters (Hitachi) [pc_hi]
    0x37        Cache (seagate) [c_se]

Write error counter page  [0x2]
  Errors corrected without substantial delay = 0
  Errors corrected with possible delays = 0
  Total rewrites or rereads = 0
  Total errors corrected = 0
  Total times correction algorithm processed = 93968
  Total bytes processed = 7253681980480 [7 TB]
  Total uncorrected errors = 0

Read error counter page  [0x3]
  Errors corrected without substantial delay = 0
  Errors corrected with possible delays = 0
  Total rewrites or rereads = 0
  Total errors corrected = 0
  Total times correction algorithm processed = 160754
  Total bytes processed = 16235510690688 [16 TB]
  Total uncorrected errors = 0

Verify error counter page  [0x5]
  Errors corrected without substantial delay = 0
  Errors corrected with possible delays = 0
  Total rewrites or rereads = 0
  Total errors corrected = 0
  Total times correction algorithm processed = 164350
  Total bytes processed = 0
  Total uncorrected errors = 0

Non-medium error page  [0x6]
  Non-medium error count = 0

Temperature page  [0xd]
  Current temperature = 32 C
  Reference temperature = 85 C

Start-stop cycle counter page  [0xe]
  Date of manufacture, year: 2016, week: 45
  Accounting date, year: 2016, week: 45
  Specified cycle count over device lifetime = 50000
  Accumulated start-stop cycles = 19
  Specified load-unload count over device lifetime = 600000
  Accumulated load-unload cycles = 461

Application client page  [0xf]
 00     0f 00 40 00 00 00 03 fc  00 00 00 00 00 00 00 00
 10     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 20     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 30     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 .....  [truncated after 64 of 16388 bytes (use '-H' to see the rest)]

Self-test results page  [0x10]
  Parameter code = 1, accumulated power-on hours = 10606
    self-test code: background extended [2]
    self-test result: completed without error [0]
  Parameter code = 2, accumulated power-on hours = 10596
    self-test code: background short [1]
    self-test result: completed without error [0]

Background scan results page  [0x15]
  Status parameters:
    Accumulated power on minutes: 637315 [h:m  10621:55]
    Status: background medium scan is active
    Number of background scans performed: 52
    Background medium scan progress: 79.08 %
    Number of background medium scans performed: 52

Protocol Specific port page for SAS SSP  (sas-2) [0x18]
relative target port id = 1
  generation code = 1
  number of phys = 1
  phy identifier = 0
    attached SAS device type: SAS or SATA device
    attached reason: power on
    reason: unknown
    negotiated logical link rate: 6 Gbps
    attached initiator port: ssp=1 stp=1 smp=1
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000cca25d36da11
    attached SAS address = 0x500605b008b3ade0
    attached phy identifier = 0
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization count = 0
    Phy reset problem count = 0
    Phy event descriptors:
     Invalid word count: 0
     Running disparity error count: 0
     Loss of dword synchronization count: 0
     Phy reset problem count: 0
relative target port id = 2
  generation code = 1
  number of phys = 1
  phy identifier = 1
    attached SAS device type: no device attached
    attached reason: unknown
    reason: power on
    negotiated logical link rate: phy enabled; unknown rate
    attached initiator port: ssp=0 stp=0 smp=0
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000cca25d36da12
    attached SAS address = 0x0
    attached phy identifier = 0
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization count = 0
    Phy reset problem count = 0
    Phy event descriptors:
     Invalid word count: 0
     Running disparity error count: 0
     Loss of dword synchronization count: 0
     Phy reset problem count: 0

General Statistics and Performance  [0x19]
Statistics and performance log parameter
  number of read commands = 121673164
  number of write commands = 166723234
  number of logical blocks received = 1770918472
  number of logical blocks transmitted = 3963747727
  read command processing intervals = 0
  write command processing intervals = 0
  weighted number of read commands plus write commands = 0
  weighted read command processing plus write command processing = 0
Idle time log parameter
  idle time intervals = 730469572
Time interval log parameter for general stats
  time interval negative exponent = 2
  time interval integer = 5

Power condition transitions page  [0x1a]
  Accumulated transitions to active = 3
  Accumulated transitions to idle_a = 3
  Accumulated transitions to idle_b = 0
  Accumulated transitions to idle_c = 0
  Accumulated transitions to standby_z = 0
  Accumulated transitions to standby_y = 0

Informational Exceptions page  [0x2f]
  IE asc = 0x0, ascq = 0x0
    Current temperature = 32 C
    Threshold temperature = 85 C  [common extension]

HGST/WDC performance counters page [0x30]
  Zero Seeks = 0
  Seeks >= 2/3 = 4
  Seeks >= 1/3 and < 2/3 = 2
  Seeks >= 1/6 and < 1/3 = 0
  Seeks >= 1/12 and < 1/6 = 3
  Seeks > 0 and < 1/12 = 1
  Overrun Counter = 0
  Underrun Counter = 0
  Device Cache Full Read Hits = 17
  Device Cache Partial Read Hits = 0
  Device Cache Write Hits = 0
  Device Cache Fast Writes = 0
  Device Cache Read Misses = 10

HGST/WDC miscellaneous page [0x37, 0x0]
  Power on hours = 10621
  Total Bytes Read = 16235510690688
  Total Bytes Written = 7253682096960
  Max Drive Temp (Celsius) = 59
  GList Size = 0
  Number of Information Exceptions = 0
  MED EXC = 0
  HDW EXC = 0
  Total Read Commands = 121673164
  Total Write Commands = 166723242
  Flash Correction Count = 0
octopuss@Skladiste:~ $
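
With four disks to check, reading a dump like this by eye gets tedious. The following is a minimal Python sketch that pulls the "Total uncorrected errors" value out of each error counter page; it assumes the text layout shown above, and the function name `uncorrected_errors` is made up for illustration.

```python
import re

def uncorrected_errors(log_text):
    """Map each error counter page name to its 'Total uncorrected errors' value."""
    results = {}
    page = None
    for line in log_text.splitlines():
        # Page headers look like: "Read error counter page  [0x3]"
        header = re.match(r"^(\S.*?) error counter page\s+\[0x[0-9a-f]+\]", line)
        if header:
            page = header.group(1)
            continue
        if not line.strip():
            page = None  # a blank line ends the current page
            continue
        m = re.match(r"^\s+Total uncorrected errors = (\d+)", line)
        if m and page:
            results[page] = int(m.group(1))
    return results

# Shortened excerpt of the three error counter pages from the dump above.
sample = """\
Write error counter page  [0x2]
  Total errors corrected = 0
  Total bytes processed = 7253681980480 [7 TB]
  Total uncorrected errors = 0

Read error counter page  [0x3]
  Total errors corrected = 0
  Total bytes processed = 16235510690688 [16 TB]
  Total uncorrected errors = 0

Verify error counter page  [0x5]
  Total errors corrected = 0
  Total bytes processed = 0
  Total uncorrected errors = 0
"""

report = uncorrected_errors(sample)
print(report)
```

Saving one dump per disk to a file and feeding each file's contents to the function would give a quick side-by-side view of whether any disk has logged uncorrected errors.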
 

TRACKER

nasbdh9, thank you for the command, very useful; I did not know such a command existed.
Regarding the logs: everything looks fine. There is one thing that might somehow be interfering with the badblocks scan:


" Status: background medium scan is active
...
Background medium scan progress: 79.08 %
"
 

Octopuss

I have no idea what that is but it's certainly not related to badblocks. That percentage isn't changing at all; it's been like that since the beginning.
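
One way to confirm that the reported progress really is static is to compare two captures of the background scan results page taken some time apart. Here is a minimal Python sketch, assuming the page text looks like the dump earlier in this thread; the helper names `scan_progress` and `scan_is_advancing` are hypothetical.

```python
import re

def scan_progress(log_text):
    """Return the background medium scan progress in percent, or None if absent."""
    m = re.search(r"Background medium scan progress:\s*([0-9.]+)\s*%", log_text)
    return float(m.group(1)) if m else None

def scan_is_advancing(earlier, later):
    """True if the progress value differs between two captures, None if unknown."""
    a, b = scan_progress(earlier), scan_progress(later)
    if a is None or b is None:
        return None
    return b != a

# Excerpt of the background scan results page from the dump in this thread.
capture = """\
Background scan results page  [0x15]
  Status parameters:
    Accumulated power on minutes: 637315 [h:m  10621:55]
    Status: background medium scan is active
    Number of background scans performed: 52
    Background medium scan progress: 79.08 %
"""

print(scan_progress(capture))
# Comparing the capture with itself mimics the "not changing at all" case.
print(scan_is_advancing(capture, capture))
```

A comparison on the percentage alone is a coarse signal; the "Number of background scans performed" field could be compared the same way.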