My Proxmox host sent me the following mail half an hour ago:
So I logged on to my host and found this:
How do I interpret this? If 0 errors is too many, then what is normal?
Both drives are Intel DC S3710. Here are the smartctl attributes for the faulted drive:
Looks fine, no?
Code:
The number of I/O errors associated with a ZFS device exceeded
acceptable levels. ZFS has marked the device as faulted.
Code:
# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 0B in 0h38m with 0 errors on Sun Mar 10 01:02:39 2019
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            sdi2    ONLINE       0     0     0
            sdk2    FAULTED      0     0     0  too many errors

errors: No known data errors
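For anyone comparing against their own pool, here is a rough sketch of how the config table could be reduced to just the devices that are not ONLINE, together with their counters. The function name `unhealthy_vdevs` is made up for this example, and it assumes the usual `NAME STATE READ WRITE CKSUM` column order shown above.

```shell
# Hypothetical helper: read `zpool status` output on stdin and list every
# entry in the config table whose STATE is not ONLINE.
unhealthy_vdevs() {
    awk '
        # The column header marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The "errors:" summary line marks the end of it.
        $1 == "errors:" { in_config = 0 }
        # Inside the table: name, state, then READ/WRITE/CKSUM counters.
        in_config && NF >= 2 && $2 != "ONLINE" {
            print $1 ": " $2 " (read=" $3 " write=" $4 " cksum=" $5 ")"
        }
    '
}

# Usage:
#   zpool status rpool | unhealthy_vdevs
```

On the output above this should list rpool, mirror-0 and sdk2, all with counters of 0, which is exactly the part I find confusing.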
Code:
# smartctl -A /dev/sdk
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.18-9-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always   -           0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always   -           25381
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always   -           70
170 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always   -           0
171 Program_Fail_Count      0x0032   100   100   000    Old_age   Always   -           0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always   -           0
174 Unsafe_Shutdown_Count   0x0032   100   100   000    Old_age   Always   -           69
175 Power_Loss_Cap_Test     0x0033   100   100   010    Pre-fail  Always   -           6870 (100 2757)
183 SATA_Downshift_Count    0x0032   100   100   000    Old_age   Always   -           2
184 End-to-End_Error        0x0033   100   100   090    Pre-fail  Always   -           0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always   -           0
190 Temperature_Case        0x0022   076   069   000    Old_age   Always   -           24 (Min/Max 21/33)
192 Unsafe_Shutdown_Count   0x0032   100   100   000    Old_age   Always   -           69
194 Temperature_Internal    0x0022   100   100   000    Old_age   Always   -           24
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           0
199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always   -           0
225 Host_Writes_32MiB       0x0032   100   100   000    Old_age   Always   -           26312405
226 Workld_Media_Wear_Indic 0x0032   100   100   000    Old_age   Always   -           7618
227 Workld_Host_Reads_Perc  0x0032   100   100   000    Old_age   Always   -           15
228 Workload_Minutes        0x0032   100   100   000    Old_age   Always   -           1522689
232 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always   -           0
233 Media_Wearout_Indicator 0x0032   093   093   000    Old_age   Always   -           0
234 Thermal_Throttle        0x0032   100   100   000    Old_age   Always   -           0/0
241 Host_Writes_32MiB       0x0032   100   100   000    Old_age   Always   -           26312405
242 Host_Reads_32MiB        0x0032   100   100   000    Old_age   Always   -           4944506
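To avoid eyeballing the whole table, a small filter like the one below could pick out the link- and media-related attributes whose raw value is not zero. This is only a sketch: `flag_smart` is a name invented for this example, and the list of attributes it watches is my own assumption about what might be relevant, not an official list from Intel or smartmontools.

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and print the
# watched attributes whose RAW_VALUE is not zero.
flag_smart() {
    awk '
        # Attribute rows start with a numeric ID; field 2 is the attribute
        # name and field 10 is the first token of RAW_VALUE.
        $1 ~ /^[0-9]+$/ &&
        $2 ~ /^(Reallocated_Sector_Ct|SATA_Downshift_Count|End-to-End_Error|Reported_Uncorrect|Current_Pending_Sector|CRC_Error_Count)$/ &&
        $10 + 0 != 0 {
            print $2 " = " $10
        }
    '
}

# Usage:
#   smartctl -A /dev/sdk | flag_smart
```

Fed the table above, it should print only `SATA_Downshift_Count = 2`; every other watched attribute has a raw value of 0.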