My fallback server has a Samsung 970 EVO NVMe SSD that reports the following:
[root@lair ~]# smartctl -a /dev/nvme0
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1062.1.2.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: Samsung SSD 970 EVO 500GB
Serial Number: S466NB0K405934V
Firmware Version: 1B2QEXE7
PCI Vendor/Subsystem ID: 0x144d
IEEE OUI Identifier: 0x002538
Total NVM Capacity: 500,107,862,016 [500 GB]
Unallocated NVM Capacity: 0
Controller ID: 4
Number of Namespaces: 1
Namespace 1 Size/Capacity: 500,107,862,016 [500 GB]
Namespace 1 Utilization: 229,721,784,320 [229 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 002538 5481b24802
Local Time is: Thu Oct 10 13:14:44 2019 CEST
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size: 512 Pages
Warning Comp. Temp. Threshold: 85 Celsius
Critical Comp. Temp. Threshold: 85 Celsius
Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 6.20W - - 0 0 0 0 0 0
1 + 4.30W - - 1 1 1 1 0 0
2 + 2.10W - - 2 2 2 2 0 0
3 - 0.0400W - - 3 3 3 3 210 1200
4 - 0.0050W - - 4 4 4 4 2000 8000
Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 37 Celsius
Available Spare: 94%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 10,122,289 [5.18 TB]
Data Units Written: 34,031,482 [17.4 TB]
Host Read Commands: 212,554,037
Host Write Commands: 1,166,877,653
Controller Busy Time: 4,563
Power Cycles: 40
Power On Hours: 10,440
Unsafe Shutdowns: 8
Media and Data Integrity Errors: 12
Error Information Log Entries: 14
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 37 Celsius
Temperature Sensor 2: 42 Celsius
Error Information (NVMe Log 0x01, max 64 entries)
No Errors Logged
I know how to read SMART data for SATA disks, but I have no experience with NVMe. I am worried about this line:
Media and Data Integrity Errors: 12
Yesterday I noticed that a few errors had been reported. After I read the whole disk with dd, the count rose to 11, and another full dd read today added one more, bringing it to 12.
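For anyone who wants to repeat the check, here is a rough sketch of how the counter can be compared before and after a full read. It is not exactly what I ran: the helper names `media_errors` and `media_errors_delta` and the report file names are made up for illustration, and `/dev/nvme0n1` as the namespace block device is an assumption (only `/dev/nvme0` appears in the output above).

```shell
#!/bin/sh
# Hypothetical helpers: neither function is part of smartmontools.

# Read "smartctl -a" output on stdin and print the value of the
# "Media and Data Integrity Errors" counter with separators stripped.
media_errors() {
    awk -F: '/^Media and Data Integrity Errors/ {
        gsub(/[^0-9]/, "", $2)   # drop spaces and thousands separators
        print $2
        exit
    }'
}

# Print how much the counter grew between two saved smartctl reports.
# A report without the counter line is treated as 0.
media_errors_delta() {
    before=$(media_errors < "$1")
    after=$(media_errors < "$2")
    echo $(( ${after:-0} - ${before:-0} ))
}

# Example session (run as root; the device names are assumptions):
#   smartctl -a /dev/nvme0 > before.txt
#   dd if=/dev/nvme0n1 of=/dev/null bs=1M
#   smartctl -a /dev/nvme0 > after.txt
#   media_errors_delta before.txt after.txt
```

A non-zero delta after a read-only pass would mean that reading alone is enough to raise the counter, which matches what I saw.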
Does this mean that my NVMe SSD is dying? It is only a little over a year old and has been used fairly lightly. Should I replace it? Is this a warranty case?