Question about an Intel P3700

Jun 22, 2015
91
61
18
I bought a P3700 from ebay and here is part of the smartctl output:
Code:
SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        28 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    137,777,948,212 [70.5 PB]
Data Units Written:                 140,116,515,500 [71.7 PB]
Host Read Commands:                 1,884,131,082
Host Write Commands:                1,887,107,611
Controller Busy Time:               153
Power Cycles:                       282
Power On Hours:                     586
Unsafe Shutdowns:                   469
Media and Data Integrity Errors:    70
Error Information Log Entries:      0
The PB written figure is unusually high, and I think it has passed Intel's endurance rating, but the available spare is still 100%. Is something wrong with this drive?
There are also a bunch of error entries:
Code:
Error Information (NVMe Log 0x01, max 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0         62     1  0x0c77  0x4502  0x000         1032     1     -
  1         61     1  0x0c76  0x4502  0x000         1024     1     -
  2         60     1  0x011c  0x4502  0x000            0     1     -
  3         59     1  0x00f0  0x4502  0x000            0     1     -
  4         58     1  0x00cb  0x4502  0x000            0     1     -
  5         57     1  0x004a  0x4502  0x000            0     1     -
  6         56     1  0x00ef  0x4502  0x000            0     1     -
  7         55     1  0x006d  0x4502  0x000            0     1     -
  8         54     1  0x00a5  0x4502  0x000            0     1     -
  9         53     1  0x007f  0x4502  0x000            0     1     -
 10         52     1  0x011d  0x4502  0x000            0     1     -
 11         51     1  0x0143  0x4502  0x000            0     1     -
 12         50     1  0x009e  0x4502  0x000            0     1     -
 13         49     1  0x0104  0x4502  0x000            0     1     -
 14         48     1  0x0106  0x4502  0x000            0     1     -
 15         47     1  0x004a  0x4502  0x000            0     1     -
... (46 entries not shown)
TIA
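For anyone trying to read the Status column above, here is a rough sketch of how 0x4502 can be split into its fields. It assumes the column is the raw 16-bit status field from the NVMe error log entry, with the phase tag in bit 0 and the status code and status code type above it, which is my reading of the NVMe spec. The function name `decode_status` and the lookup tables are made up for illustration, and the tables are far from complete.

```python
# Sketch: split an NVMe error-log status value into its bit fields.
# Assumption: bit 0 = phase tag, bits 8:1 = status code (SC),
# bits 11:9 = status code type (SCT), bit 14 = More, bit 15 = Do Not Retry.

STATUS_CODE_TYPES = {
    0x0: "Generic Command Status",
    0x1: "Command Specific Status",
    0x2: "Media and Data Integrity Errors",
    0x7: "Vendor Specific",
}

# Partial table of status codes for SCT 0x2 only.
MEDIA_ERROR_CODES = {
    0x80: "Write Fault",
    0x81: "Unrecovered Read Error",
}


def decode_status(raw: int) -> dict:
    """Return the individual fields of a 16-bit status value."""
    return {
        "phase_tag": raw & 0x1,
        "status_code": (raw >> 1) & 0xFF,
        "status_code_type": (raw >> 9) & 0x7,
        "more": (raw >> 14) & 0x1,
        "do_not_retry": (raw >> 15) & 0x1,
    }


fields = decode_status(0x4502)
sct_name = STATUS_CODE_TYPES.get(fields["status_code_type"], "unknown")
sc_name = "unknown"
if fields["status_code_type"] == 0x2:
    sc_name = MEDIA_ERROR_CODES.get(fields["status_code"], "unknown")

print(fields)
print(sct_name, "/", sc_name)
```

Under those assumptions every entry in the listing decodes to status code type 0x2 with status code 0x81, which would line up with the non-zero Media and Data Integrity Errors counter rather than with a link or host problem.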
 

Rand__

Well-Known Member
Mar 6, 2014
6,626
1,767
113
This does not seem possible... 71 PB in 586 hours? That's 121 TB/h -> 2 TB/min -> 34 GB/s.

Also, at 512,000 bytes per data unit that's 63 PB written, not 71 PB, so not even that is consistent.
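The arithmetic above can be sketched in a few lines. This takes the Data Units Written and Power On Hours values from the first smartctl dump and assumes one data unit is 1000 blocks of 512 bytes. It also prints the total in binary units, because the 63 figure looks like what you get when dividing by 2^50 instead of 10^15, so the gap may just be PB versus PiB.

```python
# Sketch: convert NVMe data units to bytes and to an implied average write rate.
# Assumption: one data unit = 1000 * 512 bytes.
BYTES_PER_DATA_UNIT = 512 * 1000

data_units_written = 140_116_515_500  # from the first smartctl output
power_on_hours = 586                  # from the first smartctl output

bytes_written = data_units_written * BYTES_PER_DATA_UNIT

pb_decimal = bytes_written / 10**15   # petabytes (powers of ten)
pib_binary = bytes_written / 2**50    # pebibytes (powers of two)

# Average rate if all of it had really been written during the power-on time.
seconds_powered_on = power_on_hours * 3600
gb_per_second = bytes_written / seconds_powered_on / 10**9

print(f"{pb_decimal:.1f} PB = {pib_binary:.1f} PiB")
print(f"implied average write rate: {gb_per_second:.1f} GB/s")
```

Either way the implied rate is far beyond what a single PCIe 3.0 x4 drive can sustain, so the conclusion that the counter is wrong still holds.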
 

Dreece

Active Member
Jan 22, 2019
503
160
43
The 282 power cycles would have forced me to do a return. Fair enough, it ain't a spinner, but still. Looks like it was used in a workstation rather than a server, and therein lies the conundrum... ex-workstation hardware is often a bad investment.
 
Jun 22, 2015
91
61
18
I bought it last October thinking it was a deal ($50 for a 400 GB drive from a USA seller with a 100% rating), and I never bothered to run smartctl. Well, it's a good lesson, it seems... Always check SMART as soon as you buy a drive!
 

Dreece

Active Member
Jan 22, 2019
503
160
43
@Whaaat - IIRC shutdowns do not directly relate to power-cycles. A shutdown can simply be a reboot/crash where the power isn't cycled as well as actually powering-off.
 

Whaaat

Active Member
Jan 31, 2020
301
157
43
IIRC shutdowns do not directly relate to power-cycles. A shutdown can simply be a reboot/crash where the power isn't cycled as well as actually powering-off
Completely the opposite is true. Before shutdown, the drive has to be notified by the OS just so it can flush its cache (and unload the heads in the case of spinners). Each time power is removed without this notification, the 'unsafe' counter is incremented.

plp0.PNG
plp1.PNG
plp2.PNG
 
Jun 22, 2015
91
61
18
So I just downloaded Intel MAS, and the first thing I did was update the firmware to the latest version. After that, all the written/read numbers became TB instead of PB, so now it makes much more sense.
Code:
Critical Warning:                   0x00
Temperature:                        20 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    137,784,484 [70.5 TB]
Data Units Written:                 140,117,154 [71.7 TB]
Host Read Commands:                 1,884,161,170
Host Write Commands:                1,887,119,855
Controller Busy Time:               153
Power Cycles:                       287
Power On Hours:                     589
Unsafe Shutdowns:                   474
Media and Data Integrity Errors:    70
Error Information Log Entries:      0
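A quick way to see what changed is to divide the counters from before the firmware update by the ones after it. The sketch below uses only the Data Units values from the two smartctl dumps in this thread. A ratio very close to 1000 would suggest, though not prove, that the old firmware was counting 512-byte units where 1000 x 512-byte units are expected.

```python
# Sketch: compare Data Units counters before and after the firmware update.
before = {"read": 137_777_948_212, "written": 140_116_515_500}
after = {"read": 137_784_484, "written": 140_117_154}

ratios = {}
for name in before:
    ratios[name] = before[name] / after[name]
    print(f"{name}: before/after = {ratios[name]:.3f}")
```

Both ratios come out just under 1000. The small shortfall would be consistent with a little extra I/O happening between the two readings, such as the firmware update itself and the reboots.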
Now one thing that still concerns me is the high CRC error count reported by Intel MAS, which is about 200k:
Code:
sudo intelmas show -sensor -intelssd 3

ReadOnlyMode : False
ReliabilityDegraded : False
TemperatureThresholdExceeded : False
AvailableSpareBelowThreshold : False
SpecifiedPCBMinOperatingTemp : 0
PowerCycles : 0x011F
LowestLifetimeTemperature : 17
HighestLifetimeTemperature : 20
ErrorInfoLogEntries : 0x00
SpecifiedPCBMaxOperatingTemp : 71
EraseFailCount : 0
ProgramFailCount : 0
UnsafeShutdowns : 0x01DA
PowerOnHours : 0x024D
AvailableSpare : 100
ThermalThrottleCount : 0
MaxNandEraseCycles : 150
AverageNandEraseCycles : 132
MinNandEraseCycles : 118
PercentageUsed : 0
Temperature - Celsius : 20
EnduranceAnalyzer : Property not found
CrcErrorCount : 202028
MediaErrors : 0x046
EndToEndErrorDetectionCount : 70
ThermalThrottleStatus : 0
ReadOnlyMode : False
DeviceStatus : Healthy
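Some of the intelmas fields are printed in hex while smartctl prints decimal, which makes them look different at first glance. A small sketch to cross-check the four hex fields against the post-update smartctl values quoted earlier in this post:

```python
# Sketch: confirm the hex values from intelmas match the decimal smartctl values.
intelmas_hex = {
    "PowerCycles": "0x011F",
    "UnsafeShutdowns": "0x01DA",
    "PowerOnHours": "0x024D",
    "MediaErrors": "0x046",
}

smartctl_decimal = {
    "PowerCycles": 287,
    "UnsafeShutdowns": 474,
    "PowerOnHours": 589,
    "MediaErrors": 70,
}

matches = {}
for name, hex_value in intelmas_hex.items():
    decoded = int(hex_value, 16)
    matches[name] = decoded == smartctl_decimal[name]
    print(f"{name}: {hex_value} = {decoded}, smartctl says {smartctl_decimal[name]}")
```

All four agree, so the two tools appear to be reading the same counters.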
 

Rand__

Well-Known Member
Mar 6, 2014
6,626
1,767
113
Well, I wouldn't worry about that unless it keeps increasing.

IIRC my old ones didn't have very good values either when I SMART-tested them a few months ago, but then I remembered that there were some firmware issues in the earlier days that might have caused this.

Just put it in somewhere, use it, and then check again in a week or so.
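If you want to make the "check again in a week" step less manual, something like the sketch below could diff two saved copies of the `intelmas show -sensor` output. The helper names `parse_sensor_output` and `changed_counters` are made up for illustration. The first snapshot is a subset of the values posted above, and the second snapshot is entirely hypothetical, invented only to show what the diff looks like.

```python
# Sketch: diff two saved "Name : value" sensor dumps and report changed counters.

def parse_sensor_output(text: str) -> dict:
    """Parse 'Name : value' lines into integers, skipping non-numeric values."""
    counters = {}
    for line in text.splitlines():
        name, separator, value = line.partition(" : ")
        if not separator:
            continue
        try:
            # Base 0 accepts both decimal ("70") and hex ("0x046") strings.
            counters[name.strip()] = int(value.strip(), 0)
        except ValueError:
            # Values such as "False" or "Healthy" are ignored.
            continue
    return counters


def changed_counters(earlier: dict, later: dict) -> dict:
    """Return the increase (or decrease) for every counter that changed."""
    return {
        name: later[name] - earlier[name]
        for name in earlier
        if name in later and later[name] != earlier[name]
    }


snapshot_now = """\
PowerOnHours : 0x024D
CrcErrorCount : 202028
MediaErrors : 0x046
EndToEndErrorDetectionCount : 70
DeviceStatus : Healthy
"""

# Hypothetical values one week (168 hours) later, not real data.
snapshot_later = """\
PowerOnHours : 0x02F5
CrcErrorCount : 202028
MediaErrors : 0x046
EndToEndErrorDetectionCount : 70
DeviceStatus : Healthy
"""

changes = changed_counters(
    parse_sensor_output(snapshot_now),
    parse_sensor_output(snapshot_later),
)
print(changes)
```

In this made-up example only PowerOnHours moves, which is the outcome you would hope for. If CrcErrorCount or MediaErrors showed up in the result, that would be the signal to start worrying.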
 

Dreece

Active Member
Jan 22, 2019
503
160
43
Completely the opposite is true. Before shutdown, the drive has to be notified by the OS just so it can flush its cache (and unload the heads in the case of spinners). Each time power is removed without this notification, the 'unsafe' counter is incremented.
If you re-read my post, I did state that, but that counter ALSO covers other scenarios. When I have a few minutes I'll see if I can dig up the blurb on that particular SMART counter. It is most definitely not exclusively unsafe power-offs; that's just something I remember from a few years back when digging around the SMART info, though I am not entirely sure whether each manufacturer follows the same method.
 