What Are The best Used Datacenter SAS/SATA SSD To Buy For New Server Build?

uberguru · Sep 9, 2024

I am trying to build some data center servers for SaaS app and i was thinking of setting up data redundancy using used datacenter SSD drives as opposed to buying them new and not having redundancy because of the high cost.

Another reason is i plan to use ZFS as filesystem so i can easily expand the storage size by adding more mirror vdevs to the pool
That way i can easily increase size and also have redundancy using mirrored vdevs

I will not be writing that much data so used datacenter SSD drives will be like new regular SSD drives for my usecase.
I am mostly used to Samsung 870 Evo drives 1 x 250GB (no redundancy) for the OS and 1 x 4TB(no redundancy) for data but i want to change that and start using used datacenter SSDs so i can setup redundancy as mentioned above

Servers am planning to use are R630, R730xd and R830

The R730xd will be used to run TrueNAS and the others will be native ZFS setup on Ubuntu 24.04

I saw this very old article https://www.servethehome.com/used-enterprise-ssds-dissecting-our-production-ssd-population/ but wanted to ask here to get what recommendations are for used datacenter SSDs on ebay will be

I am looking for ~3.8-4TB and/or ~6.8-8TB datacenter SSD drives. I dont mind the bigger ~15-16TB ones if the price make sese.
I know the bigger are more expensive so am ok with the smaller ~4TB ones too
No NVMe, just SAS/SATA datacenter SSDs for now

CyklonDX · Sep 9, 2024

i'd rather stay away from samsung and micron. Not that they are bad - just don't like 'em - and dell ain't too kind with them either. In my exp, most of dell branded disks are Toshiba, then Samsung.

Go for ones that have plenty of endurance. HW RAID wise you won't get TRIM on them (works on in software mode), so keep that in mind. (also a reason why there's plenty of slightly used ssd's on the market)

Personally at home&work i've played with
HGST 800G SAS3 SSD ~> they are great if they come with right firmware. (some people had hard times with them in terms of performance)
Toshiba PX05xxxx sas3 ssd great disks, they are a ok endurance wise - not greatest. Thus far my favorite sas ssd's in company with dell servers, and at home. (some people reported them dying - i'd lean to really bad luck or issues with his system.)

Since your aim is zfs, i recommend you go with toshiba 3.8T px05's, and get pcie sas controller - don't use the mini h controller - its limited x4 pcie, and io queue. If you are in market for used ones, ensure you go for grade A with 90+ endurance left.

for brand new dell certified disks this is a decent site (not that great for recertified)

https://www.serversupply.com/SSD%20W-TRAY/SAS-12GBPS?maxPrice=8700&mfgrFilter=DELL-&minPrice=0&condFilter=New&page=1

Read intensive = shitty write endurance

uberguru · Sep 9, 2024

CyklonDX said:
i'd rather stay away from samsung and micron. Not that they are bad - just don't like 'em - and dell ain't too kind with them either. In my exp, most of dell branded disks are Toshiba, then Samsung.

What about the Intel datacenter SSDs? Intel S3500 and S3700? but those are too small capacity for my usecase
The Intel D3-S4610 are good ones as they have up to 7.68TB

CyklonDX said:
Since your aim is zfs, i recommend you go with toshiba 3.8T px05's, and get pcie sas controller - don't use the mini h controller - its limited x4 pcie, and io queue. If you are in market for used ones, ensure you go for grade A with 90+ endurance left.

So are you saying Dell is crazy with non certified or supported brands? I am coming from Lenovo NX360 M4 and never had issue with a drive not working. I have tried mostly samsung 870 evo and samsung NVMe and never any issue.

I will be using HBA330 controller for the zfs/truenas setup

I am looking for used for the value and since datacenter their endurance even used will be better than consumer SSD new

CyklonDX · Sep 9, 2024

The intel disks are very nice, just make sure there's endurance (as those are quite old already - performance wise might not be the best too, since those are sata as i recall).

Dell is kinda ish crazy if you want some features like SED, and stuff, or able to use it in dell MD/ME storage array - it can plainly refuse non-dell certified disk. For your use case i don't think you'll have any issue - except having a warn in idrac about disks being uncertified, and not showing you how much % of life you have left.

I'd recommend getting 9300-8i instead on pcie for zfs. If you are running the perc h330 mini, You will be stuck at pcie x4 (3.9GB/s max R/W), with sas ssd's you can easily go over that with 4-8 disks. Thus pcie x8 bus is recommended as r630/730 backplane do support some 8GB/s R/W max.
(there are 2 most common types of h330, one 8-lane on pcie which is 9300-8i with dell firmware, and h330 mini mono which runs at 4-lane pcie - some sites tell its 8lane but its not true, it caps at some 4GB/s - and if you push your zfs with a lot of read/write requests you'll see big performance difference going with non-dell branded controller. -- idk why but documentation does state its pcie x8 but never in around 100 r630 boxes have i seen h730p or h330 do pcie x8 from them.)

uberguru · Sep 9, 2024

CyklonDX said:
I'd recommend getting 9300-8i instead on pcie for zfs. If you are running the perc h330 mini, You will be stuck at pcie x4 (3.9GB/s max R/W), with sas ssd's you can easily go over that with 4-8 disks. Thus pcie x8 bus is recommended as r630/730 backplane do support some 8GB/s R/W max.
(there are 2 types of h330, one 8-lane on pcie which is 9300-8i with dell firmware, and h330 mini mono which runs at 4-lane pcie - some sites tell its 8lane but its not true, it caps at some 4GB/s)

here is the HBA330 CAT-10845#Dell HBA330 12Gb/s Host Bus Controller Mini Mono and they are at 12Gb/s

so far am looking at the Intel SSD D3 series but from my calculations seems i might stick to the samsung 870 evo 4TB
they are cheap and yes the write endurance is not as great but brand new at least i know what to expect and also i am not writing that much data to them like that to need the endurance of the datacenter drives which are like 10x the endurance and also the datacenter drives have PLP but honestly for the price difference i might stick with the 4TB 870 evo

it is funny how ebay gets with old stuffs, the old SSD still dont make sense much
i was thinking they will be reasonable with the prices since they are used but the sellers are out of their minds with the prices for used drives
so might end up sticking to the brand new 870 evo and just setup redundancy

CyklonDX · Sep 9, 2024

yeah 12gb/s is sas3 support.
single 12Gb/s is 1.5GB/s

I would expect this card to cap at some 4GB/s (with sata ssd's its fine.)

uberguru · Sep 9, 2024

CyklonDX said:
yeah 12gb/s is sas3 support.
single 12Gb/s is 1.5GB/s

I would expect this card to cap at some 4GB/s (with sata ssd's its fine.)

how did you get to 4GB/s from using the HBA330?

CyklonDX · Sep 9, 2024

4 of those in raid10 (+ sed)

https://www.serversupply.com/SSD%20W-TRAY/SAS-12GBPS/960GB/DELL/09JJC_346343.htm

*its my typical cassandra server disk setup. (Its a mix of r630 and r640 - in different datacenters)

with zfs in mind, i have 4 cassandra snapshot archive boxes r730xd (24 bay) originally with h330 hba's (for different datacenters - i replaced them with non-dell 9300-8i since, i also tested the h330 pcie model)
24x toshiba sas ssd's. (they push almost 8GB/s R/W when hit with snapshots once a week)

In both cases could not hit beyond 4GB/s on mini mono raid controllers and hba's. (sure with cache you could get a short burst higher, but reality is you keep cache disabled for those use cases - and in most cases it will slow you down.) While normal pcie models (including dell h330) worked just fine - and got full x8 lane.

Top perf on that raid10 with normal pcie raid controller is 7.4GB/s read, and some 1.9GB/s write.
the mini mono locks down at 3.8GB/s and 1.9GB/s write.
(my use case is explosive bursts of reads, and then writes or together at the same time - then goes quiet for most of the day with minimal usage.)

Its especially meaningful if you do reads and writes at the same time. (which cassandra does, and so does the zfs - both rely on queue depth too - non dell sas controller for zfs appear to have much higher queue depth thus offer much better performance in sas hdd enviorments like log archive servers - got couple of those too, and dell branded hba controllers suck ass)

*in r730xd you need to sort the disks correctly to get 8GB/s else you'll be stuck at 6GB/s due to backplane limitations. (the way backplane does it refreshes - just an example image)

preferably you'd want array to consist from first 2 disks in each column, and going down in such format. The last disks will offer least performance as wait time will be longest before refresh.

uberguru · Sep 9, 2024

CyklonDX said:
with zfs in mind, i have 4 cassandra snapshot archive boxes r730xd (24 bay) originally with h330 hba's (for different datacenters - i replaced them with non-dell 9300-8i since, i also tested the h330 pcie model)
24x toshiba sas ssd's. (they push almost 8GB/s R/W when hit with snapshots once a week)

Very interesting post, thanks for sharing this

You mind sharing how you are pushing 8GB/s R/W ? Are you using TrueNAS or what is doing the snapshot? ANd why are you doing snapshots weekly and not daily? which will reduce having to push so much data at once as opposed to daily with less? ALso your RPO will be better with daily. Some even do hourly if your data is very critical.

So will be interested in server spec and networking details of this setup.
Like what switches and what is snapshotting and how much data.
For me i want high capacity drives so 900GB is too small for me.

CyklonDX · Sep 10, 2024

An example of zfs server configuration

R730xd 24x 1.6TB sas ssd *(each stand-alone avg around 1.1GB/s read, and 800MB/s write)
2x e5-2690v4 (turbo enabled)
768GB DDR4 (2400MHz)
9300-8i (no caching)
lz4 pool compression for current cassandra snapshots (raidz1 2x 6 disks groups), it comes with:
1x nvme for log

(*during the week job moves older backups onto 2nd pool that uses gzip-9 ~ i.e. slower array location, raidz2)

The load is taken in from dual 40G nic xl710-bm2.
(i can't tell what kind of switches we are using - i only deal with servers.)

There are typically 3-6-12 cassandra servers in multiple datacenters, and Friday evening before repairs (or changes to schema) automated task starts the snapshot on each dc. Resulting in some 3-4TB of data being written to the zfs in rapid manner. *(the sooner its done, the faster repairs can take place, and finish.)

Cassandra is a masterless nosql database - it requires data repair every week. There would be too big of a load if we were to run snapshots daily or run repairs all the time. Just like about anyone serious load and use of cassandra.
In short I can survive failure of whole dc without any data loss.
*there can be no changes to schema during the week, only friday evening.
(The data movement itself causes some 30 to up to 80% cpu spike (cass nodes and zfs storage) , and uses significant amount of our internal bandwidth.)

Reads are only really used in case of emergency - pd fcking up - so far never happened in almost a decade. (but process is ready to stream the data out to every single node at the same time) We do use rsync for upstream, and netcat downstream.

Thats about it from hardware, and software point of view; I did spend sigificant amount of time creating tuned profile, and profiling system (perf + firegraph) to get rid of any bottlenecks - i won't share this as there's no point - its only for my use-case os+hardware+switches+nic's, and it will likely not work/or hinder your performance. *(on different non production enviroment where we have different switches, and only dual 10gig network same settings actually result in less than 500MB/s as both server, and network is choking - so again no point in sharing those - This pdf was my guiding light for tunning https://events.static.linuxfound.org/sites/events/files/slides/LinuxConJapan2016_makita_160712.pdf .)

uberguru · Sep 10, 2024

I see so the ZFS pools are setup on each server and used on that specific server. no NFS or iSCSI right?
that is some huge single cassandra servers.
Also since managing these ZFS based cassandra clusters, any issues with ZFS ever?

CyklonDX · Sep 10, 2024

No zfs only functions as snapshot/backup storage where cassandra nodes offload snapshots/backups.

There are from 3 to 12 cassandra servers in each datacenters (we have multiple around the globe) serving cassandra.

uberguru · Sep 10, 2024

CyklonDX said:
No zfs only functions as snapshot/backup storage where cassandra nodes offload snapshots/backups.

There are from 3 to 12 cassandra servers in each datacenters (we have multiple around the globe) serving cassandra.

Oh i see so the cassandra servers themselves are not using ZFS filesystem?
what filesystem are they using then?

So the ZFS storage server is just for snapshots and backups of the several cassandra servers?
So you just have 1 ZFS storage server but cassandra servers are many?

Asking to just see how people are using ZFS storage server in Production

BackupProphet · Sep 10, 2024

Both XFS and Ext4 can be much much faster than ZFS. While this benchmark is getting quite outdated, A Quick Look At EXT4 vs. ZFS Performance On Ubuntu 19.10 With An NVMe SSD - Phoronix the performance difference on SSD/NVMe is still a major issue for ZFS.

uberguru · Sep 10, 2024

BackupProphet said:
Both XFS and Ext4 can be much much faster than ZFS. While this benchmark is getting quite outdated, A Quick Look At EXT4 vs. ZFS Performance On Ubuntu 19.10 With An NVMe SSD - Phoronix the performance difference on SSD/NVMe is still a major issue for ZFS.

What i dont like about ZFS is like the application approach to storage managment. They add all these complexities to speed up reads and writes when we dont need all that. We just want a simple and easy to manage filesystem that will not create more problems ontop of standard filesystem problems. Maybe ZFS made some kind of sense back when drives were slow but now with all these fast drives, we just want a simple filesystem that does not have all these extra stuffs ontop that will create bigger surface area of possible issues.

so for me i am not too sold on it for any shared block storage. I mean not just ZFS i am not sold on shared block storage for any filesystem, keyword there is block storage. i am ok with NFS shares on non-critical or insanely backed up critical data
For me i like the decentralized nature to storage so if anything happens not only do i have a few data to restore but then i only have to worry about a small section of my overall data. So there is no single point of failure.

So currently going towards linux raid with mdadm and lvm for ext4 filesystem on high density servers with 4 x CPUs. keeping it simple just as AWS does it.

Sure i can use ZFS for backup server and NFS shares.

TRACKER · Sep 10, 2024

Back in 2009 when i built my 1st DIY NAS i was testing different OSes in vmware workstation (i think name was not workstation yet) and i had some really nasty driver bug in intel ich10 driver which was causing data corruption on files, larger than 1GB~. I found it after many tests and hours spent with copying/extracting zip/rar archives.
Anyway, that ich10 was on my vmware host. I decided to try these OSes (slackware linux, windows xp, vista, solaris 10, opensolaris with zfs,etc.).
In vm guests only the one with opensolaris/zfs found the corruption on its vm disk.
Since then i use only file systems which have integrated resiliency/checksums. At that time it was only ZFS providing that feature.
After that file systems like REFS and BTRFS appeared but it was too late

ZFS already got the "intertia". So yeah...for me data consistency is the most important, everything else comes 2nd.

TRACKER · Sep 10, 2024

And yeah...my newest DIY NAS with xeon gold 1st gen (or more correctly said "SAN") is able to achieve 6GB/s via 100Gbps nic on Truenas Scale (with iscsi over rdma - iSER). It may be even faster but my esxi hosts are ancient, running sandy bridge. So speed is fine i would say

uberguru · Sep 10, 2024

TRACKER said:
Back in 2009 when i built my 1st DIY NAS i was testing different OSes in vmware workstation (i think name was not workstation yet) and i had some really nasty driver bug in intel ich10 driver which was causing data corruption on files, larger than 1GB~. I found it after many tests and hours spent with copying/extracting zip/rar archives.
Anyway, that ich10 was on my vmware host. I decided to try these OSes (slackware linux, windows xp, vista, solaris 10, opensolaris with zfs,etc.).
In vm guests only the one with opensolaris/zfs found the corruption on its vm disk.
Since then i use only file systems which have integrated resiliency/checksums. At that time it was only ZFS providing that feature.
After that file systems like REFS and BTRFS appeared but it was too late
ZFS already got the "intertia". So yeah...for me data consistency is the most important, everything else comes 2nd.

when you say "causing data corruption" can you elaborate?
does it mean your drive failed or something? how did you detect data corruption? the files could not be opened?
i have been using ext4 with no checksum and i never came across data corruption so will like to know how often data corruption happens and what causes it to happen in first place

for me ZFS is like building an application on top of filesystem, i dont want that. i just want vanilla filesystem and i can manage things after that myself so i can reduce surface area of issues. the more you add to something the more surface area of issues can happen

i wont blindly be sold on some filesystem because it got one pros that other dont have, i rather consider all pros and cons altogether

CyklonDX · Sep 10, 2024

uberguru said:
So you just have 1 ZFS storage server but cassandra servers are many?

For my use 1 per datacenter in my case. (6 other dc's)

I also have other zfs boxes but for different functions, but they aren't performance oriented - more into capacity;
as logarchive, ai training data storage, kube big-slow storage. (they typically go into 400-600TB total capacity)

On my cassandra nodes, i use ext4 - hardware raid10 + sed (with listed before sas ssd's)

// beyond that i use zfs at home for my home lab, media/ai server needs, total of some 400T of storage, and couple TB of faster sas3 ssd storage.
(slowly running out of space for trainning data, and upscaling projects.)

TRACKER · Sep 10, 2024

By 'data corruption" i mean silent data corruption caused by buggy driver on ich10 (where my 6 sata ports were connected to). The worst thing was all vm guests without file system with check summing like all the windowses and linuxes back then - they were working fine, no bsod, nothing. Until you try to copy and open zip/rar archive files larger than 1GB. Basically i copy known good zip/rar file larger than 1GB and i try to extract. It was giving error and failed to extract the contents. Identifying the culprit was really hard, i tested memory via memtest multiple times, checked hdds on another computers. Nothing. After i found the issue was caused by the driver i disabled ich10 sata and used pci based sata hba (some cheap one i found in my area).

What Are The best Used Datacenter SAS/SATA SSD To Buy For New Server Build?

Active Member

Well-Known Member

Active Member

Well-Known Member

Active Member

Well-Known Member

Active Member

Well-Known Member

Active Member

Well-Known Member

Active Member

Well-Known Member

Active Member

Well-Known Member

Active Member

Active Member

Active Member

Active Member

Well-Known Member

Active Member