Cheap'ish, high'ish endurance ~4TB SSD that can saturate 10gig with sequential reads?


Stereodude

Active Member
Feb 21, 2016
453
88
28
USA
Background:
I'm looking to build a media ingestion PC for ripping all my 1080p and "4K" Blu-rays to get them online. Something like 10 UHD-friendly optical drives in a single tower for ripping 10 discs in parallel. I plan to put the resulting files on another server utilizing ZFS (not built yet). My understanding is that ZFS can't be defragmented, and "streaming" 10 large files of unknown size (until they're done ripping) in the 30-90GB range in parallel over 10gig ethernet to the server running ZFS is basically guaranteed to fragment them.

My initial idea to avoid this fragmentation is to write the 10 files locally in parallel to a large SSD (~4TB), and then dump the contents of the SSD to the server (with ZFS) sequentially so the files will not get fragmented in the ZFS pool. Saturating 10gig ethernet rules out SATA. I don't have SAS in that system, so it's out (unless a SAS controller + SAS SSD is a good bit cheaper). I think that leaves me with a PCIe-connected SSD. I see I can get any number of different enterprise new and used 2.5" NVMe U.2 SSDs in the 3.84-4.0TB size range that have something like 5+PB of endurance for around $275-350'ish. I'm thinking ~4TB because 10 drives ripping 100GB 3-layer "4K" Blu-rays in parallel can generate about 1TB per hour. 4TB can hold several hours' worth of ripping.
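A quick back-of-the-envelope in Python for those numbers (assumed figures from above: one 100GB disc ripped per drive per hour, 10 drives, a ~4TB buffer SSD):

```python
# Sanity check of the ingest math (all figures are the assumptions
# stated above, not measured values).
DISC_GB = 100      # triple-layer "4K" Blu-ray
RIP_HOURS = 1.0    # time to rip one disc on one drive
DRIVES = 10
BUFFER_TB = 4.0    # candidate SSD capacity

per_drive_mb_s = DISC_GB * 1000 / (RIP_HOURS * 3600)  # sustained write per drive
aggregate_tb_h = DISC_GB * DRIVES / 1000              # total ingest rate
buffer_hours = BUFFER_TB / aggregate_tb_h             # ripping time before the SSD fills

print(f"{per_drive_mb_s:.1f} MB/s per drive")   # ~27.8 MB/s
print(f"{aggregate_tb_h:.1f} TB/hour total")    # 1.0 TB/hour
print(f"{buffer_hours:.1f} hours of buffer")    # 4.0 hours
```

So the writes are trivial for any SSD; the capacity and the later sequential dump to the server are the sizing constraints.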

I'm looking at drives like these:
(new)

(used)

I've been avoiding used drive listings that don't give any indication of the drive's health (endurance remaining). I'm also not so interested in drives that are power hogs. I have ~120TB of Blu-rays to rip, so multiple PB of endurance isn't strictly necessary, but it would be nice to be able to use the drive for something else once this process is done.

Is there another cheaper way to handle this that I'm overlooking (other than ripping 1 at a time)?

Are there any suggestions for a specific PCIe connected SSD model?
 

kpfleming

Active Member
Dec 28, 2021
383
205
43
Pelham NY USA
I'm curious why fragmentation is such a concern for you: ZFS is going to place the content wherever it likes, whether you send it in a single streaming operation or in multiple operations. If the storage pool on the ZFS server is sufficiently performant, you aren't ever going to notice fragmentation (and that's especially true if it's all SSDs, where there is no 'seek time' to be concerned about).
 

i386

Well-Known Member
Mar 18, 2016
4,220
1,540
113
34
Germany
You will rarely see 30+ MByte/s with UHD discs: 10x 30MByte/s = 300MByte/s => SATA SSDs are fast enough
I'm thinking ~4TB because 10 drives ripping 100GB 3 layer "4K" blu-rays in parallel can generate about 1TB per hour.
Are you ripping shrek 1? :D
 

Stereodude

Active Member
Feb 21, 2016
453
88
28
USA
I'm curious why fragmentation is such a concern for you: ZFS is going to place the content wherever it likes, whether you send it in a single streaming operation or in multiple operations. If the storage pool on the ZFS server is sufficiently performant, you aren't ever going to notice fragmentation (and that's especially true if it's all SSDs, where there is no 'seek time' to be concerned about).
The ZFS server is definitely not going to have several hundred TB of SSD storage. I don't have the sort of budget for that. It is going to have spinning drives composing the pool. Further, why would I want to knowingly and intentionally introduce file fragmentation into the pool that I can never get rid of? My understanding is that ZFS should not fragment files going into the pool if they're written one at a time. People have been trying to avoid fragmentation for decades with spinning HDDs. I don't think my desire is unrealistic. Additionally, I won't know if the ZFS fragmentation is a limitation or issue for me until it's too late to do anything about it (short of writing it to the pool a second time sequentially).
 
Last edited:

Stereodude

Active Member
Feb 21, 2016
453
88
28
USA
You will rarely see 30+ MByte/s with uhd discs: 10x 30MByte/s = 300MByte => sata ssds are fast enough
The drives can rip a 100GB disc in ~1 hour. That's an average of ~26MB/sec. The peak is higher. Most of my Blu-rays are not 100GB discs. FWIW, 25GB discs actually have higher data read rates from the drives. Ultimately, the writes to the SSD from the optical drives aren't my concern with SATA. The copy from the SATA SSD to the server is my concern. The copy over 10gig ethernet from a SATA SSD to the ZFS server will be half the speed of a PCIe (or SAS3) SSD.

Are there significantly cheaper SATA SSDs with high endurance? From what I can find on eBay, ~4TB SATA SSDs that aren't auctions and have a stated drive "health" are basically just as expensive as the NVMe U.2 ones.

Are you ripping shrek 1? :D
That's an oddly specific question, but yes. I have the 3D Blu-ray of it to rip.
 
Last edited:

kpfleming

Active Member
Dec 28, 2021
383
205
43
Pelham NY USA
People have been trying to avoid fragmentation for decades with spinning HDDs.
Indeed, and that effort was primarily focused on single-disk storage systems. When your storage pool is sharded/spread across multiple drives, you have much less control over the layout of the data-as-written-to-the-media. Depending on the pool configuration and already-stored-data, newly-written data won't be evenly distributed across the drives in the pool, let alone written in sequential blocks. In other words, I'm not sure the word 'fragmentation' applies as well to sharded storage as it did to single-drive storage.
 

nabsltd

Active Member
Jan 26, 2022
339
207
43
Saturating 10gig ethernet rules out SATA.
Since this is temporary storage, how about using a pair of SATA SSDs in RAID-0? Something like a Crucial MX500 comes in at 180TBW per TB of capacity. That means 2x 1TB drives would have 360TBW of endurance. 360TB * 1000GB/TB / 50GB (average) == 7200 discs.

So, those SSDs would run out of endurance after you have ripped 7200 Blu-ray discs. Plus, you'd also need 360TB of NAS storage to hold them.

SSD endurance is not a problem for your plan. You could also use a single consumer M.2 NVMe drive in place of the RAID-0 SATA. There are quite a few with over 200TBW per TB of original drive.
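The endurance arithmetic above, spelled out (assumptions: MX500-class ~180TBW per TB, 2x 1TB drives striped, 50GB average per ripped disc):

```python
# Endurance budget for a 2x 1TB SATA RAID-0 scratch volume.
TBW_PER_TB = 180        # rated write endurance per TB of capacity (assumed)
CAPACITY_TB = 2 * 1.0   # two 1TB drives in RAID-0
AVG_DISC_GB = 50        # assumed average ripped disc size

total_tbw = TBW_PER_TB * CAPACITY_TB           # 360 TB of total writes
discs_before_wearout = total_tbw * 1000 / AVG_DISC_GB

print(f"{total_tbw:.0f} TBW total")            # 360 TBW
print(f"{discs_before_wearout:.0f} discs")     # 7200 discs
```

Even a ~120TB library (roughly 2400 discs at that average size) uses only a third of that budget.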
 
Last edited:

samat.io

Member
Sep 16, 2016
58
43
18
40
+1 to what nabsltd said; you don't really have control over how files will be laid out on disk to begin with. Additionally, in newer distributed filesystems like Ceph, fragmenting things is by design; you want chunks of a file randomly distributed all over the place to provide durability.

Just to point out: the reason there aren't mature defrag utilities for ZFS is that it's not a big problem. Fragmentation is a big problem once your volume nears capacity, but is negligible otherwise.

To make things a little better, for your use case of large multi-gigabyte files, try setting a large ZFS recordsize. The largest you can go by default is 1M, but larger sizes can be enabled, and figuring out the best recordsize will probably impact the performance you get more than worrying about fragmentation will.
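For reference, a sketch of that tuning (the dataset name `tank/media` is a placeholder; `recordsize` only affects newly written files, so set it before the copy):

```shell
# Use the 1M default maximum recordsize for large sequential media files.
zfs set recordsize=1M tank/media
zfs get recordsize tank/media

# On OpenZFS (Linux), record sizes above 1M are gated by the
# zfs_max_recordsize module parameter (value in bytes; 16M shown here):
echo 16777216 > /sys/module/zfs/parameters/zfs_max_recordsize
zfs set recordsize=16M tank/media
```

Whether anything above 1M actually helps is workload-dependent; it's worth benchmarking against the 1M setting before committing the whole library.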
 

Rttg

Member
May 21, 2020
71
47
18
As others have mentioned, fragmentation really shouldn’t be an issue with this workload. If you were working with a large set of small files and deleting some of those files, then fragmentation might be an issue, but writing large files in a WORM manner like this shouldn’t create any problems.
 

Stereodude

Active Member
Feb 21, 2016
453
88
28
USA
You guys are giving me some food for thought. I guess I will run my assumptions past the gurus on r/zfs and see if they think there's even a problem.
 