10K 2.5" HDDs vs 2.5" SSDs


Myth

Member
Feb 27, 2018
Los Angeles
Hey Guys,

I'm doing some performance benchmarking with 2.5" 10K Seagate spinning HDDs. They are fast: eight of them will run at around 1,200 MB/s. Obviously not as fast as SSDs. I think one 2.5" 10K drive runs at around 160 MB/s, and of course a SATA SSD runs around 450 MB/s. But the spinning drives can sustain a lot more total writes.

I work in the media field, so I'm talking about terabytes per day. I'm just trying to weigh cost against speed.

One of the main problems our servers have is fragmentation. These huge 80TB volumes get fragmented, so a video file that would play back sequentially after a defrag ends up being read from all over the array.

So if an Avid video project is scattered across multiple parts of the HDDs because the project is fragmented, we get playback stuttering.

This would obviously be resolved if we had SSDs, but they are expensive, small in capacity, and have limited write endurance. So I was looking at these 10K HDDs in the 2.5" form factor.

I'm thinking that because they spin fast and are 2.5", they might be better able to handle fragmented playback. What do you guys think?

Also, are there any performance benchmarks out there that test this sort of fragmented playback? Would that be random I/O?
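
For anyone wanting to reproduce it, here's a rough sketch of the kind of test I mean. Fragmented playback looks more like large-block reads issued at scattered offsets than pure small-block random I/O. The path, block size, and read count below are placeholders, not our real Avid I/O sizes; fio with a large block size and a random offset pattern would exercise the same thing.

Code:
# Rough sketch: compare sequential vs. scattered ("fragmented") reads on one
# large test file. Point TEST_FILE at a big media file on the array; the block
# size and read count are arbitrary assumptions, not Avid's real I/O pattern.
import os
import random
import time

TEST_FILE = "/mnt/array/testfile.mxf"   # placeholder path
BLOCK_SIZE = 8 * 1024 * 1024            # 8 MiB per read
NUM_READS = 256

def run(pattern):
    size = os.path.getsize(TEST_FILE)
    max_block = size // BLOCK_SIZE - 1
    if pattern == "sequential":
        offsets = list(range(NUM_READS))
    else:
        offsets = [random.randint(0, max_block) for _ in range(NUM_READS)]
    # Note: this goes through the page cache; drop caches between runs
    # (or use O_DIRECT with aligned buffers) for a fairer comparison.
    start = time.time()
    with open(TEST_FILE, "rb", buffering=0) as f:
        for block in offsets:
            f.seek(block * BLOCK_SIZE)
            f.read(BLOCK_SIZE)
    elapsed = time.time() - start
    mb = NUM_READS * BLOCK_SIZE / 1e6
    print(f"{pattern:10s}: {mb / elapsed:8.1f} MB/s")

run("sequential")
run("random")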

-Myth
 

Blinky 42

Active Member
Aug 6, 2015
PA, USA
Your fragmentation issue won't go away; it is there even with SSDs, but they have the IOPS to mask it well enough that you probably never notice it.
An array of 10K or 15K drives will be faster than the 7.2K drives you may have now. I do a lot of work with large media as well and have found that setting up the storage side to match the workflow helps throughput quite a bit. Dedicated arrays of drives for inputs and outputs, and for intermediate files when required, are a huge win, plus gobs of RAM.
It might be easiest just to migrate finished material off to another array and wipe clean the temporary area that sees the most random file sizes and therefore causes the fragmentation. Or clone it, wipe it, and copy back to get a fresh start every week, etc.

For example, I have a small setup here with a RAID1 set of 4x 5TB drives as the "source" that gets 2TB per day added to it 24x7, then a pair of 5TB drives in RAID1 for the workspace, and then the server with RAID6 arrays as the final output destination. It keeps up pretty well with constant cutting and clipping. I had a pair of SSDs as the work array but moved back to the 3.5" 5TB 7.2K drives just for more space, with no massive performance change because the pipeline can still only read and process so fast. All the filesystems are XFS, however; if you are on Windows, YMMV.
 

Blinky 42

Active Member
Aug 6, 2015
PA, USA
I think XFS does better with fragmentation than NTFS, but it is hard to do A/B comparisons because they are not both native to the same OS.
NTFS on Linux isn't a fair comparison, as it goes through FUSE rather than the kernel and is (at least initially) a reverse-engineered implementation.
Doing similar dev work on Windows and Linux with the same large code base, the Windows machines with NTFS start showing problems within a few months, whereas XFS goes for years without issues.
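
If you want to put numbers on the fragmentation itself, here is a minimal sketch of one way to do it on the Linux side: count extents per clip with filefrag (from e2fsprogs, works on ext4 and XFS via FIEMAP). The glob pattern is just a placeholder. A clip spread over hundreds of extents is exactly the kind that seeks all over the platters during playback; on the Windows/NTFS side, Sysinternals Contig with -a reports similar per-file fragment counts.

Code:
# Hedged sketch: report extent counts for media files on an ext4/XFS volume.
import glob
import subprocess

for path in sorted(glob.glob("/mnt/array/project/*.mxf")):   # placeholder pattern
    out = subprocess.run(["filefrag", path], capture_output=True, text=True).stdout
    # filefrag prints e.g. "clip.mxf: 137 extents found"
    extents = out.rsplit(":", 1)[-1].split()[0]
    print(f"{extents:>6} extents  {path}")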
 

EffrafaxOfWug

Radioactive Member
Feb 12, 2015
I don't work with media myself but I've been through a similar experience; from the sounds of things you're using local disc arrays (i.e. direct attached storage). Is that needed because you need to be able to sustain high throughput that would be expensive to do over a storage fabric (e.g. fibre channel or 10Gb iSCSI)? Does each server only ever access one file at a time in a sequential fashion or is each server also subject to some "random" IO (sub-question - have you done any monitoring to see what the IO access patterns are like)? To my mind this is probably the most important question. As blinky mentions, it's quite common for some bits of software or processes to copy some "hot" data into RAM or flash storage for fast scratch space but from the sounds of things this might not work for your datasets.
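
On the monitoring question, a quick-and-dirty way to see whether the array is doing big streaming reads or lots of small scattered ones is to watch the average request size. A minimal Linux sketch, assuming the array shows up as a single block device (the device name and interval below are placeholders); on Windows the PhysicalDisk counters "Avg. Disk Bytes/Read" and "Avg. Disk Bytes/Write" in perfmon show the same picture.

Code:
# Sample /proc/diskstats twice and report average request size.
# Tens of KB per request usually means random I/O; multi-MB averages
# suggest mostly sequential streaming.
import time

DEVICE = "sda"          # placeholder block device name
INTERVAL = 10           # seconds between samples

def sample():
    with open("/proc/diskstats") as f:
        for line in f:
            parts = line.split()
            if parts[2] == DEVICE:
                # fields: reads completed, sectors read, writes completed, sectors written
                return int(parts[3]), int(parts[5]), int(parts[7]), int(parts[9])
    raise SystemExit(f"device {DEVICE} not found in /proc/diskstats")

r0, rs0, w0, ws0 = sample()
time.sleep(INTERVAL)
r1, rs1, w1, ws1 = sample()

reads, read_bytes = r1 - r0, (rs1 - rs0) * 512
writes, write_bytes = w1 - w0, (ws1 - ws0) * 512
if reads:
    print(f"avg read size : {read_bytes / reads / 1024:.0f} KiB over {reads} reads")
if writes:
    print(f"avg write size: {write_bytes / writes / 1024:.0f} KiB over {writes} writes")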

You'll get better performance out of 10/15k drives than you will from 7.2k nearline, but whether the extra random IO performance is good enough to stop your stuttering issues is another story. I assume automated defrags aren't a possibility either because a) there's never sufficient "quiet" time, or b) they don't run quickly enough to be of any benefit?

Extent-based filesystems like ext4 and XFS will generally fragment a lot less than block-mapped filesystems like NTFS, but if you're dealing in huge reads and writes some fragmentation will be unavoidable.

Generally speaking I'd recommend consolidating existing spinners into a big SAN or similar and putting a RAM/flash cache in front of it - this minimises the amount of money you need to spend on expensive flash, and gives you the greatest number of spindles and thus aggregate IOPS across the whole array, although whether this is suitable for you depends greatly on the workload and scale (and of course cost). How many servers, arrays and discs are we talking about here?
 

azev

Well-Known Member
Jan 18, 2013
A big ZFS box with a lot of RAM and Optane for both ZIL and L2ARC is probably what you should look into.
 

Sapphiron

New Member
Mar 2, 2018
Setting up FreeNAS with ZFS will probably outperform Windows on the same hardware. Just make sure to set up the disks as one big pool of mirrored vdevs and that you have enough RAM; 64GB is probably the minimum for 80TB of storage. A key trick with ZFS is to not let your pool get more than ~70% full, otherwise you will see a performance drop-off, and by 85% full you will be crying.

Getting two 480GB Optane 900Ps for the ZIL (SLOG) and L2ARC would take some more of the load off the disks. Partition each Optane card into two partitions, a 20GB and a 460GB. Mirror the two 20GB partitions as the ZIL/SLOG (write log); 20GB should be plenty, as it only needs to hold about 5-10 seconds' worth of write throughput before ZFS flushes the pending writes to the mechanical disks. Then stripe the two 460GB partitions to give you a large 920GB, very fast L2ARC (read cache).
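
As a sanity check on the 20GB figure, using the ~1,200 MB/s quoted earlier in the thread as an assumed worst-case sustained write rate:

Code:
# Back-of-the-envelope SLOG (ZIL) sizing. ZFS commits transaction groups every
# few seconds, so the log only has to absorb a few seconds' worth of writes.
write_rate_mb_s = 1200          # assumed worst-case sustained write rate
flush_window_s = 10             # generous transaction-group window

needed_gb = write_rate_mb_s * flush_window_s / 1000
print(f"SLOG needs to hold roughly {needed_gb:.0f} GB")
# -> roughly 12 GB, so a 20 GB partition leaves comfortable headroom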

Optane does not suffer from the same write-endurance limitations that normal SSDs do; it will probably outlive your mechanical drives.
 

Myth

Member
Feb 27, 2018
Los Angeles
Amazing advice. Yes, we actually manufacture SAN servers here at the company I'm currently working for. Our metadata controller is built for NTFS, so we can't use ZFS or XFS with our current software.

For those of you who don't know what a metadata controller is: it's software that lets users connect to the SAN server over Fibre Channel or Ethernet, and it also manages projects or workspaces that each user can access based on credentials.

So if we use Linux and share it out over SMB, we can use a ZFS filesystem (as a NAS). Our Linux-based RAID software also does RAM caching, as mentioned in previous posts, so I'm assuming this RAM caching is a Linux thing, not just our software. We can put 256GB of RAM in front of 32x 10TB helium drives for a badass NAS system; the only problem is that it's shared out over SMB and that's all.

The other option, and I know this is too much information, is that we could export the drive array from the Linux box to a Windows Server machine via a multipath 32Gb Fibre Channel card. Windows can then import it via Disk Management, and our SAN metadata software can share it out over Ethernet to the client computers.

The only problem with the above solution is that it requires two physical machines, one Windows and one Linux. Of course, we could try running a VM on top of the Linux box and import the drives into Windows Server via VMware. The volume would still be NTFS, but since it sits on the Linux hardware I believe it would still get the RAM caching.

The reason I'm trying to improve performance is that some of our customers suffer. The worst off are the ones that operate nearly 24/7, since they never have time to manually defrag. We have auto-defrag software that runs when the drives are idle, but in these intense environments it usually does more harm than good, since it never has enough time to complete.

We usually turn off the auto-defrag and have the client run the defrag manually with MyDefrag, which is from around the year 2000, but it works great. It takes about 14 hours to defrag about 30TB, so for our larger customers with 300TB it's really hard to find the time. It's also about data management: if they store audio files and video files on the same array, the tiny audio files take forever to defrag. For some reason tiny files of around 11MB each take 70% longer to defrag than a bunch of large 2GB video files, which really sucks and confuses me, but I guess it has a lot to do with the per-file metadata.

So we could use Linux-based hardware and import it into a Windows-based machine via Fibre Channel; the filesystem would still be NTFS, but because it's going through Linux I believe it would still use the RAM cache, which Windows does not seem to support or offer.

We do have the option to SSD-cache using Windows and our SAN software, and we have tiering software developed for high-performance caching, but we haven't used it yet. Even so, 6x SSDs (12TB) cost about $12,000 while 16x 6TB drives (96TB) cost about $3,200. That Optane caching idea seems very interesting, though. Usually with Storage Pools, when I RAID 1 two NVMe drives, the write performance drops from around 8,000 MB/s to around 800 MB/s.
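
Just to put that price gap in per-TB terms using those numbers (rough list prices, ignoring RAID overhead and endurance):

Code:
# Cost-per-terabyte comparison from the figures above.
ssd_cost, ssd_tb = 12000, 12      # 6x 2TB SSD
hdd_cost, hdd_tb = 3200, 96       # 16x 6TB HDD

print(f"SSD: ${ssd_cost / ssd_tb:,.0f} per TB")   # ~$1,000/TB
print(f"HDD: ${hdd_cost / hdd_tb:,.0f} per TB")   # ~$33/TB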

I guess we could try a VM on top of the Linux hardware, but that's not ideal and won't work with a GPU for some of our workstation builds. I don't know what the fastest hardware is for MyDefrag and tiny 11MB files. I know a lot of RAM seems to speed up the defrag process on Windows, but I don't know whether more cores at a lower clock or fewer cores at a higher clock would help it defrag faster.

Anyway, thanks so much for reading!