Hello,
I'm looking for ideas from anyone who has already tackled a similar challenge: building a high-capacity storage system that is also capable of high throughput.
The storage server is intended to house long-term data, typically massive files (roughly 200GB per day, in one daily file). Periodically, a group of compute nodes will need to read and process these files (each compute node processing a different daily file), probably accessing them via NFS. I need to be able to parallelize this as much as necessary to maximize processing throughput.
I envision the storage server having 40GbE connectivity, and I want to be able to fully saturate that link without being bottlenecked by disk I/O.
Because of the high capacity required (120TB), I'd like to see whether this is possible with spinning disks.
Recap of requirements:
- capable of sustaining 40Gb/s throughput to multiple concurrent readers that are reading large files sequentially
- 120TB of space
- high redundancy (can't do RAID0; thinking RAID6/raidz2)
- single chassis solution if possible
- easily expandable
- low cost
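To put the requirements above into concrete numbers, here is a rough back-of-the-envelope sketch. It assumes decimal units (1GB = 1000MB) and ignores NFS and network protocol overhead, so the real usable figures would be somewhat lower. The function names are just for illustration.

```python
# Back-of-the-envelope conversion of the stated requirements.
# Assumptions: decimal units, no protocol/NFS overhead.

LINK_GBIT_PER_S = 40   # 40GbE link
CAPACITY_TB = 120      # required capacity
DAILY_FILE_GB = 200    # ~200GB per daily file


def link_gbytes_per_s(gbit_per_s):
    """Convert a link speed in Gb/s to GB/s (8 bits per byte)."""
    return gbit_per_s / 8


def days_of_retention(capacity_tb, daily_gb):
    """How many daily files fit in the given capacity."""
    return capacity_tb * 1000 / daily_gb


def seconds_to_stream_file(file_gb, gbytes_per_s):
    """Time to read one file at the given aggregate rate."""
    return file_gb / gbytes_per_s


target_gbytes = link_gbytes_per_s(LINK_GBIT_PER_S)
retention = days_of_retention(CAPACITY_TB, DAILY_FILE_GB)
stream_time = seconds_to_stream_file(DAILY_FILE_GB, target_gbytes)

print(f"Disk read rate needed to fill the link: {target_gbytes} GB/s")
print(f"Daily files that fit in {CAPACITY_TB}TB: {retention}")
print(f"Time to stream one daily file at line rate: {stream_time} s")
```

So the disks need to deliver about 5GB/s in aggregate, the pool holds roughly 600 daily files, and one file takes about 40 seconds at full line rate.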
I'm currently looking at 45- or 60-bay chassis, and trying to determine whether ZFS can perform well enough.
In theory, two ZFS raidz2 vdevs of 15 disks each would more than meet the performance goals for purely sequential reads, but I'm having trouble finding real-world performance numbers to back that up.
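The "in theory" estimate can be sketched as follows. This is an idealized model, not a benchmark: it assumes streaming read throughput scales linearly with the number of data (non-parity) disks, and the per-disk rates are hypothetical placeholders, not measured values. Multiple concurrent readers on spinning disks introduce seeks, so real numbers would likely come in lower.

```python
# Idealized sequential-read estimate for a raidz2 layout.
# Assumption: throughput scales linearly with data disks.
# The per-disk rates below are hypothetical, not measurements.

TARGET_MB_PER_S = 5000  # 40Gb/s expressed in MB/s (decimal)
CAPACITY_TB = 120


def raidz2_data_disks(vdevs, disks_per_vdev, parity=2):
    """Number of disks holding data once parity is subtracted."""
    return vdevs * (disks_per_vdev - parity)


def ideal_read_mb_per_s(data_disks, per_disk_mb_per_s):
    """Best-case aggregate streaming read rate."""
    return data_disks * per_disk_mb_per_s


def required_per_disk_mb_per_s(target_mb_per_s, data_disks):
    """Per-disk rate needed to reach the target in the ideal case."""
    return target_mb_per_s / data_disks


data_disks = raidz2_data_disks(vdevs=2, disks_per_vdev=15)
print(f"Data disks: {data_disks}")

for rate in (150, 200, 250):  # hypothetical per-disk MB/s
    total = ideal_read_mb_per_s(data_disks, rate)
    verdict = "meets" if total >= TARGET_MB_PER_S else "misses"
    print(f"{rate} MB/s per disk -> {total} MB/s ({verdict} target)")

break_even = required_per_disk_mb_per_s(TARGET_MB_PER_S, data_disks)
min_disk_tb = CAPACITY_TB / data_disks
print(f"Break-even per-disk rate: {break_even:.1f} MB/s")
print(f"Minimum disk size for {CAPACITY_TB}TB: {min_disk_tb:.2f} TB")
```

Under these assumptions, 2 x 15-disk raidz2 leaves 26 data disks, so each disk has to sustain a bit over 190MB/s for the pool to reach 5GB/s even in the best case. That leaves little margin, which is why real-world numbers matter here.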
I am also considering hardware RAID controllers in RAID6 or RAID60 configurations.