Hello all!
I am researching storage options for 50-100TB of usable storage space for a genomic lab. Essentially, I am trying to get an understanding of best practices and cost to build such a server. Right now, I am in the early stages of investigating solutions, so I can understand the direction we need to focus on.
Note: I initially made a post on LinusTechTips and then discovered this community and its articles. Thought it would be beneficial to post here.
Key Considerations/Details:
1. The server will be used as a backup for compressed and encrypted genomic data. Its main purpose will be backing up files, which individually can be quite large (250-500GB). Generally speaking, the data won't be accessed frequently, and we certainly won't be running computationally demanding programs on the server (e.g., no 4K video editing). It simply needs to transfer large files in a reasonable amount of time.
2. Data redundancy is very important as the data is quite valuable. We don't want a RAID setup where rebuilding could take weeks. Additionally, this server won't be the only backup (we are currently planning to also have an off-site cloud backup solution).
3. Ability to scale. I was asked to investigate the cost and typical solutions for 50-100TB. For our current needs, I suspect 25TB is adequate, so being able to build a 25TB server and then scale it to 100TB with little notice or effort would be a huge plus.
4. Noise and form factor. This will be kept in a lab environment. I know typical rack servers can sound like mini jet engines. I am also a bit concerned about the form factor of a server. That being said, I am open to all suggestions and realize a normal server may be the most cost-effective option.
5. We may also use this server to make daily backup copies of users' home directories on the cluster (e.g., with rsync). Additionally, I am intrigued by the possibility of having the backup server also act as a git server. This may make latency concerns more important.
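For a rough sense of what "a reasonable amount of time" in item 1 could mean, here is a back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions: the link speeds (1 and 10 Gbit/s) and the 80% usable-bandwidth factor are my own placeholders, not measurements, and `transfer_hours` is just an illustrative helper name. It also ignores disk throughput, which may well be the real bottleneck.

```python
def transfer_hours(size_gb: float, link_gbit_s: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to move one file over a network link.

    size_gb      -- file size in decimal gigabytes (250-500 per item 1)
    link_gbit_s  -- nominal link speed in Gbit/s (assumed, not from the post)
    efficiency   -- fraction of nominal bandwidth actually achieved (assumed 0.8)
    """
    size_gbit = size_gb * 8                           # bytes -> bits
    seconds = size_gbit / (link_gbit_s * efficiency)  # time at the effective rate
    return seconds / 3600


if __name__ == "__main__":
    # File sizes come from item 1; link speeds are hypothetical examples.
    for size_gb in (250, 500):
        for link_gbit_s in (1, 10):
            hours = transfer_hours(size_gb, link_gbit_s)
            print(f"{size_gb} GB over {link_gbit_s} Gbit/s: about {hours:.2f} h")
```

If the network in the lab turns out to be faster or slower than these placeholder values, only the `link_gbit_s` and `efficiency` arguments need to change.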
Again, I am looking for general information so I can make more informed inquiries in the future.