Storage server for fast reading


deeplearningguy

New Member
We are building a small cluster of servers for computer vision tasks (university lab), mostly deep learning on GPUs. Each box will have 8 GPUs, so roughly 8 people can work on one at once. Training these models means feeding the GPUs lots of data, so we need two things: a fast network and fast storage. I'd estimate maybe 16 processes across 3 servers will be constantly reading from the file server.
Most of the operations are read-only: we put the datasets (images, videos, audio, etc.) on the store and mount it on each node via NFS so the training jobs can read from it. We don't need very much space; at the moment everything lives on about 1.5 TB of disk, with roughly 600 GB free.

Networking I have worked out: 40GbE with used parts from eBay, as our budget is very limited. One Mellanox ConnectX-3 card per node and two in the file server. InfiniBand might be cheaper, but I have the feeling 40GbE is just easier to integrate into existing environments.

For the file server I have a Dell R720 I can use: dual E5-2630 (v1) with 96 GB RAM and an H710 with 512 MB cache. My goal is to saturate two link-aggregated 40GbE links, so roughly 80 Gbit/s (10 GByte/s) for reading. I don't care about write speed; 100 MB/s would be fine. Files range from small 100 KB files to gigabyte-sized HDF5 database files.

One idea that crossed my mind was to get 3x 512 GB Samsung 960 Pro NVMe drives on PCIe adapter cards. Each is rated for roughly 3.5 GByte/s sequential read, so in a RAID 0 I would expect to get around 8 GByte/s.
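
Rough numbers behind that estimate (just a sketch: the per-drive figure is the rated sequential read, and the 0.75 derating for RAID/NFS/protocol overhead is a guess, not a measurement):
Code:
# Rough throughput estimate for 3x 960 Pro in RAID 0 vs. a 2x40GbE bond.
# All figures are assumptions taken from this post, not benchmarks.
nic_gbit_s = 2 * 40                 # two bonded 40GbE links
nic_gbyte_s = nic_gbit_s / 8        # ~10 GByte/s of raw line rate

drives = 3
per_drive_read = 3.5                # GByte/s, rated sequential read per 960 Pro
raw_stripe = drives * per_drive_read
derated = 0.75 * raw_stripe         # guessed derating for RAID/NFS/protocol overhead

print(f"Network ceiling:  {nic_gbyte_s:.1f} GByte/s")
print(f"Raw stripe read:  {raw_stripe:.1f} GByte/s")
print(f"Derated estimate: {derated:.1f} GByte/s")   # ~7.9 GByte/s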

The data will be backed up every 24 hours to a NAS. While there is some development code on this share, we have a Git server for code; if a drive dies I don't care if that copy is lost, it should be on the Git server anyway. We can afford downtime (this isn't production, it's all R&D).

While this looked like a good idea for read speed, two things bother me:
1.) The 960 Pro has a write endurance of 400 TBW. Even though we are not writing much, this seems awfully low.
2.) I don't like RAID 0, except for the speed, of course.

Now, I was wondering if anyone has helpful input on how we can get maximum read speed, keep the cost low, and not build a ticking time bomb. I also had something like a hybrid solution with SSDs and NVMe in mind, but haven't found anything useful yet. Would ZFS maybe help us out here?
 

gea

Well-Known Member
I would try a combination of an NVMe mirror/RAID-10 for storage plus a very large RAM-based read cache like the one ZFS has (it automatically uses most of the RAM as a block-based cache with a most-read/last-read/read-ahead strategy). The more RAM the better/faster, as many reads, especially the small random ones, are served from RAM.

I would avoid a RAID-0, mainly because of reliability, but also because a RAID-0 has the same IOPS as a single disk, while a mirror has twice and a RAID-10 has four times the read IOPS of a single disk.
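
If ZFS ends up in the picture, the effect of that RAM cache is easy to watch. A minimal sketch in Python, assuming ZFS on Linux where the ARC counters are exposed at /proc/spl/kstat/zfs/arcstats (other platforms expose them elsewhere):
Code:
# Minimal ARC check for ZFS on Linux; the kstat path below is an assumption
# about the platform (OpenZFS on Linux), adjust for other systems.
from pathlib import Path

def arc_summary(path="/proc/spl/kstat/zfs/arcstats"):
    stats = {}
    for line in Path(path).read_text().splitlines()[2:]:   # skip the kstat header
        parts = line.split()
        if len(parts) == 3:
            name, _kind, value = parts
            stats[name] = int(value)
    lookups = stats["hits"] + stats["misses"]
    ratio = stats["hits"] / lookups if lookups else 0.0
    print(f"ARC size {stats['size'] / 2**30:.1f} GiB "
          f"(max {stats['c_max'] / 2**30:.1f} GiB), hit ratio {ratio:.1%}")

if __name__ == "__main__":
    arc_summary()
With a dataset that is read over and over, a high hit ratio here means most reads never touch the NVMe at all.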
 

SycoPath

Active Member
I'd be paying close attention to PCIe bandwidth. With 3-4 NVMe drives and 40GbE you're going to have a lot of data moving around on the bus. Many motherboards won't link up at PCIe 3.0 with a v1 CPU; some will support 3.0 speeds with the most recent BIOS. I do think RAID 1 NVMe will be the way to go, especially if you're running a storage platform that lets you do a 3-4 drive RAID 1, or possibly stripe across two RAID 1 arrays.
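
Putting rough numbers on the bus traffic (per-lane payload rates are approximate, and the 10 GByte/s target is the one from the original post):
Code:
# Approximate usable bandwidth per PCIe lane after encoding/protocol overhead.
GEN3_PER_LANE = 0.985   # GByte/s (8 GT/s, 128b/130b)
GEN2_PER_LANE = 0.50    # GByte/s (5 GT/s, 8b/10b)

nvme_x4_gen3 = 4 * GEN3_PER_LANE     # ~3.9 GByte/s per drive
nvme_x4_gen2 = 4 * GEN2_PER_LANE     # ~2.0 GByte/s if a link only trains at Gen2
nic_40gbe = 40 / 8                   # ~5 GByte/s per 40GbE port

# Serving data over NFS crosses the bus twice: drives -> RAM, then RAM -> NIC.
served = 10.0                        # GByte/s target from the original post
print(f"NVMe x4: {nvme_x4_gen3:.1f} GByte/s at Gen3, {nvme_x4_gen2:.1f} at Gen2")
print(f"Serving {served:.0f} GByte/s means ~{2 * served:.0f} GByte/s of PCIe traffic")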
 

gigatexal

I'm here to learn
Loosely speaking (thinking out loud here), you'll probably want the latest-gen chips you can afford with the highest clock rates to keep up with all the IOPS (storage can be CPU-limited sometimes). Think lower core counts and higher clocks to keep up with the transfers.
 

acquacow

Well-Known Member
I have a Windows storage space I built with 3 eBay 1.2TB ioDrive2s ($400 each).

In a Windows 10 dynamic disk stripe, the speeds are almost nameplate for the three drives together.

True, you could go with a bunch of the newer NVMe drives, but they tend to run hot and throttle, so I'd use a solution where you can put a heatsink on them with a lot of airflow. You will possibly run into wear-life issues, though, depending on how often you swap out your dataset.
 

T_Minus

Build. Break. Fix. Repeat
acquacow said:
"True, you could go with a bunch of the newer NVMe drives, but they tend to run hot and throttle, so I'd use a solution where you can put a heatsink on them with a lot of airflow."
Hmm... saying you could run NVMe but they run hot, and then running Fusion-io instead, isn't going in the right direction if heat/airflow are an issue for you.

Those Fusion-io cards can draw 2-3x as much power as an enterprise NVMe drive, so the NVMe actually requires LESS power and LESS cooling than the Fusion-io.
 

Evan

Well-Known Member
I think he was comparing M.2 NVMe vs. PCIe-card form factors for cooling. But there are lots of NVMe PCIe cards.
(Some M.2 drives do throttle easily due to lack of cooling, this is true.)
 

T_Minus

Build. Break. Fix. Repeat
Evan said:
"I think he was comparing M.2 NVMe vs. PCIe-card form factors for cooling."
Ahh, ok!
 

acquacow

Well-Known Member
T_Minus said:
"Those Fusion-io cards can draw 2-3x as much power as an enterprise NVMe drive, so the NVMe actually requires LESS power and LESS cooling than the Fusion-io."
Yes, on 100% write workloads, the three cards can draw up to 25W each, for a total of 75W.

Under heavy read workloads they require much less power, since you aren't filling any NAND cells with electrons.

As for temps in my $50 NZXT case that I use to house everything:
Code:
C:\WINDOWS\system32>fio-status -a |find "temp"
        Internal temperature: 56.60 degC, max 58.08 degC
        Internal temperature: 57.09 degC, max 64.97 degC
        Internal temperature: 52.66 degC, max 60.04 degC
*** Note, the ioDrive FPGAs are the only things that get hot, and the Virtex chips we used are good for 100C before we take the drive offline.

The draw on my UPS maxes out at about 175W for the whole server running with all 5 HDDs spun up and the ioDrives being read from at ~3GB/sec
 

acquacow

Well-Known Member
Also, not accounting for the cost of a server that supports PCIe bifurcation for the HP NVMe card, here's a cost rundown:

$700 HP Z Turbo Drive Quad Pro
$477 x4 Samsung 960 1TB
------------------------------
$2600 Total
9GB/sec
1.6PBW wear life

$400 x4 ioDrive 2 1.2TB
------------------------------
$1600 Total
4.9GB/sec
68PBW wear life.
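
Re-running those numbers as a quick check (prices and totals as quoted above; the per-drive TBW values are simply the quoted totals divided by four):
Code:
# Recompute cost, throughput and endurance for the two options above.
# Per-drive TBW figures are back-calculated from the totals quoted in this post.
options = {
    "HP Z Turbo Quad Pro + 4x Samsung 960 1TB": dict(fixed=700, per_drive=477, n=4,
                                                     gbyte_s=9.0, tbw_each=400),
    "4x ioDrive2 1.2TB":                        dict(fixed=0, per_drive=400, n=4,
                                                     gbyte_s=4.9, tbw_each=17_000),
}
for name, o in options.items():
    cost = o["fixed"] + o["n"] * o["per_drive"]
    pbw = o["n"] * o["tbw_each"] / 1000
    print(f"{name}: ${cost}, {o['gbyte_s']} GByte/s, {pbw:.1f} PBW")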
 

T_Minus

Build. Break. Fix. Repeat
25W isn't bad, I must be thinking of other Fusion-io models that will do 50W+ each ;) way more than a 25W NVMe :) I've had no problem with my NVMe drives throttling or causing issues, but I'm using 2.5" and PCIe versions and make sure I have proper airflow, or overkill airflow :)
 

acquacow

Well-Known Member
T_Minus said:
"25W isn't bad, I must be thinking of other Fusion-io models that will do 50W+ each ;) way more than a 25W NVMe :)"
The ioDrive Duos have two physical cards on them plus a PCIe switch. Those can use up to 55W at peak write to both cards in a single slot.

It's an issue in older servers that only support 25W or less per slot, but newer PCIe 3.0 boxes support 75W per slot.

Just pulled up my power info as well from the status command:
Code:
C:\WINDOWS\system32>fio-status -a |find "Bus power"
        PCIe Bus power: avg 9.50W
        PCIe Bus power: avg 9.26W
        PCIe Bus power: avg 9.16W
 

T_Minus

Build. Break. Fix. Repeat
Wow, 10W idle is awesome; that's more like running HBAs than a complete storage device + controller. I wasn't aware the ioDrive2 was that low... may be worth keeping an eye out for them now as they drop under $250 each and can still do >100K IOPS with decent capacity!
 

acquacow

Well-Known Member
Yeah, it's really not bad at all.

I personally settled on this board for my home ESX box because it can easily house 6 ioDrives and a 10gig-e card.
Supermicro | Products | Motherboards | Xeon® Boards | X9SRL-F

It is for the E5 v2 Xeons, so it conveniently supports cheaper used DDR3 ECC and used E5 Xeon procs from eBay =)

I have it currently split up because I've got a quad gig-e card and 10gig-e card in there along with an LSI HBA for my storage spaces disk array.

I was going to use the ioDrives for caching the storage space, but strangely that didn't offer any performance increase over raw disk in a 2-column stripe :(
 

BigDaddy

Member
$1 a GB for 20.8GB/s Storage! :D

Add fans for cooling if you need to. More drives are nice for reads... and more small drives mean you can add redundancy if you want. I know you said you don't care about failures, but why have them if you don't have to? Change drive size based on needs and budget, and choose the RAID that's right for you. You don't need a special board for this, just a PCIe 3.0 x16 slot. :D

2x Squid PCIe x16 Carrier Board for M.2 SSD Modules @$490 each
8x MyDigitalSSD BPX 80mm (2280) M.2 PCI Express 3.0 x4 (PCIe Gen3 x4) NVMe MLC SSD (240GB) @$125each
---------------
$1,980 Total
20.8GB/sec
5.6PBW wear life
5yr warranty on drives

Amazon.com: MyDigitalSSD BPX 80mm (2280) M.2 PCI Express 3.0 x4 (PCIe Gen3 x4) NVMe MLC SSD (120GB): Computers & Accessories

Squid PCIe x16 Carrier Board for M.2 SSD Modules

Double check my math though. :)
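
A quick check of the above (the per-drive read and endurance figures are implied by the quoted totals, not taken from a datasheet):
Code:
# Double-check of the Squid + BPX numbers quoted above.
carriers, carrier_price = 2, 490
drives, drive_price, drive_gb = 8, 125, 240
per_drive_read = 2.6        # GByte/s, implied by the 20.8 GByte/s aggregate
per_drive_tbw = 700         # TBW, implied by the 5.6 PBW aggregate

cost = carriers * carrier_price + drives * drive_price   # $1,980
capacity = drives * drive_gb                             # 1,920 GB
print(f"${cost} / {capacity} GB = ${cost / capacity:.2f} per GB")
print(f"Aggregate read ~{drives * per_drive_read:.1f} GByte/s, "
      f"endurance ~{drives * per_drive_tbw / 1000:.1f} PBW")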
 

Deci

Active Member
Unfortunately that Squid card is only PCIe 2.1, so you aren't going to see the full speed of the SSDs.

Nice that someone makes a card with a PLX chip, though, instead of locking you to a motherboard that supports bifurcation.
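
Roughly what that ceiling looks like (per-lane payload rates are approximate; the per-drive read speed is the one implied by the post above):
Code:
# A PCIe 2.x x16 slot caps the whole 4-drive carrier well below the drives' sum.
GEN2_PER_LANE = 0.50                 # GByte/s usable per lane, PCIe 2.x
GEN3_PER_LANE = 0.985                # GByte/s usable per lane, PCIe 3.0

carrier_gen2 = 16 * GEN2_PER_LANE    # ~8 GByte/s ceiling per carrier card
carrier_gen3 = 16 * GEN3_PER_LANE    # ~15.8 GByte/s if the card were Gen3
drives_raw = 4 * 2.6                 # ~10.4 GByte/s of drives behind each card

print(f"Gen2 x16 carrier ceiling: {carrier_gen2:.1f} GByte/s "
      f"vs. {drives_raw:.1f} GByte/s of drives (Gen3 would allow {carrier_gen3:.1f})")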