So I'm here to research enterprise-class storage up to 300TB :)

Hello, this is my second post (the first was responding to a help request). Someone on SmallNetBuilder suggested I come over here because this forum is either more my style, or may be a necessary part of what I have to do.

My name "Twice Shy" should suggest to you that i've suffered serious data loss in the past which I am trying to prevent EVER occurring again. :)

Basically I need an enterprise-class solution. My problem is that I do not have an enterprise budget. Even though some or much of the data I'll be working on will, I'd like to think, eventually pay for itself, this is part of a bootstrapping startup that has to get by on a shoestring for YEARS.

I already had dozens of terabytes and lost probably at least a third of it, and I'm sure a lot of the rest is corrupt by now. My previous solution of "buy 3TB Seagate external USB drives" whenever we needed data to stick on the shelf turned into a Total Epic Fail.

Basically put, several of us are going to (or will be going to) film school, not all at the same time, and have the opportunity to access hardware for free that we cannot readily pay to rent later, like shooting 8K video and motion capture, which involves something like 16 HD streams at 1080p and 120fps. We are trying to do side projects with it ("our own movie" plus a bunch of shorts) which we hope will serve as demo reels, calling cards, and similar learning projects.

This is very data intensive. A Red Weapon 8K shoots at 300 megabytes a second per camera (and more than one camera for something like a stunt is the norm), and the mocap studio uses even more. Terabytes per day are possible if we pull an overnighter or a weekender in the studio when nobody is using it.
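To put a rough number on that, here is a quick back-of-the-envelope sketch. The 300 MB/s per-camera rate is the figure above; the camera count and recording hours are made-up example values, not measurements:

```python
# Rough footage-accumulation estimate for a camera writing ~300 MB/s.
# The per-camera rate comes from the post above; camera count and hours of
# actual recording per shoot day are illustrative assumptions.

MB_PER_SEC_PER_CAMERA = 300      # quoted rate for a Red Weapon 8K
CAMERAS = 2                      # assumption: two cameras covering a stunt
RECORD_HOURS_PER_DAY = 4         # assumption: hours of rolling footage per shoot day

bytes_per_day = MB_PER_SEC_PER_CAMERA * 1e6 * 3600 * RECORD_HOURS_PER_DAY * CAMERAS
print(f"~{bytes_per_day / 1e12:.1f} TB per shoot day")  # ~8.6 TB with these assumptions
```

With numbers like these, a single weekend in the studio really can generate tens of terabytes.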

300TB is a number I pulled out of my rump, and nothing will be built overnight; I'm not seeking suggestions for tens of thousands of dollars of server parts. That budget doesn't exist. What does exist is a strategy of slowly growing an archive with the minimum possible cost overhead above the cost of the hard drives themselves. I'm not wanting rack gear in the home just to have rack gear (cool as it is), and I didn't want to play sysadmin; rather, I'm trying to solve specific problems with specific solutions.

Films like Rogue One have more than a petabyte of digital data, for comparison, but there are many things we can do, if we HAVE to, to cut down on the data stream, like shooting at lower resolution and reducing the number of takes, saving only the best ones even at the time of capture. 300TB can shrink to 60TB or 30TB if it has to; it's more "it would be nice to save as much of everything as possible." So instead of saying what I want is impossible, please remember that this is a very flexible process.

The two biggest problems so far, unfortunately, are people saying "hire a professional" when none is available (I know other students with that background, but they are busy getting paid once they graduate; they aren't in the struggling, suffering artist collective), or "you can't afford to do this at all" when I already admitted how flexible and progressive things are. Lacking a better metaphor, this is really like having a herd of deer pass by: you either bag what you can or you don't get another chance. Getting 1 is better than none, getting 5 is better than 2, and getting 20 is better than 12.

What I mean is, I'm just trying to save and salvage as much as a temporary opportunity makes available, just like I did before (when it was mostly lost and corrupted on Seagate external drives). I either make an attempt to use the resource or I don't, and if I'm going to do it, I'm trying to build the best strategy possible to get as much as I can. The budget will be limited, and the more TB I can save out of what's available the better, because we won't have second chances to capture a lot of the footage. I don't want to spend money on fancy server racks when "more storage" could be bought instead.


All that said, I want to share the first few early conclusions I've sort of already arrived at, for commentary:

1) What I'm trying to plan around first is a strategy that rapidly migrates incoming data to LTO Ultrium tape. Whether it's cost per gig, ease of storage without data loss, or the ability to mail data robustly (I wouldn't want to mail hard drives; I've seen what happens to the post!), this looks like a win to me. Once the cost of entry for a tape drive is recouped, somewhere around some tens of TB, it only saves money from then on, to say nothing of the long-term storage costs. (A rough break-even sketch follows this list.)

2) I'm still researching the NAS/SAN storage options and nothing is off the table yet. We have to capture data now, but may slowly work on it for years afterwards if there's no alternative / a quicker method isn't possible. What starts as a low-watt learning NAS on gigabit will probably turn into a high-performance SAN with SSDs and 4-40 gigabit networking eventually. It takes as long as it takes; better to make some progress than none at all, or to give up without trying. Beyond that, I'm trying to DIY and home-roll a solution because I'm very interested in technologies like ZFS, having already experienced significant silent data corruption, and because dollar for dollar I'll probably have a lower cost overhead with DIY than with some off-the-shelf answer.

3) Nothing is happening overnight, and I'm the main person who has to learn and implement this because I'm the class nerd among the film geeks. :p Migrating incoming data to Ultrium tape will happen much earlier than setting up a successful NAS/SAN to work on and process the data. So I'll simultaneously be asking questions about immediate problems as well as problems a year or more off, just so that I know what I'm doing in the future and have the roadmap planned out, even if I discover I can't afford a given solution for a while yet, or need to slowly collect hardware as opportunities present themselves.
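Here is the break-even sketch mentioned in point 1. Every price in it is a placeholder assumption for illustration, not a quote; plug in current street prices before trusting the result:

```python
# Rough break-even for buying a tape drive vs. continuing to buy hard drives
# for cold storage. All prices below are illustrative assumptions.

TAPE_DRIVE_COST = 1500.0   # assumption: used LTO-6 drive plus HBA
COST_PER_TB_TAPE = 10.0    # assumption: LTO-6 media (~2.5 TB native per cartridge)
COST_PER_TB_HDD = 35.0     # assumption: archive-grade hard drive cost per TB

def archive_cost_tape(tb):
    return TAPE_DRIVE_COST + tb * COST_PER_TB_TAPE

def archive_cost_hdd(tb):
    return tb * COST_PER_TB_HDD

break_even_tb = TAPE_DRIVE_COST / (COST_PER_TB_HDD - COST_PER_TB_TAPE)
print(f"Tape becomes cheaper past ~{break_even_tb:.0f} TB archived")  # ~60 TB here

for tb in (30, 60, 100):
    print(f"{tb:>4} TB archived: tape ${archive_cost_tape(tb):,.0f} vs HDD ${archive_cost_hdd(tb):,.0f}")
```

With those placeholder numbers the crossover lands in the "some tens of TB" range described above; the exact point moves with whatever drive and media deals are actually available.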

Sorry for this being so long, and thanks for taking the time to read. :) I hope I can learn things here and also, at some point, contribute knowledge back.
 

gea

Well-Known Member
Dec 31, 2010
DE
Basically not too complicated.

The main problem is the number of disks.
If you use 10TB disks, you need 30 disks. You should add at least 20% for redundancy, which means 36 disks minimum.

You then need a case for that many disks, like the SuperMicro SC847BE1C-R1K28LPB (4U chassis), or something similar from SuperMicro with 44 or 60 bays (prefer a 12G SAS expander backplane, but 6G is OK).

A little bit cheaper may be a 24-bay case plus an additional JBOD SAS case with an expander for another 24 disks.

These cases come with a SAS expander. While you can use SATA disks with an expander, I strongly suggest SAS disks. You should also use enterprise-class disks that can handle the vibrations of many disks in one case, like HGST Ultrastar SAS He8 or He10 disks.

Add a mainboard with 10G Ethernet and a 12G SAS HBA (LSI 9300), or a 6G HBA that you connect to the expander, and at least 32GB RAM; for examples see http://www.napp-it.org/doc/downloads/napp-it_build_examples.pdf or STH's Buyer's Guides.

Add a ZFS open-source storage appliance based on an OS like BSD (e.g. FreeNAS) or Solarish.
I prefer the Solarish solutions, where ZFS comes from, because of the best ZFS integration. For them I have created the management interface napp-it, which runs on Oracle Solaris, OmniOS or OpenIndiana; see my howto http://napp-it.org/doc/downloads/napp-it.pdf

Then create a ZFS pool with 4 x RAID-Z2 vdevs of 9 x 10TB disks each. This gives you 360TB raw and 280TB usable. A little bit slower would be 3 x RAID-Z2 vdevs of 12 disks, which gives the 300TB usable.

You can use other numbers of disks per RAID-Z2 vdev; the optimum is 6 or 10 disks per vdev. For other disk counts you should increase the ZFS blocksize (recordsize) from 128K to 512K or 1M. The rule is: the more vdevs, the faster (pool IOPS scale with the number of vdevs). You can also start with one or two vdevs and add more later.
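The capacity arithmetic for those two layouts, as a quick sketch (each RAID-Z2 vdev loses two disks to parity; ZFS overhead and the TB/TiB difference are ignored):

```python
# Raw vs. usable capacity for a pool built from RAID-Z2 vdevs.
# Each RAID-Z2 vdev gives up 2 disks to parity; filesystem overhead is ignored.

def raidz2_pool(vdevs, disks_per_vdev, tb_per_disk):
    raw = vdevs * disks_per_vdev * tb_per_disk
    usable = vdevs * (disks_per_vdev - 2) * tb_per_disk
    return raw, usable

for vdevs, width in [(4, 9), (3, 12)]:
    raw, usable = raidz2_pool(vdevs, width, 10)
    print(f"{vdevs} x RAID-Z2 of {width} disks: {raw} TB raw, {usable} TB usable")
# 4 x 9-disk vdevs:  360 TB raw, 280 TB usable
# 3 x 12-disk vdevs: 360 TB raw, 300 TB usable
```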
 

MiniKnight

Well-Known Member
Mar 30, 2012
NYC
Yeah, that's the hardest part. You need 30 of these: HGST Ultrastar He10 HUH721010AL5200 10TB 3.5" internal hard drives (Amazon).

So for 300TB raw, before formatting and before giving up capacity for data protection and drive failures, your entry price is around $15k.

You can use smaller drives but then you'll spend more on chassis so you won't save any money.

@gea has a good suggestion on ZFS but to get 300TB capacity you will be close to $20k just on disks.

Getting to 300TB is easy on the hardware and software side. The budget is still the hard part.
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
I think the "KEY" is what @gea said here: (and generally great advice!!)

"Then create a ZFS pool with 4 x raid-z2 vdevs with 9 x 10TB disks each."

Start with 9 disks, and add disks 9 at a time as you grow. That way you're not starting with the 'total capacity' of your ultimate goal, but instead starting with 90TB - 20TB of parity = 70TB usable.

If that's not enough, then start with two sets of those, and add the others as you need them later.
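Spelled out, that growth path looks roughly like this (same 9-disk RAID-Z2 vdevs of 10TB drives; only parity is subtracted, not filesystem overhead):

```python
# Usable capacity as 9-disk RAID-Z2 vdevs of 10TB drives are added one at a time.
# Each vdev contributes 7 data disks = 70 TB usable, before filesystem overhead.

DISKS_PER_VDEV, TB_PER_DISK, PARITY_DISKS = 9, 10, 2

for vdevs in range(1, 5):
    usable = vdevs * (DISKS_PER_VDEV - PARITY_DISKS) * TB_PER_DISK
    print(f"{vdevs} vdev(s): {vdevs * DISKS_PER_VDEV} disks, {usable} TB usable")
# 1 vdev: 9 disks, 70 TB ... 4 vdevs: 36 disks, 280 TB
```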

You can pick up SAS2 / 6Gbps 847 SuperMicro chassis very affordably on eBay.
 
This is great! :) People who treat my post like it's the most normal request in the world. :) Now that we have a high-water mark that will do the job, let's see if we can make it a more progressive and scalable solution that I can get into without a five-figure price tag and without running another power line to the server, because I don't need all that data online at the beginning. :)

I'm most interested in having a workflow that rapidly moves data to LTO-6 Ultrium tape (it's still cheaper per gig than LTO-7 right now, yet its 'silent corruption' rate is still orders of magnitude lower than a hard drive's) as one workaround to the up-front cost issue. $3-4k is probably available to get started; $15-20k is not, and it also doesn't provide for backups, site-catastrophe protection, or hard drives wearing out before the project completes. I'm guessing the data set might eventually reach 300TB, but that doesn't mean the NAS/SAN has to; the plan is to upgrade it if it's bottlenecking or slowing down the workflow / there's too much tape restoring going on to get work done.

I'm willing to shoot my terabytes of digital video and move it straight to Ultrium tape ASAP, hoarding it (because when and where we can get the studio time, the actors in place, and so on is not up to us, so it's use it or lose it), even if all the later editing, VFX, color grading and such happens later, possibly even years later if it has to, because people have jobs or higher priorities for a while. Are there any tape experts on this board whose brains I can specifically pick?

When Phase 2 starts to fall into place, after we have footage on tape, it's then about having a NAS or SAN that can have many terabytes loaded at once, as needed, to be worked on and processed. At some point that will migrate back out to a new set of tapes, whether or not everything is done, because the server will have to be used for other things at times too. So a NAS built for 24/7 operation and six-nines uptime is not needed; after the core data is captured, the work on it will be more intermittent, because it takes time to do things like render 3D graphics, and that will be the bottleneck holding us back while the video data just sits unedited for a bit.

My main goal for the NAS is "minimum overhead cost above that of the hard drives"; if it's possible to use consumer-level drives, that is fine. (This is not a business, so claiming home-warranty failures is okay, and I'm aware they have a total-lifetime-bytes-written limit as well.) A consumer-level motherboard is even fine if this won't be up 24/7; if the work later becomes constant, it will get upgraded. I'm just being frugal, not cheap: I'm not going to skimp by lacking ECC RAM or such for ZFS! Just whatever can get the cost down, because everyone involved is a struggling artist who agrees to eat ramen for a month at a time when we need to buy stuff. Money that could buy a fancy case could also buy more LTO tapes or hard drives for the NAS, and our first bottleneck is having the storage space at all, followed by redundancy / not losing work already created.


Does this inspire any 'reduced cost' hardware insights? (Remember it doesn't have to be a single 300TB monolith, and possibly not even 300TB ever. Even setting up 10 separate 30TB-usable servers is fine IF the cost overhead per drive is less; that also makes the buy-in scalable, makes the power use far more scalable, and reduces some wear and tear, although I'm aware drives still need exercise and scrubbing once a month. Browsing ten NAS directories is not intolerable, since data would still be organized into related clusters.)
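One way to frame the "many small cubes vs. one monolith" question is platform cost per drive bay. A sketch with placeholder prices (every number here is an assumption, not a quote; only the shape of the comparison matters):

```python
# Non-drive ("platform") cost per drive bay: one big chassis vs. several small cubes.
# All prices are illustrative placeholders, not real quotes.

def overhead_per_bay(platform_cost, bays):
    return platform_cost / bays

monolith = overhead_per_bay(platform_cost=2500, bays=36)  # assumption: used 36-bay chassis + board/HBA/RAM
cube = overhead_per_bay(platform_cost=600, bays=8)        # assumption: small 8-bay consumer build

print(f"36-bay monolith: ~${monolith:.0f} platform cost per drive bay")  # ~$69/bay
print(f"8-bay cube:      ~${cube:.0f} platform cost per drive bay")      # ~$75/bay
```

With placeholder numbers the two land surprisingly close; the real answer depends entirely on what used chassis, boards and HBAs actually go for at the time of purchase.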
 

yu130960

Member
Sep 4, 2013
Canada
I have been running 24 SATA drives (in the system set out in my sig) for over three years and have averaged one failure every 18 months. I am curious why @gea has the strong recommendation for SAS drives, considering the expense. I am in the process of looking to build my next system, so I'm starting to do research again.
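For reference, that anecdote works out to roughly this annualized failure rate:

```python
# Annualized failure rate implied by "24 drives, about 1 failure every 18 months".
drives = 24
failures_per_year = 1 / 1.5          # one failure per 18 months
afr = failures_per_year / drives
print(f"~{afr:.1%} annualized failure rate per drive")  # ~2.8%
```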
 
Yeah, I'd vastly prefer consumer-grade SATA; if it's good enough for Google, it's good enough for me. TB per dollar will be the big bottleneck. It's one reason I'm so driven towards Ultrium tape / offlining everything that can be offlined, so that the NAS only has to be sized to the immediate workload, which may let us postpone some upgrades until prices drop, like they always do on computer-everything (RAM, SSD, drive space).
 

gea

Well-Known Member
Dec 31, 2010
DE
Yes, you can use SATA disks with an expander, so this may be an option in a cost-sensitive home or lab environment. Problems may occur with some disks (depending on type or health) where a single disk can block or irritate the expander. I have seen such a problem myself, and I have been in contact with users affected by sporadic problems that very probably come down to this combination.

There were also some discussions on the omnios-discuss mailing list where OmniTI (OmniOS with commercial support) clearly stated that they would refuse any support for SATA + expander, as it is nearly impossible to debug such problems.

This is why I would always use either multiple HBAs with SATA disks, or an expander but then with SAS disks. Every case, even the SuperMicro above or the 50-bay Chenbro that I use, offers the option to use several HBAs instead of the expander. Up to 24 disks the price is about the same with several HBAs like the IBM M1015 or the Dells with LSI 2008 chipsets.

The second problem is vibration. With 30+ disks in a single case this is a real problem. Only enterprise disks are suited to it and stay within their specs with that many disks in a case. The premium for SAS over SATA is quite small at that point.

It may be different with desktop SATA disks, but 30+ desktop disks in a single case is not "enterprise-class storage".
 

cookiesowns

Active Member
Feb 12, 2016
Interesting project. Are you in an OS X or Windows environment for editing? What NLEs do you use? There's also a workflow-efficiency benefit from building your network and storage back end properly from the ground up.

At our studio we download/ingest into the main storage servers, then after events are complete we archive to LTO-6 tapes and clear the data off our main servers after a set period of time.

Multi-vdev Z2 pools are a great balance between storage capacity and performance, given you have enough vdevs. I would venture to say that with 4x Z2 vdevs of good 7200RPM drives and enough ARC/L2ARC you can easily support 4-6+ workstations editing 4K footage, especially if the RED RAWs are transcoded to something like Cineform.
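A rough sanity check on the bandwidth side of that claim; the per-stream rate below is an assumed figure for a 4K intermediate/proxy codec, not a measured one, so adjust it for whatever you actually cut with:

```python
# Back-of-the-envelope: aggregate demand from several editing workstations
# vs. a single 10GbE link. The per-stream bitrate is an assumption for a
# Cineform-class 4K intermediate, not a measurement.

STREAM_MB_PER_SEC = 60      # assumption: one 4K intermediate/proxy stream
WORKSTATIONS = 6
STREAMS_PER_SEAT = 2        # assumption: a couple of concurrent streams per editor

demand = WORKSTATIONS * STREAMS_PER_SEAT * STREAM_MB_PER_SEC
TEN_GBE_MB_PER_SEC = 1250   # 10 Gbit/s line rate, before protocol overhead

print(f"Aggregate demand ~{demand} MB/s vs ~{TEN_GBE_MB_PER_SEC} MB/s 10GbE line rate")
# ~720 MB/s with these assumptions, which a 4-vdev pool of 7200RPM drives plus
# ARC/L2ARC can plausibly sustain for mostly sequential editing reads.
```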

Off-site backups are done on either external drives, cloud, or LTO-6, depending on the type of event.

That's over 350TB a year.
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
Western Digital RE SATA HDD - May be a good 'in-between' SAS and consumer SATA grade drive that won't break the bank ???
 

cookiesowns

Active Member
Feb 12, 2016
Western Digital RE SATA HDD - May be a good 'in-between' SAS and consumer SATA grade drive that won't break the bank ???
Those drives aren't consumer, though; the REs, at least back in the day, were rated as full-duty-cycle nearline SATA drives.

I do run HGST Deskstar NAS drives, which are a NAS variant of "consumer SATA," in a non-expander-based 24-bay Supermicro with zero issues. In 2+ years of operation I've only had one fail out of a batch of 52 drives, and it was just a soft fail (increasing SMART pending sectors).
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
Huh?

Did you read what I wrote before replying?

I said: "May be a good 'in-between' SAS and consumer SATA"

As in, they're not $400+ SAS enterprise drives, and they're not $100 consumer SATA; they're in-between.

I suggest these because their price is great, they're made to run 24/7, they're made to handle vibrations, etc... all while staying SATA and not costing a lot of money, all of which were his goals.
 

cookiesowns

Active Member
Feb 12, 2016
lol... just realized my mistake. Running on a few hours of sleep and no coffee yet, heh. Yes, good in-between drives, but there's better out there now :)

On a side note, my opinion is that SAS is really only necessary when you have expander backplanes or some other specific requirement. The non-standard SMART output of SAS drives also means that "consumer"-oriented NAS GUIs for ZFS won't give you automated reports of SMART failures/pre-failure warnings.
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
@cookiesowns which drives are you referring to?

I don't think they make the RE anymore? Or at least not in 8TB+ capacities; maybe they've gone with the Ultrastar line to continue that? Or HGST Ultrastar = WD Gold? Hard to keep up these days with them being one company!
 

whitey

Moderator
Jun 30, 2014
Ohh no the ole' caviar tastebuds but cheeseburger budget :-D

Seriously though, there are pretty good suggestions here. I'd say start off small and scale to 'actual' needs. It doesn't sound like replicating that amount of data would be practical with the limited hardware/budget in mind, so maybe LTO tape is 'good enough' for backup/remote shipping. I do have to say that an enterprise SAS drive with a proper RAID config should/would certainly last 5+ years with a high degree of data integrity; I typically get more use out of them than that, so YMMV. I've been using ZFS since 2006 and haven't had an incident of data loss yet.
 
Yes, you can use SATA disks with an expander, so this may be an option in a cost-sensitive home or lab environment.
So the SAS expander just attaches to a connector which plugs into multiple SATA drives?

I kind of like the design of Backblaze's pods so far, just using straight-up SATA ports and SATA disks. Though I'd prefer ZFS or some other parity-checking file system to stop silent corruption, which is still uncommon in inexpensive off-the-shelf NASes.


This is why I would always use either multiple HBAs with SATA disks, or an expander but then with SAS disks.

The second problem is vibration. With 30+ disks in a single case this is a real problem. Only enterprise disks are suited to it and stay within their specs with that many disks in a case. The premium for SAS over SATA is quite small at that point.
Well, nothing is off the table yet; thanks for the heads-up. I'm still feeling my way through the bigger problems before I narrow down to the specifics.

Backblaze seems to have conquered the vibration problem, and they published the specs of their storage pods publicly, so if I end up throwing 45 disks into a single chassis I might well replicate what they've done.

Alternately, I still expect to scale up slowly: smaller cubes of maybe 4-8 drives that I power up as needed, even if there end up being 3/5/7 cubes in total. That also lets me build more slowly over time and just repeat a configuration I've tested and know already works. One of my big questions to answer first is "multiple small NAS cubes or one big monolith"; the other big one is "cost overhead per drive / per gig of data," both up front and in ongoing power costs.
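On the power side, the ongoing cost of keeping a cube spun up is easy to ballpark; the wattage and electricity price below are assumptions, not measurements:

```python
# Yearly electricity cost of one always-on storage cube.
# Wattage and $/kWh are illustrative assumptions.

IDLE_WATTS = 120            # assumption: small 8-drive box at idle
PRICE_PER_KWH = 0.15        # assumption: electricity price in $/kWh

kwh_per_year = IDLE_WATTS / 1000 * 24 * 365
print(f"~{kwh_per_year:.0f} kWh/yr, ~${kwh_per_year * PRICE_PER_KWH:.0f}/yr per always-on cube")
# ~1051 kWh and ~$158/yr with these numbers; powering cubes down when idle
# is where the multiple-small-cubes layout saves real money.
```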


It may be different with desktop SATA disks, but 30+ desktop disks in a single case is not "enterprise-class storage".
I beg to differ, but only because people like Google and Facebook and Backblaze already use commodity hardware. :) 300TB of home storage is way, way beyond "small office / home office" solutions even if it's just on Ultrium tape, especially when we start talking ZFS and other things not commonly applied at the SOHO level. I don't even know that it will never grow beyond 300TB; that just seemed like a nice round number, but I figured a system working at 300TB would probably also scale up fine to 3 petabytes if needed for multiple movies later, whereas consumer-class systems at 32TB don't stretch the same way.
 
Interesting project. Are you in an OS X or Windows environment for editing? What NLEs do you use? There's also a workflow-efficiency benefit from building your network and storage back end properly from the ground up.

At our studio we download/ingest into the main storage servers, then after events are complete we archive to LTO-6 tapes and clear the data off our main servers after a set period of time.

Multi-vdev Z2 pools are a great balance between storage capacity and performance, given you have enough vdevs. I would venture to say that with 4x Z2 vdevs of good 7200RPM drives and enough ARC/L2ARC you can easily support 4-6+ workstations editing 4K footage, especially if the RED RAWs are transcoded to something like Cineform.

Off-site backups are done on either external drives, cloud, or LTO-6, depending on the type of event.

That's over 350TB a year.
Currently I play with Adobe CC (Premiere) for NLE on PC, though I need to learn it better. In the future we will be using PC, Mac OS X and Linux, all for different purposes on the same network, preferably accessing the same NAS/SAN data; someone else who edits prefers Final Cut Pro. I'm planning to build a DIY color-grading control surface with three trackballs, since commercial ones are overpriced, and to buy an old Panasonic Viera plasma when I get the chance as the mastering monitor, due to its color accuracy, or possibly one of those new Samsung quantum-dot monitors in a 49" size, because the nicer ones cover something like 96% of the DCI-P3 color gamut. They don't go as dark, but with bias lighting behind it and color calibration it's good enough for prosumer work.

That is, other little tricks like this to lower other costs for the total project, trying to snip similar corners where "85% of perfect is good enough" for home use, while the fancier toys at school can be used while going there for classes, which I'm still saving up for.

If you have any other suggestions for how to set this up, I'm ALL ears, since you're in the industry and I want to be. :) Designing for 'workflow efficiency' and such. I can list the other hardware we're planning around on the non-data side if you want to talk gear, like for audio (Delta 1010 boxes for 24/96 recording, good enough for anything IMHO) and similar, or the home mocap rig with multiple HD cameras we're later going to try to build in the garage (for when we can't do things at school anymore; that's a few years off). Ultimately, depending on who is doing what, this could scale up to over a dozen people and workstations in the same house while cramming to finish the job, hence the "enterprise class" questions. My first task is sorting out the data back end that will grow and that I don't have to think much about, except to ask everyone for money to buy more hard drives and Ultrium tapes to keep up with the new data flow.
 
lol... just realized my mistake. Running on a few hours of sleep and no coffee yet, heh. Yes, good in-between drives, but there's better out there now :)

On a side note, my opinion is that SAS is really only necessary when you have expander backplanes or some other specific requirement. The non-standard SMART output of SAS drives also means that "consumer"-oriented NAS GUIs for ZFS won't give you automated reports of SMART failures/pre-failure warnings.
FYI, on the side talk about specific drive models: that's a little early while I'm still planning the total package. :) I'll weigh enterprise vs consumer and SATA vs SAS and all that after deciding "one NAS or multiple," "just NAS or separate SAN," SSD or not, which NAS software to use, and many other smaller questions to come later.

For right now I'm looking at a few different ways to solve the total problem and am not even committed to something like FreeNAS vs OpenFiler vs NexentaStor vs other things yet. The biggest criteria are protection from silent corruption on the work drives and fault tolerance, with a lot of gigs and good performance, without excessive cost overhead per gig / per drive. Suggested solutions can run the gamut from super ghetto, to used datacenter equipment, to clever ways to save a few hundred dollars here and a few hundred there, since it adds up (and buys more drives/tapes).
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
You want a lot and it's not going to be cheap no matter which way you slice it.

You either need a very good understanding of hardware and software to take the DIY approach to building this, and then actually know how to manage and run it and have the time to do so, or you are going to need to buy pre-built, or hire a consultant to walk you through DIY or even through what you need in a pre-built system.

You can't have your cake and eat it too when dealing with all your requirements and the scale you are after.

Don't be mistaken, though: it's really easy to DIY a fast, redundant, affordable storage system for, say, 10, 20, or 50TB, but at 100TB+ you have many more things to consider, and on top of that, 'deals' on hardware become challenging to find in the quantities you need, when you need them.

Saving a 'few hundred here' and a 'few hundred there' sure adds up to an extra drive or two, BUT in the grand scheme of your project it's a very, very small amount of money. Take that few hundred here or there and put a deposit on someone to walk you through exactly what you need, or use it to build a couple of smaller DIY systems so you can get hands-on experience and learn what you're doing, and be more familiar with the choices and info people provide you on the forum.

The stage you're at now is beyond asking for advice; it's more hand-holding through the entire process. I HATE saying things like that, but I fear that if you run with the ideas given here without the experience, you'll end up with something you do not like. I apologize if I've misread you and you have more experience with hardware and DIY, but it doesn't come across like that.

Also, the reason we were talking about drives is because that's going to be your #1 expense and it warrants a discussion.

You could have the entire rest of the system (chassis, motherboard/CPU/RAM) for $1,500 or less, but if you're going to buy 8 or 10TB drives, that is a drop in the bucket by comparison; your real cost is in drives.

If you're set on DIY, then I'd urge a DIY test server or two so you can try different software and go from there.
 

pgh5278

Active Member
Oct 25, 2012
Australia
FYI, on the side talk about specific drive models: that's a little early while I'm still planning the total package. :) I'll weigh enterprise vs consumer and SATA vs SAS and all that after deciding "one NAS or multiple," "just NAS or separate SAN," SSD or not, which NAS software to use, and many other smaller questions to come later.

For right now I'm looking at a few different ways to solve the total problem and am not even committed to something like FreeNAS vs OpenFiler vs NexentaStor vs other things yet. The biggest criteria are protection from silent corruption on the work drives and fault tolerance, with a lot of gigs and good performance, without excessive cost overhead per gig / per drive. Suggested solutions can run the gamut from super ghetto, to used datacenter equipment, to clever ways to save a few hundred dollars here and a few hundred there, since it adds up (and buys more drives/tapes).
Perhaps you need to reconsider the name of your post, as it does not appear that "enterprise class" is what you are aiming for, judging by your responses to the experienced folks who are supplying answers for an "enterprise class" setup. Just an observation...