Storage Strategy for Large Plex Libraries

Rand__

Well-Known Member
Mar 6, 2014
6,128
1,495
113
The "1GB memory/1TB disk size" rule has basically been deprecated ... as long as you have 32-64+GB you should be fine.
In the end increased disk space in your case does not mean increased user count, and as long as you dont do deduplication...

CPUs can always be replaced later, when they are cheaper, in case you need more power.
 

ReturnedSword

Active Member
Jun 15, 2018
517
220
43
Santa Monica, CA
If the TrueNAS is expanded from, let's say, 250 TB to 500 TB, then onward to PB level, is it advisable to keep all that storage on one server? Whether it is one zpool or split into multiple zpools.
 

Rand__

Well-Known Member
Mar 6, 2014
6,128
1,495
113
Well @Sean Ho said it right - if you cannot have downtime to repair a faulty component then putting all your eggs into a single nest is folly.

If you don't mind having it off for a week (or however long it takes to get that particular replacement part), then the benefit of having a single management/access interface (+ no power overhead) probably outweighs the availability concern.

You should be able to start with a TNC box and migrate to a TNS cluster later though, if that eases your mind :)
 
  • Like
Reactions: ReturnedSword

ecosse

Active Member
Jul 2, 2013
440
101
43
I don't know the current price of electricity in all regions, but due to multiple factors it's about double what it was in 2019 here in the UK. For me it's a design principle to only have services up when I need them, which means I split always-on services (VMs) from sometimes-on ones. For example, we work M-F 9-5 in general and we don't normally watch TV until 8, so why would I have full media services available during that time, and why would I have them running overnight as a general rule? I'm starting to automate the general approach using IPMI / PowerShell - power up / move data to a staging area / power down kind of thing, with a manual override.
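As a rough sketch of that power-up / stage / power-down loop (shown here in Python with `ipmitool` rather than PowerShell; the host, credentials, and paths are made-up placeholders):

```python
import subprocess

def ipmi_cmd(host, user, password, action):
    """Build an ipmitool chassis power command; action is 'on', 'soft', or 'status'."""
    return ["ipmitool", "-I", "lanplus", "-H", host,
            "-U", user, "-P", password, "chassis", "power", action]

def stage_and_shutdown(host, user, password, src, dst):
    """Power the box up, copy new media to the staging area, power it back down."""
    subprocess.run(ipmi_cmd(host, user, password, "on"), check=True)
    # ... poll ssh/ping here until the OS has booted ...
    subprocess.run(["rsync", "-a", src, dst], check=True)
    subprocess.run(ipmi_cmd(host, user, password, "soft"), check=True)

# e.g. stage_and_shutdown("10.0.0.20", "admin", "secret",
#                         "nas:/mnt/tank/media/", "/mnt/staging/")
```

A cron job plus a "skip tonight" flag file would cover the manual override.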
 

ReturnedSword

Active Member
Jun 15, 2018
517
220
43
Santa Monica, CA
Well @Sean Ho said it right - if you cannot have downtime to repair a faulty component then putting all your eggs into a single nest is folly.

If you don't mind having it off for a week (or however long it takes to get that particular replacement part), then the benefit of having a single management/access interface (+ no power overhead) probably outweighs the availability concern.

You should be able to start with a TNC box and migrate to a TNS cluster later though, if that eases your mind :)
Of course no normal person can afford to fill out sub-100 disks at once (certainly I can't), but I do have the capacity to buy 12-24 disks a year. At this point in the brainstorming I'm interested in making a plan that would allow less painful expansion in the future, constrained by the money sunk into the project.

This will be for my Plex library, as well as storing the first/on-site backup of personal files (following 3-2-1: the first copy is on the local workstation, and the third copy will be offsite somewhere). I will likely retire my existing, much smaller FreeNAS box for the personal files. As it will be for homelab/personal use, a bit of downtime isn't going to be a dealbreaker, but more uptime is preferable.

I am intrigued by both the idea of a massive system that can connect JBODs, and splitting up among smaller systems. On one hand, a big system saves money on core components such as the motherboard, CPU, and memory; boot disks are inconsequential. On the other hand, multiple smaller systems mean there isn't a single point of failure, even if files aren't replicated. I've been reading up on the future clustering support for TrueNAS Scale and some of the clustering schemes are quite exciting, especially the distributed options, but I'm unwilling to deploy any of them until they are production-ready.
 

ReturnedSword

Active Member
Jun 15, 2018
517
220
43
Santa Monica, CA
I don't know the current price of electricity in all regions, but due to multiple factors it's about double what it was in 2019 here in the UK. For me it's a design principle to only have services up when I need them, which means I split always-on services (VMs) from sometimes-on ones. For example, we work M-F 9-5 in general and we don't normally watch TV until 8, so why would I have full media services available during that time, and why would I have them running overnight as a general rule? I'm starting to automate the general approach using IPMI / PowerShell - power up / move data to a staging area / power down kind of thing, with a manual override.
I’m in California, so electricity costs are a big factor here. I’m not ready to go solar as I’m not sure how much longer I’ll be living in my present home. Depending on market conditions I may purchase a newer home, or move back to my rental property, which is a newer-build home, and do solar at that point.

Currently my data is staged, however it is a bit of a janky setup, running on my old workstation, a Sandy Bridge system. I had spun the system up to test some concepts out and have just had it running since then.

You bring up a good point about having the media server powered down while away. Certainly it’s something I can do, though I prefer to keep it online. The dream would be to have lesser-watched content in cold storage, spun up only when requested, though at the moment I am not aware of any way to do that with a Plex setup.
 
  • Like
Reactions: ecosse

ReturnedSword

Active Member
Jun 15, 2018
517
220
43
Santa Monica, CA
If you end up going hyperconverged after all, instead of doing things like mixing Proxmox and Ceph you could also run Harvester HCI.
I admit my understanding of Harvester isn’t that high. My understanding thus far is Harvester provides a solution to clustering containers and VMs into a hyperconverged solution. I’d probably focus on “storage first,” as that is my major use case in my homelab.

A future plan is to have a proper flash-based VM store. I’m aware plenty of people run VMs off of mirror vdevs on spinning rust, but to me that doesn’t seem ideal. If there were a solution for, let’s say, a flash-based VM store and a slower spinner-based general store that could be managed from a single pane of glass, that would be ideal. It seems TrueNAS Scale is moving in this direction… eventually.
 

Sean Ho

seanho.com
Nov 19, 2019
338
135
43
Vancouver, BC
seanho.com
The dream would be to have lesser-watched content in cold storage, spun up only when requested, though at the moment I am not aware of any way to do that with a Plex setup.
I think I mentioned Unraid earlier; I don't use it myself, but it would be a good fit for your requirements. Over at serverbuilds.net the predominant use case for our members is NAS for Plex and downloadable media; many of them have upwards of 100TB of usable storage in Unraid, parity-protected (though yes, without data block checksums AFAIK). With that many users and that many drives, we've naturally witnessed many drive failures and upgrades, without issue.

Many report success using spin-down timeouts on the drives (though you can't really decide which files get put on which drives). In practice, with few Plex users, most of the drives stay spun down most of the time. (Whether or not it is better to let the drives spin 24/7 is a question for another time.)

I have no conflict of interest or skin in the game, it's just a suggestion for your consideration, based on the experiences of many others with very similar needs.
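As an aside on those spin-down timeouts: on a generic Linux box they are usually set with `hdparm -S`, whose value encoding is non-obvious (per hdparm(8): 1-240 are multiples of 5 seconds, 241-251 are multiples of 30 minutes). A small helper, assuming you script the drives yourself rather than using Unraid's built-in settings:

```python
def hdparm_standby_value(minutes):
    """Convert a spin-down timeout in minutes to an `hdparm -S` value.
    Per hdparm(8): 0 disables the timer, values 1-240 mean N*5 seconds,
    and 241-251 mean (N-240)*30 minutes (up to 5.5 hours)."""
    seconds = minutes * 60
    if seconds <= 0:
        return 0                          # disable the standby timer
    if seconds <= 240 * 5:                # up to 20 minutes: 5-second steps
        return max(1, seconds // 5)
    return 240 + min(11, max(1, minutes // 30))  # 30-minute steps, capped

# e.g. hdparm_standby_value(20) -> 240, i.e. `hdparm -S 240 /dev/sdX`
```

Then something like `subprocess.run(["hdparm", "-S", str(hdparm_standby_value(20)), "/dev/sdX"])` applies it per drive.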
 
  • Like
Reactions: ecosse

i386

Well-Known Member
Mar 18, 2016
3,093
990
113
33
Germany
You could look at x265 codec to reduce space.
I might have to look into re-encoding or re-ripping stuff into HEVC as until recently I didn’t have client devices capable of direct playing high bitrate x265. I can possibly save 10-15% per movie.
I stopped re-encoding content years ago and only remux the video + audio streams that I need (usually original language + German) + subtitles. The files are usually not really bigger than the re-encoded files (assuming you use the "best possible" audio, which can sometimes be a 4+ GByte 7.1 TrueHD file).
I've yet to have a commercial DVD or Blu-ray stop working... well, due to age... and why can't you collect them? I don't like to say I "collect" movies, but I prefer to own the disc over a digital copy online. I use disc wallets to organize them and have maybe 1000 or so across a half dozen+ disc wallets; they take up very minimal space and have zero operating cost, unlike a server to house them all :D :D :D I actually built a ripping server with 3x BR drives but ended up skipping it - maybe something in the future, but for now no issues from me with the physical discs.
I still have optical drives around :D
1x pioneer bdr-211ebk
1x lg wh16ns60
1x asus bw-16d1ht
These are all LibreDrive-compatible; the LG and ASUS are in USB enclosures and only used when I want to rip 10+ discs.
(I bought so many movies off Amazon that they once sent me an invitation to review unreleased Amazon Prime series & movies :p)
 
  • Like
Reactions: ecosse

oneplane

Active Member
Jul 23, 2021
238
114
43
I admit my understanding of Harvester isn’t that high. My understanding thus far is Harvester provides a solution to clustering containers and VMs into a hyperconverged solution. I’d probably focus on “storage first,” as that is my major use case in my homelab.

A future plan is to have a proper flash-based VM store. I’m aware plenty of people run VMs off of mirror vdevs on spinning rust, but to me that doesn’t seem ideal? If there’s a solution for let’s say, a flash-based VM store, and a slower spinner based general store that can be managed on a single pane of glass that would be most ideal. It seems TrueNAS Scale is moving in this direction… eventually.
It allows you to do whatever configuration you like but one aspect of convergence is that it does "everything" together. So you can run both storage and compute on the same node. But if you were to simply not launch any compute it would just end up as pure storage. In this case it's probably too early for that and it's mostly useful for people who want to improve storage and compute at once by adding nodes. Traditionally to improve compute you would add nodes with lots of CPU and RAM, and then to improve storage you'd add storage nodes to a SAN storage system. But the idea of convergence is that every node can do a bit of each and by scaling it you get more space, more performance and more redundancy. In reality, that's not really going to be interesting until you get about 5 nodes or more.

Ceph has a similar thing where it's useful once you get enough nodes and until then, not so much. It also needs a bit more planning, like "how many disks per node".

One benefit of using DAS disk arrays is that you can start out connecting them all to one server, and once you feel like you need more CPU, RAM, and redundancy, you can simply detach one, attach it to a different server, and start building out that way. The downside is that you might end up with vdevs or arrays that span multiple DAS devices, so you can't start the array unless all of them are connected. And for multi-server storage you need high-speed networking. At the same time, a DAS chassis tends to house many drives, so you are likely to fill one up and then buy another, but that means that if that DAS or its controller fails, the entire thing becomes unusable. Buying multiple disk chassis and only putting a few drives in each is costly and uses more space, so now there's that problem. Argh.
 

ecosse

Active Member
Jul 2, 2013
440
101
43
I've yet to have a commercial DVD or Blu-ray stop working... well, due to age... and why can't you collect them? I don't like to say I "collect" movies, but I prefer to own the disc over a digital copy online. I use disc wallets to organize them and have maybe 1000 or so across a half dozen+ disc wallets; they take up very minimal space and have zero operating cost, unlike a server to house them all :D :D :D I actually built a ripping server with 3x BR drives but ended up skipping it - maybe something in the future, but for now no issues from me with the physical discs.
I recently thought hard about a pivot to this approach, but replacing the disk wallets with Sony / Pioneer Blu-ray carousels. As you say, you can host a movie collection of thousands "online" at a fraction of the price of a hard-disk-backed service. The reliability of these, the units being EOL, and the move away from Blu-ray to digital put me off, but I did see virtue there!
 
  • Like
Reactions: T_Minus

ecosse

Active Member
Jul 2, 2013
440
101
43
I’m in California, so electricity costs are a big factor here. I’m not ready to go solar as I’m not sure how much longer I’ll be living in my present home. Depending on market conditions I may purchase a newer home, or move back to my rental property, which is a newer-build home, and do solar at that point.

Currently my data is staged, however it is a bit of a janky setup, running on my old workstation, a Sandy Bridge system. I had spun the system up to test some concepts out and have just had it running since then.

You bring up a good point about having the media server powered down while away. Certainly it’s something I can do, though I prefer to keep it online. The dream would be to have lesser-watched content in cold storage, spun up only when requested, though at the moment I am not aware of any way to do that with a Plex setup.
This is where I am at, but not necessarily with a solution yet. I want, in essence, an ordering service: based on the file requested, the appropriate server is spun up, that file is copied to an SSD-backed store (cheap), and then it is played from that SSD, allowing the main store to be spun down again. Obviously I could pin "favourites" to the SSD for commonly used stuff or something new (I must watch Mad Max: Fury Road / Death of Stalin / Tinker Tailor / Elizabeth at least once a month; drives the wife mad :) ). Obviously I don't mind waiting 10 mins for my film to start to save £k.
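That staging flow can be sketched roughly in Python (the paths and MAC address are made-up placeholders, and wake-on-LAN stands in for whatever wake mechanism the cold server actually supports, e.g. IPMI):

```python
import os
import shutil
import socket

SSD_CACHE = "/mnt/ssd-cache"    # hypothetical fast local store
COLD_STORE = "/mnt/cold"        # hypothetical mostly-spun-down media mount

def magic_packet(mac):
    """Build a standard wake-on-LAN magic packet: 6x 0xFF, then the MAC 16 times."""
    return bytes.fromhex("FF" * 6 + mac.replace(":", "") * 16)

def wake_on_lan(mac, broadcast="255.255.255.255"):
    """Broadcast the magic packet on UDP port 9."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(magic_packet(mac), (broadcast, 9))

def resolve(relpath, mac="aa:bb:cc:dd:ee:ff"):   # placeholder MAC
    """Return an SSD-cached copy of a file, staging it from cold storage if absent."""
    cached = os.path.join(SSD_CACHE, os.path.basename(relpath))
    if not os.path.exists(cached):
        wake_on_lan(mac)
        # ... poll until the cold store responds, then stage the file ...
        shutil.copy2(os.path.join(COLD_STORE, relpath), cached)
    return cached
```

Pinning favourites then just means pre-calling `resolve()` for those titles; the media server's library would point at the SSD cache.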

I use Emby not Plex but maybe I'll ask the Jellyfin community.
 

ReturnedSword

Active Member
Jun 15, 2018
517
220
43
Santa Monica, CA
A little update on planning here:

I’ve decided to go “big” with TrueNAS and an SM 846/847, likely leaning to the SM 847 since it has 12 additional drive bays. I’m planning to use Core unless Scale is mature enough by the end of the year, which is when I’ll be building. I may add a JBOD later, but would prefer to keep each box independent, or limit each box to a single JBOD chassis. Clustering on Scale would be a future project depending on the maturity of the feature, and would likely involve building entirely new systems and migrating after testing.

Using assumed 18 TB disks x 36, a decision point would be vdev width and zpool configuration. 6-wide Z2 x 6 vdevs would give ~393 TiB usable, whereas 12-wide Z2 x 3 vdevs would give ~491 TiB usable. I’m thinking that as the data will be mostly static after Plex metadata has been generated, it won’t be necessary to have a zpool with so many vdevs, as random IO won’t be a huge factor. Data will be mostly sequential, so I’m leaning towards 12-wide vdevs. I do worry about resilver times with such big disks though; I’ve never had to deal with that in FreeNAS, as my existing box uses much smaller disks.
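For anyone checking the arithmetic: the usable figures are just data-disk count times the vendor-TB-to-TiB conversion. A quick sketch, ignoring ZFS padding, metadata, and the usual keep-some-space-free guideline:

```python
def raidz_usable_tib(disks, width, parity, disk_tb):
    """Approximate usable TiB for a pool of identical raidz vdevs
    (ignores partition slop, metadata, and allocation overhead)."""
    vdevs = disks // width
    data_disks = vdevs * (width - parity)
    tib_per_disk = disk_tb * 1e12 / 2**40   # vendor TB -> binary TiB
    return data_disks * tib_per_disk

print(round(raidz_usable_tib(36, 6, 2, 18)))   # 6-wide RAIDZ2 x 6 vdevs
print(round(raidz_usable_tib(36, 12, 2, 18)))  # 12-wide RAIDZ2 x 3 vdevs
```

With 24 vs 30 data disks at ~16.4 TiB each, the layouts come out to roughly 393 and 491 TiB respectively.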

I don’t imagine I’ll need a motherboard with a ton of IO and slots. Two x8 slots for a fiber NIC and an HBA should be sufficient. Perhaps a third slot for an external HBA if I get a JBOD in the future.

An issue I’m researching now is that Xeon E-class/AM4-class boards usually don’t have sufficient slots, and ECC UDIMMs are much more expensive than RDIMMs. Xeon Scalable and EPYC may be a bit overkill on the other hand. Power usage is also a concern as this will be living in the homelab, though yes, I’m aware that with this many disks the largest part of the power budget will be the disks themselves. I’d also like to stay away from “ancient” hardware, such as L-series Xeons, and perhaps Xeon v3 and prior.
 

IamSpartacus

Well-Known Member
Mar 14, 2016
2,466
621
113
I personally can't understand why one would use ZFS for Plex media server storage over something like MergerFS/SnapRAID. I use the latter for my 200+ TB media array and it's wonderful: 22 data disks, 2 parity disks (it can use up to 6). Non-striped means that even if you lose more drives than you have parity for, you only lose the data on those bad drives. Performance-wise, there is no need for a striped array to handle a massive media-streaming load.
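For anyone unfamiliar with that setup, a SnapRAID config is just a list of parity files, content files, and data mounts (paths below are illustrative only); MergerFS then pools the data mounts into a single mount point separately:

```
# /etc/snapraid.conf -- illustrative 2-parity layout
parity   /mnt/parity1/snapraid.parity
2-parity /mnt/parity2/snapraid.2-parity

content /var/snapraid/snapraid.content
content /mnt/disk1/snapraid.content

data d1 /mnt/disk1
data d2 /mnt/disk2
# ... one line per data disk ...

exclude *.tmp
exclude /lost+found/
```

`snapraid sync` is then run on a schedule (parity is computed after the fact, not on write, which is why this suits mostly-static media rather than VM stores).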
 
Last edited:

ReturnedSword

Active Member
Jun 15, 2018
517
220
43
Santa Monica, CA
Why not a xeon e5 system? 1650 or whatever u can get cheaply?
This is a route I had totally forgotten about. Power usage on the CPU and mobo is a bit concerning but again, I recognize most of the power budget is going to go into the disks. I’d probably like to swap PSUs to 1200W SQ, so ideally I’d like overall max power to stay at or below 1 kW.
 

ReturnedSword

Active Member
Jun 15, 2018
517
220
43
Santa Monica, CA
For that large of a setup kind of hard to not go with the Xeon E5 v3\v4 due to overall cost\performance and cost of CPUs and RAM.
IIRC, E5-2xxx CPUs can be used in UP mobos? That would open up some possibilities on core count, to possibly make up for an older Broadwell-EP having much lower IPC than more modern CPUs. Also, after a quick cursory look, there don’t seem to be many Supermicro socket 2011-3 UP mobo variants. The IO looks great though…

I’m not sure what sort of CPU horsepower is needed for a 36-disk ZFS array, or with a 45-disk JBOD attached on top of that. All transcoding would be done off the server, on another box.