To boil it down to the key points, which will still be rather long: there are unknowns to be explored, & there are variables, some of which I control, others I don't. Basically, much to my chagrin, which GPUs I can use & how I use them are limited by the amount of RAM available.
In theory, with the (maxed out) RAM I have, I can use cards with up to 32GB of VRAM, as long as I don't use NVLink for memory pooling with another card, or I could use pairs of 16GB VRAM cards if I do use it. For various efficiency reasons, I'm not very interested in those GPUs.
Individual 24GB cards shouldn't present a problem with RAM usage. But they are also a smaller canvas to use for scenes.
If I use individual cards with 48GB of VRAM, or pairs of 24GB cards with NVLink for memory pooling, then I will have a theoretical RAM deficit of ~16GB. I hope that deficit can be covered with virtual memory, but I don't know whether it can.
Ideally, I would like to use 48GB VRAM cards with NVLink for memory pooling, which was my optimal use case for GPUs, but then I would have a theoretical RAM deficit of ~160GB, more than 100% of the RAM I can actually install.
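For anyone who wants to poke holes in my arithmetic, here's a rough Python sketch of how I'm running the numbers. The 128GB ceiling & the 3x RAM-per-VRAM multiplier in it are just placeholders that happen to reproduce my figures above, not numbers I'd stake money on; swap in whatever your renderer actually demands.

```python
# Back-of-the-envelope RAM budget. The 128 GB ceiling and the 3x multiplier are
# placeholder assumptions that happen to reproduce my numbers above -- adjust them
# to whatever the renderer actually demands.
RAM_GB = 128            # maxed-out system RAM (assumed)
MULTIPLIER = 3          # system RAM needed per GB of pooled VRAM (assumed)

configs = {
    "1x 32GB, no NVLink":   32,
    "2x 16GB, NVLink pool": 32,
    "1x 24GB, no NVLink":   24,
    "1x 48GB, no NVLink":   48,
    "2x 24GB, NVLink pool": 48,
    "2x 48GB, NVLink pool": 96,
}

for name, vram in configs.items():
    need = vram * MULTIPLIER
    deficit = need - RAM_GB
    status = f"deficit ~{deficit} GB" if deficit > 0 else "fits"
    print(f"{name}: need ~{need} GB RAM -> {status}")
```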
Each scene is processed via CPU/RAM/drives and then sent on to the GPUs, a process that is likely to take on the scale of seconds to minutes. The actual amount of RAM required will likely vary from frame to frame, depending on the resources used, the techniques used, maybe even camera angles. Not very many people seem to be working at this level, at least not any who are discussing it. There's a lot of arguing among the few who claim to be in the know about how it works & how easy it is, or isn't.
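Rather than guess, I plan to measure it. Here's the sort of thing I had in mind: a little peak-memory logger (assuming psutil is installed) that runs alongside a frame so I can see how much RAM each frame really wants & whether swap gets touched. The render call in it is just a stand-in for whatever actually kicks off the per-frame work.

```python
# Minimal per-frame memory logger (assumes psutil: pip install psutil).
# Samples RAM and swap in the background and keeps the peaks, so frame-to-frame
# variation and any spill into virtual memory show up in the numbers.
import time
import threading
import psutil

class MemoryPeakLogger:
    def __init__(self, interval=0.5):
        self.interval = interval
        self.peak_ram = 0
        self.peak_swap = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._poll, daemon=True)

    def _poll(self):
        while not self._stop.is_set():
            self.peak_ram = max(self.peak_ram, psutil.virtual_memory().used)
            self.peak_swap = max(self.peak_swap, psutil.swap_memory().used)
            time.sleep(self.interval)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()

# usage: wrap whatever kicks off a frame
with MemoryPeakLogger() as log:
    time.sleep(5)   # stand-in for the actual per-frame CPU/RAM/drive work
print(f"peak RAM  {log.peak_ram / 2**30:.1f} GB")
print(f"peak swap {log.peak_swap / 2**30:.1f} GB")
```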
I knew the system I'm building now was always going to be a testbed, a minimal investment to figure out the process, smoke out the bugs, & renew my PC-building skills. Really, a better system was always in the plans, but I needed real-world experience & to figure out how all the parts work together, or don't, in order to properly plan the next version. I intend to push on as far as I can & learn the limits.
If you're going to be dealing with a memory's worth of write activity, I'd recommend a PCIe carrier with multiple M.2 drives or an array of SATA drives in order to get the throughput and wear-life that you need.
I looked into that, but I haven't the knowledge to figure out which enterprise versions would be useful, & available, at least not without extensive research. I'm outside my expertise; that's why I'm here. To my bafflement, all the consumer versions I looked at were, in reviews, actually considerably slower than standalone M.2 socket SSDs. I *think* I mentioned this about Asus's cards over in my earlier thread, but I didn't spend much time looking, as it wasn't panning out at the time.
PCIe sockets are at a premium, but I could sacrifice a Gen 3 x8 or x16 slot for a solid solution. It doesn't have to be M.2 based, either; it could be a standalone/monolithic (?) SSD x16 card. In particular, I could use specific MLC recommendations.
But before going that (likely more expensive) route, I should do tests with M.2, or perhaps U.2, to see if it's even required. One thought that occurred to me: while RAIDing M.2 NVMe is usually discouraged as unproductive, with TLC drives that are known to drop in speed once their SLC cache is full, RAID might be an option. Also, since the smaller capacities of those Micron drives I discussed are slower, they might be candidates for RAID as well, to maintain write speeds.
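When I do those tests, I'll probably just hammer a candidate drive with something like this crude sustained-write sketch (the file path & sizes are made up; point it at the drive under test and size the total near or past the drive's rated SLC cache). If the per-chunk MB/s falls off a cliff partway through, that's the cache filling up.

```python
# Crude sustained-write test. Writes fixed-size chunks of incompressible data,
# fsyncing each one, and prints per-chunk throughput; a sharp drop partway through
# suggests the SLC cache has filled and the drive is writing at native TLC speed.
import os
import time

PATH = "scratch.bin"    # put this on the drive under test (placeholder path)
CHUNK_GIB = 1
TOTAL_GIB = 64          # pick something near/above the drive's rated SLC cache

chunk = os.urandom(CHUNK_GIB * 2**30)   # incompressible data
with open(PATH, "wb", buffering=0) as f:
    for i in range(TOTAL_GIB // CHUNK_GIB):
        t0 = time.time()
        f.write(chunk)
        os.fsync(f.fileno())            # force it to the drive, not the OS cache
        mbs = (CHUNK_GIB * 1024) / (time.time() - t0)
        print(f"chunk {i + 1}: {mbs:.0f} MB/s")
os.remove(PATH)
```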
I'm a little fuzzy on what effect PLP has from a user perspective. Do all drives need PLP, even one used only for virtual memory, in my case? Or does PLP also prevent hardware damage to the SSDs? Obviously it does nothing for content in RAM that hasn't been saved, so should the virtual memory contents be considered disposable as well?