Hey all,
I'm about to do a clean wipe and rebuild of my home-office server, after a solid 5 years running OmniOS/napp-it.
It's been a mostly-great experience - except for a recurring issue where the SMB server would lock the CPU into a 100%-thrashing-85+degree-overheating state while the server was completely idle, until I manually logged in and reset the SMB server. That's my main reason to wipe and update - I assume whatever that particular gremlin is, it'll probably be gone in later versions.
Oh - and the 9-out-of-12 Seagate 3TB drive failures have been annoying, especially as they were all out of warranty, but that's on Seagate.
I thought I'd take the opportunity to clue up a bit more about how to tune the setup. Dumb amateur questions incoming.
Firstly - I use it as a general everything-and-anything file-storage device, but with special emphasis on visual-effects simulations kicking out massive cache files (writing 1-2GB per frame, every few minutes, from 4 different machines... but wanting to load sequences of 200-300GB of those frames as quickly as possible once they're written). Each of those four machines is connected via a dedicated 10GbE link to 2x 2-port 10GbE adapters, without a switch.
Given all that - I'm wondering about my understanding of l2arc, zil/slog, sync enabled/disabled, etc.
As it stands, I have a 6-core Xeon (E5-1650 I think), 64GB of RAM in there, and 3x Corsair Neutron GTX 240GB drives as an L2ARC.
Now, I've gotten the impression over time that the L2ARC wasn't really doing much for my workload, and I've just today read a post suggesting that I'm probably massively overdoing it, possibly to the detriment of my RAM-based caching, given that the L2ARC needs to maintain a large index in RAM.
- I know I'd be better off whacking in an extra 64GB of RAM - I'd be interested to know whether that would likely improve things much given my single-user, 4x machines, mostly-large-sequential workloads. Is it worth buying the extra RAM?
- With or without the extra RAM, would it actually be a good idea to reduce and/or scrap the L2ARC altogether?
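To put a rough number on the "large index in RAM" worry, here's a back-of-envelope sketch I put together. It assumes the index costs a fixed number of bytes of RAM per record cached in the L2ARC. The per-record header sizes and the record sizes in the loop are placeholder guesses on my part (I don't know the real header size for my OmniOS version), and `l2arc_header_ram_bytes` is just a name I made up for this.

```python
def l2arc_header_ram_bytes(l2arc_bytes, record_bytes, header_bytes):
    """Estimate RAM used to index a completely full L2ARC.

    Assumes one fixed-size in-RAM header per cached record.
    """
    records = l2arc_bytes // record_bytes
    return records * header_bytes


GB = 1000 ** 3
KIB = 1024

# Three 240GB SSDs used as L2ARC (from my current setup).
l2arc = 3 * 240 * GB

# ASSUMPTION: header sizes are placeholders, not measured values.
for header in (70, 180, 400):
    # ASSUMPTION: example record sizes, small to large.
    for record in (8 * KIB, 128 * KIB, 1024 * KIB):
        ram = l2arc_header_ram_bytes(l2arc, record, header)
        print(f"header={header}B record={record // KIB}KiB "
              f"-> ~{ram / GB:.2f} GB of RAM for the index")
```

If that model is anywhere near right, the overhead depends far more on the record size than on the raw SSD capacity, so large sequential files in large records should cost much less RAM than lots of small blocks would.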
I've run the thing with sync=disabled up until now. With no SLOG, and Optane not existing back when I built it, it clearly couldn't function as high-performance storage with sync enabled.
Firstly, I'm wondering if sync=disabled is fine to leave as it is, given I'm running without a UPS. Most of what I write isn't critical - if a file doesn't get written, I just relaunch that sim/render.
Then, if I have sync=disabled - is there *any* point to having an SLOG device? Does sync=disabled literally turn off the intent log, or would there still be a benefit? i.e. if such a thing exists, could logging even async writes to an intent log on an Optane 900p, which would presumably complete much sooner than flushing to the pool, act as a form of insurance against data loss? Would there be any actual performance benefit? Would it be worth bothering at all even if there is, or maybe just going for a cheapo Optane 32GB instead?
Lastly, I was surprised at the time, but I was told at some point that it was fine to run SMB and NFS simultaneously... (rack machines are Linux, main workstation is Windows, and Windows NFS is a pain in the ass and stalls out for 30 seconds whenever I don't use the connection for a couple of minutes). I've found it to work fine in this configuration, never noticed any corruption or unpleasant behaviour.
Mostly just want to double-check that this is definitely still sound advice, and I'm not ignorantly tempting fate.
Additionally though, are there any performance implications here? Could it potentially be a better idea to try and get the Linux clients running on Samba instead of NFS? (I think I'm about done fiddling with Windows' NFS implementation)
Any additional insight most welcome too.
Thanks in advance to anyone who's taken the time to wade through that wall of text!