What is the ZFS ZIL SLOG and what makes a good one


Patrick

Administrator
Staff member
Dec 21, 2010
12,519
5,826
113
nice article, looking forward to the next part in the series.
Thanks for the feedback. This was actually going to be the intro to the SLOG benchmark piece but it became too long.

Hard to judge whether these are interesting for folks. Lots of SC17 news Monday and Tuesday.

Good to have content.
 

i386

Well-Known Member
Mar 18, 2016
4,250
1,548
113
34
Germany
This was actually going to be the intro to the SLOG benchmark piece but it became too long.
Maybe it's a silly question, but do you know the Microsemi NV1616 drive, and can you get one for comparison vs Optane?
The specs on the NV1616 look really great (it's DRAM based like the ZeusRAM, but with PCIe 3.0 x8 and an NVMe interface)
 

jcizzo

Member
Jan 31, 2023
37
5
8
I apologize profusely for bringing this back up, but i'm confused.

i read the following article (along with countless others) and i thought i understood the point of the zil/slog, and my understanding seems to coincide with this article. However, i've been told i'm wrong and i'm hoping someone can explain where my thought process goes wrong. i will explain how it all pertains to the design/build of my home nas based upon the latest version of truenas core 13.xx (stable).

keep in mind, my nas is just that - a nas.. no plugins or anything. i have separate mediaserver that is win10 based, runs plex, radarr/sonarr/nzbget, handbrake and is solely for use as a plex and movie rendering box with a 500G OS and software nvme, and a 2TB NVME for temporary storage of downloads until they're transferred to my nas.

movies are downloaded to mediasvr, rendered to mp4, then pumped to the nas. from there, if someone puts on a movie, plex (on mediasvr) has a mapped drive to the nas and it reads from that drive across a dedicated/direct 10G link. from there the movie goes out the 1G link to wherever it is requested.

my humble little nas is built with the following hardware:
supermicro x11ssh-f motherboard
i3-7100T (2 cores, 4 threads, although hyperthreading is disabled).
32Gigs of ECC ram
lsi 9211-8i in IT mode.
500GB NVME (intended for use as a slog/zil)
5x 4TB spinners attached to the aforementioned LSI controller.
onboard intel 1G nic
added intel x710-da2 (for large file transfers that would otherwise take a while)

from this article "www.servethehome.com/what-is-the-zfs-zil-slog-and-what-makes-a-good-one/" and all else that i've researched it seems that i'm correct in my assumptions.

my nas and mediaserver diagram is as follows: mediasvr ->10Glink -> TNC Nas. the mediasvr and nas are accessed via the 1G links to my network, however the mediasvr and nas have a direct connection via DAC to one another for large file transfers (movies).. yes, it's overkill :)

Because of my spinners being slow (raidz1), TNC would never be able to write a data stream to them as fast as it can accept it from the mediasvr, so it would all go as follows:
1) mediasvr sends stream over the 10G link to TNC.
2) zfs cache (ram) fills and attempts to write to the spinners.
3) since the spinners can't keep up with the write speed, TNC starts writing the data stream to my NVME drive to A) ensure data is on persistent storage, and B) free up ram so as to keep accepting the incoming data stream.
4) the contents of the NVME drive from there would be written to the spinners at their pace.

in this scenario, considering the NVME is more than capable of accepting a 10Gb data stream, my file transfers should be able to achieve a sustained rate of 10Gb/s, however when i test, the file transfer starts at 10Gb/s, goes for several seconds, then drops to between 150-400Mb/s.

is there a setting i can change? how am i wrong in my assumptions? I'm SURE other novices have made the same assumptions so working through this would help many.

Thank you for your time!
 

mrpasc

Well-Known Member
Jan 8, 2022
494
262
63
Munich, Germany
Sorry to say, but your understanding of the use of a dedicated SLOG device for ZIL is wrong.
What you described is basically a mixture of „write caching“ and „write tiering“ which ZFS doesn’t do.

I'll try to keep it short:
The only write cache ZFS has is RAM. Those few seconds of high write rate you see are writes being cached in RAM.
You can’t max out your sustained write rate with an SLOG. Your sustained write rate is solely limited by your disks and your pool layout.
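To put rough numbers on this, here is a minimal sketch of the arithmetic. The figures used in the usage note are assumptions for illustration only (a 10 Gb/s link at about 1250 MB/s, a pool that sustains about 300 MB/s, and about 4000 MB of RAM available for buffering writes); none of them were measured on the system in this thread, and the real buffer size depends on ZFS tunables and available memory.

```python
def burst_seconds(link_mb_s, pool_mb_s, buffer_mb):
    """Seconds the sender sees full link speed before the RAM buffer is full."""
    if link_mb_s <= pool_mb_s:
        return float("inf")  # the pool keeps up, so there is no slowdown
    # The buffer fills at the difference between incoming and outgoing rates.
    return buffer_mb / (link_mb_s - pool_mb_s)


def transfer_seconds(file_mb, link_mb_s, pool_mb_s, buffer_mb):
    """Rough time for the sender to hand over a file of file_mb megabytes."""
    burst = burst_seconds(link_mb_s, pool_mb_s, buffer_mb)
    burst_data = link_mb_s * burst  # data accepted during the fast phase
    if file_mb <= burst_data:
        return file_mb / link_mb_s  # the whole file fits in the fast phase
    # After the burst, the sender is throttled to what the pool can absorb.
    return burst + (file_mb - burst_data) / pool_mb_s
```

With the assumed numbers, `burst_seconds(1250, 300, 4000)` comes out at a little over four seconds, which matches the "fast for several seconds, then slow" pattern described above. Adding a SLOG changes none of the three inputs.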

A good starting point is this post on the TrueNAS forums: Some insights into SLOG/ZIL with ZFS on FreeNAS

It is somewhat dated but still worth reading to get a basic understanding of ZFS writes, the ZIL, and the SLOG.
 

reasonsandreasons

Active Member
May 16, 2022
133
88
28
For some additional background, the SLOG only stores sync writes (writes where the sender requests confirmation that the data was successfully written to disk). In those situations it's taking the place of the ZIL, which is otherwise stored on the main pool. Alongside the ZIL/SLOG, ZFS is also writing to RAM; the ZIL/SLOG is a backup of those writes on non-volatile storage. Unless your system crashes or something goes wrong, nothing is read out of the ZIL/SLOG.
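For anyone unsure what a "sync write" looks like from the application side, here is a minimal Python sketch. The function names are made up for illustration. The first function returns as soon as the operating system has accepted the data; the second also calls `os.fsync`, which blocks until the OS reports the data is on stable storage. On a ZFS dataset with the default `sync=standard` setting, it is requests of the second kind that go through the ZIL (and therefore a SLOG, if one is present).

```python
import os


def write_async(path, data):
    """Buffered write: returns once the OS has accepted the data."""
    with open(path, "wb") as f:
        f.write(data)


def write_sync(path, data):
    """Write, then wait until the OS reports the data is on stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's own buffer down to the OS
        os.fsync(f.fileno())  # ask the OS to commit the file to disk
```

Both calls leave the same bytes in the file; the difference is only in when the caller is told the write is done.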

You'd naturally think that oh, this should let me do async writes at the speed of my RAM and sync writes at the speed of my SLOG, but ZFS also regularly flushes data from RAM to the main pool. Unless your writes are very small (and probably not even then, though this is beyond my knowledge), you're still limited by how quickly ZFS can flush writes to the main pool.

(This pulls a ton from this article, which is also worth a read.)
 

T_Minus

Build. Break. Fix. Repeat
Feb 15, 2015
7,648
2,065
113
  • A SLOG device isn't needed in your configuration at all based on your work load.
  • Depending on your ZFS Configuration those 2 cores may be getting hammered.
  • what's CPU load like?
  • what's pool setup?
 

mrpasc

Well-Known Member
Jan 8, 2022
494
262
63
Munich, Germany
1) mediasvr sends stream over the 10G link to TNC.
Okay, so far so good. Do you use an SMB connection or NFS, or how does the mediasvr access the NAS?
2) zfs cache (ram) fills and attempts to write to the spinners.
Yes, almost correct. ZFS does CoW (copy on write), so it tries to accumulate the data into chunks.
3) since the spinners can't keep up with the write speed, TNC starts writing the data stream to my NVME drive to A) ensure data is on persistent storage, and B) free up ram so as to keep accepting the incoming data stream.
Here your understanding of whatever you have read or whatever YouTube vids you have seen went completely wrong. TNC/TNS and any other ZFS will never „redirect“ your stream to the SLOG (or the NVME or whatever). It keeps writing to your pool, thus your lowered perfo
4) the contents of the NVME drive from there would be written to the spinners at their pace.
Nope, ZFS doesn’t do any of this. No „staggered writes“ or „ssd caching“
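The write path described in this thread can be sketched as a toy model. This is only an illustration of the behaviour posters have described (all writes land in RAM, only sync writes are also logged, the periodic flush goes from RAM to the pool, and the log is read only after a crash). The class and method names are hypothetical and it is in no way how ZFS is actually implemented.

```python
class ToyPool:
    """Toy model of the write path; not real ZFS internals."""

    def __init__(self, has_slog):
        self.has_slog = has_slog
        self.ram = []   # writes waiting for the next flush
        self.log = []   # intent log records (sync writes only)
        self.pool = []  # data that has reached the main pool disks

    @property
    def log_location(self):
        # A SLOG only changes WHERE the log lives, not what is logged.
        return "slog" if self.has_slog else "pool"

    def write(self, block, sync=False):
        self.ram.append(block)      # every write goes to RAM
        if sync:
            self.log.append(block)  # sync writes are also logged

    def flush(self):
        """Periodic flush: RAM -> pool. The log is never read here."""
        self.pool.extend(self.ram)
        self.ram.clear()
        self.log.clear()            # logged records are no longer needed

    def crash_and_recover(self):
        """Power loss: RAM contents are gone, so replay the log."""
        self.ram.clear()
        self.pool.extend(self.log)
        self.log.clear()
```

Note that there is no path from the log to the pool during normal operation, which is why the SLOG cannot act as a staging area for a large SMB copy.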
 

jcizzo

Member
Jan 31, 2023
37
5
8
  • A SLOG device isn't needed in your configuration at all based on your work load.
  • Depending on your ZFS Configuration those 2 cores may be getting hammered.
  • what's CPU load like?
  • what's pool setup?
yes, my cpu is ABSOLUTELY getting hammered... only when data comes across the 10G side though... other than that, the cpu is a professional thumb-twiddler.. i can upgrade to an e3 xeon like a 1240L (very low energy requirements) and i've been contemplating that. when big file copies occur on the 1G side, the cpu pops up to between 10-20% and zfs cache usage pops to 50% maybe? when i send the same size files across the 10G side, cpu jumps to 70%, zfs cache usage goes close to 90%~ (maybe more, probably more).

pool setup will probably be a raidz1. the spinners will only be containing media (movies/tv shows/music).. the sensitive and critical stuff (legal docs, resumes, account statements, etc..) will be stored on mirrored SSD's.

yeah, i was just hoping that i could utilize that nvme drive to speed up large file transfers to the spinners... try and saturate that line so i can stop punching myself for splurging on the nics.
 

reasonsandreasons

Active Member
May 16, 2022
133
88
28
If you upgrade your CPU there isn't a huge use for an L-series chip; it should have the same idle power consumption as a standard model. The primary advantage of the lower TDP is a lower thermal ceiling for situations where you have limited cooling available. Generally speaking, doing the same amount of work on the same architecture should take the same amount of power, though, so you won't get any real savings there.
 

jcizzo

Member
Jan 31, 2023
37
5
8
Okay, so good so far. Do you use SMB connection, or NFS or how do the mediasrv access the NAS?

Yes, almost correct. ZFS does cow (copy on write) so it tries to accumulate the data into chunks.

Here your understanding of whatever you have read or whatever YouTube vids you have seen went completely wrong. TNC/TNS and any other ZFS will never „redirect“ your stream to the SLOG (or the NVME or whatever). It keeps writing to your pool, thus your lowered perfo

Nope, ZFS doesn’t do any of this. No „staggered writes“ or „ssd caching“
1) SMB.. no point in nfs for me.

i've wondered if there really is a point for me to have a slog/zil in my application.. seems like there isn't.. i've read some instructions on using a separate ssd (in my case nvme) and placing the swap file on it.. do you think that would help? obviously nothing's as good as if i were to double my ram to its max of 64Gigs.. but i'd think that extending the swapfile to an NVME would alleviate some contention.. dunno though
 

jcizzo

Member
Jan 31, 2023
37
5
8
If you upgrade your CPU there isn't a huge use for a L-series chip; it should have the same idle power consumption as a standard model. The primary advantage of the lower TDP is a lower thermal ceiling for situations where you have limited cooling available. Generally speaking doing the same amount of work on the same architecture should take the same amount of power, though, so you won't get any real savings there.
yeah, i read an article, either on here or the truenas forum yesterday that said precisely just that. makes me wonder if i should upgrade anyway just to gain processing capacity in the event that a drive needs to be replaced.. they made a case to where a non low-powered cpu could even use less power than a T or L series chip because it would process the jobs and return to an idle state more quickly.. food for thought
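The "finish sooner, idle sooner" argument is easy to check with a little arithmetic. The sketch below is only that: the function name is made up, and any wattages or job times fed into it are hypothetical inputs, not measurements of any particular CPU.

```python
def energy_wh(active_w, idle_w, job_seconds, window_seconds):
    """Energy in watt-hours over a time window containing one job.

    The chip draws active_w while working on the job and idle_w
    for the rest of the window.
    """
    idle_seconds = window_seconds - job_seconds
    joules = active_w * job_seconds + idle_w * idle_seconds
    return joules / 3600  # 1 Wh = 3600 J
```

Comparing two chips over the same one-hour window shows that whether the faster chip comes out ahead depends on how much more power it draws while busy relative to how much sooner it finishes, so the result can go either way depending on the numbers plugged in.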
 

reasonsandreasons

Active Member
May 16, 2022
133
88
28
L2ARC is the "use an SSD to augment the RAM cache" configuration you're talking about. It's very much an "if you have to ask you don't need it" situation, and there's an off chance it would lead to worse performance. It doesn't provide write caching, either, only read caching, like ARC generally.

The one "throw an extra SSD in there" thing that might meaningfully improve performance in any respect is a special vdev (TrueNAS calls these "fusion pools"). This moves the pool's metadata and optionally small files onto an SSD, which really improves performance on things like listing all the files in a directory or accessing said small files. A special vdev does become a load-bearing member of a RAIDZ pool when it's added, though, so you should use a mirrored device.

This will not let you fill your 10G pipe, though. Reconfiguring your pool as a pair of mirrors might let you get above 1G speeds (my Core box does, at least in benchmarks), but you won't be able to saturate a 10G link with a reasonable number of disks using anything other than NVMe.
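As a back-of-the-envelope check on the disk count, here is a small sketch. The per-disk throughput passed in is an assumption you supply (something like 150 MB/s sequential for a spinner is a hypothetical round number, not a measurement), and the calculation is best case: it assumes throughput scales perfectly with the number of data disks and ignores parity, protocol overhead, and fragmentation.

```python
import math


def link_mb_per_s(gbit_per_s):
    """Convert a link speed in Gb/s to MB/s (decimal units, no overhead)."""
    return gbit_per_s * 1000 / 8


def data_disks_needed(gbit_per_s, per_disk_mb_s):
    """Best-case number of data disks streaming in parallel to fill the link."""
    return math.ceil(link_mb_per_s(gbit_per_s) / per_disk_mb_s)
```

Under the assumed 150 MB/s per disk, a 1G link needs a single disk while a 10G link needs nine data disks in the best case, before adding any parity or mirror copies.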
 

jcizzo

Member
Jan 31, 2023
37
5
8
Let me emphasize this point: special vdevs must be reliable. You will lose your pool if you lose a special vdev.
I read about the special vdevs when trying to research all this stuff. i'm not gonna bother with that because the drives are just for movies and on the one hand, it's not like i'm losing data that's critical to my life.. on the other hand if i lost the pool that would be SUCH a b!tch.. too much to redownload and process..

i wasn't going to bother with l2arc because as was pointed out; it only speeds up read performance, which on my little network i don't need. i'm just looking for reliability and fast transfers of large media to storage.

i may just move the swap file to the nvme.. more beneficial there than anywhere else i suppose
 

uldise

Active Member
Jul 2, 2020
209
72
28
i'm just looking for reliability and fast transfers of large media to storage.
maybe unraid is your choice then? you can use a fast drive (one or more) for primary storage, then the data gets moved to secondary storage at night. they added native zfs in the last release too.
 

gea

Well-Known Member
Dec 31, 2010
3,172
1,197
113
DE
maybe unraid is your choice then? you can use a fast drive (one or more) for primary storage, then the data gets moved to secondary storage at night. they added native zfs in the last release too.
The "problem" with running ZFS on single disks (basically pools made from a single-disk vdev) so that all inactive disks can sleep (a main advantage of Unraid) is that checksum detection of bitrot or other problems remains active, but without the usual ZFS auto-repair during reads or scrubs. You are informed that a file is damaged and must restore it manually from backup. The full ZFS experience requires a pool with redundancy, where disk sleep is only possible across all disks at once, and it can take up to a minute to wake them all.
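The difference between "detect" and "detect and repair" can be shown with a toy example. This is a conceptual sketch only: the function names are hypothetical, and SHA-256 from Python's `hashlib` is a stand-in for whatever checksum the pool uses. It is not how ZFS is implemented.

```python
import hashlib


def checksum(data):
    """Stand-in checksum for a block of data."""
    return hashlib.sha256(data).hexdigest()


def read_block(copies, expected):
    """Return (data, status) given one stored copy per redundant device.

    With a single copy, a mismatch can only be reported. With more than
    one copy, a good copy can be returned in place of the damaged one.
    """
    for index, data in enumerate(copies):
        if checksum(data) == expected:
            status = "ok" if index == 0 else "repaired from redundant copy"
            return data, status
    return None, "damaged: restore from backup"
```

Passing a single corrupted copy yields the "restore from backup" result, while passing the same corrupted copy alongside a good one returns the good data.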