dm-cache or flashcache for SnapRAID w/ mergerfs

LeeMan

New Member
Oct 18, 2015
23
3
3
29
I have a big storage server: 96TB currently, across 12 8TB WD drives. The OS runs off two 240GB Micron enterprise SSDs in RAID1 that I got a hell of a deal on from eBay.

I'm looking to speed up reads from the HDDs a little. 90-95% of everything done on the server is read-based, so no writeback is needed. I'm looking to go with 1 or 2 400GB Flash Accelerator F40 cards (maybe something different if it offers a better performance/price ratio). The server is running SnapRAID with mergerfs.

Feedback will help determine how much cache I should have for 100+ TB (soon to be 130TB) of mostly-read workload. The main question is which caching layer I should use. Obviously the drives are already filled, so that rules out bcache, and there are some mixed reviews of dm-cache on Ubuntu 14 (which I'm running).

Any advice would be greatly appreciated!!!
 

trapexit

New Member
Feb 18, 2016
17
7
3
New York, NY
github.com
For people to have it :)

Seems like a mismatch of technologies. snapraid and mergerfs aren't really intended for the kind of workloads that you'd normally need high read speeds like that... especially when talking about wanting a flash accelerator.

Is the OS cache not enough to manage the read speeds you need? Or is it too much data to cache in RAM?

A hack would be to place the data you want to cache onto SSDs, place those SSDs at the beginning of the srcmount list, and use the 'ff' (first found) search policy for open and getattr.
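To make the effect of a first-found policy concrete, here is a minimal Python sketch of the idea, not mergerfs's actual code: walk the branches in order and use the first one that contains the path. The function name `first_found` and the `ssd`/`hdd` directory names are made up for illustration.

```python
import os
import tempfile


def first_found(branches, relpath):
    """Return the full path in the first branch that contains relpath, else None.

    Illustrative only: this mimics the idea of a first-found search,
    it is not how mergerfs is implemented.
    """
    for branch in branches:
        candidate = os.path.join(branch, relpath)
        if os.path.lexists(candidate):
            return candidate
    return None


def demo():
    """Build a fake SSD and HDD branch and show which one serves each file."""
    with tempfile.TemporaryDirectory() as root:
        ssd = os.path.join(root, "ssd")  # hypothetical fast branch, listed first
        hdd = os.path.join(root, "hdd")  # hypothetical slow branch
        os.makedirs(ssd)
        os.makedirs(hdd)
        # hot.mkv exists on both branches, cold.mkv only on the HDD.
        for branch in (ssd, hdd):
            with open(os.path.join(branch, "hot.mkv"), "w") as f:
                f.write("data")
        with open(os.path.join(hdd, "cold.mkv"), "w") as f:
            f.write("data")
        hot = first_found([ssd, hdd], "hot.mkv")
        cold = first_found([ssd, hdd], "cold.mkv")
        return (os.path.basename(os.path.dirname(hot)),
                os.path.basename(os.path.dirname(cold)))
```

Because the SSD branch is listed first, the copy on the SSD wins whenever a file exists in both places, and anything not copied there falls through to the hard drives.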
 

LeeMan

I have the following setup.

RAID1 240GB SSDs for Ubuntu 14.04 LTS
12x 8TB WD Red drives w/ SnapRAID
64GB DDR4 2400MHz 1.2V ECC
Intel Xeon E5-2658 V3 (12 core/24 thread)
Supermicro SC826E16 Chassis

I hardly EVER use over 12GB of RAM. This system is extreme overkill for what it's needed for, but it's a hobby, I'll admit. I have a bunch of Docker containers running on the SSDs, so load times are relatively quick except for large media files, which I'd like to make quicker. Also, since this server sits on a 1Gb network, writing at those speeds takes a hit while a couple of people are reading files. If there's a simple way to use 40ish GB of RAM, that would be awesome, or some sort of caching system with another SSD, which would be cheaper anyway than spending $500 on an Intel PCIe SSD.

- P.S. I love the work you do with mergerfs... Has worked since day one.
 

trapexit

What are the access patterns? The same files over and over, or always new files?

For the former the OS will handle the caching. So long as you have RAM the OS will cache the data.
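If you want to nudge the OS cache rather than wait for it, one rough approach is simply to read the files once so the kernel has a chance to keep them in RAM. A minimal sketch follows; the function name `warm_page_cache` is made up, and it assumes you point it at paths on the underlying drives. Whether the pages stay cached depends on memory pressure, and whether reads through a FUSE mount get cached depends on its mount options.

```python
def warm_page_cache(paths, chunk_size=1024 * 1024):
    """Read each file start to finish and throw the data away.

    Returns the total number of bytes read. The reads give the kernel
    the opportunity to keep the file contents in its page cache; nothing
    here forces or pins them there.
    """
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    return total
```

This is the same thing as reading the files to /dev/null from a shell; it only helps for data that is read again before it gets evicted.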

For the latter... you can try what I mentioned prior. The problem is knowing what to cache. If there were a programmatic way to know which files to cache, you could imagine a process which copied files around to improve performance.
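As a rough illustration of such a process, here is a sketch that uses "most recently accessed" as the heuristic and copies as many of those files as fit into a size budget on the fast branch. The heuristic is an assumption (it relies on access times being recorded, which `noatime` mounts disable), and all function names are hypothetical.

```python
import os
import shutil


def scan_branch(branch):
    """Yield (relative path, size in bytes, last access time) for each file."""
    for dirpath, _dirs, files in os.walk(branch):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            yield os.path.relpath(full, branch), st.st_size, st.st_atime


def pick_files_to_cache(candidates, budget_bytes):
    """Pick files, most recently accessed first, until the budget is used up.

    candidates is an iterable of (relpath, size_bytes, atime) tuples.
    Files that do not fit in the remaining budget are skipped.
    """
    chosen = []
    remaining = budget_bytes
    for relpath, size, _atime in sorted(candidates, key=lambda c: c[2], reverse=True):
        if size <= remaining:
            chosen.append(relpath)
            remaining -= size
    return chosen


def copy_to_cache(slow_branch, fast_branch, relpaths):
    """Copy the chosen files to the fast branch, keeping the same relative paths."""
    for relpath in relpaths:
        dest = os.path.join(fast_branch, relpath)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.copy2(os.path.join(slow_branch, relpath), dest)
```

Something like `copy_to_cache(hdd, ssd, pick_files_to_cache(scan_branch(hdd), budget))` would then populate the SSD; evicting stale copies and keeping the copies in sync with the originals are left out of this sketch.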

Your slowest part shouldn't be reading from disk or mergerfs but the 1Gb connection, so I would question the need for caching in the first place. It might be that you're looking in the wrong place for the bottleneck. (If you have high concurrency of reads, the cost could be mergerfs. There seem to be some concurrency issues due to the standard FUSE library I use, though there are some possible tweaks we could try to see if that's in fact the issue.)

One solution to the concurrency issue could be an idea I've been considering: a new, optional mode for mergerfs which would treat files as symlinks to the original files. For instance, if the file is read-only and has a ctime or mtime older than some timeout, return a symlink rather than the regular file. This would in effect redirect the application to the original file, which would then be read at the native speed of the drive, removing mergerfs as a read bottleneck.
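The decision rule for that symlink mode could look something like the following sketch. This is only one reading of the idea, not an existing mergerfs feature: the function name is hypothetical, "read-only" is taken to mean no write permission bits are set, and the more recent of ctime and mtime is compared against the timeout.

```python
import stat
import time


def should_present_as_symlink(st_mode, st_mtime, st_ctime, timeout_seconds, now=None):
    """Decide whether a file could be handed out as a symlink to the original.

    Hypothetical rule: regular file, no write bits set, and neither the
    mtime nor the ctime has changed within the last timeout_seconds.
    """
    if now is None:
        now = time.time()
    if not stat.S_ISREG(st_mode):
        return False
    if st_mode & (stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH):
        return False
    newest_change = max(st_mtime, st_ctime)
    return (now - newest_change) > timeout_seconds
```

Files that are still writable or were touched recently would keep going through the normal path, so only settled, read-only files would bypass mergerfs on read.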
 

LeeMan

Never really thought about it like that... My 1Gb network probably is my bottleneck. It's basically just a play server with a bunch of Docker containers, sharing Plex with a bunch of friends. That's why I want my read speeds as fast as possible: I have the processing power for the transcoding, but I'm worried my drives are slow. During high Plex usage I will see an iowait of around 5-6 seconds, which seems high to me.

The mergerfs idea is a great one! I'd be excited to see some development on it.
 

trapexit

I need to do some refactoring first to offer support for handling underlying drive / filesystem errors better and then I can pretty easily add that feature. Hope to have it ready in the next couple weeks.
 

trapexit

So... I have something that can set up caches using dm-cache. It splits a drive N ways for N slow drives and can then mount them.
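For anyone wondering what "splits a drive N ways" amounts to, here is a small sketch of the arithmetic: carve the fast device into N equal, aligned slices, one per slow drive. It is not the tool's actual logic; the function name and the 1MiB alignment are assumptions, and a real dm-cache setup also needs a metadata area per cache, which this ignores.

```python
def split_device(total_bytes, n_slices, align_bytes=1024 * 1024):
    """Split a device of total_bytes into n_slices equal, aligned slices.

    Returns a list of (offset, length) tuples in bytes. Each slice length
    is rounded down to a multiple of align_bytes, so a little space at the
    end of the device may go unused.
    """
    if n_slices < 1:
        raise ValueError("need at least one slice")
    slice_len = (total_bytes // n_slices) // align_bytes * align_bytes
    if slice_len == 0:
        raise ValueError("device too small for that many aligned slices")
    return [(i * slice_len, slice_len) for i in range(n_slices)]
```

With 12 slow drives and a 400GB fast device, each drive would get a cache slice of roughly 33GB.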

However... I'm having a hell of a time getting a systemd service unit set up to mount them at the correct time during boot. I can't seem to get it quite right, and it tries mounting too early. I suppose I can release the tool and update it later for those who may set it up themselves.

If you're willing to use (or have already used) LVM2 to format your hard drives, then you can follow the example in `man lvmcache`, though it may be useful to write up something on that as well.
 

rubylaser

Active Member
Jan 4, 2013
842
229
43
Michigan, USA
@trapexit I would be happy to take a look and try to troubleshoot the issue if you post your directions so far. I've been following your github on backup and recovery, but haven't seen it posted there yet. Thanks!
 

trapexit

Awesome! I'll check it out once it's available :)
A first-round try is there. The docs aren't complete, but hopefully they are clear'ish. You'll need a fast device that the software can manipulate as it sees fit, and you point it at the devices you wish it to sit in front of. I'm still working on getting it set up so it will run after devices are available but before mounts occur.

Please be careful with testing. I've been doing it in a VM with virtual devices. That won't really help with speed comparisons, but it's enough to get a feel for it.

Any and all feedback is welcome.
 

rubylaser

Active Member
Jan 4, 2013
842
229
43
Michigan, USA
A first round try is there. The docs aren't complete but hopefully they are clear'ish. You'll need a fast device that the software can manipulate as it sees fit and point it to some devices you wish it to sit in front of. I'm still working on getting it setup so it will run after devices are available but before mounts occur.

Please be careful with testing. I've been doing it in a VM with virtual devices. Won't really help with speed comparisons but just to get a feel for it.

Any and all feedback is welcome.
Thanks! I just noticed the updates to the repository this morning and was already looking at it. I'll have to try it out in a VM myself at first so I can see how these parts work together and reproduce the issues you described with bringing things up correctly after a reboot.
 

MasterCATZ

New Member
Jun 8, 2011
15
0
1
I am thinking about giving "enhanceio" a spin, but I really need something with an include/exclude list to work alongside snapraid.

i.e., I don't want videos cached, but I do want programs cached.
 

ajaja

New Member
Jul 13, 2021
1
0
1
…the OS will handle the caching. So long as you have RAM the OS will cache the data.
I am aware that some file systems use system memory as a level-one cache; the likes of ZFS are quite sophisticated. Does mergerfs also do level-one caching in system memory, or are you saying this is handled at the OS level?

I want to make the best use of the M.2 NVMe slot on the motherboard of my server (PowerEdge T30). I can populate the 4 DIMM slots with 8GB DDR4 2133MT/s modules (wow, expensive on eBay). Do you think it would be best to use the M.2 NVMe (256MB) for swap?

OMV | UnionFS plug-in | five individual btrfs formatted drives | USB thumb boot drive