EnhanceIO - SSD caching with benchmarks


Gene

Active Member
Jan 27, 2016
I'm in the process of setting up EnhanceIO ( GitHub - stec-inc/EnhanceIO: EnhanceIO Open Source for Linux ) on Debian/OpenMediaVault to use an SSD cache for my download/temp/work pool of drives. Just wondering if anyone else has used it. There really aren't many user reviews of it online, but it seems like a great way to get most of the benefit of SSDs without having to buy a ton of them.

Sadly I'll have to compile the kernel module support myself. However, I ran bleeding-edge kernels for a few years back in 2008, so I have plenty of experience doing that. I looked at bcache, which is baked into the kernel, but you have to change the partition structure of the drives to use it, so it's kind of a no-go for mostly full existing drives.

Hardware on hand for testing EnhanceIO: a 500GB Samsung 850 as the cache drive in write-back mode. Max speed is a bit over 5 Gbit/s. I have a very large battery backup and this is just a work pool for downloads, so I'm not worried about losing power; I can always recover the data in the cache that hasn't been written back yet (i.e., dirty blocks).

If this testing works well, I want to pick up a Sun Oracle F80, an 800GB PCI-E flash card that presents itself as 4x 200GB drives to the system. I'd then use drives 1 and 2 as a software RAID 1 write-back cache for my main storage pool (46TB mergerfs), and drives 3 and 4 as a software RAID 0 write-back cache for the work pool (4TB mergerfs). The F80 writes at over 10 Gbit/s, so I could finally make full use of the 10Gbit SFP+ fiber I have internally. I rarely send more than a few hundred GB of files at a time, so this would fit my usage profile pretty well. Anyone see any faults in my thinking here?
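
The carve-up I have in mind looks roughly like this (device names are placeholders for the four 200GB LUNs the F80 presents):

  mdadm --create /dev/md10 --level=1 --raid-devices=2 /dev/sdw /dev/sdx
  mdadm --create /dev/md11 --level=0 --raid-devices=2 /dev/sdy /dev/sdz

/dev/md10 would then become the write-back cache device for the main pool and /dev/md11 the cache for the work pool.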
 

Gene

Active Member
Jan 27, 2016
So in an effort to speed up my scratch/download/work drive performance, I looked into the different options for using SSDs as cache on Linux. The two major options out there are bcache and EnhanceIO.
Option 1: bcache is built into the Linux kernel but requires changing the partitioning of your data drives. To me that's a no-go if you already have existing data and don't feel like taking forever to migrate it around, or taking the chance of losing it if you mess up resizing partitions. So I'm not even going to weigh its pros and cons.
Option 2: EnhanceIO. Pros: simple to set up once built from source, actively developed, no drive changes required, and caches can be added or removed at any time. Cons: it isn't packaged for Debian Jessie, and the packages in the testing repo haven't been updated, so we need to build from source. You also need to set up an init.d script to reload the modules on boot and run the EnhanceIO cache mounting command.

Directions for Debian Jessie to build EnhanceIO from git source along with usage instructions: Installing enhanceio on debian Jessie (Not wheezy or squeez, minimum kernel 3.7 onwards) – Tech-G
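
For the impatient, the build roughly boils down to the following (paths are from memory of the repo layout, so treat this as a sketch and follow the linked guide for the exact steps; kernel headers and build tools need to be installed first):

  apt-get install build-essential linux-headers-$(uname -r) git
  git clone https://github.com/stec-inc/EnhanceIO.git
  cd EnhanceIO/Driver/enhanceio
  make && make install
  modprobe enhanceio
  modprobe enhanceio_lru

The eio_cli tool lives in the CLI/ directory of the repo and needs to be copied somewhere in your PATH; the init.d part is covered in the guide.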

Now we need to look at the options for EnhanceIO. There are three caching modes: read-only, write-through, and write-back. Read-only sends writes directly to disk and only updates matching data in the cache, so only reads are accelerated. Write-through writes to both the cache and the HD, so future reads can be served from the cache; this also speeds up reads. Write-back uses the cache as a write buffer, but a cache drive failure could cause data loss (you also want a battery backup). Since this is a scratch area I'm not worried; if I were going to use write-back for my main data, I'd mirror two cache devices in RAID 1 to protect it.

Cache replacement policy: LRU is pretty much the only one I'd use here. From what I can tell in the docs it uses about 1.2GB of memory per 400GB of cache; random uses about 400MB and FIFO about 800MB.
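
If I'm reading the CLI docs right, the mode and replacement policy are both chosen when the cache is created, something along these lines (device names and cache name are just placeholders):

  eio_cli create -d /dev/sdb -s /dev/sdc1 -p lru -m wb -c cache_sdb

where -m takes wb/wt/ro for write-back/write-through/read-only and -p takes lru/fifo/rand.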

So now let's get to the testing. My download pool currently consists of two WD 2TB drives formatted ext4 and pooled using mergerfs. EnhanceIO can only be attached to block devices under /dev/*, so a folder-level pool like the one mergerfs builds can't be cached directly. Instead, I partitioned my 500GB test SSD into two equal partitions and assigned one to each data drive individually.
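
The split itself is nothing fancy; something like this should do it (sdc stands in for the SSD here, purely as a placeholder):

  parted -s /dev/sdc mklabel gpt
  parted -s /dev/sdc mkpart primary 0% 50%
  parted -s /dev/sdc mkpart primary 50% 100%

Then one eio_cli create per data drive (as above), pointing -d at the WD drive and -s at its SSD partition.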

Pre-EnhanceIO Bonnie++ test - 32GB file (16GB of RAM on the server)
Sequential input: 77.5 MB/s, seq read: 33.4 MB/s - HTML file: BonnieTest1.html

Post-EnhanceIO Bonnie++ test - 32GB file (16GB of RAM on the server)
Sequential input: 254 MB/s, seq read: 315.5 MB/s - also much higher CPU usage - HTML file: BonnieTest2.html
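
If anyone wants to run a comparable test, a bonnie++ invocation along these lines produces the same kind of HTML report (the path is a placeholder for a directory on the cached pool; bon_csv2html ships with bonnie++):

  bonnie++ -d /srv/downloads -s 32g -r 16384 -u root | tail -n 1 | bon_csv2html > BonnieTest.html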

Based on this, it seems like the best approach for my working area would be to front it with a PCI-E SSD card that can reach the 10+ Gbit/s range and use EnhanceIO in write-back mode. I may also need to look into adding another 2TB drive or two and then going with a ZFS stripe.

Reference sites:
http://www.duo.co.nz/documents/EnhanceIO_Technical_Review.pdf
Active EnhanceIO fork GitHub - elmystico/EnhanceIO: EnhanceIO Open Source for Linux
 

aero

Active Member
Apr 27, 2016
I'm a fan of EnhanceIO, but since I updated to the 4.4 kernel I can't get it to work. I've switched to bcache, but I'm having difficulty benchmarking the results. I'm only interested in write-through mode for the sake of data integrity (the cache is a RAID 0 of SSDs), so I need a way to test reads.

Bcache detects sequential IO (read and write) and does not cache it (EnhanceIO does not do this). This makes testing with typical disk bench tools difficult.
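
Apparently the detection can be switched off for testing by zeroing the sequential cutoff in sysfs (bcache0 being whatever your bcache device is), though I haven't tried it yet:

  echo 0 > /sys/block/bcache0/bcache/sequential_cutoff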

The data set is KVM guest images, so I'm thinking the only way to accurately determine performance is via a system/application performance profiler on a guest.

The cache hit rate is very high though ( >75%), so I think it must be having some positive effect.

Any other thoughts on how to benchmark this?
 

aero

Active Member
Apr 27, 2016
It compiles with 4.4, but it kept crashing on me and freezing up the file system.
 

TuxDude

Well-Known Member
Sep 17, 2011
aero said:
Bcache detects sequential IO (read and write) and does not cache it (EnhanceIO does not do this). This makes testing with typical disk bench tools difficult.

The data set is KVM guest images, so I'm thinking the only way to accurately determine performance is via a system/application performance profiler on a guest.
If the workload is VM hosting, and there is more than one VM, just start with the assumption that all IO is going to be random. Even if a VM is doing a sequential workload, two VMs (two sequential workloads) hitting the same underlying disk combine into a single random workload.

But you're right that you should be doing your benchmarking from inside the guest(s), because in the end guest performance is what matters, and other pieces of the stack (e.g. which virtual disk controller the VM is using) can make a difference. If your guests are Windows, stick Iometer into one or a few of them, depending on how big of a workload you want to generate. If they are Linux, use fio.
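
For the Linux guests, something along these lines is a reasonable starting point (filename, size and iodepth are placeholders to tune for your setup):

  fio --name=randread --filename=/root/fio.test --size=4G --rw=randread --bs=4k --iodepth=32 --ioengine=libaio --direct=1 --runtime=60 --time_based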
 

aero

Active Member
Apr 27, 2016
I was able to observe random 4k reads at ~2.6 MB/s (0.5 MB/s at 1 queue depth!) with bcache disabled, and it jumped up to ~23 MB/s (~22 MB/s at 1 queue depth) with it enabled. I'd say that's a definite win!

I watched iostat on the KVM host while running the tests on a Windows guest and confirmed that bcache ignores sequential IO (it goes straight to the backing disk).
 

JustinH

Active Member
Jan 21, 2015
Singapore
Side note: if you're using LVM, there is a third option for caching there (LVM's built-in cache). I've been pretty happy with it so far.
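
From memory it's roughly this on top of an existing volume group and LV (names are placeholders; man lvmcache has the details):

  lvcreate --type cache-pool -L 100G -n cpool vg0 /dev/nvme0n1
  lvconvert --type cache --cachepool vg0/cpool vg0/data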


 

gea

Well-Known Member
Dec 31, 2010
DE
Another option would be ZFS.
Among its advanced features is one of the best caching methods: in RAM and, as an extension, on SSD, based on the last-accessed/most-accessed data blocks.

As this is integrated as a filesystem feature it is very efficient and OS independent.
 

Boddy

Active Member
Oct 25, 2014
Hi @gea, I'm looking for a cache solution for a large database on spinners. (I'm fairly new in this area.)
Another member has suggested using FreeBSD rather than Linux as it's more efficient on resources.
Just to clarify your last post: can ZFS be set up to cache to RAM initially and overflow to an SSD cache?
I do have RAID cards with the CachePath option if needed (but I understand this is not necessary for ZFS?)
Cheers
 

gea

Well-Known Member
Dec 31, 2010
DE
ZFS uses nearly all free RAM as a read cache (ARC) and buffers a few seconds of writes in RAM as a write cache. You can add an SSD to extend the read cache, but since RAM is much faster than an SSD you should prefer RAM; this is why you see ZFS appliances with several hundred GB of RAM. You can also add a fast SSD as an Slog (write log) device to get fast but powerloss-safe write behaviour (like a BBU on hardware RAID).
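
Adding them to an existing pool is a single command each, for example (pool and device names are only examples):

  zpool add tank cache /dev/disk/by-id/ata-SSD_L2ARC
  zpool add tank log mirror /dev/disk/by-id/ata-SSD_SLOG1 /dev/disk/by-id/ata-SSD_SLOG2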

Your current hardware RAID 5/6 adapters with cache options are not wanted with ZFS, since ZFS is software RAID; it works best with cheaper, RAID-less LSI HBAs.

ZFS was originally developed by Sun/Oracle, and the storage features, disk management and integration of ZFS with services like FC/iSCSI, NFS or SMB are still best on the non-free Oracle Solaris. Most of these advantages are available on OmniOS, an open-source Solaris fork, as well. This is my favourite platform, as all these services are integrated in the OS and maintained by the OS supplier itself (Illumos or Oracle Solaris).

Based on the free OpenSolaris fork, you can now use ZFS on different operating systems. They are nearly identical with regard to ZFS, as they all use the same codebase. They differ more in the other OS-related advantages and disadvantages, and in whether you use a core OS like BSD, OSX, Linux or Illumos/Solarish, a combined appliance like FreeNAS or NexentaStor, or my napp-it, which is a web UI and management add-on for some of the OS options.

btw
If you really need performance with databases, think about SSD-only pools. Prefer enterprise SSDs with powerloss protection and skip any SSD cache or Slog options. Even the best caches cannot fully compensate: a good SSD can offer > 20,000 IOPS while even a 10k disk manages around 150 IOPS, and databases and VMs are mostly IOPS-bound.
 