Disk Buffer/Cache control on Ubuntu 22.04


Bert

Well-Known Member
Mar 31, 2018
Ubuntu has a built-in file cache which is mostly automatic and greatly helps with performance. I am running an application whose performance is highly sensitive to memory access speed, on a two-socket system. The application uses the GPU and memory, and finally writes its output to disk.

If memory is not allocated on the correct socket, performance degrades by half, so controlling memory placement is very important. The application makes multiple passes over the same buffers, processing them iteratively. I am using numactl to pin the application to the correct socket and to set the preferred memory allocation node.
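For context, the numactl pinning described here typically looks like the sketch below (the node ids and `./myapp` binary name are placeholders, not from the original post):

```shell
# Check the NUMA layout this assumes: one node per socket
ls /sys/devices/system/node/ | grep '^node'

# Pin the app to socket 0's CPUs and prefer its local memory
# (node ids and ./myapp are placeholders for the real setup):
#   numactl --cpunodebind=0 --preferred=0 ./myapp
#
# --membind=0 is stricter than --preferred=0: allocations fail
# instead of spilling to the remote node when node 0 runs out.
```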

The file cache is giving me a hard time here:

1. The cache allocates memory on the target socket, forcing the app onto the wrong socket. I work around this by force-freeing the cache ( /proc/sys/vm/drop_caches ) and restarting the app, but that's cumbersome and costs me time in manual intervention.
2. The cache allocates memory for drives where it does not help, and starves the disks where it would, since we want to offload the memory quickly. I tried to work around this by disabling the disk's write cache, hoping that would also turn off the memory buffer for that disk, but it doesn't work that way.
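For reference, the drop_caches workaround mentioned above is usually done like this (the root-only write is shown commented; `1` vs `3` semantics are from the kernel docs):

```shell
# How much memory the page cache currently holds (values in kB)
grep -E '^(Cached|Buffers|Dirty):' /proc/meminfo

# As root: flush dirty pages first, then drop the clean page cache.
# '1' drops pagecache only; '3' also drops dentries and inodes.
sync
# echo 1 > /proc/sys/vm/drop_caches
```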

In short, I need to be able to turn off the memory buffer/cache, or set its size, for specific disks, so that memory stays available where it is needed. That would solve all my problems, but I couldn't find any such setting. Is there one?
 

CyklonDX

Well-Known Member
Nov 8, 2022
I would advise killing the issue by enabling memory mirroring mode on the system. This ensures each NUMA node has the same memory, so it no longer matters which socket wants what, or where the memory physically sits relative to the NUMA node.
The con is that it cuts your usable memory in half. (Though your read performance goes up too; write performance, not so much.)

One really good way to isolate caches per application with the smallest footprint is to dockerize it / chroot it.
A container, when done right, ignores your system-wide settings and runs with the ones you have specified (customized).

You can limit a Docker container much more easily than a bare process.
(some details here)
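A hedged sketch of the kind of per-container limits meant here (the image name, core list, and sizes are placeholders; `--cpuset-mems` restricts which NUMA node's memory, including page cache charged to the container, can be used):

```shell
# Confine a container to NUMA node 0's CPUs and memory; the --memory
# cap also bounds how much page cache the container's cgroup can hold.
# Image name, core list, and sizes below are placeholders.
docker run --rm \
  --cpuset-cpus=0-15 \
  --cpuset-mems=0 \
  --memory=32g \
  myapp:latest
```

This relies on cgroup memory accounting: page cache generated by the container's I/O is charged to its cgroup and reclaimed against its own limit, rather than competing freely with the rest of the system.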

By mapping a different filesystem per container/disk, you can also ensure each one is limited to its own disk file caches on the local system, and that those are kept on the same NUMA node, so there is no cross-talk. That works as long as a disk/SAS controller is also attached to the second CPU (so the CPUs do not need to cross-talk to reach each other's devices). In a perfect world you would have one SAS controller per CPU, and one network card per CPU. You can enable MPIO and connect two different SAS controllers to the same backplane; that way you always stay local to your NUMA node, so no cross-talk happens, while keeping the same visibility from both sides (and without having to rename anything).

Next would be the network. Here you should probably do something else, as teaming/bonding the NICs would not be for the best. Best to configure each container to use the CPU local to it for best performance.
 

Bert

Well-Known Member
Mar 31, 2018
Well, that would work, but the reason the app is memory-latency sensitive is that it uses all the memory. I need memory from both NUMA nodes: the app itself uses 256 GB at runtime, and the rest of the memory is used by the system as disk cache, which it needs to push buffers out of the app quickly. Yes, ideally I could switch to 64 GB modules and then contain the whole execution on a single CPU.

The main issue is that Ubuntu/Linux is wasting memory on I/O devices that will not benefit from the buffer pool, and starving the I/O devices where I do need it.

I found that with the sync mount option I can bypass the memory pool, but that pretty much kills the performance of any I/O operation, because every I/O now effectively does an fsync.

What I need is the ability to configure the cache size per device, not disable it completely:
use 20 GB of memory for these devices, 100 GB of memory for this one.

I don't think I am the first one asking for this functionality.
 

CyklonDX

Well-Known Member
Nov 8, 2022
The main issue is that Ubuntu/Linux is wasting memory on I/O devices that will not benefit from the buffer pool, and starving the I/O devices where I do need it.
If Docker is too much work, you can use
hdparm -W 0 /dev/sdX
to turn off the drive's write cache. If you have hdparm installed, you can also edit /etc/hdparm.conf and set write_cache = off globally.


I found that with the sync mount option I can bypass the memory pool, but that pretty much kills the performance of any I/O operation, because every I/O now effectively does an fsync.

What I need is the ability to configure the cache size per device, not disable it completely:

You can configure how often the kernel tries to flush dirty data to disk, and how large the dirty portion of the cache can get.
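The knobs being referred to are presumably the kernel's dirty-writeback sysctls. A sketch of a sysctl.d fragment with purely illustrative values (not recommendations):

```shell
# /etc/sysctl.d/99-writeback.conf -- illustrative values only
# Cap the amount of dirty (not-yet-written) page cache:
vm.dirty_background_bytes = 1073741824   # start background flushing at 1 GiB
vm.dirty_bytes = 4294967296              # block writers above 4 GiB dirty
# How long dirty data may sit in memory before it must be flushed:
vm.dirty_expire_centisecs = 1500         # 15 seconds
vm.dirty_writeback_centisecs = 500       # flusher wakes every 5 seconds
```

Apply with `sysctl -p /etc/sysctl.d/99-writeback.conf` (as root). Note these limits are global: they bound how much dirty data accumulates and how fast it drains, but they do not give the per-device cache sizing Bert is asking for.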