Do you mind sharing some details about how you're using it? I have four 256GB sticks sitting doing nothing and a dual Cascade Lake ES system (with six Mi50s) for LLM inference.
I have the Optane modules set to AppDirect. This can be done either in the BIOS or with ipmctl. Depending on your distro, you might need to compile ipmctl from source, and I'd recommend getting an AI agent to help with that, since there's some complexity that isn't mentioned in the repo. It was difficult on my Supermicro motherboard because of an external dependency, a package called edk2.
If you're using ipmctl, then it's something like
sudo ipmctl create -goal PersistentMemoryType=AppDirect
In the BIOS, you want to set your Optane goal to use 100% of the Optane memory for AppDirect. (There's an AppDirect maximum on no-suffix, M-suffix and L-suffix CPUs; this cap was dropped in later generations.)
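Here is a minimal sketch of the ipmctl route, assuming ipmctl is already installed and the modules have not been provisioned yet. The function name provision_appdirect is made up for illustration and is not part of ipmctl. A new goal only takes effect after a reboot.

```shell
#!/bin/sh
# Sketch: request a 100% AppDirect goal with ipmctl.
# provision_appdirect is a hypothetical helper name, not an ipmctl command.
provision_appdirect() {
    # Show the installed modules and how their capacity is currently split.
    sudo ipmctl show -dimm
    sudo ipmctl show -memoryresources
    # Request that all capacity be provisioned as AppDirect.
    # The goal is stored now and applied on the next reboot.
    sudo ipmctl create -goal PersistentMemoryType=AppDirect
    # Print the pending goal so it can be checked before rebooting.
    sudo ipmctl show -goal
}
```

Call provision_appdirect, reboot, and then sudo ipmctl show -region should list the AppDirect capacity.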
Then use ndctl (you should be able to install this through your package manager).
sudo ndctl create-namespace --mode=fsdax
This turns the AppDirect capacity into a DAX block device (it should show up as
/dev/pmem0 if you run
lsblk). Then format it with a filesystem that supports DAX: ext4, xfs and a few others. xfs will probably offer the best performance, but I'm currently just using ext4.
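The namespace and format steps can be sketched roughly like this. create_dax_fs is a hypothetical helper name, and the /dev/pmem0 default is an assumption taken from the lsblk output mentioned above. ext4 is used because that is what I run; swap in mkfs.xfs if you prefer xfs.

```shell
#!/bin/sh
# Sketch: turn AppDirect capacity into an fsdax namespace and format it.
# create_dax_fs is a hypothetical helper; the default device is an assumption.
create_dax_fs() {
    dev="${1:-/dev/pmem0}"
    # Create an fsdax namespace from the available AppDirect capacity.
    sudo ndctl create-namespace --mode=fsdax
    # List namespaces; the "blockdev" field names the resulting pmem device.
    sudo ndctl list --namespaces
    # Format the device with ext4. This DESTROYS any data already on it.
    sudo mkfs.ext4 "$dev"
}
```

Check the device name in the ndctl list output before letting the mkfs step run against it.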
After this you can mount it (with the dax mount option, otherwise it behaves like a regular block device) and add it to your fstab for automount. Run
ls -l /dev/disk/by-id and use the device ID for the fstab entry.
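As a sketch of the mount step: the helper below builds an fstab line with the dax option. Note that it identifies the filesystem by UUID (from blkid) rather than by a /dev/disk/by-id path, which is my substitution. fstab_line, the /mnt/pmem0 mount point and the ext4 default are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: build an fstab entry that mounts the pmem filesystem with DAX enabled.
# fstab_line is a hypothetical helper; /mnt/pmem0 and ext4 are assumed defaults.
fstab_line() {
    uuid="$1"
    mountpoint="${2:-/mnt/pmem0}"
    fstype="${3:-ext4}"
    # The "dax" mount option is what enables direct access on ext4 and xfs.
    printf 'UUID=%s %s %s defaults,dax 0 2\n' "$uuid" "$mountpoint" "$fstype"
}

# Example (not run here):
#   sudo mkdir -p /mnt/pmem0
#   sudo mount -o dax /dev/pmem0 /mnt/pmem0
#   fstab_line "$(sudo blkid -s UUID -o value /dev/pmem0)" | sudo tee -a /etc/fstab
#   findmnt -no OPTIONS /mnt/pmem0    # the option list should mention dax
```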
The next step is to put your GGUF on the DAX drive and use it with llama.cpp. In my case, I'm using ik_llama.cpp, and models load within just a few seconds. You'll also want to put your llama.cpp cache on there. This is super useful with llama-swap: your LLM agent can switch models for different usage scenarios and you only have to wait a few seconds.
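A rough sketch of serving a model from the DAX mount follows. serve_from_pmem, the LLAMA_BIN variable, the directory layout and the port are all hypothetical. Pointing LLAMA_CACHE at the drive is my reading of "cache" here: it is the directory llama.cpp uses for files it downloads itself. If you build ik_llama.cpp, set LLAMA_BIN to the path of its server binary.

```shell
#!/bin/sh
# Sketch: serve a GGUF straight from the DAX mount.
# serve_from_pmem, LLAMA_BIN and the /mnt/pmem0 layout are hypothetical.
serve_from_pmem() {
    model="$1"
    # Keep llama.cpp's download cache on the DAX drive as well, then
    # start the server on the model stored under /mnt/pmem0/models.
    LLAMA_CACHE=/mnt/pmem0/llama-cache \
        "${LLAMA_BIN:-llama-server}" -m "/mnt/pmem0/models/$model" --port 8080
}
```

As far as I know llama.cpp memory-maps the model file by default, which is why loading from a DAX filesystem is so quick.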
If you're running llama.cpp or ik_llama.cpp in Docker, make sure to also pass through the /dev/pmem0 device (I don't think it's necessary, but I do it anyway).
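For the Docker case, something along these lines should work. The image name is whatever you use, and the sketch assumes its entrypoint is the llama.cpp server binary so that the trailing arguments are passed to it. run_llama_container and the /mnt/pmem0/models path are hypothetical, and GPU device flags are left out.

```shell
#!/bin/sh
# Sketch: run a llama.cpp server container with the pmem device and models passed in.
# run_llama_container is a hypothetical helper; the image's entrypoint is assumed
# to be the server binary, so everything after the image name goes to it.
run_llama_container() {
    image="$1"
    model="$2"
    sudo docker run --rm \
        --device /dev/pmem0 \
        -v /mnt/pmem0/models:/models:ro \
        -p 8080:8080 \
        "$image" -m "/models/$model" --host 0.0.0.0 --port 8080
}
```

The bind mount of the models folder is what actually gives the container access to the files; the --device line is the optional passthrough.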
You can also move your Docker data to the DAX drive. Stop Docker first:
sudo systemctl stop docker
Create a folder, e.g.
/mnt/pmem0/docker-data
then edit
/etc/docker/daemon.json and add:
{
"data-root": "/mnt/pmem0/docker-data"
}
Then copy the existing data over and start Docker again:
sudo rsync -aP /var/lib/docker/ /mnt/pmem0/docker-data
sudo systemctl start docker
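If you'd rather generate the config than edit it by hand, here is a small sketch. daemon_json is a hypothetical helper that only prints the JSON. It assumes daemon.json has no other settings you need to keep; if it does, merge the data-root key in by hand instead.

```shell
#!/bin/sh
# Sketch: print a daemon.json that points Docker's data root at the DAX drive.
# daemon_json is a hypothetical helper; it does not merge with existing settings.
daemon_json() {
    data_root="${1:-/mnt/pmem0/docker-data}"
    printf '{\n  "data-root": "%s"\n}\n' "$data_root"
}

# Example (not run here):
#   daemon_json | sudo tee /etc/docker/daemon.json
#   sudo systemctl start docker
#   sudo docker info --format '{{ .DockerRootDir }}'   # should print the new path
```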
If you're running a database in Docker, use a bind mount instead of a Docker volume, so that the database's storage path is bound to a folder on the DAX drive (e.g.
/mnt/pmem0/postgres/data and
docker run ... -v /mnt/pmem0/postgres/data:/data). A bind mount puts the files directly on the DAX filesystem at a path you control, whereas a named volume lives under Docker's data root, and anything left in the container's own writable layer goes through overlayfs, which negates the benefits. Lastly, don't forget to check for database-specific configuration options for running on pmem.
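As a concrete sketch with Postgres: the helper below uses the official postgres:16 image, whose data directory is /var/lib/postgresql/data (not the generic /data in the example above). run_postgres_on_pmem, the container name and the password are placeholders.

```shell
#!/bin/sh
# Sketch: run Postgres with its data directory bind-mounted from the DAX drive.
# run_postgres_on_pmem, the container name and the password are placeholders.
run_postgres_on_pmem() {
    data_dir="${1:-/mnt/pmem0/postgres/data}"
    # Make sure the host folder exists on the DAX filesystem.
    sudo mkdir -p "$data_dir"
    # Bind-mount it over the image's data directory.
    sudo docker run -d --name pg-pmem \
        -e POSTGRES_PASSWORD=change-me \
        -v "$data_dir":/var/lib/postgresql/data \
        postgres:16
}
```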