So I tested Intel Optane Persistent Memory for the first time.


BackupProphet

Of course I ran PostgreSQL's pg_test_fsync.

Most enterprise SSDs do around 5,000-15,000 IOPS.
The Optane 280GB did 30,000 IOPS.
Optane NVDIMM:
Code:
sudo numactl -N 0 ./pg_test_fsync -f /mnt/pmem0/test
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in "wal_sync_method" preference order, except fdatasync is Linux's default)
open_datasync 296824.466 ops/sec 3 usecs/op
fdatasync 284270.155 ops/sec 4 usecs/op
fsync 292300.289 ops/sec 3 usecs/op
fsync_writethrough n/a
open_sync 303438.714 ops/sec 3 usecs/op

Compare file sync methods using two 8kB writes:
(in "wal_sync_method" preference order, except fdatasync is Linux's default)
open_datasync 146539.355 ops/sec 7 usecs/op
fdatasync 166876.666 ops/sec 6 usecs/op
fsync 163899.210 ops/sec 6 usecs/op
fsync_writethrough n/a
open_sync 149345.961 ops/sec 7 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
1 * 16kB open_sync write 150159.840 ops/sec 7 usecs/op
2 * 8kB open_sync writes 148207.385 ops/sec 7 usecs/op
4 * 4kB open_sync writes 118460.784 ops/sec 8 usecs/op
8 * 2kB open_sync writes 80294.959 ops/sec 12 usecs/op
16 * 1kB open_sync writes 46076.797 ops/sec 22 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
write, fsync, close 114186.926 ops/sec 9 usecs/op
write, close, fsync 113974.395 ops/sec 9 usecs/op

Non-sync'ed 8kB writes:
write 387088.381 ops/sec 3 usecs/op
That is an order of magnitude faster than Optane over PCI Express. Persistent memory is insane...
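To make the "order of magnitude" concrete, here is a small sketch of the arithmetic using the numbers quoted in this post. It assumes the 30,000 figure for the PCIe Optane is comparable to the 8kB open_datasync result; the helper names `ratio` and `usecs_per_op` are made up for illustration.

```shell
#!/bin/sh
# Figures copied from this post (single 8kB open_datasync write).
PMEM_OPS=296824.466   # Optane NVDIMM, ops/sec
PCIE_OPS=30000        # Optane 280GB over PCIe, ops/sec (assumed comparable test)

# ratio A B: prints A/B rounded to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# usecs_per_op OPS: converts ops/sec into microseconds per operation.
usecs_per_op() {
    awk -v ops="$1" 'BEGIN { printf "%.2f\n", 1000000 / ops }'
}

echo "speedup: $(ratio "$PMEM_OPS" "$PCIE_OPS")x"
echo "pmem latency: $(usecs_per_op "$PMEM_OPS") usecs/op"
echo "pcie latency: $(usecs_per_op "$PCIE_OPS") usecs/op"
```

With these inputs the speedup comes out at roughly 9.9x, so "an order of magnitude" holds up.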
This was done on AlmaLinux 10, with an fsdax namespace and an XFS filesystem mounted with the dax option. The namespace is a region of 6x128GB Optane modules.
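For anyone wanting to reproduce a setup like this, the following is a rough dry-run sketch, not the exact commands used here. The region name `region0` and the device node `/dev/pmem0` are assumptions (check `ndctl list -R` on your own box); only the mount point `/mnt/pmem0` comes from the pg_test_fsync command above. It prints the commands and runs them only when `APPLY=1` is set.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands unless APPLY=1 is set.
REGION="${REGION:-region0}"            # assumed region name
PMEM_DEV="${PMEM_DEV:-/dev/pmem0}"     # assumed device node for the namespace
MOUNTPOINT="${MOUNTPOINT:-/mnt/pmem0}" # mount point used in the benchmark

# run CMD...: execute the command if APPLY=1, otherwise just print it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

setup_fsdax() {
    # Create an fsdax namespace on the (interleaved) region.
    run ndctl create-namespace --mode=fsdax --region="$REGION"
    # Put XFS on it and mount with DAX so I/O bypasses the page cache.
    run mkfs.xfs "$PMEM_DEV"
    run mkdir -p "$MOUNTPOINT"
    run mount -o dax "$PMEM_DEV" "$MOUNTPOINT"
}

setup_fsdax
```

After mounting for real, checking the mount options with `findmnt` is a quick way to confirm DAX is actually in effect.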
 

Stephan

I think I read in some comments that DAX is on its way out. That might only apply to ext4, though.

Because I really want clean snapshots for backups, plus automatic hourly/daily/monthly/yearly snapshots to guard against rm -rf type fat-finger mistakes, I am migrating everything to ZFS these days. Try it with Optane. I am not interleaving modules, just running ZFS with two 512 GB modules as disks. The GPT partition of type EF00 now holds only one file: a ZFSBootMenu .efi with the kernel, initramfs, and some parameters.

Bash:
#!/bin/sh

ROOT_POOL="zpool1/ROOT"

BOOT_KERNEL="vmlinuz-linux-custom"
BOOT_CMDLINE="rw delayacct audit=0 init_on_alloc=0 init_on_free=0 kvm-intel.nested=1 page_alloc.shuffle=1 intel_iommu=igfx_off iommu=pt msr.allow_writes=on retbleed=stuff"

# rd.debug rd.log=all printk.devkmsg=on
# crashkernel=512M

BOOT_EFI="/boot/efi/EFI/BOOT/BOOTX64.EFI"
BOOT_EFI_DIST="/boot/zfsbootmenu-recovery-x86_64-v3.0.1-linux6.12.EFI"

if ! mountpoint -q /boot/efi; then
    echo "*** Mounting ESP"
    mount /boot/efi || exit 1
    if ! mountpoint -q /boot/efi; then
        echo "ESP problem, not a mountpoint" >&2
        exit 1
    fi
fi

if [ ! -f "$BOOT_EFI" ]; then
    echo "File $BOOT_EFI missing." >&2
    exit 1
fi

if [ ! -f "$BOOT_EFI_DIST" ]; then
    echo "File $BOOT_EFI_DIST missing." >&2
    exit 1
fi

if [ ! -f /etc/hostid ]; then
    echo "File /etc/hostid missing." >&2
    exit 1
fi

echo "*** Boot pool settings"
zfs set org.zfsbootmenu:commandline="$BOOT_CMDLINE" "$ROOT_POOL"
zfs set org.zfsbootmenu:kernel="$BOOT_KERNEL" "$ROOT_POOL"
zfs get all -s local "$ROOT_POOL" | grep zfsbootmenu | awk '{ $1=$1; print }'

echo "*** Writing BOOTX64.EFI"
HOSTID=$(od -An -tx4 /etc/hostid | tr -d " ")
/boot/zbm-kcl -r zbm.timeout -a zbm.timeout=5 -a spl.spl_hostid=0x"$HOSTID" -o "$BOOT_EFI" "$BOOT_EFI_DIST"
/boot/zbm-kcl "$BOOT_EFI"

echo "*** Unmounting ESP"
umount /boot/efi
exit 0
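
The automatic snapshots mentioned above could be driven by something like the sketch below. It is only an illustration, not the setup actually in use: the `auto-LABEL-TIMESTAMP` naming scheme and the helper names are made up, pruning of old snapshots is left out, and the dataset defaults to the `zpool1/ROOT` value from the script above. It prints the command unless `APPLY=1` is set.

```shell
#!/bin/sh
DATASET="${DATASET:-zpool1/ROOT}"   # taken from the script above; adjust as needed

# snap_name LABEL [TIMESTAMP]: builds DATASET@auto-LABEL-TIMESTAMP.
# The timestamp defaults to the current UTC time.
snap_name() {
    label="$1"
    stamp="${2:-$(date -u +%Y%m%d-%H%M)}"
    echo "${DATASET}@auto-${label}-${stamp}"
}

# take_snapshot LABEL: recursive snapshot of DATASET and its children.
# Prints the command unless APPLY=1 is set.
take_snapshot() {
    name=$(snap_name "$1")
    if [ "${APPLY:-0}" = "1" ]; then
        zfs snapshot -r "$name"
    else
        echo "+ zfs snapshot -r $name"
    fi
}

take_snapshot hourly
```

Calling it from cron or a systemd timer with the labels hourly, daily, monthly, and yearly would give the four tiers.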
 

CyklonDX

Quote: "The namespace is a region of 6x128GB Optane modules."
Which generation of Optane modules are those, and at what speed (MHz)?
I've been meaning to grab 2-4 256-512GB 2666MHz modules from the first Optane Persistent Memory generation myself (for a dual Gold 6254 box with 768GB of DDR4 3DS 2666MHz).
 

CyklonDX

May I ask whether your memory downclocked from 2933MHz, or were you running a config that didn't downclock?
(Adding modules to my config will likely downclock me to 2400MHz.)

Have you tested performance when it is used as a second tier of RAM?