The mlxcx driver seems to be a beast
Most issues are fixed, some are open
follow (maybe you want to report as well)
When you intend to order new servers, why not simply use ones with IPMI for remote management?
This even works for initial setup and lets you online/offline/reset a server remotely.
Your link to Oracle requires a login/subscription
With NFS3 there is no real authentication or authorisation. You can only limit access per client IP (more or less based on goodwill). The uid of the created files depends on the server/client configuration; it can be nobody or the uid of the writing client.
Usually you allow everyone to solve the...
If you want to write a single byte to an NVMe/SSD, you can do this directly, as with RAM, on Intel Optane. Traditional flash (unless the SSD is new/empty/secure erased/trimmed) has to read a whole page (can be several MB), erase the page and write the page again with the modified byte.
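To put a number on the read/erase/rewrite cycle described above, here is a rough sketch in Python. It is a toy model only: the function name and the 4 MiB unit size are assumptions for illustration, and it ignores controller caching, garbage collection and alignment.

```python
def write_amplification(bytes_changed, erase_unit_bytes):
    """Bytes physically rewritten per byte the host wanted to change.

    Toy model (assumption): the write is aligned and every touched
    erase unit must be rewritten completely.
    """
    if bytes_changed <= 0 or erase_unit_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Number of erase units touched (ceiling division)
    units = -(-bytes_changed // erase_unit_bytes)
    return units * erase_unit_bytes / bytes_changed


# One byte on flash with a hypothetical 4 MiB erase unit
print(write_amplification(1, 4 * 1024 * 1024))

# One byte on a byte-addressable medium (unit size 1), as with Optane
print(write_amplification(1, 1))
```

The larger the unit that must be rewritten, the worse small random writes behave on an SSD that is not empty or trimmed.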
In a Raid-6 array, the OS sees the array as a single disk. A write to the disks is processed by the raid subsystem, which creates raid stripes that are written sequentially to the disks. A crash during such a write sequence can lead to a corrupt raid array (not all stripes are written) or depending on...
You want an Slog to protect your RAM-based write cache on a crash. This is quite senseless if you use an Slog without proper powerloss protection.
Additionally, if you want some performance with sync writes, you need low latency and high steady write iops, and absolutely no desktop SSD.
Pool performance may be a relevant point and a performance reduction may be helpful. With sync=default on the parent destination filesystem, your replication is done async. Your async sequential pool performance is > 400 MB/s. This is 4 x your network.
Beside that the send itself is done...
I expect the low random performance then to trigger the monitor error.
If so, use the monitor error as a warning or info only, not a real error. Just ignore.
I hesitate to increase the default timeout of the monitor process, but I can tell you where to increase it.
A napp-it replication job starts two processes. One is the replication itself via netcat and the other is a monitoring process to avoid an endless running/waiting netcat.
In your case the replication 198->199 ends with ok. The sending process notes that the send lasts 48s while the receive...
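The idea of a transfer guarded by a monitor, so that a hanging netcat cannot wait forever, can be sketched in Python as below. This is not napp-it's implementation: napp-it uses two separate processes, while this sketch uses one supervising function with a timeout. The function name and the status strings are assumptions for illustration.

```python
import subprocess
import sys
import time


def run_with_watchdog(cmd, timeout_s):
    """Run cmd and kill it if it runs longer than timeout_s seconds.

    Returns (status, returncode, elapsed_seconds), where status is
    "ok", "failed" or "timeout".
    """
    start = time.monotonic()
    proc = subprocess.Popen(cmd)
    try:
        rc = proc.wait(timeout=timeout_s)
        status = "ok" if rc == 0 else "failed"
    except subprocess.TimeoutExpired:
        # The transfer hangs or is too slow: stop it instead of waiting forever
        proc.kill()
        proc.wait()
        rc = proc.returncode
        status = "timeout"
    return status, rc, time.monotonic() - start


# Stand-in commands; a real job would start the send/receive pipeline here
print(run_with_watchdog([sys.executable, "-c", "pass"], 30)[0])
print(run_with_watchdog([sys.executable, "-c", "import time; time.sleep(30)"], 1)[0])
```

A timeout that is too short for a slow pool reports a problem although the transfer itself would have finished, which is why the timeout value matters.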
There are three logs in napp-it to check
Menu Jobs (destination system)
- Last Log
click on the date of last run.
This shows details of last run
click on "replicate" in the line of the job
This shows an overview of last executions
Menu Jobs (source system)
- Remote Log
Slog is there to protect the RAM-based write cache. Its default size on Open-ZFS is 10% of RAM, max 4 GB. With 32 GB RAM, your write cache is 3.2 GB. You should use at least twice that value for the Slog. Even if you double that to be sure to have enough, you end up with less than 20 GB. More does not help, at...
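The sizing rule above (write cache = 10% of RAM, capped at 4 GB, Slog at least twice that) can be written down as a small Python sketch. The function names and the parameter `safety_factor` are only illustrative; the numbers are the ones from the text.

```python
def write_cache_gb(ram_gb, fraction=0.10, cap_gb=4.0):
    """RAM-based write cache size: 10% of RAM, at most 4 GB."""
    return min(ram_gb * fraction, cap_gb)


def min_slog_gb(ram_gb, safety_factor=2.0):
    """Minimum useful Slog size: at least twice the write cache."""
    return write_cache_gb(ram_gb) * safety_factor


for ram in (16, 32, 64, 256):
    cache = write_cache_gb(ram)
    slog = min_slog_gb(ram)
    print(f"RAM {ram} GB: write cache {cache:.1f} GB, Slog >= {slog:.1f} GB")
```

Because of the 4 GB cap, the minimum never exceeds 8 GB under this rule, whatever the RAM size. Even doubled once more as a reserve, it stays below 20 GB.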
As expected (maybe better than that).
A single mechanical disk can give 100-250 MB/s sequentially, depending on whether you use the inner or outer tracks and whether the disk is quite full (a lot of fragmentation) or empty. You see an unsync sequential pool performance of 723 MB/s. On writes the...