Napp-IT Replication Integrity After NIC Failure


sonoracomm

New Member
Feb 10, 2017
Hi all,

We have been using the napp-it replication feature for a long time as a (multi-level) backup. We've been lucky and haven't needed to restore from it, so I'm not well versed in this feature/technology.

We had an incident where a NIC failed after moving a storage server (third-level backup) to another location with much slower connectivity (VPN) to the two source servers.

Obviously, a few replication jobs failed or were left incomplete.

Now that the same/existing jobs have resumed successful scheduled operation with no ongoing errors, do I need to do anything to verify or repair the replicated datasets?

Thanks in advance,

G
 

gea

Well-Known Member
Dec 31, 2010
The key point is that you need identical snapshot pairs on source and destination (same repli_nn snap number) to continue incremental replications. With correct snapshot pairs you can simply restart or even reverse a replication (set the old destination filesystem to rw and create a new replication job there with the same job id).
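
Checking for a matching snap pair can be sketched with plain zfs commands. Pool, filesystem, and host names below are illustrative, not taken from the thread; napp-it's replication snaps carry a repli_zfs job id and number in their names:

```shell
# List replication snapshots on the source, oldest first:
zfs list -t snapshot -o name -s creation tank/data | grep repli

# List the corresponding snapshots on the destination (e.g. over ssh):
ssh backuphost zfs list -t snapshot -o name -s creation backup/data | grep repli

# A pair with the same snap number on both sides means incremental
# replication can continue. To reverse the direction, make the old
# destination writable first:
zfs set readonly=off backup/data
```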

If an incremental replication fails, e.g. due to a network error, you can just restart/retry. In rare cases the last destination snap is damaged. Since a napp-it replication preserves at least the last three snap pairs, you can destroy the newest destination snap; the next job run will then be based on the previous pair.
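
Falling back to the previous pair could look like this (snapshot names are hypothetical examples in the spirit of napp-it's numbered naming, not exact):

```shell
# Show destination snaps of the affected filesystem, newest last:
zfs list -t snapshot -o name -s creation backup/data
#   backup/data@1456789_repli_zfs_123_nr_41
#   backup/data@1456790_repli_zfs_123_nr_42   <- newest, possibly damaged

# Destroy only the newest destination snap:
zfs destroy backup/data@1456790_repli_zfs_123_nr_42

# The next job run will base its incremental on the nr_41 pair.
```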

If you do not have a snap pair with the same number, rename the destination filesystem (e.g. to filesystem.old) and start the job for a new full initial transfer. After success, destroy filesystem.old, which you kept simply as a backup.
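
A minimal sketch of that fallback, again with example filesystem names:

```shell
# Move the out-of-sync destination aside so the job can do a fresh
# full initial transfer into a new backup/data:
zfs rename backup/data backup/data.old

# ...run the replication job; once the new initial transfer has
# completed successfully, drop the copy kept as a safety net:
zfs destroy -r backup/data.old
```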

If you (re)run a replication without errors, the transfer was checksum protected and is ok. There is no need to verify or repair anything because of the former errors.
 

evawillms

New Member
Oct 6, 2023
gea said: The key point is that you need identical snapshot pairs on source and destination (same repli_nn snap number) to continue incremental replications. With correct snapshot pairs you can simply restart or even reverse a replication (set the old destination filesystem to rw and create a new replication job there with the same job id).

If an incremental replication fails, e.g. due to a network error, you can just restart/retry. In rare cases the last destination snap is damaged. Since a napp-it replication preserves at least the last three snap pairs, you can destroy the newest destination snap; the next job run will then be based on the previous pair.

If you do not have a snap pair with the same number, rename the destination filesystem (e.g. to filesystem.old) and start the job for a new full initial transfer. After success, destroy filesystem.old, which you kept simply as a backup.

If you (re)run a replication without errors, the transfer was checksum protected and is ok. There is no need to verify or repair anything because of the former errors.
How do I know if DFS replication is working?
 

gea

Well-Known Member
ZFS takes care of data integrity; a distributed filesystem (DFS) only organizes shares. If the first is intact, the second should be as well.