Troubleshooting • Re: crazy issues with NFS and 2 pi's

Thanks for the new answers!

To @jojopi's point: the difference is always +2, though the absolute numbers vary. Today it was "1043586 when sending 1043584".
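For anyone wanting to confirm the +2 pattern over many log lines rather than by eye, here is a rough sketch. It only matches the fragment quoted above ("N when sending M"); the rest of the message format is not assumed, and `size_deltas` is a made-up helper name.

```python
import re

# Matches only the quoted fragment, e.g. "1043586 when sending 1043584".
# The surrounding text of the real log line is deliberately not assumed.
FRAGMENT = re.compile(r"(\d+) when sending (\d+)")


def size_deltas(lines):
    """Return (first number - second number) for every matching line."""
    deltas = []
    for line in lines:
        match = FRAGMENT.search(line)
        if match:
            reported, sending = int(match.group(1)), int(match.group(2))
            deltas.append(reported - sending)
    return deltas


if __name__ == "__main__":
    sample = ['1043586 when sending 1043584']
    print(size_deltas(sample))
```

Feeding it the saved kernel log from the server would show quickly whether any line ever deviates from +2.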

This is my /etc/exports:
```
/mnt/hdd/restic 192.168.1.21(rw,sync,no_root_squash) 192.168.1.22(rw,sync,no_root_squash) 192.168.1.31(rw,sync,no_root_squash) 192.168.1.32(rw,sync,no_root_squash) 192.168.1.53(rw,sync,no_root_squash) 192.168.1.225(rw,sync,no_root_squash)
/mnt/hdd/housebox 192.168.1.21(rw,sync,no_root_squash) 192.168.1.22(rw,sync,no_root_squash) 192.168.1.31(rw,sync,no_root_squash) 192.168.1.32(rw,sync,no_root_squash) 192.168.1.53(rw,sync,no_root_squash) 192.168.1.225(rw,sync,no_root_squash)
```
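With six clients repeated on each line it is easy to miss a typo, so a small check that every client gets the same options may be worth running. This is a simplified sketch: it assumes the plain `path host(opts) host(opts)` layout shown above and does not handle quoted paths, line continuations, or default option blocks. The function names are hypothetical.

```python
def parse_exports(text):
    """Parse simple exports-style text into {path: {client: [options]}}."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        path, *clients = line.split()
        entry = table.setdefault(path, {})
        for spec in clients:
            host, _, opts = spec.partition("(")
            entry[host] = opts.rstrip(")").split(",") if opts else []
    return table


def uniform_options(table):
    """True if every client of every export has an identical option list."""
    seen = {tuple(opts) for clients in table.values() for opts in clients.values()}
    return len(seen) <= 1


if __name__ == "__main__":
    with open("/etc/exports") as handle:
        exports = parse_exports(handle.read())
    for path, clients in exports.items():
        print(path, sorted(clients))
    print("uniform options:", uniform_options(exports))
```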

And on the pi4 side fstab looks like this:
```
192.168.1.33:/mnt/hdd/housebox /mnt/housebox nfs vers=3,nofail,x-systemd.automount,x-systemd.requires=network-online.target 0 3
192.168.1.33:/mnt/hdd/restic /mnt/restic nfs vers=3,nofail,x-systemd.automount,x-systemd.requires=network-online.target 0 3
```

Other than that, everything is pretty standard, with nfs-kernel-server installed as-is from apt.

Initially I thought there was no pattern at all, but over the last few days it has happened quite consistently during my nightly restic backup. The delay varies: sometimes it starts within the same minute, sometimes 20 minutes in, or at some other seemingly random point shortly after the backup starts. Last night I moved the backup to a different time, and the issue moved with it.

I don't think that process is the ONLY possible trigger, but it may be a very "good" one. That is another hint that the issue is triggered by some form of "stress": restic involves lots of file exchanges, I believe, and some of what I back up itself lives on the other NFS share hosted by the same pi5 (plus some more on an NFS server running in an x86 VM, which works fine).

My backup process runs restic and then rclone. As it happens, today I tried unmounting the share, and when that failed I ran lsof and saw that rclone was stuck, though killing it was instant and didn't solve anything. I'll keep checking whether it's always rclone that gets stuck.
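To log which process is holding the share each time this happens, something like the following could be run from the backup script when the unmount fails. It is only a sketch of part of what lsof does: it reads the `/proc/<pid>/fd` symlinks (Linux only) and ignores working directories and memory maps. Reading the links does not touch the NFS mount itself, so it should not block on a hung share. `pids_with_open_files` is a hypothetical name, and `proc_root` is a parameter only so the function can be tried against a fake tree.

```python
import os


def pids_with_open_files(mount_point, proc_root="/proc"):
    """Return {pid: [paths]} for processes with open files under mount_point."""
    base = mount_point.rstrip("/")
    prefix = base + "/"
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or not permitted to inspect it
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed in the meantime
            if target == base or target.startswith(prefix):
                holders.setdefault(int(entry), []).append(target)
    return holders


if __name__ == "__main__":
    # Mount points taken from the fstab above; run as root to see all processes.
    for mount in ("/mnt/restic", "/mnt/housebox"):
        print(mount, pids_with_open_files(mount))
```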

Even after killing rclone and unmounting all NFS shares, the pi5 keeps looping with the error, until the pi4 is rebooted. The pi4 shows no relevant logs.

I also tried switching to NFSv3 the night before (in the mount options), but the issue happened as usual.
I mounted the same FS via Samba as a proof of concept, but as I feared, benchmarking some restic operations showed slowdowns ranging from a few percent to nearly double the time taken over NFS.
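For reference, a comparison like that can be made repeatable with a few lines of Python. This is a generic sketch, not the method used for the numbers above: it times any command several times and compares medians. The helper names are made up, and the commands are placeholders to be filled in with whatever restic operation is being compared.

```python
import statistics
import subprocess
import time


def time_command(cmd, repeats=3):
    """Run cmd (a list of arguments) repeats times; return wall-clock seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def percent_increase(baseline_times, other_times):
    """Percent change of the median of other_times relative to the baseline."""
    base = statistics.median(baseline_times)
    other = statistics.median(other_times)
    return (other - base) / base * 100.0


if __name__ == "__main__":
    # Placeholder commands: replace with the same operation pointed at the
    # NFS mount and at the Samba mount respectively.
    nfs_cmd = ["ls", "-R", "/mnt/restic"]
    smb_cmd = ["ls", "-R", "/mnt/restic-smb"]  # hypothetical Samba mount point
    nfs = time_command(nfs_cmd)
    smb = time_command(smb_cmd)
    print(f"Samba vs NFS: {percent_increase(nfs, smb):+.1f}%")
```

Using the median rather than the mean keeps one slow outlier (for example a cold cache on the first run) from dominating the result.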

Statistics: Posted by pierric — Fri Feb 07, 2025 8:39 am


