This is weird and I have run out of places to look. Any advice would be most welcome.

There are three hosts involved on the same VLAN as follows:

  • trillian (192.168.1.3/24) - VM running Alma Linux 9.3, installed about a month ago as 9.2 and upgraded. This is an NFS server and DNS server. It cannot access its own NFS shares. This replaced an older trillian (CentOS Stream 8) that didn't have this problem.
  • marvin (192.168.1.2/24) - physical host running CentOS Stream 8 that runs the two VMs under libvirt/QEMU/KVM with a bridged network for the VMs. This can access NFS shares on trillian.
  • agrajag (192.168.1.126/24) - VM running AlmaLinux 9.3 (built today). This cannot access NFS shares on trillian.

There are several other physical and virtual Fedora 39s none of which have an issue accessing the NFS share.

The NFS server config is unchanged from the default, apart from /etc/exports, which has this entry:

/srv/shares/steve localhost(rw) trillian.purplehayes.uk(rw) marvin.purplehayes.uk(rw) jabberwock.purplehayes.uk(rw) jubjub.purplehayes.uk(rw) dormouse.purplehayes.uk(rw) agrajag.purplehayes.uk(rw) arthur.purplehayes.uk(rw) alice.purplehayes.uk(ro) tweedledum.purplehayes.uk(ro) tweedledee.purplehayes.uk(ro)
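
With this many clients on one line, it can help to split the entry into a per-client view when checking who gets rw and who gets ro. The following is only a simplified sketch: the function name parse_exports_line is made up for illustration, and it assumes a single-line entry with no quoted paths, no continuation lines and no default-option (-opts) syntax.

```python
import re


def parse_exports_line(line):
    """Split one simple /etc/exports line into (path, {client: [options]}).

    Assumption: whitespace-separated "client(opt,opt)" entries only.
    """
    path, *entries = line.split()
    clients = {}
    for entry in entries:
        match = re.fullmatch(r"([^()]+)\(([^()]*)\)", entry)
        if match:
            opts = match.group(2)
            clients[match.group(1)] = opts.split(",") if opts else []
        else:
            # Client listed without an explicit option list.
            clients[entry] = []
    return path, clients


# Shortened copy of the entry above, for illustration.
line = ("/srv/shares/steve localhost(rw) trillian.purplehayes.uk(rw) "
        "agrajag.purplehayes.uk(rw) alice.purplehayes.uk(ro)")
path, clients = parse_exports_line(line)
for client, opts in clients.items():
    print(f"{path}  {client}: {','.join(opts)}")
```

This only reads the file as written; it does not show what the server has actually exported.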

fstab entries look like:

trillian:/srv/shares/steve /mnt/nfs/steve nfs defaults 0 0

The problem is that the two Alma Linux 9.3s (one of which is the NFS server) cannot mount the export: it always fails with mount.nfs: access denied by server while mounting trillian:/srv/shares/steve

A Wireshark trace shows the same NFSv4 sequence of LOOKUP/GETATTR/ACCESS for /srv, then /srv/shares, then finally /srv/shares/steve. The failure occurs in the response to the third ACCESS call: it should return 0x1df and instead returns 0x000, i.e., all access denied.
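
For reference, the two values can be decoded by hand. This is a small sketch using the ACCESS4 bit values from the NFSv4 specifications (with the extended-attribute bits from RFC 8276); the helper name decode_access is just for illustration.

```python
# ACCESS4 bit values as defined for NFSv4 (xattr bits from RFC 8276).
ACCESS4_BITS = {
    0x001: "READ",
    0x002: "LOOKUP",
    0x004: "MODIFY",
    0x008: "EXTEND",
    0x010: "DELETE",
    0x020: "EXECUTE",
    0x040: "XATTR READ",
    0x080: "XATTR WRITE",
    0x100: "XATTR LIST",
}


def decode_access(mask):
    """Return the names of the access bits set in an ACCESS result mask."""
    return [name for bit, name in ACCESS4_BITS.items() if mask & bit]


print(hex(0x1DF), decode_access(0x1DF))  # the reply marvin gets
print(hex(0x000), decode_access(0x000))  # the reply agrajag gets
```

0x1df is every bit except EXECUTE (0x020), which matches the list of rights the client asked about in the trace; 0x000 grants none of them.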

A diff of a Wireshark dissection shows the difference that causes the error: this is two packets from each of two captures, good first, bad second.

2c2
<     167 15:49:52.104792256 marvin.purplehayes.uk 731    trillian.purplehayes.uk 2049        NFS      278    V4 Call (Reply In 168) ACCESS FH: 0xfb898914, [Check: RD LU MD XT DL XAR XAW XAL]
---
>     114 12:27:07.730078275 agrajag.purplehayes.uk 805    trillian.purplehayes.uk 2049        NFS      266    V4 Call (Reply In 115) ACCESS FH: 0xfb898914, [Check: RD LU MD XT DL XAR XAW XAL]
4,8c4,8
< Frame 167: 278 bytes on wire (2224 bits), 278 bytes captured (2224 bits) on interface br1, id 0
< Ethernet II, Src: 54:52:00:01:02:01 (54:52:00:01:02:01), Dst: RealtekU_01:03:00 (52:54:00:01:03:00)
< Internet Protocol Version 4, Src: marvin.purplehayes.uk (192.168.1.2), Dst: trillian.purplehayes.uk (192.168.1.3)
< Transmission Control Protocol, Src Port: 731, Dst Port: 2049, Seq: 7137, Ack: 7057, Len: 212
< Remote Procedure Call, Type:Call XID:0x791d8bd7
---
> Frame 114: 266 bytes on wire (2128 bits), 266 bytes captured (2128 bits) on interface br1, id 0
> Ethernet II, Src: RealtekU_01:7e:00 (52:54:00:01:7e:00), Dst: RealtekU_01:03:00 (52:54:00:01:03:00)
> Internet Protocol Version 4, Src: agrajag.purplehayes.uk (192.168.1.126), Dst: trillian.purplehayes.uk (192.168.1.3)
> Transmission Control Protocol, Src Port: 805, Dst Port: 2049, Seq: 6510, Ack: 6937, Len: 200
> Remote Procedure Call, Type:Call XID:0x99b4e2d3
16,17c16,17
<             sessionid: 5d386f652160d94b0700000000000000
<             seqid: 0x00000023
---
>             sessionid: 06ee6e65c8a3011bac00000000000000
>             seqid: 0x00000242
46c46
<     168 15:49:52.104911178 trillian.purplehayes.uk 2049   marvin.purplehayes.uk 731         NFS      238    V4 Reply (Call In 167) ACCESS, [Allowed: RD LU MD XT DL XAR XAW XAL]
---
>     115 12:27:07.730147846 trillian.purplehayes.uk 2049   agrajag.purplehayes.uk 805         NFS      238    V4 Reply (Call In 114) ACCESS, [Access Denied: RD LU MD XT DL XAR XAW XAL]
48,53c48,52
< Packet comments
< Frame 168: 238 bytes on wire (1904 bits), 238 bytes captured (1904 bits) on interface br1, id 0
< Ethernet II, Src: RealtekU_01:03:00 (52:54:00:01:03:00), Dst: 54:52:00:01:02:01 (54:52:00:01:02:01)
< Internet Protocol Version 4, Src: trillian.purplehayes.uk (192.168.1.3), Dst: marvin.purplehayes.uk (192.168.1.2)
< Transmission Control Protocol, Src Port: 2049, Dst Port: 731, Seq: 7057, Ack: 7349, Len: 172
< Remote Procedure Call, Type:Reply XID:0x791d8bd7
---
> Frame 115: 238 bytes on wire (1904 bits), 238 bytes captured (1904 bits) on interface br1, id 0
> Ethernet II, Src: RealtekU_01:03:00 (52:54:00:01:03:00), Dst: RealtekU_01:7e:00 (52:54:00:01:7e:00)
> Internet Protocol Version 4, Src: trillian.purplehayes.uk (192.168.1.3), Dst: agrajag.purplehayes.uk (192.168.1.126)
> Transmission Control Protocol, Src Port: 2049, Dst Port: 805, Seq: 6937, Ack: 6710, Len: 172
> Remote Procedure Call, Type:Reply XID:0x99b4e2d3
62,63c61,62
<             sessionid: 5d386f652160d94b0700000000000000
<             seqid: 0x00000023
---
>             sessionid: 06ee6e65c8a3011bac00000000000000
>             seqid: 0x00000242
70c69
<         Opcode: ACCESS (3), [Allowed: RD LU MD XT DL XAR XAW XAL]
---
>         Opcode: ACCESS (3), [Access Denied: RD LU MD XT DL XAR XAW XAL]
81,89c80,88
<             Access rights (of requested): 0x1df
<                 .... ...1 = 0x001 READ: allowed
<                 .... ..1. = 0x002 LOOKUP: allowed
<                 .... .1.. = 0x004 MODIFY: allowed
<                 .... 1... = 0x008 EXTEND: allowed
<                 ...1 .... = 0x010 DELETE: allowed
<                 .1.. .... = 0x040 XATTR READ: allowed
<                 1... .... = 0x080 XATTR WRITE: allowed
<                 .... .... = 0x100 XATTR LIST: allowed
---
>             Access rights (of requested): 0x00
>                 .... ...0 = 0x001 READ: *Access Denied*
>                 .... ..0. = 0x002 LOOKUP: *Access Denied*
>                 .... .0.. = 0x004 MODIFY: *Access Denied*
>                 .... 0... = 0x008 EXTEND: *Access Denied*
>                 ...0 .... = 0x010 DELETE: *Access Denied*
>                 .0.. .... = 0x040 XATTR READ: *Access Denied*
>                 0... .... = 0x080 XATTR WRITE: *Access Denied*
>                 .... .... = 0x100 XATTR LIST: *Access Denied*
94c93
<                     changeid: 835
---
>                     changeid: 7305793694093118725
99,100c98,99
<                     seconds: 1701785659
<                     nseconds: 338989372
---
>                     seconds: 1701012648
>                     nseconds: 850758917
102,103c101,102
<                     seconds: 1701785659
<                     nseconds: 338989372
---
>                     seconds: 1701012648
>                     nseconds: 850758917

It really looks like a server-side error, even though the issue only happens (as far as I can tell) if the client is Alma Linux 9.

Update: I rebuilt agrajag as Alma Linux 8.9 and it still cannot mount the share. Using, variously, rpcdebug -m rpc -c all and rpcdebug -m nfsd -c all on the server and rpcdebug -m nfs -c all on the client shows nothing that looks like this error, either in the systemd journal or in dmesg. I've tried sysctl -w sunrpc.nfsd_debug=1023, etc., but that doesn't seem to do anything (I presume because this is under systemd).

Things this is not:

  • firewall rules: NFS traffic is flowing
  • network related: it happens mounting from localhost on the server
  • the NFS insecure option: the port used is < 1024 and there is no NAT
  • a problem with /etc/exports: showmount -e gives the expected result
  • a problem with /etc/hosts: all name resolution is via DNS (trillian is the DNS server as well as the NFS server; all four hosts have the correct forward and reverse entries)
  • selinux: disabling it doesn't change anything
  • UID and GID mismatches: mounts are all done under root(0:0); file access by steve(1000:1000) and the export is root squashed only.
  • NFSv4: if I attempt the mount by specifying -o nfsvers=3, the mount works but attempting to ls a file immediately fails. Since NFSv3 and v4 differ a lot, that's not surprising but it fails for both with the same error reported by the client.
  • That specific share: I get the same on another share, too.
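
One way to double-check the UID/GID item on the server is to walk the exported path and see what the classic mode bits give to the identity the server will actually use. This is a rough sketch, not a diagnosis: check_path and class_bits are hypothetical helper names, and the code looks only at owner/group/other mode bits, ignoring supplementary groups, POSIX ACLs and SELinux.

```python
import os
import stat


def class_bits(st, uid, gid):
    """Pick the rwx triplet (owner, group or other) that applies to uid/gid."""
    if uid == st.st_uid:
        return (st.st_mode >> 6) & 7
    if gid == st.st_gid:
        return (st.st_mode >> 3) & 7
    return st.st_mode & 7


def check_path(path, uid, gid):
    """For each directory from / down to path, report whether uid/gid has r-x.

    Returns a list of (directory, mode string, ok) tuples.
    """
    results = []
    current = os.sep
    parts = [p for p in os.path.abspath(path).split(os.sep) if p]
    for part in [""] + parts:
        current = os.path.join(current, part)
        st = os.stat(current)
        bits = class_bits(st, uid, gid)
        # r (4) and x (1) are both needed to list and traverse a directory.
        results.append((current, stat.filemode(st.st_mode), (bits & 5) == 5))
    return results
```

Run on the server as, for example, check_path("/srv/shares/steve", 65534, 65534), where 65534 is assumed to be the anonymous UID/GID that a root-squashed root(0:0) is mapped to (check anonuid/anongid in the export options if they have been changed). Any component reported with ok set to False would be refused to that identity regardless of the export options.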

Update:

  • I built a new server arthur (192.168.1.127) - VM running Alma Linux 8.9, which is a replica of trillian in other respects. It shows the same behaviour (and the same NFSv4 ACCESS reply packet), so this problem is not specific to EL 9.x or to that specific build.

1 Answer

You can try pinning the NFSv4 minor version on the clients to match what the server offers. If your EL 8 server is serving 4.1, add vers=4.1 to the fstab entry on the NFS client:

nfsserver:/path/to/share /mnt/share nfs defaults,vers=4.1 0 0
  • Thanks for the suggestion. Unfortunately, the bit of the wireshark trace that doesn't show up in the diff includes: Network File System, Ops(4): SEQUENCE, PUTFH, ACCESS, GETATTR [Program Version: 4] [V4 Procedure: COMPOUND (1)] Tag: <EMPTY> minorversion: 2 so it's not a difference of minor version numbers either. I should have included that in my list of things I'd checked.
    – tsgsh
    21 hours ago
  • Sorry, failed attempt at code markdown in a comment. More legibly... Thanks for the suggestion. Unfortunately, the bit of the wireshark trace that doesn't show up in the diff includes: minorversion: 2 in both cases, so it's not a difference of minor version numbers either. I should have included that in my list of things I'd checked.
    – tsgsh
    21 hours ago
