The only use for callbacks for NFSv4.0 is delegations
and delegations rarely work well for NFSv4.0 anyhow.
Therefore, this patch disables callbacks for the
NFSv4.0 client. This is the same behavior as
occurred when the nfscbd(8) daemon was not running.
This change allowed a function called nfscl_getmyip()
to be removed from the kernel, which is nice since
maintaining this function was bothersome, due to its
use of routing, etc.
MFC after: 2 weeks
The inotify flags are copied from the lower vnode into the nullfs vnode
so that the INOTIFY() macro will invoke VOP_INOTIFY on the nullfs vnode;
this is then bypassed to the lower vnode. However, when a nullfs vnode
is reclaimed we should clear these flags, as the vnode is now doomed and
no longer forwards VOPs to the lower vnode.
Add regression tests. Remove a test in vn_inotify_revoke() which is no
longer needed after this change.
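
A minimal userland sketch of the flag handling described above, for illustration only. The toy_* names and flag values are hypothetical and do not match the kernel's definitions; the sketch only shows the inotify bits being mirrored from the lower vnode and dropped again at reclaim time.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the real kernel names and values differ. */
#define TOY_INOTIFY		0x1u
#define TOY_INOTIFY_PARENT	0x2u
#define TOY_INOTIFY_MASK	(TOY_INOTIFY | TOY_INOTIFY_PARENT)

struct toy_vnode {
	unsigned int flags;
	bool doomed;
	struct toy_vnode *lower;	/* NULL once reclaimed */
};

/* Mirror the lower vnode's inotify bits into the upper (nullfs-like) vnode. */
static void
toy_null_sync_flags(struct toy_vnode *upper)
{
	if (upper->lower != NULL)
		upper->flags |= upper->lower->flags & TOY_INOTIFY_MASK;
}

/* On reclaim the upper vnode stops forwarding, so drop the mirrored bits. */
static void
toy_null_reclaim(struct toy_vnode *upper)
{
	upper->doomed = true;
	upper->lower = NULL;
	upper->flags &= ~TOY_INOTIFY_MASK;
}
```

Unrelated flag bits on the upper vnode are left alone by both helpers.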
PR: 292495
Reviewed by: kib
Reported by: Jed Laundry <jlaundry@jlaundry.com>
Fixes: f1f230439f ("vfs: Initial revision of inotify")
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D56639
Add a new PHYS_TO_DMAP_ADDR that still returns an address for use in
places that only need an address and not a pointer.
Effort: CHERI upstreaming
Reviewed by: kib
Sponsored by: AFRL, DARPA
Pull Request: https://github.com/freebsd/freebsd-src/pull/2068
Under conditions of low memory, getblk can fail. fusefs was not
handling those failures very systematically. It was always using
PCATCH, which appears to have been originally copy/pasted from the NFS
client code, but isn't always appropriate:
* During fuse_vnode_setsize_immediate, which can be called from many
different VOPs and from the vn_delayed_setsize mechanism, remove
PCATCH. Some of these callers cannot tolerate allocation failure.
* In fuse_inval_buf_range, don't assume that getblk will always succeed.
* When calling fuse_inval_buf_range from VOP_ALLOCATE,
VOP_COPY_FILE_RANGE, or VOP_WRITE (with IO_DIRECT), return EINTR if
the allocation fails.
* When calling fuse_inval_buf_range from VOP_DEALLOCATE, remove PCATCH.
This VOP must not fail with EINTR.
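
The two policies in the list above can be sketched in userland as follows. This is a simplified model, not the fusefs code: toy_getblk() and toy_inval_buf_range() are hypothetical stand-ins, and the sketch assumes the buffer lookup can only fail when the caller asked for an interruptible (PCATCH-style) wait.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_buf {
	int cached;	/* nonzero while the buffer holds stale data */
};

/*
 * Stand-in for getblk(). Assumption: it returns NULL only when the
 * caller passed the PCATCH-like option and the wait was cut short.
 */
static struct toy_buf *
toy_getblk(struct toy_buf *backing, bool pcatch, bool would_fail)
{
	if (pcatch && would_fail)
		return (NULL);
	return (backing);
}

/* Invalidate one buffer, reporting EINTR instead of assuming success. */
static int
toy_inval_buf_range(struct toy_buf *backing, bool pcatch, bool would_fail)
{
	struct toy_buf *bp;

	bp = toy_getblk(backing, pcatch, would_fail);
	if (bp == NULL)
		return (EINTR);	/* the caller decides how to report this */
	bp->cached = 0;
	return (0);
}
```

A caller that may fail passes pcatch = true and propagates EINTR; a caller that must not fail passes pcatch = false.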
No new tests, because I can't force any particular getblk call to fail.
PR: 293957
Sponsored by: ConnectWise
Reported by: zjk7@wp.pl
MFC after: 1 week
When compiling a LINT-NOIP kernel (and presumably also NOINET), port and ip
are set but not used in nfsrv_getclientipaddr().
Hide the variables behind #ifdef checks and do likewise for the parsing
results. Admittedly, the code probably wants to be rewritten one day.
Found with: gcc15 tinderbox build
MFC after: 3 days
Reviewed by: rmacklem
Differential Revision: https://reviews.freebsd.org/D56502
Commit 8b9775912c added support for an NFSv4 mounted
root file system, but only if the NFSv4 configuration
used id numbers in the strings.
This patch adds support for the case where the NFSv4
configuration uses name<-->id mappings via nfsuserd(8)
by priming the mapping cache with just enough entries
so that it works until the nfsuserd(8) is running.
They are listed in nfs_prime_userd[] in
sys/fs/nfs/nfs_commonsubs.c.
The entries in nfs_prime_userd[] are also wired into
the kernel's cache for name<-->id mappings when nfsuserd(8)
starts up. This is necessary, since an upcall to the
nfsuserd(8) daemon for a mapping when looking up the
path to the passwd/group database files (/etc) will
hang the system, due to a vnode lock being held on
the entry in the path which blocks nfsuserd(8) from
accessing files.
To enable this case, the following must be put in the
NFS root file system's /boot/loader.conf:
boot.nfsroot.options="nfsv4"
boot.nfsroot.user_domain="<user.domain>"
where <user.domain> must be the same as nfsuserd
uses (usually set via the -domain flag).
If boot.nfsroot.user_domain does not exist or is
the empty string, ids in strings is configured.
MFC after: 1 week
Requested by: Dan Shelton <dan.f.shelton@gmail.com>
Fixes: 8b9775912c ("nfs_diskless: Add support for an NFSv4 root fs")
This patch moves the definition of the nfsd_idargs
structure out of nfs.h and into a new file called
nfsid.h.
This is being done so that it can be included in
nfs_diskless.c in a future commit.
There should be no semantics change from this
commit.
MFC after: 1 week
Fixes: 8b9775912c ("nfs_diskless: Add support for an NFSv4 root fs")
Without this patch, diskless root NFS file systems
could only be mounted via NFSv3 (or NFSv2).
This patch adds the basic support needed to mount
a root fs via NFSv4.
At this time, the NFSv4 mount will only work if
the following is done on the NFS server configuration:
- The root directory specified in the "V4:" line in
/etc/exports must be "/". This is needed since the
path to mount must be the same for NFSv3 and NFSv4.
- The NFS server must be configured to do both NFSv3
and NFSv4, since the bootstrap code still uses NFSv3.
- The NFSv4 server must be configured with:
vfs.nfs.enable_uidtostring=1
vfs.nfsd.enable_stringtouid=1
since the NFSv4 root fs cannot be running nfsuserd(8)
when it is booting. (This limitation may be removed
in a future commit by hard-wiring enough id<-->name
mapping entries to handle things until the nfsuserd(8)
is running.)
To enable the root fs to be mounted via NFSv4, it needs:
- in the root file system's /boot/loader.conf
boot.nfsroot.options="nfsv4"
(Additional options like rsize=65536,wsize=65536 can
also be specified.)
- in the root file system's /etc/sysctl.conf
vfs.nfs.enable_uidtostring=1
Requested by: Dan Shelton <dan.f.shelton@gmail.com>
MFC after: 1 week
* cd9660_rrip_slink() did not check that the lengths of individual
entries do not exceed the length of the overall record.
* cd9660_rrip_altname() did not check that the length of the record
was at least 5 before subtracting 5 from it.
Note that in both cases, a better solution would be to check the length
of the data before calling the handler, or immediately upon entry of
the handler, but this would require significant refactoring.
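
For illustration, the two kinds of check can be sketched like this. It is a simplified model under stated assumptions, not the cd9660 code: the toy_* names are hypothetical, the record header is assumed to be 5 bytes, and each component is assumed to be laid out as one flags byte, one length byte, then that many data bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HDR_LEN	5	/* assumed fixed header size of the record */

/* Return the payload length, or -1 if the record is shorter than its header. */
static int
toy_altname_payload_len(uint8_t record_len)
{
	if (record_len < TOY_HDR_LEN)
		return (-1);
	return (record_len - TOY_HDR_LEN);
}

/*
 * Walk components laid out as [flags][len][len bytes of data]... and
 * reject any component that would run past the end of the record data.
 */
static bool
toy_slink_components_ok(const uint8_t *data, size_t data_len)
{
	size_t clen, off;

	off = 0;
	while (off < data_len) {
		if (data_len - off < 2)
			return (false);	/* no room for flags + length */
		clen = data[off + 1];
		if (clen > data_len - off - 2)
			return (false);	/* component overruns the record */
		off += 2 + clen;
	}
	return (true);
}
```

Both checks compare against the remaining length before any subtraction or advance, so neither can underflow.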
MFC after: 1 week
Reported by: Calif.io in collaboration with Claude and Anthropic Research
Reported by: Adam Crosser, Praetorian
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D56215
For some server file system types, such as ZFS, a Copy/Clone
operation can be done across file systems of the same file
system type.
However, without this patch, the Copy/Clone will fail with
EROFS if the input file is on a read-only mounted file system.
This happens because Copy/Clone will try to update the
input file's atime via VOP_SETATTR().
This patch pretends the VOP_SETATTR() of atime worked for
read-only file systems. It fixes a problem when copying
files from a ZFS snapshot.
PR: 294010
MFC after: 2 weeks
For some server file system types, such as ZFS, a Copy/Clone
operation can be done across file systems of the same file
system type.
As such, this patch allows the Copy/Clone to be attempted
when the file handles are for files on different file systems.
This fixes a problem for exported ZFS file systems when a
copy_file_range(2) between file systems in the same
NFSv4 mount is attempted.
PR: 294010
MFC after: 2 weeks
This mostly just fixes indentation and continuations and adds spaces
after commas and around binary operators and parentheses around return
values, but cd9660_rrip_extref() was so egregiously unreadable I
rewrote it. Note that this was done manually, so I may have missed a
few spots, and I made no attempt to fix over-long lines.
MFC after: 1 week
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D55865
An NFSv4.1/4.2 client can set/clear the archive, hidden
and system flags when creating non-regular files, such
as directories.
Without this patch, the setting of va_flags causes an
EPERM failure, since they are specified for VOP_MKDIR(),
VOP_MKNOD() and VOP_SYMLINK().
This patch sets va_flags == VNOVAL for the above VOP_xxx()
calls and then sets/clears the flags after creation,
which fixes the problem.
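
The create-then-set ordering can be sketched in userland as below. The toy_* functions are hypothetical stand-ins for the VOP calls, and the assumption that the create step rejects any request carrying flags is taken from the description above rather than from the real file system code.

```c
#include <errno.h>

#define TOY_VNOVAL	(-1L)	/* "attribute not specified" marker */

struct toy_vattr {
	long va_flags;
};

struct toy_node {
	long flags;
};

/* Stand-in for VOP_MKDIR(): refuses a request that carries flags. */
static int
toy_mkdir(struct toy_node *np, const struct toy_vattr *vap)
{
	if (vap->va_flags != TOY_VNOVAL)
		return (EPERM);
	np->flags = 0;
	return (0);
}

/* Stand-in for a VOP_SETATTR() that only sets the flags. */
static int
toy_setflags(struct toy_node *np, long flags)
{
	np->flags = flags;
	return (0);
}

/* Create without flags, then apply the requested flags afterwards. */
static int
toy_create_dir(struct toy_node *np, struct toy_vattr *vap)
{
	long wanted;
	int error;

	wanted = vap->va_flags;
	vap->va_flags = TOY_VNOVAL;
	error = toy_mkdir(np, vap);
	if (error == 0 && wanted != TOY_VNOVAL)
		error = toy_setflags(np, wanted);
	return (error);
}
```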
This bug only affects the Windows NFSv4.1/4.2 client.
PR: 293691
Tested by: Dan Shelton <dan.f.shelton@gmail.com>
MFC after: 2 weeks
Previously most fields in fuse_vnode_data were protected by the vnode
lock. But because DEBUG_VFS_LOCKS was never enabled by default until
stable/15 the assertions were never checked, and many were wrong.
Others were missing. This led to panics in stable/15 and 16.0-CURRENT,
when a vnode was expected to be exclusively locked but wasn't, for fuse
file systems that mount with "-o async".
In some places it isn't possible to exclusively lock the vnode when
accessing these fields. So protect them with a new mutex instead. This
fixes panics and unprotected field accesses in VOP_READ,
VOP_COPY_FILE_RANGE, VOP_GETATTR, VOP_BMAP, and FUSE_NOTIFY_INVAL_ENTRY.
Add assertions everywhere the protected fields are accessed.
Lock the vnode exclusively when handling FUSE_NOTIFY_INVAL_INODE.
During fuse_vnode_setsize, if the vnode isn't already exclusively
locked, use the vn_delayed_setsize mechanism. This fixes panics during
VOP_READ or VOP_GETATTR.
Also, ensure that fuse_vnop_rename locks the "from" vnode.
Finally, reorder elements in struct fuse_vnode_data to reduce the
structure size.
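
As a generic illustration of why reordering can shrink a structure (the field names here are hypothetical and are not those of struct fuse_vnode_data), grouping members by decreasing alignment removes the padding a compiler inserts between small and large members:

```c
#include <stdbool.h>
#include <stdint.h>

/* Small and large members interleaved: padding after each bool. */
struct toy_unordered {
	bool a;
	uint64_t b;
	bool c;
	uint64_t d;
	bool e;
};

/* Same members, largest alignment first: padding only at the tail. */
struct toy_ordered {
	uint64_t b;
	uint64_t d;
	bool a;
	bool c;
	bool e;
};
```

The exact sizes depend on the ABI, so comparing sizeof() of the two layouts on the target platform is the reliable way to confirm the saving.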
Fixes: 283391
Reported by: kargl, markj, vishwin, Abdelkader Boudih, groenveld@acm.org
MFC after: 2 weeks
Sponsored by: ConnectWise
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D55230
This should prevent seeing inconsistent flags values when updating them
under the shared vnode lock.
Noted and reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D55665
If the vnode is share-locked:
- Use vn_delayed_setsize() to avoid calling vnode_pager_setsize() with
the vnode only shared locked.
- Interlock the vnode to get exclusive mode for updating the node
fields.
Reciprocally, interlock the vnode in p9fs_getattr_dotl() to observe
consistent values on read.
PR: 293492
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D55665
Handle doomed vnodes after LK_RETRY.
Rename the flag from VI_DELAYEDSSZ to VI_DELAYED_SETSIZE.
Change signature of vn_lock_delayed_setsize() to take a flattened list of
values instead of the vop args structure.
__predict_true() for VI_DELAYED_SETSIZE not set.
Minor edits such as removing a tautological assert and sorting items.
Noted by: markj
Fixes: 45117ffcd5
Reviewed by: markj, rmacklem
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D55681
When an NFSv4.1/4.2 server upgrades a read delegation to
a write delegation, it does not need to change the
delegation's stateid.
Without this patch, a DELEGRETURN of the stateid was done
for the case where the delegation stateid had not changed.
This return was bogus, since the delegation stateid now
represents the new write delegation.
This patch fixes the problem by checking for "same stateid"
and only doing the DELEGRETURN when it is not the same.
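
A rough sketch of the "same stateid" test, for illustration. The structure and function names are hypothetical, and the sketch assumes that only the 12-byte "other" field identifies the delegation while the seqid may advance on an upgrade; the actual client code may compare differently.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct toy_stateid {
	uint32_t seqid;
	uint8_t other[12];
};

/* Assumption: two stateids name the same delegation if "other" matches. */
static bool
toy_same_stateid(const struct toy_stateid *a, const struct toy_stateid *b)
{
	return (memcmp(a->other, b->other, sizeof(a->other)) == 0);
}

/* Only return the old delegation when the upgrade issued a new stateid. */
static bool
toy_need_delegreturn(const struct toy_stateid *old_id,
    const struct toy_stateid *new_id)
{
	return (!toy_same_stateid(old_id, new_id));
}
```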
PR: 289711
Tested by: Peter Much <pmc@citylink.dinoex_sub.org>
MFC after: 2 weeks
The change generalizes code that was initially developed for the nfs client
to handle filesystems that need to call vnode_pager_setsize() while
only owning the vnode lock shared. Since the vnode pager might need to trim
or extend the vnode vm_object's page queue, the vnode lock for the call
must be owned exclusive. This is typical for filesystems with a remote
authoritative source of file attributes, like nfs/p9/fuse.
Handle the conflict by delaying the vnode_pager_setsize() call until the next
vnode locking, to avoid a relock. If the next locking request is in
shared mode, lock the vnode exclusively instead, perform the delayed
vnode_pager_setsize() call by doing VOP_DELAYED_SETSIZE(), and then
downgrade to shared.
Filesystems that opt into the feature must provide an implementation of
VOP_DELAYED_SETSIZE() that actually calls vnode_pager_setsize(), and use
the vn_delay_setsize() helper to mark the vnode as requiring the delayed call.
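
The mechanism can be modelled in a few lines of single-threaded userland C. This is only a sketch of the control flow described above; the toy_* names and fields are hypothetical and the real locking primitives are far more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum toy_lock { TOY_UNLOCKED, TOY_SHARED, TOY_EXCLUSIVE };

struct toy_vnode {
	enum toy_lock lock;
	bool delayed_setsize;	/* a resize is pending */
	uint64_t new_size;	/* size learned from the remote server */
	uint64_t pager_size;	/* size the VM object currently has */
};

/* Called with only a shared lock: record the size, leave the pager alone. */
static void
toy_delay_setsize(struct toy_vnode *vp, uint64_t size)
{
	assert(vp->lock == TOY_SHARED);
	vp->new_size = size;
	vp->delayed_setsize = true;
}

static void
toy_unlock(struct toy_vnode *vp)
{
	vp->lock = TOY_UNLOCKED;
}

/* Next lock request: go exclusive if a resize is pending, then downgrade. */
static void
toy_lock(struct toy_vnode *vp, enum toy_lock want)
{
	assert(vp->lock == TOY_UNLOCKED);
	if (vp->delayed_setsize) {
		vp->lock = TOY_EXCLUSIVE;	/* upgrade the request */
		vp->pager_size = vp->new_size;	/* the deferred resize */
		vp->delayed_setsize = false;
	}
	vp->lock = want;	/* grant, or downgrade to, the requested mode */
}
```

The pager size only changes inside toy_lock(), at a point where the model holds the lock exclusively.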
Reviewed by: rmacklem
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D55595
nullfs_unlink_lowervp() is called with the lower vnode locked, so the
nullfs vnode is locked too. The following can occur:
1. the vunref() call decrements the usecount 2->1,
2. a different thread calls vrele() on the vnode, decrements the
usecount 1->0, then blocks on the vnode lock,
3. the first thread tests vp->v_usecount == 0 and observes that it is
true,
4. the first thread incorrectly unlocks the lower vnode.
Fix this by testing VN_IS_DOOMED directly. Since
nullfs_unlink_lowervp() holds the vnode lock, the value of the
VIRF_DOOMED flag is stable.
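
The interleaving in steps 1-4 can be replayed deterministically in a small userland model. The toy_* names are hypothetical; the sketch only contrasts a test on the use count with a test on the doomed state.

```c
#include <stdbool.h>

struct toy_vnode {
	int usecount;
	bool doomed;	/* only changes while the vnode lock is held */
};

/* Either thread dropping its reference. */
static void
toy_drop_ref(struct toy_vnode *vp)
{
	vp->usecount--;
}

/* Old test: infers "the vnode was reclaimed" from the use count. */
static bool
toy_old_test(const struct toy_vnode *vp)
{
	return (vp->usecount == 0);
}

/* New test: looks at the doomed state, which the lock holder controls. */
static bool
toy_new_test(const struct toy_vnode *vp)
{
	return (vp->doomed);
}
```

After both references are dropped but before the second thread acquires the lock, the old test reports true even though the vnode has not been reclaimed, while the new test still reports false.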
Thanks to leres@ for patiently helping to track this down.
PR: 288345
MFC after: 1 week
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D55446
This lock was included in the original GSoC submission. Its purpose
seems to have been to prevent concurrent FUSE_RENAME operations for the
current mountpoint, as well as to synchronize FUSE_RENAME with
fuse_vnode_setparent. But it's obsolete now that ef6ea91593 added
mnt_renamelock.
MFC after: 2 weeks
Sponsored by: ConnectWise
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D55231
Holding the dvp lock around dev_clone is unfortunate because a cloner might
need to take its own locks, which establishes an order between them and devfs
vnodes and transitively involves them in the wider VFS lock order. For
instance, this way the proctree_lock or allproc_lock become involved.
Unlock dvp around the call; we can unwind if the vnode becomes doomed while
the cloner was called.
Reported and tested by: pho
Reviewed by: kevans, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D55028
libfuse clients may pass the "-o auto_unmount" flag to ensure that the mountpoint
will get unmounted even if the server terminates abnormally. Without this flag,
sending KILL to a FUSE daemon leaves its mountpoint mounted.
Approved by: asomers
Differential Revision: https://reviews.freebsd.org/D53086
Name caching must be handled somewhat differently
for case insensitive file systems. Negative name
caching does not work and, for rename, all names
associated with the rename'd vnode must be disabled.
For a case insensitive ZFS file system that is exported,
the unpatched code did work, since the change in mtime
or ctime of the directory when other case names were
created or rename'd would disable the false name cache
hit. However, an export of an msdosfs file system
breaks the NFS client, because it only works if ctime/mtime
is changed whenever a name is added/removed. Depending
on what the server file system is, this may not happen,
due to clock resolution or lack of support for these
attributes.
This patch checks to see if the server file system is
case insensitive and modifies the name caching to handle
this.
There is still a problem if a case insensitive file system
that is a subtree of a case sensitive one is exported by the
NFSv4 server. This can be fixed someday, when the NFSv4
client gets support for submounts within the mount.
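
A small sketch of why negative name caching breaks on a case insensitive server. The function name is hypothetical; it asks whether a cached "no such name" entry is still correct after a name has been created in the same directory.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

/*
 * Is a negative cache entry for "cached_missing" still valid after
 * "created" has been added to the directory?
 */
static bool
toy_negative_entry_valid(const char *cached_missing, const char *created,
    bool case_insensitive)
{
	if (case_insensitive)
		return (strcasecmp(cached_missing, created) != 0);
	return (strcmp(cached_missing, created) != 0);
}
```

On a case sensitive file system, creating "README" leaves a negative entry for "readme" correct; on a case insensitive one the same entry is now a false miss, and nothing tied to the exact name "README" would have purged it.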
Suggested by: kib
MFC after: 2 weeks
When an NFSv4.n client specifies settings for attributes other
than mode during an Open/Create/Exclusive_41, these other attributes
were not being set.
This patch resolves the problem by calling nfsrv_fixsattr()
after the VOP_CREATE() call in nfsvno_open() for this case.
There is no extant NFSv4.n client that currently does this,
as far as I know.
MFC after: 2 weeks
After commit 3bd8fab241 ("vfs: Move DEBUG_VFS_LOCKS checks to
INVARIANTS"), this option has no effect. Let's finish the removal.
There are a couple of additional uses in zfs; I will submit a separate
patch upstream for them.
Reviewed by: mckusick, kib
Differential Revision: https://reviews.freebsd.org/D54662
When an NFSv4.n client specifies settings for the archive,
hidden and/or system attributes during an Open/Create, the
Open/Create fails for ZFS. This is caused by ZFS doing
a secpolicy_xvattr() call, which fails for non-root.
If this check is bypassed, ZFS panics.
This patch resolves the problem by disabling va_flags
for the VOP_CREATE() call in the NFSv4.n server and
then setting the flags with a subsequent VOP_SETATTR().
This problem only affects FreeBSD-15 and main, since the
archive, system and hidden attributes are not enabled
for FreeBSD-14.
I think a similar problem exists for the NFSv4.n
Open/Create/Exclusive_41, but that will be resolved
in a future commit.
Note that the Linux, Solaris and FreeBSD clients
do not set archive, hidden or system for Open/Create,
so the bug does not affect mounts from those clients.
PR: 292283
Reported by: Aurelien Couderc <aurelien.couderc2002@gmail.com>
Tested by: Aurelien Couderc <aurelien.couderc2002@gmail.com>
MFC after: 2 weeks
A POSIX draft default ACL may not exist. As such,
an ACL with zero ACEs needs to be allowed.
This patch fixes acquisition of POSIX draft default
ACLs when they do not exist on the directory.
Fixes: a35bbd5d9f ("nfscommon: Add some support for POSIX draft ACLs")
An internet draft (expected to become an RFC someday)
https://datatracker.ietf.org/doc/draft-ietf-nfsv4-posix-acls
describes an extension to NFSv4.2 to handle POSIX draft ACLs.
This is the final patch in the series that enables
the extension of NFSv4.2 to support POSIX draft ACLs.
At this time, only UFS mounted with the "acls" option
will work, and only for FreeBSD built with these patches.
Patches for client and server for the Linux kernel are
in the works. (I'll admit my next little project is
cleaning the Linux patches up for submission for upstream.)
To make these changes really useful, the FreeBSD port
of OpenZFS needs to be patched to add POSIX draft ACL
support. (Support for POSIX draft ACLs is already in
the Linux port of OpenZFS.)
Interoperability with NFSv4.2 clients and servers that
do not support this extension should not be a problem.
Fixes: a35bbd5d9f ("nfscommon: Add some support for POSIX draft ACLs")
An internet draft (expected to become an RFC someday)
https://datatracker.ietf.org/doc/draft-ietf-nfsv4-posix-acls
describes an extension to NFSv4.2 to handle POSIX draft ACLs.
This is the fifth of several patches that implement the
above draft.
This one mostly adds an extra argument to two functions
in nfscommon.ko. Unfortunately, these functions are
called in many places, so the changes are numerous, but
straightforward.
Since the internal KAPI between the NFS modules is changed
by this commit, all of nfscommon.ko, nfscl.ko and nfsd.ko
must be rebuilt from sources.
There should be no semantics change for the series at
this point.
Fixes: a35bbd5d9f ("nfscommon: Add some support for POSIX draft ACLs")
An internet draft (expected to become an RFC someday)
https://datatracker.ietf.org/doc/draft-ietf-nfsv4-posix-acls
describes an extension to NFSv4.2 to handle POSIX draft ACLs.
This is the fourth of several patches that implement the
above draft.
There should be no semantics change for the series at
this point.
Fixes: a35bbd5d9f ("nfscommon: Add some support for POSIX draft ACLs")
An internet draft (expected to become an RFC someday)
https://datatracker.ietf.org/doc/draft-ietf-nfsv4-posix-acls
describes an extension to NFSv4.2 to handle POSIX draft ACLs.
This is the third of several patches that implement the
above draft.
There should be no semantics change for the series at
this point.
Fixes: a35bbd5d9f ("nfscommon: Add some support for POSIX draft ACLs")
An internet draft (expected to become an RFC someday)
https://datatracker.ietf.org/doc/draft-ietf-nfsv4-posix-acls
describes an extension to NFSv4.2 to handle POSIX draft ACLs.
This is the second of several patches that implement the
above draft.
The only semantics change would be if you have exported
a UFS file system mounted with the "acls" option.
In that case, you would see the acl attribute supported.
This is bogus, but will be handled in the next commit.
Fixes: a35bbd5d9f ("nfscommon: Add some support for POSIX draft ACLs")
An internet draft (expected to become an RFC someday)
https://datatracker.ietf.org/doc/draft-ietf-nfsv4-posix-acls
describes an extension to NFSv4.2 to handle POSIX draft ACLs.
This is the first of several patches that implement the
above draft.
This patch should not result in a semantics change.
This adds support for renaming a symbolic link found on the lower fs,
which necessitates copying it to the upper fs, as well as basic tests.
MFC after: 1 week
Sponsored by: Klara, Inc.
Sponsored by: NetApp, Inc.
Reviewed by: olce, siderop1_netapp.com, jah
Differential Revision: https://reviews.freebsd.org/D54229
When creating a unionfs mount, it's fairly easy to shoot oneself
in the foot by specifying upper and lower file hierarchies that
resolve back to the same vnodes. This is especially easy to do if
the sameness is not obvious due to aliasing through nullfs or other
unionfs mounts (as in the associated PR), and will produce either
deadlock or failed locking assertions on any attempt to use the
resulting unionfs mount.
Leverage VOP_GETLOWVNODE() to detect the most common cases of
foot-shooting at mount time and fail the mount with EDEADLK.
This is not meant to be an exhaustive check for all possible
deadlock-producing scenarios, but it is an extremely cheap and
simple approach that, unlike previous proposed fixes, also works
in the presence of nullfs aliases.
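
The idea behind the check can be sketched as follows. The toy_* names are hypothetical, and the sketch assumes that resolving each side down to its bottom-most vnode is a reasonable stand-in for what VOP_GETLOWVNODE() provides; it is not the unionfs mount code.

```c
#include <errno.h>
#include <stddef.h>

struct toy_vnode {
	struct toy_vnode *lower;	/* NULL for a non-stacked vnode */
};

/* Follow stacking layers (nullfs-like aliases) down to the bottom vnode. */
static const struct toy_vnode *
toy_bottom(const struct toy_vnode *vp)
{
	while (vp->lower != NULL)
		vp = vp->lower;
	return (vp);
}

/* Refuse the mount when both hierarchies resolve to the same vnode. */
static int
toy_unionfs_check(const struct toy_vnode *upper, const struct toy_vnode *lower)
{
	if (toy_bottom(upper) == toy_bottom(lower))
		return (EDEADLK);
	return (0);
}
```

Comparing the resolved vnodes rather than the mount paths is what lets the check see through aliases.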
PR: 172334
Reported by: ngie, Karlo Miličević <karlo98.m@gmail.com>
Reviewed by: kib, olce
Tested by: pho
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D53988