Commit Graph

308254 Commits

Author SHA1 Message Date
Alexander Ziaee dc140a9fc1 Bourne shell -> POSIX shell
The FreeBSD shell is a POSIX compatible shell. It evolved over several
decades from the Almquist shell, which was preceded a decade before
that by the Bourne shell. Most readers today have never seen a Bourne
shell. If someone wants to learn to use our shell, they need to look for
tutorials on the POSIX shell. Align descriptions throughout the tree
with this reality, consistent with its manual and common parlance.

We made a similar change to the doc tree in b4d6eb01540fe.

MFC after:		3 days
Reviewed by:		carlavilla
Differential Revision:	https://reviews.freebsd.org/D56382
2026-04-14 09:02:58 -04:00
Konstantin Belousov 934a35ac2b libthr.3: describe SIGTHR
Explain how SIGTHR is used and that it should not be touched by user
code.  Add a note about SIGLIBRT.

Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D56384
2026-04-14 15:51:38 +03:00
Konstantin Belousov fa912e3b9b libthr.3: describe what we mean by C runtime environment.
Reviewed by:	emaste
Sponsored by:   The FreeBSD Foundation
MFC after:      3 days
Differential revision:  https://reviews.freebsd.org/D56384
2026-04-14 15:51:32 +03:00
Goran Mekić 3524d4ebbe sound examples: Add mmap example
This example opens separate OSS capture and playback channels in mmap
mode, places them into a sync group, and starts them together so both
ring buffers advance on the same device timeline. It then monitors the
capture mmap pointer with SNDCTL_DSP_GETIPTR, converts that pointer into
monotonic absolute progress using the reported block count, and copies
newly recorded audio from the input ring to the matching region of the
output ring.

The main loop is driven by an absolute monotonic frame clock rather than
a fixed relative usleep delay. Wakeups are scheduled from the sample
rate using a small frame step similar to the SOSSO timing model, while
the audio path itself stays intentionally simple: just copy input to
output, with no explicit xrun recovery or processing beyond ring
wraparound handling.

MFC after:	1 week
Reviewed by:	christos
Differential Revision:	https://reviews.freebsd.org/D53749
2026-04-14 12:59:14 +02:00
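
The absolute frame clock described in 3524d4ebbe can be illustrated with a small sketch. This is not the code from the example itself: the helper name `frame_deadline` and its parameters are assumptions made for illustration. It converts an absolute frame position into an absolute deadline relative to a fixed start time, so rounding error does not accumulate the way it does with repeated relative sleeps.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: return the absolute time at which `frames`
 * frames have elapsed since `start`, at `rate` frames per second.
 * Whole seconds and the sub-second remainder are handled separately
 * so the multiplication cannot overflow for realistic sample rates.
 */
struct timespec
frame_deadline(struct timespec start, uint64_t frames, uint32_t rate)
{
	struct timespec ts = start;
	uint64_t sec = frames / rate;
	uint64_t rem = frames % rate;
	uint64_t nsec = (rem * 1000000000ULL) / rate;

	ts.tv_sec += (time_t)sec;
	ts.tv_nsec += (long)nsec;
	if (ts.tv_nsec >= 1000000000L) {
		/* Carry into seconds; at most one carry is possible. */
		ts.tv_sec += 1;
		ts.tv_nsec -= 1000000000L;
	}
	return (ts);
}
```

A loop built on this would advance the frame position by a fixed step each iteration and sleep until the returned deadline, for example with clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, ...).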
Pouria Mousavizadeh Tehrani 7d38eb720a routing: Fix use-after-free in finalize_nhop
FIB_NH_LOG calls the `nhop_get_upper_family(nh)` to read
`nh->nh_priv->nh_upper_family` for failure logging.
Call FIB_NH_LOG before freeing nh so failures are logged
without causing a panic.

MFC after: 3 days
2026-04-14 14:02:56 +03:30
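
The ordering fix in 7d38eb720a can be sketched in user space. The structures and functions below are hypothetical stand-ins, not the kernel's nexthop code; only the field name `nh_upper_family` is taken from the commit message. The point is that the failure log dereferences the object, so it has to run before the object is freed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's nexthop structures. */
struct nhop_priv {
	int nh_upper_family;
};

struct nhop {
	struct nhop_priv *nh_priv;
};

/* Records what the last failure log saw, so the ordering can be checked. */
int last_logged_family = -1;

struct nhop *
nhop_alloc(int family)
{
	struct nhop *nh = malloc(sizeof(*nh));

	if (nh == NULL)
		return (NULL);
	nh->nh_priv = malloc(sizeof(*nh->nh_priv));
	if (nh->nh_priv == NULL) {
		free(nh);
		return (NULL);
	}
	nh->nh_priv->nh_upper_family = family;
	return (nh);
}

void
nhop_release(struct nhop *nh)
{
	free(nh->nh_priv);
	free(nh);
}

/* Stand-in for the failure log: it dereferences nh->nh_priv. */
void
log_failure(const struct nhop *nh)
{
	last_logged_family = nh->nh_priv->nh_upper_family;
	fprintf(stderr, "finalize failed, upper family %d\n",
	    last_logged_family);
}

/* On failure, log first and free second; the reverse reads freed memory. */
int
finalize_sketch(struct nhop *nh, int failed)
{
	if (failed) {
		log_failure(nh);
		nhop_release(nh);
		return (-1);
	}
	return (0);
}
```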
Sumit Saxena 439132310a iflib: drain admin task and fix teardown order on register failure
When IFDI_ATTACH_POST() fails (or netmap attach fails), iflib tears down with
ether_ifdetach(), taskqueue_free(ifc_tq), and IFDI_DETACH(). CTX_LOCK is still
held after ether_ifattach. ether_ifdetach() and taskqueue_drain(admin) must not
run under CTX_LOCK.

Teardown ordering (match iflib_device_deregister):

- Free the per-interface admin taskqueue after IFDI_DETACH / IFDI_QUEUES_FREE, not before.
- Drop IFNET_WLOCK() across IFDI_DETACH / IFDI_QUEUES_FREE so driver detach can sleep in
LinuxKPI workqueue drain, then retake IFNET_WLOCK() before iflib_free_intr_mem and fail_unlock.

MFC after:      2 weeks
Reviewed by:    gallatin, kgalazka, #iflib
Differential Revision: https://reviews.freebsd.org/D56316
2026-04-14 09:13:53 +00:00
Sreekanth Reddy d2b96f654a iflib: Fix panic observed while doing sysctl -a with if_bnxt unload
Observed the kernel panic call trace below while running sysctl -a
during unload of the if_bnxt driver:

Fatal trap 9: general protection fault while in kernel mode

KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe02a7569940
vpanic() at vpanic+0x136/frame 0xfffffe02a7569a70
panic() at panic+0x43/frame 0xfffffe02a7569ad0
trap_fatal() at trap_fatal+0x68/frame 0xfffffe02a7569af0
calltrap() at calltrap+0x8/frame 0xfffffe02a7569af0

trap 0x9, rip = 0xffffffff80c0b411, rsp = 0xfffffe02a7569bc0, rbp = 0xfffffe02a7569be0 ---
sysctl_handle_counter_u64() at sysctl_handle_counter_u64+0x61/frame 0xfffffe02a7569be0
sysctl_root_handler_locked() at sysctl_root_handler_locked+0x9c/frame 0xfffffe02a7569c30
sysctl_root() at sysctl_root+0x22f/frame 0xfffffe02a7569cb0
userland_sysctl() at userland_sysctl+0x196/frame 0xfffffe02a7569d50
sys___sysctl() at sys___sysctl+0x65/frame 0xfffffe02a7569e00
amd64_syscall() at amd64_syscall+0x169/frame 0xfffffe02a7569f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe02a7569f30

Root Cause:
iflib adds per-device sysctl nodes under the device tree using the device
sysctl context. Some of those nodes are counter sysctls that point at fields
inside txq->ift_br. When the if_bnxt driver is unloaded, iflib_device_deregister
runs and calls iflib_tx_structures_free, which frees each txq's ift_br. The device
sysctl tree is only freed when the device is destroyed. If sysctl -a runs during
unload, it can still traverse the device tree and call sysctl_handle_counter_u64
for those nodes. The handler does counter_u64_fetch(*(counter_u64_t *)arg1).
By then arg1 can point into freed memory, which leads to a use-after-free kernel panic.

Fix:
iflib now uses its own sysctl context for all iflib-related nodes
instead of the device’s context, and the iflib sysctl context is now
removed before any queue/ring memory is freed.

MFC after:      2 weeks
Reviewed by:    gallatin, ssaxena, #iflib
Differential Revision: https://reviews.freebsd.org/D55981
2026-04-14 09:13:34 +00:00
Michael Osipov 54f5d20492 ciss.4: List all devices supported by ciss(4)
PR:		285744
Reviewed by:	ziaee
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D56285
2026-04-14 09:39:02 +02:00
ShengYi Hung 43d632779b x86: Mark LOCORE to prevent build failure on i386 platform
PR:     294468
Reported by:    dan.kotowski@a9development.com
Tested by:      dan.kotowski@a9development.com
Discussed with: kib
Fixes:  9289df1949cd ("x86: Add zen identifier helper function")
MFC after:      2 weeks
Sponsored by:   The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D56377
2026-04-14 12:58:39 +08:00
Warner Losh 2b954770dd nvme: Use passed in max_pages.
Noticed by: jhb
Sponsored by: Netflix
2026-04-13 17:33:13 -06:00
Kristofer Peterson 81b2055c49 sh: Increase default history size to POSIX mandated minimum of 128
The default history size in bin/sh is currently 100; however, POSIX.1-2024
mandates that a default greater than or equal to 128 shall be used.
Therefore, increase the default history size in /bin/sh to 128.

POSIX standards reference:
https://pubs.opengroup.org/onlinepubs/9799919799/utilities/sh.html#tag_20_110_08

MFC after:	3 days
Reviewed by:	emaste, jilles, jlduran, ziaee
Signed-off-by:	Kristofer Peterson <kris@tranception.com>
Closes:		https://github.com/freebsd/freebsd-src/pull/2093
2026-04-13 19:06:41 -04:00
YAO, Xin a3c457398f linux: add sysfs filetype support for Linux statfs()
Add the magic number below and map linsysfs to it in bsd_to_linux_ftype().

This maps:
  - `linsysfs` -> `LINUX_SYSFS_MAGIC` (`0x62656572`)

Signed-off-by: YAO, Xin <mr.yaoxin@outlook.com>

Reviewed by:	emaste
Pull request:	https://github.com/freebsd/freebsd-src/pull/2119
2026-04-13 18:32:12 -04:00
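
The mapping added in a3c457398f can be pictured as a name-to-magic table lookup. The sketch below is an assumption about the general shape, not the contents of bsd_to_linux_ftype(): the table layout, the function name `sketch_ftype_magic`, and the return value of 0 for unknown names are all hypothetical. Only the `linsysfs` name and the `0x62656572` value come from the commit message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	LINUX_SYSFS_MAGIC	0x62656572	/* value from the commit message */

/* Hypothetical table entry: FreeBSD filesystem name to Linux f_type. */
struct ftype_map {
	const char	*bsd_name;
	uint32_t	linux_magic;
};

static const struct ftype_map ftype_table[] = {
	{ "linsysfs", LINUX_SYSFS_MAGIC },
};

/* Return the Linux magic for a FreeBSD fstypename, or 0 if unmapped. */
uint32_t
sketch_ftype_magic(const char *bsd_name)
{
	size_t i;

	for (i = 0; i < sizeof(ftype_table) / sizeof(ftype_table[0]); i++) {
		if (strcmp(ftype_table[i].bsd_name, bsd_name) == 0)
			return (ftype_table[i].linux_magic);
	}
	return (0);
}
```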
Isaac Freund e11eba76cf pkgbase: only provide shlibs from /lib,/usr/lib,/usr/lib32
Reviewed by:	bapt
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54793
2026-04-13 16:03:22 -04:00
Alexander Ziaee afe57c12e9 diskinfo: Align and alphabetize options
MFC after:	3 days
2026-04-13 15:52:24 -04:00
Artem Bunichev b1bc748430 timeout.1: Document non-POSIX options
MFC after:		3 days
Reviewed by:		Aaron Li <aly@aaronly.me>, ziaee
Differential Revision:	https://reviews.freebsd.org/D56090
2026-04-13 15:52:21 -04:00
Colin Percival fa31e76a4c Revert "EC2: Add clibs-lib32 pkg to small/builder images"
This should no longer be necessary after 2018ae4e3b.

This reverts commit cfe0b7d37e.
2026-04-13 12:42:57 -07:00
Isaac Freund 2018ae4e3b pkgbase: remove incorrect clang shlib requires
The FreeBSD-clang package contains a 32-bit shared object at
/usr/lib/clang/19/lib/freebsd/libclang_rt.asan-i386.so

This is expected, since clang uses this object when compiling for i386
targets with asan enabled.

What is not expected is that the FreeBSD-clang package currently depends
on 32-bit libc packages due to pkg's shared library analysis, making it
impossible to install pkgbase on x86_64 without any lib32 packages.

This commit leverages a new pkg feature implemented in [1], but could
be landed before a pkg version including that feature is released
without any ill effects. Unknown keys in package manifests are ignored.

[1]: https://github.com/freebsd/pkg/pull/2594

Reviewed by:	ivy
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54792
2026-04-13 15:36:15 -04:00
Simon J. Gerraty e272f4a61e Fix default for .MAKE.SAVE_DOLLARS
NetBSD make defaults this to "yes",
bmake defaults it to "no" to retain the traditional behavior.

The default is dealt with in bmake's Makefile but that does not
address boot-strap.

For now, just change the ifdef in main.

PR: 294436
2026-04-13 10:38:50 -07:00
Pouria Mousavizadeh Tehrani bc793ad787 ifconfig: Fix printf on geneve for 32-bit architectures
Replace uint64_t type with uintmax_t in printf to fix warnings
on 32-bit architectures.

Reported by:	Jenkins
Fixes:		688e289ee9 ("ifconfig: Add support for geneve")
Differential Revision: https://reviews.freebsd.org/D55184
2026-04-13 20:28:31 +03:30
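
The portability issue behind bc793ad787 is that uint64_t is not the same underlying type on every architecture, so a fixed length modifier in the format string can mismatch on 32-bit targets. A minimal sketch of the usual pattern follows, with a hypothetical helper name `format_u64`: cast to uintmax_t and print with `%ju`.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a 64-bit value portably.  The cast to
 * uintmax_t matches the "%ju" conversion on both 32-bit and 64-bit
 * architectures.  Returns what snprintf() returns.
 */
int
format_u64(char *buf, size_t len, uint64_t value)
{
	return (snprintf(buf, len, "%ju", (uintmax_t)value));
}
```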
Dave Cottlehuber 3b10806812 release: remove Oracle Cloud Infrastructure build targets
Oracle's previous support is no longer available to the project.
Repeated attempts to find a sponsor within Oracle's cloud business
have not been successful.

The last published official images are from 15.0-RELEASE.

https://marketplace.oracle.com/app/freebsd-release

Relnotes:	yes
Sponsored by:	SkunkWerks, GmbH
Differential Revision:	https://reviews.freebsd.org/D56360
MFC after:	3 days
2026-04-13 15:34:17 +00:00
Harry Moulton 58de791536 arm64: mte: configure initial state for system registers
The fields in SCTLR_EL1 and HCR_EL2 for enabling MTE are set, and if the
ID_AA64PFR1_EL1 register shows MTE is present, the GCR_EL1 register is
also configured, and the two TFSR registers which hold pending tag check
faults are cleared.

Reviewed by:	andrew
Sponsored by:	Arm Ltd
Signed-off-by:	Harry Moulton <harry.moulton@arm.com>
Differential Revision:	https://reviews.freebsd.org/D55946
2026-04-13 15:23:05 +01:00
Harry Moulton aa555b6004 arm64: mte: add system register definitions
Add system register and bit field definitions for Memory Tagging
Extension (MTE) in ARMv8.5.

Reviewed by:	andrew
Sponsored by:	Arm Ltd
Signed-off-by:	Harry Moulton <harry.moulton@arm.com>
Co-authored-by:	Andrew Turner <andrew@FreeBSD.org>
Differential Revision:	https://reviews.freebsd.org/D55945
2026-04-13 15:23:05 +01:00
Harry Moulton 7e718b9a8e arm64: mte: cleanup cache register definitions
Clean up the definitions in armreg.h for the CSSIDR_EL1, CLIDR_EL1 and
CSSELR_EL1 system registers to prepare for additional bitfields for
Memory Tagging Extension (MTE).

Reviewed by:	andrew
Sponsored by:	Arm Ltd
Signed-off-by:	Harry Moulton <harry.moulton@arm.com>
Differential Revision:	https://reviews.freebsd.org/D55944
2026-04-13 15:23:05 +01:00
Andrew Turner 5809584275 arm64: Handle changing self-referential DMAP pages
Support changing the property of a DMAP page that holds its own page
table entry.

Because we need to perform a break-before-make sequence to change the
properties of pages, a page that also holds its own page table entry
will fault in the make part of the sequence.

Handle this by mapping the page with a temporary mapping as we already
do when demoting a superpage.

Reviewed by:	kib
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D55943
2026-04-13 15:23:05 +01:00
Andrew Turner c208439cdb arm64: Add a cmap page to pmap
When modifying mappings in pmap we may need to perform a
break-before-make sequence. This creates an invalid mapping, then
recreates it with the changes.

When modifying DMAP mappings we may be changing the mapping that
contains its own page table. Then, after breaking the old entry, we are
unable to create the new entry.

To fix this, create a map that can be used and won't be affected by the
break-before-make sequence.

Reviewed by:	kib
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D56306
2026-04-13 15:23:05 +01:00
Pouria Mousavizadeh Tehrani b0ef03f0c4 ifconfig.8: Add geneve(4) parameters
Add geneve parameters to ifconfig manual.

Reviewed by:	ziaee
Differential Revision: https://reviews.freebsd.org/D55181
2026-04-13 17:45:06 +03:30
Pouria Mousavizadeh Tehrani adecd4c4cd geneve.4: Add geneve manual
Reviewed by: ziaee, adrian
Differential Revision: https://reviews.freebsd.org/D55182
2026-04-13 17:45:05 +03:30
Pouria Mousavizadeh Tehrani aa9f669d09 geneve: Add tests for geneve
Add tests for each combination of geneve modes, address families
and multicast.

Differential Revision: https://reviews.freebsd.org/D55183
2026-04-13 17:45:04 +03:30
Pouria Mousavizadeh Tehrani 688e289ee9 ifconfig: Add support for geneve (netlink)
This implementation is netlink only

Differential Revision: https://reviews.freebsd.org/D55184
2026-04-13 17:45:01 +03:30
Pouria Mousavizadeh Tehrani e44d2e941e if_geneve: Add Support for Geneve (RFC8926)
geneve creates a generic network virtualization tunnel interface
for Tenant Systems over an L3 (IP/UDP) underlay network that provides
a Layer 2 (ethernet) or Layer 3 service using the geneve protocol.
This implementation is based on RFC8926.

Reviewed by:	glebius, adrian
Discussed with:	zlei, kp
Relnotes:	yes
Differential Revision: https://reviews.freebsd.org/D54172
2026-04-13 17:44:58 +03:30
Martin Matuska eb5165bb49 libarchive: merge from vendor branch
libarchive 3.8.7

Important bugfixes:
 #2871 libarchive: fix handling of option failures
 #2897 iso9660: fix undefined behavior
 #2898 RAR: fix LZSS window size mismatch after PPMd block
 #2900 CAB: fix NULL pointer dereference during skip
 #2911 libarchive: do not continue with truncated numbers
 #2919 CAB: Fix Heap OOB Write in CAB LZX decoder
 #2934 iso9660: fix possible heap buffer overflow on 32-bit systems
 #2939 cpio: Fix -R memory leak
 #2947 libarchive: lzop and grzip filter support

Important bugfixes between 3.8.5 and 3.8.6:
 #2860 bsdunzip: fix ISO week year and Gregorian year confusion
 #2864 7zip: fix SEGV in check_7zip_header_in_sfx via ELF offset validation
 #2875 7zip: fix out-of-bounds access on ELF 64-bit header
 #2877 RAR5 reader: fix infinite loop in rar5 decompression
 #2878 mtree reader: Fix file descriptor leak in mtree parser cleanup
       (CWE-775)
 #2892 RAR5 reader: fix potential memory leak
 #2893 RAR5: fix SIGSEGV when archive_read_support_format_rar5 is called
       twice
 #2895 CAB reader: fix memory leak on repeated calls to
       archive_read_support_format_cab

Obtained from:	libarchive
Vendor commit:	ded82291ab41d5e355831b96b0e1ff49e24d8939
MFC after:	1 week
2026-04-13 15:47:17 +02:00
Martin Matuska f2cd95a372 Update vendor/libarchive to 3.8.7
Important bugfixes between 3.8.6 and 3.8.7:
 #2871 libarchive: fix handling of option failures
 #2897 iso9660: fix undefined behavior
 #2898 RAR: fix LZSS window size mismatch after PPMd block
 #2900 CAB: fix NULL pointer dereference during skip
 #2911 libarchive: do not continue with truncated numbers
 #2919 CAB: Fix Heap OOB Write in CAB LZX decoder
 #2934 iso9660: fix possible heap buffer overflow on 32-bit systems
 #2939 cpio: Fix -R memory leak
 #2947 libarchive: lzop and grzip filter support

Important bugfixes between 3.8.5 and 3.8.6:
 #2860 bsdunzip: fix ISO week year and Gregorian year confusion
 #2864 7zip: fix SEGV in check_7zip_header_in_sfx via ELF offset validation
 #2875 7zip: fix out-of-bounds access on ELF 64-bit header
 #2877 RAR5 reader: fix infinite loop in rar5 decompression
 #2878 mtree reader: Fix file descriptor leak in mtree parser cleanup
       (CWE-775)
 #2892 RAR5 reader: fix potential memory leak
 #2893 RAR5: fix SIGSEGV when archive_read_support_format_rar5 is called
       twice
 #2895 CAB reader: fix memory leak on repeated calls to
       archive_read_support_format_cab

Obtained from:	libarchive
Vendor commit:	ded82291ab41d5e355831b96b0e1ff49e24d8939
2026-04-13 15:29:20 +02:00
YAO, Xin 26740e8f80 compat/linux: Add Linux i2c-dev ioctl compatibility support
Implement Linux I2C ioctl translation in the Linux compatibility layer
and wire iicbus cdevs up for in-kernel rdwr handling.
Support common i2c-dev requests including SLAVE, FUNCS, and RDWR,
while rejecting unsupported 10-bit and SMBus operations.

Signed-off-by:	YAO, Xin <mr.yaoxin@outlook.com>
Reviewed by:	imp, adrian, pouria
Differential Revision: https://reviews.freebsd.org/D56251
2026-04-13 16:01:47 +03:30
Zhenlei Huang e9fc0c5382 if_clone: Make ifnet_detach_sxlock opaque to consumers
The change e133271fc1 introduced ifnet_detach_sxlock, and change
6d2a10d96f widened its coverage, but there are still consumers,
e.g. net80211 and tuntap, that want it. Instead of sprinkling it everywhere,
make it opaque to consumers.

Out of tree drivers shall also benefit from this change.

Reviewed by:	kp
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D56298
2026-04-13 12:38:44 +08:00
Zhenlei Huang 38bd7ef62f ifnet: Move SIOCSIFVNET from ifhwioctl() to ifioctl()
SIOCSIFVNET is not a hardware ioctl. Move it to where it belongs.

While here, rewrite the logic of checking whether we are moving the
interface from and to the same vnet or not, since it is obviously not
stable to access the interface's vnet, given the current thread may
race with other threads running if_vmove().

MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D55880
2026-04-13 12:38:44 +08:00
Zhenlei Huang f1fae67afb ifnet: vnet_if_return(): Avoid unnecessary recursive acquisition of ifnet_detach_sxlock
vnet_if_return() will be invoked by vnet_sysuninit() on vnet destruction,
while the lock ifnet_detach_sxlock has already been acquired in
vnet_destroy().

With this change the order of locking is more clear. There should be no
functional change.

Reviewed by:	pouria
Fixes:		868bf82153 if: avoid interface destroy race
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D56288
2026-04-13 12:38:43 +08:00
Ryan Libby 8085c5a5c1 nvme_ctrlr_linux_passthru_cmd: correct size of upages_small array
The size broke when upages was converted from array to double pointer.

Reported by:	gcc -Wsizeof-pointer-div
Reviewed by:	imp
Fixes:		82ff1c334b ("nvme: Allow larger user request sizes")
Differential Revision:	https://reviews.freebsd.org/D56368
2026-04-12 16:39:41 -07:00
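
The class of bug fixed in 8085c5a5c1 comes from applying a sizeof-based element count to something that used to be an array but is now a pointer. The sketch below is not the nvme code: the structure, the names, and the constant 16 are hypothetical. It shows one way to keep the count correct, by computing it from the real array and storing it next to the pointer.

```c
#include <stddef.h>

#define	NITEMS(a)	(sizeof(a) / sizeof((a)[0]))

enum { SMALL_PAGES = 16 };	/* hypothetical inline capacity */

/* Hypothetical request: a small inline array plus a pointer to pages. */
struct req_sketch {
	void	*small[SMALL_PAGES];
	void	**pages;
	size_t	npages;
};

/*
 * NITEMS() is only valid on the array itself.  Applied to `pages`,
 * it would divide the size of one pointer by the size of another
 * pointer and so would typically yield 1, not 16.
 */
void
req_sketch_init(struct req_sketch *r)
{
	r->pages = r->small;
	r->npages = NITEMS(r->small);
}
```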
Gleb Smirnoff 151a1eab3b tcp: fix !INVARIANTS build
Fixes:	40dbb06fa7
2026-04-12 14:45:40 -07:00
Gleb Smirnoff 411c28b6ca hash(9): fix my stupid off-by-one
Fixes:	abf68d1cf0
2026-04-12 14:10:27 -07:00
Konstantin Belousov 660498986a fork.2: note that all methods to pre-resolve symbols have consequences
Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D56362
2026-04-12 23:29:42 +03:00
Konstantin Belousov f286933c95 unistd.h: _Fork(2) is required by POSIX 2024
Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D56362
2026-04-12 23:29:36 +03:00
Gleb Smirnoff ece716c5d3 raw ip: move hash table manipulation to inpcb layer
The SOCK_RAW socket is a multiple receiver socket by its definition.  An
incoming packet may be copied to multiple sockets.  Thus, incoming packet
handling is expensive.  Systems with many thousands of raw sockets usually
have them connect(2)-ed to different destinations.  This allows for some
improvement of the input handling, which was introduced by 9ed324c9a5
back in 2008.  This optimization was made specifically for L2TP/PPTP VPN
concentrators based on ports/net/mpd5.

This change generalizes the idea of 9ed324c9a5, so that it potentially
can be used with IPv6 raw sockets.  This also eliminates last use of the
pcbinfo hash lock outside of in_pcb.c.

While here make a speculative design decision: put into the hash table
sockets that did only connect(2).  Previously, we were indexing only
sockets that were protocol bound, did bind(2) and did connect(2).  My
speculation is that only the remote IP provides some real entropy into the
hash and local address and proto are expected to be the same for majority
of the sockets.  My other speculation is that VPN concentrators other than
mpd5 may not bind(2) their sockets, thus not getting any use of the hash.

Differential Revision:	https://reviews.freebsd.org/D56172
2026-04-12 11:35:13 -07:00
Gleb Smirnoff edece33b38 inpcb: move local address assignment out of in_pcbdisconnect()
The logic of clearing the local address at the protocol level makes sense.  It
is a feature of UDP, not of every protocol, that the local address is cleared on
disconnect.  This code can be tracked down to pre-FreeBSD times.

For example, for TCP we want a disconnected socket to return previously
used local address with getsockname(2).  The TCP has successfully evaded
that by not calling in_pcbdisconnect() and calling in_pcbdetach() in the
very old code and in_pcbdrop() later.   After D55661 TCP again has this
potential bug masked.  Better make it right than rely on such
unintentional evasions.

The raw IP sockets don't use in_pcbdisconnect(), but they are going to in
the near future.  If in_pcbdisconnect() clears local address for them,
that would be a larger bug than just getsockname().  A raw socket may be
bound with bind(2) and then connect(2)ed, and then disconnected, e.g.
connect(INADDR_ANY).  And when we run raw IP socket through
in_pcbdisconnect() we don't want to lose local address.

This reverts D38362.
This reverts commit 2589ec0f36.

Reviewed by:		rrs, markj
Differential Revision:	https://reviews.freebsd.org/D56170
2026-04-12 11:34:57 -07:00
Gleb Smirnoff 1d0ea3dfb9 raw ip: remove extra argument to rip_dodisconnect()
No functional change.
2026-04-12 11:34:05 -07:00
Gleb Smirnoff acb79b56b1 udp: make in_pcbbind_setup() acquire the hash lock internally
Reviewed by:		pouria, rrs, markj
Differential Revision:	https://reviews.freebsd.org/D55973
2026-04-12 11:33:51 -07:00
Gleb Smirnoff d7c409174d inpcb: make in6_pcbsetport() acquire the hash lock internally
Reviewed by:		pouria, rrs, markj
Differential Revision:	https://reviews.freebsd.org/D55972
2026-04-12 11:33:41 -07:00
Gleb Smirnoff 2c48736c55 inpcb: make in_pcbconnect() acquire the hash lock internally
Reviewed by:		pouria, rrs, markj
Differential Revision:	https://reviews.freebsd.org/D55971
2026-04-12 11:33:30 -07:00
Gleb Smirnoff 8b4d0bec43 inpcb: make in_pcbbind() acquire the hash lock internally
Reviewed by:		markj
Differential Revision:	https://reviews.freebsd.org/D55970
2026-04-12 11:33:20 -07:00
Gleb Smirnoff 40dbb06fa7 inpcb: retire INP_DROPPED and in_pcbdrop()
The inpcb flag INP_DROPPED served two purposes.

It was used by TCP and subsystems running on top of TCP as a flag that
marks a connection that is now in TCPS_CLOSED, but was in some other state
before (not a new-born connection). Create a new TCP flag TF_DISCONNECTED
for this purpose.

The in_pcbdrop() was a TCP's version of in_pcbdisconnect() that also sets
INP_DROPPED.  Use in_pcbdisconnect() instead.

Second purpose of INP_DROPPED was a negative lookup mask in
inp_smr_lock(), as SMR-protected lookup may see inpcbs that had been
removed from the hash.  We already have had INP_INHASHLIST that marks
inpcb that is in hash.  Convert it into INP_UNCONNECTED with the opposite
meaning.  This allows combining it with INP_FREED for the negative lookup
mask.

The Chelsio/ToE and kTLS changes are done with some style refactoring,
like moving inp/tp assignments up and using macros for that.  However, no
deep thinking was taken to check if those checks are really needed, it
could be that some are not.

Reviewed by:		rrs
Differential Revision:	https://reviews.freebsd.org/D56186
2026-04-12 11:33:07 -07:00
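
The reason 40dbb06fa7 inverts INP_INHASHLIST into INP_UNCONNECTED can be shown with a small flag sketch. The `SK_` names and bit values below are hypothetical and do not match the kernel's definitions. When both "reject" conditions are expressed as set bits, a lookup can test them with a single mask; a flag that must be set to accept (as "in hash list" was) cannot share that test.

```c
/* Hypothetical flag bits; the kernel's values differ. */
#define	SK_FREED	0x1u	/* object is being freed */
#define	SK_UNCONNECTED	0x2u	/* object is not in the hash */

/* Negative lookup mask: any of these bits means "skip this entry". */
#define	SK_LOOKUP_REJECT	(SK_FREED | SK_UNCONNECTED)

/* Return 1 if a lookup may return the entry, 0 if it must skip it. */
int
lookup_accepts(unsigned int flags)
{
	return ((flags & SK_LOOKUP_REJECT) == 0);
}
```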
Gleb Smirnoff ce283e115b netinet6: remove INP_DROPPED checks from setsockopt(2)
The INP_DROPPED is going to become an internal flag for inpcb.  As of now
it means a TCP pcb that is in TCPS_CLOSED.  There is nothing wrong with
calling setsockopt(2) on such a socket, although it has no practical use.

This deletes a piece of code from 56713d16a0 / D16201.  There is no
description of the panic fixed, but I will speculate that the panic was
about in6p->in6p_outputopts being NULL as the inpcb already went through
in_pcbfree_deferred().  This also can be related to compressed TIME-WAIT,
that is also gone now.

With current locking this shouldn't be possible.  An inpcb goes through
in_pcbfree() only with pr_detach method, which is called from sofree(),
and the latter is called on losing the very last socket reference.  So, at
the point when in_pcbfree() is called, the socket has lost its file
descriptor reference and there can not be any running setsockopt() on it.

Leave the call to ip6_pcbopt() still embraced with INP_WLOCK(), since we
are modifying inpcb contents.

NB: the IPv6 setsockopt(2) definitely has room for improvement.  Several
memory allocations should be moved out of lock and made M_WAITOK.
Covering large piece of setsockopt(2) code with epoch(9) just because
ip6_setpktopts() calls ifnet_byindex() isn't correct either.

Reviewed by:		markj
Differential Revision:	https://reviews.freebsd.org/D56169
2026-04-12 11:32:15 -07:00