Make it possible to create a review without publishing it. This should
be useful when one wants to restrict the visibility of a review, as that
cannot be done via the command line. Note that a draft review is still
publicly visible if one can guess the URL, but creating one does not
result in email notifications to subscribers etc., nor does a draft
appear in the creating user's activity log.
Once a draft is ready, one can publish it via the web UI.
Reviewed by: jrm
Differential Revision: https://reviews.freebsd.org/D56664
Import groups(7) from NetBSD, with tweaks for our system. The group
list is sorted by GID. All the group names from /usr/src/etc/group
are described, except "uucp". The FILES section was added on top of
the original manual page.
PR: 264966
Relnotes: yes
MFC after: 3 days
Obtained from: NetBSD
Reviewed by: des, ziaee
Differential Revision: https://reviews.freebsd.org/D54114
When processing an ASCONF chunk we failed to verify that the chunk
length was at least 8 bytes. As a result we might end up passing a
negative length to pf_multihome_scan(). Fortunately this merely meant
the function did nothing, but we should discard such invalid packets, so
explicitly check for this.
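
A minimal sketch of the check described above, written as standalone C rather than as the pf code itself. The 8-byte minimum comes from the text; the helper name, the constant name, and the assumption that those 8 bytes are a fixed header subtracted before scanning are illustrative only.

```c
#include <stdint.h>

/* Assumed: the fixed part of an ASCONF chunk that precedes its parameters. */
#define ASCONF_MIN_LEN	8

/*
 * Return the number of parameter bytes that follow the fixed part of the
 * chunk, or -1 if the advertised length is too short and the packet
 * should be discarded instead of being scanned.
 */
static int
asconf_payload_len(uint16_t chunk_len)
{
	if (chunk_len < ASCONF_MIN_LEN)
		return (-1);
	return ((int)chunk_len - ASCONF_MIN_LEN);
}
```

Without the early return, a chunk length of 4 would yield 4 - 8 = -4, which is the negative length the text refers to.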
MFC after: 1 week
Reported by: Mark Johnston
Sponsored by: Orange Business Services
This patch addresses the code review comments provided for:
https://reviews.freebsd.org/D56197
* P7 VF PCI ID: rename NETXTREME_E_P7_VF to E_P7_VF (P7/Thor2 line drops the
Netxtreme name in product strings; other VF device IDs are unchanged).
* Use the return value of bnxt_vf_parse_schema() in bnxt_iov_vf_add() to
decide when to call bnxt_set_vf_admin_mac(); make parse_schema() return
bool and remove the has_admin_mac field.
* In bnxt_free_vf_resources(), fix indentation after dma_free_coherent() so
the NULL assignment is clearly separate from the call.
* In bnxt_hwrm_func_vf_resource_free(), use first_vf_id/last_vf_id in the
HWRM_FUNC_VF_RESC_FREE loop.
MFC after: 1 month
Reviewed by: ssaxena
Differential Revision: https://reviews.freebsd.org/D56644
VFs require separate HWRM commands for ring reservation and async
completion ring setup, so a common PF/VF dispatcher is introduced and
the async CR path is extended to handle both. The PF must populate the
VF request forwarding bitmap during driver registration so the firmware
correctly forwards VF-originated HWRM commands. VF reservation strategy
and min-guaranteed capability flags are now parsed for correct resource
partitioning, and PF-only operations (DCB, NVM, package version sysctl)
are guarded against VF invocation.
The short command buffer allocation is also reordered before the function
reset to ensure extended HWRM messages are available when needed, a
prerequisite uncovered during VF bring-up.
MFC after: 1 month
Reviewed by: ssaxena
Differential Revision: https://reviews.freebsd.org/D56232
When the firmware undergoes a hot-reset and the driver re-opens the
device, previously active Virtual Functions lose their resource
configuration. bnxt_reenable_sriov() restores that configuration by
replaying bnxt_cfg_hw_sriov() with the saved resource parameters.
The function is called from bnxt_fw_reset_task() in the
BNXT_FW_RESET_STATE_OPENING state, guarded by #ifdef PCI_IOV.
Because bnxt_cfg_hw_sriov() is a no-op when active_vfs is zero the
call is safe on any PF regardless of whether VFs were ever created.
MFC after: 1 month
Reviewed by: ssaxena
Differential Revision: https://reviews.freebsd.org/D56201
Expose per-VF policy knobs via the FreeBSD sysctl tree and enforce
them at the data-path level.
Trust (dev.bnxt.<unit>.vfN.trusted):
bnxt_set_vf_trust() sets/clears BNXT_VF_TRUST and sends
HWRM_FUNC_CFG with FLAGS_TRUSTED_VF_ENABLE/DISABLE.
bnxt_create_trusted_vf_sysctls() / bnxt_destroy_trusted_vf_sysctls()
manage the sysctl lifetime with VF creation/teardown.
Spoof-check (dev.bnxt.<unit>.vfN.spoofchk):
bnxt_set_vf_spoofchk() issues HWRM_FUNC_CFG with
SRC_MAC_ADDR_CHECK_ENABLE/DISABLE.
Promiscuous gating:
bnxt_is_trusted_vf() queries firmware via HWRM_FUNC_QCFG.
bnxt_promisc_ok() returns false for untrusted VFs, preventing them
from entering promiscuous mode. bnxt_promisc_set() is updated to
gate the PROMISCUOUS and ANYVLAN_NONVLAN mask bits on bnxt_promisc_ok().
bnxt_iov_vf_add() applies the initial trust/spoof-check policy from the
nvlist schema. bnxt_iov_init() creates the sysctl trees after
bnxt_cfg_hw_sriov() succeeds. bnxt_iov_uninit() tears them down.
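
The promiscuous gating can be pictured roughly as follows. This is a sketch under stated assumptions, not the driver code: the bit values, the struct, and the function names are hypothetical, and treating a PF as always allowed is an assumption (the text only says that untrusted VFs are refused).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions; the real mask bits come from the HWRM headers. */
#define MASK_PROMISCUOUS	(UINT32_C(1) << 0)
#define MASK_ANYVLAN_NONVLAN	(UINT32_C(1) << 1)

/* Hypothetical per-function state. */
struct func_sketch {
	bool is_vf;
	bool trusted;
};

/* Untrusted VFs may not enter promiscuous mode (PF always allowed: assumed). */
static bool
promisc_ok_sketch(const struct func_sketch *f)
{
	return (!f->is_vf || f->trusted);
}

/*
 * Compute the RX mask: the two promiscuous-related bits are set only when
 * promiscuous mode is requested and permitted; other bits are left alone.
 */
static uint32_t
rx_mask_sketch(const struct func_sketch *f, uint32_t mask, bool want_promisc)
{
	mask &= ~(MASK_PROMISCUOUS | MASK_ANYVLAN_NONVLAN);
	if (want_promisc && promisc_ok_sketch(f))
		mask |= MASK_PROMISCUOUS | MASK_ANYVLAN_NONVLAN;
	return (mask);
}
```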
MFC after: 1 month
Reviewed by: ssaxena
Differential Revision: https://reviews.freebsd.org/D56200
Enable the Physical Function to proxy HWRM commands issued by Virtual
Functions through the firmware forwarded-request mechanism.
When a VF issues a command that requires PF arbitration, the firmware
delivers a CMPL_BASE_TYPE_HWRM_FWD_REQ completion to the PF async ring.
* bnxt_process_async_msg() recognises CMPL_BASE_TYPE_HWRM_FWD_REQ,
identifies the originating VF by its firmware function ID, sets the
corresponding bit in pf.vf_event_bmap, and raises
BNXT_HWRM_EXEC_FWD_REQ_SP_EVENT to schedule deferred processing.
* bnxt_sp_task() dispatches to bnxt_hwrm_exec_fwd_req(), which iterates
over all pending VF bits and calls bnxt_vf_req_validate_snd() for each.
* bnxt_vf_req_validate_snd() inspects the encapsulated request type:
HWRM_FUNC_VF_CFG (MAC change) is handled by bnxt_vf_configure_mac()
which enforces trust/existing-MAC rules; HWRM_CFA_L2_FILTER_ALLOC is
handled by bnxt_vf_validate_set_mac(); HWRM_FUNC_CFG is forwarded
as-is; all other commands are rejected.
All forwarded-request code is guarded by #ifdef PCI_IOV.
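
The flow above (mark a bit in the async path, drain the bitmap later, decide per request type) can be sketched in standalone C. All names, the 64-VF limit, and the enum values below are hypothetical stand-ins chosen for illustration; the real driver types and HWRM request codes are not reproduced here.

```c
#include <stdint.h>

#define SKETCH_MAX_VFS	64	/* hypothetical: one 64-bit word suffices */

/* Hypothetical stand-ins for the encapsulated request types. */
enum fwd_req {
	FWD_FUNC_VF_CFG,		/* MAC change */
	FWD_CFA_L2_FILTER_ALLOC,
	FWD_FUNC_CFG,
	FWD_OTHER
};

enum fwd_verdict {
	FWD_CONFIGURE_MAC,	/* apply trust/existing-MAC rules */
	FWD_VALIDATE_MAC,
	FWD_PASS_THROUGH,	/* forwarded as-is */
	FWD_REJECT
};

static uint64_t vf_event_bmap;	/* bit N set: VF N has a request pending */

/* Async completion path: record the originating VF and return quickly. */
static void
fwd_req_mark(unsigned int vf)
{
	if (vf < SKETCH_MAX_VFS)
		vf_event_bmap |= UINT64_C(1) << vf;
}

/* Map one request type to the action the text describes for it. */
static enum fwd_verdict
fwd_req_classify(enum fwd_req req)
{
	switch (req) {
	case FWD_FUNC_VF_CFG:
		return (FWD_CONFIGURE_MAC);
	case FWD_CFA_L2_FILTER_ALLOC:
		return (FWD_VALIDATE_MAC);
	case FWD_FUNC_CFG:
		return (FWD_PASS_THROUGH);
	default:
		return (FWD_REJECT);
	}
}

/*
 * Deferred path: visit every pending VF, clear its bit, and store the
 * verdict for its request. Returns the number of VFs handled.
 */
static unsigned int
fwd_req_drain(const enum fwd_req *reqs, enum fwd_verdict *out)
{
	unsigned int vf, handled = 0;

	for (vf = 0; vf < SKETCH_MAX_VFS; vf++) {
		uint64_t bit = UINT64_C(1) << vf;

		if ((vf_event_bmap & bit) == 0)
			continue;
		vf_event_bmap &= ~bit;
		out[vf] = fwd_req_classify(reqs[vf]);
		handled++;
	}
	return (handled);
}
```

Splitting the work this way keeps the completion handler short; the per-request policy runs only in the deferred task.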
MFC after: 1 month
Reviewed by: ssaxena
Differential Revision: https://reviews.freebsd.org/D56199
Teach the driver to distinguish a Physical Function from a Virtual
Function at probe time and configure each appropriately.
* Introduce bnxt_is_vf_device() to identify all known VF device IDs
(NetXtreme-C/E Gen1-3, Thor1/2, Hyper-V variants). Add corresponding
PVID entries to bnxt_vendor_info_array.
* Refactor the iflib shared context: rename bnxt_sctx_init to
bnxt_sctx_template, add a Thor2-specific bnxt_sctx_template_p7, and
build per-call PF/VF instances via bnxt_init_sctx_variants(); the VF
instance carries IFLIB_IS_VF. bnxt_register() selects the correct sctx.
* bnxt_attach_pre(): replace the hard-coded NPAR/VF switch with
bnxt_set_flags_by_devid(); on a VF call bnxt_approve_mac() to request
PF approval for the firmware-assigned MAC address.
* bnxt_hwrm_func_qcaps(): populate fw_fid and MAC for PF and VF contexts
separately; for PF call iflib_set_mac() and record max_msix_vfs; for VF
handle the case where the PF has not yet assigned a MAC.
* bnxt_hwrm_func_qcfg(): populate the new alloc_* counters used by the VF
resource configuration path; record registered_vfs for PF and VLAN/trust
state for VF.
* bnxt_init(): call bnxt_update_vf_mac() on VFs after each bring-up.
MFC after: 1 month
Reviewed by: ssaxena
Differential Revision: https://reviews.freebsd.org/D56198
Introduce the foundational building blocks for SR-IOV Virtual Function
support on Broadcom NetXtreme-C/E adapters.
* Add bnxt_sriov.h: defines the extended bnxt_vf_info structure (per-VF
firmware FID, MAC addresses, VLAN, flags, DMA command buffers, resource
counts), the bnxt_resc_map helper, flag macros (BNXT_VF_TRUST,
BNXT_VF_SPOOFCHK, etc.), and prototypes for all SR-IOV functions.
* Add bnxt_sriov.c: implements the SR-IOV attachment sequence
(bnxt_sriov_attach), the iflib IOV callbacks (bnxt_iov_init,
bnxt_iov_uninit, bnxt_iov_vf_add), VF resource allocation and
firmware configuration helpers (bnxt_alloc_vf_resources,
bnxt_cfg_hw_sriov, bnxt_hwrm_func_vf_resc_cfg, bnxt_hwrm_func_buf_rgtr,
bnxt_hwrm_func_vf_resource_free), and the per-VF parameter helper.
* Extend bnxt.h: include bnxt_sriov.h; extend bnxt_pf_info with VF-
tracking fields (vf array, firmware FID/MAC, resource-reservation
strategy, DMA page management, sysctl context); replace the upstream
bnxt_vf_info stub with the full definition from bnxt_sriov.h; extend
bnxt_func_qcfg with allocation counters required by the VF resource
configuration path; add vf_resc_cfg_input and sriov_lock to bnxt_softc.
* Update Makefile to build bnxt_sriov.c and include bnxt_sriov.h.
* Wire up PCI-IOV device methods (pci_iov_init / pci_iov_uninit /
pci_iov_add_vf) and iflib IOV callbacks (ifdi_iov_init / ifdi_iov_uninit
/ ifdi_iov_vf_add) in if_bnxt.c; call bnxt_sriov_attach() from
bnxt_attach_post() on P5+ Physical Functions.
MFC after: 1 month
Reviewed by: ssaxena
Differential Revision: https://reviews.freebsd.org/D56197
Various src.conf options can cause us to build something that ends up
in the clang package, but MK_TOOLCHAIN is not one of them; copy the
proper conditional from lib/Makefile to decide if we need to build
the package.
This fixes the build when LLVM/clang is entirely disabled.
Fixes: bb75b0d581 ("packages: Convert world to a subdir build")
MFC after: 2 weeks
Reviewed by: emaste
Sponsored by: https://www.patreon.com/bsdivy
Differential Revision: https://reviews.freebsd.org/D56657
In some versions of LLVM (at least 21), the <*intrin.h> headers contain
unguarded duplicate typedefs; this isn't permitted prior to C11, and
libzpool is built as C99. FreeBSD's LLVM backported LLVM PR #153820
to fix this, but other versions of LLVM (e.g., upstream, or on Linux)
don't have the patch, so this breaks the build.
Add -Wno-error=typedef-redefinition to downgrade this from an error
to a warning.
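
The situation can be reproduced in miniature. The type and function names below are made up for illustration and are not taken from the LLVM headers. Compiled as C11 or later the snippet is valid; a C99 build that promotes warnings to errors would be expected to fail on the second typedef unless the diagnostic is downgraded as described above.

```c
/* The same typedef appearing twice, as if two headers both defined it. */
typedef int vec_elem_sketch_t;
typedef int vec_elem_sketch_t;	/* identical redefinition: fine in C11, not in C99 */

static vec_elem_sketch_t
twice(vec_elem_sketch_t x)
{
	return (2 * x);
}
```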
MFC after: 2 weeks
Reviewed by: dim, emaste
Sponsored by: https://www.patreon.com/bsdivy
Differential Revision: https://reviews.freebsd.org/D56653
Previously we had a mix of ${PKG_CMD} and bare 'pkg', which is
wrong, and breaks the build when 'pkg' isn't in the tools path,
e.g. when cross-building.
MFC after: 2 weeks
Reviewed by: wosch, emaste
Sponsored by: https://www.patreon.com/bsdivy
Differential Revision: https://reviews.freebsd.org/D56655
We support both -h and -n, but GNU coreutils only supports -n,
so use that instead. This fixes the package build on Linux.
MFC after: 2 weeks
Reviewed by: (wosch, imp) (previous version), emaste
Better fix than the original patch suggested by: jrtc27
Sponsored by: https://www.patreon.com/bsdivy
Differential Revision: https://reviews.freebsd.org/D56656
This makes it less likely we will silently generate broken artifacts.
Reviewed by: ivy
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D56671
Extract rx_overruns from the keep alive descriptor reported by
the device and expose it via sysctl hw stats.
RX overrun errors occur when a packet arrives but there are not
enough free buffers in the RX ring to receive it.
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D56640
Invoke ln with -n and -f. In normal use it doesn't matter, but during
development this might be run in a partially populated leftover tree.
Reviewed by: ivy
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D52883
The sh-based makeman silently ignored errors from `make showconfig`.
Ignore such errors in makeman.lua too (but emit a warning).
We may want to revisit this in the future, but want makeman.lua to
behave identically for now.
PR: 294822
Reviewed by: kevans
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D56663
The changes [1] and [2] made to CURRENT introduce races between ifnet
detach and vmove operations. Fixing that requires extra effort. They
have not been MFCed to the stable branches, so those are not affected.
Temporarily skip the two affected tests on CURRENT.
[1] 0bf42a0a05 bpf: virtualize bpf_iflist
[2] a4d766caf7 bpf: add a crutch to support if_vmove
PR: 292993
Discussed with: kp
A ng_eiface(4) or physical interface does not involve the cloner, hence
detaching it differs slightly from detaching an epair(4). Add more tests
to cover that.
PR: 292993
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D56609
Ideally we would have tests for all possible races between
if_detach(), if_vmove_loan(), if_vmove_reclaim() and vnet_if_return().
That would require too many tests, and having them all appears to be of
limited value. So focus only on potential future regressions related to
the recent fixes [1] and [2].
[1] ee9456ce37 ifnet: Fix races in if_vmove_reclaim()
[2] ba7f47d47d ifnet: if_detach(): Fix races with vmove operations
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D56606
This is a follow-up to cc7479d7dc ("mixer(8): Improve mute and recsrc
controls"). These deprecated values will be completely removed on
2026-06-15.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Reviewed by: 0mp
Pull Request: https://ron-dev.freebsd.org/FreeBSD/src/pulls/21
The X1E and others have a separate configuration bit to increase the
pull-up drive strength for i2c busses.
Add the plumbing; it doesn't do anything just yet.
Differential Revision: https://reviews.freebsd.org/D56351
* Create a set of callbacks implementing the hardware specific
GPIO bus operations
* Migrate the IPQ4018 TLMM setup code into qcom_tlmm_ip4018.c
Differential Revision: https://reviews.freebsd.org/D56349
The current arm64 PCPU implementation uses a global register asm
variable to use x18, which we reserve with -ffixed-x18, from C. Inside a
critical_enter() or sched_pin(), it is vital that any PCPU reads use the
right PCPU pointer, as often the whole point of the critical_enter() or
sched_pin() is to ensure consistent PCPU use (e.g. for SMR it relies on
zpcpu giving the same SMR state). critical_enter() and sched_pin() both
include atomic_interrupt_fence(), i.e. asm volatile("" ::: "memory"),
barriers to ensure that memory accesses don't get moved by the compiler
outside the critical section, which on most architectures will also
order the read of the PCPU pointer itself (whether due to the read being
another asm volatile statement, or due to using a segment-relative
memory access as on x86). However, this approach on arm64 is in no sense
a memory access, and therefore the register access is not ordered with
respect to the critical_enter() or sched_pin(), or more specifically
the curthread->td_critnest++ / curthread->td_pinned++ within.
In practice upstream today this works out ok because the read of x18 is
inlined into the actual PCPU_GET/ADD/SET memory accesses (i.e. you will
get something like ldr xN, [x18, #imm-or-xM] for PCPU_GET, etc.), and
since *that* instruction is ordered properly due to being a memory
access, the x18 ends up being read in the right place. However, that is
not in any way guaranteed, it just relies on the hope that compiler
optimisations will be perfect at inlining the use. Moreover, PCPU_PTR is
definitely not a memory access in this world, it's just pointer
arithmetic on x18, and so that has nothing ordering it. This can be
observed with the following test function compiled into the kernel:
    void
    pcpu_test(void)
    {
            extern void __weak_symbol use_pcpu_ptr(void *);
            critical_enter();
            use_pcpu_ptr(PCPU_PTR(curthread));
            critical_exit();
    }
Obviously, this is a bit contrived as you could just read curthread
directly via its atomic definition that bypasses any worries about PCPU
atomicity, but it illustrates the point. With the in-tree LLVM*, this
ends up being compiled for me to:
    paciasp
    stp x29, x30, [sp, #-0x10]!
    mov x29, sp
    ldr x8, [x18]
    ldr w9, [x8, #0x4fc]
    mov x0, x18
    add w9, w9, #0x1
    str w9, [x8, #0x4fc]
    bl use_pcpu_ptr
    ...
Note that, although the PCPU_PTR was within the critical section in the
C source, the read of x18 into x0, the argument register passed to
use_pcpu_ptr, has been hoisted to before the str, which is storing the
new, incremented, value of td_critnest to curthread, and so there is a
window within which we have to hope the thread is not preempted and
migrated to a different CPU, otherwise it will pass a pointer to the
wrong CPU's pc_curthread PCPU member.
Initially it would seem as though the solution to this would be to add
an additional barrier to critical_enter() / sched_pin() to ensure the
register reads could not be hoisted like this. However, I have not been
able to find a sequence that works reliably across both GCC and Clang,
independent of optimisation level. Using inline asm with x18 marked as a
clobber, using "=r"(pcpup), and using "+r"(pcpup) all run into various
issues; some combinations don't actually seem to be a barrier, and for
Clang at -O0 some combinations will actually generate writes to x18**,
at which point you then have to hope that the kernel is compiled with
optimisations, and that the redundant writes are optimised away such
that x18 is just passed through. But that just gets us back to hoping
optimisation works, which isn't a solution to the problem, it just
trades one point of fragility for another.
In talking to GCC developers, who seemed rather horrified by the
implications of trying to do this (which is effectively "register
volatile", a combination that's explicitly forbidden), we could not find
a solution to this, and so I have concluded that the only reliable way to
have a sound PCPU implementation is to ditch this optimisation and
follow other non-x86 architectures in using inline asm in one form or
another; specifically, this adopts riscv's approach of just calling
get_pcpu(), which, curiously, was already implemented in inline asm here
on arm64, rather than reading pcpup.
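
A rough sketch of the shape this takes, for illustration only. The struct, the macro name, and the fallback branch are hypothetical so that the snippet builds outside the kernel, and the asm statement is an assumption about what get_pcpu() looks like rather than a copy of it; check the tree for the real definition.

```c
/* Hypothetical stand-in for struct pcpu. */
struct pcpu_sketch {
	int pc_cpuid;
};

/* Used only when not building for an arm64 kernel. */
static struct pcpu_sketch fallback_pcpu;

static inline struct pcpu_sketch *
get_pcpu_sketch(void)
{
	struct pcpu_sketch *p;

#if defined(__aarch64__) && defined(_KERNEL)
	/*
	 * Read x18 through an asm volatile statement instead of through a
	 * global register variable; per the text, this is the kind of read
	 * that the existing fences in critical_enter() and sched_pin() order.
	 */
	__asm__ __volatile__("mov %0, x18" : "=&r" (p));
#else
	p = &fallback_pcpu;
#endif
	return (p);
}

/* PCPU_PTR-style access built on the function call, not on a register variable. */
#define PCPU_PTR_SKETCH(member)	(&get_pcpu_sketch()->member)
```

The cost is that the compiler can no longer fold x18 directly into the load or store, which is the optimisation being given up.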
Anyone who feels strongly enough about PCPU performance is welcome to
try to find a working approach, but such proposals should be heavily
scrutinised to be certain that they won't come back to bite us in
future. In particular, this caused a lot of problems downstream in
CheriBSD's experimental compartmentalised kernel, which is trialling
interposing on PCPU accesses in order to restrict access within
compartments. As a result, even PCPU_GET/SET/ADD can look like PCPU_PTR,
as they pass an opaque PCPU reference to wrapper functions, and so this
case gets hit all over the kernel, giving highly-confusing panics with
locks that aren't owned by the current thread or SMR use allegedly not
within an smr_enter().
The ia64 port encountered the same issue and reached the same conclusion
in e31ece45b7a4 ("Fix the PCPU access macros."), though went to the
trouble of trying to fold the offset into the inline assembly (assuming
it fit, with no fallback if not, since it's using the add pseudo-op that
will be expanded to either adds with a 14-bit immediate or, if somehow
that doesn't fit, addl with a 22-bit immediate). Curiously though it
left pcpup around as a footgun. sparc64 had similar code but was never
fixed. It also defined a curpcb in the same manner which was presumably
similarly broken, but looks to have been entirely unreferenced from C,
only referenced in actual assembly files. Alpha also had the same
design, but it was removed whilst critical_enter() was extern rather
than static inline so uses of the pointer could not have been hoisted,
and whilst sched_pin() didn't have any form of atomic_interrupt_fence()
to even try to make PCPU well-ordered.
* At time of writing, when that was LLVM 19, not verified at time of
commit with LLVM 21.
** For "+r"(pcpup), Clang's initial code generation is to do:
    mov xTmp1, x18
    mov x18, xTmp1
    /* asm (empty) */
    mov xTmp2, x18
    mov x18, xTmp2
since its interpretation of what that means is "read the value of
pcpup, and make sure that value is in x18 for the duration of the
assembly due to the asm("x18") on pcpup", and similarly for the output
side.
Reviewed by: andrew, jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D56601
SA_UNSUPPORTED was introduced in Linux 5.11 to probe support
for other flags such as SA_EXPOSE_TAGBITS, introduced
at the same time. Ignore both.
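
Ignoring the two flags amounts to masking them off before the remaining flags are translated. The sketch below is illustrative only: the helper name is hypothetical, and the numeric values are an assumption about the Linux UAPI headers that should be verified against them.

```c
#include <stdint.h>

/* Assumed values; verify against the Linux UAPI signal definitions. */
#define LINUX_SA_UNSUPPORTED	UINT32_C(0x00000400)
#define LINUX_SA_EXPOSE_TAGBITS	UINT32_C(0x00000800)

/* Drop the flags that are accepted but ignored; leave everything else. */
static uint32_t
strip_ignored_sa_flags(uint32_t flags)
{
	return (flags & ~(LINUX_SA_UNSUPPORTED | LINUX_SA_EXPOSE_TAGBITS));
}
```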
Signed-off-by: Ricardo Branco <rbranco@suse.de>
PR: 289285
Reviewed by: pouria, kib
Pull-Request: https://github.com/freebsd/freebsd-src/pull/2163
It will be removed soon and replaced with pmap_s1_invalidate_all_kernel.
This allows us to handle errata that cpu_tlb_flushID is missing
workarounds for.
Sponsored by: Arm Ltd
When using outline atomics on arm64 the compiler will create a call to
a function that performs the atomic operation. This allows us to use
the fastest operation depending on the hardware.
As these functions are implemented in libgcc create a linker script
so libraries that link against libgcc_s will include libgcc to pull
them in.
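
As a rough illustration (not part of this change), a C11 atomic increment like the one below is the kind of operation affected. The variable and function names are made up. With outline atomics enabled on arm64, the compiler is expected to emit a call to a libgcc helper (names of the form __aarch64_ldadd4_relax, which is an assumption to verify) instead of inlining the instruction sequence; on other targets the same code simply compiles inline.

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint32_t counter_sketch;

/* Atomically add one and return the new value. */
static uint32_t
bump_sketch(void)
{
	return (atomic_fetch_add_explicit(&counter_sketch, 1,
	    memory_order_relaxed) + 1);
}
```

A library containing such a call needs the helper at link time, which is why the text arranges for libgcc to be pulled in alongside libgcc_s.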
Reviewed by: imp, jhb
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45268
While the majority of virtio platforms will be fully coherent, some may
require cache maintenance or other specific device memory handling (e.g. for
secure partitioning). Using bus_dma allows for these use cases.
The virtio buffers are marked as coherent; this should ensure that sync
calls are no-ops in the common cases.
Reviewed by: andrew
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D55564
While the majority of virtio platforms will be fully coherent, some may
require cache maintenance or other specific device memory handling (e.g. for
secure partitioning). Using bus_dma allows for these use cases.
The virtio buffers are marked as coherent; this should ensure that sync
calls are no-ops in the common cases.
Reviewed by: andrew
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D55492
* Get certdata.txt directly from the NSS Mercurial repository, rather
than from the Mozilla Firefox repository which imports it from NSS at
irregular intervals.
* Instead of always fetching the latest certdata.txt, fetch a specific
version. For this commit, we set this to the version that was last
imported in May 2025.
* Add a reference to the MPL to the generated files.
* Regenerate with latest OpenSSL. This is purely cosmetic; mostly, the
certificate names now contain less unnecessary whitespace and some
elements are quoted.
MFC after: 1 week
Reviewed by: michaelo, kevans
Differential Revision: https://reviews.freebsd.org/D56620
The function awg_poll() was missing a prototype, which causes the build
to fail if DEVICE_POLLING is enabled, which it is in the ARMADAXP config.
MFC after: 2 weeks
Reviewed by: tuexen, mmel, adrian
Sponsored by: https://www.patreon.com/bsdivy
Differential Revision: https://reviews.freebsd.org/D56651
64-bit processes can issue the INT $0x80 instruction and get the syscall
dispatched through ia32_syscall(). This works because syscall argument
fetch and result return are selected from the process sysent.
But, ia32_syscall() does not verify some conditions and does not perform
some actions which are considered unnecessary because the caller is
supposed to only access lower 4G. The INT syscall path breaks this
assumption.
We never supported such a hack, so disable it. Send the offending thread
SIGBUS, as if hardware had raised #GP because the DPL of IDT vector 0x80
is not numerically high enough.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D56630
Whatever the params pointer is, it does not matter: copyin() handles any
value. In fact, params can never be NULL.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D56630
If the /sbin/init binary is broken somehow, the way out is to set the
loader environment variable init_path to something else. The most
natural choice would be either /bin/sh or /rescue/sh. Unfortunately,
this does not work because the init process starts without stdin/out
descriptors.
Make it nicer to users by teaching /bin/sh startup code to open standard
descriptors on /dev/console if the shell is run as init.
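
A minimal sketch of the idea in standalone C, not the sh code itself. The function name is hypothetical, and detecting "run as init" by comparing the PID with 1 is an assumption; the console path is a parameter here so the helper can be exercised without touching the real console.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * If 'pid' is 1, open 'console' read-write and make it descriptors 0, 1
 * and 2. Returns 1 if the descriptors were set up, 0 if the process is
 * not init (nothing is touched), and -1 on error.
 */
static int
console_stdio_if_init(pid_t pid, const char *console)
{
	int fd, i;

	if (pid != 1)
		return (0);
	fd = open(console, O_RDWR);
	if (fd < 0)
		return (-1);
	for (i = 0; i <= 2; i++) {
		if (fd != i && dup2(fd, i) < 0) {
			if (fd > 2)
				close(fd);
			return (-1);
		}
	}
	/* Keep fd if it already is one of the standard descriptors. */
	if (fd > 2)
		close(fd);
	return (1);
}
```

A shell would call it early in startup, for example as console_stdio_if_init(getpid(), "/dev/console").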
Reviewed by: imp, jilles, zlei
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D56536
This makes it easier to downgrade the kernel when it stops providing some
syscall required by libc. In this case, it is enough to downgrade libc
as well, since our crt1 delegates all non-trivial work to
libc::__libc_start1(). With a static init, /sbin/init would need to be
downgraded as well, which might not be easy.
This does not mean that we support forward compatibility.
Reviewed by: imp, jilles, zlei
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D56536