Set the Fine-grained trap registers to trap any features we don't
support. These are expected to be more useful when we support nested
virtualisation, so for now just the base features and GICv3 are not
trapped.
As nested virtualisation will require VHE we only set the fine-grained
trap registers when VHE is used.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D54687
Add the macros and detection for Fine-grained traps (FEAT_FGT and
FEAT_FGT2).
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D54686
This allows us to also use the common VIRTIO_SIMPLE_PROBE and to have
devmatch load the driver when detected.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D54684
On arm64 we can use the "dc zva" instruction to zero memory. The CPU
tells software if the instruction is implemented, and if so the size
and alignment it will use.
When the size is 64-bytes the Arm Optimized Routines implementation of
memset can use dc zva to zero memory, and has a build flag to skip
checking.
Use this flag to build a version of memset that will be used when this
assumption is true.
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D54776
This allows static binaries to only include the functions they
reference.
Reviewed by: imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D54775
If we update Makefile.inc it may be to change the contents of these
files.
Reviewed by: imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D54774
Another boolean, indicating hardware support, will be introduced in next commit.
Thanks to the previous commit modifying sysctl_handle_bool(), this
change is backwards-compatible with old programs using an integer in and
out of sysctl(3).
Reviewed by: obiwac
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54626
Setting up the sysctl tree later:
1. Fixes not de-registering sysctl knobs on failure to attach.
2. Avoids having inconsistent knob values exposed during a brief moment.
Reviewed by: imp, obiwac
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54926
In sysctl_handle_bool(), if the output buffer (for the old value) has
room for exactly 4 bytes (sizeof(int)), then output the current boolean
value as an integer rather than a 'uint8_t'. Conversely, if 4 bytes
exactly remain in the input buffer (for the new value), treat them as an
integer and derive the new boolean value from it.
Doing so allows to convert existing integer syscstl knobs that are
interpreted as a boolean into true boolean ones while staying
backwards-compatible.
That brings no drawback as no code currently uses sysctl_handle_bool()
as part of a series of calls to sysctl_handle_*() functions for
(de)serialization of some compound structure. If that case ever
materializes, it can be easily solved, e.g., by creating
a sysctl_handle_bool_strict() variant.
In the future, we might want to go further and generally be more liberal
in the external type of integers we accept and output, by tolerating any
kind of supported integers (8-bit to 64-bit), enabling integer type
changes of knob's internal representations without breaking the ABI for
consumers hardcoding the passed integers (instead of relying on sysctl
knob type information).
Reviewed by: jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54925
Sockets that implement their own socket buffers (marked with PR_SOCKBUF)
are now also responsible for initialization of socket buffer mutexes in
pr_attach and for destruction in pr_detach (or pr_close).
This removes a big bunch of reported LORs, as now WITNESS is able to see
that tcp(4) socket buffer mutex and netlink(4) socket buffer mutex are two
different things. Distinct names also improve diagnostics for blocked
threads.
This also removes a hack from unix(4), where we used to mtx_destroy().
Also removes an innocent bug from unix(4) where for accept(2)-ed socket
soreserve() was called twice. This one was innocent since first call to
soreserve() was asking for 0 bytes of space.
This slightly increased amount of pasted code in TCP's syncache_socket().
The problem is that while for sockets created with socket(2) it is
pr_attach responsible for call to soreserve() (including !PR_SOCKBUF
protocols), but for the sockets created with accept(2) it was
solisten_clone() doing soreserve(), combined with the fact that for
accept(2) TCP completely bypasses pr_attach. This all should improve once
TCP has its own socket buffers.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D54984
Due to this all the rest of the items in the Built-in Commands section
were not rendered at all.
Fixes: 2711852bd9 ("sh.1: Provide detailed job control documentation")
MFC after: 3 days
Reviewed by: emaste, ziaee
Differential Revision: https://reviews.freebsd.org/D55080
On AMD processors libpmc was using the topic field (based on filename) to
determine the counter's subclass. Unfortunately, the JSON definitions for
AMD Zen 1-4 have the L3 counters in files shared with other counters.
This change has libpmc to use the pmu field (which is derived from the Unit
field in JSON) to determine the correct counter subclass.
Reviewed by: mhorne
MFC after: 2 weeks
Sponsored by: Netflix
Pull Request: https://github.com/freebsd/freebsd-src/pull/1984
There are no plans to allow multiple sizes of HPT superpages, so just use a
constant for it.
MFC after: 3 weeks
Fixes: 1bc75d77e9 ("powerpc/pmap/oea64: Make PV_LOCK superpage sized")
When trying to unregister a fictitious range in
`vm_phys_fictitious_unreg_range()`, the function checks the properties
of the looked up segment, but it does not check if a segment was found
in the first place.
This can happen with the amdgpu DRM driver which could call
`vm_phys_fictitious_unreg_range()` without a fictitious range registered
if the initialisation of the driver failed (for example because
firmwares are unavailable).
The code in the DRM driver was improved to avoid that, but
`vm_phys_fictitious_unreg_range()` should still check the return value
of `RB_FIND()` before trying to dereference the segment pointer and
panic with a page fault.
Reviewed by: emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D55076
Mainly, avoid reusing the name of one of the functions we should be
testing (but aren't) for local variables.
Sponsored by: Klara, Inc.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D55054
To be closer to AMD's official terminology, except for the "Lowest
Non-Linear Performance" field which we label as 'EFFICIENT_PERF' closer
to Intel's ("Most Efficient Performance"), and to clear possible
confusion.
No functional change (intended).
Reviewed by: aokblast
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54998
While here, rename an argument of BITS_VALUE() to be consistent with the
other macros.
Reviewed by: aokblast
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54997
On MSR_OP_LOCAL and non-naturally-atomic operations (MSR_OP_ANDNOT and
MSR_OP_OR), there is no guarantee that we are not interrupted between
reading and writing the MSR, and that interruption could actually
perform some operation on that MSR, which would be lost.
Prevent that problem by temporarily disabling interrupts around MSR
manipulation.
Reviewed by: kib
Discussed with: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54996
Simplify them by moving them into more natural places, i.e., default
cases of 'switch' statements.
No functional change (intended).
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54996
The Pluton startmethod uses a simple doorbell mechanism to wakeup the
TPM unit after we've issued various forms of state change, with the
registers to use specified in the startmethod-specific segment of the
TPM2 table (up to 12 bytes after the StartMethod).
At the very least, this is the kind of TPM in use by my AMD Zen 4-based
Minisforum machine.
Differential Revision: https://reviews.freebsd.org/D53683
HPT superpages are 16MB, not 2MB. Taking 8 locks to lock a super page
almost defeats the purpose of using the super page. Expanding the
PV_LOCK scope to cover 16MB (24 bit shift) reduces this to a single
lock.
MFC after: 3 weeks
Add POWER10 and POWER11 to the list of lockless TLBIE capable CPUs.
According to Linux, anything POWER5 and later should be able to do this,
but that hasn't been tested with FreeBSD. POWER10 and POWER11, being
derived after the POWER9, implicitly have this capability per the ISA
spec.
MFC after: 1 week
Move swi_remove() call before acquiring the tty lock. swi_remove() calls
intr_event_remove_handler() which may sleep via msleep(), causing a lock
order violation when called with the tty mutex held.
The software interrupt handler removal operates on the interrupt event
structure independently and does not require the tty lock. This matches
the pattern used in other drivers such as tcp_hpts.c where swi_remove()
is called without holding other locks.
Reviewed by: imp, kevans
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D54953
The tmp rc script has much the same problem that the var does: it wants
to test if /tmp is writable, and mount a tmpfs if it's not. This means
that we actually want our zfs datasets mounted first, because we might
have a /tmp dataset that changes the story.
The ordering problem is particularly noticable with a r/o zfs root,
since the write test will fail and we'll mount a tmpfs that later gets
covered by our /tmp dataset. If that /tmp dataset inherited readonly,
then we're still in trouble.
This also fixes `tmpmfs=yes`, which would again get covered by a zfs
dataset with the existing ordering.
Reviewed by: des
Differential Revision: https://reviews.freebsd.org/D54995
Move it back from kern.sched.ule.topology_spec.
Make it scheduler-agnostic.
Provide trivial report for UP kernels.
Apparently the MIB is used by some third-party software. Obviously it
did not worked on UP or 4BSD configs.
PR: 292574
Reviewed by: olce
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D55062
Mac firmware hides the Intel integrated GPU (iGPU) on dual GPU x86
systems, i.e., with AMD/NVIDIA dGPUs, when the Darwin OSI is not
installed via ACPI.
Prior to this change, FreeBSD always used the dGPU. This is fine in
practice, but consumed more power than when the iGPU is used,
resulting in reduced battery life.
Linux handles this in `drivers/acpi/osi.c` by detecting Apple
hardware via DMI, disabling all Windows OSI strings, and
by explicitly installing the Darwin OSI ACPI handler. This change
applies equivalent logic to the acpi(4) driver on FreeBSD.
This feature can be enabled/disabled using the
`hw.acpi.apple_darwin_osi` tunable. Setting this tunable to `0`
restores the previous behavior by explicitly disabling the added
support.
Reviewed by: obiwac, ngie, adrian
Differential Revision: https://reviews.freebsd.org/D54762
SPMC suspend runs after the device tree is suspended using the
acpi_post_dev_suspend eventhandler, and SPMC resume runs before the
device tree is resumed using the acpi_pre_dev_suspend eventhandler.
Reviewed by: olce
Approved by: olce
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D48735
These eventhandlers are called after suspending the device tree and
before resuming it. This is useful for PMC (power management controller)
drivers.
Reviewed by: olce
Approved by: olce
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D48735
Lionel Cons <lionelcons1972@gmail.com> requested
that a new option be added to runat(1) so that it could
be used to manipulate named attributes associated with
a symbolic link and not the file the symbolic link refers to).
This patch adds the option -h/--nofollow to do this.
Requested by: Lionel Cons <lionelcons1972@gmail.com>
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D55023
As part of making the chip-specific mix and match of different accesses
(DMA/bus space) work as desired, the intent is to map the HCB memory as
uncacheable. Prior to VM_MEMATTR_*, the !x86 way of indicating this to
bus_dmamem_alloc(9) was BUS_DMA_COHERENT. Then later on in 2db99100a4,
BUS_DMA_NOCACHE was hooked up to VM_MEMATTR_UNCACHEABLE for x86. As it
turns out, still as of today bus_dmamem_alloc(9) differs in this regard
across architectures. On arm, it still supports BUS_DMA_COHERENT only
for requesting uncacheable DMA and x86 still uses BUS_DMA_NOCACHE only.
On arm64 and riscv, BUS_DMA_COHERENT seems to effectively be an alias
for BUS_DMA_NOCACHE.
Thus, allocate the HCB memory with BUS_DMA_COHERENT | BUS_DMA_NOCACHE,
so we get uncacheable memory on all architectures including x86 and so
loads and stores from/to HCB won't get reordered. However, even on x86
we still need to use at least compiler barriers to achieve the desired
program order.
This change should also fix panics due to out-of-sync data seen with
FreeBSD VMs on top of OpenStack and HBAs of type lsiLogic as a result
of loads and stores getting reordered. [1]
While at it:
- Nuke the unused SYM_DRIVER_NAME macro.
- Remove unused/redundant HCB members and correct a comment typo.
PR: 270816 [1]
MFC after: 3 days
It takes exactly three arguments of known type.
Tweak the types of various resultproc_t functions to match the type (mostly
added const to struct pointers) allowing us to drop casts.
Effort: CHERI upstreaming
Reviewed by: vangyzen, glebius
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D54941
The `eachresult` argument is documented to take a function pointer of
type:
bool_t (*)(caddr_t, struct sockaddr_in *)
It was declared to take a resultproc_t which has historically been
declared to be:
bool_t (*resultproc_t)(caddr_t, ...);
This overlapped well enough for currently supported ABIs where variadic
arguments are passed in registers, but this declaration is misaligned
with the documentation (resultproc_t takes three arguments) and will be
fixed in a followup commit.
Fix the type to be non-variadic, matching callbacks, and define a
convenience type of as most callbacks take something other than a char *
as their first argument and need to be cast.
Effort: CHERI upstreaming
Reviewed by: ngie, glebius, jhb
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D54940
Replace use of thr_getspecific/thr_setspecific to stash the function
pointer we're smuggling between clnt_broadcast and rpc_wrap_bcast with a
simple thread local variable. Clear it after use so the reference
doesn't linger.
In the relatively unlikely event clnt_broadcast was called from threads
that exited prior to program termination, the previous code called free
on a function pointer, which is undefined and might corrupted allocator
state.
Effort: CHERI upstreaming
Reviewed by: glebius, jhb
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D54939
The lock around dev_clone is unfortunate because cloner might need to
take its own locks that establish the order with devfs vnodes, and then
transiently participates in further VFS locks order. For instance, this
way the proctree_lock or allproc_lock become involved.
Unlock dvp, we can unwind if the vnode become doomed while cloner was
called.
Reported and tested by: pho
Reviewed by: kevans, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D55028
When inlining the macro, reg was not substituted with the %ecx argument
previously passed in. One of the definitions was also left behind as an
empty macro.
PR: 292883
Fixes: 377c053a43 ("cpu_switch(): unconditionally wait on the blocked mutex transient")
MFC after: 1 week