Greg Kroah-Hartman [Thu, 9 Jan 2025 09:56:11 +0000 (10:56 +0100)]
Merge tag 'socfpga_firmware_update_for_v6.14' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into char-misc-next
Dinh writes:
SoCFPGA Firmware update for v6.14
- Use kthread_run_on_cpu()
* tag 'socfpga_firmware_update_for_v6.14' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
firmware: stratix10-svc: Use kthread_run_on_cpu()
Lukas Bulwahn [Wed, 8 Jan 2025 12:52:07 +0000 (13:52 +0100)]
scripts/spdxcheck: Handle license identifiers in Jinja comments
Commit 4b132aacb076 ("tools: Add xdrgen") adds a tool, which uses Jinja
template files, i.e., files with the j2 file extension, for its lightweight
code generation.
These template files for this tool have proper headers with the SPDX
License information, which are included as Jinja comments by enclosing the
text with '{#' and '#}'. Sofar, the spdxcheck script does not support to
properly parse this license information in Jinja comments and it reports
back with 'Invalid token: #}'.
Parse Jinja comments properly by stripping the known Jinja comment suffix.
Elizabeth Figura [Fri, 13 Dec 2024 19:35:11 +0000 (13:35 -0600)]
ntsync: No longer depend on BROKEN.
f5b335dc025cfee90957efa90dc72fada0d5abb4 ("misc: ntsync: mark driver as "broken"
to prevent from building") was committed to avoid the driver being used while
only part of its functionality was released. Since the rest of the functionality
has now been committed, revert this.
Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241213193511.457338-31-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Elizabeth Figura [Fri, 13 Dec 2024 19:35:08 +0000 (13:35 -0600)]
selftests: ntsync: Add a stress test for contended waits.
Test a more realistic usage pattern, and one with heavy contention, in order to
actually exercise ntsync's internal synchronization.
This test has several threads in a tight loop acquiring a mutex, modifying some
shared data, and then releasing the mutex. At the end we check if the data is
consistent.
Elizabeth Figura [Fri, 13 Dec 2024 19:35:04 +0000 (13:35 -0600)]
selftests: ntsync: Add some tests for auto-reset event state.
Test event-specific ioctls NTSYNC_IOC_EVENT_SET, NTSYNC_IOC_EVENT_RESET,
NTSYNC_IOC_EVENT_PULSE, NTSYNC_IOC_EVENT_READ for auto-reset events, and
waiting on auto-reset events.
Elizabeth Figura [Fri, 13 Dec 2024 19:35:03 +0000 (13:35 -0600)]
selftests: ntsync: Add some tests for manual-reset event state.
Test event-specific ioctls NTSYNC_IOC_EVENT_SET, NTSYNC_IOC_EVENT_RESET,
NTSYNC_IOC_EVENT_PULSE, NTSYNC_IOC_EVENT_READ for manual-reset events, and
waiting on manual-reset events.
Elizabeth Figura [Fri, 13 Dec 2024 19:35:02 +0000 (13:35 -0600)]
selftests: ntsync: Add some tests for wakeup signaling with WINESYNC_IOC_WAIT_ALL.
Test contended "wait-for-all" waits, to make sure that scheduling and wakeup
logic works correctly, and that the wait only exits once objects are all
simultaneously signaled.
Elizabeth Figura [Fri, 13 Dec 2024 19:34:59 +0000 (13:34 -0600)]
selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ANY.
Test basic synchronous functionality of NTSYNC_IOC_WAIT_ANY, when objects are
considered signaled or not signaled, and how they are affected by a successful
wait.
Elizabeth Figura [Fri, 13 Dec 2024 19:34:57 +0000 (13:34 -0600)]
selftests: ntsync: Add some tests for semaphore state.
Wine has tests for its synchronization primitives, but these are more accessible
to kernel developers, and also allow us to test some edge cases that Wine does
not care about.
This patch adds tests for semaphore-specific ioctls NTSYNC_IOC_SEM_POST and
NTSYNC_IOC_SEM_READ, and waiting on semaphores.
Elizabeth Figura [Fri, 13 Dec 2024 19:34:56 +0000 (13:34 -0600)]
ntsync: Introduce alertable waits.
NT waits can optionally be made "alertable". This is a special channel for
thread wakeup that is mildly similar to SIGIO. A thread has an internal single
bit of "alerted" state, and if a thread is alerted while an alertable wait, the
wait will return a special value, consume the "alerted" state, and will not
consume any of its objects.
Alerts are implemented using events; the user-space NT emulator is expected to
create an internal ntsync event for each thread and pass that event to wait
functions.
Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241213193511.457338-16-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Elizabeth Figura [Fri, 13 Dec 2024 19:34:52 +0000 (13:34 -0600)]
ntsync: Introduce NTSYNC_IOC_EVENT_PULSE.
This corresponds to the NT syscall NtPulseEvent().
This wakes up any waiters as if the event had been set, but does not set the
event, instead resetting it if it had been signalled. Thus, for a manual-reset
event, all waiters are woken, whereas for an auto-reset event, at most one
waiter is woken.
Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241213193511.457338-12-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Elizabeth Figura [Fri, 13 Dec 2024 19:34:49 +0000 (13:34 -0600)]
ntsync: Introduce NTSYNC_IOC_CREATE_EVENT.
This correspond to the NT syscall NtCreateEvent().
An NT event holds a single bit of state denoting whether it is signaled or
unsignaled.
There are two types of events: manual-reset and automatic-reset. When an
automatic-reset event is acquired via a wait function, its state is reset to
unsignaled. Manual-reset events are not affected by wait functions.
Whether the event is manual-reset, and its initial state, are specified at
creation time.
Elizabeth Figura [Fri, 13 Dec 2024 19:34:48 +0000 (13:34 -0600)]
ntsync: Introduce NTSYNC_IOC_MUTEX_KILL.
This does not correspond to any NT syscall. Rather, when a thread dies, it
should be called by the NT emulator for each mutex, with the TID of the dying
thread.
NT mutexes are robust (in the pthread sense). When an NT thread dies, any
mutexes it owned are immediately released. Acquisition of those mutexes by other
threads will return a special value indicating that the mutex was abandoned,
like EOWNERDEAD returned from pthread_mutex_lock(), and EOWNERDEAD is indeed
used here for that purpose.
Elizabeth Figura [Fri, 13 Dec 2024 19:34:47 +0000 (13:34 -0600)]
ntsync: Introduce NTSYNC_IOC_MUTEX_UNLOCK.
This corresponds to the NT syscall NtReleaseMutant().
This syscall decrements the mutex's recursion count by one, and returns the
previous value. If the mutex is not owned by the current task, the function
instead fails and returns -EPERM.
Elizabeth Figura [Fri, 13 Dec 2024 19:34:46 +0000 (13:34 -0600)]
ntsync: Introduce NTSYNC_IOC_CREATE_MUTEX.
This corresponds to the NT syscall NtCreateMutant().
An NT mutex is recursive, with a 32-bit recursion counter. When acquired via
NtWaitForMultipleObjects(), the recursion counter is incremented by one. The OS
records the thread which acquired it.
The OS records the thread which acquired it. However, in order to keep this
driver self-contained, the owning thread ID is managed by user-space, and passed
as a parameter to all relevant ioctls.
The initial owner and recursion count, if any, are specified when the mutex is
created.
Elizabeth Figura [Fri, 13 Dec 2024 19:34:45 +0000 (13:34 -0600)]
ntsync: Introduce NTSYNC_IOC_WAIT_ALL.
This is similar to NTSYNC_IOC_WAIT_ANY, but waits until all of the objects are
simultaneously signaled, and then acquires all of them as a single atomic
operation.
Because acquisition of multiple objects is atomic, some complex locking is
required. We cannot simply spin-lock multiple objects simultaneously, as that
may disable preëmption for a problematically long time.
Instead, modifying any object which may be involved in a wait-all operation takes
a device-wide sleeping mutex, "wait_all_lock", instead of the normal object
spinlock.
Because wait-for-all is a rare operation, in order to optimize wait-for-any,
this lock is only taken when necessary. "all_hint" is used to mark objects which
are involved in a wait-for-all operation, and if an object is not, only its
spinlock is taken.
The locking scheme used here was written by Peter Zijlstra.
Elizabeth Figura [Fri, 13 Dec 2024 19:34:44 +0000 (13:34 -0600)]
ntsync: Introduce NTSYNC_IOC_WAIT_ANY.
This corresponds to part of the functionality of the NT syscall
NtWaitForMultipleObjects(). Specifically, it implements the behaviour where
the third argument (wait_any) is TRUE, and it does not handle alertable waits.
Those features have been split out into separate patches to ease review.
This patch therefore implements the wait/wake infrastructure which comprises the
core of ntsync's functionality.
NTSYNC_IOC_WAIT_ANY is a vectored wait function similar to poll(). Unlike
poll(), it "consumes" objects when they are signaled. For semaphores, this means
decreasing one from the internal counter. At most one object can be consumed by
this function.
This wait/wake model is fundamentally different from that used anywhere else in
the kernel, and for that reason ntsync does not use any existing infrastructure,
such as futexes, kernel mutexes or semaphores, or wait_event().
Up to 64 objects can be waited on at once. As soon as one is signaled, the
object with the lowest index is consumed, and that index is returned via the
"index" field.
A timeout is supported. The timeout is passed as a u64 nanosecond value, which
represents absolute time measured against either the MONOTONIC or REALTIME clock
(controlled by the flags argument). If U64_MAX is passed, the ioctl waits
indefinitely.
This ioctl validates that all objects belong to the relevant device. This is not
necessary for any technical reason related to NTSYNC_IOC_WAIT_ANY, but will be
necessary for NTSYNC_IOC_WAIT_ALL introduced in the following patch.
Some padding fields are added for alignment and for fields which will be added
in future patches (split out to ease review).
Costa Shulyupin [Mon, 9 Dec 2024 08:29:57 +0000 (10:29 +0200)]
scripts/tags.sh: Tag timer definitions
For timer definitions like
DEFINE_TIMER(mytimer, mytimer_handler);
ctags generates tags `DEFINE_TIMER` and skips `mytimer`
because it doesn't expand the DEFINE_TIMER macro.
Configure ctags to generate tag for `mytimer`
ans skip the `DEFINE_TIMER` tag in such cases.
Vimal Agrawal [Mon, 21 Oct 2024 13:38:12 +0000 (13:38 +0000)]
misc: misc_minor_alloc to use ida for all dynamic/misc dynamic minors
misc_minor_alloc was allocating id using ida for minor only in case of
MISC_DYNAMIC_MINOR but misc_minor_free was always freeing ids
using ida_free causing a mismatch and following warn:
> > WARNING: CPU: 0 PID: 159 at lib/idr.c:525 ida_free+0x3e0/0x41f
> > ida_free called for id=127 which is not allocated.
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
...
> > [<60941eb4>] ida_free+0x3e0/0x41f
> > [<605ac993>] misc_minor_free+0x3e/0xbc
> > [<605acb82>] misc_deregister+0x171/0x1b3
misc_minor_alloc is changed to allocate id from ida for all minors
falling in the range of dynamic/ misc dynamic minors
Fixes: ab760791c0cf ("char: misc: Increase the maximum number of dynamic misc devices to 1048448") Signed-off-by: Vimal Agrawal <vimal.agrawal@sophos.com> Reviewed-by: Dirk VanDerMerwe <dirk.vandermerwe@sophos.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241021133812.23703-1-vimal.agrawal@sophos.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alyssa Ross [Thu, 19 Dec 2024 23:29:57 +0000 (00:29 +0100)]
VMCI: remove unused ioctl definitions
IOCTL_VMCI_SOCKETS_VERSION and IOCTL_VMCI_SOCKETS_GET_AF_VALUE were
never implemented, because VSOCK ended up being implemented as a
generic mechanism with a static AF value. Likewise,
IOCTL_VMCI_SOCKETS_GET_LOCAL_CID ended up being implemented as
IOCTL_VM_SOCKETS_GET_LOCAL_CID.
This isn't a UAPI header, so it should be fine to remove the unused
values. I've left a comment noting IOCTL_VM_SOCKETS_GET_LOCAL_CID is
in the VMCI range to avoid unintentional reuse.
Carlos Llamas [Mon, 6 Jan 2025 19:26:07 +0000 (19:26 +0000)]
binder: fix kernel-doc warning of 'file' member
The 'struct file' member in 'binder_task_work_cb' definition was renamed
to 'file' between patch versions but its kernel-doc reference kept the
old name 'fd'. Update the naming to fix the W=1 build warning.
Li Li [Wed, 18 Dec 2024 21:29:34 +0000 (13:29 -0800)]
binderfs: add new binder devices to binder_devices
When binderfs is not enabled, the binder driver parses the kernel
config to create all binder devices. All of the new binder devices
are stored in the list binder_devices.
When binderfs is enabled, the binder driver creates new binder devices
dynamically when userspace applications call BINDER_CTL_ADD ioctl. But
the devices created in this way are not stored in the same list.
Rodolfo Giometti [Fri, 8 Nov 2024 07:31:12 +0000 (08:31 +0100)]
drivers pps: add PPS generators support
Sometimes one needs to be able not only to catch PPS signals but to
produce them also. For example, running a distributed simulation,
which requires computers' clock to be synchronized very tightly.
This patch adds PPS generators class in order to have a well-defined
interface for these devices.
...followed by more symptoms of corruption, with similar stacks:
refcount_t: underflow; use-after-free.
kernel BUG at lib/list_debug.c:62!
Kernel panic - not syncing: Oops - BUG: Fatal exception
This happens because pps_device_destruct() frees the pps_device with the
embedded cdev immediately after calling cdev_del(), but, as the comment
above cdev_del() notes, fops for previously opened cdevs are still
callable even after cdev_del() returns. I think this bug has always
been there: I can't explain why it suddenly started happening every time
I reboot this particular board.
In commit d953e0e837e6 ("pps: Fix a use-after free bug when
unregistering a source."), George Spelvin suggested removing the
embedded cdev. That seems like the simplest way to fix this, so I've
implemented his suggestion, using __register_chrdev() with pps_idr
becoming the source of truth for which minor corresponds to which
device.
But now that pps_idr defines userspace visibility instead of cdev_add(),
we need to be sure the pps->dev refcount can't reach zero while
userspace can still find it again. So, the idr_remove() call moves to
pps_unregister_cdev(), and pps_idr now holds a reference to pps->dev.
anish kumar [Mon, 30 Dec 2024 14:33:54 +0000 (14:33 +0000)]
MAINTAINERS: add slimbus documentation
In the commit 202318d37613d264e30d71cc32ef442492d6d279
slimbus documentation was added but it missed
the update in this file. Currently get_maintainer script
is missing the main maintainer.
Théo Lebrun [Mon, 30 Dec 2024 14:30:30 +0000 (14:30 +0000)]
nvmem: rmem: add CRC validation for Mobileye EyeQ5 NVMEM
Mobileye EyeQ5 has a non-volatile memory region which
gets used to store MAC addresses. Its format includes
a prefix 12-byte header and a suffix 4-byte CRC.
Add an optional ->checksum() callback inside match data;
it runs CRC32 onto the content.
Théo Lebrun [Mon, 30 Dec 2024 14:30:28 +0000 (14:30 +0000)]
nvmem: rmem: make ->reg_read() straight forward code
memory_read_from_buffer() is a weird choice; it:
- is made for iteration with ppos a pointer.
- does futile error checking in our case.
- does NOT ensure we read exactly N bytes.
Replace it by:
1. A check that (offset + bytes) lands inside the region and,
2. a plain memcpy().
Both ->reg_read() and ->reg_write() return values are not easy to
deduce. Explicit that they should return zero on success (and negative
values otherwise).
Such callbacks, in some alternative world, could return the number of
bytes in the success case. That would be translated to errors in the
nvmem core because of checks like:
ret = nvmem->reg_write(nvmem->priv, offset, val, bytes);
if (ret) {
// error case
}
This mistake is not just theoretical, see commit 28b008751aa2 ("nvmem: rmem: Fix return value of rmem_read()").
On Mobileye EyeQ5, the bootloader will put MAC addresses into memory.
Declare that as reserved memory to be used by the kernel, exposing
nvmem cells. That region has a 12-byte header and a 4-byte trailing CRC.
Thomas Weißschuh [Mon, 30 Dec 2024 14:30:25 +0000 (14:30 +0000)]
nvmem: core: constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.
Also adapt the dynamic sysfs cell logic to handle the const attributes.
Thomas Weißschuh [Sat, 21 Dec 2024 14:48:15 +0000 (15:48 +0100)]
misc: ds1682: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.
Thomas Weißschuh [Sat, 21 Dec 2024 14:48:12 +0000 (15:48 +0100)]
misc: pch_phub: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.
Thomas Weißschuh [Sat, 21 Dec 2024 14:48:11 +0000 (15:48 +0100)]
misc: c2port: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.
Thomas Weißschuh [Sat, 21 Dec 2024 14:48:09 +0000 (15:48 +0100)]
misc: sram: constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.
Thomas Weißschuh [Sat, 21 Dec 2024 14:48:08 +0000 (15:48 +0100)]
cxl: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.
Thomas Weißschuh [Sat, 21 Dec 2024 14:48:07 +0000 (15:48 +0100)]
ocxl: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.
Carlos Llamas [Tue, 10 Dec 2024 14:31:05 +0000 (14:31 +0000)]
binder: use per-vma lock in page reclaiming
Use per-vma locking in the shrinker's callback when reclaiming pages,
similar to the page installation logic. This minimizes contention with
unrelated vmas improving performance. The mmap_sem is still acquired if
the per-vma lock cannot be obtained.
Carlos Llamas [Tue, 10 Dec 2024 14:31:04 +0000 (14:31 +0000)]
binder: propagate vm_insert_page() errors
Instead of always overriding errors with -ENOMEM, propagate the specific
error code returned by vm_insert_page(). This allows for more accurate
error logs and handling.
Carlos Llamas [Tue, 10 Dec 2024 14:31:03 +0000 (14:31 +0000)]
binder: use per-vma lock in page installation
Use per-vma locking for concurrent page installations, this minimizes
contention with unrelated vmas improving performance. The mmap_lock is
still acquired when needed though, e.g. before get_user_pages_remote().
Many thanks to Barry Song who posted a similar approach [1].
Link: https://lore.kernel.org/all/20240902225009.34576-1-21cnbao@gmail.com/ Cc: Nhat Pham <nphamcs@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Barry Song <v-songbaohua@oppo.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20241210143114.661252-8-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Carlos Llamas [Tue, 10 Dec 2024 14:31:02 +0000 (14:31 +0000)]
binder: rename alloc->buffer to vm_start
The alloc->buffer field in struct binder_alloc stores the starting
address of the mapped vma, rename this field to alloc->vm_start to
better reflect its purpose. It also avoids confusion with the binder
buffer concept, e.g. transaction->buffer.
Carlos Llamas [Tue, 10 Dec 2024 14:31:01 +0000 (14:31 +0000)]
binder: replace alloc->vma with alloc->mapped
It is unsafe to use alloc->vma outside of the mmap_sem. Instead, add a
new boolean alloc->mapped to save the vma state (mapped or unmmaped) and
use this as a replacement for alloc->vma to validate several paths.
Using the alloc->vma caused several performance and security issues in
the past. Now that it has been replaced with either vm_lookup() or the
alloc->mapped state, we can finally remove it.
Cc: Minchan Kim <minchan@kernel.org> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20241210143114.661252-6-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Carlos Llamas [Tue, 10 Dec 2024 14:31:00 +0000 (14:31 +0000)]
binder: store shrinker metadata under page->private
Instead of pre-allocating an entire array of struct binder_lru_page in
alloc->pages, install the shrinker metadata under page->private. This
ensures the memory is allocated and released as needed alongside pages.
By converting the alloc->pages[] into an array of struct page pointers,
we can access these pages directly and only reference the shrinker
metadata where it's being used (e.g. inside the shrinker's callback).
Rename struct binder_lru_page to struct binder_shrinker_mdata to better
reflect its purpose. Add convenience functions that wrap the allocation
and freeing of pages along with their shrinker metadata.
Note I've reworked this patch to avoid using page->lru and page->index
directly, as Matthew pointed out that these are being removed [1].
Carlos Llamas [Tue, 10 Dec 2024 14:30:59 +0000 (14:30 +0000)]
binder: select correct nid for pages in LRU
The numa node id for binder pages is currently being derived from the
lru entry under struct binder_lru_page. However, this object doesn't
reflect the node id of the struct page items allocated separately.
Instead, select the correct node id from the page itself. This was made
possible since commit 0a97c01cd20b ("list_lru: allow explicit memcg and
NUMA node selection").
Carlos Llamas [Tue, 10 Dec 2024 14:30:58 +0000 (14:30 +0000)]
binder: concurrent page installation
Allow multiple callers to install pages simultaneously by switching the
mmap_sem from write-mode to read-mode. Races to the same PTE are handled
using get_user_pages_remote() to retrieve the already installed page.
This method significantly reduces contention in the mmap semaphore.
To ensure safety, vma_lookup() is used (instead of alloc->vma) to avoid
operating on an isolated VMA. In addition, zap_page_range_single() is
called under the alloc->mutex to avoid racing with the shrinker.
Many thanks to Barry Song who posted a similar approach [1].
In preparation for concurrent page installations, restore the original
alloc->mutex which will serialize zap_page_range_single() against page
installations in subsequent patches (instead of the mmap_sem).
Resolved trivial conflicts with commit 2c10a20f5e84a ("binder_alloc: Fix
sleeping function called from invalid context") and commit da0c02516c50
("mm/list_lru: simplify the list_lru walk callback function").
Lee Jones [Mon, 23 Dec 2024 15:18:37 +0000 (15:18 +0000)]
misc: trivial: Remove undesired double space from struct definition
When one is too lazy to use an LSP to conduct look-ups on struct
definitions, one might use the ever useful `struct <name> {` search
string. However this doesn't work with `struct miscdevice {` because
of a stray double space. Assuming that this wasn't intentional, let's
simply remove it.
zhangheng [Fri, 20 Dec 2024 10:23:37 +0000 (18:23 +0800)]
w1: core: use sysfs_emit() instead of sprintf()
Follow the advice in Documentation/filesystems/sysfs.rst:
show() should only use sysfs_emit() or sysfs_emit_at() when formatting
the value to be returned to user space.
Linus Torvalds [Sun, 15 Dec 2024 23:33:41 +0000 (15:33 -0800)]
Merge tag 'efi-fixes-for-v6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
- Limit EFI zboot to GZIP and ZSTD before it comes in wider use
- Fix inconsistent error when looking up a non-existent file in
efivarfs with a name that does not adhere to the NAME-GUID format
- Drop some unused code
* tag 'efi-fixes-for-v6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi/esrt: remove esre_attribute::store()
efivarfs: Fix error on non-existent file
efi/zboot: Limit compression options to GZIP and ZSTD
Linus Torvalds [Sun, 15 Dec 2024 23:29:07 +0000 (15:29 -0800)]
Merge tag 'i2c-for-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"i2c host fixes: PNX used the wrong unit for timeouts, Nomadik was
missing a sentinel, and RIIC was missing rounding up"
* tag 'i2c-for-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: riic: Always round-up when calculating bus period
i2c: nomadik: Add missing sentinel to match table
i2c: pnx: Fix timeout in wait functions
Linus Torvalds [Sun, 15 Dec 2024 18:01:10 +0000 (10:01 -0800)]
Merge tag 'edac_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fix from Borislav Petkov:
- Make sure amd64_edac loads successfully on certain Zen4 memory
configurations
* tag 'edac_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/amd64: Simplify ECC check on unified memory controllers
Linus Torvalds [Sun, 15 Dec 2024 17:58:27 +0000 (09:58 -0800)]
Merge tag 'irq_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Disable the secure programming interface of the GIC500 chip in the
RK3399 SoC to fix interrupt priority assignment and even make a dead
machine boot again when the gic-v3 driver enables pseudo NMIs
- Correct the declaration of a percpu variable to fix several sparse
warnings
* tag 'irq_urgent_for_v6.13_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3: Work around insecure GIC integrations
irqchip/gic: Correct declaration of *percpu_base pointer in union gic_base
Linus Torvalds [Sun, 15 Dec 2024 17:38:03 +0000 (09:38 -0800)]
Merge tag 'sched_urgent_for_v6.13_rc3-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
- Prevent incorrect dequeueing of the deadline dlserver helper task and
fix its time accounting
- Properly track the CFS runqueue runnable stats
- Check the total number of all queued tasks in a sched fair's runqueue
hierarchy before deciding to stop the tick
- Fix the scheduling of the task that got woken last (NEXT_BUDDY) by
preventing those from being delayed
* tag 'sched_urgent_for_v6.13_rc3-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/dlserver: Fix dlserver time accounting
sched/dlserver: Fix dlserver double enqueue
sched/eevdf: More PELT vs DELAYED_DEQUEUE
sched/fair: Fix sched_can_stop_tick() for fair tasks
sched/fair: Fix NEXT_BUDDY
Linus Torvalds [Sun, 15 Dec 2024 17:26:13 +0000 (09:26 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM64:
- Fix confusion with implicitly-shifted MDCR_EL2 masks breaking
SPE/TRBE initialization
- Align nested page table walker with the intended memory attribute
combining rules of the architecture
- Prevent userspace from constraining the advertised ASID width,
avoiding horrors of guest TLBIs not matching the intended context
in hardware
- Don't leak references on LPIs when insertion into the translation
cache fails
RISC-V:
- Replace csr_write() with csr_set() for HVIEN PMU overflow bit
x86:
- Cache CPUID.0xD XSTATE offsets+sizes during module init
On Intel's Emerald Rapids CPUID costs hundreds of cycles and there
are a lot of leaves under 0xD. Getting rid of the CPUIDs during
nested VM-Enter and VM-Exit is planned for the next release, for
now just cache them: even on Skylake that is 40% faster"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Cache CPUID.0xD XSTATE offsets+sizes during module init
RISC-V: KVM: Fix csr_write -> csr_set for HVIEN PMU overflow bit
KVM: arm64: vgic-its: Add error handling in vgic_its_cache_translation
KVM: arm64: Do not allow ID_AA64MMFR0_EL1.ASIDbits to be overridden
KVM: arm64: Fix S1/S2 combination when FWB==1 and S2 has Device memory type
arm64: Fix usage of new shifted MDCR_EL2 values
Linus Torvalds [Sat, 14 Dec 2024 23:53:02 +0000 (15:53 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"Single one-line fix in the ufs driver"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Update compl_time_stamp_local_clock after completing a cqe
Linus Torvalds [Sat, 14 Dec 2024 20:58:14 +0000 (12:58 -0800)]
Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Daniel Borkmann:
- Fix a bug in the BPF verifier to track changes to packet data
property for global functions (Eduard Zingerman)
- Fix a theoretical BPF prog_array use-after-free in RCU handling of
__uprobe_perf_func (Jann Horn)
- Fix BPF tracing to have an explicit list of tracepoints and their
arguments which need to be annotated as PTR_MAYBE_NULL (Kumar
Kartikeya Dwivedi)
- Fix a logic bug in the bpf_remove_insns code where a potential error
would have been wrongly propagated (Anton Protopopov)
- Avoid deadlock scenarios caused by nested kprobe and fentry BPF
programs (Priya Bala Govindasamy)
- Fix a bug in BPF verifier which was missing a size check for
BTF-based context access (Kumar Kartikeya Dwivedi)
- Fix a crash found by syzbot through an invalid BPF prog_array access
in perf_event_detach_bpf_prog (Jiri Olsa)
- Fix several BPF sockmap bugs including a race causing a refcount
imbalance upon element replace (Michal Luczaj)
- Fix a use-after-free from mismatching BPF program/attachment RCU
flavors (Jann Horn)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (23 commits)
bpf: Avoid deadlock caused by nested kprobe and fentry bpf programs
selftests/bpf: Add tests for raw_tp NULL args
bpf: Augment raw_tp arguments with PTR_MAYBE_NULL
bpf: Revert "bpf: Mark raw_tp arguments with PTR_MAYBE_NULL"
selftests/bpf: Add test for narrow ctx load for pointer args
bpf: Check size for BTF-based ctx access of pointer members
selftests/bpf: extend changes_pkt_data with cases w/o subprograms
bpf: fix null dereference when computing changes_pkt_data of prog w/o subprogs
bpf: Fix theoretical prog_array UAF in __uprobe_perf_func()
bpf: fix potential error return
selftests/bpf: validate that tail call invalidates packet pointers
bpf: consider that tail calls invalidate packet pointers
selftests/bpf: freplace tests for tracking of changes_packet_data
bpf: check changes_pkt_data property for extension programs
selftests/bpf: test for changing packet data from global functions
bpf: track changes_pkt_data property for global functions
bpf: refactor bpf_helper_changes_pkt_data to use helper number
bpf: add find_containing_subprog() utility function
bpf,perf: Fix invalid prog_array access in perf_event_detach_bpf_prog
bpf: Fix UAF via mismatching bpf_prog/attachment RCU flavors
...
bpf: Avoid deadlock caused by nested kprobe and fentry bpf programs
BPF program types like kprobe and fentry can cause deadlocks in certain
situations. If a function takes a lock and one of these bpf programs is
hooked to some point in the function's critical section, and if the
bpf program tries to call the same function and take the same lock it will
lead to deadlock. These situations have been reported in the following
bug reports.
The bugs reported by syzbot are due to tracepoint bpf programs being
called in the critical sections. This patch does not aim to fix deadlocks
caused by tracepoint programs. However, it does prevent deadlocks from
occurring in similar situations due to kprobe and fentry programs.
Linus Torvalds [Sat, 14 Dec 2024 17:31:19 +0000 (09:31 -0800)]
Merge tag 'tty-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull serial driver fixes from Greg KH:
"Here are two small serial driver fixes for 6.13-rc3. They are:
- ioport build fallout fix for the 8250 port driver that should
resolve Guenter's runtime problems
- sh-sci driver bugfix for a reported problem
Both of these have been in linux-next for a while with no reported
issues"
* tag 'tty-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: serial: Work around warning backtrace in serial8250_set_defaults
serial: sh-sci: Check if TX data was written to device in .tx_empty()
Linus Torvalds [Sat, 14 Dec 2024 17:27:07 +0000 (09:27 -0800)]
Merge tag 'staging-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are some small staging gpib driver build and bugfixes for issues
that have been much-reported (should finally fix Guenter's build
issues). There are more of these coming in later -rc releases, but for
now this should fix the majority of the reported problems.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: gpib: Fix i386 build issue
staging: gpib: Fix faulty workaround for assignment in if
staging: gpib: Workaround for ppc build failure
staging: gpib: Make GPIB_NI_PCI_ISA depend on HAS_IOPORT
Linus Torvalds [Sat, 14 Dec 2024 17:15:49 +0000 (09:15 -0800)]
Merge tag 'v6.13-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"Fix a regression in rsassa-pkcs1 as well as a buffer overrun in
hisilicon/debugfs"
* tag 'v6.13-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: hisilicon/debugfs - fix the struct pointer incorrectly offset problem
crypto: rsassa-pkcs1 - Copy source data for SG list
Linus Torvalds [Sat, 14 Dec 2024 16:51:43 +0000 (08:51 -0800)]
Merge tag 'rust-fixes-6.13' of https://github.com/Rust-for-Linux/linux
Pull rust fixes from Miguel Ojeda:
"Toolchain and infrastructure:
- Set bindgen's Rust target version to prevent issues when
pairing older rustc releases with newer bindgen releases,
such as bindgen >= 0.71.0 and rustc < 1.82 due to
unsafe_extern_blocks.
drm/panic:
- Remove spurious empty line detected by a new Clippy warning"
* tag 'rust-fixes-6.13' of https://github.com/Rust-for-Linux/linux:
rust: kbuild: set `bindgen`'s Rust target version
drm/panic: remove spurious empty line to clean warning
Linus Torvalds [Sat, 14 Dec 2024 16:41:40 +0000 (08:41 -0800)]
Merge tag 'iommu-fixes-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fixes from Joerg Roedel:
- Per-domain device-list locking fixes for the AMD IOMMU driver
- Fix incorrect use of smp_processor_id() in the NVidia-specific part
of the ARM-SMMU-v3 driver
- Intel IOMMU driver fixes:
- Remove cache tags before disabling ATS
- Avoid draining PRQ in sva mm release path
- Fix qi_batch NULL pointer with nested parent domain
* tag 'iommu-fixes-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommu/vt-d: Avoid draining PRQ in sva mm release path
iommu/vt-d: Fix qi_batch NULL pointer with nested parent domain
iommu/vt-d: Remove cache tags before disabling ATS
iommu/amd: Add lockdep asserts for domain->dev_list
iommu/amd: Put list_add/del(dev_data) back under the domain->lock
iommu/tegra241-cmdqv: do not use smp_processor_id in preemptible context
Linus Torvalds [Sat, 14 Dec 2024 16:40:05 +0000 (08:40 -0800)]
Merge tag 'ata-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fix from Damien Le Moal:
- Fix an OF node reference leak in the sata_highbank driver
* tag 'ata-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: sata_highbank: fix OF node reference leak in highbank_initialize_phys()
Thomas Weißschuh [Wed, 11 Dec 2024 17:50:27 +0000 (18:50 +0100)]
w1: ds28e04: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.
Thomas Weißschuh [Wed, 11 Dec 2024 17:50:26 +0000 (18:50 +0100)]
w1: ds2805: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.