Liam R. Howlett [Sun, 8 Dec 2024 02:29:07 +0000 (21:29 -0500)]
maple_tree: Convert forking to use the sheaf interface
Use the generic interface which should result in less bulk allocations
during a forking.
A part of this is to abstract the freeing of the sheaf or maple state
allocations into its own function so mas_destroy() and the tree
duplication code can use the same functionality to return any unused
resources.
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Liam R. Howlett [Fri, 6 Dec 2024 20:08:19 +0000 (15:08 -0500)]
maple_tree: Add single node allocation support to maple state
The fast path through a write will require replacing a single node in
the tree. Using a sheaf (32 nodes) is too heavy for the fast path, so
special case the node store operation by just allocating one node in the
maple state.
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Use sheaves instead of bulk allocations. This should speed up the
allocations and the return path of unused allocations.
Remove push/pop of nodes from maple state.
Remove unnecessary testing
ifdef out other testing that probably will be deleted
Fix testcase for testing race
Move some testing around in the same commit.
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Liam R. Howlett [Mon, 27 Jan 2025 20:59:53 +0000 (15:59 -0500)]
testing/raix-tree/maple: Increase readers and reduce delay for faster machines
Add more threads and reduce the timing of the readers to increase the
possibility of catching the rcu changes. The test does not pass unless
the reader is seen.
Signed-off-by: Liam R. Howlett <howlett@gmail.com>
In order to support rebalancing and spanning stores using less than the
worst case number of nodes, we need to track more than just the vacant
height. Using only vacant height to reduce the worst case maple node
allocation count can lead to a shortcoming of nodes in the following
scenarios.
For rebalancing writes, when a leaf node becomes insufficient, it may be
combined with a sibling into a single node. This means that the parent node
which has entries for this children will lose one entry. If this parent node
was just meeting the minimum entries, losing one entry will now cause this
parent node to be insufficient. This leads to a cascading operation of
rebalancing at different levels and can lead to more node allocations than
simply using vacant height can return.
For spanning writes, a similar situation occurs. At the location at
which a spanning write is detected, the number of ancestor nodes may
similarly need to rebalanced into a smaller number of nodes and the same
cascading situation could occur.
To use less than the full height of the tree for the number of
allocations, we also need to track the height at which a non-leaf node
cannot become insufficient. This means even if a rebalance occurs to a
child of this node, it currently has enough entries that it can lose one
without any further action. This field is stored in the maple write state
as sufficient height. In mas_prealloc_calc() when figuring out how many
nodes to allocate, we check if the vacant node is lower in the tree
than a sufficient node (has a larger value). If it is, we cannot use the
vacant height and must use the difference in the height and sufficient
height as the basis for the number of nodes needed.
An off by one bug was also discovered in mast_overflow() where it is
using >= rather than >. This caused extra iterations of the
mas_spanning_rebalance() loop and lead to unneeded allocations. A test
is also added to check the number of allocations is correct.
maple_tree: break on convergence in mas_spanning_rebalance()
This allows support for using the vacant height to calculate the worst
case number of nodes needed for wr_rebalance operation.
mas_spanning_rebalance() was seen to perform unnecessary node allocations.
We can reduce allocations by breaking early during the rebalancing loop
once we realize that we have ascended to a common ancestor.
Suggested-by: Liam Howlett <liam.howlett@oracle.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
maple_tree: use vacant nodes to reduce worst case allocations
In order to determine the store type for a maple tree operation, a walk
of the tree is done through mas_wr_walk(). This function descends the
tree until a spanning write is detected or we reach a leaf node. While
descending, keep track of the height at which we encounter a node with
available space. This is done by checking if mas->end is less than the
number of slots a given node type can fit.
Now that the height of the vacant node is tracked, we can use the
difference between the height of the tree and the height of the vacant
node to know how many levels we will have to propagate creating new
nodes. Update mas_prealloc_calc() to consider the vacant height and
reduce the number of worst-case allocations.
Rebalancing and spanning stores are not supported and fall back to using
the full height of the tree for allocations.
Update preallocation testing assertions to take into account vacant
height.
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
For the maple tree, the root node is defined to have a depth of 0 with a
height of 1. Each level down from the node, these values are incremented
by 1. Various code paths define a root with depth 1 which is inconsisent
with the definition. Modify the code to be consistent with this
definition.
maple_tree: convert mas_prealloc_calc() to take in a maple write state
In a subsequent patch, mas_prealloc_calc() will need to access fields only
in the ma_wr_state. Convert the function to take in a ma_wr_state and
modify all callers. There is no functional change.
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Vlastimil Babka [Mon, 7 Aug 2023 18:52:02 +0000 (20:52 +0200)]
maple_tree: use percpu sheaves for maple_node_cache
Setup the maple_node_cache with percpu sheaves of size 32 to hopefully
improve its performance. Change the single node rcu freeing in
ma_free_rcu() to use kfree_rcu() instead of the custom callback, which
allows the rcu_free sheaf batching to be used. Note there are other
users of mt_free_rcu() where larger parts of maple tree are submitted to
call_rcu() as a whole, and that cannot use the rcu_free sheaf, but it's
still possible for maple nodes freed this way to be reused via the barn,
even if only some cpus are allowed to process rcu callbacks.
Liam R. Howlett [Mon, 2 Dec 2024 21:33:10 +0000 (16:33 -0500)]
tools: Add testing support for changes to rcu and slab for sheaves
Make testing work for the slab and rcu changes that have come in with
the sheaves work.
This only works with one kmem_cache, and only the first one used.
Subsequent setting of kmem_cache will not update the active kmem_cache
and will be silently dropped because there are other tests which happen
after the kmem_cache of interest is set.
The saved active kmem_cache is used in the rcu callback, which passes
the object to be freed.
The rcu call takes the rcu_head, which is passed in as the field in the
struct (in this case rcu in the maple tree node), which is calculated by
pointer math. The offset of which is saved (in a global variable) for
restoring the node pointer on the callback after the rcu grace period
expires.
Don't use any of this outside of testing, please.
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Vlastimil Babka [Thu, 13 Feb 2025 11:50:34 +0000 (12:50 +0100)]
slab: determine barn status racily outside of lock
The possibility of many barn operations is determined by the current
number of full or empty sheaves. Taking the barn->lock just to find out
that e.g. there are no empty sheaves results in unnecessary overhead and
lock contention. Thus perform these checks outside of the lock with a
data_race() annotated variable read and fail quickly without taking the
lock.
Checks for sheaf availability that racily succeed have to be obviously
repeated under the lock for correctness, but we can skip repeating
checks if there are too many sheaves on the given list as the limits
don't need to be strict.
Vlastimil Babka [Tue, 5 Nov 2024 16:00:08 +0000 (17:00 +0100)]
slab: sheaf prefilling for guaranteed allocations
Add functions for efficient guaranteed allocations e.g. in a critical
section that cannot sleep, when the exact number of allocations is not
known beforehand, but an upper limit can be calculated.
kmem_cache_prefill_sheaf() returns a sheaf containing at least given
number of objects.
kmem_cache_alloc_from_sheaf() will allocate an object from the sheaf
and is guaranteed not to fail until depleted.
kmem_cache_return_sheaf() is for giving the sheaf back to the slab
allocator after the critical section. This will also attempt to refill
it to cache's sheaf capacity for better efficiency of sheaves handling,
but it's not stricly necessary to succeed.
kmem_cache_refill_sheaf() can be used to refill a previously obtained
sheaf to requested size. If the current size is sufficient, it does
nothing. If the requested size exceeds cache's sheaf_capacity and the
sheaf's current capacity, the sheaf will be replaced with a new one,
hence the indirect pointer parameter.
kmem_cache_sheaf_size() can be used to query the current size.
The implementation supports requesting sizes that exceed cache's
sheaf_capacity, but it is not efficient - such sheaves are allocated
fresh in kmem_cache_prefill_sheaf() and flushed and freed immediately by
kmem_cache_return_sheaf(). kmem_cache_refill_sheaf() might be especially
ineffective when replacing a sheaf with a new one of a larger capacity.
It is therefore better to size cache's sheaf_capacity accordingly.
Vlastimil Babka [Wed, 12 Mar 2025 16:10:18 +0000 (17:10 +0100)]
slab: add sheaf support for batching kfree_rcu() operations
Extend the sheaf infrastructure for more efficient kfree_rcu() handling.
For caches with sheaves, on each cpu maintain a rcu_free sheaf in
addition to main and spare sheaves.
kfree_rcu() operations will try to put objects on this sheaf. Once full,
the sheaf is detached and submitted to call_rcu() with a handler that
will try to put it in the barn, or flush to slab pages using bulk free,
when the barn is full. Then a new empty sheaf must be obtained to put
more objects there.
It's possible that no free sheaves are available to use for a new
rcu_free sheaf, and the allocation in kfree_rcu() context can only use
GFP_NOWAIT and thus may fail. In that case, fall back to the existing
kfree_rcu() machinery.
Expected advantages:
- batching the kfree_rcu() operations, that could eventually replace the
existing batching
- sheaves can be reused for allocations via barn instead of being
flushed to slabs, which is more efficient
- this includes cases where only some cpus are allowed to process rcu
callbacks (Android)
Possible disadvantage:
- objects might be waiting for more than their grace period (it is
determined by the last object freed into the sheaf), increasing memory
usage - but the existing batching does that too?
Only implement this for CONFIG_KVFREE_RCU_BATCHED as the tiny
implementation favors smaller memory footprint over performance.
Vlastimil Babka [Wed, 15 Nov 2023 10:38:15 +0000 (11:38 +0100)]
slab: add opt-in caching layer of percpu sheaves
Specifying a non-zero value for a new struct kmem_cache_args field
sheaf_capacity will setup a caching layer of percpu arrays called
sheaves of given capacity for the created cache.
Allocations from the cache will allocate via the percpu sheaves (main or
spare) as long as they have no NUMA node preference. Frees will also
refill one of the sheaves.
When both percpu sheaves are found empty during an allocation, an empty
sheaf may be replaced with a full one from the per-node barn. If none
are available and the allocation is allowed to block, an empty sheaf is
refilled from slab(s) by an internal bulk alloc operation. When both
percpu sheaves are full during freeing, the barn can replace a full one
with an empty one, unless over a full sheaves limit. In that case a
sheaf is flushed to slab(s) by an internal bulk free operation. Flushing
sheaves and barns is also wired to the existing cpu flushing and cache
shrinking operations.
The sheaves do not distinguish NUMA locality of the cached objects. If
an allocation is requested with kmem_cache_alloc_node() with a specific
node (not NUMA_NO_NODE), sheaves are bypassed.
The bulk operations exposed to slab users also try to utilize the
sheaves as long as the necessary (full or empty) sheaves are available
on the cpu or in the barn. Once depleted, they will fallback to bulk
alloc/free to slabs directly to avoid double copying.
Sysfs stat counters alloc_cpu_sheaf and free_cpu_sheaf count objects
allocated or freed using the sheaves. Counters sheaf_refill,
sheaf_flush_main and sheaf_flush_other count objects filled or flushed
from or to slab pages, and can be used to assess how effective the
caching is. The refill and flush operations will also count towards the
usual alloc_fastpath/slowpath, free_fastpath/slowpath and other
counters.
Access to the percpu sheaves is protected by localtry_trylock() when
potential callers include irq context, and localtry_lock() otherwise
(such as when we already know the gfp flags allow blocking). The trylock
failures should be rare and we can easily fallback. Each per-NUMA-node
barn has a spin_lock.
A current limitation is that when slub_debug is enabled for a cache with
percpu sheaves, the objects in the array are considered as allocated from
the slub_debug perspective, and the alloc/free debugging hooks occur
when moving the objects between the array and slab pages. This means
that e.g. an use-after-free that occurs for an object cached in the
array is undetected. Collected alloc/free stacktraces might also be less
useful. This limitation could be changed in the future.
On the other hand, KASAN, kmemcg and other hooks are executed on actual
allocations and frees by kmem_cache users even if those use the array,
so their debugging or accounting accuracy should be unaffected.
Vlastimil Babka [Tue, 28 Nov 2023 17:49:22 +0000 (18:49 +0100)]
SLUB percpu sheaves
Hi,
This is the v3 RFC to add an opt-in percpu array-based caching layer to
SLUB. This is to publish accumulated fixes since v2 ahead of LSF/MM. I
have also squashed the usage of localtry_lock to be used immediately
(not converted to in a separate patch) as the lock should be going to
6.15 at this point. There's still a patch introducing it for this RFC to
avoid depending on bpf-next tree.
The name "sheaf" was invented by Matthew so we don't call it magazine
like the original Bonwick paper. The per-NUMA-node cache of sheaves is
thus called "barn".
This may seem similar to the arrays in SLAB, but the main differences
are:
- opt-in, not used for every cache
- does not distinguish NUMA locality, thus no "shared" arrays and no
"alien" arrays that would need periodical flushing
- improves kfree_rcu() handling
- API for obtaining a preallocated sheaf that can be used for guaranteed
and efficient allocations in a restricted context, when the upper
bound for needed objects is known but rarely reached
The motivation comes mainly from the ongoing work related to VMA
scalability and the related maple tree operations. This is why maple
tree nodes are sheaf-enabled in the RFC, but it's not a full conversion
that would take benefits of the improved preallocation API. The VMA part
is currently left out as it's expected that Suren will land the VMA
TYPESAFE_BY_RCU conversion [3] soon and there would be conflict with that.
With both series applied it means just adding a line to kmem_cache_args
in proc_caches_init().
Some performance benefits were measured by Suren and Liam in previous
versions. Suren's results in [5] look promising, so far except for the
preallocation support as used by the refactored maple tree code.
A sheaf-enabled cache has the following expected advantages:
- Cheaper fast paths. For allocations, instead of local double cmpxchg,
after Patch 5 it's preempt_disable() and no atomic operations. Same for
freeing, which is normally a local double cmpxchg only for a short
term allocations (so the same slab is still active on the same cpu when
freeing the object) and a more costly locked double cmpxchg otherwise.
The downside is the lack of NUMA locality guarantees for the allocated
objects.
- kfree_rcu() batching and recycling. kfree_rcu() will put objects to a
separate percpu sheaf and only submit the whole sheaf to call_rcu()
when full. After the grace period, the sheaf can be used for
allocations, which is more efficient than freeing and reallocating
individual slab objects (even with the batching done by kfree_rcu()
implementation itself). In case only some cpus are allowed to handle rcu
callbacks, the sheaf can still be made available to other cpus on the
same node via the shared barn. The maple_node cache uses kfree_rcu() and
thus can benefit from this.
- Preallocation support. A prefilled sheaf can be privately borrowed for
a short term operation that is not allowed to block in the middle and
may need to allocate some objects. If an upper bound (worst case) for
the number of allocations is known, but only much fewer allocations
actually needed on average, borrowing and returning a sheaf is much more
efficient then a bulk allocation for the worst case followed by a bulk
free of the many unused objects. Maple tree write operations should
benefit from this.
Patch 1 is copied from the series "bpf, mm: Introduce try_alloc_pages()"
[2] to introduce a variant of local_lock that has a trylock operation.
Patch 2 implements the basic sheaf functionality as a caching layer on
top of (per-cpu) slabs.
Patch 3 adds the sheaf kfree_rcu() support.
Patch 4 implements borrowing prefilled sheaves, with maple tree being the
ancticipated user.
Patch 5 seeks to reduce barn spinlock contention. Separately for
possible evaluation.
Patches 6 and 7 by Liam add testing stubs that maple tree will use in
its userspace tests.
Patch 9 enables sheaves for the maple tree node cache, but does not
take advantage of prefilling yet.
(RFC) LIMITATIONS:
- with slub_debug enabled, objects in sheaves are considered allocated
so allocation/free stacktraces may become imprecise and checking of
e.g. redzone violations may be delayed
GIT TREES:
this series: https://git.kernel.org/vbabka/l/slub-percpu-sheaves-v3
To avoid conflicts, the series requires (and the branch above is based
on) the kfree_rcu() code refactoring scheduled for 6.15:
To: Suren Baghdasaryan <surenb@google.com>
To: Liam R. Howlett <Liam.Howlett@oracle.com>
To: Christoph Lameter <cl@linux.com>
To: David Rientjes <rientjes@google.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org Cc: rcu@vger.kernel.org Cc: maple-tree@lists.infradead.org Cc: vbabka@suse.cz
---
Changes in v4:
- EDITME: describe what is new in this series revision.
- EDITME: use bulletpoints and terse descriptions.
- Link to v3: https://lore.kernel.org/r/20250317-slub-percpu-caches-v3-0-9d9884d8b643@suse.cz
Changes in v3:
- Squash localtry_lock conversion so it's used immediately.
- Incorporate feedback and add tags from Suren and Harry - thanks!
- Mostly adding comments and some refactoring.
- Fixes for kfree_rcu_sheaf() vmalloc handling, cpu hotremove
flushing.
- Fix wrong condition in kmem_cache_return_sheaf() that may have
affected performance negatively.
- Refactoring of free_to_pcs()
- Link to v2: https://lore.kernel.org/r/20250214-slub-percpu-caches-v2-0-88592ee0966a@suse.cz
Changes in v2:
- Removed kfree_rcu() destructors support as VMAs will not need it
anymore after [3] is merged.
- Changed to localtry_lock_t borrowed from [2] instead of an own
implementation of the same idea.
- Many fixes and improvements thanks to Liam's adoption for maple tree
nodes.
- Userspace Testing stubs by Liam.
- Reduced limitations/todos - hooking to kfree_rcu() is complete,
prefilled sheaves can exceed cache's sheaf_capacity.
- Link to v1: https://lore.kernel.org/r/20241112-slub-percpu-caches-v1-0-ddc0bdc27e05@suse.cz
locking/local_lock, mm: Replace localtry_ helpers with local_trylock_t type
Partially revert commit 0aaddfb06882 ("locking/local_lock: Introduce localtry_lock_t").
Remove localtry_*() helpers, since localtry_lock() name might
be misinterpreted as "try lock".
Introduce local_trylock[_irqsave]() helpers that only work
with newly introduced local_trylock_t type.
Note that attempt to use local_trylock[_irqsave]() with local_lock_t
will cause compilation failure.
local_trylock_t lock; // sizeof(lock) == 4
local_lock(&lock); // preempt disable, acquired = 1
local_lock_irqsave(&lock, ...); // irq save, acquired = 1
if (local_trylock(&lock)) // if (!acquired) preempt disable
if (local_trylock_irqsave(&lock, ...)) // if (!acquired) irq save
The existing local_lock_*() macros can be used either with
local_lock_t or local_trylock_t.
With local_trylock_t they set acquired = 1 while local_unlock_*() clears it.
In !PREEMPT_RT local_lock_irqsave(local_lock_t *) disables interrupts
to protect critical section, but it doesn't prevent NMI, so the fully
reentrant code cannot use local_lock_irqsave(local_lock_t *) for
exclusive access.
The local_lock_irqsave(local_trylock_t *) helper disables interrupts
and sets acquired=1, so local_trylock_irqsave(local_trylock_t *) from
NMI attempting to acquire the same lock will return false.
In PREEMPT_RT local_lock_irqsave() maps to preemptible spin_lock().
Map local_trylock_irqsave() to preemptible spin_trylock().
When in hard IRQ or NMI return false right away, since
spin_trylock() is not safe due to explicit locking in the underneath
rt_spin_trylock() implementation. Removing this explicit locking and
attempting only "trylock" is undesired due to PI implications.
The local_trylock() without _irqsave can be used to avoid the cost of
disabling/enabling interrupts by only disabling preemption, so
local_trylock() in an interrupt attempting to acquire the same
lock will return false.
Note there is no need to use local_inc for acquired variable,
since it's a percpu variable with strict nesting scopes.
Merge tag 'rust-fixes-6.15-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust fix from Miguel Ojeda:
"Fix 'generate_rust_analyzer.py' due to typo during merge"
Mea culpa, mea maxima culpa.
* tag 'rust-fixes-6.15-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
scripts: generate_rust_analyzer: fix pin-init name in kernel deps
Merge tag 'bcachefs-2025-03-31' of git://evilpiepirate.org/bcachefs
Pull more bcachefs updates from Kent Overstreet:
"All bugfixes and logging improvements"
* tag 'bcachefs-2025-03-31' of git://evilpiepirate.org/bcachefs: (35 commits)
bcachefs: fix bch2_write_point_to_text() units
bcachefs: Log original key being moved in data updates
bcachefs: BCH_JSET_ENTRY_log_bkey
bcachefs: Reorder error messages that include journal debug
bcachefs: Don't use designated initializers for disk_accounting_pos
bcachefs: Silence errors after emergency shutdown
bcachefs: fix units in rebalance_status
bcachefs: bch2_ioctl_subvolume_destroy() fixes
bcachefs: Clear fs_path_parent on subvolume unlink
bcachefs: Change btree_insert_node() assertion to error
bcachefs: Better printing of inconsistency errors
bcachefs: bch2_count_fsck_err()
bcachefs: Better helpers for inconsistency errors
bcachefs: Consistent indentation of multiline fsck errors
bcachefs: Add an "ignore unknown" option to bch2_parse_mount_opts()
bcachefs: bch2_time_stats_init_no_pcpu()
bcachefs: Fix bch2_fs_get_tree() error path
bcachefs: fix logging in journal_entry_err_msg()
bcachefs: add missing newline in bch2_trans_updates_to_text()
bcachefs: print_string_as_lines: fix extra newline
...
Merge tag 'fs_for_v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull ext2, udf, and isofs updates from Jan Kara:
- conversion of ext2 to the new mount API
- small folio conversion work for ext2
- a fix of an unexpected return value in udf in inode_getblk()
- a fix of handling of corrupted directory in isofs
* tag 'fs_for_v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Fix inode_getblk() return value
ext2: Make ext2_params_spec static
ext2: create ext2_msg_fc for use during parsing
ext2: convert to the new mount API
ext2: Remove reference to bh->b_page
isofs: fix KMSAN uninit-value bug in do_isofs_readdir()
Merge tag 'exfat-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:
- Fix random stack corruption and incorrect error returns in
exfat_get_block()
- Optimize exfat_get_block() by improving checking corner cases
- Fix an endless loop by self-linked chain in exfat_find_last_cluster
- Remove dead EXFAT_CLUSTERS_UNTRACKED codes
- Add missing shutdown check
- Improve the delete performance with discard mount option
* tag 'exfat-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: call bh_read in get_block only when necessary
exfat: fix potential wrong error return from get_block
exfat: fix missing shutdown check
exfat: fix the infinite loop in exfat_find_last_cluster()
exfat: fix random stack corruption after get_block
exfat: remove count used cluster from exfat_statfs()
exfat: support batch discard of clusters when freeing clusters
Merge tag 'v6.15rc-part1-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server updates from Steve French:
- Two fixes for bounds checks of open contexts
- Two multichannel fixes, including one for important UAF
- Oplock/lease break fix for potential ksmbd connection refcount leak
- Security fix to free crypto data more securely
- Fix to enable allowing Kerberos authentication by default
- Two RDMA/smbdirect fixes
- Minor cleanup
* tag 'v6.15rc-part1-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix r_count dec/increment mismatch
ksmbd: fix multichannel connection failure
ksmbd: fix use-after-free in ksmbd_sessions_deregister()
ksmbd: use ib_device_get_netdev() instead of calling ops.get_netdev
ksmbd: use aead_request_free to match aead_request_alloc
Revert "ksmbd: fix missing RDMA-capable flag for IPoIB device in ksmbd_rdma_capable_netdev()"
ksmbd: add bounds check for create lease context
ksmbd: add bounds check for durable handle context
ksmbd: make SMB_SERVER_KERBEROS5 enable by default
ksmbd: Use str_read_write() and str_true_false() helpers
Merge tag '6.15-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client updates from Steve French:
- Fix for network namespace refcount leak
- Multichannel fix and minor multichannel debug message cleanup
- Fix potential null ptr reference in SMB3 close
- Fix for special file handling when reparse points not supported by
server
- Two ACL fixes one for stricter ACE validation, one for incorrect
perms requested
- Three RFC1001 fixes: one for SMB3 mounts on port 139, one for better
default hostname, and one for better session response processing
- Minor update to email address for MAINTAINERS file
- Allow disabling Unicode for access to old SMB1 servers
- Three minor cleanups
* tag '6.15-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Add new mount option -o nounicode to disable SMB1 UNICODE mode
cifs: Set default Netbios RFC1001 server name to hostname in UNC
smb: client: Fix netns refcount imbalance causing leaks and use-after-free
cifs: add validation check for the fields in smb_aces
CIFS: Propagate min offload along with other parameters from primary to secondary channels.
cifs: Improve establishing SMB connection with NetBIOS session
cifs: Fix establishing NetBIOS session for SMB2+ connection
cifs: Fix getting DACL-only xattr system.cifs_acl and system.smb3_acl
cifs: Check if server supports reparse points before using them
MAINTAINERS: reorder preferred email for Steve French
cifs: avoid NULL pointer dereference in dbg call
smb: client: Remove redundant check in smb2_is_path_accessible()
smb: client: Remove redundant check in cifs_oplock_break()
smb: mark the new channel addition log as informational log with cifs_info
smb: minor cleanup to remove unused function declaration
Merge tag 'nfsd-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
"Neil Brown contributed more scalability improvements to NFSD's open
file cache, and Jeff Layton contributed a menagerie of repairs to
NFSD's NFSv4 callback / backchannel implementation.
Mike Snitzer contributed a change to NFS re-export support that
disables support for file locking on a re-exported NFSv4 mount. This
is because NFSv4 state recovery is currently difficult if not
impossible for re-exported NFS mounts. The change aims to prevent data
integrity exposures after the re-export server crashes.
Work continues on the evolving NFSD netlink administrative API.
Many thanks to the contributors, reviewers, testers, and bug reporters
who participated during the v6.15 development cycle"
* tag 'nfsd-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (45 commits)
NFSD: Add a Kconfig setting to enable delegated timestamps
sysctl: Fixes nsm_local_state bounds
nfsd: use a long for the count in nfsd4_state_shrinker_count()
nfsd: remove obsolete comment from nfs4_alloc_stid
nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
nfsd: reorganize struct nfs4_delegation for better packing
nfsd: handle errors from rpc_call_async()
nfsd: move cb_need_restart flag into cb_flags
nfsd: replace CB_GETATTR_BUSY with NFSD4_CALLBACK_RUNNING
nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY
nfsd: prevent callback tasks running concurrently
nfsd: disallow file locking and delegations for NFSv4 reexport
nfsd: filecache: drop the list_lru lock during lock gc scans
nfsd: filecache: don't repeatedly add/remove files on the lru list
nfsd: filecache: introduce NFSD_FILE_RECENT
nfsd: filecache: use list_lru_walk_node() in nfsd_file_gc()
nfsd: filecache: use nfsd_file_dispose_list() in nfsd_file_close_inode_sync()
NFSD: Re-organize nfsd_file_gc_worker()
nfsd: filecache: remove race handling.
fs: nfs: acl: Avoid -Wflex-array-member-not-at-end warning
...
Linus Torvalds [Mon, 31 Mar 2025 21:19:55 +0000 (14:19 -0700)]
x86: don't re-generate cpufeaturemasks.h so eagerly
It turns out the code to generate the x86 cpufeaturemasks.h header was
way too aggressive, and would re-generate it whenever the timestamp on
the kernel config file changed.
Now, the regular 'make *config' tools are fairly careful to not rewrite
the kernel config file unless the contents change, but other usecases
aren't that careful.
Michael Kelley reports that 'make-kpkg' ends up doing "make syncconfig"
multiple times in prepping to build, and will modify the config file in
the process (and then modify it back, but by then the timestamps have
changed).
Jakub Kicinski reports that the netdev CI does something similar in how
it generates the config file in multiple steps.
In both cases, the config file timestamp updates then cause the
cpufeaturemasks.h file to be regenerated, and that in turn then causes
lots of unnecessary rebuilds due to all the normal dependencies.
Fix it by using our 'filechk' infrastructure in the Makefile to generate
the header file. That will only write a new version of the file if the
contents of the file have actually changed.
Linus Torvalds [Mon, 31 Mar 2025 20:37:22 +0000 (13:37 -0700)]
Merge tag 'trace-ringbuffer-v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ring-buffer updates from Steven Rostedt:
- Restructure the persistent memory to have a "scratch" area
Instead of hard coding the KASLR offset in the persistent memory by
the ring buffer, push that work up to the callers of the persistent
memory as they are the ones that need this information. The offsets
and such is not important to the ring buffer logic and it should not
be part of that.
A scratch pad is now created when the caller allocates a ring buffer
from persistent memory by stating how much memory it needs to save.
- Allow where modules are loaded to be saved in the new scratch pad
Save the addresses of modules when they are loaded into the
persistent memory scratch pad.
- A new module_for_each_mod() helper function was created
With the acknowledgement of the module maintainers a new module
helper function was created to iterate over all the currently loaded
modules. This has a callback to be called for each module. This is
needed for when tracing is started in the persistent buffer and the
currently loaded modules need to be saved in the scratch area.
- Expose the last boot information where the kernel and modules were
loaded
The last_boot_info file is updated to print out the addresses of
where the kernel "_text" location was loaded from a previous boot, as
well as where the modules are loaded. If the buffer is recording the
current boot, it only prints "# Current" so that it does not expose
the KASLR offset of the currently running kernel.
- Allow the persistent ring buffer to be released (freed)
To have this in production environments, where the kernel command
line can not be changed easily, the ring buffer needs to be freed
when it is not going to be used. The memory for the buffer will
always be allocated at boot up, but if the system isn't going to
enable tracing, the memory needs to be freed. Allow it to be freed
and added back to the kernel memory pool.
- Allow stack traces to print the function names in the persistent
buffer
Now that the modules are saved in the persistent ring buffer, if the
same modules are loaded, the printing of the function names will
examine the saved modules. If the module is found in the scratch area
and is also loaded, then it will do the offset shift and use kallsyms
to display the function name. If the address is not found, it simply
displays the address from the previous boot in hex.
* tag 'trace-ringbuffer-v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Use _text and the kernel offset in last_boot_info
tracing: Show last module text symbols in the stacktrace
ring-buffer: Remove the unused variable bmeta
tracing: Skip update_last_data() if cleared and remove active check for save_mod()
tracing: Initialize scratch_size to zero to prevent UB
tracing: Fix a compilation error without CONFIG_MODULES
tracing: Freeable reserved ring buffer
mm/memblock: Add reserved memory release function
tracing: Update modules to persistent instances when loaded
tracing: Show module names and addresses of last boot
tracing: Have persistent trace instances save module addresses
module: Add module_for_each_mod() function
tracing: Have persistent trace instances save KASLR offset
ring-buffer: Add ring_buffer_meta_scratch()
ring-buffer: Add buffer meta data for persistent ring buffer
ring-buffer: Use kaslr address instead of text delta
ring-buffer: Fix bytes_dropped calculation issue
Linus Torvalds [Mon, 31 Mar 2025 16:56:08 +0000 (09:56 -0700)]
Merge tag 'trace-latency-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing documentation fix from Steven Rostedt:
"Documentation fix for runtime verifier
The runtime verifier documents that were created were not referenced
in the indices, which caused warning when building the documentation
tree. Those documents are now added to the rv indices"
* tag 'trace-latency-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
Documentation/rv: Add sched pages to the indices
Linus Torvalds [Mon, 31 Mar 2025 15:52:33 +0000 (08:52 -0700)]
Merge tag 'perf-tools-for-v6.15-2025-03-27' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim:
"perf record:
- Introduce latency profiling using scheduler information.
The latency profiling is to show impacts on wall-time rather than
cpu-time. By tracking context switches, it can weight samples and
find which part of the code contributed more to the execution
latency.
The value (period) of the sample is weighted by dividing it by the
number of parallel execution at the moment. The parallelism is
tracked in perf report with sched-switch records. This will reduce
the portion that are run in parallel and in turn increase the
portion of serial executions.
For now, it's limited to profile processes, IOW system-wide
profiling is not supported. You can add --latency option to enable
this.
$ perf record --latency -- make -C tools/perf
I've run the above command for perf build which adds -j option to
make with the number of CPUs in the system internally. Normally
it'd show something like below:
The cc1 takes around 80% of the overhead as it's the actual
compiler. However it runs in parallel so its contribution to
latency may be less than that. Now, perf report will show both
overhead and latency (if --latency was given at record time) like
below:
You can see latency of cc1 goes down to around 50% and python3 and
ld contribute a lot more than their overhead. You can use --latency
option in perf report to get the same result but ordered by
latency.
$ perf report --latency -s comm
perf report:
- As a side effect of the latency profiling work, it adds a new
output field 'latency' and a sort key 'parallelism'. The below is a
result from my system with 64 CPUs. The build was well-parallelized
but contained some serial portions.
- Support Feodra mini-debuginfo which is a LZMA compressed symbol
table inside ".gnu_debugdata" ELF section.
perf annotate:
- Add --code-with-type option to enable data-type profiling with the
usual annotate output.
Instead of focusing on data structure, it shows code annotation
together with data type it accesses in case the instruction refers
to a memory location (and it was able to resolve the target data
type). Currently it only works with --stdio.
The "# data-type:" part was added with this change. The first few
entries are not very interesting. But later you can it accesses a
couple of fields in the task_struct, files_struct and fdtable.
perf trace:
- Support syscall tracing for different ABI. For example it can trace
system calls for 32-bit applications on 64-bit kernel
transparently.
- Add --summary-mode=total option to show global syscall summary. The
default is 'thread' to show per-thread syscall summary.
Python support:
- Add more interfaces to 'perf' module to parse events, and config,
enable or disable the event list properly so that it can implement
basic functionalities purely in Python. There is an example code
for these new interfaces in python/tracepoint.py.
- Add mypy and pylint support to enable build time checking. Fix some
code based on the findings from these tools.
Internals:
- Introduce io_dir__readdir() API to make directory traveral (usually
for proc or sysfs) efficient with less memory footprint.
JSON vendor events:
- Add events and metrics for ARM Neoverse N3 and V3
- Update events and metrics on various Intel CPUs
- Add/update events for a number of SiFive processors"
* tag 'perf-tools-for-v6.15-2025-03-27' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (229 commits)
perf bpf-filter: Fix a parsing error with comma
perf report: Fix a memory leak for perf_env on AMD
perf trace: Fix wrong size to bpf_map__update_elem call
perf tools: annotate asm_pure_loop.S
perf python: Fix setup.py mypy errors
perf test: Address attr.py mypy error
perf build: Add pylint build tests
perf build: Add mypy build tests
perf build: Rename TEST_LOGS to SHELL_TEST_LOGS
tools/build: Don't pass test log files to linker
perf bench sched pipe: fix enforced blocking reads in worker_thread
perf tools: Fix is_compat_mode build break in ppc64
perf build: filter all combinations of -flto for libperl
perf vendor events arm64 AmpereOneX: Fix frontend_bound calculation
perf vendor events arm64: AmpereOne/AmpereOneX: Mark LD_RETIRED impacted by errata
perf trace: Fix evlist memory leak
perf trace: Fix BTF memory leak
perf trace: Make syscall table stable
perf syscalltbl: Mask off ABI type for MIPS system calls
perf build: Remove Makefile.syscalls
...
Andrei Lalaev [Mon, 31 Mar 2025 06:17:52 +0000 (06:17 +0000)]
scripts: generate_rust_analyzer: fix pin-init name in kernel deps
Because of different crate names ("pin-init" and "pin_init") passed to
"append_crate" and "append_crate_with_generated", the script fails with
"KeyError: 'pin-init'".
To overcome the issue, pass the same name to both functions.
Linus Torvalds [Mon, 31 Mar 2025 00:03:26 +0000 (17:03 -0700)]
Merge tag 'rust-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust updates from Miguel Ojeda:
"Toolchain and infrastructure:
- Extract the 'pin-init' API from the 'kernel' crate and make it into
a standalone crate.
In order to do this, the contents are rearranged so that they can
easily be kept in sync with the version maintained out-of-tree that
other projects have started to use too (or plan to, like QEMU).
This will reduce the maintenance burden for Benno, who will now
have his own sub-tree, and will simplify future expected changes
like the move to use 'syn' to simplify the implementation.
- Add '#[test]'-like support based on KUnit.
We already had doctests support based on KUnit, which takes the
examples in our Rust documentation and runs them under KUnit.
Now, we are adding the beginning of the support for "normal" tests,
similar to those the '#[test]' tests in userspace Rust. For
instance:
Unlike with doctests, the 'assert*!'s do not map to the KUnit
assertion APIs yet.
- Check Rust signatures at compile time for functions called from C
by name.
In particular, introduce a new '#[export]' macro that can be placed
in the Rust function definition. It will ensure that the function
declaration on the C side matches the signature on the Rust
function:
The macro essentially forces the compiler to compare the types of
the actual Rust function and the 'bindgen'-processed C signature.
These cases are rare so far. In the future, we may consider
introducing another tool, 'cbindgen', to generate C headers
automatically. Even then, having these functions explicitly marked
may be a good idea anyway.
- Enable the 'raw_ref_op' Rust feature: it is already stable, and
allows us to use the new '&raw' syntax, avoiding a couple macros.
After everyone has migrated, we will disallow the macros.
- Pass the correct target to 'bindgen' on Usermode Linux.
- Fix 'rusttest' build in macOS.
'kernel' crate:
- New 'hrtimer' module: add support for setting up intrusive timers
without allocating when starting the timer. Add support for
'Pin<Box<_>>', 'Arc<_>', 'Pin<&_>' and 'Pin<&mut _>' as pointer
types for use with timer callbacks. Add support for setting clock
source and timer mode.
- New 'dma' module: add a simple DMA coherent allocator abstraction
and a test sample driver.
- 'list' module: make the linked list 'Cursor' point between
elements, rather than at an element, which is more convenient to us
and allows for cursors to empty lists; and document it with
examples of how to perform common operations with the provided
methods.
- 'str' module: implement a few traits for 'BStr' as well as the
'strip_prefix()' method.
- 'sync' module: add 'Arc::as_ptr'.
- 'alloc' module: add 'Box::into_pin'.
- 'error' module: extend the 'Result' documentation, including a few
examples on different ways of handling errors, a warning about
using methods that may panic, and links to external documentation.
'macros' crate:
- 'module' macro: add the 'authors' key to support multiple authors.
The original key will be kept until everyone has migrated.
Documentation:
- Add error handling sections.
MAINTAINERS:
- Add Danilo Krummrich as reviewer of the Rust "subsystem".
- Add 'RUST [PIN-INIT]' entry with Benno Lossin as maintainer. It has
its own sub-tree.
- Add sub-tree for 'RUST [ALLOC]'.
- Add 'DMA MAPPING HELPERS DEVICE DRIVER API [RUST]' entry with
Abdiel Janulgue as primary maintainer. It will go through the
sub-tree of the 'RUST [ALLOC]' entry.
- Add 'HIGH-RESOLUTION TIMERS [RUST]' entry with Andreas Hindborg as
maintainer. It has its own sub-tree.
And a few other cleanups and improvements"
* tag 'rust-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (71 commits)
rust: dma: add `Send` implementation for `CoherentAllocation`
rust: macros: fix `make rusttest` build on macOS
rust: block: refactor to use `&raw mut`
rust: enable `raw_ref_op` feature
rust: uaccess: name the correct function
rust: rbtree: fix comments referring to Box instead of KBox
rust: hrtimer: add maintainer entry
rust: hrtimer: add clocksource selection through `ClockId`
rust: hrtimer: add `HrTimerMode`
rust: hrtimer: implement `HrTimerPointer` for `Pin<Box<T>>`
rust: alloc: add `Box::into_pin`
rust: hrtimer: implement `UnsafeHrTimerPointer` for `Pin<&mut T>`
rust: hrtimer: implement `UnsafeHrTimerPointer` for `Pin<&T>`
rust: hrtimer: add `hrtimer::ScopedHrTimerPointer`
rust: hrtimer: add `UnsafeHrTimerPointer`
rust: hrtimer: allow timer restart from timer handler
rust: str: implement `strip_prefix` for `BStr`
rust: str: implement `AsRef<BStr>` for `[u8]` and `BStr`
rust: str: implement `Index` for `BStr`
rust: str: implement `PartialEq` for `BStr`
...
Linus Torvalds [Sun, 30 Mar 2025 22:44:36 +0000 (15:44 -0700)]
Merge tag 'modules-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull modules updates from Petr Pavlu:
- Use RCU instead of RCU-sched
The mix of rcu_read_lock(), rcu_read_lock_sched() and
preempt_disable() in the module code and its users has
been replaced with just rcu_read_lock()
- The rest of changes are smaller fixes and updates
* tag 'modules-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: (32 commits)
MAINTAINERS: Update the MODULE SUPPORT section
module: Remove unnecessary size argument when calling strscpy()
module: Replace deprecated strncpy() with strscpy()
params: Annotate struct module_param_attrs with __counted_by()
bug: Use RCU instead RCU-sched to protect module_bug_list.
static_call: Use RCU in all users of __module_text_address().
kprobes: Use RCU in all users of __module_text_address().
bpf: Use RCU in all users of __module_text_address().
jump_label: Use RCU in all users of __module_text_address().
jump_label: Use RCU in all users of __module_address().
x86: Use RCU in all users of __module_address().
cfi: Use RCU while invoking __module_address().
powerpc/ftrace: Use RCU in all users of __module_text_address().
LoongArch: ftrace: Use RCU in all users of __module_text_address().
LoongArch/orc: Use RCU in all users of __module_address().
arm64: module: Use RCU in all users of __module_text_address().
ARM: module: Use RCU in all users of __module_text_address().
module: Use RCU in all users of __module_text_address().
module: Use RCU in all users of __module_address().
module: Use RCU in search_module_extables().
...
Linus Torvalds [Sun, 30 Mar 2025 22:25:15 +0000 (15:25 -0700)]
Merge tag 'x86-urgent-2025-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes and updates from Ingo Molnar:
- Fix a large number of x86 Kconfig dependency and help text accuracy
bugs/problems, by Mateusz Jończyk and David Heideberg
- Fix a VM_PAT interaction with fork() crash. This also touches core
kernel code
- Fix an ORC unwinder bug for interrupt entries
- Fixes and cleanups
- Fix an AMD microcode loader bug that can promote verification
failures into success
- Add early-printk support for MMIO based UARTs on an x86 board that
had no other serial debugging facility and also experienced early
boot crashes
* tag 'x86-urgent-2025-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/AMD: Fix __apply_microcode_amd()'s return value
x86/mm/pat: Fix VM_PAT handling when fork() fails in copy_page_range()
x86/fpu: Update the outdated comment above fpstate_init_user()
x86/early_printk: Add support for MMIO-based UARTs
x86/dumpstack: Fix inaccurate unwinding from exception stacks due to misplaced assignment
x86/entry: Fix ORC unwinder for PUSH_REGS with save_ret=1
x86/Kconfig: Fix lists in X86_EXTENDED_PLATFORM help text
x86/Kconfig: Correct X86_X2APIC help text
x86/speculation: Remove the extra #ifdef around CALL_NOSPEC
x86/Kconfig: Document release year of glibc 2.3.3
x86/Kconfig: Make CONFIG_PCI_CNB20LE_QUIRK depend on X86_32
x86/Kconfig: Document CONFIG_PCI_MMCONFIG
x86/Kconfig: Update lists in X86_EXTENDED_PLATFORM
x86/Kconfig: Move all X86_EXTENDED_PLATFORM options together
x86/Kconfig: Always enable ARCH_SPARSEMEM_ENABLE
x86/Kconfig: Enable X86_X2APIC by default and improve help text
Linus Torvalds [Sun, 30 Mar 2025 22:18:36 +0000 (15:18 -0700)]
Merge tag 'locking-urgent-2025-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc locking fixes and updates from Ingo Molnar:
- Fix a locking self-test FAIL on PREEMPT_RT kernels
- Fix nr_unused_locks accounting bug
- Simplify the split-lock debugging feature's fast-path
* tag 'locking-urgent-2025-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Decrease nr_unused_locks if lock unused in zap_class()
lockdep: Fix wait context check on softirq for PREEMPT_RT
x86/split_lock: Simplify reenabling
Linus Torvalds [Sun, 30 Mar 2025 20:45:28 +0000 (13:45 -0700)]
Merge tag 'bpf_try_alloc_pages' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf try_alloc_pages() support from Alexei Starovoitov:
"The pull includes work from Sebastian, Vlastimil and myself with a lot
of help from Michal and Shakeel.
This is a first step towards making kmalloc reentrant to get rid of
slab wrappers: bpf_mem_alloc, kretprobe's objpool, etc. These patches
make page allocator safe from any context.
SLAB wrappers bind memory to a particular subsystem making it
unavailable to the rest of the kernel. Some BPF maps in production
consume Gbytes of preallocated memory. Top 5 in Meta: 1.5G, 1.2G,
1.1G, 300M, 200M. Once we have kmalloc that works in any context BPF
map preallocation won't be necessary.
How:
Synchronous kmalloc/page alloc stack has multiple stages going from
fast to slow: cmpxchg16 -> slab_alloc -> new_slab -> alloc_pages ->
rmqueue_pcplist -> __rmqueue, where rmqueue_pcplist was already
relying on trylock.
This set changes rmqueue_bulk/rmqueue_buddy to attempt a trylock and
return ENOMEM if alloc_flags & ALLOC_TRYLOCK. It then wraps this
functionality into try_alloc_pages() helper. We make sure that the
logic is sane in PREEMPT_RT.
End result: try_alloc_pages()/free_pages_nolock() are safe to call
from any context.
try_kmalloc() for any context with similar trylock approach will
follow. It will use try_alloc_pages() when slab needs a new page.
Though such try_kmalloc/page_alloc() is an opportunistic allocator,
this design ensures that the probability of successful allocation of
small objects (up to one page in size) is high.
Even before we have try_kmalloc(), we already use try_alloc_pages() in
BPF arena implementation and it's going to be used more extensively in
BPF"
* tag 'bpf_try_alloc_pages' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
mm: Fix the flipped condition in gfpflags_allow_spinning()
bpf: Use try_alloc_pages() to allocate pages for bpf needs.
mm, bpf: Use memcg in try_alloc_pages().
memcg: Use trylock to access memcg stock_lock.
mm, bpf: Introduce free_pages_nolock()
mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation
locking/local_lock: Introduce localtry_lock_t
Linus Torvalds [Sun, 30 Mar 2025 20:06:27 +0000 (13:06 -0700)]
Merge tag 'bpf_res_spin_lock' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf relisient spinlock support from Alexei Starovoitov:
"This patch set introduces Resilient Queued Spin Lock (or rqspinlock
with res_spin_lock() and res_spin_unlock() APIs).
This is a qspinlock variant which recovers the kernel from a stalled
state when the lock acquisition path cannot make forward progress.
This can occur when a lock acquisition attempt enters a deadlock
situation (e.g. AA, or ABBA), or more generally, when the owner of the
lock (which we’re trying to acquire) isn’t making forward progress.
Deadlock detection is the main mechanism used to provide instant
recovery, with the timeout mechanism acting as a final line of
defense. Detection is triggered immediately when beginning the waiting
loop of a lock slow path.
Additionally, BPF programs attached to different parts of the kernel
can introduce new control flow into the kernel, which increases the
likelihood of deadlocks in code not written to handle reentrancy.
There have been multiple syzbot reports surfacing deadlocks in
internal kernel code due to the diverse ways in which BPF programs can
be attached to different parts of the kernel. By switching the BPF
subsystem’s lock usage to rqspinlock, all of these issues are
mitigated at runtime.
This spin lock implementation allows BPF maps to become safer and
remove mechanisms that have fallen short in assuring safety when
nesting programs in arbitrary ways in the same context or across
different contexts.
We run benchmarks that stress locking scalability and perform
comparison against the baseline (qspinlock). For the rqspinlock case,
we replace the default qspinlock with it in the kernel, such that all
spin locks in the kernel use the rqspinlock slow path. As such,
benchmarks that stress kernel spin locks end up exercising rqspinlock.
More details in the cover letter in commit 6ffb9017e932 ("Merge branch
'resilient-queued-spin-lock'")"
* tag 'bpf_res_spin_lock' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (24 commits)
selftests/bpf: Add tests for rqspinlock
bpf: Maintain FIFO property for rqspinlock unlock
bpf: Implement verifier support for rqspinlock
bpf: Introduce rqspinlock kfuncs
bpf: Convert lpm_trie.c to rqspinlock
bpf: Convert percpu_freelist.c to rqspinlock
bpf: Convert hashtab.c to rqspinlock
rqspinlock: Add locktorture support
rqspinlock: Add entry to Makefile, MAINTAINERS
rqspinlock: Add macros for rqspinlock usage
rqspinlock: Add basic support for CONFIG_PARAVIRT
rqspinlock: Add a test-and-set fallback
rqspinlock: Add deadlock detection and recovery
rqspinlock: Protect waiters in trylock fallback from stalls
rqspinlock: Protect waiters in queue from stalls
rqspinlock: Protect pending bit owners from stalls
rqspinlock: Hardcode cond_acquire loops for arm64
rqspinlock: Add support for timeouts
rqspinlock: Drop PV and virtualization support
rqspinlock: Add rqspinlock.h header
...
Linus Torvalds [Sun, 30 Mar 2025 19:43:03 +0000 (12:43 -0700)]
Merge tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
"For this merge window we're splitting BPF pull request into three for
higher visibility: main changes, res_spin_lock, try_alloc_pages.
These are the main BPF changes:
- Add DFA-based live registers analysis to improve verification of
programs with loops (Eduard Zingerman)
- Introduce load_acquire and store_release BPF instructions and add
x86, arm64 JIT support (Peilin Ye)
- Fix loop detection logic in the verifier (Eduard Zingerman)
- Drop unnecesary lock in bpf_map_inc_not_zero() (Eric Dumazet)
- Add kfunc for populating cpumask bits (Emil Tsalapatis)
- Convert various shell based tests to selftests/bpf/test_progs
format (Bastien Curutchet)
- Allow passing referenced kptrs into struct_ops callbacks (Amery
Hung)
- Add a flag to LSM bpf hook to facilitate bpf program signing
(Blaise Boscaccy)
- Track arena arguments in kfuncs (Ihor Solodrai)
- Add copy_remote_vm_str() helper for reading strings from remote VM
and bpf_copy_from_user_task_str() kfunc (Jordan Rome)
- Add support for timed may_goto instruction (Kumar Kartikeya
Dwivedi)
- Allow bpf_get_netns_cookie() int cgroup_skb programs (Mahe Tardy)
- Reduce bpf_cgrp_storage_busy false positives when accessing cgroup
local storage (Martin KaFai Lau)
- Allow retrieving BTF data with BTF token (Mykyta Yatsenko)
- Add BPF kfuncs to set and get xattrs with 'security.bpf.' prefix
(Song Liu)
- Reject attaching programs to noreturn functions (Yafang Shao)
- Introduce pre-order traversal of cgroup bpf programs (Yonghong
Song)"
* tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (186 commits)
selftests/bpf: Add selftests for load-acquire/store-release when register number is invalid
bpf: Fix out-of-bounds read in check_atomic_load/store()
libbpf: Add namespace for errstr making it libbpf_errstr
bpf: Add struct_ops context information to struct bpf_prog_aux
selftests/bpf: Sanitize pointer prior fclose()
selftests/bpf: Migrate test_xdp_vlan.sh into test_progs
selftests/bpf: test_xdp_vlan: Rename BPF sections
bpf: clarify a misleading verifier error message
selftests/bpf: Add selftest for attaching fexit to __noreturn functions
bpf: Reject attaching fexit/fmod_ret to __noreturn functions
bpf: Only fails the busy counter check in bpf_cgrp_storage_get if it creates storage
bpf: Make perf_event_read_output accessible in all program types.
bpftool: Using the right format specifiers
bpftool: Add -Wformat-signedness flag to detect format errors
selftests/bpf: Test freplace from user namespace
libbpf: Pass BPF token from find_prog_btf_id to BPF_BTF_GET_FD_BY_ID
bpf: Return prog btf_id without capable check
bpf: BPF token support for BPF_BTF_GET_FD_BY_ID
bpf, x86: Fix objtool warning for timed may_goto
bpf: Check map->record at the beginning of check_and_free_fields()
...
Linus Torvalds [Sun, 30 Mar 2025 01:25:34 +0000 (18:25 -0700)]
Merge tag 'mailbox-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox
Pull mailbox updates from Jassi Brar:
"Core:
- misc rejig of header includes
- minor const fixes
Misc:
- constify amba_id table
pcc:
- cleanup and refactoring of shmem and irq handling
qcom:
- add MSM8226 compatible
fsl,mu:
- add i.MX94 compatible
mediatek:
- remove cl in struct cmdq_pkt
tegra:
- define dimensioning masks in SoC data"
* tag 'mailbox-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox: (25 commits)
mailbox: Remove unneeded semicolon
mailbox: pcc: Refactor and simplify check_and_ack()
mailbox: pcc: Always map the shared memory communication address
mailbox: pcc: Refactor error handling in irq handler into separate function
mailbox: pcc: Use acpi_os_ioremap() instead of ioremap()
mailbox: pcc: Return early if no GAS register from pcc_mbox_cmd_complete_check
mailbox: pcc: Drop unnecessary endianness conversion of pcc_hdr.flags
mailbox: pcc: Always clear the platform ack interrupt first
mailbox: pcc: Fix the possible race in updation of chan_in_use flag
dt-bindings: mailbox: qcom: add compatible for MSM8226 SoC
dt-bindings: mailbox: fsl,mu: Add i.MX94 compatible
MAINTAINERS: add mailbox API's tree type and location
mailbox: remove unused header files
mailbox: explicitly include <linux/bits.h>
mailbox: sort headers alphabetically
mailbox: don't protect of_parse_phandle_with_args with con_mutex
mailbox: use error ret code of of_parse_phandle_with_args()
mailbox: arm_mhuv2: Constify amba_id table
mailbox: arm_mhu_db: Constify amba_id table
mailbox: arm_mhu: Constify amba_id table
...
Linus Torvalds [Sun, 30 Mar 2025 01:23:44 +0000 (18:23 -0700)]
Merge tag 'hsi-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi
Pull HSI update from Sebastian Reichel:
- ssi_protocol: fix potential use after free after module removal
* tag 'hsi-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi:
HSI: ssi_protocol: Fix use after free vulnerability in ssi_protocol Driver Due to Race Condition
Linus Torvalds [Sun, 30 Mar 2025 01:11:12 +0000 (18:11 -0700)]
Merge tag 'for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply and reset updates from Sebastian Reichel:
"Power-supply core:
- remove unused set_charged infrastructure
- drop of_node from power_supply struct
Power-supply drivers:
- axp717: support devices without thermistors
- bq27xxx: support max design voltage for bq270x0 and bq27x10
- pcf50633: drop charger driver
- max1720x: add battery health support
- switch all power-supply devices from of_node to fwnode
- convert regmap users to maple tree register cache
- convert drivers to devm_kmemdup_array
- misc cleanups and fixes
Reset drivers:
- at91-sama5d2_shdwc: add sama7d65 support
* tag 'for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (30 commits)
power: supply: mt6370: Remove redundant 'flush_workqueue()' calls
Revert "power: supply: bq27xxx: do not report bogus zero values"
power: supply: max77693: Fix wrong conversion of charge input threshold value
power: supply: pcf50633: Remove charger
power: supply: all: switch psy_cfg from of_node to fwnode
power: supply: core: get rid of of_node
power: reset: at91-sama5d2_shdwc: Add sama7d65 PMC
power: supply: smb347: convert to use maple tree register cache
power: supply: rt9455: convert to use maple tree register cache
power: supply: max1720x: convert to use maple tree register cache
power: supply: ltc4162l: convert to use maple tree register cache
power: supply: bq25980: convert to use maple tree register cache
power: supply: bq25890: convert to use maple tree register cache
power: supply: bq2515x: convert to use maple tree register cache
power: supply: bq24257: convert to use maple tree register cache
power: supply: bd99954: convert to use maple tree register cache
power: supply: Remove unused set_charged method
power: supply: ds2760: Remove unused ds2760_battery_set_charged
power: supply: core: Remove unused power_supply_set_battery_charged
power: supply: sc27xx: use devm_kmemdup_array()
...
Linus Torvalds [Sun, 30 Mar 2025 00:23:34 +0000 (17:23 -0700)]
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"Here's the pile of clk driver patches. The usual suspects^Wsilicon
vendors are all here, adding new SoC support and fixing existing code.
There are a few patches to the clk framework here as well. They've
been baking in linux-next for weeks so I'm hoping we don't have to
revert them. The disable OF node patch is probably the scariest one
although it seems unlikely that a system would be relying on a driver
_not_ probing because the clk never appeared, but you never know.
Nothing looks out of the ordinary on the driver side but that's
because it's mostly a bunch of data.
Core:
- Use dev_err_probe() in the clk registration path (Peering into the
crystal ball shows many patches that remove printks)
- Check for disabled OF nodes in of_clk_get_hw_from_clkspec()
New Drivers:
- Allwinner A523/T527 clk driver
- Qualcomm IPQ9574 NSS clk driver
- Qualcomm QCS8300 GPU and video clk drivers
- Qualcomm SDM429 RPM clks
- Qualcomm QCM6490 LPASS (low power audio) resets
- Samsung Exynos2200: driver for several clock controllers (Alive,
CMGP, HSI, PERIC/PERIS, TOP, UFS and VFS)
- Samsung Exynos7870: Driver for several clock controllers (Alive,
MIF, DISP AUD, FSYS, G3D, ISP, MFC and PERI)
- Rockchip rk3528 and rk3562 clk driver
Updates:
- Various fixes to SoC clk drivers for incorrect data, avoid touching
protected registers, etc.
- Additions for some missing clks in existing SoC clk drivers
- DT schema conversions from text to YAML
- Kconfig cleanups to allow drivers to be compiled on moar
architectures"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (125 commits)
clk: qcom: Add NSS clock Controller driver for IPQ9574
clk: qcom: gcc-ipq9574: Add support for gpll0_out_aux clock
dt-bindings: clock: Add ipq9574 NSSCC clock and reset definitions
dt-bindings: clock: gcc-ipq9574: Add definition for GPLL0_OUT_AUX
clk: qcom: gcc-msm8953: fix stuck venus0_core0 clock
clk: qcom: mmcc-sdm660: fix stuck video_subcore0 clock
dt-bindings: clock: qcom,x1e80100-camcc: Fix the list of required-opps
clk: amlogic: a1: fix a typo
clk: amlogic: gxbb: drop non existing 32k clock parent
clk: amlogic: gxbb: drop incorrect flag on 32k clock
clk: amlogic: g12b: fix cluster A parent data
clk: amlogic: g12a: fix mmc A peripheral clock
dt-bindings: clocks: atmel,at91rm9200-pmc: add missing compatibles
dt-bindings: reset: fix double id on rk3562-cru reset ids
drivers: clk: qcom: ipq5424: fix the freq table of sdcc1_apps clock
clk: qcom: lpassaudiocc-sc7280: Add support for LPASS resets for QCM6490
dt-bindings: clock: qcom: Add compatible for QCM6490 boards
clk: qcom: gdsc: Update the status poll timeout for GDSC
clk: qcom: gdsc: Set retain_ff before moving to HW CTRL
clk: davinci: remove support for da830
...
Linus Torvalds [Sun, 30 Mar 2025 00:18:50 +0000 (17:18 -0700)]
Merge tag 'rproc-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull remoteproc updates from Bjorn Andersson:
- Transition the i.MX8MP DSP remoteproc driver to use the reset
framework for driving the run/stall reset bits
- Add support for managing the modem remoteprocessor on the Qualcomm
MSM8226, MSM8926, and SM8750 platforms
* tag 'rproc-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (28 commits)
remoteproc: qcom_q6v5_pas: Make single-PD handling more robust
remoteproc: qcom_q6v5_pas: Use resource with CX PD for MSM8226
remoteproc: core: Clear table_sz when rproc_shutdown
remoteproc: sysmon: Update qcom_add_sysmon_subdev() comment
dt-bindings: remoteproc: Consolidate SC8180X and SM8150 PAS files
irqdomain: remoteproc: Switch to of_fwnode_handle()
remoteproc: qcom: pas: add minidump_id to SC7280 WPSS
remoteproc: imx_dsp_rproc: Document run_stall struct member
remoteproc: qcom: pas: Add SM8750 MPSS
dt-bindings: remoteproc: Add SM8750 MPSS
imx_dsp_rproc: Use reset controller API to control the DSP
reset: imx8mp-audiomix: Add support for DSP run/stall
reset: imx8mp-audiomix: Introduce active_low configuration option
reset: imx8mp-audiomix: Prepare the code for more reset bits
reset: imx8mp-audiomix: Add prefix for internal macro
dt-bindings: dsp: fsl,dsp: Add resets property
dt-bindings: reset: audiomix: Add reset ids for EARC and DSP
remoteproc: qcom_wcnss: Handle platforms with only single power domain
dt-bindings: remoteproc: qcom,wcnss-pil: Add support for single power-domain platforms
remoteproc: qcom_q6v5_mss: Add modem support on MSM8926
...
Kent Overstreet [Sat, 29 Mar 2025 21:59:50 +0000 (17:59 -0400)]
bcachefs: Clear fs_path_parent on subvolume unlink
This fixes recursive subvolume removal.
Subvolume deletion is asynchronous; fs_path_parent, and thus the entry
in the subvolume_children btree, need to be cleared when the subvolume
is unlinked from the fs heirarchy - else we'll spuriously think a
subvolume has children and deletion will fail.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Sat, 29 Mar 2025 23:59:16 +0000 (16:59 -0700)]
Merge tag 'pinctrl-v6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
"Core changes:
- None really.
New drivers:
- AMD ISP411 "AMD ISP" driver
- Exynos 2200 and 7870 SoC subdrivers
- Sophgo RISC-V SG2042 and SG2044 subdrivers
- Amlogic A4 subdriver
- Rockchip RK3528 subdriver
- Broadcom BCM21664 subdriver
- Allwinner A523/T527 subdriver
- Ingenic X1600 subdriver
- Microchip SAMA7D65 subdriver, essentially a re-branded Atmel AT91
PIO4 driver, but nowadays a Microschip SoC line
Improvements:
- Bring in the devm_kmemdup_array() helper and use it throughout,
also bring in changes to other subsystems for this to establish
this helper
- Support EGPIO on the Qualcomm SA8775P SoC
- Extend EINT support in the Mediatek driver"
* tag 'pinctrl-v6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (101 commits)
pinctrl: mediatek: Add EINT support for multiple addresses
pinctrl: amlogic-a4: Drop surplus semicolon
pinctrl: nuvoton: Reduce use of OF-specific APIs
pinctrl: nuvoton: Convert to use struct group_desc
pinctrl: nuvoton: Make use of struct pinfunction and PINCTRL_PINFUNCTION()
pinctrl: nuvoton: Convert to use struct pingroup and PINCTRL_PINGROUP()
pinctrl: npcm8xx: Fix incorrect struct npcm8xx_pincfg assignment
pinctrl: tegra: Fix off by one in tegra_pinctrl_get_group()
pinctrl: PINCTRL_AMDISP should depend on DRM_AMD_ISP
pinctrl: qcom: sa8775p: Enable egpio function
dt-bindings: pinctrl: qcom: Add egpio function for sa8775p
pinctrl: qcom: tlmm-test: Validate irq_enable delivers edge irqs
pinctrl: qcom: Clear latched interrupt status when changing IRQ type
dt-bindings: pinctrl: airoha: Add missing gpio-ranges property
pinctrl: bcm281xx: Add missing assignment in bcm21664_pinctrl_lock_all()
pinctrl: amd: isp411: Fix IS_ERR() vs NULL check in probe()
dt-bindings: pinctrl: at91-pio4: add microchip,sama7d65-pinctrl
pinctrl: tegra: Set SFIO mode to Mux Register
pinctrl-tegra: Restore SFSEL bit when freeing pins
pinctrl: tegra: Add descriptions for SoC data fields
...
Linus Torvalds [Sat, 29 Mar 2025 21:48:33 +0000 (14:48 -0700)]
Merge tag 'backlight-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
Pull backlight updates from Lee Jones:
- Apple DWI Backlight:
- Added devicetree bindings for backlight controllers on Apple's DWI
interface.
- Added a new driver (apple_dwi_bl) for these controllers found on
some Apple mobile devices.
- Added MAINTAINERS entries for the new driver.
- led_bl: Fixed a locking issue by holding the led_access lock when
calling led_sysfs_disable() during device removal to prevent
potential warnings.
- Removed unnecessary <linux/fb.h> includes from a bunch of drivers.
- tdo24m: Removed redundant whitespace in Kconfig description.
- pcf50633-backlight: Removed the driver as the underlying pcf50633 MFD
and s3c24xx platform support were removed.
* tag 'backlight-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight: (22 commits)
backlight: pcf50633-backlight: Remove unused driver
backlight: tdo24m: Eliminate redundant whitespace
MAINTAINERS: Add entries for Apple DWI backlight controller
backlight: apple_dwi_bl: Add Apple DWI backlight driver
dt-bindings: leds: backlight: apple,dwi-bl: Add Apple DWI backlight
backlight: led_bl: Hold led_access lock when calling led_sysfs_disable()
backlight: wm831x_bl: Do not include <linux/fb.h>
backlight: vgg2432a4: Do not include <linux/fb.h>
backlight: tps65217_bl: Do not include <linux/fb.h>
backlight: max8925_bl: Do not include <linux/fb.h>
backlight: lv5207lp: Do not include <linux/fb.h>
backlight: locomolcd: Do not include <linux/fb.h>
backlight: hp680_bl: Do not include <linux/fb.h>
backlight: ep93xx_bl: Do not include <linux/fb.h>
backlight: da9052_bl: Do not include <linux/fb.h>
backlight: da903x_bl: Do not include <linux/fb.h>
backlight: bd6107_bl: Do not include <linux/fb.h>
backlight: as3711_bl: Do not include <linux/fb.h>
backlight: adp8870_bl: Do not include <linux/fb.h>
backlight: adp8860_bl: Do not include <linux/fb.h>
...
Linus Torvalds [Sat, 29 Mar 2025 21:42:59 +0000 (14:42 -0700)]
Merge tag 'leds-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
Pull LED updates from Lee Jones:
- pca955x: Add HW blink support, utilizing PWM0. It supports one
frequency across all blinking LEDs and falls back to software blink
if different frequencies are requested.
- trigger: netdev: Allow configuring LED blink interval via .blink_set
even when HW offload (.hw_control) is enabled.
- led-core: Fix a race condition where a quick LED_OFF followed by
another brightness set could leave the LED off incorrectly,
especially noticeable after the introduction of the ordered
workqueue.
- qcom-lpg: Add support for 6-bit PWM resolution alongside the existing
9-bit support.
- qcom-lpg: Fix PWM value capping to respect the selected resolution
(6-bit or 9-bit) for normal PWMs.
- qcom-lpg: Fix PWM value capping to respect the selected resolution
for Hi-Res PWMs.
- qcom-lpg: Fix calculation of the best period for Hi-Res PWMs to
prevent requested duty cycles from exceeding the maximum allowed by
the selected resolution.
- st1202: Add a check for the error code returned by devm_mutex_init().
- pwm-multicolor: Add a check for the return value of
fwnode_property_read_u32().
- Kconfig: leds-st1202: Add select LEDS_TRIGGER_PATTERN as it's
required by the driver.
- lp8860: Drop unneeded explicit assignment to REGCACHE_NONE.
- pca955x: Refactor code with helper functions and rename some
functions/variables for clarity.
- pca955x: Pass driver data pointers instead of the I2C client to
helper functions.
- pca955x: Optimize probe LED selection logic to reduce I2C operations.
- pca955x: Revert the removal of pca95xx_num_led_regs() (renaming it to
pca955x_num_led_regs) as it's needed for HW blink support.
- st1202: Refactor st1202_led_set() to use the !! operator for boolean
conversion.
- st1202: Minor spacing and proofreading edits in comments.
- Directory Rename: Rename the drivers/leds/simple directory to
drivers/leds/simatic as the drivers within are not simple.
- mlxcpld: Remove unused include of acpi.h.
- nic78bx: Tidy up the ACPI ID table (remove ACPI_PTR, use
mod_devicetable.h, remove explicit driver_data initializer).
- tlc591xx: Convert text binding to YAML format, add child node
constraints, and fix typos/formatting in the example.
- qcom-lpg: Document the qcom,pm8937-pwm compatible string as a
fallback for qcom,pm8916-pwm.
* tag 'leds-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds: (23 commits)
leds: nic78bx: Tidy up ACPI ID table
leds: mlxcpld: Remove unused ACPI header inclusion
leds: rgb: leds-qcom-lpg: Fix calculation of best period Hi-Res PWMs
leds: rgb: leds-qcom-lpg: Fix pwm resolution max for Hi-Res PWMs
leds: rgb: leds-qcom-lpg: Fix pwm resolution max for normal PWMs
leds: Rename simple directory to simatic
leds: Kconfig: leds-st1202: Add select for required LEDS_TRIGGER_PATTERN
leds: leds-st1202: Spacing and proofreading editing
leds: leds-st1202: Initialize hardware before DT node child operations
leds: pwm-multicolor: Add check for fwnode_property_read_u32
leds: rgb: leds-qcom-lpg: Add support for 6-bit PWM resolution
leds: Fix LED_OFF brightness race
Revert "leds-pca955x: Remove the unused function pca95xx_num_led_regs()"
leds: st1202: Refactor st1202_led_set() to use !! operator for boolean conversion
dt-bindings: leds: qcom-lpg: Document PM8937 PWM compatible
leds: pca955x: Add HW blink support
leds: pca955x: Optimize probe LED selection
leds: pca955x: Use pointers to driver data rather than I2C client
leds: pca955x: Refactor with helper functions and renaming
dt-bindings: leds: Convert leds-tlc591xx.txt to yaml format
...
Linus Torvalds [Sat, 29 Mar 2025 21:33:13 +0000 (14:33 -0700)]
Merge tag 'mfd-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD updates from Lee Jones:
"Maxim MAX77705:
- Added core MFD driver.
- Added charger driver.
- Added devicetree bindings for the charger and MFD core.
- Added Haptic controller support via the input subsystem.
- Added LED support.
- Added support to simple-mfd-i2c for fuel gauge and hwmon.
Samsung S2MPU05 (Exynos7870 PMIC):
- Added core MFD support.
- Added Regulator support for 21 LDOs and 5 BUCKs.
- Added devicetree bindings for regulators and the PMIC core.
TI TPS65215 & TPS65214:
- Added support to the existing TPS65219 driver.
- Added devicetree bindings.
STMicroelectronics STM32MP25:
- Added support to the stm32-timers MFD driver.
- Added devicetree bindings.
Congatec Board Controller (CGBC):
- Added HWMON support for internal sensors.
- Added support for the conga-SA8 module.
Microchip LAN969X:
- Enabled the at91-usart MFD driver for this architecture.
MediaTek MT6359:
- Added mfd_cell for mt6359-accdet to allow its driver to probe.
Other misc driver updates:
- AXP20X (AXP717): Added AXP717_TS_PIN_CFG register to writeable regs
for temperature sensor configuration.
- SM501: Switched to using BIT() macro to mitigate potential integer
overflows in GPIO functions.
- ENE KB3930: Added a NULL pointer check for off_gpios during probe
to prevent potential dereference.
- SYSCON: Added a check for invalid resource size to prevent issues
from DT misconfiguration.
- CGBC: Corrected signedness issues in cgbc_session_request
- intel_soc_pmic_chtdc_ti / intel_soc_pmic_crc: Removed unneeded
explicit assignment to REGCACHE_NONE.
- ipaq-micro / tps65010: Switched to using str_enable_disable()
helpers for clarity and potential size reduction.
- upboard-fpga: Removed unnecessary ACPI_PTR() annotation.
- max8997: Removed unused max8997_irq_exit() function, using devm_*
helpers instead.
- lp3943: Dropped unused #include <linux/pwm.h> from the header file.
- db8500-prcmu: Removed needless return statements in void APIs.
- qnap-mcu: Replaced commas with semicolons between expressions for
correctness.
- STA2X11: Removed the core MFD driver as the underlying platform
support was removed.
- EZX-PCAP: Removed the unused pcap_adc_sync function.
- PCF50633 (OpenMoko PMIC): Removed the entire driver (core, adc,
gpio, irq) as the underlying s3c24xx platform support was removed.
Linus Torvalds [Sat, 29 Mar 2025 21:31:39 +0000 (14:31 -0700)]
Merge tag 'regmap-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"Only a couple of small patches this release, one refactoring struct
regmap to pack it more efficiently and another which makes our way of
setting all bits consistent in the regmap-irq code"
* tag 'regmap-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: irq: Use one way of setting all bits in the register
regmap: Reorder 'struct regmap'
Linus Torvalds [Sat, 29 Mar 2025 19:52:49 +0000 (12:52 -0700)]
Merge tag 'parisc-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
- drop parisc specific memcpy_fromio() function
- clean up coding style and fix compile warnings
* tag 'parisc-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: led: Use scnprintf() to avoid string truncation warning
Input: gscps2 - Describe missing function parameters
parisc: perf: use named initializers for struct miscdevice
parisc: PDT: Fix missing prototype warning
parisc: Remove memcpy_fromio
parisc: Fix formatting errors in io.c
Linus Torvalds [Sat, 29 Mar 2025 19:47:09 +0000 (12:47 -0700)]
Merge tag 'mips_6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS updates from Thomas Bogendoerfer:
- Add support for multi-cluster configuration
- Add quirks for enabling multi-cluster mode on EyeQ6
- Add DTS clocks for ralink
- Cleanup realtek DTS
- Other cleanups and fixes
* tag 'mips_6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (35 commits)
MIPS: config: omega2+, vocore2: enable CLK_MTMIPS
arch: mips: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
MIPS: cm: Fix warning if MIPS_CM is disabled
MIPS: Fix Macro name
MIPS: ds1287: Match ds1287_set_base_clock() function types
MIPS: cevt-ds1287: Add missing ds1287.h include
MIPS: dec: Declare which_prom() as static
MIPS: Loongson2ef: Replace deprecated strncpy() with strscpy()
mips: dts: ralink: mt7628a: update system controller node and its consumers
mips: dts: ralink: mt7620a: update system controller node and its consumers
mips: dts: ralink: rt3883: update system controller node and its consumers
mips: dts: ralink: rt3050: update system controller node and its consumers
mips: dts: ralink: rt2880: update system controller node and its consumers
dt-bindings: clock: add clock definitions for Ralink SoCs
MIPS: Use arch specific syscall name match function
mips: dts: realtek: Add restart to Cisco SG220-26P
mips: dts: realtek: Add RTL838x SoC peripherals
mips: dts: realtek: Replace uart clock property
mips: dts: realtek: Correct uart interrupt-parent
mips: dts: realtek: Add SoC IRQ node for RTL838x
...
Linus Torvalds [Sat, 29 Mar 2025 18:59:43 +0000 (11:59 -0700)]
Merge tag 's390-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
- Add sorting of mcount locations at build time
- Rework uaccess functions with C exception handling to shorten inline
assembly size and enable full inlining. This yields near-optimal code
for small constant copies with a ~40kb kernel size increase
- Add support for a configurable STRICT_MM_TYPECHECKS which allows to
generate better code, but also allows to have type checking for debug
builds
- Optimize get_lowcore() for common callers with alternatives that
nearly revert to the pre-relocated lowcore code, while also slightly
reducing syscall entry and exit time
- Convert MACHINE_HAS_* checks for single facility tests into cpu_has_*
style macros that call test_facility(), and for features with
additional conditions, add a new ALT_TYPE_FEATURE alternative to
provide a static branch via alternative patching. Also, move machine
feature detection to the decompressor for early patching and add
debugging functionality to easily show which alternatives are patched
- Add exception table support to early boot / startup code to get rid
of the open coded exception handling
- Use asm_inline for all inline assemblies with EX_TABLE or ALTERNATIVE
to ensure correct inlining and unrolling decisions
- Remove 2k page table leftovers now that s390 has been switched to
always allocate 4k page tables
- Split kfence pool into 4k mappings in arch_kfence_init_pool() and
remove the architecture-specific kfence_split_mapping()
- Use READ_ONCE_NOCHECK() in regs_get_kernel_stack_nth() to silence
spurious KASAN warnings from opportunistic ftrace argument tracing
- Force __atomic_add_const() variants on s390 to always return void,
ensuring compile errors for improper usage
- Remove s390's ioremap_wt() and pgprot_writethrough() due to
mismatched semantics and lack of known users, relying on asm-generic
fallbacks
- Signal eventfd in vfio-ap to notify userspace when the guest AP
configuration changes, including during mdev removal
- Convert mdev_types from an array to a pointer in vfio-ccw and vfio-ap
drivers to avoid fake flex array confusion
- Cleanup trap code
- Remove references to the outdated linux390@de.ibm.com address
- Other various small fixes and improvements all over the code
* tag 's390-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (78 commits)
s390: Use inline qualifier for all EX_TABLE and ALTERNATIVE inline assemblies
s390/kfence: Split kfence pool into 4k mappings in arch_kfence_init_pool()
s390/ptrace: Avoid KASAN false positives in regs_get_kernel_stack_nth()
s390/boot: Ignore vmlinux.map
s390/sysctl: Remove "vm/allocate_pgste" sysctl
s390: Remove 2k vs 4k page table leftovers
s390/tlb: Use mm_has_pgste() instead of mm_alloc_pgste()
s390/lowcore: Use lghi instead llilh to clear register
s390/syscall: Merge __do_syscall() and do_syscall()
s390/spinlock: Implement SPINLOCK_LOCKVAL with inline assembly
s390/smp: Implement raw_smp_processor_id() with inline assembly
s390/current: Implement current with inline assembly
s390/lowcore: Use inline qualifier for get_lowcore() inline assembly
s390: Move s390 sysctls into their own file under arch/s390
s390/syscall: Simplify syscall_get_arguments()
s390/vfio-ap: Notify userspace that guest's AP config changed when mdev removed
s390: Remove ioremap_wt() and pgprot_writethrough()
s390/mm: Add configurable STRICT_MM_TYPECHECKS
s390/mm: Convert pgste_val() into function
s390/mm: Convert pgprot_val() into function
...
Linus Torvalds [Sat, 29 Mar 2025 18:36:19 +0000 (11:36 -0700)]
Merge tag 'efi-next-for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
- Decouple mixed mode startup code from the traditional x86
decompressor
- Revert zero-length file hack in efivarfs
- Prevent EFI zboot from using the CopyMem/SetMem boot services after
ExitBootServices()
- Update EFI zboot to use the ZLIB/ZSTD library interfaces directly
* tag 'efi-next-for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi/libstub: Avoid legacy decompressor zlib/zstd wrappers
efi/libstub: Avoid CopyMem/SetMem EFI services after ExitBootServices
efi: efibc: change kmalloc(size * count, ...) to kmalloc_array()
efivarfs: Revert "allow creation of zero length files"
x86/efi/mixed: Move mixed mode startup code into libstub
x86/efi/mixed: Simplify and document thunking logic
x86/efi/mixed: Remove dependency on legacy startup_32 code
x86/efi/mixed: Set up 1:1 mapping of lower 4GiB in the stub
x86/efi/mixed: Factor out and clean up long mode entry
x86/efi/mixed: Check CPU compatibility without relying on verify_cpu()
x86/efistub: Merge PE and handover entrypoints
Linus Torvalds [Sat, 29 Mar 2025 18:23:16 +0000 (11:23 -0700)]
Merge tag 'devicetree-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
"DT core:
- Fix ref counting errors in interrupt parsing code
- Allow "nonposted-mmio" property per device and on non-Apple h/w
- Use typed accessors in platform driver code
- Fix mismatch between DT MAX_PHANDLE_ARGS and
NR_FWNODE_REFERENCE_ARGS and increase the maximum number args
- Rework of_resolve_phandles() to use __free() cleanup and fix ref
count error
- Use of_prop_cmp() in a few more places
- Improve make_fit.py script error handling
DT bindings:
- Update DT property ordering rules for properties within groups
(i.e. common suffix)
- Update DT submitting-patches doc to cover sending .dts patches and
SoC maintainer rules on being warning free against linux-next
- Add ti,tps53681, ti,tps53681, Maxim max15301, max15303, and
max20751 to trivial devices
- Add Renesas RZ/V2H(P) and Allwinner H616 support to Arm Mali
Bifrost GPU. Add Samsung exynos7870 support to Arm Mail Midgard.
- Rework qcom,ebi2 and samsung,exynos4210-sram memory controller
bindings to split child node properties. Fix the LAN9115 binding to
use the child node schema so all properties are documented.
- Convert nxp,lpc3220-mic and Altera ECC manager bindings to schema
- Fix some issues with LVDS display panels causing validation
warnings
- Drop some obsolete parts of Xilinx bindings"
* tag 'devicetree-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (48 commits)
scripts/make_fit: Print DT name before libfdt errors
dt-bindings: edac: altera: socfpga: Convert to YAML
dt-bindings: pps: gpio: Correct indentation and style in DTS example
media: dt-bindings: mediatek,vcodec-encoder: Drop assigned-clock properties
of: address: Allow to specify nonposted-mmio per-device
of: address: Expand nonposted-mmio to non-Apple Silicon platforms
docs: dt-bindings: Specify ordering for properties within groups
dt-bindings: gpu: arm,mali-midgard: add exynos7870-mali compatible
of: Move of_prop_val_eq() next to the single user
of/platform: Use typed accessors rather than of_get_property()
dt-bindings: trivial-devices: Add Maxim max15301, max15303, and max20751
dt-bindings: fsi: ibm,p9-scom: Add "ibm,fsi2pib" compatible
dt-bindings: memory-controllers: qcom,ebi2: Enforce child props
dt-bindings: memory-controllers: samsung,exynos4210-srom: Enforce child props
dt-bindings: display: mitsubishi,aa104xd12: Adjust allowed and required properties
dt-bindings: display: mitsubishi,aa104xd12: Allow jeida-18 for data-mapping
dt-bindings: interrupt-controller: Convert nxp,lpc3220-mic.txt to yaml format
docs: process: maintainer-soc-clean-dts: linux-next is decisive
docs: dt: submitting-patches: Document sending DTS patches
of: Align macro MAX_PHANDLE_ARGS with NR_FWNODE_REFERENCE_ARGS
...
Linus Torvalds [Sat, 29 Mar 2025 18:12:28 +0000 (11:12 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
- Usual minor updates and fixes for bnxt_re, hfi1, rxe, mana, iser,
mlx5, vmw_pvrdma, hns
- Make rxe work on tun devices
- mana gains more standard verbs as it moves toward supporting
in-kernel verbs
- DMABUF support for mana
- Fix page size calculations when memory registration exceeds 4G
- On Demand Paging support for rxe
- mlx5 support for RDMA TRANSPORT flow tables and a new ucap mechanism
to access control use of them
- Optional RDMA_TX/RX counters per QP in mlx5
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (73 commits)
IB/mad: Check available slots before posting receive WRs
RDMA/mana_ib: Fix integer overflow during queue creation
RDMA/mlx5: Fix calculation of total invalidated pages
RDMA/mlx5: Fix mlx5_poll_one() cur_qp update flow
RDMA/mlx5: Fix page_size variable overflow
RDMA/mlx5: Drop access_flags from _mlx5_mr_cache_alloc()
RDMA/mlx5: Fix cache entry update on dereg error
RDMA/mlx5: Fix MR cache initialization error flow
RDMA/mlx5: Support optional-counters binding for QPs
RDMA/mlx5: Compile fs.c regardless of INFINIBAND_USER_ACCESS config
RDMA/core: Pass port to counter bind/unbind operations
RDMA/core: Add support to optional-counters binding configuration
RDMA/core: Create and destroy rdma_counter using rdma_zalloc_drv_obj()
RDMA/mlx5: Add optional counters for RDMA_TX/RX_packets/bytes
RDMA/core: Fix use-after-free when rename device name
RDMA/bnxt_re: Support perf management counters
RDMA/rxe: Fix incorrect return value of rxe_odp_atomic_op()
RDMA/uverbs: Propagate errors from rdma_lookup_get_uobject()
RDMA/mana_ib: Handle net event for pointing to the current netdev
net: mana: Change the function signature of mana_get_primary_netdev_rcu
...
Linus Torvalds [Sat, 29 Mar 2025 17:45:20 +0000 (10:45 -0700)]
Merge tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull fwctl subsystem from Jason Gunthorpe:
"fwctl is a new subsystem intended to bring some common rules and order
to the growing pattern of exposing a secure FW interface directly to
userspace.
Unlike existing places like RDMA/DRM/VFIO/uacce that are exposing a
device for datapath operations fwctl is focused on debugging,
configuration and provisioning of the device. It will not have the
necessary features like interrupt delivery to support a datapath.
This concept is similar to the long standing practice in the "HW" RAID
space of having a device specific misc device to manage the RAID
controller FW. fwctl generalizes this notion of a companion debug and
management interface that goes along with a dataplane implemented in
an appropriate subsystem.
There have been three LWN articles written discussing various aspects
of this:
This includes three drivers to launch the subsystem:
- CXL provides a vendor scheme for executing commands and a way to
learn the 'command effects' (ie the security properties) of such
commands. The fwctl driver allows access to these mechanism within
the fwctl security model
- mlx5 is family of networking products, the driver supports all
current Mellanox HW still receiving FW feature updates. This
includes RDMA multiprotocol NICs like ConnectX and the Bluefield
family of Smart NICs.
- AMD/Pensando Distributed Services card is a multi protocol Smart
NIC with a multi PCI function design. fwctl works on the management
PCI function following a 'command effects' model similar to CXL"
* tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (30 commits)
pds_fwctl: add Documentation entries
pds_fwctl: add rpc and query support
pds_fwctl: initial driver framework
pds_core: add new fwctl auxiliary_device
pds_core: specify auxiliary_device to be created
pds_core: make pdsc_auxbus_dev_del() void
cxl: Fixup kdoc issues for include/cxl/features.h
fwctl/cxl: Add documentation to FWCTL CXL
cxl/test: Add Set Feature support to cxl_test
cxl/test: Add Get Feature support to cxl_test
cxl: Add support to handle user feature commands for set feature
cxl: Add support to handle user feature commands for get feature
cxl: Add support for fwctl RPC command to enable CXL feature commands
cxl: Move cxl feature command structs to user header
cxl: Add FWCTL support to CXL
mlx5: Create an auxiliary device for fwctl_mlx5
fwctl/mlx5: Support for communicating with mlx5 fw
fwctl: Add documentation
fwctl: FWCTL_RPC to execute a Remote Procedure Call to device firmware
taint: Add TAINT_FWCTL
...
Kent Overstreet [Wed, 26 Mar 2025 14:41:33 +0000 (10:41 -0400)]
bcachefs: Better printing of inconsistency errors
Build up and emit the error message for an inconsistency error all at
once, instead of spread over multiple printk calls, so they're not
jumbled in the dmesg log.
Also, add better indenting.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 28 Mar 2025 16:15:32 +0000 (12:15 -0400)]
bcachefs: bch2_count_fsck_err()
Factor out a helper from __bch2_fsck_err(), for counting the error in
the superblock and deciding whether to print or ratelimit - will be used
to replace some log_fsck_err() calls, where we want to lift out printing
the error message.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Sat, 29 Mar 2025 17:01:55 +0000 (10:01 -0700)]
Merge tag 'v6.15-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Remove legacy compression interface
- Improve scatterwalk API
- Add request chaining to ahash and acomp
- Add virtual address support to ahash and acomp
- Add folio support to acomp
- Remove NULL dst support from acomp
Algorithms:
- Library options are fuly hidden (selected by kernel users only)
- Add Kerberos5 algorithms
- Add VAES-based ctr(aes) on x86
- Ensure LZO respects output buffer length on compression
- Remove obsolete SIMD fallback code path from arm/ghash-ce
Drivers:
- Add support for PCI device 0x1134 in ccp
- Add support for rk3588's standalone TRNG in rockchip
- Add Inside Secure SafeXcel EIP-93 crypto engine support in eip93
- Fix bugs in tegra uncovered by multi-threaded self-test
- Fix corner cases in hisilicon/sec2
Others:
- Add SG_MITER_LOCAL to sg miter
- Convert ubifs, hibernate and xfrm_ipcomp from legacy API to acomp"
* tag 'v6.15-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (187 commits)
crypto: testmgr - Add multibuffer acomp testing
crypto: acomp - Fix synchronous acomp chaining fallback
crypto: testmgr - Add multibuffer hash testing
crypto: hash - Fix synchronous ahash chaining fallback
crypto: arm/ghash-ce - Remove SIMD fallback code path
crypto: essiv - Replace memcpy() + NUL-termination with strscpy()
crypto: api - Call crypto_alg_put in crypto_unregister_alg
crypto: scompress - Fix incorrect stream freeing
crypto: lib/chacha - remove unused arch-specific init support
crypto: remove obsolete 'comp' compression API
crypto: compress_null - drop obsolete 'comp' implementation
crypto: cavium/zip - drop obsolete 'comp' implementation
crypto: zstd - drop obsolete 'comp' implementation
crypto: lzo - drop obsolete 'comp' implementation
crypto: lzo-rle - drop obsolete 'comp' implementation
crypto: lz4hc - drop obsolete 'comp' implementation
crypto: lz4 - drop obsolete 'comp' implementation
crypto: deflate - drop obsolete 'comp' implementation
crypto: 842 - drop obsolete 'comp' implementation
crypto: nx - Migrate to scomp API
...
Sungjong Seo [Wed, 26 Mar 2025 15:01:16 +0000 (00:01 +0900)]
exfat: call bh_read in get_block only when necessary
With commit 11a347fb6cef ("exfat: change to get file size from DataLength"),
exfat_get_block() can now handle valid_size. However, most partial
unwritten blocks that could be mapped with other blocks are being
inefficiently processed separately as individual blocks.
Except for partial unwritten blocks that require independent processing,
let's handle them simply as before.
Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Sungjong Seo [Wed, 26 Mar 2025 14:48:48 +0000 (23:48 +0900)]
exfat: fix potential wrong error return from get_block
If there is no error, get_block() should return 0. However, when bh_read()
returns 1, get_block() also returns 1 in the same manner.
Let's set err to 0, if there is no error from bh_read()
Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength") Cc: stable@vger.kernel.org Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Linus Torvalds [Sat, 29 Mar 2025 02:36:53 +0000 (19:36 -0700)]
Merge tag 'pci-v6.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Enable Configuration RRS SV, which makes device readiness visible,
early instead of during child bus scanning (Bjorn Helgaas)
- Log debug messages about reset methods being used (Bjorn Helgaas)
- Avoid reset when it has been disabled via sysfs (Nishanth
Aravamudan)
- Add common pci-ep-bus.yaml schema for exporting several peripherals
of a single PCI function via devicetree (Andrea della Porta)
- Create DT nodes for PCI host bridges to enable loading device tree
overlays to create platform devices for PCI devices that have
several features that require multiple drivers (Herve Codina)
Resource management:
- Enlarge devres table[] to accommodate bridge windows, ROM, IOV
BARs, etc., and validate BAR index in devres interfaces (Philipp
Stanner)
- Fix typo that repeatedly distributed resources to a bridge instead
of iterating over subordinate bridges, which resulted in too little
space to assign some BARs (Kai-Heng Feng)
- Relax bridge window tail sizing for optional resources, e.g., IOV
BARs, to avoid failures when removing and re-adding devices (Ilpo
Järvinen)
- Allow drivers to enable devices even if we haven't assigned
optional IOV resources to them (Ilpo Järvinen)
- Rework handling of optional resources (IOV BARs, ROMs) to reduce
failures if we can't allocate them (Ilpo Järvinen)
- Fix a NULL dereference in the SR-IOV VF creation error path (Shay
Drory)
- Fix s390 mmio_read/write syscalls, which didn't cause page faults
in some cases, which broke vfio-pci lazy mapping on first access
(Niklas Schnelle)
- Add pdev->non_mappable_bars to replace CONFIG_VFIO_PCI_MMAP, which
was disabled only for s390 (Niklas Schnelle)
- Support mmap of PCI resources on s390 except for ISM devices
(Niklas Schnelle)
ASPM:
- Delay pcie_link_state deallocation to avoid dangling pointers that
cause invalid references during hot-unplug (Daniel Stodden)
Power management:
- Allow PCI bridges to go to D3Hot when suspending on all non-x86
systems (Manivannan Sadhasivam)
Power control:
- Create pwrctrl devices in pci_scan_device() to make it more
symmetric with pci_pwrctrl_unregister() and make pwrctrl devices
for PCI bridges possible (Manivannan Sadhasivam)
- Unregister pwrctrl devices in pci_destroy_dev() so DOE, ASPM, etc.
can still access devices after pci_stop_dev() (Manivannan
Sadhasivam)
- If there's a pwrctrl device for a PCI device, skip scanning it
because the pwrctrl core will rescan the bus after the device is
powered on (Manivannan Sadhasivam)
- Add a pwrctrl driver for PCI slots based on voltage regulators
described via devicetree (Manivannan Sadhasivam)
Bandwidth control:
- Add set_pcie_speed.sh to TEST_PROGS to fix issue when executing the
set_pcie_cooling_state.sh test case (Yi Lai)
- Avoid a NULL pointer dereference when we run out of bus numbers to
assign for a bridge secondary bus (Lukas Wunner)
Hotplug:
- Drop superfluous pci_hotplug_slot_list, try_module_get() calls, and
NULL pointer checks (Lukas Wunner)
- Drop shpchp module init/exit logging, replace shpchp dbg() with
ctrl_dbg(), and remove unused dbg(), err(), info(), warn() wrappers
(Ilpo Järvinen)
- Drop 'shpchp_debug' module parameter in favor of standard dynamic
debugging (Ilpo Järvinen)
- Drop unused cpcihp .get_power(), .set_power() function pointers
(Guilherme Giacomo Simoes)
- Disable hotplug interrupts in portdrv only when pciehp is not
enabled to avoid issuing two hotplug commands too close together
(Feng Tang)
- Skip pciehp 'device replaced' check if the device has been removed
to address a deadlock when resuming after a device was removed
during system sleep (Lukas Wunner)
- Don't enable pciehp hotplug interupt when resuming in poll mode
(Ilpo Järvinen)
Virtualization:
- Fix bugs in 'pci=config_acs=' kernel command line parameter (Tushar
Dave)
DOE:
- Expose supported DOE features via sysfs (Alistair Francis)
- Allow DOE support to be enabled even if CXL isn't enabled (Alistair
Francis)
Endpoint framework:
- Convert PCI device data so pci-epf-test works correctly on
big-endian endpoint systems (Niklas Cassel)
- Add BAR_RESIZABLE type to endpoint framework and add DWC core
support for EPF drivers to set BAR_RESIZABLE type and size (Niklas
Cassel)
- Fix pci-epf-test double free that causes an oops if the host
reboots and PERST# deassertion restarts endpoint BAR allocation
(Christian Bruel)
- Fix endpoint BAR testing so tests can skip disabled BARs instead of
reporting them as failures (Niklas Cassel)
- Widen endpoint test BAR size variable to accommodate BARs larger
than INT_MAX (Niklas Cassel)
- Remove unused tools 'pci' build target left over after moving tests
to tools/testing/selftests/pci_endpoint (Jianfeng Liu)
Altera PCIe controller driver:
- Add DT binding and driver support for Agilex family (P-Tile,
F-Tile, R-Tile) (Matthew Gerlach and D M, Sharath Kumar)
AMD MDB PCIe controller driver:
- Add DT binding and driver for AMD MDB (Multimedia DMA Bridge)
(Thippeswamy Havalige)
Broadcom STB PCIe controller driver:
- Add BCM2712 MSI-X DT binding and interrupt controller drivers and
add softdep on irq_bcm2712_mip driver to ensure that it is loaded
first (Stanimir Varbanov)
- Expand inbound window map to 64GB so it can accommodate BCM2712
(Stanimir Varbanov)
- Add BCM2712 support and DT updates (Stanimir Varbanov)
- Apply link speed restriction before bringing link up, not after
(Jim Quinlan)
- Update Max Link Speed in Link Capabilities via the internal
writable register, not the read-only config register (Jim Quinlan)
- Handle regulator_bulk_get() error to avoid panic when we call
regulator_bulk_free() later (Jim Quinlan)
- Disable regulators only when removing the bus immediately below a
Root Port because we don't support regulators deeper in the
hierarchy (Jim Quinlan)
- Make const read-only arrays static (Colin Ian King)
Cadence PCIe endpoint driver:
- Correct MSG TLP generation so endpoints can generate INTx messages
(Hans Zhang)
Freescale i.MX6 PCIe controller driver:
- Identify the second controller on i.MX8MQ based on devicetree
'linux,pci-domain' instead of DBI 'reg' address (Richard Zhu)
- Remove imx_pcie_cpu_addr_fixup() since dwc core can now derive the
ATU input address (using parent_bus_offset) from devicetree (Frank
Li)
Freescale Layerscape PCIe controller driver:
- Drop deprecated 'num-ib-windows' and 'num-ob-windows' and
unnecessary 'status' from example (Krzysztof Kozlowski)
- Correct the syscon_regmap_lookup_by_phandle_args("fsl,pcie-scfg")
arg_count to fix probe failure on LS1043A (Ioana Ciornei)
HiSilicon STB PCIe controller driver:
- Call phy_exit() to clean up if histb_pcie_probe() fails (Christophe
JAILLET)
Intel Gateway PCIe controller driver:
- Remove intel_pcie_cpu_addr() since dwc core can now derive the ATU
input address (using parent_bus_offset) from devicetree (Frank Li)
Intel VMD host bridge driver:
- Convert vmd_dev.cfg_lock from spinlock_t to raw_spinlock_t so
pci_ops.read() will never sleep, even on PREEMPT_RT where
spinlock_t becomes a sleepable lock, to avoid calling a sleeping
function from invalid context (Ryo Takakura)
- Make DT iommu property required for SA8775P and prohibited for
SDX55 (Dmitry Baryshkov)
- Add DT IOMMU and DMA-related properties for Qualcomm SM8450 (Dmitry
Baryshkov)
- Add endpoint DT properties for SAR2130P and enable endpoint mode in
driver (Dmitry Baryshkov)
- Describe endpoint BAR0 and BAR2 as 64-bit only and BAR1 and BAR3 as
RESERVED (Manivannan Sadhasivam)
Rockchip DesignWare PCIe controller driver:
- Describe rk3568 and rk3588 BARs as Resizable, not Fixed (Niklas
Cassel)
Synopsys DesignWare PCIe controller driver:
- Add debugfs-based Silicon Debug, Error Injection, Statistical
Counter support for DWC (Shradha Todi)
- Add debugfs property to expose LTSSM status of DWC PCIe link (Hans
Zhang)
- Add Rockchip support for DWC debugfs features (Niklas Cassel)
- Add dw_pcie_parent_bus_offset() to look up the parent bus address
of a specified 'reg' property and return the offset from the CPU
physical address (Frank Li)
- Use dw_pcie_parent_bus_offset() to derive CPU -> ATU addr offset
via 'reg[config]' for host controllers and 'reg[addr_space]' for
endpoint controllers (Frank Li)
- Apply struct dw_pcie.parent_bus_offset in ATU users to remove use
of .cpu_addr_fixup() when programming ATU (Frank Li)
TI J721E PCIe driver:
- Correct the 'link down' interrupt bit for J784S4 (Siddharth
Vadapalli)
TI Keystone PCIe controller driver:
- Describe AM65x BARs 2 and 5 as Resizable (not Fixed) and reduce
alignment requirement from 1MB to 64KB (Niklas Cassel)
Xilinx Versal CPM PCIe controller driver:
- Free IRQ domain in probe error path to avoid leaking it
(Thippeswamy Havalige)
- Add DT .compatible "xlnx,versal-cpm5nc-host" and driver support for
Versal Net CPM5NC Root Port controller (Thippeswamy Havalige)
- Add driver support for CPM5_HOST1 (Thippeswamy Havalige)
Miscellaneous:
- Convert fsl,mpc83xx-pcie binding to YAML (J. Neuschäfer)
- Use for_each_available_child_of_node_scoped() to simplify apple,
kirin, mediatek, mt7621, tegra drivers (Zhang Zekun)"
* tag 'pci-v6.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (197 commits)
PCI: layerscape: Fix arg_count to syscon_regmap_lookup_by_phandle_args()
PCI: j721e: Fix the value of .linkdown_irq_regfield for J784S4
misc: pci_endpoint_test: Add support for PCITEST_IRQ_TYPE_AUTO
PCI: endpoint: pci-epf-test: Expose supported IRQ types in CAPS register
PCI: dw-rockchip: Endpoint mode cannot raise INTx interrupts
PCI: endpoint: Add intx_capable to epc_features struct
dt-bindings: PCI: Add common schema for devices accessible through PCI BARs
PCI: intel-gw: Remove intel_pcie_cpu_addr()
PCI: imx6: Remove imx_pcie_cpu_addr_fixup()
PCI: dwc: Use parent_bus_offset to remove need for .cpu_addr_fixup()
PCI: dwc: ep: Ensure proper iteration over outbound map windows
PCI: dwc: ep: Use devicetree 'reg[addr_space]' to derive CPU -> ATU addr offset
PCI: dwc: ep: Consolidate devicetree handling in dw_pcie_ep_get_resources()
PCI: dwc: ep: Call epc_create() early in dw_pcie_ep_init()
PCI: dwc: Use devicetree 'reg[config]' to derive CPU -> ATU addr offset
PCI: dwc: Add dw_pcie_parent_bus_offset() checking and debug
PCI: dwc: Add dw_pcie_parent_bus_offset()
PCI/bwctrl: Fix NULL pointer dereference on bus number exhaustion
PCI: xilinx-cpm: Add cpm_csr register mapping for CPM5_HOST1 variant
PCI: brcmstb: Make const read-only arrays static
...
Kent Overstreet [Fri, 28 Mar 2025 15:59:09 +0000 (11:59 -0400)]
bcachefs: Better helpers for inconsistency errors
An inconsistency error often happens as part of an event with multiple
error messages, and we want to build up one single error message with
proper indenting to produce more readable log messages that don't get
garbled.
Add new helpers that emit messages to a printbuf instead of printing
them directly, next patch will convert to use them.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 25 Mar 2025 14:52:00 +0000 (10:52 -0400)]
bcachefs: bch2_time_stats_init_no_pcpu()
Add a mode to disable automatic switching to percpu mode, useful when a
time_stats will only be used by one thread and we don't want to have to
flush the percpu buffers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Sat, 29 Mar 2025 00:44:52 +0000 (17:44 -0700)]
Merge tag 'drm-next-2025-03-28' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
"Outside of drm there are some rust patches from Danilo who maintains
that area in here, and some pieces for drm header check tests.
The major things in here are a new driver supporting the touchbar
displays on M1/M2, the nova-core stub driver which is just the vehicle
for adding rust abstractions and start developing a real driver inside
of.
xe adds support for SVM with a non-driver specific SVM core
abstraction that will hopefully be useful for other drivers, along
with support for shrinking for TTM devices. I'm sure xe and AMD
support new devices, but the pipeline depth on these things is hard to
know what they end up being in the marketplace!
uapi:
- add mediatek tiled fourcc
- add support for notifying userspace on device wedged
new driver:
- appletbdrm: support for Apple Touchbar displays on m1/m2
- nova-core: skeleton rust driver to develop nova inside off
firmware:
- add some rust firmware pieces
rust:
- add 'LocalModule' type alias
component:
- add helper to query bound status
fbdev:
- fbtft: remove access to page->index
media:
- cec: tda998x: import driver from drm
dma-buf:
- add fast path for single fence merging
tests:
- fix lockdep warnings
atomic:
- allow full modeset on connector changes
- clarify semantics of allow_modeset and drm_atomic_helper_check
- async-flip: support on arbitary planes
- writeback: fix UAF
- Document atomic-state history
format-helper:
- support ARGB8888 to ARGB4444 conversions
buddy:
- fix multi-root cleanup
ci:
- update IGT
dp:
- support extended wake timeout
- mst: fix RAD to string conversion
- increase DPCD eDP control CAP size to 5 bytes
- add DPCD eDP v1.5 definition
- add helpers for LTTPR transparent mode
ttm:
- refactor pool allocation
- add helpers for TTM shrinker
panel-orientation:
- add a bunch of new quirks
panel:
- convert panels to multi-style functions
- edp: Add support for B140UAN04.4, BOE NV140FHM-NZ, CSW MNB601LS1-3,
LG LP079QX1-SP0V, MNE007QS3-7, STA 116QHD024002, Starry
116KHD024006, Lenovo T14s Gen6 Snapdragon
- himax-hx83102: Add support for CSOT PNA957QT1-1, Kingdisplay
kd110n11-51ie, Starry 2082109qfh040022-50e
- visionox-r66451: use multi-style MIPI-DSI functions
- raydium-rm67200: Add driver for Raydium RM67200
- simple: Add support for BOE AV123Z7M-N17, BOE AV123Z7M-N17
- sony-td4353-jdi: Use MIPI-DSI multi-func interface
- summit: Add driver for Apple Summit display panel
- visionox-rm692e5: Add driver for Visionox RM692E5
bridge:
- pass full atomic state to various callbacks
- adv7511: Report correct capabilities
- it6505: Fix HDCP V compare
- snd65dsi86: fix device IDs
- nwl-dsi: set bridge type
- ti-sn65si83: add error recovery and set bridge type
- synopsys: add HDMI audio support
xe:
- support device-wedged event
- add mmap support for PCI memory barrier
- perf pmu integration and expose per-engien activity
- add EU stall sampling support
- GPU SVM and Xe SVM implementation
- use TTM shrinker
- add survivability mode to allow the driver to do firmware updates
in critical failure states
- PXP HWDRM support for MTL and LNL
- expose package/vram temps over hwmon
- enable DP tunneling
- drop mmio_ext abstraction
- Reject BO evcition if BO is bound to current VM
- Xe suballocator improvements
- re-use display vmas when possible
- add GuC Buffer Cache abstraction
- PCI ID update for Panther Lake and Battlemage
- Enable SRIOV for Panther Lake
- Refactor VRAM manager location
i915:
- enable extends wake timeout
- support device-wedged event
- Enable DP 128b/132b SST DSC
- FBC dirty rectangle support for display version 30+
- convert i915/xe to drm client setup
- Compute HDMI PLLS for rates not in fixed tables
- Allow DSB usage when PSR is enabled on LNL+
- Enable panel replay without full modeset
- Enable async flips with compressed buffers on ICL+
- support luminance based brightness via DPCD for eDP
- enable VRR enable/disable without full modeset
- allow GuC SLPC default strategies on MTL+ for performance
- lots of display refactoring in move to struct intel_display
amdgpu:
- add device wedged event
- support async page flips on overlay planes
- enable broadcast RGB drm property
- add info ioctl for virt mode
- OEM i2c support for RGB lights
- GC 11.5.2 + 11.5.3 support
- SDMA 6.1.3 support
- NBIO 7.9.1 + 7.11.2 support
- MMHUB 1.8.1 + 3.3.2 support
- DCN 3.6.0 support
- Add dynamic workload profile switching for GC 10-12
- support larger VBIOS sizes
- Mark gttsize parameters as deprecated
- Initial JPEG queue resset support
amdkfd:
- add KFD per process flags for setting precision
- sync pasid values between KGD and KFD
- improve GTT/VRAM handling for APUs
- fix user queue validation on GC7/8
- SDMA queue reset support
raedeon:
- rs400 hyperz fix
i2c:
- td998x: drop platform_data, split driver into media and bridge
imagination:
- check job dependencies with sched helper
ivpu:
- improve command queue handling
- use workqueue for IRQ handling
- add support HW fault injection
- locking fixes
mgag200:
- add support for G200eH5
msm:
- dpu: add concurrent writeback support for DPU 10.x+
- use LTTPR helpers
- GPU:
- Fix obscure GMU suspend failure
- Expose syncobj timeline support
- Extend GPU devcoredump with pagetable info
- a623 support
- Fix a6xx gen1/gen2 indexed-register blocks in gpu snapshot /
devcoredump
- Display:
- Add cpu-cfg interconnect paths on SM8560 and SM8650
- Introduce KMS OMMU fault handler, causing devcoredump snapshot
- Fixed error pointer dereference in msm_kms_init_aspace()
- DPU:
- Fix mode_changing handling
- Add writeback support on SM6150 (QCS615)
- Fix DSC programming in 1:1:1 topology
- Reworked hardware resource allocation, moving it to the CRTC code
- Enabled support for Concurrent WriteBack (CWB) on SM8650
- Enabled CDM blocks on all relevant platforms
- Reworked debugfs interface for BW/clocks debugging
- Clear perf params before calculating bw
- Support YUV formats on writeback
- Fixed double inclusion
- Fixed writeback in YUV formats when using cloned output, Dropped
wb2_formats_rgb
- Corrected dpu_crtc_check_mode_changed and struct dpu_encoder_virt
kerneldocs
- Fixed uninitialized variable in dpu_crtc_kickoff_clone_mode()
- DSI:
- DSC-related fixes
- Rework clock programming
- DSI PHY:
- Fix 7nm (and lower) PHY programming
- Add proper DT schema definitions for DSI PHY clocks
- HDMI:
- Rework the driver, enabling the use of the HDMI Connector
framework
- Bindings:
- Added eDP PHY on SA8775P
nouveau:
- move drm_slave_encoder interface into driver
- nvkm: refactor GSP RPC
- use LTTPR helpers
mediatek:
- HDMI fixup and refinement
- add MT8188 dsc compatible
- MT8365 SoC support
panthor:
- Expose sizes of intenral BOs via fdinfo
- Fix race between reset and suspend
- Improve locking
qaic:
- Add support for AIC200
renesas:
- Fix limits in DT bindings
rockchip:
- support rk3562-mali
- rk3576: Add HDMI support
- vop2: Add new display modes on RK3588 HDMI0 up to 4K
- Don't change HDMI reference clock rate
- Fix DT bindings
- analogix_dp: add eDP support
- fix shutodnw
solomon:
- Set SPI device table to silence warnings
- Fix pixel and scanline encoding
v3d:
- handle clock
vc4:
- Use drm_exec
- Use dma-resv for wait-BO ioctl
- Remove seqno infrastructure
virtgpu:
- Support partial mappings of GEM objects
- Reserve VGA resources during initialization
- Fix UAF in virtgpu_dma_buf_free_obj()
- Add panic support
vkms:
- Switch to a managed modesetting pipeline
- Add support for ARGB8888
- fix UAf
xlnx:
- Set correct DMA segment size
- use mutex guards
- Fix error handling
- Fix docs"
* tag 'drm-next-2025-03-28' of https://gitlab.freedesktop.org/drm/kernel: (1762 commits)
drm/amd/pm: Update feature list for smu_v13_0_6
drm/amdgpu: Add parameter documentation for amdgpu_sync_fence
drm/amdgpu/discovery: optionally use fw based ip discovery
drm/amdgpu/discovery: use specific ip_discovery.bin for legacy asics
drm/amdgpu/discovery: check ip_discovery fw file available
drm/amd/pm: Remove unnecessay UQ10 to UINT conversion
drm/amd/pm: Remove unnecessay UQ10 to UINT conversion
drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA
drm/amdgpu: Optimize VM invalidation engine allocation and synchronize GPU TLB flush
drm/amd/amdgpu: Increase max rings to enable SDMA page ring
drm/amdgpu: Decode deferred error type in gfx aca bank parser
drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.5 GPUs
drm/amdgpu/mes: clean up SDMA HQD loop
drm/amdgpu/mes: enable compute pipes across all MEC
drm/amdgpu/mes: drop MES 10.x leftovers
drm/amdgpu/mes: optimize compute loop handling
drm/amdgpu/sdma: guilty tracking is per instance
drm/amdgpu/sdma: fix engine reset handling
drm/amdgpu: remove invalid usage of sched.ready
drm/amdgpu: add cleaner shader trace point
...
Linus Torvalds [Fri, 28 Mar 2025 23:45:37 +0000 (16:45 -0700)]
Merge tag 'fbdev-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev updates from Helge Deller:
"This includes a major refactoring of the fbcon packed pixel drawing
routines, contributed by Zsolt Kajtar. The original version duplicated
more or less the same algorithms for both system and i/o memory.
The new implementation is more robust, both implementations are now
feature complete (e.g. pixel order reversing now supported by both),
behaves the same way as it uses the identical code for both variants
and adds support for foreign endian framebuffers.
The other patches add some parameter checks, static attribute groups
for sysfs entries and console fixes:
- dummycon: only build module if there are users and fix rows/cols
(Arnd Bergmann)
- mdacon: rework dependency list (Arnd Bergmann)
- lcdcfb, fsl-diu-fb, fbcon: Fix registering and removing of sysfs
(Shixiong Ou)
- sm501fb: Add some geometry checks (Danila Chernetsov)
- omapfb: Remove unused code, add value checks (Leonid Arapov)
- au1100fb: Clean up variable assignment (Markus Elfring)
- pxafb: use devm_kmemdup*() (Raag Jadav)"
* tag 'fbdev-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
fbdev: fsl-diu-fb: add missing device_remove_file()
fbcon: Use static attribute groups for sysfs entries
fbdev: sm501fb: Add some geometry checks.
fbdev: omapfb: Add 'plane' value check
fbdev: omapfb: Remove writeback deadcode
MAINTAINERS: Add contact info for fbdev packed pixel drawing
fbdev: Refactoring the fbcon packed pixel drawing routines
fbdev: wmt_ge_rops: Remove fb_draw.h includes
fbdev: mach64_cursor: Remove fb_draw.h includes
fbdev: Register sysfs groups through device_add_group
fbdev: lcdcfb: Register sysfs groups through driver core
mdacon: rework dependency list
dummycon: fix default rows/cols
dummycon: only build module if there are users
fbdev: au1100fb: Move a variable assignment behind a null pointer check
fbdev: pxafb: use devm_kmemdup*()
fbcon: Use correct erase colour for clearing in fbcon
fbdev: core: tileblit: Implement missing margin clearing for tileblit
Linus Torvalds [Fri, 28 Mar 2025 22:07:04 +0000 (15:07 -0700)]
Merge tag 'for-6.15/io_uring-reg-vec-20250327' of git://git.kernel.dk/linux
Pull more io_uring updates from Jens Axboe:
"Final separate updates for io_uring.
This started out as a series of cleanups improvements and improvements
for registered buffers, but as the last series of the io_uring changes
for 6.15, it also collected a few fixes for the other branches on top:
- Add support for vectored fixed/registered buffers.
Previously only single segments have been supported for commands,
now vectored variants are supported as well. This series includes
networking and file read/write support.
- Small series unifying return codes across multi and single shot.
- Small series cleaning up registerd buffer importing.
- Adding support for vectored registered buffers for uring_cmd.
- Fix for io-wq handling of command reissue.
- Various little fixes and tweaks"
* tag 'for-6.15/io_uring-reg-vec-20250327' of git://git.kernel.dk/linux: (25 commits)
io_uring/net: fix io_req_post_cqe abuse by send bundle
io_uring/net: use REQ_F_IMPORT_BUFFER for send_zc
io_uring: move min_events sanitisation
io_uring: rename "min" arg in io_iopoll_check()
io_uring: open code __io_post_aux_cqe()
io_uring: defer iowq cqe overflow via task_work
io_uring: fix retry handling off iowq
io_uring/net: only import send_zc buffer once
io_uring/cmd: introduce io_uring_cmd_import_fixed_vec
io_uring/cmd: add iovec cache for commands
io_uring/cmd: don't expose entire cmd async data
io_uring: rename the data cmd cache
io_uring: rely on io_prep_reg_vec for iovec placement
io_uring: introduce io_prep_reg_iovec()
io_uring: unify STOP_MULTISHOT with IOU_OK
io_uring: return -EAGAIN to continue multishot
io_uring: cap cached iovec/bvec size
io_uring/net: implement vectored reg bufs for zctx
io_uring/net: convert to struct iou_vec
io_uring/net: pull vec alloc out of msghdr import
...
Linus Torvalds [Fri, 28 Mar 2025 21:55:32 +0000 (14:55 -0700)]
Merge tag 'for-6.15/io_uring-epoll-wait-20250325' of git://git.kernel.dk/linux
Pull io_uring epoll support from Jens Axboe:
"This adds support for reading epoll events via io_uring.
While this may seem counter-intuitive (and/or productive), the
reasoning here is that quite a few existing epoll event loops can
easily do a partial conversion to a completion based model, but are
still stuck with one (or few) event types that remain readiness based.
For that case, they then need to add the io_uring fd to the epoll
context, and continue to rely on epoll_wait(2) for waiting on events.
This misses out on the finer grained waiting that io_uring can do, to
reduce context switches and wait for multiple events in one batch
reliably.
With adding support for reaping epoll events via io_uring, the whole
legacy readiness based event types can still be reaped via epoll, with
the overall waiting in the loop be driven by io_uring"
* tag 'for-6.15/io_uring-epoll-wait-20250325' of git://git.kernel.dk/linux:
io_uring/epoll: add support for IORING_OP_EPOLL_WAIT
io_uring/epoll: remove CONFIG_EPOLL guards
Linus Torvalds [Fri, 28 Mar 2025 20:45:52 +0000 (13:45 -0700)]
Merge tag 'for-6.15/io_uring-rx-zc-20250325' of git://git.kernel.dk/linux
Pull io_uring zero-copy receive support from Jens Axboe:
"This adds support for zero-copy receive with io_uring, enabling fast
bulk receive of data directly into application memory, rather than
needing to copy the data out of kernel memory.
While this version only supports host memory as that was the initial
target, other memory types are planned as well, with notably GPU
memory coming next.
This work depends on some networking components which were queued up
on the networking side, but have now landed in your tree.
This is the work of Pavel Begunkov and David Wei. From the v14 posting:
'We configure a page pool that a driver uses to fill a hw rx queue
to hand out user pages instead of kernel pages. Any data that ends
up hitting this hw rx queue will thus be dma'd into userspace
memory directly, without needing to be bounced through kernel
memory. 'Reading' data out of a socket instead becomes a
_notification_ mechanism, where the kernel tells userspace where
the data is. The overall approach is similar to the devmem TCP
proposal
This relies on hw header/data split, flow steering and RSS to
ensure packet headers remain in kernel memory and only desired
flows hit a hw rx queue configured for zero copy. Configuring this
is outside of the scope of this patchset.
We share netdev core infra with devmem TCP. The main difference is
that io_uring is used for the uAPI and the lifetime of all objects
are bound to an io_uring instance. Data is 'read' using a new
io_uring request type. When done, data is returned via a new shared
refill queue. A zero copy page pool refills a hw rx queue from this
refill queue directly. Of course, the lifetime of these data
buffers are managed by io_uring rather than the networking stack,
with different refcounting rules.
This patchset is the first step adding basic zero copy support. We
will extend this iteratively with new features e.g. dynamically
allocated zero copy areas, THP support, dmabuf support, improved
copy fallback, general optimisations and more'
In a local setup, I was able to saturate a 200G link with a single CPU
core, and at netdev conf 0x19 earlier this month, Jamal reported
188Gbit of bandwidth using a single core (no HT, including soft-irq).
Safe to say the efficiency is there, as bigger links would be needed
to find the per-core limit, and it's considerably more efficient and
faster than the existing devmem solution"
* tag 'for-6.15/io_uring-rx-zc-20250325' of git://git.kernel.dk/linux:
io_uring/zcrx: add selftest case for recvzc with read limit
io_uring/zcrx: add a read limit to recvzc requests
io_uring: add missing IORING_MAP_OFF_ZCRX_REGION in io_uring_mmap
io_uring: Rename KConfig to Kconfig
io_uring/zcrx: fix leaks on failed registration
io_uring/zcrx: recheck ifq on shutdown
io_uring/zcrx: add selftest
net: add documentation for io_uring zcrx
io_uring/zcrx: add copy fallback
io_uring/zcrx: throttle receive requests
io_uring/zcrx: set pp memory provider for an rx queue
io_uring/zcrx: add io_recvzc request
io_uring/zcrx: dma-map area for the device
io_uring/zcrx: implement zerocopy receive pp memory provider
io_uring/zcrx: grab a net device
io_uring/zcrx: add io_zcrx_area
io_uring/zcrx: add interface queue and refill queue
Linus Torvalds [Fri, 28 Mar 2025 19:42:53 +0000 (12:42 -0700)]
Merge tag 'tpmdd-next-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm updates from Jarkko Sakkinen:
"This contains a new driver: a TPM FF-A driver.
FF comes from Firmware Framework, and A comes from Arm's A-profile.
FF-A is essentially a standard mechanism to communicate with TrustZone
apps such as TPM.
Other than that, this includes a pile of fixes and small improvments"
* tag 'tpmdd-next-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm: Make chip->{status,cancel,req_canceled} opt
MAINTAINERS: TPM DEVICE DRIVER: add missing includes
tpm: End any active auth session before shutdown
Documentation: tpm: Add documentation for the CRB FF-A interface
tpm_crb: Add support for the ARM FF-A start method
ACPICA: Add start method for ARM FF-A
tpm_crb: Clean-up and refactor check for idle support
tpm_crb: ffa_tpm: Implement driver compliant to CRB over FF-A
tpm/tpm_ftpm_tee: fix struct ftpm_tee_private documentation
tpm, tpm_tis: Workaround failed command reception on Infineon devices
tpm, tpm_tis: Fix timeout handling when waiting for TPM status
tpm: Convert warn to dbg in tpm2_start_auth_session()
tpm: Lazily flush auth session when getting random data
tpm: ftpm_tee: remove incorrect of_match_ptr annotation
tpm: do not start chip while suspended
Linus Torvalds [Fri, 28 Mar 2025 19:41:36 +0000 (12:41 -0700)]
Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC fixes from Eric Biggers:
"Fix out-of-scope array bugs in arm and arm64's crc_t10dif_arch()"
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
arm64/crc-t10dif: fix use of out-of-scope array in crc_t10dif_arch()
arm/crc-t10dif: fix use of out-of-scope array in crc_t10dif_arch()
Linus Torvalds [Fri, 28 Mar 2025 19:37:13 +0000 (12:37 -0700)]
Merge tag 'landlock-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock updates from Mickaël Salaün:
"This brings two main changes to Landlock:
- A signal scoping fix with a new interface for user space to know if
it is compatible with the running kernel.
- Audit support to give visibility on why access requests are denied,
including the origin of the security policy, missing access rights,
and description of object(s). This was designed to limit log spam
as much as possible while still alerting about unexpected blocked
access.
With these changes come new and improved documentation, and a lot of
new tests"
* tag 'landlock-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: (36 commits)
landlock: Add audit documentation
selftests/landlock: Add audit tests for network
selftests/landlock: Add audit tests for filesystem
selftests/landlock: Add audit tests for abstract UNIX socket scoping
selftests/landlock: Add audit tests for ptrace
selftests/landlock: Test audit with restrict flags
selftests/landlock: Add tests for audit flags and domain IDs
selftests/landlock: Extend tests for landlock_restrict_self(2)'s flags
selftests/landlock: Add test for invalid ruleset file descriptor
samples/landlock: Enable users to log sandbox denials
landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF
landlock: Add LANDLOCK_RESTRICT_SELF_LOG_*_EXEC_* flags
landlock: Log scoped denials
landlock: Log TCP bind and connect denials
landlock: Log truncate and IOCTL denials
landlock: Factor out IOCTL hooks
landlock: Log file-related denials
landlock: Log mount-related denials
landlock: Add AUDIT_LANDLOCK_DOMAIN and log domain status
landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials
...
Linus Torvalds [Fri, 28 Mar 2025 19:09:33 +0000 (12:09 -0700)]
Merge tag 'caps-pr-20250327' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux
Pull capabilities update from Serge Hallyn:
"This contains just one patch that removes a helper function whose last
user (smack) stopped using it in 2018"
* tag 'caps-pr-20250327' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux:
capability: Remove unused has_capability
Linus Torvalds [Fri, 28 Mar 2025 19:06:58 +0000 (12:06 -0700)]
Merge tag 'integrity-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull ima updates from Mimi Zohar:
"Two performance improvements, which minimize the number of integrity
violations"
* tag 'integrity-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: limit the number of ToMToU integrity violations
ima: limit the number of open-writers integrity violations
Thomas says:
"I just noticed that for some incomprehensible reason, probably sheer
incompetemce when trying to utilize b4, I managed to merge an outdated
_and_ buggy version of that series.
Can you please revert that merge completely?"
Done.
Requested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Florian Albrechtskirchinger [Thu, 27 Mar 2025 13:31:08 +0000 (14:31 +0100)]
bcachefs: Fix bch2_fs_get_tree() error path
When a filesystem is mounted read-only, subsequent attempts to mount it
as read-write fail with EBUSY. Previously, the error path in
bch2_fs_get_tree() would unconditionally call __bch2_fs_stop(),
improperly freeing resources for a filesystem that was still actively
mounted. This change modifies the error path to only call
__bch2_fs_stop() if the superblock has no valid root dentry, ensuring
resources are not cleaned up prematurely when the filesystem is in use.
Signed-off-by: Florian Albrechtskirchinger <falbrechtskirchinger@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>