John Ogness [Wed, 4 Sep 2024 12:05:36 +0000 (14:11 +0206)]
printk: Avoid false positive lockdep report for legacy printing
Legacy console printing from printk() caller context may invoke
the console driver from atomic context. This leads to a lockdep
splat because the console driver will acquire a sleeping lock
and the caller may already hold a spinning lock. This is noticed
by lockdep on !PREEMPT_RT configurations because it will lead to
a problem on PREEMPT_RT.
However, on PREEMPT_RT the printing path from atomic context is
always avoided and the console driver is always invoked from a
dedicated thread. Thus the lockdep splat on !PREEMPT_RT is a
false positive.
For !PREEMPT_RT override the lock-context before invoking the
console driver to avoid the false positive.
Do not override the lock-context for PREEMPT_RT in order to
allow lockdep to catch any real locking context issues related
to the write callback usage.
John Ogness [Wed, 4 Sep 2024 12:05:34 +0000 (14:11 +0206)]
printk: Implement legacy printer kthread for PREEMPT_RT
The write() callback of legacy consoles usually makes use of
spinlocks. This is not permitted with PREEMPT_RT in atomic
contexts.
For PREEMPT_RT, create a new kthread to handle printing of all
the legacy consoles (and nbcon consoles if boot consoles are
registered). This allows legacy consoles to work on PREEMPT_RT
without requiring modification. (However they will not have
the reliability properties guaranteed by nbcon atomic
consoles.)
Use the existing printk_kthreads_check_locked() to start/stop
the legacy kthread as needed.
Introduce the macro force_legacy_kthread() to query if the
forced threading of legacy consoles is in effect. Although
currently only enabled for PREEMPT_RT, this acts as a simple
mechanism for the future to allow other preemption models to
easily take advantage of the non-interference property provided
by the legacy kthread.
When force_legacy_kthread() is true, the legacy kthread
fulfills the role of the console_flush_type @legacy_offload by
waking the legacy kthread instead of printing via the
console_lock in the irq_work. If the legacy kthread is not
yet available, no legacy printing takes place (unless in
panic).
If for some reason the legacy kthread fails to create, any
legacy consoles are unregistered. With force_legacy_kthread(),
the legacy kthread is a critical component for legacy consoles.
John Ogness [Wed, 4 Sep 2024 12:05:32 +0000 (14:11 +0206)]
proc: Add nbcon support for /proc/consoles
Update /proc/consoles output to show 'W' if an nbcon console is
registered. Since the write_thread() callback is mandatory, it
enough just to check if it is an nbcon console.
Also update /proc/consoles output to show 'N' if it is an
nbcon console.
John Ogness [Wed, 4 Sep 2024 12:05:30 +0000 (14:11 +0206)]
printk: nbcon: Show replay message on takeover
An emergency or panic context can takeover console ownership
while the current owner was printing a printk message. The
atomic printer will re-print the message that the previous
owner was printing. However, this can look confusing to the
user and may even seem as though a message was lost.
Add a new field @nbcon_prev_seq to struct console to track
the sequence number to print that was assigned to the previous
console owner. If this matches the sequence number to print
that the current owner is assigned, then a takeover must have
occurred. In this case, print an additional message to inform
the user that the previous message is being printed again.
[3430014.1
** replaying previous printk message **
[3430014.181123] usb 1-2: Product: USB Audio
John Ogness [Wed, 4 Sep 2024 12:05:28 +0000 (14:11 +0206)]
printk: nbcon: Rely on kthreads for normal operation
Once the kthread is running and available
(i.e. @printk_kthreads_running is set), the kthread becomes
responsible for flushing any pending messages which are added
in NBCON_PRIO_NORMAL context. Namely the legacy
console_flush_all() and device_release() no longer flush the
console. And nbcon_atomic_flush_pending() used by
nbcon_cpu_emergency_exit() no longer flushes messages added
after the emergency messages.
The console context is safe when used by the kthread only when
one of the following conditions are true:
1. Other caller acquires the console context with
NBCON_PRIO_NORMAL with preemption disabled. It will
release the context before rescheduling.
2. Other caller acquires the console context with
NBCON_PRIO_NORMAL under the device_lock.
3. The kthread is the only context which acquires the console
with NBCON_PRIO_NORMAL.
This is satisfied for all atomic printing call sites:
nbcon_legacy_emit_next_record() (#1)
nbcon_atomic_flush_pending_con() (#1)
nbcon_device_release() (#2)
It is even double guaranteed when @printk_kthreads_running
is set because then _only_ the kthread will print for
NBCON_PRIO_NORMAL. (#3)
John Ogness [Wed, 4 Sep 2024 12:05:27 +0000 (14:11 +0206)]
printk: nbcon: Use thread callback if in task context for legacy
When printing via console_lock, the write_atomic() callback is
used for nbcon consoles. However, if it is known that the
current context is a task context, the write_thread() callback
can be used instead.
Using write_thread() instead of write_atomic() helps to reduce
large disabled preemption regions when the device_lock does not
disable preemption.
This is mainly a preparatory change to allow avoiding
write_atomic() completely during normal operation if boot
consoles are registered.
As a side-effect, it also allows consolidating the printing
code for legacy printing and the kthread printer.
Thomas Gleixner [Wed, 4 Sep 2024 12:05:25 +0000 (14:11 +0206)]
printk: nbcon: Introduce printer kthreads
Provide the main implementation for running a printer kthread
per nbcon console that is takeover/handover aware. This
includes:
- new mandatory write_thread() callback
- kthread creation
- kthread main printing loop
- kthread wakeup mechanism
- kthread shutdown
kthread creation is a bit tricky because consoles may register
before kthreads can be created. In such cases, registration
will succeed, even though no kthread exists. Once kthreads can
be created, an early_initcall will set @printk_kthreads_ready.
If there are no registered boot consoles, the early_initcall
creates the kthreads for all registered nbcon consoles. If
kthread creation fails, the related console is unregistered.
If there are registered boot consoles when
@printk_kthreads_ready is set, no kthreads are created until
the final boot console unregisters.
Once kthread creation finally occurs, @printk_kthreads_running
is set so that the system knows kthreads are available for all
registered nbcon consoles.
If @printk_kthreads_running is already set when the console
is registering, the kthread is created during registration. If
kthread creation fails, the registration will fail.
Until @printk_kthreads_running is set, console printing occurs
directly via the console_lock.
kthread shutdown on system shutdown/reboot is necessary to
ensure the printer kthreads finish their printing so that the
system can cleanly transition back to direct printing via the
console_lock in order to reliably push out the final
shutdown/reboot messages. @printk_kthreads_running is cleared
before shutting down the individual kthreads.
The kthread uses a new mandatory write_thread() callback that
is called with both device_lock() and the console context
acquired.
The console ownership handling is necessary for synchronization
against write_atomic() which is synchronized only via the
console context ownership.
The device_lock() serializes acquiring the console context with
NBCON_PRIO_NORMAL. It is needed in case the device_lock() does
not disable preemption. It prevents the following race:
CPU0 CPU1
[ task A ]
nbcon_context_try_acquire()
# success with NORMAL prio
# .unsafe == false; // safe for takeover
[ schedule: task A -> B ]
WARN_ON()
nbcon_atomic_flush_pending()
nbcon_context_try_acquire()
# success with EMERGENCY prio
# flushing
nbcon_context_release()
# HERE: con->nbcon_state is free
# to take by anyone !!!
nbcon_context_try_acquire()
# success with NORMAL prio [ task B ]
[ schedule: task B -> A ]
nbcon_enter_unsafe()
nbcon_context_can_proceed()
BUG: nbcon_context_can_proceed() returns "true" because
the console is owned by a context on CPU0 with
NBCON_PRIO_NORMAL.
But it should return "false". The console is owned
by a context from task B and we do the check
in a context from task A.
Note that with these changes, the printer kthreads do not yet
take over full responsibility for nbcon printing during normal
operation. These changes only focus on the lifecycle of the
kthreads.
Co-developed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240904120536.115780-7-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
John Ogness [Wed, 4 Sep 2024 12:05:24 +0000 (14:11 +0206)]
printk: nbcon: Init @nbcon_seq to highest possible
When initializing an nbcon console, have nbcon_alloc() set
@nbcon_seq to the highest possible sequence number. For all
practical purposes, this will guarantee that the console
will have nothing to print until later when @nbcon_seq is
set to the proper initial printing value.
This will be particularly important once kthread printing is
introduced because nbcon_alloc() can create/start the kthread
before the desired initial sequence number is known.
John Ogness [Wed, 4 Sep 2024 12:05:23 +0000 (14:11 +0206)]
printk: nbcon: Add context to usable() and emit()
The nbcon consoles will have two callbacks to be used for
different contexts. In order to determine if an nbcon console
is usable, console_is_usable() must know if it is a context
that will need to use the optional write_atomic() callback.
Also, nbcon_emit_next_record() must know which callback it
needs to call.
Add an extra parameter @use_atomic to console_is_usable() and
nbcon_emit_next_record() to specify this.
Since so far only the write_atomic() callback exists,
@use_atomic is set to true for all call sites.
John Ogness [Wed, 4 Sep 2024 12:05:21 +0000 (14:11 +0206)]
printk: Fail pr_flush() if before SYSTEM_SCHEDULING
A follow-up change adds pr_flush() to console unregistration.
However, with boot consoles unregistration can happen very
early if there are also regular consoles registering as well.
In this case the pr_flush() is not important because all
consoles are flushed when checking the initial console sequence
number.
Allow pr_flush() to fail if @system_state has not yet reached
SYSTEM_SCHEDULING. This avoids might_sleep() and msleep()
explosions that would otherwise occur:
John Ogness [Wed, 4 Sep 2024 12:05:20 +0000 (14:11 +0206)]
printk: nbcon: Add function for printers to reacquire ownership
Since ownership can be lost at any time due to handover or
takeover, a printing context _must_ be prepared to back out
immediately and carefully. However, there are scenarios where
the printing context must reacquire ownership in order to
finalize or revert hardware changes.
One such example is when interrupts are disabled during
printing. No other context will automagically re-enable the
interrupts. For this case, the disabling context _must_
reacquire nbcon ownership so that it can re-enable the
interrupts.
Provide nbcon_reacquire_nobuf() for exactly this purpose. It
allows a printing context to reacquire ownership using the same
priority as its previous ownership.
Note that after a successful reacquire the printing context
will have no output buffer because that has been lost. This
function cannot be used to resume printing.
John Ogness [Tue, 20 Aug 2024 06:30:01 +0000 (08:36 +0206)]
lockdep: Mark emergency sections in lockdep splats
Mark emergency sections wherever multiple lines of
lock debugging output are generated. In an emergency
section, every printk() call will attempt to directly
flush to the consoles using the EMERGENCY priority.
Note that debug_show_all_locks() and
lockdep_print_held_locks() rely on their callers to
enter the emergency section. This is because these
functions can also be called in non-emergency
situations (such as sysrq).
John Ogness [Tue, 20 Aug 2024 06:30:00 +0000 (08:36 +0206)]
rcu: Mark emergency sections in rcu stalls
Mark emergency sections wherever multiple lines of
rcu stall information are generated. In an emergency
section, every printk() call will attempt to directly
flush to the consoles using the EMERGENCY priority.
John Ogness [Tue, 20 Aug 2024 06:29:59 +0000 (08:35 +0206)]
panic: Mark emergency section in oops
Mark an emergency section beginning with oops_enter() until the
end of oops_exit(). In this section, every printk() call will
attempt to directly flush to the consoles using the EMERGENCY
priority.
The very end of oops_exit() performs a kmsg_dump(). This is not
included in the emergency section because it is another
flushing mechanism that should occur after the consoles have
flushed the oops messages.
Thomas Gleixner [Tue, 20 Aug 2024 06:29:58 +0000 (08:35 +0206)]
panic: Mark emergency section in warn
Mark the full contents of __warn() as an emergency section. In
this section, every printk() call will attempt to directly
flush to the consoles using the EMERGENCY priority.
Co-developed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-33-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
Thomas Gleixner [Tue, 20 Aug 2024 06:29:57 +0000 (08:35 +0206)]
printk: nbcon: Implement emergency sections
In emergency situations (something has gone wrong but the
system continues to operate), usually important information
(such as a backtrace) is generated via printk(). This
information should be pushed out to the consoles ASAP.
Add per-CPU emergency nesting tracking because an emergency
can arise while in an emergency situation.
Add functions to mark the beginning and end of emergency
sections where the urgent messages are generated.
Perform direct console flushing at the emergency priority if
the current CPU is in an emergency state and it is safe to do
so.
Note that the emergency state is not system-wide. While one CPU
is in an emergency state, another CPU may attempt to print
console messages at normal priority.
Also note that printk() already attempts to flush consoles in
the caller context for normal priority. However, follow-up
changes will introduce printing kthreads, in which case the
normal priority printk() calls will offload to the kthreads.
Co-developed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-32-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
John Ogness [Tue, 20 Aug 2024 06:29:56 +0000 (08:35 +0206)]
printk: Add helper for flush type logic
There are many call sites where console flushing occur.
Depending on the system state and types of consoles, the flush
methods to use are different. A flush call site generally must
consider:
flush nbcon directly via atomic_write() callback
flush legacy directly via console_unlock
flush legacy via offload to irq_work
All of these call sites use their own logic to make this
decision, which is complicated and error prone. Especially
later when two more flush methods will be introduced:
flush nbcon via offload to kthread
flush legacy via offload to kthread
Introduce a new internal struct console_flush_type that specifies
which console flushing methods should be used in the context of
the caller.
Introduce a helper function to fill out console_flush_type to
be used for flushing call sites.
Replace the logic of all flushing call sites to use the new
helper.
This change standardizes behavior, leading to both fixes and
optimizations across various call sites. For instance, in
console_cpu_notify(), the new logic ensures that nbcon consoles
are flushed when they aren’t managed by the legacy loop.
Similarly, in console_flush_on_panic(), the system no longer
needs to flush nbcon consoles if none are present.
John Ogness [Tue, 20 Aug 2024 06:29:55 +0000 (08:35 +0206)]
printk: Coordinate direct printing in panic
If legacy and nbcon consoles are registered and the nbcon
consoles are allowed to flush (i.e. no boot consoles
registered), the legacy consoles will no longer perform
direct printing on the panic CPU until after the backtrace
has been stored. This will give the safe nbcon consoles a
chance to print the panic messages before allowing the
unsafe legacy consoles to print.
If no nbcon consoles are registered or they are not allowed
to flush because boot consoles are registered, there is no
change in behavior (i.e. legacy consoles will always attempt
to print from the printk() caller context).
John Ogness [Tue, 20 Aug 2024 06:29:54 +0000 (08:35 +0206)]
printk: Track nbcon consoles
Add a global flag @have_nbcon_console to identify if any nbcon
consoles are registered. This will be used in follow-up commits
to preserve legacy behavior when no nbcon consoles are registered.
John Ogness [Tue, 20 Aug 2024 06:29:53 +0000 (08:35 +0206)]
printk: Avoid console_lock dance if no legacy or boot consoles
Currently the console lock is used to attempt legacy-type
printing even if there are no legacy or boot consoles registered.
If no such consoles are registered, the console lock does not
need to be taken.
Add tracking of legacy console registration and use it with
boot console tracking to avoid unnecessary code paths, i.e.
do not use the console lock if there are no boot consoles
and no legacy consoles.
John Ogness [Tue, 20 Aug 2024 06:29:52 +0000 (08:35 +0206)]
printk: nbcon: Add unsafe flushing on panic
Add nbcon_atomic_flush_unsafe() to flush all nbcon consoles
using the write_atomic() callback and allowing unsafe hostile
takeovers. Call this at the end of panic() as a final attempt
to flush any pending messages.
Note that legacy consoles use unsafe methods for flushing
from the beginning of panic (see bust_spinlocks()). Therefore,
systems using both legacy and nbcon consoles may still fail to
see panic messages due to unsafe legacy console usage.
John Ogness [Tue, 20 Aug 2024 06:29:51 +0000 (08:35 +0206)]
printk: Flush nbcon consoles first on panic
In console_flush_on_panic(), flush the nbcon consoles before
flushing legacy consoles. The legacy write() callbacks are not
fully safe when oops_in_progress is set.
John Ogness [Tue, 20 Aug 2024 06:29:50 +0000 (08:35 +0206)]
printk: nbcon: Flush new records on device_release()
There may be new records that were added while a driver was
holding the nbcon context for non-printing purposes. These
new records must be flushed by the nbcon_device_release()
context because no other context will do it.
If boot consoles are registered, the legacy loop is used
(either direct or per irq_work) to handle the flushing.
John Ogness [Tue, 20 Aug 2024 06:29:49 +0000 (08:35 +0206)]
printk: Add is_printk_legacy_deferred()
If printk has been explicitly deferred or is called from NMI
context, legacy console printing must be deferred to an irq_work
context. Introduce a helper function is_printk_legacy_deferred()
for a CPU to query if it must defer legacy console printing.
In follow-up commits this helper will be needed at other call
sites as well.
John Ogness [Tue, 20 Aug 2024 06:29:48 +0000 (08:35 +0206)]
printk: nbcon: Use nbcon consoles in console_flush_all()
Allow nbcon consoles to print messages in the legacy printk()
caller context (printing via unlock) by integrating them into
console_flush_all(). The write_atomic() callback is used for
printing.
Provide nbcon_legacy_emit_next_record(), which acts as the
nbcon variant of console_emit_next_record(). Call this variant
within console_flush_all() for nbcon consoles. Since nbcon
consoles use their own @nbcon_seq variable to track the next
record to print, this also must be appropriately handled in
console_flush_all().
Note that the legacy printing logic uses @handover to detect
handovers for printing all consoles. For nbcon consoles,
handovers/takeovers occur on a per-console basis and thus do
not cause the console_flush_all() loop to abort.
John Ogness [Tue, 20 Aug 2024 06:29:47 +0000 (08:35 +0206)]
printk: Track registered boot consoles
Unfortunately it is not known if a boot console and a regular
(legacy or nbcon) console use the same hardware. For this reason
they must not be allowed to print simultaneously.
For legacy consoles this is not an issue because they are
already synchronized with the boot consoles using the console
lock. However nbcon consoles can be triggered separately.
Add a global flag @have_boot_console to identify if any boot
consoles are registered. This will be used in follow-up commits
to ensure that boot consoles and nbcon consoles cannot print
simultaneously.
Thomas Gleixner [Tue, 20 Aug 2024 06:29:46 +0000 (08:35 +0206)]
printk: nbcon: Provide function to flush using write_atomic()
Provide nbcon_atomic_flush_pending() to perform flushing of all
registered nbcon consoles using their write_atomic() callback.
Unlike console_flush_all(), nbcon_atomic_flush_pending() will
only flush up through the newest record at the time of the
call. This prevents a CPU from printing unbounded when other
CPUs are adding records. If new records are added while
flushing, it is expected that the dedicated printer threads
will print those records. If the printer thread is not
available (which is always the case at this point in the
rework), nbcon_atomic_flush_pending() _will_ flush all records
in the ringbuffer.
Unlike console_flush_all(), nbcon_atomic_flush_pending() will
fully flush one console before flushing the next. This helps to
guarantee that a block of pending records (such as a stack
trace in an emergency situation) can be printed atomically at
once before releasing console ownership.
nbcon_atomic_flush_pending() is safe in any context because it
uses write_atomic() and acquires with unsafe_takeover disabled.
Co-developed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-21-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
John Ogness [Tue, 20 Aug 2024 06:29:45 +0000 (08:35 +0206)]
printk: nbcon: Add helper to assign priority based on CPU state
Add a helper function to use the current state of the CPU to
determine which priority to assign to the printing context.
The EMERGENCY priority handling is added in a follow-up commit.
It will use a per-CPU variable.
Note: nbcon_device_try_acquire(), which is used by console
drivers to acquire the nbcon console for non-printing
activities, is hard-coded to always use NORMAL priority.
John Ogness [Tue, 20 Aug 2024 06:29:44 +0000 (08:35 +0206)]
printk: Add @flags argument for console_is_usable()
The caller of console_is_usable() usually needs @console->flags
for its own checks. Rather than having console_is_usable() read
its own copy, make the caller pass in the @flags. This also
ensures that the caller saw the same @flags value.
John Ogness [Tue, 20 Aug 2024 06:29:41 +0000 (08:35 +0206)]
printk: nbcon: Do not rely on proxy headers
The headers kernel.h, serial_core.h, and console.h allow for the
definitions of many types and functions from other headers.
Rather than relying on these as proxy headers, explicitly
include all headers providing needed definitions. Also sort the
list alphabetically to be able to easily detect duplicates.
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-16-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
John Ogness [Tue, 20 Aug 2024 06:29:40 +0000 (08:35 +0206)]
serial: core: Acquire nbcon context in port->lock wrapper
Currently the port->lock wrappers uart_port_lock(),
uart_port_unlock() (and their variants) only lock/unlock
the spin_lock.
If the port is an nbcon console that has implemented the
write_atomic() callback, the wrappers must also acquire/release
the console context and mark the region as unsafe. This allows
general port->lock synchronization to be synchronized against
the nbcon write_atomic() callback.
Note that __uart_port_using_nbcon() relies on the port->lock
being held while a console is added and removed from the
console list (i.e. all uart nbcon drivers *must* take the
port->lock in their device_lock() callbacks).
John Ogness [Tue, 20 Aug 2024 06:29:39 +0000 (08:35 +0206)]
nbcon: Add API to acquire context for non-printing operations
Provide functions nbcon_device_try_acquire() and
nbcon_device_release() which will try to acquire the nbcon
console ownership with NBCON_PRIO_NORMAL and mark it unsafe for
handover/takeover.
These functions are to be used together with the device-specific
locking when performing non-printing activities on the console
device. They will allow synchronization against the
atomic_write() callback which will be serialized, for higher
priority contexts, only by acquiring the console context
ownership.
Pitfalls:
The API requires to be called in a context with migration
disabled because it uses per-CPU variables internally.
The context is set unsafe for a takeover all the time. It
guarantees full serialization against any atomic_write() caller
except for the final flush in panic() which might try an unsafe
takeover.
It was not clear when exactly console_srcu_read_flags() must be
used vs. directly reading @console->flags.
Refactor and clarify that console_srcu_read_flags() is only
needed if the console is registered or the caller is in a
context where the registration status of the console may change
(due to another context).
The function requires the caller holds @console_srcu, which will
ensure that the caller sees an appropriate @flags value for the
registered console and that exit/cleanup routines will not run
if the console is in the process of unregistration.
John Ogness [Tue, 20 Aug 2024 06:29:37 +0000 (08:35 +0206)]
serial: core: Introduce wrapper to set @uart_port->cons
Introduce uart_port_set_cons() as a wrapper to set @cons of a
uart_port. The wrapper sets @cons under the port lock in order
to prevent @cons from disappearing while another context is
holding the port lock. This is necessary for a follow-up
commit relating to the port lock wrappers, which rely on @cons
not changing between lock and unlock.
Signed-off-by: John Ogness <john.ogness@linutronix.de> Tested-by: Théo Lebrun <theo.lebrun@bootlin.com> # EyeQ5, AMBA-PL011 Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240820063001.36405-12-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
John Ogness [Tue, 20 Aug 2024 06:29:36 +0000 (08:35 +0206)]
serial: core: Provide low-level functions to lock port
It will be necessary at times for the uart nbcon console
drivers to acquire the port lock directly (without the
additional nbcon functionality of the port lock wrappers).
These are special cases such as the implementation of the
device_lock()/device_unlock() callbacks or for internal
port lock wrapper synchronization.
Provide low-level variants __uart_port_lock_irqsave() and
__uart_port_unlock_irqrestore() for this purpose.
John Ogness [Tue, 20 Aug 2024 06:29:35 +0000 (08:35 +0206)]
printk: nbcon: Use driver synchronization while (un)registering
Console drivers typically have to deal with access to the
hardware via user input/output (such as an interactive login
shell) and output of kernel messages via printk() calls.
They use some classic driver-specific locking mechanism in most
situations. But console->write_atomic() callbacks, used by nbcon
consoles, are synchronized only by acquiring the console
context.
The synchronization via the console context ownership is possible
only when the console driver is registered. It is when a
particular device driver is connected with a particular console
driver.
The two synchronization mechanisms must be synchronized between
each other. It is tricky because the console context ownership
is quite special. It might be taken over by a higher priority
context. Also CPU migration must be disabled. The most tricky
part is to (dis)connect these two mechanisms during the console
(un)registration.
Use the driver-specific locking callbacks: device_lock(),
device_unlock(). They allow taking the device-specific lock
while the device is being (un)registered by the related console
driver.
For example, these callbacks lock/unlock the port lock for
serial port drivers.
Note that the driver-specific locking is only needed during
(un)register if it is an nbcon console with the write_atomic()
callback implemented. If write_atomic() is not implemented, the
driver should never attempt to access the hardware without
first acquiring its driver-specific lock.
John Ogness [Tue, 20 Aug 2024 06:29:34 +0000 (08:35 +0206)]
printk: nbcon: Add callbacks to synchronize with driver
Console drivers typically must deal with access to the hardware
via user input/output (such as an interactive login shell) and
output of kernel messages via printk() calls. To provide the
necessary synchronization, usually some driver-specific locking
mechanism is used (for example, the port spinlock for uart
serial consoles).
Until now, usage of this driver-specific locking has been hidden
from the printk-subsystem and implemented within the various
console callbacks. However, nbcon consoles would need to use it
even in the generic code.
Add device_lock() and device_unlock() callback which will need
to get implemented by nbcon consoles.
The callbacks will use whatever synchronization mechanism the
driver is using for itself. The minimum requirement is to
prevent CPU migration. It would allow a context friendly
acquiring of nbcon console ownership in non-emergency and
non-panic context.
John Ogness [Tue, 20 Aug 2024 06:29:33 +0000 (08:35 +0206)]
printk: nbcon: Add detailed doc for write_atomic()
The write_atomic() callback has special requirements and is
allowed to use special helper functions. Provide detailed
documentation of the callback so that a developer has a
chance of implementing it correctly.
John Ogness [Tue, 20 Aug 2024 06:29:32 +0000 (08:35 +0206)]
printk: nbcon: Remove return value for write_atomic()
The return value of write_atomic() does not provide any useful
information. On the contrary, it makes things more complicated
for the caller to appropriately deal with the information.
Change write_atomic() to not have a return value. If the
message did not get printed due to loss of ownership, the
caller will notice this on its own. If ownership was not lost,
it will be assumed that the driver successfully printed the
message and the sequence number for that console will be
incremented.
John Ogness [Tue, 20 Aug 2024 06:29:31 +0000 (08:35 +0206)]
printk: nbcon: Clarify rules of the owner/waiter matching
The functions nbcon_owner_matches() and nbcon_waiter_matches()
use a minimal set of data to determine if a context matches.
The existing kerneldoc and comments were not clear enough and
caused the printk folks to re-prove that the functions are
indeed reliable in all cases.
Update and expand the explanations so that it is clear that the
implementations are sufficient for all cases.
Petr Mladek [Tue, 20 Aug 2024 06:29:29 +0000 (08:35 +0206)]
printk: Properly deal with nbcon consoles on seq init
If a non-boot console is registering and boot consoles exist,
the consoles are flushed before being unregistered. This allows
the non-boot console to continue where the boot console left
off.
If for whatever reason flushing fails, the lowest seq found from
any of the enabled boot consoles is used. Until now con->seq was
checked. However, if it is an nbcon boot console, the function
nbcon_seq_read() must be used to read seq because con->seq is
not updated for nbcon consoles.
Check if it is an nbcon boot console and if so call
nbcon_seq_read() to read seq.
Also, avoid usage of con->seq as temporary storage of the
starting record. Instead, rename console_init_seq() to
get_init_console_seq() and just return the value. For nbcon
consoles set the sequence via nbcon_seq_force(), for legacy
consoles set con->seq.
The cleaned design should make sure that the value stays and is
set before the console is added to the console list. It also
unifies the sequence number initialization for legacy and nbcon
consoles.
John Ogness [Tue, 20 Aug 2024 06:29:28 +0000 (08:35 +0206)]
printk: nbcon: Consolidate alloc() and init()
Rather than splitting the nbcon allocation and initialization into
two pieces, perform all initialization in nbcon_alloc(). Later,
the initial sequence is calculated and can be explicitly set using
nbcon_seq_force(). This removes the need for the strong rules of
nbcon_init() that even included a BUG_ON().
John Ogness [Tue, 20 Aug 2024 06:29:27 +0000 (08:35 +0206)]
printk: Add notation to console_srcu locking
kernel/printk/printk.c:284:5: sparse: sparse: context imbalance in
'console_srcu_read_lock' - wrong count at exit
include/linux/srcu.h:301:9: sparse: sparse: context imbalance in
'console_srcu_read_unlock' - unexpected unlock
Fixes: 6c4afa79147e ("printk: Prepare for SRCU console list protection") Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20240820063001.36405-2-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
Linus Torvalds [Mon, 19 Aug 2024 16:26:35 +0000 (09:26 -0700)]
Merge tag 'printk-for-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fix from Petr Mladek:
- Do not block printk on non-panic CPUs when they are dumping
backtraces
* tag 'printk-for-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk/panic: Allow cpu backtraces to be written into ringbuffer during panic
Linus Torvalds [Sun, 18 Aug 2024 17:19:49 +0000 (10:19 -0700)]
Merge tag 'driver-core-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are two driver fixes for regressions from 6.11-rc1 due to the
driver core change making a structure in a driver core callback const.
These were missed by all testing EXCEPT for what Bart happened to be
running, so I appreciate the fixes provided here for some
odd/not-often-used driver subsystems that nothing else happened to
catch.
Both of these fixes have been in linux-next all week with no reported
issues"
* tag 'driver-core-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
mips: sgi-ip22: Fix the build
ARM: riscpc: ecard: Fix the build
Linus Torvalds [Sun, 18 Aug 2024 17:16:34 +0000 (10:16 -0700)]
Merge tag 'char-misc-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc fixes from Greg KH:
"Here are some small char/misc fixes for 6.11-rc4 to resolve reported
problems. Included in here are:
- fastrpc revert of a change that broke userspace
- xillybus fixes for reported issues
Half of these have been in linux-next this week with no reported
problems, I don't know if the last bit of xillybus driver changes made
it in, but they are 'obviously correct' so will be safe :)"
* tag 'char-misc-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
char: xillybus: Check USB endpoints when probing device
char: xillybus: Refine workqueue handling
Revert "misc: fastrpc: Restrict untrusted app to attach to privileged PD"
char: xillybus: Don't destroy workqueue from work item running on it
Linus Torvalds [Sun, 18 Aug 2024 17:10:48 +0000 (10:10 -0700)]
Merge tag 'tty-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial fixes from Greg KH:
"Here are some small tty and serial driver fixes for 6.11-rc4 to
resolve some reported problems. Included in here are:
- conmakehash.c userspace build issues
- fsl_lpuart driver fix
- 8250_omap revert for reported regression
- atmel_serial rts flag fix
All of these have been in linux-next this week with no reported
issues"
* tag 'tty-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "serial: 8250_omap: Set the console genpd always on if no console suspend"
tty: atmel_serial: use the correct RTS flag.
tty: vt: conmakehash: remove non-portable code printing comment header
tty: serial: fsl_lpuart: mark last busy before uart_add_one_port
Linus Torvalds [Sun, 18 Aug 2024 16:59:06 +0000 (09:59 -0700)]
Merge tag 'usb-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt driver fixes from Greg KH:
"Here are some small USB and Thunderbolt driver fixes for 6.11-rc4 to
resolve some reported issues. Included in here are:
- thunderbolt driver fixes for reported problems
- typec driver fixes
- xhci fixes
- new device id for ljca usb driver
All of these have been in linux-next this week with no reported
issues"
* tag 'usb-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
xhci: Fix Panther point NULL pointer deref at full-speed re-enumeration
usb: misc: ljca: Add Lunar Lake ljca GPIO HID to ljca_gpio_hids[]
Revert "usb: typec: tcpm: clear pd_event queue in PORT_RESET"
usb: typec: ucsi: Fix the return value of ucsi_run_command()
usb: xhci: fix duplicate stall handling in handle_tx_event()
usb: xhci: Check for xhci->interrupters being allocated in xhci_mem_clearup()
thunderbolt: Mark XDomain as unplugged when router is removed
thunderbolt: Fix memory leaks in {port|retimer}_sb_regs_write()
Linus Torvalds [Sun, 18 Aug 2024 15:50:36 +0000 (08:50 -0700)]
Merge tag 'for-6.11-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull more btrfs fixes from David Sterba:
"A more fixes. We got reports that shrinker added in 6.10 still causes
latency spikes and the fixes don't handle all corner cases. Due to
summer holidays we're taking a shortcut to disable it for release
builds and will fix it in the near future.
- only enable extent map shrinker for DEBUG builds, temporary quick
fix to avoid latency spikes for regular builds
- update target inode's ctime on unlink, mandated by POSIX
- properly take lock to read/update block group's zoned variables
- add counted_by() annotations"
* tag 'for-6.11-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: only enable extent map shrinker for DEBUG builds
btrfs: zoned: properly take lock to read/update block group's zoned variables
btrfs: tree-checker: add dev extent item checks
btrfs: update target inode's ctime on unlink
btrfs: send: annotate struct name_cache_entry with __counted_by()
Jann Horn [Tue, 6 Aug 2024 19:51:42 +0000 (21:51 +0200)]
fuse: Initialize beyond-EOF page contents before setting uptodate
fuse_notify_store(), unlike fuse_do_readpage(), does not enable page
zeroing (because it can be used to change partial page contents).
So fuse_notify_store() must be more careful to fully initialize page
contents (including parts of the page that are beyond end-of-file)
before marking the page uptodate.
The current code can leave beyond-EOF page contents uninitialized, which
makes these uninitialized page contents visible to userspace via mmap().
This is an information leak, but only affects systems which do not
enable init-on-alloc (via CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y or the
corresponding kernel command line parameter).
Linus Torvalds [Sun, 18 Aug 2024 02:50:16 +0000 (19:50 -0700)]
Merge tag 'mm-hotfixes-stable-2024-08-17-19-34' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"16 hotfixes. All except one are for MM. 10 of these are cc:stable and
the others pertain to post-6.10 issues.
As usual with these merges, singletons and doubletons all over the
place, no identifiable-by-me theme. Please see the lovingly curated
changelogs to get the skinny"
* tag 'mm-hotfixes-stable-2024-08-17-19-34' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/migrate: fix deadlock in migrate_pages_batch() on large folios
alloc_tag: mark pages reserved during CMA activation as not tagged
alloc_tag: introduce clear_page_tag_ref() helper function
crash: fix riscv64 crash memory reserve dead loop
selftests: memfd_secret: don't build memfd_secret test on unsupported arches
mm: fix endless reclaim on machines with unaccepted memory
selftests/mm: compaction_test: fix off by one in check_compaction()
mm/numa: no task_numa_fault() call if PMD is changed
mm/numa: no task_numa_fault() call if PTE is changed
mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0
mm/memory-failure: use raw_spinlock_t in struct memory_failure_cpu
mm: don't account memmap per-node
mm: add system wide stats items category
mm: don't account memmap on failure
mm/hugetlb: fix hugetlb vs. core-mm PT locking
mseal: fix is_madv_discard()
Linus Torvalds [Sun, 18 Aug 2024 02:23:02 +0000 (19:23 -0700)]
Merge tag 'powerpc-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix crashes on 85xx with some configs since the recent hugepd rework.
- Fix boot warning with hugepages and CONFIG_DEBUG_VIRTUAL on some
platforms.
- Don't enable offline cores when changing SMT modes, to match existing
userspace behaviour.
Thanks to Christophe Leroy, Dr. David Alan Gilbert, Guenter Roeck, Nysal
Jan K.A, Shrikanth Hegde, Thomas Gleixner, and Tyrel Datwyler.
* tag 'powerpc-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/topology: Check if a core is online
cpu/SMT: Enable SMT only if a core is online
powerpc/mm: Fix boot warning with hugepages and CONFIG_DEBUG_VIRTUAL
powerpc/mm: Fix size of allocated PGDIR
soc: fsl: qbman: remove unused struct 'cgr_comp'
Linus Torvalds [Sat, 17 Aug 2024 23:31:12 +0000 (16:31 -0700)]
Merge tag 'v6.11-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- fix for clang warning - additional null check
- fix for cached write with posix locks
- flexible structure fix
* tag 'v6.11-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: smb2pdu.h: Use static_assert() to check struct sizes
smb3: fix lock breakage for cached writes
smb/client: avoid possible NULL dereference in cifs_free_subrequest()
Linus Torvalds [Sat, 17 Aug 2024 23:23:05 +0000 (16:23 -0700)]
Merge tag 'i2c-for-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"I2C core fix replacing IS_ENABLED() with IS_REACHABLE()
For host drivers, there are two fixes:
- Tegra I2C Controller: Addresses a potential double-locking issue
during probe. ACPI devices are not IRQ-safe when invoking runtime
suspend and resume functions, so the irq_safe flag should not be
set.
- Qualcomm GENI I2C Controller: Fixes an oversight in the exit path
of the runtime_resume() function, which was missed in the previous
release"
* tag 'i2c-for-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: tegra: Do not mark ACPI devices as irq safe
i2c: Use IS_REACHABLE() for substituting empty ACPI functions
i2c: qcom-geni: Add missing geni_icc_disable in geni_i2c_runtime_resume
Linus Torvalds [Sat, 17 Aug 2024 17:04:01 +0000 (10:04 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two small fixes to the mpi3mr driver. One to avoid oversize
allocations in tracing and the other to fix an uninitialized spinlock
in the user to driver feature request code (used to trigger dumps and
the like)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpi3mr: Avoid MAX_PAGE_ORDER WARNING for buffer allocations
scsi: mpi3mr: Add missing spin_lock_init() for mrioc->trigger_lock
Linus Torvalds [Sat, 17 Aug 2024 16:51:28 +0000 (09:51 -0700)]
Merge tag 'xfs-6.11-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Chandan Babu:
- Check for presence of only 'attr' feature before scrubbing an inode's
attribute fork.
- Restore the behaviour of setting AIL thread to TASK_INTERRUPTIBLE for
long (i.e. 50ms) sleep durations to prevent high load averages.
- Do not allow users to change the realtime flag of a file unless the
datadev and rtdev both support fsdax access modes.
* tag 'xfs-6.11-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: conditionally allow FS_XFLAG_REALTIME changes if S_DAX is set
xfs: revert AIL TASK_KILLABLE threshold
xfs: attr forks require attr, not attr2
Linus Torvalds [Sat, 17 Aug 2024 16:46:10 +0000 (09:46 -0700)]
Merge tag 'bcachefs-2024-08-16' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent OverstreetL
- New on disk format version, bcachefs_metadata_version_disk_accounting_inum
This adds one more disk accounting counter, which counts disk usage
and number of extents per inode number. This lets us track
fragmentation, for implementing defragmentation later, and it also
counts disk usage per inode in all snapshots, which will be a useful
thing to expose to users.
- One performance issue we've observed is threads spinning when they
should be waiting for dirty keys in the key cache to be flushed by
journal reclaim, so we now have hysteresis for the waiting thread, as
well as improving the tracepoint and a new time_stat, for tracking
time blocked waiting on key cache flushing.
... and various assorted smaller fixes.
* tag 'bcachefs-2024-08-16' of git://evilpiepirate.org/bcachefs:
bcachefs: Fix locking in __bch2_trans_mark_dev_sb()
bcachefs: fix incorrect i_state usage
bcachefs: avoid overflowing LRU_TIME_BITS for cached data lru
bcachefs: Fix forgetting to pass trans to fsck_err()
bcachefs: Increase size of cuckoo hash table on too many rehashes
bcachefs: bcachefs_metadata_version_disk_accounting_inum
bcachefs: Kill __bch2_accounting_mem_mod()
bcachefs: Make bkey_fsck_err() a wrapper around fsck_err()
bcachefs: Fix warning in __bch2_fsck_err() for trans not passed in
bcachefs: Add a time_stat for blocked on key cache flush
bcachefs: Improve trans_blocked_journal_reclaim tracepoint
bcachefs: Add hysteresis to waiting on btree key cache flush
lib/generic-radix-tree.c: Fix rare race in __genradix_ptr_alloc()
bcachefs: Convert for_each_btree_node() to lockrestart_do()
bcachefs: Add missing downgrade table entry
bcachefs: disk accounting: ignore unknown types
bcachefs: bch2_accounting_invalid() fixup
bcachefs: Fix bch2_trigger_alloc when upgrading from old versions
bcachefs: delete faulty fastpath in bch2_btree_path_traverse_cached()
Linus Torvalds [Sat, 17 Aug 2024 00:02:32 +0000 (17:02 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Fix the arm64 __get_mem_asm() to use the _ASM_EXTABLE_##type##ACCESS()
macro instead of the *_ERR() one in order to avoid writing -EFAULT to
the value register in case of a fault
- Initialise all elements of the acpi_early_node_map[] to NUMA_NO_NODE.
Prior to this fix, only the first element was initialised
- Move the KASAN random tag seed initialisation after the per-CPU areas
have been initialised (prng_state is __percpu)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix KASAN random tag seed initialization
arm64: ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
arm64: uaccess: correct thinko in __get_mem_asm()
Linus Torvalds [Fri, 16 Aug 2024 23:59:05 +0000 (16:59 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
"One fix for the new T-Head TH1520 clk driver that marks a bus clk
critical so that it isn't turned off during late init which breaks
emmc-sdio"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: thead: fix dependency on clk_ignore_unused
Linus Torvalds [Fri, 16 Aug 2024 21:03:31 +0000 (14:03 -0700)]
Merge tag 'block-6.11-20240824' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- Fix corruption issues with s390/dasd (Eric, Stefan)
- Fix a misuse of non irq locking grab of a lock (Li)
- MD pull request with a single data corruption fix for raid1 (Yu)
* tag 'block-6.11-20240824' of git://git.kernel.dk/linux:
block: Fix lockdep warning in blk_mq_mark_tag_wait
md/raid1: Fix data corruption for degraded array with slow disk
s390/dasd: fix error recovery leading to data corruption on ESE devices
s390/dasd: Remove DMA alignment
Linus Torvalds [Fri, 16 Aug 2024 21:00:05 +0000 (14:00 -0700)]
Merge tag 'io_uring-6.11-20240824' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Fix a comment in the uapi header using the wrong member name (Caleb)
- Fix KCSAN warning for a debug check in sqpoll (me)
- Two more NAPI tweaks (Olivier)
* tag 'io_uring-6.11-20240824' of git://git.kernel.dk/linux:
io_uring: fix user_data field name in comment
io_uring/sqpoll: annotate debug task == current with data_race()
io_uring/napi: remove duplicate io_napi_entry timeout assignation
io_uring/napi: check napi_enabled in io_napi_add() before proceeding
Qu Wenruo [Fri, 16 Aug 2024 01:10:38 +0000 (10:40 +0930)]
btrfs: only enable extent map shrinker for DEBUG builds
Although there are several patches improving the extent map shrinker,
there are still reports of too frequent shrinker behavior, taking too
much CPU for the kswapd process.
So let's only enable extent shrinker for now, until we got more
comprehensive understanding and a better solution.
Linus Torvalds [Fri, 16 Aug 2024 18:49:07 +0000 (11:49 -0700)]
Merge tag 'thermal-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
"Fix a Bang-bang thermal governor issue causing it to fail to reset the
state of cooling devices if they are 'on' to start with, but the
thermal zone temperature is always below the corresponding trip point
(Rafael Wysocki)"
* tag 'thermal-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: gov_bang_bang: Use governor_data to reduce overhead
thermal: gov_bang_bang: Add .manage() callback
thermal: gov_bang_bang: Split bang_bang_control()
thermal: gov_bang_bang: Call __thermal_cdev_update() directly
Linus Torvalds [Fri, 16 Aug 2024 18:43:54 +0000 (11:43 -0700)]
Merge tag 'acpi-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Fix an issue related to the ACPI EC device handling that causes the
_REG control method to be evaluated for EC operation regions that are
not expected to be used.
This confuses the platform firmware and provokes various types of
misbehavior on some systems (Rafael Wysocki)"
* tag 'acpi-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: EC: Evaluate _REG outside the EC scope more carefully
ACPICA: Add a depth argument to acpi_execute_reg_methods()
Revert "ACPI: EC: Evaluate orphan _REG under EC device"
Linus Torvalds [Fri, 16 Aug 2024 18:36:40 +0000 (11:36 -0700)]
Merge tag 'libnvdimm-fixes-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fix from Ira Weiny:
"Commit f467fee48da4 ("block: move the dax flag to queue_limits") broke
the DAX tests by skipping over the legacy pmem mapping pages case.
Set the DAX flag in this case as well"
* tag 'libnvdimm-fixes-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nvdimm/pmem: Set dax flag for all 'PFN_MAP' cases
Linus Torvalds [Fri, 16 Aug 2024 18:24:06 +0000 (11:24 -0700)]
Merge tag 'rust-fixes-6.11' of https://github.com/Rust-for-Linux/linux
Pull rust fixes from Miguel Ojeda:
- Fix '-Os' Rust 1.80.0+ builds adding more intrinsics (also tweaked in
upstream Rust for the upcoming 1.82.0).
- Fix support for the latest version of rust-analyzer due to a change
on rust-analyzer config file semantics (considered a fix since most
developers use the latest version of the tool, which is the only one
actually supported by upstream). I am discussing stability of the
config file with upstream -- they may be able to start versioning it.
- Fix GCC 14 builds due to '-fmin-function-alignment' not skipped for
libclang (bindgen).
- A couple Kconfig fixes around '{RUSTC,BINDGEN}_VERSION_TEXT' to
suppress error messages in a foreign architecture chroot and to use a
proper default format.
- Clean 'rust-analyzer' target warning due to missing recursive make
invocation mark.
- Clean Clippy warning due to missing indentation in docs.
- Clean LLVM 19 build warning due to removed 3dnow feature upstream.
* tag 'rust-fixes-6.11' of https://github.com/Rust-for-Linux/linux:
rust: x86: remove `-3dnow{,a}` from target features
kbuild: rust-analyzer: mark `rust_is_available.sh` invocation as recursive
rust: add intrinsics to fix `-Os` builds
kbuild: rust: skip -fmin-function-alignment in bindgen flags
rust: Support latest version of `rust-analyzer`
rust: macros: indent list item in `module!`'s docs
rust: fix the default format for CONFIG_{RUSTC,BINDGEN}_VERSION_TEXT
rust: suppress error messages from CONFIG_{RUSTC,BINDGEN}_VERSION_TEXT
Linus Torvalds [Fri, 16 Aug 2024 18:18:09 +0000 (11:18 -0700)]
Merge tag 'riscv-for-linus-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- reintroduce the text patching global icache flush
- fix syscall entry code to correctly initialize a0, which manifested
as a strace bug
- XIP kernels now map the entire kernel, which fixes boot under at
least DEBUG_VIRTUAL=y
- initialize all nodes in the acpi_early_node_map initializer
- fix OOB access in the Andes vendor extension probing code
- A new key for scalar misaligned access performance in hwprobe, which
correctly treat the values as an enum (as opposed to a bitmap)
* tag 'riscv-for-linus-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fix out-of-bounds when accessing Andes per hart vendor extension array
RISC-V: hwprobe: Add SCALAR to misaligned perf defines
RISC-V: hwprobe: Add MISALIGNED_PERF key
RISC-V: ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
riscv: change XIP's kernel_map.size to be size of the entire kernel
riscv: entry: always initialize regs->a0 to -ENOSYS
riscv: Re-introduce global icache flush in patch_text_XXX()
Linus Torvalds [Fri, 16 Aug 2024 18:12:29 +0000 (11:12 -0700)]
Merge tag 'trace-v6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
"A couple of fixes for tracing:
- Prevent a NULL pointer dereference in the error path of RTLA tool
- Fix an infinite loop bug when reading from the ring buffer when
closed. If there's a thread trying to read the ring buffer and it
gets closed by another thread, the one reading will go into an
infinite loop when the buffer is empty instead of exiting back to
user space"
* tag 'trace-v6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rtla/osnoise: Prevent NULL dereference in error handling
tracing: Return from tracing_buffers_read() if the file has been closed
Linus Torvalds [Fri, 16 Aug 2024 15:56:45 +0000 (08:56 -0700)]
Merge tag 'iommu-fixes-v6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fixes from Joerg Roedel:
- Bring back a lost return statement in io-page-fault code
- Remove an unused function declaration
* tag 'iommu-fixes-v6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommu: Remove unused declaration iommu_sva_unbind_gpasid()
iommu: Restore lost return in iommu_report_device_fault()
Linus Torvalds [Fri, 16 Aug 2024 15:39:41 +0000 (08:39 -0700)]
Merge tag 'sound-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"All small fixes, mostly for usual suspects, HD-audio and USB-audio
device-specific fixes / quirks. The Cirrus codec support took the
update of SPI header as well. Other than that, there is a regression
fix in the sanity check of ALSA timer code"
* tag 'sound-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/tas2781: Use correct endian conversion
ALSA: usb-audio: Support Yamaha P-125 quirk entry
ALSA: hda: cs35l41: Remove redundant call to hda_cs_dsp_control_remove()
ALSA: hda: cs35l56: Remove redundant call to hda_cs_dsp_control_remove()
ALSA: hda/tas2781: fix wrong calibrated data order
ALSA: usb-audio: Add delay quirk for VIVO USB-C-XE710 HEADSET
ALSA: hda/realtek: Add support for new HP G12 laptops
ALSA: hda/realtek: Fix noise from speakers on Lenovo IdeaPad 3 15IAU7
ALSA: timer: Relax start tick time check for slave timer elements
spi: Add empty versions of ACPI functions
Linus Torvalds [Fri, 16 Aug 2024 15:35:50 +0000 (08:35 -0700)]
Merge tag 'drm-fixes-2024-08-16' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Weekly drm fixes, mostly amdgpu and xe. The larger amdgpu fix is for a
new IP block introduced in rc1, so should be fine. The xe fixes
contain some missed fixes from the end of the previous round along
with some fixes which required precursor changes, but otherwise
everything seems fine,
xe:
- Validate user fence during creation
- Fix use after free when client stats are captured
- SRIOV fixes
- Runtime PM fixes"
* tag 'drm-fixes-2024-08-16' of https://gitlab.freedesktop.org/drm/kernel: (37 commits)
drm/xe: Hold a PM ref when GT TLB invalidations are inflight
drm/xe: Drop xe_gt_tlb_invalidation_wait
drm/xe: Add xe_gt_tlb_invalidation_fence_init helper
drm/xe/pf: Fix VF config validation on multi-GT platforms
drm/xe: Build PM into GuC CT layer
drm/xe/vf: Fix register value lookup
drm/xe: Fix use after free when client stats are captured
drm/xe: Take a ref to xe file when user creates a VM
drm/xe: Add ref counting for xe_file
drm/xe: Move part of xe_file cleanup to a helper
drm/xe: Validate user fence during creation
drm/rockchip: inno-hdmi: Fix infoframe upload
drm/amd/amdgpu: add HDP_SD support on gc 12.0.0/1
drm/amdgpu: Update kmd_fw_shared for VCN5
drm/amd/amdgpu: command submission parser for JPEG
drm/amdgpu/mes12: fix suspend issue
drm/amdgpu/mes12: sw/hw fini for unified mes
drm/amdgpu/mes12: configure two pipes hardware resources
drm/amdgpu/mes12: adjust mes12 sw/hw init for multiple pipes
drm/amdgpu/mes12: add mes pipe switch support
...
Wolfram Sang [Fri, 16 Aug 2024 14:23:51 +0000 (16:23 +0200)]
Merge tag 'i2c-host-fixes-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current
Two fixes in this update:
Tegra I2C Controller: Addresses a potential double-locking issue
during probe. ACPI devices are not IRQ-safe when invoking runtime
suspend and resume functions, so the irq_safe flag should not be
set.
Qualcomm GENI I2C Controller: Fixes an oversight in the exit path
of the runtime_resume() function, which was missed in the
previous release.
Rafael J. Wysocki [Tue, 13 Aug 2024 14:29:11 +0000 (16:29 +0200)]
thermal: gov_bang_bang: Use governor_data to reduce overhead
After running once, the for_each_trip_desc() loop in
bang_bang_manage() is pure needless overhead because it is not going to
make any changes unless a new cooling device has been bound to one of
the trips in the thermal zone or the system is resuming from sleep.
For this reason, make bang_bang_manage() set governor_data for the
thermal zone and check it upfront to decide whether or not it needs to
do anything.
However, governor_data needs to be reset in some cases to let
bang_bang_manage() know that it should walk the trips again, so add an
.update_tz() callback to the governor and make the core additionally
invoke it during system resume.
To avoid affecting the other users of that callback unnecessarily, add
a special notification reason for system resume, THERMAL_TZ_RESUME, and
also pass it to __thermal_zone_device_update() called during system
resume for consistency.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Kästle <peter@piie.net> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Cc: 6.10+ <stable@vger.kernel.org> # 6.10+ Link: https://patch.msgid.link/2285575.iZASKD2KPV@rjwysocki.net
Rafael J. Wysocki [Tue, 13 Aug 2024 14:27:33 +0000 (16:27 +0200)]
thermal: gov_bang_bang: Add .manage() callback
After recent changes, the Bang-bang governor may not adjust the
initial configuration of cooling devices to the actual situation.
Namely, if a cooling device bound to a certain trip point starts in
the "on" state and the thermal zone temperature is below the threshold
of that trip point, the trip point may never be crossed on the way up
in which case the state of the cooling device will never be adjusted
because the thermal core will never invoke the governor's
.trip_crossed() callback. [Note that there is no issue if the zone
temperature is at the trip threshold or above it to start with because
.trip_crossed() will be invoked then to indicate the start of thermal
mitigation for the given trip.]
To address this, add a .manage() callback to the Bang-bang governor
and use it to ensure that all of the thermal instances managed by the
governor have been initialized properly and the states of all of the
cooling devices involved have been adjusted to the current zone
temperature as appropriate.
Rafael J. Wysocki [Tue, 13 Aug 2024 14:26:42 +0000 (16:26 +0200)]
thermal: gov_bang_bang: Split bang_bang_control()
Move the setting of the thermal instance target state from
bang_bang_control() into a separate function that will be also called
in a different place going forward.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Kästle <peter@piie.net> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Cc: 6.10+ <stable@vger.kernel.org> # 6.10+ Link: https://patch.msgid.link/3313587.aeNJFYEL58@rjwysocki.net
Instead of clearing the "updated" flag for each cooling device
affected by the trip point crossing in bang_bang_control() and
walking all thermal instances to run thermal_cdev_update() for all
of the affected cooling devices, call __thermal_cdev_update()
directly for each of them.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Kästle <peter@piie.net> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Cc: 6.10+ <stable@vger.kernel.org> # 6.10+ Link: https://patch.msgid.link/13583081.uLZWGnKmhe@rjwysocki.net
Eli Billauer [Fri, 16 Aug 2024 07:02:00 +0000 (10:02 +0300)]
char: xillybus: Check USB endpoints when probing device
Ensure, as the driver probes the device, that all endpoints that the
driver may attempt to access exist and are of the correct type.
All XillyUSB devices must have a Bulk IN and Bulk OUT endpoint at
address 1. This is verified in xillyusb_setup_base_eps().
On top of that, a XillyUSB device may have additional Bulk OUT
endpoints. The information about these endpoints' addresses is deduced
from a data structure (the IDT) that the driver fetches from the device
while probing it. These endpoints are checked in setup_channels().
A XillyUSB device never has more than one IN endpoint, as all data
towards the host is multiplexed in this single Bulk IN endpoint. This is
why setup_channels() only checks OUT endpoints.
Reported-by: syzbot+eac39cba052f2e750dbe@syzkaller.appspotmail.com Cc: stable <stable@kernel.org> Closes: https://lore.kernel.org/all/0000000000001d44a6061f7a54ee@google.com/T/ Fixes: a53d1202aef1 ("char: xillybus: Add driver for XillyUSB (Xillybus variant for USB)"). Signed-off-by: Eli Billauer <eli.billauer@gmail.com> Link: https://lore.kernel.org/r/20240816070200.50695-2-eli.billauer@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>