www.infradead.org Git - users/jedix/linux-maple.git/log

Ard Biesheuvel [Thu, 20 Mar 2025 21:32:39 +0000 (22:32 +0100)]

x86/asm: Make asm export of __ref_stack_chk_guard unconditional

Clang does not tolerate the use of non-TLS symbols for the per-CPU stack
protector very well, and to work around this limitation, the symbol
passed via the -mstack-protector-guard-symbol= option is never defined
in C code, but only in the linker script, and it is exported from an
assembly file. This is necessary because Clang will fail to generate the
correct %GS based references in a compilation unit that includes a
non-TLS definition of the guard symbol being used to store the stack
cookie.

This problem is only triggered by symbol definitions, not by
declarations, but nonetheless, the declaration in <asm/asm-prototypes.h>
is conditional on __GENKSYMS__ being #define'd, so that only genksyms
will observe it, but for ordinary compilation, it will be invisible.

This is causing problems with the genksyms alternative gendwarfksyms,
which does not #define __GENKSYMS__, does not observe the symbol
declaration, and therefore lacks the information it needs to version it.
Adding the #define creates problems in other places, so that is not a
straight-forward solution. So take the easy way out, and drop the
conditional on __GENKSYMS__, as this is not really needed to begin with.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20250320213238.4451-2-ardb@kernel.org

Rik van Riel [Wed, 19 Mar 2025 17:25:20 +0000 (13:25 -0400)]

x86/mm: Only do broadcast flush from reclaim if pages were unmapped

Track whether pages were unmapped from any MM (even ones with a currently
empty mm_cpumask) by the reclaim code, to figure out whether or not
broadcast TLB flush should be done when reclaim finishes.

The reason any MM must be tracked, and not only ones contributing to the
tlbbatch cpumask, is that broadcast ASIDs are expected to be kept up to
date even on CPUs where the MM is not currently active.

This change allows reclaim to avoid doing TLB flushes when only clean page
cache pages and/or slab memory were reclaimed, which is fairly common.

( This is a simpler alternative to the code that was in my INVLPGB series
before, and it seems to capture most of the benefit due to how common
it is to reclaim only page cache. )
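
As an illustration of the tracking described above, here is a minimal user-space sketch. The struct and function names are hypothetical stand-ins and are not taken from the patch; the counter merely stands in for the broadcast TLB flush.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the reclaim TLB batch state. */
struct tlb_batch_sketch {
    bool unmapped_pages;   /* set once any page was unmapped from any MM */
    unsigned int flushes;  /* counts simulated broadcast flushes */
};

/* Called whenever reclaim unmaps a page, regardless of the MM's cpumask. */
static void batch_note_unmap(struct tlb_batch_sketch *b)
{
    b->unmapped_pages = true;
}

/* Called when reclaim finishes: flush only if something was unmapped. */
static void batch_finish(struct tlb_batch_sketch *b)
{
    if (!b->unmapped_pages)
        return;            /* only clean page cache or slab was reclaimed */
    b->flushes++;          /* stands in for the broadcast TLB flush */
    b->unmapped_pages = false;
}
```

With this shape, a reclaim pass that never calls batch_note_unmap() finishes without any flush.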

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250319132520.6b10ad90@fangorn

Sohil Mehta [Tue, 18 Mar 2025 22:38:28 +0000 (22:38 +0000)]

perf/x86/intel, x86/cpu: Replace Pentium 4 model checks with VFM ones

Introduce a name for an old Pentium 4 model and replace the x86_model
checks with VFM ones. This gets rid of one of the last remaining
Intel-specific x86_model checks.
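
To illustrate the idea of a VFM (vendor/family/model) check, the following sketch packs the three fields into one integer so that a single comparison replaces separate vendor, family and model tests. The bit layout, the macro names and the named model constant are assumptions made for this sketch only.

```c
#include <stdint.h>

/* Assumed layout: model in bits 0-7, family in bits 8-15, vendor in bits 16-23. */
#define SKETCH_VFM_MAKE(vendor, family, model) \
    (((uint32_t)(vendor) << 16) | ((uint32_t)(family) << 8) | (uint32_t)(model))

#define SKETCH_VENDOR_INTEL 0u  /* hypothetical vendor id */
/* Hypothetical name for one Family 15 model. */
#define SKETCH_P4_MODEL_X SKETCH_VFM_MAKE(SKETCH_VENDOR_INTEL, 15, 0x03)

struct cpu_sketch {
    uint8_t vendor;
    uint8_t family;
    uint8_t model;
};

static uint32_t cpu_vfm(const struct cpu_sketch *c)
{
    return SKETCH_VFM_MAKE(c->vendor, c->family, c->model);
}

/* Old style: three separate comparisons. */
static int old_style_check(const struct cpu_sketch *c)
{
    return c->vendor == SKETCH_VENDOR_INTEL && c->family == 15 && c->model == 0x03;
}

/* VFM style: one comparison against a named constant. */
static int vfm_style_check(const struct cpu_sketch *c)
{
    return cpu_vfm(c) == SKETCH_P4_MODEL_X;
}
```

Because the vendor is part of the packed value, the VFM comparison cannot accidentally match a CPU from another vendor that has the same family and model numbers.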

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250318223828.2945651-3-sohil.mehta@intel.com

Sohil Mehta [Tue, 18 Mar 2025 22:38:27 +0000 (22:38 +0000)]

perf/x86/intel, x86/cpu: Simplify Intel PMU initialization

Architectural Perfmon was introduced on the Family 6 "Core" processors
starting with Yonah. Processors before Yonah need their own customized
PMU initialization.

p6_pmu_init() is expected to provide that initialization for early
Family 6 processors. But, currently, it could get called for any Family
6 processor if the architectural perfmon feature is disabled on that
processor. To simplify, restrict the P6 PMU initialization to early
Family 6 processors that do not have architectural perfmon support and
truly need the special handling.

As a result, the "unsupported" console print becomes practically
unreachable because all the released P6 processors are covered by the
switch cases. Move the console print to a common location where it can
cover all modern processors (including Family >15) that may not have
architectural perfmon support enumerated.

Also, use this opportunity to get rid of the unnecessary switch cases in
P6 initialization. Only the Pentium Pro processor needs a quirk, and the
rest of the processors do not need any special handling. The gaps in the
case numbers are only due to no processor with those model numbers being
released.

Use decimal numbers to represent Intel Family numbers. Also, convert one
of the last few Intel x86_model comparisons to a VFM-based one.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250318223828.2945651-2-sohil.mehta@intel.com

Thomas Huth [Wed, 19 Mar 2025 10:30:57 +0000 (11:30 +0100)]

x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-UAPI headers

While the GCC and Clang compilers already define __ASSEMBLER__
automatically when compiling assembly code, __ASSEMBLY__ is a
macro that only gets defined by the Makefiles in the kernel.

This can be very confusing when switching between userspace
and kernelspace coding, or when dealing with UAPI headers that
rather should use __ASSEMBLER__ instead. So let's standardize on
the __ASSEMBLER__ macro that is provided by the compilers now.
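
A minimal sketch of the header pattern this is about: constants stay visible to both C and assembly sources, while C-only declarations are guarded by the compiler-provided __ASSEMBLER__ macro. All names prefixed with sketch_ or SKETCH_ are made up for this example.

```c
/* Visible to both C and assembly sources. */
#define SKETCH_FLAG_BIT 3

#ifndef __ASSEMBLER__
/* C-only part: types and inline functions an assembler could not parse. */
struct sketch_regs {
    unsigned long ip;
    unsigned long sp;
};

static inline unsigned long sketch_flag_mask(void)
{
    return 1UL << SKETCH_FLAG_BIT;
}
#endif /* !__ASSEMBLER__ */
```

No Makefile has to pass an extra -D option for this guard to work, since the compiler driver defines __ASSEMBLER__ by itself when preprocessing assembly.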

This is mostly a mechanical patch (done with a simple "sed -i"
statement), with some manual tweaks in <asm/frame.h>, <asm/hw_irq.h>
and <asm/setup.h> that mentioned this macro in comments with some
missing underscores.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250314071013.1575167-38-thuth@redhat.com

Thomas Huth [Mon, 10 Mar 2025 10:42:56 +0000 (11:42 +0100)]

x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI headers

__ASSEMBLY__ is only defined by the Makefile of the kernel, so
this is not really useful for UAPI headers (unless the userspace
Makefile defines it, too). Let's switch to __ASSEMBLER__ which
gets set automatically by the compiler when compiling assembly
code.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Brian Gerst <brgerst@gmail.com>
Link: https://lore.kernel.org/r/20250310104256.123527-1-thuth@redhat.com

Uros Bizjak [Sun, 9 Mar 2025 17:09:36 +0000 (18:09 +0100)]

x86/locking/atomic: Improve performance by using asm_inline() for atomic locking instructions

According to:

  https://gcc.gnu.org/onlinedocs/gcc/Size-of-an-asm.html

the usage of asm pseudo directives in the asm template can cause
the compiler to wrongly estimate the size of the generated
code.

The LOCK_PREFIX macro expands to several asm pseudo directives, so
its usage in atomic locking insns causes instruction length estimates
to fail significantly (the specially instrumented compiler reports
the estimated length of these asm templates to be 6 instructions long).

This incorrect estimate further causes suboptimal inlining decisions,
suboptimal instruction scheduling and suboptimal code block alignments
for functions that use these locking primitives.

Use asm_inline instead of plain asm:

  https://gcc.gnu.org/pipermail/gcc-patches/2018-December/512349.html

asm_inline is a feature that makes GCC treat some inline assembler code
as tiny (where it would otherwise think it is huge).

For code size estimation, the size of the asm is then taken as
the minimum size of one instruction, ignoring how many instructions
the compiler thinks it is.
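
A minimal user-space sketch of the "asm inline" qualifier, assuming a compiler that accepts it (GCC 9 or later, or a recent Clang). The macro and function names are local to this sketch, and the asm template is a plain locked increment without the extra directives that LOCK_PREFIX adds; non-x86 targets fall back to a compiler builtin.

```c
/* Local stand-in for the kernel's asm_inline; the name is specific to this sketch. */
#define sketch_asm_inline __asm__ __inline

static inline void sketch_atomic_inc(int *v)
{
#if defined(__x86_64__) || defined(__i386__)
    /* "inline" tells the compiler to cost this asm as one minimal instruction. */
    sketch_asm_inline volatile("lock incl %0"
                               : "+m" (*v)
                               :
                               : "memory", "cc");
#else
    /* Fallback so the sketch also builds on other architectures. */
    __atomic_fetch_add(v, 1, __ATOMIC_SEQ_CST);
#endif
}
```

The qualifier changes only the size estimate used by the inliner and other heuristics; the emitted instruction is the same as with a plain asm statement.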

bloat-o-meter reports the following code size increase
(x86_64 defconfig, gcc-14.2.1):

  add/remove: 82/283 grow/shrink: 870/372 up/down: 76272/-43618 (32654)
  Total: Before=22770320, After=22802974, chg +0.14%

with top grows (>500 bytes):

Function                                     old     new   delta
----------------------------------------------------------------
copy_process                                6465   10191   +3726
balance_dirty_pages_ratelimited_flags        237    2949   +2712
icl_plane_update_noarm                      5800    7969   +2169
samsung_input_mapping                       3375    5170   +1795
ext4_do_update_inode.isra                      -    1526   +1526
__schedule                                  2416    3472   +1056
__i915_vma_resource_unhold                     -     946    +946
sched_mm_cid_after_execve                    175    1097    +922
__do_sys_membarrier                            -     862    +862
filemap_fault                               2666    3462    +796
nl80211_send_wiphy                         11185   11874    +689
samsung_input_mapping.cold                   900    1500    +600
virtio_gpu_queue_fenced_ctrl_buffer          839    1410    +571
ilk_update_pipe_csc                         1201    1735    +534
enable_step                                    -     525    +525
icl_color_commit_noarm                      1334    1847    +513
tg3_read_bc_ver                                -     501    +501

and top shrinks (>500 bytes):

Function                                     old     new   delta
----------------------------------------------------------------
nl80211_send_iftype_data                     580       -    -580
samsung_gamepad_input_mapping.isra.cold      604       -    -604
virtio_gpu_queue_ctrl_sgs                    724       -    -724
tg3_get_invariants                          9218    8376    -842
__i915_vma_resource_unhold.part              899       -    -899
ext4_mark_iloc_dirty                        1735     106   -1629
samsung_gamepad_input_mapping.isra          2046       -   -2046
icl_program_input_csc                       2203       -   -2203
copy_mm                                     2242       -   -2242
balance_dirty_pages                         2657       -   -2657
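
The bloat-o-meter summary line above can be cross-checked with a little arithmetic. The helper names below are made up for this sketch; the numbers plugged in by a caller would be the ones from the report.

```c
/* Net size change: "up" is the total growth, "down" the (negative) total shrink. */
static long sketch_net_delta(long up, long down)
{
    return up + down;
}

/* Relative change in percent between two total sizes. */
static double sketch_percent_change(long before, long after)
{
    return 100.0 * (double)(after - before) / (double)before;
}
```

For the report above, sketch_net_delta(76272, -43618) gives 32654, which is exactly the difference between the Before and After totals, and the relative change rounds to the reported +0.14%.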

These code size changes can be grouped into 4 groups:

a) some functions now include once-called functions in full or
in part. These are:

Function                                     old     new   delta
----------------------------------------------------------------
copy_process                                6465   10191   +3726
balance_dirty_pages_ratelimited_flags        237    2949   +2712
icl_plane_update_noarm                      5800    7969   +2169
samsung_input_mapping                       3375    5170   +1795
ext4_do_update_inode.isra                      -    1526   +1526

that now include:

Function                                     old     new   delta
----------------------------------------------------------------
copy_mm                                     2242       -   -2242
balance_dirty_pages                         2657       -   -2657
icl_program_input_csc                       2203       -   -2203
samsung_gamepad_input_mapping.isra          2046       -   -2046
ext4_mark_iloc_dirty                        1735     106   -1629

b) ISRA [interprocedural scalar replacement of aggregates: an
interprocedural pass that removes unused function return values
(turning functions returning a value which is never used into void
functions) and removes unused function parameters.  It can also
replace an aggregate parameter by a set of other parameters
representing part of the original, turning those passed by reference
into new ones which pass the value directly.]

Top grows and shrinks of this group are listed below:

Function                                     old     new   delta
----------------------------------------------------------------
ext4_do_update_inode.isra                      -    1526   +1526
nfs4_begin_drain_session.isra                  -     249    +249
nfs4_end_drain_session.isra                    -     168    +168
__guc_action_register_multi_lrc_v70.isra     335     500    +165
__i915_gem_free_objects.isra                   -     144    +144
...
membarrier_register_private_expedited.isra     108       -    -108
syncobj_eventfd_entry_func.isra              445     314    -131
__ext4_sb_bread_gfp.isra                     140       -    -140
class_preempt_notrace_destructor.isra        145       -    -145
p9_fid_put.isra                              151       -    -151
__mm_cid_try_get.isra                        238       -    -238
membarrier_global_expedited.isra             294       -    -294
mm_cid_get.isra                              295       -    -295
samsung_gamepad_input_mapping.isra.cold      604       -    -604
samsung_gamepad_input_mapping.isra          2046       -   -2046

c) different split points of hot/cold split that just move code around:

Top grows and shrinks of this group are listed below:

Function                                     old     new   delta
----------------------------------------------------------------
samsung_input_mapping.cold                   900    1500    +600
__i915_request_reset.cold                    311     389     +78
nfs_update_inode.cold                         77     153     +76
__do_sys_swapon.cold                         404     455     +51
copy_process.cold                              -      45     +45
tg3_get_invariants.cold                       73     115     +42
...
hibernate.cold                               671     643     -28
copy_mm.cold                                  31       -     -31
software_resume.cold                         249     207     -42
io_poll_wake.cold                            106      54     -52
samsung_gamepad_input_mapping.isra.cold      604       -    -604

d) full inline of small functions with locking insn (~150 cases).
These bring in most of the code size increase because the removed
function code is now inlined in multiple places. E.g.:

0000000000a50e10 <release_devnum>:
  a50e10:    48 63 07                 movslq (%rdi),%rax
  a50e13:    85 c0                    test   %eax,%eax
  a50e15:    7e 10                    jle    a50e27 <release_devnum+0x17>
  a50e17:    48 8b 4f 50              mov    0x50(%rdi),%rcx
  a50e1b:    f0 48 0f b3 41 50        lock btr %rax,0x50(%rcx)
  a50e21:    c7 07 ff ff ff ff        movl   $0xffffffff,(%rdi)
  a50e27:    e9 00 00 00 00           jmp    a50e2c <release_devnum+0x1c>
    a50e28: R_X86_64_PLT32    __x86_return_thunk-0x4
  a50e2c:    0f 1f 40 00              nopl   0x0(%rax)

is now fully inlined into the caller function. This is desirable due
to the per function overhead of CPU bug mitigations like retpolines.

FTR a) with -Os (where generated code size really matters) the x86_64
defconfig object file shrinks by 24.388 kbytes, representing a 0.1%
code size decrease:

    text           data     bss      dec            hex filename
23883860        4617284  814212 29315356        1bf511c vmlinux-old.o
23859472        4615404  814212 29289088        1beea80 vmlinux-new.o

FTR b) clang recognizes "asm inline", but there was no difference in
code sizes:

    text           data     bss      dec            hex filename
27577163        4503078  807732 32887973        1f5d4a5 vmlinux-clang-patched.o
27577181        4503078  807732 32887991        1f5d4b7 vmlinux-clang-unpatched.o

The performance impact of the patch was assessed by recompiling the
fedora-41 6.13.5 kernel and running lmbench with the old and new kernels.
The most noticeable improvements were from:

Process fork+exit: 270.0952 microseconds
Process fork+execve: 2620.3333 microseconds
Process fork+/bin/sh -c: 6781.0000 microseconds
File /usr/tmp/XXX write bandwidth: 1780350 KB/sec
Pagefaults on /usr/tmp/XXX: 0.3875 microseconds

to:

Process fork+exit: 298.6842 microseconds
Process fork+execve: 1662.7500 microseconds
Process fork+/bin/sh -c: 2127.6667 microseconds
File /usr/tmp/XXX write bandwidth: 1950077 KB/sec
Pagefaults on /usr/tmp/XXX: 0.1958 microseconds

and from:

Socket bandwidth using localhost
0.000001 2.52 MB/sec
0.000064 163.02 MB/sec
0.000128 321.70 MB/sec
0.000256 630.06 MB/sec
0.000512 1207.07 MB/sec
0.001024 2004.06 MB/sec
0.001437 2475.43 MB/sec
10.000000 5817.34 MB/sec

Avg xfer: 3.2KB, 41.8KB in 1.2230 millisecs, 34.15 MB/sec
AF_UNIX sock stream bandwidth: 9850.01 MB/sec
Pipe bandwidth: 4631.28 MB/sec

to:

Socket bandwidth using localhost
0.000001 3.13 MB/sec
0.000064 187.08 MB/sec
0.000128 324.12 MB/sec
0.000256 618.51 MB/sec
0.000512 1137.13 MB/sec
0.001024 1962.95 MB/sec
0.001437 2458.27 MB/sec
10.000000 6168.08 MB/sec

Avg xfer: 3.2KB, 41.8KB in 1.0060 millisecs, 41.52 MB/sec
AF_UNIX sock stream bandwidth: 9921.68 MB/sec
Pipe bandwidth: 4649.96 MB/sec

[ mingo: Prettified the changelog a bit. ]

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20250309170955.48919-1-ubizjak@gmail.com

Uros Bizjak [Thu, 13 Mar 2025 10:26:56 +0000 (11:26 +0100)]

x86/asm: Use asm_inline() instead of asm() in clwb()

Use asm_inline() to instruct the compiler that the size of asm()
is the minimum size of one instruction, ignoring how many instructions
the compiler thinks it is. The ALTERNATIVE macro, which expands to
several pseudo directives, causes the instruction length estimate to
count more than 20 instructions.

bloat-o-meter reports a slight increase of the code size
for the x86_64 defconfig object file, compiled with gcc-14.2:

  add/remove: 0/2 grow/shrink: 3/0 up/down: 190/-59 (131)

  Function                                     old     new   delta
  __copy_user_flushcache                       166     247     +81
  __memcpy_flushcache                          369     437     +68
  arch_wb_cache_pmem                             6      47     +41
  __pfx_clean_cache_range                       16       -     -16
  clean_cache_range                             43       -     -43

  Total: Before=22807167, After=22807298, chg +0.00%

The compiler now inlines and removes the clean_cache_range() function.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250313102715.333142-2-ubizjak@gmail.com

Uros Bizjak [Thu, 13 Mar 2025 10:26:55 +0000 (11:26 +0100)]

x86/asm: Use CLFLUSHOPT and CLWB mnemonics in <asm/special_insns.h>

Current minimum required version of binutils is 2.25,
which supports CLFLUSHOPT and CLWB instruction mnemonics.

Replace the byte-wise specification of CLFLUSHOPT and
CLWB with these proper mnemonics.

No functional change intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250313102715.333142-1-ubizjak@gmail.com

Uros Bizjak [Wed, 12 Mar 2025 12:38:45 +0000 (13:38 +0100)]

x86/hweight: Use asm_inline() instead of asm()

Use asm_inline() to instruct the compiler that the size of asm()
is the minimum size of one instruction, ignoring how many instructions
the compiler thinks it is. The ALTERNATIVE macro, which expands to
several pseudo directives, causes the instruction length estimate to
count more than 20 instructions.

bloat-o-meter reports a slight reduction of the code size
for the x86_64 defconfig object file, compiled with gcc-14.2:

  add/remove: 6/12 grow/shrink: 59/50 up/down: 3389/-3560 (-171)
  Total: Before=22734393, After=22734222, chg -0.00%

where 29 instances of code blocks involving POPCNT now get inlined,
resulting in the removal of several functions:

  format_is_yuv_semiplanar.part.isra            41       -     -41
  cdclk_divider                                 69       -     -69
  intel_joiner_adjust_timings                  140       -    -140
  nl80211_send_wowlan_tcp_caps                 369       -    -369
  nl80211_send_iftype_data                     579       -    -579
  __do_sys_pidfd_send_signal                   809       -    -809

One noticeable change is:

  pcpu_page_first_chunk                       1075    1060     -15

Where the compiler now inlines 4 more instances of POPCNT insns,
but still manages to compile to a function with smaller code size.
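
For readers unfamiliar with what a Hamming weight helper computes, here is a portable sketch of a software population count for 32-bit values, the kind of routine a CPU without POPCNT has to fall back to. The function name is made up for this sketch, and it is not claimed to be the kernel's exact fallback.

```c
#include <stdint.h>

/* Count the set bits in w by summing adjacent bit groups in parallel. */
static unsigned int sketch_sw_hweight32(uint32_t w)
{
    uint32_t res = w - ((w >> 1) & 0x55555555u);          /* 2-bit sums */
    res = (res & 0x33333333u) + ((res >> 2) & 0x33333333u); /* 4-bit sums */
    res = (res + (res >> 4)) & 0x0F0F0F0Fu;                /* 8-bit sums */
    res = res + (res >> 8);                                /* 16-bit sums */
    return (res + (res >> 16)) & 0x000000FFu;              /* final count */
}
```

On CPUs with POPCNT the whole routine collapses into a single instruction, which is why an accurate size estimate for the asm statement matters for inlining decisions.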

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250312123905.149298-3-ubizjak@gmail.com

Uros Bizjak [Wed, 12 Mar 2025 12:38:44 +0000 (13:38 +0100)]

x86/hweight: Use ASM_CALL_CONSTRAINT in inline asm()

Use ASM_CALL_CONSTRAINT to prevent inline asm() that includes a call
instruction from being scheduled before the frame pointer gets set
up by the containing function. This unconstrained scheduling might
cause objtool to print a "call without frame pointer save/setup"
warning. Current versions of compilers don't seem to trigger this
condition, but without this constraint there's nothing to prevent
the compiler from scheduling the insn in front of frame creation.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250312123905.149298-2-ubizjak@gmail.com

Uros Bizjak [Wed, 12 Mar 2025 12:38:43 +0000 (13:38 +0100)]

x86/hweight: Use named operands in inline asm()

No functional change intended.
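
As a reminder of what named operands look like in GCC-style inline asm, here is a small sketch unrelated to the actual hweight code. The function and operand names are invented for illustration; non-x86 targets use a plain C fallback.

```c
/* Adds two values; operands are referenced by name instead of %0 and %1. */
static inline unsigned int sketch_add(unsigned int sum, unsigned int addend)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__("addl %[addend], %[sum]"
            : [sum] "+r" (sum)        /* read-write output operand */
            : [addend] "r" (addend)   /* input operand */
            : "cc");
    return sum;
#else
    return sum + addend;              /* fallback for other architectures */
#endif
}
```

Named operands keep the template readable when operands are added or reordered, since no positional numbers need to be renumbered.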

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250312123905.149298-1-ubizjak@gmail.com

Ingo Molnar [Wed, 12 Mar 2025 11:48:49 +0000 (12:48 +0100)]

x86/stackprotector/64: Only export __ref_stack_chk_guard on CONFIG_SMP

The __ref_stack_chk_guard symbol doesn't exist on UP:

<stdin>:4:15: error: ‘__ref_stack_chk_guard’ undeclared here (not in a function)

Fix the #ifdef around the entry.S export.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Uros Bizjak <ubizjak@gmail.com>
Link: https://lore.kernel.org/r/20250123190747.745588-8-brgerst@gmail.com

Ard Biesheuvel [Wed, 12 Mar 2025 10:27:41 +0000 (11:27 +0100)]

x86/head/64: Avoid Clang < 17 stack protector in startup code

Clang versions before 17 will not honour -fdirect-access-external-data
for the load of the stack cookie emitted into each function's prologue
and epilogue, and will emit a GOT based reference instead, e.g.,

  4c 8b 2d 00 00 00 00    mov    0x0(%rip),%r13
          18a: R_X86_64_REX_GOTPCRELX     __ref_stack_chk_guard-0x4
  65 49 8b 45 00          mov    %gs:0x0(%r13),%rax

This is inefficient, but at least, the linker will usually follow the
rules of the x86 psABI, and relax the GOT load into a RIP-relative LEA
instruction.  This is still suboptimal, as the per-CPU load could use a
RIP-relative reference directly, but at least it gets rid of the first
load from memory.

However, Boris reports that in some cases, when using distro builds of
Clang/LLD 15, the first load gets relaxed into

  49 c7 c6 20 c0 55 86 mov    $0xffffffff8655c020,%r14
  ffffffff8373bf0f: R_X86_64_32S __ref_stack_chk_guard
  65 49 8b 06           mov    %gs:(%r14),%rax

instead, which is fine in principle, as MOV may be cheaper than LEA on
some micro-architectures. However, such absolute references assume that
the variable in question can be accessed via the kernel virtual mapping,
and this is not guaranteed for the startup code residing in .head.text.

This is therefore a true positive that was caught using the recently
introduced relocs check for absolute references in the startup code:

  Absolute reference to symbol '__ref_stack_chk_guard' not permitted in .head.text

Work around the issue by disabling the stack protector in the startup
code for Clang versions older than 17.

Fixes: 80d47defddc0 ("x86/stackprotector/64: Convert to normal per-CPU variable")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250312102740.602870-2-ardb+git@google.com

Uros Bizjak [Thu, 6 Mar 2025 14:52:11 +0000 (15:52 +0100)]

x86/kexec: Merge x86_32 and x86_64 code using macros from <asm/asm.h>

Merge common x86_32 and x86_64 code in crash_setup_regs()
using macros from <asm/asm.h>.

The compiled object files before and after the patch are unchanged.
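
The general technique of sharing one asm statement between 32-bit and 64-bit builds can be sketched as follows: a macro supplies the register name for the current word size, so the asm template itself is written only once. The macro and function names here are hypothetical and only resemble the spirit of the helpers in <asm/asm.h>.

```c
/* Pick the stack pointer register name for the current word size. */
#if defined(__x86_64__)
# define SKETCH_ASM_SP "rsp"
#elif defined(__i386__)
# define SKETCH_ASM_SP "esp"
#endif

static inline unsigned long sketch_read_sp(void)
{
#ifdef SKETCH_ASM_SP
    unsigned long sp;

    /* One template serves both x86_32 and x86_64. */
    __asm__ volatile("mov %%" SKETCH_ASM_SP ", %0" : "=r" (sp));
    return sp;
#else
    /* Fallback for other architectures: approximate with the frame address. */
    return (unsigned long)__builtin_frame_address(0);
#endif
}
```

Since the macro only changes the spelling of the source, the generated instructions are the same as with two separate #ifdef branches.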

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/20250306145227.55819-1-ubizjak@gmail.com

Kirill A. Shutemov [Tue, 4 Mar 2025 15:33:42 +0000 (17:33 +0200)]

x86/runtime-const: Add the RUNTIME_CONST_PTR assembly macro

Add an assembly macro to refer to a runtime constant. It hides linker
magic and makes assembly more readable.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250304153342.2016569-1-kirill.shutemov@linux.intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:31 +0000 (18:41 +0000)]

x86/cpu/intel: Limit the non-architectural constant_tsc model checks

X86_FEATURE_CONSTANT_TSC is a Linux-defined, synthesized feature flag.
It is used across several vendors. Intel CPUs will set the feature when
the architectural CPUID.80000007.EDX[1] bit is set. There are also some
Intel CPUs that have the X86_FEATURE_CONSTANT_TSC behavior but don't
enumerate it with the architectural bit. Those currently have a model
range check.

Today, virtually all of the CPUs that have the CPUID bit *also* match
the "model >= 0x0e" check. This is confusing. Instead of an open-ended
check, pick some models (INTEL_IVYBRIDGE and P4_WILLAMETTE) as the end
of goofy CPUs that should enumerate the bit but don't. These models are
a relatively arbitrary but conservative pick for this.

This makes it obvious that later CPUs (like Family 18+) no longer need
to synthesize X86_FEATURE_CONSTANT_TSC.
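
The difference between an open-ended and a closed model range can be sketched like this. The packing macro, the function name and the choice of range bounds are assumptions for illustration; only a single Family 6 range is modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed packing for this sketch: family in bits 8-15, model in bits 0-7. */
#define SKETCH_IFM(family, model) (((uint32_t)(family) << 8) | (uint32_t)(model))

#define SKETCH_RANGE_FIRST SKETCH_IFM(6, 0x0e)  /* hypothetical lower bound */
#define SKETCH_RANGE_LAST  SKETCH_IFM(6, 0x3a)  /* hypothetical upper bound */

/* The flag is set either by the architectural bit or by a closed model range. */
static bool sketch_constant_tsc(bool cpuid_bit, uint32_t vfm)
{
    if (cpuid_bit)
        return true;
    return vfm >= SKETCH_RANGE_FIRST && vfm <= SKETCH_RANGE_LAST;
}
```

With a closed range, a future family that does not enumerate the CPUID bit is no longer matched by accident, which an open-ended "model >= 0x0e" style check could not guarantee.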

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20250219184133.816753-14-sohil.mehta@intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:30 +0000 (18:41 +0000)]

x86/mm/pat: Replace Intel x86_model checks with VFM ones

Introduce markers and names for some Family 6 and Family 15 models and
replace x86_model checks with VFM ones.

Since the VFM checks are closed ended and only applicable to Intel, get
rid of the explicit Intel vendor check as well.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/20250219184133.816753-13-sohil.mehta@intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:29 +0000 (18:41 +0000)]

x86/cpu/intel: Fix fast string initialization for extended Families

X86_FEATURE_REP_GOOD is a Linux-defined feature flag to track whether
fast string operations should be used for copy_page(). It is also used
as a second alternative for clear_page() if enhanced fast string
operations (ERMS) are not available.

X86_FEATURE_ERMS is an Intel-specific hardware-defined feature flag that
tracks hardware support for Enhanced Fast strings. It is used to track
whether Fast strings should be used for similar memory copy and memory
clearing operations.

On top of these, there is a FAST_STRING enable bit in the
IA32_MISC_ENABLE MSR. It is typically controlled by the BIOS to provide
a hint to the hardware and the OS on whether fast string operations are
preferred.

Commit:

161ec53c702c ("x86, mem, intel: Initialize Enhanced REP MOVSB/STOSB")

introduced a mechanism to honor the BIOS preference for fast string
operations and clear the above feature flags if needed.

Unfortunately, the current initialization code for Intel to set and
clear these bits is confusing at best and likely incorrect.

X86_FEATURE_REP_GOOD is cleared in early_init_intel() if
MISC_ENABLE.FAST_STRING is 0. But it gets set later on unconditionally
for all Family 6 processors in init_intel(). This not only overrides the
BIOS preference but also contradicts the earlier check.

Fix this by combining the related checks and always relying on the BIOS
provided preference for fast string operations. This simplification
makes sure the upcoming Intel Family 18 and 19 models are covered as
well.
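
The combined check described above can be sketched in user space as follows. The struct, the function name and the bit position of the FAST_STRING enable bit are assumptions of this sketch; the MSR value is passed in as a parameter instead of being read from hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the FAST_STRING enable bit in this sketch. */
#define SKETCH_MISC_ENABLE_FAST_STRING (1ULL << 0)

/* Hypothetical stand-ins for the two feature flags. */
struct sketch_caps {
    bool rep_good;
    bool erms;
};

/* Follow the BIOS preference in one place instead of two conflicting ones. */
static void sketch_init_fast_string(struct sketch_caps *caps, uint64_t misc_enable)
{
    if (misc_enable & SKETCH_MISC_ENABLE_FAST_STRING) {
        caps->rep_good = true;
    } else {
        /* BIOS disabled fast strings: clear both flags. */
        caps->rep_good = false;
        caps->erms = false;
    }
}
```

Because there is no later unconditional per-family assignment, the result depends only on the enable bit and not on which Family the processor reports.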

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250219184133.816753-12-sohil.mehta@intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:28 +0000 (18:41 +0000)]

x86/smpboot: Fix INIT delay assignment for extended Intel Families

Some old crusty CPUs need an extra delay that slows down booting. See
the comment above 'init_udelay' for details. Newer CPUs don't need the
delay.

Right now, for Intel, Family 6 and only Family 6 skips the delay. That
leaves out both the Family 15 (Pentium 4s) and brand new Family 18/19
models.

The omission of Family 15 (Pentium 4s) seems like an oversight and 18/19
do not need the delay.

Skip the delay on all Intel processors Family 6 and beyond.
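
The resulting rule for Intel can be sketched as a single comparison. The function and macro names are made up, and the legacy value assumes the 10 msec INIT delay expressed in microseconds.

```c
/* Assumed legacy INIT delay: 10 msec, expressed in microseconds. */
#define SKETCH_LEGACY_INIT_UDELAY 10000u

/* Intel processors of Family 6 and beyond skip the delay entirely. */
static unsigned int sketch_intel_init_udelay(unsigned int family)
{
    return family >= 6 ? 0u : SKETCH_LEGACY_INIT_UDELAY;
}
```

A "Family 6 and beyond" comparison covers Family 15 as well as Family 18/19 without listing each family explicitly.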

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250219184133.816753-11-sohil.mehta@intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:27 +0000 (18:41 +0000)]

x86/smpboot: Remove confusing quirk usage in INIT delay

Very old multiprocessor systems required a 10 msec delay between
asserting and de-asserting INIT but modern processors do not require
this delay.

Over time the usage of the "quirk" wording while setting the INIT delay
has become misleading. The code comments suggest that modern processors
need to be quirked, which clears the default init_udelay of 10 msec,
while legacy processors don't need the quirk and continue to use the
default init_udelay.

With a lot more modern processors, the wording should be inverted if at
all needed. Instead, simplify the comments and the code by getting rid
of "quirk" usage altogether and clarifying the following:

- Old legacy processors -> Set the "legacy" 10 msec delay
- Modern processors -> Do not set any delay

No functional change.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250219184133.816753-10-sohil.mehta@intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:26 +0000 (18:41 +0000)]

x86/acpi/cstate: Improve Intel Family model checks

Update the Intel Family checks to consistently use Family 15 instead of
Family 0xF. Also, get rid of one of the last usages of x86_model by using
the new VFM checks.

Update the incorrect comment since the check has changed since the
initial commit:

  ee1ca48fae7e ("ACPI: Disable ARB_DISABLE on platforms where it is not needed")

The two changes were:

- 3e2ada5867b7 ("ACPI: fix Compaq Evo N800c (Pentium 4m) boot hang regression")
   removed the P4 - Family 15.

- 03a05ed11529 ("ACPI: Use the ARB_DISABLE for the CPU which model id is less than 0x0f.")
   got rid of CORE_YONAH - Family 6, model E.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20250219184133.816753-9-sohil.mehta@intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:25 +0000 (18:41 +0000)]

x86/cpu/intel: Replace Family 5 model checks with VFM ones

Introduce names for some Family 5 models and convert some of the checks
to be VFM based.

Also, to keep the file sorted by family, move Family 5 to the top of the
header file.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250219184133.816753-8-sohil.mehta@intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:24 +0000 (18:41 +0000)]

x86/cpu/intel: Replace Family 15 checks with VFM ones

Introduce names for some old Pentium 4 models and replace the x86_model
checks with VFM ones.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250219184133.816753-7-sohil.mehta@intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:23 +0000 (18:41 +0000)]

x86/cpu/intel: Replace early Family 6 checks with VFM ones

Introduce names for some old Pentium models and replace the x86_model
checks with VFM ones.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250219184133.816753-6-sohil.mehta@intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:22 +0000 (18:41 +0000)]

x86/mtrr: Modify a x86_model check to an Intel VFM check

Simplify one of the last few Intel x86_model checks in arch/x86 by
substituting it with a VFM one.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250219184133.816753-5-sohil.mehta@intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:21 +0000 (18:41 +0000)]

x86/microcode: Update the Intel processor flag scan check

The Family model check to read the processor flag MSR is misleading and
potentially incorrect. It doesn't consider Family while comparing the
model number. The original check did have a Family number but it got
lost/moved during refactoring.

intel_collect_cpu_info() is called through multiple paths such as early
initialization, CPU hotplug as well as IFS image load. Some of these
flows would be error prone due to the ambiguous check.

Correct the processor flag scan check to use a Family number and update
it to a VFM based one to make it more readable.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250219184133.816753-4-sohil.mehta@intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:20 +0000 (18:41 +0000)]

x86/cpu/intel: Fix the MOVSL alignment preference for extended Families

The alignment preference for 32-bit MOVSL based bulk memory move has
been 8-byte for a long time. However this preference is only set for
Family 6 and 15 processors.

Use the same preference for upcoming Family numbers 18 and 19. Also, use
a simpler VFM based check instead of switching based on Family numbers.
Refresh the comment to reflect the new check.

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250219184133.816753-3-sohil.mehta@intel.com

Sohil Mehta [Wed, 19 Feb 2025 18:41:19 +0000 (18:41 +0000)]

x86/apic: Fix 32-bit APIC initialization for extended Intel Families

APIC detection is currently limited to a few specific Families and will
not match the upcoming Families >=18.

Extend the check to include all Families 6 or greater. Also convert it
to a VFM check to make it simpler.
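
To illustrate why a VFM (vendor, family, model) check is simpler than switching
on Family numbers, here is a minimal sketch. It assumes one byte each for model,
family and vendor packed into a single integer; the names vfm_make() and
family_6_or_later() and the vendor value 0 are hypothetical and only loosely
modeled on the kernel's helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout for this sketch: model, family, vendor, one byte each. */
#define VFM_MODEL_SHIFT   0
#define VFM_FAMILY_SHIFT  8
#define VFM_VENDOR_SHIFT  16

#define SKETCH_VENDOR_INTEL 0u	/* hypothetical vendor id */

/* Pack vendor/family/model into one comparable value. */
uint32_t vfm_make(uint32_t vendor, uint32_t family, uint32_t model)
{
	assert(vendor <= 0xff && family <= 0xff && model <= 0xff);

	return (vendor << VFM_VENDOR_SHIFT) |
	       (family << VFM_FAMILY_SHIFT) |
	       (model  << VFM_MODEL_SHIFT);
}

/*
 * For a single vendor, family sits above model, so one ">=" against the
 * first Family 6 value matches Families 6, 15, 18, 19 and anything later.
 */
int family_6_or_later(uint32_t vfm)
{
	return vfm >= vfm_make(SKETCH_VENDOR_INTEL, 6, 0);
}
```

The comparison is only meaningful among values of the same vendor, so a real
check would confirm the vendor first.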

Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250219184133.816753-2-sohil.mehta@intel.com

Ingo Molnar [Mon, 17 Mar 2025 22:18:24 +0000 (23:18 +0100)]

x86/cpuid: Use u32 instead of uint32_t in <asm/cpuid/api.h>

Use u32 instead of uint32_t in hypervisor_cpuid_base().

Yes, uint32_t is used in Xen code et al, but this is a core x86
architecture header and we should standardize on the type that
is being used overwhelmingly in related x86 architecture code.

The two types are the same so there should be no build warnings.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: "Ahmed S. Darwish" <darwi@linutronix.de>
Cc: x86-cpuid@lists.linux.dev
Link: https://lore.kernel.org/r/20250317221824.3738853-6-mingo@kernel.org

Ingo Molnar [Mon, 17 Mar 2025 22:18:23 +0000 (23:18 +0100)]

x86/cpuid: Standardize on u32 in <asm/cpuid/api.h>

Convert all uses of 'unsigned int' to 'u32' in <asm/cpuid/api.h>.

This is how a lot of the call sites are doing it, and the two
types are equivalent in the C sense - but 'u32' better expresses
that these are expressions of an immutable hardware ABI.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Xin Li (Intel) <xin@zytor.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: "Ahmed S. Darwish" <darwi@linutronix.de>
Cc: x86-cpuid@lists.linux.dev
Link: https://lore.kernel.org/r/20250317221824.3738853-5-mingo@kernel.org

Ingo Molnar [Mon, 17 Mar 2025 22:18:22 +0000 (23:18 +0100)]

x86/cpuid: Clean up <asm/cpuid/api.h>

- Include <asm/cpuid/types.h> first, as is customary. This also has
   the side effect of build-testing the header dependency assumptions
   in the types header.

- No newline necessary after the SPDX line

- Newline necessary after inline function definitions

- Rename native_cpuid_reg() to NATIVE_CPUID_REG(): it's a CPP macro,
   whose name we capitalize in such cases.

- Prettify the CONFIG_PARAVIRT_XXL inclusion block a bit

- Standardize register references in comments to EAX/EBX/ECX/etc.,
   from the hodgepodge of references.

- s/cpus/CPUs because why add noise to common acronyms?

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: "Ahmed S. Darwish" <darwi@linutronix.de>
Cc: x86-cpuid@lists.linux.dev
Link: https://lore.kernel.org/r/20250317221824.3738853-4-mingo@kernel.org

Ingo Molnar [Mon, 17 Mar 2025 22:18:21 +0000 (23:18 +0100)]

x86/cpuid: Clean up <asm/cpuid/types.h>

- We have 0x0d, 0x9 and 0x1d as literals for the CPUID_LEAF definitions,
   pick a single, consistent style of 0xZZ literals.

- Likewise, harmonize the style of the 'struct cpuid_regs' list of
   registers with that of 'enum cpuid_regs_idx'. Because while computers
   don't care about unnecessary visual noise, humans do.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: "Ahmed S. Darwish" <darwi@linutronix.de>
Cc: x86-cpuid@lists.linux.dev
Link: https://lore.kernel.org/r/20250317221824.3738853-3-mingo@kernel.org

Ahmed S. Darwish [Mon, 17 Mar 2025 22:18:20 +0000 (23:18 +0100)]

x86/cpuid: Refactor <asm/cpuid.h>

In preparation for future commits where CPUID headers will be expanded,
refactor the CPUID header <asm/cpuid.h> into:

    asm/cpuid/
    ├── api.h
    └── types.h

Move the CPUID data structures into <asm/cpuid/types.h> and the access
APIs into <asm/cpuid/api.h>.  Let <asm/cpuid.h> be just an include of
<asm/cpuid/api.h> so that existing call sites do not break.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: "Ahmed S. Darwish" <darwi@linutronix.de>
Cc: x86-cpuid@lists.linux.dev
Link: https://lore.kernel.org/r/20250317221824.3738853-2-mingo@kernel.org

Brian Gerst [Fri, 14 Mar 2025 15:12:20 +0000 (11:12 -0400)]

x86/syscall/32: Add comment to conditional

Add a CONFIG_X86_FRED comment, since this conditional is nested.

Suggested-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250314151220.862768-8-brgerst@gmail.com

Brian Gerst [Fri, 14 Mar 2025 15:12:19 +0000 (11:12 -0400)]

x86/syscall: Remove stray semicolons

No functional change.

Suggested-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250314151220.862768-7-brgerst@gmail.com

Brian Gerst [Fri, 14 Mar 2025 15:12:18 +0000 (11:12 -0400)]

x86/syscall: Move sys_ni_syscall()

Move sys_ni_syscall() to kernel/process.c, and remove the now empty
entry/common.c

No functional changes.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250314151220.862768-6-brgerst@gmail.com

Brian Gerst [Fri, 14 Mar 2025 15:12:17 +0000 (11:12 -0400)]

x86/syscall/x32: Move x32 syscall table

Since commit:

2e958a8a510d ("x86/entry/x32: Rename __x32_compat_sys_* to __x64_compat_sys_*")

the ABI prefix for x32 syscalls is the same as native 64-bit
syscalls. Move the x32 syscall table to syscall_64.c

No functional changes.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250314151220.862768-5-brgerst@gmail.com

Brian Gerst [Fri, 14 Mar 2025 15:12:16 +0000 (11:12 -0400)]

x86/syscall/64: Move 64-bit syscall dispatch code

Move the 64-bit syscall dispatch code to syscall_64.c.

No functional changes.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250314151220.862768-4-brgerst@gmail.com

Brian Gerst [Fri, 14 Mar 2025 15:12:15 +0000 (11:12 -0400)]

x86/syscall/32: Move 32-bit syscall dispatch code

Move the 32-bit syscall dispatch code to syscall_32.c.

No functional changes.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250314151220.862768-3-brgerst@gmail.com

Brian Gerst [Fri, 14 Mar 2025 15:12:14 +0000 (11:12 -0400)]

x86/xen: Move Xen upcall handler

Move the upcall handler to Xen-specific files.

No functional changes.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250314151220.862768-2-brgerst@gmail.com

Mario Limonciello [Mon, 17 Feb 2025 23:17:41 +0000 (17:17 -0600)]

x86/amd_node: Add a smn_read_register() helper

Some of the ACP drivers will poll registers through SMN using
read_poll_timeout(), which requires a read operation that returns the
register value as its result.

Add a helper to do just that.
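
A user-space sketch of the idea follows. Every name in it is hypothetical:
fake_smn_read() stands in for an accessor that reports a status and passes the
value back through a pointer, read_register() is the kind of wrapper that
returns the value directly, and poll_bit_set() is a deliberately simplified
stand-in for a polling loop, not the kernel's read_poll_timeout().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fake hardware: bit 0 reads as set from the third read on. */
unsigned int fake_read_count;

/* Accessor in the "status code plus out-parameter" style. */
int fake_smn_read(uint32_t reg, uint32_t *value)
{
	(void)reg;
	fake_read_count++;
	*value = (fake_read_count >= 3) ? 0x1 : 0x0;
	return 0;
}

/* Wrapper whose return value is the register content, or a negative error. */
int read_register(uint32_t reg)
{
	uint32_t data;
	int rc = fake_smn_read(reg, &data);

	return rc ? rc : (int)data;
}

/* Poll until (value & mask) is nonzero; 0 on success, nonzero otherwise. */
int poll_bit_set(uint32_t reg, uint32_t mask, unsigned int max_tries)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		int val = read_register(reg);

		if (val < 0)
			return val;	/* propagate the read error */
		if ((uint32_t)val & mask)
			return 0;
	}
	return -1;			/* gave up */
}
```

Because the wrapper yields the value as an expression, the polling loop can
assign and test it in one place without a separate out-parameter.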

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250217231747.1656228-2-superm1@kernel.org

Mario Limonciello [Thu, 30 Jan 2025 19:48:57 +0000 (19:48 +0000)]

x86/amd_node: Add support for debugfs access to SMN registers

There are certain registers on AMD Zen systems that can only be accessed
through SMN.

Introduce a new interface that provides debugfs files for accessing SMN. As
this introduces the capability for userspace to manipulate the hardware in
unpredictable ways, taint the kernel when writing.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250130-wip-x86-amd-nb-cleanup-v4-3-b5cc997e471b@amd.com

Mario Limonciello [Thu, 30 Jan 2025 19:48:56 +0000 (19:48 +0000)]

x86/amd_node: Add SMN offsets to exclusive region access

Offsets 0x60 and 0x64 are used internally by kernel drivers that call
the amd_smn_read() and amd_smn_write() functions. If userspace accesses
the regions at the same time as the kernel it may cause malfunctions in
drivers using the offsets.

Add these offsets to the exclusions so that the kernel is tainted if a
non-locked-down userspace tries to access them.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250130-wip-x86-amd-nb-cleanup-v4-2-b5cc997e471b@amd.com

Yazen Ghannam [Thu, 30 Jan 2025 19:48:55 +0000 (19:48 +0000)]

x86/amd_node, platform/x86/amd/hsmp: Have HSMP use SMN through AMD_NODE

The HSMP interface is just an SMN interface with different offsets.

Define an HSMP wrapper in the SMN code and have the HSMP platform driver
use that rather than a local solution.

Also, remove the "root" member from AMD_NB, since there are no more
users of it.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Carlos Bilbao <carlos.bilbao@kernel.org>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20250130-wip-x86-amd-nb-cleanup-v4-1-b5cc997e471b@amd.com

Thorsten Blum [Fri, 17 Jan 2025 14:48:59 +0000 (15:48 +0100)]

x86/mtrr: Use str_enabled_disabled() helper in print_mtrr_state()

Remove hard-coded strings by using the str_enabled_disabled() helper
function.

Suggested-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/20250117144900.171684-2-thorsten.blum%40linux.dev

Vitaly Kuznetsov [Tue, 10 Dec 2024 15:16:50 +0000 (16:16 +0100)]

x86/entry: Add __init to ia32_emulation_override_cmdline()

ia32_emulation_override_cmdline() is an early_param() arg and these
are only needed at boot time. In fact, all other early_param() functions
in arch/x86 seem to have '__init' annotation and
ia32_emulation_override_cmdline() is the only exception.

Fixes: a11e097504ac ("x86: Make IA32_EMULATION boot time configurable")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/all/20241210151650.1746022-1-vkuznets%40redhat.com

Sohil Mehta [Thu, 13 Mar 2025 20:16:08 +0000 (20:16 +0000)]

x86/cpufeatures: Warn about unmet CPU feature dependencies

Currently, the cpuid_deps[] table is only exercised when a particular
feature is explicitly disabled and clear_cpu_cap() is called. However,
some of these listed dependencies might already be missing during boot.

These types of errors shouldn't generally happen in production
environments, but they could sometimes sneak through, especially when
VMs and Kconfigs are in the mix. Also, the kernel might introduce
artificial dependencies between unrelated features, such as making LAM
depend on LASS.

Unexpected failures can occur when the kernel tries to use such
features. Add a simple boot-time scan of the cpuid_deps[] table to
detect the missing dependencies. One option is to disable all such
features during boot, but that may cause regressions in existing
systems. For now, just warn about the missing dependencies to create
awareness.

As a trade-off between spamming the kernel log and keeping track of all
the features that have been warned about, only warn about the first
missing dependency. Any subsequent unmet dependency will only be logged
after the first one has been resolved.
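
The scan described above could look roughly like the following sketch. The
table layout, the feature names and first_unmet_dep() are assumptions made for
illustration; the kernel's cpuid_deps[] table and its warning code are not
reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature indices; LAM depending on LASS follows the text. */
enum { FEAT_LASS, FEAT_LAM, FEAT_BASE, FEAT_EXT, NR_FEATURES };

/* One row: "feature" must not be present unless "depends" is present. */
struct feature_dep {
	int feature;
	int depends;
};

const struct feature_dep sample_deps[] = {
	{ FEAT_LAM, FEAT_LASS },
	{ FEAT_EXT, FEAT_BASE },
};

/*
 * Return the index of the first row whose feature is present while its
 * dependency is missing, or -1 if every dependency is met. Stopping at
 * the first hit mirrors the "warn only once" trade-off.
 */
int first_unmet_dep(const struct feature_dep *deps, size_t n, const bool *has)
{
	for (size_t i = 0; i < n; i++) {
		if (has[deps[i].feature] && !has[deps[i].depends])
			return (int)i;
	}
	return -1;
}
```

A caller would print one warning for the returned row and stop, so a second
unmet dependency only becomes visible once the first has been fixed.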

Features are typically represented through unsigned integers within the
kernel, though some of them have user-friendly names if they are exposed
via /proc/cpuinfo.

Show the friendlier name if available, otherwise display the
X86_FEATURE_* numerals to make it easier to identify the feature.

Suggested-by: Tony Luck <tony.luck@intel.com>
Suggested-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250313201608.3304135-1-sohil.mehta@intel.com

Pawan Gupta [Tue, 11 Mar 2025 15:03:08 +0000 (08:03 -0700)]

x86/rfds: Exclude P-only parts from the RFDS affected list

The affected CPU table (cpu_vuln_blacklist) marks Alderlake and Raptorlake
P-only parts affected by RFDS. This is not true because only E-cores are
affected by RFDS. With the current family/model matching it is not possible
to differentiate the unaffected parts, as the affected and unaffected
hybrid variants have the same model number.

Add a cpu-type match as well for such parts so as to exclude P-only parts
from being marked as affected.

Note, family/model and cpu-type enumeration could be inaccurate in
virtualized environments. In a guest, the affected status is decided by the
RFDS_NO and RFDS_CLEAR bits exposed by VMMs.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250311-add-cpu-type-v8-5-e8514dcaaff2@linux.intel.com

Pawan Gupta [Tue, 11 Mar 2025 15:02:52 +0000 (08:02 -0700)]

x86/cpu: Update x86_match_cpu() to also use cpu-type

Non-hybrid CPU variants that share the same Family/Model could be
differentiated by their cpu-type. x86_match_cpu() currently does not use
cpu-type for CPU matching.

Dave Hansen suggested using the conditions below to match CPU-type:

  1. If CPU_TYPE_ANY (the wildcard), then matched
  2. If hybrid, then matched
  3. If !hybrid, look at the boot CPU and compare the cpu-type to determine
     if it is a match.

  This special case for hybrid systems allows a more compact vulnerability
  list.  Imagine that "Haswell" CPUs might or might not be hybrid and that
  only Atom cores are vulnerable to Meltdown.  That means there are three
  possibilities:

   1. P-core only
   2. Atom only
   3. Atom + P-core (aka. hybrid)

  One might be tempted to code up the vulnerability list like this:

   MATCH(     HASWELL, X86_FEATURE_HYBRID, MELTDOWN)
   MATCH_TYPE(HASWELL, ATOM,               MELTDOWN)

  Logically, this matches #2 and #3. But that's a little silly. You would
  only ask for the "ATOM" match in cases where there *WERE* hybrid cores in
  play. You shouldn't have to _also_ ask for hybrid cores explicitly.

  In short, assume that processors that enumerate Hybrid==1 have a
  vulnerable core type.

Update x86_match_cpu() to also match cpu-type. Also treat hybrid systems as
special, and match them to any cpu-type.
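
The three conditions listed above translate into a short predicate. This is a
sketch only: enum sketch_cpu_type and cpu_type_matches() are hypothetical names,
and the remaining vendor/family/model/feature matching is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type values; ANY is the wildcard from condition 1. */
enum sketch_cpu_type { TYPE_ANY = 0, TYPE_ATOM, TYPE_CORE };

/*
 * wanted:    cpu-type requested by the table entry
 * is_hybrid: whether the system enumerates as hybrid
 * boot_type: cpu-type of the boot CPU (only consulted when !is_hybrid)
 */
bool cpu_type_matches(enum sketch_cpu_type wanted, bool is_hybrid,
		      enum sketch_cpu_type boot_type)
{
	if (wanted == TYPE_ANY)		/* 1. wildcard always matches */
		return true;
	if (is_hybrid)			/* 2. hybrid matches any cpu-type */
		return true;
	return wanted == boot_type;	/* 3. compare against the boot CPU */
}
```

With this rule, a single table entry asking for the Atom type covers both the
Atom-only and the hybrid variants, while a P-core-only part does not match.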

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250311-add-cpu-type-v8-4-e8514dcaaff2@linux.intel.com

Pawan Gupta [Tue, 11 Mar 2025 15:02:36 +0000 (08:02 -0700)]

x86/cpu: Add cpu_type to struct x86_cpu_id

In addition to matching vendor/family/model/feature, for hybrid variants it is
required to also match cpu-type. For example, some CPU vulnerabilities like
RFDS only affect a specific cpu-type.

To be able to also match CPUs based on their type, add a new field "type" to
struct x86_cpu_id which is used by the CPU-matching tables. Introduce
X86_CPU_TYPE_ANY for the cases that don't care about the cpu-type.

[ bp: Massage commit message. ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250311-add-cpu-type-v8-3-e8514dcaaff2@linux.intel.com

Pawan Gupta [Tue, 11 Mar 2025 15:02:20 +0000 (08:02 -0700)]

x86/cpu: Shorten CPU matching macro

To add cpu-type to the existing CPU matching infrastructure, the base macro
X86_MATCH_VENDOR_FAM_MODEL_STEPPINGS_FEATURE would need _CPU_TYPE appended.
This makes an already long name longer, and somewhat incomprehensible.

To avoid this, rename the base macro to X86_MATCH_CPU. The macro name
doesn't need to explicitly tell everything that it matches. The arguments
to the macro already hint at that.

For consistency, use this base macro to define X86_MATCH_VFM and friends.

Remove unused X86_MATCH_VENDOR_FAM_MODEL_FEATURE while at it.

[ bp: Massage commit message. ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250311-add-cpu-type-v8-2-e8514dcaaff2@linux.intel.com

Pawan Gupta [Tue, 11 Mar 2025 15:02:05 +0000 (08:02 -0700)]

x86/cpu: Fix the description of X86_MATCH_VFM_STEPS()

The comment needs to reflect an implementation change.

No functional change.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250311-add-cpu-type-v8-1-e8514dcaaff2@linux.intel.com

Xin Li (Intel) [Fri, 28 Feb 2025 08:23:38 +0000 (00:23 -0800)]

x86/cpufeatures: Use AWK to generate {REQUIRED|DISABLED}_MASK_BIT_SET in <asm/cpufeaturemasks.h>

Generate the {REQUIRED|DISABLED}_MASK_BIT_SET macros in the newly added AWK
script that generates <asm/cpufeaturemasks.h>.

Suggested-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Brian Gerst <brgerst@gmail.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250228082338.73859-6-xin@zytor.com

Xin Li (Intel) [Mon, 10 Mar 2025 07:32:12 +0000 (08:32 +0100)]

x86/cpufeatures: Remove {disabled,required}-features.h

The functionalities of {disabled,required}-features.h have been replaced with
the auto-generated generated/<asm/cpufeaturemasks.h> header.

Thus they are no longer needed and can be removed.

None of the macros defined in {disabled,required}-features.h are used in
tools, so delete the tools copies too.

Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250305184725.3341760-4-xin@zytor.com

H. Peter Anvin (Intel) [Wed, 5 Mar 2025 18:47:22 +0000 (10:47 -0800)]

x86/cpufeatures: Generate the <asm/cpufeaturemasks.h> header based on build config

Introduce an AWK script to auto-generate the <asm/cpufeaturemasks.h> header
with required and disabled feature masks based on <asm/cpufeatures.h>
and the current build config.

Thus for any CPU feature with a build config, e.g., X86_FRED, simply add:

  config X86_DISABLED_FEATURE_FRED
	def_bool y
	depends on !X86_FRED

to arch/x86/Kconfig.cpufeatures, instead of adding a conditional CPU
feature disable flag, e.g., DISABLE_FRED.

Lastly, the generated required and disabled feature masks will be added to
their corresponding feature masks for this particular compile-time
configuration.

  [ Xin: build integration improvements ]
  [ mingo: Improved changelog and comments ]

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250305184725.3341760-3-xin@zytor.com

H. Peter Anvin (Intel) [Fri, 28 Feb 2025 08:23:35 +0000 (00:23 -0800)]

x86/cpufeatures: Add {REQUIRED,DISABLED} feature configs

Required and disabled feature masks completely rely on build configs,
i.e., once a build config is fixed, so are the feature masks.

To prepare for auto-generating the <asm/cpufeaturemasks.h> header
with required and disabled feature masks based on a build config,
add feature Kconfig items:

- X86_REQUIRED_FEATURE_x
- X86_DISABLED_FEATURE_x

each of which may be set to "y" if and only if its preconditions from
current build config are met.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250228082338.73859-3-xin@zytor.com

Kirill A. Shutemov [Wed, 16 Oct 2024 11:14:55 +0000 (14:14 +0300)]

x86/mm/ident_map: Fix theoretical virtual address overflow to zero

The current calculation of the 'next' virtual address in the
page table initialization functions in arch/x86/mm/ident_map.c
doesn't protect against wrapping to zero.

This is a theoretical issue that cannot happen currently: the
problematic case is possible only if the user sets a high enough
x86_mapping_info::offset value, which no current code in the
upstream kernel does.

( The wrapping to zero only occurs if the top PGD entry is accessed.
  There are no such users upstream. Only hibernate_64.c uses
  x86_mapping_info::offset, and it operates on the direct mapping
  range, which is not the top PGD entry. )

Should such an overflow happen, it can result in page table
corruption and a hang.

To future-proof this code, replace the manual 'next' calculation
with p?d_addr_end() which handles wrapping correctly.
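
The difference between the two calculations can be shown with a small
user-space sketch. It assumes a 512 GiB region size (shift of 39) and uses the
hypothetical names next_naive() and next_addr_end(); the second function
follows the usual shape of the kernel's *_addr_end() helpers, but it is written
here from scratch as an illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed region size for this sketch: 2^39 bytes per top-level entry. */
#define REGION_SIZE (1ULL << 39)
#define REGION_MASK (~(REGION_SIZE - 1))

/* Manual calculation: wraps to 0 when addr lies in the topmost region. */
uint64_t next_naive(uint64_t addr, uint64_t end)
{
	uint64_t next = (addr & REGION_MASK) + REGION_SIZE;

	return next > end ? end : next;
}

/*
 * Wrap-safe calculation: subtracting 1 from both sides turns a wrapped
 * boundary of 0 into the largest value, so "end" is chosen instead.
 */
uint64_t next_addr_end(uint64_t addr, uint64_t end)
{
	uint64_t boundary = (addr + REGION_SIZE) & REGION_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}
```

For an address in the topmost region the manual version returns 0, which a
loop of the form "for (; addr < end; addr = next)" would treat as a restart
from the bottom of the address space, while the wrap-safe version returns end.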

[ Backporter's note: there's no need to backport this patch. ]

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20241016111458.846228-2-kirill.shutemov@linux.intel.com

Kirill A. Shutemov [Wed, 16 Oct 2024 11:14:56 +0000 (14:14 +0300)]

x86/acpi: Replace manual page table initialization with kernel_ident_mapping_init()

The init_transition_pgtable() function maps the page with
asm_acpi_mp_play_dead() into an identity mapping.

Replace open-coded manual page table initialization with
kernel_ident_mapping_init() to avoid code duplication.

Use x86_mapping_info::offset to get the page mapped at the
correct location.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20241016111458.846228-3-kirill.shutemov@linux.intel.com


Tom Lendacky [Tue, 4 Mar 2025 11:59:56 +0000 (12:59 +0100)]

x86/mm: Always set the ASID valid bit for the INVLPGB instruction

When executing the INVLPGB instruction on a bare-metal host or hypervisor, if
the ASID valid bit is not set, the instruction will flush the TLB entries that
match the specified criteria for any ASID, not just those of the host. If
virtual machines are running on the system, this may result in inadvertent
flushes of guest TLB entries.

When executing the INVLPGB instruction in a guest and the INVLPGB instruction is
not intercepted by the hypervisor, the hardware will replace the requested ASID
with the guest ASID and set the ASID valid bit before doing the broadcast
invalidation. Thus a guest is only able to flush its own TLB entries.

So to limit the host TLB flushing reach, always set the ASID valid bit using an
ASID value of 0 (which represents the host/hypervisor). This will result in
the desired effect in both host and guest.
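A rough sketch of what "always set the ASID valid bit" means for the instruction operands follows. The flag bit positions and the EDX layout are assumptions based on my reading of the AMD description of INVLPGB, and the sk_* names are invented for this example; it is not the kernel's helper.

```c
#include <stdint.h>

/* Assumed rAX flag bits; check the AMD manual before relying on them. */
#define SK_INVLPGB_FLAG_VA   (UINT64_C(1) << 0)  /* address is valid */
#define SK_INVLPGB_FLAG_PCID (UINT64_C(1) << 1)  /* PCID is valid */
#define SK_INVLPGB_FLAG_ASID (UINT64_C(1) << 2)  /* ASID is valid */

/* Build rAX: page-aligned address plus flags, with ASID valid always set. */
static uint64_t sk_invlpgb_rax(uint64_t addr, uint64_t flags)
{
    return (addr & ~UINT64_C(0xfff)) | flags | SK_INVLPGB_FLAG_ASID;
}

/* Build EDX: PCID in the upper half, ASID 0 (host/hypervisor) in the lower. */
static uint32_t sk_invlpgb_edx(uint16_t pcid)
{
    const uint32_t host_asid = 0;

    return ((uint32_t)pcid << 16) | host_asid;
}
```

Because the flag is OR-ed in unconditionally, no caller can accidentally request a flush that matches every ASID.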

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250304120449.GHZ8bsYYyEBOKQIxBm@fat_crate.local


Rik van Riel [Wed, 26 Feb 2025 03:00:47 +0000 (22:00 -0500)]

x86/mm: Enable AMD translation cache extensions

With AMD TCE (translation cache extensions) only the intermediate mappings
that cover the address range zapped by INVLPG / INVLPGB get invalidated,
rather than all intermediate mappings getting zapped at every TLB invalidation.

This can help reduce the TLB miss rate, by keeping more intermediate mappings
in the cache.

From the AMD manual:

Translation Cache Extension (TCE) Bit. Bit 15, read/write. Setting this bit to
1 changes how the INVLPG, INVLPGB, and INVPCID instructions operate on TLB
entries. When this bit is 0, these instructions remove the target PTE from the
TLB as well as all upper-level table entries that are cached in the TLB,
whether or not they are associated with the target PTE. When this bit is set,
these instructions will remove the target PTE and only those upper-level
entries that lead to the target PTE in the page table hierarchy, leaving
unrelated upper-level entries intact.
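As a small illustration of the bit described above, the following sketch sets bit 15 in a saved register value. It assumes the register in question is EFER and operates on a plain integer rather than on the real MSR; sk_efer_enable_tce() is a made-up name.

```c
#include <stdint.h>

/* Bit 15 as quoted from the manual; the register is assumed to be EFER. */
#define SK_EFER_TCE (UINT64_C(1) << 15)

/* Return the register value with TCE set, but only if the CPU has it. */
static uint64_t sk_efer_enable_tce(uint64_t efer, int cpu_has_tce)
{
    return cpu_has_tce ? (efer | SK_EFER_TCE) : efer;
}
```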

[ bp: use cpu_has()... I know, it is a mess. ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226030129.530345-13-riel@surriel.com


Rik van Riel [Wed, 26 Feb 2025 03:00:45 +0000 (22:00 -0500)]

x86/mm: Enable broadcast TLB invalidation for multi-threaded processes

There is not enough room in the 12-bit ASID address space to hand out
broadcast ASIDs to every process. Only hand out broadcast ASIDs to processes
when they are observed to be simultaneously running on 4 or more CPUs.

This also allows single-threaded processes to continue using the cheaper, local
TLB invalidation instructions like INVLPG.
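The eligibility rule can be sketched as a population count over the set of CPUs a process is currently running on. This is an illustration only: the 64-bit mask, the sk_* names and the way the threshold is stored are assumptions, not the kernel's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold taken from the text above: 4 or more CPUs. */
#define SK_GLOBAL_ASID_CPU_THRESHOLD 4

/* Count the CPUs set in a (hypothetical) 64-bit CPU mask. */
static unsigned int sk_count_cpus(uint64_t cpumask)
{
    unsigned int n = 0;

    while (cpumask) {
        cpumask &= cpumask - 1;  /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* A process qualifies for a broadcast ASID once it runs on enough CPUs. */
static bool sk_wants_global_asid(uint64_t cpumask)
{
    return sk_count_cpus(cpumask) >= SK_GLOBAL_ASID_CPU_THRESHOLD;
}
```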

Due to the structure of flush_tlb_mm_range(), the INVLPGB flushing is done in
a generically named broadcast_tlb_flush() function which can later also be
used for Intel RAR.

Combined with the removal of unnecessary lru_add_drain() calls (see
https://lore.kernel.org/r/20241219153253.3da9e8aa@fangorn) this results in
a nice performance boost for the will-it-scale tlb_flush2_threads test on an
AMD Milan system with 36 cores:

  - vanilla kernel:           527k loops/second
  - lru_add_drain removal:    731k loops/second
  - only INVLPGB:             527k loops/second
  - lru_add_drain + INVLPGB: 1157k loops/second

Profiling with only the INVLPGB changes showed that while TLB invalidation went
down from 40% of the total CPU time to only around 4% of CPU time, the
contention simply moved to the LRU lock.

Fixing both at the same time about doubles the number of iterations per second
for this case.

Comparing will-it-scale tlb_flush2_threads with several different numbers of
threads on a 72 CPU AMD Milan shows similar results. The number represents the
total number of loops per second across all the threads:

  threads    tip    INVLPGB

        1   315k       304k
        2   423k       424k
        4   644k      1032k
        8   652k      1267k
       16   737k      1368k
       32   759k      1199k
       64   636k      1094k
       72   609k       993k

1 and 2 thread performance is similar with and without INVLPGB, because
INVLPGB is only used on processes using 4 or more CPUs simultaneously.

The number is the median across 5 runs.
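To put the table in relative terms, the per-row gain can be computed with integer arithmetic. The sketch below only restates the figures from the table above; the struct and function names are invented for the example.

```c
/* One row of the table: thread count and loops/second in thousands. */
struct sk_row {
    int threads;
    int tip_k;
    int invlpgb_k;
};

static const struct sk_row sk_rows[] = {
    {  1, 315,  304 }, {  2, 423,  424 }, {  4, 644, 1032 }, {  8, 652, 1267 },
    { 16, 737, 1368 }, { 32, 759, 1199 }, { 64, 636, 1094 }, { 72, 609,  993 },
};

/* Gain of INVLPGB over tip in whole percent, truncated toward zero. */
static int sk_gain_percent(const struct sk_row *r)
{
    return (r->invlpgb_k - r->tip_k) * 100 / r->tip_k;
}
```

With these inputs the 4-thread row works out to a gain of 60 percent and the 72-thread row to 63 percent, while the 1-thread row is within a few percent of the baseline.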

Some numbers closer to real world performance can be found at Phoronix, thanks
to Michael:

https://www.phoronix.com/news/AMD-INVLPGB-Linux-Benefits

  [ bp:
   - Massage
   - :%s/\<static_cpu_has\>/cpu_feature_enabled/cgi
   - :%s/\<clear_asid_transition\>/mm_clear_asid_transition/cgi
   - Fold in a 0day bot fix: https://lore.kernel.org/oe-kbuild-all/202503040000.GtiWUsBm-lkp@intel.com
   ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nadav Amit <nadav.amit@gmail.com>
Link: https://lore.kernel.org/r/20250226030129.530345-11-riel@surriel.com


Rik van Riel [Wed, 26 Feb 2025 03:00:44 +0000 (22:00 -0500)]

x86/mm: Add global ASID process exit helpers

A global ASID is allocated for the lifetime of a process. Free the global ASID
at process exit time.

[ bp: Massage, create helpers, hide details inside them. ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226030129.530345-10-riel@surriel.com


Rik van Riel [Wed, 26 Feb 2025 03:00:43 +0000 (22:00 -0500)]

x86/mm: Handle global ASID context switch and TLB flush

Add context switch and TLB flush support for processes that use a global
ASID and PCID across all CPUs.

At both context switch time and TLB flush time, check whether a task is
switching to a global ASID and, if so, reload the TLB with the new ASID as
appropriate.

In both code paths, the TLB flush is avoided if a global ASID is used, because
the global ASIDs are always kept up to date across CPUs, even when the
process is not running on a CPU.

  [ bp:
   - Massage
   - :%s/\<static_cpu_has\>/cpu_feature_enabled/cgi
  ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226030129.530345-9-riel@surriel.com


Rik van Riel [Wed, 26 Feb 2025 03:00:42 +0000 (22:00 -0500)]

x86/mm: Add global ASID allocation helper functions

Add functions to manage global ASID space. Multithreaded processes that are
simultaneously active on 4 or more CPUs can get a global ASID, resulting in the
same PCID being used for that process on every CPU.

This in turn will allow the kernel to use hardware-assisted TLB flushing
through AMD INVLPGB or Intel RAR for these processes.
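Managing a small, shared ID space like this is commonly done with a bitmap. The following is a minimal sketch of that idea, not the helpers added by this patch: the 4096-entry size follows from a 12-bit ASID space, while the reserved low range and all sk_* names are assumptions for illustration.

```c
#include <stdint.h>

#define SK_NR_ASIDS          4096  /* 12-bit ASID space */
#define SK_FIRST_GLOBAL_ASID 8     /* assumption: low ASIDs stay per-CPU */

static uint64_t sk_used[SK_NR_ASIDS / 64];  /* one bit per ASID */

/* Return a free global ASID, or 0 if the space is exhausted. */
static unsigned int sk_alloc_global_asid(void)
{
    for (unsigned int asid = SK_FIRST_GLOBAL_ASID; asid < SK_NR_ASIDS; asid++) {
        uint64_t bit = UINT64_C(1) << (asid % 64);

        if (!(sk_used[asid / 64] & bit)) {
            sk_used[asid / 64] |= bit;
            return asid;
        }
    }
    return 0;
}

/* Give an ASID back, for example when the owning process exits. */
static void sk_free_global_asid(unsigned int asid)
{
    if (asid >= SK_FIRST_GLOBAL_ASID && asid < SK_NR_ASIDS)
        sk_used[asid / 64] &= ~(UINT64_C(1) << (asid % 64));
}
```

A real implementation also needs locking and must make sure stale TLB entries are flushed before an ASID is reused; both are left out here.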

  [ bp:
   - Extend use_global_asid() comment
   - s/X86_BROADCAST_TLB_FLUSH/BROADCAST_TLB_FLUSH/g
   - other touchups ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226030129.530345-8-riel@surriel.com


Rik van Riel [Wed, 26 Feb 2025 03:00:41 +0000 (22:00 -0500)]

x86/mm: Use broadcast TLB flushing in page reclaim

Page reclaim tracks only the CPU(s) where the TLB needs to be flushed, rather
than all the individual mappings that may be getting invalidated.

Use broadcast TLB flushing when that is available.

[ bp: Massage commit message. ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226030129.530345-7-riel@surriel.com


Rik van Riel [Wed, 26 Feb 2025 03:00:39 +0000 (22:00 -0500)]

x86/mm: Use INVLPGB for kernel TLB flushes

Use broadcast TLB invalidation for kernel addresses when available.
Remove the need to send IPIs for kernel TLB flushes.

[ bp: Integrate dhansen's comment additions, merge the
flush_tlb_all() change into this one too. ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226030129.530345-5-riel@surriel.com


Rik van Riel [Fri, 28 Feb 2025 19:32:30 +0000 (20:32 +0100)]

x86/mm: Add INVLPGB support code

Add helper functions and definitions needed to use broadcast TLB
invalidation on AMD CPUs.

  [ bp:
      - Cleanup commit message
      - Improve and expand comments
      - push the preemption guards inside the invlpgb* helpers
      - merge improvements from dhansen
      - add !CONFIG_BROADCAST_TLB_FLUSH function stubs because Clang
        can't do DCE properly yet and looks at the inline asm and
        complains about it getting a u64 argument on 32-bit code ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226030129.530345-4-riel@surriel.com


Rik van Riel [Wed, 19 Mar 2025 10:08:26 +0000 (11:08 +0100)]

x86/mm: Add INVLPGB feature and Kconfig entry

In addition to adding the feature flag and Kconfig entry, save the maximum
number of pages that can be shot down with one INVLPGB instruction, which the
CPU advertises in CPUID, for later use.
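Extracting such a count from a CPUID register value is a simple mask operation. The sketch below assumes the count lives in the low 16 bits of EDX for CPUID leaf 0x80000008, which is how I remember the AMD documentation; it works on a raw value passed in by the caller and returns the field unmodified, since the text above does not say how the field is encoded. The function name is invented.

```c
#include <stdint.h>

/* Assumed layout: the INVLPGB page count field occupies EDX[15:0]. */
static unsigned int sk_invlpgb_count_field(uint32_t cpuid_edx)
{
    return cpuid_edx & 0xffff;
}
```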

[ bp: use cpu_has(), typos, massage. ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250226030129.530345-3-riel@surriel.com


Rik van Riel [Wed, 26 Feb 2025 03:00:36 +0000 (22:00 -0500)]

x86/mm: Consolidate full flush threshold decision

Reduce code duplication by consolidating the decision point for whether to do
individual invalidations or a full flush inside get_flush_tlb_info().
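A single decision point of this kind might look like the sketch below. It is not the body of get_flush_tlb_info(): the ceiling of 33 pages, the SK_FLUSH_ALL marker and the struct layout are assumptions chosen for illustration.

```c
#include <stdint.h>

#define SK_FLUSH_ALL                 UINT64_MAX  /* marker for "flush everything" */
#define SK_SINGLE_PAGE_FLUSH_CEILING 33          /* assumed threshold, in pages */

struct sk_flush_info {
    uint64_t start;
    uint64_t end;
};

/*
 * Decide once whether the range is small enough for per-page invalidation.
 * Callers then only look at the returned info instead of repeating the check.
 */
static struct sk_flush_info sk_get_flush_info(uint64_t start, uint64_t end,
                                              unsigned int stride_shift)
{
    struct sk_flush_info info = { start, end };

    if (((end - start) >> stride_shift) > SK_SINGLE_PAGE_FLUSH_CEILING) {
        info.start = 0;
        info.end = SK_FLUSH_ALL;
    }
    return info;
}
```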

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lore.kernel.org/r/20250226030129.530345-2-riel@surriel.com


Philip Redkin [Fri, 15 Nov 2024 17:36:59 +0000 (20:36 +0300)]

x86/mm: Check return value from memblock_phys_alloc_range()

At least with CONFIG_PHYSICAL_START=0x100000, if there is < 4 MiB of
contiguous free memory available at this point, the kernel will crash
and burn because memblock_phys_alloc_range() returns 0 on failure,
which leads memblock_phys_free() to throw the first 4 MiB of physical
memory to the wolves.

At a minimum it should fail gracefully with a meaningful diagnostic,
but in fact everything seems to work fine without the weird reserve
allocation.
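The failure mode described above, where a return value of 0 is treated as a valid address and later freed, can be shown with stand-in functions. Everything in this sketch is hypothetical (the sk_* allocator, the fixed address it returns, the call counter); it only illustrates the check and is not what the patch itself does.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sk_phys_addr_t;

static unsigned int sk_free_calls;  /* counts calls to sk_free_range() */

/* Stand-in allocator: returns a physical address, or 0 on failure. */
static sk_phys_addr_t sk_alloc_range(sk_phys_addr_t size, sk_phys_addr_t avail)
{
    return size <= avail ? UINT64_C(0x100000) : 0;
}

/* Stand-in free function: only records that it was called. */
static void sk_free_range(sk_phys_addr_t addr, sk_phys_addr_t size)
{
    (void)addr;
    (void)size;
    sk_free_calls++;
}

/* Allocate and release a region, bailing out cleanly on failure. */
static bool sk_probe_region(sk_phys_addr_t size, sk_phys_addr_t avail)
{
    sk_phys_addr_t addr = sk_alloc_range(size, avail);

    if (!addr)
        return false;  /* never pass address 0 to the free path */

    sk_free_range(addr, size);
    return true;
}
```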

Signed-off-by: Philip Redkin <me@rarity.fan>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/94b3e98f-96a7-3560-1f76-349eb95ccf7f@rarity.fan


Ingo Molnar [Wed, 19 Mar 2025 10:03:06 +0000 (11:03 +0100)]

Merge tag 'v6.14-rc7' into x86/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>


Linus Torvalds [Sun, 16 Mar 2025 22:55:17 +0000 (12:55 -1000)]

Linux 6.14-rc7


Linus Torvalds [Sun, 16 Mar 2025 19:18:46 +0000 (09:18 -1000)]

Merge tag 'media/v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fix from Mauro Carvalho Chehab:
"rtl2832 driver regression fix"

* tag 'media/v6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: rtl2832_sdr: assign vb2 lock before vb2_queue_init


Linus Torvalds [Sun, 16 Mar 2025 19:09:44 +0000 (09:09 -1000)]

Merge tag 'i2c-for-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

- omap: fix irq ACKS to avoid irq storming and system hang

- ali1535, ali15x3, sis630: fix error path at probe exit

* tag 'i2c-for-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: sis630: Fix an error handling path in sis630_probe()
  i2c: ali15x3: Fix an error handling path in ali15x3_probe()
  i2c: ali1535: Fix an error handling path in ali1535_probe()
  i2c: omap: fix IRQ storms


Linus Torvalds [Sun, 16 Mar 2025 19:05:00 +0000 (09:05 -1000)]

Merge tag 'trace-v6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fix from Steven Rostedt:
"Fix ref count of trace_array in error path of histogram file open

  Tracing instances have a ref count to keep them around while files
  within their directories are open. This prevents them from being
  deleted while they are used.

  The histogram code had some files that needed to take the ref count
  and that was added, but the error paths did not decrement the ref
  counts. This prevented the instances from ever being removed if a
  histogram file failed to open due to some error"
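The pattern behind this class of bug is small enough to sketch in a few lines. The struct and function names below are invented, and the reference count is a plain integer rather than the kernel's real accounting.

```c
/* Stand-in for a tracing instance with a reference count. */
struct sk_trace_array {
    int ref;
};

/* Open a file inside the instance; setup_err simulates a later failure. */
static int sk_hist_open(struct sk_trace_array *tr, int setup_err)
{
    tr->ref++;          /* pin the instance while the file is open */

    if (setup_err) {
        tr->ref--;      /* error path: drop the reference taken above */
        return setup_err;
    }
    return 0;
}

/* Close the file and release the reference taken at open time. */
static void sk_hist_release(struct sk_trace_array *tr)
{
    tr->ref--;
}
```

Without the decrement in the error path, every failed open would leave the count one too high and the instance could never reach zero.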

* tag 'trace-v6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Correct the refcount if the hist/hist_debug file fails to open


Linus Torvalds [Sun, 16 Mar 2025 06:39:55 +0000 (20:39 -1000)]

Merge tag 'usb-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are some small USB and Thunderbolt driver fixes and new
  usb-serial device ids. Included in here are:

   - new usb-serial device ids

   - typec driver bugfix

   - thunderbolt driver resume bugfix

  All of these have been in linux-next with no reported issues"

* tag 'usb-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: typec: tcpm: fix state transition for SNK_WAIT_CAPABILITIES state in run_state_machine()
  USB: serial: ftdi_sio: add support for Altera USB Blaster 3
  thunderbolt: Prevent use-after-free in resume from hibernate
  USB: serial: option: fix Telit Cinterion FE990A name
  USB: serial: option: add Telit Cinterion FE990B compositions
  USB: serial: option: match on interface class for Telit FN990B


Linus Torvalds [Sun, 16 Mar 2025 01:46:29 +0000 (15:46 -1000)]

Merge tag 'input-for-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov:

- several new device IDs added to xpad game controller driver

- support for imagis IST3038H variant of chip added to imagis touch
   controller driver

- a fix for GPIO allocation for ads7846 touch controller driver

- a fix for iqs7222 driver to properly support status register

- a fix for goodix-berlin touch controller driver to use the right name
   for the regulator

- more i8042 quirks to better handle several old Clevo devices.

* tag 'input-for-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  MAINTAINERS: Remove myself from the goodix touchscreen maintainers
  Input: iqs7222 - preserve system status register
  Input: i8042 - swap old quirk combination with new quirk for more devices
  Input: i8042 - swap old quirk combination with new quirk for several devices
  Input: i8042 - add required quirks for missing old boardnames
  Input: i8042 - swap old quirk combination with new quirk for NHxxRZQ
  Input: xpad - rename QH controller to Legion Go S
  Input: xpad - add support for TECNO Pocket Go
  Input: xpad - add support for ZOTAC Gaming Zone
  Input: goodix-berlin - fix vddio regulator references
  Input: goodix-berlin - fix comment referencing wrong regulator
  Input: imagis - add support for imagis IST3038H
  dt-bindings: input/touchscreen: imagis: add compatible for ist3038h
  Input: xpad - add multiple supported devices
  Input: xpad - add 8BitDo SN30 Pro, Hyperkin X91 and Gamesir G7 SE controllers
  Input: ads7846 - fix gpiod allocation
  Input: wdt87xx_i2c - fix compiler warning


Linus Torvalds [Sun, 16 Mar 2025 01:40:42 +0000 (15:40 -1000)]

Merge tag 'rust-fixes-6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux

Pull rust fixes from Miguel Ojeda:
"Toolchain and infrastructure:

   - Disallow BTF generation with Rust + LTO

   - Improve rust-analyzer support

  'kernel' crate:

   - 'init' module: remove 'Zeroable' implementation for a couple types
     that should not have it

   - 'alloc' module: fix macOS failure in host test by satisfying POSIX
     alignment requirement

   - Add missing '\n's to 'pr_*!()' calls

  And a couple other minor cleanups"

* tag 'rust-fixes-6.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
  scripts: generate_rust_analyzer: add uapi crate
  scripts: generate_rust_analyzer: add missing include_dirs
  scripts: generate_rust_analyzer: add missing macros deps
  rust: Disallow BTF generation with Rust + LTO
  rust: task: fix `SAFETY` comment in `Task::wake_up`
  rust: workqueue: add missing newline to pr_info! examples
  rust: sync: add missing newline in locked_by log example
  rust: init: add missing newline to pr_info! calls
  rust: error: add missing newline to pr_warn! calls
  rust: docs: add missing newline to printing macro examples
  rust: alloc: satisfy POSIX alignment requirement
  rust: init: fix `Zeroable` implementation for `Option<NonNull<T>>` and `Option<KBox<T>>`
  rust: remove leftover mentions of the `alloc` crate


Linus Torvalds [Sat, 15 Mar 2025 18:32:16 +0000 (08:32 -1000)]

Merge tag 'fsnotify_for_v6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify reverts from Jan Kara:
"Syzbot has found out that fsnotify HSM events generated on page fault
  can be generated while we already hold freeze protection for the
  filesystem (when you do a buffered write from a buffer which is an mmapped
  file on the same filesystem) which violates expectations for HSM
  events and could lead to deadlocks of HSM clients with filesystem
  freezing.

  Since it's quite late in the cycle we've decided to revert changes
  implementing HSM events on page fault for now and instead just
  generate one event for the whole range on mmap(2) so that HSM client
  can fetch the data at that moment"

* tag 'fsnotify_for_v6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  Revert "fanotify: disable readahead if we have pre-content watches"
  Revert "mm: don't allow huge faults for files with pre content watches"
  Revert "fsnotify: generate pre-content permission event on page fault"
  Revert "xfs: add pre-content fsnotify hook for DAX faults"
  Revert "ext4: add pre-content fsnotify hook for DAX faults"
  fsnotify: add pre-content hooks on mmap()


Wolfram Sang [Sat, 15 Mar 2025 08:28:41 +0000 (09:28 +0100)]

Merge tag 'i2c-host-fixes-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current

i2c-host-fixes for v6.14-rc7

- omap: fixed irq ACKS to avoid irq storming and system hang.
- ali1535, ali15x3, sis630: fixed error path at probe exit.


Linus Torvalds [Sat, 15 Mar 2025 04:43:37 +0000 (18:43 -1000)]

Merge tag 'v6.14-rc6-smb3-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:

- Two fixes for oplock break/lease races

* tag 'v6.14-rc6-smb3-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: prevent connection release during oplock break notification
  ksmbd: fix use-after-free in ksmbd_free_work_struct


Linus Torvalds [Sat, 15 Mar 2025 00:24:05 +0000 (14:24 -1000)]

Merge tag 'v6.14-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:
"Six smb3 client fixes, all also for stable"

* tag 'v6.14-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: Fix match_session bug preventing session reuse
  cifs: Fix integer overflow while processing closetimeo mount option
  cifs: Fix integer overflow while processing actimeo mount option
  cifs: Fix integer overflow while processing acdirmax mount option
  cifs: Fix integer overflow while processing acregmax mount option
  smb: client: fix regression with guest option


Linus Torvalds [Sat, 15 Mar 2025 00:17:37 +0000 (14:17 -1000)]

Merge tag 'bcachefs-2025-03-14.2' of git://evilpiepirate.org/bcachefs

Pull another bcachefs hotfix from Kent Overstreet:

- fix 32 bit build breakage

* tag 'bcachefs-2025-03-14.2' of git://evilpiepirate.org/bcachefs:
  bcachefs: fix build on 32 bit in get_random_u64_below()


Kent Overstreet [Fri, 14 Mar 2025 22:20:20 +0000 (18:20 -0400)]

bcachefs: fix build on 32 bit in get_random_u64_below()

bare 64 bit divides not allowed, whoops

arm-linux-gnueabi-ld: drivers/char/random.o: in function `__get_random_u64_below':
drivers/char/random.c:602:(.text+0xc70): undefined reference to `__aeabi_uldivmod'
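On 32-bit ARM a plain 64-bit '/' compiles into a call to a compiler runtime routine (here __aeabi_uldivmod) that the kernel does not link against, which is why kernel code uses helpers such as div64_u64() from <linux/math64.h> instead. The sketch below is a user-space stand-in that shows what such a helper has to do without using the 64-bit division operator; it is not the kernel's implementation and sk_div64() is an invented name.

```c
#include <stdint.h>

/*
 * Shift-and-subtract (restoring) division for 64-bit operands.
 * The divisor must be non-zero. The remainder is stored through
 * 'rem' when the pointer is non-null.
 */
static uint64_t sk_div64(uint64_t dividend, uint64_t divisor, uint64_t *rem)
{
    uint64_t quot = 0;
    uint64_t r = 0;

    for (int i = 63; i >= 0; i--) {
        /* Bring down the next dividend bit; r is below 2^63 before the shift. */
        r = (r << 1) | ((dividend >> i) & 1);
        if (r >= divisor) {
            r -= divisor;
            quot |= UINT64_C(1) << i;
        }
    }
    if (rem)
        *rem = r;
    return quot;
}
```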

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


Linus Torvalds [Fri, 14 Mar 2025 23:21:31 +0000 (13:21 -1000)]

Merge tag 'xfs-fixes-6.14-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs cleanup from Carlos Maiolino:
"Use abs_diff instead of XFS_ABSDIFF"

* tag 'xfs-fixes-6.14-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: Use abs_diff instead of XFS_ABSDIFF


Linus Torvalds [Fri, 14 Mar 2025 22:14:32 +0000 (12:14 -1000)]

Merge tag 'bcachefs-2025-03-14' of git://evilpiepirate.org/bcachefs

Pull bcachefs hotfix from Kent Overstreet:
"This one is high priority: a user hit an assertion in the upgrade to
  6.14, and we don't have a reproducer, so this changes the assertion to
  an emergency read-only with more info so we can debug it"

* tag 'bcachefs-2025-03-14' of git://evilpiepirate.org/bcachefs:
  bcachefs: Change btree wb assert to runtime error


Linus Torvalds [Fri, 14 Mar 2025 21:31:57 +0000 (11:31 -1000)]

Merge tag 'for-6.14/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mikulas Patocka:

- dm-flakey: fix memory corruption in optional corrupt_bio_byte feature

* tag 'for-6.14/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm-flakey: Fix memory corruption in optional corrupt_bio_byte feature


Linus Torvalds [Fri, 14 Mar 2025 21:22:05 +0000 (11:22 -1000)]

Merge tag 'block-6.14-20250313' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

- NVMe pull request via Keith:
     - Concurrent pci error and hotplug handling fix (Keith)
     - Endpoint function fixes (Damien)

- Fix for a regression introduced in this cycle with error checking for
   batched request completions (Shin'ichiro)

* tag 'block-6.14-20250313' of git://git.kernel.dk/linux:
  block: change blk_mq_add_to_batch() third argument type to bool
  nvme: move error logging from nvme_end_req() to __nvme_end_req()
  nvmet: pci-epf: Do not add an IRQ vector if not needed
  nvmet: pci-epf: Set NVMET_PCI_EPF_Q_LIVE when a queue is fully created
  nvme-pci: fix stuck reset on concurrent DPC and HP


Linus Torvalds [Fri, 14 Mar 2025 20:57:28 +0000 (10:57 -1000)]

Merge tag 'platform-drivers-x86-v6.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Ilpo Järvinen:
"Fixes and new HW support.

  The diff is a bit larger than I'd prefer at this point due to
  unwinding the amd/pmf driver's error handling properly instead of
  calling a deinit function that was a can full of worms.

  Summary:

   - amd/pmf:
       - Fix error handling in amd_pmf_init_smart_pc()
       - Fix missing hidden options for Smart PC

   - surface: aggregator_registry: Add Support for Surface Pro 11"

* tag 'platform-drivers-x86-v6.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  MAINTAINERS: Update Ike Panhc's email address
  platform/x86/amd: pmf: Fix missing hidden options for Smart PC
  platform/surface: aggregator_registry: Add Support for Surface Pro 11
  platform/x86/amd/pmf: fix cleanup in amd_pmf_init_smart_pc()


Linus Torvalds [Fri, 14 Mar 2025 20:39:41 +0000 (10:39 -1000)]

Merge tag 'gpio-fixes-for-v6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:
"The first fix is a backport from my v6.15-rc1 queue that turned out to
  be needed in v6.14 as well but as the former diverged from my fixes
  branch I had to adjust the patch a bit.

  The second one fixes a regression observed in user-space where closing
  a file descriptor associated with a GPIO device results in a ~10ms
  delay due to the atomic notifier calling synchronize_rcu() when
  unregistering.

  Summary:

   - don't check the return value of gpio_chip::get_direction() when
     registering a GPIO chip

   - use raw notifier for line state events"

* tag 'gpio-fixes-for-v6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: cdev: use raw notifier for line state events
  gpiolib: don't check the retval of get_direction() when registering a chip


Linus Torvalds [Fri, 14 Mar 2025 20:35:39 +0000 (10:35 -1000)]

Merge tag 'sound-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"A collection of last-minute fixes.

  Most of them are for ASoC, and the only one core fix is for reverting
  the previous change, while the rest are all device-specific quirks and
  fixes, which should be relatively safe to apply"

* tag 'sound-6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: cs42l43: convert to SYSTEM_SLEEP_PM_OPS
  ALSA: hda/realtek: Add mute LED quirk for HP Pavilion x360 14-dy1xxx
  ASoC: codecs: wm0010: Fix error handling path in wm0010_spi_probe()
  ASoC: rt722-sdca: add missing readable registers
  ASoC: amd: yc: Support mic on another Lenovo ThinkPad E16 Gen 2 model
  ASoC: cs42l43: Fix maximum ADC Volume
  ASoC: ops: Consistently treat platform_max as control value
  ASoC: rt1320: set wake_capable = 0 explicitly
  ASoC: cs42l43: Add jack delay debounce after suspend
  ASoC: tegra: Fix ADX S24_LE audio format
  ASoC: codecs: wsa884x: report temps to hwmon in millidegree of Celsius
  ASoC: Intel: sof_sdw: Fix unlikely uninitialized variable use in create_sdw_dailinks()


Linus Torvalds [Fri, 14 Mar 2025 20:24:57 +0000 (10:24 -1000)]

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
"The main one is a horrible macro fix for our TLB flushing code which
  resulted in over-invalidation on the MMU notifier path.

  Summary:

   - Fix population of the vmemmap for regions of memory that are
     smaller than a section (128 MiB)

   - Fix range-based TLB over-invalidation when invoked via a MMU
     notifier"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  Fix mmu notifiers for range-based invalidates
  arm64: mm: Populate vmemmap at the page level if not section aligned


Linus Torvalds [Fri, 14 Mar 2025 20:07:16 +0000 (10:07 -1000)]

Merge tag 'x86-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Ingo Molnar:
"Fix the bootup of SEV-SNP enabled guests under VMware hypervisors"

* tag 'x86-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vmware: Parse MP tables for SEV-SNP enabled guests under VMware hypervisors


Linus Torvalds [Fri, 14 Mar 2025 19:56:46 +0000 (09:56 -1000)]

Merge tag 'sched-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Ingo Molnar:
"Fix a sleeping-while-atomic bug caused by a recent optimization
  utilizing static keys that didn't consider that the
  static_key_disable() call could be triggered in atomic context.

  Revert the optimization"

* tag 'sched-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/clock: Don't define sched_clock_irqtime as static key


Linus Torvalds [Fri, 14 Mar 2025 19:41:36 +0000 (09:41 -1000)]

Merge tag 'locking-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc locking fixes from Ingo Molnar:

- Restrict the Rust runtime from unintended access to dynamically
   allocated LockClassKeys

- KernelDoc annotation fix

- Fix a lock ordering bug in semaphore::up(), related to trying to
   printk() and wake up the console within critical sections

* tag 'locking-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/semaphore: Use wake_q to wake up processes outside lock critical section
  locking/rtmutex: Use the 'struct' keyword in kernel-doc comment
  rust: lockdep: Remove support for dynamically allocated LockClassKeys


Linus Torvalds [Fri, 14 Mar 2025 19:12:28 +0000 (09:12 -1000)]

Merge tag 'core-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core fix from Ingo Molnar:
"Fix a Sparse false positive warning triggered by no_free_ptr()"

* tag 'core-urgent-2025-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  <linux/cleanup.h>: Allow the passing of both iomem and non-iomem pointers to no_free_ptr()


Kent Overstreet [Fri, 14 Mar 2025 13:54:43 +0000 (09:54 -0400)]

bcachefs: Change btree wb assert to runtime error

We just had a report of the assert for "btree in write buffer for
non-write buffer btree" popping during the 6.14 upgrade.

- 150TB filesystem, after a reboot the upgrade was able to continue from
where it left off, so no major damage.

But with 6.14 about to come out we want to get this tracked down asap,
and need more data if other users hit this.

Convert the BUG_ON() to an emergency read-only, and print out btree, the
key itself, and stack trace from the original write buffer update (which
did not have this check before).
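The general shape of turning a fatal assertion into a recoverable error is sketched below. The struct, the function and the message text are all made up for illustration and do not reflect the bcachefs code; the point is only that the bad condition is reported and the filesystem is flagged read-only instead of the machine being halted.

```c
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for filesystem state. */
struct sk_fs {
    bool read_only;
};

/*
 * Validate a write buffer update. Instead of asserting, log the
 * offending btree and switch the filesystem to read-only.
 */
static int sk_wb_check(struct sk_fs *fs, unsigned int btree_id, bool btree_uses_wb)
{
    if (btree_uses_wb)
        return 0;

    fprintf(stderr, "write buffer update for non-write-buffer btree %u\n",
            btree_id);
    fs->read_only = true;
    return -1;
}
```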

Reported-by: Stijn Tintel <stijn@linux-ipv6.be>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>


Ike Panhc [Fri, 14 Mar 2025 04:57:32 +0000 (12:57 +0800)]

MAINTAINERS: Update Ike Panhc's email address

I am no longer at Canonical, so update the entry with my personal email address.

Signed-off-by: Ike Panhc <ike.pan@canonical.com>
Link: https://lore.kernel.org/r/20250314045732.389973-1-ike.pan@canonical.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>


Matthew Wilcox (Oracle) [Mon, 3 Mar 2025 18:02:32 +0000 (18:02 +0000)]

xfs: Use abs_diff instead of XFS_ABSDIFF

We have had a central definition for this function since 2023, used by
a number of different parts of the kernel.
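For readers unfamiliar with the helper, the idea is to compare before subtracting, so that unsigned operands never wrap around. The function below is a user-space sketch for 64-bit values with an invented name; the kernel's abs_diff() is a type-generic macro.

```c
#include <stdint.h>

/* Absolute difference of two unsigned values without wrap-around. */
static uint64_t sk_abs_diff(uint64_t a, uint64_t b)
{
    return a > b ? a - b : b - a;
}
```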

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
