]> www.infradead.org Git - nvme.git/log
nvme.git
9 months agoMerge branch kvm-arm64/docs into kvmarm/next
Oliver Upton [Sun, 14 Jul 2024 00:28:57 +0000 (00:28 +0000)]
Merge branch kvm-arm64/docs into kvmarm/next

* kvm-arm64/docs:
  : KVM Documentation fixes, courtesy of Changyuan Lyu
  :
  : Small set of typo fixes / corrections to the KVM API documentation
  : relating to MSIs and arm64 VGIC UAPI.
  MAINTAINERS: Include documentation in KVM/arm64 entry
  KVM: Documentation: Correct the VGIC V2 CPU interface addr space size
  KVM: Documentation: Enumerate allowed value macros of `irq_type`
  KVM: Documentation: Fix typo `BFD`

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
9 months agoMerge branch kvm-arm64/nv-tcr2 into kvmarm/next
Oliver Upton [Sun, 14 Jul 2024 00:28:30 +0000 (00:28 +0000)]
Merge branch kvm-arm64/nv-tcr2 into kvmarm/next

* kvm-arm64/nv-tcr2:
  : Fixes to the handling of TCR_EL1, courtesy of Marc Zyngier
  :
  : Series addresses a couple gaps that are present in KVM (from cover
  : letter):
  :
  :   - VM configuration: HCRX_EL2.TCR2En is forced to 1, and we blindly
  :     save/restore stuff.
  :
  :   - trap bit description and routing: none, obviously, since we make a
  :     point in not trapping.
  KVM: arm64: Honor trap routing for TCR2_EL1
  KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRX
  KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM features
  KVM: arm64: Get rid of HCRX_GUEST_FLAGS
  KVM: arm64: Correctly honor the presence of FEAT_TCRX

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
9 months agoMerge branch kvm-arm64/nv-sve into kvmarm/next
Oliver Upton [Sun, 14 Jul 2024 00:27:01 +0000 (00:27 +0000)]
Merge branch kvm-arm64/nv-sve into kvmarm/next

* kvm-arm64/nv-sve:
  : CPTR_EL2, FPSIMD/SVE support for nested
  :
  : This series brings support for honoring the guest hypervisor's CPTR_EL2
  : trap configuration when running a nested guest, along with support for
  : FPSIMD/SVE usage at L1 and L2.
  KVM: arm64: Allow the use of SVE+NV
  KVM: arm64: nv: Add additional trap setup for CPTR_EL2
  KVM: arm64: nv: Add trap description for CPTR_EL2
  KVM: arm64: nv: Add TCPAC/TTA to CPTR->CPACR conversion helper
  KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2
  KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap
  KVM: arm64: nv: Handle CPACR_EL1 traps
  KVM: arm64: Spin off helper for programming CPTR traps
  KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state
  KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest
  KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context
  KVM: arm64: nv: Load guest hyp's ZCR into EL1 state
  KVM: arm64: nv: Handle ZCR_EL2 traps
  KVM: arm64: nv: Forward SVE traps to guest hypervisor
  KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
9 months agoMerge branch kvm-arm64/el2-kcfi into kvmarm/next
Oliver Upton [Sun, 14 Jul 2024 00:23:32 +0000 (00:23 +0000)]
Merge branch kvm-arm64/el2-kcfi into kvmarm/next

* kvm-arm64/el2-kcfi:
  : kCFI support in the EL2 hypervisor, courtesy of Pierre-Clément Tosi
  :
  : Enable the usage fo CONFIG_CFI_CLANG (kCFI) for hardening indirect
  : branches in the EL2 hypervisor. Unlike kernel support for the feature,
  : CFI failures at EL2 are always fatal.
  KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2
  KVM: arm64: Introduce print_nvhe_hyp_panic helper
  arm64: Introduce esr_brk_comment, esr_is_cfi_brk
  KVM: arm64: VHE: Mark __hyp_call_panic __noreturn
  KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32
  KVM: arm64: nVHE: Simplify invalid_host_el2_vect
  KVM: arm64: Fix __pkvm_init_switch_pgd call ABI
  KVM: arm64: Fix clobbered ELR in sync abort/SError

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
9 months agoMerge branch kvm-arm64/ctr-el0 into kvmarm/next
Oliver Upton [Sun, 14 Jul 2024 00:15:00 +0000 (00:15 +0000)]
Merge branch kvm-arm64/ctr-el0 into kvmarm/next

* kvm-arm64/ctr-el0:
  : Support for user changes to CTR_EL0, courtesy of Sebastian Ott
  :
  : Allow userspace to change the guest-visible value of CTR_EL0 for a VM,
  : so long as the requested value represents a subset of features supported
  : by hardware. In other words, prevent the VMM from over-promising the
  : capabilities of hardware.
  :
  : Make this happen by fitting CTR_EL0 into the existing infrastructure for
  : feature ID registers.
  KVM: selftests: Assert that MPIDR_EL1 is unchanged across vCPU reset
  KVM: arm64: nv: Unfudge ID_AA64PFR0_EL1 masking
  KVM: selftests: arm64: Test writes to CTR_EL0
  KVM: arm64: rename functions for invariant sys regs
  KVM: arm64: show writable masks for feature registers
  KVM: arm64: Treat CTR_EL0 as a VM feature ID register
  KVM: arm64: unify code to prepare traps
  KVM: arm64: nv: Use accessors for modifying ID registers
  KVM: arm64: Add helper for writing ID regs
  KVM: arm64: Use read-only helper for reading VM ID registers
  KVM: arm64: Make idregs debugfs iterator search sysreg table directly
  KVM: arm64: Get sys_reg encoding from descriptor in idregs_debug_show()

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
9 months agoMerge branch kvm-arm64/shadow-mmu into kvmarm/next
Oliver Upton [Sun, 14 Jul 2024 00:11:45 +0000 (00:11 +0000)]
Merge branch kvm-arm64/shadow-mmu into kvmarm/next

* kvm-arm64/shadow-mmu:
  : Shadow stage-2 MMU support for NV, courtesy of Marc Zyngier
  :
  : Initial implementation of shadow stage-2 page tables to support a guest
  : hypervisor. In the author's words:
  :
  :   So here's the 10000m (approximately 30000ft for those of you stuck
  :   with the wrong units) view of what this is doing:
  :
  :     - for each {VMID,VTTBR,VTCR} tuple the guest uses, we use a
  :       separate shadow s2_mmu context. This context has its own "real"
  :       VMID and a set of page tables that are the combination of the
  :       guest's S2 and the host S2, built dynamically one fault at a time.
  :
  :     - these shadow S2 contexts are ephemeral, and behave exactly as
  :       TLBs. For all intent and purposes, they *are* TLBs, and we discard
  :       them pretty often.
  :
  :     - TLB invalidation takes three possible paths:
  :
  :       * either this is an EL2 S1 invalidation, and we directly emulate
  :         it as early as possible
  :
  :       * or this is an EL1 S1 invalidation, and we need to apply it to
  :         the shadow S2s (plural!) that match the VMID set by the L1 guest
  :
  :       * or finally, this is affecting S2, and we need to teardown the
  :         corresponding part of the shadow S2s, which invalidates the TLBs
  KVM: arm64: nv: Truely enable nXS TLBI operations
  KVM: arm64: nv: Add handling of NXS-flavoured TLBI operations
  KVM: arm64: nv: Add handling of range-based TLBI operations
  KVM: arm64: nv: Add handling of outer-shareable TLBI operations
  KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information
  KVM: arm64: nv: Tag shadow S2 entries with guest's leaf S2 level
  KVM: arm64: nv: Handle FEAT_TTL hinted TLB operations
  KVM: arm64: nv: Handle TLBI IPAS2E1{,IS} operations
  KVM: arm64: nv: Handle TLBI ALLE1{,IS} operations
  KVM: arm64: nv: Handle TLBI VMALLS12E1{,IS} operations
  KVM: arm64: nv: Handle TLB invalidation targeting L2 stage-1
  KVM: arm64: nv: Handle EL2 Stage-1 TLB invalidation
  KVM: arm64: nv: Add Stage-1 EL2 invalidation primitives
  KVM: arm64: nv: Unmap/flush shadow stage 2 page tables
  KVM: arm64: nv: Handle shadow stage 2 page faults
  KVM: arm64: nv: Implement nested Stage-2 page table walk logic
  KVM: arm64: nv: Support multiple nested Stage-2 mmu structures

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
9 months agoMerge branch kvm-arm64/ffa-1p1 into kvmarm/next
Oliver Upton [Sun, 14 Jul 2024 00:11:34 +0000 (00:11 +0000)]
Merge branch kvm-arm64/ffa-1p1 into kvmarm/next

* kvm-arm64/ffa-1p1:
  : Improvements to the pKVM FF-A Proxy, courtesy of Sebastian Ene
  :
  : Various minor improvements to how host FF-A calls are proxied with the
  : TEE, along with support for v1.1 of the protocol.
  KVM: arm64: Use FF-A 1.1 with pKVM
  KVM: arm64: Update the identification range for the FF-A smcs
  KVM: arm64: Add support for FFA_PARTITION_INFO_GET
  KVM: arm64: Trap FFA_VERSION host call in pKVM

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
9 months agoMerge branch kvm-arm64/misc into kvmarm/next
Oliver Upton [Sun, 14 Jul 2024 00:11:26 +0000 (00:11 +0000)]
Merge branch kvm-arm64/misc into kvmarm/next

* kvm-arm64/misc:
  : Miscellaneous updates
  :
  :  - Provide a command-line parameter to statically control the WFx trap
  :    selection in KVM
  :
  :  - Make sysreg masks allocation accounted
  Revert "KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity"
  KVM: arm64: nv: Use GFP_KERNEL_ACCOUNT for sysreg_masks allocation
  KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity
  KVM: arm64: Add early_param to control WFx trapping

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
9 months agoRevert "KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity"
Oliver Upton [Mon, 8 Jul 2024 19:50:51 +0000 (19:50 +0000)]
Revert "KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity"

This reverts commit eb9d53d4a949c6d6d7c9f130e537f6b5687fedf9.

As Marc pointed out on the list [*], this patch is wrong, and those who
find themselves in the SOB chain should have their heads checked.

Annoyingly, the architecture has some FGT trap bits that are negative
(i.e. 0 implies trap), and there was some confusion how KVM handles
this for nested guests. However, it is clear now that KVM honors the
RES0-ness of FGT traps already, meaning traps for features never exposed
to the guest hypervisor get handled at L0. As they should.

Link: https://lore.kernel.org/kvmarm/86bk3c3uss.wl-maz@kernel.org/T/#mb9abb3dd79f6a4544a91cb35676bd637c3a5e836
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
9 months agoKVM: arm64: nv: Truely enable nXS TLBI operations
Marc Zyngier [Wed, 3 Jul 2024 15:47:43 +0000 (16:47 +0100)]
KVM: arm64: nv: Truely enable nXS TLBI operations

Although we now have support for nXS-flavoured TLBI instructions,
we still don't expose the feature to the guest thanks to a mixture
of misleading comment and use of a bunch of magic values.

Fix the comment and correctly express the masking of LS64, which
is enough to expose nXS to the world. Not that anyone cares...

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240703154743.824824-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
9 months agoMAINTAINERS: Include documentation in KVM/arm64 entry
Oliver Upton [Fri, 28 Jun 2024 22:21:47 +0000 (22:21 +0000)]
MAINTAINERS: Include documentation in KVM/arm64 entry

Ensure updates to the KVM/arm64 documentation get sent to the right
place.

Acked-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20240628222147.3153682-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: Documentation: Correct the VGIC V2 CPU interface addr space size
Changyuan Lyu [Sun, 23 Jun 2024 16:45:41 +0000 (09:45 -0700)]
KVM: Documentation: Correct the VGIC V2 CPU interface addr space size

In arch/arm64/include/uapi/asm/kvm.h, we have

  #define KVM_VGIC_V2_CPU_SIZE 0x2000

So the CPU interface address space should cover 8 KByte not 4 KByte.

Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Link: https://lore.kernel.org/r/20240623164542.2999626-3-changyuanl@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: Documentation: Enumerate allowed value macros of `irq_type`
Changyuan Lyu [Sun, 23 Jun 2024 16:45:40 +0000 (09:45 -0700)]
KVM: Documentation: Enumerate allowed value macros of `irq_type`

The expression `irq_type[n]` may confuse readers to interpret `n`
as the bit position and think of CPU = 1 << 0, SPI = 1 << 1, and
PPI = 1 << 2.

Since arch/arm64/include/uapi/asm/kvm.h already has macro definitions
for the allowed values, this commit uses these symbols to clear up
the ambiguity.

Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Link: https://lore.kernel.org/r/20240623164542.2999626-2-changyuanl@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: Documentation: Fix typo `BFD`
Changyuan Lyu [Sun, 23 Jun 2024 16:45:39 +0000 (09:45 -0700)]
KVM: Documentation: Fix typo `BFD`

BDF is the acronym for Bus, Device, Function.

Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Link: https://lore.kernel.org/r/20240623164542.2999626-1-changyuanl@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Honor trap routing for TCR2_EL1
Marc Zyngier [Tue, 25 Jun 2024 13:00:42 +0000 (14:00 +0100)]
KVM: arm64: Honor trap routing for TCR2_EL1

TCR2_EL1 handling is missing the handling of its trap configuration:

- HCRX_EL2.TCR2En must be handled in conjunction with HCR_EL2.{TVM,TRVM}

- HFG{R,W}TR_EL2.TCR_EL1 does apply to TCR2_EL1 as well

Without these two controls being implemented, it is impossible to
correctly route TCR2_EL1 traps.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240625130042.259175-7-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRX
Marc Zyngier [Tue, 25 Jun 2024 13:00:41 +0000 (14:00 +0100)]
KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRX

As per the architecture, if FEAT_S1PIE is implemented, then FEAT_TCRX
must be implemented as well.

Take advantage of this to avoid checking for S1PIE when TCRX isn't
implemented.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240625130042.259175-6-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Make TCR2_EL1 save/restore dependent on the VM features
Marc Zyngier [Tue, 25 Jun 2024 13:00:39 +0000 (14:00 +0100)]
KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM features

As for other registers, save/restore of TCR2_EL1 should be gated
on the feature being actually present.

In the case of a nVHE hypervisor, it is perfectly fine to leave
the host value in the register, as HCRX_EL2.TCREn==0 imposes that
TCR2_EL1 is treated as 0.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240625130042.259175-4-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Get rid of HCRX_GUEST_FLAGS
Marc Zyngier [Tue, 25 Jun 2024 13:00:38 +0000 (14:00 +0100)]
KVM: arm64: Get rid of HCRX_GUEST_FLAGS

HCRX_GUEST_FLAGS gives random KVM hackers the impression that
they can stuff bits in this macro and unconditionally enable
features in the guest.

In general, this is wrong (we have been there with FEAT_MOPS,
and again with FEAT_TCRX).

Document that HCRX_EL2.SMPME is an exception rather than the rule,
and get rid of HCRX_GUEST_FLAGS.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240625130042.259175-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Correctly honor the presence of FEAT_TCRX
Marc Zyngier [Tue, 25 Jun 2024 13:00:37 +0000 (14:00 +0100)]
KVM: arm64: Correctly honor the presence of FEAT_TCRX

We currently blindly enable TCR2_EL1 use in a guest, irrespective
of the feature set. This is obviously wrong, and we should actually
honor the guest configuration and handle the possible trap resulting
from the guest being buggy.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240625130042.259175-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: selftests: Assert that MPIDR_EL1 is unchanged across vCPU reset
Oliver Upton [Fri, 21 Jun 2024 22:50:45 +0000 (22:50 +0000)]
KVM: selftests: Assert that MPIDR_EL1 is unchanged across vCPU reset

commit 606af8293cd8 ("KVM: selftests: arm64: Test vCPU-scoped feature ID
registers") intended to test that MPIDR_EL1 is unchanged across vCPU
reset but failed at actually doing so.

Add the missing assertion.

Link: https://lore.kernel.org/r/20240621225045.2472090-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Unfudge ID_AA64PFR0_EL1 masking
Oliver Upton [Fri, 21 Jun 2024 22:40:44 +0000 (22:40 +0000)]
KVM: arm64: nv: Unfudge ID_AA64PFR0_EL1 masking

Marc reports that L1 VMs aren't booting with the NV series applied to
today's kvmarm/next. After bisecting the issue, it appears that
44241f34fac9 ("KVM: arm64: nv: Use accessors for modifying ID
registers") is to blame.

Poking around at the issue a bit further, it'd appear that the value for
ID_AA64PFR0_EL1 is complete garbage, as 'val' still contains the value
we set ID_AA64ISAR1_EL1 to.

Fix the read-modify-write pattern to actually use ID_AA64PFR0_EL1 as the
starting point. Excuse me as I return to my shame cube.

Reported-by: Marc Zyngier <maz@kernel.org>
Fixes: 44241f34fac9 ("KVM: arm64: nv: Use accessors for modifying ID registers")
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240621224044.2465901-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Allow the use of SVE+NV
Oliver Upton [Thu, 20 Jun 2024 16:46:52 +0000 (16:46 +0000)]
KVM: arm64: Allow the use of SVE+NV

Allow SVE and NV to mix now that everything is in place to handle it
correctly.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-16-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Add additional trap setup for CPTR_EL2
Marc Zyngier [Thu, 20 Jun 2024 16:46:51 +0000 (16:46 +0000)]
KVM: arm64: nv: Add additional trap setup for CPTR_EL2

We need to teach KVM a couple of new tricks. CPTR_EL2 and its
VHE accessor CPACR_EL1 need to be handled specially:

- CPACR_EL1 is trapped on VHE so that we can track the TCPAC
  and TTA bits

- CPTR_EL2.{TCPAC,E0POE} are propagated from L1 to L2

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-15-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Add trap description for CPTR_EL2
Marc Zyngier [Thu, 20 Jun 2024 16:46:50 +0000 (16:46 +0000)]
KVM: arm64: nv: Add trap description for CPTR_EL2

Add trap description for CPTR_EL2.{TCPAC,TAM,E0POE,TTA}.

TTA is a bit annoying as it changes location depending on E2H.
This forces us to add yet another "complex" trap condition.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-14-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Add TCPAC/TTA to CPTR->CPACR conversion helper
Marc Zyngier [Thu, 20 Jun 2024 16:46:49 +0000 (16:46 +0000)]
KVM: arm64: nv: Add TCPAC/TTA to CPTR->CPACR conversion helper

We are missing the propagation of CPTR_EL2.{TCPAC,TTA} into
the CPACR format. Make sure we preserve these bits.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-13-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2
Oliver Upton [Thu, 20 Jun 2024 16:46:48 +0000 (16:46 +0000)]
KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2

Start folding the guest hypervisor's FP/SVE traps into the value
programmed in hardware. Note that as of writing this is dead code, since
KVM does a full put() / load() for every nested exception boundary which
saves + flushes the FP/SVE state.

However, this will become useful when we can keep the guest's FP/SVE
state alive across a nested exception boundary and the host no longer
needs to conservatively program traps.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-12-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Load guest FP state for ZCR_EL2 trap
Oliver Upton [Thu, 20 Jun 2024 16:46:47 +0000 (16:46 +0000)]
KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap

Round out the ZCR_EL2 gymnastics by loading SVE state in the fast path
when the guest hypervisor tries to access SVE state.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-11-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Handle CPACR_EL1 traps
Marc Zyngier [Thu, 20 Jun 2024 16:46:46 +0000 (16:46 +0000)]
KVM: arm64: nv: Handle CPACR_EL1 traps

Handle CPACR_EL1 accesses when running a VHE guest. In order to
limit the cost of the emulation, implement it ass a shallow exit.

In the other cases:

- this is a nVHE L1 which will write to memory, and we don't trap

- this is a L2 guest:

  * the L1 has CPTR_EL2.TCPAC==0, and the L2 has direct register
   access

  * the L1 has CPTR_EL2.TCPAC==1, and the L2 will trap, but the
    handling is defered to the general handling for forwarding

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-10-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Spin off helper for programming CPTR traps
Oliver Upton [Thu, 20 Jun 2024 16:46:45 +0000 (16:46 +0000)]
KVM: arm64: Spin off helper for programming CPTR traps

A subsequent change to KVM will add preliminary support for merging a
guest hypervisor's CPTR traps with that of KVM. Prepare by spinning off
a new helper for managing CPTR traps.

Avoid reading CPACR_EL1 for the baseline trap config, and start off with
the most restrictive set of traps that is subsequently relaxed.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-9-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Ensure correct VL is loaded before saving SVE state
Oliver Upton [Thu, 20 Jun 2024 16:46:44 +0000 (16:46 +0000)]
KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state

It is possible that the guest hypervisor has selected a smaller VL than
the maximum for its nested guest. As such, ZCR_EL2 may be configured for
a different VL when exiting a nested guest.

Set ZCR_EL2 (via the EL1 alias) to the maximum VL for the VM before
saving SVE state as the SVE save area is dimensioned by the max VL.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-8-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Use guest hypervisor's max VL when running nested guest
Oliver Upton [Thu, 20 Jun 2024 16:46:43 +0000 (16:46 +0000)]
KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest

The max VL for nested guests is additionally constrained by the max VL
selected by the guest hypervisor. Use that instead of KVM's max VL when
running a nested guest.

Note that the guest hypervisor's ZCR_EL2 is sanitised against the VM's
max VL at the time of access, so there's no additional handling required
at the time of use.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-7-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context
Oliver Upton [Thu, 20 Jun 2024 16:46:42 +0000 (16:46 +0000)]
KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context

When running a guest hypervisor, ZCR_EL2 is an alias for the counterpart
EL1 state.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Load guest hyp's ZCR into EL1 state
Oliver Upton [Thu, 20 Jun 2024 16:46:41 +0000 (16:46 +0000)]
KVM: arm64: nv: Load guest hyp's ZCR into EL1 state

Load the guest hypervisor's ZCR_EL2 into the corresponding EL1 register
when restoring SVE state, as ZCR_EL2 affects the VL in the hypervisor
context.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-5-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Handle ZCR_EL2 traps
Oliver Upton [Thu, 20 Jun 2024 16:46:40 +0000 (16:46 +0000)]
KVM: arm64: nv: Handle ZCR_EL2 traps

Unlike other SVE-related registers, ZCR_EL2 takes a sysreg trap to EL2
when HCR_EL2.NV = 1. KVM still needs to honor the guest hypervisor's
trap configuration, which expects an SVE trap (i.e. ESR_EL2.EC = 0x19)
when CPTR traps are enabled for the vCPU's current context.

Otherwise, if the guest hypervisor has traps disabled, emulate the
access by mapping the requested VL into ZCR_EL1.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Forward SVE traps to guest hypervisor
Oliver Upton [Thu, 20 Jun 2024 16:46:39 +0000 (16:46 +0000)]
KVM: arm64: nv: Forward SVE traps to guest hypervisor

Similar to FPSIMD traps, don't load SVE state if the guest hypervisor
has SVE traps enabled and forward the trap instead. Note that ZCR_EL2
will require some special handling, as it takes a sysreg trap to EL2
when HCR_EL2.NV = 1.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor
Jintack Lim [Thu, 20 Jun 2024 16:46:38 +0000 (16:46 +0000)]
KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor

Give precedence to the guest hypervisor's trap configuration when
routing an FP/ASIMD trap taken to EL2. Take advantage of the
infrastructure for translating CPTR_EL2 into the VHE (i.e. EL1) format
and base the trap decision solely on the VHE view of the register. The
in-memory value of CPTR_EL2 will always be up to date for the guest
hypervisor (more on that later), so just read it directly from memory.

Bury all of this behind a macro keyed off of the CPTR bitfield in
anticipation of supporting other traps (e.g. SVE).

[maz: account for HCR_EL2.E2H when testing for TFP/FPEN, with
 all the hard work actually being done by Chase Conklin]
[ oliver: translate nVHE->VHE format for testing traps; macro for reuse
 in other CPTR_EL2.xEN fields ]

Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240620164653.1130714-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2
Pierre-Clément Tosi [Mon, 10 Jun 2024 06:32:37 +0000 (07:32 +0100)]
KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2

The compiler implements kCFI by adding type information (u32) above
every function that might be indirectly called and, whenever a function
pointer is called, injects a read-and-compare of that u32 against the
value corresponding to the expected type. In case of a mismatch, a BRK
instruction gets executed. When the hypervisor triggers such an
exception in nVHE, it panics and triggers and exception return to EL1.

Therefore, teach nvhe_hyp_panic_handler() to detect kCFI errors from the
ESR and report them. If necessary, remind the user that EL2 kCFI is not
affected by CONFIG_CFI_PERMISSIVE.

Pass $(CC_FLAGS_CFI) to the compiler when building the nVHE hyp code.

Use SYM_TYPED_FUNC_START() for __pkvm_init_switch_pgd, as nVHE can't
call it directly and must use a PA function pointer from C (because it
is part of the idmap page), which would trigger a kCFI failure if the
type ID wasn't present.

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-9-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Introduce print_nvhe_hyp_panic helper
Pierre-Clément Tosi [Mon, 10 Jun 2024 06:32:36 +0000 (07:32 +0100)]
KVM: arm64: Introduce print_nvhe_hyp_panic helper

Add a helper to display a panic banner soon to also be used for kCFI
failures, to ensure that we remain consistent.

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-8-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoarm64: Introduce esr_brk_comment, esr_is_cfi_brk
Pierre-Clément Tosi [Mon, 10 Jun 2024 06:32:35 +0000 (07:32 +0100)]
arm64: Introduce esr_brk_comment, esr_is_cfi_brk

As it is already used in two places, move esr_comment() to a header for
re-use, with a clearer name.

Introduce esr_is_cfi_brk() to detect kCFI BRK syndromes, currently used
by early_brk64() but soon to also be used by hypervisor code.

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-7-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: VHE: Mark __hyp_call_panic __noreturn
Pierre-Clément Tosi [Mon, 10 Jun 2024 06:32:34 +0000 (07:32 +0100)]
KVM: arm64: VHE: Mark __hyp_call_panic __noreturn

Given that the sole purpose of __hyp_call_panic() is to call panic(), a
__noreturn function, give it the __noreturn attribute, removing the need
for its caller to use unreachable().

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-6-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32
Pierre-Clément Tosi [Mon, 10 Jun 2024 06:32:33 +0000 (07:32 +0100)]
KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32

Ignore R_AARCH64_ABS32 relocations, instead of panicking, when emitting
the relocation table of the hypervisor. The toolchain might produce them
when generating function calls with kCFI to represent the 32-bit type ID
which can then be resolved across compilation units at link time.  These
are NOT actual 32-bit addresses and are therefore not needed in the
final (runtime) relocation table (which is unlikely to use 32-bit
absolute addresses for arm64 anyway).

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-5-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nVHE: Simplify invalid_host_el2_vect
Pierre-Clément Tosi [Mon, 10 Jun 2024 06:32:32 +0000 (07:32 +0100)]
KVM: arm64: nVHE: Simplify invalid_host_el2_vect

The invalid_host_el2_vect macro is used by EL2{t,h} handlers in nVHE
*host* context, which should never run with a guest context loaded.
Therefore, remove the superfluous vCPU context check and branch
unconditionally to hyp_panic.

Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-4-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Fix __pkvm_init_switch_pgd call ABI
Pierre-Clément Tosi [Mon, 10 Jun 2024 06:32:31 +0000 (07:32 +0100)]
KVM: arm64: Fix __pkvm_init_switch_pgd call ABI

Fix the mismatch between the (incorrect) C signature, C call site, and
asm implementation by aligning all three on an API passing the
parameters (pgd and SP) separately, instead of as a bundled struct.

Remove the now unnecessary memory accesses while the MMU is off from the
asm, which simplifies the C caller (as it does not need to convert a VA
struct pointer to PA) and makes the code slightly more robust by
offsetting the struct fields from C and properly expressing the call to
the C compiler (e.g. type checker and kCFI).

Fixes: f320bc742bc2 ("KVM: arm64: Prepare the creation of s1 mappings at EL2")
Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-3-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Fix clobbered ELR in sync abort/SError
Pierre-Clément Tosi [Mon, 10 Jun 2024 06:32:30 +0000 (07:32 +0100)]
KVM: arm64: Fix clobbered ELR in sync abort/SError

When the hypervisor receives a SError or synchronous exception (EL2h)
while running with the __kvm_hyp_vector and if ELR_EL2 doesn't point to
an extable entry, it panics indirectly by overwriting ELR with the
address of a panic handler in order for the asm routine it returns to to
ERET into the handler.

However, this clobbers ELR_EL2 for the handler itself. As a result,
hyp_panic(), when retrieving what it believes to be the PC where the
exception happened, actually ends up reading the address of the panic
handler that called it! This results in an erroneous and confusing panic
message where the source of any synchronous exception (e.g. BUG() or
kCFI) appears to be __guest_exit_panic, making it hard to locate the
actual BRK instruction.

Therefore, store the original ELR_EL2 in the per-CPU kvm_hyp_ctxt and
point the sysreg to a routine that first restores it to its previous
value before running __guest_exit_panic.

Fixes: 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest context")
Signed-off-by: Pierre-Clément Tosi <ptosi@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240610063244.2828978-2-ptosi@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: selftests: arm64: Test writes to CTR_EL0
Sebastian Ott [Wed, 19 Jun 2024 17:40:36 +0000 (17:40 +0000)]
KVM: selftests: arm64: Test writes to CTR_EL0

Test that CTR_EL0 is modifiable from userspace, that changes are
visible to guests, and that they are preserved across a vCPU reset.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20240619174036.483943-11-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: rename functions for invariant sys regs
Sebastian Ott [Wed, 19 Jun 2024 17:40:35 +0000 (17:40 +0000)]
KVM: arm64: rename functions for invariant sys regs

Invariant system id registers are populated with host values
at initialization time using their .reset function cb.

These are currently called get_* which is usually used by
the functions implementing the .get_user callback.

Change their function names to reset_* to reflect what they
are used for.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20240619174036.483943-10-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: show writable masks for feature registers
Sebastian Ott [Wed, 19 Jun 2024 17:40:34 +0000 (17:40 +0000)]
KVM: arm64: show writable masks for feature registers

Instead of using ~0UL provide the actual writable mask for
non-id feature registers in the output of the
KVM_ARM_GET_REG_WRITABLE_MASKS ioctl.

This changes the mask for the CTR_EL0 and CLIDR_EL1 registers.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20240619174036.483943-9-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Treat CTR_EL0 as a VM feature ID register
Sebastian Ott [Wed, 19 Jun 2024 17:40:33 +0000 (17:40 +0000)]
KVM: arm64: Treat CTR_EL0 as a VM feature ID register

CTR_EL0 is currently handled as an invariant register, thus
guests will be presented with the host value of that register.

Add emulation for CTR_EL0 based on a per VM value. Userspace can
switch off DIC and IDC bits and reduce DminLine and IminLine sizes.
Naturally, ensure CTR_EL0 is trapped (HCR_EL2.TID2=1) any time that a
VM's CTR_EL0 differs from hardware.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Link: https://lore.kernel.org/r/20240619174036.483943-8-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: unify code to prepare traps
Sebastian Ott [Wed, 19 Jun 2024 17:40:32 +0000 (17:40 +0000)]
KVM: arm64: unify code to prepare traps

There are 2 functions to calculate traps via HCR_EL2:
* kvm_init_sysreg() called via KVM_RUN (before the 1st run or when
  the pid changes)
* vcpu_reset_hcr() called via KVM_ARM_VCPU_INIT

To unify these 2 and to support traps that are dependent on the
ID register configuration, move the code from vcpu_reset_hcr()
to sys_regs.c and call it via kvm_init_sysreg().

We still have to keep the non-FWB handling stuff in vcpu_reset_hcr().
Also the initialization with HCR_GUEST_FLAGS is kept there but guarded
by !vcpu_has_run_once() to ensure that previous calculated values
don't get overwritten.

While at it rename kvm_init_sysreg() to kvm_calculate_traps() to
better reflect what it's doing.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20240619174036.483943-7-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Use accessors for modifying ID registers
Oliver Upton [Wed, 19 Jun 2024 17:40:31 +0000 (17:40 +0000)]
KVM: arm64: nv: Use accessors for modifying ID registers

In the interest of abstracting away the underlying storage of feature
ID registers, rework the nested code to go through the accessors instead
of directly iterating the id_regs array.

This means we now lose the property that ID registers unknown to the
nested code get zeroed, but we really ought to be handling those
explicitly going forward.

Link: https://lore.kernel.org/r/20240619174036.483943-6-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Add helper for writing ID regs
Oliver Upton [Wed, 19 Jun 2024 17:40:30 +0000 (17:40 +0000)]
KVM: arm64: Add helper for writing ID regs

Replace the remaining usage of IDREG() with a new helper for setting the
value of a feature ID register, with the benefit of cramming in some
extra sanity checks.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20240619174036.483943-5-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Use read-only helper for reading VM ID registers
Oliver Upton [Wed, 19 Jun 2024 17:40:29 +0000 (17:40 +0000)]
KVM: arm64: Use read-only helper for reading VM ID registers

IDREG() expands to the storage of a particular ID reg, which can be
useful for handling both reads and writes. However, outside of a select
few situations, the ID registers should be considered read only.

Replace current readers with a new macro that expands to the value of
the field rather than the field itself.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20240619174036.483943-4-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Make idregs debugfs iterator search sysreg table directly
Oliver Upton [Wed, 19 Jun 2024 17:40:28 +0000 (17:40 +0000)]
KVM: arm64: Make idregs debugfs iterator search sysreg table directly

CTR_EL0 complicates the existing scheme for iterating feature ID
registers, as it is not in the contiguous range that we presently
support. Just search the sysreg table for the Nth feature ID register in
anticipation of this. Yes, the debugfs interface has quadratic time
completixy now. Boo hoo.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20240619174036.483943-3-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Get sys_reg encoding from descriptor in idregs_debug_show()
Oliver Upton [Wed, 19 Jun 2024 17:40:27 +0000 (17:40 +0000)]
KVM: arm64: Get sys_reg encoding from descriptor in idregs_debug_show()

KVM is about to add support for more VM-scoped feature ID regs that
live outside of the id_regs[] array, which means the index of the
debugfs iterator may not actually be an index into the array.

Prepare by getting the sys_reg encoding from the descriptor itself.

Reviewed-by: Sebastian Ott <sebott@redhat.com>
Link: https://lore.kernel.org/r/20240619174036.483943-2-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Use GFP_KERNEL_ACCOUNT for sysreg_masks allocation
Oliver Upton [Mon, 17 Jun 2024 18:10:18 +0000 (18:10 +0000)]
KVM: arm64: nv: Use GFP_KERNEL_ACCOUNT for sysreg_masks allocation

Of course, userspace is in the driver's seat for struct kvm and
associated allocations. Make sure the sysreg_masks allocation
participates in kmem accounting.

Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240617181018.2054332-1-oliver.upton@linux.dev
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Add handling of NXS-flavoured TLBI operations
Marc Zyngier [Fri, 14 Jun 2024 14:45:52 +0000 (15:45 +0100)]
KVM: arm64: nv: Add handling of NXS-flavoured TLBI operations

Latest kid on the block: NXS (Non-eXtra-Slow) TLBI operations.

Let's add those in bulk (NSH, ISH, OSH, both normal and range)
as they directly map to their XS (the standard ones) counterparts.

Not a lot to say about them, they are basically useless.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-17-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Add handling of range-based TLBI operations
Marc Zyngier [Fri, 14 Jun 2024 14:45:51 +0000 (15:45 +0100)]
KVM: arm64: nv: Add handling of range-based TLBI operations

We already support some form of range operation by handling FEAT_TTL,
but so far the "arbitrary" range operations are unsupported.

Let's fix that.

For EL2 S1, this is simple enough: we just map both NSH, ISH and OSH
instructions onto the ISH version for EL1.

For TLBI instructions affecting EL1 S1, we use the same model as
their non-range counterpart to invalidate in the context of the
correct VMID.

For TLBI instructions affecting S2, we interpret the data passed
by the guest to compute the range and use that to tear-down part
of the shadow S2 range and invalidate the TLBs.

Finally, we advertise FEAT_TLBIRANGE if the host supports it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-16-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Add handling of outer-shareable TLBI operations
Marc Zyngier [Fri, 14 Jun 2024 14:45:50 +0000 (15:45 +0100)]
KVM: arm64: nv: Add handling of outer-shareable TLBI operations

Our handling of outer-shareable TLBIs is pretty basic: we just
map them to the existing inner-shareable ones, because we really
don't have anything else.

The only significant change is that we can now advertise FEAT_TLBIOS
support if the host supports it.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-15-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information
Marc Zyngier [Fri, 14 Jun 2024 14:45:49 +0000 (15:45 +0100)]
KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information

In order to be able to make S2 TLB invalidations more performant on NV,
let's use a scheme derived from the FEAT_TTL extension.

If bits [56:55] in the leaf descriptor translating the address in the
corresponding shadow S2 are non-zero, they indicate a level which can
be used as an invalidation range. This allows further reduction of the
systematic over-invalidation that takes place otherwise.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-14-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Tag shadow S2 entries with guest's leaf S2 level
Marc Zyngier [Fri, 14 Jun 2024 14:45:48 +0000 (15:45 +0100)]
KVM: arm64: nv: Tag shadow S2 entries with guest's leaf S2 level

Populate bits [56:55] of the leaf entry with the level provided
by the guest's S2 translation. This will allow us to better scope
the invalidation by remembering the mapping size.

Of course, this assume that the guest will issue an invalidation
with an address that falls into the same leaf. If the guest doesn't,
we'll over-invalidate.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-13-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Handle FEAT_TTL hinted TLB operations
Marc Zyngier [Fri, 14 Jun 2024 14:45:47 +0000 (15:45 +0100)]
KVM: arm64: nv: Handle FEAT_TTL hinted TLB operations

Support guest-provided information information to size the range of
required invalidation. This helps with reducing over-invalidation,
provided that the guest actually provides accurate information.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-12-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Handle TLBI IPAS2E1{,IS} operations
Marc Zyngier [Fri, 14 Jun 2024 14:45:46 +0000 (15:45 +0100)]
KVM: arm64: nv: Handle TLBI IPAS2E1{,IS} operations

TLBI IPAS2E1* are the last class of TLBI instructions we need
to handle. For each matching S2 MMU context, we invalidate a
range corresponding to the largest possible mapping for that
context.

At this stage, we don't handle TTL, which means we are likely
over-invalidating. Further patches will aim at making this
a bit better.

Co-developed-by: Jintack Lim <jintack.lim@linaro.org>
Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-11-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Handle TLBI ALLE1{,IS} operations
Marc Zyngier [Fri, 14 Jun 2024 14:45:45 +0000 (15:45 +0100)]
KVM: arm64: nv: Handle TLBI ALLE1{,IS} operations

TLBI ALLE1* is a pretty big hammer that invalides all S1/S2 TLBs.

This translates into the unmapping of all our shadow S2 PTs, itself
resulting in the corresponding TLB invalidations.

Co-developed-by: Jintack Lim <jintack.lim@linaro.org>
Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-10-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Handle TLBI VMALLS12E1{,IS} operations
Marc Zyngier [Fri, 14 Jun 2024 14:45:44 +0000 (15:45 +0100)]
KVM: arm64: nv: Handle TLBI VMALLS12E1{,IS} operations

Emulating TLBI VMALLS12E1* results in tearing down all the shadow
S2 PTs that match the current VMID, since our shadow S2s are just
some form of SW-managed TLBs. That teardown itself results in a
full TLB invalidation for both S1 and S2.

This can result in over-invalidation if two vcpus use the same VMID
to tag private S2 PTs, but this is still correct from an architecture
perspective.

Co-developed-by: Jintack Lim <jintack.lim@linaro.org>
Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-9-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Handle TLB invalidation targeting L2 stage-1
Marc Zyngier [Fri, 14 Jun 2024 14:45:43 +0000 (15:45 +0100)]
KVM: arm64: nv: Handle TLB invalidation targeting L2 stage-1

While dealing with TLB invalidation targeting the guest hypervisor's
own stage-1 was easy, doing the same thing for its own guests is
a bit more involved.

Since such an invalidation is scoped by VMID, it needs to apply to
all s2_mmu contexts that have been tagged by that VMID, irrespective
of the value of VTTBR_EL2.BADDR.

So for each s2_mmu context matching that VMID, we invalidate the
corresponding TLBs, each context having its own "physical" VMID.

Co-developed-by: Jintack Lim <jintack.lim@linaro.org>
Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-8-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Handle EL2 Stage-1 TLB invalidation
Marc Zyngier [Fri, 14 Jun 2024 14:45:42 +0000 (15:45 +0100)]
KVM: arm64: nv: Handle EL2 Stage-1 TLB invalidation

Due to the way FEAT_NV2 suppresses traps when accessing EL2
system registers, we can't track when the guest changes its
HCR_EL2.TGE setting. This means we always trap EL1 TLBIs,
even if they don't affect any L2 guest.

Given that invalidating the EL2 TLBs doesn't require any messing
with the shadow stage-2 page-tables, we can simply emulate the
instructions early and return directly to the guest.

This is conditioned on the instruction being an EL1 one and
the guest's HCR_EL2.{E2H,TGE} being {1,1} (indicating that
the instruction targets the EL2 S1 TLBs), or the instruction
being one of the EL2 ones (which are not ambiguous).

EL1 TLBIs issued with HCR_EL2.{E2H,TGE}={1,0} are not handled
here, and cause a full exit so that they can be handled in
the context of a VMID.

Co-developed-by: Jintack Lim <jintack.lim@linaro.org>
Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-7-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Add Stage-1 EL2 invalidation primitives
Marc Zyngier [Fri, 14 Jun 2024 14:45:41 +0000 (15:45 +0100)]
KVM: arm64: nv: Add Stage-1 EL2 invalidation primitives

Provide the primitives required to handle TLB invalidation for
Stage-1 EL2 TLBs, which by definition do not require messing
with the Stage-2 page tables.

Co-developed-by: Jintack Lim <jintack.lim@linaro.org>
Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-6-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Unmap/flush shadow stage 2 page tables
Christoffer Dall [Fri, 14 Jun 2024 14:45:40 +0000 (15:45 +0100)]
KVM: arm64: nv: Unmap/flush shadow stage 2 page tables

Unmap/flush shadow stage 2 page tables for the nested VMs as well as the
stage 2 page table for the guest hypervisor.

Note: A bunch of the code in mmu.c relating to MMU notifiers is
currently dealt with in an extremely abrupt way, for example by clearing
out an entire shadow stage-2 table. This will be handled in a more
efficient way using the reverse mapping feature in a later version of
the patch series.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-5-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Handle shadow stage 2 page faults
Marc Zyngier [Fri, 14 Jun 2024 14:45:39 +0000 (15:45 +0100)]
KVM: arm64: nv: Handle shadow stage 2 page faults

If we are faulting on a shadow stage 2 translation, we first walk the
guest hypervisor's stage 2 page table to see if it has a mapping. If
not, we inject a stage 2 page fault to the virtual EL2. Otherwise, we
create a mapping in the shadow stage 2 page table.

Note that we have to deal with two IPAs when we got a shadow stage 2
page fault. One is the address we faulted on, and is in the L2 guest
phys space. The other is from the guest stage-2 page table walk, and is
in the L1 guest phys space.  To differentiate them, we rename variables
so that fault_ipa is used for the former and ipa is used for the latter.

When mapping a page in a shadow stage-2, special care must be taken not
to be more permissive than the guest is.

Co-developed-by: Christoffer Dall <christoffer.dall@linaro.org>
Co-developed-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-4-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Implement nested Stage-2 page table walk logic
Christoffer Dall [Fri, 14 Jun 2024 14:45:38 +0000 (15:45 +0100)]
KVM: arm64: nv: Implement nested Stage-2 page table walk logic

Based on the pseudo-code in the ARM ARM, implement a stage 2 software
page table walker.

Co-developed-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Support multiple nested Stage-2 mmu structures
Marc Zyngier [Fri, 14 Jun 2024 14:45:37 +0000 (15:45 +0100)]
KVM: arm64: nv: Support multiple nested Stage-2 mmu structures

Add Stage-2 mmu data structures for virtual EL2 and for nested guests.
We don't yet populate shadow Stage-2 page tables, but we now have a
framework for getting to a shadow Stage-2 pgd.

We allocate twice the number of vcpus as Stage-2 mmu structures because
that's sufficient for each vcpu running two translation regimes without
having to flush the Stage-2 page tables.

Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614144552.2773592-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Use FF-A 1.1 with pKVM
Sebastian Ene [Thu, 13 Jun 2024 13:20:35 +0000 (13:20 +0000)]
KVM: arm64: Use FF-A 1.1 with pKVM

Now that the layout of the structures is compatible with 1.1 it is time
to probe the 1.1 version of the FF-A protocol inside the hypervisor. If
the TEE doesn't support it, it should return the minimum supported
version.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240613132035.1070360-5-sebastianene@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Update the identification range for the FF-A smcs
Sebastian Ene [Thu, 13 Jun 2024 13:20:34 +0000 (13:20 +0000)]
KVM: arm64: Update the identification range for the FF-A smcs

The FF-A spec 1.2 reserves the following ranges for identifying FF-A
calls:
0x84000060-0x840000FF: FF-A 32-bit calls
0xC4000060-0xC40000FF: FF-A 64-bit calls.

Use the range identification according to the spec and allow calls that
are currently out of the range(eg. FFA_MSG_SEND_DIRECT_REQ2) to be
identified correctly.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240613132035.1070360-4-sebastianene@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Add support for FFA_PARTITION_INFO_GET
Sebastian Ene [Thu, 13 Jun 2024 13:20:33 +0000 (13:20 +0000)]
KVM: arm64: Add support for FFA_PARTITION_INFO_GET

Handle the FFA_PARTITION_INFO_GET host call inside the pKVM hypervisor
and copy the response message back to the host buffers.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240613132035.1070360-3-sebastianene@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Trap FFA_VERSION host call in pKVM
Sebastian Ene [Thu, 13 Jun 2024 13:20:32 +0000 (13:20 +0000)]
KVM: arm64: Trap FFA_VERSION host call in pKVM

The pKVM hypervisor initializes with FF-A version 1.0. The spec requires
that no other FF-A calls to be issued before the version negotiation
phase is complete. Split the hypervisor proxy initialization code in two
parts so that we can move the later one after the host negotiates its
version.

Without trapping the call, the host drivers can negotiate a higher
version number with TEE which can result in a different memory layout
described during the memory sharing calls.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240613132035.1070360-2-sebastianene@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity
Marc Zyngier [Fri, 14 Jun 2024 12:58:58 +0000 (13:58 +0100)]
KVM: arm64: nv: Fix RESx behaviour of disabled FGTs with negative polarity

The Fine Grained Trap extension is pretty messy as it doesn't
consistently use the same polarity for all trap bits. A bunch
of them, added later in the life of the architecture, have a
*negative* priority.

So if these bits are disabled, they must be RES1 and not RES0.
But that's not what the code implements, making the traps for
these negative trap bits being always on instead of disabled.

Fix the relevant bits, and stick a brown paper bag on my head
for the rest of the day...

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240614125858.78361-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoKVM: arm64: Add early_param to control WFx trapping
Colton Lewis [Thu, 23 May 2024 17:40:55 +0000 (17:40 +0000)]
KVM: arm64: Add early_param to control WFx trapping

Add an early_params to control WFI and WFE trapping. This is to
control the degree guests can wait for interrupts on their own without
being trapped by KVM. Options for each param are trap and notrap. trap
enables the trap. notrap disables the trap. Note that when enabled,
traps are allowed but not guaranteed by the CPU architecture. Absent
an explicitly set policy, default to current behavior: disabling the
trap if only a single task is running and enabling otherwise.

Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Link: https://lore.kernel.org/r/20240523174056.1565133-1-coltonlewis@google.com
[ oliver: rework kvm_vcpu_should_clear_tw*() for readability ]
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
10 months agoLinux 6.10-rc3
Linus Torvalds [Sun, 9 Jun 2024 21:19:43 +0000 (14:19 -0700)]
Linux 6.10-rc3

10 months agoMerge tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of git://git.kernel.org/pub/scm...
Linus Torvalds [Sun, 9 Jun 2024 16:04:51 +0000 (09:04 -0700)]
Merge tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Update copies of kernel headers, which resulted in support for the
   new 'mseal' syscall, SUBVOL statx return mask bit, RISC-V and PPC
   prctls, fcntl's DUPFD_QUERY, POSTED_MSI_NOTIFICATION IRQ vector,
   'map_shadow_stack' syscall for x86-32.

 - Revert perf.data record memory allocation optimization that ended up
   causing a regression, work is being done to re-introduce it in the
   next merge window.

 - Fix handling of minimal vmlinux.h file used with BPF's CO-RE when
   interrupting the build.

* tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf bpf: Fix handling of minimal vmlinux.h file when interrupting the build
  Revert "perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event"
  tools headers arm64: Sync arm64's cputype.h with the kernel sources
  tools headers uapi: Sync linux/stat.h with the kernel sources to pick STATX_SUBVOL
  tools headers UAPI: Update i915_drm.h with the kernel sources
  tools headers UAPI: Sync kvm headers with the kernel sources
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  tools headers: Update the syscall tables and unistd.h, mostly to support the new 'mseal' syscall
  perf trace beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources to pick POSTED_MSI_NOTIFICATION
  perf beauty: Update copy of linux/socket.h with the kernel sources
  tools headers UAPI: Sync fcntl.h with the kernel sources to pick F_DUPFD_QUERY
  tools headers UAPI: Sync linux/prctl.h with the kernel sources
  tools include UAPI: Sync linux/stat.h with the kernel sources

10 months agoMerge tag 'edac_urgent_for_v6.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 9 Jun 2024 15:49:13 +0000 (08:49 -0700)]
Merge tag 'edac_urgent_for_v6.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC fixes from Borislav Petkov:

 - Convert PCI core error codes to proper error numbers since latter get
   propagated all the way up to the module loading functions

* tag 'edac_urgent_for_v6.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/igen6: Convert PCIBIOS_* return codes to errnos
  EDAC/amd64: Convert PCIBIOS_* return codes to errnos

10 months agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 9 Jun 2024 02:14:02 +0000 (19:14 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fix from Stephen Boyd:
 "One fix for the SiFive PRCI clocks so that the device boots again.

  This driver was registering clkdev lookups that were always going to
  be useless. This wasn't a problem until clkdev started returning an
  error in these cases, causing this driver to fail probe, and thus boot
  to fail because clks are essential for most drivers. The fix is
  simple, don't use clkdev because this is a DT based system where
  clkdev isn't used"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: sifive: Do not register clkdevs for PRCI clocks

10 months agoMerge tag '6.10-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 9 Jun 2024 02:07:18 +0000 (19:07 -0700)]
Merge tag '6.10-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:
 "Two small smb3 client fixes:

   - fix deadlock in umount

   - minor cleanup due to netfs change"

* tag '6.10-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Don't advance the I/O iterator before terminating subrequest
  smb: client: fix deadlock in smb2_find_smb_tcon()

10 months agoMerge tag 'for-linus-2024060801' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 8 Jun 2024 17:48:11 +0000 (10:48 -0700)]
Merge tag 'for-linus-2024060801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Benjamin Tissoires:

 - fix potential read out of bounds in hid-asus (Andrew Ballance)

 - fix endian-conversion on little endian systems in intel-ish-hid (Arnd
   Bergmann)

 - A couple of new input event codes (Aseda Aboagye)

 - errors handling fixes in hid-nvidia-shield (Chen Ni), hid-nintendo
   (Christophe JAILLET), hid-logitech-dj (José Expósito)

 - current leakage fix while the device is in suspend on a i2c-hid
   laptop (Johan Hovold)

 - other assorted smaller fixes and device ID / quirk entry additions

* tag 'for-linus-2024060801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: Ignore battery for ELAN touchscreens 2F2C and 4116
  HID: i2c-hid: elan: fix reset suspend current leakage
  dt-bindings: HID: i2c-hid: elan: add 'no-reset-on-power-off' property
  dt-bindings: HID: i2c-hid: elan: add Elan eKTH5015M
  dt-bindings: HID: i2c-hid: add dedicated Ilitek ILI2901 schema
  input: Add support for "Do Not Disturb"
  input: Add event code for accessibility key
  hid: asus: asus_report_fixup: fix potential read out of bounds
  HID: logitech-hidpp: add missing MODULE_DESCRIPTION() macro
  HID: intel-ish-hid: fix endian-conversion
  HID: nintendo: Fix an error handling path in nintendo_hid_probe()
  HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode()
  HID: core: remove unnecessary WARN_ON() in implement()
  HID: nvidia-shield: Add missing check for input_ff_create_memless
  HID: intel-ish-hid: Fix build error for COMPILE_TEST

10 months agoMerge tag 'kbuild-fixes-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 8 Jun 2024 17:12:33 +0000 (10:12 -0700)]
Merge tag 'kbuild-fixes-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Fix the initial state of the save button in 'make gconfig'

 - Improve the Kconfig documentation

 - Fix a Kconfig bug regarding property visibility

 - Fix build breakage for systems where 'sed' is not installed in /bin

 - Fix a false warning about missing MODULE_DESCRIPTION()

* tag 'kbuild-fixes-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  modpost: do not warn about missing MODULE_DESCRIPTION() for vmlinux.o
  kbuild: explicitly run mksysmap as sed script from link-vmlinux.sh
  kconfig: remove wrong expr_trans_bool()
  kconfig: doc: document behavior of 'select' and 'imply' followed by 'if'
  kconfig: doc: fix a typo in the note about 'imply'
  kconfig: gconf: give a proper initial state to the Save button
  kconfig: remove unneeded code for user-supplied values being out of range

10 months agoMerge tag 'media/v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Sat, 8 Jun 2024 16:57:09 +0000 (09:57 -0700)]
Merge tag 'media/v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

 - fixes for the new ipu6 driver (and related fixes to mei csi driver)

 - fix a double debugfs remove logic at mgb4 driver

 - a documentation fix

* tag 'media/v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: intel/ipu6: add csi2 port sanity check in notifier bound
  media: intel/ipu6: update the maximum supported csi2 port number to 6
  media: mei: csi: Warn less verbosely of a missing device fwnode
  media: mei: csi: Put the IPU device reference
  media: intel/ipu6: fix the buffer flags caused by wrong parentheses
  media: intel/ipu6: Fix an error handling path in isys_probe()
  media: intel/ipu6: Move isys_remove() close to isys_probe()
  media: intel/ipu6: Fix some redundant resources freeing in ipu6_pci_remove()
  media: Documentation: v4l: Fix ACTIVE route flag
  media: mgb4: Fix double debugfs remove

10 months agoMerge tag 'irq-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 8 Jun 2024 16:44:50 +0000 (09:44 -0700)]
Merge tag 'irq-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:

 - Fix possible memory leak the riscv-intc irqchip driver load failures

 - Fix boot crash in the sifive-plic irqchip driver caused by recently
   changed boot initialization order

 - Fix race condition in the gic-v3-its irqchip driver

* tag 'irq-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()
  irqchip/sifive-plic: Chain to parent IRQ after handlers are ready
  irqchip/riscv-intc: Prevent memory leak when riscv_intc_init_common() fails

10 months agoMerge tag 'x86-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 8 Jun 2024 16:36:08 +0000 (09:36 -0700)]
Merge tag 'x86-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "Miscellaneous fixes:

   - Fix kexec() crash if call depth tracking is enabled

   - Fix SMN reads on inaccessible registers on certain AMD systems"

* tag 'x86-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/amd_nb: Check for invalid SMN reads
  x86/kexec: Fix bug with call depth tracking

10 months agoMerge tag 'perf-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 8 Jun 2024 16:26:59 +0000 (09:26 -0700)]
Merge tag 'perf-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf event fix from Ingo Molnar:
 "Fix race between perf_event_free_task() and perf_event_release_kernel()
  that can result in missed wakeups and hung tasks"

* tag 'perf-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix missing wakeup when waiting for context reference

10 months agoMerge tag 'locking-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 8 Jun 2024 16:03:46 +0000 (09:03 -0700)]
Merge tag 'locking-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking doc fix from Ingo Molnar:
 "Fix typos in the kerneldoc of some of the atomic APIs"

* tag 'locking-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomic: scripts: fix ${atomic}_sub_and_test() kerneldoc

10 months agoMerge tag 'mm-hotfixes-stable-2024-06-07-15-24' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Sat, 8 Jun 2024 00:01:10 +0000 (17:01 -0700)]
Merge tag 'mm-hotfixes-stable-2024-06-07-15-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "14 hotfixes, 6 of which are cc:stable.

  All except the nilfs2 fix affect MM and all are singletons - see the
  chagelogs for details"

* tag 'mm-hotfixes-stable-2024-06-07-15-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors
  mm: fix xyz_noprof functions calling profiled functions
  codetag: avoid race at alloc_slab_obj_exts
  mm/hugetlb: do not call vma_add_reservation upon ENOMEM
  mm/ksm: fix ksm_zero_pages accounting
  mm/ksm: fix ksm_pages_scanned accounting
  kmsan: do not wipe out origin when doing partial unpoisoning
  vmalloc: check CONFIG_EXECMEM in is_vmalloc_or_module_addr()
  mm: page_alloc: fix highatomic typing in multi-block buddies
  nilfs2: fix potential kernel bug due to lack of writeback flag waiting
  memcg: remove the lockdep assert from __mod_objcg_mlstate()
  mm: arm64: fix the out-of-bounds issue in contpte_clear_young_dirty_ptes
  mm: huge_mm: fix undefined reference to `mthp_stats' for CONFIG_SYSFS=n
  mm: drop the 'anon_' prefix for swap-out mTHP counters

10 months agoMerge tag 'gpio-fixes-for-v6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 7 Jun 2024 23:54:57 +0000 (16:54 -0700)]
Merge tag 'gpio-fixes-for-v6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - interrupt handling and Kconfig fixes for gpio-tqmx86

 - add a buffer for storing output values in gpio-tqmx86 as reading back
   the registers always returns the input values

 - add missing MODULE_DESCRIPTION()s to several GPIO drivers

* tag 'gpio-fixes-for-v6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: add missing MODULE_DESCRIPTION() macros
  gpio: tqmx86: fix broken IRQ_TYPE_EDGE_BOTH interrupt type
  gpio: tqmx86: store IRQ trigger type and unmask status separately
  gpio: tqmx86: introduce shadow register for GPIO output value
  gpio: tqmx86: fix typo in Kconfig label

10 months agoMerge tag 'block-6.10-20240607' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 7 Jun 2024 23:45:48 +0000 (16:45 -0700)]
Merge tag 'block-6.10-20240607' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

 - Fix for null_blk block size validation (Andreas)

 - NVMe pull request via Keith:
      - Use reserved tags for special fabrics operations (Chunguang)
      - Persistent Reservation status masking fix (Weiwen)

* tag 'block-6.10-20240607' of git://git.kernel.dk/linux:
  null_blk: fix validation of block size
  nvme: fix nvme_pr_* status code parsing
  nvme-fabrics: use reserved tag for reg read/write command

10 months agoMerge tag 'io_uring-6.10-20240607' of git://git.kernel.dk/linux
Linus Torvalds [Fri, 7 Jun 2024 23:43:07 +0000 (16:43 -0700)]
Merge tag 'io_uring-6.10-20240607' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:

 - Fix a locking order issue with setting max async thread workers
   (Hagar)

 - Fix for a NULL pointer dereference for failed async flagged requests
   using ring provided buffers. This doesn't affect the current kernel,
   but it does affect older kernels, and is being queued up for 6.10
   just to make the stable process easier (me)

 - Fix for NAPI timeout calculations for how long to busy poll, and
   subsequently how much to sleep post that if a wait timeout is passed
   in (me)

 - Fix for a regression in this release cycle, where we could end up
   using a partially unitialized match value for io-wq (Su)

* tag 'io_uring-6.10-20240607' of git://git.kernel.dk/linux:
  io_uring: fix possible deadlock in io_register_iowq_max_workers()
  io_uring/io-wq: avoid garbage value of 'match' in io_wq_enqueue()
  io_uring/napi: fix timeout calculation
  io_uring: check for non-NULL file pointer in io_file_can_poll()

10 months agoMerge tag 'for-6.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Fri, 7 Jun 2024 22:13:12 +0000 (15:13 -0700)]
Merge tag 'for-6.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fix handling of folio private changes.

   The private value holds pointer to our extent buffer structure
   representing a metadata range. Release and create of the range was
   not properly synchronized when updating the private bit which ended
   up in double folio_put, leading to all sorts of breakage

 - fix a crash, reported as duplicate key in metadata, but caused by a
   race of fsync and size extending write. Requires prealloc target
   range + fsync and other conditions (log tree state, timing)

 - fix leak of qgroup extent records after transaction abort

* tag 'for-6.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: protect folio::private when attaching extent buffer folios
  btrfs: fix leak of qgroup extent records after transaction abort
  btrfs: fix crash on racing fsync and size-extending write into prealloc

10 months agoMerge tag 'nfsd-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Linus Torvalds [Fri, 7 Jun 2024 22:07:57 +0000 (15:07 -0700)]
Merge tag 'nfsd-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fix from Chuck Lever:

 - Fix an occasional memory overwrite caused by a fix added in 6.10

* tag 'nfsd-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  SUNRPC: Fix loop termination condition in gss_free_in_token_pages()

10 months agoMerge tag 'riscv-for-linus-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 7 Jun 2024 21:47:38 +0000 (14:47 -0700)]
Merge tag 'riscv-for-linus-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - Another fix to avoid allocating pages that overlap with ERR_PTR,
   which manifests on rv32

 - A revert for the badaccess patch I incorrectly picked up an early
   version of

* tag 'riscv-for-linus-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  Revert "riscv: mm: accelerate pagefault when badaccess"
  riscv: fix overlap of allocated page and PTR_ERR

10 months agoMerge tag 's390-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Fri, 7 Jun 2024 21:44:53 +0000 (14:44 -0700)]
Merge tag 's390-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Alexander Gordeev:

 - Do not create PT_LOAD program header for the kenel image when the
   virtual memory informaton in OS_INFO data is not available. That
   fixes stand-alone dump failures against kernels that do not provide
   the virtual memory informaton

 - Add KVM s390 shared zeropage selftest

* tag 's390-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  KVM: s390x: selftests: Add shared zeropage test
  s390/crash: Do not use VM info if os_info does not have it

10 months agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 7 Jun 2024 21:36:57 +0000 (14:36 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:

 - Fix spurious CPU hotplug warning message from SETEND emulation code

 - Fix the build when GCC wasn't inlining our I/O accessor internals

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/io: add constant-argument check
  arm64: armv8_deprecated: Fix warning in isndep cpuhp starting process

10 months agoMerge tag 'platform-drivers-x86-v6.10-3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 7 Jun 2024 21:13:46 +0000 (14:13 -0700)]
Merge tag 'platform-drivers-x86-v6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Hans de Goede:

 -  Default silead touchscreen driver to 10 fingers and drop 10 finger
    setting from all DMI quirks. More of a cleanup then a pure fix, but
    since the DMI quirks always get updated through the fixes branch
    this avoids conflicts.

 -  Kconfig fix for randconfig builds

 -  dell-smbios: Fix wrong token data in sysfs

 -  amd-hsmp: Fix driver poking unsupported hw when loaded manually

* tag 'platform-drivers-x86-v6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86/amd/hsmp: Check HSMP support on AMD family of processors
  platform/x86: dell-smbios: Simplify error handling
  platform/x86: dell-smbios: Fix wrong token data in sysfs
  platform/x86: yt2-1380: add CONFIG_EXTCON dependency
  platform/x86: touchscreen_dmi: Use 2-argument strscpy()
  platform/x86: touchscreen_dmi: Drop "silead,max-fingers" property
  Input: silead - Always support 10 fingers

10 months agoMerge tag 'iommu-fixes-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 7 Jun 2024 20:34:53 +0000 (13:34 -0700)]
Merge tag 'iommu-fixes-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:
 "Core:

   - Make iommu-dma code recognize 'force_aperture' again

   - Fix for potential NULL-ptr dereference from iommu_sva_bind_device()
     return value

  AMD IOMMU fixes:

   - Fix lockdep splat for invalid wait context

   - Add feature bit check before enabling PPR

   - Make workqueue name fit into buffer

   - Fix memory leak in sysfs code"

* tag 'iommu-fixes-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/amd: Fix Invalid wait context issue
  iommu/amd: Check EFR[EPHSup] bit before enabling PPR
  iommu/amd: Fix workqueue name
  iommu: Return right value in iommu_sva_bind_device()
  iommu/dma: Fix domain init
  iommu/amd: Fix sysfs leak in iommu init