]> www.infradead.org Git - users/hch/block.git/log
users/hch/block.git
20 months agoKVM: arm64: Drop the requirement for XARRAY_MULTI
Marc Zyngier [Wed, 14 Feb 2024 13:18:12 +0000 (13:18 +0000)]
KVM: arm64: Drop the requirement for XARRAY_MULTI

Now that we don't use xa_store_range() anymore, drop the added
complexity of XARRAY_MULTI for KVM. It is likely still pulled
in by other bits of the kernel though.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-12-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoKVM: arm64: nv: Turn encoding ranges into discrete XArray stores
Marc Zyngier [Wed, 14 Feb 2024 13:18:11 +0000 (13:18 +0000)]
KVM: arm64: nv: Turn encoding ranges into discrete XArray stores

In order to be able to store different values for member of an
encoding range, replace xa_store_range() calls with discrete
xa_store() calls and an encoding iterator.

We end-up using a bit more memory, but we gain some flexibility
that we will make use of shortly.

Take this opportunity to tidy up the error handling path.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240214131827.2856277-11-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoKVM: arm64: nv: Correctly handle negative polarity FGTs
Marc Zyngier [Wed, 14 Feb 2024 13:18:10 +0000 (13:18 +0000)]
KVM: arm64: nv: Correctly handle negative polarity FGTs

Negative trap bits are a massive pain. They are, on the surface,
indistinguishable from RES0 bits. Do you trap? or do you ignore?

Thankfully, we now have the right infrastructure to check for RES0
bits as long as the register is backed by VNCR, which is the case
for the FGT registers.

Use that information as a discriminant when handling a trap that
is potentially caused by a FGT.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-10-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoKVM: arm64: Unify HDFG[WR]TR_GROUP FGT identifiers
Marc Zyngier [Wed, 14 Feb 2024 13:18:09 +0000 (13:18 +0000)]
KVM: arm64: Unify HDFG[WR]TR_GROUP FGT identifiers

There is no reason to have separate FGT group identifiers for
the debug fine grain trapping. The sole requirement is to provide
the *names* so that the SR_FGF() macro can do its magic of picking
the correct bit definition.

So let's alias HDFGWTR_GROUP and HDFGRTR_GROUP.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-9-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoKVM: arm64: nv: Drop sanitised_sys_reg() helper
Marc Zyngier [Wed, 14 Feb 2024 13:18:08 +0000 (13:18 +0000)]
KVM: arm64: nv: Drop sanitised_sys_reg() helper

Now that we have the infrastructure to enforce a sanitised register
value depending on the VM configuration, drop the helper that only
used the architectural RES0 value.

Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-8-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoKVM: arm64: nv: Add sanitising to VNCR-backed HCRX_EL2
Marc Zyngier [Wed, 14 Feb 2024 13:18:07 +0000 (13:18 +0000)]
KVM: arm64: nv: Add sanitising to VNCR-backed HCRX_EL2

Just like its little friends, HCRX_EL2 gets the feature set treatment
when backed by VNCR.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-7-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoKVM: arm64: nv: Add sanitising to VNCR-backed FGT sysregs
Marc Zyngier [Wed, 14 Feb 2024 13:18:06 +0000 (13:18 +0000)]
KVM: arm64: nv: Add sanitising to VNCR-backed FGT sysregs

Fine Grained Traps are controlled by a whole bunch of features.
Each one of them must be checked and the corresponding masks
computed so that we don't let the guest apply traps it shouldn't
be using.

This takes care of HFG[IRW]TR_EL2, HDFG[RW]TR_EL2, and HAFGRTR_EL2.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240214131827.2856277-6-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoKVM: arm64: nv: Add sanitising to EL2 configuration registers
Marc Zyngier [Wed, 14 Feb 2024 13:18:05 +0000 (13:18 +0000)]
KVM: arm64: nv: Add sanitising to EL2 configuration registers

We can now start making use of our sanitising masks by setting them
to values that depend on the guest's configuration.

First up are VTTBR_EL2, VTCR_EL2, VMPIDR_EL2 and HCR_EL2.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-5-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoKVM: arm64: nv: Add sanitising to VNCR-backed sysregs
Marc Zyngier [Wed, 14 Feb 2024 13:18:04 +0000 (13:18 +0000)]
KVM: arm64: nv: Add sanitising to VNCR-backed sysregs

VNCR-backed "registers" are actually only memory. Which means that
there is zero control over what the guest can write, and that it
is the hypervisor's job to actually sanitise the content of the
backing store. Yeah, this is fun.

In order to preserve some form of sanity, add a repainting mechanism
that makes use of a per-VM set of RES0/RES1 masks, one pair per VNCR
register. These masks get applied on access to the backing store via
__vcpu_sys_reg(), ensuring that the state that is consumed by KVM is
correct.

So far, nothing populates these masks, but stay tuned.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20240214131827.2856277-4-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoKVM: arm64: Add feature checking helpers
Marc Zyngier [Wed, 14 Feb 2024 13:18:03 +0000 (13:18 +0000)]
KVM: arm64: Add feature checking helpers

In order to make it easier to check whether a particular feature
is exposed to a guest, add a new set of helpers, with kvm_has_feat()
being the most useful.

Let's start making use of them in the PMU code (courtesy of Oliver).
Follow-up changes will introduce additional use patterns.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Co-developed--by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240214131827.2856277-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoarm64: sysreg: Add missing ID_AA64ISAR[13]_EL1 fields and variants
Marc Zyngier [Wed, 14 Feb 2024 13:18:02 +0000 (13:18 +0000)]
arm64: sysreg: Add missing ID_AA64ISAR[13]_EL1 fields and variants

Despite having the control bits for FEAT_SPECRES and FEAT_PACM,
the ID registers fields are either incomplete or missing.

Fix it.

Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240214131827.2856277-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoarm64: cpufeatures: Fix FEAT_NV check when checking for FEAT_NV1
Marc Zyngier [Thu, 15 Feb 2024 01:49:54 +0000 (01:49 +0000)]
arm64: cpufeatures: Fix FEAT_NV check when checking for FEAT_NV1

Using this_cpu_has_cap() has the potential to go wrong when
used system-wide on a preemptible kernel. Instead, use the
__system_matches_cap() helper when checking for FEAT_NV in the
FEAT_NV1 probing helper.

Fixes: 3673d01a2f55 ("arm64: cpufeatures: Only check for NV1 if NV is present")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/kvmarm/86bk8k5ts3.wl-maz@kernel.org/
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoarm64: cpufeatures: Only check for NV1 if NV is present
Marc Zyngier [Mon, 12 Feb 2024 14:47:36 +0000 (14:47 +0000)]
arm64: cpufeatures: Only check for NV1 if NV is present

We handle ID_AA64MMFR4_EL1.E2H0 being 0 as NV1 being present.
However, this is only true if FEAT_NV is implemented.

Add the required check to has_nv1(), avoiding spuriously advertising
NV1 on HW that doesn't have NV at all.

Fixes: da9af5071b25 ("arm64: cpufeature: Detect HCR_EL2.NV1 being RES0")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240212144736.1933112-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoarm64: cpufeatures: Add missing ID_AA64MMFR4_EL1 to __read_sysreg_by_encoding()
Marc Zyngier [Mon, 12 Feb 2024 14:47:35 +0000 (14:47 +0000)]
arm64: cpufeatures: Add missing ID_AA64MMFR4_EL1 to __read_sysreg_by_encoding()

When triggering a CPU hotplug scenario, we reparse the CPU feature
with SCOPE_LOCAL_CPU, for which we use __read_sysreg_by_encoding()
to get the HW value for this CPU.

As it turns out, we're missing the handling for ID_AA64MMFR4_EL1,
and trigger a BUG(). Funnily enough, Marek isn't completely happy
about that.

Add the damn register to the list.

Fixes: 805bb61f8279 ("arm64: cpufeature: Add ID_AA64MMFR4_EL1 handling")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240212144736.1933112-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoKVM: arm64: Handle Apple M2 as not having HCR_EL2.NV1 implemented
Marc Zyngier [Mon, 22 Jan 2024 18:13:44 +0000 (18:13 +0000)]
KVM: arm64: Handle Apple M2 as not having HCR_EL2.NV1 implemented

Although the Apple M2 family of CPUs can have HCR_EL2.NV1 being
set and clear, with the change in trap behaviour being OK, they
explode spectacularily on an EL2 S1 page table using the nVHE
format. This is no good.

Let's pretend this HW doesn't have NV1, and move along.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240122181344.258974-11-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoKVM: arm64: Force guest's HCR_EL2.E2H RES1 when NV1 is not implemented
Marc Zyngier [Mon, 22 Jan 2024 18:13:43 +0000 (18:13 +0000)]
KVM: arm64: Force guest's HCR_EL2.E2H RES1 when NV1 is not implemented

If NV1 isn't supported on a system, make sure we always evaluate
the guest's HCR_EL2.E2H as RES1, irrespective of what the guest
may have written there.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240122181344.258974-10-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoKVM: arm64: Expose ID_AA64MMFR4_EL1 to guests
Marc Zyngier [Mon, 22 Jan 2024 18:13:42 +0000 (18:13 +0000)]
KVM: arm64: Expose ID_AA64MMFR4_EL1 to guests

We can now expose ID_AA64MMFR4_EL1 to guests, and let NV guests
understand that they cannot really switch HCR_EL2.E2H to 0 on
some platforms.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240122181344.258974-9-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoarm64: Treat HCR_EL2.E2H as RES1 when ID_AA64MMFR4_EL1.E2H0 is negative
Marc Zyngier [Mon, 22 Jan 2024 18:13:41 +0000 (18:13 +0000)]
arm64: Treat HCR_EL2.E2H as RES1 when ID_AA64MMFR4_EL1.E2H0 is negative

For CPUs that have ID_AA64MMFR4_EL1.E2H0 as negative, it is important
to avoid the boot path that sets HCR_EL2.E2H=0. Fortunately, we
already have this path to cope with fruity CPUs.

Tweak init_el2 to look at ID_AA64MMFR4_EL1.E2H0 first.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240122181344.258974-8-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoarm64: cpufeature: Detect HCR_EL2.NV1 being RES0
Marc Zyngier [Mon, 22 Jan 2024 18:13:40 +0000 (18:13 +0000)]
arm64: cpufeature: Detect HCR_EL2.NV1 being RES0

A variant of FEAT_E2H0 not being implemented exists in the form of
HCR_EL2.E2H being RES1 *and* HCR_EL2.NV1 being RES0 (indicating that
only VHE is supported on the host and nested guests).

Add the necessary infrastructure for this new CPU capability.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240122181344.258974-7-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoarm64: cpufeature: Add ID_AA64MMFR4_EL1 handling
Marc Zyngier [Mon, 22 Jan 2024 18:13:39 +0000 (18:13 +0000)]
arm64: cpufeature: Add ID_AA64MMFR4_EL1 handling

Add ID_AA64MMFR4_EL1 to the list of idregs the kernel knows about,
and describe the E2H0 field.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240122181344.258974-6-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoarm64: sysreg: Add layout for ID_AA64MMFR4_EL1
Marc Zyngier [Mon, 22 Jan 2024 18:13:38 +0000 (18:13 +0000)]
arm64: sysreg: Add layout for ID_AA64MMFR4_EL1

ARMv9.5 has infroduced ID_AA64MMFR4_EL1 with a bunch of new features.
Add the corresponding layout.

This is extracted from the public ARM SysReg_xml_A_profile-2023-09
delivery, timestamped d55f5af8e09052abe92a02adf820deea2eaed717.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Link: https://lore.kernel.org/r/20240122181344.258974-5-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoarm64: cpufeature: Correctly display signed override values
Marc Zyngier [Mon, 22 Jan 2024 18:13:37 +0000 (18:13 +0000)]
arm64: cpufeature: Correctly display signed override values

When a field gets overriden, the kernel indicates the result of
the override in dmesg. This works well with unsigned fields, but
results in a pretty ugly output when the field is signed.

Truncate the field to its width before displaying it.

Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240122181344.258974-4-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoarm64: cpufeatures: Correctly handle signed values
Marc Zyngier [Mon, 22 Jan 2024 18:13:36 +0000 (18:13 +0000)]
arm64: cpufeatures: Correctly handle signed values

Although we've had signed values for some features such as PMUv3
and FP, the code that handles the comparaison with some limit
has a couple of annoying issues:

- the min_field_value is always unsigned, meaning that we cannot
  easily compare it with a negative value

- it is not possible to have a range of values, let alone a range
  of negative values

Fix this by:

- adding an upper limit to the comparison, defaulting to all bits
  being set to the maximum positive value

- ensuring that the signess of the min and max values are taken into
  account

A ARM64_CPUID_FIELDS_NEG() macro is provided for signed features, but
nothing is using it yet.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240122181344.258974-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
20 months agoarm64: Add macro to compose a sysreg field value
Marc Zyngier [Mon, 22 Jan 2024 18:13:35 +0000 (18:13 +0000)]
arm64: Add macro to compose a sysreg field value

A common idiom is to compose a tupple (reg, field, val) into
a symbol matching an autogenerated definition.

Add a help performing the concatenation and replace it when
open-coded implementations exist.

Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240122181344.258974-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
21 months agoLinux 6.8-rc1
Linus Torvalds [Sun, 21 Jan 2024 22:11:32 +0000 (14:11 -0800)]
Linux 6.8-rc1

21 months agoMerge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs
Linus Torvalds [Sun, 21 Jan 2024 22:01:12 +0000 (14:01 -0800)]
Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs

Pull more bcachefs updates from Kent Overstreet:
 "Some fixes, Some refactoring, some minor features:

   - Assorted prep work for disk space accounting rewrite

   - BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
     makes our trigger context more explicit

   - A few fixes to avoid excessive transaction restarts on
     multithreaded workloads: fstests (in addition to ktest tests) are
     now checking slowpath counters, and that's shaking out a few bugs

   - Assorted tracepoint improvements

   - Starting to break up bcachefs_format.h and move on disk types so
     they're with the code they belong to; this will make room to start
     documenting the on disk format better.

   - A few minor fixes"

* tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits)
  bcachefs: Improve inode_to_text()
  bcachefs: logged_ops_format.h
  bcachefs: reflink_format.h
  bcachefs; extents_format.h
  bcachefs: ec_format.h
  bcachefs: subvolume_format.h
  bcachefs: snapshot_format.h
  bcachefs: alloc_background_format.h
  bcachefs: xattr_format.h
  bcachefs: dirent_format.h
  bcachefs: inode_format.h
  bcachefs; quota_format.h
  bcachefs: sb-counters_format.h
  bcachefs: counters.c -> sb-counters.c
  bcachefs: comment bch_subvolume
  bcachefs: bch_snapshot::btime
  bcachefs: add missing __GFP_NOWARN
  bcachefs: opts->compression can now also be applied in the background
  bcachefs: Prep work for variable size btree node buffers
  bcachefs: grab s_umount only if snapshotting
  ...

21 months agoMerge tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Jan 2024 19:14:40 +0000 (11:14 -0800)]
Merge tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Updates for time and clocksources:

   - A fix for the idle and iowait time accounting vs CPU hotplug.

     The time is reset on CPU hotplug which makes the accumulated
     systemwide time jump backwards.

   - Assorted fixes and improvements for clocksource/event drivers"

* tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
  clocksource/drivers/ep93xx: Fix error handling during probe
  clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings
  clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
  clocksource/timer-riscv: Add riscv_clock_shutdown callback
  dt-bindings: timer: Add StarFive JH8100 clint
  dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs

21 months agoMerge tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 21 Jan 2024 19:04:29 +0000 (11:04 -0800)]
Merge tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Aneesh Kumar:

 - Increase default stack size to 32KB for Book3S

Thanks to Michael Ellerman.

* tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Increase default stack size to 32KB

21 months agobcachefs: Improve inode_to_text()
Kent Overstreet [Sun, 21 Jan 2024 17:19:01 +0000 (12:19 -0500)]
bcachefs: Improve inode_to_text()

Add line breaks - inode_to_text() is now much easier to read.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: logged_ops_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:57:45 +0000 (02:57 -0500)]
bcachefs: logged_ops_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: reflink_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:54:47 +0000 (02:54 -0500)]
bcachefs: reflink_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs; extents_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:51:56 +0000 (02:51 -0500)]
bcachefs; extents_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: ec_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:47:14 +0000 (02:47 -0500)]
bcachefs: ec_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: subvolume_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:42:53 +0000 (02:42 -0500)]
bcachefs: subvolume_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: snapshot_format.h
Kent Overstreet [Sun, 21 Jan 2024 07:41:06 +0000 (02:41 -0500)]
bcachefs: snapshot_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: alloc_background_format.h
Kent Overstreet [Sun, 21 Jan 2024 05:01:52 +0000 (00:01 -0500)]
bcachefs: alloc_background_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: xattr_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:59:15 +0000 (23:59 -0500)]
bcachefs: xattr_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: dirent_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:57:10 +0000 (23:57 -0500)]
bcachefs: dirent_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: inode_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:55:39 +0000 (23:55 -0500)]
bcachefs: inode_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs; quota_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:53:52 +0000 (23:53 -0500)]
bcachefs; quota_format.h

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: sb-counters_format.h
Kent Overstreet [Sun, 21 Jan 2024 04:50:56 +0000 (23:50 -0500)]
bcachefs: sb-counters_format.h

bcachefs_format.h has gotten too big; let's do some organizing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: counters.c -> sb-counters.c
Kent Overstreet [Sun, 21 Jan 2024 04:46:35 +0000 (23:46 -0500)]
bcachefs: counters.c -> sb-counters.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: comment bch_subvolume
Kent Overstreet [Sun, 21 Jan 2024 04:44:17 +0000 (23:44 -0500)]
bcachefs: comment bch_subvolume

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bch_snapshot::btime
Kent Overstreet [Sun, 21 Jan 2024 04:35:41 +0000 (23:35 -0500)]
bcachefs: bch_snapshot::btime

Add a field to bch_snapshot for creation time; this will be important
when we start exposing the snapshot tree to userspace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: add missing __GFP_NOWARN
Kent Overstreet [Wed, 17 Jan 2024 22:16:07 +0000 (17:16 -0500)]
bcachefs: add missing __GFP_NOWARN

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: opts->compression can now also be applied in the background
Kent Overstreet [Tue, 16 Jan 2024 21:20:21 +0000 (16:20 -0500)]
bcachefs: opts->compression can now also be applied in the background

The "apply this compression method in the background" paths now use the
compression option if background_compression is not set; this means that
setting or changing the compression option will cause existing data to
be compressed accordingly in the background.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Prep work for variable size btree node buffers
Kent Overstreet [Tue, 16 Jan 2024 18:29:59 +0000 (13:29 -0500)]
bcachefs: Prep work for variable size btree node buffers

bcachefs btree nodes are big - typically 256k - and btree roots are
pinned in memory. As we're now up to 18 btrees, we now have significant
memory overhead in mostly empty btree roots.

And in the future we're going to start enforcing that certain btree node
boundaries exist, to solve lock contention issues - analagous to XFS's
AGIs.

Thus, we need to start allocating smaller btree node buffers when we
can. This patch changes code that refers to the filesystem constant
c->opts.btree_node_size to refer to the btree node buffer size -
btree_buf_bytes() - where appropriate.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: grab s_umount only if snapshotting
Su Yue [Mon, 15 Jan 2024 02:21:25 +0000 (10:21 +0800)]
bcachefs: grab s_umount only if snapshotting

When I was testing mongodb over bcachefs with compression,
there is a lockdep warning when snapshotting mongodb data volume.

$ cat test.sh
prog=bcachefs

$prog subvolume create /mnt/data
$prog subvolume create /mnt/data/snapshots

while true;do
    $prog subvolume snapshot /mnt/data /mnt/data/snapshots/$(date +%s)
    sleep 1s
done

$ cat /etc/mongodb.conf
systemLog:
  destination: file
  logAppend: true
  path: /mnt/data/mongod.log

storage:
  dbPath: /mnt/data/

lockdep reports:
[ 3437.452330] ======================================================
[ 3437.452750] WARNING: possible circular locking dependency detected
[ 3437.453168] 6.7.0-rc7-custom+ #85 Tainted: G            E
[ 3437.453562] ------------------------------------------------------
[ 3437.453981] bcachefs/35533 is trying to acquire lock:
[ 3437.454325] ffffa0a02b2b1418 (sb_writers#10){.+.+}-{0:0}, at: filename_create+0x62/0x190
[ 3437.454875]
               but task is already holding lock:
[ 3437.455268] ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.456009]
               which lock already depends on the new lock.

[ 3437.456553]
               the existing dependency chain (in reverse order) is:
[ 3437.457054]
               -> #3 (&type->s_umount_key#48){.+.+}-{3:3}:
[ 3437.457507]        down_read+0x3e/0x170
[ 3437.457772]        bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.458206]        __x64_sys_ioctl+0x93/0xd0
[ 3437.458498]        do_syscall_64+0x42/0xf0
[ 3437.458779]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.459155]
               -> #2 (&c->snapshot_create_lock){++++}-{3:3}:
[ 3437.459615]        down_read+0x3e/0x170
[ 3437.459878]        bch2_truncate+0x82/0x110 [bcachefs]
[ 3437.460276]        bchfs_truncate+0x254/0x3c0 [bcachefs]
[ 3437.460686]        notify_change+0x1f1/0x4a0
[ 3437.461283]        do_truncate+0x7f/0xd0
[ 3437.461555]        path_openat+0xa57/0xce0
[ 3437.461836]        do_filp_open+0xb4/0x160
[ 3437.462116]        do_sys_openat2+0x91/0xc0
[ 3437.462402]        __x64_sys_openat+0x53/0xa0
[ 3437.462701]        do_syscall_64+0x42/0xf0
[ 3437.462982]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.463359]
               -> #1 (&sb->s_type->i_mutex_key#15){+.+.}-{3:3}:
[ 3437.463843]        down_write+0x3b/0xc0
[ 3437.464223]        bch2_write_iter+0x5b/0xcc0 [bcachefs]
[ 3437.464493]        vfs_write+0x21b/0x4c0
[ 3437.464653]        ksys_write+0x69/0xf0
[ 3437.464839]        do_syscall_64+0x42/0xf0
[ 3437.465009]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.465231]
               -> #0 (sb_writers#10){.+.+}-{0:0}:
[ 3437.465471]        __lock_acquire+0x1455/0x21b0
[ 3437.465656]        lock_acquire+0xc6/0x2b0
[ 3437.465822]        mnt_want_write+0x46/0x1a0
[ 3437.465996]        filename_create+0x62/0x190
[ 3437.466175]        user_path_create+0x2d/0x50
[ 3437.466352]        bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.466617]        __x64_sys_ioctl+0x93/0xd0
[ 3437.466791]        do_syscall_64+0x42/0xf0
[ 3437.466957]        entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.467180]
               other info that might help us debug this:

[ 3437.469670] 2 locks held by bcachefs/35533:
               other info that might help us debug this:

[ 3437.467507] Chain exists of:
                 sb_writers#10 --> &c->snapshot_create_lock --> &type->s_umount_key#48

[ 3437.467979]  Possible unsafe locking scenario:

[ 3437.468223]        CPU0                    CPU1
[ 3437.468405]        ----                    ----
[ 3437.468585]   rlock(&type->s_umount_key#48);
[ 3437.468758]                                lock(&c->snapshot_create_lock);
[ 3437.469030]                                lock(&type->s_umount_key#48);
[ 3437.469291]   rlock(sb_writers#10);
[ 3437.469434]
                *** DEADLOCK ***

[ 3437.469670] 2 locks held by bcachefs/35533:
[ 3437.469838]  #0: ffffa0a02ce00a88 (&c->snapshot_create_lock){++++}-{3:3}, at: bch2_fs_file_ioctl+0x1e3/0xc90 [bcachefs]
[ 3437.470294]  #1: ffffa0a02b2b10e0 (&type->s_umount_key#48){.+.+}-{3:3}, at: bch2_fs_file_ioctl+0x232/0xc90 [bcachefs]
[ 3437.470744]
               stack backtrace:
[ 3437.470922] CPU: 7 PID: 35533 Comm: bcachefs Kdump: loaded Tainted: G            E      6.7.0-rc7-custom+ #85
[ 3437.471313] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[ 3437.471694] Call Trace:
[ 3437.471795]  <TASK>
[ 3437.471884]  dump_stack_lvl+0x57/0x90
[ 3437.472035]  check_noncircular+0x132/0x150
[ 3437.472202]  __lock_acquire+0x1455/0x21b0
[ 3437.472369]  lock_acquire+0xc6/0x2b0
[ 3437.472518]  ? filename_create+0x62/0x190
[ 3437.472683]  ? lock_is_held_type+0x97/0x110
[ 3437.472856]  mnt_want_write+0x46/0x1a0
[ 3437.473025]  ? filename_create+0x62/0x190
[ 3437.473204]  filename_create+0x62/0x190
[ 3437.473380]  user_path_create+0x2d/0x50
[ 3437.473555]  bch2_fs_file_ioctl+0x2ec/0xc90 [bcachefs]
[ 3437.473819]  ? lock_acquire+0xc6/0x2b0
[ 3437.474002]  ? __fget_files+0x2a/0x190
[ 3437.474195]  ? __fget_files+0xbc/0x190
[ 3437.474380]  ? lock_release+0xc5/0x270
[ 3437.474567]  ? __x64_sys_ioctl+0x93/0xd0
[ 3437.474764]  ? __pfx_bch2_fs_file_ioctl+0x10/0x10 [bcachefs]
[ 3437.475090]  __x64_sys_ioctl+0x93/0xd0
[ 3437.475277]  do_syscall_64+0x42/0xf0
[ 3437.475454]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 3437.475691] RIP: 0033:0x7f2743c313af
======================================================

In __bch2_ioctl_subvolume_create(), we grab s_umount unconditionally
and unlock it at the end of the function. There is a comment
"why do we need this lock?" about the lock coming from
commit 42d237320e98 ("bcachefs: Snapshot creation, deletion")
The reason is that __bch2_ioctl_subvolume_create() calls
sync_inodes_sb() which enforce locked s_umount to writeback all dirty
nodes before doing snapshot works.

Fix it by read locking s_umount for snapshotting only and unlocking
s_umount after sync_inodes_sb().

Signed-off-by: Su Yue <glass.su@suse.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exit
Su Yue [Tue, 16 Jan 2024 11:05:37 +0000 (19:05 +0800)]
bcachefs: kvfree bch_fs::snapshots in bch2_fs_snapshots_exit

bch_fs::snapshots is allocated by kvzalloc in __snapshot_t_mut.
It should be freed by kvfree not kfree.
Or umount will triger:

[  406.829178 ] BUG: unable to handle page fault for address: ffffe7b487148008
[  406.830676 ] #PF: supervisor read access in kernel mode
[  406.831643 ] #PF: error_code(0x0000) - not-present page
[  406.832487 ] PGD 0 P4D 0
[  406.832898 ] Oops: 0000 [#1] PREEMPT SMP PTI
[  406.833512 ] CPU: 2 PID: 1754 Comm: umount Kdump: loaded Tainted: G           OE      6.7.0-rc7-custom+ #90
[  406.834746 ] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[  406.835796 ] RIP: 0010:kfree+0x62/0x140
[  406.836197 ] Code: 80 48 01 d8 0f 82 e9 00 00 00 48 c7 c2 00 00 00 80 48 2b 15 78 9f 1f 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 56 9f 1f 01 <48> 8b 50 08 48 89 c7 f6 c2 01 0f 85 b0 00 00 00 66 90 48 8b 07 f6
[  406.837810 ] RSP: 0018:ffffb9d641607e48 EFLAGS: 00010286
[  406.838213 ] RAX: ffffe7b487148000 RBX: ffffb9d645200000 RCX: ffffb9d641607dc4
[  406.838738 ] RDX: 000065bb00000000 RSI: ffffffffc0d88b84 RDI: ffffb9d645200000
[  406.839217 ] RBP: ffff9a4625d00068 R08: 0000000000000001 R09: 0000000000000001
[  406.839650 ] R10: 0000000000000001 R11: 000000000000001f R12: ffff9a4625d4da80
[  406.840055 ] R13: ffff9a4625d00000 R14: ffffffffc0e2eb20 R15: 0000000000000000
[  406.840451 ] FS:  00007f0a264ffb80(0000) GS:ffff9a4e2d500000(0000) knlGS:0000000000000000
[  406.840851 ] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  406.841125 ] CR2: ffffe7b487148008 CR3: 000000018c4d2000 CR4: 00000000000006f0
[  406.841464 ] Call Trace:
[  406.841583 ]  <TASK>
[  406.841682 ]  ? __die+0x1f/0x70
[  406.841828 ]  ? page_fault_oops+0x159/0x470
[  406.842014 ]  ? fixup_exception+0x22/0x310
[  406.842198 ]  ? exc_page_fault+0x1ed/0x200
[  406.842382 ]  ? asm_exc_page_fault+0x22/0x30
[  406.842574 ]  ? bch2_fs_release+0x54/0x280 [bcachefs]
[  406.842842 ]  ? kfree+0x62/0x140
[  406.842988 ]  ? kfree+0x104/0x140
[  406.843138 ]  bch2_fs_release+0x54/0x280 [bcachefs]
[  406.843390 ]  kobject_put+0xb7/0x170
[  406.843552 ]  deactivate_locked_super+0x2f/0xa0
[  406.843756 ]  cleanup_mnt+0xba/0x150
[  406.843917 ]  task_work_run+0x59/0xa0
[  406.844083 ]  exit_to_user_mode_prepare+0x197/0x1a0
[  406.844302 ]  syscall_exit_to_user_mode+0x16/0x40
[  406.844510 ]  do_syscall_64+0x4e/0xf0
[  406.844675 ]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  406.844907 ] RIP: 0033:0x7f0a2664e4fb

Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bios must be 512 byte algined
Kent Overstreet [Tue, 16 Jan 2024 16:38:04 +0000 (11:38 -0500)]
bcachefs: bios must be 512 byte algined

Fixes: 023f9ac9f70f bcachefs: Delete dio read alignment check
Reported-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: remove redundant variable tmp
Colin Ian King [Tue, 16 Jan 2024 11:07:23 +0000 (11:07 +0000)]
bcachefs: remove redundant variable tmp

The variable tmp is being assigned a value but it isn't being
read afterwards. The assignment is redundant and so tmp can be
removed.

Cleans up clang scan build warning:
warning: Although the value stored to 'ret' is used in the enclosing
expression, the value is never actually read from 'ret'
[deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Improve trace_trans_restart_relock
Kent Overstreet [Tue, 16 Jan 2024 01:40:06 +0000 (20:40 -0500)]
bcachefs: Improve trace_trans_restart_relock

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Fix excess transaction restarts in __bchfs_fallocate()
Kent Overstreet [Tue, 16 Jan 2024 01:37:23 +0000 (20:37 -0500)]
bcachefs: Fix excess transaction restarts in __bchfs_fallocate()

drop_locks_do() should not be used in a fastpath without first trying
the do in nonblocking mode - the unlock and relock will cause excessive
transaction restarts and potentially livelocking with other threads that
are contending for the same locks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: extents_to_bp_state
Kent Overstreet [Mon, 15 Jan 2024 23:19:52 +0000 (18:19 -0500)]
bcachefs: extents_to_bp_state

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bkey_and_val_eq()
Kent Overstreet [Mon, 15 Jan 2024 23:08:32 +0000 (18:08 -0500)]
bcachefs: bkey_and_val_eq()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Better journal tracepoints
Kent Overstreet [Mon, 15 Jan 2024 22:59:51 +0000 (17:59 -0500)]
bcachefs: Better journal tracepoints

Factor out bch2_journal_bufs_to_text(), and use it in the
journal_entry_full() tracepoint; when we can't get a journal reservation
we need to know the outstanding journal entry sizes to know if the
problem is due to excessive flushing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Print size of superblock with space allocated
Kent Overstreet [Mon, 15 Jan 2024 22:57:44 +0000 (17:57 -0500)]
bcachefs: Print size of superblock with space allocated

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Avoid flushing the journal in the discard path
Kent Overstreet [Mon, 15 Jan 2024 22:56:22 +0000 (17:56 -0500)]
bcachefs: Avoid flushing the journal in the discard path

When issuing discards, we may need to flush the journal if there's too
many buckets that can't be discarded until a journal flush.

But the heuristic was bad; we should be comparing the number of buckets
that need to flushes against the number of free buckets, not the number
of buckets we saw.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Improve move_extent tracepoint
Kent Overstreet [Mon, 15 Jan 2024 20:33:39 +0000 (15:33 -0500)]
bcachefs: Improve move_extent tracepoint

Also print out the data_opts, so that we can see what specifically is
being done to an extent.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Add missing bch2_moving_ctxt_flush_all()
Kent Overstreet [Mon, 15 Jan 2024 20:06:43 +0000 (15:06 -0500)]
bcachefs: Add missing bch2_moving_ctxt_flush_all()

This fixes a bug with rebalance IOs getting stuck with reads completed,
but writes never being issued.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Re-add move_extent_write tracepoint
Kent Overstreet [Mon, 15 Jan 2024 20:04:40 +0000 (15:04 -0500)]
bcachefs: Re-add move_extent_write tracepoint

It appears this was accidentally deleted at some point - also, do a bit
of cleanup.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bch2_kthread_io_clock_wait() no longer sleeps until full amount
Kent Overstreet [Mon, 15 Jan 2024 19:15:26 +0000 (14:15 -0500)]
bcachefs: bch2_kthread_io_clock_wait() no longer sleeps until full amount

Drop t he loop in bch2_kthread_io_clock_wait(): this allows the code
that uses it to be woken up for other reasons, and fixes a bug where
rebalance wouldn't wake up when a scan was requested.

This raises the possibility of spurious wakeups, but callers should
always be able to handle that reasonably well.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Add .val_to_text() for KEY_TYPE_cookie
Kent Overstreet [Mon, 15 Jan 2024 19:15:03 +0000 (14:15 -0500)]
bcachefs: Add .val_to_text() for KEY_TYPE_cookie

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Don't pass memcmp() as a pointer
Kent Overstreet [Mon, 15 Jan 2024 19:12:43 +0000 (14:12 -0500)]
bcachefs: Don't pass memcmp() as a pointer

Some (buggy!) compilers have issues with this.

Fixes: https://github.com/koverstreet/bcachefs/issues/625
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agoMerge tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefs
Linus Torvalds [Sun, 21 Jan 2024 18:21:43 +0000 (10:21 -0800)]
Merge tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefs

Pull header fix from Kent Overstreet:
 "Just one small fixup for the RT build"

* tag 'header_cleanup-2024-01-20' of https://evilpiepirate.org/git/bcachefs:
  spinlock: Fix failing build for PREEMPT_RT

21 months agobcachefs: Reduce would_deadlock restarts
Kent Overstreet [Thu, 11 Jan 2024 04:47:04 +0000 (23:47 -0500)]
bcachefs: Reduce would_deadlock restarts

We don't have to take locks in any particular ordering - we'll make
forward progress just fine - but if we try to stick to an ordering, it
can help to avoid excessive would_deadlock transaction restarts.

This tweaks the reflink path to take extents btree locks in the right
order.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bch2_trans_account_disk_usage_change()
Kent Overstreet [Sat, 11 Nov 2023 20:08:36 +0000 (15:08 -0500)]
bcachefs: bch2_trans_account_disk_usage_change()

The disk space accounting rewrite is splitting out accounting for each
replicas set - those are moving to btree keys, instead of percpu
counters.

This breaks bch2_trans_fs_usage_apply() up, splitting out the part we
will still need.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bch_fs_usage_base
Kent Overstreet [Fri, 17 Nov 2023 05:03:45 +0000 (00:03 -0500)]
bcachefs: bch_fs_usage_base

Split out base filesystem usage into its own type; prep work for
breaking up bch2_trans_fs_usage_apply().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: bch2_prt_compression_type()
Kent Overstreet [Sun, 7 Jan 2024 02:01:47 +0000 (21:01 -0500)]
bcachefs: bch2_prt_compression_type()

bounds checking helper, since compression types are extensible

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: helpers for printing data types
Kent Overstreet [Sun, 7 Jan 2024 01:57:43 +0000 (20:57 -0500)]
bcachefs: helpers for printing data types

We need bounds checking since new versions may introduce new data types.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: BTREE_TRIGGER_ATOMIC
Kent Overstreet [Sun, 7 Jan 2024 22:14:46 +0000 (17:14 -0500)]
bcachefs: BTREE_TRIGGER_ATOMIC

Add a new flag to be explicit about when we're running atomic triggers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: drop to_text code for obsolete bps in alloc keys
Kent Overstreet [Sun, 7 Jan 2024 00:47:09 +0000 (19:47 -0500)]
bcachefs: drop to_text code for obsolete bps in alloc keys

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: eytzinger_for_each() declares loop iter
Kent Overstreet [Sun, 7 Jan 2024 00:29:14 +0000 (19:29 -0500)]
bcachefs: eytzinger_for_each() declares loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: Don't log errors if BCH_WRITE_ALLOC_NOWAIT
Kent Overstreet [Thu, 11 Jan 2024 04:08:30 +0000 (23:08 -0500)]
bcachefs: Don't log errors if BCH_WRITE_ALLOC_NOWAIT

Previously, we added logging in the write path to ensure that any
unexpected errors getting reported to userspace have a log message; but
BCH_WRITE_ALLOC_NOWAIT is a special case, it's used for promotes where
errors are expected and not reported out to userspace - so we need to
silence those.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agobcachefs: fix memleak in bch2_split_devs
Su Yue [Mon, 8 Jan 2024 15:11:08 +0000 (23:11 +0800)]
bcachefs: fix memleak in bch2_split_devs

The pointer dev_name can be modified by strseq(),
then causes the memleak:

unreferenced object 0xffff9d08a2916c80 (size 32):
  comm "mount.bcachefs", pid 9090, jiffies 4295856224 (age 17.564s)
  hex dump (first 32 bytes):
    2f 64 65 76 2f 6d 61 70 70 65 72 2f 74 65 73 74  /dev/mapper/test
    2d 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00  -0..............
  backtrace:
    [<00000000c5d3be7d>] __kmem_cache_alloc_node+0x1f3/0x2c0
    [<0000000052215d26>] __kmalloc_node_track_caller+0x51/0x150
    [<0000000069fea956>] kstrdup+0x32/0x60
    [<000000000877fcf1>] bch2_split_devs+0x3f/0x150 [bcachefs]
    [<000000007ee93204>] bch2_mount+0xcb/0x640 [bcachefs]
    [<000000002dd1e04b>] legacy_get_tree+0x30/0x60
    [<000000006afc31d3>] vfs_get_tree+0x28/0xf0
    [<000000007b0c538e>] path_mount+0x475/0xb60
    [<0000000092de5882>] __x64_sys_mount+0x105/0x140
    [<0000000054fc05d8>] do_syscall_64+0x42/0xf0
    [<00000000df584910>] entry_SYSCALL_64_after_hwframe+0x6e/0x76

Fix it by copy pointer dev_name at beginning and free the copied
pointer at end.

Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
21 months agoMerge tag 'v6.8-rc-part2-smb-client' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 21 Jan 2024 00:48:07 +0000 (16:48 -0800)]
Merge tag 'v6.8-rc-part2-smb-client' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client updates from Steve French:
 "Various smb client fixes, including multichannel and for SMB3.1.1
  POSIX extensions:

   - debugging improvement (display start time for stats)

   - two reparse point handling fixes

   - various multichannel improvements and fixes

   - SMB3.1.1 POSIX extensions open/create parsing fix

   - retry (reconnect) improvement including new retrans mount parm, and
     handling of two additional return codes that need to be retried on

   - two minor cleanup patches and another to remove duplicate query
     info code

   - two documentation cleanup, and one reviewer email correction"

* tag 'v6.8-rc-part2-smb-client' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update iface_last_update on each query-and-update
  cifs: handle servers that still advertise multichannel after disabling
  cifs: new mount option called retrans
  cifs: reschedule periodic query for server interfaces
  smb: client: don't clobber ->i_rdev from cached reparse points
  smb: client: get rid of smb311_posix_query_path_info()
  smb: client: parse owner/group when creating reparse points
  smb: client: fix parsing of SMB3.1.1 POSIX create context
  cifs: update known bugs mentioned in kernel docs for cifs
  cifs: new nt status codes from MS-SMB2
  cifs: pick channel for tcon and tdis
  cifs: open_cached_dir should not rely on primary channel
  smb3: minor documentation updates
  Update MAINTAINERS email address
  cifs: minor comment cleanup
  smb3: show beginning time for per share stats
  cifs: remove redundant variable tcon_exist

21 months agoMerge tag 'dmaengine-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 20 Jan 2024 23:03:25 +0000 (15:03 -0800)]
Merge tag 'dmaengine-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine

Pull dmaengine updates from Vinod Koul:
 "New support:
   - Loongson LS2X APB DMA controller
   - sf-pdma: mpfs-pdma support
   - Qualcomm X1E80100 GPI dma controller support

  Updates:
   - Xilinx XDMA updates to support interleaved DMA transfers
   - TI PSIL threads for AM62P and J722S and cfg register regions
     description
   - axi-dmac Improving the cyclic DMA transfers
   - Tegra Support dma-channel-mask property
   - Remaining platform remove callback returning void conversions

 Driver fixes for:
   - Xilinx xdma driver operator precedence and initialization fix
   - Excess kernel-doc warning fix in imx-sdma xilinx xdma drivers
   - format-overflow warning fix for rz-dmac, sh usb dmac drivers
   - 'output may be truncated' fix for shdma, fsl-qdma and dw-edma
     drivers"

* tag 'dmaengine-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (58 commits)
  dmaengine: dw-edma: increase size of 'name' in debugfs code
  dmaengine: fsl-qdma: increase size of 'irq_name'
  dmaengine: shdma: increase size of 'dev_id'
  dmaengine: xilinx: xdma: Fix kernel-doc warnings
  dmaengine: usb-dmac: Avoid format-overflow warning
  dmaengine: sh: rz-dmac: Avoid format-overflow warning
  dmaengine: imx-sdma: fix Excess kernel-doc warnings
  dmaengine: xilinx: xdma: Fix initialization location of desc in xdma_channel_isr()
  dmaengine: xilinx: xdma: Fix operator precedence in xdma_prep_interleaved_dma()
  dmaengine: xilinx: xdma: statify xdma_prep_interleaved_dma
  dmaengine: xilinx: xdma: Workaround truncation compilation error
  dmaengine: pl330: issue_pending waits until WFP state
  dmaengine: xilinx: xdma: Implement interleaved DMA transfers
  dmaengine: xilinx: xdma: Prepare the introduction of interleaved DMA transfers
  dmaengine: xilinx: xdma: Add transfer error reporting
  dmaengine: xilinx: xdma: Add error checking in xdma_channel_isr()
  dmaengine: xilinx: xdma: Rework xdma_terminate_all()
  dmaengine: xilinx: xdma: Ease dma_pool alignment requirements
  dmaengine: xilinx: xdma: Add necessary macro definitions
  dmaengine: xilinx: xdma: Get rid of unused code
  ...

21 months agoMerge tag 'coccinelle-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawa...
Linus Torvalds [Sat, 20 Jan 2024 22:20:34 +0000 (14:20 -0800)]
Merge tag 'coccinelle-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux

Pull coccinelle updates from Julia Lawall:
 "Updates to the device_attr_show semantic patch to reflect the new
  guidelines of the Linux kernel documentation.

  The problem was identified by Li Zhijian <lizhijian@fujitsu.com>, who
  proposed an initial fix"

* tag 'coccinelle-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
  coccinelle: device_attr_show: simplify patch case
  coccinelle: device_attr_show: Adapt to the latest Documentation/filesystems/sysfs.rst

21 months agomedia: solo6x10: replace max(a, min(b, c)) by clamp(b, a, c)
Aurelien Jarno [Sat, 13 Jan 2024 18:33:31 +0000 (19:33 +0100)]
media: solo6x10: replace max(a, min(b, c)) by clamp(b, a, c)

This patch replaces max(a, min(b, c)) by clamp(b, a, c) in the solo6x10
driver.  This improves the readability and more importantly, for the
solo6x10-p2m.c file, this reduces on my system (x86-64, gcc 13):

 - the preprocessed size from 121 MiB to 4.5 MiB;

 - the build CPU time from 46.8 s to 1.6 s;

 - the build memory from 2786 MiB to 98MiB.

In fine, this allows this relatively simple C file to be built on a
32-bit system.

Reported-by: Jiri Slaby <jirislaby@gmail.com>
Closes: https://lore.kernel.org/lkml/18c6df0d-45ed-450c-9eda-95160a2bbb8e@gmail.com/
Cc: <stable@vger.kernel.org> # v6.7+
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: David Laight <David.Laight@ACULAB.COM>
Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
21 months agococcinelle: device_attr_show: simplify patch case
Julia Lawall [Sat, 20 Jan 2024 20:56:11 +0000 (21:56 +0100)]
coccinelle: device_attr_show: simplify patch case

Replacing the final expression argument by ... allows the format
string to have multiple arguments.

It also has the advantage of allowing the change to be recognized as
a change in a single statement, thus avoiding adding unneeded braces.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
21 months agoexecve: open the executable file before doing anything else
Linus Torvalds [Tue, 9 Jan 2024 00:43:04 +0000 (16:43 -0800)]
execve: open the executable file before doing anything else

No point in allocating a new mm, counting arguments and environment
variables etc if we're just going to return ENOENT.

This patch does expose the fact that 'do_filp_open()' that execve() uses
is still unnecessarily expensive in the failure case, because it
allocates the 'struct file *' early, even if the path lookup (which is
heavily optimized) fails.

So that remains an unnecessary cost in the "no such executable" case,
but it's a separate issue.  Regardless, I do not want to do _both_ a
filename_lookup() and a later do_filp_open() like the origin patch by
Josh Triplett did in [1].

Reported-by: Josh Triplett <josh@joshtriplett.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/lkml/5c7333ea4bec2fad1b47a8fa2db7c31e4ffc4f14.1663334978.git.josh@joshtriplett.org/
Link: https://lore.kernel.org/lkml/202209161637.9EDAF6B18@keescook/
Link: https://lore.kernel.org/lkml/CAHk-=wgznerM-xs+x+krDfE7eVBiy_HOam35rbsFMMOwvYuEKQ@mail.gmail.com/
Link: https://lore.kernel.org/lkml/CAHk-=whf9qLO8ipps4QhmS0BkM8mtWJhvnuDSdtw5gFjhzvKNA@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
21 months agoMerge tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 20 Jan 2024 19:06:04 +0000 (11:06 -0800)]
Merge tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - Support for tuning for systems with fast misaligned accesses.

 - Support for SBI-based suspend.

 - Support for the new SBI debug console extension.

 - The T-Head CMOs now use PA-based flushes.

 - Support for enabling the V extension in kernel code.

 - Optimized IP checksum routines.

 - Various ftrace improvements.

 - Support for archrandom, which depends on the Zkr extension.

 - The build is no longer broken under NET=n, KUNIT=y for ports that
   don't define their own ipv6 checksum.

* tag 'riscv-for-linus-6.8-mw4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (56 commits)
  lib: checksum: Fix build with CONFIG_NET=n
  riscv: lib: Check if output in asm goto supported
  riscv: Fix build error on rv32 + XIP
  riscv: optimize ELF relocation function in riscv
  RISC-V: Implement archrandom when Zkr is available
  riscv: Optimize hweight API with Zbb extension
  riscv: add dependency among Image(.gz), loader(.bin), and vmlinuz.efi
  samples: ftrace: Add RISC-V support for SAMPLE_FTRACE_DIRECT[_MULTI]
  riscv: ftrace: Add DYNAMIC_FTRACE_WITH_DIRECT_CALLS support
  riscv: ftrace: Make function graph use ftrace directly
  riscv: select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
  lib/Kconfig.debug: Update AS_HAS_NON_CONST_LEB128 comment and name
  riscv: Restrict DWARF5 when building with LLVM to known working versions
  riscv: Hoist linker relaxation disabling logic into Kconfig
  kunit: Add tests for csum_ipv6_magic and ip_fast_csum
  riscv: Add checksum library
  riscv: Add checksum header
  riscv: Add static key for misaligned accesses
  asm-generic: Improve csum_fold
  RISC-V: selftests: cbo: Ensure asm operands match constraints
  ...

21 months agoMerge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 20 Jan 2024 17:42:32 +0000 (09:42 -0800)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Final round of fixes that came in too late to send in the first
  request.

  It's nine bug fixes and one version update (because of a bug fix) and
  one set of PCI ID additions. There's one bug fix in the core which is
  really a one liner (except that an additional sdev pointer was added
  for convenience) and the rest are in drivers"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: target: core: Add TMF to tmr_list handling
  scsi: core: Kick the requeue list after inserting when flushing
  scsi: fnic: unlock on error path in fnic_queuecommand()
  scsi: fcoe: Fix unsigned comparison with zero in store_ctlr_mode()
  scsi: mpi3mr: Fix mpi3mr_fw.c kernel-doc warnings
  scsi: smartpqi: Bump driver version to 2.1.26-030
  scsi: smartpqi: Fix logical volume rescan race condition
  scsi: smartpqi: Add new controller PCI IDs
  scsi: ufs: qcom: Remove unnecessary goto statement from ufs_qcom_config_esi()
  scsi: ufs: core: Remove the ufshcd_hba_exit() call from ufshcd_async_scan()
  scsi: ufs: core: Simplify power management during async scan

21 months agoMerge tag 'sh-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubit...
Linus Torvalds [Sat, 20 Jan 2024 17:24:06 +0000 (09:24 -0800)]
Merge tag 'sh-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux

Pull sh updates from John Paul Adrian Glaubitz:
 "Since the large patch series to convert arch/sh to device tree support
  has not been finalized yet due to various maintainers still asking for
  changes to the series, this ended up being rather small consisting of
  just two fixes.

  The first patch by Geert Uytterhoeven addresses a build failure in the
  EcoVec platform code. And the second patch by Masahiro Yamada removes
  an unnecessary $(foreach ...) found in a Makefile of the vsyscall
  code.

   - Rename missed backlight field from fbdev to dev

   - Remove unnecessary $(foreach ...)"

* tag 'sh-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
  sh: vsyscall: Remove unnecessary $(foreach ...)
  sh: ecovec24: Rename missed backlight field from fbdev to dev

21 months agoMerge tag 'fbdev-for-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 20 Jan 2024 17:14:04 +0000 (09:14 -0800)]
Merge tag 'fbdev-for-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev

Pull fbdev fix from Helge Deller:
 "There were various reports from people without any graphics output on
  the screen and it turns out one commit triggers the problem.

   - Revert 'firmware/sysfb: Clear screen_info state after consuming it'"

* tag 'fbdev-for-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  Revert "firmware/sysfb: Clear screen_info state after consuming it"

21 months agoMerge tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Fri, 19 Jan 2024 22:25:23 +0000 (14:25 -0800)]
Merge tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "Add Namhyung Kim as tools/perf/ co-maintainer, we're taking turns
  processing patches, switching roles from perf-tools to perf-tools-next
  at each Linux release.

  Data profiling:

   - Associate samples that identify loads and stores with data
     structures. This uses events available on Intel, AMD and others and
     DWARF info:

       # To get memory access samples in kernel for 1 second (on Intel)
       $ perf mem record -a -K --ldlat=4 -- sleep 1

       # Similar for the AMD (but it requires 6.3+ kernel for BPF filters)
       $ perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000' -- sleep 1

     Then, amongst several modes of post processing, one can do things like:

       $ perf report -s type,typeoff --hierarchy --group --stdio
       ...
       #
       # Samples: 10K of events 'cpu/mem-loads,ldlat=4/P, cpu/mem-stores/P, dummy:u'
       # Event count (approx.): 602758064
       #
       #                    Overhead  Data Type / Data Type Offset
       # ...........................  ............................
       #
           26.09%   3.28%   0.00%     long unsigned int
              26.09%   3.28%   0.00%     long unsigned int +0 (no field)
           18.48%   0.73%   0.00%     struct page
              10.83%   0.02%   0.00%     struct page +8 (lru.next)
               3.90%   0.28%   0.00%     struct page +0 (flags)
               3.45%   0.06%   0.00%     struct page +24 (mapping)
               0.25%   0.28%   0.00%     struct page +48 (_mapcount.counter)
               0.02%   0.06%   0.00%     struct page +32 (index)
               0.02%   0.00%   0.00%     struct page +52 (_refcount.counter)
               0.02%   0.01%   0.00%     struct page +56 (memcg_data)
               0.00%   0.01%   0.00%     struct page +16 (lru.prev)
           15.37%  17.54%   0.00%     (stack operation)
              15.37%  17.54%   0.00%     (stack operation) +0 (no field)
           11.71%  50.27%   0.00%     (unknown)
              11.71%  50.27%   0.00%     (unknown) +0 (no field)

       $ perf annotate --data-type
       ...
       Annotate type: 'struct cfs_rq' in [kernel.kallsyms] (13 samples):
       ============================================================================
           samples     offset       size  field
                13          0        640  struct cfs_rq         {
                 2          0         16      struct load_weight       load {
                 2          0          8          unsigned long        weight;
                 0          8          4          u32  inv_weight;
                                              };
                 0         16          8      unsigned long    runnable_weight;
                 0         24          4      unsigned int     nr_running;
                 1         28          4      unsigned int     h_nr_running;
       ...

       $ perf annotate --data-type=page --group
       Annotate type: 'struct page' in [kernel.kallsyms] (480 samples):
        event[0] = cpu/mem-loads,ldlat=4/P
        event[1] = cpu/mem-stores/P
        event[2] = dummy:u
       ===================================================================================
                samples  offset  size  field
       447  33        0       0    64  struct page     {
       108   8        0       0     8  long unsigned int  flags;
       319  13        0       8    40  union       {
       319  13        0       8    40          struct          {
       236   2        0       8    16              union       {
       236   2        0       8    16                  struct list_head       lru {
       236   1        0       8     8                      struct list_head*  next;
         0   1        0      16     8                      struct list_head*  prev;
                                                       };
       236   2        0       8    16                  struct          {
       236   1        0       8     8                      void*      __filler;
         0   1        0      16     4                      unsigned int       mlock_count;
                                                       };
       236   2        0       8    16                  struct list_head       buddy_list {
       236   1        0       8     8                      struct list_head*  next;
         0   1        0      16     8                      struct list_head*  prev;
                                                       };
       236   2        0       8    16                  struct list_head       pcp_list {
       236   1        0       8     8                      struct list_head*  next;
         0   1        0      16     8                      struct list_head*  prev;
                                                       };
                                                   };
        82   4        0      24     8              struct address_space*      mapping;
         1   7        0      32     8              union       {
         1   7        0      32     8                  long unsigned int      index;
         1   7        0      32     8                  long unsigned int      share;
                                                   };
         0   0        0      40     8              long unsigned int  private;
                                                                 };

     This uses the existing annotate code, calling objdump to do the
     disassembly, with improvements to avoid having this take too long,
     but longer term a switch to a disassembler library, possibly
     reusing code in the kernel will be pursued.

     This is the initial implementation, please use it and report
     impressions and bugs. Make sure the kernel-debuginfo packages match
     the running kernel. The 'perf report' phase for non short perf.data
     files may take a while.

     There is a great article about it on LWN:

       https://lwn.net/Articles/955709/ - "Data-type profiling for perf"

     One last test I did while writing this text, on a AMD Ryzen 5950X,
     using a distro kernel, while doing a simple 'find /' on an
     otherwise idle system resulted in:

     # uname -r
     6.6.9-100.fc38.x86_64
     # perf -vv | grep BPF_
                      bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
            bpf_skeletons: [ on  ]  # HAVE_BPF_SKEL
     # rpm -qa | grep kernel-debuginfo
     kernel-debuginfo-common-x86_64-6.6.9-100.fc38.x86_64
     kernel-debuginfo-6.6.9-100.fc38.x86_64
     #
     # perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000'
     ^C[ perf record: Woken up 1 times to write data ]
     [ perf record: Captured and wrote 2.199 MB perf.data (2913 samples) ]
     #
     # ls -la perf.data
     -rw-------. 1 root root 2346486 Jan  9 18:36 perf.data
     # perf evlist
     ibs_op//
     dummy:u
     # perf evlist -v
     ibs_op//: type: 11, size: 136, config: 0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1
     dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
     #
     # perf report -s type,typeoff --hierarchy --group --stdio
     # Total Lost Samples: 0
     #
     # Samples: 2K of events 'ibs_op//, dummy:u'
     # Event count (approx.): 1904553038
     #
     #            Overhead  Data Type / Data Type Offset
     # ...................  ............................
     #
         73.70%   0.00%     (unknown)
            73.70%   0.00%     (unknown) +0 (no field)
          3.01%   0.00%     long unsigned int
             3.00%   0.00%     long unsigned int +0 (no field)
             0.01%   0.00%     long unsigned int +2 (no field)
          2.73%   0.00%     struct task_struct
             1.71%   0.00%     struct task_struct +52 (on_cpu)
             0.38%   0.00%     struct task_struct +2104 (rcu_read_unlock_special.b.blocked)
             0.23%   0.00%     struct task_struct +2100 (rcu_read_lock_nesting)
             0.14%   0.00%     struct task_struct +2384 ()
             0.06%   0.00%     struct task_struct +3096 (signal)
             0.05%   0.00%     struct task_struct +3616 (cgroups)
             0.05%   0.00%     struct task_struct +2344 (active_mm)
             0.02%   0.00%     struct task_struct +46 (flags)
             0.02%   0.00%     struct task_struct +2096 (migration_disabled)
             0.01%   0.00%     struct task_struct +24 (__state)
             0.01%   0.00%     struct task_struct +3956 (mm_cid_active)
             0.01%   0.00%     struct task_struct +1048 (cpus_ptr)
             0.01%   0.00%     struct task_struct +184 (se.group_node.next)
             0.01%   0.00%     struct task_struct +20 (thread_info.cpu)
             0.00%   0.00%     struct task_struct +104 (on_rq)
             0.00%   0.00%     struct task_struct +2456 (pid)
          1.36%   0.00%     struct module
             0.59%   0.00%     struct module +952 (kallsyms)
             0.42%   0.00%     struct module +0 (state)
             0.23%   0.00%     struct module +8 (list.next)
             0.12%   0.00%     struct module +216 (syms)
          0.95%   0.00%     struct inode
             0.41%   0.00%     struct inode +40 (i_sb)
             0.22%   0.00%     struct inode +0 (i_mode)
             0.06%   0.00%     struct inode +76 (i_rdev)
             0.06%   0.00%     struct inode +56 (i_security)
     <SNIP>

  perf top/report:

   - Don't ignore job control, allowing control+Z + bg to work.

   - Add s390 raw data interpretation for PAI (Processor Activity
     Instrumentation) counters.

  perf archive:

   - Add new option '--all' to pack perf.data with DSOs.

   - Add new option '--unpack' to expand tarballs.

  Initialization speedups:

   - Lazily initialize zstd streams to save memory when not using it.

   - Lazily allocate/size mmap event copy.

   - Lazy load kernel symbols in 'perf record'.

   - Be lazier in allocating lost samples buffer in 'perf record'.

   - Don't synthesize BPF events when disabled via the command line
     (perf record --no-bpf-event).

  Assorted improvements:

   - Show note on AMD systems that the :p, :pp, :ppp and :P are all the
     same, as IBS (Instruction Based Sampling) is used and it is
     inherentely precise, not having levels of precision like in Intel
     systems.

   - When 'cycles' isn't available, fall back to the "task-clock" event
     when not system wide, not to 'cpu-clock'.

   - Add --debug-file option to redirect debug output, e.g.:

       $ perf --debug-file /tmp/perf.log record -v true

   - Shrink 'struct map' to under one cacheline by avoiding function
     pointers for selecting if addresses are identity or DSO relative,
     and using just a byte for some boolean struct members.

   - Resolve the arch specific strerrno just once to use in
     perf_env__arch_strerrno().

   - Reduce memory for recording PERF_RECORD_LOST_SAMPLES event.

  Assorted fixes:

   - Fix the default 'perf top' usage on Intel hybrid systems, now it
     starts with a browser showing the number of samples for Efficiency
     (cpu_atom/cycles/P) and Performance (cpu_core/cycles/P). This
     behaviour is similar on ARM64, with its respective set of
     big.LITTLE processors.

   - Fix segfault on build_mem_topology() error path.

   - Fix 'perf mem' error on hybrid related to availability of mem event
     in a PMU.

   - Fix missing reference count gets (map, maps) in the db-export code.

   - Avoid recursively taking env->bpf_progs.lock in the 'perf_env'
     code.

   - Use the newly introduced maps__for_each_map() to add missing
     locking around iteration of 'struct map' entries.

   - Parse NOTE segments until the build id is found, don't stop on the
     first one, ELF files may have several such NOTE segments.

   - Remove 'egrep' usage, its deprecated, use 'grep -E' instead.

   - Warn first about missing libelf, not libbpf, that depends on
     libelf.

   - Use alternative to 'find ... -printf' as this isn't supported in
     busybox.

   - Address python 3.6 DeprecationWarning for string scapes.

   - Fix memory leak in uniq() in libsubcmd.

   - Fix man page formatting for 'perf lock'

   - Fix some spelling mistakes.

  perf tests:

   - Fail shell tests that needs some symbol in perf itself if it is
     stripped. These tests check if a symbol is resolved, if some hot
     function is indeed detected by profiling, etc.

   - The 'perf test sigtrap' test is currently failing on PREEMPT_RT,
     skip it if sleeping spinlocks are detected (using BTF) and point to
     the mailing list discussion about it. This test is also being
     skipped on several architectures (powerpc, s390x, arm and aarch64)
     due to other pending issues with intruction breakpoints.

   - Adjust test case perf record offcpu profiling tests for s390.

   - Fix 'Setup struct perf_event_attr' fails on s390 on z/VM guest,
     addressing issues caused by the fallback from cycles to task-clock
     done in this release.

   - Fix mask for VG register in the user-regs test.

   - Use shellcheck on 'perf test' shell scripts automatically to make
     sure changes don't introduce things it flags as problematic.

   - Add option to change objdump binary and allow it to be set via
     'perf config'.

   - Add basic 'perf script', 'perf list --json" and 'perf diff' tests.

   - Basic branch counter support.

   - Make DSO tests a suite rather than individual.

   - Remove atomics from test_loop to avoid test failures.

   - Fix call chain match on powerpc for the record+probe_libc_inet_pton
     test.

   - Improve Intel hybrid tests.

  Vendor event files (JSON):

  powerpc:

   - Update datasource event name to fix duplicate events on IBM's
     Power10.

   - Add PVN for HX-C2000 CPU with Power8 Architecture.

  Intel:

   - Alderlake/rocketlake metric fixes.

   - Update emeraldrapids events to v1.02.

   - Update icelakex events to v1.23.

   - Update sapphirerapids events to v1.17.

   - Add skx, clx, icx and spr upi bandwidth metric.

  AMD:

   - Add Zen 4 memory controller events.

  RISC-V:

   - Add StarFive Dubhe-80 and Dubhe-90 JSON files.
       https://www.starfivetech.com/en/site/cpu-u

   - Add T-HEAD C9xx JSON file.
       https://github.com/riscv-software-src/opensbi/blob/master/docs/platform/thead-c9xx.md

  ARM64:

   - Remove UTF-8 characters from cmn.json, that were causing build
     failure in some distros.

   - Add core PMU events and metrics for Ampere One X.

   - Rename Ampere One's BPU_FLUSH_MEM_FAULT to GPC_FLUSH_MEM_FAULT

  libperf:

   - Rename several perf_cpu_map constructor names to clarify what they
     really do.

   - Ditto for some other methods, coping with some issues in their
     semantics, like perf_cpu_map__empty() ->
     perf_cpu_map__has_any_cpu_or_is_empty().

   - Document perf_cpu_map__nr()'s behavior

  perf stat:

   - Exit if parse groups fails.

   - Combine the -A/--no-aggr and --no-merge options.

   - Fix help message for --metric-no-threshold option.

  Hardware tracing:

  ARM64 CoreSight:

   - Bump minimum OpenCSD version to ensure a bugfix is present.

   - Add 'T' itrace option for timestamp trace

   - Set start vm addr of exectable file to 0 and don't ignore first
     sample on the arm-cs-trace-disasm.py 'perf script'"

* tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (179 commits)
  MAINTAINERS: Add Namhyung as tools/perf/ co-maintainer
  perf test: test case 'Setup struct perf_event_attr' fails on s390 on z/vm
  perf db-export: Fix missing reference count get in call_path_from_sample()
  perf tests: Add perf script test
  libsubcmd: Fix memory leak in uniq()
  perf TUI: Don't ignore job control
  perf vendor events intel: Update sapphirerapids events to v1.17
  perf vendor events intel: Update icelakex events to v1.23
  perf vendor events intel: Update emeraldrapids events to v1.02
  perf vendor events intel: Alderlake/rocketlake metric fixes
  perf x86 test: Add hybrid test for conflicting legacy/sysfs event
  perf x86 test: Update hybrid expectations
  perf vendor events amd: Add Zen 4 memory controller events
  perf stat: Fix hard coded LL miss units
  perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event
  perf env: Avoid recursively taking env->bpf_progs.lock
  perf annotate: Add --insn-stat option for debugging
  perf annotate: Add --type-stat option for debugging
  perf annotate: Support event group display
  perf annotate: Add --data-type option
  ...

21 months agoMerge tag 'strlcpy-removal-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 19 Jan 2024 21:49:16 +0000 (13:49 -0800)]
Merge tag 'strlcpy-removal-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull strlcpy removal from Kees Cook:
 "As promised, this is 'part 2' of the hardening tree, late in -rc1 now
  that all the other trees with strlcpy() removals have landed. One new
  user appeared (in bcachefs) but was a trivial refactor. The kernel is
  now free of the strlcpy() API!

   - Remove of the final (very recent) user of strlcpy() (in bcachefs)

   - Remove the strlcpy() API. Long live strscpy()"

* tag 'strlcpy-removal-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  string: Remove strlcpy()
  bcachefs: Replace strlcpy() with strscpy()

21 months agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 19 Jan 2024 21:36:15 +0000 (13:36 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "I think the main one is fixing the dynamic SCS patching when full LTO
  is enabled (clang was silently getting this horribly wrong), but it's
  all good stuff.

  Rob just pointed out that the fix to the workaround for erratum
  #2966298 might not be necessary, but in the worst case it's harmless
  and since the official description leaves a little to be desired here,
  I've left it in.

  Summary:

   - Fix shadow call stack patching with LTO=full

   - Fix voluntary preemption of the FPSIMD registers from assembly code

   - Fix workaround for A520 CPU erratum #2966298 and extend to A510

   - Fix SME issues that resulted in corruption of the register state

   - Minor fixes (missing includes, formatting)"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Fix silcon-errata.rst formatting
  arm64/sme: Always exit sme_alloc() early with existing storage
  arm64/fpsimd: Remove spurious check for SVE support
  arm64/ptrace: Don't flush ZA/ZT storage when writing ZA via ptrace
  arm64: entry: simplify kernel_exit logic
  arm64: entry: fix ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
  arm64: errata: Add Cortex-A510 speculative unprivileged load workaround
  arm64: Rename ARM64_WORKAROUND_2966298
  arm64: fpsimd: Bring cond_yield asm macro in line with new rules
  arm64: scs: Work around full LTO issue with dynamic SCS
  arm64: irq: include <linux/cpumask.h>

21 months agoMerge tag 'loongarch-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai...
Linus Torvalds [Fri, 19 Jan 2024 21:30:49 +0000 (13:30 -0800)]
Merge tag 'loongarch-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch updates from Huacai Chen:

 - Raise minimum clang version to 18.0.0

 - Enable initial Rust support for LoongArch

 - Add built-in dtb support for LoongArch

 - Use generic interface to support crashkernel=X,[high,low]

 - Some bug fixes and other small changes

 - Update the default config file.

* tag 'loongarch-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (22 commits)
  MAINTAINERS: Add BPF JIT for LOONGARCH entry
  LoongArch: Update Loongson-3 default config file
  LoongArch: BPF: Prevent out-of-bounds memory access
  LoongArch: BPF: Support 64-bit pointers to kfuncs
  LoongArch: Fix definition of ftrace_regs_set_instruction_pointer()
  LoongArch: Use generic interface to support crashkernel=X,[high,low]
  LoongArch: Fix and simplify fcsr initialization on execve()
  LoongArch: Let cores_io_master cover the largest NR_CPUS
  LoongArch: Change SHMLBA from SZ_64K to PAGE_SIZE
  LoongArch: Add a missing call to efi_esrt_init()
  LoongArch: Parsing CPU-related information from DTS
  LoongArch: dts: DeviceTree for Loongson-2K2000
  LoongArch: dts: DeviceTree for Loongson-2K1000
  LoongArch: dts: DeviceTree for Loongson-2K0500
  LoongArch: Allow device trees be built into the kernel
  dt-bindings: interrupt-controller: loongson,liointc: Fix dtbs_check warning for interrupt-names
  dt-bindings: interrupt-controller: loongson,liointc: Fix dtbs_check warning for reg-names
  dt-bindings: loongarch: Add Loongson SoC boards compatibles
  dt-bindings: loongarch: Add CPU bindings for LoongArch
  LoongArch: Enable initial Rust support
  ...

21 months agoRevert "firmware/sysfb: Clear screen_info state after consuming it"
Helge Deller [Fri, 19 Jan 2024 20:47:15 +0000 (21:47 +0100)]
Revert "firmware/sysfb: Clear screen_info state after consuming it"

This reverts commit df67699c9cb0ceb70f6cc60630ca938c06773eda.

Jens Axboe reported a regression that his machine is failing to show a
console, or in fact anything, on current -git. There's no output and no
console after:

Loading Linux 6.7.0+ ...
Loading initial ramdisk ...

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
21 months agoMerge tag 'devicetree-for-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 19 Jan 2024 21:00:45 +0000 (13:00 -0800)]
Merge tag 'devicetree-for-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree header detangling from Rob Herring:
 "Remove the circular including of of_device.h and of_platform.h along
  with all of their implicit includes.

  This is the culmination of several kernel cycles worth of fixing
  implicit DT includes throughout the tree"

* tag 'devicetree-for-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: Stop circularly including of_device.h and of_platform.h
  clk: qcom: gcc-x1e80100: Replace of_device.h with explicit includes
  thermal: loongson2: Replace of_device.h with explicit includes
  net: can: Use device_get_match_data()
  sparc: Use device_get_match_data()

21 months agococcinelle: device_attr_show: Adapt to the latest Documentation/filesystems/sysfs.rst
Li Zhijian [Fri, 19 Jan 2024 06:20:57 +0000 (14:20 +0800)]
coccinelle: device_attr_show: Adapt to the latest Documentation/filesystems/sysfs.rst

Adapt description, warning message and MODE=patch according to the latest
Documentation/filesystems/sysfs.rst:
> show() should only use sysfs_emit() or sysfs_emit_at() when formatting
> the value to be returned to user space.

After this patch:
When MODE=report,
 $ make coccicheck COCCI=scripts/coccinelle/api/device_attr_show.cocci M=drivers/hid/hid-picolcd_core.c MODE=report
 <...snip...>
 drivers/hid/hid-picolcd_core.c:304:8-16: WARNING: please use sysfs_emit or sysfs_emit_at
 drivers/hid/hid-picolcd_core.c:259:9-17: WARNING: please use sysfs_emit or sysfs_emit_at

When MODE=patch,
 $ make coccicheck COCCI=scripts/coccinelle/api/device_attr_show.cocci M=drivers/hid/hid-picolcd_core.c MODE=patch
 <...snip...>
 diff -u -p a/drivers/hid/hid-picolcd_core.c b/drivers/hid/hid-picolcd_core.c
 --- a/drivers/hid/hid-picolcd_core.c
 +++ b/drivers/hid/hid-picolcd_core.c
 @@ -255,10 +255,12 @@ static ssize_t picolcd_operation_mode_sh
  {
         struct picolcd_data *data = dev_get_drvdata(dev);

 -       if (data->status & PICOLCD_BOOTLOADER)
 -               return snprintf(buf, PAGE_SIZE, "[bootloader] lcd\n");
 -       else
 -               return snprintf(buf, PAGE_SIZE, "bootloader [lcd]\n");
 +       if (data->status & PICOLCD_BOOTLOADER) {
 +               return sysfs_emit(buf, "[bootloader] lcd\n");
 +       }
 +       else {
 +               return sysfs_emit(buf, "bootloader [lcd]\n");
 +       }
  }

  static ssize_t picolcd_operation_mode_store(struct device *dev,
 @@ -301,7 +303,7 @@ static ssize_t picolcd_operation_mode_de
  {
         struct picolcd_data *data = dev_get_drvdata(dev);

 -       return snprintf(buf, PAGE_SIZE, "hello world\n");
 +       return sysfs_emit(buf, "hello world\n");
  }

  static ssize_t picolcd_operation_mode_delay_store(struct device *dev,

CC: Julia Lawall <Julia.Lawall@inria.fr>
CC: Nicolas Palix <nicolas.palix@imag.fr>
CC: cocci@inria.fr
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
21 months agoMerge tag 'spi-fix-v6.8-merge-window' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 19 Jan 2024 20:50:09 +0000 (12:50 -0800)]
Merge tag 'spi-fix-v6.8-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fix from Mark Brown:
 "One simple fix for the device unbind path in the Coldfire driver.

  A conversion to use a combined get/enable helper missed removing a
  disable"

* tag 'spi-fix-v6.8-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: coldfire-qspi: Remove an erroneous clk_disable_unprepare() from the remove function

21 months agoMerge tag 'sound-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 19 Jan 2024 20:30:29 +0000 (12:30 -0800)]
Merge tag 'sound-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A collection of small fixes:

   - Lots of ASoC SOF fixes and related reworks

   - ASoC TAS codec fixes including DT updates

   - A few HD-audio quirks and regression fixes

   - Minor fixes for aloop, oxygen and scarlett2 mixer"

* tag 'sound-fix-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits)
  ALSA: hda/realtek: Enable headset mic on Lenovo M70 Gen5
  ALSA: hda/realtek: Enable mute/micmute LEDs and limit mic boost on HP ZBook
  ALSA: hda/relatek: Enable Mute LED on HP Laptop 15s-fq2xxx
  ASoC: SOF: ipc4-loader: remove the CPC check warnings
  ASoC: SOF: ipc4-pcm: remove log message for LLP
  ALSA: hda: generic: Remove obsolete call to ledtrig_audio_get
  ALSA: scarlett2: Fix yet more -Wformat-truncation warnings
  ALSA: hda: Properly setup HDMI stream
  ASoC: audio-graph-card2: fix index check on graph_parse_node_multi_nm()
  ASoC: SOF: icp3-dtrace: Revert "Fix wrong kfree() usage"
  ALSA: oxygen: Fix right channel of capture volume mixer
  ALSA: aloop: Introduce a function to get if access is interleaved mode
  ASoC: mediatek: sof-common: Add NULL check for normal_link string
  ASoC: mediatek: mt8195: Remove afe-dai component and rework codec link
  ASoC: mediatek: mt8192: Check existence of dai_name before dereferencing
  ASoC: Intel: bxt_rt298: Fix kernel ops due to COMP_DUMMY change
  ASoC: Intel: bxt_da7219_max98357a: Fix kernel ops due to COMP_DUMMY change
  ASoC: codecs: rtq9128: Fix TDM enable and DAI format control flow
  ASoC: codecs: rtq9128: Fix PM_RUNTIME usage
  ASoC: tas2781: Add tas2563 into driver
  ...

21 months agostring: Remove strlcpy()
Kees Cook [Thu, 18 Jan 2024 20:31:55 +0000 (12:31 -0800)]
string: Remove strlcpy()

With all the users of strlcpy() removed[1] from the kernel, remove the
API, self-tests, and other references. Leave mentions in Documentation
(about its deprecation), and in checkpatch.pl (to help migrate host-only
tools/ usage). Long live strscpy().

Link: https://github.com/KSPP/linux/issues/89
Cc: Azeem Shaikh <azeemshaikh38@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Joe Perches <joe@perches.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: linux-hardening@vger.kernel.org
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
21 months agoMerge tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 19 Jan 2024 19:50:00 +0000 (11:50 -0800)]
Merge tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm

Pull more drm fixes from Dave Airlie:
 "This is mostly amdgpu and xe fixes, with an amdkfd and nouveau fix
  thrown in.

  The amdgpu ones are just the usual couple of weeks of fixes. The xe
  ones are bunch of cleanups for the new xe driver, the fix you put in
  on the merge commit and the kconfig fix that was hiding the problem
  from me.

  amdgpu:
   - DSC fixes
   - DC resource pool fixes
   - OTG fix
   - DML2 fixes
   - Aux fix
   - GFX10 RLC firmware handling fix
   - Revert a broken workaround for SMU 13.0.2
   - DC writeback fix
   - Enable gfxoff when ROCm apps are active on gfx11 with the proper FW
     version

  amdkfd:
   - Fix dma-buf exports using GEM handles

  nouveau:
   - fix a unneeded WARN_ON triggering

  xe:
   - Fix for definition of wakeref_t
   - Fix for an error code aliasing
   - Fix for VM_UNBIND_ALL in the case there are no bound VMAs
   - Fixes for a number of __iomem address space mismatches reported by
     sparse
   - Fixes for the assignment of exec_queue priority
   - A Fix for skip_guc_pc not taking effect
   - Workaround for a build problem on GCC 11
   - A couple of fixes for error paths
   - Fix a Flat CCS compression metadata copy issue
   - Fix a misplace array bounds checking
   - Don't have display support depend on EXPERT (as discussed on IRC)"

* tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm: (71 commits)
  nouveau/vmm: don't set addr on the fail path to avoid warning
  drm/amdgpu: Enable GFXOFF for Compute on GFX11
  drm/amd/display: Drop 'acrtc' and add 'new_crtc_state' NULL check for writeback requests.
  drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"
  drm/amdkfd: init drm_client with funcs hook
  drm/amd/display: Fix a switch statement in populate_dml_output_cfg_from_stream_state()
  drm/amdgpu: Fix the null pointer when load rlc firmware
  drm/amd/display: Align the returned error code with legacy DP
  drm/amd/display: Fix DML2 watermark calculation
  drm/amd/display: Clear OPTC mem select on disable
  drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A
  drm/amd/display: Add logging resource checks
  drm/amd/display: Init link enc resources in dc_state only if res_pool presents
  drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()'
  drm/amd/display: Avoid enum conversion warning
  drm/amd/pm: Fix smuv13.0.6 current clock reporting
  drm/amd/pm: Add error log for smu v13.0.6 reset
  drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'
  drm/amdgpu: drop exp hw support check for GC 9.4.3
  drm/amdgpu: move debug options init prior to amdgpu device init
  ...

21 months agoMerge tag 'for-v6.8-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux...
Linus Torvalds [Fri, 19 Jan 2024 19:34:19 +0000 (11:34 -0800)]
Merge tag 'for-v6.8-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply

Pull power supply and reset updates from Sebastian Reichel:
 "New features:
   - bq24190: Add support for BQ24296 charger

  Cleanups:
   - all reset drivers: Stop using module_platform_driver_probe()
   - gpio-restart: use devm_register_sys_off_handler
   - pwr-mlxbf: support graceful reboot
   - cw2015: correct time_to_empty units
   - qcom-battmgr: Fix driver initialization sequence
   - bq27xxx: Start/Stop delayed work in suspend/resume
   - minor cleanups and fixes"

* tag 'for-v6.8-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (33 commits)
  power: supply: bq24190_charger: Fix "initializer element is not constant" error
  power: supply: bq24190_charger: Add support for BQ24296
  dt-bindings: power: supply: bq24190: Add BQ24296 compatible
  dt-bindings: power: reset: xilinx: Rename node names in examples
  power: supply: qcom_battmgr: Register the power supplies after PDR is up
  dt-bindings: power: reset: qcom-pon: fix inconsistent example
  power: supply: Fix null pointer dereference in smb2_probe
  power: reset: at91: Drop '__init' from at91_wakeup_status()
  power: supply: Use multiple MODULE_AUTHOR statements
  power: supply: Fix indentation and some other warnings
  power: reset: gpio-restart: Use devm_register_sys_off_handler()
  power: supply: bq256xx: fix some problem in bq256xx_hw_init
  power: supply: cw2015: correct time_to_empty units in sysfs
  power: reset: at91-sama5d2_shdwc: Convert to platform remove callback returning void
  power: reset: at91-reset: Convert to platform remove callback returning void
  power: reset: tps65086-restart: Convert to platform remove callback returning void
  power: reset: syscon-poweroff: Convert to platform remove callback returning void
  power: reset: rmobile-reset: Convert to platform remove callback returning void
  power: reset: restart-poweroff: Convert to platform remove callback returning void
  power: reset: regulator-poweroff: Convert to platform remove callback returning void
  ...

21 months agoMerge tag 'apparmor-pr-2024-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 19 Jan 2024 18:53:55 +0000 (10:53 -0800)]
Merge tag 'apparmor-pr-2024-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor

Pull AppArmor updates from John Johansen:
 "This adds a single feature, switch the hash used to check policy from
  sha1 to sha256

  There are fixes for two memory leaks, and refcount bug and a potential
  crash when a profile name is empty. Along with a couple minor code
  cleanups.

  Summary:

  Features
   - switch policy hash from sha1 to sha256

  Bug Fixes
   - Fix refcount leak in task_kill
   - Fix leak of pdb objects and trans_table
   - avoid crash when parse profie name is empty

  Cleanups
   - add static to stack_msg and nulldfa
   - more kernel-doc cleanups"

* tag 'apparmor-pr-2024-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
  apparmor: Fix memory leak in unpack_profile()
  apparmor: avoid crash when parsed profile name is empty
  apparmor: fix possible memory leak in unpack_trans_table
  apparmor: free the allocated pdb objects
  apparmor: Fix ref count leak in task_kill
  apparmor: cleanup network hook comments
  apparmor: add missing params to aa_may_ptrace kernel-doc comments
  apparmor: declare nulldfa as static
  apparmor: declare stack_msg as static
  apparmor: switch SECURITY_APPARMOR_HASH from sha1 to sha256

21 months agoMerge tag 'ceph-for-6.8-rc1' of https://github.com/ceph/ceph-client
Linus Torvalds [Fri, 19 Jan 2024 17:58:55 +0000 (09:58 -0800)]
Merge tag 'ceph-for-6.8-rc1' of https://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "Assorted CephFS fixes and cleanups with nothing standing out"

* tag 'ceph-for-6.8-rc1' of https://github.com/ceph/ceph-client:
  ceph: get rid of passing callbacks in __dentry_leases_walk()
  ceph: d_obtain_{alias,root}(ERR_PTR(...)) will do the right thing
  ceph: fix invalid pointer access if get_quota_realm return ERR_PTR
  ceph: remove duplicated code in ceph_netfs_issue_read()
  ceph: send oldest_client_tid when renewing caps
  ceph: rename create_session_open_msg() to create_session_full_msg()
  ceph: select FS_ENCRYPTION_ALGS if FS_ENCRYPTION
  ceph: fix deadlock or deadcode of misusing dget()
  ceph: try to allocate a smaller extent map for sparse read
  libceph: remove MAX_EXTENTS check for sparse reads
  ceph: reinitialize mds feature bit even when session in open
  ceph: skip reconnecting if MDS is not ready

21 months agoMerge tag 'xfs-6.8-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Fri, 19 Jan 2024 17:57:08 +0000 (09:57 -0800)]
Merge tag 'xfs-6.8-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fix from Chandan Babu:

 - Fix per-inode space accounting bug

* tag 'xfs-6.8-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix backwards logic in xfs_bmap_alloc_account