Melody Olvera [Mon, 12 May 2025 20:54:42 +0000 (13:54 -0700)]
soc: qcom: llcc-qcom: Add support for LLCC V6
Add support for LLCC V6. V6 adds several additional usecase IDs,
rearrages several registers and offsets, and supports slice IDs
over 31, so add a new function for programming LLCC V6.
Kyoji Ogasawara [Fri, 9 May 2025 10:26:31 +0000 (19:26 +0900)]
btrfs: add back warning for mount option commit values exceeding 300
The Btrfs documentation states that if the commit value is greater than
300 a warning should be issued. The warning was accidentally lost in the
new mount API update.
Fixes: 6941823cc878 ("btrfs: remove old mount API code") CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Kyoji Ogasawara <sawara04.o@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Boris Burkov [Wed, 7 May 2025 19:42:24 +0000 (12:42 -0700)]
btrfs: fix folio leak in submit_one_async_extent()
If btrfs_reserve_extent() fails while submitting an async_extent for a
compressed write, then we fail to call free_async_extent_pages() on the
async_extent and leak its folios. A likely cause for such a failure
would be btrfs_reserve_extent() failing to find a large enough
contiguous free extent for the compressed extent.
I was able to reproduce this by:
1. mount with compress-force=zstd:3
2. fallocating most of a filesystem to a big file
3. fragmenting the remaining free space
4. trying to copy in a file which zstd would generate large compressed
extents for (vmlinux worked well for this)
Step 4. hits the memory leak and can be repeated ad nauseam to
eventually exhaust the system memory.
Fix this by detecting the case where we fallback to uncompressed
submission for a compressed async_extent and ensuring that we call
free_async_extent_pages().
Fixes: 131a821a243f ("btrfs: fallback if compressed IO fails for ENOSPC") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Co-developed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Mon, 5 May 2025 15:03:16 +0000 (16:03 +0100)]
btrfs: fix discard worker infinite loop after disabling discard
If the discard worker is running and there's currently only one block
group, that block group is a data block group, it's in the unused block
groups discard list and is being used (it got an extent allocated from it
after becoming unused), the worker can end up in an infinite loop if a
transaction abort happens or the async discard is disabled (during remount
or unmount for example).
This happens like this:
1) Task A, the discard worker, is at peek_discard_list() and
find_next_block_group() returns block group X;
2) Block group X is in the unused block groups discard list (its discard
index is BTRFS_DISCARD_INDEX_UNUSED) since at some point in the past
it become an unused block group and was added to that list, but then
later it got an extent allocated from it, so its ->used counter is not
zero anymore;
3) The current transaction is aborted by task B and we end up at
__btrfs_handle_fs_error() in the transaction abort path, where we call
btrfs_discard_stop(), which clears BTRFS_FS_DISCARD_RUNNING from
fs_info, and then at __btrfs_handle_fs_error() we set the fs to RO mode
(setting SB_RDONLY in the super block's s_flags field);
4) Task A calls __add_to_discard_list() with the goal of moving the block
group from the unused block groups discard list into another discard
list, but at __add_to_discard_list() we end up doing nothing because
btrfs_run_discard_work() returns false, since the super block has
SB_RDONLY set in its flags and BTRFS_FS_DISCARD_RUNNING is not set
anymore in fs_info->flags. So block group X remains in the unused block
groups discard list;
5) Task A then does a goto into the 'again' label, calls
find_next_block_group() again we gets block group X again. Then it
repeats the previous steps over and over since there are not other
block groups in the discard lists and block group X is never moved
out of the unused block groups discard list since
btrfs_run_discard_work() keeps returning false and therefore
__add_to_discard_list() doesn't move block group X out of that discard
list.
When this happens we can get a soft lockup report like this:
So fix this by making sure we move a block group out of the unused block
groups discard list when calling __add_to_discard_list().
Fixes: 2bee7eb8bb81 ("btrfs: discard one region at a time in async discard") Link: https://bugzilla.suse.com/show_bug.cgi?id=1242012 CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Daniel Vacek <neelx@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Sudeep Holla [Mon, 12 May 2025 17:55:30 +0000 (18:55 +0100)]
Merge tags 'scmi-updates-6.16' and 'juno-updates-6.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into for-linux-next
Arm SCMI updates for v6.16
1. Quirk framework to handle buggy firmware
With SCMI gaining broader adoption across arm64 platforms, it's
increasingly important to address how we consistently manage out-of-spec
SCMI firmware already deployed in the field. This change introduces a
lightweight quirk framework built around static_keys, enabling developers to:
- Define quirks and their match criteria, which can include:
o A list of compatibles ({ comp, comp2, NULL })
o Vendor ID / Sub-Vendor ID
o Firmware implementation version ranges ([Min_Vers, Max_Vers])
Matching proceeds from the most specific (longest match) to the least
specific. NULL entries are treated as wildcards (i.e., match any value).
This flexibility allows matching very specific combinations or just a
general compatible string.
The quirk code blocks/snippets implementing the workaround are placed near
their intended usage and guarded by a static_key that's tied to the quirk.
Once the SCMI core stack is initialized and retrieves platform info via the
base protocol, any matching quirks will have their associated static_keys
enabled.
2. Quirk for Qualcomm X1E platforms
On some Qualcomm X1E platforms, such as the Lenovo ThinkPad T14s, the
SCMI firmware fails to set the FastChannel support bit for PERF_LEVEL_GET,
yet it crashes when the driver attempts to fall back to standard messaging
which is clearly out-of-spec behavior.
To work around this, the new SCMI quirk framework is used to
unconditionally enable FC initialization for this firmware version.
In the future, once the fixed firmware version is identified, an upper
version bound can be added to the quirk match criteria. Alternatively,
matching can be further restricted using a SoC-specific compatible string
if always enabling FC proves problematic elsewhere.
3. Support for NXP i.MX LMM/CPU vendor protocol extensions
The i.MX95 System Manager (SM) implements Logical Machine Management (LMM)
and a CPU protocol to manage Logical Machines (LM) and CPUs (e.g., M7).
These changes integrate the vendor-specific protocol extensions
implementing the LMM and CPU protocols for the i.MX95, facilitating
standardized communication between the operating system and the platform's
firmware, which will be used by remoteproc drivers. The changes also
include the necessary device tree bindings.
4. Miscellaneous cleanups/changes
These mainly include polling support in SCMI raw mode. The cleanups
centralize error logging for SCMI device creation into a single helper
function, consolidate the device matching logic into a single function, and
ensure that devices must have a name for registration—removing support for
unnamed devices when matching drivers and devices for probing. Transport
devices are now excluded from bus matching, and the correct assignment of
the parent device for the arm-scmi platform device is ensured in the
transport drivers.
Armv8 Juno/FVP updates for v6.16
Few updates to the Arm FVP(Fixed Virtual Platform) device tree, enhancing
support for system tracing, power management, and firmware coexistence:
1. ETE and TRBE support
Adds CoreSight ETE and TRBE nodes for the FVP Rev C model. These are
disabled by default as they need to be enabled explicitly via model
parameters.
2. CPU idle states and system timer for idle broadcast
Introduces CPU idle state definitions but disabled by default due to
potential performance impact on the model. Also adds a system-level
broadcast timer for use when CPUs enter deep idle states where local
timers stop.
3. Firmware memory reservation
Reserves 64MB at the end of the first DRAM bank to prevent conflicts
with FF-A firmware or similar configurations that rely on this region.
4. Drop the unnecessary clock-frequency property in the timer nodes
The boot/secure firmware must configure the timer clock frequency and
the non-secure OS must be able to read the same. The clock-frequency is
generally used when the firmware is broken which is not the case on
most of the fast models and Juno platform.
As noted above some of the changes are disabled by default where applicable
to ensure backward compatibility and avoid unintended performance impact
on platforms using default model parameters.
* tag 'scmi-updates-6.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_scmi: quirk: Force perf level get fastchannel
firmware: arm_scmi: quirk: Fix CLOCK_DESCRIBE_RATES triplet
firmware: arm_scmi: Add common framework to handle firmware quirks
firmware: arm_scmi: Ensure that the message-id supports fastchannel
MAINTAINERS: add entry for i.MX SCMI extensions
firmware: imx: Add i.MX95 SCMI CPU driver
firmware: imx: Add i.MX95 SCMI LMM driver
firmware: arm_scmi: imx: Add i.MX95 CPU Protocol
firmware: arm_scmi: imx: Add i.MX95 LMM protocol
dt-bindings: firmware: Add i.MX95 SCMI LMM and CPU protocol
firmware: arm_scmi: imx: Add LMM and CPU documentation
firmware: arm_scmi: Add polling support to raw mode
firmware: arm_scmi: Exclude transport devices from bus matching
firmware: arm_scmi: Assign correct parent to arm-scmi platform device
firmware: arm_scmi: Refactor error logging from SCMI device creation to single helper
firmware: arm_scmi: Refactor device matching logic to eliminate duplication
firmware: arm_scmi: Ensure scmi_devices are always matched by name as well
* tag 'juno-updates-6.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
arm64: dts: fvp: Add ETE and TRBE nodes for Rev C model
arm64: dts: arm: Drop the clock-frequency property from timer nodes
arm64: dts: fvp: Reserve 64MB for the FF-A firmware in memory map
arm64: dts: fvp: Add CPU idle states for Rev C model
arm64: dts: fvp: Add system timer for broadcast during CPU idle
Ian Rogers [Tue, 18 Mar 2025 16:16:39 +0000 (09:16 -0700)]
perf tests: Harden branch stack sampling test
On continuous testing the perf script output can be empty, or nearly
empty, causing tr/grep to exit and due to "set -e" the test traps and
fails.
Add some empty file handling that sets the test to skip and make grep
and other text rewriting failures non-fatal by adding "|| true".
Committer testing:
root@number:~# grep -m1 "model name" /proc/cpuinfo
model name : AMD Ryzen 9 9950X3D 16-Core Processor
root@number:~# perf test "Check branch stack sampling"
104: Check branch stack sampling : Ok
root@number:~#
root@number:~# perf test -vvvvvvv "Check branch stack sampling"
104: Check branch stack sampling:
--- start ---
test child forked, pid 396047
142d22-142da0 l brstack_bench
perf does have symbol 'brstack_bench'
Testing user branch stack sampling
Testing branch stack filtering permutation (any_call,CALL|IND_CALL|COND_CALL|SYSCALL|IRQ)
Testing branch stack filtering permutation (call,CALL|SYSCALL)
Testing branch stack filtering permutation (cond,COND)
Testing branch stack filtering permutation (any_ret,RET|COND_RET|SYSRET|ERET)
Testing branch stack filtering permutation (call,cond,CALL|SYSCALL|COND)
Testing branch stack filtering permutation (any_call,cond,CALL|IND_CALL|COND_CALL|IRQ|SYSCALL|COND)
Testing branch stack filtering permutation (cond,any_call,any_ret,COND|CALL|IND_CALL|COND_CALL|SYSCALL|IRQ|RET|COND_RET|SYSRET|ERET)
---- end(0) ----
104: Check branch stack sampling : Ok
root@number:~#
Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250318161639.34446-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This change adds device tree nodes for the ETE and TRBE. They are
disabled by default to prevent kernel warnings from failed driver
probes, as the model does not enable the features unless explicitly
specified as mentioned above.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Message-Id: <20250512151149.13111-1-leo.yan@arm.com> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Sudeep Holla [Fri, 9 May 2025 15:46:40 +0000 (16:46 +0100)]
arm64: dts: fvp: Reserve 64MB for the FF-A firmware in memory map
Reserve 64MB of memory at the end of the first bank of DRAM on FVP model.
This is mainly for FF-A firmware use, as required by various firmware
configurations using the Firmware Framework for Arm (FF-A). This prevents
the kernel from overwriting the firmware region.
This is also useful when running other firmware configurations(non FF-A
based) that rely on usage of 64MB at the end of first DRAM bank.
Necessary for proper coexistence of firmware(FF-A partitions) and the OS.
Sudeep Holla [Fri, 9 May 2025 15:46:39 +0000 (16:46 +0100)]
arm64: dts: fvp: Add CPU idle states for Rev C model
Add CPU idle state definitions to the FVP Rev C device tree to enable
support for CPU lower power modes. This allows the system to properly
enter low power states during idle. It is disabled by default as it is
know to impact performance on the models.
Note that the power_state parameter(arm,psci-suspend-param) doesn't use
the Extended StateID format for compatibility reasons on FVP.
Tested on the FVP Rev C model with PSCI support enabled firmware.
Tested-by: Leo Yan <leo.yan@arm.com>
Message-Id: <20250509154640.836093-2-sudeep.holla@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Sudeep Holla [Fri, 9 May 2025 15:46:38 +0000 (16:46 +0100)]
arm64: dts: fvp: Add system timer for broadcast during CPU idle
Introduce a system-level timer node in the FVP device tree to act as
a broadcast timer when CPUs are in context losing idle states where
the local timer stops on entering such low power states.
This change complements recent CPU idle state additions.
Tested-by: Leo Yan <leo.yan@arm.com>
Message-Id: <20250509154640.836093-1-sudeep.holla@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Ian Rogers [Thu, 3 Apr 2025 19:43:37 +0000 (12:43 -0700)]
perf parse-events: Add "cpu" term to set the CPU an event is recorded on
The -C option allows the CPUs for a list of events to be specified but
its not possible to set the CPU for a single event. Add a term to
allow this. The term isn't a general CPU list due to ',' already being
a special character in event parsing instead multiple cpu= terms may
be provided and they will be merged/unioned together.
An example of mixing different types of events counted on different CPUs:
```
$ perf stat -A -C 0,4-5,8 -e "instructions/cpu=0/,l1d-misses/cpu=4,cpu=5/,inst_retired.any/cpu=8/,cycles" -a sleep 0.1
root@number:~# grep -m1 "model name" /proc/cpuinfo
model name : AMD Ryzen 9 9950X3D 16-Core Processor
root@number:~# perf stat -A -e "instructions/cpu=0/,instructions,l1d-misses/cpu=4,cpu=5/,cycles" -a sleep 0.1
Performance counter stats for 'system wide':
CPU0 2,398,351 instructions/cpu=0/ # 0.44 insn per cycle
CPU0 2,398,152 instructions # 0.44 insn per cycle
CPU1 1,265,634 instructions # 0.49 insn per cycle
CPU2 606,087 instructions # 0.50 insn per cycle
CPU3 4,025,752 instructions # 0.52 insn per cycle
CPU4 4,236,810 instructions # 0.53 insn per cycle
CPU5 3,984,832 instructions # 0.66 insn per cycle
CPU6 434,132 instructions # 0.44 insn per cycle
CPU7 65,752 instructions # 0.41 insn per cycle
CPU8 459,083 instructions # 0.48 insn per cycle
CPU9 6,464,161 instructions # 1.31 insn per cycle
<SNIP>
root@number:~# perf stat -e "instructions/cpu=0/,instructions,l1d-misses/cpu=4,cpu=5/,cycles" -a sleep 0.
Performance counter stats for 'system wide':
144,822 instructions/cpu=0/ # 0.03 insn per cycle
4,666,114 instructions # 0.93 insn per cycle
2,583 l1d-misses
4,993,633 cycles
Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250403194337.40202-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 3 Apr 2025 19:43:36 +0000 (12:43 -0700)]
perf parse-events: Set is_pmu_core for legacy hardware events
Also set the CPU map to all online CPU maps.
This is done so the behavior of legacy hardware and hardware cache
events better matches that of sysfs and JSON events during
__perf_evlist__propagate_maps().
Fix missing cpumap put in "Synthesize attr update" test.
Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250403194337.40202-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>