Namhyung Kim [Fri, 7 Mar 2025 22:09:21 +0000 (14:09 -0800)]
perf bpf-filter: Fix a parsing error with comma
The previous change to support cgroup filters introduced a bug: pathnames
were allowed to include commas, which made the lexer treat an item and
its trailing comma as a single token. This resulted in a parse error:
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
--filter <filter>
event filter
It should get "0" and "," separately.
The easiest fix would be to remove "," from the possible pathname
characters. As it's for cgroup names, it's probably ok to assume there
won't be commas in the pathname.
I found that the existing BPF filtering test didn't have any complex
filter condition with commas. Let's update the group filter test which
is supposed to test filter combinations like this.
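The token confusion can be reproduced with a small sketch. The patterns below are hypothetical stand-ins written for illustration, not the actual lexer rules in perf; the only point is that allowing "," in the item character class makes a greedy match swallow the separator.

```python
import re

# Hypothetical token patterns for illustration; not the actual perf lexer rules.
ITEM_WITH_COMMA = re.compile(r"[A-Za-z0-9_./,-]+")  # buggy: ',' is a pathname character
ITEM_NO_COMMA = re.compile(r"[A-Za-z0-9_./-]+")     # fixed: ',' removed from the class
OPERATOR = re.compile(r"[<>=!&]+")


def tokenize(text, item_re):
    """Split a filter expression into items, operators and ',' separators."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = item_re.match(text, pos) or OPERATOR.match(text, pos)
        if match:
            tokens.append(match.group())
            pos = match.end()
        elif text[pos] == ",":
            tokens.append(",")
            pos += 1
        else:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
    return tokens


buggy = tokenize("ip > 0, period < 100", ITEM_WITH_COMMA)
fixed = tokenize("ip > 0, period < 100", ITEM_NO_COMMA)
```

With the comma in the class, "0," comes back as one token; without it, "0" and "," are separate tokens, which is what the parser expects.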
Namhyung Kim [Tue, 11 Mar 2025 00:04:16 +0000 (17:04 -0700)]
perf report: Fix a memory leak for perf_env on AMD
The env.pmu_mapping can be leaked when it reads data from a pipe on AMD.
For pipe data, it reads the header data including pmu_mapping from
PERF_RECORD_HEADER_FEATURE at runtime. But it's already set in:
Then it'll overwrite that when it processes the HEADER_FEATURE record.
Here's a report from address sanitizer.
Direct leak of 2689 byte(s) in 1 object(s) allocated from:
#0 0x7fed8f814596 in realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
#1 0x5595a7d416b1 in strbuf_grow util/strbuf.c:64
#2 0x5595a7d414ef in strbuf_init util/strbuf.c:25
#3 0x5595a7d0f4b7 in perf_env__read_pmu_mappings util/env.c:362
#4 0x5595a7d12ab7 in perf_env__nr_pmu_mappings util/env.c:517
#5 0x5595a7d89d2f in evlist__has_amd_ibs util/amd-sample-raw.c:315
#6 0x5595a7d87fb2 in evlist__init_trace_event_sample_raw util/sample-raw.c:23
#7 0x5595a7d7f893 in __perf_session__new util/session.c:179
#8 0x5595a7b79572 in perf_session__new util/session.h:115
#9 0x5595a7b7e9dc in cmd_report builtin-report.c:1603
#10 0x5595a7c019eb in run_builtin perf.c:351
#11 0x5595a7c01c92 in handle_internal_command perf.c:404
#12 0x5595a7c01deb in run_argv perf.c:448
#13 0x5595a7c02134 in main perf.c:556
#14 0x7fed85833d67 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Thomas Richter [Mon, 24 Mar 2025 15:27:56 +0000 (16:27 +0100)]
perf trace: Fix wrong size to bpf_map__update_elem call
In linux-next,
commit c760174401f6 ("perf cpumap: Reduce cpu size from int to int16_t")
causes the perf tests 100 and 126 to fail on s390:
Output before:
# ./perf test 100
100: perf trace BTF general tests : FAILED!
#
The root cause is the change from int to int16_t for the
cpu maps. The size of the CPU key value pair changes from
four bytes to two bytes. However a two byte key size is
not supported for bpf_map__update_elem().
Note: validate_map_op() in libbpf.c emits warning
libbpf: map '__augmented_syscalls__': \
unexpected key size 2 provided, expected 4
when key size is set to int16_t.
Therefore change the variable size back to 4 bytes for the
invocation of bpf_map__update_elem().
Output after:
# ./perf test 100
100: perf trace BTF general tests : Ok
#
Fixes: c760174401f6 ("perf cpumap: Reduce cpu size from int to int16_t")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Howard Chu <howardchu95@gmail.com>
Cc: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250324152756.3879571-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
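The size mismatch can be sketched outside of BPF. This is a toy model, not libbpf code: EXPECTED_KEY_SIZE and key_size_ok() are hypothetical names that only mimic the kind of check the warning above reports.

```python
import struct

EXPECTED_KEY_SIZE = 4  # assumption: the map is declared with a 4-byte key


def pack_key(cpu, fmt):
    """Serialize a CPU number using the given struct format."""
    return struct.pack(fmt, cpu)


def key_size_ok(key):
    """Mimic a key size check: the key must match the declared size."""
    return len(key) == EXPECTED_KEY_SIZE


key_int16 = pack_key(3, "=h")  # int16_t-sized key: 2 bytes
key_int = pack_key(3, "=i")    # int-sized key: 4 bytes
```

Here key_size_ok(key_int16) is false and key_size_ok(key_int) is true, which matches the "unexpected key size 2 provided, expected 4" message.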
Marcus Meissner [Sun, 23 Mar 2025 08:53:45 +0000 (09:53 +0100)]
perf tools: annotate asm_pure_loop.S
Annotate so it is built with non-executable stack.
Fixes: 8b97519711c3 ("perf test: Add asm pureloop test tool")
Signed-off-by: Marcus Meissner <meissner@suse.de>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250323085410.23751-1-meissner@suse.de
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Tue, 11 Mar 2025 21:36:28 +0000 (14:36 -0700)]
perf python: Fix setup.py mypy errors
getenv may return None, so assert it isn't None for CC and srctree
environmental variables required for the script.
Disable an optional warning related to Popen.
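A minimal sketch of the pattern, assuming a helper of our own: require_env() is a hypothetical name and not necessarily how setup.py is structured. Asserting right after getenv() narrows "str | None" to "str" for mypy.

```python
from os import getenv


def require_env(name: str) -> str:
    """Return the value of a required environment variable."""
    value = getenv(name)
    # After this assert, type checkers treat value as str rather than str | None.
    assert value is not None, f"{name} must be set in the environment"
    return value
```

A caller can then write require_env("CC").split() without a union-attr error.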
Ian Rogers [Tue, 11 Mar 2025 21:36:26 +0000 (14:36 -0700)]
perf build: Add pylint build tests
If PYLINT=1 is passed to the build then run pylint over python code in
perf. Unlike shellcheck this isn't default on as there are currently
too many errors.
An example of an error:
```
************* Module setup
util/setup.py:19:0: C0301: Line too long (127/100) (line-too-long)
util/setup.py:20:0: C0301: Line too long (138/100) (line-too-long)
util/setup.py:63:0: C0301: Line too long (106/100) (line-too-long)
util/setup.py:1:0: C0114: Missing module docstring (missing-module-docstring)
util/setup.py:24:4: W0622: Redefining built-in 'vars' (redefined-builtin)
util/setup.py:11:4: C0103: Constant name "cc_options" doesn't conform to UPPER_CASE naming style (invalid-name)
util/setup.py:13:4: C0103: Constant name "cc_options" doesn't conform to UPPER_CASE naming style (invalid-name)
util/setup.py:15:34: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
util/setup.py:18:0: C0116: Missing function or method docstring (missing-function-docstring)
util/setup.py:19:16: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
util/setup.py:44:0: C0413: Import "from setuptools import setup, Extension" should be placed at the top of the module (wrong-import-position)
util/setup.py:46:0: C0413: Import "from setuptools.command.build_ext import build_ext as _build_ext" should be placed at the top of the module (wrong-import-position)
util/setup.py:47:0: C0413: Import "from setuptools.command.install_lib import install_lib as _install_lib" should be placed at the top of the module (wrong-import-position)
util/setup.py:49:0: C0115: Missing class docstring (missing-class-docstring)
util/setup.py:49:0: C0103: Class name "build_ext" doesn't conform to PascalCase naming style (invalid-name)
util/setup.py:52:8: W0201: Attribute 'build_lib' defined outside __init__ (attribute-defined-outside-init)
util/setup.py:53:8: W0201: Attribute 'build_temp' defined outside __init__ (attribute-defined-outside-init)
util/setup.py:55:0: C0115: Missing class docstring (missing-class-docstring)
util/setup.py:55:0: C0103: Class name "install_lib" doesn't conform to PascalCase naming style (invalid-name)
util/setup.py:58:8: W0201: Attribute 'build_dir' defined outside __init__ (attribute-defined-outside-init)
*-----------------------------------------------------------------
Your code has been rated at 6.67/10 (previous run: 6.51/10, +0.16)
```
Ian Rogers [Tue, 11 Mar 2025 21:36:25 +0000 (14:36 -0700)]
perf build: Add mypy build tests
If MYPY=1 is passed to the build then run mypy over python code in
perf. Unlike shellcheck this isn't default on as there are currently
too many errors.
An example of an error:
```
util/setup.py:8: error: Item "None" of "str | None" has no attribute "split" [union-attr]
util/setup.py:15: error: Item "None" of "IO[bytes] | None" has no attribute "readline" [union-attr]
util/setup.py:15: error: List item 0 has incompatible type "str | None"; expected "str | bytes | PathLike[str] | PathLike[bytes]" [list-item]
util/setup.py:16: error: Unsupported left operand type for + ("None") [operator]
util/setup.py:16: note: Left operand is of type "str | None"
util/setup.py:74: error: Unsupported left operand type for + ("None") [operator]
util/setup.py:74: note: Left operand is of type "str | None"
Found 5 errors in 1 file (checked 1 source file)
make[4]: *** [util/Build:430: util/setup.py.mypy_log] Error 1
```
Ian Rogers [Tue, 11 Mar 2025 21:36:24 +0000 (14:36 -0700)]
perf build: Rename TEST_LOGS to SHELL_TEST_LOGS
Rename TEST_LOGS to SHELL_TEST_LOGS as later changes will add more
kinds of test logs.
Minor comment tweak in Makefile.perf as more than just test shell
tests are checked.
Dirk Gouders [Sun, 23 Mar 2025 14:01:01 +0000 (15:01 +0100)]
perf bench sched pipe: fix enforced blocking reads in worker_thread
The function worker_thread() is programmed in a way that roughly
doubles the number of expectable context switches, because it enforces
blocking reads:
Performance counter stats for 'perf bench sched pipe':
From the code, it is unclear whether that behavior is wanted, but the log
says that at least Ingo Molnar aims to mimic lmbench's lat_ctx, which
doesn't handle the pipe ends that way
(https://sourceforge.net/p/lmbench/code/HEAD/tree/trunk/lmbench2/src/lat_ctx.c)
Fix worker_thread() by always first feeding the write ends of the pipes
and then trying to read.
This roughly halves the context switches and runtime of pure
'perf bench sched pipe':
Performance counter stats for 'perf bench sched pipe':
Likhitha Korrapati [Fri, 21 Mar 2025 10:07:26 +0000 (15:37 +0530)]
perf tools: Fix is_compat_mode build break in ppc64
Commit 54f9aa1092457 ("tools/perf/powerpc/util: Add support to
handle compatible mode PVR for perf json events") introduced
support for selecting the proper JSON events in compat mode using
the auxiliary vector. But this caused a compilation error on ppc64
Big Endian.
arch/powerpc/util/header.c: In function 'is_compat_mode':
arch/powerpc/util/header.c:20:21: error: cast to pointer from
integer of different size [-Werror=int-to-pointer-cast]
20 | if (!strcmp((char *)platform, (char *)base_platform))
| ^
arch/powerpc/util/header.c:20:39: error: cast to pointer from
integer of different size [-Werror=int-to-pointer-cast]
20 | if (!strcmp((char *)platform, (char *)base_platform))
|
The commit saved the getauxval(AT_BASE_PLATFORM) and getauxval(AT_PLATFORM)
return values in u64, which causes the compilation error.
This patch fixes the issue by changing u64 to "unsigned long".
Fixes: 54f9aa1092457 ("tools/perf/powerpc/util: Add support to handle compatible mode PVR for perf json events")
Signed-off-by: Likhitha Korrapati <likhitha@linux.ibm.com>
Reviewed-by: Athira Rajeev <atrajeev@linux.ibm.com>
Link: https://lore.kernel.org/r/20250321100726.699956-1-likhitha@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Holger Hoffstätte [Fri, 21 Mar 2025 08:20:39 +0000 (09:20 +0100)]
perf build: filter all combinations of -flto for libperl
When enabling the libperl feature the build uses perl's build flags
(ccopts) but filters out various flags, e.g. for LTO.
While this is conceptually correct, it is insufficient in practice,
since only "-flto=auto" is filtered out. When perl itself is built with
"-flto" this can cause parts of perf to be built with LTO and others
without, giving exciting build errors such as:
../tools/perf/pmu-events/pmu-events.c:72851:(.text+0xb79): undefined reference to `strcmp_cpuid_str'
collect2: error: ld returned 1 exit status
Fix this by filtering all matching flag values of -flto{=n,auto,..}.
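The intended filtering can be modeled as below. This is only a sketch: the real change lives in the perf build's make rules, and filter_lto() is a hypothetical helper. Leaving "-fno-lto" untouched is a choice of this sketch, not something the commit states.

```python
def filter_lto(flags):
    """Drop '-flto' and every '-flto=<value>' variant from a list of flags."""
    return [
        flag
        for flag in flags
        if flag != "-flto" and not flag.startswith("-flto=")
    ]


ccopts = ["-O2", "-flto", "-flto=auto", "-flto=4", "-fno-lto", "-pipe"]
filtered = filter_lto(ccopts)
```

Filtering only the literal "-flto=auto" would have left "-flto" and "-flto=4" in place, which is the mixed LTO/non-LTO situation described above.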
The frontend_bound metric was miscalculated due to different scaling in
a couple of metrics it depends on. Change the scaling to match
AmpereOne.
Fixes: 16438b652b46 ("perf vendor events arm64 AmpereOneX: Add core PMU events and metrics")
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250313201559.11332-3-ilkka@os.amperecomputing.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ilkka Koskinen [Thu, 13 Mar 2025 20:15:58 +0000 (20:15 +0000)]
perf vendor events arm64: AmpereOne/AmpereOneX: Mark LD_RETIRED impacted by errata
Atomic instructions are both memory-reading and memory-writing
instructions and so should be counted by both LD_RETIRED and ST_RETIRED
performance monitoring events. However LD_RETIRED does not count atomic
instructions.
Ian Rogers [Wed, 19 Mar 2025 05:07:41 +0000 (22:07 -0700)]
perf trace: Fix evlist memory leak
Leak sanitizer was reporting a memory leak in the "perf record and
replay" test. Add evlist__delete to trace__exit, also ensure
trace__exit is called after trace__record.
Ian Rogers [Wed, 19 Mar 2025 05:07:39 +0000 (22:07 -0700)]
perf trace: Make syscall table stable
Namhyung fixed the syscall table being reallocated and moving by
reloading the system call pointer after a move:
https://lore.kernel.org/lkml/Z9YHCzINiu4uBQ8B@google.com/
This could be brittle so this patch changes the syscall table to be an
array of pointers of "struct syscall" that don't move. Remove
unnecessary copies and searches with this change.
Ian Rogers [Wed, 19 Mar 2025 05:07:38 +0000 (22:07 -0700)]
perf syscalltbl: Mask off ABI type for MIPS system calls
Arnd Bergmann described that MIPS system calls don't necessarily start
from 0 as an ABI prefix is applied:
https://lore.kernel.org/lkml/8ed7dfb2-1e4d-4aa4-a04b-0397a89365d1@app.fastmail.com/
When decoding the "id" (aka system call number) for MIPS, mask off the
ABI prefix for values greater than 1000.
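A sketch of the masking, assuming the ABI prefix is a multiple of 1000 so that a modulo recovers the table index. normalize_syscall_id() is a hypothetical name; EM_MIPS is the ELF machine value 8.

```python
EM_MIPS = 8      # ELF e_machine value for MIPS
EM_X86_64 = 62   # ELF e_machine value for x86-64


def normalize_syscall_id(e_machine, syscall_id):
    """Strip the ABI prefix from a MIPS syscall number; leave others alone."""
    # Assumption: the prefix is a multiple of 1000 (for example 4000 or 5000).
    if e_machine == EM_MIPS and syscall_id > 1000:
        return syscall_id % 1000
    return syscall_id
```

For example, 4001 on MIPS maps to index 1, while the same value on another machine type is passed through unchanged.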
Ian Rogers [Wed, 19 Mar 2025 05:07:37 +0000 (22:07 -0700)]
perf build: Remove Makefile.syscalls
Now a single beauty file is generated and used by all architectures,
remove the per-architecture Makefiles, Kbuild files and previous
generator script.
Note: there was conversation with Charlie Jenkins
<charlie@rivosinc.com> and they'd written an alternate approach to
support multiple architectures:
https://lore.kernel.org/all/20250114-perf_syscall_arch_runtime-v1-1-5b304e408e11@rivosinc.com/
It would have been better to have helped Charlie fix their series (my
apologies) but they agreed that the approach taken here was likely
best for longer term maintainability:
https://lore.kernel.org/lkml/Z6Jk_UN9i69QGqUj@ghost/
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Wed, 19 Mar 2025 05:07:36 +0000 (22:07 -0700)]
perf syscalltbl: Use lookup table containing multiple architectures
Switch to use the lookup table containing all architectures rather
than tables matching the perf binary.
This fixes perf trace when executed on a 32-bit i386 binary on an
x86-64 machine. Note in the following the system call names of the
32-bit i386 binary as seen by an x86-64 perf.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Wed, 19 Mar 2025 05:07:35 +0000 (22:07 -0700)]
perf trace beauty: Add syscalltbl.sh generating all system call tables
Rather than generating individual syscall header files generate a
single trace/beauty/generated/syscalltbl.c. In a syscalltbls array
have references to each architecture's tables along with the
corresponding e_machine. When the 32-bit or 64-bit table is ambiguous,
match the perf binary's type. For ARM32 don't use the arm64 32-bit
table which is smaller. EM_NONE is present in case no machine matches.
Conditionally compile the tables, only having the appropriate 32 and
64-bit table. If ALL_SYSCALLTBL is defined all tables can be
compiled.
Add comment for noreturn column suggested by Arnd Bergmann:
https://lore.kernel.org/lkml/d47c35dd-9c52-48e7-a00d-135572f11fbb@app.fastmail.com/
and added in commit 9142be9e6443 ("x86/syscall: Mark exit[_group]
syscall handlers __noreturn").
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Wed, 19 Mar 2025 05:07:34 +0000 (22:07 -0700)]
perf thread: Add support for reading the e_machine type for a thread
First try to read the e_machine from the dsos associated with the
thread's maps. If live use the executable from /proc/pid/exe and read
the e_machine from the ELF header. On failure use EM_HOST. Change
builtin-trace syscall functions to pass e_machine from the thread
rather than EM_HOST, so that in later patches when syscalltbl can use
the e_machine the system calls are specific to the architecture.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Wed, 19 Mar 2025 05:07:33 +0000 (22:07 -0700)]
perf dso: Add support for reading the e_machine type for a dso
For ELF file dsos read the e_machine from the ELF header. For kernel
types assume the e_machine matches the perf tool. In other cases
return EM_NONE.
When reading from the ELF header use DSO__SWAP that may need
dso->needs_swap initializing. Factor out dso__swap_init to allow this.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
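Reading e_machine from an ELF header can be sketched as follows. This is a standalone model, not the perf implementation: read_e_machine() is a hypothetical name, and the byte-order handling stands in for what DSO__SWAP does. The offsets come from the ELF layout: e_ident is 16 bytes, e_type is 2 bytes, so e_machine sits at offset 18.

```python
import struct

EM_NONE = 0
ELFDATA2LSB = 1  # EI_DATA value for little-endian files
ELFDATA2MSB = 2  # EI_DATA value for big-endian files


def read_e_machine(header: bytes) -> int:
    """Return e_machine from the first 20 bytes of an ELF file, else EM_NONE."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return EM_NONE
    ei_data = header[5]  # byte order of the file, which may differ from the host
    if ei_data == ELFDATA2LSB:
        fmt = "<H"
    elif ei_data == ELFDATA2MSB:
        fmt = ">H"
    else:
        return EM_NONE
    return struct.unpack_from(fmt, header, 18)[0]
```

Decoding with the file's own byte order is what makes a big-endian binary readable on a little-endian host, and the other way around.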
Ian Rogers [Wed, 19 Mar 2025 05:07:32 +0000 (22:07 -0700)]
perf syscalltbl: Remove struct syscalltbl
The syscalltbl held entries of system call name and number pairs,
generated from a native syscalltbl at start up. As there are gaps in
the system call number there is a notion of index into the
table. Going forward we want the system call table to be identifiable
by a machine type, for example, i386 vs x86-64. Change the interface
to the syscalltbl so (1) a machine type (currently unused, EM_HOST)
is passed and (2) the index to syscall number and system call name
mapping is computed at build time.
Two tables are used for this, an array of system call number to name,
an array of system call numbers sorted by the system call name. The
sorted array doesn't store strings in part to save memory and
relocations. The index notion is carried forward and is an index into
the sorted array of system call numbers, the data structures are
opaque (held only in syscalltbl.c), and so the number of indices for a
machine type is exposed as a new API.
The arrays are computed in the syscalltbl.sh script and so no start-up
time computation and storage is necessary.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
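The two-table layout described above can be sketched like this. The table contents and the function names are made up for illustration, and the sort that happens at import time here stands in for what syscalltbl.sh would precompute at build time.

```python
# Made-up table for illustration; None marks a gap in the numbering.
NUM_TO_NAME = ["read", "write", "open", "close", None, "stat"]

# Syscall numbers ordered by name; only numbers are stored, not strings.
SORTED_NUMS = sorted(
    (num for num, name in enumerate(NUM_TO_NAME) if name is not None),
    key=lambda num: NUM_TO_NAME[num],
)


def syscall_name(num):
    """Number -> name by direct indexing; None for gaps or out-of-range ids."""
    if 0 <= num < len(NUM_TO_NAME):
        return NUM_TO_NAME[num]
    return None


def syscall_id(name):
    """Name -> number by binary search over the name-sorted numbers; -1 if absent."""
    lo, hi = 0, len(SORTED_NUMS)
    while lo < hi:
        mid = (lo + hi) // 2
        candidate = NUM_TO_NAME[SORTED_NUMS[mid]]
        if candidate == name:
            return SORTED_NUMS[mid]
        if candidate < name:
            lo = mid + 1
        else:
            hi = mid
    return -1


def num_idx():
    """Number of valid indices into the sorted array."""
    return len(SORTED_NUMS)
```

The sorted array holds numbers only, so name comparisons go through the first table; that is the memory and relocation saving the message refers to.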
Ian Rogers [Wed, 19 Mar 2025 05:07:31 +0000 (22:07 -0700)]
perf trace: Reorganize syscalls
Identify struct syscall information in the syscalls table by a machine
type and syscall number, not just system call number. Having the
machine type means that 32-bit system calls can be differentiated from
64-bit ones on a machine capable of both. Having a table for all
machine types and all system call numbers would be too large, so
maintain a sorted array of system calls as they are encountered.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
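A sketch of a sorted array that grows as system calls are encountered, keyed by machine type and syscall number. SyscallCache and its fields are hypothetical names, not the structures used in builtin-trace.

```python
from bisect import bisect_left


class SyscallCache:
    """Entries keyed by (e_machine, syscall id), kept sorted for binary search."""

    def __init__(self):
        self._keys = []     # sorted list of (e_machine, id) tuples
        self._entries = []  # entry objects, parallel to _keys

    def find_or_add(self, e_machine, syscall_id):
        key = (e_machine, syscall_id)
        pos = bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            return self._entries[pos]
        # First time this (machine, id) pair is seen: insert in sorted position.
        entry = {"e_machine": e_machine, "id": syscall_id, "nr_seen": 0}
        self._keys.insert(pos, key)
        self._entries.insert(pos, entry)
        return entry

    def keys(self):
        return list(self._keys)
```

Only the pairs that actually show up in a trace take space, and the same syscall number under two machine types (say i386 and x86-64) gets two distinct entries.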
Ian Rogers [Wed, 19 Mar 2025 05:07:30 +0000 (22:07 -0700)]
perf syscalltbl: Remove syscall_table.h
The definition of "static const char *const syscalltbl[] = {" is done
in a generated syscalls_32.h or syscalls_64.h that is architecture
dependent. In order to include the appropriate file a syscall_table.h
is found via the perf include path and it includes the syscalls_32.h
or syscalls_64.h as appropriate.
To support having multiple syscall tables, one for 32-bit and one for
64-bit, or for different architectures, an include path cannot be
used. Remove syscall_table.h because of this and inline what it does
into syscalltbl.c.
For architectures without a syscall_table.h this will cause a failure
to include either syscalls_32.h or syscalls_64.h rather than a failure
to include syscall_table.h. For architectures that only included one
or other, the behavior matches BITS_PER_LONG as previously done on
architectures supporting both syscalls_32.h and syscalls_64.h.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Wed, 19 Mar 2025 05:07:29 +0000 (22:07 -0700)]
perf dso: kernel-doc for enum dso_binary_type
There are many and non-obvious meanings to the dso_binary_type enum
values. Add kernel-doc to speed interpreting their meanings.
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Wed, 19 Mar 2025 05:07:28 +0000 (22:07 -0700)]
perf dso: Move libunwind dso_data variables into ifdef
The variables elf_base_addr, debug_frame_offset, eh_frame_hdr_addr and
eh_frame_hdr_offset are only accessed in unwind-libunwind-local.c
which is conditionally built on having libunwind support. Make the
variables conditional on libunwind support too.
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Namhyung Kim [Fri, 7 Mar 2025 08:08:29 +0000 (00:08 -0800)]
perf report: Disable children column for data type profiling
I've realized that it doesn't make sense to accumulate the samples to
the parent in the callchain when data type profiling is enabled, because
the parent won't have the same data type access. Otherwise it'd see
something like this:
$ perf report -s type --stdio -g none
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 2K of event 'cycles:Pu'
# Event count (approx.): 8266456478
#
# Children Latency Self Latency Data Type
# ........ ....... ........ ........ .........
#
698.97% 697.72% 99.80% 99.61% (unknown)
0.09% 0.18% 0.09% 0.18% Elf64_Rela
0.05% 0.10% 0.05% 0.10% unsigned char
0.05% 0.10% 0.05% 0.10% struct exit_function_list
0.00% 0.01% 0.00% 0.01% struct rtld_global
Namhyung Kim [Fri, 7 Mar 2025 08:08:28 +0000 (00:08 -0800)]
perf report: Allow hierarchy mode for --children
It was prohibited because the output fields in the children mode were
not handled properly with hierarchy. But now that the output fields can
be kept at the same level, the two modes can be used together.
For example, latency mode adds more output fields by default and now
they are displayed properly.
Namhyung Kim [Fri, 7 Mar 2025 08:08:27 +0000 (00:08 -0800)]
perf sort: Keep output fields in the same level
This is useful for hierarchy output mode where the first level is
considered as output fields. We want them in the same level so that it
can show only the remaining groups in the hierarchy.
James Clark [Wed, 19 Mar 2025 11:40:09 +0000 (11:40 +0000)]
libperf: Don't remove -g when EXTRA_CFLAGS are used
When using EXTRA_CFLAGS, for example "EXTRA_CFLAGS=-DREFCNT_CHECKING=1",
this construct stops setting -g which you'd expect would not be affected
by adding extra flags. Additionally, EXTRA_CFLAGS should be the last
thing to be appended so that it can be used to undo any defaults. And no
condition is required, just += appends to any existing CFLAGS and also
appends or doesn't append EXTRA_CFLAGS if they are or aren't set.
It's not clear why DEBUG=1 is required for -g in Perf when in libperf
it's always on, but I don't think we need to change that behavior now
because someone may be depending on it.
Thomas Richter [Wed, 19 Mar 2025 12:28:20 +0000 (13:28 +0100)]
perf pmu: Handle memory failure in tool_pmu__new()
On linux-next
commit 72c6f57a4193 ("perf pmu: Dynamically allocate tool PMU")
allocates the PMU named "tool" dynamically. However that allocation
can fail and a NULL pointer is returned. That case is currently
not handled and would result in an invalid address reference.
Add a check for NULL pointer.
Fixes: 72c6f57a4193 ("perf pmu: Dynamically allocate tool PMU")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250319122820.2898333-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
James Clark [Wed, 19 Mar 2025 10:16:10 +0000 (10:16 +0000)]
perf: intel-tpebs: Fix incorrect usage of zfree()
zfree() requires an address otherwise it frees what's in name, rather
than name itself. Pass the address of name to fix it.
This was the only incorrect occurrence in Perf found using a search.
Fixes: 8db5cabcf1b6 ("perf stat: Fork and launch 'perf record' when 'perf stat' needs to get retire latency value for a metric.")
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250319101614.190922-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Tue, 18 Mar 2025 17:19:14 +0000 (10:19 -0700)]
perf cpumap: Increment reference count for online cpumap
Thomas Richter <tmricht@linux.ibm.com> reported a double put on the
cpumap for the placeholder core PMU:
https://lore.kernel.org/lkml/20250318095132.1502654-3-tmricht@linux.ibm.com/
Requiring the caller to get the cpumap is not how these things are
usually done, switch cpu_map__online to do the get and then fix up any
use cases where a put is needed.
Stephen Brennan [Tue, 18 Mar 2025 23:00:11 +0000 (16:00 -0700)]
perf dso: fix dso__is_kallsyms() check
Kernel modules for which we cannot find a file on-disk will have a
dso->long_name that looks like "[module_name]". Prior to the commit
listed in the fixes, the dso->kernel field would be zero (for user
space), so dso__is_kallsyms() would return false. After the commit,
kernel module DSOs are correctly labeled, but the result is that
dso__is_kallsyms() erroneously returns true for those modules without a
filesystem path.
Later, build_id_cache__add() consults this value of is_kallsyms, and
when true, it copies /proc/kallsyms into the cache. Users with many
kernel modules without a filesystem path (e.g. ksplice or possibly
kernel live patch modules) have reported excessive disk space usage in
the build ID cache directory due to this behavior.
To reproduce the issue, it's enough to build a trivial out-of-tree hello
world kernel module, load it using insmod, and then use:
perf record -ag -- sleep 1
In the build ID directory, there will be a directory for your module
name containing a kallsyms file.
Fix this up by changing dso__is_kallsyms() to consult the
dso_binary_type enumeration, which is also symmetric to the above checks
for dso__is_vmlinux() and dso__is_kcore(). With this change, kallsyms is
not cached in the build-id cache for out-of-tree modules.
Feng Yang [Fri, 14 Mar 2025 03:10:13 +0000 (11:10 +0800)]
perf kwork: Remove unreachable judgments
When s2[i] = '\0', if s1[i] != '\0', it will be judged by ret,
and if s1[i] = '\0', it will be judged by !s1[i].
So in reality, the check on s2[i] will never make a judgment.
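The argument can be checked exhaustively on small inputs. The loop below is a model reconstructed from the description above (a difference "ret" plus NUL checks on both strings), so its exact shape is an assumption and the names are hypothetical.

```python
from itertools import product


def compare(s1, s2, sz, check_s2):
    """Model of the comparison loop; s1 and s2 are NUL-terminated bytes."""
    ret = 0
    i = 0
    while sz > 0:
        ret = s1[i] - s2[i]
        # The last term is the check the commit message calls unreachable.
        if ret or not s1[i] or (check_s2 and not s2[i]):
            break
        i += 1
        sz -= 1
    return ret


def c_strings(max_len):
    """Yield every NUL-terminated string over {a, b} up to max_len characters."""
    for length in range(max_len + 1):
        for chars in product(b"ab", repeat=length):
            yield bytes(chars) + b"\0"


def checks_agree(max_len=3):
    """True if dropping the s2 check never changes the result."""
    strings = list(c_strings(max_len))
    return all(
        compare(s1, s2, sz, True) == compare(s1, s2, sz, False)
        for s1 in strings
        for s2 in strings
        for sz in range(max_len + 2)
    )
```

Both variants return the same value for every pair tried, which is consistent with the reasoning that ret or !s1[i] always fires first.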
Arnaldo Carvalho de Melo [Wed, 12 Mar 2025 20:31:41 +0000 (17:31 -0300)]
perf python: Check if there is space to copy all the event
The pyrf_event__new() method copies the event obtained from the perf
ring buffer to a structure that will then be turned into a python object
for further consumption, so it copies perf_event.header.size bytes to
its 'event' member:
Arnaldo Carvalho de Melo [Wed, 12 Mar 2025 20:31:40 +0000 (17:31 -0300)]
perf python: Don't keep a raw_data pointer to consumed ring buffer space
When processing tracepoints the perf python binding was parsing the
event before calling perf_mmap__consume(&md->core) in
pyrf_evlist__read_on_cpu().
But part of this event parsing was to set the perf_sample->raw_data
pointer to the payload of the event, which then could be overwritten by
other event before tracepoint fields were asked for via event.prev_comm
in a python program, for instance.
This also happened with other fields, but strings were where problems
were surfacing, as there is UTF-8 validation for the potentially garbled
data.
This ended up showing up as (with some added debugging messages):
( field 'prev_comm' ret=0x7f7c31f65110, raw_size=68 )
( field 'prev_pid' ret=0x7f7c23b1bed0, raw_size=68 )
( field 'prev_prio' ret=0x7f7c239c0030, raw_size=68 )
( field 'prev_state' ret=0x7f7c239c0250, raw_size=68 )
time 14771421785867 prev_comm= prev_pid=1919907691 prev_prio=796026219 prev_state=0x303a32313175 ==>
( XXX '��' len=16, raw_size=68)
( field 'next_comm' ret=(nil), raw_size=68 )
Traceback (most recent call last):
File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 51, in <module>
main()
File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 46, in main
event.next_comm,
^^^^^^^^^^^^^^^
AttributeError: 'perf.sample_event' object has no attribute 'next_comm'
When event.next_comm was asked for, the PyUnicode_FromString() python
API would fail and that tracepoint field wouldn't be available, stopping
the tools/perf/python/tracepoint.py test tool.
But, since we already do a copy of the whole event in pyrf_event__new,
just use it and while at it remove what was done in e8968e654191390a
("perf python: Fix pyrf_evlist__read_on_cpu event consuming") because we
don't really need to wait for parsing the sample before declaring the
event as consumed.
This copy is questionable as is now, as it limits the maximum event +
sample_type and tracepoint payload to sizeof(union perf_event), this all
has been "working" because 'struct perf_event_mmap2', the largest entry
in 'union perf_event' is:
Arnaldo Carvalho de Melo [Wed, 12 Mar 2025 20:31:39 +0000 (17:31 -0300)]
perf python: Decrement the refcount of just created event on failure
To avoid a leak if we have the python object but then something happens
and we need to fail the operation, decrement the reference count of the
newly created object.
Fixes: 377f698db12150a1 ("perf python: Add struct evsel into struct pyrf_event")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-5-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Arnaldo Carvalho de Melo [Wed, 12 Mar 2025 20:31:37 +0000 (17:31 -0300)]
perf python: Remove some unused macros (_PyUnicode_FromString(arg), etc)
When python2 support was removed in e7e9943c87d857da ("perf python:
Remove python 2 scripting support"), all use of the
_PyUnicode_FromString(arg), _PyUnicode_FromFormat(...), and
_PyLong_FromLong(arg) macros was removed as well, so remove them.
Ian Rogers [Tue, 18 Mar 2025 04:31:51 +0000 (21:31 -0700)]
perf test dso-data: Correctly free test file in read test
The DSO data read test opens a file but as dsos__exit is used the test
file isn't closed. This causes the subsequent subtests in don't fork
(-F) mode to fail as one more file descriptor than expected is open.
Ian Rogers [Tue, 18 Mar 2025 04:31:50 +0000 (21:31 -0700)]
perf dso: Use lock annotations to fix asan deadlock
dso__list_del with address sanitizer and/or reference count checking
will call dso__put that can call dso__data_close reentrantly trying to
lock the dso__data_open_lock and deadlocking. Switch from pthread
mutexes to perf's mutex so that lock checking is performed in debug
builds. Add lock annotations that diagnosed the problem. Release the
dso__data_open_lock around the dso__put to avoid the deadlock.
Change the declaration of dso__data_get_fd to return a boolean,
indicating the fd is valid and the lock is held, to make it compatible
with the thread safety annotations as a try lock.
Ian Rogers [Tue, 11 Mar 2025 21:16:35 +0000 (14:16 -0700)]
perf test: Add pipe output testing for annotate
Parameterize the basic testing to generate directly a perf.data file
or to generate/use one from pipe input or output. To simplify the
refactor move some of the head/grep logic around. Use "-q" with grep
to make the test output cleaner.
Ian Rogers [Wed, 12 Mar 2025 00:18:41 +0000 (17:18 -0700)]
perf test: Fixes to variable expansion and stdout for diff test
When make_data fails, its error message needs to go to stderr rather
than stdout, as the stdout value is captured in a variable. Quote the
$err value so that it is always a valid input for test. This error is
commonly encountered if no sample data is gathered by the test.
Arnaldo Carvalho de Melo [Thu, 13 Mar 2025 03:31:21 +0000 (20:31 -0700)]
perf libunwind: Fixup conversion perf_sample->user_regs to a pointer
Commit dc6d2bc2d893a878 ("perf sample: Make user_regs and intr_regs optional")
missed the changes to one file, resulting in this build failure:
$ make LIBUNWIND=1 -C tools/perf O=/tmp/build/perf-tools-next install-bin
<SNIP>
CC /tmp/build/perf-tools-next/util/unwind-libunwind-local.o
CC /tmp/build/perf-tools-next/util/unwind-libunwind.o
<SNIP>
util/unwind-libunwind-local.c: In function ‘access_mem’:
util/unwind-libunwind-local.c:582:56: error: ‘ui->sample->user_regs’ is a pointer; did you mean to use ‘->’?
582 | if (__write || !stack || !ui->sample->user_regs.regs) {
| ^
| ->
util/unwind-libunwind-local.c:587:38: error: passing argument 2 of ‘perf_reg_value’ from incompatible pointer type [-Wincompatible-pointer-types]
587 | ret = perf_reg_value(&start, &ui->sample->user_regs,
| ^~~~~~~~~~~~~~~~~~~~~~
| |
| struct regs_dump **
<SNIP>
⬢ [acme@toolbox perf-tools-next]$ git bisect bad
dc6d2bc2d893a878e7b58578ff01b4738708deb4 is the first bad commit
commit dc6d2bc2d893a878e7b58578ff01b4738708deb4 (HEAD)
Author: Ian Rogers <irogers@google.com>
Date: Mon Jan 13 11:43:45 2025 -0800
perf sample: Make user_regs and intr_regs optional
Detected using:
make -C tools/perf build-test
Fixes: dc6d2bc2d893a878 ("perf sample: Make user_regs and intr_regs optional")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250313033121.758978-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Veronika Molnarova [Fri, 22 Nov 2024 23:12:33 +0000 (00:12 +0100)]
perf test stat_all_pmu.sh: Correctly check 'perf stat' result
Test case "stat_all_pmu.sh" is not correctly checking 'perf stat' output
due to a poor design. Firstly, having the 'set -e' option with a trap
catching the sigexit causes the shell to exit immediately if 'perf stat' ends
with any non-zero value, which is then caught by the trap reporting an
unexpected signal. As a result, events that should be parsed by the if-else
statement are caught by the trap handler and reported as errors:
$ perf test -vv "perf all pmu"
Testing i915/actual-frequency/
Unexpected signal in main
Error:
Access to performance monitoring and observability operations is limited.
Secondly, the if-else branches are not exclusive: the check for whether
the event is present in the output log also matches the "<not supported>"
events, which should be accepted, and the "Bad name events", which
should be rejected.
Remove the "set -e" option from the test case, correctly parse the
"perf stat" output log and check its return value. Add the missing
outputs for the 'perf stat' result and also add log messages that
report which branch parsed the event, for more info.
Fixes: 7e73ea40295620e7 ("perf test: Ignore security failures in all PMU test")
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Tested-by: Qiao Zhao <qzhao@redhat.com>
Link: https://lore.kernel.org/r/20241122231233.79509-1-vmolnaro@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Yujie Liu [Wed, 12 Mar 2025 07:23:29 +0000 (15:23 +0800)]
perf script: Update brstack syntax documentation
The following commits added new fields/flags to the branch stack field
list:
commit 1f48989cdc7d ("perf script: Output branch sample type")
commit 6ade6c646035 ("perf script: Show branch speculation info")
commit 1e66dcff7b9b ("perf script: Add not taken event for branch stack")
Update brstack syntax documentation to be consistent with the latest
branch stack field list. Improve the descriptions to help users
interpret the fields accurately.
Yujie Liu [Wed, 12 Mar 2025 07:56:36 +0000 (15:56 +0800)]
perf script: Fix typo in branch event mask
BRACH -> BRANCH
Fixes: 88b1473135e4 ("perf script: Separate events from branch types")
Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250312075636.429127-1-yujie.liu@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Arnaldo Carvalho de Melo [Mon, 10 Mar 2025 19:45:33 +0000 (16:45 -0300)]
perf hist stdio: Do bounds check when printing callchains to avoid UB with new gcc versions
Do a simple bounds check to avoid this on new gcc versions:
31 15.81 fedora:rawhide : FAIL gcc version 15.0.1 20250225 (Red Hat 15.0.1-0) (GCC)
In function 'callchain__fprintf_left_margin',
inlined from 'callchain__fprintf_graph.constprop' at ui/stdio/hist.c:246:12:
ui/stdio/hist.c:27:39: error: iteration 2147483647 invokes undefined behavior [-Werror=aggressive-loop-optimizations]
27 | for (i = 0; i < left_margin; i++)
| ~^~
ui/stdio/hist.c:27:23: note: within this loop
27 | for (i = 0; i < left_margin; i++)
| ~~^~~~~~~~~~~~~
cc1: all warnings being treated as errors
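A bounds check of the kind described above might look like the following sketch. It is an illustration only, not the actual patch: the function name `fprintf_left_margin` and the `MAX_LEFT_MARGIN` cap are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical cap; the real patch may use a different limit. */
#define MAX_LEFT_MARGIN 4096

/*
 * Print left_margin spaces. Clamping the value first gives the loop a
 * small, well-defined trip count that the compiler can reason about.
 */
int fprintf_left_margin(FILE *fp, int left_margin)
{
	int i, ret = 0;

	if (left_margin < 0)
		left_margin = 0;
	if (left_margin > MAX_LEFT_MARGIN)
		left_margin = MAX_LEFT_MARGIN;

	for (i = 0; i < left_margin; i++)
		ret += fprintf(fp, " ");

	return ret;
}
```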
Arnaldo Carvalho de Melo [Mon, 10 Mar 2025 19:45:32 +0000 (16:45 -0300)]
perf units: Fix insufficient array space
No need to specify the array size, let the compiler figure that out.
This addresses this compiler warning that was noticed while build
testing on fedora rawhide:
31 15.81 fedora:rawhide : FAIL gcc version 15.0.1 20250225 (Red Hat 15.0.1-0) (GCC)
util/units.c: In function 'unit_number__scnprintf':
util/units.c:67:24: error: initializer-string for array of 'char' is too long [-Werror=unterminated-string-initialization]
67 | char unit[4] = "BKMG";
| ^~~~~~
cc1: all warnings being treated as errors
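The warning above comes from "BKMG" needing five bytes once the trailing NUL is counted. A minimal sketch of the fix follows, using a simplified stand-in for the helper (the name `unit_number_snprintf` and the use of plain `snprintf` are assumptions for this example, not the perf code):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for the helper named in the warning above. */
int unit_number_snprintf(char *buf, size_t size, uint64_t n)
{
	/*
	 * Leaving the size out lets the compiler allocate all 5 bytes,
	 * including the terminating NUL that "char unit[4]" had no room for.
	 */
	const char unit[] = "BKMG";
	size_t i = 0;

	/* Scale down by 1024 until the value is small or units run out. */
	while ((n / 1024) > 1 && i < sizeof(unit) - 2) {
		n /= 1024;
		i++;
	}

	return snprintf(buf, size, "%" PRIu64 "%c", n, unit[i]);
}
```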
Arnaldo Carvalho de Melo [Mon, 10 Mar 2025 19:45:31 +0000 (16:45 -0300)]
libapi: Add missing header with NAME_MAX define to io_dir.h
Most systems get this indirectly, but some odd cases (some musl libc
systems) can't find it, so just add the header where NAME_MAX is defined
to avoid that.
Fixes: d118b08f7eee6d6f ("tools lib api: Add io_dir an allocation free readdir alternative")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250310194534.265487-2-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Namhyung Kim [Mon, 10 Mar 2025 22:49:25 +0000 (15:49 -0700)]
perf annotate: Add --code-with-type option.
This option is to show data type info in the regular (code) annotation.
It tries to find data type for each (memory) instruction in the
function. It'd be useful to see function-level memory access pattern
and also to debug the data type profiling result.
The output would be added at the end of the line and have "# data-type:"
prefix.
For now, it only works with --stdio mode for simplicity. I can work on
enabling it for TUI later.
Namhyung Kim [Mon, 10 Mar 2025 22:49:24 +0000 (15:49 -0700)]
perf annotate: Implement code + data type annotation
Sometimes it's useful to see both instructions and their data type
together. Let's extend the annotate code to use data type profiling
functions.
To make it easy to pass more arguments, introduce a struct to carry
necessary information together. Also add a new annotation_option called
'code_with_type' to control the behavior. This is not enabled yet but
it'll be set later from the command line.
For simplicity, this is implemented for --stdio only.
Namhyung Kim [Mon, 10 Mar 2025 22:49:23 +0000 (15:49 -0700)]
perf annotate: Factor out __hist_entry__get_data_type()
So that it handles only a single disasm_line, which hopefully makes the
code simpler. This is also preparation for calling it from different
places later.
The NO_TYPE macro was added to distinguish when it failed or needs retry.
Namhyung Kim [Mon, 10 Mar 2025 22:49:22 +0000 (15:49 -0700)]
perf annotate: Pass hist_entry to annotate functions
It's a preparation to support code annotation and data type
annotation at the same time. Data type annotation needs more
information in the hist_entry so it needs to be passed deeper.
Also rename a function with the same name in the builtin-annotate.c
to hist_entry__stdio_annotate since it matches better to the command
line option. And change the condition inside to be simpler.
Namhyung Kim [Mon, 10 Mar 2025 22:49:21 +0000 (15:49 -0700)]
perf annotate: Pass annotation_options to annotation_line__print()
The annotation_line__print() has many arguments. But min_percent,
max_lines and percent_type are from struct annotation_options. So let's
pass a pointer to the option instead of passing them separately to
reduce the number of function arguments.
Actually it has a recursive call if 'queue' is set. Add a new option
instance to pass different values for the case.
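The idea of passing one options pointer, and using a separate instance for the recursive 'queue' case, can be sketched as below. The struct is a hypothetical subset, the function names are invented for this example, and the relaxed values used for the queued case are chosen for illustration rather than taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical subset of the options mentioned in the text. */
struct print_options {
	double min_percent;
	int max_lines;
	int percent_type;
};

/* Decide whether a line should be printed under the given options. */
bool line_should_print(double percent, int printed,
		       const struct print_options *opts)
{
	if (opts->max_lines && printed >= opts->max_lines)
		return false;
	return percent >= opts->min_percent;
}

/*
 * For queued lines, build a separate options instance with different
 * values instead of modifying the caller's options.
 */
bool queued_line_should_print(double percent,
			      const struct print_options *opts)
{
	struct print_options queue_opts = *opts;

	queue_opts.min_percent = 0.0;
	queue_opts.max_lines = 1;
	return line_should_print(percent, 0, &queue_opts);
}
```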
Factor out a function to get the name of member field at the given
offset. This will be used in other places.
Also update the output of typeoff sort key a little bit. As we know
that some special types like (stack operation), (stack canary) and
(unknown) won't have fields, skip printing the offset and field.
For example, the following change is expected.
"(stack operation) +0 (no field)" ==> "(stack operation)"
Namhyung Kim [Thu, 27 Feb 2025 19:12:22 +0000 (11:12 -0800)]
perf ftrace: Remove an unnecessary condition check in BPF
The bucket_num is set based on the {max,min}_latency already in
cmd_ftrace(), so no need to check it again in BPF. Also I found
that it didn't pass the max_latency to BPF. :)
Namhyung Kim [Thu, 27 Feb 2025 19:12:21 +0000 (11:12 -0800)]
perf ftrace: Fix latency stats with BPF
When BPF collects the stats for the latency in usec, it first divides
the time by 1000. But that means it would have 0 if the delta is small
and won't update the total time properly.
Let's keep the stats in nsec always and adjust to usec before printing.
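The precision problem can be seen in a small sketch. The struct and function names here are hypothetical and only illustrate keeping the total in nanoseconds and converting when reporting:

```c
#include <stdint.h>

/* Hypothetical accumulator; field names are illustrative only. */
struct latency_stats {
	uint64_t total_ns;
	uint64_t count;
};

/* Record one sample, keeping full nanosecond precision. */
void latency_stats_update(struct latency_stats *st, uint64_t delta_ns)
{
	st->total_ns += delta_ns;
	st->count++;
}

/* Convert to microseconds only when reporting. */
uint64_t latency_stats_total_usec(const struct latency_stats *st)
{
	return st->total_ns / 1000;
}
```

With ten samples of 400 nsec each, dividing every delta by 1000 before adding would accumulate 0, while the version above reports a total of 4 usec.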
Ian Rogers [Fri, 7 Mar 2025 02:39:06 +0000 (18:39 -0800)]
perf test stat: Additional topdown grouping tests
Add a loop and helper function to avoid repetition. The loop uses
arrays, so switch the shell to bash. Add additional topdown group tests
where a topdown event needs to be moved beyond others and the slots
event isn't first in the target group. This replicates issues that
occur on hybrid systems where the other events are for the cpu_atom
PMU. Test with both PMU and software events. Place the slots event
later in the event list.
Dapeng Mi [Fri, 7 Mar 2025 02:39:05 +0000 (18:39 -0800)]
perf x86 evlist: Update comments on topdown regrouping
Update the comments to drop the claim that such groupings don't work,
since the:
```
perf stat -e "{instructions,slots},{cycles,topdown-retiring}"
```
case now works.
Ian Rogers [Fri, 7 Mar 2025 02:39:04 +0000 (18:39 -0800)]
perf parse-events: Corrections to topdown sorting
In the case of '{instructions,slots},faults,topdown-retiring' the
first event that must be grouped, slots, is ignored causing the
topdown-retiring event not to be adjacent to the group it needs to be
inserted into. Don't ignore the group members when computing the
force_grouped_index.
Make the force_grouped_index be for the leader of the group it is
within and always use it first rather than a group leader index so
that topdown events may be sorted from one group into another.
As the PMU name comparison applies to moving events in the same group
ensure the name ordering is always respected.
Change the group splitting logic to not group if there are no other
topdown events and to fix cases where the force group leader wasn't
being grouped with the other members of its group.
Reported-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Closes: https://lore.kernel.org/lkml/20250224083306.71813-2-dapeng1.mi@linux.intel.com/
Closes: https://lore.kernel.org/lkml/f7e4f7e8-748c-4ec7-9088-0e844392c11a@linux.intel.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250307023906.1135613-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Dapeng Mi [Fri, 7 Mar 2025 02:39:03 +0000 (18:39 -0800)]
perf x86/topdown: Fix topdown leader sampling test error on hybrid
When running the topdown leader sampling test on Intel hybrid platforms,
such as LNL/ARL, we see the below error.
Topdown leader sampling test
Topdown leader sampling [Failed topdown events not reordered correctly]
It indicates the below command fails.
perf record -o "${perfdata}" -e "{instructions,slots,topdown-retiring}:S" true
The root cause is that the perf tool creates a perf event for each PMU
type that supports it.
For this command, 5 perf events are created:
cpu_atom/instructions/,cpu_atom/topdown_retiring/,
cpu_core/slots/,cpu_core/instructions/,cpu_core/topdown-retiring/
For these 5 events, the 2 cpu_atom events are in a group and the other 3
cpu_core events are in another group.
When arch_topdown_sample_read() traverses all 5 events, the events
cpu_atom/instructions/ and cpu_core/slots/ don't have the same group
leader, so it returns false directly. That leads to the cpu_core/slots/
event being used to sample, which is not allowed by the PMU driver.
It's overkill to return false directly if "evsel->core.leader !=
leader->core.leader", since there could be multiple groups in the event
list.
Just "continue" instead of "return false" to fix this issue.
Fixes: 1e53e9d1787b ("perf x86/topdown: Correct leader selection with sample_read enabled")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250307023906.1135613-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
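The effect of using "continue" rather than "return false" can be sketched with a heavily simplified model. The `struct event` type and the function name are hypothetical; the real code walks an evlist and performs additional checks that are omitted here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified event: only what the loop needs. */
struct event {
	const struct event *leader;	/* group leader (itself for a leader) */
	bool is_topdown_metric;
};

/*
 * Return true if another member of the leader's group is a topdown
 * metric event. Events that belong to other groups are skipped rather
 * than treated as a failure.
 */
bool group_has_topdown_metric(const struct event *events, size_t nr,
			      const struct event *leader)
{
	for (size_t i = 0; i < nr; i++) {
		const struct event *ev = &events[i];

		if (ev->leader != leader->leader)
			continue;	/* previously: return false */
		if (ev != leader && ev->is_topdown_metric)
			return true;
	}
	return false;
}
```

With two groups in the list, the early return would stop at the first event of the other group and never inspect the members of the leader's own group.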
Ian Rogers [Fri, 7 Mar 2025 02:39:02 +0000 (18:39 -0800)]
perf tools: Improve handling of hybrid PMUs in perf_event_attr__fprintf
Support the PMU name from the legacy hardware and hw_cache PMU
extended types. Remove some macros and make variables more intention
revealing, rather than just being called "value".
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250307023906.1135613-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 28 Feb 2025 22:23:06 +0000 (14:23 -0800)]
perf python: Add evlist all_cpus accessor
Add a means to get the reference counted all_cpus CPU map from an
evlist in its python form.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 28 Feb 2025 22:23:05 +0000 (14:23 -0800)]
perf python: Avoid duplicated code in get_tracepoint_field
The code replicates computations done in evsel__tp_format, reuse
evsel__tp_format to simplify the python C code.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 28 Feb 2025 22:23:04 +0000 (14:23 -0800)]
perf python: Update ungrouped evsel leader in clone
evsels are cloned in the python code as they form part of the Python
object pyrf_evsel. The cloning doesn't update the evsel's leader; do
this for the case of an evsel being ungrouped.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 28 Feb 2025 22:23:03 +0000 (14:23 -0800)]
perf python: Add optional cpus and threads arguments to parse_events
Used for the evlist initialization.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 28 Feb 2025 22:23:02 +0000 (14:23 -0800)]
perf python: Add member access to a number of evsel variables
Most variables are part of the perf_event_attr, so that they may be
queried and modified.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 28 Feb 2025 22:23:01 +0000 (14:23 -0800)]
perf python: Add evlist enable and disable methods
By default the evsels from parse_events will be disabled. Add access
to the evlist functions so they can be enabled/disabled.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 28 Feb 2025 22:23:00 +0000 (14:23 -0800)]
perf evsel: tp_format accessing improvements
Ensure evsel__clone copies the tp_sys and tp_name variables.
In evsel__tp_format, if tp_sys isn't set, use the config value to find
the tp_format. This succeeds in python code where pyrf__tracepoint has
already found the format.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-4-irogers@google.com
Fixes: 6c8310e8380d472c ("perf evsel: Allow evsel__newtp without libtraceevent")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Ian Rogers [Fri, 28 Feb 2025 22:22:59 +0000 (14:22 -0800)]
perf evlist: Add success path to evlist__create_syswide_maps
Over various refactorings evlist__create_syswide_maps has been made to
only ever return with -ENOMEM. Fix this so that when
perf_evlist__set_maps is successfully called, 0 is returned.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-3-irogers@google.com
Fixes: 8c0498b6891d7ca5 ("perf evlist: Fix create_syswide_maps() not propagating maps")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
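One way a function can end up always returning -ENOMEM is by falling through into its cleanup labels. The sketch below shows the general pattern with an explicit success return. The `struct maps` type and `create_maps` name are hypothetical, and this is not claimed to be how the actual function is structured.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the maps attached on success. */
struct maps {
	int *cpus;
	int *threads;
};

/*
 * Allocate both maps; return 0 on success and -ENOMEM on failure.
 * Without the explicit "return 0", control would fall through into the
 * cleanup labels and report -ENOMEM even when everything worked.
 */
int create_maps(struct maps *m)
{
	int *cpus, *threads;

	cpus = calloc(1, sizeof(*cpus));
	if (!cpus)
		goto out;

	threads = calloc(1, sizeof(*threads));
	if (!threads)
		goto out_free_cpus;

	m->cpus = cpus;
	m->threads = threads;
	return 0;	/* the success path */

out_free_cpus:
	free(cpus);
out:
	return -ENOMEM;
}
```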
Ian Rogers [Fri, 28 Feb 2025 22:22:58 +0000 (14:22 -0800)]
perf debug: Avoid stack overflow in recursive error message
In debug_file, pr_warning_once is called on error. As that function
calls debug_file the function will yield a stack overflow. Switch the
location of the call so the recursion is avoided.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-2-irogers@google.com
Fixes: ec49230cf6dda704 ("perf debug: Expose debug file")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
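The recursion and its fix can be modeled in a few lines. All names below (`debug_fp`, `warn_once`, `get_debug_file`) are hypothetical stand-ins, and the warn-once helper sets its flag after printing, which is an assumption about how such helpers commonly behave.

```c
#include <stdio.h>

static FILE *debug_fp;	/* hypothetical: unset until configured */
static int warned;

FILE *get_debug_file(void);

/* Warn once. Note that the stream comes from get_debug_file(). */
void warn_once(const char *msg)
{
	if (warned)
		return;
	fprintf(get_debug_file(), "%s\n", msg);
	warned = 1;
}

FILE *get_debug_file(void)
{
	if (!debug_fp) {
		/*
		 * Install the fallback first, then warn. The nested call
		 * made by warn_once() now sees a valid stream and returns
		 * immediately. Warning before the assignment would re-enter
		 * this branch on every nested call.
		 */
		debug_fp = stderr;
		warn_once("debug file not set, using stderr");
	}
	return debug_fp;
}
```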
Stephen Brennan [Fri, 7 Mar 2025 23:22:03 +0000 (15:22 -0800)]
perf symbol: Support .gnu_debugdata for symbols
Fedora introduced a "MiniDebuginfo" feature, in which an LZMA-compressed
ELF file is placed inside a section named ".gnu_debugdata". This file
contains nothing but a symbol table, which can be used to supplement the
.dynsym section which only contains required symbols for runtime.
It is supported by GDB for stack traces, but it should be useful for
tracing as well. Implement support for loading symbols from
.gnu_debugdata.
Stephen Brennan [Fri, 7 Mar 2025 23:22:02 +0000 (15:22 -0800)]
perf tools: Add LZMA decompression from FILE
Internally lzma_decompress_to_file() creates a FILE from the filename.
Add an API that takes an existing FILE directly. This allows
decompressing already-open files and even buffers opened by fmemopen().
It is necessary for supporting .gnu_debugdata in the next patch.
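The shape of such an API split can be sketched as follows. This is not the perf code: the function names are invented, and the stream routine merely counts bytes where a real implementation would feed them to the LZMA decoder.

```c
#include <stdio.h>

/*
 * Core routine: works on any already-open stream. Here it only counts
 * bytes; a real decompressor would pass them to the decoder instead.
 */
long consume_stream(FILE *fp)
{
	long n = 0;

	while (fgetc(fp) != EOF)
		n++;
	return n;
}

/* Filename wrapper kept for existing callers. */
long consume_path(const char *path)
{
	FILE *fp = fopen(path, "rb");
	long n;

	if (!fp)
		return -1;
	n = consume_stream(fp);
	fclose(fp);
	return n;
}
```

A caller holding the data in memory could wrap the buffer with fmemopen() and hand the resulting FILE to the stream variant, with no temporary file involved.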
Ian Rogers [Sat, 8 Mar 2025 01:28:53 +0000 (17:28 -0800)]
perf mem: Don't leak mem event names
When preparing the mem events for argv, copies are intentionally
made. These copies are leaked and cause runs of perf using address
sanitizer to fail. Rather than leak the memory, allocate a chunk of
memory for the mem event names upfront and build the strings in it;
the storage is sized larger than the previous buffer size. The caller
is then responsible for cleaning up this memory. As part of this
change, remove the mem_loads_name and mem_stores_name global buffers,
then change perf_pmu__mem_events_name to write to an out argument
buffer.
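Writing into an out argument buffer instead of a global one generally looks like the sketch below. The function name, the "pmu/event/" format and the sample names are illustrative assumptions, not the actual perf signature.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build an event name into caller-provided storage rather than a
 * global buffer. Returns 0 on success, -1 on error or truncation.
 */
int mem_event_name(char *buf, size_t size, const char *pmu,
		   const char *event)
{
	int n = snprintf(buf, size, "%s/%s/", pmu, event);

	if (n < 0 || (size_t)n >= size)
		return -1;
	return 0;
}
```

The caller owns the storage, so it can allocate one chunk up front for all names and free it once the argv has been used.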
Eric Lin [Thu, 13 Feb 2025 01:21:40 +0000 (17:21 -0800)]
perf vendor events riscv: Add SiFive P650 events
The SiFive Performance P650 core (including the vector-enabled P670 and
area-optimized P450/P470 variants) updates the P550 microarchitecture.
It brings in the debug, trace, and counter events from newer Bullet
cores, and adds new events for iTLB and dTLB multi-hits.
All other PMU events are unchanged from the P550 core.
Signed-off-by: Eric Lin <eric.lin@sifive.com>
Co-developed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-8-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Eric Lin [Thu, 13 Feb 2025 01:21:39 +0000 (17:21 -0800)]
perf vendor events riscv: Add SiFive P550 events
The SiFive Performance P550 core features an out-of-order
microarchitecture which exposes the same PMU events as Bullet,
plus events for UTLB hits and PTE cache misses/hits.
Add support for specifying these events using symbolic names.
Signed-off-by: Eric Lin <eric.lin@sifive.com>
Co-developed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-7-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Eric Lin [Thu, 13 Feb 2025 01:21:38 +0000 (17:21 -0800)]
perf vendor events riscv: Add SiFive Bullet version 0x0d events
SiFive Bullet microarchitecture cores with mimpid values starting with
0x0d or greater add new PMU events to count TLB miss stall cycles.
All other PMU events are unchanged from earlier Bullet cores.
Signed-off-by: Eric Lin <eric.lin@sifive.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-6-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Eric Lin [Thu, 13 Feb 2025 01:21:37 +0000 (17:21 -0800)]
perf vendor events riscv: Add SiFive Bullet version 0x07 events
SiFive Bullet microarchitecture cores with mimpid values starting with
0x07 or greater add new PMU events to support debug, trace, and counter
sampling and filtering (Sscofpmf).
All other PMU events are unchanged from earlier Bullet cores.
Signed-off-by: Eric Lin <eric.lin@sifive.com>
Co-developed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-5-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Regenerate the event lists from the original hardware description. This
makes them consistent with the event lists for newer versions of the
hardware, allowing most files to be reused across hardware versions.
Signed-off-by: Eric Lin <eric.lin@sifive.com>
Co-developed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-4-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Samuel Holland [Thu, 13 Feb 2025 01:21:35 +0000 (17:21 -0800)]
perf vendor events riscv: Remove leading zeroes
The EventCode field (as stored in the mhpmeventN CSRs) is actually 56
bits wide, but there is no need to keep leading zeroes in the JSON
files. Remove them to simplify review of the following change, which
regenerates the files in a way that does not include leading zeroes.
This change was performed automatically with `sed -i "s/0x0*/0x/"`.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-3-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Samuel Holland [Thu, 13 Feb 2025 01:21:34 +0000 (17:21 -0800)]
perf vendor events riscv: Rename U74 to Bullet
This set of PMU event descriptions applies not only to the SiFive U74
core configuration, but also to other SiFive cores that implement the
Bullet microarchitecture (such as U64, P270, and X280). Rename the
directory to be more generic.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-2-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
perf_color_default_config() was added in 2009 by
commit 8fc0321f1ad0 ("perf_counter tools: Add color terminal output
support")
but has remained unused.
Ian Rogers [Wed, 26 Feb 2025 23:01:09 +0000 (15:01 -0800)]
perf tests: Fix data symbol test with LTO builds
With LTO builds (although regular builds could also see this, as
all the code is in one file), the compiler can determine that the
datasym workload's buf1.reserved data is never accessed. It then moves
the variable to bss and only keeps the data1 and data2 parts as
separate variables. This causes the symbol check to fail in the
test. Make the variable volatile to disable the more aggressive
optimization. Rename the variable to make clear which buf1 in perf is
being referred to.
Before:
$ perf test -vv "data symbol"
126: Test data symbol:
--- start ---
test child forked, pid 299808
perf does not have symbol 'buf1'
perf is missing symbols - skipping test
---- end(-2) ----
126: Test data symbol : Skip
$ nm perf|grep buf1
0000000000a5fa40 b buf1.0
0000000000a5fa48 b buf1.1
After:
$ nm perf|grep buf1
0000000000a53a00 d buf1
$ perf test -vv "data symbol"
126: Test data symbol:
--- start ---
test child forked, pid 302166
a53a00-a53a39 l buf1
perf does have symbol 'buf1'
Recording workload...
Waiting for "perf record has started" message
OK
Cleaning up files...
---- end(0) ----
126: Test data symbol : Ok
I found that hist_entry__delete() failed to release child entries in the
hierarchy tree (hroot_{in,out}). It needs to iterate the child entries
and call hist_entry__delete() recursively.