www.infradead.org Git - users/willy/xarray.git/log
users/willy/xarray.git
15 months ago perf python: Clean up build dependencies
Ian Rogers [Tue, 25 Jun 2024 21:41:17 +0000 (14:41 -0700)]
perf python: Clean up build dependencies

The python build now depends on libraries and doesn't use
python-ext-sources except for the util/python.c dependency. Switch to
just directly depending on that file and util/setup.py. This allows
the removal of python-ext-sources.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-9-irogers@google.com
15 months ago perf python: Switch module to linking libraries from building source
Ian Rogers [Tue, 25 Jun 2024 21:41:16 +0000 (14:41 -0700)]
perf python: Switch module to linking libraries from building source

setup.py was building most perf sources, which forced it to mimic the
Makefile logic and required flex/bison code to be stubbed out because it
is complex to build. By using libraries, fewer functions are stubbed
out, the build is faster, and the Makefile logic is reused, which should
simplify updating. The libraries are passed through LDFLAGS to avoid
complexity in python.

Force the -fPIC flag for libbpf.a to ensure it is suitable for linking
into the perf python module.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-8-irogers@google.com
15 months ago perf util: Make util its own library
Ian Rogers [Tue, 25 Jun 2024 21:41:15 +0000 (14:41 -0700)]
perf util: Make util its own library

Make the util directory into its own library. This is done to avoid
compiling code twice, once for the perf tool and once for the perf
python module. For convenience:
  arch/common.c
  scripts/perl/Perf-Trace-Util/Context.c
  scripts/python/Perf-Trace-Util/Context.c
are made part of this library.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-7-irogers@google.com
15 months ago perf bench: Make bench its own library
Ian Rogers [Tue, 25 Jun 2024 21:41:14 +0000 (14:41 -0700)]
perf bench: Make bench its own library

Make the benchmark code into a library so it may be linked against
things like the python module to avoid compiling code twice.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-6-irogers@google.com
15 months ago perf test: Make tests its own library
Ian Rogers [Tue, 25 Jun 2024 21:41:13 +0000 (14:41 -0700)]
perf test: Make tests its own library

Make the tests code its own library. This is done to avoid compiling
code twice, once for the perf tool and once for the perf python
module.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-5-irogers@google.com
15 months ago perf pmu-events: Make pmu-events a library
Ian Rogers [Tue, 25 Jun 2024 21:41:12 +0000 (14:41 -0700)]
perf pmu-events: Make pmu-events a library

Make pmu-events into a library so it may be linked against things like
the python module and not built from source.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-4-irogers@google.com
15 months ago perf ui: Make ui its own library
Ian Rogers [Tue, 25 Jun 2024 21:41:11 +0000 (14:41 -0700)]
perf ui: Make ui its own library

Make the ui code its own library. This is done to avoid compiling code
twice, once for the perf tool and once for the perf python module.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-3-irogers@google.com
15 months ago perf build: Add '*.a' to clean targets
Ian Rogers [Tue, 25 Jun 2024 21:41:10 +0000 (14:41 -0700)]
perf build: Add '*.a' to clean targets

Fix some excessively long lines by deploying '\'.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-2-irogers@google.com
15 months ago perf mem: Fix a segfault with NULL event->name
Namhyung Kim [Fri, 21 Jun 2024 17:05:28 +0000 (10:05 -0700)]
perf mem: Fix a segfault with NULL event->name

Guilherme reported a crash in perf mem record.  It's because the
perf_mem_event->name was NULL on his machine.  It should just return
a NULL string when it has no format string in the name.

The backtrace at the crash is below:

  Program received signal SIGSEGV, Segmentation fault.
  __strchrnul_avx2 () at ../sysdeps/x86_64/multiarch/strchr-avx2.S:67
  67              vmovdqu (%rdi), %ymm2
  (gdb) bt
  #0  __strchrnul_avx2 () at ../sysdeps/x86_64/multiarch/strchr-avx2.S:67
  #1  0x00007ffff6c982de in __find_specmb (format=0x0) at printf-parse.h:82
  #2  __printf_buffer (buf=buf@entry=0x7fffffffc760, format=format@entry=0x0, ap=ap@entry=0x7fffffffc880,
      mode_flags=mode_flags@entry=0) at vfprintf-internal.c:649
  #3  0x00007ffff6cb7840 in __vsnprintf_internal (string=<optimized out>, maxlen=<optimized out>, format=0x0,
      args=0x7fffffffc880, mode_flags=mode_flags@entry=0) at vsnprintf.c:96
  #4  0x00007ffff6cb787f in ___vsnprintf (string=<optimized out>, maxlen=<optimized out>, format=<optimized out>,
      args=<optimized out>) at vsnprintf.c:103
  #5  0x00005555557b9391 in scnprintf (buf=0x555555fe9320 <mem_loads_name> "", size=100, fmt=0x0)
      at ../lib/vsprintf.c:21
  #6  0x00005555557b74c3 in perf_pmu__mem_events_name (i=0, pmu=0x555556832180) at util/mem-events.c:106
  #7  0x00005555557b7ab9 in perf_mem_events__record_args (rec_argv=0x55555684c000, argv_nr=0x7fffffffca20)
      at util/mem-events.c:252
  #8  0x00005555555e370d in __cmd_record (argc=3, argv=0x7fffffffd760, mem=0x7fffffffcd80) at builtin-mem.c:156
  #9  0x00005555555e49c4 in cmd_mem (argc=4, argv=0x7fffffffd760) at builtin-mem.c:514
  #10 0x000055555569716c in run_builtin (p=0x555555fcde80 <commands+672>, argc=8, argv=0x7fffffffd760) at perf.c:349
  #11 0x0000555555697402 in handle_internal_command (argc=8, argv=0x7fffffffd760) at perf.c:402
  #12 0x0000555555697560 in run_argv (argcp=0x7fffffffd59c, argv=0x7fffffffd590) at perf.c:446
  #13 0x00005555556978a6 in main (argc=8, argv=0x7fffffffd760) at perf.c:562
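
Frame #5 of the backtrace shows scnprintf() being called with fmt=0x0. A minimal sketch of the kind of guard described above could look like the following. It assumes a simplified struct and function: mem_event and mem_event_name are hypothetical names for illustration, not the perf code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a memory event entry; only `name` matters here. */
struct mem_event {
	const char *name;	/* printf-style format string, may be NULL */
};

static char name_buf[100];

/*
 * Return the formatted event name, or NULL when the event has no
 * format string, so a NULL format never reaches the printf family.
 */
const char *mem_event_name(const struct mem_event *e, const char *pmu_name)
{
	if (!e || !e->name)
		return NULL;

	snprintf(name_buf, sizeof(name_buf), e->name, pmu_name);
	return name_buf;
}
```

Callers of such a helper then have to treat a NULL return as "no event name available" instead of passing it on.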

Reported-by: Guilherme Amadio <amadio@cern.ch>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Closes: https://lore.kernel.org/linux-perf-users/Zlns_o_IE5L28168@cern.ch
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240621170528.608772-5-namhyung@kernel.org
15 months ago perf tools: Fix a compiler warning of NULL pointer
Namhyung Kim [Fri, 21 Jun 2024 17:05:27 +0000 (10:05 -0700)]
perf tools: Fix a compiler warning of NULL pointer

The second argument of bsearch() must not be NULL, but there is a case
where we might pass NULL, which triggers the warning below.  Let's return
early if we don't have any DSOs to search in __dsos__find_by_longname_id().

  util/dsos.c:184:8: runtime error: null pointer passed as argument 2, which is declared to never be null
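
The early return can be sketched as below. This is an illustration on a sorted array of strings, assuming hypothetical names (find_name, cmp_str); it is not the code in util/dsos.c.

```c
#include <stdlib.h>
#include <string.h>

/* bsearch() comparator: the key is a string, the element a pointer to one. */
static int cmp_str(const void *key, const void *elem)
{
	return strcmp((const char *)key, *(const char *const *)elem);
}

/*
 * Look up `key` in a sorted array of names. `names` may be NULL when
 * `count` is 0, so return before bsearch() ever sees a NULL base.
 */
const char *find_name(const char *const *names, size_t count, const char *key)
{
	const char *const *hit;

	if (!names || count == 0)
		return NULL;

	hit = bsearch(key, names, count, sizeof(*names), cmp_str);
	return hit ? *hit : NULL;
}
```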

Reported-by: kernel test robot <oliver.sang@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Closes: https://lore.kernel.org/oe-lkp/202406180932.84be448c-oliver.sang@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240621170528.608772-4-namhyung@kernel.org
15 months ago perf symbol: Simplify kernel module checking
Namhyung Kim [Fri, 21 Jun 2024 17:05:26 +0000 (10:05 -0700)]
perf symbol: Simplify kernel module checking

In dso__load(), it checks if the dso is a kernel module by looking at the
symtab type.  Actually dso has an 'is_kmod' field to check that easily, and
dso__set_module_info() sets both the symtab type and the is_kmod bit.  So
checking the is_kmod bit should give the same result.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240621170528.608772-3-namhyung@kernel.org
15 months ago perf report: Fix condition in sort__sym_cmp()
Namhyung Kim [Fri, 21 Jun 2024 17:05:25 +0000 (10:05 -0700)]
perf report: Fix condition in sort__sym_cmp()

It's expected that both hist entries are in the same hists when
comparing two.  But the current code in the function checks one for the
absence of the dso sort key and the other for its presence.  This would
make the condition true in any case.

I guess the intention of the original commit was to add '!' for the
right side too.  But as it should be the same, let's just remove it.
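
Why the old condition is always true can be seen in a reduced form. The sketch below assumes the check boils down to two booleans; old_condition and new_condition are hypothetical names, not the functions in the perf source.

```c
#include <stdbool.h>

/*
 * Reduced form of the old check: one side negated, the other not.
 * When both entries share the same hists the two inputs are equal,
 * so this is (!x || x), which holds for every x.
 */
bool old_condition(bool left_has_dso, bool right_has_dso)
{
	return !left_has_dso || right_has_dso;
}

/* Reduced form after the fix: only one entry needs to be consulted. */
bool new_condition(bool has_dso)
{
	return !has_dso;
}
```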

Fixes: 69849fc5d2119 ("perf hists: Move sort__has_dso into struct perf_hpp_list")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240621170528.608772-2-namhyung@kernel.org
15 months ago perf pmus: Fixes always false when compare duplicates aliases
Junhao He [Fri, 14 Jun 2024 09:43:18 +0000 (17:43 +0800)]
perf pmus: Fixes always false when compare duplicates aliases

In the previous loop, all the members in the aliases[j-1] have been freed
and set to NULL. But in this loop, the function pmu_alias_is_duplicate()
compares the aliases[j] with the aliases[j-1] that has already been
disposed, so the function will always return false and duplicate aliases
will never be discarded.

If we find duplicate aliases, it skips the zfree of aliases[j], which
leaks memory.

We can use the next entry, aliases[j+1], to check for duplicate aliases,
which fixes the comparison against already-freed (NULL) members, and then
goto the zfree code snippet to release it.
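
Comparing against the next entry rather than the one just freed can be sketched as follows. This is a simplified illustration on a sorted array of heap-allocated strings; print_unique and its emit callback are hypothetical, not the perf code.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Walk a sorted array of heap-allocated names, pass each unique name
 * to `emit` (if given), and free every entry, duplicates included.
 * Returns the number of unique names.
 */
size_t print_unique(char **names, size_t n, void (*emit)(const char *))
{
	size_t unique = 0;

	for (size_t j = 0; j < n; j++) {
		/* names[j + 1] is still alive here; names[j - 1] is not. */
		int dup = (j + 1 < n) && strcmp(names[j], names[j + 1]) == 0;

		if (!dup) {
			if (emit)
				emit(names[j]);
			unique++;
		}
		/* Always release the current entry, so duplicates do not leak. */
		free(names[j]);
		names[j] = NULL;
	}
	return unique;
}
```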

After patch testing:
 $ perf list --unit=hisi_sicl,cpa pmu

 uncore cpa:
   cpa_p0_rd_dat_32b
        [Number of read ops transmitted by the P0 port which size is 32 bytes.
         Unit: hisi_sicl,cpa]
   cpa_p0_rd_dat_64b
        [Number of read ops transmitted by the P0 port which size is 64 bytes.
         Unit: hisi_sicl,cpa]

Fixes: c3245d2093c1 ("perf pmu: Abstract alias/event struct")
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Cc: ravi.bangoria@amd.com
Cc: james.clark@arm.com
Cc: prime.zeng@hisilicon.com
Cc: cuigaosheng1@huawei.com
Cc: jonathan.cameron@huawei.com
Cc: linuxarm@huawei.com
Cc: yangyicong@huawei.com
Cc: robh@kernel.org
Cc: renyu.zj@linux.alibaba.com
Cc: kjain@linux.ibm.com
Cc: john.g.garry@oracle.com
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240614094318.11607-1-hejunhao3@huawei.com
15 months ago perf unwind-libunwind: Add malloc() failure handling
Yunseong Kim [Wed, 19 Jun 2024 20:42:12 +0000 (05:42 +0900)]
perf unwind-libunwind: Add malloc() failure handling

Add malloc() failure handling in unread_unwind_spec_debug_frame().
This makes the caller find_proc_info() work correctly when the allocation fails.
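
The general shape of such handling is to check the malloc() result and report the failure to the caller. A minimal sketch, assuming a hypothetical struct and helper (debug_frame_ref and debug_frame_ref_init are not names from the perf source):

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record that keeps its own copy of a file path. */
struct debug_frame_ref {
	char *path;
};

/*
 * Copy `path` into `ref`. Returns 0 on success or -ENOMEM when the
 * allocation fails, so the caller can bail out instead of using NULL.
 */
int debug_frame_ref_init(struct debug_frame_ref *ref, const char *path)
{
	size_t len = strlen(path) + 1;

	ref->path = malloc(len);
	if (!ref->path)
		return -ENOMEM;

	memcpy(ref->path, path, len);
	return 0;
}
```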

Signed-off-by: Yunseong Kim <yskelg@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Austin Kim <austindh.kim@gmail.com>
Cc: shjy180909@gmail.com
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240619204211.6438-2-yskelg@gmail.com
15 months ago util: constant -1 with expression of type char
Yunseong Kim [Wed, 19 Jun 2024 20:34:29 +0000 (05:34 +0900)]
util: constant -1 with expression of type char

This patch resolves the following warning.

  tools/perf/util/evsel.c:1620:9: error: result of comparison of constant
   -1 with expression of type 'char' is always false
   -Werror,-Wtautological-constant-out-of-range-compare
   1620 |                 if (c == -1)
        |                     ~ ^  ~~
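
The warning appears where plain 'char' is unsigned, since such a value can never equal -1. The usual remedy is to keep the value in an 'int' until the end marker has been checked. The sketch below illustrates this with a hypothetical fgetc()-like reader (next_byte, count_bytes and END_OF_INPUT are illustrative names, not the evsel.c code).

```c
#include <stddef.h>

#define END_OF_INPUT (-1)

/* Return the next byte as 0..255, or END_OF_INPUT, in the style of fgetc(). */
int next_byte(const unsigned char *buf, size_t len, size_t *pos)
{
	if (*pos >= len)
		return END_OF_INPUT;
	return buf[(*pos)++];
}

/* Count the bytes in `buf` by reading until the end marker. */
size_t count_bytes(const unsigned char *buf, size_t len)
{
	size_t pos = 0, n = 0;
	int c;	/* int, not char: it must hold both -1 and the byte 0xff */

	while ((c = next_byte(buf, len, &pos)) != END_OF_INPUT)
		n++;
	return n;
}
```

With a signed 'char' the byte 0xff would be mistaken for the end marker, and with an unsigned 'char' the comparison would never be true, which is what the compiler reports.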

Signed-off-by: Yunseong Kim <yskelg@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Austin Kim <austindh.kim@gmail.com>
Cc: shjy180909@gmail.com
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240619203428.6330-2-yskelg@gmail.com
15 months ago perf: Timehist account sch delay for scheduled out running
Fernand Sieber [Tue, 18 Jun 2024 09:03:39 +0000 (11:03 +0200)]
perf: Timehist account sch delay for scheduled out running

When using perf timehist, sch delay is only computed for a waking task,
not for a preempted task. This patch changes sch delay to account for
both. This makes sense because testing a scheduling policy needs to
consider the effect of scheduling delay globally, not only for waking tasks.

Example of `perf timehist` report before the patch for `stress` tasks
competing with each other.

The first column is wait time, the second column is sch delay, and the
third column is runtime.

1.492060 [0000]  s    stress[81]                          1.999      0.000      2.000      R  next: stress[83]
1.494060 [0000]  s    stress[83]                          2.000      0.000      2.000      R  next: stress[81]
1.496060 [0000]  s    stress[81]                          2.000      0.000      2.000      R  next: stress[83]
1.498060 [0000]  s    stress[83]                          2.000      0.000      1.999      R  next: stress[81]

After the patch, it looks like this (note that sch delay is no longer
zero):

1.492060 [0000]  s    stress[81]                          1.999      1.999      2.000      R  next: stress[83]
1.494060 [0000]  s    stress[83]                          2.000      2.000      2.000      R  next: stress[81]
1.496060 [0000]  s    stress[81]                          2.000      2.000      2.000      R  next: stress[83]
1.498060 [0000]  s    stress[83]                          2.000      2.000      1.999      R  next: stress[81]

Signed-off-by: Fernand Sieber <sieberf@amazon.com>
Reviewed-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240618090339.87482-1-sieberf@amazon.com
15 months ago perf tests: Add APX and other new instructions to x86 instruction decoder test
Adrian Hunter [Thu, 2 May 2024 10:58:53 +0000 (13:58 +0300)]
perf tests: Add APX and other new instructions to x86 instruction decoder test

Add samples of APX and other new instructions to the 'x86 instruction
decoder - new instructions' test.

Note the test is only available if the perf tool has been built with
EXTRA_TESTS=1.

Example:

  $ make EXTRA_TESTS=1 -C tools/perf
  $ tools/perf/perf test -F -v 'new ins' |& grep -i 'jmpabs\|popp\|pushp'
  Decoded ok: d5 00 a1 ef cd ab 90 78 56 34 12    jmpabs $0x1234567890abcdef
  Decoded ok: d5 08 53                    pushp  %rbx
  Decoded ok: d5 18 50                    pushp  %r16
  Decoded ok: d5 19 57                    pushp  %r31
  Decoded ok: d5 19 5f                    popp   %r31
  Decoded ok: d5 18 58                    popp   %r16
  Decoded ok: d5 08 5b                    popp   %rbx

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Nikolay Borisov <nik.borisov@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-11-adrian.hunter@intel.com
15 months ago perf intel pt: Add new JMPABS instruction to the Intel PT instruction decoder
Adrian Hunter [Thu, 2 May 2024 10:58:52 +0000 (13:58 +0300)]
perf intel pt: Add new JMPABS instruction to the Intel PT instruction decoder

JMPABS is a 64-bit absolute direct jump instruction, encoded with a mandatory
REX2 prefix. JMPABS is designed to be used in the procedure linkage table
(PLT) to replace indirect jumps, because it has better performance. In that
case the jump target will be amended at run time. To enable Intel PT to
follow the code, a TIP packet is always emitted when JMPABS is traced under
Intel PT.

Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture
Specification for details.

Decode JMPABS as an indirect jump, because it has an associated TIP packet
the same as an indirect jump and the control flow should follow the TIP
packet payload, and not assume it is the same as the on-file object code
JMPABS target address.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Nikolay Borisov <nik.borisov@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-10-adrian.hunter@intel.com
15 months ago perf test: Check output of the probe ... --funcs command
Chaitanya S Prakash [Sat, 1 Jun 2024 12:59:46 +0000 (18:29 +0530)]
perf test: Check output of the probe ... --funcs command

Test "perf probe of function from different CU" only checks if the perf
command has failed and doesn't test the --funcs output. In the issue
reported in the previous commit, the garbage output of the --funcs
command was being ignored by the test when it could have been caught.

The script first makes use of --funcs option with the perf probe command
to check if the function "foo" exists in the testfile before adding a
probe to it in the next command. The output of probe...--funcs command
is redirected to stdout, therefore, add '| grep "foo"' to validate the
result.

Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: anshuman.khandual@arm.com
Cc: james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240601125946.1741414-11-ChaitanyaS.Prakash@arm.com
15 months ago tools/perf: Fix parallel-perf python script to replace new python syntax ":=" usage
Athira Rajeev [Sun, 23 Jun 2024 06:48:50 +0000 (12:18 +0530)]
tools/perf: Fix parallel-perf python script to replace new python syntax ":=" usage

perf test "perf script tests" fails as below in systems
with python 3.6

File "/home/athira/linux/tools/perf/tests/shell/../../scripts/python/parallel-perf.py", line 442
if line := p.stdout.readline():
             ^
SyntaxError: invalid syntax
--- Cleaning up ---
---- end(-1) ----
92: perf script tests: FAILED!

This happens because ":=" is new syntax that assigns a value to a
variable as part of a larger expression. It was introduced in
python 3.8 and hence fails in setups with python 3.6.
Address this by splitting the expression and checking the
value in two steps:

Previous line:
  if line := p.stdout.readline():
Current change:
  line = p.stdout.readline()
  if line:

With patch

./perf test "perf script tests"
 93: perf script tests:  Ok

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240623064850.83720-3-atrajeev@linux.vnet.ibm.com
15 months ago tools/perf: Use is_perf_pid_map_name helper function to check dso's of pattern /tmp...
Athira Rajeev [Sun, 23 Jun 2024 06:48:49 +0000 (12:18 +0530)]
tools/perf: Use is_perf_pid_map_name helper function to check dso's of pattern /tmp/perf-%d.map

Commit 80d496be89ed ("perf report: Add support for profiling JIT
generated code") added support for profiling JIT generated code.
That commit handles dso's of the form "/tmp/perf-$PID.map".

Some of the references don't check for exactly the same pattern;
some use "if (!strncmp(dso_name, "/tmp/perf-", 10))". Fix
this by using the helper functions perf_pid_map_tid and
is_perf_pid_map_name, which look for the proper pattern of the
form "/tmp/perf-$PID.map", for these checks.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240623064850.83720-2-atrajeev@linux.vnet.ibm.com
15 months ago tools/perf: Fix the string match for "/tmp/perf-$PID.map" files in dso__load
Athira Rajeev [Sun, 23 Jun 2024 06:48:48 +0000 (12:18 +0530)]
tools/perf: Fix the string match for "/tmp/perf-$PID.map" files in dso__load

Perf test for perf probe of function from different CU fails
as below:

./perf test -vv "test perf probe of function from different CU"
116: test perf probe of function from different CU:
--- start ---
test child forked, pid 2679
Failed to find symbol foo in /tmp/perf-uprobe-different-cu-sh.Msa7iy89bx/testfile
  Error: Failed to add events.
--- Cleaning up ---
"foo" does not hit any event.
  Error: Failed to delete events.
---- end(-1) ----
116: test perf probe of function from different CU                   : FAILED!

The test does the following to probe function "foo":

# gcc -g -Og -flto -c /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.c
-o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.o
# gcc -g -Og -c /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.c
-o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.o
# gcc -g -Og -o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile
/tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.o
/tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.o

# ./perf probe -x /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile foo
Failed to find symbol foo in /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile
   Error: Failed to add events.

Perf probe fails to find symbol foo in the executable placed in
/tmp/perf-uprobe-different-cu-sh.XniNxNEVT7

Simple reproducer:

 # mktemp -d /tmp/perf-checkXXXXXXXXXX
   /tmp/perf-checkcWpuLRQI8j

 # gcc -g -o test test.c
 # cp test /tmp/perf-checkcWpuLRQI8j/
 # nm /tmp/perf-checkcWpuLRQI8j/test | grep foo
   00000000100006bc T foo

 # ./perf probe -x /tmp/perf-checkcWpuLRQI8j/test foo
   Failed to find symbol foo in /tmp/perf-checkcWpuLRQI8j/test
      Error: Failed to add events.

But it works with other paths like /tmp/perf/test. It fails only for
paths starting with "/tmp/perf-".

On further debugging: commit 80d496be89ed ("perf report: Add support
for profiling JIT generated code") added support for profiling JIT
generated code. That commit handles dso's of the form
"/tmp/perf-$PID.map".

The check used "if (strncmp(self->name, "/tmp/perf-", 10) == 0)"
to match "/tmp/perf-$PID.map". With that check, any dso under a
path starting with "/tmp/perf-" is treated specially (not only
JIT created map files). Fix this by changing the string pattern
to check for "/tmp/perf-%d.map". Add a helper function
is_perf_pid_map_name to do this check. In "struct dso",
dso->long_name holds the long name of the dso file. Since the
/tmp/perf-$PID.map check uses the complete name, use dso__long_name for
the string name.
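
The difference between the prefix match and a full-pattern match can be sketched as below. This is an illustration only: looks_like_pid_map is a hypothetical name and the body is one possible way to match "/tmp/perf-%d.map", not necessarily how is_perf_pid_map_name is implemented.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Return true only when `path` is exactly "/tmp/perf-<number>.map".
 * "%n" records how many characters were consumed, which is used to
 * reject trailing text. Note that "%d" also accepts a sign and
 * leading whitespace, so this sketch is slightly lenient.
 */
bool looks_like_pid_map(const char *path, int *pid_out)
{
	int pid = 0, end = 0;

	if (sscanf(path, "/tmp/perf-%d.map%n", &pid, &end) != 1)
		return false;
	/* end stays 0 when ".map" did not match after the number. */
	if (end == 0 || path[end] != '\0')
		return false;

	if (pid_out)
		*pid_out = pid;
	return true;
}
```

With this kind of check a temporary directory such as /tmp/perf-checkcWpuLRQI8j/ is no longer mistaken for a JIT map file, while /tmp/perf-1234.map still matches.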

With the fix,
# ./perf test "test perf probe of function from different CU"
117: test perf probe of function from different CU                   : Ok

Fixes: 56cbeacf1435 ("perf probe: Add test for regression introduced by switch to die_get_decl_file()")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240623064850.83720-1-atrajeev@linux.vnet.ibm.com
16 months agoperf test: Make test_arm_callgraph_fp.sh more robust
James Clark [Wed, 12 Jun 2024 14:03:14 +0000 (15:03 +0100)]
perf test: Make test_arm_callgraph_fp.sh more robust

The 2 second sleep can cause the test to fail on very slow network file
systems because Perf ends up being killed before it finishes starting
up.

Fix it by making the leafloop workload end after a fixed time like the
other workloads so there is no need to kill it after 2 seconds.

Also remove the 1 second start sampling delay because it is similarly
fragile. Instead, search through all samples for a matching one, rather
than just checking the first sample and hoping it's in the right place.
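
The change from "check the first sample" to "search all samples" can be
sketched as follows. This is only an illustration in Python, not the
shell test itself: the sample representation (a list of frame names,
innermost first) and the frame names used are hypothetical.

```python
from typing import List, Sequence


def first_sample_matches(samples: Sequence[Sequence[str]],
                         expected: Sequence[str]) -> bool:
    # Fragile: only passes if the very first sample is the one we want.
    if not samples:
        return False
    return list(samples[0][:len(expected)]) == list(expected)


def any_sample_matches(samples: Sequence[Sequence[str]],
                       expected: Sequence[str]) -> bool:
    # Robust: passes if any sample starts with the expected frames.
    return any(list(s[:len(expected)]) == list(expected) for s in samples)


if __name__ == "__main__":
    # Hypothetical data: the first sample is start-up noise.
    samples: List[List[str]] = [
        ["start_thread", "clone"],
        ["leaf", "parent", "main"],
    ]
    expected = ["leaf", "parent", "main"]
    print(first_sample_matches(samples, expected))
    print(any_sample_matches(samples, expected))
```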

Fixes: cd6382d82752 ("perf test arm64: Test unwinding using fame-pointer (fp) mode")
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: German Gomez <german.gomez@arm.com>
Cc: Spoorthy S <spoorts2@in.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240612140316.3006660-1-james.clark@arm.com
16 months agoperf build: Ensure libtraceevent and libtracefs versions have 3 components
Guilherme Amadio [Thu, 6 Jun 2024 15:33:02 +0000 (17:33 +0200)]
perf build: Ensure libtraceevent and libtracefs versions have 3 components

When either of these has a shorter version, like 1.8, the expression
that computes the version has a syntax error that can be seen in the
output of make:

expr: syntax error: missing argument after +
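
The idea of the fix, padding a short version to three components before
doing arithmetic on it, can be sketched in Python. The helper names are
hypothetical, and the base-255 weighting is an assumption made for the
illustration; it is not taken from this commit message.

```python
from typing import List


def version_components(version: str, count: int = 3) -> List[int]:
    # Split "1.8" into [1, 8] and pad with zeros to `count` entries.
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (count - len(parts))
    return parts[:count]


def version_code(version: str, base: int = 255) -> int:
    # Assumed weighting: major * base^2 + minor * base + patch.
    major, minor, patch = version_components(version)
    return major * base * base + minor * base + patch


if __name__ == "__main__":
    print(version_components("1.8"))
    print(version_code("1.8") == version_code("1.8.0"))
```

With the padding in place, a two-component version such as 1.8 yields
the same value as 1.8.0 instead of leaving the last term of the sum
empty.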

Link: https://bugs.gentoo.org/917559
Reported-by: Peter Volkov <peter.volkov@gmail.com>
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240606153625.2255470-3-amadio@gentoo.org
16 months agoperf build: Use pkg-config for feature check for libtrace{event,fs}
Guilherme Amadio [Thu, 6 Jun 2024 15:33:01 +0000 (17:33 +0200)]
perf build: Use pkg-config for feature check for libtrace{event,fs}

This is needed to add the include directories required for the feature detection
to succeed. The header tracefs.h is installed either into the include
directory /usr/include/tracefs/tracefs.h when using the Makefile, or
into /usr/include/libtracefs/tracefs.h when using meson to build
libtracefs. The header tracefs.h uses #include <event-parse.h> from
libtraceevent, so pkg-config needs to pick the correct include directory
for libtracefs and add the one for libtraceevent to succeed.

Note that in baa2ca59ec1e31ccbe3f24ff0368152b36f68720 the variable
LIBTRACEEVENT_DIR was introduced, and now the method to compile against
non-standard locations requires PKG_CONFIG_PATH to be set instead, which
works for both libtraceevent and libtracefs.

Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240606153625.2255470-2-amadio@gentoo.org
16 months agoperf arm: Workaround ARM PMUs cpu maps having offline cpus
Ian Rogers [Fri, 7 Jun 2024 06:53:43 +0000 (23:53 -0700)]
perf arm: Workaround ARM PMUs cpu maps having offline cpus

When PMUs have a cpu map in the 'cpus' or 'cpumask' file, perf will
try to open events on those CPUs. ARM doesn't remove offline CPUs
from the map, meaning that taking a CPU offline will cause perf commands
to fail unless a CPU map is passed on the command line.
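
The commit message does not spell out the workaround. One plausible
approach, shown here purely as an assumption, is to intersect the PMU's
CPU list with the set of online CPUs before opening events. The Python
sketch below parses the usual "0-3,5" list syntax; the function names
and the sample lists are hypothetical.

```python
from typing import Set


def parse_cpu_list(text: str) -> Set[int]:
    # Parse a CPU list such as "0-3,5" into a set of CPU numbers.
    cpus: Set[int] = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue
        lo, _, hi = chunk.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def usable_cpus(pmu_cpus: str, online_cpus: str) -> Set[int]:
    # Assumed workaround: keep only the PMU CPUs that are online.
    return parse_cpu_list(pmu_cpus) & parse_cpu_list(online_cpus)


if __name__ == "__main__":
    # Hypothetical example: the PMU still lists CPU 2, which is offline.
    print(sorted(usable_cpus("0-3", "0-1,3")))
```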

More context in:
https://lore.kernel.org/lkml/20240603092812.46616-1-yangyicong@huawei.com/

Reported-by: Yicong Yang <yangyicong@huawei.com>
Closes: https://lore.kernel.org/lkml/20240603092812.46616-2-yangyicong@huawei.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607065343.695369-1-irogers@google.com
16 months agoperf stat: Fix the hard-coded metrics calculation on the hybrid
Kan Liang [Thu, 6 Jun 2024 18:03:16 +0000 (11:03 -0700)]
perf stat: Fix the hard-coded metrics calculation on the hybrid

The hard-coded metrics are wrongly calculated on hybrid machines.

$ perf stat -e cycles,instructions -a sleep 1

 Performance counter stats for 'system wide':

        18,205,487      cpu_atom/cycles/
         9,733,603      cpu_core/cycles/
         9,423,111      cpu_atom/instructions/     #  0.52  insn per cycle
         4,268,965      cpu_core/instructions/     #  0.23  insn per cycle

The insn per cycle for cpu_core should be 4,268,965 / 9,733,603 = 0.44.

When finding the metric events, find_stat() doesn't take the PMU
type into account. The cpu_atom/cycles/ is wrongly used to calculate
the IPC of the cpu_core.

In the hard-coded metrics, the only events from a different PMU are
SW_CPU_CLOCK and SW_TASK_CLOCK. They both have the stat type
STAT_NSECS. Except for the SW CLOCK events, check the PMU type as well.
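
The arithmetic can be reproduced with a small Python sketch. It is not
the perf implementation: the Counter type and this find_stat signature
are hypothetical, and only the counter values and the STAT_NSECS
exception come from the text above.

```python
from dataclasses import dataclass
from typing import List, Optional

STAT_NSECS = "nsecs"


@dataclass
class Counter:
    pmu: str
    stat_type: str
    value: int


def find_stat(counters: List[Counter], stat_type: str,
              pmu: Optional[str] = None) -> Optional[Counter]:
    # Return the first counter of the given type. When `pmu` is given,
    # also require a PMU match, except for the software clock type.
    for c in counters:
        if c.stat_type != stat_type:
            continue
        if stat_type != STAT_NSECS and pmu is not None and c.pmu != pmu:
            continue
        return c
    return None


COUNTERS = [
    Counter("cpu_atom", "cycles", 18_205_487),
    Counter("cpu_core", "cycles", 9_733_603),
    Counter("cpu_atom", "instructions", 9_423_111),
    Counter("cpu_core", "instructions", 4_268_965),
]


def ipc(pmu: str, match_pmu: bool) -> str:
    insn = find_stat(COUNTERS, "instructions", pmu)
    cycles = find_stat(COUNTERS, "cycles", pmu if match_pmu else None)
    return f"{insn.value / cycles.value:.2f}"


if __name__ == "__main__":
    print(ipc("cpu_core", match_pmu=False))  # divides by cpu_atom cycles
    print(ipc("cpu_core", match_pmu=True))   # divides by cpu_core cycles
```

Without the PMU match, the cpu_core instructions are divided by the
cpu_atom cycles, which reproduces the 0.23 seen in the output; with the
match, the result is the expected 0.44.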

Fixes: 0a57b910807a ("perf stat: Use counts rather than saved_value")
Reported-by: Khalil, Amiri <amiri.khalil@intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240606180316.4122904-1-kan.liang@linux.intel.com
16 months agoperf vendor events: Add westmereex counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:51 +0000 (11:17 -0700)]
perf vendor events: Add westmereex counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-38-irogers@google.com
16 months agoperf vendor events: Add westmereep-sp counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:50 +0000 (11:17 -0700)]
perf vendor events: Add westmereep-sp counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-37-irogers@google.com
16 months agoperf vendor events: Add westmereep-dp counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:49 +0000 (11:17 -0700)]
perf vendor events: Add westmereep-dp counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-36-irogers@google.com
16 months agoperf vendor events: Add/update tigerlake events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:48 +0000 (11:17 -0700)]
perf vendor events: Add/update tigerlake events/metrics

Update events from v1.15 to v1.16.
Update TMA metrics from v4.7 to v4.8.

Bring in the event updates v1.16:
https://github.com/intel/perfmon/commit/43f3b8d6f82f3174bd3bffe8587e2179f086d2ce

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-35-irogers@google.com
16 months agoperf vendor events: Add snowridgex counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:47 +0000 (11:17 -0700)]
perf vendor events: Add snowridgex counter information

Update/remove events as per v1.23:
https://github.com/intel/perfmon/commit/9debd874e1b2b0cca42b9ba2342cacaaace2f0ce

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-34-irogers@google.com
16 months agoperf vendor events: Add/update skylakex events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:46 +0000 (11:17 -0700)]
perf vendor events: Add/update skylakex events/metrics

Update events from v1.33 to v1.35.
Update TMA metrics from v4.7 to v4.8.

Bring in the event updates v1.35:
https://github.com/intel/perfmon/commit/c99b60c147b96f40f96dd961abfae54909f47e5f

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

Adds the event SW_PREFETCH_ACCESS.ANY.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-33-irogers@google.com
16 months agoperf vendor events: Add/update skylake events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:45 +0000 (11:17 -0700)]
perf vendor events: Add/update skylake events/metrics

Update events from v58 to v59.
Update TMA metrics from v4.7 to v4.8.

Bring in the event updates v59:
https://github.com/intel/perfmon/commit/5d36f1835b02f056031a06e777e4bf54a5964930

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

Adds the event SW_PREFETCH_ACCESS.ANY.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-32-irogers@google.com
16 months agoperf vendor events: Add silvermont counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:44 +0000 (11:17 -0700)]
perf vendor events: Add silvermont counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-31-irogers@google.com
16 months agoperf vendor events: Add/update sierraforest events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:43 +0000 (11:17 -0700)]
perf vendor events: Add/update sierraforest events/metrics

Update events from v1.02 to v1.04.
Add TMA metrics v4.8.

Bring in the event updates v1.04:
https://github.com/intel/perfmon/commit/0a9546cdf63c8b07f5c33ebf6fe49e6ebec89f86
v1.03:
https://github.com/intel/perfmon/commit/c7dd26ce67ca4477d40fb4b55b6baa0584b3e5d6

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

New events are:
FP_INST_RETIRED.128B_DP,
FP_INST_RETIRED.128B_SP,
FP_INST_RETIRED.256B_DP,
FP_INST_RETIRED.32B_SP,
FP_INST_RETIRED.64B_DP,
OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HITM,
OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD,
OCR.DEMAND_RFO.L3_HIT.SNOOP_HITM,
OCR.STREAMING_WR.ANY_RESPONSE,
UNC_CHA_TOR_INSERTS.IO_ITOMCACHENEAR_LOCAL,
UNC_CHA_TOR_INSERTS.IO_ITOMCACHENEAR_REMOTE,
UNC_CHA_TOR_INSERTS.IO_ITOM_LOCAL,
UNC_CHA_TOR_INSERTS.IO_ITOM_REMOTE,
UNC_CHA_TOR_INSERTS.IO_MISS,
UNC_CHA_TOR_INSERTS.IO_MISS_ITOM,
UNC_CHA_TOR_INSERTS.IO_MISS_ITOMCACHENEAR,
UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_LOCAL,
UNC_CHA_TOR_INSERTS.IO_PCIRDCUR_REMOTE,
UNC_CXLCM_RxC_PACK_BUF_INSERTS.MEM_DATA,
UNC_CXLDP_TxC_AGF_INSERTS.M2S_DATA.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-30-irogers@google.com
16 months agoperf vendor events: Add/update sapphirerapids events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:42 +0000 (11:17 -0700)]
perf vendor events: Add/update sapphirerapids events/metrics

Update events from v1.20 to v1.23.
Update TMA metrics from v4.7 to v4.8.

Bring in the event updates v1.23:
https://github.com/intel/perfmon/commit/6ace93281c0f573b90d3f8f624486ad59dde1c93
v1.22:
https://github.com/intel/perfmon/commit/356eba05c07c4d54ed5b92c1164ce00fab545636

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

New events are:
EXE_ACTIVITY.2_3_PORTS_UTIL,
ICACHE_DATA.STALL_PERIODS,
L2_TRANS.L2_WB,
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_1024,
OFFCORE_REQUESTS.DEMAND_CODE_RD,
OFFCORE_REQUESTS.DEMAND_RFO,
OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_CODE_RD,
OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD,
RS.EMPTY_RESOURCE,
SW_PREFETCH_ACCESS.ANY,
UOPS_ISSUED.CYCLES.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-29-irogers@google.com
16 months agoperf vendor events: Update sandybridge metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:41 +0000 (11:17 -0700)]
perf vendor events: Update sandybridge metrics add event counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-28-irogers@google.com
16 months agoperf vendor events: Add/update rocketlake events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:40 +0000 (11:17 -0700)]
perf vendor events: Add/update rocketlake events/metrics

Update events from v1.02 to v1.03.
Update TMA metrics from v4.7 to v4.8.

Bring in the event updates v1.03:
https://github.com/intel/perfmon/commit/a7c75ffd56c7056494cd3acc2749336cd6363b90

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

Adds the event SW_PREFETCH_ACCESS.ANY.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-27-irogers@google.com
16 months agoperf vendor events: Add nehalemex counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:39 +0000 (11:17 -0700)]
perf vendor events: Add nehalemex counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-26-irogers@google.com
16 months agoperf vendor events: Add nehalemep counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:38 +0000 (11:17 -0700)]
perf vendor events: Add nehalemep counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-25-irogers@google.com
16 months agoperf vendor events: Update meteorlake events and add counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:37 +0000 (11:17 -0700)]
perf vendor events: Update meteorlake events and add counter information

Update events from v1.08 to v1.10.

Bring in the event updates v1.10:
https://github.com/intel/perfmon/commit/3bee3dc150164df0bec5980ca5586930730e5778
v1.09:
https://github.com/intel/perfmon/commit/01c8c99f17a72460b2eaf7efe3495913f36c9d42

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

New events are:
EXE_ACTIVITY.2_3_PORTS_UTIL,
FP_INST_RETIRED.128B_DP,
FP_INST_RETIRED.128B_SP,
FP_INST_RETIRED.256B_DP,
FP_INST_RETIRED.32B_SP,
FP_INST_RETIRED.64B_DP,
FP_VINT_UOPS_EXECUTED.STD,
L2_LINES_OUT.USELESS_HWPF,
L2_RQSTS.SWPF_HIT,
L2_RQSTS.SWPF_MISS,
LOAD_HIT_PREFETCH.SWPF,
MACHINE_CLEARS.ANY,
MACHINE_CLEARS.MRN_NUKE,
MISC_RETIRED.LBR_INSERTS,
SW_PREFETCH_ACCESS.ANY.

The metrics aren't updated as they require retirement latency support
that is added in this series:
https://lore.kernel.org/lkml/20240613033631.199800-1-weilin.wang@intel.com/

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-24-irogers@google.com
16 months agoperf vendor events: Add lunarlake counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:36 +0000 (11:17 -0700)]
perf vendor events: Add lunarlake counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-23-irogers@google.com
16 months agoperf vendor events: Add knightslanding counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:35 +0000 (11:17 -0700)]
perf vendor events: Add knightslanding counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-22-irogers@google.com
16 months agoperf vendor events: Update jaketown metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:34 +0000 (11:17 -0700)]
perf vendor events: Update jaketown metrics add event counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-21-irogers@google.com
16 months agoperf vendor events: Update ivytown metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:33 +0000 (11:17 -0700)]
perf vendor events: Update ivytown metrics add event counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-20-irogers@google.com
16 months agoperf vendor events: Update ivybridge metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:32 +0000 (11:17 -0700)]
perf vendor events: Update ivybridge metrics add event counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-19-irogers@google.com
16 months agoperf vendor events: Add/update icelakex events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:31 +0000 (11:17 -0700)]
perf vendor events: Add/update icelakex events/metrics

Update events from v1.24 to v1.26.
Add TMA metrics v4.8.

Bring in the event updates v1.26:
https://github.com/intel/perfmon/commit/c607c739e05f2569f95998cc98e1283f042b4fd1
v1.25:
https://github.com/intel/perfmon/commit/42d996769069921ec06f6fbb600b0c663b9ec5a9

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Adds the event SW_PREFETCH_ACCESS.ANY.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-18-irogers@google.com
16 months agoperf vendor events: Add/update icelake events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:30 +0000 (11:17 -0700)]
perf vendor events: Add/update icelake events/metrics

Update events from v1.21 to v1.22.
Add TMA metrics v4.8.

Bring in the event updates v1.22:
https://github.com/intel/perfmon/commit/e5640646e96d59e3c1c1e0d0100a475220ff1dfe

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Adds the event SW_PREFETCH_ACCESS.ANY.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-17-irogers@google.com
16 months agoperf vendor events: Update haswellx metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:29 +0000 (11:17 -0700)]
perf vendor events: Update haswellx metrics add event counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-16-irogers@google.com
16 months agoperf vendor events: Add haswell counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:28 +0000 (11:17 -0700)]
perf vendor events: Add haswell counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-15-irogers@google.com
16 months ago perf vendor events: Update graniterapids events and add counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:27 +0000 (11:17 -0700)]
perf vendor events: Update graniterapids events and add counter information

Update events from v1.01 to v1.02.

Bring in the event updates v1.02:
https://github.com/intel/perfmon/commit/0ff9f681bd07d0e84026c52f4941d21b1cd4c171

Add counter information. The most recent RFC patch set using this
information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

There are over 1000 new events.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-14-irogers@google.com
16 months ago perf vendor events: Update/add grandridge events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:26 +0000 (11:17 -0700)]
perf vendor events: Update/add grandridge events/metrics

Update events from v1.02 to v1.03.
Add TMA metrics v4.8.

Bring in the event updates v1.03:
https://github.com/intel/perfmon/commit/5ec7a252d0f6ec461f80cc397c9ac25abcd9184f

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

New events are:
FP_INST_RETIRED.128B_DP,
FP_INST_RETIRED.128B_SP,
FP_INST_RETIRED.256B_DP,
FP_INST_RETIRED.32B_SP,
FP_INST_RETIRED.64B_DP,
OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HITM,
OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD,
OCR.DEMAND_RFO.L3_HIT.SNOOP_HITM,
OCR.STREAMING_WR.ANY_RESPONSE.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-13-irogers@google.com
16 months ago perf vendor events: Add goldmontplus counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:25 +0000 (11:17 -0700)]
perf vendor events: Add goldmontplus counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-12-irogers@google.com
16 months ago perf vendor events: Add goldmont counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:24 +0000 (11:17 -0700)]
perf vendor events: Add goldmont counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-11-irogers@google.com
16 months ago perf vendor events: Add/update emeraldrapids events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:23 +0000 (11:17 -0700)]
perf vendor events: Add/update emeraldrapids events/metrics

Update events from v1.06 to v1.09.
Add TMA metrics v4.8.

Bring in the event updates v1.09:
https://github.com/intel/perfmon/commit/3fd5892bb4aece9c1e5c17630570d0462838e85d
v1.08:
https://github.com/intel/perfmon/commit/54525c4508f4a1ce4a8b854aa808a4ee2fb5930b

The TMA 4.8 information was added in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

New events are:
EXE_ACTIVITY.2_3_PORTS_UTIL,
ICACHE_DATA.STALL_PERIODS,
L2_TRANS.L2_WB,
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_1024,
OFFCORE_REQUESTS.DEMAND_CODE_RD,
OFFCORE_REQUESTS.DEMAND_RFO,
OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_CODE_RD,
OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD,
RS.EMPTY_RESOURCE,
SW_PREFETCH_ACCESS.ANY,
UNC_IIO_BANDWIDTH_OUT.PART[0-7]_FREERUN,
UOPS_ISSUED.CYCLES.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-10-irogers@google.com
16 months ago perf vendor events: Update elkhartlake events
Ian Rogers [Thu, 20 Jun 2024 18:17:22 +0000 (11:17 -0700)]
perf vendor events: Update elkhartlake events

Update events from v1.04 to v1.05. Bring in event updates from:
https://github.com/intel/perfmon/commit/fb91e1851ca40a5b443e2c3cd79bc7fc34c8237e

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-9-irogers@google.com
16 months ago perf vendor events: Update cascadelakex events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:21 +0000 (11:17 -0700)]
perf vendor events: Update cascadelakex events/metrics

Update events from v1.21 to v1.22.

Bring in the event updates v1.22:
https://github.com/intel/perfmon/commit/013877729c4ed96427932ca48722bc3bfd2a0075

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

New events are:
SW_PREFETCH_ACCESS.ANY

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-8-irogers@google.com
16 months ago perf vendor events: Update broadwellx metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:20 +0000 (11:17 -0700)]
perf vendor events: Update broadwellx metrics add event counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-7-irogers@google.com
16 months ago perf vendor events: Update broadwellde metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:19 +0000 (11:17 -0700)]
perf vendor events: Update broadwellde metrics add event counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-6-irogers@google.com
16 months ago perf vendor events: Update broadwell metrics add event counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:18 +0000 (11:17 -0700)]
perf vendor events: Update broadwell metrics add event counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

The TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-5-irogers@google.com
16 months ago perf vendor events: Add bonnell counter information
Ian Rogers [Thu, 20 Jun 2024 18:17:17 +0000 (11:17 -0700)]
perf vendor events: Add bonnell counter information

Add counter information necessary for optimizing event grouping in the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
https://github.com/intel/perfmon/commit/475892a9690cb048949e593fe39cee65cd4765e1
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-4-irogers@google.com
16 months ago perf vendor events: Update alderlaken events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:16 +0000 (11:17 -0700)]
perf vendor events: Update alderlaken events/metrics

Update events from v1.24 to v1.27.
Update e-core TMA metrics to v3.6.

Bring in the event updates v1.27:
https://github.com/intel/perfmon/commit/ea4f309a04c50ca77a00da2db130fd7cf06db978
v1.26:
https://github.com/intel/perfmon/commit/0052e68d24d9873d5ff22363677794fa3eb05313

The e-core TMA 3.6 information was updated in:
https://github.com/intel/perfmon/commit/d9c2faa70bafe03129dc10f9fe414ef03a95acd9

New events are:
MEM_UOPS_RETIRED.LOCK_LOADS,
SERIALIZATION.C01_MS_SCB,
UOPS_ISSUED.ANY.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-3-irogers@google.com
16 months ago perf vendor events: Update alderlake events/metrics
Ian Rogers [Thu, 20 Jun 2024 18:17:15 +0000 (11:17 -0700)]
perf vendor events: Update alderlake events/metrics

Update events from v1.24 to v1.27.
Update p-core TMA metrics from v4.7 to v4.8, and the e-core TMA
metrics to v3.6.

Bring in the event updates v1.27:
https://github.com/intel/perfmon/commit/ea4f309a04c50ca77a00da2db130fd7cf06db978
v1.26:
https://github.com/intel/perfmon/commit/0052e68d24d9873d5ff22363677794fa3eb05313

The p-core TMA 4.8 information was updated in:
https://github.com/intel/perfmon/commit/59194d4d90ca50a3fcb2de0d82b9f6fc0c9a5736
And e-core in:
https://github.com/intel/perfmon/commit/d9c2faa70bafe03129dc10f9fe414ef03a95acd9

New events are:
EXE_ACTIVITY.2_3_PORTS_UTIL,
ICACHE_DATA.STALL_PERIODS,
L2_TRANS.L2_WB,
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_1024,
MEM_UOPS_RETIRED.LOCK_LOADS,
OFFCORE_REQUESTS.DEMAND_CODE_RD,
OFFCORE_REQUESTS.DEMAND_RFO,
OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_CODE_RD,
OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD,
RS.EMPTY_RESOURCE,
SERIALIZATION.C01_MS_SCB,
SW_PREFETCH_ACCESS.ANY,
UOPS_ISSUED.ANY,
UOPS_ISSUED.CYCLES

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-2-irogers@google.com
16 months ago perf doc: Add AMD IBS usage document
Ravi Bangoria [Thu, 20 Jun 2024 05:41:04 +0000 (05:41 +0000)]
perf doc: Add AMD IBS usage document

Add a perf man page document that describes how to exploit AMD IBS with
Linux perf. A brief intro to IBS and simple one-liner examples will help
new users get started. This is not meant to be an exhaustive IBS
guide. Users should refer to the latest AMD64 Architecture Programmer's
Manual for a detailed description of IBS.

Usage:

  $ man perf-amd-ibs

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: ananth.narayan@amd.com
Cc: sandipan.das@amd.com
Cc: santosh.shukla@amd.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620054104.815-1-ravi.bangoria@amd.com
16 months ago tools/perf: Handle perftool-testsuite_probe testcases fail when kernel debuginfo...
Athira Rajeev [Mon, 17 Jun 2024 12:21:21 +0000 (17:51 +0530)]
tools/perf: Handle perftool-testsuite_probe testcases fail when kernel debuginfo is not present

Running "perftool-testsuite_probe" fails as below:

./perf test -v "perftool-testsuite_probe"
83: perftool-testsuite_probe  : FAILED

There are three failures:

1. Regexp not found: "\s*probe:inode_permission(?:_\d+)?\s+\(on inode_permission(?:[:\+][0-9A-Fa-f]+)?@.+\)"
   -- [ FAIL ] -- perf_probe :: test_adding_kernel :: listing added probe :: perf probe -l (output regexp parsing)

2. Regexp not found: "probe:vfs_mknod"
   Regexp not found: "probe:vfs_create"
   Regexp not found: "probe:vfs_rmdir"
   Regexp not found: "probe:vfs_link"
   Regexp not found: "probe:vfs_write"
   -- [ FAIL ] -- perf_probe :: test_adding_kernel :: wildcard adding support (command exitcode + output regexp parsing)

3. Regexp not found: "Failed to find"
   Regexp not found: "somenonexistingrandomstuffwhichisalsoprettylongorevenlongertoexceed64"
   Regexp not found: "in this function|at this address"
   Line did not match any pattern: "The /boot/vmlinux file has no debug information."
   Line did not match any pattern: "Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo package."

These three tests depend on kernel debug info:
1. Fail 1 expects the file name along with the probe, which needs debuginfo.
2. Fail 2:
    perf probe -nf --max-probes=512 -a 'vfs_* $params'
    Debuginfo-analysis is not supported.
     Error: Failed to add events.

3. Fail 3:
   perf probe 'vfs_read somenonexistingrandomstuffwhichisalsoprettylongorevenlongertoexceed64'
   Debuginfo-analysis is not supported.
   Error: Failed to add events.
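
As a rough illustration of why fail 1 depends on debuginfo, the sketch below applies the regexp quoted above to two listing lines. The two sample lines are hypothetical and written only for this example; they are not captured perf output. The helper name listing_matches is likewise invented here.

```python
import re

# Pattern copied from the failing check quoted in the commit message.
LISTING_RE = re.compile(
    r"\s*probe:inode_permission(?:_\d+)?\s+"
    r"\(on inode_permission(?:[:\+][0-9A-Fa-f]+)?@.+\)"
)


def listing_matches(line: str) -> bool:
    """Return True if a probe listing line satisfies the check."""
    return LISTING_RE.search(line) is not None


# Hypothetical listing lines, for illustration only.
WITH_SOURCE = "  probe:inode_permission (on inode_permission@fs/namei.c)"
WITHOUT_SOURCE = "  probe:inode_permission (on inode_permission)"

print(listing_matches(WITH_SOURCE), listing_matches(WITHOUT_SOURCE))
```

The pattern requires an "@<file>" part after the function name, so a listing that lacks source file information cannot match it.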

There is already a helper function, skip_if_no_debuginfo, in
lib/probe_vfs_getname.sh, which runs perf probe and returns
"2" if debug info is not present. Use the skip_if_no_debuginfo
function and, based on its result, skip only the three tests
that need debuginfo.
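
The skip decision can be modelled roughly as follows. This is a Python sketch of the idea only, not the shell code in the test suite: run_checks, CHECKS and the lambda bodies are hypothetical stand-ins, and the only value taken from the text above is the return code 2.

```python
# Return code that, per the commit message, signals missing debuginfo.
SKIP_NO_DEBUGINFO = 2


def run_checks(checks, debuginfo_rc):
    """Run (name, needs_debuginfo, fn) checks and collect PASS/FAIL/SKIP."""
    results = {}
    for name, needs_debuginfo, fn in checks:
        if needs_debuginfo and debuginfo_rc == SKIP_NO_DEBUGINFO:
            # Only checks that depend on debuginfo are skipped.
            results[name] = "SKIP"
        else:
            results[name] = "PASS" if fn() else "FAIL"
    return results


# Hypothetical stand-ins for the real shell test steps.
CHECKS = [
    ("adding probe", False, lambda: True),
    ("listing added probe", True, lambda: False),
    ("wildcard adding support", True, lambda: False),
]

print(run_checks(CHECKS, SKIP_NO_DEBUGINFO))
```

Checks that do not need debuginfo still run and report their real result, which is the behaviour the patched output shows.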

With the patch:

    83: perftool-testsuite_probe:
   --- start ---
   test child forked, pid 3927
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: adding probe inode_permission ::
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: adding probe inode_permission :: -a
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: adding probe inode_permission :: --add
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: listing added probe :: perf list
   Regexp not found: "\s*probe:inode_permission(?:_\d+)?\s+\(on inode_permission(?:[:\+][0-9A-Fa-f]+)?@.+\)"
   -- [ SKIP ] -- perf_probe :: test_adding_kernel :: 2 2 Skipped due to missing debuginfo :: testcase skipped
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: using added probe
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: deleting added probe
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: listing removed probe (should NOT be listed)
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: dry run :: adding probe
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: force-adding probes :: first probe adding
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: force-adding probes :: second probe adding (without force)
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: force-adding probes :: second probe adding (with force)
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: using doubled probe
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: removing multiple probes
   Regexp not found: "probe:vfs_mknod"
   Regexp not found: "probe:vfs_create"
   Regexp not found: "probe:vfs_rmdir"
   Regexp not found: "probe:vfs_link"
   Regexp not found: "probe:vfs_write"
   -- [ SKIP ] -- perf_probe :: test_adding_kernel :: 2 2 Skipped due to missing debuginfo :: testcase skipped
   Regexp not found: "Failed to find"
   Regexp not found: "somenonexistingrandomstuffwhichisalsoprettylongorevenlongertoexceed64"
   Regexp not found: "in this function|at this address"
   Line did not match any pattern: "The /boot/vmlinux file has no debug information."
   Line did not match any pattern: "Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo package."
   -- [ SKIP ] -- perf_probe :: test_adding_kernel :: 2 2 Skipped due to missing debuginfo :: testcase skipped
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: function with retval :: add
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: function with retval :: record
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: function argument probing :: script
   ## [ PASS ] ## perf_probe :: test_adding_kernel SUMMARY
   ---- end(0) ----
   83: perftool-testsuite_probe                                        : Ok

Only the three specific tests are skipped and the remaining
tests ran successfully.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240617122121.7484-1-atrajeev@linux.vnet.ibm.com
16 months ago perf hist: Honor symbol_conf.skip_empty
Namhyung Kim [Fri, 7 Jun 2024 20:29:18 +0000 (13:29 -0700)]
perf hist: Honor symbol_conf.skip_empty

So that it can skip events with no sample according to the config value.
This can omit the dummy event in the output of perf report --group.

An example output:

  $ sudo perf mem record -a sleep 1
  $ sudo perf report --group

Before)
  #
  # Samples: 232  of events 'cpu/mem-loads,ldlat=30/P, cpu/mem-stores/P, dummy:u'
  # Event count (approx.): 3089861
  #
  #                 Overhead  Command      Shared Object      Symbol
  # ........................  ...........  .................  .....................................
  #
       9.29%   0.00%   0.00%  swapper      [kernel.kallsyms]  [k] update_blocked_averages
       5.26%   0.15%   0.00%  swapper      [kernel.kallsyms]  [k] __update_load_avg_se
       4.15%   0.00%   0.00%  perf-exec    [kernel.kallsyms]  [k] slab_update_freelist.isra.0
       3.87%   0.00%   0.00%  perf-exec    [kernel.kallsyms]  [k] memcg_slab_post_alloc_hook
       3.79%   0.17%   0.00%  swapper      [kernel.kallsyms]  [k] enqueue_task_fair
       3.63%   0.00%   0.00%  sleep        [kernel.kallsyms]  [k] next_uptodate_page
       2.86%   0.00%   0.00%  swapper      [kernel.kallsyms]  [k] __update_load_avg_cfs_rq
       2.78%   0.00%   0.00%  swapper      [kernel.kallsyms]  [k] __schedule
       2.34%   0.00%   0.00%  swapper      [kernel.kallsyms]  [k] intel_idle
       2.32%   0.97%   0.00%  swapper      [kernel.kallsyms]  [k] psi_group_change

After)
  #
  # Samples: 232  of events 'cpu/mem-loads,ldlat=30/P, cpu/mem-stores/P'
  # Event count (approx.): 3089861
  #
  #         Overhead  Command      Shared Object      Symbol
  # ................  ...........  .................  .....................................
  #
       9.29%   0.00%  swapper      [kernel.kallsyms]  [k] update_blocked_averages
       5.26%   0.15%  swapper      [kernel.kallsyms]  [k] __update_load_avg_se
       4.15%   0.00%  perf-exec    [kernel.kallsyms]  [k] slab_update_freelist.isra.0
       3.87%   0.00%  perf-exec    [kernel.kallsyms]  [k] memcg_slab_post_alloc_hook
       3.79%   0.17%  swapper      [kernel.kallsyms]  [k] enqueue_task_fair
       3.63%   0.00%  sleep        [kernel.kallsyms]  [k] next_uptodate_page
       2.86%   0.00%  swapper      [kernel.kallsyms]  [k] __update_load_avg_cfs_rq
       2.78%   0.00%  swapper      [kernel.kallsyms]  [k] __schedule
       2.34%   0.00%  swapper      [kernel.kallsyms]  [k] intel_idle
       2.32%   0.97%  swapper      [kernel.kallsyms]  [k] psi_group_change

Now it doesn't have a column for the dummy event.
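
The column selection can be sketched as below. This is an illustrative Python model, not the perf C code: visible_events, format_row and the per-event sample counts are invented for the example (the commit message only gives the total of 232 samples), while the event names and the first row of percentages come from the output above.

```python
def visible_events(nr_samples, skip_empty):
    """Return indexes of group members that should get an output column."""
    return [i for i, n in enumerate(nr_samples) if n > 0 or not skip_empty]


def format_row(percents, columns):
    """Format only the selected overhead columns, perf-report style."""
    return " ".join(f"{percents[i]:6.2f}%" for i in columns)


EVENTS = ["cpu/mem-loads,ldlat=30/P", "cpu/mem-stores/P", "dummy:u"]
NR_SAMPLES = [3, 2, 0]    # made-up per-event sample counts
ROW = [9.29, 0.00, 0.00]  # first row of the example output

columns = visible_events(NR_SAMPLES, skip_empty=True)
print([EVENTS[i] for i in columns])
print(format_row(ROW, columns))
```

Note that a column is dropped because the event as a whole has no samples, not because a particular row shows 0.00% for it.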

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607202918.2357459-5-namhyung@kernel.org
16 months ago perf hist: Add symbol_conf.skip_empty
Namhyung Kim [Fri, 7 Jun 2024 20:29:17 +0000 (13:29 -0700)]
perf hist: Add symbol_conf.skip_empty

Add the skip_empty flag to symbol_conf and set the value from the report
command to preserve the existing behavior.  This makes the code simpler
and will be needed by other code where it is hard to add a new argument.

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607202918.2357459-4-namhyung@kernel.org
16 months ago perf hist: Simplify __hpp_fmt() using hpp_fmt_data
Namhyung Kim [Fri, 7 Jun 2024 20:29:16 +0000 (13:29 -0700)]
perf hist: Simplify __hpp_fmt() using hpp_fmt_data

The struct hpp_fmt_data keeps the values for each group member so
it doesn't need to check the event index in the group.

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607202918.2357459-3-namhyung@kernel.org
16 months ago perf hist: Factor out __hpp__fmt_print()
Namhyung Kim [Fri, 7 Jun 2024 20:29:15 +0000 (13:29 -0700)]
perf hist: Factor out __hpp__fmt_print()

Split the logic to print the histogram values according to the format
string.  This was used in 3 different places so it's better to move out
the logic into a function.

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607202918.2357459-2-namhyung@kernel.org
16 months ago perf: sched map skips redundant lines with cpu filters
Fernand Sieber [Fri, 14 Jun 2024 07:35:17 +0000 (09:35 +0200)]
perf: sched map skips redundant lines with cpu filters

perf sched map supports a CPU filter.
However, even with CPU filters active, every context switch currently
produces a separate line.
As a result, context switches on irrelevant CPUs produce redundant lines,
which makes the output particularly difficult to read on wide
architectures.

Fix it by skipping printing for irrelevant CPUs.

Example snippet of output before fix:

  *B0       1.461147 secs
   B0
   B0
   B0
  *G0       1.517139 secs

After fix:

  *B0       1.461147 secs
  *G0       1.517139 secs
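
The filtering idea can be sketched as follows. This is a simplified Python model and not the perf sched implementation: sched_map_lines, the tuple layout and the events on CPUs 1 and 2 are assumptions made for the example, and the real map prints per-CPU columns that are omitted here.

```python
def sched_map_lines(switches, cpu_filter=None):
    """Yield one output line per context switch on a CPU of interest."""
    for cpu, task, timestamp in switches:
        if cpu_filter is not None and cpu not in cpu_filter:
            continue  # irrelevant CPU: print nothing at all
        yield f"*{task} {timestamp:.6f} secs"


# Hypothetical (cpu, task, timestamp) context switches.
SWITCHES = [
    (0, "B0", 1.461147),
    (1, "C0", 1.470000),
    (2, "D0", 1.480000),
    (0, "G0", 1.517139),
]

for line in sched_map_lines(SWITCHES, cpu_filter={0}):
    print(line)
```

With the filter set to CPU 0, the two switches on CPUs 1 and 2 produce no line, instead of repeating the unchanged state of CPU 0.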

Signed-off-by: Fernand Sieber <sieberf@amazon.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-and-tested-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240614073517.94974-1-sieberf@amazon.com
16 months ago perf test pmu: Warn don't fail for legacy mixed case event names
Ian Rogers [Wed, 12 Jun 2024 12:40:27 +0000 (05:40 -0700)]
perf test pmu: Warn don't fail for legacy mixed case event names

PowerPC has mixed case events matching legacy hardware cache
events. Warn but don't fail in this case. Event parsing will still
work in this case by matching the legacy case.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240612124027.2712643-1-irogers@google.com
16 months ago tools/perf: Fix timing issue with parallel threads in perf bench wake-up-parallel
Athira Rajeev [Fri, 7 Jun 2024 04:43:54 +0000 (10:13 +0530)]
tools/perf: Fix timing issue with parallel threads in perf bench wake-up-parallel

perf bench futex fails as below and hangs intermittently when
run on a powerpc system:

./perf bench futex wake-parallel
 Running 'futex/wake-parallel' benchmark:
 Run summary [PID 88588]: blocking on 640 threads (at [private] futex 0x10464b8c), 640 threads waking up 1 at a time.

[Run 1]: Avg per-thread latency (waking 1/640 threads) in 0.1309 ms (+-53.27%)
[Run 2]: Avg per-thread latency (waking 1/640 threads) in 0.0120 ms (+-31.16%)
[Run 3]: Avg per-thread latency (waking 1/640 threads) in 0.1474 ms (+-92.47%)
[Run 4]: Avg per-thread latency (waking 1/640 threads) in 0.2883 ms (+-67.75%)
[Run 5]: Avg per-thread latency (waking 1/640 threads) in 0.4108 ms (+-39.60%)
[Run 6]: Avg per-thread latency (waking 1/640 threads) in 0.7843 ms (+-78.98%)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)

The system on which perf bench wake-up-parallel was run has a
configuration of 640 CPUs. After debugging, this turned out to be
a timing issue. The benchmark creates a number of threads equal to
the number of CPUs and issues a futex_wait. It then does a usleep of
0.1 second before initiating futex_wake. On system configurations
with more threads, this usleep time is not enough. The patch changes
the usleep from 100000 to 200000.

With the patch, multiple iterations were run and no further issues
were seen.

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607044354.82225-3-atrajeev@linux.vnet.ibm.com
16 months ago tools/perf: Fix perf bench epoll to enable the run when some CPU's are offline
Athira Rajeev [Fri, 7 Jun 2024 04:43:53 +0000 (10:13 +0530)]
tools/perf: Fix perf bench epoll to enable the run when some CPU's are offline

Perf bench epoll fails as below when run on
a powerpc system:

   ./perf bench epoll wait
   Running 'epoll/wait' benchmark:
   Run summary [PID 627653]: 79 threads monitoring on 64 file-descriptors for 8 secs.

   perf: pthread_create: No such file or directory

In the setup where this perf bench was run, the difference was that
the partition had 640 CPUs, but not all CPUs were online. 80 CPUs
were online. While creating threads and using epoll_wait, the code
sets the affinity using a cpumask. The cpumask size used is 80,
which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the
benchmark reports a failure while setting affinity for a CPU number
of 80 or higher, because it attempts to set a bit
position which is not allocated in the cpumask. Fix this by changing
the size of the cpumask to the number of possible CPUs and not the
number of online CPUs.
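
The sizing problem can be illustrated with a toy bitmask. This is a Python model under stated assumptions, not the C code in perf bench: CpuMask is not the glibc cpu_set_t, and the choice of every eighth CPU being online is invented for the example (the commit message only says that 80 of 640 CPUs were online).

```python
class CpuMask:
    """Toy fixed-size bitmask indexed by CPU id (not the glibc cpu_set_t)."""

    def __init__(self, nbits: int):
        self.nbits = nbits
        self.bits = 0

    def set(self, cpu: int) -> bool:
        """Set the bit for `cpu`; return False if the mask is too small."""
        if not 0 <= cpu < self.nbits:
            return False
        self.bits |= 1 << cpu
        return True


NR_POSSIBLE = 640                        # CPUs in the partition
ONLINE = list(range(0, NR_POSSIBLE, 8))  # assumed ids of the 80 online CPUs


def unrepresentable(nbits: int):
    """Return the online CPU ids that a mask of `nbits` bits cannot hold."""
    mask = CpuMask(nbits)
    return [cpu for cpu in ONLINE if not mask.set(cpu)]


print(len(unrepresentable(len(ONLINE))), len(unrepresentable(NR_POSSIBLE)))
```

A mask sized by the count of online CPUs only covers ids below that count, while the ids of online CPUs can be much larger. Sizing by the number of possible CPUs covers every id. What the real C library does with an out-of-range id is not modelled here.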

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607044354.82225-2-atrajeev@linux.vnet.ibm.com
16 months ago tools/perf: Fix perf bench futex to enable the run when some CPU's are offline
Athira Rajeev [Fri, 7 Jun 2024 04:43:52 +0000 (10:13 +0530)]
tools/perf: Fix perf bench futex to enable the run when some CPU's are offline

Perf bench futex fails as below when run on
a powerpc system:

 ./perf bench futex all
 Running futex/hash benchmark...
Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes for 10 secs.

perf: pthread_create: No such file or directory

In the setup where this perf bench was run, the difference was that
the partition had 640 CPUs, but not all CPUs were online. 80 CPUs
were online. While blocking the threads with futex_wait, the code
sets the affinity using a cpumask. The cpumask size used is 80,
which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the
benchmark reports a failure while setting affinity for a CPU number
of 80 or higher, because it attempts to set a bit
position which is not allocated in the cpumask. Fix this by changing
the size of the cpumask to the number of possible CPUs and not the
number of online CPUs.
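
One way to see the gap between online and possible CPUs on a running system is to read the CPU lists that Linux exposes under /sys/devices/system/cpu. The sketch below is a standalone Python helper, not part of perf; the function names are invented for this example.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list[int]:
    """Expand a sysfs CPU list such as "0-3,8" into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.extend(range(int(lo), int(hi or lo) + 1))
    return cpus


def read_cpu_list(kind: str) -> list[int]:
    """Read /sys/devices/system/cpu/{possible,online}; empty list if absent."""
    path = Path("/sys/devices/system/cpu") / kind
    try:
        return parse_cpu_list(path.read_text())
    except OSError:
        return []


possible = read_cpu_list("possible")
online = read_cpu_list("online")
print(len(possible), len(online))
```

A mask indexed by CPU id needs at least max(possible) + 1 bits, regardless of how many CPUs happen to be online.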

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607044354.82225-1-atrajeev@linux.vnet.ibm.com
16 months agoperf record: Ensure space for lost samples
Ian Rogers [Tue, 11 Jun 2024 05:06:26 +0000 (22:06 -0700)]
perf record: Ensure space for lost samples

Previous allocation didn't account for sample ID written after the
lost samples event. Switch from malloc/free to a stack allocation.

Reported-by: Milian Wolff <milian.wolff@kdab.com>
Closes: https://lore.kernel.org/linux-perf-users/23879991.0LEYPuXRzz@milian-workstation/
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240611050626.1223155-1-irogers@google.com
16 months agoperf evsel: Refactor tool events
Ian Rogers [Fri, 3 May 2024 23:28:49 +0000 (16:28 -0700)]
perf evsel: Refactor tool events

Tool events unnecessarily open a dummy perf event which is useless
even with `perf record` which will still open a dummy event. Change
the behavior of tool events so:

 - duration_time - call `rdclock` on open and then report the count as
   a delta since the start in evsel__read_counter. This moves code out
   of builtin-stat making it more general purpose.

 - user_time/system_time - open the fd as either `/proc/pid/stat` or
   `/proc/stat` for cases like system wide. evsel__read_counter will
   read the appropriate field out of the procfs file. These values
   were previously supplied by wait4, if the procfs read fails then
   the wait4 values are used, assuming the process/thread terminated.
   By reading user_time and system_time this way, interval mode, per
   PID and per CPU can be supported although there are restrictions
   given what the files provide (e.g. per PID can't be combined with
   per CPU).

Opening any of the tool events for `perf record` is changed to return
invalid.
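
As a rough illustration of reading user and system time from procfs,
the following Python sketch parses a `/proc/pid/stat` line. It is not
the evsel__read_counter implementation; parse_cpu_ticks and the sample
line are hypothetical. The field positions (utime is field 14, stime is
field 15, both in clock ticks) follow the proc(5) layout.

```python
from typing import Tuple


def parse_cpu_ticks(stat_line: str) -> Tuple[int, int]:
    """Return (utime, stime) in clock ticks from a /proc/<pid>/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain
    # spaces or ')', so split only after the last closing parenthesis.
    rest = stat_line[stat_line.rindex(")") + 2:].split()
    # rest[0] is field 3 (state), so field N lives at rest[N - 3].
    utime = int(rest[14 - 3])
    stime = int(rest[15 - 3])
    return utime, stime


def ticks_to_seconds(ticks: int, hz: int) -> float:
    """Convert clock ticks to seconds; hz is the tick rate (sysconf(_SC_CLK_TCK))."""
    return ticks / hz


# Hypothetical sample line, truncated after the fields used here.
sample = "1234 (my proc) S 1 1234 1234 0 -1 4194304 100 0 0 0 250 75 0 0 20 0 1 0 5000"
utime, stime = parse_cpu_ticks(sample)
print(utime, stime)
```

On a live Linux system the same function could be fed the contents of
`/proc/self/stat`, with the tick rate taken from
`os.sysconf("SC_CLK_TCK")`.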

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240503232849.17752-1-irogers@google.com
16 months agoperf test: Speed up test case 70 annotate basic tests
Thomas Richter [Fri, 7 Jun 2024 05:43:52 +0000 (07:43 +0200)]
perf test: Speed up test case 70 annotate basic tests

On some s390 Linux machines (mostly older models) with debug
packages installed, the test case 'perf annotate basic tests' takes
a long time to run.
Speed up the test by saving the output of the perf annotate command
in a temporary file, which is then used for pattern matching with
grep. This avoids repeated invocations of perf annotate, each of
which runs for some time.

Output before:
 # time bash -x tests/shell/annotate.sh >/dev/null 2>&1; echo EXIT CODE $?

 real   4m35.543s
 user   3m19.442s
 sys    1m14.322s
 EXIT CODE 0
 #
Output after:
 # time bash -x tests/shell/annotate.sh >/dev/null 2>&1; echo EXIT CODE $?

 real   2m2.881s
 user   1m30.980s
 sys    0m30.684s
 EXIT CODE 0
 #

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: sumanthk@linux.ibm.com
Cc: svens@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607054352.2774936-1-tmricht@linux.ibm.com
16 months agoperf stat: Choose the most disaggregate command line option
Ian Rogers [Wed, 5 Jun 2024 06:38:28 +0000 (23:38 -0700)]
perf stat: Choose the most disaggregate command line option

When multiple aggregation options are passed to perf stat the behavior
isn't clear. Consider "perf stat -A --per-socket .." and "perf stat
--per-socket -A ..", the first won't aggregate at all while the second
will do per-socket aggregation, even though the same options were
passed.

Rather than set an enum value, gather the options in a struct and
process them from most to least aggregate. This ensures the least
aggregate option always applies, so no aggregation if "-A" is passed.
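
The ordering idea can be sketched in Python as follows. This is an
illustration under assumptions, not the builtin-stat code: AggrOptions,
parse and choose_mode are hypothetical names, and only three of the
aggregation options are modelled.

```python
from dataclasses import dataclass


@dataclass
class AggrOptions:
    # One boolean per command line option instead of a single enum value.
    per_socket: bool = False
    per_core: bool = False
    no_aggr: bool = False  # -A


FLAGS = {"--per-socket": "per_socket", "--per-core": "per_core", "-A": "no_aggr"}

# Processing order: most aggregate first, least aggregate last.
MOST_TO_LEAST = ("per_socket", "per_core", "no_aggr")


def parse(argv):
    """Gather the options; the order on the command line is not recorded."""
    opts = AggrOptions()
    for arg in argv:
        if arg in FLAGS:
            setattr(opts, FLAGS[arg], True)
    return opts


def choose_mode(opts):
    """Later (less aggregate) entries overwrite earlier ones."""
    mode = "global"
    for name in MOST_TO_LEAST:
        if getattr(opts, name):
            mode = name
    return mode


print(choose_mode(parse(["-A", "--per-socket"])))
print(choose_mode(parse(["--per-socket", "-A"])))
```

Both invocations select the same mode, because the result depends only
on which options were passed and not on their order.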

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240605063828.195700-2-irogers@google.com
16 months agoperf stat: Make options local
Ian Rogers [Wed, 5 Jun 2024 06:38:27 +0000 (23:38 -0700)]
perf stat: Make options local

Reduce the scope of stat_options to cmd_stat, and pass as an argument
to __cmd_record. This is done to make more localized changes to the
options in later patches. A side-effect of the change is to reduce the
size of a stripped PIE perf binary by 5952 bytes. The savings come
mainly in the dynamic relocation section.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240605063828.195700-1-irogers@google.com
16 months agoperf maps: Add/use a sorted insert for fixup overlap and insert
Ian Rogers [Tue, 21 May 2024 16:51:09 +0000 (09:51 -0700)]
perf maps: Add/use a sorted insert for fixup overlap and insert

Data may have lots of overlapping mmaps. The regular insert adds at
the end and relies on a later sort. For data with overlapping mappings
the sort will happen during a subsequent maps__find or
__maps__fixup_overlap_and_insert, there's never a period where the
inserted maps buffer up and a single sort happens. To avoid back to
back sorts, maintain the sort order when fixing up and
inserting. Previously the first_ending_after search was O(log n) where
n is the size of maps, and the insert was O(1) but because of the
continuous sorting was becoming O(n*log(n)). With maintaining sort
order, the insert now becomes O(n) for a memmove.
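
The trade-off can be sketched in Python, using a plain list of start
addresses in place of the maps array. The function names are
hypothetical; the point is only the contrast between "append, then
sort before every lookup" and "binary search, then shift".

```python
import bisect


def append_then_sort(starts, new_start):
    """Old pattern: O(1) append, but a full sort before the next lookup."""
    starts.append(new_start)
    starts.sort()


def sorted_insert(starts, new_start):
    """New pattern: O(log n) search plus an O(n) shift (the memmove)."""
    idx = bisect.bisect_right(starts, new_start)
    starts.insert(idx, new_start)
    return idx


a = [0x1000, 0x3000, 0x5000]
b = list(a)
for addr in (0x4000, 0x2000, 0x6000):
    append_then_sort(a, addr)
    sorted_insert(b, addr)

print(a == b)
```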

For a perf report on a perf.data file containing overlapping mappings
the time numbers are:

Before:
real    0m5.894s
user    0m5.650s
sys     0m0.231s

After:
real    0m0.675s
user    0m0.454s
sys     0m0.196s

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Steinar H . Gunderson <sesse@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240521165109.708593-4-irogers@google.com
16 months agoperf maps: Reduce sorting for overlapping mappings
Ian Rogers [Tue, 21 May 2024 16:51:08 +0000 (09:51 -0700)]
perf maps: Reduce sorting for overlapping mappings

When an 'after' map is generated the 'new' map must be before it, so
terminate iterating and don't re-sort. If the entry 'pos' is entirely
overlapped by the 'new' mapping then don't remove and insert the
mapping, just replace it, again to avoid sorting.

For a perf report on a perf.data file containing overlapping mappings
the time numbers are:

Before:
real    0m9.856s
user    0m9.637s
sys     0m0.204s

After:
real    0m5.894s
user    0m5.650s
sys     0m0.231s

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Steinar H . Gunderson <sesse@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240521165109.708593-3-irogers@google.com
16 months agoperf maps: Fix use after free in __maps__fixup_overlap_and_insert
Ian Rogers [Tue, 21 May 2024 16:51:07 +0000 (09:51 -0700)]
perf maps: Fix use after free in __maps__fixup_overlap_and_insert

In the case 'before' and 'after' are broken out from pos,
maps_by_address may be changed by __maps__insert, as such it needs
re-reading.

Don't ignore the return value from __maps__insert.

Fixes: 659ad3492b91 ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Steinar H . Gunderson <sesse@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240521165109.708593-2-irogers@google.com
16 months agoperf script: netdev-times: add location parameter to consume_skb
Lucas Stach [Wed, 5 Jun 2024 14:44:42 +0000 (16:44 +0200)]
perf script: netdev-times: add location parameter to consume_skb

dd1b527831a3 ("net: add location to trace_consume_skb()") added a new
parameter to the consume_skb tracepoint. Adapt the script to match.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: kernel@pengutronix.de
Cc: patchwork-lst@pengutronix.de
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240605144442.1985270-1-l.stach@pengutronix.de
16 months agoperf: parse-events: Fix compilation error while defining DEBUG_PARSER
Clément Le Goffic [Wed, 5 Jun 2024 14:04:53 +0000 (16:04 +0200)]
perf: parse-events: Fix compilation error while defining DEBUG_PARSER

Compiling the perf tool with 'PARSER_DEBUG=1' leads to errors:

$> make -C tools/perf PARSER_DEBUG=1 NO_LIBTRACEEVENT=1
...
  CC      util/expr-flex.o
  CC      util/expr.o
util/parse-events.c:33:12: error: redundant redeclaration of ‘parse_events_debug’ [-Werror=redundant-decls]
   33 | extern int parse_events_debug;
      |            ^~~~~~~~~~~~~~~~~~
In file included from util/parse-events.c:18:
util/parse-events-bison.h:43:12: note: previous declaration of ‘parse_events_debug’ with type ‘int’
   43 | extern int parse_events_debug;
      |            ^~~~~~~~~~~~~~~~~~
util/expr.c:27:12: error: redundant redeclaration of ‘expr_debug’ [-Werror=redundant-decls]
   27 | extern int expr_debug;
      |            ^~~~~~~~~~
In file included from util/expr.c:11:
util/expr-bison.h:43:12: note: previous declaration of ‘expr_debug’ with type ‘int’
   43 | extern int expr_debug;
      |            ^~~~~~~~~~
cc-1: all warnings being treated as errors

Remove the extern declaration from the parse-events.c file, as it
conflicts with the ones generated by the bison and yacc tools from the
file parse-events.[ly].

Signed-off-by: Clément Le Goffic <clement.legoffic@foss.st.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240605140453.614862-1-clement.legoffic@foss.st.com
16 months agoperf hisi-ptt: remove unused struct 'hisi_ptt_queue'
Dr. David Alan Gilbert [Sun, 2 Jun 2024 00:07:09 +0000 (01:07 +0100)]
perf hisi-ptt: remove unused struct 'hisi_ptt_queue'

'hisi_ptt_queue' has been unused since the original
commit 5e91e57e6809 ("perf auxtrace arm64: Add support for parsing
HiSilicon PCIe Trace packet").

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: yangyicong@hisilicon.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240602000709.213116-1-linux@treblig.org
16 months agoperf genelf: remove unused struct 'options'
Dr. David Alan Gilbert [Sun, 2 Jun 2024 00:05:05 +0000 (01:05 +0100)]
perf genelf: remove unused struct 'options'

'options' has been unused since
commit fa7f7e735495 ("perf jit: Move test functionality in to a test").

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240602000505.213032-1-linux@treblig.org
16 months agoperf lock info: Display both map and thread by default
Nick Forrington [Mon, 13 May 2024 09:14:12 +0000 (09:14 +0000)]
perf lock info: Display both map and thread by default

Change "perf lock info" argument handling to:

Display both map and thread info (rather than an error) when neither is
specified.

Display both map and thread info (rather than just thread info) when
both are requested.

Signed-off-by: Nick Forrington <nick.forrington@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240513091413.738537-2-nick.forrington@arm.com
16 months agoperf top: Allow filters on events
Ian Rogers [Fri, 24 May 2024 20:52:27 +0000 (13:52 -0700)]
perf top: Allow filters on events

Allow filters to be added to perf top events. One use is to work around
issues with:
```
$ perf top --uid="$(id -u)"
```
which tries to scan /proc to find processes belonging to the uid and
can fail if such a pid terminates between the scan and the
perf_event_open, reporting:
```
Error:
The sys_perf_event_open() syscall returned with 3 (No such process) for event (cycles:P).
/bin/dmesg | grep -i perf may provide additional information.
```
A similar filter:
```
$ perf top -e cycles:P --filter "uid == $(id -u)"
```
doesn't fail this way.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240524205227.244375-4-irogers@google.com
16 months agoperf bpf filter: Add uid and gid terms
Ian Rogers [Fri, 24 May 2024 20:52:26 +0000 (13:52 -0700)]
perf bpf filter: Add uid and gid terms

Allow the BPF filter to use the uid and gid terms determined by the
bpf_get_current_uid_gid BPF helper. For example, the following will
record the cpu-clock event system wide discarding samples that don't
belong to the current user.

$ perf record -e cpu-clock --filter "uid == $(id -u)" -a sleep 0.1
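
For reference, bpf_get_current_uid_gid returns a single 64-bit value
with the gid in the upper 32 bits and the uid in the lower 32 bits. The
Python sketch below shows how such a value splits and how a
"uid == N" filter would decide; split_uid_gid and keep_sample are
hypothetical names, not part of perf.

```python
def split_uid_gid(uid_gid):
    """Split the helper's 64-bit return value into (uid, gid)."""
    uid = uid_gid & 0xFFFFFFFF          # lower 32 bits
    gid = (uid_gid >> 32) & 0xFFFFFFFF  # upper 32 bits
    return uid, gid


def keep_sample(uid_gid, wanted_uid):
    """Model of a 'uid == wanted_uid' filter: keep only matching samples."""
    uid, _gid = split_uid_gid(uid_gid)
    return uid == wanted_uid


packed = (100 << 32) | 1000  # gid 100, uid 1000
print(split_uid_gid(packed))
print(keep_sample(packed, 1000), keep_sample(packed, 0))
```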

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240524205227.244375-3-irogers@google.com
16 months agoperf bpf filter: Give terms their own enum
Ian Rogers [Fri, 24 May 2024 20:52:25 +0000 (13:52 -0700)]
perf bpf filter: Give terms their own enum

Give the term types their own enum so that additional terms can be
added that don't correspond to a PERF_SAMPLE_xx flag. The term values
are numerically ascending rather than bit field positions, this means
they need translating to a PERF_SAMPLE_xx bit field in certain places
using a shift.
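
A small Python sketch of the translation follows. The enum member
names and the truncation to four sample terms are assumptions for
illustration; the PERF_SAMPLE_* values shown are the ones from the perf
event ABI.

```python
from enum import IntEnum

# Bit field values from the perf event ABI (first four only).
PERF_SAMPLE_IP = 1 << 0
PERF_SAMPLE_TID = 1 << 1
PERF_SAMPLE_TIME = 1 << 2
PERF_SAMPLE_ADDR = 1 << 3


class Term(IntEnum):
    # Hypothetical, truncated term enum: values simply count upwards.
    NONE = 0
    IP = 1
    TID = 2
    TIME = 3
    ADDR = 4
    UID = 5  # has no PERF_SAMPLE_xx counterpart
    GID = 6  # has no PERF_SAMPLE_xx counterpart


SAMPLE_START = Term.IP
SAMPLE_END = Term.ADDR


def term_to_sample_flag(term):
    """Translate an ascending term value to its bit field, or None."""
    if not SAMPLE_START <= term <= SAMPLE_END:
        return None
    return 1 << (term - SAMPLE_START)


print(term_to_sample_flag(Term.TIME) == PERF_SAMPLE_TIME)
print(term_to_sample_flag(Term.UID))
```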

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240524205227.244375-2-irogers@google.com
16 months agotools api io: Move filling the io buffer to its own function
Ian Rogers [Sun, 19 May 2024 18:17:16 +0000 (11:17 -0700)]
tools api io: Move filling the io buffer to its own function

In general a read fills 4kb, so filling the buffer is a 1 in 4096
operation. Move it out of the io__get_char function to avoid some
checking overhead and to better hint that the function is good to inline.
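
The split between a small fast path and a rarely taken refill path can
be sketched in Python. CharReader is a hypothetical stand-in for the C
io struct, and the fill counter exists only to make the "1 in 4096"
ratio visible.

```python
import io


class CharReader:
    BUF_SIZE = 4096

    def __init__(self, stream):
        self.stream = stream
        self.buf = b""
        self.pos = 0
        self.fills = 0  # instrumentation for the example only

    def _fill_buffer(self):
        """Slow path: taken about once per BUF_SIZE characters."""
        self.buf = self.stream.read(self.BUF_SIZE)
        self.pos = 0
        self.fills += 1
        return len(self.buf) > 0

    def get_char(self):
        """Fast path: return the next byte value, or -1 at end of input."""
        if self.pos == len(self.buf) and not self._fill_buffer():
            return -1
        ch = self.buf[self.pos]
        self.pos += 1
        return ch


reader = CharReader(io.BytesIO(bytes(10000)))
count = 0
while reader.get_char() != -1:
    count += 1
print(count, reader.fills)
```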

For perf's IO intensive internal (non-rigorous) benchmarks there's a
small improvement to kallsyms-parsing with a default build.

Before:
```
$ perf bench internals all
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
  Average synthesis took: 146.322 usec (+- 0.305 usec)
  Average num. events: 61.000 (+- 0.000)
  Average time per event 2.399 usec
  Average data synthesis took: 145.056 usec (+- 0.155 usec)
  Average num. events: 329.000 (+- 0.000)
  Average time per event 0.441 usec

  Average kallsyms__parse took: 162.313 ms (+- 0.599 ms)
...
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 53.720 usec (+- 7.823 usec)
  Average PMU scanning took: 375.145 usec (+- 23.974 usec)
```
After:
```
$ perf bench internals all
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
  Average synthesis took: 127.829 usec (+- 0.079 usec)
  Average num. events: 61.000 (+- 0.000)
  Average time per event 2.096 usec
  Average data synthesis took: 133.652 usec (+- 0.101 usec)
  Average num. events: 327.000 (+- 0.000)
  Average time per event 0.409 usec

  Average kallsyms__parse took: 150.415 ms (+- 0.313 ms)
...
Computing performance of sysfs PMU event scan for 100 times
  Average core PMU scanning took: 47.790 usec (+- 1.178 usec)
  Average PMU scanning took: 376.945 usec (+- 23.683 usec)
```

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240519181716.4088459-1-irogers@google.com
16 months agoperf trace beauty: Always show mmap prot even though PROT_NONE
Changbin Du [Wed, 22 May 2024 03:35:42 +0000 (11:35 +0800)]
perf trace beauty: Always show mmap prot even though PROT_NONE

PROT_NONE is also useful information, so do not omit the mmap prot even
when it is 0. syscall_arg__scnprintf_mmap_prot() can print PROT_NONE
for prot 0.

Before: PROT_NONE is not shown.
$ sudo perf trace -e syscalls:sys_enter_mmap --filter prot==0  -- ls
     0.000 ls/2979231 syscalls:sys_enter_mmap(len: 4220888, flags: PRIVATE|ANONYMOUS)

After: PROT_NONE is displayed.
$ sudo perf trace -e syscalls:sys_enter_mmap --filter prot==0  -- ls
     0.000 ls/2975708 syscalls:sys_enter_mmap(len: 4220888, prot: NONE, flags: PRIVATE|ANONYMOUS)

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240522033542.1359421-3-changbin.du@huawei.com
16 months agoperf trace beauty: Always show param if show_zero is set
Changbin Du [Wed, 22 May 2024 03:35:41 +0000 (11:35 +0800)]
perf trace beauty: Always show param if show_zero is set

For some parameters, it is best to also display them when they are 0,
e.g. flags.

Here we only check the show_zero property and let arg printer handle
special cases.

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240522033542.1359421-2-changbin.du@huawei.com
16 months agoperf docs: Fix typos
Ian Rogers [Tue, 21 May 2024 22:35:55 +0000 (15:35 -0700)]
perf docs: Fix typos

Assorted typo fixes.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240521223555.858859-1-irogers@google.com
16 months agoperf list: Fix the --no-desc option
Breno Leitao [Fri, 17 May 2024 14:14:26 +0000 (07:14 -0700)]
perf list: Fix the --no-desc option

Currently, the --no-desc option in perf list isn't functioning as
intended.

This issue arises from the overwriting of struct option->desc with the
opposite value of struct option->long_desc. Consequently, whatever
parse_options() returns at struct option->desc gets overridden later,
rendering the --desc or --no-desc arguments ineffective.

To resolve this, set ->desc as true by default and allow parse_options()
to adjust it accordingly. This adjustment will fix the --no-desc
option while preserving the functionality of the other parameters.
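
The bug pattern can be modelled with Python's argparse. This is an
analogy under assumptions, not the builtin-list code: parse_buggy and
parse_fixed are hypothetical, and `--long-desc` stands in for the
option behind ->long_desc.

```python
import argparse


def _parser():
    p = argparse.ArgumentParser()
    p.add_argument("--desc", dest="desc", action="store_true")
    p.add_argument("--no-desc", dest="desc", action="store_false")
    p.add_argument("--long-desc", dest="long_desc", action="store_true")
    p.set_defaults(desc=True, long_desc=False)  # desc is true by default
    return p


def parse_buggy(argv):
    args = _parser().parse_args(argv)
    # Bug: the parsed value is overwritten afterwards, so --desc and
    # --no-desc have no effect.
    args.desc = not args.long_desc
    return args.desc


def parse_fixed(argv):
    # Fix: rely on the default and let the parser adjust it.
    return _parser().parse_args(argv).desc


print(parse_buggy(["--no-desc"]), parse_fixed(["--no-desc"]))
```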

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: leit@meta.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240517141427.1905691-1-leitao@debian.org
16 months agoperf arm-spe: Unaligned pointer work around
Ian Rogers [Tue, 14 May 2024 05:24:02 +0000 (22:24 -0700)]
perf arm-spe: Unaligned pointer work around

Use get_unaligned_leXX instead of leXX_to_cpu to handle unaligned
pointers. Such pointers occur with libFuzzer testing.
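
The idea behind an unaligned little-endian read is to assemble the
value one byte at a time rather than dereferencing a wider pointer.
Python has no alignment constraints, so the sketch below only
illustrates the byte-wise assembly; get_unaligned_le is a hypothetical
helper, not the kernel macro.

```python
def get_unaligned_le(buf, offset, size):
    """Assemble a little-endian integer of size bytes starting at offset."""
    value = 0
    for i in range(size):
        # Byte i contributes bits [8*i, 8*i + 7].
        value |= buf[offset + i] << (8 * i)
    return value


# One leading byte forces the 64-bit value onto an odd offset.
payload = b"\xff" + (0x1122334455667788).to_bytes(8, "little")
print(hex(get_unaligned_le(payload, 1, 8)))
```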

A similar change for intel-pt was done in:
https://lore.kernel.org/r/20231005190451.175568-6-adrian.hunter@intel.com

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240514052402.3031871-1-irogers@google.com
16 months agoperf tests: Add some pmu core functionality tests
Ian Rogers [Wed, 15 May 2024 06:01:14 +0000 (23:01 -0700)]
perf tests: Add some pmu core functionality tests

Test behavior of PMU names and comparisons wrt suffixes using Intel
uncore_cha, marvell mrvl_ddr_pmu and S390's cpum_cf as examples.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: Bhaskara Budiredla <bbudiredla@marvell.com>
Cc: Tuan Phan <tuanphan@os.amperecomputing.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240515060114.3268149-3-irogers@google.com
16 months agoperf pmus: Sort/merge/aggregate PMUs like mrvl_ddr_pmu
Ian Rogers [Wed, 15 May 2024 06:01:13 +0000 (23:01 -0700)]
perf pmus: Sort/merge/aggregate PMUs like mrvl_ddr_pmu

The mrvl_ddr_pmu is uncore and has a hexadecimal address suffix while
the previous PMU sorting/merging code assumes uncore PMU names start
with uncore_ and have a decimal suffix. Because of the previous
assumption it isn't possible to wildcard the mrvl_ddr_pmu.

Modify pmu_name_len_no_suffix but also remove the suffix number out
argument, this is because we don't know if a suffix number of say 100
is in hexadecimal or decimal. As the only use of the suffix number is
in comparisons, it is safe there to compare the values as hexadecimal.
Modify perf_pmu__match_ignoring_suffix so that hexadecimal suffixes
are ignored.

Only allow hexadecimal suffixes to be greater than length 2 (ie 3 or
more) so that S390's cpum_cf PMU doesn't lose its suffix.

Change the return type of pmu_name_len_no_suffix to size_t to
work around GCC incorrectly determining the result could be negative.
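
The suffix rules described above can be sketched in Python. This is an
approximation under assumptions, not the perf implementation: it
requires an underscore before the suffix, and it reuses the name
pmu_name_len_no_suffix only for readability. match_ignoring_suffix is
likewise a hypothetical stand-in.

```python
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F


def pmu_name_len_no_suffix(name):
    """Length of name without a trailing _<decimal> or _<hex, 3+ chars>."""
    start = len(name)
    # Walk back over trailing hexadecimal digits.
    while start > 0 and name[start - 1] in HEX_DIGITS:
        start -= 1
    suffix = name[start:]
    if not suffix or start == 0 or name[start - 1] != "_":
        return len(name)
    if suffix.isdigit():
        return start - 1  # decimal suffix of any length
    if len(suffix) > 2:
        return start - 1  # hexadecimal suffix needs 3 or more characters
    return len(name)      # e.g. "cf" in cpum_cf is kept


def match_ignoring_suffix(pmu_name, wanted):
    return pmu_name[:pmu_name_len_no_suffix(pmu_name)] == wanted


for n in ("uncore_cha_0", "mrvl_ddr_pmu_87e1b0000000", "cpum_cf"):
    print(n[:pmu_name_len_no_suffix(n)])
```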

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: Bhaskara Budiredla <bbudiredla@marvell.com>
Cc: Tuan Phan <tuanphan@os.amperecomputing.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240515060114.3268149-2-irogers@google.com
16 months agoLinux 6.10-rc1
Linus Torvalds [Sun, 26 May 2024 22:20:12 +0000 (15:20 -0700)]
Linux 6.10-rc1