]> www.infradead.org Git - users/jedix/linux-maple.git/log
users/jedix/linux-maple.git
11 months agoperf hist: Update hist symbol when updating maps
Matt Fleming [Thu, 15 Aug 2024 14:22:12 +0000 (15:22 +0100)]
perf hist: Update hist symbol when updating maps

AddressSanitizer found a use-after-free bug in the symbol code which
manifested as 'perf top' segfaulting.

  ==1238389==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b00c48844b at pc 0x5650d8035961 bp 0x7f751aaecc90 sp 0x7f751aaecc80
  READ of size 1 at 0x60b00c48844b thread T193
      #0 0x5650d8035960 in _sort__sym_cmp util/sort.c:310
      #1 0x5650d8043744 in hist_entry__cmp util/hist.c:1286
      #2 0x5650d8043951 in hists__findnew_entry util/hist.c:614
      #3 0x5650d804568f in __hists__add_entry util/hist.c:754
      #4 0x5650d8045bf9 in hists__add_entry util/hist.c:772
      #5 0x5650d8045df1 in iter_add_single_normal_entry util/hist.c:997
      #6 0x5650d8043326 in hist_entry_iter__add util/hist.c:1242
      #7 0x5650d7ceeefe in perf_event__process_sample /home/matt/src/linux/tools/perf/builtin-top.c:845
      #8 0x5650d7ceeefe in deliver_event /home/matt/src/linux/tools/perf/builtin-top.c:1208
      #9 0x5650d7fdb51b in do_flush util/ordered-events.c:245
      #10 0x5650d7fdb51b in __ordered_events__flush util/ordered-events.c:324
      #11 0x5650d7ced743 in process_thread /home/matt/src/linux/tools/perf/builtin-top.c:1120
      #12 0x7f757ef1f133 in start_thread nptl/pthread_create.c:442
      #13 0x7f757ef9f7db in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

When updating hist maps it's also necessary to update the hist symbol
reference because the old one gets freed in map__put().

While this bug was probably introduced with 5c24b67aae72f54c ("perf
tools: Replace map->referenced & maps->removed_maps with map->refcnt"),
the symbol objects were leaked until c087e9480cf33672 ("perf machine:
Fix refcount usage when processing PERF_RECORD_KSYMBOL") was merged so
the bug was masked.

Fixes: c087e9480cf33672 ("perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL")
Reported-by: Yunzhao Li <yunzhao@cloudflare.com>
Signed-off-by: Matt Fleming (Cloudflare) <matt@readmodwrite.com>
Cc: Ian Rogers <irogers@google.com>
Cc: kernel-team@cloudflare.com
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: stable@vger.kernel.org # v5.13+
Link: https://lore.kernel.org/r/20240815142212.3834625-1-matt@readmodwrite.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf test record.sh: Raise limit of open file descriptors
Veronika Molnarova [Mon, 29 Apr 2024 08:53:07 +0000 (10:53 +0200)]
perf test record.sh: Raise limit of open file descriptors

Subtest for system-wide record with '--threads=cpu' option fails due
to a limit of open file descriptors on systems with 128 or more CPUs
as the default limit is set to 1024.

The number of open file descriptors should be slightly above
nmb_events*nmb_cpus + nmb_cpus(for perf.data.n) + 4*nmb_cpus(for pipes),
which equals 8*nmb_cpus. Therefore, temporarily raise the limit to
16*nmb_cpus for the test.

Committer notes:

Instead of disabling ShellCheck warnings all the uses of 'uname -n',
i.e. those:

  In tests/shell/record.sh line 35:
  default_fd_limit=$(ulimit -Sn)
                            ^-^ SC3045 (warning): In POSIX sh, ulimit -S is undefined.

We can just switch from using '/bin/sh' to '/bin/bash' for this test, as
bash _has_ 'ulimit -n', so ShellCheck will not emit that warning.

There are dozens of 'perf test' shell tests that do just that,
'/bin/bash' is a reasonable expectation for those tests.

Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Radostin Stoyanov <rstoyano@redhat.com>
Link: https://lore.kernel.org/linux-perf-users/20240429085721.10122-1-vmolnaro@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf build: Fix up broken capstone feature detection fast path
Arnaldo Carvalho de Melo [Wed, 14 Aug 2024 13:36:24 +0000 (10:36 -0300)]
perf build: Fix up broken capstone feature detection fast path

The capstone devel headers define 'struct bpf_insn' in a way that clashes with
what is in the libbpf devel headers, so we so far need to avoid including both.

This is happening on the tools/build/feature/test-all.c file, where we try
building all the expected set of libraries to be normally available on a
system:

  ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output
  In file included from test-bpf.c:3,
                   from test-all.c:150:
  /home/acme/git/perf-tools-next/tools/include/uapi/linux/bpf.h:77:8: error: ‘bpf_insn’ defined as wrong kind of tag
     77 | struct bpf_insn {
        |        ^~~~~~~~
  ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output

When doing so there is a trick where we define main to be
main_test_libcapstone, then include the individual
tools/build/feture/test-libcapstone.c capability query test, and then we undef
'main' because we'll do it all over again with the next expected library to
be tested (at this time 'lzma').

To complete this mechanism we need to, in test-all.c 'main' routine, to
call main_test_libcapstone(), which isn't being done, so the effect of
adding references to capstone in test-all.c are not achieved.

The only thing that is happening is that test-all.c is failing to build and thus
all the tests will have to be done individually, which nullifies the test-all.c
single build speedup.

So lets remove references to capstone from test-all.c to see if this makes it
build again so that we get faster builds or go on fixing up whatever is
preventing us to get that benefit.

Nothing: after this fix we get a clean test-all.c build and get the build speedup back:

  ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output
  ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.
  test-all.bin          test-all.d            test-all.make.output
  ⬢[acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output
  ⬢[acme@toolbox perf-tools-next]$ ldd /tmp/build/perf-tools-next/feature/test-all.bin
   linux-vdso.so.1 (0x00007f13277a1000)
   libpython3.12.so.1.0 => /lib64/libpython3.12.so.1.0 (0x00007f1326e00000)
   libm.so.6 => /lib64/libm.so.6 (0x00007f13274be000)
   libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1327496000)
   libtracefs.so.1 => /lib64/libtracefs.so.1 (0x00007f132746f000)
   libcrypto.so.3 => /lib64/libcrypto.so.3 (0x00007f1326800000)
   libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007f1327452000)
   libunwind.so.8 => /lib64/libunwind.so.8 (0x00007f1327436000)
   liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f1327403000)
   libdw.so.1 => /lib64/libdw.so.1 (0x00007f1326d6f000)
   libz.so.1 => /lib64/libz.so.1 (0x00007f13273e2000)
   libelf.so.1 => /lib64/libelf.so.1 (0x00007f1326d53000)
   libnuma.so.1 => /lib64/libnuma.so.1 (0x00007f13273d4000)
   libslang.so.2 => /lib64/libslang.so.2 (0x00007f1326400000)
   libperl.so.5.38 => /lib64/libperl.so.5.38 (0x00007f1326000000)
   libc.so.6 => /lib64/libc.so.6 (0x00007f1325e0f000)
   libzstd.so.1 => /lib64/libzstd.so.1 (0x00007f1326741000)
   /lib64/ld-linux-x86-64.so.2 (0x00007f13277a3000)
   libbz2.so.1 => /lib64/libbz2.so.1 (0x00007f1326d3f000)
   libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007f1326d07000)
  ⬢[acme@toolbox perf-tools-next]$

And when having capstone-devel installed we get it detected and linked with
perf, allowing us to benefit from the features that it enables:

  ⬢[acme@toolbox perf-tools-next]$ rpm -q capstone-devel
  capstone-devel-5.0.1-3.fc40.x86_64
  ⬢[acme@toolbox perf-tools-next]$ ldd /tmp/build/perf-tools-next/perf | grep capstone
   libcapstone.so.5 => /lib64/libcapstone.so.5 (0x00007fe6a5c00000)
  ⬢[acme@toolbox perf-tools-next]$ /tmp/build/perf-tools-next/perf -vv | grep cap
             libcapstone: [ on  ]  # HAVE_LIBCAPSTONE_SUPPORT
  ⬢[acme@toolbox perf-tools-next]$

Fixes: 8b767db3309595a2 ("perf: build: introduce the libcapstone")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Zry0sepD5Ppa5YKP@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf test: Add new test cases for the branch counter feature
Kan Liang [Tue, 13 Aug 2024 16:02:08 +0000 (09:02 -0700)]
perf test: Add new test cases for the branch counter feature

Enhance the test case for the branch counter feature.

Now, the test verifies:

- The new filter can be successfully applied on the supported platforms.
- The counter value can be outputted via the perf report -D
- The counter value and the abbr name can be outputted via the
  perf script (New)

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-10-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf script: Add branch counters
Kan Liang [Tue, 13 Aug 2024 16:02:07 +0000 (09:02 -0700)]
perf script: Add branch counters

It's useful to print the branch counter information for each jump in
the brstackinsn when it's available.

Add a new field 'brcntr' to display the branch counter information.

By default, the abbreviation will be used to indicate the branch
counter. In the verbose mode, the real event name is shown.

  $ perf script -F +brstackinsn,+brcntr

  # Branch counter abbr list:
  # branch-instructions:ppp = A
  # branch-misses = B
  # '-' No event occurs
  # '+' Event occurrences may be lost due to branch counter saturated
      tchain_edit  332203 3366329.405674:      53030 branch-instructions:ppp:            401781 f3+0x2c (home/sdp/test/tchain_edit)
         f3+31:
         0000000000401774        insn: eb 04                     br_cntr: AA     # PRED 5 cycles [5]
         000000000040177a        insn: 81 7d fc 0f 27 00 00
         0000000000401781        insn: 7e e3                     br_cntr: A      # PRED 1 cycles [6] 2.00 IPC
         0000000000401766        insn: 8b 45 fc
         0000000000401769        insn: 83 e0 01
         000000000040176c        insn: 85 c0
         000000000040176e        insn: 74 06                     br_cntr: A      # PRED 1 cycles [7] 4.00 IPC
         0000000000401776        insn: 83 45 fc 01
         000000000040177a        insn: 81 7d fc 0f 27 00 00
         0000000000401781        insn: 7e e3                     br_cntr: A      # PRED 7 cycles [14] 0.43 IPC

  $ perf script -F +brstackinsn,+brcntr -v

     tchain_edit  332203 3366329.405674:      53030 branch-instructions:ppp:            401781 f3+0x2c (/home/sdp/os.linux.perf.test-suite/kernels/lbr_kernel/tchain_edit)
        f3+31:
        0000000000401774        insn: eb 04                     br_cntr: branch-instructions:ppp 2 branch-misses 0      # PRED 5 cycles [5]
        000000000040177a        insn: 81 7d fc 0f 27 00 00
        0000000000401781        insn: 7e e3                     br_cntr: branch-instructions:ppp 1 branch-misses 0      # PRED 1 cycles [6] 2.00 IPC
        0000000000401766        insn: 8b 45 fc
        0000000000401769        insn: 83 e0 01
        000000000040176c        insn: 85 c0
        000000000040176e        insn: 74 06                     br_cntr: branch-instructions:ppp 1 branch-misses 0      # PRED 1 cycles [7] 4.00 IPC
        0000000000401776        insn: 83 45 fc 01
        000000000040177a        insn: 81 7d fc 0f 27 00 00
        0000000000401781        insn: 7e e3                     br_cntr: branch-instructions:ppp 1 branch-misses 0      # PRED 7 cycles [14] 0.43 IPC

Originally-by: Tinghao Zhang <tinghao.zhang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-9-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate: Display the branch counter histogram
Kan Liang [Tue, 13 Aug 2024 16:02:06 +0000 (09:02 -0700)]
perf annotate: Display the branch counter histogram

Display the branch counter histogram in the annotation view.

Press 'B' to display the branch counter's abbreviation list as well.

  Samples: 1M of events 'anon group { branch-instructions:ppp, branch-misses }',
  4000 Hz, Event count (approx.):
  f3  /home/sdp/test/tchain_edit [Percent: local period]
  Percent       │ IPC Cycle       Branch Counter (Average IPC: 1.39, IPC Coverage: 29.4%)
                │                                     0000000000401755 <f3>:
    0.00   0.00 │                                       endbr64
                │                                       push    %rbp
                │                                       mov     %rsp,%rbp
                │                                       movl    $0x0,-0x4(%rbp)
    0.00   0.00 │1.33     3          |A   |-   |      ↓ jmp     25
   11.03  11.03 │                                 11:   mov     -0x4(%rbp),%eax
                │                                       and     $0x1,%eax
                │                                       test    %eax,%eax
   17.13  17.13 │2.41     1          |A   |-   |      ↓ je      21
                │                                       addl    $0x1,-0x4(%rbp)
   21.84  21.84 │2.22     2          |AA  |-   |      ↓ jmp     25
   17.13  17.13 │                                 21:   addl    $0x1,-0x4(%rbp)
   21.84  21.84 │                                 25:   cmpl    $0x270f,-0x4(%rbp)
   11.03  11.03 │0.61     3          |A   |-   |      ↑ jle     11
                │                                       nop
                │                                       pop     %rbp
    0.00   0.00 │0.24    20          |AA  |B   |      ← ret

Originally-by: Tinghao Zhang <tinghao.zhang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-8-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf report: Display the branch counter histogram
Kan Liang [Tue, 13 Aug 2024 16:02:05 +0000 (09:02 -0700)]
perf report: Display the branch counter histogram

Reusing the existing --total-cycles option to display the branch
counters. Add a new PERF_HPP_REPORT__BLOCK_BRANCH_COUNTER to display
the logged branch counter events. They are shown right after all the
cycle-related annotations.

Extend the 'struct block_info' to store and pass the branch counter
related information.

The annotation_br_cntr_entry() is to print the histogram of each branch
counter event. If the number of logged events is less than 4, the exact
number of the abbr name is printed. Otherwise, using '+' to stands for
more than 3 events.

Assume the number of logged events is less than 4.

The annotation_br_cntr_abbr_list() prints the branch counter's
abbreviation list. Press 'B' to display the list in the TUI mode.

  $ perf record -e "{branch-instructions:ppp,branch-misses}:S" -j any,counter
  $ perf report  --total-cycles --stdio

  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 1M of events 'anon group { branch-instructions:ppp, branch-misses }'
  # Event count (approx.): 1610046
  #
  # Branch counter abbr list:
  # branch-instructions:ppp = A
  # branch-misses = B
  # '-' No event occurs
  # '+' Event occurrences may be lost due to branch counter saturated
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  Branch Counter [Program Block Range]
  # ...............  ..............  ...........  ..........  ..............  ..................
  #
            57.55%            2.5M        0.00%           3     |A   |-   |                 ...
            25.27%            1.1M        0.00%           2     |AA  |-   |                 ...
            15.61%          667.2K        0.00%           1     |A   |-   |                 ...
             0.16%            6.9K        0.81%         575     |A   |-   |                 ...
             0.16%            6.8K        1.38%         977     |AA  |-   |                 ...
             0.16%            6.8K        0.04%          28     |AA  |B   |                 ...
             0.15%            6.6K        1.33%         946     |A   |-   |                 ...
             0.11%            4.5K        0.06%          46     |AAA+|-   |                 ...
             0.10%            4.4K        0.88%         624     |A   |-   |                 ...
             0.09%            3.7K        0.74%         524     |AAA+|B   |                 ...

With -v applied,

  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  Branch Counter [Program Block Range]
  # ...............  ..............  ...........  ..........  ..............  ..................
  #
            57.55%            2.5M        0.00%           3       A=1 ,B=-                  ...
            25.27%            1.1M        0.00%           2       A=2 ,B=-                  ...
            15.61%          667.2K        0.00%           1       A=1 ,B=-                  ...
             0.16%            6.9K        0.81%         575       A=1 ,B=-                  ...
             0.16%            6.8K        1.38%         977       A=2 ,B=-                  ...
             0.16%            6.8K        0.04%          28       A=2 ,B=1                  ...
             0.15%            6.6K        1.33%         946       A=1 ,B=-                  ...
             0.11%            4.5K        0.06%          46       A=3+,B=-                  ...
             0.10%            4.4K        0.88%         624       A=1 ,B=-                  ...
             0.09%            3.7K        0.74%         524       A=3+,B=1                  ...

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-7-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf evsel: Assign abbr name for the branch counter events
Kan Liang [Tue, 13 Aug 2024 16:02:04 +0000 (09:02 -0700)]
perf evsel: Assign abbr name for the branch counter events

There could be several branch counter events. If perf tool output the
result via the format "event name + a number", the line could be very
long and hard to read.

An abbreviation is introduced to replace the full event name in the
display. The abbreviation starts from 'A' to 'Z9', which can support
up to 286 events. The same abbreviation will be assigned if the same
events are found in the evlist. The next patch will utilize the
abbreviation name to show the branch counter events in the output.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-6-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate: Save branch counters for each block
Kan Liang [Tue, 13 Aug 2024 16:02:03 +0000 (09:02 -0700)]
perf annotate: Save branch counters for each block

When annotating a basic block, it's useful to display the occurrences
of other events in the block.

The branch counter feature is only available for newer Intel platforms.

So a dedicated option to display the branch counters is not introduced.

Reuse the existing --total-cycles option, which triggers the annotation
of a basic block and displays the cycle-related annotation.

When the branch counters information is available, the branch counters
are automatically appended after all the cycle-related annotation.

Accounting the branch counters as well when accounting the cycles in
hist__account_cycles().

In 'struct annotated_branch', introduce a br_cntr array to save the
accumulation of each branch counter.

In a sample, all the branch counters for a branch are saved in a u64
space.

Because the saturation of a branch counter is small, e.g., for Intel
Sierra Forest, the saturation is only 3.

Add ANNOTATION__BR_CNTR_SATURATED_FLAG to indicate if a branch counter
once saturated. That can be used to indicate a potential event lost
because of the saturation.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-5-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf evlist: Save branch counters information
Kan Liang [Tue, 13 Aug 2024 16:02:02 +0000 (09:02 -0700)]
perf evlist: Save branch counters information

The branch counters logging (A.K.A LBR event logging) introduces a
per-counter indication of precise event occurrences in LBRs. The kernel
only dumps the number of occurrences into a record. The perf tool has
to map the number to the corresponding event.

Add evlist__update_br_cntr() to go through the evlist to pick the
events that are configured to be logged. Assign a logical idx to track
them, and add the total number of the events in the leader event.

The total number will be used to allocate the space to save the branch
counters for a block. The logical idx will be used to locate the
corresponding event quickly in the following patches.

It only needs to iterate the evlist once. The
evsel__has_branch_counters() is also optimized.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-4-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf report: Remove the first overflow check for branch counters
Kan Liang [Tue, 13 Aug 2024 16:02:01 +0000 (09:02 -0700)]
perf report: Remove the first overflow check for branch counters

A false overflow warning is triggered if a sample doesn't have any LBRs
recorded and the branch counters feature is enabled.

The current code does OVERFLOW_CHECK_u64() at the very beginning when
reading the information of branch counters. It assumes that there is at
least one LBR in the PEBS record. But it is a valid case that 0 LBR is
recorded especially in a high context switch.

Remove the OVERFLOW_CHECK_u64(). The later OVERFLOW_CHECK() should be
good enough to check the overflow when reading the information of the
branch counters.

Fixes: 9fbb4b02302b0ae6 ("perf tools: Add branch counter knob")
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-3-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf report: Fix --total-cycles --stdio output error
Kan Liang [Tue, 13 Aug 2024 16:02:00 +0000 (09:02 -0700)]
perf report: Fix --total-cycles --stdio output error

The --total-cycles may output wrong information with the --stdio.

For example:

  # perf record -e "{cycles,instructions}",cache-misses -b sleep 1
  # perf report --total-cycles --stdio

The total cycles output of {cycles,instructions} and cache-misses are
almost the same.

  # Samples: 938  of events 'anon group { cycles, instructions }'
  # Event count (approx.): 938
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  [Program Block Range]
  # ...............  ..............  ...........  ..........  ..................................................>
  #
            11.19%            2.6K        0.10%           21  [perf_iterate_ctx+48 -> >
             5.79%            1.4K        0.45%           97  [__intel_pmu_enable_all.constprop.0+80 -> __intel_>
             5.11%            1.2K        0.33%           71  [native_write_msr+0 ->>

  # Samples: 293  of event 'cache-misses'
  # Event count (approx.): 293
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  [Program Block Range]
  # ...............  ..............  ...........  ..........  ..................................................>
  #
            11.19%            2.6K        0.13%           21  [perf_iterate_ctx+48 -> >
             5.79%            1.4K        0.59%           97  [__intel_pmu_enable_all.constprop.0+80 -> __intel_>
             5.11%            1.2K        0.43%           71  [native_write_msr+0 ->>

With the symbol_conf.event_group, the 'perf report' should only report the
block information of the leader event in a group.

However, the current implementation retrieves the next event's block
information, rather than the next group leader's block information.

Make sure the index is updated even if the event is skipped.

With the patch,

  # Samples: 293  of event 'cache-misses'
  # Event count (approx.): 293
  #
  # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles  [Program Block Range]
  # ...............  ..............  ...........  ..........  ..................................................>
  #
           37.98%            9.0K        4.05%           299  [perf_event_addr_filters_exec+0 -> perf_event_a>
           11.19%            2.6K        0.28%            21  [perf_iterate_ctx+48 -> >
            5.79%            1.4K        1.32%            97  [__intel_pmu_enable_all.constprop.0+80 -> __intel_>

Fixes: 6f7164fa231a5f36 ("perf report: Sort by sampled cycles percent per block for stdio")
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240813160208.2493643-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf test annotate: Dump trapping test in trap handler
Ian Rogers [Tue, 13 Aug 2024 04:06:13 +0000 (21:06 -0700)]
perf test annotate: Dump trapping test in trap handler

Help to better identify the location of test failures but dumping the
failing test in the trap handler.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240813040613.882075-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf disasm: Fix memory leak for locked operations
Ian Rogers [Tue, 13 Aug 2024 04:06:12 +0000 (21:06 -0700)]
perf disasm: Fix memory leak for locked operations

lock__parse() calls disasm_line__parse() passing
&ops->locked.ins.name that will use strdup() to populate it.

Ensure ops->locked.ins.name is freed in lock__delete().

Found with address/leak sanitizer.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240813040613.882075-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf inject: Inject build ids for entire call chain
Ian Rogers [Mon, 12 Aug 2024 22:41:19 +0000 (15:41 -0700)]
perf inject: Inject build ids for entire call chain

The DSO build id is injected when the dso is first encountered but the
checking for first encountered only looks at the sample->ip not the
entire callchain.

Use the callchain logic to ensure all build ids are inserted.

Fixes: 454c407ec17a0c63 ("perf: add perf-inject builtin")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: https://lore.kernel.org/r/20240812224119.744968-1-irogers@google.com
[ Split from a larger patch that introduced the API and use it ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf callchain: Add a for_each callback style API
Ian Rogers [Mon, 12 Aug 2024 22:41:19 +0000 (15:41 -0700)]
perf callchain: Add a for_each callback style API

Add a for_each callback style API to callchain with
sample__for_each_callchain_node().

Possibly in the future such an API can avoid the overhead of
constructing the call chain list.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Casey Chen <cachen@purestorage.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: https://lore.kernel.org/r/20240812224119.744968-1-irogers@google.com
[ Split from a larger patch that introduced the API and use it ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf test: Add test for Intel TPEBS counting mode
Weilin Wang [Sat, 20 Jul 2024 06:21:01 +0000 (02:21 -0400)]
perf test: Add test for Intel TPEBS counting mode

Intel TPEBS sampling mode is supported through perf record. The counting mode
code uses perf record to capture retire_latency value and use it in metric
calculation. This test checks the counting mode code on Intel platforms.

Committer testing:

  root@x1:~# perf test tpebs
  123: test Intel TPEBS counting mode                                  : Ok
  root@x1:~# set -o vi
  root@x1:~# perf test tpebs
  123: test Intel TPEBS counting mode                                  : Ok
  root@x1:~# perf test -v tpebs
  123: test Intel TPEBS counting mode                                  : Ok
  root@x1:~# perf test -vvv tpebs
  123: test Intel TPEBS counting mode:
  --- start ---
  test child forked, pid 16603
  Testing without --record-tpebs
  Testing with --record-tpebs
  ---- end(0) ----
  123: test Intel TPEBS counting mode                                  : Ok
  root@x1:~#

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Link: https://lore.kernel.org/r/20240720062102.444578-9-weilin.wang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf Document: Add TPEBS (Timed PEBS(Precise Event-Based Sampling)) to Documents
Weilin Wang [Sat, 20 Jul 2024 06:21:00 +0000 (02:21 -0400)]
perf Document: Add TPEBS (Timed PEBS(Precise Event-Based Sampling)) to Documents

TPEBS (Timed PEBS(Precise Event-Based Sampling)) is a new feature Intel
PMU from Granite Rapids microarchitecture.

It will be used in new TMA (Top-Down Microarchitecture Analysis)
releases.

Add related introduction to documents while adding new code to support
it in 'perf stat'.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Link: https://lore.kernel.org/r/20240720062102.444578-8-weilin.wang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf stat: Add command line option for enabling TPEBS recording
Weilin Wang [Sat, 20 Jul 2024 06:20:59 +0000 (02:20 -0400)]
perf stat: Add command line option for enabling TPEBS recording

With this command line option, TPEBS recording is turned off in 'perf
stat' on default. It will only be turned on when this option is given in
'perf stat' command.

Example with --record-tpebs:

  perf stat -M tma_split_loads -C1-4 --record-tpebs sleep 1

  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.044 MB - ]

   Performance counter stats for 'CPU(s) 1-4':

      53,259,156,071      cpu_core/TOPDOWN.SLOTS/          #      1.6 %  tma_split_loads   (50.00%)
      15,867,565,250      cpu_core/topdown-retiring/                                       (50.00%)
      15,655,580,731      cpu_core/topdown-mem-bound/                                      (50.00%)
      11,738,022,218      cpu_core/topdown-bad-spec/                                       (50.00%)
       6,151,265,424      cpu_core/topdown-fe-bound/                                       (50.00%)
      20,445,917,581      cpu_core/topdown-be-bound/                                       (50.00%)
       6,925,098,013      cpu_core/L1D_PEND_MISS.PENDING/                                  (50.00%)
       3,838,653,421      cpu_core/MEMORY_ACTIVITY.STALLS_L1D_MISS/                        (50.00%)
       4,797,059,783      cpu_core/EXE_ACTIVITY.BOUND_ON_LOADS/                            (50.00%)
      11,931,916,714      cpu_core/CPU_CLK_UNHALTED.THREAD/                                (50.00%)
         102,576,164      cpu_core/MEM_LOAD_COMPLETED.L1_MISS_ANY/                         (50.00%)
          64,071,854      cpu_core/MEM_INST_RETIRED.SPLIT_LOADS/                           (50.00%)
                   3      cpu_core/MEM_INST_RETIRED.SPLIT_LOADS/R

         1.003049679 seconds time elapsed

Example without --record-tpebs:

  perf stat -M tma_contested_accesses -C1 sleep 1

   Performance counter stats for 'CPU(s) 1':

          50,203,891      cpu_core/TOPDOWN.SLOTS/          #      0.0 %  tma_contested_accesses   (63.60%)
          10,040,777      cpu_core/topdown-retiring/                                              (63.60%)
           6,890,729      cpu_core/topdown-mem-bound/                                             (63.60%)
           2,756,463      cpu_core/topdown-bad-spec/                                              (63.60%)
          10,828,288      cpu_core/topdown-fe-bound/                                              (63.60%)
          28,350,432      cpu_core/topdown-be-bound/                                              (63.60%)
                  98      cpu_core/OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HITM/                          (63.70%)
             577,520      cpu_core/MEMORY_ACTIVITY.STALLS_L2_MISS/                                (54.62%)
             313,339      cpu_core/MEMORY_ACTIVITY.STALLS_L3_MISS/                                (54.62%)
              14,155      cpu_core/MEM_LOAD_RETIRED.L1_MISS/                                      (45.54%)
                   0      cpu_core/OCR.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD/                  (36.30%)
           8,468,077      cpu_core/CPU_CLK_UNHALTED.THREAD/                                       (45.38%)
                 198      cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_MISS/                             (45.38%)
               8,324      cpu_core/MEM_LOAD_RETIRED.FB_HIT/                                       (45.38%)
       3,388,031,520      TSC
          23,226,785      cpu_core/CPU_CLK_UNHALTED.REF_TSC/                                      (54.46%)
                  80      cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD/                              (54.46%)
                   0      cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD/R
                   0      cpu_core/MEM_LOAD_L3_HIT_RETIRED.XSNP_MISS/R
       1,006,816,667 ns   duration_time

         1.002537737 seconds time elapsed

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Link: https://lore.kernel.org/r/20240720062102.444578-7-weilin.wang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf vendor events intel: Add MTL metric JSON files
Weilin Wang [Sat, 20 Jul 2024 06:20:58 +0000 (02:20 -0400)]
perf vendor events intel: Add MTL metric JSON files

Add MTL metric JSON file for TMA4.8. Some of the metrics' formulas use TPEBS
retire_latency in MTL.

This also includes lated E-Core TMA3.6 changes.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Link: https://lore.kernel.org/r/20240720062102.444578-6-weilin.wang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf stat: Fork and launch 'perf record' when 'perf stat' needs to get retire latency...
Weilin Wang [Sat, 20 Jul 2024 06:20:56 +0000 (02:20 -0400)]
perf stat: Fork and launch 'perf record' when 'perf stat' needs to get retire latency value for a metric.

When retire_latency value is used in a metric formula, evsel would fork
a 'perf record' process with "-e" and "-W" options. 'perf record' will
collect required retire_latency values in parallel while 'perf stat' is
collecting counting values.

At the point of time that 'perf stat' stops counting, evsel would stop
'perf record' by sending sigterm signal to 'perf record' process.
Sampled data will be processed to get retire latency value. Another
thread is required to synchronize between 'perf stat' and 'perf record'
when we pass data through pipe.

Retire_latency evsel is not opened for 'perf stat' so that there is no
counter wasted on it. This commit includes code suggested by Namhyung to
adjust reading size for groups that include retire_latency evsels.

In current :R parsing implementation, the parser would recognize events
with retire_latency modifier and insert them into the evlist like a
normal event.  Ideally, we need to avoid counting these events.

In this commit, at the time when a retire_latency evsel is read, set the
retire latency value processed from the sampled data to count value.
This sampled retire latency value will be used for metric calculation
and final event count print out. No special metric calculation and event
print out code required for retire_latency events.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Link: https://lore.kernel.org/r/20240720062102.444578-4-weilin.wang@intel.com
[ Squashed the 3rd and 4th commit in the series to keep it building patch by patch ]
[ Constified the 'struct perf_tool' pointer in process_sample_event() ]
[ Use perf_tool__init(&tool, false) to address a segfault I reported and Ian/Weilin diagnosed ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf data: Allow to use given fd in data->file.fd
Weilin Wang [Sat, 20 Jul 2024 06:20:55 +0000 (02:20 -0400)]
perf data: Allow to use given fd in data->file.fd

When in PIPE mode, allow to use fd dynamically opened and asigned to
data->file.fd instead of STDIN_FILENO or STDOUT_FILENO.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Link: https://lore.kernel.org/r/20240720062102.444578-3-weilin.wang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf parse-events: Add a retirement latency modifier
Ian Rogers [Sat, 20 Jul 2024 06:20:54 +0000 (02:20 -0400)]
perf parse-events: Add a retirement latency modifier

Retirement latency is a separate sampled count used on newer Intel
CPUs.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Link: https://lore.kernel.org/r/20240720062102.444578-2-weilin.wang@intel.com
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf session: Constify tool
Ian Rogers [Mon, 12 Aug 2024 20:47:19 +0000 (13:47 -0700)]
perf session: Constify tool

Make tool const now that all uses are const and
perf_tool__fill_defaults() won't be used. The aim is to better capture
that sessions don't mutate tools.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-28-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf tool: Remove perf_tool__fill_defaults()
Ian Rogers [Mon, 12 Aug 2024 20:47:18 +0000 (13:47 -0700)]
perf tool: Remove perf_tool__fill_defaults()

Now all tools are fully initialized prior to use it has no use so
remove.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-27-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf kwork: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:17 +0000 (13:47 -0700)]
perf kwork: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-26-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf test event_update: Ensure tools is initialized
Ian Rogers [Mon, 12 Aug 2024 20:47:16 +0000 (13:47 -0700)]
perf test event_update: Ensure tools is initialized

Ensure tool is initialized to avoid lazy initialization pattern so
that more uses of struct perf_tool can be made const.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-25-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf data convert ctf: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:15 +0000 (13:47 -0700)]
perf data convert ctf: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-24-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf data convert json: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:14 +0000 (13:47 -0700)]
perf data convert json: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-23-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf diff: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:13 +0000 (13:47 -0700)]
perf diff: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-22-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf timechart: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:12 +0000 (13:47 -0700)]
perf timechart: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-21-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf mem: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:11 +0000 (13:47 -0700)]
perf mem: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-20-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf sched: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:10 +0000 (13:47 -0700)]
perf sched: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-19-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:09 +0000 (13:47 -0700)]
perf annotate: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-18-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf stat: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:08 +0000 (13:47 -0700)]
perf stat: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf report: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:07 +0000 (13:47 -0700)]
perf report: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf inject: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:06 +0000 (13:47 -0700)]
perf inject: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf script: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:05 +0000 (13:47 -0700)]
perf script: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf c2c: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:04 +0000 (13:47 -0700)]
perf c2c: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf record: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:03 +0000 (13:47 -0700)]
perf record: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf evlist: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:02 +0000 (13:47 -0700)]
perf evlist: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf lock: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:01 +0000 (13:47 -0700)]
perf lock: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf kvm: Use perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:47:00 +0000 (13:47 -0700)]
perf kvm: Use perf_tool__init()

Use perf_tool__init() so that more uses of 'struct perf_tool' can be const
and not relying on perf_tool__fill_defaults().

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf buildid-list: Use perf_tool__init
Ian Rogers [Mon, 12 Aug 2024 20:46:59 +0000 (13:46 -0700)]
perf buildid-list: Use perf_tool__init

Reduce scope of build_id__mark_dso_hit_ops() to the scope of function
perf_session__list_build_ids, its only use, and use perf_tool__init()
for the default values. Move perf_event__exit_del_thread() to event.[ch]
so it can be used in builtin-buildid-list.c.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf kmem: Use perf_tool__init
Ian Rogers [Mon, 12 Aug 2024 20:46:58 +0000 (13:46 -0700)]
perf kmem: Use perf_tool__init

Reduce the scope of the tool from global/static to just that of the
cmd_kmem function where the session is scoped. Use the perf_tool__init()
to initialize default values.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf tool: Add perf_tool__init()
Ian Rogers [Mon, 12 Aug 2024 20:46:57 +0000 (13:46 -0700)]
perf tool: Add perf_tool__init()

Add init function that behaves like perf_tool__fill_defaults() but
assumes all values haven't been initialized.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf tool: Move fill defaults into tool.c
Ian Rogers [Mon, 12 Aug 2024 20:46:56 +0000 (13:46 -0700)]
perf tool: Move fill defaults into tool.c

The aim here is to eventually make perf_tool__fill_defaults() an init
function so that the tools struct is more const.

Create a tool.c to go along with tool.h. Move perf_tool__fill_defaults()
out of session.c into tool.c along with the default stub values. Add
perf_tool__compressed_is_stub() for a test in
perf_session__process_user_event().

perf_session__process_compressed_event() is only used from being default
initialized so migrate into tool.c.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf tool: Constify tool pointers
Ian Rogers [Mon, 12 Aug 2024 20:46:55 +0000 (13:46 -0700)]
perf tool: Constify tool pointers

The tool pointer (to a struct largely of function pointers) is passed
around but is unchanged except at initialization. Change parameter and
variable types to be const to lower the possibilities of what could
happen with a tool.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf s390-cpumsf: Remove unused struct
Ian Rogers [Mon, 12 Aug 2024 20:46:54 +0000 (13:46 -0700)]
perf s390-cpumsf: Remove unused struct

struct s390_cpumsf_synth was likely cargo culted from other auxtrace
examples. It has no users, so remove.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf auxtrace: Remove dummy tools
Ian Rogers [Mon, 12 Aug 2024 20:46:53 +0000 (13:46 -0700)]
perf auxtrace: Remove dummy tools

Add perf_session__deliver_synth_attr_event that synthesizes a
perf_record_header_attr event with one id. Remove use of
perf_event__synthesize_attr that necessitates the use of the dummy
tool in order to pass the session.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf inject: Fix leader sampling inserting additional samples
Ian Rogers [Mon, 29 Jul 2024 22:06:20 +0000 (15:06 -0700)]
perf inject: Fix leader sampling inserting additional samples

The processing of leader samples would turn an individual sample with
a group of read values into multiple samples. 'perf inject' would pass
through the additional samples increasing the output data file size:

  $ perf record -g -e "{instructions,cycles}:S" -o perf.orig.data true
  $ perf script -D -i perf.orig.data | sed -e 's/perf.orig.data/perf.data/g' > orig.txt
  $ perf inject -i perf.orig.data -o perf.new.data
  $ perf script -D -i perf.new.data | sed -e 's/perf.new.data/perf.data/g' > new.txt
  $ diff -u orig.txt new.txt
  --- orig.txt    2024-07-29 14:29:40.606576769 -0700
  +++ new.txt     2024-07-29 14:30:04.142737434 -0700
  ...
  -0xc550@perf.data [0x30]: event: 3
  +0xc550@perf.data [0xd0]: event: 9
  +.
  +. ... raw event: size 208 bytes
  +.  0000:  09 00 00 00 01 00 d0 00 fc 72 01 86 ff ff ff ff  .........r......
  +.  0010:  74 7d 2c 00 74 7d 2c 00 fb c3 79 f9 ba d5 05 00  t},.t},...y.....
  +.  0020:  e6 cb 1a 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
  +.  0030:  02 00 00 00 00 00 00 00 76 01 00 00 00 00 00 00  ........v.......
  +.  0040:  e6 cb 1a 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  +.  0050:  62 18 00 00 00 00 00 00 f6 cb 1a 00 00 00 00 00  b...............
  +.  0060:  00 00 00 00 00 00 00 00 0c 00 00 00 00 00 00 00  ................
  +.  0070:  80 ff ff ff ff ff ff ff fc 72 01 86 ff ff ff ff  .........r......
  +.  0080:  f3 0e 6e 85 ff ff ff ff 0c cb 7f 85 ff ff ff ff  ..n.............
  +.  0090:  bc f2 87 85 ff ff ff ff 44 af 7f 85 ff ff ff ff  ........D.......
  +.  00a0:  bd be 7f 85 ff ff ff ff 26 d0 7f 85 ff ff ff ff  ........&.......
  +.  00b0:  6d a4 ff 85 ff ff ff ff ea 00 20 86 ff ff ff ff  m......... .....
  +.  00c0:  00 fe ff ff ff ff ff ff 57 14 4f 43 fc 7e 00 00  ........W.OC.~..
  +
  +1642373909693435 0xc550 [0xd0]: PERF_RECORD_SAMPLE(IP, 0x1): 2915700/2915700: 0xffffffff860172fc period: 1 addr: 0
  +... FP chain: nr:12
  +.....  0: ffffffffffffff80
  +.....  1: ffffffff860172fc
  +.....  2: ffffffff856e0ef3
  +.....  3: ffffffff857fcb0c
  +.....  4: ffffffff8587f2bc
  +.....  5: ffffffff857faf44
  +.....  6: ffffffff857fbebd
  +.....  7: ffffffff857fd026
  +.....  8: ffffffff85ffa46d
  +.....  9: ffffffff862000ea
  +..... 10: fffffffffffffe00
  +..... 11: 00007efc434f1457
  +... sample_read:
  +.... group nr 2
  +..... id 00000000001acbe6, value 0000000000000176, lost 0
  +..... id 00000000001acbf6, value 0000000000001862, lost 0
  +
  +0xc620@perf.data [0x30]: event: 3
  ...

This behavior is incorrect as in the case above 'perf inject' should
have done nothing. Fix this behavior by disabling separating samples
for a tool that requests it. Only request this for `perf inject` so as
to not affect other perf tools. With the patch and the test above
there are no differences between the orig.txt and new.txt.

Fixes: e4caec0d1af3d608 ("perf evsel: Add PERF_SAMPLE_READ sample related processing")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240729220620.2957754-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate-data: Show first-level children by default in TUI
Namhyung Kim [Mon, 12 Aug 2024 19:44:47 +0000 (12:44 -0700)]
perf annotate-data: Show first-level children by default in TUI

Now default is to fold everything but it only shows the name of the
top-level data type which is not very useful.  Instead just expand the
top level entry so that it can show the layout at a higher level.

  Annotate type: 'struct task_struct' (4 samples)
        Percent     Offset       Size  Field
  -      100.00          0       9792  struct task_struct {                           ◆
  +        0.50          0         24      struct thread_info     thread_info;        ▒
           0.00         24          4      unsigned int   __state;                    ▒
           0.00         32          8      void*  stack;                              ▒
  +        0.00         40          4      refcount_t     usage;                      ▒
           0.00         44          4      unsigned int   flags;                      ▒
           0.00         48          4      unsigned int   ptrace;                     ▒
           0.00         52          4      int    on_cpu;                             ▒
  +        0.00         56         16      struct __call_single_node      wake_entry; ▒
           0.00         72          4      unsigned int   wakee_flips;                ▒
           0.00         80          8      long unsigned int      wakee_flip_decay_ts;▒
           0.00         88          8      struct task_struct*    last_wakee;         ▒
           0.00         96          4      int    recent_used_cpu;                    ▒
           0.00        100          4      int    wake_cpu;                           ▒
           0.00        104          4      int    on_rq;                              ▒
           0.00        108          4      int    prio;                               ▒
           0.00        112          4      int    static_prio;                        ▒
           0.00        116          4      int    normal_prio;                        ▒
           0.00        120          4      unsigned int   rt_priority;                ▒
  +        0.00        128        256      struct sched_entity    se;                 ▒
  +        0.00        384         48      struct sched_rt_entity rt;                 ▒
  +        0.00        432        224      struct sched_dl_entity dl;                 ▒
           0.00        656          8      struct sched_class*    sched_class;        ▒
  ...

Committer testing:

  # perf mem record -a sleep 5s
  # perf annotate --group --data-type=pthread_mutex_t

 Annotate type: 'pthread_mutex_t' (13 samples)
      Percent     Offset       Size  Field
-      100.00          0         40  pthread_mutex_t {                                ▒
-      100.00          0         40      struct __pthread_mutex_s       __data {      ▒
        39.45          0          4          int        __lock;                       ▒
         0.00          4          4          unsigned int       __count;              ▒
         7.80          8          4          int        __owner;                      ▒
         6.88         12          4          unsigned int       __nusers;             ▒
        45.87         16          4          int        __kind;                       ▒
         0.00         20          2          short int  __spins;                      ▒
         0.00         22          2          short int  __elision;                    ▒
+        0.00         24         16          __pthread_list_t   __list;               ▒
                                         };                                           ▒
         0.00          0          0      char[] __size;                               ▒
        39.45          0          8      long int       __align;

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240812194447.2049187-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate-data: Implement folding in TUI browser
Namhyung Kim [Mon, 12 Aug 2024 19:44:46 +0000 (12:44 -0700)]
perf annotate-data: Implement folding in TUI browser

Like 'perf report', use 'e' or 'E' key to toggle folding the current
entry so that it can control displaying child entries.

Note I didn't add the 'c' and 'C' key to collapse the entry because it's
also handled with the 'e'/'E' since it toggles the state.

Committer testing:

Do some 'perf mem record' for some workload of the whole system, using
the target options, as usual (--pid/-p, -C/--cpu, -a for the system wide
profiling, etc) and then:

  # perf annotate --skip-empty --data-type=pthread_mutex_t

That, by default, will start as --tui, then press 'E' to see the whole
struct unfolded, etc.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240812194447.2049187-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate-data: Support folding in TUI browser
Namhyung Kim [Mon, 12 Aug 2024 19:44:45 +0000 (12:44 -0700)]
perf annotate-data: Support folding in TUI browser

Like in the hists browser, it should support folding current entry so
that it can hide unwanted details in some data structures.

The folded entries will be displayed with the '+' sign, while unfolded
entries will have the '-' sign.

Entries that have no children will not show any signs.

  Annotate type: 'struct socket' (1 samples)
        Percent     Offset       Size  Field
  -      100.00          0        128  struct socket {                                  ◆
           0.00          0          4      socket_state   state;                        ▒
           0.00          4          2      short int      type;                         ▒
           0.00          8          8      long unsigned int      flags;                ▒
           0.00         16          8      struct file*   file;                         ▒
         100.00         24          8      struct sock*   sk;                           ▒
           0.00         32          8      struct proto_ops*      ops;                  ▒
  -        0.00         64         64      struct socket_wq       wq {                  ▒
  -        0.00         64         24          wait_queue_head_t  wait {                ▒
  +        0.00         64          4              spinlock_t     lock;                 ▒
  -        0.00         72         16              struct list_head       head {        ▒
           0.00         72          8                  struct list_head*  next;         ▒
           0.00         80          8                  struct list_head*  prev;         ▒
                                                   };                                   ▒
                                               };                                       ▒
           0.00         88          8          struct fasync_struct*      fasync_list;  ▒
           0.00         96          8          long unsigned int  flags;                ▒
  +        0.00        104         16          struct callback_head       rcu;          ▒
                                           };                                           ▒
                                       };                                               ▒

This just adds the display logic for folding, actually folding action
will be implemented in the next patch.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240812194447.2049187-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf vendor events: SKX, CLX, SNR uncore cache event fixes
Ian Rogers [Sun, 11 Aug 2024 04:20:04 +0000 (21:20 -0700)]
perf vendor events: SKX, CLX, SNR uncore cache event fixes

Cache home agent (CHA) events were setting the low rather than high
config1 bits. SNR was using CLX CHA events, however its CHA is similar
to ICX so remove the events.

Incorporate the updates in:

  https://github.com/intel/perfmon/pull/215
  https://github.com/intel/perfmon/pull/216

Fixes: 4cc49942444e958b ("perf vendor events: Update cascadelakex events/metrics")
Closes: https://lore.kernel.org/linux-perf-users/CAPhsuW4nem9XZP+b=sJJ7kqXG-cafz0djZf51HsgjCiwkGBA+A@mail.gmail.com/
Reported-by: Song Liu <song@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240811042004.421869-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf lock contention: Change stack_id type to s32
Namhyung Kim [Mon, 12 Aug 2024 17:25:33 +0000 (10:25 -0700)]
perf lock contention: Change stack_id type to s32

The bpf_get_stackid() helper returns a signed type to check whether it
failed to get a stacktrace or not.  But it saved the result in u32 and
checked if the value is negative.

      376         if (needs_callstack) {
      377                 pelem->stack_id = bpf_get_stackid(ctx, &stacks,
      378                                                   BPF_F_FAST_STACK_CMP | stack_skip);
  --> 379                 if (pelem->stack_id < 0)

  ./tools/perf/util/bpf_skel/lock_contention.bpf.c:379 contention_begin()
  warn: unsigned 'pelem->stack_id' is never less than zero.

Let's change the type to s32 instead.

Fixes: 6d499a6b3d90277d ("perf lock: Print the number of lost entries for BPF")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240812172533.2015291-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate-data: Fix a buffer overflow in TUI browser
Namhyung Kim [Sat, 10 Aug 2024 19:15:02 +0000 (12:15 -0700)]
perf annotate-data: Fix a buffer overflow in TUI browser

In get_member_overhead(), k is updated when it has a entry in the
histogram.  But the entry->hists array is allocated with the number of
evsel in the group.  So the k should be reset when it iterates the event
using for_each_group_evsel(), otherwise it'd crash due to a buffer
overflow.

Fixes: cb1898f58e0f175d ("perf annotate-data: Support --skip-empty option")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240810191502.1947959-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf docs: Refine the description for the buffer size
Leo Yan [Mon, 12 Aug 2024 09:34:59 +0000 (10:34 +0100)]
perf docs: Refine the description for the buffer size

Current description for the AUX trace buffer size is misleading. When a
user specifies the option '-m,512M', it represents a size value in bytes
(512MiB) but not 512M pages (512M x 4KiB regard to a page of 4KiB).

Make the document clear that the normal buffer and the AUX tracing
buffer share the same semantics. Syncs the documents for consistent
text.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240812093459.2575278-1-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf script: add --addr2line option
Martin Liška [Fri, 19 Jul 2024 10:57:08 +0000 (12:57 +0200)]
perf script: add --addr2line option

Similarly to other subcommands (like report, top), it would be handy to
provide a path for addr2line command.

Signed-off-by: Martin Liska <martin.liska@hey.com>
Cc: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/eadc3e36-029d-4848-9d69-272fe5a83a26@foxlink.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf tests pmu: Initialize all fields of test_pmu variable
Arnaldo Carvalho de Melo [Mon, 12 Aug 2024 12:57:20 +0000 (09:57 -0300)]
perf tests pmu: Initialize all fields of test_pmu variable

Instead of explicitely initializing just the .name and .alias_name,
use struct member named initialization of just the non-null -name field,
the compiler will initialize all the other non-explicitely initialized
fields to NULL.

This makes the code more robust, avoiding the error recently fixed when
the .alias_name was used and contained a random value.

Reviewed-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Radostin Stoyanov <rstoyano@redhat.com>
Link: https://lore.kernel.org/lkml/e26941f9-f86c-4f2e-b812-20c49fb2c0d3@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate-data: Support --skip-empty option
Namhyung Kim [Wed, 7 Aug 2024 06:17:13 +0000 (23:17 -0700)]
perf annotate-data: Support --skip-empty option

The --skip-empty option is to hide dummy events in a group.  Like other
output mode in 'perf report' and 'perf annotate', the data-type
profiling output should support the option.

Committer testing:

With dummy:

  root@number:~# perf annotate --stdio --group --data-type --skip-empty | head -24
  Annotate type: 'pthread_mutex_t' in /usr/lib64/libc.so.6 (50 samples):
   event[0] = cpu_atom/mem-loads,ldlat=30/P
   event[1] = cpu_atom/mem-stores/P
   event[2] = dummy:u
  ============================================================================
                   Percent     offset       size  field
    100.00  100.00    0.00          0         40  pthread_mutex_t  {
    100.00  100.00    0.00          0         40      struct __pthread_mutex_s __data {
     45.21   84.54    0.00          0          4          int __lock;
      0.00    0.00    0.00          4          4          unsigned int __count;
      0.00    1.83    0.00          8          4          int __owner;
      5.19   10.65    0.00         12          4          unsigned int __nusers;
     49.61    2.97    0.00         16          4          int __kind;
      0.00    0.00    0.00         20          2          short int __spins;
      0.00    0.00    0.00         22          2          short int __elision;
      0.00    0.00    0.00         24         16          __pthread_list_t __list {
      0.00    0.00    0.00         24          8              struct __pthread_internal_list* __prev;
      0.00    0.00    0.00         32          8              struct __pthread_internal_list* __next;
                                                          };
                                                      };
      0.00    0.00    0.00          0          0      char[] __size;
     45.21   84.54    0.00          0          8      long int __align;
                                                };
Skipping it:

  root@number:~# perf annotate --stdio --group --data-type --skip-empty | head -24
  Annotate type: 'pthread_mutex_t' in /usr/lib64/libc.so.6 (50 samples):
   event[0] = cpu_atom/mem-loads,ldlat=30/P
   event[1] = cpu_atom/mem-stores/P
  ============================================================================
           Percent     offset       size  field
    100.00  100.00          0         40  pthread_mutex_t  {
    100.00  100.00          0         40      struct __pthread_mutex_s __data {
     45.21   84.54          0          4          int __lock;
      0.00    0.00          4          4          unsigned int __count;
      0.00    1.83          8          4          int __owner;
      5.19   10.65         12          4          unsigned int __nusers;
     49.61    2.97         16          4          int __kind;
      0.00    0.00         20          2          short int __spins;
      0.00    0.00         22          2          short int __elision;
      0.00    0.00         24         16          __pthread_list_t __list {
      0.00    0.00         24          8              struct __pthread_internal_list* __prev;
      0.00    0.00         32          8              struct __pthread_internal_list* __next;
                                                  };
                                              };
      0.00    0.00          0          0      char[] __size;
     45.21   84.54          0          8      long int __align;
                                          };

  Annotate type: 'pthread_mutexattr_t' in /usr/lib64/libc.so.6 (1 samples):
  root@number:~#

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240807061713.1642924-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate: Fix --group behavior when leader has no samples
Namhyung Kim [Wed, 7 Aug 2024 06:15:55 +0000 (23:15 -0700)]
perf annotate: Fix --group behavior when leader has no samples

When --group option is used, it should display all events together.  But
the current logic only checks if the first (leader) event has samples or
not.  Let's check the member events as well.

Also it missed to put the linked samples from member evsels to the
output RB-tree so that it can be displayed in the output.

For example, take a look at this example.

  $ ./perf evlist
  cpu/mem-loads,ldlat=30/P
  cpu/mem-stores/P
  dummy:u

It has three events but 'path_put' function has samples only for
mem-stores (second) event.

  $ sudo ./perf annotate --stdio -f path_put
   Percent |      Source code & Disassembly of kcore for cpu/mem-stores/P (2 samples, percent: local period)
  ----------------------------------------------------------------------------------------------------------
           : 0                0xffffffffae600020 <path_put>:
      0.00 :   ffffffffae600020:       endbr64
      0.00 :   ffffffffae600024:       nopl    (%rax, %rax)
     91.22 :   ffffffffae600029:       pushq   %rbx
      0.00 :   ffffffffae60002a:       movq    %rdi, %rbx
      0.00 :   ffffffffae60002d:       movq    8(%rdi), %rdi
      8.78 :   ffffffffae600031:       callq   0xffffffffae614aa0
      0.00 :   ffffffffae600036:       movq    (%rbx), %rdi
      0.00 :   ffffffffae600039:       popq    %rbx
      0.00 :   ffffffffae60003a:       jmp     0xffffffffae620670
      0.00 :   ffffffffae60003f:       nop

Therefore, it didn't show up when --group option is used since the
leader ("mem-loads") event has no samples.  But now it checks both
events.

Before:
  $ sudo ./perf annotate --stdio -f --group path_put
  (no output)

After:
  $ sudo ./perf annotate --stdio -f --group path_put
   Percent                 |      Source code & Disassembly of kcore for cpu/mem-loads,ldlat=30/P, cpu/mem-stores/P, dummy:u (0 samples, percent: local period)
  -------------------------------------------------------------------------------------------------------------------------------------------------------------
                           : 0                0xffffffffae600020 <path_put>:
      0.00    0.00    0.00 :   ffffffffae600020:       endbr64
      0.00    0.00    0.00 :   ffffffffae600024:       nopl    (%rax, %rax)
      0.00   91.22    0.00 :   ffffffffae600029:       pushq   %rbx
      0.00    0.00    0.00 :   ffffffffae60002a:       movq    %rdi, %rbx
      0.00    0.00    0.00 :   ffffffffae60002d:       movq    8(%rdi), %rdi
      0.00    8.78    0.00 :   ffffffffae600031:       callq   0xffffffffae614aa0
      0.00    0.00    0.00 :   ffffffffae600036:       movq    (%rbx), %rdi
      0.00    0.00    0.00 :   ffffffffae600039:       popq    %rbx
      0.00    0.00    0.00 :   ffffffffae60003a:       jmp     0xffffffffae620670
      0.00    0.00    0.00 :   ffffffffae60003f:       nop

Committer testing:

Before:

  root@number:~# perf annotate --group --stdio2 clear_page_erms
  root@number:~#

After:

  root@number:~# perf annotate --group --stdio2 clear_page_erms
  Samples: 125  of events 'cpu_atom/mem-loads,ldlat=30/P, cpu_atom/mem-stores/P, dummy:u', 4000 Hz, Event count (approx.): 13198416, [percent: local period]
  clear_page_erms() /proc/kcore
  Percent                      0xffffffff990c6cc0 <clear_page_erms>:
                                 endbr64
                                 movl    $0x1000,%ecx
                                 xorl    %eax,%eax
     0.00  100.00    0.00        rep     stosb %al, (%rdi)
                               ← retq
                                 int3
                                 int3
                                 int3
                                 int3
                                 nop
                                 nop
  root@number:~#

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20240807061555.1642669-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf tools: Create source symlink in perf object dir
Andi Kleen [Wed, 7 Aug 2024 23:18:20 +0000 (16:18 -0700)]
perf tools: Create source symlink in perf object dir

Create a source symlink to the original source in the objdir.

This is similar to what the main kernel build script does.

Committer testing:

  ⬢[acme@toolbox perf-tools-next]$ make O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin
  <SNIP>
  ⬢[acme@toolbox perf-tools-next]$ ls -la /tmp/build/perf-tools-next/source
  lrwxrwxrwx. 1 acme acme 41 Aug  9 16:26 /tmp/build/perf-tools-next/source -> /home/acme/git/perf-tools-next/tools/perf
  ⬢[acme@toolbox perf-tools-next]$

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240807231823.898979-1-ak@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf debuginfo: Fix the build with !HAVE_DWARF_SUPPORT
Arnaldo Carvalho de Melo [Fri, 9 Aug 2024 14:32:45 +0000 (11:32 -0300)]
perf debuginfo: Fix the build with !HAVE_DWARF_SUPPORT

In that case we have a set of placeholder functions, one of them uses a
'Dwarf_Addr' type that is not present as it is defined in the missing
DWARF libraries, so provide a placeholder typedef for that as well.

The build error before this patch:

  In file included from util/annotate.c:28:
  util/debuginfo.h:44:46: error: unknown type name ‘Dwarf_Addr’
     44 |                                              Dwarf_Addr *offs __maybe_unused,
        |                                              ^~~~~~~~~~
  make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: util/annotate.o] Error 1
  make[6]: *** Waiting for unfinished jobs....

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/lkml/CAM9d7ciushSwEfj7yW4rtDEJBTcCB991V4cswwFEL+cv6QF2pg@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf script python: Add the 'ins_lat' field to event handler
Zixian Cai [Fri, 9 Aug 2024 08:01:36 +0000 (08:01 +0000)]
perf script python: Add the 'ins_lat' field to event handler

For example, when using the Alder Lake PMU memory load event, the
instruction latency is stored in 'ins_lat', while the cache latency
is stored in 'weight'.

This patch reports the 'ins_lat' field for Python scripting.

Committer testing:

On a Rocket Lake Refresh Intel machine (14th gen):

  root@number:~# grep -m1 'model name' /proc/cpuinfo
  model name : Intel(R) Core(TM) i7-14700K
  root@number:~# perf mem record -a sleep 5
  Memory events are enabled on a subset of CPUs: 16-27
  [ perf record: Woken up 85 times to write data ]
  [ perf record: Captured and wrote 41.236 MB perf.data (191390 samples) ]
  root@number:~# perf evlist -v
  cpu_atom/mem-loads,ldlat=30/P: type: 10 (cpu_atom), size: 136, config: 0x5d0 (mem-loads), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, inherit: 1, freq: 1, precise_ip: 3, sample_id_all: 1, { bp_addr, config1 }: 0x1f
  cpu_atom/mem-stores/P: type: 10 (cpu_atom), size: 136, config: 0x6d0 (mem-stores), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, inherit: 1, freq: 1, precise_ip: 3, sample_id_all: 1
  dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  root@number:~#

Now generate a python script to then dump the dictionary that now needs
to have that 'ins_lat' field:

  root@number:~# perf script --gen python
  generated Python script: perf-script.py
  root@number:~# vim perf-script.py
  root@number:~# perf script -s perf-script.py | head -40
  in trace_begin
  in trace_end
  root@number:~# vim perf-script.py

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Zixian Cai <fzczx123@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240809080137.3590148-1-fzczx123@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf test shell lbr: Support hybrid x86 systems too
Arnaldo Carvalho de Melo [Thu, 8 Aug 2024 14:26:13 +0000 (11:26 -0300)]
perf test shell lbr: Support hybrid x86 systems too

Running on a:

  root@x1:~# grep 'model name' -m1 /proc/cpuinfo
  model name : 13th Gen Intel(R) Core(TM) i7-1365U
  root@x1:~#

It skips all the tests with:

  root@x1:~# perf test -vvvv LBR
   97: perf record LBR tests:
  --- start ---
  test child forked, pid 2033388
  Skip: only x86 CPUs support LBR
  ---- end(-2) ----
   97: perf record LBR tests                                           : Skip
  root@x1:~#

Because the test checks for the /sys/devices/cpu/caps/branches file,
that isn't present as we have instead:

  root@x1:~# ls -la /sys/devices/cpu*/caps/branches
  -r--r--r--. 1 root root 4096 Aug  8 11:22 /sys/devices/cpu_atom/caps/branches
  -r--r--r--. 1 root root 4096 Aug  8 11:21 /sys/devices/cpu_core/caps/branches
  root@x1:~#

If we check as well for one of those,
/sys/devices/cpu_core/caps/branches, then we don't skip the tests and
all are run on these x86 Intel Hybrid systems as well, passing all of
them:

  root@x1:~# perf test -vvvv LBR
   97: perf record LBR tests:
  --- start ---
  test child forked, pid 2034956
  LBR callgraph
  [ perf record: Woken up 5 times to write data ]
  [ perf record: Captured and wrote 1.812 MB /tmp/__perf_test.perf.data.B2HvQ (8114 samples) ]
  LBR callgraph [Success]
  LBR any branch test
  [ perf record: Woken up 25 times to write data ]
  [ perf record: Captured and wrote 6.382 MB /tmp/__perf_test.perf.data.B2HvQ (8071 samples) ]
  LBR any branch test: 8071 samples
  LBR any branch test [Success]
  LBR any call test
  [ perf record: Woken up 23 times to write data ]
  [ perf record: Captured and wrote 6.208 MB /tmp/__perf_test.perf.data.B2HvQ (8092 samples) ]
  LBR any call test: 8092 samples
  LBR any call test [Success]
  LBR any ret test
  [ perf record: Woken up 24 times to write data ]
  [ perf record: Captured and wrote 6.396 MB /tmp/__perf_test.perf.data.B2HvQ (8093 samples) ]
  LBR any ret test: 8093 samples
  LBR any ret test [Success]
  LBR any indirect call test
  [ perf record: Woken up 25 times to write data ]
  [ perf record: Captured and wrote 6.344 MB /tmp/__perf_test.perf.data.B2HvQ (8067 samples) ]
  LBR any indirect call test: 8067 samples
  LBR any indirect call test [Success]
  LBR any indirect jump test
  [ perf record: Woken up 12 times to write data ]
  [ perf record: Captured and wrote 3.073 MB /tmp/__perf_test.perf.data.B2HvQ (8061 samples) ]
  LBR any indirect jump test: 8061 samples
  LBR any indirect jump test [Success]
  LBR direct calls test
  [ perf record: Woken up 25 times to write data ]
  [ perf record: Captured and wrote 6.380 MB /tmp/__perf_test.perf.data.B2HvQ (8076 samples) ]
  LBR direct calls test: 8076 samples
  LBR direct calls test [Success]
  LBR any indirect user call test
  [ perf record: Woken up 5 times to write data ]
  [ perf record: Captured and wrote 1.597 MB /tmp/__perf_test.perf.data.B2HvQ (8079 samples) ]
  LBR any indirect user call test: 8079 samples
  LBR any indirect user call test [Success]
  LBR system wide any branch test
  [ perf record: Woken up 26 times to write data ]
  [ perf record: Captured and wrote 9.088 MB /tmp/__perf_test.perf.data.B2HvQ (9209 samples) ]
  LBR system wide any branch test: 9209 samples
  LBR system wide any branch test [Success]
  LBR system wide any call test
  [ perf record: Woken up 25 times to write data ]
  [ perf record: Captured and wrote 8.945 MB /tmp/__perf_test.perf.data.B2HvQ (9333 samples) ]
  LBR system wide any call test: 9333 samples
  LBR system wide any call test [Success]
  LBR parallel any branch test
  LBR parallel any call test
  LBR parallel any ret test
  LBR parallel any indirect call test
  LBR parallel any indirect jump test
  LBR parallel direct calls test
  LBR parallel system wide any branch test
  LBR parallel any indirect user call test
  LBR parallel system wide any call test
  [ perf record: Woken up 9 times to write data ]
  [ perf record: Woken up 51 times to write data ]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Woken up 5 times to write data ]
  [ perf record: Woken up 559 times to write data ]
  [ perf record: Woken up 14 times to write data ]
  [ perf record: Woken up 17 times to write data ]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Woken up 11 times to write data ]
  [ perf record: Captured and wrote 0.150 MB /tmp/__perf_test.perf.data.lANpR (1909 samples) ]
  [ perf record: Captured and wrote 2.371 MB /tmp/__perf_test.perf.data.Olum8 (3033 samples) ]
  [ perf record: Captured and wrote 1.230 MB /tmp/__perf_test.perf.data.njfJ8 (1742 samples) ]
  [ perf record: Captured and wrote 5.554 MB /tmp/__perf_test.perf.data.4ZTrj (29662 samples) ]
  [ perf record: Captured and wrote 19.906 MB /tmp/__perf_test.perf.data.dlGQt (29576 samples) ]
  [ perf record: Captured and wrote 0.289 MB /tmp/__perf_test.perf.data.CAT7y (4311 samples) ]
  [ perf record: Captured and wrote 3.129 MB /tmp/__perf_test.perf.data.diuKG (3971 samples) ]
  LBR parallel any indirect user call test: 1909 samples
  [ perf record: Captured and wrote 4.858 MB /tmp/__perf_test.perf.data.sVjtN (6130 samples) ]
  LBR parallel any indirect user call test [Success]
  [ perf record: Captured and wrote 3.669 MB /tmp/__perf_test.perf.data.AJtNI (4827 samples) ]
  LBR parallel any indirect jump test: 4311 samples
  LBR parallel any indirect jump test [Success]
  LBR parallel direct calls test: 3033 samples
  LBR parallel direct calls test [Success]
  LBR parallel any indirect call test: 1742 samples
  LBR parallel any indirect call test [Success]
  LBR parallel any call test: 4827 samples
  LBR parallel any call test [Success]
  LBR parallel any branch test: 6130 samples
  LBR parallel any branch test [Success]
  LBR parallel system wide any branch test: 29662 samples
  LBR parallel any ret test: 3971 samples
  LBR parallel any ret test [Success]
  LBR parallel system wide any branch test [Success]
  LBR parallel system wide any call test: 29576 samples
  LBR parallel system wide any call test [Success]
  ---- end(0) ----
   97: perf record LBR tests                                           : Ok
  root@x1:~#

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZrTXftup0H46R8WK@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf test: Add set of perf record LBR tests
Ian Rogers [Thu, 8 Aug 2024 05:46:44 +0000 (22:46 -0700)]
perf test: Add set of perf record LBR tests

Adds coverage for LBR operations and LBR callgraph.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240808054644.1286065-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf callchain: Fix stitch LBR memory leaks
Ian Rogers [Thu, 8 Aug 2024 05:46:43 +0000 (22:46 -0700)]
perf callchain: Fix stitch LBR memory leaks

The 'struct callchain_cursor_node' has a 'struct map_symbol' whose maps
and map members are reference counted. Ensure these values use a _get
routine to increment the reference counts and use map_symbol__exit() to
release the reference counts.

Do similar for 'struct thread's prev_lbr_cursor, but save the size of
the prev_lbr_cursor array so that it may be iterated.

Ensure that when stitch_nodes are placed on the free list the
map_symbols are exited.

Fix resolve_lbr_callchain_sample() by replacing list_replace_init() to
list_splice_init(), so the whole list is moved and nodes aren't leaked.

A reproduction of the memory leaks is possible with a leak sanitizer
build in the perf report command of:

  ```
  $ perf record -e cycles --call-graph lbr perf test -w thloop
  $ perf report --stitch-lbr
  ```

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Fixes: ff165628d72644e3 ("perf callchain: Stitch LBR call stack")
Signed-off-by: Ian Rogers <irogers@google.com>
[ Basic tests after applying the patch, repeating the example above ]
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anne Macedo <retpolanne@posteo.net>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240808054644.1286065-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf test pmu: Set uninitialized PMU alias to null
Veronika Molnarova [Thu, 8 Aug 2024 10:37:49 +0000 (12:37 +0200)]
perf test pmu: Set uninitialized PMU alias to null

Commit 3e0bf9fde2984469 ("perf pmu: Restore full PMU name wildcard
support") adds a test case "PMU cmdline match" that covers PMU name
wildcard support provided by function perf_pmu__match().

The test works with a wide range of supported combinations of PMU name
matching but omits the case that if the perf_pmu__match() cannot match
the PMU name to the wildcard, it tries to match its alias. However, this
variable is not set up, causing the test case to fail when run with
subprocesses or to segfault if run as a single process.

  ./perf test -vv 9
    9: Sysfs PMU tests                                :
    9.1: Parsing with PMU format directory            : Ok
    9.2: Parsing with PMU event                       : Ok
    9.3: PMU event names                              : Ok
    9.4: PMU name combining                           : Ok
    9.5: PMU name comparison                          : Ok
    9.6: PMU cmdline match                            : FAILED!

  ./perf test -F 9
    9.1: Parsing with PMU format directory            : Ok
    9.2: Parsing with PMU event                       : Ok
    9.3: PMU event names                              : Ok
    9.4: PMU name combining                           : Ok
    9.5: PMU name comparison                          : Ok
  Segmentation fault (core dumped)

Initialize the PMU alias to null for all tests of perf_pmu__match()
as this functionality is not being tested and the alias matching works
exactly the same as the matching of the PMU name.

./perf test -F 9
  9.1: Parsing with PMU format directory                             : Ok
  9.2: Parsing with PMU event                                        : Ok
  9.3: PMU event names                                               : Ok
  9.4: PMU name combining                                            : Ok
  9.5: PMU name comparison                                           : Ok
  9.6: PMU cmdline match                                             : Ok

Fixes: 3e0bf9fde2984469 ("perf pmu: Restore full PMU name wildcard support")
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Radostin Stoyanov <rstoyano@redhat.com>
Link: https://lore.kernel.org/r/20240808103749.9356-1-vmolnaro@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf tests ftrace: Add pattern check for time, count
Arnaldo Carvalho de Melo [Thu, 8 Aug 2024 12:59:40 +0000 (09:59 -0300)]
perf tests ftrace: Add pattern check for time, count

In 'perf ftrace profile sleep 0.1' we know that we'll have an specific
kernel function that will take a bit more than 0.1 seconds and will take
place just one time, so we can add a check for that so that we validate
more than just the presence of some functions in the profile.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/lkml/ZrTBo7KACZeuCyLj@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf test: Add a new shell test for perf ftrace
Namhyung Kim [Thu, 8 Aug 2024 04:49:54 +0000 (21:49 -0700)]
perf test: Add a new shell test for perf ftrace

  $ sudo ./perf test ftrace -vv
   86: perf ftrace tests:
  --- start ---
  test child forked, pid 1772223
  perf ftrace list test
  syscalls for sleep:
  __x64_sys_nanosleep
  __ia32_sys_nanosleep
  __x64_sys_clock_nanosleep
  __ia32_sys_clock_nanosleep
  perf ftrace list test  [Success]
  perf ftrace trace test
  # tracer: function_graph
  #
  # CPU  DURATION                  FUNCTION CALLS
  # |     |   |                     |   |   |   |
   0)               |  __x64_sys_clock_nanosleep() {
   0)               |    common_nsleep() {
   0)               |      hrtimer_nanosleep() {
   0)               |        do_nanosleep() {
  perf ftrace trace test  [Success]
  perf ftrace latency test
  target function: __x64_sys_clock_nanosleep
  #   DURATION     |      COUNT | GRAPH                                          |
      32 - 64   ms |          1 | ############################################## |
  perf ftrace latency test  [Success]
  perf ftrace profile test
  # Total (us)   Avg (us)   Max (us)      Count   Function
    100136.400 100136.400 100136.400          1   __x64_sys_clock_nanosleep
    100135.200 100135.200 100135.200          1   common_nsleep
    100134.700 100134.700 100134.700          1   hrtimer_nanosleep
    100133.700 100133.700 100133.700          1   do_nanosleep
    100130.600 100130.600 100130.600          1   schedule
       166.868     55.623     80.299          3   scheduler_tick
         5.926      5.926      5.926          1   native_smp_send_reschedule
       301.941    301.941    301.941          1   __x64_sys_execve
       295.786    295.786    295.786          1   do_execveat_common.isra.0
        71.397     35.699     46.403          2   bprm_execve
         2.519      1.260      1.547          2   sched_mm_cid_before_execve
         1.098      0.549      0.686          2   sched_mm_cid_after_execve
  perf ftrace profile test  [Success]
  ---- end(0) ----
   86: perf ftrace tests                                               : Ok

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20240808044954.1775333-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate-data: Show typedef names properly
Namhyung Kim [Wed, 7 Aug 2024 22:31:29 +0000 (15:31 -0700)]
perf annotate-data: Show typedef names properly

The die_get_typename() would resolve typedef and get to the original
type.  But sometimes the original type is a struct without name and it
makes the output confusing and hard to read.

This is a diff of perf report -s type before and after the change.
New types such as atomic{,64}_t and sigset_t appeared and the portion
of unnamed struct was reduced.  Also u32, u64 and size_t were splitted
from the base types.

  --- b   2024-08-01 17:02:34.307809952 -0700
  +++ a   2024-08-07 14:17:05.245853999 -0700
  -     2.40%  long unsigned int
  +     2.26%  long unsigned int
  -     1.56%  unsigned int
  +     1.27%  unsigned int
  -     0.98%  struct
  -     0.79%  long long unsigned int
  +     0.58%  long long unsigned int
  +     0.36%  struct
  +     0.27%  atomic64_t
  +     0.22%  u32
  +     0.21%  u64
  +     0.19%  atomic_t
  +     0.13%  size_t
  -     0.08%  struct seqcount_spinlock
  +     0.08%  seqcount_spinlock_t
  +     0.08%  sigset_t
  +     0.08%  __poll_t

Let's use the typedef name directly and the resolved to get the size of
the type.

Committer testing:

  root@x1:~# diff -u before after | head -30
  --- before 2024-08-08 09:35:13.917325041 -0300
  +++ after 2024-08-08 09:37:35.312257905 -0300
  @@ -10,25 +10,27 @@
   # ........  .........
   #
       79.40%  (unknown)
  -     2.28%  union
        1.96%  (stack operation)
  -     1.24%  struct
  +     1.87%  pthread_mutex_t
        0.99%  u32[]
  -     0.92%  unsigned int
        0.77%  struct task_struct
  +     0.75%  U32
        0.75%  struct pcpu_hot
        0.63%  struct qspinlock
  +     0.61%  atomic_t
        0.59%  struct list_head
  -     0.58%  int
        0.53%  struct cfs_rq
        0.51%  BYTE*
  -     0.48%  unsigned char
  +     0.48%  BYTE
        0.48%  long unsigned int
        0.46%  struct rq
        0.41%  struct worker
        0.41%  struct memcg_vmstats_percpu
  +     0.41%  pthread_cond_t
        0.37%  _Bool
  +     0.36%  int
  root@x1:~#

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240807223129.1738004-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate: Cache debuginfo for data type profiling
Namhyung Kim [Mon, 5 Aug 2024 23:46:48 +0000 (16:46 -0700)]
perf annotate: Cache debuginfo for data type profiling

In find_data_type(), it creates and deletes a debug info whenver it
tries to find data type for a sample.  This is inefficient and it most
likely accesses the same binary again and again.

Let's add a single entry cache the debug info structure for the last DSO.
Depending on sample data, it usually gives me 2~3x (and sometimes more)
speed ups.

Note that this will introduce a little difference in the output due to
the order of checking stack operations.  It used to check the stack ops
before checking the availability of debug info but I moved it after the
symbol check.  So it'll report stack operations in DSOs without debug
info as unknown.  But I think it's ok and better to have the checking
near the caching logic.

Committer testing:

  root@x1:~# perf mem record -a sleep 5s
  root@x1:~# perf evlist
  cpu_atom/mem-loads,ldlat=30/P
  cpu_atom/mem-stores/P
  dummy:u
  root@x1:~# diff -u before after
  --- before 2024-08-08 09:33:53.880780784 -0300
  +++ after 2024-08-08 09:35:13.917325041 -0300
  @@ -81,8 +81,8 @@
   # Overhead  Data Type
   # ........  .........
   #
  -    55.43%  (unknown)
  -    11.61%  (stack operation)
  +    55.56%  (unknown)
  +    11.48%  (stack operation)
        4.93%  struct pcpu_hot
        3.26%  unsigned int
        2.48%  struct

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240805234648.1453689-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf hist: Fix reference counting of branch_info
Ian Rogers [Wed, 7 Aug 2024 06:51:36 +0000 (23:51 -0700)]
perf hist: Fix reference counting of branch_info

iter_finish_branch_entry() doesn't put the branch_info from/to map
elements creating memory leaks. This can be seen with:

```
$ perf record -e cycles -b perf test -w noploop
$ perf report -D
...
Direct leak of 984344 byte(s) in 123043 object(s) allocated from:
    #0 0x7fb2654f3bd7 in malloc libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x564d3400d10b in map__get util/map.h:186
    #2 0x564d3400d10b in ip__resolve_ams util/machine.c:1981
    #3 0x564d34014d81 in sample__resolve_bstack util/machine.c:2151
    #4 0x564d34094790 in iter_prepare_branch_entry util/hist.c:898
    #5 0x564d34098fa4 in hist_entry_iter__add util/hist.c:1238
    #6 0x564d33d1f0c7 in process_sample_event tools/perf/builtin-report.c:334
    #7 0x564d34031eb7 in perf_session__deliver_event util/session.c:1655
    #8 0x564d3403ba52 in do_flush util/ordered-events.c:245
    #9 0x564d3403ba52 in __ordered_events__flush util/ordered-events.c:324
    #10 0x564d3402d32e in perf_session__process_user_event util/session.c:1708
    #11 0x564d34032480 in perf_session__process_event util/session.c:1877
    #12 0x564d340336ad in reader__read_event util/session.c:2399
    #13 0x564d34033fdc in reader__process_events util/session.c:2448
    #14 0x564d34033fdc in __perf_session__process_events util/session.c:2495
    #15 0x564d34033fdc in perf_session__process_events util/session.c:2661
    #16 0x564d33d27113 in __cmd_report tools/perf/builtin-report.c:1065
    #17 0x564d33d27113 in cmd_report tools/perf/builtin-report.c:1805
    #18 0x564d33e0ccb7 in run_builtin tools/perf/perf.c:350
    #19 0x564d33e0d45e in handle_internal_command tools/perf/perf.c:403
    #20 0x564d33cdd827 in run_argv tools/perf/perf.c:447
    #21 0x564d33cdd827 in main tools/perf/perf.c:561
...
```

Clearing up the map_symbols properly creates maps reference count
issues so resolve those. Resolving this issue doesn't improve peak
heap consumption for the test above.

Committer testing:

  $ sudo dnf install libasan
  $ make -k CORESIGHT=1 EXTRA_CFLAGS="-fsanitize=address" CC=clang O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Yanteng Si <siyanteng@loongson.cn>
Link: https://lore.kernel.org/r/20240807065136.1039977-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoMerge remote-tracking branch 'torvalds/master' into perf-tools-next
Arnaldo Carvalho de Melo [Tue, 6 Aug 2024 17:01:06 +0000 (14:01 -0300)]
Merge remote-tracking branch 'torvalds/master' into perf-tools-next

To pick a patch that albeit being for tools/perf/ directory went thru a
different tree and ended up breaking some recent tests introduced in the
perf-tools-next tree to validate duplicate events in the JSON
performance event files.

Link: https://lore.kernel.org/lkml/ZrIqDMg7cBVhstYU@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoMerge tag 'platform-drivers-x86-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 6 Aug 2024 14:52:10 +0000 (07:52 -0700)]
Merge tag 'platform-drivers-x86-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Ilpo Järvinen:
 "Fixes:

   - Fix ACPI notifier racing with itself (intel-vbtn)

   - Initialize local variable to cover a timeout corner case
     (intel/ifs)

   - WMI docs spelling

  New device IDs:

   - amd/{pmc,pmf}: AMD 1Ah model 60h series.

   - amd/pmf: SPS quirk support for ASUS ROG Ally X"

* tag 'platform-drivers-x86-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86/intel/ifs: Initialize union ifs_status to zero
  platform/x86: msi-wmi-platform: Fix spelling mistakes
  platform/x86/amd/pmf: Add new ACPI ID AMDI0107
  platform/x86/amd/pmc: Send OS_HINT command for new AMD platform
  platform/x86/amd: pmf: Add quirk for ROG Ally X
  platform/x86: intel-vbtn: Protect ACPI notify handler against recursion

11 months agoperf jevents.py: Ensure event names aren't duplicated
Ian Rogers [Mon, 5 Aug 2024 19:44:24 +0000 (12:44 -0700)]
perf jevents.py: Ensure event names aren't duplicated

Duplicate event names break invariants in 'perf list'. Assert that an
event name isn't duplicated so that broken JSON won't build.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Charles Ci-Jyun Wu <dminus@andestech.com>
Cc: Eric Lin <eric.lin@sifive.com>
Cc: Greentime Hu <greentime.hu@sifive.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Inochi Amaoto <inochiama@outlook.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Ji Sheng Teoh <jisheng.teoh@starfivetech.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Locus Wei-Han Chen <locus84@andestech.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samuel Holland <samuel.holland@sifive.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Vincent Chen <vincent.chen@sifive.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20240805194424.597244-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf pmu-events: Remove duplicated ampereone event
Ian Rogers [Mon, 5 Aug 2024 19:44:22 +0000 (12:44 -0700)]
perf pmu-events: Remove duplicated ampereone event

OP_SPEC is repeated twice in the file which will break invariants in
'perf list' as discussed in this thread:

  https://lore.kernel.org/linux-perf-users/20240719081651.24853-1-eric.lin@sifive.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Charles Ci-Jyun Wu <dminus@andestech.com>
Cc: Eric Lin <eric.lin@sifive.com>
Cc: Greentime Hu <greentime.hu@sifive.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Inochi Amaoto <inochiama@outlook.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Ji Sheng Teoh <jisheng.teoh@starfivetech.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Locus Wei-Han Chen <locus84@andestech.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samuel Holland <samuel.holland@sifive.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Vincent Chen <vincent.chen@sifive.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20240805194424.597244-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf pmu-events: Change dependencies for empty-pmu-events.c test
Ian Rogers [Mon, 5 Aug 2024 19:44:21 +0000 (12:44 -0700)]
perf pmu-events: Change dependencies for empty-pmu-events.c test

Switch from $? (all the prerequisites that are newer than the target)
to $^ (all the prerequisites) as touching jevents.py will mean that
empty-pmu-events.c won't be passed to the diff command breaking the
build.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Charles Ci-Jyun Wu <dminus@andestech.com>
Cc: Eric Lin <eric.lin@sifive.com>
Cc: Greentime Hu <greentime.hu@sifive.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Inochi Amaoto <inochiama@outlook.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Ji Sheng Teoh <jisheng.teoh@starfivetech.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Locus Wei-Han Chen <locus84@andestech.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samuel Holland <samuel.holland@sifive.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Vincent Chen <vincent.chen@sifive.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20240805194424.597244-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf test: Add build test for JEVENTS_ARCH=all
Ian Rogers [Mon, 5 Aug 2024 19:44:20 +0000 (12:44 -0700)]
perf test: Add build test for JEVENTS_ARCH=all

Building with JEVENTS_ARCH=all builds all CPU types and allows things
like assertions to check the validity of the input JSON.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Charles Ci-Jyun Wu <dminus@andestech.com>
Cc: Eric Lin <eric.lin@sifive.com>
Cc: Greentime Hu <greentime.hu@sifive.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Inochi Amaoto <inochiama@outlook.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Ji Sheng Teoh <jisheng.teoh@starfivetech.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Locus Wei-Han Chen <locus84@andestech.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samuel Holland <samuel.holland@sifive.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Vincent Chen <vincent.chen@sifive.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20240805194424.597244-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoMerge tag 'linux_kselftest-fixes-6.11-rc3' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Mon, 5 Aug 2024 21:31:12 +0000 (14:31 -0700)]
Merge tag 'linux_kselftest-fixes-6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fix from Shuah Khan:
 "A single fix to the conditional in ksft.py script which incorrectly
  flags a test suite failed when there are skipped tests in the mix.

  The logic is fixed to take skipped tests into account and report the
  test as passed"

* tag 'linux_kselftest-fixes-6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: ksft: Fix finished() helper exit code on skipped tests

11 months agoperf annotate: Add --skip-empty option
Namhyung Kim [Sat, 3 Aug 2024 21:13:32 +0000 (14:13 -0700)]
perf annotate: Add --skip-empty option

Like in 'perf report', we want to hide empty events in the 'perf annotate'
output.  This is consistent when the option is set in perf report.

For example, the following command would use 3 events including dummy.

  $ perf mem record -a -- perf test -w noploop

  $ perf evlist
  cpu/mem-loads,ldlat=30/P
  cpu/mem-stores/P
  dummy:u

Just using perf annotate with --group will show the all 3 events.

  $ perf annotate --group --stdio | head
   Percent                 | Source code & Disassembly of ...
  --------------------------------------------------------------
                           : 0     0xe060 <_dl_relocate_object>:
      0.00    0.00    0.00 :    e060:       pushq   %rbp
      0.00    0.00    0.00 :    e061:       movq    %rsp, %rbp
      0.00    0.00    0.00 :    e064:       pushq   %r15
      0.00    0.00    0.00 :    e066:       movq    %rdi, %r15
      0.00    0.00    0.00 :    e069:       pushq   %r14
      0.00    0.00    0.00 :    e06b:       pushq   %r13
      0.00    0.00    0.00 :    e06d:       movl    %edx, %r13d

Now with --skip-empty, it'll hide the last dummy event.

  $ perf annotate --group --stdio --skip-empty | head
   Percent         | Source code & Disassembly of ...
  ------------------------------------------------------
                   : 0     0xe060 <_dl_relocate_object>:
      0.00    0.00 :    e060:       pushq   %rbp
      0.00    0.00 :    e061:       movq    %rsp, %rbp
      0.00    0.00 :    e064:       pushq   %r15
      0.00    0.00 :    e066:       movq    %rdi, %r15
      0.00    0.00 :    e069:       pushq   %r14
      0.00    0.00 :    e06b:       pushq   %r13
      0.00    0.00 :    e06d:       movl    %edx, %r13d

Committer testing:

  root@x1:~# perf evlist
  cpu_atom/mem-loads,ldlat=30/P
  cpu_atom/mem-stores/P
  dummy:u
  root@x1:~#

Before:

  root@x1:~# perf annotate --group --stdio2 do_lookup_x | head -25
  Samples: 20  of events 'cpu_atom/mem-loads,ldlat=30/P, cpu_atom/mem-stores/P, dummy:u', 4000 Hz, Event count (approx.): 769079, [percent: local period]
  do_lookup_x() /usr/lib64/ld-linux-x86-64.so.2
  Percent                       0x9900 <do_lookup_x>:
                                  pushq      %rbp
                                  movq       %rsp,%rbp
                                  pushq      %r15
                                  pushq      %r14
                                  pushq      %r13
                                  pushq      %r12
                                  pushq      %rbx
                                  subq       $0x88,%rsp
                                  movq       %rdi,-0x50(%rbp)
                                  movl       8(%r9),%edi
                                  movq       0x10(%rbp),%r12
                                  movq       0x28(%rbp),%r10
                                  movq       %rdx,-0x70(%rbp)
                                  movq       %rcx,-0x58(%rbp)
                                  movq       %rdi,%r11
     0.00    5.73    0.00         movq       %r8,-0x68(%rbp)
                                  movq       (%r9),%r8
                                  movl       %esi,%eax
     8.30    0.00    0.00         movl       0x30(%rbp),%r9d
                                  movl       %esi,%r15d
                                  shrl       $6, %eax
                                  movq       %r8,%r13
  root@x1:~#

After:

  root@x1:~# perf annotate --group --skip-empty --stdio2 do_lookup_x | head -25
  Samples: 20  of events 'cpu_atom/mem-loads,ldlat=30/P, cpu_atom/mem-stores/P', 4000 Hz, Event count (approx.): 769079, [percent: local period]
  do_lookup_x() /usr/lib64/ld-linux-x86-64.so.2
  Percent               0x9900 <do_lookup_x>:
                          pushq      %rbp
                          movq       %rsp,%rbp
                          pushq      %r15
                          pushq      %r14
                          pushq      %r13
                          pushq      %r12
                          pushq      %rbx
                          subq       $0x88,%rsp
                          movq       %rdi,-0x50(%rbp)
                          movl       8(%r9),%edi
                          movq       0x10(%rbp),%r12
                          movq       0x28(%rbp),%r10
                          movq       %rdx,-0x70(%rbp)
                          movq       %rcx,-0x58(%rbp)
                          movq       %rdi,%r11
     0.00    5.73         movq       %r8,-0x68(%rbp)
                          movq       (%r9),%r8
                          movl       %esi,%eax
     8.30    0.00         movl       0x30(%rbp),%r9d
                          movl       %esi,%r15d
                          shrl       $6, %eax
                          movq       %r8,%r13
  root@x1:~#

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240803211332.1107222-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate: Set al->data_nr using the notes->src->nr_events
Namhyung Kim [Sat, 3 Aug 2024 21:13:31 +0000 (14:13 -0700)]
perf annotate: Set al->data_nr using the notes->src->nr_events

This is a preparation to support skipping empty events.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240803211332.1107222-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate: Use annotation__pcnt_width() consistently
Namhyung Kim [Sat, 3 Aug 2024 21:13:30 +0000 (14:13 -0700)]
perf annotate: Use annotation__pcnt_width() consistently

The annotation__pcnt_width() calculates the screen width for the
overhead (percent) area considering event groups properly.  Use this
function consistently so that we can make sure it has similar output
in different modes.  But there's a difference in stdio and tui output:
stdio uses 8 and tui uses 7 for a percent.

Let's use 8 and adjust the print width in __annotation_line__write()
properly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240803211332.1107222-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate: Set notes->src->nr_events early
Namhyung Kim [Sat, 3 Aug 2024 21:13:29 +0000 (14:13 -0700)]
perf annotate: Set notes->src->nr_events early

We want to use it in different places so make sure it sets properly
in symbol__annotate() before creating the disasm lines.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240803211332.1107222-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf annotate: Use al->data_nr if possible
Namhyung Kim [Sat, 3 Aug 2024 21:13:28 +0000 (14:13 -0700)]
perf annotate: Use al->data_nr if possible

The data_nr keeps the number of entries in al->data[] so it should use
it when it iterates the array.  The notes->src->nr_events should have
the same number but it'd be natural to use al->data_nr.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240803211332.1107222-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoMerge tag 'slab-fixes-for-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 5 Aug 2024 16:23:00 +0000 (09:23 -0700)]
Merge tag 'slab-fixes-for-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fix from Vlastimil Babka:
 "Since v6.8 we've had a subtle breakage in SLUB with KFENCE enabled,
  that can cause a crash. It hasn't been found earlier due to quite
  specific conditions necessary (OOM during kmem_cache_alloc_bulk())"

* tag 'slab-fixes-for-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm, slub: do not call do_slab_free for kfence object

11 months agotools build: Correct bpf fixdep dependencies
Brian Norris [Mon, 15 Jul 2024 20:32:44 +0000 (13:32 -0700)]
tools build: Correct bpf fixdep dependencies

The dependencies in tools/lib/bpf/Makefile are incorrect. Before we
recurse to build $(BPF_IN_STATIC), we need to build its 'fixdep'
executable.

I can't use the usual shortcut from Makefile.include:

  <target>: <sources> fixdep

because its 'fixdep' target relies on $(OUTPUT), and $(OUTPUT) differs
in the parent 'make' versus the child 'make' -- so I imitate it via
open-coding.

I tweak a few $(MAKE) invocations while I'm at it, because
1. I'm adding a new recursive make; and
2. these recursive 'make's print spurious lines about files that are "up
   to date" (which isn't normally a feature in Kbuild subtargets) or
   "jobserver not available" (see [1])

I also need to tweak the assignment of the OUTPUT variable, so that
relative path builds work. For example, for 'make tools/lib/bpf', OUTPUT
is unset, and is usually treated as "cwd" -- but recursive make will
change cwd and so OUTPUT has a new meaning. For consistency, I ensure
OUTPUT is always an absolute path.

And $(Q) gets a backup definition in tools/build/Makefile.include,
because Makefile.include is sometimes included without
tools/build/Makefile, so the "quiet command" stuff doesn't actually work
consistently without it.

After this change, top-level builds result in an empty grep result from:

  $ grep 'cannot find fixdep' $(find tools/ -name '*.cmd')

[1] https://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html
If we're not using $(MAKE) directly, then we need to use more '+'.

Signed-off-by: Brian Norris <briannorris@chromium.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240715203325.3832977-4-briannorris@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agotools build: Avoid circular .fixdep-in.o.cmd issues
Brian Norris [Mon, 15 Jul 2024 20:32:43 +0000 (13:32 -0700)]
tools build: Avoid circular .fixdep-in.o.cmd issues

The 'fixdep' tool is used to post-process dependency files for various
reasons, and it runs after every object file generation command. This
even includes 'fixdep' itself.

In Kbuild, this isn't actually a problem, because it uses a single
command to generate fixdep (a compile-and-link command on fixdep.c), and
afterward runs the fixdep command on the accompanying .fixdep.cmd file.

In tools/ builds (which notably is maintained separately from Kbuild),
fixdep is generated in several phases:

 1. fixdep.c -> fixdep-in.o
 2. fixdep-in.o -> fixdep

Thus, fixdep is not available in the post-processing for step 1, and
instead, we generate .cmd files that look like:

  ## from tools/objtool/libsubcmd/.fixdep.o.cmd
  # cannot find fixdep (/path/to/linux/tools/objtool/libsubcmd//fixdep)
  [...]

These invalid .cmd files are benign in some respects, but cause problems
in others (such as the linked reports).

Because the tools/ build system is rather complicated in its own right
(and pointedly different than Kbuild), I choose to simply open-code the
rule for building fixdep, and avoid the recursive-make indirection that
produces the problem in the first place.

Signed-off-by: Brian Norris <briannorris@chromium.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/all/Zk-C5Eg84yt6_nml@google.com/
Link: https://lore.kernel.org/r/20240715203325.3832977-3-briannorris@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agotools build: Correct libsubcmd fixdep dependencies
Brian Norris [Mon, 15 Jul 2024 20:32:42 +0000 (13:32 -0700)]
tools build: Correct libsubcmd fixdep dependencies

All built targets need fixdep to be built first, before handling object
dependencies [1]. We're missing one such dependency before the libsubcmd
target.

This resolves .cmd file generation issues such that the following
sequence produces many fewer results:

  $ git clean -xfd tools/
  $ make tools/objtool
  $ grep "cannot find fixdep" $(find tools/objtool -name '*.cmd')

In particular, only a buggy tools/objtool/libsubcmd/.fixdep.o.cmd
remains, due to circular dependencies of fixdep on itself.

Such incomplete .cmd files don't usually cause a direct problem, since
they're designed to fail "open", but they can cause some subtle problems
that would otherwise be handled by proper fixdep'd dependency files. [2]

[1] This problem is better described in commit abb26210a395 ("perf
tools: Force fixdep compilation at the start of the build"). I don't
apply its solution here, because additional recursive make can be a bit
of overkill.

[2] Example failure case:

  cp -arl linux-src linux-src2
  cd linux-src2
  make O=/path/to/out
  cd ../linux-src
  rm -rf ../linux-src2
  make O=/path/to/out

Previously, we'd see errors like:

  make[6]: *** No rule to make target
  '/path/to/linux-src2/tools/include/linux/compiler.h', needed by
  '/path/to/out/tools/bpf/resolve_btfids/libsubcmd/exec-cmd.o'.  Stop.

Now, the properly-fixdep'd .cmd files will ignore a missing
/path/to/linux-src2/...

Signed-off-by: Brian Norris <briannorris@chromium.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/all/ZGVi9HbI43R5trN8@bhelgaas/
Link: https://lore.kernel.org/all/Zk-C5Eg84yt6_nml@google.com/
Link: https://lore.kernel.org/r/20240715203325.3832977-2-briannorris@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoperf mem: Update documentation for new options
Namhyung Kim [Fri, 2 Aug 2024 18:09:13 +0000 (11:09 -0700)]
perf mem: Update documentation for new options

Add a common options section and move some items to the section.  Also
add description of new options to report options.

Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/lkml/20240802180913.1023886-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoLinux 6.11-rc2
Linus Torvalds [Sun, 4 Aug 2024 20:50:53 +0000 (13:50 -0700)]
Linux 6.11-rc2

11 months agoprofiling: remove profile=sleep support
Tetsuo Handa [Sun, 4 Aug 2024 09:48:10 +0000 (18:48 +0900)]
profiling: remove profile=sleep support

The kernel sleep profile is no longer working due to a recursive locking
bug introduced by commit 42a20f86dc19 ("sched: Add wrapper for get_wchan()
to keep task blocked")

Booting with the 'profile=sleep' kernel command line option added or
executing

  # echo -n sleep > /sys/kernel/profiling

after boot causes the system to lock up.

Lockdep reports

  kthreadd/3 is trying to acquire lock:
  ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: get_wchan+0x32/0x70

  but task is already holding lock:
  ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: try_to_wake_up+0x53/0x370

with the call trace being

   lock_acquire+0xc8/0x2f0
   get_wchan+0x32/0x70
   __update_stats_enqueue_sleeper+0x151/0x430
   enqueue_entity+0x4b0/0x520
   enqueue_task_fair+0x92/0x6b0
   ttwu_do_activate+0x73/0x140
   try_to_wake_up+0x213/0x370
   swake_up_locked+0x20/0x50
   complete+0x2f/0x40
   kthread+0xfb/0x180

However, since nobody noticed this regression for more than two years,
let's remove 'profile=sleep' support based on the assumption that nobody
needs this functionality.

Fixes: 42a20f86dc19 ("sched: Add wrapper for get_wchan() to keep task blocked")
Cc: stable@vger.kernel.org # v5.16+
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 months agoMerge tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 4 Aug 2024 15:57:08 +0000 (08:57 -0700)]
Merge tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:

 - Prevent a deadlock on cpu_hotplug_lock in the aperf/mperf driver.

   A recent change in the ACPI code which consolidated code pathes moved
   the invocation of init_freq_invariance_cppc() to be moved to a CPU
   hotplug handler. The first invocation on AMD CPUs ends up enabling a
   static branch which dead locks because the static branch enable tries
   to acquire cpu_hotplug_lock but that lock is already held write by
   the hotplug machinery.

   Use static_branch_enable_cpuslocked() instead and take the hotplug
   lock read for the Intel code path which is invoked from the
   architecture code outside of the CPU hotplug operations.

 - Fix the number of reserved bits in the sev_config structure bit field
   so that the bitfield does not exceed 64 bit.

 - Add missing Zen5 model numbers

 - Fix the alignment assumptions of pti_clone_pgtable() and
   clone_entry_text() on 32-bit:

   The code assumes PMD aligned code sections, but on 32-bit the kernel
   entry text is not PMD aligned. So depending on the code size and
   location, which is configuration and compiler dependent, entry text
   can cross a PMD boundary. As the start is not PMD aligned adding PMD
   size to the start address is larger than the end address which
   results in partially mapped entry code for user space. That causes
   endless recursion on the first entry from userspace (usually #PF).

   Cure this by aligning the start address in the addition so it ends up
   at the next PMD start address.

   clone_entry_text() enforces PMD mapping, but on 32-bit the tail might
   eventually be PTE mapped, which causes a map fail because the PMD for
   the tail is not a large page mapping. Use PTI_LEVEL_KERNEL_IMAGE for
   the clone() invocation which resolves to PTE on 32-bit and PMD on
   64-bit.

 - Zero the 8-byte case for get_user() on range check failure on 32-bit

   The recend consolidation of the 8-byte get_user() case broke the
   zeroing in the failure case again. Establish it by clearing ECX
   before the range check and not afterwards as that obvioulsy can't be
   reached when the range check fails

* tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/uaccess: Zero the 8-byte get_range case on failure on 32-bit
  x86/mm: Fix pti_clone_entry_text() for i386
  x86/mm: Fix pti_clone_pgtable() alignment assumption
  x86/setup: Parse the builtin command line before merging
  x86/CPU/AMD: Add models 0x60-0x6f to the Zen5 range
  x86/sev: Fix __reserved field in sev_config
  x86/aperfmperf: Fix deadlock on cpu_hotplug_lock

11 months agoMerge tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 4 Aug 2024 15:50:16 +0000 (08:50 -0700)]
Merge tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
 "Two fixes for the timer/clocksource code:

   - The recent fix to make the take over of the broadcast timer more
     reliable retrieves a per CPU pointer in preemptible context.

     This went unnoticed in testing as some compilers hoist the access
     into the non-preemotible section where the pointer is actually
     used, but obviously compilers can rightfully invoke it where the
     code put it.

     Move it into the non-preemptible section right to the actual usage
     side to cure it.

   - The clocksource watchdog is supposed to emit a warning when the
     retry count is greater than one and the number of retries reaches
     the limit.

     The condition is backwards and warns always when the count is
     greater than one. Fixup the condition to prevent spamming dmesg"

* tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: Fix brown-bag boolean thinko in cs_watchdog_read()
  tick/broadcast: Move per CPU pointer access into the atomic section

11 months agoMerge tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 4 Aug 2024 15:46:14 +0000 (08:46 -0700)]
Merge tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Thomas Gleixner:

 - When stime is larger than rtime due to accounting imprecision, then
   utime = rtime - stime becomes negative. As this is unsigned math, the
   result becomes a huge positive number.

   Cure it by resetting stime to rtime in that case, so utime becomes 0.

 - Restore consistent state when sched_cpu_deactivate() fails.

   When offlining a CPU fails in sched_cpu_deactivate() after the SMT
   present counter has been decremented, then the function aborts but
   fails to increment the SMT present counter and leaves it imbalanced.
   Consecutive operations cause it to underflow. Add the missing fixup
   for the error path.

   For SMT accounting the runqueue needs to marked online again in the
   error exit path to restore consistent state.

* tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Fix unbalance set_rq_online/offline() in sched_cpu_deactivate()
  sched/core: Introduce sched_set_rq_on/offline() helper
  sched/smt: Fix unbalance sched_smt_present dec/inc
  sched/smt: Introduce sched_smt_present_inc/dec() helper
  sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime

11 months agoMerge tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 4 Aug 2024 15:42:18 +0000 (08:42 -0700)]
Merge tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 perf fixes from Thomas Gleixner:

 - Move the smp_processor_id() invocation back into the non-preemtible
   region, so that the result is valid to use

 - Add the missing package C2 residency counters for Sierra Forest CPUs
   to make the newly added support actually useful

* tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix smp_processor_id()-in-preemptible warnings
  perf/x86/intel/cstate: Add pkg C2 residency counter for Sierra Forest

11 months agoMerge tag 'irq-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 4 Aug 2024 15:36:57 +0000 (08:36 -0700)]
Merge tag 'irq-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "A couple of fixes for interrupt chip drivers:

   - Make sure to skip the clear register space in the MBIGEN driver
     when calculating the node register index. Otherwise the clear
     register is clobbered and the wrong node registers are accessed.

   - Fix a signed/unsigned confusion in the loongarch CPU driver which
     converts an error code to a huge "valid" interrupt number.

   - Convert the mesion GPIO interrupt controller lock to a raw spinlock
     so it works on RT.

   - Add a missing static to a internal function in the pic32 EVIC
     driver"

* tag 'irq-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/mbigen: Fix mbigen node address layout
  irqchip/meson-gpio: Convert meson_gpio_irq_controller::lock to 'raw_spinlock_t'
  irqchip/irq-pic32-evic: Add missing 'static' to internal function
  irqchip/loongarch-cpu: Fix return value of lpic_gsi_to_irq()

11 months agoMerge tag 'locking-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 4 Aug 2024 15:32:31 +0000 (08:32 -0700)]
Merge tag 'locking-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
 "Two fixes for locking and jump labels:

   - Ensure that the atomic_cmpxchg() conditions are correct and
     evaluating to true on any non-zero value except 1. The missing
     check of the return value leads to inconsisted state of the jump
     label counter.

   - Add a missing type conversion in the paravirt spinlock code which
     makes loongson build again"

* tag 'locking-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  jump_label: Fix the fix, brown paper bags galore
  locking/pvqspinlock: Correct the type of "old" variable in pv_kick_node()

11 months agoarm: dts: arm: versatile-ab: Fix duplicate clock node name
Rob Herring (Arm) [Tue, 30 Jul 2024 21:00:30 +0000 (15:00 -0600)]
arm: dts: arm: versatile-ab: Fix duplicate clock node name

Commit 04f08ef291d4 ("arm/arm64: dts: arm: Use generic clock and
regulator nodenames") renamed nodes and created 2 "clock-24000000" nodes
(at different paths).

The kernel can't handle these duplicate names even though they are at
different paths.  Fix this by renaming one of the nodes to "clock-pclk".

This name is aligned with other Arm boards (those didn't have a known
frequency to use in the node name).

Fixes: 04f08ef291d4 ("arm/arm64: dts: arm: Use generic clock and regulator nodenames")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>