Thomas Gleixner [Fri, 22 Mar 2019 21:51:21 +0000 (22:51 +0100)]
Merge tag 'perf-core-for-mingo-5.1-20190321' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core improvements and fixes from Arnaldo:
BPF:
Song Liu:
- Add support for annotating BPF programs, using the PERF_RECORD_BPF_EVENT
and PERF_RECORD_KSYMBOL records recently added to the kernel and plugging
binutils's libopcodes disassembly of BPF programs into the existing
annotation interfaces of 'perf annotate', 'perf report' and 'perf top',
in their various output formats (--stdio, --stdio2, --tui).
perf list:
Andi Kleen:
- Filter metrics when using substring search.
perf record:
Andi Kleen:
- Allow limiting the number of reported perf.data files.
- Clarify help for --switch-output.
perf report:
Andi Kleen:
- Indicate JITed code better.
- Show all sort keys in help output.
perf script:
Andi Kleen:
- Support relative time.
perf stat:
Andi Kleen:
- Improve scaling.
General:
Changbin Du:
- Fix some mostly error path memory and reference count leaks found
using gcc's ASan and UBSan.
Vendor events:
Mamatha Inamdar:
- Remove P8 HW events which are not supported.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Fri, 22 Mar 2019 21:50:41 +0000 (22:50 +0100)]
Merge tag 'perf-core-for-mingo-5.1-20190311' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core improvements and fixes from Arnaldo:
kernel:
Stephane Eranian:
- Restore mmap record type correctly when handling PERF_RECORD_MMAP2
events. The same template is used for all the threads interested
in mmap events: some may want just PERF_RECORD_MMAP, while others
may want the extra info in MMAP2 records.
perf probe:
Adrian Hunter:
- Fix getting the kernel map, because since changes related to x86 PTI
entry trampolines handling, there is more than one kernel map.
perf script:
Andi Kleen:
- Support insn output for normal samples, i.e.:
perf script -F ip,sym,insn --xed
Will fetch the sample IP from the thread address space and feed it
to Intel's XED disassembler, producing lines such as:
- Make the --cpu filter apply to PERF_RECORD_COMM/FORK/... events, in
addition to PERF_RECORD_SAMPLE.
perf report:
- Add a new --samples option to save a small random number of samples
per hist entry, using a reservoir technique to select a representative
number of samples.
Then allow browsing the samples using 'perf script' as part of the hist
entry context menu. This automatically adds the right filters, so only
the thread or CPU of the sample is displayed. Then we use less' search
functionality to directly jump to the time stamp of the selected sample.
It uses different menus for assembler and source display. Assembler
needs xed installed and source needs debuginfo.
- Fix the UI browser scripts pop up menu when there are many scripts
available.
- Update x86's syscall_64.tbl, no change in tools/perf behaviour.
- Sync copies asm-generic/unistd.h and linux/in with the kernel sources.
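The reservoir technique mentioned for the new --samples option can be sketched as follows. This is a minimal illustration of classic reservoir sampling, not the code perf uses; the struct and function names are hypothetical, and rand() is used only to keep the sketch short.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-capacity reservoir; not the structure perf uses. */
struct reservoir {
	size_t cap;	/* maximum number of samples kept */
	size_t seen;	/* total number of samples offered so far */
	long  *slots;	/* caller-provided storage for 'cap' samples */
};

/*
 * Offer one sample. Once the reservoir is full, the new sample replaces a
 * stored one with probability cap/seen, so every sample offered so far has
 * roughly the same chance of being kept (rand() % n has a small modulo bias).
 */
static void reservoir_add(struct reservoir *r, long sample)
{
	size_t j;

	r->seen++;
	if (r->seen <= r->cap) {
		/* Still filling up: keep everything. */
		r->slots[r->seen - 1] = sample;
		return;
	}
	/* Pick an index in [0, seen); replace only if it lands in the reservoir. */
	j = (size_t)rand() % r->seen;
	if (j < r->cap)
		r->slots[j] = sample;
}

/* Number of valid entries currently held. */
static size_t reservoir_len(const struct reservoir *r)
{
	return r->seen < r->cap ? r->seen : r->cap;
}
```

Memory use stays bounded by the capacity no matter how many samples a hist entry receives, which is presumably why a reservoir suits this option.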
perf data:
Jiri Olsa:
- Prep work to support having perf.data stored as a directory, with one
file per CPU, that ultimately will allow having one ring buffer reading
thread per CPU.
Vendor events:
Martin Liška:
- perf PMU events for AMD Family 17h.
perf script python:
Tony Jones:
- Add python3 support for the remaining Intel PT related scripts, with
these we should have a clean build of perf with python3 while still
supporting the build with python2.
libbpf:
Arnaldo Carvalho de Melo:
- Fix the build on uClibc, adding the missing stdarg.h since we use
va_list in one typedef.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix the fallback definition when HAVE_LIBBPF_SUPPORT is not defined,
i.e. add the missing 'static inline' and add the __maybe_unused to the
args. Also add stdio.h since we now use FILE * in bpf-event.h.
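The fallback pattern described here can be sketched as below. The function and type names are hypothetical stand-ins, not the ones in bpf-event.h, and __maybe_unused is defined locally only so the sketch compiles outside the perf tree.

```c
#include <stdio.h>	/* needed because the prototype mentions FILE * */

/* perf provides __maybe_unused itself; spelled out here for a standalone sketch. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

struct prog_info_stub;	/* hypothetical opaque type */

#ifdef HAVE_LIBBPF_SUPPORT
/* Real implementation would live in a .c file that links against libbpf. */
int print_prog_info(struct prog_info_stub *info, FILE *fp);
#else
/*
 * Fallback stub: 'static inline' keeps every includer from emitting its own
 * external definition, and __maybe_unused silences unused-parameter warnings.
 */
static inline int print_prog_info(struct prog_info_stub *info __maybe_unused,
				  FILE *fp __maybe_unused)
{
	return 0;
}
#endif
```

Without 'static inline', a header-defined stub would be compiled into every object that includes it and the link would fail with duplicate symbols.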
Signed-off-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190319165454.1298742-3-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Song Liu [Tue, 19 Mar 2019 16:54:53 +0000 (09:54 -0700)]
perf bpf: Extract logic to create program names from perf_event__synthesize_one_bpf_prog()
Extract logic to create program names to synthesize_bpf_prog_name(), so
that it can be reused in header.c:print_bpf_prog_info().
This commit doesn't change the behavior.
Signed-off-by: Song Liu <songliubraving@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190319165454.1298742-2-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch handles 3) and 4) for BPF programs loaded after 'perf
record|top'.
To process this information in a timely manner, a dedicated event is
added to the side band evlist.
When PERF_RECORD_BPF_EVENT is received via the side band event, the
polling thread gathers 3) and 4) via sys_bpf and stores them in perf_env.
This information is saved to perf.data at the end of 'perf record'.
Committer testing:
The 'wakeup_watermark' member in 'struct perf_event_attr' is inside an
unnamed union, so it can't be used in a struct designated initializer
with older gccs. Take it out of the initializer and set it separately,
as 'attr.wakeup_watermark = 1;', to work with all gcc versions.
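A minimal sketch of that workaround, using a hypothetical reduced struct rather than the real 'struct perf_event_attr' (the field layout and the values other than 'wakeup_watermark = 1' are assumptions for illustration):

```c
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of 'struct perf_event_attr'. */
struct fake_attr {
	uint32_t type;
	uint32_t watermark;		/* simplified: a bitfield in the real struct */
	union {				/* unnamed union, as in the real struct */
		uint32_t wakeup_events;
		uint32_t wakeup_watermark;
	};
};

static struct fake_attr make_sb_attr(void)
{
	/* Only members outside the unnamed union go into the initializer. */
	struct fake_attr attr = {
		.type      = 1,
		.watermark = 1,
	};

	/* Assigned separately so compilers lacking that initializer support accept it. */
	attr.wakeup_watermark = 1;
	return attr;
}
```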
We also need to add '--no-bpf-event' to the 'perf record'
perf_event_attr tests in 'perf test'. That test intercepts the events
being set up and checks whether they match the fields described in the
control files. Since it now finds the side band event used to catch
PERF_RECORD_BPF_EVENT first, all of those tests fail without the option.
With these issues fixed:
Same scenario as for testing BPF programs loaded before 'perf record' or
'perf top' starts, only start the BPF programs after 'perf record|top',
so that their information gets collected by the sideband thread. The rest
works as for the programs loaded before monitoring starts.
Add missing 'inline' to the bpf_event__add_sb_event() when
HAVE_LIBBPF_SUPPORT is not defined, fixing the build in systems without
binutils devel files installed.
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-16-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Song Liu [Tue, 12 Mar 2019 05:30:50 +0000 (22:30 -0700)]
perf evlist: Introduce side band thread
This patch introduces a side band thread that captures extended
information for events like PERF_RECORD_BPF_EVENT.
This new thread uses its own evlist, which uses a ring buffer with a very
low watermark for lower latency.
To use the side band thread, we need to:
1. add side band event(s) by calling perf_evlist__add_sb_event();
2. call perf_evlist__start_sb_thread();
3. at the end of the perf run, call perf_evlist__stop_sb_thread().
In the next patch, we use this thread to handle PERF_RECORD_BPF_EVENT.
Committer notes:
Add fix by Jiri Olsa for when the sb_thread can't get started and then at
the end the stop_sb_thread() segfaults when joining the (non-existing)
thread.
That can happen when running 'perf top' or 'perf record' as a normal
user, for instance.
Further checks need to be done on top of this to more gracefully handle
these possible failure scenarios.
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-15-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Song Liu [Tue, 12 Mar 2019 05:30:48 +0000 (22:30 -0700)]
perf annotate: Enable annotation of BPF programs
In symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso calls into
a new function symbol__disassemble_bpf(), where annotation line
information is filled based on the bpf_prog_info and btf data saved in
given perf_env.
symbol__disassemble_bpf() uses binutils's libopcodes to disassemble bpf
programs.
We'll see the two BPF programs that augmented_raw_syscalls.o puts in
place, one attached to the raw_syscalls:sys_enter and another to the
raw_syscalls:sys_exit tracepoints, as expected.
Now we can finally do, from the command line, annotation for one of
those two symbols, with the original BPF program source code intermixed
with the disassembled JITed code:
Please see 'man perf-config' to see how to control what should be seen,
via ~/.perfconfig [annotate] section, for instance, one can suppress the
source code and see just the disassembly, etc.
Alternatively, use the TUI by just using 'perf annotate', press
'/bpf_prog' to see the bpf symbols, press enter and do the interactive
annotation, which allows for dumping to a file after selecting the
various output tunables, for instance, the above without source code
intermixed, plus showing all the instruction offsets:
Then press: 's' to hide the source code + 'O' twice to show all
instruction offsets, then 'P' to print to the
bpf_prog_819967866022f1e1_sys_enter.annotation file, which will have:
Another cool way to test all this is to simply use 'perf top', look for
those symbols, go there, press enter and annotate them live :-)
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Song Liu [Tue, 12 Mar 2019 05:30:48 +0000 (22:30 -0700)]
perf build: Check what binutils's 'disassembler()' signature to use
Commit 003ca0fd2286 ("Refactor disassembler selection") in the binutils
repo changed the disassembler() function signature, so we must use the
feature test introduced in fb982666e380 ("tools/bpftool: fix
bpftool build with bintutils >= 2.9") to deal with that.
Committer testing:
After adding the missing function call to test-all.c, and:
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libaudit: [ on ]
... libbfd: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libslang: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... disassembler-four-args: [ on ]
CC /tmp/build/perf/jvmti/libjvmti.o
CC /tmp/build/perf/builtin-bench.o
<SNIP>
$
$
The feature detection test-all.bin gets successfully built and linked:
$ ls -la /tmp/build/perf/feature/test-all.bin
-rwxrwxr-x. 1 acme acme 2680352 Mar 19 11:07 /tmp/build/perf/feature/test-all.bin
$ nm /tmp/build/perf/feature/test-all.bin | grep -w disassembler
0000000000061f90 T disassembler
$
Time to move on to the patches that make use of this disassembler()
routine in binutils's libopcodes.
Signed-off-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roman Gushchin <guro@fb.com> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubraving@fb.com
[ split from a larger patch, added missing FEATURE_CHECK_LDFLAGS-disassembler-four-args ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Song Liu [Tue, 12 Mar 2019 05:30:49 +0000 (22:30 -0700)]
perf bpf: Process PERF_BPF_EVENT_PROG_LOAD for annotation
This patch adds processing of PERF_BPF_EVENT_PROG_LOAD, which sets
proper DSO type/id/etc of memory regions mapped to BPF programs to
DSO_BINARY_TYPE__BPF_PROG_INFO.
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-14-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce a new dso type DSO_BINARY_TYPE__BPF_PROG_INFO for BPF programs. In
symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso will call into a new
function symbol__disassemble_bpf() in an upcoming patch, where annotation line
information is filled based on the bpf_prog_info and btf saved in the given perf_env.
Committer notes:
Removed the unnamed union with 'bpf_prog' and 'cache' in 'struct dso',
to fix this bug when exiting 'perf top':
That is trying to access the dso->data.cache, and that is not used with
BPF programs, so we end up accessing what is in bpf_prog.first_member,
b00m.
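Why the unnamed union was dangerous can be shown with a reduced sketch. The structs below are hypothetical stand-ins, far smaller than the real 'struct dso'; they only demonstrate that union members share storage while plain members do not.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the real perf structures. */
struct fake_cache    { void *root; };
struct fake_bpf_prog { unsigned int id; unsigned int sub_id; };

/* Before: both members share storage, so code reading 'cache' sees bpf_prog bytes. */
struct dso_with_union {
	union {
		struct fake_cache    cache;
		struct fake_bpf_prog bpf_prog;
	};
};

/* After: separate members, each with its own storage. */
struct dso_without_union {
	struct fake_cache    cache;
	struct fake_bpf_prog bpf_prog;
};

/* Returns nonzero when 'cache' and 'bpf_prog' start at the same offset. */
static int members_overlap_with_union(void)
{
	return offsetof(struct dso_with_union, cache) ==
	       offsetof(struct dso_with_union, bpf_prog);
}

static int members_overlap_without_union(void)
{
	return offsetof(struct dso_without_union, cache) ==
	       offsetof(struct dso_without_union, bpf_prog);
}
```

With the union, generic teardown code that walks 'cache' would interpret the first bytes of 'bpf_prog' as a pointer, which matches the failure described above.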
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubraving@fb.com
[ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Song Liu [Tue, 12 Mar 2019 05:30:47 +0000 (22:30 -0700)]
perf feature detection: Add -lopcodes to feature-libbfd
Both libbfd and libopcodes are distributed with binutils-dev/devel. When
libbfd is present, it is OK to assume that libopcodes is also present. This
has been a safe assumption for bpftool.
This patch adds -lopcodes to perf/Makefile.config. libopcodes will be
used in the next commit for BPF annotation.
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-12-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-11-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just to compile and load a BPF program that attaches to the
raw_syscalls:sys_{enter,exit} tracepoints to trace the syscalls ending
in "msg" (recvmsg, sendmsg, recvmmsg, sendmmsg, etc).
Make sure you have a recent enough clang, say version 9, to get the
BTF ELF sections needed for this testing:
Then do a systemwide perf record session for a few seconds:
# perf record -a sleep 2s
Then look at:
# perf report --header-only | grep b[pt]f
# event : name = cycles:ppp, , id = { 1116204, 1116205, 1116206, 1116207, 1116208, 1116209, 1116210, 1116211 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
# bpf_prog_info of id 13
# bpf_prog_info of id 14
# bpf_prog_info of id 15
# bpf_prog_info of id 16
# bpf_prog_info of id 17
# bpf_prog_info of id 18
# bpf_prog_info of id 21
# bpf_prog_info of id 22
# bpf_prog_info of id 51
# bpf_prog_info of id 52
# btf info of id 8
#
We need to show more info about these BPF and BTF entries, but that can
be done later.
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-10-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Song Liu [Tue, 12 Mar 2019 05:30:44 +0000 (22:30 -0700)]
perf bpf: Save BTF in a rbtree in perf_env
BTF contains information necessary to annotate BPF programs. This patch
saves BTF for BPF programs loaded in the system.
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-9-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Song Liu [Tue, 12 Mar 2019 05:30:43 +0000 (22:30 -0700)]
perf bpf: Save bpf_prog_info information as headers to perf.data
This patch enables perf-record to save bpf_prog_info information as
headers to perf.data. A new header type HEADER_BPF_PROG_INFO is
introduced for this data.
Committer testing:
As root, being on the kernel sources top level directory, run:
Just to compile and load a BPF program that attaches to the
raw_syscalls:sys_{enter,exit} tracepoints to trace the syscalls ending
in "msg" (recvmsg, sendmsg, recvmmsg, sendmmsg, etc).
Then do a systemwide perf record session for a few seconds:
# perf record -a sleep 2s
Then look at:
# perf report --header-only | grep -i bpf
# bpf_prog_info of id 13
# bpf_prog_info of id 14
# bpf_prog_info of id 15
# bpf_prog_info of id 16
# bpf_prog_info of id 17
# bpf_prog_info of id 18
# bpf_prog_info of id 21
# bpf_prog_info of id 22
# bpf_prog_info of id 208
# bpf_prog_info of id 209
#
We need to show more info about these programs, like bpftool does for
the ones running on the system, i.e. 'perf record/perf report' become a
way of saving the BPF state in a machine to then analyse on another,
together with all the other information that is already saved in the
perf.data header:
# perf report --header-only
# ========
# captured on : Tue Mar 12 11:42:13 2019
# header version : 1
# data offset : 296
# data size : 16294184
# feat offset : 16294480
# hostname : quaco
# os release : 5.0.0+
# perf version : 5.0.gd783c8
# arch : x86_64
# nrcpus online : 8
# nrcpus avail : 8
# cpudesc : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
# cpuid : GenuineIntel,6,142,10
# total memory : 24555720 kB
# cmdline : /home/acme/bin/perf (deleted) record -a
# event : name = cycles:ppp, , id = { 3190123, 3190124, 3190125, 3190126, 3190127, 3190128, 3190129, 3190130 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1
# CPU_TOPOLOGY info available, use -I to display
# NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: intel_pt = 8, software = 1, power = 11, uprobe = 7, uncore_imc = 12, cpu = 4, cstate_core = 18, uncore_cbox_2 = 15, breakpoint = 5, uncore_cbox_0 = 13, tracepoint = 2, cstate_pkg = 19, uncore_arb = 17, kprobe = 6, i915 = 10, msr = 9, uncore_cbox_3 = 16, uncore_cbox_1 = 14
# CACHE info available, use -I to display
# time of first sample : 116392.441701
# time of last sample : 116400.932584
# sample duration : 8490.883 ms
# MEM_TOPOLOGY info available, use -I to display
# bpf_prog_info of id 13
# bpf_prog_info of id 14
# bpf_prog_info of id 15
# bpf_prog_info of id 16
# bpf_prog_info of id 17
# bpf_prog_info of id 18
# bpf_prog_info of id 21
# bpf_prog_info of id 22
# bpf_prog_info of id 208
# bpf_prog_info of id 209
# missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT
# ========
#
Committer notes:
We can't use libbpf unconditionally, as the build may have been done with
NO_LIBBPF, in which case we end up with linking errors, so provide dummy
{process,write}_bpf_prog_info() wrapped by HAVE_LIBBPF_SUPPORT for that
case.
Printing is not affected by this, so it can continue as is.
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-8-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before this set, 1) and 2) in the list are already saved to the perf.data
file. For BPF programs that are already loaded before the perf run, 1) and 2)
are synthesized by perf_event__synthesize_bpf_events(). For short-lived
BPF programs, 1) and 2) are generated by the kernel.
This set handles 3) and 4) from the list. Again, it is necessary to handle
existing BPF programs and short-lived programs separately.
This patch handles 3) for existing BPF programs while synthesizing 1) and
2) in perf_event__synthesize_bpf_events(). This data is stored in
perf_env. The next patch saves this data from perf_env to perf.data as
headers.
Similarly, the two patches after the next save 4) of existing BPF
programs to perf_env and perf.data.
Another patch later will handle 3) and 4) for short-lived BPF programs
by monitoring 1) and 2) in a dedicated thread.
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-7-songliubraving@fb.com
[ set env->bpf_progs.infos_cnt to zero in perf_env__purge_bpf() as noted by jolsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Song Liu [Tue, 12 Mar 2019 05:30:41 +0000 (22:30 -0700)]
perf bpf: Make synthesize_bpf_events() receive perf_session pointer instead of perf_tool
This patch changes the arguments of perf_event__synthesize_bpf_events()
to include perf_session* instead of perf_tool*. perf_session will be
used in the next patch.
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-6-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog':
util/bpf-event.c:143:35: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(info->prog_tags);
^
util/bpf-event.c:144:22: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
__u32 *prog_lens = (__u32 *)(info->jited_func_lens);
^
util/bpf-event.c:145:23: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
__u64 *prog_addrs = (__u64 *)(info->jited_ksyms);
^
util/bpf-event.c:146:22: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
void *func_infos = (void *)(info->func_info);
^
cc1: all warnings being treated as errors
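A common way to avoid this class of error is to go through uintptr_t when converting between a 64-bit integer field and a pointer. The sketch below assumes that approach; the log above shows only the compiler errors, so this is not necessarily the exact change that was made, and the struct and helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical reduced struct: user pointers are carried as 64-bit integers. */
struct fake_prog_info {
	uint64_t jited_func_lens;
};

/* Store a pointer in the 64-bit field. */
static void set_lens(struct fake_prog_info *info, uint32_t *lens)
{
	info->jited_func_lens = (uint64_t)(uintptr_t)lens;
}

/*
 * Read it back. Casting to uintptr_t first narrows the integer to pointer
 * width explicitly, so 32-bit builds do not trip -Wint-to-pointer-cast.
 */
static uint32_t *get_lens(const struct fake_prog_info *info)
{
	return (uint32_t *)(uintptr_t)info->jited_func_lens;
}
```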
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: kernel-team@fb.com Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-5-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: kernel-team@fb.com Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-4-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, bpf_prog_info includes 9 arrays. The user has the option to
fetch any combination of these arrays. However, this requires a lot of
handling.
This work becomes more tricky when we need to store bpf_prog_info to a
file, because these arrays are allocated independently.
This patch introduces 'struct bpf_prog_info_linear', which stores the arrays
of bpf_prog_info in contiguous memory.
Helper functions are introduced to unify the work to get different sets
of bpf_prog_info. Specifically, bpf_program__get_prog_info_linear()
allows the user to select which arrays to fetch, and handles details for
the user.
Please see the comments right before 'enum bpf_prog_info_array' for more
details and examples.
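The idea of keeping independently sized arrays in one contiguous allocation can be sketched as below. This is a simplified analogue with two byte arrays and hypothetical names, not the layout or API of 'struct bpf_prog_info_linear'.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much simplified analogue of a "linear" info blob. */
struct linear_blob {
	uint32_t off_a;		/* byte offset of array A inside data[] */
	uint32_t len_a;		/* byte length of array A */
	uint32_t off_b;		/* byte offset of array B inside data[] */
	uint32_t len_b;		/* byte length of array B */
	uint8_t  data[];	/* both arrays, back to back */
};

/* Copy two separately allocated arrays into a single allocation. */
static struct linear_blob *linear_pack(const void *a, uint32_t len_a,
				       const void *b, uint32_t len_b)
{
	struct linear_blob *blob = malloc(sizeof(*blob) + len_a + len_b);

	if (!blob)
		return NULL;
	blob->off_a = 0;
	blob->len_a = len_a;
	blob->off_b = len_a;	/* B starts right after A; alignment is ignored here */
	blob->len_b = len_b;
	memcpy(blob->data + blob->off_a, a, len_a);
	memcpy(blob->data + blob->off_b, b, len_b);
	return blob;
}

/* Total number of bytes to write when saving the blob to a file. */
static size_t linear_size(const struct linear_blob *blob)
{
	return sizeof(*blob) + blob->len_a + blob->len_b;
}
```

Because the blob records offsets rather than pointers, it can be written to a file with a single write and read back at a different address.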
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lkml.kernel.org/r/ce92c091-e80d-a0c1-4aa0-987706c42b20@iogearbox.net Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: kernel-team@fb.com Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-3-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Song Liu [Tue, 12 Mar 2019 05:30:37 +0000 (22:30 -0700)]
perf record: Replace option --bpf-event with --no-bpf-event
Currently, monitoring of BPF programs through bpf_event is off by
default for 'perf record'.
To turn it on, the user needs to use the "--bpf-event" option. As BPF gets
wider adoption in different subsystems, this option becomes
inconvenient.
This patch makes bpf_event on by default, and adds option "--no-bpf-event"
to turn it off. Since option --bpf-event is not released yet, it is safe
to remove it.
Signed-off-by: Song Liu <songliubraving@fb.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: kernel-team@fb.com Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@google.com> Link: http://lkml.kernel.org/r/20190312053051.2690567-2-songliubraving@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Direct leak of 1160 byte(s) in 1 object(s) allocated from:
#0 0x7f1b6fc84138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
#1 0x55bd50005599 in zalloc util/util.h:23
#2 0x55bd500068f5 in perf_evsel__newtp_idx util/evsel.c:327
#3 0x55bd4ff810fc in perf_evsel__newtp /home/work/linux/tools/perf/util/evsel.h:216
#4 0x55bd4ff81608 in test__perf_evsel__tp_sched_test tests/evsel-tp-sched.c:69
#5 0x55bd4ff528e6 in run_test tests/builtin-test.c:358
#6 0x55bd4ff52baf in test_and_print tests/builtin-test.c:388
#7 0x55bd4ff543fe in __cmd_test tests/builtin-test.c:583
#8 0x55bd4ff5572f in cmd_test tests/builtin-test.c:722
#9 0x55bd4ffc4087 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#10 0x55bd4ffc45c6 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#11 0x55bd4ffc49ca in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#12 0x55bd4ffc5138 in main /home/changbin/work/linux/tools/perf/perf.c:520
#13 0x7f1b6e34809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Indirect leak of 19 byte(s) in 1 object(s) allocated from:
#0 0x7f1b6fc83f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30)
#1 0x7f1b6e3ac30f in vasprintf (/lib/x86_64-linux-gnu/libc.so.6+0x8830f)
Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 6a6cd11d4e57 ("perf test: Add test for the sched tracepoint format fields") Link: http://lkml.kernel.org/r/20190316080556.3075-17-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Direct leak of 13 byte(s) in 3 object(s) allocated from:
#0 0x7f03339d6070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070)
#1 0x5625e53aaef0 in expr__find_other util/expr.y:221
#2 0x5625e51bcd3f in test__expr tests/expr.c:52
#3 0x5625e51528e6 in run_test tests/builtin-test.c:358
#4 0x5625e5152baf in test_and_print tests/builtin-test.c:388
#5 0x5625e51543fe in __cmd_test tests/builtin-test.c:583
#6 0x5625e515572f in cmd_test tests/builtin-test.c:722
#7 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#8 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#9 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#10 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520
#11 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 075167363f8b ("perf tools: Add a simple expression parser for JSON") Link: http://lkml.kernel.org/r/20190316080556.3075-16-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7f0333a88f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30)
#1 0x5625e5326213 in cpu_map__trim_new util/cpumap.c:45
#2 0x5625e5326703 in cpu_map__read util/cpumap.c:103
#3 0x5625e53267ef in cpu_map__read_all_cpu_map util/cpumap.c:120
#4 0x5625e5326915 in cpu_map__new util/cpumap.c:135
#5 0x5625e517b355 in test__openat_syscall_event_on_all_cpus tests/openat-syscall-all-cpus.c:36
#6 0x5625e51528e6 in run_test tests/builtin-test.c:358
#7 0x5625e5152baf in test_and_print tests/builtin-test.c:388
#8 0x5625e51543fe in __cmd_test tests/builtin-test.c:583
#9 0x5625e515572f in cmd_test tests/builtin-test.c:722
#10 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#11 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#12 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#13 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520
#14 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: f30a79b012e5 ("perf tools: Add reference counting for cpu_map object") Link: http://lkml.kernel.org/r/20190316080556.3075-15-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Direct leak of 48 byte(s) in 1 object(s) allocated from:
#0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
#1 0x5625e5330a5e in zalloc util/util.h:23
#2 0x5625e5330a9b in perf_counts__new util/counts.c:10
#3 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47
#4 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505
#5 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347
#6 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47
#7 0x5625e51528e6 in run_test tests/builtin-test.c:358
#8 0x5625e5152baf in test_and_print tests/builtin-test.c:388
#9 0x5625e51543fe in __cmd_test tests/builtin-test.c:583
#10 0x5625e515572f in cmd_test tests/builtin-test.c:722
#11 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#12 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#13 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#14 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520
#15 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Indirect leak of 72 byte(s) in 1 object(s) allocated from:
#0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
#1 0x5625e532560d in zalloc util/util.h:23
#2 0x5625e532566b in xyarray__new util/xyarray.c:10
#3 0x5625e5330aba in perf_counts__new util/counts.c:15
#4 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47
#5 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505
#6 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347
#7 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47
#8 0x5625e51528e6 in run_test tests/builtin-test.c:358
#9 0x5625e5152baf in test_and_print tests/builtin-test.c:388
#10 0x5625e51543fe in __cmd_test tests/builtin-test.c:583
#11 0x5625e515572f in cmd_test tests/builtin-test.c:722
#12 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#13 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#14 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#15 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520
#16 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
His patch took care of evsel->prev_raw_counts, but the above backtraces
are about evsel->counts, so fix that instead.
Reported-by: Changbin Du <changbin.du@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/n/tip-hd1x13g59f0nuhe4anxhsmfp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Changbin Du [Sat, 16 Mar 2019 08:05:52 +0000 (16:05 +0800)]
perf top: Fix global-buffer-overflow issue
The array str[] should have six elements.
=================================================================
==4322==ERROR: AddressSanitizer: global-buffer-overflow on address 0x56463844e300 at pc 0x564637e7ad0d bp 0x7f30c8c89d10 sp 0x7f30c8c89d00
READ of size 8 at 0x56463844e300 thread T9
#0 0x564637e7ad0c in __ordered_events__flush util/ordered-events.c:316
#1 0x564637e7b0e4 in ordered_events__flush util/ordered-events.c:338
#2 0x564637c6a57d in process_thread /home/changbin/work/linux/tools/perf/builtin-top.c:1073
#3 0x7f30d173a163 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x8163)
#4 0x7f30cfffbdee in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x11adee)
0x56463844e300 is located 32 bytes to the left of global variable 'flags' defined in 'util/trace-event-parse.c:229:26' (0x56463844e320) of size 192
0x56463844e300 is located 0 bytes to the right of global variable 'str' defined in 'util/ordered-events.c:268:28' (0x56463844e2e0) of size 32
SUMMARY: AddressSanitizer: global-buffer-overflow util/ordered-events.c:316 in __ordered_events__flush
Shadow bytes around the buggy address:
0x0ac947081c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac947081c20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac947081c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac947081c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac947081c50: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 00 00 00 00
=>0x0ac947081c60:[f9]f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac947081c70: 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9
0x0ac947081c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac947081c90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac947081ca0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ac947081cb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Thread T9 created by T0 here:
#0 0x7f30d179de5f in __interceptor_pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x4ae5f)
#1 0x564637c6b954 in __cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1253
#2 0x564637c7173c in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1642
#3 0x564637d85038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#4 0x564637d85577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#5 0x564637d8597b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#6 0x564637d860e9 in main /home/changbin/work/linux/tools/perf/perf.c:520
#7 0x7f30cff0509a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Jiri Olsa <jolsa@kernel.org> Fixes: 16c66bc167cc ("perf top: Add processing thread") Fixes: 68ca5d07de20 ("perf ordered_events: Add ordered_events__flush_time interface") Link: http://lkml.kernel.org/r/20190316080556.3075-13-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
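A minimal sketch of the kind of bug fixed by the global-buffer-overflow change above: a table of names indexed by an enum that has grown past the table. The enum, table and helper names here are hypothetical, not the ones in util/ordered-events.c. A compile-time check is one possible way to keep the two in sync.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the flush kinds and their printable names. */
enum demo_flush {
	DEMO_FLUSH_NONE,
	DEMO_FLUSH_FINAL,
	DEMO_FLUSH_ROUND,
	DEMO_FLUSH_HALF,
	DEMO_FLUSH_TOP,
	DEMO_FLUSH_TIME,
	DEMO_FLUSH_MAX,
};

static const char * const demo_flush_names[] = {
	"NONE", "FINAL", "ROUND", "HALF", "TOP", "TIME",
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Breaks the build if an enumerator is added without a matching name. */
_Static_assert(DEMO_ARRAY_SIZE(demo_flush_names) == DEMO_FLUSH_MAX,
	       "demo_flush_names[] must cover every enum demo_flush value");

static const char *demo_flush_name(enum demo_flush how)
{
	/* Runtime guard for values that did not come from the enum. */
	assert((size_t)how < DEMO_ARRAY_SIZE(demo_flush_names));
	return demo_flush_names[how];
}
```

With only four entries in the table, looking up the fifth or sixth enumerator would read past the end of the array, which is the access ASan reports as a global-buffer-overflow.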
Changbin Du [Sat, 16 Mar 2019 08:05:51 +0000 (16:05 +0800)]
perf maps: Purge all maps from the 'names' tree
Add function __maps__purge_names() to purge all maps from the names
tree. We need to clean up the names tree in maps__exit().
Detected with gcc's ASan.
Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 1e6285699b30 ("perf symbols: Fix slowness due to -ffunction-section") Link: http://lkml.kernel.org/r/20190316080556.3075-12-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Changbin Du [Sat, 16 Mar 2019 08:05:50 +0000 (16:05 +0800)]
perf map: Remove map from 'names' tree in __maps__remove()
There are two trees for each map inserted by maps__insert(), so remove
it from the 'names' tree in __maps__remove().
Detected with gcc's ASan.
Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 1e6285699b30 ("perf symbols: Fix slowness due to -ffunction-section") Link: http://lkml.kernel.org/r/20190316080556.3075-11-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
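The idea behind the two 'names' tree fixes above can be sketched as follows. This is a simplified illustration that uses two singly linked lists in place of perf's rb-trees; every type and function name in it is hypothetical. The point is only that an object inserted into two indexes has to be removed from both.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical map that is linked into two indexes at once. */
struct demo_map {
	struct demo_map *next_by_addr;
	struct demo_map *next_by_name;
};

struct demo_maps {
	struct demo_map *by_addr;	/* index ordered by address in the real code */
	struct demo_map *by_name;	/* index ordered by name in the real code */
};

static void demo_maps__insert(struct demo_maps *maps, struct demo_map *map)
{
	map->next_by_addr = maps->by_addr;
	maps->by_addr = map;
	map->next_by_name = maps->by_name;
	maps->by_name = map;
}

static void demo_maps__remove(struct demo_maps *maps, struct demo_map *map)
{
	struct demo_map **pp;

	assert(maps && map);

	for (pp = &maps->by_addr; *pp; pp = &(*pp)->next_by_addr) {
		if (*pp == map) {
			*pp = map->next_by_addr;
			break;
		}
	}

	/* Forgetting this second unlink leaves a stale entry in the name index. */
	for (pp = &maps->by_name; *pp; pp = &(*pp)->next_by_name) {
		if (*pp == map) {
			*pp = map->next_by_name;
			break;
		}
	}

	map->next_by_addr = NULL;
	map->next_by_name = NULL;
}
```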
Changbin Du [Sat, 16 Mar 2019 08:05:47 +0000 (16:05 +0800)]
perf top: Delete the evlist before perf_session, fixing heap-use-after-free issue
The evlist should be destroyed before the perf session.
Detected with gcc's ASan:
=================================================================
==27350==ERROR: AddressSanitizer: heap-use-after-free on address 0x62b000002e38 at pc 0x5611da276999 bp 0x7ffce8f1d1a0 sp 0x7ffce8f1d190
WRITE of size 8 at 0x62b000002e38 thread T0
#0 0x5611da276998 in __list_del /home/work/linux/tools/include/linux/list.h:89
#1 0x5611da276d4a in __list_del_entry /home/work/linux/tools/include/linux/list.h:102
#2 0x5611da276e77 in list_del_init /home/work/linux/tools/include/linux/list.h:145
#3 0x5611da2781cd in thread__put util/thread.c:130
#4 0x5611da2cc0a8 in __thread__zput util/thread.h:68
#5 0x5611da2d2dcb in hist_entry__delete util/hist.c:1148
#6 0x5611da2cdf91 in hists__delete_entry util/hist.c:337
#7 0x5611da2ce19e in hists__delete_entries util/hist.c:365
#8 0x5611da2db2ab in hists__delete_all_entries util/hist.c:2639
#9 0x5611da2db325 in hists_evsel__exit util/hist.c:2651
#10 0x5611da1c5352 in perf_evsel__exit util/evsel.c:1304
#11 0x5611da1c5390 in perf_evsel__delete util/evsel.c:1309
#12 0x5611da1b35f0 in perf_evlist__purge util/evlist.c:124
#13 0x5611da1b38e2 in perf_evlist__delete util/evlist.c:148
#14 0x5611da069781 in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1645
#15 0x5611da17d038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#16 0x5611da17d577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#17 0x5611da17d97b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#18 0x5611da17e0e9 in main /home/changbin/work/linux/tools/perf/perf.c:520
#19 0x7fdcc970f09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
#20 0x5611d9ff35c9 in _start (/home/work/linux/tools/perf/perf+0x3e95c9)
0x62b000002e38 is located 11320 bytes inside of 27448-byte region [0x62b000000200,0x62b000006d38)
freed by thread T0 here:
#0 0x7fdccb04ab70 in free (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedb70)
#1 0x5611da260df4 in perf_session__delete util/session.c:201
#2 0x5611da063de5 in __cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1300
#3 0x5611da06973c in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1642
#4 0x5611da17d038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#5 0x5611da17d577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#6 0x5611da17d97b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#7 0x5611da17e0e9 in main /home/changbin/work/linux/tools/perf/perf.c:520
#8 0x7fdcc970f09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
previously allocated by thread T0 here:
#0 0x7fdccb04b138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
#1 0x5611da26010c in zalloc util/util.h:23
#2 0x5611da260824 in perf_session__new util/session.c:118
#3 0x5611da0633a6 in __cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1192
#4 0x5611da06973c in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1642
#5 0x5611da17d038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#6 0x5611da17d577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#7 0x5611da17d97b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#8 0x5611da17e0e9 in main /home/changbin/work/linux/tools/perf/perf.c:520
#9 0x7fdcc970f09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
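The ordering problem in the traces above can be sketched as follows, under the assumption (taken from the report) that dropping the evlist's references writes into memory owned by the session. All names are hypothetical, and the refusal in demo_session__delete() exists only to make the ordering visible; perf's own delete function does no such check.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical owner of the memory that dependents point into. */
struct demo_session {
	int live_refs;
};

/* Hypothetical dependent holding references into the session. */
struct demo_evlist {
	struct demo_session *session;
	int nr_refs;
};

static struct demo_session *demo_session__new(void)
{
	return calloc(1, sizeof(struct demo_session));
}

static struct demo_evlist *demo_evlist__new(struct demo_session *session, int nr_refs)
{
	struct demo_evlist *evlist = calloc(1, sizeof(*evlist));

	if (!evlist)
		return NULL;
	evlist->session = session;
	evlist->nr_refs = nr_refs;
	session->live_refs += nr_refs;
	return evlist;
}

/* Dropping the references touches session memory, so the session must still exist. */
static void demo_evlist__delete(struct demo_evlist *evlist)
{
	assert(evlist->session->live_refs >= evlist->nr_refs);
	evlist->session->live_refs -= evlist->nr_refs;
	free(evlist);
}

/* Returns 0 on success, -1 (without freeing) if something still points into the session. */
static int demo_session__delete(struct demo_session *session)
{
	if (session->live_refs != 0)
		return -1;
	free(session);
	return 0;
}
```

Tearing down in the order evlist first, session second keeps every write inside live memory.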
Changbin Du [Sat, 16 Mar 2019 08:05:46 +0000 (16:05 +0800)]
perf build-id: Fix memory leak in print_sdt_events()
Detected with gcc's ASan:
Direct leak of 4356 byte(s) in 120 object(s) allocated from:
#0 0x7ff1a2b5a070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070)
#1 0x55719aef4814 in build_id_cache__origname util/build-id.c:215
#2 0x55719af649b6 in print_sdt_events util/parse-events.c:2339
#3 0x55719af66272 in print_events util/parse-events.c:2542
#4 0x55719ad1ecaa in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58
#5 0x55719aec745d in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#6 0x55719aec7d1a in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#7 0x55719aec8184 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#8 0x55719aeca41a in main /home/changbin/work/linux/tools/perf/perf.c:520
#9 0x7ff1a07ae09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 40218daea1db ("perf list: Show SDT and pre-cached events") Link: http://lkml.kernel.org/r/20190316080556.3075-7-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Changbin Du [Sat, 16 Mar 2019 08:05:45 +0000 (16:05 +0800)]
perf config: Fix a memory leak in collect_config()
Detected with gcc's ASan:
Direct leak of 66 byte(s) in 5 object(s) allocated from:
#0 0x7ff3b1f32070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070)
#1 0x560c8761034d in collect_config util/config.c:597
#2 0x560c8760d9cb in get_value util/config.c:169
#3 0x560c8760dfd7 in perf_parse_file util/config.c:285
#4 0x560c8760e0d2 in perf_config_from_file util/config.c:476
#5 0x560c876108fd in perf_config_set__init util/config.c:661
#6 0x560c87610c72 in perf_config_set__new util/config.c:709
#7 0x560c87610d2f in perf_config__init util/config.c:718
#8 0x560c87610e5d in perf_config util/config.c:730
#9 0x560c875ddea0 in main /home/changbin/work/linux/tools/perf/perf.c:442
#10 0x7ff3afb8609a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Taeung Song <treeze.taeung@gmail.com> Fixes: 20105ca1240c ("perf config: Introduce perf_config_set class") Link: http://lkml.kernel.org/r/20190316080556.3075-6-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Changbin Du [Sat, 16 Mar 2019 08:05:44 +0000 (16:05 +0800)]
perf config: Fix an error in the config template documentation
The option 'sort-order' should be 'sort_order'.
Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 893c5c798be9 ("perf config: Show default report configuration in example and docs") Link: http://lkml.kernel.org/r/20190316080556.3075-5-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Changbin Du [Sat, 16 Mar 2019 08:05:43 +0000 (16:05 +0800)]
perf tools: Fix errors under optimization level '-Og'
Optimization level '-Og' offers a reasonable level of optimization while
maintaining fast compilation and a good debugging experience. This patch
tries to make it work.
$ make DEBUG=1 EXTRA_CFLAGS='-Og'
bench/epoll-ctl.c: In function ‘do_threads’:
bench/epoll-ctl.c:274:9: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
return ret;
^~~
...
Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20190316080556.3075-4-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
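For illustration, a reduced example of the pattern that triggers -Wmaybe-uninitialized at '-Og', and one common way to silence it. This is a sketch with hypothetical function names, not the actual change made to bench/epoll-ctl.c.

```c
#include <assert.h>

/* Hypothetical per-thread setup step; fails for negative indexes. */
static int demo_start_worker(int idx)
{
	return idx < 0 ? -1 : 0;
}

static int demo_do_threads(int nthreads)
{
	/*
	 * Without the initializer, 'ret' is only assigned inside the loop,
	 * so a call with nthreads == 0 would return an indeterminate value.
	 */
	int ret = 0;
	int i;

	assert(nthreads >= 0);

	for (i = 0; i < nthreads; i++) {
		ret = demo_start_worker(i);
		if (ret)
			break;
	}

	return ret;
}
```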
Changbin Du [Sat, 16 Mar 2019 08:05:42 +0000 (16:05 +0800)]
perf list: Don't forget to drop the reference to the allocated thread_map
Detected via gcc's ASan:
Direct leak of 2048 byte(s) in 64 object(s) allocated from:
#0 0x7f606512e370 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee370)
#1 0x556b0f1d7ddd in thread_map__realloc util/thread_map.c:43
#2 0x556b0f1d84c7 in thread_map__new_by_tid util/thread_map.c:85
#3 0x556b0f0e045e in is_event_supported util/parse-events.c:2250
#4 0x556b0f0e1aa1 in print_hwcache_events util/parse-events.c:2382
#5 0x556b0f0e3231 in print_events util/parse-events.c:2514
#6 0x556b0ee0a66e in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58
#7 0x556b0f01e0ae in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
#8 0x556b0f01e859 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
#9 0x556b0f01edc8 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
#10 0x556b0f01f71f in main /home/changbin/work/linux/tools/perf/perf.c:520
#11 0x7f6062ccf09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 89896051f8da ("perf tools: Do not put a variable sized type not at the end of a struct") Link: http://lkml.kernel.org/r/20190316080556.3075-3-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
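A leak like the one above comes from a reference-counted object whose last reference is never dropped. The get/put pattern can be sketched as follows; the demo_map type and its helpers are hypothetical and only mimic the shape of perf's thread_map API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
struct demo_map {
	int refcnt;
	int nr;
};

static struct demo_map *demo_map__new(int nr)
{
	struct demo_map *map = calloc(1, sizeof(*map));

	if (map) {
		map->refcnt = 1;	/* the creator holds the first reference */
		map->nr = nr;
	}
	return map;
}

static struct demo_map *demo_map__get(struct demo_map *map)
{
	map->refcnt++;
	return map;
}

/* Drops one reference; returns 1 if that was the last one and the object was freed. */
static int demo_map__put(struct demo_map *map)
{
	if (!map)
		return 0;
	assert(map->refcnt > 0);
	if (--map->refcnt == 0) {
		free(map);
		return 1;
	}
	return 0;
}

/* A probe-style helper that needs a temporary map and must drop it before returning. */
static int demo_is_supported(int tid)
{
	struct demo_map *tmp = demo_map__new(1);
	int ok;

	if (!tmp)
		return 0;
	ok = tid >= 0;
	demo_map__put(tmp);	/* omitting this put leaks one object per call */
	return ok;
}
```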
Changbin Du [Sat, 16 Mar 2019 08:05:41 +0000 (16:05 +0800)]
perf tools: Add doc about how to build perf with Asan and UBSan
AddressSanitizer (or ASan) and UndefinedBehaviorSanitizer (or UBSan) are
very useful tools to detect program bugs:
- AddressSanitizer (or ASan) is a GCC feature that detects memory
corruption bugs such as buffer overflows and memory leaks.
- UndefinedBehaviorSanitizer (or UBSan) is a fast undefined behavior
detector supported by GCC. UBSan detects undefined behaviors of programs
at runtime.
This patch adds a document about how to use them on perf. Later patches will fix
some of the issues disclosed by them.
Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20190316080556.3075-2-changbin.du@gmail.com
[ Make some changes based on comments made by Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 22:50:02 +0000 (15:50 -0700)]
perf stat: Improve scaling
The multiplexing scaling in perf stat mysteriously adds 0.5 to the
value. This dates back to the original perf tool. Other scaling code
doesn't use that strange convention. Remove the extra 0.5.
Before:
$ perf stat -e 'cycles,cycles,cycles,cycles,cycles,cycles' grep -rq foo
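To make the effect of the extra 0.5 concrete, here is a small sketch of multiplex scaling, where a raw count is extrapolated by the ratio of time enabled to time running. The function names are hypothetical and the handling of run == 0 is an assumption made for the example; this is not the perf source.

```c
#include <assert.h>
#include <stdint.h>

/* Scaling without the extra 0.5: the fractional part is simply truncated. */
static uint64_t demo_scale(uint64_t val, uint64_t ena, uint64_t run)
{
	assert(run <= ena);	/* an event cannot run longer than it was enabled */
	if (run == 0)
		return 0;	/* assumption: an event that never ran scales to 0 */
	return (uint64_t)((double)val * ena / run);
}

/* The older convention described above: add 0.5 before converting back to an integer. */
static uint64_t demo_scale_plus_half(uint64_t val, uint64_t ena, uint64_t run)
{
	assert(run <= ena);
	if (run == 0)
		return 0;
	return (uint64_t)((double)val * ena / run + 0.5);
}
```

For val = 3, ena = 3, run = 2 the exact ratio is 4.5, so the first helper yields 4 and the second yields 5. When the event ran the whole time (ena == run), both return the raw count unchanged.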
Andi Kleen [Thu, 14 Mar 2019 22:50:01 +0000 (15:50 -0700)]
perf stat: Fix --no-scale
The -c option to enable multiplex scaling has been useless for quite
some time because scaling is default.
It's only useful as --no-scale to disable scaling. But the non scaling
code path has bitrotted and doesn't print anything because perf output
code relies on value run/ena information.
Also even when we don't want to scale a value it's still useful to show
its multiplex percentage.
This patch:
- Fixes help and documentation to show --no-scale instead of -c
- Removes -c, only keeps the long option because -c doesn't support negatives.
- Enables running/enabled even with --no-scale
- And fixes some other problems in the no-scale output.
Do not use 'time' as the name of a variable, as this breaks the build on
older glibcs:
cc1: warnings being treated as errors
builtin-script.c: In function 'perf_sample__fprintf_start':
builtin-script.c:691: warning: declaration of 'time' shadows a global declaration
/usr/include/time.h:187: warning: shadowed declaration is here
Andi Kleen [Thu, 14 Mar 2019 22:49:56 +0000 (15:49 -0700)]
perf record: Clarify help for --switch-output
The help description for --switch-output looks like there are multiple
comma separated fields. But it's actually a choice of different options.
Make it clear and less confusing.
Before:
% perf record -h
...
--switch-output[=<signal,size,time>]
Switch output when receive SIGUSR2 or cross size,time threshold
After:
% perf record -h
...
--switch-output[=<signal or size[BKMG] or time[smhd]>]
Switch output when receiving SIGUSR2 (signal) or cross a size or time threshold
Andi Kleen [Thu, 14 Mar 2019 22:49:55 +0000 (15:49 -0700)]
perf record: Allow to limit number of reported perf.data files
When doing long term recording and waiting for some event to snapshot
on, we often only care about the last minute or so.
The --switch-output command line option supports rotating the perf.data
file when the size exceeds a threshold. But the disk would still be
filled with unnecessary old files.
Add a new option to only keep a number of rotated files, so that the
disk space usage can be limited.
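One way to keep only the last few rotated files is a small ring of file names, where storing a new name hands back the oldest one for deletion. The sketch below is an illustration under that assumption; the structure, the limit of two files and the function name are all hypothetical and do not describe how perf record implements the option.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_FILES 2	/* hypothetical limit on kept files */
#define DEMO_NAME_LEN  64

struct demo_rotation {
	char names[DEMO_MAX_FILES][DEMO_NAME_LEN];
	unsigned int next;	/* slot the next file name goes into */
};

/*
 * Remember a newly written file. If the slot already held an older file,
 * copy its name into 'evicted' (at least DEMO_NAME_LEN bytes) and return 1
 * so the caller can remove it; otherwise return 0.
 */
static int demo_rotation__add(struct demo_rotation *rot, const char *name, char *evicted)
{
	char *slot = rot->names[rot->next];
	int had_old = slot[0] != '\0';

	assert(strlen(name) < DEMO_NAME_LEN);

	if (had_old)
		strcpy(evicted, slot);	/* a real tool would unlink() this file */
	strcpy(slot, name);
	rot->next = (rot->next + 1) % DEMO_MAX_FILES;
	return had_old;
}
```

With a limit of two, the third rotation evicts the first file, the fourth evicts the second, and so on, so at most two rotated files exist at any time.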
Peter Zijlstra [Fri, 15 Mar 2019 08:14:10 +0000 (09:14 +0100)]
perf/x86: Fixup typo in stub functions
Guenter reported a build warning for CONFIG_CPU_SUP_INTEL=n:
> With allmodconfig-CONFIG_CPU_SUP_INTEL, this patch results in:
>
> In file included from arch/x86/events/amd/core.c:8:0:
> arch/x86/events/amd/../perf_event.h:1036:45: warning: ‘struct cpu_hw_event’ declared inside parameter list will not be visible outside of this definition or declaration
> static inline int intel_cpuc_prepare(struct cpu_hw_event *cpuc, int cpu)
While harmless (an unused pointer is an unused pointer, no matter the type)
it needs fixing.
Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: d01b1f96a82e ("perf/x86/intel: Make cpuc allocations consistent") Link: http://lkml.kernel.org/r/20190315081410.GR5996@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
cpuc->constraint_list[-1] is used, which is an obvious out-of-bound access.
In this case, simply skip the TFA constraint code, there is no event
constraint with just PMC3, therefore the code will never result in the
empty set.
Fixes: 400816f60c54 ("perf/x86/intel: Implement support for TSX Force Abort") Reported-by: Tony Jones <tonyj@suse.com> Reported-by: "DSouza, Nelson" <nelson.dsouza@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tony Jones <tonyj@suse.com> Tested-by: "DSouza, Nelson" <nelson.dsouza@intel.com> Cc: eranian@google.com Cc: jolsa@redhat.com Cc: stable@kernel.org Link: https://lkml.kernel.org/r/20190314130705.441549378@infradead.org
Linus Torvalds [Thu, 14 Mar 2019 22:10:10 +0000 (15:10 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc patches from Andrew Morton:
- a little bit more MM
- a few fixups
[ The "little bit more MM" is actually just one of the three patches
Andrew sent for mm/filemap.c, I'm still mulling over two more of them
from Josef Bacik - Linus ]
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
include/linux/swap.h: use offsetof() instead of custom __swapoffset macro
tools/testing/selftests/proc/proc-pid-vm.c: test with vsyscall in mind
zram: default to lzo-rle instead of lzo
filemap: pass vm_fault to the mmap ra helpers
Pi-Hsun Shih [Wed, 13 Mar 2019 18:44:33 +0000 (11:44 -0700)]
include/linux/swap.h: use offsetof() instead of custom __swapoffset macro
Use offsetof() to calculate offset of a field to take advantage of
compiler built-in version when possible, and avoid UBSAN warning when
compiling with Clang:
UBSAN: Undefined behaviour in mm/swapfile.c:3010:38
member access within null pointer of type 'union swap_header'
CPU: 6 PID: 1833 Comm: swapon Tainted: G S 4.19.23 #43
Call trace:
dump_backtrace+0x0/0x194
show_stack+0x20/0x2c
__dump_stack+0x20/0x28
dump_stack+0x70/0x94
ubsan_epilogue+0x14/0x44
ubsan_type_mismatch_common+0xf4/0xfc
__ubsan_handle_type_mismatch_v1+0x34/0x54
__se_sys_swapon+0x654/0x1084
__arm64_sys_swapon+0x1c/0x24
el0_svc_common+0xa8/0x150
el0_svc_compat_handler+0x2c/0x38
el0_svc_compat+0x8/0x18
Link: http://lkml.kernel.org/r/20190312081902.223764-1-pihsun@chromium.org Signed-off-by: Pi-Hsun Shih <pihsun@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
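The difference between a hand-rolled offset macro and offsetof() can be sketched as follows. The union below is a simplified, hypothetical stand-in for 'union swap_header' with made-up field sizes, and the old-style macro is shown only in a comment because evaluating it is the member access on a null pointer that UBSAN reports.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an on-disk header overlaid by two layouts. */
union demo_header {
	struct {
		char reserved[4086];
		char magic[10];
	} magic;
	struct {
		char bootbits[1024];
		unsigned int version;
		unsigned int last_page;
	} info;
};

/*
 * Old style, which dereferences a null pointer to reach the member:
 *
 *   #define demo_offset(x) ((unsigned long)&((union demo_header *)0)->x)
 *
 * offsetof() gives the same number and lets the compiler use its builtin.
 */
#define demo_offset(x) offsetof(union demo_header, x)

static size_t demo_version_offset(void)
{
	size_t off = demo_offset(info.version);

	/* Sanity check: the field is naturally aligned in this layout. */
	assert(off % sizeof(unsigned int) == 0);
	return off;
}
```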
Josef Bacik [Wed, 13 Mar 2019 18:44:18 +0000 (11:44 -0700)]
filemap: pass vm_fault to the mmap ra helpers
All of the arguments to these functions come from the vmf.
Cut down on the amount of arguments passed by simply passing in the vmf
to these two helpers.
Link: http://lkml.kernel.org/r/20181211173801.29535-3-josef@toxicpanda.com Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 14 Mar 2019 17:48:14 +0000 (10:48 -0700)]
Merge tag 'acpi-5.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"These fix a couple of issues and do some cleanups on top of the
previous ACPI changes for 5.1-rc1.
Specifics:
- Fix a crash caused by unloading an SSDT overlay (Andy Shevchenko)
- Prevent user space from getting confusing error values on failing
ACPI sysfs accesses (Rafael Wysocki)
- Simplify leaf node detection in the PPTT parsing code by using a
new flag defined in ACPI 6.3 (Jeremy Linton)
- Add missing "static" in some places in the ACPI configfs code (Andy
Shevchenko)
- Fix acpidbg tool path in the ACPI documentation (Flavio Suligoi)"
* tag 'acpi-5.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: sysfs: Prevent get_status() from returning acpi_status
ACPI / device_sysfs: Avoid OF modalias creation for removed device
ACPI / configfs: Mark local data structures static
ACPI / configfs: Mark local functions static
ACPI: tables: Simplify PPTT leaf node detection
ACPI: Documentation: Fix path for acpidbg tool
Linus Torvalds [Thu, 14 Mar 2019 17:30:06 +0000 (10:30 -0700)]
Merge tag 'pm-5.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These are mostly fixes and cleanups on top of the previously merged
power management material for 5.1-rc1 with one cpupower utility update
that wasn't pushed earlier due to unfortunate timing.
Specifics:
- Fix registration of new cpuidle governors partially broken during
the 5.0 development cycle by mistake (Rafael Wysocki).
- Avoid integer overflows in the menu cpuidle governor by making it
discard the overflowing data points upfront (Rafael Wysocki).
- Fix minor mistake in the recent update of the iowait boost
computation in the intel_pstate driver (Rafael Wysocki).
- Drop incorrect __init annotation from one function in the pxa2xx
cpufreq driver (Arnd Bergmann).
- Fix the operating performance points (OPP) framework initialization
for devices in multiple power domains if only one of them is
scalable (Rajendra Nayak).
- Fix mistake in dev_pm_opp_set_rate() which causes it to skip
updating the performance state if the new frequency is the same as
the old one (Viresh Kumar).
- Rework the cancellation of wakeup source timers to avoid potential
issues with it and do some cleanups unlocked by that change (Viresh
Kumar, Rafael Wysocki).
- Clean up the code computing the active/suspended time of devices in
the PM-runtime framework after recent changes (Ulf Hansson).
- Make the power management infrastructure code use pr_fmt()
consistently (Joe Perches).
- Clean up the generic power domains (genpd) framework somewhat
(Aisheng Dong).
- Improve kerneldoc comments for two functions in the cpufreq core
(Rafael Wysocki).
- Fix typo in a PM QoS file description comment (Aisheng Dong).
- Update the handling of CPU boost frequencies in the cpupower
utility (Abhishek Goel)"
* tag 'pm-5.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle: governor: Add new governors to cpuidle_governors again
cpufreq: intel_pstate: Fix up iowait_boost computation
PM / OPP: Update performance state when freq == old_freq
PM / wakeup: Drop wakeup_source_drop()
PM / wakeup: Rework wakeup source timer cancellation
PM / domains: Remove one unnecessary blank line
PM / Domains: Return early for all errors in _genpd_power_off()
PM / Domains: Improve warn for multiple states but no governor
OPP: Fix handling of multiple power domains
PM / QoS: Fix typo in file description
cpufreq: pxa2xx: remove incorrect __init annotation
PM-runtime: Call pm_runtime_active|suspended_time() from sysfs
PM-runtime: Consolidate code to get active/suspended time
PM: Add and use pr_fmt()
cpufreq: Improve kerneldoc comments for cpufreq_cpu_get/put()
cpuidle: menu: Avoid overflows when computing variance
tools/power/cpupower: Display boost frequency separately
Linus Torvalds [Thu, 14 Mar 2019 16:11:54 +0000 (09:11 -0700)]
Merge tag 'dmaengine-5.1-rc1' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine updates from Vinod Koul:
- dmatest updates for modularizing common struct and code
- remove SG support for VDMA xilinx IP and updates to driver
- Update to dw driver to support Intel iDMA controllers multi-block
support
- tegra updates for proper reporting of residue
- Add Snow Ridge ioatdma device id and support for IOATDMA v3.4
- struct_size() usage and useless LIST_HEAD cleanups in subsystem.
- qDMA controller driver for Layerscape SoCs
- stm32-dma PM Runtime support
- And usual updates to imx-sdma, sprd, Documentation, fsl-edma,
bcm2835, qcom_hidma etc
* tag 'dmaengine-5.1-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (81 commits)
dmaengine: imx-sdma: fix consistent dma test failures
dmaengine: imx-sdma: add a test for imx8mq multi sdma devices
dmaengine: imx-sdma: add clock ratio 1:1 check
dmaengine: dmatest: move test data alloc & free into functions
dmaengine: dmatest: add short-hand `buf_size` var in dmatest_func()
dmaengine: dmatest: wrap src & dst data into a struct
dmaengine: ioatdma: support latency tolerance report (LTR) for v3.4
dmaengine: ioatdma: add descriptor pre-fetch support for v3.4
dmaengine: ioatdma: disable DCA enabling on IOATDMA v3.4
dmaengine: ioatdma: Add Snow Ridge ioatdma device id
dmaengine: sprd: Change channel id to slave id for DMA cell specifier
dt-bindings: dmaengine: sprd: Change channel id to slave id for DMA cell specifier
dmaengine: mv_xor: Use correct device for DMA API
Documentation :dmaengine: clarify DMA desc. pointer after submission
Documentation: dmaengine: fix dmatest.rst warning
dmaengine: k3dma: Add support for dma-channel-mask
dmaengine: k3dma: Delete axi_config
dmaengine: k3dma: Upgrade k3dma driver to support hisi_asp_dma hardware
Documentation: bindings: dma: Add binding for dma-channel-mask
Documentation: bindings: k3dma: Extend the k3dma driver binding to support hisi-asp
...
Linus Torvalds [Thu, 14 Mar 2019 16:00:06 +0000 (09:00 -0700)]
Merge tag 'rproc-v5.1' of git://github.com/andersson/remoteproc
Pull remoteproc updates from Bjorn Andersson:
"This contains the last patches in Loic's remoteproc resource table
handling changes, a number of updates to documentation, support for
invoking the crash handler (for testing purposes), a fix for the
handling of virtio devices during recovery, performance state votes in
Qualcomm modem driver, support for specifying board specific firmware
path for Qualcomm modem driver and improved support for graceful
shutdown of Qualcomm remoteprocs"
* tag 'rproc-v5.1' of git://github.com/andersson/remoteproc: (33 commits)
remoteproc: fix for "dma-mapping: remove the DMA_MEMORY_EXCLUSIVE flag"
remoteproc: fix rproc_check_carveout_da() returned error and comments
remoteproc: fix trace buffer va initialization
remoteproc: fix rproc_alloc_carveout() for rproc with iommu domain
remoteproc: add warning on resource table cast
remoteproc: fix rproc_alloc_carveout() bad variable cast
remoteproc: fix rproc_da_to_va in case of unallocated carveout
remoteproc: correct rproc_mem_entry_init() comments
remoteproc: fix recovery procedure
rpmsg: virtio: change header file sort style
rpmsg: virtio: allocate buffer from parent
remoteproc: st: add reserved memory support
remoteproc: create vdev subdevice with specific dma memory pool
remoteproc: q6v5_adsp: Remove voting for lpass_aon clock
dt-binding: remoteproc: Remove lpass_aon clock from adsp pil clock list
remoteproc: q6v5-mss: Active powerdomain for SDM845
remoteproc: q6v5-mss: Vote for rpmh power domains
remoteproc: qcom: Add support for parsing fw dt bindings
remoteproc: qcom_q6v5: don't auto boot remote processor
remoteproc: qcom: Wait for shutdown-ack/ind on sysmon shutdown
...
Linus Torvalds [Thu, 14 Mar 2019 15:46:17 +0000 (08:46 -0700)]
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk subsystem updates from Stephen Boyd:
"We have a fairly balanced mix of clk driver updates and clk framework
updates this time around. It's the usual pile of new drivers for new
hardware out there and the normal small fixes and updates, but then we
have some core framework changes too.
In the core framework, we introduce support for a clk_get_optional()
API to get clks that may not always be populated and a way to devm
manage clkdev lookups registered by provider drivers. We also do some
refactoring to simplify the interface between clkdev and the common
clk framework so we can reuse the DT parsing and clk_get() path in
provider drivers in the future. This work will continue in the next
few cycles while we convert how providers specify clk parents.
On the driver side, the biggest part of the dirstat is the Amlogic clk
driver that got support for the G12A SoC. It dominates with almost
half the overall diff, while the second largest part of the diff is in
the i.MX clk driver that gained support for imx8mm SoCs. After that,
we have the Actions Semiconductor and Qualcomm drivers rounding out
the big part of the dirstat because they both got new hardware support
for SoCs. The rest is just various updates and non-critical fixes for
existing drivers.
Core:
- Convert a few clk bindings to JSON schema format
- Add a {devm_}clk_get_optional() API
- Add devm_clk_hw_register_clkdev() API to manage clkdev lookups
- Start rewriting clk parent registration and supporting device links
by moving around code that supports clk_get() and DT parsing of the
'clocks' property
New Drivers:
- Add Qualcomm MSM8998 RPM managed clks
- IPA clk support on Qualcomm RPMh clk controllers
- Actions Semi S500 SoC clk support
- Support for fixed rate clks populated from an MMIO register
- Add RPC (QSPI/HyperFLASH) clocks on Renesas R-Car V3H
- Add TMU (timer) clocks on Renesas RZ/G2E
- Add Amlogic G12A Always-On Clock Controller
- Add 32k clock generation for Amlogic AXG
- Add support for the Mali GPU clocks on Amlogic Meson8
- Add Amlogic G12A EE clock controller driver
- Add missing CANFD clocks on Renesas RZ/G2M and RZ/G2E
- Add i.MX8MM SoC clk driver support
Removed Drivers:
- Remove clps711x driver as the board support is gone
Updates:
- 3rd ECO fix for Mediatek MT2712 SoCs
- Updates for Qualcomm MSM8998 GCC clks
- Random static analysis fixes for clk drivers
- Support for sleeping gpios in the clk-gpio type
- Minor fixes for STM32MP1 clk driver (parents, critical flag, etc.)
- Split LCDC into two clks on the Marvell MMP2 SoC
- Various DT of_node refcount fixes
- Get rid of CLK_IS_BASIC from TI code (yay!)
- TI Autoidle clk support
- Fix Amlogic Meson8 APB clock ID name
- Claim input clocks through DT for Amlogic AXG and GXBB
- Correct the DU (display unit) parent clock on Renesas RZ/G2E
- Exynos5433 IMEM CMU crypto clk support (SlimSS)
- Fix for the PLL-MIPI on the Allwinner A23
- Fix Rockchip rk3328 PLL rate calculation
- Add SET_RATE_PARENT flag on display clk of Rockchip rk3066
- i.MX SCU clk driver clk_set_parent() and cpufreq support"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (150 commits)
dt-bindings: clock: imx8mq: Fix numbering overlaps and gaps
clk: ti: clkctrl: Fix clkdm_name regression for TI_CLK_CLKCTRL_COMPAT
clk: fixup default index for of_clk_get_by_name()
clk: Move of_clk_*() APIs into clk.c from clkdev.c
clk: Inform the core about consumer devices
clk: Introduce of_clk_get_hw_from_clkspec()
clk: core: clarify the check for runtime PM
clk: Combine __clk_get() and __clk_create_clk()
clk: imx8mq: add GPIO clocks to clock tree
clk: mediatek: correct cpu clock name for MT8173 SoC
clk: imx: Refactor entire sccg pll clk
clk: imx: scu: add cpu frequency scaling support
clk: mediatek: Mark bus and DRAM related clocks as critical
clk: mediatek: Add flags to mtk_gate
clk: mediatek: Add MUX_FLAGS macro
clk: qcom: gcc-sdm845: Define parent of PCIe PIPE clocks
clk: ingenic: Remove set but not used variable 'enable'
clk: at91: programmable: remove unneeded register read
clk: mediatek: using CLK_MUX_ROUND_CLOSEST for the clock of dpi1_sel
clk: mediatek: add MUX_GATE_FLAGS_2
...
Rafael J. Wysocki [Thu, 14 Mar 2019 09:53:08 +0000 (10:53 +0100)]
Merge branch 'pm-domains'
* pm-domains:
PM / domains: Remove one unnecessary blank line
PM / Domains: Return early for all errors in _genpd_power_off()
PM / Domains: Improve warn for multiple states but no governor
Xin Long [Wed, 13 Mar 2019 09:00:48 +0000 (17:00 +0800)]
pptp: dst_release sk_dst_cache in pptp_sock_destruct
sk_setup_caps() is called to set sk->sk_dst_cache in pptp_connect,
so we have to dst_release(sk->sk_dst_cache) in pptp_sock_destruct,
otherwise, the dst refcnt will leak.
unregister_netdevice: waiting for lo to become free. Usage count = 1
v1->v2:
- use rcu_dereference_protected() instead of rcu_dereference_check(),
as suggested by Eric.
Fixes: 00959ade36ac ("PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
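The leak described above comes down to an unbalanced hold/release pair. A minimal user-space sketch of that pairing follows; every type and function name in it is hypothetical and only models the reference counting, not the real dst or socket APIs.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a dst entry and a socket. */
struct dst_sketch { int refcnt; };
struct sk_sketch { struct dst_sketch *dst_cache; };

/* connect path: cache the route in the socket and take a reference */
void connect_sketch(struct sk_sketch *sk, struct dst_sketch *dst)
{
    dst->refcnt++;
    sk->dst_cache = dst;
}

/* destruct path: drop the cached reference; omitting this is the leak */
void destruct_sketch(struct sk_sketch *sk)
{
    if (sk->dst_cache) {
        sk->dst_cache->refcnt--;
        sk->dst_cache = NULL;
    }
}
```

In this model the count returns to its starting value only if the destructor releases what connect took, which is the condition the device teardown is waiting on.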
Florian Fainelli [Tue, 12 Mar 2019 17:50:59 +0000 (10:50 -0700)]
MAINTAINERS: GENET & SYSTEMPORT: Add internal Broadcom list
There is a patchwork instance behind bcm-kernel-feedback-list that is
helpful to track submissions, add this list for the Broadcom GENET and
SYSTEMPORT drivers.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Local variable description: ----addr@___sys_recvmsg
Variable was created at:
___sys_recvmsg+0xf6/0x1310 net/socket.c:2244
do_recvmmsg+0x646/0x10c0 net/socket.c:2390
Bytes 0-31 of 32 are uninitialized
Memory access of size 32 starts at ffff8880ae62fbb0
Data copied to user address 0000000020000000
Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vakul Garg [Tue, 12 Mar 2019 08:22:57 +0000 (08:22 +0000)]
net/tls: Inform user space about send buffer availability
A previous fix ("tls: Fix write space handling") assumed that
user space application gets informed about the socket send buffer
availability when tls_push_sg() gets called. Inside tls_push_sg(), in
case do_tcp_sendpages() returns 0, the function returns without calling
ctx->sk_write_space. Further, the new function tls_sw_write_space()
did not invoke ctx->sk_write_space. This leads to a situation in which
the user space application locks up, waiting forever for the socket
send buffer to become available.
Rather than call ctx->sk_write_space from tls_push_sg(), it should be
called from tls_write_space. So whenever the tcp stack invokes
sk->sk_write_space after freeing socket send buffer, we always pass
that on to user space by invoking ctx->sk_write_space.
Fixes: 7463d3a2db0ef ("tls: Fix write space handling") Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
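The callback chaining described above can be sketched in user space as follows. This is a simplified model, not the kernel code: the struct, its fields, and the function names are hypothetical, and the retransmission of the pending record is reduced to clearing a flag.

```c
struct sock_sketch;
typedef void (*write_space_fn)(struct sock_sketch *);

/* Hypothetical socket: the TLS layer replaces sk_write_space and keeps
 * the original callback (ctx->sk_write_space in the message above). */
struct sock_sketch {
    write_space_fn sk_write_space;     /* installed by the TLS layer */
    write_space_fn saved_write_space;  /* original, wakes user space */
    int user_wakeups;
    int pending_record;
};

/* Original callback: counts wakeups delivered to user space. */
void user_write_space_sketch(struct sock_sketch *sk)
{
    sk->user_wakeups++;
}

/* TLS callback: handle pending data, then always forward the event. */
void tls_write_space_sketch(struct sock_sketch *sk)
{
    sk->pending_record = 0;      /* stands in for pushing pending data */
    sk->saved_write_space(sk);   /* unconditional forward to user space */
}
```

The point of the design is that the forward happens in one place, on every write-space event, instead of depending on which send path happened to run.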
Zhike Wang [Mon, 11 Mar 2019 10:15:54 +0000 (03:15 -0700)]
net_sched: return correct value for *notify* functions
It is confusing to directly use the return value of netlink_send()/
netlink_unicast() as the return value of *notify*, as a non-zero value
may not be an error at all.
Example: in tc_del_tfilter(), after calling tfilter_del_notify(), it will
goto errout if (err). However, netlink_send()/netlink_unicast() return a
positive value even in the successful case. So it may not call
tcf_chain_tp_remove() and so on to clean up the resource; as a result,
the resource is leaked.
It may be easier to only check the return value of tfilter_del_notify(),
but it is cleaner to correct all related functions.
Co-developed-by: Zengmo Gao <gaozengmo@jd.com> Signed-off-by: Zhike Wang <wangzhike@jd.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
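A minimal user-space sketch of the problem and of one way to normalize the return value is given here. All names are hypothetical; fake_netlink_send() only imitates the "bytes sent on success, negative errno on failure" convention, and the clamp in notify_fixed() is an assumption about the shape of the fix, not a copy of it.

```c
#include <errno.h>

/* Hypothetical stand-in for a netlink send: returns the number of bytes
 * sent (> 0) on success, or a negative errno on failure. */
int fake_netlink_send(int len, int fail)
{
    return fail ? -ENOBUFS : len;
}

/* Before: the raw send result leaks out as the notify result. */
int notify_raw(int len, int fail)
{
    return fake_netlink_send(len, fail);
}

/* After: success is reported as 0, errors stay negative. */
int notify_fixed(int len, int fail)
{
    int err = fake_netlink_send(len, fail);

    return err > 0 ? 0 : err;
}

/* Caller pattern from the message: bail out on any non-zero result,
 * otherwise run the cleanup. */
int delete_filter_sketch(int (*notify)(int, int), int *cleaned)
{
    int err = notify(64, 0);

    if (err)
        return err;   /* "goto errout": cleanup is skipped */
    *cleaned = 1;     /* stands in for tcf_chain_tp_remove() etc. */
    return 0;
}
```

With notify_raw the caller skips cleanup even though the send succeeded; with notify_fixed the same caller runs it.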
Bryan Whitehead [Wed, 13 Mar 2019 19:55:48 +0000 (15:55 -0400)]
lan743x: Fix TX Stall Issue
It has been observed that the tx queue may stall while downloading
from certain web sites (for example, www.speedtest.net).
The cause has been tracked down to a corner case where
the tx interrupt vector was disabled automatically, but
was not re-enabled later.
The lan743x has two mechanisms to enable/disable individual
interrupts. Interrupts can be enabled/disabled by individual
source, and they can also be enabled/disabled by individual
vector which has been mapped to the source. Both must be
enabled for interrupts to work properly.
The TX code path primarily uses the interrupt enable/disable of
the TX source bit, while leaving the vector enabled all the time.
However, while investigating this issue it was noticed that
the driver requested the use of the vector auto clear feature.
The test above revealed a case where the vector enable was
cleared unintentionally.
This patch fixes the issue by deleting the lines that request
the vector auto clear feature to be used.
Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
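The two-level enable scheme and the effect of vector auto clear can be modelled in a few lines. This is an illustrative sketch only: the struct and functions are hypothetical, no real lan743x registers are represented, and the assumption that servicing an interrupt clears the vector enable when auto clear is requested is taken from the description above.

```c
#include <stdbool.h>

/* Hypothetical model of one interrupt: it can fire only when both the
 * source enable and the vector enable are set. */
struct irq_sketch {
    bool source_en;
    bool vector_en;
    bool vector_auto_clear;   /* the feature the patch stops requesting */
};

bool irq_can_fire(const struct irq_sketch *irq)
{
    return irq->source_en && irq->vector_en;
}

/* Servicing an interrupt: the TX path masks the source; with auto clear
 * requested, the vector enable is cleared as a side effect. */
void irq_service(struct irq_sketch *irq)
{
    if (irq->vector_auto_clear)
        irq->vector_en = false;
    irq->source_en = false;
}

/* The TX path re-enables only the source bit afterwards. */
void tx_reenable(struct irq_sketch *irq)
{
    irq->source_en = true;
}
```

In this model, a service/re-enable cycle leaves the interrupt dead when auto clear is on and alive when it is off, which matches the stall and its fix.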
Linus Torvalds [Wed, 13 Mar 2019 18:10:42 +0000 (11:10 -0700)]
Merge tag 'selinux-pr-20190312' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fixes from Paul Moore:
"Two small fixes for SELinux in v5.1: one adds a buffer length check to
the SELinux SCTP code, the other ensures that the SELinux labeling for
a NFS mount is not disabled if the filesystem is mounted twice"
* tag 'selinux-pr-20190312' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
security/selinux: fix SECURITY_LSM_NATIVE_LABELS on reused superblock
selinux: add the missing walk_size + len check in selinux_sctp_bind_connect
Linus Torvalds [Wed, 13 Mar 2019 18:07:36 +0000 (11:07 -0700)]
Merge tag 'apparmor-pr-2019-03-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor fixes from John Johansen:
- fix double free when failing to unpack secmark rules in policy
- fix leak of dentry when profile is removed
* tag 'apparmor-pr-2019-03-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: fix double free when unpack of secmark rules fails
apparmor: delete the dentry in aafs_remove() to avoid a leak
apparmor: Fix warning about unused function apparmor_ipv6_postroute
Linus Torvalds [Wed, 13 Mar 2019 17:06:28 +0000 (10:06 -0700)]
Merge tag 'kconfig-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kconfig updates from Masahiro Yamada:
- rename lexer and parse files
- fix 'Save as' menu of xconfig
* tag 'kconfig-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: fix 'Save As' menu of xconfig
kconfig: rename zconf.y to parser.y
kconfig: rename zconf.l to lexer.l
Linus Torvalds [Wed, 13 Mar 2019 17:01:10 +0000 (10:01 -0700)]
Merge tag 'pwm/for-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
Pull pwm updates from Thierry Reding:
"The changes for this cycle are across the board.
The bulk of it is cleanups, but there's also new device support in
some drivers as well as more conversions to the atomic API"
* tag 'pwm/for-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (24 commits)
pwm: atmel: Remove useless symbolic definitions
pwm: bcm-kona: Update macros to remove braces around numbers
pwm: imx27: Only enable the clocks once in .get_state()
pwm: rcar: Improve calculation of divider
pwm: rcar: Remove legacy APIs
pwm: rcar: Use "atomic" API on rcar_pwm_resume()
pwm: rcar: Add support "atomic" API
pwm: atmel: Add support for SAM9X60's PWM controller
pwm: atmel: Add PWM binding for SAM9X60
pwm: atmel: Rename objects of type atmel_pwm_data
pwm: atmel: Add support for controllers with 32 bit counters
pwm: atmel: Add struct atmel_pwm_data
pwm: Add MediaTek MT8183 display PWM driver support
pwm: hibvt: Add hi3559v100 support
dt-bindings: pwm: hibvt: Add hi3559v100 support
pwm: hibvt: Use individual struct per of-data
pwm: imx: Signedness bug in imx_pwm_get_state()
pwm: imx: Split into two drivers
pwm: imx: Don't print an error on -EPROBE_DEFER
pwm: imx: Set driver data earlier simplifying the end of ->probe()
...
Linus Torvalds [Wed, 13 Mar 2019 16:59:08 +0000 (09:59 -0700)]
Merge tag 'mailbox-v5.1' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar:
- mailbox-test: support multiple controller instances
- misc cleanup: IMX, STM32 and Tegra
- new driver: ZynqMP IPI
* tag 'mailbox-v5.1' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
mailbox: imx: keep MU irq working during suspend/resume
dt-bindings: mailbox: Add Xilinx IPI Mailbox
mailbox: ZynqMP IPI mailbox controller
mailbox: stm32-ipcc: remove useless device_init_wakeup call
mailbox: stm32-ipcc: do not enable wakeup source by default
mailbox: mailbox-test: fix null pointer if no mmio
mailbox: mailbox-test: fix debugfs in multi-instances
mailbox: tegra-hsp: mark suspend function as __maybe_unused
Linus Torvalds [Wed, 13 Mar 2019 16:41:18 +0000 (09:41 -0700)]
Merge tag 'libnvdimm-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"The bulk of this has been in -next since before the merge window
opened, with no known collisions / issues reported.
The only detail worth noting, outside the summary below, is that the
"libnvdimm-start-pad" topic has been truncated to just cleanups and
small fixes. The full topic branch would have doubled down on hacks
around the "section alignment" limitation of the core-mm, instead
effort is now being spent to address that root issue in the memory
hotplug implementation for v5.2.
- Fix nfit-bus command submission regression
- Support retrieval of short-ARS results if the ARS state is
"requires continuation", and even if the "no_init_ars" module
parameter is specified
- Allow busy-polling of the kernel ARS state by allowing root to
reset the exponential back-off timer
- Filter potentially stale ARS results by tracking query-ARS relative
to the previous start-ARS
- Enhance dax_device alignment checks
- Add support for the Hyper-V family of device-specific-methods
(DSMs)
- Add several fixes and workarounds for Hyper-V compatibility
- Fix support to cache the dirty-shutdown-count at init"
* tag 'libnvdimm-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (25 commits)
libnvdimm/namespace: Clean up holder_class_store()
libnvdimm/of_pmem: Fix platform_no_drv_owner.cocci warnings
acpi/nfit: Update NFIT flags error message
libnvdimm/btt: Fix LBA masking during 'free list' population
libnvdimm/btt: Remove unnecessary code in btt_freelist_init
libnvdimm/pfn: Remove dax_label_reserve
dax: Check the end of the block-device capacity with dax_direct_access()
nfit/ars: Avoid stale ARS results
nfit/ars: Allow root to busy-poll the ARS state machine
nfit/ars: Introduce scrub_flags
nfit/ars: Remove ars_start_flags
nfit/ars: Attempt short-ARS even in the no_init_ars case
nfit/ars: Attempt a short-ARS whenever the ARS state is idle at boot
acpi/nfit: Require opt-in for read-only label configurations
libnvdimm/pmem: Honor force_raw for legacy pmem regions
libnvdimm/pfn: Account for PAGE_SIZE > info-block-size in nd_pfn_init()
libnvdimm: Fix altmap reservation size calculation
libnvdimm, pfn: Fix over-trim in trim_pfn_device()
acpi/nfit: Fix bus command validation
libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family
...
Linus Torvalds [Wed, 13 Mar 2019 16:37:09 +0000 (09:37 -0700)]
Merge tag 'fsdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull filesystem-dax updates from Dan Williams:
- Fix handling of PMD-sized entries in the Xarray that lead to a crash
scenario
- Miscellaneous cleanups and small fixes
* tag 'fsdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
dax: Flush partial PMDs correctly
fs/dax: NIT fix comment regarding start/end vs range
fs/dax: Convert to use vmf_error()
Stephen Rothwell [Fri, 22 Feb 2019 05:14:45 +0000 (16:14 +1100)]
remoteproc: fix for "dma-mapping: remove the DMA_MEMORY_EXCLUSIVE flag"
The commit 82c5de0ab8db ("dma-mapping: remove the DMA_MEMORY_EXCLUSIVE
flag") removed the "flags" parameter for dma_declare_coherent_memory().
Remove the parameter from the call in rproc_add_virtio_dev().
Rafael J. Wysocki [Tue, 12 Mar 2019 18:13:13 +0000 (19:13 +0100)]
cpuidle: governor: Add new governors to cpuidle_governors again
After commit 61cb5758d3c4 ("cpuidle: Add cpuidle.governor= command
line parameter") new cpuidle governors are not added to the list
of available governors, so governor selection via sysfs doesn't
work as expected (even though it is rarely used anyway).
Fix that by making cpuidle_register_governor() add new governors to
cpuidle_governors again.
Fixes: 61cb5758d3c4 ("cpuidle: Add cpuidle.governor= command line parameter") Reported-by: Kees Cook <keescook@chromium.org> Cc: 5.0+ <stable@vger.kernel.org> # 5.0+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
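The effect of the missing list insertion can be illustrated with a small user-space sketch. The types and functions are hypothetical and far simpler than the cpuidle core; the only point is that lookup by name depends on registration linking the governor into the list.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical governor descriptor kept on a singly linked list. */
struct governor_sketch {
    const char *name;
    int rating;
    struct governor_sketch *next;
};

static struct governor_sketch *governors_sketch;  /* list head */

/* Lookup by name, as a sysfs-style selection would need. */
struct governor_sketch *find_governor_sketch(const char *name)
{
    struct governor_sketch *g;

    for (g = governors_sketch; g; g = g->next)
        if (strcmp(g->name, name) == 0)
            return g;
    return NULL;
}

/* Registration must link the governor into the list; without that
 * step the governor can never be found by name. */
int register_governor_sketch(struct governor_sketch *gov)
{
    if (find_governor_sketch(gov->name))
        return -EEXIST;
    gov->next = governors_sketch;
    governors_sketch = gov;
    return 0;
}
```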
Linus Torvalds [Tue, 12 Mar 2019 22:06:54 +0000 (15:06 -0700)]
Merge tag 'nfsd-5.1' of git://linux-nfs.org/~bfields/linux
Pull NFS server updates from Bruce Fields:
"Miscellaneous NFS server fixes.
Probably the most visible bug is one that could artificially limit
NFSv4.1 performance by limiting the number of outstanding rpcs from a
single client.
Neil Brown also gets a special mention for fixing a 14.5-year-old
memory-corruption bug in the encoding of NFSv3 readdir responses"
* tag 'nfsd-5.1' of git://linux-nfs.org/~bfields/linux:
nfsd: allow nfsv3 readdir request to be larger.
nfsd: fix wrong check in write_v4_end_grace()
nfsd: fix memory corruption caused by readdir
nfsd: fix performance-limiting session calculation
svcrpc: fix UDP on servers with lots of threads
svcrdma: Remove syslog warnings in work completion handlers
svcrdma: Squelch compiler warning when SUNRPC_DEBUG is disabled
svcrdma: Use struct_size() in kmalloc()
svcrpc: fix unlikely races preventing queueing of sockets
svcrpc: svc_xprt_has_something_to_do seems a little long
SUNRPC: Don't allow compiler optimisation of svc_xprt_release_slot()
nfsd: fix an IS_ERR() vs NULL check
Linus Torvalds [Tue, 12 Mar 2019 22:03:21 +0000 (15:03 -0700)]
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"A large number of bug fixes and cleanups.
One new feature to allow users to more easily find the jbd2 journal
thread for a particular ext4 file system"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (25 commits)
jbd2: jbd2_get_transaction does not need to return a value
jbd2: fix invalid descriptor block checksum
ext4: fix bigalloc cluster freeing when hole punching under load
ext4: add sysfs attr /sys/fs/ext4/<disk>/journal_task
ext4: Change debugging support help prefix from EXT4 to Ext4
ext4: fix compile error when using BUFFER_TRACE
jbd2: fix compile warning when using JBUFFER_TRACE
ext4: fix some error pointer dereferences
ext4: annotate more implicit fall throughs
ext4: annotate implicit fall throughs
ext4: don't update s_rev_level if not required
jbd2: fold jbd2_superblock_csum_{verify,set} into their callers
jbd2: fix race when writing superblock
ext4: fix crash during online resizing
ext4: disallow files with EXT4_JOURNAL_DATA_FL from EXT4_IOC_SWAP_BOOT
ext4: add mask of ext4 flags to swap
ext4: update quota information while swapping boot loader inode
ext4: cleanup pagecache before swap i_data
ext4: fix check of inode in swap_inode_boot_loader
ext4: unlock unused_pages timely when doing writeback
...
David S. Miller [Tue, 12 Mar 2019 22:00:15 +0000 (15:00 -0700)]
Merge branch 'mlx4-fixes'
Tariq Toukan says:
====================
mlx4_core misc fixes
This patchset by Jack contains misc fixes to the mlx4 Core driver.
Patch 1 fixes a use-after-free situation by marking (nullifying) the pointer,
please queue for -stable >= v4.0.
Patch 2 adds a missing lock acquire and release in SRIOV command interface,
please queue for -stable >= v4.9.
Patch 3 avoids calling roundup_pow_of_two when argument is zero,
please queue for -stable >= v3.3.
Series generated against net commit: a3b1933d34d5 Merge tag 'mlx5-fixes-2019-03-11' of
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Tue, 12 Mar 2019 15:05:49 +0000 (17:05 +0200)]
net/mlx4_core: Fix qp mtt size calculation
Calculation of qp mtt size (in function mlx4_RST2INIT_wrapper)
ultimately depends on function roundup_pow_of_two.
If the amount of memory required by the QP is less than one page,
roundup_pow_of_two is called with argument zero. In this case, the
roundup_pow_of_two result is undefined.
Calling roundup_pow_of_two with a zero argument resulted in the
following stack trace:
UBSAN: Undefined behaviour in ./include/linux/log2.h:61:13
shift exponent 64 is too large for 64-bit type 'long unsigned int'
CPU: 4 PID: 26939 Comm: rping Tainted: G OE 4.19.0-rc1
Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015
Call Trace:
dump_stack+0x9a/0xeb
ubsan_epilogue+0x9/0x7c
__ubsan_handle_shift_out_of_bounds+0x254/0x29d
? __ubsan_handle_load_invalid_value+0x180/0x180
? debug_show_all_locks+0x310/0x310
? sched_clock+0x5/0x10
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x260
? find_held_lock+0x35/0x1e0
? mlx4_RST2INIT_QP_wrapper+0xfb1/0x1440 [mlx4_core]
mlx4_RST2INIT_QP_wrapper+0xfb1/0x1440 [mlx4_core]
Fix this by explicitly testing for zero, and returning one if the
argument is zero (assuming that the next higher power of 2 in this case
should be one).
Fixes: c82e9aa0a8bc ("mlx4_core: resource tracking for HCA resources used by guests") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
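The zero case can be reproduced outside the kernel. The sketch below is a user-space approximation: roundup_pow_of_two_unsafe() imitates the "shift by the bit length of n - 1" form that produces the shift exponent 64 reported by UBSAN, and qp_mtt_roundup_sketch() is a hypothetical wrapper showing the explicit zero test described above. Neither name exists in the driver.

```c
/* User-space imitation of a round-up-to-power-of-two helper for
 * n >= 1. For n == 0, n - 1 wraps to all ones, the bit length equals
 * the word width, and the shift below would be undefined. */
unsigned long roundup_pow_of_two_unsafe(unsigned long n)
{
    unsigned long v = n - 1;
    unsigned int bits = 0;

    while (v) {          /* bit length of n - 1 */
        bits++;
        v >>= 1;
    }
    return 1UL << bits;  /* undefined if bits equals the word width */
}

/* Hypothetical guarded wrapper: treat zero as rounding up to one. */
unsigned long qp_mtt_roundup_sketch(unsigned long n)
{
    return n ? roundup_pow_of_two_unsafe(n) : 1;
}
```

The guard keeps the undefined shift unreachable for a zero argument; very large arguments above the highest representable power of two are outside what this sketch handles.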
Jack Morgenstein [Tue, 12 Mar 2019 15:05:48 +0000 (17:05 +0200)]
net/mlx4_core: Fix locking in SRIOV mode when switching between events and polling
In procedures mlx4_cmd_use_events() and mlx4_cmd_use_polling(), we need to
guarantee that there are no FW commands in progress on the comm channel
(for VFs) or wrapped FW commands (on the PF) when SRIOV is active.
We do this by also taking the slave_cmd_mutex when SRIOV is active.
This is especially important when switching from event to polling, since we
free the command-context array during the switch. If there are FW commands
in progress (e.g., waiting for a completion event), the completion event
handler will access freed memory.
Since the decision to use comm_wait or comm_poll is taken before grabbing
the event_sem/poll_sem in mlx4_comm_cmd_wait/poll, we must take the
slave_cmd_mutex as well (to guarantee that the decision to use events or
polling and the call to the appropriate cmd function are atomic).
Fixes: a7e1f04905e5 ("net/mlx4_core: Fix deadlock when switching between polling and event fw commands") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Tue, 12 Mar 2019 15:05:47 +0000 (17:05 +0200)]
net/mlx4_core: Fix reset flow when in command polling mode
As part of unloading a device, the driver switches from
FW command event mode to FW command polling mode.
Part of switching over to polling mode is freeing the command context array
memory (unfortunately, currently, without NULLing the command context array
pointer).
The reset flow calls "complete" to complete all outstanding fw commands
(if we are in event mode). The check for event vs. polling mode here
is to test if the command context array pointer is NULL.
If the reset flow is activated after the switch to polling mode, it will
attempt (incorrectly) to complete all the commands in the context array --
because the pointer was not NULLed when the driver switched over to polling
mode.
As a result, we have a use-after-free situation, which results in a
kernel crash.
The fix is to set the command context array pointer to NULL after freeing
the array.
Fixes: f5aef5aa3506 ("net/mlx4_core: Activate reset flow upon fatal command cases") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
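The free-then-NULL pattern that the fix relies on is sketched here in user space. The struct and function names are hypothetical simplifications of the driver state; the reset flow uses the pointer itself as the "event mode" test, as the message describes.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified command state. */
struct cmd_state_sketch {
    int *context;      /* command context array; NULL means polling mode */
    size_t max_cmds;
};

/* Switch to event mode: allocate the context array. */
int use_events_sketch(struct cmd_state_sketch *cmd, size_t max_cmds)
{
    cmd->context = calloc(max_cmds, sizeof(*cmd->context));
    if (!cmd->context)
        return -1;
    cmd->max_cmds = max_cmds;
    return 0;
}

/* Switch to polling mode: free the array and clear the pointer so that
 * later mode checks do not see a dangling address. */
void use_polling_sketch(struct cmd_state_sketch *cmd)
{
    free(cmd->context);
    cmd->context = NULL;
    cmd->max_cmds = 0;
}

/* Reset flow: complete outstanding commands only in event mode.
 * Returns the number of contexts touched. */
size_t reset_flow_sketch(struct cmd_state_sketch *cmd)
{
    size_t i;

    if (!cmd->context)
        return 0;
    for (i = 0; i < cmd->max_cmds; i++)
        cmd->context[i] = 1;   /* stands in for complete() */
    return cmd->max_cmds;
}
```

Without the assignment to NULL in use_polling_sketch(), a reset after the switch would walk freed memory, which is the use-after-free the commit removes.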
Linus Torvalds [Tue, 12 Mar 2019 21:58:35 +0000 (14:58 -0700)]
Merge tag 'ceph-for-5.1-rc1' of git://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"The highlights are:
- rbd will now ignore discards that aren't aligned and big enough to
actually free up some space (myself). This is controlled by the new
alloc_size map option and can be disabled if needed.
- support for rbd deep-flatten feature (myself). Deep-flatten allows
"rbd flatten" to fully disconnect the clone image and its snapshots
from the parent and make the parent snapshot removable.
- a new round of cap handling improvements (Zheng Yan). The kernel
client should now be much more prompt about releasing its caps and
it is possible to put a limit on the number of caps held.
- support for getting ceph.dir.pin extended attribute (Zheng Yan)"
* tag 'ceph-for-5.1-rc1' of git://github.com/ceph/ceph-client: (26 commits)
Documentation: modern versions of ceph are not backed by btrfs
rbd: advertise support for RBD_FEATURE_DEEP_FLATTEN
rbd: whole-object write and zeroout should copyup when snapshots exist
rbd: copyup with an empty snapshot context (aka deep-copyup)
rbd: introduce rbd_obj_issue_copyup_ops()
rbd: stop copying num_osd_ops in rbd_obj_issue_copyup()
rbd: factor out __rbd_osd_req_create()
rbd: clear ->xferred on error from rbd_obj_issue_copyup()
rbd: remove experimental designation from kernel layering
ceph: add mount option to limit caps count
ceph: periodically trim stale dentries
ceph: delete stale dentry when last reference is dropped
ceph: remove dentry_lru file from debugfs
ceph: touch existing cap when handling reply
ceph: pass inclusive lend parameter to filemap_write_and_wait_range()
rbd: round off and ignore discards that are too small
rbd: handle DISCARD and WRITE_ZEROES separately
rbd: get rid of obj_req->obj_request_count
libceph: use struct_size() for kmalloc() in crush_decode()
ceph: send cap releases more aggressively
...
David S. Miller [Tue, 12 Mar 2019 21:55:16 +0000 (14:55 -0700)]
Merge branch 'mlxsw-Various-fixes'
Ido Schimmel says:
====================
mlxsw: Various fixes
Patch #1 fixes the recently introduced QSFP thermal zones to correctly
work with split ports, where several ports are mapped to the same
module.
Patch #2 initializes the base MAC in the minimal driver. The driver is
using the base MAC as its parent ID and without initializing it, it is
reported as all zeroes to user space.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Tue, 12 Mar 2019 08:40:41 +0000 (08:40 +0000)]
mlxsw: core: Prevent duplication during QSFP module initialization
Verify during thermal initialization if QSFP module's entry is already
configured in order to prevent duplication.
Such scenario could happen in case two switch drivers (PCI and I2C
based) coexist and if after boot, splitting configuration is applied
for some ports and then I2C based driver is re-probed.
In such a case, after reboot the same QSFP module, associated with the
split ports, will be discovered by the I2C based driver several times,
and it will cause a crash.
It could happen for example on system equipped with BMC (Baseboard
Management Controller), running I2C based driver, when the next steps
are performed:
- System boot
- Host side configures port split.
- BMC side is rebooted.
Fixes: 6a79507cfe94 ("mlxsw: core: Extend thermal module with per QSFP module thermal zones") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 12 Mar 2019 21:53:57 +0000 (14:53 -0700)]
Merge tag 'for-5.1-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Correctness and a deadlock fixes"
* tag 'for-5.1-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: zstd: ensure reclaim timer is properly cleaned up
btrfs: move ulist allocation out of transaction in quota enable
btrfs: save drop_progress if we drop refs at all
btrfs: check for refs on snapshot delete resume
Btrfs: fix deadlock between clone/dedupe and rename
Btrfs: fix corruption reading shared and compressed extents after hole punching
Kangjie Lu [Tue, 12 Mar 2019 07:43:18 +0000 (02:43 -0500)]
net: sh_eth: fix a missing check of of_get_phy_mode
of_get_phy_mode may fail and return a negative error code;
the fix checks the return value of of_get_phy_mode and
returns NULL if it fails.
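The pattern described above can be sketched in plain C as follows. This is a
minimal user-space illustration, not the sh_eth driver code:
fake_get_phy_mode(), struct pdata and parse_pdata() are hypothetical
stand-ins, and only the idea (treat a negative return value as an error and
return NULL) comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical platform data; stands in for the driver's real structure. */
struct pdata {
	int phy_interface;
};

/*
 * Hypothetical stand-in for of_get_phy_mode(): returns a non-negative
 * mode on success or a negative error code on failure.
 */
static int fake_get_phy_mode(int node_has_phy_mode)
{
	return node_has_phy_mode ? 1 : -ENODEV;
}

/* Returns newly allocated platform data, or NULL if the lookup fails. */
static struct pdata *parse_pdata(int node_has_phy_mode)
{
	struct pdata *pd;
	int ret;

	pd = calloc(1, sizeof(*pd));
	if (!pd)
		return NULL;

	ret = fake_get_phy_mode(node_has_phy_mode);
	if (ret < 0) {
		/* Do not store a negative error code as a PHY mode. */
		free(pd);
		return NULL;
	}
	pd->phy_interface = ret;
	return pd;
}
```

Without the check, the negative error code would be stored as if it were a
valid PHY mode; with it, the caller sees NULL and can bail out.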
Fixes: b356e978e92f ("sh_eth: add device tree support") Signed-off-by: Kangjie Lu <kjlu@umn.edu> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 12 Mar 2019 21:50:42 +0000 (14:50 -0700)]
Merge tag 'nfs-for-5.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Stable fixes:
- Fixes for NFS I/O request leakages
- Fix error handling paths in the NFS I/O recoalescing code
- Reinitialise NFSv4.1 sequence results before retransmitting a
request
- Fix a soft lockup in the delegation recovery code
- Bulk destroy of layouts needs to be safe w.r.t. umount
- Prevent thundering herd issues when the SUNRPC socket is not
connected
- Respect RPC call timeouts when retrying transmission
Features:
- Convert rpc auth layer to use xdr_streams
- Config option to disable insecure RPCSEC_GSS crypto types
- Reduce size of RPC receive buffers
- Readdirplus optimization by cache mechanism
- Convert SUNRPC socket send code to use iov_iter()
- SUNRPC micro-optimisations to avoid indirect calls
- Add support for the pNFS LAYOUTERROR operation and use it with the
pNFS/flexfiles driver
- Add trace events to report non-zero NFS status codes
- Various removals of unnecessary dprintks
Bugfixes and cleanups:
- Fix a number of sparse warnings and documentation format warnings
- Fix nfs_parse_devname to not modify its argument
- Fix potential corruption of page being written through pNFS/blocks
- Fix xfstest generic/099 failures on nfsv3
- Avoid NFSv4.1 "false retries" when RPC calls are interrupted
- Abort I/O early if the pNFS/flexfiles layout segment was
invalidated
- Avoid unnecessary pNFS/flexfiles layout invalidations"
* tag 'nfs-for-5.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (90 commits)
SUNRPC: Take the transport send lock before binding+connecting
SUNRPC: Micro-optimise when the task is known not to be sleeping
SUNRPC: Check whether the task was transmitted before rebind/reconnect
SUNRPC: Remove redundant calls to RPC_IS_QUEUED()
SUNRPC: Clean up
SUNRPC: Respect RPC call timeouts when retrying transmission
SUNRPC: Fix up RPC back channel transmission
SUNRPC: Prevent thundering herd when the socket is not connected
SUNRPC: Allow dynamic allocation of back channel slots
NFSv4.1: Bump the default callback session slot count to 16
SUNRPC: Convert remaining GFP_NOIO, and GFP_NOWAIT sites in sunrpc
NFS/flexfiles: Clean up mirror DS initialisation
NFS/flexfiles: Remove dead code in ff_layout_mirror_valid()
NFS/flexfile: Simplify nfs4_ff_layout_select_ds_stateid()
NFS/flexfile: Simplify nfs4_ff_layout_ds_version()
NFS/flexfiles: Simplify ff_layout_get_ds_cred()
NFS/flexfiles: Simplify nfs4_ff_find_or_create_ds_client()
NFS/flexfiles: Simplify nfs4_ff_layout_select_ds_fh()
NFS/flexfiles: Speed up read failover when DSes are down
NFS/flexfiles: Don't invalidate DS deviceids for being unresponsive
...
Linus Torvalds [Tue, 12 Mar 2019 21:48:52 +0000 (14:48 -0700)]
Merge tag 'ovl-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
"Fix copy up of security related xattrs"
* tag 'ovl-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: Do not lose security.capability xattr over metadata file copy-up
ovl: During copy up, first copy up data and then xattrs