Kees Cook [Tue, 18 May 2021 03:16:57 +0000 (20:16 -0700)]
string.h: Introduce memset_startat() for wiping trailing members and padding
A common idiom in kernel code is to wipe the contents of a structure
starting from a given member. These open-coded cases are usually difficult
to read and very sensitive to struct layout changes. Like memset_after(),
introduce a new helper, memset_startat(), that takes the target struct
instance, the byte to write, and the member name where zeroing should
start.
Note that this doesn't zero padding preceding the target member. For
those cases, memset_after() should be used on the preceding member.
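A rough user-space sketch of the behaviour described above, not the kernel macro itself. The names sketch_offset_of(), sketch_memset_startat(), struct sketch_conn and sketch_conn_reset() are hypothetical and exist only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte offset of `member` inside *obj, computed without typeof(). */
#define sketch_offset_of(obj, member) \
	((size_t)((unsigned char *)&(obj)->member - (unsigned char *)(obj)))

/*
 * Sketch: write byte `v` from the first byte of `member` through the end
 * of *obj. Unlike a production helper, `obj` is evaluated more than once.
 */
#define sketch_memset_startat(obj, v, member)                          \
	memset((unsigned char *)(obj) + sketch_offset_of(obj, member), \
	       (v), sizeof(*(obj)) - sketch_offset_of(obj, member))

/* Hypothetical structure, used only for illustration. */
struct sketch_conn {
	int id;
	char name[6];
	short flags;
	long counter;
};

/* Wipe everything from `flags` onward; `id` and `name` are untouched. */
static void sketch_conn_reset(struct sketch_conn *c)
{
	assert(c != NULL);
	sketch_memset_startat(c, 0, flags);
}
```

The caller names the member instead of hand-computing an offset, so reordering or resizing earlier members does not silently change what gets wiped.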
Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Francis Laniel <laniel_francis@privacyrequired.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Daniel Axtens <dja@axtens.net> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
Kees Cook [Thu, 17 Jun 2021 15:34:19 +0000 (08:34 -0700)]
xfrm: Use memset_after() to clear padding
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.
Clear trailing padding bytes using the new helper so that memset()
doesn't get confused about writing "past the end" of the last struct
member. There is no change to the resulting machine code.
Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
Kees Cook [Tue, 18 May 2021 03:16:57 +0000 (20:16 -0700)]
string.h: Introduce memset_after() for wiping trailing members/padding
A common idiom in kernel code is to wipe the contents of a structure
after a given member. This is especially useful in places where there is
trailing padding. These open-coded cases are usually difficult to read
and very sensitive to struct layout changes. Introduce a new helper,
memset_after(), that takes the target struct instance, the byte to write,
and the member name after which the zeroing should start.
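A minimal user-space sketch of the idea, assuming a structure whose last member is followed by padding on the target ABI. This is not the kernel macro; sketch_offset_end(), sketch_memset_after(), struct sketch_key and sketch_key_scrub_tail() are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offset of the first byte that follows `member` inside *obj. */
#define sketch_offset_end(obj, member)                                 \
	((size_t)((unsigned char *)&(obj)->member -                    \
		  (unsigned char *)(obj)) + sizeof((obj)->member))

/* Sketch: write byte `v` to everything after `member`, padding included. */
#define sketch_memset_after(obj, v, member)                             \
	memset((unsigned char *)(obj) + sketch_offset_end(obj, member), \
	       (v), sizeof(*(obj)) - sketch_offset_end(obj, member))

/* Hypothetical structure; on most ABIs `tag` is followed by padding. */
struct sketch_key {
	long serial;
	char tag;
};

/* Clear only the bytes behind `tag`, leaving both members intact. */
static void sketch_key_scrub_tail(struct sketch_key *k)
{
	assert(k != NULL);
	sketch_memset_after(k, 0, tag);
}
```

Because the destination is computed from the whole object rather than from the last member, the write is not expressed as running "past the end" of that member.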
Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Francis Laniel <laniel_francis@privacyrequired.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Daniel Axtens <dja@axtens.net> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
While the run-time testing of FORTIFY_SOURCE is already present in
LKDTM, there is no testing of the expected compile-time detections. In
preparation for correctly supporting FORTIFY_SOURCE under Clang, adding
additional FORTIFY_SOURCE defenses, and making sure FORTIFY_SOURCE
doesn't silently regress with GCC, introduce a build-time test suite that
checks each expected compile-time failure condition.
As this is relatively backwards from standard build rules in the
sense that a successful test is actually a compile _failure_, create
a wrapper script to check for the correct errors, and wire it up as
a dummy dependency to lib/string.o, collecting the results into a log
file artifact.
Kees Cook [Tue, 3 Aug 2021 05:51:31 +0000 (22:51 -0700)]
fortify: Allow strlen() and strnlen() to pass compile-time known lengths
Under CONFIG_FORTIFY_SOURCE, it is possible for the compiler to perform
strlen() and strnlen() at compile-time when the string size is known.
This is required to support compile-time overflow checking in strlcpy().
Kees Cook [Wed, 4 Aug 2021 21:20:14 +0000 (14:20 -0700)]
fortify: Prepare to improve strnlen() and strlen() warnings
In order to have strlen() use fortified strnlen() internally, swap their
positions in the source. Doing this as part of later changes makes
review difficult, so reorder it here; no code changes.
Cc: Francis Laniel <laniel_francis@privacyrequired.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
The implementation for intra-object overflow in str*-family functions
accidentally dropped compile-time write overflow checking in strcpy(),
leaving it entirely to run-time. Add back the intended check.
Fixes: 6a39e62abbaf ("lib: string.h: detect intra-object overflow in fortified string functions") Cc: Daniel Axtens <dja@axtens.net> Cc: Francis Laniel <laniel_francis@privacyrequired.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Kees Cook [Thu, 13 May 2021 04:51:10 +0000 (21:51 -0700)]
fortify: Explicitly disable Clang support
Clang has never correctly compiled the FORTIFY_SOURCE defenses due to
a couple of bugs:
 - Eliding inlines with matching __builtin_* names
   https://bugs.llvm.org/show_bug.cgi?id=50322
 - Incorrect __builtin_constant_p() of some globals
   https://bugs.llvm.org/show_bug.cgi?id=41459
In the process of making improvements to the FORTIFY_SOURCE defenses, the
first (silent) bug (coincidentally) becomes worked around, but exposes
the latter which breaks the build. As such, Clang must not be used with
CONFIG_FORTIFY_SOURCE until at least the latter bug is fixed (in Clang 13),
and the fortify routines have been rearranged.
Update the Kconfig to reflect the reality of the current situation.
fortify: Move remaining fortify helpers into fortify-string.h
When commit a28a6e860c6c ("string.h: move fortified functions definitions
in a dedicated header.") moved the fortify-specific code, some helpers
were left behind. Move the remaining fortify-specific helpers into
fortify-string.h so they're together where they're used. This requires
that any FORTIFY helper function prototypes be conditionally built to
avoid "no prototype" warnings. Additionally, remove unused helpers.
Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Daniel Axtens <dja@axtens.net> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Dan Williams <dan.j.williams@intel.com> Acked-by: Francis Laniel <laniel_francis@privacyrequired.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Kees Cook <keescook@chromium.org>
Kees Cook [Fri, 18 Jun 2021 17:57:38 +0000 (10:57 -0700)]
lib/string: Move helper functions out of string.c
The core functions of string.c are those that may be implemented by
per-architecture functions, or overloaded by FORTIFY_SOURCE. As a
result, it needs to be built with __NO_FORTIFY. Without this, macros
will collide with function declarations. This was accidentally working
due to -ffreestanding (on some architectures). Make this deterministic
by explicitly setting __NO_FORTIFY and move all the helper functions
into string_helpers.c so that they gain the fortification coverage they
had been missing.
Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Andy Lavr <andy.lavr@gmail.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Kees Cook <keescook@chromium.org>
Since all compilers support __builtin_object_size(), and there is only
one user of __compiletime_object_size, remove it to avoid the needless
indirection. This lets Clang reason about check_copy_size() correctly.
Link: https://github.com/ClangBuiltLinux/linux/issues/1179 Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Marco Elver <elver@google.com> Cc: Arvind Sankar <nivedita@alum.mit.edu> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Gabriel Krisman Bertazi <krisman@collabora.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
Kees Cook [Thu, 20 May 2021 22:33:30 +0000 (15:33 -0700)]
cm4000_cs: Use struct_group() to zero struct cm4000_dev region
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.
Add struct_group() to mark region of struct cm4000_dev that should be
initialized to zero.
Kees Cook [Sun, 1 Aug 2021 00:50:58 +0000 (17:50 -0700)]
can: flexcan: Use struct_group() to zero struct flexcan_regs regions
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.
Add struct_group() to mark both regions of struct flexcan_regs that get
initialized to zero. Avoid the future warnings:
In function 'fortify_memset_chk',
inlined from 'memset_io' at ./include/asm-generic/io.h:1169:2,
inlined from 'flexcan_ram_init' at drivers/net/can/flexcan.c:1403:2:
./include/linux/fortify-string.h:199:4: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
199 | __write_overflow_field(p_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function 'fortify_memset_chk',
inlined from 'memset_io' at ./include/asm-generic/io.h:1169:2,
inlined from 'flexcan_ram_init' at drivers/net/can/flexcan.c:1408:3:
./include/linux/fortify-string.h:199:4: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
199 | __write_overflow_field(p_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cc: Wolfgang Grandegger <wg@grandegger.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: linux-can@vger.kernel.org Cc: netdev@vger.kernel.org Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Kees Cook <keescook@chromium.org>
Kees Cook [Fri, 21 May 2021 02:56:15 +0000 (19:56 -0700)]
HID: roccat: Use struct_group() to zero kone_mouse_event
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.
Add struct_group() to mark region of struct kone_mouse_event that should
be initialized to zero.
Cc: Stefan Achatz <erazor_de@users.sourceforge.net> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com> Cc: linux-input@vger.kernel.org Acked-by: Jiri Kosina <jikos@kernel.org> Link: https://lore.kernel.org/lkml/nycvar.YFH.7.76.2108201810560.15313@cbobk.fhfr.pm Signed-off-by: Kees Cook <keescook@chromium.org>
Kees Cook [Sun, 20 Jun 2021 17:09:58 +0000 (10:09 -0700)]
HID: cp2112: Use struct_group() for memcpy() region
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.
Use struct_group() in struct cp2112_string_report around members report,
length, type, and string, so they can be referenced together. This will
allow memcpy() and sizeof() to more easily reason about sizes, improve
readability, and avoid future warnings about writing beyond the end of
report.
"pahole" shows no size nor member offset changes to struct
cp2112_string_report. "objdump -d" shows no meaningful object
code changes (i.e. only source line number induced differences.)
Kees Cook [Tue, 25 May 2021 06:55:11 +0000 (23:55 -0700)]
drm/mga/mga_ioc32: Use struct_group() for memcpy() region
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.
Use struct_group() in struct drm32_mga_init around members chipset, sgram,
maccess, fb_cpp, front_offset, front_pitch, back_offset, back_pitch,
depth_cpp, depth_offset, depth_pitch, texture_offset, and texture_size,
so they can be referenced together. This will allow memcpy() and sizeof()
to more easily reason about sizes, improve readability, and avoid future
warnings about writing beyond the end of chipset.
"pahole" shows no size nor member offset changes to struct drm32_mga_init.
"objdump -d" shows no meaningful object code changes (i.e. only source
line number induced differences and optimizations).
Note that since this is a UAPI header, __struct_group() is used
directly.
Cc: David Airlie <airlied@linux.ie> Cc: Lee Jones <lee.jones@linaro.org> Cc: dri-devel@lists.freedesktop.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Daniel Vetter <daniel@ffwll.ch> Link: https://lore.kernel.org/lkml/YQKa76A6XuFqgM03@phenom.ffwll.local
Kees Cook [Tue, 18 May 2021 18:31:22 +0000 (11:31 -0700)]
iommu/amd: Use struct_group() for memcpy() region
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.
Use struct_group() in struct ivhd_entry around members ext and hidh, so
they can be referenced together. This will allow memcpy() and sizeof()
to more easily reason about sizes, improve readability, and avoid future
warnings about writing beyond the end of ext.
"pahole" shows no size nor member offset changes to struct ivhd_entry.
"objdump -d" shows no object code changes.
Kees Cook [Tue, 25 May 2021 01:51:54 +0000 (18:51 -0700)]
bnxt_en: Use struct_group_attr() for memcpy() region
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.
Use struct_group() around members queue_id, min_bw, max_bw, tsa, pri_lvl,
and bw_weight so they can be referenced together. This will allow memcpy()
and sizeof() to more easily reason about sizes, improve readability,
and avoid future warnings about writing beyond the end of queue_id.
"pahole" shows no size nor member offset changes to struct bnxt_cos2bw_cfg.
"objdump -d" shows no meaningful object code changes (i.e. only source
line number induced differences and optimizations).
Kees Cook [Tue, 18 May 2021 03:01:15 +0000 (20:01 -0700)]
stddef: Introduce struct_group() helper macro
Kernel code has a regular need to describe groups of members within a
structure usually when they need to be copied or initialized separately
from the rest of the surrounding structure. The generally accepted design
pattern in C is to use a named sub-struct:
	struct foo {
		int one;
		struct {
			int two;
			int three, four;
		} thing;
		int five;
	};
This would allow for traditional references and sizing:
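A minimal sketch of such a reference, reusing the struct foo layout above (copy_thing() is a hypothetical wrapper added for illustration):

```c
#include <assert.h>
#include <string.h>

struct foo {
	int one;
	struct {
		int two;
		int three, four;
	} thing;
	int five;
};

/* Copy only the grouped members; the region's size comes from its type. */
static void copy_thing(struct foo *dst, const struct foo *src)
{
	assert(dst != NULL && src != NULL);
	memcpy(&dst->thing, &src->thing, sizeof(dst->thing));
}
```

Both the destination pointer and the length refer to the same named object, so the compiler can see exactly which bytes are meant.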
However, doing this would mean that referencing struct members enclosed
by such named structs would always require including the sub-struct name
in identifiers:
	do_something(dst.thing.three);
This has tended to be quite inflexible, especially when such groupings
need to be added to established code which causes huge naming churn.
Three workarounds exist in the kernel for this problem, and each has
other negative properties.
To avoid the naming churn, there is a design pattern of adding macro
aliases for the named struct:
	#define f_three thing.three
This ends up polluting the global namespace, and makes it difficult to
search for identifiers.
Another common work-around in kernel code avoids the pollution by avoiding
the named struct entirely, instead identifying the group's boundaries using
either a pair of empty anonymous structs or a pair of zero-element arrays:
	struct foo {
		int one;
		struct { } start;
		int two;
		int three, four;
		struct { } finish;
		int five;
	};

	struct foo {
		int one;
		int start[0];
		int two;
		int three, four;
		int finish[0];
		int five;
	};
This allows code to avoid needing to use a sub-struct name for member
references within the surrounding structure, but loses the benefits of
being able to actually use such a struct, making it rather fragile. Using
these requires open-coded calculation of sizes and offsets. The efforts
made to avoid common mistakes include lots of comments, or adding various
BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason
about the boundaries (e.g. the "start" object looks like it's 0 bytes
in length), making bounds checking depend on open-coded calculations:
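A sketch of what such an open-coded calculation tends to look like with the zero-element array markers shown above. FOO_REGION_SIZE and copy_region() are hypothetical names, and zero-length arrays are a GNU C extension:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Zero-length arrays are a GNU C extension (accepted by GCC and Clang). */
struct foo {
	int one;
	int start[0];
	int two;
	int three, four;
	int finish[0];
	int five;
};

/* The region size has to be derived by hand from the two markers. */
#define FOO_REGION_SIZE \
	(offsetof(struct foo, finish) - offsetof(struct foo, start))

static int copy_region(struct foo *dst, const struct foo *src, size_t length)
{
	assert(dst != NULL && src != NULL);
	if (length > FOO_REGION_SIZE)
		return -EINVAL;
	/* Nothing in the types ties this length to the marked region. */
	memcpy((char *)dst + offsetof(struct foo, start),
	       (const char *)src + offsetof(struct foo, start), length);
	return 0;
}
```

Every user has to repeat the same subtraction correctly, and neither marker has a size the compiler can check the copy against.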
However, the vast majority of places in the kernel that operate on
groups of members do so without any identification of the grouping,
relying either on comments or implicit knowledge of the struct contents,
which is even harder for the compiler to reason about, and results in
even more fragile manual sizing, usually depending on member locations
outside of the region (e.g. to copy "two" and "three", use the start of
"four" to find the size):
In order to have a regular programmatic way to describe a struct
region that can be used for references and sizing, can be examined for
bounds checking, avoids forcing the use of intermediate identifiers,
and avoids polluting the global namespace, introduce the struct_group()
macro. This macro wraps the member declarations to create an anonymous
union of an anonymous struct (no intermediate name) and a named struct
(for references and sizing):
	struct foo {
		int one;
		struct_group(thing,
			int two;
			int three, four;
		);
		int five;
	};

	if (length > sizeof(src.thing))
		return -EINVAL;
	memcpy(&dst.thing, &src.thing, length);
	do_something(dst.three);
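The construct described above (an anonymous union holding an anonymous struct and a named struct with the same members) can be approximated in plain C11. This is a simplified sketch, not the kernel definition; sketch_struct_group() and copy_thing() are hypothetical names:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Sketch of the idea only: the same member list is declared twice inside
 * an anonymous union, once anonymously and once under NAME.
 */
#define sketch_struct_group(NAME, ...)       \
	union {                              \
		struct { __VA_ARGS__ };      \
		struct { __VA_ARGS__ } NAME; \
	}

struct foo {
	int one;
	sketch_struct_group(thing,
		int two;
		int three, four;
	);
	int five;
};

/* Bounds check and copy both refer to the named view of the group. */
static int copy_thing(struct foo *dst, const struct foo *src, size_t length)
{
	assert(dst != NULL && src != NULL);
	if (length > sizeof(src->thing))
		return -EINVAL;
	memcpy(&dst->thing, &src->thing, length);
	return 0;
}
```

Since both views overlay the same storage, dst.three and dst.thing.three name the same int, which is why existing member references keep working while the group gains a size the compiler can reason about.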
There are some rare cases where the resulting struct_group() needs
attributes added, so struct_group_attr() is also introduced to allow
for specifying struct attributes (e.g. __aligned(x) or __packed).
Additionally, there are places where such declarations would like to
have the struct be tagged, so struct_group_tagged() is added.
Given there is a need for a handful of UAPI uses too, the underlying
__struct_group() macro has been defined in UAPI so it can be used there
too.
To avoid confusing scripts/kernel-doc, hide the macro from its struct
parsing.
Kees Cook [Mon, 21 Jun 2021 19:01:01 +0000 (12:01 -0700)]
powerpc: Split memset() to avoid multi-field overflow
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.
Instead of writing across a field boundary with memset(), move the call
to just the array, and an explicit zeroing of the prior field.
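The shape of such a change can be sketched as follows. The structure and function names are hypothetical and do not come from the powerpc code:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout: a scalar immediately followed by an array. */
struct sketch_state {
	unsigned long pending;
	unsigned long slots[8];
	int keep;
};

/*
 * Instead of one memset() aimed at `pending` with a length that also
 * covers `slots`, clear each field through its own lvalue.
 */
static void sketch_state_clear(struct sketch_state *s)
{
	assert(s != NULL);
	s->pending = 0;
	memset(s->slots, 0, sizeof(s->slots));
}
```

Each write now stays inside the field it names, so a per-field bounds check has nothing to object to.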
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Qinglang Miao <miaoqinglang@huawei.com> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: Hulk Robot <hulkci@huawei.com> Cc: Wang Wensheng <wangwensheng4@huawei.com> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/lkml/87czqsnmw9.fsf@mpe.ellerman.id.au
Kees Cook [Mon, 21 Jun 2021 19:07:10 +0000 (12:07 -0700)]
scsi: ibmvscsi: Avoid multi-field memset() overflow by aiming at srp
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.
Instead of writing beyond the end of evt_struct->iu.srp.cmd, target the
upper union (evt_struct->iu.srp) instead, as that's what is being wiped.
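A small sketch of the same idea with made-up types (none of these names are from the ibmvscsi driver): the memset() is aimed at the enclosing union, so the destination and the length describe the same object.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for two differently sized union members. */
struct sketch_cmd {
	unsigned char opcode;
	unsigned char cdb[16];
};

struct sketch_login {
	unsigned char opcode;
	unsigned char data[48];
};

union sketch_srp {
	struct sketch_cmd cmd;
	struct sketch_login login;
};

struct sketch_evt {
	int tag;
	union sketch_srp srp;
};

/* Wipe the whole union rather than writing past its smaller member. */
static void sketch_evt_wipe(struct sketch_evt *evt)
{
	assert(evt != NULL);
	memset(&evt->srp, 0, sizeof(evt->srp));
}
```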
Cc: Tyrel Datwyler <tyreld@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/lkml/yq135rzp79c.fsf@ca-mkp.ca.oracle.com Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com> Link: https://lore.kernel.org/lkml/6eae8434-e9a7-aa74-628b-b515b3695359@linux.ibm.com
pci_iounmap'2: Electric Boogaloo: try to make sense of it all
Nathan Chancellor reports that the recent change to pci_iounmap in
commit 9caea0007601 ("parisc: Declare pci_iounmap() parisc version only
when CONFIG_PCI enabled") causes build errors on arm64.
It took me about two hours to convince myself that I think I know what
the logic of that mess of #ifdef's in the <asm-generic/io.h> header file
really aims to do, and rewrite it to be easier to follow.
Famous last words.
Anyway, the code has now been lifted from that grotty header file into
lib/pci_iomap.c, and has fairly extensive comments about what the logic
is. It also avoids indirecting through another confusing (and badly
named) helper function that has other preprocessor config conditionals.
Let's see what odd architecture did something else strange in this area
to break things. But my arm64 cross build is clean.
Fixes: 9caea0007601 ("parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabled") Reported-by: Nathan Chancellor <nathan@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Ulrich Teichert <krypton@ulrich-teichert.org> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'x86_urgent_for_v5.15_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Prevent an infinite loop in the MCE recovery on return to user space,
which was caused by a second MCE queueing work for the same page and
thereby creating a circular work list.
- Make kern_addr_valid() handle existing PMD entries, which are marked
not present in the higher level page table, correctly instead of
blindly dereferencing them.
- Pass a valid address to sanitize_phys(). This was caused by the
mixture of inclusive and exclusive ranges. memtype_reserve() expects
'end' to be exclusive, but sanitize_phys() wants it inclusive. This
worked so far, but with end being the end of the physical address
space the failure is exposed.
- Increase the maximum supported GPIO numbers for 64bit. Newer SoCs
exceed the previous maximum.
* tag 'x86_urgent_for_v5.15_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Avoid infinite loop for copy from user recovery
x86/mm: Fix kern_addr_valid() to cope with existing but not present entries
x86/platform: Increase maximum GPIO number for X86_64
x86/pat: Pass valid address to sanitize_phys()
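The inclusive/exclusive mismatch behind the sanitize_phys() item in the pull message above can be illustrated with a toy model. This is a simplified sketch under assumed names (SKETCH_PHYS_MASK, sketch_sanitize(), sketch_sanitize_end()) and an artificially small 8-bit address space, not the x86 code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-bit physical address space, for illustration only. */
#define SKETCH_PHYS_MASK ((uint64_t)0xff)

/* Expects an inclusive address and masks it into the address space. */
static uint64_t sketch_sanitize(uint64_t addr)
{
	return addr & SKETCH_PHYS_MASK;
}

/*
 * `end` is exclusive: convert it to the last valid (inclusive) address
 * before sanitizing, then convert the result back to exclusive.
 */
static uint64_t sketch_sanitize_end(uint64_t end)
{
	assert(end > 0);
	return sketch_sanitize(end - 1) + 1;
}
```

Passing the exclusive end of the address space (0x100 in this toy model) straight to sketch_sanitize() wraps it to 0, whereas the converted form keeps it at 0x100.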
Merge tag 'perf-urgent-2021-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event fix from Thomas Gleixner:
"A single fix for the perf core where a value read with READ_ONCE() was
checked and then reread which makes all the checks invalid. Reuse the
already read value instead"
* tag 'perf-urgent-2021-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
events: Reuse value read using READ_ONCE instead of re-reading it
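The class of bug described in the perf message above (validate one load, then act on a second load) can be sketched in user space. sketch_read_once() is a hypothetical stand-in for a single volatile load, not the kernel's READ_ONCE(), and the other names are likewise invented:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX 16

/* Sketch of a single load the compiler may not repeat or elide. */
static size_t sketch_read_once(const volatile size_t *p)
{
	return *p;
}

/* Fragile: validates one load, then returns a second, unchecked load. */
static size_t sketch_clamp_reread(const volatile size_t *shared)
{
	if (sketch_read_once(shared) > SKETCH_MAX)
		return SKETCH_MAX;
	return *shared;	/* may differ from the value that was checked */
}

/* Robust: the value that was checked is the value that is used. */
static size_t sketch_clamp_reuse(const volatile size_t *shared)
{
	size_t v;

	assert(shared != NULL);
	v = sketch_read_once(shared);
	return v > SKETCH_MAX ? SKETCH_MAX : v;
}
```

In a single-threaded run both variants return the same result; the difference only matters when another context can modify the shared value between the two loads.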
Merge tag 'locking-urgent-2021-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"A set of updates for the RT specific reader/writer locking base code:
- Make the fast path reader ordering guarantees correct.
- Code reshuffling to make the fix simpler"
[ This plays ugly games with atomic_add_return_release() because we
don't have a plain atomic_add_release(), and should really be cleaned
up, I think - Linus ]
* tag 'locking-urgent-2021-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rwbase: Take care of ordering guarantee for fastpath reader
locking/rwbase: Extract __rwbase_write_trylock()
locking/rwbase: Properly match set_and_save_state() to restore_state()
Merge tag 'powerpc-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix crashes when scv (System Call Vectored) is used to make a syscall
when a transaction is active, on Power9 or later.
- Fix bad interactions between rfscv (Return-from scv) and Power9
fake-suspend mode.
- Fix crashes when handling machine checks in LPARs using the Hash MMU.
- Partly revert a recent change to our XICS interrupt controller code,
which broke the recently added Microwatt support.
Thanks to Cédric Le Goater, Eirik Fuller, Ganesh Goudar, Gustavo Romero,
Joel Stanley, Nicholas Piggin.
* tag 'powerpc-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/xics: Set the IRQ chip data for the ICS native backend
powerpc/mce: Fix access error in mce handler
KVM: PPC: Book3S HV: Tolerate treclaim. in fake-suspend mode changing registers
powerpc/64s: system call rfscv workaround for TM bugs
selftests/powerpc: Add scv versions of the basic TM syscall tests
powerpc/64s: system call scv tabort fix for corrupt irq soft-mask state
Merge tag 'kbuild-fixes-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Fix bugs in checkkconfigsymbols.py
- Fix missing sys import in gen_compile_commands.py
- Fix missing FORCE warning for ARCH=sh builds
- Fix -Wignored-optimization-argument warnings for Clang builds
- Turn -Wignored-optimization-argument into an error in order to stop
building instead of sprinkling warnings
* tag 'kbuild-fixes-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: Add -Werror=ignored-optimization-argument to CLANG_FLAGS
x86/build: Do not add -falign flags unconditionally for clang
kbuild: Fix comment typo in scripts/Makefile.modpost
sh: Add missing FORCE prerequisites in Makefile
gen_compile_commands: fix missing 'sys' package
checkkconfigsymbols.py: Remove skipping of help lines in parse_kconfig_file
checkkconfigsymbols.py: Forbid passing 'HEAD' to --commit
Merge tag 'perf-tools-fixes-for-v5.15-2021-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix ip display in 'perf script' when output type != attr->type.
- Ignore deprecation warning when using libbpf's btf__get_from_id(),
fixing the build with libbpf v0.6+.
- Make use of FD() robust in libperf, fixing a segfault with 'perf stat
--iostat list'.
- Initialize addr_location:srcline pointer to NULL when resolving
callchain addresses.
- Fix fused instruction logic for assembly functions in 'perf
annotate'.
* tag 'perf-tools-fixes-for-v5.15-2021-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf bpf: Ignore deprecation warning when using libbpf's btf__get_from_id()
libperf evsel: Make use of FD robust.
perf machine: Initialize srcline string member in add_location struct
perf script: Fix ip display when type != attr->type
perf annotate: Fix fused instr logic for assembly functions
dmascc: use proper 'virt_to_bus()' rather than casting to 'int'
The old dmascc driver depends on the legacy ISA_DMA_API, and blindly
just casts the kernel virtual address to 'int' for set_dma_addr().
That works only incidentally, and because the high bits of the address
will be ignored anyway. And on 64-bit architectures it causes warnings.
Admittedly, 64-bit architectures with ISA are basically dead - I think
the only example of this is alpha, and nobody would ever use the dmascc
driver there. But hey, the fix is easy enough, the end result is
cleaner, and it's yet another configuration that now builds without
warnings.
If somebody actually uses this driver on an alpha and this fixes it for
you, please email me. Because that is just incredibly bizarre.
With the previous commit (9caea0007601: "parisc: Declare pci_iounmap()
parisc version only when CONFIG_PCI enabled") we can now enable
GENERIC_PCI_IOMAP unconditionally on alpha, and if PCI is not enabled we
will just get the nice empty helper functions that allow mixed-bus
drivers to build.
Example driver: the old 3com/3c59x.c driver works with either the PCI or
the EISA version of the 3x59x card, but wouldn't build in an EISA-only
configuration because of missing pci_iomap() and pci_iounmap() dummy
wrappers.
Most of the other PCI infrastructure just becomes empty wrappers even
without GENERIC_PCI_IOMAP, and it's not obvious that the pci_iomap
functionality shouldn't do the same, but this works.
parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabled
Linus noticed odd declaration rules for pci_iounmap() in iomap.h and
pci_iomap.h, where it depended on either NO_GENERIC_PCI_IOPORT_MAP or
GENERIC_IOMAP when CONFIG_PCI was disabled.
Testing on parisc seems to indicate that we need pci_iounmap() only when
CONFIG_PCI is enabled, so the declaration of pci_iounmap() can be moved
cleanly into pci_iomap.h in sync with the declarations of pci_iomap().
Sudip Mukherjee reports that this broke pulseaudio with a NULL pointer
dereference in vc4_hdmi_audio_prepare(), bisected it to this commit, and
confirmed that a revert fixed the problem.
9984d6664ce9 ("drm/vc4: hdmi: Make sure the controller is powered in detect")
411efa18e4b0 ("drm/vc4: hdmi: Move the HSM clock enable to runtime_pm")
as Michael Stapelberg reports that the new runtime PM changes cause his
Raspberry Pi 3 to hang on boot, probably due to interactions with other
changes in the DRM tree (because a bisect points to the merge in commit e058a84bfddc: "Merge tag 'drm-next-2021-07-01' of git://.../drm").
Revert these two commits until it's been resolved.
kbuild: Add -Werror=ignored-optimization-argument to CLANG_FLAGS
Similar to commit 589834b3a009 ("kbuild: Add
-Werror=unknown-warning-option to CLANG_FLAGS").
Clang ignores certain GCC flags that it has not implemented, only
emitting a warning:
$ echo | clang -fsyntax-only -falign-jumps -x c -
clang-14: warning: optimization flag '-falign-jumps' is not supported
[-Wignored-optimization-argument]
When one of these flags gets added to KBUILD_CFLAGS unconditionally, all
subsequent cc-{disable-warning,option} calls fail because -Werror was
added to these invocations to turn the above warning and the equivalent
-W flag warning into errors.
To catch the presence of these flags earlier, turn
-Wignored-optimization-argument into an error so that the flags can
either be implemented or ignored via cc-option and there are no more
weird errors.
x86/build: Do not add -falign flags unconditionally for clang
clang does not support -falign-jumps and only recently gained support
for -falign-loops. When one of the configuration options that adds these
flags is enabled, clang warns and all cc-{disable-warning,option} that
follow fail because -Werror gets added to test for the presence of this
warning:
clang-14: warning: optimization flag '-falign-jumps=0' is not supported
[-Wignored-optimization-argument]
To resolve this, add a couple of cc-option calls when building with
clang; gcc has supported these options since 3.2 so there is no point in
testing for their support. -falign-functions was implemented in clang-7,
-falign-loops was implemented in clang-14, and -falign-jumps has not
been implemented yet.
arch/sh/boot/Makefile:87: FORCE prerequisite is missing
Add the missing FORCE prerequisites for all build targets identified by
"make help".
Fixes: e1f86d7b4b2a5213 ("kbuild: warn if FORCE is missing for if_changed(_dep,_rule) and filechk") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
checkkconfigsymbols.py: Remove skipping of help lines in parse_kconfig_file
When parsing Kconfig files to find symbol definitions and references,
lines after a 'help' line are skipped until a new config definition
starts.
However, Kconfig statements can actually be after a help section, as
long as these have shallower indentation. These are skipped by the
parser.
This means that symbols referenced in this kind of statement are
ignored by this function and thus are not considered undefined
references in case the symbol is not defined.
Remove the 'skip' logic entirely, as it is not needed if we just use the
STMT regex to find the end of help lines.
However, this means that keywords that appear as part of the help
message (i.e. with the same indentation as the help lines) will be
considered as references/definitions. This can happen now as well, but
only with REGEX_KCONFIG_DEF lines. Also, the keyword must have a SYMBOL
after it, which probably means that someone referenced a config in the
help so it seems like a bonus :)
The real solution is to keep track of the indentation when the first
help line is encountered and then handle DEF and STMT lines only if the
indentation is shallower.
checkkconfigsymbols.py: Forbid passing 'HEAD' to --commit
As opposed to the --diff option, --commit can get ref names instead of
commit hashes.
When using the --commit option, the script resets the working directory
to the commit before the given ref, by adding '~' to the end of the ref.
However, the 'HEAD' ref is relative, and so when the working directory
is reset to 'HEAD~', 'HEAD' points to what was 'HEAD~'. Then when the
script resets to 'HEAD' it actually stays in the same commit. In this
case, the script won't report any cases because there is no diff between
the cases of the two refs.
Prevent the user from using HEAD refs.
A better solution might be to resolve the refs before doing the
reset, but for now just disallow such refs.
alpha: move __udiv_qrnnd library function to arch/alpha/lib/
We already had the implementation for __udiv_qrnnd (unsigned divide for
multi-precision arithmetic) as part of the alpha math emulation code.
But you can disable the math emulation code - even if you shouldn't -
and then the MPI code that actually wants this functionality (and is
needed by various crypto functions) will fail to build.
So move the extended-precision divide code to be a regular library
function, just like all the regular division code is. That way it is
available regardless of math-emulation.
Ok, it almost certainly is still broken on actual hardware, but the
immediate reason for it having been marked BROKEN was a build error that
is fixed by just making sure the low-level IO header file is included
sufficiently early that the __EXTERN_INLINE hackery takes effect.
This was marked broken back in 2017 by commit 1883c9f49d02 ("alpha: mark
jensen as broken"), but Ulrich Teichert made me look at it as part of my
cross-build work to make sure -Werror actually does the right thing.
There are lots of alpha configurations that do not build cleanly, but
now it's no longer because Jensen wouldn't be buildable. That said,
because the Jensen platform doesn't force PCI to be enabled (Jensen only
had EISA), it ends up being somewhat interesting as a source of odd
configs.
perf bpf: Ignore deprecation warning when using libbpf's btf__get_from_id()
Perf code re-implements libbpf's btf__load_from_kernel_by_id() API as
a weak function, presumably to dynamically link against an old version of
the libbpf shared library. Unfortunately this causes a compilation warning
when perf is compiled against libbpf v0.6+.
For now, just ignore the deprecation warning, but there might be a better
solution, depending on perf's needs.
Ian Rogers [Sat, 18 Sep 2021 05:44:40 +0000 (22:44 -0700)]
libperf evsel: Make use of FD robust.
FD uses xyarray__entry that may return NULL if an index is out of
bounds. If NULL is returned then a segv happens as FD unconditionally
dereferences the pointer. This was happening in a case with perf
iostat as shown below. The fix is to make FD an "int*" rather than an
int and handle the NULL case as either invalid input or a closed fd.
$ sudo gdb --args perf stat --iostat list
...
Breakpoint 1, perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
50 {
(gdb) bt
#0 perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
#1 0x000055555585c188 in evsel__open_cpu (evsel=0x5555560951a0, cpus=0x555556093410,
threads=0x555556086fb0, start_cpu=0, end_cpu=1) at util/evsel.c:1792
#2 0x000055555585cfb2 in evsel__open (evsel=0x5555560951a0, cpus=0x0, threads=0x555556086fb0)
at util/evsel.c:2045
#3 0x000055555585d0db in evsel__open_per_thread (evsel=0x5555560951a0, threads=0x555556086fb0)
at util/evsel.c:2065
#4 0x00005555558ece64 in create_perf_stat_counter (evsel=0x5555560951a0,
config=0x555555c34700 <stat_config>, target=0x555555c2f1c0 <target>, cpu=0) at util/stat.c:590
#5 0x000055555578e927 in __run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
at builtin-stat.c:833
#6 0x000055555578f3c6 in run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
at builtin-stat.c:1048
#7 0x0000555555792ee5 in cmd_stat (argc=1, argv=0x7fffffffe4a0) at builtin-stat.c:2534
#8 0x0000555555835ed3 in run_builtin (p=0x555555c3f540 <commands+288>, argc=3,
argv=0x7fffffffe4a0) at perf.c:313
#9 0x0000555555836154 in handle_internal_command (argc=3, argv=0x7fffffffe4a0) at perf.c:365
#10 0x000055555583629f in run_argv (argcp=0x7fffffffe2ec, argv=0x7fffffffe2e0) at perf.c:409
#11 0x0000555555836692 in main (argc=3, argv=0x7fffffffe4a0) at perf.c:539
...
(gdb) c
Continuing.
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
/bin/dmesg | grep -i perf may provide additional information.
Program received signal SIGSEGV, Segmentation fault.
0x00005555559b03ea in perf_evsel__close_fd_cpu (evsel=0x5555560951a0, cpu=1) at evsel.c:166
166 if (FD(evsel, cpu, thread) >= 0)
v3. fixes a bug in perf_evsel__run_ioctl where the sense of a branch was
backward.
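As a minimal sketch of the pattern described above (not perf's actual
code), the lookup can return a pointer and let callers treat a missing
slot like a closed fd. The names fd_table, fd_entry() and fd_is_open()
are hypothetical stand-ins for the xyarray and the FD macro:

```c
#include <stddef.h>

/* Hypothetical stand-in for perf's xyarray: a small 2D table of fds. */
struct fd_table {
	int ncpus;
	int nthreads;
	int fds[4][4];
};

/* Returns a pointer to the slot, or NULL when an index is out of bounds. */
static int *fd_entry(struct fd_table *t, int cpu, int thread)
{
	if (cpu < 0 || cpu >= t->ncpus || thread < 0 || thread >= t->nthreads)
		return NULL;
	return &t->fds[cpu][thread];
}

/* Treats a missing slot like a closed fd instead of dereferencing NULL. */
static int fd_is_open(struct fd_table *t, int cpu, int thread)
{
	int *fd = fd_entry(t, cpu, thread);

	return fd != NULL && *fd >= 0;
}
```

With a one-cpu table, asking about cpu 1 (as in the crash above) then
reports "not open" rather than faulting.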
Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210918054440.2350466-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Michael Petlan [Mon, 19 Jul 2021 14:53:32 +0000 (16:53 +0200)]
perf machine: Initialize srcline string member in add_location struct
It's later supposed to be either a correct address or NULL. Without the
initialization, it may contain an undefined value which results in the
following segmentation fault:
# perf top --sort comm -g --ignore-callees=do_idle
terminates with:
#0 0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
#1 0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
#2 0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
#3 hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
#4 0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
sample_self=sample_self@entry=true) at util/hist.c:657
#5 0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
#6 0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
at util/hist.c:1056
#7 iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
#8 0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
at util/hist.c:1231
#9 0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
at builtin-top.c:842
#10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
#11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
#12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
#13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
#14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
#15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
#16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
#17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
#18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6
If you look at the frame #2, the code is:
488 if (he->srcline) {
489 he->srcline = strdup(he->srcline);
490 if (he->srcline == NULL)
491 goto err_rawdata;
492 }
If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
it gets strdupped and strdupping a rubbish random string causes the problem.
Also, if you look at the commit 1fb7d06a509e, it adds the srcline property
into the struct, but does not initialize it everywhere needed.
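The failure mode can be sketched in isolation. This is an illustration
under assumptions, not the perf code: struct location, location_init()
and location_dup_srcline() are hypothetical names, and the copy helper
only mirrors the guarded strdup() shown above:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the location struct. */
struct location {
	unsigned long addr;
	const char *srcline;	/* must be a valid string or NULL */
};

/* Initializes every member, so srcline starts out as NULL. */
static void location_init(struct location *loc, unsigned long addr)
{
	*loc = (struct location){ .addr = addr };
}

/* Copies the string only when one is present, like the guarded strdup(). */
static char *location_dup_srcline(const struct location *loc)
{
	size_t len;
	char *copy;

	if (loc->srcline == NULL)
		return NULL;
	len = strlen(loc->srcline) + 1;
	copy = malloc(len);
	if (copy != NULL)
		memcpy(copy, loc->srcline, len);
	return copy;
}
```

Without the initialization step the NULL check is meaningless, because
an uninitialized pointer is almost never NULL.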
Committer notes:
Now I see, when using --ignore-callees=do_idle we end up here at line
2189 in add_callchain_ip():
2181 if (al.sym != NULL) {
2182 if (perf_hpp_list.parent && !*parent &&
2183 symbol__match_regex(al.sym, &parent_regex))
2184 *parent = al.sym;
2185 else if (have_ignore_callees && root_al &&
2186 symbol__match_regex(al.sym, &ignore_callees_regex)) {
2187 /* Treat this symbol as the root,
2188 forgetting its callees. */
2189 *root_al = al;
2190 callchain_cursor_reset(cursor);
2191 }
2192 }
And the al that doesn't have the ->srcline field initialized will be
copied to the root_al, so then, back to:
Adrian Hunter [Sat, 11 Sep 2021 13:30:53 +0000 (16:30 +0300)]
perf script: Fix ip display when type != attr->type
set_print_ip_opts() was not being called when type != attr->type
because there is not a one-to-one relationship between output types
and attr->type. That resulted in ip not printing.
The attr_type() function is removed, and the match of attr->type to
output type is corrected.
Example on ADL using taskset to select an atom cpu:
# perf record -e cpu_atom/cpu-cycles/ taskset 0x1000 uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20210911133053.15682-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ravi Bangoria [Sat, 11 Sep 2021 04:38:53 +0000 (10:08 +0530)]
perf annotate: Fix fused instr logic for assembly functions
Some x86 microarchitectures fuse a subset of cmp/test/ALU instructions
with branch instructions, and thus perf annotate highlights such valid
pairs as fused.
When annotated with source, perf uses struct disasm_line to contain
either source or instruction line from objdump output. Usually, a C
statement generates multiple instructions which include such
cmp/test/ALU + branch instruction pairs. But in the case of an assembly
function, each individual assembly source line generates one
instruction.
The 'perf annotate' instruction fusion logic assumes the previous
disasm_line is the previous instruction line, which is wrong because,
for an assembly function, the previous disasm_line contains a source line. And
thus perf fails to highlight valid fused instruction pairs for assembly
functions.
Fix it by searching backward until we find an instruction line and
consider that disasm_line as fused with current branch instruction.
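The backward search can be sketched as follows. This is a simplified
illustration, not the perf implementation: struct line and
prev_insn_index() are hypothetical, and an array index stands in for
walking the disasm_line list:

```c
/* Hypothetical, simplified record: a source line or an instruction line. */
struct line {
	int is_insn;	/* 1 for an instruction line, 0 for a source line */
};

/*
 * Walks backward from the branch at index 'branch' and returns the index
 * of the nearest preceding instruction line, or -1 if there is none.
 */
static int prev_insn_index(const struct line *lines, int branch)
{
	for (int i = branch - 1; i >= 0; i--) {
		if (lines[i].is_insn)
			return i;
	}
	return -1;
}
```

For an interleaved source/instruction sequence, the search skips the
source line that sits directly before the branch and lands on the
preceding cmp instruction.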
Before:
│ cmpq %rcx, RIP+8(%rsp)
0.00 │ cmp %rcx,0x88(%rsp)
│ je .Lerror_bad_iret <--- Source line
0.14 │ ┌──je b4 <--- Instruction line
│ │movl %ecx, %eax
Reviewed-by: Jin Yao <yao.jin@linux.intel.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20210911043854.8373-1-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The TGA boards were based on the DECchip 21030 PCI graphics accelerator
used mainly for alpha, and existed in a TURBOchannel (TC) version for
the DECstation (MIPS) workstations.
However, the config option for the TGA code is a bit confused, and says
depends on FB && (ALPHA || TC)
because people didn't really want to enable the option for random PCI
environments, so the "ALPHA" stands in for that case (while the TC case
is then the MIPS DECstation case).
So that config dependency is kind of a mixture of architecture and bus
choices. But it's incorrect, in that there was non-PCI-based alpha
hardware, and then the driver just causes warnings:
drivers/video/fbdev/tgafb.c:1532:13: error: ‘tgafb_unregister’ defined but not used [-Werror=unused-function]
1532 | static void tgafb_unregister(struct device *dev)
| ^~~~~~~~~~~~~~~~
drivers/video/fbdev/tgafb.c:1387:12: error: ‘tgafb_register’ defined but not used [-Werror=unused-function]
1387 | static int tgafb_register(struct device *dev)
| ^~~~~~~~~~~~~~
so let's make the config option dependencies a bit more explicit:
depends on FB
depends on PCI || TC
depends on ALPHA || TC
where that first "FB" is the software configuration dependency, the
second "PCI || TC" is the hardware bus dependency, while that final
"ALPHA || TC" dependency is the "don't bother asking except for these
situations" one.
We could make that third case have "COMPILE_TEST" as an option, and mark
the register/unregister functions as __maybe_unused, but I'm not sure
it's really worth it.
The Jensen IO functions are overly complicated because some of the IO
addresses refer to special 'local IO' ports, and they get accessed
differently.
That then makes gcc not actually inline them, and since they were marked
"extern inline" when included through the regular <asm/io.h> path, and
then only marked "inline" when included from sys_jensen.c, you never
necessarily got a body for the IO functions at all.
The intent of the sys_jensen.c code is to actually get the non-inlined
copy generated, so remove the 'inline' from the magic macro that is
supposed to sort this all out.
Also, do not mix 'extern inline' functions (that may or may not be
inlined and will not generate a function body if they are not) with
'static inline' (that _will_ generate a function body when not inlined).
Because gcc will complain about this situation:
error: ‘jensen_bus_outb’ is static but used in inline function ‘jensen_outb’ which is not static
because gcc basically doesn't know whether to generate a body for that
static inline function or not for that call site.
So make all of these use that __EXTERN_INLINE marker. Gcc will
generally not inline these things on use, and then generate the function
body out-of-line in sys_jensen.c.
This makes the core IO functions build for the alpha Jensen config.
Not that the rest then builds, because it turns out Jensen also doesn't
enable PCI, which then makes other drivers very unhappy, but that's a
separate issue.
Without CONFIG_PM enabled, the SET_RUNTIME_PM_OPS() macro ends up being
empty, and the only use of tegra_slink_runtime_{resume,suspend} goes
away, resulting in
drivers/spi/spi-tegra20-slink.c:1200:12: error: ‘tegra_slink_runtime_resume’ defined but not used [-Werror=unused-function]
1200 | static int tegra_slink_runtime_resume(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/spi/spi-tegra20-slink.c:1188:12: error: ‘tegra_slink_runtime_suspend’ defined but not used [-Werror=unused-function]
1188 | static int tegra_slink_runtime_suspend(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
mark the functions __maybe_unused to make the build happy.
This hits the alpha allmodconfig build (and others).
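A reduced sketch of why the warning appears and how the annotation
silences it. Everything prefixed DEMO_/demo_ is hypothetical and only
imitates the shape of SET_RUNTIME_PM_OPS() and CONFIG_PM; the attribute
itself is what __maybe_unused expands to with gcc and clang:

```c
/* The kernel's __maybe_unused expands to this GCC/clang attribute. */
#define DEMO_MAYBE_UNUSED __attribute__((__unused__))

struct demo_pm_ops {
	const char *name;
	int (*runtime_suspend)(void);
	int (*runtime_resume)(void);
};

/* Hypothetical stand-in for SET_RUNTIME_PM_OPS(): empty without DEMO_CONFIG_PM. */
#ifdef DEMO_CONFIG_PM
#define DEMO_SET_RUNTIME_PM_OPS(suspend_fn, resume_fn) \
	.runtime_suspend = suspend_fn, .runtime_resume = resume_fn,
#else
#define DEMO_SET_RUNTIME_PM_OPS(suspend_fn, resume_fn)
#endif

/* Without the attribute, these would be "defined but not used". */
static int DEMO_MAYBE_UNUSED demo_runtime_suspend(void)
{
	return 0;
}

static int DEMO_MAYBE_UNUSED demo_runtime_resume(void)
{
	return 0;
}

static const struct demo_pm_ops demo_pm_ops = {
	.name = "demo",
	DEMO_SET_RUNTIME_PM_OPS(demo_runtime_suspend, demo_runtime_resume)
};
```

When DEMO_CONFIG_PM is not defined the macro expands to nothing, the
two callbacks lose their only reference, and the attribute keeps
-Werror=unused-function from failing the build.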
David Brazdil [Fri, 17 Sep 2021 13:14:23 +0000 (14:14 +0100)]
of: restricted dma: Fix condition for rmem init
of_dma_set_restricted_buffer fails to handle negative return values from
of_property_count_elems_of_size, e.g. when the property does not exist.
This results in an attempt to assign a non-existent reserved memory
region to the device and a warning being printed. Fix the condition to
take negative values into account.
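The shape of the condition can be illustrated with a hypothetical
helper; should_init_rmem() is not a kernel function, and 'count' only
stands for the value returned by the property-counting call, which is a
negative errno when the property is missing:

```c
/*
 * Hypothetical helper: decides whether a reserved memory region should
 * be assigned, given the element count (or negative errno) of the
 * property.
 */
static int should_init_rmem(int count)
{
	/* A bare "if (count)" would wrongly accept negative error codes. */
	return count > 0;
}
```

A check written as a plain truth test treats -EINVAL the same as a real
element count, which is the class of mistake described above.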
Fixes: f3cfd136aef0 ("of: restricted dma: Don't fail device probe on rmem init failure") Cc: Will Deacon <will@kernel.org> Signed-off-by: David Brazdil <dbrazdil@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210917131423.2760155-1-dbrazdil@google.com Signed-off-by: Rob Herring <robh@kernel.org>
Merge tag 'pm-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix two cpufreq issues, one in the intel_pstate driver and one
in the core.
Specifics:
- Prevent intel_pstate from avoiding to use HWP, even if instructed
to do so via the kernel command line, when HWP has been enabled
already by the platform firmware (Doug Smythies).
- Prevent use-after-free from occurring in the schedutil cpufreq
governor on exit by fixing a core helper function that attempts to
access memory associated with a kobject after calling kobject_put()
on it (James Morse)"
* tag 'pm-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: schedutil: Destroy mutex before kobject_put() frees the memory
cpufreq: intel_pstate: Override parameters if HWP forced by BIOS
Merge tag 'dma-mapping-5.15-1' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
- page align size in sparc32 arch_dma_alloc (Andreas Larsson)
- tone down a new dma-debug message (Hamza Mahfooz)
- fix the kerneldoc for dma_map_sg_attrs (me)
* tag 'dma-mapping-5.15-1' of git://git.infradead.org/users/hch/dma-mapping:
sparc32: page align size in arch_dma_alloc
dma-debug: prevent an error message from causing runtime problems
dma-mapping: fix the kerneldoc for dma_map_sg_attrs
Merge tag 'pci-v5.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Defer VPD sizing until we actually need the contents; fixes a
boot-time slowdown reported by Dave Jones (Bjorn Helgaas)
- Stop clobbering OF fwnodes when we look for an ACPI fwnode; fixes a
virtio-iommu boot regression (Jean-Philippe Brucker)
- Add AMD GPU multi-function power dependencies; fixes runtime power
management, including GPU resume and temp and fan sensor issues (Evan
Quan)
- Update VMD maintainer to Nirmal Patel (Jon Derrick)
* tag 'pci-v5.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
MAINTAINERS: Add Nirmal Patel as VMD maintainer
PCI: Add AMD GPU multi-function power dependencies
PCI/ACPI: Don't reset a fwnode set by OF
PCI/VPD: Defer VPD sizing until first access
Merge tag 'iov_iter.3-5.15-2021-09-17' of git://git.kernel.dk/linux-block
Pull io_uring iov_iter retry fixes from Jens Axboe:
"This adds a helper to save/restore iov_iter state, and modifies
io_uring to use it.
After that is done, we can now kill the iter->truncated addition that
we added for this release. The io_uring change is being overly
cautious with the save/restore/advance, but better safe than sorry and
we can always improve that and reduce the overhead if it proves to be
of concern. The only case to be worried about in this regard is huge
IO, where iteration can take a while to iterate segments.
I spent some time writing test cases, and expanded the coverage quite
a bit from the last posting of this. liburing carries this regression
test case now:
which exercises all of this. It now also supports provided buffers,
and explicitly tests for end-of-file/device truncation as well.
On top of that, Pavel sanitized the IOPOLL retry path to follow the
exact same pattern as normal IO"
* tag 'iov_iter.3-5.15-2021-09-17' of git://git.kernel.dk/linux-block:
io_uring: move iopoll reissue into regular IO path
Revert "iov_iter: track truncated size"
io_uring: use iov_iter state save/restore helpers
iov_iter: add helper to save iov_iter state
Merge tag 'io_uring-5.15-2021-09-17' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Mostly fixes for regressions in this cycle, but also a few fixes that
predate this release.
The odd one out is a tweak to the direct files added in this release,
where attempting to reuse a slot is allowed instead of needing an
explicit removal of that slot first. It's a considerable improvement
in usability to that API, hence I'm sending it for -rc2.
- io-wq race fix and cleanup (Hao)
- loop_rw_iter() type fix
- SQPOLL max worker race fix
- Allow poll arm for O_NONBLOCK files, fixing a case where it's
impossible to properly use io_uring if you cannot modify the file
flags
- Allow direct open to simply reuse a slot, instead of needing it
explicitly removed first (Pavel)
- Fix a case where we missed signal mask restoring in cqring_wait, if
we hit -EFAULT (Xiaoguang)"
* tag 'io_uring-5.15-2021-09-17' of git://git.kernel.dk/linux-block:
io_uring: allow retry for O_NONBLOCK if async is supported
io_uring: auto-removal for direct open/accept
io_uring: fix missing sigmask restore in io_cqring_wait()
io_uring: pin SQPOLL data before unlocking ring lock
io-wq: provide IO_WQ_* constants for IORING_REGISTER_IOWQ_MAX_WORKERS arg items
io-wq: fix potential race of acct->nr_workers
io-wq: code clean of io_wqe_create_worker()
io_uring: ensure symmetry in handling iter types in loop_rw_iter()
Merge tag 'block-5.15-2021-09-17' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- NVMe pull request via Christoph:
- fix ANA state updates when a namespace is not present (Anton
Eidelman)
- nvmet: fix a width vs precision bug in
nvmet_subsys_attr_serial_show (Dan Carpenter)
- avoid race in shutdown namespace removal (Daniel Wagner)
- fix io_work priority inversion in nvme-tcp (Keith Busch)
- destroy cm id before destroy qp to avoid use after free (Ruozhu
Li)
* tag 'block-5.15-2021-09-17' of git://git.kernel.dk/linux-block:
blk-cgroup: fix UAF by grabbing blkcg lock before destroying blkg pd
blkcg: fix memory leak in blk_iolatency_init
nvme: remove the call to nvme_update_disk_info in nvme_ns_remove
block: flush the integrity workqueue in blk_integrity_unregister
block: check if a profile is actually registered in blk_integrity_unregister
nvme-tcp: fix io_work priority inversion
nvme-rdma: destroy cm id before destroy qp to avoid use after free
nvme-multipath: fix ANA state updates when a namespace is not present
nvme: avoid race in shutdown namespace removal
nvmet: fix a width vs precision bug in nvmet_subsys_attr_serial_show()
blk-mq: avoid to iterate over stale request
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes and cleanups from Catalin Marinas:
- Fix the memset() size when re-initialising the SVE state.
- Mark __stack_chk_guard as __ro_after_init.
- Remove duplicate include.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Mark __stack_chk_guard as __ro_after_init
arm64/kernel: remove duplicate include in process.c
arm64/sve: Use correct size when reinitialising SVE state
Merge tag 'for-linus-5.15b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- The first hunk of a Xen swiotlb fixup series fixing multiple minor
issues and doing some small cleanups
- Some further Xen related fixes avoiding WARN() splats when running as
Xen guests or dom0
- A Kconfig fix allowing the pvcalls frontend to be built as a module
* tag 'for-linus-5.15b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
swiotlb-xen: drop DEFAULT_NSLABS
swiotlb-xen: arrange to have buffer info logged
swiotlb-xen: drop leftover __ref
swiotlb-xen: limit init retries
swiotlb-xen: suppress certain init retries
swiotlb-xen: maintain slab count properly
swiotlb-xen: fix late init retry
swiotlb-xen: avoid double free
xen/pvcalls: backend can be a module
xen: fix usage of pmd_populate in mremap for pv guests
xen: reset legacy rtc flag for PV domU
PM: base: power: don't try to use non-existing RTC for storing data
xen/balloon: use a kernel thread instead a workqueue
Merge tag 'drm-fixes-2021-09-17' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Slightly busier than usual rc2, but mostly scattered amdgpu fixes,
some i915 and etnaviv resolves an MMU/runtime PM blowup.
amdgpu:
- UBSAN fix
- Powerplay table update fix
- Fix use after free in BO moves
- Debugfs init fixes
- vblank workqueue fixes for headless devices
- FPU fixes
- sysfs_emit fixes
- SMU updates for cyan skillfish
- Backlight fixes when DMCU is not initialized
- DP MST fixes
- HDCP compliance fix
- Link training fix
- Runtime pm fix
- Panel orientation fixes
- Display GPUVM fix for yellow carp
- Add missing license
amdkfd:
- Drop PCI atomics requirement if proper firmware is available
- Suspend/resume fixes for IOMMUv2 cases
radeon:
- AGP fix
i915:
- Propagate DP link training error returns
- Use max link params for eDP 1.3 and earlier
- Build warning fixes
- Gem selftest fixes
- Ensure wakeref is held before hardware access
etnaviv:
- MMU context vs runtime PM fix"
* tag 'drm-fixes-2021-09-17' of git://anongit.freedesktop.org/drm/drm: (44 commits)
drm/amdgpu/display: add a proper license to dc_link_dp.c
drm/amd/display: Fix white screen page fault for gpuvm
amd/display: enable panel orientation quirks
drm/amdgpu: Demote TMZ unsupported log message from warning to info
drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count
drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound driver
drm/radeon: pass drm dev radeon_agp_head_init directly
drm/amdgpu: move iommu_resume before ip init/resume
drm/amdgpu: add amdgpu_amdkfd_resume_iommu
drm/amdkfd: separate kfd_iommu_resume from kfd_resume
drm/amd/display: Link training retry fix for abort case
drm/amd/display: Fix unstable HPCP compliance on Chrome Barcelo
drm/amd/display: dsc mst 2 4K displays go dark with 2 lane HBR3
drm/amd/display: Get backlight from PWM if DMCU is not initialized
drm/amdkfd: make needs_pcie_atomics FW-version dependent
drm/amdgpu: add manual sclk/vddc setting support for cyan skilfish(v3)
drm/amdgpu: add some pptable funcs for cyan skilfish(v3)
drm/amdgpu: update SMU driver interface for cyan skilfish(v3)
drm/amdgpu: update SMU PPSMC for cyan skilfish
drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings(v2)
...
Dave Airlie [Thu, 16 Sep 2021 19:53:52 +0000 (05:53 +1000)]
Merge tag 'drm-intel-fixes-2021-09-16' of ssh://git.freedesktop.org/git/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.15-rc2:
- Propagate DP link training error returns
- Use max link params for eDP 1.3 and earlier
- Build warning fixes
- Gem selftest fixes
- Ensure wakeref is held before hardware access
tx timeout and slot time are currently specified in units of HZ. On
Alpha, HZ is defined as 1024. When building alpha:allmodconfig, this
results in the following error message.
drivers/net/hamradio/6pack.c: In function 'sixpack_open':
drivers/net/hamradio/6pack.c:71:41: error:
unsigned conversion from 'int' to 'unsigned char'
changes value from '256' to '0'
In the 6PACK protocol, tx timeout is specified in units of 10 ms and
transmitted over the wire:
https://www.linux-ax25.org/wiki/6PACK
Defining a value dependent on HZ doesn't really make sense, and
presumably comes from the (very historical) situation where HZ was
originally 100.
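The arithmetic behind the error can be sketched as below. The HZ/4
definition is an assumption chosen because it reproduces the 256 in the
message when HZ is 1024; both helper names are hypothetical:

```c
/* Assumption for illustration: a delay constant defined as HZ / 4. */
static unsigned char delay_from_hz(int hz)
{
	/* Conversion to unsigned char keeps only the low 8 bits. */
	return (unsigned char)(hz / 4);
}

/* Expresses a delay in the protocol's 10 ms units, independent of HZ. */
static unsigned char delay_in_10ms_units(int milliseconds)
{
	return (unsigned char)(milliseconds / 10);
}
```

With HZ at 100 the first helper happens to yield 25, which matches
250 ms in 10 ms units; with HZ at 1024 the value 256 wraps to 0.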
Note that the SIXP_SLOTTIME use explicitly is about 10ms granularity:
Dave Airlie [Thu, 16 Sep 2021 19:30:56 +0000 (05:30 +1000)]
Merge branch 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux into drm-fixes
Fixes a very annoying issue where the driver view of the MMU state gets
out of sync with the actual hardware state across a runtime PM cycle,
so we end up restarting the GPU with the wrong (potentially already
freed) MMU context. Hilarity ensues.
drm/rockchip: cdn-dp-core: Make cdn_dp_core_resume __maybe_unused
With the new static annotation, the compiler warns when the functions
are actually unused:
drivers/gpu/drm/rockchip/cdn-dp-core.c:1123:12: error: 'cdn_dp_resume' defined but not used [-Werror=unused-function]
1123 | static int cdn_dp_resume(struct device *dev)
| ^~~~~~~~~~~~~
Mark them __maybe_unused to suppress that warning as well.
[ Not so 'new' static annotations any more, and I removed the part of
the patch that added __maybe_unused to cdn_dp_suspend(), because it's
used by the shutdown/remove code.
So only the resume function ends up possibly unused if CONFIG_PM isn't
set - Linus ]
alpha: Declare virt_to_phys and virt_to_bus parameter as pointer to volatile
Some drivers pass a pointer to volatile data to virt_to_bus() and
virt_to_phys(), and that works fine. One exception is alpha. This
results in a number of compile errors such as
drivers/net/wan/lmc/lmc_main.c: In function 'lmc_softreset':
drivers/net/wan/lmc/lmc_main.c:1782:50: error:
passing argument 1 of 'virt_to_bus' discards 'volatile'
qualifier from pointer target type
drivers/atm/ambassador.c: In function 'do_loader_command':
drivers/atm/ambassador.c:1747:58: error:
passing argument 1 of 'virt_to_bus' discards 'volatile'
qualifier from pointer target type
Declare the parameter of virt_to_phys and virt_to_bus as pointer to
volatile to fix the problem.
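A small sketch of why the qualifier on the parameter matters. The
function demo_virt_to_phys() is hypothetical and uses an identity
mapping purely for illustration; real translations are
architecture-specific:

```c
#include <stdint.h>

/*
 * Taking "volatile void *" accepts both plain and volatile-qualified
 * pointers. A plain "void *" parameter would discard the qualifier
 * when a pointer to volatile data is passed.
 */
static uintptr_t demo_virt_to_phys(volatile void *address)
{
	return (uintptr_t)address;
}
```

Adding a qualifier in the callee is always safe for callers, which is
why widening the parameter type fixes every affected driver at once.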
3com 3c515: make it compile on 64-bit architectures
This driver isn't enabled most places because of the ISA config
dependency, but alpha still has it. And I think the 'Jensen' actually
did have an ISA slot.
However, it doesn't build cleanly, because the "Vortex bus master" code
just casts the skb->data pointer to 'int':
outl((int) (skb->data), ioaddr + Wn7_MasterAddr);
which is all kinds of broken. Even on a good old traditional PC/AT it
would be broken because the high bits will be random kernel address
bits, but presumably the hardware ignores those bits. I mean, it's ISA.
We're talking 16MB dma limits. The "good old days".
Make the build happy with this kind of craziness by using the proper
isa_virt_to_bus() handling that the full bus master code uses anyway
(the Vortex bus mastering is a limited special case).
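To show what the cast loses on a 64-bit machine, here is a sketch with
hypothetical helpers. It does not model isa_virt_to_bus(); it only
illustrates the truncation and the 16MB ISA DMA limit mentioned above,
and the sample address is made up:

```c
#include <stdint.h>

/* What a cast to a 32-bit integer keeps of a 64-bit address. */
static uint32_t truncate_to_32(uint64_t address)
{
	return (uint32_t)address;
}

/* ISA bus masters can only reach the first 16MB of bus addresses. */
static int fits_isa_dma(uint64_t bus_address)
{
	return bus_address < (UINT64_C(16) << 20);
}
```

The truncated value is neither the original kernel address nor a bus
address, so a proper translation helper is needed rather than a cast.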
Merge tag 'm68k-for-v5.15-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k fixes from Geert Uytterhoeven:
- Warning fixes to mitigate CONFIG_WERROR=y
* tag 'm68k-for-v5.15-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: mvme: Remove overdue #warnings in RTC handling
m68k: Double cast io functions to unsigned long
Dan Li [Tue, 14 Sep 2021 09:44:02 +0000 (17:44 +0800)]
arm64: Mark __stack_chk_guard as __ro_after_init
__stack_chk_guard is set up once during the init stage and never changed
after that.
Although the modification of this variable at runtime will usually
cause the kernel to crash (so does the attacker), it should be marked
as __ro_after_init, and it should not affect performance if it is
placed in the ro_after_init section.
Mark Brown [Thu, 9 Sep 2021 16:53:56 +0000 (17:53 +0100)]
arm64/sve: Use correct size when reinitialising SVE state
When we need a buffer for SVE register state we call sve_alloc() to make
sure that one is there. In order to avoid repeated allocations and frees
we keep the buffer around unless we change vector length and just memset()
it to ensure a clean register state. The function that deals with this
takes the task to operate on as an argument, however in the case where we
do a memset() we initialise using the SVE state size for the current task
rather than the task passed as an argument.
This is only an issue in the case where we are setting the register state
for a task via ptrace and the task being configured has a different vector
length to the task tracing it. In the case where the buffer is larger in
the traced process we will leak old state from the traced process to
itself, in the case where the buffer is smaller in the traced process we
will overflow the buffer and corrupt memory.
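The size mix-up described above can be sketched in plain C as follows. This is a simplified model under stated assumptions, not the arm64 code: struct demo_task and every demo_* function are hypothetical, and the state size is stored directly instead of being derived from a vector length.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified task: only the fields this sketch needs. */
struct demo_task {
	unsigned char *sve_state; /* buffer holding saved register state */
	size_t sve_state_size;    /* size of that buffer for this task */
};

/* Buggy pattern: the wipe size comes from the calling task. */
static inline size_t demo_wipe_size_buggy(const struct demo_task *caller,
					  const struct demo_task *target)
{
	(void)target;
	return caller->sve_state_size;
}

/* Fixed pattern: the wipe size comes from the task being operated on. */
static inline size_t demo_wipe_size_fixed(const struct demo_task *caller,
					  const struct demo_task *target)
{
	(void)caller;
	return target->sve_state_size;
}

/* Reinitialise the target's buffer using the target's own size. */
static inline void demo_sve_reinit(struct demo_task *target)
{
	memset(target->sve_state, 0, target->sve_state_size);
}
```

When the two tasks have the same size the two variants agree, which matches the message's point that the bug only shows up when tracer and tracee differ.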
Simon Ser [Fri, 10 Sep 2021 15:37:41 +0000 (15:37 +0000)]
amd/display: enable panel orientation quirks
This patch allows panel orientation quirks from DRM core to be
used. They attach a DRM connector property "panel orientation"
which indicates in which direction the panel has been mounted.
Some machines have the internal screen mounted with a rotation.
Since the panel orientation quirks need the native mode from the
EDID, check for it in amdgpu_dm_connector_ddc_get_modes.
Signed-off-by: Simon Ser <contact@emersion.fr> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Harry Wentland <hwentlan@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Paul Menzel [Mon, 13 Sep 2021 08:34:11 +0000 (10:34 +0200)]
drm/amdgpu: Demote TMZ unsupported log message from warning to info
As the user cannot do anything about the unsupported Trusted Memory Zone
(TMZ) feature, do not warn about it, but make it informational, so
demote the log level from warning to info.
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Michel Dänzer [Thu, 9 Sep 2021 16:56:28 +0000 (18:56 +0200)]
drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count
This was unusual; normally, inline functions are declared static as
well, and defined in a header file if used by multiple compilation
units. The latter would be more involved in this case, so just drop
the inline declaration for now.
Fixes compile failure building for ppc64le on RHEL 8:
In file included from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h:32,
from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:33:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_recovery_init’:
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h:90:17: error: inlining failed in call
to ‘always_inline’ ‘amdgpu_ras_eeprom_max_record_count’: function body not available
90 | inline uint32_t amdgpu_ras_eeprom_max_record_count(void);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1985:34: note: called from here
1985 | max_eeprom_records_len = amdgpu_ras_eeprom_max_record_count();
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: c84d46707ebb "drm/amdgpu: validate bad page threshold in ras(v3)" Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Evan Quan [Thu, 9 Sep 2021 03:01:00 +0000 (11:01 +0800)]
drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound driver
The current RUNPM mechanism relies on PMFW to control the timing of BACO
entry/exit. That needs cooperation from the sound driver, which provides
dstate change notification for function 1 (audio). Otherwise (when the
sound driver is missing), BACO cannot be entered correctly and a hang is
observed on RUNPM exit.
By switching back to the legacy message path when the sound driver is
missing, we fix the runpm hang observed in the scenario below:
amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded
Nirmoy Das [Mon, 13 Sep 2021 08:08:23 +0000 (10:08 +0200)]
drm/radeon: pass drm dev to radeon_agp_head_init directly
Pass drm dev directly as rdev->ddev gets initialized later on
at radeon_device_init().
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214375 Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Meenakshikumar Somasundaram [Mon, 30 Aug 2021 18:01:10 +0000 (14:01 -0400)]
drm/amd/display: Link training retry fix for abort case
[Why]
If link training is aborted, it shall be retried if sink is present.
[How]
Check hpd status to find out whether sink is present or not. If sink is
present, then link training shall be tried again with same settings.
Otherwise, link training shall be aborted.
Reviewed-by: Jimmy Kizito <Jimmy.Kizito@amd.com> Acked-by: Mikita Lipski <mikita.lipski@amd.com> Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Qingqing Zhuo [Fri, 27 Aug 2021 10:58:38 +0000 (06:58 -0400)]
drm/amd/display: Fix unstable HDCP compliance on Chrome Barcelo
[Why]
Intermittently, two 0-stream commits occur in a single HPD event.
The current HDCP sequence does not consider such a scenario, and
will thus disable HDCP.
[How]
Add a condition check that covers the stream remove and re-enable
case when enabling HDCP.
Reviewed-by: Bhawanpreet Lakha <bhawanpreet.lakha@amd.com> Acked-by: Mikita Lipski <mikita.lipski@amd.com> Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Hersen Wu [Wed, 25 Aug 2021 20:27:47 +0000 (16:27 -0400)]
drm/amd/display: dsc mst 2 4K displays go dark with 2 lane HBR3
[Why]
The call stack of the amdgpu dsc mst pbn and slot num calculation is as below:
-compute_bpp_x16_from_target_bandwidth
-decide_dsc_target_bpp_x16
-setup_dsc_config
-dc_dsc_compute_bandwidth_range
-compute_mst_dsc_configs_for_link
-compute_mst_dsc_configs_for_state
from pbn -> dsc target bpp_x16
bpp_x16 is calculated by compute_bpp_x16_from_target_bandwidth.
Besides pixel clock and bpp, num_slices_h and bpp_increment_div
also affect bpp_x16.
from dsc target bpp_x16 -> pbn
within dm_update_mst_vcpi_slots_for_dsc,
pbn = drm_dp_calc_pbn_mode(clock, bpp_x16, true);
bpp / 16 truncates the digits after the decimal point, which causes a
calculation delta. drm_dp_calc_pbn_mode does not have other information,
such as num_slices_h and bpp_increment_div; therefore, it does not do the
reverse calculation from bpp_x16 to pbn properly.
The pbn from drm_dp_calc_pbn_mode is less than the pbn from
compute_mst_dsc_configs_for_state. As a result, not enough mst slots are
allocated to the display, and the display does not light up.
[How]
pass pbn from compute_mst_dsc_configs_for_state to
dm_update_mst_vcpi_slots_for_dsc
Cc: stable@vger.kernel.org Reviewed-by: Scott Foster <Scott.Foster@amd.com> Acked-by: Mikita Lipski <mikita.lipski@amd.com> Signed-off-by: Hersen Wu <hersenwu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
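The truncation described in the [Why] section above can be illustrated with a small integer-arithmetic sketch. This is not the formula used by drm_dp_calc_pbn_mode; it only models the "bpp / 16" step, on the assumption that bpp_x16 holds bits per pixel in units of 1/16. All demo_* names are hypothetical.

```c
#include <stdint.h>

/* Integer division drops the fractional 1/16 steps of bpp_x16. */
static inline uint32_t demo_bpp_truncated(uint32_t bpp_x16)
{
	return bpp_x16 / 16;
}

/* Bandwidth figure in 1/16 units after truncating bpp first. */
static inline uint64_t demo_bw_truncated_x16(uint32_t clock_khz,
					     uint32_t bpp_x16)
{
	return (uint64_t)clock_khz * demo_bpp_truncated(bpp_x16) * 16;
}

/* Bandwidth figure in 1/16 units keeping the full bpp_x16 value. */
static inline uint64_t demo_bw_exact_x16(uint32_t clock_khz,
					 uint32_t bpp_x16)
{
	return (uint64_t)clock_khz * bpp_x16;
}

/* How much the truncated figure falls short of the exact one. */
static inline uint64_t demo_shortfall_x16(uint32_t clock_khz,
					  uint32_t bpp_x16)
{
	return demo_bw_exact_x16(clock_khz, bpp_x16) -
	       demo_bw_truncated_x16(clock_khz, bpp_x16);
}
```

The shortfall is zero only when bpp_x16 is a multiple of 16. In every other case the truncated path requests less bandwidth than the configuration was computed for, which is consistent with the fix of passing the already-computed pbn through instead of recomputing it.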
Harry Wentland [Mon, 16 Aug 2021 19:57:12 +0000 (15:57 -0400)]
drm/amd/display: Get backlight from PWM if DMCU is not initialized
On Carrizo/Stoney systems we set backlight through panel_cntl, i.e.
directly via the PWM registers, if DMCU is not initialized. We
always read it back through ABM registers which leads to a
mismatch and forces atomic_commit to program the backlight
each time.
Instead make sure we use the same logic for backlight readback,
i.e. read it from panel_cntl if DMCU is not initialized.
We also need to remove some extraneous and incorrect calculations
at the end of dce_get_16_bit_backlight_from_pwm.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1666 Cc: stable@vger.kernel.org Reviewed-by: Josip Pavic <josip.pavic@amd.com> Acked-by: Mikita Lipski <mikita.lipski@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Tue, 31 Aug 2021 21:42:15 +0000 (17:42 -0400)]
drm/amdkfd: make needs_pcie_atomics FW-version dependent
On some GPUs the PCIe atomic requirement for KFD depends on the MEC
firmware version. Add a firmware version check for this. The minimum
firmware version that works without atomics can be updated in the
device_info structure for each GPU type.
Move PCIe atomic detection from kgd2kfd_probe into kgd2kfd_device_init
because the MEC firmware is not loaded yet at the probe stage.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
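A firmware-version-dependent requirement of the kind described in the message above could look roughly like the sketch below. The struct layout, the field names, and the convention that 0 means "no firmware version lifts the requirement" are assumptions made for illustration, not the actual KFD definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-GPU-type info, loosely modelled on the description. */
struct demo_device_info {
	bool needs_pci_atomics;        /* atomics required by default */
	uint32_t no_atomic_fw_version; /* min MEC FW that works without
					* atomics; 0 means none (assumed) */
};

/* Decide whether PCIe atomics are required for this device and firmware. */
static inline bool demo_atomics_required(const struct demo_device_info *info,
					 uint32_t mec_fw_version)
{
	if (!info->needs_pci_atomics)
		return false;
	if (info->no_atomic_fw_version &&
	    mec_fw_version >= info->no_atomic_fw_version)
		return false;
	return true;
}
```

Because the check takes the MEC firmware version as input, it can only run once the firmware has been loaded, which is the reason the message gives for moving the detection out of the probe stage.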
Lang Yu [Fri, 27 Aug 2021 07:20:51 +0000 (15:20 +0800)]
drm/amdgpu: add manual sclk/vddc setting support for cyan skillfish (v3)
Add manual sclk/vddc setting support via pp_od_clk_voltage sysfs
to maintain consistency with other asics. As cyan skillfish doesn't
support DPM, there is only a single frequency and voltage to adjust.
v2: maintain consistency and add command guide.
v3: adjust user settings storage and coding style.
Command guide:
echo vc point sclk vddc > pp_od_clk_voltage
"vc" - sclk voltage curve
"point" - must be 0
"sclk" - target value of sclk(MHz), should be in safe range
"vddc" - target value of vddc(mV), a 6.25(mV) stepping is
recommended and should be in safe range (the real
vddc is an approximation of target value)
echo c > pp_od_clk_voltage
"c" - commit the changes of sclk and vddc; the target values
set by the "vc" command take effect only after the
commit command
echo r > pp_od_clk_voltage
"r" - reset sclk and vddc to default value, a subsequent
commit command is needed to take effect
Example:
1) Check default sclk and vddc
$ cat pp_od_clk_voltage
OD_SCLK:
0: 1800Mhz *
OD_VDDC:
0: 862mV *
OD_RANGE:
SCLK: 1000Mhz 2000Mhz
VDDC: 700mV 1129mV
2) Set sclk to 1500MHz and vddc to 700mV
$ echo vc 0 1500 700 > pp_od_clk_voltage
$ echo c > pp_od_clk_voltage
$ cat pp_od_clk_voltage
OD_SCLK:
0: 1500Mhz *
OD_VDDC:
0: 693mV *
OD_RANGE:
SCLK: 1000Mhz 2000Mhz
VDDC: 700mV 1129mV
3) Reset sclk and vddc to default
$ echo r > pp_od_clk_voltage
$ echo c > pp_od_clk_voltage
$ cat pp_od_clk_voltage
OD_SCLK:
0: 1800Mhz *
OD_VDDC:
0: 874mV *
OD_RANGE:
SCLK: 1000Mhz 2000Mhz
VDDC: 700mV 1129mV
NOTE:
We don't specify an explicit safe range, you can set any values
between min and max at your own risk. Enjoy!
Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>