Amlogic G12A devices experience CPU stalls and random board wedges when
the system idles and CPU cores clock down to lower opp points. Recent
vendor kernels include a change to remove 100-250MHz and other distro
sources also remove the 500/667MHz points. Unless all 100-667Mhz opps
are removed or the CPU governor forced to performance stalls are still
observed, so let's remove them to improve stability and uptime.
Fixes: b190056fa9ee ("arm64: dts: meson-g12a: add cpus OPP table") Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Link: https://lore.kernel.org/r/20230119053031.21400-1-christianshewitt@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Current PCIe QMP PHY output name were changed in ("arm64: dts: qcom: Fix
IPQ8074 PCIe PHY nodes") however it did not account for the fact that GCC
driver is relying on the old names to match them as they are being used as
the parent for the gcc_pcie0_pipe_clk and gcc_pcie1_pipe_clk.
This broke parenting as GCC could not find the parent clock, so fix it by
changing to the names that driver is expecting.
IPQ8074 comes in 2 silicon versions:
* v1 with 2x Gen2 PCIe ports and QMP PHY-s
* v2 with 1x Gen3 and 1x Gen2 PCIe ports and QMP PHY-s
v2 is the final and production version that is actually supported by the
kernel, however it looks like PCIe related nodes were added for the v1 SoC.
Finish the PCIe fixup by using the correct compatible, adding missing ATU
register space, declaring max-link-speed, use correct ranges, add missing
clocks and resets.
It seems that clock-output-names for the USB3 QMP PHY-s where set without
actually checking what is the GCC clock driver expecting, so clock core
could never actually find the parents for usb0_pipe_clk_src and
usb1_pipe_clk_src clocks in the GCC driver.
So, correct the names to be what the driver expects so that parenting
works.
Before:
gcc_usb0_pipe_clk_src 0 0 0 125000000 0 0 50000 Y
gcc_usb1_pipe_clk_src 0 0 0 125000000 0 0 50000 Y
After:
usb3phy_0_cc_pipe_clk 1 1 0 125000000 0 0 50000 Y
usb0_pipe_clk_src 1 1 0 125000000 0 0 50000 Y
gcc_usb0_pipe_clk 1 1 0 125000000 0 0 50000 Y
usb3phy_1_cc_pipe_clk 1 1 0 125000000 0 0 50000 Y
usb1_pipe_clk_src 1 1 0 125000000 0 0 50000 Y
gcc_usb1_pipe_clk 1 1 0 125000000 0 0 50000 Y
Fixes: 5e09bc51d07b ("arm64: dts: ipq8074: enable USB support") Signed-off-by: Robert Marko <robimarko@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230108130440.670181-2-robimarko@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Fixes: 976d321f32dc ("arm64: dts: qcom: msm8992: Make the DT an overlay on top of 8994") Signed-off-by: Petr Vorel <petr.vorel@gmail.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221226185440.440968-3-pevik@seznam.cz Signed-off-by: Sasha Levin <sashal@kernel.org>
Original google firmware reports 12 MiB:
[ 0.000000] cma: Found cont_splash_mem@0, memory base 0x0000000003400000, size 12 MiB, limit 0xffffffffffffffff
which is actually 12*1024*1024 = 0xc00000.
This matches the aosp source [1]:
&cont_splash_mem {
reg = <0 0x03400000 0 0xc00000>;
};
Fixes: 3cb6a271f4b0 ("arm64: dts: qcom: msm8992-bullhead: Fix cont_splash_mem mapping") Fixes: 976d321f32dc ("arm64: dts: qcom: msm8992: Make the DT an overlay on top of 8994")
[1] https://android.googlesource.com/kernel/msm.git/+/android-7.0.0_r0.17/arch/arm64/boot/dts/lge/msm8992-bullhead.dtsi#141
Signed-off-by: Petr Vorel <petr.vorel@gmail.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221226185440.440968-2-pevik@seznam.cz Signed-off-by: Sasha Levin <sashal@kernel.org>
vmlinux.o: warning: objtool: intel_idle_irq+0x10c: call to trace_hardirqs_off() leaves .noinstr.text section
As per commit 32d4fd5751ea ("cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE"):
"must not have tracing in idle functions"
Clearly people can't read and tinker along until splat dissapears.
This straight up reverts commit d295ad34f236 ("intel_idle: Fix false
positive RCU splats due to incorrect hardirqs state").
It doesn't re-introduce the problem because preceding patches fixed it
properly.
Fixes: d295ad34f236 ("intel_idle: Fix false positive RCU splats due to incorrect hardirqs state") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195540.434302128@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Node names should be generic and use hyphens instead of underscores to
not cause warnings. Also nodes without a reg property should not have a
unit-address. Change the scpi_dvfs node to use clock-controller as node
name without a unit address (since it does not have a reg property).
Fixes: 70db166a2baa ("ARM64: dts: meson-gxbb: Add SCPI with cpufreq & sensors Nodes") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20230111211350.1461860-7-martin.blumenstingl@googlemail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Documentation/devicetree/bindings/net/ethernet-phy.yaml defines that the
node name for Ethernet PHYs should match the following pattern:
^ethernet-phy(@[a-f0-9]+)?$
Replace the underscore with a hyphen to adhere to this binding.
Fixes: 280c17df8fbf ("arm64: dts: meson: g12a: add mdio multiplexer") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20230111211350.1461860-6-martin.blumenstingl@googlemail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Unit addresses should be written using lower-case hex characters. Use
wifi_mac@c to fix a yaml schema validation error once the eFuse
dt-bindings have been converted to a yaml schema:
efuse: Unevaluated properties are not allowed ('wifi_mac@C' was
unexpected)
Also node names should use hyphens instead of underscores as the latter
can also cause warnings.
Fixes: abfaae24ecf3 ("arm64: dts: meson-gxl: add support for JetHub H1") Acked-by: Vyacheslav Bocharov <adeep@lexina.in> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20230111211350.1461860-2-martin.blumenstingl@googlemail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Running GCC_USB30_*_MASTER_CLK at 200MHz requires CX at nominal level,
not doing so results in occasional lockups. This was previously hidden
by the fact that the display stack incorrectly voted for CX (instead of
MMCX).
Section 5.2.12.12 Processor Local x2APIC Structure in the ACPI v6.5
spec mandates that both "enabled" and "online capable" Local APIC Flags
should be used to determine if the processor is usable or not.
However, Linux doesn't use the "online capable" flag for x2APIC to
determine if the processor is usable. As a result, cpu_possible_mask has
incorrect value and results in more memory getting allocated for per_cpu
variables than it is going to be used.
Make sure Linux parses both "enabled" and "online capable" flags for
x2APIC to correctly determine if the processor is usable.
Fixes: aa06e20f1be6 ("x86/ACPI: Don't add CPUs that are not online capable") Reported-by: Leo Duran <leo.duran@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20230105041059.39366-1-kvijayab@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
The systimer block derives its 13 MHz clock by dividing the main 26 MHz
oscillator clock by 2 internally. The 13 MHz clock is not a separate
oscillator.
Fix this by making the 13 MHz clock a divide-by-2 fixed factor clock,
taking its input from the main 26 MHz oscillator.
Fixes: 2e78620b1350 ("arm64: dts: Add MediaTek MT8186 dts and evaluation board and Makefile") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20221201084229.3464449-5-wenst@chromium.org Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The systimer block derives its 13 MHz clock by dividing the main 26 MHz
oscillator clock by 2 internally, not through the TOPCKGEN clock
controller.
On the MT8195 this divider is set either by power-on-reset or by the
bootloader. The bootloader may then make the divider unconfigurable to,
but can be read out by, the operating system.
Making the systimer block take the 26 MHz clock directly requires
changing the implementations. As an ABI compatible fix, change the
input clock of the systimer block a fixed factor divide-by-2 clock
that takes the 26 MHz oscillator as its input.
Fixes: 37f2582883be ("arm64: dts: Add mediatek SoC mt8195 and evaluation board") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20221201084229.3464449-4-wenst@chromium.org Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The systimer block derives its 13 MHz clock by dividing the main 26 MHz
oscillator clock by 2 internally, not through the TOPCKGEN clock
controller.
On the MT8192 this divider is fixed to /2 and is not configurable.
Making the systimer block take the 26 MHz clock directly requires
changing the implementations. As an ABI compatible fix, change the
input clock of the systimer block a fixed factor divide-by-2 clock
that takes the 26 MHz oscillator as its input.
Fixes: 48489980e27e ("arm64: dts: Add Mediatek SoC MT8192 and evaluation board dts and Makefile") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20221201084229.3464449-3-wenst@chromium.org Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The systimer block derives its 13 MHz clock by dividing the main 26 MHz
oscillator clock by 2 internally, not through the TOPCKGEN clock
controller.
On the MT8183 this divider is set either by power-on-reset or by the
bootloader. The bootloader may then make the divider unconfigurable to,
but can be read out by, the operating system.
Making the systimer block take the 26 MHz clock directly requires
changing the implementations. As an ABI compatible fix, change the
input clock of the systimer block a fixed factor divide-by-2 clock
that takes the 26 MHz oscillator as its input.
Assign power domain to the U3PHY1 T-PHY in otder to keep this PHY
alive after unused PD shutdown and to be able to completely cut
and restore power to it, for example, to save some power during
system suspend/sleep.
Fixes: 2b515194bf0c ("arm64: dts: mt8195: Add power domains controller") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20221214131117.108008-2-angelogioacchino.delregno@collabora.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
of_find_compatible_node() returns a node pointer with refcount incremented,
we should use of_node_put() on error path.
Add missing of_node_put() to avoid refcount leak.
- Remove autorepeat (leave key repetition to userspace);
- Remove unneeded status = "okay" (this is the default);
- Remove unneeded linux,input-type <EV_KEY> (this is the default for
gpio-keys);
- Allow the interrupt line for this button to be disabled;
- Use a full, descriptive node name;
- Set proper bias on the GPIO via pinctrl;
- Sort properties;
- Replace deprecated gpio-key,wakeup property with wakeup-source.
Fixes: 82e1783890b7 ("arm64: dts: qcom: sm6125: Add support for Sony Xperia 10II") Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221222192443.119103-1-marijn.suijten@somainline.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Reorder the clocks and corresponding names to match the QUSB2 phy
schema, fixing the following CHECK_DTBS errors:
arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dtb: phy@1613000: clock-names:0: 'cfg_ahb' was expected
From schema: /newdata/aosp-r/kernel/mainline/kernel/Documentation/devicetree/bindings/phy/qcom,qusb2-phy.yaml
arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dtb: phy@1613000: clock-names:1: 'ref' was expected
From schema: /newdata/aosp-r/kernel/mainline/kernel/Documentation/devicetree/bindings/phy/qcom,qusb2-phy.yaml
Fixes: cff4bbaf2a2d ("arm64: dts: qcom: Add support for SM6125") Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org> Reviewed-by: Martin Botka <martin.botka@somainline.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221216213343.1140143-1-marijn.suijten@somainline.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix up the ramoops node to make it match bindings and style:
- remove "removed-dma-pool"
- don't pad size to 8 hex digits
- change cc-size to ecc-size so that it's used
- increase ecc-size from to 16
- remove the zeroed ftrace-size
The framebuffer configuration for kumano griffin, written in kumano dtsi
(which is overwritten in bahamut dts for its smaller panel) has to use a
1096x2560 configuration as this is what the panel (and framebuffer area)
has been initialized to. Downstream userspace also has access to (and
uses) this 2.5k mode by default, and only switches the panel to 4k when
requested.
Fixes: d0a6ce59ea4e ("arm64: dts: qcom: sm8150: Add support for SONY Xperia 1 / 5 (Kumano platform)") Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221209191733.1458031-1-marijn.suijten@somainline.org Signed-off-by: Sasha Levin <sashal@kernel.org>
The hardware turns out to be pretty sluggish at assuming it can only
do USB2 with just a USB2 phy assigned to it - before it needed about
6 minutes to acknowledge that.
Limit it to USB-HS explicitly to make USB come up about 720x faster.
Fixes: 9da65e441d4d ("arm64: dts: qcom: Add support for SONY Xperia X Performance / XZ / XZs (msm8996, Tone platform)") Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221124220147.102611-1-konrad.dybcio@linaro.org Signed-off-by: Sasha Levin <sashal@kernel.org>
The commit e5bbbff5b7d7 ("clk: gcc-qcs404: Add PCIe resets") added names
for PCIe resets, but it did not change the existing qcs404.dtsi to use
these names. Do it now and use symbol names to make it easier to check
and modify the dtsi in future.
Commit 104ff59af73a ("ata: ahci: Add Tiger Lake UP{3,4} AHCI
controller") enabled low power mode for the Tiger Lake AHIC adapter in
the author system but created regressions for others. Revert this patch
for now until a better solution is found to make this adapter
eco-friendly.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217114 CC: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
arch/powerpc/mm/book3s64/radix_tlb.c:1191:23: error: variable 'hstart' is uninitialized when used here
__tlbiel_va_range(hstart, hend, pid,
^~~~~~
arch/powerpc/mm/book3s64/radix_tlb.c:1191:31: error: variable 'hend' is uninitialized when used here
__tlbiel_va_range(hstart, hend, pid,
^~~~
Rework the 'if (IS_ENABLE(CONFIG_TRANSPARENT_HUGEPAGE))' so hstart/hend
is always initialized to silence the warnings. That will also simplify
the 'else' path. Clang is getting confused with these warnings, but the
warnings is a false-positive.
Use spinlocks to deal with workers introducing a wrapper
asus_schedule_work(), and several spinlock checks.
Otherwise, asus_kbd_backlight_set() may schedule led->work after the
structure has been freed, causing a use-after-free.
Fixes: af22a610bc38 ("HID: asus: support backlight on USB keyboards") Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it> Link: https://lore.kernel.org/r/20230125-hid-unregister-leds-v4-5-7860c5763c38@diag.uniroma1.it Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Stefan Ghinea <stefan.ghinea@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
asus driver has a worker that may access data concurrently.
Proct the accesses using a spinlock.
Fixes: af22a610bc38 ("HID: asus: support backlight on USB keyboards") Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it> Link: https://lore.kernel.org/r/20230125-hid-unregister-leds-v4-4-7860c5763c38@diag.uniroma1.it Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Stefan Ghinea <stefan.ghinea@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ever since commit 83e83ecb79a8 ("usb: core: get config and string
descriptors for unauthorized devices") was merged in 2013, there has
been no mechanism for reallocating the rawdescriptors buffers in
struct usb_device after the initial enumeration. Before that commit,
the buffers would be deallocated when a device was deauthorized and
reallocated when it was authorized and enumerated.
This means that the locking in the read_descriptors() routine is not
needed, since the buffers it reads will never be reallocated while the
routine is running. This locking can interfere with user programs
trying to read a hub's descriptors via sysfs while new child devices
of the hub are being initialized, since the hub is locked during this
procedure.
Since the locking in read_descriptors() hasn't been needed for over
nine years, we can remove it.
Starting with release 10.38 PCRE2 drops default support for using \K in
lookaround patterns as described in [1]. Unfortunately, scripts/tags.sh
relies on such functionality to collect all_compiled_soures() leading to
the following error:
$ make COMPILED_SOURCE=1 tags
GEN tags
grep: \K is not allowed in lookarounds (but see PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK)
The usage of \K for this pattern was introduced in commit 4f491bb6ea2a
("scripts/tags.sh: collect compiled source precisely") which speeds up
the generation of tags significantly.
In order to fix this issue without compromising the performance we can
switch over to an equivalent sed expression. The same matching pattern
is preserved here except \K is replaced with a backreference \1.
Now that we made the VFS setgid checking consistent an inode can't be
marked security irrelevant even if the setgid bit is still set. Make
this function consistent with all other helpers.
Note that enforcing consistent setgid stripping checks for file
modification and mode- and ownership changes will cause the setgid bit
to be lost in more cases than useed to be the case. If an unprivileged
user wrote to a non-executable setgid file that they don't have
privilege over the setgid bit will be dropped. This will lead to
temporary failures in some xfstests until they have been updated.
Reported-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently setgid stripping in file_remove_privs()'s should_remove_suid()
helper is inconsistent with other parts of the vfs. Specifically, it only
raises ATTR_KILL_SGID if the inode is S_ISGID and S_IXGRP but not if the
inode isn't in the caller's groups and the caller isn't privileged over the
inode although we require this already in setattr_prepare() and
setattr_copy() and so all filesystem implement this requirement implicitly
because they have to use setattr_{prepare,copy}() anyway.
But the inconsistency shows up in setgid stripping bugs for overlayfs in
xfstests (e.g., generic/673, generic/683, generic/685, generic/686,
generic/687). For example, we test whether suid and setgid stripping works
correctly when performing various write-like operations as an unprivileged
user (fallocate, reflink, write, etc.):
The test basically creates a file with 6666 permissions. While the file has
the S_ISUID and S_ISGID bits set it does not have the S_IXGRP set. On a
regular filesystem like xfs what will happen is:
In should_remove_suid() we can see that ATTR_KILL_SUID is raised
unconditionally because the file in the test has S_ISUID set.
But we also see that ATTR_KILL_SGID won't be set because while the file
is S_ISGID it is not S_IXGRP (see above) which is a condition for
ATTR_KILL_SGID being raised.
So by the time we call notify_change() we have attr->ia_valid set to
ATTR_KILL_SUID | ATTR_FORCE. Now notify_change() sees that
ATTR_KILL_SUID is set and does:
and since the caller in the test is neither capable nor in the group of the
inode the S_ISGID bit is stripped.
But assume the file isn't suid then ATTR_KILL_SUID won't be raised which
has the consequence that neither the setgid nor the suid bits are stripped
even though it should be stripped because the inode isn't in the caller's
groups and the caller isn't privileged over the inode.
If overlayfs is in the mix things become a bit more complicated and the bug
shows up more clearly. When e.g., ovl_setattr() is hit from
ovl_fallocate()'s call to file_remove_privs() then ATTR_KILL_SUID and
ATTR_KILL_SGID might be raised but because the check in notify_change() is
questioning the ATTR_KILL_SGID flag again by requiring S_IXGRP for it to be
stripped the S_ISGID bit isn't removed even though it should be stripped:
The fix for all of this is to make file_remove_privs()'s
should_remove_suid() helper to perform the same checks as we already
require in setattr_prepare() and setattr_copy() and have notify_change()
not pointlessly requiring S_IXGRP again. It doesn't make any sense in the
first place because the caller must calculate the flags via
should_remove_suid() anyway which would raise ATTR_KILL_SGID.
While we're at it we move should_remove_suid() from inode.c to attr.c
where it belongs with the rest of the iattr helpers. Especially since it
returns ATTR_KILL_S{G,U}ID flags. We also rename it to
setattr_should_drop_suidgid() to better reflect that it indicates both
setuid and setgid bit removal and also that it returns attr flags.
Running xfstests with this doesn't report any regressions. We should really
try and use consistent checks.
Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current setgid stripping logic during write and ownership change
operations is inconsistent and strewn over multiple places. In order to
consolidate it and make more consistent we'll add a new helper
setattr_should_drop_sgid(). The function retains the old behavior where
we remove the S_ISGID bit unconditionally when S_IXGRP is set but also
when it isn't set and the caller is neither in the group of the inode
nor privileged over the inode.
We will use this helper both in write operation permission removal such
as file_remove_privs() as well as in ownership change operations.
Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Move the helper from inode.c to attr.c. This keeps the the core of the
set{g,u}id stripping logic in one place when we add follow-up changes.
It is the better place anyway, since should_remove_suid() returns
ATTR_KILL_S{G,U}ID flags.
Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In setattr_{copy,prepare}() we need to perform the same permission
checks to determine whether we need to drop the setgid bit or not.
Instead of open-coding it twice add a simple helper the encapsulates the
logic. We will reuse this helpers to make dropping the setgid bit during
write operations more consistent in a follow up patch.
Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[Why]
Connecting displays to TBT3 docks often produces invalid
replies for DPIA AUX requests. It turns out the completion
structure was not re-initialized before reusing it, resulting
in immature wake up to completion.
[How]
Properly call reinit_completion() on reused completion structure.
Cc: stable@vger.kernel.org Reviewed-by: Solomon Chiu <solomon.chiu@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Stylon Wang <stylon.wang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: "Limonciello, Mario" <mario.limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As per USB PD specification, 28th bit of fixed supply sink PDO
represents "higher capability" attribute and not "usb suspend
supported" attribute. So, this patch removes the usb_suspend_supported
attribute from sink PDO.
Fixes: 662a60102c12 ("usb: typec: Separate USB Power Delivery from USB Type-C") Cc: stable <stable@kernel.org> Reported-by: Rajaram Regupathy <rajaram.regupathy@intel.com> Signed-off-by: Saranya Gopal <saranya.gopal@intel.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20230214114543.205103-1-saranya.gopal@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Consider a case where gserial_disconnect has already cleared
gser->ioport. And if a wakeup interrupt triggers afterwards,
gserial_resume gets called, which will lead to accessing of
gser->ioport and thus causing null pointer dereference.Add
a null pointer check to prevent this.
Added a static spinlock to prevent gser->ioport from becoming
null after the newly added check.
[Why]
This fix was intended for improving on coding style but in the process
uncovers a race condition, which explains why we are getting incorrect
length in DPIA AUX replies. Due to the call path of DPIA AUX going from
DC back to DM layer then again into DC and the added complexities on top
of current DC AUX implementation, a proper fix to rely on current dc_lock
to address the race condition is difficult without a major overhual
on how DPIA AUX is implemented.
[How]
- Add a mutex dpia_aux_lock to protect DPIA AUX transfers
- Remove DMUB_ASYNC_TO_SYNC_ACCESS_* codes and rely solely on
aux_return_code_type for error reporting and handling
- Separate SET_CONFIG from DPIA AUX transfer because they have quiet
different processing logic
- Remove unnecessary type casting to and from void * type
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: "Limonciello, Mario" <mario.limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[Why]
DOMAIN power gating control is now required to be done via firmware
due to interlock with other power features. This is to avoid
intermittent issues in the LB memories.
[How]
If the firmware supports the command then use the new firmware as
the sequence can avoid potential display corruption issues.
The command will be ignored on firmware that does not support DOMAIN
power control and the pipes will remain always on - frequent PG cycling
can cause the issue to occur on the old sequence, so we should avoid it.
Reviewed-by: Hansen Dsouza <hansen.dsouza@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: "Limonciello, Mario" <Mario.Limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 226fae124b2d ("vc_screen: move load of struct vc_data pointer in
vcs_read() to avoid UAF") moved the call to vcs_vc() into the loop.
While doing this it also moved the unconditional assignment of
ret = -ENXIO;
This unconditional assignment was valid outside the loop but within it
it clobbers the actual value of ret.
To avoid this only assign "ret = -ENXIO" when actually needed.
[ Also, the 'goto unlock_out" needs to be just a "break", so that it
does the right thing when it exits on later iterations when partial
success has happened - Linus ]
Christoph Paasch reported that commit b5fc29233d28 ("inet6: Remove
inet6_destroy_sock() in sk->sk_prot->destroy().") started triggering
WARN_ON_ONCE(sk->sk_forward_alloc) in sk_stream_kill_queues(). [0 - 2]
Also, we can reproduce it by a program in [3].
In the commit, we delay freeing ipv6_pinfo.pktoptions from sk->destroy()
to sk->sk_destruct(), so sk->sk_forward_alloc is no longer zero in
inet_csk_destroy_sock().
The same check has been in inet_sock_destruct() from at least v2.6,
we can just remove the WARN_ON_ONCE(). However, among the users of
sk_stream_kill_queues(), only CAIF is not calling inet_sock_destruct().
Thus, we add the same WARN_ON_ONCE() to caif_sock_destructor().
The bpf_fib_lookup() helper does not only look up the fib (ie. route)
but it also looks up the neigh. Before returning the neigh, the helper
does not check for NUD_VALID. When a neigh state (neigh->nud_state)
is in NUD_FAILED, its dmac (neigh->ha) could be all zeros. The helper
still returns SUCCESS instead of NO_NEIGH in this case. Because of the
SUCCESS return value, the bpf prog directly uses the returned dmac
and ends up filling all zero in the eth header.
This patch checks for NUD_VALID and returns NO_NEIGH if the neigh is
not valid.
Using pr_cont() in the tasks freezing code related to system-wide
suspend and hibernation is problematic, because the continuation
messages printed there are susceptible to interspersing with other
unrelated messages which results in output that is hard to
understand.
Address this issue by modifying try_to_freeze_tasks() to print
messages that don't require continuations and adjusting its
callers accordingly.
Reported-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have two IS1 filters of the OCELOT_VCAP_KEY_ANY key type (the one with
"action vlan pop" and the one with "action vlan modify") and one of the
OCELOT_VCAP_KEY_IPV4 key type (the one with "action skbedit priority").
But we have no IS1 filter with the OCELOT_VCAP_KEY_ETYPE key type, and
there was an uncaught breakage there.
To increase test coverage, convert one of the OCELOT_VCAP_KEY_ANY
filters to OCELOT_VCAP_KEY_ETYPE, by making the filter also match on the
MAC SA of the traffic sent by mausezahn, $h1_mac.
The touchscreen reports a battery status of 0% and jumps to 1% when a
stylus is used. The device ID was added and the battery ignore quirk was
enabled for it.
Seems like properties parsing and reading was copy-pasted,
so "everest,interrupt-src" and "everest,interrupt-clk" are saved into
the es8326->jack_pol variable. This might lead to wrong settings
being saved into the reg 57 (ES8326_HP_DET).
Fix this by using proper variables while reading properties.
Signed-off-by: Alexey Firago <a.firago@yadro.com> Reviewed-by: Yang Yingliang <yangyingliang@huawei.com Link: https://lore.kernel.org/r/20230204195106.46539-1-a.firago@yadro.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The initial value of hid->collection[].parent_idx if 0. When
Report descriptor doesn't contain "HID Collection", the value
remains as 0.
In the meanwhile, when the Report descriptor fullfill
all following conditions, it will trigger hid_apply_multiplier
function call.
1. Usage page is Generic Desktop Ctrls (0x01)
2. Usage is RESOLUTION_MULTIPLIER (0x48)
3. Contain any FEATURE items
The while loop in hid_apply_multiplier will search the top-most
collection by searching parent_idx == -1. Because all parent_idx
is 0. The loop will run forever.
There is a Report Descriptor triggerring the deadloop
0x05, 0x01, // Usage Page (Generic Desktop Ctrls)
0x09, 0x48, // Usage (0x48)
0x95, 0x01, // Report Count (1)
0x75, 0x08, // Report Size (8)
0xB1, 0x01, // Feature
Entries can linger in cache without timer for days, thanks to
the gc_thresh1 limit. As result, without traffic, the confirmed
time can be outdated and to appear to be in the future. Later,
on traffic, NUD_STALE entries can switch to NUD_DELAY and start
the timer which can see the invalid confirmed time and wrongly
switch to NUD_REACHABLE state instead of NUD_PROBE. As result,
timer is set many days in the future. This is more visible on
32-bit platforms, with higher HZ value.
Why this is a problem? While we expect unused entries to expire,
such entries stay in REACHABLE state for too long, locked in
cache. They are not expired normally, only when cache is full.
Problem and the wrong state change reported by Zhang Changzhong:
172.16.1.18 dev bond0 lladdr 0a:0e:0f:01:12:01 ref 1 used 350521/15994171/350520 probes 4 REACHABLE
350520 seconds have elapsed since this entry was last updated, but it is
still in the REACHABLE state (base_reachable_time_ms is 30000),
preventing lladdr from being updated through probe.
Fix it by ensuring timer is started with valid used/confirmed
times. Considering the valid time range is LONG_MAX jiffies,
we try not to go too much in the past while we are in
DELAY/PROBE state. There are also places that need
used/updated times to be validated while timer is not running.
Reported-by: Zhang Changzhong <zhangchangzhong@huawei.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Tested-by: Zhang Changzhong <zhangchangzhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
According to c8sectpfe driver code we first drive reset line low and
then high to reset the port, therefore the reset line is supposed to
be annotated as "active low". This will be important when we convert
the driver to gpiod API.
Commit 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only KASAN
support") added a select of ARCH_WANTS_NO_INSTR, because it also added
some uses of noinstr. However noinstr is always defined, regardless of
ARCH_WANTS_NO_INSTR, so there's no need to select it just for that.
As PeterZ says [1]:
Note that by selecting ARCH_WANTS_NO_INSTR you effectively state to
abide by its rules.
As of now the powerpc code does not abide by those rules, and trips some
new warnings added by Peter in linux-next.
So until the code can be fixed to avoid those warnings, disable
ARCH_WANTS_NO_INSTR.
Note that ARCH_WANTS_NO_INSTR is also used to gate building KCOV and
parts of KCSAN. However none of the noinstr annotations in powerpc were
added for KCOV or KCSAN, instead instrumentation is blocked at the file
level using KCOV_INSTRUMENT_foo.o := n.
The arg->clone_sources_count is u64 and can trigger a warning when a
huge value is passed from user space and a huge array is allocated.
Limit the allocated memory to 8MiB (can be increased if needed), which
in turn limits the number of clone sources to 8M / sizeof(struct
clone_root) = 8M / 40 = 209715. Real world number of clones is from
tens to hundreds, so this is future proof.
Reported-by: syzbot+4376a9a073770c173269@syzkaller.appspotmail.com Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Lockdep reports that acpi_nfit_shutdown() may deadlock against an
opportune acpi_nfit_scrub(). acpi_nfit_scrub () is run from inside a
'work' and therefore has already acquired workqueue-internal locks. It
also acquiires acpi_desc->init_mutex. acpi_nfit_shutdown() first
acquires init_mutex, and was subsequently attempting to cancel any
pending workqueue items. This reversed locking order causes a potential
deadlock:
======================================================
WARNING: possible circular locking dependency detected
6.2.0-rc3 #116 Tainted: G O N
------------------------------------------------------
libndctl/1958 is trying to acquire lock: ffff888129b461c0 ((work_completion)(&(&acpi_desc->dwork)->work)){+.+.}-{0:0}, at: __flush_work+0x43/0x450
but task is already holding lock: ffff888129b460e8 (&acpi_desc->init_mutex){+.+.}-{3:3}, at: acpi_nfit_shutdown+0x87/0xd0 [nfit]
Since the workqueue manipulation is protected by its own internal locking,
the cancellation of pending work doesn't need to be done under
acpi_desc->init_mutex. Move cancel_delayed_work_sync() outside the
init_mutex to fix the deadlock. Any work that starts after
acpi_nfit_shutdown() drops the lock will see ARS_CANCEL, and the
cancel_delayed_work_sync() will safely flush it out.
This device has a touchscreen thats report a battery even if it doesn't
have one.
Ask Linux to ignore the battery so it will not always report it as low.
Make function buttons on ELECOM M-HT1DRBK trackball mouse work. This model
has two devices with different device IDs (010D and 011C). Both of
them misreports the number of buttons as 5 in the report descriptor, even
though they have 8 buttons. hid-elecom overwrites the report to fix them,
but supports only on 010D and does not work on 011C. This patch fixes
011C in the similar way but with specialized position parameters.
In fact, it is sufficient to rewrite only 17th byte (05 -> 08). However I
followed the existing way.
The clocks in the Rockchip rk3288 DisplayPort node are
included in the power-domain@RK3288_PD_VIO logic, but the
power-domains property in the dp node is missing, so fix it.
While this device uses the rk3399 it is also enclosed in a tight package
and cooled through the screen and back case. The default rk3399 thermal
limits can result in a burnt screen.
These lower limits have resulted in the existing burn not expanding and
will hopefully result in future devices not experiencing the issue.
This change adds support for nested IPsec tunnels by ensuring that
XFRM-I verifies existing policies before decapsulating a subsequent
policies. Addtionally, this clears the secpath entries after policies
are verified, ensuring that previous tunnels with no-longer-valid
do not pollute subsequent policy checks.
This is necessary especially for nested tunnels, as the IP addresses,
protocol and ports may all change, thus not matching the previous
policies. In order to ensure that packets match the relevant inbound
templates, the xfrm_policy_check should be done before handing off to
the inner XFRM protocol to decrypt and decapsulate.
Notably, raw ESP/AH packets did not perform policy checks inherently,
whereas all other encapsulated packets (UDP, TCP encapsulated) do policy
checks after calling xfrm_input handling in the respective encapsulation
layer.
Test: Verified with additional Android Kernel Unit tests Signed-off-by: Benedict Wong <benedictwong@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 74e19ef0ff80 ("uaccess: Add speculation barrier to
copy_from_user()") built fine on x86-64 and arm64, and that's the extent
of my local build testing.
It turns out those got the <linux/nospec.h> include incidentally through
other header files (<linux/kvm_host.h> in particular), but that was not
true of other architectures, resulting in build errors
kernel/bpf/core.c: In function ‘___bpf_prog_run’:
kernel/bpf/core.c:1913:3: error: implicit declaration of function ‘barrier_nospec’
so just make sure to explicitly include the proper <linux/nospec.h>
header file to make everybody see it.
The randstruct support released in Clang 15 is unsafe to use due to a
bug that can cause miscompilations: "-frandomize-layout-seed
inconsistently randomizes all-function-pointers structs"
(https://github.com/llvm/llvm-project/issues/60349). It has been fixed
on the Clang 16 release branch, so add a Clang version check.
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed.
ext4_feat_ktype was setting the "release" handler to "kfree", which
doesn't have a matching function prototype. Add a simple wrapper
with the correct prototype.
This was found as a result of Clang's new -Wcast-function-type-strict
flag, which is more sensitive than the simpler -Wcast-function-type,
which only checks for type width mismatches.
Note that this code is only reached when ext4 is a loadable module and
it is being unloaded:
On some Lenovo Legion models, the backlight might be driven by either
one of nvidia_wmi_ec_backlight or amdgpu_bl0 at different times.
When the Nvidia WMI EC backlight interface reports the backlight is
controlled by the EC, the current backlight handling only registers
nvidia_wmi_ec_backlight (and registers no other backlight interfaces).
This hides (never registers) the amdgpu_bl0 interface, where as prior
to 6.1.4 users would have both nvidia_wmi_ec_backlight and amdgpu_bl0
and could work around things in userspace.
Add a force module parameter which can be used with acpi_backlight=native
to restore the old behavior as a workound (for now) by passing:
We've moved the upstream Linux Kernel audit subsystem discussions to
a new mailing list, this patch updates the MAINTAINERS info with the
new list address.
Marking this for stable inclusion to help speed uptake of the new
list across all of the supported kernel releases. This is a doc only
patch so the risk should be close to nil.
Cc: stable@vger.kernel.org Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit e3fffc1f0b47 ("devicetree: document new marvell-8xxx and
pwrseq-sd8787 options") documented a compatible string for SD8787 in
the devicetree bindings, but neglected to add it to the mwifiex driver.
Fixes: e3fffc1f0b47 ("devicetree: document new marvell-8xxx and pwrseq-sd8787 options") Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v4.11+ Cc: Matt Ranostay <mranostay@ti.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/320de5005ff3b8fd76be2d2b859fd021689c3681.1674827105.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sh vmlinux fails to link with GNU ld < 2.40 (likely < 2.36) since
commit 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv").
This is similar to fixes for powerpc and s390:
commit 4b9880dbf3bd ("powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT").
commit a494398bde27 ("s390: define RUNTIME_DISCARD_EXIT to fix link error
with GNU ld < 2.36").
$ sh4-linux-gnu-ld --version | head -n1
GNU ld (GNU Binutils for Debian) 2.35.2
$ make ARCH=sh CROSS_COMPILE=sh4-linux-gnu- microdev_defconfig
$ make ARCH=sh CROSS_COMPILE=sh4-linux-gnu-
`.exit.text' referenced in section `__bug_table' of crypto/algboss.o:
defined in discarded section `.exit.text' of crypto/algboss.o
`.exit.text' referenced in section `__bug_table' of
drivers/char/hw_random/core.o: defined in discarded section
`.exit.text' of drivers/char/hw_random/core.o
make[2]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1
make[1]: *** [Makefile:1252: vmlinux] Error 2
arch/sh/kernel/vmlinux.lds.S keeps EXIT_TEXT:
/*
* .exit.text is discarded at runtime, not link time, to deal with
* references from __bug_table
*/
.exit.text : AT(ADDR(.exit.text)) { EXIT_TEXT }
However, EXIT_TEXT is thrown away by
DISCARD(include/asm-generic/vmlinux.lds.h) because
sh does not define RUNTIME_DISCARD_EXIT.
GNU ld 2.40 does not have this issue and builds fine.
This corresponds with Masahiro's comments in a494398bde27:
"Nathan [Chancellor] also found that binutils
commit 21401fc7bf67 ("Duplicate output sections in scripts") cured this
issue, so we cannot reproduce it with binutils 2.36+, but it is better
to not rely on it."
Link: https://lkml.kernel.org/r/9166a8abdc0f979e50377e61780a4bba1dfa2f52.1674518464.git.tom.saeger@oracle.com Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Link: https://lore.kernel.org/all/Y7Jal56f6UBh1abE@dev-arch.thelio-3990X/ Link: https://lore.kernel.org/all/20230123194218.47ssfzhrpnv3xfez@oracle.com/ Signed-off-by: Tom Saeger <tom.saeger@oracle.com> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Dennis Gilmore <dennis@ausil.us> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tom Saeger <tom.saeger@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nathan Chancellor reports that the s390 vmlinux fails to link with
GNU ld < 2.36 since commit 99cb0d917ffa ("arch: fix broken BuildID
for arm64 and riscv").
It happens for defconfig, or more specifically for CONFIG_EXPOLINE=y.
$ s390x-linux-gnu-ld --version | head -n1
GNU ld (GNU Binutils for Debian) 2.35.2
$ make -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- allnoconfig
$ ./scripts/config -e CONFIG_EXPOLINE
$ make -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- olddefconfig
$ make -s ARCH=s390 CROSS_COMPILE=s390x-linux-gnu-
`.exit.text' referenced in section `.s390_return_reg' of drivers/base/dd.o: defined in discarded section `.exit.text' of drivers/base/dd.o
make[1]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1
make: *** [Makefile:1252: vmlinux] Error 2
arch/s390/kernel/vmlinux.lds.S wants to keep EXIT_TEXT:
.exit.text : {
EXIT_TEXT
}
But, at the same time, EXIT_TEXT is thrown away by DISCARD because
s390 does not define RUNTIME_DISCARD_EXIT.
I still do not understand why the latter wins after 99cb0d917ffa,
but defining RUNTIME_DISCARD_EXIT seems correct because the comment
line in arch/s390/kernel/vmlinux.lds.S says:
/*
* .exit.text is discarded at runtime, not link time,
* to deal with references from __bug_table
*/
Nathan also found that binutils commit 21401fc7bf67 ("Duplicate output
sections in scripts") cured this issue, so we cannot reproduce it with
binutils 2.36+, but it is better to not rely on it.
Relocatable kernels must not discard relocations, they need to be
processed at runtime. As such they are included for CONFIG_RELOCATABLE
builds in the powerpc linker script (line 340).
However they are also unconditionally discarded later in the
script (line 414). Previously that worked because the earlier inclusion
superseded the discard.
However commit 99cb0d917ffa ("arch: fix broken BuildID for arm64 and
riscv") introduced an earlier use of DISCARD as part of the RO_DATA
macro (line 137). With binutils < 2.36 that causes the DISCARD
directives later in the script to be applied earlier, causing .rela* to
actually be discarded at link time, leading to build warnings and a
kernel that doesn't boot:
The powerpc linker script explicitly includes .exit.text, because
otherwise the link fails due to references from __bug_table and
__ex_table. The code is freed (discarded) at runtime along with
.init.text and data.
That has worked in the past despite powerpc not defining
RUNTIME_DISCARD_EXIT because DISCARDS appears late in the powerpc linker
script (line 410), and the explicit inclusion of .exit.text
earlier (line 280) supersedes the discard.
However commit 99cb0d917ffa ("arch: fix broken BuildID for arm64 and
riscv") introduced an earlier use of DISCARD as part of the RO_DATA
macro (line 136). With binutils < 2.36 that causes the DISCARD
directives later in the script to be applied earlier [1], causing
.exit.text to actually be discarded at link time, leading to build
errors:
'.exit.text' referenced in section '__bug_table' of crypto/algboss.o: defined in
discarded section '.exit.text' of crypto/algboss.o
'.exit.text' referenced in section '__ex_table' of drivers/nvdimm/core.o: defined in
discarded section '.exit.text' of drivers/nvdimm/core.o
Fix it by defining RUNTIME_DISCARD_EXIT, which causes the generic
DISCARDS macro to not include .exit.text at all.