Arnd Bergmann [Mon, 20 Dec 2021 14:33:33 +0000 (15:33 +0100)]
Merge tag 'samsung-drivers-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into arm/drivers
Samsung SoC drivers changes for v5.17
1. Exynos ChipID: add Exynos7885 support.
2. Exynos PMU: add Exynos850 support.
3. Minor bindings cleanup.
4. Add Exynos USIv2 (Universal Serial Interface) driver. The USI block is
a shared IP block between I2C, UART/serial and SPI. Basically one has
to choose which feature the USI block will support and later the
regular I2C/serial/SPI driver will bind and work.
This merges also one commit with dt-binding headers from my dts64
pull request.
Together with a future serial driver change, this will break the ABI.
Affected: Serial on ExynosAutov9 SADK and out-of-tree ExynosAutov9 boards
Why: To properly and efficiently support the USI with new hierarchy
of USI-{serial,SPI,I2C} devicetree nodes.
Rationale:
Recently added serial and USI support was short-sighted and did not
allow to smooth support of other features (SPI and I2C). Adding
support for USI-SPI and USI-I2C would effect in code duplication.
Adding support for different USI versions (currently supported is
USIv2 but support for v1 is planned) would cause even more code
duplication and create a solution difficult to maintain.
Since USI-serial and ExynosAutov9 have been added recently, are
considered fresh development features and there are no supported
products using them, the code/solution is being refactored in
non-backwards compatible way. The compatibility is not broken yet.
It will be when serial driver changes are accepted.
The ABI break was discussed with only known users of ExynosAutov9 and
received their permission.
* tag 'samsung-drivers-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
dt-bindings: soc: samsung: keep SoC driver bindings together
soc: samsung: Add USI driver
dt-bindings: soc: samsung: Add Exynos USI bindings
soc: samsung: exynos-pmu: Add Exynos850 support
dt-bindings: samsung: pmu: Document Exynos850
soc: samsung: exynos-chipid: add Exynos7885 SoC support
soc: samsung: exynos-chipid: describe which SoCs go with compatibles
Arnd Bergmann [Mon, 20 Dec 2021 14:07:57 +0000 (15:07 +0100)]
Merge tag 'imx-drivers-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/drivers
i.MX drivers update for 5.17:
- A number of patches from Adam Ford to update gpcv2 and blk-ctrl driver
to keep i.MX8MM VPU-H1 and i.MX8MN GPUMIX bus clocks active, and add
i.MX8MN display related domain support.
- Add optional continuous burst clock support for imx-weim bus driver.
- Call pm_runtime_put_sync_suspend() instead of pm_runtime_put() in
gpcv2 driver to prevent a sequence issue seen with i.MX8MM GPU and
MIX domain.
* tag 'imx-drivers-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
soc: imx: imx8m-blk-ctrl: add i.MX8MN DISP blk-ctrl
dt-bindings: power: imx8mn: add defines for DISP blk-ctrl domains
soc: imx: gpcv2: Add dispmix and mipi domains to imx8mn
soc: imx: gpcv2: keep i.MX8MN gpumix bus clock enabled
bus: imx-weim: optionally enable continuous burst clock
soc: imx: gpcv2: keep i.MX8MM VPU-H1 bus clock active
soc: imx: gpcv2: Synchronously suspend MIX domains
Arnd Bergmann [Mon, 20 Dec 2021 13:38:17 +0000 (14:38 +0100)]
Merge tag 'tegra-for-5.17-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/drivers
soc/tegra: Changes for v5.17-rc1
This set of changes contains some preparatory work that is shared by
several branches and trees to support DVFS via power domains.
There's also a bit of cleanup and improvements to reboot on chips that
use PSCI.
* tag 'tegra-for-5.17-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
soc/tegra: pmc: Rename core power domain
soc/tegra: pmc: Rename 3d power domains
soc/tegra: regulators: Prepare for suspend
soc/tegra: fuse: Use resource-managed helpers
soc/tegra: fuse: Reset hardware
soc/tegra: pmc: Add reboot notifier
soc/tegra: Don't print error message when OPPs not available
Arnd Bergmann [Mon, 20 Dec 2021 11:53:18 +0000 (12:53 +0100)]
Merge tag 'tegra-for-5.17-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/drivers
drivers: Changes for v5.17-rc1
This is an assortment of driver patches that rely on some of the changes
in the for-5.17/soc branch. These have all been acked by the respective
maintainers and go through the Tegra tree to more easily handle the
build dependency.
* tag 'tegra-for-5.17-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
media: staging: tegra-vde: Support generic power domain
spi: tegra20-slink: Add OPP support
mtd: rawnand: tegra: Add runtime PM and OPP support
mmc: sdhci-tegra: Add runtime PM and OPP support
pwm: tegra: Add runtime PM and OPP support
bus: tegra-gmi: Add runtime PM and OPP support
usb: chipidea: tegra: Add runtime PM and OPP support
soc/tegra: Add devm_tegra_core_dev_init_opp_table_common()
soc/tegra: Enable runtime PM during OPP state-syncing
Krzysztof Kozlowski [Mon, 13 Dec 2021 11:20:57 +0000 (12:20 +0100)]
dt-bindings: soc: samsung: keep SoC driver bindings together
Recently added Samsung Exynos USI driver devicetree bindings were added
under ../bindings/soc/samsung/exynos-usi.yaml, so move there also two
other bindings for Exynos SoC drivers: the PMU and ChipID.
Update Samsung Exynos MAINTAINERS entry to include this new path.
Sam Protsenko [Sat, 4 Dec 2021 19:57:54 +0000 (21:57 +0200)]
soc: samsung: Add USI driver
USIv2 IP-core is found on modern ARM64 Exynos SoCs (like Exynos850) and
provides selectable serial protocol (one of: UART, SPI, I2C). USIv2
registers usually reside in the same register map as a particular
underlying protocol it implements, but have some particular offset. E.g.
on Exynos850 the USI_UART has 0x13820000 base address, where UART
registers have 0x00..0x40 offsets, and USI registers have 0xc0..0xdc
offsets. Desired protocol can be chosen via SW_CONF register from System
Register block of the same domain as USI.
Before starting to use a particular protocol, USIv2 must be configured
properly:
1. Select protocol to be used via System Register
2. Clear "reset" flag in USI_CON
3. Configure HWACG behavior (e.g. for UART Rx the HWACG must be
disabled, so that the IP clock is not gated automatically); this is
done using USI_OPTION register
4. Keep both USI clocks (PCLK and IPCLK) running during USI registers
modification
This driver implements the above behavior. Of course, USIv2 driver
should be probed before UART/I2C/SPI drivers. It can be achieved by
embedding UART/I2C/SPI nodes inside of the USI node (in Device Tree);
driver then walks underlying nodes and instantiates those. Driver also
handles USI configuration on PM resume, as register contents can be lost
during CPU suspend.
This driver is designed with different USI versions in mind. So it
should be relatively easy to add new USI revisions to it later.
Adam Ford [Wed, 15 Dec 2021 00:46:20 +0000 (18:46 -0600)]
dt-bindings: power: imx8mn: add defines for DISP blk-ctrl domains
This adds the defines for the power domains provided by the DISP
blk-ctrl.
Signed-off-by: Adam Ford <aford173@gmail.com> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Dmitry Osipenko [Tue, 30 Nov 2021 23:23:34 +0000 (02:23 +0300)]
media: staging: tegra-vde: Support generic power domain
Currently driver supports legacy power domain API, this patch adds generic
power domain support. This allows us to utilize a modern GENPD API for
newer device-trees.
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Peter Geis <pgwipeout@gmail.com> # Ouya T30 Tested-by: Paul Fertser <fercerpav@gmail.com> # PAZ00 T20 Tested-by: Nicolas Chauvet <kwizart@gmail.com> # PAZ00 T20 and TK1 T124 Tested-by: Matt Merhar <mattmerhar@protonmail.com> # Ouya T30 Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Dmitry Osipenko [Tue, 30 Nov 2021 23:23:31 +0000 (02:23 +0300)]
spi: tegra20-slink: Add OPP support
The SPI on Tegra belongs to the core power domain and we're going to
enable GENPD support for the core domain. Now SPI driver must use OPP
API for driving the controller's clock rate because OPP API takes care
of reconfiguring the domain's performance state in accordance to the
rate. Add OPP support to the driver.
Acked-by: Mark Brown <broonie@kernel.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
Dmitry Osipenko [Tue, 30 Nov 2021 23:23:30 +0000 (02:23 +0300)]
mtd: rawnand: tegra: Add runtime PM and OPP support
The NAND on Tegra belongs to the core power domain and we're going to
enable GENPD support for the core domain. Now NAND must be resumed using
runtime PM API in order to initialize the NAND power state. Add runtime PM
and OPP support to the NAND driver.
Dmitry Osipenko [Tue, 30 Nov 2021 23:23:29 +0000 (02:23 +0300)]
mmc: sdhci-tegra: Add runtime PM and OPP support
The SDHCI on Tegra belongs to the core power domain and we're going to
enable GENPD support for the core domain. Now SDHCI must be resumed using
runtime PM API in order to initialize the SDHCI power state. The SDHCI
clock rate must be changed using OPP API that will reconfigure the power
domain performance state in accordance to the rate. Add runtime PM and OPP
support to the SDHCI driver.
Dmitry Osipenko [Tue, 30 Nov 2021 23:23:28 +0000 (02:23 +0300)]
pwm: tegra: Add runtime PM and OPP support
The PWM on Tegra belongs to the core power domain and we're going to
enable GENPD support for the core domain. Now PWM must be resumed using
runtime PM API in order to initialize the PWM power state. The PWM clock
rate must be changed using OPP API that will reconfigure the power domain
performance state in accordance to the rate. Add runtime PM and OPP
support to the PWM driver.
Dmitry Osipenko [Tue, 30 Nov 2021 23:23:27 +0000 (02:23 +0300)]
bus: tegra-gmi: Add runtime PM and OPP support
The GMI bus on Tegra belongs to the core power domain and we're going to
enable GENPD support for the core domain. Now GMI must be resumed using
runtime PM API in order to initialize the GMI power state. Add runtime PM
and OPP support to the GMI driver.
Dmitry Osipenko [Tue, 30 Nov 2021 23:23:26 +0000 (02:23 +0300)]
usb: chipidea: tegra: Add runtime PM and OPP support
The Tegra USB controller belongs to the core power domain and we're going
to enable GENPD support for the core domain. Now USB controller must be
resumed using runtime PM API in order to initialize the USB power state.
We already support runtime PM for the CI device, but CI's PM is separated
from the RPM managed by tegra-usb driver. Add runtime PM and OPP support
to the driver.
Only couple drivers need to get the -ENODEV error code and majority of
drivers need to explicitly initialize the performance state. Add new
common helper which sets up OPP table for these drivers.
Dmitry Osipenko [Tue, 30 Nov 2021 23:23:38 +0000 (02:23 +0300)]
soc/tegra: pmc: Rename 3d power domains
Device-tree schema doesn't allow domain name to start with a number.
We don't use 3d domain yet in device-trees, so rename it to the name
used by Tegra TRMs: TD, TD2.
Dmitry Osipenko [Tue, 30 Nov 2021 23:23:08 +0000 (02:23 +0300)]
soc/tegra: Enable runtime PM during OPP state-syncing
GENPD core now can set up domain's performance state properly while device
is RPM-suspended. Runtime PM of a device must be enabled during setup
because GENPD checks whether device is suspended and check doesn't work
while RPM is disabled. Instead of replicating the boilerplate RPM-enable
code around OPP helper for each driver, let's make OPP helper to take care
of enabling it.
Dmitry Osipenko [Tue, 30 Nov 2021 23:23:37 +0000 (02:23 +0300)]
soc/tegra: regulators: Prepare for suspend
Depending on hardware version, Tegra SoC may require a higher voltages
during resume from system suspend, otherwise hardware will crash. Set
SoC voltages to a nominal levels during suspend.
Jon Hunter [Tue, 30 Nov 2021 11:44:06 +0000 (11:44 +0000)]
soc/tegra: pmc: Add reboot notifier
The Tegra PMC driver implements a restart handler that supports Tegra
specific reboot commands such as placing the device into 'recovery' mode
in order to reprogram the platform. This is accomplished by setting the
appropriate bit in the PMC scratch0 register prior to rebooting the
platform.
For Tegra platforms that support PSCI or EFI, the default Tegra restart
handler is not called and the PSCI or EFI restart handler is called
instead. Hence, for Tegra platforms that support PSCI or EFI, the Tegra
specific reboot commands do not currently work. Fix this by moving the
code that programs the PMC scratch0 register into a separate reboot
notifier that will always be called on reboot.
Dmitry Osipenko [Tue, 30 Nov 2021 23:23:10 +0000 (02:23 +0300)]
soc/tegra: Don't print error message when OPPs not available
Previously we assumed that devm_tegra_core_dev_init_opp_table() will
be used only by drivers that will always have device with OPP table,
but this is not true anymore. For example now Tegra30 will have OPP table
for PWM, but Tegra20 not and both use the same driver. Hence let's not
print the error message about missing OPP table in the common helper,
we can print it elsewhere.
Miaoqian Lin [Tue, 14 Dec 2021 01:55:44 +0000 (01:55 +0000)]
soc: ti: knav_dma: Fix NULL vs IS_ERR() checking in dma_init
Since devm_ioremap_resource() function return error pointers.
The pktdma_get_regs() function does not return NULL, It return error
pointers too. Using IS_ERR() to check the return value to fix this.
Arnd Bergmann [Wed, 15 Dec 2021 14:58:00 +0000 (15:58 +0100)]
Merge tag 'asahi-soc-pmgr-5.17-v2' of https://github.com/AsahiLinux/linux into arm/drivers
Apple SoC PMGR driver updates for 5.17
* Adds an auto-PM feature that is necessary for a single device
* Disables module builds, which were broken anyway
* tag 'asahi-soc-pmgr-5.17-v2' of https://github.com/AsahiLinux/linux:
soc: apple: apple-pmgr-pwrstate: Do not build as a module
soc: apple: apple-pmgr-pwrstate: Add auto-PM min level support
Hector Martin [Wed, 15 Dec 2021 11:27:54 +0000 (20:27 +0900)]
soc: apple: apple-pmgr-pwrstate: Do not build as a module
This doesn't make any sense as a module since it is a critical device,
and it turns out of_phandle_iterator_args was not exported so the module
version doesn't build anyway.
Fixes: 6df9d38f9146 ("soc: apple: Add driver for Apple PMGR power state controls") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Hector Martin <marcan@marcan.st>
Arnd Bergmann [Tue, 14 Dec 2021 14:54:19 +0000 (15:54 +0100)]
Merge tag 'memory-controller-drv-renesas-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl into arm/drivers
Memory controller drivers for v5.17 - Renesas
Changes to the Renesas RPC-IF driver:
1. Add support for R9A07G044 / RZ/G2L.
2. Several minor fixes and improvements to the driver.
* tag 'memory-controller-drv-renesas-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl:
memory: renesas-rpc-if: refactor MOIIO and IOFV macros
memory: renesas-rpc-if: avoid use of undocumented bits
memory: renesas-rpc-if: simplify register update
memory: renesas-rpc-if: Silence clang warning
memory: renesas-rpc-if: Add support for RZ/G2L
memory: renesas-rpc-if: Drop usage of RPCIF_DIRMAP_SIZE macro
memory: renesas-rpc-if: Return error in case devm_ioremap_resource() fails
dt-bindings: memory: renesas,rpc-if: Add optional interrupts property
dt-bindings: memory: renesas,rpc-if: Add support for the R9A07G044
Arnd Bergmann [Tue, 14 Dec 2021 10:27:38 +0000 (11:27 +0100)]
Merge tag 'optee-async-notif-for-v5.17' of https://git.linaro.org/people/jens.wiklander/linux-tee into arm/drivers
OP-TEE Asynchronous notifications from secure world
Adds support in the SMC based OP-TEE driver to receive asynchronous
notifications from secure world using an edge-triggered interrupt as
delivery mechanism.
* tag 'optee-async-notif-for-v5.17' of https://git.linaro.org/people/jens.wiklander/linux-tee:
optee: Fix NULL but dereferenced coccicheck error
optee: add asynchronous notifications
optee: separate notification functions
tee: export teedev_open() and teedev_close_context()
tee: fix put order in teedev_close_context()
dt-bindings: arm: optee: add interrupt property
docs: staging/tee.rst: add a section on OP-TEE notifications
Arnd Bergmann [Tue, 14 Dec 2021 09:50:37 +0000 (10:50 +0100)]
Merge tag 'zynqmp-soc-for-v5.17' of https://github.com/Xilinx/linux-xlnx into arm/drivers
arm64: dts: ZynqMP SoC changes for v5.17
- cleanup and fix PM_INIT_FINALIZE
- check return value of zynqmp_pm_get_api_version()
* tag 'zynqmp-soc-for-v5.17' of https://github.com/Xilinx/linux-xlnx:
firmware: xilinx: check return value of zynqmp_pm_get_api_version()
soc: xilinx: add a to_zynqmp_pm_domain macro
soc: xilinx: use a properly named field instead of flags
soc: xilinx: cleanup debug and error messages
soc: xilinx: move PM_INIT_FINALIZE to zynqmp_pm_domains driver
Arnd Bergmann [Tue, 14 Dec 2021 08:28:44 +0000 (09:28 +0100)]
Merge tag 'renesas-drivers-for-v5.17-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into arm/drivers
Renesas driver updates for v5.17
- Add a remoteproc API for controlling the Cortex-R7 boot address on
R-Car Gen3 SoCs,
- Consolidate product register handling.
* tag 'renesas-drivers-for-v5.17-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel:
soc: renesas: Consolidate product register handling
soc: renesas: rcar-rst: Add support to set rproc boot address
Add constants for choosing USIv2 configuration mode in device tree.
Those are further used in USI driver to figure out which value to write
into SW_CONF register. Also document USIv2 IP-core bindings.
Wan Jiabing [Thu, 14 Oct 2021 08:45:55 +0000 (04:45 -0400)]
ARM: at91: pm: Add of_node_put() before goto
Fix following coccicheck warning:
./arch/arm/mach-at91/pm.c:643:1-33: WARNING: Function
for_each_matching_node_and_match should have of_node_put() before goto
Early exits from for_each_matching_node_and_match should decrement the
node reference counter.
Rajan Vaja [Wed, 6 Oct 2021 08:43:55 +0000 (01:43 -0700)]
firmware: xilinx: check return value of zynqmp_pm_get_api_version()
Currently return value of zynqmp_pm_get_api_version() is ignored.
Because of that, API version is checked in case of error also.
So add check for return value of zynqmp_pm_get_api_version().
Michael Tretter [Wed, 25 Aug 2021 15:03:12 +0000 (17:03 +0200)]
soc: xilinx: use a properly named field instead of flags
Instead of defining a flags field and a single bit in this field to
signal that a PM node has been requested, use a boolean field with a
descriptive name.
No functional change, but using a proper name instead of flags makes the
code easier to read.
Michael Tretter [Wed, 25 Aug 2021 15:03:10 +0000 (17:03 +0200)]
soc: xilinx: move PM_INIT_FINALIZE to zynqmp_pm_domains driver
PM_INIT_FINALIZE tells the PMU FW that Linux is able to handle the power
management nodes that are provided by the PMU FW. Nodes that are not
requested are shut down after this call.
Calling PM_INIT_FINALIZE from the zynqmp_power driver is wrong. The PM
node request mechanism is implemented in the zynqmp_pm_domains driver,
which must also call PM_INIT_FINALIZE.
Due to the behavior of the PMU FW, all devices must be powered up before
PM_INIT_FINALIZE is called, because otherwise the devices might
misbehave. Calling PM_INIT_FINALIZE from the sync_state device callback
ensures that all users probed successfully before the PMU FW is allowed
to power off unused domains.
Yoshihiro Shimoda [Wed, 1 Dec 2021 07:33:04 +0000 (16:33 +0900)]
soc: renesas: rcar-rst: Add support for R-Car S4-8
Add support for R-Car S4-8 (R8A779F0) to the R-Car RST driver.
The register map of R-Car S4-8 is the same as R-Car V3U so that
renames "V3U" and "r8a779a0" to "Gen4".
Hector Martin [Wed, 24 Nov 2021 07:34:18 +0000 (16:34 +0900)]
soc: apple: Add driver for Apple PMGR power state controls
Implements genpd and reset providers for downstream devices. Each
instance of the driver binds to a single register and represents a
single SoC power domain.
The driver does not currently implement all features (clockgate-only
state, misc flags), but we declare the respective registers for
documentation purposes. These features will be added as they become
useful for downstream devices.
This also creates the apple/soc tree and Kconfig submenu.
Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Hector Martin <marcan@marcan.st>
Currently renesas_soc_init() scans the whole device tree up to three
times, to find a device node describing a product register.
Furthermore, the product register handling for the different variants is
very similar, with the major difference being the location of the
product bitfield inside the product register.
Reduce scanning to a single pass using of_find_matching_node_and_match()
instead. Switch to a common handling of product registers, by storing
the intrinsics of each product register type in the data field of the
corresponding match entry.
Wolfram Sang [Wed, 17 Nov 2021 09:37:10 +0000 (10:37 +0100)]
memory: renesas-rpc-if: avoid use of undocumented bits
Instead of writing fixed values with undocumented bits which happen to
be set on some SoCs, better switch to read-modify-write operations
changing only bits which are documented. This is way more future-proof
as we don't know yet how these bits may be on upcoming SoCs.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20211117093710.14430-1-wsa+renesas@sang-engineering.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Linus Torvalds [Sun, 21 Nov 2021 19:25:19 +0000 (11:25 -0800)]
Merge tag 'x86-urgent-2021-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
- Move the command line preparation and the early command line parsing
earlier so that the command line parameters which affect
early_reserve_memory(), e.g. efi=nosftreserve, are taken into
account. This was broken when the invocation of
early_reserve_memory() was moved recently.
- Use an atomic type for the SGX page accounting, which is read and
written locklessly, to plug various race conditions related to it.
* tag 'x86-urgent-2021-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sgx: Fix free page accounting
x86/boot: Pull up cmdline preparation and early param parsing
Linus Torvalds [Sun, 21 Nov 2021 19:17:50 +0000 (11:17 -0800)]
Merge tag 'perf-urgent-2021-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf fixes from Thomas Gleixner:
- Remove unneded PEBS disabling when taking LBR snapshots to prevent an
unchecked MSR access error.
- Fix IIO event constraints for Snowridge and Skylake server chips.
* tag 'perf-urgent-2021-11-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/perf: Fix snapshot_branch_stack warning in VM
perf/x86/intel/uncore: Fix IIO event constraints for Snowridge
perf/x86/intel/uncore: Fix IIO event constraints for Skylake Server
perf/x86/intel/uncore: Fix filter_tid mask for CHA events on Skylake Server
* tag 'powerpc-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/xive: Change IRQ domain to a tree domain
powerpc/8xx: Fix pinned TLBs with CONFIG_STRICT_KERNEL_RWX
powerpc/signal32: Fix sigset_t copy
powerpc/book3e: Fix TLBCAM preset at boot
powerpc/pseries/ddw: Do not try direct mapping with persistent memory and one window
powerpc/pseries/ddw: simplify enable_ddw()
powerpc/pseries/ddw: Revert "Extend upper limit for huge DMA window for persistent memory"
powerpc/pseries: Fix numa FORM2 parsing fallback code
powerpc/pseries: rename numa_dist_table to form2_distances
powerpc: clean vdso32 and vdso64 directories
powerpc/83xx/mpc8349emitx: Drop unused variable
KVM: PPC: Book3S HV: Use GLOBAL_TOC for kvmppc_h_set_dabr/xdabr()
In case the following power domain sequence happens, iMX8M Mini always hangs:
gpumix:on -> gpu:on -> gpu:off -> gpu:on
This is likely due to another quirk of the GPC block. This situation can be
prevented by always synchronously powering off both the domain and MIX domain.
Make it so. This turns the aforementioned sequence into:
gpumix:on -> gpu:on -> gpu:off -> gpumix:off -> gpumix:on -> gpu:on
Signed-off-by: Marek Vasut <marex@denx.de> Cc: Frieder Schrempf <frieder.schrempf@kontron.de> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: NXP Linux Team <linux-imx@nxp.com> Cc: Peng Fan <peng.fan@nxp.com> Cc: Shawn Guo <shawnguo@kernel.org> Acked-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Linus Torvalds [Sat, 20 Nov 2021 21:17:24 +0000 (13:17 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"15 patches.
Subsystems affected by this patch series: ipc, hexagon, mm (swap,
slab-generic, kmemleak, hugetlb, kasan, damon, and highmem), and proc"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
proc/vmcore: fix clearing user buffer by properly using clear_user()
kmap_local: don't assume kmap PTEs are linear arrays in memory
mm/damon/dbgfs: fix missed use of damon_dbgfs_lock
mm/damon/dbgfs: use '__GFP_NOWARN' for user-specified size buffer allocation
kasan: test: silence intentional read overflow warnings
hugetlb, userfaultfd: fix reservation restore on userfaultfd error
hugetlb: fix hugetlb cgroup refcounting during mremap
mm: kmemleak: slob: respect SLAB_NOLEAKTRACE flag
hexagon: ignore vmlinux.lds
hexagon: clean up timer-regs.h
hexagon: export raw I/O routines for modules
mm: emit the "free" trace report before freeing memory in kmem_cache_free()
shm: extend forced shm destroy to support objects from several IPC nses
ipc: WARN if trying to remove ipc object which is absent
mm/swap.c:put_pages_list(): reinitialise the page list
Linus Torvalds [Sat, 20 Nov 2021 19:05:10 +0000 (11:05 -0800)]
Merge tag 'block-5.16-2021-11-19' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Flip a cap check to avoid a selinux error (Alistair)
- Fix for a regression this merge window where we can miss a queue ref
put (me)
- Un-mark pstore-blk as broken, as the condition that triggered that
change has been rectified (Kees)
- Queue quiesce and sync fixes (Ming)
- FUA insertion fix (Ming)
- blk-cgroup error path put fix (Yu)
* tag 'block-5.16-2021-11-19' of git://git.kernel.dk/linux-block:
blk-mq: don't insert FUA request with data into scheduler queue
blk-cgroup: fix missing put device in error path from blkg_conf_pref()
block: avoid to quiesce queue in elevator_init_mq
Revert "mark pstore-blk as broken"
blk-mq: cancel blk-mq dispatch work in both blk_cleanup_queue and disk_release()
block: fix missing queue put in error path
block: Check ADMIN before NICE for IOPRIO_CLASS_RT
Linus Torvalds [Sat, 20 Nov 2021 18:59:03 +0000 (10:59 -0800)]
Merge tag 'pinctrl-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"There is an ACPI stubs fix which is ACKed by the ACPI maintainer for
merging through my tree.
One item stand out and that is that I delete the <linux/sdb.h> header
that is used by nothing. I deleted this subsystem (through the GPIO
tree) a while back so I feel responsible for tidying up the floor.
Other than that it is the usual mistakes, a bit noisy around build
issue and Kconfig then driver fixes.
Specifics:
- Fix some stubs causing compile issues for ACPI.
- Fix some wakeups on AMD IRQs shared between GPIO and SCI.
- Fix a build warning in the Tegra driver.
- Fix a Kconfig issue in the Qualcomm driver.
- Add a missing include the RALink driver.
- Return a valid type for the Apple pinctrl IRQs.
- Implement some Qualcomm SDM845 dual-edge errata.
- Remove the unused <linux/sdb.h> header. (The subsystem was once
deleted by the pinctrl maintainer...)
- Fix a duplicate initialized in the Tegra driver.
- Fix register offsets for UFS and SDC in the Qualcomm SM8350 driver"
* tag 'pinctrl-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: qcom: sm8350: Correct UFS and SDC offsets
pinctrl: tegra194: remove duplicate initializer again
Remove unused header <linux/sdb.h>
pinctrl: qcom: sdm845: Enable dual edge errata
pinctrl: apple: Always return valid type in apple_gpio_irq_type
pinctrl: ralink: include 'ralink_regs.h' in 'pinctrl-mt7620.c'
pinctrl: qcom: fix unmet dependencies on GPIOLIB for GPIOLIB_IRQCHIP
pinctrl: tegra: Return const pointer from tegra_pinctrl_get_group()
pinctrl: amd: Fix wakeups when IRQ is shared with SCI
ACPI: Add stubs for wakeup handler functions
Linus Torvalds [Sat, 20 Nov 2021 18:55:50 +0000 (10:55 -0800)]
Merge tag 's390-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Heiko Carstens:
- Add missing Kconfig option for ftrace direct multi sample, so it can
be compiled again, and also add s390 support for this sample.
- Update Christian Borntraeger's email address.
- Various fixes for memory layout setup. Besides other this makes it
possible to load shared DCSS segments again.
- Fix copy to user space of swapped kdump oldmem.
- Remove -mstack-guard and -mstack-size compile options when building
vdso binaries. This can happen when CONFIG_VMAP_STACK is disabled and
results in broken vdso code which causes more or less random
exceptions. Also remove the not needed -nostdlib option.
- Fix memory leak on cpu hotplug and return code handling in kexec
code.
- Wire up futex_waitv system call.
- Replace snprintf with sysfs_emit where appropriate.
* tag 's390-5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
ftrace/samples: add s390 support for ftrace direct multi sample
ftrace/samples: add missing Kconfig option for ftrace direct multi sample
MAINTAINERS: update email address of Christian Borntraeger
s390/kexec: fix memory leak of ipl report buffer
s390/kexec: fix return code handling
s390/dump: fix copying to user-space of swapped kdump oldmem
s390: wire up sys_futex_waitv system call
s390/vdso: filter out -mstack-guard and -mstack-size
s390/vdso: remove -nostdlib compiler flag
s390: replace snprintf in show functions with sysfs_emit
s390/boot: simplify and fix kernel memory layout setup
s390/setup: re-arrange memblock setup
s390/setup: avoid using memblock_enforce_memory_limit
s390/setup: avoid reserving memory above identity mapping
Linus Torvalds [Sat, 20 Nov 2021 18:47:16 +0000 (10:47 -0800)]
Merge tag '5.16-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Three small cifs/smb3 fixes: two to address minor coverity issues and
one cleanup"
* tag '5.16-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: introduce cifs_ses_mark_for_reconnect() helper
cifs: protect srv_count with cifs_tcp_ses_lock
cifs: move debug print out of spinlock
David Hildenbrand [Sat, 20 Nov 2021 00:43:58 +0000 (16:43 -0800)]
proc/vmcore: fix clearing user buffer by properly using clear_user()
To clear a user buffer we cannot simply use memset, we have to use
clear_user(). With a virtio-mem device that registers a vmcore_cb and
has some logically unplugged memory inside an added Linux memory block,
I can easily trigger a BUG by copying the vmcore via "cp":
Some x86-64 CPUs have a CPU feature called "Supervisor Mode Access
Prevention (SMAP)", which is used to detect wrong access from the kernel
to user buffers like this: SMAP triggers a permissions violation on
wrong access. In the x86-64 variant of clear_user(), SMAP is properly
handled via clac()+stac().
To fix, properly use clear_user() when we're dealing with a user buffer.
Link: https://lkml.kernel.org/r/20211112092750.6921-1-david@redhat.com Fixes: 997c136f518c ("fs/proc/vmcore.c: add hook to read_from_oldmem() to check for non-ram pages") Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Philipp Rudo <prudo@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ard Biesheuvel [Sat, 20 Nov 2021 00:43:55 +0000 (16:43 -0800)]
kmap_local: don't assume kmap PTEs are linear arrays in memory
The kmap_local conversion broke the ARM architecture, because the new
code assumes that all PTEs used for creating kmaps form a linear array
in memory, and uses array indexing to look up the kmap PTE belonging to
a certain kmap index.
On ARM, this cannot work, not only because the PTE pages may be
non-adjacent in memory, but also because ARM/!LPAE interleaves hardware
entries and extended entries (carrying software-only bits) in a way that
is not compatible with array indexing.
Fortunately, this only seems to affect configurations with more than 8
CPUs, due to the way the per-CPU kmap slots are organized in memory.
Work around this by permitting an architecture to set a Kconfig symbol
that signifies that the kmap PTEs do not form a lineary array in memory,
and so the only way to locate the appropriate one is to walk the page
tables.
Link: https://lore.kernel.org/linux-arm-kernel/20211026131249.3731275-1-ardb@kernel.org/ Link: https://lkml.kernel.org/r/20211116094737.7391-1-ardb@kernel.org Fixes: 2a15ba82fa6c ("ARM: highmem: Switch to generic kmap atomic") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reported-by: Quanyang Wang <quanyang.wang@windriver.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Sat, 20 Nov 2021 00:43:52 +0000 (16:43 -0800)]
mm/damon/dbgfs: fix missed use of damon_dbgfs_lock
DAMON debugfs is supposed to protect dbgfs_ctxs, dbgfs_nr_ctxs, and
dbgfs_dirs using damon_dbgfs_lock. However, some of the code is
accessing the variables without the protection. This fixes it by
protecting all such accesses.
Link: https://lkml.kernel.org/r/20211110145758.16558-3-sj@kernel.org Fixes: 75c1c2b53c78 ("mm/damon/dbgfs: support multiple contexts") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Sat, 20 Nov 2021 00:43:49 +0000 (16:43 -0800)]
mm/damon/dbgfs: use '__GFP_NOWARN' for user-specified size buffer allocation
Patch series "DAMON fixes".
This patch (of 2):
DAMON users can trigger below warning in '__alloc_pages()' by invoking
write() to some DAMON debugfs files with arbitrarily high count
argument, because DAMON debugfs interface allocates some buffers based
on the user-specified 'count'.
As done in commit d73dad4eb5ad ("kasan: test: bypass __alloc_size
checks") for __write_overflow warnings, also silence some more cases
that trip the __read_overflow warnings seen in 5.16-rc1[1]:
In file included from include/linux/string.h:253,
from include/linux/bitmap.h:10,
from include/linux/cpumask.h:12,
from include/linux/mm_types_task.h:14,
from include/linux/mm_types.h:5,
from include/linux/page-flags.h:13,
from arch/arm64/include/asm/mte.h:14,
from arch/arm64/include/asm/pgtable.h:12,
from include/linux/pgtable.h:6,
from include/linux/kasan.h:29,
from lib/test_kasan.c:10:
In function 'memcmp',
inlined from 'kasan_memcmp' at lib/test_kasan.c:897:2:
include/linux/fortify-string.h:263:25: error: call to '__read_overflow' declared with attribute error: detected read beyond size of object (1st parameter)
263 | __read_overflow();
| ^~~~~~~~~~~~~~~~~
In function 'memchr',
inlined from 'kasan_memchr' at lib/test_kasan.c:872:2:
include/linux/fortify-string.h:277:17: error: call to '__read_overflow' declared with attribute error: detected read beyond size of object (1st parameter)
277 | __read_overflow();
| ^~~~~~~~~~~~~~~~~
Mina Almasry [Sat, 20 Nov 2021 00:43:43 +0000 (16:43 -0800)]
hugetlb, userfaultfd: fix reservation restore on userfaultfd error
Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we
bail out using "goto out_release_unlock;" in the cases where idx >=
size, or !huge_pte_none(), the code will detect that new_pagecache_page
== false, and so call restore_reserve_on_error(). In this case I see
restore_reserve_on_error() delete the reservation, and the following
call to remove_inode_hugepages() will increment h->resv_hugepages
causing a 100% reproducible leak.
We should treat the is_continue case similar to adding a page into the
pagecache and set new_pagecache_page to true, to indicate that there is
no reservation to restore on the error path, and we need not call
restore_reserve_on_error(). Rename new_pagecache_page to
page_in_pagecache to make that clear.
Link: https://lkml.kernel.org/r/20211117193825.378528-1-almasrymina@google.com Fixes: c7b1850dfb41 ("hugetlb: don't pass page cache pages to restore_reserve_on_error") Signed-off-by: Mina Almasry <almasrymina@google.com> Reported-by: James Houghton <jthoughton@google.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Wei Xu <weixugc@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hugetlb: fix hugetlb cgroup refcounting during mremap
When hugetlb_vm_op_open() is called during copy_vma(), we may take the
reference to resv_map->css. Later, when clearing the reservation
pointer of old_vma after transferring it to new_vma, we forget to drop
the reference to resv_map->css. This leads to a reference leak of css.
Fixes this by adding a check to drop reservation css reference in
clear_vma_resv_huge_pages()
Link: https://lkml.kernel.org/r/20211113154412.91134-1-minhquangbui99@gmail.com Fixes: 550a7d60bd5e35 ("mm, hugepages: add mremap() support for hugepage backed vma") Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rustam Kovhaev [Sat, 20 Nov 2021 00:43:37 +0000 (16:43 -0800)]
mm: kmemleak: slob: respect SLAB_NOLEAKTRACE flag
When kmemleak is enabled for SLOB, system does not boot and does not
print anything to the console. At the very early stage in the boot
process we hit infinite recursion from kmemleak_init() and eventually
kernel crashes.
kmemleak_init() specifies SLAB_NOLEAKTRACE for KMEM_CACHE(), but
kmem_cache_create_usercopy() removes it because CACHE_CREATE_MASK is not
valid for SLOB.
Let's fix CACHE_CREATE_MASK and make kmemleak work with SLOB
Link: https://lkml.kernel.org/r/20211115020850.3154366-1-rkovhaev@gmail.com Fixes: d8843922fba4 ("slab: Ignore internal flags in cache creation") Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Glauber Costa <glommer@parallels.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The values in this header are only used in one file each, if they are
used at all. Remove the header and sink all of the constants into their
respective files.
TCX0_CLK_RATE is only used in arch/hexagon/include/asm/timex.h
TIMER_ENABLE, RTOS_TIMER_INT, RTOS_TIMER_REGS_ADDR are only used in
arch/hexagon/kernel/time.c.
SLEEP_CLK_RATE and TIMER_CLR_ON_MATCH have both been unused since the
file's introduction in commit 71e4a47f32f4 ("Hexagon: Add time and timer
functions").
TIMER_ENABLE is redefined as BIT(0) so the shift is moved into the
definition, rather than its use.
Link: https://lkml.kernel.org/r/20211115174250.1994179-3-nathan@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Brian Cain <bcain@codeaurora.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yunfeng Ye [Sat, 20 Nov 2021 00:43:25 +0000 (16:43 -0800)]
mm: emit the "free" trace report before freeing memory in kmem_cache_free()
After the memory is freed, it can be immediately allocated by other
CPUs, before the "free" trace report has been emitted. This causes
inaccurate traces.
For example, if the following sequence of events occurs:
Alexander Mikhalitsyn [Sat, 20 Nov 2021 00:43:21 +0000 (16:43 -0800)]
shm: extend forced shm destroy to support objects from several IPC nses
Currently, the exit_shm() function not designed to work properly when
task->sysvshm.shm_clist holds shm objects from different IPC namespaces.
This is a real pain when sysctl kernel.shm_rmid_forced = 1, because it
leads to use-after-free (reproducer exists).
This is an attempt to fix the problem by extending exit_shm mechanism to
handle shm's destroy from several IPC ns'es.
To achieve that we do several things:
1. add a namespace (non-refcounted) pointer to the struct shmid_kernel
2. during new shm object creation (newseg()/shmget syscall) we
initialize this pointer by current task IPC ns
3. exit_shm() fully reworked such that it traverses over all shp's in
task->sysvshm.shm_clist and gets IPC namespace not from current task
as it was before but from shp's object itself, then call
shm_destroy(shp, ns).
Note: We need to be really careful here, because as it was said before
(1), our pointer to IPC ns non-refcnt'ed. To be on the safe side we
using special helper get_ipc_ns_not_zero() which allows to get IPC ns
refcounter only if IPC ns not in the "state of destruction".
Q/A
Q: Why can we access shp->ns memory using non-refcounted pointer?
A: Because shp object lifetime is always shorther than IPC namespace
lifetime, so, if we get shp object from the task->sysvshm.shm_clist
while holding task_lock(task) nobody can steal our namespace.
Q: Does this patch change semantics of unshare/setns/clone syscalls?
A: No. It's just fixes non-covered case when process may leave IPC
namespace without getting task->sysvshm.shm_clist list cleaned up.
Link: https://lkml.kernel.org/r/67bb03e5-f79c-1815-e2bf-949c67047418@colorfullife.com Link: https://lkml.kernel.org/r/20211109151501.4921-1-manfred@colorfullife.com Fixes: ab602f79915 ("shm: make exit_shm work proportional to task activity") Co-developed-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Andrei Vagin <avagin@gmail.com> Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Cc: Vasily Averin <vvs@virtuozzo.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Mikhalitsyn [Sat, 20 Nov 2021 00:43:18 +0000 (16:43 -0800)]
ipc: WARN if trying to remove ipc object which is absent
Patch series "shm: shm_rmid_forced feature fixes".
Some time ago I met kernel crash after CRIU restore procedure,
fortunately, it was CRIU restore, so, I had dump files and could do
restore many times and crash reproduced easily. After some
investigation I've constructed the minimal reproducer. It was found
that it's use-after-free and it happens only if sysctl
kernel.shm_rmid_forced = 1.
The key of the problem is that the exit_shm() function not handles shp's
object destroy when task->sysvshm.shm_clist contains items from
different IPC namespaces. In most cases this list will contain only
items from one IPC namespace.
How can this list contain object from different namespaces? The
exit_shm() function is designed to clean up this list always when
process leaves IPC namespace. But we made a mistake a long time ago and
did not add a exit_shm() call into the setns() syscall procedures.
The first idea was just to add this call to setns() syscall but it
obviously changes semantics of setns() syscall and that's
userspace-visible change. So, I gave up on this idea.
The first real attempt to address the issue was just to omit forced
destroy if we meet shp object not from current task IPC namespace [1].
But that was not the best idea because task->sysvshm.shm_clist was
protected by rwsem which belongs to current task IPC namespace. It
means that list corruption may occur.
Second approach is just extend exit_shm() to properly handle shp's from
different IPC namespaces [2]. This is really non-trivial thing, I've
put a lot of effort into that but not believed that it's possible to
make it fully safe, clean and clear.
Thanks to the efforts of Manfred Spraul working an elegant solution was
designed. Thanks a lot, Manfred!
Eric also suggested the way to address the issue in ("[RFC][PATCH] shm:
In shm_exit destroy all created and never attached segments") Eric's
idea was to maintain a list of shm_clists one per IPC namespace, use
lock-less lists. But there is some extra memory consumption-related
concerns.
An alternative solution which was suggested by me was implemented in
("shm: reset shm_clist on setns but omit forced shm destroy"). The idea
is pretty simple, we add exit_shm() syscall to setns() but DO NOT
destroy shm segments even if sysctl kernel.shm_rmid_forced = 1, we just
clean up the task->sysvshm.shm_clist list.
This chages semantics of setns() syscall a little bit but in comparision
to the "naive" solution when we just add exit_shm() without any special
exclusions this looks like a safer option.
Matthew Wilcox [Sat, 20 Nov 2021 00:43:15 +0000 (16:43 -0800)]
mm/swap.c:put_pages_list(): reinitialise the page list
While free_unref_page_list() puts pages onto the CPU local LRU list, it
does not remove them from the list they were passed in on. That makes
the list_head appear to be non-empty, and would lead to various
corruption problems if we didn't have an assertion that the list was
empty.
Reinitialise the list after calling free_unref_page_list() to avoid this
problem.
Link: https://lkml.kernel.org/r/YYp40A2lNrxaZji8@casper.infradead.org Fixes: 988c69f1bc23 ("mm: optimise put_pages_list()") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Steve French <stfrench@microsoft.com> Reported-by: Namjae Jeon <linkinjeon@kernel.org> Tested-by: Steve French <stfrench@microsoft.com> Tested-by: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Hyeoncheol Lee <hyc.lee@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 19 Nov 2021 22:15:14 +0000 (14:15 -0800)]
Merge tag 'libata-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull libata fixes from Damien Le Moal:
- Prevent accesses to unsupported log pages as that causes device scan
failures with LLDDs using libsas (from me).
- A couple of fixes for AMD AHCI adapters handling of low power modes
and resume (from Mario).
- Fix a compilation warning (from me).
* tag 'libata-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: libata-sata: Declare ata_ncq_sdev_attrs static
ata: libahci: Adjust behavior when StorageD3Enable _DSD is set
ata: ahci: Add Green Sardine vendor ID as board_ahci_mobile
ata: libata: add missing ata_identify_page_supported() calls
ata: libata: improve ata_read_log_page() error message
Linus Torvalds [Fri, 19 Nov 2021 21:50:48 +0000 (13:50 -0800)]
Merge tag 'trace-v5.16-6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix double free in destroy_hist_field
- Harden memset() of trace_iterator structure
- Do not warn in trace printk check when test buffer fills up
* tag 'trace-v5.16-6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Don't use out-of-sync va_list in event printing
tracing: Use memset_startat() to zero struct trace_iterator
tracing/histogram: Fix UAF in destroy_hist_field()
Linus Torvalds [Fri, 19 Nov 2021 19:40:14 +0000 (11:40 -0800)]
Merge tag 'riscv-for-linus-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
"I have two patches for 5.16:
- allow external modules to be built against read-only source trees
- turn KVM on in the defconfigs
The second one isn't technically a fix, but it got tied up pending
some defconfig cleanups that ended up finding some larger issues. I
figured it'd be better to get the config changes some more testing,
but didn't want to hold up turning KVM on for that"
* tag 'riscv-for-linus-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: fix building external modules
RISC-V: Enable KVM in RV64 and RV32 defconfigs as a module
Linus Torvalds [Fri, 19 Nov 2021 19:33:31 +0000 (11:33 -0800)]
Merge branch 'SA_IMMUTABLE-fixes-for-v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull exit-vs-signal handling fixes from Eric Biederman:
"This is a small set of changes where debuggers were no longer able to
intercept synchronous SIGTRAP and SIGSEGV, introduced by the exit
cleanups.
This is essentially the change you suggested with all of i's dotted
and the t's crossed so that ptrace can intercept all of the cases it
has been able to intercept the past, and all of the cases that made it
to exit without giving ptrace a chance still don't give ptrace a
chance"
* 'SA_IMMUTABLE-fixes-for-v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Replace force_fatal_sig with force_exit_sig when in doubt
signal: Don't always set SA_IMMUTABLE for forced signals
Linus Torvalds [Fri, 19 Nov 2021 19:19:58 +0000 (11:19 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Six fixes, five in drivers (ufs, qla2xxx, iscsi) and one core change
to fix a regression in user space device state setting, which is used
by the iscsi daemons to effect device recovery"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: Fix mailbox direction flags in qla2xxx_get_adapter_id()
scsi: ufs: core: Fix another task management completion race
scsi: ufs: core: Fix task management completion timeout race
scsi: core: sysfs: Fix hang when device state is set via sysfs
scsi: iscsi: Unblock session then wake up error handler
scsi: ufs: core: Improve SCSI abort handling
Linus Torvalds [Fri, 19 Nov 2021 19:07:13 +0000 (11:07 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"There are a few big regression items from the merge window suggesting
that people are testing rc1's but not testing the for-next branches:
- Warnings fixes
- Crash in hf1 when creating QPs and setting counters
- Some old mlx4 cards fail to probe due to missing counters
- Syzkaller crash in the new counters code"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
MAINTAINERS: Update for VMware PVRDMA driver
RDMA/nldev: Check stat attribute before accessing it
RDMA/mlx4: Do not fail the registration on port stats
IB/hfi1: Properly allocate rdma counter desc memory
RDMA/core: Set send and receive CQ before forwarding to the driver
RDMA/netlink: Add __maybe_unused to static inline in C file