]> www.infradead.org Git - nvme.git/log
nvme.git
9 months agodt-bindings: timer: sprd-timer: convert to YAML
Stanislav Jakubek [Wed, 3 Jul 2024 12:02:46 +0000 (14:02 +0200)]
dt-bindings: timer: sprd-timer: convert to YAML

Convert the Spreadtrum SC9860 timer bindings to DT schema.

Changes during conversion:
  - rename file to match compatible
  - add sprd,sc9860-suspend-timer which was previously undocumented
  - minor grammar fix in description

Signed-off-by: Stanislav Jakubek <stano.jakubek@gmail.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://lore.kernel.org/r/ZoU95lBgoyF/8Md3@standask-GA-A55M-S2HP
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: incomplete-devices: document devices without bindings
Krzysztof Kozlowski [Fri, 12 Jul 2024 12:11:46 +0000 (14:11 +0200)]
dt-bindings: incomplete-devices: document devices without bindings

There are devices in the wild with non-updatable firmware coming with
ACPI tables with rejected compatibles, e.g. "ltr,ltrf216a".  Linux
kernel still supports this device via ACPI PRP0001, however the
compatible was never accepted to bindings.

There are also several early PowerPC or SPARC platforms using
compatibles for their OpenFirmware, but without in-tree DTS.  Often the
legacy compatible is not correct in terms of current Devicetree
specification, e.g. missing vendor prefix.

Finally there are also Linux-specific tools and test code with
compatibles.

Add a schema covering above cases purely to satisfy the DT schema and
scripts/checkpatch.pl checks for undocumented compatibles.  For
ltr,ltrf216a this also documents the consensus: compatible is allowed
only via ACPI PRP0001, but not bindings.

Link: https://lore.kernel.org/all/20240705095047.90558-1-marex@denx.de/
Link: https://lore.kernel.org/lkml/20220731173446.7400bfa8@jic23-huawei/T/#me55be502302d70424a85368c2645c89f860b7b40
Cc: Marek Vasut <marex@denx.de>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240712121146.90942-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: trivial-devices: document the Sierra Wireless mangOH Green SPI IoT interface
Neil Armstrong [Wed, 10 Jul 2024 17:02:52 +0000 (19:02 +0200)]
dt-bindings: trivial-devices: document the Sierra Wireless mangOH Green SPI IoT interface

Document the Sierra Wireless mangOH Green SPI IoT interface as a trivial
device.

This fixes the following check:
qcom-mdm9615-wp8548-mangoh-green.dtb: /soc/gsbi@16200000/spi@16280000/spi@0: failed to match any schema with compatible: ['swir,mangoh-iotport-spi']

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240710-topic-mdm9615-mangoh-iotport-spi-bindings-v1-1-3efe20cfea8a@linaro.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agoscripts/dtc: Update to upstream version v1.7.0-93-g1df7b047fe43
Rob Herring (Arm) [Tue, 9 Jul 2024 15:13:03 +0000 (09:13 -0600)]
scripts/dtc: Update to upstream version v1.7.0-93-g1df7b047fe43

This adds the following commits from upstream:

1df7b047fe43 pylibfdt/Makefile.pylibfdt: use project's flags to compile the extension
61e88fdcec52 libfdt: overlay: Fix phandle overwrite check for new subtrees
49d30894466e meson: fix installation with meson-python
d54aaf93673c pylibfdt: clean up python build directory
ab86f1e9fda8 pylibfdt: add VERSION.txt to Python sdist
7b8a30eceabe pylibfdt: fix Python version
ff4f17eb5865 pylibfdt/Makefile.pylibfdt: fix Python library being rebuild during install
9e313b14e684 pylibfdt/meson.build: fix Python library being rebuilt during install
d598fc3648ec tests/run_tests.sh: fix Meson library path being dropped
b98239da2f18 tests/meson.build: fix python and yaml tests not running
c17d76ab5e84 checks: Check the overall length of "interrupt-map"
ae26223a056e libfdt: overlay: Refactor overlay_fixup_phandle
4dd831affd01 libfdt: tests: Update test case for overlay_bad_fixup
e6d294200837 tests: Remove two_roots and named_root from LIBTREE_TESTS_L and add all dtb filenames generated by dumptrees to TESTS_TREES_L in Makefile.tests
855c934e26ae tests: fix tests broken under Meson
4fd3f4f0a95d github: enforce testing pylibfdt and yaml support
9ca7d62dbf0b meson: split run-tests by type
bb51223083a4 meson: fix dependencies of tests
e81900635c95 meson: fix pylibfdt missing dependency on libfdt
822123856980 pylibfdt: fix get_mem_rsv for newer Python versions
1fad065080e6 libfdt: overlay: ensure that existing phandles are not overwritten
b0aacd0a7735 github: add windows/msys CI build
ae97d9745862 github: Don't accidentally suppress test errors
057a7dbbb777 github: Display meson test logs on failure
92b5d4e91678 pylibfdt: Remove some apparently deprecated options from setup.py
417e3299dbd1 github: Update to newer checkout action
5e6cefa17e2d fix MinGW format attribute
24f60011fd43 libfdt: Simplify adjustment of values for local fixups
da39ee0e68b6 libfdt: rework shared/static libraries
a669223f7a60 Makefile: do not hardcode the `install` program path
3fbfdd08afd2 libfdt: fix duplicate meson target
dcef5f834ea3 tests: use correct pkg-config when cross compiling
0b8026ff254f meson: allow building from shallow clones
95c74d71f090 treesource: Restore string list output when no type markers
2283dd78eff5 libfdt: fdt_path_offset_namelen: Reject empty path
79b9e326a162 libfdt: fdt_get_alias_namelen: Validate aliases
52157f13ef3d pylibfdt: Support boolean properties
d77433727566 dtc: fix missing string in usage_opts_help
ad8bf9f9aa39 libfdt: Fix fdt_appendprop_addrrange documentation
6c5e189fb952 github: add workflow for Meson builds
a3dc9f006a78 libfdt: rename libfdt-X.Y.Z.so to libfdt.so.X.Y.Z
35019949c4c7 workflows: build: remove setuptools_scm hack
cd3e2304f4a9 pylibfdt: use fallback version in tarballs
0f5864567745 move release version into VERSION.txt
38165954c13b libfdt: add missing version symbols
5e98b5979354 editorconfig: use tab indentation for version.lds
d030a893be25 tests: generate dtbs in Meson build directory
8d8372b13706 tests: fix use of deprecated meson methods
761114effaf7 pylibtfdt: fix use of deprecated meson method
bf6377a98d97 meson: set minimum Meson version to 0.56.0
4c68e4b16b22 libfdt: fix library version to match project version
bdc5c8793a13 meson: allow disabling tests
f088e381f29e Makefile: allow to install libfdt without building executables
6df5328a902c Fix use of <ctype.h> functions
ccf1f62d59ad libfdt: Fix a typo in libfdt.h
71a8b8ef0adf libfdt: meson: Fix linking on macOS linker
589d8c7653c7 dtc: Add an option to generate __local_fixups__ and __fixups__
e8364666d5ac CI: Add build matrix with multiple Linux distributions
3b02a94b486f dtc: Correct invalid dts output with mixed phandles and integers
d4888958d64b tests: Add additional tests for device graph checks
ea3b9a1d2c5a checks: Fix crash in graph_child_address if 'reg' cell size != 1
b2b9671583e9 livetree: fix off-by-one in propval_cell_n() bounds check
ab481e483061 Add definition for a GitHub Actions CI job
c88038c9b8ca Drop obsolete/broken CI definitions
0ac8b30ba5a1 yaml: Depend on libyaml >= 0.2.3
f1657b2fb5be tests: Add test cases for bad endpoint node and remote-endpoint prop checks
44bb89cafd3d checks: Fix segmentation fault in check_graph_node
60bcf1cde1a8 improve documentation for fdt_path_offset()
a6f997bc77d4 add fdt_get_symbol() and fdt_get_symbol_namelen() functions
18f5ec12a10e use fdt_path_getprop_namelen() in fdt_get_alias_namelen()
df093279282c add fdt_path_getprop_namelen() helper
129bb4b78bc6 doc: dt-object-internal: Fix a typo
390f481521c3 fdtoverlay: Drop a a repeated article
9f8b382ed45e manual: Fix and improve documentation about -@
2cdf93a6d402 fdtoverlay: Fix usage string to not mention "<type>"
72fc810c3025 build-sys: add -Wwrite-strings
083ab26da83b tests: fix leaks spotted by ASAN
6f8b28f49609 livetree: fix leak spotted by ASAN
fd68bb8c5658 Make name_node() xstrdup its name argument
4718189c4ca8 Delay xstrdup() of node and property names coming from a flat tree
0b842c3c8199 Make build_property() xstrdup its name argument
9cceabea1ee0 checks: correct I2C 10-bit address check
0d56145938fe yamltree.c: fix -Werror=discarded-qualifiers & -Werror=cast-qual
61fa22b05f69 checks: make check.data const
7a1d72a788e0 checks.c: fix check_msg() leak
ee5799938697 checks.c: fix heap-buffer-overflow
44c9b73801c1 tests: fix -Wwrite-strings
5b60f5104fcc srcpos.c: fix -Wwrite-strings
32174a66efa4 meson: Fix cell overflow tests when running from meson
64a907f08b9b meson.build: bump version to 1.7.0
e3cde0613bfd Add -Wsuggest-attribute=format warning, correct warnings thus generated
41821821101a Use #ifdef NO_VALGRIND
71c19f20b3ef Do not redefine _GNU_SOURCE if already set
039a99414e77 Bump version to v1.7.0
9b62ec84bb2d Merge remote-tracking branch 'gitlab/main'
3f29d6d85c24 pylibfdt: add size_hint parameter for get_path
2022bb10879d checks: Update #{size,address}-cells check for 'dma-ranges'

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: soc: fsl: Add fsl,ls1028a-reset for reset syscon node
Frank Li [Wed, 3 Jul 2024 18:08:11 +0000 (14:08 -0400)]
dt-bindings: soc: fsl: Add fsl,ls1028a-reset for reset syscon node

ls1028a has a reset module that includes reboot, reset control word, and
service processor control.

Add platform specific compatible string to fix the below warning.

syscon@1e60000: compatible: 'anyOf' conditional failed, one must be fixed:
        ['syscon'] is too short
        'syscon' is not one of ['al,alpine-sysfabric-service', ...]

Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20240703-ls_reset_syscon-v1-1-338f41b3902d@nxp.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: soc: fsl: cpm_qe: convert to yaml format
Frank Li [Wed, 3 Jul 2024 16:49:39 +0000 (12:49 -0400)]
dt-bindings: soc: fsl: cpm_qe: convert to yaml format

Convert binding doc qe.txt to yaml format. Split it to
fsl,qe-firmware.yaml, fsl,qe-ic.yaml, fsl,qe-muram.yaml, fsl,qe-si.yaml
fsl,qe-siram.yaml, fsl,qe.yaml.

Additional Changes:
- Fix error in example.
- Change to low case for hex value.
- Remove fsl,qe-num-riscs and fsl,qe-snums from required list.
- Add #address-cell and #size-cell.
- Add interrupts description for qe-ic.
- Add compatible string fsl,ls1043-qe-si for fsl,qe-si.yaml
- Add compatible string fsl,ls1043-qe-siram for fsl,qe-siram.yaml
- Add child node for fsl,qe.yaml

Fix below warning:
arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dtb: /soc/uqe@2400000/muram@10000: failed to match any schema with compatible: ['fsl,qe-muram', 'fsl,cpm-muram']
arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dtb: /soc/uqe@2400000/muram@10000: failed to match any schema with compatible: ['fsl,qe-muram', 'fsl,cpm-muram']
arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dtb: /soc/uqe@2400000/muram@10000/data-only@0: failed to match any schema with compatible: ['fsl,qe-muram-data', 'fsl,cpm-muram-data']
arch/arm64/boot/dts/freescale/fsl-ls1043a-qds.dtb: /soc/uqe@2400000: failed to match any schema with compatible: ['fsl,qe', 'simple-bus']
arch/arm64/boot/dts/freescale/fsl-ls1043a-rdb.dtb: /soc/uqe@2400000/muram@10000/data-only@0: failed to match any schema with compatible: ['fsl,qe-muram-data', 'fsl,cpm-muram-data']
arch/arm64/boot/dts/freescale/fsl-ls1043a-qds.dtb: /soc/uqe@2400000/qeic@80: failed to match any schema with compatible: ['fsl,qe-ic']

Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20240703-ls_qe_warning-v1-1-7fe4af5b0bb0@nxp.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: i2c: i2c-fsi: Convert to json-schema
Eddie James [Wed, 22 May 2024 19:25:15 +0000 (14:25 -0500)]
dt-bindings: i2c: i2c-fsi: Convert to json-schema

Convert to json-schema for the FSI-attached I2C controller.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Ninad Palsule <ninad@linux.ibm.com>
Link: https://lore.kernel.org/r/20240522192524.3286237-12-eajames@linux.ibm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: fsi: Document the FSI Hub Controller
Eddie James [Wed, 22 May 2024 19:25:14 +0000 (14:25 -0500)]
dt-bindings: fsi: Document the FSI Hub Controller

Document the FSI Hub Controller CFAM engine.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240522192524.3286237-11-eajames@linux.ibm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: fsi: Document the AST2700 FSI controller
Eddie James [Wed, 22 May 2024 19:25:13 +0000 (14:25 -0500)]
dt-bindings: fsi: Document the AST2700 FSI controller

Add the appropriate compatible string, and document the new reg
properties of the AST2700 FSI controller, which can directly access
the FSI controller registers as well as the FSI link address space.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240522192524.3286237-10-eajames@linux.ibm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: fsi: ast2600-fsi-master: Convert to json-schema
Eddie James [Wed, 22 May 2024 19:25:12 +0000 (14:25 -0500)]
dt-bindings: fsi: ast2600-fsi-master: Convert to json-schema

Convert to json-schema for the AST2600 FSI master documentation.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240522192524.3286237-9-eajames@linux.ibm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: fsi: ibm,i2cr-fsi-master: Reference common FSI controller
Eddie James [Wed, 22 May 2024 19:25:11 +0000 (14:25 -0500)]
dt-bindings: fsi: ibm,i2cr-fsi-master: Reference common FSI controller

Remove the common properties from the I2CR documentation and instead
point to the common FSI controller documentation.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240522192524.3286237-8-eajames@linux.ibm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: fsi: Document the FSI controller common properties
Eddie James [Wed, 22 May 2024 19:25:10 +0000 (14:25 -0500)]
dt-bindings: fsi: Document the FSI controller common properties

Since there are multiple FSI controllers documented, the common
properties should be documented separately and then referenced
from the specific controller documentation. Add bus-frequency for
the FSI bus and CFAM local bus frequencies. Add interrupt
controller properties.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240522192524.3286237-7-eajames@linux.ibm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: fsi: Document the IBM SBEFIFO engine
Eddie James [Wed, 22 May 2024 19:25:09 +0000 (14:25 -0500)]
dt-bindings: fsi: Document the IBM SBEFIFO engine

The SBEFIFO engine provides an interface to the POWER processor
Self Boot Engine (SBE).

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20240522192524.3286237-6-eajames@linux.ibm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: fsi: p9-occ: Convert to json-schema
Eddie James [Wed, 22 May 2024 19:25:08 +0000 (14:25 -0500)]
dt-bindings: fsi: p9-occ: Convert to json-schema

Conver to json-schema for the OCC documentation. Also document the fact
that the OCC "bridge" device will often have the hwmon node as a
child.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240522192524.3286237-5-eajames@linux.ibm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: fsi: Document the IBM SCOM engine
Eddie James [Wed, 22 May 2024 19:25:07 +0000 (14:25 -0500)]
dt-bindings: fsi: Document the IBM SCOM engine

The SCOM engine provides an interface to the POWER processor PIB
(Pervasive Interconnect Bus).

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240522192524.3286237-4-eajames@linux.ibm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: fsi: fsi2spi: Document SPI controller child nodes
Eddie James [Wed, 22 May 2024 19:25:06 +0000 (14:25 -0500)]
dt-bindings: fsi: fsi2spi: Document SPI controller child nodes

The FSI2SPI bridge has several SPI controllers behind it, which
should be documented. Also, therefore the node needs to specify
address and size cells.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240522192524.3286237-3-eajames@linux.ibm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: interrupt-controller: convert fsl,ls-scfg-msi to yaml
Frank Li [Thu, 27 Jun 2024 14:42:07 +0000 (10:42 -0400)]
dt-bindings: interrupt-controller: convert fsl,ls-scfg-msi to yaml

Convert device tree binding fsl,ls-scfg-msi to yaml format.

Additional changes:
- Include gic.h and use predefined macro in example.
- Remove label in example.
- Change node name to interrupt-controller in example.
- Fix error in example.
- ls1046a allow 4 irqs, other platform only 1 irq.
- Add $ref: msi-controller.yaml
- Add #msi-cells.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240627144207.4003708-1-Frank.Li@nxp.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: soc: fsl: Convert q(b)man-* to yaml format
Frank Li [Wed, 26 Jun 2024 19:37:53 +0000 (15:37 -0400)]
dt-bindings: soc: fsl: Convert q(b)man-* to yaml format

Convert qman, bman, qman-portals, bman-portals to yaml format.

Additional Change for fsl,q(b)man-portal:
- Only keep one example.
- Add fsl,qman-channel-id property.
- Use interrupt type macro.
- Remove top level qman-portals@ff4200000 at example.

Additional change for fsl,q(b)man:
- Fixed example error.
- Remove redundent part, only keep fsl,qman node.
- Change memory-regions to memory-region.
- fsl,q(b)man-portals is not required property

Additional change for fsl,qman-fqd.yaml:
- Fixed example error.
- Only keep one example.
- Ref to reserve-memory.yaml
- Merge fsl,bman reserver memory part

Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20240626193753.2088926-1-Frank.Li@nxp.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: misc: fsl,qoriq-mc: convert to yaml format
Frank Li [Mon, 17 Jun 2024 17:09:34 +0000 (13:09 -0400)]
dt-bindings: misc: fsl,qoriq-mc: convert to yaml format

Convert fsl,qoriq-mc from txt to yaml format.

Addition changes:
- Child node name allow 'ethernet'.
- Use 32bit address in example.
- Fixed missed ';' in example.
- Allow dma-coherent.
- Remove smmu, its part in example.
- Change child node name as 'ethernet'

Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20240617170934.813321-1-Frank.Li@nxp.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: drop stale Anson Huang from maintainers
Krzysztof Kozlowski [Mon, 17 Jun 2024 06:58:28 +0000 (08:58 +0200)]
dt-bindings: drop stale Anson Huang from maintainers

Emails to Anson Huang bounce:

  Diagnostic-Code: smtp; 550 5.4.1 Recipient address rejected: Access denied.

Add IMX platform maintainers for bindings which would become orphaned.

Acked-by: Uwe Kleine-König <ukleinek@kernel.org>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Acked-by: Peng Fan <peng.fan@nxp.com>
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com> # for I2C
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # for IIO
Acked-by: Andi Shyti <andi.shyti@kernel.org>
Acked-by: Abel Vesa <abel.vesa@linaro.org>
Link: https://lore.kernel.org/r/20240617065828.9531-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agokbuild: verify dtoverlay files against schema
Dmitry Baryshkov [Mon, 27 May 2024 11:34:24 +0000 (14:34 +0300)]
kbuild: verify dtoverlay files against schema

Currently only the single part device trees are validated against DT
schema. For the multipart DT files only the base DTB is validated.
Extend the fdtoverlay commands to validate the resulting DTB file
against schema.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240527-dtbo-check-schema-v1-1-ee1094f88f74@linaro.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
9 months agodt-bindings: clock: drop obsolete stericsson,abx500.txt
Stanislav Jakubek [Sun, 16 Jun 2024 11:13:29 +0000 (13:13 +0200)]
dt-bindings: clock: drop obsolete stericsson,abx500.txt

These bindings are already (better) described in mfd/stericsson,ab8500.yaml,
drop these now obsolete bindings.

Signed-off-by: Stanislav Jakubek <stano.jakubek@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/Zm7I2Zbq1JNPoEJp@standask-GA-A55M-S2HP
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
10 months agodt-bindings: arm: cpus: Add new Cortex and Neoverse names
Andre Przywara [Tue, 18 Jun 2024 16:04:50 +0000 (17:04 +0100)]
dt-bindings: arm: cpus: Add new Cortex and Neoverse names

Add compatible strings for the Arm Cortex-A725 and Cortex-A925 CPUs, as
well as new Neoverse cores: Arm Neoverse N3, Neoverse V2, Neoverse V3,
and Neoverse V3AE.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240618160450.3168005-1-andre.przywara@arm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
10 months agodt-bindings: interrupt-controller: qcom,pdc: Add sc8180x PDC
Bjorn Andersson [Sat, 25 May 2024 18:05:31 +0000 (11:05 -0700)]
dt-bindings: interrupt-controller: qcom,pdc: Add sc8180x PDC

The SC8180X platform has a PDC block, add a compatible for this.

Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20240525-sc8180x-pdc-binding-compatible-v1-1-17031c85ed69@quicinc.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
10 months agoPCI: of_property: Add interrupt-controller property in PCI device nodes
Herve Codina [Mon, 27 May 2024 16:14:44 +0000 (18:14 +0200)]
PCI: of_property: Add interrupt-controller property in PCI device nodes

PCI devices and bridges DT nodes created during the PCI scan are created
with the interrupt-map property set to handle interrupts.

In order to set this interrupt-map property at a specific level, a
phandle to the parent interrupt controller is needed. On systems that
are not fully described by a device-tree, the parent interrupt
controller may be unavailable (i.e. not described by the device-tree).

As mentioned in the [1], avoiding the use of the interrupt-map property
and considering a PCI device as an interrupt controller itself avoid the
use of a parent interrupt phandle.

In that case, the PCI device itself as an interrupt controller is
responsible for routing the interrupts described in the device-tree
world (DT overlay) to the PCI interrupts.

Add the 'interrupt-controller' property in the PCI device DT node.

[1]: https://lore.kernel.org/lkml/CAL_Jsq+je7+9ATR=B6jXHjEJHjn24vQFs4Tvi9=vhDeK9n42Aw@mail.gmail.com/

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20240527161450.326615-18-herve.codina@bootlin.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
10 months agoof: unittest: Add a test case for of_changeset_add_prop_bool()
Herve Codina [Mon, 27 May 2024 16:14:43 +0000 (18:14 +0200)]
of: unittest: Add a test case for of_changeset_add_prop_bool()

Improve of_unittest_changeset_prop() to have a test case for the
newly introduced of_changeset_add_prop_bool().

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://lore.kernel.org/r/20240527161450.326615-17-herve.codina@bootlin.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
10 months agoof: dynamic: Introduce of_changeset_add_prop_bool()
Herve Codina [Mon, 27 May 2024 16:14:42 +0000 (18:14 +0200)]
of: dynamic: Introduce of_changeset_add_prop_bool()

APIs to add some properties in a changeset exist but nothing to add a DT
boolean property (i.e. a property without any values).

Fill this lack with of_changeset_add_prop_bool().

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://lore.kernel.org/r/20240527161450.326615-16-herve.codina@bootlin.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
10 months agoof: unittest: Add tests for changeset properties adding
Herve Codina [Mon, 27 May 2024 16:14:41 +0000 (18:14 +0200)]
of: unittest: Add tests for changeset properties adding

No test cases are present to test the of_changes_add_prop_*() function
family.

Add a new test to fill this lack.
Functions tested are:
  - of_changes_add_prop_string()
  - of_changes_add_prop_string_array()
  - of_changeset_add_prop_u32()
  - of_changeset_add_prop_u32_array()

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://lore.kernel.org/r/20240527161450.326615-15-herve.codina@bootlin.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
10 months agoof: dynamic: Constify parameter in of_changeset_add_prop_string_array()
Herve Codina [Mon, 27 May 2024 16:14:40 +0000 (18:14 +0200)]
of: dynamic: Constify parameter in of_changeset_add_prop_string_array()

The str_array parameter has no reason to be an un-const array.
Indeed, elements of the 'str_array' array are not changed by the code.

Constify the 'str_array' array parameter.
With this const qualifier added, the following construction is allowed:
  static const char * const tab_str[] = { "string1", "string2" };
  of_changeset_add_prop_string_array(..., tab_str, ARRAY_SIZE(tab_str));

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Link: https://lore.kernel.org/r/20240527161450.326615-14-herve.codina@bootlin.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
10 months agodt-bindings: dma: qcom,gpi: document the SDX75 GPI DMA Engine
Rohit Agarwal [Fri, 17 May 2024 10:04:22 +0000 (15:34 +0530)]
dt-bindings: dma: qcom,gpi: document the SDX75 GPI DMA Engine

Document the GPI DMA Engine on the SDX75 Platform.

Signed-off-by: Rohit Agarwal <quic_rohiagar@quicinc.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240517100423.2006022-2-quic_rohiagar@quicinc.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
10 months agodt-bindings: watchdog: img,pdc-wdt: Convert to dtschema
Shresth Prasad [Mon, 27 May 2024 19:58:12 +0000 (01:28 +0530)]
dt-bindings: watchdog: img,pdc-wdt: Convert to dtschema

Convert txt bindings of ImgTec's PDC watchdog timer to dtschema to allow
for validation.

Signed-off-by: Shresth Prasad <shrestprasad7@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240527195811.7897-2-shresthprasad7@gmail.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
10 months agodt-bindings: timer: renesas,tmu: Make interrupt-names required
Geert Uytterhoeven [Wed, 29 May 2024 12:32:32 +0000 (14:32 +0200)]
dt-bindings: timer: renesas,tmu: Make interrupt-names required

Now all in-tree users have been updated with interrupt-names properties
according to commit 0076a37a426b6c85 ("dt-bindings: timer: renesas,tmu:
Document input capture interrupt"), make interrupt-names required.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/65fdd0425be0cc1bae9e6f7996aceaa5ad34e510.1716985947.git.geert+renesas@glider.be
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
10 months agodt-bindings: interrupt-controller: fsl,irqsteer: Add imx8mp/imx8qxp support
Alexander Stein [Tue, 28 May 2024 07:11:40 +0000 (09:11 +0200)]
dt-bindings: interrupt-controller: fsl,irqsteer: Add imx8mp/imx8qxp support

Some SoC like i.MX8MP or i.MX8QXP use a power-domain for this IP. Add
SoC-specific compatibles, which also requires a power-domain.

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240528071141.92003-2-alexander.stein@ew.tq-group.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
11 months agomedia: dt-bindings: renesas,rzg2l-cru: Document Renesas RZ/G2UL CRU block
Biju Das [Wed, 5 Jun 2024 15:41:15 +0000 (16:41 +0100)]
media: dt-bindings: renesas,rzg2l-cru: Document Renesas RZ/G2UL CRU block

Document the CRU IP found in Renesas RZ/G2UL SoC.

The CRU block on the RZ/G2UL SoC is identical to one found on the
RZ/G2L SoC, but it does not support parallel input.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20240605154115.263447-3-biju.das.jz@bp.renesas.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
11 months agomedia: dt-bindings: renesas,rzg2l-csi2: Document Renesas RZ/G2UL CSI-2 block
Biju Das [Wed, 5 Jun 2024 15:41:14 +0000 (16:41 +0100)]
media: dt-bindings: renesas,rzg2l-csi2: Document Renesas RZ/G2UL CSI-2 block

Document the CSI-2 block which is part of CRU found in Renesas RZ/G2UL
SoC.

The CSI-2 block on the RZ/G2UL SoC is identical to one found on the
RZ/G2L SoC.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20240605154115.263447-2-biju.das.jz@bp.renesas.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
11 months agodt-bindings: display: panel: constrain 'reg' in DSI panels (part two)
Krzysztof Kozlowski [Wed, 5 Jun 2024 10:56:59 +0000 (12:56 +0200)]
dt-bindings: display: panel: constrain 'reg' in DSI panels (part two)

DSI-attached devices could respond to more than one virtual channel
number, thus their bindings are supposed to constrain the 'reg' property
to match hardware.  Add missing 'reg' constrain for DSI-attached display
panels, based on DTS sources in Linux kernel (assume all devices take
only one channel number).

Few bindings missed previous fixup: LG SW43408 and Raydium RM69380.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240605105659.27433-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
11 months agodt-bindings: ufs: qcom,ufs: drop source clock entries
Dmitry Baryshkov [Tue, 28 May 2024 13:36:48 +0000 (16:36 +0300)]
dt-bindings: ufs: qcom,ufs: drop source clock entries

There is no need to mention and/or to touch in any way the intermediate
(source) clocks. Drop them from MSM8996 UFSHCD schema, making it follow
the example lead by all other platforms.

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240528-msm8996-fix-ufs-v5-1-b475c341126e@linaro.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
11 months agoof/fdt: avoid re-parsing '#{address,size}-cells' in of_fdt_limit_memory
Rob Herring [Thu, 30 Aug 2018 16:20:20 +0000 (11:20 -0500)]
of/fdt: avoid re-parsing '#{address,size}-cells' in of_fdt_limit_memory

Now that we initialize dt_root_addr_cells and dt_root_size_cells earlier,
use them and simplify of_fdt_limit_memory.

Link: https://lore.kernel.org/all/20180830190523.31474-3-robh@kernel.org/
Signed-off-by: Rob Herring <robh@kernel.org>
11 months agoof/fdt: Scan the root node properties earlier
Rob Herring [Wed, 29 Aug 2018 22:20:46 +0000 (17:20 -0500)]
of/fdt: Scan the root node properties earlier

Scan the root node properties (#{size,address}-cells) earlier, so that
the dt_root_addr_cells and dt_root_size_cells variables are initialized
and can be used.

Link: https://lore.kernel.org/all/20180830190523.31474-2-robh@kernel.org/
Signed-off-by: Rob Herring <robh@kernel.org>
11 months agoLinux 6.10-rc1
Linus Torvalds [Sun, 26 May 2024 22:20:12 +0000 (15:20 -0700)]
Linux 6.10-rc1

11 months agomm: percpu: Include smp.h in alloc_tag.h
Kent Overstreet [Fri, 24 May 2024 15:42:09 +0000 (11:42 -0400)]
mm: percpu: Include smp.h in alloc_tag.h

percpu.h depends on smp.h, but doesn't include it directly because of
circular header dependency issues; percpu.h is needed in a bunch of low
level headers.

This fixes a randconfig build error on mips:

  include/linux/alloc_tag.h: In function '__alloc_tag_ref_set':
  include/asm-generic/percpu.h:31:40: error: implicit declaration of function 'raw_smp_processor_id' [-Werror=implicit-function-declaration]

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 24e44cc22aa3 ("mm: percpu: enable per-cpu allocation tagging")
Closes: https://lore.kernel.org/oe-kbuild-all/202405210052.DIrMXJNz-lkp@intel.com/
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 months agoMerge tag 'perf-tools-fixes-for-v6.10-1-2024-05-26' of git://git.kernel.org/pub/scm...
Linus Torvalds [Sun, 26 May 2024 16:54:26 +0000 (09:54 -0700)]
Merge tag 'perf-tools-fixes-for-v6.10-1-2024-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tool fix from Arnaldo Carvalho de Melo:
 "Revert a patch causing a regression.

  This made a simple 'perf record -e cycles:pp make -j199' stop working
  on the Ampere ARM64 system Linus uses to test ARM64 kernels".

* tag 'perf-tools-fixes-for-v6.10-1-2024-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  Revert "perf parse-events: Prefer sysfs/JSON hardware events over legacy"

11 months agoRevert "perf parse-events: Prefer sysfs/JSON hardware events over legacy"
Arnaldo Carvalho de Melo [Sun, 26 May 2024 11:13:21 +0000 (08:13 -0300)]
Revert "perf parse-events: Prefer sysfs/JSON hardware events over legacy"

This reverts commit 617824a7f0f73e4de325cf8add58e55b28c12493.

This made a simple 'perf record -e cycles:pp make -j199' stop working on
the Ampere ARM64 system Linus uses to test ARM64 kernels, as discussed
at length in the threads in the Link tags below.

The fix provided by Ian wasn't acceptable and work to fix this will take
time we don't have at this point, so lets revert this and work on it on
the next devel cycle.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Cc: Ethan Adams <j.ethan.adams@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tycho Andersen <tycho@tycho.pizza>
Cc: Yang Jihong <yangjihong@bytedance.com>
Link: https://lore.kernel.org/lkml/CAHk-=wi5Ri=yR2jBVk-4HzTzpoAWOgstr1LEvg_-OXtJvXXJOA@mail.gmail.com
Link: https://lore.kernel.org/lkml/CAHk-=wiWvtFyedDNpoV7a8Fq_FpbB+F5KmWK2xPY3QoYseOf_A@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
11 months agoMerge tag '6.10-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 26 May 2024 05:33:10 +0000 (22:33 -0700)]
Merge tag '6.10-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fixes from Steve French:

 - two important netfs integration fixes - including for a data
   corruption and also fixes for multiple xfstests

 - reenable swap support over SMB3

* tag '6.10-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Fix missing set of remote_i_size
  cifs: Fix smb3_insert_range() to move the zero_point
  cifs: update internal version number
  smb3: reenable swapfiles over SMB3 mounts

11 months agoMerge tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Sat, 25 May 2024 22:10:33 +0000 (15:10 -0700)]
Merge tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "16 hotfixes, 11 of which are cc:stable.

  A few nilfs2 fixes, the remainder are for MM: a couple of selftests
  fixes, various singletons fixing various issues in various parts"

* tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/ksm: fix possible UAF of stable_node
  mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
  mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again
  nilfs2: fix potential hang in nilfs_detach_log_writer()
  nilfs2: fix unexpected freezing of nilfs_segctor_sync()
  nilfs2: fix use-after-free of timer for log writer thread
  selftests/mm: fix build warnings on ppc64
  arm64: patching: fix handling of execmem addresses
  selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation
  selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
  selftests/mm: compaction_test: fix bogus test success on Aarch64
  mailmap: update email address for Satya Priya
  mm/huge_memory: don't unpoison huge_zero_folio
  kasan, fortify: properly rename memintrinsics
  lib: add version into /proc/allocinfo output
  mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL

11 months agoMerge tag 'irq-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 25 May 2024 21:48:40 +0000 (14:48 -0700)]
Merge tag 'irq-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:

 - Fix x86 IRQ vector leak caused by a CPU offlining race

 - Fix build failure in the riscv-imsic irqchip driver
   caused by an API-change semantic conflict

 - Fix use-after-free in irq_find_at_or_after()

* tag 'irq-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/irqdesc: Prevent use-after-free in irq_find_at_or_after()
  genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline
  irqchip/riscv-imsic: Fixup riscv_ipi_set_virq_range() conflict

11 months agoMerge tag 'x86-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 25 May 2024 21:40:09 +0000 (14:40 -0700)]
Merge tag 'x86-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:

 - Fix regressions of the new x86 CPU VFM (vendor/family/model)
   enumeration/matching code

 - Fix crash kernel detection on buggy firmware with
   non-compliant ACPI MADT tables

 - Address Kconfig warning

* tag 'x86-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL
  crypto: x86/aes-xts - switch to new Intel CPU model defines
  x86/topology: Handle bogus ACPI tables correctly
  x86/kconfig: Select ARCH_WANT_FRAME_POINTERS again when UNWINDER_FRAME_POINTER=y

11 months agoMerge tag 'for-linus-6.10-1' of https://github.com/cminyard/linux-ipmi
Linus Torvalds [Sat, 25 May 2024 21:32:29 +0000 (14:32 -0700)]
Merge tag 'for-linus-6.10-1' of https://github.com/cminyard/linux-ipmi

Pull ipmi updates from Corey Minyard:
 "Mostly updates for deprecated interfaces, platform.remove and
  converting from a tasklet to a BH workqueue.

  Also use HAS_IOPORT for disabling inb()/outb()"

* tag 'for-linus-6.10-1' of https://github.com/cminyard/linux-ipmi:
  ipmi: kcs_bmc_npcm7xx: Convert to platform remove callback returning void
  ipmi: kcs_bmc_aspeed: Convert to platform remove callback returning void
  ipmi: ipmi_ssif: Convert to platform remove callback returning void
  ipmi: ipmi_si_platform: Convert to platform remove callback returning void
  ipmi: ipmi_powernv: Convert to platform remove callback returning void
  ipmi: bt-bmc: Convert to platform remove callback returning void
  char: ipmi: handle HAS_IOPORT dependencies
  ipmi: Convert from tasklet to BH workqueue

11 months agoMerge tag 'ceph-for-6.10-rc1' of https://github.com/ceph/ceph-client
Linus Torvalds [Sat, 25 May 2024 21:23:58 +0000 (14:23 -0700)]
Merge tag 'ceph-for-6.10-rc1' of https://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "A series from Xiubo that adds support for additional access checks
  based on MDS auth caps which were recently made available to clients.

  This is needed to prevent scenarios where the MDS quietly discards
  updates that a UID-restricted client previously (wrongfully) acked to
  the user.

  Other than that, just a documentation fixup"

* tag 'ceph-for-6.10-rc1' of https://github.com/ceph/ceph-client:
  doc: ceph: update userspace command to get CephFS metadata
  ceph: add CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK feature bit
  ceph: check the cephx mds auth access for async dirop
  ceph: check the cephx mds auth access for open
  ceph: check the cephx mds auth access for setattr
  ceph: add ceph_mds_check_access() helper
  ceph: save cap_auths in MDS client when session is opened

11 months agoMerge tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3
Linus Torvalds [Sat, 25 May 2024 21:19:01 +0000 (14:19 -0700)]
Merge tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3

Pull ntfs3 updates from Konstantin Komarov:
 "Fixes:
   - reusing of the file index (could cause the file to be trimmed)
   - infinite dir enumeration
   - taking DOS names into account during link counting
   - le32_to_cpu conversion, 32 bit overflow, NULL check
   - some code was refactored

  Changes:
   - removed max link count info display during driver init

  Remove:
   - atomic_open has been removed for lack of use"

* tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3:
  fs/ntfs3: Break dir enumeration if directory contents error
  fs/ntfs3: Fix case when index is reused during tree transformation
  fs/ntfs3: Mark volume as dirty if xattr is broken
  fs/ntfs3: Always make file nonresident on fallocate call
  fs/ntfs3: Redesign ntfs_create_inode to return error code instead of inode
  fs/ntfs3: Use variable length array instead of fixed size
  fs/ntfs3: Use 64 bit variable to avoid 32 bit overflow
  fs/ntfs3: Check 'folio' pointer for NULL
  fs/ntfs3: Missed le32_to_cpu conversion
  fs/ntfs3: Remove max link count info display during driver init
  fs/ntfs3: Taking DOS names into account during link counting
  fs/ntfs3: remove atomic_open
  fs/ntfs3: use kcalloc() instead of kzalloc()

11 months agoMerge tag '6.10-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Linus Torvalds [Sat, 25 May 2024 21:15:39 +0000 (14:15 -0700)]
Merge tag '6.10-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull smb server fixes from Steve French:
 "Two ksmbd server fixes, both for stable"

* tag '6.10-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: ignore trailing slashes in share paths
  ksmbd: avoid to send duplicate oplock break notifications

11 months agoMerge tag 'rtc-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Linus Torvalds [Sat, 25 May 2024 20:33:53 +0000 (13:33 -0700)]
Merge tag 'rtc-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
 "There is one new driver and then most of the changes are the device
  tree bindings conversions to yaml.

  New driver:
   - Epson RX8111

  Drivers:
   - Many Device Tree bindings conversions to dtschema
   - pcf8563: wakeup-source support"

* tag 'rtc-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  pcf8563: add wakeup-source support
  rtc: rx8111: handle VLOW flag
  rtc: rx8111: demote warnings to debug level
  rtc: rx6110: Constify struct regmap_config
  dt-bindings: rtc: convert trivial devices into dtschema
  dt-bindings: rtc: stmp3xxx-rtc: convert to dtschema
  dt-bindings: rtc: pxa-rtc: convert to dtschema
  rtc: Add driver for Epson RX8111
  dt-bindings: rtc: Add Epson RX8111
  rtc: mcp795: drop unneeded MODULE_ALIAS
  rtc: nuvoton: Modify part number value
  rtc: test: Split rtc unit test into slow and normal speed test
  dt-bindings: rtc: nxp,lpc1788-rtc: convert to dtschema
  dt-bindings: rtc: digicolor-rtc: move to trivial-rtc
  dt-bindings: rtc: alphascale,asm9260-rtc: convert to dtschema
  dt-bindings: rtc: armada-380-rtc: convert to dtschema
  rtc: cros-ec: provide ID table for avoiding fallback match

11 months agoMerge tag 'i3c/for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Linus Torvalds [Sat, 25 May 2024 20:28:29 +0000 (13:28 -0700)]
Merge tag 'i3c/for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux

Pull i3c updates from Alexandre Belloni:
 "Runtime PM (power management) is improved and hot-join support has
  been added to the dw controller driver.

  Core:
   - Allow device driver to trigger controller runtime PM

  Drivers:
   - dw: hot-join support
   - svc: better IBI handling"

* tag 'i3c/for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
  i3c: dw: Add hot-join support.
  i3c: master: Enable runtime PM for master controller
  i3c: master: svc: fix invalidate IBI type and miss call client IBI handler
  i3c: master: svc: change ENXIO to EAGAIN when IBI occurs during start frame
  i3c: Add comment for -EAGAIN in i3c_device_do_priv_xfers()

11 months agoMerge tag 'jffs2-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 25 May 2024 20:23:42 +0000 (13:23 -0700)]
Merge tag 'jffs2-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs

Pull jffs2 updates from Richard Weinberger:

 - Fix illegal memory access in jffs2_free_inode()

 - Kernel-doc fixes

 - print symbolic error names

* tag 'jffs2-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
  jffs2: Fix potential illegal address access in jffs2_free_inode
  jffs2: Simplify the allocation of slab caches
  jffs2: nodemgmt: fix kernel-doc comments
  jffs2: print symbolic error name instead of error code

11 months agoMerge tag 'uml-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 25 May 2024 20:17:48 +0000 (13:17 -0700)]
Merge tag 'uml-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux

Pull UML updates from Richard Weinberger:

 - Fixes for -Wmissing-prototypes warnings and further cleanup

 - Remove callback returning void from rtc and virtio drivers

 - Fix bash location

* tag 'uml-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (26 commits)
  um: virtio_uml: Convert to platform remove callback returning void
  um: rtc: Convert to platform remove callback returning void
  um: Remove unused do_get_thread_area function
  um: Fix -Wmissing-prototypes warnings for __vdso_*
  um: Add an internal header shared among the user code
  um: Fix the declaration of kasan_map_memory
  um: Fix the -Wmissing-prototypes warning for get_thread_reg
  um: Fix the -Wmissing-prototypes warning for __switch_mm
  um: Fix -Wmissing-prototypes warnings for (rt_)sigreturn
  um: Stop tracking host PID in cpu_tasks
  um: process: remove unused 'n' variable
  um: vector: remove unused len variable/calculation
  um: vector: fix bpfflash parameter evaluation
  um: slirp: remove set but unused variable 'pid'
  um: signal: move pid variable where needed
  um: Makefile: use bash from the environment
  um: Add winch to winch_handlers before registering winch IRQ
  um: Fix -Wmissing-prototypes warnings for __warp_* and foo
  um: Fix -Wmissing-prototypes warnings for text_poke*
  um: Move declarations to proper headers
  ...

11 months agoMerge tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernel
Linus Torvalds [Sat, 25 May 2024 00:28:02 +0000 (17:28 -0700)]
Merge tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Some fixes for the end of the merge window, mostly amdgpu and panthor,
  with one nouveau uAPI change that fixes a bad decision we made a few
  months back.

  nouveau:
   - fix bo metadata uAPI for vm bind

  panthor:
   - Fixes for panthor's heap logical block.
   - Reset on unrecoverable fault
   - Fix VM references.
   - Reset fix.

  xlnx:
   - xlnx compile and doc fixes.

  amdgpu:
   - Handle vbios table integrated info v2.3

  amdkfd:
   - Handle duplicate BOs in reserve_bo_and_cond_vms
   - Handle memory limitations on small APUs

  dp/mst:
   - MST null deref fix.

  bridge:
   - Don't let next bridge create connector in adv7511 to make probe
     work"

* tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernel:
  drm/amdgpu/atomfirmware: add intergrated info v2.3 table
  drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2
  drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs
  drm/amdkfd: handle duplicate BOs in reserve_bo_and_cond_vms
  drm/bridge: adv7511: Attach next bridge without creating connector
  drm/buddy: Fix the warn on's during force merge
  drm/nouveau: use tile_mode and pte_kind for VM_BIND bo allocations
  drm/panthor: Call panthor_sched_post_reset() even if the reset failed
  drm/panthor: Reset the FW VM to NULL on unplug
  drm/panthor: Keep a ref to the VM at the panthor_kernel_bo level
  drm/panthor: Force an immediate reset on unrecoverable faults
  drm/panthor: Document drm_panthor_tiler_heap_destroy::handle validity constraints
  drm/panthor: Fix an off-by-one in the heap context retrieval logic
  drm/panthor: Relax the constraints on the tiler chunk size
  drm/panthor: Make sure the tiler initial/max chunks are consistent
  drm/panthor: Fix tiler OOM handling to allow incremental rendering
  drm: xlnx: zynqmp_dpsub: Fix compilation error
  drm: xlnx: zynqmp_dpsub: Fix few function comments

11 months agocifs: Fix missing set of remote_i_size
David Howells [Fri, 24 May 2024 14:23:36 +0000 (15:23 +0100)]
cifs: Fix missing set of remote_i_size

Occasionally, the generic/001 xfstest will fail indicating corruption in
one of the copy chains when run on cifs against a server that supports
FSCTL_DUPLICATE_EXTENTS_TO_FILE (eg. Samba with a share on btrfs).  The
problem is that the remote_i_size value isn't updated by cifs_setsize()
when called by smb2_duplicate_extents(), but i_size *is*.

This may cause cifs_remap_file_range() to then skip the bit after calling
->duplicate_extents() that sets sizes.

Fix this by calling netfs_resize_file() in smb2_duplicate_extents() before
calling cifs_setsize() to set i_size.

This means we don't then need to call netfs_resize_file() upon return from
->duplicate_extents(), but we also fix the test to compare against the pre-dup
inode size.

[Note that this goes back before the addition of remote_i_size with the
netfs_inode struct.  It should probably have been setting cifsi->server_eof
previously.]

Fixes: cfc63fc8126a ("smb3: fix cached file size problems in duplicate extents (reflink)")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.com>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
Signed-off-by: Steve French <stfrench@microsoft.com>
11 months agocifs: Fix smb3_insert_range() to move the zero_point
David Howells [Wed, 22 May 2024 08:38:48 +0000 (09:38 +0100)]
cifs: Fix smb3_insert_range() to move the zero_point

Fix smb3_insert_range() to move the zero_point over to the new EOF.
Without this, generic/147 fails as reads of data beyond the old EOF point
return zeroes.

Fixes: 3ee1a1fc3981 ("cifs: Cut over to using netfslib")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
Signed-off-by: Steve French <stfrench@microsoft.com>
11 months agoMerge tag 'mm-stable-2024-05-24-11-49' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 24 May 2024 19:47:28 +0000 (12:47 -0700)]
Merge tag 'mm-stable-2024-05-24-11-49' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more mm updates from Andrew Morton:
 "Jeff Xu's implementation of the mseal() syscall"

* tag 'mm-stable-2024-05-24-11-49' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  selftest mm/mseal read-only elf memory segment
  mseal: add documentation
  selftest mm/mseal memory sealing
  mseal: add mseal syscall
  mseal: wire up mseal syscall

11 months agomm/ksm: fix possible UAF of stable_node
Chengming Zhou [Mon, 13 May 2024 03:07:56 +0000 (11:07 +0800)]
mm/ksm: fix possible UAF of stable_node

The commit 2c653d0ee2ae ("ksm: introduce ksm_max_page_sharing per page
deduplication limit") introduced a possible failure case in the
stable_tree_insert(), where we may free the new allocated stable_node_dup
if we fail to prepare the missing chain node.

Then that kfolio return and unlock with a freed stable_node set...  And
any MM activities can come in to access kfolio->mapping, so UAF.

Fix it by moving folio_set_stable_node() to the end after stable_node
is inserted successfully.

Link: https://lkml.kernel.org/r/20240513-b4-ksm-stable-node-uaf-v1-1-f687de76f452@linux.dev
Fixes: 2c653d0ee2ae ("ksm: introduce ksm_max_page_sharing per page deduplication limit")
Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Stefan Roesch <shr@devkernel.io>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agomm/memory-failure: fix handling of dissolved but not taken off from buddy pages
Miaohe Lin [Thu, 23 May 2024 07:12:17 +0000 (15:12 +0800)]
mm/memory-failure: fix handling of dissolved but not taken off from buddy pages

When I did memory failure tests recently, below panic occurs:

page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000
raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000
page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page))
------------[ cut here ]------------
kernel BUG at include/linux/page-flags.h:1009!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:__del_page_from_free_list+0x151/0x180
RSP: 0018:ffffa49c90437998 EFLAGS: 00000046
RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0
RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69
R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80
R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009
FS:  00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 __rmqueue_pcplist+0x23b/0x520
 get_page_from_freelist+0x26b/0xe40
 __alloc_pages_noprof+0x113/0x1120
 __folio_alloc_noprof+0x11/0xb0
 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130
 __alloc_fresh_hugetlb_folio+0xe7/0x140
 alloc_pool_huge_folio+0x68/0x100
 set_max_huge_pages+0x13d/0x340
 hugetlb_sysctl_handler_common+0xe8/0x110
 proc_sys_call_handler+0x194/0x280
 vfs_write+0x387/0x550
 ksys_write+0x64/0xe0
 do_syscall_64+0xc2/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff916114887
RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887
RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003
RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0
R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004
R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00
 </TASK>
Modules linked in: mce_inject hwpoison_inject
---[ end trace 0000000000000000 ]---

And before the panic, there had an warning about bad page state:

BUG: Bad page state in process page-types  pfn:8cee00
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
page_type: 0xffffff7f(buddy)
raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000
raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000
page dumped because: nonzero mapcount
Modules linked in: mce_inject hwpoison_inject
CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22
Call Trace:
 <TASK>
 dump_stack_lvl+0x83/0xa0
 bad_page+0x63/0xf0
 free_unref_page+0x36e/0x5c0
 unpoison_memory+0x50b/0x630
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xcd/0x550
 ksys_write+0x64/0xe0
 do_syscall_64+0xc2/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f189a514887
RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887
RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003
RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8
R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040
 </TASK>

The root cause should be the below race:

 memory_failure
  try_memory_failure_hugetlb
   me_huge_page
    __page_handle_poison
     dissolve_free_hugetlb_folio
     drain_all_pages -- Buddy page can be isolated e.g. for compaction.
     take_page_off_buddy -- Failed as page is not in the buddy list.
     -- Page can be putback into buddy after compaction.
    page_ref_inc -- Leads to buddy page with refcnt = 1.

Then unpoison_memory() can unpoison the page and send the buddy page back
into buddy list again leading to the above bad page state warning.  And
bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy
page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to
allocate this page.

Fix this issue by only treating __page_handle_poison() as successful when
it returns 1.

Link: https://lkml.kernel.org/r/20240523071217.1696196-1-linmiaohe@huawei.com
Fixes: ceaf8fbea79a ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agomm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again
Yuanyuan Zhong [Thu, 23 May 2024 18:35:31 +0000 (12:35 -0600)]
mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again

After switching smaps_rollup to use VMA iterator, searching for next entry
is part of the condition expression of the do-while loop.  So the current
VMA needs to be addressed before the continue statement.

Otherwise, with some VMAs skipped, userspace observed memory
consumption from /proc/pid/smaps_rollup will be smaller than the sum of
the corresponding fields from /proc/pid/smaps.

Link: https://lkml.kernel.org/r/20240523183531.2535436-1-yzhong@purestorage.com
Fixes: c4c84f06285e ("fs/proc/task_mmu: stop using linked list and highest_vm_end")
Signed-off-by: Yuanyuan Zhong <yzhong@purestorage.com>
Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agonilfs2: fix potential hang in nilfs_detach_log_writer()
Ryusuke Konishi [Mon, 20 May 2024 13:26:21 +0000 (22:26 +0900)]
nilfs2: fix potential hang in nilfs_detach_log_writer()

Syzbot has reported a potential hang in nilfs_detach_log_writer() called
during nilfs2 unmount.

Analysis revealed that this is because nilfs_segctor_sync(), which
synchronizes with the log writer thread, can be called after
nilfs_segctor_destroy() terminates that thread, as shown in the call trace
below:

nilfs_detach_log_writer
  nilfs_segctor_destroy
    nilfs_segctor_kill_thread  --> Shut down log writer thread
    flush_work
      nilfs_iput_work_func
        nilfs_dispose_list
          iput
            nilfs_evict_inode
              nilfs_transaction_commit
                nilfs_construct_segment (if inode needs sync)
                  nilfs_segctor_sync  --> Attempt to synchronize with
                                          log writer thread
                           *** DEADLOCK ***

Fix this issue by changing nilfs_segctor_sync() so that the log writer
thread returns normally without synchronizing after it terminates, and by
forcing tasks that are already waiting to complete once after the thread
terminates.

The skipped inode metadata flushout will then be processed together in the
subsequent cleanup work in nilfs_segctor_destroy().

Link: https://lkml.kernel.org/r/20240520132621.4054-4-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+e3973c409251e136fdd0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e3973c409251e136fdd0
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: "Bai, Shuangpeng" <sjb7183@psu.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agonilfs2: fix unexpected freezing of nilfs_segctor_sync()
Ryusuke Konishi [Mon, 20 May 2024 13:26:20 +0000 (22:26 +0900)]
nilfs2: fix unexpected freezing of nilfs_segctor_sync()

A potential and reproducible race issue has been identified where
nilfs_segctor_sync() would block even after the log writer thread writes a
checkpoint, unless there is an interrupt or other trigger to resume log
writing.

This turned out to be because, depending on the execution timing of the
log writer thread running in parallel, the log writer thread may skip
responding to nilfs_segctor_sync(), which causes a call to schedule()
waiting for completion within nilfs_segctor_sync() to lose the opportunity
to wake up.

The reason why waking up the task waiting in nilfs_segctor_sync() may be
skipped is that updating the request generation issued using a shared
sequence counter and adding an wait queue entry to the request wait queue
to the log writer, are not done atomically.  There is a possibility that
log writing and request completion notification by nilfs_segctor_wakeup()
may occur between the two operations, and in that case, the wait queue
entry is not yet visible to nilfs_segctor_wakeup() and the wake-up of
nilfs_segctor_sync() will be carried over until the next request occurs.

Fix this issue by performing these two operations simultaneously within
the lock section of sc_state_lock.  Also, following the memory barrier
guidelines for event waiting loops, move the call to set_current_state()
in the same location into the event waiting loop to ensure that a memory
barrier is inserted just before the event condition determination.

Link: https://lkml.kernel.org/r/20240520132621.4054-3-konishi.ryusuke@gmail.com
Fixes: 9ff05123e3bf ("nilfs2: segment constructor")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: "Bai, Shuangpeng" <sjb7183@psu.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agonilfs2: fix use-after-free of timer for log writer thread
Ryusuke Konishi [Mon, 20 May 2024 13:26:19 +0000 (22:26 +0900)]
nilfs2: fix use-after-free of timer for log writer thread

Patch series "nilfs2: fix log writer related issues".

This bug fix series covers three nilfs2 log writer-related issues,
including a timer use-after-free issue and potential deadlock issue on
unmount, and a potential freeze issue in event synchronization found
during their analysis.  Details are described in each commit log.

This patch (of 3):

A use-after-free issue has been reported regarding the timer sc_timer on
the nilfs_sc_info structure.

The problem is that even though it is used to wake up a sleeping log
writer thread, sc_timer is not shut down until the nilfs_sc_info structure
is about to be freed, and is used regardless of the thread's lifetime.

Fix this issue by limiting the use of sc_timer only while the log writer
thread is alive.

Link: https://lkml.kernel.org/r/20240520132621.4054-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240520132621.4054-2-konishi.ryusuke@gmail.com
Fixes: fdce895ea5dd ("nilfs2: change sc_timer from a pointer to an embedded one in struct nilfs_sc_info")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: "Bai, Shuangpeng" <sjb7183@psu.edu>
Closes: https://groups.google.com/g/syzkaller/c/MK_LYqtt8ko/m/8rgdWeseAwAJ
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agoselftests/mm: fix build warnings on ppc64
Michael Ellerman [Tue, 21 May 2024 03:02:19 +0000 (13:02 +1000)]
selftests/mm: fix build warnings on ppc64

Fix warnings like:

  In file included from uffd-unit-tests.c:8:
  uffd-unit-tests.c: In function `uffd_poison_handle_fault':
  uffd-common.h:45:33: warning: format `%llu' expects argument of type
  `long long unsigned int', but argument 3 has type `__u64' {aka `long
  unsigned int'} [-Wformat=]

By switching to unsigned long long for u64 for ppc64 builds.

Link: https://lkml.kernel.org/r/20240521030219.57439-1-mpe@ellerman.id.au
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agoarm64: patching: fix handling of execmem addresses
Will Deacon [Tue, 21 May 2024 21:38:13 +0000 (00:38 +0300)]
arm64: patching: fix handling of execmem addresses

Klara Modin reported warnings for a kernel configured with BPF_JIT but
without MODULES:

[   44.131296] Trying to vfree() bad address (000000004a17c299)
[   44.138024] WARNING: CPU: 1 PID: 193 at mm/vmalloc.c:3189 remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
[   44.146675] CPU: 1 PID: 193 Comm: kworker/1:2 Tainted: G      D W          6.9.0-01786-g2c9e5d4a0082 #25
[   44.158229] Hardware name: Raspberry Pi 3 Model B (DT)
[   44.164433] Workqueue: events bpf_prog_free_deferred
[   44.170492] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   44.178601] pc : remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
[   44.183705] lr : remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
[   44.188772] sp : ffff800082a13c70
[   44.193112] x29: ffff800082a13c70 x28: 0000000000000000 x27: 0000000000000000
[   44.201384] x26: 0000000000000000 x25: ffff00003a44efa0 x24: 00000000d4202000
[   44.209658] x23: ffff800081223dd0 x22: ffff00003a198a40 x21: ffff8000814dd880
[   44.217924] x20: 00000000d4202000 x19: ffff8000814dd880 x18: 0000000000000006
[   44.226206] x17: 0000000000000000 x16: 0000000000000020 x15: 0000000000000002
[   44.234460] x14: ffff8000811a6370 x13: 0000000020000000 x12: 0000000000000000
[   44.242710] x11: ffff8000811a6370 x10: 0000000000000144 x9 : ffff8000811fe370
[   44.250959] x8 : 0000000000017fe8 x7 : 00000000fffff000 x6 : ffff8000811fe370
[   44.259206] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[   44.267457] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000002203240
[   44.275703] Call trace:
[   44.279158] remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
[   44.283858] vfree (mm/vmalloc.c:3322)
[   44.287835] execmem_free (mm/execmem.c:70)
[   44.292347] bpf_jit_free_exec+0x10/0x1c
[   44.297283] bpf_prog_pack_free (kernel/bpf/core.c:1006)
[   44.302457] bpf_jit_binary_pack_free (kernel/bpf/core.c:1195)
[   44.307951] bpf_jit_free (include/linux/filter.h:1083 arch/arm64/net/bpf_jit_comp.c:2474)
[   44.312342] bpf_prog_free_deferred (kernel/bpf/core.c:2785)
[   44.317785] process_one_work (kernel/workqueue.c:3273)
[   44.322684] worker_thread (kernel/workqueue.c:3342 (discriminator 2) kernel/workqueue.c:3429 (discriminator 2))
[   44.327292] kthread (kernel/kthread.c:388)
[   44.331342] ret_from_fork (arch/arm64/kernel/entry.S:861)

The problem is because bpf_arch_text_copy() silently fails to write to the
read-only area as a result of patch_map() faulting and the resulting
-EFAULT being chucked away.

Update patch_map() to use CONFIG_EXECMEM instead of
CONFIG_STRICT_MODULE_RWX to check for vmalloc addresses.

Link: https://lkml.kernel.org/r/20240521213813.703309-1-rppt@kernel.org
Fixes: 2c9e5d4a0082 ("bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of")
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reported-by: Klara Modin <klarasmodin@gmail.com>
Closes: https://lore.kernel.org/all/7983fbbf-0127-457c-9394-8d6e4299c685@gmail.com
Tested-by: Klara Modin <klarasmodin@gmail.com>
Cc: Björn Töpel <bjorn@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agoselftests/mm: compaction_test: fix bogus test success and reduce probability of OOM...
Dev Jain [Tue, 21 May 2024 07:43:58 +0000 (13:13 +0530)]
selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation

Reset nr_hugepages to zero before the start of the test.

If a non-zero number of hugepages is already set before the start of the
test, the following problems arise:

 - The probability of the test getting OOM-killed increases.  Proof:
   The test wants to run on 80% of available memory to prevent OOM-killing
   (see original code comments).  Let the value of mem_free at the start
   of the test, when nr_hugepages = 0, be x.  In the other case, when
   nr_hugepages > 0, let the memory consumed by hugepages be y.  In the
   former case, the test operates on 0.8 * x of memory.  In the latter,
   the test operates on 0.8 * (x - y) of memory, with y already filled,
   hence, memory consumed is y + 0.8 * (x - y) = 0.8 * x + 0.2 * y > 0.8 *
   x.  Q.E.D

 - The probability of a bogus test success increases.  Proof: Let the
   memory consumed by hugepages be greater than 25% of x, with x and y
   defined as above.  The definition of compaction_index is c_index = (x -
   y)/z where z is the memory consumed by hugepages after trying to
   increase them again.  In check_compaction(), we set the number of
   hugepages to zero, and then increase them back; the probability that
   they will be set back to consume at least y amount of memory again is
   very high (since there is not much delay between the two attempts of
   changing nr_hugepages).  Hence, z >= y > (x/4) (by the 25% assumption).
   Therefore, c_index = (x - y)/z <= (x - y)/y = x/y - 1 < 4 - 1 = 3
   hence, c_index can always be forced to be less than 3, thereby the test
   succeeding always.  Q.E.D

Link: https://lkml.kernel.org/r/20240521074358.675031-4-dev.jain@arm.com
Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
Signed-off-by: Dev Jain <dev.jain@arm.com>
Cc: <stable@vger.kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sri Jayaramappa <sjayaram@akamai.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agoselftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
Dev Jain [Tue, 21 May 2024 07:43:57 +0000 (13:13 +0530)]
selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages

Currently, the test tries to set nr_hugepages to zero, but that is not
actually done because the file offset is not reset after read().  Fix that
using lseek().

Link: https://lkml.kernel.org/r/20240521074358.675031-3-dev.jain@arm.com
Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
Signed-off-by: Dev Jain <dev.jain@arm.com>
Cc: <stable@vger.kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sri Jayaramappa <sjayaram@akamai.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agoselftests/mm: compaction_test: fix bogus test success on Aarch64
Dev Jain [Tue, 21 May 2024 07:43:56 +0000 (13:13 +0530)]
selftests/mm: compaction_test: fix bogus test success on Aarch64

Patch series "Fixes for compaction_test", v2.

The compaction_test memory selftest introduces fragmentation in memory
and then tries to allocate as many hugepages as possible. This series
addresses some problems.

On Aarch64, if nr_hugepages == 0, then the test trivially succeeds since
compaction_index becomes 0, which is less than 3, due to no division by
zero exception being raised. We fix that by checking for division by
zero.

Secondly, correctly set the number of hugepages to zero before trying
to set a large number of them.

Now, consider a situation in which, at the start of the test, a non-zero
number of hugepages have been already set (while running the entire
selftests/mm suite, or manually by the admin). The test operates on 80%
of memory to avoid OOM-killer invocation, and because some memory is
already blocked by hugepages, it would increase the chance of OOM-killing.
Also, since mem_free used in check_compaction() is the value before we
set nr_hugepages to zero, the chance that the compaction_index will
be small is very high if the preset nr_hugepages was high, leading to a
bogus test success.

This patch (of 3):

Currently, if at runtime we are not able to allocate a huge page, the test
will trivially pass on Aarch64 due to no exception being raised on
division by zero while computing compaction_index.  Fix that by checking
for nr_hugepages == 0.  Anyways, in general, avoid a division by zero by
exiting the program beforehand.  While at it, fix a typo, and handle the
case where the number of hugepages may overflow an integer.

Link: https://lkml.kernel.org/r/20240521074358.675031-1-dev.jain@arm.com
Link: https://lkml.kernel.org/r/20240521074358.675031-2-dev.jain@arm.com
Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
Signed-off-by: Dev Jain <dev.jain@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sri Jayaramappa <sjayaram@akamai.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agomailmap: update email address for Satya Priya
Satya Priya Kakitapalli [Wed, 15 May 2024 06:04:50 +0000 (11:34 +0530)]
mailmap: update email address for Satya Priya

Update mailmap with my latest email ID, quic_c_skakit@quicinc.com
is no longer active.

Link: https://lkml.kernel.org/r/20240515-mailmap-update-v1-1-df4853f757a3@quicinc.com
Signed-off-by: Satya Priya Kakitapalli <quic_skakitap@quicinc.com>
Cc: Ajit Pandey <quic_ajipan@quicinc.com>
Cc: Bjorn Andersson <andersson@kernel.org>
Cc: Imran Shaik <quic_imrashai@quicinc.com>
Cc: Jagadeesh Kona <quic_jkona@quicinc.com>
Cc: Konrad Dybcio <konrad.dybcio@linaro.org>
Cc: Taniya Das <quic_tdas@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agomm/huge_memory: don't unpoison huge_zero_folio
Miaohe Lin [Thu, 16 May 2024 12:26:08 +0000 (20:26 +0800)]
mm/huge_memory: don't unpoison huge_zero_folio

When I did memory failure tests recently, below panic occurs:

 kernel BUG at include/linux/mm.h:1135!
 invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 9 PID: 137 Comm: kswapd1 Not tainted 6.9.0-rc4-00491-gd5ce28f156fe-dirty #14
 RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0
 RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246
 RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8
 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0
 RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492
 R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00
 FS:  0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
 Call Trace:
  <TASK>
  do_shrink_slab+0x14f/0x6a0
  shrink_slab+0xca/0x8c0
  shrink_node+0x2d0/0x7d0
  balance_pgdat+0x33a/0x720
  kswapd+0x1f3/0x410
  kthread+0xd5/0x100
  ret_from_fork+0x2f/0x50
  ret_from_fork_asm+0x1a/0x30
  </TASK>
 Modules linked in: mce_inject hwpoison_inject
 ---[ end trace 0000000000000000 ]---
 RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0
 RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246
 RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8
 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0
 RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492
 R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00
 FS:  0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0

The root cause is that HWPoison flag will be set for huge_zero_folio
without increasing the folio refcnt.  But then unpoison_memory() will
decrease the folio refcnt unexpectedly as it appears like a successfully
hwpoisoned folio leading to VM_BUG_ON_PAGE(page_ref_count(page) == 0) when
releasing huge_zero_folio.

Skip unpoisoning huge_zero_folio in unpoison_memory() to fix this issue.
We're not prepared to unpoison huge_zero_folio yet.

Link: https://lkml.kernel.org/r/20240516122608.22610-1-linmiaohe@huawei.com
Fixes: 478d134e9506 ("mm/huge_memory: do not overkill when splitting huge_zero_page")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Xu Yu <xuyu@linux.alibaba.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agokasan, fortify: properly rename memintrinsics
Andrey Konovalov [Fri, 17 May 2024 13:01:18 +0000 (15:01 +0200)]
kasan, fortify: properly rename memintrinsics

After commit 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*()
functions") and the follow-up fixes, with CONFIG_FORTIFY_SOURCE enabled,
even though the compiler instruments meminstrinsics by generating calls to
__asan/__hwasan_ prefixed functions, FORTIFY_SOURCE still uses
uninstrumented memset/memmove/memcpy as the underlying functions.

As a result, KASAN cannot detect bad accesses in memset/memmove/memcpy.
This also makes KASAN tests corrupt kernel memory and cause crashes.

To fix this, use __asan_/__hwasan_memset/memmove/memcpy as the underlying
functions whenever appropriate.  Do this only for the instrumented code
(as indicated by __SANITIZE_ADDRESS__).

Link: https://lkml.kernel.org/r/20240517130118.759301-1-andrey.konovalov@linux.dev
Fixes: 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*() functions")
Fixes: 51287dcb00cc ("kasan: emit different calls for instrumentable memintrinsics")
Fixes: 36be5cba99f6 ("kasan: treat meminstrinsic as builtins in uninstrumented files")
Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com>
Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Reported-by: Nico Pache <npache@redhat.com>
Closes: https://lore.kernel.org/all/20240501144156.17e65021@outsider.home/
Reviewed-by: Marco Elver <elver@google.com>
Tested-by: Nico Pache <npache@redhat.com>
Acked-by: Nico Pache <npache@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agolib: add version into /proc/allocinfo output
Suren Baghdasaryan [Tue, 14 May 2024 16:31:28 +0000 (09:31 -0700)]
lib: add version into /proc/allocinfo output

Add version string and a header at the beginning of /proc/allocinfo to
allow later format changes.  Example output:

> head /proc/allocinfo
allocinfo - version: 1.0
#     <size>  <calls> <tag info>
           0        0 init/main.c:1314 func:do_initcalls
           0        0 init/do_mounts.c:353 func:mount_nodev_root
           0        0 init/do_mounts.c:187 func:mount_root_generic
           0        0 init/do_mounts.c:158 func:do_mount_root
           0        0 init/initramfs.c:493 func:unpack_to_rootfs
           0        0 init/initramfs.c:492 func:unpack_to_rootfs
           0        0 init/initramfs.c:491 func:unpack_to_rootfs
         512        1 arch/x86/events/rapl.c:681 func:init_rapl_pmus
         128        1 arch/x86/events/rapl.c:571 func:rapl_cpu_online

[akpm@linux-foundation.org: remove stray newline from struct allocinfo_private]
Link: https://lkml.kernel.org/r/20240514163128.3662251-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agomm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
Hailong.Liu [Fri, 10 May 2024 10:01:31 +0000 (18:01 +0800)]
mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL

commit a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc")
includes support for __GFP_NOFAIL, but it presents a conflict with commit
dd544141b9eb ("vmalloc: back off when the current task is OOM-killed").  A
possible scenario is as follows:

process-a
__vmalloc_node_range(GFP_KERNEL | __GFP_NOFAIL)
    __vmalloc_area_node()
        vm_area_alloc_pages()
--> oom-killer send SIGKILL to process-a
        if (fatal_signal_pending(current)) break;
--> return NULL;

To fix this, do not check fatal_signal_pending() in vm_area_alloc_pages()
if __GFP_NOFAIL set.

This issue occurred during OPLUS KASAN TEST. Below is part of the log
-> oom-killer sends signal to process
[65731.222840] [ T1308] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/apps/uid_10198,task=gs.intelligence,pid=32454,uid=10198

[65731.259685] [T32454] Call trace:
[65731.259698] [T32454]  dump_backtrace+0xf4/0x118
[65731.259734] [T32454]  show_stack+0x18/0x24
[65731.259756] [T32454]  dump_stack_lvl+0x60/0x7c
[65731.259781] [T32454]  dump_stack+0x18/0x38
[65731.259800] [T32454]  mrdump_common_die+0x250/0x39c [mrdump]
[65731.259936] [T32454]  ipanic_die+0x20/0x34 [mrdump]
[65731.260019] [T32454]  atomic_notifier_call_chain+0xb4/0xfc
[65731.260047] [T32454]  notify_die+0x114/0x198
[65731.260073] [T32454]  die+0xf4/0x5b4
[65731.260098] [T32454]  die_kernel_fault+0x80/0x98
[65731.260124] [T32454]  __do_kernel_fault+0x160/0x2a8
[65731.260146] [T32454]  do_bad_area+0x68/0x148
[65731.260174] [T32454]  do_mem_abort+0x151c/0x1b34
[65731.260204] [T32454]  el1_abort+0x3c/0x5c
[65731.260227] [T32454]  el1h_64_sync_handler+0x54/0x90
[65731.260248] [T32454]  el1h_64_sync+0x68/0x6c

[65731.260269] [T32454]  z_erofs_decompress_queue+0x7f0/0x2258
--> be->decompressed_pages = kvcalloc(be->nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_NOFAIL);
kernel panic by NULL pointer dereference.
erofs assume kvmalloc with __GFP_NOFAIL never return NULL.
[65731.260293] [T32454]  z_erofs_runqueue+0xf30/0x104c
[65731.260314] [T32454]  z_erofs_readahead+0x4f0/0x968
[65731.260339] [T32454]  read_pages+0x170/0xadc
[65731.260364] [T32454]  page_cache_ra_unbounded+0x874/0xf30
[65731.260388] [T32454]  page_cache_ra_order+0x24c/0x714
[65731.260411] [T32454]  filemap_fault+0xbf0/0x1a74
[65731.260437] [T32454]  __do_fault+0xd0/0x33c
[65731.260462] [T32454]  handle_mm_fault+0xf74/0x3fe0
[65731.260486] [T32454]  do_mem_abort+0x54c/0x1b34
[65731.260509] [T32454]  el0_da+0x44/0x94
[65731.260531] [T32454]  el0t_64_sync_handler+0x98/0xb4
[65731.260553] [T32454]  el0t_64_sync+0x198/0x19c

Link: https://lkml.kernel.org/r/20240510100131.1865-1-hailong.liu@oppo.com
Fixes: 9376130c390a ("mm/vmalloc: add support for __GFP_NOFAIL")
Signed-off-by: Hailong.Liu <hailong.liu@oppo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Barry Song <21cnbao@gmail.com>
Reported-by: Oven <liyangouwen1@oppo.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agoMerge tag 'riscv-for-linus-6.10-mw2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 24 May 2024 17:46:35 +0000 (10:46 -0700)]
Merge tag 'riscv-for-linus-6.10-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - The compression format used for boot images is now configurable at
   build time, and these formats are shown in `make help`

 - access_ok() has been optimized

 - A pair of performance bugs have been fixed in the uaccess handlers

 - Various fixes and cleanups, including one for the IMSIC build failure
   and one for the early-boot ftrace illegal NOPs bug

* tag 'riscv-for-linus-6.10-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix early ftrace nop patching
  irqchip: riscv-imsic: Fixup riscv_ipi_set_virq_range() conflict
  riscv: selftests: Add signal handling vector tests
  riscv: mm: accelerate pagefault when badaccess
  riscv: uaccess: Relax the threshold for fast path
  riscv: uaccess: Allow the last potential unrolled copy
  riscv: typo in comment for get_f64_reg
  Use bool value in set_cpu_online()
  riscv: selftests: Add hwprobe binaries to .gitignore
  riscv: stacktrace: fixed walk_stackframe()
  ftrace: riscv: move from REGS to ARGS
  riscv: do not select MODULE_SECTIONS by default
  riscv: show help string for riscv-specific targets
  riscv: make image compression configurable
  riscv: cpufeature: Fix extension subset checking
  riscv: cpufeature: Fix thead vector hwcap removal
  riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context
  riscv: force PAGE_SIZE linear mapping if debug_pagealloc is enabled
  riscv: Define TASK_SIZE_MAX for __access_ok()
  riscv: Remove PGDIR_SIZE_L3 and TASK_SIZE_MIN

11 months agoMerge tag 'for-linus-6.10a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 24 May 2024 17:24:49 +0000 (10:24 -0700)]
Merge tag 'for-linus-6.10a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:

 - a small cleanup in the drivers/xen/xenbus Makefile

 - a fix of the Xen xenstore driver to improve connecting to a late
   started Xenstore

 - an enhancement for better support of ballooning in PVH guests

 - a cleanup using try_cmpxchg() instead of open coding it

* tag 'for-linus-6.10a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  drivers/xen: Improve the late XenStore init protocol
  xen/xenbus: Use *-y instead of *-objs in Makefile
  xen/x86: add extra pages to unpopulated-alloc if available
  locking/x86/xen: Use try_cmpxchg() in xen_alloc_p2m_entry()

11 months agoMerge tag 'for-6.10-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Linus Torvalds [Fri, 24 May 2024 16:40:31 +0000 (09:40 -0700)]
Merge tag 'for-6.10-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull more btrfs updates from David Sterba:
 "A few more updates, mostly stability fixes or user visible changes:

   - fix race in zoned mode during device replace that can lead to
     use-after-free

   - update return codes and lower message levels for quota rescan where
     it's causing false alerts

   - fix unexpected qgroup id reuse under some conditions

   - fix condition when looking up extent refs

   - add option norecovery (removed in 6.8), the intended replacements
     haven't been used and some aplications still rely on the old one

   - build warning fixes"

* tag 'for-6.10-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: re-introduce 'norecovery' mount option
  btrfs: fix end of tree detection when searching for data extent ref
  btrfs: scrub: initialize ret in scrub_simple_mirror() to fix compilation warning
  btrfs: zoned: fix use-after-free due to race with dev replace
  btrfs: qgroup: fix qgroup id collision across mounts
  btrfs: qgroup: update rescan message levels and error codes

11 months agoMerge tag 'erofs-for-6.10-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 24 May 2024 16:31:50 +0000 (09:31 -0700)]
Merge tag 'erofs-for-6.10-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull more erofs updates from Gao Xiang:
 "The main ones are metadata API conversion to byte offsets by Al Viro.

  Another patch gets rid of unnecessary memory allocation out of DEFLATE
  decompressor. The remaining one is a trivial cleanup.

   - Convert metadata APIs to byte offsets

   - Avoid allocating DEFLATE streams unnecessarily

   - Some erofs_show_options() cleanup"

* tag 'erofs-for-6.10-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: avoid allocating DEFLATE streams before mounting
  z_erofs_pcluster_begin(): don't bother with rounding position down
  erofs: don't round offset down for erofs_read_metabuf()
  erofs: don't align offset for erofs_read_metabuf() (simple cases)
  erofs: mechanically convert erofs_read_metabuf() to offsets
  erofs: clean up erofs_show_options()

11 months agoMerge tag 'bcachefs-2024-05-24' of https://evilpiepirate.org/git/bcachefs
Linus Torvalds [Fri, 24 May 2024 16:07:22 +0000 (09:07 -0700)]
Merge tag 'bcachefs-2024-05-24' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "Nothing exciting, just syzbot fixes (except for the one
  FMODE_CAN_ODIRECT patch).

  Looks like syzbot reports have slowed down; this is all catch up from
  two weeks of conferences.

  Next hardening project is using Thomas's error injection tooling to
  torture test repair"

* tag 'bcachefs-2024-05-24' of https://evilpiepirate.org/git/bcachefs:
  bcachefs: Fix race path in bch2_inode_insert()
  bcachefs: Ensure we're RW before journalling
  bcachefs: Fix shutdown ordering
  bcachefs: Fix unsafety in bch2_dirent_name_bytes()
  bcachefs: Fix stack oob in __bch2_encrypt_bio()
  bcachefs: Fix btree_trans leak in bch2_readahead()
  bcachefs: Fix bogus verify_replicas_entry() assert
  bcachefs: Check for subvolues with bogus snapshot/inode fields
  bcachefs: bch2_checksum() returns 0 for unknown checksum type
  bcachefs: Fix bch2_alloc_ciphers()
  bcachefs: Add missing guard in bch2_snapshot_has_children()
  bcachefs: Fix missing parens in drop_locks_do()
  bcachefs: Improve bch2_assert_pos_locked()
  bcachefs: Fix shift overflows in replicas.c
  bcachefs: Fix shift overflow in btree_lost_data()
  bcachefs: Fix ref in trans_mark_dev_sbs() error path
  bcachefs: set FMODE_CAN_ODIRECT instead of a dummy direct_IO method
  bcachefs: Fix rcu splat in check_fix_ptrs()

11 months agoMerge tag 'input-for-v6.10-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 24 May 2024 16:01:21 +0000 (09:01 -0700)]
Merge tag 'input-for-v6.10-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov:

 - a change to input core to trim amount of keys data in modalias string
   in case when a device declares too many keys and they do not fit in
   uevent buffer instead of reporting an error which results in uevent
   not being generated at all

 - support for Machenike G5 Pro Controller added to xpad driver

 - support for FocalTech FT5452 and FT8719 added to edt-ft5x06

 - support for new SPMI vibrator added to pm8xxx-vibrator driver

 - missing locking added to cyapa touchpad driver

 - removal of unused fields in various driver structures

 - explicit initialization of i2c_device_id::driver_data to 0 dropped
   from input drivers

 - other assorted fixes and cleanups.

* tag 'input-for-v6.10-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (24 commits)
  Input: edt-ft5x06 - add support for FocalTech FT5452 and FT8719
  dt-bindings: input: touchscreen: edt-ft5x06: Document FT5452 and FT8719 support
  Input: xpad - add support for Machenike G5 Pro Controller
  Input: try trimming too long modalias strings
  Input: drop explicit initialization of struct i2c_device_id::driver_data to 0
  Input: zet6223 - remove an unused field in struct zet6223_ts
  Input: chipone_icn8505 - remove an unused field in struct icn8505_data
  Input: cros_ec_keyb - remove an unused field in struct cros_ec_keyb
  Input: lpc32xx-keys - remove an unused field in struct lpc32xx_kscan_drv
  Input: matrix_keypad - remove an unused field in struct matrix_keypad
  Input: tca6416-keypad - remove unused struct tca6416_drv_data
  Input: tca6416-keypad - remove an unused field in struct tca6416_keypad_chip
  Input: da7280 - remove an unused field in struct da7280_haptic
  Input: ff-core - prefer struct_size over open coded arithmetic
  Input: cyapa - add missing input core locking to suspend/resume functions
  input: pm8xxx-vibrator: add new SPMI vibrator support
  dt-bindings: input: qcom,pm8xxx-vib: add new SPMI vibrator module
  input: pm8xxx-vibrator: refactor to support new SPMI vibrator
  Input: pm8xxx-vibrator - correct VIB_MAX_LEVELS calculation
  Input: sur40 - convert le16 to cpu before use
  ...

11 months agoMerge tag 'sound-fix-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 24 May 2024 15:48:51 +0000 (08:48 -0700)]
Merge tag 'sound-fix-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A collection of small fixes for 6.10-rc1. Most of changes are various
  device-specific fixes and quirks, while there are a few small changes
  in ALSA core timer and module / built-in fixes"

* tag 'sound-fix-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek: fix mute/micmute LEDs don't work for ProBook 440/460 G11.
  ALSA: core: Enable proc module when CONFIG_MODULES=y
  ALSA: core: Fix NULL module pointer assignment at card init
  ALSA: hda/realtek: Enable headset mic of JP-IK LEAP W502 with ALC897
  ASoC: dt-bindings: stm32: Ensure compatible pattern matches whole string
  ASoC: tas2781: Fix wrong loading calibrated data sequence
  ASoC: tas2552: Add TX path for capturing AUDIO-OUT data
  ALSA: usb-audio: Fix for sampling rates support for Mbox3
  Documentation: sound: Fix trailing whitespaces
  ALSA: timer: Set lower bound of start tick time
  ASoC: codecs: ES8326: solve hp and button detect issue
  ASoC: rt5645: mic-in detection threshold modification
  ASoC: Intel: sof_sdw_rt_sdca_jack_common: Use name_prefix for `-sdca` detection

11 months agoMerge tag 'char-misc-6.10-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 24 May 2024 15:43:25 +0000 (08:43 -0700)]
Merge tag 'char-misc-6.10-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc fix from Greg KH:
 "Here is one remaining bugfix for 6.10-rc1 that missed the 6.9-final
  merge window, and has been sitting in my tree and linux-next for quite
  a while now, but wasn't sent to you (my fault, travels...)

  It is a bugfix to resolve an error in the speakup code that could
  overflow a buffer.

  It has been in linux-next for a while with no reported problems"

* tag 'char-misc-6.10-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  speakup: Fix sizeof() vs ARRAY_SIZE() bug

11 months agoMerge tag 'tty-6.10-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gregk...
Linus Torvalds [Fri, 24 May 2024 15:38:28 +0000 (08:38 -0700)]
Merge tag 'tty-6.10-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are some small TTY and Serial driver fixes that missed the
  6.9-final merge window, but have been in my tree for weeks (my fault,
  travel caused me to miss this)

  These fixes include:

   - more n_gsm fixes for reported problems

   - 8520_mtk driver fix

   - 8250_bcm7271 driver fix

   - sc16is7xx driver fix

  All of these have been in linux-next for weeks without any reported
  problems"

* tag 'tty-6.10-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using prescaler
  serial: 8250_bcm7271: use default_mux_rate if possible
  serial: 8520_mtk: Set RTS on shutdown for Rx in-band wakeup
  tty: n_gsm: fix missing receive state reset after mode switch
  tty: n_gsm: fix possible out-of-bounds in gsm0_receive()

11 months agoMerge tag 'hardening-v6.10-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 24 May 2024 15:33:44 +0000 (08:33 -0700)]
Merge tag 'hardening-v6.10-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - loadpin: Prevent SECURITY_LOADPIN_ENFORCE=y without module
   decompression (Stephen Boyd)

 - ubsan: Restore dependency on ARCH_HAS_UBSAN

 - kunit/fortify: Fix memcmp() test to be amplitude agnostic

* tag 'hardening-v6.10-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kunit/fortify: Fix memcmp() test to be amplitude agnostic
  ubsan: Restore dependency on ARCH_HAS_UBSAN
  loadpin: Prevent SECURITY_LOADPIN_ENFORCE=y without module decompression

11 months agoMerge tag 'trace-tracefs-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 24 May 2024 15:27:34 +0000 (08:27 -0700)]
Merge tag 'trace-tracefs-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracefs/eventfs updates from Steven Rostedt:
 "Bug fixes:

   - The eventfs directories need to have unique inode numbers. Make
     sure that they do not get the default file inode number.

   - Update the inode uid and gid fields on remount.

     When a remount happens where a uid and/or gid is specified, all the
     tracefs files and directories should get the specified uid and/or
     gid. But this can be sporadic when some uids were assigned already.
     There's already a list of inodes that are allocated. Just update
     their uid and gid fields at the time of remount.

   - Update the eventfs_inodes on remount from the top level "events"
     descriptor.

     There was a bug where not all the eventfs files or directories
     where getting updated on remount. One fix was to clear the
     SAVED_UID/GID flags from the inode list during the iteration of the
     inodes during the remount. But because the eventfs inodes can be
     freed when the last referenced is released, not all the
     eventfs_inodes were being updated. This lead to the ownership
     selftest to fail if it was run a second time (the first time would
     leave eventfs_inodes with no corresponding tracefs_inode).

     Instead, for eventfs_inodes, only process the "events"
     eventfs_inode from the list iteration, as it is guaranteed to have
     a tracefs_inode (it's never freed while the "events" directory
     exists). As it has a list of its children, and the children have a
     list of their children, just iterate all the eventfs_inodes from
     the "events" descriptor and it is guaranteed to get all of them.

   - Clear the EVENT_INODE flag from the tracefs_drop_inode() callback.

     Currently the EVENTFS_INODE FLAG is cleared in the tracefs_d_iput()
     callback. But this is the wrong location. The iput() callback is
     called when the last reference to the dentry inode is hit. There
     could be a case where two dentry's have the same inode, and the
     flag will be cleared prematurely. The flag needs to be cleared when
     the last reference of the inode is dropped and that happens in the
     inode's drop_inode() callback handler.

  Cleanups:

   - Consolidate the creation of a tracefs_inode for an eventfs_inode

     A tracefs_inode is created for both files and directories of the
     eventfs system. It is open coded. Instead, consolidate it into a
     single eventfs_get_inode() function call.

   - Remove the eventfs getattr and permission callbacks.

     The permissions for the eventfs files and directories are updated
     when the inodes are created, on remount, and when the user sets
     them (via setattr). The inodes hold the current permissions so
     there is no need to have custom getattr or permissions callbacks as
     they will more likely cause them to be incorrect. The inode's
     permissions are updated when they should be updated. Remove the
     getattr and permissions inode callbacks.

   - Do not update eventfs_inode attributes on creation of inodes.

     The eventfs_inodes attribute field is used to store the permissions
     of the directories and files for when their corresponding inodes
     are freed and are created again. But when the creation of the
     inodes happen, the eventfs_inode attributes are recalculated. The
     recalculation should only happen when the permissions change for a
     given file or directory. Currently, the attribute changes are just
     being set to their current files so this is not a bug, but it's
     unnecessary and error prone. Stop doing that.

   - The events directory inode is created once when the events
     directory is created and deleted when it is deleted. It is now
     updated on remount and when the user changes the permissions.
     There's no need to use the eventfs_inode of the events directory to
     store the events directory permissions. But using it to store the
     default permissions for the files within the directory that have
     not been updated by the user can simplify the code"

* tag 'trace-tracefs-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  eventfs: Do not use attributes for events directory
  eventfs: Cleanup permissions in creation of inodes
  eventfs: Remove getattr and permission callbacks
  eventfs: Consolidate the eventfs_inode update in eventfs_get_inode()
  tracefs: Clear EVENT_INODE flag in tracefs_drop_inode()
  eventfs: Update all the eventfs_inodes from the events descriptor
  tracefs: Update inode permissions on remount
  eventfs: Keep the directories from having the same inode number as files

11 months agogenirq/irqdesc: Prevent use-after-free in irq_find_at_or_after()
dicken.ding [Fri, 24 May 2024 09:17:39 +0000 (17:17 +0800)]
genirq/irqdesc: Prevent use-after-free in irq_find_at_or_after()

irq_find_at_or_after() dereferences the interrupt descriptor which is
returned by mt_find() while neither holding sparse_irq_lock nor RCU read
lock, which means the descriptor can be freed between mt_find() and the
dereference:

    CPU0                            CPU1
    desc = mt_find()
                                    delayed_free_desc(desc)
    irq_desc_get_irq(desc)

The use-after-free is reported by KASAN:

    Call trace:
     irq_get_next_irq+0x58/0x84
     show_stat+0x638/0x824
     seq_read_iter+0x158/0x4ec
     proc_reg_read_iter+0x94/0x12c
     vfs_read+0x1e0/0x2c8

    Freed by task 4471:
     slab_free_freelist_hook+0x174/0x1e0
     __kmem_cache_free+0xa4/0x1dc
     kfree+0x64/0x128
     irq_kobj_release+0x28/0x3c
     kobject_put+0xcc/0x1e0
     delayed_free_desc+0x14/0x2c
     rcu_do_batch+0x214/0x720

Guard the access with a RCU read lock section.

Fixes: 721255b9826b ("genirq: Use a maple tree for interrupt descriptor management")
Signed-off-by: dicken.ding <dicken.ding@mediatek.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240524091739.31611-1-dicken.ding@mediatek.com
11 months agofs/ntfs3: Break dir enumeration if directory contents error
Konstantin Komarov [Tue, 23 Apr 2024 14:21:58 +0000 (17:21 +0300)]
fs/ntfs3: Break dir enumeration if directory contents error

If we somehow attempt to read beyond the directory size, an error
is supposed to be returned.

However, in some cases, read requests do not stop and instead enter
into a loop.

To avoid this, we set the position in the directory to the end.

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: stable@vger.kernel.org
11 months agofs/ntfs3: Fix case when index is reused during tree transformation
Konstantin Komarov [Tue, 23 Apr 2024 12:31:56 +0000 (15:31 +0300)]
fs/ntfs3: Fix case when index is reused during tree transformation

In most cases when adding a cluster to the directory index,
they are placed at the end, and in the bitmap, this cluster corresponds
to the last bit. The new directory size is calculated as follows:

data_size = (u64)(bit + 1) << indx->index_bits;

In the case of reusing a non-final cluster from the index,
data_size is calculated incorrectly, resulting in the directory size
differing from the actual size.

A check for cluster reuse has been added, and the size update is skipped.

Fixes: 82cae269cfa95 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Cc: stable@vger.kernel.org
11 months agoselftest mm/mseal read-only elf memory segment
Jeff Xu [Mon, 15 Apr 2024 16:35:24 +0000 (16:35 +0000)]
selftest mm/mseal read-only elf memory segment

Sealing read-only of elf mapping so it can't be changed by mprotect.

[jeffxu@chromium.org: style change]
Link: https://lkml.kernel.org/r/20240416220944.2481203-2-jeffxu@chromium.org
[amer.shanawany@gmail.com: fix linker error for inline function]
Link: https://lkml.kernel.org/r/20240420202346.546444-1-amer.shanawany@gmail.com
[jeffxu@chromium.org: fix compile warning]
Link: https://lkml.kernel.org/r/20240420003515.345982-2-jeffxu@chromium.org
[jeffxu@chromium.org: fix arm build]
Link: https://lkml.kernel.org/r/20240502225331.3806279-2-jeffxu@chromium.org
Link: https://lkml.kernel.org/r/20240415163527.626541-6-jeffxu@chromium.org
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
Signed-off-by: Amer Al Shanawany <amer.shanawany@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jeff Xu <jeffxu@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Pedro Falcato <pedro.falcato@gmail.com>
Cc: Stephen Röttger <sroettger@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Amer Al Shanawany <amer.shanawany@gmail.com>
Cc: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agomseal: add documentation
Jeff Xu [Mon, 15 Apr 2024 16:35:23 +0000 (16:35 +0000)]
mseal: add documentation

Add documentation for mseal().

Link: https://lkml.kernel.org/r/20240415163527.626541-5-jeffxu@chromium.org
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jeff Xu <jeffxu@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Pedro Falcato <pedro.falcato@gmail.com>
Cc: Stephen Röttger <sroettger@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Amer Al Shanawany <amer.shanawany@gmail.com>
Cc: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agoselftest mm/mseal memory sealing
Jeff Xu [Mon, 15 Apr 2024 16:35:22 +0000 (16:35 +0000)]
selftest mm/mseal memory sealing

selftest for memory sealing change in mmap() and mseal().

Link: https://lkml.kernel.org/r/20240415163527.626541-4-jeffxu@chromium.org
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jeff Xu <jeffxu@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Pedro Falcato <pedro.falcato@gmail.com>
Cc: Stephen Röttger <sroettger@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Amer Al Shanawany <amer.shanawany@gmail.com>
Cc: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agomseal: add mseal syscall
Jeff Xu [Mon, 15 Apr 2024 16:35:21 +0000 (16:35 +0000)]
mseal: add mseal syscall

The new mseal() is an syscall on 64 bit CPU, and with following signature:

int mseal(void addr, size_t len, unsigned long flags)
addr/len: memory range.
flags: reserved.

mseal() blocks following operations for the given memory range.

1> Unmapping, moving to another location, and shrinking the size,
   via munmap() and mremap(), can leave an empty space, therefore can
   be replaced with a VMA with a new set of attributes.

2> Moving or expanding a different VMA into the current location,
   via mremap().

3> Modifying a VMA via mmap(MAP_FIXED).

4> Size expansion, via mremap(), does not appear to pose any specific
   risks to sealed VMAs. It is included anyway because the use case is
   unclear. In any case, users can rely on merging to expand a sealed VMA.

5> mprotect() and pkey_mprotect().

6> Some destructive madvice() behaviors (e.g. MADV_DONTNEED) for anonymous
   memory, when users don't have write permission to the memory. Those
   behaviors can alter region contents by discarding pages, effectively a
   memset(0) for anonymous memory.

Following input during RFC are incooperated into this patch:

Jann Horn: raising awareness and providing valuable insights on the
destructive madvise operations.
Linus Torvalds: assisting in defining system call signature and scope.
Liam R. Howlett: perf optimization.
Theo de Raadt: sharing the experiences and insight gained from
  implementing mimmutable() in OpenBSD.

Finally, the idea that inspired this patch comes from Stephen Röttger's
work in Chrome V8 CFI.

[jeffxu@chromium.org: add branch prediction hint, per Pedro]
Link: https://lkml.kernel.org/r/20240423192825.1273679-2-jeffxu@chromium.org
Link: https://lkml.kernel.org/r/20240415163527.626541-3-jeffxu@chromium.org
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Pedro Falcato <pedro.falcato@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jeff Xu <jeffxu@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Pedro Falcato <pedro.falcato@gmail.com>
Cc: Stephen Röttger <sroettger@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Amer Al Shanawany <amer.shanawany@gmail.com>
Cc: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agomseal: wire up mseal syscall
Jeff Xu [Mon, 15 Apr 2024 16:35:20 +0000 (16:35 +0000)]
mseal: wire up mseal syscall

Patch series "Introduce mseal", v10.

This patchset proposes a new mseal() syscall for the Linux kernel.

In a nutshell, mseal() protects the VMAs of a given virtual memory range
against modifications, such as changes to their permission bits.

Modern CPUs support memory permissions, such as the read/write (RW) and
no-execute (NX) bits.  Linux has supported NX since the release of kernel
version 2.6.8 in August 2004 [1].  The memory permission feature improves
the security stance on memory corruption bugs, as an attacker cannot
simply write to arbitrary memory and point the code to it.  The memory
must be marked with the X bit, or else an exception will occur.
Internally, the kernel maintains the memory permissions in a data
structure called VMA (vm_area_struct).  mseal() additionally protects the
VMA itself against modifications of the selected seal type.

Memory sealing is useful to mitigate memory corruption issues where a
corrupted pointer is passed to a memory management system.  For example,
such an attacker primitive can break control-flow integrity guarantees
since read-only memory that is supposed to be trusted can become writable
or .text pages can get remapped.  Memory sealing can automatically be
applied by the runtime loader to seal .text and .rodata pages and
applications can additionally seal security critical data at runtime.  A
similar feature already exists in the XNU kernel with the
VM_FLAGS_PERMANENT [3] flag and on OpenBSD with the mimmutable syscall
[4].  Also, Chrome wants to adopt this feature for their CFI work [2] and
this patchset has been designed to be compatible with the Chrome use case.

Two system calls are involved in sealing the map:  mmap() and mseal().

The new mseal() is an syscall on 64 bit CPU, and with following signature:

int mseal(void addr, size_t len, unsigned long flags)
addr/len: memory range.
flags: reserved.

mseal() blocks following operations for the given memory range.

1> Unmapping, moving to another location, and shrinking the size,
   via munmap() and mremap(), can leave an empty space, therefore can
   be replaced with a VMA with a new set of attributes.

2> Moving or expanding a different VMA into the current location,
   via mremap().

3> Modifying a VMA via mmap(MAP_FIXED).

4> Size expansion, via mremap(), does not appear to pose any specific
   risks to sealed VMAs. It is included anyway because the use case is
   unclear. In any case, users can rely on merging to expand a sealed VMA.

5> mprotect() and pkey_mprotect().

6> Some destructive madvice() behaviors (e.g. MADV_DONTNEED) for anonymous
   memory, when users don't have write permission to the memory. Those
   behaviors can alter region contents by discarding pages, effectively a
   memset(0) for anonymous memory.

The idea that inspired this patch comes from Stephen Röttger’s work in
V8 CFI [5].  Chrome browser in ChromeOS will be the first user of this
API.

Indeed, the Chrome browser has very specific requirements for sealing,
which are distinct from those of most applications.  For example, in the
case of libc, sealing is only applied to read-only (RO) or read-execute
(RX) memory segments (such as .text and .RELRO) to prevent them from
becoming writable, the lifetime of those mappings are tied to the lifetime
of the process.

Chrome wants to seal two large address space reservations that are managed
by different allocators.  The memory is mapped RW- and RWX respectively
but write access to it is restricted using pkeys (or in the future ARM
permission overlay extensions).  The lifetime of those mappings are not
tied to the lifetime of the process, therefore, while the memory is
sealed, the allocators still need to free or discard the unused memory.
For example, with madvise(DONTNEED).

However, always allowing madvise(DONTNEED) on this range poses a security
risk.  For example if a jump instruction crosses a page boundary and the
second page gets discarded, it will overwrite the target bytes with zeros
and change the control flow.  Checking write-permission before the discard
operation allows us to control when the operation is valid.  In this case,
the madvise will only succeed if the executing thread has PKEY write
permissions and PKRU changes are protected in software by control-flow
integrity.

Although the initial version of this patch series is targeting the Chrome
browser as its first user, it became evident during upstream discussions
that we would also want to ensure that the patch set eventually is a
complete solution for memory sealing and compatible with other use cases.
The specific scenario currently in mind is glibc's use case of loading and
sealing ELF executables.  To this end, Stephen is working on a change to
glibc to add sealing support to the dynamic linker, which will seal all
non-writable segments at startup.  Once this work is completed, all
applications will be able to automatically benefit from these new
protections.

In closing, I would like to formally acknowledge the valuable
contributions received during the RFC process, which were instrumental in
shaping this patch:

Jann Horn: raising awareness and providing valuable insights on the
  destructive madvise operations.
Liam R. Howlett: perf optimization.
Linus Torvalds: assisting in defining system call signature and scope.
Theo de Raadt: sharing the experiences and insight gained from
  implementing mimmutable() in OpenBSD.

MM perf benchmarks
==================
This patch adds a loop in the mprotect/munmap/madvise(DONTNEED) to
check the VMAs’ sealing flag, so that no partial update can be made,
when any segment within the given memory range is sealed.

To measure the performance impact of this loop, two tests are developed.
[8]

The first is measuring the time taken for a particular system call,
by using clock_gettime(CLOCK_MONOTONIC). The second is using
PERF_COUNT_HW_REF_CPU_CYCLES (exclude user space). Both tests have
similar results.

The tests have roughly below sequence:
for (i = 0; i < 1000, i++)
    create 1000 mappings (1 page per VMA)
    start the sampling
    for (j = 0; j < 1000, j++)
        mprotect one mapping
    stop and save the sample
    delete 1000 mappings
calculates all samples.

Below tests are performed on Intel(R) Pentium(R) Gold 7505 @ 2.00GHz,
4G memory, Chromebook.

Based on the latest upstream code:
The first test (measuring time)
syscall__ vmas t t_mseal delta_ns per_vma %
munmap__   1 909 944 35 35 104%
munmap__   2 1398 1502 104 52 107%
munmap__   4 2444 2594 149 37 106%
munmap__   8 4029 4323 293 37 107%
munmap__   16 6647 6935 288 18 104%
munmap__   32 11811 12398 587 18 105%
mprotect 1 439 465 26 26 106%
mprotect 2 1659 1745 86 43 105%
mprotect 4 3747 3889 142 36 104%
mprotect 8 6755 6969 215 27 103%
mprotect 16 13748 14144 396 25 103%
mprotect 32 27827 28969 1142 36 104%
madvise_ 1 240 262 22 22 109%
madvise_ 2 366 442 76 38 121%
madvise_ 4 623 751 128 32 121%
madvise_ 8 1110 1324 215 27 119%
madvise_ 16 2127 2451 324 20 115%
madvise_ 32 4109 4642 534 17 113%

The second test (measuring cpu cycle)
syscall__ vmas cpu cmseal delta_cpu per_vma %
munmap__ 1 1790 1890 100 100 106%
munmap__ 2 2819 3033 214 107 108%
munmap__ 4 4959 5271 312 78 106%
munmap__ 8 8262 8745 483 60 106%
munmap__ 16 13099 14116 1017 64 108%
munmap__ 32 23221 24785 1565 49 107%
mprotect 1 906 967 62 62 107%
mprotect 2 3019 3203 184 92 106%
mprotect 4 6149 6569 420 105 107%
mprotect 8 9978 10524 545 68 105%
mprotect 16 20448 21427 979 61 105%
mprotect 32 40972 42935 1963 61 105%
madvise_ 1 434 497 63 63 115%
madvise_ 2 752 899 147 74 120%
madvise_ 4 1313 1513 200 50 115%
madvise_ 8 2271 2627 356 44 116%
madvise_ 16 4312 4883 571 36 113%
madvise_ 32 8376 9319 943 29 111%

Based on the result, for 6.8 kernel, sealing check adds
20-40 nano seconds, or around 50-100 CPU cycles, per VMA.

In addition, I applied the sealing to 5.10 kernel:
The first test (measuring time)
syscall__ vmas t tmseal delta_ns per_vma %
munmap__ 1 357 390 33 33 109%
munmap__ 2 442 463 21 11 105%
munmap__ 4 614 634 20 5 103%
munmap__ 8 1017 1137 120 15 112%
munmap__ 16 1889 2153 263 16 114%
munmap__ 32 4109 4088 -21 -1 99%
mprotect 1 235 227 -7 -7 97%
mprotect 2 495 464 -30 -15 94%
mprotect 4 741 764 24 6 103%
mprotect 8 1434 1437 2 0 100%
mprotect 16 2958 2991 33 2 101%
mprotect 32 6431 6608 177 6 103%
madvise_ 1 191 208 16 16 109%
madvise_ 2 300 324 24 12 108%
madvise_ 4 450 473 23 6 105%
madvise_ 8 753 806 53 7 107%
madvise_ 16 1467 1592 125 8 108%
madvise_ 32 2795 3405 610 19 122%

The second test (measuring cpu cycle)
syscall__ nbr_vma cpu cmseal delta_cpu per_vma %
munmap__ 1 684 715 31 31 105%
munmap__ 2 861 898 38 19 104%
munmap__ 4 1183 1235 51 13 104%
munmap__ 8 1999 2045 46 6 102%
munmap__ 16 3839 3816 -23 -1 99%
munmap__ 32 7672 7887 216 7 103%
mprotect 1 397 443 46 46 112%
mprotect 2 738 788 50 25 107%
mprotect 4 1221 1256 35 9 103%
mprotect 8 2356 2429 72 9 103%
mprotect 16 4961 4935 -26 -2 99%
mprotect 32 9882 10172 291 9 103%
madvise_ 1 351 380 29 29 108%
madvise_ 2 565 615 49 25 109%
madvise_ 4 872 933 61 15 107%
madvise_ 8 1508 1640 132 16 109%
madvise_ 16 3078 3323 245 15 108%
madvise_ 32 5893 6704 811 25 114%

For 5.10 kernel, sealing check adds 0-15 ns in time, or 10-30
CPU cycles, there is even decrease in some cases.

It might be interesting to compare 5.10 and 6.8 kernel
The first test (measuring time)
syscall__ vmas t_5_10 t_6_8 delta_ns per_vma %
munmap__ 1 357 909 552 552 254%
munmap__ 2 442 1398 956 478 316%
munmap__ 4 614 2444 1830 458 398%
munmap__ 8 1017 4029 3012 377 396%
munmap__ 16 1889 6647 4758 297 352%
munmap__ 32 4109 11811 7702 241 287%
mprotect 1 235 439 204 204 187%
mprotect 2 495 1659 1164 582 335%
mprotect 4 741 3747 3006 752 506%
mprotect 8 1434 6755 5320 665 471%
mprotect 16 2958 13748 10790 674 465%
mprotect 32 6431 27827 21397 669 433%
madvise_ 1 191 240 49 49 125%
madvise_ 2 300 366 67 33 122%
madvise_ 4 450 623 173 43 138%
madvise_ 8 753 1110 357 45 147%
madvise_ 16 1467 2127 660 41 145%
madvise_ 32 2795 4109 1314 41 147%

The second test (measuring cpu cycle)
syscall__ vmas cpu_5_10 c_6_8 delta_cpu per_vma %
munmap__ 1 684 1790 1106 1106 262%
munmap__ 2 861 2819 1958 979 327%
munmap__ 4 1183 4959 3776 944 419%
munmap__ 8 1999 8262 6263 783 413%
munmap__ 16 3839 13099 9260 579 341%
munmap__ 32 7672 23221 15549 486 303%
mprotect 1 397 906 509 509 228%
mprotect 2 738 3019 2281 1140 409%
mprotect 4 1221 6149 4929 1232 504%
mprotect 8 2356 9978 7622 953 423%
mprotect 16 4961 20448 15487 968 412%
mprotect 32 9882 40972 31091 972 415%
madvise_ 1 351 434 82 82 123%
madvise_ 2 565 752 186 93 133%
madvise_ 4 872 1313 442 110 151%
madvise_ 8 1508 2271 763 95 151%
madvise_ 16 3078 4312 1234 77 140%
madvise_ 32 5893 8376 2483 78 142%

From 5.10 to 6.8
munmap: added 250-550 ns in time, or 500-1100 in cpu cycle, per vma.
mprotect: added 200-750 ns in time, or 500-1200 in cpu cycle, per vma.
madvise: added 33-50 ns in time, or 70-110 in cpu cycle, per vma.

In comparison to mseal, which adds 20-40 ns or 50-100 CPU cycles, the
increase from 5.10 to 6.8 is significantly larger, approximately ten times
greater for munmap and mprotect.

When I discuss the mm performance with Brian Makin, an engineer who worked
on performance, it was brought to my attention that such performance
benchmarks, which measuring millions of mm syscall in a tight loop, may
not accurately reflect real-world scenarios, such as that of a database
service.  Also this is tested using a single HW and ChromeOS, the data
from another HW or distribution might be different.  It might be best to
take this data with a grain of salt.

This patch (of 5):

Wire up mseal syscall for all architectures.

Link: https://lkml.kernel.org/r/20240415163527.626541-1-jeffxu@chromium.org
Link: https://lkml.kernel.org/r/20240415163527.626541-2-jeffxu@chromium.org
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Jann Horn <jannh@google.com> [Bug #2]
Cc: Jeff Xu <jeffxu@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Pedro Falcato <pedro.falcato@gmail.com>
Cc: Stephen Röttger <sroettger@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Amer Al Shanawany <amer.shanawany@gmail.com>
Cc: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
11 months agoMerge tag 'nfs-for-6.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Thu, 23 May 2024 20:51:09 +0000 (13:51 -0700)]
Merge tag 'nfs-for-6.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Stable fixes:
   - nfs: fix undefined behavior in nfs_block_bits()
   - NFSv4.2: Fix READ_PLUS when server doesn't support OP_READ_PLUS

  Bugfixes:
   - Fix mixing of the lock/nolock and local_lock mount options
   - NFSv4: Fixup smatch warning for ambiguous return
   - NFSv3: Fix remount when using the legacy binary mount api
   - SUNRPC: Fix the handling of expired RPCSEC_GSS contexts
   - SUNRPC: fix the NFSACL RPC retries when soft mounts are enabled
   - rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL

  Features and cleanups:
   - NFSv3: Use the atomic_open API to fix open(O_CREAT|O_TRUNC)
   - pNFS/filelayout: S layout segment range in LAYOUTGET
   - pNFS: rework pnfs_generic_pg_check_layout to check IO range
   - NFSv2: Turn off enabling of NFS v2 by default"

* tag 'nfs-for-6.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs: fix undefined behavior in nfs_block_bits()
  pNFS: rework pnfs_generic_pg_check_layout to check IO range
  pNFS/filelayout: check layout segment range
  pNFS/filelayout: fixup pNfs allocation modes
  rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL
  NFS: Don't enable NFS v2 by default
  NFS: Fix READ_PLUS when server doesn't support OP_READ_PLUS
  sunrpc: fix NFSACL RPC retry on soft mount
  SUNRPC: fix handling expired GSS context
  nfs: keep server info for remounts
  NFSv4: Fixup smatch warning for ambiguous return
  NFS: make sure lock/nolock overriding local_lock mount option
  NFS: add atomic_open for NFSv3 to handle O_TRUNC correctly.
  pNFS/filelayout: Specify the layout segment range in LAYOUTGET
  pNFS/filelayout: Remove the whole file layout requirement

11 months agoMerge tag 'block-6.10-20240523' of git://git.kernel.dk/linux
Linus Torvalds [Thu, 23 May 2024 20:44:47 +0000 (13:44 -0700)]
Merge tag 'block-6.10-20240523' of git://git.kernel.dk/linux

Pull more block updates from Jens Axboe:
 "Followup block updates, mostly due to NVMe being a bit late to the
  party. But nothing major in there, so not a big deal.

  In detail, this contains:

   - NVMe pull request via Keith:
       - Fabrics connection retries (Daniel, Hannes)
       - Fabrics logging enhancements (Tokunori)
       - RDMA delete optimization (Sagi)

   - ublk DMA alignment fix (me)

   - null_blk sparse warning fixes (Bart)

   - Discard support for brd (Keith)

   - blk-cgroup list corruption fixes (Ming)

   - blk-cgroup stat propagation fix (Waiman)

   - Regression fix for plugging stall with md (Yu)

   - Misc fixes or cleanups (David, Jeff, Justin)"

* tag 'block-6.10-20240523' of git://git.kernel.dk/linux: (24 commits)
  null_blk: fix null-ptr-dereference while configuring 'power' and 'submit_queues'
  blk-throttle: remove unused struct 'avg_latency_bucket'
  block: fix lost bio for plug enabled bio based device
  block: t10-pi: add MODULE_DESCRIPTION()
  blk-mq: add helper for checking if one CPU is mapped to specified hctx
  blk-cgroup: Properly propagate the iostat update up the hierarchy
  blk-cgroup: fix list corruption from reorder of WRITE ->lqueued
  blk-cgroup: fix list corruption from resetting io stat
  cdrom: rearrange last_media_change check to avoid unintentional overflow
  nbd: Fix signal handling
  nbd: Remove a local variable from nbd_send_cmd()
  nbd: Improve the documentation of the locking assumptions
  nbd: Remove superfluous casts
  nbd: Use NULL to represent a pointer
  brd: implement discard support
  null_blk: Fix two sparse warnings
  ublk_drv: set DMA alignment mask to 3
  nvme-rdma, nvme-tcp: include max reconnects for reconnect logging
  nvmet-rdma: Avoid o(n^2) loop in delete_ctrl
  nvme: do not retry authentication failures
  ...

11 months agoMerge tag 'io_uring-6.10-20240523' of git://git.kernel.dk/linux
Linus Torvalds [Thu, 23 May 2024 20:41:49 +0000 (13:41 -0700)]
Merge tag 'io_uring-6.10-20240523' of git://git.kernel.dk/linux

Pull io_uring fixes from Jens Axboe:
 "Single fix here for a regression in 6.9, and then a simple cleanup
  removing some dead code"

* tag 'io_uring-6.10-20240523' of git://git.kernel.dk/linux:
  io_uring: remove checks for NULL 'sq_offset'
  io_uring/sqpoll: ensure that normal task_work is also run timely

11 months agoMerge tag 'regulator-fix-v6.10-merge-window' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Thu, 23 May 2024 20:39:42 +0000 (13:39 -0700)]
Merge tag 'regulator-fix-v6.10-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "A bunch of fixes that came in during the merge window.

  Matti found several issues with some of the more complexly configured
  Rohm regulators and the helpers they use and there were some errors in
  the specification of tps6594 when regulators are grouped together"

* tag 'regulator-fix-v6.10-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: tps6594-regulator: Correct multi-phase configuration
  regulator: tps6287x: Force writing VSEL bit
  regulator: pickable ranges: don't always cache vsel
  regulator: rohm-regulator: warn if unsupported voltage is set
  regulator: bd71828: Don't overwrite runtime voltages

11 months agoMerge tag 'regmap-fix-v6.10-merge-window' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Thu, 23 May 2024 20:38:31 +0000 (13:38 -0700)]
Merge tag 'regmap-fix-v6.10-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap fix from Mark Brown:
 "Guenter ran with memory sanitisers and found an issue in the new KUnit
  tests that Richard added where an assumption in older test code was
  exposed, this was fixed quickly by Richard"

* tag 'regmap-fix-v6.10-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: kunit: Fix array overflow in stride() test

11 months agogenirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline
Dongli Zhang [Wed, 22 May 2024 22:02:18 +0000 (15:02 -0700)]
genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline

The absence of IRQD_MOVE_PCNTXT prevents immediate effectiveness of
interrupt affinity reconfiguration via procfs. Instead, the change is
deferred until the next instance of the interrupt being triggered on the
original CPU.

When the interrupt next triggers on the original CPU, the new affinity is
enforced within __irq_move_irq(). A vector is allocated from the new CPU,
but the old vector on the original CPU remains and is not immediately
reclaimed. Instead, apicd->move_in_progress is flagged, and the reclaiming
process is delayed until the next trigger of the interrupt on the new CPU.

Upon the subsequent triggering of the interrupt on the new CPU,
irq_complete_move() adds a task to the old CPU's vector_cleanup list if it
remains online. Subsequently, the timer on the old CPU iterates over its
vector_cleanup list, reclaiming old vectors.

However, a rare scenario arises if the old CPU is outgoing before the
interrupt triggers again on the new CPU.

In that case irq_force_complete_move() is not invoked on the outgoing CPU
to reclaim the old apicd->prev_vector because the interrupt isn't currently
affine to the outgoing CPU, and irq_needs_fixup() returns false. Even
though __vector_schedule_cleanup() is later called on the new CPU, it
doesn't reclaim apicd->prev_vector; instead, it simply resets both
apicd->move_in_progress and apicd->prev_vector to 0.

As a result, the vector remains unreclaimed in vector_matrix, leading to a
CPU vector leak.

To address this issue, move the invocation of irq_force_complete_move()
before the irq_needs_fixup() call to reclaim apicd->prev_vector, if the
interrupt is currently or used to be affine to the outgoing CPU.

Additionally, reclaim the vector in __vector_schedule_cleanup() as well,
following a warning message, although theoretically it should never see
apicd->move_in_progress with apicd->prev_cpu pointing to an offline CPU.

Fixes: f0383c24b485 ("genirq/cpuhotplug: Add support for cleaning up move in progress")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240522220218.162423-1-dongli.zhang@oracle.com