www.infradead.org Git - users/willy/xarray.git/log
3 months ago HID: appletb-kbd: fix memory corruption of input_handler_list
Qasim Ijaz [Fri, 27 Jun 2025 11:01:21 +0000 (12:01 +0100)]
HID: appletb-kbd: fix memory corruption of input_handler_list

In appletb_kbd_probe an input handler is initialised and then registered
with input core through input_register_handler(). When this happens input
core will add the input handler (specifically its node) to the global
input_handler_list. The input_handler_list is central to the functionality
of input core and is traversed in various places in input core. An example
of this is when a new input device is plugged in and gets registered with
input core.

The input_handler in probe is allocated as device managed memory. If a
probe failure occurs after input_register_handler() the input_handler
memory is freed, yet it will remain in the input_handler_list. This
effectively means the input_handler_list contains a dangling pointer
to data belonging to a freed input handler.

This causes an issue when any other input device is plugged in. In my
case I plugged in an old PixArt HP USB optical mouse after a failure had
occurred following input_register_handler(). This led to the registration
of the mouse via input_register_device(), which traverses every handler
in the corrupted input_handler_list and calls input_attach_handler(),
giving each handler a chance to bind to the newly registered device.

The core of this bug is a UAF which causes memory corruption of
input_handler_list. To fix it we must ensure the input handler is
unregistered from input core when probe fails after registration; this is
done through input_unregister_handler().
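
The failure mode can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver code: struct handler, handler_list, register_handler(), unregister_handler() and probe() are hypothetical stand-ins for the input core types and calls named above. It shows the error path unlinking the handler before its memory is freed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct input_handler; not the kernel type. */
struct handler {
    struct handler *next;
    const char *name;
};

/* Hypothetical stand-in for the global input_handler_list. */
static struct handler *handler_list;

void register_handler(struct handler *h)
{
    h->next = handler_list;
    handler_list = h;
}

void unregister_handler(struct handler *h)
{
    for (struct handler **pp = &handler_list; *pp; pp = &(*pp)->next) {
        if (*pp == h) {
            *pp = h->next;
            return;
        }
    }
}

size_t count_handlers(void)
{
    size_t n = 0;

    for (struct handler *h = handler_list; h; h = h->next)
        n++;
    return n;
}

/*
 * Model of a probe routine. later_step_fails simulates an error that
 * happens after registration. Returns 0 on success, -1 on failure.
 */
int probe(bool later_step_fails, struct handler **out)
{
    struct handler *h = calloc(1, sizeof(*h));

    if (!h)
        return -1;
    h->name = "kbd";
    register_handler(h);

    if (later_step_fails) {
        /* Unlink before freeing, so the list never points at freed memory. */
        unregister_handler(h);
        free(h);
        return -1;
    }
    *out = h;
    return 0;
}
```

Dropping the unregister_handler() call from the error path of this model leaves a freed node on handler_list, which is the dangling pointer the KASAN report below catches.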

[   63.191597] ==================================================================
[   63.192094] BUG: KASAN: slab-use-after-free in input_attach_handler.isra.0+0x1a9/0x1e0
[   63.192094] Read of size 8 at addr ffff888105ea7c80 by task kworker/0:2/54
[   63.192094]
[   63.192094] CPU: 0 UID: 0 PID: 54 Comm: kworker/0:2 Not tainted 6.16.0-rc2-00321-g2aa6621d
[   63.192094] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.164
[   63.192094] Workqueue: usb_hub_wq hub_event
[   63.192094] Call Trace:
[   63.192094]  <TASK>
[   63.192094]  dump_stack_lvl+0x53/0x70
[   63.192094]  print_report+0xce/0x670
[   63.192094]  kasan_report+0xce/0x100
[   63.192094]  input_attach_handler.isra.0+0x1a9/0x1e0
[   63.192094]  input_register_device+0x76c/0xd00
[   63.192094]  hidinput_connect+0x686d/0xad60
[   63.192094]  hid_connect+0xf20/0x1b10
[   63.192094]  hid_hw_start+0x83/0x100
[   63.192094]  hid_device_probe+0x2d1/0x680
[   63.192094]  really_probe+0x1c3/0x690
[   63.192094]  __driver_probe_device+0x247/0x300
[   63.192094]  driver_probe_device+0x49/0x210
[   63.192094]  __device_attach_driver+0x160/0x320
[   63.192094]  bus_for_each_drv+0x10f/0x190
[   63.192094]  __device_attach+0x18e/0x370
[   63.192094]  bus_probe_device+0x123/0x170
[   63.192094]  device_add+0xd4d/0x1460
[   63.192094]  hid_add_device+0x30b/0x910
[   63.192094]  usbhid_probe+0x920/0xe00
[   63.192094]  usb_probe_interface+0x363/0x9a0
[   63.192094]  really_probe+0x1c3/0x690
[   63.192094]  __driver_probe_device+0x247/0x300
[   63.192094]  driver_probe_device+0x49/0x210
[   63.192094]  __device_attach_driver+0x160/0x320
[   63.192094]  bus_for_each_drv+0x10f/0x190
[   63.192094]  __device_attach+0x18e/0x370
[   63.192094]  bus_probe_device+0x123/0x170
[   63.192094]  device_add+0xd4d/0x1460
[   63.192094]  usb_set_configuration+0xd14/0x1880
[   63.192094]  usb_generic_driver_probe+0x78/0xb0
[   63.192094]  usb_probe_device+0xaa/0x2e0
[   63.192094]  really_probe+0x1c3/0x690
[   63.192094]  __driver_probe_device+0x247/0x300
[   63.192094]  driver_probe_device+0x49/0x210
[   63.192094]  __device_attach_driver+0x160/0x320
[   63.192094]  bus_for_each_drv+0x10f/0x190
[   63.192094]  __device_attach+0x18e/0x370
[   63.192094]  bus_probe_device+0x123/0x170
[   63.192094]  device_add+0xd4d/0x1460
[   63.192094]  usb_new_device+0x7b4/0x1000
[   63.192094]  hub_event+0x234d/0x3fa0
[   63.192094]  process_one_work+0x5bf/0xfe0
[   63.192094]  worker_thread+0x777/0x13a0
[   63.192094]  </TASK>
[   63.192094]
[   63.192094] Allocated by task 54:
[   63.192094]  kasan_save_stack+0x33/0x60
[   63.192094]  kasan_save_track+0x14/0x30
[   63.192094]  __kasan_kmalloc+0x8f/0xa0
[   63.192094]  __kmalloc_node_track_caller_noprof+0x195/0x420
[   63.192094]  devm_kmalloc+0x74/0x1e0
[   63.192094]  appletb_kbd_probe+0x39/0x440
[   63.192094]  hid_device_probe+0x2d1/0x680
[   63.192094]  really_probe+0x1c3/0x690
[   63.192094]  __driver_probe_device+0x247/0x300
[   63.192094]  driver_probe_device+0x49/0x210
[   63.192094]  __device_attach_driver+0x160/0x320
[...]
[   63.192094]
[   63.192094] Freed by task 54:
[   63.192094]  kasan_save_stack+0x33/0x60
[   63.192094]  kasan_save_track+0x14/0x30
[   63.192094]  kasan_save_free_info+0x3b/0x60
[   63.192094]  __kasan_slab_free+0x37/0x50
[   63.192094]  kfree+0xcf/0x360
[   63.192094]  devres_release_group+0x1f8/0x3c0
[   63.192094]  hid_device_probe+0x315/0x680
[   63.192094]  really_probe+0x1c3/0x690
[   63.192094]  __driver_probe_device+0x247/0x300
[   63.192094]  driver_probe_device+0x49/0x210
[   63.192094]  __device_attach_driver+0x160/0x320
[...]

Fixes: 7d62ba8deacf ("HID: hid-appletb-kbd: add support for fn toggle between media and function mode")
Cc: stable@vger.kernel.org
Reviewed-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: lenovo: Add support for ThinkPad X1 Tablet Thin Keyboard Gen2
Akira Inoue [Thu, 12 Jun 2025 04:34:38 +0000 (13:34 +0900)]
HID: lenovo: Add support for ThinkPad X1 Tablet Thin Keyboard Gen2

Add the "Thinkpad X1 Tablet Gen 2 Keyboard" PID to the hid-lenovo driver to fix the trackpoint not working.

Signed-off-by: Akira Inoue <niyarium@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: appletb-kbd: fix "appletb_backlight" backlight device reference counting
Qasim Ijaz [Sun, 15 Jun 2025 22:59:41 +0000 (23:59 +0100)]
HID: appletb-kbd: fix "appletb_backlight" backlight device reference counting

During appletb_kbd_probe, probe attempts to get the backlight device
by name. When this happens backlight_device_get_by_name looks for a
device in the backlight class which has name "appletb_backlight" and
upon finding a match it increments the reference count for the device
and returns it to the caller. However this reference is never released
leading to a reference leak.

Fix this by dropping the backlight device reference with put_device(),
both on removal and on probe failure.
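
The get/put pairing can be sketched in userspace. This is an illustrative model only: struct device here, get_by_name(), put_ref(), probe() and remove_dev() are hypothetical stand-ins, not the backlight class API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical refcounted device; a stand-in, not struct backlight_device. */
struct device {
    const char *name;
    int refcount;
};

/* One registered device, holding its own initial reference. */
static struct device backlight = { "appletb_backlight", 1 };

/* Look a device up by name; on a match, take a reference for the caller. */
struct device *get_by_name(const char *name)
{
    if (strcmp(backlight.name, name) != 0)
        return NULL;
    backlight.refcount++;
    return &backlight;
}

/* Drop a reference previously taken by get_by_name(). */
void put_ref(struct device *dev)
{
    if (dev)
        dev->refcount--;
}

/* Probe model: later_step_fails simulates an error after the lookup. */
int probe(int later_step_fails, struct device **out)
{
    struct device *dev = get_by_name("appletb_backlight");

    if (later_step_fails) {
        put_ref(dev);   /* error path: release the reference */
        return -1;
    }
    *out = dev;
    return 0;
}

/* Remove model: release the reference held since probe. */
void remove_dev(struct device *dev)
{
    put_ref(dev);
}

int current_refcount(void)
{
    return backlight.refcount;
}
```

In this model the count returns to its initial value after a failed probe and after a probe followed by remove, which is the balance the fix restores.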

Fixes: 93a0fc489481 ("HID: hid-appletb-kbd: add support for automatic brightness control while using the touchbar")
Cc: stable@vger.kernel.org
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Reviewed-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: wacom: fix crash in wacom_aes_battery_handler()
Thomas Zeitlhofer [Mon, 19 May 2025 08:54:46 +0000 (10:54 +0200)]
HID: wacom: fix crash in wacom_aes_battery_handler()

Commit fd2a9b29dc9c ("HID: wacom: Remove AES power_supply after extended
inactivity") introduced wacom_aes_battery_handler() which is scheduled
as a delayed work (aes_battery_work).

In wacom_remove(), aes_battery_work is not canceled. Consequently, if
the device is removed while aes_battery_work is still pending, then hard
crashes or "Oops: general protection fault..." are experienced when
wacom_aes_battery_handler() is finally called. E.g., this happens with
built-in USB devices after resume from hibernate when aes_battery_work
was still pending at the time of hibernation.

So, take care to cancel aes_battery_work in wacom_remove().
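
A small userspace model shows why the ordering matters. Everything here is a hypothetical stand-in: the one-slot queue, schedule_battery_work(), cancel_battery_work(), run_pending_work() and remove_device() only imitate delayed work and the remove path. Which kernel cancel call the patch uses is not stated in this message.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical device state; a stand-in for the driver's private data. */
struct dev_state {
    int battery_checks;
};

/* Hypothetical one-slot "delayed work" queue. */
struct delayed_work {
    bool pending;
    struct dev_state *data;
};

static struct delayed_work battery_work;

void schedule_battery_work(struct dev_state *st)
{
    battery_work.data = st;
    battery_work.pending = true;
}

/* Returns true if a pending work item was cancelled. */
bool cancel_battery_work(void)
{
    bool was_pending = battery_work.pending;

    battery_work.pending = false;
    battery_work.data = NULL;
    return was_pending;
}

/* Timer expiry: runs the handler if work is still pending. */
bool run_pending_work(void)
{
    if (!battery_work.pending)
        return false;
    battery_work.pending = false;
    /* Without the cancel in remove_device(), this would touch freed memory. */
    battery_work.data->battery_checks++;
    return true;
}

/* Remove model: cancel first, then free the state the handler uses. */
void remove_device(struct dev_state *st)
{
    cancel_battery_work();
    free(st);
}
```

After remove_device() the handler no longer runs, so nothing dereferences the freed state.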

Fixes: fd2a9b29dc9c ("HID: wacom: Remove AES power_supply after extended inactivity")
Signed-off-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
Acked-by: Ping Cheng <ping.cheng@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: intel-ish-hid: ipc: Add Wildcat Lake PCI device ID
Zhang Lixu [Tue, 10 Jun 2025 02:01:32 +0000 (10:01 +0800)]
HID: intel-ish-hid: ipc: Add Wildcat Lake PCI device ID

Add device ID of Wildcat Lake into ishtp support list.

Signed-off-by: Zhang Lixu <lixu.zhang@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago hid: intel-ish-hid: Use PCI_DEVICE_DATA() macro for ISH device table
Zhang Lixu [Tue, 10 Jun 2025 02:01:31 +0000 (10:01 +0800)]
hid: intel-ish-hid: Use PCI_DEVICE_DATA() macro for ISH device table

Replace the usage of PCI_VDEVICE() with driver_data assignment in the ISH
PCI device table with the PCI_DEVICE_DATA() macro. This improves code
readability.
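
The difference between the two table styles can be sketched with simplified macros. This is an assumption-laden illustration: struct pci_id, SKETCH_VDEVICE() and SKETCH_DEVICE_DATA() are cut-down stand-ins for the kernel's struct pci_device_id, PCI_VDEVICE() and PCI_DEVICE_DATA(), and the device ID 0x1234 and the data value 7 are made up.

```c
#include <stdint.h>

/* Simplified stand-in for struct pci_device_id (the real one has more fields). */
struct pci_id {
    uint16_t vendor;
    uint16_t device;
    unsigned long driver_data;
};

#define VENDOR_ID_INTEL 0x8086
/* Hypothetical device ID and per-device data, for illustration only. */
#define DEVICE_ID_INTEL_EXAMPLE 0x1234
#define EXAMPLE_DATA 7UL

/* Old style: the macro sets vendor/device; driver_data is assigned separately. */
#define SKETCH_VDEVICE(vend, dev) \
    .vendor = VENDOR_ID_##vend, .device = (dev)

/* New style: one macro pastes the ID name together and sets driver_data. */
#define SKETCH_DEVICE_DATA(vend, dev, data) \
    .vendor = VENDOR_ID_##vend, \
    .device = DEVICE_ID_##vend##_##dev, \
    .driver_data = (unsigned long)(data)

const struct pci_id old_style = {
    SKETCH_VDEVICE(INTEL, DEVICE_ID_INTEL_EXAMPLE), .driver_data = EXAMPLE_DATA
};

const struct pci_id new_style = {
    SKETCH_DEVICE_DATA(INTEL, EXAMPLE, EXAMPLE_DATA)
};

/* Returns non-zero when both entries describe the same device and data. */
int same_entry(const struct pci_id *a, const struct pci_id *b)
{
    return a->vendor == b->vendor && a->device == b->device &&
           a->driver_data == b->driver_data;
}
```

Both initializers produce the same entry; the second needs only the short device name because the macro builds the full identifier.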

Signed-off-by: Zhang Lixu <lixu.zhang@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: lenovo: Restrict F7/9/11 mode to compact keyboards only
Iusico Maxim [Thu, 5 Jun 2025 17:55:50 +0000 (19:55 +0200)]
HID: lenovo: Restrict F7/9/11 mode to compact keyboards only

Commit 2f2bd7cbd1d1 ("hid: lenovo: Resend all settings on reset_resume
for compact keyboards") introduced a regression for ThinkPad TrackPoint
Keyboard II by removing the conditional check for enabling F7/9/11 mode
needed for compact keyboards only. As a result, the non-compact
keyboards can no longer toggle Fn-lock via Fn+Esc, although Fn-lock can
still be controlled via the sysfs knob that directly sends raw commands.

This patch restores the previous conditional check without any
additions.

Cc: stable@vger.kernel.org
Fixes: 2f2bd7cbd1d1 ("hid: lenovo: Resend all settings on reset_resume for compact keyboards")
Signed-off-by: Iusico Maxim <iusico.maxim@libero.it>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: Add IGNORE quirk for SMARTLINKTECHNOLOGY
Zhang Heng [Thu, 5 Jun 2025 07:29:59 +0000 (15:29 +0800)]
HID: Add IGNORE quirk for SMARTLINKTECHNOLOGY

SMARTLINKTECHNOLOGY is a microphone device. When the HID interface of
this audio device is asked for a specific report ID, the following error
may occur.

[  562.939373] usb 1-1.4.1.2: new full-speed USB device number 21 using xhci_hcd
[  563.104908] usb 1-1.4.1.2: New USB device found, idVendor=4c4a, idProduct=4155, bcdDevice= 1.00
[  563.104910] usb 1-1.4.1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  563.104911] usb 1-1.4.1.2: Product: USB Composite Device
[  563.104912] usb 1-1.4.1.2: Manufacturer: SmartlinkTechnology
[  563.104913] usb 1-1.4.1.2: SerialNumber: 20201111000001
[  563.229499] input: SmartlinkTechnology USB Composite Device as /devices/pci0000:00/0000:00:07.1/0000:04:00.3/usb1/1-1/1-1.4/1-1.4.1/1-1.4.1.2/1-1.4.1.2:1.2/0003:4C4A:4155.000F/input/input35
[  563.291505] hid-generic 0003:4C4A:4155.000F: input,hidraw2: USB HID v2.01 Keyboard [SmartlinkTechnology USB Composite Device] on usb-0000:04:00.3-1.4.1.2/input2
[  563.291557] usbhid 1-1.4.1.2:1.3: couldn't find an input interrupt endpoint
[  568.506654] usb 1-1.4.1.2: 1:1: usb_set_interface failed (-110)
[  573.626656] usb 1-1.4.1.2: 1:1: usb_set_interface failed (-110)
[  578.746657] usb 1-1.4.1.2: 1:1: usb_set_interface failed (-110)
[  583.866655] usb 1-1.4.1.2: 1:1: usb_set_interface failed (-110)
[  588.986657] usb 1-1.4.1.2: 1:1: usb_set_interface failed (-110)

Ignore the HID interface. The device is working properly.

Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: input: lower message severity of 'No inputs registered, leaving' to debug
Mario Limonciello [Fri, 23 May 2025 16:10:07 +0000 (11:10 -0500)]
HID: input: lower message severity of 'No inputs registered, leaving' to debug

Plugging in a "Blue snowball" microphone always shows the
error 'No inputs registered, leaving', but the device functions as
intended.

When a HID device is started using the function hid_hw_start() with
the argument HID_CONNECT_DEFAULT, it will try all of the various HID
connect requests. Not all devices will create an input device, so the
message is needlessly noisy. Decrease it to debug instead.

[jkosina@suse.com: edit shortlog]
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: quirks: Add quirk for 2 Chicony Electronics HP 5MP Cameras
Chia-Lin Kao (AceLan) [Tue, 6 May 2025 05:50:15 +0000 (13:50 +0800)]
HID: quirks: Add quirk for 2 Chicony Electronics HP 5MP Cameras

The Chicony Electronics HP 5MP Cameras (USB ID 04F2:B824 & 04F2:B82C)
report a HID sensor interface that is not actually implemented.
Attempting to access this non-functional sensor via iio_info causes
system hangs as runtime PM tries to wake up an unresponsive sensor.

Add these 2 devices to the HID ignore list since the sensor interface is
non-functional by design and should not be exposed to userspace.
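
An ignore list of this kind amounts to a table lookup on vendor and product ID. The sketch below uses the two USB IDs from this message; the table layout and the names struct usb_id, ignore_list and is_ignored() are hypothetical and do not mirror the actual quirks table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct usb_id {
    uint16_t vendor;
    uint16_t product;
};

/* The two USB IDs named in the commit message; the table layout is hypothetical. */
static const struct usb_id ignore_list[] = {
    { 0x04F2, 0xB824 },
    { 0x04F2, 0xB82C },
};

/* Returns true if the device should not be bound by the HID layer. */
bool is_ignored(uint16_t vendor, uint16_t product)
{
    for (size_t i = 0; i < sizeof(ignore_list) / sizeof(ignore_list[0]); i++) {
        if (ignore_list[i].vendor == vendor &&
            ignore_list[i].product == product)
            return true;
    }
    return false;
}
```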

Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: Intel-thc-hid: Intel-quicki2c: Enhance QuickI2C reset flow
Even Xu [Wed, 14 May 2025 06:26:38 +0000 (14:26 +0800)]
HID: Intel-thc-hid: Intel-quicki2c: Enhance QuickI2C reset flow

During customer board enabling, it was found that some touch devices
prepared a reset response, but either the device did not send the
interrupt or THC missed the reset interrupt because of a timing issue.
The THC QuickI2C driver depends on the interrupt to read the reset
response, so in this case the driver times out waiting.

This patch enhances the flow by manually reading the reset response
after the wait for the reset interrupt times out.
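
The enhanced flow is a wait with a fallback. The following userspace sketch models it under stated assumptions: struct touch_dev, wait_for_reset_irq(), read_reset_response() and reset_flow() are hypothetical names, and the timeout is reduced to a boolean.

```c
#include <stdbool.h>

/* Hypothetical model of a touch device after a reset request. */
struct touch_dev {
    bool response_ready;   /* device has prepared its reset response */
    bool interrupt_seen;   /* host observed the reset interrupt */
};

enum reset_result { RESET_OK_IRQ, RESET_OK_MANUAL, RESET_FAILED };

/* Stand-in for waiting on the interrupt with a timeout; true if it arrived. */
bool wait_for_reset_irq(const struct touch_dev *d)
{
    return d->interrupt_seen;
}

/* Stand-in for reading the reset response directly, without an interrupt. */
bool read_reset_response(const struct touch_dev *d)
{
    return d->response_ready;
}

enum reset_result reset_flow(const struct touch_dev *d)
{
    if (wait_for_reset_irq(d))
        return read_reset_response(d) ? RESET_OK_IRQ : RESET_FAILED;

    /* Timed out: the interrupt may have been lost, so try a manual read. */
    if (read_reset_response(d))
        return RESET_OK_MANUAL;

    return RESET_FAILED;
}
```

A device that prepared its response but never raised the interrupt now completes the reset through the manual read instead of failing on the timeout.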

Signed-off-by: Even Xu <even.xu@intel.com>
Tested-by: Chong Han <chong.han@intel.com>
Fixes: 66b59bfce6d9 ("HID: intel-thc-hid: intel-quicki2c: Complete THC QuickI2C driver")
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: nintendo: avoid bluetooth suspend/resume stalls
Daniel J. Ogorchock [Tue, 13 May 2025 07:47:00 +0000 (03:47 -0400)]
HID: nintendo: avoid bluetooth suspend/resume stalls

Ensure we don't stall or panic the kernel when using bluetooth-connected
controllers. This was reported as an issue on android devices using
kernel 6.6 due to the resume hook which had been added for usb joycons.

First, set a new state value to JOYCON_CTLR_STATE_SUSPENDED in a
newly-added nintendo_hid_suspend. This makes sure we will not stall out
the kernel waiting for input reports during led classdev suspend. The
stalls could happen if connectivity is unreliable or lost to the
controller prior to suspend.

Second, since we lose connectivity during suspend, do not try
joycon_init() for bluetooth controllers in the nintendo_hid_resume path.

Tested via multiple suspend/resume flows when using the controller both
in USB and bluetooth modes.

Signed-off-by: Daniel J. Ogorchock <djogorchock@gmail.com>
Reviewed-by: Silvan Jegen <s.jegen@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: wacom: fix kobject reference count leak
Qasim Ijaz [Fri, 6 Jun 2025 18:49:59 +0000 (19:49 +0100)]
HID: wacom: fix kobject reference count leak

When sysfs_create_files() fails in wacom_initialize_remotes() the error
is returned and the cleanup action will not have been registered yet.

As a result the kobject's refcount is never dropped, so the
kobject can never be freed leading to a reference leak.

Fix this by calling kobject_put() before returning.

Fixes: 83e6b40e2de6 ("HID: wacom: EKR: have the wacom resources dynamically allocated")
Acked-by: Ping Cheng <ping.cheng@wacom.com>
Cc: stable@vger.kernel.org
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: wacom: fix memory leak on sysfs attribute creation failure
Qasim Ijaz [Fri, 6 Jun 2025 18:49:58 +0000 (19:49 +0100)]
HID: wacom: fix memory leak on sysfs attribute creation failure

When sysfs_create_files() fails during wacom_initialize_remotes() the
fifo buffer is not freed leading to a memory leak.

Fix this by calling kfifo_free() before returning.

Fixes: 83e6b40e2de6 ("HID: wacom: EKR: have the wacom resources dynamically allocated")
Reviewed-by: Ping Cheng <ping.cheng@wacom.com>
Cc: stable@vger.kernel.org
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago HID: wacom: fix memory leak on kobject creation failure
Qasim Ijaz [Fri, 6 Jun 2025 18:49:57 +0000 (19:49 +0100)]
HID: wacom: fix memory leak on kobject creation failure

During wacom_initialize_remotes() a fifo buffer is allocated
with kfifo_alloc() and later a cleanup action is registered
via devm_add_action_or_reset() to clean it up.

However if the code fails to create a kobject and register it
with sysfs the code simply returns -ENOMEM before the cleanup
action is registered leading to a memory leak.

Fix this by ensuring the fifo is freed when the kobject creation
and registration process fails.
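
The leak and its fix follow a common error-path pattern: anything allocated before the cleanup action exists must be freed by hand on early returns. The sketch below is a userspace model only; fifo_alloc(), fifo_free(), init_remotes() and the live_fifos counter are hypothetical stand-ins for kfifo_alloc(), kfifo_free() and wacom_initialize_remotes().

```c
#include <stdbool.h>
#include <stdlib.h>

/* Counts live fifo buffers so a leak is observable in this model. */
static int live_fifos;

void *fifo_alloc(void)
{
    void *buf = malloc(64);

    if (buf)
        live_fifos++;
    return buf;
}

void fifo_free(void *buf)
{
    if (buf) {
        free(buf);
        live_fifos--;
    }
}

/*
 * Model of an init routine: allocate a fifo, then create an object.
 * kobject_fails simulates the object creation failing. Returns 0 on
 * success (the caller then owns *fifo_out), or -1 on failure.
 */
int init_remotes(bool kobject_fails, void **fifo_out)
{
    void *fifo = fifo_alloc();

    if (!fifo)
        return -1;

    if (kobject_fails) {
        /* No cleanup action is registered yet, so free the fifo here. */
        fifo_free(fifo);
        return -1;
    }

    *fifo_out = fifo;
    return 0;
}

int live_fifo_count(void)
{
    return live_fifos;
}
```

Without the fifo_free() call in the failure branch, live_fifo_count() would stay at 1 after a failed init, which is the leak described above.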

Fixes: 83e6b40e2de6 ("HID: wacom: EKR: have the wacom resources dynamically allocated")
Reviewed-by: Ping Cheng <ping.cheng@wacom.com>
Cc: stable@vger.kernel.org
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
4 months ago Merge tag 'hid-for-linus-2025060301' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 3 Jun 2025 17:34:36 +0000 (10:34 -0700)]
Merge tag 'hid-for-linus-2025060301' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID updates from Jiri Kosina:

 - support for Apple Magic Mouse 2 USB-C (Aditya Garg)

 - power management improvement for multitouch devices (Werner Sembach)

 - fix for ACPI initialization in intel-thc driver (Wentao Guan)

 - adaptation of HID drivers to use new gpio_chip's line setter
   callbacks (Bartosz Golaszewski)

 - fix potential OOB in usbhid_parse() (Terry Junge)

 - make it possible to set hid_mouse_ignore_list dynamically (the same
   way we handle other quirks) (Aditya Garg)

 - other small assorted fixes and device ID additions

* tag 'hid-for-linus-2025060301' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: multitouch: Disable touchpad on firmware level while not in use
  HID: core: Add functions for HID drivers to react on first open and last close call
  HID: HID_APPLETB_BL should depend on X86
  HID: HID_APPLETB_KBD should depend on X86
  HID: appletb-kbd: Use secs_to_jiffies() instead of msecs_to_jiffies()
  HID: intel-thc-hid: intel-thc: make read-only arrays static const
  HID: magicmouse: Apple Magic Mouse 2 USB-C support
  HID: mcp2221: use new line value setter callbacks
  HID: mcp2200: use new line value setter callbacks
  HID: cp2112: use new line value setter callbacks
  HID: cp2112: use lock guards
  HID: cp2112: hold the lock for the entire direction_output() call
  HID: cp2112: destroy mutex on driver detach
  HID: intel-thc-hid: intel-quicki2c: pass correct arguments to acpi_evaluate_object
  HID: corsair-void: Use to_delayed_work()
  HID: hid-logitech: use sysfs_emit_at() instead of scnprintf()
  HID: quirks: Add HID_QUIRK_IGNORE_MOUSE quirk
  HID: usbhid: Eliminate recurrent out-of-bounds bug in usbhid_parse()
  HID: Kysona: Add periodic online check

4 months ago Merge tag 'ata-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata...
Linus Torvalds [Tue, 3 Jun 2025 16:42:38 +0000 (09:42 -0700)]
Merge tag 'ata-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux

Pull ata updates from Damien Le Moal:

 - Simplify ata_print_version_once() using dev_dbg_once() (Heiner)

 - Some cleanups of libata-sata code to simplify the sense data fetching
   code and use BIT() macro for tag bit handling (Niklas)

 - Fix variable name spelling in the sata_sx4 driver (Colin)

 - Improve sense data information field handling for passthrough
   commands (Igor)

 - Add Rockchip RK3576 SoC compatible to the Designware AHCI DT bindings
   (Nicolas)

 - Add a message to indicate if a port is marked as external or not, to
   help with debugging potential issues with LPM (Niklas)

 - Convert DT bindings for "ti,dm816-ahci", "apm,xgene-ahci",
   "cavium,ebt3000-compact-flash", "marvell,orion-sata", and
   "arasan,cf-spear1340" to DT schema (Rob)

 - Cleanup and improve the code and related comments for HIPM and DIPM
   (host initiated and device initiated power management) handling.

   In particular, keep DIPM disabled while modifying the allowed LPM
   states to avoid races with the device initiating power state changes
   (Niklas)

* tag 'ata-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: libata-eh: Keep DIPM disabled while modifying the allowed LPM states
  ata: libata-eh: Rename no_dipm variable to be more clear
  ata: libata-eh: Rename hipm and dipm variables
  ata: libata-eh: Add ata_eh_set_lpm() WARN_ON_ONCE
  ata: libata-eh: Update DIPM comments to reflect reality
  dt-bindings: ata: Convert arasan,cf-spear1340 to DT schema
  dt-bindings: ata: Convert marvell,orion-sata to DT schema
  dt-bindings: ata: Convert cavium,ebt3000-compact-flash to DT schema
  dt-bindings: ata: Convert apm,xgene-ahci to DT schema
  dt-bindings: ata: Convert st,ahci to DT schema
  dt-bindings: ata: Convert ti,dm816-ahci to DT schema
  ata: libata: Print if port is external on boot
  dt-bindings: ata: rockchip-dwc-ahci: add RK3576 compatible
  ata: libata-scsi: Do not set the INFORMATION field twice for ATA PT
  ata: sata_sx4: Fix spelling mistake "parttern" -> "pattern"
  ata: libata-sata: Use BIT() macro to convert tag to bit field
  ata: libata-sata: Simplify sense_valid fetching
  ata: libata-core: Simplify ata_print_version_once

4 months ago Merge tag 'hwmon-for-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck...
Linus Torvalds [Tue, 3 Jun 2025 16:11:26 +0000 (09:11 -0700)]
Merge tag 'hwmon-for-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon updates from Guenter Roeck:
 "New drivers:
   - KEBA fan controller
   - KEBA battery monitoring controller
   - MAX77705

  Support added to existing drivers:
   - MAXIMUS VI HERO and ROG MAXIMUS Z90 Formula support (asus-ec-sensors)
   - SQ52206 support (ina238)
   - lt3074 support (pmbus/lt3074)
   - ADPM12160 support (pmbus/max34440)
   - MPM82504 and MPM3695 family support (pmbus/mpq8785)
   - Add the Dell OptiPlex 7050 to the DMI whitelist (dell-smm)
   - Zen5 Ryzen Desktop support (k10temp)

  Various other minor fixes and improvements"

* tag 'hwmon-for-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (48 commits)
  doc: hwmon: acpi_power_meter: Add information about enabling the power capping feature.
  hwmon: (isl28022) Fix current reading calculation
  hwmon: (lm75) Fix I3C transfer buffer pointer for incoming data
  hwmon: Add KEBA fan controller support
  hwmon: pmbus: mpq8785: Add support for MPM3695 family
  hwmon: pmbus: mpq8785: Add support for MPM82504
  hwmon: pmbus: mpq8785: Implement VOUT feedback resistor divider ratio configuration
  hwmon: pmbus: mpq8785: Prepare driver for multiple device support
  dt-bindings: hwmon: Add bindings for mpq8785 driver
  hwmon: (ina238) Modify the calculation formula to adapt to different chips
  hwmon: (ina238) Add support for SQ52206
  dt-bindings: Add SQ52206 to ina2xx devicetree bindings
  hwmon: (ina238) Add ina238_config to save configurations for different chips
  hwmon: (ausus-ec-sensors) add MAXIMUS VI HERO.
  hwmon: (isl28022, nct7363) Convert to use maple tree register cache
  hwmon: (asus-ec-sensors) check sensor index in read_string()
  hwmon: (asus-ec-sensors) add ROG MAXIMUS Z90 Formula.
  dt-bindings: hwmon: Add Sophgo SG2044 external hardware monitor support
  hwmon: (max77705) Add initial support
  hwmon: (tmp102) add vcc regulator support
  ...

4 months ago Merge tag 'xtensa-20250603' of https://github.com/jcmvbkbc/linux-xtensa
Linus Torvalds [Tue, 3 Jun 2025 16:05:01 +0000 (09:05 -0700)]
Merge tag 'xtensa-20250603' of https://github.com/jcmvbkbc/linux-xtensa

Pull xtensa updates from Max Filippov:

 - migrate to the generic rule for built-in DTB

 - cleanups in code and common_defconfig

* tag 'xtensa-20250603' of https://github.com/jcmvbkbc/linux-xtensa:
  arch: xtensa: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
  xtensa: migrate to the generic rule for built-in DTB
  xtensa: ptrace: Remove zero-length alignment array

4 months ago Merge tag 'hyperv-next-signed-20250602' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 3 Jun 2025 15:39:20 +0000 (08:39 -0700)]
Merge tag 'hyperv-next-signed-20250602' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:

 - Support for Virtual Trust Level (VTL) on arm64 (Roman Kisel)

 - Fixes for Hyper-V UIO driver (Long Li)

 - Fixes for Hyper-V PCI driver (Michael Kelley)

 - Select CONFIG_SYSFB for Hyper-V guests (Michael Kelley)

 - Documentation updates for Hyper-V VMBus (Michael Kelley)

 - Enhance logging for hv_kvp_daemon (Shradha Gupta)

* tag 'hyperv-next-signed-20250602' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (23 commits)
  Drivers: hv: Always select CONFIG_SYSFB for Hyper-V guests
  Drivers: hv: vmbus: Add comments about races with "channels" sysfs dir
  Documentation: hyperv: Update VMBus doc with new features and info
  PCI: hv: Remove unnecessary flex array in struct pci_packet
  Drivers: hv: Remove hv_alloc/free_* helpers
  Drivers: hv: Use kzalloc for panic page allocation
  uio_hv_generic: Align ring size to system page
  uio_hv_generic: Use correct size for interrupt and monitor pages
  Drivers: hv: Allocate interrupt and monitor pages aligned to system page boundary
  arch/x86: Provide the CPU number in the wakeup AP callback
  x86/hyperv: Fix APIC ID and VP index confusion in hv_snp_boot_ap()
  PCI: hv: Get vPCI MSI IRQ domain from DeviceTree
  ACPI: irq: Introduce acpi_get_gsi_dispatcher()
  Drivers: hv: vmbus: Introduce hv_get_vmbus_root_device()
  Drivers: hv: vmbus: Get the IRQ number from DeviceTree
  dt-bindings: microsoft,vmbus: Add interrupt and DMA coherence properties
  arm64, x86: hyperv: Report the VTL the system boots in
  arm64: hyperv: Initialize the Virtual Trust Level field
  Drivers: hv: Provide arch-neutral implementation of get_vtl()
  Drivers: hv: Enable VTL mode for arm64
  ...

4 months ago Merge tag 'v6.16-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Tue, 3 Jun 2025 15:03:45 +0000 (08:03 -0700)]
Merge tag 'v6.16-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "Fix a loongarch header regression and a module name collision on s390"

* tag 'v6.16-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  asm-generic: Add sched.h inclusion in simd.h
  crypto: s390/sha256 - rename module to sha256-s390

4 months ago Merge tag 'bitmap-for-6.16-rc1' of https://github.com/norov/linux
Linus Torvalds [Tue, 3 Jun 2025 14:39:23 +0000 (07:39 -0700)]
Merge tag 'bitmap-for-6.16-rc1' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - dead code cleanups for cpumasks and nodemasks (me)

 - fixed-width flavors of GENMASK() and BIT() (Vincent, Lucas and me)

 - FIELD_MODIFY() helper (Luo)

 - for_each_node_with_cpus() optimization (me)

 - bitmap-str fixes (Andy)
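
As a rough illustration of what fixed-width mask helpers and a field-modify helper do, here is a userspace sketch for 32-bit values. The functions genmask_u32(), bit_u32() and field_modify_u32() are hypothetical plain-C stand-ins written for this example; they are not the kernel's GENMASK_U*(), BIT_U*() or FIELD_MODIFY() macros, and they omit the compile-time checks those macros perform.

```c
#include <stdint.h>

/* Mask with bits high..low set (inclusive); requires 31 >= high >= low. */
uint32_t genmask_u32(unsigned int high, unsigned int low)
{
    return (UINT32_MAX >> (31 - high)) & (UINT32_MAX << low);
}

/* Single bit n as a 32-bit value; requires n <= 31. */
uint32_t bit_u32(unsigned int n)
{
    return UINT32_C(1) << n;
}

/* Replace the field selected by mask (non-zero, contiguous) in *reg with val. */
void field_modify_u32(uint32_t mask, uint32_t *reg, uint32_t val)
{
    unsigned int shift = 0;

    /* Find the position of the lowest set bit in the mask. */
    while (!((mask >> shift) & 1))
        shift++;
    *reg = (*reg & ~mask) | ((val << shift) & mask);
}
```

For example, with a register holding 0xABCD, writing 0x5 into bits 7..4 clears 0xC0 and stores 0x50, leaving the other bits untouched.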

* tag 'bitmap-for-6.16-rc1' of https://github.com/norov/linux:
  topology: make for_each_node_with_cpus() O(N)
  bitfield: Add FIELD_MODIFY() helper
  bitmap-str: Add missing header(s)
  bitmap-str: Get rid of 'extern' for function prototypes
  build_bug.h: more user friendly error messages in BUILD_BUG_ON_ZERO()
  test_bits: add tests for BIT_U*()
  test_bits: add tests for GENMASK_U*()
  drm/i915: Convert REG_GENMASK*() to fixed-width GENMASK_U*()
  bits: introduce fixed-type BIT_U*()
  bits: introduce fixed-type GENMASK_U*()
  bits: add comments and newlines to #if, #else and #endif directives
  cpumask: drop cpumask_assign_cpu()
  riscv: switch set_icache_stale_mask() to using non-atomic assign_cpu()
  cpumask: add non-atomic __assign_cpu()
  nodemask: drop nodes_shift

4 months ago Merge branch 'for-6.16/core' into for-linus
Jiri Kosina [Tue, 3 Jun 2025 07:32:30 +0000 (09:32 +0200)]
Merge branch 'for-6.16/core' into for-linus

- power management improvement for multitouch devices (Werner Sembach)

4 months ago Merge branch 'for-6.16/magicmouse' into for-linus
Jiri Kosina [Tue, 3 Jun 2025 07:27:46 +0000 (09:27 +0200)]
Merge branch 'for-6.16/magicmouse' into for-linus

- support for Apple Magic Mouse 2 USB-C (Aditya Garg)

4 months ago Merge branch 'for-6.16/logitech' into for-linus
Jiri Kosina [Tue, 3 Jun 2025 07:27:32 +0000 (09:27 +0200)]
Merge branch 'for-6.16/logitech' into for-linus

4 months ago Merge branch 'for-6.16/kysona' into for-linus
Jiri Kosina [Tue, 3 Jun 2025 07:26:50 +0000 (09:26 +0200)]
Merge branch 'for-6.16/kysona' into for-linus

- power management improvement (Lode Willems)

4 months ago Merge branch 'for-6.16/intel-thc' into for-linus
Jiri Kosina [Tue, 3 Jun 2025 07:26:12 +0000 (09:26 +0200)]
Merge branch 'for-6.16/intel-thc' into for-linus

- fix for ACPI initialization (Wentao Guan)

4 months ago Merge branch 'for-6.16/hid-gpio-setter-callbacks' into for-linus
Jiri Kosina [Tue, 3 Jun 2025 07:25:26 +0000 (09:25 +0200)]
Merge branch 'for-6.16/hid-gpio-setter-callbacks' into for-linus

- adapt HID drivers to use new gpio_chip's line setter callbacks
  (Bartosz Golaszewski)

4 months ago Merge branch 'for-6.16/corsair' into for-linus
Jiri Kosina [Tue, 3 Jun 2025 07:24:39 +0000 (09:24 +0200)]
Merge branch 'for-6.16/corsair' into for-linus

4 months ago Merge branch 'for-6.16/core' into for-linus
Jiri Kosina [Tue, 3 Jun 2025 07:23:09 +0000 (09:23 +0200)]
Merge branch 'for-6.16/core' into for-linus

- make it possible to set hid_mouse_ignore_list dynamically (the same way we
  handle other quirks) (Aditya Garg)
- fix potential OOB in usbhid_parse() (Terry Junge)

4 months ago Merge branch 'for-6.16/apple' into for-linus
Jiri Kosina [Tue, 3 Jun 2025 07:21:55 +0000 (09:21 +0200)]
Merge branch 'for-6.16/apple' into for-linus

- Kconfig dependency fixes (Geert Uytterhoeven)
- time scaling fix for appletb_tb_idle_timeout and appletb_tb_dim_timeout
  parameters (Thorsten Blum)

4 months ago Merge tag 'bootconfig-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Tue, 3 Jun 2025 00:39:24 +0000 (17:39 -0700)]
Merge tag 'bootconfig-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull bootconfig updates from Masami Hiramatsu:

 - Allow overriding CFLAGS and LDFLAGS for tools/bootconfig, for example
   making it a static binary.

* tag 'bootconfig-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tools/bootconfig: specify LDFLAGS as an argument to CC
  tools/bootconfig: allow overriding CFLAGS assignment

4 months ago Merge tag 'modules-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules...
Linus Torvalds [Tue, 3 Jun 2025 00:35:06 +0000 (17:35 -0700)]
Merge tag 'modules-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux

Pull module updates from Petr Pavlu:

 - Make .static_call_sites in modules read-only after init

   The .static_call_sites sections in modules have been made read-only
   after init to avoid any (non-)accidental modifications, similarly to
   how they are read-only after init in vmlinux

 - The rest are minor cleanups

* tag 'modules-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux:
  module: Remove outdated comment about text_size
  module: Make .static_call_sites read-only after init
  module: Add a separate function to mark sections as read-only after init
  module: Constify parameters of module_enforce_rwx_sections()

4 months ago Merge tag 'mm-stable-2025-06-01-14-06' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 2 Jun 2025 23:00:26 +0000 (16:00 -0700)]
Merge tag 'mm-stable-2025-06-01-14-06' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:

 - "zram: support algorithm-specific parameters" from Sergey Senozhatsky
   adds infrastructure for passing algorithm-specific parameters into
   zram. A single parameter `winbits' is implemented at this time.

 - "memcg: nmi-safe kmem charging" from Shakeel Butt makes memcg
   charging nmi-safe, which is required by BPF, which can operate in NMI
   context.

 - "Some random fixes and cleanup to shmem" from Kemeng Shi implements
   small fixes and cleanups in the shmem code.

 - "Skip mm selftests instead when kernel features are not present" from
   Zi Yan fixes some issues in the MM selftest code.

 - "mm/damon: build-enable essential DAMON components by default" from
   SeongJae Park reworks DAMON Kconfig to make it easier to enable
   CONFIG_DAMON.

 - "sched/numa: add statistics of numa balance task migration" from Libo
   Chen adds more info into sysfs and procfs files to improve visibility
   into the NUMA balancer's task migration activity.

 - "selftests/mm: cow and gup_longterm cleanups" from Mark Brown
   provides various updates to some of the MM selftests to make them
   play better with the overall containing framework.

* tag 'mm-stable-2025-06-01-14-06' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (43 commits)
  mm/khugepaged: clean up refcount check using folio_expected_ref_count()
  selftests/mm: fix test result reporting in gup_longterm
  selftests/mm: report unique test names for each cow test
  selftests/mm: add helper for logging test start and results
  selftests/mm: use standard ksft_finished() in cow and gup_longterm
  selftests/damon/_damon_sysfs: skip testcases if CONFIG_DAMON_SYSFS is disabled
  sched/numa: add statistics of numa balance task
  sched/numa: fix task swap by skipping kernel threads
  tools/testing: check correct variable in open_procmap()
  tools/testing/vma: add missing function stub
  mm/gup: update comment explaining why gup_fast() disables IRQs
  selftests/mm: two fixes for the pfnmap test
  mm/khugepaged: fix race with folio split/free using temporary reference
  mm: add CONFIG_PAGE_BLOCK_ORDER to select page block order
  mmu_notifiers: remove leftover stub macros
  selftests/mm: deduplicate test names in madv_populate
  kcov: rust: add flags for KCOV with Rust
  mm: rust: make CONFIG_MMU ifdefs more narrow
  mmu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables()
  mm/damon/Kconfig: enable CONFIG_DAMON by default
  ...

4 months agoMerge tag 'gfs2-for-6.16-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2...
Linus Torvalds [Mon, 2 Jun 2025 22:53:43 +0000 (15:53 -0700)]
Merge tag 'gfs2-for-6.16-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 fix from Andreas Gruenbacher:

 - Fix a NULL pointer dereference reported by syzbot

* tag 'gfs2-for-6.16-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Don't clear sb->s_fs_info in gfs2_sys_fs_add

4 months agoMerge tag 'fuse-update-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszered...
Linus Torvalds [Mon, 2 Jun 2025 22:31:05 +0000 (15:31 -0700)]
Merge tag 'fuse-update-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse updates from Miklos Szeredi:

 - Remove tmp page copying in writeback path (Joanne).

   This removes ~300 lines and with that a lot of complexity related to
   avoiding reclaim related deadlock. The old mechanism is replaced with
   a mapping flag that tells the MM not to block reclaim waiting for
   writeback to complete. The MM parts have been reviewed/acked by
   respective maintainers.

 - Convert more code to handle large folios (Joanne). This still just
   adds the code to deal with large folios and does not enable them yet.

 - Allow invalidating all cached lookups atomically (Luis Henriques).
   This feature is useful for CernVMFS, which currently does this
   iteratively.

 - Align write prefaulting in fuse with generic one (Dave Hansen)

 - Fix race causing invalid data to be cached when setting attributes on
   different nodes of a distributed fs (Guang Yuan Wu)

 - Update documentation for passthrough (Chen Linxuan)

 - Add fdinfo about the device number associated with an opened
   /dev/fuse instance (Chen Linxuan)

 - Increase readdir buffer size (Miklos). This depends on a patch to VFS
   readdir code that was already merged through Christian's tree.

 - Optimize io-uring request expiration (Joanne)

 - Misc cleanups

* tag 'fuse-update-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (25 commits)
  fuse: increase readdir buffer size
  readdir: supply dir_context.count as readdir buffer size hint
  fuse: don't allow signals to interrupt getdents copying
  fuse: support large folios for writeback
  fuse: support large folios for readahead
  fuse: support large folios for queued writes
  fuse: support large folios for stores
  fuse: support large folios for symlinks
  fuse: support large folios for folio reads
  fuse: support large folios for writethrough writes
  fuse: refactor fuse_fill_write_pages()
  fuse: support large folios for retrieves
  fuse: support copying large folios
  fs: fuse: add dev id to /dev/fuse fdinfo
  docs: filesystems: add fuse-passthrough.rst
  MAINTAINERS: update filter of FUSE documentation
  fuse: fix race between concurrent setattrs from multiple nodes
  fuse: remove tmp folio for writebacks and internal rb tree
  mm: skip folio reclaim in legacy memcg contexts for deadlockable mappings
  fuse: optimize over-io-uring request expiration check
  ...

4 months agoMerge tag 'vfs-6.16-rc1.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Linus Torvalds [Mon, 2 Jun 2025 22:04:06 +0000 (15:04 -0700)]
Merge tag 'vfs-6.16-rc1.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull netfs updates from Christian Brauner:

 - The main API document has been extensively updated/rewritten

 - Fix an oops in write-retry due to mis-resetting the I/O iterator

 - Fix the recording of transferred bytes for short DIO reads

 - Fix a request's work item to not require a reference, thereby
   avoiding the need to get rid of it in BH/IRQ context

 - Fix waiting and waking to be consistent about the waitqueue used

 - Remove NETFS_SREQ_SEEK_DATA_READ, NETFS_INVALID_WRITE,
   NETFS_ICTX_WRITETHROUGH, NETFS_READ_HOLE_CLEAR,
   NETFS_RREQ_DONT_UNLOCK_FOLIOS, and NETFS_RREQ_BLOCKED

 - Reorder structs to eliminate holes

 - Remove netfs_io_request::ractl

 - Only provide proc_link field if CONFIG_PROC_FS=y

 - Remove folio_queue::marks3

 - Fix undifferentiation of DIO reads from unbuffered reads

* tag 'vfs-6.16-rc1.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  netfs: Fix undifferentiation of DIO reads from unbuffered reads
  netfs: Fix wait/wake to be consistent about the waitqueue used
  netfs: Fix the request's work item to not require a ref
  netfs: Fix setting of transferred bytes with short DIO reads
  netfs: Fix oops in write-retry from mis-resetting the subreq iterator
  fs/netfs: remove unused flag NETFS_RREQ_BLOCKED
  fs/netfs: remove unused flag NETFS_RREQ_DONT_UNLOCK_FOLIOS
  folio_queue: remove unused field `marks3`
  fs/netfs: declare field `proc_link` only if CONFIG_PROC_FS=y
  fs/netfs: remove `netfs_io_request.ractl`
  fs/netfs: reorder struct fields to eliminate holes
  fs/netfs: remove unused enum choice NETFS_READ_HOLE_CLEAR
  fs/netfs: remove unused flag NETFS_ICTX_WRITETHROUGH
  fs/netfs: remove unused source NETFS_INVALID_WRITE
  fs/netfs: remove unused flag NETFS_SREQ_SEEK_DATA_READ

4 months agoMerge tag 'vfs-6.16-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Linus Torvalds [Mon, 2 Jun 2025 19:49:16 +0000 (12:49 -0700)]
Merge tag 'vfs-6.16-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

 - Fix the AT_HANDLE_CONNECTABLE option so filesystems that don't know
   how to decode a connected non-dir dentry fail the request

 - Use repr(transparent) to ensure identical layout between the C and
   Rust implementation of struct file

 - Add a missing xas_pause() into the dax code employing
   wait_entry_unlocked_exclusive()

 - Fix FOP_DONTCACHE which we disabled for v6.15.

   A folio could get redirtied and/or scheduled for writeback after the
   initial dropbehind test. Change the test accordingly to handle these
   cases so we can re-enable FOP_DONTCACHE again

* tag 'vfs-6.16-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  exportfs: require ->fh_to_parent() to encode connectable file handles
  rust: file: improve safety comments
  rust: file: mark `LocalFile` as `repr(transparent)`
  fs/dax: Fix "don't skip locked entries when scanning entries"
  iomap: don't lose folio dropbehind state for overwrites
  mm/filemap: unify dropbehind flag testing and clearing
  mm/filemap: unify read/write dropbehind naming
  Revert "Disable FOP_DONTCACHE for now due to bugs"
  mm/filemap: use filemap_end_dropbehind() for read invalidation
  mm/filemap: gate dropbehind invalidate on folio !dirty && !writeback
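
The FOP_DONTCACHE item above says a folio may be redirtied or queued for
writeback after the initial dropbehind test, so the test has to take both
states into account. A minimal Python sketch of that gating follows. It is a
toy model, not the kernel code, and `ToyFolio` and `end_dropbehind` are
hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyFolio:
    """Toy stand-in for a page-cache folio (hypothetical, not a kernel type)."""
    dropbehind: bool = False  # caller asked for drop-after-I/O
    dirty: bool = False       # modified since the last writeback
    writeback: bool = False   # writeback currently in flight
    cached: bool = True       # still present in the toy cache


def end_dropbehind(folio: ToyFolio) -> bool:
    """Evict the folio only if it is flagged dropbehind, clean and idle.

    Returns True if the folio was evicted. In this sketch the dropbehind
    flag is left set when eviction is skipped, so a later call can retry;
    that is a choice of the toy model, not a claim about the kernel.
    """
    if not folio.dropbehind:
        return False
    if folio.dirty or folio.writeback:
        return False
    folio.dropbehind = False
    folio.cached = False
    return True
```

A folio that is dirtied between the read and the dropbehind check stays in the
toy cache until it is clean again, which is the behaviour the gating is meant
to guarantee.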

4 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Mon, 2 Jun 2025 19:24:58 +0000 (12:24 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm updates from Paolo Bonzini:
  Generic:

   - Clean up locking of all vCPUs for a VM by using the *_nest_lock()
     family of functions, and move duplicated code to virt/kvm/. kernel/
     patches acked by Peter Zijlstra

   - Add MGLRU support to the access tracking perf test

  ARM fixes:

   - Make the irqbypass hooks resilient to changes in the GSI<->MSI
     routing, avoiding stale vLPI mappings being left behind. The
     fix is to resolve the VGIC IRQ using the host IRQ (which is stable)
     and nuke the vLPI mapping upon a routing change

   - Close another VGIC race where vCPU creation races with VGIC
     creation, leading to in-flight vCPUs entering the kernel w/o
     private IRQs allocated

   - Fix a build issue triggered by the recently added workaround for
     Ampere's AC04_CPU_23 erratum

   - Correctly sign-extend the VA when emulating a TLBI instruction
     potentially targeting a VNCR mapping

   - Avoid dereferencing a NULL pointer in the VGIC debug code, which
     can happen if the device doesn't have any mapping yet

  s390:

   - Fix interaction between some filesystems and Secure Execution

   - Some cleanups and refactorings, preparing for an upcoming big
     series

  x86:

   - Wait for target vCPU to ack KVM_REQ_UPDATE_PROTECTED_GUEST_STATE
     to fix a race between AP destroy and VMRUN

   - Decrypt and dump the VMSA in dump_vmcb() if debugging is enabled for
     the VM

   - Refine and harden handling of spurious faults

   - Add support for ALLOWED_SEV_FEATURES

   - Add #VMGEXIT to the set of handlers special cased for
     CONFIG_RETPOLINE=y

   - Treat DEBUGCTL[5:2] as reserved to pave the way for virtualizing
     features that utilize those bits

   - Don't account temporary allocations in sev_send_update_data()

   - Add support for KVM_CAP_X86_BUS_LOCK_EXIT on SVM, via Bus Lock
     Threshold

   - Unify virtualization of IBRS on nested VM-Exit, and cross-vCPU
     IBPB, between SVM and VMX

   - Advertise support to userspace for WRMSRNS and PREFETCHI

   - Rescan I/O APIC routes after handling EOI that needed to be
     intercepted due to the old/previous routing, but not the
     new/current routing

   - Add a module param to control and enumerate support for device
     posted interrupts

   - Fix a potential overflow with nested virt on Intel systems running
     32-bit kernels

   - Flush shadow VMCSes on emergency reboot

   - Add support for SNP to the various SEV selftests

   - Add a selftest to verify fastops instructions via forced emulation

   - Refine and optimize KVM's software processing of the posted
     interrupt bitmap, and share the harvesting code between KVM and the
     kernel's Posted MSI handler

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (93 commits)
  rtmutex_api: provide correct extern functions
  KVM: arm64: vgic-debug: Avoid dereferencing NULL ITE pointer
  KVM: arm64: vgic-init: Plug vCPU vs. VGIC creation race
  KVM: arm64: Unmap vLPIs affected by changes to GSI routing information
  KVM: arm64: Resolve vLPI by host IRQ in vgic_v4_unset_forwarding()
  KVM: arm64: Protect vLPI translation with vgic_irq::irq_lock
  KVM: arm64: Use lock guard in vgic_v4_set_forwarding()
  KVM: arm64: Mask out non-VA bits from TLBI VA* on VNCR invalidation
  arm64: sysreg: Drag linux/kconfig.h to work around vdso build issue
  KVM: s390: Simplify and move pv code
  KVM: s390: Refactor and split some gmap helpers
  KVM: s390: Remove unneeded srcu lock
  s390: Remove unneeded includes
  s390/uv: Improve splitting of large folios that cannot be split while dirty
  s390/uv: Always return 0 from s390_wiggle_split_folio() if successful
  s390/uv: Don't return 0 from make_hva_secure() if the operation was not successful
  rust: add helper for mutex_trylock
  RISC-V: KVM: use kvm_trylock_all_vcpus when locking all vCPUs
  KVM: arm64: use kvm_trylock_all_vcpus when locking all vCPUs
  x86: KVM: SVM: use kvm_lock_all_vcpus instead of a custom implementation
  ...
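
The "DEBUGCTL[5:2] as reserved" item above amounts to a bit-range check. The
Python sketch below shows the arithmetic only. `genmask` follows the inclusive
high/low convention of the kernel's GENMASK() macro, while
`debugctl_is_valid` and the idea of rejecting a value outright are assumptions
made for illustration, not KVM's actual code.

```python
def genmask(high: int, low: int) -> int:
    """Return an integer with bits low..high (inclusive) set."""
    return ((1 << (high - low + 1)) - 1) << low


# Bits 5:2 of the register value, i.e. 0b111100.
DEBUGCTL_RESERVED_5_2 = genmask(5, 2)


def debugctl_is_valid(value: int) -> bool:
    """Hypothetical check: accept a value only if no reserved bit is set."""
    return (value & DEBUGCTL_RESERVED_5_2) == 0
```

With this mask, a value that only touches bits 0, 1 or 6 passes, while any
value with one of bits 2 to 5 set is rejected.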

4 months agoMerge tag 'm68knommu-for-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 2 Jun 2025 19:16:17 +0000 (12:16 -0700)]
Merge tag 'm68knommu-for-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu

Pull m68knommu updates from Greg Ungerer:

 - use new gpio line value settings

 - use strscpy() more

* tag 'm68knommu-for-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68k: Replace memcpy() + manual NUL-termination with strscpy()
  m68k/kernel: replace strncpy() with strscpy()
  m68k: coldfire: gpio: use new line value setter callbacks

4 months agoMerge tag 'input-for-v6.16-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 2 Jun 2025 18:14:21 +0000 (11:14 -0700)]
Merge tag 'input-for-v6.16-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov:

 - support for game controllers requiring delayed initialization
   packets, such as ByoWave Proteus, in xpad driver

 - a change to atkbd driver to not reset the keyboard on Loongson
   devices

 - tweaks to gpio-keys and matrix_keypad drivers

 - fixes to documentation for Amiga joysticks

 - a fix to ims-pcu driver to better handle malformed firmware

* tag 'input-for-v6.16-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: ims-pcu - check record size in ims_pcu_flash_firmware()
  Input: gpio-keys - fix possible concurrent access in gpio_keys_irq_timer()
  Input: gpio-keys - fix a sleep while atomic with PREEMPT_RT
  Input: amijoy - make headings compliant w/ guidelines in documentation
  Input: amijoy - fix grammar in documentation
  Input: amijoy - fix Amiga 4-joystick adapter pinout in documentation
  Input: amijoy - fix broken table formatting in documentation
  Input: atkbd - do not reset keyboard by default on Loongson
  Input: xpad - send LED and auth done packets to all Xbox One controllers
  Input: xpad - add the ByoWave Proteus controller
  Input: xpad - allow delaying init packets
  MAINTAINERS: update dlg,da72??.txt to yaml
  dt-bindings: input: convert dlg,da7280.txt to dt-schema
  dt-bindings: input: touchscreen: edt-ft5x06: use unevaluatedProperties
  Input: snvs_pwrkey - support power-off-time-sec
  dt-bindings: crypto: fsl,sec-v4.0-mon: Add "power-off-time-sec"
  Input: matrix_keypad - detect change during scan
  Input: matrix_keypad - add function for reading row state

4 months agoMerge tag 'mtd/for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Linus Torvalds [Mon, 2 Jun 2025 18:08:17 +0000 (11:08 -0700)]
Merge tag 'mtd/for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux

Pull MTD updates from Miquel Raynal:
 "A big core MTD change is the introduction of a new class to always
  register a master device. This is a problem that has been there
  forever: the "master" device was not always present depending on a
  number of heuristics such as the presence of fixed partitions and the
  absence of a Kconfig symbol to force its presence. This was a problem
  for runtime PM operations which might not have the "master" device
  available in all situations.

  The SPI NAND subsystem has seen the introduction of DTR operations
  (the equivalent of DDR transfers), which involved quite a few
  preparation patches for clarifying macro names.

  In the raw NAND subsystem, the brcmnand driver has been "fixed" for
  old legacy SoCs with an update of the ->exec_op() hook, there has been
  the introduction of a new controller driver named Loongson-1, and the
  Qualcomm driver has received quite a few misc fixes as well as a new
  compatible.

  Finally, Macronix SPI NOR entries have been cleaned-up and some SFDP
  table fixups for Macronix MX25L3255E have been merged.

  Aside from this, there is the usual load of misc improvements, fixes,
  and yaml conversions"

* tag 'mtd/for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (42 commits)
  mtd: rawnand: brcmnand: legacy exec_op implementation
  mtd: rawnand: sunxi: Add randomizer configuration in sunxi_nfc_hw_ecc_write_chunk
  mtd: nand: brcmnand: fix NAND timeout when accessing eMMC
  mtd: nand: sunxi: Add randomizer configuration before randomizer enable
  mtd: spinand: esmt: fix id code for F50D1G41LB
  mtd: rawnand: brcmnand: remove unused parameters
  mtd: core: always create master device
  mtd: rawnand: loongson1: Fix inconsistent refcounting in ls1x_nand_chip_init()
  mtd: rawnand: loongson1: Fix error code in ls1x_nand_dma_transfer()
  mtd: rawnand: qcom: Fix read len for onfi param page
  mtd: rawnand: qcom: Fix last codeword read in qcom_param_page_type_exec()
  mtd: rawnand: qcom: Pass 18 bit offset from NANDc base to BAM base
  dt-bindings: mtd: qcom,nandc: Document the SDX75 NAND controller
  mtd: bcm47xxnflash: Add error handling for bcm47xxnflash_ops_bcm4706_ctl_cmd()
  mtd: rawnand: Use non-hybrid PCI devres API
  mtd: nand: ecc-mxic: Fix use of uninitialized variable ret
  mtd: spinand: winbond: Add support for W35N02JW and W35N04JW chips
  mtd: spinand: winbond: Add octal support
  mtd: spinand: winbond: Add support for W35N01JW in single mode
  mtd: spinand: winbond: Rename DTR variants
  ...

4 months agoMerge tag 'rpmsg-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc...
Linus Torvalds [Mon, 2 Jun 2025 18:06:44 +0000 (11:06 -0700)]
Merge tag 'rpmsg-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull rpmsg updates from Bjorn Andersson:

 - Remove some dead and unused code from core and virtio modules

 - Improve the error messages from the Qualcomm SMD driver and
   initialize an uninitialized variable in the send path

* tag 'rpmsg-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  rpmsg: qcom_smd: Fix uninitialized return variable in __qcom_smd_send()
  rpmsg: qcom_smd: Improve error handling for qcom_smd_parse_edge
  rpmsg: Remove unused method pointers *send_offchannel
  rpmsg: virtio: Remove uncallable offchannel functions
  rpmsg: core: Remove deadcode

4 months agoMerge tag 'rproc-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc...
Linus Torvalds [Mon, 2 Jun 2025 18:04:29 +0000 (11:04 -0700)]
Merge tag 'rproc-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux

Pull remoteproc updates from Bjorn Andersson:

 - Fix resource cleanup in the remoteproc attach error handling code
   paths

 - Refactor the various TI K3 drivers to extract and reuse common code
   between them

 - Add support in the i.MX remoteproc driver for determining from the
   firmware if Linux should wait on a "firmware ready" signal at startup

 - Improve the Xilinx R5F power down mechanism to handle use cases where
   this is shared with other entities in the system

* tag 'rproc-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (46 commits)
  remoteproc: k3: Refactor release_tsp() functions into common driver
  remoteproc: k3: Refactor reserved_mem_init() functions into common driver
  remoteproc: k3: Refactor mem_release() functions into common driver
  remoteproc: k3: Refactor of_get_memories() functions into common driver
  remoteproc: k3: Refactor .da_to_va rproc ops into common driver
  remoteproc: k3: Refactor .get_loaded_rsc_table ops into common driver
  remoteproc: k3: Refactor .detach rproc ops into common driver
  remoteproc: k3: Refactor .attach rproc ops into common driver
  remoteproc: k3: Refactor .stop rproc ops into common driver
  remoteproc: k3: Refactor .start rproc ops into common driver
  remoteproc: k3: Refactor .unprepare rproc ops into common driver
  remoteproc: k3: Refactor .prepare rproc ops into common driver
  remoteproc: k3-dsp: Assert local reset during .prepare callback
  remoteproc: k3-dsp: Don't override rproc ops in IPC-only mode
  remoteproc: k3: Refactor rproc_request_mbox() implementations into common driver
  remoteproc: k3-m4: Ping the mbox while acquiring the channel
  remoteproc: k3: Refactor rproc_release() implementation into common driver
  remoteproc: k3-m4: Introduce central function to release rproc from reset
  remoteproc: k3-dsp: Correct Reset deassert logic for devices w/o lresets
  remoteproc: k3: Refactor rproc_reset() implementation into common driver
  ...

4 months agoMerge tag 'mailbox-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar...
Linus Torvalds [Mon, 2 Jun 2025 17:58:00 +0000 (10:58 -0700)]
Merge tag 'mailbox-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox

Pull mailbox updates from Jassi Brar:
 "Core:
   - misc cleanup

  sophgo:
   - add driver for CV18XX series

  qcom:
   - add SM7150 APCS compatible
   - apcs: added separate clock node

  imx:
   - fix tx doorbell send

  microchip:
   - misc compile option fix

  mediatek:
   - Refine GCE_GCTL_VALUE setting"

* tag 'mailbox-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox:
  mailbox: qcom-apcs-ipc: Assign OF node to clock controller child device
  dt-bindings: mailbox: qcom,apcs: Add separate node for clock-controller
  dt-bindings: mailbox: qcom: Add the SM7150 APCS compatible
  mailbox: sophgo: add mailbox driver for CV18XX series SoC
  dt-bindings: mailbox: add Sophgo CV18XX series SoC
  mailbox: Use guard/scoped_guard for spinlock
  mailbox: Use guard/scoped_guard for con_mutex
  mailbox: Remove devm_mbox_controller_unregister
  mailbox: Propagate correct error return value
  mailbox: Not protect module_put with spin_lock_irqsave
  mailbox: Use dev_err when there is error
  mailbox: mtk-cmdq: Refine GCE_GCTL_VALUE setting
  mailbox: imx: Fix TXDB_V2 sending
  mailbox: mchp-ipc-sbi: Fix COMPILE_TEST build error

4 months agoMerge tag 'nand/for-6.16' into mtd/next
Miquel Raynal [Mon, 2 Jun 2025 16:39:50 +0000 (18:39 +0200)]
Merge tag 'nand/for-6.16' into mtd/next

The SPI NAND subsystem has seen the introduction of DTR operations (the
equivalent of DDR transfers), which involved quite a few preparation
patches for clarifying macro names.

In the raw NAND subsystem, the brcmnand driver has been "fixed" for old
legacy SoCs with an update of the ->exec_op() hook, there has been the
introduction of a new controller driver named Loongson-1, and the
Qualcomm driver has received quite a few misc fixes as well as a new
compatible.

Aside from this, there is the usual load of misc improvements and fixes.

4 months agoMerge tag 'spi-nor/for-6.16' into mtd/next
Miquel Raynal [Mon, 2 Jun 2025 16:39:35 +0000 (18:39 +0200)]
Merge tag 'spi-nor/for-6.16' into mtd/next

SPI NOR changes for 6.16

Notable changes:

- Cleanup some Macronix flash entries.

- Add SFDP table fixups for Macronix MX25L3255E.

4 months agoMerge tag 'kvmarm-fixes-6.16-1' of https://git.kernel.org/pub/scm/linux/kernel/git...
Paolo Bonzini [Mon, 2 Jun 2025 07:05:29 +0000 (03:05 -0400)]
Merge tag 'kvmarm-fixes-6.16-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.16, take #1

- Make the irqbypass hooks resilient to changes in the GSI<->MSI
  routing, avoiding stale vLPI mappings being left behind. The
  fix is to resolve the VGIC IRQ using the host IRQ (which is stable)
  and nuke the vLPI mapping upon a routing change.

- Close another VGIC race where vCPU creation races with VGIC
  creation, leading to in-flight vCPUs entering the kernel w/o private
  IRQs allocated.

- Fix a build issue triggered by the recently added workaround for
  Ampere's AC04_CPU_23 erratum.

- Correctly sign-extend the VA when emulating a TLBI instruction
  potentially targeting a VNCR mapping.

- Avoid dereferencing a NULL pointer in the VGIC debug code, which can
  happen if the device doesn't have any mapping yet.
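
Sign-extending a VA, as in the TLBI item above, means copying the top
implemented address bit into all higher bits. The Python sketch below models
the semantics of the kernel's sign_extend64() helper on 64-bit values. The
48-bit VA width (sign bit 47) is an assumption for the example, and the actual
field layout handled by the patch is not shown here.

```python
U64_MASK = (1 << 64) - 1


def sign_extend64(value: int, index: int) -> int:
    """Sign-extend `value`, treating bit `index` (0-based) as the sign bit.

    Bits above `index` in the input are discarded. The result is returned
    as an unsigned 64-bit integer, i.e. the two's-complement bit pattern.
    """
    low_mask = (1 << (index + 1)) - 1
    value &= low_mask
    if value & (1 << index):
        # Sign bit set: fill every bit above it with ones.
        value |= U64_MASK & ~low_mask
    return value


# Assumed for illustration: 48-bit virtual addresses, so bit 47 is the sign bit.
ASSUMED_VA_SIGN_BIT = 47
```

Under that assumption, an address with bit 47 set lands in the upper
(kernel-style) half of the address space after extension, while one with
bit 47 clear is returned with its upper bits zeroed.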

4 months agortmutex_api: provide correct extern functions
Paolo Bonzini [Fri, 30 May 2025 07:45:13 +0000 (03:45 -0400)]
rtmutex_api: provide correct extern functions

Commit fb49f07ba1d9 ("locking/mutex: implement mutex_lock_killable_nest_lock")
changed the set of functions that mutex.c defines when CONFIG_DEBUG_LOCK_ALLOC
is set.

- it removed the "extern" declaration of mutex_lock_killable_nested from
  include/linux/mutex.h, and replaced it with a macro since it could be
  treated as a special case of _mutex_lock_killable.  It also removed a
  definition of the function in kernel/locking/mutex.c.

- likewise, it replaced the mutex_trylock() function with the more generic
  mutex_trylock_nest_lock() and turned mutex_trylock() into a macro.

However, it left the old definitions in place in kernel/locking/rtmutex_api.c,
which causes failures when building with CONFIG_RT_MUTEXES=y.  Bring over
the changes.

Fixes: fb49f07ba1d9 ("locking/mutex: implement mutex_lock_killable_nest_lock")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

4 months agoMerge branch 'next' into for-linus
Dmitry Torokhov [Mon, 2 Jun 2025 04:41:07 +0000 (21:41 -0700)]
Merge branch 'next' into for-linus

Prepare input updates for 6.16 merge window.

4 months agoMerge tag 'hardening-v6.16-rc1-fix1-take2' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Sun, 1 Jun 2025 18:37:01 +0000 (11:37 -0700)]
Merge tag 'hardening-v6.16-rc1-fix1-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening fixes from Kees Cook:

 - randstruct: gcc-plugin: Fix attribute addition with GCC 15

 - ubsan: integer-overflow: depend on BROKEN to keep this out of CI

 - overflow: Introduce __DEFINE_FLEX for having no initializer

 - wifi: iwlwifi: mld: Work around Clang loop unrolling bug

[ Take two after a jump scare due to some repo rewriting by 'b4' - Linus ]

* tag 'hardening-v6.16-rc1-fix1-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  randstruct: gcc-plugin: Fix attribute addition
  overflow: Introduce __DEFINE_FLEX for having no initializer
  ubsan: integer-overflow: depend on BROKEN to keep this out of CI
  wifi: iwlwifi: mld: Work around Clang loop unrolling bug

4 months agoMerge tag 'linux-watchdog-6.16-rc1' of git://www.linux-watchdog.org/linux-watchdog
Linus Torvalds [Sun, 1 Jun 2025 16:01:58 +0000 (09:01 -0700)]
Merge tag 'linux-watchdog-6.16-rc1' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:

 - Add watchdog timer for the NXP S32 platform

 - Add driver for Intel OC WDT

 - Add exynos990-wdt

 - Various other fixes and improvements

* tag 'linux-watchdog-6.16-rc1' of git://www.linux-watchdog.org/linux-watchdog: (22 commits)
  watchdog: iTCO_wdt: Update the heartbeat value after clamping timeout
  watchdog: Add driver for Intel OC WDT
  watchdog: arm_smc_wdt: get wdt status through SMCWD_GET_TIMELEFT
  watchdog: iTCO: Drop driver-internal locking
  watchdog: apple: set max_hw_heartbeat_ms instead of max_timeout
  watchdog: qcom: introduce the device data for IPQ5424 watchdog device
  dt-bindings: watchdog: renesas,wdt: Document RZ/V2N (R9A09G056) support
  watchdog: lenovo_se30_wdt: Fix possible devm_ioremap() NULL pointer dereference in lenovo_se30_wdt_probe()
  watchdog: s3c2410_wdt: Add exynos990-wdt compatible data
  dt-bindings: watchdog: samsung-wdt: Add exynos990-wdt compatible
  dt-bindings: watchdog: Add rk3562 compatible
  dt-bindings: watchdog: fsl,scu-wdt: Document imx8qm
  watchdog: Add the Watchdog Timer for the NXP S32 platform
  dt-bindings: watchdog: Add NXP Software Watchdog Timer
  watchdog: Correct kerneldoc warnings
  watchdog: stm32: Fix wakeup source leaks on device unbind
  watchdog: Do not enable by default during compile testing
  watchdog: cros-ec: Avoid -Wflex-array-member-not-at-end warning
  watchdog: da9052_wdt: respect TWDMIN
  watchdog: da9052_wdt: do not disable wdt during probe
  ...

4 months agoMerge tag 'i3c/for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Linus Torvalds [Sun, 1 Jun 2025 15:59:50 +0000 (08:59 -0700)]
Merge tag 'i3c/for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux

Pull i3c updates from Alexandre Belloni:
 "There is not much this time, mostly fixes around interrupt and IBI
  handling:

   - mipi-i3c-hci: interrupt handling fixes

   - svc: i.MX94 and i.MX95 support, IBI handling fixes"

* tag 'i3c/for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
  i3c: controllers do not need to depend on I3C
  i3c: master: svc: switch to bulk clk API for flexible clock support
  dt-bindings: i3c: silvaco,i3c-master: add i.MX94 and i.MX95 I3C
  i3c: master: svc: skip address resend on repeat START
  i3c: master: svc: Emit STOP asap in the IBI transaction
  i3c: master: svc: Receive IBI requests in interrupt context
  i3c: mipi-i3c-hci: Move unexpected INTR_STATUS print before IO handler
  i3c: mipi-i3c-hci: Change name of INTR_STATUS bit 11
  i3c: mipi-i3c-hci: Clear INTR_STATUS unconditionally
  i3c: mipi-i3c-hci: Fix handling status of i3c_hci_irq_handler()
  i3c: mipi-i3c-hci: Allow only relevant INTR_STATUS bit updates

4 months agoMerge tag 'edac_urgent_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 1 Jun 2025 15:58:31 +0000 (08:58 -0700)]
Merge tag 'edac_urgent_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC fix from Borislav Petkov:
 "Limit a register write width in altera_edac to avoid hw errors"

* tag 'edac_urgent_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/altera: Use correct write width with the INTTEST register

4 months agoMerge tag 'for-linus' of https://github.com/openrisc/linux
Linus Torvalds [Sun, 1 Jun 2025 15:56:34 +0000 (08:56 -0700)]
Merge tag 'for-linus' of https://github.com/openrisc/linux

Pull OpenRISC updates from Stafford Horne:
 "Just a few documentation updates from the community:

   - Device tree documentation conversion from txt to yaml

   - Documentation addition to help users getting started with initramfs
     on OpenRISC"

* tag 'for-linus' of https://github.com/openrisc/linux:
  dt-bindings: interrupt-controller: Convert openrisc,ompic to DT schema
  dt-bindings: interrupt-controller: Convert opencores,or1k-pic to DT schema
  Documentation:openrisc: Add build instructions with initramfs

4 months agoMerge tag 'parisc-for-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 1 Jun 2025 15:55:28 +0000 (08:55 -0700)]
Merge tag 'parisc-for-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc updates from Helge Deller:
 "Fix building with gcc-15, formatting fix on unaligned warnings and
  replace __ASSEMBLY__ with __ASSEMBLER__ in headers"

* tag 'parisc-for-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc/unaligned: Fix hex output to show 8 hex chars
  parisc: fix building with gcc-15
  parisc: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-uapi headers
  parisc: Replace __ASSEMBLY__ with __ASSEMBLER__ in uapi headers

4 months agorandstruct: gcc-plugin: Fix attribute addition
Kees Cook [Fri, 30 May 2025 22:18:28 +0000 (15:18 -0700)]
randstruct: gcc-plugin: Fix attribute addition

Based on changes in the 2021 public version of the randstruct
out-of-tree GCC plugin[1], more carefully update the attributes on
resulting decls, to avoid tripping checks in GCC 15's
comptypes_check_enum_int() when it has been configured with
"--enable-checking=misc":

arch/arm64/kernel/kexec_image.c:132:14: internal compiler error: in comptypes_check_enum_int, at c/c-typeck.cc:1519
  132 | const struct kexec_file_ops kexec_image_ops = {
      |              ^~~~~~~~~~~~~~
 internal_error(char const*, ...), at gcc/gcc/diagnostic-global-context.cc:517
 fancy_abort(char const*, int, char const*), at gcc/gcc/diagnostic.cc:1803
 comptypes_check_enum_int(tree_node*, tree_node*, bool*), at gcc/gcc/c/c-typeck.cc:1519
 ...

Link: https://archive.org/download/grsecurity/grsecurity-3.1-5.10.41-202105280954.patch.gz [1]
Reported-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Closes: https://github.com/KSPP/linux/issues/367
Closes: https://lore.kernel.org/lkml/20250530000646.104457-1-thiago.bauermann@linaro.org/
Reported-by: Ingo Saitz <ingo@hannover.ccc.de>
Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1104745
Fixes: 313dd1b62921 ("gcc-plugins: Add the randstruct plugin")
Tested-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
Link: https://lore.kernel.org/r/20250530221824.work.623-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
4 months agooverflow: Introduce __DEFINE_FLEX for having no initializer
Kees Cook [Fri, 30 May 2025 19:06:47 +0000 (12:06 -0700)]
overflow: Introduce __DEFINE_FLEX for having no initializer

While not yet in the tree, there is a proposed patch[1] that was
depending on the prior behavior of _DEFINE_FLEX, which did not have an
explicit initializer. Provide this via __DEFINE_FLEX now, which can also
have attributes applied (e.g. __uninitialized).

Examples of the resulting initializer behaviors can be seen here:
https://godbolt.org/z/P7Go8Tr33

Link: https://lore.kernel.org/netdev/20250520205920.2134829-9-anthony.l.nguyen@intel.com [1]
Fixes: 47e36ed78406 ("overflow: Fix direct struct member initialization in _DEFINE_FLEX()")
Signed-off-by: Kees Cook <kees@kernel.org>
4 months agowatchdog: iTCO_wdt: Update the heartbeat value after clamping timeout
Ziyan Fu [Tue, 29 Apr 2025 10:25:33 +0000 (18:25 +0800)]
watchdog: iTCO_wdt: Update the heartbeat value after clamping timeout

When executing "modprobe iTCO_wdt heartbeat=700", the user-specified
'heartbeat' parameter exceeds the valid range.  The driver clamps the
timeout to the default 30s but fails to update the logged 'heartbeat'
value, resulting in misleading log output:

iTCO_wdt iTCO_wdt: timeout value out of range, using 30
iTCO_wdt iTCO_wdt: initialized. heartbeat=700 sec (nowayout=0)

After validating the range, update the 'heartbeat' value with the clamped
timeout value to ensure that log messages accurately reflect the actual
runtime parameters.
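
The fix can be sketched in userspace C as follows.  This is a minimal
illustration, not the driver code: wdt_apply_heartbeat() and
WDT_MAX_HEARTBEAT are hypothetical, and only the default of 30 is taken
from the log output above.

```c
#include <stdio.h>

#define WDT_DEFAULT_HEARTBEAT	30	/* matches the "using 30" log line above */
#define WDT_MAX_HEARTBEAT	613	/* hypothetical upper bound, for illustration */

/*
 * Validate a requested heartbeat.  If it is out of range, fall back to
 * the default and write the clamped value back through the pointer, so
 * that any later log message reports the value actually in effect.
 * Returns 1 if the value was clamped, 0 otherwise.
 */
static int wdt_apply_heartbeat(int *heartbeat)
{
	if (*heartbeat < 1 || *heartbeat > WDT_MAX_HEARTBEAT) {
		printf("timeout value out of range, using %d\n",
		       WDT_DEFAULT_HEARTBEAT);
		*heartbeat = WDT_DEFAULT_HEARTBEAT;
		return 1;
	}
	return 0;
}
```

Printing the heartbeat only after this call keeps the "initialized"
message consistent with the clamped timeout.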

Signed-off-by: Ziyan Fu <fuzy5@lenovo.com>
Reviewed-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Link: https://lore.kernel.org/r/20250429102533.11886-1-13281011316@163.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
4 months agowatchdog: Add driver for Intel OC WDT
Diogo Ivo [Mon, 17 Mar 2025 10:55:06 +0000 (10:55 +0000)]
watchdog: Add driver for Intel OC WDT

Add a driver for the Intel Over-Clocking Watchdog found in Intel
Platform Controller Hub (PCH) chipsets. This watchdog is controlled
via a simple single-register interface and would otherwise be
standard except for the presence of a LOCK bit that can only be
set once per power cycle, needing extra handling around it.

Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20250317-ivo-intel_oc_wdt-v3-1-32c396f4eefd@siemens.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
4 months agowatchdog: arm_smc_wdt: get wdt status through SMCWD_GET_TIMELEFT
Antonio Borneo [Tue, 20 May 2025 08:59:52 +0000 (10:59 +0200)]
watchdog: arm_smc_wdt: get wdt status through SMCWD_GET_TIMELEFT

The optional SMCWD_GET_TIMELEFT command can be used to detect if
the watchdog has already been started.
See the implementation in OP-TEE secure OS [1].

At probe time, check if the watchdog is already started and then
set WDOG_HW_RUNNING in the watchdog status. This will cause the
watchdog framework to ping the watchdog until a userspace watchdog
daemon takes over the control.

Link: https://github.com/OP-TEE/optee_os/commit/a7f2d4bd8632 [1]
Signed-off-by: Antonio Borneo <antonio.borneo@foss.st.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20250520085952.210723-1-antonio.borneo@foss.st.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
4 months agowatchdog: iTCO: Drop driver-internal locking
Guenter Roeck [Sat, 17 May 2025 16:09:36 +0000 (09:09 -0700)]
watchdog: iTCO: Drop driver-internal locking

The locking code in the iTCO watchdog driver has been carried along from
before the watchdog core existed. The watchdog core protects calls into
drivers since commit f4e9c82f64b5 ("watchdog: Add Locking support"),
making driver-internal locking unnecessary. Drop it.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Link: https://lore.kernel.org/r/20250517160936.3231017-1-linux@roeck-us.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
4 months agowatchdog: apple: set max_hw_heartbeat_ms instead of max_timeout
Florian Klink [Tue, 6 May 2025 14:26:22 +0000 (17:26 +0300)]
watchdog: apple: set max_hw_heartbeat_ms instead of max_timeout

The hardware only supports timeouts slightly below 3mins, but by using
max_hw_heartbeat_ms we can let the kernel take care of supporting
timeouts larger than that when they are requested from userspace.

Switching to max_hw_heartbeat_ms also means our set_timeout function now
needs to configure the hardware to the minimum of either the requested
timeout (in seconds) or the maximum supported by the hardware (in seconds).
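
As a rough sketch of that minimum (not the driver code), assuming a
hypothetical hardware limit of 178 seconds to stand in for "slightly
below 3mins"; hw_timeout_secs() is an illustrative name:

```c
#include <stdint.h>

/* Hypothetical hardware limit; the real value depends on the timer clock. */
#define HW_MAX_TIMEOUT_SECS	178u

/*
 * Value to program into the hardware: the requested timeout, capped at
 * the largest value the hardware can hold.  Longer requested timeouts
 * are then covered by the watchdog core pinging the hardware.
 */
static uint32_t hw_timeout_secs(uint32_t requested_secs)
{
	return requested_secs < HW_MAX_TIMEOUT_SECS ?
	       requested_secs : HW_MAX_TIMEOUT_SECS;
}
```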

Signed-off-by: Florian Klink <flokli@flokli.de>
Reviewed-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Link: https://lore.kernel.org/r/20250506142621.11428-2-flokli@flokli.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
4 months agowatchdog: qcom: introduce the device data for IPQ5424 watchdog device
Kathiravan Thirumoorthy [Fri, 2 May 2025 13:17:51 +0000 (18:47 +0530)]
watchdog: qcom: introduce the device data for IPQ5424 watchdog device

To retrieve the restart reason from IMEM, certain device specific data,
such as the IMEM compatible to look up and the location in IMEM to read,
must be defined. To achieve that, introduce separate device data for
IPQ5424 and add the required details subsequently.

Signed-off-by: Kathiravan Thirumoorthy <kathiravan.thirumoorthy@oss.qualcomm.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20250502-wdt_reset_reason-v3-3-b2dc7ace38ca@oss.qualcomm.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
4 months agodt-bindings: watchdog: renesas,wdt: Document RZ/V2N (R9A09G056) support
Lad Prabhakar [Fri, 2 May 2025 12:00:54 +0000 (13:00 +0100)]
dt-bindings: watchdog: renesas,wdt: Document RZ/V2N (R9A09G056) support

Document support for the watchdog IP found on the Renesas RZ/V2N
(R9A09G056) SoC. The watchdog IP is identical to that on RZ/V2H(P),
so `renesas,r9a09g057-wdt` will be used as a fallback compatible,
enabling reuse of the existing driver without changes.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20250502120054.47323-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
4 months agowatchdog: lenovo_se30_wdt: Fix possible devm_ioremap() NULL pointer dereference in...
Henry Martin [Thu, 24 Apr 2025 07:16:48 +0000 (15:16 +0800)]
watchdog: lenovo_se30_wdt: Fix possible devm_ioremap() NULL pointer dereference in lenovo_se30_wdt_probe()

devm_ioremap() returns NULL on error. Currently, lenovo_se30_wdt_probe()
does not check for this case, which results in a NULL pointer
dereference.

Add NULL check after devm_ioremap() to prevent this issue.

Fixes: c284153a2c55 ("watchdog: lenovo_se30_wdt: Watchdog driver for Lenovo SE30 platform")
Signed-off-by: Henry Martin <bsdhenrymartin@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20250424071648.89016-1-bsdhenrymartin@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
4 months agowatchdog: s3c2410_wdt: Add exynos990-wdt compatible data
Igor Belwon [Sun, 20 Apr 2025 19:00:39 +0000 (21:00 +0200)]
watchdog: s3c2410_wdt: Add exynos990-wdt compatible data

The Exynos990 has two watchdog clusters - cl0 and cl2. Add new
driver data for these two clusters, making it possible to use the
watchdog timer on this SoC.

Signed-off-by: Igor Belwon <igor.belwon@mentallysanemainliners.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20250420-wdt-resends-april-v1-2-f58639673959@mentallysanemainliners.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
4 months agodt-bindings: watchdog: samsung-wdt: Add exynos990-wdt compatible
Igor Belwon [Sun, 20 Apr 2025 19:00:38 +0000 (21:00 +0200)]
dt-bindings: watchdog: samsung-wdt: Add exynos990-wdt compatible

Add a dt-binding compatible for the Exynos990 Watchdog timer.
This watchdog is compatible with the GS101/Exynos850 design, as
such it requires the cluster-index and syscon-phandle properties
to be present. It also contains a cl2 cluster, as such the
cluster-index property has been expanded.

Signed-off-by: Igor Belwon <igor.belwon@mentallysanemainliners.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20250420-wdt-resends-april-v1-1-f58639673959@mentallysanemainliners.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
4 months agomm/khugepaged: clean up refcount check using folio_expected_ref_count()
Shivank Garg [Mon, 26 May 2025 18:28:20 +0000 (18:28 +0000)]
mm/khugepaged: clean up refcount check using folio_expected_ref_count()

Use folio_expected_ref_count() instead of open-coded logic in
is_refcount_suitable().  This avoids code duplication and improves
clarity.

Drop is_refcount_suitable() as it is no longer needed.

Link: https://lkml.kernel.org/r/20250526182818.37978-2-shivankg@amd.com
Signed-off-by: Shivank Garg <shivankg@amd.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Bharata B Rao <bharata@amd.com>
Cc: Fengwei Yin <fengwei.yin@intel.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agoselftests/mm: fix test result reporting in gup_longterm
Mark Brown [Tue, 27 May 2025 16:04:48 +0000 (17:04 +0100)]
selftests/mm: fix test result reporting in gup_longterm

The kselftest framework uses the string logged when a test result is
reported as the unique identifier for a test, using it to track test
results between runs.  The gup_longterm test fails to follow this pattern:
it runs a single test function repeatedly with various parameters but each
result report is a string logging an error message which is fixed between
runs.

Since the code already logs each test uniquely before it starts, refactor
to also print this to a buffer, then use that name as the test result.
This isn't especially pretty but is relatively straightforward and is a
great help to tooling.

Link: https://lkml.kernel.org/r/20250527-selftests-mm-cow-dedupe-v2-4-ff198df8e38e@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agoselftests/mm: report unique test names for each cow test
Mark Brown [Tue, 27 May 2025 16:04:47 +0000 (17:04 +0100)]
selftests/mm: report unique test names for each cow test

The kselftest framework uses the string logged when a test result is
reported as the unique identifier for a test, using it to track test
results between runs.  The cow test completely fails to follow this
pattern: it runs test functions repeatedly with various parameters with
each result report from those functions being a string logging an error
message which is fixed between runs.

Since the code already logs each test uniquely before it starts, refactor
to also print this to a buffer, then use that name as the test result.
This isn't especially pretty but is relatively straightforward and is a
great help to tooling.

Link: https://lkml.kernel.org/r/20250527-selftests-mm-cow-dedupe-v2-3-ff198df8e38e@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agoselftests/mm: add helper for logging test start and results
Mark Brown [Tue, 27 May 2025 16:04:46 +0000 (17:04 +0100)]
selftests/mm: add helper for logging test start and results

Several of the MM tests have a pattern of printing a description of the
test to be run then reporting the actual TAP result using a generic string
not connected to the specific test, often in a shared function used by
many tests.  The name reported typically varies depending on the specific
result rather than the test too.  This causes problems for tooling that
works with test results: the names reported with the results are used to
deduplicate tests and track them between runs, so both duplicated names and
changing names cause trouble for things like UIs and automated bisection.

As a first step towards matching these tests better with the expectations
of kselftest provide helpers which record the test name as part of the
initial print and then use that as part of reporting a result.

This is not added as a generic kselftest helper partly because the use of
a variable to store the test name doesn't fit well with the header only
implementation of kselftest.h and partly because it's not really an
intended pattern.  Ideally at some point the mm tests that use it will be
updated to not need it.
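
A minimal userspace sketch of the pattern described above, using plain
printf() rather than the kselftest reporting functions.  The helper
names and the test_nr counter are illustrative assumptions, not a copy
of the code added by this patch.

```c
#include <stdarg.h>
#include <stdio.h>

static char test_name[256];	/* name of the test currently running */
static unsigned int test_nr;	/* running TAP test number */

/* Format the test description once, store it, and announce the test. */
static void log_test_start(const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf(test_name, sizeof(test_name), fmt, args);
	va_end(args);
	printf("# [RUN] %s\n", test_name);
}

/*
 * Report the result under the stored name, so the reported string is
 * the same whether the test passes or fails.
 */
static void log_test_result(int passed)
{
	printf("%s %u %s\n", passed ? "ok" : "not ok", ++test_nr, test_name);
}
```

Because the name is captured at start time, every code path that
reports a result emits the same identifier for a given set of
parameters.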

Link: https://lkml.kernel.org/r/20250527-selftests-mm-cow-dedupe-v2-2-ff198df8e38e@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agoselftests/mm: use standard ksft_finished() in cow and gup_longterm
Mark Brown [Tue, 27 May 2025 16:04:45 +0000 (17:04 +0100)]
selftests/mm: use standard ksft_finished() in cow and gup_longterm

Patch series "selftests/mm: cow and gup_longterm cleanups", v2.

The bulk of these changes modify the cow and gup_longterm tests to report
unique and stable names for each test, bringing them into line with the
expectations of tooling that works with kselftest.  The string reported as
a test result is used by tooling to both deduplicate tests and track tests
between test runs, using the same string for multiple tests or changing
the string depending on test result causes problems for user interfaces
and automation such as bisection.

It was suggested that converting to use kselftest_harness.h would be a
good way of addressing this, however that really wants the set of tests to
run to be known at compile time but both test programs dynamically
enumerate the set of huge page sizes the system supports and test each.
Refactoring to handle this would be even more invasive than these changes
which are large but straightforward and repetitive.

A version of the main gup_longterm cleanup was previously sent separately,
this version factors out the helpers for logging the start of the test
since the cow test looks very similar.

This patch (of 4):

The cow and gup_longterm test programs open code something that looks a
lot like the standard ksft_finished() helper to summarise the test results
and provide an exit code.  Convert them to use ksft_finished().

Link: https://lkml.kernel.org/r/20250527-selftests-mm-cow-dedupe-v2-0-ff198df8e38e@kernel.org
Link: https://lkml.kernel.org/r/20250527-selftests-mm-cow-dedupe-v2-1-ff198df8e38e@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agoselftests/damon/_damon_sysfs: skip testcases if CONFIG_DAMON_SYSFS is disabled
Enze Li [Sat, 31 May 2025 09:39:37 +0000 (17:39 +0800)]
selftests/damon/_damon_sysfs: skip testcases if CONFIG_DAMON_SYSFS is disabled

When CONFIG_DAMON_SYSFS is disabled, the selftests fail with the following
output:

not ok 2 selftests: damon: sysfs_update_schemes_tried_regions_wss_estimation.py # exit=1
not ok 3 selftests: damon: damos_quota.py # exit=1
not ok 4 selftests: damon: damos_quota_goal.py # exit=1
not ok 5 selftests: damon: damos_apply_interval.py # exit=1
not ok 6 selftests: damon: damos_tried_regions.py # exit=1
not ok 7 selftests: damon: damon_nr_regions.py # exit=1
not ok 11 selftests: damon: sysfs_update_schemes_tried_regions_hang.py # exit=1

The root cause of this issue is that none of the testcases above check
whether the sysfs interface of DAMON exists.  With this patch
applied, all the testcases above now pass successfully.

Link: https://lkml.kernel.org/r/20250531093937.1555159-1-lienze@kylinos.cn
Signed-off-by: Enze Li <lienze@kylinos.cn>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agosched/numa: add statistics of numa balance task
Chen Yu [Fri, 23 May 2025 12:51:15 +0000 (20:51 +0800)]
sched/numa: add statistics of numa balance task

On systems with NUMA balancing enabled, it has been found that tracking
task activities resulting from NUMA balancing is beneficial.  NUMA
balancing employs two mechanisms for task migration: one is to migrate
a task to an idle CPU within its preferred node, and the other is to
swap tasks located on different nodes when they are on each other's
preferred nodes.

The kernel already provides NUMA page migration statistics in
/sys/fs/cgroup/mytest/memory.stat and /proc/{PID}/sched.  However, it
lacks statistics regarding task migration and swapping.  Therefore,
relevant counts for task migration and swapping should be added.

The following two new fields:

numa_task_migrated
numa_task_swapped

will be shown in /sys/fs/cgroup/{GROUP}/memory.stat, /proc/{PID}/sched
and /proc/vmstat.

Introducing both per-task and per-memory cgroup (memcg) NUMA balancing
statistics facilitates a rapid evaluation of the performance and
resource utilization of the target workload.  For instance, users can
first identify the container with high NUMA balancing activity and then
further pinpoint a specific task within that group, and subsequently
adjust the memory policy for that task.  In short, although it is
possible to iterate through /proc/$pid/sched to locate the problematic
task, the introduction of aggregated NUMA balancing activity for tasks
within each memcg can assist users in identifying the task more
efficiently through a divide-and-conquer approach.

As Libo Chen pointed out, the memcg event relies on the text names in
vmstat_text, and /proc/vmstat generates corresponding items based on
vmstat_text.  Thus, the relevant task migration and swapping events
introduced in vmstat_text also need to be populated by
count_vm_numa_event(), otherwise these values are zero in /proc/vmstat.

In theory, task migration and swap events are part of the scheduler's
activities.  The reason for exposing them through the
memory.stat/vmstat interface is that we already have NUMA balancing
statistics in memory.stat/vmstat, and these events are closely related
to each other.  Following Shakeel's suggestion, we describe the
end-to-end flow/story of all these events occurring on a timeline for
future reference:

The goal of NUMA balancing is to co-locate a task and its memory pages
on the same NUMA node.  There are two strategies: migrate the pages to
the task's node, or migrate the task to the node where its pages
reside.

Suppose a task p1 is running on Node 0, but its pages are located on
Node 1.  NUMA page fault statistics for p1 reveal its "page footprint"
across nodes.  If NUMA balancing detects that most of p1's pages are on
Node 1:

1. Page Migration Attempt:
NUMA balancing first tries to migrate p1's pages to Node 0.
The numa_page_migrate counter increments.

2. Task Migration Strategies:
After the page migration finishes, NUMA balancing checks every
1 second to see if p1 can be migrated to Node 1.

Case 2.1: Idle CPU Available

  If Node 1 has an idle CPU, p1 is directly scheduled there.  This
  event is logged as numa_task_migrated.

Case 2.2: No Idle CPU (Task Swap)

  If all CPUs on Node 1 are busy, direct migration could cause CPU
  contention or load imbalance.  Instead: NUMA balancing selects a
  candidate task p2 on Node 1 that prefers Node 0 (e.g., due to its own
  page footprint).  p1 and p2 are swapped.  This cross-node swap is
  recorded as numa_task_swapped.
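
The two task-migration cases above can be sketched as a small decision
function.  This is a simplified userspace model, not scheduler code:
the struct, enum and function names are hypothetical, and only the two
counter names come from this patch.

```c
#include <stdbool.h>

/* Toy counters named after the two new statistics. */
struct numa_stats {
	unsigned long numa_task_migrated;
	unsigned long numa_task_swapped;
};

enum numa_action { NUMA_NONE, NUMA_MIGRATE, NUMA_SWAP };

/*
 * Case 2.1: an idle CPU on the preferred node means a direct migration.
 * Case 2.2: otherwise, swap with a suitable candidate task if one exists.
 * If neither applies, nothing happens and no counter changes.
 */
static enum numa_action numa_place_task(bool idle_cpu_on_preferred,
					bool swap_candidate_found,
					struct numa_stats *stats)
{
	if (idle_cpu_on_preferred) {
		stats->numa_task_migrated++;
		return NUMA_MIGRATE;
	}
	if (swap_candidate_found) {
		stats->numa_task_swapped++;
		return NUMA_SWAP;
	}
	return NUMA_NONE;
}
```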

Link: https://lkml.kernel.org/r/d00edb12ba0f0de3c5222f61487e65f2ac58f5b1.1748493462.git.yu.c.chen@intel.com
Link: https://lkml.kernel.org/r/7ef90a88602ed536be46eba7152ed0d33bad5790.1748002400.git.yu.c.chen@intel.com
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Tested-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Ayush Jain <Ayush.jain3@amd.com>
Cc: "Chen, Tim C" <tim.c.chen@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Libo Chen <libo.chen@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agosched/numa: fix task swap by skipping kernel threads
Libo Chen [Fri, 23 May 2025 12:51:01 +0000 (20:51 +0800)]
sched/numa: fix task swap by skipping kernel threads

Patch series "sched/numa: add statistics of numa balance task migration",
v6.

Introduce task migration and swap statistics in the following places:
/sys/fs/cgroup/{GROUP}/memory.stat
/proc/{PID}/sched
/proc/vmstat

These statistics facilitate a rapid evaluation of the performance and
resource utilization of the target workload.

This patch (of 2):

Task swapping is triggered when there are no idle CPUs in task A's
preferred node.  In this case, the NUMA load balancer chooses a task B
on A's preferred node and swaps B with A.  This helps improve NUMA
locality without introducing load imbalance between nodes.  In the
current implementation, B's NUMA node preference is not mandatory.
That is to say, a kernel thread might be incorrectly chosen as B.
However, kernel threads and user space threads that do not have an mm
are not supposed to be covered by NUMA balancing because NUMA balancing
only considers user pages via VMAs.

According to Peter's suggestion for fixing this issue, we use
PF_KTHREAD to skip the kernel thread.  curr->mm is also checked because
it is possible that user_mode_thread() might create a user thread
without an mm.  As per Prateek's analysis, after adding the PF_KTHREAD
check, there is no need to further check the PF_IDLE flag:

: - play_idle_precise() already ensures PF_KTHREAD is set before adding
:   PF_IDLE
:
: - cpu_startup_entry() is only called from the startup thread which
:   should be marked with PF_KTHREAD (based on my understanding looking at
:   commit cff9b2332ab7 ("kernel/sched: Modify initial boot task idle
:   setup"))

In summary, the check in task_numa_compare() now aligns with
task_tick_numa().
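
A toy model of the eligibility check described above.  The struct
layout, the function name and the numeric flag value are assumptions
made for this sketch; the real check operates on struct task_struct
inside task_numa_compare().

```c
#include <stdbool.h>
#include <stddef.h>

#define PF_KTHREAD	0x00200000u	/* illustrative value for this sketch */

struct toy_mm { int unused; };

struct toy_task {
	unsigned int flags;	/* PF_* style flags */
	struct toy_mm *mm;	/* NULL for tasks without a user address space */
};

/*
 * A task may be picked as the swap partner only if it is not a kernel
 * thread and it has an mm.  No separate idle-task test is done, on the
 * assumption (per the analysis above) that idle tasks carry PF_KTHREAD.
 */
static bool numa_swap_candidate_ok(const struct toy_task *t)
{
	if (!t)
		return false;
	if (t->flags & PF_KTHREAD)
		return false;
	if (!t->mm)
		return false;
	return true;
}
```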

Link: https://lkml.kernel.org/r/cover.1748493462.git.yu.c.chen@intel.com
Link: https://lkml.kernel.org/r/43d68b356b25d124f0d222ebedf3859e86eefb9f.1748493462.git.yu.c.chen@intel.com
Link: https://lkml.kernel.org/r/cover.1748002400.git.yu.c.chen@intel.com
Link: https://lkml.kernel.org/r/eaacc9c9bd37bac92d43a671867d85b2fdad3b06.1748002400.git.yu.c.chen@intel.com
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Libo Chen <libo.chen@oracle.com>
Suggested-by: Michal Koutný <mkoutny@suse.com>
Tested-by: Ayush Jain <Ayush.jain3@amd.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: "Chen, Tim C" <tim.c.chen@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agotools/testing: check correct variable in open_procmap()
Dan Carpenter [Wed, 28 May 2025 08:13:45 +0000 (11:13 +0300)]
tools/testing: check correct variable in open_procmap()

Check if "procmap_out->fd" is negative instead of "procmap_out" (which is
a pointer).
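
The class of bug can be illustrated with a reduced sketch.  The struct
and function below are hypothetical stand-ins, not the selftest helper
itself.

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical, reduced stand-in for the helper's output struct. */
struct toy_procmap_fd {
	int fd;
};

/*
 * Open a file and report failure via a negative errno.  The check must
 * look at out->fd: "out" itself is a valid pointer whether or not
 * open() succeeded, so testing it says nothing about the open() result.
 */
static int toy_open_procmap(const char *path, struct toy_procmap_fd *out)
{
	out->fd = open(path, O_RDONLY);
	if (out->fd < 0)
		return -errno;
	return 0;
}
```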

Link: https://lkml.kernel.org/r/aDbFuUTlJTBqziVd@stanley.mountain
Fixes: bd23f293a0d5 ("tools/testing: add PROCMAP_QUERY helper functions in mm self tests")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: levi.yun <yeoreum.yun@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agotools/testing/vma: add missing function stub
Lorenzo Stoakes [Wed, 28 May 2025 14:15:39 +0000 (15:15 +0100)]
tools/testing/vma: add missing function stub

The hugetlb fix introduced in commit ee40c9920ac2 ("mm: fix copy_vma()
error handling for hugetlb mappings") mistakenly did not provide a stub
for the VMA userland testing, which results in a compile error when trying
to build this.

Provide this stub to resolve the issue.

Link: https://lkml.kernel.org/r/20250528-fix-vma-test-v1-1-c8a5f533b38f@oracle.com
Fixes: ee40c9920ac2 ("mm: fix copy_vma() error handling for hugetlb mappings")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Jann Horn <jannh@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomm/gup: update comment explaining why gup_fast() disables IRQs
Jann Horn [Wed, 28 May 2025 21:06:17 +0000 (23:06 +0200)]
mm/gup: update comment explaining why gup_fast() disables IRQs

The current comment in gup_fast() talks about "IPIs that come from THPs
splitting", which is outdated and refers to the old THP splitting
implementation that was removed in commit ad0bed24e98b ("thp: drop all
split_huge_page()-related code"), which landed in v4.5.  Before then, THP
splitting involved a pmdp_splitting_flush(), which sent an IPI to
serialize against gup_fast().

Nowadays, we use tlb_remove_table_sync_one() to send IPIs that serialize
against gup_fast(); this is used, for example, in THP *collapsing* to stop
gup_fast() walks of a page table before depositing it.

Link: https://lkml.kernel.org/r/20250528-gup-irq-comment-fix-v1-1-b9d83c345333@google.com
Signed-off-by: Jann Horn <jannh@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agoselftests/mm: two fixes for the pfnmap test
David Hildenbrand [Wed, 28 May 2025 19:52:44 +0000 (21:52 +0200)]
selftests/mm: two fixes for the pfnmap test

When unregistering the signal handler, we have to pass SIG_DFL, and
blindly reading from PFN 0 and PFN 1 seems to be problematic on !x86
systems.  In particular, on arm64 tx2 machines where nothing resides at
these physical memory locations, we can generate RAS errors.

Let's fix it by scanning /proc/iomem for actual "System RAM".
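
A minimal sketch of recognising a "System RAM" line, assuming the usual
"start-end : name" layout of /proc/iomem.  parse_system_ram() is a
hypothetical helper for illustration and is not taken from the test.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line in the "start-end : name" form and report whether it
 * describes "System RAM".  On a match, *start and *end hold the
 * physical address range.  Leading indentation of nested entries is
 * skipped by the %llx conversion.
 */
static bool parse_system_ram(const char *line,
			     unsigned long long *start,
			     unsigned long long *end)
{
	char name[64];

	if (sscanf(line, "%llx-%llx : %63[^\n]", start, end, name) != 3)
		return false;
	return strcmp(name, "System RAM") == 0;
}
```

A caller would read /proc/iomem line by line and use the first
matching range instead of assuming that PFN 0 and PFN 1 are backed by
memory.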

Link: https://lkml.kernel.org/r/20250528195244.1182810-1-david@redhat.com
Fixes: 2616b370323a ("selftests/mm: add simple VM_PFNMAP tests based on mmap'ing /dev/mem")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Ryan Roberts <ryan.roberts@arm.com>
Closes: https://lore.kernel.org/all/232960c2-81db-47ca-a337-38c4bce5f997@arm.com/T/#u
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Tested-by: Aishwarya TCV <aishwarya.tcv@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomm/khugepaged: fix race with folio split/free using temporary reference
Shivank Garg [Mon, 26 May 2025 18:28:18 +0000 (18:28 +0000)]
mm/khugepaged: fix race with folio split/free using temporary reference

hpage_collapse_scan_file() calls is_refcount_suitable(), which in turn
calls folio_mapcount().  folio_mapcount() checks folio_test_large() before
proceeding to folio_large_mapcount(), but there is a race window where the
folio may get split/freed between these checks, triggering:

  VM_WARN_ON_FOLIO(!folio_test_large(folio), folio)

Take a temporary reference to the folio in hpage_collapse_scan_file().
This stabilizes the folio during the refcount check and prevents incorrect
large folio detection due to a concurrent split/free.  Use the helper
folio_expected_ref_count() + 1 to compare with folio_ref_count() instead
of using is_refcount_suitable().

Link: https://lkml.kernel.org/r/20250526182818.37978-1-shivankg@amd.com
Fixes: 05c5323b2a34 ("mm: track mapcount of large folios in single value")
Signed-off-by: Shivank Garg <shivankg@amd.com>
Reported-by: syzbot+2b99589e33edbe9475ca@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6828470d.a70a0220.38f255.000c.GAE@google.com
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Bharata B Rao <bharata@amd.com>
Cc: Fengwei Yin <fengwei.yin@intel.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomm: add CONFIG_PAGE_BLOCK_ORDER to select page block order
Juan Yescas [Wed, 21 May 2025 21:57:45 +0000 (14:57 -0700)]
mm: add CONFIG_PAGE_BLOCK_ORDER to select page block order

Problem: On large page size configurations (16KiB, 64KiB), the CMA
alignment requirement (CMA_MIN_ALIGNMENT_BYTES) increases considerably,
and this causes the CMA reservations to be larger than necessary.  This
means that the system will have fewer available MIGRATE_UNMOVABLE and
MIGRATE_RECLAIMABLE page blocks, since MIGRATE_CMA can't fall back to them.

The CMA_MIN_ALIGNMENT_BYTES increases because it depends on MAX_PAGE_ORDER
which depends on ARCH_FORCE_MAX_ORDER.  The value of ARCH_FORCE_MAX_ORDER
increases on 16k and 64k kernels.

For example, on ARM, the CMA alignment requirement when:

- CONFIG_ARCH_FORCE_MAX_ORDER default value is used
- CONFIG_TRANSPARENT_HUGEPAGE is set:

PAGE_SIZE | MAX_PAGE_ORDER | pageblock_order | CMA_MIN_ALIGNMENT_BYTES
-----------------------------------------------------------------------
   4KiB   |      10        |       9         |  4KiB * (2 ^  9) =   2MiB
  16KiB   |      11        |      11         | 16KiB * (2 ^ 11) =  32MiB
  64KiB   |      13        |      13         | 64KiB * (2 ^ 13) = 512MiB

There are some extreme cases for the CMA alignment requirement when:

- CONFIG_ARCH_FORCE_MAX_ORDER maximum value is set
- CONFIG_TRANSPARENT_HUGEPAGE is NOT set
- CONFIG_HUGETLB_PAGE is NOT set:

PAGE_SIZE | MAX_PAGE_ORDER | pageblock_order |  CMA_MIN_ALIGNMENT_BYTES
------------------------------------------------------------------------
   4KiB   |      15        |      15         |  4KiB * (2 ^ 15) = 128MiB
  16KiB   |      13        |      13         | 16KiB * (2 ^ 13) = 128MiB
  64KiB   |      13        |      13         | 64KiB * (2 ^ 13) = 512MiB
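
The last column of both tables follows one formula: the minimum CMA
alignment is one pageblock, i.e. PAGE_SIZE shifted left by
pageblock_order.  A minimal sketch of that arithmetic, using a
hypothetical helper name rather than the kernel macro:

```c
#include <stdint.h>

/*
 * Minimum CMA alignment in bytes for a given page size and pageblock
 * order: page_size * 2^pageblock_order.
 */
uint64_t cma_min_alignment_bytes(uint64_t page_size,
                                 unsigned int pageblock_order)
{
    return page_size << pageblock_order;
}
```

For instance, cma_min_alignment_bytes(16384, 11) gives 33554432 bytes
(32MiB), matching the 16KiB row of the first table.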

This affects the CMA reservations for the drivers. If a driver in a
4KiB kernel needs 4MiB of CMA memory, in a 16KiB kernel, the minimal
reservation has to be 32MiB due to the alignment requirements:

reserved-memory {
    ...
    cma_test_reserve: cma_test_reserve {
        compatible = "shared-dma-pool";
        size = <0x0 0x400000>; /* 4 MiB */
        ...
    };
};

reserved-memory {
    ...
    cma_test_reserve: cma_test_reserve {
        compatible = "shared-dma-pool";
        size = <0x0 0x2000000>; /* 32 MiB */
        ...
    };
};
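
The jump from 4MiB to 32MiB in the two snippets above can be reproduced
by rounding the requested size up to the alignment.  This is only a
sketch of the arithmetic; cma_reservation_bytes() is a hypothetical
helper, not a kernel function.

```c
#include <stdint.h>

/*
 * Smallest reservation size that is a multiple of the CMA alignment
 * and still covers the requested size.
 */
uint64_t cma_reservation_bytes(uint64_t requested, uint64_t alignment)
{
    return (requested + alignment - 1) / alignment * alignment;
}
```

With a 2MiB alignment (4KiB kernel) a 4MiB request stays at 4MiB; with
a 32MiB alignment (16KiB kernel) it grows to 32MiB.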

Solution: Add a new config option, CONFIG_PAGE_BLOCK_ORDER, that allows
setting the page block order on all architectures.  The maximum page block
order is given by ARCH_FORCE_MAX_ORDER.

By default, CONFIG_PAGE_BLOCK_ORDER has the same value as
ARCH_FORCE_MAX_ORDER, so current kernel configurations are not affected by
this change.  It is an opt-in change.

This patch makes it possible to have the same CMA alignment requirements
for large page sizes (16KiB, 64KiB) as in 4KiB kernels by setting a lower
pageblock_order.

Tests:

- Verified that HugeTLB pages work when pageblock_order is 1, 7, 10 on
  4k and 16k kernels.

- Verified that Transparent Huge Pages work when pageblock_order is 1,
  7, 10 on 4k and 16k kernels.

- Verified that dma-buf heaps allocations work when pageblock_order is
  1, 7, 10 on 4k and 16k kernels.

Benchmarks:

The benchmarks compare 16kb kernels with pageblock_order 10 and 7.
pageblock_order 7 was chosen because it makes the minimum CMA alignment
requirement the same as in 4kb kernels (2MB).
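
The choice of order 7 can be checked with a small search for the order
whose pageblock size equals a target alignment.  The function below is
a hypothetical illustration, not something the patch adds.

```c
#include <stdint.h>

/*
 * Return the pageblock order for which page_size << order equals
 * target_bytes, or -1 if no such order exists.
 */
int pageblock_order_for_alignment(uint64_t page_size, uint64_t target_bytes)
{
    int order = 0;

    if (page_size == 0 || target_bytes < page_size)
        return -1;
    while ((page_size << order) < target_bytes)
        order++;
    return (page_size << order) == target_bytes ? order : -1;
}
```

For a 16KiB page size and a 2MiB target this yields 7, since
16KiB * 2^7 = 2MiB; for a 4KiB page size the same target yields 9.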

- Perform 100K dma-buf heaps (/dev/dma_heap/system) allocations of
  SZ_8M, SZ_4M, SZ_2M, SZ_1M, SZ_64, SZ_8, SZ_4.  Use simpleperf
  (https://developer.android.com/ndk/guides/simpleperf) to measure the #
  of instructions and page-faults on 16k kernels.  The benchmark was
  executed 10 times.  The averages are below:

           # instructions         |     #page-faults
    order 10     |  order 7       | order 10 | order 7
--------------------------------------------------------
 13,891,765,770  | 11,425,777,314 |    220   |   217
 14,456,293,487  | 12,660,819,302 |    224   |   219
 13,924,261,018  | 13,243,970,736 |    217   |   221
 13,910,886,504  | 13,845,519,630 |    217   |   221
 14,388,071,190  | 13,498,583,098 |    223   |   224
 13,656,442,167  | 12,915,831,681 |    216   |   218
 13,300,268,343  | 12,930,484,776 |    222   |   218
 13,625,470,223  | 14,234,092,777 |    219   |   218
 13,508,964,965  | 13,432,689,094 |    225   |   219
 13,368,950,667  | 13,683,587,37  |    219   |   225
-------------------------------------------------------------------
 13,803,137,433  | 13,131,974,268 |    220   |   220    Averages

The instruction count was 4.86% lower when the order was 7 than when it
was 10:

     13,131,974,268 - 13,803,137,433 = -671,163,165 (-4.86%)

The average number of page faults was the same for orders 7 and 10.

These results didn't show any significant regression when the
pageblock_order is set to 7 on 16kb kernels.
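
The relative difference quoted above is the usual percent change of the
order 7 average against the order 10 average.  A sketch of the
calculation, with a hypothetical helper name:

```c
/* Percent change of candidate relative to baseline. */
double percent_change(double baseline, double candidate)
{
    return (candidate - baseline) / baseline * 100.0;
}
```

Calling percent_change(13803137433.0, 13131974268.0) gives roughly
-4.86, i.e. fewer instructions with order 7.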

- Run speedometer 3.1 (https://browserbench.org/Speedometer3.1/) 5 times
  on the 16k kernels with pageblock_order 7 and 10.

order 10 | order 7  | order 7 - order 10 | (order 7 - order 10) %
-------------------------------------------------------------------
  15.8  |  16.4    |         0.6        |     3.80%
  16.4  |  16.2    |        -0.2        |    -1.22%
  16.6  |  16.3    |        -0.3        |    -1.81%
  16.8  |  16.3    |        -0.5        |    -2.98%
  16.6  |  16.8    |         0.2        |     1.20%
-------------------------------------------------------------------
  16.44     16.4            -0.04           -0.24%   Averages

The results didn't show any significant regression when the
pageblock_order is set to 7 on 16kb kernels.

Link: https://lkml.kernel.org/r/20250521215807.1860663-1-jyescas@google.com
Signed-off-by: Juan Yescas <jyescas@google.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agommu_notifiers: remove leftover stub macros
Jann Horn [Thu, 22 May 2025 22:30:17 +0000 (00:30 +0200)]
mmu_notifiers: remove leftover stub macros

Commit ec8832d007cb ("mmu_notifiers: don't invalidate secondary TLBs as
part of mmu_notifier_invalidate_range_end()") removed the main definitions
of {ptep,pmdp_huge,pudp_huge}_clear_flush_notify; just their
!CONFIG_MMU_NOTIFIER stubs are left behind, remove them.

Link: https://lkml.kernel.org/r/20250523-mmu-notifier-cleanup-unused-v1-1-cc1f47ebec33@google.com
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agoselftests/mm: deduplicate test names in madv_populate
Mark Brown [Thu, 22 May 2025 16:29:00 +0000 (17:29 +0100)]
selftests/mm: deduplicate test names in madv_populate

The madv_populate selftest has some repetitive code for several different
cases that it covers, including repeated test names used in
ksft_test_result() reports.  This causes problems for automation, since the
test name is used both to track the test between runs and to distinguish
between multiple tests within the same run.  Fix this by making the
duplicated messages more specific about the contexts they're in.
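
To illustrate why repeated names are a problem for tooling, the sketch
below is the kind of check an automation system might run over the
reported names.  The function name is hypothetical and this is not part
of the selftest.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the first name that repeats an earlier one,
 * or -1 if all names are unique.
 */
int first_duplicate_name(const char *const names[], size_t n)
{
    for (size_t i = 1; i < n; i++)
        for (size_t j = 0; j < i; j++)
            if (strcmp(names[i], names[j]) == 0)
                return (int)i;
    return -1;
}
```

Prefixing each message with the context it runs in (for example the
operation being tested) is enough to make such a check return -1.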

Link: https://lkml.kernel.org/r/20250522-selftests-mm-madv-populate-dedupe-v1-1-fd1dedd79b4b@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agokcov: rust: add flags for KCOV with Rust
Alice Ryhl [Thu, 1 May 2025 12:16:16 +0000 (12:16 +0000)]
kcov: rust: add flags for KCOV with Rust

Rust code is currently not instrumented properly when KCOV is enabled.
Thus, add the relevant flags to perform instrumentation correctly. This
is necessary for efficient fuzzing of Rust code.

The sanitizer-coverage features of LLVM have existed for long enough
that they are available on any LLVM version supported by rustc, so we do
not need any Kconfig feature detection. The coverage level is set to 3,
as that is the level needed by trace-pc.

We do not instrument `core` since when we fuzz the kernel, we are
looking for bugs in the kernel, not the Rust stdlib.

Link: https://lkml.kernel.org/r/20250501-rust-kcov-v2-1-b71e83e9779f@google.com
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Co-developed-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Tested-by: Aleksandr Nogikh <nogikh@google.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Bill Wendling <morbo@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Trevor Gross <tmgross@umich.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomm: rust: make CONFIG_MMU ifdefs more narrow
Alice Ryhl [Fri, 16 May 2025 19:32:19 +0000 (19:32 +0000)]
mm: rust: make CONFIG_MMU ifdefs more narrow

Currently the entire kernel::mm module is ifdef'd out when CONFIG_MMU=n.
However, there are some downstream users of the module in
rust/kernel/task.rs and rust/kernel/miscdevice.rs. Thus, update the cfgs
so that only MmWithUserAsync is removed with CONFIG_MMU=n.

The code is moved into a new file, since the #[cfg()] annotation
otherwise has to be duplicated several times.

Link: https://lkml.kernel.org/r/20250516193219.2987032-1-aliceryhl@google.com
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505071753.kldNHYVQ-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202505072116.eSYC8igT-lkp@intel.com/
Fixes: 5bb9ed6cdfeb ("mm: rust: add abstraction for struct mm_struct")
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agommu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables()
Roman Gushchin [Thu, 22 May 2025 01:28:38 +0000 (01:28 +0000)]
mmu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables()

Commit b67fbebd4cf9 ("mmu_gather: Force tlb-flush VM_PFNMAP vmas") added a
forced TLB flush to tlb_end_vma(), which is required to avoid a race
between munmap() and unmap_mapping_range().  However, it added some
overhead to other paths where tlb_end_vma() is used but vmas are not
removed, e.g. madvise(MADV_DONTNEED).

Fix this by moving the TLB flush out of tlb_end_vma() into a new
tlb_flush_vmas() called from free_pgtables(), somewhat similar to the
stable version of the original commit: commit 895428ee124a ("mm: Force TLB
flush for PFNMAP mappings before unlink_file_vma()").

Note that if tlb->fullmm is set, no flush is required, as the whole mm is
about to be destroyed.

Link: https://lkml.kernel.org/r/20250522012838.163876-1-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Nick Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomm/damon/Kconfig: enable CONFIG_DAMON by default
SeongJae Park [Wed, 21 May 2025 04:27:55 +0000 (21:27 -0700)]
mm/damon/Kconfig: enable CONFIG_DAMON by default

As of this writing, multiple major distros including Alma, Amazon,
Android, CentOS, Debian, Fedora, and Oracle are build-enabling DAMON (set
CONFIG_DAMON[1]).  Enabling it by default will save configuration setup
time for the current and future DAMON users.

Build-enabling DAMON does not introduce a real risk since it makes no
behavioral change by default.  It requires explicit user requests to do
anything.  The only potential risk is making the size of the kernel a
little bit larger.  On a production-purpose configuration, it increases
the resulting kernel package size by about 0.1% of the final package
file.  I believe that's too small to be a real problem in common setups.

Hence, the benefit of enabling CONFIG_DAMON outweighs the potential risk.
Set CONFIG_DAMON by default.

Link: https://oracle.github.io/kconfigs/?config=UTS_RELEASE&config=DAMON
Link: https://lkml.kernel.org/r/20250521042755.39653-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Acked-by: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomm/damon/Kconfig: set DAMON_{VADDR,PADDR,SYSFS} default to DAMON
SeongJae Park [Wed, 21 May 2025 04:27:54 +0000 (21:27 -0700)]
mm/damon/Kconfig: set DAMON_{VADDR,PADDR,SYSFS} default to DAMON

Patch series "mm/damon: build-enable essential DAMON components by
default".

As of this writing, multiple major distros including Alma, Amazon,
Android, CentOS, Debian, Fedora, and Oracle are build-enabling DAMON (set
CONFIG_DAMON[1]).  Configuring DAMON is not very easy, since it is
disabled by default, and there are multiple essential options that need to
be manually turned on, one by one.  Make it easier, by grouping essential
configurations to be enabled with one selection, and enabling build of the
essential parts of DAMON by default.

Note that build-enabling DAMON does not introduce any real risk, since it
makes no behavioral change by default.  It requires explicit user requests
to do anything.  The only potential risk is making the size of the kernel
a little bit larger.  On a production-purpose configuration, it increases
the resulting kernel package binary size by about 0.1% of the final
package file.  I believe that's too small to be a real problem in common
setups.

DAMON_{VADDR,PADDR,SYSFS} are de-facto essential parts of DAMON for normal
usages.  Because those need to be enabled one by one, however, and there
are other test-purpose or non-essential configurations, it is easy to be
confused and make mistakes at setup.  Make the essential configurations
default to CONFIG_DAMON, so that those can be enabled by default with a
single change.

Link: https://oracle.github.io/kconfigs/?config=UTS_RELEASE&config=DAMON
Link: https://lkml.kernel.org/r/20250521042755.39653-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20250521042755.39653-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Acked-by: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agohugetlb: show nr_huge_pages in report_hugepages()
Wenjie Xu [Thu, 15 May 2025 11:42:31 +0000 (19:42 +0800)]
hugetlb: show nr_huge_pages in report_hugepages()

The number of pre-allocated huge pages should be nr_huge_pages, not
free_huge_pages, although they are the same during the boot stage.

Link: https://lkml.kernel.org/r/20250515114231.65824-1-xuwenjie04@baidu.com
Signed-off-by: Wenjie Xu <xuwenjie04@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agoselftests/mm: skip hugevm test if kernel config file is not present
Zi Yan [Fri, 16 May 2025 13:29:38 +0000 (09:29 -0400)]
selftests/mm: skip hugevm test if kernel config file is not present

When running hugevm tests on a machine without a kernel config present,
e.g., a VM running a kernel with neither CONFIG_IKCONFIG_PROC nor
/boot/config-*, skip the hugevm tests, which read the kernel config to get
page table level information.
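
The skip condition can be sketched in C as below.  This is an
illustration only, not the selftest's own implementation; the function
names are hypothetical, and the caller is assumed to pass
/proc/config.gz and the /boot/config-* path for the running kernel.

```c
#include <unistd.h>

#define KSFT_SKIP 4   /* kselftest exit code for a skipped test */

/* True if at least one of the two kernel config sources is readable. */
int kernel_config_readable(const char *proc_path, const char *boot_path)
{
    return access(proc_path, R_OK) == 0 || access(boot_path, R_OK) == 0;
}

/* Return 0 to run the test, or KSFT_SKIP if no kernel config is present. */
int hugevm_precheck(const char *proc_path, const char *boot_path)
{
    return kernel_config_readable(proc_path, boot_path) ? 0 : KSFT_SKIP;
}
```

Returning the skip code instead of a failure keeps results from
machines without a config file from being counted as regressions.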

Link: https://lkml.kernel.org/r/20250516132938.356627-3-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Adam Sindelar <adam@wowsignal.io>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agoselftests/mm: skip guard_regions.uffd tests when uffd is not present
Zi Yan [Fri, 16 May 2025 13:29:37 +0000 (09:29 -0400)]
selftests/mm: skip guard_regions.uffd tests when uffd is not present

Patch series "Skip mm selftests instead when kernel features are not
present", v2.

Two guard_regions tests on userfaultfd fail when userfaultfd is not
present.  Skip them instead.

The hugevm test reads the kernel config to get page table level information and
fails when neither /proc/config.gz nor /boot/config-* is present.  Skip it
instead.

This patch (of 2):

When userfaultfd is not compiled into the kernel, userfaultfd() returns -1,
causing guard_regions.uffd tests to fail.  Skip the tests instead.
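
A minimal sketch of the probe, assuming that a kernel built without
userfaultfd reports ENOSYS: the enum and function names are
hypothetical, and mapping every other error to a failure is a choice of
this sketch (a real test may also skip on permission errors).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

enum uffd_probe { UFFD_AVAILABLE, UFFD_SKIP, UFFD_FAIL };

/* Classify the return value and errno of a userfaultfd() attempt. */
enum uffd_probe classify_uffd(long ret, int err)
{
    if (ret >= 0)
        return UFFD_AVAILABLE;
    if (err == ENOSYS)
        return UFFD_SKIP;   /* syscall not available in this kernel */
    return UFFD_FAIL;
}

/* Try to create a userfaultfd and report whether tests can run. */
enum uffd_probe probe_userfaultfd(void)
{
    long fd = syscall(SYS_userfaultfd, 0);
    enum uffd_probe result = classify_uffd(fd, errno);

    if (fd >= 0)
        close((int)fd);
    return result;
}
```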

Link: https://lkml.kernel.org/r/20250516132938.356627-1-ziy@nvidia.com
Link: https://lkml.kernel.org/r/20250516132938.356627-2-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Adam Sindelar <adam@wowsignal.io>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomm/shmem: remove unneeded xa_is_value() check in shmem_unuse_swap_entries()
Kemeng Shi [Fri, 16 May 2025 17:09:39 +0000 (01:09 +0800)]
mm/shmem: remove unneeded xa_is_value() check in shmem_unuse_swap_entries()

As only value entries are added to fbatch in shmem_find_swap_entries(),
there is no need for the xa_is_value() check in shmem_unuse_swap_entries().
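
The reasoning can be shown with a userspace model of XArray value
entries, which are tagged by setting bit 0.  The model_ helpers and
collect_value_entries() below are hypothetical stand-ins, not the shmem
code.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of a value entry: the integer shifted left, with bit 0 set. */
void *model_xa_mk_value(unsigned long v)
{
    return (void *)(uintptr_t)((v << 1) | 1);
}

/* Model of the value-entry test: bit 0 distinguishes it from a pointer. */
int model_xa_is_value(const void *entry)
{
    return (int)((uintptr_t)entry & 1);
}

/*
 * Producer: copy only value entries into the batch.  Every element of
 * the batch therefore passes model_xa_is_value(), so a consumer does
 * not need to test again.
 */
size_t collect_value_entries(void *const *slots, size_t n, void **batch)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (model_xa_is_value(slots[i]))
            batch[count++] = slots[i];
    return count;
}
```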

Link: https://lkml.kernel.org/r/20250516170939.965736-6-shikemeng@huaweicloud.com
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomm: shmem: only remove inode from swaplist when it's swapped page count is 0
Kemeng Shi [Fri, 16 May 2025 17:09:38 +0000 (01:09 +0800)]
mm: shmem: only remove inode from swaplist when it's swapped page count is 0

Even if we fail to allocate a swap entry, the inode might have previously
allocated entries, and we might take an inode that still contains swap
entries off the swaplist.  As a result, try_to_unuse() may enter a
potential dead loop, repeatedly looking for the inode to clean its swap
entries.  Only take the inode off the swaplist when its swapped page count
is 0 to fix the issue.

Link: https://lkml.kernel.org/r/20250516170939.965736-5-shikemeng@huaweicloud.com
Fixes: b487a2da3575 ("mm, swap: simplify folio swap allocation")
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Kairui Song <kasong@tencent.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202505161438.9009cf47-lkp@intel.com
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomm/shmem: fix potential dead loop in shmem_unuse()
Kemeng Shi [Fri, 16 May 2025 17:09:37 +0000 (01:09 +0800)]
mm/shmem: fix potential dead loop in shmem_unuse()

If multiple shmem_unuse() calls for different swap types run concurrently,
a dead loop could occur as follows:

shmem_unuse(typeA)               shmem_unuse(typeB)
 mutex_lock(&shmem_swaplist_mutex)
 list_for_each_entry_safe(info, next, ...)
  ...
  mutex_unlock(&shmem_swaplist_mutex)
  /* info->swapped may drop to 0 */
  shmem_unuse_inode(&info->vfs_inode, type)

                                  mutex_lock(&shmem_swaplist_mutex)
                                  list_for_each_entry(info, next, ...)
                                   if (!info->swapped)
                                    list_del_init(&info->swaplist)

                                  ...
                                  mutex_unlock(&shmem_swaplist_mutex)

  mutex_lock(&shmem_swaplist_mutex)
  /* iterate with offlist entry and encounter a dead loop */
  next = list_next_entry(info, swaplist);
  ...

Restart the iteration if the inode is already off the shmem_swaplist to
fix the issue.
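
The hazard can be reproduced in a small userspace model of a circular
doubly linked list.  This is a sketch with hypothetical names, not the
shmem code: once a node is unlinked and re-initialised it points at
itself, so following its next pointer never reaches the list head.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

void list_init_model(struct list_head *h) { h->next = h->prev = h; }

/* A node that points at itself is not on any list. */
int list_empty_model(const struct list_head *h) { return h->next == h; }

void list_add_tail_model(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink the node, then make it point at itself. */
void list_del_init_model(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init_model(n);
}

/*
 * Choose the next node after working on cur with the lock dropped.
 * If cur was removed in the meantime, cur->next == cur and following
 * it would spin forever, so restart from the head instead.
 */
struct list_head *next_or_restart(struct list_head *head,
                                  struct list_head *cur)
{
    if (list_empty_model(cur))
        return head->next;
    return cur->next;
}
```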

Link: https://lkml.kernel.org/r/20250516170939.965736-4-shikemeng@huaweicloud.com
Fixes: b56a2d8af914 ("mm: rid swapoff of quadratic complexity")
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomm: shmem: add missing shmem_unacct_size() in __shmem_file_setup()
Kemeng Shi [Fri, 16 May 2025 17:09:36 +0000 (01:09 +0800)]
mm: shmem: add missing shmem_unacct_size() in __shmem_file_setup()

We will miss shmem_unacct_size() when is_idmapped_mnt() returns a failure.
Move is_idmapped_mnt() before shmem_acct_size() to fix the issue.
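
The ordering issue can be modelled in a few lines: run the check that
may fail before charging anything, so the early return has nothing to
undo.  All names below are hypothetical and the error codes are
assumptions of this sketch.

```c
#include <errno.h>

/* Model of accounting: add size to the running total. */
int model_acct_size(long *accounted, long size)
{
    *accounted += size;
    return 0;
}

/* Model of the matching undo operation. */
void model_unacct_size(long *accounted, long size)
{
    *accounted -= size;
}

int model_file_setup(int mnt_is_idmapped, long size, long *accounted)
{
    /* Fallible check first: nothing has been charged yet. */
    if (mnt_is_idmapped)
        return -EINVAL;

    if (model_acct_size(accounted, size))
        return -ENOMEM;

    /* Any later failure path must call model_unacct_size(). */
    return 0;
}
```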

Link: https://lkml.kernel.org/r/20250516170939.965736-3-shikemeng@huaweicloud.com
Fixes: 7a80e5b8c6fa ("shmem: support idmapped mounts for tmpfs")
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomm: shmem: avoid unpaired folio_unlock() in shmem_swapin_folio()
Kemeng Shi [Fri, 16 May 2025 17:09:35 +0000 (01:09 +0800)]
mm: shmem: avoid unpaired folio_unlock() in shmem_swapin_folio()

Patch series "Some random fixes and cleanup to shmem", v3.

This series contains some simple fixes and cleanups made while learning
shmem.  More details can be found in the respective patches.

This patch (of 5):

If we get a folio from swap_cache_get_folio() successfully but encounter a
failure before the folio is locked, we will unlock the folio which was not
previously locked.

Put the folio and set it to NULL when a failure occurs before the folio is
locked to fix the issue.

Link: https://lkml.kernel.org/r/20250516170939.965736-1-shikemeng@huaweicloud.com
Link: https://lkml.kernel.org/r/20250516170939.965736-2-shikemeng@huaweicloud.com
Fixes: 058313515d5a ("mm: shmem: fix potential data corruption during shmem swapin")
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Kairui Song <kasong@tencent.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomm/damon/core: avoid destroyed target reference from DAMOS quota
Akinobu Mita [Sat, 17 May 2025 14:18:52 +0000 (23:18 +0900)]
mm/damon/core: avoid destroyed target reference from DAMOS quota

When the number of monitoring targets in running contexts is reduced,
there may be DAMOS quotas referencing targets that will be destroyed.

Applying the scheme action for such a DAMOS scheme will then be skipped
forever while looking for the starting part of the region of the destroyed
monitoring target.

To fix this issue, when the monitoring target is destroyed, reset the
starting part for all DAMOS quotas that reference the target.

Link: https://lkml.kernel.org/r/20250517141852.142802-1-akinobu.mita@gmail.com
Fixes: da87878010e5 ("mm/damon/sysfs: support online inputs update")
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomemcg: make memcg_rstat_updated nmi safe
Shakeel Butt [Mon, 19 May 2025 06:31:42 +0000 (23:31 -0700)]
memcg: make memcg_rstat_updated nmi safe

Currently the kernel maintains memory-related stats updates per-cgroup to
optimize stats flushing.  stats_updates is defined as atomic64_t, which
is not nmi-safe on some archs.  Actually we don't really need a 64-bit
atomic, as the maximum value stats_updates can reach should be less than
nr_cpus * MEMCG_CHARGE_BATCH.  A normal atomic_t should suffice.
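
The bound can be checked with simple arithmetic.  The sketch below
assumes a batch size of 64 (the value of MEMCG_CHARGE_BATCH is an
assumption here, not taken from this commit) and a 32-bit signed
counter; the names are hypothetical.

```c
#include <limits.h>

#define MODEL_CHARGE_BATCH 64UL   /* assumed value of MEMCG_CHARGE_BATCH */

/* True if nr_cpus * MODEL_CHARGE_BATCH fits in a signed int counter. */
int stats_updates_fits_int(unsigned long nr_cpus)
{
    return nr_cpus <= (unsigned long)INT_MAX / MODEL_CHARGE_BATCH;
}
```

Under these assumptions even 8192 CPUs give a bound of 524,288, far
below INT_MAX.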

Also, the function cgroup_rstat_updated() is still not nmi-safe, but there
is a parallel effort to make it nmi-safe, so until then let's ignore it in
the nmi context.

Link: https://lkml.kernel.org/r/20250519063142.111219-6-shakeel.butt@linux.dev
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 months agomemcg: nmi-safe slab stats updates
Shakeel Butt [Mon, 19 May 2025 06:31:41 +0000 (23:31 -0700)]
memcg: nmi-safe slab stats updates

The objcg based kmem [un]charging can be called in nmi context and it may
need to update NR_SLAB_[UN]RECLAIMABLE_B stats.  So, let's correctly
handle the updates of these stats in the nmi context.

Link: https://lkml.kernel.org/r/20250519063142.111219-5-shakeel.butt@linux.dev
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>