Linus Torvalds [Sat, 30 Nov 2024 18:17:53 +0000 (10:17 -0800)]
Merge tag 'nfs-for-6.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Bugfixes:
- nfs/localio: fix for a memory corruption in nfs_local_read_done
- Revert "nfs: don't reuse partially completed requests in
nfs_lock_and_join_requests"
- nfsv4:
- ignore SB_RDONLY when mounting nfs
- Fix a use-after-free problem in open()
- sunrpc:
- clear XPRT_SOCK_UPD_TIMEOUT when resetting the transport
- timeout and cancel TLS handshake with -ETIMEDOUT
- fix one UAF issue caused by sunrpc kernel tcp socket
- Fix a hang in TLS sock_close if sk_write_pending
- pNFS/blocklayout: Fix device registration issues
Features and cleanups:
- localio cleanups from Mike Snitzer
- Clean up refcounting on the nfs version modules
- __counted_by() annotations
- nfs: make processes that are waiting for an I/O lock killable"
* tag 'nfs-for-6.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits)
fs/nfs/io: make nfs_start_io_*() killable
nfs/blocklayout: Limit repeat device registration on failure
nfs/blocklayout: Don't attempt unregister for invalid block device
sunrpc: fix one UAF issue caused by sunrpc kernel tcp socket
SUNRPC: timeout and cancel TLS handshake with -ETIMEDOUT
sunrpc: clear XPRT_SOCK_UPD_TIMEOUT when reset transport
nfs: ignore SB_RDONLY when mounting nfs
Revert "nfs: don't reuse partially completed requests in nfs_lock_and_join_requests"
Revert "fs: nfs: fix missing refcnt by replacing folio_set_private by folio_attach_private"
nfs/localio: must clear res.replen in nfs_local_read_done
NFSv4.0: Fix a use-after-free problem in the asynchronous open()
NFSv4.0: Fix the wake up of the next waiter in nfs_release_seqid()
SUNRPC: Fix a hang in TLS sock_close if sk_write_pending
sunrpc: remove newlines from tracepoints
nfs: Annotate struct pnfs_commit_array with __counted_by()
nfs/localio: eliminate need for nfs_local_fsync_work forward declaration
nfs/localio: remove extra indirect nfs_to call to check {read,write}_iter
nfs/localio: eliminate unnecessary kref in nfs_local_fsync_ctx
nfs/localio: remove redundant suid/sgid handling
NFS: Implement get_nfs_version()
...
Linus Torvalds [Sat, 30 Nov 2024 18:14:42 +0000 (10:14 -0800)]
Merge tag '6.13-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client updates from Steve French:
- directory lease fixes
- password rotation fixes
- reconnect fix
- fix for SMB3.02 mounts
- DFS (global namespace) fixes
- fixes for special file handling (most relating to better handling of
various types of symlinks)
- two minor cleanups
* tag '6.13-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: (22 commits)
cifs: update internal version number
cifs: unlock on error in smb3_reconfigure()
cifs: during remount, make sure passwords are in sync
cifs: support mounting with alternate password to allow password rotation
smb: Initialize cfid->tcon before performing network ops
smb: During unmount, ensure all cached dir instances drop their dentry
smb: client: fix noisy message when mounting shares
smb: client: don't try following DFS links in cifs_tree_connect()
smb: client: allow reconnect when sending ioctl
smb: client: get rid of @nlsc param in cifs_tree_connect()
smb: client: allow more DFS referrals to be cached
cifs: Fix parsing reparse point with native symlink in SMB1 non-UNICODE session
cifs: Validate content of WSL reparse point buffers
cifs: Improve guard for excluding $LXDEV xattr
cifs: Add support for parsing WSL-style symlinks
cifs: Validate content of native symlink
cifs: Fix parsing native symlinks relative to the export
smb: client: fix NULL ptr deref in crypto_aead_setkey()
Update misleading comment in cifs_chan_update_iface
smb: client: change return value in open_cached_dir_by_dentry() if !cfids
...
Linus Torvalds [Sat, 30 Nov 2024 18:06:56 +0000 (10:06 -0800)]
Merge tag '6.13-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server updates from Steve French:
- fix use after free due to race in ksmbd workqueue handler
- debugging improvements
- fix incorrectly formatted response when client attempts SMB1
- improve memory allocation to reduce chance of OOM
- improve delays between retries when killing sessions
* tag '6.13-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix use-after-free in SMB request handling
ksmbd: add debug print for pending request during server shutdown
ksmbd: add netdev-up/down event debug print
ksmbd: add debug prints to know what smb2 requests were received
ksmbd: add debug print for rdma capable
ksmbd: use msleep instaed of schedule_timeout_interruptible()
ksmbd: use __GFP_RETRY_MAYFAIL
ksmbd: fix malformed unsupported smb1 negotiate response
Linus Torvalds [Sat, 30 Nov 2024 17:03:16 +0000 (09:03 -0800)]
Merge tag 'tty-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial driver updates from Greg KH:
"Here is a small set of tty and serial driver updates for 6.13-rc1.
Nothing major at all this time, only some small changes:
- few device tree binding updates
- 8250_exar serial driver updates
- imx serial driver updates
- sprd_serial driver updates
- other tiny serial driver updates, full details in the shortlog
All of these have been in linux-next for a while with one reported
issue, but that commit has now been reverted"
* tag 'tty-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (37 commits)
Revert "serial: sh-sci: Clean sci_ports[0] after at earlycon exit"
serial: amba-pl011: fix build regression
dt-bindings: serial: Add a new compatible string for ums9632
serial: sprd: Add support for sc9632
tty/serial/altera_uart: unwrap error log string
tty/serial/altera_jtaguart: unwrap error log string
serial: amba-pl011: Fix RX stall when DMA is used
tty: ldsic: fix tty_ldisc_autoload sysctl's proc_handler
serial: 8250_fintek: Add support for F81216E
serial: sh-sci: Clean sci_ports[0] after at earlycon exit
tty: atmel_serial: Fix typo retreives to retrieves
tty: atmel_serial: Use devm_platform_ioremap_resource()
serial: 8250: omap: Move pm_runtime_get_sync
tty: serial: samsung: Add Exynos8895 compatible
dt-bindings: serial: samsung: Add samsung,exynos8895-uart compatible
serial: 8250_dw: Add Sophgo SG2044 quirk
dt-bindings: serial: snps-dw-apb-uart: Add Sophgo SG2044 uarts
dt-bindings: serial: snps,dw-apb-uart: merge duplicate compatible entry.
altera_jtaguart: Use dev_err() to report error attaching IRQ
altera_uart: Use dev_err() to report error attaching IRQ handler
...
Linus Torvalds [Fri, 29 Nov 2024 21:06:06 +0000 (13:06 -0800)]
Merge tag 'drm-next-2024-11-29' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Merge window fixes, mostly amdgpu and xe, with a few other minor ones,
all looks fairly normal,
i915:
- hdcp: Fix when the first read and write are retried
xe:
- Wake up waiters after wait condition set to true
- Mark the preempt fence workqueue as reclaim
- Update xe2 graphics name string
- Fix a couple of guc submit races
- Fix pat index usage in migrate
- Ensure non-cached migrate pagetable bo mappings
- Take a PM ref in the delayed snapshot capture worker
* tag 'drm-next-2024-11-29' of https://gitlab.freedesktop.org/drm/kernel: (48 commits)
drm/xe: Take PM ref in delayed snapshot capture worker
drm/xe/migrate: use XE_BO_FLAG_PAGETABLE
drm/xe/migrate: fix pat index usage
drm/xe/guc_submit: fix race around suspend_pending
drm/xe/guc_submit: fix race around pending_disable
drm/xe: Update xe2_graphics name string
drm/rockchip: avoid 64-bit division
Revert "drm/radeon: Delay Connector detecting when HPD singals is unstable"
drm/amdgpu/jpeg: cancel the jpeg worker
drm/amdgpu: fix usage slab after free
drm/amdgpu/vcn: reset fw_shared when VCPU buffers corrupted on vcn v4.0.3
drm/amdgpu: Fix sysfs warning when hotplugging
drm/amdgpu: Add sysfs interface for vcn reset mask
drm/amdgpu/gmc7: fix wait_for_idle callers
drm/amd/pm: Remove arcturus min power limit
drm/amd/pm: skip setting the power source on smu v14.0.2/3
drm/amd/pm: disable pcie speed switching on Intel platform for smu v14.0.2/3
drm/amdkfd: Use the correct wptr size
drm/xe: Mark preempt fence workqueue as reclaim
drm/xe/ufence: Wake up waiters after setting ufence->signalled
...
Linus Torvalds [Fri, 29 Nov 2024 21:01:05 +0000 (13:01 -0800)]
Merge tag 'sound-fix-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes. Majority of changes are device-specific
fixes and quirks, while there are a few core fixes to address
regressions and corner cases spotted by fuzzers.
- Fix of spinlock range that wrongly covered kvfree() call in rawmidi
- Fix potential NULL dereference at PCM mmap
- Fix incorrectly advertised MIDI 2.0 UMP Function Block info
- Various ASoC AMD quirks and fixes
- ASoC SOF Intel, Mediatek, HDMI-codec fixes
- A few more quirks and TAS2781 codec fix for HD-audio
- A couple of fixes for USB-audio for malicious USB descriptors"
* tag 'sound-fix-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (30 commits)
ALSA: hda: improve bass speaker support for ASUS Zenbook UM5606WA
ALSA: hda/realtek: Apply quirk for Medion E15433
ASoC: amd: yc: Add a quirk for microfone on Lenovo ThinkPad P14s Gen 5 21MES00B00
ASoC: SOF: ipc3-topology: Convert the topology pin index to ALH dai index
ASoC: mediatek: Check num_codecs is not zero to avoid panic during probe
ASoC: amd: yc: Fix for enabling DMIC on acp6x via _DSD entry
ALSA: ump: Fix evaluation of MIDI 1.0 FB info
ALSA: core: Fix possible NULL dereference caused by kunit_kzalloc()
ALSA: hda: Show the codec quirk info at probing
ALSA: asihpi: Remove unused variable
ALSA: hda/realtek: Set PCBeep to default value for ALC274
ALSA: hda/tas2781: Add speaker id check for ASUS projects
ALSA: hda/realtek: Update ALC225 depop procedure
ALSA: hda/realtek: Enable speaker pins for Medion E15443 platform
ALSA: hda/realtek: fix mute/micmute LEDs don't work for EliteBook X G1i
ALSA: usb-audio: Fix out of bounds reads when finding clock sources
ALSA: rawmidi: Fix kvfree() call in spinlock
ALSA: hda/realtek: Fix Internal Speaker and Mic boost of Infinix Y4 Max
ASoC: amd: yc: Add quirk for microphone on Lenovo Thinkpad T14s Gen 6 21M1CTO1WW
ASoC: doc: dapm: Add location information for dapm-graph tool
...
Linus Torvalds [Fri, 29 Nov 2024 19:58:27 +0000 (11:58 -0800)]
Merge tag 'char-misc-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc/IIO/whatever driver subsystem updates from Greg KH:
"Here is the 'big and hairy' char/misc/iio and other small driver
subsystem updates for 6.13-rc1.
Loads of things in here, and even a fun merge conflict!
- rust misc driver bindings and other rust changes to make misc
drivers actually possible.
I think this is the tipping point, expect to see way more rust
drivers going forward now that these bindings are present. Next
merge window hopefully we will have pci and platform drivers
working, which will fully enable almost all driver subsystems to
start accepting (or at least getting) rust drivers.
This is the end result of a lot of work from a lot of people,
congrats to all of them for getting this far, you've proved many of
us wrong in the best way possible, working code :)
- IIO driver updates, too many to list individually, that subsystem
keeps growing and growing...
- Interconnect driver updates
- nvmem driver updates
- pwm driver updates
- platform_driver::remove() fixups, loads of them
- counter driver updates
- misc driver updates (keba?)
- binder driver updates and fixes
- loads of other small char/misc/etc driver updates and additions,
full details in the shortlog.
All of these have been in linux-next for a while, with no reported
issues other than that merge conflict"
* tag 'char-misc-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (401 commits)
mei: vsc: Fix typo "maintstepping" -> "mainstepping"
firmware: Switch back to struct platform_driver::remove()
misc: isl29020: Fix the wrong format specifier
scripts/tags.sh: Don't tag usages of DEFINE_MUTEX
fpga: Switch back to struct platform_driver::remove()
mei: vsc: Improve error logging in vsc_identify_silicon()
mei: vsc: Do not re-enable interrupt from vsc_tp_reset()
dt-bindings: spmi: qcom,x1e80100-spmi-pmic-arb: Add SAR2130P compatible
dt-bindings: spmi: spmi-mtk-pmif: Add compatible for MT8188
spmi: pmic-arb: fix return path in for_each_available_child_of_node()
iio: Move __private marking before struct element priv in struct iio_dev
docs: iio: ad7380: add adaq4370-4 and adaq4380-4
iio: adc: ad7380: add support for adaq4370-4 and adaq4380-4
iio: adc: ad7380: use local dev variable to shorten long lines
iio: adc: ad7380: fix oversampling formula
dt-bindings: iio: adc: ad7380: add adaq4370-4 and adaq4380-4 compatible parts
bus: mhi: host: pci_generic: Use pcim_iomap_region() to request and map MHI BAR
bus: mhi: host: Switch trace_mhi_gen_tre fields to native endian
misc: atmel-ssc: Use of_property_present() for non-boolean properties
misc: keba: Add hardware dependency
...
Linus Torvalds [Fri, 29 Nov 2024 19:43:29 +0000 (11:43 -0800)]
Merge tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is a small set of driver core changes for 6.13-rc1.
Nothing major for this merge cycle, except for the two simple merge
conflicts that are here just to make life interesting.
Included in here are:
- sysfs core changes and preparations for more sysfs api cleanups
that can come through all driver trees after -rc1 is out
- fw_devlink fixes based on many reports and debugging sessions
- list_for_each_reverse() removal, no one was using it!
- last-minute seq_printf() format string bug found and fixed in many
drivers all at once.
- minor bugfixes and changes full details in the shortlog"
* tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits)
Fix a potential abuse of seq_printf() format string in drivers
cpu: Remove spurious NULL in attribute_group definition
s390/con3215: Remove spurious NULL in attribute_group definition
perf: arm-ni: Remove spurious NULL in attribute_group definition
driver core: Constify bin_attribute definitions
sysfs: attribute_group: allow registration of const bin_attribute
firmware_loader: Fix possible resource leak in fw_log_firmware_info()
drivers: core: fw_devlink: Fix excess parameter description in docstring
driver core: class: Correct WARN() message in APIs class_(for_each|find)_device()
cacheinfo: Use of_property_present() for non-boolean properties
cdx: Fix cdx_mmap_resource() after constifying attr in ->mmap()
drivers: core: fw_devlink: Make the error message a bit more useful
phy: tegra: xusb: Set fwnode for xusb port devices
drm: display: Set fwnode for aux bus devices
driver core: fw_devlink: Stop trying to optimize cycle detection logic
driver core: Constify attribute arguments of binary attributes
sysfs: bin_attribute: add const read/write callback variants
sysfs: implement all BIN_ATTR_* macros in terms of __BIN_ATTR()
sysfs: treewide: constify attribute callback of bin_attribute::llseek()
sysfs: treewide: constify attribute callback of bin_attribute::mmap()
...
Linus Torvalds [Fri, 29 Nov 2024 19:36:13 +0000 (11:36 -0800)]
Merge tag 'staging-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver updates from Greg KH:
"Here is the big set of staging driver changes for 6.13-rc1.
Lots of changes this merge cycle, drivers removed and drivers added.
Highlights include:
- removals of the following staging drivers due to no forward
progress and no one having either the hardware or the time/energy
to deal with them anymore:
- fieldbus
- gdm724x
- olpc_dcon
- rtl8712
- rts5208
- vt6655
- vt6656
If anyone has this hardware and wants to work on the drivers, it
can be an easy revert to get them back.
- addition of the gpib driver subsystem. Lots of drivers for really
old and semi-old interfaces to lab equipment. We expect lots of
churn in these drivers as they get cleaned up to "working" order.
These were added at the request of a user and the maintainer/author
of them is helping out with the effort
- loads and loads of tiny coding style cleanups for almost all
staging drivers. Too many to list, see the shortlog for details.
All of these have been in linux-next for a very long time with no
reported issues"
* tag 'staging-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (216 commits)
Staging: gpib: gpib_os.c - Remove unnecessary OOM message
staging: gpib: avoid unintended sign extension
staging: vchiq_debugfs: Use forward declarations
staging: vchiq_core: Rectify header include for vchiq_dump_state()
staging: vc04_services: Cleanup TODO entry
staging: most: Remove TODO contact information
staging: rtl8723bs: Remove TODO contact information
staging: sm750fb: Remove TODO contact information
staging: iio: Remove TODO file
staging: greybus: uart: Fix atomicity violation in get_serial_info()
staging: rtl8723bs: Remove unused function Efuse_GetCurrentSize
staging: rtl8723bs: Remove unused function efuse_WordEnableDataRead
staging: rtl8723bs: Remove function hal_EfusePgPacketWrite1ByteHeader
staging: rtl8723bs: Remove function hal_EfusePgPacketWrite2ByteHeader
staging: rtl8723bs: Remove unused function hal_EfusePgCheckAvailableAddr
staging: rtl8723bs: Remove unused function hal_EfuseConstructPGPkt
staging: rtl8723bs: Remove unused function hal_EfusePartialWriteCheck
staging: rtl8723bs: Remove unused function hal_EfusePgPacketWriteHeader
staging: rtl8723bs: Remove unused function hal_EfusePgPacketWriteData
staging: rtl8723bs: Remove unused function Hal_EfusePgPacketWrite_BT
...
Linus Torvalds [Fri, 29 Nov 2024 19:19:31 +0000 (11:19 -0800)]
Merge tag 'usb-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the big set of USB and Thunderbolt changes for 6.13-rc1.
Overall, a pretty slow development cycle, the majority of the work
going into the debugfs interface for the thunderbolt (i.e. USB4) code,
to help with debugging the myriad ways that hardware vendors get their
interfaces messed up. Other than that, here's the highlights:
- thunderbolt changes and additions to debugfs interfaces
- lots of device tree updates for new and old hardware
- UVC configfs gadget updates and new apis for features
- xhci driver updates and fixes
- dwc3 driver updates and fixes
- typec driver updates and fixes
- lots of other small updates and fixes, full details in the shortlog
All of these have been in linux-next for a while with no reported
problems"
* tag 'usb-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (148 commits)
usb: typec: tcpm: Add support for sink-bc12-completion-time-ms DT property
dt-bindings: usb: maxim,max33359: add usage of sink bc12 time property
dt-bindings: connector: Add time property for Sink BC12 detection completion
usb: dwc3: gadget: Remove dwc3_request->needs_extra_trb
usb: dwc3: gadget: Cleanup SG handling
usb: dwc3: gadget: Fix looping of queued SG entries
usb: dwc3: gadget: Fix checking for number of TRBs left
usb: dwc3: ep0: Don't clear ep0 DWC3_EP_TRANSFER_STARTED
Revert "usb: gadget: composite: fix OS descriptors w_value logic"
usb: ehci-spear: fix call balance of sehci clk handling routines
USB: make to_usb_device_driver() use container_of_const()
USB: make to_usb_driver() use container_of_const()
USB: properly lock dynamic id list when showing an id
USB: make single lock for all usb dynamic id lists
drivers/usb/storage: refactor min with min_t
drivers/usb/serial: refactor min with min_t
drivers/usb/musb: refactor min/max with min_t/max_t
drivers/usb/mon: refactor min with min_t
drivers/usb/misc: refactor min with min_t
drivers/usb/host: refactor min/max with min_t/max_t
...
Linus Torvalds [Fri, 29 Nov 2024 19:15:07 +0000 (11:15 -0800)]
Merge tag 'modules-6.13-rc1-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull modules fixes from Luis Chamberlain:
"Three fixes, the main one being a build fix: the kallsyms test modules
were built all over again if we just ran make twice"
* tag 'modules-6.13-rc1-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux:
selftests: find_symbol: Actually use load_mod() parameter
selftests: kallsyms: fix and clarify current test boundaries
selftests: kallsyms: fix double build stupidity
Linus Torvalds [Fri, 29 Nov 2024 19:10:30 +0000 (11:10 -0800)]
Merge tag 'apparmor-pr-2024-11-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor updates from John Johansen:
"Features:
- extend next/check table to add support for 2^24 states to the state
machine.
- rework capability audit cache to use broader cred information
instead of just the profile. Also add a time stamp so old entries
can be aged out of the cache.
Bug Fixes:
- fix 'Do simple duplicate message elimination' to clear previous
state when updating in capability audit cache
- Fix memory leak for aa_unpack_strdup()
- properly handle cx/px lookup failure when in complain mode
- allocate xmatch for nullpdb inside aa_alloc_null fixing a NULL ptr
deref of tracking profiles when in complain mode
Cleanups:
- Remove everything being reported as deadcode
- replace misleading 'scrubbing environment' phrase in debug print
- Remove unnecessary NULL check before kvfree()
- clean up duplicated parts of handle_onexec()
- Use IS_ERR_OR_NULL() helper function
- move new_profile declaration to top of block instead of immediately
after label to remove C23 extension warning
Documentation:
- add comment to document that the capability.c:profile_capable ad ptr
parameter can not be NULL
- add comment to document that the first entry in the packed perms struct
is reserved for future planned expansion.
- Update LSM/apparmor.rst add blurb for DEFAULT_SECURITY_APPARMOR"
* tag 'apparmor-pr-2024-11-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: lift new_profile declaration to remove C23 extension warning
apparmor: replace misleading 'scrubbing environment' phrase in debug print
parser: drop dead code for XXX_comb macros
apparmor: Remove unused parameter L1 in macro next_comb
Docs: Update LSM/apparmor.rst
apparmor: audit_cap dedup based on subj_cred instead of profile
apparmor: add a cache entry expiration time aging out capability audit cache
apparmor: document capability.c:profile_capable ad ptr not being NULL
apparmor: fix 'Do simple duplicate message elimination'
apparmor: document first entry is in packed perms struct is reserved
apparmor: test: Fix memory leak for aa_unpack_strdup()
apparmor: Remove deadcode
apparmor: Remove unnecessary NULL check before kvfree()
apparmor: domain: clean up duplicated parts of handle_onexec()
apparmor: Use IS_ERR_OR_NULL() helper function
apparmor: add support for 2^24 states to the dfa state machine.
apparmor: properly handle cx/px lookup failure for complain
apparmor: allocate xmatch for nullpdb inside aa_alloc_null
Linus Torvalds [Fri, 29 Nov 2024 18:40:52 +0000 (10:40 -0800)]
Merge tag 's390-6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Heiko Carstens:
- Add swap entry for hugetlbfs support
- Add PTE_MARKER support for hugetlbfs mappings; this fixes a regression
(possible page fault loop) which was introduced when support for
UFFDIO_POISON for hugetlbfs was added
- Add ARCH_HAS_PREEMPT_LAZY and PREEMPT_DYNAMIC support
- Mark IRQ entries in entry code, so that stack tracers can filter out
the non-IRQ parts of stack traces. This fixes stack depot capacity
limit warnings, since without filtering the number of unique stack
traces is huge
- In PCI code fix leak of struct zpci_dev object, and fix potential
double remove of hotplug slot
- Fix pagefault_disable() / pagefault_enable() unbalance in
arch_stack_user_walk_common()
- A couple of inline assembly optimizations, more cmpxchg() to
try_cmpxchg() conversions, and removal of usages of xchg() and
cmpxchg() on one and two byte memory areas
- Various other small improvements and cleanups
* tag 's390-6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (27 commits)
Revert "s390/mm: Allow large pages for KASAN shadow mapping"
s390/spinlock: Use flag output constraint for arch_cmpxchg_niai8()
s390/spinlock: Use R constraint for arch_load_niai4()
s390/spinlock: Generate shorter code for arch_spin_unlock()
s390/spinlock: Remove condition code clobber from arch_spin_unlock()
s390/spinlock: Use symbolic names in inline assemblies
s390: Support PREEMPT_DYNAMIC
s390/pci: Fix potential double remove of hotplug slot
s390/pci: Fix leak of struct zpci_dev when zpci_add_device() fails
s390/mm/hugetlbfs: Add missing includes
s390/mm: Add PTE_MARKER support for hugetlbfs mappings
s390/mm: Introduce region-third and segment table swap entries
s390/mm: Introduce region-third and segment table entry present bits
s390/mm: Rearrange region-third and segment table entry SW bits
KVM: s390: Increase size of union sca_utility to four bytes
KVM: s390: Remove one byte cmpxchg() usage
KVM: s390: Use try_cmpxchg() instead of cmpxchg() loops
s390/ap: Replace xchg() with WRITE_ONCE()
s390/mm: Allow large pages for KASAN shadow mapping
s390: Add ARCH_HAS_PREEMPT_LAZY support
...
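The cmpxchg() to try_cmpxchg() conversions mentioned in the summary above
follow a common loop pattern. The following is a minimal userspace sketch of
that pattern, assuming C11 atomics as a stand-in for the kernel helpers. The
names sketch_cmpxchg(), set_bits_cmpxchg() and set_bits_try_cmpxchg() are
hypothetical and do not appear in the s390 patches.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a cmpxchg()-style helper: returns the value
 * that was stored in *ptr before the operation, whether or not the swap
 * took place. */
unsigned int sketch_cmpxchg(_Atomic unsigned int *ptr,
                            unsigned int old, unsigned int new)
{
    /* On failure, C11 writes the current value of *ptr into 'old'. */
    atomic_compare_exchange_strong(ptr, &old, new);
    return old;
}

/* cmpxchg()-style loop: the caller compares the returned value with the
 * expected one and carries it over by hand on every iteration. */
unsigned int set_bits_cmpxchg(_Atomic unsigned int *ptr, unsigned int bits)
{
    unsigned int old, prev;

    assert(ptr != NULL);
    prev = atomic_load(ptr);
    do {
        old = prev;
        prev = sketch_cmpxchg(ptr, old, old | bits);
    } while (prev != old);
    return old;
}

/* try_cmpxchg()-style loop: a failed exchange refreshes 'old', so the
 * explicit compare and the extra variable go away. */
unsigned int set_bits_try_cmpxchg(_Atomic unsigned int *ptr, unsigned int bits)
{
    unsigned int old;

    assert(ptr != NULL);
    old = atomic_load(ptr);
    while (!atomic_compare_exchange_strong(ptr, &old, old | bits))
        ; /* 'old' now holds the current value; retry with it */
    return old;
}
```

Both functions return the value seen just before the successful update, so a
caller can switch from one to the other without changing its own logic.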
Linus Torvalds [Fri, 29 Nov 2024 18:31:18 +0000 (10:31 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux
Pull ARM updates from Russell King:
- add dev_is_amba() function to allow conversions during the next cycle
- improve PREEMPT_RT performance with VFP
- KASAN fixes for vmap stack
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9431/1: mm: Pair atomic_set_release() with _read_acquire()
ARM: 9430/1: entry: Do a dummy read from VMAP shadow
ARM: 9429/1: ioremap: Sync PGDs for VMALLOC shadow
ARM: 9426/1: vfp: Move sending signals outside of vfp_state_hold()ed section.
ARM: 9425/1: vfp: Use vfp_state_hold() in vfp_support_entry().
ARM: 9424/1: vfp: Use vfp_state_hold() in vfp_sync_hwstate().
ARM: 9423/1: vfp: Provide vfp_state_hold() for VFP locking.
ARM: 9415/1: amba: Add dev_is_amba() function and export it for modules
Linus Torvalds [Fri, 29 Nov 2024 18:27:49 +0000 (10:27 -0800)]
Merge tag 'sparc-for-6.13-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/alarsson/linux-sparc
Pull sparc updates from Andreas Larsson:
- Make sparc64 compilable with clang
- Replace one-element array with flexible array member
* tag 'sparc-for-6.13-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/alarsson/linux-sparc:
sparc/vdso: Add helper function for 64-bit right shift on 32-bit target
sparc: Replace one-element array with flexible array member
sparc/build: Add SPARC target flags for compiling with clang
sparc/build: Put usage of -fcall-used* flags behind cc-option
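The "Replace one-element array with flexible array member" change in the
sparc pull above is an instance of a general C idiom. The sketch below
illustrates the idiom only; struct sample_old, struct sample_new and
sample_alloc() are hypothetical names and are not taken from the sparc
sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical "before" layout: a one-element array that really means
 * "N elements follow". sizeof() counts one element that may not exist. */
struct sample_old {
    unsigned int count;
    int values[1];
};

/* Hypothetical "after" layout: a C99 flexible array member. sizeof()
 * covers only the header, which makes size calculations explicit. */
struct sample_new {
    unsigned int count;
    int values[];
};

/* Allocate a header plus 'count' zeroed trailing elements. */
struct sample_new *sample_alloc(unsigned int count)
{
    struct sample_new *s;
    unsigned int i;

    s = malloc(sizeof(*s) + (size_t)count * sizeof(s->values[0]));
    if (!s)
        return NULL;

    s->count = count;
    for (i = 0; i < count; i++)
        s->values[i] = 0;
    return s;
}
```

With the flexible array member there is no "minus one" correction in the
allocation size, and compilers can diagnose out-of-bounds accesses that the
one-element form hides.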
Allowing large pages for KASAN shadow mappings isn't inherently wrong,
but adding POPULATE_KASAN_MAP_SHADOW to large_allowed() exposes an issue
in can_large_pud() and can_large_pmd().
Since commit d8073dc6bc04 ("s390/mm: Allow large pages only for aligned
physical addresses"), both can_large_pud() and can_large_pmd() call _pa()
to check if large page physical addresses are aligned. However, _pa()
has a side effect: it allocates memory in POPULATE_KASAN_MAP_SHADOW
mode. This results in massive memory leaks.
The proper fix would be to address both large_allowed() and _pa()'s side
effects, but for now, revert this change to avoid the leaks.
Fixes: ff123eb77416 ("s390/mm: Allow large pages for KASAN shadow mapping")
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Linus Torvalds [Thu, 28 Nov 2024 19:46:13 +0000 (11:46 -0800)]
Merge tag 'trace-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull more tracing updates from Steven Rostedt:
- Add trace flag for NEED_RESCHED_LAZY
Now that NEED_RESCHED_LAZY is upstream, add it to the status bits of
the common_flags. Traces will now show when the NEED_RESCHED_LAZY flag
is set, which is useful for debugging latency issues in the kernel.
- Remove leftover "__idx" variable when SRCU was removed from the
tracepoint code
- Add rcu_tasks_trace guard
To add a guard() around the tracepoint code, a rcu_tasks_trace guard
needs to be created first.
- Remove __DO_TRACE() macro and just call __DO_TRACE_CALL() directly
The __DO_TRACE() macro has conditional locking depending on what was
passed into the macro parameters. As the guts of the macro have been
moved to __DO_TRACE_CALL() to handle static call logic, there's no
reason to keep the __DO_TRACE() macro around.
It is better to just do the locking in place without the conditionals
and call __DO_TRACE_CALL() from those locations. The "cond" passed in
can also be moved out of that macro. This simplifies the code.
- Remove the "cond" from the system call tracepoint macros
The "cond" variable was added to allow some tracepoints to check a
condition within the static_branch (jump/nop) logic. The system calls
do not need this. Removing it simplifies the code.
- Replace scoped_guard() with just guard() in the tracepoint logic
guard() works just as well as scoped_guard() in the tracepoint logic
and the scoped_guard() causes some issues.
* tag 'trace-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Use guard() rather than scoped_guard()
tracing: Remove cond argument from __DECLARE_TRACE_SYSCALL
tracing: Remove conditional locking from __DO_TRACE()
rcupdate_trace: Define rcu_tasks_trace lock guard
tracing: Remove __idx variable from __DO_TRACE
tracing: Move it_func[0] comment to the relevant context
tracing: Record task flag NEED_RESCHED_LAZY.
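The guard() and scoped_guard() helpers mentioned in the tracing pull above
are built on the compiler's cleanup attribute. The sketch below is a toy
userspace model of the two shapes, assuming GCC or Clang. The "lock" is only
a depth counter, and TOY_GUARD(), TOY_SCOPED_GUARD() and the toy_* names are
hypothetical, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy "lock": a depth counter standing in for a real lock. */
struct toy_lock {
    int depth;
};

static inline struct toy_lock *toy_lock_acquire(struct toy_lock *l)
{
    assert(l != NULL);
    l->depth++;
    return l;
}

/* Called automatically when the guarded variable goes out of scope. */
static inline void toy_unlock_cleanup(struct toy_lock **l)
{
    (*l)->depth--;
}

/* guard()-like shape: held until the end of the enclosing block. */
#define TOY_GUARD(lock)                                              \
    struct toy_lock *toy_guard_var                                   \
        __attribute__((cleanup(toy_unlock_cleanup), unused)) =       \
        toy_lock_acquire(lock)

/* scoped_guard()-like shape: held only for the statement that follows,
 * implemented as a for loop that runs exactly once. */
#define TOY_SCOPED_GUARD(lock)                                       \
    for (struct toy_lock *toy_scope                                  \
             __attribute__((cleanup(toy_unlock_cleanup), unused)) =  \
             toy_lock_acquire(lock), *toy_done = NULL;               \
         !toy_done; toy_done = (void *)1)

/* Returns the depth observed while the guard is held. */
int depth_inside_guard(struct toy_lock *l)
{
    TOY_GUARD(l);
    return l->depth; /* cleanup runs after the return value is read */
}

/* Returns inside * 10 + after, where 'after' is read once the scoped
 * guard has already been released. */
int depth_after_scoped_guard(struct toy_lock *l)
{
    int inside = 0;

    TOY_SCOPED_GUARD(l) {
        inside = l->depth;
    }
    return inside * 10 + l->depth;
}
```

The scoped form wraps its body in a hidden for loop, while the plain form is
an ordinary declaration. This sketch does not attempt to reproduce the
specific issues the pull message refers to.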
Geert Uytterhoeven [Thu, 28 Nov 2024 14:04:52 +0000 (15:04 +0100)]
selftests: find_symbol: Actually use load_mod() parameter
The parameter passed to load_mod() is stored in $MOD, but never used.
Obviously it was intended to be used instead of the hardcoded
"test_kallsyms_b" module name.
Fixes: 84b4a51fce4ccc66 ("selftests: add new kallsyms selftests")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Luis Chamberlain [Wed, 27 Nov 2024 22:10:57 +0000 (14:10 -0800)]
selftests: kallsyms: fix double build stupidity
The current arrangement will have the test modules rebuilt on
any make without having the script or code actually change.
Take Masahiro Yamada's suggested fix and cleanups on the Makefile
to fix this.
Dave Airlie [Thu, 28 Nov 2024 18:59:21 +0000 (04:59 +1000)]
Merge tag 'drm-xe-next-fixes-2024-11-28' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
Driver Changes:
- Update xe2 graphics name string (Matt Roper)
- Fix a couple of guc submit races (Matt Auld)
- Fix pat index usage in migrate (Matt Auld)
- Ensure non-cached migrate pagetable bo mappings (Matt Auld)
- Take a PM ref in the delayed snapshot capture worker (Matt Brost)
Linus Torvalds [Thu, 28 Nov 2024 18:15:20 +0000 (10:15 -0800)]
Merge tag 'net-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bluetooth.
Current release - regressions:
- rtnetlink: fix rtnl_dump_ifinfo() error path
- bluetooth: remove the redundant sco_conn_put
Previous releases - regressions:
- netlink: fix false positive warning in extack during dumps
- sched: sch_fq: don't follow the fast path if Tx is behind now
- ipv6: delete temporary address if mngtmpaddr is removed or
unmanaged
- tcp: fix use-after-free of nreq in reqsk_timer_handler().
- bluetooth: fix slab-use-after-free Read in set_powered_sync
- l2tp: fix warning in l2tp_exit_net found
- eth:
- bnxt_en: fix receive ring space parameters when XDP is active
- lan78xx: fix double free issue with interrupt buffer allocation
- tg3: set coherent DMA mask bits to 31 for BCM57766 chipsets
Previous releases - always broken:
- ipmr: fix tables suspicious RCU usage
- iucv: MSG_PEEK causes memory leak in iucv_sock_destruct()
- eth:
- octeontx2-af: fix low network performance
- stmmac: dwmac-socfpga: set RX watchdog interrupt as broken
- rtase: correct the speed for RTL907XD-V1
Misc:
- some documentation fixup"
* tag 'net-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits)
ipmr: fix build with clang and DEBUG_NET disabled.
Documentation: tls_offload: fix typos and grammar
Fix spelling mistake
ipmr: fix tables suspicious RCU usage
ip6mr: fix tables suspicious RCU usage
ipmr: add debug check for mr table cleanup
selftests: rds: move test.py to TEST_FILES
net_sched: sch_fq: don't follow the fast path if Tx is behind now
tcp: Fix use-after-free of nreq in reqsk_timer_handler().
net: phy: fix phy_ethtool_set_eee() incorrectly enabling LPI
net: Comment copy_from_sockptr() explaining its behaviour
rxrpc: Improve setsockopt() handling of malformed user input
llc: Improve setsockopt() handling of malformed user input
Bluetooth: SCO: remove the redundant sco_conn_put
Bluetooth: MGMT: Fix possible deadlocks
Bluetooth: MGMT: Fix slab-use-after-free Read in set_powered_sync
bnxt_en: Unregister PTP during PCI shutdown and suspend
bnxt_en: Refactor bnxt_ptp_init()
bnxt_en: Fix receive ring space parameters when XDP is active
bnxt_en: Fix queue start to update vnic RSS table
...
Linus Torvalds [Thu, 28 Nov 2024 18:06:00 +0000 (10:06 -0800)]
Merge tag 'spi-fix-v6.13-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few fairly minor driver specific fixes, plus one core fix for the
handling of deferred probe on ACPI systems - ignoring probe deferral
and incorrectly treating it like a fatal error while parsing the
generic ACPI bindings for SPI devices"
* tag 'spi-fix-v6.13-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: Fix acpi deferred irq probe
spi: atmel-quadspi: Fix register name in verbose logging function
spi-imx: prevent overflow when estimating transfer time
spi: rockchip-sfc: Embedded DMA only support 4B aligned address
Max Kellermann [Thu, 21 Nov 2024 13:53:51 +0000 (14:53 +0100)]
fs/nfs/io: make nfs_start_io_*() killable
This allows killing processes that wait for a lock when one process is
stuck waiting for the NFS server. This aims to complete the coverage
of NFS operations being killable, like nfs_direct_wait() does, for
example.
Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Benjamin Coddington [Fri, 22 Nov 2024 15:11:12 +0000 (10:11 -0500)]
nfs/blocklayout: Limit repeat device registration on failure
Every pNFS SCSI IO wants to do LAYOUTGET, then within the layout find the
device which can drive GETDEVINFO, then finally may need to prep the device
with a reservation. This slow work makes a mess of IO latencies if one of
the later steps is going to fail for a while.
If we're unable to register a SCSI device, ensure we mark the device as
unavailable so that it will timeout and be re-added via GETDEVINFO. This
avoids repeated doomed attempts to register a device in the IO path.
Add some clarifying comments as well.
Fixes: d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Benjamin Coddington [Fri, 22 Nov 2024 15:11:11 +0000 (10:11 -0500)]
nfs/blocklayout: Don't attempt unregister for invalid block device
Since commit d869da91cccb ("nfs/blocklayout: Fix premature PR key
unregistration") an unmount of a pNFS SCSI layout-enabled NFS may
dereference a NULL block_device in:
This happens because even though we were able to create the
nfs4_deviceid_node, the lookup for the device was unable to attach the
block device to the pnfs_block_dev.
If we never found a block device to register, we can avoid this case with
the PNFS_BDEV_REGISTERED flag. Move the deref behind the test for the
flag.
Fixes: d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
ip netns add netns_1
ip link add name veth_1_peer type veth peer veth_1
ifconfig veth_1_peer 11.11.0.254 up
ip link set veth_1 netns netns_1
ip netns exec netns_1 ifconfig veth_1 11.11.0.1
ip netns exec netns_1 /root/iptables -A OUTPUT -d 11.11.0.254 -p tcp \
--tcp-flags FIN FIN -j DROP
(note: In my environment, a DESTROY_CLIENTID operation is always sent
immediately, breaking the nfs tcp connection.)
ip netns exec netns_1 timeout -s 9 300 mount -t nfs -o proto=tcp,vers=4.1 \
11.11.0.254:/mnt/nfsshare /mnt/nfs/netns_1
ip netns del netns_1
The reason here is that the tcp socket in netns_1 (nfs side) has been
shutdown and closed (done in xs_destroy), but the FIN message (with ack)
is discarded, and the nfsd side keeps sending retransmission messages.
As a result, when the tcp sock in netns_1 processes the received message,
it sends the message (FIN message) in the sending queue, and the tcp timer
is re-established. When the network namespace is deleted, the net structure
accessed by tcp's timer handler function causes problems.
To fix this problem, let's hold netns refcnt for the tcp kernel socket as
done in other modules. This is an ugly hack which can easily be backported
to earlier kernels. A proper fix which cleans up the interfaces will
follow, but may not be so easy to backport.
Fixes: 26abe14379f8 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.") Signed-off-by: Liu Jian <liujian56@huawei.com> Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Benjamin Coddington [Fri, 15 Nov 2024 13:59:36 +0000 (08:59 -0500)]
SUNRPC: timeout and cancel TLS handshake with -ETIMEDOUT
We've noticed a situation where an unstable TCP connection can cause the
TLS handshake to timeout waiting for userspace to complete it. When this
happens, we don't want to return from xs_tls_handshake_sync() with zero, as
this will cause the upper xprt to be set CONNECTED, and subsequent attempts
to transmit will be returned with -EPIPE. The sunrpc machine does not
recover from this situation and will spin attempting to transmit.
The return value of tls_handshake_cancel() can be used to detect a race
with completion:
* tls_handshake_cancel - cancel a pending handshake
* Return values:
* %true - Uncompleted handshake request was canceled
* %false - Handshake request already completed or not found
If true, we do not want the upper xprt to be connected, so return
-ETIMEDOUT. If false, it's possible the handshake request was lost and
that may be the reason for our timeout. Again we do not want the upper
xprt to be connected, so return -ETIMEDOUT.
Ensure that we always return an error from xs_tls_handshake_sync() if we
call tls_handshake_cancel().
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Fixes: 75eb6af7acdf ("SUNRPC: Add a TCP-with-TLS RPC transport class") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
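The return-value rule from the xs_tls_handshake_sync() change above can be illustrated with a small Python model. This is only a sketch under stated assumptions: handshake_sync_result() and its parameters are hypothetical names invented for illustration and are not the kernel code. It shows that once the wait has timed out and the handshake has been cancelled, the result is -ETIMEDOUT regardless of what the cancel call reported.

```python
import errno


def handshake_sync_result(completed_in_time: bool, handshake_status: int,
                          cancel_returned_true: bool) -> int:
    """Toy model of the return-value rule, not the kernel implementation.

    completed_in_time    -- the handshake finished before the wait timed out
    handshake_status     -- 0 or a negative errno reported on completion
    cancel_returned_true -- what a tls_handshake_cancel()-like call reported
    """
    if completed_in_time:
        # Normal path: report whatever the completed handshake produced.
        return handshake_status
    # Timed-out path: the handshake was cancelled. Whether the cancel hit a
    # pending request (True) or found nothing (False, possibly because the
    # request was lost), the transport must not be treated as connected,
    # so cancel_returned_true deliberately does not change the result.
    return -errno.ETIMEDOUT
```

In this model the cancel result is accepted only to make the point that both outcomes lead to the same error.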
Liu Jian [Fri, 15 Nov 2024 09:38:04 +0000 (17:38 +0800)]
sunrpc: clear XPRT_SOCK_UPD_TIMEOUT when reset transport
Since transport->sock has been set to NULL during reset transport,
XPRT_SOCK_UPD_TIMEOUT also needs to be cleared. Otherwise, the
xs_tcp_set_socket_timeouts() may be triggered in xs_tcp_send_request()
to dereference the transport->sock that has been set to NULL.
Fixes: 7196dbb02ea0 ("SUNRPC: Allow changing of the TCP timeout parameters on the fly") Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
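The hazard described in the XPRT_SOCK_UPD_TIMEOUT commit above can be sketched as a toy Python model. Everything here is an assumption made for illustration: ToyTransport, UPD_TIMEOUT and the clear_flag switch are hypothetical and only mimic the idea that a pending "update timeouts" flag must not outlive the socket it refers to.

```python
UPD_TIMEOUT = 1 << 0  # hypothetical bit standing in for XPRT_SOCK_UPD_TIMEOUT


class ToyTransport:
    """Toy stand-in for a transport with a socket and a pending-work flag."""

    def __init__(self):
        self.sock = object()       # pretend socket
        self.state = 0             # flag bits
        self.timeouts_applied = 0  # how often timeouts were pushed to sock

    def request_timeout_update(self):
        # Someone changed the timeout parameters; apply them on next send.
        self.state |= UPD_TIMEOUT

    def reset(self, clear_flag: bool):
        # Resetting the transport drops the socket. With clear_flag=True the
        # pending flag is dropped too, which models the fixed behaviour.
        self.sock = None
        if clear_flag:
            self.state &= ~UPD_TIMEOUT

    def send_request(self):
        if self.state & UPD_TIMEOUT:
            if self.sock is None:
                # In C this would be the NULL dereference.
                raise RuntimeError("would dereference a NULL sock")
            self.state &= ~UPD_TIMEOUT
            self.timeouts_applied += 1
```

Calling reset(clear_flag=False) followed by send_request() raises in this model, while reset(clear_flag=True) leaves nothing pending.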
Li Lingfeng [Thu, 14 Nov 2024 04:53:03 +0000 (12:53 +0800)]
nfs: ignore SB_RDONLY when mounting nfs
When exporting only one file system with fsid=0 on the server side, the
client alternately uses the ro/rw mount options to perform the mount
operation, and a new vfsmount is generated each time.
It can be reproduced as follows:
[root@localhost ~]# mount /dev/sda /mnt2
[root@localhost ~]# echo "/mnt2 *(rw,no_root_squash,fsid=0)" >/etc/exports
[root@localhost ~]# systemctl restart nfs-server
[root@localhost ~]# mount -t nfs -o ro,vers=4 127.0.0.1:/ /mnt/sdaa
[root@localhost ~]# mount -t nfs -o rw,vers=4 127.0.0.1:/ /mnt/sdaa
[root@localhost ~]# mount -t nfs -o ro,vers=4 127.0.0.1:/ /mnt/sdaa
[root@localhost ~]# mount -t nfs -o rw,vers=4 127.0.0.1:/ /mnt/sdaa
[root@localhost ~]# mount | grep nfs4
127.0.0.1:/ on /mnt/sdaa type nfs4 (ro,relatime,vers=4.2,rsize=1048576,...
127.0.0.1:/ on /mnt/sdaa type nfs4 (rw,relatime,vers=4.2,rsize=1048576,...
127.0.0.1:/ on /mnt/sdaa type nfs4 (ro,relatime,vers=4.2,rsize=1048576,...
127.0.0.1:/ on /mnt/sdaa type nfs4 (rw,relatime,vers=4.2,rsize=1048576,...
[root@localhost ~]#
We expected that after mounting with the ro option, using the rw option to
mount again would return EBUSY, but that was not the case.
As shown above, when mounting for the first time, a superblock with the ro
flag will be generated, and at the same time, in do_new_mount_fc -->
do_add_mount, it detects that the superblock corresponding to the current
target directory is inconsistent with the currently generated one
(path->mnt->mnt_sb != newmnt->mnt.mnt_sb), and a new vfsmount will be
generated.
When mounting with the rw option for the second time, since no matching
superblock can be found in the fs_supers list, a new superblock with the
rw flag will be generated again. The superblock in use (ro) is different
from the newly generated superblock (rw), and a new vfsmount will be
generated again.
When mounting with the ro option for the third time, the superblock (ro)
is found in fs_supers, the superblock in use (rw) is different from the
found superblock (ro), and a new vfsmount will be generated again.
We can switch between ro/rw through remount, and only one superblock needs
to be generated, thus avoiding the problem of repeated generation of
vfsmount caused by switching superblocks.
Furthermore, this can also resolve the issue described in the link.
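The ro/rw sequence walked through above can be replayed with a toy Python model. It is a sketch, not the VFS or NFS code: ToySuper, mount_sequence() and the ignore_rdonly switch are hypothetical names, the list stands in for fs_supers, and "in use" stands in for the superblock currently mounted on the target directory.

```python
class ToySuper:
    """Toy superblock that only remembers whether it was created read-only."""

    def __init__(self, read_only: bool):
        self.read_only = read_only


def mount_sequence(options, ignore_rdonly: bool):
    """Replay a list of 'ro'/'rw' mounts of one export on one directory.

    Returns (per-mount results, number of superblocks created).
    """
    supers = []    # stands in for the fs_supers list
    in_use = None  # superblock currently on top of the target directory
    results = []
    for opt in options:
        want_ro = opt == "ro"
        # Look for a matching superblock; optionally ignore the ro flag.
        sb = next((s for s in supers
                   if ignore_rdonly or s.read_only == want_ro), None)
        if sb is None:
            sb = ToySuper(read_only=want_ro)
            supers.append(sb)
        if sb is in_use:
            results.append("EBUSY")         # same superblock already there
        else:
            in_use = sb
            results.append("new vfsmount")  # a different superblock stacks
    return results, len(supers)
```

With ignore_rdonly=False the sequence ro, rw, ro, rw yields four new vfsmounts and two superblocks, as in the reproduction; with ignore_rdonly=True only the first mount succeeds and the rest report EBUSY in this model.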
Linus Torvalds [Thu, 28 Nov 2024 17:40:53 +0000 (09:40 -0800)]
Merge tag 'regulator-fix-v6.13-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A couple of fixes that came in during the merge window, plus
documentation of a new device ID for the Qualcomm LABIBB driver.
There's a core fix for the rarely used current constraints and a fix
for the Qualcomm RPMH driver which had described only one of the two
voltage ranges that the hardware could control, creating a potential
incompatibility with the configuration left by firmware"
* tag 'regulator-fix-v6.13-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: core: Ignore unset max_uA constraints in current limit check
dt-bindings: regulator: qcom-labibb-regulator: document the pmi8950 labibb regulator
regulator: qcom-rpmh: Update ranges for FTSMPS525
Linus Torvalds [Thu, 28 Nov 2024 17:22:00 +0000 (09:22 -0800)]
Merge tag 'ntfs3_for_6.13' of https://github.com/Paragon-Software-Group/linux-ntfs3
Pull ntfs3 updates from Konstantin Komarov:
- additional checks to address issues identified by syzbot
- continuation of the transition from 'page' to 'folio'
* tag 'ntfs3_for_6.13' of https://github.com/Paragon-Software-Group/linux-ntfs3:
fs/ntfs3: Accumulated refactoring changes
fs/ntfs3: Switch to folio to release resources
fs/ntfs3: Add check in ntfs_extend_initialized_size
fs/ntfs3: Add more checks in mi_enum_attr (part 2)
fs/ntfs3: Equivalent transition from page to folio
fs/ntfs3: Fix case when unmarked clusters intersect with zone
fs/ntfs3: Fix warning in ni_fiemap
Linus Torvalds [Thu, 28 Nov 2024 17:18:11 +0000 (09:18 -0800)]
Merge tag 'exfat-for-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:
- If the start cluster of stream entry is invalid, treat it as the
empty directory
- Valid size of stream entry cannot be greater than data size. If
valid_size is invalid, use data_size
- Move Direct-IO alignment check to before extending the valid size
- Fix uninit-value issue reported by syzbot
- Optimize finding directory entry-set in write_inode, rename, unlink
* tag 'exfat-for-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: reduce FAT chain traversal
exfat: code cleanup for exfat_readdir()
exfat: remove argument 'p_dir' from exfat_add_entry()
exfat: move exfat_chain_set() out of __exfat_resolve_path()
exfat: add exfat_get_dentry_set_by_ei() helper
exfat: rename argument name for exfat_move_file and exfat_rename_file
exfat: remove unnecessary read entry in __exfat_rename()
exfat: fix file being changed by unaligned direct write
exfat: fix uninit-value in __exfat_get_dentry_set
exfat: fix out-of-bounds access of directory entries
Paolo Abeni [Thu, 28 Nov 2024 16:18:04 +0000 (17:18 +0100)]
ipmr: fix build with clang and DEBUG_NET disabled.
Sasha reported a build issue in ipmr::
net/ipv4/ipmr.c:320:13: error: function 'ipmr_can_free_table' is not \
needed and will not be emitted \
[-Werror,-Wunneeded-internal-declaration]
320 | static bool ipmr_can_free_table(struct net *net)
Apparently clang is too smart with BUILD_BUG_ON_INVALID(), let's
fall back to a plain WARN_ON_ONCE().
Reported-by: Sasha Levin <sashal@kernel.org> Closes: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.11-25635-g6813e2326f1e/testrun/26111580/suite/build/test/clang-nightly-lkftconfig/details/ Fixes: 11b6e701bce9 ("ipmr: add debug check for mr table cleanup") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/ee75faa926b2446b8302ee5fc30e129d2df73b90.1732810228.git.pabeni@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Dan Carpenter [Fri, 15 Nov 2024 09:13:58 +0000 (12:13 +0300)]
cifs: unlock on error in smb3_reconfigure()
Unlock before returning if smb3_sync_session_ctx_passwords() fails.
Fixes: 7e654ab7da03 ("cifs: during remount, make sure passwords are in sync") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Shyam Prasad N [Wed, 30 Oct 2024 06:45:50 +0000 (06:45 +0000)]
cifs: during remount, make sure passwords are in sync
This fixes scenarios where remount can overwrite the only currently
working password, breaking reconnect.
We recently introduced a password2 field in both ses and ctx structs.
This was done so as to allow the client to rotate passwords for a mount
without any downtime. However, when the client transparently handles
password rotation, it can swap the values of the two password fields
in the ses struct, but not in smb3_fs_context struct that hangs off
cifs_sb. This can lead to a situation where a remount unintentionally
overwrites a working password in the ses struct.
In order to fix this, we first get the passwords in ctx struct
in-sync with ses struct, before replacing them with the passwords
that could be passed as a part of remount.
Also, in order to avoid race condition between smb2_reconnect and
smb3_reconfigure, we make sure to lock session_mutex before changing
password and password2 fields of the ses structure.
Fixes: 35f834265e0d ("smb3: fix broken reconnect when password changing on the server by allowing password rotation") Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Meetakshi Setiya [Wed, 30 Oct 2024 09:37:21 +0000 (05:37 -0400)]
cifs: support mounting with alternate password to allow password rotation
Fixes the case for example where the password specified on mount is a
recently expired password, but password2 is valid. Without this patch
this mount scenario would fail.
This patch introduces the following changes to support password rotation on
mount:
1. If an existing session is not found and the new session setup results in
EACCES, EKEYEXPIRED or EKEYREVOKED, swap password and password2 (if
available), and retry the mount.
2. To match the new mount with an existing session, add conditions to check
if a) password and password2 of the new mount and the existing session are
the same, or b) password of the new mount is the same as the password2 of
the existing session, and password2 of the new mount is the same as the
password of the existing session.
3. If an existing session is found, but needs reconnect, retry the session
setup after swapping password and password2 (if available), in case the
previous attempt results in EACCES, EKEYEXPIRED or EKEYREVOKED.
Cc: stable@vger.kernel.org Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
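The matching and retry rules listed in the password rotation commit above can be sketched in Python. This is a minimal illustration under assumptions: passwords_match(), setup_with_rotation() and the try_setup callback are hypothetical names, and errors are represented by their symbolic names rather than real errno values.

```python
# Errors after which the two passwords are swapped and setup is retried.
RETRY_ERRORS = {"EACCES", "EKEYEXPIRED", "EKEYREVOKED"}


def passwords_match(new_pw, new_pw2, ses_pw, ses_pw2) -> bool:
    """A new mount matches a session if both passwords agree in either order."""
    same_order = new_pw == ses_pw and new_pw2 == ses_pw2
    swapped = new_pw == ses_pw2 and new_pw2 == ses_pw
    return same_order or swapped


def setup_with_rotation(try_setup, password, password2):
    """Try session setup; on an auth-style failure swap passwords and retry once.

    try_setup(pw) is a hypothetical callback returning None on success or an
    error name such as "EACCES". Returns (error, password, password2) so the
    caller can see which password ended up as the primary one.
    """
    err = try_setup(password)
    if err in RETRY_ERRORS and password2 is not None:
        password, password2 = password2, password
        err = try_setup(password)
    return err, password, password2
```

For example, a server that accepts only "new" lets setup_with_rotation(server, "old", "new") succeed on the second attempt, with "new" now in the primary slot.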
Matthew Auld [Tue, 26 Nov 2024 18:13:01 +0000 (18:13 +0000)]
drm/xe/migrate: use XE_BO_FLAG_PAGETABLE
On some HW we want to avoid the host caching PTEs, since access from GPU
side can be incoherent. However here the special migrate object is
mapping PTEs which are written from the host and potentially cached. Use
XE_BO_FLAG_PAGETABLE to ensure that non-cached mapping is used, on
platforms where this matters.
Fixes: 7a060d786cc1 ("drm/xe/mtl: Map PPGTT as CPU:WC") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241126181259.159713-4-matthew.auld@intel.com
(cherry picked from commit febc689b27d28973cd02f667548a5dca383d859a) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Matthew Auld [Tue, 26 Nov 2024 18:13:00 +0000 (18:13 +0000)]
drm/xe/migrate: fix pat index usage
XE_CACHE_WB must be converted into the per-platform pat index for that
particular caching mode, otherwise we are just encoding whatever happens
to be the value of that enum.
Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241126181259.159713-3-matthew.auld@intel.com
(cherry picked from commit f3dc9246f9c3cd5a7d8fd70cfd805bfc52214e2e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
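The difference between a caching-mode enum value and a per-platform pat index, as described in the commit above, can be shown with a toy lookup. All names and numbers below are assumptions made for illustration: CacheMode, the platform names and the table contents are hypothetical and do not reflect real hardware values.

```python
from enum import IntEnum


class CacheMode(IntEnum):
    """Hypothetical stand-in for a driver's caching-mode enum."""
    NONE = 0
    WT = 1
    WB = 2


# Hypothetical per-platform tables; real PAT indices differ per platform.
PAT_INDEX = {
    "platform_a": {CacheMode.NONE: 3, CacheMode.WT: 2, CacheMode.WB: 0},
    "platform_b": {CacheMode.NONE: 2, CacheMode.WT: 15, CacheMode.WB: 6},
}


def pat_index(platform: str, mode: CacheMode) -> int:
    """Translate a caching mode into the index the given platform expects."""
    return PAT_INDEX[platform][mode]
```

Encoding int(CacheMode.WB) directly would yield 2 on every platform, whereas the lookup yields the platform's own index for that mode.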
It looks like we try to suspend the queue (opcode=3), setting
suspend_pending and triggering a disable_scheduling. The user then
closes the queue. However the close will also forcefully signal the
suspend fence after killing the queue, later when the G2H response for
disable_scheduling comes back we have now cleared suspend_pending when
signalling the suspend fence, so the disable_scheduling now incorrectly
tries to also deregister the queue. This leads to warnings since the queue
has yet to even be marked for destruction. We also seem to trigger
errors later with trying to double unregister the same queue.
To fix this tweak the ordering when handling the response to ensure we
don't race with a disable_scheduling that didn't actually intend to
perform an unregister. The destruction path should now also correctly
wait for any pending_disable before marking as destroyed.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3371 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241122161914.321263-6-matthew.auld@intel.com
(cherry picked from commit f161809b362f027b6d72bd998e47f8f0bad60a2e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
This looks to be the two scheduling_disable racing with each other, one
from the suspend (opcode=3) and then again during lr cleanup. While
those two operations are serialized, the G2H portion is not, therefore
when marking the queue as pending_disabled and then firing off the first
request, we proceed to do the same again, however the first disable
response only fires after this which then clears the pending_disabled.
At this point the second comes back and is processed, however the
pending_disabled is no longer set, hence triggering the warning.
To fix this wait for pending_disabled when doing the lr cleanup and
calling disable_scheduling_deregister. Also do the same for all other
disable_scheduling callers.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3515 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <mattheq.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241122161914.321263-5-matthew.auld@intel.com
(cherry picked from commit ddb106d2120a0bf1c5ff87c71d059d193814da41) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Matt Roper [Mon, 25 Nov 2024 19:48:39 +0000 (11:48 -0800)]
drm/xe: Update xe2_graphics name string
Since both Xe2 and Xe3 platforms currently use the same set of graphics
IP feature flags, we associate the "graphics_xe2" structure with both IPs.
Update the name string on that IP structure to clarify this and avoid
confusion as Xe3 platforms start going into public CI.
Takashi Iwai [Thu, 28 Nov 2024 13:55:21 +0000 (14:55 +0100)]
Merge tag 'asoc-fix-v6.13-merge-window' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.13
A pile of driver specific quirks and fixes that came in since the merge
window. One of the AMD fixes is a bit broken for some systems, I'm
expecting an incremental change to fix that but it seems better overall
to merge the rest of the fixes.
There's also one small documentation update that seemed sensible to
apply now, pointing to the dapm-graph tool.
Jaroslav Kysela [Thu, 28 Nov 2024 11:21:45 +0000 (12:21 +0100)]
ALSA: hda: improve bass speaker support for ASUS Zenbook UM5606WA
This hardware has ALC294 codec with speaker NID 0x17 and bass speaker
NID 0x15.
This patch removes DAC NID 0x06 (without volume control) from
the connection list for bass speaker NID 0x15. Both speaker PINs
are routed to DAC NID 0x03 with this change.
Heiko Carstens [Tue, 26 Nov 2024 13:28:27 +0000 (14:28 +0100)]
s390/spinlock: Use flag output constraint for arch_cmpxchg_niai8()
Add a new variant of arch_cmpxchg_niai8() which makes use of the flag
output constraint, which allows the compiler to generate slightly better
code. Also rename arch_cmpxchg_niai8() to arch_try_cmpxchg_niai8() which
reflects the purpose of the function and makes it consistent with other
"try" variants.
Heiko Carstens [Tue, 26 Nov 2024 13:28:26 +0000 (14:28 +0100)]
s390/spinlock: Use R constraint for arch_load_niai4()
The load instruction used within arch_load_niai4() has a short displacement
and index register. Therefore use the R constraint to reflect this.
The previously used Q constraint does not consider an index register.
Heiko Carstens [Tue, 26 Nov 2024 13:28:25 +0000 (14:28 +0100)]
s390/spinlock: Generate shorter code for arch_spin_unlock()
Use mvhhi instead of sth to write a zero to spinlocks. Compared to the
sth variant this avoids the load of zero to a register, and reduces
register pressure.
Heiko Carstens [Wed, 27 Nov 2024 16:17:12 +0000 (17:17 +0100)]
s390: Support PREEMPT_DYNAMIC
Select HAVE_PREEMPT_DYNAMIC_KEY and add the pieces which are required to
support PREEMPT_DYNAMIC.
See commit 99cf983cc8bc ("sched/preempt: Add PREEMPT_DYNAMIC using static
keys") and commit 1b2d3451ee50 ("arm64: Support PREEMPT_DYNAMIC") for more
details.
Niklas Schnelle [Mon, 25 Nov 2024 15:02:38 +0000 (16:02 +0100)]
s390/pci: Fix potential double remove of hotplug slot
In commit 6ee600bfbe0f ("s390/pci: remove hotplug slot when releasing the
device") the zpci_exit_slot() was moved from zpci_device_reserved() to
zpci_release_device() with the intention of keeping the hotplug slot
around until the device is actually removed.
Now zpci_release_device() is only called once all references are
dropped. Since the zPCI subsystem only drops its reference once the
device is in the reserved state it follows that zpci_release_device()
must only deal with devices in the reserved state. Despite that it
contains code to tear down from both configured and standby state. For
the standby case this already includes the removal of the hotplug slot
so would cause a double removal if a device was ever removed in
either configured or standby state.
Instead of causing a potential double removal in a case that should
never happen, explicitly WARN_ON() if a device in non-reserved state is
released and get rid of the dead code cases.
Fixes: 6ee600bfbe0f ("s390/pci: remove hotplug slot when releasing the device") Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Tested-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Niklas Schnelle [Mon, 25 Nov 2024 09:35:10 +0000 (10:35 +0100)]
s390/pci: Fix leak of struct zpci_dev when zpci_add_device() fails
Prior to commit 0467cdde8c43 ("s390/pci: Sort PCI functions prior to
creating virtual busses") the IOMMU was initialized and the device was
registered as part of zpci_create_device() with the struct zpci_dev
freed if either resulted in an error. With that commit this was moved
into a separate function called zpci_add_device().
While this new function logs when adding failed, it expects the caller
not to use and to free the struct zpci_dev on error. This difference
between it and zpci_create_device() was missed while changing the
callers and the incompletely initialized struct zpci_dev may get used in
zpci_scan_configured_device in the error path. This then leads to
a crash due to the device not being registered with the zbus. It was
also not freed in this case. Fix this by handling the error return of
zpci_add_device(). Since in this case the zdev was not added to the
zpci_list it can simply be discarded and freed. Also make this more
explicit by moving the kref_init() into zpci_add_device() and document
that zpci_zdev_get()/zpci_zdev_put() must be used after adding.
Heiko Carstens [Thu, 28 Nov 2024 08:43:29 +0000 (09:43 +0100)]
s390/mm/hugetlbfs: Add missing includes
Add missing includes to fix this randconfig compile error:
All errors (new ones prefixed by >>):
In file included from mm/pagewalk.c:5:
In file included from include/linux/hugetlb.h:798:
>> arch/s390/include/asm/hugetlb.h:94:31: error: call to undeclared function 'is_pte_marker'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
94 | return huge_pte_none(pte) || is_pte_marker(pte);
| ^
Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202411281002.IPkRpIcR-lkp@intel.com/ Fixes: 487ef5d4d912 ("s390/mm: Add PTE_MARKER support for hugetlbfs mappings") Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Paolo Abeni [Thu, 28 Nov 2024 09:23:26 +0000 (10:23 +0100)]
Merge branch 'net-fix-mcast-rcu-splats'
Paolo Abeni says:
====================
net: fix mcast RCU splats
This series addresses the RCU splat triggered by the forwarding
mroute tests.
The first patch does not address any specific issue, but makes the
following ones more clear. Patches 2 and 3 address the issue for ipv6 and
ipv4 respectively.
====================
Paolo Abeni [Sun, 24 Nov 2024 15:40:58 +0000 (16:40 +0100)]
ipmr: fix tables suspicious RCU usage
Similar to the previous patch, plumb the RCU lock inside
ipmr_get_table(), provide a lockless variant and apply
the latter in the few spots where the lock is already held.
Fixes: 709b46e8d90b ("net: Add compat ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT") Fixes: f0ad0860d01e ("ipv4: ipmr: support multiple tables") Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Sun, 24 Nov 2024 15:40:57 +0000 (16:40 +0100)]
ip6mr: fix tables suspicious RCU usage
Several places call ip6mr_get_table() with neither the RCU nor the RTNL lock held.
Add RCU protection inside such helper and provide a lockless variant
for the few callers that already acquired the relevant lock.
Note that some users additionally reference the table outside the RCU
lock. That is actually safe as the table deletion can happen only
after all table accesses are completed.
Fixes: e2d57766e674 ("net: Provide compat support for SIOCGETMIFCNT_IN6 and SIOCGETSGCNT_IN6.") Fixes: d7c31cbde4bc ("net: ip6mr: add RTM_GETROUTE netlink op") Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Sun, 24 Nov 2024 15:40:56 +0000 (16:40 +0100)]
ipmr: add debug check for mr table cleanup
The multicast route tables lifecycle, for both ipv4 and ipv6, is
protected by RCU using the RTNL lock for write access. In many
places a table pointer escapes the RCU (or RTNL) protected critical
section, but such scenarios are actually safe because tables are
deleted only at namespace cleanup time or just after allocation, in
case of default rule creation failure.
Tables freed at namespace cleanup time are assured to be alive for the
whole netns lifetime; tables freed just after creation time are never
exposed to other possible users.
Ensure that the free conditions are respected in ip{,6}mr_free_table, to
document the locking schema and to prevent future possible introduction
of 'table del' operation from breaking it.
Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Hangbin Liu [Sun, 24 Nov 2024 07:32:43 +0000 (07:32 +0000)]
selftests: rds: move test.py to TEST_FILES
The test.py should not be run separately. It should be run via run.sh,
which will do some sanity checks first. Move the test.py from TEST_PROGS
to TEST_FILES.
Reported-by: Maximilian Heyne <mheyne@amazon.de> Closes: https://lore.kernel.org/netdev/20241122150129.GB18887@dev-dsk-mheyne-1b-55676e6a.eu-west-1.amazon.com Fixes: 3ade6ce1255e ("selftests: rds: add testing infrastructure") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Link: https://patch.msgid.link/20241124073243.847932-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Replacing the qdisc with pfifo makes retransmissions go away.
It appears that a flow may have a delayed packet with a very near
Tx time. Later, we may get busy processing Rx and the target Tx time
will pass, but we won't service Tx since the CPU is busy with Rx.
If Rx sees an ACK and we try to push more data for the delayed flow
we may fastpath the skb, not realizing that there are already "ready
to send" packets for this flow sitting in the qdisc.
Don't trust the fastpath if we are "behind" according to the projected
Tx time for next flow waiting in the Qdisc. Because we consider anything
within the offload window to be okay for fastpath we must consider
the entire offload window as "now".
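The fastpath rule described above can be written as a small predicate. This is a sketch under assumptions, not the sch_fq code: may_use_fastpath() and its parameter names are hypothetical, times are plain integers in nanoseconds, and other_checks_ok stands in for whatever other conditions the fastpath requires.

```python
def may_use_fastpath(now_ns, next_delayed_tx_ns, offload_horizon_ns,
                     other_checks_ok=True):
    """Decide whether a packet may bypass the queue.

    next_delayed_tx_ns is the projected Tx time of the next flow waiting in
    the qdisc, or None if no flow is waiting.
    """
    # Anything inside the offload window counts as "now", so a waiting flow
    # whose Tx time falls at or before now + window means we are behind.
    if (next_delayed_tx_ns is not None
            and next_delayed_tx_ns <= now_ns + offload_horizon_ns):
        return False
    return other_checks_ok
```

With now_ns=1000 and a window of 100, a waiting flow due at 1050 blocks the fastpath while one due at 2000 does not.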
Kuniyuki Iwashima [Sat, 23 Nov 2024 17:42:36 +0000 (09:42 -0800)]
tcp: Fix use-after-free of nreq in reqsk_timer_handler().
The cited commit replaced inet_csk_reqsk_queue_drop_and_put() with
__inet_csk_reqsk_queue_drop() and reqsk_put() in reqsk_timer_handler().
Then, oreq should be passed to reqsk_put() instead of req; otherwise
use-after-free of nreq could happen when reqsk is migrated but the
retry attempt failed (e.g. due to timeout).
Let's pass oreq to reqsk_put().
Fixes: e8c526f2bdf1 ("tcp/dccp: Don't use timer_pending() in reqsk_queue_unlink().")
Reported-by: Liu Jian <liujian56@huawei.com>
Closes: https://lore.kernel.org/netdev/1284490f-9525-42ee-b7b8-ccadf6606f6d@huawei.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20241123174236.62438-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When phy_ethtool_set_eee_noneg() detects a change in the LPI
parameters, it attempts to update phylib state and trigger the link
to cycle so the MAC sees the updated parameters.
However, in doing so, it sets phydev->enable_tx_lpi depending on
whether the EEE configuration allows the MAC to generate LPI without
taking into account the result of negotiation.
This can be demonstrated with a 1000base-T FD interface by:
# ethtool --set-eee eno0 advertise 8 # cause EEE to be not negotiated
# ethtool --set-eee eno0 tx-lpi off
# ethtool --set-eee eno0 tx-lpi on
This results in phydev->enable_tx_lpi being true, despite EEE not having been negotiated and:
# ethtool --show-eee eno0
EEE status: enabled - inactive
Tx LPI: 250 (us)
Supported EEE link modes: 100baseT/Full
1000baseT/Full
Advertised EEE link modes: 100baseT/Full
1000baseT/Full
Fix this by keeping track of whether EEE was negotiated via a new
eee_active member in struct phy_device, and include this state in
the decision whether phydev->enable_tx_lpi should be set.
Fixes: 3e43b903da04 ("net: phy: Immediately call adjust_link if only tx_lpi_enabled changes")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tErSe-005RhB-2R@rmk-PC.armlinux.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
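The fix described above can be pictured with a small userspace model. This is only a sketch: eee_active and enable_tx_lpi are named in the commit message, while struct model_phy, struct model_eee_cfg, cfg_allows_tx_lpi() and the two update helpers are hypothetical names invented for illustration and are not the phylib code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the user-configured EEE settings. */
struct model_eee_cfg {
	bool eee_enabled;
	bool tx_lpi_enabled;
};

/* Hypothetical stand-in for the relevant parts of struct phy_device. */
struct model_phy {
	struct model_eee_cfg cfg;
	bool eee_active;	/* EEE was actually negotiated on the link */
	bool enable_tx_lpi;	/* what the MAC is told to do */
};

/* Configuration alone: does the user allow the MAC to generate LPI? */
static bool cfg_allows_tx_lpi(const struct model_eee_cfg *cfg)
{
	return cfg->eee_enabled && cfg->tx_lpi_enabled;
}

/* Behaviour described as broken: negotiation result is ignored. */
void update_tx_lpi_old(struct model_phy *phy)
{
	phy->enable_tx_lpi = cfg_allows_tx_lpi(&phy->cfg);
}

/* Behaviour described as fixed: negotiation result is part of the decision. */
void update_tx_lpi_new(struct model_phy *phy)
{
	phy->enable_tx_lpi = cfg_allows_tx_lpi(&phy->cfg) && phy->eee_active;
}

/* Reproduces the ethtool scenario: EEE configured on, but not negotiated. */
void demo_tx_lpi(void)
{
	struct model_phy phy = {
		.cfg = { .eee_enabled = true, .tx_lpi_enabled = true },
		.eee_active = false,
	};

	update_tx_lpi_old(&phy);
	assert(phy.enable_tx_lpi);	/* wrongly enabled */

	update_tx_lpi_new(&phy);
	assert(!phy.enable_tx_lpi);	/* stays off until EEE is negotiated */
}
```

In this model, toggling tx-lpi off and on with EEE not negotiated leaves enable_tx_lpi false only in the second variant, which matches the "enabled - inactive" status shown by ethtool.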
Paolo Abeni [Thu, 28 Nov 2024 08:23:02 +0000 (09:23 +0100)]
Merge tag 'for-net-2024-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- SCO: remove the redundant sco_conn_put
- MGMT: Fix slab-use-after-free Read in set_powered_sync
- MGMT: Fix possible deadlocks
* tag 'for-net-2024-11-26' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: SCO: remove the redundant sco_conn_put
Bluetooth: MGMT: Fix possible deadlocks
Bluetooth: MGMT: Fix slab-use-after-free Read in set_powered_sync
====================
====================
net: Fix some callers of copy_from_sockptr()
Some callers misinterpret copy_from_sockptr()'s return value. The function
follows copy_from_user(), i.e. returns 0 for success, or the number of
bytes not copied on error. Simply returning the result in a non-zero case
isn't usually what was intended.
Compile tested with CONFIG_LLC, CONFIG_AF_RXRPC, CONFIG_BT enabled.
Last patch probably belongs more to net-next, if any. Here as an RFC.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
====================
Michal Luczaj [Tue, 19 Nov 2024 13:31:43 +0000 (14:31 +0100)]
net: Comment copy_from_sockptr() explaining its behaviour
copy_from_sockptr() has a history of misuse. Add a comment explaining that
the function follows API of copy_from_user(), i.e. returns 0 for success,
or number of bytes not copied on error.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
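The return-value convention described above is easy to get wrong, so here is a minimal userspace sketch of the misuse and of the intended pattern. model_copy_from_user() is a hypothetical stand-in, not the kernel helper: it only imitates the "0 on success, bytes not copied on error" contract, and a NULL source is used to simulate a faulting user pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a copy_from_user()-style helper:
 * returns 0 on success, or the number of bytes NOT copied on failure.
 */
static size_t model_copy_from_user(void *dst, const void *src, size_t len)
{
	if (!src)
		return len;	/* simulated fault: nothing was copied */
	memcpy(dst, src, len);
	return 0;
}

/* Misuse: the positive "bytes not copied" count escapes as the result. */
int setopt_buggy(int *opt, const void *optval)
{
	return (int)model_copy_from_user(opt, optval, sizeof(*opt));
}

/* Intended pattern: any non-zero result is translated to -EFAULT. */
int setopt_fixed(int *opt, const void *optval)
{
	if (model_copy_from_user(opt, optval, sizeof(*opt)))
		return -EFAULT;
	return 0;
}

void demo_copy_return(void)
{
	int opt = 0, val = 7;

	assert(setopt_fixed(&opt, &val) == 0 && opt == 7);
	/* The buggy variant reports a positive byte count, not an errno. */
	assert(setopt_buggy(&opt, NULL) == (int)sizeof(int));
	assert(setopt_fixed(&opt, NULL) == -EFAULT);
}
```

A caller that treats only negative values as errors would see the buggy variant's positive result as success, which is the kind of misinterpretation the series addresses.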
Michal Luczaj [Tue, 19 Nov 2024 13:31:42 +0000 (14:31 +0100)]
rxrpc: Improve setsockopt() handling of malformed user input
copy_from_sockptr() does not return negative value on error; instead, it
reports the number of bytes that failed to copy. Since it's deprecated,
switch to copy_safe_from_sockptr().
Note: Keeping the `optlen != sizeof(unsigned int)` check, as
copy_safe_from_sockptr() by itself would also accept
optlen > sizeof(unsigned int), which would allow a more lenient handling
of inputs.
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
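The note about keeping the exact-size check can be illustrated with a userspace model. model_copy_safe() below is a hypothetical stand-in that assumes the size-checked helper rejects input shorter than the kernel-side size and otherwise copies exactly that many bytes; setopt_strict() and setopt_lenient() are invented names showing the two possible callers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of a size-checked copy: short input is rejected,
 * longer input is accepted and truncated to ksize bytes.
 * A NULL optval simulates a faulting user pointer.
 */
static int model_copy_safe(void *dst, size_t ksize,
			   const void *optval, unsigned int optlen)
{
	if (optlen < ksize)
		return -EINVAL;
	if (!optval)
		return -EFAULT;
	memcpy(dst, optval, ksize);
	return 0;
}

/* Caller that keeps the exact-size check, as the commit does. */
int setopt_strict(unsigned int *out, const void *optval, unsigned int optlen)
{
	if (optlen != sizeof(*out))
		return -EINVAL;
	return model_copy_safe(out, sizeof(*out), optval, optlen);
}

/* Caller that relies on the helper alone: oversized input is accepted. */
int setopt_lenient(unsigned int *out, const void *optval, unsigned int optlen)
{
	return model_copy_safe(out, sizeof(*out), optval, optlen);
}

void demo_optlen(void)
{
	unsigned char buf[2 * sizeof(unsigned int)] = { 1 };
	unsigned int val = 0;

	assert(setopt_strict(&val, buf, sizeof(val)) == 0);
	assert(setopt_strict(&val, buf, sizeof(buf)) == -EINVAL);  /* oversized: rejected */
	assert(setopt_lenient(&val, buf, sizeof(buf)) == 0);       /* oversized: accepted */
	assert(setopt_lenient(&val, buf, 1) == -EINVAL);           /* short: always rejected */
}
```

Keeping the explicit comparison preserves the existing user-visible behaviour for oversized option buffers instead of silently relaxing it.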
Michal Luczaj [Tue, 19 Nov 2024 13:31:41 +0000 (14:31 +0100)]
llc: Improve setsockopt() handling of malformed user input
copy_from_sockptr() is used incorrectly: return value is the number of
bytes that could not be copied. Since it's deprecated, switch to
copy_safe_from_sockptr().
Note: Keeping the `optlen != sizeof(int)` check, as copy_safe_from_sockptr()
by itself would also accept optlen > sizeof(int), which would allow a more
lenient handling of inputs.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Suggested-by: David Wei <dw@davidwei.uk>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Linus Torvalds [Wed, 27 Nov 2024 22:50:31 +0000 (14:50 -0800)]
Merge tag 'acpi-6.13-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"These add a common init function for arch-specific ACPI
initialization, clean up idle states initialization in the ACPI
processor_idle driver and update quirks:
- Introduce acpi_arch_init() for architecture-specific ACPI subsystem
initialization (Miao Wang)
- Clean up Asus quirks in acpi_quirk_skip_dmi_ids[] and add a quirk
to skip I2C clients on Acer Iconia One 8 A1-840 (Hans de Goede)
- Make the ACPI processor_idle driver use acpi_idle_play_dead() for
all idle states regardless of their types (Rafael Wysocki)"
* tag 'acpi-6.13-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: introduce acpi_arch_init()
ACPI: x86: Clean up Asus entries in acpi_quirk_skip_dmi_ids[]
ACPI: x86: Add skip i2c clients quirk for Acer Iconia One 8 A1-840
ACPI: processor_idle: Use acpi_idle_play_dead() for all C-states
Linus Torvalds [Wed, 27 Nov 2024 22:40:33 +0000 (14:40 -0800)]
Merge tag 'pm-6.13-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These update the OPP (Operating Performance Points) DT bindings for
ti-cpu (Dhruva Gole) and remove unused declarations from the OPP
header file (Zhang Zekun)"
* tag 'pm-6.13-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
dt-bindings: opp: operating-points-v2-ti-cpu: Describe opp-supported-hw
OPP: Remove unused declarations in header file
Linus Torvalds [Wed, 27 Nov 2024 22:36:00 +0000 (14:36 -0800)]
Merge tag 'thermal-6.13-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
"These fix a Power Allocator thermal governor issue reported recently,
update the Intel int3400 thermal driver and simplify DT data parsing
in the thermal control subsystem:
- Add a NULL pointer check that was missed by recent modifications of
the Power Allocator thermal governor (Rafael Wysocki)
- Remove the data_vault attribute_group from int3400 because it is
only used for exposing one binary file that can be exposed directly
(Thomas Weißschuh)
- Prevent the current_uuid sysfs attribute in int3400 from mistakenly
treating valid UUID values as invalid on some older systems
(Srinivas Pandruvada)
- Use the cleanup.h mechanics to simplify DT data parsing in the
thermal core and some drivers (Krzysztof Kozlowski)"
* tag 'thermal-6.13-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: sun8i: Use scoped device node handling to simplify error paths
thermal: tegra: Simplify with scoped for each OF child loop
thermal: qcom-spmi-adc-tm5: Simplify with scoped for each OF child loop
thermal: of: Use scoped device node handling to simplify of_thermal_zone_find()
thermal: of: Use scoped memory and OF handling to simplify thermal_of_trips_init()
thermal: of: Simplify thermal_of_should_bind with scoped for each OF child
thermal: gov_power_allocator: Add missing NULL pointer check
thermal: int3400: Remove unneeded data_vault attribute_group
thermal: int3400: Fix reading of current_uuid for active policy
Linus Torvalds [Wed, 27 Nov 2024 22:24:34 +0000 (14:24 -0800)]
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull more iommufd updates from Jason Gunthorpe:
"Change the driver callback op domain_alloc_user() into two ops:
domain_alloc_paging_flags() and domain_alloc_nested() that better
describe what the ops are expected to do.
There will be per-driver cleanup based on this going into the next
cycle via the driver trees"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommu: Rename ops->domain_alloc_user() to domain_alloc_paging_flags()
iommu: Add ops->domain_alloc_nested()
Linus Torvalds [Wed, 27 Nov 2024 21:33:43 +0000 (13:33 -0800)]
Merge tag 'phy-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy updates from Vinod Koul:
"New hardware support:
- ST STM32MP25 combophy support
- Sparx5 support for lan969x serdes and updates to driver to support
this
- NXP PTN3222 eUSB2 to USB2 redriver
- Qualcomm SAR2130P eusb2 support, QCS8300 USB DW3 and QMP USB2
support, X1E80100 QMP PCIe PHY Gen4 support, QCS615 and QCS8300 QMP
UFS PHY support and SA8775P eDP PHY support
- Rockchip rk3576 usbdp and rk3576 usb2 phy support
- Binding for Microchip ATA6561 can phy
Updates:
- Freescale driver updates for hdmi support
- Conversion of rockchip rk3228 hdmi phy binding to yaml
- Broadcom usb2-phy deprecated support dropped and USB init array
update for BCM4908
- TI USXGMII mode support in J7200
- Switch back to platform_driver::remove() subsystem update"
* tag 'phy-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (59 commits)
phy: qcom: qmp: Fix lecacy-legacy typo
phy: lan969x-serdes: add support for lan969x serdes driver
dt-bindings: phy: sparx5: document lan969x
phy: sparx5-serdes: add support for branching on chip type
phy: sparx5-serdes: add indirection layer to register macros
phy: sparx5-serdes: add function for getting the CMU index
phy: sparx5-serdes: add ops to match data
phy: sparx5-serdes: add constant for the number of CMU's
phy: sparx5-serdes: add constants to match data
phy: sparx5-serdes: add support for private match data
phy: bcm-ns-usb2: drop support for old binding variant
dt-bindings: phy: bcm-ns-usb2-phy: drop deprecated variant
dt-bindings: phy: Add QMP UFS PHY compatible for QCS8300
dt-bindings: phy: qcom: snps-eusb2: Add SAR2130P compatible
dt-bindings: phy: ti,tcan104x-can: Document Microchip ATA6561
phy: airoha: Fix REG_CSR_2L_RX{0,1}_REV0 definitions
phy: airoha: Fix REG_CSR_2L_JCPLL_SDM_HREN config in airoha_pcie_phy_init_ssc_jcpll()
phy: airoha: Fix REG_PCIE_PMA_TX_RESET config in airoha_pcie_phy_init_csr_2l()
phy: airoha: Fix REG_CSR_2L_PLL_CMN_RESERVE0 config in airoha_pcie_phy_init_clk_out()
phy: phy-rockchip-samsung-hdptx: Don't request RST_PHY/RST_ROPLL/RST_LCPLL
...
Linus Torvalds [Wed, 27 Nov 2024 21:23:13 +0000 (13:23 -0800)]
Merge tag 'gpio-fixes-for-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"Apart from the gpio-exar fix which addresses an older issue, they all
fix regressions from this release cycle:
- fix missing GPIO chip labels in gpio-zevio and gpio-altera
- for the latter: also set GPIO base to -1 to use dynamic range
allocation
- fix value setting with external pull-up/down resistor in gpio-exar
- use the recommended IDA interfaces in gpio-mpsse"
* tag 'gpio-fixes-for-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: mpsse: Remove usage of the deprecated ida_simple_xx() API
gpio: exar: set value when external pull-up or pull-down is present
gpio: altera: Add missed base and label initialisations
gpio: zevio: Add missed label initialisation
Linus Torvalds [Wed, 27 Nov 2024 21:11:58 +0000 (13:11 -0800)]
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
"A small number of improvements all over the place"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_vdpa: remove redundant check on desc
virtio_fs: store actual queue index in mq_map
virtio_fs: add informative log for new tag discovery
virtio: Make vring_new_virtqueue support packed vring
virtio_pmem: Add freeze/restore callbacks
vdpa/mlx5: Fix suboptimal range on iotlb iteration
Linus Torvalds [Wed, 27 Nov 2024 20:57:03 +0000 (12:57 -0800)]
Merge tag 'vfio-v6.13-rc1' of https://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:
- Constify an unmodified structure used in linking vfio and kvm
(Christophe JAILLET)
- Add ID for an additional hardware SKU supported by the nvgrace-gpu
vfio-pci variant driver (Ankit Agrawal)
- Fix incorrect signed cast in QAT vfio-pci variant driver, negating
test in check_add_overflow(), though still caught by later tests
(Giovanni Cabiddu)
- Additional debugfs attributes exposed in hisi_acc vfio-pci variant
driver for migration debugging (Longfang Liu)
- Migration support is added to the virtio vfio-pci variant driver,
becoming the primary feature of the driver while retaining emulation
of virtio legacy support as a secondary option (Yishai Hadas)
- Fixes to a few unwind flows in the mlx5 vfio-pci driver discovered
through reviews of the virtio variant driver (Yishai Hadas)
- Fix an unlikely issue where a PCI device exposed to userspace with an
unknown capability at the base of the extended capability chain can
overflow an array index (Avihai Horon)
* tag 'vfio-v6.13-rc1' of https://github.com/awilliam/linux-vfio:
vfio/pci: Properly hide first-in-list PCIe extended capability
vfio/mlx5: Fix unwind flows in mlx5vf_pci_save/resume_device_data()
vfio/mlx5: Fix an unwind issue in mlx5vf_add_migration_pages()
vfio/virtio: Enable live migration once VIRTIO_PCI was configured
vfio/virtio: Add PRE_COPY support for live migration
vfio/virtio: Add support for the basic live migration functionality
virtio-pci: Introduce APIs to execute device parts admin commands
virtio: Manage device and driver capabilities via the admin commands
virtio: Extend the admin command to include the result size
virtio_pci: Introduce device parts access commands
Documentation: add debugfs description for hisi migration
hisi_acc_vfio_pci: register debugfs for hisilicon migration driver
hisi_acc_vfio_pci: create subfunction for data reading
hisi_acc_vfio_pci: extract public functions for container_of
vfio/qat: fix overflow check in qat_vf_resume_write()
vfio/nvgrace-gpu: Add a new GH200 SKU to the devid table
kvm/vfio: Constify struct kvm_device_ops
Linus Torvalds [Wed, 27 Nov 2024 19:19:09 +0000 (11:19 -0800)]
Merge tag 'riscv-for-linus-6.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for pointer masking in userspace
- Support for probing vector misaligned access performance
- Support for qspinlock on systems with Zacas and Zabha
* tag 'riscv-for-linus-6.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (38 commits)
RISC-V: Remove unnecessary include from compat.h
riscv: Fix default misaligned access trap
riscv: Add qspinlock support
dt-bindings: riscv: Add Ziccrse ISA extension description
riscv: Add ISA extension parsing for Ziccrse
asm-generic: ticket-lock: Add separate ticket-lock.h
asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
riscv: Implement xchg8/16() using Zabha
riscv: Implement arch_cmpxchg128() using Zacas
riscv: Improve zacas fully-ordered cmpxchg()
riscv: Implement cmpxchg8/16() using Zabha
dt-bindings: riscv: Add Zabha ISA extension description
riscv: Implement cmpxchg32/64() using Zacas
riscv: Do not fail to build on byte/halfword operations with Zawrs
riscv: Move cpufeature.h macros into their own header
KVM: riscv: selftests: Add Smnpm and Ssnpm to get-reg-list test
RISC-V: KVM: Allow Smnpm and Ssnpm extensions for guests
riscv: hwprobe: Export the Supm ISA extension
riscv: selftests: Add a pointer masking test
riscv: Allow ptrace control of the tagged address ABI
...
Linus Torvalds [Wed, 27 Nov 2024 19:15:27 +0000 (11:15 -0800)]
Merge tag 'loongarch-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Fix build failure with GCC 15 due to default -std=gnu23
- Add PREEMPT_RT/PREEMPT_LAZY support
- Add I2S in DTS for Loongson-2K1000/Loongson-2K2000
- Some bug fixes and other small changes
* tag 'loongarch-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: Update Loongson-3 default config file
LoongArch: dts: Add I2S support to Loongson-2K2000
LoongArch: dts: Add I2S support to Loongson-2K1000
LoongArch: Allow to enable PREEMPT_LAZY
LoongArch: Allow to enable PREEMPT_RT
LoongArch: Select HAVE_POSIX_CPU_TIMERS_TASK_WORK
LoongArch: Fix sleeping in atomic context for PREEMPT_RT
LoongArch: Reduce min_delta for the arch clockevent device
LoongArch: BPF: Sign-extend return values
LoongArch: Fix build failure with GCC 15 (-std=gnu23)
LoongArch: Explicitly specify code model in Makefile
Linus Torvalds [Wed, 27 Nov 2024 19:13:25 +0000 (11:13 -0800)]
Merge tag 'memblock-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock updates from Mike Rapoport:
- replace hardcoded strings with str_on_off() in report_meminit()
- initialize reserved pages to MIGRATE_MOVABLE when deferred struct
page initialization is enabled so that if the reserved pages are
freed they are put on movable free lists like it is done now when
deferred struct page initialization is disabled
* tag 'memblock-v6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock: uniformly initialize all reserved pages to MIGRATE_MOVABLE
mm: Use str_on_off() helper function in report_meminit()
Linus Torvalds [Wed, 27 Nov 2024 18:20:50 +0000 (10:20 -0800)]
Merge tag 'modules-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux
Pull modules updates from Luis Chamberlain:
- The whole caching of module code into huge pages by Mike Rapoport is
going in through Andrew Morton's tree due to some other code
dependencies. That's really the biggest highlight for Linux kernel
modules in this release. With it we share huge pages for modules,
starting off with x86. Expect to see that soon through Andrew!
- Helge Deller addressed some lingering low-hanging-fruit alignment
enhancements. It is worth pointing out that from his old patch
series I dropped his vmlinux.lds.h change at Masahiro's request as he
would prefer this to be specified in asm code [0].
- Matthew Maurer and Sami Tolvanen have been tag teaming to help get us
closer to modversions for Rust. In this cycle we take in quite a
lot of the refactoring for ELF validation. I expect modversions for
Rust will be merged by v6.14 as that code is mostly ready now.
- Add a new modules selftest, kallsyms, which helps us test
find_symbol() and the limits of kallsyms on Linux today.
- We have a realtime mailing list to kernel-ci testing for modules now
which relies on and combines patchwork, kpd and kdevops:
If you want to help avoid Linux kernel modules regressions, now it's
simple: just add a new Linux modules selftest under
tools/testing/selftests/module/. That is it. All new selftests will be
used and leveraged automatically by the CI.
* tag 'modules-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux:
tests/module/gen_test_kallsyms.sh: use 0 value for variables
scripts: Remove export_report.pl
selftests: kallsyms: add MODULE_DESCRIPTION
selftests: add new kallsyms selftests
module: Reformat struct for code style
module: Additional validation in elf_validity_cache_strtab
module: Factor out elf_validity_cache_strtab
module: Group section index calculations together
module: Factor out elf_validity_cache_index_str
module: Factor out elf_validity_cache_index_sym
module: Factor out elf_validity_cache_index_mod
module: Factor out elf_validity_cache_index_info
module: Factor out elf_validity_cache_secstrings
module: Factor out elf_validity_cache_sechdrs
module: Factor out elf_validity_ehdr
module: Take const arg in validate_section_offset
modules: Add missing entry for __ex_table
modules: Ensure 64-bit alignment on __ksymtab_* sections
Rafael J. Wysocki [Wed, 27 Nov 2024 17:59:16 +0000 (18:59 +0100)]
Merge branch 'thermal-intel'
Merge updates of Intel int3400 thermal driver for 6.13-rc1:
- Remove the data_vault attribute_group from int3400 because it is only
used for exposing one binary file that can be exposed directly (Thomas
Weißschuh).
- Prevent the current_uuid sysfs attribute in int3400 from mistakenly
treating valid UUID values as invalid on some older systems (Srinivas
Pandruvada).
* thermal-intel:
thermal: int3400: Remove unneeded data_vault attribute_group
thermal: int3400: Fix reading of current_uuid for active policy
Linus Torvalds [Wed, 27 Nov 2024 16:11:46 +0000 (08:11 -0800)]
Merge tag 'vfs-6.13-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Fix a few iomap bugs
- Fix a wrong argument in backing file callback
- Fix security mount option retrieval in statmount()
- Cleanup how statmount() handles unescaped options
- Add a missing inode_owner_or_capable() check for setting write hints
- Clear the return value in read_kcore_iter() after a successful
iov_iter_zero()
- Fix a mount_setattr() selftest
- Fix function signature in mount api documentation
- Remove duplicate include header in the fscache code
* tag 'vfs-6.13-rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs/backing_file: fix wrong argument in callback
fs_parser: update mount_api doc to match function signature
fs: require inode_owner_or_capable for F_SET_RW_HINT
fs/proc/kcore.c: Clear ret value in read_kcore_iter after successful iov_iter_zero
statmount: fix security option retrieval
statmount: clean up unescaped option handling
fscache: Remove duplicate included header
iomap: elide flush from partial eof zero range
iomap: lift zeroed mapping handling into iomap_zero_range()
iomap: reset per-iter state on non-error iter advances
iomap: warn on zero range of a post-eof folio
selftests/mount_setattr: Fix failures on 64K PAGE_SIZE kernels
Linus Torvalds [Wed, 27 Nov 2024 16:03:38 +0000 (08:03 -0800)]
Merge tag 'vfs-6.13.exec.deny_write_access.revert' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull deny_write_access revert from Christian Brauner:
"It turns out that the mold linker relies on the deny_write_access()
mechanism for executables.
The mold linker tries to open a file for writing and if ETXTBSY is
returned mold falls back to creating a new file"
* tag 'vfs-6.13.exec.deny_write_access.revert' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
Revert "fs: don't block i_writecount during exec"
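The fallback described in the pull message can be sketched roughly as follows. This is an assumption-laden illustration, not mold's code: open_output() is a hypothetical name, and "creating a new file" is modelled here as unlinking the busy path and creating a fresh file under the same name, which is only one plausible reading of the description.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to reuse the existing output file; if the
 * kernel refuses with ETXTBSY (the file is being executed), fall back
 * to creating a new file at the same path.
 */
int open_output(const char *path)
{
	int fd = open(path, O_RDWR | O_CREAT, 0777);

	if (fd >= 0 || errno != ETXTBSY)
		return fd;	/* success, or an unrelated error */

	/* Busy executable: drop the old name and create a fresh file. */
	if (unlink(path) < 0)
		return -1;
	return open(path, O_RDWR | O_CREAT | O_EXCL, 0777);
}

/* Exercises only the common path, where the file is not being executed. */
void demo_open_output(void)
{
	char path[] = "/tmp/open_output_demo_XXXXXX";
	int fd = mkstemp(path);	/* scratch file for the demo */

	assert(fd >= 0);
	close(fd);

	fd = open_output(path);
	assert(fd >= 0);
	close(fd);
	unlink(path);
}
```

A program written this way depends on the open-for-write attempt failing with ETXTBSY while the binary is running, which is why removing that behaviour was reverted.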
Gerald Schaefer [Thu, 21 Nov 2024 17:45:23 +0000 (18:45 +0100)]
s390/mm: Add PTE_MARKER support for hugetlbfs mappings
Commit 8a13897fb0daa ("mm: userfaultfd: support UFFDIO_POISON for
hugetlbfs") added support for PTE_MARKER_POISONED for hugetlbfs, but
PTE_MARKER also needs support for swap entries. For s390, swap entries
were only supported on PTE level, not on the PMD/PUD levels that are used
for large hugetlbfs mappings.
Therefore, when writing a PTE_MARKER_POISONED entry, the resulting entry
on PMD/PUD level would be an invalid / empty entry. Further access would
then generate a pagefault loop, instead of the expected SIGBUS. It is a
loop inside the kernel, but interruptible and uffd fault handling also
calls schedule() in between, so at least it won't completely block the
system.
Previous commits prepared support for swap entries on PMD/PUD levels.
PTE_MARKER support for hugetlbfs can now be enabled by simply adding an
extra is_pte_marker() check to huge_pte_none_mostly(). Fault handling
code also needs to be adjusted to expect the VM_FAULT_HWPOISON_LARGE
fault flag, which was not possible on s390 before.
Gerald Schaefer [Thu, 21 Nov 2024 17:45:22 +0000 (18:45 +0100)]
s390/mm: Introduce region-third and segment table swap entries
Introduce region-third (PUD) and segment table (PMD) swap entries, and
make hugetlbfs RSTE <-> PTE conversion code aware of them, so that they
can be used for hugetlbfs PTE_MARKER entries. Future work could also
build on this to enable THP_SWAP and THP_MIGRATION for s390.
Similar to PTE swap entries, bits 0-51 can be used to store the swap
offset, but bits 57-61 cannot be used for swap type because that overlaps
with the INVALID and TABLE TYPE bits. PMD/PUD swap entries must be invalid,
and have a correct table type so that pud_folded() check still works.
Bits 53-57 can be used for swap type, but those include the PROTECT bit.
So unlike swap PTEs, the PROTECT bit cannot be used to mark the swap entry.
Use the "Common-Segment/Region" bit 59 instead for that.
Also remove the !MACHINE_HAS_NX check in __set_huge_pte_at(). Otherwise,
that would clear the _SEGMENT_ENTRY_NOEXEC bit also for swap entries, where
it is used for encoding the swap type. The architecture only requires this
bit to be 0 for PTEs, with !MACHINE_HAS_NX, not for segment or region-third
entries. And the check is also redundant, because after __pte_to_rste()
conversion, for non-swap PTEs it would only be set if it was already set in
the PTE, which should never be the case for !MACHINE_HAS_NX.
This is a prerequisite for hugetlbfs PTE_MARKER support on s390, which
is needed to fix a regression introduced with commit 8a13897fb0da
("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs"). That commit
depends on the availability of swap entries for hugetlbfs, which were
not available for s390 so far.
Gerald Schaefer [Thu, 21 Nov 2024 17:45:21 +0000 (18:45 +0100)]
s390/mm: Introduce region-third and segment table entry present bits
Introduce region-third and segment table entry present SW bits, and adjust
pmd/pud_present() accordingly.
Also add pmd/pud_present() checks to pmd/pud_leaf(), to return false for
future swap entries. Same logic applies to pmd_trans_huge(), make that
return pmd_leaf() instead of duplicating the same check.
huge_pte_offset() also needs to be adjusted, current code would return
NULL for !pud_present(). Use the same logic as in the generic version,
which allows for !pud_present() swap entries.
Similar to PTE, bit 63 can be used for the new SW present bit in region
and segment table entries. For segment-table entries (PMD) the architecture
says that "Bits 62-63 are available for programming", so they are safe to
use. The same is true for large leaf region-third-table entries (PUD).
However, for non-leaf region-third-table entries, bits 62-63 indicate the
TABLE LENGTH and both must be set to 1. But such entries would always be
considered as present, so it is safe to use bit 63 as PRESENT bit for PUD.
They also should not conflict with bit 62 potentially later used for
preserving SOFT_DIRTY in swap entries, because they are not swap entries.
Valid PMDs / PUDs should always have the present bit set, so add it to
the various pgprot defines, and also _SEGMENT_ENTRY which is OR'ed e.g.
in pmd_populate(). _REGION3_ENTRY wouldn't need any change, as the present
bit is already included in the TABLE LENGTH, but also explicitly add it
there, for completeness, and just in case the bit would ever be changed.
gmap code needs some adjustment, to also OR the _SEGMENT_ENTRY, like it
is already done in gmap_shadow_pgt() when creating new PMDs, but not in
__gmap_link(). Otherwise, the gmap PMDs would not be considered present,
e.g. when using pmd_leaf() checks in gmap code. The various WARN_ON
checks in gmap code also need adjustment, to tolerate the new present
bit.
This is a prerequisite for hugetlbfs PTE_MARKER support on s390, which
is needed to fix a regression introduced with commit 8a13897fb0da
("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs"). That commit
depends on the availability of swap entries for hugetlbfs, which were
not available for s390 so far.
Gerald Schaefer [Thu, 21 Nov 2024 17:45:20 +0000 (18:45 +0100)]
s390/mm: Rearrange region-third and segment table entry SW bits
Rearrange region-third and segment table entry SW bits, in order to
make room for future encoding of region/segment table swap entries.
Also adjust _SEGMENT_ENTRY_GMAP_UC and _SEGMENT_ENTRY_GMAP_IN bits in
gmap code. Those should only apply for gmap PMDs, and not really depend
on or conflict with host PMD bits, but for consistency also adjust them:
- _SEGMENT_ENTRY_GMAP_UC "dirty (migration)" was using the same bit as
_SEGMENT_ENTRY_SOFT_DIRTY in the host PMD -> make it use the new
SOFT_DIRTY bit 62 (0x0002)
- _SEGMENT_ENTRY_GMAP_IN "invalidation notify bit" was using 0x8000,
which was an unused bit in the host PMD, that is now used for
_SEGMENT_ENTRY_WRITE -> make it use bit 52 (0x0800) instead, which is
still unused in the host PMD
This is a prerequisite for hugetlbfs PTE_MARKER support on s390, which
is needed to fix a regression introduced with commit 8a13897fb0da
("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs"). That commit
depends on the availability of swap entries for hugetlbfs, which were
not available for s390 so far.
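The s390 commit messages above quote both bit numbers and hex masks. A short sketch of the conversion, assuming the convention in which bit 0 is the most significant bit of a 64-bit table entry (this matches the "bit 52 (0x0800)" pairing quoted in the text); ibm_bit() is a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a big-endian style bit number (bit 0 = most significant bit
 * of a 64-bit entry, bit 63 = least significant) into a mask.
 */
static inline uint64_t ibm_bit(unsigned int n)
{
	return UINT64_C(1) << (63 - n);
}

void demo_ibm_bits(void)
{
	assert(ibm_bit(52) == 0x0800);	/* pairing quoted in the commit message */
	assert(ibm_bit(63) == 0x0001);	/* lowest bit of the entry */
	assert(ibm_bit(62) == 0x0002);
	assert(ibm_bit(48) == 0x8000);	/* mask 0x8000 corresponds to bit 48 */
}
```

Under this convention, "bits 62-63" are the two lowest bits of an entry (mask 0x0003), and a mask can be checked against its bit number with a single shift.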