blk-throttle: only enable blk-stat when BLK_DEV_THROTTLING_LOW
blk_throtl_register() will unconditionally enable blk-stat for gendisk
when register, even when we have no BLK_DEV_THROTTLING_LOW config.
Since the kernel always has only BLK_DEV_THROTTLING config and the
BLK_DEV_THROTTLING_LOW config is still in EXPERIMENTAL state, we can
just skip blk-stat when !BLK_DEV_THROTTLING_LOW.
Other rq_qos policies such as wbt and iocost are lazy-initialized when they
are configured for the first time for the device but iolatency is
initialized unconditionally from blkcg_init_disk() during gendisk init. Lazy
init is beneficial because rq_qos policies add runtime overhead when
initialized as every IO has to walk all registered rq_qos callbacks.
This patch switches iolatency to lazy initialization too so that it only
registered its rq_qos policy when it is first configured.
Note that there is a known race condition between blkcg config file writes
and del_gendisk() and this patch makes iolatency susceptible to it by
exposing the init path to race against the deletion path. However, that
problem already exists in iocost and is being worked on.
We want to support lazy init of rq-qos policies so that iolatency is enabled
lazily on configuration instead of gendisk initialization. The way blkg
config helpers are structured now is a bit awkward for that. Let's
restructure:
* blkcg_conf_open_bdev() is renamed to blkg_conf_open_bdev(). The blkcg_
prefix was used because the bdev opening step is blkg-independent.
However, the distinction is too subtle and confuses more than helps. Let's
switch to blkg prefix so that it's consistent with the type and other
helper names.
* struct blkg_conf_ctx now remembers the original input string and is always
initialized by the new blkg_conf_init().
* blkg_conf_open_bdev() is updated to take a pointer to blkg_conf_ctx like
blkg_conf_prep() and can be called multiple times safely. Instead of
modifying the double pointer to input string directly,
blkg_conf_open_bdev() now sets blkg_conf_ctx->body.
* blkg_conf_finish() is renamed to blkg_conf_exit() for symmetry and now
must be called on all blkg_conf_ctx's which were initialized with
blkg_conf_init().
Combined, this allows the users to either open the bdev first or do it
altogether with blkg_conf_prep() which will help implementing lazy init of
rq-qos policies.
blkg_conf_init/exit() will also be used implement synchronization against
device removal. This is necessary because iolat / iocost are configured
through cgroupfs instead of one of the files under /sys/block/DEVICE. As
cgroupfs operations aren't synchronized with block layer, the lazy init and
other configuration operations may race against device removal. This patch
makes blkg_conf_init/exit() used consistently for all cgroup-orginating
configurations making them a good place to implement explicit
synchronization.
Users are updated accordingly. No behavior change is intended by this patch.
v2: bfq wasn't updated in v1 causing a build error. Fixed.
v3: Update the description to include future use of blkg_conf_init/exit() as
synchronization points.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Yu Kuai <yukuai1@huaweicloud.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230413000649.115785-3-tj@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish()
Now that all RCU flavors have been combined either holding a spin lock,
disabling irq or disabling preemption implies RCU read lock, so there's no
need to use rcu_read_[un]lock() explicitly while holding queue_lock. This
shouldn't cause any behavior changes.
v2: Description updated. Leave __acquires/release on queue_lock alone.
Stefan Haberland [Wed, 5 Apr 2023 14:20:17 +0000 (16:20 +0200)]
s390/dasd: fix hanging blockdevice after request requeue
The DASD driver does not kick the requeue list when requeuing IO requests
to the blocklayer. This might lead to hanging blockdevice when there is
no other trigger for this.
Fix by automatically kick the requeue list when requeuing DASD requests
to the blocklayer.
Fixes: e443343e509a ("s390/dasd: blk-mq conversion") CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Link: https://lore.kernel.org/r/20230405142017.2446986-8-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stefan Haberland [Wed, 5 Apr 2023 14:20:15 +0000 (16:20 +0200)]
s390/dasd: add aq_timeouts autoquiesce trigger
Add a sysfs attribute aq_timeouts that controls after how many
timeouts a autoquiesce event might be triggered.
The default value is 32768 which is the maximum number of retries
for the DASD device driver DASD_RETRIES_MAX. This means that the
timeout trigger will never happen.
The default value for DASD retries is 255.
Setting the value to below 255 will trigger the timeout autoquiesce
event before an IO error is generated.
Also add the check for the configured amount of timeouts and trigger
an autoquiesce event if exceeded.
Stefan Haberland [Wed, 5 Apr 2023 14:20:13 +0000 (16:20 +0200)]
s390/dasd: add aq_mask sysfs attribute
Add sysfs attribute that controls the DASD autoquiesce feature.
The autoquiesce is disabled when 0 is echoed to the attribute.
A value greater than 0 will enable the feature.
The aq_mask attribute will accept an unsigned integer and the value
will be interpreted as bitmask defining the trigger events that will
lead to an automatic quiesce.
The following autoquiesce triggers will currently be available:
DASD_EER_FATALERROR 1 - any final I/O error
DASD_EER_NOPATH 2 - no remaining paths for the device
DASD_EER_STATECHANGE 3 - a state change interrupt occurred
DASD_EER_PPRCSUSPEND 4 - the device is PPRC suspended
DASD_EER_NOSPC 5 - there is no space remaining on an ESE device
DASD_EER_TIMEOUT 6 - a certain amount of timeouts occurred
DASD_EER_STARTIO 7 - the IO start function encountered an error
The currently supported maximum value is 255.
Bit 31 is reserved for internal usage.
Bit 0 is not used.
Stefan Haberland [Wed, 5 Apr 2023 14:20:12 +0000 (16:20 +0200)]
s390/dasd: add autoquiesce feature
Add the internal logic to check for autoquiesce triggers and handle
them.
Quiesce and resume are functions that tell Linux to stop/resume
issuing I/Os to a specific DASD.
The DASD driver allows a manual quiesce/resume via ioctl.
Autoquiesce will define an amount of triggers that will lead to
an automatic quiesce if a certain event occurs.
There is no automatic resume.
All events will be reported via DASD Extended Error Reporting (EER)
if configured.
BFQ_WEIGHT_LEGACY_DFL is the same as CGROUP_WEIGHT_DFL, which means
we don't need cpd_bind_fn() callback to update default weight when
attached to a hierarchy.
This patch remove BFQ_WEIGHT_LEGACY_DFL and cpd_bind_fn().
Ondrej Kozina [Wed, 5 Apr 2023 11:12:23 +0000 (13:12 +0200)]
sed-opal: Add command to read locking range parameters.
It returns following attributes:
locking range start
locking range length
read lock enabled
write lock enabled
lock state (RW, RO or LK)
It can be retrieved by user authority provided the authority
was added to locking range via prior IOC_OPAL_ADD_USR_TO_LR
ioctl command. The command was extended to add user in ACE that
allows to read attributes listed above.
Ondrej Kozina [Wed, 5 Apr 2023 11:12:21 +0000 (13:12 +0200)]
sed-opal: allow user authority to get locking range attributes.
Extend ACE set of locking range attributes accessible to user
authority. This patch allows user authority to get following
locking range attribues when user get added to locking range via
IOC_OPAL_ADD_USR_TO_LR:
locking range start
locking range end
read lock enabled
write lock enabled
read locked
write locked
lock on reset
active key
Note: Admin1 authority always remains in the ACE. Otherwise
it breaks current userspace expecting Admin1 in the ACE (sedutils).
See TCG OPAL2 s.4.3.1.7 "ACE_Locking_RangeNNNN_Get_RangeStartToActiveKey".
Signed-off-by: Ondrej Kozina <okozina@redhat.com> Tested-by: Luca Boccassi <bluca@debian.org> Tested-by: Milan Broz <gmazyland@gmail.com> Acked-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230405111223.272816-4-okozina@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
"This is an alternative type where the options are either a uidref to an
Authority object or one of the boolean_ACE (AND = 0 and OR = 1) options.
This type is used within the AC_element list to form a postfix Boolean
expression of Authorities."
Signed-off-by: Ondrej Kozina <okozina@redhat.com> Tested-by: Luca Boccassi <bluca@debian.org> Tested-by: Milan Broz <gmazyland@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230405111223.272816-2-okozina@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Thu, 30 Mar 2023 11:36:22 +0000 (19:36 +0800)]
block: ublk_drv: add two helpers to clean up map/unmap request
Add two helpers for checking if map/unmap is needed, since we may have
passthrough request which needs map or unmap in future, such as for
supporting report zones.
Meantime don't mark ublk_copy_user_pages as inline since this function
is a bit fat now.
Reviewed-by: Ziyang Zhang <ZiyangZhang@linux.alibaba.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Andreas Gruenbacher [Thu, 30 Mar 2023 10:27:40 +0000 (12:27 +0200)]
drbd: Add peer device parameter to whole-bitmap I/O handlers
Pass a peer device parameter through the bitmap I/O functions to the I/O
handlers. In after_state_ch(), set that parameter when queuing the
drbd_send_bitmap operation so that this operation knows where to send the
bitmap.
Chaitanya Kulkarni [Thu, 30 Mar 2023 18:49:25 +0000 (11:49 -0700)]
null_blk: use non-deprecated lib functions
Use library helper memcpy_page() to copy source page into destination
instead of having duplicate code in copy_to_nullb() & copy_from_nullb().
In copy_from_nullb() also replace the memset call with zero_user().
Use library helper memset_page() to set the buffer to 0xFF instead of
having duplicate code.
This also removes deprecated API kmap_atomic() from copy_to_nullb()
copy_from_nullb() and nullb_fill_pattern(),
from :include/linux/highmem.h:
"kmap_atomic - Atomically map a page for temporary usage - Deprecated!"
Keith Busch [Mon, 20 Mar 2023 19:49:26 +0000 (12:49 -0700)]
blk-mq: remove hybrid polling
io_uring provides the only way user space can poll completions, and that
always sets BLK_POLL_NOSLEEP. This effectively makes hybrid polling dead
code, so remove it and everything supporting it.
Hybrid polling was effectively killed off with 9650b453a3d4b1, "block:
ignore RWF_HIPRI hint for sync dio", but still potentially reachable
through io_uring until d729cf9acb93119, "io_uring: don't sleep when
polling for I/O", but hybrid polling probably should not have been
reachable through that async interface from the beginning.
Fixes: 9650b453a3d4 ("block: ignore RWF_HIPRI hint for sync dio") Fixes: d729cf9acb93 ("io_uring: don't sleep when polling for I/O") Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20230320194926.3353144-1-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
Eric Biggers [Wed, 15 Mar 2023 18:39:07 +0000 (11:39 -0700)]
blk-crypto: drop the NULL check from blk_crypto_put_keyslot()
Now that all callers of blk_crypto_put_keyslot() check for NULL before
calling it, there is no need for blk_crypto_put_keyslot() to do the NULL
check itself.
Eric Biggers [Wed, 15 Mar 2023 18:39:04 +0000 (11:39 -0700)]
blk-crypto: make blk_crypto_evict_key() more robust
If blk_crypto_evict_key() sees that the key is still in-use (due to a
bug) or that ->keyslot_evict failed, it currently just returns while
leaving the key linked into the keyslot management structures.
However, blk_crypto_evict_key() is only called in contexts such as inode
eviction where failure is not an option. So actually the caller
proceeds with freeing the blk_crypto_key regardless of the return value
of blk_crypto_evict_key().
These two assumptions don't match, and the result is that there can be a
use-after-free in blk_crypto_reprogram_all_keys() after one of these
errors occurs. (Note, these errors *shouldn't* happen; we're just
talking about what happens if they do anyway.)
Fix this by making blk_crypto_evict_key() unlink the key from the
keyslot management structures even on failure.
Also improve some comments.
Fixes: 1b2628397058 ("block: Keyslot Manager for Inline Encryption") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230315183907.53675-2-ebiggers@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
Eric Biggers [Wed, 15 Mar 2023 18:39:03 +0000 (11:39 -0700)]
blk-crypto: make blk_crypto_evict_key() return void
blk_crypto_evict_key() is only called in contexts such as inode eviction
where failure is not an option. So there is nothing the caller can do
with errors except log them. (dm-table.c does "use" the error code, but
only to pass on to upper layers, so it doesn't really count.)
Just make blk_crypto_evict_key() return void and log errors itself.
Eric Biggers [Wed, 15 Mar 2023 18:39:02 +0000 (11:39 -0700)]
blk-mq: release crypto keyslot before reporting I/O complete
Once all I/O using a blk_crypto_key has completed, filesystems can call
blk_crypto_evict_key(). However, the block layer currently doesn't call
blk_crypto_put_keyslot() until the request is being freed, which happens
after upper layers have been told (via bio_endio()) the I/O has
completed. This causes a race condition where blk_crypto_evict_key()
can see 'slot_refs != 0' without there being an actual bug.
This makes __blk_crypto_evict_key() hit the
'WARN_ON_ONCE(atomic_read(&slot->slot_refs) != 0)' and return without
doing anything, eventually causing a use-after-free in
blk_crypto_reprogram_all_keys(). (This is a very rare bug and has only
been seen when per-file keys are being used with fscrypt.)
There are two options to fix this: either release the keyslot before
bio_endio() is called on the request's last bio, or make
__blk_crypto_evict_key() ignore slot_refs. Let's go with the first
solution, since it preserves the ability to report bugs (via
WARN_ON_ONCE) where a key is evicted while still in-use.
Fixes: a892c8d52c02 ("block: Inline encryption support for blk-mq") Cc: stable@vger.kernel.org Reviewed-by: Nathan Huckleberry <nhuck@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20230315183907.53675-2-ebiggers@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jakub Kicinski [Fri, 24 Feb 2023 02:13:01 +0000 (18:13 -0800)]
nbd: use the structured req attr check
Use the macro for checking presence of required attributes.
It has the advantage of reporting to the user which attr
was missing in a machine-readable format (extack).
Linus Torvalds [Sun, 12 Mar 2023 23:15:36 +0000 (16:15 -0700)]
Merge tag 'tpm-v6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm fixes from Jarkko Sakkinen:
"Two additional bug fixes for v6.3"
* tag 'tpm-v6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm: disable hwrng for fTPM on some AMD designs
tpm/eventlog: Don't abort tpm_read_log on faulty ACPI address
Mario Limonciello [Tue, 28 Feb 2023 02:44:39 +0000 (20:44 -0600)]
tpm: disable hwrng for fTPM on some AMD designs
AMD has issued an advisory indicating that having fTPM enabled in
BIOS can cause "stuttering" in the OS. This issue has been fixed
in newer versions of the fTPM firmware, but it's up to system
designers to decide whether to distribute it.
This issue has existed for a while, but is more prevalent starting
with kernel 6.1 because commit b006c439d58db ("hwrng: core - start
hwrng kthread also for untrusted sources") started to use the fTPM
for hwrng by default. However, all uses of /dev/hwrng result in
unacceptable stuttering.
So, simply disable registration of the defective hwrng when detecting
these faulty fTPM versions. As this is caused by faulty firmware, it
is plausible that such a problem could also be reproduced by other TPM
interactions, but this hasn't been shown by any user's testing or reports.
It is hypothesized to be triggered more frequently by the use of the RNG
because userspace software will fetch random numbers regularly.
Intentionally continue to register other TPM functionality so that users
that rely upon PCR measurements or any storage of data will still have
access to it. If it's found later that another TPM functionality is
exacerbating this problem a module parameter it can be turned off entirely
and a module parameter can be introduced to allow users who rely upon
fTPM functionality to turn it on even though this problem is present.
Link: https://www.amd.com/en/support/kb/faq/pa-410 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216989 Link: https://lore.kernel.org/all/20230209153120.261904-1-Jason@zx2c4.com/ Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources") Cc: stable@vger.kernel.org Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Thorsten Leemhuis <regressions@leemhuis.info> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Tested-by: reach622@mailcuk.com Tested-by: Bell <1138267643@qq.com> Co-developed-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Morten Linderud [Wed, 15 Feb 2023 09:25:52 +0000 (10:25 +0100)]
tpm/eventlog: Don't abort tpm_read_log on faulty ACPI address
tpm_read_log_acpi() should return -ENODEV when no eventlog from the ACPI
table is found. If the firmware vendor includes an invalid log address
we are unable to map from the ACPI memory and tpm_read_log() returns -EIO
which would abort discovery of the eventlog.
Change the return value from -EIO to -ENODEV when acpi_os_map_iomem()
fails to map the event log.
The following hardware was used to test this issue:
Framework Laptop (Pre-production)
BIOS: INSYDE Corp, Revision: 3.2
TPM Device: NTC, Firmware Revision: 7.2
Linus Torvalds [Sun, 12 Mar 2023 16:47:08 +0000 (09:47 -0700)]
Merge tag 'xfs-6.3-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
- Fix a crash if mount time quotacheck fails when there are inodes
queued for garbage collection.
- Fix an off by one error when discarding folios after writeback
failure.
* tag 'xfs-6.3-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix off-by-one-block in xfs_discard_folio()
xfs: quotacheck failure can race with background inode inactivation
Linus Torvalds [Sun, 12 Mar 2023 16:17:30 +0000 (09:17 -0700)]
Merge tag 'staging-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes and removal from Greg KH:
"Here are four small staging driver fixes, and one big staging driver
deletion for 6.3-rc2.
The fixes are:
- rtl8192e driver fixes for where the driver was attempting to
execute various programs directly from the disk for unknown reasons
- rtl8723bs driver fixes for issues found by Hans in testing
The deleted driver is the removal of the r8188eu wireless driver as
now in 6.3-rc1 we have a "real" wifi driver for one that includes
support for many many more devices than this old driver did. So it's
time to remove it as it is no longer needed. The maintainers of this
driver all have acked its removal. Many thanks to them over the years
for working to clean it up and keep it working while the real driver
was being developed.
All of these have been in linux-next this week with no reported
problems"
* tag 'staging-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: r8188eu: delete driver
staging: rtl8723bs: Pass correct parameters to cfg80211_get_bss()
staging: rtl8723bs: Fix key-store index handling
staging: rtl8192e: Remove call_usermodehelper starting RadioPower.sh
staging: rtl8192e: Remove function ..dm_check_ac_dc_power calling a script
Linus Torvalds [Sun, 12 Mar 2023 16:12:03 +0000 (09:12 -0700)]
Merge tag 'x86_urgent_for_v6.3_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
"A single erratum fix for AMD machines:
- Disable XSAVES on AMD Zen1 and Zen2 machines due to an erratum. No
impact to anything as those machines will fallback to XSAVEC which
is equivalent there"
* tag 'x86_urgent_for_v6.3_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/CPU/AMD: Disable XSAVES on AMD family 0x17
Linus Torvalds [Sun, 12 Mar 2023 16:04:28 +0000 (09:04 -0700)]
Merge tag 'kernel.fork.v6.3-rc2' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux
Pull clone3 fix from Christian Brauner:
"A simple fix for the clone3() system call.
The CLONE_NEWTIME allows the creation of time namespaces. The flag
reuses a bit from the CSIGNAL bits that are used in the legacy clone()
system call to set the signal that gets sent to the parent after the
child exits.
The clone3() system call doesn't rely on CSIGNAL anymore as it uses a
dedicated .exit_signal field in struct clone_args. So we blocked all
CSIGNAL bits in clone3_args_valid(). When CLONE_NEWTIME was introduced
and reused a CSIGNAL bit we forgot to adapt clone3_args_valid()
causing CLONE_NEWTIME with clone3() to be rejected. Fix this"
* tag 'kernel.fork.v6.3-rc2' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux:
selftests/clone3: test clone3 with CLONE_NEWTIME
fork: allow CLONE_NEWTIME in clone3 flags
Linus Torvalds [Sun, 12 Mar 2023 16:00:54 +0000 (09:00 -0700)]
Merge tag 'vfs.misc.v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull vfs fixes from Christian Brauner:
- When allocating pages for a watch queue failed, we didn't return an
error causing userspace to proceed even though all subsequent
notifcations would be lost. Make sure to return an error.
- Fix a misformed tree entry for the idmapping maintainers entry.
- When setting file leases from an idmapped mount via
generic_setlease() we need to take the idmapping into account
otherwise taking a lease would fail from an idmapped mount.
- Remove two redundant assignments, one in splice code and the other in
locks code, that static checkers complained about.
* tag 'vfs.misc.v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
filelocks: use mount idmapping for setlease permission check
fs/locks: Remove redundant assignment to cmd
splice: Remove redundant assignment to ret
MAINTAINERS: repair a malformed T: entry in IDMAPPED MOUNTS
watch_queue: fix IOC_WATCH_QUEUE_SET_SIZE alloc error paths
Linus Torvalds [Sun, 12 Mar 2023 15:55:55 +0000 (08:55 -0700)]
Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Bug fixes and regressions for ext4, the most serious of which is a
potential deadlock during directory renames that was introduced during
the merge window discovered by a combination of syzbot and lockdep"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: zero i_disksize when initializing the bootloader inode
ext4: make sure fs error flag setted before clear journal error
ext4: commit super block if fs record error when journal record without error
ext4, jbd2: add an optimized bmap for the journal inode
ext4: fix WARNING in ext4_update_inline_data
ext4: move where set the MAY_INLINE_DATA flag is set
ext4: Fix deadlock during directory rename
ext4: Fix comment about the 64BIT feature
docs: ext4: modify the group desc size to 64
ext4: fix another off-by-one fsmap error on 1k block filesystems
ext4: fix RENAME_WHITEOUT handling for inline directories
ext4: make kobj_type structures constant
ext4: fix cgroup writeback accounting with fs-layer encryption
Linus Torvalds [Sun, 12 Mar 2023 15:52:03 +0000 (08:52 -0700)]
cpumask: relax sanity checking constraints
The cpumask_check() was unnecessarily tight, and causes problems for the
users of cpumask_next().
We have a number of users that take the previous return value of one of
the bit scanning functions and subtract one to keep it in "range". But
since the scanning functions end up returning up to 'small_cpumask_bits'
instead of the tighter 'nr_cpumask_bits', the range really needs to be
using that widened form.
[ This "previous-1" behavior is also the reason we have all those
comments about /* -1 is a legal arg here. */ and separate checks for
that being ok. So we could have just made "small_cpumask_bits-1"
be a similar special "don't check this" value.
Tetsuo Handa even suggested a patch that only does that for
cpumask_next(), since that seems to be the only actual case that
triggers, but that all makes it even _more_ magical and special. So
just relax the check ]
One example of this kind of pattern being the 'c_start()' function in
arch/x86/kernel/cpu/proc.c, but also duplicated in various forms on
other architectures.
Linus Torvalds [Sat, 11 Mar 2023 17:24:05 +0000 (09:24 -0800)]
Merge tag 'i2c-for-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"This marks the end of a transition to let I2C have the same probe
semantics as other subsystems. Uwe took care that no drivers in the
current tree nor in -next use the deprecated .probe call. So, it is a
good time to switch to the new, standard semantics now.
There is also a regression fix:
- regression fix for the notifier handling of the I2C core
- final coversions of drivers away from deprecated .probe
- make .probe_new the standard probe and convert I2C core to use it
* tag 'i2c-for-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: dev: Fix bus callback return values
i2c: Convert drivers to new .probe() callback
i2c: mux: Convert all drivers to new .probe() callback
i2c: Switch .probe() to not take an id parameter
media: i2c: ov2685: convert to i2c's .probe_new()
media: i2c: ov5695: convert to i2c's .probe_new()
w1: ds2482: Convert to i2c's .probe_new()
serial: sc16is7xx: Convert to i2c's .probe_new()
mtd: maps: pismo: Convert to i2c's .probe_new()
misc: ad525x_dpot-i2c: Convert to i2c's .probe_new()
Zhihao Cheng [Wed, 8 Mar 2023 03:26:43 +0000 (11:26 +0800)]
ext4: zero i_disksize when initializing the bootloader inode
If the boot loader inode has never been used before, the
EXT4_IOC_SWAP_BOOT inode will initialize it, including setting the
i_size to 0. However, if the "never before used" boot loader has a
non-zero i_size, then i_disksize will be non-zero, and the
inconsistency between i_size and i_disksize can trigger a kernel
warning:
Ye Bin [Tue, 7 Mar 2023 06:17:03 +0000 (14:17 +0800)]
ext4: make sure fs error flag setted before clear journal error
Now, jounral error number maybe cleared even though ext4_commit_super()
failed. This may lead to error flag miss, then fsck will miss to check
file system deeply.
Ye Bin [Tue, 7 Mar 2023 06:17:02 +0000 (14:17 +0800)]
ext4: commit super block if fs record error when journal record without error
Now, 'es->s_state' maybe covered by recover journal. And journal errno
maybe not recorded in journal sb as IO error. ext4_update_super() only
update error information when 'sbi->s_add_error_count' large than zero.
Then 'EXT4_ERROR_FS' flag maybe lost.
To solve above issue just recover 'es->s_state' error flag after journal
replay like error info.
Theodore Ts'o [Wed, 8 Mar 2023 04:15:49 +0000 (23:15 -0500)]
ext4, jbd2: add an optimized bmap for the journal inode
The generic bmap() function exported by the VFS takes locks and does
checks that are not necessary for the journal inode. So allow the
file system to set a journal-optimized bmap function in
journal->j_bmap.
Above issue happens as follows:
ext4_iget
ext4_find_inline_data_nolock ->i_inline_off=164 i_inline_size=60
ext4_try_add_inline_entry
__ext4_mark_inode_dirty
ext4_expand_extra_isize_ea ->i_extra_isize=32 s_want_extra_isize=44
ext4_xattr_shift_entries
->after shift i_inline_off is incorrect, actually is change to 176
ext4_try_add_inline_entry
ext4_update_inline_dir
get_max_inline_xattr_value_size
if (EXT4_I(inode)->i_inline_off)
entry = (struct ext4_xattr_entry *)((void *)raw_inode +
EXT4_I(inode)->i_inline_off);
free += EXT4_XATTR_SIZE(le32_to_cpu(entry->e_value_size));
->As entry is incorrect, then 'free' may be negative
ext4_update_inline_data
value = kzalloc(len, GFP_NOFS);
-> len is unsigned int, maybe very large, then trigger warning when
'kzalloc()'
To resolve the above issue we need to update 'i_inline_off' after
'ext4_xattr_shift_entries()'. We do not need to set
EXT4_STATE_MAY_INLINE_DATA flag here, since ext4_mark_inode_dirty()
already sets this flag if needed. Setting EXT4_STATE_MAY_INLINE_DATA
when it is needed may trigger a BUG_ON in ext4_writepages().
Reported-by: syzbot+d30838395804afc2fa6f@syzkaller.appspotmail.com Cc: stable@kernel.org Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230307015253.2232062-3-yebin@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Ye Bin [Tue, 7 Mar 2023 01:52:52 +0000 (09:52 +0800)]
ext4: move where set the MAY_INLINE_DATA flag is set
The only caller of ext4_find_inline_data_nolock() that needs setting of
EXT4_STATE_MAY_INLINE_DATA flag is ext4_iget_extra_inode(). In
ext4_write_inline_data_end() we just need to update inode->i_inline_off.
Since we are going to add one more caller that does not need to set
EXT4_STATE_MAY_INLINE_DATA, just move setting of EXT4_STATE_MAY_INLINE_DATA
out to ext4_iget_extra_inode().
Linus Torvalds [Sat, 11 Mar 2023 04:45:53 +0000 (20:45 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Twenty fixes all in drivers except the one zone storage revalidation
fix to sd.
The megaraid_sas fixes are more on the level of a driver update
(enabling crash dump and increasing lun number) but I thought you
could let this slide on -rc1 and the next most extensive update is a
load of fixes to mpi3mr"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Fix wrong zone_write_granularity value during revalidate
scsi: storvsc: Handle BlockSize change in Hyper-V VHD/VHDX file
scsi: megaraid_sas: Driver version update to 07.725.01.00-rc1
scsi: megaraid_sas: Add crash dump mode capability bit in MFI capabilities
scsi: megaraid_sas: Update max supported LD IDs to 240
scsi: mpi3mr: Bad drive in topology results kernel crash
scsi: mpi3mr: NVMe command size greater than 8K fails
scsi: mpi3mr: Return proper values for failures in firmware init path
scsi: mpi3mr: Wait for diagnostic save during controller init
scsi: mpi3mr: Driver unload crashes host when enhanced logging is enabled
scsi: mpi3mr: ioctl timeout when disabling/enabling interrupt
scsi: lpfc: Avoid usage of list iterator variable after loop
scsi: lpfc: Check kzalloc() in lpfc_sli4_cgn_params_read()
scsi: ufs: mcq: qcom: Clean the return path of ufs_qcom_mcq_config_resource()
scsi: ufs: mcq: qcom: Fix passing zero to PTR_ERR
scsi: ufs: ufs-qcom: Remove impossible check
scsi: ufs: core: Add soft dependency on governor_simpleondemand
scsi: hisi_sas: Check devm_add_action() return value
scsi: qla2xxx: Add option to disable FC2 Target support
scsi: target: iscsi: Fix an error message in iscsi_check_key()
Linus Torvalds [Sat, 11 Mar 2023 04:06:49 +0000 (20:06 -0800)]
Merge tag 'block-6.3-2023-03-09' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- Fix a regression in exclusive mode handling of the partition code,
introduced in this merge windoe (Yu)
- Fix for a use-after-free in BFQ (Yu)
- Add sysfs documentation for the 'hidden' attribute (Sagi)
* tag 'block-6.3-2023-03-09' of git://git.kernel.dk/linux:
block, bfq: fix uaf for 'stable_merge_bfqq'
docs: sysfs-block: document hidden sysfs entry
block: fix wrong mode for blkdev_put() from disk_scan_partitions()
Linus Torvalds [Sat, 11 Mar 2023 03:09:18 +0000 (19:09 -0800)]
Merge tag 'pull-highmem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull put_and_unmap_page() helper from Al Viro:
"kmap_local_page() conversions in local filesystems keep running into
kunmap_local_page()+put_page() combinations. We can keep inventing
names for identical inline helpers, but it's getting rather
inconvenient. I've added a trivial helper to linux/highmem.h instead.
I would've held that back until the merge window, if not for the mess
it causes in tree topology - I've several branches merging from that
one, and it's only going to get worse if e.g. ext2 stuff gets picked
by Jan"
* tag 'pull-highmem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
new helper: put_and_unmap_page()
Linus Torvalds [Sat, 11 Mar 2023 03:04:10 +0000 (19:04 -0800)]
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc fixes from Al Viro:
"pick_file() speculation fix + fix for alpha mis(merge,cherry-pick)
The fs/file.c one is a genuine missing speculation barrier in
pick_file() (reachable e.g. via close(2)). The alpha one is strictly
speaking not a bug fix, but only because confusion between
preempt_enable() and preempt_disable() is harmless on architecture
without CONFIG_PREEMPT.
Looks like alpha.git picked the wrong version of patch - that braino
used to be there in early versions, but it had been fixed quite a
while ago..."
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: prevent out-of-bounds array speculation when closing a file descriptor
alpha: fix lazy-FPU mis(merged/applied/whatnot)
Linus Torvalds [Fri, 10 Mar 2023 17:19:30 +0000 (09:19 -0800)]
Merge tag 'riscv-for-linus-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- RISC-V architecture-specific ELF attributes have been disabled in the
kernel builds
- A fix for a locking failure while during errata patching that
manifests on SiFive-based systems
- A fix for a KASAN failure during stack unwinding
- A fix for some lockdep failures during text patching
* tag 'riscv-for-linus-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Don't check text_mutex during stop_machine
riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack mode
RISC-V: fix taking the text_mutex twice during sifive errata patching
RISC-V: Stop emitting attributes
Linus Torvalds [Fri, 10 Mar 2023 16:57:46 +0000 (08:57 -0800)]
Merge tag 'drm-fixes-2023-03-10' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Weekly fixes.
msm and amdgpu are the vast majority of these, otherwise some
straggler misc from last week for nouveau and cirrus and a mailmap
update for a drm developer.
mailmap:
- add an entry
nouveau:
- fix system shutdown regression
- build warning fix
amdgpu:
- Misc display fixes
- UMC 8.10 fixes
- Driver unload fixes
- NBIO 7.3.0 fix
- Error checking fixes for soc15, nv, soc21 read register interface
- Fix video cap query for VCN 4.0.4
amdkfd:
- Fix return check in doorbell handling"
* tag 'drm-fixes-2023-03-10' of git://anongit.freedesktop.org/drm/drm: (42 commits)
drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4
drm/amdgpu: fix error checking in amdgpu_read_mm_registers for nv
drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc21
drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc15
drm/amdgpu: Fix the warning info when removing amdgpu device
drm/amdgpu: fix return value check in kfd
drm/amd: Fix initialization mistake for NBIO 7.3.0
drm/amdgpu: Fix call trace warning and hang when removing amdgpu device
mailmap: add mailmap entries for Faith.
drm/msm: DEVFREQ_GOV_SIMPLE_ONDEMAND is no longer needed
drm/amd/display: Update clock table to include highest clock setting
drm/amd/pm: Enable ecc_info table support for smu v13_0_10
drm/amdgpu: Support umc node harvest config on umc v8_10
drm/connector: print max_requested_bpc in state debugfs
drm/display: Don't block HDR_OUTPUT_METADATA on unknown EOTF
drm/msm/dpu: clear DSPP reservations in rm release
drm/msm/disp/dpu: fix sc7280_pp base offset
drm/msm/dpu: fix stack smashing in dpu_hw_ctl_setup_blendstage
drm/msm/dpu: don't use DPU_CLK_CTRL_CURSORn for DMA SSPP clocks
drm/msm/dpu: fix clocks settings for msm8998 SSPP blocks
...
Linus Torvalds [Fri, 10 Mar 2023 16:51:57 +0000 (08:51 -0800)]
Merge tag 'erofs-for-6.3-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:
"The most important one reverts an improper fix which can cause an
unexpected warning more often on specific images, and another one
fixes LZMA decompression on 32-bit platforms. The others are minor
fixes and cleanups.
- Fix LZMA decompression failure on HIGHMEM platforms
- Revert an inproper fix since it is actually an implementation issue
of vmalloc()
- Avoid a wrong DBG_BUGON since it could be triggered with -EINTR
- Minor cleanups"
* tag 'erofs-for-6.3-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: use wrapper i_blocksize() in erofs_file_read_iter()
erofs: get rid of a useless DBG_BUGON
erofs: Revert "erofs: fix kvcalloc() misuse with __GFP_NOFAIL"
erofs: fix wrong kunmap when using LZMA on HIGHMEM platforms
erofs: mark z_erofs_lzma_init/erofs_pcpubuf_init w/ __init
Linus Torvalds [Fri, 10 Mar 2023 16:45:30 +0000 (08:45 -0800)]
Merge tag 'nfsd-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Protect NFSD writes against filesystem freezing
- Fix a potential memory leak during server shutdown
* tag 'nfsd-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
SUNRPC: Fix a server shutdown leak
NFSD: Protect against filesystem freezing
Linus Torvalds [Fri, 10 Mar 2023 16:39:13 +0000 (08:39 -0800)]
Merge tag 'for-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"First batch of fixes. Among them there are two updates to sysfs and
ioctl which are not strictly fixes but are used for testing so there's
no reason to delay them.
- fix block group item corruption after inserting new block group
- fix extent map logging bit not cleared for split maps after
dropping range
- fix calculation of unusable block group space reporting bogus
values due to 32/64b division
- fix unnecessary increment of read error stat on write error
- improve error handling in inode update
- export per-device fsid in DEV_INFO ioctl to distinguish seeding
devices, needed for testing
- allocator size classes:
- fix potential dead lock in size class loading logic
- print sysfs stats for the allocation classes"
* tag 'for-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix block group item corruption after inserting new block group
btrfs: fix extent map logging bit not cleared for split maps after dropping range
btrfs: fix percent calculation for bg reclaim message
btrfs: fix unnecessary increment of read error stat on write error
btrfs: handle btrfs_del_item errors in __btrfs_update_delayed_inode
btrfs: ioctl: return device fsid from DEV_INFO ioctl
btrfs: fix potential dead lock in size class loading logic
btrfs: sysfs: add size class stats
Linus Torvalds [Fri, 10 Mar 2023 16:31:29 +0000 (08:31 -0800)]
Merge tag 'io_uring-6.3-2023-03-09' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Stop setting PF_NO_SETAFFINITY on io-wq workers.
This has been reported in the past as it confuses some applications,
as some of their threads will fail with -1/EINVAL if attempted
affinitized. Most recent report was on cpusets, where enabling that
with io-wq workers active will fail.
Just deal with the mask changing by checking when a worker times out,
and then exit if we have no work pending.
- Fix an issue with passthrough support where we don't properly check
if the file type has pollable uring_cmd support.
- Fix a reported W=1 warning on a variable being set and unused. Add a
special helper for iterating these lists that doesn't save the
previous list element, if that iterator never ends up using it.
* tag 'io_uring-6.3-2023-03-09' of git://git.kernel.dk/linux:
io_uring: silence variable ‘prev’ set but not used warning
io_uring/uring_cmd: ensure that device supports IOPOLL
io_uring/io-wq: stop setting PF_NO_SETAFFINITY on io-wq workers
Linus Torvalds [Fri, 10 Mar 2023 16:18:46 +0000 (08:18 -0800)]
Merge tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Add Adrian Hunter to MAINTAINERS as a perf tools reviewer
- Sync various tools/ copies of kernel headers with the kernel sources,
this time trying to avoid first merging with upstream to then update
but instead copy from upstream so that a merge is avoided and the end
result after merging this pull request is the one expected,
tools/perf/check-headers.sh (mostly) happy, less warnings while
building tools/perf/
- Fix counting when initial delay configured by setting
perf_attr.enable_on_exec when starting workloads from the perf
command line
- Don't avoid emitting a PERF_RECORD_MMAP2 in 'perf inject
--buildid-all' when that record comes with a build-id, otherwise we
end up not being able to resolve symbols
- Don't use comma as the CSV output separator the "stat+csv_output"
test, as comma can appear on some tests as a modifier for an event,
use @ instead, ditto for the JSON linter test
- The offcpu test was looking for some bits being set on
task_struct->prev_state without masking other bits not important for
this specific 'perf test', fix it
* tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf tools: Add Adrian Hunter to MAINTAINERS as a reviewer
tools headers UAPI: Sync linux/perf_event.h with the kernel sources
tools headers x86 cpufeatures: Sync with the kernel sources
tools include UAPI: Sync linux/vhost.h with the kernel sources
tools arch x86: Sync the msr-index.h copy with the kernel sources
tools headers kvm: Sync uapi/{asm/linux} kvm.h headers with the kernel sources
tools include UAPI: Synchronize linux/fcntl.h with the kernel sources
tools headers: Synchronize {linux,vdso}/bits.h with the kernel sources
tools headers UAPI: Sync linux/prctl.h with the kernel sources
tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'
perf stat: Fix counting when initial delay configured
tools headers svm: Sync svm headers with the kernel sources
perf test: Avoid counting commas in json linter
perf tests stat+csv_output: Switch CSV separator to @
perf inject: Fix --buildid-all not to eat up MMAP2
tools arch x86: Sync the msr-index.h copy with the kernel sources
perf test: Fix offcpu test prev_state check
lyndonli [Fri, 3 Mar 2023 06:55:05 +0000 (14:55 +0800)]
drm/amdgpu: Fix the warning info when removing amdgpu device
Actually, the drm_dev_enter in psp_cmd_submit_buf does not
protect anything. If DRM device is unplugged, it will always
check the condition in WARN_ON. So drop drm_dev_enter and
drm_dev_exit in psp_cmd_submit_buf.
When removing amdgpu, the calling order is as follows:
amdgpu_pci_remove
drm_dev_unplug
amdgpu_driver_unload_kms
amdgpu_device_fini_hw
amdgpu_device_ip_fini_early
psp_hw_fini
psp_ras_terminate
psp_ta_unloadye
psp_cmd_submit_buf
Signed-off-by: lyndonli <Lyndon.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Shashank Sharma [Mon, 27 Feb 2023 14:42:28 +0000 (15:42 +0100)]
drm/amdgpu: fix return value check in kfd
This patch fixes a return value check in kfd doorbell handling.
This function should return 0(error) only when the ida_simple_get
returns < 0(error), return > 0 is a success case.
Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Fixes: 16f0013157bf ("drm/amdkfd: Allocate doorbells only when needed") Acked-by: Christian Koenig <chriatian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Limonciello [Wed, 1 Mar 2023 15:36:06 +0000 (09:36 -0600)]
drm/amd: Fix initialization mistake for NBIO 7.3.0
The same strapping initialization issue that happened on NBIO 7.5.1
appears to be happening on NBIO 7.3.0.
Apply the same fix to 7.3.0 as well.
Note: This workaround relies upon the integrated GPU being enabled
in BIOS. If the integrated GPU is disabled in BIOS a different
workaround will be required.
Reported-by: Thomas Glanzmann <thomas@glanzmann.de> Cc: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Link: https://lore.kernel.org/linux-usb/Y%2Fz9GdHjPyF2rNG3@glanzmann.de/T/#u Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: lyndonli <Lyndon.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Conor Dooley [Fri, 3 Mar 2023 14:37:55 +0000 (14:37 +0000)]
RISC-V: Don't check text_mutex during stop_machine
We're currently using stop_machine() to update ftrace & kprobes, which
means that the thread that takes text_mutex during may not be the same
as the thread that eventually patches the code. This isn't actually a
race because the lock is still held (preventing any other concurrent
accesses) and there is only one thread running during stop_machine(),
but it does trigger a lockdep failure.
This patch just elides the lockdep check during stop_machine.
Fixes: c15ac4fd60d5 ("riscv/ftrace: Add dynamic function tracer support") Suggested-by: Steven Rostedt <rostedt@goodmis.org> Reported-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230303143754.4005217-1-conor.dooley@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Alexandre Ghiti [Wed, 8 Mar 2023 09:16:39 +0000 (10:16 +0100)]
riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack mode
When CONFIG_FRAME_POINTER is unset, the stack unwinding function
walk_stackframe randomly reads the stack and then, when KASAN is enabled,
it can lead to the following backtrace:
Seth Forshee [Thu, 9 Mar 2023 20:39:09 +0000 (14:39 -0600)]
filelocks: use mount idmapping for setlease permission check
A user should be allowed to take out a lease via an idmapped mount if
the fsuid matches the mapped uid of the inode. generic_setlease() is
checking the unmapped inode uid, causing these operations to be denied.
Fix this by comparing against the mapped inode uid instead of the
unmapped uid.
Geert Uytterhoeven [Thu, 9 Mar 2023 07:45:46 +0000 (08:45 +0100)]
i2c: dev: Fix bus callback return values
The i2cdev_{at,de}tach_adapter() callbacks are used for two purposes:
1. As notifier callbacks, when (un)registering I2C adapters created or
destroyed after i2c_dev_init(),
2. As bus iterator callbacks, for registering already existing
adapters from i2c_dev_init(), and for cleanup.
Unfortunately both use cases expect different return values: the former
expects NOTIFY_* return codes, while the latter expects zero or error
codes, and aborts in case of error.
Hence in case 2, as soon as i2cdev_{at,de}tach_adapter() returns
(non-zero) NOTIFY_OK, the bus iterator aborts. This causes (a) only the
first already existing adapter to be registered, leading to missing
/dev/i2c-* entries, and (b) a failure to unregister all but the first
I2C adapter during cleanup.
Fix this by introducing separate callbacks for the bus iterator,
wrapping the notifier functions, and always returning succes.
Any errors inside these callback functions are unlikely to happen, and
are fatal anyway.
Uwe Kleine-König [Sun, 26 Feb 2023 22:26:52 +0000 (23:26 +0100)]
i2c: Switch .probe() to not take an id parameter
Commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new() call-back
type") introduced a new probe callback to convert i2c init routines to
not take an i2c_device_id parameter. Now that all in-tree drivers are
converted to the temporary .probe_new() callback, .probe() can be
modified to match the desired prototype.
Now that .probe() and .probe_new() have the same semantic, they can be
defined as members of an anonymous union to save some memory and
simplify the core code a bit.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Wolfram Sang <wsa@kernel.org>