Patch series "mm: Rework zap ptes on swap entries", v5.
Patch 1 should fix a long standing bug for zap_pte_range() on
zap_details usage. The risk is we could have some swap entries skipped
while we should have zapped them.
Migration entries are not the major concern because file backed memory
always zap in the pattern that "first time without page lock, then
re-zap with page lock" hence the 2nd zap will always make sure all
migration entries are already recovered.
However there can be issues with real swap entries got skipped
errornoously. There's a reproducer provided in commit message of patch
1 for that.
Patch 2-4 are cleanups that are based on patch 1. After the whole
patchset applied, we should have a very clean view of zap_pte_range().
Only patch 1 needs to be backported to stable if necessary.
This patch (of 4):
The "details" pointer shouldn't be the token to decide whether we should
skip swap entries.
For example, when the callers specified details->zap_mapping==NULL, it
means the user wants to zap all the pages (including COWed pages), then
we need to look into swap entries because there can be private COWed
pages that was swapped out.
Skipping some swap entries when details is non-NULL may lead to wrongly
leaving some of the swap entries while we should have zapped them.
/* Write private page, swap it out */
buffer[page_size] = 1;
madvise(buffer, page_size * 2, MADV_PAGEOUT);
/* This should drop private buffer[page_size] already */
ret = ftruncate(shmem_fd, page_size);
assert(ret == 0);
/* Recover the size */
ret = ftruncate(shmem_fd, page_size * 2);
assert(ret == 0);
/* Re-read the data, it should be all zero */
val = buffer[page_size];
if (val == 0)
printf("Good\n");
else
printf("BUG\n");
}
===8<===
We don't need to touch up the pmd path, because pmd never had a issue with
swap entries. For example, shmem pmd migration will always be split into
pte level, and same to swapping on anonymous.
Add another helper should_zap_cows() so that we can also check whether we
should zap private mappings when there's no page pointer specified.
This patch drops that trick, so we handle swap ptes coherently. Meanwhile
we should do the same check upon migration entry, hwpoison entry and
genuine swap entries too.
To be explicit, we should still remember to keep the private entries if
even_cows==false, and always zap them when even_cows==true.
The issue seems to exist starting from the initial commit of git.
If an error is returned in jffs2_scan_eraseblock() and some memory
has been added to the jffs2_summary *s, we can observe the following
kmemleak report:
Therefore, we should call jffs2_sum_reset_collected(s) on exit to
release the memory added in s. In addition, a new tag "out_buf" is
added to prevent the NULL pointer reference caused by s being NULL.
(thanks to Zhang Yi for this analysis)
Fixes: e631ddba5887 ("[JFFS2] Add erase block summary support (mount time improvement)") Cc: stable@vger.kernel.org
Co-developed-with: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we mount a jffs2 image, assume that the first few blocks of
the image are normal and contain at least one xattr-related inode,
but the next block is abnormal. As a result, an error is returned
in jffs2_scan_eraseblock(). jffs2_clear_xattr_subsystem() is then
called in jffs2_build_filesystem() and then again in
jffs2_do_fill_super().
Finally we can observe the following report:
==================================================================
BUG: KASAN: use-after-free in jffs2_clear_xattr_subsystem+0x95/0x6ac
Read of size 8 at addr ffff8881243384e0 by task mount/719
The buggy address belongs to the object at ffff8881243384b8
which belongs to the cache jffs2_xattr_ref of size 48
The buggy address is located 40 bytes inside of
48-byte region [ffff8881243384b8, ffff8881243384e8)
[...]
==================================================================
The triggering of the BUG is shown in the following stack:
-----------------------------------------------------------
jffs2_fill_super
jffs2_do_fill_super
jffs2_do_mount_fs
jffs2_build_filesystem
jffs2_scan_medium
jffs2_scan_eraseblock <--- ERROR
jffs2_clear_xattr_subsystem <--- free
jffs2_clear_xattr_subsystem <--- free again
-----------------------------------------------------------
An error is returned in jffs2_do_mount_fs(). If the error is returned
by jffs2_sum_init(), the jffs2_clear_xattr_subsystem() does not need to
be executed. If the error is returned by jffs2_build_filesystem(), the
jffs2_clear_xattr_subsystem() also does not need to be executed again.
So move jffs2_clear_xattr_subsystem() from 'out_inohash' to 'out_root'
to fix this UAF problem.
Fixes: aa98d7cf59b5 ("[JFFS2][XATTR] XATTR support on JFFS2 (version. 5)") Cc: stable@vger.kernel.org Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no need to call dev_kfree_skb() when usb_submit_urb() fails
because can_put_echo_skb() deletes original skb and
can_free_echo_skb() deletes the cloned skb.
Fixes: 0024d8ad1639 ("can: usb_8dev: Add support for USB2CAN interface from 8 devices") Link: https://lore.kernel.org/all/20220311080614.45229-1-hbh25y@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
can_put_echo_skb() will clone skb then free the skb. Move the
can_put_echo_skb() for the m_can version 3.0.x directly before the
start of the xmit in hardware, similar to the 3.1.x branch.
Fixes: 80646733f11c ("can: m_can: update to support CAN FD features") Link: https://lore.kernel.org/all/20220317081305.739554-1-mkl@pengutronix.de Cc: stable@vger.kernel.org Reported-by: Hangyu Hua <hbh25y@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no need to call dev_kfree_skb() when usb_submit_urb() fails
beacause can_put_echo_skb() deletes the original skb and
can_free_echo_skb() deletes the cloned skb.
Link: https://lore.kernel.org/all/20220228083639.38183-1-hbh25y@gmail.com Fixes: 702171adeed3 ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface") Cc: stable@vger.kernel.org Cc: Sebastian Haas <haas@ems-wuensche.com> Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Even if SPI_NOR_NO_ERASE was set, one could still send erase opcodes
to the flash. It is not recommended to send unsupported opcodes to
flashes. Fix the logic and do not set mtd->_erase when SPI_NOR_NO_ERASE
is specified. With this users will not be able to issue erase opcodes to
flashes and instead they will recive an -ENOTSUPP error.
Fixes: b199489d37b2 ("mtd: spi-nor: add the framework for SPI NOR") Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Reviewed-by: Michael Walle <michael@walle.cc> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220228163334.277730-1-tudor.ambarus@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
By working with external hardware ECC engines, we figured out that
Under certain circumstances, it is needed for the SPI controller to
check INT_TX_EMPTY and INT_RX_NOT_EMPTY in both receive and transmit
path (not only in the receive path). The delay penalty being
negligible, move this code in the common path.
Fixes: b942d80b0a39 ("spi: Add MXIC controller driver") Cc: stable@vger.kernel.org Suggested-by: Mason Yang <masonccyang@mxic.com.tw> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Zhengxun Li <zhengxunli@mxic.com.tw> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/linux-mtd/20220127091808.1043392-10-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Broadcom bnxt_ptp driver does not compile with GCC 11.2.2 when
CONFIG_WERROR is enabled. The following error is generated:
drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c: In function ‘bnxt_ptp_enable’:
drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:400:43: error: array
subscript 255 is above array bounds of ‘struct pps_pin[4]’
[-Werror=array-bounds]
400 | ptp->pps_info.pins[pin_id].event = BNXT_PPS_EVENT_EXTERNAL;
| ~~~~~~~~~~~~~~~~~~^~~~~~~~
In file included from drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:20:
drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h:75:24: note: while
referencing ‘pins’
75 | struct pps_pin pins[BNXT_MAX_TSIO_PINS];
| ^~~~
cc1: all warnings being treated as errors
This is due to the function ptp_find_pin() returning a pin ID of -1 when
a valid pin is not found and this error never being checked.
Change the TSIO_PIN_VALID() function to also check that a pin ID is not
negative and use this macro in bnxt_ptp_enable() to check the result of
the calls to ptp_find_pin() to return an error early for invalid pins.
This fixes the compilation error.
Cc: <stable@vger.kernel.org> Fixes: 9e518f25802c ("bnxt_en: 1PPS functions to configure TSIO pins") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20220328062708.207079-1-damien.lemoal@opensource.wdc.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver iterates over its devicetree children with
for_each_child_of_node() and stores for later found node pointer. This
has to be put in error paths to avoid leak during re-probing.
Fixes: ab663789d697 ("pinctrl: samsung: Match pin banks with their device nodes") Cc: <stable@vger.kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org> Reviewed-by: Chanho Park <chanho61.park@samsung.com> Link: https://lore.kernel.org/r/20220111201426.326777-2-krzysztof.kozlowski@canonical.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The X series Ingenic SoCs have a shadow GPIO group which is at a higher
offset than the other groups, and is used for all GPIO configuration.
The regmap did not take this offset into account and set max_register
too low, so the regmap API blocked writes to the shadow group, which
made the pinctrl driver unable to configure any pins.
Fix this by adding regmap access tables to the chip info. The way that
max_register was computed was also off by one, since max_register is an
inclusive bound, not an exclusive bound; this has been fixed.
user_shm_lock forgets to set allowed to 0 when get_ucounts fails. So the
later user_shm_unlock might do the extra dec_rlimit_ucounts. Also in the
RLIM_INFINITY case, user_shm_lock will success regardless of the value of
memlock where memblock == LONG_MAX && !capable(CAP_IPC_LOCK) should fail.
Fix all of these by changing the code to leave lock_limit at ULONG_MAX aka
RLIM_INFINITY, leave "allowed" initialized to 0 and remove the special case
of RLIM_INFINITY as nothing can be greater than ULONG_MAX.
Credit goes to Eric W. Biederman for proposing simplifying the code and
thus catching the later bug.
Fixes: d7c9e99aee48 ("Reimplement RLIMIT_MEMLOCK on top of ucounts") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Cc: stable@vger.kernel.org
v1: https://lkml.kernel.org/r/20220310132417.41189-1-linmiaohe@huawei.com
v2: https://lkml.kernel.org/r/20220314064039.62972-1-linmiaohe@huawei.com Link: https://lkml.kernel.org/r/20220322080918.59861-1-linmiaohe@huawei.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The root cause is cp_pack_total_block_count field in checkpoint was fuzzed
to one, as calcuated, two cp pack block locates in the same block address,
so then read latter cp pack block, it will block on the page lock due to
the lock has already held when reading previous cp pack block, fix it by
adding sanity check for cp_pack_total_block_count.
cnt should be passed to sb_has_quota_active() instead of type to check
active quota properly.
Moreover, when the type is -1, the compiler with enough inline knowledge
can discard sb_has_quota_active() check altogether, causing a NULL pointer
dereference at the following inode_lock(dqopt->files[cnt]):
After commit 77900c45ee5c ("f2fs: fix to do sanity check in is_alive()"),
node page should be unlock via calling f2fs_put_page() in the error path
of is_alive(), otherwise, f2fs may hang when it tries to lock the node
page, fix it.
Ensure that we always initialise the 'xattr_support' field in struct
nfs_fsinfo, so that nfs_server_set_fsinfo() doesn't declare our NFSv2/v3
client to be capable of supporting the NFSv4.2 xattr protocol by setting
the NFS_CAP_XATTR capability.
This configuration can cause nfs_do_access() to set access mode bits
that are unsupported by the NFSv3 ACCESS call, which may confuse
spec-compliant servers.
Reported-by: Olga Kornievskaia <kolga@netapp.com> Fixes: b78ef845c35d ("NFSv4.2: query the server for extended attribute support") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Do not cast the struct xprt to a sock_xprt unless we know it is a UDP or
TCP transport. Otherwise the call to lock the mutex will scribble over
whatever structure is actually there. This has been seen to cause hard
system lockups when the underlying transport was RDMA.
Fixes: b49ea673e119 ("SUNRPC: lock against ->sock changing during sysfs read") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
xprt_destory() claims XPRT_LOCKED and then calls del_timer_sync().
Both xprt_unlock_connect() and xprt_release() call
->release_xprt()
which drops XPRT_LOCKED and *then* xprt_schedule_autodisconnect()
which calls mod_timer().
This may result in mod_timer() being called *after* del_timer_sync().
When this happens, the timer may fire long after the xprt has been freed,
and run_timer_softirq() will probably crash.
The pairing of ->release_xprt() and xprt_schedule_autodisconnect() is
always called under ->transport_lock. So if we take ->transport_lock to
call del_timer_sync(), we can be sure that mod_timer() will run first
(if it runs at all).
Allocating memory with kmalloc and GPF_DMA32 is not allowed, the
allocator will ignore the attribute.
Instead, use dma_alloc_coherent() API as we allocate a small amount of
memory to transfer firmware fragment to the ISH.
On Arcada chromebook, after the patch the warning:
"Unexpected gfp: 0x4 (GFP_DMA32). Fixing up to gfp: 0xcc0 (GFP_KERNEL). Fix your code!"
is gone. The ISH firmware is loaded properly and we can interact with
the ISH:
> ectool --name cros_ish version
...
Build info: arcada_ish_v2.0.3661+3c1a1c1ae0 2022-02-08 05:37:47 @localhost
Tool version: v2.0.12300-900b03ec7f 2022-02-08 10:01:48 @localhost
Fix a bug whereby, the return response of parameter a1 from an
SMC call is not properly set to the callback data during an
INTEL_SIP_SMC_RSU_ERROR command.
It is insecure to allow arbitrary hash algorithms and signature
encodings to be used with arbitrary signature algorithms. Notably,
ECDSA, ECRDSA, and SM2 all sign/verify raw hash values and don't
disambiguate between different hash algorithms like RSA PKCS#1 v1.5
padding does. Therefore, they need to be restricted to certain sets of
hash algorithms (ideally just one, but in practice small sets are used).
Additionally, the encoding is an integral part of modern signature
algorithms, and is not supposed to vary.
Therefore, tighten the checks of hash_algo and encoding done by
software_key_determine_akcipher().
Also rearrange the parameters to software_key_determine_akcipher() to
put the public_key first, as this is the most important parameter and it
often determines everything else.
Fixes: 299f561a6693 ("x509: Add support for parsing x509 certs with ECDSA keys") Fixes: 215525639631 ("X.509: support OSCCA SM2-with-SM3 certificate verification") Fixes: 0d7a78643f69 ("crypto: ecrdsa - add EC-RDSA (GOST 34.10) algorithm") Cc: stable@vger.kernel.org Tested-by: Stefan Berger <stefanb@linux.ibm.com> Tested-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Vitaly Chikunov <vt@altlinux.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Most callers of public_key_verify_signature(), including most indirect
callers via verify_signature() as well as pkcs7_verify_sig_chain(),
don't check that public_key_signature::pkey_algo matches
public_key::pkey_algo. These should always match. However, a malicious
signature could intentionally declare an unintended algorithm. It is
essential that such signatures be rejected outright, or that the
algorithm of the *key* be used -- not the algorithm of the signature as
that would allow attackers to choose the algorithm used.
Currently, public_key_verify_signature() correctly uses the key's
algorithm when deciding which akcipher to allocate. That's good.
However, it uses the signature's algorithm when deciding whether to do
the first step of SM2, which is incorrect. Also, v4.19 and older
kernels used the signature's algorithm for the entire process.
Prevent such errors by making public_key_verify_signature() enforce that
the signature's algorithm (if given) matches the key's algorithm.
Also remove two checks of this done by callers, which are now redundant.
Cc: stable@vger.kernel.org Tested-by: Stefan Berger <stefanb@linux.ibm.com> Tested-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Vitaly Chikunov <vt@altlinux.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In many cases, keyctl_pkey_params_get_2() is validating the user buffer
lengths against the wrong algorithm properties. Fix it to check against
the correct properties.
Probably this wasn't noticed before because for all asymmetric keys of
the "public_key" subtype, max_data_size == max_sig_size == max_enc_size
== max_dec_size. However, this isn't necessarily true for the
"asym_tpm" subtype (it should be, but it's not strictly validated). Of
course, future key types could have different values as well.
Fixes: 00d60fd3b932 ("KEYS: Provide keyctls to drive the new key type ops for asymmetric keys [ver #2]") Cc: <stable@vger.kernel.org> # v4.20+ Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove the spinlock around the tree traversal as we are calling possibly
sleeping functions.
We do not need a spinlock here as there will be no modifications to this
tree at this point.
This prevents warnings like this to occur in dmesg:
[ 653.774996] BUG: sleeping function called from invalid context at kernel/loc\
king/mutex.c:280
[ 653.775088] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1827, nam\
e: umount
[ 653.775152] preempt_count: 1, expected: 0
[ 653.775191] CPU: 0 PID: 1827 Comm: umount Tainted: G W OE 5.17.0\
-rc7-00006-g4eb628dd74df #135
[ 653.775195] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-\
1.fc33 04/01/2014
[ 653.775197] Call Trace:
[ 653.775199] <TASK>
[ 653.775202] dump_stack_lvl+0x34/0x44
[ 653.775209] __might_resched.cold+0x13f/0x172
[ 653.775213] mutex_lock+0x75/0xf0
[ 653.775217] ? __mutex_lock_slowpath+0x10/0x10
[ 653.775220] ? _raw_write_lock_irq+0xd0/0xd0
[ 653.775224] ? dput+0x6b/0x360
[ 653.775228] cifs_kill_sb+0xff/0x1d0 [cifs]
[ 653.775285] deactivate_locked_super+0x85/0x130
[ 653.775289] cleanup_mnt+0x32c/0x4d0
[ 653.775292] ? path_umount+0x228/0x380
[ 653.775296] task_work_run+0xd8/0x180
[ 653.775301] exit_to_user_mode_loop+0x152/0x160
[ 653.775306] exit_to_user_mode_prepare+0x89/0xd0
[ 653.775315] syscall_exit_to_user_mode+0x12/0x30
[ 653.775322] do_syscall_64+0x48/0x90
[ 653.775326] entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes: 187af6e98b44e5d8f25e1d41a92db138eb54416f ("cifs: fix handlecache and multiuser") Reported-by: kernel test robot <oliver.sang@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In multiuser each individual user has their own tcon structure for the
share and thus their own handle for a cached directory.
When we umount such a share we much make sure to release the pinned down dentry
for each such tcon and not just the master tcon.
Otherwise we will get nasty warnings on umount that dentries are still in use:
[ 3459.590047] BUG: Dentry 00000000115c6f41{i=12000000019d95,n=/} still in use\
(2) [unmount of cifs cifs]
...
[ 3459.590492] Call Trace:
[ 3459.590500] d_walk+0x61/0x2a0
[ 3459.590518] ? shrink_lock_dentry.part.0+0xe0/0xe0
[ 3459.590526] shrink_dcache_for_umount+0x49/0x110
[ 3459.590535] generic_shutdown_super+0x1a/0x110
[ 3459.590542] kill_anon_super+0x14/0x30
[ 3459.590549] cifs_kill_sb+0xf5/0x104 [cifs]
[ 3459.590773] deactivate_locked_super+0x36/0xa0
[ 3459.590782] cleanup_mnt+0x131/0x190
[ 3459.590789] task_work_run+0x5c/0x90
[ 3459.590798] exit_to_user_mode_loop+0x151/0x160
[ 3459.590809] exit_to_user_mode_prepare+0x83/0xd0
[ 3459.590818] syscall_exit_to_user_mode+0x12/0x30
[ 3459.590828] do_syscall_64+0x48/0x90
[ 3459.590833] entry_SYSCALL_64_after_hwframe+0x44/0xae
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Setting PTRACE_O_SUSPEND_SECCOMP is supposed to be a highly privileged
operation because it allows the tracee to completely bypass all seccomp
filters on kernels with CONFIG_CHECKPOINT_RESTORE=y. It is only supposed to
be settable by a process with global CAP_SYS_ADMIN, and only if that
process is not subject to any seccomp filters at all.
However, while these permission checks were done on the PTRACE_SETOPTIONS
path, they were missing on the PTRACE_SEIZE path, which also sets
user-specified ptrace flags.
Move the permissions checks out into a helper function and let both
ptrace_attach() and ptrace_setoptions() call it.
Cc: stable@kernel.org Fixes: 13c4a90119d2 ("seccomp: add ptrace options for suspend/resume") Signed-off-by: Jann Horn <jannh@google.com> Link: https://lkml.kernel.org/r/20220319010838.1386861-1-jannh@google.com Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixed-rate clocks in UniPhier don't have any parent clocks, however,
initial data "init.flags" isn't initialized, so it might be determined
that there is a parent clock for fixed-rate clock.
This sets init.flags to zero as initialization.
Cc: <stable@vger.kernel.org> Fixes: 734d82f4a678 ("clk: uniphier: add core support code for UniPhier clock driver") Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Link: https://lore.kernel.org/r/1646808918-30899-1-git-send-email-hayashi.kunihiko@socionext.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In commit 4e7cf74fa3b2 ("clk: fractional-divider: Export approximation
algorithm to the CCF users"), the code handling the rational best
approximation algorithm was replaced by a call to the core
clk_fractional_divider_general_approximation function which did the same
thing back then.
However, in commit 82f53f9ee577 ("clk: fractional-divider: Introduce
POWER_OF_TWO_PS flag"), this common code was made conditional on
CLK_FRAC_DIVIDER_POWER_OF_TWO_PS flag which was not added back to the
rockchip clock driver.
This broke the ltk050h3146w-a2 MIPI DSI display present on a PX30-based
downstream board.
Let's add the flag to the fractional divider flags so that the original
and intended behavior is brought back to the rockchip clock drivers.
iio_convert_raw_to_processed_unlocked() assumes the offset is an
integer. Make a best effort to get a valid offset value for fractional
cases without breaking implicit truncations.
Fixes: 48e44ce0f881 ("iio:inkern: Add function to read the processed value") Signed-off-by: Liam Beguin <liambeguin@gmail.com> Reviewed-by: Peter Rosin <peda@axentia.se> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20220108205319.2046348-4-liambeguin@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a consumer calls iio_read_channel_processed() and no channel scale
is available, it's assumed that the scale is one and the raw value is
returned as expected.
On the other hand, if the consumer calls iio_convert_raw_to_processed()
the scaling factor requested by the consumer is not applied.
This for example causes the consumer to process mV when expecting uV.
Make sure to always apply the scaling factor requested by the consumer.
Fixes: adc8ec5ff183 ("iio: inkern: pass through raw values if no scaling") Signed-off-by: Liam Beguin <liambeguin@gmail.com> Reviewed-by: Peter Rosin <peda@axentia.se> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20220108205319.2046348-3-liambeguin@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a consumer calls iio_read_channel_processed() and the channel has
an integer scale, the scale channel scale is applied and the processed
value is returned as expected.
On the other hand, if the consumer calls iio_convert_raw_to_processed()
the scaling factor requested by the consumer is not applied.
This for example causes the consumer to process mV when expecting uV.
Make sure to always apply the scaling factor requested by the consumer.
Fixes: 48e44ce0f881 ("iio:inkern: Add function to read the processed value") Signed-off-by: Liam Beguin <liambeguin@gmail.com> Reviewed-by: Peter Rosin <peda@axentia.se> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20220108205319.2046348-2-liambeguin@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
All four scaling coefficients can take signed values.
Make tmp a signed 64-bit integer and switch to div_s64() to preserve
signs during 64-bit divisions.
Fixes: 8b74816b5a9a ("iio: afe: rescale: new driver") Signed-off-by: Liam Beguin <liambeguin@gmail.com> Reviewed-by: Peter Rosin <peda@axentia.se> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20220108205319.2046348-5-liambeguin@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The formula for the ADC sampling period in ast2400/ast2500 is:
ADC clock period = PCLK * 2 * (ADC0C[31:17] + 1) * (ADC0C[9:0])
When ADC0C[9:0] is set to 0 the sampling voltage will be lower than
expected, because the hardware may not have enough time to
charge/discharge to a stable voltage. This patch use the flag
CLK_DIVIDER_ONE_BASED which will use the raw value read from the
register, with the value of zero considered invalid to conform to the
corrected formula.
Fixes: 573803234e72 ("iio: Aspeed ADC") Reported-by: Konstantin Klubnichkin <kitsok@yandex-team.ru> Signed-off-by: Billy Tsai <billy_tsai@aspeedtech.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20220221012705.22008-1-billy_tsai@aspeedtech.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The original logic to get mma8452_data is wrong, the *dev point to
the device belong to iio_dev. we can't use this dev to find the
correct i2c_client. The original logic happen to work because it
finally use dev->driver_data to get iio_dev. Here use the API
to_i2c_client() is wrong and make reader confuse. To correct the
logic, it should be like this
But after commit 8b7651f25962 ("iio: iio_device_alloc(): Remove
unnecessary self drvdata"), the upper logic also can't work.
When try to show the avialable scale in userspace, will meet kernel
dump, kernel handle NULL pointer dereference.
So use dev_to_iio_dev() to correct the logic.
Dual fixes tags as the second reflects when the bug was exposed, whilst
the first reflects when the original bug was introduced.
Fixes: c3cdd6e48e35 ("iio: mma8452: refactor for seperating chip specific data") Fixes: 8b7651f25962 ("iio: iio_device_alloc(): Remove unnecessary self drvdata") Signed-off-by: Haibo Chen <haibo.chen@nxp.com> Reviewed-by: Martin Kepplinger <martink@posteo.de> Cc: <Stable@vger.kernel.org> Link: https://lore.kernel.org/r/1645497741-5402-1-git-send-email-haibo.chen@nxp.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some of the AMS channels need to be read by switching into single-channel
mode from the normal polling sequence. There was a logic issue in this
switching code that could cause the first read of these channels to read
back as zero.
It appears that the sequencer should be set back to default mode before
changing the channel selection, and the channel should be set before
switching the sequencer back into single-channel mode.
Also, write 1 to the EOC bit in the status register to clear it before
waiting for it to become set, so that we actually wait for a new
conversion to complete, and don't proceed based on a previous conversion
completing.
Fixes: d5c70627a794 ("iio: adc: Add Xilinx AMS driver") Signed-off-by: Robert Hancock <robert.hancock@calian.com> Acked-by: Michal Simek <michal.simek@xilinx.com> Link: https://lore.kernel.org/r/20220127173450.3684318-5-robert.hancock@calian.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's impossible to program a valid value for TRCCONFIGR.QE
when TRCIDR0.QSUPP==0b10. In that case the following is true:
Q element support is implemented, and only supports Q elements without
instruction counts. TRCCONFIGR.QE can only take the values 0b00 or 0b11.
Currently the low bit of QSUPP is checked to see if the low bit of QE can
be written to, but as you can see when QSUPP==0b10 the low bit is cleared
making it impossible to ever write the only valid value of 0b11 to QE.
0b10 would be written instead, which is a reserved QE value even for all
values of QSUPP.
The fix is to allow writing the low bit of QE for any non zero value of
QSUPP.
This change also ensures that the low bit is always set, even when the
user attempts to only set the high bit.
Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Fixes: d8c66962084f ("coresight-etm4x: Controls pertaining to the reset, mode, pe and events") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220120113047.2839622-2-james.clark@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jinja2 release 3.1.0 (March 24, 2022) broke Sphinx<4.0.
This looks like the result of deprecating Python 3.6.
It has been tested against Sphinx 4.3.0 and later.
Setting an upper limit of <3.1 to junja2 can unbreak Sphinx<4.0
including Sphinx 2.4.4.
The MHI driver does not work on big endian architectures. The
controller never transitions into mission mode. This appears to be due
to the modem device expecting the various contexts and transfer rings to
have fields in little endian order in memory, but the driver constructs
them in native endianness.
Fix MHI event, channel and command contexts and TRE handling macros to
use explicit conversion to little endian. Mark fields in relevant
structures as little endian to document this requirement.
Fixes: a6e2e3522f29 ("bus: mhi: core: Add support for PM state transitions") Fixes: 6cd330ae76ff ("bus: mhi: core: Add support for ringing channel/event ring doorbells") Cc: stable@vger.kernel.org Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20220301160308.107452-4-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On big endian architectures the mhi debugfs files which report pm state
give "Invalid State" for all states. This is caused by using
find_last_bit which takes an unsigned long* while the state is passed in
as an enum mhi_pm_state which will be of int size.
Fix by using __fls to pass the value of state instead of find_last_bit.
Also the current API expects "mhi_pm_state" enumerator as the function
argument but the function only works with bitmasks. So as Alex suggested,
let's change the argument to u32 to avoid confusion.
Fixes: a6e2e3522f29 ("bus: mhi: core: Add support for PM state transitions") Cc: stable@vger.kernel.org
[mani: changed the function argument to u32] Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Hemant Kumar <hemantk@codeaurora.org> Reviewed-by: Alex Elder <elder@linaro.org> Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20220301160308.107452-3-manivannan.sadhasivam@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For default mechanism, the driver uses default MRU 3500 if mru_default
is not initialized. The Qualcomm configured the MRU size to 32768 in the
WWAN device FW. So, we align the driver setting with Qualcomm FW setting.
Usage of the iterator outside of the list_for_each_entry
is considered harmful. https://lkml.org/lkml/2022/2/17/1032
Do not reference the loop variable outside of the loop,
by rearranging the orders of execution.
Instead of performing search loop and checking outside the loop
if the end of the list was hit and no matching element was found,
the execution is performed inside the loop upon a successful match
followed by a goto statement to the next step,
therefore no condition has to be performed after the loop has ended.
xhci_decode_ctrl_ctx() returns the untouched buffer as-is if both "drop"
and "add" parameters are zero.
Fix the function to return an empty string in that case.
It was not immediately clear from the possible call chains whether this
issue is currently actually triggerable or not.
Note that before commit 4843b4b5ec64 ("xhci: fix even more unsafe memory
usage in xhci tracing") the result effect in the failure case was different
as a static buffer was used here, but the code still worked incorrectly.
Fixes: 90d6d5731da7 ("xhci: Add tracing for input control context") Cc: stable@vger.kernel.org Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
commit 4843b4b5ec64 ("xhci: fix even more unsafe memory usage in xhci tracing") Link: https://lore.kernel.org/r/20220303110903.1662404-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
xhci_reset() timeout was increased from 250ms to 10 seconds in order to
give Renesas 720201 xHC enough time to get ready in probe.
xhci_reset() is called with interrupts disabled in other places, and
waiting for 10 seconds there is not acceptable.
Add a timeout parameter to xhci_reset(), and adjust it back to 250ms
when called from xhci_stop() or xhci_shutdown() where interrupts are
disabled, and successful reset isn't that critical.
This solves issues when deactivating host mode on platforms like SM8450.
For now don't change the timeout if xHC is reset in xhci_resume().
No issues are reported for it, and we need the reset to succeed.
Locking around that reset needs to be revisited later.
Additionally change the signed integer timeout parameter in
xhci_handshake() to a u64 to match the timeout value we pass to
readl_poll_timeout_atomic()
A race between system resume and device-initiated resume may result in
runtime PM imbalance on USB2 root hub. If a device-initiated resume
starts and system resume xhci_bus_resume() directs U0 before hub driver
sees the resuming device in RESUME state, device-initiated resume will
not be finished in xhci_handle_usb2_port_link_resume(). In this case,
usb_hcd_end_port_resume() call is missing.
This changes calls usb_hcd_end_port_resume() if resuming device reaches
U0 to keep runtime PM balance.
Fixes: a231ec41e6f6 ("xhci: refactor U0 link state handling in get_port_status") Cc: stable@vger.kernel.org Signed-off-by: Henry Lin <henryl@nvidia.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20220303110903.1662404-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
xhci_decode_usbsts() is expected to return a zero-terminated string by
its only caller, xhci_stop_endpoint_command_watchdog(), which directly
logs the return value:
However, if no recognized bits are set in usbsts, the function will
return without having called any sprintf() and therefore return an
untouched non-zero-terminated caller-provided buffer, causing garbage
to be output to log.
Fix that by always including the raw value in the output.
Note that before commit 4843b4b5ec64 ("xhci: fix even more unsafe memory
usage in xhci tracing") the result effect in the failure case was different
as a static buffer was used here, but the code still worked incorrectly.
Fixes: 9c1aa36efdae ("xhci: Show host status when watchdog triggers and host is assumed dead.") Cc: stable@vger.kernel.org Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20220303110903.1662404-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The kernel test robot found a problem with the ene_ub6250 subdriver in
usb-storage: It uses structures containing bitfields to represent
hardware bits in its SD_STATUS, MS_STATUS, and SM_STATUS bytes. This
is not safe; it presumes a particular bit ordering and it assumes the
compiler will not insert padding, neither of which is guaranteed.
This patch fixes the problem by changing the structures to simple u8
values, with the bitfields replaced by bitmask constants.
In order to bring up the USB3 PHY on the Apple M1 we need to know the
orientation of the Type-C cable. Extract it from the status register and
forward it to the typec subsystem.
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Sven Peter <sven@svenpeter.dev> Link: https://lore.kernel.org/r/20220226125912.59828-1-sven@svenpeter.dev Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Song reports that a RAID rebuild workload runs much slower recently,
and it is seeing a lot less merging than it did previously. The reason
is that a previous commit reduced the amount of work we do for plug
merging. RAID rebuild interleaves requests between disks, so a last-entry
check in plug merging always misses a merge opportunity since we always
find a different disk than what we are looking for.
Modify the logic such that it's still a one-hit cache, but ensure that
we check enough to find the right target before giving up.
Fixes: d38a9c04c0d5 ("block: only check previous entry for plug merge attempt") Reported-and-tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <songliubraving@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We used to sort the plug list if we had multiple queues before dispatching
requests to the IO scheduler. This usually isn't needed, but for certain
workloads that interleave requests to disks, it's a less efficient to
process the plug list one-by-one if everything is interleaved.
Don't sort the list, but skip through it and flush out entries that have
the same target at the same time.
Fixes: df87eb0fce8f ("block: get rid of plug list sorting") Reported-and-tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <songliubraving@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For various reasons based on the allocator behaviour and typical
use-cases at the time, when the max32_alloc_size optimisation was
introduced it seemed reasonable to couple the reset of the tracked
size to the update of cached32_node upon freeing a relevant IOVA.
However, since subsequent optimisations focused on helping genuine
32-bit devices make best use of even more limited address spaces, it
is now a lot more likely for cached32_node to be anywhere in a "full"
32-bit address space, and as such more likely for space to become
available from IOVAs below that node being freed.
At this point, the short-cut in __cached_rbnode_delete_update() really
doesn't hold up any more, and we need to fix the logic to reliably
provide the expected behaviour. We still want cached32_node to only move
upwards, but we should reset the allocation size if *any* 32-bit space
has become available.
It was found that reading /proc/lockdep after a lockdep splat may
potentially cause an access to freed memory if lockdep_unregister_key()
is called after the splat but before access to /proc/lockdep [1]. This
is due to the fact that graph_lock() call in lockdep_unregister_key()
fails after the clearing of debug_locks by the splat process.
After lockdep_unregister_key() is called, the lock_name may be freed
but the corresponding lock_class structure still have a reference to
it. That invalid memory pointer will then be accessed when /proc/lockdep
is read by a user and a use-after-free (UAF) error will be reported if
KASAN is enabled.
To fix this problem, lockdep_unregister_key() is now modified to always
search for a matching key irrespective of the debug_locks state and
zap the corresponding lock class if a matching one is found.
The objcg is not cleared and put for kfence object when it is freed,
which could lead to memory leak for struct obj_cgroup and wrong
statistics of NR_SLAB_RECLAIMABLE_B or NR_SLAB_UNRECLAIMABLE_B.
Since the last freed object's objcg is not cleared,
mem_cgroup_from_obj() could return the wrong memcg when this kfence
object, which is not charged to any objcgs, is reallocated to other
users.
A real word issue [1] is caused by this bug.
Link: https://lore.kernel.org/all/000000000000cabcb505dae9e577@google.com/ Reported-by: syzbot+f8c45ccc7d5d45fc5965@syzkaller.appspotmail.com Fixes: d3fb45f370d9 ("mm, kfence: insert KFENCE hooks for SLAB") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
At least some PL2303GS have a bcdDevice of 0x605 instead of 0x100 as the
datasheet claims. Add it to the list of known release numbers for the
HXN (G) type.
IBM manufactures a PL2303 device for UPS communications. Add the vendor
and product IDs so that the PL2303 driver binds to the device.
Signed-off-by: Eddie James <eajames@linux.ibm.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Eddie James <eajames@linux.ibm.com> Link: https://lore.kernel.org/r/20220301224446.21236-1-eajames@linux.ibm.com Cc: stable@vger.kernel.org
[ johan: amend the SoB chain ] Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ret = ath9k_hw_process_rxdesc_edma(ah, rs, skb->data);
if (ret == -EINPROGRESS) {
/*let device gain the buffer again*/
dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
common->rx_bufsize, DMA_FROM_DEVICE);
return false;
}
and it's worth noting how that first DMA sync:
dma_sync_single_for_cpu(..DMA_FROM_DEVICE);
is there to make sure the CPU can read the DMA buffer (possibly by
copying it from the bounce buffer area, or by doing some cache flush).
The iommu correctly turns that into a "copy from bounce bufer" so that
the driver can look at the state of the packets.
In the meantime, the device may continue to write to the DMA buffer, but
we at least have a snapshot of the state due to that first DMA sync.
But that _second_ DMA sync:
dma_sync_single_for_device(..DMA_FROM_DEVICE);
is telling the DMA mapping that the CPU wasn't interested in the area
because the packet wasn't there. In the case of a DMA bounce buffer,
that is a no-op.
Note how it's not a sync for the CPU (the "for_device()" part), and it's
not a sync for data written by the CPU (the "DMA_FROM_DEVICE" part).
Or rather, it _should_ be a no-op. That's what commit aa6f8dcbab47
broke: it made the code bounce the buffer unconditionally, and changed
the DMA_FROM_DEVICE to just unconditionally and illogically be
DMA_TO_DEVICE.
[ Side note: purely within the confines of the swiotlb driver it wasn't
entirely illogical: The reason it did that odd DMA_FROM_DEVICE ->
DMA_TO_DEVICE conversion thing is because inside the swiotlb driver,
it uses just a swiotlb_bounce() helper that doesn't care about the
whole distinction of who the sync is for - only which direction to
bounce.
So it took the "sync for device" to mean that the CPU must have been
the one writing, and thought it meant DMA_TO_DEVICE. ]
Also note how the commentary in that commit was wrong, probably due to
that whole confusion, claiming that the commit makes the swiotlb code
"bounce unconditionally (that is, also
when dir == DMA_TO_DEVICE) in order do avoid synchronising back stale
data from the swiotlb buffer"
which is nonsensical for two reasons:
- that "also when dir == DMA_TO_DEVICE" is nonsensical, as that was
exactly when it always did - and should do - the bounce.
- since this is a sync for the device (not for the CPU), we're clearly
fundamentally not coping back stale data from the bounce buffers at
all, because we'd be copying *to* the bounce buffers.
So that commit was just very confused. It confused the direction of the
synchronization (to the device, not the cpu) with the direction of the
DMA (from the device).
Reported-and-bisected-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reported-by: Olha Cherevyk <olha.cherevyk@gmail.com> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Kalle Valo <kvalo@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Toke Høiland-Jørgensen <toke@toke.dk> Cc: Maxime Bizon <mbizon@freebox.fr> Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
My latest patch, attempting to fix the refcount leak in a minimal
way turned out to add a new bug.
Whenever the bind operation fails before we attempt to grab
a reference count on a device, we might release the device refcount
of a prior successful bind() operation.
syzbot was not happy about this [1].
Note to stable teams:
Make sure commit b37a46683739 ("netdevice: add the case if dev is NULL")
is already present in your trees.
While most m68k platforms use separate address spaces for user
and kernel space, at least coldfire does not, and the other
ones have a TASK_SIZE that is less than the entire 4GB address
range.
Using the default implementation of __access_ok() stops coldfire
user space from trivially accessing kernel memory.
exposure of the chip->tpm_mutex was removed from much of the upper
level code. In this conversion, tpm2_del_space() was missed. This
didn't matter much because it's usually called closely after a
converted operation, so there's only a very tiny race window where the
chip can be removed before the space flushing is done which causes a
NULL deref on the mutex. However, there are reports of this window
being hit in practice, so fix this by converting tpm2_del_space() to
use tpm_try_get_ops(), which performs all the teardown checks before
acquring the mutex.
Cc: stable@vger.kernel.org # 5.4.x Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The following sequence of operations results in a refcount warning:
1. Open device /dev/tpmrm.
2. Remove module tpm_tis_spi.
3. Write a TPM command to the file descriptor opened at step 1.
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1161 at lib/refcount.c:25 kobject_get+0xa0/0xa4
refcount_t: addition on 0; use-after-free.
Modules linked in: tpm_tis_spi tpm_tis_core tpm mdio_bcm_unimac brcmfmac
sha256_generic libsha256 sha256_arm hci_uart btbcm bluetooth cfg80211 vc4
brcmutil ecdh_generic ecc snd_soc_core crc32_arm_ce libaes
raspberrypi_hwmon ac97_bus snd_pcm_dmaengine bcm2711_thermal snd_pcm
snd_timer genet snd phy_generic soundcore [last unloaded: spi_bcm2835]
CPU: 3 PID: 1161 Comm: hold_open Not tainted 5.10.0ls-main-dirty #2
Hardware name: BCM2711
[<c0410c3c>] (unwind_backtrace) from [<c040b580>] (show_stack+0x10/0x14)
[<c040b580>] (show_stack) from [<c1092174>] (dump_stack+0xc4/0xd8)
[<c1092174>] (dump_stack) from [<c0445a30>] (__warn+0x104/0x108)
[<c0445a30>] (__warn) from [<c0445aa8>] (warn_slowpath_fmt+0x74/0xb8)
[<c0445aa8>] (warn_slowpath_fmt) from [<c08435d0>] (kobject_get+0xa0/0xa4)
[<c08435d0>] (kobject_get) from [<bf0a715c>] (tpm_try_get_ops+0x14/0x54 [tpm])
[<bf0a715c>] (tpm_try_get_ops [tpm]) from [<bf0a7d6c>] (tpm_common_write+0x38/0x60 [tpm])
[<bf0a7d6c>] (tpm_common_write [tpm]) from [<c05a7ac0>] (vfs_write+0xc4/0x3c0)
[<c05a7ac0>] (vfs_write) from [<c05a7ee4>] (ksys_write+0x58/0xcc)
[<c05a7ee4>] (ksys_write) from [<c04001a0>] (ret_fast_syscall+0x0/0x4c)
Exception stack(0xc226bfa8 to 0xc226bff0)
bfa0: 00000000000105b400000003beafe6640000001400000000
bfc0: 00000000000105b4000103f8000000040000000000000000b6f9c000beafe684
bfe0: 0000006cbeafe6480001056cb6eb6944
---[ end trace d4b8409def9b8b1f ]---
The reason for this warning is the attempt to get the chip->dev reference
in tpm_common_write() although the reference counter is already zero.
Since commit 8979b02aaf1d ("tpm: Fix reference count to main device") the
extra reference used to prevent a premature zero counter is never taken,
because the required TPM_CHIP_FLAG_TPM2 flag is never set.
Fix this by moving the TPM 2 character device handling from
tpm_chip_alloc() to tpm_add_char_device() which is called at a later point
in time when the flag has been set in case of TPM2.
Commit fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>")
already introduced function tpm_devs_release() to release the extra
reference but did not implement the required put on chip->devs that results
in the call of this function.
Fix this by putting chip->devs in tpm_chip_unregister().
Finally move the new implementation for the TPM 2 handling into a new
function to avoid multiple checks for the TPM_CHIP_FLAG_TPM2 flag in the
good case and error cases.
Cc: stable@vger.kernel.org Fixes: fdc915f7f719 ("tpm: expose spaces via a device link /dev/tpmrm<n>") Fixes: 8979b02aaf1d ("tpm: Fix reference count to main device") Co-developed-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Tested-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While commit 6a01afcf8468 ("mac80211: mesh: Free ie data when leaving
mesh") fixed a memory leak on mesh leave / teardown it introduced a
potential memory corruption caused by a double free when rejoining the
mesh:
This double free / kernel panics can be reproduced by using wpa_supplicant
with an encrypted mesh (if set up without encryption via "iw" then
ifmsh->ie is always NULL, which avoids this issue). And then calling:
$ iw dev mesh0 mesh leave
$ iw dev mesh0 mesh join my-mesh
Note that typically these commands are not used / working when using
wpa_supplicant. And it seems that wpa_supplicant or wpa_cli are going
through a NETDEV_DOWN/NETDEV_UP cycle between a mesh leave and mesh join
where the NETDEV_UP resets the mesh.ie to NULL via a memcpy of
default_mesh_setup in cfg80211_netdev_notifier_call, which then avoids
the memory corruption, too.
The issue was first observed in an application which was not using
wpa_supplicant but "Senf" instead, which implements its own calls to
nl80211.
Fixing the issue by removing the kfree()'ing of the mesh IE in the mesh
join function and leaving it solely up to the mesh leave to free the
mesh IE.
Three architectures check the end of a user access against the
address limit without taking a possible overflow into account.
Passing a negative length or another overflow in here returns
success when it should not.
Use the most common correct implementation here, which optimizes
for a constant 'size' argument, and turns the common case into a
single comparison.
Cc: stable@vger.kernel.org Fixes: da551281947c ("csky: User access") Fixes: f663b60f5215 ("microblaze: Fix uaccess_ok macro") Fixes: 7567746e1c0d ("Hexagon: Add user access functions") Reported-by: David Laight <David.Laight@aculab.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently rcu_preempt_deferred_qs_irqrestore() releases rnp->boost_mtx
before reporting the expedited quiescent state. Under heavy real-time
load, this can result in this function being preempted before the
quiescent state is reported, which can in turn prevent the expedited grace
period from completing. Tim Murray reports that the resulting expedited
grace periods can take hundreds of milliseconds and even more than one
second, when they should normally complete in less than a millisecond.
This was fine given that there were no particular response-time
constraints for synchronize_rcu_expedited(), as it was designed
for throughput rather than latency. However, some users now need
sub-100-millisecond response-time constratints.
This patch therefore follows Neeraj's suggestion (seconded by Tim and
by Uladzislau Rezki) of simply reversing the two operations.
Reported-by: Tim Murray <timmurray@google.com> Reported-by: Joel Fernandes <joelaf@google.com> Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Tested-by: Tim Murray <timmurray@google.com> Cc: Todd Kjos <tkjos@google.com> Cc: Sandeep Patil <sspatil@google.com> Cc: <stable@vger.kernel.org> # 5.4.x Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jbd2_journal_wait_updates() is called with j_state_lock held. But if
there is a commit in progress, then this transaction might get committed
and freed via jbd2_journal_commit_transaction() ->
jbd2_journal_free_transaction(), when we release j_state_lock.
So check for journal->j_running_transaction everytime we release and
acquire j_state_lock to avoid use-after-free issue.
Link: https://lore.kernel.org/r/948c2fed518ae739db6a8f7f83f1d58b504f87d0.1644497105.git.ritesh.list@gmail.com Fixes: 4f98186848707f53 ("jbd2: refactor wait logic for transaction updates into a common function") Cc: stable@kernel.org Reported-and-tested-by: syzbot+afa2ca5171d93e44b348@syzkaller.appspotmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If virtio_gpu_object_shmem_init() fails (e.g. due to fault injection, as it
happened in the bug report by syzbot), virtio_gpu_array_put_free() could be
called with objs equal to NULL.
Ensure that objs is not NULL in virtio_gpu_array_put_free(), or otherwise
return from the function.
Another subset of the more recent batch of Chinese clones aren't
specs-compliant and seem to lock up whenever they receive a
HCI_OP_SET_EVENT_FLT with flt_type set to zero/HCI_FLT_CLEAR_ALL,
which on Linux (until the recent HCI state-machine refactor) happened
right at BR/EDR setup. As there are other less-straightforward ways
of reaching those operations, this patch is still relevant.
So, while all the previous efforts to wrangle the herd of fake CSRs
seem to be paying off (and these also get detected as such) we
still need to take care of this quirk; testers seem to agree
that these dongles tend to work well enough afterwards.
From some cursory USB packet capture on Windows it seems like
that driver doesn't appear to use this clear-all functionality at all.
This patch was tested on some really popular AliExpress-style
dongles, in my case marked as "V5.0". Chip markings: UG8413,
the backside of the PCB says "USB Dangel" (sic).
Here is the `hciconfig -a` output; for completeness:
ID 0a12:0001 Cambridge Silicon Radio, Ltd Bluetooth Dongle (HCI mode)
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 2.00
bDeviceClass 224 Wireless
bDeviceSubClass 1 Radio Frequency
bDeviceProtocol 1 Bluetooth
bMaxPacketSize0 64
idVendor 0x0a12 Cambridge Silicon Radio, Ltd
idProduct 0x0001 Bluetooth Dongle (HCI mode)
bcdDevice 88.91
iManufacturer 0
iProduct 2 BT DONGLE10
iSerial 0
bNumConfigurations 1
Also, changed the benign dmesg print that shows up whenever the
generic force-suspend fails from bt_dev_err to bt_dev_warn;
it's okay and done on a best-effort basis, not a problem
if that does not work.
Also, swapped the HCI subver and LMP subver numbers for the Barrot
in the comment, which I copied wrong the last time around.
Fixes: 81cac64ba258a ("Bluetooth: Deal with USB devices that are faking CSR vendor") Fixes: cde1a8a992875 ("Bluetooth: btusb: Fix and detect most of the Chinese Bluetooth controllers") Fixes: d74e0ae7e0303 ("Bluetooth: btusb: Fix detection of some fake CSR controllers with a bcdDevice val of 0x0134") Fixes: 0671c0662383e ("Bluetooth: btusb: Add workaround for remote-wakeup issues with Barrot 8041a02 fake CSR controllers") Fixes: f4292e2faf522 ("Bluetooth: btusb: Make the CSR clone chip force-suspend workaround more generic") Link: https://bugzilla.kernel.org/show_bug.cgi?id=60824 Link: https://gist.github.com/nevack/6b36b82d715dc025163d9e9124840a07 Cc: stable@vger.kernel.org Cc: Hans de Goede <hdegoede@redhat.com> Tested-by: Gonzalo Tornaría <tornaria@cmat.edu.uy> Tested-by: Mateus Lemos <lemonsmateus@gmail.com> Tested-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com> Signed-off-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some controllers have problems with being sent a command to clear
all filtering. While the HCI code does not unconditionally
send a clear-all anymore at BR/EDR setup (after the state machine
refactor), there might be more ways of hitting these codepaths
in the future as the kernel develops.
Cc: stable@vger.kernel.org Cc: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The implementations of aead and skcipher in the QAT driver do not
support properly requests with the CRYPTO_TFM_REQ_MAY_BACKLOG flag set.
If the HW queue is full, the driver returns -EBUSY but does not enqueue
the request.
This can result in applications like dm-crypt waiting indefinitely for a
completion of a request that was never submitted to the hardware.
To avoid this problem, disable the registration of all crypto algorithms
in the QAT driver by setting the number of crypto instances to 0 at
configuration time.
Cc: stable@vger.kernel.org Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clevo NL5xRU and NL5xNU/TUXEDO Aura 15 Gen1 and Gen2 have both a working
native and video interface. However the default detection mechanism first
registers the video interface before unregistering it again and switching
to the native interface during boot. This results in a dangling SBIOS
request for backlight change for some reason, causing the backlight to
switch to ~2% once per boot on the first power cord connect or disconnect
event. Setting the native interface explicitly circumvents this buggy
behaviour by avoiding the unregistering process.
Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For some reason, the Microsoft Surface Go 3 uses the standard ACPI
interface for battery information, but does not use the standard PNP0C0A
HID. Instead it uses MSHW0146 as identifier. Add that ID to the driver
as this seems to work well.
Additionally, the power state is not updated immediately after the AC
has been (un-)plugged, so add the respective quirk for that.
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On this board the ACPI RSDP structure points to both a RSDT and an XSDT,
but the XSDT points to a truncated FADT. This causes all sorts of trouble
and usually a complete failure to boot after the following error occurs:
This leaves the ACPI implementation in such a broken state that subsequent
kernel subsystem initialisations go wrong, resulting in among others
mismapped PCI memory, SATA and USB enumeration failures, and freezes.
As this is an older embedded platform that will likely never see any BIOS
updates to address this issue and its default shipping OS only complies to
ACPI 1.0, work around this by forcing `acpi=rsdt`. This patch, applied on
top of Linux 5.10.102, was confirmed on real hardware to fix the issue.
Signed-off-by: Mark Cilissen <mark@yotsuba.nl> Cc: All applicable <stable@vger.kernel.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>