pm_runtime_get_sync will increment pm usage counter even it failed.
Forgetting to putting operation will result in reference leak here.
We fix it by replacing it with the newest pm_runtime_resume_and_get
to keep usage counter balanced.
The device will respond with a CMD_ERROR_EVENT command, with error_code
KVASER_USB_{LEAF,HYDRA}_ERROR_EVENT_PARAM, if the CMD_SET_BUSPARAMS_REQ
contains invalid bittiming parameters.
However, this command does not contain any channel reference.
To check if the CMD_SET_BUSPARAMS_REQ was successful, redback and compare
the requested bittiming parameters with the device reported parameters.
Fixes: 080f40a6fa28 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices") Fixes: aec5fb2268b7 ("can: kvaser_usb: Add support for Kvaser USB hydra family") Tested-by: Anssi Hannula <anssi.hannula@bitwise.fi> Co-developed-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Link: https://lore.kernel.org/all/20221010185237.319219-12-extja@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
When auto-restart is enabled, the kvaser_usb_leaf driver considers
transition from any state >= CAN_STATE_BUS_OFF as a bus-off recovery
event (restart).
However, these events may occur at interface startup time before
kvaser_usb_open() has set the state to CAN_STATE_ERROR_ACTIVE, causing
restarts counter to increase and CAN_ERR_RESTARTED to be sent despite no
actual restart having occurred.
Fix that by making the auto-restart condition checks more strict so that
they only trigger when the interface was actually in the BUS_OFF state.
Fixes: 080f40a6fa28 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices") Tested-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Link: https://lore.kernel.org/all/20221010185237.319219-10-extja@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
0bfd:0124 Kvaser Mini PCI Express 2xHS FW 4.18.778 sends a
CMD_CHIP_STATE_EVENT indicating bus-off after stopping the device,
causing a stopped device to appear as CAN_STATE_BUS_OFF instead of
CAN_STATE_STOPPED.
Fix that by not handling error events on stopped devices.
Fixes: 080f40a6fa28 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices") Tested-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Link: https://lore.kernel.org/all/20221010185237.319219-8-extja@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
The tested 0bfd:0017 Kvaser Memorator Professional HS/HS FW 2.0.50 and
0bfd:0124 Kvaser Mini PCI Express 2xHS FW 4.18.778 do not seem to send
any unsolicited events when error counters decrease or when the device
transitions from ERROR_PASSIVE to ERROR_ACTIVE (or WARNING).
This causes the interface to e.g. indefinitely stay in the ERROR_PASSIVE
state.
Fix that by asking for chip state (inc. counters) event every 0.5 secs
when error counters are non-zero.
Since there are non-error-counter devices, also always poll in
ERROR_PASSIVE even if the counters show zero.
Fixes: 080f40a6fa28 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices") Tested-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Link: https://lore.kernel.org/all/20221010185237.319219-7-extja@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
kvaser_usb_leaf_rx_error_update_can_state() sets error state according
to error counters when the hardware does not indicate a specific state
directly.
However, this is currently gated behind a check for
M16C_STATE_BUS_ERROR which does not always seem to be set when error
counters are increasing, and may not be set when error counters are
decreasing.
This causes the CAN_STATE_ERROR_WARNING state to not be set in some
cases even when appropriate.
Change the code to set error state from counters even without
M16C_STATE_BUS_ERROR.
The Error-Passive case seems superfluous as it is already set via
M16C_STATE_BUS_PASSIVE flag above, but it is kept for now.
Tested with 0bfd:0124 Kvaser Mini PCI Express 2xHS FW 4.18.778.
Fixes: 080f40a6fa28 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices") Tested-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Link: https://lore.kernel.org/all/20221010185237.319219-6-extja@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Use the CMD_GET_CAPABILITIES_REQ command to query the device for certain
capabilities. We are only interested in LISTENONLY mode and wither the
device reports CAN error counters.
Fixes: 080f40a6fa28 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices") Reported-by: Anssi Hannula <anssi.hannula@bitwise.fi> Tested-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Link: https://lore.kernel.org/all/20221010185237.319219-3-extja@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
The CAN error message frames (i.e. error skb) are an interface
specific to socket CAN. The payload of the CAN error message frames
does not correspond to any actual data sent on the wire. Only an error
flag and a delimiter are transmitted when an error occurs (c.f. ISO
11898-1 section 10.4.4.2 "Error flag").
For this reason, it makes no sense to increment the tx_packets and
tx_bytes fields of struct net_device_stats when sending an error
message frame because no actual payload will be transmitted on the
wire.
N.B. Sending error message frames is a very specific feature which, at
the moment, is only supported by the Kvaser Hydra hardware. Please
refer to [1] for more details on the topic.
When examining module BTF, it is common to see core kernel structures
such as sk_buff, net_device duplicated in the module. After adding
debug messaging to BTF it turned out that much of the problem
was down to the identical struct test failing during deduplication;
sometimes the compiler adds identical structs. However
it turns out sometimes that type ids of identical struct members
can also differ, even when the containing structs are still identical.
To take an example, for struct sk_buff, debug messaging revealed
that the identical struct matching was failing for the anon
struct "headers"; specifically for the first field:
__u8 __pkt_type_offset[0]; /* 128 0 */
Looking at the code in BTF deduplication, we have code that guards
against the possibility of identical struct definitions, down to
type ids, and identical array definitions. However in this case
we have a struct which is being defined twice but does not have
identical type ids since each duplicate struct has separate type
ids for the above array member. A similar problem (though not
observed) could occur for struct-in-struct.
The solution is to make the "identical struct" test check members
not just for matching ids, but to also check if they in turn are
identical structs or arrays.
The results of doing this are quite dramatic (for some modules
at least); I see the number of type ids drop from around 10000
to just over 1000 in one module for example.
For testing use latest pahole or apply [1], otherwise dedups
can fail for the reasons described there.
Also fix return type of btf_dedup_identical_arrays() as
suggested by Andrii to match boolean return type used
elsewhere.
Commit 1f391df44607 ("media: v4l2-async: Use endpoints in
__v4l2_async_nf_add_fwnode_remote()") changed the data that is stored in
the v4l2_async_subdev internals from the fwnode pointer to the parent
device to the fwnode pointer to the matched endpoint. This broke the
sensor matching code, which relied on the particular fwnode data in the
v4l2_async_subdev internals. Fix this by simply matching the
v4l2_async_subdev pointer, which is already available there.
Reported-by: Daniel Scally <djrscally@gmail.com> Fixes: fa91f1056f17 ("[media] exynos4-is: Add support for asynchronous subdevices registration") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Daniel Scally <djrscally@gmail.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Kazuho Oku reported that setsockopt(SO_INCOMING_CPU) does not work
with setsockopt(SO_REUSEPORT) since v4.6.
With the combination of SO_REUSEPORT and SO_INCOMING_CPU, we could
build a highly efficient server application.
setsockopt(SO_INCOMING_CPU) associates a CPU with a TCP listener
or UDP socket, and then incoming packets processed on the CPU will
likely be distributed to the socket. Technically, a socket could
even receive packets handled on another CPU if no sockets in the
reuseport group have the same CPU receiving the flow.
The logic exists in compute_score() so that a socket will get a higher
score if it has the same CPU with the flow. However, the score gets
ignored after the blamed two commits, which introduced a faster socket
selection algorithm for SO_REUSEPORT.
This patch introduces a counter of sockets with SO_INCOMING_CPU in
a reuseport group to check if we should iterate all sockets to find
a proper one. We increment the counter when
* calling listen() if the socket has SO_INCOMING_CPU and SO_REUSEPORT
* enabling SO_INCOMING_CPU if the socket is in a reuseport group
Also, we decrement it when
* detaching a socket out of the group to apply SO_INCOMING_CPU to
migrated TCP requests
* disabling SO_INCOMING_CPU if the socket is in a reuseport group
When the counter reaches 0, we can get back to the O(1) selection
algorithm.
The overall changes are negligible for the non-SO_INCOMING_CPU case,
and the only notable thing is that we have to update sk_incomnig_cpu
under reuseport_lock. Otherwise, the race prevents transitioning to
the O(n) algorithm and results in the wrong socket selection.
In the function vcodec_domains_get(), dev_pm_domain_attach_by_name()
may return NULL in some cases, so IS_ERR() doesn't meet the
requirements. Thus fix it.
Fixes: 7482a983dea3 ("media: venus: redesign clocks and pm domains control") Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When moving the input selection to adv748x_reset() it was missed that
during probe the device is reset _before_ the initialization and parsing
of DT by the AFE subdevice. This can lead to the wrong input port (in
case it's not port 0) being selected until the device is reset for the
first time.
Fix this by restoring the call to adv748x_afe_s_input() in the AFE
initialization while also keeping it in the adv748x_reset().
Fixes: c30ed81afe89 ("media: adv748x: afe: Select input port when device is reset") Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sasha Levin <sashal@kernel.org>
The V4L2 API defines the maximum value for an integer menu control as
the number of elements minus one. The v4l2_ctrl_new_std_menu() validates
this constraint with an off-by-one error. Fix it.
Fixes: d1e9b7c12b74 ("[media] V4L: Add support for integer menu controls with standard menu items") Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 902bc65de0b3 ("drm/amdgpu/powerplay/psm: return an error in power
state init") made the power state init function return early in case of
failure to get an entry from the powerplay table, but it missed to clean up
the allocated memory for the current power state before returning.
Fixes: 902bc65de0b3 ("drm/amdgpu/powerplay/psm: return an error in power state init") Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The ASPEED KCS devices don't provide a BMC-side interrupt for the host
reading the output data register (ODR). The act of the host reading ODR
clears the output buffer full (OBF) flag in the status register (STR),
informing the BMC it can transmit a subsequent byte.
On the BMC side the KCS client must enable the OBE event *and* perform a
subsequent read of STR anyway to avoid races - the polling provides a
window for the host to read ODR if data was freshly written while
minimising BMC-side latency.
Fixes: 28651e6c4237 ("ipmi: kcs_bmc: Allow clients to control KCS IRQ state") Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-Id: <20220812144741.240315-1-andrew@aj.id.au> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently, the logic if we should call ata_scsi_set_sense()
(and set flag ATA_QCFLAG_SENSE_VALID to indicate that we have
successfully added sense data to the struct ata_queued_cmd)
looks like this:
if (dev->class == ATA_DEV_ZAC &&
((qc->result_tf.status & ATA_SENSE) || qc->result_tf.auxiliary))
The problem with this is that a drive can support the NCQ command
error log without supporting NCQ autosense.
On such a drive, if the failing command has sense data, the status
field in the NCQ command error log will have the ATA_SENSE bit set.
It is just that this sense data is not included in the NCQ command
error log when NCQ autosense is not supported. Instead the sense
data has to be fetched using the REQUEST SENSE DATA EXT command.
Therefore, we should only add the sense data if the drive supports
NCQ autosense AND the ATA_SENSE bit is set in the status field.
Fix this, and at the same time, remove the duplicated ATA_DEV_ZAC
check. The struct ata_taskfile supplied to ata_eh_read_log_10h()
is memset:ed before calling the function, so simply checking if
qc->result_tf.auxiliary is set is sufficient to tell us that the
log actually contained sense data.
Fixes: d238ffd59d3c ("libata: do not attempt to retrieve sense code twice") Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add the explicit error and status register fields to 'struct ata_taskfile'
using the anonymous *union*s ('struct ide_taskfile' had that for ages!) and
update the libata taskfile code accordingly. There should be no object code
changes resulting from that...
When there are no program sections, obj->programs is left unallocated,
and find_prog_by_sec_insn()'s search lands on &obj->programs[0] == NULL,
and will cause null-pointer dereference in the following access to
prog->sec_idx.
Guard the search with obj->nr_programs similar to what's being done in
__bpf_program__iter() to prevent null-pointer access from happening.
ASAN reports an use-after-free in btf_dump_name_dups:
ERROR: AddressSanitizer: heap-use-after-free on address 0xffff927006db at pc 0xaaaab5dfb618 bp 0xffffdd89b890 sp 0xffffdd89b928
READ of size 2 at 0xffff927006db thread T0
#0 0xaaaab5dfb614 in __interceptor_strcmp.part.0 (test_progs+0x21b614)
#1 0xaaaab635f144 in str_equal_fn tools/lib/bpf/btf_dump.c:127
#2 0xaaaab635e3e0 in hashmap_find_entry tools/lib/bpf/hashmap.c:143
#3 0xaaaab635e72c in hashmap__find tools/lib/bpf/hashmap.c:212
#4 0xaaaab6362258 in btf_dump_name_dups tools/lib/bpf/btf_dump.c:1525
#5 0xaaaab636240c in btf_dump_resolve_name tools/lib/bpf/btf_dump.c:1552
#6 0xaaaab6362598 in btf_dump_type_name tools/lib/bpf/btf_dump.c:1567
#7 0xaaaab6360b48 in btf_dump_emit_struct_def tools/lib/bpf/btf_dump.c:912
#8 0xaaaab6360630 in btf_dump_emit_type tools/lib/bpf/btf_dump.c:798
#9 0xaaaab635f720 in btf_dump__dump_type tools/lib/bpf/btf_dump.c:282
#10 0xaaaab608523c in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:236
#11 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
#12 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
#13 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
#14 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
#15 0xaaaab5d65990 (test_progs+0x185990)
0xffff927006db is located 11 bytes inside of 16-byte region [0xffff927006d0,0xffff927006e0)
freed by thread T0 here:
#0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4)
#1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191
#2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163
#3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106
#4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157
#5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519
#6 0xaaaab6353e10 in btf__add_field tools/lib/bpf/btf.c:2032
#7 0xaaaab6084fcc in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:232
#8 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
#9 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
#10 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
#11 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
#12 0xaaaab5d65990 (test_progs+0x185990)
previously allocated by thread T0 here:
#0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4)
#1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191
#2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163
#3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106
#4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157
#5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519
#6 0xaaaab6353ff0 in btf_add_enum_common tools/lib/bpf/btf.c:2070
#7 0xaaaab6354080 in btf__add_enum tools/lib/bpf/btf.c:2102
#8 0xaaaab6082f50 in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:162
#9 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
#10 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
#11 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
#12 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
#13 0xaaaab5d65990 (test_progs+0x185990)
The reason is that the key stored in hash table name_map is a string
address, and the string memory is allocated by realloc() function, when
the memory is resized by realloc() later, the old memory may be freed,
so the address stored in name_map references to a freed memory, causing
use-after-free.
Fix it by storing duplicated string address in name_map.
Fixes: 919d2b1dbb07 ("libbpf: Allow modification of BTF and add btf__add_str API") Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/bpf/20221011120108.782373-2-xukuohai@huaweicloud.com Signed-off-by: Sasha Levin <sashal@kernel.org>
adv7533 bridge tries to dynamically switch lanes based on the
mode by detaching and attaching the mipi dsi device.
This approach is incorrect because this method of dynamic switch of
detaching and attaching the mipi dsi device also results in removing
and adding the component which is not necessary.
This approach is also prone to deadlocks. So for example, on the
db410c whenever this path is executed with lockdep enabled,
this results in a deadlock due to below ordering of locks.
Due to above reasons, remove the dynamic lane switching
code from adv7533 bridge chip and filter out the modes
which would need different number of lanes as compared
to the initialization time using the mode_valid callback.
This can be potentially re-introduced by using the pre_enable()
callback but this needs to be evaluated first whether such an
approach will work so this will be done with a separate change.
changes since RFC:
- Fix commit text and add TODO comment
The wifi + bluetooth combo chips (RTL8723AU and RTL8723BU) read the
chip vendor from the wrong register because the val32 variable gets
overwritten. Add one more variable to avoid this.
This had no real effect on RTL8723BU. It may have had an effect on
RTL8723AU.
It is possible that skb is freed in ath9k_htc_rx_msg(), then
usb_submit_urb() fails and we try to free skb again. It causes
use-after-free bug. Moreover, if alloc_skb() fails, urb->context becomes
NULL but rx_buf is not freed and there can be a memory leak.
The patch removes unnecessary nskb and makes skb processing more clear: it
is supposed that ath9k_htc_rx_msg() either frees old skb or passes its
managing to another callback function.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Syzkaller reports a long-known leak of urbs in
ath9k_hif_usb_dealloc_tx_urbs().
The cause of the leak is that usb_get_urb() is called but usb_free_urb()
(or usb_put_urb()) is not called inside usb_kill_urb() as urb->dev or
urb->ep fields have not been initialized and usb_kill_urb() returns
immediately.
The patch removes trying to kill urbs located in hif_dev->tx.tx_buf
because hif_dev->tx.tx_buf is not supposed to contain urbs which are in
pending state (the pending urbs are stored in hif_dev->tx.tx_pending).
The tx.tx_lock is acquired so there should not be any changes in the list.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 03fb92a432ea ("ath9k: hif_usb: fix race condition between usb_get_urb() and usb_kill_anchored_urbs()") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20220725151359.283704-1-pchelkin@ispras.ru Signed-off-by: Sasha Levin <sashal@kernel.org>
When kfifo_alloc fails, the refcount of chdev->dev is left incremental.
We should use put_device(&chdev->dev) to decrease the ref count of
chdev->dev to avoid refcount leak.
Link: https://lkml.kernel.org/r/20221203085721.13146-1-caixinchen1@huawei.com Fixes: e8de370188d0 ("rapidio: add mport char device driver") Signed-off-by: Cai Xinchen <caixinchen1@huawei.com> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Dan Carpenter <error27@gmail.com> Cc: Jakob Koschel <jakobkoschel@gmail.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Wang Weiyang <wangweiyang2@huawei.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
loop0: detected capacity change from 0 to 64
==================================================================
BUG: KASAN: slab-out-of-bounds in hfs_asc2mac+0x467/0x9a0
fs/hfs/trans.c:133
Write of size 1 at addr ffff88801848314e by task syz-executor391/3632
If in->len is much larger than HFS_NAMELEN(31) which is the maximum
length of an HFS filename, a OOB write could occur in hfs_asc2mac(). In
that case, when the dst reaches the boundary, the srclen is still
greater than 0, which causes a OOB write.
Fix this by adding a check on dstlen in while() before writing to dst
address.
If kfifo_alloc() fails in mport_cdev_open(), goto err_fifo and just free
priv. But priv is still in the chdev->file_list, then list traversal
may cause UAF. This fixes the following smatch warning:
drivers/rapidio/devices/rio_mport_cdev.c:1930 mport_cdev_open() warn: '&priv->list' not removed from list
Link: https://lkml.kernel.org/r/20221123095147.52408-1-wangweiyang2@huawei.com Fixes: e8de370188d0 ("rapidio: add mport char device driver") Signed-off-by: Wang Weiyang <wangweiyang2@huawei.com> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Dan Carpenter <error27@gmail.com> Cc: Jakob Koschel <jakobkoschel@gmail.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
sysv_nblocks() returns 'blocks' rather than 'res', which only counting
the number of triple-indirect blocks and causing sysv_getattr() gets a
wrong result.
[AV: this is actually a sysv counterpart of minixfs fix - 0fcd426de9d0 "[PATCH] minix block usage counting fix" in
historical tree; mea culpa, should've thought to check
fs/sysv back then...]
vfs_lock_file() expects the struct file_lock to be fully initialised by
the caller. Re-exported NFSv3 has been seen to Oops if the fl_file field
is NULL.
Fixes: aec158242b87 ("lockd: set fl_owner when unlocking files") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216582 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fixes: 2c0756d306c2 ("MIPS: OCTEON: warn if deprecated link status is being used") Signed-off-by: Ladislav Michl <ladis@linux-mips.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Check clk for NULL before calling clk_enable_unlocked where clk
is dereferenced. There is such check in other implementations
of clk_enable.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: e7300d04bd08 ("MIPS: BCM63xx: Add support for the Broadcom BCM63xx family of SOCs.") Signed-off-by: Anastasia Belova <abelova@astralinux.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
In some error paths before device_register(), the names allocated
by dev_set_name() are not freed. Move dev_set_name() front to
device_register(), so the name can be freed while calling
put_device().
Fixes: 54b34aa0a729 ("platform/x86: intel_scu_ipc: Split out SCU IPC functionality from the SCU driver") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221208151916.2404977-1-yangyingliang@huawei.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
`cros_typec_get_switch_handles` allocates four pointers when obtaining
type-c switch handles. These pointers are all freed if failing to obtain
any of them; therefore, pointers in `port` become stale. The stale
pointers eventually cause use-after-free or double free in later code
paths. Zeroing out all pointer fields after freeing to eliminate these
stale pointers.
Fixes: f28adb41dab4 ("platform/chrome: cros_ec_typec: Register Type C switches") Fixes: 1a8912caba02 ("platform/chrome: cros_ec_typec: Get retimer handle") Signed-off-by: Victor Ding <victording@chromium.org> Acked-by: Prashant Malani <pmalani@chromium.org> Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org> Link: https://lore.kernel.org/r/20221207093924.v2.1.I1864b6a7ee98824118b93677868d22d3750f439b@changeid Signed-off-by: Sasha Levin <sashal@kernel.org>
Some of the return paths for the cros_typec_get_switch_handles()
aren't necessary. Clean up the return paths to only undo the handle
get's which succeeded.
Calling __rpm_callback() from rpm_idle() after adding device links
support to the former is a clear mistake.
Not only it causes rpm_idle() to carry out unnecessary actions, but it
is also against the assumption regarding the stability of PM-runtime
status across __rpm_callback() invocations, because rpm_suspend() and
rpm_resume() may run in parallel with __rpm_callback() when it is called
by rpm_idle() and the device's PM-runtime status can be updated by any
of them.
As 'kdata.num' is user-controlled data, if user tries to allocate
memory larger than(>=) MAX_ORDER, then kcalloc() will fail, it
creates a stack trace and messes up dmesg with a warning.
In xen_init_lock_cpu(), the @name has allocated new string by kasprintf(),
if bind_ipi_to_irqhandler() fails, it should be freed, otherwise may lead
to a memory leak issue, fix it.
These local variables @{resched|pmu|callfunc...}_name saves the new
string allocated by kasprintf(), and when bind_{v}ipi_to_irqhandler()
fails, it goes to the @fail tag, and calls xen_smp_intr_free{_pv}() to
free resource, however the new string is not saved, which cause a memory
leak issue. fix it.
Intel ICC -hotpatch inserts 2-byte "0x66 0x90" NOP at the start of each
function to reserve extra space for hot-patching, and currently it is not
possible to probe these functions because branch_setup_xol_ops() wrongly
rejects NOP with REP prefix as it treats them like word-sized branch
instructions.
BUG: KASAN: use-after-free in acpi_ut_remove_reference+0x3b/0x82
Read of size 1 at addr ffff888112afc460 by task modprobe/2111
CPU: 0 PID: 2111 Comm: modprobe Not tainted 6.1.0-rc7-dirty
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
Call Trace:
<TASK>
kasan_report+0xae/0xe0
acpi_ut_remove_reference+0x3b/0x82
acpi_ut_copy_iobject_to_iobject+0x3be/0x3d5
acpi_ds_store_object_to_local+0x15d/0x3a0
acpi_ex_store+0x78d/0x7fd
acpi_ex_opcode_1A_1T_1R+0xbe4/0xf9b
acpi_ps_parse_aml+0x217/0x8d5
...
</TASK>
The root cause of the problem is that the acpi_operand_object
is freed when acpi_ut_walk_package_tree() fails in
acpi_ut_copy_ipackage_to_ipackage(), lead to repeated release in
acpi_ut_copy_iobject_to_iobject(). The problem was introduced
by "8aa5e56eeb61" commit, this commit is to fix memory leak in
acpi_ut_copy_iobject_to_iobject(), repeatedly adding remove
operation, lead to "acpi_operand_object" used after free.
Fix it by removing acpi_ut_remove_reference() in
acpi_ut_copy_ipackage_to_ipackage(). acpi_ut_copy_ipackage_to_ipackage()
is called to copy an internal package object into another internal
package object, when it fails, the memory of acpi_operand_object
should be freed by the caller.
Fixes: 8aa5e56eeb61 ("ACPICA: Utilities: Fix memory leak in acpi_ut_copy_iobject_to_iobject") Signed-off-by: Li Zetao <lizetao1@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If clk_get_rate() fails which is called after clk_prepare_enable(),
clk_disable_unprepare() need be called in error path to disable the
clock in dmtimer_systimer_init_clock().
Fixes: 52762fbd1c47 ("clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Tony Lindgren <tony@atomide.com> Link: https://lore.kernel.org/r/20221029114427.946520-1-yangyingliang@huawei.com Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The DYING/STARTING callbacks are not expected to fail. However, as reported
by Derek, buggy drivers such as tboot are still free to return errors
within those sections, which halts the hot(un)plug and leaves the CPU in an
unrecoverable state.
As there is no rollback possible, only log the failures and proceed with
the following steps.
This restores the hotplug behaviour prior to commit 453e41085183
("cpu/hotplug: Add cpuhp_invoke_callback_range()")
Writing the current state back in hotplug/target calls cpu_down()
which will set cpu dying even when it isn't and then nothing will
ever clear it. A stress test that reads values and writes them back
for all cpu device files in sysfs will trigger the BUG() in
select_fallback_rq once all cpus are marked as dying.
kernel/cpu.c::target_store()
...
if (st->state < target)
ret = cpu_up(dev->id, target);
else
ret = cpu_down(dev->id, target);
Commit ca16d5bee598 ("futex: Prevent robust futex exit race") addressed
two cases when tasks waiting on a robust non-PI futex remained blocked
despite the futex not being owned anymore:
* if the owner died after writing zero to the futex word, but before
waking up a waiter
* if a task waiting on the futex was woken up, but died before updating
the futex word (effectively swallowing the notification without acting
on it)
In the second case, the task could be woken up either by the previous
owner (after the futex word was reset to zero) or by the kernel (after
the OWNER_DIED bit was set and the TID part of the futex word was reset
to zero) if the previous owner died without the resetting the futex.
Because the referenced commit wakes up a potential waiter only if the
whole futex word is zero, the latter subcase remains unaddressed.
Fix this by looking only at the TID part of the futex when deciding
whether a wake up is needed.
With CONFIG_SLUB=y, following commit 6edf2576a6cc ("mm/slub: enable
debugging memory wasting of kmalloc") mt7621 failed to boot very early,
without showing any console messages.
This exposed the pre-existing bug of mt7621.c using kzalloc before normal
memory management was available.
Prior to this slub change, there existed the unintended protection against
"kmem_cache *s" being NULL as slab_pre_alloc_hook() happened to
return NULL and bailed out of slab_alloc_node().
This allowed mt7621 prom_soc_init to fail in the soc_dev_init kzalloc,
but continue booting without the SOC_BUS driver device registered.
Console output from a DEBUG_ZBOOT vmlinuz kernel loading,
with mm/slub modified to warn on kmem_cache zero or null:
zimage at: 80B842A0810B4BC0
Uncompressing Linux at load address 80001000
Copy device tree to address 80B80EE0
Now, booting the kernel...
Allowing soc_device_register to work exposed oops in the mt7621 phy pci,
and pci controller drivers from soc_device_match_attr, due to missing
sentinels in the quirks tables. These were fixed with:
commit 819b885cd886 ("phy: ralink: mt7621-pci: add sentinel to quirks
table")
not yet applied ("PCI: mt7621: add sentinel to quirks table")
Move the SoC register value queries and tests to specific functions,
to remove repetition of logic
No functional changes intended
Signed-off-by: John Thomson <git@johnthomson.fastmail.com.au> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Stable-dep-of: 7c18b64bba3b ("mips: ralink: mt7621: do not use kzalloc too early") Signed-off-by: Sasha Levin <sashal@kernel.org>
So that MT7621_SYSC_BASE can be used later in multiple functions without
needing to repeat this __iomem declaration each time
Signed-off-by: John Thomson <git@johnthomson.fastmail.com.au> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Stable-dep-of: 7c18b64bba3b ("mips: ralink: mt7621: do not use kzalloc too early") Signed-off-by: Sasha Levin <sashal@kernel.org>
Documentation for most CMTs say that it takes two input clocks before
changes propagate to the timer. This is especially relevant when the timer
is stopped to change further settings.
Implement the delays according to the spec. To avoid unnecessary delays in
atomic mode, also check if the to-be-written value actually differs.
CMCNT is a bit special because testing showed that it requires 3 cycles to
propagate, which affects all CMTs. Also, the WRFLAG needs to be checked
before writing. This fixes "cannot clear CMCNT" messages which occur often
on R-Car Gen4 SoCs, but only very rarely on older SoCs for some reason.
Fixes: 81b3b2711072 ("clocksource: sh_cmt: Add support for multiple channels per device") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20221130210609.7718-1-wsa+renesas@sang-engineering.com Signed-off-by: Sasha Levin <sashal@kernel.org>
If device_register() returns error, the name allocated by dev_set_name()
need be freed. It should use put_device() to give up the reference in the
error path, so that the name can be freed in kobject_cleanup(), and
list_del() is called to delete the port from rio_mports.
Link: https://lkml.kernel.org/r/20221114152636.2939035-3-yangyingliang@huawei.com Fixes: 2aaf308b95b2 ("rapidio: rework device hierarchy and introduce mport class of devices") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Matt Porter <mporter@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Patch series "rapidio: fix three possible memory leaks".
This patchset fixes three name leaks in error handling.
- patch #1 fixes two name leaks while rio_add_device() fails.
- patch #2 fixes a name leak while rio_register_mport() fails.
This patch (of 2):
If rio_add_device() returns error, the name allocated by dev_set_name()
need be freed. It should use put_device() to give up the reference in the
error path, so that the name can be freed in kobject_cleanup(), and the
'rdev' can be freed in rio_release_dev().
The simple attribute files do not accept a negative value since the commit 488dac0c9237 ("libfs: fix error cast of negative value in
simple_attr_write()"), so we have to use a 64-bit value to write a
negative value for a debugfs file created by debugfs_create_atomic_t().
This restores the previous behaviour by introducing
DEFINE_DEBUGFS_ATTRIBUTE_SIGNED for a signed value.
Link: https://lkml.kernel.org/r/20220919172418.45257-4-akinobu.mita@gmail.com Fixes: 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reported-by: Zhao Gongyi <zhaogongyi@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The simple attribute files do not accept a negative value since the commit 488dac0c9237 ("libfs: fix error cast of negative value in
simple_attr_write()").
This restores the previous behaviour by using newly introduced
DEFINE_SIMPLE_ATTRIBUTE_SIGNED instead of DEFINE_SIMPLE_ATTRIBUTE.
Link: https://lkml.kernel.org/r/20220919172418.45257-3-akinobu.mita@gmail.com Fixes: 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reported-by: Zhao Gongyi <zhaogongyi@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Patch series "fix error when writing negative value to simple attribute
files".
The simple attribute files do not accept a negative value since the commit 488dac0c9237 ("libfs: fix error cast of negative value in
simple_attr_write()"), but some attribute files want to accept a negative
value.
This patch (of 3):
The simple attribute files do not accept a negative value since the commit 488dac0c9237 ("libfs: fix error cast of negative value in
simple_attr_write()"), so we have to use a 64-bit value to write a
negative value.
This adds DEFINE_SIMPLE_ATTRIBUTE_SIGNED for a signed value.
Link: https://lkml.kernel.org/r/20220919172418.45257-1-akinobu.mita@gmail.com Link: https://lkml.kernel.org/r/20220919172418.45257-2-akinobu.mita@gmail.com Fixes: 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reported-by: Zhao Gongyi <zhaogongyi@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
pci_get_device() will increase the reference count for the returned
pci_dev. We need to use pci_dev_put() to decrease the reference count
after using pci_get_device(). Let's add it.
Fixes: 59a3b3a8db16 ("cpufreq: AMD: Ignore the check for ProcFeedback in ST/CZ") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
kernfs: can not remove 'chip_name', no directory
WARNING: CPU: 0 PID: 253 at fs/kernfs/dir.c:1616 kernfs_remove_by_name_ns+0xce/0xe0
RIP: 0010:kernfs_remove_by_name_ns+0xce/0xe0
Call Trace:
<TASK>
remove_files.isra.1+0x3f/0xb0
sysfs_remove_group+0x68/0xe0
sysfs_remove_groups+0x41/0x70
__kobject_del+0x45/0xc0
kobject_del+0x29/0x40
free_desc+0x42/0x70
irq_free_descs+0x5e/0x90
The reason is that the interrupt descriptor sysfs handling does not roll
back on a failing kobject_add() during allocation. If the descriptor is
freed later on, kobject_del() is invoked with a not added kobject resulting
in the above warnings.
A proper rollback in case of a kobject_add() failure would be the straight
forward solution. But this is not possible due to the way how interrupt
descriptor sysfs handling works.
Interrupt descriptors are allocated before sysfs becomes available. So the
sysfs files for the early allocated descriptors are added later in the boot
process. At this point there can be nothing useful done about a failing
kobject_add(). For consistency the interrupt descriptor allocation always
treats kobject_add() failures as non-critical and just emits a warning.
To solve this problem, keep track in the interrupt descriptor whether
kobject_add() was successful or not and make the invocation of
kobject_del() conditional on that.
[ tglx: Massage changelog, comments and use a state bit. ]
Fixes: ecb3f394c5db ("genirq: Expose interrupt information through sysfs") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20221128151612.1786122-1-yangyingliang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
BUG: sleeping function called from invalid context at fs/nfsd/filecache.c:440
...with a stack trace showing nfsd_file_put being called from
nfs4_show_open. This code has always tried to call fput while holding a
spinlock, but we recently changed this to use the filecache, and that
started triggering the might_sleep() in nfsd_file_put.
states_start takes and holds the cl_lock while iterating over the
client's states, and we can't sleep with that held.
Have the various nfs4_show_* functions instead hold the fi_lock instead
of taking a nfsd_file reference.
Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2138357 Reported-by: Zhi Li <yieli@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The xdr_stream conversion inadvertently left some code that set the
page_len of the send buffer. The XDR stream encoders should handle
this automatically now.
This oversight adds garbage past the end of the Reply message.
Clients typically ignore the garbage, but NFSD does not need to send
it, as it leaks stale memory contents onto the wire.
Fixes: f8cba47344f7 ("NFSD: Update the NFSv2 GETACL result encoder to use struct xdr_stream") Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Return boolean values ("true" or "false") instead of 1 or 0 from bool
functions. This fixes the following warnings from coccicheck:
./fs/nfsd/nfs2acl.c:289:9-10: WARNING: return of 0/1 in function
'nfsaclsvc_encode_accessres' with return type bool
./fs/nfsd/nfs2acl.c:252:9-10: WARNING: return of 0/1 in function
'nfsaclsvc_encode_getaclres' with return type bool
Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Stable-dep-of: ea5021e911d3 ("NFSD: Finish converting the NFSv2 GETACL result encoder") Signed-off-by: Sasha Levin <sashal@kernel.org>
As the comment of pci_get_domain_bus_and_slot() says, it returns
a PCI device with refcount incremented, so it doesn't need to
call an extra pci_dev_get() in pci_get_dev_wrapper(), and the PCI
device needs to be put in the error path.
Fixes: d4dc89d069aa ("EDAC, i10nm: Add a driver for Intel 10nm server processors") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20221128065512.3572550-1-yangyingliang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
gic_probe() calls pm_runtime_get_sync() and added fail path as
rpm_put to put usage_counter. However, pm_runtime_get_sync()
will increment usage_counter even it failed. Fix it by replacing it with
pm_runtime_resume_and_get() to keep usage counter balanced.
Fixes: 9c8edddfc992 ("irqchip/gic: Add platform driver for non-root GICs that require RPM") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124065150.22809-1-shangxiaojing@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
In some error paths before device_register(), the names allocated
by dev_set_name() are not freed. Move dev_set_name() front to
device_register(), so the name can be freed while calling
put_device().
Fixes: 1dd7128b839f ("thermal/core: Fix null pointer dereference in thermal_release()") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The reason is that the cros_usbpd_notify_init() does not check the return
value of platform_driver_register(), and the cros_usbpd_notify can
install successfully even if platform_driver_register() failed.
Fix by checking the return value of platform_driver_register() and
unregister cros_usbpd_notify_plat_driver when it failed.
pci_get_device() will increase the reference count for the returned
pci_dev, so tgl_uncore_get_mc_dev() will return a pci_dev with its
reference count increased. We need to call pci_dev_put() to decrease the
reference count before exiting from __uncore_imc_init_box(). Add
pci_dev_put() for both normal and error path.
Fixes: fdb64822443e ("perf/x86: Add Intel Tiger Lake uncore support") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20221118063137.121512-5-wangxiongfeng2@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
pci_get_device() will increase the reference count for the returned
pci_dev, so snr_uncore_get_mc_dev() will return a pci_dev with its
reference count increased. We need to call pci_dev_put() to decrease the
reference count. Let's add the missing pci_dev_put().
Fixes: ee49532b38dd ("perf/x86/intel/uncore: Add IMC uncore support for Snow Ridge") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20221118063137.121512-4-wangxiongfeng2@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
pci_get_device() will increase the reference count for the returned
'dev'. We need to call pci_dev_put() to decrease the reference count.
Since 'dev' is only used in pci_read_config_dword(), let's add
pci_dev_put() right after it.
Fixes: 9d480158ee86 ("perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20221118063137.121512-3-wangxiongfeng2@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
pci_get_device() will increase the reference count for the returned
pci_dev, and also decrease the reference count for the input parameter
*from* if it is not NULL.
If we break the loop in sad_cfg_iio_topology() with 'dev' not NULL. We
need to call pci_dev_put() to decrease the reference count. Since
pci_dev_put() can handle the NULL input parameter, we can just add one
pci_dev_put() right before 'return ret'.
Fixes: c1777be3646b ("perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on SNR") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20221118063137.121512-2-wangxiongfeng2@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
After commit 1fa5ae857bb1 ("driver core: get rid of struct device's
bus_id string array"), the name of device is allocated dynamically,
move dev_set_name() after pnp_add_id() to avoid memory leak.
Fixes: 1fa5ae857bb1 ("driver core: get rid of struct device's bus_id string array") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
dev_set_name() allocates memory for name, it need be freed
when module exiting, call put_device() to give up reference,
so that it can be freed in kobject_cleanup() when the refcount
hit to 0. The vpe_device is static, so remove kfree() from
vpe_device_release().
Fixes: 17a1d523aa58 ("MIPS: APRP: Add VPE loader support for CMP platforms.") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Afer commit 1fa5ae857bb1 ("driver core: get rid of struct device's
bus_id string array"), the name of device is allocated dynamically,
it need be freed when module exiting, call put_device() to give up
reference, so that it can be freed in kobject_cleanup() when the
refcount hit to 0. The vpe_device is static, so remove kfree() from
vpe_device_release().
Fixes: 1fa5ae857bb1 ("driver core: get rid of struct device's bus_id string array") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Link: https://lkml.kernel.org/r/41651ca1-432a-db34-eb97-d35744559de1@linux.alibaba.com Fixes: 3878f110f71a ("ocfs2: Move the hb_ctl_path sysctl into the stack glue.") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When `timerqueue_getnext()` is called on an empty timer queue, it will
use `rb_entry()` on a NULL pointer, which is invalid. Fix that by using
`rb_entry_safe()` which handles NULL pointers.
This has not caused any issues so far because the offset of the `rb_node`
member in `timerqueue_node` is 0, so `rb_entry()` is essentially a no-op.
Fixes: 511885d7061e ("lib/timerqueue: Rely on rbtree semantics for next timer") Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20221114195421.342929-1-pobrn@protonmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Previously, `huawei_wmi_input_setup()` returned the result of
logical or-ing the return values of two functions that return negative
errno-style error codes and one that returns `acpi_status`. If this
returned value was non-zero, then it was propagated from the platform
driver's probe function. That function should return a negative
errno-style error code, so the result of the logical or that
`huawei_wmi_input_setup()` returned was not appropriate.
Fix that by checking each function separately and returning the
error code unmodified.
Fixes: 1ac9abeb2e5b ("platform/x86: huawei-wmi: Move to platform driver") Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> Link: https://lore.kernel.org/r/20221005150032.173198-2-pobrn@protonmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
1. Var debug_objects_allocated tracks valid kmem_cache_alloc calls, so
track it in debug_objects_replace_static_objects. Do similar things in
object_cpu_offline.
2. In debug_objects_mem_init, there is no need to call function
cpuhp_setup_state_nocalls when debug_objects_enabled = 0 (out of
memory).
Link: https://lkml.kernel.org/r/20220611130634.99741-1-wuchi.zero@gmail.com Fixes: 634d61f45d6f ("debugobjects: Percpu pool lookahead freeing/allocation") Fixes: c4b73aabd098 ("debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done") Signed-off-by: wuchi <wuchi.zero@gmail.com> Reviewed-by: Waiman Long <longman@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In pmu_dev_alloc(), when dev_set_name() failed, it will goto free_dev
and call put_device(pmu->dev) to release it.
However pmu->dev->release is assigned after this, which makes warning
and memleak.
Call dev_set_name() after pmu->dev->release = pmu_dev_release to fix it.
Device '(null)' does not have a release() function...
WARNING: CPU: 2 PID: 441 at drivers/base/core.c:2332 device_release+0x1b9/0x240
...
Call Trace:
<TASK>
kobject_put+0x17f/0x460
put_device+0x20/0x30
pmu_dev_alloc+0x152/0x400
perf_pmu_register+0x96b/0xee0
...
kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
unreferenced object 0xffff888014759000 (size 2048):
comm "modprobe", pid 441, jiffies 4294931444 (age 38.332s)
backtrace:
[<0000000005aed3b4>] kmalloc_trace+0x27/0x110
[<000000006b38f9b8>] pmu_dev_alloc+0x50/0x400
[<00000000735f17be>] perf_pmu_register+0x96b/0xee0
[<00000000e38477f1>] 0xffffffffc0ad8603
[<000000004e162216>] do_one_initcall+0xd0/0x4e0
...
The check being unconditional may lead to unwanted denials reported by
LSMs when a process has the capability granted by DAC, but denied by an
LSM. In the case of SELinux such denials are a problem, since they can't
be effectively filtered out via the policy and when not silenced, they
produce noise that may hide a true problem or an attack.
Checking for the capability only if any trusted xattr is actually
present wouldn't really address the issue, since calling listxattr(2) on
such node on its own doesn't indicate an explicit attempt to see the
trusted xattrs. Additionally, it could potentially leak the presence of
trusted xattrs to an unprivileged user if they can check for the denials
(e.g. through dmesg).
Therefore, it's best (and simplest) to keep the check unconditional and
instead use ns_capable_noaudit() that will silence any associated LSM
denials.
Fixes: 38f38657444d ("xattr: extract simple_xattr code from tmpfs") Reported-by: Martin Pitt <mpitt@redhat.com> Suggested-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Reviewed-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
but the one in the kerneldoc comment of the function is different and
incorrect.
Fixes: ddeb64870810 ("PM / Hibernate: Add sysfs knob to control size of memory for drivers") Signed-off-by: xiongxin <xiongxin@kylinos.cn>
[ rjw: Subject and changelog rewrite ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 8795359e35bc ("x86/sgx: Silence softlockup detection when
releasing large enclaves") introduced a cond_resched() during enclave
release where the EREMOVE instruction is applied to every 4k enclave
page. Giving other tasks an opportunity to run while tearing down a
large enclave placates the soft lockup detector but Iqbal found
that the fix causes a 25% performance degradation of a workload
run using Gramine.
Gramine maintains a 1:1 mapping between processes and SGX enclaves.
That means if a workload in an enclave creates a subprocess then
Gramine creates a duplicate enclave for that subprocess to run in.
The consequence is that the release of the enclave used to run
the subprocess can impact the performance of the workload that is
run in the original enclave, especially in large enclaves when
SGX2 is not in use.
The workload run by Iqbal behaves as follows:
Create enclave (enclave "A")
/* Initialize workload in enclave "A" */
Create enclave (enclave "B")
/* Run subprocess in enclave "B" and send result to enclave "A" */
Release enclave (enclave "B")
/* Run workload in enclave "A" */
Release enclave (enclave "A")
The performance impact of releasing enclave "B" in the above scenario
is amplified when there is a lot of SGX memory and the enclave size
matches the SGX memory. When there is 128GB SGX memory and an enclave
size of 128GB, from the time enclave "B" starts the 128GB SGX memory
is oversubscribed with a combined demand for 256GB from the two
enclaves.
Before commit 8795359e35bc ("x86/sgx: Silence softlockup detection when
releasing large enclaves") enclave release was done in a tight loop
without giving other tasks a chance to run. Even though the system
experienced soft lockups the workload (run in enclave "A") obtained
good performance numbers because when the workload started running
there was no interference.
Commit 8795359e35bc ("x86/sgx: Silence softlockup detection when
releasing large enclaves") gave other tasks opportunity to run while an
enclave is released. The impact of this in this scenario is that while
enclave "B" is released and needing to access each page that belongs
to it in order to run the SGX EREMOVE instruction on it, enclave "A"
is attempting to run the workload needing to access the enclave
pages that belong to it. This causes a lot of swapping due to the
demand for the oversubscribed SGX memory. Longer latencies are
experienced by the workload in enclave "A" while enclave "B" is
released.
Improve the performance of enclave release while still avoiding the
soft lockup detector with two enhancements:
- Only call cond_resched() after XA_CHECK_SCHED iterations.
- Use the xarray advanced API to keep the xarray locked for
XA_CHECK_SCHED iterations instead of locking and unlocking
at every iteration.
This batching solution is copied from sgx_encl_may_map() that
also iterates through all enclave pages using this technique.
With this enhancement the workload experiences a 5%
performance degradation when compared to a kernel without
commit 8795359e35bc ("x86/sgx: Silence softlockup detection when
releasing large enclaves"), an improvement to the reported 25%
degradation, while still placating the soft lockup detector.
Scenarios with poor performance are still possible even with these
enhancements. For example, short workloads creating sub processes
while running in large enclaves. Further performance improvements
are pursued in user space through avoiding to create duplicate enclaves
for certain sub processes, and using SGX2 that will do lazy allocation
of pages as needed so enclaves created for sub processes start quickly
and release quickly.
We only want to take the slow path if SYSCALL_TRACE or SYSCALL_AUDIT is
set; on !AUDIT_SYSCALL configs the current tree hits it whenever _any_
thread flag (including NEED_RESCHED, NOTIFY_SIGNAL, etc.) happens to
be set.
it needs to be added to _TIF_WORK_MASK, or we might not reach
do_work_pending() in the first place...
Fixes: 5a9a8897c253a "alpha: add support for TIF_NOTIFY_SIGNAL" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
While we correctly skips to initialize an idle state from a disabled idle
state node in DT, the returned value from dt_init_idle_driver() don't get
adjusted accordingly. Instead the number of found idle state nodes are
returned, while the callers are expecting the number of successfully
initialized idle states from DT.
This leads to cpuidle drivers unnecessarily continues to initialize their
idle state specific data. Moreover, in the case when all idle states have
been disabled in DT, we would end up registering a cpuidle driver, rather
than relying on the default arch specific idle call.
Fixes: 9f14da345599 ("drivers: cpuidle: implement DT based idle states infrastructure") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>