cards_found is a static variable, but when it enters atl2_probe(),
cards_found is set to zero, the value is not consistent with last probe,
so next behavior is not our expect.
Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The code waits up to 20 usec for the firmware response to complete
once we've seen the valid response header in the buffer. It turns
out that in some scenarios, this wait time is not long enough.
Extend it to 150 usec and use usleep_range() instead of udelay().
Fixes: 9751e8e71487 ("bnxt_en: reduce timeout on initial HWRM calls") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The logic that polls for the firmware message response uses a shorter
sleep interval for the first few passes. But there was a typo so it
was using the wrong counter (larger counter) for these short sleep
passes. The result is a slightly shorter timeout period for these
firmware messages than intended. Fix it by using the proper counter.
Fixes: 9751e8e71487 ("bnxt_en: reduce timeout on initial HWRM calls") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
NFP BPF JIT compiler is doing a couple of small optimizations when jitting
ALU imm instructions, some of these optimizations could save code-gen, for
example:
A & -1 = A
A | 0 = A
A ^ 0 = A
However, for ALU32, high 32-bit of the 64-bit register should still be
cleared according to ISA semantics.
Fixes: cd7df56ed3e6 ("nfp: add BPF to NFP code translator") Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The rx_set_mode invokes number of messages to be send to PF for receive
mode configuration. In case if there any issues we need to stop sending
messages and release allocated memory.
This commit is to implement check of nicvf_msg_send_to_pf() result.
Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
At the end of NIC VF initialization VF sends CFG_DONE message to PF without
using nicvf_msg_send_to_pf routine. This potentially could re-write data in
mailbox. This commit is to implement common way of sending CFG_DONE message
by the same way with other configuration messages by using
nicvf_send_msg_to_pf() routine.
Signed-off-by: Vadim Lomovtsev <vlomovtsev@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
clang warns about overflowing the data[] member in the struct pnpipehdr:
net/phonet/pep.c:295:8: warning: array index 4 is past the end of the array (which contains 1 element) [-Warray-bounds]
if (hdr->data[4] == PEP_IND_READY)
^ ~
include/net/phonet/pep.h:66:3: note: array 'data' declared here
u8 data[1];
Using a flexible array member at the end of the struct avoids the
warning, but since we cannot have a flexible array member inside
of the union, each index now has to be moved back by one, which
makes it a little uglier.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Rémi Denis-Courmont <remi@remlab.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
There's a hardware bug which affects the HSDK platform, triggered by
micro-ops for auto-saving regfile on taken interrupt. The workaround is
to inhibit autosave.
ARCv2 optimized memcpy uses PREFETCHW instruction for prefetching the
next cache line but doesn't ensure that the line is not past the end of
the buffer. PRETECHW changes the line ownership and marks it dirty,
which can cause data corruption if this area is used for DMA IO.
Fix the issue by avoiding the PREFETCHW. This leads to performance
degradation but it is OK as we'll introduce new memcpy implementation
optimized for unaligned memory access using.
We also cut off all PREFETCH instructions at they are quite useless
here:
* we call PREFETCH right before LOAD instruction call.
* we copy 16 or 32 bytes of data (depending on CONFIG_ARC_HAS_LL64)
in a main logical loop. so we call PREFETCH 4 times (or 2 times)
for each L1 cache line (in case of 64B L1 cache Line which is
default case). Obviously this is not optimal.
The enabling L3/L4 filtering for transmit switched packets for all
devices caused unforeseen issue on older devices when trying to send UDP
traffic in an ordered sequence. This bit was originally intended for X550
devices, which supported this feature, so limit the scope of this bit to
only X550 devices.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
tmpfs has a peculiarity of accounting hard links as if they were
separate inodes: so that when the number of inodes is limited, as it is
by default, a user cannot soak up an unlimited amount of unreclaimable
dcache memory just by repeatedly linking a file.
But when v3.11 added O_TMPFILE, and the ability to use linkat() on the
fd, we missed accommodating this new case in tmpfs: "df -i" shows that
an extra "inode" remains accounted after the file is unlinked and the fd
closed and the actual inode evicted. If a user repeatedly links
tmpfiles into a tmpfs, the limit will be hit (ENOSPC) even after they
are deleted.
Just skip the extra reservation from shmem_link() in this case: there's
a sense in which this first link of a tmpfile is then cheaper than a
hard link of another file, but the accounting works out, and there's
still good limiting, so no need to do anything more complicated.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1902182134370.7035@eggly.anvils Fixes: f4e0c30c191 ("allow the temp files created by open() to be linked to") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Matej Kupljen <matej.kupljen@gmail.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Since for_each_cpu(cpu, mask) added by commit 2d3854a37e8b767a
("cpumask: introduce new API, without changing anything") did not
evaluate the mask argument if NR_CPUS == 1 due to CONFIG_SMP=n,
lru_add_drain_all() is hitting WARN_ON() at __flush_work() added by
commit 4d43d395fed12463 ("workqueue: Try to catch flush_work() without
INIT_WORK().") by unconditionally calling flush_work() [1].
Workaround this issue by using CONFIG_SMP=n specific lru_add_drain_all
implementation. There is no real need to defer the implementation to
the workqueue as the draining is going to happen on the local cpu. So
alias lru_add_drain_all to lru_add_drain which does all the necessary
work.
Booting 4.20 on SolidRun Clearfog issues this warning with DMA API
debug enabled:
WARNING: CPU: 0 PID: 555 at kernel/dma/debug.c:1230 check_sync+0x514/0x5bc
mvneta f1070000.ethernet: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000002dd7dc00] [size=240 bytes]
Modules linked in: ahci mv88e6xxx dsa_core xhci_plat_hcd xhci_hcd devlink armada_thermal marvell_cesa des_generic ehci_orion phy_armada38x_comphy mcp3021 spi_orion evbug sfp mdio_i2c ip_tables x_tables
CPU: 0 PID: 555 Comm: bridge-network- Not tainted 4.20.0+ #291
Hardware name: Marvell Armada 380/385 (Device Tree)
[<c0019638>] (unwind_backtrace) from [<c0014888>] (show_stack+0x10/0x14)
[<c0014888>] (show_stack) from [<c07f54e0>] (dump_stack+0x9c/0xd4)
[<c07f54e0>] (dump_stack) from [<c00312bc>] (__warn+0xf8/0x124)
[<c00312bc>] (__warn) from [<c00313b0>] (warn_slowpath_fmt+0x38/0x48)
[<c00313b0>] (warn_slowpath_fmt) from [<c00b0370>] (check_sync+0x514/0x5bc)
[<c00b0370>] (check_sync) from [<c00b04f8>] (debug_dma_sync_single_range_for_cpu+0x6c/0x74)
[<c00b04f8>] (debug_dma_sync_single_range_for_cpu) from [<c051bd14>] (mvneta_poll+0x298/0xf58)
[<c051bd14>] (mvneta_poll) from [<c0656194>] (net_rx_action+0x128/0x424)
[<c0656194>] (net_rx_action) from [<c000a230>] (__do_softirq+0xf0/0x540)
[<c000a230>] (__do_softirq) from [<c00386e0>] (irq_exit+0x124/0x144)
[<c00386e0>] (irq_exit) from [<c009b5e0>] (__handle_domain_irq+0x58/0xb0)
[<c009b5e0>] (__handle_domain_irq) from [<c03a63c4>] (gic_handle_irq+0x48/0x98)
[<c03a63c4>] (gic_handle_irq) from [<c0009a10>] (__irq_svc+0x70/0x98)
...
This appears to be caused by mvneta_rx_hwbm() calling
dma_sync_single_range_for_cpu() with the wrong struct device pointer,
as the buffer manager device pointer is used to map and unmap the
buffer. Fix this.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 482997699ef0 ("ARM: tegra: Fix unit_address_vs_reg DTC warnings
for /memory") inadventently broke device tree ABI by adding a unit-
address to the "/memory" node because the device tree compiler flagged
the missing unit-address as a warning.
Tegra124 Chromebooks (a.k.a. Nyan) use a bootloader that relies on the
full name of the memory node in device tree being exactly "/memory". It
can be argued whether this was a good decision or not, and some other
bootloaders (such as U-Boot) do accept a unit-address in the name of the
node, but the device tree is an ABI and we can't break existing setups
just because the device tree compiler considers it bad practice to omit
the unit-address nowadays.
This partially reverts the offending commit and restores device tree ABI
compatibility.
Updates to the GIC architecture allow ID_AA64PFR0_EL1.GIC to have
values other than 0 or 1. At the moment, Linux is quite strict in the
way it handles this field at early boot stage (cpufeature is fine) and
will refuse to use the system register CPU interface if it doesn't
find the value 1.
Fixes: 021f653791ad17e03f98aaa7fb933816ae16f161 ("irqchip: gic-v3: Initial support for GICv3") Reported-by: Chase Conklin <Chase.Conklin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 3b79919946cd2cf4dac47842afc9a893acec4ed7 ("ARM: dts:
armada-370-xp: update NAND node with new bindings") updated some
Marvell Armada DT description to use the new NAND controller bindings,
but did it incorrectly for a number of boards: armada-xp-gp,
armada-xp-db and armada-xp-lenovo-ix4-300d. Due to this, the NAND is
no longer detected on those platforms.
This commit fixes that by properly using the new NAND DT binding. This
commit was runtime-tested on Armada XP GP, the two other platforms are
only compile-tested.
Fixes: 3b79919946cd2 ("ARM: dts: armada-370-xp: update NAND node with new bindings") Cc: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The ll2 forwards all syn packets to the driver without validating the mac
address. Add validation check in the driver's iWARP listener flow and drop
the packet if it isn't intended for the device.
Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The assumption that the maximum size of a syn packet is 128 bytes
is wrong. Tunneling headers were not accounted for.
Allocate buffers large enough for mtu.
Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
After moving an XFRM interface to another namespace it stays associated
with the original namespace (net in `struct xfrm_if` and the list keyed
with `xfrmi_net_id`), allowing processes in the new namespace to use
SAs/policies that were created in the original namespace. For instance,
this allows a keying daemon in one namespace to establish IPsec SAs for
other namespaces without processes there having access to the keys or IKE
credentials.
This worked fine for outbound traffic, however, for inbound traffic the
lookup for the interfaces and the policies used the incorrect namespace
(the one the XFRM interface was moved to).
If mv643xx_eth_shared_of_probe() fails, mv643xx_eth_shared_probe()
leaves clk enabled.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The 1199:68C0 USB ID is reused by Sierra WP7607 which requires the DTR
quirk to be detected. Apply QMI_QUIRK_SET_DTR unconditionally as
already done for other IDs shared between different devices.
Signed-off-by: Beniamino Galvani <bgalvani@redhat.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix the mismatch between the "sdxc_d13_1_a" pin group definition from
meson8b_cbus_groups and the entry in sdxc_a_groups ("sdxc_d0_13_1_a").
This makes it possible to use "sdxc_d13_1_a" in device-tree files to
route the MMC data 1..3 pins to GPIOX_1..3.
Fixes: 0fefcb6876d0d6 ("pinctrl: Add support for Meson8b") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
We assume in the bcm_sf2 driver that the DSA master network device
supports ethtool_ops::{get,set}_wol operations, which is not a given.
Avoid de-referencing potentially non-existent function pointers and
check them as we should.
Fixes: 96e65d7f3f88 ("net: dsa: bcm_sf2: add support for Wake-on-LAN") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
SYSTEMPORT has its RXCHK parser block that attempts to validate the
packet structures, unfortunately setting the L2 header check bit will
cause Bridge PDUs (BPDUs) to be incorrectly rejected because they look
like LLC/SNAP packets with a non-IPv4 or non-IPv6 Ethernet Type.
Fixes: 4e8aedfe78c7 ("net: systemport: Turn on offloads by default") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When a target sends Check Condition, whilst initiator is busy xmiting
re-queued data, could lead to race between iscsi_complete_task() and
iscsi_xmit_task() and eventually crashing with the following kernel
backtrace.
Commit 6f8830f5bbab ("scsi: libiscsi: add lock around task lists to fix
list corruption regression") introduced "taskqueuelock" to fix list
corruption during the race, but this wasn't enough.
Re-setting of conn->task to NULL, could race with iscsi_xmit_task().
iscsi_complete_task()
{
....
if (conn->task == task)
conn->task = NULL;
}
conn->task in iscsi_xmit_task() could be NULL and so will be task.
__iscsi_get_task(task) will crash (NullPtr de-ref), trying to access
refcount.
This commit will take extra conn->session->back_lock in iscsi_xmit_task()
to ensure iscsi_xmit_task() waits for iscsi_complete_task(), if
iscsi_complete_task() wins the race. If iscsi_xmit_task() wins the race,
iscsi_xmit_task() increments task->refcount
(__iscsi_get_task) ensuring iscsi_complete_task() will not iscsi_free_task().
Signed-off-by: Anoob Soman <anoob.soman@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Acked-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the request_key() upcall mechanism there's a dependency loop by which if
a key type driver overrides the ->request_key hook and the userspace side
manages to lose the authorisation key, the auth key and the internal
construction record (struct key_construction) can keep each other pinned.
Fix this by the following changes:
(1) Killing off the construction record and using the auth key instead.
(2) Including the operation name in the auth key payload and making the
payload available outside of security/keys/.
(3) The ->request_key hook is given the authkey instead of the cons
record and operation name.
Changes (2) and (3) allow the auth key to naturally be cleaned up if the
keyring it is in is destroyed or cleared or the auth key is unlinked.
Fixes: 7ee02a316600 ("keys: Fix dependency loop between construction record and auth key") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix the creation of shortcuts for which the length of the index key value
is an exact multiple of the machine word size. The problem is that the
code that blanks off the unused bits of the shortcut value malfunctions if
the number of bits in the last word equals machine word size. This is due
to the "<<" operator being given a shift of zero in this case, and so the
mask that should be all zeros is all ones instead. This causes the
subsequent masking operation to clear everything rather than clearing
nothing.
Ordinarily, the presence of the hash at the beginning of the tree index key
makes the issue very hard to test for, but in this case, it was encountered
due to a development mistake that caused the hash output to be either 0
(keyring) or 1 (non-keyring) only. This made it susceptible to the
keyctl/unlink/valid test in the keyutils package.
The fix is simply to skip the blanking if the shift would be 0. For
example, an index key that is 64 bits long would produce a 0 shift and thus
a 'blank' of all 1s. This would then be inverted and AND'd onto the
index_key, incorrectly clearing the entire last word.
Fixes: 3cb989501c26 ("Add a generic associative array implementation.") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Installing the appropriate non-IOMMU DMA ops in arm_iommu_detch_device()
serves the case where IOMMU-aware drivers choose to control their own
mapping but still make DMA API calls, however it also affects the case
when the arch code itself tears down the mapping upon driver unbinding,
where the ops now get left in place and can inhibit arch_setup_dma_ops()
on subsequent re-probe attempts.
Fix the latter case by making sure that arch_teardown_dma_ops() cleans
up whenever the ops were automatically installed by its counterpart.
Reported-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Fixes: 1874619a7df4 "ARM: dma-mapping: Set proper DMA ops in arm_iommu_detach_device()" Tested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Tested-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
Attempting to avoid cloning the skb when broadcasting by inflating
the refcount with sock_hold/sock_put while under RCU lock is dangerous
and violates RCU principles. It leads to subtle race conditions when
attempting to free the SKB, as we may reference sockets that have
already been freed by the stack.
Unable to handle kernel paging request at virtual address 6b6b6b6b6b6c4b
[006b6b6b6b6b6c4b] address between user and kernel address ranges
Internal error: Oops: 96000004 [#1] PREEMPT SMP
task: fffffff78f65b380 task.stack: ffffff8049a88000
pc : sock_rfree+0x38/0x6c
lr : skb_release_head_state+0x6c/0xcc
Process repro (pid: 7117, stack limit = 0xffffff8049a88000)
Call trace:
sock_rfree+0x38/0x6c
skb_release_head_state+0x6c/0xcc
skb_release_all+0x1c/0x38
__kfree_skb+0x1c/0x30
kfree_skb+0xd0/0xf4
pfkey_broadcast+0x14c/0x18c
pfkey_sendmsg+0x1d8/0x408
sock_sendmsg+0x44/0x60
___sys_sendmsg+0x1d0/0x2a8
__sys_sendmsg+0x64/0xb4
SyS_sendmsg+0x34/0x4c
el0_svc_naked+0x34/0x38
Kernel panic - not syncing: Fatal exception
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
since rw_semaphore is released in a different task vs task that locked the sema.
It is expected behavior.
Fix the warning with up_read_non_owner() and rwsem_release() annotation.
Fixes: bae77c5eb5b2 ("bpf: enable stackmap with build_id in nmi context") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
bpf_skb_change_proto and bpf_skb_adjust_room change skb header length.
For GSO packets they adjust gso_size to maintain the same MTU.
The gso size can only be safely adjusted on bytestream protocols.
Commit d02f51cbcf12 ("bpf: fix bpf_skb_adjust_net/bpf_skb_proto_xlat
to deal with gso sctp skbs") excluded SKB_GSO_SCTP.
Since then type SKB_GSO_UDP_L4 has been added, whose contents are one
gso_size unit per datagram. Also exclude these.
Move from a blacklist to a whitelist check to future proof against
additional such new GSO types, e.g., for fraglist based GRO.
Fixes: bec1f6f69736 ("udp: generate gso with UDP_SEGMENT") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
This issue was tracked down to a missing copy of the new affinity
cpumask for the vexpress-spc interrupt into struct
irq_common_data.affinity when the interrupt is migrated in
migrate_one_irq().
Fix it by replacing the arm specific hotplug cpu migration with the
generic irq code.
This is the counterpart implementation to commit 217d453d473c ("arm64:
fix a migrating irq bug when hotplug cpu").
Tested with cpu hotplug stress test on Arm TC2 (multi_v7_defconfig plus
CONFIG_ARM_BIG_LITTLE_CPUFREQ=y and CONFIG_ARM_VEXPRESS_SPC_CPUFREQ=y).
The vexpress-spc interrupt (irq=22) on this board is affine to CPU0.
Its affinity cpumask now changes correctly e.g. from 0 to 1-4 when
CPU0 is hotplugged out.
Suggested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
On ESP output, sk_wmem_alloc is incremented for the added padding if a
socket is associated to the skb. When replying with TCP SYNACKs over
IPsec, the associated sk is a casted request socket, only. Increasing
sk_wmem_alloc on a request socket results in a write at an arbitrary
struct offset. In the best case, this produces the following WARNING:
WARNING: CPU: 1 PID: 0 at lib/refcount.c:102 esp_output_head+0x2e4/0x308 [esp4]
refcount_t: addition on 0; use-after-free.
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.0.0-rc3 #2
Hardware name: Marvell Armada 380/385 (Device Tree)
[...]
[<bf0ff354>] (esp_output_head [esp4]) from [<bf1006a4>] (esp_output+0xb8/0x180 [esp4])
[<bf1006a4>] (esp_output [esp4]) from [<c05dee64>] (xfrm_output_resume+0x558/0x664)
[<c05dee64>] (xfrm_output_resume) from [<c05d07b0>] (xfrm4_output+0x44/0xc4)
[<c05d07b0>] (xfrm4_output) from [<c05956bc>] (tcp_v4_send_synack+0xa8/0xe8)
[<c05956bc>] (tcp_v4_send_synack) from [<c0586ad8>] (tcp_conn_request+0x7f4/0x948)
[<c0586ad8>] (tcp_conn_request) from [<c058c404>] (tcp_rcv_state_process+0x2a0/0xe64)
[<c058c404>] (tcp_rcv_state_process) from [<c05958ac>] (tcp_v4_do_rcv+0xf0/0x1f4)
[<c05958ac>] (tcp_v4_do_rcv) from [<c0598a4c>] (tcp_v4_rcv+0xdb8/0xe20)
[<c0598a4c>] (tcp_v4_rcv) from [<c056eb74>] (ip_protocol_deliver_rcu+0x2c/0x2dc)
[<c056eb74>] (ip_protocol_deliver_rcu) from [<c056ee6c>] (ip_local_deliver_finish+0x48/0x54)
[<c056ee6c>] (ip_local_deliver_finish) from [<c056eecc>] (ip_local_deliver+0x54/0xec)
[<c056eecc>] (ip_local_deliver) from [<c056efac>] (ip_rcv+0x48/0xb8)
[<c056efac>] (ip_rcv) from [<c0519c2c>] (__netif_receive_skb_one_core+0x50/0x6c)
[...]
The issue triggers only when not using TCP syncookies, as for syncookies
no socket is associated.
On module unload/remove, we need to ensure that work does not run
after we have freed resources. Concretely, cancel_delayed_work()
may return while the callback function is still running.
From kernel/workqueue.c:
The work callback function may still be running on return,
unless it returns true and the work doesn't re-arm itself.
Explicitly flush or use cancel_delayed_work_sync() to wait on it.
Link: https://lore.kernel.org/lkml/20190204220952.30761-1-TheSven73@googlemail.com/ Reported-by: Sven Van Asbroeck <thesven73@gmail.com> Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Sven Van Asbroeck <TheSven73@gmail.com> Acked-by: Robin van der Gracht <robin@protonic.nl> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The driver's interrupt handler checks whether a message is currently
being handled with the curr_msg pointer. When it is NULL, the interrupt
is considered to be unexpected. Similarly, the i2c_start_transfer
routine checks for the remaining number of messages to handle in
num_msgs.
However, these values are never cleared and always keep the message and
number relevant to the latest transfer (which might be done already and
the underlying message memory might have been freed).
When an unexpected interrupt hits with the DONE bit set, the isr will
then try to access the flags field of the curr_msg structure, leading
to a fatal page fault.
The msg_buf and msg_buf_remaining fields are also never cleared at the
end of the transfer, which can lead to similar pitfalls.
Fix these issues by introducing a cleanup function and always calling
it after a transfer is finished.
Fixes: e2474541032d ("i2c: bcm2835: Fix hang for writing messages larger than 16 bytes") Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Acked-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
The of_find_device_by_node() takes a reference to the underlying device
structure, we should release that reference.
Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The basic idea behind ->pagecnt_bias is: If we pre-allocate the maximum
number of references that we might need to create in the fastpath later,
the bump-allocation fastpath only has to modify the non-atomic bias value
that tracks the number of extra references we hold instead of the atomic
refcount. The maximum number of allocations we can serve (under the
assumption that no allocation is made with size 0) is nc->size, so that's
the bias used.
However, even when all memory in the allocation has been given away, a
reference to the page is still held; and in the `offset < 0` slowpath, the
page may be reused if everyone else has dropped their references.
This means that the necessary number of references is actually
`nc->size+1`.
Luckily, from a quick grep, it looks like the only path that can call
page_frag_alloc(fragsz=1) is TAP with the IFF_NAPI_FRAGS flag, which
requires CAP_NET_ADMIN in the init namespace and is only intended to be
used for kernel testing and fuzzing.
To test for this issue, put a `WARN_ON(page_ref_count(page) == 0)` in the
`offset < 0` path, below the virt_to_page() call, and then repeatedly call
writev() on a TAP device with IFF_TAP|IFF_NO_PI|IFF_NAPI_FRAGS|IFF_NAPI,
with a vector consisting of 15 elements containing 1 byte each.
Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The value of ->num_ports comes from bcm_sf2_sw_probe() and it is less
than or equal to DSA_MAX_PORTS. The ds->ports[] array is used inside
the dsa_is_user_port() and dsa_is_cpu_port() functions. The ds->ports[]
array is allocated in dsa_switch_alloc() and it has ds->num_ports
elements so this leads to a static checker warning about a potential out
of bounds read.
Fixes: 8cfa94984c9c ("net: dsa: bcm_sf2: add suspend/resume callbacks") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In qla2x00_async_tm_cmd, we reference off sp after it has been freed. This
caused a panic on a system running a slub debug kernel. Since fcport is
passed in anyways, just use that instead.
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com> Acked-by: Giridhar Malavali <gmalavali@marvell.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The issue here is that page_to_nid() will not work since some page flags
have no node information until later in page_alloc_init_late() due to
DEFERRED_STRUCT_PAGE_INIT. Hence, it could trigger an out-of-bounds
access with an invalid nid.
UBSAN: Undefined behaviour in ./include/linux/mm.h:1104:50
index 7 is out of range for type 'zone [5]'
Also, kernel will panic since flags were poisoned earlier with,
This means that assumptions behind commit fe53ca54270a ("mm: use
early_pfn_to_nid in page_ext_init") are incomplete. Therefore, revert
the commit for now. A proper way to move the page_owner initialization
to sooner is to hook into memmap initialization.
Link: http://lkml.kernel.org/r/20190115202812.75820-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Michal Hocko <mhocko@kernel.org> Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Yang Shi <yang.shi@linaro.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
For dax pmd, pmd_trans_huge() returns false but pmd_huge() returns true
on x86. So the function works as long as hugetlb is configured.
However, dax doesn't depend on hugetlb.
Link: http://lkml.kernel.org/r/20190111034033.601-1-yuzhao@google.com Signed-off-by: Yu Zhao <yuzhao@google.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Keith Busch <keith.busch@intel.com> Cc: "Michael S . Tsirkin" <mst@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If nfs_page_async_flush() removes the page from the mapping, then we can't
use page_file_mapping() on it as nfs_updatepate() is wont to do when
receiving an error. Instead, push the mapping to the stack before the page
is possibly truncated.
Fixes: 8fc75bed96bb ("NFS: Fix up return value on fatal errors in nfs_page_async_flush()") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Make sure the device has at least 2 completion vectors
before allocating to compvec#1
Fixes: a4699f5647f3 (xprtrdma: Put Send CQ in IB_POLL_WORKQUEUE mode) Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
floppy_check_events() is supposed to return bit flags to say which
events occured. We should return zero to say that no event flags are
set. Only BIT(0) and BIT(1) are used in the caller. And .check_events
interface also expect to return an unsigned int value.
However, after commit a0c80efe5956, it may return -EINTR (-4u).
Here, both BIT(0) and BIT(1) are cleared. So this patch shouldn't
affect runtime, but it obviously is still worth fixing.
ipvs relies on nf_defrag_ipv6 module to manage IPv6 fragmentation,
but lacks proper Kconfig dependencies and does not explicitly
request defrag features.
As a result, if netfilter hooks are not loaded, when IPv6 fragmented
packet are handled by ipvs only the first fragment makes through.
Fix it properly declaring the dependency on Kconfig and registering
netfilter hooks on ip_vs_add_service() and ip_vs_new_dest().
Reported-by: Li Shuang <shuali@redhat.com> Signed-off-by: Andrea Claudi <aclaudi@redhat.com> Acked-by: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When requeue, if RQF_DONTPREP, rq has contained some driver
specific data, so insert it to hctx dispatch list to avoid any
merge. Take scsi as example, here is the trace event log (no
io scheduler, because RQF_STARTED would prevent merging),
(32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP.
Then it was merged with (32776 + 8) and issued. Due to RQF_DONTPREP,
the sdb only contained the part of (32768 + 8), then only that part
was completed. The lucky thing was that scsi_io_completion detected
it and requeued the remaining part. So we didn't get corrupted data.
However, the requeue of (32776 + 8) is not expected.
When mac80211 requests the low level driver to stop an ongoing
Tx aggregation, the low level driver is expected to call
ieee80211_stop_tx_ba_cb_irqsafe() to indicate that it is ready
to stop the session. The callback in turn schedules a worker
to complete the session tear down, which in turn also handles
the relevant state for the intermediate Tx queue.
However, as this flow in asynchronous, the intermediate queue
should be stopped and not continue servicing frames, as in
such a case frames that are dequeued would be marked as part
of an aggregation, although the aggregation is already been
stopped.
Fix this by stopping the intermediate Tx queue, before
calling the low level driver to stop the Tx aggregation.
Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If a driver does any significant activity in its ibss_join method,
then it will very well expect that to be called during restart,
before any stations are added. Do that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
We should be using flush_delayed_work() instead of flush_work() in
matrix_keypad_stop() to ensure that we are not missing work that is
scheduled but not yet put in the workqueue (i.e. its delay timer has not
expired yet).
Updating LED state requires access to regmap and therefore we may sleep,
so we could not do that directly form set_brightness() method.
Historically we used private work to adjust the brightness, but with the
introduction of set_brightness_blocking() we no longer need it.
As a bonus, not having our own work item means we do not have
use-after-free issue as we neglected to cancel outstanding work on
driver unbind.
Reported-by: Sven Van Asbroeck <thesven73@gmail.com> Reviewed-by: Sven Van Asbroeck <TheSven73@googlemail.com> Acked-by: Jacek Anaszewski <jacek.anaszewski@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If we have a kernel configured for periodic timer interrupts, and we
have cpuidle enabled, then we end up with CPU1 losing timer interupts
after a hotplug.
This can manifest itself in RCU stall warnings, or userspace becoming
unresponsive.
The problem is that the kernel initially wants to use the TWD timer
for interrupts, but the TWD loses context when we enter the C3 cpuidle
state. Nothing reprograms the TWD after idle.
We have solved this in the past by switching to broadcast timer ticks,
and cpuidle44xx switches to that mode at boot time. However, there is
nothing to switch from periodic mode local timers after a hotplug
operation.
We call tick_broadcast_enter() in omap_enter_idle_coupled(), which one
would expect would take care of the issue, but internally this only
deals with one-shot local timers - tick_broadcast_enable() on the other
hand only deals with periodic local timers. So, we need to call both.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
[tony@atomide.com: just standardized the subject line] Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch moves clk_get_rate() call from trigger() to hw_params()
callback to avoid calling sleeping clk API from atomic context
and prevent deadlock as indicated below.
Before this change clk_get_rate() was being called with same
spinlock held as the one passed to the clk API when registering
clocks exposed by the I2S driver.
[ 82.109780] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908
[ 82.117009] in_atomic(): 1, irqs_disabled(): 128, pid: 1554, name: speaker-test
[ 82.124235] 3 locks held by speaker-test/1554:
[ 82.128653] #0: cc8c5328 (snd_pcm_link_rwlock){...-}, at: snd_pcm_stream_lock_irq+0x20/0x38
[ 82.137058] #1: ec9eda17 (&(&substream->self_group.lock)->rlock){..-.}, at: snd_pcm_ioctl+0x900/0x1268
[ 82.146417] #2: 6ac279bf (&(&pri_dai->spinlock)->rlock){..-.}, at: i2s_trigger+0x64/0x6d4
[ 82.154650] irq event stamp: 8144
[ 82.157949] hardirqs last enabled at (8143): [<c0a0f574>] _raw_read_unlock_irq+0x24/0x5c
[ 82.166089] hardirqs last disabled at (8144): [<c0a0f6a8>] _raw_read_lock_irq+0x18/0x58
[ 82.174063] softirqs last enabled at (8004): [<c01024e4>] __do_softirq+0x3a4/0x66c
[ 82.181688] softirqs last disabled at (7997): [<c012d730>] irq_exit+0x140/0x168
[ 82.188964] Preemption disabled at:
[ 82.188967] [<00000000>] (null)
[ 82.195728] CPU: 6 PID: 1554 Comm: speaker-test Not tainted 5.0.0-rc5-00192-ga6e6caca8f03 #191
[ 82.204302] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[ 82.210376] [<c0111a54>] (unwind_backtrace) from [<c010d8f4>] (show_stack+0x10/0x14)
[ 82.218084] [<c010d8f4>] (show_stack) from [<c09ef004>] (dump_stack+0x90/0xc8)
[ 82.225278] [<c09ef004>] (dump_stack) from [<c0152980>] (___might_sleep+0x22c/0x2c8)
[ 82.232990] [<c0152980>] (___might_sleep) from [<c0a0a2e4>] (__mutex_lock+0x28/0xa3c)
[ 82.240788] [<c0a0a2e4>] (__mutex_lock) from [<c0a0ad80>] (mutex_lock_nested+0x1c/0x24)
[ 82.248763] [<c0a0ad80>] (mutex_lock_nested) from [<c04923dc>] (clk_prepare_lock+0x78/0xec)
[ 82.257079] [<c04923dc>] (clk_prepare_lock) from [<c049538c>] (clk_core_get_rate+0xc/0x5c)
[ 82.265309] [<c049538c>] (clk_core_get_rate) from [<c0766b18>] (i2s_trigger+0x490/0x6d4)
[ 82.273369] [<c0766b18>] (i2s_trigger) from [<c074fec4>] (soc_pcm_trigger+0x100/0x140)
[ 82.281254] [<c074fec4>] (soc_pcm_trigger) from [<c07378a0>] (snd_pcm_do_start+0x2c/0x30)
[ 82.289400] [<c07378a0>] (snd_pcm_do_start) from [<c07376cc>] (snd_pcm_action_single+0x38/0x78)
[ 82.298065] [<c07376cc>] (snd_pcm_action_single) from [<c073a450>] (snd_pcm_ioctl+0x910/0x1268)
[ 82.306734] [<c073a450>] (snd_pcm_ioctl) from [<c0292344>] (do_vfs_ioctl+0x90/0x9ec)
[ 82.314443] [<c0292344>] (do_vfs_ioctl) from [<c0292cd4>] (ksys_ioctl+0x34/0x60)
[ 82.321808] [<c0292cd4>] (ksys_ioctl) from [<c0101000>] (ret_fast_syscall+0x0/0x28)
[ 82.329431] Exception stack(0xeb875fa8 to 0xeb875ff0)
[ 82.334459] 5fa0: 00033c18b6e31000000000040000414200033d8000033d80
[ 82.342605] 5fc0: 00033c18b6e3100000008000000000360000800000000000beea38a800008000
[ 82.350748] 5fe0: b6e3142cbeea384cb6da9a30b6c9212c
[ 82.355789]
[ 82.357245] ======================================================
[ 82.363397] WARNING: possible circular locking dependency detected
[ 82.369551] 5.0.0-rc5-00192-ga6e6caca8f03 #191 Tainted: G W
[ 82.376395] ------------------------------------------------------
[ 82.382548] speaker-test/1554 is trying to acquire lock:
[ 82.387834] 6d2007f4 (prepare_lock){+.+.}, at: clk_prepare_lock+0x78/0xec
[ 82.394593]
[ 82.394593] but task is already holding lock:
[ 82.400398] 6ac279bf (&(&pri_dai->spinlock)->rlock){..-.}, at: i2s_trigger+0x64/0x6d4
[ 82.408197]
[ 82.408197] which lock already depends on the new lock.
[ 82.416343]
[ 82.416343] the existing dependency chain (in reverse order) is:
[ 82.423795]
[ 82.423795] -> #1 (&(&pri_dai->spinlock)->rlock){..-.}:
[ 82.430472] clk_mux_set_parent+0x34/0xb8
[ 82.434975] clk_core_set_parent_nolock+0x1c4/0x52c
[ 82.440347] clk_set_parent+0x38/0x6c
[ 82.444509] of_clk_set_defaults+0xc8/0x308
[ 82.449186] of_clk_add_provider+0x84/0xd0
[ 82.453779] samsung_i2s_probe+0x408/0x5f8
[ 82.458376] platform_drv_probe+0x48/0x98
[ 82.462879] really_probe+0x224/0x3f4
[ 82.467037] driver_probe_device+0x70/0x1c4
[ 82.471716] bus_for_each_drv+0x44/0x8c
[ 82.476049] __device_attach+0xa0/0x138
[ 82.480382] bus_probe_device+0x88/0x90
[ 82.484715] deferred_probe_work_func+0x6c/0xbc
[ 82.489741] process_one_work+0x200/0x740
[ 82.494246] worker_thread+0x2c/0x4c8
[ 82.498408] kthread+0x128/0x164
[ 82.502131] ret_from_fork+0x14/0x20
[ 82.506204] (null)
[ 82.508976]
[ 82.508976] -> #0 (prepare_lock){+.+.}:
[ 82.514264] __mutex_lock+0x60/0xa3c
[ 82.518336] mutex_lock_nested+0x1c/0x24
[ 82.522756] clk_prepare_lock+0x78/0xec
[ 82.527088] clk_core_get_rate+0xc/0x5c
[ 82.531421] i2s_trigger+0x490/0x6d4
[ 82.535494] soc_pcm_trigger+0x100/0x140
[ 82.539913] snd_pcm_do_start+0x2c/0x30
[ 82.544246] snd_pcm_action_single+0x38/0x78
[ 82.549012] snd_pcm_ioctl+0x910/0x1268
[ 82.553345] do_vfs_ioctl+0x90/0x9ec
[ 82.557417] ksys_ioctl+0x34/0x60
[ 82.561229] ret_fast_syscall+0x0/0x28
[ 82.565477] 0xbeea384c
[ 82.568421]
[ 82.568421] other info that might help us debug this:
[ 82.568421]
[ 82.576394] Possible unsafe locking scenario:
[ 82.576394]
[ 82.582285] CPU0 CPU1
[ 82.586792] ---- ----
[ 82.591297] lock(&(&pri_dai->spinlock)->rlock);
[ 82.595977] lock(prepare_lock);
[ 82.601782] lock(&(&pri_dai->spinlock)->rlock);
[ 82.608975] lock(prepare_lock);
[ 82.612268]
[ 82.612268] *** DEADLOCK ***
Fixes: 647d04f8e07a ("ASoC: samsung: i2s: Ensure the RCLK rate is properly determined") Reported-by: Krzysztof Kozłowski <krzk@kernel.org> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
On systems with VHE the kernel and KVM's world-switch code run at the
same exception level. Code that is only used on a VHE system does not
need to be annotated as __hyp_text as it can reside anywhere in the
kernel text.
__hyp_text was also used to prevent kprobes from patching breakpoint
instructions into this region, as this code runs at a different
exception level. While this is no longer true with VHE, KVM still
switches VBAR_EL1, meaning a kprobe's breakpoint executed in the
world-switch code will cause a hyp-panic.
echo "p:weasel sysreg_save_guest_state_vhe" > /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/weasel/enable
lkvm run -k /boot/Image --console serial -p "console=ttyS0 earlycon=uart,mmio,0x3f8"
# lkvm run -k /boot/Image -m 384 -c 3 --name guest-1474
Info: Placing fdt at 0x8fe00000 - 0x8fffffff
Info: virtio-mmio.devices=0x200@0x10000:36
We currently initialize the group of private IRQs during
kvm_vgic_vcpu_init, and the value of the group depends on the GIC model
we are emulating. However, CPUs created before creating (and
initializing) the VGIC might end up with the wrong group if the VGIC
is created as GICv3 later.
Since we have no enforced ordering of creating the VGIC and creating
VCPUs, we can end up with part the VCPUs being properly intialized and
the remaining incorrectly initialized. That also means that we have no
single place to do the per-cpu data structure initialization which
depends on knowing the emulated GIC model (which is only the group
field).
This patch removes the incorrect comment from kvm_vgic_vcpu_init and
initializes the group of all previously created VCPUs's private
interrupts in vgic_init in addition to the existing initialization in
kvm_vgic_vcpu_init.
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Failing to properly reset system registers is pretty bad. But not
quite as bad as bringing the whole machine down... So warn loudly,
but slightly more gracefully.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The current kvm_psci_vcpu_on implementation will directly try to
manipulate the state of the VCPU to reset it. However, since this is
not done on the thread that runs the VCPU, we can end up in a strangely
corrupted state when the source and target VCPUs are running at the same
time.
Fix this by factoring out all reset logic from the PSCI implementation
and forwarding the required information along with a request to the
target VCPU.
Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
We have two ways to reset a vcpu:
- either through VCPU_INIT
- or through a PSCI_ON call
The first one is easy to reason about. The second one is implemented
in a more bizarre way, as it is the vcpu that handles PSCI_ON that
resets the vcpu that is being powered-on. As we need to turn the logic
around and have the target vcpu to reset itself, we must take some
preliminary steps.
Resetting the VCPU state modifies the system register state in memory,
but this may interact with vcpu_load/vcpu_put if running with preemption
disabled, which in turn may lead to corrupted system register state.
Address this by disabling preemption and doing put/load if required
around the reset logic.
Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 4d230d1271064 ("ASoC: rsnd: fixup not to call clk_get/set
under non-atomic") added new rsnd_ssi_prepare() and moved
rsnd_ssi_master_clk_start() to .prepare.
But, ssi user count (= ssi->usrcnt) is incremented at .init
(= rsnd_ssi_init()).
Because of these timing exchange, ssi->usrcnt check at
rsnd_ssi_master_clk_start() should be adjusted.
Otherwise, 2nd master clock setup will be no check.
This patch fixup this issue.
Fixes: commit 4d230d1271064 ("ASoC: rsnd: fixup not to call clk_get/set under non-atomic") Reported-by: Yusuke Goda <yusuke.goda.sx@renesas.com> Reported-by: Valentine Barshak <valentine.barshak@cogentembedded.com> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Tested-by: Yusuke Goda <yusuke.goda.sx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
KASAN reports and additional traces point to out-of-bounds accesses to
the dapm_up_seq and dapm_down_seq lookup tables. The indices used are
larger than the array definition.
Fix by adding missing entries for the new widget types in these two
lookup tables, and align them with PGA values.
Also the sequences for the following widgets were not defined. Since
their values defaulted to zero, assign them explicitly
Fixes: 8a70b4544ef4 ('ASoC: dapm: Add new widget type for constructing DAPM graphs on DSPs.'). Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In function omap4_dsi_mux_pads(), local variable "reg" could
be uninitialized if function regmap_read() returns -EINVAL.
However, it will be used directly in the later context, which
is potentially unsafe.
Commit 84badc5ec5fc ("ARM: dts: omap4: Move l4 child devices to probe
them with ti-sysc") moved some omap4 timers to probe with ti-sysc
interconnect target module. Turns out this broke pwm-omap-dmtimer
for reparenting of the timer clock.
With ti-sysc, we can now configure the clock sources in the dts with
assigned-clocks and assigned-clock-parents.
Fixes: 84badc5ec5fc ("ARM: dts: omap4: Move l4 child devices to probe them with ti-sysc") Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: H. Nikolaus Schaller <hns@goldelico.com> Cc: Keerthy <j-keerthy@ti.com> Cc: Ladislav Michl <ladis@linux-mips.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sre@kernel.org> Cc: Tero Kristo <t-kristo@ti.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Reported-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch fixes order of disable calls in pwm_vibrator_stop.
Currently when starting device, we first enable vcc regulator and then
setup and enable pwm. When stopping, we should do this in oposite order,
so first disable pwm and then disable regulator.
Previously order was the same as in start.
pwm_vibrator_stop disables the regulator, but it can be called from
multiple places, even when the regulator is already disabled. Fix this
by using regulator_is_enabled check when starting and stopping device.
Signed-off-by: Jonathan Bakker <xc-racer2@live.ca> Signed-off-by: Paweł Chmiel <pawel.mikolaj.chmiel@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The patch 52898025cf7d: "[S390] dasd: security and PSF update patch
for EMC CKD ioctl" from Mar 8, 2010, leads to the following static
checker warning:
drivers/s390/block/dasd_eckd.c:4486 dasd_symm_io()
error: using offset into zero size array 'psf_data[]'
drivers/s390/block/dasd_eckd.c
4458 /* Copy parms from caller */
4459 rc = -EFAULT;
4460 if (copy_from_user(&usrparm, argp, sizeof(usrparm)))
^^^^^^^
The user can specify any "usrparm.psf_data_len". They choose zero by
mistake.
4461 goto out;
4462 if (is_compat_task()) {
4463 /* Make sure pointers are sane even on 31 bit. */
4464 rc = -EINVAL;
4465 if ((usrparm.psf_data >> 32) != 0)
4466 goto out;
4467 if ((usrparm.rssd_result >> 32) != 0)
4468 goto out;
4469 usrparm.psf_data &= 0x7fffffffULL;
4470 usrparm.rssd_result &= 0x7fffffffULL;
4471 }
4472 /* alloc I/O data area */
4473 psf_data = kzalloc(usrparm.psf_data_len, GFP_KERNEL
| GFP_DMA);
4474 rssd_result = kzalloc(usrparm.rssd_result_len, GFP_KERNEL
| GFP_DMA);
4475 if (!psf_data || !rssd_result) {
kzalloc() returns a ZERO_SIZE_PTR (0x16).
4476 rc = -ENOMEM;
4477 goto out_free;
4478 }
4479
4480 /* get syscall header from user space */
4481 rc = -EFAULT;
4482 if (copy_from_user(psf_data,
4483 (void __user *)(unsigned long)
usrparm.psf_data,
4484 usrparm.psf_data_len))
Ports are described by child 'port' nodes contained in the device node.
'ports' is optional and is used to group all 'port' nodes which is not
the case here.
This patch fixes the following warnings:
arch/arm64/boot/dts/rockchip/rk3399-gru-bob.dts:25.9-29.5: Warning (graph_port): /edp-panel/ports: graph port node name should be 'port'
arch/arm64/boot/dts/rockchip/rk3399-gru-kevin.dts:46.9-50.5: Warningi (graph_port): /edp-panel/ports: graph port node name should be 'port'
arch/arm64/boot/dts/rockchip/rk3399-sapphire-excavator.dts:94.9-98.5: Warning (graph_port): /edp-panel/ports: graph port node name should be 'port'
Commit 84badc5ec5fc ("ARM: dts: omap4: Move l4 child devices to probe
them with ti-sysc") moved some omap4 timers to probe with ti-sysc
interconnect target module. Turns out this broke pwm-omap-dmtimer
where we now try to reparent the clock to itself with the following:
omap_dm_timer_of_set_source: failed to set parent
With ti-sysc, we can now configure the clock sources in the dts
with assigned-clocks and assigned-clock-parents. So we should be able
to remove omap_dm_timer_of_set_source with clean-up patches later on.
But for now, let's just fix it first by checking if parent and fck
are the same and bail out of so.
Fixes: 84badc5ec5fc ("ARM: dts: omap4: Move l4 child devices to probe them with ti-sysc") Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: H. Nikolaus Schaller <hns@goldelico.com> Cc: Keerthy <j-keerthy@ti.com> Cc: Ladislav Michl <ladis@linux-mips.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sre@kernel.org> Cc: Tero Kristo <t-kristo@ti.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Reported-by: H. Nikolaus Schaller <hns@goldelico.com> Tested-By: Andreas Kemnade <andreas@kemnade.info> Tested-By: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
AD/DA ASRC function control two ASRC clock sources separately.
Whether AD/DA filter select which clock source, we enable AD/DA ASRC
function for all cases.
Signed-off-by: Shuming Fan <shumingf@realtek.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The device node iterators perform an of_node_get on each
iteration, so a jump out of the loop requires an of_node_put.
Move the initialization channel->child = child; down to just
before the call to imx_ldb_register so that intervening failures
don't need to clear it. Add a label at the end of the function to
do all the of_node_puts.
The semantic patch that finds part of this problem is as follows
(http://coccinelle.lip6.fr):
// <smpl>
@@
expression root,e;
local idexpression child;
iterator name for_each_child_of_node;
@@
for_each_child_of_node(root, child) {
... when != of_node_put(child)
when != e = child
(
return child;
|
* return ...;
)
...
}
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
The CSI0/CSI1 registers offset is at +0xe030000/+0xe038000 relative
to the control module registers on IPUv3EX.
This patch fixes wrong values for i.MX51 CSI0/CSI1.
Fixes: 2ffd48f2e7 ("gpu: ipu-v3: Add Camera Sensor Interface unit") Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch fixes backtraces like the following when sending SIGKILL to a
process with a currently pending plane update:
[drm:ipu_plane_atomic_check] CRTC should be enabled
[drm:drm_framebuffer_remove] *ERROR* failed to commit
------------[ cut here ]------------
WARNING: CPU: 3 PID: 63 at drivers/gpu/drm/drm_framebuffer.c:926 drm_framebuffer_remove+0x47c/0x498
atomic remove_fb failed with -22
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
For chain mode in cipher(eg. AES-CBC/DES-CBC), the iv is continuously
updated in the operation. The new iv value should be written to device
register by software.
Reported-by: Eric Biggers <ebiggers@google.com> Fixes: 433cd2c617bf ("crypto: rockchip - add crypto driver for rk3288") Cc: <stable@vger.kernel.org> # v4.5+ Signed-off-by: Zhang Zhijie <zhangzj@rock-chips.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hash algorithms with an alignmask set, e.g. "xcbc(aes-aesni)" and
"michael_mic", fail the improved hash tests because they sometimes
produce the wrong digest. The bug is that in the case where a
scatterlist element crosses pages, not all the data is actually hashed
because the scatterlist walk terminates too early. This happens because
the 'nbytes' variable in crypto_hash_walk_done() is assigned the number
of bytes remaining in the page, then later interpreted as the number of
bytes remaining in the scatterlist element. Fix it.
Fixes: 900a081f6912 ("crypto: ahash - Fix early termination in hash walk") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The memcpy() in crypto_cfb_decrypt_inplace() uses walk->iv as both the
source and destination, which has undefined behavior. It is unneeded
because walk->iv is already used to hold the previous ciphertext block;
thus, walk->iv is already updated to its final value. So, remove it.
Also, note that in-place decryption is the only case where the previous
ciphertext block is not directly available. Therefore, as a related
cleanup I also updated crypto_cfb_encrypt_segment() to directly use the
previous ciphertext block rather than save it into walk->iv. This makes
it consistent with in-place encryption and out-of-place decryption; now
only in-place decryption is different, because it has to be.
Fixes: a7d85e06ed80 ("crypto: cfb - add support for Cipher FeedBack mode") Cc: <stable@vger.kernel.org> # v4.17+ Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Like some other block cipher mode implementations, the CFB
implementation assumes that while walking through the scatterlist, a
partial block does not occur until the end. But the walk is incorrectly
being done with a blocksize of 1, as 'cra_blocksize' is set to 1 (since
CFB is a stream cipher) but no 'chunksize' is set. This bug causes
incorrect encryption/decryption for some scatterlist layouts.
Fix it by setting the 'chunksize'. Also extend the CFB test vectors to
cover this bug as well as cases where the message length is not a
multiple of the block size.
Fixes: a7d85e06ed80 ("crypto: cfb - add support for Cipher FeedBack mode") Cc: <stable@vger.kernel.org> # v4.17+ Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For decryption in CBC mode we need to save the last ciphertext block
for use as the next IV. However, we were trying to do this also with
zero sized ciphertext resulting in a panic.
Fix this by only doing the copy if the ciphertext length is at least
of IV size.
We were copying the last ciphertext block into the IV field
for CBC before removing the DMA mapping of the output buffer
with the result of the buffer sometime being out-of-sync cache
wise and were getting intermittent cases of bad output IV.
Fix it by moving the DMA buffer unmapping before the copy.
In cc_unmap_aead_request(), call dma_pool_free() for mlli buffer only
if an item is allocated from the pool and not always if there is a
pool allocated.
This fixes a kernel panic when trying to free a non-allocated item.
Cc: stable@vger.kernel.org Signed-off-by: Hadar Gat <hadar.gat@arm.com> Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Roland reports the following issue and provides a root cause analysis:
"On a v4.19 i.MX6 system with IMA and CONFIG_DMA_API_DEBUG enabled, a
warning is generated when accessing files on a filesystem for which IMA
measurement is enabled:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/dma/debug.c:1181 check_for_stack.part.9+0xd0/0x120
caam_jr 2101000.jr0: DMA-API: device driver maps memory from stack [addr=b668049e]
Modules linked in:
CPU: 0 PID: 1 Comm: switch_root Not tainted 4.19.0-20181214-1 #2
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
Backtrace:
[<c010efb8>] (dump_backtrace) from [<c010f2d0>] (show_stack+0x20/0x24)
[<c010f2b0>] (show_stack) from [<c08b04f4>] (dump_stack+0xa0/0xcc)
[<c08b0454>] (dump_stack) from [<c012b610>] (__warn+0xf0/0x108)
[<c012b520>] (__warn) from [<c012b680>] (warn_slowpath_fmt+0x58/0x74)
[<c012b62c>] (warn_slowpath_fmt) from [<c0199acc>] (check_for_stack.part.9+0xd0/0x120)
[<c01999fc>] (check_for_stack.part.9) from [<c019a040>] (debug_dma_map_page+0x144/0x174)
[<c0199efc>] (debug_dma_map_page) from [<c065f7f4>] (ahash_final_ctx+0x5b4/0xcf0)
[<c065f240>] (ahash_final_ctx) from [<c065b3c4>] (ahash_final+0x1c/0x20)
[<c065b3a8>] (ahash_final) from [<c03fe278>] (crypto_ahash_op+0x38/0x80)
[<c03fe240>] (crypto_ahash_op) from [<c03fe2e0>] (crypto_ahash_final+0x20/0x24)
[<c03fe2c0>] (crypto_ahash_final) from [<c03f19a8>] (ima_calc_file_hash+0x29c/0xa40)
[<c03f170c>] (ima_calc_file_hash) from [<c03f2b24>] (ima_collect_measurement+0x1dc/0x240)
[<c03f2948>] (ima_collect_measurement) from [<c03f0a60>] (process_measurement+0x4c4/0x6b8)
[<c03f059c>] (process_measurement) from [<c03f0cdc>] (ima_file_check+0x88/0xa4)
[<c03f0c54>] (ima_file_check) from [<c02d8adc>] (path_openat+0x5d8/0x1364)
[<c02d8504>] (path_openat) from [<c02dad24>] (do_filp_open+0x84/0xf0)
[<c02daca0>] (do_filp_open) from [<c02cf50c>] (do_open_execat+0x84/0x1b0)
[<c02cf488>] (do_open_execat) from [<c02d1058>] (__do_execve_file+0x43c/0x890)
[<c02d0c1c>] (__do_execve_file) from [<c02d1770>] (sys_execve+0x44/0x4c)
[<c02d172c>] (sys_execve) from [<c0101000>] (ret_fast_syscall+0x0/0x28)
---[ end trace 3455789a10e3aefd ]---
The cause is that the struct ahash_request *req is created as a
stack-local variable up in the stack (presumably somewhere in the IMA
implementation), then passed down into the CAAM driver, which tries to
dma_single_map the req->result (indirectly via map_seq_out_ptr_result)
in order to make that buffer available for the CAAM to store the result
of the following hash operation.
The calling code doesn't know how req will be used by the CAAM driver,
and there could be other such occurrences where stack memory is passed
down to the CAAM driver. Therefore we should rather fix this issue in
the CAAM driver where the requirements are known."
Fix this problem by:
-instructing the crypto engine to write the final hash in state->caam_ctx
-subsequently memcpy-ing the final hash into req->result
Cc: <stable@vger.kernel.org> # v4.19+ Reported-by: Roland Hieber <rhi@pengutronix.de> Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Tested-by: Roland Hieber <rhi@pengutronix.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a bug in the channel allocation logic that leads to an endless
loop when looking for a contiguous range of channels in a range with a
mixture of free and occupied channels. For example, opening three
consequtive channels, closing the first two and requesting 4 channels in
a row will trigger this soft lockup. The bug is that the search loop
forgets to skip over the range once it detects that one channel in that
range is occupied.
Restore the original intent to the logic by fixing the omission.
Signed-off-by: Zhi Jin <zhi.jin@intel.com> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Fixes: 7bd1d4093c2f ("stm class: Introduce an abstraction for System Trace Module devices") CC: stable@vger.kernel.org # v4.4+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>