This patch fixes a subtle PACKET_ORIGDEV regression which was a side
effect of fixes introduced by:
6a9e461f6fe4 bonding: pass link-local packets to bonding master also.
... to:
b89f04c61efe bonding: deliver link-local packets with skb->dev set to link that packets arrived on
While 6a9e461f6fe4 restored pre-b89f04c61efe presence of link-local
packets on bonding masters (which is required e.g. by linux bridges
participating in spanning tree or needed for lab-like setups created
with group_fwd_mask) it also caused the originating device
information to be lost due to cloning.
Maciej Żenczykowski proposed another solution that doesn't require
packet cloning and retains original device information - instead of
returning RX_HANDLER_PASS for all link-local packets it's now limited
only to packets from inactive slaves.
At the same time, packets passed to bonding masters retain correct
information about the originating device and PACKET_ORIGDEV can be used
to determine it.
This elegantly solves all issues so far:
- link-local packets that were removed from bonding masters
- LLDP daemons being forced to explicitly bind to slave interfaces
- PACKET_ORIGDEV having no effect on bond interfaces
Fixes: 6a9e461f6fe4 (bonding: pass link-local packets to bonding master also.) Reported-by: Vincent Bernat <vincent@bernat.ch> Signed-off-by: Michal Soltys <soltys@ziu.info> Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We need a RCU critical section around rt6_info->from deference, and
proper annotation.
Fixes: 4ed591c8ab44 ("net/ipv6: Allow onlink routes to have a device mismatch if it is the default route") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We must access rt6_info->from under RCU read lock: move the
dereference under such lock, with proper annotation.
v1 -> v2:
- avoid using multiple, racy, fetch operations for rt->from
Fixes: a68886a69180 ("net/ipv6: Make from in rt6_info rcu protected") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When running Docker with userns isolation e.g. --userns-remap="default"
and spawning up some containers with CAP_NET_ADMIN under this realm, I
noticed that link changes on ipvlan slave device inside that container
can affect all devices from this ipvlan group which are in other net
namespaces where the container should have no permission to make changes
to, such as the init netns, for example.
This effectively allows to undo ipvlan private mode and switch globally to
bridge mode where slaves can communicate directly without going through
hostns, or it allows to switch between global operation mode (l2/l3/l3s)
for everyone bound to the given ipvlan master device. libnetwork plugin
here is creating an ipvlan master and ipvlan slave in hostns and a slave
each that is moved into the container's netns upon creation event.
* In hostns:
# ip -d a
[...]
8: cilium_host@bond0: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
ipvlan mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
inet 10.41.0.1/32 scope link cilium_host
valid_lft forever preferred_lft forever
[...]
# docker exec -ti client ip -d a
[...]
10: cilium0@if4: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
ipvlan mode l3 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
valid_lft forever preferred_lft forever
# docker exec -ti client ip link change link cilium0 name cilium0 type ipvlan mode l2
# docker exec -ti client ip -d a
[...]
10: cilium0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
valid_lft forever preferred_lft forever
* In hostns (mode switched to l2):
# ip -d a
[...]
8: cilium_host@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
inet 10.41.0.1/32 scope link cilium_host
valid_lft forever preferred_lft forever
[...]
Same l3 -> l2 switch would also happen by creating another slave inside
the container's network namespace when specifying the existing cilium0
link to derive the actual (bond0) master:
# docker exec -ti client ip link add link cilium0 name cilium1 type ipvlan mode l2
# docker exec -ti client ip -d a
[...]
2: cilium1@if4: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
10: cilium0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
inet 10.41.197.43/32 brd 10.41.197.43 scope global cilium0
valid_lft forever preferred_lft forever
* In hostns:
# ip -d a
[...]
8: cilium_host@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 0c:c4:7a:e1:3d:cc brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
ipvlan mode l2 bridge numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
inet 10.41.0.1/32 scope link cilium_host
valid_lft forever preferred_lft forever
[...]
One way to mitigate it is to check CAP_NET_ADMIN permissions of
the ipvlan master device's ns, and only then allow to change
mode or flags for all devices bound to it. Above two cases are
then disallowed after the patch.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a port is added to a team, its initial state is derived
from netif_carrier_ok rather than netif_oper_up.
If it is carrier up but operationally down at the time of being
added, the port state.linkup will be set prematurely.
port state.linkup should be set consistently using
netif_oper_up rather than netif_carrier_ok.
Fixes: f1d22a1e0595 ("team: account for oper state") Signed-off-by: George Wilkie <gwilkie@vyatta.att-mail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a netdevice is unregistered, we flush the relevant exception
via rt6_sync_down_dev() -> fib6_ifdown() -> fib6_del() -> fib6_del_route().
Finally, we end-up calling rt6_remove_exception(), where we release
the relevant dst, while we keep the references to the related fib6_info and
dev. Such references should be released later when the dst will be
destroyed.
There are a number of caches that can keep the exception around for an
unlimited amount of time - namely dst_cache, possibly even socket cache.
As a result device registration may hang, as demonstrated by this script:
ip netns add cl
ip netns add rt
ip netns add srv
ip netns exec rt sysctl -w net.ipv6.conf.all.forwarding=1
ip link add name cl_veth type veth peer name cl_rt_veth
ip link set dev cl_veth netns cl
ip -n cl link set dev cl_veth up
ip -n cl addr add dev cl_veth 2001::2/64
ip -n cl route add default via 2001::1
ip -n cl link add tunv6 type ip6tnl mode ip6ip6 local 2001::2 remote 2002::1 hoplimit 64 dev cl_veth
ip -n cl link set tunv6 up
ip -n cl addr add 2013::2/64 dev tunv6
ip link set dev cl_rt_veth netns rt
ip -n rt link set dev cl_rt_veth up
ip -n rt addr add dev cl_rt_veth 2001::1/64
ip link add name rt_srv_veth type veth peer name srv_veth
ip link set dev srv_veth netns srv
ip -n srv link set dev srv_veth up
ip -n srv addr add dev srv_veth 2002::1/64
ip -n srv route add default via 2002::2
ip -n srv link add tunv6 type ip6tnl mode ip6ip6 local 2002::1 remote 2001::2 hoplimit 64 dev srv_veth
ip -n srv link set tunv6 up
ip -n srv addr add 2013::1/64 dev tunv6
ip link set dev rt_srv_veth netns rt
ip -n rt link set dev rt_srv_veth up
ip -n rt addr add dev rt_srv_veth 2002::2/64
ip netns exec srv netserver & sleep 0.1
ip netns exec cl ping6 -c 4 2013::1
ip netns exec cl netperf -H 2013::1 -t TCP_STREAM -l 3 & sleep 1
ip -n rt link set dev rt_srv_veth mtu 1400
wait %2
ip -n cl link del cl_veth
This commit addresses the issue purging all the references held by the
exception at time, as we currently do for e.g. ipv6 pcpu dst entries.
v1 -> v2:
- re-order the code to avoid accessing dst and net after dst_dev_put()
Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255 to
keep legacy software happy. This is similar to what was done for
ipv4 in commit 709772e6e065 ("net: Fix routing tables with
id > 255 for legacy software").
Signed-off-by: Kalash Nainwal <kalash@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
KASAN has found use-after-free in fixed_mdio_bus_init,
commit 0c692d07842a ("drivers/net/phy/mdio_bus.c: call
put_device on device_register() failure") call put_device()
while device_register() fails,give up the last reference
to the device and allow mdiobus_release to be executed
,kfreeing the bus. However in most drives, mdiobus_free
be called to free the bus while mdiobus_register fails.
use-after-free occurs when access bus again, this patch
revert it to let mdiobus_free free the bus.
KASAN report details as below:
BUG: KASAN: use-after-free in mdiobus_free+0x85/0x90 drivers/net/phy/mdio_bus.c:482
Read of size 4 at addr ffff8881dc824d78 by task syz-executor.0/3524
Memory state around the buggy address: ffff8881dc824c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8881dc824c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8881dc824d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff8881dc824d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8881dc824e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fixes: 0c692d07842a ("drivers/net/phy/mdio_bus.c: call put_device on device_register() failure") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzbot was able to trigger another soft lockup [1]
I first thought it was the O(N^2) issue I mentioned in my
prior fix (f657d22ee1f "net/x25: do not hold the cpu
too long in x25_new_lci()"), but I eventually found
that x25_bind() was not checking SOCK_ZAPPED state under
socket lock protection.
This means that multiple threads can end up calling
x25_insert_socket() for the same socket, and corrupt x25_list
Fixes: 90c27297a9bf ("X.25 remove bkl in bind") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: andrew hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Calculation of qp mtt size (in function mlx4_RST2INIT_wrapper)
ultimately depends on function roundup_pow_of_two.
If the amount of memory required by the QP is less than one page,
roundup_pow_of_two is called with argument zero. In this case, the
roundup_pow_of_two result is undefined.
Calling roundup_pow_of_two with a zero argument resulted in the
following stack trace:
UBSAN: Undefined behaviour in ./include/linux/log2.h:61:13
shift exponent 64 is too large for 64-bit type 'long unsigned int'
CPU: 4 PID: 26939 Comm: rping Tainted: G OE 4.19.0-rc1
Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015
Call Trace:
dump_stack+0x9a/0xeb
ubsan_epilogue+0x9/0x7c
__ubsan_handle_shift_out_of_bounds+0x254/0x29d
? __ubsan_handle_load_invalid_value+0x180/0x180
? debug_show_all_locks+0x310/0x310
? sched_clock+0x5/0x10
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x260
? find_held_lock+0x35/0x1e0
? mlx4_RST2INIT_QP_wrapper+0xfb1/0x1440 [mlx4_core]
mlx4_RST2INIT_QP_wrapper+0xfb1/0x1440 [mlx4_core]
Fix this by explicitly testing for zero, and returning one if the
argument is zero (assuming that the next higher power of 2 in this case
should be one).
Fixes: c82e9aa0a8bc ("mlx4_core: resource tracking for HCA resources used by guests") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In procedures mlx4_cmd_use_events() and mlx4_cmd_use_polling(), we need to
guarantee that there are no FW commands in progress on the comm channel
(for VFs) or wrapped FW commands (on the PF) when SRIOV is active.
We do this by also taking the slave_cmd_mutex when SRIOV is active.
This is especially important when switching from event to polling, since we
free the command-context array during the switch. If there are FW commands
in progress (e.g., waiting for a completion event), the completion event
handler will access freed memory.
Since the decision to use comm_wait or comm_poll is taken before grabbing
the event_sem/poll_sem in mlx4_comm_cmd_wait/poll, we must take the
slave_cmd_mutex as well (to guarantee that the decision to use events or
polling and the call to the appropriate cmd function are atomic).
Fixes: a7e1f04905e5 ("net/mlx4_core: Fix deadlock when switching between polling and event fw commands") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As part of unloading a device, the driver switches from
FW command event mode to FW command polling mode.
Part of switching over to polling mode is freeing the command context array
memory (unfortunately, currently, without NULLing the command context array
pointer).
The reset flow calls "complete" to complete all outstanding fw commands
(if we are in event mode). The check for event vs. polling mode here
is to test if the command context array pointer is NULL.
If the reset flow is activated after the switch to polling mode, it will
attempt (incorrectly) to complete all the commands in the context array --
because the pointer was not NULLed when the driver switched over to polling
mode.
As a result, we have a use-after-free situation, which results in a
kernel crash.
Same reasons than the ones explained in commit 4179cb5a4c92
("vxlan: test dev->flags & IFF_UP before calling netif_rx()")
netif_rx() or gro_cells_receive() must be called under a strict contract.
At device dismantle phase, core networking clears IFF_UP
and flush_all_backlogs() is called after rcu grace period
to make sure no incoming packet might be in a cpu backlog
and still referencing the device.
A similar protocol is used for gro_cells infrastructure, as
gro_cells_destroy() will be called only after a full rcu
grace period is observed after IFF_UP has been cleared.
Most drivers call netif_rx() from their interrupt handler,
and since the interrupts are disabled at device dismantle,
netif_rx() does not have to check dev->flags & IFF_UP
Virtual drivers do not have this guarantee, and must
therefore make the check themselves.
Otherwise we risk use-after-free and/or crashes.
Fixes: d342894c5d2f ("vxlan: virtual extensible lan") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we receive a packet while deleting a VXLAN device, there's a chance
vxlan_rcv() is called at the same time as vxlan_dellink(). This is fine,
except that vxlan_dellink() should never ever touch stuff that's still in
use, such as the GRO cells list.
Otherwise, vxlan_rcv() crashes while queueing packets via
gro_cells_receive().
Move the gro_cells_destroy() to vxlan_uninit(), which runs after the RCU
grace period is elapsed and nothing needs the gro_cells anymore.
This is now done in the same way as commit 8e816df87997 ("geneve: Use GRO
cells infrastructure.") originally implemented for GENEVE.
Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: 58ce31cca1ff ("vxlan: GRO support at tunnel layer") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 7716682cc58e ("tcp/dccp: fix another race at listener
dismantle") let inet_csk_reqsk_queue_add() fail, and adjusted
{tcp,dccp}_check_req() accordingly. However, TFO and syncookies
weren't modified, thus leaking allocated resources on error.
Contrary to tcp_check_req(), in both syncookies and TFO cases,
we need to drop the request socket. Also, since the child socket is
created with inet_csk_clone_lock(), we have to unlock it and drop an
extra reference (->sk_refcount is initially set to 2 and
inet_csk_reqsk_queue_add() drops only one ref).
For TFO, we also need to revert the work done by tcp_try_fastopen()
(with reqsk_fastopen_remove()).
Fixes: 7716682cc58e ("tcp/dccp: fix another race at listener dismantle") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit eeea10b83a13 ("tcp: add
tcp_v4_fill_cb()/tcp_v4_restore_cb()"), tcp_vX_fill_cb is only called
after tcp_filter(). That means, TCP_SKB_CB(skb)->end_seq still points to
the IP-part of the cb.
We thus should not mock with it, as this can trigger bugs (thanks
syzkaller):
[ 12.349396] ==================================================================
[ 12.350188] BUG: KASAN: slab-out-of-bounds in ip6_datagram_recv_specific_ctl+0x19b3/0x1a20
[ 12.351035] Read of size 1 at addr ffff88006adbc208 by task test_ip6_datagr/1799
Setting end_seq is actually no more necessary in tcp_filter as it gets
initialized later on in tcp_vX_fill_cb.
Cc: Eric Dumazet <edumazet@google.com> Fixes: eeea10b83a13 ("tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()") Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Returning 0 as inq to userspace indicates there is no more data to
read, and the application needs to wait for EPOLLIN. For a connection
that has received FIN from the remote peer, however, the application
must continue reading until getting EOF (return value of 0
from tcp_recvmsg) or an error, if edge-triggered epoll (EPOLLET) is
being used. Otherwise, the application will never receive a new
EPOLLIN, since there is no epoll edge after the FIN.
Return 1 when there is no data left on the queue but the
connection has received FIN, so that the applications continue
reading.
Fixes: b75eba76d3d72 (tcp: send in-queue bytes in cmsg upon read) Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzbot reported a NULL-ptr deref caused by that sched->init() in
sctp_stream_init() set stream->rr_next = NULL.
kasan: GPF could be caused by NULL-ptr deref or user memory access
RIP: 0010:sctp_sched_rr_dequeue+0xd3/0x170 net/sctp/stream_sched_rr.c:141
Call Trace:
sctp_outq_dequeue_data net/sctp/outqueue.c:90 [inline]
sctp_outq_flush_data net/sctp/outqueue.c:1079 [inline]
sctp_outq_flush+0xba2/0x2790 net/sctp/outqueue.c:1205
All sched info is saved in sout->ext now, in sctp_stream_init()
sctp_stream_alloc_out() will not change it, there's no need to
call sched->init() again, since sctp_outq_init() has already
done it.
Fixes: 5bbbbe32a431 ("sctp: introduce stream scheduler foundations") Reported-by: syzbot+4c9934f20522c0efd657@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
rxrpc_get_client_conn() adds a new call to the front of the waiting_calls
queue if the connection it's going to use already exists. This is bad as
it allows calls to get starved out.
Fix this by adding to the tail instead.
Also change the other enqueue point in the same function to put it on the
front (ie. when we have a new connection). This makes the point that in
the case of a new connection the new call goes at the front (though it
doesn't actually matter since the queue should be unoccupied).
Fixes: 45025bceef17 ("rxrpc: Improve management and caching of client connection objects") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The race occurs in __mkroute_output() when 2 threads lookup a dst:
CPU A CPU B
find_exception()
find_exception() [fnhe expires]
ip_del_fnhe() [fnhe is deleted]
rt_bind_exception()
In rt_bind_exception() it will bind a deleted fnhe with the new dst, and
this dst will get no chance to be freed. It causes a dev defcnt leak and
consecutive dmesg warnings:
unregister_netdevice: waiting for ethX to become free. Usage count = 1
Especially thanks Jon to identify the issue.
This patch fixes it by setting fnhe_daddr to 0 in ip_del_fnhe() to stop
binding the deleted fnhe with a new dst when checking fnhe's fnhe_daddr
and daddr in rt_bind_exception().
It works as both ip_del_fnhe() and rt_bind_exception() are protected by
fnhe_lock and the fhne is freed by kfree_rcu().
Fixes: deed49df7390 ("route: check and remove route cache when we get route") Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hardware has the CBS (Credit Based Shaper) which affects only Q3
and Q2. When updating the CBS settings, even if the driver does so
after waiting for Tx DMA finished, there is a possibility that frame
data still remains in TxFIFO.
To avoid this, decrease TxFIFO depth of Q3 and Q2 to one.
This patch has been exercised this using netperf TCP_MAERTS, TCP_STREAM
and UDP_STREAM tests run on an Ebisu board. No performance change was
detected, outside of noise in the tests, both in terms of throughput and
CPU utilisation.
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Masaru Nagai <masaru.nagai.vx@renesas.com> Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
[simon: updated changelog] Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sk_setup_caps() is called to set sk->sk_dst_cache in pptp_connect,
so we have to dst_release(sk->sk_dst_cache) in pptp_sock_destruct,
otherwise, the dst refcnt will leak.
unregister_netdevice: waiting for lo to become free. Usage count = 1
v1->v2:
- use rcu_dereference_protected() instead of rcu_dereference_check(),
as suggested by Eric.
Fixes: 00959ade36ac ("PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: 95d6ebd53c79 ("net/x25: fix use-after-free in x25_device_event()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: andrew hendry <andrew.hendry@gmail.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case of failure x25_connect() does a x25_neigh_put(x25->neighbour)
but forgets to clear x25->neighbour pointer, thus triggering use-after-free.
Since the socket is visible in x25_list, we need to hold x25_list_lock
to protect the operation.
syzbot report :
BUG: KASAN: use-after-free in x25_kill_by_device net/x25/af_x25.c:217 [inline]
BUG: KASAN: use-after-free in x25_device_event+0x296/0x2b0 net/x25/af_x25.c:252
Read of size 8 at addr ffff8880a030edd0 by task syz-executor003/7854
Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+04babcefcd396fabec37@syzkaller.appspotmail.com Cc: andrew hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: linmiaohe <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If hsr_add_port(hsr, hsr_dev, HSR_PT_MASTER) failed to
add port, it directly returns res and forgets to free the node
that allocated in hsr_create_self_node(), and forgets to delete
the node->mac_list linked in hsr->self_node_db.
Fixes: c5a759117210 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It has been observed that tx queue may stall while downloading
from certain web sites (example www.speedtest.net)
The cause has been tracked down to a corner case where
the tx interrupt vector was disabled automatically, but
was not re enabled later.
The lan743x has two mechanisms to enable/disable individual
interrupts. Interrupts can be enabled/disabled by individual
source, and they can also be enabled/disabled by individual
vector which has been mapped to the source. Both must be
enabled for interrupts to work properly.
The TX code path, primarily uses the interrupt enable/disable of
the TX source bit, while leaving the vector enabled all the time.
However, while investigating this issue it was noticed that
the driver requested the use of the vector auto clear feature.
The test above revealed a case where the vector enable was
cleared unintentionally.
This patch fixes the issue by deleting the lines that request
the vector auto clear feature to be used.
Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It has been noticed that running the speed test at
www.speedtest.net occasionally causes a kernel panic.
Investigation revealed that under this test RX buffer allocation
sometimes fails and returns NULL. But the lan743x driver did
not handle this case.
This patch fixes this issue by attempting to allocate a buffer
before sending the new rx packet to the OS. If the allocation
fails then the new rx packet is dropped and the existing buffer
is reused in the DMA ring.
Updates for v2:
Additional 2 locations where allocation was not checked,
has been changed to reuse existing buffer.
Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Local variable description: ----addr@___sys_recvmsg
Variable was created at:
___sys_recvmsg+0xf6/0x1310 net/socket.c:2244
do_recvmmsg+0x646/0x10c0 net/socket.c:2390
Bytes 0-31 of 32 are uninitialized
Memory access of size 32 starts at ffff8880ae62fbb0
Data copied to user address 0000000020000000
Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a non local multicast packet reaches ip_route_input_rcu() while
the ingress device IPv4 private data (in_dev) is NULL, we end up
doing a NULL pointer dereference in IN_DEV_MFORWARD().
Since the later call to ip_route_input_mc() is going to fail if
!in_dev, we can fail early in such scenario and avoid the dangerous
code path.
v1 -> v2:
- clarified the commit message, no code changes
Reported-by: Tianhao Zhao <tizhao@redhat.com> Fixes: e58e41596811 ("net: Enable support for VRF with ipv4 multicast") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
proc_exit_connector() uses ->real_parent lockless. This is not
safe that its parent can go away at any moment, so use RCU to
protect it, and ensure that this task is not released.
Fixes: b086ff87251b4a4 ("connector: add parent pid and tgid to coredump and exit events") Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Skylake (and later) will receive a microcode update to address a TSX
errata. This microcode will, on execution of a TSX instruction
(speculative or not) use (clobber) PMC3. This update will also provide
a new MSR to change this behaviour along with a CPUID bit to enumerate
the presence of this new MSR.
When the MSR gets set; the microcode will no longer use PMC3 but will
Force Abort every TSX transaction (upon executing COMMIT).
When TSX Force Abort (TFA) is allowed (default); the MSR gets set when
PMC3 gets scheduled and cleared when, after scheduling, PMC3 is
unused.
When TFA is not allowed; clear PMC3 from all constraints such that it
will not get used.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Skylake systems will receive a microcode update to address a TSX
errata. This microcode will (by default) clobber PMC3 when TSX
instructions are (speculatively or not) executed.
It also provides an MSR to cause all TSX transaction to abort and
preserve PMC3.
Add the CPUID enumeration and MSR definition.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When unbinding the (IOMMU-enabled) R-Car SATA device on Salvator-XS
(R-Car H3 ES2.0), in preparation of rebinding against vfio-platform for
device pass-through for virtualization:
While I've bisected this to commit e8e683ae9a736407 ("iommu/of: Fix
probe-deferral"), and reverting that commit on post-v5.0-rc4 kernels
does fix the problem, this turned out to be a red herring.
On arm64, arch_teardown_dma_ops() resets dev->dma_ops to NULL.
Hence if a driver has used a managed DMA allocation API, the allocated
DMA memory will be freed using the direct DMA ops, while it may have
been allocated using a custom DMA ops (iommu_dma_ops in this case).
Fix this by reversing the order of the calls to devres_release_all() and
arch_teardown_dma_ops().
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable <stable@vger.kernel.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
[rm: backport for 4.12-4.19 - kernels before 5.0 will not see
the crash above, but may get silent memory corruption instead] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ath9k_of_init() function[0] was initially written on the assumption that
if someone had an explicit ath9k OF node that "there must be something
wrong, why would someone add an OF node if everything is fine"[1]
(Quoting Martin Blumenstingl <martin.blumenstingl@googlemail.com>)
"it turns out it's not that simple. with your requirements I'm now aware
of two use-cases where the current code in ath9k_of_init() doesn't work
without modifications"[1]
The "your requirements" Martin speaks of is the result of the fact that I
have a device (PowerCloud Systems CR5000) has some kind of default - not
unique mac address - set and requires to set the correct MAC address via
mac-address devicetree property, however:
"some cards come with a physical EEPROM chip [or OTP] so "qca,no-eeprom"
should not be set (your use-case). in this case AH_USE_EEPROM should be
set (which is the default when there is no OF node)"[1]
The other use case is:
the firmware on some PowerMac G5 seems to add a OF node for the ath9k
card automatically. depending on the EEPROM on the card AH_NO_EEP_SWAP
should be unset (which is the default when there is no OF node). see [3]
After this patch to ath9k_of_init() the new behavior will be:
if there's no OF node then everything is the same as before
if there's an empty OF node then ath9k will use the hardware EEPROM
(before ath9k would fail to initialize because no EEPROM data was
provided by userspace)
if there's an OF node with only a MAC address then ath9k will use
the MAC address and the hardware EEPROM (see the case above)
with "qca,no-eeprom" EEPROM data from userspace will be requested.
the behavior here will not change
[1]
Martin provides additional background on EEPROM swapping[1].
Thanks to Christian Lamparter <chunkeey@gmail.com> for all his help on
troubleshooting this issue and the basis for this patch.
Fixes: 138b41253d9c ("ath9k: parse the device configuration from an OF node") Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Daniel F. Dickinson <cshored@thecshore.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Cc: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Change these free functions to allow passing NULL as the argument and
treat it as a no-op just like free(NULL) would.
Or, if rqst->rq_iov is NULL.
The second scenario could happen for smb2_queryfs() if the call
to SMB2_query_info_init() fails and we go to qfs_exit to clean up
and free all resources.
In that case we have not yet assigned rqst[2].rq_iov and thus
the rq_iov dereference in SMB2_close_free() will cause a NULL pointer
dereference.
[ bp: upstream patch also fixes SMB2_set_info_free which was introduced in 4.20 ]
Fixes: 1eb9fb52040f ("cifs: create SMB2_open_init()/SMB2_open_free() helpers") Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> CC: Stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As Al pointed out, "
... and while we are at it, what happens to
unsigned int nameoff = le16_to_cpu(de[mid].nameoff);
unsigned int matched = min(startprfx, endprfx);
/* string comparison without already matched prefix */
int ret = dirnamecmp(name, &dname, &matched);
if le16_to_cpu(de[...].nameoff) is not monotonically increasing? I.e.
what's to prevent e.g. (unsigned)-1 ending up in dname.len?
Corrupted fs image shouldn't oops the kernel.. "
Revisit the related lookup flow to address the issue.
In real scenario, there could be several threads accessing xattrs
of the same xattr-uninitialized inode, and init_inode_xattrs()
almost at the same time.
That's actually an unexpected behavior, this patch closes the race.
If it fails to read a shared xattr page, the inode's shared xattr array
is not freed. The next time the inode's xattr is accessed, the previously
allocated array is leaked.
Currently, this will hit a BUG_ON for these symlinks as follows:
- kernel message
------------[ cut here ]------------
kernel BUG at drivers/staging/erofs/xattr.c:59!
SMP PTI
CPU: 1 PID: 1170 Comm: getllxattr Not tainted 4.20.0-rc6+ #92
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
RIP: 0010:init_inode_xattrs+0x22b/0x270
Code: 48 0f 45 ea f0 ff 4d 34 74 0d 41 83 4c 24 e0 01 31 c0 e9 00 fe ff ff 48 89 ef e8 e0 31 9e ff eb e9 89 e8 e9 ef fd ff ff 0f 0$
<0f> 0b 48 89 ef e8 fb f6 9c ff 48 8b 45 08 a8 01 75 24 f0 ff 4d 34
RSP: 0018:ffffa03ac026bdf8 EFLAGS: 00010246
------------[ cut here ]------------
...
Call Trace:
erofs_listxattr+0x30/0x2c0
? selinux_inode_listxattr+0x5a/0x80
? kmem_cache_alloc+0x33/0x170
? security_inode_listxattr+0x27/0x40
listxattr+0xaf/0xc0
path_listxattr+0x5a/0xa0
do_syscall_64+0x43/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
...
---[ end trace 3c24b49408dc0c72 ]---
Fix it by checking ->xattr_isize in init_inode_xattrs(),
and it also fixes improper return value -ENOTSUPP
(it should be -ENODATA if xattr is enabled) for those inodes.
Mark Syms has reported seeing tasks that are stuck waiting in
find_insert_glock. It turns out that struct lm_lockname contains four padding
bytes on 64-bit architectures that function glock_waitqueue doesn't skip when
hashing the glock name. As a result, we can end up waking up the wrong
waitqueue, and the waiting tasks may be stuck forever.
Fix that by using ht_parms.key_len instead of sizeof(struct lm_lockname) for
the key length.
Reported-by: Mark Syms <mark.syms@citrix.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
checkentry(tee_tg_check) should initialize priv->oif from dev if possible.
But only netdevice notifier handler can set that.
Hence priv->oif is always -1 until notifier handler is called.
TEE netdevice notifier handler checks only interface name. however
each netns can have same interface name. hence other netns's interface
could be selected.
test commands:
%ip netns add vm1
%iptables -I INPUT -p icmp -j TEE --gateway 192.168.1.1 --oif enp2s0
%ip link set enp2s0 netns vm1
Above rule is in the root netns. but that rule could get enp2s0
ifindex of vm1 by notifier handler.
After this patch, TEE rule is added to the per-netns list.
The DRM driver stack is designed to work with cache coherent devices
only, but permits an optimization to be enabled in some cases, where
for some buffers, both the CPU and the GPU use uncached mappings,
removing the need for DMA snooping and allocation in the CPU caches.
The use of uncached GPU mappings relies on the correct implementation
of the PCIe NoSnoop TLP attribute by the platform, otherwise the GPU
will use cached mappings nonetheless. On x86 platforms, this does not
seem to matter, as uncached CPU mappings will snoop the caches in any
case. However, on ARM and arm64, enabling this optimization on a
platform where NoSnoop is ignored results in loss of coherency, which
breaks correct operation of the device. Since we have no way of
detecting whether NoSnoop works or not, just disable this
optimization entirely for ARM and arm64.
Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: David Zhou <David1.Zhou@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Junwei Zhang <Jerry.Zhang@amd.com> Cc: Michel Daenzer <michel.daenzer@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <maxime.ripard@bootlin.com> Cc: Sean Paul <sean@poorly.run> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: amd-gfx list <amd-gfx@lists.freedesktop.org> Cc: dri-devel <dri-devel@lists.freedesktop.org> Reported-by: Carsten Haitzler <Carsten.Haitzler@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.kernel.org/patch/10778815/ Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The maximum voltage value for buck8 regulator on Odroid XU3/XU4 boards is
set too low. Increase it to the 2000mV as specified on the board schematic.
So far the board worked fine, because of the bug in the PMIC driver, which
used incorrect step value for that regulator. It interpreted the voltage
value set by the bootloader as 1225mV and kept it unchanged. The regulator
driver has been however fixed recently in the commit 56b5d4ea778c
("regulator: s2mps11: Fix steps for buck7, buck8 and LDO35"), what results
in reading the proper buck8 value and forcing it to 1500mV on boot. This
is not enough for proper board operation and results in eMMC errors during
heavy IO traffic. Increasing maximum voltage value for buck8 restores
original driver behavior and fixes eMMC issues.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Fixes: 86a2d2ac5e5d ("ARM: dts: Add dts file for Odroid XU3 board") Fixes: 56b5d4ea778c ("regulator: s2mps11: Fix steps for buck7, buck8 and LDO35") Cc: <stable@vger.kernel.org> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 225da7e65a03 ("ARM: dts: add eMMC reset line for
exynos4412-odroid-common") added MMC power sequence for eMMC card of
Odroid X2/U3. It reused generic sd1_cd pin control configuration node
and only disabled pull-up. However that time the pinctrl configuration
was not applied during MMC power sequence driver initialization. This
has been changed later by commit d97a1e5d7cd2 ("mmc: pwrseq: convert to
proper platform device").
It turned out then, that the provided pinctrl configuration is not
correct, because the eMMC_RTSN line is being re-configured as 'special
function/card detect function for mmc1 controller' not the simple
'output', thus the power sequence driver doesn't really set the pin
value. This in effect broke the reboot of Odroid X2/U3 boards. Fix this
by providing separate node with eMMC_RTSN pin configuration.
Cc: <stable@vger.kernel.org> Reported-by: Markus Reichl <m.reichl@fivetechno.de> Suggested-by: Ulf Hansson <ulf.hansson@linaro.org> Fixes: 225da7e65a03 ("ARM: dts: add eMMC reset line for exynos4412-odroid-common") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Somewhere along recent changes to power control of the wl1835, power-on
became very unreliable on the hikey, failing like this:
wl1271_sdio: probe of mmc2:0001:1 failed with error -16
wl1271_sdio: probe of mmc2:0001:2 failed with error -16
After playing with some dt parameters and comparing to other users of
this chip, it turned out we need some power-on delay to make things
stable again. In contrast to those other users which define 200 ms, the
hikey would already be happy with 1 ms. Still, we use the safer 10 ms,
like on the Ultra96.
Somewhere along recent changes to power control of the wl1831, power-on
became very unreliable on the Ultra96, failing like this:
wl1271_sdio: probe of mmc2:0001:1 failed with error -16
wl1271_sdio: probe of mmc2:0001:2 failed with error -16
After playing with some dt parameters and comparing to other users of
this chip, it turned out we need some power-on delay to make things
stable again. In contrast to those other users which define 200 ms,
Ultra96 is already happy with 10 ms.
Fixes: 5869ba0653b9 ("arm64: zynqmp: Add support for Xilinx zcu100-revC") Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On Denverton's integration of the Intel(R) Trace Hub (for a reference and
overview see Documentation/trace/intel_th.rst) the reported size of one of
its resources (RTIT_BAR) doesn't match its actual size, which leads to
overlaps with other devices' resources.
In practice, it overlaps with XHCI MMIO space, which results in the xhci
driver bailing out after seeing its registers as 0xffffffff, and perceived
disappearance of all USB devices:
intel_th_pci 0000:00:1f.7: enabling device (0004 -> 0006)
xhci_hcd 0000:00:15.0: xHCI host controller not responding, assume dead
xhci_hcd 0000:00:15.0: xHC not responding in xhci_irq, assume controller is dead
xhci_hcd 0000:00:15.0: HC died; cleaning up
usb 1-1: USB disconnect, device number 2
For this reason, we need to resize the RTIT_BAR on Denverton to its actual
size, which in this case is 4MB. The corresponding erratum is DNV36 at the
link below:
DNV36. Processor Host Root Complex May Incorrectly Route Memory
Accesses to Intel® Trace Hub
Problem: The Intel® Trace Hub RTIT_BAR (B0:D31:F7 offset 20h) is
reported as a 2KB memory range. Due to this erratum, the
processor Host Root Complex will forward addresses from
RTIT_BAR to RTIT_BAR + 4MB -1 to Intel® Trace Hub.
Implication: Devices assigned within the RTIT_BAR to RTIT_BAR + 4MB -1
space may not function correctly.
Workaround: A BIOS code change has been identified and may be
implemented as a workaround for this erratum.
Status: No Fix.
Note that 5118ccd34780 ("intel_th: pci: Add Denverton SOC support") updates
the Trace Hub driver so it claims the Denverton device, but the resource
overlap exists regardless of whether that driver is loaded or that commit
is included.
Under heavy traffic load, when changing number of channels via
ethtool (ethtool -L) which will cause interface to be reloaded,
it was observed that some packets gets transmitted on old TX
channel/queue id which doesn't really exist after the channel
configuration leads to system crash.
Add a safeguard in the driver by validating queue id through
ndo_select_queue() which is called before the ndo_start_xmit().
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When slowpath messages are sent with high rate, the resulting
events can lead to a FW assert in case they are not handled fast
enough (Event Queue Full assert). Attempt to send queued slowpath
messages only after the newly evacuated entries in the EQ ring
are indicated to FW.
Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When something let __find_get_block_slow() hit all_mapped path, it calls
printk() for 100+ times per a second. But there is no need to print same
message with such high frequency; it is just asking for stall warning, or
at least bloating log files.
A surprise removal may fail to tear down request queues if it is racing
with the initial asynchronous probe. If that happens, the remove path
won't see the queue resources to tear down, and the controller reset
path may create a new request queue on a removed device, but will not
be able to make forward progress, deadlocking the pci removal.
Protect setting up non-blocking resources from a shutdown by holding the
same mutex, and transition to the CONNECTING state after these resources
are initialized so the probe path may see the dead controller state
before dispatching new IO.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202081 Reported-by: Alex Gagniuc <Alex_Gagniuc@Dellteam.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Tested-by: Alex Gagniuc <mr.nuke.me@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
If a controller supports the NS Change Notification, the namespace
scan_work is automatically triggered after attaching a new namespace.
Occasionally the namespace scan_work may append the new namespace to the
list before the admin command effects handling is completed. The effects
handling unfreezes namespaces, but if it unfreezes the newly attached
namespace, its request_queue freeze depth will be off and we'll hit the
warning in blk_mq_unfreeze_queue().
On the next namespace add, we will fail to freeze that queue due to the
previous bad accounting and deadlock waiting for frozen.
Fix that by preventing scan work from altering the namespace list while
command effects handling needs to pair freeze with unfreeze.
amdgpu_vm_get_task_info is called from interrupt handler and sched timeout
workqueue, we should use irq version spin_lock to avoid deadlock.
Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
We currently get the following error with pixcir_ts driver during a
suspend resume cycle:
omap_i2c 4802a000.i2c: controller timed out
pixcir_ts 1-005c: pixcir_int_enable: can't read reg 0x34 : -110
pixcir_ts 1-005c: Failed to disable interrupt generation: -110
pixcir_ts 1-005c: Failed to stop
dpm_run_callback(): pixcir_i2c_ts_resume+0x0/0x98
[pixcir_i2c_ts] returns -110
PM: Device 1-005c failed to resume: error -110
And at least am437x based devices with pixcir_ts will fail to resume
to a touchscreen that is configured as the wakeup-source in device
tree for these devices.
This is because pixcir_ts tries to reconfigure it's registers for
noirq suspend which fails. This also leaves i2c-omap in enabled state
for suspend.
Let's fix the pixcir_ts issue and make sure i2c-omap is suspended by
adding SET_NOIRQ_SYSTEM_SLEEP_PM_OPS.
Let's also get rid of some ifdefs while at it and replace them with
__maybe_unused as SET_RUNTIME_PM_OPS and SET_NOIRQ_SYSTEM_SLEEP_PM_OPS
already deal with the various PM Kconfig options.
Reported-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Acked-by: Vignesh R <vigneshr@ti.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
It added a warning whose intent was to check whether the rport was still
linked into the peer list. It doesn't work as intended and gives false
positive warnings for two reasons:
1) If the rport is never linked into the peer list it will not be
considered empty since the list_head is never initialized.
2) If the rport is deleted from the peer list using list_del_rcu(), then
the list_head is in an undefined state and it is not considered empty.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Patch (b6c7a324df37b "MIPS: Fix get_frame_info() handling of
microMIPS function size.") introduces additional function size
check for microMIPS by only checking insn between ip and ip + func_size.
However, func_size in get_frame_info() is always 0 if KALLSYMS is not
enabled. This causes get_frame_info() to return immediately without
calculating correct frame_size, which in turn causes "Can't analyze
schedule() prologue" warning messages at boot time.
This patch removes func_size check, and let the frame_size check run
up to 128 insns for both MIPS and microMIPS.
With a suitably defined "probe:vfs_getname" probe, 'perf trace' can
"beautify" its output, so syscalls like open() or openat() can print the
"filename" argument instead of just its hex address, like:
[root@quaco ~]# perf test vfs
65: Use vfs_getname probe to get syscall args filenames : Ok
66: Add vfs_getname probe to get syscall args filenames : Ok
67: Check open filename arg using perf trace + vfs_getname: Ok
[root@quaco ~]#
Reported-by: Michael Petlan <mpetlan@redhat.com> Tested-by: Michael Petlan <mpetlan@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-mv8kolk17xla1smvmp3qabv1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Those symbols have no use for report or annotation and should be
skipped. Moreover they interfere with the DWARF unwind test on the PPC
arch, where they are mixed with checked symbols and then the test fails:
# perf test dwarf -v
59: Test dwarf unwind :
--- start ---
test child forked, pid 8515
unwind: .annobin_dwarf_unwind.c:ip = 0x10dba40dc (0x2740dc)
...
got: .annobin_dwarf_unwind.c 0x10dba40dc, expecting test__arch_unwind_sample
unwind: failed with 'no error'
The annobin symbols are defined as NOTYPE/LOCAL/HIDDEN:
# readelf -s ./perf | grep annobin | head -1
40: 00000000001bce4f 0 NOTYPE LOCAL HIDDEN 13 .annobin_init.c
They can still pass the check for the label symbol. Adding check for
HIDDEN and INTERNAL (as suggested by Nick below) visibility and filter
out such symbols.
> Just to be awkward, if you are going to ignore STV_HIDDEN
> symbols then you should probably also ignore STV_INTERNAL ones
> as well... Annobin does not generate them, but you never know,
> one day some other tool might create some.
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Clifton <nickc@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190128133526.GD15461@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
A card's close_dev work is scheduled on a driver-wide workqueue. If the
card is removed and freed while the work is still active, this causes a
use-after-free.
So make sure that the work is completed before freeing the card.
Fixes: 0f54761d167f ("qeth: Support VEPA mode") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The error path in qeth_alloc_qdio_buffers() that takes care of
cleaning up the Output Queues is buggy. It first frees the queue, but
then calls qeth_clear_outq_buffers() with that very queue struct.
Make the call to qeth_clear_outq_buffers() part of the free action
(in the correct order), and while at it fix the naming of the helper.
Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Whenever we fail before/while starting an IO, make sure to release the
IO buffer. Usually qeth_irq() would do this for us, but if the IO
doesn't even start we obviously won't get an interrupt for it either.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
It is possible that two concurrent packets originating from the same
socket of a connection-less protocol (e.g. UDP) can end up having
different IP_CT_DIR_REPLY tuples which results in one of the packets
being dropped.
To illustrate this, consider the following simplified scenario:
1. Packet A and B are sent at the same time from two different threads
by same UDP socket. No matching conntrack entry exists yet.
Both packets cause allocation of a new conntrack entry.
2. get_unique_tuple gets called for A. No clashing entry found.
conntrack entry for A is added to main conntrack table.
3. get_unique_tuple is called for B and will find that the reply
tuple of B is already taken by A.
It will allocate a new UDP source port for B to resolve the clash.
4. conntrack entry for B cannot be added to main conntrack table
because its ORIGINAL direction is clashing with A and the REPLY
directions of A and B are not the same anymore due to UDP source
port reallocation done in step 3.
This patch modifies nf_conntrack_tuple_taken so it doesn't consider
colliding reply tuples if the IP_CT_DIR_ORIGINAL tuples are equal.
[ Florian: simplify patch to not use .allow_clash setting
and always ignore identical flows ]
Signed-off-by: Martynas Pumputis <martynas@weave.works> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the virtio transport device disappear, we should reset all
connected sockets in order to inform the users.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
virtio_vsock_remove() invokes the vsock_core_exit() also if there
are opened sockets for the AF_VSOCK protocol family. In this way
the vsock "transport" pointer is set to NULL, triggering the
kernel panic at the first socket activity.
This patch move the vsock_core_init()/vsock_core_exit() in the
virtio_vsock respectively in module_init and module_exit functions,
that cannot be invoked until there are open sockets.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1609699 Reported-by: Yan Fu <yafu@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
atchan->status variable is used to store two different information:
- pass channel interrupts status from interrupt handler to tasklet;
- channel information like whether it is cyclic or paused;
This causes a bug when device_terminate_all() is called,
(AT_XDMAC_CHAN_IS_CYCLIC cleared on atchan->status) and then a late End
of Block interrupt arrives (AT_XDMAC_CIS_BIS), which sets bit 0 of
atchan->status. Bit 0 is also used for AT_XDMAC_CHAN_IS_CYCLIC, so when
a new descriptor for a cyclic transfer is created, the driver reports
the channel as in use:
if (test_and_set_bit(AT_XDMAC_CHAN_IS_CYCLIC, &atchan->status)) {
dev_err(chan2dev(chan), "channel currently used\n");
return NULL;
}
This patch fixes the bug by adding a different struct member to keep
the interrupts status separated from the channel status bits.
When initializing clocks, a reference to the TCON channel 0 clock is
obtained. However, the clock is never prepared and enabled later.
Switching from simplefb to DRM actually disables the clock (that was
usually configured by U-Boot) because of that.
On the V3s, this results in a hang when writing to some mixer registers
when switching over to DRM from simplefb.
Fix this by preparing and enabling the clock when initializing other
clocks. Waiting for sun4i_tcon_channel_enable to enable the clock is
apparently too late and results in the same mixer register access hang.
The map_lookup_elem used to not acquiring spinlock
in order to optimize the reader.
It was true until commit 557c0c6e7df8 ("bpf: convert stackmap to pre-allocation")
The syscall's map_lookup_elem(stackmap) calls bpf_stackmap_copy().
bpf_stackmap_copy() may find the elem no longer needed after the copy is done.
If that is the case, pcpu_freelist_push() saves this elem for reuse later.
This push requires a spinlock.
If a tracing bpf_prog got run in the middle of the syscall's
map_lookup_elem(stackmap) and this tracing bpf_prog is calling
bpf_get_stackid(stackmap) which also requires the same pcpu_freelist's
spinlock, it may end up with a dead lock situation as reported by
Eric Dumazet in https://patchwork.ozlabs.org/patch/1030266/
The situation is the same as the syscall's map_update_elem() which
needs to acquire the pcpu_freelist's spinlock and could race
with tracing bpf_prog. Hence, this patch fixes it by protecting
bpf_stackmap_copy() with this_cpu_inc(bpf_prog_active)
to prevent tracing bpf_prog from running.
A later syscall's map_lookup_elem commit f1a2e44a3aec ("bpf: add queue and stack maps")
also acquires a spinlock and races with tracing bpf_prog similarly.
Hence, this patch is forward looking and protects the majority
of the map lookups. bpf_map_offload_lookup_elem() is the exception
since it is for network bpf_prog only (i.e. never called by tracing
bpf_prog).
Fixes: 557c0c6e7df8 ("bpf: convert stackmap to pre-allocation") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Since tracepoints_mutex will be taken in tracepoint_probe_register/unregister()
there is no need to take bpf_event_mutex too.
bpf_event_mutex is protecting modifications to prog array used in kprobe/perf bpf progs.
bpf_raw_tracepoints don't need to take this mutex.
Fixes: c4f6699dfcb8 ("bpf: introduce BPF_RAW_TRACEPOINT") Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
It has been explained that is a false positive here:
https://lkml.org/lkml/2018/7/25/756
Recap:
- stackmap uses pcpu_freelist
- The lock in pcpu_freelist is a percpu lock
- stackmap is only used by tracing bpf_prog
- A tracing bpf_prog cannot be run if another bpf_prog
has already been running (ensured by the percpu bpf_prog_active counter).
Eric pointed out that this lockdep splats stops other
legit lockdep splats in selftests/bpf/test_progs.c.
Fix this by calling local_irq_save/restore for stackmap.
Another false positive had also been worked around by calling
local_irq_save in commit 89ad2fa3f043 ("bpf: fix lockdep splat").
That commit added unnecessary irq_save/restore to fast path of
bpf hash map. irqs are already disabled at that point, since htab
is holding per bucket spin_lock with irqsave.
Let's reduce overhead for htab by introducing __pcpu_freelist_push/pop
function w/o irqsave and convert pcpu_freelist_push/pop to irqsave
to be used elsewhere (right now only in stackmap).
It stops lockdep false positive in stackmap with a bit of acceptable overhead.
Fixes: 557c0c6e7df8 ("bpf: convert stackmap to pre-allocation") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Previously, bpf_num_possible_cpus() had a bug when calculating a
number of possible CPUs in the case of sparse CPU allocations, as
it was considering only the first range or element of
/sys/devices/system/cpu/possible.
E.g. in the case of "0,2-3" (CPU 1 is not available), the function
returned 1 instead of 3.
This patch fixes the function by making it parse all CPU ranges and
elements.
Signed-off-by: Martynas Pumputis <m@lambda.lt> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
According to ARM IHI 0069C (ID070116), we should use GITS_TYPER's
bits [7:4] as ITT_entry_size instead of [8:4]. Although this is
pretty annoying, it only results in a potential over-allocation
of memory, and nothing bad happens.
Fixes: 3dfa576bfb45 ("irqchip/gic-v3-its: Add probing for VLPI properties") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
[maz: massaged subject and commit message] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In stmmac xmit callback we use a different flow for TSO packets but TSO
xmit callback is not disabling the EEE mode.
Fix this by disabling earlier the EEE mode, i.e. before calling the TSO
xmit callback.
Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The number of TSO enabled channels in HW can be different than the
number of total channels. There is no way to determined, at runtime, the
number of TSO capable channels and its safe to assume that if TSO is
enabled then at least channel 0 will be TSO capable.
Lets always send TSO packets from Queue 0.
Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
If we don't have DT then stmmac_clk will not be available. Let's add a
new Platform Data field so that we can specify the refclk by this mean.
This way we can still use the coalesce command in PCI based setups.
Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
amdgpu only uses shared-fences internally, but dmabuf importers rely on
implicit write hazard tracking via the reservation_object.fence_excl.
For example, the importer use the write hazard for timing a page flip to
only occur after the exporter has finished flushing its write into the
surface. As such, on exporting a dmabuf, we must either flush all
outstanding fences (for we do not know which are writes and should have
been exclusive) or alternatively create a new exclusive fence that is
the composite of all the existing shared fences, and so will only be
signaled when all earlier fences are signaled (ensuring that we can not
be signaled before the completion of any earlier write).
v2: reservation_object is already locked by amdgpu_bo_reserve()
v3: Replace looping with get_fences_rcu and special case the promotion
of a single shared fence directly to an exclusive fence, bypassing the
fence array.
v4: Drop the fence array ref after assigning to reservation_object
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107341
Testcase: igt/amd_prime/amd-to-i915
References: 8e94a46c1770 ("drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Reviewed-by: "Christian König" <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Resetting bit 4 disables the interrupt delivery to the "secure
processor" core. This breaks the keyboard on a OLPC XO 1.75 laptop,
where the firmware running on the "secure processor" bit-bangs the
PS/2 protocol over the GPIO lines.
It is not clear what the rest of the bits are and Marvell was unhelpful
when asked for documentation. Aside from the SP bit, there are probably
priority bits.
Leaving the unknown bits as the firmware set them up seems to be a wiser
course of action compared to just turning them off.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Acked-by: Pavel Machek <pavel@ucw.cz>
[maz: fixed-up subject and commit message] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the unlikely event that we cannot find any available LPI in the
system, we should gracefully return an error instead of carrying
on with no LPI allocated at all.
Fixes: 38dd7c494cf6 ("irqchip/gic-v3-its: Drop chunk allocation compatibility") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>