Michael Brown [Tue, 13 Apr 2021 15:25:12 +0000 (16:25 +0100)]
xen-netback: Check for hotplug-status existence before watching
The logic in connect() is currently written with the assumption that
xenbus_watch_pathfmt() will return an error for a node that does not
exist. This assumption is incorrect: xenstore does allow a watch to
be registered for a nonexistent node (and will send notifications
should the node be subsequently created).
As of commit 1f2565780 ("xen-netback: remove 'hotplug-status' once it
has served its purpose"), this leads to a failure when a domU
transitions into XenbusStateConnected more than once. On the first
domU transition into Connected state, the "hotplug-status" node will
be deleted by the hotplug_status_changed() callback in dom0. On the
second or subsequent domU transition into Connected state, the
hotplug_status_changed() callback will therefore never be invoked, and
so the backend will remain stuck in InitWait.
This failure prevents scenarios such as reloading the xen-netfront
module within a domU, or booting a domU via iPXE. There is
unfortunately no way for the domU to work around this dom0 bug.
Fix by explicitly checking for existence of the "hotplug-status" node,
thereby creating the behaviour that was previously assumed to exist.
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk> Reviewed-by: Paul Durrant <paul@xen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 13 Apr 2021 12:41:35 +0000 (05:41 -0700)]
gro: ensure frag0 meets IP header alignment
After commit 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head")
Guenter Roeck reported one failure in his tests using sh architecture.
After much debugging, we have been able to spot silent unaligned accesses
in inet_gro_receive()
The issue at hand is that upper networking stacks assume their header
is word-aligned. Low level drivers are supposed to reserve NET_IP_ALIGN
bytes before the Ethernet header to make that happen.
This patch hardens skb_gro_reset_offset() to not allow frag0 fast-path
if the fragment is not properly aligned.
Some arches like x86, arm64 and powerpc do not care and define NET_IP_ALIGN
as 0, this extra check will be a NOP for them.
Note that if frag0 is not used, GRO will call pskb_may_pull()
as many times as needed to pull network and transport headers.
Fixes: 0f6925b3e8da ("virtio_net: Do not pull payload in skb->head") Fixes: 78a478d0efd9 ("gro: Inline skb_gro_header and cache frag0 virtual address") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Or Cohen [Tue, 13 Apr 2021 18:10:31 +0000 (21:10 +0300)]
net/sctp: fix race condition in sctp_destroy_sock
If sctp_destroy_sock is called without sock_net(sk)->sctp.addr_wq_lock
held and sp->do_auto_asconf is true, then an element is removed
from the auto_asconf_splist without any proper locking.
This can happen in the following functions:
1. In sctp_accept, if sctp_sock_migrate fails.
2. In inet_create or inet6_create, if there is a bpf program
attached to BPF_CGROUP_INET_SOCK_CREATE which denies
creation of the sctp socket.
The bug is fixed by acquiring addr_wq_lock in sctp_destroy_sock
instead of sctp_close.
This addresses CVE-2021-23133.
Reported-by: Or Cohen <orcohen@paloaltonetworks.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Fixes: 610236587600 ("bpf: Add new cgroup attach type to enable sock modifications") Signed-off-by: Or Cohen <orcohen@paloaltonetworks.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Lijun Pan [Tue, 13 Apr 2021 08:33:25 +0000 (03:33 -0500)]
ibmvnic: correctly use dev_consume/free_skb_irq
It is more correct to use dev_kfree_skb_irq when packets are dropped,
and to use dev_consume_skb_irq when packets are consumed.
Fixes: 0d973388185d ("ibmvnic: Introduce xmit_more support using batched subCRQ hcalls") Suggested-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: Make tcp_allowed_congestion_control readonly in non-init netns
Currently, tcp_allowed_congestion_control is global and writable;
writing to it in any net namespace will leak into all other net
namespaces.
tcp_available_congestion_control and tcp_allowed_congestion_control are
the only sysctls in ipv4_net_table (the per-netns sysctl table) with a
NULL data pointer; their handlers (proc_tcp_available_congestion_control
and proc_allowed_congestion_control) have no other way of referencing a
struct net. Thus, they operate globally.
Because ipv4_net_table does not use designated initializers, there is no
easy way to fix up this one "bad" table entry. However, the data pointer
updating logic shouldn't be applied to NULL pointers anyway, so we
instead force these entries to be read-only.
These sysctls used to exist in ipv4_table (init-net only), but they were
moved to the per-net ipv4_net_table, presumably without realizing that
tcp_allowed_congestion_control was writable and thus introduced a leak.
Because the intent of that commit was only to know (i.e. read) "which
congestion algorithms are available or allowed", this read-only solution
should be sufficient.
The logic added in recent commit 31c4d2f160eb: ("net: Ensure net namespace isolation of sysctls")
does not and cannot check for NULL data pointers, because
other table entries (e.g. /proc/sys/net/netfilter/nf_log/) have
.data=NULL but use other methods (.extra2) to access the struct net.
Fixes: 9cb8e048e5d9 ("net/ipv4/sysctl: show tcp_{allowed, available}_congestion_control in non-initial netns") Signed-off-by: Jonathon Reinhart <jonathon.reinhart@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 13 Apr 2021 21:31:52 +0000 (14:31 -0700)]
Merge branch 'catch-all-devices'
Hristo Venev says:
====================
net: Fix two use-after-free bugs
The two patches fix two use-after-free bugs related to cleaning up
network namespaces, one in sit and one in ip6_tunnel. They are easy to
trigger if the user has the ability to create network namespaces.
The bugs can be used to trigger null pointer dereferences. I am not
sure if they can be exploited further, but I would guess that they
can. I am not sending them to the mailing list without confirmation
that doing so would be OK.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Similarly to the sit case, we need to remove the tunnels with no
addresses that have been moved to another network namespace.
Fixes: 0bd8762824e73 ("ip6tnl: add x-netns support") Signed-off-by: Hristo Venev <hristo@venev.name> Signed-off-by: David S. Miller <davem@davemloft.net>
A sit interface created without a local or a remote address is linked
into the `sit_net::tunnels_wc` list of its original namespace. When
deleting a network namespace, delete the devices that have been moved.
The following script triggers a null pointer dereference if devices
linked in a deleted `sit_net` remain:
for i in `seq 1 30`; do
ip netns add ns-test
ip netns exec ns-test ip link add dev veth0 type veth peer veth1
ip netns exec ns-test ip link add dev sit$i type sit dev veth0
ip netns exec ns-test ip link set dev sit$i netns $$
ip netns del ns-test
done
for i in `seq 1 30`; do
ip link del dev sit$i
done
Fixes: 5e6700b3bf98f ("sit: add support of x-netns") Signed-off-by: Hristo Venev <hristo@venev.name> Signed-off-by: David S. Miller <davem@davemloft.net>
ASoC: meson: axg-frddr: fix fifo depth on g12 and sm1
Previous fifo depth patch was only tested on axg, not g12 or sm1.
Of course, while adding hw_params dai callback for the axg, I forgot to do
the same for g12 and sm1, leaving the depth unset and breaking playback on
these SoCs.
Add hw_params callback to the g12 dai_ops to fix the problem.
Fixes: 6f68accaa864 ("ASoC: meson: axg-frddr: set fifo depth according to the period") Reported-by: Christian Hewitt <christianshewitt@gmail.com> Tested-by: Christian Hewitt <christianshewitt@gmail.com> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Link: https://lore.kernel.org/r/20210412132256.89920-1-jbrunet@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
arm64: kprobes: Restore local irqflag if kprobes is cancelled
If instruction being single stepped caused a page fault, the kprobes
is cancelled to let the page fault handler continue as a normal page
fault. But the local irqflags are disabled so cpu will restore pstate
with DAIF masked. After pagefault is serviced, the kprobes is
triggerred again, we overwrite the saved_irqflag by calling
kprobes_save_local_irqflag(). NOTE, DAIF is masked in this new saved
irqflag. After kprobes is serviced, the cpu pstate is retored with
DAIF masked.
This patch is inspired by one patch for riscv from Liao Chang.
netfilter: nftables: clone set element expression template
memcpy() breaks when using connlimit in set elements. Use
nft_expr_clone() to initialize the connlimit expression list, otherwise
connlimit garbage collector crashes when walking on the list head copy.
Reported-by: Laura Garcia Liebana <nevola@gmail.com> Fixes: 409444522976 ("netfilter: nf_tables: add elements with stateful expressions") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
net: phy: marvell: fix detection of PHY on Topaz switches
Since commit fee2d546414d ("net: phy: marvell: mv88e6390 temperature
sensor reading"), Linux reports the temperature of Topaz hwmon as
constant -75°C.
This is because switches from the Topaz family (88E6141 / 88E6341) have
the address of the temperature sensor register different from Peridot.
This address is instead compatible with 88E1510 PHYs, as was used for
Topaz before the above mentioned commit.
Create a new mapping table between switch family and PHY ID for families
which don't have a model number. And define PHY IDs for Topaz and Peridot
families.
Create a new PHY ID and a new PHY driver for Topaz's internal PHY.
The only difference from Peridot's PHY driver is the HWMON probing
method.
Prior this change Topaz's internal PHY is detected by kernel as:
Signed-off-by: Pali Rohár <pali@kernel.org> BugLink: https://github.com/globalscaletechnologies/linux/issues/1 Fixes: fee2d546414d ("net: phy: marvell: mv88e6390 temperature sensor reading") Reviewed-by: Marek Behún <kabel@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
arm64: mte: Ensure TIF_MTE_ASYNC_FAULT is set atomically
The entry from EL0 code checks the TFSRE0_EL1 register for any
asynchronous tag check faults in user space and sets the
TIF_MTE_ASYNC_FAULT flag. This is not done atomically, potentially
racing with another CPU calling set_tsk_thread_flag().
Replace the non-atomic ORR+STR with an STSET instruction. While STSET
requires ARMv8.1 and an assembler that understands LSE atomics, the MTE
feature is part of ARMv8.5 and already requires an updated assembler.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Fixes: 637ec831ea4f ("arm64: mte: Handle synchronous and asynchronous tag check faults") Cc: <stable@vger.kernel.org> # 5.10.x Reported-by: Will Deacon <will@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210409173710.18582-1-catalin.marinas@arm.com Signed-off-by: Will Deacon <will@kernel.org>
Currently psw_idle does not allocate a stack frame and does not
save its r14 and r15 into the save area. Even though this is valid from
call ABI point of view, because psw_idle does not make any calls
explicitly, in reality psw_idle is an entry point for controlled
transition into serving interrupts. So, in practice, psw_idle stack
frame is analyzed during stack unwinding. Depending on build options
that r14 slot in the save area of psw_idle might either contain a value
saved by previous sibling call or complete garbage.
s390/entry: avoid setting up backchain in ext|io handlers
Currently when interrupt arrives to cpu while in kernel context
INT_HANDLER macro (used for ext_int_handler and io_int_handler)
allocates new stack frame and pt_regs on the kernel stack and
sets up the backchain to jump over the pt_regs to the frame which has
been interrupted. This is not ideal to two reasons:
1. This hides the fact that kernel stack contains interrupt frame in it
and hence breaks arch_stack_walk_reliable(), which needs to know that to
guarantee "reliability" and checks that there are no pt_regs on the way.
2. It breaks the backchain unwinder logic, which assumes that the next
stack frame after an interrupt frame is reliable, while it is not.
In some cases (when r14 contains garbage) this leads to early unwinding
termination with an error, instead of marking frame as unreliable
and continuing.
To address that, only set backchain to 0.
Fixes: 56e62a737028 ("s390: convert to generic entry") Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Phillip Potter [Sun, 11 Apr 2021 11:28:24 +0000 (12:28 +0100)]
net: geneve: check skb is large enough for IPv4/IPv6 header
Check within geneve_xmit_skb/geneve6_xmit_skb that sk_buff structure
is large enough to include IPv4 or IPv6 header, and reject if not. The
geneve_xmit_skb portion and overall idea was contributed by Eric Dumazet.
Fixes a KMSAN-found uninit-value bug reported by syzbot at:
https://syzkaller.appspot.com/bug?id=abe95dc3e3e9667fc23b8d81f29ecad95c6f106f
Suggested-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+2e406a9ac75bb71d4b7a@syzkaller.appspotmail.com Signed-off-by: Phillip Potter <phil@philpotter.co.uk> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: davicom: Fix regulator not turned off on failed probe
When the probe fails, we must disable the regulator that was previously
enabled.
This patch is a follow-up to commit ac88c531a5b3
("net: davicom: Fix regulator not turned off on failed probe") which missed
one case.
Fixes: 7994fe55a4a2 ("dm9000: Add regulator and reset support to dm9000") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
MAINTAINERS: update maintainer entry for freescale fec driver
Update maintainer entry for freescale fec driver.
Suggested-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'for-5.12-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"One more patch that we'd like to get to 5.12 before release.
It's changing where and how the superblock is stored in the zoned
mode. It is an on-disk format change but so far there are no
implications for users as the proper mkfs support hasn't been merged
and is waiting for the kernel side to settle.
Until now, the superblocks were derived from the zone index, but zone
size can differ per device. This is changed to be based on fixed
offset values, to make it independent of the device zone size.
The work on that got a bit delayed, we discussed the exact locations
to support potential device sizes and usecases. (Partially delayed
also due to my vacation.) Having that in the same release where the
zoned mode is declared usable is highly desired, there are userspace
projects that need to be updated to recognize the feature. Pushing
that to the next release would make things harder to test"
* tag 'for-5.12-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: zoned: move superblock logging zone location
Merge tag 'locking-urgent-2021-04-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixlets from Ingo Molnar:
"Two minor fixes: one for a Clang warning, the other improves an
ambiguous/confusing kernel log message"
* tag 'locking-urgent-2021-04-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
lockdep: Address clang -Wformat warning printing for %hd
lockdep: Add a missing initialization hint to the "INFO: Trying to register non-static key" message
Merge tag 'x86_urgent_for_v5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Fix the vDSO exception handling return path to disable interrupts
again.
- A fix for the CE collector to return the proper return values to its
callers which are used to convey what the collector has done with the
error address.
* tag 'x86_urgent_for_v5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/traps: Correct exc_general_protection() and math_error() return paths
RAS/CEC: Correct ce_add_elem()'s returned values
Merge branch 'for-5.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu
Pull percpu fix from Dennis Zhou:
"This contains a fix for sporadically failing atomic percpu
allocations.
I only caught it recently while I was reviewing a new series [1] and
simultaneously saw reports by btrfs in xfstests [2] and [3].
In v5.9, memcg accounting was extended to percpu done by adding a
second type of chunk. I missed an interaction with the free page float
count used to ensure we can support atomic allocations. If one type of
chunk has no free pages, but the other has enough to satisfy the free
page float requirement, we will not repopulate the free pages for the
former type of chunk. This led to the sporadically failing atomic
allocations"
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Seven fixes, all in drivers.
The hpsa three are the most extensive and the most problematic: it's a
packed structure misalignment that oopses on ia64 but looks like it
would also oops on quite a few non-x86 architectures.
The pm80xx is a regression and the rest are bug fixes for patches in
the misc tree"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: scsi_transport_srp: Don't block target in SRP_PORT_LOST state
scsi: target: iscsi: Fix zero tag inside a trace event
scsi: pm80xx: Fix chip initialization failure
scsi: ufs: core: Fix wrong Task Tag used in task management request UPIUs
scsi: ufs: core: Fix task management request completion timeout
scsi: hpsa: Add an assert to prevent __packed reintroduction
scsi: hpsa: Fix boot on ia64 (atomic_t alignment)
scsi: hpsa: Use __packed on individual structs, not header-wide
netfilter: arp_tables: add pre_exit hook for table unregister
Same problem that also existed in iptables/ip(6)tables, when
arptable_filter is removed there is no longer a wait period before the
table/ruleset is free'd.
Unregister the hook in pre_exit, then remove the table in the exit
function.
This used to work correctly because the old nf_hook_unregister API
did unconditional synchronize_net.
The per-net hook unregister function uses call_rcu instead.
Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
netfilter: bridge: add pre_exit hooks for ebtable unregistration
Just like ip/ip6/arptables, the hooks have to be removed, then
synchronize_rcu() has to be called to make sure no more packets are being
processed before the ruleset data is released.
Place the hook unregistration in the pre_exit hook, then call the new
ebtables pre_exit function from there.
Years ago, when first netns support got added for netfilter+ebtables,
this used an older (now removed) netfilter hook unregister API, that did
a unconditional synchronize_rcu().
Now that all is done with call_rcu, ebtable_{filter,nat,broute} pernet exit
handlers may free the ebtable ruleset while packets are still in flight.
This can only happens on module removal, not during netns exit.
The new function expects the table name, not the table struct.
This is because upcoming patch set (targeting -next) will remove all
net->xt.{nat,filter,broute}_table instances, this makes it necessary
to avoid external references to those member variables.
The existing APIs will be converted, so follow the upcoming scheme of
passing name + hook type instead.
Merge tag 'powerpc-5.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some some more powerpc fixes for 5.12:
- Fix an oops triggered by ptrace when CONFIG_PPC_FPU_REGS=n
- Fix an oops on sigreturn when the VDSO is unmapped on 32-bit
- Fix vdso_wrapper.o not being rebuilt everytime vdso.so is rebuilt
Thanks to Christophe Leroy"
* tag 'powerpc-5.12-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/vdso: Make sure vdso_wrapper.o is rebuilt everytime vdso.so is rebuilt
powerpc/signal32: Fix Oops on sigreturn with unmapped VDSO
powerpc/ptrace: Don't return error when getting/setting FP regs without CONFIG_PPC_FPU_REGS
Merge tag 'driver-core-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is a single driver core fix for 5.12-rc7 to resolve a reported
problem that caused some devices to lockup when booting. It has been
in linux-next with no reported issues"
* tag 'driver-core-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: Fix locking bug in deferred_probe_timeout_work_func()
Merge tag 'usb-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB/Thunderbolt fixes from Greg KH:
"Here are a few small USB and Thunderbolt driver fixes for 5.12-rc7 for
reported issues:
- thunderbolt leaks and off-by-one fix
- cdnsp deque fix
- usbip fixes for syzbot-reported issues
All have been in linux-next with no reported problems"
* tag 'usb-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usbip: synchronize event handler with sysfs code paths
usbip: vudc synchronize sysfs code paths
usbip: stub-dev synchronize sysfs code paths
usbip: add sysfs_lock to synchronize sysfs code paths
thunderbolt: Fix off by one in tb_port_find_retimer()
thunderbolt: Fix a leak in tb_retimer_add()
usb: cdnsp: Fixes issue with dequeuing requests after disabling endpoint
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"A mixture of driver and documentation bugfixes for I2C"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: imx: mention Oleksij as maintainer of the binding docs
i2c: exynos5: correct top kerneldoc
i2c: designware: Adjust bus_freq_hz when refuse high speed mode set
i2c: hix5hd2: use the correct HiSilicon copyright
i2c: gpio: update email address in binding docs
i2c: imx: drop me as maintainer of binding docs
i2c: stm32f4: Mundane typo fix
I2C: JZ4780: Fix bug for Ingenic X1000.
i2c: turn recovery error on init to debug
btrfs: zoned: move superblock logging zone location
Moves the location of the superblock logging zones. The new locations of
the logging zones are now determined based on fixed block addresses
instead of on fixed zone numbers.
The old placement method based on fixed zone numbers causes problems when
one needs to inspect a file system image without access to the drive zone
information. In such case, the super block locations cannot be reliably
determined as the zone size is unknown. By locating the superblock logging
zones using fixed addresses, we can scan a dumped file system image without
the zone information since a super block copy will always be present at or
after the fixed known locations.
Introduce the following three pairs of zones containing fixed offset
locations, regardless of the device zone size.
- primary superblock: offset 0B (and the following zone)
- first copy: offset 512G (and the following zone)
- Second copy: offset 4T (4096G, and the following zone)
If a logging zone is outside of the disk capacity, we do not record the
superblock copy.
The first copy position is much larger than for a non-zoned filesystem,
which is at 64M. This is to avoid overlapping with the log zones for
the primary superblock. This higher location is arbitrary but allows
supporting devices with very large zone sizes, plus some space around in
between.
Such large zone size is unrealistic and very unlikely to ever be seen in
real devices. Currently, SMR disks have a zone size of 256MB, and we are
expecting ZNS drives to be in the 1-4GB range, so this limit gives us
room to breathe. For now, we only allow zone sizes up to 8GB. The
maximum zone size that would still fit in the space is 256G.
The fixed location addresses are somewhat arbitrary, with the intent of
maintaining superblock reliability for smaller and larger devices, with
the preference for the latter. For this reason, there are two superblocks
under the first 1T. This should cover use cases for physical devices and
for emulated/device-mapper devices.
The superblock logging zones are reserved for superblock logging and
never used for data or metadata blocks. Note that we only reserve the
two zones per primary/copy actually used for superblock logging. We do
not reserve the ranges of zones possibly containing superblocks with the
largest supported zone size (0-16GB, 512G-528GB, 4096G-4112G).
The zones containing the fixed location offsets used to store
superblocks on a non-zoned volume are also reserved to avoid confusion.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Here's the latest pile of clk driver and clk framework fixes for this
release:
- Two clk framework fixes for a long standing issue in
clk_notifier_{register,unregister}() where we used a pointer that
was for a struct containing a list head when there was no container
struct
- A compile warning fix for socfpga that's good to have
- A double free problem with devm registered fixed factor clks
- One last fix to the Qualcomm camera clk driver to use the right clk
ops so clks don't get stuck and stop working because the firmware
takes them for a ride"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: fixed: fix double free in resource managed fixed-factor clock
clk: fix invalid usage of list cursor in unregister
clk: fix invalid usage of list cursor in register
clk: qcom: camcc: Update the clock ops for the SC7180
clk: socfpga: fix iomem pointer cast on 64-bit
Subsystems affected by this patch series: mm (kasan, gup, pagecache,
and kfence), MAINTAINERS, mailmap, nds32, gcov, ocfs2, ia64, and lib"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
lib: fix kconfig dependency on ARCH_WANT_FRAME_POINTERS
kfence, x86: fix preemptible warning on KPTI-enabled systems
lib/test_kasan_module.c: suppress unused var warning
kasan: fix conflict with page poisoning
fs: direct-io: fix missing sdio->boundary
ia64: fix user_stack_pointer() for ptrace()
ocfs2: fix deadlock between setattr and dio_end_io_write
gcov: re-fix clang-11+ support
nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff
mm/gup: check page posion status for coredump.
.mailmap: fix old email addresses
mailmap: update email address for Jordan Crouse
treewide: change my e-mail address, fix my name
MAINTAINERS: update CZ.NIC's Turris information
Merge tag 'io_uring-5.12-2021-04-09' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Two minor fixups for the reissue logic, and one for making sure that
unbounded work is canceled on io-wq exit"
* tag 'io_uring-5.12-2021-04-09' of git://git.kernel.dk/linux-block:
io-wq: cancel unbounded works on io-wq destroy
io_uring: fix rw req completion
io_uring: clear F_REISSUE right after getting it
Depending on ARCH_WANT_FRAME_POINTERS causes a recursive dependency
error. ARCH_WANT_FRAME_POINTERS is to be selected by the architecture,
and is not supposed to be overridden by other config options.
Marco Elver [Fri, 9 Apr 2021 20:27:44 +0000 (13:27 -0700)]
kfence, x86: fix preemptible warning on KPTI-enabled systems
On systems with KPTI enabled, we can currently observe the following
warning:
BUG: using smp_processor_id() in preemptible
caller is invalidate_user_asid+0x13/0x50
CPU: 6 PID: 1075 Comm: dmesg Not tainted 5.12.0-rc4-gda4a2b1a5479-kfence_1+ #1
Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
Call Trace:
dump_stack+0x7f/0xad
check_preemption_disabled+0xc8/0xd0
invalidate_user_asid+0x13/0x50
flush_tlb_one_kernel+0x5/0x20
kfence_protect+0x56/0x80
...
While it normally makes sense to require preemption to be off, so that
the expected CPU's TLB is flushed and not another, in our case it really
is best-effort (see comments in kfence_protect_page()).
Avoid the warning by disabling preemption around flush_tlb_one_kernel().
Link: https://lore.kernel.org/lkml/YGIDBAboELGgMgXy@elver.google.com/ Link: https://lkml.kernel.org/r/20210330065737.652669-1-elver@google.com Signed-off-by: Marco Elver <elver@google.com> Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jack Qiu [Fri, 9 Apr 2021 20:27:35 +0000 (13:27 -0700)]
fs: direct-io: fix missing sdio->boundary
I encountered a hung task issue, but not a performance one. I run DIO
on a device (need lba continuous, for example open channel ssd), maybe
hungtask in below case:
DIO: Checkpoint:
get addr A(at boundary), merge into BIO,
no submit because boundary missing
flush dirty data(get addr A+1), wait IO(A+1)
writeback timeout, because DIO(A) didn't submit
get addr A+2 fail, because checkpoint is doing
dio_send_cur_page() may clear sdio->boundary, so prevent it from missing
a boundary.
Link: https://lkml.kernel.org/r/20210322042253.38312-1-jack.qiu@huawei.com Fixes: b1058b981272 ("direct-io: submit bio after boundary buffer is added to it") Signed-off-by: Jack Qiu <jack.qiu@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thus above forms ABBA deadlock. The same deadlock was mentioned in
upstream commit 28f5a8a7c033 ("ocfs2: should wait dio before inode lock
in ocfs2_setattr()"). It seems that that commit only removed the
cluster lock (the victim of above dead lock) from the ABBA deadlock
party.
End-user visible effects: Process hang in truncate -> ocfs2_setattr path
and other processes hang at ocfs2_dio_end_io_write path.
This is to fix the deadlock itself. It removes inode_lock() call from
dio completion path to remove the deadlock and add ip_alloc_sem lock in
setattr path to synchronize the inode modifications.
[wen.gang.wang@oracle.com: remove the "had_alloc_lock" as suggested] Link: https://lkml.kernel.org/r/20210402171344.1605-1-wen.gang.wang@oracle.com Link: https://lkml.kernel.org/r/20210331203654.3911-1-wen.gang.wang@oracle.com Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Desaulniers [Fri, 9 Apr 2021 20:27:26 +0000 (13:27 -0700)]
gcov: re-fix clang-11+ support
LLVM changed the expected function signature for llvm_gcda_emit_function()
in the clang-11 release. Users of clang-11 or newer may have noticed
their kernels producing invalid coverage information:
Fix up the function signatures so calling this function interprets its
parameters correctly and computes the correct cfg checksum. In
particular, in clang-11, the additional checksum is no longer optional.
Mike Rapoport [Fri, 9 Apr 2021 20:27:23 +0000 (13:27 -0700)]
nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff
Commit cb9f753a3731 ("mm: fix races between swapoff and flush dcache")
updated flush_dcache_page implementations on several architectures to
use page_mapping_file() in order to avoid races between page_mapping()
and swapoff().
This update missed arch/nds32 and there is a possibility of a race
there.
Replace page_mapping() with page_mapping_file() in nds32 implementation
of flush_dcache_page().
Link: https://lkml.kernel.org/r/20210330175126.26500-1-rppt@kernel.org Fixes: cb9f753a3731 ("mm: fix races between swapoff and flush dcache") Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Greentime Hu <green.hu@gmail.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Nick Hu <nickhu@andestech.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aili Yao [Fri, 9 Apr 2021 20:27:19 +0000 (13:27 -0700)]
mm/gup: check page posion status for coredump.
When we do coredump for user process signal, this may be an SIGBUS signal
with BUS_MCEERR_AR or BUS_MCEERR_AO code, which means this signal is
resulted from ECC memory fail like SRAR or SRAO, we expect the memory
recovery work is finished correctly, then the get_dump_page() will not
return the error page as its process pte is set invalid by
memory_failure().
But memory_failure() may fail, and the process's related pte may not be
correctly set invalid, for current code, we will return the poison page,
get it dumped, and then lead to system panic as its in kernel code.
So check the poison status in get_dump_page(), and if TRUE, return NULL.
There maybe other scenario that is also better to check the posion status
and not to panic, so make a wrapper for this check, Thanks to David's
suggestion(<david@redhat.com>).
[akpm@linux-foundation.org: s/0/false/]
[yaoaili@kingsoft.com: is_page_poisoned() arg cannot be null, per Matthew]
Link: https://lkml.kernel.org/r/20210322115233.05e4e82a@alex-virtual-machine Link: https://lkml.kernel.org/r/20210319104437.6f30e80d@alex-virtual-machine Signed-off-by: Aili Yao <yaoaili@kingsoft.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Aili Yao <yaoaili@kingsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'devicetree-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Fix fw_devlink failure with ".*,nr-gpios" properties
- Doc link reference fixes from Mauro
- Fixes for unaligned FDT handling found on OpenRisc. First, avoid
crash with better error handling when unflattening an unaligned FDT.
Second, fix memory allocations for FDTs to ensure alignment.
* tag 'devicetree-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: property: fw_devlink: do not link ".*,nr-gpios"
dt-bindings:iio:adc: update motorola,cpcap-adc.yaml reference
dt-bindings: fix references for iio-bindings.txt
dt-bindings: don't use ../dir for doc references
of: unittest: overlay: ensure proper alignment of copied FDT
of: properly check for error returned by fdt_get_name()
Merge tag 'drm-fixes-2021-04-10' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Was relatively quiet this week, but still a few pulls came in, pretty
much small fixes across the board, a couple of regression fixes in the
amdgpu/radeon code, msm has a few minor fixes across the board, a
panel regression fix also.
xen:
- Fix use-after-free in xen
- minor duplicate defintion cleanup
vc4:
- Reduce fifo threshold on hvs4 to fix a fifo full error
- minor redunantant assignment cleanup
panel:
- Disable TE support for Droid4 and N950"
* tag 'drm-fixes-2021-04-10' of git://anongit.freedesktop.org/drm/drm:
drm/vc4: crtc: Reduce PV fifo threshold on hvs4
drm/vc4: plane: Remove redundant assignment
drm/amdgpu/smu7: fix CAC setting on TOPAZ
drm/radeon: Fix size overflow
drm/amdgpu: Fix size overflow
drm/i915: Fix invalid access to ACPI _DSM objects
drm/amd/display: Add missing mask for DCN3
drm/panel: panel-dsi-cm: disable TE for now
drm/msm/disp/dpu1: program 3d_merge only if block is attached
drm/msm: a6xx: fix version check for the A650 SQE microcode
drm/msm: Fix a5xx/a6xx timestamps
drm/msm: Fix removal of valid error case when checking speed_bin
drm/msm: Set drvdata to NULL when msm_drm_init() fails
drivers: gpu: drm: xen_drm_front_drm_info is declared twice
gpu/xen: Fix a use after free in xen_drm_drv_init
Paolo Abeni [Fri, 9 Apr 2021 15:24:17 +0000 (17:24 +0200)]
net: fix hangup on napi_disable for threaded napi
napi_disable() is subject to an hangup, when the threaded
mode is enabled and the napi is under heavy traffic.
If the relevant napi has been scheduled and the napi_disable()
kicks in before the next napi_threaded_wait() completes - so
that the latter quits due to the napi_disable_pending() condition,
the existing code leaves the NAPI_STATE_SCHED bit set and the
napi_disable() loop waiting for such bit will hang.
This patch addresses the issue by dropping the NAPI_STATE_DISABLE
bit test in napi_thread_wait(). The later napi_threaded_poll()
iteration will take care of clearing the NAPI_STATE_SCHED.
This also addresses a related problem reported by Jakub:
before this patch a napi_disable()/napi_enable() pair killed
the napi thread, effectively disabling the threaded mode.
On the patched kernel napi_disable() simply stops scheduling
the relevant thread.
v1 -> v2:
- let the main napi_thread_poll() loop clear the SCHED bit
Sven Van Asbroeck [Fri, 9 Apr 2021 00:39:04 +0000 (20:39 -0400)]
lan743x: fix ethernet frame cutoff issue
The ethernet frame length is calculated incorrectly. Depending on
the value of RX_HEAD_PADDING, this may result in ethernet frames
that are too short (cut off at the end), or too long (garbage added
to the end).
Fix by calculating the ethernet frame length correctly. For added
clarity, use the ETH_FCS_LEN constant in the calculation.
Many thanks to Heiner Kallweit for suggesting this solution.
of: property: fw_devlink: do not link ".*,nr-gpios"
[<vendor>,]nr-gpios property is used by some GPIO drivers[0] to indicate
the number of GPIOs present on a system, not define a GPIO. nr-gpios is
not configured by #gpio-cells and can't be parsed along with other
"*-gpios" properties.
nr-gpios without the "<vendor>," prefix is not allowed by the DT
spec[1], so only add exception for the ",nr-gpios" suffix and let the
error message continue being printed for non-compliant implementations.
[0] nr-gpios is referenced in Documentation/devicetree/bindings/gpio:
- gpio-adnp.txt
- gpio-xgene-sb.txt
- gpio-xlp.txt
- snps,dw-apb-gpio.yaml
As documents have been renamed and moved around, their
references will break, but this will be unnoticed, as the
script which checks for it won't handle "../" references.
Dave Airlie [Fri, 9 Apr 2021 19:15:35 +0000 (05:15 +1000)]
Merge tag 'drm-misc-fixes-2021-04-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.12-rc7:
- Fix use-after-free in xen.
- Reduce fifo threshold on hvs4 to fix a fifo full error.
- Disable TE support for Droid4 and N950.
- Small compiler fixes.
Merge tag 'selinux-pr-20210409' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fixes from Paul Moore:
"Three SELinux fixes.
These fix known problems relating to (re)loading SELinux policy or
changing the policy booleans, and pass our test suite without problem"
* tag 'selinux-pr-20210409' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix race between old and new sidtab
selinux: fix cond_list corruption when changing booleans
selinux: make nslot handling in avtab more robust
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull vdpa/mlx5 fixes from Michael Tsirkin:
"Last minute fixes.
These all look like something we are better off having
than not ..."
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpa/mlx5: Fix suspend/resume index restoration
vdpa/mlx5: Fix wrong use of bit numbers
vdpa/mlx5: Retrieve BAR address suitable any function
vdpa/mlx5: Use the correct dma device when registering memory
vdpa/mlx5: should exclude header length and fcs from mtu
Merge tag 'rproc-v5.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc
Pull remoteproc fixes from Bjorn Andersson:
"This fixes an issue with firmware loading on the TI K3 PRU, fixes
compatibility with GNU binutils for the same and resolves link error
due to a 64-bit division in the Qualcomm PIL info.
It also recognizes Mathieu Poirier as co-maintainer of the remoteproc
and rpmsg subsystems"
* tag 'rproc-v5.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc:
remoteproc: pru: Fix firmware loading crashes on K3 SoCs
remoteproc: pru: Fix loading of GNU Binutils ELF
MAINTAINERS: Add co-maintainer for remoteproc/RPMSG subsystems
remoteproc: qcom: pil_info: avoid 64-bit division
Merge tag 'for-linus-5.12b-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"A single fix of a 5.12 patch for the rather uncommon problem of
running as a Xen guest with a real time kernel config"
* tag 'for-linus-5.12b-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/evtchn: Change irq_info lock to raw_spinlock_t
Eli Cohen [Thu, 8 Apr 2021 09:10:47 +0000 (12:10 +0300)]
vdpa/mlx5: Fix suspend/resume index restoration
When we suspend the VM, the VDPA interface will be reset. When the VM is
resumed again, clear_virtqueues() will clear the available and used
indices resulting in hardware virqtqueue objects becoming out of sync.
We can avoid this function alltogether since qemu will clear them if
required, e.g. when the VM went through a reboot.
Moreover, since the hw available and used indices should always be
identical on query and should be restored to the same value same value
for virtqueues that complete in order, we set the single value provided
by set_vq_state(). In get_vq_state() we return the value of hardware
used index.
Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map") Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210408091047.4269-6-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
Eli Cohen [Thu, 8 Apr 2021 09:10:46 +0000 (12:10 +0300)]
vdpa/mlx5: Fix wrong use of bit numbers
VIRTIO_F_VERSION_1 is a bit number. Use BIT_ULL() with mask
conditionals.
Also, in mlx5_vdpa_is_little_endian() use BIT_ULL for consistency with
the rest of the code.
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210408091047.4269-5-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
Eli Cohen [Thu, 8 Apr 2021 09:10:45 +0000 (12:10 +0300)]
vdpa/mlx5: Retrieve BAR address suitable any function
struct mlx5_core_dev has a bar_addr field that contains the correct bar
address for the function regardless of whether it is pci function or sub
function. Use it.
Fixes: 1958fc2f0712 ("net/mlx5: SF, Add auxiliary device driver") Signed-off-by: Eli Cohen <elic@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Link: https://lore.kernel.org/r/20210408091047.4269-4-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
Si-Wei Liu [Thu, 8 Apr 2021 09:10:43 +0000 (12:10 +0300)]
vdpa/mlx5: should exclude header length and fcs from mtu
When feature VIRTIO_NET_F_MTU is negotiated on mlx5_vdpa,
22 extra bytes worth of MTU length is shown in guest.
This is because the mlx5_query_port_max_mtu API returns
the "hardware" MTU value, which does not just contain the
Ethernet payload, but includes extra lengths starting
from the Ethernet header up to the FCS altogether.
Fix the MTU so packets won't get dropped silently.
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210408091047.4269-2-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Hans de Goede [Fri, 9 Apr 2021 13:58:50 +0000 (15:58 +0200)]
Bluetooth: btusb: Revert Fix the autosuspend enable and disable
drivers/usb/core/hub.c: usb_new_device() contains the following:
/* By default, forbid autosuspend for all devices. It will be
* allowed for hubs during binding.
*/
usb_disable_autosuspend(udev);
So for anything which is not a hub, such as btusb devices, autosuspend is
disabled by default and we must call usb_enable_autosuspend(udev) to
enable it.
This means that the "Fix the autosuspend enable and disable" commit,
which drops the usb_enable_autosuspend() call when the enable_autosuspend
module option is true, is completely wrong, revert it.
Roman Gushchin [Thu, 8 Apr 2021 03:57:33 +0000 (20:57 -0700)]
percpu: make pcpu_nr_empty_pop_pages per chunk type
nr_empty_pop_pages is used to guarantee that there are some free
populated pages to satisfy atomic allocations. Accounted and
non-accounted allocations are using separate sets of chunks,
so both need to have a surplus of empty pages.
This commit makes pcpu_nr_empty_pop_pages and the corresponding logic
per chunk type.
[Dennis]
This issue came up as I was reviewing [1] and realized I missed this.
Simultaneously, it was reported btrfs was seeing failed atomic
allocations in fsstress tests [2] and [3].
Thomas Tai [Thu, 8 Apr 2021 17:28:33 +0000 (13:28 -0400)]
x86/traps: Correct exc_general_protection() and math_error() return paths
Commit
334872a09198 ("x86/traps: Attempt to fixup exceptions in vDSO before signaling")
added return statements which bypass calling cond_local_irq_disable().
According to
ca4c6a9858c2 ("x86/traps: Make interrupt enable/disable symmetric in C code"),
cond_local_irq_disable() is needed because the asm return code no longer
disables interrupts. Follow the existing code as an example to use "goto
exit" instead of "return" statement.
[ bp: Massage commit message. ]
Fixes: 334872a09198 ("x86/traps: Attempt to fixup exceptions in vDSO before signaling") Signed-off-by: Thomas Tai <thomas.tai@oracle.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Link: https://lkml.kernel.org/r/1617902914-83245-1-git-send-email-thomas.tai@oracle.com
Merge tag '5.12-rc6-smb3' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Three cifs/smb3 fixes, two for stable: a reconnect fix and a fix for
display of devnames with special characters"
* tag '5.12-rc6-smb3' of git://git.samba.org/sfrench/cifs-2.6:
cifs: escape spaces in share names
fs: cifs: Remove unnecessary struct declaration
cifs: On cifs_reconnect, resolve the hostname again.
net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlh
nlh is being checked for validtity two times when it is dereferenced in
this function. Check for validity again when updating the flags through
nlh pointer to make the dereferencing safe.
CC: <stable@vger.kernel.org>
Addresses-Coverity: ("NULL pointer dereference") Signed-off-by: Muhammad Usama Anjum <musamaanjum@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 8 Apr 2021 23:38:23 +0000 (16:38 -0700)]
Merge branch 'lantiq-GSWIP-fixes'
Martin Blumenstingl says:
====================
lantiq: GSWIP: two more fixes
after my last patch got accepted and is now in net as commit 3e6fdeb28f4c33 ("net: dsa: lantiq_gswip: Let GSWIP automatically set
the xMII clock") [0] some more people from the OpenWrt community
(many thanks to everyone involved) helped test the GSWIP driver: [1]
It turns out that the previous fix does not work for all boards.
There's no regression, but it doesn't fix as many problems as I
thought. This is why two more fixes are needed:
- the first one solves many (four known but probably there are
a few extra hidden ones) reported bugs with the GSWIP where no
traffic would flow. Not all circumstances are fully understood
but testing shows that switching away from PHY auto polling
solves all of them
- while investigating the different problems which are addressed
by the first patch some small issues with the existing code were
found. These are addressed by the second patch
Changes since v1 at [0]:
- Don't configure the link parameters in gswip_phylink_mac_config
(as we're using the "modern" way in gswip_phylink_mac_link_up).
Thanks to Andrew for the hint with the phylink documentation.
- Clarify that GSWIP_MII_CFG_RMII_CLK is ignored by the hardware in
the description of the second patch as suggested by Hauke
- Don't set GSWIP_MII_CFG_RGMII_IBS in the second patch as we don't
have any hardware available for testing this. The patch
description now also reflects this.
- Added Andrew's Reviewed-by to the first patch (thank you!)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Blumenstingl [Thu, 8 Apr 2021 18:38:28 +0000 (20:38 +0200)]
net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits
There are a few more bits in the GSWIP_MII_CFG register for which we
did rely on the boot-loader (or the hardware defaults) to set them up
properly.
For some external RMII PHYs we need to select the GSWIP_MII_CFG_RMII_CLK
bit and also we should un-set it for non-RMII PHYs. The
GSWIP_MII_CFG_RMII_CLK bit is ignored for other PHY connection modes.
The GSWIP IP also supports in-band auto-negotiation for RGMII PHYs when
the GSWIP_MII_CFG_RGMII_IBS bit is set. Clear this bit always as there's
no known hardware which uses this (so it is not tested yet).
Clear the xMII isolation bit when set at initialization time if it was
previously set by the bootloader. Not doing so could lead to no traffic
(neither RX nor TX) on a port with this bit set.
While here, also add the GSWIP_MII_CFG_RESET bit. We don't need to
manage it because this bit is self-clearning when set. We still add it
here to get a better overview of the GSWIP_MII_CFG register.
Fixes: 14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Cc: stable@vger.kernel.org Suggested-by: Hauke Mehrtens <hauke@hauke-m.de> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Blumenstingl [Thu, 8 Apr 2021 18:38:27 +0000 (20:38 +0200)]
net: dsa: lantiq_gswip: Don't use PHY auto polling
PHY auto polling on the GSWIP hardware can be used so link changes
(speed, link up/down, etc.) can be detected automatically. Internally
GSWIP reads the PHY's registers for this functionality. Based on this
automatic detection GSWIP can also automatically re-configure it's port
settings. Unfortunately this auto polling (and configuration) mechanism
seems to cause various issues observed by different people on different
devices:
- FritzBox 7360v2: the two Gbit/s ports (connected to the two internal
PHY11G instances) are working fine but the two Fast Ethernet ports
(using an AR8030 RMII PHY) are completely dead (neither RX nor TX are
received). It turns out that the AR8030 PHY sets the BMSR_ESTATEN bit
as well as the ESTATUS_1000_TFULL and ESTATUS_1000_XFULL bits. This
makes the PHY auto polling state machine (rightfully?) think that the
established link speed (when the other side is Gbit/s capable) is
1Gbit/s.
- None of the Ethernet ports on the Zyxel P-2812HNU-F1 (two are
connected to the internal PHY11G GPHYs while the other three are
external RGMII PHYs) are working. Neither RX nor TX traffic was
observed. It is not clear which part of the PHY auto polling state-
machine caused this.
- FritzBox 7412 (only one LAN port which is connected to one of the
internal GPHYs running in PHY22F / Fast Ethernet mode) was seeing
random disconnects (link down events could be seen). Sometimes all
traffic would stop after such disconnect. It is not clear which part
of the PHY auto polling state-machine cauased this.
- TP-Link TD-W9980 (two ports are connected to the internal GPHYs
running in PHY11G / Gbit/s mode, the other two are external RGMII
PHYs) was affected by similar issues as the FritzBox 7412 just without
the "link down" events
Switch to software based configuration instead of PHY auto polling (and
letting the GSWIP hardware configure the ports automatically) for the
following link parameters:
- link up/down
- link speed
- full/half duplex
- flow control (RX / TX pause)
After a big round of manual testing by various people (who helped test
this on OpenWrt) it turns out that this fixes all reported issues.
Additionally it can be considered more future proof because any
"quirk" which is implemented for a PHY on the driver side can now be
used with the GSWIP hardware as well because Linux is in control of the
link parameters.
As a nice side-effect this also solves a problem where fixed-links were
not supported previously because we were relying on the PHY auto polling
mechanism, which cannot work for fixed-links as there's no PHY from
where it can read the registers. Configuring the link settings on the
GSWIP ports means that we now use the settings from device-tree also for
ports with fixed-links.
Fixes: 14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Fixes: 3e6fdeb28f4c33 ("net: dsa: lantiq_gswip: Let GSWIP automatically set the xMII clock") Cc: stable@vger.kernel.org Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Nothing very exciting here, just a few small bug fixes. No red flags
for this release have shown up.
- Regression from the last pull request in cxgb4 related to the ipv6
fixes
- KASAN crasher in rtrs
- oops in hfi1 related to a buggy BIOS
- Userspace could oops qedr's XRC support
- Uninitialized memory when parsing a LS_NLA_TYPE_DGID netlink
message"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/addr: Be strict with gid size
RDMA/qedr: Fix kernel panic when trying to access recv_cq
IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS
RDMA/cxgb4: check for ipv6 address properly while destroying listener
RDMA/rtrs-clt: Close rtrs client conn before destroying rtrs clt session files
Frank Rowand [Thu, 8 Apr 2021 20:45:08 +0000 (15:45 -0500)]
of: unittest: overlay: ensure proper alignment of copied FDT
The Devicetree standard specifies an 8 byte alignment of the FDT.
Code in libfdt expects this alignment for an FDT image in memory.
kmemdup() returns 4 byte alignment on openrisc. Replace kmemdup()
with kmalloc(), align pointer, memcpy() to get proper alignment.
The 4 byte alignment exposed a related bug which triggered a crash
on openrisc with:
commit 79edff12060f ("scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9")
as reported in:
https://lore.kernel.org/lkml/20210327224116.69309-1-linux@roeck-us.net/