As the code comments in perf_stat_process_counter() say, we calculate
counter's data every interval, and the display code shows ps->res_stats
avg value. We need to zero the stats for interval mode.
But the current code only zeros the res_stats[0], it doesn't zero the
res_stats[1] and res_stats[2], which are for ena and run of counter.
This patch zeros the whole res_stats[] for interval mode.
Fixes: 51fd2df1e882 ("perf stat: Fix interval output values") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200409070755.17261-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes a regression encountered while running the
gdb.base/corefile.exp test in GDB's test suite.
In my testing, the typo prevented the sw_reserved field of struct
fxregs_state from being output to the kernel XSAVES area. Thus the
correct mask corresponding to XCR0 was not present in the core file for
GDB to interrogate, resulting in the following behavior:
The above patch is supposed to fix a register index error on mt2701. It
is not clear if the problem solved is a hang or just an invalid value
returned, my guess is the second. The patch introduces, though, a new
hang on MT8173 device making them unusable. So, seems reasonable, revert
the patch because introduces a worst issue.
The reason I send a revert instead of trying to fix the issue for MT8173
is because the information needed to fix the issue is in the datasheet
and is not public. So I am not really able to fix it.
Fixes the following bug when CONFIG_MTK_THERMAL is set on MT8173
devices.
This code reads from the array before verifying that "trig" is a valid
index. If the index is wildly out of bounds then reading from an
invalid address could lead to an Oops.
Fixes: a8c66b684efa ("staging: comedi: addi_apci_1500: rewrite the subdevice support functions") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20200709102936.GA20875@mwanda Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
A SPI transfer defines the _maximum_ speed of the SPI transfer. However the
driver doesn't take into account that the clock divider is always rounded down
(due to integer arithmetics). This results in a too high clock rate for the SPI
transfer.
E.g.: with a mclk_rate of 24 MHz and a SPI transfer speed of 10 MHz, the
original code calculates a reg of "0", which results in a effective divider of
"2" and a 12 MHz clock for the SPI transfer.
This patch fixes the issue by using DIV_ROUND_UP() instead of a plain
integer division.
While there simplify the divider calculation for the CDR1 case, use
order_base_2() instead of two ilog2() calculations.
Fixes: 3558fe900e8a ("spi: sunxi: Add Allwinner A31 SPI controller driver") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Maxime Ripard <mripard@kernel.org> Link: https://lore.kernel.org/r/20200706143443.9855-2-mkl@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently pointer phy0 is being dereferenced via the assignment of
phy on the call to phy_get_drvdata before phy0 is null checked, this
can lead to a null pointer dereference. Fix this by performing the
null check on phy0 before the call to phy_get_drvdata. Also replace
the phy0 == NULL check with the more usual !phy0 idiom.
Addresses-Coverity: ("Dereference before null check") Fixes: e6f32efb1b12 ("phy: sun4i-usb: Make sure to disable PHY0 passby for peripheral mode") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20200625124428.83564-1-colin.king@canonical.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes). This is not guaranteed in
this driver which uses a 40 byte array of smaller elements on the stack.
As Lars also noted this anti pattern can involve a leak of data to
userspace and that indeed can happen here. We close both issues by
moving to a suitable structure in the iio_priv() data with alignment
explicitly requested. This data is allocated with kzalloc so no
data can leak appart from previous readings.
Fixes: 87aec56e27ef ("iio: health: Add driver for the TI AFE4404 heart monitor") Reported-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently, changing the brightness of the internal display of the Acer
TravelMate 5735Z does not work. Pressing the function keys or changing the
slider, GNOME Shell 3.36.2 displays the OSD (five steps), but the
brightness does not change.
The Acer TravelMate 5735Z shipped with Windows 7 and as such does not
trigger our "win8 ready" heuristic for preferring the native backlight
interface.
Still ACPI backlight control doesn't work on this model, where as the
native (intel_video) backlight interface does work by adding
`acpi_backlight=native` or `acpi_backlight=none` to Linux’ command line.
So, add a quirk to force using native backlight control on this model.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=207835 Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The Acer Aspire 5783z shipped with Windows 7 and as such does not trigger
our "win8 ready" heuristic for prefering the native backlight interface.
Still ACPI backlight control doesn't work on this model, where as the
native (intel_video) backlight interface does work. Add a quirk to
force using native backlight control on this model.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Except SDHCI_QUIRK_BROKEN_CARD_DETECTION and MMC_CAP_NONREMOVABLE,
we also do not need to handle controller native card detect interrupt
for gpio cd type.
If we wrong enabled the card detect interrupt for gpio case, it will
cause a lot of unexpected card detect interrupts during data transfer
which should not happen.
This patch updates the documentation with the information related
to the quirks that needs to be added for disabling all SuperSpeed XHCI
instances in park mode.
Cc: Dongjin Kim <tobetter@gmail.com> Cc: Jianxin Pan <jianxin.pan@amlogic.com> Cc: Thinh Nguyen <thinhn@synopsys.com> Cc: Jun Li <lijun.kernel@gmail.com> Reported-by: Tim <elatllat@gmail.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Felipe Balbi <balbi@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
On Thu, Jul 09, 2020 at 09:00:23AM +0200, Eugeniu Rosca wrote:
>After integrating v4.14.186 commit 5410d158ca2a50 ("usb/ehci-platform:
>Set PM runtime as active on resume") into downstream v4.14.x, we started
>to consistently experience below panic [1] on every second s2ram of
>R-Car H3 Salvator-X Renesas reference board.
>
>After some investigations, we concluded the following:
> - the issue does not exist in vanilla v5.8-rc4+
> - [bisecting shows that] the panic on v4.14.186 is caused by the lack
> of v5.6-rc1 commit 987351e1ea7772 ("phy: core: Add consumer device
> link support"). Getting evidence for that is easy. Reverting
> 987351e1ea7772 in vanilla leads to a similar backtrace [2].
>
>Questions:
> - Backporting 987351e1ea7772 ("phy: core: Add consumer device
> link support") to v4.14.187 looks challenging enough, so probably not
> worth it. Anybody to contradict this?
> - Assuming no plans to backport the missing mainline commit to v4.14.x,
> should the following three v4.14.186 commits be reverted on v4.14.x?
> * baef809ea497a4 ("usb/ohci-platform: Fix a warning when hibernating")
> * 9f33eff4958885 ("usb/xhci-plat: Set PM runtime as active on resume")
> * 5410d158ca2a50 ("usb/ehci-platform: Set PM runtime as active on resume")
On Thu, Jul 09, 2020 at 09:00:23AM +0200, Eugeniu Rosca wrote:
>After integrating v4.14.186 commit 5410d158ca2a50 ("usb/ehci-platform:
>Set PM runtime as active on resume") into downstream v4.14.x, we started
>to consistently experience below panic [1] on every second s2ram of
>R-Car H3 Salvator-X Renesas reference board.
>
>After some investigations, we concluded the following:
> - the issue does not exist in vanilla v5.8-rc4+
> - [bisecting shows that] the panic on v4.14.186 is caused by the lack
> of v5.6-rc1 commit 987351e1ea7772 ("phy: core: Add consumer device
> link support"). Getting evidence for that is easy. Reverting
> 987351e1ea7772 in vanilla leads to a similar backtrace [2].
>
>Questions:
> - Backporting 987351e1ea7772 ("phy: core: Add consumer device
> link support") to v4.14.187 looks challenging enough, so probably not
> worth it. Anybody to contradict this?
> - Assuming no plans to backport the missing mainline commit to v4.14.x,
> should the following three v4.14.186 commits be reverted on v4.14.x?
> * baef809ea497a4 ("usb/ohci-platform: Fix a warning when hibernating")
> * 9f33eff4958885 ("usb/xhci-plat: Set PM runtime as active on resume")
> * 5410d158ca2a50 ("usb/ehci-platform: Set PM runtime as active on resume")
On Thu, Jul 09, 2020 at 09:00:23AM +0200, Eugeniu Rosca wrote:
>After integrating v4.14.186 commit 5410d158ca2a50 ("usb/ehci-platform:
>Set PM runtime as active on resume") into downstream v4.14.x, we started
>to consistently experience below panic [1] on every second s2ram of
>R-Car H3 Salvator-X Renesas reference board.
>
>After some investigations, we concluded the following:
> - the issue does not exist in vanilla v5.8-rc4+
> - [bisecting shows that] the panic on v4.14.186 is caused by the lack
> of v5.6-rc1 commit 987351e1ea7772 ("phy: core: Add consumer device
> link support"). Getting evidence for that is easy. Reverting
> 987351e1ea7772 in vanilla leads to a similar backtrace [2].
>
>Questions:
> - Backporting 987351e1ea7772 ("phy: core: Add consumer device
> link support") to v4.14.187 looks challenging enough, so probably not
> worth it. Anybody to contradict this?
> - Assuming no plans to backport the missing mainline commit to v4.14.x,
> should the following three v4.14.186 commits be reverted on v4.14.x?
> * baef809ea497a4 ("usb/ohci-platform: Fix a warning when hibernating")
> * 9f33eff4958885 ("usb/xhci-plat: Set PM runtime as active on resume")
> * 5410d158ca2a50 ("usb/ehci-platform: Set PM runtime as active on resume")
Commit 209c65b61d94 ("drivers/of/of_mdio.c:fix of_mdiobus_register()")
introduced a break of the loop on the premise that a successful
registration should exit the loop. The premise is correct but not to
code, because rc && rc != -ENODEV is just a special error condition,
that means we would exit the loop even with rc == -ENODEV which is
absolutely not correct since this is the error code to indicate to the
MDIO bus layer that scanning should continue.
Fix this by explicitly checking for rc = 0 as the only valid condition
to break out of the loop.
Fixes: 209c65b61d94 ("drivers/of/of_mdio.c:fix of_mdiobus_register()") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
of_find_node_by_name() will do an of_node_put() on the "from" argument.
With CONFIG_OF_DYNAMIC enabled which checks for device_node reference
counts, we would be getting a warning like this:
On ColdFire mcf54418, using DSPI_DMA_MODE mode, spi transfers
at first boot stage are not succeding:
m25p80 spi0.1: unrecognized JEDEC id bytes: 00, 00, 00
The reason is the SPI_SR initial value set by the driver, that
is not clearing (not setting to 1) the RF_DF flag. After a tour
on the dspi hw modules that use this driver(Vybrid, ColdFire and
ls1021a) a better init value for SR register has been set.
Signed-off-by: Angelo Dureghello <angelo@sysam.it> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
During shutdown, the driver should unregister the SPI controller
and stop the hardware. Otherwise the dspi_transfer_one_message() could
wait on completion infinitely.
Additionally, calling spi_unregister_controller() first in device
shutdown reverse-matches the probe function, where SPI controller is
registered at the end.
Fixes: dc234825997e ("spi: spi-fsl-dspi: Adding shutdown hook") Reported-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200622110543.5035-2-krzk@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes). This is not guaranteed in
this driver which uses a 32 byte array of smaller elements on the stack.
As Lars also noted this anti pattern can involve a leak of data to
userspace and that indeed can happen here. We close both issues by
moving to a suitable structure in the iio_priv() data with alignment
explicitly requested. This data is allocated with kzalloc so no
data can leak appart from previous readings.
Fixes: eec96d1e2d31 ("iio: health: Add driver for the TI AFE4403 heart monitor") Reported-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Andrew F. Davis <afd@ti.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes). This is not guaranteed in
this driver which uses an array of smaller elements on the stack.
Here there is no data leak possibility so use an explicit structure
on the stack to ensure alignment and nice readable fashion.
The forced alignment of ts isn't strictly necessary in this driver
as the padding will be correct anyway (there isn't any). However
it is probably less fragile to have it there and it acts as
documentation of the requirement.
Calling pm_runtime_get_sync increments the counter even in case of
failure, causing incorrect ref count. Call pm_runtime_put if
pm_runtime_get_sync fails.
The function iio_device_register() was called in mma8452_probe().
But the function iio_device_unregister() was not called after
a call of the function mma8452_set_freefall_mode() failed.
Thus add the missed function call for one error case.
Fixes: 1a965d405fc6 ("drivers:iio:accel:mma8452: added cleanup provision in case of failure.") Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When devm_regmap_init_i2c() returns an error code, a pairing
runtime PM usage counter decrement is needed to keep the
counter balanced. For error paths after ak8974_set_power(),
ak8974_detect() and ak8974_reset(), things are the same.
However, When iio_triggered_buffer_setup() returns an error
code, there will be two PM usgae counter decrements.
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Fixes: 7c94a8b2ee8c ("iio: magn: add a driver for AK8974") Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes). This is not guaranteed in
this driver which uses an array of smaller elements on the stack.
As Lars also noted this anti pattern can involve a leak of data to
userspace and that indeed can happen here. We close both issues by
moving to a suitable structure in the iio_priv() data.
This data is allocated with kzalloc so no data can leak apart
from previous readings.
Fixes: 16bf793f86b2 ("iio: humidity: hdc100x: add triggered buffer support for HDC100X") Reported-by: Lars-Peter Clausen <lars@metafoo.de> Acked-by: Matt Ranostay <matt.ranostay@konsulko.com> Cc: Alison Schofield <amsfield22@gmail.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes). This is not guaranteed in
this driver which uses an array of smaller elements on the stack.
As Lars also noted this anti pattern can involve a leak of data to
userspace and that indeed can happen here. We close both issues by
moving to a suitable structure in the iio_priv() data.
This data is allocated with kzalloc so no data can leak appart from
previous readings.
Fixes: 7c94a8b2ee8cf ("iio: magn: add a driver for AK8974") Reported-by: Lars-Peter Clausen <lars@metafoo.de> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit f7b93d42945c ("arm64/alternatives: use subsections for replacement
sequences") moved the alternatives replacement sequences into subsections,
in order to keep the as close as possible to the code that they replace.
Unfortunately, this broke the logic in branch_insn_requires_update,
which assumed that any branch into kernel executable code was a branch
that required updating, which is no longer the case now that the code
sequences that are patched in are in the same section as the patch site
itself.
So the only way to discriminate branches that require updating and ones
that don't is to check whether the branch targets the replacement sequence
itself, and so we can drop the call to kernel_text_address() entirely.
Fixes: f7b93d42945c ("arm64/alternatives: use subsections for replacement sequences") Reported-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20200709125953.30918-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Return statements in functions returning bool should use true or false
instead of an integer value. This code was detected with the help of
Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Before this patch, only read-write mounts would grab the freeze
glock in read-only mode, as part of gfs2_make_fs_rw. So the freeze
glock was never initialized. That meant requests to freeze, which
request the glock in EX, were granted without any state transition.
That meant you could mount a gfs2 file system, which is currently
frozen on a different cluster node, in read-only mode.
This patch makes read-only mounts lock the freeze glock in SH mode,
which will block for file systems that are frozen on another node.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When building very large kernels, the logic that emits replacement
sequences for alternatives fails when relative branches are present
in the code that is emitted into the .altinstr_replacement section
and patched in at the original site and fixed up. The reason is that
the linker will insert veneers if relative branches go out of range,
and due to the relative distance of the .altinstr_replacement from
the .text section where its branch targets usually live, veneers
may be emitted at the end of the .altinstr_replacement section, with
the relative branches in the sequence pointed at the veneers instead
of the actual target.
The alternatives patching logic will attempt to fix up the branch to
point to its original target, which will be the veneer in this case,
but given that the patch site is likely to be far away as well, it
will be out of range and so patching will fail. There are other cases
where these veneers are problematic, e.g., when the target of the
branch is in .text while the patch site is in .init.text, in which
case putting the replacement sequence inside .text may not help either.
So let's use subsections to emit the replacement code as closely as
possible to the patch site, to ensure that veneers are only likely to
be emitted if they are required at the patch site as well, in which
case they will be in range for the replacement sequence both before
and after it is transported to the patch site.
This will prevent alternative sequences in non-init code from being
released from memory after boot, but this is tolerable given that the
entire section is only 512 KB on an allyesconfig build (which weighs in
at 500+ MB for the entire Image). Also, note that modules today carry
the replacement sequences in non-init sections as well, and any of
those that target init code will be emitted into init sections after
this change.
This fixes an early crash when booting an allyesconfig kernel on a
system where any of the alternatives sequences containing relative
branches are activated at boot (e.g., ARM64_HAS_PAN on TX2)
Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Dave P Martin <dave.martin@arm.com> Link: https://lore.kernel.org/r/20200630081921.13443-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
in mic_pre_enable, pm_runtime_get_sync is called which
increments the counter even in case of failure, leading to incorrect
ref count. In case of failure, decrement the ref count before returning.
In order for no_refcnt and is_data to be the lowest order two
bits in the 'val' we have to pad out the bitfield of the u8.
Fixes: ad0f75e5f57c ("cgroup: fix cgroup_sk_alloc() for sk_clone_lock()") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we clone a socket in sk_clone_lock(), its sk_cgrp_data is
copied, so the cgroup refcnt must be taken too. And, unlike the
sk_alloc() path, sock_update_netprioidx() is not called here.
Therefore, it is safe and necessary to grab the cgroup refcnt
even when cgroup_sk_alloc is disabled.
sk_clone_lock() is in BH context anyway, the in_interrupt()
would terminate this function if called there. And for sk_alloc()
skcd->val is always zero. So it's safe to factor out the code
to make it more readable.
The global variable 'cgroup_sk_alloc_disabled' is used to determine
whether to take these reference counts. It is impossible to make
the reference counting correct unless we save this bit of information
in skcd->val. So, add a new bit there to record whether the socket
has already taken the reference counts. This obviously relies on
kmalloc() to align cgroup pointers to at least 4 bytes,
ARCH_KMALLOC_MINALIGN is certainly larger than that.
This bug seems to be introduced since the beginning, commit d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets")
tried to fix it but not compeletely. It seems not easy to trigger until
the recent commit 090e28b229af
("netprio_cgroup: Fix unlimited memory leak of v2 cgroups") was merged.
Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup") Reported-by: Cameron Berkenpas <cam@neo-zeon.de> Reported-by: Peter Geis <pgwipeout@gmail.com> Reported-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Reported-by: Daniël Sonck <dsonck92@gmail.com> Reported-by: Zhang Qiang <qiang.zhang@windriver.com> Tested-by: Cameron Berkenpas <cam@neo-zeon.de> Tested-by: Peter Geis <pgwipeout@gmail.com> Tested-by: Thomas Lamprecht <t.lamprecht@proxmox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Zefan Li <lizefan@huawei.com> Cc: Tejun Heo <tj@kernel.org> Cc: Roman Gushchin <guro@fb.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Whenever cookie_init_timestamp() has been used to encode
ECN,SACK,WSCALE options, we can not remove the TS option in the SYNACK.
Otherwise, tcp_synack_options() will still advertize options like WSCALE
that we can not deduce later when receiving the packet from the client
to complete 3WHS.
Note that modern linux TCP stacks wont use MD5+TS+SACK in a SYN packet,
but we can not know for sure that all TCP stacks have the same logic.
Before the fix a tcpdump would exhibit this wrong exchange :
10:12:15.464591 IP C > S: Flags [S], seq 4202415601, win 65535, options [nop,nop,md5 valid,mss 1400,sackOK,TS val 456965269 ecr 0,nop,wscale 8], length 0
10:12:15.464602 IP S > C: Flags [S.], seq 253516766, ack 4202415602, win 65535, options [nop,nop,md5 valid,mss 1400,nop,nop,sackOK,nop,wscale 8], length 0
10:12:15.464611 IP C > S: Flags [.], ack 1, win 256, options [nop,nop,md5 valid], length 0
10:12:15.464678 IP C > S: Flags [P.], seq 1:13, ack 1, win 256, options [nop,nop,md5 valid], length 12
10:12:15.464685 IP S > C: Flags [.], ack 13, win 65535, options [nop,nop,md5 valid], length 0
After this patch the exchange looks saner :
11:59:59.882990 IP C > S: Flags [S], seq 517075944, win 65535, options [nop,nop,md5 valid,mss 1400,sackOK,TS val 1751508483 ecr 0,nop,wscale 8], length 0
11:59:59.883002 IP S > C: Flags [S.], seq 1902939253, ack 517075945, win 65535, options [nop,nop,md5 valid,mss 1400,sackOK,TS val 1751508479 ecr 1751508483,nop,wscale 8], length 0
11:59:59.883012 IP C > S: Flags [.], ack 1, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508483 ecr 1751508479], length 0
11:59:59.883114 IP C > S: Flags [P.], seq 1:13, ack 1, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508483 ecr 1751508479], length 12
11:59:59.883122 IP S > C: Flags [.], ack 13, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508483 ecr 1751508483], length 0
11:59:59.883152 IP S > C: Flags [P.], seq 1:13, ack 13, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508484 ecr 1751508483], length 12
11:59:59.883170 IP C > S: Flags [.], ack 13, win 256, options [nop,nop,md5 valid,nop,nop,TS val 1751508484 ecr 1751508484], length 0
Of course, no SACK block will ever be added later, but nothing should break.
Technically, we could remove the 4 nops included in MD5+TS options,
but again some stacks could break seeing not conventional alignment.
Fixes: 4957faade11b ("TCPCT part 1g: Responder Cookie => Initiator") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzkaller found its way into setsockopt with TCP_CONGESTION "cdg".
tcp_cdg_init() does a kcalloc to store the gradients. As sk_clone_lock
just copies all the memory, the allocated pointer will be copied as
well, if the app called setsockopt(..., TCP_CONGESTION) on the listener.
If now the socket will be destroyed before the congestion-control
has properly been initialized (through a call to tcp_init_transfer), we
will end up freeing memory that does not belong to that particular
socket, opening the door to a double-free:
Wei Wang fixed a part of these CDG-malloc issues with commit c12014440750
("tcp: memset ca_priv data to 0 properly").
This patch here fixes the listener-scenario: We make sure that listeners
setting the congestion-control through setsockopt won't initialize it
(thus CDG never allocates on listeners). For those who use AF_UNSPEC to
reuse a socket, tcp_disconnect() is changed to cleanup afterwards.
(The issue can be reproduced at least down to v4.4.x.)
Cc: Wei Wang <weiwan@google.com> Cc: Eric Dumazet <edumazet@google.com> Fixes: 2b0a8c9eee81 ("tcp: add CDG congestion control") Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When tcf_block_get() fails inside atm_tc_init(),
atm_tc_put() is called to release the qdisc p->link.q.
But the flow->ref prevents it to do so, as the flow->ref
is still zero.
Fix this by moving the p->link.ref initialization before
tcf_block_get().
Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure") Reported-and-tested-by: syzbot+d411cff6ab29cc2c311b@syzkaller.appspotmail.com Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This essentially reverts commit 721230326891 ("tcp: md5: reject TCP_MD5SIG
or TCP_MD5SIG_EXT on established sockets")
Mathieu reported that many vendors BGP implementations can
actually switch TCP MD5 on established flows.
Quoting Mathieu :
Here is a list of a few network vendors along with their behavior
with respect to TCP MD5:
- Cisco: Allows for password to be changed, but within the hold-down
timer (~180 seconds).
- Juniper: When password is initially set on active connection it will
reset, but after that any subsequent password changes no network
resets.
- Nokia: No notes on if they flap the tcp connection or not.
- Ericsson/RedBack: Allows for 2 password (old/new) to co-exist until
both sides are ok with new passwords.
- Meta-Switch: Expects the password to be set before a connection is
attempted, but no further info on whether they reset the TCP
connection on a change.
- Avaya: Disable the neighbor, then set password, then re-enable.
- Zebos: Would normally allow the change when socket connected.
We can revert my prior change because commit 9424e2e7ad93 ("tcp: md5: fix potential
overestimation of TCP option space") removed the leak of 4 kernel bytes to
the wire that was the main reason for my patch.
While doing my investigations, I found a bug when a MD5 key is changed, leading
to these commits that stable teams want to consider before backporting this revert :
Fixes: 721230326891 "tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets" Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
My prior fix went a bit too far, according to Herbert and Mathieu.
Since we accept that concurrent TCP MD5 lookups might see inconsistent
keys, we can use READ_ONCE()/WRITE_ONCE() instead of smp_rmb()/smp_wmb()
Clearing all key->key[] is needed to avoid possible KMSAN reports,
if key->keylen is increased. Since tcp_md5_do_add() is not fast path,
using __GFP_ZERO to clear all struct tcp_md5sig_key is simpler.
data_race() was added in linux-5.8 and will prevent KCSAN reports,
this can safely be removed in stable backports, if data_race() is
not yet backported.
v2: use data_race() both in tcp_md5_hash_key() and tcp_md5_do_add()
Fixes: 6a2febec338d ("tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Marco Elver <elver@google.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
MD5 keys are read with RCU protection, and tcp_md5_do_add()
might update in-place a prior key.
Normally, typical RCU updates would allocate a new piece
of memory. In this case only key->key and key->keylen might
be updated, and we do not care if an incoming packet could
see the old key, the new one, or some intermediate value,
since changing the key on a live flow is known to be problematic
anyway.
We only want to make sure that in the case key->keylen
is changed, cpus in tcp_md5_hash_key() wont try to use
uninitialized data, or crash because key->keylen was
read twice to feed sg_init_one() and ahash_request_set_crypt()
Fixes: 9ea88a153001 ("tcp: md5: check md5 signature without socket lock") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: AceLan Kao <acelan.kao@canonical.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The packets from tunnel devices (eg bareudp) may have only
metadata in the dst pointer of skb. Hence a pointer check of
neigh_lookup is needed in dst_neigh_lookup_skb
Kernel crashes when packets from bareudp device is processed in
the kernel neighbour subsytem.
Fixes: aaa0c23cb901 ("Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug") Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzbot was to trigger a bug by tricking AF_LLC with
non sensible addr->sllc_arphrd
It seems clear LLC requires an Ethernet device.
Back in commit abf9d537fea2 ("llc: add support for SO_BINDTODEVICE")
Octavian Purdila added possibility for application to use a zero
value for sllc_arphrd, convert it to ARPHRD_ETHER to not cause
regressions on existing applications.
BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:199 [inline]
BUG: KASAN: use-after-free in list_empty include/linux/list.h:268 [inline]
BUG: KASAN: use-after-free in waitqueue_active include/linux/wait.h:126 [inline]
BUG: KASAN: use-after-free in wq_has_sleeper include/linux/wait.h:160 [inline]
BUG: KASAN: use-after-free in skwq_has_sleeper include/net/sock.h:2092 [inline]
BUG: KASAN: use-after-free in sock_def_write_space+0x642/0x670 net/core/sock.c:2813
Read of size 8 at addr ffff88801e0b4078 by task ksoftirqd/3/27
Memory state around the buggy address: ffff88801e0b3f00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff88801e0b3f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88801e0b4000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff88801e0b4080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88801e0b4100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fixes: abf9d537fea2 ("llc: add support for SO_BINDTODEVICE") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the tx path of l2tp, l2tp_xmit_skb() calls skb_dst_set() to set
skb's dst. However, it will eventually call inet6_csk_xmit() or
ip_queue_xmit() where skb's dst will be overwritten by:
skb_dst_set_noref(skb, dst);
without releasing the old dst in skb. Then it causes dst/dev refcnt leak:
unregister_netdevice: waiting for eth0 to become free. Usage count = 1
This can be reproduced by simply running:
# modprobe l2tp_eth && modprobe l2tp_ip
# sh ./tools/testing/selftests/net/l2tp.sh
So before going to inet6_csk_xmit() or ip_queue_xmit(), skb's dst
should be dropped. This patch is to fix it by removing skb_dst_set()
from l2tp_xmit_skb() and moving skb_dst_drop() into l2tp_xmit_core().
Fixes: 3557baabf280 ("[L2TP]: PPP over L2TP driver core") Reported-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: James Chapman <jchapman@katalix.com> Tested-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
IPv4 ping sockets don't set fl4.fl4_icmp_{type,code}, which leads to
incomplete IPsec ACQUIRE messages being sent to userspace. Currently,
both raw sockets and IPv6 ping sockets set those fields.
Expected output of "ip xfrm monitor":
acquire proto esp
sel src 10.0.2.15/32 dst 8.8.8.8/32 proto icmp type 8 code 0 dev ens4
policy src 10.0.2.15/32 dst 8.8.8.8/32
<snip>
Currently with ping sockets:
acquire proto esp
sel src 10.0.2.15/32 dst 8.8.8.8/32 proto icmp type 0 code 0 dev ens4
policy src 10.0.2.15/32 dst 8.8.8.8/32
<snip>
The Libreswan test suite found this problem after Fedora changed the
value for the sysctl net.ipv4.ping_group_range.
Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Reported-by: Paul Wouters <pwouters@redhat.com> Tested-by: Paul Wouters <pwouters@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A potential deadlock can occur during registering or unregistering a
new generic netlink family between the main nl_table_lock and the
cb_lock where each thread wants the lock held by the other, as
demonstrated below.
1) Thread 1 is performing a netlink_bind() operation on a socket. As part
of this call, it will call netlink_lock_table(), incrementing the
nl_table_users count to 1.
2) Thread 2 is registering (or unregistering) a genl_family via the
genl_(un)register_family() API. The cb_lock semaphore will be taken for
writing.
3) Thread 1 will call genl_bind() as part of the bind operation to handle
subscribing to GENL multicast groups at the request of the user. It will
attempt to take the cb_lock semaphore for reading, but it will fail and
be scheduled away, waiting for Thread 2 to finish the write.
4) Thread 2 will call netlink_table_grab() during the (un)registration
call. However, as Thread 1 has incremented nl_table_users, it will not
be able to proceed, and both threads will be stuck waiting for the
other.
genl_bind() is a noop, unless a genl_family implements the mcast_bind()
function to handle setting up family-specific multicast operations. Since
no one in-tree uses this functionality as Cong pointed out, simply removing
the genl_bind() function will remove the possibility for deadlock, as there
is no attempt by Thread 1 above to take the cb_lock semaphore.
Fixes: c380d9a7afff ("genetlink: pass multicast bind/unbind to families") Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Johannes Berg <johannes.berg@intel.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Trap handler for syscall tracing reads EFA (Exception Fault Address),
in case strace wants PC of trap instruction (EFA is not part of pt_regs
as of current code).
However this EFA read is racy as it happens after dropping to pure
kernel mode (re-enabling interrupts). A taken interrupt could
context-switch, trigger a different task's trap, clobbering EFA for this
execution context.
Fix this by reading EFA early, before re-enabling interrupts. A slight
side benefit is de-duplication of FAKE_RET_FROM_EXCPN in trap handler.
The trap handler is common to both ARCompact and ARCv2 builds too.
This just came out of code rework/review and no real problem was reported
but is clearly a potential problem specially for strace.
kobject_uevent may allocate memory and it may be called while there are dm
devices suspended. The allocation may recurse into a suspended device,
causing a deadlock. We must set the noio flag when sending a uevent.
The observed deadlock was reported here:
https://www.redhat.com/archives/dm-devel/2020-March/msg00025.html
drivers/gpu/drm/radeon/ci_dpm.c:5652:9: warning: Use of memory after it is freed [unix.Malloc]
kfree(rdev->pm.dpm.ps[i].ps_priv);
^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/radeon/ci_dpm.c:5654:2: warning: Attempt to free released memory [unix.Malloc]
kfree(rdev->pm.dpm.ps);
^~~~~~~~~~~~~~~~~~~~~~
problem is reported in ci_dpm_fini, with these code blocks.
for (i = 0; i < rdev->pm.dpm.num_ps; i++) {
kfree(rdev->pm.dpm.ps[i].ps_priv);
}
kfree(rdev->pm.dpm.ps);
The first free happens in ci_parse_power_table where it cleans up locally
on a failure. ci_dpm_fini also does a cleanup.
ret = ci_parse_power_table(rdev);
if (ret) {
ci_dpm_fini(rdev);
return ret;
}
So remove the cleanup in ci_parse_power_table and
move the num_ps calculation to inside the loop so ci_dpm_fini
will know how many array elements to free.
Fixes: cc8dbbb4f62a ("drm/radeon: add dpm support for CI dGPUs (v2)") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Under somewhat convoluted conditions, it is possible to attempt to
release an extent_buffer that is under io, which triggers a BUG_ON in
btrfs_release_extent_buffer_pages.
This relies on a few different factors. First, extent_buffer reads done
as readahead for searching use WAIT_NONE, so they free the local extent
buffer reference while the io is outstanding. However, they should still
be protected by TREE_REF. However, if the system is doing signficant
reclaim, and simultaneously heavily accessing the extent_buffers, it is
possible for releasepage to race with two concurrent readahead attempts
in a way that leaves TREE_REF unset when the readahead extent buffer is
released.
Essentially, if two tasks race to allocate a new extent_buffer, but the
winner who attempts the first io is rebuffed by a page being locked
(likely by the reclaim itself) then the loser will still go ahead with
issuing the readahead. The loser's call to find_extent_buffer must also
race with the reclaim task reading the extent_buffer's refcount as 1 in
a way that allows the reclaim to re-clear the TREE_REF checked by
find_extent_buffer.
The following represents an example execution demonstrating the race:
CPU0 CPU1 CPU2
reada_for_search reada_for_search
readahead_tree_block readahead_tree_block
find_create_tree_block find_create_tree_block
alloc_extent_buffer alloc_extent_buffer
find_extent_buffer // not found
allocates eb
lock pages
associate pages to eb
insert eb into radix tree
set TREE_REF, refs == 2
unlock pages
read_extent_buffer_pages // WAIT_NONE
not uptodate (brand new eb)
lock_page
if !trylock_page
goto unlock_exit // not an error
free_extent_buffer
release_extent_buffer
atomic_dec_and_test refs to 1
find_extent_buffer // found
try_release_extent_buffer
take refs_lock
reads refs == 1; no io
atomic_inc_not_zero refs to 2
mark_buffer_accessed
check_buffer_tree_ref
// not STALE, won't take refs_lock
refs == 2; TREE_REF set // no action
read_extent_buffer_pages // WAIT_NONE
clear TREE_REF
release_extent_buffer
atomic_dec_and_test refs to 1
unlock_page
still not uptodate (CPU1 read failed on trylock_page)
locks pages
set io_pages > 0
submit io
return
free_extent_buffer
release_extent_buffer
dec refs to 0
delete from radix tree
btrfs_release_extent_buffer_pages
BUG_ON(io_pages > 0)!!!
We observe this at a very low rate in production and were also able to
reproduce it in a test environment by introducing some spurious delays
and by introducing probabilistic trylock_page failures.
To fix it, we apply check_tree_ref at a point where it could not
possibly be unset by a competing task: after io_pages has been
incremented. All the codepaths that clear TREE_REF check for io, so they
would not be able to clear it after this point until the io is done.
It is being reverted upstream, just hasn't made it there yet and is
causing lots of problems.
Reported-by: Hans de Goede <hdegoede@redhat.com> Cc: Qiujun Huang <hqjagain@gmail.com> Cc: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mark CR4.TSD as being possibly owned by the guest as that is indeed the
case on VMX. Without TSD being tagged as possibly owned by the guest, a
targeted read of CR4 to get TSD could observe a stale value. This bug
is benign in the current code base as the sole consumer of TSD is the
emulator (for RDTSC) and the emulator always "reads" the entirety of CR4
when grabbing bits.
Add a build-time assertion in to ensure VMX doesn't hand over more CR4
bits without also updating x86.
Fixes: 52ce3c21aec3 ("x86,kvm,vmx: Don't trap writes to CR4.TSD") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703040422.31536-2-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Inject a #GP on MOV CR4 if CR4.LA57 is toggled in 64-bit mode, which is
illegal per Intel's SDM:
CR4.LA57
57-bit linear addresses (bit 12 of CR4) ... blah blah blah ...
This bit cannot be modified in IA-32e mode.
Note, the pseudocode for MOV CR doesn't call out the fault condition,
which is likely why the check was missed during initial development.
This is arguably an SDM bug and will hopefully be fixed in future
release of the SDM.
Fixes: fd8cb433734ee ("KVM: MMU: Expose the LA57 feature to VM.") Cc: stable@vger.kernel.org Reported-by: Sebastien Boeuf <sebastien.boeuf@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703021714.5549-1-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bit 8 would be the "global" bit, which does not quite make sense for non-leaf
page table entries. Intel ignores it; AMD ignores it in PDEs and PDPEs, but
reserves it in PML4Es.
Probably, earlier versions of the AMD manual documented it as reserved in PDPEs
as well, and that behavior made it into KVM as well as kvm-unit-tests; fix it.
Cc: stable@vger.kernel.org Reported-by: Nadav Amit <namit@vmware.com> Fixes: a0c0feb57992 ("KVM: x86: reserve bit 8 of non-leaf PDPEs and PML4Es in 64-bit mode on AMD", 2014-09-03) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
HVC_SOFT_RESTART is given values for x0-2 that it should installed
before exiting to the new address so should not set x0 to stub HVC
success or failure code.
Fixes: af42f20480bf1 ("arm64: hyp-stub: Zero x0 on successful stub handling") Cc: stable@vger.kernel.org Signed-off-by: Andrew Scull <ascull@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200706095259.1338221-1-ascull@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
PAGE_HYP_DEVICE is intended to encode attribute bits for an EL2 stage-1
pte mapping a device. Unfortunately, it includes PROT_DEVICE_nGnRE which
encodes attributes for EL1 stage-1 mappings such as UXN and nG, which are
RES0 for EL2, and DBM which is meaningless as TCR_EL2.HD is not set.
Fix the definition of PAGE_HYP_DEVICE so that it doesn't set RES0 bits
at EL2.
Acked-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200708162546.26176-1-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have a Dell AIO, there is neither internal speaker nor internal
mic, only a multi-function audio jack on it.
Users reported that after freshly installing the OS and plug
a headset to the audio jack, the headset can't output sound. I
reproduced this bug, at that moment, the Input Source is as below:
Simple mixer control 'Input Source',0
Capabilities: cenum
Items: 'Headphone Mic' 'Headset Mic'
Item0: 'Headphone Mic'
That is because the patch_realtek will set this audio jack as mic_in
mode if Input Source's value is hp_mic.
If it is not fresh installing, this issue will not happen since the
systemd will run alsactl restore -f /var/lib/alsa/asound.state, this
will set the 'Input Source' according to history value.
If there is internal speaker or internal mic, this issue will not
happen since there is valid sink/source in the pulseaudio, the PA will
set the 'Input Source' according to active_port.
To fix this issue, change the parser function to let the hs_mic be
stored ahead of hp_mic.
Change the way the "magic-packet" DT property is handled in the
macb_probe() function, matching DT binding documentation.
Now we mark the device as "wakeup capable" instead of calling the
device_init_wakeup() function that would enable the wakeup source.
For Ethernet WoL, enabling the wakeup_source is done by
using ethtool and associated macb_set_wol() function that
already calls device_set_wakeup_enable() for this purpose.
That would reduce power consumption by cutting more clocks if
"magic-packet" property is set but WoL is not configured by ethtool.
Fixes: 3e2a5e153906 ("net: macb: add wake-on-lan support via magic packet") Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Harini Katakam <harini.katakam@xilinx.com> Cc: Sergio Prado <sergio.prado@e-labworks.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
we need to set 'active_vfs' back to 0, if something goes wrong during the
allocation of SR-IOV resources: otherwise, further VF configurations will
wrongly assume that bp->pf.vf[x] are valid memory locations, and commands
like the ones in the following sequence:
# echo 2 >/sys/bus/pci/devices/${ADDR}/sriov_numvfs
# ip link set dev ens1f0np0 up
# ip link set dev ens1f0np0 vf 0 trust on
Fixes: c0c050c58d840 ("bnxt_en: New Broadcom ethernet driver.") Reported-by: Fei Liu <feliu@redhat.com> CC: Jonathan Toppins <jtoppins@redhat.com> CC: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When adding first socket to nbd, if nsock's allocation failed, the data
structure member "config->socks" was reallocated, but the data structure
member "config->num_connections" was not updated. A memory leak will occur
then because the function "nbd_config_put" will free "config->socks" only
when "config->num_connections" is not zero.
After entering kdb due to breakpoint, when we execute 'ss' or 'go' (will
delay installing breakpoints, do single-step first), it won't work
correctly, and it will enter kdb due to oops.
It's because the reason gotten in kdb_stub() is not as expected, and it
seems that the ex_vector for single-step should be 0, like what arch
powerpc/sh/parisc has implemented.
Before the patch:
Entering kdb (current=0xffff8000119e2dc0, pid 0) on processor 0 due to Keyboard Entry
[0]kdb> bp printk
Instruction(i) BP #0 at 0xffff8000101486cc (printk)
is enabled addr at ffff8000101486cc, hardtype=0 installed=0
[0]kdb> g
/ # echo h > /proc/sysrq-trigger
Entering kdb (current=0xffff0000fa878040, pid 266) on processor 3 due to Breakpoint @ 0xffff8000101486cc
[3]kdb> ss
After the patch:
Entering kdb (current=0xffff8000119e2dc0, pid 0) on processor 0 due to Keyboard Entry
[0]kdb> bp printk
Instruction(i) BP #0 at 0xffff8000101486cc (printk)
is enabled addr at ffff8000101486cc, hardtype=0 installed=0
[0]kdb> g
/ # echo h > /proc/sysrq-trigger
Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to Breakpoint @ 0xffff8000101486cc
[0]kdb> g
Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to Breakpoint @ 0xffff8000101486cc
[0]kdb> ss
Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to SS trap @ 0xffff800010082ab8
[0]kdb>
Fixes: 44679a4f142b ("arm64: KGDB: Add step debugging support") Signed-off-by: Wei Li <liwei391@huawei.com> Tested-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20200509214159.19680-2-liwei391@huawei.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
On partial_drain completion we should be in SNDRV_PCM_STATE_RUNNING
state, so set that for partially draining streams in
snd_compr_drain_notify() and use a flag for partially draining streams
While at it, add locks for stream state change in
snd_compr_drain_notify() as well.
In a case where the ID_REV register read is failed, the memory for a
private data structure has to be freed before returning error from the
function smsc95xx_bind.
Fixes: bbd9f9ee69242 ("smsc95xx: add wol support for more frame types") Signed-off-by: Andre Edich <andre.edich@microchip.com> Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
t4_prep_fw goto bye tag with positive return value when something
bad happened and which can not free resource in adap_init0.
so fix it to return negative value.
Fixes: 16e47624e76b ("cxgb4: Add new scheme to update T4/T5 firmware") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Li Heng <liheng40@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The completion vector index that is given during CQ creation can't
exceed the number of support vectors by the underlying RDMA device. This
violation currently can accure, for example, in case one will try to
connect with N regular read/write queues and M poll queues and the sum
of N + M > num_supported_vectors. This will lead to failure in establish
a connection to remote target. Instead, in that case, share a completion
vector between queues.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
The sense data buffer in sense_buf_pool is allocated with size of
MPT_SENSE_BUFFER_ALLOC(64) (multiplied by req_depth) while SNS_LEN(sc)(96)
is used when reading the data. That may lead to a read from unallocated
area, sometimes from another (unallocated) page. To fix this, limit the
read size to MPT_SENSE_BUFFER_ALLOC.
Link: https://lore.kernel.org/r/20200616150446.4840-1-thenzl@redhat.com Co-developed-by: Stanislav Saner <ssaner@redhat.com> Signed-off-by: Stanislav Saner <ssaner@redhat.com> Signed-off-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
if of_find_device_by_node() succeed, imx6q_suspend_init() doesn't have a
corresponding put_device(). Thus add a jump target to fix the exception
handling for this function implementation.
Currently if early_pgm_check_handler is called it ends up in pgm check
loop. The problem is that early_pgm_check_handler is instrumented by
KASAN but executed without DAT flag enabled which leads to addressing
exception when KASAN checks try to access shadow memory.
Fix that by executing early handlers with DAT flag on under KASAN as
expected.
READ_ONCE should be used when reading rings prior to accessing the
statistics pointer. Introduce this as well as the corresponding WRITE_ONCE
usage when allocating and freeing the rings, to ensure protected access.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If an spi device is unbounded from the driver before the release
process, there will be an NULL pointer reference when it's
referenced in spi_slave_abort().
Fix it by checking it's already freed before reference.
Currently when a host1x device driver is unregistered, it is not
detached from the host1x controller, which means that the device
will stay around and when the driver is registered again, it may
bind to the old, stale device rather than the new one that was
created from scratch upon driver registration. This in turn can
cause various weird crashes within the driver core because it is
confronted with a device that was already deleted.
Fix this by detaching the driver from the host1x controller when
it is unregistered. This ensures that the deleted device also is
no longer present in the device list that drivers will bind to.
We can currently sometimes get "RXS timed out" errors and "EOT timed out"
errors with spi transfers.
These errors can be made easy to reproduce by reading the cpcap iio
values in a loop while keeping the CPUs busy by also reading /dev/urandom.
The "RXS timed out" errors we can fix by adding spi-cpol and spi-cpha
in addition to the spi-cs-high property we already have.
The "EOT timed out" errors we can fix by increasing the spi clock rate
to 9.6 MHz. Looks similar MC13783 PMIC says it works at spi clock rates
up to 20 MHz, so let's assume we can pick any rate up to 20 MHz also
for cpcap.
Cc: maemo-leste@lists.dyne.org Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If shared interrupt comes late, during probe error path or device remove
(could be triggered with CONFIG_DEBUG_SHIRQ), the interrupt handler
dspi_interrupt() will access registers with the clock being disabled.
This leads to external abort on non-linefetch on Toradex Colibri VF50
module (with Vybrid VF5xx):
The resource-managed framework should not be used for shared interrupt
handling, because the interrupt handler might be called after releasing
other resources and disabling clocks.
Similar bug could happen during suspend - the shared interrupt handler
could be invoked after suspending the device. Each device sharing this
interrupt line should disable the IRQ during suspend so handler will be
invoked only in following cases:
1. None suspended,
2. All devices resumed.
Fixes: 349ad66c0ab0 ("spi:Add Freescale DSPI driver for Vybrid VF610 platform") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200622110543.5035-3-krzk@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Some SoC share one irq number between DSPI controllers.
For example, on the LX2160 board, DSPI0 and DSPI1 share one irq number.
In this case, only one DSPI controller can register successfully,
and others will fail.
Signed-off-by: Chuanhua Han <chuanhua.han@nxp.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
During device removal, the driver should unregister the SPI controller
and stop the hardware. Otherwise the dspi_transfer_one_message() could
wait on completion infinitely.
Additionally, calling spi_unregister_controller() first in device
removal reverse-matches the probe function, where SPI controller is
registered at the end.
Fixes: 05209f457069 ("spi: fsl-dspi: add missing clk_disable_unprepare() in dspi_remove()") Reported-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200622110543.5035-1-krzk@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The current number of KVM_IRQCHIP_NUM_PINS results in an order 3
allocation (32kb) for each guest start/restart. This can result in OOM
killer activity even with free swap when the memory is fragmented
enough:
As far as I can tell s390x does not use the iopins as we bail our for
anything other than KVM_IRQ_ROUTING_S390_ADAPTER and the chip/pin is
only used for KVM_IRQ_ROUTING_IRQCHIP. So let us use a small number to
reduce the memory footprint.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200617083620.5409-1-borntraeger@de.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
In most cases, such as CONFIG_ACPI_CUSTOM_DSDT and
CONFIG_ACPI_TABLE_UPGRADE, boot-time modifications to firmware tables
are tied to specific Kconfig options. Currently this is not the case
for modifying the ACPI SSDT via the efivar_ssdt kernel command line
option and associated EFI variable.
This patch adds CONFIG_EFI_CUSTOM_SSDT_OVERLAYS, which defaults
disabled, in order to allow enabling or disabling that feature during
the build.
The GIC driver uses a RMW sequence to update the affinity, and
relies on the gic_lock_irqsave/gic_unlock_irqrestore sequences
to update it atomically.
But these sequences only expand into anything meaningful if
the BL_SWITCHER option is selected, which almost never happens.
It also turns out that using a RMW and locks is just as silly,
as the GIC distributor supports byte accesses for the GICD_TARGETRn
registers, which when used make the update atomic by definition.
Drop the terminally broken code and replace it by a byte write.
Fixes: 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature") Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This resolves the hazard between the mtc0 in the change_c0_status() and
the mfc0 in configure_exception_vector(). Without resolving this hazard
configure_exception_vector() could read an old value and would restore
this old value again. This would revert the changes change_c0_status()
did. I checked this by printing out the read_c0_status() at the end of
per_cpu_trap_init() and the ST0_MX is not set without this patch.
The hazard is documented in the MIPS Architecture Reference Manual Vol.
III: MIPS32/microMIPS32 Privileged Resource Architecture (MD00088), rev
6.03 table 8.1 which includes:
I saw this hazard on an Atheros AR9344 rev 2 SoC with a MIPS 74Kc CPU.
There the change_c0_status() function would activate the DSPen by
setting ST0_MX in the c0_status register. This was reverted and then the
system got a DSP exception when the DSP registers were saved in
save_dsp() in the first process switch. The crash looks like this:
We saw this problem in OpenWrt only on the MIPS 74Kc based Atheros SoCs,
not on the 24Kc based SoCs. We only saw it with kernel 5.4 not with
kernel 4.19, in addition we had to use GCC 8.4 or 9.X, with GCC 8.3 it
did not happen.
In the kernel I bisected this problem to commit 9012d011660e ("compiler:
allow all arches to enable CONFIG_OPTIMIZE_INLINING"), but when this was
reverted it also happened after commit 172dcd935c34b ("MIPS: Always
allocate exception vector for MIPSr2+").
Commit 0b24cae4d535 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.")
does similar changes to a different file. I am not sure if there are
more places affected by this problem.
When xfstest generic/035, we found the target file was deleted
if the rename return -EACESS.
In cifs_rename2, we unlink the positive target dentry if rename
failed with EACESS or EEXIST, even if the target dentry is positived
before rename. Then the existing file was deleted.
We should just delete the target file which created during the
rename.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- persistent handles will only be enabled for per-user tcons if the
server advertises the 'Continuous Availabity' capability
- resilient handles would never be enabled for per-user tcons
Signed-off-by: Paul Aurich <paul@darkrain42.org> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ensure multiuser SMB3 mounts use encryption for all users' tcons if the
mount options are configured to require encryption. Without this, only
the primary tcon and IPC tcons are guaranteed to be encrypted. Per-user
tcons would only be encrypted if the server was configured to require
encryption.
Signed-off-by: Paul Aurich <paul@darkrain42.org> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>