For at least some modems like the TELIT LE910, skipping SOF makes
transfers blocking indefinitely after a short amount of data
transferred.
Given the small improvement provided by skipping the SOF (just one
byte on about 100 bytes), it seems better to completely remove this
"feature" than make it optional.
Building a kernel with clang sometimes fails with an objtool error in dlm:
fs/dlm/lock.o: warning: objtool: revert_lock_pc()+0xbd: can't find jump dest instruction at .text+0xd7fc
The problem is that BUG() never returns and the compiler knows
that anything after it is unreachable, however the panic still
emits some code that does not get fully eliminated.
Having both BUG() and panic() is really pointless as the BUG()
kills the current process and the subsequent panic() never hits.
In most cases, we probably don't really want either and should
replace the DLM_ASSERT() statements with WARN_ON(), as has
been done for some of them.
Remove the BUG() here so the user at least sees the panic message
and we can reliably build randconfig kernels.
Fixes: e7fd41792fc0 ("[DLM] The core of the DLM for GFS2/CLVM") Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: clang-built-linux@googlegroups.com Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the probe function, in case of error, resources allocated in
'lp8788_setup_adc_channel()' must be released.
This can be achieved easily by using the devm_ variant of
'iio_channel_get()'.
This has the extra benefit to simplify the remove function and to axe the
'lp8788_release_adc_channel()' function which is now useless.
Fixes: 98a276649358 ("power_supply: Add new lp8788 charger driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The issue happens because no real ISP reset is executed. The
qla2x00_abort_isp(scsi_qla_host_t *vha) function expects that
vha->flags.online will be not zero for ISP reset procedure. This patch
sets vha->flags.online to 1 before calling ->abort_isp() for starting the
ISP reset.
7d715a6c1ae5 ("PCI: add PCI Express ASPM support") added the ability for
Linux to enable ASPM, but for some undocumented reason, it didn't enable
ASPM on links where the downstream component is a PCIe-to-PCI/PCI-X Bridge.
Remove this exclusion so we can enable ASPM on these links.
The Dell OptiPlex 7080 mentioned in the bugzilla has a TI XIO2001
PCIe-to-PCI Bridge. Enabling ASPM on the link leading to it allows the
Intel SoC to enter deeper Package C-states, which is a significant power
savings.
The outbound windows (PCIEPAUR(x), PCIEPALR(x)) describe a mapping between
a CPU address (which is determined by the window number 'x') and a
programmed PCI address - Thus allowing the controller to translate CPU
accesses into PCI accesses.
However the existing code incorrectly writes the CPU address - lets fix
this by writing the PCI address instead.
For memory transactions, existing DT users describe a 1:1 identity mapping
and thus this change should have no effect. However the same isn't true for
I/O.
Link: https://lore.kernel.org/r/20191004132941.6660-1-andrew.murray@arm.com Fixes: c25da4778803 ("PCI: rcar: Add Renesas R-Car PCIe driver") Tested-by: Marek Vasut <marek.vasut+renesas@gmail.com> Signed-off-by: Andrew Murray <andrew.murray@arm.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Marek Vasut <marek.vasut+renesas@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If platform bus driver registration is failed then, accessing
platform bus spin lock (&drv->driver.bus->p->klist_drivers.k_lock)
in __platform_driver_probe() without verifying the return value
__platform_driver_register() can lead to NULL pointer exception.
So check the return value before attempting the spin lock.
One such example is below:
For a custom usecase, I have intentionally failed the platform bus
registration and I expected all the platform device/driver
registrations to fail gracefully. But I came across this panic
issue.
Which seems to be due to the fact that after allocating the uap
structure, nothing initializes the spinlock.
Its a little confusing, as uart_port_spin_lock_init() is one
place where the lock is supposed to be initialized, but it has
an exception for the case where the port is a console.
This makes it seem like a deeper fix is needed to properly
register the console, but I'm not sure what that entails, and
Andy suggested that this approach is less invasive.
Thus, this patch resolves the issue by initializing the spinlock
in the driver, and resolves the resulting warning.
Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Jiri Slaby <jslaby@suse.com> Cc: linux-serial@vger.kernel.org Reported-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-and-tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lore.kernel.org/r/20200428184050.6501-1-john.stultz@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The IRQ log output is supposed to appear on a single line. However,
commit 3a2dc1677b60 ("i2c: pxa: Update debug function to dump more info
on error") resulted in it being printed one-entry-per-line, which is
excessively long.
Fixing this is not a trivial matter; using pr_cont() doesn't work as
the previous dev_dbg() may not have been compiled in, or may be
dynamic.
Since the rest of this function output is at error level, and is also
debug output, promote this to error level as well to avoid this
problem.
Reduce the number of always zero prefix digits to save screen real-
estate.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
On error the function ti_bandgap_get_sensor_data() returns the error
code in ERR_PTR() but we only checked if the return value is NULL or
not. And, so we can dereference an error code inside ERR_PTR.
While at it, convert a check to IS_ERR_OR_NULL.
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200424161944.6044-1-sudipm.mukherjee@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Potentially, hvc_open() can be called in parallel when two tasks calls
open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
callback in the function fails, where it sets the tty->driver_data to
NULL, the parallel hvc_open() can see this NULL and cause a memory abort.
Hence, serialize hvc_open and check if tty->private_data is NULL before
proceeding ahead.
The issue can be easily reproduced by launching two tasks simultaneously
that does nothing but open() and close() on /dev/hvcX.
For example:
$ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
qdio_establish() calls qdio_setup_thinint() via qdio_setup_irq().
If the subsequent qdio_establish_thinint() fails, we miss to put the
DSCI again. Thus the DSCI isn't available for re-use. Given enough of
such errors, we could end up with having only the shared DSCI available.
Merge qdio_setup_thinint() into qdio_establish_thinint(), and deal with
such an error internally.
For computation of the the next frame size current value of fs/fps and
accumulated fractional parts of fs/fps are used, where values are stored
in Q16.16 format. This is quite natural for computing frame size for
asynchronous endpoints driven by explicit feedback, since in this case
fs/fps is a value provided by the feedback endpoint and it's already in
the Q format. If an error is accumulated over time, the device can
adjust fs/fps value to prevent buffer overruns/underruns.
But for synchronous endpoints the accuracy provided by these computations
is not enough. Due to accumulated error the driver periodically produces
frames with incorrect size (+/- 1 audio sample).
This patch fixes this issue by implementing a different algorithm for
frame size computation. It is based on accumulating of the remainders
from division fs/fps and it doesn't accumulate errors over time. This
new method is enabled for synchronous and adaptive playback endpoints.
For an unreachable target, offload_work is not initialized and the endpoint
state is set to OFLDCONN_NONE. This results in a WARN_ON due to the check
of the work function field being set to zero.
The adapter info MAD is used to send the client info and receive the host
info as a response. A persistent buffer is used and as such the client info
is overwritten after the response. During the course of a normal adapter
reset the client info is refreshed in the buffer in preparation for sending
the adapter info MAD.
However, in the special case of LPM where we reenable the CRQ instead of a
full CRQ teardown and reset we fail to refresh the client info in the
adapter info buffer. As a result, after Live Partition Migration (LPM) we
erroneously report the host's info as our own.
1. If a task is attached to a unconfined profile that is not the
ns->unconfined profile then. Mode the mode is always reported
as -
$ ps -Z
LABEL PID TTY TIME CMD
unconfined 1287 pts/0 00:00:01 bash
test (-) 1892 pts/0 00:00:00 ps
instead of the correct value of (unconfined) as shown below
$ ps -Z
LABEL PID TTY TIME CMD
unconfined 2483 pts/0 00:00:01 bash
test (unconfined) 3591 pts/0 00:00:00 ps
2. if a task is confined by a stack of profiles that are unconfined
the output of label mode is again the incorrect value of (-) like
above, instead of (unconfined). This is because the visibile
profile count increment is skipped by the special casing of
unconfined.
Fixes: f1bd904175e8 ("apparmor: add the base fns() for domain labels") Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When System.map was generated, the kernel used mksysmap to
filter the kernel symbols, but all the symbols with the
second letter 'L' in the kernel were filtered out, not just
the symbols starting with 'dot + L'.
For example:
ashimida@ubuntu:~/linux$ cat System.map |grep ' .L'
ashimida@ubuntu:~/linux$ nm -n vmlinux |grep ' .L' ffff0000088028e0 t bLength_show
...... ffff0000092e0408 b PLLP_OUTC_lock ffff0000092e0410 b PLLP_OUTA_lock
The original intent should be to filter out all local symbols
starting with '.L', so the dot should be escaped.
Fixes: 00902e984732 ("mksysmap: Add h8300 local symbol pattern") Signed-off-by: ashimida <ashimida@linux.alibaba.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the commit adding ntb_default_port_number() and
ntb_default_peer_port_number() entered the kernel there was no
users of it so it was impossible to tell what the API needed.
When a user finally landed a year later (ntb_pingpong) there were
more NTB topologies were created and no consideration was considered
to how other drivers had changed.
Now that there is a user it can be fixed to provide a sensible default
for the legacy drivers that do not implement ntb_{peer_}port_number().
Seeing ntb_pingpong doesn't check error codes returning EINVAL was also
not sensible.
Patches for ntb_pingpong and ntb_perf follow (which are broken
otherwise) to support hardware that doesn't have port numbers. This is
important not only to not break support with existing drivers but for
the cross link topology which, due to its perfect symmetry, cannot
assign unique port numbers to each side.
Fixes: 1e5301196a88 ("NTB: Add indexed ports NTB API") Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Allen Hubbe <allenbh@gmail.com> Tested-by: Alexander Fomichev <fomichev.ru@gmail.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: Sasha Levin <sashal@kernel.org>
If register_netdev(dev) fails, free_netdev(dev) needs
to be called, otherwise a memory leak will occur.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
'mem=" option is an easy way to put high pressure on memory during
some test. Hence after applying the memory limit, instead of total
mem, the actual usable memory should be considered when reserving mem
for crashkernel. Otherwise the boot up may experience OOM issue.
E.g. it would reserve 4G prior to the change and 512M afterward, if
passing
crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G",
and mem=5G on a 256G machine.
This issue is powerpc specific because it puts higher priority on
fadump and kdump reservation than on "mem=". Referring the following
code:
if (fadump_reserve_mem() == 0)
reserve_crashkernel();
...
/* Ensure that total memory size is page-aligned. */
limit = ALIGN(memory_limit ?: memblock_phys_mem_size(), PAGE_SIZE);
memblock_enforce_memory_limit(limit);
While on other arches, the effect of "mem=" takes a higher priority
and pass through memblock_phys_mem_size() before calling
reserve_crashkernel().
nfsd4_process_cb_update() invokes svc_xprt_get(), which increases the
refcount of the "c->cn_xprt".
The reference counting issue happens in one exception handling path of
nfsd4_process_cb_update(). When setup callback client failed, the
function forgets to decrease the refcnt increased by svc_xprt_get(),
causing a refcnt leak.
Fix this issue by calling svc_xprt_put() when setup callback client
failed.
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 2b206ee6b0df ("powerpc/perf/hv-24x7: Display change in counter
values")' added to print _change_ in the counter value rather then raw
value for 24x7 counters. Incase of transactions, the event count
is set to 0 at the beginning of the transaction. It also sets
the event's prev_count to the raw value at the time of initialization.
Because of setting event count to 0, we are seeing some weird behaviour,
whenever we run multiple 24x7 events at a time.
As we are setting event_count to 0, for interval case, overall event_count is not
coming in incremental order. As we may can get new delta lesser then previous count.
Because of which when we print intervals, we are getting negative value which create
these large values.
This patch removes part where we set event_count to 0 in function
'h_24x7_event_read'. There won't be much impact as we do set event->hw.prev_count
to the raw value at the time of initialization to print change value.
In order to create or activate a new node, lpfc_els_unsol_buffer() invokes
lpfc_nlp_init() or lpfc_enable_node() or lpfc_nlp_get(), all of them will
return a reference of the specified lpfc_nodelist object to "ndlp" with
increased refcnt.
When lpfc_els_unsol_buffer() returns, local variable "ndlp" becomes
invalid, so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one exception handling path of
lpfc_els_unsol_buffer(). When "ndlp" in DEV_LOSS, the function forgets to
decrease the refcnt increased by lpfc_nlp_init() or lpfc_enable_node() or
lpfc_nlp_get(), causing a refcnt leak.
Fix this issue by calling lpfc_nlp_put() when "ndlp" in DEV_LOSS.
Link: https://lore.kernel.org/r/1590416184-52592-1-git-send-email-xiyuyang19@fudan.edu.cn Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
WM8994 chip has built-in regulators, which might be used for chip
operation. They are controlled by a separate wm8994-regulator driver,
which should be loaded before this driver calls regulator_get(), because
that driver also provides consumer-supply mapping for the them. If that
driver is not yet loaded, regulator core substitute them with dummy
regulator, what breaks chip operation, because the built-in regulators are
never enabled. Fix this by annotating this driver with MODULE_SOFTDEP()
"pre" dependency to "wm8994_regulator" module.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Since commit dcebd755926b ("block: use bio_for_each_bvec() to compute
multi-page bvec count"), the kernel will bug_on on the PS3 because
bio_split() is called with sectors == 0:
The problem originates from setting the segment boundary of the
request queue to -1UL. This makes get_max_segment_size() return zero
when offset is zero, whatever the max segment size. The test with
BLK_SEG_BOUNDARY_MASK fails and 'mask - (mask & offset) + 1' overflows
to zero in the return statement.
Not setting the segment boundary and using the default
value (BLK_SEG_BOUNDARY_MASK) fixes the problem.
Trying to change Link Status register does not have any effect as this
is a read-only register. Trying to overwrite bits for Negotiated Link
Width does not make sense.
In future proper change of link width can be done via Lane Count Select
bits in PCIe Control 0 register.
Trying to unconditionally enable ASPM L0s via ASPM Control bits in Link
Control register is wrong. There should be at least some detection if
endpoint supports L0s as isn't mandatory.
Moreover ASPM Control bits in Link Control register are controlled by
pcie/aspm.c code which sets it according to system ASPM settings,
immediately after aardvark driver probes. So setting these bits by
aardvark driver has no long running effect.
Remove code which touches ASPM L0s bits from this driver and let
kernel's ASPM implementation to set ASPM state properly.
Some users are reporting issues that this code is problematic for some
Intel wifi cards and removing it fixes them, see e.g.:
https://bugzilla.kernel.org/show_bug.cgi?id=196339
If problems with Intel wifi cards occur even after this commit, then
pcie/aspm.c code could be modified / hooked to not enable ASPM L0s state
for affected problematic cards.
Link: https://lore.kernel.org/r/20200430080625.26070-3-pali@kernel.org Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
SCSI LUN passthrough code such as qemu's "scsi-block" device model
pass every IO to the host via SG_IO ioctls. Currently, dm-multipath
calls choose_pgpath() only in the block IO code path, not in the ioctl
code path (unless current_pgpath is NULL). This has the effect that no
path switching and thus no load balancing is done for SCSI-passthrough
IO, unless the active path fails.
Fix this by using the same logic in multipath_prepare_ioctl() as in
multipath_clone_and_map().
Note: The allegedly best path selection algorithm, service-time,
still wouldn't work perfectly, because the io size of the current
request is always set to 0. Changing that for the IO passthrough
case would require the ioctl cmd and arg to be passed to dm's
prepare_ioctl() method.
Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If we timeout during a message transfer, the control register may
contain bits that cause an action to be set. Read-modify-writing the
register leaving these bits set may trigger the hardware to attempt
one of these actions unintentionally.
Always clear these bits when cleaning up after a message or after
a timeout.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Limit the output of humidity compensation to the range between 0 and 100
percent.
Depending on the calibration parameters of the individual sensor it
happens, that a humidity above 100 percent or below 0 percent is
calculated, which don't make sense in terms of relative humidity.
Add a clamp to the compensation formula as described in the datasheet of
the sensor in chapter 4.2.3.
Although this clamp is documented, it was never in the driver of the
kernel.
It depends on the circumstances (calibration parameters, temperature,
humidity) if one can see a value above 100 percent without the clamp.
The writer of this patch was working with this type of sensor without
noting this error. So it seems to be a rare event when this bug occures.
Signed-off-by: Andreas Klinger <ak@it-klinger.de> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The goal of the following command sequence is to restart the adapter.
However, the tgt_stop flag remains set, indicating that the adapter is
still in stopping state even after re-enabling it.
qlt_handle_cmd_for_atio() rejects the request to send commands because the
adapter is in the stopping state:
kernel: PID 0:qla_target.c:4442 qlt_handle_cmd_for_atio(): tgt_stop 0x1, tgt_stopped 0x0
kernel: qla2xxx [0001:00:02.0]-3861:1: PID 0:qla_target.c:4447: New command while device c000000005314600 is shutting down
kernel: qla2xxx [0001:00:02.0]-e85f:1: PID 0:qla_target.c:5728: qla_target: Unable to send command to target
This patch calls qla_stop_phase2() in addition to qlt_stop_phase1() in
tcm_qla2xxx_tpg_enable_store() and tcm_qla2xxx_npiv_tpg_enable_store(). The
qlt_stop_phase1() marks adapter as stopping (tgt_stop == 0x1, tgt_stopped
== 0x0) but qlt_stop_phase2() marks adapter as stopped (tgt_stop == 0x0,
tgt_stopped == 0x1).
Link: https://lore.kernel.org/r/52be1e8a3537f6c5407eae3edd4c8e08a9545ea5.camel@yadro.com Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Viacheslav Dubeyko <v.dubeiko@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Smatch complains that the "path_data->handle" variable is user controlled.
It comes from iscsi_set_path() so that seems possible. It's harmless to
add a limit check.
The qedi->ep_tbl[] array has qedi->max_active_conns elements (which is
always ISCSI_MAX_SESS_PER_HBA (4096) elements). The array is allocated in
the qedi_cm_alloc_mem() function.
Link: https://lore.kernel.org/r/20200428131939.GA696531@mwanda Fixes: ace7f46ba5fd ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.") Acked-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The CMA and DMA_CMA Kconfig options need to be selected
by the Integrator in order to produce boot console on some
Integrator systems.
The REGULATOR and REGULATOR_FIXED_VOLTAGE need to be
selected in order to boot the system from an external
MMC card when using MMCI/PL181 from the device tree
probe path.
Select these things directly from the Kconfig so we are
sure to be able to bring the systems up with console
from any device tree.
davinci_mcasp_get_dma_type() invokes dma_request_chan(), which returns a
reference of the specified dma_chan object to "chan" with increased
refcnt.
When davinci_mcasp_get_dma_type() returns, local variable "chan" becomes
invalid, so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one exception handling path of
davinci_mcasp_get_dma_type(). When chan device is NULL, the function
forgets to decrease the refcnt increased by dma_request_chan(), causing
a refcnt leak.
Fix this issue by calling dma_release_channel() when chan device is
NULL.
Fix this by ensuring that the regulators are disabled, if enabled, on
probe failure.
Finally, ensure that the vddio regulator is disabled in the driver
remove handler.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
During the process of debugging a processor derived from the msm8916 which
we found the new processor was not starting one of its PLLs.
After tracing the addresses and writes that downstream was doing and
comparing to upstream it became obvious that we were writing to a different
register location than downstream when trying to configure the PLL.
This error is also present in upstream msm8916.
As an example clk-pll.c::clk_pll_recalc_rate wants to write to
pll->config_reg updating the bit-field POST_DIV_RATIO. That bit-field is
defined in PLL_USER_CTL not in PLL_CONFIG_CTL. Taking the BIMC PLL as an
example
If ida_simple_get() returns an error when called in rproc_alloc(),
put_device() is called to clean things up. By this time the rproc
device type has been assigned, with rproc_type_release() as the
release function.
The first thing rproc_type_release() does is call:
idr_destroy(&rproc->notifyids);
But at the time the ida_simple_get() call is made, the notifyids
field in the remoteproc structure has not been initialized.
I'm not actually sure this case causes an observable problem, but
it's incorrect. Fix this by initializing the notifyids field before
calling ida_simple_get() in rproc_alloc().
Fixes: b5ab5e24e960 ("remoteproc: maintain a generic child device for each rproc") Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Reviewed-by: Suman Anna <s-anna@ti.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20200415204858.2448-2-mathieu.poirier@linaro.org Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
because DEBUG_SHIRQ mechanism fires an IRQ before registration and drivers
ought to be able to handle an interrupt happening before request_irq() returns.
Fixes: aae953949651 ("iio: pressure: bmp280: add support for BMP085 EOC interrupt") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The AMD X370 and other AM4 chipsets (A/B/X 3/4/5 parts) and Threadripper
equivalents have a secondary SMBus controller at I/O port address
0x0B20. This bus is used by several manufacturers to control
motherboard RGB lighting via embedded controllers. I have been using
this bus in my OpenRGB project to control the Aura RGB on many
motherboards and ASRock also uses this bus for their Polychrome RGB
controller.
I am not aware of any CZ-compatible platforms which do not have the
second SMBus channel. All of AMD's AM4- and Threadripper- series
chipsets that OpenRGB users have tested appear to have this secondary
bus. I also noticed this secondary bus is present on older AMD
platforms including my FM1 home server.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202587 Signed-off-by: Adam Honse <calcprogrammer1@gmail.com> Reviewed-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Tested-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
round_down() can only round to powers of 2. If round_down() is asked
to round to something that is not a power of 2, incorrect results are
produced. The incorrect results can be both too large and too small.
Instead, use rounddown() which can round to any number.
regmap is a library function that gets selected by drivers that need
it. No driver modules should depend on it. Depending on REGMAP_I2C makes
this driver only build if another driver already selected REGMAP_I2C,
as the symbol can't be selected through the menu kernel configuration.
Fixes: 2219a935963e ("power_supply: Add TI BQ24257 charger driver") Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If both the tracer and the tracee are compat processes, and gprs[2]
is assigned a value by __poke_user_compat, then the higher 32 bits
of gprs[2] are cleared, IS_ERR_VALUE() always returns false, and
syscall_get_error() always returns 0.
Fix the implementation by sign-extending the value for compat processes
the same way as x86 implementation does.
The bug was exposed to user space by commit 201766a20e30f ("ptrace: add
PTRACE_GET_SYSCALL_INFO request") and detected by strace test suite.
This change fixes strace syscall tampering on s390.
Reportedly, from 19.10 Ubuntu has begun mixing up the location of some
debug symbol files, putting files expected to be in
/usr/lib/debug/usr/lib into /usr/lib/debug/lib instead. Fix by adding
another dso_binary_type.
Example on Ubuntu 20.04
Before:
$ perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.030 MB perf.data ]
$ perf script --call-trace | head -5
uname 14003 [005] 15321.764958566: cbr: 42 freq: 4219 MHz (156%)
uname 14003 [005] 15321.764958566: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) 7f1e71cc4100
uname 14003 [005] 15321.764961566: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) 7f1e71cc4df0
uname 14003 [005] 15321.764961900: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) 7f1e71cc4e18
uname 14003 [005] 15321.764963233: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) 7f1e71cc5128
Fix to check kprobe blacklist address correctly with relocated address
by adjusting debuginfo address.
Since the address in the debuginfo is same as objdump, it is different
from relocated kernel address with KASLR. Thus, 'perf probe' always
misses to catch the blacklisted addresses.
Without this patch, 'perf probe' can not detect the blacklist addresses
on a KASLR enabled kernel.
# perf probe kprobe_dispatcher
Failed to write event: Invalid argument
Error: Failed to add events.
#
With this patch, it correctly shows the error message.
# perf probe kprobe_dispatcher
kprobe_dispatcher is blacklisted function, skip it.
Probe point 'kprobe_dispatcher' not found.
Error: Failed to add events.
#
Fixes: 9aaf5a5f479b ("perf probe: Check kprobes blacklist when adding new events") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/158763966411.30755.5882376357738273695.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a probe point is expanded to several places (like inlined) and if
some of them are skipped because of blacklisted or __init function,
those trace_events has no event name. It must be skipped while showing
results.
Without this fix, you can see "(null):(null)" on the list,
# ./perf probe request_resource
reserve_setup is out of .text, skip it.
Added new events:
(null):(null) (on request_resource)
probe:request_resource (on request_resource)
You can now use it in all perf tools, such as:
perf record -e probe:request_resource -aR sleep 1
#
With this fix, it is ignored:
# ./perf probe request_resource
reserve_setup is out of .text, skip it.
Added new events:
probe:request_resource (on request_resource)
You can now use it in all perf tools, such as:
perf record -e probe:request_resource -aR sleep 1
#
Fixes: 5a51fcd1f30c ("perf probe: Skip kernel symbols which is out of .text") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/158763968263.30755.12800484151476026340.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
nand_cleanup() is supposed to be called on error after a successful
call to nand_scan() to free all NAND resources.
There is no real Fixes tag applying here as the use of nand_release()
in this driver predates by far the introduction of nand_cleanup() in
commit d44154f969a4 ("mtd: nand: Provide nand_cleanup() function to free NAND related resources")
which makes this change possible, hence pointing it as the commit to
fix for backporting purposes, even if this commit is not introducing
any bug.
Fixes: d44154f969a4 ("mtd: nand: Provide nand_cleanup() function to free NAND related resources") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-mtd/20200519130035.1883-41-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
gss_mech_register() calls svcauth_gss_register_pseudoflavor() for each
flavour, but gss_mech_unregister() does not call auth_domain_put().
This is unbalanced and makes it impossible to reload the module.
Change svcauth_gss_register_pseudoflavor() to return the registered
auth_domain, and save it for later release.
Cc: stable@vger.kernel.org (v2.6.12+) Link: https://bugzilla.kernel.org/show_bug.cgi?id=206651 Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no valid case for supporting duplicate pseudoflavor
registrations.
Currently the silent acceptance of such registrations is hiding a bug.
The rpcsec_gss_krb5 module registers 2 flavours but does not unregister
them, so if you load, unload, reload the module, it will happily
continue to use the old registration which now has pointers to the
memory were the module was originally loaded. This could lead to
unexpected results.
So disallow duplicate registrations.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206651 Cc: stable@vger.kernel.org (v2.6.12+) Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
$(CONFIG_MODVERSIONS) is always empty because it is expanded before
include/config/auto.conf is included. Hence, 'make modules' with
CONFIG_MODVERSION=y cannot record the version CRCs.
This has been broken since 2003, commit ("kbuild: Enable modules to be
build using the "make dir/" syntax"). [1]
At boot the FSCR is initialised via one of two paths. On most systems
it's set to a hard coded value in __init_FSCR().
On newer skiboot systems we use the device tree CPU features binding,
where firmware can tell Linux what bits to set in FSCR (and HFSCR).
In both cases the value that's configured at boot is not propagated
into the init_task.thread.fscr value prior to the initial fork of init
(pid 1), which means the value is not used by any processes other than
swapper (the idle task).
For the __init_FSCR() case this is OK, because the value in
init_task.thread.fscr is initialised to something sensible. However it
does mean that the value set in __init_FSCR() is not used other than
for swapper, which is odd and confusing.
The bigger problem is for the device tree CPU features case it
prevents firmware from setting (or clearing) FSCR bits for use by user
space. This means all existing kernels can not have features
enabled/disabled by firmware if those features require
setting/clearing FSCR bits.
We can handle both cases by saving the FSCR value into
init_task.thread.fscr after we have initialised it at boot. This fixes
the bug for device tree CPU features, and will allow us to simplify
the initialisation for the __init_FSCR() case in a future patch.
Fixes: 5a61ef74f269 ("powerpc/64s: Support new device tree binding for discovering CPU features") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200527145843.2761782-3-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The device tree CPU features binding includes FSCR bit numbers which
Linux is instructed to set by firmware.
Whether that's a good idea or not, in the case of the DSCR the Linux
implementation has a hard requirement that the FSCR_DSCR bit not be
set by default. We use it to track when a process reads/writes to
DSCR, so it must be clear to begin with.
So if firmware tells us to set FSCR_DSCR we must ignore it.
Currently this does not cause a bug in our DSCR handling because the
value of FSCR that the device tree CPU features code establishes is
only used by swapper. All other tasks use the value hard coded in
init_task.thread.fscr.
However we'd like to fix that in a future commit, at which point this
will become necessary.
Fixes: 5a61ef74f269 ("powerpc/64s: Support new device tree binding for discovering CPU features") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200527145843.2761782-2-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
create_cpu_loop() calls smu_sat_get_sdb_partition() which does
kmalloc() and returns the allocated buffer. In fact it's called twice,
and neither buffer is freed.
The PL310 Auxiliary Control Register shouldn't have the "Full line of
zero" optimization bit being set before L2 cache is enabled. The L2X0
driver takes care of enabling the optimization by itself.
This patch fixes a noisy error message on Tegra20 and Tegra30 telling
that cache optimization is erroneously enabled without enabling it for
the CPU:
L2C-310: enabling full line of zeros but not enabled in Cortex-A9
cpu_pm_notify() is basically a wrapper of notifier_call_chain().
notifier_call_chain() doesn't initialize *nr_calls to 0 before it
starts incrementing it--presumably it's up to the callers to do this.
Unfortunately the callers of cpu_pm_notify() don't init *nr_calls.
This potentially means you could get too many or two few calls to
CPU_PM_ENTER_FAILED or CPU_CLUSTER_PM_ENTER_FAILED depending on the
luck of the stack.
Let's fix this.
Fixes: ab10023e0088 ("cpu_pm: Add cpu power management notifiers") Cc: stable@vger.kernel.org Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20200504104917.v6.3.I2d44fc0053d019f239527a4e5829416714b7e299@changeid Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
queue_limits::logical_block_size got changed from unsigned short to
unsigned int, but it was forgotten to update crypt_io_hints() to use the
new type. Fix it.
Fixes: ad6bf88a6c19 ("block: fix an integer overflow in logical block size") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We do need access_process_vm() to access the target's reg_window.
However, access to caller's memory (storing the result in
genregs32_get(), fetching the new values in case of genregs32_set())
should be done by normal uaccess primitives.
Fixes: ad4f95764040 ([SPARC64]: Fix user accesses in regset code.) Cc: stable@kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It needs access_process_vm() if the traced process does not share
mm with the caller. Solution is similar to what sparc64 does.
Note that genregs32_set() is only ever called with pos being 0
or 32 * sizeof(u32) (the latter - as part of PTRACE_SETREGS
handling).
Cc: stable@kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, for EINT_TYPE GPIOs, the CON and FLTCON registers
are saved and restored over a suspend/resume cycle. However, the
EINT_MASK registers are not.
On S5PV210 at the very least, these registers are not retained over
suspend, leading to the interrupts remaining masked upon resume and
therefore no interrupts being triggered for the device. There should
be no effect on any SoCs that do retain these registers as theoretically
we would just be re-writing what was already there.
Fixes: 7ccbc60cd9c2 ("pinctrl: exynos: Handle suspend/resume of GPIO EINT registers") Cc: <stable@vger.kernel.org> Signed-off-by: Jonathan Bakker <xc-racer2@live.ca> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make sure that the POWER_RESET_VEXPRESS driver won't have bind/unbind
attributes available via the sysfs, so lets be explicit here and use
".suppress_bind_attrs = true" to prevent userspace from doing something
silly.
igb device gets runtime suspended when there's no link partner. We can't
get correct speed under that state:
$ cat /sys/class/net/enp3s0/speed
1000
In addition to that, an error can also be spotted in dmesg:
[ 385.991957] igb 0000:03:00.0 enp3s0: PCIe link lost
Since device can only be runtime suspended when there's no link partner,
we can skip reading register and let the following logic set speed and
duplex with correct status.
The more generic approach will be wrap get_link_ksettings() with begin()
and complete() callbacks. However, for this particular issue, begin()
calls igb_runtime_resume() , which tries to rtnl_lock() while the lock
is already hold by upper ethtool layer.
So let's take this approach until the igb_runtime_resume() no longer
needs to hold rtnl_lock.
CC: stable <stable@vger.kernel.org> Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
v4l2_ctrl_handler_free() uses hdl->lock, which in ov5640 driver is set
to sensor's own sensor->lock. In ov5640_remove(), the driver destroys the
sensor->lock first, and then calls v4l2_ctrl_handler_free(), resulting
in the use of the destroyed mutex.
Fix this by calling moving the mutex_destroy() to the end of the cleanup
sequence, as there's no need to destroy the mutex as early as possible.
Since the driver was first introduced into the kernel, it has only
handled the ciphers associated with WEP, WPA, and WPA2. It fails with
WPA3 even though mac80211 can handle those additional ciphers in software,
b43legacy did not report that it could handle them. By setting MFP_CAPABLE using
ieee80211_set_hw(), the problem is fixed.
With this change, b43legacy will handle the ciphers it knows in hardware,
and let mac80211 handle the others in software. It is not necessary to
use the module parameter NOHWCRYPT to turn hardware encryption off.
Although this change essentially eliminates that module parameter,
I am choosing to keep it for cases where the hardware is broken,
and software encryption is required for all ciphers.
Since the driver was first introduced into the kernel, it has only
handled the ciphers associated with WEP, WPA, and WPA2. It fails with
WPA3 even though mac80211 can handle those additional ciphers in software,
b43 did not report that it could handle them. By setting MFP_CAPABLE using
ieee80211_set_hw(), the problem is fixed.
With this change, b43 will handle the ciphers it knows in hardware,
and let mac80211 handle the others in software. It is not necessary to
use the module parameter NOHWCRYPT to turn hardware encryption off.
Although this change essentially eliminates that module parameter,
I am choosing to keep it for cases where the hardware is broken,
and software encryption is required for all ciphers.
Reported-and-tested-by: Rui Salvaterra <rsalvaterra@gmail.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Stable <stable@vger.kernel.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20200526155909.5807-2-Larry.Finger@lwfinger.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes commit 75388acd0cd8 ("add mac80211-based driver for
legacy BCM43xx devices")
In https://bugzilla.kernel.org/show_bug.cgi?id=207093, a defect in
b43legacy is reported. Upon testing, thus problem exists on PPC and
X86 platforms, is present in the oldest kernel tested (3.2), and
has been present in the driver since it was first added to the kernel.
The problem is a corrupted channel status received from the device.
Both the internal card in a PowerBook G4 and the PCMCIA version
(Broadcom BCM4306 with PCI ID 14e4:4320) have the problem. Only Rev, 2
(revision 4 of the 802.11 core) of the chip has been tested. No other
devices using b43legacy are available for testing.
Various sources of the problem were considered. Buffer overrun and
other sources of corruption within the driver were rejected because
the faulty channel status is always the same, not a random value.
It was concluded that the faulty data is coming from the device, probably
due to a firmware bug. As that source is not available, the driver
must take appropriate action to recover.
At present, the driver reports the error, and them continues to process
the bad packet. This is believed that to be a mistake, and the correct
action is to drop the correpted packet.
Fixes: 75388acd0cd8 ("add mac80211-based driver for legacy BCM43xx devices") Cc: Stable <stable@vger.kernel.org> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-and-tested by: F. Erhard <erhard_f@mailbox.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20200407190043.1686-1-Larry.Finger@lwfinger.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
go7007_snd_init() misses a snd_card_free() in an error path.
Add the missed call to fix it.
Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
[Salvatore Bonaccorso: Adjust context for backport to versions which do
not contain c0decac19da3 ("media: use strscpy() instead of strlcpy()")
and ba78170ef153 ("media: go7007: Fix misuse of strscpy")] Signed-off-by: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch follows up on a bug-report by Frank Schäfer that
discovered P2P GO wasn't working with wpa_supplicant.
This patch removes part of the broken P2P GO support but
keeps the vif switchover code in place.
Cc: <stable@vger.kernel.org>
Link: <https://lkml.kernel.org/r/3a9d86b6-744f-e670-8792-9167257edef8@googlemail.com> Reported-by: Frank Schäfer <fschaefer.oss@googlemail.com> Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20200425092811.9494-1-chunkeey@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's an error if the value of the RX/TX tail descriptor does not match
what was written. The error condition is true regardless the duration
of the interference from ME. But the driver only performs the reset if
E1000_ICH_FWSM_PCIM2PCI_COUNT (2000) iterations of 50us delay have
transpired. The extra condition can lead to inconsistency between the
state of hardware as expected by the driver.
Fix this by dropping the check for number of delay iterations.
While at it, also make __ew32_prepare() static as it's not used
anywhere else.
CC: stable <stable@vger.kernel.org> Signed-off-by: Punit Agrawal <punit1.agrawal@toshiba.co.jp> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit b10effb92e27 ("e1000e: fix buffer overrun while the I219 is
processing DMA transactions") imposes roughly 30% performance penalty.
The commit log states that "Disabling TSO eliminates performance loss
for TCP traffic without a noticeable impact on CPU performance", so
let's disable TSO by default to regain the loss.
CC: stable <stable@vger.kernel.org> Fixes: b10effb92e27 ("e1000e: fix buffer overrun while the I219 is processing DMA transactions") BugLink: https://bugs.launchpad.net/bugs/1802691 Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Root Complex Integrated Endpoints (RCiEPs) do not have an upstream bridge,
so pci_configure_mps() previously ignored them, which may result in reduced
performance.
Instead, program the Max_Payload_Size of RCiEPs to the maximum supported
value (unless it is limited for the PCIE_BUS_PEER2PEER case). This also
affects the subsequent programming of Max_Read_Request_Size because Linux
programs MRRS based on the MPS value.
If an error happens while running dellaloc in COW mode for a range, we can
end up calling extent_clear_unlock_delalloc() for a range that goes beyond
our range's end offset by 1 byte, which affects 1 extra page. This results
in clearing bits and doing page operations (such as a page unlock) outside
our target range.
Fix that by calling extent_clear_unlock_delalloc() with an inclusive end
offset, instead of an exclusive end offset, at cow_file_range().
Fixes: a315e68f6e8b30 ("Btrfs: fix invalid attempt to free reserved space on failure to cow range") CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In btrfs_submit_direct_hook(), if a direct I/O write doesn't span a RAID
stripe or chunk, we submit orig_bio without cloning it. In this case, we
don't increment pending_bios. Then, if btrfs_submit_dio_bio() fails, we
decrement pending_bios to -1, and we never complete orig_bio. Fix it by
initializing pending_bios to 1 instead of incrementing later.
Fixing this exposes another bug: we put orig_bio prematurely and then
put it again from end_io. Fix it by not putting orig_bio.
After this change, pending_bios is really more of a reference count, but
I'll leave that cleanup separate to keep the fix small.
Fixes: e65e15355429 ("btrfs: fix panic caused by direct IO") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
acs_flags &= ~( <controls provided by this device> );
return acs_flags ? 0 : 1;
Pull this out into a helper function to simplify the quirks slightly. The
helper function is also a convenient place for comments about what the list
of ACS controls means. No functional change intended.
All Intel platforms guarantee that all root complex implementations must
send transactions up to IOMMU for address translations. Hence for Intel
RCiEP devices, we can assume some ACS-type isolation even without an ACS
capability.
From the Intel VT-d spec, r3.1, sec 3.16 ("Root-Complex Peer to Peer
Considerations"):
When DMA remapping is enabled, peer-to-peer requests through the
Root-Complex must be handled as follows:
- The input address in the request is translated (through first-level,
second-level or nested translation) to a host physical address (HPA).
The address decoding for peer addresses must be done only on the
translated HPA. Hardware implementations are free to further limit
peer-to-peer accesses to specific host physical address regions (or
to completely disallow peer-forwarding of translated requests).
- Since address translation changes the contents (address field) of
the PCI Express Transaction Layer Packet (TLP), for PCI Express
peer-to-peer requests with ECRC, the Root-Complex hardware must use
the new ECRC (re-computed with the translated address) if it
decides to forward the TLP as a peer request.
- Root-ports, and multi-function root-complex integrated endpoints, may
support additional peer-to-peer control features by supporting PCI
Express Access Control Services (ACS) capability. Refer to ACS
capability in PCI Express specifications for details.
Since Linux didn't give special treatment to allow this exception, certain
RCiEP MFD devices were grouped in a single IOMMU group. This doesn't permit
a single device to be assigned to a guest for instance.
In one vendor system: Device 14.x were grouped in a single IOMMU group.
/sys/kernel/iommu_groups/5/devices/0000:00:14.0
/sys/kernel/iommu_groups/5/devices/0000:00:14.2
/sys/kernel/iommu_groups/6/devices/0000:00:14.3 <<< new group
14.0 and 14.2 are integrated devices, but legacy end points, whereas 14.3
was a PCIe-compliant RCiEP.
[bhelgaas: drop "Fixes" tag since this doesn't fix a bug in that commit] Link: https://lore.kernel.org/r/1590699462-7131-1-git-send-email-ashok.raj@intel.com Tested-by: Darrel Goeddel <dgoeddel@forcepoint.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Cc: stable@vger.kernel.org Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Mark Scott <mscott@forcepoint.com>, Cc: Romil Sharma <rsharma@forcepoint.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Although not allowed by the PCI specs, some multi-function devices have
power dependencies between the functions. For example, function 1 may not
work unless function 0 is in the D0 power state.
The existing quirk_gpu_hda() adds a device link to express this dependency
for GPU and HDA devices, but it really is not specific to those device
types.
Generalize it and rename it to pci_create_device_link() so we can create
dependencies between any "consumer" and "producer" functions of a
multi-function device, where the consumer is only functional if the
producer is in D0. This reorganization should not affect any
functionality.
Back in 2013, runtime PM for GPUs with integrated HDA controller was
introduced with commits 0d69704ae348 ("gpu/vga_switcheroo: add driver
control power feature. (v3)") and 246efa4a072f ("snd/hda: add runtime
suspend/resume on optimus support (v4)").
Briefly, the idea was that the HDA controller is forced on and off in
unison with the GPU.
The original code is mostly still in place even though it was never a
100% perfect solution: E.g. on access to the HDA controller, the GPU
is powered up via vga_switcheroo_runtime_resume_hdmi_audio() but there
are no provisions to keep it resumed until access to the HDA controller
has ceased: The GPU autosuspends after 5 seconds, rendering the HDA
controller inaccessible.
Additionally, a kludge is required when hda_intel.c probes: It has to
check whether the GPU is powered down (check_hdmi_disabled()) and defer
probing if so.
However in the meantime (in v4.10) the driver core has gained a feature
called device links which promises to solve such issues in a clean way:
It allows us to declare a dependency from the HDA controller (consumer)
to the GPU (supplier). The PM core then automagically ensures that the
GPU is runtime resumed as long as the HDA controller's ->probe hook is
executed and whenever the HDA controller is accessed.
By default, the HDA controller has a dependency on its parent, a PCIe
Root Port. Adding a device link creates another dependency on its
sibling:
PCIe Root Port
^ ^
| |
| |
HDA ===> GPU
The device link is not only used for runtime PM, it also guarantees that
on system sleep, the HDA controller suspends before the GPU and resumes
after the GPU, and on system shutdown the HDA controller's ->shutdown
hook is executed before the one of the GPU. It is a complete solution.
Using this functionality is as simple as calling device_link_add(),
which results in a dmesg entry like this:
pci 0000:01:00.1: Linked as a consumer to 0000:01:00.0
The code for the GPU-governed audio power management can thus be removed
(except where it's still needed for legacy manual power control).
The device link is added in a PCI quirk rather than in hda_intel.c.
It is therefore legal for the GPU to runtime suspend to D3cold even if
the HDA controller is not bound to a driver or if CONFIG_SND_HDA_INTEL
is not enabled, for accesses to the HDA controller will cause the GPU to
wake up regardless if they're occurring outside of hda_intel.c (think
config space readout via sysfs).
Contrary to the previous implementation, the HDA controller's power
state is now self-governed, rather than GPU-governed, whereas the GPU's
power state is no longer fully self-governed. (The HDA controller needs
to runtime suspend before the GPU can.)
It is thus crucial that runtime PM is always activated on the HDA
controller even if CONFIG_SND_HDA_POWER_SAVE_DEFAULT is set to 0 (which
is the default), lest the GPU stays awake. This is achieved by setting
the auto_runtime_pm flag on every codec and the AZX_DCAPS_PM_RUNTIME
flag on the HDA controller.
A side effect is that power consumption might be reduced if the GPU is
in use but the HDA controller is not, because the HDA controller is now
allowed to go to D3hot. Before, it was forced to stay in D0 as long as
the GPU was in use. (There is no reduction in power consumption on my
Nvidia GK107, but there might be on other chips.)
The code paths for legacy manual power control are adjusted such that
runtime PM is disabled during power off, thereby preventing the PM core
from resuming the HDA controller.
Note that the device link is not only added on vga_switcheroo capable
systems, but for *any* GPU with integrated HDA controller. The idea is
that the HDA controller streams audio via connectors located on the GPU,
so the GPU needs to be on for the HDA controller to do anything useful.
This commit implicitly fixes an unbalanced runtime PM ref upon unbind of
hda_intel.c: On ->probe, a runtime PM ref was previously released under
the condition "azx_has_pm_runtime(chip) || hda->use_vga_switcheroo", but
on ->remove a runtime PM ref was only acquired under the first of those
conditions. Thus, binding and unbinding the driver twice on a
vga_switcheroo capable system caused the runtime PM refcount to drop
below zero. The issue is resolved because the AZX_DCAPS_PM_RUNTIME flag
is now always set if use_vga_switcheroo is true.
For more information on device links please refer to:
https://www.kernel.org/doc/html/latest/driver-api/device_link.html
Documentation/driver-api/device_link.rst
Cc: Dave Airlie <airlied@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Reviewed-by: Peter Wu <peter@lekensteyn.nl> Tested-by: Kai Heng Feng <kai.heng.feng@canonical.com> # AMD PowerXpress Tested-by: Mike Lothian <mike@fireburn.co.uk> # AMD PowerXpress Tested-by: Denis Lisov <dennis.lissov@gmail.com> # Nvidia Optimus Tested-by: Peter Wu <peter@lekensteyn.nl> # Nvidia Optimus Tested-by: Lukas Wunner <lukas@wunner.de> # MacBook Pro Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://patchwork.freedesktop.org/patch/msgid/51bd38360ff502a8c42b1ebf4405ee1d3f27118d.1520068884.git.lukas@wunner.de Signed-off-by: Sasha Levin <sashal@kernel.org>
If DRM drivers use runtime PM, they currently notify vga_switcheroo
whenever they ->runtime_suspend or ->runtime_resume to update
vga_switcheroo's internal power state tracking.
That's essentially a duplication of a functionality performed by the
PM core as it already tracks the GPU's power state and vga_switcheroo
can always query it.
Introduce a new internal helper vga_switcheroo_pwr_state() which does
just that if runtime PM is used, or falls back to vga_switcheroo's
internal power state tracking if manual power control is used.
Drop a redundant power state check in set_audio_state() while at it.
This removes one of the two purposes of the notification mechanism
implemented by vga_switcheroo_set_dynamic_switch(). The other one is
power management of the audio device and we'll remove that next.
Cc: Dave Airlie <airlied@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Peter Wu <peter@lekensteyn.nl> Tested-by: Kai Heng Feng <kai.heng.feng@canonical.com> # AMD PowerXpress Tested-by: Mike Lothian <mike@fireburn.co.uk> # AMD PowerXpress Tested-by: Denis Lisov <dennis.lissov@gmail.com> # Nvidia Optimus Tested-by: Peter Wu <peter@lekensteyn.nl> # Nvidia Optimus Tested-by: Lukas Wunner <lukas@wunner.de> # MacBook Pro Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://patchwork.freedesktop.org/patch/msgid/0aa49d735b988aa04524a8dc339582ace33f0f94.1520068884.git.lukas@wunner.de Signed-off-by: Sasha Levin <sashal@kernel.org>
The Ampere Computing PCIe root port does not support ACS at this point.
However, the hardware provides isolation and source validation through the
SMMU. The stream ID generated by the PCIe ports contain both the
bus/device/function number as well as the port ID in its 3 most significant
bits. Turn on ACS but disable all the peer-to-peer features.
APM is being rebranded to Ampere. The Vendor and Device IDs change, but
the functionality stays the same.
iProc PAXB Root Ports don't advertise an ACS capability, but they do not
allow peer-to-peer transactions between Root Ports. Add an ACS quirk so
each Root Port can be in a separate IOMMU group.
[bhelgaas: commit log, comment, use common implementation style] Link: https://lore.kernel.org/r/1566275985-25670-1-git-send-email-srinath.mannam@broadcom.com Signed-off-by: Abhinav Ratna <abhinav.ratna@broadcom.com> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Scott Branden <scott.branden@broadcom.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The AMD Starship USB 3.0 host controller advertises Function Level Reset
support, but it apparently doesn't work. Add a quirk to prevent use of FLR
on this device.
Without this quirk, when attempting to assign (pass through) an AMD
Starship USB 3.0 host controller to a guest OS, the system becomes
increasingly unresponsive over the course of several minutes, eventually
requiring a hard reset. Shortly after attempting to start the guest, I see
these messages:
vfio-pci 0000:05:00.3: not ready 1023ms after FLR; waiting
vfio-pci 0000:05:00.3: not ready 2047ms after FLR; waiting
vfio-pci 0000:05:00.3: not ready 4095ms after FLR; waiting
vfio-pci 0000:05:00.3: not ready 8191ms after FLR; waiting
And then eventually:
vfio-pci 0000:05:00.3: not ready 65535ms after FLR; giving up
INFO: NMI handler (perf_event_nmi_handler) took too long to run: 0.000 msecs
perf: interrupt took too long (642744 > 2500), lowering kernel.perf_event_max_sample_rate to 1000
INFO: NMI handler (perf_event_nmi_handler) took too long to run: 82.270 msecs
INFO: NMI handler (perf_event_nmi_handler) took too long to run: 680.608 msecs
INFO: NMI handler (perf_event_nmi_handler) took too long to run: 100.952 msecs
...
watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [qemu-system-x86:7487]
Tested on a Micro-Star International Co., Ltd. MS-7C59/Creator TRX40
motherboard with an AMD Ryzen Threadripper 3970X.
The AMD Matisse HD Audio & USB 3.0 devices advertise Function Level Reset
support, but hang when an FLR is triggered.
To reproduce the problem, attach the device to a VM, then detach and try to
attach again.
Rename the existing quirk_intel_no_flr(), which was not Intel-specific, to
quirk_no_flr(), and apply it to prevent the use of FLR on these AMD
devices.
The Freescale PCIe controller advertises the MSI/MSI-X capability in both
RC and Endpoint mode, but in RC mode it doesn't support MSI/MSI-X by
itself; it can only transfer MSI/MSI-X from downstream devices.
Add a quirk to prevent use of MSI/MSI-X in RC mode.
'igrab(d_inode(dentry->d_parent))' without holding dentry->d_lock is
broken because without d_lock, d_parent can be concurrently changed due
to a rename(). Then if the old directory is immediately deleted, old
d_parent->inode can be NULL. That causes a NULL dereference in igrab().
To fix this, use dget_parent() to safely grab a reference to the parent
dentry, which pins the inode. This also eliminates the need to use
d_find_any_alias() other than for the initial inode, as we no longer
throw away the dentry at each step.
This is an extremely hard race to hit, but it is possible. Adding a
udelay() in between the reads of ->d_parent and its ->d_inode makes it
reproducible on a no-journal filesystem using the following program:
#include <fcntl.h>
#include <unistd.h>
int main()
{
if (fork()) {
for (;;) {
mkdir("dir1", 0700);
int fd = open("dir1/file", O_RDWR|O_CREAT|O_SYNC);
write(fd, "X", 1);
close(fd);
}
} else {
mkdir("dir2", 0700);
for (;;) {
rename("dir1/file", "dir2/file");
rmdir("dir1");
}
}
}