iio: xilinx-xadc: Make sure not exceed maximum samplerate
The XADC supports a samplerate of up to 1MSPS. Unfortunately the hardware
does not have a FIFO, which means it generates an interrupt for each
conversion sequence. At one 1MSPS this creates an interrupt storm that
causes the system to soft-lock.
For this reason the driver limits the maximum samplerate to 150kSPS.
Currently this check is only done when setting a new samplerate. But it is
also possible that the initial samplerate configured in the FPGA bitstream
exceeds the limit.
In this case when starting to capture data without first changing the
samplerate the system can overload.
To prevent this check the currently configured samplerate in the probe
function and reduce it to the maximum if necessary.
iio: xilinx-xadc: Fix sequencer configuration for aux channels in simultaneous mode
The XADC has two internal ADCs. Depending on the mode it is operating in
either one or both of them are used. The device manual calls this
continuous (one ADC) and simultaneous (both ADCs) mode.
The meaning of the sequencing register for the aux channels changes
depending on the mode.
In continuous mode each bit corresponds to one of the 16 aux channels. And
the single ADC will convert them one by one in order.
In simultaneous mode the aux channels are split into two groups the first 8
channels are assigned to the first ADC and the other 8 channels to the
second ADC. The upper 8 bits of the sequencing register are unused and the
lower 8 bits control both ADCs. This means a bit needs to be set if either
the corresponding channel from the first group or the second group (or
both) are set.
Currently the driver does not have the special handling required for
simultaneous mode. Add it.
iio: xilinx-xadc: Fix clearing interrupt when enabling trigger
When enabling the trigger and unmasking the end-of-sequence (EOS) interrupt
the EOS interrupt should be cleared from the status register. Otherwise it
is possible that it was still set from a previous capture. If that is the
case the interrupt would fire immediately even though no conversion has
been done yet and stale data is being read from the device.
The old code only clears the interrupt if the interrupt was previously
unmasked. Which does not make much sense since the interrupt is always
masked at this point and in addition masking the interrupt does not clear
the interrupt from the status register. So the clearing needs to be done
unconditionally.
The check for shutting down the second ADC is inverted. This causes it to
be powered down when it should be enabled. As a result channels that are
supposed to be handled by the second ADC return invalid conversion results.
Colin Ian King [Fri, 3 Apr 2020 12:58:38 +0000 (13:58 +0100)]
iio: dac: ad5770r: fix off-by-one check on maximum number of channels
Currently there is an off-by-one check on the number of channels that
will cause an arry overrun in array st->output_mode when calling the
function d5770r_store_output_range. Fix this by using >= rather than >
to check for maximum number of channels.
Addresses-Coverity: ("Out-of-bounds access") Fixes: cbbb819837f6 ("iio: dac: ad5770r: Add AD5770R support") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Lorenzo Bianconi [Fri, 13 Mar 2020 18:06:00 +0000 (19:06 +0100)]
iio: imu: st_lsm6dsx: flush hw FIFO before resetting the device
flush hw FIFO before device reset in order to avoid possible races
on interrupt line 1. If the first interrupt line is asserted during
hw reset the device will work in I3C-only mode (if it is supported)
Fixes: 801a6e0af0c6 ("iio: imu: st_lsm6dsx: add support to LSM6DSO") Fixes: 43901008fde0 ("iio: imu: st_lsm6dsx: add support to LSM6DSR") Reported-by: Mario Tesi <mario.tesi@st.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Vitor Soares <vitor.soares@synopsys.com> Tested-by: Vitor Soares <vitor.soares@synopsys.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Mircea Caprioru [Wed, 1 Apr 2020 11:22:30 +0000 (14:22 +0300)]
iio: core: Fix handling of 'dB'
This patch fixes the call to iio_str_to_fixpoint when using 'dB' sufix.
Before this the scale_db was not used when parsing the string written to
the attribute and it failed with invalid value.
Fixes: b8528224741b ("iio: core: Handle 'dB' suffix in core") Signed-off-by: Mircea Caprioru <mircea.caprioru@analog.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Fabrice Gasnier [Thu, 19 Mar 2020 18:18:27 +0000 (19:18 +0100)]
dt-bindings: iio: adc: stm32-adc: fix id relative path
Fix id relative path that shouldn't contain 'bindings', as pointed out
when submitting st,stm32-dac bindings conversion to json-schema [1].
[1] https://patchwork.ozlabs.org/patch/1257568/
Fixes: a8cf1723c4b7 ("dt-bindings: iio: adc: stm32-adc: convert bindings to json-schema") Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Lorenzo Bianconi [Fri, 13 Mar 2020 17:54:42 +0000 (18:54 +0100)]
iio: imu: st_lsm6dsx: specify slave odr in slv_odr
Introduce slv_odr in ext_info data structure in order to distinguish
between sensor hub trigger (accel sensor) odr and i2c slave odr and
properly compute samples in FIFO pattern
Fixes: e485e2a2cfd6 ("iio: imu: st_lsm6dsx: enable sensor-hub support for lsm6dsm") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Lorenzo Bianconi [Fri, 13 Mar 2020 17:54:41 +0000 (18:54 +0100)]
iio: imu: st_lsm6dsx: fix read misalignment on untagged FIFO
st_lsm6dsx suffers of a read misalignment on untagged FIFO when
all 3 supported sensors (accel, gyro and ext device) are running
at different ODRs (the use-case is reported in the LSM6DSM Application
Note at pag 100).
Fix the issue taking into account decimation factor reading the FIFO
pattern.
Fixes: e485e2a2cfd6 ("iio: imu: st_lsm6dsx: enable sensor-hub support for lsm6dsm") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
YueHaibing [Tue, 10 Mar 2020 14:16:54 +0000 (22:16 +0800)]
iio:ad7797: Use correct attribute_group
It should use ad7797_attribute_group in ad7797_info,
according to commit ("iio:ad7793: Add support for the ad7796 and ad7797").
Scale is fixed for the ad7796 and not programmable, hence
should not have the scale_available attribute.
Fixes: fd1a8b912841 ("iio:ad7793: Add support for the ad7796 and ad7797") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
George Spelvin [Thu, 26 Mar 2020 15:23:36 +0000 (15:23 +0000)]
staging: wilc1000: Use crc7 in lib/ rather than a private copy
The code in lib/ is the desired polynomial, and even includes
the 1-bit left shift in the table rather than needing to code
it explicitly.
While I'm in Kconfig, add a description of what a WILC1000 is.
Kconfig questions that require me to look up a data sheet to
find out that I probably don't have one are a pet peeve.
Sam Muhammed [Thu, 26 Mar 2020 15:02:36 +0000 (11:02 -0400)]
Staging: rtl8192u: ieee80211: Use netdev_warn() for network devices.
Use netdev_warn() over printk().
netdev_warn() is specific for printing warning
messages for network devices, and preferable
over printk(KERN_WARNING ...).
Sam Muhammed [Thu, 26 Mar 2020 15:02:35 +0000 (11:02 -0400)]
Staging: rtl8192u: ieee80211: Use netdev_dbg() for debug messages.
Replace printk(KERN_DEBUG ...) with netdev_dbg() across the driver.
since netdev_dbg() is preferable and specific for
printing debug messages for network devices.
Simran Singhal [Thu, 26 Mar 2020 13:28:23 +0000 (18:58 +0530)]
staging: rtl8723bs: hal: Remove NULL check before kfree
NULL check before kfree is unnecessary so remove it.
The following Coccinelle script was used to detect this:
@@ expression E; @@
- if (E != NULL) { kfree(E); }
+ kfree(E);
@@ expression E; @@
- if (E != NULL) { kfree(E); E = NULL; }
+ kfree(E);
+ E = NULL;
Simran Singhal [Thu, 26 Mar 2020 11:32:10 +0000 (17:02 +0530)]
staging: rtl8723bs: hal: Remove unnecessary cast on void pointer
Assignment to a typed pointer is sufficient in C.
No cast is needed.
The following Coccinelle script was used to detect this:
@r@
expression x;
void* e;
type T;
identifier f;
@@
(
*((T *)e)
|
((T *)x)[...]
|
((T*)x)->f
|
Michael Straube [Thu, 26 Mar 2020 08:43:48 +0000 (09:43 +0100)]
staging: rtl8188eu: cleanup long line in odm.c
Cleanup line over 80 characters by removing unnecessary test
'pDM_Odm->RSSI_Min <= 25'. The above test 'pDM_Odm->RSSI_Min > 25'
already guarantees that it is <= 25.
Sam Muhammed [Wed, 25 Mar 2020 14:26:37 +0000 (10:26 -0400)]
Staging: kpc2000: kpc_dma: Use sizeof(*var) in kzalloc().
kzalloc(sizeof(*var), ...) was the format been used
across the driver, which is the preferred format,
but missed two instances, correct them to match the
coding standards.
Checkpatch.pl CHECK: Prefer kzalloc(sizeof(*var)...)
over kzalloc(sizeof(struct var)...)
Soumyajit Deb [Wed, 25 Mar 2020 18:00:03 +0000 (23:30 +0530)]
staging: hp100: Properly indent the multiline comments.
Shift the end of multiline comments "*/" to the next line and align the
lines of the comments properly to improve code readability and to adhere
to the Linux Kernel coding style.
Reported by checkpatch.pl
Simran Singhal [Wed, 25 Mar 2020 14:02:45 +0000 (19:32 +0530)]
staging: rtl8723bs: Remove unnecessary braces for single statements
Clean up unnecessary braces around single statement blocks.
Issues reported by checkpatch.pl as:
WARNING: braces {} are not necessary for single statement blocks
Soumyajit Deb [Wed, 25 Mar 2020 12:27:52 +0000 (17:57 +0530)]
staging: hp100: Add spaces in if statement.
Add space between if and open parenthesis to improve
code readability and to adhere to the standard linux kernel
coding style. Also, shift the next line to the right by a single space
as it is the continuation of the above if statement.
Soumyajit Deb [Wed, 25 Mar 2020 11:59:56 +0000 (17:29 +0530)]
staging: hp100: Add space between while keyword and open parenthesis
Add space between while keyword and open parenthesis "(" in the do while
statement to improve code readability and to adhere to the Linux Kernel
coding style.
Reported by checkpatch.pl
Soumyajit Deb [Wed, 25 Mar 2020 07:29:05 +0000 (12:59 +0530)]
staging: hp100: Remove space after opening parenthesis "("
Remove space after opening parenthesis in if statement to improve code
readability and to adhere to the standard coding style.
Reported by checkpatch.pl
Deepak R Varma [Sun, 22 Mar 2020 18:59:36 +0000 (00:29 +0530)]
staging: comedi: ni_labpc_common: Reformat multiple line dereference
Reformat multi-line dereferencing of function arguments
&cmd->scan_begin_arg. Also reformat another call to the same function to
follow the same argument formatting structure. Problem detected by
checkpatch script.
Deepak R Varma [Sun, 22 Mar 2020 19:57:26 +0000 (01:27 +0530)]
staging: iio: adc: ad7280a: Add comments to clarify stringified arguments
Checkpatch would flash a check message around a stringified macro
argument containing a '-' character. Add comment to indicate the
argument is legitimate and doesn't need fixing.
John B. Wyatt IV [Sat, 21 Mar 2020 22:58:08 +0000 (15:58 -0700)]
staging: wlan-ng: Fix third argument going over 80 characters
Create a new 'status' variable to store the value of a long argument
that goes over 80 characters. The status variable is also used for
an if check. Replacing that long statement in both places makes the
code much easier to read.
Note: the status variable is assigned after a needed byte order
conversion for usbin->rxfrm.desc.status, which uses a reference.
Issue reported by checkpatch.
Suggested-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: John B. Wyatt IV <jbwyatt4@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Link: https://lore.kernel.org/r/20200321225808.2494564-1-jbwyatt4@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sergio Paracuellos [Sun, 22 Mar 2020 07:21:28 +0000 (08:21 +0100)]
staging: mt7621-pci: avoid to set 'iomem_resource' addresses
Setting up kernel resource 'iomem_resource' for PCI with
addresses parsed from device tree gots into a conflict within
the usb xhci driver:
xhci-mtk 1e1c0000.xhci: can't request region for resource [mem 0x1e1c0000-0x1e1c0fff]
xhci-mtk: probe of 1e1c0000.xhci failed with error -16
Don't assign it and maintain the default addresses for this
resource seems to fix the problem. Checking legacy driver it
is being only setting the 'ioport_resource'.
Detection of the Xtal mode is using magic numbers that
can be avoided using properly some definitions and a more
accurate variable name from 'reg' into 'xtal_mode'. This
increase readability.
Sergio Paracuellos [Sat, 21 Mar 2020 13:36:23 +0000 (14:36 +0100)]
staging: mt7621-pci-phy: use builtin_platform_driver()
Macro builtin_platform_driver can be used for builtin drivers
that don't do anything in driver init. So, use the macro
builtin_platform_driver and remove some boilerplate code.
Sergio Paracuellos [Sat, 21 Mar 2020 13:36:21 +0000 (14:36 +0100)]
staging: mt7621-pci: use builtin_platform_driver()
Macro builtin_platform_driver can be used for builtin drivers
that don't do anything in driver init. So, use the macro
builtin_platform_driver and remove some boilerplate code.
Linus Torvalds [Sun, 22 Mar 2020 18:35:33 +0000 (11:35 -0700)]
Merge tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Two fixes.
The first is a regression: when dropping some incompat bits the
conditions were reversed. The other is a fix for rename whiteout
potentially leaving stack memory linked to a list"
* tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix removal of raid[56|1c34} incompat flags after removing block group
btrfs: fix log context list corruption after rename whiteout error
Linus Torvalds [Sun, 22 Mar 2020 17:46:50 +0000 (10:46 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"10 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
x86/mm: split vmalloc_sync_all()
mm, slub: prevent kmalloc_node crashes and memory leaks
mm/mmu_notifier: silence PROVE_RCU_LIST warnings
epoll: fix possible lost wakeup on epoll_ctl() path
mm: do not allow MADV_PAGEOUT for CoW pages
mm, memcg: throttle allocators based on ancestral memory.high
mm, memcg: fix corruption on 64-bit divisor in memory.high throttling
page-flags: fix a crash at SetPageError(THP_SWAP)
mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case
memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event
Joerg Roedel [Sun, 22 Mar 2020 01:22:41 +0000 (18:22 -0700)]
x86/mm: split vmalloc_sync_all()
Commit 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in
__purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in
the vunmap() code-path. While this change was necessary to maintain
correctness on x86-32-pae kernels, it also adds additional cycles for
architectures that don't need it.
Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported
severe performance regressions in micro-benchmarks because it now also
calls the x86-64 implementation of vmalloc_sync_all() on vunmap(). But
the vmalloc_sync_all() implementation on x86-64 is only needed for newly
created mappings.
To avoid the unnecessary work on x86-64 and to gain the performance
back, split up vmalloc_sync_all() into two functions:
* vmalloc_sync_mappings(), and
* vmalloc_sync_unmappings()
Most call-sites to vmalloc_sync_all() only care about new mappings being
synchronized. The only exception is the new call-site added in the
above mentioned commit.
Shile Zhang directed us to a report of an 80% regression in reaim
throughput.
Fixes: 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()") Reported-by: kernel test robot <oliver.sang@intel.com> Reported-by: Shile Zhang <shile.zhang@linux.alibaba.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Borislav Petkov <bp@suse.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [GHES] Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20191009124418.8286-1-joro@8bytes.org Link: https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/ Link: http://lkml.kernel.org/r/20191113095530.228959-1-shile.zhang@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This only happens with a mmotm patch "mm/memcontrol.c: allocate
shrinker_map on appropriate NUMA node" [2] which effectively calls
kmalloc_node for each possible node. SLUB however only allocates
kmem_cache_node on online N_NORMAL_MEMORY nodes, and relies on
node_to_mem_node to return such valid node for other nodes since commit a561ce00b09e ("slub: fall back to node_to_mem_node() node if allocating
on memoryless node"). This is however not true in this configuration
where the _node_numa_mem_ array is not initialized for nodes 0 and 2-31,
thus it contains zeroes and get_partial() ends up accessing
non-allocated kmem_cache_node.
A related issue was reported by Bharata (originally by Ramachandran) [3]
where a similar PowerPC configuration, but with mainline kernel without
patch [2] ends up allocating large amounts of pages by kmalloc-1k
kmalloc-512. This seems to have the same underlying issue with
node_to_mem_node() not behaving as expected, and might probably also
lead to an infinite loop with CONFIG_SLUB_CPU_PARTIAL [4].
This patch should fix both issues by not relying on node_to_mem_node()
anymore and instead simply falling back to NUMA_NO_NODE, when
kmalloc_node(node) is attempted for a node that's not online, or has no
usable memory. The "usable memory" condition is also changed from
node_present_pages() to N_NORMAL_MEMORY node state, as that is exactly
the condition that SLUB uses to allocate kmem_cache_node structures.
The check in get_partial() is removed completely, as the checks in
___slab_alloc() are now sufficient to prevent get_partial() being
reached with an invalid node.
Qian Cai [Sun, 22 Mar 2020 01:22:34 +0000 (18:22 -0700)]
mm/mmu_notifier: silence PROVE_RCU_LIST warnings
It is safe to traverse mm->notifier_subscriptions->list either under
SRCU read lock or mm->notifier_subscriptions->lock using
hlist_for_each_entry_rcu(). Silence the PROVE_RCU_LIST false positives,
for example,
Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Link: http://lkml.kernel.org/r/20200317175640.2047-1-cai@lca.pw Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roman Penyaev [Sun, 22 Mar 2020 01:22:30 +0000 (18:22 -0700)]
epoll: fix possible lost wakeup on epoll_ctl() path
This fixes possible lost wakeup introduced by commit a218cc491420.
Originally modifications to ep->wq were serialized by ep->wq.lock, but
in commit a218cc491420 ("epoll: use rwlock in order to reduce
ep_poll_callback() contention") a new rw lock was introduced in order to
relax fd event path, i.e. callers of ep_poll_callback() function.
After the change ep_modify and ep_insert (both are called on epoll_ctl()
path) were switched to ep->lock, but ep_poll (epoll_wait) was using
ep->wq.lock on wqueue list modification.
The bug doesn't lead to any wqueue list corruptions, because wake up
path and list modifications were serialized by ep->wq.lock internally,
but actual waitqueue_active() check prior wake_up() call can be
reordered with modifications of ep ready list, thus wake up can be lost.
And yes, can be healed by explicit smp_mb():
list_add_tail(&epi->rdlink, &ep->rdllist);
smp_mb();
if (waitqueue_active(&ep->wq))
wake_up(&ep->wp);
But let's make it simple, thus current patch replaces ep->wq.lock with
the ep->lock for wqueue modifications, thus wake up path always observes
activeness of the wqueue correcty.
Fixes: a218cc491420 ("epoll: use rwlock in order to reduce ep_poll_callback() contention") Reported-by: Max Neunhoeffer <max@arangodb.com> Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Max Neunhoeffer <max@arangodb.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Christopher Kohlhoff <chris.kohlhoff@clearpool.io> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Jason Baron <jbaron@akamai.com> Cc: Jes Sorensen <jes.sorensen@gmail.com> Cc: <stable@vger.kernel.org> [5.1+] Link: http://lkml.kernel.org/r/20200214170211.561524-1-rpenyaev@suse.de
References: https://bugzilla.kernel.org/show_bug.cgi?id=205933 Bisected-by: Max Neunhoeffer <max@arangodb.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Sun, 22 Mar 2020 01:22:26 +0000 (18:22 -0700)]
mm: do not allow MADV_PAGEOUT for CoW pages
Jann has brought up a very interesting point [1]. While shared pages
are excluded from MADV_PAGEOUT normally, CoW pages can be easily
reclaimed that way. This can lead to all sorts of hard to debug
problems. E.g. performance problems outlined by Daniel [2].
There are runtime environments where there is a substantial memory
shared among security domains via CoW memory and a easy to reclaim way
of that memory, which MADV_{COLD,PAGEOUT} offers, can lead to either
performance degradation in for the parent process which might be more
privileged or even open side channel attacks.
The feasibility of the latter is not really clear to me TBH but there is
no real reason for exposure at this stage. It seems there is no real
use case to depend on reclaiming CoW memory via madvise at this stage so
it is much easier to simply disallow it and this is what this patch
does. Put it simply MADV_{PAGEOUT,COLD} can operate only on the
exclusively owned memory which is a straightforward semantic.
Chris Down [Sun, 22 Mar 2020 01:22:23 +0000 (18:22 -0700)]
mm, memcg: throttle allocators based on ancestral memory.high
Prior to this commit, we only directly check the affected cgroup's
memory.high against its usage. However, it's possible that we are being
reclaimed as a result of hitting an ancestor memory.high and should be
penalised based on that, instead.
This patch changes memory.high overage throttling to use the largest
overage in its ancestors when considering how many penalty jiffies to
charge. This makes sure that we penalise poorly behaving cgroups in the
same way regardless of at what level of the hierarchy memory.high was
breached.
Fixes: 0e4b01df8659 ("mm, memcg: throttle allocators when failing reclaim over memory.high") Reported-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Chris Down <chris@chrisdown.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: <stable@vger.kernel.org> [5.4.x+] Link: http://lkml.kernel.org/r/8cd132f84bd7e16cdb8fde3378cdbf05ba00d387.1584036142.git.chris@chrisdown.name Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Down [Sun, 22 Mar 2020 01:22:20 +0000 (18:22 -0700)]
mm, memcg: fix corruption on 64-bit divisor in memory.high throttling
Commit 0e4b01df8659 had a bunch of fixups to use the right division
method. However, it seems that after all that it still wasn't right --
div_u64 takes a 32-bit divisor.
The headroom is still large (2^32 pages), so on mundane systems you
won't hit this, but this should definitely be fixed.
Fixes: 0e4b01df8659 ("mm, memcg: throttle allocators when failing reclaim over memory.high") Reported-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Chris Down <chris@chrisdown.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: <stable@vger.kernel.org> [5.4.x+] Link: http://lkml.kernel.org/r/80780887060514967d414b3cd91f9a316a16ab98.1584036142.git.chris@chrisdown.name Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Qian Cai [Sun, 22 Mar 2020 01:22:17 +0000 (18:22 -0700)]
page-flags: fix a crash at SetPageError(THP_SWAP)
Commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after swapped
out") supported writing THP to a swap device but forgot to upgrade an
older commit df8c94d13c7e ("page-flags: define behavior of FS/IO-related
flags on compound pages") which could trigger a crash during THP
swapping out with DEBUG_VM_PGFLAGS=y,
Baoquan He [Sun, 22 Mar 2020 01:22:13 +0000 (18:22 -0700)]
mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case
In section_deactivate(), pfn_to_page() doesn't work any more after
ms->section_mem_map is resetting to NULL in SPARSEMEM|!VMEMMAP case. It
causes a hot remove failure:
Chunguang Xu [Sun, 22 Mar 2020 01:22:10 +0000 (18:22 -0700)]
memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event
An eventfd monitors multiple memory thresholds of the cgroup, closes them,
the kernel deletes all events related to this eventfd. Before all events
are deleted, another eventfd monitors the memory threshold of this cgroup,
leading to a crash:
We can reproduce this problem in the following ways:
1. We create a new cgroup subdirectory and a new eventfd, and then we
monitor multiple memory thresholds of the cgroup through this eventfd.
2. closing this eventfd, and __mem_cgroup_usage_unregister_event ()
will be called multiple times to delete all events related to this
eventfd.
The first time __mem_cgroup_usage_unregister_event() is called, the
kernel will clear all items related to this eventfd in thresholds->
primary.
Since there is currently only one eventfd, thresholds-> primary becomes
empty, so the kernel will set thresholds-> primary and hresholds-> spare
to NULL. If at this time, the user creates a new eventfd and monitor
the memory threshold of this cgroup, kernel will re-initialize
thresholds-> primary.
Then when __mem_cgroup_usage_unregister_event () is called for the
second time, because thresholds-> primary is not empty, the system will
access thresholds-> spare, but thresholds-> spare is NULL, which will
trigger a crash.
In general, the longer it takes to delete all events related to this
eventfd, the easier it is to trigger this problem.
The solution is to check whether the thresholds associated with the
eventfd has been cleared when deleting the event. If so, we do nothing.
[akpm@linux-foundation.org: fix comment, per Kirill] Fixes: 907860ed381a ("cgroups: make cftype.unregister_event() void-returning") Signed-off-by: Chunguang Xu <brookxu@tencent.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/077a6f67-aefa-4591-efec-f2f3af2b0b02@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 21 Mar 2020 19:08:26 +0000 (12:08 -0700)]
Merge tag 'block-5.6-20200320' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Just two NVMe fabrics fixes that should go into 5.6"
* tag 'block-5.6-20200320' of git://git.kernel.dk/linux-block:
nvmet-tcp: set MSG_MORE only if we actually have more to send
nvme-rdma: Avoid double freeing of async event data
Linus Torvalds [Sat, 21 Mar 2020 18:54:47 +0000 (11:54 -0700)]
Merge tag 'io_uring-5.6-20200320' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Two different fixes in here:
- Fix for a potential NULL pointer deref for links with async or
drain marked (Pavel)
- Fix for not properly checking RLIMIT_NOFILE for async punted
operations.
This affects openat/openat2, which were added this cycle, and
accept4. I did a full audit of other cases where we might check
current->signal->rlim[] and found only RLIMIT_FSIZE for buffered
writes and fallocate. That one is fixed and queued for 5.7 and
marked stable"
* tag 'io_uring-5.6-20200320' of git://git.kernel.dk/linux-block:
io_uring: make sure accept honor rlimit nofile
io_uring: make sure openat/openat2 honor rlimit nofile
io_uring: NULL-deref for IOSQE_{ASYNC,DRAIN}
Linus Torvalds [Sat, 21 Mar 2020 18:50:36 +0000 (11:50 -0700)]
Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown:
"Update to turbostat v20.03.20.
These patches unlock the full turbostat features for some new
machines, plus a couple other minor tweaks"
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: update version
tools/power turbostat: Print cpuidle information
tools/power turbostat: Fix 32-bit capabilities warning
tools/power turbostat: Fix missing SYS_LPI counter on some Chromebooks
tools/power turbostat: Support Elkhart Lake
tools/power turbostat: Support Jasper Lake
tools/power turbostat: Support Ice Lake server
tools/power turbostat: Support Tiger Lake
tools/power turbostat: Fix gcc build warnings
tools/power turbostat: Support Cometlake
Linus Torvalds [Sat, 21 Mar 2020 15:51:45 +0000 (08:51 -0700)]
Merge tag 'powerpc-5.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Two fixes for bugs introduced this cycle:
- fix a crash when shutting down a KVM PR guest (our original style
of KVM which doesn't use hypervisor mode)
- fix for the recently added 32-bit KASAN_VMALLOC support
Thanks to: Christophe Leroy, Greg Kurz, Sean Christopherson"
* tag 'powerpc-5.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
KVM: PPC: Fix kernel crash with PR KVM
powerpc/kasan: Fix shadow memory protection with CONFIG_KASAN_VMALLOC
Oscar Carter [Fri, 20 Mar 2020 18:13:26 +0000 (19:13 +0100)]
staging: vt6656: Use BIT() macro in vnt_mac_reg_bits_* functions
The last parameter in the functions vnt_mac_reg_bits_on and
vnt_mac_reg_bits_off defines the bits to set or unset. So, it's more
clear to use the BIT() macro instead of an hexadecimal value.
Sergio Paracuellos [Sat, 21 Mar 2020 07:26:50 +0000 (08:26 +0100)]
staging: mt7621-pci: delete release gpios related code
Making gpio8 and gpio9 vendor specific and putting them
into the specific dts file makes not needed to release
gpios anymore because we are not occupying those pins
in the first place if it is not necessary. When the
device tree is parsed we can also check and return for
the error because we rely in the fact that the related
device for the board is correct.
Sergio Paracuellos [Sat, 21 Mar 2020 07:26:49 +0000 (08:26 +0100)]
staging: mt7621-dts: gpio 8 and 9 are vendor specific
There are three pins that can be used for reset gpios.
As mentioned in the application note, there are two
possible way of wiring pcie reset:
* connect gpio19 to all pcie reset pins
* connect gpio19 to pcie0 reset and pick two other
gpios for pcie1 and pcie2
gpio7 and gpio8 may not be used as pcie reset and are
vendor specific. Hence, maintain common mt7621.dtsi with
only gpio19 which is common and make an overlay for gnubee
board which uses all gpio's as resets for pcie. After this
changes release gpios in driver code is not needed anymore.