www.infradead.org Git - users/dwmw2/linux.git/log

selftests/powerpc: Fix eeh-basic.sh exit codes

[ Upstream commit 996f9e0f93f16211945c8d5f18f296a88cb32f91 ]

The kselftests test running infrastructure expects tests to finish with an
exit code of 4 if the test decided it should be skipped. Currently
eeh-basic.sh exits with the number of devices that failed to recover, so if
four devices didn't recover we'll report a skip instead of a fail.

Fix this by checking if the return code is non-zero and report success
and failure by returning 0 or 1 respectively. For the cases where should
actually skip return 4.

Fixes: 85d86c8aa52e ("selftests/powerpc: Add basic EEH selftest")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201014024711.1138386-1-oohall@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

maiblox: mediatek: Fix handling of platform_get_irq() error

[ Upstream commit 558e4c36ec9f2722af4fe8ef84dc812bcdb5c43a ]

platform_get_irq() returns -ERRNO on error. In such case casting to u32
and comparing to 0 would pass the check.

Fixes: 623a6143a845 ("mailbox: mediatek: Add Mediatek CMDQ driver")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()

[ Upstream commit e7ec96fc7932f48a6d6cdd05bf82004a1a04285b ]

The changes referenced below replaced sbk_clone)_ by taking additional
references, passing the skb along and then freeing the skb. This
deleted the packets before they could be processed and additionally
passed bad data in each packet. Since pkt is stored in skb->cb
changing pkt->qp changed it for all the packets.

Replace skb_get() by sbk_clone() in rxe_rcv_mcast_pkt() for cases where
multiple QPs are receiving multicast packets on the same address.

Delete kfree_skb() because the packets need to live until they have been
processed by each QP. They are freed later.

Fixes: 86af61764151 ("IB/rxe: remove unnecessary skb_clone")
Fixes: fe896ceb5772 ("IB/rxe: replace refcount_inc with skb_get")
Link: https://lore.kernel.org/r/20201008203651.256958-1-rpearson@hpe.com
Signed-off-by: Bob Pearson <rpearson@hpe.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

IB/rdmavt: Fix sizeof mismatch

[ Upstream commit 8e71f694e0c819db39af2336f16eb9689f1ae53f ]

An incorrect sizeof is being used, struct rvt_ibport ** is not correct, it
should be struct rvt_ibport *. Note that since ** is the same size as
* this is not causing any issues. Improve this fix by using
sizeof(*rdi->ports) as this allows us to not even reference the type
of the pointer. Also remove line breaks as the entire statement can
fit on one line.

Link: https://lore.kernel.org/r/20201008095204.82683-1-colin.king@canonical.com
Addresses-Coverity: ("Sizeof not portable (SIZEOF_MISMATCH)")
Fixes: ff6acd69518e ("IB/rdmavt: Add device structure allocation")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier

[ Upstream commit a2d0230b91f7e23ceb5d8fb6a9799f30517ec33a ]

The patch avoids allocating cpufreq_policy on stack hence fixing frame
size overflow in 'powernv_cpufreq_reboot_notifier':

drivers/cpufreq/powernv-cpufreq.c: In function powernv_cpufreq_reboot_notifier:
drivers/cpufreq/powernv-cpufreq.c:906:1: error: the frame size of 2064 bytes is larger than 2048 bytes

Fixes: cf30af76 ("cpufreq: powernv: Set the cpus to nominal frequency during reboot/kexec")
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200922080254.41497-1-srikar@linux.vnet.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

i3c: master: Fix error return in cdns_i3c_master_probe()

[ Upstream commit abea14bfdebbe9bd02f2ad24a1f3a878ed21c8f0 ]

Fix to return negative error code -ENOMEM from the error handling
case instead of 0.

Fixes: 603f2bee2c54 ("i3c: master: Add driver for Cadence IP")
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://lore.kernel.org/linux-i3c/20200911033350.23904-1-jingxiangfeng@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/perf/hv-gpci: Fix starting index value

[ Upstream commit 0f9866f7e85765bbda86666df56c92f377c3bc10 ]

Commit 9e9f60108423f ("powerpc/perf/{hv-gpci, hv-common}: generate
requests with counters annotated") adds a framework for defining
gpci counters.
In this patch, they adds starting_index value as '0xffffffffffffffff'.
which is wrong as starting_index is of size 32 bits.

Because of this, incase we try to run hv-gpci event we get error.

In power9 machine:

command#: perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
          -C 0 -I 1000
event syntax error: '..bie_count_and_time_tlbie_instructions_issued/'
                                  \___ value too big for format, maximum is 4294967295

This patch fix this issue and changes starting_index value to '0xffffffff'

After this patch:

command#: perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ -C 0 -I 1000
     1.000085786              1,024      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
     2.000287818              1,024      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
     2.439113909             17,408      hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/

Fixes: 9e9f60108423 ("powerpc/perf/{hv-gpci, hv-common}: generate requests with counters annotated")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201003074943.338618-1-kjain@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/perf: Exclude pmc5/6 from the irrelevant PMU group constraints

[ Upstream commit 3b6c3adbb2fa42749c3d38cfc4d4d0b7e096bb7b ]

PMU counter support functions enforces event constraints for group of
events to check if all events in a group can be monitored. Incase of
event codes using PMC5 and PMC6 ( 500fa and 600f4 respectively ), not
all constraints are applicable, say the threshold or sample bits. But
current code includes pmc5 and pmc6 in some group constraints (like
IC_DC Qualifier bits) which is actually not applicable and hence
results in those events not getting counted when scheduled along with
group of other events. Patch fixes this by excluding PMC5/6 from
constraints which are not relevant for it.

Fixes: 7ffd948 ("powerpc/perf: factor out power8 pmu functions")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1600672204-1610-1-git-send-email-atrajeev@linux.vnet.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces

[ Upstream commit 5ce2dced8e95e76ff7439863a118a053a7fc6f91 ]

Report the "ipoib pkey", "mode" and "umcast" netlink attributes for every
IPoiB interface type, not just children created with 'ip link add'.

After setting the rtnl_link_ops for the parent interface, implement the
dellink() callback to block users from trying to remove it.

Fixes: 862096a8bbf8 ("IB/ipoib: Add more rtnl_link_ops callbacks")
Link: https://lore.kernel.org/r/20201004132948.26669-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

overflow: Include header file with SIZE_MAX declaration

[ Upstream commit a4947e84f23474803b62a2759b5808147e4e15f9 ]

The various array_size functions use SIZE_MAX define, but missed limits.h
causes to failure to compile code that needs overflow.h.

In file included from drivers/infiniband/core/uverbs_std_types_device.c:6:
./include/linux/overflow.h: In function 'array_size':
./include/linux/overflow.h:258:10: error: 'SIZE_MAX' undeclared (first use in this function)
258 | return SIZE_MAX;
| ^~~~~~~~

Fixes: 610b15c50e86 ("overflow.h: Add allocation size calculation helpers")
Link: https://lore.kernel.org/r/20200913102928.134985-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

kdb: Fix pager search for multi-line strings

[ Upstream commit d081a6e353168f15e63eb9e9334757f20343319f ]

Currently using forward search doesn't handle multi-line strings correctly.
The search routine replaces line breaks with \0 during the search and, for
regular searches ("help | grep Common\n"), there is code after the line
has been discarded or printed to replace the break character.

However during a pager search ("help\n" followed by "/Common\n") when the
string is matched we will immediately return to normal output and the code
that should restore the \n becomes unreachable. Fix this by restoring the
replaced character when we disable the search mode and update the comment
accordingly.

Fixes: fb6daa7520f9d ("kdb: Provide forward search at more prompt")
Link: https://lore.kernel.org/r/20200909141708.338273-1-daniel.thompson@linaro.org
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

mtd: spinand: gigadevice: Add QE Bit

[ Upstream commit aea7687e77bebce5b67fab9d03347bd8df7933c7 ]

The following GigaDevice chips have the QE BIT in the feature flags, I
checked the datasheets, but did not try this.
* GD5F1GQ4xExxG
* GD5F1GQ4xFxxG
* GD5F1GQ4UAYIG
* GD5F4GQ4UAYIG

The Quad operations like 0xEB mention that the QE bit has to be set.

Fixes: c93c613214ac ("mtd: spinand: add support for GigaDevice GD5FxGQ4xA")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Tested-by: Chuanhong Guo <gch981213@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200820165121.3192-3-hauke@hauke-m.de
Signed-off-by: Sasha Levin <sashal@kernel.org>

mtd: spinand: gigadevice: Only one dummy byte in QUADIO

[ Upstream commit 6387ad9caf8f09747a8569e5876086b72ee9382c ]

The datasheet only lists one dummy byte in the 0xEH operation for the
following chips:
* GD5F1GQ4xExxG
* GD5F1GQ4xFxxG
* GD5F1GQ4UAYIG
* GD5F4GQ4UAYIG

Fixes: c93c613214ac ("mtd: spinand: add support for GigaDevice GD5FxGQ4xA")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Tested-by: Chuanhong Guo <gch981213@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200820165121.3192-2-hauke@hauke-m.de
Signed-off-by: Sasha Levin <sashal@kernel.org>

mtd: rawnand: vf610: disable clk on error handling path in probe

[ Upstream commit cb7dc3178a9862614b1e7567d77f4679f027a074 ]

vf610_nfc_probe() does not invoke clk_disable_unprepare() on one error
handling path. The patch fixes that.

Found by Linux Driver Verification project (linuxtesting.org).

Fixes: 6f0ce4dfc5a3 ("mtd: rawnand: vf610: Avoid a potential NULL pointer dereference")
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200806072634.23528-1-novikov@ispras.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/hns: Fix missing sq_sig_type when querying QP

[ Upstream commit 05df49279f8926178ecb3ce88e61b63104cd6293 ]

The sq_sig_type field should be filled when querying QP, or the users may
get a wrong value.

Fixes: 926a01dc000d ("RDMA/hns: Add QP operations support for hip08 SoC")
Link: https://lore.kernel.org/r/1600509802-44382-9-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/hns: Fix the wrong value of rnr_retry when querying qp

[ Upstream commit 99fcf82521d91468ee6115a3c253aa032dc63cbc ]

The rnr_retry returned to the user is not correct, it should be got from
another fields in QPC.

Fixes: bfe860351e31 ("RDMA/hns: Fix cast from or to restricted __le32 for driver")
Link: https://lore.kernel.org/r/1600509802-44382-7-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf stat: Skip duration_time in setup_system_wide

[ Upstream commit 002a3d690f95804bdef6b70b26154103518e13d9 ]

Some metrics (such as DRAM_BW_Use) consists of uncore events and
duration_time. For uncore events, counter->core.system_wide is true. But
for duration_time, counter->core.system_wide is false so
target.system_wide is set to false.

Then 'enable_on_exec' is set in perf_event_attr of uncore event.  Kernel
will return error when trying to open the uncore event.

This patch skips the duration_time in setup_system_wide then
target.system_wide will be set to true for the evlist of uncore events +
duration_time.

Before (tested on skylake desktop):

  # perf stat -M DRAM_BW_Use -- sleep 1
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (arb/event=0x84,umask=0x1/).
  /bin/dmesg | grep -i perf may provide additional information.

After:

  # perf stat -M DRAM_BW_Use -- sleep 1

   Performance counter stats for 'system wide':

                169      arb/event=0x84,umask=0x1/ #     0.00 DRAM_BW_Use
             40,427      arb/event=0x81,umask=0x1/
      1,000,902,197 ns   duration_time

        1.000902197 seconds time elapsed

Fixes: e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200922015004.30114-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

i40iw: Add support to make destroy QP synchronous

[ Upstream commit f2334964e969762e266a616acf9377f6046470a2 ]

Occasionally ib_write_bw crash is seen due to access of a pd object in
i40iw_sc_qp_destroy after it is freed. Destroy qp is not synchronous in
i40iw and thus the iwqp object could be referencing a pd object that is
freed by ib core as a result of successful return from i40iw_destroy_qp.

Wait in i40iw_destroy_qp till all QP references are released and destroy
the QP and its associated resources before returning.  Switch to use the
refcount API vs atomic API for lifetime management of the qp.

RIP: 0010:i40iw_sc_qp_destroy+0x4b/0x120 [i40iw]
[...]
RSP: 0018:ffffb4a7042e3ba8 EFLAGS: 00010002
RAX: 0000000000000000 RBX: 0000000000000001 RCX: dead000000000122
RDX: ffffb4a7042e3bac RSI: ffff8b7ef9b1e940 RDI: ffff8b7efbf09080
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 8080808080808080 R11: 0000000000000010 R12: ffff8b7efbf08050
R13: 0000000000000001 R14: ffff8b7f15042928 R15: ffff8b7ef9b1e940
FS:  0000000000000000(0000) GS:ffff8b7f2fa00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000400 CR3: 000000020d60a006 CR4: 00000000001606e0
Call Trace:
  i40iw_exec_cqp_cmd+0x4d3/0x5c0 [i40iw]
  ? try_to_wake_up+0x1ea/0x5d0
  ? __switch_to_asm+0x40/0x70
  i40iw_process_cqp_cmd+0x95/0xa0 [i40iw]
  i40iw_handle_cqp_op+0x42/0x1a0 [i40iw]
  ? cm_event_handler+0x13c/0x1f0 [iw_cm]
  i40iw_rem_ref+0xa0/0xf0 [i40iw]
  cm_work_handler+0x99c/0xd10 [iw_cm]
  process_one_work+0x1a1/0x360
  worker_thread+0x30/0x380
  ? process_one_work+0x360/0x360
  kthread+0x10c/0x130
  ? kthread_park+0x80/0x80
  ret_from_fork+0x35/0x40

Fixes: d37498417947 ("i40iw: add files for iwarp interface")
Link: https://lore.kernel.org/r/20200916131811.2077-1-shiraz.saleem@intel.com
Reported-by: Kamal Heib <kheib@redhat.com>
Signed-off-by: Sindhu, Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz, Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/mlx5: Disable IB_DEVICE_MEM_MGT_EXTENSIONS if IB_WR_REG_MR can't work

[ Upstream commit 0ec52f0194638e2d284ad55eba5a7aff753de1b9 ]

set_reg_wr() always fails if !umr_modify_entity_size_disabled because
mlx5_ib_can_use_umr() always fails. Without set_reg_wr() IB_WR_REG_MR
doesn't work and that means the device should not advertise
IB_DEVICE_MEM_MGT_EXTENSIONS.

Fixes: 841b07f99a47 ("IB/mlx5: Block MR WR if UMR is not possible")
Link: https://lore.kernel.org/r/20200914112653.345244-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/hns: Set the unsupported wr opcode

[ Upstream commit 22d3e1ed2cc837af87f76c3c8a4ccf4455e225c5 ]

hip06 does not support IB_WR_LOCAL_INV, so the ps_opcode should be set to
an invalid value instead of being left uninitialized.

Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
Fixes: a2f3d4479fe9 ("RDMA/hns: Avoid unncessary initialization")
Link: https://lore.kernel.org/r/1600350615-115217-1-git-send-email-oulijun@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf intel-pt: Fix "context_switch event has no tid" error

[ Upstream commit 7d537a8d2e76bc4fc71e34545ceaa463ac2cd928 ]

A context_switch event can have no tid because pids can be detached from
a task while the task is still running (in do_exit()). Note this won't
happen with per-task contexts because then tracing stops at
perf_event_exit_task()

If a task with no tid gets preempted, or a dying task gets preempted and
its parent releases it, when it subsequently gets switched back in,
Intel PT will not be able to determine what task is running and prints
an error "context_switch event has no tid". However, it is not really an
error because the task is in kernel space and the decoder can continue
to decode successfully. Fix by changing the error to be only a logged
message, and make allowance for tid == -1.

Example:

  Using 5.9-rc4 with Preemptible Kernel (Low-Latency Desktop) e.g.
  $ uname -r
  5.9.0-rc4
  $ grep PREEMPT .config
  # CONFIG_PREEMPT_NONE is not set
  # CONFIG_PREEMPT_VOLUNTARY is not set
  CONFIG_PREEMPT=y
  CONFIG_PREEMPT_COUNT=y
  CONFIG_PREEMPTION=y
  CONFIG_PREEMPT_RCU=y
  CONFIG_PREEMPT_NOTIFIERS=y
  CONFIG_DRM_I915_PREEMPT_TIMEOUT=640
  CONFIG_DEBUG_PREEMPT=y
  # CONFIG_PREEMPT_TRACER is not set
  # CONFIG_PREEMPTIRQ_DELAY_TEST is not set

Before:

  $ cat forkit.c

  #include <sys/types.h>
  #include <unistd.h>
  #include <sys/wait.h>

  int main()
  {
          pid_t child;
          int status = 0;

          child = fork();
          if (child == 0)
                  return 123;
          wait(&status);
          return 0;
  }

  $ gcc -o forkit forkit.c
  $ sudo ~/bin/perf record --kcore -a -m,64M -e intel_pt/cyc/k &
  [1] 11016
  $ taskset 2 ./forkit
  $ sudo pkill perf
  $ [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 17.262 MB perf.data ]

  [1]+  Terminated              sudo ~/bin/perf record --kcore -a -m,64M -e intel_pt/cyc/k
  $ sudo ~/bin/perf script --show-task-events --show-switch-events --itrace=iqqe-o -C 1 --ns | grep -C 2 forkit
  context_switch event has no tid
           taskset 11019 [001] 66663.270045029:          1 instructions:k:  ffffffffb1d9f844 strnlen_user+0xb4 ([kernel.kallsyms])
           taskset 11019 [001] 66663.270201816:          1 instructions:k:  ffffffffb1a83121 unmap_page_range+0x561 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270327553: PERF_RECORD_COMM exec: forkit:11019/11019
            forkit 11019 [001] 66663.270420028:          1 instructions:k:  ffffffffb1db9537 __clear_user+0x27 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270648704:          1 instructions:k:  ffffffffb18829e6 do_user_addr_fault+0xf6 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270833163:          1 instructions:k:  ffffffffb230a825 irqentry_exit_to_user_mode+0x15 ([kernel.kallsyms])
            forkit 11019 [001] 66663.271092359:          1 instructions:k:  ffffffffb1aea3d9 lock_page_memcg+0x9 ([kernel.kallsyms])
            forkit 11019 [001] 66663.271207092: PERF_RECORD_FORK(11020:11020):(11019:11019)
            forkit 11019 [001] 66663.271234775: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid: 11020/11020
            forkit 11020 [001] 66663.271238407: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11019/11019
            forkit 11020 [001] 66663.271312066:          1 instructions:k:  ffffffffb1a88140 handle_mm_fault+0x10 ([kernel.kallsyms])
            forkit 11020 [001] 66663.271476225: PERF_RECORD_EXIT(11020:11020):(11019:11019)
            forkit 11020 [001] 66663.271497488: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 11019/11019
            forkit 11019 [001] 66663.271500523: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11020/11020
            forkit 11019 [001] 66663.271517241:          1 instructions:k:  ffffffffb24012cd error_entry+0x6d ([kernel.kallsyms])
            forkit 11019 [001] 66663.271664080: PERF_RECORD_EXIT(11019:11019):(1386:1386)

After:

  $ sudo ~/bin/perf script --show-task-events --show-switch-events --itrace=iqqe-o -C 1 --ns | grep -C 2 forkit
           taskset 11019 [001] 66663.270045029:          1 instructions:k:  ffffffffb1d9f844 strnlen_user+0xb4 ([kernel.kallsyms])
           taskset 11019 [001] 66663.270201816:          1 instructions:k:  ffffffffb1a83121 unmap_page_range+0x561 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270327553: PERF_RECORD_COMM exec: forkit:11019/11019
            forkit 11019 [001] 66663.270420028:          1 instructions:k:  ffffffffb1db9537 __clear_user+0x27 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270648704:          1 instructions:k:  ffffffffb18829e6 do_user_addr_fault+0xf6 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270833163:          1 instructions:k:  ffffffffb230a825 irqentry_exit_to_user_mode+0x15 ([kernel.kallsyms])
            forkit 11019 [001] 66663.271092359:          1 instructions:k:  ffffffffb1aea3d9 lock_page_memcg+0x9 ([kernel.kallsyms])
            forkit 11019 [001] 66663.271207092: PERF_RECORD_FORK(11020:11020):(11019:11019)
            forkit 11019 [001] 66663.271234775: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid: 11020/11020
            forkit 11020 [001] 66663.271238407: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11019/11019
            forkit 11020 [001] 66663.271312066:          1 instructions:k:  ffffffffb1a88140 handle_mm_fault+0x10 ([kernel.kallsyms])
            forkit 11020 [001] 66663.271476225: PERF_RECORD_EXIT(11020:11020):(11019:11019)
            forkit 11020 [001] 66663.271497488: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 11019/11019
            forkit 11019 [001] 66663.271500523: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11020/11020
            forkit 11019 [001] 66663.271517241:          1 instructions:k:  ffffffffb24012cd error_entry+0x6d ([kernel.kallsyms])
            forkit 11019 [001] 66663.271664080: PERF_RECORD_EXIT(11019:11019):(1386:1386)
            forkit 11019 [001] 66663.271688752: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:    -1/-1
               :-1    -1 [001] 66663.271692086: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11019/11019
                :-1    -1 [001] 66663.271707466:          1 instructions:k:  ffffffffb18eb096 update_load_avg+0x306 ([kernel.kallsyms])

Fixes: 86c2786994bd7c ("perf intel-pt: Add support for PERF_RECORD_SWITCH")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Link: http://lore.kernel.org/lkml/20200909084923.9096-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/cma: Consolidate the destruction of a cma_multicast in one place

[ Upstream commit 3788d2997bc0150ea911a964d5b5a2e11808a936 ]

Two places were open coding this sequence, and also pull in
cma_leave_roce_mc_group() which was called only once.

Link: https://lore.kernel.org/r/20200902081122.745412-8-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/cma: Remove dead code for kernel rdmacm multicast

[ Upstream commit 1bb5091def706732c749df9aae45fbca003696f2 ]

There is no kernel user of RDMA CM multicast so this code managing the
multicast subscription of the kernel-only internal QP is dead. Remove it.

This makes the bug fixes in the next patches much simpler.

Link: https://lore.kernel.org/r/20200902081122.745412-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/64s/radix: Fix mm_cpumask trimming race vs kthread_use_mm

[ Upstream commit a665eec0a22e11cdde708c1c256a465ebe768047 ]

Commit 0cef77c7798a7 ("powerpc/64s/radix: flush remote CPUs out of
single-threaded mm_cpumask") added a mechanism to trim the mm_cpumask of
a process under certain conditions. One of the assumptions is that
mm_users would not be incremented via a reference outside the process
context with mmget_not_zero() then go on to kthread_use_mm() via that
reference.

That invariant was broken by io_uring code (see previous sparc64 fix),
but I'll point Fixes: to the original powerpc commit because we are
changing that assumption going forward, so this will make backports
match up.

Fix this by no longer relying on that assumption, but by having each CPU
check the mm is not being used, and clearing their own bit from the mask
only if it hasn't been switched-to by the time the IPI is processed.

This relies on commit 38cf307c1f20 ("mm: fix kthread_use_mm() vs TLB
invalidate") and ARCH_WANT_IRQS_OFF_ACTIVATE_MM to disable irqs over mm
switch sequences.

Fixes: 0cef77c7798a7 ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Depends-on: 38cf307c1f20 ("mm: fix kthread_use_mm() vs TLB invalidate")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200914045219.3736466-5-npiggin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/tau: Disable TAU between measurements

[ Upstream commit e63d6fb5637e92725cf143559672a34b706bca4f ]

Enabling CONFIG_TAU_INT causes random crashes:

Unrecoverable exception 1700 at c0009414 (msr=1000)
Oops: Unrecoverable exception, sig: 6 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-pmac-00043-gd5f545e1a8593 #5
NIP:  c0009414 LR: c0009414 CTR: c00116fc
REGS: c0799eb8 TRAP: 1700   Not tainted  (5.7.0-pmac-00043-gd5f545e1a8593)
MSR:  00001000 <ME>  CR: 22000228  XER: 00000100

GPR00: 00000000 c0799f70 c076e300 00800000 0291c0ac 00e00000 c076e300 00049032
GPR08: 00000001 c00116fc 00000000 dfbd3200 ffffffff 007f80a8 00000000 00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 c075ce04
GPR24: c075ce04 dfff8880 c07b0000 c075ce04 00080000 00000001 c079ef98 c079ef5c
NIP [c0009414] arch_cpu_idle+0x24/0x6c
LR [c0009414] arch_cpu_idle+0x24/0x6c
Call Trace:
[c0799f70] [00000001] 0x1 (unreliable)
[c0799f80] [c0060990] do_idle+0xd8/0x17c
[c0799fa0] [c0060ba4] cpu_startup_entry+0x20/0x28
[c0799fb0] [c072d220] start_kernel+0x434/0x44c
[c0799ff0] [00003860] 0x3860
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX 3d20c07b XXXXXXXX XXXXXXXX XXXXXXXX 7c0802a6
XXXXXXXX XXXXXXXX XXXXXXXX 4e800421 XXXXXXXX XXXXXXXX XXXXXXXX 7d2000a6
---[ end trace 3a0c9b5cb216db6b ]---

Resolve this problem by disabling each THRMn comparator when handling
the associated THRMn interrupt and by disabling the TAU entirely when
updating THRMn thresholds.

Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5a0ba3dc5612c7aac596727331284a3676c08472.1599260540.git.fthain@telegraphics.com.au
Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/tau: Check processor type before enabling TAU interrupt

[ Upstream commit 5e3119e15fed5b9a9a7e528665ff098a4a8dbdbc ]

According to Freescale's documentation, MPC74XX processors have an
erratum that prevents the TAU interrupt from working, so don't try to
use it when running on those processors.

Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c281611544768e758bd58fe812cf702a5bd2d042.1599260540.git.fthain@telegraphics.com.au
Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/tau: Remove duplicated set_thresholds() call

[ Upstream commit 420ab2bc7544d978a5d0762ee736412fe9c796ab ]

The commentary at the call site seems to disagree with the code. The
conditional prevents calling set_thresholds() via the exception handler,
which appears to crash. Perhaps that's because it immediately triggers
another TAU exception. Anyway, calling set_thresholds() from TAUupdate()
is redundant because tau_timeout() does so.

Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d7c7ee33232cf72a6a6bbb6ef05838b2e2b113c0.1599260540.git.fthain@telegraphics.com.au
Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/tau: Convert from timer to workqueue

[ Upstream commit b1c6a0a10bfaf36ec82fde6f621da72407fa60a1 ]

Since commit 19dbdcb8039cf ("smp: Warn on function calls from softirq
context") the Thermal Assist Unit driver causes a warning like the
following when CONFIG_SMP is enabled.

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 0 at kernel/smp.c:428 smp_call_function_many_cond+0xf4/0x38c
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-pmac #3
  NIP:  c00b37a8 LR: c00b3abc CTR: c001218c
  REGS: c0799c60 TRAP: 0700   Not tainted  (5.7.0-pmac)
  MSR:  00029032 <EE,ME,IR,DR,RI>  CR: 42000224  XER: 00000000
  GPR00: c00b3abc c0799d18 c076e300 c079ef5c c0011fec 00000000 00000000 00000000
  GPR08: 00000100 00000100 00008000 ffffffff 42000224 00000000 c079d040 c079d044
  GPR16: 00000001 00000000 00000004 c0799da0 c079f054 c07a0000 c07a0000 00000000
  GPR24: c0011fec 00000000 c079ef5c c079ef5c 00000000 00000000 00000000 00000000
  NIP [c00b37a8] smp_call_function_many_cond+0xf4/0x38c
  LR [c00b3abc] on_each_cpu+0x38/0x68
  Call Trace:
  [c0799d18] [ffffffff] 0xffffffff (unreliable)
  [c0799d68] [c00b3abc] on_each_cpu+0x38/0x68
  [c0799d88] [c0096704] call_timer_fn.isra.26+0x20/0x7c
  [c0799d98] [c0096b40] run_timer_softirq+0x1d4/0x3fc
  [c0799df8] [c05b4368] __do_softirq+0x118/0x240
  [c0799e58] [c0039c44] irq_exit+0xc4/0xcc
  [c0799e68] [c000ade8] timer_interrupt+0x1b0/0x230
  [c0799ea8] [c0013520] ret_from_except+0x0/0x14
  --- interrupt: 901 at arch_cpu_idle+0x24/0x6c
      LR = arch_cpu_idle+0x24/0x6c
  [c0799f70] [00000001] 0x1 (unreliable)
  [c0799f80] [c0060990] do_idle+0xd8/0x17c
  [c0799fa0] [c0060ba8] cpu_startup_entry+0x24/0x28
  [c0799fb0] [c072d220] start_kernel+0x434/0x44c
  [c0799ff0] [00003860] 0x3860
  Instruction dump:
  8129f204 2f890000 40beff98 3d20c07a 8929eec4 2f890000 40beff88 0fe00000
  81220000 552805de 550802ef 4182ff84 <0fe00000> 3860ffff 7f65db78 7f44d378
  ---[ end trace 34a886e47819c2eb ]---

Don't call on_each_cpu() from a timer callback, call it from a worker
thread instead.

Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bb61650bea4f4c91fb8e24b9a6f130a1438651a7.1599260540.git.fthain@telegraphics.com.au
Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/tau: Use appropriate temperature sample interval

[ Upstream commit 66943005cc41f48e4d05614e8f76c0ca1812f0fd ]

According to the MPC750 Users Manual, the SITV value in Thermal
Management Register 3 is 13 bits long. The present code calculates the
SITV value as 60 * 500 cycles. This would overflow to give 10 us on
a 500 MHz CPU rather than the intended 60 us. (But according to the
Microprocessor Datasheet, there is also a factor of 266 that has to be
applied to this value on certain parts i.e. speed sort above 266 MHz.)
Always use the maximum cycle count, as recommended by the Datasheet.

Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/896f542e5f0f1d6cf8218524c2b67d79f3d69b3c.1599260540.git.fthain@telegraphics.com.au
Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/book3s64/hash/4k: Support large linear mapping range with 4K

[ Upstream commit 7746406baa3bc9e23fdd7b7da2f04d86e25ab837 ]

With commit: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel
regions in the same 0xc range"), we now split the 64TB address range
into 4 contexts each of 16TB. That implies we can do only 16TB linear
mapping.

On some systems, eg. Power9, memory attached to nodes > 0 will appear
above 16TB in the linear mapping. This resulted in kernel crash when
we boot such systems in hash translation mode with 4K PAGE_SIZE.

This patch updates the kernel mapping such that we now start supporting upto
61TB of memory with 4K. The kernel mapping now looks like below 4K PAGE_SIZE
and hash translation.

    vmalloc start     = 0xc0003d0000000000
    IO start          = 0xc0003e0000000000
    vmemmap start     = 0xc0003f0000000000

Our MAX_PHYSMEM_BITS for 4K is still 64TB even though we can only map 61TB.
We prevent bolt mapping anything outside 61TB range by checking against
H_VMALLOC_START.

Fixes: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range")
Reported-by: Cameron Berkenpas <cam@neo-zeon.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200608070904.387440-3-aneesh.kumar@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/qedr: Fix inline size returned for iWARP

[ Upstream commit fbf58026b2256e9cd5f241a4801d79d3b2b7b89d ]

commit 59e8970b3798 ("RDMA/qedr: Return max inline data in QP query
result") changed query_qp max_inline size to return the max roce inline
size. When iwarp was introduced, this should have been modified to return
the max inline size based on protocol. This size is cached in the device
attributes

Fixes: 69ad0e7fe845 ("RDMA/qedr: Add support for iWARP in user space")
Link: https://lore.kernel.org/r/20200902165741.8355-8-michal.kalderon@marvell.com
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/qedr: Fix return code if accept is called on a destroyed qp

[ Upstream commit 8a5a10a1a74465065c75d9de1aa6685e1f1aa117 ]

In iWARP, accept could be called after a QP is already destroyed. In this
case an error should be returned and not success.

Fixes: 82af6d19d8d9 ("RDMA/qedr: Fix synchronization methods and memory leaks in qedr")
Link: https://lore.kernel.org/r/20200902165741.8355-5-michal.kalderon@marvell.com
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/qedr: Fix use of uninitialized field

[ Upstream commit a379ad54e55a12618cae7f6333fd1b3071de9606 ]

dev->attr.page_size_caps was used uninitialized when setting device
attributes

Fixes: ec72fce401c6 ("qedr: Add support for RoCE HW init")
Link: https://lore.kernel.org/r/20200902165741.8355-4-michal.kalderon@marvell.com
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/qedr: Fix qp structure memory leak

[ Upstream commit 098e345a1a8faaad6e4e54d138773466cecc45d4 ]

The qedr_qp structure wasn't freed when the protocol was RoCE.  kmemleak
output when running basic RoCE scenario.

unreferenced object 0xffff927ad7e22c00 (size 1024):
  comm "ib_send_bw", pid 7082, jiffies 4384133693 (age 274.698s)
  hex dump (first 32 bytes):
    00 b0 cd a2 79 92 ff ff 00 3f a1 a2 79 92 ff ff  ....y....?..y...
    00 ee 5c dd 80 92 ff ff 00 f6 5c dd 80 92 ff ff  ..\.......\.....
  backtrace:
    [<00000000b2ba0f35>] qedr_create_qp+0xb3/0x6c0 [qedr]
    [<00000000e85a43dd>] ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x555/0xad0 [ib_uverbs]
    [<00000000fee4d029>] ib_uverbs_cmd_verbs+0xa5a/0xb80 [ib_uverbs]
    [<000000005d622660>] ib_uverbs_ioctl+0xa4/0x110 [ib_uverbs]
    [<00000000eb4cdc71>] ksys_ioctl+0x87/0xc0
    [<00000000abe6b23a>] __x64_sys_ioctl+0x16/0x20
    [<0000000046e7cef4>] do_syscall_64+0x4d/0x90
    [<00000000c6948f76>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 1212767e23bb ("qedr: Add wrapping generic structure for qpidr and adjust idr routines.")
Link: https://lore.kernel.org/r/20200902165741.8355-2-michal.kalderon@marvell.com
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/umem: Prevent small pages from being returned by ib_umem_find_best_pgsz()

[ Upstream commit 10c75ccb54e4fe548cb16d7ed426d7d709e6ae76 ]

rdma_for_each_block() makes assumptions about how the SGL is constructed
that don't work if the block size is below the page size used to to build
the SGL.

The rules for umem SGL construction require that the SG's all be PAGE_SIZE
aligned and we don't encode the actual byte offset of the VA range inside
the SGL using offset and length. So rdma_for_each_block() has no idea
where the actual starting/ending point is to compute the first/last block
boundary if the starting address should be within a SGL.

Fixing the SGL construction turns out to be really hard, and will be the
subject of other patches. For now block smaller pages.

Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR")
Link: https://lore.kernel.org/r/2-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/umem: Fix ib_umem_find_best_pgsz() for mappings that cross a page boundary

[ Upstream commit a40c20dabdf9045270767c75918feb67f0727c89 ]

It is possible for a single SGL to span an aligned boundary, eg if the SGL
is

61440 -> 90112

Then the length is 28672, which currently limits the block size to
32k. With a 32k page size the two covering blocks will be:

32768->65536 and 65536->98304

However, the correct answer is a 128K block size which will span the whole
28672 bytes in a single block.

Instead of limiting based on length figure out which high IOVA bits don't
change between the start and end addresses. That is the highest useful
page size.

Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR")
Link: https://lore.kernel.org/r/1-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

xfs: fix high key handling in the rt allocator's query_range function

[ Upstream commit d88850bd5516a77c6f727e8b6cefb64e0cc929c7 ]

Fix some off-by-one errors in xfs_rtalloc_query_range.  The highest key
in the realtime bitmap is always one less than the number of rt extents,
which means that the key clamp at the start of the function is wrong.
The 4th argument to xfs_rtfind_forw is the highest rt extent that we
want to probe, which means that passing 1 less than the high key is
wrong.  Finally, drop the rem variable that controls the loop because we
can compare the iteration point (rtstart) against the high key directly.

The sordid history of this function is that the original commit (fb3c3)
incorrectly passed (high_rec->ar_startblock - 1) as the 'limit' parameter
to xfs_rtfind_forw.  This was wrong because the "high key" is supposed
to be the largest key for which the caller wants result rows, not the
key for the first row that could possibly be outside the range that the
caller wants to see.

A subsequent attempt (8ad56) to strengthen the parameter checking added
incorrect clamping of the parameters to the number of rt blocks in the
system (despite the bitmap functions all taking units of rt extents) to
avoid querying ranges past the end of rt bitmap file but failed to fix
the incorrect _rtfind_forw parameter.  The original _rtfind_forw
parameter error then survived the conversion of the startblock and
blockcount fields to rt extents (a0e5c), and the most recent off-by-one
fix (a3a37) thought it was patching a problem when the end of the rt
volume is not in use, but none of these fixes actually solved the
original problem that the author was confused about the "limit" argument
to xfs_rtfind_forw.

Sadly, all four of these patches were written by this author and even
his own usage of this function and rt testing were inadequate to get
this fixed quickly.

Original-problem: fb3c3de2f65c ("xfs: add a couple of queries to iterate free extents in the rtbitmap")
Not-fixed-by: 8ad560d2565e ("xfs: strengthen rtalloc query range checks")
Not-fixed-by: a0e5c435babd ("xfs: fix xfs_rtalloc_rec units")
Fixes: a3a374bf1889 ("xfs: fix off-by-one error in xfs_rtalloc_query_range")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

xfs: fix deadlock and streamline xfs_getfsmap performance

[ Upstream commit 8ffa90e1145c70c7ac47f14b59583b2296d89e72 ]

Refactor xfs_getfsmap to improve its performance: instead of indirectly
calling a function that copies one record to userspace at a time, create
a shadow buffer in the kernel and copy the whole array once at the end.
On the author's computer, this reduces the runtime on his /home by ~20%.

This also eliminates a deadlock when running GETFSMAP against the
realtime device. The current code locks the rtbitmap to create
fsmappings and copies them into userspace, having not released the
rtbitmap lock. If the userspace buffer is an mmap of a sparse file that
itself resides on the realtime device, the write page fault will recurse
into the fs for allocation, which will deadlock on the rtbitmap lock.

Fixes: 4c934c7dd60c ("xfs: report realtime space information via the rtbitmap")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

xfs: limit entries returned when counting fsmap records

[ Upstream commit acd1ac3aa22fd58803a12d26b1ab7f70232f8d8d ]

If userspace asked fsmap to count the number of entries, we cannot
return more than UINT_MAX entries because fmh_entries is u32.
Therefore, stop counting if we hit this limit or else we will waste time
to return truncated results.

Fixes: e89c041338ed ("xfs: implement the GETFSMAP ioctl")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ida: Free allocated bitmap in error path

[ Upstream commit a219b856a2b993da234108307be772448f22b0ce ]

If a bitmap needs to be allocated, and then by the time the thread
is scheduled to be run again all the indices which would satisfy the
allocation have been allocated then we would leak the allocation. Almost
impossible to hit in practice, but a trivial fix. Found by Coverity.

Fixes: f32f004cddf8 ("ida: Convert to XArray")
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

arc: plat-hsdk: fix kconfig dependency warning when !RESET_CONTROLLER

[ Upstream commit 63bcf87cb1c57956e1179f1a78dde625c7e3cba7 ]

When ARC_SOC_HSDK is enabled and RESET_CONTROLLER is disabled, it results
in the following Kbuild warning:

WARNING: unmet direct dependencies detected for RESET_HSDK
  Depends on [n]: RESET_CONTROLLER [=n] && HAS_IOMEM [=y] && (ARC_SOC_HSDK [=y] || COMPILE_TEST [=n])
  Selected by [y]:
  - ARC_SOC_HSDK [=y] && ISA_ARCV2 [=y]

The reason is that ARC_SOC_HSDK selects RESET_HSDK without depending on or
selecting RESET_CONTROLLER while RESET_HSDK is subordinate to
RESET_CONTROLLER.

Honor the kconfig menu hierarchy to remove kconfig dependency warnings.

Fixes: a528629dfd3b ("ARC: [plat-hsdk] select CONFIG_RESET_HSDK from Kconfig")
Signed-off-by: Necip Fazil Yildiran <fazilyildiran@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ARM: 9007/1: l2c: fix prefetch bits init in L2X0_AUX_CTRL using DT values

[ Upstream commit 8e007b367a59bcdf484c81f6df9bd5a4cc179ca6 ]

The L310_PREFETCH_CTRL register bits 28 and 29 to enable data and
instruction prefetch respectively can also be accessed via the
L2X0_AUX_CTRL register.  They appear to be actually wired together in
hardware between the registers.  Changing them in the prefetch
register only will get undone when restoring the aux control register
later on.  For this reason, set these bits in both registers during
initialisation according to the devicetree property values.

Link: https://lore.kernel.org/lkml/76f2f3ad5e77e356e0a5b99ceee1e774a2842c25.1597061474.git.guillaume.tucker@collabora.com/
Fixes: ec3bd0e68a67 ("ARM: 8391/1: l2c: add options to overwrite prefetching behavior")
Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>

mtd: mtdoops: Don't write panic data twice

[ Upstream commit c1cf1d57d1492235309111ea6a900940213a9166 ]

If calling mtdoops_write, don't also schedule work to be done later.

Although this appears to not be causing an issue, possibly because the
scheduled work will never get done, it is confusing.

Fixes: 016c1291ce70 ("mtd: mtdoops: do not use mtd->panic_write directly")
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200903034217.23079-1-mark.tomlinson@alliedtelesis.co.nz
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/mlx5: Fix potential race between destroy and CQE poll

[ Upstream commit 4b916ed9f9e85f705213ca8d69771d3c1cd6ee5a ]

The SRQ can be destroyed right before mlx5_cmd_get_srq is called.
In such case the latter will return NULL instead of expected SRQ.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Link: https://lore.kernel.org/r/20200830084010.102381-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

pseries/drmem: don't cache node id in drmem_lmb struct

[ Upstream commit e5e179aa3a39c818db8fbc2dce8d2cd24adaf657 ]

At memory hot-remove time we can retrieve an LMB's nid from its
corresponding memory_block.  There is no need to store the nid
in multiple locations.

Note that lmb_to_memblock() uses find_memory_block() to get the
corresponding memory_block.  As find_memory_block() runs in sub-linear
time this approach is negligibly slower than what we do at present.

In exchange for this lookup at hot-remove time we no longer need to
call memory_add_physaddr_to_nid() during drmem_init() for each LMB.
On powerpc, memory_add_physaddr_to_nid() is a linear search, so this
spares us an O(n^2) initialization during boot.

On systems with many LMBs that initialization overhead is palpable and
disruptive.  For example, on a box with 249854 LMBs we're seeing
drmem_init() take upwards of 30 seconds to complete:

[   53.721639] drmem: initializing drmem v2
[   80.604346] watchdog: BUG: soft lockup - CPU#65 stuck for 23s! [swapper/0:1]
[   80.604377] Modules linked in:
[   80.604389] CPU: 65 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc2+ #4
[   80.604397] NIP:  c0000000000a4980 LR: c0000000000a4940 CTR: 0000000000000000
[   80.604407] REGS: c0002dbff8493830 TRAP: 0901   Not tainted  (5.6.0-rc2+)
[   80.604412] MSR:  8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 44000248  XER: 0000000d
[   80.604431] CFAR: c0000000000a4a38 IRQMASK: 0
[   80.604431] GPR00: c0000000000a4940 c0002dbff8493ac0 c000000001904400 c0003cfffffede30
[   80.604431] GPR04: 0000000000000000 c000000000f4095a 000000000000002f 0000000010000000
[   80.604431] GPR08: c0000bf7ecdb7fb8 c0000bf7ecc2d3c8 0000000000000008 c00c0002fdfb2001
[   80.604431] GPR12: 0000000000000000 c00000001e8ec200
[   80.604477] NIP [c0000000000a4980] hot_add_scn_to_nid+0xa0/0x3e0
[   80.604486] LR [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0
[   80.604492] Call Trace:
[   80.604498] [c0002dbff8493ac0] [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0 (unreliable)
[   80.604509] [c0002dbff8493b20] [c000000000087c10] memory_add_physaddr_to_nid+0x20/0x60
[   80.604521] [c0002dbff8493b40] [c0000000010d4880] drmem_init+0x25c/0x2f0
[   80.604530] [c0002dbff8493c10] [c000000000010154] do_one_initcall+0x64/0x2c0
[   80.604540] [c0002dbff8493ce0] [c0000000010c4aa0] kernel_init_freeable+0x2d8/0x3a0
[   80.604550] [c0002dbff8493db0] [c000000000010824] kernel_init+0x2c/0x148
[   80.604560] [c0002dbff8493e20] [c00000000000b648] ret_from_kernel_thread+0x5c/0x74
[   80.604567] Instruction dump:
[   80.604574] 392918e8 e9490000 e90a000a e92a0000 80ea000c 1d080018 3908ffe8 7d094214
[   80.604586] 7fa94040 419d00dc e9490010 714a0088 <2faa0008> 409e00ac e9490000 7fbe5040
[   89.047390] drmem: 249854 LMB(s)

With a patched kernel on the same machine we're no longer seeing the
soft lockup.  drmem_init() now completes in negligible time, even when
the LMB count is large.

Fixes: b2d3b5ee66f2 ("powerpc/pseries: Track LMB nid instead of using device tree")
Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200811015115.63677-1-cheloha@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/pseries: explicitly reschedule during drmem_lmb list traversal

[ Upstream commit 9d6792ffe140240ae54c881cc4183f9acc24b4df ]

The drmem lmb list can have hundreds of thousands of entries, and
unfortunately lookups take the form of linear searches. As long as
this is the case, traversals have the potential to monopolize the CPU
and provoke lockup reports, workqueue stalls, and the like unless
they explicitly yield.

Rather than placing cond_resched() calls within various
for_each_drmem_lmb() loop blocks in the code, put it in the iteration
expression of the loop macro itself so users can't omit it.

Introduce a drmem_lmb_next() iteration helper function which calls
cond_resched() at a regular interval during array traversal. Each
iteration of the loop in DLPAR code paths can involve around ten RTAS
calls which can each take up to 250us, so this ensures the check is
performed at worst every few milliseconds.

Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200813151131.2070161-1-nathanl@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/umem: Fix signature of stub ib_umem_find_best_pgsz()

[ Upstream commit 61690d01db32eb1f94adc9ac2b8bb741d34e4671 ]

The original function returns unsigned long and 0 on failure.

Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR")
Link: https://lore.kernel.org/r/0-v1-982a13cc5c6d+501ae-fix_best_pgsz_stub_jgg@nvidia.com
Reviewed-by: Gal Pressman <galpress@amazon.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/hns: Add a check for current state before modifying QP

[ Upstream commit e0ef0f68c4c0d85b1eb63f38d5d10324361280e8 ]

It should be considered an illegal operation if the ULP attempts to modify
a QP from another state to the current hardware state. Otherwise, the ULP
can modify some fields of QPC at any time. For example, for a QP in state
of RTS, modify it from RTR to RTS can change the PSN, which is always not
as expected.

Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
Link: https://lore.kernel.org/r/1598353674-24270-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

mtd: lpddr: fix excessive stack usage with clang

[ Upstream commit 3e1b6469f8324bee5927b063e2aca30d3e56b907 ]

Building lpddr2_nvm with clang can result in a giant stack usage
in one function:

drivers/mtd/lpddr/lpddr2_nvm.c:399:12: error: stack frame size of 1144 bytes in function 'lpddr2_nvm_probe' [-Werror,-Wframe-larger-than=]

The problem is that clang decides to build a copy of the mtd_info
structure on the stack and then do a memcpy() into the actual version. It
shouldn't really do it that way, but it's not strictly a bug either.

As a workaround, use a static const version of the structure to assign
most of the members upfront and then only set the few members that
require runtime knowledge at probe time.

Fixes: 96ba9dd65788 ("mtd: lpddr: add driver for LPDDR2-NVM PCM memories")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200505140136.263461-1-arnd@arndb.de
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/ucma: Add missing locking around rdma_leave_multicast()

[ Upstream commit 38e03d092699891c3237b5aee9e8029d4ede0956 ]

All entry points to the rdma_cm from a ULP must be single threaded,
even this error unwinds. Add the missing locking.

Fixes: 7c11910783a1 ("RDMA/ucma: Put a lock around every call to the rdma_cm layer")
Link: https://lore.kernel.org/r/20200818120526.702120-11-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

RDMA/ucma: Fix locking for ctx->events_reported

[ Upstream commit 98837c6c3d7285f6eca86480b6f7fac6880e27a8 ]

This value is locked under the file->mut, ensure it is held whenever
touching it.

The case in ucma_migrate_id() is a race, while in ucma_free_uctx() it is
already not possible for the write side to run, the movement is just for
clarity.

Fixes: 88314e4dda1e ("RDMA/cma: add support for rdma_migrate_id()")
Link: https://lore.kernel.org/r/20200818120526.702120-10-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/icp-hv: Fix missing of_node_put() in success path

[ Upstream commit d3e669f31ec35856f5e85df9224ede5bdbf1bc7b ]

Both of_find_compatible_node() and of_find_node_by_type() will return
a refcounted node on success - thus for the success path the node must
be explicitly released with a of_node_put().

Fixes: 0b05ac6e2480 ("powerpc/xics: Rewrite XICS driver")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1530691407-3991-1-git-send-email-hofrat@osadl.org
Signed-off-by: Sasha Levin <sashal@kernel.org>

powerpc/pseries: Fix missing of_node_put() in rng_init()

[ Upstream commit 67c3e59443f5fc77be39e2ce0db75fbfa78c7965 ]

The call to of_find_compatible_node() returns a node pointer with
refcount incremented thus it must be explicitly decremented here
before returning.

Fixes: a489043f4626 ("powerpc/pseries: Implement arch_get_random_long() based on H_RANDOM")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1530522496-14816-1-git-send-email-hofrat@osadl.org
Signed-off-by: Sasha Levin <sashal@kernel.org>

IB/mlx4: Adjust delayed work when a dup is observed

[ Upstream commit 785167a114855c5aa75efca97000e405c2cc85bf ]

When scheduling delayed work to clean up the cache, if the entry already
has been scheduled for deletion, we adjust the delay.

Fixes: 3cf69cc8dbeb ("IB/mlx4: Add CM paravirtualization")
Link: https://lore.kernel.org/r/20200803061941.1139994-7-haakon.bugge@oracle.com
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

IB/mlx4: Fix starvation in paravirt mux/demux

[ Upstream commit 7fd1507df7cee9c533f38152fcd1dd769fcac6ce ]

The mlx4 driver will proxy MAD packets through the PF driver. A VM or an
instantiated VF will send its MAD packets to the PF driver using
loop-back. The PF driver will be informed by an interrupt, but defer the
handling and polling of CQEs to a worker thread running on an ordered
work-queue.

Consider the following scenario: the VMs will in short proximity in time,
for example due to a network event, send many MAD packets to the PF
driver. Lets say there are K VMs, each sending N packets.

The interrupt from the first VM will start the worker thread, which will
poll N CQEs. A common case here is where the PF driver will multiplex the
packets received from the VMs out on the wire QP.

But before the wire QP has returned a send CQE and associated interrupt,
the other K - 1 VMs have sent their N packets as well.

The PF driver has to multiplex K * N packets out on the wire QP. But the
send-queue on the wire QP has a finite capacity.

So, in this scenario, if K * N is larger than the send-queue capacity of
the wire QP, we will get MAD packets dropped on the floor with this
dynamic debug message:

mlx4_ib_multiplex_mad: failed sending GSI to wire on behalf of slave 2 (-11)

and this despite the fact that the wire send-queue could have capacity,
but the PF driver isn't aware, because the wire send CQEs have not yet
been polled.

We can also have a similar scenario inbound, with a wire recv-queue larger
than the tunnel QP's send-queue. If many remote peers send MAD packets to
the very same VM, the tunnel send-queue destined to the VM could allegedly
be construed to be full by the PF driver.

This starvation is fixed by introducing separate work queues for the wire
QPs vs. the tunnel QPs.

With this fix, using a dual ported HCA, 8 VFs instantiated, we could run
cmtime on each of the 18 interfaces towards a similar configured peer,
each cmtime instance with 800 QPs (all in all 14400 QPs) without a single
CM packet getting lost.

Fixes: 3cf69cc8dbeb ("IB/mlx4: Add CM paravirtualization")
Link: https://lore.kernel.org/r/20200803061941.1139994-5-haakon.bugge@oracle.com
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

i3c: master add i3c_master_attach_boardinfo to preserve boardinfo

[ Upstream commit 9da36a7ec42135428e1d41621e3703429bda3b2e ]

Boardinfo was lost if I3C object for devices with boardinfo
available are not created or not added to the I3C device list
because of some failure e.g. SETDASA failed, retrieve info failed etc
This patch adds i3c_master_attach_boardinfo which scan boardinfo list
in the master object and 'attach' it to the I3C device object.

Fixes: 3a379bbcea0a ("i3c: Add core I3C infrastructure")
Signed-off-by: Parshuram Thombare <pthombar@cadence.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://lore.kernel.org/linux-i3c/1590053542-389-1-git-send-email-pthombar@cadence.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests/ftrace: Change synthetic event name for inter-event-combined test

[ Upstream commit 96378b2088faea68f1fb05ea6b9a566fc569a44c ]

This test uses waking+wakeup_latency as an event name, which doesn't
make sense since it includes an operator. Illegal names are now
detected by the synthetic event command parsing, which causes this
test to fail. Change the name to 'waking_plus_wakeup_latency' to
prevent this.

Link: https://lkml.kernel.org/r/a1ee2f76ff28ef7166fb788ca8be968887808920.1602598160.git.zanussi@kernel.org
Fixes: f06eec4d0f2c (selftests: ftrace: Add inter-event hist triggers testcases)
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

fs: fix NULL dereference due to data race in prepend_path()

[ Upstream commit 09cad07547445bf3a41683e4d3abcd154c123ef5 ]

Fix data race in prepend_path() with re-reading mnt->mnt_ns twice
without holding the lock.

is_mounted() does check for NULL, but is_anon_ns(mnt->mnt_ns) might
re-read the pointer again which could be NULL already, if in between
reads one of kern_unmount()/kern_unmount_array()/umount_tree() sets
mnt->mnt_ns to NULL.

This is seen in production with the following stack trace:

  BUG: kernel NULL pointer dereference, address: 0000000000000048
  ...
  RIP: 0010:prepend_path.isra.4+0x1ce/0x2e0
  Call Trace:
    d_path+0xe6/0x150
    proc_pid_readlink+0x8f/0x100
    vfs_readlink+0xf8/0x110
    do_readlinkat+0xfd/0x120
    __x64_sys_readlinkat+0x1a/0x20
    do_syscall_64+0x42/0x110
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: f2683bd8d5bd ("[PATCH] fix d_absolute_path() interplay with fsmount()")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary

[ Upstream commit 67197a4f28d28d0b073ab0427b03cb2ee5382578 ]

Currently __set_oom_adj loops through all processes in the system to keep
oom_score_adj and oom_score_adj_min in sync between processes sharing
their mm.  This is done for any task with more that one mm_users, which
includes processes with multiple threads (sharing mm and signals).
However for such processes the loop is unnecessary because their signal
structure is shared as well.

Android updates oom_score_adj whenever a tasks changes its role
(background/foreground/...) or binds to/unbinds from a service, making it
more/less important.  Such operation can happen frequently.  We noticed
that updates to oom_score_adj became more expensive and after further
investigation found out that the patch mentioned in "Fixes" introduced a
regression.  Using Pixel 4 with a typical Android workload, write time to
oom_score_adj increased from ~3.57us to ~362us.  Moreover this regression
linearly depends on the number of multi-threaded processes running on the
system.

Mark the mm with a new MMF_MULTIPROCESS flag bit when task is created with
(CLONE_VM && !CLONE_THREAD && !CLONE_VFORK).  Change __set_oom_adj to use
MMF_MULTIPROCESS instead of mm_users to decide whether oom_score_adj
update should be synchronized between multiple processes.  To prevent
races between clone() and __set_oom_adj(), when oom_score_adj of the
process being cloned might be modified from userspace, we use
oom_adj_mutex.  Its scope is changed to global.

The combination of (CLONE_VM && !CLONE_THREAD) is rarely used except for
the case of vfork().  To prevent performance regressions of vfork(), we
skip taking oom_adj_mutex and setting MMF_MULTIPROCESS when CLONE_VFORK is
specified.  Clearing the MMF_MULTIPROCESS flag (when the last process
sharing the mm exits) is left out of this patch to keep it simple and
because it is believed that this threading model is rare.  Should there
ever be a need for optimizing that case as well, it can be done by hooking
into the exit path, likely following the mm_update_next_owner pattern.

With the combination of (CLONE_VM && !CLONE_THREAD && !CLONE_VFORK) being
quite rare, the regression is gone after the change is applied.

[surenb@google.com: v3]
Link: https://lkml.kernel.org/r/20200902012558.2335613-1-surenb@google.com
Fixes: 44a70adec910 ("mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj")
Reported-by: Tim Murray <timmurray@google.com>
Suggested-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Eugene Syromiatnikov <esyr@redhat.com>
Cc: Christian Kellner <christian@kellner.me>
Cc: Adrian Reber <areber@redhat.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexey Gladkov <gladkov.alexey@gmail.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Yafang Shao <laoar.shao@gmail.com>
Link: https://lkml.kernel.org/r/20200824153036.3201505-1-surenb@google.com
Debugged-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/memcg: fix device private memcg accounting

[ Upstream commit 9a137153fc8798a89d8fce895cd0a06ea5b8e37c ]

The code in mc_handle_swap_pte() checks for non_swap_entry() and returns
NULL before checking is_device_private_entry() so device private pages are
never handled. Fix this by checking for non_swap_entry() after handling
device private swap PTEs.

I assume the memory cgroup accounting would be off somehow when moving
a process to another memory cgroup. Currently, the device private page
is charged like a normal anonymous page when allocated and is uncharged
when the page is freed so I think that path is OK.

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Link: https://lkml.kernel.org/r/20201009215952.2726-1-rcampbell@nvidia.com
xFixes: c733a82874a7 ("mm/memcontrol: support MEMORY_DEVICE_PRIVATE")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/swapfile.c: fix potential memory leak in sys_swapon

[ Upstream commit 822bca52ee7eb279acfba261a423ed7ac47d6f73 ]

If we failed to drain inode, we would forget to free the swap address
space allocated by init_swap_address_space() above.

Fixes: dc617f29dbe5 ("vfs: don't allow writes to swap files")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Link: https://lkml.kernel.org/r/20200930101803.53884-1-linmiaohe@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_log: missing vlan offload tag and proto

[ Upstream commit 0d9826bc18ce356e8909919ad681ad65d0a6061e ]

Dump vlan tag and proto for the usual vlan offload case if the
NF_LOG_MACDECODE flag is set on. Without this information the logging is
misleading as there is no reference to the VLAN header.

[12716.993704] test: IN=veth0 OUT= MACSRC=86:6c:92:ea:d6:73 MACDST=0e:3b:eb:86:73:76 VPROTO=8100 VID=10 MACPROTO=0800 SRC=192.168.10.2 DST=172.217.168.163 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=2548 DF PROTO=TCP SPT=55848 DPT=80 WINDOW=501 RES=0x00 ACK FIN URGP=0
[12721.157643] test: IN=veth0 OUT= MACSRC=86:6c:92:ea:d6:73 MACDST=0e:3b:eb:86:73:76 VPROTO=8100 VID=10 MACPROTO=0806 ARP HTYPE=1 PTYPE=0x0800 OPCODE=2 MACSRC=86:6c:92:ea:d6:73 IPSRC=192.168.10.2 MACDST=0e:3b:eb:86:73:76 IPDST=192.168.10.1

Fixes: 83e96d443b37 ("netfilter: log: split family specific code to nf_log_{ip,ip6,common}.c files")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: korina: fix kfree of rx/tx descriptor array

[ Upstream commit 3af5f0f5c74ecbaf757ef06c3f80d56751277637 ]

kmalloc returns KSEG0 addresses so convert back from KSEG1
in kfree. Also make sure array is freed when the driver is
unloaded from the kernel.

Fixes: ef11291bcd5f ("Add support the Korina (IDT RC32434) Ethernet MAC")
Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ipvs: clear skb->tstamp in forwarding path

[ Upstream commit 7980d2eabde82be86c5be18aa3d07e88ec13c6a1 ]

fq qdisc requires tstamp to be cleared in forwarding path

Reported-by: Evgeny B <abt-admin@mail.ru>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=209427
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: 8203e2d844d3 ("net: clear skb->tstamp in forwarding paths")
Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time.")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

mwifiex: fix double free

[ Upstream commit 53708f4fd9cfe389beab5c8daa763bcd0e0b4aef ]

clang static analysis reports this problem:

sdio.c:2403:3: warning: Attempt to free released memory
        kfree(card->mpa_rx.buf);
        ^~~~~~~~~~~~~~~~~~~~~~~

When mwifiex_init_sdio() fails in its first call to
mwifiex_alloc_sdio_mpa_buffer, it falls back to calling it
again.  If the second alloc of mpa_tx.buf fails, the error
handler will try to free the old, previously freed mpa_rx.buf.
Reviewing the code, it looks like a second double free would
happen with mwifiex_cleanup_sdio().

So set both pointers to NULL when they are freed.

Fixes: 5e6e3a92b9a4 ("wireless: mwifiex: initial commit for Marvell mwifiex driver")
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201004131931.29782-1-trix@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

platform/x86: mlx-platform: Remove PSU EEPROM configuration

[ Upstream commit c071afcea6ecf24a3c119f25ce9f71ffd55b5dc2 ]

Remove PSU EEPROM configuration for systems class equipped with
Mellanox chip Spectrume-2. Till now all the systems from this class
used few types of power units, all equipped with EEPROM device with
address space two bytes. Thus, all these devices have been handled by
EEPROM driver "24c32".
There is a new requirement is to support power unit replacement by "off
the shelf" device, matching electrical required parameters. Such device
could be equipped with different EEPROM type, which could be one byte
address space addressing or even could be not equipped with EEPROM.
In such case "24c32" will not work.

Fixes: 1bd42d94ccab ("platform/x86: mlx-platform: Add support for new 200G IB and Ethernet systems")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200923172053.26296-2-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ipmi_si: Fix wrong return value in try_smi_init()

[ Upstream commit 8fe7990ceda8597e407d06bffc4bdbe835a93ece ]

On an error exit path, a negative error code should be returned
instead of a positive return value.

Fixes: 90b2d4f15ff7 ("ipmi_si: Remove hacks for adding a dummy platform devices")
Cc: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Message-Id: <20201005145212.84435-1-tianjia.zhang@linux.alibaba.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: be2iscsi: Fix a theoretical leak in beiscsi_create_eqs()

[ Upstream commit 38b2db564d9ab7797192ef15d7aade30633ceeae ]

The be_fill_queue() function can only fail when "eq_vaddress" is NULL and
since it's non-NULL here that means the function call can't fail. But
imagine if it could, then in that situation we would want to store the
"paddr" so that dma memory can be released.

Link: https://lore.kernel.org/r/20200928091300.GD377727@mwanda
Fixes: bfead3b2cb46 ("[SCSI] be2iscsi: Adding msix and mcc_rings V3")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: target: tcmu: Fix warning: 'page' may be used uninitialized

[ Upstream commit 61741d8699e1fc764a309ebd20211bb1cb193110 ]

Corrects drivers/target/target_core_user.c:688:6: warning: 'page' may be
used uninitialized.

Link: https://lore.kernel.org/r/20200924001920.43594-1-john.p.donnelly@oracle.com
Fixes: 3c58f737231e ("scsi: target: tcmu: Optimize use of flush_dcache_page")
Cc: Mike Christie <michael.christie@oracle.com>
Acked-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

usb: dwc2: Fix INTR OUT transfers in DDMA mode.

[ Upstream commit b2c586eb07efab982419f32b7c3bd96829bc8bcd ]

In DDMA mode if INTR OUT transfers mps not multiple of 4 then single packet
corresponds to single descriptor.

Descriptor limit set to mps and desc chain limit set to mps *
MAX_DMA_DESC_NUM_GENERIC. On that descriptors complete, to calculate
transfer size should be considered correction value for each descriptor.

In start request function, if "continue" is true then dma buffer address
should be incremmented by offset for all type of transfers, not only for
Control DATA_OUT transfers.

Fixes: cf77b5fb9b394 ("usb: dwc2: gadget: Transfer length limit checking for DDMA")
Fixes: e02f9aa6119e0 ("usb: dwc2: gadget: EP 0 specific DDMA programming")
Fixes: aa3e8bc81311e ("usb: dwc2: gadget: DDMA transfer start and complete")
Signed-off-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

nl80211: fix non-split wiphy information

[ Upstream commit ab10c22bc3b2024f0c9eafa463899a071eac8d97 ]

When dumping wiphy information, we try to split the data into
many submessages, but for old userspace we still support the
old mode where this doesn't happen.

However, in this case we were not resetting our state correctly
and dumping multiple messages for each wiphy, which would have
broken such older userspace.

This was broken pretty much immediately afterwards because it
only worked in the original commit where non-split dumps didn't
have any more data than split dumps...

Fixes: fe1abafd942f ("nl80211: re-add channel width and extended capa advertising")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20200928130717.3e6d9c6bada2.Ie0f151a8d0d00a8e1e18f6a8c9244dd02496af67@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

usb: gadget: u_ether: enable qmult on SuperSpeed Plus as well

[ Upstream commit 4eea21dc67b0c6ba15ae41b1defa113a680a858e ]

The u_ether driver has a qmult setting that multiplies the
transmit queue length (which by default is 2).

The intent is that it should be enabled at high/super speed, but
because the code does not explicitly check for USB_SUPER_PLUS,
it is disabled at that speed.

Fix this by ensuring that the queue multiplier is enabled for any
wired link at high speed or above. Using >= for USB_SPEED_*
constants seems correct because it is what the gadget_is_xxxspeed
functions do.

The queue multiplier substantially helps performance at higher
speeds. On a direct SuperSpeed Plus link to a Linux laptop,
iperf3 single TCP stream:

Before (qmult=1): 1.3 Gbps
After (qmult=5): 3.2 Gbps

Fixes: 04617db7aa68 ("usb: gadget: add SS descriptors to Ethernet gadget")
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

usb: gadget: f_ncm: fix ncm_bitrate for SuperSpeed and above.

[ Upstream commit 986499b1569af980a819817f17238015b27793f6 ]

Currently, SuperSpeed NCM gadgets report a speed of 851 Mbps
in USB_CDC_NOTIFY_SPEED_CHANGE. But the calculation appears to
assume 16 packets per microframe, and USB 3 and above no longer
use microframes.

Maximum speed is actually much higher. On a direct connection,
theoretical throughput is at most 3.86 Gbps for gen1x1 and
9.36 Gbps for gen2x1, and I have seen gadget->host iperf
throughput of >2 Gbps for gen1x1 and >4 Gbps for gen2x1.

Unfortunately the ConnectionSpeedChange defined in the CDC spec
only uses 32-bit values, so we can't report accurate numbers for
10Gbps and above. So, report 3.75Gbps for SuperSpeed (which is
roughly maximum theoretical performance) and 4.25Gbps for
SuperSpeed Plus (which is close to the maximum that we can report
in a 32-bit unsigned integer).

This results in:

[50879.191272] cdc_ncm 2-2:1.0 enx228b127e050c: renamed from usb0
[50879.234778] cdc_ncm 2-2:1.0 enx228b127e050c: 3750 mbit/s downlink 3750 mbit/s uplink

on SuperSpeed and:

[50798.434527] cdc_ncm 8-2:1.0 enx228b127e050c: renamed from usb0
[50798.524278] cdc_ncm 8-2:1.0 enx228b127e050c: 4250 mbit/s downlink 4250 mbit/s uplink

on SuperSpeed Plus.

Fixes: 1650113888fe ("usb: gadget: f_ncm: add SuperSpeed descriptors for CDC NCM")
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

iwlwifi: mvm: split a print to avoid a WARNING in ROC

[ Upstream commit 903b3f9badf1d54f77b468b96706dab679b45b14 ]

A print in the remain on channel code was too long and caused
a WARNING, split it.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Fixes: dc28e12f2125 ("iwlwifi: mvm: ROC: Extend the ROC max delay duration & limit ROC duration")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200930102759.58d57c0bdc68.Ib06008665e7bf1199c360aa92691d9c74fb84990@changeid
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

mfd: sm501: Fix leaks in probe()

[ Upstream commit 8ce24f8967df2836b4557a23e74dc4bb098249f1 ]

This code should clean up if sm501_init_dev() fails.

Fixes: b6d6454fdb66 ("[PATCH] mfd: SM501 core driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: enic: Cure the enic api locking trainwreck

[ Upstream commit a53b59ece86c86d16d12ccdaa1ad0c78250a9d96 ]

enic_dev_wait() has a BUG_ON(in_interrupt()).

Chasing the callers of enic_dev_wait() revealed the gems of enic_reset()
and enic_tx_hang_reset() which are both invoked through work queues in
order to be able to call rtnl_lock(). So far so good.

After locking rtnl both functions acquire enic::enic_api_lock which
serializes against the (ab)use from infiniband. This is where the
trainwreck starts.

enic::enic_api_lock is a spin_lock() which implicitly disables preemption,
but both functions invoke a ton of functions under that lock which can
sleep. The BUG_ON(in_interrupt()) does not trigger in that case because it
can't detect the preempt disabled condition.

This clearly has never been tested with any of the mandatory debug options
for 7+ years, which would have caught that for sure.

Cure it by adding a enic_api_busy member to struct enic, which is modified
and evaluated with enic::enic_api_lock held.

If enic_api_devcmd_proxy_by_index() observes enic::enic_api_busy as true,
it drops enic::enic_api_lock and busy waits for enic::enic_api_busy to
become false.

It would be smarter to wait for a completion of that busy period, but
enic_api_devcmd_proxy_by_index() is called with other spin locks held which
obviously can't sleep.

Remove the BUG_ON(in_interrupt()) check as well because it's incomplete and
with proper debugging enabled the problem would have been caught from the
debug checks in schedule_timeout().

Fixes: 0b038566c0ea ("drivers/net: enic: Add an interface for USNIC to interact with firmware")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

iio: adc: stm32-adc: fix runtime autosuspend delay when slow polling

[ Upstream commit c537d3457542a398caa1fe58e0976c5f83cf7281 ]

When the ADC is runtime suspended and starting a conversion, the stm32-adc
driver calls pm_runtime_get_sync() that gets cascaded to the parent
(e.g. runtime resume of stm32-adc-core driver). This also kicks the
autosuspend delay (e.g. 2s) of the parent.
Once the ADC is active, calling pm_runtime_get_sync() again (upon a new
capture) won't kick the autosuspend delay for the parent (stm32-adc-core
driver) as already active.

Currently, this makes the stm32-adc-core driver go in suspend state
every 2s when doing slow polling. As an example, doing a capture, e.g.
cat in_voltageY_raw at a 0.2s rate, the auto suspend delay for the parent
isn't refreshed. Once it expires, the parent immediately falls into
runtime suspended state, in between two captures, as soon as the child
driver falls into runtime suspend state:
- e.g. after 2s, + child calls pm_runtime_put_autosuspend() + 100ms
autosuspend delay of the child.
- stm32-adc-core switches off regulators, clocks and so on.
- They get switched on back again 100ms later in this example (at 2.2s).

So, use runtime_idle() callback in stm32-adc-core driver to call
pm_runtime_mark_last_busy() for the parent driver (stm32-adc-core),
to avoid this.

Fixes: 9bdbb1139ca1 ("iio: adc: stm32-adc: add power management support")
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/1593615328-5180-1-git-send-email-fabrice.gasnier@st.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

qtnfmac: fix resource leaks on unsupported iftype error return path

[ Upstream commit 63f6982075d890d7563e2469643f05a37d193f01 ]

Currently if an unsupported iftype is detected the error return path
does not free the cmd_skb leading to a resource leak. Fix this by
free'ing cmd_skb.

Addresses-Coverity: ("Resource leak")
Fixes: 805b28c05c8e ("qtnfmac: prepare for AP_VLAN interface type support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200925132224.21638-1-colin.king@canonical.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

ibmvnic: set up 200GBPS speed

[ Upstream commit b9cd795b0e4860f482bf3741d12e1c8f3ec1cfc9 ]

Set up the speed according to crq->query_phys_parms.rsp.speed.
Fix IBMVNIC_10GBPS typo.

Fixes: f8d6ae0d27ec ("ibmvnic: Report actual backing device speed and duplex values")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

coresight: etm: perf: Fix warning caused by etm_setup_aux failure

[ Upstream commit 716f5652a13122364a65e694386b9b26f5e98c51 ]

When coresight_build_path() fails on all the cpus, etm_setup_aux
calls etm_free_aux() to free allocated event_data.
WARN_ON(cpumask_empty(mask) will be triggered since cpu mask is empty.
Check event_data->snk_config is not NULL first to avoid this
warning.

Fixes: f5200aa9831f38 ("coresight: perf: Refactor function free_event_data()")
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200928163513.70169-9-mathieu.poirier@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

nl80211: fix OBSS PD min and max offset validation

[ Upstream commit 6c8b6e4a5f745ec49286ac0a3f1d591a34818f82 ]

The SRG min and max offset won't present when SRG Information Present of
SR control field of Spatial Reuse Parameter Set element set to 0. Per
spec. IEEE802.11ax D7.0, SRG OBSS PD Min Offset ≤ SRG OBSS PD Max
Offset. Hence fix the constrain check to allow same values in both
offset and also call appropriate nla_get function to read the values.

Fixes: 796e90f42b7e ("cfg80211: add support for parsing OBBS_PD attributes")
Signed-off-by: Rajkumar Manoharan <rmanohar@codeaurora.org>
Link: https://lore.kernel.org/r/1601278091-20313-1-git-send-email-rmanohar@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

nvmem: core: fix possibly memleak when use nvmem_cell_info_to_nvmem_cell()

[ Upstream commit fc9eec4d643597cf4cb2fef17d48110e677610da ]

Fix missing 'kfree_const(cell->name)' when call to
nvmem_cell_info_to_nvmem_cell() in several places:

     * after nvmem_cell_info_to_nvmem_cell() failed during
       nvmem_add_cells()

     * during nvmem_device_cell_{read,write} when cell->name is
       kstrdup'ed() without calling kfree_const() at the end, but
       really there is no reason to do that 'dup, because the cell
       instance is allocated on the stack for some short period to be
       read/write without exposing it to the caller.

So the new nvmem_cell_info_to_nvmem_cell_nodup() helper is introduced
which is used to convert cell_info -> cell without name duplication as
a lighweight version of nvmem_cell_info_to_nvmem_cell().

Fixes: e2a5402ec7c6 ("nvmem: Add nvmem_device based consumer apis.")
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
Link: https://lore.kernel.org/r/20200923204456.14032-1-vadym.kochan@plvision.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

HID: hid-input: fix stylus battery reporting

[ Upstream commit 505f394fa239cecb76d916aa858f87ed7ea7fde4 ]

With commit 4f3882177240 hid-input started clearing of "ignored" usages
to avoid using garbage that might have been left in them. However
"battery strength" usages should not be ignored, as we do want to
use them.

Fixes: 4f3882177240 ("HID: hid-input: clear unmapped usages")
Reported-by: Kenneth Albanowski <kenalba@google.com>
Tested-by: Kenneth Albanowski <kenalba@google.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: fsl_sai: Instantiate snd_soc_dai_driver

[ Upstream commit 22a16145af824f91014d07f8664114859900b9e6 ]

Instantiate snd_soc_dai_driver for independent symmetric control.
Otherwise the symmetric setting may be overwritten by other
instance.

Fixes: 08fdf65e37d5 ("ASoC: fsl_sai: Add asynchronous mode support")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1600424760-32071-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

slimbus: qcom-ngd-ctrl: disable ngd in qmi server down callback

[ Upstream commit 709ec3f7fc5773ac4aa6fb22c3f0ac8103c674db ]

In QMI new server notification we enable the NGD however during
delete server notification we do not disable the NGD.

This can lead to multiple instances of NGD being enabled, so make
sure that we disable NGD in delete server callback to fix this issue!

Fixes: 917809e2280b ("slimbus: ngd: Add qcom SLIMBus NGD driver")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200925095520.27316-4-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

slimbus: core: do not enter to clock pause mode in core

[ Upstream commit df2c471c4ae07e18a0396db670dca2ef867c5153 ]

Let the controller logic decide when to enter into clock pause mode!
Entering in to pause mode during unregistration does not really make
sense as the controller is totally going down at that point in time.

Fixes: 4b14e62ad3c9e ("slimbus: Add support for 'clock-pause' feature")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200925095520.27316-3-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

slimbus: core: check get_addr before removing laddr ida

[ Upstream commit f97769fde678e111a1b7b165b380d8a3dfe54f4e ]

logical address can be either assigned by the SLIMBus controller or the core.
Core uses IDA in cases where get_addr callback is not provided by the
controller.
Core already has this check while allocating IDR, however during absence
reporting this is not checked. This patch fixes this issue.

Fixes: 46a2bb5a7f7e ("slimbus: core: Add slim controllers support")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200925095520.27316-2-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

quota: clear padding in v2r1_mem2diskdqb()

[ Upstream commit 3d3dc274ce736227e3197868ff749cff2f175f63 ]

Freshly allocated memory contains garbage, better make sure
to init all struct v2r1_disk_dqblk fields to avoid KMSAN report:

BUG: KMSAN: uninit-value in qtree_entry_unused+0x137/0x1b0 fs/quota/quota_tree.c:218
CPU: 0 PID: 23373 Comm: syz-executor.1 Not tainted 5.9.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x21c/0x280 lib/dump_stack.c:118
kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:122
__msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:219
qtree_entry_unused+0x137/0x1b0 fs/quota/quota_tree.c:218
v2r1_mem2diskdqb+0x43d/0x710 fs/quota/quota_v2.c:285
qtree_write_dquot+0x226/0x870 fs/quota/quota_tree.c:394
v2_write_dquot+0x1ad/0x280 fs/quota/quota_v2.c:333
dquot_commit+0x4af/0x600 fs/quota/dquot.c:482
ext4_write_dquot fs/ext4/super.c:5934 [inline]
ext4_mark_dquot_dirty+0x4d8/0x6a0 fs/ext4/super.c:5985
mark_dquot_dirty fs/quota/dquot.c:347 [inline]
mark_all_dquot_dirty fs/quota/dquot.c:385 [inline]
dquot_alloc_inode+0xc05/0x12b0 fs/quota/dquot.c:1755
__ext4_new_inode+0x8204/0x9d70 fs/ext4/ialloc.c:1155
ext4_tmpfile+0x41a/0x850 fs/ext4/namei.c:2686
vfs_tmpfile+0x2a2/0x570 fs/namei.c:3283
do_tmpfile fs/namei.c:3316 [inline]
path_openat+0x4035/0x6a90 fs/namei.c:3359
do_filp_open+0x2b8/0x710 fs/namei.c:3395
do_sys_openat2+0xa88/0x1140 fs/open.c:1168
do_sys_open fs/open.c:1184 [inline]
__do_compat_sys_openat fs/open.c:1242 [inline]
__se_compat_sys_openat+0x2a4/0x310 fs/open.c:1240
__ia32_compat_sys_openat+0x56/0x70 fs/open.c:1240
do_syscall_32_irqs_on arch/x86/entry/common.c:80 [inline]
__do_fast_syscall_32+0x129/0x180 arch/x86/entry/common.c:139
do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:162
do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:205
entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
RIP: 0023:0xf7ff4549
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 002b:00000000f55cd0cc EFLAGS: 00000296 ORIG_RAX: 0000000000000127
RAX: ffffffffffffffda RBX: 00000000ffffff9c RCX: 0000000020000000
RDX: 0000000000410481 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:143 [inline]
kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:126
kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:80
slab_alloc_node mm/slub.c:2907 [inline]
slab_alloc mm/slub.c:2916 [inline]
__kmalloc+0x2bb/0x4b0 mm/slub.c:3982
kmalloc include/linux/slab.h:559 [inline]
getdqbuf+0x56/0x150 fs/quota/quota_tree.c:52
qtree_write_dquot+0xf2/0x870 fs/quota/quota_tree.c:378
v2_write_dquot+0x1ad/0x280 fs/quota/quota_v2.c:333
dquot_commit+0x4af/0x600 fs/quota/dquot.c:482
ext4_write_dquot fs/ext4/super.c:5934 [inline]
ext4_mark_dquot_dirty+0x4d8/0x6a0 fs/ext4/super.c:5985
mark_dquot_dirty fs/quota/dquot.c:347 [inline]
mark_all_dquot_dirty fs/quota/dquot.c:385 [inline]
dquot_alloc_inode+0xc05/0x12b0 fs/quota/dquot.c:1755
__ext4_new_inode+0x8204/0x9d70 fs/ext4/ialloc.c:1155
ext4_tmpfile+0x41a/0x850 fs/ext4/namei.c:2686
vfs_tmpfile+0x2a2/0x570 fs/namei.c:3283
do_tmpfile fs/namei.c:3316 [inline]
path_openat+0x4035/0x6a90 fs/namei.c:3359
do_filp_open+0x2b8/0x710 fs/namei.c:3395
do_sys_openat2+0xa88/0x1140 fs/open.c:1168
do_sys_open fs/open.c:1184 [inline]
__do_compat_sys_openat fs/open.c:1242 [inline]
__se_compat_sys_openat+0x2a4/0x310 fs/open.c:1240
__ia32_compat_sys_openat+0x56/0x70 fs/open.c:1240
do_syscall_32_irqs_on arch/x86/entry/common.c:80 [inline]
__do_fast_syscall_32+0x129/0x180 arch/x86/entry/common.c:139
do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:162
do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:205
entry_SYSENTER_compat_after_hwframe+0x4d/0x5c

Fixes: 498c60153ebb ("quota: Implement quota format with 64-bit space and inode limits")
Link: https://lore.kernel.org/r/20200924183619.4176790-1-edumazet@google.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jan Kara <jack@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>

usb: dwc2: Fix parameter type in function pointer prototype

[ Upstream commit 362b9398c962c9ec563653444e15ef9032ef3a90 ]

When booting up on a Raspberry Pi 4 with Control Flow Integrity checking
enabled, the following warning/panic happens:

[    1.626435] CFI failure (target: dwc2_set_bcm_params+0x0/0x4):
[    1.632408] WARNING: CPU: 0 PID: 32 at kernel/cfi.c:30 __cfi_check_fail+0x54/0x5c
[    1.640021] Modules linked in:
[    1.643137] CPU: 0 PID: 32 Comm: kworker/0:1 Not tainted 5.8.0-rc6-next-20200724-00051-g89ba619726de #1
[    1.652693] Hardware name: Raspberry Pi 4 Model B Rev 1.2 (DT)
[    1.658637] Workqueue: events deferred_probe_work_func
[    1.663870] pstate: 60000005 (nZCv daif -PAN -UAO BTYPE=--)
[    1.669542] pc : __cfi_check_fail+0x54/0x5c
[    1.673798] lr : __cfi_check_fail+0x54/0x5c
[    1.678050] sp : ffff8000102bbaa0
[    1.681419] x29: ffff8000102bbaa0 x28: ffffab09e21c7000
[    1.686829] x27: 0000000000000402 x26: ffff0000f6e7c228
[    1.692238] x25: 00000000fb7cdb0d x24: 0000000000000005
[    1.697647] x23: ffffab09e2515000 x22: ffffab09e069a000
[    1.703055] x21: 4c550309df1cf4c1 x20: ffffab09e2433c60
[    1.708462] x19: ffffab09e160dc50 x18: ffff0000f6e8cc78
[    1.713870] x17: 0000000000000041 x16: ffffab09e0bce6f8
[    1.719278] x15: ffffab09e1c819b7 x14: 0000000000000003
[    1.724686] x13: 00000000ffffefff x12: 0000000000000000
[    1.730094] x11: 0000000000000000 x10: 00000000ffffffff
[    1.735501] x9 : c932f7abfc4bc600 x8 : c932f7abfc4bc600
[    1.740910] x7 : 077207610770075f x6 : ffff0000f6c38f00
[    1.746317] x5 : 0000000000000000 x4 : 0000000000000000
[    1.751723] x3 : 0000000000000000 x2 : 0000000000000000
[    1.757129] x1 : ffff8000102bb7d8 x0 : 0000000000000032
[    1.762539] Call trace:
[    1.765030]  __cfi_check_fail+0x54/0x5c
[    1.768938]  __cfi_check+0x5fa6c/0x66afc
[    1.772932]  dwc2_init_params+0xd74/0xd78
[    1.777012]  dwc2_driver_probe+0x484/0x6ec
[    1.781180]  platform_drv_probe+0xb4/0x100
[    1.785350]  really_probe+0x228/0x63c
[    1.789076]  driver_probe_device+0x80/0xc0
[    1.793247]  __device_attach_driver+0x114/0x160
[    1.797857]  bus_for_each_drv+0xa8/0x128
[    1.801851]  __device_attach.llvm.14901095709067289134+0xc0/0x170
[    1.808050]  bus_probe_device+0x44/0x100
[    1.812044]  deferred_probe_work_func+0x78/0xb8
[    1.816656]  process_one_work+0x204/0x3c4
[    1.820736]  worker_thread+0x2f0/0x4c4
[    1.824552]  kthread+0x174/0x184
[    1.827837]  ret_from_fork+0x10/0x18

CFI validates that all indirect calls go to a function with the same
exact function pointer prototype. In this case, dwc2_set_bcm_params
is the target, which has a parameter of type 'struct dwc2_hsotg *',
but it is being implicitly cast to have a parameter of type 'void *'
because that is the set_params function pointer prototype. Make the
function pointer protoype match the definitions so that there is no
more violation.

Fixes: 7de1debcd2de ("usb: dwc2: Remove platform static params")
Link: https://github.com/ClangBuiltLinux/linux/issues/1107
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: seq: oss: Avoid mutex lock for a long-time ioctl

[ Upstream commit 2759caad2600d503c3b0ed800e7e03d2cd7a4c05 ]

Recently we applied a fix to cover the whole OSS sequencer ioctls with
the mutex for dealing with the possible races. This works fine in
general, but in theory, this may lead to unexpectedly long stall if an
ioctl like SNDCTL_SEQ_SYNC is issued and an event with the far future
timestamp was queued.

For fixing such a potential stall, this patch changes the mutex lock
applied conditionally excluding such an ioctl command. Also, change
the mutex_lock() with the interruptible version for user to allow
escaping from the big-hammer mutex.

Fixes: 80982c7e834e ("ALSA: seq: oss: Serialize ioctls")
Suggested-by: Pavel Machek <pavel@ucw.cz>
Link: https://lore.kernel.org/r/20200922083856.28572-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>

misc: mic: scif: Fix error handling path

[ Upstream commit a81072a9c0ae734b7889929b0bc070fe3f353f0e ]

Inside __scif_pin_pages(), when map_flags != SCIF_MAP_KERNEL it
will call pin_user_pages_fast() to map nr_pages. However,
pin_user_pages_fast() might fail with a return value -ERRNO.

The return value is stored in pinned_pages->nr_pages. which in
turn is passed to unpin_user_pages(), which expects
pinned_pages->nr_pages >=0, else disaster.

Fix this by assigning pinned_pages->nr_pages to 0 if
pin_user_pages_fast() returns -ERRNO.

Fixes: ba612aa8b487 ("misc: mic: SCIF memory registration and unregistration")
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Link: https://lore.kernel.org/r/1600570295-29546-1-git-send-email-jrdr.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

dmaengine: dmatest: Check list for emptiness before access its last entry

[ Upstream commit b28de385b71abf31ce68ec0387638bee26ae9024 ]

After writing a garbage to the channel we get an Oops in dmatest_chan_set()
due to access to last entry in the empty list.

[  212.670672] BUG: unable to handle page fault for address: fffffff000000020
[  212.677562] #PF: supervisor read access in kernel mode
[  212.682702] #PF: error_code(0x0000) - not-present page
...
[  212.710074] RIP: 0010:dmatest_chan_set+0x149/0x2d0 [dmatest]
[  212.715739] Code: e8 cc f9 ff ff 48 8b 1d 0d 55 00 00 48 83 7b 10 00 0f 84 63 01 00 00 48 c7 c7 d0 65 4d c0 e8 ee 4a f5 e1 48 89 c6 48 8b 43 10 <48> 8b 40 20 48 8b 78 58 48 85 ff 0f 84 f5 00 00 00 e8 b1 41 f5 e1

Fix this by checking list for emptiness before accessing its last entry.

Fixes: d53513d5dc28 ("dmaengine: dmatest: Add support for multi channel testing")
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20200922115847.30100-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ath6kl: wmi: prevent a shift wrapping bug in ath6kl_wmi_delete_pstream_cmd()

[ Upstream commit 6a950755cec1a90ddaaff3e4acb5333617441c32 ]

The "tsid" is a user controlled u8 which comes from debugfs. Values
more than 15 are invalid because "active_tsids" is a 16 bit variable.
If the value of "tsid" is more than 31 then that leads to a shift
wrapping bug.

Fixes: 8fffd9e5ec9e ("ath6kl: Implement support for QOS-enable and QOS-disable from userspace")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200918142732.GA909725@mwanda
Signed-off-by: Sasha Levin <sashal@kernel.org>

spi: omap2-mcspi: Improve performance waiting for CHSTAT

[ Upstream commit 7b1d96813317358312440d0d07abbfbeb0ef8d22 ]

This reverts commit 13d515c796 (spi: omap2-mcspi: Switch to
readl_poll_timeout()).

The amount of time spent polling for the MCSPI_CHSTAT bits to be set on
AM335x-icev2 platform is less than 1us (about 0.6us) in most cases, with
or without using DMA. So, in most cases the function need not sleep.
Also, setting the sleep_usecs to zero would not be optimal here because
ktime_add_us() used in readl_poll_timeout() is slower compared to the
direct addition used after the revert. So, it is sub-optimal to use
readl_poll_timeout in this case.

When DMA is not enabled, this revert results in an increase of about 27%
in throughput and decrease of about 20% in CPU usage. However, the CPU
usage and throughput are almost the same when used with DMA.

Therefore, fix this by reverting the commit which switched to using
readl_poll_timeout().

Fixes: 13d515c796ad ("spi: omap2-mcspi: Switch to readl_poll_timeout()")
Signed-off-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20200910122624.8769-1-a-govindraju@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dsa: rtl8366rb: Support all 4096 VLANs

[ Upstream commit a7920efdd86d8a0d74402dbc80ead03b023294ba ]

There is an off-by-one error in rtl8366rb_is_vlan_valid()
making VLANs 0..4094 valid while it should be 1..4095.
Fix it.

Fixes: d8652956cf37 ("net: dsa: realtek-smi: Add Realtek SMI driver")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: tlv320aic32x4: Fix bdiv clock rate derivation

[ Upstream commit 40b37136287ba6b34aa2f1f6123f3d6d205dc2f0 ]

Current code expects a single channel to be always used. Fix this
situation by forwarding the number of channels used. Then fix the
derivation of the bdiv clock rate.

Fixes: 96c3bb00239d ("ASoC: tlv320aic32x4: Dynamically Determine Clocking")
Suggested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/r/20200911173140.29984-3-miquel.raynal@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: wilc1000: clean up resource in error path of init mon interface

[ Upstream commit 55bd149978679742374c800e56e8f6bc74378bbe ]

The wilc_wfi_init_mon_int() forgets to clean up resource when
register_netdevice() failed. Add the missed call to fix it.
And the return value of netdev_priv can't be NULL, so remove
the unnecessary error handling.

Fixes: 588713006ea4 ("staging: wilc1000: avoid the use of 'wilc_wfi_mon' static variable")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200917123019.206382-1-huangguobin4@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dsa: rtl8366: Skip PVID setting if not requested

[ Upstream commit 3dfe8dde093a07e82fa472c0f8c29a7f6a2006a5 ]

We go to lengths to determine whether the PVID should be set
for this port or not, and then fail to take it into account.
Fix this oversight.

Fixes: d8652956cf37 ("net: dsa: realtek-smi: Add Realtek SMI driver")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dsa: rtl8366: Refactor VLAN/PVID init

[ Upstream commit 7e1301ed1881447d2a25f9c6423738c33cbca133 ]

The VLANs and PVIDs on the RTL8366 utilizes a "member
configuration" (MC) which is largely unexplained in the
code.

This set-up requires a special ordering: rtl8366_set_pvid()
must be called first, followed by rtl8366_set_vlan(),
else the MC will not be properly allocated. Relax this
by factoring out the code obtaining an MC and reuse
the helper in both rtl8366_set_pvid() and
rtl8366_set_vlan() so we remove this strict ordering
requirement.

In the process, add some better comments and debug prints
so people who read the code understand what is going on.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dsa: rtl8366: Check validity of passed VLANs

[ Upstream commit 6641a2c42b0a307b7638d10e5d4b90debc61389d ]

The rtl8366_set_vlan() and rtl8366_set_pvid() get invalid
VLANs tossed at it, especially VLAN0, something the hardware
and driver cannot handle. Check validity and bail out like
we do in the other callbacks.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>