Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Backport: We don't have the ORC stack which means our calling.h
has the CTF code. And that has RESTORE_EXTRA_ARGS and ZERO_EXTRA_ARGS
so there was no need to port that in. See
commit 76f5df43cab5e765c0bd42289103e8f625813ae1
x86/asm/entry/64: Always allocate a complete "struct pt_regs" on the kernel stack
which added them.
The ZERO_EXTRA_REGS (aka CLEAR_EXTRA_REGS) is not part of it.
It ends up crashing the user-space. Not sure why not.
Which means this patch is pretty much useless - we don't clear
any of the %r12-%r15, nor %rbp, nor %rbx at all.
In other words we just save now more registers on the %esp
and restore them.
But somewhere we depend on these and need to fix that.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Konrad Rzeszutek Wilk [Thu, 4 Jan 2018 16:30:05 +0000 (11:30 -0500)]
x86/mm: Only set IBPB when the new thread cannot ptrace current thread
To reduce overhead of setting IBPB, we only do that when
the new thread cannot ptrace the current one. If the new
thread has ptrace capability on current thread, it is safe.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport: Need more #include's than the original]
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport needs an asm/microcode.h to include the native_wrmsrl]
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport: We don't have b466bdb614823
"x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer"
hence the change to delay_mwaitx is not needed]
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Andrea Arcangeli [Fri, 15 Dec 2017 00:04:25 +0000 (16:04 -0800)]
x86/spec_ctrl: save IBRS MSR value in paranoid_entry
If the NMI runs while entering kernel between SWAPGS and IBRS_ENABLE
everything is fine, paranoid_entry would have unconditionally set
IBRS bit 0 and when exiting the NMI it would have cleared bit 0 like
if it was returning to userland. IBRS_ENABLE would have then enabled
bit 0 again.
If NMI instead runs when exiting kernel between IBRS_DISABLE and
SWAPGS, the NMI would have turned on IBRS bit 0 and then it would have
left enabled when exiting the NMI. IBRS bit 0 would then be left
enabled in userland until the next enter kernel.
That is a minor inefficiency only, but we can eliminate it by saving
the MSR when entering the NMI in save_paranoid and restoring it when
exiting the NMI.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[*Scaffolding*:This backport lacks a lot. It is only put it on so that the
later patches compiled _and_ can be tested to run. It is meant to be removed
once the full set of patches are all good. Aka scaffolding.]
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport: I had to add 'asm/spec_ctrl.h' in the assembler files]
Also we should not put ENABLE_IBRS on irq_entries_start]
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport: In UEK4 it is 'cpufeature.h', not 'cpufeatures.h']
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Konrad Rzeszutek Wilk [Fri, 29 Dec 2017 19:11:17 +0000 (14:11 -0500)]
x86: Add STIBP feature enumeration
Enumerate single thread indirect branch predictors (STIBP) feature. It
provides means to prevent indirect branch predictions from being
controlled by sibling HW thread.
Mohamed Ghannam [Tue, 5 Dec 2017 20:58:35 +0000 (20:58 +0000)]
dccp: CVE-2017-8824: use-after-free in DCCP code
Whenever the sock object is in DCCP_CLOSED state,
dccp_disconnect() must free dccps_hc_tx_ccid and
dccps_hc_rx_ccid and set to NULL.
Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 69c64866ce072dea1d1e59a0d61e0f66c0dffb76)
Patrick Colp [Mon, 8 Jan 2018 20:36:14 +0000 (12:36 -0800)]
negotiate_mq should happen in all cases of a new VBD being discovered by
xen-blkfront, whether called through _probe() or a hot-attached new VBD
from dom-0 via xenstore. Otherwise, hot-attached new VBDs are left
configured without multi-queue.
Colin Ian King [Fri, 22 Sep 2017 17:13:48 +0000 (18:13 +0100)]
e1000: avoid null pointer dereference on invalid stat type
Currently if the stat type is invalid then data[i] is being set
either by dereferencing a null pointer p, or it is reading from
an incorrect previous location if we had a valid stat type
previously. Fix this by skipping over the read of p on an invalid
stat type.
Detected by CoverityScan, CID#113385 ("Explicit null dereferenced")
Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 5983587c8c5ef00d6886477544ad67d495bc5479) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
e1000: fix race condition between e1000_down() and e1000_watchdog
This patch fixes a race condition that can result into the interface being
up and carrier on, but with transmits disabled in the hardware.
The bug may show up by repeatedly IFF_DOWN+IFF_UP the interface, which
allows e1000_watchdog() interleave with e1000_down().
CPU x CPU y
--------------------------------------------------------------------
e1000_down():
netif_carrier_off()
e1000_watchdog():
if (carrier == off) {
netif_carrier_on();
enable_hw_transmit();
}
disable_hw_transmit();
e1000_watchdog():
/* carrier on, do nothing */
Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 44c445c3d1b4eacff23141fa7977c3b2ec3a45c9) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Florian Fainelli [Sat, 26 Aug 2017 01:14:24 +0000 (18:14 -0700)]
e1000e: Be drop monitor friendly
e1000e_put_txbuf() can be called from normal reclamation path as well as
when a DMA mapping failure, so we need to differentiate these two cases
when freeing SKBs to be drop monitor friendly. e1000e_tx_hwtstamp_work()
and e1000_remove() are processing TX timestamped SKBs and those should
not be accounted as drops either.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 377b62736c01f14309141c69caa6d84363c12e12) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Willem de Bruijn [Fri, 25 Aug 2017 15:06:26 +0000 (11:06 -0400)]
e1000e: apply burst mode settings only on default
Devices that support FLAG2_DMA_BURST have different default values
for RDTR and RADV. Apply burst mode default settings only when no
explicit value was passed at module load.
The RDTR default is zero. If the module is loaded for low latency
operation with RxIntDelay=0, do not override this value with a burst
default of 32.
Move the decision to apply burst values earlier, where explicitly
initialized module variables can be distinguished from defaults.
Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 48072ae1ec7a1c778771cad8c1b8dd803c4992ab) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Sasha Neftin [Sun, 6 Aug 2017 13:49:18 +0000 (16:49 +0300)]
e1000e: fix buffer overrun while the I219 is processing DMA transactions
Intel® 100/200 Series Chipset platforms reduced the round-trip
latency for the LAN Controller DMA accesses, causing in some high
performance cases a buffer overrun while the I219 LAN Connected
Device is processing the DMA transactions. I219LM and I219V devices
can fall into unrecovered Tx hang under very stressfully UDP traffic
and multiple reconnection of Ethernet cable. This Tx hang of the LAN
Controller is only recovered if the system is rebooted. Slightly slow
down DMA access by reducing the number of outstanding requests.
This workaround could have an impact on TCP traffic performance
on the platform. Disabling TSO eliminates performance loss for TCP
traffic without a noticeable impact on CPU performance.
Please, refer to I218/I219 specification update:
https://www.intel.com/content/www/us/en/embedded/products/networking/
ethernet-connection-i218-family-documentation.html
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit b10effb92e272051dd1ec0d7be56bf9ca85ab927) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Benjamin Poirier [Fri, 21 Jul 2017 18:36:27 +0000 (11:36 -0700)]
e1000e: Avoid receiver overrun interrupt bursts
When e1000e_poll() is not fast enough to keep up with incoming traffic, the
adapter (when operating in msix mode) raises the Other interrupt to signal
Receiver Overrun.
This is a double problem because 1) at the moment e1000_msix_other()
assumes that it is only called in case of Link Status Change and 2) if the
condition persists, the interrupt is repeatedly raised again in quick
succession.
Ideally we would configure the Other interrupt to not be raised in case of
receiver overrun but this doesn't seem possible on this adapter. Instead,
we handle the first part of the problem by reverting to the practice of
reading ICR in the other interrupt handler, like before commit 16ecba59bc33
("e1000e: Do not read ICR in Other interrupt"). Thanks to commit 0a8047ac68e5 ("e1000e: Fix msi-x interrupt automask") which cleared IAME
from CTRL_EXT, reading ICR doesn't interfere with RxQ0, TxQ0 interrupts
anymore. We handle the second part of the problem by not re-enabling the
Other interrupt right away when there is overrun. Instead, we wait until
traffic subsides, napi polling mode is exited and interrupts are
re-enabled.
Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca> Fixes: 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt") Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 4aea7a5c5e940c1723add439f4088844cd26196d) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
link_active = !hw->mac.get_link_status
/* link_active is false, wrongly */
This problem arises because the single flag get_link_status is used to
signal two different states: link status needs checking and link status is
down.
Avoid the problem by using the return value of .check_for_link to signal
the link status to e1000e_has_link().
Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca> Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 19110cfbb34d4af0cdfe14cd243f3b09dc95b013) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Allen Pais [Thu, 21 Sep 2017 17:04:52 +0000 (22:34 +0530)]
drivers: net: e1000e: use setup_timer() helper.
Use setup_timer function instead of initializing timer with the
function and data fields.
Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 27069012
(cherry picked from commit 4a9c07ed71c2b8d755ee585264f80dd2d82a8066) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
i219 (8) and i219 (9) are the next LOM generations that will be available
on the next Intel Client platform (IceLake).
This patch provides the initial support for these devices
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 48f76b68f9fca4e1d5bbb1755d14e8e8e09bdd5b) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Gustavo A R Silva [Tue, 20 Jun 2017 21:22:34 +0000 (16:22 -0500)]
e1000e: add check on e1e_wphy() return value
Check return value from call to e1e_wphy(). This value is being
checked during previous calls to function e1e_wphy() and it seems
a check was missing here.
Addresses-Coverity-ID: 1226905 Signed-off-by: Gustavo A R Silva <garsilva@embeddedor.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit d75372a2daf5dc48207ee9e5592917e893cddb87) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
The unwind failures stems from commit 2800209994f8 ("e1000e: Refactor PM
flows"), but it may be a later patch that introduced the non-recoverable
behaviour.
Fixes: 2800209994f8 ("e1000e: Refactor PM flows")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 833521ebc65b1c3092e5c0d8a97092f98eec595d) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Current code manually allocate an fcport structure that
is not properly initialize. Replace kzalloc with
qla2x00_alloc_fcport, so that all fields are initialized.
Original code acquires hardware_lock to add Abort IOCB
onto driver request queue for processing. However,
abort_command() will also acquire hardware lock to look up
sp pointer before issuing abort IOCB command resulting
into a deadlock. This patch safely removes the possible
deadlock scenario by removing extra spinlock.
Get Port Database MBX cmd is to validate current Login state upon
PRLI completion. Current code looks at the last login state for
re-validation which was incorrect. This patch removed incorrect
state check.
Fixes: 15f30a5752287 ("qla2xxx: Use IOCB interface to submit non-critical MBX.") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com>
[ Upstream commit 23c645595dab7b414f23639d0a428a07515807df ] Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Current driver design schedules relogin process via DPC thread
every 1 second. In a large fabric, this DPC thread tries to
schedule too many jobs and might get overloaded. As a result of
this processing of DPC thread, it can schedule relogin earlier
than 1 second.
If user swaps one target port for another target port for same
switch port, the new target port is not being recognized by the
driver. Current code assumes that old Target port has recovered
from link down. The fix will ask switch what is the WWPN of a
specific NportID (GPNID) rather than assuming it's the same Target
port which has came back.
when RSCN is delivered for specific remote port,
the switch say the remote port is still up and
current state of the remote port/session is good,
don't trust the state. Instead use ADISC to very
the session is still valid or not.
Name pointer describing each command is assigned with stack frame's memory.
The stack frame could eventually be re-use, where name pointer
access can get get garbage. To fix the problem, use designated
static memory for name pointer.
When NPort Handle is in use, driver needs to mark the handle
as used and pick another. Instead, the code clears the handle
and re-pick the same handle.
Name server login is normally handle by FW. In some rare case where one
of the switches is being updated, name server login could get
affected. Trigger relogin to name server when driver detects this
condition.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ Upstream commit b98ae0d748dbc80016c5cc2e926f33648d83353d ] Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
The value of "size" comes from the user. When we add "start + size" it
could lead to an integer overflow bug.
It means we vmalloc() a lot more memory than we had intended. I believe
that on 64 bit systems vmalloc() can succeed even if we ask it to
allocate huge 4GB buffers. So we would get memory corruption and likely
a crash when we call ha->isp_ops->write_optrom() and ->read_optrom().
Only root can trigger this bug.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=194061 Cc: <stable@vger.kernel.org> Fixes: b7cc176c9eb3 ("[SCSI] qla2xxx: Allow region-based flash-part accesses.") Reported-by: shqking <shqking@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ Upstream commit e6f77540c067b48dee10f1e33678415bfcc89017 ] Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
When pci_enable_device() or pci_enable_device_mem() fail in
qla2x00_probe_one() we bail out but do a call to
pci_disable_device(). This causes the dev_WARN_ON() in
pci_disable_device() to trigger, as the device wasn't enabled
previously.
So instead of taking the 'probe_out' error path we can directly return
*iff* one of the pci_enable_device() calls fails.
Additionally rename the 'probe_out' goto label's name to the more
descriptive 'disable_device'.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: e315cd28b9ef ("[SCSI] qla2xxx: Code changes for qla data structure refactoring") Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ddff7ed45edce4a4c92949d3c61cd25d229c4a14 ] Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Robb Glasser [Tue, 5 Dec 2017 17:16:55 +0000 (09:16 -0800)]
ALSA: pcm: prevent UAF in snd_pcm_info
When the device descriptor is closed, the `substream->runtime` pointer
is freed. But another thread may be in the ioctl handler, case
SNDRV_CTL_IOCTL_PCM_INFO. This case calls snd_pcm_info_user() which
calls snd_pcm_info() which accesses the now freed `substream->runtime`.
Note: this fixes CVE-2017-0861
Signed-off-by: Robb Glasser <rglasser@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 362bca57f5d78220f8b5907b875961af9436e229)
Andrey Ryabinin [Fri, 16 Oct 2015 11:28:53 +0000 (14:28 +0300)]
x86, kasan: Fix build failure on KASAN=y && KMEMCHECK=y kernels
Declaration of memcpy() is hidden under #ifndef CONFIG_KMEMCHECK.
In asm/efi.h under #ifdef CONFIG_KASAN we #undef memcpy(), due to
which the following happens:
In file included from arch/x86/kernel/setup.c:96:0:
./arch/x86/include/asm/desc.h: In function â\80\98native_write_idt_entryâ\80\99:
./arch/x86/include/asm/desc.h:122:2: error: implicit declaration of function â\80\98memcpyâ\80\99 [-Werror=implicit-function-declaration] memcpy(&idt[entry], gate, sizeof(*gate));
^
cc1: some warnings being treated as errors
make[2]: *** [arch/x86/kernel/setup.o] Error 1
We will get rid of that #undef in asm/efi.h eventually.
But in the meanwhile move memcpy() declaration out of #ifdefs
to fix the build.
With KMEMCHECK=y, KASAN=n we get this build failure:
arch/x86/platform/efi/efi.c:673:3: error: implicit declaration of function â\80\98memcpyâ\80\99 [-Werror=implicit-function-declaration]
arch/x86/platform/efi/efi_64.c:139:2: error: implicit declaration of function â\80\98memcpyâ\80\99 [-Werror=implicit-function-declaration]
arch/x86/include/asm/desc.h:121:2: error: implicit declaration of function â\80\98memcpyâ\80\99 [-Werror=implicit-function-declaration]
Don't #undef memcpy if KASAN=n.
Reported-by: Ingo Molnar <mingo@kernel.org> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 769a8089c1fd ("x86, efi, kasan: #undef memset/memcpy/memmove per arch") Link: http://lkml.kernel.org/r/1443544814-20122-1-git-send-email-ryabinin.a.a@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 4ac86a6dcec1c3878de9747bf5a2aa4455be69e3)
x86, efi, kasan: #undef memset/memcpy/memmove per arch
In not-instrumented code KASAN replaces instrumented memset/memcpy/memmove
with not-instrumented analogues __memset/__memcpy/__memove.
However, on x86 the EFI stub is not linked with the kernel. It uses
not-instrumented mem*() functions from arch/x86/boot/compressed/string.c
So we don't replace them with __mem*() variants in EFI stub.
On ARM64 the EFI stub is linked with the kernel, so we should replace
mem*() functions with __mem*(), because the EFI stub runs before KASAN
sets up early shadow.
So let's move these #undef mem* into arch's asm/efi.h which is also
included by the EFI stub.
Also, this will fix the warning in 32-bit build reported by kbuild test
robot:
efi-stub-helper.c:599:2: warning: implicit declaration of function 'memcpy'
[akpm@linux-foundation.org: use 80 cols in comment] Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Reported-by: Fengguang Wu <fengguang.wu@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 769a8089c1fd2fe94c13e66fe6e03d7820953ee3)
Daniel Kiper [Thu, 14 Dec 2017 14:31:56 +0000 (15:31 +0100)]
x86/efi: Initialize and display UEFI secure boot state a bit later during init
Otherwise Xen dom0 does not display "Secure boot enabled" message if it runs
on secure boot enabled platform. This happens because boot_params.secure_boot
is initialized too late. However, despite lack of message all features depending
on UEFI secure boot are enabled properly.
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Michael Chan [Sat, 14 Oct 2017 01:09:33 +0000 (21:09 -0400)]
bnxt_en: Fix possible corrupted NVRAM parameters from firmware response.
In bnxt_find_nvram_item(), it is copying firmware response data after
releasing the mutex. This can cause the firmware response data
to be corrupted if the next firmware response overwrites the response
buffer. The rare problem shows up when running ethtool -i repeatedly.
Fix it by calling the new variant _hwrm_send_message_silent() that requires
the caller to take the mutex and to release it after the response data has
been copied.
Fixes: 3ebf6f0a09a2 ("bnxt_en: Add installed-package version reporting via Ethtool GDRVINFO") Reported-by: Sarveswara Rao Mygapula <sarveswararao.mygapula@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit cc72f3b1feb4fd38d33ab7a013d5ab95041cb8ba)
Kris Van Hees [Tue, 12 Dec 2017 18:19:21 +0000 (13:19 -0500)]
dtrace: do not use copy_from_user when accessing kernel stack
The implementation of sdt_getarg() for x86_64 uses a copy_from_user
variant while reading from kernel stack which is obviously wrong.
This commit corrects that.
Orabug: 25949088 Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com> Reviewed-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Kris Van Hees [Mon, 11 Dec 2017 20:20:16 +0000 (15:20 -0500)]
dtrace: fix arg5 and up retrieval for FBT entry probes on x86
When tracing function entry using FBT entry probes, access to all the
function arguments should be supported. The existing code supported
access up to the 5th argument, but would yield incorrect results for
any argument beyond that. On some kernels it could result in a crash
when the 6th (or higher) argument was being retrieved.
The reason for the problem lies in the fact that the first 5 arguments
could be read directly from the register set, whereas the 6th argument
and beyond needs to be retrieved from the stack. The generic code
implementing retrieval of arguments turns out to be incorrect for FBT
entry probes.
Thi commit introduces a FBT-specific function to access function
arguments.
Orabug: 25949088 Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com> Reviewed-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
As we alloc pages with GFP_KERNEL in init_espfix_ap() which is
called before we enable local irqs, so the lockdep sub-system
would (correctly) trigger a warning about the potentially
blocking API.
So we allocate them on the boot CPU side when the secondary CPU is
brought up by the boot CPU, and hand them over to the secondary
CPU.
And we use alloc_pages_node() with the secondary CPU's node, to
make sure the espfix stack is NUMA-local to the CPU that is
going to use it.
Elena Ufimtseva [Thu, 14 Sep 2017 00:40:25 +0000 (20:40 -0400)]
xen: Make PV Dom0 Linux kernel NUMA aware
Issues Xen hypercall subop XENMEM_get_vnumainfo and sets the
NUMA topology, otherwise sets dummy NUMA node and prevents
numa_init from calling other numa initializators.
Enables vNUMA for dom0 if numa kernel boot option does not
disable it.
It also requires Xen to have patches that support Dom0 NUMA
and xen boot option dom0_vcpus_pin=numa.
Dom0 NUMA topology with this patch applied and Xen booted with
"dom0_mem=max:6144M dom0_vcpus_pin=numa dom0_max_vcpus=20":
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
ext4_find_unwritten_pgoff() is used to search for offset of hole or
data in page range [index, end] (both inclusive), and the max number
of pages to search should be at least one, if end == index.
Otherwise the only page is missed and no hole or data is found,
which is not correct.
When block size is smaller than page size, this can be demonstrated
by preallocating a file with size smaller than page size and writing
data to the last block. E.g. run this xfs_io command on a 1k block
size ext4 on x86_64 host.
# xfs_io -fc "falloc 0 3k" -c "pwrite 2k 1k" \
-c "seek -d 0" /mnt/ext4/testfile
wrote 1024/1024 bytes at offset 2048
1 KiB, 1 ops; 0.0000 sec (42.459 MiB/sec and 43478.2609 ops/sec)
Whence Result
DATA EOF
Data at offset 2k was missed, and lseek(2) returned ENXIO.
This is unconvered by generic/285 subtest 07 and 08 on ppc64 host,
where pagesize is 64k. Because a recent change to generic/285
reduced the preallocated file size to smaller than 64k.
Signed-off-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
(cherry picked from commit 624327f8794704c5066b11a52f9da6a09dce7f9a) Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Nicolas Droux [Fri, 17 Nov 2017 23:51:45 +0000 (16:51 -0700)]
DTrace: IO wait probes b_flags can contain incorrect operation
In the synchronous IO path, the xfs buffer flag value can change,
causing the IO provider io:::wait-start and io:::wait-done to report
an incorrect operation through the bufinfo_t b_flags field.
Liran Alon [Sun, 5 Nov 2017 14:11:30 +0000 (16:11 +0200)]
KVM: x86: pvclock: Handle first-time write to pvclock-page contains random junk
When guest passes KVM it's pvclock-page GPA via WRMSR to
MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize
pvclock-page to some start-values. It just requests a clock-update which
will happen before entering to guest.
The clock-update logic will call kvm_setup_pvclock_page() to update the
pvclock-page with info. However, kvm_setup_pvclock_page() *wrongly*
assumes that the version-field is initialized to an even number. This is
wrong because at first-time write, field could be any-value.
Fix simply makes sure that if first-time version-field is odd, increment
it once more to make it even and only then start standard logic.
This follows same logic as done in other pvclock shared-pages (See
kvm_write_wall_clock() and record_steal_time()).
Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(cherry picked from commit 51c4b8bba674cfd2260d173602c4dac08e4c3a99)
Paolo Bonzini [Thu, 1 Sep 2016 12:20:09 +0000 (14:20 +0200)]
KVM: x86: always fill in vcpu->arch.hv_clock
We will use it in the next patches for KVM_GET_CLOCK and as a basis for the
contents of the Hyper-V TSC page. Get the values from the Linux
timekeeper even if kvmclock is not enabled.
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0d6dd2ff8206dc1da3428d5b1611f6304d481dab)
Liran Alon [Sun, 5 Nov 2017 14:07:43 +0000 (16:07 +0200)]
KVM: nVMX: Fix vmx_check_nested_events() return value in case an event was reinjected to L2
vmx_check_nested_events() should return -EBUSY only in case there is a
pending L1 event which requires a VMExit from L2 to L1 but such a
VMExit is currently blocked. Such VMExits are blocked either
because nested_run_pending=1 or an event was reinjected to L2.
vmx_check_nested_events() should return 0 in case there are no
pending L1 events which requires a VMExit from L2 to L1 or if
a VMExit from L2 to L1 was done internally.
However, upstream commit which introduced blocking in case an event was
reinjected to L2 (commit acc9ab601327 ("KVM: nVMX: Fix pending events
injection")) contains a bug: It returns -EBUSY even if there are no
pending L1 events which requires VMExit from L2 to L1.
We should try to reinject previous events if any before trying to inject
new event if pending. If vmexit is triggered by L2 guest and L0 interested
in, we should reinject IDT-vectoring info to L2 through vmcs02 if any,
otherwise, we can consider new IRQs/NMIs which can be injected and call
nested events callback to switch from L2 to L1 if needed and inject the
proper vmexit events. However, 'commit 0ad3bed6c5ec ("kvm: nVMX: move
nested events check to kvm_vcpu_running")' results in the handle events
order reversely on non-APICv box. This patch fixes it by bailing out for
pending events and not consider new events in this scenario.
Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Fixes: 0ad3bed6c5ec ("kvm: nVMX: move nested events check to kvm_vcpu_running") Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(cherry picked from commit acc9ab60132739e0c5ab40644444cd7b22d12886)
Orabug: 27200329 Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Acked-by: Liran Alon <liran.alon@oracle.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Dongli Zhang [Tue, 28 Nov 2017 02:48:52 +0000 (10:48 +0800)]
xen/time: do not decrease steal time after live migration on xen
After guest live migration on xen, steal time in /proc/stat
(cpustat[CPUTIME_STEAL]) might decrease because steal returned by
xen_steal_lock() might be less than this_rq()->prev_steal_time which is
derived from previous return value of xen_steal_clock().
For instance, steal time of each vcpu is 335 before live migration.
Since runstate times are cumulative and cleared during xen live migration
by xen hypervisor, the idea of this patch is to accumulate runstate times
to global percpu variables before live migration suspend. Once guest VM is
resumed, xen_get_runstate_snapshot_cpu() would always return the sum of new
runstate times and previously accumulated times stored in global percpu
variables.
Comment above HYPERVISOR_suspend() has been removed as it is inaccurate:
the call can return an error code (e.g., possibly -EPERM in the future).
Similar and more severe issue would impact prior linux 4.8-4.10 as
discussed by Michael Las at
https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest,
which would overflow steal time and lead to 100% st usage in top command
for linux 4.8-4.10. A backport of this patch would fix that issue.
[boris: added linux/slab.h to driver/xen/time.c, slightly reformatted
commit message]
References: https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Orabug: 27181243
(cherry picked from commit 5e25f5db6abb96ca8ee2aaedcb863daa6dfcc07a)
The original purpose of this patch is to not decrease steal time after live
migration on xen. We backport this patch to avoid xen steal clock overflow
issue which would lead to 100% st usage in top command as mentioned in
above commit message. UEK3 or UEK2 would not hit this issue.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
When "NETDEV WATCHDOG: em4 (bnx2x): transmit queue 2 timed out" occurs,
BNX2X_SP_RTNL_TX_TIMEOUT is set. In the function bnx2x_sp_rtnl_task,
bnx2x_nic_unload and bnx2x_nic_load are executed to shutdown and open
NIC. In the function bnx2x_nic_load, bnx2x_alloc_mem fails to allocate
dma memory. The message "bnx2x: [bnx2x_alloc_mem:8399(em4)]Can't
allocate memory" pops out. The variable slowpath is set to NULL.
When shutdown the NIC, the function bnx2x_nic_unload is called. In
the function bnx2x_nic_unload, the following functions are executed.
bnx2x_chip_cleanup
bnx2x_set_storm_rx_mode
bnx2x_set_q_rx_mode
bnx2x_set_q_rx_mode
bnx2x_config_rx_mode
bnx2x_set_rx_mode_e2
In the function bnx2x_set_rx_mode_e2, the variable slowpath is operated.
Then the crash occurs.
To fix this crash, the variable slowpath is checked. And in the function
bnx2x_sp_rtnl_task, after dma memory allocation fails, another shutdown
and open NIC is executed.
CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Ariel Elior <aelior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
This patch replaces max_t() with sub_positive() to prevent intermediate
values getting stored in cfs_rq->runnable_load_avg/runnable_load_sum.
This patch has been partially cherry picked from commit c7b50216818e
("sched/fair: Remove se->load.weight from se->avg.load_sum")
So I _should_ have looked at other unserialized users of ->load_avg,
but alas. Luckily nikbor reported a similar /0 from task_h_load() which
instantly triggered recollection of this here problem.
Aside from the intermediate value hitting memory and causing problems,
there's another problem: the underflow detection relies on the signed
bit. This reduces the effective width of the variables, IOW its
effectively the same as having these variables be of signed type.
This patch changes to a different means of unsigned underflow
detection to not rely on the signed bit. This allows the variables to
use the 'full' unsigned range. And it does so with explicit LOAD -
STORE to ensure any intermediate value will never be visible in
memory, allowing these unserialized loads.
Note: GCC generates crap code for this, might warrant a look later.
Note2: I say 'full' above, if we end up at U*_MAX we'll still explode;
maybe we should do clamping on add too.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yuyang Du <yuyang.du@intel.com> Cc: bsegall@google.com Cc: kernel@kyup.com Cc: morten.rasmussen@arm.com Cc: pjt@google.com Cc: steve.muckle@linaro.org Fixes: 9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking") Link: http://lkml.kernel.org/r/20160617091948.GJ30927@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 8974189222159154c55f24ddad33e3613960521a)
The deadlock happens when an attempt is made to take fc->lock in
fuse_abort_conn(). The same lock has already been taken in
fuse_dev_release() in the stack.
This flow is initiated from cuse_process_init_reply() in case of an error
and so, it is a very rare scenario, but it can lockup the fuse fs.
Fix this by releasing the spin lock before calling end_queued_requests()
instead of after, since it does not make a difference anyway.
Ka-Cheong Poon [Tue, 24 Oct 2017 15:00:05 +0000 (08:00 -0700)]
rds: IB active bonding IPv6 changes
For IPv6, each interface can be associated with many addresses. This
is handled by having an array of struct rds_ib_port_addr in each
struct rds_ib_port. This new struct is only used for storing an IPv6
address to minimize code changes and is separate from struct
rds_ib_alias. Note that address alias is obsolete and does not apply
to IPv6 address. All the failover event handling code stays the same
except that IPv6 addresses are also moved.
Ka-Cheong Poon [Tue, 24 Oct 2017 14:59:30 +0000 (07:59 -0700)]
{IB/{core,ipoib},net/rds}: IPv6 support for ACL
The IB ACL components are extended to support IPv6 address. Some
of the ACL ioctls use a 32 bit integer to represent an IP address. To
support IPv6, struct in6_addr needs to be used. To ensure backward
compatibility, a new ioctl will be introduced for each of those ioctls
that take a 32 bit integer as address. The original ioctls are kept
and can still be used. The new ioctls can take IPv4 mapped IPv6
address so new apps only need to use the new ioctls.
The IPOIBACLNGET and IPOIBACLNADD commands re-use the same
ipoib_ioctl_req_data, except that the ips field should actually be a
pointer to a list of struct in6_addr. Here we assume that the pointer
size to an u32 and in6_addr are the same.
Ka-Cheong Poon [Mon, 23 Oct 2017 13:21:49 +0000 (06:21 -0700)]
rds: Enable RDS IPv6 support
This patch enables RDS to use IPv6 addresses. There are many data
structures (RDS socket options) used by RDS apps which use a 32 bit
integer to store IP address. To support IPv6, struct in6_addr needs to
be used. To ensure backward compatibility, a new data structure is
introduced for each of those data structures which use a 32 bit
integer to represent an IP address. And new socket options are
introduced to use those new structures. This means that existing apps
should work without a problem with the new RDS module. For apps which
want to use IPv6, those new data structures and socket options can be
used. IPv4 mapped address is used to represent IPv4 address in the new
data structures.
RDS/RDMA/IB uses a private data (struct rds_ib_connect_private)
exchange between endpoints at RDS connection establishment time to
support RDMA. This private data exchange uses a 32 bit integer to
represent an IP address. This needs to be changed in order to support
IPv6. A new private data struct rds6_ib_connect_private is introduced
to handle this. To ensure backward compatibility, an IPv6 capable RDS
stack uses another RDMA listener port (RDS_CM_PORT which is 16385,
the same value as the RDS/TCP listener port number) to accept IPv6
connection. And it continues to use the original RDS_PORT for IPv4
RDS connections. When it needs to communicate with an IPv6 peer, it
uses the RDS_CM_PORT to send the connection set up request.
Ka-Cheong Poon [Fri, 20 Oct 2017 09:08:36 +0000 (02:08 -0700)]
rds: Changed IP address internal representation to struct in6_addr
This patch changed the internal representation of an IP address to use
struct in6_addr. IPv4 address is stored as an IPv4 mapped address.
All the functions which take an IP address as argument are also
changed to use struct in6_addr. But RDS socket layer is not modified
such that it still does not accept IPv6 address from an application.
And RDS layer does not accept nor initiate IPv6 connections.
The RDS netfilter header is changed. User of RDS netfilter will need
to be re-compiled.
Ka-Cheong Poon [Fri, 20 Oct 2017 09:07:56 +0000 (02:07 -0700)]
IB/ipoib: Remove ACL sysfs debug files
Currently, there are device attributes like add_acl which are writable and
can be used to add ACL in addition to using ioctl. The kernel needs to
parse a string from user space. While parsing IPv4 address is simple,
parsing IPv6 address is not. And it is error prone and can be a security
issue if not done right. Since those attributes are only used in a debug
kernel, it is advisable to remove those attributes instead of adding IPv6
support to them. The following attributes, add_acl, delete_acl, acl,
acl_instance, add_acl_instance and delete_acl_instance are removed. The
last four do not handle IP address. But to be consistent, they are also
removed.
Now consider an invocation of rds_ib_recv_refill() from the worker
thread, which is preemptible. Further, assume that the worker thread
is preempted between the ib_post_recv() and rdsdebug() statements.
Then, if the preemption is due to a receive CQE event, the
rds_ib_recv_cqe_handler() will be invoked. This function processes
receive completions, including freeing up data structures, such as the
recv->r_frag.
In this scenario, rds_ib_recv_cqe_handler() will process the receive
WR posted above. That implies, that the recv->r_frag has been freed
before the above rdsdebug() statement has been executed. When it is
later executed, we will have a NULL pointer dereference:
This bug was provoked by compiling rds out-of-tree with
EXTRA_CFLAGS="-DRDS_DEBUG -DDEBUG" and inserting an artificial delay
between the rdsdebug() and ib_ib_port_recv() statements:
/* XXX when can this fail? */
ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
+ if (can_wait)
+ usleep_range(1000, 5000);
rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
(long) ib_sg_dma_address(
The fix is simply to move the rdsdebug() statement up before the
ib_post_recv() and remove the printing of ret, which is taken care of
anyway by the non-debug code.
Johan Hovold [Wed, 4 Oct 2017 09:01:13 +0000 (11:01 +0200)]
USB: serial: console: fix use-after-free after failed setup
Make sure to reset the USB-console port pointer when console setup fails
in order to avoid having the struct usb_serial be prematurely freed by
the console code when the device is later disconnected.
Fixes: 73e487fdb75f ("[PATCH] USB console: fix disconnection issues") Cc: stable <stable@vger.kernel.org> # 2.6.18 Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org>
(cherry picked from commit 299d7572e46f98534033a9e65973f13ad1ce9047)
uwbd_start() calls kthread_run() and checks that the return value is
not NULL. But the return value is not NULL in case kthread_run() fails,
it takes the form of ERR_PTR(-EINTR).
ALSA: usb-audio: Check out-of-bounds access by corrupted buffer descriptor
When a USB-audio device receives a maliciously adjusted or corrupted
buffer descriptor, the USB-audio driver may access an out-of-bounce
value at its parser. This was detected by syzkaller, something like:
Alan Stern [Fri, 22 Sep 2017 15:56:49 +0000 (11:56 -0400)]
USB: uas: fix bug in handling of alternate settings
The uas driver has a subtle bug in the way it handles alternate
settings. The uas_find_uas_alt_setting() routine returns an
altsetting value (the bAlternateSetting number in the descriptor), but
uas_use_uas_driver() then treats that value as an index to the
intf->altsetting array, which it isn't.
Normally this doesn't cause any problems because the various
alternate settings have bAlternateSetting values 0, 1, 2, ..., so the
value is equal to the index in the array. But this is not guaranteed,
and Andrey Konovalov used the syzkaller fuzzer with KASAN to get a
slab-out-of-bounds error by violating this assumption.
This patch fixes the bug by making uas_find_uas_alt_setting() return a
pointer to the altsetting entry rather than either the value or the
index. Pointers are less subject to misinterpretation.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> CC: Oliver Neukum <oneukum@suse.com> CC: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 786de92b3cb26012d3d0f00ee37adf14527f35c4)
Andrey Konovalov reported a possible out-of-bounds problem for a USB interface
association descriptor. He writes:
It seems there's no proper size check of a USB_DT_INTERFACE_ASSOCIATION
descriptor. It's only checked that the size is >= 2 in
usb_parse_configuration(), so find_iad() might do out-of-bounds access
to intf_assoc->bInterfaceCount.
And he's right, we don't check for crazy descriptors of this type very well, so
resolve this problem. Yet another issue found by syzkaller...
Tejun Heo [Thu, 21 Jan 2016 20:31:11 +0000 (15:31 -0500)]
cgroup: make sure a parent css isn't offlined before its children
There are three subsystem callbacks in css shutdown path -
css_offline(), css_released() and css_free(). Except for
css_released(), cgroup core didn't guarantee the order of invocation.
css_offline() or css_free() could be called on a parent css before its
children. This behavior is unexpected and led to bugs in cpu and
memory controller.
This patch updates offline path so that a parent css is never offlined
before its children. Each css keeps online_cnt which reaches zero iff
itself and all its children are offline and offline_css() is invoked
only after online_cnt reaches zero.
This fixes the memory controller bug and allows the fix for cpu
controller.
Jaejoong Kim [Thu, 28 Sep 2017 10:16:30 +0000 (19:16 +0900)]
HID: usbhid: fix out-of-bounds bug
The hid descriptor identifies the length and type of subordinate
descriptors for a device. If the received hid descriptor is smaller than
the size of the struct hid_descriptor, it is possible to cause
out-of-bounds.
In addition, if bNumDescriptors of the hid descriptor have an incorrect
value, this can also cause out-of-bounds while approaching hdesc->desc[n].
So check the size of hid descriptor and bNumDescriptors.
BUG: KASAN: slab-out-of-bounds in usbhid_parse+0x9b1/0xa20
Read of size 1 at addr ffff88006c5f8edf by task kworker/1:2/1261
Alan Stern [Wed, 18 Oct 2017 16:49:38 +0000 (12:49 -0400)]
USB: core: fix out-of-bounds access bug in usb_get_bos_descriptor()
Andrey used the syzkaller fuzzer to find an out-of-bounds memory
access in usb_get_bos_descriptor(). The code wasn't checking that the
next usb_dev_cap_header structure could fit into the remaining buffer
space.
This patch fixes the error and also reduces the bNumDeviceCaps field
in the header to match the actual number of capabilities found, in
cases where there are fewer than expected.
Fix by simply ignoring the bogus descriptor, as it is optional
for QMI devices anyway.
Fixes: 423ce8caab7e ("net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Johan Hovold [Thu, 21 Sep 2017 08:40:18 +0000 (05:40 -0300)]
[media] cx231xx-cards: fix NULL-deref on missing association descriptor
Make sure to check that we actually have an Interface Association
Descriptor before dereferencing it during probe to avoid dereferencing a
NULL-pointer.
Fixes: e0d3bafd0258 ("V4L/DVB (10954): Add cx231xx USB driver") Cc: stable <stable@vger.kernel.org> # 2.6.30 Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Johan Hovold <johan@kernel.org> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
(cherry picked from commit 6c3b047fa2d2286d5e438bcb470c7b1a49f415f6)