Elena Reshetova [Thu, 4 Jan 2018 09:38:52 +0000 (01:38 -0800)]
p54: prevent speculative execution
Since the queue value in function p54_conf_tx()
seems to be controllable by userspace and later on
conditionally (upon bound check) used to resolve
priv->qos_params, insert an observable speculation
barrier before its usage. This should prevent
observable speculation on that branch and avoid
kernel memory leak.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Orabug: 27340445
CVE: CVE-2017-5753
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Conflicts:
patch refers to drivers/net/wireless/intersil/p54/main.c
code base has drivers/net/wireless/p54/main.c
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
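A minimal user-space sketch of the bounds-check-then-barrier pattern this patch describes. The names osb_sketch(), conf_tx_sketch(), NUM_QUEUES_SKETCH and the struct layout are hypothetical and are not taken from the driver; the macro only assumes that the barrier resolves to lfence on x86, as the osb() commit message states, and falls back to a plain compiler barrier elsewhere. It relies on GCC/Clang inline assembly.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's osb(); NOT the kernel definition. */
#if defined(__x86_64__) || defined(__i386__)
#define osb_sketch() __asm__ __volatile__("lfence" ::: "memory")
#else
#define osb_sketch() __asm__ __volatile__("" ::: "memory")
#endif

#define NUM_QUEUES_SKETCH 4 /* illustrative table size */

struct qos_param_sketch {
	unsigned short aifs, cwmin, cwmax, txop;
};

static struct qos_param_sketch qos_params_sketch[NUM_QUEUES_SKETCH];

/* Returns 0 on success, -1 if the caller-supplied queue is out of range. */
static int conf_tx_sketch(unsigned int queue, const struct qos_param_sketch *p)
{
	if (queue >= NUM_QUEUES_SKETCH)
		return -1;

	/*
	 * Barrier between the bounds check and the indexed access, so the
	 * access below is not performed speculatively with an unchecked index.
	 */
	osb_sketch();

	qos_params_sketch[queue] = *p;
	return 0;
}
```

The barrier sits after the check and before the first use of the index, which is the placement the message refers to as "before its usage".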
Elena Reshetova [Thu, 4 Jan 2018 09:31:31 +0000 (01:31 -0800)]
carl9170: prevent speculative execution
Since the queue value in function carl9170_op_conf_tx()
seems to be controllable by userspace and later on
conditionally (upon bound check) used to resolve
ar9170_qmap and following ar->edcf, insert an observable
speculation barrier before its usage. This should prevent
observable speculation on that branch and avoid
kernel memory leak.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Orabug: 27340445
CVE: CVE-2017-5753
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Elena Reshetova [Thu, 4 Jan 2018 09:25:32 +0000 (01:25 -0800)]
uvcvideo: prevent speculative execution
Since the index value in function uvc_ioctl_enum_input()
seems to be controllable by userspace and later on
conditionally (upon bound check) used to resolve
selector->baSourceID, insert an observable speculation
barrier before its usage. This should prevent
observable speculation on that branch and avoid
kernel memory leak.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Orabug: 27340445
CVE: CVE-2017-5753
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Elena Reshetova [Thu, 4 Jan 2018 08:05:42 +0000 (00:05 -0800)]
bpf: prevent speculative execution in eBPF interpreter
This adds an observable speculation barrier before LD_IMM_DW and
LDX_MEM_B/H/W/DW eBPF instructions during eBPF program
execution in order to prevent speculative execution on
out-of-bounds BPF_MAP array indexes. This way arbitrary kernel
memory is not exposed through side-channel attacks.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Orabug: 27340445
CVE: CVE-2017-5753
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Conflicts:
kernel/bpf/core.c code base differences
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Elena Reshetova [Thu, 4 Jan 2018 07:56:24 +0000 (23:56 -0800)]
locking/barriers: introduce new observable speculation barrier
The new observable speculation barrier, osb(), ensures
that any user observable speculation doesn't cross the boundary.
Any user observable speculative activity on this CPU
thread before this point either completes, reaches a
state in which it can no longer cause observable activity, or
is aborted before instructions after the barrier execute.
On x86, osb() resolves to lfence if X86_FEATURE_LFENCE_RDTSC
is present. Other architectures can define their own variants.
Suggested-by: Arjan van de Ven <arjan@linux.intel.com> Suggested-by: Alan Cox <alan.cox@intel.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Orabug: 27340445
CVE: CVE-2017-5753
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Conflicts:
include/asm-generic/barrier.h code base differences
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Elena Reshetova [Thu, 4 Jan 2018 07:43:33 +0000 (23:43 -0800)]
x86/cpu/AMD: Remove now unused definition of MFENCE_RDTSC feature
With the switch to using LFENCE_RDTSC on AMD platforms there is no longer
a need for the MFENCE_RDTSC feature. Remove its usage and definition.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Orabug: 27340445
CVE: CVE-2017-5753
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Conflicts:
Patch refers to arch/x86/include/asm/cpufeatures.h
Code base has arch/x86/include/asm/cpufeature.h
Patch references X86_FEATURE_MFENCE_RDTSC in arch/x86/include/asm/msr.h
Code base references it in:
arch/x86/include/asm/barrier.h
arch/x86/um/asm/barrier.h
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Elena Reshetova [Thu, 4 Jan 2018 07:19:32 +0000 (23:19 -0800)]
x86/cpu/AMD: Make the LFENCE instruction serialized
In order to reduce the impact of using MFENCE, make the execution of the
LFENCE instruction serialized. This is done by setting bit 1 of MSR
0xc0011029 (DE_CFG).
Some families that support LFENCE do not have this MSR. For these
families, the LFENCE instruction is already serialized.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Orabug: 27340445
CVE: CVE-2017-5753
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Conflicts:
patch refers to arch/x86/include/asm/msr-index.h
code base has arch/x86/include/uapi/asm/msr-index.h
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
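A small sketch of the bit manipulation described above, assuming only what the message states (MSR 0xc0011029, bit 1). The macro and function names are hypothetical, and the actual MSR read and write, which require kernel privileges, are deliberately left out.

```c
#include <stdint.h>

/* MSR number and bit position as given in the commit message. */
#define MSR_DE_CFG_SKETCH            0xc0011029u
#define DE_CFG_LFENCE_SERIALIZE_BIT  1

/* Return the DE_CFG value with the LFENCE-serializing bit set. */
static uint64_t de_cfg_set_lfence_serialize(uint64_t de_cfg)
{
	return de_cfg | (1ULL << DE_CFG_LFENCE_SERIALIZE_BIT);
}

/* Report whether the bit is already set, e.g. to skip a redundant write. */
static int de_cfg_lfence_is_serialized(uint64_t de_cfg)
{
	return (de_cfg >> DE_CFG_LFENCE_SERIALIZE_BIT) & 1;
}
```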
Konrad Rzeszutek Wilk [Thu, 4 Jan 2018 17:26:58 +0000 (12:26 -0500)]
kABI: Make the boot_cpu_data look normal.
It is statically allocated and we only grow it, so having a
GENKSYMS around it is fine. This fixes aff7641cb9f37c7aa6897a7b51faa6e20b08013f
"x86/cpu/AMD: Add speculative control support for AMD" breaking the kABI.
Tom Lendacky [Thu, 30 Nov 2017 22:46:40 +0000 (16:46 -0600)]
x86/microcode/AMD: Add support for fam17h microcode loading
The size for the Microcode Patch Block (MPB) for an AMD family 17h
processor is 3200 bytes. Add a #define for fam17h so that it does
not default to 2048 bytes and fail a microcode load/update.
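A sketch of the size selection the message describes, modelling only the two values it mentions (3200 bytes for family 17h, 2048 bytes as the default). The names are hypothetical, and any other per-family sizes the real loader may have are intentionally omitted.

```c
#include <stddef.h>

#define PATCH_MAX_SIZE_DEFAULT_SKETCH 2048u /* default named in the message */
#define PATCH_MAX_SIZE_F17H_SKETCH    3200u /* fam17h MPB size from the message */

/* Maximum Microcode Patch Block size accepted for a given CPU family. */
static size_t max_patch_size_sketch(unsigned int family)
{
	switch (family) {
	case 0x17:
		return PATCH_MAX_SIZE_F17H_SKETCH;
	default:
		return PATCH_MAX_SIZE_DEFAULT_SKETCH;
	}
}

/* Non-zero if a patch of patch_size bytes would be accepted. */
static int patch_fits_sketch(unsigned int family, size_t patch_size)
{
	return patch_size <= max_patch_size_sketch(family);
}
```

Without the family 17h case, a 3200-byte patch would be compared against the 2048-byte default and rejected, which is the failure the patch avoids.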
Dave Hansen [Wed, 20 Dec 2017 20:54:52 +0000 (12:54 -0800)]
Set IBPB when running a different VCPU
Picking up a change from:
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Thu, 30 Nov 2017 15:00:12 +0100
[RHEL7.5 PATCH 07/35] kvm: vmx: Set IBPB when running a different
VCPU
Ensure an IBPB (Indirect branch prediction barrier) before every VCPU
switch.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Jun Nakajima <jun.nakajima@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Paolo Bonzini [Thu, 30 Nov 2017 14:00:14 +0000 (15:00 +0100)]
x86/svm: Set IBPB when running a different VCPU
[RHEL7.5 PATCH 09/35] x86/svm: Set IBPB when running a different VCPU
Set IBPB (Indirect Branch Prediction Barrier) when the current CPU is
going to run a VCPU different from what was previously run. Nested
virtualization uses the same VMCB for the second level guest, but the
L1 hypervisor should be using IBRS to protect itself.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Jun Nakajima <jun.nakajima@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Tim Chen <tim.c.chen@linux.inte.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport: We don't have 39c06df4dc10a "x86/cpufeature: Cleanup get_cpu_cap()"
which adds a nice enum, nor do we have 2167ceabf3416
"x86/cpu: Add CLZERO detection". As such we just do a partial backport
of the last one and only look for one specific bit (12).]
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Konrad Rzeszutek Wilk [Thu, 4 Jan 2018 16:20:00 +0000 (11:20 -0500)]
x86/spec_ctrl: Add sysctl knobs to enable/disable SPEC_CTRL feature
There are 2 ways to control IBPB and IBRS:
1. At boot time
   noibrs kernel boot parameter will disable IBRS usage
   noibpb kernel boot parameter will disable IBPB usage
   Otherwise, if the above parameters are not specified, the system
   will enable ibrs and ibpb usage if the CPU supports it.
2. At run time
   echo 0 > /proc/sys/kernel/ibrs_enabled will turn off IBRS
   echo 1 > /proc/sys/kernel/ibrs_enabled will turn on IBRS in kernel
   echo 2 > /proc/sys/kernel/ibrs_enabled will turn on IBRS in both userspace and kernel
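The three run-time values listed above can be modelled as follows. This is only an illustration of the documented semantics; the enum and helper names are hypothetical and are not the kernel's.

```c
#include <stdbool.h>

/* Values accepted by /proc/sys/kernel/ibrs_enabled, per the list above. */
enum ibrs_mode_sketch {
	IBRS_OFF_SKETCH             = 0, /* IBRS turned off */
	IBRS_KERNEL_SKETCH          = 1, /* IBRS on in kernel only */
	IBRS_KERNEL_AND_USER_SKETCH = 2, /* IBRS on in userspace and kernel */
};

/* Only the three documented values are treated as valid here. */
static bool ibrs_mode_valid_sketch(int v)
{
	return v >= IBRS_OFF_SKETCH && v <= IBRS_KERNEL_AND_USER_SKETCH;
}

static bool ibrs_on_in_kernel_sketch(int v)
{
	return v == IBRS_KERNEL_SKETCH || v == IBRS_KERNEL_AND_USER_SKETCH;
}

static bool ibrs_on_in_user_sketch(int v)
{
	return v == IBRS_KERNEL_AND_USER_SKETCH;
}
```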
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport: This completes the scaffolding work done in the earlier
patch which had the same title]
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Backport: We don't have the ORC stack which means our calling.h
has the CTF code. And that has RESTORE_EXTRA_ARGS and ZERO_EXTRA_ARGS
so there was no need to port that in. See
commit 76f5df43cab5e765c0bd42289103e8f625813ae1
x86/asm/entry/64: Always allocate a complete "struct pt_regs" on the kernel stack
which added them.
The ZERO_EXTRA_REGS (aka CLEAR_EXTRA_REGS) is not part of it.
It ends up crashing the user-space. Not sure why.
Which means this patch is pretty much useless: we don't clear
any of the %r12-%r15, nor %rbp, nor %rbx at all.
In other words, we now just save more registers on the stack (%esp)
and restore them.
But somewhere we depend on these and need to fix that.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Konrad Rzeszutek Wilk [Thu, 4 Jan 2018 16:30:05 +0000 (11:30 -0500)]
x86/mm: Only set IBPB when the new thread cannot ptrace current thread
To reduce overhead of setting IBPB, we only do that when
the new thread cannot ptrace the current one. If the new
thread has ptrace capability on current thread, it is safe.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport: Need more #include's than the original]
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport needs an asm/microcode.h to include the native_wrmsrl]
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport: We don't have b466bdb614823
"x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer"
hence the change to delay_mwaitx is not needed]
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Andrea Arcangeli [Fri, 15 Dec 2017 00:04:25 +0000 (16:04 -0800)]
x86/spec_ctrl: save IBRS MSR value in paranoid_entry
If the NMI runs while entering the kernel between SWAPGS and IBRS_ENABLE,
everything is fine: paranoid_entry would have unconditionally set
IBRS bit 0, and when exiting the NMI it would have cleared bit 0 as
if it were returning to userland. IBRS_ENABLE would have then enabled
bit 0 again.
If the NMI instead runs when exiting the kernel between IBRS_DISABLE and
SWAPGS, the NMI would have turned on IBRS bit 0 and it would then have
been left enabled when exiting the NMI. IBRS bit 0 would then be left
enabled in userland until the next kernel entry.
That is only a minor inefficiency, but we can eliminate it by saving
the MSR when entering the NMI in save_paranoid and restoring it when
exiting the NMI.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[*Scaffolding*: This backport lacks a lot. It is only put in so that the
later patches compile _and_ can be tested to run. It is meant to be removed
once the full set of patches is all good. Aka scaffolding.]
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport: I had to add 'asm/spec_ctrl.h' in the assembler files.
Also we should not put ENABLE_IBRS on irq_entries_start]
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport: In UEK4 it is 'cpufeature.h', not 'cpufeatures.h']
Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Konrad Rzeszutek Wilk [Fri, 29 Dec 2017 19:11:17 +0000 (14:11 -0500)]
x86: Add STIBP feature enumeration
Enumerate the Single Thread Indirect Branch Predictors (STIBP) feature. It
provides a means to prevent indirect branch predictions from being
controlled by a sibling HW thread.
Mohamed Ghannam [Tue, 5 Dec 2017 20:58:35 +0000 (20:58 +0000)]
dccp: CVE-2017-8824: use-after-free in DCCP code
Whenever the sock object is in DCCP_CLOSED state,
dccp_disconnect() must free dccps_hc_tx_ccid and
dccps_hc_rx_ccid and set them to NULL.
Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 69c64866ce072dea1d1e59a0d61e0f66c0dffb76)
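A user-space sketch of the free-and-clear pattern the message calls for. The struct and function names are hypothetical stand-ins, with malloc()/free() in place of the kernel's allocators; this is not the DCCP code itself.

```c
#include <stdlib.h>

struct ccid_sketch {
	int id;
};

struct sock_sketch {
	struct ccid_sketch *hc_rx_ccid;
	struct ccid_sketch *hc_tx_ccid;
};

/* Free the object and clear the owner's pointer in one step. */
static void ccid_delete_sketch(struct ccid_sketch **slot)
{
	free(*slot);  /* free(NULL) is a no-op, so repeated calls are safe */
	*slot = NULL; /* no dangling pointer left behind for later users */
}

static void disconnect_sketch(struct sock_sketch *sk)
{
	ccid_delete_sketch(&sk->hc_rx_ccid);
	ccid_delete_sketch(&sk->hc_tx_ccid);
}
```

Clearing the pointers is what turns a later accidental use into a NULL check rather than a use-after-free, and it makes a second disconnect harmless.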
Patrick Colp [Mon, 8 Jan 2018 20:36:14 +0000 (12:36 -0800)]
negotiate_mq should happen in all cases of a new VBD being discovered by
xen-blkfront, whether called through _probe() or a hot-attached new VBD
from dom-0 via xenstore. Otherwise, hot-attached new VBDs are left
configured without multi-queue.
Colin Ian King [Fri, 22 Sep 2017 17:13:48 +0000 (18:13 +0100)]
e1000: avoid null pointer dereference on invalid stat type
Currently if the stat type is invalid then data[i] is being set
either by dereferencing a null pointer p, or it is reading from
an incorrect previous location if we had a valid stat type
previously. Fix this by skipping over the read of p on an invalid
stat type.
Detected by CoverityScan, CID#113385 ("Explicit null dereferenced")
Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 5983587c8c5ef00d6886477544ad67d495bc5479) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
e1000: fix race condition between e1000_down() and e1000_watchdog
This patch fixes a race condition that can result in the interface being
up and carrier on, but with transmits disabled in the hardware.
The bug may show up when repeatedly bringing the interface down and up
(IFF_DOWN+IFF_UP), which allows e1000_watchdog() to interleave with
e1000_down().
CPU x                           CPU y
--------------------------------------------------------------------
e1000_down():
    netif_carrier_off()
                                e1000_watchdog():
                                    if (carrier == off) {
                                        netif_carrier_on();
                                        enable_hw_transmit();
                                    }
    disable_hw_transmit();
                                e1000_watchdog():
                                    /* carrier on, do nothing */
Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 44c445c3d1b4eacff23141fa7977c3b2ec3a45c9) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Florian Fainelli [Sat, 26 Aug 2017 01:14:24 +0000 (18:14 -0700)]
e1000e: Be drop monitor friendly
e1000e_put_txbuf() can be called from the normal reclamation path as well as
when a DMA mapping fails, so we need to differentiate these two cases
when freeing SKBs to be drop monitor friendly. e1000e_tx_hwtstamp_work()
and e1000_remove() are processing TX timestamped SKBs and those should
not be accounted as drops either.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 377b62736c01f14309141c69caa6d84363c12e12) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Willem de Bruijn [Fri, 25 Aug 2017 15:06:26 +0000 (11:06 -0400)]
e1000e: apply burst mode settings only on default
Devices that support FLAG2_DMA_BURST have different default values
for RDTR and RADV. Apply burst mode default settings only when no
explicit value was passed at module load.
The RDTR default is zero. If the module is loaded for low latency
operation with RxIntDelay=0, do not override this value with a burst
default of 32.
Move the decision to apply burst values earlier, where explicitly
initialized module variables can be distinguished from defaults.
Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 48072ae1ec7a1c778771cad8c1b8dd803c4992ab) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
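A sketch of the decision described above: an explicitly passed value always wins, and the burst default is applied only when nothing was passed. The struct, the user_set flag and the function name are hypothetical; the values 0 and 32 are the RDTR default and burst default quoted in the message.

```c
#include <stdbool.h>

#define RDTR_DEFAULT_SKETCH       0u  /* "The RDTR default is zero" */
#define RDTR_BURST_DEFAULT_SKETCH 32u /* "a burst default of 32" */

/* A module option that remembers whether the user set it explicitly. */
struct opt_sketch {
	bool user_set;
	unsigned int value;
};

static unsigned int effective_rdtr_sketch(struct opt_sketch opt, bool dma_burst)
{
	if (opt.user_set)
		return opt.value; /* e.g. RxIntDelay=0 stays 0 */

	return dma_burst ? RDTR_BURST_DEFAULT_SKETCH : RDTR_DEFAULT_SKETCH;
}
```

Tracking whether the option was set, instead of comparing its value with the default, is what lets an explicit 0 be told apart from "not specified".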
Sasha Neftin [Sun, 6 Aug 2017 13:49:18 +0000 (16:49 +0300)]
e1000e: fix buffer overrun while the I219 is processing DMA transactions
Intel® 100/200 Series Chipset platforms reduced the round-trip
latency for the LAN Controller DMA accesses, causing in some high
performance cases a buffer overrun while the I219 LAN Connected
Device is processing the DMA transactions. I219LM and I219V devices
can fall into an unrecovered Tx hang under very stressful UDP traffic
and multiple reconnections of the Ethernet cable. This Tx hang of the LAN
Controller is only recovered if the system is rebooted. Slightly slow
down DMA access by reducing the number of outstanding requests.
This workaround could have an impact on TCP traffic performance
on the platform. Disabling TSO eliminates performance loss for TCP
traffic without a noticeable impact on CPU performance.
Please refer to the I218/I219 specification update:
https://www.intel.com/content/www/us/en/embedded/products/networking/ethernet-connection-i218-family-documentation.html
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit b10effb92e272051dd1ec0d7be56bf9ca85ab927) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Benjamin Poirier [Fri, 21 Jul 2017 18:36:27 +0000 (11:36 -0700)]
e1000e: Avoid receiver overrun interrupt bursts
When e1000e_poll() is not fast enough to keep up with incoming traffic, the
adapter (when operating in msix mode) raises the Other interrupt to signal
Receiver Overrun.
This is a double problem because 1) at the moment e1000_msix_other()
assumes that it is only called in case of Link Status Change and 2) if the
condition persists, the interrupt is repeatedly raised again in quick
succession.
Ideally we would configure the Other interrupt to not be raised in case of
receiver overrun but this doesn't seem possible on this adapter. Instead,
we handle the first part of the problem by reverting to the practice of
reading ICR in the other interrupt handler, like before commit 16ecba59bc33
("e1000e: Do not read ICR in Other interrupt"). Thanks to commit 0a8047ac68e5 ("e1000e: Fix msi-x interrupt automask") which cleared IAME
from CTRL_EXT, reading ICR doesn't interfere with RxQ0, TxQ0 interrupts
anymore. We handle the second part of the problem by not re-enabling the
Other interrupt right away when there is overrun. Instead, we wait until
traffic subsides, napi polling mode is exited and interrupts are
re-enabled.
Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca> Fixes: 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt") Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 4aea7a5c5e940c1723add439f4088844cd26196d) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
link_active = !hw->mac.get_link_status
/* link_active is false, wrongly */
This problem arises because the single flag get_link_status is used to
signal two different states: link status needs checking and link status is
down.
Avoid the problem by using the return value of .check_for_link to signal
the link status to e1000e_has_link().
Reported-by: Lennart Sorensen <lsorense@csclub.uwaterloo.ca> Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 19110cfbb34d4af0cdfe14cd243f3b09dc95b013) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Allen Pais [Thu, 21 Sep 2017 17:04:52 +0000 (22:34 +0530)]
drivers: net: e1000e: use setup_timer() helper.
Use setup_timer function instead of initializing timer with the
function and data fields.
Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 27069012
(cherry picked from commit 4a9c07ed71c2b8d755ee585264f80dd2d82a8066) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
i219 (8) and i219 (9) are the next LOM generations that will be available
on the next Intel Client platform (IceLake).
This patch provides the initial support for these devices.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Raanan Avargil <raanan.avargil@intel.com> Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 48f76b68f9fca4e1d5bbb1755d14e8e8e09bdd5b) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Gustavo A R Silva [Tue, 20 Jun 2017 21:22:34 +0000 (16:22 -0500)]
e1000e: add check on e1e_wphy() return value
Check return value from call to e1e_wphy(). This value is being
checked during previous calls to function e1e_wphy() and it seems
a check was missing here.
Addresses-Coverity-ID: 1226905 Signed-off-by: Gustavo A R Silva <garsilva@embeddedor.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit d75372a2daf5dc48207ee9e5592917e893cddb87) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
The unwind failures stem from commit 2800209994f8 ("e1000e: Refactor PM
flows"), but it may be a later patch that introduced the non-recoverable
behaviour.
Fixes: 2800209994f8 ("e1000e: Refactor PM flows")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 27069012
(cherry picked from commit 833521ebc65b1c3092e5c0d8a97092f98eec595d) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
The current code manually allocates an fcport structure that
is not properly initialized. Replace kzalloc with
qla2x00_alloc_fcport, so that all fields are initialized.
The original code acquires hardware_lock to add an Abort IOCB
onto the driver request queue for processing. However,
abort_command() also acquires the hardware lock to look up the
sp pointer before issuing the abort IOCB command, resulting
in a deadlock. This patch safely removes the possible
deadlock scenario by removing the extra spinlock.
The Get Port Database MBX cmd is used to validate the current Login state upon
PRLI completion. The current code looks at the last login state for
re-validation, which is incorrect. This patch removes the incorrect
state check.
Fixes: 15f30a5752287 ("qla2xxx: Use IOCB interface to submit non-critical MBX.") Cc: <stable@vger.kernel.org> # 4.10+ Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Reviewed-by: Hannes Reinecke <hare@suse.com>
[ Upstream commit 23c645595dab7b414f23639d0a428a07515807df ] Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
The current driver design schedules the relogin process via the DPC thread
every 1 second. In a large fabric, this DPC thread tries to
schedule too many jobs and might get overloaded. As a result of
this processing of the DPC thread, it can schedule relogin earlier
than 1 second.
If the user swaps one target port for another target port on the same
switch port, the new target port is not recognized by the
driver. The current code assumes that the old target port has recovered
from link down. The fix asks the switch for the WWPN of a
specific NportID (GPNID) rather than assuming it's the same target
port which has come back.
When an RSCN is delivered for a specific remote port and
the switch says the remote port is still up and the
current state of the remote port/session is good,
don't trust the state. Instead, use ADISC to verify whether
the session is still valid.
The name pointer describing each command is assigned stack frame memory.
The stack frame could eventually be reused, in which case an access
through the name pointer can return garbage. To fix the problem, use designated
static memory for the name pointer.
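A sketch of the "designated static memory" approach, with hypothetical names throughout. The name pointer refers to string literals with static storage duration, so it stays valid after the function that assigned it has returned; a pointer into a local buffer would not.

```c
struct cmd_sketch {
	const char *name; /* must outlive the function that sets it */
};

enum cmd_id_sketch { CMD_LOGIN_SKETCH, CMD_LOGOUT_SKETCH, CMD_ADISC_SKETCH };

/* Static table: these strings exist for the whole lifetime of the program. */
static const char *const cmd_names_sketch[] = {
	"login",
	"logout",
	"adisc",
};

static void cmd_init_sketch(struct cmd_sketch *c, enum cmd_id_sketch id)
{
	/*
	 * Pointing c->name at a local char array here would leave it
	 * dangling once this frame is reused; the static table avoids that.
	 */
	c->name = cmd_names_sketch[id];
}
```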
When an NPort Handle is in use, the driver needs to mark the handle
as used and pick another. Instead, the code clears the handle
and re-picks the same handle.
Name server login is normally handled by FW. In some rare cases where one
of the switches is being updated, name server login could get
affected. Trigger relogin to the name server when the driver detects this
condition.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ Upstream commit b98ae0d748dbc80016c5cc2e926f33648d83353d ] Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
The value of "size" comes from the user. When we add "start + size" it
could lead to an integer overflow bug.
It means we vmalloc() a lot more memory than we had intended. I believe
that on 64 bit systems vmalloc() can succeed even if we ask it to
allocate huge 4GB buffers. So we would get memory corruption and likely
a crash when we call ha->isp_ops->write_optrom() and ->read_optrom().
Only root can trigger this bug.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=194061 Cc: <stable@vger.kernel.org> Fixes: b7cc176c9eb3 ("[SCSI] qla2xxx: Allow region-based flash-part accesses.") Reported-by: shqking <shqking@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[ Upstream commit e6f77540c067b48dee10f1e33678415bfcc89017 ] Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
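A sketch of the overflow described above and one common way to avoid it. This is not claimed to be the upstream fix; the function names and the 32-bit width are assumptions chosen to make the wrap-around easy to see.

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive check: start + size can wrap around and appear to be in range. */
static bool region_ok_naive_sketch(uint32_t start, uint32_t size, uint32_t total)
{
	return (uint32_t)(start + size) <= total;
}

/* Overflow-safe check: compare size against the space left after start. */
static bool region_ok_sketch(uint32_t start, uint32_t size, uint32_t total)
{
	if (start > total)
		return false;
	return size <= total - start; /* cannot wrap, since start <= total */
}
```

With start = 0xF0, size = 0xFFFFFFF0 and total = 0x100, the naive sum wraps to 0xE0 and passes, while the second form rejects the request.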
When pci_enable_device() or pci_enable_device_mem() fail in
qla2x00_probe_one() we bail out but do a call to
pci_disable_device(). This causes the dev_WARN_ON() in
pci_disable_device() to trigger, as the device wasn't enabled
previously.
So instead of taking the 'probe_out' error path we can directly return
*iff* one of the pci_enable_device() calls fails.
Additionally rename the 'probe_out' goto label's name to the more
descriptive 'disable_device'.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: e315cd28b9ef ("[SCSI] qla2xxx: Code changes for qla data structure refactoring") Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ddff7ed45edce4a4c92949d3c61cd25d229c4a14 ] Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
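A user-space sketch of the error-path shape described above, with mock enable/disable functions standing in for pci_enable_device() and pci_disable_device(). All names except the 'disable_device' label are hypothetical; the counter only exists to show that enable and disable stay balanced.

```c
static int enable_should_fail_sketch; /* knob to simulate an enable failure */
static int enable_count_sketch;       /* +1 per enable, -1 per disable */

static int dev_enable_sketch(void)
{
	if (enable_should_fail_sketch)
		return -1;
	enable_count_sketch++;
	return 0;
}

static void dev_disable_sketch(void)
{
	enable_count_sketch--;
}

static int probe_sketch(int later_step_fails)
{
	int ret = dev_enable_sketch();

	if (ret)
		return ret; /* device was never enabled: nothing to undo */

	if (later_step_fails) {
		ret = -2;
		goto disable_device;
	}
	return 0;

disable_device:
	dev_disable_sketch(); /* reached only after a successful enable */
	return ret;
}
```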
Robb Glasser [Tue, 5 Dec 2017 17:16:55 +0000 (09:16 -0800)]
ALSA: pcm: prevent UAF in snd_pcm_info
When the device descriptor is closed, the `substream->runtime` pointer
is freed. But another thread may be in the ioctl handler, case
SNDRV_CTL_IOCTL_PCM_INFO. This case calls snd_pcm_info_user() which
calls snd_pcm_info() which accesses the now freed `substream->runtime`.
Note: this fixes CVE-2017-0861
Signed-off-by: Robb Glasser <rglasser@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 362bca57f5d78220f8b5907b875961af9436e229)
Andrey Ryabinin [Fri, 16 Oct 2015 11:28:53 +0000 (14:28 +0300)]
x86, kasan: Fix build failure on KASAN=y && KMEMCHECK=y kernels
Declaration of memcpy() is hidden under #ifndef CONFIG_KMEMCHECK.
In asm/efi.h under #ifdef CONFIG_KASAN we #undef memcpy(), due to
which the following happens:
In file included from arch/x86/kernel/setup.c:96:0:
./arch/x86/include/asm/desc.h: In function ‘native_write_idt_entry’:
./arch/x86/include/asm/desc.h:122:2: error: implicit declaration of function ‘memcpy’ [-Werror=implicit-function-declaration]
  memcpy(&idt[entry], gate, sizeof(*gate));
  ^
cc1: some warnings being treated as errors
make[2]: *** [arch/x86/kernel/setup.o] Error 1
We will get rid of that #undef in asm/efi.h eventually.
But in the meanwhile move memcpy() declaration out of #ifdefs
to fix the build.
With KMEMCHECK=y, KASAN=n we get this build failure:
arch/x86/platform/efi/efi.c:673:3: error: implicit declaration of function ‘memcpy’ [-Werror=implicit-function-declaration]
arch/x86/platform/efi/efi_64.c:139:2: error: implicit declaration of function ‘memcpy’ [-Werror=implicit-function-declaration]
arch/x86/include/asm/desc.h:121:2: error: implicit declaration of function ‘memcpy’ [-Werror=implicit-function-declaration]
Don't #undef memcpy if KASAN=n.
Reported-by: Ingo Molnar <mingo@kernel.org> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 769a8089c1fd ("x86, efi, kasan: #undef memset/memcpy/memmove per arch") Link: http://lkml.kernel.org/r/1443544814-20122-1-git-send-email-ryabinin.a.a@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 4ac86a6dcec1c3878de9747bf5a2aa4455be69e3)
x86, efi, kasan: #undef memset/memcpy/memmove per arch
In not-instrumented code KASAN replaces instrumented memset/memcpy/memmove
with not-instrumented analogues __memset/__memcpy/__memmove.
However, on x86 the EFI stub is not linked with the kernel. It uses
not-instrumented mem*() functions from arch/x86/boot/compressed/string.c
So we don't replace them with __mem*() variants in EFI stub.
On ARM64 the EFI stub is linked with the kernel, so we should replace
mem*() functions with __mem*(), because the EFI stub runs before KASAN
sets up early shadow.
So let's move these #undef mem* into arch's asm/efi.h which is also
included by the EFI stub.
Also, this will fix the warning in 32-bit build reported by kbuild test
robot:
efi-stub-helper.c:599:2: warning: implicit declaration of function 'memcpy'
[akpm@linux-foundation.org: use 80 cols in comment] Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Reported-by: Fengguang Wu <fengguang.wu@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 769a8089c1fd2fe94c13e66fe6e03d7820953ee3)
Daniel Kiper [Thu, 14 Dec 2017 14:31:56 +0000 (15:31 +0100)]
x86/efi: Initialize and display UEFI secure boot state a bit later during init
Otherwise Xen dom0 does not display the "Secure boot enabled" message if it runs
on a secure boot enabled platform. This happens because boot_params.secure_boot
is initialized too late. However, despite the lack of a message, all features
depending on UEFI secure boot are enabled properly.
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Michael Chan [Sat, 14 Oct 2017 01:09:33 +0000 (21:09 -0400)]
bnxt_en: Fix possible corrupted NVRAM parameters from firmware response.
In bnxt_find_nvram_item(), it is copying firmware response data after
releasing the mutex. This can cause the firmware response data
to be corrupted if the next firmware response overwrites the response
buffer. The rare problem shows up when running ethtool -i repeatedly.
Fix it by calling the new variant _hwrm_send_message_silent() that requires
the caller to take the mutex and to release it after the response data has
been copied.
Fixes: 3ebf6f0a09a2 ("bnxt_en: Add installed-package version reporting via Ethtool GDRVINFO") Reported-by: Sarveswara Rao Mygapula <sarveswararao.mygapula@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit cc72f3b1feb4fd38d33ab7a013d5ab95041cb8ba)
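A minimal userspace sketch of the locking pattern the bnxt_en fix above describes: the response is copied out while the lock is still held, so a later request cannot overwrite it mid-read. The names resp_lock, resp_buf, send_message_locked() and query_item() are hypothetical stand-ins, not the driver's code, and a pthread mutex is assumed in place of the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define RESP_LEN 16

/* Hypothetical stand-ins for the shared response buffer and its lock. */
static pthread_mutex_t resp_lock = PTHREAD_MUTEX_INITIALIZER;
static char resp_buf[RESP_LEN];

/* Pretend firmware call: overwrites the shared buffer.
 * The caller must hold resp_lock. */
static void send_message_locked(const char *reply)
{
    strncpy(resp_buf, reply, RESP_LEN - 1);
    resp_buf[RESP_LEN - 1] = '\0';
}

/* Copy the response out before dropping the lock, so the next
 * request cannot overwrite data that is still being read. */
static void query_item(const char *reply, char *out, size_t out_len)
{
    assert(out_len > 0);

    pthread_mutex_lock(&resp_lock);
    send_message_locked(reply);
    strncpy(out, resp_buf, out_len - 1);
    out[out_len - 1] = '\0';
    pthread_mutex_unlock(&resp_lock);
}
```

The buggy ordering would be to unlock first and copy afterwards; the sketch only differs from that in where the unlock sits.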
Kris Van Hees [Tue, 12 Dec 2017 18:19:21 +0000 (13:19 -0500)]
dtrace: do not use copy_from_user when accessing kernel stack
The implementation of sdt_getarg() for x86_64 uses a copy_from_user
variant while reading from kernel stack which is obviously wrong.
This commit corrects that.
Orabug: 25949088 Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com> Reviewed-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Kris Van Hees [Mon, 11 Dec 2017 20:20:16 +0000 (15:20 -0500)]
dtrace: fix arg5 and up retrieval for FBT entry probes on x86
When tracing function entry using FBT entry probes, access to all the
function arguments should be supported. The existing code supported
access up to the 5th argument, but would yield incorrect results for
any argument beyond that. On some kernels it could result in a crash
when the 6th (or higher) argument was being retrieved.
The reason for the problem lies in the fact that the first 5 arguments
could be read directly from the register set, whereas the 6th argument
and beyond needs to be retrieved from the stack. The generic code
implementing retrieval of arguments turns out to be incorrect for FBT
entry probes.
This commit introduces an FBT-specific function to access function
arguments.
Orabug: 25949088 Signed-off-by: Kris Van Hees <kris.van.hees@oracle.com> Reviewed-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
We alloc pages with GFP_KERNEL in init_espfix_ap(), which is
called before we enable local irqs, so the lockdep sub-system
would (correctly) trigger a warning about the potentially
blocking API.
So we allocate them on the boot CPU side when the secondary CPU is
brought up by the boot CPU, and hand them over to the secondary
CPU.
And we use alloc_pages_node() with the secondary CPU's node, to
make sure the espfix stack is NUMA-local to the CPU that is
going to use it.
Elena Ufimtseva [Thu, 14 Sep 2017 00:40:25 +0000 (20:40 -0400)]
xen: Make PV Dom0 Linux kernel NUMA aware
Issues Xen hypercall subop XENMEM_get_vnumainfo and sets the
NUMA topology, otherwise sets dummy NUMA node and prevents
numa_init from calling other NUMA initializers.
Enables vNUMA for dom0 if numa kernel boot option does not
disable it.
It also requires Xen to have patches that support Dom0 NUMA
and xen boot option dom0_vcpus_pin=numa.
Dom0 NUMA topology with this patch applied and Xen booted with
"dom0_mem=max:6144M dom0_vcpus_pin=numa dom0_max_vcpus=20":
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
ext4_find_unwritten_pgoff() is used to search for offset of hole or
data in page range [index, end] (both inclusive), and the max number
of pages to search should be at least one, if end == index.
Otherwise the only page is missed and no hole or data is found,
which is not correct.
When block size is smaller than page size, this can be demonstrated
by preallocating a file with size smaller than page size and writing
data to the last block. E.g. run this xfs_io command on a 1k block
size ext4 on x86_64 host.
# xfs_io -fc "falloc 0 3k" -c "pwrite 2k 1k" \
-c "seek -d 0" /mnt/ext4/testfile
wrote 1024/1024 bytes at offset 2048
1 KiB, 1 ops; 0.0000 sec (42.459 MiB/sec and 43478.2609 ops/sec)
Whence Result
DATA EOF
Data at offset 2k was missed, and lseek(2) returned ENXIO.
This was uncovered by generic/285 subtests 07 and 08 on a ppc64 host,
where the page size is 64k, because a recent change to generic/285
reduced the preallocated file size to less than 64k.
Signed-off-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
(cherry picked from commit 624327f8794704c5066b11a52f9da6a09dce7f9a) Signed-off-by: Brian Maly <brian.maly@oracle.com> Reviewed-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
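The page-count rule from the ext4 fix above can be sketched as a small standalone C function. This is an illustration, not the kernel code: pages_to_search() is a hypothetical helper, and the batch limit PAGEVEC_SIZE is assumed to be 14.

```c
#include <assert.h>

#define PAGEVEC_SIZE 14UL /* assumed per-lookup batch limit */

/* Number of pages to look up for the inclusive range [index, end]. */
static unsigned long pages_to_search(unsigned long index, unsigned long end)
{
    unsigned long span;

    assert(end >= index);
    span = end - index;          /* 0 when the range is a single page */
    if (span > PAGEVEC_SIZE - 1)
        span = PAGEVEC_SIZE - 1; /* cap at one batch */
    return span + 1;             /* always at least one page */
}
```

Capping the difference first and adding one afterwards is what guarantees a result of at least 1 when end == index.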
Nicolas Droux [Fri, 17 Nov 2017 23:51:45 +0000 (16:51 -0700)]
DTrace: IO wait probes b_flags can contain incorrect operation
In the synchronous IO path, the xfs buffer flag value can change,
causing the IO provider io:::wait-start and io:::wait-done to report
an incorrect operation through the bufinfo_t b_flags field.
Liran Alon [Sun, 5 Nov 2017 14:11:30 +0000 (16:11 +0200)]
KVM: x86: pvclock: Handle first-time write to pvclock-page contains random junk
When a guest passes KVM its pvclock-page GPA via WRMSR to
MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM doesn't initialize
the pvclock-page to some start values. It just requests a clock update,
which will happen before entering the guest.
The clock-update logic will call kvm_setup_pvclock_page() to update the
pvclock-page with info. However, kvm_setup_pvclock_page() *wrongly*
assumes that the version field is initialized to an even number. This is
wrong because at the first-time write, the field could hold any value.
The fix simply makes sure that if the first-time version field is odd, it
is incremented once more to make it even, and only then starts the
standard logic.
This follows the same logic as in other pvclock shared pages (see
kvm_write_wall_clock() and record_steal_time()).
Signed-off-by: Liran Alon <liran.alon@oracle.com> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(cherry picked from commit 51c4b8bba674cfd2260d173602c4dac08e4c3a99)
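The even/odd version protocol described in the pvclock fix above can be sketched in plain C. This is a simplified illustration under stated assumptions: struct pv_page and pv_page_update() are hypothetical, the page is treated as ordinary memory, and the write barriers a real implementation needs are only marked by comments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the shared pvclock page. */
struct pv_page {
    uint32_t version;     /* odd while an update is in progress */
    uint64_t system_time;
};

static void pv_page_update(struct pv_page *p, uint64_t now)
{
    /* The first write may find junk: force an even starting version. */
    if (p->version & 1)
        p->version++;

    p->version++;         /* odd: readers must retry */
    /* a real implementation needs a write barrier here */
    p->system_time = now;
    /* ... and another write barrier here */
    p->version++;         /* even: contents are consistent again */

    assert((p->version & 1) == 0);
}
```

Without the initial parity check, a junk odd value would leave the version odd after the update, and a reader following the protocol would retry forever.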
Paolo Bonzini [Thu, 1 Sep 2016 12:20:09 +0000 (14:20 +0200)]
KVM: x86: always fill in vcpu->arch.hv_clock
We will use it in the next patches for KVM_GET_CLOCK and as a basis for the
contents of the Hyper-V TSC page. Get the values from the Linux
timekeeper even if kvmclock is not enabled.
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0d6dd2ff8206dc1da3428d5b1611f6304d481dab)
Liran Alon [Sun, 5 Nov 2017 14:07:43 +0000 (16:07 +0200)]
KVM: nVMX: Fix vmx_check_nested_events() return value in case an event was reinjected to L2
vmx_check_nested_events() should return -EBUSY only in case there is a
pending L1 event which requires a VMExit from L2 to L1 but such a
VMExit is currently blocked. Such VMExits are blocked either
because nested_run_pending=1 or an event was reinjected to L2.
vmx_check_nested_events() should return 0 in case there are no
pending L1 events which require a VMExit from L2 to L1 or if
a VMExit from L2 to L1 was done internally.
However, the upstream commit which introduced blocking in case an event
was reinjected to L2 (commit acc9ab601327 ("KVM: nVMX: Fix pending events
injection")) contains a bug: it returns -EBUSY even if there are no
pending L1 events which require a VMExit from L2 to L1.
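The intended return-value contract can be sketched as follows. This is a hypothetical flattening of the state into three booleans, not the KVM code; struct nested_state and check_nested_events() are illustrative names only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flattened view of the state the function inspects. */
struct nested_state {
    bool l1_event_pending;   /* an L1 event requires a VMExit from L2 to L1 */
    bool nested_run_pending;
    bool event_reinjected;   /* an event was reinjected to L2 */
};

static int check_nested_events(const struct nested_state *s)
{
    bool block_exit;

    assert(s);
    block_exit = s->nested_run_pending || s->event_reinjected;

    if (!s->l1_event_pending)
        return 0;            /* nothing to deliver to L1 */
    if (block_exit)
        return -EBUSY;       /* exit needed but currently blocked */
    /* a real implementation would perform the L2->L1 VMExit here */
    return 0;
}
```

The bug described above corresponds to testing block_exit before checking whether any L1 event is pending at all.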