]> www.infradead.org Git - users/jedix/linux-maple.git/log
users/jedix/linux-maple.git
7 years agox86/bugs: Drop one "mitigation" from dmesg
Borislav Petkov [Fri, 26 Jan 2018 12:11:39 +0000 (13:11 +0100)]
x86/bugs: Drop one "mitigation" from dmesg

Make

[    0.031118] Spectre V2 mitigation: Mitigation: Full generic retpoline

into

[    0.031118] Spectre V2: Mitigation: Full generic retpoline

to reduce the mitigation mitigations strings.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: riel@redhat.com
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: jikos@kernel.org
Cc: luto@amacapital.net
Cc: dave.hansen@intel.com
Cc: torvalds@linux-foundation.org
Cc: keescook@google.com
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: tim.c.chen@linux.intel.com
Cc: pjt@google.com
Link: https://lkml.kernel.org/r/20180126121139.31959-5-bp@alien8.de
(cherry picked from commit 55fa19d3e51f33d9cd4056d25836d93abf9438db)
Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
 Conflicts:
arch/x86/kernel/cpu/bugs.c
[It is bugs_64.c in our kernel]

Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/nospec: Fix header guards names
Borislav Petkov [Fri, 26 Jan 2018 12:11:37 +0000 (13:11 +0100)]
x86/nospec: Fix header guards names

... to adhere to the _ASM_X86_ naming scheme.

No functional change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: riel@redhat.com
Cc: ak@linux.intel.com
Cc: peterz@infradead.org
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: jikos@kernel.org
Cc: luto@amacapital.net
Cc: dave.hansen@intel.com
Cc: torvalds@linux-foundation.org
Cc: keescook@google.com
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: tim.c.chen@linux.intel.com
Cc: gregkh@linux-foundation.org
Cc: pjt@google.com
Link: https://lkml.kernel.org/r/20180126121139.31959-3-bp@alien8.de
(cherry picked from commit 7a32fc51ca938e67974cbb9db31e1a43f98345a9)
Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
7 years agox86/spectre_v2: Don't spam the console with these:
Konrad Rzeszutek Wilk [Fri, 2 Feb 2018 18:58:20 +0000 (13:58 -0500)]
x86/spectre_v2: Don't spam the console with these:

Intel Spectre v2 broken microcode detected; disabling SPEC_CTRL
for every CPU.

Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
7 years agox86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes
David Woodhouse [Thu, 25 Jan 2018 16:14:14 +0000 (16:14 +0000)]
x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes

This doesn't refuse to load the affected microcodes; it just refuses to
use the Spectre v2 mitigation features if they're detected, by clearing
the appropriate feature bits.

The AMD CPUID bits are handled here too, because hypervisors *may* have
been exposing those bits even on Intel chips, for fine-grained control
of what's available.

It is non-trivial to use x86_match_cpu() for this table because that
doesn't handle steppings. And the approach taken in commit bd9240a18
almost made me lose my lunch.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: ak@linux.intel.com
Cc: ashok.raj@intel.com
Cc: dave.hansen@intel.com
Cc: karahmed@amazon.de
Cc: arjan@linux.intel.com
Cc: torvalds@linux-foundation.org
Cc: peterz@infradead.org
Cc: bp@alien8.de
Cc: pbonzini@redhat.com
Cc: tim.c.chen@linux.intel.com
Cc: gregkh@linux-foundation.org
Link: https://lkml.kernel.org/r/1516896855-7642-7-git-send-email-dwmw@amazon.co.uk
(cherry picked from commit a5b2966364538a0e68c9fa29bc0a3a1651799035)

Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
 Conflicts:
arch/x86/kernel/cpu/intel.c

[Our resolution to
 5d10cbc91d9e x86/cpufeatures: Add AMD feature bits for Speculation Control
 is different, so we still clear the right bits if there (or would be)are AMD machines
 with bad microcode]

Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/cpu: Keep model defines sorted by model number
Andy Shevchenko [Thu, 16 Mar 2017 15:50:45 +0000 (17:50 +0200)]
x86/cpu: Keep model defines sorted by model number

For better maintenance keep it sorted by numeric model ID. Add new lines to
seperate model groups.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20170316155045.50389-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
(cherry picked from commit c238f2343441e3995d2d4e993de42b072d005f4a)
Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
7 years agox86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
David Woodhouse [Thu, 25 Jan 2018 16:14:13 +0000 (16:14 +0000)]
x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

Also, for CPUs which don't speculate at all, don't report that they're
vulnerable to the Spectre variants either.

Leave the cpu_no_meltdown[] match table with just X86_VENDOR_AMD in it
for now, even though that could be done with a simple comparison, on the
assumption that we'll have more to add.

Based on suggestions from Dave Hansen and Alan Cox.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: ak@linux.intel.com
Cc: ashok.raj@intel.com
Cc: karahmed@amazon.de
Cc: arjan@linux.intel.com
Cc: torvalds@linux-foundation.org
Cc: peterz@infradead.org
Cc: bp@alien8.de
Cc: pbonzini@redhat.com
Cc: tim.c.chen@linux.intel.com
Cc: gregkh@linux-foundation.org
Link: https://lkml.kernel.org/r/1516896855-7642-6-git-send-email-dwmw@amazon.co.uk
(cherry picked from commit fec9434a12f38d3aeafeb75711b71d8a1fdef621)

Orabug: 27477743
CVE: CVE-2017-5715

[Backport:
 s/X86_FEATURE_ARCH_CAPABILITIES/X86_FEATURE_IA32_ARCH_CAPS/
 Fix compiler warnings by removing __init/__initdata from
   cpu_no_speculation, cpu_no_meltdown, and cpu_vulnerable_to_meltdown
]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/msr: Add definitions for new speculation control MSRs
David Woodhouse [Thu, 25 Jan 2018 16:14:12 +0000 (16:14 +0000)]
x86/msr: Add definitions for new speculation control MSRs

Add MSR and bit definitions for SPEC_CTRL, PRED_CMD and ARCH_CAPABILITIES.

See Intel's 336996-Speculative-Execution-Side-Channel-Mitigations.pdf

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: ak@linux.intel.com
Cc: ashok.raj@intel.com
Cc: dave.hansen@intel.com
Cc: karahmed@amazon.de
Cc: arjan@linux.intel.com
Cc: torvalds@linux-foundation.org
Cc: peterz@infradead.org
Cc: bp@alien8.de
Cc: pbonzini@redhat.com
Cc: tim.c.chen@linux.intel.com
Cc: gregkh@linux-foundation.org
Link: https://lkml.kernel.org/r/1516896855-7642-5-git-send-email-dwmw@amazon.co.uk
(cherry picked from commit 1e340c60d0dd3ae07b5bedc16a0469c14b9f3410)
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Orabug: 27477743
CVE: CVE-2017-5715

 Conflicts:
arch/x86/include/asm/msr-index.h

[File does not exist, and more importantly we have most of the MSRs, except
 two of the bits]

Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/cpufeatures: Add AMD feature bits for Speculation Control
David Woodhouse [Thu, 25 Jan 2018 16:14:11 +0000 (16:14 +0000)]
x86/cpufeatures: Add AMD feature bits for Speculation Control

AMD exposes the PRED_CMD/SPEC_CTRL MSRs slightly differently to Intel.
See http://lkml.kernel.org/r/2b3e25cc-286d-8bd0-aeaf-9ac4aae39de8@amd.com

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: ak@linux.intel.com
Cc: ashok.raj@intel.com
Cc: dave.hansen@intel.com
Cc: karahmed@amazon.de
Cc: arjan@linux.intel.com
Cc: torvalds@linux-foundation.org
Cc: peterz@infradead.org
Cc: bp@alien8.de
Cc: pbonzini@redhat.com
Cc: tim.c.chen@linux.intel.com
Cc: gregkh@linux-foundation.org
Link: https://lkml.kernel.org/r/1516896855-7642-4-git-send-email-dwmw@amazon.co.uk
 Conflicts:
arch/x86/include/asm/cpufeatures.h
[Upstream has the NCAPINTS raised so there is another array to stuff all the
 0x80000008 cpuid leaf in. But we don't have that so we just toggle the global
 ones]

Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/spectre_v2: Print what options are available.
Konrad Rzeszutek Wilk [Fri, 2 Feb 2018 04:22:46 +0000 (23:22 -0500)]
x86/spectre_v2: Print what options are available.

Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
7 years agox86/spectre_v2: Add VMEXIT_FILL_RSB instead of RETPOLINE
Konrad Rzeszutek Wilk [Fri, 2 Feb 2018 03:56:00 +0000 (22:56 -0500)]
x86/spectre_v2: Add VMEXIT_FILL_RSB instead of RETPOLINE

The backport of "x86/retpoline: Fill return stack buffer on vmexit"
made the full stuffing of RSB only enabled if the kernel had
selected X86_FEATURE_RETPOLINE.

But if we are using IBRS we still want the full RSB stuffing
as it was prior to the backport.

Since we have both retpoline and ibrs wanting it we introduce
a new feature to enable the common mitigation that both of them
need.

Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
7 years agox86/spectre: If IBRS is enabled disable "Filling RSB on context switch"
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 21:20:54 +0000 (16:20 -0500)]
x86/spectre: If IBRS is enabled disable "Filling RSB on context switch"

As that is only needed if the machine is running IBRS and !SMEP.

The comment above the conditional says it all:

 Skylake era CPUs have a separate issue with *underflow* of the
 RSB, when they will predict 'ret' targets from the generic BTB.
 The proper mitigation for this is IBRS. If IBRS is not supported
 or deactivated in favour of retpolines the RSB fill on context

.. and if we have IBRS then we should ignore this conditional.

Note that the check (!SMEP) and using the STUFF_RSB is already
done in:
x86/spectre_v2: Figure out if STUFF_RSB macro needs to be used.

Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agoKVM: VMX: Allow direct access to MSR_IA32_SPEC_CTRL
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 20:47:19 +0000 (15:47 -0500)]
KVM: VMX: Allow direct access to MSR_IA32_SPEC_CTRL

This is an adopation of what is being posted upstream.

The issue here is that guest kernels such as Windows
needs IBRS to solve their spectre_v2 mitigation.

And since our kernel can do either IBRS or retpoline
we need to be aware of both.

(We are ignoring for right now the situation in which
the microcode is not loaded and we want to emulate IBRS)

In short:
 1) If the CPU has microcode, and host uses IBRS we need
    to save the MSR during VMEXIT and write our IBRS value.
    Also before VMENTER we need to write the guest MSR value.

 2) If the host kernel uses retpoline we need to WRMSR the
    guest MSR value on VMENTER, but we are optimizing by only
    doing it if it a non-zero value.

    On VMEXIT we read the guest MSR value.

Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/spectre_v2: Don't allow {ibrs,ipbp,lfence}_enabled to be toggled if retpoline
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 17:25:54 +0000 (12:25 -0500)]
x86/spectre_v2: Don't allow {ibrs,ipbp,lfence}_enabled to be toggled if retpoline

is enabled.

And also refresh the spectre_v2_enabled mode depending on whether
the sysfs knobs are turned on/off.

Preserve the ability to tweak IBPB if retpoline is enabled.

Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/spectre: Fix retpoline_enabled
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 17:16:10 +0000 (12:16 -0500)]
x86/spectre: Fix retpoline_enabled

As it will report that retpoline_enabled if IBRS is set.

Instead just figure out which mode we are in and if it
has retpoline then return true.

Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/spectre: Update sysctl values if toggled only by set_{ibrs,ibpb}_disabled
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 16:45:41 +0000 (11:45 -0500)]
x86/spectre: Update sysctl values if toggled only by set_{ibrs,ibpb}_disabled

As otherwise we will report the wrong values in the
/sys/kernel/debug/x86/{ibrs,ipbp}_enabled at first
bootup - that is if you have 'noibrs' on the command line.

Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agoretpoline/module: Taint kernel for missing retpoline in module
Andi Kleen [Thu, 4 Jan 2018 14:37:08 +0000 (14:37 +0000)]
retpoline/module: Taint kernel for missing retpoline in module

There's a risk that a kernel that has full retpoline mitigations
becomes vulnerable when a module gets loaded that hasn't been
compiled with the right compiler or the right option.

We cannot fix it, but should at least warn the user when that
happens.

Add a flag to each module if it has been compiled with RETPOLINE

When the a module hasn't been compiled with a retpoline
aware compiler, print a warning and set a taint flag.

For modules it is checked at compile time, however it cannot
check assembler or other non compiled objects used in the module link.

Due to lack of better letter it uses taint option 'Z'

We only set the taint flag for incorrectly compiled modules
now, not for the main kernel, which already has other
report mechanisms.

Also make sure to report vulnerable for spectre if such a module
has been loaded.

v2: Change warning message
v3: Port to latest tree
Cc: jeyu@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
(cherry picked from commit abc01f3c4bbd927f5e47cc6dff99a76a393d1bbe)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
        Documentation/oops-tracing.txt
  (dmj: patch had Documentation/admin-guide/tainted-kernels.rst)
        arch/x86/kernel/cpu/bugs_64.c
  (dmj: patch had arch/x86/kernel/cpu/bugs.c)
include/linux/kernel.h
kernel/module.c
kernel/panic.c
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline: Fill RSB on context switch for affected CPUs
David Woodhouse [Fri, 12 Jan 2018 17:49:25 +0000 (17:49 +0000)]
x86/retpoline: Fill RSB on context switch for affected CPUs

commit c995efd5a740d9cbafbf58bde4973e8b50b4d761 upstream.

On context switch from a shallow call stack to a deeper one, as the CPU
does 'ret' up the deeper side it may encounter RSB entries (predictions for
where the 'ret' goes to) which were populated in userspace.

This is problematic if neither SMEP nor KPTI (the latter of which marks
userspace pages as NX for the kernel) are active, as malicious code in
userspace may then be executed speculatively.

Overwrite the CPU's return prediction stack with calls which are predicted
to return to an infinite loop, to "capture" speculation if this
happens. This is required both for retpoline, and also in conjunction with
IBRS for !SMEP && !KPTI.

On Skylake+ the problem is slightly different, and an *underflow* of the
RSB may cause errant branch predictions to occur. So there it's not so much
overwrite, as *filling* the RSB to attempt to prevent it getting
empty. This is only a partial solution for Skylake+ since there are many
other conditions which may result in the RSB becoming empty. The full
solution on Skylake+ is to use IBRS, which will prevent the problem even
when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
required on context switch.

[ tglx: Added missing vendor check and slighty massaged comments and
   changelog ]

[js] backport to 4.4 -- __switch_to_asm does not exist there, we
     have to patch the switch_to macros for both x86_32 and x86_64.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515779365-9032-1-git-send-email-dwmw@amazon.co.uk
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 18bb117d1b7690181346e6365c6237b6ceaac4c4)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/include/asm/cpufeature.h
          (dmj: bumped all uek4-specific microcode features up one)
arch/x86/include/asm/switch_to.h
arch/x86/kernel/cpu/bugs_64.c
  (dmj:
            - patch had arch/x86/kernel/cpu/bugs.c
            - preserved X86_FEATURE_RSB_CTXSW check in case of IBRS)
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline: Optimize inline assembler for vmexit_fill_RSB
Andi Kleen [Wed, 17 Jan 2018 22:53:28 +0000 (14:53 -0800)]
x86/retpoline: Optimize inline assembler for vmexit_fill_RSB

commit 3f7d875566d8e79c5e0b2c9a413e91b2c29e0854 upstream.

The generated assembler for the C fill RSB inline asm operations has
several issues:

- The C code sets up the loop register, which is then immediately
  overwritten in __FILL_RETURN_BUFFER with the same value again.

- The C code also passes in the iteration count in another register, which
  is not used at all.

Remove these two unnecessary operations. Just rely on the single constant
passed to the macro for the iterations.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: dave.hansen@intel.com
Cc: gregkh@linuxfoundation.org
Cc: torvalds@linux-foundation.org
Cc: arjan@linux.intel.com
Link: https://lkml.kernel.org/r/20180117225328.15414-1-andi@firstfloor.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 11e619414b69b7f1e47baac72c5be589d86e5393)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agokprobes/x86: Disable optimizing on the function jumps to indirect thunk
Masami Hiramatsu [Thu, 18 Jan 2018 16:15:20 +0000 (01:15 +0900)]
kprobes/x86: Disable optimizing on the function jumps to indirect thunk

commit c86a32c09f8ced67971a2310e3b0dda4d1749007 upstream.

Since indirect jump instructions will be replaced by jump
to __x86_indirect_thunk_*, those jmp instruction must be
treated as an indirect jump. Since optprobe prohibits to
optimize probes in the function which uses an indirect jump,
it also needs to find out the function which jump to
__x86_indirect_thunk_* and disable optimization.

Add a check that the jump target address is between the
__indirect_thunk_start/end when optimizing kprobe.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Link: https://lkml.kernel.org/r/151629212062.10241.6991266100233002273.stgit@devbox
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6cb73eb8045157ea280f0a047777ea1f56547375)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agokprobes/x86: Blacklist indirect thunk functions for kprobes
Masami Hiramatsu [Thu, 18 Jan 2018 16:14:51 +0000 (01:14 +0900)]
kprobes/x86: Blacklist indirect thunk functions for kprobes

commit c1804a236894ecc942da7dc6c5abe209e56cba93 upstream.

Mark __x86_indirect_thunk_* functions as blacklist for kprobes
because those functions can be called from anywhere in the kernel
including blacklist functions of kprobes.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Link: https://lkml.kernel.org/r/151629209111.10241.5444852823378068683.stgit@devbox
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9b8bd0d3586842716f5673e7a0922a9c9e1fcbfd)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agoretpoline: Introduce start/end markers of indirect thunk
Masami Hiramatsu [Thu, 18 Jan 2018 16:14:21 +0000 (01:14 +0900)]
retpoline: Introduce start/end markers of indirect thunk

commit 736e80a4213e9bbce40a7c050337047128b472ac upstream.

Introduce start/end markers of __x86_indirect_thunk_* functions.
To make it easy, consolidate .text.__x86.indirect_thunk.* sections
to one .text.__x86.indirect_thunk section and put it in the
end of kernel text section and adds __indirect_thunk_start/end
so that other subsystem (e.g. kprobes) can identify it.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Link: https://lkml.kernel.org/r/151629206178.10241.6828804696410044771.stgit@devbox
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 799dc737680a8074a0c7c2d3426b85f4c439377f)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/mce: Make machine check speculation protected
Thomas Gleixner [Thu, 18 Jan 2018 15:28:26 +0000 (16:28 +0100)]
x86/mce: Make machine check speculation protected

commit 6f41c34d69eb005e7848716bbcafc979b35037d5 upstream.

The machine check idtentry uses an indirect branch directly from the low
level code. This evades the speculation protection.

Replace it by a direct call into C code and issue the indirect call there
so the compiler can apply the proper speculation protection.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by:Borislav Petkov <bp@alien8.de>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Niced-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801181626290.1847@nanos
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f59e7ce17ba327245c8feb312d447b09d3b98eba)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/kernel/entry_64.S
  (dmj: patch had arch/x86/entry/entry_64.S)
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agokbuild: modversions for EXPORT_SYMBOL() for asm
Nicholas Piggin [Tue, 1 Nov 2016 01:46:19 +0000 (12:46 +1100)]
kbuild: modversions for EXPORT_SYMBOL() for asm

commit 4efca4ed05cbdfd13ec3e8cb623fb77d6e4ab187 upstream.

Allow architectures to create asm/asm-prototypes.h file that
provides C prototypes for exported asm functions, which enables
proper CRC versions to be generated for them.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
[jkosina@suse.cz: folded cc6acc11cad1 fixup in as well ]
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ff535919c1366b0218113f95c447d92c3aac0c1f)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros
Tom Lendacky [Sat, 13 Jan 2018 23:27:30 +0000 (17:27 -0600)]
x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros

commit 28d437d550e1e39f805d99f9f8ac399c778827b7 upstream.

The PAUSE instruction is currently used in the retpoline and RSB filling
macros as a speculation trap.  The use of PAUSE was originally suggested
because it showed a very, very small difference in the amount of
cycles/time used to execute the retpoline as compared to LFENCE.  On AMD,
the PAUSE instruction is not a serializing instruction, so the pause/jmp
loop will use excess power as it is speculated over waiting for return
to mispredict to the correct target.

The RSB filling macro is applicable to AMD, and, if software is unable to
verify that LFENCE is serializing on AMD (possible when running under a
hypervisor), the generic retpoline support will be used and, so, is also
applicable to AMD.  Keep the current usage of PAUSE for Intel, but add an
LFENCE instruction to the speculation trap for AMD.

The same sequence has been adopted by GCC for the GCC generated retpolines.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Kees Cook <keescook@google.com>
Link: https://lkml.kernel.org/r/20180113232730.31060.36287.stgit@tlendack-t1.amdoffice.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit fba063e6dfb413e06b9daa5d45b164761172f5ed)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline: Remove compile time warning
Thomas Gleixner [Sun, 14 Jan 2018 21:13:29 +0000 (22:13 +0100)]
x86/retpoline: Remove compile time warning

commit b8b9ce4b5aec8de9e23cabb0a26b78641f9ab1d6 upstream.

Remove the compile time warning when CONFIG_RETPOLINE=y and the compiler
does not have retpoline support. Linus rationale for this is:

  It's wrong because it will just make people turn off RETPOLINE, and the
  asm updates - and return stack clearing - that are independent of the
  compiler are likely the most important parts because they are likely the
  ones easiest to target.

  And it's annoying because most people won't be able to do anything about
  it. The number of people building their own compiler? Very small. So if
  their distro hasn't got a compiler yet (and pretty much nobody does), the
  warning is just annoying crap.

  It is already properly reported as part of the sysfs interface. The
  compile-time warning only encourages bad things.

Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support")
Requested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Link: https://lkml.kernel.org/r/CA+55aFzWgquv4i6Mab6bASqYXg3ErV3XDFEYf=GEcCDQg5uAtw@mail.gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 451725c3e785dfc3ede6c65184b96c213181995a)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline: Fill return stack buffer on vmexit
David Woodhouse [Fri, 12 Jan 2018 11:11:27 +0000 (11:11 +0000)]
x86/retpoline: Fill return stack buffer on vmexit

commit 117cc7a908c83697b0b737d15ae1eb5943afe35b upstream.

In accordance with the Intel and AMD documentation, we need to overwrite
all entries in the RSB on exiting a guest, to prevent malicious branch
target predictions from affecting the host kernel. This is needed both
for retpoline and for IBRS.

[ak: numbers again for the RSB stuffing labels]

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515755487-8524-1-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Razvan Ghitulete <rga@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit eebc3f8adee0a6f43a4789ef0bf5c5b35de8cfe4)

Removed now-unused stuff_RSB function from uek4's 60f856955c1b
("x86/kvm: Pad RSB on VM transition").

Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/kvm/svm.c
arch/x86/kvm/vmx.c
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline/irq32: Convert assembler indirect jumps
Andi Kleen [Thu, 11 Jan 2018 21:46:33 +0000 (21:46 +0000)]
x86/retpoline/irq32: Convert assembler indirect jumps

commit 7614e913db1f40fff819b36216484dc3808995d4 upstream.

Convert all indirect jumps in 32bit irq inline asm code to use non
speculative sequences.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-12-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Razvan Ghitulete <rga@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f72655b837eb4320a1ffebbd0e0ebe92ce1e5314)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/kernel/irq_32.c
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline/checksum32: Convert assembler indirect jumps
David Woodhouse [Thu, 11 Jan 2018 21:46:32 +0000 (21:46 +0000)]
x86/retpoline/checksum32: Convert assembler indirect jumps

commit 5096732f6f695001fa2d6f1335a2680b37912c69 upstream.

Convert all indirect jumps in 32bit checksum assembler code to use
non-speculative sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-11-git-send-email-dwmw@amazon.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7e5bb301bd2fdd62cbee7b26a8234cccb6731849)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline/xen: Convert Xen hypercall indirect jumps
David Woodhouse [Thu, 11 Jan 2018 21:46:31 +0000 (21:46 +0000)]
x86/retpoline/xen: Convert Xen hypercall indirect jumps

commit ea08816d5b185ab3d09e95e393f265af54560350 upstream.

Convert indirect call in Xen hypercall to use non-speculative sequence,
when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-10-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27477743
CVE: CVE-2017-5715
(cherry picked from commit 6b222e7483af4fd8f632efbf3b91025c2359b10a)
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/include/asm/xen/hypercall.h
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline/hyperv: Convert assembler indirect jumps
David Woodhouse [Thu, 11 Jan 2018 21:46:30 +0000 (21:46 +0000)]
x86/retpoline/hyperv: Convert assembler indirect jumps

commit e70e5892b28c18f517f29ab6e83bd57705104b31 upstream.

Convert all indirect jumps in hyperv inline asm code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-9-git-send-email-dwmw@amazon.co.uk
[ backport to 4.4, hopefully correct, not tested... - gregkh ]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27477743
CVE: CVE-2017-5715
(cherry picked from commit d2beed45635e3c430bc6d84ff8e6c6e8cb2e10b4)
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline/ftrace: Convert ftrace assembler indirect jumps
David Woodhouse [Thu, 11 Jan 2018 21:46:29 +0000 (21:46 +0000)]
x86/retpoline/ftrace: Convert ftrace assembler indirect jumps

commit 9351803bd803cdbeb9b5a7850b7b6f464806e3db upstream.

Convert all indirect jumps in ftrace assembler code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-8-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Razvan Ghitulete <rga@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7153a6d5ff050050555066f58ac3458c5efc699b)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
        arch/x86/kernel/entry_64.S
  (dmj: patch had arch/x86/entry/entry_32.S)
arch/x86/kernel/mcount_64.S
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline/entry: Convert entry assembler indirect jumps
David Woodhouse [Thu, 11 Jan 2018 21:46:28 +0000 (21:46 +0000)]
x86/retpoline/entry: Convert entry assembler indirect jumps

commit 2641f08bb7fc63a636a2b18173221d7040a3512e upstream.

Convert indirect jumps in core 32/64bit entry assembler code to use
non-speculative sequences when CONFIG_RETPOLINE is enabled.

Don't use CALL_NOSPEC in entry_SYSCALL_64_fastpath because the return
address after the 'call' instruction must be *precisely* at the
.Lentry_SYSCALL_64_after_fastpath label for stub_ptregs_64 to work,
and the use of alternatives will mess that up unless we play horrid
games to prepend with NOPs and make the variants the same length. It's
not worth it; in the case where we ALTERNATIVE out the retpoline, the
first instruction at __x86.indirect_thunk.rax is going to be a bare
jmp *%rax anyway.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-7-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Razvan Ghitulete <rga@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 028083cb02db69237e73950576bc81ac579693dc)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/kernel/entry_32.S
          (dmj:
            - patch had arch/x86/entry/entry_32.S
            - extra retpolines needed for sys_call_table)
arch/x86/kernel/entry_64.S
          (dmj: patch had arch/x86/entry/entry_64.S)
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline/crypto: Convert crypto assembler indirect jumps
David Woodhouse [Thu, 11 Jan 2018 21:46:27 +0000 (21:46 +0000)]
x86/retpoline/crypto: Convert crypto assembler indirect jumps

commit 9697fa39efd3fc3692f2949d4045f393ec58450b upstream.

Convert all indirect jumps in crypto assembler code to use non-speculative
sequences when CONFIG_RETPOLINE is enabled.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-6-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9fe55976f0c8acfd7408bf693b6d171587b62129)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/spectre_v2: Add disable_ibrs_and_friends
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 15:43:41 +0000 (10:43 -0500)]
x86/spectre_v2: Add disable_ibrs_and_friends

Instead of having the three calls to set_{ibrs,ibpb,lfence}_disabled.

Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/spectre_v2: Figure out if STUFF_RSB macro needs to be used.
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 15:40:43 +0000 (10:40 -0500)]
x86/spectre_v2: Figure out if STUFF_RSB macro needs to be used.

If IBRS is on, STUFF_RSB only needs to be enabled if the CPU
does not have SMEP enabled.

We change the macro to depend on a synthetic one called
X86_FEATURE_STUFF_RSB - which will only be turned on
if 'spectre_v2=ibrs' is set and the BIOS does not have SMEP
enabled.

(And naturally also if the kernel is compiled without retpoline
and the CPU does not have SMEP).

Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/spectre_v2: Figure out when to use IBRS.
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 15:13:37 +0000 (10:13 -0500)]
x86/spectre_v2: Figure out when to use IBRS.

Which is if:
 a) on the bootline you have 'spectre_v2=ibrs' _and_ the microcode
    is loaded that exposes this. And nobody used 'noibrs'.

    Also if you have 'spectre_v2=ibrs noibrs' we end up falling
    in the 'lfence'. Unless somebody did 'spectre_v2=ibrs noibrs nolfence',
    then we go back to automatic discovery.

b) Kernel compiled without retpoline and the microcode is
   available.

c) Kernel compiled without retpoline and there is no microcode with IBRS
   then we pick 'lfence' on system calls.

Orabug: 27477743
CVE: CVE-2017-5715

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/spectre: Add IBRS option.
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 14:45:27 +0000 (09:45 -0500)]
x86/spectre: Add IBRS option.

The spectre_v2_mitigation already has an IBRS option, lets make
the override possible. But don't select it by default if
the kernel has been compiled with retpoline.

If it has not (no compiler support), then fallback to ibrs.

Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/spectre: Add boot time option to select Spectre v2 mitigation
David Woodhouse [Thu, 11 Jan 2018 21:46:26 +0000 (21:46 +0000)]
x86/spectre: Add boot time option to select Spectre v2 mitigation

commit da285121560e769cc31797bba6422eea71d473e0 upstream.

Add a spectre_v2= option to select the mitigation used for the indirect
branch speculation vulnerability.

Currently, the only option available is retpoline, in its various forms.
This will be expanded to cover the new IBRS/IBPB microcode features.

The RETPOLINE_AMD feature relies on a serializing LFENCE for speculation
control. For AMD hardware, only set RETPOLINE_AMD if LFENCE is a
serializing instruction, which is indicated by the LFENCE_RDTSC feature.

[ tglx: Folded back the LFENCE/AMD fixes and reworked it so IBRS
   integration becomes simple ]

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-5-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9f789bc5711bcacb5df003594b992f0c1cc19df4)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/kernel/cpu/bugs_64.c
          (dmj:
            - patch used arch/x86/kernel/cpu/bugs.c
            - UEK4-specific IBPB and IBRS code merged with retpoline
              cmdline parsing
            - disabled IBRS in the case of full retpoline mitigation)
           (konrad:
            - disabled lfence on IBRS paths in case of full retpoline)
arch/x86/kernel/cpu/common.c
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/retpoline: Add initial retpoline support
David Woodhouse [Thu, 11 Jan 2018 21:46:25 +0000 (21:46 +0000)]
x86/retpoline: Add initial retpoline support

commit 76b043848fd22dbf7f8bf3a1452f8c70d557b860 upstream.

Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
the corresponding thunks. Provide assembler macros for invoking the thunks
in the same way that GCC does, from native and inline assembler.

This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In
some circumstances, IBRS microcode features may be used instead, and the
retpoline can be disabled.

On AMD CPUs if lfence is serialising, the retpoline can be dramatically
simplified to a simple "lfence; jmp *\reg". A future patch, after it has
been verified that lfence really is serialising in all circumstances, can
enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition
to X86_FEATURE_RETPOLINE.

Do not align the retpoline in the altinstr section, because there is no
guarantee that it stays aligned when it's copied over the oldinstr during
alternative patching.

[ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks]
[ tglx: Put actual function CALL/JMP in front of the macros, convert to
   symbolic labels ]
[ dwmw2: Convert back to numeric labels, merge objtool fixes ]

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
[ 4.4 backport: removed objtool annotation since there is no objtool ]
Signed-off-by: Razvan Ghitulete <rga@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3c5e10905263dbe9fbc621d1889b85e9c867da25)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/kernel/cpu/common.c
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agokconfig.h: use __is_defined() to check if MODULE is defined
Masahiro Yamada [Tue, 14 Jun 2016 05:58:54 +0000 (14:58 +0900)]
kconfig.h: use __is_defined() to check if MODULE is defined

commit 4f920843d248946545415c1bf6120942048708ed upstream.

The macro MODULE is not a config option, it is a per-file build
option.  So, config_enabled(MODULE) is not sensible.  (There is
another case in include/linux/export.h, where config_enabled() is
used against a non-config option.)

This commit renames some macros in include/linux/kconfig.h for the
use for non-config macros and replaces config_enabled(MODULE) with
__is_defined(MODULE).

I am keeping config_enabled() because it is still referenced from
some places, but I expect it would be deprecated in the future.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Razvan Ghitulete <rga@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27477743
CVE: CVE-2017-5715
(cherry picked from commit 675901851fd2a00c7a70bcf327303efba3b81e2f)
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agoEXPORT_SYMBOL() for asm
Al Viro [Mon, 11 Jan 2016 15:54:54 +0000 (10:54 -0500)]
EXPORT_SYMBOL() for asm

commit 22823ab419d8ed884195cfa75483fd3a99bb1462 upstream.

Add asm-usable variants of EXPORT_SYMBOL/EXPORT_SYMBOL_GPL.  This
commit just adds the default implementation; most of the architectures
can simply add export.h to asm/Kbuild and start using <asm/export.h>
from assembler.  The rest needs to have their <asm/export.h> define
everal macros and then explicitly include <asm-generic/export.h>

One area where the things might diverge from default is the alignment;
normally it's 8 bytes on 64bit targets and 4 on 32bit ones, both for
unsigned long and for struct kernel_symbol.  Unfortunately, amd64 and
m68k are unusual - m68k aligns to 2 bytes (for both) and amd64 aligns
struct kernel_symbol to 16 bytes.  For those we'll need asm/export.h to
override the constants used by generic version - KSYM_ALIGN and KCRC_ALIGN
for kernel_symbol and unsigned long resp.  And no, __alignof__ would not
do the trick - on amd64 __alignof__ of struct kernel_symbol is 8, not 16.

More serious source of unpleasantness is treatment of function
descriptors on architectures that have those.  Things like ppc64,
parisc, ia64, etc.  need more than the address of the first insn to
call an arbitrary function.  As the result, their representation of
pointers to functions is not the typical "address of the entry point" -
it's an address of a small static structure containing all the required
information (including the entry point, of course).  Sadly, the asm-side
conventions differ in what the function name refers to - entry point or
the function descriptor.  On ppc64 we do the latter;
bar: .quad foo
is what void (*bar)(void) = foo; turns into and the rare places where
we need to explicitly work with the label of entry point are dealt with
as DOTSYM(foo).  For our purposes it's ideal - generic macros are usable.
However, parisc would have foo and P%foo used for label of entry point
and address of the function descriptor and
bar: .long P%foo
woudl be used instead. ia64 goes similar to parisc in that respect,
except that there it's @fptr(foo) rather than P%foo.  Such architectures
need to define KSYM_FUNC that would turn a function name into whatever
is needed to refer to function descriptor.

What's more, on such architectures we need to know whether we are exporting
a function or an object - in assembler we have to tell that explicitly, to
decide whether we want EXPORT_SYMBOL(foo) produce e.g.
__ksymtab_foo: .quad foo
or
__ksymtab_foo: .quad @fptr(foo)

For that reason we introduce EXPORT_DATA_SYMBOL{,_GPL}(), to be used for
exports of data objects.  On normal architectures it's the same thing
as EXPORT_SYMBOL{,_GPL}(), but on parisc-like ones they differ and the
right one needs to be used.  Most of the exports are functions, so we
keep EXPORT_SYMBOL for those...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Razvan Ghitulete <rga@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27477743
CVE: CVE-2017-5715
(cherry picked from commit a88693d006986b8fc9ae5036852ac0156106de2a)
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/asm: Make asm/alternative.h safe from assembly
Andy Lutomirski [Tue, 26 Apr 2016 19:23:25 +0000 (12:23 -0700)]
x86/asm: Make asm/alternative.h safe from assembly

commit f005f5d860e0231fe212cfda8c1a3148b99609f4 upstream.

asm/alternative.h isn't directly useful from assembly, but it
shouldn't break the build.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e5b693fcef99fe6e80341c9e97a002fb23871e91.1461698311.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Razvan Ghitulete <rga@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27477743
CVE: CVE-2017-5715
(cherry picked from commit b8e7a489b51810992ba3266bfd9e043709e07267)
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/kbuild: enable modversions for symbols exported from asm
Adam Borowski [Sun, 11 Dec 2016 01:09:18 +0000 (02:09 +0100)]
x86/kbuild: enable modversions for symbols exported from asm

commit 334bb773876403eae3457d81be0b8ea70f8e4ccc upstream.

Commit 4efca4ed ("kbuild: modversions for EXPORT_SYMBOL() for asm") adds
modversion support for symbols exported from asm files. Architectures
must include C-style declarations for those symbols in asm/asm-prototypes.h
in order for them to be versioned.

Add these declarations for x86, and an architecture-independent file that
can be used for common symbols.

With f27c2f6 reverting 8ab2ae6 ("default exported asm symbols to zero") we
produce a scary warning on x86, this commit fixes that.

Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Tested-by: Kalle Valo <kvalo@codeaurora.org>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Razvan Ghitulete <rga@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27477743
CVE: CVE-2017-5715
(cherry picked from commit b76ac90af34dc08f057da15aa34584a0a3d381b5)
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/asm: Use register variable to get stack pointer value
Andrey Ryabinin [Fri, 29 Sep 2017 14:15:36 +0000 (17:15 +0300)]
x86/asm: Use register variable to get stack pointer value

commit 196bd485ee4f03ce4c690bfcf38138abfcd0a4bc upstream.

Currently we use current_stack_pointer() function to get the value
of the stack pointer register. Since commit:

  f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")

... we have a stack register variable declared. It can be used instead of
current_stack_pointer() function which allows to optimize away some
excessive "mov %rsp, %<dst>" instructions:

 -mov    %rsp,%rdx
 -sub    %rdx,%rax
 -cmp    $0x3fff,%rax
 -ja     ffffffff810722fd <ist_begin_non_atomic+0x2d>

 +sub    %rsp,%rax
 +cmp    $0x3fff,%rax
 +ja     ffffffff810722fa <ist_begin_non_atomic+0x2a>

Remove current_stack_pointer(), rename __asm_call_sp to current_stack_pointer
and use it instead of the removed function.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170929141537.29167-1-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[dwmw2: We want ASM_CALL_CONSTRAINT for retpoline]
Signed-off-by: David Woodhouse <dwmw@amazon.co.ku>
Signed-off-by: Razvan Ghitulete <rga@amazon.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit cfc8c1d61e46fd3c60a34a5b1962eeeb03222a3d)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlier
Andy Lutomirski [Sun, 17 Sep 2017 16:03:50 +0000 (09:03 -0700)]
x86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlier

commit b8b7abaed7a49b350f8ba659ddc264b04931d581 upstream.

Otherwise we might have the PCID feature bit set during cpu_init().

This is just for robustness.  I haven't seen any actual bugs here.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: cba4671af755 ("x86/mm: Disable PCID on 32-bit kernels")
Link: http://lkml.kernel.org/r/b16dae9d6b0db5d9801ddbebbfd83384097c61f3.1505663533.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 416f66509fce36e128ea2d9e62739a21f1a1d04d)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/kernel/cpu/common.c
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm
David Woodhouse [Thu, 4 Jan 2018 14:37:05 +0000 (14:37 +0000)]
x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm

commit b9e705ef7cfaf22db0daab91ad3cd33b0fa32eb9 upstream.

Where an ALTERNATIVE is used in the middle of an inline asm block, this
would otherwise lead to the following instruction being appended directly
to the trailing ".popsection", and a failed compile.

Fixes: 9cebed423c84 ("x86, alternative: Use .pushsection/.popsection")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: ak@linux.intel.com
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180104143710.8961-8-dwmw@amazon.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 999d4f1961fa002bda138ddfe9119965421f85da)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
7 years agox86/alternatives: Fix optimize_nops() checking
Borislav Petkov [Wed, 10 Jan 2018 11:28:16 +0000 (12:28 +0100)]
x86/alternatives: Fix optimize_nops() checking

commit 612e8e9350fd19cae6900cf36ea0c6892d1a0dca upstream.

The alternatives code checks only the first byte whether it is a NOP, but
with NOPs in front of the payload and having actual instructions after it
breaks the "optimized' test.

Make sure to scan all bytes before deciding to optimize the NOPs in there.

Reported-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/20180110112815.mgciyf5acwacphkq@pd.tnic
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit e997d991ab2b1dc9f9cdad999a891626c2aecf21)
Orabug: 27477743
CVE: CVE-2017-5715
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
7 years agoblock: Check for gaps on front and back merges
Jens Axboe [Thu, 3 Sep 2015 16:28:20 +0000 (19:28 +0300)]
block: Check for gaps on front and back merges

We are checking for gaps to previous bio_vec, which can
only detect back merges gaps. Moreover, at the point where
we check for a gap, we don't know if we will attempt a back
or a front merge. Thus, check for gap to prev in a back merge
attempt and check for a gap to next in a front merge attempt.

Signed-off-by: Jens Axboe <axboe@fb.com>
[sagig: Minor rename change]
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
(cherry picked from commit 5e7c4274a70aa2d6f485996d0ca1dad52d0039ca)

Orabug: 27484719

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Kyle Fortin <kyle.fortin@oracle.com>
7 years agoblock: Copy a user iovec if it includes gaps
Sagi Grimberg [Thu, 3 Sep 2015 16:28:23 +0000 (19:28 +0300)]
block: Copy a user iovec if it includes gaps

For drivers that don't support gaps in the SG lists handed to
them we must bounce (copy the user buffers) and pass a bio that
does not include gaps. This doesn't matter for any current user,
but will help to allow iser which can't handle gaps to use the
block virtual boundary instead of using driver-local bounce
buffering when handling SG_IO commands.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 46348456c1791053dcbe5a9e21825b10a3c8a8fb)

Orabug: 27484719

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Kyle Fortin <kyle.fortin@oracle.com>
7 years agoblock: Replace SG_GAPS with new queue limits mask
Keith Busch [Thu, 1 Feb 2018 19:37:06 +0000 (11:37 -0800)]
block: Replace SG_GAPS with new queue limits mask

The SG_GAPS queue flag caused checks for bio vector alignment against
PAGE_SIZE, but the device may have different constraints. This patch
adds a queue limits so a driver with such constraints can set to allow
requests that would have been unnecessarily split. The new gaps check
takes the request_queue as a parameter to simplify the logic around
invoking this function.

This new limit makes the queue flag redundant, so removing it and
all usage. Device-mappers will inherit the correct settings through
blk_stack_limits().

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 03100aada96f0645bbcb89aea24c01f02d0ef1fa)

Orabug: 27484719

Conflicts:
        Changes in blk_bio_segment_split() are not picked up
from 03100aada96f0645bbcb89aea24c01f02d0ef1fa as we are not
porting arbitary bio size support.

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Kyle Fortin <kyle.fortin@oracle.com>
7 years agoRevert "block: Copy a user iovec if it includes gaps"
Ashok Vairavan [Sun, 4 Feb 2018 02:18:36 +0000 (18:18 -0800)]
Revert "block: Copy a user iovec if it includes gaps"

This reverts commit cc500383c29a734102683003e2485dac6c6a0f05.

In UEK4, the decision is made to do complete switch to
queue_virt_boundary discarding the use of SG_GAPS flags. Hence,
this partial porting of new queue limit mask is reverted. Also,
this partial patch is broken for stacked devices.

Orabug: 27484719

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Kyle Fortin <kyle.fortin@oracle.com>
7 years agoRevert "block: Check for gaps on front and back merges"
Ashok Vairavan [Sun, 4 Feb 2018 01:34:38 +0000 (17:34 -0800)]
Revert "block: Check for gaps on front and back merges"

This reverts commit 50f113a81b852e45ead0d4b0fd5d79e96530f643.

In UEK4, the decision is made to do complete switch to
queue_virt_boundary discarding the use of SG_GAPS flags. Hence,
this partial porting of new queue limit mask is reverted. Also,
this partial patch is broken for stacked devices.

Orabug: 27484719

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Kyle Fortin <kyle.fortin@oracle.com>
7 years agoRevert "blk: [Partial] Replace SG_GAPGS with new queue limits mask"
Ashok Vairavan [Sun, 4 Feb 2018 00:39:19 +0000 (16:39 -0800)]
Revert "blk: [Partial] Replace SG_GAPGS with new queue limits mask"

This reverts commit 5ba566b455bd5302aff276dca38059202b7be08c.

In UEK4, the decision is made to do complete switch to
queue_virt_boundary discarding the use of SG_GAPS flags. Hence,
this partial porting of new queue limit mask is reverted. Also,
this partial patch is broken for stacked devices.

Orabug: 27484719

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Kyle Fortin <kyle.fortin@oracle.com>
7 years agoqlcnic: fix deadlock bug
Junxiao Bi [Mon, 29 Jan 2018 09:53:42 +0000 (17:53 +0800)]
qlcnic: fix deadlock bug

Orabug: 27496907

The following soft lockup was caught. This is a deadlock caused by
recursive locking.

Process kworker/u40:1:28016 was holding spin lock "mbx->queue_lock" in
qlcnic_83xx_mailbox_worker(), while a softirq came in and ask the same spin
lock in qlcnic_83xx_enqueue_mbx_cmd(). This lock should be hold by disable
bh..

[161846.962125] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/u40:1:28016]
[161846.962367] Modules linked in: tun ocfs2 xen_netback xen_blkback xen_gntalloc xen_gntdev xen_evtchn xenfs xen_privcmd autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs bnx2fc fcoe libfcoe libfc sunrpc 8021q mrp garp bridge stp llc bonding dm_round_robin dm_multipath iTCO_wdt iTCO_vendor_support pcspkr sb_edac edac_core i2c_i801 shpchp lpc_ich mfd_core ioatdma ipmi_devintf ipmi_si ipmi_msghandler sg ext4 jbd2 mbcache2 sr_mod cdrom sd_mod igb i2c_algo_bit i2c_core ahci libahci megaraid_sas ixgbe dca ptp pps_core vxlan udp_tunnel ip6_udp_tunnel qla2xxx scsi_transport_fc qlcnic crc32c_intel be2iscsi bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi ipv6 cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_mod
[161846.962454]
[161846.962460] CPU: 1 PID: 28016 Comm: kworker/u40:1 Not tainted 4.1.12-94.5.9.el6uek.x86_64 #2
[161846.962463] Hardware name: Oracle Corporation SUN SERVER X4-2L      /ASSY,MB,X4-2L         , BIOS 26050100 09/19/2017
[161846.962489] Workqueue: qlcnic_mailbox qlcnic_83xx_mailbox_worker [qlcnic]
[161846.962493] task: ffff8801f2e34600 ti: ffff88004ca5c000 task.ti: ffff88004ca5c000
[161846.962496] RIP: e030:[<ffffffff810013aa>]  [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
[161846.962506] RSP: e02b:ffff880202e43388  EFLAGS: 00000206
[161846.962509] RAX: 0000000000000000 RBX: ffff8801f6996b70 RCX: ffffffff810013aa
[161846.962511] RDX: ffff880202e433cc RSI: ffff880202e433b0 RDI: 0000000000000003
[161846.962513] RBP: ffff880202e433d0 R08: 0000000000000000 R09: ffff8801fe893200
[161846.962516] R10: ffff8801fe400538 R11: 0000000000000206 R12: ffff880202e4b000
[161846.962518] R13: 0000000000000050 R14: 0000000000000001 R15: 000000000000020d
[161846.962528] FS:  0000000000000000(0000) GS:ffff880202e40000(0000) knlGS:ffff880202e40000
[161846.962531] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[161846.962533] CR2: 0000000002612640 CR3: 00000001bb796000 CR4: 0000000000042660
[161846.962536] Stack:
[161846.962538]  ffff880202e43608 0000000000000000 ffffffff813f0442 ffff880202e433b0
[161846.962543]  0000000000000000 ffff880202e433cc ffffffff00000001 0000000000000000
[161846.962547]  00000009813f03d6 ffff880202e433e0 ffffffff813f0460 ffff880202e43440
[161846.962552] Call Trace:
[161846.962555]  <IRQ>
[161846.962565]  [<ffffffff813f0442>] ? xen_poll_irq_timeout+0x42/0x50
[161846.962570]  [<ffffffff813f0460>] xen_poll_irq+0x10/0x20
[161846.962578]  [<ffffffff81014222>] xen_lock_spinning+0xe2/0x110
[161846.962583]  [<ffffffff81013f01>] __raw_callee_save_xen_lock_spinning+0x11/0x20
[161846.962592]  [<ffffffff816e5c57>] ? _raw_spin_lock+0x57/0x80
[161846.962609]  [<ffffffffa028acfc>] qlcnic_83xx_enqueue_mbx_cmd+0x7c/0xe0 [qlcnic]
[161846.962623]  [<ffffffffa028e008>] qlcnic_83xx_issue_cmd+0x58/0x210 [qlcnic]
[161846.962636]  [<ffffffffa028caf2>] qlcnic_83xx_sre_macaddr_change+0x162/0x1d0 [qlcnic]
[161846.962649]  [<ffffffffa028cb8b>] qlcnic_83xx_change_l2_filter+0x2b/0x30 [qlcnic]
[161846.962657]  [<ffffffff8160248b>] ? __skb_flow_dissect+0x18b/0x650
[161846.962670]  [<ffffffffa02856e5>] qlcnic_send_filter+0x205/0x250 [qlcnic]
[161846.962682]  [<ffffffffa0285c77>] qlcnic_xmit_frame+0x547/0x7b0 [qlcnic]
[161846.962691]  [<ffffffff8160ac22>] xmit_one+0x82/0x1a0
[161846.962696]  [<ffffffff8160ad90>] dev_hard_start_xmit+0x50/0xa0
[161846.962701]  [<ffffffff81630112>] sch_direct_xmit+0x112/0x220
[161846.962706]  [<ffffffff8160b80f>] __dev_queue_xmit+0x1df/0x5e0
[161846.962710]  [<ffffffff8160bc33>] dev_queue_xmit_sk+0x13/0x20
[161846.962721]  [<ffffffffa0575bd5>] bond_dev_queue_xmit+0x35/0x80 [bonding]
[161846.962729]  [<ffffffffa05769fb>] __bond_start_xmit+0x1cb/0x210 [bonding]
[161846.962736]  [<ffffffffa0576a71>] bond_start_xmit+0x31/0x60 [bonding]
[161846.962740]  [<ffffffff8160ac22>] xmit_one+0x82/0x1a0
[161846.962745]  [<ffffffff8160ad90>] dev_hard_start_xmit+0x50/0xa0
[161846.962749]  [<ffffffff8160bb1e>] __dev_queue_xmit+0x4ee/0x5e0
[161846.962754]  [<ffffffff8160bc33>] dev_queue_xmit_sk+0x13/0x20
[161846.962760]  [<ffffffffa05cfa72>] vlan_dev_hard_start_xmit+0xb2/0x150 [8021q]
[161846.962764]  [<ffffffff8160ac22>] xmit_one+0x82/0x1a0
[161846.962769]  [<ffffffff8160ad90>] dev_hard_start_xmit+0x50/0xa0
[161846.962773]  [<ffffffff8160bb1e>] __dev_queue_xmit+0x4ee/0x5e0
[161846.962777]  [<ffffffff8160bc33>] dev_queue_xmit_sk+0x13/0x20
[161846.962789]  [<ffffffffa05adf74>] br_dev_queue_push_xmit+0x54/0xa0 [bridge]
[161846.962797]  [<ffffffffa05ae4ff>] br_forward_finish+0x2f/0x90 [bridge]
[161846.962807]  [<ffffffff810b0dad>] ? ttwu_do_wakeup+0x1d/0x100
[161846.962811]  [<ffffffff815f929b>] ? __alloc_skb+0x8b/0x1f0
[161846.962818]  [<ffffffffa05ae04d>] __br_forward+0x8d/0x120 [bridge]
[161846.962822]  [<ffffffff815f613b>] ? __kmalloc_reserve+0x3b/0xa0
[161846.962829]  [<ffffffff810be55e>] ? update_rq_runnable_avg+0xee/0x230
[161846.962836]  [<ffffffffa05ae176>] br_forward+0x96/0xb0 [bridge]
[161846.962845]  [<ffffffffa05af85e>] br_handle_frame_finish+0x1ae/0x420 [bridge]
[161846.962853]  [<ffffffffa05afc4f>] br_handle_frame+0x17f/0x260 [bridge]
[161846.962862]  [<ffffffffa05afad0>] ? br_handle_frame_finish+0x420/0x420 [bridge]
[161846.962867]  [<ffffffff8160d057>] __netif_receive_skb_core+0x1f7/0x870
[161846.962872]  [<ffffffff8160d6f2>] __netif_receive_skb+0x22/0x70
[161846.962877]  [<ffffffff8160d913>] netif_receive_skb_internal+0x23/0x90
[161846.962884]  [<ffffffffa07512ea>] ? xenvif_idx_release+0xea/0x100 [xen_netback]
[161846.962889]  [<ffffffff816e5a10>] ? _raw_spin_unlock_irqrestore+0x20/0x50
[161846.962893]  [<ffffffff8160e624>] netif_receive_skb_sk+0x24/0x90
[161846.962899]  [<ffffffffa075269a>] xenvif_tx_submit+0x2ca/0x3f0 [xen_netback]
[161846.962906]  [<ffffffffa0753f0c>] xenvif_tx_action+0x9c/0xd0 [xen_netback]
[161846.962915]  [<ffffffffa07567f5>] xenvif_poll+0x35/0x70 [xen_netback]
[161846.962920]  [<ffffffff8160e01b>] napi_poll+0xcb/0x1e0
[161846.962925]  [<ffffffff8160e1c0>] net_rx_action+0x90/0x1c0
[161846.962931]  [<ffffffff8108aaba>] __do_softirq+0x10a/0x350
[161846.962938]  [<ffffffff8108ae75>] irq_exit+0x125/0x130
[161846.962943]  [<ffffffff813f03a9>] xen_evtchn_do_upcall+0x39/0x50
[161846.962950]  [<ffffffff816e7ffe>] xen_do_hypervisor_callback+0x1e/0x40
[161846.962952]  <EOI>
[161846.962959]  [<ffffffff816e5c4a>] ? _raw_spin_lock+0x4a/0x80
[161846.962964]  [<ffffffff816e5b1e>] ? _raw_spin_lock_irqsave+0x1e/0xa0
[161846.962978]  [<ffffffffa028e279>] ? qlcnic_83xx_mailbox_worker+0xb9/0x2a0 [qlcnic]
[161846.962991]  [<ffffffff810a14e1>] ? process_one_work+0x151/0x4b0
[161846.962995]  [<ffffffff8100c3f2>] ? check_events+0x12/0x20
[161846.963001]  [<ffffffff810a1960>] ? worker_thread+0x120/0x480
[161846.963005]  [<ffffffff816e187b>] ? __schedule+0x30b/0x890
[161846.963010]  [<ffffffff810a1840>] ? process_one_work+0x4b0/0x4b0
[161846.963015]  [<ffffffff810a1840>] ? process_one_work+0x4b0/0x4b0
[161846.963021]  [<ffffffff810a6b3e>] ? kthread+0xce/0xf0
[161846.963025]  [<ffffffff810a6a70>] ? kthread_freezable_should_stop+0x70/0x70
[161846.963031]  [<ffffffff816e6522>] ? ret_from_fork+0x42/0x70
[161846.963035]  [<ffffffff810a6a70>] ? kthread_freezable_should_stop+0x70/0x70
[161846.963037] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 233ac3891607f501f08879134d623b303838f478)
Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
7 years agox86/entry: RESTORE_IBRS needs to be done under kernel CR3
Ankur Arora [Mon, 5 Feb 2018 03:35:07 +0000 (22:35 -0500)]
x86/entry: RESTORE_IBRS needs to be done under kernel CR3

RESTORE_IBRS_CLOBBER executes after we have already switched
to the USER_CR3. This blows up because RESTORE_IBRS_CLOBBER
looks at a kernel variable (use_ibrs).

Orabug: 27501734

Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit a2b15d7844fc60bc3ebb5f1703cd2fe39256db35)

7 years agords: Fix NULL pointer dereference in __rds_rdma_map
Håkon Bugge [Wed, 6 Dec 2017 16:18:28 +0000 (17:18 +0100)]
rds: Fix NULL pointer dereference in __rds_rdma_map

This is a fix for syzkaller719569, where memory registration was
attempted without any underlying transport being loaded.

Analysis of the case reveals that it is the setsockopt() RDS_GET_MR
(2) and RDS_GET_MR_FOR_DEST (7) that are vulnerable.

Here is an example stack trace when the bug is hit:

BUG: unable to handle kernel NULL pointer dereference at 00000000000000c0
IP: __rds_rdma_map+0x36/0x440 [rds]
PGD 2f93d03067 P4D 2f93d03067 PUD 2f93d02067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: bridge stp llc tun rpcsec_gss_krb5 nfsv4
dns_resolver nfs fscache rds binfmt_misc sb_edac intel_powerclamp
coretemp kvm_intel kvm irqbypass crct10dif_pclmul c rc32_pclmul
ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd
iTCO_wdt mei_me sg iTCO_vendor_support ipmi_si mei ipmi_devintf nfsd
shpchp pcspkr i2c_i801 ioatd ma ipmi_msghandler wmi lpc_ich mfd_core
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2
mgag200 i2c_algo_bit drm_kms_helper ixgbe syscopyarea ahci sysfillrect
sysimgblt libahci mdio fb_sys_fops ttm ptp libata sd_mod mlx4_core drm
crc32c_intel pps_core megaraid_sas i2c_core dca dm_mirror
dm_region_hash dm_log dm_mod
CPU: 48 PID: 45787 Comm: repro_set2 Not tainted 4.14.2-3.el7uek.x86_64 #2
Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017
task: ffff882f9190db00 task.stack: ffffc9002b994000
RIP: 0010:__rds_rdma_map+0x36/0x440 [rds]
RSP: 0018:ffffc9002b997df0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff882fa2182580 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffc9002b997e40 RDI: ffff882fa2182580
RBP: ffffc9002b997e30 R08: 0000000000000000 R09: 0000000000000002
R10: ffff885fb29e3838 R11: 0000000000000000 R12: ffff882fa2182580
R13: ffff882fa2182580 R14: 0000000000000002 R15: 0000000020000ffc
FS:  00007fbffa20b700(0000) GS:ffff882fbfb80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000c0 CR3: 0000002f98a66006 CR4: 00000000001606e0
Call Trace:
 rds_get_mr+0x56/0x80 [rds]
 rds_setsockopt+0x172/0x340 [rds]
 ? __fget_light+0x25/0x60
 ? __fdget+0x13/0x20
 SyS_setsockopt+0x80/0xe0
 do_syscall_64+0x67/0x1b0
 entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7fbff9b117f9
RSP: 002b:00007fbffa20aed8 EFLAGS: 00000293 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00000000000c84a4 RCX: 00007fbff9b117f9
RDX: 0000000000000002 RSI: 0000400000000114 RDI: 000000000000109b
RBP: 00007fbffa20af10 R08: 0000000000000020 R09: 00007fbff9dd7860
R10: 0000000020000ffc R11: 0000000000000293 R12: 0000000000000000
R13: 00007fbffa20b9c0 R14: 00007fbffa20b700 R15: 0000000000000021

Code: 41 56 41 55 49 89 fd 41 54 53 48 83 ec 18 8b 87 f0 02 00 00 48
89 55 d0 48 89 4d c8 85 c0 0f 84 2d 03 00 00 48 8b 87 00 03 00 00 <48>
83 b8 c0 00 00 00 00 0f 84 25 03 00 0 0 48 8b 06 48 8b 56 08

The fix is to check the existence of an underlying transport in
__rds_rdma_map().

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f3069c6d33f6ae63a1668737bc78aaaa51bff7ca)

Conflicts:
net/rds/rdma.c

Due to commit 4d2cc57ee46b ("rds: Changed IP address internal
representation to struct in6_addr")

(cherry picked from commit eb54c5657bd52347a5b75f1e6be49432d8944357)

Reviewed-by: Avinash Repaka <avinash.repaka@oracle.com>
Orabug: 27477010

7 years agoBtrfs: fix unexpected EEXIST from btrfs_get_extent
Liu Bo [Thu, 21 Dec 2017 05:05:19 +0000 (22:05 -0700)]
Btrfs: fix unexpected EEXIST from btrfs_get_extent

Orabug: 27446668

This fixes a corner case that is caused by a race of dio write vs dio
read/write.

Here is how the race could happen.

Suppose that no extent map has been loaded into memory yet.
There is a file extent [0, 32K), two jobs are running concurrently
against it, t1 is doing dio write to [8K, 32K) and t2 is doing dio
read from [0, 4K) or [4K, 8K).

t1 goes ahead of t2 and splits em [0, 32K) to em [0K, 8K) and [8K 32K).

------------------------------------------------------
         t1                                t2
  btrfs_get_blocks_direct()         btrfs_get_blocks_direct()
   -> btrfs_get_extent()              -> btrfs_get_extent()
       -> lookup_extent_mapping()
       -> add_extent_mapping()            -> lookup_extent_mapping()
          # load [0, 32K)
   -> btrfs_new_extent_direct()
       -> btrfs_drop_extent_cache()
          # split [0, 32K)
       -> add_extent_mapping()
          # add [8K, 32K)
                                          -> add_extent_mapping()
                                             # handle -EEXIST when adding
                                             # [0, 32K)
------------------------------------------------------
More details about how t2(dio read/write) runs into -EEXIST:

When add_extent_mapping() gets -EEXIST for adding em [0, 32k),
search_extent_mapping() would return [0, 8k) as existing em, even
though start == existing->start, em is [0, 32k) and
extent_map_end(em) > extent_map_end(existing), ie. 32k > 8k,
then it goes thru merge_extent_mapping() which tries to add a [8k, 8k)
(with a length 0), and btrfs_get_extent() ends up returning -EEXIST,
and dio read/write will get -EEXIST which is confusing applications.

Here I also concluded all possible situations,
1) start < existing->start

            +-----------+em+-----------+
+--prev---+ |     +-------------+      |
|         | |     |             |      |
+---------+ +     +---+existing++      ++
                +
                |
                +
             start

2) start == existing->start

      +------------em------------+
      |     +-------------+      |
      |     |             |      |
      +     +----existing-+      +
            |
            |
            +
         start

3) start > existing->start && start < (existing->start + existing->len)

      +------------em------------+
      |     +-------------+      |
      |     |             |      |
      +     +----existing-+      +
               |
               |
               +
             start

4) start >= (existing->start + existing->len)

+-----------+em+-----------+
|     +-------------+      | +--next---+
|     |             |      | |         |
+     +---+existing++      + +---------+
                      +
                      |
                      +
                   start

After going thru the above case by case, it turns out that if start is
within existing em (front inclusive), then the existing em should be
returned, otherwise, we try our best to merge candidate em with
sibling ems to form a larger em.

Reported-by: David Vallender <david.vallender@landmark.co.uk>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
7 years agoBtrfs: fix incorrect block_len in merge_extent_mapping
Liu Bo [Tue, 19 Dec 2017 22:42:54 +0000 (15:42 -0700)]
Btrfs: fix incorrect block_len in merge_extent_mapping

Orabug: 27446668

%block_len could be checked on deciding if two em are mergable.

merge_extent_mapping() has only added the front pad if the front part
of em gets truncated, but it's possible that the end part gets
truncated.

For both compressed extent and inline extent, em->block_len is not
adjusted accordingly, while for regular extent, em->block_len always
equals to em->len, hence this sets em->block_len with em->len.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
7 years agoBtrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping
Liu Bo [Fri, 15 Dec 2017 06:36:53 +0000 (23:36 -0700)]
Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping

Orabug: 27446668

This is a subtle case, so in order to understand the problem, it'd be
good to know the content of existing and em when any error occurs.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
7 years agoBtrfs: deal with existing encompassing extent map in btrfs_get_extent()
Omar Sandoval [Wed, 9 Nov 2016 23:26:50 +0000 (15:26 -0800)]
Btrfs: deal with existing encompassing extent map in btrfs_get_extent()

Orabug: 27446668

My QEMU VM was seeing inexplicable I/O errors that I tracked down to
errors coming from the qcow2 virtual drive in the host system. The qcow2
file is a nocow file on my Btrfs drive, which QEMU opens with O_DIRECT.
Every once in awhile, pread() or pwrite() would return EEXIST, which
makes no sense. This turned out to be a bug in btrfs_get_extent().

Commit 8dff9c853410 ("Btrfs: deal with duplciates during extent_map
insertion in btrfs_get_extent") fixed a case in btrfs_get_extent() where
two threads race on adding the same extent map to an inode's extent map
tree. However, if the added em is merged with an adjacent em in the
extent tree, then we'll end up with an existing extent that is not
identical to but instead encompasses the extent we tried to add. When we
call merge_extent_mapping() to find the nonoverlapping part of the new
em, the arithmetic overflows because there is no such thing. We then end
up trying to add a bogus em to the em_tree, which results in a EEXIST
that can bubble all the way up to userspace.

Fix it by extending the identical extent map special case.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
(cherry picked from commit 8e2bd3b7fac91b79a6115fd1511ca20b2a09696d)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
7 years agoBtrfs: deal with duplciates during extent_map insertion in btrfs_get_extent
Chris Mason [Sat, 19 Sep 2015 18:28:25 +0000 (11:28 -0700)]
Btrfs: deal with duplciates during extent_map insertion in btrfs_get_extent

Orabug: 27446668

When dealing with inline extents, btrfs_get_extent will incorrectly try
to insert a duplicate extent_map.  The dup hits -EEXIST from
add_extent_map, but then we try to merge with the existing one and end
up trying to insert a zero length extent_map.

This actually works most of the time, except when there are extent maps
past the end of the inline extent.  rocksdb will trigger this sometimes
because it preallocates an extent and then truncates down.

Josef made a script to trigger with xfs_io:

#!/bin/bash

xfs_io -f -c "pwrite 0 1000" inline
xfs_io -c "falloc -k 4k 1M" inline
xfs_io -c "pread 0 1000" -c "fadvise -d 0 1000" -c "pread 0 1000" inline
xfs_io -c "fadvise -d 0 1000" inline
cat inline

You'll get EIOs trying to read inline after this because add_extent_map
is returning EEXIST

Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 8dff9c85341032767d7b519217a79ea04cd676b0)
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
7 years agox86/spec: Fix spectre_v1 bug and mitigation indicators
John Haxby [Sun, 28 Jan 2018 19:05:12 +0000 (19:05 +0000)]
x86/spec: Fix spectre_v1 bug and mitigation indicators

/proc/cpuinfo correctly shows that we have the spectre_v1 bug and
the sysfs vulnerabilities file shows that we have the lfence mitigation
in place.

Orabug: 27470687

Signed-off-by: John Haxby <john.haxby@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agoDrivers: hv: util: Backup: Fix a rescind processing issue
K. Y. Srinivasan [Fri, 23 Dec 2016 00:54:03 +0000 (16:54 -0800)]
Drivers: hv: util: Backup: Fix a rescind processing issue

VSS may use a char device to support the communication between
the user level daemon and the driver. When the VSS channel is rescinded
we need to make sure that the char device is fully cleaned up before
we can process a new VSS offer from the host. Implement this logic.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit d77044d142e960f7b5f814a91ecb8bcf86aa552c)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
7 years agoDrivers: hv: vss: Operation timeouts should match host expectation
Alex Ng [Sun, 6 Nov 2016 21:14:11 +0000 (13:14 -0800)]
Drivers: hv: vss: Operation timeouts should match host expectation

Increase the timeout of backup operations. When system is under I/O load,
it needs more time to freeze. These timeout values should also match the
host timeout values more closely.

Signed-off-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit b357fd3908c1191f2f56e38aa77f2aecdae18bc8)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
7 years agoDrivers: hv: vss: Improve log messages.
Alex Ng [Sun, 6 Nov 2016 21:14:10 +0000 (13:14 -0800)]
Drivers: hv: vss: Improve log messages.

Adding log messages to help troubleshoot error cases and transaction
handling.

Signed-off-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit 23d2cc0c29eb0e7c6fe4cac88098306c31c40208)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
7 years agoDrivers: hv: utils: Check VSS daemon is listening before a hot backup
Alex Ng [Fri, 2 Sep 2016 12:58:25 +0000 (05:58 -0700)]
Drivers: hv: utils: Check VSS daemon is listening before a hot backup

Hyper-V host will send a VSS_OP_HOT_BACKUP request to check if guest is
ready for a live backup/snapshot. The driver should respond to the check
only if the daemon is running and listening to requests. This allows the
host to fallback to standard snapshots in case the VSS daemon is not
running.

Signed-off-by: Alex Ng <alexng@messages.microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit db886e4d24c2b3d334be2cc1bd1bd05d547eb4c4)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
7 years agoDrivers: hv: utils: Continue to poll VSS channel after handling requests.
Alex Ng [Fri, 2 Sep 2016 12:58:24 +0000 (05:58 -0700)]
Drivers: hv: utils: Continue to poll VSS channel after handling requests.

Multiple VSS_OP_HOT_BACKUP requests may arrive in quick succession, even
though the host only signals once. The driver wass handling the first
request while ignoring the others in the ring buffer. We should poll the
VSS channel after handling a request to continue processing other requests.

Signed-off-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit 497af84b81b98b27e9ba7aebb8a373412e328497)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
7 years agoDrivers: hv: utils: fix a race on userspace daemons registration
Vitaly Kuznetsov [Fri, 10 Jun 2016 00:08:57 +0000 (17:08 -0700)]
Drivers: hv: utils: fix a race on userspace daemons registration

Background: userspace daemons registration protocol for Hyper-V utilities
drivers has two steps:
1) daemon writes its own version to kernel
2) kernel reads it and replies with module version
at this point we consider the handshake procedure being completed and we
do hv_poll_channel() transitioning the utility device to HVUTIL_READY
state. At this point we're ready to handle messages from kernel.

When hvutil_transport is in HVUTIL_TRANSPORT_CHARDEV mode we have a
single buffer for outgoing message. hvutil_transport_send() puts to this
buffer and till the buffer is cleared with hvt_op_read() returns -EFAULT
to all consequent calls. Host<->guest protocol guarantees there is no more
than one request at a time and we will not get new requests till we reply
to the previous one so this single message buffer is enough.

Now to the race. When we finish negotiation procedure and send kernel
module version to userspace with hvutil_transport_send() it goes into the
above mentioned buffer and if the daemon is slow enough to read it from
there we can get a collision when a request from the host comes, we won't
be able to put anything to the buffer so the request will be lost. To
solve the issue we need to know when the negotiation is really done (when
the version message is read by the daemon) and transition to HVUTIL_READY
state after this happens. Implement a callback on read to support this.
Old style netlink communication is not affected by the change, we don't
really know when these messages are delivered but we don't have a single
message buffer there.

Reported-by: Barry Davis <barry_davis@stormagic.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit e0fa3e5e7df61eb2c339c9f0067c202c0cdeec2c)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
7 years agoDrivers: hv: util: catch allocation errors
Olaf Hering [Tue, 15 Dec 2015 00:01:36 +0000 (16:01 -0800)]
Drivers: hv: util: catch allocation errors

Catch allocation errors in hvutil_transport_send.

Fixes: 14b50f80c32d ('Drivers: hv: util: introduce hv_utils_transport abstraction')
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit cdc0c0c94e4e6dfa371d497a3130f83349b6ead6)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
7 years agoDrivers: hv: vss: run only on supported host versions
Olaf Hering [Tue, 15 Dec 2015 00:01:42 +0000 (16:01 -0800)]
Drivers: hv: vss: run only on supported host versions

The Backup integration service on WS2012 has appearently trouble to
negotiate with a guest which does not support the provided util version.
Currently the VSS driver supports only version 5/0. A WS2012 offers only
version 1/x and 3/x, and vmbus_prep_negotiate_resp correctly returns an
empty icframe_vercnt/icmsg_vercnt. But the host ignores that and
continues to send ICMSGTYPE_NEGOTIATE messages. The result are weird
errors during boot and general misbehaviour.

Check the Windows version to work around the host bug, skip hv_vss_init
on WS2012 and older.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit ed9ba608e4851144af8c7061cbb19f751c73e998)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
7 years agoDrivers: hv: utils: unify driver registration reporting
Vitaly Kuznetsov [Sun, 12 Apr 2015 01:07:59 +0000 (18:07 -0700)]
Drivers: hv: utils: unify driver registration reporting

Unify driver registration reporting and move it to debug level as normally daemons write to syslog themselves
and these kernel messages are useless.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit 7c959127581d420ec692344ee905504fdeecec1a)
Signed-off-by: Jack Vogel <jack.vogel@oracle.com>
Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
7 years agodrivers/char/mem.c: deny access in open operation when securelevel is set
Ethan Zhao [Fri, 29 Sep 2017 02:50:35 +0000 (10:50 +0800)]
drivers/char/mem.c: deny access in open operation when securelevel is set

Orabug: 26943864

There is still a secure hole left in mem.c driver -- when securelevel is set
userland application could access PCI configuration space via this driver.

--
Attempting to write via mmap() API using acc_test.

[root@ban92uut054 ~]# ./acc_test mmap 0x846000c4=0x1
Using mmap() API for access
mmap write wrote 0x1

[root@ban92uut054 ~]# setpci -s 46:00.0 0xc4.l
00000001

How, write 0x0 to offset 0xc4
[root@ban92uut054 ~]# ./acc_test mmap 0x846000c4=0x0
Using mmap() API for access
mmap write wrote 0x0
[root@ban92uut054 ~]# setpci -s 46:00.0 0xc4.l
00000000
--
source code of acc_test program:

main(int argc, char *argv[])
{

int fd;
int retval;
int val;
off_t addr;
char *tmp;
int operation = 0; //read

int access_type = -1;
off_t page_base = 0;
off_t page_offset = 0;
off_t pagesize = sysconf(_SC_PAGE_SIZE);
char *mem;
int prot;

if (argc < 3) {
printf("Insufficient args: acc_test rw|mmap <addr> [-w <val>]\n");
return -1;
}

if (strcmp("rw", argv[1]) == 0) {
access_type = 1;
printf("Using pread()/pwrite() API for access\n");
}
else if (strcmp("mmap", argv[1]) == 0) {
access_type = 2;
printf("Using mmap() API for access\n");
}
else {
printf("Illegal access type: must be rw or mmap\n");
return -1;
}

addr = strtoul(argv[2], &tmp, 16);
if ((tmp && (*tmp != '='))  &&
            ((*tmp != '\0') || (errno == EINVAL) ||
            (addr == ULONG_MAX && errno == ERANGE))) {
                fprintf(stderr, "Invalid address specified; must be hex based\n");
                if (errno) perror("error : ");
                exit(1);
}
else if (tmp && (*tmp == '=')) { // write case
tmp++;
val = strtoul(tmp, NULL, 16);
operation = 1;
}

//fd = open("/sys/bus/pci/devices/0000:46:00.0/config",O_RDWR | O_SYNC);
//retval = pread(fd, &val, 4, 0xc4);

if (operation == 1)
fd = open("/dev/mem",O_RDWR);
else
fd = open("/dev/mem",O_RDONLY);

        if (fd < 0) {
perror("open failed");
exit(1);
}

switch (access_type) {

case 1 : // pread/pwrite API
 if (!operation) {
  if (pread(fd, &val, 4, addr) < 0) {
perror("pread failed");
return -1;
}
else
printf("pread returned 0x%x from 0x%x\n",val,addr);
 }
 else {
if (pwrite(fd, &val, 4, addr) < 0) {
perror ("pwrite failed");
return -1;
}
printf("pwrite() wrote 0x%x to 0x%x\n",val,addr);
}
break;

case 2 :   // mmap API
page_base = (addr / pagesize) * pagesize;
page_offset = addr - page_base;
prot = PROT_READ;
if (operation)
prot |= PROT_WRITE;
mem = mmap(NULL, page_offset + 4, prot, MAP_SHARED,
fd, page_base);
if (mem == MAP_FAILED) {
perror("can't mmap");
return -1;
}

if (!operation)
printf("mmap read returned 0x%x\n",*(uint32_t *)&mem[page_offset]);
else {
*(uint32_t *)&mem[page_offset] = (uint32_t)val;
printf("mmap write wrote 0x%x\n",val);
}

break;

default :
printf("Illegal access mode\n");
return -1;
}

close(fd);

}
--

This patch is purposed to fix this hole when securelevel is set where one could write to
/dev/mem via the mmap() API. The fix to disallow opening /dev/mem or /dev/kmem
for access. The fix checks access at open rather than have get_securelevel() called at
the various write/read locations.

Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Eric Snowberg <eric.snowberg@oracle.com>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle>
(cherry picked from commit 8ca81822fffb2536aa1076fb17d1ea1413df3e7f)

Add this commit back to UEK4 now that the regressions have been fixed

Orabug: 27465736

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Dan Duval <dan.duval@oracle.com>
7 years agords: Calling getsockname() on unbounded socket generates seg fault
Ka-Cheong Poon [Mon, 18 Dec 2017 15:45:18 +0000 (07:45 -0800)]
rds: Calling getsockname() on unbounded socket generates seg fault

If a socket is not yet bound, calling getsockname() should return an
unspecified address with family AF_UNSPEC as an RDS socket can be
bound to either an IPv4 or an IPv6 address.  Hence returning either
family is incorrect.  Currently, the returned address is set to an
unspecified address with family AF_INET6.  If the passed in buffer is
smaller than sizeof(struct sockaddr_in6), the app may get a
segmentation fault error.  This is similar to passing a buffer smaller
than sizeof(struct sockaddr_in) before the IPv6 changes.

Orabug: 27463484

Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
7 years agords: Second bind() can overwrite the first bind()
Ka-Cheong Poon [Mon, 18 Dec 2017 16:23:57 +0000 (08:23 -0800)]
rds: Second bind() can overwrite the first bind()

In rds_bind(), it does not check if a socket is already bound
or not.  Hence a second bind() will overwrite what is done in
the first bind().

Orabug: 27463500

Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
7 years agords: Un-connected socket sendmsg() with a NULL destination does not fail
Ka-Cheong Poon [Mon, 18 Dec 2017 15:09:55 +0000 (07:09 -0800)]
rds: Un-connected socket sendmsg() with a NULL destination does not fail

If the destnation address used in sendto()/sendmsg() is NULL, the
send call does not fail because the check done in rds_sendmsg() is
not correct.

Orabug: 27463507

Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
7 years agox86/mitigation/spectre_v2: Add reporting of 'lfence'
Konrad Rzeszutek Wilk [Mon, 29 Jan 2018 19:42:32 +0000 (14:42 -0500)]
x86/mitigation/spectre_v2: Add reporting of 'lfence'

if IBRS is off, but we do have 'lfence' enabled.

Obviously if 'lfence' is IBRS and 'nolfence' has been used - we
print that too.

OraBug: 27472666
Reviewed-by: John Haxby <john.haxby@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/spec: Add 'lfence_enabled' in sysfs
Konrad Rzeszutek Wilk [Mon, 29 Jan 2018 18:51:48 +0000 (13:51 -0500)]
x86/spec: Add 'lfence_enabled' in sysfs

To toggle during runtime whether the lfence fallback should be used.
If IBRS is enabled warnings will be returned.

OraBug: 27472666
Reviewed-by: John Haxby <john.haxby@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/spec_ctrl: Add 'nolfence' knob to disable fallback for spectre_v2 mitigation
Konrad Rzeszutek Wilk [Mon, 29 Jan 2018 18:08:20 +0000 (13:08 -0500)]
x86/spec_ctrl: Add 'nolfence' knob to disable fallback for spectre_v2 mitigation

If 'noibrs' is used, or the hardware does not have IBRS microcode
we fallback on using 'lfence' on every system call/interrupt/exception/etc.

This can dramatically slow down the performance. As a knob to
measure this provide 'nolfence' which will also disable this
security big hammer.

OraBug: 27472666
Reviewed-by: John Haxby <john.haxby@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86: Fix compile issues if CONFIG_XEN not defined
Konrad Rzeszutek Wilk [Mon, 29 Jan 2018 19:29:40 +0000 (14:29 -0500)]
x86: Fix compile issues if CONFIG_XEN not defined

We added the usage of 'xen_initial_domain()' but forgot
to include the header file.

OraBug: 27477720
Reviewed-by: John Haxby <john.haxby@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agohugetlb: fix nr_pmds accounting with shared page tables
Kirill A. Shutemov [Fri, 24 Jun 2016 21:49:51 +0000 (14:49 -0700)]
hugetlb: fix nr_pmds accounting with shared page tables

We account HugeTLB's shared page table to all processes who share it.
The accounting happens during huge_pmd_share().

If somebody populates pud entry under us, we should decrease pagetable's
refcount and decrease nr_pmds of the process.

By mistake, I increase nr_pmds again in this case.  :-/ It will lead to
"BUG: non-zero nr_pmds on freeing mm: 2" on process' exit.

Let's fix this by increasing nr_pmds only when we're sure that the page
table will be used.

Link: http://lkml.kernel.org/r/20160617122506.GC6534@node.shutemov.name
Fixes: dc6c9a35b66b ("mm: account pmd page tables to the process")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: zhongjiang <zhongjiang@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Orabug: 27451809

(cherry picked from commit c17b1f42594eb71b8d3eb5a6dfc907a7eb88a51d)
Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
7 years agonet/mlx4_core: allow QPs with enable_smi_admin enabled
Zhu Yanjun [Fri, 1 Dec 2017 02:13:20 +0000 (21:13 -0500)]
net/mlx4_core: allow QPs with enable_smi_admin enabled

In the commit 64453f042519 ("net/mlx4_core: Disallow creation of RAW QPs
on a VF"), it disallows some QPs. But when enable_smi_admin is enabled,
it should be allowed to pass.

Orabug: 27452072

Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
(cherry picked from commit a3fe544e915d367100f9f149109d7a7599b032ca)
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
7 years agonet/rds: Fix incorrect error handling
Håkon Bugge [Fri, 8 Dec 2017 15:55:22 +0000 (16:55 +0100)]
net/rds: Fix incorrect error handling

Commit 5f58d7e81c2f ("net/rds: reduce memory footprint during
ib_post_recv in IB transport") removes order two allocations used to
receive fragments. Instead, zero order allocations are used. However,
said commit has incorrect error handling.

This commit fixes this.

Orabug: 27469760

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
(cherry picked from commit a3740c633f4f7089d5cbb11b4a9d19c79c96f746)

7 years agox86: Move STUFF_RSB in to the idt macro
Konrad Rzeszutek Wilk [Sat, 13 Jan 2018 03:32:23 +0000 (22:32 -0500)]
x86: Move STUFF_RSB in to the idt macro

instead of it sitting in paranoid_entry or error_entry.

The idea behind the STUFF_RSB is to be done _before_
any calls are done. Which means we really want this in the idt
macro that is handled for exceptions - such as device not available,
which currently looks as so:

[Ignore the callq *0x40.. that gets converted to an 'cld']

<device_not_available>:
  nop
  nop
  nop
  callq  *0x40d0b7(%rip)        # ffffffff81b55330 <pv_irq_ops+0x30> <= patched to cld
  pushq  $0xffffffffffffffff
  sub    $0x78,%rsp
  callq  ffffffff81748ea0 <error_entry>           <=== call!
  mov    %rsp,%rdi
  xor    %esi,%esi
  callq  ffffffff81018830 <do_device_not_available>
  test   %rax,%rax
  jne    ffffffff81747f10 <dtrace_error_exit>
  jmpq   ffffffff817490a0 <error_exit>
  nopl   0x0(%rax)

By stuffing the RSB before the call to error_entry (or
paranoid_entry) we remove the chance of this becoming an attack vector.

While at it, remove the useless comment - we don't encode any frames
in UEK4.

OraBug: 27417150
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/spectre: Drop the warning about ibrs being obsolete.
Konrad Rzeszutek Wilk [Tue, 23 Jan 2018 17:14:06 +0000 (12:14 -0500)]
x86/spectre: Drop the warning about ibrs being obsolete.

They scare folks thinking in that they don't work if you
use 'noibrs'. And create support tickets. Just let us
be quiet and support both options.

OraBug: 27439976
Reviewed-by: John Haxby <john.haxby@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/spec: STUFF_RSB _before_ ENABLE_IBRS
Konrad Rzeszutek Wilk [Sat, 13 Jan 2018 02:05:45 +0000 (21:05 -0500)]
x86/spec: STUFF_RSB _before_ ENABLE_IBRS

And also we need to STUFF_RSB _before_ calls.

In our case we have a bunch of ENABLE_INTERRUPTS
which are (in objdump):
       callq  *0x40b379(%rip)         <pv_cpu_ops+0x128>

During bootup they do change to 'cld' (on baremetal).

On Xen PV they end up being those calls and STUFF_RSB is still
in effect which means it should be done before those calls are made.

Also the semantics of the IBRS MSR is "If IBRS is set, .. indirect
calls will not allow their predicated target address to be controlled ...
so long as as all RSB entries from previous less privileged prediction
mode are overwritten."

In other words - STUFF_RSB, then ENABLE_IBRS.

Xen hypervisor code follows that religiously and so shall we.

OraBug: 27448169
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/spec: Don't print the Missing arguments for option spectre_v2.
Konrad Rzeszutek Wilk [Sun, 21 Jan 2018 12:53:57 +0000 (07:53 -0500)]
x86/spec: Don't print the Missing arguments for option spectre_v2.

if not specified. There is no need for that at all.

OraBug: 27448241
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86: Move ENABLE_IBRS in the interrupt macro.
Konrad Rzeszutek Wilk [Sat, 13 Jan 2018 03:21:18 +0000 (22:21 -0500)]
x86: Move ENABLE_IBRS in the interrupt macro.

The interrupt macro already stuffs the RSB at the start
but neglected to call the ENABLE_IBRS (we did call
DISABLE_IBRS when we were done).

This meant that on any interrupt we would not toggle
the IBRS MSR.

OraBug: 27448273

Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Tested-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/IBRS: Don't try to change IBRS mode if IBRS is not available
Boris Ostrovsky [Tue, 23 Jan 2018 16:02:42 +0000 (11:02 -0500)]
x86/IBRS: Don't try to change IBRS mode if IBRS is not available

sysctl_ibrs_enabled is set properly when IBRS support is discovered
in init_scattered_cpuid_features(). We should simply return an error
when attempt is made to change IBRS mode when the feature is not supoorted.

There is also no need to call set_ibrs_inuse() when changing mode to 1:
this is already done by clear_ibrs_disabled().

Orabug: 27448280

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/IBRS: Remove support for IBRS_ENABLED_USER mode
Boris Ostrovsky [Tue, 23 Jan 2018 16:02:41 +0000 (11:02 -0500)]
x86/IBRS: Remove support for IBRS_ENABLED_USER mode

This mode was added based on our understanding of IBRS_ATT (IBRS
All The Time) described in early versions of Intel documentation.
We assumed that while "basic" IBRS protects kernel from using
predictions created by userland, IBRS_ATT will provide similar
defence between usermode tasks.

This understanding was incorrect.

Instead, IBRS_ATT (also referred to as "Enhanced IBRS") allows the
kernel to write IBRS MSR once, during boot, and never have to write
it again. This is in contrast to basic IBRS where every change of
protection mode required an MSR write, which is somewhat expensive.

Enhanced IBRS is not available on existing processors. Until it
becomes available we remove IBRS_ENABLED_USER.

While doing this also add a test in ibrs_enabled_write() that will
only process input if the mode will actually change.

Orabug: 27448280

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86: Use PRED_CMD MSR when ibpb is enabled
Konrad Rzeszutek Wilk [Sun, 21 Jan 2018 13:21:28 +0000 (08:21 -0500)]
x86: Use PRED_CMD MSR when ibpb is enabled

Since we have the knobs we should depend on those.

OraBug: 27448280

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/IBRS: Drop unnecessary WRITE_ONCE
Boris Ostrovsky [Tue, 23 Jan 2018 16:02:43 +0000 (11:02 -0500)]
x86/IBRS: Drop unnecessary WRITE_ONCE

There is no reason to use it here.

Orabug: 27448280

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/IBRS/IBPB: Remove procfs interface to ibrs/ibpb_enable
Boris Ostrovsky [Tue, 23 Jan 2018 16:02:45 +0000 (11:02 -0500)]
x86/IBRS/IBPB: Remove procfs interface to ibrs/ibpb_enable

We already have exact same functionality available in debugfs.

Orabug: 27448280

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/IBPB: Provide debugfs interface for changing IBPB mode
Boris Ostrovsky [Tue, 23 Jan 2018 16:02:44 +0000 (11:02 -0500)]
x86/IBPB: Provide debugfs interface for changing IBPB mode

... similar to how we change IBRS mode.

Orabug: 27448313

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/spec: Also print IBRS if IBPB is disabled.
Konrad Rzeszutek Wilk [Thu, 25 Jan 2018 22:10:36 +0000 (17:10 -0500)]
x86/spec: Also print IBRS if IBPB is disabled.

If ones disables ibpb_enabled the 'spectre_v2' sysfs
shows "Vulnerable" but it should say "IBRS"

This fixes it.

OraBug: 27448313
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86: Include linux/device.h in bugs_64.c
Boris Ostrovsky [Tue, 23 Jan 2018 16:02:46 +0000 (11:02 -0500)]
x86: Include linux/device.h in bugs_64.c

struct device_attribute is defined there. Without this file we get

arch/x86/kernel/cpu/bugs_64.c:125: warning: `struct device_attribute¿
declared inside parameter list

Orabug: 27448330

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agofs/ocfs2: remove page cache for converted direct write
Wengang Wang [Tue, 23 Jan 2018 17:17:39 +0000 (09:17 -0800)]
fs/ocfs2: remove page cache for converted direct write

Remove the page cache range for the writes converted from direct IO to
buffered IO.

orabug: 27431376
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
7 years agoRevert "ocfs2: code clean up for direct io"
Wengang Wang [Tue, 23 Jan 2018 17:16:59 +0000 (09:16 -0800)]
Revert "ocfs2: code clean up for direct io"

This reverts commit 11fc5176778a82fb5bb0100413496e125862e649.

This is patch in a patch set but back ported separately and caused
problem.

orabug: 27431376
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
7 years agomlx4: add mstflint secure boot access kernel support
Qing Huang [Tue, 16 Jan 2018 19:27:32 +0000 (11:27 -0800)]
mlx4: add mstflint secure boot access kernel support

Source files are copied from:
https://github.com/Mellanox/mstflint.git master_devel
(under kernel sub-directory - changes up to commit
a8a84fae7459aba01ff6ad6cbc40622b7615b110)

Due to recent security changes in UEK4, mstflint FW tool may
lose the ability to access CX3 HCAs in Secure Boot enabled
enviroment. This kernel patch in addition to an enhanced
version of mstflint tool will let us regain the ability to
manage CX3 FW when Secure Boot is enabled while running latest
UEK4 kernels.

Orabug: 27424392

Signed-off-by: Qing Huang <qing.huang@oracle.com>
Reviewed-by: Rao Shoaib <rao.shoaib@oracle.com>
7 years agox86/microcode/intel: Extend BDW late-loading with a revision check
Jia Zhang [Mon, 1 Jan 2018 02:04:47 +0000 (10:04 +0800)]
x86/microcode/intel: Extend BDW late-loading with a revision check

Instead of blacklisting all model 79 CPUs when attempting a late
microcode loading, limit that only to CPUs with microcode revisions <
0x0b000021 because only on those late loading may cause a system hang.

For such processors either:

a) a BIOS update which might contain a newer microcode revision

or

b) the early microcode loading method

should be considered.

Processors with revisions 0x0b000021 or higher will not experience such
hangs.

For more details, see erratum BDF90 in document #334165 (Intel Xeon
Processor E7-8800/4800 v4 Product Family Specification Update) from
September 2017.

[ bp: Heavily massage commit message and pr_* statements. ]

Orabug: 27343609

Fixes: 723f2828a98c ("x86/microcode/intel: Disable late loading on model 79")
Signed-off-by: Jia Zhang <qianyue.zj@alibaba-inc.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Cc: <stable@vger.kernel.org> # v4.14
Link: http://lkml.kernel.org/r/1514772287-92959-1-git-send-email-qianyue.zj@alibaba-inc.com
(cherry picked from commit b94b7373317164402ff7728d10f7023127a02b60)
Signed-off-by: Todd Vierling <todd.vierling@oracle.com>
Reviewed-by: John Haxby <john.haxby@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/microcode/intel: Disable late loading on model 79
Borislav Petkov [Wed, 18 Oct 2017 11:12:25 +0000 (13:12 +0200)]
x86/microcode/intel: Disable late loading on model 79

[ Upstream commit 723f2828a98c8ca19842042f418fb30dd8cfc0f7 ]

Blacklist Broadwell X model 79 for late loading due to an erratum.

Orabug: 27343609

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171018111225.25635-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
(cherry picked from commit 9a8466b5d6c650b6c7f9e632ce605cccd5df1097)
Signed-off-by: Todd Vierling <todd.vierling@oracle.com>
Reviewed-by: John Haxby <john.haxby@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>