Todd Vierling [Fri, 9 Feb 2018 21:58:34 +0000 (16:58 -0500)]
kernel: on OL6 only, simulate the gcc 4.4 kABI for __stack_chk_fail()
The __visible linkage changes st_value calculation for genksyms on newer
compilers such as OL7's 4.8 and SCL's 4.9. However, this symbol is in
the kABI whitelist and needs to retain its old st_value, even though the
guts of the compiled function have not changed.
Add a special directive, CONFIG_SIMULATE_GCC44_KABI, which will mask the
__visible qualifier at genksyms time; and use it only on OL6 builds.
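A minimal sketch of what such masking could look like (hypothetical; genksyms defines __GENKSYMS__ while it parses the source, and __visible comes from compiler-gcc.h):

#if defined(__GENKSYMS__) && defined(CONFIG_SIMULATE_GCC44_KABI)
/* Pretend to be gcc 4.4, which had no __visible, so the CRC is unchanged. */
#undef __visible
#define __visible
#endif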
Orabug: 27509351 Signed-off-by: Todd Vierling <todd.vierling@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Todd Vierling [Fri, 9 Feb 2018 20:34:50 +0000 (15:34 -0500)]
uek-rpm: configs: Don't set HAVE_FENTRY on OL6 builds.
CONFIG_HAVE_FENTRY turns on a *conditional* addition of compile time
flags in the top level Makefile (-mfentry -DCC_USES_FENTRY). The OL6
stock gcc-4.4 code doesn't support this, but gcc 4.8+ does. This
changes the way function trace preambles work on the newer compilers,
which breaks kABI.
To do this effectively, the option needs to be removed from
arch/x86/Kconfig ("select HAVE_FENTRY if X86_64") so that it can be
set optionally in the RPM build configs. The OL7 configs properly
contain CONFIG_HAVE_FENTRY=y, so those will continue to build the same.
Orabug: 27509351 Signed-off-by: Todd Vierling <todd.vierling@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
KarimAllah Ahmed [Thu, 1 Feb 2018 21:59:45 +0000 (22:59 +0100)]
KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
[ Based on a patch from Ashok Raj <ashok.raj@intel.com> ]
Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
be using a retpoline+IBPB based approach.
To avoid the overhead of saving and restoring the MSR_IA32_SPEC_CTRL for
guests that do not actually use the MSR, only start saving and restoring
when a non-zero is written to it.
No attempt is made to handle STIBP here, intentionally. Filtering STIBP
may be added in a future patch, which may require trapping all writes
if we don't want to pass it through directly to the guest.
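The lazy scheme described above boils down to roughly the following (a sketch using the names from the upstream patch; exact placement inside vmx_vcpu_run() may differ in this backport):

/* Before VM entry: load the guest value only if it ever went non-zero. */
if (vmx->spec_ctrl)
        wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);

/* After VM exit: if the write intercept was lifted, read the guest's value
 * back and restore the host value (0) so it cannot leak to another vCPU. */
if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
        vmx->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
if (vmx->spec_ctrl)
        wrmsrl(MSR_IA32_SPEC_CTRL, 0);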
[dwmw2: Clean up CPUID bits, save/restore manually, handle reset]
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517522386-18410-5-git-send-email-karahmed@amazon.de
(cherry picked from commit d28b387fb74da95d69d2615732f50cceb38e9a4d)
Orabug: 27525575 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Backport:
There is a lot that this patch does not pick up - but the most important
thing we need to pick up is the wrmsr(0x48, 0) when retpoline is used.
That is, we cannot leave MSR 0x48 hanging around with the guest value.
The reason is that on a particular CPU we may schedule a different guest
vCPU, and the check on whether to write MSR 0x48 is 'if (vmx->spec_ctrl)'
(the vmx is tied to a specific vCPU). Which means we may not write the
proper guest vCPU MSR value in, and leave the stale one in the guest!]
Konrad Rzeszutek Wilk [Fri, 9 Feb 2018 22:19:01 +0000 (17:19 -0500)]
x86/spectre_v2: Disable IBRS if spectre_v2=off
We found some interesting behaviors when booting on IBRS enabled
hardware where 'spectre_v2=off' did not disable IBRS completely.
That is, during bootup we received some interrupts, which
meant that we set MSR 0x48 to 1, and then when we got
to the code that handles 'spectre_v2=off' we would just
call 'set_ibrs_disabled', which would clear the in_use parameter.
But that would not clear the wedged MSR 0x48, which stayed set to 1.
Which means we would continue on with both retpoline and IBRS on!
The fix is rather simple - clear the MSR as well.
Note that if 'noibrs' was used - then we would have disabled the
IBRS _before_ we got the interrupt, as 'noibrs' is an early parameter.
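A sketch of the fix (the wrmsrl() is the essential part; the UEK helper
names and the feature guard around it are assumptions):

/* spectre_v2=off: disable the bookkeeping *and* un-wedge the MSR itself. */
set_ibrs_disabled();
if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))   /* IBRS microcode present */
        wrmsrl(MSR_IA32_SPEC_CTRL, 0);     /* may need to run on each CPU */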
OraBug: 27525754 Reported-by: Yao-Min Chen <yaomin.chen@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Joao Martins [Fri, 2 Feb 2018 17:42:33 +0000 (17:42 +0000)]
xenbus: track caller request id
Commit fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent
xenstore accesses") optimized xenbus concurrent accesses but in doing so
broke UABI of /dev/xen/xenbus. Through /dev/xen/xenbus applications are in
charge of xenbus message exchange with the correct header and body. Now,
after the mentioned commit the replies received by the application will no
longer have the header req_id echoed back as it was on the request (see
the specification below for reference), because that particular field is
being overwritten by the kernel.
struct xsd_sockmsg
{
uint32_t type; /* XS_??? */
uint32_t req_id;/* Request identifier, echoed in daemon's response. */
uint32_t tx_id; /* Transaction id (0 if not related to a transaction). */
uint32_t len; /* Length of data following this. */
/* Generally followed by nul-terminated string(s). */
};
Before, there was only one request at a time, so req_id could simply be
forwarded back and forth. To allow simultaneous requests we need a
different req_id for each message, so the kernel keeps a monotonically
increasing counter for this field, which is written on every request
irrespective of the userspace value.
Simply forwarding the userspace req_id again is not a solution, because
we would open the possibility of a userspace-generated req_id colliding
with kernel ones. So this patch instead takes another route, which is to
artificially keep the user req_id while keeping the xenbus logic as is. We
do that by saving the original req_id before xs_send(), using the private
kernel counter as req_id, and then, once the reply comes back and is
validated, restoring the original req_id.
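In code form, the save/restore described above is roughly (field name
caller_req_id per the upstream patch):

req->caller_req_id = req->msg.req_id;    /* remember userspace's req_id */
req->msg.req_id = xs_request_enter(req); /* kernel's monotonic counter  */
/* ... xs_send() and wait for the (validated) reply ... */
msg->req_id = req->caller_req_id;        /* echo the original id back   */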
Darren Kenny [Fri, 9 Feb 2018 14:17:04 +0000 (14:17 +0000)]
x86/spectre_v2: Remove 0xc2 from spectre_bad_microcodes
According to the latest microcode update from Intel (on Feb 8, 2018), on
Skylake we should be using microcode revision 0xC2, so we need
to remove that from the blacklist now.
Signed-off-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tim Chen [Thu, 8 Feb 2018 21:52:42 +0000 (16:52 -0500)]
x86/speculation: Use Indirect Branch Prediction Barrier in context switch
This patch is a subset of the changes in the upstream commit 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7. Since we don't have 'ctx_id' in
mm_context_t in UEK4, we can't check whether the context ID of the new
task is the same as that of the previous one. In this patch, we flush indirect
branches when switching into a process that marked itself non-dumpable.
This protects high value processes like gpg better, without having too high
performance overhead.
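The heuristic amounts to something like this in the context-switch path
(a sketch; UEK4 additionally gates the barrier on its ibpb_inuse
machinery rather than the upstream feature bit):

if (tsk && tsk->mm &&
    get_dumpable(tsk->mm) != SUID_DUMP_USER)
        indirect_branch_prediction_barrier();  /* IBPB before running tsk */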
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ak@linux.intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: linux@dominikbrodowski.net Cc: peterz@infradead.org Cc: bp@alien8.de Cc: luto@kernel.org Cc: pbonzini@redhat.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1517263487-3708-1-git-send-email-dwmw@amazon.co.uk
Orabug: 27524608 Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Mon, 5 Feb 2018 19:34:55 +0000 (14:34 -0500)]
x86/spectre_v2: Add spectre_v2_heuristics=
As we already have one heuristic in the tree:
x86/spectre: Favor IBRS on Skylake over retpoline
we may want a knob to turn it off altogether. The (admittedly hard to
remember) spectre_v2_heuristics=off or
spectre_v2_heuristics=skylake=off
will take care of disabling that optimization, as sketched below.
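A sketch of the parsing (the heuristic flag name is hypothetical):

static int __init spectre_v2_heuristics_setup(char *str)
{
        /* "off" disables all heuristics; "skylake=off" just the Skylake one. */
        if (!strcmp(str, "off") || !strcmp(str, "skylake=off"))
                skylake_favor_ibrs = false;     /* hypothetical flag */
        return 1;
}
__setup("spectre_v2_heuristics=", spectre_v2_heuristics_setup);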
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Darren Kenny [Fri, 2 Feb 2018 19:12:20 +0000 (19:12 +0000)]
x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL
Fixes: 117cc7a908c83 ("x86/retpoline: Fill return stack buffer on vmexit") Signed-off-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180202191220.blvgkgutojecxr3b@starbug-vm.ie.oracle.com
(cherry picked from commit af189c95a371b59f493dbe0f50c0a09724868881)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
David Woodhouse [Fri, 2 Feb 2018 18:59:51 +0000 (13:59 -0500)]
x86/cpufeatures: Clean up Spectre v2 related CPUID flags
We want to expose the hardware features simply in /proc/cpuinfo as "ibrs",
"ibpb" and "stibp". Since AMD has separate CPUID bits for those, use them
as the user-visible bits.
When the Intel SPEC_CTRL bit is set which indicates both IBRS and IBPB
capability, set those (AMD) bits accordingly. Likewise if the Intel STIBP
bit is set, set the AMD STIBP that's used for the generic hardware
capability.
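The mapping is a few lines in CPU setup, per the upstream commit (the UEK
backport's bit names may differ slightly):

static void init_speculation_control(struct cpuinfo_x86 *c)
{
        /* Intel's single SPEC_CTRL bit implies both IBRS and IBPB; mirror
         * it into the AMD-style bits that show up in /proc/cpuinfo. */
        if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) {
                set_cpu_cap(c, X86_FEATURE_IBRS);
                set_cpu_cap(c, X86_FEATURE_IBPB);
        }
        if (cpu_has(c, X86_FEATURE_INTEL_STIBP))
                set_cpu_cap(c, X86_FEATURE_STIBP);
}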
Hide the rest from /proc/cpuinfo by putting "" in the comments. Including
RETPOLINE and RETPOLINE_AMD which shouldn't be visible there. There are
patches to make the sysfs vulnerabilities information non-readable by
non-root, and the same should apply to all information about which
mitigations are actually in use. Those *shouldn't* appear in /proc/cpuinfo.
The feature bit for whether IBPB is actually used, which is needed for
ALTERNATIVEs, is renamed to X86_FEATURE_USE_IBPB.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
arch/x86/include/asm/cpufeatures.h
arch/x86/kernel/cpu/bugs.c
arch/x86/kernel/cpu/intel.c
[Backport: The majority of the changes are to make 'retpoline' and friends
not show up. Also fix it so that instead of 'spec_ctrl' we have 'ibrs'.
The rest we had already]
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Conflicts:
arch/x86/include/asm/cpufeatures.h
arch/x86/kernel/cpu/bugs.c
[The original version of this patch doesn't set X86_FEATURE_IBPB, so do
it ourselves. Given X86_FEATURE_SPEC_CTRL (i.e. CPUID.07.[EDX.26])[*],
always set X86_FEATURE_IBPB in scattered.c. This omission did not
impact the actual IBPB functionality as the code uses 'ibpb_inuse': the
only thing missing was the 'ibpb' string in /proc/cpuinfo.
Since we already have code to enable IBPB (e.g. switch_mm_irqs_off),
there is no point in backporting indirect_branch_prediction_barrier from
this patch.]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
David Woodhouse [Thu, 25 Jan 2018 16:14:14 +0000 (16:14 +0000)]
x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes
This doesn't refuse to load the affected microcodes; it just refuses to
use the Spectre v2 mitigation features if they're detected, by clearing
the appropriate feature bits.
The AMD CPUID bits are handled here too, because hypervisors *may* have
been exposing those bits even on Intel chips, for fine-grained control
of what's available.
It is non-trivial to use x86_match_cpu() for this table because that
doesn't handle steppings. And the approach taken in commit bd9240a18
almost made me lose my lunch.
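The table is therefore keyed on model, stepping and microcode revision by
hand, roughly as below (shape per the upstream patch; the actual entries
are omitted here, and x86_mask is the stepping field in kernels of this
vintage):

struct sku_microcode {
        u8  model;
        u8  stepping;
        u32 microcode;
};
static const struct sku_microcode spectre_bad_microcodes[] = {
        /* { INTEL_FAM6_..., stepping, bad microcode revision }, ... */
};

static bool bad_spectre_microcode(struct cpuinfo_x86 *c)
{
        int i;

        for (i = 0; i < ARRAY_SIZE(spectre_bad_microcodes); i++) {
                if (c->x86_model == spectre_bad_microcodes[i].model &&
                    c->x86_mask == spectre_bad_microcodes[i].stepping &&
                    c->microcode == spectre_bad_microcodes[i].microcode)
                        return true;
        }
        return false;
}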
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Conflicts:
arch/x86/kernel/cpu/intel.c
[Our resolution to 5d10cbc91d9e ("x86/cpufeatures: Add AMD feature bits for
Speculation Control") is different, so we still clear the right bits if
there are (or would be) AMD machines with bad microcode]
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
David Woodhouse [Thu, 25 Jan 2018 16:14:13 +0000 (16:14 +0000)]
x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
Also, for CPUs which don't speculate at all, don't report that they're
vulnerable to the Spectre variants either.
Leave the cpu_no_meltdown[] match table with just X86_VENDOR_AMD in it
for now, even though that could be done with a simple comparison, on the
assumption that we'll have more to add.
Based on suggestions from Dave Hansen and Alan Cox.
[Backport:
s/X86_FEATURE_ARCH_CAPABILITIES/X86_FEATURE_IA32_ARCH_CAPS/
Fix compiler warnings by removing __init/__initdata from
cpu_no_speculation, cpu_no_meltdown, and cpu_vulnerable_to_meltdown
] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
David Woodhouse [Thu, 25 Jan 2018 16:14:11 +0000 (16:14 +0000)]
x86/cpufeatures: Add AMD feature bits for Speculation Control
AMD exposes the PRED_CMD/SPEC_CTRL MSRs slightly differently to Intel.
See http://lkml.kernel.org/r/2b3e25cc-286d-8bd0-aeaf-9ac4aae39de8@amd.com
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-4-git-send-email-dwmw@amazon.co.uk
Conflicts:
arch/x86/include/asm/cpufeatures.h
[Upstream has NCAPINTS raised so there is another array to stuff the
0x80000008 cpuid leaf bits into. But we don't have that, so we just toggle
the global ones]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Konrad Rzeszutek Wilk [Fri, 2 Feb 2018 03:56:00 +0000 (22:56 -0500)]
x86/spectre_v2: Add VMEXIT_FILL_RSB instead of RETPOLINE
The backport of "x86/retpoline: Fill return stack buffer on vmexit"
made the full stuffing of RSB only enabled if the kernel had
selected X86_FEATURE_RETPOLINE.
But if we are using IBRS we still want the full RSB stuffing
as it was prior to the backport.
Since we have both retpoline and ibrs wanting it we introduce
a new feature to enable the common mitigation that both of them
need.
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 21:20:54 +0000 (16:20 -0500)]
x86/spectre: If IBRS is enabled disable "Filling RSB on context switch"
As that is only needed if the machine is running IBRS and !SMEP.
The comment above the conditional says it all:
Skylake era CPUs have a separate issue with *underflow* of the
RSB, when they will predict 'ret' targets from the generic BTB.
The proper mitigation for this is IBRS. If IBRS is not supported
or deactivated in favour of retpolines the RSB fill on context
.. and if we have IBRS then we should ignore this conditional.
Note that the (!SMEP) check and the use of STUFF_RSB are already
done in:
x86/spectre_v2: Figure out if STUFF_RSB macro needs to be used.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 20:47:19 +0000 (15:47 -0500)]
KVM: VMX: Allow direct access to MSR_IA32_SPEC_CTRL
This is an adaptation of what is being posted upstream.
The issue here is that guest kernels such as Windows
need IBRS for their spectre_v2 mitigation.
And since our kernel can do either IBRS or retpoline,
we need to be aware of both.
(We are ignoring for right now the situation in which
the microcode is not loaded and we want to emulate IBRS.)
In short:
1) If the CPU has the microcode and the host uses IBRS, we need
to save the guest MSR during VMEXIT and write our IBRS value.
Also, before VMENTER we need to write the guest MSR value.
2) If the host kernel uses retpoline, we need to WRMSR the
guest MSR value on VMENTER, but we optimize by only
doing it if it is a non-zero value.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 16:45:41 +0000 (11:45 -0500)]
x86/spectre: Update sysctl values if toggled only by set_{ibrs,ibpb}_disabled
As otherwise we will report the wrong values in
/sys/kernel/debug/x86/{ibrs,ibpb}_enabled at first
bootup - that is, if you have 'noibrs' on the command line.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Andi Kleen [Thu, 4 Jan 2018 14:37:08 +0000 (14:37 +0000)]
retpoline/module: Taint kernel for missing retpoline in module
There's a risk that a kernel that has full retpoline mitigations
becomes vulnerable when a module gets loaded that hasn't been
compiled with the right compiler or the right option.
We cannot fix it, but should at least warn the user when that
happens.
Add a flag to each module indicating whether it has been compiled with
RETPOLINE. When a module hasn't been compiled with a retpoline-aware
compiler, print a warning and set a taint flag.
For modules this is checked at build time; however, it cannot check
assembler or other non-compiled objects used in the module link.
For lack of a better letter, it uses taint option 'Z'.
We only set the taint flag for incorrectly compiled modules
now, not for the main kernel, which already has other
report mechanisms.
Also make sure to report vulnerable for spectre if such a module
has been loaded.
v2: Change warning message
v3: Port to latest tree Cc: jeyu@kernel.org Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
(cherry picked from commit abc01f3c4bbd927f5e47cc6dff99a76a393d1bbe)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
Documentation/oops-tracing.txt
(dmj: patch had Documentation/admin-guide/tainted-kernels.rst)
arch/x86/kernel/cpu/bugs_64.c
(dmj: patch had arch/x86/kernel/cpu/bugs.c)
include/linux/kernel.h
kernel/module.c
kernel/panic.c Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
On context switch from a shallow call stack to a deeper one, as the CPU
does 'ret' up the deeper side it may encounter RSB entries (predictions for
where the 'ret' goes to) which were populated in userspace.
This is problematic if neither SMEP nor KPTI (the latter of which marks
userspace pages as NX for the kernel) are active, as malicious code in
userspace may then be executed speculatively.
Overwrite the CPU's return prediction stack with calls which are predicted
to return to an infinite loop, to "capture" speculation if this
happens. This is required both for retpoline, and also in conjunction with
IBRS for !SMEP && !KPTI.
On Skylake+ the problem is slightly different, and an *underflow* of the
RSB may cause errant branch predictions to occur. So there it's not so much
overwrite, as *filling* the RSB to attempt to prevent it getting
empty. This is only a partial solution for Skylake+ since there are many
other conditions which may result in the RSB becoming empty. The full
solution on Skylake+ is to use IBRS, which will prevent the problem even
when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
required on context switch.
[ tglx: Added missing vendor check and slightly massaged comments and
changelog ]
[js] backport to 4.4 -- __switch_to_asm does not exist there, we
have to patch the switch_to macros for both x86_32 and x86_64.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515779365-9032-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 18bb117d1b7690181346e6365c6237b6ceaac4c4)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/include/asm/cpufeature.h
(dmj: bumped all uek4-specific microcode features up one)
arch/x86/include/asm/switch_to.h
arch/x86/kernel/cpu/bugs_64.c
(dmj:
- patch had arch/x86/kernel/cpu/bugs.c
- preserved X86_FEATURE_RSB_CTXSW check in case of IBRS) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Since indirect jump instructions will be replaced by jumps
to __x86_indirect_thunk_*, those jmp instructions must be
treated as indirect jumps. Since optprobe prohibits optimizing
probes in functions which use an indirect jump, it also needs
to find the functions which jump to __x86_indirect_thunk_*
and disable optimization for them.
Add a check that the jump target address is between the
__indirect_thunk_start/end when optimizing kprobe.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Link: https://lkml.kernel.org/r/151629212062.10241.6991266100233002273.stgit@devbox Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6cb73eb8045157ea280f0a047777ea1f56547375)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Mark __x86_indirect_thunk_* functions as blacklist for kprobes
because those functions can be called from anywhere in the kernel
including blacklist functions of kprobes.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Link: https://lkml.kernel.org/r/151629209111.10241.5444852823378068683.stgit@devbox Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9b8bd0d3586842716f5673e7a0922a9c9e1fcbfd)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Introduce start/end markers of __x86_indirect_thunk_* functions.
To make it easy, consolidate the .text.__x86.indirect_thunk.* sections
into one .text.__x86.indirect_thunk section, put it at the
end of the kernel text section, and add __indirect_thunk_start/end
so that other subsystems (e.g. kprobes) can identify it.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Link: https://lkml.kernel.org/r/151629206178.10241.6828804696410044771.stgit@devbox Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 799dc737680a8074a0c7c2d3426b85f4c439377f)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Allow architectures to create an asm/asm-prototypes.h file that
provides C prototypes for exported asm functions, which enables
proper CRC versions to be generated for them.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.com>
[jkosina@suse.cz: folded cc6acc11cad1 fixup in as well ] Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ff535919c1366b0218113f95c447d92c3aac0c1f)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
The PAUSE instruction is currently used in the retpoline and RSB filling
macros as a speculation trap. The use of PAUSE was originally suggested
because it showed a very, very small difference in the amount of
cycles/time used to execute the retpoline as compared to LFENCE. On AMD,
the PAUSE instruction is not a serializing instruction, so the pause/jmp
loop will use excess power as it is speculated over waiting for return
to mispredict to the correct target.
The RSB filling macro is applicable to AMD, and, if software is unable to
verify that LFENCE is serializing on AMD (possible when running under a
hypervisor), the generic retpoline support will be used and, so, is also
applicable to AMD. Keep the current usage of PAUSE for Intel, but add an
LFENCE instruction to the speculation trap for AMD.
The same sequence has been adopted by GCC for the GCC generated retpolines.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@alien8.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Kees Cook <keescook@google.com> Link: https://lkml.kernel.org/r/20180113232730.31060.36287.stgit@tlendack-t1.amdoffice.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit fba063e6dfb413e06b9daa5d45b164761172f5ed)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Remove the compile time warning when CONFIG_RETPOLINE=y and the compiler
does not have retpoline support. Linus' rationale for this is:
It's wrong because it will just make people turn off RETPOLINE, and the
asm updates - and return stack clearing - that are independent of the
compiler are likely the most important parts because they are likely the
ones easiest to target.
And it's annoying because most people won't be able to do anything about
it. The number of people building their own compiler? Very small. So if
their distro hasn't got a compiler yet (and pretty much nobody does), the
warning is just annoying crap.
It is already properly reported as part of the sysfs interface. The
compile-time warning only encourages bad things.
Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support") Requested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Link: https://lkml.kernel.org/r/CA+55aFzWgquv4i6Mab6bASqYXg3ErV3XDFEYf=GEcCDQg5uAtw@mail.gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 451725c3e785dfc3ede6c65184b96c213181995a)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
In accordance with the Intel and AMD documentation, we need to overwrite
all entries in the RSB on exiting a guest, to prevent malicious branch
target predictions from affecting the host kernel. This is needed both
for retpoline and for IBRS.
[ak: numbers again for the RSB stuffing labels]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515755487-8524-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Razvan Ghitulete <rga@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit eebc3f8adee0a6f43a4789ef0bf5c5b35de8cfe4)
Removed now-unused stuff_RSB function from uek4's 60f856955c1b
("x86/kvm: Pad RSB on VM transition").
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/kvm/svm.c
arch/x86/kvm/vmx.c Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Convert indirect jumps in core 32/64bit entry assembler code to use
non-speculative sequences when CONFIG_RETPOLINE is enabled.
Don't use CALL_NOSPEC in entry_SYSCALL_64_fastpath because the return
address after the 'call' instruction must be *precisely* at the
.Lentry_SYSCALL_64_after_fastpath label for stub_ptregs_64 to work,
and the use of alternatives will mess that up unless we play horrid
games to prepend with NOPs and make the variants the same length. It's
not worth it; in the case where we ALTERNATIVE out the retpoline, the
first instruction at __x86.indirect_thunk.rax is going to be a bare
jmp *%rax anyway.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-7-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Razvan Ghitulete <rga@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 028083cb02db69237e73950576bc81ac579693dc)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/kernel/entry_32.S
(dmj:
- patch had arch/x86/entry/entry_32.S
- extra retpolines needed for sys_call_table)
arch/x86/kernel/entry_64.S
(dmj: patch had arch/x86/entry/entry_64.S) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 15:40:43 +0000 (10:40 -0500)]
x86/spectre_v2: Figure out if STUFF_RSB macro needs to be used.
If IBRS is on, STUFF_RSB only needs to be enabled if the CPU
does not have SMEP enabled.
We change the macro to depend on a synthetic one called
X86_FEATURE_STUFF_RSB - which will only be turned on
if 'spectre_v2=ibrs' is set and the BIOS does not have SMEP
enabled.
(And naturally also if the kernel is compiled without retpoline
and the CPU does not have SMEP).
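Roughly, at boot (the synthetic bit is the one named above; the helper
names around it are assumptions):

/* Only stuff the RSB when IBRS is in play and SMEP cannot protect us. */
if ((ibrs_selected || !IS_ENABLED(CONFIG_RETPOLINE)) &&
    !boot_cpu_has(X86_FEATURE_SMEP))
        setup_force_cpu_cap(X86_FEATURE_STUFF_RSB);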
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 15:13:37 +0000 (10:13 -0500)]
x86/spectre_v2: Figure out when to use IBRS.
Which is if:
a) on the boot command line you have 'spectre_v2=ibrs' _and_ the
microcode that exposes this is loaded, and nobody used 'noibrs'.
Also, if you have 'spectre_v2=ibrs noibrs' we end up falling
back to 'lfence'. Unless somebody did 'spectre_v2=ibrs noibrs nolfence',
in which case we go back to automatic discovery.
b) the kernel was compiled without retpoline and the microcode is
available.
c) the kernel was compiled without retpoline and there is no microcode
with IBRS; then we pick 'lfence' on system calls.
A rough sketch of this decision order follows.
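All names below are hypothetical; this just condenses cases a) through c):

if (ibrs_requested && ibrs_microcode_present && !noibrs)
        mode = SPECTRE_V2_IBRS;                    /* case a) */
else if (!IS_ENABLED(CONFIG_RETPOLINE) && ibrs_microcode_present)
        mode = SPECTRE_V2_IBRS;                    /* case b) */
else if (!IS_ENABLED(CONFIG_RETPOLINE))
        mode = SPECTRE_V2_LFENCE;                  /* case c) */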
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Konrad Rzeszutek Wilk [Thu, 1 Feb 2018 14:45:27 +0000 (09:45 -0500)]
x86/spectre: Add IBRS option.
The spectre_v2_mitigation enum already has an IBRS option; let's make
the override possible. But don't select it by default if
the kernel has been compiled with retpoline.
If it has not (no compiler support), then fall back to IBRS.
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Add a spectre_v2= option to select the mitigation used for the indirect
branch speculation vulnerability.
Currently, the only option available is retpoline, in its various forms.
This will be expanded to cover the new IBRS/IBPB microcode features.
The RETPOLINE_AMD feature relies on a serializing LFENCE for speculation
control. For AMD hardware, only set RETPOLINE_AMD if LFENCE is a
serializing instruction, which is indicated by the LFENCE_RDTSC feature.
[ tglx: Folded back the LFENCE/AMD fixes and reworked it so IBRS
integration becomes simple ]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-5-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9f789bc5711bcacb5df003594b992f0c1cc19df4)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/kernel/cpu/bugs_64.c
(dmj:
- patch used arch/x86/kernel/cpu/bugs.c
- UEK4-specific IBPB and IBRS code merged with retpoline
cmdline parsing
- disabled IBRS in the case of full retpoline mitigation)
(konrad:
- disabled lfence on IBRS paths in case of full retpoline)
arch/x86/kernel/cpu/common.c Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
the corresponding thunks. Provide assembler macros for invoking the thunks
in the same way that GCC does, from native and inline assembler.
This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In
some circumstances, IBRS microcode features may be used instead, and the
retpoline can be disabled.
On AMD CPUs if lfence is serialising, the retpoline can be dramatically
simplified to a simple "lfence; jmp *\reg". A future patch, after it has
been verified that lfence really is serialising in all circumstances, can
enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition
to X86_FEATURE_RETPOLINE.
Do not align the retpoline in the altinstr section, because there is no
guarantee that it stays aligned when it's copied over the oldinstr during
alternative patching.
[ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks]
[ tglx: Put actual function CALL/JMP in front of the macros, convert to
symbolic labels ]
[ dwmw2: Convert back to numeric labels, merge objtool fixes ]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
[ 4.4 backport: removed objtool annotation since there is no objtool ] Signed-off-by: Razvan Ghitulete <rga@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3c5e10905263dbe9fbc621d1889b85e9c867da25)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Conflicts:
arch/x86/kernel/cpu/common.c Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
The macro MODULE is not a config option, it is a per-file build
option. So, config_enabled(MODULE) is not sensible. (There is
another case in include/linux/export.h, where config_enabled() is
used against a non-config option.)
This commit renames some macros in include/linux/kconfig.h for the
use for non-config macros and replaces config_enabled(MODULE) with
__is_defined(MODULE).
I am keeping config_enabled() because it is still referenced from
some places, but I expect it would be deprecated in the future.
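The trick behind __is_defined() is the classic placeholder expansion
(sketch; the helper names in the actual commit differ slightly):

#define __ARG_PLACEHOLDER_1 0,
#define __is_defined(x)         ___is_defined(x)
#define ___is_defined(val)      ____is_defined(__ARG_PLACEHOLDER_##val)
#define ____is_defined(junk)    _____is_defined(junk 1, 0)
#define _____is_defined(__ignored, val, ...) val

/* -DMODULE defines MODULE as 1, so __ARG_PLACEHOLDER_##val becomes "0,"
 * and the second argument selected is 1; when MODULE is undefined, the
 * paste yields a single unknown token and the result is 0. */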
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Michal Marek <mmarek@suse.com> Signed-off-by: Razvan Ghitulete <rga@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27477743
CVE: CVE-2017-5715
(cherry picked from commit 675901851fd2a00c7a70bcf327303efba3b81e2f) Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Add asm-usable variants of EXPORT_SYMBOL/EXPORT_SYMBOL_GPL. This
commit just adds the default implementation; most of the architectures
can simply add export.h to asm/Kbuild and start using <asm/export.h>
from assembler. The rest need to have their <asm/export.h> define
several macros and then explicitly include <asm-generic/export.h>.
One area where the things might diverge from default is the alignment;
normally it's 8 bytes on 64bit targets and 4 on 32bit ones, both for
unsigned long and for struct kernel_symbol. Unfortunately, amd64 and
m68k are unusual - m68k aligns to 2 bytes (for both) and amd64 aligns
struct kernel_symbol to 16 bytes. For those we'll need asm/export.h to
override the constants used by generic version - KSYM_ALIGN and KCRC_ALIGN
for kernel_symbol and unsigned long resp. And no, __alignof__ would not
do the trick - on amd64 __alignof__ of struct kernel_symbol is 8, not 16.
More serious source of unpleasantness is treatment of function
descriptors on architectures that have those. Things like ppc64,
parisc, ia64, etc. need more than the address of the first insn to
call an arbitrary function. As the result, their representation of
pointers to functions is not the typical "address of the entry point" -
it's an address of a small static structure containing all the required
information (including the entry point, of course). Sadly, the asm-side
conventions differ in what the function name refers to - entry point or
the function descriptor. On ppc64 we do the latter;
bar: .quad foo
is what void (*bar)(void) = foo; turns into and the rare places where
we need to explicitly work with the label of entry point are dealt with
as DOTSYM(foo). For our purposes it's ideal - generic macros are usable.
However, parisc would have foo and P%foo used for label of entry point
and address of the function descriptor and
bar: .long P%foo
would be used instead. ia64 is similar to parisc in that respect,
except that there it's @fptr(foo) rather than P%foo. Such architectures
need to define KSYM_FUNC that would turn a function name into whatever
is needed to refer to function descriptor.
What's more, on such architectures we need to know whether we are exporting
a function or an object - in assembler we have to tell that explicitly, to
decide whether we want EXPORT_SYMBOL(foo) to produce e.g.
__ksymtab_foo: .quad foo
or
__ksymtab_foo: .quad @fptr(foo)
For that reason we introduce EXPORT_DATA_SYMBOL{,_GPL}(), to be used for
exports of data objects. On normal architectures it's the same thing
as EXPORT_SYMBOL{,_GPL}(), but on parisc-like ones they differ and the
right one needs to be used. Most of the exports are functions, so we
keep EXPORT_SYMBOL for those...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Razvan Ghitulete <rga@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27477743
CVE: CVE-2017-5715
(cherry picked from commit a88693d006986b8fc9ae5036852ac0156106de2a) Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Commit 4efca4ed ("kbuild: modversions for EXPORT_SYMBOL() for asm") adds
modversion support for symbols exported from asm files. Architectures
must include C-style declarations for those symbols in asm/asm-prototypes.h
in order for them to be versioned.
Add these declarations for x86, and an architecture-independent file that
can be used for common symbols.
With f27c2f6 reverting 8ab2ae6 ("default exported asm symbols to zero") we
produce a scary warning on x86, this commit fixes that.
Signed-off-by: Adam Borowski <kilobyte@angband.pl> Tested-by: Kalle Valo <kvalo@codeaurora.org> Acked-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Peter Wu <peter@lekensteyn.nl> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Michal Marek <mmarek@suse.com> Signed-off-by: Razvan Ghitulete <rga@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27477743
CVE: CVE-2017-5715
(cherry picked from commit b76ac90af34dc08f057da15aa34584a0a3d381b5) Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Currently we use the current_stack_pointer() function to get the value
of the stack pointer register. Since commit:
f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
... we have a stack register variable declared. It can be used instead of
the current_stack_pointer() function, which allows us to optimize away some
excessive "mov %rsp, %<dst>" instructions.
Where an ALTERNATIVE is used in the middle of an inline asm block, this
would otherwise lead to the following instruction being appended directly
to the trailing ".popsection", and a failed compile.
Fixes: 9cebed423c84 ("x86, alternative: Use .pushsection/.popsection") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: ak@linux.intel.com Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180104143710.8961-8-dwmw@amazon.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 999d4f1961fa002bda138ddfe9119965421f85da)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
The alternatives code checks only whether the first byte is a NOP, but
with NOPs in front of the payload and actual instructions after it, this
breaks the "optimized" test.
Make sure to scan all bytes before deciding to optimize the NOPs in there.
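The fix is the scan itself, per the upstream commit:

static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr)
{
        int i;

        /* Previously only instr[0] was tested; now refuse to optimize
         * unless every byte of the candidate region is a 0x90 NOP. */
        for (i = 0; i < a->padlen; i++) {
                if (instr[i] != 0x90)
                        return;
        }
        /* ... proceed with the NOP optimization as before ... */
}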
Reported-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Andrew Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/20180110112815.mgciyf5acwacphkq@pd.tnic Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit e997d991ab2b1dc9f9cdad999a891626c2aecf21)
Orabug: 27477743
CVE: CVE-2017-5715 Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
We are checking for gaps to the previous bio_vec, which can
only detect back merge gaps. Moreover, at the point where
we check for a gap, we don't know if we will attempt a back
or a front merge. Thus, check for a gap to prev in a back merge
attempt and for a gap to next in a front merge attempt.
Signed-off-by: Jens Axboe <axboe@fb.com>
[sagig: Minor rename change] Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
(cherry picked from commit 5e7c4274a70aa2d6f485996d0ca1dad52d0039ca)
For drivers that don't support gaps in the SG lists handed to
them we must bounce (copy the user buffers) and pass a bio that
does not include gaps. This doesn't matter for any current user,
but will help to allow iser which can't handle gaps to use the
block virtual boundary instead of using driver-local bounce
buffering when handling SG_IO commands.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 46348456c1791053dcbe5a9e21825b10a3c8a8fb)
Keith Busch [Thu, 1 Feb 2018 19:37:06 +0000 (11:37 -0800)]
block: Replace SG_GAPS with new queue limits mask
The SG_GAPS queue flag caused checks for bio vector alignment against
PAGE_SIZE, but the device may have different constraints. This patch
adds a queue limit so a driver with such constraints can set it to allow
requests that would otherwise have been unnecessarily split. The new gaps check
takes the request_queue as a parameter to simplify the logic around
takes the request_queue as a parameter to simplify the logic around
invoking this function.
This new limit makes the queue flag redundant, so removing it and
all usage. Device-mappers will inherit the correct settings through
blk_stack_limits().
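The new check reduces to a mask test against the queue's virtual boundary,
per the upstream patch:

static inline bool bvec_gap_to_prev(struct request_queue *q,
                                    struct bio_vec *bprv, unsigned int offset)
{
        if (!queue_virt_boundary(q))
                return false;   /* no constraint set: nothing is a gap */
        return offset ||
                ((bprv->bv_offset + bprv->bv_len) & queue_virt_boundary(q));
}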
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 03100aada96f0645bbcb89aea24c01f02d0ef1fa)
In UEK4, the decision was made to switch completely to
queue_virt_boundary, discarding the use of the SG_GAPS flag. Hence,
this partial porting of the new queue limit mask is reverted. Also,
this partial patch is broken for stacked devices.
The following soft lockup was caught. This is a deadlock caused by
recursive locking.
Process kworker/u40:1:28016 was holding spin lock "mbx->queue_lock" in
qlcnic_83xx_mailbox_worker(), while a softirq came in and ask the same spin
lock in qlcnic_83xx_enqueue_mbx_cmd(). This lock should be hold by disable
bh..
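The fix, sketched on the worker side (matching the cherry-picked commit):

/* Take the queue lock with bottom halves disabled so the softirq
 * enqueue path cannot recurse on the same lock on this CPU. */
spin_lock_bh(&mbx->queue_lock);         /* was: spin_lock() */
/* ... dequeue/manipulate the mailbox command queue ... */
spin_unlock_bh(&mbx->queue_lock);       /* was: spin_unlock() */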
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 233ac3891607f501f08879134d623b303838f478) Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Ankur Arora [Mon, 5 Feb 2018 03:35:07 +0000 (22:35 -0500)]
x86/entry: RESTORE_IBRS needs to be done under kernel CR3
RESTORE_IBRS_CLOBBER executes after we have already switched
to the USER_CR3. This blows up because RESTORE_IBRS_CLOBBER
looks at a kernel variable (use_ibrs).
This fixes a corner case that is caused by a race of dio write vs dio
read/write.
Here is how the race could happen.
Suppose that no extent map has been loaded into memory yet.
There is a file extent [0, 32K), two jobs are running concurrently
against it, t1 is doing dio write to [8K, 32K) and t2 is doing dio
read from [0, 4K) or [4K, 8K).
t1 goes ahead of t2 and splits em [0, 32K) into em [0K, 8K) and [8K, 32K).
When add_extent_mapping() gets -EEXIST for adding em [0, 32k),
search_extent_mapping() would return [0, 8k) as the existing em, even
though start == existing->start, em is [0, 32k) and
extent_map_end(em) > extent_map_end(existing), i.e. 32k > 8k.
It then goes through merge_extent_mapping(), which tries to add a [8k, 8k)
em (with length 0), and btrfs_get_extent() ends up returning -EEXIST,
and dio read/write will get -EEXIST, which confuses applications.
Here I also enumerated all possible situations,
1) start < existing->start
After going through the above case by case, it turns out that if start is
within the existing em (front inclusive), then the existing em should be
returned; otherwise, we try our best to merge the candidate em with
sibling ems to form a larger em.
Reported-by: David Vallender <david.vallender@landmark.co.uk> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Anand Jain <anand.jain@oracle.com>
em->block_len could be checked when deciding if two ems are mergeable.
merge_extent_mapping() has only added the front pad if the front part
of the em gets truncated, but it's possible that the end part gets
truncated.
For both compressed extents and inline extents, em->block_len is not
adjusted accordingly, while for regular extents em->block_len always
equals em->len; hence this sets em->block_len to em->len.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Anand Jain <anand.jain@oracle.com>
My QEMU VM was seeing inexplicable I/O errors that I tracked down to
errors coming from the qcow2 virtual drive in the host system. The qcow2
file is a nocow file on my Btrfs drive, which QEMU opens with O_DIRECT.
Every once in a while, pread() or pwrite() would return EEXIST, which
makes no sense. This turned out to be a bug in btrfs_get_extent().
Commit 8dff9c853410 ("Btrfs: deal with duplciates during extent_map
insertion in btrfs_get_extent") fixed a case in btrfs_get_extent() where
two threads race on adding the same extent map to an inode's extent map
tree. However, if the added em is merged with an adjacent em in the
extent tree, then we'll end up with an existing extent that is not
identical to but instead encompasses the extent we tried to add. When we
call merge_extent_mapping() to find the nonoverlapping part of the new
em, the arithmetic overflows because there is no such thing. We then end
up trying to add a bogus em to the em_tree, which results in an EEXIST
that can bubble all the way up to userspace.
Fix it by extending the identical extent map special case.
Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
(cherry picked from commit 8e2bd3b7fac91b79a6115fd1511ca20b2a09696d) Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Anand Jain <anand.jain@oracle.com>
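As a sketch of the extended special case (field names follow the btrfs
extent map code; the exact condition in the patch may differ), an
existing em that starts at the same offset and covers at least as much
as the new em is treated as identical:

	if (existing->start == em->start &&
	    extent_map_end(existing) >= extent_map_end(em) &&
	    em->block_start == existing->block_start) {
		/*
		 * The existing (merged) em encompasses the one we
		 * tried to add; there is no nonoverlapping remainder,
		 * so reuse it instead of letting the arithmetic in
		 * merge_extent_mapping() overflow.
		 */
		free_extent_map(em);
		em = existing;
		ret = 0;
	}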
When dealing with inline extents, btrfs_get_extent will incorrectly try
to insert a duplicate extent_map. The dup hits -EEXIST from
add_extent_map, but then we try to merge with the existing one and end
up trying to insert a zero length extent_map.
This actually works most of the time, except when there are extent maps
past the end of the inline extent. rocksdb will trigger this sometimes
because it preallocates an extent and then truncates down.
You'll get EIOs trying to read the inline data after this because
add_extent_map is returning EEXIST.
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 8dff9c85341032767d7b519217a79ea04cd676b0) Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: John Haxby <john.haxby@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
K. Y. Srinivasan [Fri, 23 Dec 2016 00:54:03 +0000 (16:54 -0800)]
Drivers: hv: util: Backup: Fix a rescind processing issue
VSS may use a char device to support the communication between
the user level daemon and the driver. When the VSS channel is rescinded
we need to make sure that the char device is fully cleaned up before
we can process a new VSS offer from the host. Implement this logic.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit d77044d142e960f7b5f814a91ecb8bcf86aa552c) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
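A minimal sketch of the ordering this enforces, assuming a completion
signaled from the char device release path (names are illustrative):

#include <linux/completion.h>

static DECLARE_COMPLETION(release_event);

/* rescind path: destroy the char device, then wait until its last
 * reference is gone before a new VSS offer may be processed */
static void hv_vss_deinit(void)
{
	hvutil_transport_destroy(hvt);
	wait_for_completion(&release_event);
}

/* char device release callback: cleanup is now really finished */
static void vss_release(void)
{
	complete(&release_event);
}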
Alex Ng [Sun, 6 Nov 2016 21:14:11 +0000 (13:14 -0800)]
Drivers: hv: vss: Operation timeouts should match host expectation
Increase the timeout of backup operations. When the system is under I/O load,
it needs more time to freeze. These timeout values should also match the
host timeout values more closely.
Signed-off-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit b357fd3908c1191f2f56e38aa77f2aecdae18bc8) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Alex Ng [Sun, 6 Nov 2016 21:14:10 +0000 (13:14 -0800)]
Drivers: hv: vss: Improve log messages.
Adding log messages to help troubleshoot error cases and transaction
handling.
Signed-off-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit 23d2cc0c29eb0e7c6fe4cac88098306c31c40208) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Alex Ng [Fri, 2 Sep 2016 12:58:25 +0000 (05:58 -0700)]
Drivers: hv: utils: Check VSS daemon is listening before a hot backup
The Hyper-V host sends a VSS_OP_HOT_BACKUP request to check if the guest
is ready for a live backup/snapshot. The driver should respond to the
check only if the daemon is running and listening to requests. This
allows the host to fall back to standard snapshots in case the VSS
daemon is not running.
Signed-off-by: Alex Ng <alexng@messages.microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit db886e4d24c2b3d334be2cc1bd1bd05d547eb4c4) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Alex Ng [Fri, 2 Sep 2016 12:58:24 +0000 (05:58 -0700)]
Drivers: hv: utils: Continue to poll VSS channel after handling requests.
Multiple VSS_OP_HOT_BACKUP requests may arrive in quick succession, even
though the host only signals once. The driver was handling the first
request while ignoring the others in the ring buffer. We should poll the
VSS channel after handling a request to continue processing other requests.
Signed-off-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit 497af84b81b98b27e9ba7aebb8a373412e328497) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
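Sketched out (the wrapper mirrors what the description implies; the
surrounding handler code is elided), the driver re-polls the channel
once a request has been handled:

static void vss_poll_wrapper(void *channel)
{
	/* drain any requests still queued in the ring buffer */
	hv_vss_onchannelcallback(channel);
}

	/* after completing one VSS_OP_HOT_BACKUP request: */
	hv_poll_channel(vss_transaction.recv_channel, vss_poll_wrapper);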
Vitaly Kuznetsov [Fri, 10 Jun 2016 00:08:57 +0000 (17:08 -0700)]
Drivers: hv: utils: fix a race on userspace daemons registration
Background: the userspace daemon registration protocol for Hyper-V
utility drivers has two steps:
1) the daemon writes its own version to the kernel
2) the kernel reads it and replies with the module version
At this point we consider the handshake procedure complete and we do
hv_poll_channel(), transitioning the utility device to the HVUTIL_READY
state. From this point on we're ready to handle messages from the kernel.
When hvutil_transport is in HVUTIL_TRANSPORT_CHARDEV mode we have a
single buffer for the outgoing message. hvutil_transport_send() writes to
this buffer, and until the buffer is cleared by hvt_op_read(), all
subsequent calls return -EFAULT. The host<->guest protocol guarantees
there is no more than one request at a time, and we will not get a new
request until we reply to the previous one, so this single message buffer
is enough.
Now to the race. When we finish the negotiation procedure and send the
kernel module version to userspace with hvutil_transport_send(), it goes
into the above-mentioned buffer; if the daemon is slow to read it, a
request arriving from the host collides with it: we won't be able to put
anything into the buffer, so the request will be lost. To solve the issue
we need to know when the negotiation is really done (when the version
message has been read by the daemon) and transition to the HVUTIL_READY
state only after that happens. Implement a callback on read to support
this.
Old-style netlink communication is not affected by the change; we don't
really know when those messages are delivered, but we don't have a single
message buffer there.
Reported-by: Barry Davis <barry_davis@stormagic.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit e0fa3e5e7df61eb2c339c9f0067c202c0cdeec2c) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
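A sketch of the callback-on-read idea, assuming hvutil_transport_send()
gained an on-read callback argument as the description implies:

static void vss_on_msg_read(void)
{
	/*
	 * The daemon has consumed the version reply; only now is the
	 * handshake really complete, so only now may host requests
	 * claim the single outgoing-message buffer.
	 */
	vss_transaction.state = HVUTIL_READY;
}

	/* negotiation reply: defer the state change to the callback */
	hvutil_transport_send(hvt, &msg, sizeof(msg), vss_on_msg_read);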
Olaf Hering [Tue, 15 Dec 2015 00:01:42 +0000 (16:01 -0800)]
Drivers: hv: vss: run only on supported host versions
The Backup integration service on WS2012 apparently has trouble
negotiating with a guest which does not support the provided util version.
Currently the VSS driver supports only version 5/0. A WS2012 host offers
only versions 1/x and 3/x, and vmbus_prep_negotiate_resp correctly returns
an empty icframe_vercnt/icmsg_vercnt. But the host ignores that and
continues to send ICMSGTYPE_NEGOTIATE messages. The result is weird
errors during boot and general misbehaviour.
Check the Windows version to work around the host bug, skip hv_vss_init
on WS2012 and older.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Orabug: 27426063
(cherry picked from commit ed9ba608e4851144af8c7061cbb19f751c73e998) Signed-off-by: Jack Vogel <jack.vogel@oracle.com> Reviewed-by: Ashok Vairavan <ashok.vairavan@oracle.com>
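The workaround amounts to a version gate at init time; a sketch (the
constant and message approximate the actual patch):

int hv_vss_init(struct hv_util_service *srv)
{
	/* WS2012 R2 is the oldest host speaking util version 5/0 */
	if (vmbus_proto_version < VERSION_WIN8_1) {
		pr_warn("VSS: not supported on this host version\n");
		return -ENOTSUPP;
	}
	/* ... normal initialization ... */
	return 0;
}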
There is still a security hole left in the mem.c driver -- when
securelevel is set, a userland application can access PCI configuration
space via this driver.
--
Attempting a write via the mmap() API using acc_test:
[root@ban92uut054 ~]# ./acc_test mmap 0x846000c4=0x1
Using mmap() API for access
mmap write wrote 0x1
if (strcmp("rw", argv[1]) == 0) {
access_type = 1;
printf("Using pread()/pwrite() API for access\n");
}
else if (strcmp("mmap", argv[1]) == 0) {
access_type = 2;
printf("Using mmap() API for access\n");
}
else {
printf("Illegal access type: must be rw or mmap\n");
return -1;
}
addr = strtoul(argv[2], &tmp, 16);
if ((tmp && (*tmp != '=')) &&
((*tmp != '\0') || (errno == EINVAL) ||
(addr == ULONG_MAX && errno == ERANGE))) {
fprintf(stderr, "Invalid address specified; must be hex based\n");
if (errno) perror("error : ");
exit(1);
}
else if (tmp && (*tmp == '=')) { // write case
tmp++;
val = strtoul(tmp, NULL, 16);
operation = 1;
}
This patch fixes the hole: with securelevel set, one could still write to
/dev/mem via the mmap() API. The fix is to disallow opening /dev/mem or
/dev/kmem at all, checking access at open time rather than calling
get_securelevel() at the various read/write locations.
Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com> Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com> Reviewed-by: Eric Snowberg <eric.snowberg@oracle.com> Reviewed-by: Khalid Aziz <khalid.aziz@oracle>
(cherry picked from commit 8ca81822fffb2536aa1076fb17d1ea1413df3e7f)
Add this commit back to UEK4 now that the regressions have been fixed
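A minimal sketch of the open-time check, assuming the get_securelevel()
helper referenced above:

static int open_port(struct inode *inode, struct file *filp)
{
	/*
	 * Refuse /dev/mem and /dev/kmem outright once securelevel is
	 * raised, instead of sprinkling get_securelevel() through
	 * every read/write/mmap path.
	 */
	if (get_securelevel() > 0)
		return -EPERM;
	return capable(CAP_SYS_RAWIO) ? 0 : -EPERM;
}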
Ka-Cheong Poon [Mon, 18 Dec 2017 15:45:18 +0000 (07:45 -0800)]
rds: Calling getsockname() on unbound socket generates seg fault
If a socket is not yet bound, calling getsockname() should return an
unspecified address with family AF_UNSPEC as an RDS socket can be
bound to either an IPv4 or an IPv6 address. Hence returning either
family is incorrect. Currently, the returned address is set to an
unspecified address with family AF_INET6. If the passed in buffer is
smaller than sizeof(struct sockaddr_in6), the app may get a
segmentation fault error. This is similar to passing a buffer smaller
than sizeof(struct sockaddr_in) before the IPv6 changes.
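A sketch of the described behavior (the bound-address test is an
illustrative helper): an unbound socket reports AF_UNSPEC and only
sizeof(sa_family_t) bytes, so a sockaddr_in-sized buffer is never
overrun:

static int rds_getname(struct socket *sock, struct sockaddr *uaddr,
		       int *uaddr_len, int peer)
{
	struct rds_sock *rs = rds_sk_to_rs(sock->sk);

	if (!rs_socket_is_bound(rs)) {	/* illustrative helper */
		memset(uaddr, 0, sizeof(*uaddr));
		uaddr->sa_family = AF_UNSPEC;
		*uaddr_len = sizeof(uaddr->sa_family);
		return 0;
	}
	/* ... otherwise fill in the bound IPv4/IPv6 address ... */
	return 0;
}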
Zhu Yanjun [Fri, 1 Dec 2017 02:13:20 +0000 (21:13 -0500)]
net/mlx4_core: allow QPs with enable_smi_admin enabled
In commit 64453f042519 ("net/mlx4_core: Disallow creation of RAW QPs
on a VF"), some QPs are disallowed. But when enable_smi_admin is
enabled, such QPs should be allowed through.
Håkon Bugge [Fri, 8 Dec 2017 15:55:22 +0000 (16:55 +0100)]
net/rds: Fix incorrect error handling
Commit 5f58d7e81c2f ("net/rds: reduce memory footprint during
ib_post_recv in IB transport") removes the order-2 allocations used to
receive fragments and uses zero-order allocations instead. However,
said commit has incorrect error handling.
Konrad Rzeszutek Wilk [Sat, 13 Jan 2018 03:32:23 +0000 (22:32 -0500)]
x86: Move STUFF_RSB in to the idt macro
instead of it sitting in paranoid_entry or error_entry.
The idea behind STUFF_RSB is that it must be done _before_
any calls are made. That means we really want it in the idt
macro that is used for exceptions - such as device_not_available,
which currently looks like so:
[Ignore the callq *0x40..; that gets patched to a 'cld']
<device_not_available>:
nop
nop
nop
callq *0x40d0b7(%rip) # ffffffff81b55330 <pv_irq_ops+0x30> <= patched to cld
pushq $0xffffffffffffffff
sub $0x78,%rsp
callq ffffffff81748ea0 <error_entry> <=== call!
mov %rsp,%rdi
xor %esi,%esi
callq ffffffff81018830 <do_device_not_available>
test %rax,%rax
jne ffffffff81747f10 <dtrace_error_exit>
jmpq ffffffff817490a0 <error_exit>
nopl 0x0(%rax)
By stuffing the RSB before the call to error_entry (or
paranoid_entry) we remove the chance of this becoming an attack vector.
While at it, remove the useless comment - we don't encode any frames
in UEK4.
OraBug: 27417150 Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>