Chuck Anderson [Wed, 9 Nov 2016 22:19:53 +0000 (14:19 -0800)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/upstream-cherry-picks:
percpu: fix synchronization between synchronous map extension and chunk destruction
percpu: fix synchronization between chunk->map_extend_work and chunk destruction
ALSA: timer: Fix leak in events via snd_timer_user_tinterrupt
ALSA: timer: Fix leak in events via snd_timer_user_ccallback
ALSA: timer: Fix leak in SNDRV_TIMER_IOCTL_PARAMS
Tejun Heo [Wed, 25 May 2016 15:48:25 +0000 (11:48 -0400)]
percpu: fix synchronization between synchronous map extension and chunk destruction
For non-atomic allocations, pcpu_alloc() can try to extend the area
map synchronously after dropping pcpu_lock; however, the extension
wasn't synchronized against chunk destruction and the chunk might get
freed while extension is in progress.
This patch fixes the bug by putting most of non-atomic allocations
under pcpu_alloc_mutex to synchronize against pcpu_balance_work which
is responsible for async chunk management including destruction.
Tejun Heo [Wed, 25 May 2016 15:48:25 +0000 (11:48 -0400)]
percpu: fix synchronization between chunk->map_extend_work and chunk destruction
Atomic allocations can trigger async map extensions which is serviced
by chunk->map_extend_work. pcpu_balance_work which is responsible for
destroying idle chunks wasn't synchronizing properly against
chunk->map_extend_work and may end up freeing the chunk while the work
item is still in flight.
This patch fixes the bug by rolling async map extension operations
into pcpu_balance_work.
Kangjie Lu [Tue, 3 May 2016 20:44:32 +0000 (16:44 -0400)]
ALSA: timer: Fix leak in events via snd_timer_user_tinterrupt
The stack object “r1” has a total size of 32 bytes. Its field
“event” and “val” both contain 4 bytes padding. These 8 bytes
padding bytes are sent to user without being initialized.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Orabug: 25059885
CVE: CVE-2016-4578
Mainline v4.7 commit e4ec8cc8039a7063e24204299b462bd1383184a5 Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Kangjie Lu [Tue, 3 May 2016 20:44:20 +0000 (16:44 -0400)]
ALSA: timer: Fix leak in events via snd_timer_user_ccallback
The stack object “r1” has a total size of 32 bytes. Its field
“event” and “val” both contain 4 bytes padding. These 8 bytes
padding bytes are sent to user without being initialized.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Orabug: 25059885
CVE: CVE-2016-4578
Mainline v4.7 commit 9a47e9cff994f37f7f0dbd9ae23740d0f64f9fe6 Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Kangjie Lu [Tue, 3 May 2016 20:44:07 +0000 (16:44 -0400)]
ALSA: timer: Fix leak in SNDRV_TIMER_IOCTL_PARAMS
The stack object “tread” has a total size of 32 bytes. Its field
“event” and “val” both contain 4 bytes padding. These 8 bytes
padding bytes are sent to user without being initialized.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Orabug: 25059408
CVE: CVE-2016-4569
Mainline v4.7 commit cec8f96e49d9be372fdb0c3836dcf31ec71e457e Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Chuck Anderson [Fri, 4 Nov 2016 12:33:13 +0000 (05:33 -0700)]
uek-rpm ol7: change uek-rpm/ol7/update-el release value from 7.1 to 7.3
Change release value in uek-rpm/ol7/update-el to 7.3 so that manual builds
will pick up the new OL7.3 secure boot key.
uek-rpm/ol6/update-el is not affected.
Chuck Anderson [Thu, 3 Nov 2016 17:42:10 +0000 (10:42 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/upstream-cherry-picks: (23 commits)
NFS: Fix an LOCK/OPEN race when unlinking an open file
intel_idle: correct BXT support
intel_idle: re-work bxt_idle_state_table_update() and its helper
x86/intel_idle: Use Intel family macros for intel_idle
x86/cpu/intel: Introduce macros for Intel family numbers
intel_idle: add BXT support
intel_idle: Add KBL support
intel_idle: Add SKX support
intel_idle: Clean up all registered devices on exit.
intel_idle: Propagate hot plug errors.
intel_idle: Don't overreact to a cpuidle registration failure.
intel_idle: Setup the timer broadcast only on successful driver load.
intel_idle: Avoid a double free of the per-CPU data.
intel_idle: Fix dangling registration on error path.
intel_idle: Fix deallocation order on the driver exit path.
intel_idle: Remove redundant initialization calls.
intel_idle: Fix a helper function's return value.
intel_idle: remove useless return from void function.
intel_idle: Support for Intel Xeon Phi Processor x200 Product Family
intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled
...
Sometime uVNIC removal on OFOS won't trigger a actual removal
of Vstar interface, in that case uVNIC driver has to send NACK
code so that XCM will start cleaning its database.
When path->users becomes zero uVNIC driver starts
cleaning up the Forwarding table entries.
In some corner cases the call is invoked from transmit
function which is in interrupt context and that results
in a hard LOCKUP.
With new changes path->users is decremented in transmit
function to allow cleanup to happen from other thread.
Proper care is taken to avoid race between these
two contexts.
At Connectathon 2016, we found that recent upstream Linux clients
would occasionally send a LOCK operation with a zero stateid. This
appeared to happen in close proximity to another thread returning
a delegation before unlinking the same file while it remained open.
Earlier, the client received a write delegation on this file and
returned the open stateid. Now, as it is getting ready to unlink the
file, it returns the write delegation. But there is still an open
file descriptor on that file, so the client must OPEN the file
again before it returns the delegation.
Since commit 24311f884189 ('NFSv4: Recovery of recalled read
delegations is broken'), nfs_open_delegation_recall() clears the
NFS_DELEGATED_STATE flag _before_ it sends the OPEN. This allows a
racing LOCK on the same inode to be put on the wire before the OPEN
operation has returned a valid open stateid.
To eliminate this race, serialize delegation return with the
acquisition of a file lock on the same file. Adopt the same approach
as is used in the unlock path.
This patch also eliminates a similar race seen when sending a LOCK
operation at the same time as returning a delegation on the same file.
Fixes: 24311f884189 ('NFSv4: Recovery of recalled read ... ') Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
[Anna: Add sentence about LOCK / delegation race] Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
(cherry picked from commit 11476e9dec39d90fe1e9bf12abc6f3efe35a073d) Signed-off-by: Todd Vierling <todd.vierling@oracle.com>
Commit 5dcef69486 ("intel_idle: add BXT support") added an 8-element
lookup array with just a 2-bit value used for lookups. As per the SDM
that bit field is really 3 bits wide. While this is supposedly benign
here, future re-use of the code for other CPUs might expose the issue.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit bef450962597ff39a7f9d53a30523aae9eb55843) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Since irtl_ns_units[] has itself zero entries, make sure the caller
recognized those cases along with the MSR read returning zero, as zero
is not a valid value for exit_latency and target_residency.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 3451ab3ebf92b12801878d8b5c94845afd4219f0) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Use the new INTEL_FAM6_* macros for intel_idle.c. Also fix up
some of the macros to be consistent with how some of the
intel_idle code refers to the model.
There's on oddity here: model 0x1F is uniquely referred to here
and nowhere else that I could find. 0x1E/0x1F are just spelled
out as "Intel Core i7 and i5 Processors" in the SDM or as "Intel
processors based on the Nehalem, Westmere microarchitectures" in
the RDPMC section. Comments between tables 19-19 and 19-20 in
the SDM seem to point to 0x1F being some kind of Westmere, so
let's call it "WESTMERE2".
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jacob.jun.pan@intel.com Cc: linux-pm@vger.kernel.org Link: http://lkml.kernel.org/r/20160603001932.EE978EB9@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit db73c5a8c80decbb6ddf208e58f3865b4df5384d) Signed-off-by: Brian Maly <brian.maly@oracle.com>
We have a boatload of open-coded family-6 model numbers. Half of
them have these model numbers in hex and the other half in
decimal. This makes grepping for them tons of fun, if you were
to try.
Solution:
Consolidate all the magic numbers. Put all the definitions in
one header.
The names here are closely derived from the comments describing
the models from arch/x86/events/intel/core.c. We could easily
make them shorter by doing things like s/SANDYBRIDGE/SNB/, but
they seemed fine even with the longer versions to me.
Do not take any of these names too literally, like "DESKTOP"
or "MOBILE". These are all colloquial names and not precise
descriptions of everywhere a given model will show up.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Doug Thompson <dougthompson@xmission.com> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com> Cc: Souvik Kumar Chakravarty <souvik.k.chakravarty@intel.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Vishwanath Somayaji <vishwanath.somayaji@intel.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: jacob.jun.pan@intel.com Cc: linux-acpi@vger.kernel.org Cc: linux-edac@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: platform-driver-x86@vger.kernel.org Link: http://lkml.kernel.org/r/20160603001927.F2A7D828@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 970442c599b22ccd644ebfe94d1d303bf6f87c05) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Broxton has all the HSW C-states, except C3.
BXT C-state timing is slightly different.
Here we trust the IRTL MSRs as authority
on maximum C-state latency, and override the driver's tables
with the values found in the associated IRTL MSRs.
Further we set the target_residency to 1x maximum latency,
trusting the hardware demotion logic.
Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 5dcef694860100fd16885f052591b1268b764d21) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
arch/x86/include/asm/msr-index.h
Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 3ce093d4de753d6c92cc09366e29d0618a62f542) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit f9e71657c2c0a8f1c50884ab45794be2854e158e) Signed-off-by: Brian Maly <brian.maly@oracle.com>
This driver registers cpuidle devices when a CPU comes online, but it
leaves the registrations in place when a CPU goes offline. The module
exit code only unregisters the currently online CPUs, leaving the
devices for offline CPUs dangling.
This patch changes the driver to clean up all registrations on exit,
even those from CPUs that are offline.
Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 3e66a9ab53641a0f7a440e56f7b35bf5d77494b3) Signed-off-by: Brian Maly <brian.maly@oracle.com>
If a cpuidle registration error occurs during the hot plug notifier
callback, we should really inform the hot plug machinery instead of
just ignoring the error. This patch changes the callback to properly
return on error.
Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 08820546e4c30c84d0a1f1a49df055e1719c07ea) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The helper function, intel_idle_cpu_init, registers one new device
with the cpuidle layer. If the registration should fail, that
function immediately calls intel_idle_cpuidle_devices_uninit() to
unregister every last CPU's device. However, it makes no sense to do
so, when called from the hot plug notifier callback.
This patch moves the call to intel_idle_cpuidle_devices_uninit()
outside of the helper function to the one call site that actually
needs to perform the de-registrations.
Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit b69ef2c099c3e5f11bd5c33a9530d6522f72c9aa) Signed-off-by: Brian Maly <brian.maly@oracle.com>
This driver sets the broadcast tick quite early on during probe and does
not clean up again in cast of failure. This patch moves the setup call
after the registration, placing the on_each_cpu() calls within the global
CPU lock region.
Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 2259a819a8d37e472f08c88bc0dd22194754adb4) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The helper function, intel_idle_cpuidle_devices_uninit, frees the
globally allocated per-CPU data. However, this function is invoked
from the hot plug notifier callback at a time when freeing that data
is not safe.
If the call to cpuidle_register_driver() should fail (say, due to lack
of memory), then the driver will free its per-CPU region. On the
*next* CPU_ONLINE event, the driver will happily use the region again
and even free it again if the failure repeats.
This patch fixes the issue by moving the call to free_percpu() outside
of the helper function at the two call sites that actually need to
free the per-CPU data.
Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit ca42489d9ee3262482717c83428e087322fdc39c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In the module_init() method, if the per-CPU allocation fails, then the
active cpuidle registration is not cleaned up. This patch fixes the
issue by attempting the allocation before registration, and then
cleaning it up again on registration failure.
Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit e9df69ccd1322e87eee10f28036fad9e6c71f8dd) Signed-off-by: Brian Maly <brian.maly@oracle.com>
In the module_exit() method, this driver first frees its per-CPU
pointer, then unregisters a callback making use of the pointer.
Furthermore, the function, intel_idle_cpuidle_devices_uninit, is racy
against CPU hot plugging as it calls for_each_online_cpu().
This patch corrects the issues by unregistering first on the exit path
while holding the hot plug lock.
Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 51319918bcc31f901646fc66348d41cf74ee0566) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The function, intel_idle_cpuidle_driver_init, makes calls on each CPU
to auto_demotion_disable() and c1e_promotion_disable(). These calls
are redundant, as intel_idle_cpu_init() does the same calls just a bit
later on. They are also premature, as the driver registration may yet
fail.
This patch removes the redundant code.
Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 4a3dfb3fc0fb0fc9acd36c94b7145f9c9dd4d93a) Signed-off-by: Brian Maly <brian.maly@oracle.com>
The function, intel_idle_cpuidle_driver_init, delivers no error codes
at all. This patch changes the function to return 'void' instead of
returning zero.
Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 5469c827d20ab013f43d4f5f94e101d0cf7afd2c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit f70415496d5ddf06fe7e0a22250d60bab2b2d7cc) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Enables "Intel(R) Xeon Phi(TM) Processor x200 Product Family" support,
formerly code-named KNL. It is based on modified Intel Atom Silvermont
microarchitecture.
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
[micah.barany@intel.com: adjusted values of residency and latency] Signed-off-by: Micah Barany <micah.barany@intel.com>
[hubert.chrzaniuk@intel.com: removed deprecated CPUIDLE_FLAG_TIME_VALID flag] Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Signed-off-by: Pawel Karczewski <pawel.karczewski@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
(cherry picked from commit 281baf7a702693deaa45c98ef0c5161006b48257) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Some SKL-H configurations require "intel_idle.max_cstate=7" to boot.
While that is an effective workaround, it disables C10.
This patch detects the problematic configuration,
and disables C8 and C9, keeping C10 enabled.
Note that enabling SGX in BIOS SETUP can also prevent this issue,
if the system BIOS provides that option.
https://bugzilla.kernel.org/show_bug.cgi?id=109081
"Freezes with Intel i7 6700HQ (Skylake), unless intel_idle.max_cstate=7"
Signed-off-by: Len Brown <len.brown@intel.com> Cc: stable@vger.kernel.org
(cherry picked from commit d70e28f57e14a481977436695b0c9ba165472431) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Addition of PC9 state, and minor tweaks to existing PC6 and PC8 states.
Signed-off-by: Len Brown <len.brown@intel.com>
(cherry picked from commit 135919a3a80565070b9645009e65f73e72c661c0) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Skylake Client CPU idle Power states (C-states)
are similar to the previous generation, Broadwell.
However, Skylake does get its own table with updated
worst-case latency and average energy-break-even residency values.
Signed-off-by: Len Brown <len.brown@intel.com>
(cherry picked from commit 493f133f47750aa5566fafa9403617e3f0506f8c) Signed-off-by: Brian Maly <brian.maly@oracle.com>
intel_idle uses a NULL "enter" field in a cpuidle state
to recognize the invalid entry terminating a variable-length array.
Linux-4.0 added support for the system-wide "freeze" state
in cpuidle drivers via the new "enter_freeze" field.
The natural way to expose a deep idle state for freeze,
but not for run-time idle is to supply "enter_freeze" without "enter";
so we update the driver to accept such states.
Signed-off-by: Len Brown <len.brown@intel.com>
(cherry picked from commit 7dd0e0af64afe4aa08ccdd167f64bd007f09b515) Signed-off-by: Brian Maly <brian.maly@oracle.com>
shamir rabinovitch [Wed, 26 Oct 2016 13:16:50 +0000 (06:16 -0700)]
RDS: rds debug messages are enabled by default
rds use Kconfig option called "RDS_DEBUG" to enable rds debug messages.
This option cause the rds Makefile to add -DDEBUG to the rds gcc command
line.
When CONFIG_DYNAMIC_DEBUG is enabled, the "DEBUG" macro is used by
include/linux/dynamic_debug.h to decide if dynamic debug prints should
be sent by default to the kernel log.
rds should not enable this macro for production builds.
David Ahern [Mon, 4 May 2015 15:51:38 +0000 (11:51 -0400)]
net/rds: Fix new sparse warning
c0adf54a109 introduced new sparse warnings:
CHECK /home/dahern/kernels/linux.git/net/rds/ib_cm.c
net/rds/ib_cm.c:191:34: warning: incorrect type in initializer (different base types)
net/rds/ib_cm.c:191:34: expected unsigned long long [unsigned] [usertype] dp_ack_seq
net/rds/ib_cm.c:191:34: got restricted __be64 <noident>
net/rds/ib_cm.c:194:51: warning: cast to restricted __be64
The temporary variable for sequence number should have been declared as __be64
rather than u64. Make it so.
Signed-off-by: David Ahern <david.ahern@oracle.com> Cc: shamir rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e2783717a71e9babfdd7c36c7e35b790d2c01022) Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
shamir rabinovitch [Fri, 1 May 2015 00:58:07 +0000 (20:58 -0400)]
net/rds: fix unaligned memory access
rdma_conn_param private data is copied using memcpy after headers such
as cma_hdr (see cma_resolve_ib_udp as example). so the start of the
private data is aligned to the end of the structure that come before. if
this structure end with u32 the meaning is that the start of the private
data will be 4 bytes aligned. structures that use u8/u16/u32/u64 are
naturally aligned but in case the structure start is not 8 bytes aligned,
all u64 members of this structure will not be aligned. to solve this issue
we must use special macros that allow unaligned access to those
unaligned members.
Addresses the following kernel log seen when attempting to use RDMA:
Kernel unaligned access at TPC[10507a88] rds_ib_cm_connect_complete+0x1bc/0x1e0 [rds_rdma]
Acked-by: Chien Yen <chien.yen@oracle.com> Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com>
[Minor tweaks for top of tree by:] Signed-off-by: David Ahern <david.ahern@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c0adf54a10903b59037a4c5fcb933dfeeb7b2624) Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Chuck Anderson [Mon, 31 Oct 2016 22:52:23 +0000 (15:52 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/upstream-cherry-picks:
sched: panic on corrupted stack end
ecryptfs: forbid opening files without mmap handler
proc: prevent stacking filesystems on top
Until now, hitting this BUG_ON caused a recursive oops (because oops
handling involves do_exit(), which calls into the scheduler, which in
turn raises an oops), which caused stuff below the stack to be
overwritten until a panic happened (e.g. via an oops in interrupt
context, caused by the overwritten CPU index in the thread_info).
Just panic directly.
Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 29d6455178a09e1dc340380c582b13356227e8df) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
kernel/sched/core.c
This prevents users from triggering a stack overflow through a recursive
invocation of pagefault handling that involves mapping procfs files into
virtual memory.
Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Tyler Hicks <tyhicks@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 2f36db71009304b3f0b95afacd8eba1f9f046b87) Signed-off-by: Brian Maly <brian.maly@oracle.com>
This prevents stacking filesystems (ecryptfs and overlayfs) from using
procfs as lower filesystem. There is too much magic going on inside
procfs, and there is no good reason to stack stuff on top of procfs.
(For example, procfs does access checks in VFS open handlers, and
ecryptfs by design calls open handlers from a kernel thread that doesn't
drop privileges or so.)
Signed-off-by: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit e54ad7f1ee263ffa5a2de9c609d58dfa27b21cd9) Signed-off-by: Brian Maly <brian.maly@oracle.com>
Chuck Anderson [Mon, 31 Oct 2016 10:48:30 +0000 (03:48 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1
* topic/uek-4.1/upstream-cherry-picks:
btrfs: Handle unaligned length in extent_same
panic, x86: Fix re-entrance problem due to panic on NMI
kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup
Fix compilation error introduced by "cancel the setfilesize transation when io error happen"
cancel the setfilesize transation when io error happen
mm/hugetlb: optimize minimum size (min_size) accounting
Btrfs: fix device replace of a missing RAID 5/6 device
Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation
bpf: fix double-fdput in replace_map_fd_with_map_ptr()
Mark Fasheh [Mon, 8 Jun 2015 22:05:25 +0000 (15:05 -0700)]
btrfs: Handle unaligned length in extent_same
The extent-same code rejects requests with an unaligned length. This
poses a problem when we want to dedupe the tail extent of files as we
skip cloning the portion between i_size and the extent boundary.
If we don't clone the entire extent, it won't be deleted. So the
combination of these behaviors winds up giving us worst-case dedupe on
many files.
We can fix this by allowing a length that extents to i_size and
internally aligining those to the end of the block. This is what
btrfs_ioctl_clone() so we can just copy that check over.
Signed-off-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit e1d227a42ea2b4664f94212bd1106b9a3413ffb8) Signed-off-by: Divya Indi <divya.indi@oracle.com>
Orabug: 24696342
Hidehiro Kawai [Mon, 14 Dec 2015 10:19:09 +0000 (11:19 +0100)]
panic, x86: Fix re-entrance problem due to panic on NMI
If panic on NMI happens just after panic() on the same CPU, panic() is
recursively called. Kernel stalls, as a result, after failing to acquire
panic_lock.
To avoid this problem, don't call panic() in NMI context if we've
already entered panic().
For that, introduce nmi_panic() macro to reduce code duplication. In
the case of panic on NMI, don't return from NMI handlers if another CPU
already panicked.
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: Don Zickus <dzickus@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com> Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Javi Merino <javi.merino@arm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: kexec@lists.infradead.org Cc: linux-doc@vger.kernel.org Cc: lkml <linux-kernel@vger.kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Seth Jennings <sjenning@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ulrich Obergfell <uobergfe@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Link: http://lkml.kernel.org/r/20151210014626.25437.13302.stgit@softrs
[ Cleanup comments, fixup formatting. ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
(cherry picked from commit 1717f2096b543cede7a380c858c765c41936bc35)
Jiri Kosina [Fri, 6 Nov 2015 02:44:41 +0000 (18:44 -0800)]
kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup
In many cases of hardlockup reports, it's actually not possible to know
why it triggered, because the CPU that got stuck is usually waiting on a
resource (with IRQs disabled) in posession of some other CPU is holding.
IOW, we are often looking at the stacktrace of the victim and not the
actual offender.
Introduce sysctl / cmdline parameter that makes it possible to have
hardlockup detector perform all-CPU backtrace.
Signed-off-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Aaron Tomlin <atomlin@redhat.com> Cc: Ulrich Obergfell <uobergfe@redhat.com> Acked-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 55537871ef666b4153fd1ef8782e4a13fee142cc)
Zhaohongjiang [Mon, 12 Oct 2015 04:28:39 +0000 (15:28 +1100)]
cancel the setfilesize transation when io error happen
When I ran xfstest/073 case, the remount process was blocked to wait
transactions to be zero. I found there was a io error happened, and
the setfilesize transaction was not released properly. We should add
the changes to cancel the io error in this case.
Reproduction steps:
1. dd if=/dev/zero of=xfs1.img bs=1M count=2048
2. mkfs.xfs xfs1.img
3. losetup -f ./xfs1.img /dev/loop0
4. mount -t xfs /dev/loop0 /home/test_dir/
5. mkdir /home/test_dir/test
6. mkfs.xfs -dfile,name=image,size=2g
7. mount -t xfs -o loop image /home/test_dir/test
8. cp a file bigger than 2g to /home/test_dir/test
9. mount -t xfs -o remount,ro /home/test_dir/test
[ dchinner: moved io error detection to xfs_setfilesize_ioend() after
transaction context restoration. ]
It was observed that minimum size accounting associated with the
hugetlbfs min_size mount option may not perform optimally and as
expected. As huge pages/reservations are released from the filesystem
and given back to the global pools, they are reserved for subsequent
filesystem use as long as the subpool reserved count is less than
subpool minimum size. It does not take into account used pages within
the filesystem. The filesystem size limits are not exceeded and this is
technically not a bug. However, better behavior would be to wait for
the number of used pages/reservations associated with the filesystem to
drop below the minimum size before taking reservations to satisfy
minimum size.
An optimization is also made to the hugepage_subpool_get_pages() routine
which is called when pages/reservations are allocated. This does not
change behavior, but simply avoids the accounting if all reservations
have already been taken (subpool reserved count == 0).
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: David Rientjes <rientjes@google.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Orabug: 24450029
(cherry picked from commit 09a95e29cb30a3930db22d340ddd072a82b6b0db) Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Omar Sandoval [Fri, 19 Jun 2015 18:52:51 +0000 (11:52 -0700)]
Btrfs: fix device replace of a missing RAID 5/6 device
The original implementation of device replace on RAID 5/6 seems to have
missed support for replacing a missing device. When this is attempted,
we end up calling bio_add_page() on a bio with a NULL ->bi_bdev, which
crashes when we try to dereference it. This happens because
btrfs_map_block() has no choice but to return us the missing device
because RAID 5/6 don't have any alternate mirrors to read from, and a
missing device has a NULL bdev.
The idea implemented here is to handle the missing device case
separately, which better only happen when we're replacing a missing RAID
5/6 device. We use the new BTRFS_RBIO_REBUILD_MISSING operation to
reconstruct the data from parity, check it with
scrub_recheck_block_checksum(), and write it out with
scrub_write_block_to_dev_replace().
Reported-by: Philip <bugzilla@philip-seeger.de>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96141 Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 24447930
signed-off-by: Divya Indi <divya.indi@oracle.com>
(cherry picked from commit 73ff61dbe5edeb1799d7e91c8b0641f87feb75fa)
The current RAID 5/6 recovery code isn't quite prepared to handle
missing devices. In particular, it expects a bio that we previously
attempted to use in the read path, meaning that it has valid pages
allocated. However, missing devices have a NULL blkdev, and we can't
call bio_add_page() on a bio with a NULL blkdev. We could do manual
manipulation of bio->bi_io_vec, but that's pretty gross. So instead, add
a separate path that allows us to manually add pages to the rbio.
Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 24447930 Signed-off-by: Divya Indi <divya.indi@oracle.com>
(cherry picked from commit b4ee1782686d5b7a97826d67fdeaefaedbca23ce)
Roman Kagan [Wed, 18 May 2016 14:48:20 +0000 (17:48 +0300)]
kvm:vmx: more complete state update on APICv on/off
The function to update APICv on/off state (in particular, to deactivate
it when enabling Hyper-V SynIC) is incomplete: it doesn't adjust
APICv-related fields among secondary processor-based VM-execution
controls. As a result, Windows 2012 guests get stuck when SynIC-based
auto-EOI interrupt intersected with e.g. an IPI in the guest.
In addition, the MSR intercept bitmap isn't updated every time "virtualize
x2APIC mode" is toggled. This path can only be triggered by a malicious
guest, because Windows didn't use x2APIC but rather their own synthetic
APIC access MSRs; however a guest running in a SynIC-enabled VM could
switch to x2APIC and thus obtain direct access to host APIC MSRs
(CVE-2016-4440).
The patch fixes those omissions.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com> Reported-by: Steve Rutherford <srutherford@google.com> Reported-by: Yang Zhang <yang.zhang.wz@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Orabug: 23347009
CVE: CVE-2016-4440 Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>
Ashish Samant [Thu, 18 Aug 2016 20:54:26 +0000 (13:54 -0700)]
fuse: direct-io: don't dirty ITER_BVEC pages
When reading from a loop device backed by a fuse file it deadlocks on
lock_page().
This is because the page is already locked by the read() operation done on
the loop device. In this case we don't want to either lock the page or
dirty it.
So do what fs/direct-io.c does: only dirty the page for ITER_IOVEC vectors.
bpf: fix double-fdput in replace_map_fd_with_map_ptr()
When bpf(BPF_PROG_LOAD, ...) was invoked with a BPF program whose bytecode
references a non-map file descriptor as a map file descriptor, the error
handling code called fdput() twice instead of once (in __bpf_map_get() and
in replace_map_fd_with_map_ptr()). If the file descriptor table of the
current task is shared, this causes f_count to be decremented too much,
allowing the struct file to be freed while it is still in use
(use-after-free). This can be exploited to gain root privileges by an
unprivileged user.
This bug was introduced in
commit 0246e64d9a5f ("bpf: handle pseudo BPF_LD_IMM64 insn"), but is only
exploitable since
commit 1be7f75d1668 ("bpf: enable non-root eBPF programs") because
previously, CAP_SYS_ADMIN was required to reach the vulnerable code.
(posted publicly according to request by maintainer)
Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8358b02bf67d3a5d8a825070e1aa73f25fb2e4c7)
Mukesh Kacker [Thu, 27 Oct 2016 13:07:16 +0000 (06:07 -0700)]
mlx4_ib: remove WARN_ON() based on incorrect assumptions
A WARN_ON() was inserted when user data was introduced to the
ibv_cmd_alloc_shpd() by another infiniband provider to make
sure that no user data is sent to older providers such as this
one which do not expect it.
It assumed when no user data is sent the udata->inlen is zero.
The user-kernel API however always sends at least 8 octets
(which may not be initialized in case of provider libraries that
do not user user data).
We remove the WARN_ON and rely on providers not to touch that
field if their companion library is not expected to initialize it.
Ashok Vairavan [Thu, 27 Oct 2016 19:31:52 +0000 (12:31 -0700)]
nvme: fix max_segments integer truncation
The block layer uses an unsigned short for max_segments. The way we
calculate the value for NVMe tends to generate very large 32-bit values,
which after integer truncation may lead to a zero value instead of
the desired outcome.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Jeff Lien <Jeff.Lien@hgst.com> Tested-by: Jeff Lien <Jeff.Lien@hgst.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
Orabug: 24928835
Cherry picked commit: 45686b6198bd824f083ff5293f191d78db9d708a
Conflicts:
UEK4 QU2 nvme module doesn't have core.c file. All the functions
resides in pci.c. Hence, this patch is manually ported to the respective
function in pci.c
drivers/nvme/host/core.c
Santosh Shilimkar [Fri, 14 Oct 2016 23:47:49 +0000 (16:47 -0700)]
mlx4_core/ib: set the IB port MTU to 2K
'commit 096335b3f983 ("mlx4_core: Allow dynamic MTU configuration for IB
ports")' overwrite the default port MTU and sets it as 4K. Since this
directly impacts the HW VLs supported and Oracle workloads heavily uses
all supported 8 VLs for traffic classification, 2K default needs to
be kept as is.
We initilise it to default 2k so that the feature(dynamic MTU configuration)
is still available for non DB users to set the desired MTU value using sysctl.
Also for CX2 cards, commit 596c5ff4b7b3 ("net/mlx4: adjust initial
value of vl_cap in mlx4_SET_PORT") broke the vl_cap which made the
supported VLs to 4 irrespective of MTU size.
Knut Omang [Fri, 21 Oct 2016 06:23:39 +0000 (08:23 +0200)]
sif: cq: transfer headroom attribute to user mode
This commit makes sure old libsif versions works
with the driver while providing a forward compatible
way of making additional changes to the extra
headroom in the CQs.
We anticipate to be able to trim
down the extra entries once we have PQP errors
handled transparently. This commit then ensures that
the headroom is only set in one place, at the
driver side, and that user mode just can
pick up the configured headroom from the kernel.
This is done by providing the used headroom
in a formerly reserved 32 bit field, thus no changes
to the packet size is necessary.
Nevertheless we increment the abi version from
3.6 to 3.7 to allow libsif to detect whether
the headroom field can be trusted.
Knut Omang [Thu, 13 Oct 2016 10:07:33 +0000 (12:07 +0200)]
sif: Add vendor flag to support testing without oversized CQs
After introduction of extra CQ entries to reduce risk of
having duplicate completions overflow a CQ, we no longer can
trigger various CQ overflow scenarios without running a lot of
requests. We need to be able to test with a minimal set of operations
to allow co-sim based tests for further analysis.
Introduce a new vendor_flag no_x_cqe = 0x80 to turn off
the allocation of extra CQEs.
This patch fixes an incomplete patch in commit "cq: Add
additional SIF visible cqes to CQ". The max_cqe
capability reported by query_device is incorrect because
it includes the SIF visible cqes.
Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com> Reviewed-by: Knut Omang <knut.omang@oracle.com>
Fix an issue in modify_qp_hw from SQE to RTS returns
EPSC_MODIFY_CANNOT_CHANGE_QP_ATTR. In sif, qp transition
from SQE to RTS must explicitly set the req_access_error
to 0.
Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com> Reviewed-by: Knut Omang <knut.omang@oracle.com>
shared pd is not an IBTA defined feature, but an Oracle
Linux extension. Even though PSIF can share a pd easily,
it must comply with the Oracle ib_core implementation which
requires a new pd "object" when reusing a pd (via share_pd
verbs).
Without a new pd "object", it causes a NULL pointer deference
during pd clean-up phase. Thus, this patch creates a new pd
"object" when reusing a pd, and this pd "object" is pointing
to the original pd index.
Signed-off-by: Wei Lin Guay <wei.lin.guay@oracle.com> Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Francisco Triviño [Fri, 30 Sep 2016 08:35:14 +0000 (10:35 +0200)]
sif: eq: Add timeout to the threaded interrupt handler
This commit implements a timeout that prevents soft lockup issues when
the threaded interrupt function (sif_intr_worker) keeps processing
events for a long period. If the timeout is reached, the threaded
handler returns IRQ_HANDLED even if there are more events to be
processed. In such a case, the coalescing mechanism will generate
an IRQ for the last event.
RDS: IB: fix panic with handlers running post teardown
Shutdown cqe reaping loop takes care of emptying the
CQ's before they being destroyed. And once tasklets are
killed, the hanlders are not expected to run.
But because of core tasklet BUG, tasklet handler could
still run after tasklet_kill which lead can lead to kernel
panic. Fix for core tasklet code was proposed and accepted
upstream, but it comes with bagage of fixing quite a
few bad users of it. Also for receive, we have additional
kthread to take care.
The BUG fix done as part of Orabug 2446085, had an additional
assumption that reaping code won't reap all the CQEs after
QP moved to error state which was not correct. QP is
moved to error state as part of rdma_disconnect() and
all the CQEs are reaped by the loop properly.
Any handler running after above and trying to access the
qp/cq resources gets exposed to race conditions. Patch
fixes this race by makes sure that handlers returns
without any action post teardown.
Linus Torvalds [Thu, 13 Oct 2016 20:07:36 +0000 (13:07 -0700)]
mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
This is an ancient bug that was actually attempted to be fixed once
(badly) by me eleven years ago in commit 4ceb5db9757a ("Fix
get_user_pages() race for write access") but that was then undone due to
problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").
In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better). The
s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
software dirty bits") which made it into v3.9. Earlier kernels will
have to look at the page state itself.
Also, the VM has become more scalable, and what used a purely
theoretical race back then has become easier to trigger.
To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" rather than play racy games with FOLL_WRITE that
is very fundamental, and then use the pte dirty flag to validate that
the FOLL_COW flag is still valid.
Reported-and-tested-by: Phil "not Paul" Oester <kernel@linuxace.com> Acked-by: Hugh Dickins <hughd@google.com> Reviewed-by: Michal Hocko <mhocko@suse.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Nick Piggin <npiggin@gmail.com> Cc: Greg Thelen <gthelen@google.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 19be0eaffa3ac7d8eb6784ad9bdbc7d67ed8e619)
Orabug: 24926639
Conflicts:
include/linux/mm.h
mm/gup.c Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Vitaly Kuznetsov [Thu, 30 Jun 2016 15:56:36 +0000 (17:56 +0200)]
x86/acpi: store ACPI ids from MADT for future usage
Currently we don't save ACPI ids (unlike LAPIC ids which go to
x86_cpu_to_apicid) from MADT and we may need this information later.
Particularly, ACPI ids is the only existent way for a PVHVM Xen guest
to figure out Xen's idea of its vCPUs ids before these CPUs boot and
in some cases these ids diverge from Linux's cpu ids.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 3e9e57fad3d8530aa30787f861c710f598ddc4e7) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Filipe Manco [Thu, 15 Sep 2016 15:10:46 +0000 (17:10 +0200)]
xen-netback: fix error handling on netback_probe()
In case of error during netback_probe() (e.g. an entry missing on the
xenstore) netback_remove() is called on the new device, which will set
the device backend state to XenbusStateClosed by calling
set_backend_state(). However, the backend state wasn't initialized by
netback_probe() at this point, which will cause and invalid transaction
and set_backend_state() to BUG().
Initialize the backend state at the beginning of netback_probe() to
XenbusStateInitialising, and create two new valid state transitions on
set_backend_state(), from XenbusStateInitialising to XenbusStateClosed,
and from XenbusStateInitialising to XenbusStateInitWait.
Signed-off-by: Filipe Manco <filipe.manco@neclab.eu> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit cce94483e47e8e3d74cf4475dea33f9fd4b6ad9f) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
We pass xen_vcpu_id mapping information to hypercalls which require
uint32_t type so it would be cleaner to have it as uint32_t. The
initializer to -1 can be dropped as we always do the mapping before using
it and we never check the 'not set' value anyway.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 55467dea2967259f21f4f854fc99d39cc5fea60e) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Jan Beulich [Mon, 15 Aug 2016 15:02:38 +0000 (09:02 -0600)]
xenbus: don't look up transaction IDs for ordinary writes
This should really only be done for XS_TRANSACTION_END messages, or
else at least some of the xenstore-* tools don't work anymore.
Fixes: 0beef634b8 ("xenbus: don't BUG() on user mode induced condition") Reported-by: Richard Schütz <rschuetz@uni-koblenz.de> Cc: <stable@vger.kernel.org> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Richard Schütz <rschuetz@uni-koblenz.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit 9a035a40f7f3f6708b79224b86c5777a3334f7ea) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
Bob Liu [Wed, 27 Jul 2016 09:42:04 +0000 (17:42 +0800)]
xen-blkfront: free resources if xlvbd_alloc_gendisk fails
Current code forgets to free resources in the failure path of
xlvbd_alloc_gendisk(), this patch fix it.
Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
(cherry picked from commit 4e876c2bd37fbb5c37a4554a79cf979d486f0e82) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937
xen: add static initialization of steal_clock op to xen_time_ops
pv_time_ops might be overwritten with xen_time_ops after the
steal_clock operation has been initialized already. To prevent calling
a now uninitialized function pointer add the steal_clock static
initialization to xen_time_ops.
Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
(cherry picked from commit d34c30cc1fa80f509500ff192ea6bc7d30671061) Signed-off-by: Bob Liu <bob.liu@oracle.com>
Orabug: 24820937