Rafael J. Wysocki [Mon, 3 Jul 2017 12:22:05 +0000 (14:22 +0200)]
Merge branches 'pm-core', 'pm-opp' and 'pm-qos'
* pm-core:
PM / sysfs: Constify attribute groups
PM: Constify info string used in messages
PM: Constify returned PM event name
PM / wakeirq: Convert to SRCU
* pm-opp:
PM / OPP: Add dev_pm_opp_{set|put}_clkname()
PM / OPP: Use - instead of @ for DT entries
PM / OPP: Don't create debugfs "supply-0" directory unnecessarily
PM / OPP: opp-microvolt is not optional if regulators are set
PM / OPP: Don't create copy of regulators unnecessarily
PM / OPP: Reorganize _generic_set_opp_regulator()
Rafael J. Wysocki [Mon, 3 Jul 2017 12:21:33 +0000 (14:21 +0200)]
Merge branch 'pm-sleep'
* pm-sleep:
PM: hibernate: constify attribute_group structures.
PM / hibernate: Drop redundant parameter of swsusp_alloc()
PM / hibernate: Use CONFIG_HAVE_SET_MEMORY for include condition
x86/power/64: Use char arrays for asm function names
Rafael J. Wysocki [Mon, 3 Jul 2017 12:21:18 +0000 (14:21 +0200)]
Merge branches 'pm-cpufreq', 'intel_pstate' and 'pm-cpuidle'
* pm-cpufreq:
cpufreq / CPPC: Initialize policy->min to lowest nonlinear performance
cpufreq: sfi: make freq_table static
cpufreq: exynos5440: Fix inconsistent indenting
cpufreq: imx6q: imx6ull should use the same flow as imx6ul
cpufreq: dt: Add support for hi3660
* intel_pstate:
cpufreq: Update scaling_cur_freq documentation
cpufreq: intel_pstate: Clean up after performance governor changes
intel_pstate: skip scheduler hook when in "performance" mode
intel_pstate: delete scheduler hook in HWP mode
x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF
cpufreq: intel_pstate: Remove max/min fractions to limit performance
x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"
* pm-cpuidle:
cpuidle: menu: allow state 0 to be disabled
intel_idle: Use more common logging style
x86/ACPI/cstate: Allow ACPI C1 FFH MWAIT use on AMD systems
ARM: cpuidle: Support asymmetric idle definition
Rafael J. Wysocki [Mon, 3 Jul 2017 12:17:16 +0000 (14:17 +0200)]
Merge branch 'pm-tools'
* pm-tools:
cpupower: Add support for new AMD family 0x17
cpupower: Fix bug where return value was not used
tools/power turbostat: update version number
tools/power turbostat: decode MSR_IA32_MISC_ENABLE only on Intel
tools/power turbostat: stop migrating, unless '-m'
tools/power turbostat: if --debug, print sampling overhead
tools/power turbostat: hide SKL counters, when not requested
intel_pstate: use updated msr-index.h HWP.EPP values
tools/power x86_energy_perf_policy: support HWP.EPP
x86: msr-index.h: fix shifts to ULL results in HWP macros.
x86: msr-index.h: define HWP.EPP values
x86: msr-index.h: define EPB mid-points
Rafael J. Wysocki [Wed, 28 Jun 2017 23:49:44 +0000 (01:49 +0200)]
cpufreq: Update scaling_cur_freq documentation
Commit f8475cef9008 "x86: use common aperfmperf_khz_on_cpu() to
calculate KHz using APERF/MPERF" modified the way the scaling_cur_freq
cpufreq policy attribute in sysfs is handled on contemporary
Intel-based x86 systems, so update the documentation to reflect
that change.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Wed, 28 Jun 2017 23:47:56 +0000 (01:47 +0200)]
cpufreq: intel_pstate: Clean up after performance governor changes
After commit 82b4e03e01bc (intel_pstate: skip scheduler hook when in
"performance" mode) get_target_pstate_use_performance() and
get_target_pstate_use_cpu_load() are never called if scaling_governor
is "performance", so drop the CPUFREQ_POLICY_PERFORMANCE checks from
them as they will never trigger anyway.
Moreover, the documentation needs to be updated to reflect the change
made by the above commit, so do that too.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work with const
attribute_group. So mark the non-const structs as const.
File size before:
text data bss dec hex filename
6332 488 308 7128 1bd8 kernel/power/hibernate.o
File size After adding 'const':
text data bss dec hex filename
6396 424 308 7128 1bd8 kernel/power/hibernate.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Nicholas Piggin [Mon, 26 Jun 2017 05:38:15 +0000 (15:38 +1000)]
cpuidle: menu: allow state 0 to be disabled
The menu driver does not allow state0 to be disabled completely.
If it is disabled but other enabled states don't meet latency
requirements, it is still used.
Fix this by starting with the first enabled idle state. Fall back
to state 0 if no idle states are enabled (arguably this should be
-EINVAL if it is attempted, but this is the minimal fix).
Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Joe Perches [Fri, 9 Jun 2017 19:29:20 +0000 (12:29 -0700)]
intel_idle: Use more common logging style
Remove #define PREFIX and add #define pr_fmt to use more common logging.
Miscellanea:
o Add missing newline to format
o Convert a single printk without KERN_<LEVEL> to pr_info
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit fc5cbf0c94b6 (PM / Domains: Support for multiple states) split
out some code out of default_power_down_ok() function so the
documentation has to be moved to appropriate place.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Krzysztof Kozlowski [Wed, 28 Jun 2017 14:56:17 +0000 (16:56 +0200)]
PM / Domains: Handle safely genpd_syscore_switch() call on non-genpd device
genpd_syscore_switch() had two problems:
1. It silently assumed that device, it is being called for, belongs
to generic power domain and used container_of() on its power
domain pointer. Such assumption might not be true always.
2. It iterated over list of generic power domains without holding
gpd_list_lock mutex thus list could have been modified at the same
time.
Usage of genpd_lookup_dev() solves both problems as it is safe a call
for non-generic power domains and uses mutex when iterating.
Reported-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Mikko Perttunen [Thu, 22 Jun 2017 07:18:33 +0000 (10:18 +0300)]
PM / Domains: Call driver's noirq callbacks
Currently genpd installs its own noirq callbacks, but never calls down
to the driver's corresponding callbacks. Add these calls.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Arvind Yadav [Thu, 22 Jun 2017 10:53:32 +0000 (16:23 +0530)]
PM / QoS: constify *_attribute_group.
File size before:
text data bss dec hex filename
3890 1152 8 5050 13ba drivers/base/power/sysfs.o
File size After adding 'const':
text data bss dec hex filename
4250 800 8 5058 13c2 drivers/base/power/sysfs.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
David Wu [Fri, 9 Jun 2017 09:36:14 +0000 (17:36 +0800)]
PM / AVS: rockchip-io: add io selectors and supplies for rk3228
This adds the necessary data for handling io voltage domains on the rk3228.
Signed-off-by: David Wu <david.wu@rock-chips.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Krzysztof Kozlowski [Mon, 12 Jun 2017 15:17:41 +0000 (17:17 +0200)]
PM / Domains: Constify genpd pointer
Mark pointer to struct generic_pm_domain const (either passed in
argument or used localy in a function), whenever it is not modifed by
the function itself.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Prakash, Prashanth [Thu, 11 May 2017 22:39:44 +0000 (16:39 -0600)]
cpufreq / CPPC: Initialize policy->min to lowest nonlinear performance
Description of Lowest Perfomance in ACPI 6.1 specification states:
"Lowest Performance is the absolute lowest performance level of
the platform. Selecting a performance level lower than the lowest
nonlinear performance level may actually cause an efficiency penalty,
but should reduce the instantaneous power consumption of the processor.
In traditional terms, this represents the T-state range of performance
levels."
Set the default value of policy->min to Lowest Nonlinear Performance
to avoid any potential efficiency penalty.
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Alexey Klimov <alexey.klimov@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Balbir Singh [Sat, 3 Jun 2017 10:52:32 +0000 (20:52 +1000)]
PM / hibernate: Use CONFIG_HAVE_SET_MEMORY for include condition
Kbuild reported a build failure when CONFIG_STRICT_KERNEL_RWX was
enabled on powerpc. We don't yet have ARCH_HAS_SET_MEMORY and ppc32
saw a build failure.
I've only done a basic compile test with a config that has
hibernation enabled.
Fixes: 50327ddfbc92 (kernel/power/snapshot.c: use set_memory.h header) Reported-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Yazen Ghannam [Wed, 7 Jun 2017 15:19:46 +0000 (10:19 -0500)]
x86/ACPI/cstate: Allow ACPI C1 FFH MWAIT use on AMD systems
AMD systems support the Monitor/Mwait instructions and these can be used
for ACPI C1 in the same way as on Intel systems.
Three things are needed:
1) This patch.
2) BIOS that declares a C1 state in _CST to use FFH, with correct values.
3) CPUID_Fn00000005_EDX is non-zero on the system.
The BIOS on AMD systems have historically not defined a C1 state in _CST,
so the acpi_idle driver uses HALT for ACPI C1.
Currently released systems have CPUID_Fn00000005_EDX as reserved/RAZ. If a
BIOS is released for these systems that requests a C1 state with FFH, the
FFH implementation in Linux will fail since CPUID_Fn00000005_EDX is 0. The
acpi_idle driver will then fallback to using HALT for ACPI C1.
Future systems are expected to have non-zero CPUID_Fn00000005_EDX and BIOS
support for using FFH for ACPI C1.
Allow ffh_cstate_init() to succeed on AMD systems.
Tested on Fam15h and Fam17h systems.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Thomas Gleixner [Sun, 25 Jun 2017 17:31:13 +0000 (19:31 +0200)]
PM / wakeirq: Convert to SRCU
The wakeirq infrastructure uses RCU to protect the list of wakeirqs. That
breaks the irq bus locking infrastructure, which is allows sleeping
functions to be called so interrupt controllers behind slow busses,
e.g. i2c, can be handled.
The wakeirq functions hold rcu_read_lock and call into irq functions, which
in case of interrupts using the irq bus locking will trigger a
might_sleep() splat.
Convert the wakeirq infrastructure to Sleepable RCU and unbreak it.
Fixes: 4990d4fe327b (PM / Wakeirq: Add automated device wake IRQ handling) Reported-by: Brian Norris <briannorris@chromium.org> Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Brian Norris <briannorris@chromium.org> Cc: 4.2+ <stable@vger.kernel.org> # 4.2+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Len Brown [Sat, 24 Jun 2017 05:11:54 +0000 (22:11 -0700)]
intel_pstate: skip scheduler hook when in "performance" mode
When the governor is set to "performance", intel_pstate does not
need the scheduler hook for doing any calculations. Under these
conditions, its only purpose is to continue to maintain
cpufreq/scaling_cur_freq.
The cpufreq/scaling_cur_freq sysfs attribute is now provided by
shared x86 cpufreq code on modern x86 systems, including
all systems supported by the intel_pstate driver.
So in "performance" governor mode, the scheduler hook can be skipped.
This applies to both in Software and Hardware P-state control modes.
Suggested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Len Brown [Sat, 24 Jun 2017 05:11:53 +0000 (22:11 -0700)]
intel_pstate: delete scheduler hook in HWP mode
The cpufreq/scaling_cur_freq sysfs attribute is now provided by
shared x86 cpufreq code on modern x86 systems, including
all systems supported by the intel_pstate driver.
In HWP mode, maintaining that value was the sole purpose of
the scheduler hook, intel_pstate_update_util_hwp(),
so it can now be removed.
Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Len Brown [Sat, 24 Jun 2017 05:11:52 +0000 (22:11 -0700)]
x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF
The goal of this change is to give users a uniform and meaningful
result when they read /sys/...cpufreq/scaling_cur_freq
on modern x86 hardware, as compared to what they get today.
Modern x86 processors include the hardware needed
to accurately calculate frequency over an interval --
APERF, MPERF, and the TSC.
Here we provide an x86 routine to make this calculation
on supported hardware, and use it in preference to any
driver driver-specific cpufreq_driver.get() routine.
MHz is computed like so:
MHz = base_MHz * delta_APERF / delta_MPERF
MHz is the average frequency of the busy processor
over a measurement interval. The interval is
defined to be the time between successive invocations
of aperfmperf_khz_on_cpu(), which are expected to to
happen on-demand when users read sysfs attribute
cpufreq/scaling_cur_freq.
As with previous methods of calculating MHz,
idle time is excluded.
base_MHz above is from TSC calibration global "cpu_khz".
This x86 native method to calculate MHz returns a meaningful result
no matter if P-states are controlled by hardware or firmware
and/or if the Linux cpufreq sub-system is or is-not installed.
When this routine is invoked more frequently, the measurement
interval becomes shorter. However, the code limits re-computation
to 10ms intervals so that average frequency remains meaningful.
Discerning users are encouraged to take advantage of
the turbostat(8) utility, which can gracefully handle
concurrent measurement intervals of arbitrary length.
Signed-off-by: Len Brown <len.brown@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sherry Hurwitz [Tue, 20 Jun 2017 07:08:42 +0000 (02:08 -0500)]
cpupower: Add support for new AMD family 0x17
Add support for new AMD family 0x17
- Add bit field changes to the msr_pstate structure
- Add the new formula for the calculation of cof
- Changed method to access to CpbDis
Signed-off-by: Sherry Hurwitz <sherry.hurwitz@amd.com> Acked-by: Thomas Renninger <trenn@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sherry Hurwitz [Tue, 20 Jun 2017 07:07:37 +0000 (02:07 -0500)]
cpupower: Fix bug where return value was not used
Save return value from amd_pci_get_num_boost_states
and remove redundant setting of *support
Signed-off-by: Sherry Hurwitz <sherry.hurwitz@amd.com> Reviewed-by: Thomas Renninger <trenn@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Mon, 26 Jun 2017 23:42:28 +0000 (01:42 +0200)]
Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat fixes from Len Brown.
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: update version number
tools/power turbostat: decode MSR_IA32_MISC_ENABLE only on Intel
tools/power turbostat: stop migrating, unless '-m'
tools/power turbostat: if --debug, print sampling overhead
tools/power turbostat: hide SKL counters, when not requested
Linus Torvalds [Sun, 25 Jun 2017 18:59:19 +0000 (11:59 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"A few fixes for timekeeping and timers:
- Plug a subtle race due to a missing READ_ONCE() in the timekeeping
code where reloading of a pointer results in an inconsistent
callback argument being supplied to the clocksource->read function.
- Correct the CLOCK_MONOTONIC_RAW sub-nanosecond accounting in the
time keeping core code, to prevent a possible discontuity.
- Apply a similar fix to the arm64 vdso clock_gettime()
implementation
- Add missing includes to clocksource drivers, which relied on
indirect includes which fails in certain configs.
- Use the proper iomem pointer for read/iounmap in a probe function"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW
time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting
time: Fix clock->read(clock) race around clocksource changes
clocksource: Explicitly include linux/clocksource.h when needed
clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect variable
Linus Torvalds [Sun, 25 Jun 2017 18:55:21 +0000 (11:55 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"Three fixlets for perf:
- Return the proper error code if aux buffers for a event are not
supported.
- Calculate the probe offset for inlined functions correctly
- Update the Skylake DTLB load/store miss event so it can count 1G
TLB entries as well"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf probe: Fix probe definition for inlined functions
perf/x86/intel: Add 1G DTLB load/store miss support for SKL
perf/aux: Correct return code of rb_alloc_aux() if !has_aux(ev)
Linus Torvalds [Sun, 25 Jun 2017 17:39:43 +0000 (10:39 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- a quirk to i8042 to ignore timeout bit on Lifebook AH544
- a fixup to Synaptics RMI function 54 that was breaking some Dells
- a fix for memory leak in soc_button_array driver
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - only read the F54 query registers which are used
Input: i8042 - add Fujitsu Lifebook AH544 to notimeout list
Input: soc_button_array - fix leaking the ACPI button descriptor buffer
Pull SCSI target fixes from Nicholas Bellinger:
"Here are the target-pending fixes for v4.12-rc7 that have been queued
up for the last 2 weeks. This includes:
- Fix a TMR related kref underflow detected by the recent refcount_t
conversion in upstream.
- Fix a iscsi-target corner case during explicit connection logout
timeout failure.
- Address last fallout in iscsi-target immediate data handling from
v4.4 target-core now allowing control CDB payload underflow"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
iscsi-target: Reject immediate data underflow larger than SCSI transfer length
iscsi-target: Fix delayed logout processing greater than SECONDS_FOR_LOGOUT_COMP
target: Fix kref->refcount underflow in transport_cmd_finish_abort
Linus Torvalds [Sat, 24 Jun 2017 23:18:00 +0000 (16:18 -0700)]
Merge tag 'kbuild-fixes-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
"Nothing scary, just some random fixes:
- fix warnings of host programs
- fix "make tags" when COMPILED_SOURCE=1 is specified along with O=
- clarify help message of C=1 option
- fix dependency for ncurses compatibility check
- fix "make headers_install" for fakechroot environment"
* tag 'kbuild-fixes-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: fix sparse warnings in nconfig
kbuild: fix header installation under fakechroot environment
kconfig: Check for libncurses before menuconfig
Kbuild: tiny correction on `make help`
tags: honor COMPILED_SOURCE with apart output directory
genksyms: add printf format attribute to error_with_pos()
Linus Torvalds [Sat, 24 Jun 2017 09:24:53 +0000 (02:24 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull timer fix from Eric Biederman:
"This fixes an issue of confusing injected signals with the signals
from posix timers that has existed since posix timers have been in the
kernel.
This patch is slightly simpler than my earlier version of this patch
as I discovered in testing that I had misspelled "#ifdef
CONFIG_POSIX_TIMERS". So I deleted that unnecessary test and made
setting of resched_timer uncondtional.
I have tested this and verified that without this patch there is a
nasty hang that is easy to trigger, and with this patch everything
works properly"
Thomas Gleixner dixit:
"It fixes the problem at hand and covers the ptrace case as well, which
I missed.
Reviewed-and-tested-by: Thomas Gleixner <tglx@linutronix.de>"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Only reschedule timers on signals timers have sent
Thomas Gleixner [Fri, 23 Jun 2017 08:50:38 +0000 (10:50 +0200)]
x86/mshyperv: Remove excess #includes from mshyperv.h
A recent commit included linux/slab.h in linux/irq.h. This breaks the build
of vdso32 on a 64-bit kernel.
The reason is that linux/irq.h gets included into the vdso code via
linux/interrupt.h which is included from asm/mshyperv.h. That makes the
32-bit vdso compile fail, because slab.h includes the pgtable headers for
64-bit on a 64-bit build.
Neither linux/clocksource.h nor linux/interrupt.h are needed in the
mshyperv.h header file itself - it has a dependency on <linux/atomic.h>.
Remove the includes and unbreak the build.
Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: devel@linuxdriverproject.org Fixes: dee863b571b0 ("hv: export current Hyper-V clocksource") Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1706231038460.2647@nanos Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Sat, 24 Jun 2017 00:53:16 +0000 (17:53 -0700)]
Merge tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 4.12. Most of these actually came in last
week but got held up for some more testing.
- three fixes for kprobes/ftrace/livepatch interactions.
- properly handle data breakpoints when using the Radix MMU.
- fix for perf sampling of registers during call_usermodehelper().
- properly initialise the thread_info on our emergency stacks
- add an explicit flush when doing TLB invalidations for a process
using NPU2.
Thanks to: Alistair Popple, Naveen N. Rao, Nicholas Piggin, Ravi
Bangoria, Masami Hiramatsu"
* tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64: Initialise thread_info for emergency stacks
powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD
powerpc/perf: Fix oops when kthread execs user process
powerpc/64s: Handle data breakpoints in Radix mode
powerpc/kprobes: Skip livepatch_handler() for jprobes
powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS
powerpc/kprobes: Pause function_graph tracing during jprobes handling
Linus Torvalds [Sat, 24 Jun 2017 00:49:12 +0000 (17:49 -0700)]
Merge tag 'acpi-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"This fixes the ACPI-based enumeration of some I2C and SPI devices
broken in 4.11.
Specifics:
- I2C and SPI devices are expected to be enumerated by the I2C and
SPI subsystems, respectively, but due to a change made during the
4.11 cycle, in some cases the ACPI core marks them as already
enumerated which causes the I2C and SPI subsystems to overlook
them, so fix that (Jarkko Nikula)"
* tag 'acpi-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / scan: Fix enumeration for special SPI and I2C devices
Linus Torvalds [Sat, 24 Jun 2017 00:40:41 +0000 (17:40 -0700)]
Merge tag 'gpio-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fix from Linus Walleij:
"A single GPIO patch fixing the compatible string for the MVEBU PWM
controller embedded in the GPIO controller before we release v4.12.
Hopefully"
* tag 'gpio-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: mvebu: change compatible string for PWM support
Linus Torvalds [Sat, 24 Jun 2017 00:37:56 +0000 (17:37 -0700)]
Merge tag 'sound-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Nothing exciting here, just a few stable fixes:
- suppress spurious kernel WARNING in PCM core
- fix potential spin deadlock at error handling in firewire
- HD-audio PCI ID addition / fixup"
* tag 'sound-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Apply quirks to Broxton-T, too
ALSA: firewire-lib: Fix stall of process context at packet error
ALSA: pcm: Don't treat NULL chmap as a fatal error
ALSA: hda - Add Coffelake PCI ID
Linus Torvalds [Sat, 24 Jun 2017 00:35:57 +0000 (17:35 -0700)]
Merge tag 'drm-fixes-for-v4.12-rc7' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"A varied bunch of fixes, one for an API regression with connectors.
Otherwise amdgpu and i915 have a bunch of varied fixes, the shrinker
ones being the most important"
* tag 'drm-fixes-for-v4.12-rc7' of git://people.freedesktop.org/~airlied/linux:
drm: Fix GETCONNECTOR regression
drm/radeon: add a quirk for Toshiba Satellite L20-183
drm/radeon: add a PX quirk for another K53TK variant
drm/amdgpu: adjust default display clock
drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating
drm/amdgpu: add Polaris12 DID
drm/i915: Don't enable backlight at setup time.
drm/i915: Plumb the correct acquire ctx into intel_crtc_disable_noatomic()
drm/i915: Fix deadlock witha the pipe A quirk during resume
drm/i915: Remove __GFP_NORETRY from our buffer allocator
drm/i915: Encourage our shrinker more when our shmemfs allocations fails
drm/i915: Differentiate between sw write location into ring and last hw read
Linus Torvalds [Sat, 24 Jun 2017 00:32:05 +0000 (17:32 -0700)]
Merge tag 'for-4.12/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- a revert of a DM mirror commit that has proven to make the code prone
to crash
- a DM io reference count fix that resolves a NULL pointer seen when
issuing discards to a DM mirror target's device whose mirror legs do
not all support discards
- a couple DM integrity fixes
* tag 'for-4.12/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm io: fix duplicate bio completion due to missing ref count
dm integrity: fix to not disable/enable interrupts from interrupt context
Revert "dm mirror: use all available legs on multiple failures"
dm integrity: reject mappings too large for device
Daniel Lezcano [Mon, 12 Jun 2017 15:55:10 +0000 (17:55 +0200)]
ARM: cpuidle: Support asymmetric idle definition
Some hardware have clusters with different idle states. The current code does
not support this and fails as it expects all the idle states to be identical.
Because of this, the Mediatek mtk8173 had to create the same idle state for a
big.Little system and now the Hisilicon 960 is facing the same situation.
Solve this by simply assuming the multiple driver will be needed for all the
platforms using the ARM generic cpuidle driver which makes sense because of the
different topologies we can support with a single kernel for ARM32 or ARM64.
Every CPU has its own driver, so every single CPU can specify in the DT the
idle states.
This simple approach allows to support the future dynamIQ system, current SMP
and HMP.
Tested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Srinivas Pandruvada [Mon, 12 Jun 2017 23:30:27 +0000 (16:30 -0700)]
cpufreq: intel_pstate: Remove max/min fractions to limit performance
In the current model the max/min perf limits are a fraction of current
user space limits to the allowed max_freq or 100% for global limits.
This results in wrong ratio limits calculation because of rounding
issues for some user space limits.
Initially we tried to solve this issue by issue by having more shift
bits to increase precision. Still there are isolated cases where we still
have error.
This can be avoided by using ratios all together. Since the way we get
cpuinfo.max_freq is by multiplying scaling factor to max ratio, we can
easily keep the max/min ratios in terms of ratios and not fractions.
For example:
if the max ratio = 36
cpuinfo.max_freq = 36 * 100000 = 3600000
Suppose user space sets a limit of 1200000, then we can calculate
max ratio limit as
= 36 * 1200000 / 3600000
= 12
This will be correct for any user limits.
The other advantage is that, we don't need to do any calculation in the
fast path as ratio limit is already calculated via set_policy() callback.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Len Brown [Sat, 17 Jun 2017 03:03:11 +0000 (20:03 -0700)]
x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"
cpufreq_quick_get() allows cpufreq drivers to over-ride cpu_khz
that is otherwise reported in x86 /proc/cpuinfo "cpu MHz".
There are four problems with this scheme,
any of them is sufficient justification to delete it.
1. Depending on which cpufreq driver is loaded, the behavior
of this field is different.
2. Distros complain that they have to explain to users
why and how this field changes. Distros have requested a constant.
3. The two major providers of this information, acpi_cpufreq
and intel_pstate, both "get it wrong" in different ways.
acpi_cpufreq lies to the user by telling them that
they are running at whatever frequency was last
requested by software.
intel_pstate lies to the user by telling them that
they are running at the average frequency computed
over an undefined measurement. But an average computed
over an undefined interval, is itself, undefined...
4. On modern processors, user space utilities, such as
turbostat(1), are more accurate and more precise, while
supporing concurrent measurement over arbitrary intervals.
Users who have been consulting /proc/cpuinfo to
track changing CPU frequency will be dissapointed that
it no longer wiggles -- perhaps being unaware of the
limitations of the information they have been consuming.
Yes, they can change their scripts to look in sysfs
cpufreq/scaling_cur_frequency. Here they will find the same
data of dubious quality here removed from /proc/cpuinfo.
The value in sysfs will be addressed in a subsequent patch
to address issues 1-3, above.
Issue 4 will remain -- users that really care about
accurate frequency information should not be using either
proc or sysfs kernel interfaces.
They should be using using turbostat(8), or a similar
purpose-built analysis tool.
Signed-off-by: Len Brown <len.brown@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Colin Ian King [Thu, 15 Jun 2017 09:55:59 +0000 (10:55 +0100)]
cpufreq: sfi: make freq_table static
pointer freq_table can be made static as it does not need to be in
global scope.
Cleans up sparse warning:
"symbol 'freq_table' was not declared. Should it be static?"
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Wed, 21 Jun 2017 04:59:13 +0000 (10:29 +0530)]
PM / OPP: Add dev_pm_opp_{set|put}_clkname()
In order to support OPP switching, OPP layer needs to get pointer to the
clock for the device. Simple cases work fine without using the routines
added by this patch (i.e. by passing connection-id as NULL), but for a
device with multiple clocks available, the OPP core needs to know the
exact name of the clk to use.
Add a new set of APIs to get that done.
Tested-by: Rajendra Nayak <rnayak@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Krzysztof Kozlowski [Wed, 7 Jun 2017 18:13:24 +0000 (20:13 +0200)]
cpufreq: exynos5440: Fix inconsistent indenting
Fix inconsistent indenting and unneeded white space in assignment.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tao Wang [Tue, 23 May 2017 08:13:18 +0000 (16:13 +0800)]
cpufreq: dt: Add support for hi3660
Add the compatible string for supporting the generic device tree cpufreq-dt
driver on Hisilicon's 3660 SoC.
Signed-off-by: Tao Wang <kevin.wangtao@hisilicon.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Linus Torvalds [Fri, 23 Jun 2017 23:30:52 +0000 (16:30 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"8 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
fs/exec.c: account for argv/envp pointers
ocfs2: fix deadlock caused by recursive locking in xattr
slub: make sysfs file removal asynchronous
lib/cmdline.c: fix get_options() overflow while parsing ranges
fs/dax.c: fix inefficiency in dax_writeback_mapping_range()
autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL
mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings
mm, thp: remove cond_resched from __collapse_huge_page_copy
Viresh Kumar [Thu, 22 Jun 2017 03:45:11 +0000 (09:15 +0530)]
PM / OPP: Use - instead of @ for DT entries
Compiling the DT file with W=1, DTC warns like follows:
Warning (unit_address_vs_reg): Node /opp_table0/opp@1000000000 has a
unit name, but no reg property
Fix this by replacing '@' with '-' as the OPP nodes will never have a
"reg" property.
Reported-by: Krzysztof Kozlowski <krzk@kernel.org> Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Kees Cook [Fri, 23 Jun 2017 22:08:57 +0000 (15:08 -0700)]
fs/exec.c: account for argv/envp pointers
When limiting the argv/envp strings during exec to 1/4 of the stack limit,
the storage of the pointers to the strings was not included. This means
that an exec with huge numbers of tiny strings could eat 1/4 of the stack
limit in strings and then additional space would be later used by the
pointers to the strings.
For example, on 32-bit with a 8MB stack rlimit, an exec with 1677721
single-byte strings would consume less than 2MB of stack, the max (8MB /
4) amount allowed, but the pointers to the strings would consume the
remaining additional stack space (1677721 * 4 == 6710884).
The result (1677721 + 6710884 == 8388605) would exhaust stack space
entirely. Controlling this stack exhaustion could result in
pathological behavior in setuid binaries (CVE-2017-1000365).
[akpm@linux-foundation.org: additional commenting from Kees] Fixes: b6a2fea39318 ("mm: variable length argument support") Link: http://lkml.kernel.org/r/20170622001720.GA32173@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Ren [Fri, 23 Jun 2017 22:08:55 +0000 (15:08 -0700)]
ocfs2: fix deadlock caused by recursive locking in xattr
Another deadlock path caused by recursive locking is reported. This
kind of issue was introduced since commit 743b5f1434f5 ("ocfs2: take
inode lock in ocfs2_iop_set/get_acl()"). Two deadlock paths have been
fixed by commit b891fa5024a9 ("ocfs2: fix deadlock issue when taking
inode lock at vfs entry points"). Yes, we intend to fix this kind of
case in incremental way, because it's hard to find out all possible
paths at once.
This one can be reproduced like this. On node1, cp a large file from
home directory to ocfs2 mountpoint. While on node2, run
setfacl/getfacl. Both nodes will hang up there. The backtraces:
Fix this one by using ocfs2_inode_{lock|unlock}_tracker, which is
exported by commit 439a36b8ef38 ("ocfs2/dlmglue: prepare tracking logic
to avoid recursive cluster lock").
Link: http://lkml.kernel.org/r/20170622014746.5815-1-zren@suse.com Fixes: 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") Signed-off-by: Eric Ren <zren@suse.com> Reported-by: Thomas Voegtle <tv@lio96.de> Tested-by: Thomas Voegtle <tv@lio96.de> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tejun Heo [Fri, 23 Jun 2017 22:08:52 +0000 (15:08 -0700)]
slub: make sysfs file removal asynchronous
Commit bf5eb3de3847 ("slub: separate out sysfs_slab_release() from
sysfs_slab_remove()") made slub sysfs file removals synchronous to
kmem_cache shutdown.
Unfortunately, this created a possible ABBA deadlock between slab_mutex
and sysfs draining mechanism triggering the following lockdep warning.
======================================================
[ INFO: possible circular locking dependency detected ]
4.10.0-test+ #48 Not tainted
-------------------------------------------------------
rmmod/1211 is trying to acquire lock:
(s_active#120){++++.+}, at: [<ffffffff81308073>] kernfs_remove+0x23/0x40
but task is already holding lock:
(slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
It'd be the cleanest to deal with the issue by removing sysfs files
without holding slab_mutex before the rest of shutdown; however, given
the current code structure, it is pretty difficult to do so.
This patch punts sysfs file removal to a work item. Before commit bf5eb3de3847, the removal was punted to a RCU delayed work item which is
executed after release. Now, we're punting to a different work item on
shutdown which still maintains the goal removing the sysfs files earlier
when destroying kmem_caches.
Link: http://lkml.kernel.org/r/20170620204512.GI21326@htj.duckdns.org Fixes: bf5eb3de3847 ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()") Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ilya Matveychikov [Fri, 23 Jun 2017 22:08:49 +0000 (15:08 -0700)]
lib/cmdline.c: fix get_options() overflow while parsing ranges
When using get_options() it's possible to specify a range of numbers,
like 1-100500. The problem is that it doesn't track array size while
calling internally to get_range() which iterates over the range and
fills the memory with numbers.
Link: http://lkml.kernel.org/r/2613C75C-B04D-4BFF-82A6-12F97BA0F620@gmail.com Signed-off-by: Ilya V. Matveychikov <matvejchikov@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Fri, 23 Jun 2017 22:08:46 +0000 (15:08 -0700)]
fs/dax.c: fix inefficiency in dax_writeback_mapping_range()
dax_writeback_mapping_range() fails to update iteration index when
searching radix tree for entries needing cache flushing. Thus each
pagevec worth of entries is searched starting from the start which is
inefficient and prone to livelocks. Update index properly.
Link: http://lkml.kernel.org/r/20170619124531.21491-1-jack@suse.cz Fixes: 9973c98ecfda3 ("dax: add support for fsync/sync") Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ard Biesheuvel [Fri, 23 Jun 2017 22:08:41 +0000 (15:08 -0700)]
mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings
Existing code that uses vmalloc_to_page() may assume that any address
for which is_vmalloc_addr() returns true may be passed into
vmalloc_to_page() to retrieve the associated struct page.
This is not un unreasonable assumption to make, but on architectures
that have CONFIG_HAVE_ARCH_HUGE_VMAP=y, it no longer holds, and we need
to ensure that vmalloc_to_page() does not go off into the weeds trying
to dereference huge PUDs or PMDs as table entries.
Given that vmalloc() and vmap() themselves never create huge mappings or
deal with compound pages at all, there is no correct answer in this
case, so return NULL instead, and issue a warning.
When reading /proc/kcore on arm64, you will hit an oops as soon as you
hit the huge mappings used for the various segments that make up the
mapping of vmlinux. With this patch applied, you will no longer hit the
oops, but the kcore contents willl be incorrect (these regions will be
zeroed out)
We are fixing this for kcore specifically, so it avoids vread() for
those regions. At least one other problematic user exists, i.e.,
/dev/kmem, but that is currently broken on arm64 for other reasons.
Link: http://lkml.kernel.org/r/20170609082226.26152-1-ard.biesheuvel@linaro.org Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Laura Abbott <labbott@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: zhong jiang <zhongjiang@huawei.com> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Fri, 23 Jun 2017 22:08:38 +0000 (15:08 -0700)]
mm, thp: remove cond_resched from __collapse_huge_page_copy
This is a partial revert of commit 338a16ba1549 ("mm, thp: copying user
pages must schedule on collapse") which added a cond_resched() to
__collapse_huge_page_copy().
On x86 with CONFIG_HIGHPTE, __collapse_huge_page_copy is called in
atomic context and thus scheduling is not possible. This is only a
possible config on arm and i386.
Although need_resched has been shown to be set for over 100 jiffies
while doing the iteration in __collapse_huge_page_copy, this is better
than doing
if (in_atomic())
cond_resched()
to cover only non-CONFIG_HIGHPTE configs.
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1706191341550.97821@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Tested-by: Larry Finger <Larry.Finger@lwfinger.net> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 23 Jun 2017 19:25:37 +0000 (12:25 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two fixes to remove spurious WARN_ONs from the new(ish) qedi driver.
The driver already prints a warning message, there's no need to panic
users by printing something that looks like an oops as well"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qedi: Remove WARN_ON from clear task context.
scsi: qedi: Remove WARN_ON for untracked cleanup.
Linus Torvalds [Fri, 23 Jun 2017 19:23:06 +0000 (12:23 -0700)]
Merge tag 'xfs-4.12-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"I have one more bugfix for you for 4.12-rc7 to fix a disk corruption
problem:
- don't allow swapon on files on the realtime device, because the
swap code will swap pages out to blocks on the data device, thereby
corrupting the filesystem"
* tag 'xfs-4.12-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: don't allow bmap on rt files
Andrew Duggan [Fri, 23 Jun 2017 07:04:51 +0000 (00:04 -0700)]
Input: synaptics-rmi4 - only read the F54 query registers which are used
The F54 driver is currently only using the first 6 bytes of F54 so there is
no need to read all 27 bytes. Some Dell systems (Dell XP13 9333 and
similar) have an issue with the touchpad or I2C bus when reading reports
larger then 16 bytes. Reads larger then 16 bytes are reported in two HID
reports. Something about the back to back reports seems to cause the next
read to report incorrect data. This results in F30 failing to load and the
click button failing to work.
Previous issues with the I2C controller or touchpad were addressed in:
commit 5b65c2a02966 ("HID: rmi: check sanity of the incoming report")
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=195949 Signed-off-by: Andrew Duggan <aduggan@synaptics.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Reviewed-by: Nick Dyer <nick@shmanahar.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Nicholas Piggin [Wed, 21 Jun 2017 05:58:29 +0000 (15:58 +1000)]
powerpc/64: Initialise thread_info for emergency stacks
Emergency stacks have their thread_info mostly uninitialised, which in
particular means garbage preempt_count values.
Emergency stack code runs with interrupts disabled entirely, and is
used very rarely, so this has been unnoticed so far. It was found by a
proposed new powerpc watchdog that takes a soft-NMI directly from the
masked_interrupt handler and using the emergency stack. That crashed
at BUG_ON(in_nmi()) in nmi_enter(). preempt_count()s were found to be
garbage.
To fix this, zero the entire THREAD_SIZE allocation, and initialize
the thread_info.
Cc: stable@vger.kernel.org Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Move it all into setup_64.c, use a function not a macro. Fix
crashes on Cell by setting preempt_count to 0 not HARDIRQ_OFFSET] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Dave Airlie [Fri, 23 Jun 2017 01:44:51 +0000 (11:44 +1000)]
Merge tag 'drm-misc-fixes-2017-06-22' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
UAPI Changes:
- drm: Fix regression in GETCONNECTOR ioctl returning stale properties (Daniel)
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
* tag 'drm-misc-fixes-2017-06-22' of git://anongit.freedesktop.org/git/drm-misc:
drm: Fix GETCONNECTOR regression
Randy Dunlap [Tue, 23 May 2017 01:44:57 +0000 (18:44 -0700)]
kconfig: fix sparse warnings in nconfig
Fix sparse warnings in scripts/kconfig/nconf* ('make nconfig'):
../scripts/kconfig/nconf.c:1071:32: warning: Using plain integer as NULL pointer
../scripts/kconfig/nconf.c:1238:30: warning: Using plain integer as NULL pointer
../scripts/kconfig/nconf.c:511:51: warning: Using plain integer as NULL pointer
../scripts/kconfig/nconf.c:1460:6: warning: symbol 'setup_windows' was not declared. Should it be static?
../scripts/kconfig/nconf.c:274:12: warning: symbol 'current_instructions' was not declared. Should it be static?
../scripts/kconfig/nconf.c:308:22: warning: symbol 'function_keys' was not declared. Should it be static?
../scripts/kconfig/nconf.gui.c:132:17: warning: non-ANSI function declaration of function 'set_colors'
../scripts/kconfig/nconf.gui.c:195:24: warning: Using plain integer as NULL pointer
nconf.gui.o before/after files are the same.
nconf.o before/after files are the same until the 'static' function
declarations are added.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Björn Töpel [Wed, 21 Jun 2017 16:41:34 +0000 (18:41 +0200)]
perf probe: Fix probe definition for inlined functions
In commit 613f050d68a8 ("perf probe: Fix to probe on gcc generated
functions in modules"), the offset from symbol is, incorrectly, added
to the trace point address. This leads to incorrect probe trace points
for inlined functions and when using relative line number on symbols.
Using 'pfunct', a tool found in the 'dwarves' package [1], one can ask what are
the functions that while not being explicitely marked as inline, were inlined
by the compiler:
Now lets concentrate on the 'e1000_consume_page' one, that was inlined twice in
e1000_clean_jumbo_rx_irq(), lets see what readelf says about the DWARF tags for
that function:
So, the first time in e1000_clean_jumbo_rx_irq() where e1000_consume_page() is
inlined is at PC 0x17be6, which subtracted from e1000_clean_jumbo_rx_irq()'s
address, gives us the offset we should use in the probe definition:
0x17be6 - 0x17a30 = 438
but above we have 876, which is twice as much.
Lets see the second inline expansion of e1000_consume_page() in
e1000_clean_jumbo_rx_irq():
Linus Torvalds [Thu, 22 Jun 2017 18:16:55 +0000 (11:16 -0700)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Various small fixes for stable"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: Fix some return values in case of error in 'crypt_message'
cifs: remove redundant return in cifs_creation_time_get
CIFS: Improve readdir verbosity
CIFS: check if pages is null rather than bv for a failed allocation
CIFS: Set ->should_dirty in cifs_user_readv()
Paolo Bonzini [Wed, 7 Jun 2017 13:13:14 +0000 (15:13 +0200)]
KVM: x86: fix singlestepping over syscall
TF is handled a bit differently for syscall and sysret, compared
to the other instructions: TF is checked after the instruction completes,
so that the OS can disable #DB at a syscall by adding TF to FMASK.
When the sysret is executed the #DB is taken "as if" the syscall insn
just completed.
KVM emulates syscall so that it can trap 32-bit syscall on Intel processors.
Fix the behavior, otherwise you could get #DB on a user stack which is not
nice. This does not affect Linux guests, as they use an IST or task gate
for #DB.
This fixes CVE-2017-7518.
Cc: stable@vger.kernel.org Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Alistair Popple [Tue, 20 Jun 2017 08:37:28 +0000 (18:37 +1000)]
powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD
NPU2 requires an extra explicit flush to an active GPU PID when
sending address translation shoot downs (ATSDs) to reliably flush the
GPU TLB. This patch adds just such a flush at the end of each sequence
of ATSDs.
We can safely use PID 0 which is always reserved and active on the
GPU. PID 0 is only used for init_mm which will never be a user mm on
the GPU. To enforce this we add a check in pnv_npu2_init_context()
just in case someone tries to use PID 0 on the GPU.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
[mpe: Use true/false for bool literals] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
For real-space designation asces the asce origin part is only a token.
The asce token origin must not be used to generate an effective
address for storage references. This however is erroneously done
within kvm_s390_shadow_tables().
Furthermore within the same function the wrong parts of virtual
addresses are used to generate a corresponding real address
(e.g. the region second index is used as region first index).
Both of the above can result in incorrect address translations. Only
for real space designations with a token origin of zero and addresses
below one megabyte the translation was correct.
Furthermore replace a "!asce.r" statement with a "!*fake" statement to
make it more obvious that a specific condition has nothing to do with
the architecture, but with the fake handling of real space designations.
Fixes: 3218f7094b6b ("s390/mm: support real-space for gmap shadows") Cc: David Hildenbrand <david@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Kan Liang [Mon, 19 Jun 2017 14:26:09 +0000 (07:26 -0700)]
perf/x86/intel: Add 1G DTLB load/store miss support for SKL
Current DTLB load/store miss events (0x608/0x649) only counts 4K,2M and
4M page size.
Need to extend the events to support any page size (4K/2M/4M/1G).
i2c: imx: Use correct function to write to register
The i2c-imx driver incorrectly uses readb()/writeb() to read and
write to the appropriate registers when performing a repeated start.
The appropriate imx_i2c_read_reg()/imx_i2c_write_reg() functions
should be used instead. Performing a repeated start results in
a kernel panic. The platform is imx.
Signed-off-by: Michail G Etairidis <m.etairidis@beck-ipc.com> Fixes: ce1a78840ff7 ("i2c: imx: add DMA support for freescale i2c driver") Fixes: 054b62d9f25c ("i2c: imx: fix the i2c bus hang issue when do repeat restart") Acked-by: Fugang Duan <fugang.duan@nxp.com> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Linus Torvalds [Thu, 22 Jun 2017 05:15:00 +0000 (22:15 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"This contains a set of fixes for xen-blkback by way of Konrad, and a
performance regression fix for blk-mq for shared tags.
The latter could account for as much as a 50x reduction in
performance, with the test case from the user with 500 name spaces. A
more realistic setup on my end with 32 drives showed a 3.5x drop. The
fix has been thoroughly tested before being committed"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq: fix performance regression with shared tags
xen-blkback: don't leak stack data via response ring
xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
xen/blkback: don't free be structure too early
xen/blkback: fix disconnect while I/Os in flight
Darrick J. Wong [Thu, 22 Jun 2017 03:27:35 +0000 (20:27 -0700)]
xfs: don't allow bmap on rt files
bmap returns a dumb LBA address but not the block device that goes with
that LBA. Swapfiles don't care about this and will blindly assume that
the data volume is the correct blockdev, which is totally bogus for
files on the rt subvolume. This results in the swap code doing IOs to
arbitrary locations on the data device(!) if the passed in mapping is a
realtime file, so just turn off bmap for rt files.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
We create "supply-0" debugfs directory even if the device doesn't do
voltage scaling. That looks confusing, as if the regulator is found but
we never managed to get voltage levels for it.
Avoid creating such a directory unnecessarily.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Tue, 23 May 2017 04:02:12 +0000 (09:32 +0530)]
PM / OPP: opp-microvolt is not optional if regulators are set
If dev_pm_opp_set_regulators() is called for a device and its regulators
are set in the OPP core, the OPP nodes for the device must contain the
"opp-microvolt" property, otherwise there is something wrong and we
better error out.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Tue, 23 May 2017 04:02:11 +0000 (09:32 +0530)]
PM / OPP: Don't create copy of regulators unnecessarily
This code was required while the OPP core was managed with help of RCUs,
but not anymore. Get rid of unnecessary alloc/memcpy operations.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar [Tue, 23 May 2017 04:02:10 +0000 (09:32 +0530)]
PM / OPP: Reorganize _generic_set_opp_regulator()
The code was overly complicated here because of the limitations that we
had with RCUs (Couldn't use opp-table and OPPs outside RCU protected
section and can't call sleep-able routines from within that). But that
is long gone now.
Reorganize _generic_set_opp_regulator() in order to avoid using "struct
dev_pm_set_opp_data" and copying data into it for the case where
opp_table->set_opp is not set.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>