Thomas VanSelus [Fri, 22 Dec 2017 16:51:23 +0000 (10:51 -0600)]
misc: ds1682: Ignore update-in-progress ETC reads
The Elapsed Time Counter (ETC) registers are not buffered for reading.
If a 250ms tick occurs while data is being read out, the result can be
a combination of old and new values. This can occur at the byte level
(giving a time in the future) or the individual bit level (giving a
time in the past). We catch both these cases by reading until we get
two equal or consecutive values. After five unsuccessful attempts we
give up.
Signed-off-by: Thomas VanSelus <tvanselus@xes-inc.com> Signed-off-by: Aaron Sierra <asierra@xes-inc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Aaron Sierra [Fri, 22 Dec 2017 16:51:09 +0000 (10:51 -0600)]
misc: ds1682: Show device registers as unsigned
This patch leverages the fact that all DS1682 registers are unsigned to
merge two return paths into one. It also introduces val_le as used in
ds1682_store() to merge two endianness conversions into one.
Signed-off-by: Aaron Sierra <asierra@xes-inc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wei Yongjun [Tue, 2 Jan 2018 17:54:24 +0000 (17:54 +0000)]
slimbus: qcom: Fix return value check in qcom_slim_probe()
In case of error, the function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().
Colin Ian King [Tue, 2 Jan 2018 17:54:21 +0000 (17:54 +0000)]
slimbus: make functions slim_ack_txn and slim_alloc_txbuf static
The functions slim_ack_txn and slim_alloc_txbuf are local to the
source and do not need to be in global scope, so make them static.
Cleans up sparse warnings:
symbol 'slim_ack_txn' was not declared. Should it be static?
symbol 'slim_alloc_txbuf' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Colin Ian King [Tue, 2 Jan 2018 17:54:20 +0000 (17:54 +0000)]
slimbus: fix retries comparison to correctly identify failed allocation
Currently the check for too many retries fails because retries is actually
-1 when the retry loop terminates if no pbuf can be allocated because of
the post decrement on retries. Fix this by not comparing retries with zero
but instead check if it is negative.
Detected by CoverityScan, CID#1463143 ("Logically dead code") and
CID#1463144 ("Dereference after null check")
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Colin Ian King [Tue, 2 Jan 2018 17:54:19 +0000 (17:54 +0000)]
slimbus: avoid null pointer dereference on msg
The pointer msg is checked to see if it is null at the start of
the function and jumps to the error exit label reterr that then
dereferences msg when it prints a dev_err error message. Avoid
this potential null pointer dereference by only printing the
error message if msg is not null.
Detected by CoverityScan, CID#1463141 ("Dereference after null check")
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Colin Ian King [Mon, 8 Jan 2018 16:52:43 +0000 (22:22 +0530)]
soundwire: intel: fix missing assignment to ret
Currently the return status ret is being checked but it has not been
updated since the previous check on ret. It appears that assignment of
ret from return status of the call to sdw_cdns_enable_interrupt was
accidentally ommited. Fix this.
Detected by CoverityScan, CID#1463148 ("Logically dead code")
Fixes: 71bb8a1b059e ("soundwire: intel: Add Intel Master driver") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Colin Ian King [Mon, 8 Jan 2018 16:52:42 +0000 (22:22 +0530)]
soundwire: fix sign extension when shifting buf[2] 24 places
The buf[2] left shift by 24 bits is promoted to int (32 bit signed)
and then signed-extended to unsigned long long. Hence if the upper
bit to buf[2] is set then all the upper bits of addr end up as 1.
Fix this by casting it to u64 before shifting it. Also replace the
unsigned long long casts to u64 casts to match the same type of
addr.
Detected by CoverityScan, CID#1463147 ("Unintended sign extension")
Fixes: d52d7a1be02c ("soundwire: Add Slave status handling helpers") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Mon, 8 Jan 2018 13:33:08 +0000 (14:33 +0100)]
Merge tag 'extcon-next-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next
Chanwoo writes:
Update extcon for 4.16
Detailed description for this pull request:
1. Add support to notify USB connector for "ChromeOS Embedded Controller".
- extcon-usbc-cros-ec driver detects the EXTCON_USB and EXTCON_USB_HOST
connector type and then notify the state/properties to the consumer device.
2. Update the detection on probe time and clean-up code for "X-Power AXP288".
- Detect the state of connector after a couple of seconds after probe()
becasue extcon-axp288.c driver depends on other device driver like mux.
In order to guarantee the correct state, the extcon-axp288.c uses the
delayed_work.
- Set EXTCON_CHG_USB_SDP type as the safe default type if unknown connector
is attached because the data sheet of axp288 doesn't handle
the all exception cases.
- Remove unused code
3. Fix the minor issue of extcon driver
- Fix platform get_irq's error checking for extcon-adc-jack.
- Delete unneeded initialization for extcon-max8997/max77693.
According to the data sheets all the values not handled in the
switch-case are "reserved". Update the dev_warn message to reflect
this and set the cable-type to EXTCON_CHG_USB_SDP (so max 500mA
current draw) as safe default.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Hans de Goede [Fri, 22 Dec 2017 12:36:15 +0000 (13:36 +0100)]
extcon: axp288: Redo charger type detection a couple of seconds after probe()
The axp288 extcon code depends on other drivers to do things like mux the
data lines, enable/disable vbus based on the id-pin, etc.
Sometimes the BIOS has not set these things up correctly resulting in the
initial charger cable type detection giving a wrong result and we end up
not charging or charging at only 0.5A.
This commit starts a second charger-detection cycle a couple of seconds
after the first one finishes, giving the other drivers time to load and
do their thing.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Hans de Goede [Sat, 30 Dec 2017 16:04:13 +0000 (01:04 +0900)]
extcon: axp288: Remove unused platform data
This is not used / set anywhere in the tree.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Colin Ian King [Tue, 19 Dec 2017 17:35:30 +0000 (17:35 +0000)]
mei: fix incorrect logical operator in if statement
The current expression using the || operator is always true because
dev->dev_state cannot be equal to two different values at the same time.
Fix this by replacing the || with &&.
Detected by CoverityScan, CID#1463042 ("Constant expression result")
Fixes: 8d52af6795c0 ("mei: speed up the power down flow") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tomas Winkler [Tue, 2 Jan 2018 10:01:41 +0000 (12:01 +0200)]
mei: me: allow runtime pm for platform with D0i3
>From the pci power documentation:
"The driver itself should not call pm_runtime_allow(), though. Instead,
it should let user space or some platform-specific code do that (user space
can do it via sysfs as stated above)..."
However, the S0ix residency cannot be reached without MEI device getting
into low power state. Hence, for mei devices that support D0i3, it's better
to make runtime power management mandatory and not rely on the system
integration such as udev rules.
This policy cannot be applied globally as some older platforms
were found to have broken power management.
Cc: <stable@vger.kernel.org> v4.13+ Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Reviewed-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arnd Bergmann [Tue, 2 Jan 2018 10:48:54 +0000 (11:48 +0100)]
slimbus: qcom-ctrl: use normal allocation
The previous patch addressed a warning but not the cause:
drivers/slimbus/qcom-ctrl.c: In function 'qcom_slim_probe':
drivers/slimbus/qcom-ctrl.c:584:9: error: passing argument 3 of 'dmam_alloc_coherent' from incompatible pointer type [-Werror=incompatible-pointer-types]
There are two things wrong here:
- The naming is very confusing, we now have a member named 'phys'
that doesn't refer to a phys_addr_t but a dma_addr_t. If we needed
a dma address, it should be named 'dma' to avoid confusion, and
to make it less likely that someone passes it into a function that
expects a physical address.
- The dma address is not used at all at this point. It may have been
designed to support DMA in the future, but today it doesn't, so
the only effect right now is to make transfers artificially slower
by using uncached memory instead of cached memory for a temporary
buffer.
This removes the unused structure member and instead changes the code
to call devm_kcalloc(), which matches the usage of the 'base' pointer
as an array of temporary buffers.
Linus Torvalds [Sun, 31 Dec 2017 21:13:56 +0000 (13:13 -0800)]
Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A couple of fixlets for x86:
- Fix the ESPFIX double fault handling for 5-level pagetables
- Fix the commandline parsing for 'apic=' on 32bit systems and update
documentation
- Make zombie stack traces reliable
- Fix kexec with stack canary
- Fix the delivery mode for APICs which was missed when the x86
vector management was converted to single target delivery. Caused a
regression due to the broken hardware which ignores affinity
settings in lowest prio delivery mode.
- Unbreak modules when AMD memory encryption is enabled
- Remove an unused parameter of prepare_switch_to"
* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Switch all APICs to Fixed delivery mode
x86/apic: Update the 'apic=' description of setting APIC driver
x86/apic: Avoid wrong warning when parsing 'apic=' in X86-32 case
x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR)
x86: Remove unused parameter of prepare_switch_to
x86/stacktrace: Make zombie stack traces reliable
x86/mm: Unbreak modules that use the DMA API
x86/build: Make isoimage work on Debian
x86/espfix/64: Fix espfix double-fault handling on 5-level systems
Linus Torvalds [Sun, 31 Dec 2017 20:30:34 +0000 (12:30 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"A pile of fixes for long standing issues with the timer wheel and the
NOHZ code:
- Prevent timer base confusion accross the nohz switch, which can
cause unlocked access and data corruption
- Reinitialize the stale base clock on cpu hotplug to prevent subtle
side effects including rollovers on 32bit
- Prevent an interrupt storm when the timer softirq is already
pending caused by tick_nohz_stop_sched_tick()
- Move the timer start tracepoint to a place where it actually makes
sense
- Add documentation to timerqueue functions as they caused confusion
several times now"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timerqueue: Document return values of timerqueue_add/del()
timers: Invoke timer_start_debug() where it makes sense
nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
timers: Reinitialize per cpu bases on hotplug
timers: Use deferrable base independent of base::nohz_active
Linus Torvalds [Sun, 31 Dec 2017 20:27:19 +0000 (12:27 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
"Three patches addressing the fallout of the CPU_ISOLATION changes
especially with NO_HZ_FULL plus documentation of boot parameter
dependency"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/isolation: Document boot parameters dependency on CONFIG_CPU_ISOLATION=y
sched/isolation: Enable CONFIG_CPU_ISOLATION=y by default
sched/isolation: Make CONFIG_NO_HZ_FULL select CONFIG_CPU_ISOLATION
Linus Torvalds [Sun, 31 Dec 2017 19:47:24 +0000 (11:47 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
- plug a memory leak in the intel pmu init code
- clang fixes
- tooling fix to avoid including kernel headers
- a fix for jvmti to generate correct debug information for inlined
code
- replace backtick with a regular shell function
- fix the build in hardened environments
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Plug memory leak in intel_pmu_init()
x86/asm: Allow again using asm.h when building for the 'bpf' clang target
tools arch s390: Do not include header files from the kernel sources
perf jvmti: Generate correct debug information for inlined code
perf tools: Fix up build in hardened environments
perf tools: Use shell function for perl cflags retrieval
Linus Torvalds [Sun, 31 Dec 2017 19:23:11 +0000 (11:23 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"A rather large update after the kaisered maintainer finally found time
to handle regression reports.
- The larger part addresses a regression caused by the x86 vector
management rework.
The reservation based model does not work reliably for MSI
interrupts, if they cannot be masked (yes, yet another hw
engineering trainwreck). The reason is that the reservation mode
assigns a dummy vector when the interrupt is allocated and switches
to a real vector when the interrupt is requested.
If the MSI entry cannot be masked then the initialization might
raise an interrupt before the interrupt is requested, which ends up
as spurious interrupt and causes device malfunction and worse. The
fix is to exclude MSI interrupts which do not support masking from
reservation mode and assign a real vector right away.
- Extend the extra lockdep class setup for nested interrupts with a
class for the recently added irq_desc::request_mutex so lockdep can
differeniate and does not emit false positive warnings.
- A ratelimit guard for the bad irq printout so in case a bad irq
comes back immediately the system does not drown in dmesg spam"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI
genirq/irqdomain: Rename early argument of irq_domain_activate_irq()
x86/vector: Use IRQD_CAN_RESERVE flag
genirq: Introduce IRQD_CAN_RESERVE flag
genirq/msi: Handle reactivation only on success
gpio: brcmstb: Make really use of the new lockdep class
genirq: Guard handle_bad_irq log messages
kernel/irq: Extend lockdep class for request mutex
Linus Torvalds [Sun, 31 Dec 2017 18:57:10 +0000 (10:57 -0800)]
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Thomas Gleixner:
"Three fixlets for objtool:
- Address two segfaults related to missing parameter and clang
objects
- Make it compile clean with clang"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix seg fault with clang-compiled objects
objtool: Fix seg fault caused by missing parameter
objtool: Fix Clang enum conversion warning
Linus Torvalds [Sun, 31 Dec 2017 18:52:51 +0000 (10:52 -0800)]
Merge tag 'char-misc-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Here are six small fixes of some of the char/misc drivers that have
been sent in to resolve reported issues.
Nothing major, a binder use-after-free fix, some thunderbolt bugfixes,
a hyper-v bugfix, and an nvmem driver fix. All of these have been in
linux-next with no reported issues for a while"
* tag 'char-misc-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
nvmem: meson-mx-efuse: fix reading from an offset other than 0
binder: fix proc->files use-after-free
vmbus: unregister device_obj->channels_kset
thunderbolt: Mask ring interrupt properly when polling starts
MAINTAINERS: Add thunderbolt.rst to the Thunderbolt driver entry
thunderbolt: Make pathname to force_power shorter
Linus Torvalds [Sun, 31 Dec 2017 18:50:05 +0000 (10:50 -0800)]
Merge tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are two driver core fixes for 4.15-rc6, resolving some reported
issues.
The first is a cacheinfo fix for DT based systems to resolve a
reported issue that has been around for a while, and the other is to
resolve a regression in the kobject uevent code that showed up in
4.15-rc1.
Both have been in linux-next for a while with no reported issues"
* tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
kobject: fix suppressing modalias in uevents delivered over netlink
drivers: base: cacheinfo: fix cache type for non-architected system cache
Linus Torvalds [Sun, 31 Dec 2017 18:44:00 +0000 (10:44 -0800)]
Merge tag 'usb-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB/PHY fixes from Greg KH:
"Here are a number of small USB and PHY driver fixes for 4.15-rc6.
Nothing major, but there are a number of regression fixes in here that
resolve issues that have been reported a bunch. There are also the
usual xhci fixes as well as a number of new usb serial device ids.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201
xhci: Fix use-after-free in xhci debugfs
xhci: Fix xhci debugfs NULL pointer dereference in resume from hibernate
USB: serial: ftdi_sio: add id for Airbus DS P8GR
usb: Add device quirk for Logitech HD Pro Webcam C925e
usb: add RESET_RESUME for ELSA MicroLink 56K
usbip: fix usbip bind writing random string after command in match_busid
usbip: stub_rx: fix static checker warning on unnecessary checks
usbip: prevent leaking socket pointer address in messages
usbip: stub: stop printing kernel pointer addresses in messages
usbip: vhci: stop printing kernel pointer addresses in messages
USB: Fix off by one in type-specific length check of BOS SSP capability
USB: serial: option: adding support for YUGA CLM920-NC5
phy: rcar-gen3-usb2: select USB_COMMON
phy: rockchip-typec: add pm_runtime_disable in err case
phy: cpcap-usb: Fix platform_get_irq_byname's error checking.
phy: tegra: fix device-tree node lookups
USB: serial: qcserial: add Sierra Wireless EM7565
USB: serial: option: add support for Telit ME910 PID 0x1101
USB: chipidea: msm: fix ulpi-node lookup
Adam Borowski [Mon, 25 Dec 2017 15:38:58 +0000 (16:38 +0100)]
MAINTAINERS: mark arch/blackfin/ and its gubbins as orphaned
The blackfin architecture has seen no maintainer action of any kind since
April 2015. No new code, no pull requests, no acks to patches, no response
to mails, nothing.
The web site has an expired certificate (expiration Sep 2017, issued in
2013), the mailing list sees no answers either, with one exception:
https://sourceforge.net/p/adi-buildroot/mailman/adi-buildroot-devel/
>
> Steven is no longer working on this for ADI. Acked by me if this works. Thanks.
>
> Best regards,
> Aaron Wu
> Analog Devices Inc.
But, Aaron doesn't seem to respond to queries either.
Srinivas Kandagatla [Sun, 31 Dec 2017 14:23:11 +0000 (14:23 +0000)]
slimbus: qcom: fix incompatible pointer warning
One of the pointer passed to dmam_alloc_coherent seems to be
phys_addr_t * instead of dma_addr_t *. This address will be
used by dma apis, so change this to proper type.
Thomas Gleixner [Sun, 31 Dec 2017 15:52:15 +0000 (16:52 +0100)]
x86/ldt: Make LDT pgtable free conditional
Andy prefers to be paranoid about the pagetable free in the error path of
write_ldt(). Make it conditional and warn whenever the installment of a
secondary LDT fails.
Requested-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Sun, 31 Dec 2017 10:24:34 +0000 (11:24 +0100)]
x86/ldt: Plug memory leak in error path
The error path in write_ldt() tries to free 'old_ldt' instead of the newly
allocated 'new_ldt', resulting in a memory leak. It also misses to clean up a
half populated LDT pagetable, which is not a leak as it gets cleaned up
when the process exits.
Free both the potentially half populated LDT pagetable and the newly
allocated LDT struct. This can be done unconditionally because once an LDT
is mapped subsequent maps will succeed, because the PTE page is already
populated and the two LDTs fit into that single page.
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: f55f0501cbf6 ("x86/pti: Put the LDT in its own PGD if PTI is on") Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1712311121340.1899@nanos Signed-off-by: Ingo Molnar <mingo@kernel.org>
Thomas Gleixner [Sat, 30 Dec 2017 21:13:54 +0000 (22:13 +0100)]
x86/mm: Remove preempt_disable/enable() from __native_flush_tlb()
The preempt_disable/enable() pair in __native_flush_tlb() was added in
commit:
5cf0791da5c1 ("x86/mm: Disable preemption during CR3 read+write")
... to protect the UP variant of flush_tlb_mm_range().
That preempt_disable/enable() pair should have been added to the UP variant
of flush_tlb_mm_range() instead.
The UP variant was removed with commit:
ce4a4e565f52 ("x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code")
... but the preempt_disable/enable() pair stayed around.
The latest change to __native_flush_tlb() in commit:
6fd166aae78c ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
... added an access to a per CPU variable outside the preempt disabled
regions, which makes no sense at all. __native_flush_tlb() must always
be called with at least preemption disabled.
Remove the preempt_disable/enable() pair and add a WARN_ON_ONCE() to catch
bad callers independent of the smp_processor_id() debugging.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171230211829.679325424@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
Thomas Gleixner [Sat, 30 Dec 2017 21:13:53 +0000 (22:13 +0100)]
x86/smpboot: Remove stale TLB flush invocations
smpboot_setup_warm_reset_vector() and smpboot_restore_warm_reset_vector()
invoke local_flush_tlb() for no obvious reason.
Digging in history revealed that the original code in the 2.1 era added
those because the code manipulated a swapper_pg_dir pagetable entry. The
pagetable manipulation was removed long ago in the 2.3 timeframe, but the
TLB flush invocations stayed around forever.
Remove them along with the pointless pr_debug()s which come from the same 2.1
change.
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20171230211829.586548655@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Sat, 30 Dec 2017 22:31:30 +0000 (14:31 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two simple fixes, both of which cause I/O hangs.
The storvsc one is from the hyper-v which can hang under certain hot
add/remove conditions and the other is generally, where removing a
target and a device in close proximity can result in the release
method being executed twice (and subsequent list and other corruption
and an eventual panic)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: storvsc: Fix scsi_cmd error assignments in storvsc_handle_error
scsi: core: check for device state in __scsi_remove_target()
Simon Ser [Sat, 30 Dec 2017 20:43:32 +0000 (14:43 -0600)]
objtool: Fix seg fault with clang-compiled objects
Fix a seg fault which happens when an input file provided to 'objtool
orc generate' doesn't have a '.shstrtab' section (for instance, object
files produced by clang don't have this section).
Linus Torvalds [Sat, 30 Dec 2017 18:16:51 +0000 (10:16 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- two cosmetic fixes from Daniel Axtens and Hans de Goede
- fix for I2C command mismatch fix for cp2112 driver from Eudean Sun
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: core: lower log level for unknown main item tags to warnings
HID: holtekff: move MODULE_* parameters out of #ifdef block
HID: cp2112: Fix I2C_BLOCK_DATA transactions
Linus Torvalds [Sat, 30 Dec 2017 01:34:43 +0000 (17:34 -0800)]
kbuild: add '-fno-stack-check' to kernel build options
It appears that hardened gentoo enables "-fstack-check" by default for
gcc.
That doesn't work _at_all_ for the kernel, because the kernel stack
doesn't act like a user stack at all: it's much smaller, and it doesn't
auto-expand on use. So the extra "probe one page below the stack" code
generated by -fstack-check just breaks the kernel in horrible ways,
causing infinite double faults etc.
[ I have to say, that the particular code gcc generates looks very
stupid even for user space where it works, but that's a separate
issue. ]
Reported-and-tested-by: Alexander Tsoy <alexander@tsoy.me> Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de> Cc: stable@kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 30 Dec 2017 01:02:49 +0000 (17:02 -0800)]
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 page table isolation updates from Thomas Gleixner:
"This is the final set of enabling page table isolation on x86:
- Infrastructure patches for handling the extra page tables.
- Patches which map the various bits and pieces which are required to
get in and out of user space into the user space visible page
tables.
- The required changes to have CR3 switching in the entry/exit code.
- Optimizations for the CR3 switching along with documentation how
the ASID/PCID mechanism works.
- Updates to dump pagetables to cover the user space page tables for
W+X scans and extra debugfs files to analyze both the kernel and
the user space visible page tables
The whole functionality is compile time controlled via a config switch
and can be turned on/off on the command line as well"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
x86/ldt: Make the LDT mapping RO
x86/mm/dump_pagetables: Allow dumping current pagetables
x86/mm/dump_pagetables: Check user space page table for WX pages
x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy
x86/mm/pti: Add Kconfig
x86/dumpstack: Indicate in Oops whether PTI is configured and enabled
x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming
x86/mm: Use INVPCID for __native_flush_tlb_single()
x86/mm: Optimize RESTORE_CR3
x86/mm: Use/Fix PCID to optimize user/kernel switches
x86/mm: Abstract switching CR3
x86/mm: Allow flushing for future ASID switches
x86/pti: Map the vsyscall page if needed
x86/pti: Put the LDT in its own PGD if PTI is on
x86/mm/64: Make a full PGD-entry size hole in the memory map
x86/events/intel/ds: Map debug buffers in cpu_entry_area
x86/cpu_entry_area: Add debugstore entries to cpu_entry_area
x86/mm/pti: Map ESPFIX into user space
x86/mm/pti: Share entry text PMD
x86/entry: Align entry text section to PMD boundary
...
Thomas Gleixner [Fri, 22 Dec 2017 14:51:13 +0000 (15:51 +0100)]
nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
The conditions in irq_exit() to invoke tick_nohz_irq_exit() which
subsequently invokes tick_nohz_stop_sched_tick() are:
if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu))
If need_resched() is not set, but a timer softirq is pending then this is
an indication that the softirq code punted and delegated the execution to
softirqd. need_resched() is not true because the current interrupted task
takes precedence over softirqd.
Invoking tick_nohz_irq_exit() in this case can cause an endless loop of
timer interrupts because the timer wheel contains an expired timer, but
softirqs are not yet executed. So it returns an immediate expiry request,
which causes the timer to fire immediately again. Lather, rinse and
repeat....
Prevent that by adding a check for a pending timer soft interrupt to the
conditions in tick_nohz_stop_sched_tick() which avoid calling
get_next_timer_interrupt(). That keeps the tick sched timer on the tick and
prevents a repetitive programming of an already expired timer.
Reported-by: Sebastian Siewior <bigeasy@linutronix.d> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos
Thomas Gleixner [Wed, 27 Dec 2017 20:37:25 +0000 (21:37 +0100)]
timers: Reinitialize per cpu bases on hotplug
The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
them with a potentially stale clk and next_expiry valuem, which can cause
trouble then the CPU is plugged.
Add a prepare callback which forwards the clock, sets next_expiry to far in
the future and reset the control flags to a known state.
Set base->must_forward_clk so the first timer which is queued will try to
forward the clock to current jiffies.
Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel") Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
Anna-Maria Gleixner [Fri, 22 Dec 2017 14:51:12 +0000 (15:51 +0100)]
timers: Use deferrable base independent of base::nohz_active
During boot and before base::nohz_active is set in the timer bases, deferrable
timers are enqueued into the standard timer base. This works correctly as
long as base::nohz_active is false.
Once it base::nohz_active is set and a timer which was enqueued before that
is accessed the lock selector code choses the lock of the deferred
base. This causes unlocked access to the standard base and in case the
timer is removed it does not clear the pending flag in the standard base
bitmap which causes get_next_timer_interrupt() to return bogus values.
To prevent that, the deferrable timers must be enqueued in the deferrable
base, even when base::nohz_active is not set. Those deferrable timers also
need to be expired unconditional.
Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel") Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: stable@vger.kernel.org Cc: rt@linutronix.de Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Link: https://lkml.kernel.org/r/20171222145337.633328378@linutronix.de
Thomas Gleixner [Fri, 29 Dec 2017 09:47:22 +0000 (10:47 +0100)]
genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI
The new reservation mode for interrupts assigns a dummy vector when the
interrupt is allocated and assigns a real vector when the interrupt is
requested. The reservation mode prevents vector pressure when devices with
a large amount of queues/interrupts are initialized, but only a minimal
subset of those queues/interrupts is actually used.
This mode has an issue with MSI interrupts which cannot be masked. If the
driver is not careful or the hardware emits an interrupt before the device
irq is requestd by the driver then the interrupt ends up on the dummy
vector as a spurious interrupt which can cause malfunction of the device or
in the worst case a lockup of the machine.
Change the logic for the reservation mode so that the early activation of
MSI interrupts checks whether:
- the device is a PCI/MSI device
- the reservation mode of the underlying irqdomain is activated
- PCI/MSI masking is globally enabled
- the PCI/MSI device uses either MSI-X, which supports masking, or
MSI with the maskbit supported.
If one of those conditions is false, then clear the reservation mode flag
in the irq data of the interrupt and invoke irq_domain_activate_irq() with
the reserve argument cleared. In the x86 vector code, clear the can_reserve
flag in the vector allocation data so a subsequent free_irq() won't create
the same situation again. The interrupt stays assigned to a real vector
until pci_disable_msi() is invoked and all allocations are undone.
Fixes: 4900be83602b ("x86/vector/msi: Switch to global reservation mode") Reported-by: Alexandru Chirvasitu <achirvasub@gmail.com> Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291406420.1899@nanos Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291409460.1899@nanos
Thomas Gleixner [Fri, 29 Dec 2017 15:57:00 +0000 (16:57 +0100)]
x86/vector: Use IRQD_CAN_RESERVE flag
Set the new CAN_RESERVE flag when the initial reservation for an interrupt
happens. The flag is used in a subsequent patch to disable reservation mode
for a certain class of MSI devices.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org
Thomas Gleixner [Fri, 29 Dec 2017 15:44:34 +0000 (16:44 +0100)]
genirq: Introduce IRQD_CAN_RESERVE flag
Add a new flag to mark interrupts which can use reservation mode. This is
going to be used in subsequent patches to disable reservation mode for a
certain class of MSI devices.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org
Thomas Gleixner [Fri, 29 Dec 2017 09:42:10 +0000 (10:42 +0100)]
genirq/msi: Handle reactivation only on success
When analyzing the fallout of the x86 vector allocation rework it turned
out that the error handling in msi_domain_alloc_irqs() is broken.
If MSI_FLAG_MUST_REACTIVATE is set for a MSI domain then it clears the
activation flag for a successfully initialized msi descriptor. If a
subsequent initialization fails then the error handling code path does not
deactivate the interrupt because the activation flag got cleared.
Move the clearing of the activation flag outside of the initialization loop
so that an eventual failure can be cleaned up correctly.
Fixes: 22d0b12f3560 ("genirq/irqdomain: Add force reactivation flag to irq domains") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Mikael Pettersson <mikpelinux@gmail.com> Cc: Josh Poulson <jopoulso@microsoft.com> Cc: Mihai Costache <v-micos@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-pci@vger.kernel.org Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Simon Xiao <sixiao@microsoft.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jork Loeser <Jork.Loeser@microsoft.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: devel@linuxdriverproject.org Cc: KY Srinivasan <kys@microsoft.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@intel.com>, Cc: linux-media@vger.kernel.org
Linus Torvalds [Fri, 29 Dec 2017 19:54:15 +0000 (11:54 -0800)]
Merge tag 'pm-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"This fixes a schedutil cpufreq governor regression from the 4.14 cycle
that may cause a CPU idleness check to return incorrect results in
some cases which leads to suboptimal decisions (Joel Fernandes)"
* tag 'pm-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: schedutil: Use idle_calls counter of the remote CPU
Thomas Gleixner [Fri, 29 Dec 2017 15:29:15 +0000 (16:29 +0100)]
gpio: brcmstb: Make really use of the new lockdep class
The recent extension of irq_set_lockdep_class() with a second argument
added the new lockdep class to the mrcmstb driver, but used the already
existing lockdep class as second argument, which leaves the new lockdep
class defined but unused.
Use the new lockdep class as that's what the change intended to do.
Fixes: 39c3fd58952d ("kernel/irq: Extend lockdep class for request mutex") Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: linus.walleij@linaro.org
Thomas Gleixner [Thu, 28 Dec 2017 10:33:33 +0000 (11:33 +0100)]
x86/apic: Switch all APICs to Fixed delivery mode
Some of the APIC incarnations are operating in lowest priority delivery
mode. This worked as long as the vector management code allocated the same
vector on all possible CPUs for each interrupt.
Lowest priority delivery mode does not necessarily respect the affinity
setting and may redirect to some other online CPU. This was documented
somewhere in the old code and the conversion to single target delivery
missed to update the delivery mode of the affected APIC drivers which
results in spurious interrupts on some of the affected CPU/Chipset
combinations.
Switch the APIC drivers over to Fixed delivery mode and remove all
leftovers of lowest priority delivery mode.
Switching to Fixed delivery mode is not a problem on these CPUs because the
kernel already uses Fixed delivery mode for IPIs. The reason for this is
that th SDM explicitely forbids lowest prio mode for IPIs. The reason is
obvious: If the irq routing does not honor destination targets in lowest
prio mode then an IPI targeted at CPU1 might end up on CPU0, which would be
a fatal problem in many cases.
As a consequence of this change, the apic::irq_delivery_mode field is now
pointless, but this needs to be cleaned up in a separate patch.
Fixes: fdba46ffb4c2 ("x86/apic: Get rid of multi CPU affinity") Reported-by: vcaputo@pengaru.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: vcaputo@pengaru.com Cc: Pavel Machek <pavel@ucw.cz> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712281140440.1688@nanos
1) IPv6 gre tunnels end up with different default features enabled
depending upon whether netlink or ioctls are used to bring them up.
Fix from Alexey Kodanev.
2) Fix read past end of user control message in RDS< from Avinash
Repaka.
3) Missing RCU barrier in mini qdisc code, from Cong Wang.
4) Missing policy put when reusing per-cpu route entries, from Florian
Westphal.
5) Handle nested PCI errors properly in bnx2x driver, from Guilherme G.
Piccoli.
6) Run nested transport mode IPSEC packets via tasklet, from Herbert
Xu.
7) Fix handling poll() for stream sockets in tipc, from Parthasarathy
Bhuvaragan.
8) Fix two stack-out-of-bounds issues in IPSEC, from Steffen Klassert.
9) Another zerocopy ubuf handling fix, from Willem de Bruijn.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
strparser: Call sock_owned_by_user_nocheck
sock: Add sock_owned_by_user_nocheck
skbuff: in skb_copy_ubufs unclone before releasing zerocopy
tipc: fix hanging poll() for stream sockets
sctp: Replace use of sockets_allocated with specified macro.
bnx2x: Improve reliability in case of nested PCI errors
tg3: Enable PHY reset in MTU change path for 5720
tg3: Add workaround to restrict 5762 MRRS to 2048
tg3: Update copyright
net: fec: unmap the xmit buffer that are not transferred by DMA
tipc: fix tipc_mon_delete() oops in tipc_enable_bearer() error path
tipc: error path leak fixes in tipc_enable_bearer()
RDS: Check cmsg_len before dereferencing CMSG_DATA
tcp: Avoid preprocessor directives in tracepoint macro args
tipc: fix memory leak of group member when peer node is lost
net: sched: fix possible null pointer deref in tcf_block_put
tipc: base group replicast ack counter on number of actual receivers
net_sched: fix a missing rcu barrier in mini_qdisc_pair_swap()
net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround
ip6_gre: fix device features for ioctl setup
...
Linus Torvalds [Fri, 29 Dec 2017 07:16:24 +0000 (23:16 -0800)]
Merge tag 'drm-fixes-for-v4.15-rc6' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"nouveau and i915 regression fixes"
* tag 'drm-fixes-for-v4.15-rc6' of git://people.freedesktop.org/~airlied/linux:
drm/nouveau: fix race when adding delayed work items
i915: Reject CCS modifiers for pipe C on Geminilake
drm/i915/gvt: Fix pipe A enable as default for vgpu
Linus Torvalds [Fri, 29 Dec 2017 07:14:47 +0000 (23:14 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
"One more fix for the runtime PM clk patches. We're calling a runtime
PM API that may schedule from somewhere that we can't do that. We
change to the async version of pm_runtime_put() to fix it"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: use atomic runtime pm api in clk_core_is_enabled
Linus Torvalds [Fri, 29 Dec 2017 07:09:45 +0000 (23:09 -0800)]
Merge tag 'led_fixes_for_4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
Pull LED fix from Jacek Anaszewski:
"A single LED fix for brightness setting when delay_off is 0"
* tag 'led_fixes_for_4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
led: core: Fix brightness setting when setting delay_off=0
Linus Torvalds [Fri, 29 Dec 2017 07:06:01 +0000 (23:06 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"This is the next batch of for-rc patches from RDMA. It includes the
fix for the ipoib regression I mentioned last time, and the result of
a fairly major debugging effort to get iser working reliably on cxgb4
hardware - it turns out the cxgb4 driver was not handling QP error
flushing properly causing iser to fail.
- cxgb4 fix for an iser testing failure as debugged by Steve and
Sagi. The problem was a driver bug in the handling of shutting down
a QP.
- Various vmw_pvrdma fixes for bogus WARN_ON, missed resource free on
error unwind and a use after free bug
- Improper congestion counter values on mlx5 when link aggregation is
enabled
- ipoib lockdep regression introduced in this merge window
- hfi1 regression supporting the device in a VM introduced in a
recent patch
- Typo that breaks future uAPI compatibility in the verbs core
- More SELinux related oops fixing
- Fix an oops during error unwind in mlx5"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/mlx5: Fix mlx5_ib_alloc_mr error flow
IB/core: Verify that QP is security enabled in create and destroy
IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()
IB/mlx5: Serialize access to the VMA list
IB/hfi: Only read capability registers if the capability exists
IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush
IB/mlx5: Fix congestion counters in LAG mode
RDMA/vmw_pvrdma: Avoid use after free due to QP/CQ/SRQ destroy
RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warning
RDMA/vmw_pvrdma: Call ib_umem_release on destroy QP path
iw_cxgb4: when flushing, complete all wrs in a chain
iw_cxgb4: reflect the original WR opcode in drain cqes
iw_cxgb4: Only validate the MSN for successful completions
David S. Miller [Thu, 28 Dec 2017 19:28:23 +0000 (14:28 -0500)]
Merge branch 'strparser-Fix-lockdep-issue'
Tom Herbert says:
====================
strparser: Fix lockdep issue
When sock_owned_by_user returns true in strparser. Fix is to add and
call sock_owned_by_user_nocheck since the check for owned by user is
not an error condition in this case.
====================
Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Thu, 28 Dec 2017 19:00:44 +0000 (11:00 -0800)]
strparser: Call sock_owned_by_user_nocheck
strparser wants to check socket ownership without producing any
warnings. As indicated by the comment in the code, it is permissible
for owned_by_user to return true.
Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-and-tested-by: <syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com> Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Thu, 28 Dec 2017 17:38:13 +0000 (12:38 -0500)]
skbuff: in skb_copy_ubufs unclone before releasing zerocopy
skb_copy_ubufs must unclone before it is safe to modify its
skb_shared_info with skb_zcopy_clear.
Commit b90ddd568792 ("skbuff: skb_copy_ubufs must release uarg even
without user frags") ensures that all skbs release their zerocopy
state, even those without frags.
But I forgot an edge case where such an skb arrives that is cloned.
The stack does not build such packets. Vhost/tun skbs have their
frags orphaned before cloning. TCP skbs only attach zerocopy state
when a frag is added.
But if TCP packets can be trimmed or linearized, this might occur.
Tracing the code I found no instance so far (e.g., skb_linearize
ends up calling skb_zcopy_clear if !skb->data_len).
Still, it is non-obvious that no path exists. And it is fragile to
rely on this.
Fixes: b90ddd568792 ("skbuff: skb_copy_ubufs must release uarg even without user frags") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Parthasarathy Bhuvaragan [Thu, 28 Dec 2017 11:03:06 +0000 (12:03 +0100)]
tipc: fix hanging poll() for stream sockets
In commit 42b531de17d2f6 ("tipc: Fix missing connection request
handling"), we replaced unconditional wakeup() with condtional
wakeup for clients with flags POLLIN | POLLRDNORM | POLLRDBAND.
This breaks the applications which do a connect followed by poll
with POLLOUT flag. These applications are not woken when the
connection is ESTABLISHED and hence sleep forever.
In this commit, we fix it by including the POLLOUT event for
sockets in TIPC_CONNECTING state.
Fixes: 42b531de17d2f6 ("tipc: Fix missing connection request handling") Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Lukas Bulwahn [Tue, 26 Dec 2017 21:27:20 +0000 (15:27 -0600)]
objtool: Fix Clang enum conversion warning
Fix the following Clang enum conversion warning:
arch/x86/decode.c:141:20: error: implicit conversion from enumeration
type 'enum op_src_type' to different enumeration
type 'enum op_dest_type' [-Werror,-Wenum-conversion]
op->dest.type = OP_SRC_REG;
~ ^~~~~~~~~~
It just happened to work before because OP_SRC_REG and OP_DEST_REG have
the same value.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Nicholas Mc Guire <der.herr@hofr.at> Reviewed-by: Nick Desaulniers <nick.desaulniers@gmail.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: baa41469a7b9 ("objtool: Implement stack validation 2.0") Link: http://lkml.kernel.org/r/b4156c5738bae781c392e7a3691aed4514ebbdf2.1514323568.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dou Liyang [Mon, 4 Dec 2017 04:03:12 +0000 (12:03 +0800)]
x86/apic: Avoid wrong warning when parsing 'apic=' in X86-32 case
There are two consumers of apic=:
apic_set_verbosity() for setting the APIC debug level;
parse_apic() for registering APIC driver by hand.
X86-32 supports both of them, but sometimes, kernel issues a weird warning.
eg: when kernel was booted up with 'apic=bigsmp' in command line,
early_param would warn like that:
...
[ 0.000000] APIC Verbosity level bigsmp not recognised use apic=verbose or apic=debug
[ 0.000000] Malformed early option 'apic'
...
Wrap the warning code in CONFIG_X86_64 case to avoid this.
This was seen when running an upstream kernel on Acer Chromebook R11. The
system was unstable as result.
Guard the log message with __printk_ratelimit to reduce the impact. This
won't prevent the interrupt storm from happening, but at least the system
remains stable.
Joel Fernandes [Thu, 21 Dec 2017 01:22:45 +0000 (02:22 +0100)]
cpufreq: schedutil: Use idle_calls counter of the remote CPU
Since the recent remote cpufreq callback work, its possible that a cpufreq
update is triggered from a remote CPU. For single policies however, the current
code uses the local CPU when trying to determine if the remote sg_cpu entered
idle or is busy. This is incorrect. To remedy this, compare with the nohz tick
idle_calls counter of the remote CPU.
Fixes: 674e75411fc2 (sched: cpufreq: Allow remote cpufreq callbacks) Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Joel Fernandes <joelaf@google.com> Cc: 4.14+ <stable@vger.kernel.org> # 4.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Andrew Lunn [Sat, 2 Dec 2017 17:11:04 +0000 (18:11 +0100)]
kernel/irq: Extend lockdep class for request mutex
The IRQ code already has support for lockdep class for the lock mutex
in an interrupt descriptor. Extend this to add a second class for the
request mutex in the descriptor. Not having a class is resulting in
false positive splats in some code paths.
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Two small fixes for bpftool. Fix otherwise broken output if any of
the system calls failed when listing maps in json format and instead
of bailing out, skip maps or progs that disappeared between fetching
next id and getting an fd for that id, both from Jakub.
2) Small fix in BPF selftests to respect LLC passed from command line
when testing for -mcpu=probe presence, from Quentin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Mon, 25 Dec 2017 02:43:53 +0000 (03:43 +0100)]
sparc64: repair calling incorrect hweight function from stubs
Commit v4.12-rc4-1-g9289ea7f952b introduced a mistake that made the
64-bit hweight stub call the 16-bit hweight function.
Fixes: 9289ea7f952b ("sparc64: Use indirect calls in hamming weight stubs") Signed-off-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Nitzan Carmi [Tue, 26 Dec 2017 09:20:20 +0000 (11:20 +0200)]
IB/mlx5: Fix mlx5_ib_alloc_mr error flow
ibmr.device is being set only after ib_alloc_mr() is
(successfully) complete. Therefore, in case mlx5_core_create_mkey()
return with error, the error flow calls mlx5_free_priv_descs()
which uses ibmr.device (which doesn't exist yet), causing
a NULL dereference oops.
To fix this, the IB device should be set in the mr struct earlier
stage (e.g. prior to calling mlx5_core_create_mkey()).
Fixes: 8a187ee52b04 ("IB/mlx5: Support the new memory registration API") Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Nitzan Carmi <nitzanc@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Moni Shoua [Sun, 24 Dec 2017 11:54:58 +0000 (13:54 +0200)]
IB/core: Verify that QP is security enabled in create and destroy
The XRC target QP create flow sets up qp_sec only if there is an IB link with
LSM security enabled. However, several other related uAPI entry points blindly
follow the qp_sec NULL pointer, resulting in a possible oops.
Check for NULL before using qp_sec.
Cc: <stable@vger.kernel.org> # v4.12 Fixes: d291f1a65232 ("IB/core: Enforce PKey security on QPs") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Moni Shoua [Sun, 24 Dec 2017 11:54:57 +0000 (13:54 +0200)]
IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()
If the input command length is larger than the kernel supports an error should
be returned in case the unsupported bytes are not cleared, instead of the
other way aroudn. This matches what all other callers of ib_is_udata_cleared
do and will avoid user ABI problems in the future.
Cc: <stable@vger.kernel.org> # v4.10 Fixes: 189aba99e700 ("IB/uverbs: Extend modify_qp and support packet pacing") Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Majd Dibbiny [Sun, 24 Dec 2017 11:54:56 +0000 (13:54 +0200)]
IB/mlx5: Serialize access to the VMA list
User-space applications can do mmap and munmap directly at
any time.
Since the VMA list is not protected with a mutex, concurrent
accesses to the VMA list from the mmap and munmap can cause
data corruption. Add a mutex around the list.
Cc: <stable@vger.kernel.org> # v4.7 Fixes: 7c2344c3bbf9 ("IB/mlx5: Implements disassociate_ucontext API") Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Linus Torvalds [Wed, 27 Dec 2017 21:06:57 +0000 (13:06 -0800)]
Merge tag 'trace-v4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"While doing tests on tracing over the network, I found that the
packets were getting corrupted.
In the process I found three bugs.
One was the culprit, but the other two scared me. After deeper
investigation, they were not as major as I thought they were, due to a
signed compared to an unsigned that prevented a negative number from
doing actual harm.
The two bigger bugs:
- Mask the ring buffer data page length. There are data flags at the
high bits of the length field. These were not cleared via the
length function, and the length could return a negative number.
(Although the number returned was unsigned, but was assigned to a
signed number) Luckily, this value was compared to PAGE_SIZE which
is unsigned and kept it from entering the path that could have
caused damage.
- Check the page usage before reusing the ring buffer reader page.
TCP increments the page ref when passing the page off to the
network. The page is passed back to the ring buffer for use on
free. But the page could still be in use by the TCP stack.
Minor bugs:
- Related to the first bug. No need to clear out the unused ring
buffer data before sending to user space. It is now done by the
ring buffer code itself.
- Reset pointers after free on error path. There were some cases in
the error path that pointers were freed but not set to NULL, and
could have them freed again, having a pointer freed twice"
* tag 'trace-v4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix possible double free on failure of allocating trace buffer
tracing: Fix crash when it fails to alloc ring buffer
ring-buffer: Do no reuse reader page if still in use
tracing: Remove extra zeroing out of the ring buffer page
ring-buffer: Mask out the info bits when returning buffer page length
Linus Torvalds [Wed, 27 Dec 2017 20:59:27 +0000 (12:59 -0800)]
Merge tag 'sound-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"It seems that Santa overslept with a bunch of gifts; the majority of
changes here are various device-specific ASoC fixes, most notably the
revert of rcar IOMMU support and fsl_ssi AC97 fixes, but also lots of
small fixes for codecs. Besides that, the usual HD-audio quirks and
fixes are included, too"
* tag 'sound-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (31 commits)
ALSA: hda - Fix missing COEF init for ALC225/295/299
ALSA: hda: Drop useless WARN_ON()
ALSA: hda - change the location for one mic on a Lenovo machine
ALSA: hda - fix headset mic detection issue on a Dell machine
ALSA: hda - Add MIC_NO_PRESENCE fixup for 2 HP machines
ASoC: rsnd: fixup ADG register mask
ASoC: rt5514-spi: only enable wakeup when fully initialized
ASoC: nau8825: fix issue that pop noise when start capture
ASoC: rt5663: Fix the wrong result of the first jack detection
ASoC: rsnd: ssi: fix race condition in rsnd_ssi_pointer_update
ASoC: Intel: Change kern log level to avoid unwanted messages
ASoC: atmel-classd: select correct Kconfig symbol
ASoC: wm_adsp: Fix validation of firmware and coeff lengths
ASoC: Intel: Skylake: Do not check dev_type for dmic link type
ASoC: rockchip: disable clock on error
ASoC: tlv320aic31xx: Fix GPIO1 register definition
ASoC: codecs: msm8916-wcd: Fix supported formats
ASoC: fsl_asrc: Fix typo in a field define
ASoC: rsnd: ssiu: clear SSI_MODE for non TDM Extended modes
ASoC: da7218: Correct IRQ level in DT binding example
...
Linus Torvalds [Wed, 27 Dec 2017 19:48:50 +0000 (11:48 -0800)]
x86-32: Fix kexec with stack canary (CONFIG_CC_STACKPROTECTOR)
Commit e802a51ede91 ("x86/idt: Consolidate IDT invalidation") cleaned up
and unified the IDT invalidation that existed in a couple of places. It
changed no actual real code.
Despite not changing any actual real code, it _did_ change code generation:
by implementing the common idt_invalidate() function in
archx86/kernel/idt.c, it made the use of the function in
arch/x86/kernel/machine_kexec_32.c be a real function call rather than an
(accidental) inlining of the function.
That, in turn, exposed two issues:
- in load_segments(), we had incorrectly reset all the segment
registers, which then made the stack canary load (which gcc does
using offset of %gs) cause a trap. Instead of %gs pointing to the
stack canary, it will be the normal zero-based kernel segment, and
the stack canary load will take a page fault at address 0x14.
- to make this even harder to debug, we had invalidated the GDT just
before calling idt_invalidate(), which meant that the fault happened
with an invalid GDT, which in turn causes a triple fault and
immediate reboot.
Fix this by
(a) not reloading the special segments in load_segments(). We currently
don't do any percpu accesses (which would require %fs on x86-32) in
this area, but there's no reason to think that we might not want to
do them, and like %gs, it's pointless to break it.
(b) doing idt_invalidate() before invalidating the GDT, to keep things
at least _slightly_ more debuggable for a bit longer. Without a
IDT, traps will not work. Without a GDT, traps also will not work,
but neither will any segment loads etc. So in a very real sense,
the GDT is even more core than the IDT.
Fixes: e802a51ede91 ("x86/idt: Consolidate IDT invalidation") Reported-and-tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.LFD.2.21.1712271143180.8572@i7.lan
Matthieu CASTET [Tue, 12 Dec 2017 10:10:44 +0000 (11:10 +0100)]
led: core: Fix brightness setting when setting delay_off=0
With the current code, the following sequence won't work :
echo timer > trigger
echo 0 > delay_off
* at this point we call
** led_delay_off_store
** led_blink_set
*** stop timer
** led_blink_setup
** led_set_software_blink
*** if !delay_on, led off
*** if !delay_off, set led_set_brightness_nosleep <--- LED_BLINK_SW is set but timer is stop
*** otherwise start timer/set LED_BLINK_SW flag
echo xxx > brightness
* led_set_brightness
** if LED_BLINK_SW
*** if brightness=0, led off
*** else apply brightness if next timer <--- timer is stop, and will never apply new setting
** otherwise set led_set_brightness_nosleep
To fix that, when we delete the timer, we should clear LED_BLINK_SW.
Cc: linux-leds@vger.kernel.org Signed-off-by: Matthieu CASTET <matthieu.castet@parrot.com> Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
rodrigosiqueira [Fri, 15 Dec 2017 13:15:33 +0000 (11:15 -0200)]
x86: Remove unused parameter of prepare_switch_to
Commit e37e43a497d5 ("x86/mm/64: Enable vmapped stacks
(CONFIG_HAVE_ARCH_VMAP_STACK=y)") added prepare_switch_to with one extra
parameter which is not used by the function, remove it.
Thomas Gleixner [Wed, 27 Dec 2017 18:45:31 +0000 (19:45 +0100)]
perf/x86/intel: Plug memory leak in intel_pmu_init()
A recent commit introduced an extra merge_attr() call in the skylake
branch, which causes a memory leak.
Store the pointer to the extra allocated memory and free it at the end of
the function.
Fixes: a5df70c354c2 ("perf/x86: Only show format attributes when supported") Reported-by: Tommi Rantala <tommi.t.rantala@nokia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com>
Steven Rostedt (VMware) [Wed, 27 Dec 2017 01:07:34 +0000 (20:07 -0500)]
tracing: Fix possible double free on failure of allocating trace buffer
Jing Xia and Chunyan Zhang reported that on failing to allocate part of the
tracing buffer, memory is freed, but the pointers that point to them are not
initialized back to NULL, and later paths may try to free the freed memory
again. Jing and Chunyan fixed one of the locations that does this, but
missed a spot.
Jing Xia [Tue, 26 Dec 2017 07:12:53 +0000 (15:12 +0800)]
tracing: Fix crash when it fails to alloc ring buffer
Double free of the ring buffer happens when it fails to alloc new
ring buffer instance for max_buffer if TRACER_MAX_TRACE is configured.
The root cause is that the pointer is not set to NULL after the buffer
is freed in allocate_trace_buffers(), and the freeing of the ring
buffer is invoked again later if the pointer is not equal to Null,
as:
//if trace_buffer is not null, free again
|-ring_buffer_free(buf->buffer)
|-rb_free_cpu_buffer(buffer->buffers[cpu])
// ring_buffer_per_cpu is null, and
// crash in ring_buffer_per_cpu->pages
Steven Rostedt (VMware) [Sat, 23 Dec 2017 02:19:29 +0000 (21:19 -0500)]
ring-buffer: Do no reuse reader page if still in use
To free the reader page that is allocated with ring_buffer_alloc_read_page(),
ring_buffer_free_read_page() must be called. For faster performance, this
page can be reused by the ring buffer to avoid having to free and allocate
new pages.
The issue arises when the page is used with a splice pipe into the
networking code. The networking code may up the page counter for the page,
and keep it active while sending it is queued to go to the network. The
incrementing of the page ref does not prevent it from being reused in the
ring buffer, and this can cause the page that is being sent out to the
network to be modified before it is sent by reading new data.
Add a check to the page ref counter, and only reuse the page if it is not
being used anywhere else.
Cc: stable@vger.kernel.org Fixes: 73a757e63114d ("ring-buffer: Return reader page back into existing ring buffer") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>