]> www.infradead.org Git - nvme.git/log
nvme.git
7 years agocrypto: vmx - Remove overly verbose printk from AES XTS init
Michael Ellerman [Thu, 3 May 2018 12:29:30 +0000 (22:29 +1000)]
crypto: vmx - Remove overly verbose printk from AES XTS init

In p8_aes_xts_init() we do a printk(KERN_INFO ...) to report the
fallback implementation we're using. However with a slow console this
can significantly affect the speed of crypto operations. So remove it.

Fixes: c07f5d3da643 ("crypto: vmx - Adding support for XTS")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: vmx - Remove overly verbose printk from AES init routines
Michael Ellerman [Thu, 3 May 2018 12:29:29 +0000 (22:29 +1000)]
crypto: vmx - Remove overly verbose printk from AES init routines

In the vmx AES init routines we do a printk(KERN_INFO ...) to report
the fallback implementation we're using.

However with a slow console this can significantly affect the speed of
crypto operations. Using 'cryptsetup benchmark' the removal of the
printk() leads to a ~5x speedup for aes-cbc decryption.

So remove them.

Fixes: 8676590a1593 ("crypto: vmx - Adding AES routines for VMX module")
Fixes: 8c755ace357c ("crypto: vmx - Adding CBC routines for VMX module")
Fixes: 4f7f60d312b3 ("crypto: vmx - Adding CTR routines for VMX module")
Fixes: cc333cd68dfa ("crypto: vmx - Adding GHASH routines for VMX module")
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: arm64/sha512-ce - yield NEON after every block of input
Ard Biesheuvel [Mon, 30 Apr 2018 16:18:30 +0000 (18:18 +0200)]
crypto: arm64/sha512-ce - yield NEON after every block of input

Avoid excessive scheduling delays under a preemptible kernel by
conditionally yielding the NEON after every block of input.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: arm64/sha3-ce - yield NEON after every block of input
Ard Biesheuvel [Mon, 30 Apr 2018 16:18:29 +0000 (18:18 +0200)]
crypto: arm64/sha3-ce - yield NEON after every block of input

Avoid excessive scheduling delays under a preemptible kernel by
conditionally yielding the NEON after every block of input.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: arm64/crct10dif-ce - yield NEON after every block of input
Ard Biesheuvel [Mon, 30 Apr 2018 16:18:28 +0000 (18:18 +0200)]
crypto: arm64/crct10dif-ce - yield NEON after every block of input

Avoid excessive scheduling delays under a preemptible kernel by
yielding the NEON after every block of input.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: arm64/crc32-ce - yield NEON after every block of input
Ard Biesheuvel [Mon, 30 Apr 2018 16:18:27 +0000 (18:18 +0200)]
crypto: arm64/crc32-ce - yield NEON after every block of input

Avoid excessive scheduling delays under a preemptible kernel by
yielding the NEON after every block of input.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: arm64/aes-ghash - yield NEON after every block of input
Ard Biesheuvel [Mon, 30 Apr 2018 16:18:26 +0000 (18:18 +0200)]
crypto: arm64/aes-ghash - yield NEON after every block of input

Avoid excessive scheduling delays under a preemptible kernel by
yielding the NEON after every block of input.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: arm64/aes-bs - yield NEON after every block of input
Ard Biesheuvel [Mon, 30 Apr 2018 16:18:25 +0000 (18:18 +0200)]
crypto: arm64/aes-bs - yield NEON after every block of input

Avoid excessive scheduling delays under a preemptible kernel by
yielding the NEON after every block of input.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: arm64/aes-blk - yield NEON after every block of input
Ard Biesheuvel [Mon, 30 Apr 2018 16:18:24 +0000 (18:18 +0200)]
crypto: arm64/aes-blk - yield NEON after every block of input

Avoid excessive scheduling delays under a preemptible kernel by
yielding the NEON after every block of input.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: arm64/aes-ccm - yield NEON after every block of input
Ard Biesheuvel [Mon, 30 Apr 2018 16:18:23 +0000 (18:18 +0200)]
crypto: arm64/aes-ccm - yield NEON after every block of input

Avoid excessive scheduling delays under a preemptible kernel by
yielding the NEON after every block of input.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: arm64/sha2-ce - yield NEON after every block of input
Ard Biesheuvel [Mon, 30 Apr 2018 16:18:22 +0000 (18:18 +0200)]
crypto: arm64/sha2-ce - yield NEON after every block of input

Avoid excessive scheduling delays under a preemptible kernel by
yielding the NEON after every block of input.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: arm64/sha1-ce - yield NEON after every block of input
Ard Biesheuvel [Mon, 30 Apr 2018 16:18:21 +0000 (18:18 +0200)]
crypto: arm64/sha1-ce - yield NEON after every block of input

Avoid excessive scheduling delays under a preemptible kernel by
yielding the NEON after every block of input.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: ghash-clmulni - fix spelling mistake: "acclerated" -> "accelerated"
Colin Ian King [Fri, 27 Apr 2018 18:08:05 +0000 (19:08 +0100)]
crypto: ghash-clmulni - fix spelling mistake: "acclerated" -> "accelerated"

Trivial fix to spelling mistake in module description text

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: caam - fix size of RSA prime factor q
Horia Geantă [Fri, 27 Apr 2018 08:40:11 +0000 (11:40 +0300)]
crypto: caam - fix size of RSA prime factor q

Fix a typo where size of RSA prime factor q is using the size of
prime factor p.

Cc: <stable@vger.kernel.org> # 4.13+
Fixes: 52e26d77b8b3 ("crypto: caam - add support for RSA key form 2")
Fixes: 4a651b122adb ("crypto: caam - add support for RSA key form 3")
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: tcrypt - Remove VLA usage
Kees Cook [Fri, 27 Apr 2018 02:57:28 +0000 (19:57 -0700)]
crypto: tcrypt - Remove VLA usage

In the quest to remove all stack VLA usage from the kernel[1], this
allocates the return code buffers before starting jiffie timers, rather
than using stack space for the array. Additionally cleans up some exit
paths and make sure that the num_mb module_param() is used only once
per execution to avoid possible races in the value changing.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: arm64 - add support for SM4 encryption using special instructions
Ard Biesheuvel [Wed, 25 Apr 2018 12:20:46 +0000 (14:20 +0200)]
crypto: arm64 - add support for SM4 encryption using special instructions

Add support for the SM4 symmetric cipher implemented using the special
SM4 instructions introduced in ARM architecture revision 8.2.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: sm4 - export encrypt/decrypt routines to other drivers
Ard Biesheuvel [Wed, 25 Apr 2018 12:20:45 +0000 (14:20 +0200)]
crypto: sm4 - export encrypt/decrypt routines to other drivers

In preparation of adding support for the SIMD based arm64 implementation
of arm64, which requires a fallback to non-SIMD code when invoked in
certain contexts, expose the generic SM4 encrypt and decrypt routines
to other drivers.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agohwrng: stm32 - fix pm_suspend issue
lionel.debieve@st.com [Mon, 23 Apr 2018 15:04:26 +0000 (17:04 +0200)]
hwrng: stm32 - fix pm_suspend issue

When suspend is called after pm_runtime_suspend,
same callback is used and access to rng register is
freezing system. By calling the pm_runtime_force_suspend,
it first checks that runtime has been already done.

Signed-off-by: Lionel Debieve <lionel.debieve@st.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agohwrng: stm32 - define default state for rng driver
lionel.debieve@st.com [Mon, 23 Apr 2018 15:04:25 +0000 (17:04 +0200)]
hwrng: stm32 - define default state for rng driver

Define default state for stm32_rng driver. It will
be default selected with multi_v7_defconfig

Signed-off-by: Lionel Debieve <lionel.debieve@st.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: ccree - use proper printk format
Gilad Ben-Yossef [Mon, 23 Apr 2018 07:25:15 +0000 (08:25 +0100)]
crypto: ccree - use proper printk format

Fix incorrect use of %pad as a printk format string for none dma_addr_t
variable.

Discovered via smatch.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: ccree - enable support for hardware keys
Gilad Ben-Yossef [Mon, 23 Apr 2018 07:25:14 +0000 (08:25 +0100)]
crypto: ccree - enable support for hardware keys

Enable CryptoCell support for hardware keys.

Hardware keys are regular AES keys loaded into CryptoCell internal memory
via firmware, often from secure boot ROM or hardware fuses at boot time.

As such, they can be used for enc/dec purposes like any other key but
cannot (read: extremely hard to) be extracted since since they are not
available anywhere in RAM during runtime.

The mechanism has some similarities to s390 secure keys although the keys
are not wrapped or sealed, but simply loaded offline. The interface was
therefore modeled based on the s390 secure keys support.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: crypto4xx - put temporary dst sg into request ctx
Christian Lamparter [Thu, 19 Apr 2018 16:41:57 +0000 (18:41 +0200)]
crypto: crypto4xx - put temporary dst sg into request ctx

This patch fixes a crash that happens when testing rfc4543(gcm(aes))

Unable to handle kernel paging request for data at address 0xf59b3420
Faulting instruction address: 0xc0012994
Oops: Kernel access of bad area, sig: 11 [#1]
BE PowerPC 44x Platform
Modules linked in: tcrypt(+) crypto4xx [...]
CPU: 0 PID: 0 Comm: swapper Tainted: G           O      4.17.0-rc1+ #23
NIP:  c0012994 LR: d3077934 CTR: 06026d49
REGS: cfff7e30 TRAP: 0300   Tainted: G           O       (4.17.0-rc1+)
MSR:  00029000 <CE,EE,ME>  CR: 44744822  XER: 00000000
DEAR: f59b3420 ESR: 00000000
NIP [c0012994] __dma_sync+0x58/0x10c
LR [d3077934] crypto4xx_bh_tasklet_cb+0x188/0x3c8 [crypto4xx]

__dma_sync was fed the temporary _dst that crypto4xx_build_pd()
had in it's function stack. This clearly never worked.
This patch therefore overhauls the code from the original driver
and puts the temporary dst sg list into aead's request context.

Fixes: a0aae821ba3d3 ("crypto: crypto4xx - prepare for AEAD support")
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: crypto4xx - extend aead fallback checks
Christian Lamparter [Thu, 19 Apr 2018 16:41:56 +0000 (18:41 +0200)]
crypto: crypto4xx - extend aead fallback checks

1020 bytes is the limit for associated data. Any more
and it will no longer fit into hash_crypto_offset anymore.

The hardware will not process aead requests with plaintext
that have less than AES_BLOCK_SIZE bytes. When decrypting
aead requests the authsize has to be taken in account as
well, as it is part of the cryptlen. Otherwise the hardware
will think it has been misconfigured and will return:

aead return err status = 0x98

For rtc4543(gcm(aes)), the hardware has a dedicated GMAC
mode as part of the hash function set.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: crypto4xx - properly set IV after de- and encrypt
Christian Lamparter [Thu, 19 Apr 2018 16:41:55 +0000 (18:41 +0200)]
crypto: crypto4xx - properly set IV after de- and encrypt

This patch fixes cts(cbc(aes)) test when cbc-aes-ppc4xx is used.
alg: skcipher: Test 1 failed (invalid result) on encryption for cts(cbc-aes-ppc4xx)
00000000: 4b 10 75 fc 2f 14 1b 6a 27 35 37 33 d1 b7 70 05
00000010: 97
alg: skcipher: Failed to load transform for cts(cbc(aes)): -2

The CTS cipher mode expect the IV (req->iv) of skcipher_request
to contain the last ciphertext block after the {en,de}crypt
operation is complete.

Fix this issue for the AMCC Crypto4xx hardware engine.
The tcrypt test case for cts(cbc(aes)) is now correctly passed.

name         : cts(cbc(aes))
driver       : cts(cbc-aes-ppc4xx)
module       : cts
priority     : 300
refcnt       : 1
selftest     : passed
internal     : no
type         : skcipher
async        : yes
blocksize    : 16
min keysize  : 16
max keysize  : 32
ivsize       : 16
chunksize    : 16
walksize     : 16

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: crypto4xx - add aes-ctr support
Christian Lamparter [Thu, 19 Apr 2018 16:41:54 +0000 (18:41 +0200)]
crypto: crypto4xx - add aes-ctr support

This patch adds support for the aes-ctr skcipher.

name         : ctr(aes)
driver       : ctr-aes-ppc4xx
module       : crypto4xx
priority     : 300
refcnt       : 1
selftest     : passed
internal     : no
type         : skcipher
async        : yes
blocksize    : 16
min keysize  : 16
max keysize  : 32
ivsize       : 16
chunksize    : 16
walksize     : 16

The hardware uses only the last 32-bits as the counter while the
kernel tests (aes_ctr_enc_tv_template[4] for example) expect that
the whole IV is a counter. To make this work, the driver will
fallback if the counter is going to overlow.

The aead's crypto4xx_setup_fallback() function is renamed to
crypto4xx_aead_setup_fallback.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: crypto4xx - avoid VLA use
Christian Lamparter [Thu, 19 Apr 2018 16:41:53 +0000 (18:41 +0200)]
crypto: crypto4xx - avoid VLA use

This patch fixes some of the -Wvla warnings.

crypto4xx_alg.c:83:19: warning: Variable length array is used.
crypto4xx_alg.c:273:56: warning: Variable length array is used.
crypto4xx_alg.c:380:32: warning: Variable length array is used.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: crypto4xx - convert to skcipher
Christian Lamparter [Thu, 19 Apr 2018 16:41:52 +0000 (18:41 +0200)]
crypto: crypto4xx - convert to skcipher

The ablkcipher APIs have been effectively deprecated since [1].
This patch converts the crypto4xx driver to the new skcipher APIs.

[1] <https://www.spinics.net/lists/linux-crypto/msg18133.html>

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: crypto4xx - performance optimizations
Christian Lamparter [Thu, 19 Apr 2018 16:41:51 +0000 (18:41 +0200)]
crypto: crypto4xx - performance optimizations

This patch provides a cheap 2MiB/s+ (~ 6%) performance
improvement over the current code. This is because the
compiler can now optimize several endian swap memcpy.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: cavium - Remove unnecessary parentheses
Varsha Rao [Thu, 19 Apr 2018 15:49:43 +0000 (21:19 +0530)]
crypto: cavium - Remove unnecessary parentheses

This patch fixes the clang warning of extraneous parentheses, with the
following coccinelle script.

@@
identifier i;
constant c;
expression e;
@@
(
!((e))
|
-((
\(i == c\|i != c\|i <= c\|i < c\|i >= c\|i > c\)
-))
)

Signed-off-by: Varsha Rao <rvarsha016@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: drivers - simplify getting .drvdata
Wolfram Sang [Thu, 19 Apr 2018 14:05:36 +0000 (16:05 +0200)]
crypto: drivers - simplify getting .drvdata

We should get drvdata from struct device directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: omap-sham - fix memleak
Bin Liu [Tue, 17 Apr 2018 19:53:13 +0000 (14:53 -0500)]
crypto: omap-sham - fix memleak

Fixes: 8043bb1ae03cb ("crypto: omap-sham - convert driver logic to use sgs for data xmit")
The memory pages freed in omap_sham_finish_req() were less than those
allocated in omap_sham_copy_sgs().

Cc: stable@vger.kernel.org
Signed-off-by: Bin Liu <b-liu@ti.com>
Acked-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: drivers - Remove depends on HAS_DMA in case of platform dependency
Geert Uytterhoeven [Tue, 17 Apr 2018 17:49:03 +0000 (19:49 +0200)]
crypto: drivers - Remove depends on HAS_DMA in case of platform dependency

Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".
In most cases this other symbol is an architecture or platform specific
symbol, or PCI.

Generic symbols and drivers without platform dependencies keep their
dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
cannot work anyway.

This simplifies the dependencies, and allows to improve compile-testing.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: caam: - Use kmemdup() function
Fabio Estevam [Mon, 16 Apr 2018 16:05:01 +0000 (13:05 -0300)]
crypto: caam: - Use kmemdup() function

Use kmemdup() rather than duplicating its implementation.

By usign kmemdup() we can also get rid of the 'val' variable.

Detected with Coccinelle script.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: caam - strip input zeros from RSA input buffer
Horia Geantă [Mon, 16 Apr 2018 13:07:05 +0000 (08:07 -0500)]
crypto: caam - strip input zeros from RSA input buffer

Sometimes the provided RSA input buffer provided is not stripped
of leading zeros. This could cause its size to be bigger than that
of the modulus, making the HW complain:

caam_jr 2142000.jr1: 40000789: DECO: desc idx 7:
Protocol Size Error - A protocol has seen an error in size. When
running RSA, pdb size N < (size of F) when no formatting is used; or
pdb size N < (F + 11) when formatting is used.

Fix the problem by stripping off the leading zero from input data
before feeding it to the CAAM accelerator.

Fixes: 8c419778ab57e ("crypto: caam - add support for RSA algorithm")
Cc: <stable@vger.kernel.org> # 4.8+
Reported-by: Martin Townsend <mtownsend1973@gmail.com>
Link: https://lkml.kernel.org/r/CABatt_ytYORYKtApcB4izhNanEKkGFi9XAQMjHi_n-8YWoCRiw@mail.gmail.com
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Tested-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agohwrng: via - support new Centaur CPU
davidwang [Fri, 13 Apr 2018 07:03:03 +0000 (15:03 +0800)]
hwrng: via - support new Centaur CPU

New Centaur CPU(Family > 6) supprt Random Number Generator, but can't
support MSR_VIA_RNG. Just like VIA Nano.

Signed-off-by: David Wang <davidwang@zhaoxin.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: rsa - Remove unneeded error assignment
Fabio Estevam [Wed, 11 Apr 2018 21:37:17 +0000 (18:37 -0300)]
crypto: rsa - Remove unneeded error assignment

There is no need to assign an error value to 'ret' prior
to calling mpi_read_raw_from_sgl() because in the case
of error the 'ret' variable will be assigned to the error
code inside the if block.

In the case of non failure, 'ret' will be overwritten
immediately after, so remove the unneeded assignment.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: testmgr - Allow different compression results
Mahipal Challa [Wed, 11 Apr 2018 18:28:32 +0000 (20:28 +0200)]
crypto: testmgr - Allow different compression results

The following error is triggered by the ThunderX ZIP driver
if the testmanager is enabled:

[  199.069437] ThunderX-ZIP 0000:03:00.0: Found ZIP device 0 177d:a01a on Node 0
[  199.073573] alg: comp: Compression test 1 failed for deflate-generic: output len = 37

The reason for this error is the verification of the compression
results. Verifying the compression result only works if all
algorithm parameters are identical, in this case to the software
implementation.

Different compression engines like the ThunderX ZIP coprocessor
might yield different compression results by tuning the
algorithm parameters. In our case the compressed result is
shorter than the test vector.

We should not forbid different compression results but only
check that compression -> decompression yields the same
result. This is done already in the acomp test. Do something
similar for test_comp().

Signed-off-by: Mahipal Challa <mchalla@cavium.com>
Signed-off-by: Balakrishna Bhamidipati <bbhamidipati@cavium.com>
[jglauber@cavium.com: removed unrelated printk changes, rewrote commit msg,
 fixed whitespace and unneeded initialization]
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: caam - allow retrieving 'era' from register
Fabio Estevam [Wed, 11 Apr 2018 12:45:20 +0000 (09:45 -0300)]
crypto: caam - allow retrieving 'era' from register

The 'era' information can be retrieved from CAAM registers, so
introduce a caam_get_era_from_hw() function that gets it via register
reads in case the 'fsl,sec-era' property is not passed in the device
tree.

This function is based on the U-Boot implementation from
drivers/crypto/fsl/sec.c

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Tested-by: Breno Lima <breno.lima@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: caam - staticize caam_get_era()
Fabio Estevam [Wed, 11 Apr 2018 12:45:19 +0000 (09:45 -0300)]
crypto: caam - staticize caam_get_era()

caam_get_era() is only used locally, so do not export this function
and make it static instead.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: cavium - Fix smp_processor_id() warnings
Jan Glauber [Mon, 9 Apr 2018 15:45:54 +0000 (17:45 +0200)]
crypto: cavium - Fix smp_processor_id() warnings

Switch to raw_smp_processor_id() to prevent a number of
warnings from kernel debugging. We do not care about
preemption here, as the CPU number is only used as a
poor mans load balancing or device selection. If preemption
happens during a compress/decompress operation a small performance
hit will occur but everything will continue to work, so just
ignore it.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: cavium - Fix statistics pending request value
Jan Glauber [Mon, 9 Apr 2018 15:45:53 +0000 (17:45 +0200)]
crypto: cavium - Fix statistics pending request value

The pending request counter was read from the wrong register. While
at it, there is no need to use an atomic for it as it is only read
localy in a loop.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: cavium - Prevent division by zero
Jan Glauber [Mon, 9 Apr 2018 15:45:52 +0000 (17:45 +0200)]
crypto: cavium - Prevent division by zero

Avoid two potential divisions by zero when calculating average
values for the zip statistics.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: cavium - Limit result reading attempts
Jan Glauber [Mon, 9 Apr 2018 15:45:51 +0000 (17:45 +0200)]
crypto: cavium - Limit result reading attempts

After issuing a request an endless loop was used to read the
completion state from memory which is asynchronously updated
by the ZIP coprocessor.

Add an upper bound to the retry attempts to prevent a CPU getting stuck
forever in case of an error. Additionally, add a read memory barrier
and a small delay between the reading attempts.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Cc: stable <stable@vger.kernel.org> # 4.14
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: cavium - Fix fallout from CONFIG_VMAP_STACK
Jan Glauber [Mon, 9 Apr 2018 15:45:50 +0000 (17:45 +0200)]
crypto: cavium - Fix fallout from CONFIG_VMAP_STACK

Enabling virtual mapped kernel stacks breaks the thunderx_zip
driver. On compression or decompression the executing CPU hangs
in an endless loop. The reason for this is the usage of __pa
by the driver which does no longer work for an address that is
not part of the 1:1 mapping.

The zip driver allocates a result struct on the stack and needs
to tell the hardware the physical address within this struct
that is used to signal the completion of the request.

As the hardware gets the wrong address after the broken __pa
conversion it writes to an arbitrary address. The zip driver then
waits forever for the completion byte to contain a non-zero value.

Allocating the result struct from 1:1 mapped memory resolves this
bug.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Cc: stable <stable@vger.kernel.org> # 4.14
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: remove several VLAs
Salvatore Mesoraca [Mon, 9 Apr 2018 13:54:47 +0000 (15:54 +0200)]
crypto: remove several VLAs

We avoid various VLAs[1] by using constant expressions for block size
and alignment mask.

[1] http://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: api - laying defines and checks for statically allocated buffers
Salvatore Mesoraca [Mon, 9 Apr 2018 13:54:46 +0000 (15:54 +0200)]
crypto: api - laying defines and checks for statically allocated buffers

In preparation for the removal of VLAs[1] from crypto code.
We create 2 new compile-time constants: all ciphers implemented
in Linux have a block size less than or equal to 16 bytes and
the most demanding hw require 16 bytes alignment for the block
buffer.
We also enforce these limits in crypto_check_alg when a new
cipher is registered.

[1] http://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: chelsio - remove redundant assignment to cdev->ports
Colin Ian King [Fri, 6 Apr 2018 16:58:47 +0000 (17:58 +0100)]
crypto: chelsio - remove redundant assignment to cdev->ports

There is a double assignment to cdev->ports, the first is redundant
as it is over-written so remove it.

Detected by CoverityScan, CID#1467432 ("Unused value")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: chelsio - don't leak information from the stack to userspace
Colin Ian King [Thu, 5 Apr 2018 16:44:03 +0000 (17:44 +0100)]
crypto: chelsio - don't leak information from the stack to userspace

The structure crypto_info contains fields that are not initialized and
only .version is set.  The copy_to_user call is hence leaking information
from the stack to userspace which must be avoided. Fix this by zero'ing
all the unused fields.

Detected by CoverityScan, CID#1467421 ("Uninitialized scalar variable")

Fixes: a08943947873 ("crypto: chtls - Register chtls with net tls")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: chelsio - Fix potential NULL pointer dereferences
Gustavo A. R. Silva [Tue, 3 Apr 2018 20:09:12 +0000 (15:09 -0500)]
crypto: chelsio - Fix potential NULL pointer dereferences

Add null checks on lookup_tid() return value in order to prevent
null pointer dereferences.

Addresses-Coverity-ID: 1467422 ("Dereference null return value")
Addresses-Coverity-ID: 1467443 ("Dereference null return value")
Addresses-Coverity-ID: 1467445 ("Dereference null return value")
Addresses-Coverity-ID: 1467449 ("Dereference null return value")
Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: authencesn - don't leak pointers to authenc keys
Tudor-Dan Ambarus [Tue, 3 Apr 2018 06:39:01 +0000 (09:39 +0300)]
crypto: authencesn - don't leak pointers to authenc keys

In crypto_authenc_esn_setkey we save pointers to the authenc keys
in a local variable of type struct crypto_authenc_keys and we don't
zeroize it after use. Fix this and don't leak pointers to the
authenc keys.

Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: authenc - don't leak pointers to authenc keys
Tudor-Dan Ambarus [Tue, 3 Apr 2018 06:39:00 +0000 (09:39 +0300)]
crypto: authenc - don't leak pointers to authenc keys

In crypto_authenc_setkey we save pointers to the authenc keys in
a local variable of type struct crypto_authenc_keys and we don't
zeroize it after use. Fix this and don't leak pointers to the
authenc keys.

Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: zstd - Add zstd support
Nick Terrell [Fri, 30 Mar 2018 19:14:53 +0000 (12:14 -0700)]
crypto: zstd - Add zstd support

Adds zstd support to crypto and scompress. Only supports the default
level.

Previously we held off on this patch, since there weren't any users.
Now zram is ready for zstd support, but depends on CONFIG_CRYPTO_ZSTD,
which isn't defined until this patch is in. I also see a patch adding
zstd to pstore [0], which depends on crypto zstd.

[0] lkml.kernel.org/r/9c9416b2dff19f05fb4c35879aaa83d11ff72c92.1521626182.git.geliangtang@gmail.com

Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: ecc - Actually remove stack VLA usage
Kees Cook [Fri, 30 Mar 2018 16:55:44 +0000 (09:55 -0700)]
crypto: ecc - Actually remove stack VLA usage

On the quest to remove all VLAs from the kernel[1], this avoids VLAs
by just using the maximum allocation size (4 bytes) for stack arrays.
All the VLAs in ecc were either 3 or 4 bytes (or a multiple), so just
make it 4 bytes all the time. Initialization routines are adjusted to
check that ndigits does not end up larger than the arrays.

This includes a removal of the earlier attempt at this fix from
commit a963834b4742 ("crypto/ecc: Remove stack VLA usage")

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: caam/qi - fix IV DMA mapping and updating
Horia Geantă [Wed, 28 Mar 2018 12:39:19 +0000 (15:39 +0300)]
crypto: caam/qi - fix IV DMA mapping and updating

There are two IV-related issues:
(1) crypto API does not guarantee to provide an IV buffer that is DMAable,
thus it's incorrect to DMA map it
(2) for in-place decryption, since ciphertext is overwritten with
plaintext, updated IV (req->info) will contain the last block of plaintext
(instead of the last block of ciphertext)

While these two issues could be fixed separately, it's straightforward
to fix both in the same time - by using the {ablkcipher,aead}_edesc
extended descriptor to store the IV that will be fed to the crypto engine;
this allows for fixing (2) by saving req->src[last_block] in req->info
directly, i.e. without allocating yet another temporary buffer.

A side effect of the fix is that it's no longer possible to have the IV
contiguous with req->src or req->dst.
Code checking for this case is removed.

Cc: <stable@vger.kernel.org> # 4.14+
Fixes: a68a19380522 ("crypto: caam/qi - properly set IV after {en,de}crypt")
Link: http://lkml.kernel.org/r/20170113084620.GF22022@gondor.apana.org.au
Reported-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: caam - fix IV DMA mapping and updating
Horia Geantă [Wed, 28 Mar 2018 12:39:18 +0000 (15:39 +0300)]
crypto: caam - fix IV DMA mapping and updating

There are two IV-related issues:
(1) crypto API does not guarantee to provide an IV buffer that is DMAable,
thus it's incorrect to DMA map it
(2) for in-place decryption, since ciphertext is overwritten with
plaintext, updated req->info will contain the last block of plaintext
(instead of the last block of ciphertext)

While these two issues could be fixed separately, it's straightforward
to fix both in the same time - by allocating extra space in the
ablkcipher_edesc for the IV that will be fed to the crypto engine;
this allows for fixing (2) by saving req->src[last_block] in req->info
directly, i.e. without allocating another temporary buffer.

A side effect of the fix is that it's no longer possible to have the IV
and req->src contiguous. Code checking for this case is removed.

Cc: <stable@vger.kernel.org> # 4.13+
Fixes: 854b06f76879 ("crypto: caam - properly set IV after {en,de}crypt")
Link: http://lkml.kernel.org/r/20170113084620.GF22022@gondor.apana.org.au
Reported-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: caam - fix DMA mapping dir for generated IV
Horia Geantă [Wed, 28 Mar 2018 12:39:17 +0000 (15:39 +0300)]
crypto: caam - fix DMA mapping dir for generated IV

In case of GIVCIPHER, IV is generated by the device.
Fix the DMA mapping direction.

Cc: <stable@vger.kernel.org> # 3.19+
Fixes: 7222d1a34103 ("crypto: caam - add support for givencrypt cbc(aes) and rfc3686(ctr(aes))")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: drbg - set freed buffers to NULL
Stephan Mueller [Thu, 12 Apr 2018 06:40:55 +0000 (08:40 +0200)]
crypto: drbg - set freed buffers to NULL

During freeing of the internal buffers used by the DRBG, set the pointer
to NULL. It is possible that the context with the freed buffers is
reused. In case of an error during initialization where the pointers
do not yet point to allocated memory, the NULL value prevents a double
free.

Cc: stable@vger.kernel.org
Fixes: 3cfc3b9721123 ("crypto: drbg - use aligned buffers")
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Reported-by: syzbot+75397ee3df5c70164154@syzkaller.appspotmail.com
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: api - fix finding algorithm currently being tested
Eric Biggers [Mon, 16 Apr 2018 23:59:13 +0000 (16:59 -0700)]
crypto: api - fix finding algorithm currently being tested

Commit eb02c38f0197 ("crypto: api - Keep failed instances alive") is
making allocating crypto transforms sometimes fail with ELIBBAD, when
multiple processes try to access encrypted files with fscrypt for the
first time since boot.  The problem is that the "request larval" for the
algorithm is being mistaken for an algorithm which failed its tests.

Fix it by only returning ELIBBAD for "non-larval" algorithms.  Also
don't leak a reference to the algorithm.

Fixes: eb02c38f0197 ("crypto: api - Keep failed instances alive")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agoLinux 4.17-rc1 v4.17-rc1
Linus Torvalds [Mon, 16 Apr 2018 01:24:20 +0000 (18:24 -0700)]
Linux 4.17-rc1

7 years agoMerge tag 'for-4.17-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Mon, 16 Apr 2018 01:08:35 +0000 (18:08 -0700)]
Merge tag 'for-4.17-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull more btrfs updates from David Sterba:
 "We have queued a few more fixes (error handling, log replay,
  softlockup) and the rest is SPDX updates that touche almost all files
  so the diffstat is long"

* tag 'for-4.17-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: Only check first key for committed tree blocks
  btrfs: add SPDX header to Kconfig
  btrfs: replace GPL boilerplate by SPDX -- sources
  btrfs: replace GPL boilerplate by SPDX -- headers
  Btrfs: fix loss of prealloc extents past i_size after fsync log replay
  Btrfs: clean up resources during umount after trans is aborted
  btrfs: Fix possible softlock on single core machines
  Btrfs: bail out on error during replay_dir_deletes
  Btrfs: fix NULL pointer dereference in log_dir_items

7 years agoMerge tag '4.17-rc1SMB3-Fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Mon, 16 Apr 2018 01:06:22 +0000 (18:06 -0700)]
Merge tag '4.17-rc1SMB3-Fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "SMB3 fixes, a few for stable, and some important cleanup work from
  Ronnie of the smb3 transport code"

* tag '4.17-rc1SMB3-Fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: change validate_buf to validate_iov
  cifs: remove rfc1002 hardcoded constants from cifs_discard_remaining_data()
  cifs: Change SMB2_open to return an iov for the error parameter
  cifs: add resp_buf_size to the mid_q_entry structure
  smb3.11: replace a 4 with server->vals->header_preamble_size
  cifs: replace a 4 with server->vals->header_preamble_size
  cifs: add pdu_size to the TCP_Server_Info structure
  SMB311: Improve checking of negotiate security contexts
  SMB3: Fix length checking of SMB3.11 negotiate request
  CIFS: add ONCE flag for cifs_dbg type
  cifs: Use ULL suffix for 64-bit constant
  SMB3: Log at least once if tree connect fails during reconnect
  cifs: smb2pdu: Fix potential NULL pointer dereference

7 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Mon, 16 Apr 2018 00:24:12 +0000 (17:24 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is a set of minor (and safe changes) that didn't make the initial
  pull request plus some bug fixes.

  The status handling code is actually a running regression from the
  previous merge window which had an incomplete fix (now reverted) and
  most of the remaining bug fixes are for problems older than the
  current merge window"

[ Side note: this merge also takes the base kernel git repository to 6+
  million objects for the first time. Technically we hit it a couple of
  merges ago already if you count all the tag objects, but now it
  reaches 6M+ objects reachable from HEAD.

  I was joking around that that's when I should switch to 5.0, because
  3.0 happened at the 2M mark, and 4.0 happened at 4M objects. But
  probably not, even if numerology is about as good a reason as any.

                                                              - Linus ]

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: devinfo: Add Microsoft iSCSI target to 1024 sector blacklist
  scsi: cxgb4i: silence overflow warning in t4_uld_rx_handler()
  scsi: dpt_i2o: Use after free in I2ORESETCMD ioctl
  scsi: core: Make scsi_result_to_blk_status() recognize CONDITION MET
  scsi: core: Rename __scsi_error_from_host_byte() into scsi_result_to_blk_status()
  Revert "scsi: core: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()"
  scsi: aacraid: Insure command thread is not recursively stopped
  scsi: qla2xxx: Correct setting of SAM_STAT_CHECK_CONDITION
  scsi: qla2xxx: correctly shift host byte
  scsi: qla2xxx: Fix race condition between iocb timeout and initialisation
  scsi: qla2xxx: Avoid double completion of abort command
  scsi: qla2xxx: Fix small memory leak in qla2x00_probe_one on probe failure
  scsi: scsi_dh: Don't look for NULL devices handlers by name
  scsi: core: remove redundant assignment to shost->use_blk_mq

7 years agoMerge tag 'kbuild-v4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy...
Linus Torvalds [Mon, 16 Apr 2018 00:21:30 +0000 (17:21 -0700)]
Merge tag 'kbuild-v4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull more Kbuild updates from Masahiro Yamada:

 - pass HOSTLDFLAGS when compiling single .c host programs

 - build genksyms lexer and parser files instead of using shipped
   versions

 - rename *-asn1.[ch] to *.asn1.[ch] for suffix consistency

 - let the top .gitignore globally ignore artifacts generated by flex,
   bison, and asn1_compiler

 - let the top Makefile globally clean artifacts generated by flex,
   bison, and asn1_compiler

 - use safer .SECONDARY marker instead of .PRECIOUS to prevent
   intermediate files from being removed

 - support -fmacro-prefix-map option to make __FILE__ a relative path

 - fix # escaping to prepare for the future GNU Make release

 - clean up deb-pkg by using debian tools instead of handrolled
   source/changes generation

 - improve rpm-pkg portability by supporting kernel-install as a
   fallback of new-kernel-pkg

 - extend Kconfig listnewconfig target to provide more information

* tag 'kbuild-v4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kconfig: extend output of 'listnewconfig'
  kbuild: rpm-pkg: use kernel-install as a fallback for new-kernel-pkg
  Kbuild: fix # escaping in .cmd files for future Make
  kbuild: deb-pkg: split generating packaging and build
  kbuild: use -fmacro-prefix-map to make __FILE__ a relative path
  kbuild: mark $(targets) as .SECONDARY and remove .PRECIOUS markers
  kbuild: rename *-asn1.[ch] to *.asn1.[ch]
  kbuild: clean up *-asn1.[ch] patterns from top-level Makefile
  .gitignore: move *-asn1.[ch] patterns to the top-level .gitignore
  kbuild: add %.dtb.S and %.dtb to 'targets' automatically
  kbuild: add %.lex.c and %.tab.[ch] to 'targets' automatically
  genksyms: generate lexer and parser during build instead of shipping
  kbuild: clean up *.lex.c and *.tab.[ch] patterns from top-level Makefile
  .gitignore: move *.lex.c *.tab.[ch] patterns to the top-level .gitignore
  kbuild: use HOSTLDFLAGS for single .c executables

7 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 15 Apr 2018 23:12:35 +0000 (16:12 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of fixes and updates for x86:

   - Address a swiotlb regression which was caused by the recent DMA
     rework and made driver fail because dma_direct_supported() returned
     false

   - Fix a signedness bug in the APIC ID validation which caused invalid
     APIC IDs to be detected as valid thereby bloating the CPU possible
     space.

   - Fix inconsisten config dependcy/select magic for the MFD_CS5535
     driver.

   - Fix a corruption of the physical address space bits when encryption
     has reduced the address space and late cpuinfo updates overwrite
     the reduced bit information with the original value.

   - Dominiks syscall rework which consolidates the architecture
     specific syscall functions so all syscalls can be wrapped with the
     same macros. This allows to switch x86/64 to struct pt_regs based
     syscalls. Extend the clearing of user space controlled registers in
     the entry patch to the lower registers"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Fix signedness bug in APIC ID validity checks
  x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption
  x86/olpc: Fix inconsistent MFD_CS5535 configuration
  swiotlb: Use dma_direct_supported() for swiotlb_ops
  syscalls/x86: Adapt syscall_wrapper.h to the new syscall stub naming convention
  syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()
  syscalls/core, syscalls/x86: Clean up compat syscall stub naming convention
  syscalls/core, syscalls/x86: Clean up syscall stub naming convention
  syscalls/x86: Extend register clearing on syscall entry to lower registers
  syscalls/x86: Unconditionally enable 'struct pt_regs' based syscalls on x86_64
  syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32
  syscalls/core: Prepare CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y for compat syscalls
  syscalls/x86: Use 'struct pt_regs' based syscall calling convention for 64-bit syscalls
  syscalls/core: Introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
  x86/syscalls: Don't pointlessly reload the system call number
  x86/mm: Fix documentation of module mapping range with 4-level paging
  x86/cpuid: Switch to 'static const' specifier

7 years agoMerge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 15 Apr 2018 20:35:29 +0000 (13:35 -0700)]
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 pti updates from Thomas Gleixner:
 "Another series of PTI related changes:

   - Remove the manual stack switch for user entries from the idtentry
     code. This debloats entry by 5k+ bytes of text.

   - Use the proper types for the asm/bootparam.h defines to prevent
     user space compile errors.

   - Use PAGE_GLOBAL for !PCID systems to gain back performance

   - Prevent setting of huge PUD/PMD entries when the entries are not
     leaf entries otherwise the entries to which the PUD/PMD points to
     and are populated get lost"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/pgtable: Don't set huge PUD/PMD on non-leaf entries
  x86/pti: Leave kernel text global for !PCID
  x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image
  x86/pti: Enable global pages for shared areas
  x86/mm: Do not forbid _PAGE_RW before init for __ro_after_init
  x86/mm: Comment _PAGE_GLOBAL mystery
  x86/mm: Remove extra filtering in pageattr code
  x86/mm: Do not auto-massage page protections
  x86/espfix: Document use of _PAGE_GLOBAL
  x86/mm: Introduce "default" kernel PTE mask
  x86/mm: Undo double _PAGE_PSE clearing
  x86/mm: Factor out pageattr _PAGE_GLOBAL setting
  x86/entry/64: Drop idtentry's manual stack switch for user entries
  x86/uapi: Fix asm/bootparam.h userspace compilation errors

7 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 15 Apr 2018 19:43:30 +0000 (12:43 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Thomas Gleixner:
 "A few scheduler fixes:

   - Prevent a bogus warning vs. runqueue clock update flags in
     do_sched_rt_period_timer()

   - Simplify the helper functions which handle requests for skipping
     the runqueue clock updat.

   - Do not unlock the tunables mutex in the error path of the cpu
     frequency scheduler utils. Its not held.

   - Enforce proper alignement for 'struct util_est' in sched_avg to
     prevent a misalignment fault on IA64"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Force proper alignment of 'struct util_est'
  sched/core: Simplify helpers for rq clock update skip requests
  sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning
  sched/cpufreq/schedutil: Fix error path mutex unlock

7 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 15 Apr 2018 19:36:31 +0000 (12:36 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull more perf updates from Thomas Gleixner:
 "A rather large set of perf updates:

  Kernel:

   - Fix various initialization issues

   - Prevent creating [ku]probes for not CAP_SYS_ADMIN users

  Tooling:

   - Show only failing syscalls with 'perf trace --failure' (Arnaldo
     Carvalho de Melo)

            e.g: See what 'openat' syscalls are failing:

        # perf trace --failure -e openat
         762.323 ( 0.007 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video2) = -1 ENOENT No such file or directory
         <SNIP N /dev/videoN open attempts... sigh, where is that improvised camera lid?!? >
         790.228 ( 0.008 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video63) = -1 ENOENT No such file or directory
        ^C#

   - Show information about the event (freq, nr_samples, total
     period/nr_events) in the annotate --tui and --stdio2 'perf
     annotate' output, similar to the first line in the 'perf report
     --tui', but just for the samples for a the annotated symbol
     (Arnaldo Carvalho de Melo)

   - Introduce 'perf version --build-options' to show what features were
     linked, aliased as well as a shorter 'perf -vv' (Jin Yao)

   - Add a "dso_size" sort order (Kim Phillips)

   - Remove redundant ')' in the tracepoint output in 'perf trace'
     (Changbin Du)

   - Synchronize x86's cpufeatures.h, no effect on toolss (Arnaldo
     Carvalho de Melo)

   - Show group details on the title line in the annotate browser and
     'perf annotate --stdio2' output, so that the per-event columns can
     have headers (Arnaldo Carvalho de Melo)

   - Fixup vertical line separating metrics from instructions and
     cleaning unused lines at the bottom, both in the annotate TUI
     browser (Arnaldo Carvalho de Melo)

   - Remove duplicated 'samples' in lost samples warning in
     'perf report' (Arnaldo Carvalho de Melo)

   - Synchronize i915_drm.h, silencing the perf build process,
     automagically adding support for the new DRM_I915_QUERY ioctl
     (Arnaldo Carvalho de Melo)

   - Make auxtrace_queues__add_buffer() allocate struct buffer, from a
     patchkit already applied (Adrian Hunter)

   - Fix the --stdio2/TUI annotate output to include group details, be
     it for a recorded '{a,b,f}' explicit event group or when forcing
     group display using 'perf report --group' for a set of events not
     recorded as a group (Arnaldo Carvalho de Melo)

   - Fix display artifacts in the ui browser (base class for the
     annotate and main report/top TUI browser) related to the extra
     title lines work (Arnaldo Carvalho de Melo)

   - perf auxtrace refactorings, leftovers from a previously partially
     processed patchset (Adrian Hunter)

   - Fix the builtin clang build (Sandipan Das, Arnaldo Carvalho de
     Melo)

   - Synchronize i915_drm.h, silencing a perf build warning and in the
     process automagically adding support for a new ioctl command
     (Arnaldo Carvalho de Melo)

   - Fix a strncpy issue in uprobe tracing"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  perf/core: Need CAP_SYS_ADMIN to create k/uprobe with perf_event_open()
  tracing/uprobe_event: Fix strncpy corner case
  perf/core: Fix perf_uprobe_init()
  perf/core: Fix perf_kprobe_init()
  perf/core: Fix use-after-free in uprobe_perf_close()
  perf tests clang: Fix function name for clang IR test
  perf clang: Add support for recent clang versions
  perf tools: Fix perf builds with clang support
  perf tools: No need to include namespaces.h in util.h
  perf hists browser: Remove leftover from row returned from refresh
  perf hists browser: Show extra_title_lines in the 'D' debug hotkey
  perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering
  tools headers uapi: Synchronize i915_drm.h
  perf report: Remove duplicated 'samples' in lost samples warning
  perf ui browser: Fixup cleaning unused lines at the bottom
  perf annotate browser: Fixup vertical line separating metrics from instructions
  perf annotate: Show group details on the title line
  perf auxtrace: Make auxtrace_queues__add_buffer() allocate struct buffer
  perf/x86/intel: Move regs->flags EXACT bit init
  perf trace: Remove redundant ')'
  ...

7 years agoMerge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 15 Apr 2018 19:32:06 +0000 (12:32 -0700)]
Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 EFI bootup fixlet from Thomas Gleixner:
 "A single fix for an early boot warning caused by invoking
  this_cpu_has() before SMP initialization"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix bogus warning during EFI bootup, use boot_cpu_has() instead of this_cpu_has() in build_cr3_noflush()

7 years agoMerge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 15 Apr 2018 19:29:46 +0000 (12:29 -0700)]
Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq affinity fixes from Thomas Gleixner:

  - Fix error path handling in the affinity spreading code

  - Make affinity spreading smarter to avoid issues on systems which
    claim to have hotpluggable CPUs while in fact they can't hotplug
    anything.

    So instead of trying to spread the vectors (and thereby the
    associated device queues) to all possibe CPUs, spread them on all
    present CPUs first. If there are left over vectors after that first
    step they are spread among the possible, but not present CPUs which
    keeps the code backwards compatible for virtual decives and NVME
    which allocate a queue per possible CPU, but makes the spreading
    smarter for devices which have less queues than possible or present
    CPUs.

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/affinity: Spread irq vectors among present CPUs as far as possible
  genirq/affinity: Allow irq spreading from a given starting point
  genirq/affinity: Move actual irq vector spreading into a helper function
  genirq/affinity: Rename *node_to_possible_cpumask as *node_to_cpumask
  genirq/affinity: Don't return with empty affinity masks on error

7 years agoMerge tag 'for-linus' of git://github.com/openrisc/linux
Linus Torvalds [Sun, 15 Apr 2018 19:27:58 +0000 (12:27 -0700)]
Merge tag 'for-linus' of git://github.com/openrisc/linux

Pull OpenRISC fixlet from Stafford Horne:
 "Just one small thing here, it came in a while back but I didnt have
  anything in my 4.16 queue, still its the only thing for 4.17 so
  sending it alone.

  Small cleanup: remove unused __ARCH_HAVE_MMU define"

* tag 'for-linus' of git://github.com/openrisc/linux:
  openrisc: remove unused __ARCH_HAVE_MMU define

7 years agoMerge tag 'powerpc-4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 15 Apr 2018 18:57:12 +0000 (11:57 -0700)]
Merge tag 'powerpc-4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix crashes when loading modules built with a different
   CONFIG_RELOCATABLE value by adding CONFIG_RELOCATABLE to vermagic.

 - Fix busy loops in the OPAL NVRAM driver if we get certain error
   conditions from firmware.

 - Remove tlbie trace points from KVM code that's called in real mode,
   because it causes crashes.

 - Fix checkstops caused by invalid tlbiel on Power9 Radix.

 - Ensure the set of CPU features we "know" are always enabled is
   actually the minimal set when we build with support for firmware
   supplied CPU features.

Thanks to: Aneesh Kumar K.V, Anshuman Khandual, Nicholas Piggin.

* tag 'powerpc-4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Fix CPU_FTRS_ALWAYS vs DT CPU features
  powerpc/mm/radix: Fix checkstops caused by invalid tlbiel
  KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode
  powerpc/8xx: Fix build with hugetlbfs enabled
  powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops
  powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops
  powerpc/fscr: Enable interrupts earlier before calling get_user()
  powerpc/64s: Fix section mismatch warnings from setup_rfi_flush()
  powerpc/modules: Fix crashes by adding CONFIG_RELOCATABLE to vermagic

7 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 14 Apr 2018 15:50:50 +0000 (08:50 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge yet more updates from Andrew Morton:

 - various hotfixes

 - kexec_file updates and feature work

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (27 commits)
  kernel/kexec_file.c: move purgatories sha256 to common code
  kernel/kexec_file.c: allow archs to set purgatory load address
  kernel/kexec_file.c: remove mis-use of sh_offset field during purgatory load
  kernel/kexec_file.c: remove unneeded variables in kexec_purgatory_setup_sechdrs
  kernel/kexec_file.c: remove unneeded for-loop in kexec_purgatory_setup_sechdrs
  kernel/kexec_file.c: split up __kexec_load_puragory
  kernel/kexec_file.c: use read-only sections in arch_kexec_apply_relocations*
  kernel/kexec_file.c: search symbols in read-only kexec_purgatory
  kernel/kexec_file.c: make purgatory_info->ehdr const
  kernel/kexec_file.c: remove checks in kexec_purgatory_load
  include/linux/kexec.h: silence compile warnings
  kexec_file, x86: move re-factored code to generic side
  x86: kexec_file: clean up prepare_elf64_headers()
  x86: kexec_file: lift CRASH_MAX_RANGES limit on crash_mem buffer
  x86: kexec_file: remove X86_64 dependency from prepare_elf64_headers()
  x86: kexec_file: purge system-ram walking from prepare_elf64_headers()
  kexec_file,x86,powerpc: factor out kexec_file_ops functions
  kexec_file: make use of purgatory optional
  proc: revalidate misc dentries
  mm, slab: reschedule cache_reap() on the same CPU
  ...

7 years agokernel/kexec_file.c: move purgatories sha256 to common code
Philipp Rudo [Fri, 13 Apr 2018 22:36:46 +0000 (15:36 -0700)]
kernel/kexec_file.c: move purgatories sha256 to common code

The code to verify the new kernels sha digest is applicable for all
architectures.  Move it to common code.

One problem is the string.c implementation on x86.  Currently sha256
includes x86/boot/string.h which defines memcpy and memset to be gcc
builtins.  By moving the sha256 implementation to common code and
changing the include to linux/string.h both functions are no longer
defined.  Thus definitions have to be provided in x86/purgatory/string.c

Link: http://lkml.kernel.org/r/20180321112751.22196-12-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokernel/kexec_file.c: allow archs to set purgatory load address
Philipp Rudo [Fri, 13 Apr 2018 22:36:43 +0000 (15:36 -0700)]
kernel/kexec_file.c: allow archs to set purgatory load address

For s390 new kernels are loaded to fixed addresses in memory before they
are booted.  With the current code this is a problem as it assumes the
kernel will be loaded to an 'arbitrary' address.  In particular,
kexec_locate_mem_hole searches for a large enough memory region and sets
the load address (kexec_bufer->mem) to it.

Luckily there is a simple workaround for this problem.  By returning 1
in arch_kexec_walk_mem, kexec_locate_mem_hole is turned off.  This
allows the architecture to set kbuf->mem by hand.  While the trick works
fine for the kernel it does not for the purgatory as here the
architectures don't have access to its kexec_buffer.

Give architectures access to the purgatories kexec_buffer by changing
kexec_load_purgatory to take a pointer to it.  With this change
architectures have access to the buffer and can edit it as they need.

A nice side effect of this change is that we can get rid of the
purgatory_info->purgatory_load_address field.  As now the information
stored there can directly be accessed from kbuf->mem.

Link: http://lkml.kernel.org/r/20180321112751.22196-11-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokernel/kexec_file.c: remove mis-use of sh_offset field during purgatory load
Philipp Rudo [Fri, 13 Apr 2018 22:36:39 +0000 (15:36 -0700)]
kernel/kexec_file.c: remove mis-use of sh_offset field during purgatory load

The current code uses the sh_offset field in purgatory_info->sechdrs to
store a pointer to the current load address of the section.  Depending
whether the section will be loaded or not this is either a pointer into
purgatory_info->purgatory_buf or kexec_purgatory.  This is not only a
violation of the ELF standard but also makes the code very hard to
understand as you cannot tell if the memory you are using is read-only
or not.

Remove this misuse and store the offset of the section in
pugaroty_info->purgatory_buf in sh_offset.

Link: http://lkml.kernel.org/r/20180321112751.22196-10-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokernel/kexec_file.c: remove unneeded variables in kexec_purgatory_setup_sechdrs
Philipp Rudo [Fri, 13 Apr 2018 22:36:35 +0000 (15:36 -0700)]
kernel/kexec_file.c: remove unneeded variables in kexec_purgatory_setup_sechdrs

The main loop currently uses quite a lot of variables to update the
section headers.  Some of them are unnecessary.  So clean them up a
little.

Link: http://lkml.kernel.org/r/20180321112751.22196-9-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokernel/kexec_file.c: remove unneeded for-loop in kexec_purgatory_setup_sechdrs
Philipp Rudo [Fri, 13 Apr 2018 22:36:32 +0000 (15:36 -0700)]
kernel/kexec_file.c: remove unneeded for-loop in kexec_purgatory_setup_sechdrs

To update the entry point there is an extra loop over all section
headers although this can be done in the main loop.  So move it there
and eliminate the extra loop and variable to store the 'entry section
index'.

Also, in the main loop, move the usual case, i.e.  non-bss section, out
of the extra if-block.

Link: http://lkml.kernel.org/r/20180321112751.22196-8-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokernel/kexec_file.c: split up __kexec_load_puragory
Philipp Rudo [Fri, 13 Apr 2018 22:36:28 +0000 (15:36 -0700)]
kernel/kexec_file.c: split up __kexec_load_puragory

When inspecting __kexec_load_purgatory you find that it has two tasks

1) setting up the kexec_buffer for the new kernel and,
2) setting up pi->sechdrs for the final load address.

The two tasks are independent of each other.  To improve readability
split up __kexec_load_purgatory into two functions, one for each task,
and call them directly from kexec_load_purgatory.

Link: http://lkml.kernel.org/r/20180321112751.22196-7-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokernel/kexec_file.c: use read-only sections in arch_kexec_apply_relocations*
Philipp Rudo [Fri, 13 Apr 2018 22:36:24 +0000 (15:36 -0700)]
kernel/kexec_file.c: use read-only sections in arch_kexec_apply_relocations*

When the relocations are applied to the purgatory only the section the
relocations are applied to is writable.  The other sections, i.e.  the
symtab and .rel/.rela, are in read-only kexec_purgatory.  Highlight this
by marking the corresponding variables as 'const'.

While at it also change the signatures of arch_kexec_apply_relocations* to
take section pointers instead of just the index of the relocation section.
This removes the second lookup and sanity check of the sections in arch
code.

Link: http://lkml.kernel.org/r/20180321112751.22196-6-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokernel/kexec_file.c: search symbols in read-only kexec_purgatory
Philipp Rudo [Fri, 13 Apr 2018 22:36:21 +0000 (15:36 -0700)]
kernel/kexec_file.c: search symbols in read-only kexec_purgatory

The stripped purgatory does not contain a symtab.  So when looking for
symbols this is done in read-only kexec_purgatory.  Highlight this by
marking the corresponding variables as 'const'.

Link: http://lkml.kernel.org/r/20180321112751.22196-5-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokernel/kexec_file.c: make purgatory_info->ehdr const
Philipp Rudo [Fri, 13 Apr 2018 22:36:17 +0000 (15:36 -0700)]
kernel/kexec_file.c: make purgatory_info->ehdr const

The kexec_purgatory buffer is read-only.  Thus all pointers into
kexec_purgatory are read-only, too.  Point this out by explicitly
marking purgatory_info->ehdr as 'const' and update the comments in
purgatory_info.

Link: http://lkml.kernel.org/r/20180321112751.22196-4-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokernel/kexec_file.c: remove checks in kexec_purgatory_load
Philipp Rudo [Fri, 13 Apr 2018 22:36:13 +0000 (15:36 -0700)]
kernel/kexec_file.c: remove checks in kexec_purgatory_load

Before the purgatory is loaded several checks are done whether the ELF
file in kexec_purgatory is valid or not.  These checks are incomplete.
For example they don't check for the total size of the sections defined
in the section header table or if the entry point actually points into
the purgatory.

On the other hand the purgatory, although an ELF file on its own, is
part of the kernel.  Thus not trusting the purgatory means not trusting
the kernel build itself.

So remove all validity checks on the purgatory and just trust the kernel
build.

Link: http://lkml.kernel.org/r/20180321112751.22196-3-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoinclude/linux/kexec.h: silence compile warnings
Philipp Rudo [Fri, 13 Apr 2018 22:36:10 +0000 (15:36 -0700)]
include/linux/kexec.h: silence compile warnings

Patch series "kexec_file: Clean up purgatory load", v2.

Following the discussion with Dave and AKASHI, here are the common code
patches extracted from my recent patch set (Add kexec_file_load support
to s390) [1].  The patches were extracted to allow upstream integration
together with AKASHI's common code patches before the arch code gets
adjusted to the new base.

The reason for this series is to prepare common code for adding
kexec_file_load to s390 as well as cleaning up the mis-use of the
sh_offset field during purgatory load.  In detail this series contains:

Patch #1&2: Minor cleanups/fixes.

Patch #3-9: Clean up the purgatory load/relocation code.  Especially
remove the mis-use of the purgatory_info->sechdrs->sh_offset field,
currently holding a pointer into either kexec_purgatory (ro) or
purgatory_buf (rw) depending on the section.  With these patches the
section address will be calculated verbosely and sh_offset will contain
the offset of the section in the stripped purgatory binary
(purgatory_buf).

Patch #10: Allows architectures to set the purgatory load address.  This
patch is important for s390 as the kernel and purgatory have to be
loaded to fixed addresses.  In current code this is impossible as the
purgatory load is opaque to the architecture.

Patch #11: Moves x86 purgatories sha implementation to common lib/
directory to allow reuse in other architectures.

This patch (of 11)

When building the kernel with CONFIG_KEXEC_FILE enabled gcc prints a
compile warning multiple times.

  In file included from <path>/linux/init/initramfs.c:526:0:
  <path>/include/linux/kexec.h:120:9: warning: `struct kimage' declared inside parameter list [enabled by default]
           unsigned long cmdline_len);
           ^

This is because the typedefs for kexec_file_load uses struct kimage
before it is declared.  Fix this by simply forward declaring struct
kimage.

Link: http://lkml.kernel.org/r/20180321112751.22196-2-prudo@linux.vnet.ibm.com
Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokexec_file, x86: move re-factored code to generic side
AKASHI Takahiro [Fri, 13 Apr 2018 22:36:06 +0000 (15:36 -0700)]
kexec_file, x86: move re-factored code to generic side

In the previous patches, commonly-used routines, exclude_mem_range() and
prepare_elf64_headers(), were carved out.  Now place them in kexec
common code.  A prefix "crash_" is given to each of their names to avoid
possible name collisions.

Link: http://lkml.kernel.org/r/20180306102303.9063-8-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agox86: kexec_file: clean up prepare_elf64_headers()
AKASHI Takahiro [Fri, 13 Apr 2018 22:36:03 +0000 (15:36 -0700)]
x86: kexec_file: clean up prepare_elf64_headers()

Removing bufp variable in prepare_elf64_headers() makes the code simpler
and more understandable.

Link: http://lkml.kernel.org/r/20180306102303.9063-7-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agox86: kexec_file: lift CRASH_MAX_RANGES limit on crash_mem buffer
AKASHI Takahiro [Fri, 13 Apr 2018 22:35:59 +0000 (15:35 -0700)]
x86: kexec_file: lift CRASH_MAX_RANGES limit on crash_mem buffer

While CRASH_MAX_RANGES (== 16) seems to be good enough, fixed-number
array is not a good idea in general.

In this patch, size of crash_mem buffer is calculated as before and the
buffer is now dynamically allocated.  This change also allows removing
crash_elf_data structure.

Link: http://lkml.kernel.org/r/20180306102303.9063-6-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agox86: kexec_file: remove X86_64 dependency from prepare_elf64_headers()
AKASHI Takahiro [Fri, 13 Apr 2018 22:35:56 +0000 (15:35 -0700)]
x86: kexec_file: remove X86_64 dependency from prepare_elf64_headers()

The code guarded by CONFIG_X86_64 is necessary on some architectures
which have a dedicated kernel mapping outside of linear memory mapping.
(arm64 is among those.)

In this patch, an additional argument, kernel_map, is added to enable/
disable the code removing #ifdef.

Link: http://lkml.kernel.org/r/20180306102303.9063-5-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agox86: kexec_file: purge system-ram walking from prepare_elf64_headers()
AKASHI Takahiro [Fri, 13 Apr 2018 22:35:53 +0000 (15:35 -0700)]
x86: kexec_file: purge system-ram walking from prepare_elf64_headers()

While prepare_elf64_headers() in x86 looks pretty generic for other
architectures' use, it contains some code which tries to list crash
memory regions by walking through system resources, which is not always
architecture agnostic.  To make this function more generic, the related
code should be purged.

In this patch, prepare_elf64_headers() simply scans crash_mem buffer
passed and add all the listed regions to elf header as a PT_LOAD
segment.  So walk_system_ram_res(prepare_elf64_headers_callback) have
been moved forward before prepare_elf64_headers() where the callback,
prepare_elf64_headers_callback(), is now responsible for filling up
crash_mem buffer.

Meanwhile exclude_elf_header_ranges() used to be called every time in
this callback it is rather redundant and now called only once in
prepare_elf_headers() as well.

Link: http://lkml.kernel.org/r/20180306102303.9063-4-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokexec_file,x86,powerpc: factor out kexec_file_ops functions
AKASHI Takahiro [Fri, 13 Apr 2018 22:35:49 +0000 (15:35 -0700)]
kexec_file,x86,powerpc: factor out kexec_file_ops functions

As arch_kexec_kernel_image_{probe,load}(),
arch_kimage_file_post_load_cleanup() and arch_kexec_kernel_verify_sig()
are almost duplicated among architectures, they can be commonalized with
an architecture-defined kexec_file_ops array.  So let's factor them out.

Link: http://lkml.kernel.org/r/20180306102303.9063-3-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokexec_file: make use of purgatory optional
AKASHI Takahiro [Fri, 13 Apr 2018 22:35:45 +0000 (15:35 -0700)]
kexec_file: make use of purgatory optional

Patch series "kexec_file, x86, powerpc: refactoring for other
architecutres", v2.

This is a preparatory patchset for adding kexec_file support on arm64.

It was originally included in a arm64 patch set[1], but Philipp is also
working on their kexec_file support on s390[2] and some changes are now
conflicting.

So these common parts were extracted and put into a separate patch set
for better integration.  What's more, my original patch#4 was split into
a few small chunks for easier review after Dave's comment.

As such, the resulting code is basically identical with my original, and
the only *visible* differences are:

 - renaming of _kexec_kernel_image_probe() and  _kimage_file_post_load_cleanup()

 - change one of types of arguments at prepare_elf64_headers()

Those, unfortunately, require a couple of trivial changes on the rest
(#1, #6 to #13) of my arm64 kexec_file patch set[1].

Patch #1 allows making a use of purgatory optional, particularly useful
for arm64.

Patch #2 commonalizes arch_kexec_kernel_{image_probe, image_load,
verify_sig}() and arch_kimage_file_post_load_cleanup() across
architectures.

Patches #3-#7 are also intended to generalize parse_elf64_headers(),
along with exclude_mem_range(), to be made best re-use of.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/561182.html
[2] http://lkml.iu.edu//hypermail/linux/kernel/1802.1/02596.html

This patch (of 7):

On arm64, crash dump kernel's usable memory is protected by *unmapping*
it from kernel virtual space unlike other architectures where the region
is just made read-only.  It is highly unlikely that the region is
accidentally corrupted and this observation rationalizes that digest
check code can also be dropped from purgatory.  The resulting code is so
simple as it doesn't require a bit ugly re-linking/relocation stuff,
i.e.  arch_kexec_apply_relocations_add().

Please see:

   http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html

All that the purgatory does is to shuffle arguments and jump into a new
kernel, while we still need to have some space for a hash value
(purgatory_sha256_digest) which is never checked against.

As such, it doesn't make sense to have trampline code between old kernel
and new kernel on arm64.

This patch introduces a new configuration, ARCH_HAS_KEXEC_PURGATORY, and
allows related code to be compiled in only if necessary.

[takahiro.akashi@linaro.org: fix trivial screwup]
Link: http://lkml.kernel.org/r/20180309093346.GF25863@linaro.org
Link: http://lkml.kernel.org/r/20180306102303.9063-2-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoproc: revalidate misc dentries
Alexey Dobriyan [Fri, 13 Apr 2018 22:35:42 +0000 (15:35 -0700)]
proc: revalidate misc dentries

If module removes proc directory while another process pins it by
chdir'ing to it, then subsequent recreation of proc entry and all
entries down the tree will not be visible to any process until pinning
process unchdir from directory and unpins everything.

Steps to reproduce:

proc_mkdir("aaa", NULL);
proc_create("aaa/bbb", ...);

chdir("/proc/aaa");

remove_proc_entry("aaa/bbb", NULL);
remove_proc_entry("aaa", NULL);

proc_mkdir("aaa", NULL);
# inaccessible because "aaa" dentry still points
# to the original "aaa".
proc_create("aaa/bbb", ...);

Fix is to implement ->d_revalidate and ->d_delete.

Link: http://lkml.kernel.org/r/20180312201938.GA4871@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agomm, slab: reschedule cache_reap() on the same CPU
Vlastimil Babka [Fri, 13 Apr 2018 22:35:38 +0000 (15:35 -0700)]
mm, slab: reschedule cache_reap() on the same CPU

cache_reap() is initially scheduled in start_cpu_timer() via
schedule_delayed_work_on(). But then the next iterations are scheduled
via schedule_delayed_work(), i.e. using WORK_CPU_UNBOUND.

Thus since commit ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND
work on wq_unbound_cpumask CPUs") there is no guarantee the future
iterations will run on the originally intended cpu, although it's still
preferred.  I was able to demonstrate this with
/sys/module/workqueue/parameters/debug_force_rr_cpu.  IIUC, it may also
happen due to migrating timers in nohz context.  As a result, some cpu's
would be calling cache_reap() more frequently and others never.

This patch uses schedule_delayed_work_on() with the current cpu when
scheduling the next iteration.

Link: http://lkml.kernel.org/r/20180411070007.32225-1-vbabka@suse.cz
Fixes: ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokexec: export PG_swapbacked to VMCOREINFO
Petr Tesarik [Fri, 13 Apr 2018 22:35:34 +0000 (15:35 -0700)]
kexec: export PG_swapbacked to VMCOREINFO

Since commit 6326fec1122c ("mm: Use owner_priv bit for PageSwapCache,
valid when PageSwapBacked"), PG_swapcache is an alias for
PG_owner_priv_1, which may be also used for other purposes.

To know whether the bit indeed has the PG_swapcache meaning, it is
necessary to check PG_swapbacked, hence this bit must be exported.

Link: http://lkml.kernel.org/r/20180410161345.142e142d@ezekiel.suse.cz
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Xunlei Pang <xlpang@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: "Marc-Andr Lureau" <marcandre.lureau@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoipc/shm: fix use-after-free of shm file via remap_file_pages()
Eric Biggers [Fri, 13 Apr 2018 22:35:30 +0000 (15:35 -0700)]
ipc/shm: fix use-after-free of shm file via remap_file_pages()

syzbot reported a use-after-free of shm_file_data(file)->file->f_op in
shm_get_unmapped_area(), called via sys_remap_file_pages().

Unfortunately it couldn't generate a reproducer, but I found a bug which
I think caused it.  When remap_file_pages() is passed a full System V
shared memory segment, the memory is first unmapped, then a new map is
created using the ->vm_file.  Between these steps, the shm ID can be
removed and reused for a new shm segment.  But, shm_mmap() only checks
whether the ID is currently valid before calling the underlying file's
->mmap(); it doesn't check whether it was reused.  Thus it can use the
wrong underlying file, one that was already freed.

Fix this by making the "outer" shm file (the one that gets put in
->vm_file) hold a reference to the real shm file, and by making
__shm_open() require that the file associated with the shm ID matches
the one associated with the "outer" file.

Taking the reference to the real shm file is needed to fully solve the
problem, since otherwise sfd->file could point to a freed file, which
then could be reallocated for the reused shm ID, causing the wrong shm
segment to be mapped (and without the required permission checks).

Commit 1ac0b6dec656 ("ipc/shm: handle removed segments gracefully in
shm_mmap()") almost fixed this bug, but it didn't go far enough because
it didn't consider the case where the shm ID is reused.

The following program usually reproduces this bug:

#include <stdlib.h>
#include <sys/shm.h>
#include <sys/syscall.h>
#include <unistd.h>

int main()
{
int is_parent = (fork() != 0);
srand(getpid());
for (;;) {
int id = shmget(0xF00F, 4096, IPC_CREAT|0700);
if (is_parent) {
void *addr = shmat(id, NULL, 0);
usleep(rand() % 50);
while (!syscall(__NR_remap_file_pages, addr, 4096, 0, 0, 0));
} else {
usleep(rand() % 50);
shmctl(id, IPC_RMID, NULL);
}
}
}

It causes the following NULL pointer dereference due to a 'struct file'
being used while it's being freed.  (I couldn't actually get a KASAN
use-after-free splat like in the syzbot report.  But I think it's
possible with this bug; it would just take a more extraordinary race...)

BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 9 PID: 258 Comm: syz_ipc Not tainted 4.16.0-05140-gf8cf2f16a7c95 #189
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
RIP: 0010:d_inode include/linux/dcache.h:519 [inline]
RIP: 0010:touch_atime+0x25/0xd0 fs/inode.c:1724
[...]
Call Trace:
 file_accessed include/linux/fs.h:2063 [inline]
 shmem_mmap+0x25/0x40 mm/shmem.c:2149
 call_mmap include/linux/fs.h:1789 [inline]
 shm_mmap+0x34/0x80 ipc/shm.c:465
 call_mmap include/linux/fs.h:1789 [inline]
 mmap_region+0x309/0x5b0 mm/mmap.c:1712
 do_mmap+0x294/0x4a0 mm/mmap.c:1483
 do_mmap_pgoff include/linux/mm.h:2235 [inline]
 SYSC_remap_file_pages mm/mmap.c:2853 [inline]
 SyS_remap_file_pages+0x232/0x310 mm/mmap.c:2769
 do_syscall_64+0x64/0x1a0 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7

[ebiggers@google.com: add comment]
Link: http://lkml.kernel.org/r/20180410192850.235835-1-ebiggers3@gmail.com
Link: http://lkml.kernel.org/r/20180409043039.28915-1-ebiggers3@gmail.com
Reported-by: syzbot+d11f321e7f1923157eac80aa990b446596f46439@syzkaller.appspotmail.com
Fixes: c8d78c1823f4 ("mm: replace remap_file_pages() syscall with emulation")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agomm/filemap.c: provide dummy filemap_page_mkwrite() for NOMMU
Arnd Bergmann [Fri, 13 Apr 2018 22:35:27 +0000 (15:35 -0700)]
mm/filemap.c: provide dummy filemap_page_mkwrite() for NOMMU

Building orangefs on MMU-less machines now results in a link error
because of the newly introduced use of the filemap_page_mkwrite()
function:

  ERROR: "filemap_page_mkwrite" [fs/orangefs/orangefs.ko] undefined!

This adds a dummy version for it, similar to the existing
generic_file_mmap and generic_file_readonly_mmap stubs in the same file,
to avoid the link error without adding #ifdefs in each file system that
uses these.

Link: http://lkml.kernel.org/r/20180409105555.2439976-1-arnd@arndb.de
Fixes: a5135eeab2e5 ("orangefs: implement vm_ops->fault")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Martin Brandenburg <martin@omnibond.com>
Cc: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agomm/gup.c: document return value
Michael S. Tsirkin [Fri, 13 Apr 2018 22:35:23 +0000 (15:35 -0700)]
mm/gup.c: document return value

__get_user_pages_fast handles errors differently from
get_user_pages_fast: the former always returns the number of pages
pinned, the later might return a negative error code.

Link: http://lkml.kernel.org/r/1522962072-182137-6-git-send-email-mst@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoget_user_pages_fast(): return -EFAULT on access_ok failure
Michael S. Tsirkin [Fri, 13 Apr 2018 22:35:20 +0000 (15:35 -0700)]
get_user_pages_fast(): return -EFAULT on access_ok failure

get_user_pages_fast is supposed to be a faster drop-in equivalent of
get_user_pages.  As such, callers expect it to return a negative return
code when passed an invalid address, and never expect it to return 0
when passed a positive number of pages, since its documentation says:

 * Returns number of pages pinned. This may be fewer than the number
 * requested. If nr_pages is 0 or negative, returns 0. If no pages
 * were pinned, returns -errno.

When get_user_pages_fast fall back on get_user_pages this is exactly
what happens.  Unfortunately the implementation is inconsistent: it
returns 0 if passed a kernel address, confusing callers: for example,
the following is pretty common but does not appear to do the right thing
with a kernel address:

        ret = get_user_pages_fast(addr, 1, writeable, &page);
        if (ret < 0)
                return ret;

Change get_user_pages_fast to return -EFAULT when supplied a kernel
address to make it match expectations.

All callers have been audited for consistency with the documented
semantics.

Link: http://lkml.kernel.org/r/1522962072-182137-4-git-send-email-mst@redhat.com
Fixes: 5b65c4677a57 ("mm, x86/mm: Fix performance regression in get_user_pages_fast()")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agomm/gup_benchmark: handle gup failures
Michael S. Tsirkin [Fri, 13 Apr 2018 22:35:16 +0000 (15:35 -0700)]
mm/gup_benchmark: handle gup failures

Patch series "mm/get_user_pages_fast fixes, cleanups", v2.

Turns out get_user_pages_fast and __get_user_pages_fast return different
values on error when given a single page: __get_user_pages_fast returns
0.  get_user_pages_fast returns either 0 or an error.

Callers of get_user_pages_fast expect an error so fix it up to return an
error consistently.

Stress the difference between get_user_pages_fast and
__get_user_pages_fast to make sure callers aren't confused.

This patch (of 3):

__gup_benchmark_ioctl does not handle the case where get_user_pages_fast
fails:

 - a negative return code will cause a buffer overrun

 - returning with partial success will cause use of uninitialized
   memory.

[akpm@linux-foundation.org: simplification]
Link: http://lkml.kernel.org/r/1522962072-182137-3-git-send-email-mst@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoresource: fix integer overflow at reallocation
Takashi Iwai [Fri, 13 Apr 2018 22:35:13 +0000 (15:35 -0700)]
resource: fix integer overflow at reallocation

We've got a bug report indicating a kernel panic at booting on an x86-32
system, and it turned out to be the invalid PCI resource assigned after
reallocation.  __find_resource() first aligns the resource start address
and resets the end address with start+size-1 accordingly, then checks
whether it's contained.  Here the end address may overflow the integer,
although resource_contains() still returns true because the function
validates only start and end address.  So this ends up with returning an
invalid resource (start > end).

There was already an attempt to cover such a problem in the commit
47ea91b4052d ("Resource: fix wrong resource window calculation"), but
this case is an overseen one.

This patch adds the validity check of the newly calculated resource for
avoiding the integer overflow problem.

Bugzilla: http://bugzilla.opensuse.org/show_bug.cgi?id=1086739
Link: http://lkml.kernel.org/r/s5hpo37d5l8.wl-tiwai@suse.de
Fixes: 23c570a67448 ("resource: ability to resize an allocated resource")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reported-by: Michael Henders <hendersm@shaw.ca>
Tested-by: Michael Henders <hendersm@shaw.ca>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszer...
Linus Torvalds [Fri, 13 Apr 2018 23:55:41 +0000 (16:55 -0700)]
Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs updates from Miklos Szeredi:
 "In addition to bug fixes and cleanups there are two new features from
  Amir:

   - Consistent inode number support for the case when layers are not
     all on the same filesystem (feature is dubbed "xino").

   - Optimize overlayfs file handle decoding. This one touches the
     exportfs interface to allow detecting the disconnected directory
     case"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: update documentation w.r.t "xino" feature
  ovl: add support for "xino" mount and config options
  ovl: consistent d_ino for non-samefs with xino
  ovl: consistent i_ino for non-samefs with xino
  ovl: constant st_ino for non-samefs with xino
  ovl: allocate anon bdev per unique lower fs
  ovl: factor out ovl_map_dev_ino() helper
  ovl: cleanup ovl_update_time()
  ovl: add WARN_ON() for non-dir redirect cases
  ovl: cleanup setting OVL_INDEX
  ovl: set d->is_dir and d->opaque for last path element
  ovl: Do not check for redirect if this is last layer
  ovl: lookup in inode cache first when decoding lower file handle
  ovl: do not try to reconnect a disconnected origin dentry
  ovl: disambiguate ovl_encode_fh()
  ovl: set lower layer st_dev only if setting lower st_ino
  ovl: fix lookup with middle layer opaque dir and absolute path redirects
  ovl: Set d->last properly during lookup
  ovl: set i_ino to the value of st_ino for NFS export