www.infradead.org Git - users/jedix/linux-maple.git/log
7 years ago crypto: ccp - Use only the relevant interrupt bits
Gary R Hook [Thu, 20 Apr 2017 20:24:09 +0000 (15:24 -0500)]
crypto: ccp - Use only the relevant interrupt bits

Orabug: 26644685

Each CCP queue can produce interrupts for 4 conditions:
operation complete, queue empty, error, and queue stopped.
This driver only works with completion and error events.

Cc: <stable@vger.kernel.org> # 4.9.x+
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 56467cb11cf8ae4db9003f54b3d3425b5f07a10a)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
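
The idea in the commit message above can be sketched in plain C: build one mask of the interrupt conditions the driver acts on and apply it wherever status bits are read. The bit positions and the names `INT_COMPLETION`, `INT_ERROR`, `INT_QUEUE_STOPPED`, `INT_EMPTY_QUEUE`, `SUPPORTED_INTERRUPTS` and `relevant_status()` are assumptions made for illustration; the log does not show the actual patch.

```c
#include <stdint.h>

/* Assumed bit positions for the four per-queue conditions (illustrative). */
#define INT_COMPLETION    (1u << 0)  /* operation complete */
#define INT_ERROR         (1u << 1)  /* error */
#define INT_QUEUE_STOPPED (1u << 2)  /* queue stopped */
#define INT_EMPTY_QUEUE   (1u << 3)  /* queue empty */

/* Only the conditions the driver acts on: completion and error. */
#define SUPPORTED_INTERRUPTS (INT_COMPLETION | INT_ERROR)

/* Drop status bits for conditions the driver does not handle. */
static inline uint32_t relevant_status(uint32_t raw_status)
{
    return raw_status & SUPPORTED_INTERRUPTS;
}
```

With this shape, a "queue empty" or "queue stopped" bit never reaches the handler logic, which is the behavior the message describes.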
7 years ago crypto: ccp - Rearrange structure members to minimize size
Gary R Hook [Tue, 28 Mar 2017 15:57:26 +0000 (10:57 -0500)]
crypto: ccp - Rearrange structure members to minimize size

Orabug: 26644685

The AES GCM function (in ccp-ops) requires a fair amount of
stack space, which elicits a complaint when KASAN is enabled.
Rearranging and packing a few structures eliminates the
warning.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 2d158391061ec8c73898ceac148f4eddfa83efd5)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
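
A small sketch of why member order affects structure size, and therefore stack use. The two structures below are hypothetical and are not the driver's; they hold the same members in different orders.

```c
#include <stdint.h>

/* Members ordered so that the compiler must insert padding between them. */
struct op_padded {
    uint8_t  mode;
    uint64_t src_addr;
    uint8_t  flags;
    uint32_t length;
};

/* Same members, largest first, so the small fields share one word. */
struct op_compact {
    uint64_t src_addr;
    uint32_t length;
    uint8_t  mode;
    uint8_t  flags;
};

/* Bytes saved per instance by the reordering (depends on the ABI). */
static inline unsigned long bytes_saved(void)
{
    return (unsigned long)(sizeof(struct op_padded) - sizeof(struct op_compact));
}
```

On a typical 64-bit ABI the first layout occupies 24 bytes and the second 16, so each on-stack instance shrinks by a third without losing any field.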
7 years ago crypto: ccp - Remove redundant cpu-to-le32 macros
Gary R Hook [Tue, 28 Mar 2017 13:58:28 +0000 (08:58 -0500)]
crypto: ccp - Remove redundant cpu-to-le32 macros

Orabug: 26644685

Endianness is dealt with when the command descriptor is
copied into the command queue. Remove any occurrences of
cpu_to_le32() found elsewhere.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 51de7dd02d422da11b4dff6f11936c8333a870fe)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
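
The pattern described above, converting byte order once at the point where the descriptor is copied into the queue, might look like the following userspace sketch. `put_le32()` and `copy_desc_to_queue()` are hypothetical helpers, not functions from the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Store one 32-bit value in little-endian byte order on any host. */
static void put_le32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xff);
    dst[1] = (uint8_t)((v >> 8) & 0xff);
    dst[2] = (uint8_t)((v >> 16) & 0xff);
    dst[3] = (uint8_t)((v >> 24) & 0xff);
}

/*
 * Copy a descriptor that was built in CPU byte order into a queue slot.
 * This is the only place where byte order is converted, so the code that
 * fills in 'desc' needs no conversions of its own.
 */
static void copy_desc_to_queue(uint8_t *slot, const uint32_t *desc, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        put_le32(slot + 4 * i, desc[i]);
}
```

Converting a second time elsewhere would swap the bytes back on a big-endian host, which is why the extra conversions are redundant at best.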
7 years ago crypto: ccp - Enable 3DES function on v5 CCPs
Gary R Hook [Wed, 15 Mar 2017 18:20:52 +0000 (13:20 -0500)]
crypto: ccp - Enable 3DES function on v5 CCPs

Orabug: 26644685

Wire up support for Triple DES in ECB mode.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 990672d48515ce09c76fcf1ceccee48b0dd1942b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Add SHA-2 384- and 512-bit support
Gary R Hook [Wed, 15 Mar 2017 18:20:43 +0000 (13:20 -0500)]
crypto: ccp - Add SHA-2 384- and 512-bit support

Orabug: 26644685

Incorporate 384-bit and 512-bit hashing for a version 5 CCP
device.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit ccebcf3f224a44ec8e9c5bfca9d8e5d29298a5a8)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Make some CCP DMA channels private
Gary R Hook [Thu, 23 Mar 2017 17:53:30 +0000 (12:53 -0500)]
crypto: ccp - Make some CCP DMA channels private

Orabug: 26644685

The CCP registers its queues as channels capable of handling
general DMA operations. The NTB driver will use DMA if
directed, but as public channels can be reserved for use in
asynchronous operations some channels should be held back
as private. Since the public/private determination is
handled at a device level, reserve the "other" (secondary)
CCP channels as private.

Add a module parameter that allows for override, to be
applied to all channels on all devices.

CC: <stable@vger.kernel.org> # 4.10.x-
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit efc989fce8703914bac091dcc4b8ff7a72ccf987)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Assign DMA commands to the channel's CCP
Gary R Hook [Fri, 10 Mar 2017 18:28:18 +0000 (12:28 -0600)]
crypto: ccp - Assign DMA commands to the channel's CCP

Orabug: 26644685

The CCP driver generally uses a round-robin approach when
assigning operations to available CCPs. For the DMA engine,
however, the DMA mappings of the SGs are associated with a
specific CCP. When an IOMMU is enabled, the IOMMU is
programmed based on this specific device.

If the DMA operations are not performed by that specific
CCP then addressing errors and I/O page faults will occur.

Update the CCP driver to allow a specific CCP device to be
requested for an operation and use this in the DMA engine
support.

Cc: <stable@vger.kernel.org> # 4.9.x-
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 7c468447f40645fbf2a033dfdaa92b1957130d50)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
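
The selection rule described above (round-robin by default, a specific device when the caller asks for one) can be sketched as follows. The types `ccp_dev_sketch` and `ccp_pool` and the function `pick_device()` are hypothetical and only illustrate the rule.

```c
#include <stddef.h>

struct ccp_dev_sketch {
    int id;
};

struct ccp_pool {
    struct ccp_dev_sketch *devs;
    size_t count;  /* number of devices, must be greater than zero */
    size_t next;   /* round-robin cursor */
};

/*
 * Return the requested device if there is one; otherwise hand out devices
 * in round-robin order. A DMA operation would pass the device that owns
 * its mappings, so the IOMMU sees the device it was programmed for.
 */
static struct ccp_dev_sketch *pick_device(struct ccp_pool *pool,
                                          struct ccp_dev_sketch *requested)
{
    struct ccp_dev_sketch *dev;

    if (requested)
        return requested;

    dev = &pool->devs[pool->next];
    pool->next = (pool->next + 1) % pool->count;
    return dev;
}
```

A specific request leaves the round-robin cursor untouched, so crypto operations keep rotating evenly across devices.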
7 years ago crypto: ccp - Simplify some buffer management routines
Gary R Hook [Thu, 9 Feb 2017 21:50:08 +0000 (15:50 -0600)]
crypto: ccp - Simplify some buffer management routines

Orabug: 26644685

The reverse-get/set functions can be simplified by
eliminating unused code.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 83d650ab78c7185da815e16d03fb579d3fde0140)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Update the command queue on errors
Gary R Hook [Thu, 9 Feb 2017 21:49:57 +0000 (15:49 -0600)]
crypto: ccp - Update the command queue on errors

Orabug: 26644685

Move the command queue tail pointer when an error is
detected. Always return the error.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 4cdf101ef444e47bc8869ef3e90396e828fd9b61)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Change mode for detailed CCP init messages
Gary R Hook [Thu, 9 Feb 2017 21:49:48 +0000 (15:49 -0600)]
crypto: ccp - Change mode for detailed CCP init messages

Orabug: 26644685

The CCP initialization messages only need to be sent to
syslog in debug mode.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit a60496a0ca0d34a3ae92e426138eab35f0f45612)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Set the AES size field for all modes
Gary R Hook [Wed, 8 Feb 2017 19:07:06 +0000 (13:07 -0600)]
crypto: ccp - Set the AES size field for all modes

Orabug: 26644685

Ensure that the size field is correctly populated for
all AES modes.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit f7cc02b3c3a33a10dd5bb9e5dfd22e47e09503a2)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Fix double add when creating new DMA command
Gary R Hook [Fri, 27 Jan 2017 23:09:04 +0000 (17:09 -0600)]
crypto: ccp - Fix double add when creating new DMA command

Orabug: 26644685

Eliminate a double-add by creating a new list to manage
command descriptors when created; move the descriptor to
the pending list when the command is submitted.

Cc: <stable@vger.kernel.org>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit e5da5c5667381d2772374ee6a2967b3576c9483d)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Fix DMA operations when IOMMU is enabled
Gary R Hook [Fri, 27 Jan 2017 21:28:45 +0000 (15:28 -0600)]
crypto: ccp - Fix DMA operations when IOMMU is enabled

Orabug: 26644685

An I/O page fault occurs when the IOMMU is enabled on a
system that supports the v5 CCP.  DMA operations use a
Request ID value that does not match what is expected by
the IOMMU, resulting in the I/O page fault.  Setting the
Request ID value to 0 corrects this issue.

Cc: <stable@vger.kernel.org>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 500c0106e638e08c2c661c305ed57d6b67e10908)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Fix handling of RSA exponent on a v5 device
Gary R Hook [Tue, 1 Nov 2016 19:05:05 +0000 (14:05 -0500)]
crypto: ccp - Fix handling of RSA exponent on a v5 device

Orabug: 26644685

The exponent size in the ccp_op structure is in bits. A v5
CCP requires the exponent size to be in bytes, so convert
the size from bits to bytes when populating the descriptor.

The current code references the exponent in memory, but
these fields have not been set since the exponent is
actually stored in the LSB. Populate the descriptor with
the LSB location (address).

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit e6414b13ea39e3011901a84eb1bdefa65610b0f8)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
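
The unit conversion mentioned above is simple arithmetic. The helper below is a hypothetical sketch; rounding up to a whole byte is an assumption, since the message does not say how sizes that are not a multiple of 8 are handled.

```c
/*
 * Convert a size in bits to a size in whole bytes.
 * Rounds up, so 17 bits need 3 bytes (rounding is an assumption here).
 * Written without 'bits + 7' to avoid overflow for very large inputs.
 */
static inline unsigned int bits_to_bytes(unsigned int bits)
{
    return bits / 8 + (bits % 8 != 0);
}
```

For example, a 2048-bit exponent becomes 256 bytes in the descriptor field.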
7 years ago crypto: ccp - fix typo "CPP"
Paul Bolle [Thu, 20 Oct 2016 19:20:59 +0000 (21:20 +0200)]
crypto: ccp - fix typo "CPP"

Orabug: 26644685

The abbreviation for Cryptographic Coprocessor is "CCP".

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit c8d283ff8b0b6b2061dfc137afd6c56608a34bcb)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Clean up the LSB slot allocation code
Gary R Hook [Tue, 18 Oct 2016 22:33:37 +0000 (17:33 -0500)]
crypto: ccp - Clean up the LSB slot allocation code

Orabug: 26644685

Fix a few problems revealed by testing: verify consistent
units, especially in public slot allocation. Percolate
some common initialization code up to a common routine.
Add some comments.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 103600ab966a2f02d8986bbfdf87b762b1c6a06d)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - remove unneeded code
Gary R Hook [Tue, 18 Oct 2016 22:28:49 +0000 (17:28 -0500)]
crypto: ccp - remove unneeded code

Orabug: 26644685

Clean up patch for an unneeded structure member.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit ec9b70df75b3600ca20338198a43173f23e6bb9b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - change bitfield type to unsigned ints
Gary R Hook [Tue, 18 Oct 2016 22:28:35 +0000 (17:28 -0500)]
crypto: ccp - change bitfield type to unsigned ints

Orabug: 26644685

Bit fields are not sensitive to endianness, so use
a transparent standard data type.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit fdd2cf9db1e25a46a74c5802d18435171c92e7df)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Fix non static symbol warning
Wei Yongjun [Mon, 17 Oct 2016 15:08:50 +0000 (15:08 +0000)]
crypto: ccp - Fix non static symbol warning

Orabug: 26644685

Fixes the following sparse warning:

drivers/crypto/ccp/ccp-dev.c:44:6: warning:
 symbol 'ccp_error_codes' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit ff4f44de44dbd98feecf8fa76e14353a3993b335)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - change type of struct member lsb to signed
Gary R Hook [Wed, 12 Oct 2016 13:47:03 +0000 (08:47 -0500)]
crypto: ccp - change type of struct member lsb to signed

Orabug: 26644685

The lsb field uses a value of -1 to indicate that it
is unassigned. Therefore its type must be a signed int.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 3cf799680d2612a21d50ed554848dd37241672c8)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Make syslog errors human-readable
Gary R Hook [Wed, 28 Sep 2016 16:53:56 +0000 (11:53 -0500)]
crypto: ccp - Make syslog errors human-readable

Orabug: 26644685

Add human-readable strings to log messages about CCP errors

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 81422badb39078fde1ffcecda3caac555226fc7b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - clean up data structure
Gary R Hook [Wed, 28 Sep 2016 16:53:47 +0000 (11:53 -0500)]
crypto: ccp - clean up data structure

Orabug: 26644685

Change names of data structure instances.  Add const
keyword where appropriate.  Add error handling path.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 9ddb9dc6be095ebe393f7eb582df09cc4847c5e9)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Fix return value check in ccp_dmaengine_register()
Wei Yongjun [Sat, 17 Sep 2016 16:01:22 +0000 (16:01 +0000)]
crypto: ccp - Fix return value check in ccp_dmaengine_register()

Orabug: 26644685

Fix the return value check, which was testing the wrong variable
in ccp_dmaengine_register().

Fixes: 58ea8abf4904 ("crypto: ccp - Register the CCP as a DMA resource")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 7514e3688811e610640ec2201ca14dfebfe13442)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - use kmem_cache_zalloc instead of kmem_cache_alloc/memset
Wei Yongjun [Thu, 15 Sep 2016 03:28:04 +0000 (03:28 +0000)]
crypto: ccp - use kmem_cache_zalloc instead of kmem_cache_alloc/memset

Orabug: 26644685

Use kmem_cache_zalloc() instead of kmem_cache_alloc() and memset().

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 664f570a9cee51a8c7caef042118abd2b48705b1)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - add missing release in ccp_dmaengine_register
Quentin Lambert [Fri, 2 Sep 2016 09:48:53 +0000 (11:48 +0200)]
crypto: ccp - add missing release in ccp_dmaengine_register

Orabug: 26644685

ccp_dmaengine_register() used to return with an error code before
releasing all resources. This patch adds a jump to the appropriate label,
ensuring that the resources are properly released before returning.

This issue was found with Hector.

Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit ba22a1e2aa8ef7f8467f755cfe44b79784febefe)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
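
The fix described above follows the usual C cleanup idiom: an error path jumps to a label that releases what has been acquired so far instead of returning directly. The sketch below is a userspace illustration with hypothetical names (`registration`, `register_sketch()`, `demo_alloc()`); it is not the code of ccp_dmaengine_register().

```c
#include <stdlib.h>

static int live_allocations;  /* outstanding allocations, for the demo only */

static void *demo_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        live_allocations++;
    return p;
}

static void demo_free(void *p)
{
    if (p) {
        live_allocations--;
        free(p);
    }
}

struct registration {
    void *cache;
    void *channels;
};

/* simulate_failure != 0 makes the final step fail after both allocations. */
static int register_sketch(struct registration *reg, int simulate_failure)
{
    int ret;

    reg->cache = demo_alloc(64);
    if (!reg->cache)
        return -1;              /* nothing acquired yet, plain return is fine */

    reg->channels = demo_alloc(64);
    if (!reg->channels) {
        ret = -1;
        goto err_cache;
    }

    if (simulate_failure) {
        ret = -2;
        goto err_channels;      /* jump to cleanup instead of returning */
    }

    return 0;

err_channels:
    demo_free(reg->channels);
    reg->channels = NULL;
err_cache:
    demo_free(reg->cache);
    reg->cache = NULL;
    return ret;
}

static void unregister_sketch(struct registration *reg)
{
    demo_free(reg->channels);
    demo_free(reg->cache);
    reg->channels = NULL;
    reg->cache = NULL;
}
```

Labels are ordered so that each one releases one resource and falls through to the next, which keeps every failure path releasing exactly what was acquired before it.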
7 years ago crypto: ccp - Fix non static symbol warning
Wei Yongjun [Fri, 12 Aug 2016 00:00:09 +0000 (00:00 +0000)]
crypto: ccp - Fix non static symbol warning

Orabug: 26644685

Fixes the following sparse warning:

drivers/crypto/ccp/ccp-dev.c:62:14: warning:
 symbol 'ccp_increment_unit_ordinal' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit dabc7904a74c47ead9d40cc00d5e8b1946a0736c)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Enable use of the additional CCP
Gary R Hook [Wed, 27 Jul 2016 00:10:49 +0000 (19:10 -0500)]
crypto: ccp - Enable use of the additional CCP

Orabug: 26644685

A second CCP is available, identical to the first, with
its own PCI ID. Make it available for use by the crypto
subsystem, as well as for DMA activity and random
number generation.

This device is not pre-configured at boot time. The
driver must configure it (during the probe) for use.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit e14e7d126765ce0156ab5e3b250b1270998c207d)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Enable DMA service on a v5 CCP
Gary R Hook [Wed, 27 Jul 2016 00:10:40 +0000 (19:10 -0500)]
crypto: ccp - Enable DMA service on a v5 CCP

Orabug: 26644685

Every CCP is capable of providing general DMA services.
Register the device as a provider.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 99d90b2ebd8b327c0c496798db99009b30c70945)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Add support for the RNG in a version 5 CCP
Gary R Hook [Wed, 27 Jul 2016 00:10:31 +0000 (19:10 -0500)]
crypto: ccp - Add support for the RNG in a version 5 CCP

Orabug: 26644685

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 084935b208f6507ef5214fd67052a67a700bc6cf)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Let a v5 CCP provide the same function as v3
Gary R Hook [Wed, 27 Jul 2016 00:10:21 +0000 (19:10 -0500)]
crypto: ccp - Let a v5 CCP provide the same function as v3

Orabug: 26644685

Enable equivalent function on a v5 CCP. Add support for a
version 5 CCP which enables AES/XTS/SHA services. Also,
more work on the data structures to virtualize
functionality.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 4b394a232df78414442778b02ca4a388d947d059)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Refactor code to enable checks for queue space.
Gary R Hook [Wed, 27 Jul 2016 00:10:13 +0000 (19:10 -0500)]
crypto: ccp - Refactor code to enable checks for queue space.

Orabug: 26644685

Available queue space is used to decide (by counting free slots)
if we have to put a command on hold or if it can be sent
to the engine immediately.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit bb4e89b34d1bf46156b7e880a0f34205fb7ce2a5)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
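
The decision described above, counting free slots to choose between sending a command immediately and putting it on hold, can be sketched like this. The structure and names (`cmd_queue_sketch`, `free_slots()`, `classify_cmd()`) are hypothetical.

```c
struct cmd_queue_sketch {
    unsigned int capacity;   /* total slots in the hardware queue */
    unsigned int in_flight;  /* slots currently occupied, <= capacity */
};

enum cmd_disposition {
    CMD_SUBMIT_NOW,  /* enough room: send to the engine immediately */
    CMD_HOLD         /* not enough room: keep the command on hold */
};

/* Number of slots still available. */
static unsigned int free_slots(const struct cmd_queue_sketch *q)
{
    return q->capacity - q->in_flight;
}

/* Decide what to do with a command that needs 'slots_needed' slots. */
static enum cmd_disposition classify_cmd(const struct cmd_queue_sketch *q,
                                         unsigned int slots_needed)
{
    return slots_needed <= free_slots(q) ? CMD_SUBMIT_NOW : CMD_HOLD;
}
```

Keeping the free-slot computation in one helper means the hold-or-submit decision can be reused wherever commands are enqueued.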
7 years ago crypto: ccp - Refactor code supporting the CCP's RNG
Gary R Hook [Wed, 27 Jul 2016 00:10:02 +0000 (19:10 -0500)]
crypto: ccp - Refactor code supporting the CCP's RNG

Orabug: 26644685

Make the RNG support code common (where possible) in
preparation for adding a v5 device.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 8256e683113e659d9bf6bffdd227eeb1881ae9a7)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Refactor the storage block allocation code
Gary R Hook [Wed, 27 Jul 2016 00:09:50 +0000 (19:09 -0500)]
crypto: ccp - Refactor the storage block allocation code

Orabug: 26644685

Move the KSB access/management functions to the v3
device file, and add function pointers to the actions
structure. At the operations layer all of the references
to the storage block will be generic (virtual). This is
in preparation for a version 5 device, in which the
private storage block is managed differently.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 58a690b701efc32ffd49722dd7b887154eb5a205)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Refactoring: symbol cleanup
Gary R Hook [Wed, 27 Jul 2016 00:09:40 +0000 (19:09 -0500)]
crypto: ccp - Refactoring: symbol cleanup

Orabug: 26644685

The form and use of the local storage block in the CCP are
particular to the device version. Much of the code that
accesses the storage block can treat it as a virtual
resource, and will undergo some renaming. Device-specific
access to the memory will be moved into the device file.
Service functions will be added to the actions
structure.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 956ee21a6df08afd9c1c64e0f394a9a1b65e897d)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Shorten the fields of the action structure
Gary R Hook [Wed, 27 Jul 2016 00:09:31 +0000 (19:09 -0500)]
crypto: ccp - Shorten the fields of the action structure

Orabug: 26644685

Use more concise field names; "perform_" is too verbose.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit a43eb98507574acfc435c38a6b7fb1fab6605519)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Abstract PCI info for the CCP
Gary R Hook [Wed, 27 Jul 2016 00:09:20 +0000 (19:09 -0500)]
crypto: ccp - Abstract PCI info for the CCP

Orabug: 26644685

Device-specific values for the BAR and offset should be found
in the version data structure.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit fba8855cb2403707b0639bdff0d34149699f14a2)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Fix non-conforming comment style
Gary R Hook [Tue, 26 Jul 2016 23:09:46 +0000 (18:09 -0500)]
crypto: ccp - Fix non-conforming comment style

Orabug: 26644685

Adhere to the cryptodev comment convention.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit fa242e80c7fb581eddbe636186020786f2e117da)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Use skcipher for fallback
Herbert Xu [Wed, 29 Jun 2016 10:04:01 +0000 (18:04 +0800)]
crypto: ccp - Use skcipher for fallback

Orabug: 26644685

This patch replaces use of the obsolete ablkcipher with skcipher.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 241118de58dace0fe24a754b6e2e3cb6f804ad47)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: skcipher - Add helper to zero stack request
Herbert Xu [Fri, 22 Jan 2016 15:21:10 +0000 (23:21 +0800)]
crypto: skcipher - Add helper to zero stack request

Orabug: 26644685

As the size of an skcipher_request is variable, it's awkward to
zero it explicitly.  This patch adds a helper to do that which
should be used when it is created on the stack.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 1aaa753d918c48c603195a468766e6a2b32b87f9)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: skcipher - Add top-level skcipher interface
Herbert Xu [Thu, 20 Aug 2015 07:21:45 +0000 (15:21 +0800)]
crypto: skcipher - Add top-level skcipher interface

Orabug: 26644685

This patch introduces the crypto skcipher interface which aims
to replace both blkcipher and ablkcipher.

It's very similar to the existing ablkcipher interface.  The
main difference is the removal of the givcrypt interface.  In
order to make the transition easier for blkcipher users, there
is a helper SKCIPHER_REQUEST_ON_STACK which can be used to place
a request on the stack for synchronous transforms.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 7a7ffe65c8c5fbf272b132d8980b2511d5e5fc98)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - constify ccp_actions structure
Julia Lawall [Sun, 1 May 2016 11:52:55 +0000 (13:52 +0200)]
crypto: ccp - constify ccp_actions structure

Orabug: 26644685

The ccp_actions structure is never modified, so declare it as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Gary Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit bc197b2a9c7e0129fa0ec1961881e2a0b3bef967)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Ensure all dependencies are specified
Gary R Hook [Wed, 20 Apr 2016 14:55:12 +0000 (09:55 -0500)]
crypto: ccp - Ensure all dependencies are specified

Orabug: 26644685

A DMA_ENGINE requires DMADEVICES in Kconfig

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit b3c2fee5d66b0d1e977de1a56243002e532da6a5)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Register the CCP as a DMA resource
Gary R Hook [Mon, 18 Apr 2016 14:21:44 +0000 (09:21 -0500)]
crypto: ccp - Register the CCP as a DMA resource

Orabug: 26644685

The CCP has the ability to provide DMA services to the
kernel using pass-through mode of the device. Register
these services as general purpose DMA channels.

Changes since v2:
- Add a Signed-off-by

Changes since v1:
- Allocate memory for a string in ccp_dmaengine_register
- Ensure register/unregister calls are properly ordered
- Verified all changed files are listed in the diffstat
- Undo some superfluous changes
- Added a cc:

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 58ea8abf490415c390e0cc671e875510c9b66318)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Fix RT breaking #include <linux/rwlock_types.h>
Mike Galbraith [Tue, 5 Apr 2016 13:03:21 +0000 (15:03 +0200)]
crypto: ccp - Fix RT breaking #include <linux/rwlock_types.h>

Orabug: 26644685

Direct include of rwlock_types.h breaks RT, use spinlock_types.h instead.

Fixes: 553d2374db0b crypto: ccp - Support for multiple CCPs
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 7587c407540006e4e8fd5ed33f66ffe6158e830a)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - fix lock acquisition code
Gary R Hook [Wed, 16 Mar 2016 14:02:26 +0000 (09:02 -0500)]
crypto: ccp - fix lock acquisition code

Orabug: 26644685

This patch simplifies an unneeded read-write lock.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 03a6f29000fdc13adc2bb2e22efd07a51d334154)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Add abstraction for device-specific calls
Gary R Hook [Tue, 1 Mar 2016 19:49:25 +0000 (13:49 -0600)]
crypto: ccp - Add abstraction for device-specific calls

Orabug: 26644685

Support for different generations of the coprocessor
requires that an abstraction layer be implemented for
interacting with the hardware. This patch splits out
version-specific functions to a separate file and populates
the version structure (acting as a driver) with function
pointers.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit ea0375afa17281e9e0190034215d0404dbad7449)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
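
The abstraction described above, a version structure populated with function pointers, is a table-of-operations pattern. The following sketch uses hypothetical names and made-up values throughout (`actions_sketch`, `version_sketch`, `find_version()`, the slot counts); it shows the shape of the pattern, not the driver's actual structures.

```c
#include <stddef.h>

/* Operations every device generation must supply (hypothetical subset). */
struct actions_sketch {
    int (*init)(void);
    unsigned int (*queue_slots)(void);
};

/* Per-version data; 'perform' points at that generation's operations. */
struct version_sketch {
    unsigned int version;
    const struct actions_sketch *perform;
};

/* Two generations with different behavior; the values are made up. */
static int gen_a_init(void) { return 0; }
static unsigned int gen_a_queue_slots(void) { return 8; }
static int gen_b_init(void) { return 0; }
static unsigned int gen_b_queue_slots(void) { return 16; }

/* The tables are never modified after build time, so they are const. */
static const struct actions_sketch gen_a_actions = { gen_a_init, gen_a_queue_slots };
static const struct actions_sketch gen_b_actions = { gen_b_init, gen_b_queue_slots };

static const struct version_sketch known_versions[] = {
    { 3, &gen_a_actions },
    { 5, &gen_b_actions },
};

/* Look up the version data for a detected device, or NULL if unknown. */
static const struct version_sketch *find_version(unsigned int version)
{
    for (size_t i = 0; i < sizeof(known_versions) / sizeof(known_versions[0]); i++)
        if (known_versions[i].version == version)
            return &known_versions[i];
    return NULL;
}
```

Generic code then calls `v->perform->init()` and similar entries without knowing which hardware generation it is driving.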
7 years ago crypto: ccp - CCP versioning support
Gary R Hook [Tue, 1 Mar 2016 19:49:15 +0000 (13:49 -0600)]
crypto: ccp - CCP versioning support

Orabug: 26644685

Future hardware may introduce new algorithms wherein the
driver will need to manage resources for different versions
of the cryptographic coprocessor. This precursor patch
determines the version of the available device, and marks
and registers algorithms accordingly. A structure is added
which manages the version-specific data.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit c7019c4d739e79d7baaa13c86dcaaedec8113d70)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Support for multiple CCPs
Gary R Hook [Tue, 1 Mar 2016 19:49:04 +0000 (13:49 -0600)]
crypto: ccp - Support for multiple CCPs

Orabug: 26644685

Enable management of >1 CCPs in a system. Each device will
get a unique identifier, as well as uniquely named
resources. Treat each CCP as an orthogonal unit and register
resources individually.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 553d2374db0bb3f48bbd29bef7ba2a4d1a3f325d)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Remove check for x86 family and model
Gary R Hook [Tue, 1 Mar 2016 19:48:54 +0000 (13:48 -0600)]
crypto: ccp - Remove check for x86 family and model

Orabug: 26644685

Each x86 SoC will make use of a unique PCI ID for the CCP
device so it is not necessary to check for the CPU family
and model.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 3f19ce2054541a6c663c8a5fcf52e7baa1c6c5f5)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - use to_pci_dev and to_platform_device
Geliang Tang [Wed, 23 Dec 2015 12:49:01 +0000 (20:49 +0800)]
crypto: ccp - use to_pci_dev and to_platform_device

Orabug: 26644685

Use to_pci_dev() and to_platform_device() instead of open-coding.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit c6c59bf2c0d60e67449190a8a95628ecd04b3969)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Use precalculated hash from headers
LABBE Corentin [Thu, 17 Dec 2015 12:45:41 +0000 (13:45 +0100)]
crypto: ccp - Use precalculated hash from headers

Orabug: 26644685

Precalculated hashes for the empty message are now present in the hash
headers. This patch just uses them.

Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit bdd75064d2b2068007f4fc5e26ac726e8617a090)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: hash - add zero length message hash for shax and md5
LABBE Corentin [Thu, 17 Dec 2015 12:45:39 +0000 (13:45 +0100)]
crypto: hash - add zero length message hash for shax and md5

Orabug: 26644685

Some crypto drivers cannot process an empty data message and instead
return a precalculated hash for md5/sha1/sha224/sha256.

This patch adds those precalculated hashes in include/crypto.
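
As a rough userspace sketch of how a driver might use such a constant: the MD5 digest of the empty message is d41d8cd98f00b204e9800998ecf8427e, so a zero-length request can be answered from a table instead of being sent to hardware. The names below (demo_md5_zero_hash, demo_md5_digest) are hypothetical and are not claimed to match the identifiers added under include/crypto.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MD5_DIGEST_SIZE 16

/* MD5 of the empty message: d41d8cd98f00b204e9800998ecf8427e */
static const uint8_t demo_md5_zero_hash[DEMO_MD5_DIGEST_SIZE] = {
	0xd4, 0x1d, 0x8c, 0xd9, 0x8f, 0x00, 0xb2, 0x04,
	0xe9, 0x80, 0x09, 0x98, 0xec, 0xf8, 0x42, 0x7e,
};

/*
 * Hypothetical driver entry point. Returns 0 after copying the
 * precalculated digest for an empty message, or 1 to tell the caller
 * that the data has to go to the hardware (not modelled here).
 */
static int demo_md5_digest(const void *data, size_t len, uint8_t *out)
{
	(void)data;
	assert(out != NULL);

	if (len == 0) {
		memcpy(out, demo_md5_zero_hash, DEMO_MD5_DIGEST_SIZE);
		return 0;
	}
	return 1;
}

static void demo_md5_example(void)
{
	uint8_t out[DEMO_MD5_DIGEST_SIZE] = { 0 };

	assert(demo_md5_digest("", 0, out) == 0);
	assert(memcmp(out, demo_md5_zero_hash, sizeof(out)) == 0);
	assert(demo_md5_digest("a", 1, out) == 1);
}
```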

Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 0c4c78de0417ced1da92351a3013e631860ea576)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
include/crypto/md5.h

7 years ago crypto: ccp - Use module name in driver structures
Tom Lendacky [Thu, 1 Oct 2015 21:32:50 +0000 (16:32 -0500)]
crypto: ccp - Use module name in driver structures

Orabug: 26644685

The convention is to use the name of the module in the driver structures
that are used for registering the device. The CCP module is currently
using a descriptive name. Replace the descriptive name with module name.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 166db195536f380c4545a8d2fca9789402464bc8)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Change references to accelerator to offload
Tom Lendacky [Thu, 1 Oct 2015 21:32:44 +0000 (16:32 -0500)]
crypto: ccp - Change references to accelerator to offload

Orabug: 26644685

The CCP is meant to be more of an offload engine than an accelerator
engine. To avoid any confusion, change references to accelerator to
offload.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 21dc9e8f941f8693992230d189a556b220b50f5b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Replace BUG_ON with WARN_ON and a return code
Tom Lendacky [Thu, 1 Oct 2015 21:32:31 +0000 (16:32 -0500)]
crypto: ccp - Replace BUG_ON with WARN_ON and a return code

Orabug: 26644685

Replace the usage of BUG_ON with WARN_ON and return an error.
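
The pattern can be illustrated with a userspace analogue. This is only a sketch: DEMO_WARN_ON() is a hypothetical stand-in for the kernel's WARN_ON(), and demo_copy() is an invented example function, not code from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for WARN_ON(): report the condition, keep running. */
#define DEMO_WARN_ON(cond) \
	((cond) ? (fprintf(stderr, "warning: %s\n", #cond), 1) : 0)

/*
 * Copies src into dst. An oversized request is reported and rejected
 * with -EINVAL instead of bringing the whole program down.
 */
static int demo_copy(char *dst, size_t dst_len, const char *src, size_t src_len)
{
	if (DEMO_WARN_ON(src_len > dst_len))
		return -EINVAL;

	memcpy(dst, src, src_len);
	return 0;
}

static void demo_copy_example(void)
{
	char buf[4];

	assert(demo_copy(buf, sizeof(buf), "abc", 4) == 0);
	assert(demo_copy(buf, sizeof(buf), "too long", 9) == -EINVAL);
}
```

The caller sees an error code and can unwind, where a BUG_ON-style check would have stopped execution outright.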

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 355eba5dda6984cbe10fa914e5cc8ef45a34cce2)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Provide support to autoload CCP driver
Tom Lendacky [Tue, 30 Jun 2015 17:57:14 +0000 (12:57 -0500)]
crypto: ccp - Provide support to autoload CCP driver

Orabug: 26644685

Add the necessary module device tables to the platform support to allow
for autoloading of the CCP driver. This will allow for the CCP's hwrng
support to be available without having to manually load the driver. The
module device table entry for the pci support is already present.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 6170511a917679f8a1324f031a0a40f851ae91e9)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Protect against poorly marked end of sg list
Tom Lendacky [Mon, 1 Jun 2015 16:15:53 +0000 (11:15 -0500)]
crypto: ccp - Protect against poorly marked end of sg list

Orabug: 26644685

Scatter gather lists can be created with more available entries than are
actually used (e.g. using sg_init_table() to reserve a specific number
of sg entries, but in actuality using something less than that based on
the data length).  The caller sometimes fails to mark the last entry
with sg_mark_end().  In these cases, sg_nents() will return the original
size of the sg list as opposed to the actual number of sg entries that
contain valid data.

On arm64, if the sg_nents() value is used in a call to dma_map_sg() in
this situation, then it causes a BUG_ON in lib/swiotlb.c because an
"empty" sg list entry results in dma_capable() returning false and
swiotlb trying to create a bounce buffer of size 0. This occurred in
the userspace crypto interface before being fixed by

0f477b655a52 ("crypto: algif - Mark sgl end at the end of data")

Protect against this by using the new sg_nents_for_len() function which
returns only the number of sg entries required to meet the desired
length and supplying that value to dma_map_sg().

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit fb43f69401fef8ed2f72d7ea4a25910a0f2138bc)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago scatterlist: introduce sg_nents_for_len
Tom Lendacky [Mon, 1 Jun 2015 16:15:25 +0000 (11:15 -0500)]
scatterlist: introduce sg_nents_for_len

Orabug: 26644685

When performing a dma_map_sg() call, the number of sg entries to map is
required. Using sg_nents to retrieve the number of sg entries will
return the total number of entries in the sg list up to the entry marked
as the end. If there happen to be unused entries in the list, these will
still be counted. Some dma_map_sg() implementations will not handle the
unused entries correctly (lib/swiotlb.c) and execute a BUG_ON.

The sg_nents_for_len() function will traverse the sg list and return the
number of entries required to satisfy the supplied length argument. This
can then be supplied to the dma_map_sg() call to successfully map the
sg.
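
The difference between the two counts can be sketched in userspace C. This is a simplified model under stated assumptions: struct demo_sg is a hypothetical stand-in for struct scatterlist (a plain linked list with a length per entry), and the -1 error return is the sketch's own convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct scatterlist: only what the sketch needs. */
struct demo_sg {
	unsigned int length;	/* bytes described by this entry */
	struct demo_sg *next;	/* NULL marks the end of the list */
};

/* Counts every entry up to the end marker, whether it holds data or not. */
static int demo_sg_nents(const struct demo_sg *sg)
{
	int nents = 0;

	for (; sg; sg = sg->next)
		nents++;
	return nents;
}

/*
 * Counts only the entries needed to cover len bytes.
 * Returns -1 if the list is too short to cover len.
 */
static int demo_sg_nents_for_len(const struct demo_sg *sg, uint64_t len)
{
	uint64_t total = 0;
	int nents = 0;

	if (!len)
		return 0;

	for (; sg; sg = sg->next) {
		nents++;
		total += sg->length;
		if (total >= len)
			return nents;
	}
	return -1;
}

static void demo_sg_example(void)
{
	/* Three entries reserved, but only the first two carry data. */
	struct demo_sg unused = { 0, NULL };
	struct demo_sg second = { 512, &unused };
	struct demo_sg first = { 512, &second };

	assert(demo_sg_nents(&first) == 3);
	assert(demo_sg_nents_for_len(&first, 1024) == 2);
	assert(demo_sg_nents_for_len(&first, 4096) == -1);
}
```

With the unused trailing entry, the plain count is 3 while the length-based count is 2, which is the value that is safe to hand to a mapping call.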

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit cfaed10d1f27d036b72bbdc6b1e59ea28c38ec7f)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Remove unused structure field
Tom Lendacky [Tue, 26 May 2015 18:06:30 +0000 (13:06 -0500)]
crypto: ccp - Remove unused structure field

Orabug: 26644685

Remove the length field from the ccp_sg_workarea since it is unused.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit d725332208ef13241fc435eece790c9d0ea16a4e)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: ccp - Remove manual check and set of dma_mask pointer
Tom Lendacky [Tue, 26 May 2015 18:06:24 +0000 (13:06 -0500)]
crypto: ccp - Remove manual check and set of dma_mask pointer

Orabug: 26644685

The underlying device support will set the device dma_mask pointer
if DMA is set up properly for the device.  Remove the check for and
assignment of dma_mask when it is null. Instead, just error out if
the dma_set_mask_and_coherent function fails because dma_mask is null.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit d921620e03c58bd83c4b345e36a104988d697f0d)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago uek-rpm: Enable CCP device driver and interface support
Somasundaram Krishnasamy [Thu, 17 Aug 2017 15:59:44 +0000 (08:59 -0700)]
uek-rpm: Enable CCP device driver and interface support

Orabug: 26644685

Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago mqueue: fix a use-after-free in sys_mq_notify()
Cong Wang [Sun, 9 Jul 2017 20:19:55 +0000 (13:19 -0700)]
mqueue: fix a use-after-free in sys_mq_notify()

The retry logic for netlink_attachskb() inside sys_mq_notify()
is nasty and vulnerable:

1) The sock refcnt is already released when retry is needed
2) The fd is controllable by user-space because we already
   release the file refcnt

so when we retry and the fd has just been closed by user-space
during this small window, we end up calling netlink_detachskb()
on the error path, which releases the sock again. Later, when
user-space closes this socket, a use-after-free could be
triggered.

Setting 'sock' to NULL here should be sufficient to fix it.
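
The effect of clearing the pointer can be modelled in a few lines of userspace C. This is a deliberately simplified sketch, not the kernel code: struct demo_sock and demo_notify() are hypothetical, and the two loop passes stand for "attach asked for a retry" followed by "lookup failed because the fd is gone".

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a reference-counted socket. */
struct demo_sock {
	int refcnt;
};

/*
 * Models two passes of a retry loop. Pass 1 looks the socket up (taking
 * a reference), then the attach step asks for a retry and drops that
 * reference itself. Pass 2 fails the lookup because the fd was closed
 * in the meantime, so control jumps to the error path.
 *
 * Returns the socket's reference count after the error path has run.
 */
static int demo_notify(struct demo_sock *s, int clear_on_retry)
{
	struct demo_sock *sock = NULL;
	int pass;

	for (pass = 1; pass <= 2; pass++) {
		if (pass == 2)
			goto out;	/* lookup failed: fd is gone */

		sock = s;
		sock->refcnt++;		/* lookup takes a reference */
		sock->refcnt--;		/* attach released it, wants a retry */
		if (clear_on_retry)
			sock = NULL;	/* the fix: forget the stale pointer */
	}
out:
	if (sock)
		sock->refcnt--;		/* error path drops "our" reference */
	return s->refcnt;
}

static void demo_notify_example(void)
{
	struct demo_sock buggy = { 1 };	/* one reference held elsewhere */
	struct demo_sock fixed = { 1 };

	assert(demo_notify(&buggy, 0) == 0);	/* someone else's ref was dropped */
	assert(demo_notify(&fixed, 1) == 1);	/* the other reference is intact */
}
```

Without the assignment, the error path drops a reference it no longer owns, so the remaining holder is left with a dangling object.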

Reported-by: GeneBlue <geneblue.mail@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit f991af3daabaecff34684fd51fac80319d1baad1)

Orabug: 26584960
CVE: CVE-2017-11176

Signed-off-by: Tim Tianyang Chen <tianyang.chen@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago x86/acpi: Prevent out of bound access caused by broken ACPI tables
Seunghun Han [Tue, 18 Jul 2017 11:03:51 +0000 (20:03 +0900)]
x86/acpi: Prevent out of bound access caused by broken ACPI tables

The bus_irq argument of mp_override_legacy_irq() is used as the index into
the isa_irq_to_gsi[] array. The bus_irq argument originates from
ACPI_MADT_TYPE_IO_APIC and ACPI_MADT_TYPE_INTERRUPT items in the ACPI
tables, but is nowhere sanity checked.

That allows broken or malicious ACPI tables to overwrite memory, which
might cause malfunction, panic or arbitrary code execution.

Add a sanity check and emit a warning when that triggers.
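
A bounds check of this kind can be sketched as follows. This is a userspace illustration only: the array size of 16, the identity-mapped initial table and the names demo_isa_irq_to_gsi / demo_override_legacy_irq are assumptions made for the example, and the warning itself is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_LEGACY_IRQS 16	/* assumed table size for the sketch */

/* Stand-in for the legacy IRQ to GSI table; identity-mapped by default. */
static uint32_t demo_isa_irq_to_gsi[DEMO_NR_LEGACY_IRQS] = {
	0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
};

/*
 * Records an override. Returns 0 on success and -1 (after which the
 * caller would emit a warning) when bus_irq would index past the array.
 */
static int demo_override_legacy_irq(uint8_t bus_irq, uint32_t gsi)
{
	if (bus_irq >= DEMO_NR_LEGACY_IRQS)
		return -1;	/* reject before touching the array */

	demo_isa_irq_to_gsi[bus_irq] = gsi;
	return 0;
}

static void demo_override_example(void)
{
	assert(demo_override_legacy_irq(0, 2) == 0);
	assert(demo_isa_irq_to_gsi[0] == 2);
	assert(demo_override_legacy_irq(200, 5) == -1);	/* table-supplied garbage */
}
```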

[ tglx: Added warning and rewrote changelog ]

Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: security@kernel.org
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: stable@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit dad5ab0db8deac535d03e3fe3d8f2892173fa6a4)

Orabug: 26540612
CVE: CVE-2017-11473

Signed-off-by: Tim Tianyang Chen <tianyang.chen@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago Merge remote-tracking branch 'krish/uek-4.1-next' into uek/uek-4.1-next
Konrad Rzeszutek Wilk [Wed, 23 Aug 2017 21:08:53 +0000 (17:08 -0400)]
Merge remote-tracking branch 'krish/uek-4.1-next' into uek/uek-4.1-next

* krish/uek-4.1-next:
  KVM: nVMX: fix nested EPT detection
  KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry
  KVM: nVMX: propagate errors from prepare_vmcs02
  KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT

Fixes
OraBug: 26628813 KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT - backport+regression fix

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years ago Btrfs: add free space tree mount option
Omar Sandoval [Wed, 30 Sep 2015 03:50:38 +0000 (20:50 -0700)]
Btrfs: add free space tree mount option

Now we can finally hook up everything so we can actually use free space
tree. The free space tree is enabled by passing the space_cache=v2 mount
option. On the first mount with this option set, the free space tree
will be created and the FREE_SPACE_TREE read-only compat bit will be
set. Any time the filesystem is mounted from then on, we must use the
free space tree. The clear_cache option will also clear the free space
tree.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 70f6d82ec73c3ae2d3adc6853c5bebcd73610097)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
7 years ago Btrfs: wire up the free space tree to the extent tree
Omar Sandoval [Wed, 30 Sep 2015 03:50:37 +0000 (20:50 -0700)]
Btrfs: wire up the free space tree to the extent tree

The free space tree is updated in tandem with the extent tree. There are
only a handful of places where we need to hook in:

1. Block group creation
2. Block group deletion
3. Delayed refs (extent creation and deletion)
4. Block group caching

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 1e144fb8f4a4d6d6d88c58f87e4366e3cd02ab72)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
7 years ago Btrfs: add free space tree sanity tests
Omar Sandoval [Wed, 30 Sep 2015 03:50:36 +0000 (20:50 -0700)]
Btrfs: add free space tree sanity tests

This tests the operations on the free space tree trying to exercise all
of the main cases for both formats. Between this and xfstests, the free
space tree should have pretty good coverage.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 7c55ee0c4afba4434d973117234577ae6ff77a1c)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
7 years ago Btrfs: implement the free space B-tree
Omar Sandoval [Wed, 30 Sep 2015 03:50:35 +0000 (20:50 -0700)]
Btrfs: implement the free space B-tree

The free space cache has turned out to be a scalability bottleneck on
large, busy filesystems. When the cache for a lot of block groups needs
to be written out, we can get extremely long commit times; if this
happens in the critical section, things are especially bad because we
block new transactions from happening.

The main problem with the free space cache is that it has to be written
out in its entirety and is managed in an ad hoc fashion. Using a B-tree
to store free space fixes this: updates can be done as needed and we get
all of the benefits of using a B-tree: checksumming, RAID handling,
well-understood behavior.

With the free space tree, we get commit times that are about the same as
the no cache case with load times slower than the free space cache case
but still much faster than the no cache case. Free space is represented
with extents until it becomes more space-efficient to use bitmaps,
giving us similar space overhead to the free space cache.

The operations on the free space tree are: adding and removing free
space, handling the creation and deletion of block groups, and loading
the free space for a block group. We can also create the free space tree
by walking the extent tree and clear the free space tree.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit a5ed91828518ab076209266c2bc510adabd078df)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
7 years ago Btrfs: introduce the free space B-tree on-disk format
Omar Sandoval [Wed, 30 Sep 2015 03:50:34 +0000 (20:50 -0700)]
Btrfs: introduce the free space B-tree on-disk format

The on-disk format for the free space tree is straightforward. Each
block group is represented in the free space tree by a free space info
item that stores accounting information: whether the free space for this
block group is stored as bitmaps or extents and how many extents of free
space exist for this block group (regardless of which format is being
used in the tree). Extents are (start, FREE_SPACE_EXTENT, length) keys
with no corresponding item, and bitmaps instead have the
FREE_SPACE_BITMAP type and have a bitmap item attached, which is just an
array of bytes.

Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 208acb8c72d7ace6b672b105502dca0bcb050162)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
7 years ago Btrfs: refactor caching_thread()
Omar Sandoval [Wed, 30 Sep 2015 03:50:33 +0000 (20:50 -0700)]
Btrfs: refactor caching_thread()

We're also going to load the free space tree from caching_thread(), so
we should refactor some of the common code.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 73fa48b674e819098c3bafc47618d0e2868191e5)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
7 years ago Btrfs: add helpers for read-only compat bits
Omar Sandoval [Wed, 30 Sep 2015 03:50:32 +0000 (20:50 -0700)]
Btrfs: add helpers for read-only compat bits

We're finally going to add one of these for the free space tree, so
let's add the same nice helpers that we have for the incompat bits.
While we're at it, also add helpers to clear the bits.

Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 1abfbcdf56d9485f050149bc4968c1609f9a0773)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
7 years ago Btrfs: add extent buffer bitmap sanity tests
Omar Sandoval [Wed, 30 Sep 2015 03:50:31 +0000 (20:50 -0700)]
Btrfs: add extent buffer bitmap sanity tests

Sanity test the extent buffer bitmap operations (test, set, and clear)
against the equivalent standard kernel operations.
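
For orientation, the three operations can be sketched on a plain byte array. This is a userspace illustration under assumptions: it ignores the fact that an extent buffer spans several pages, it numbers bits from the least significant bit of each byte, and the demo_bitmap_* names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if bit nr is set in the byte array, 0 otherwise. */
static int demo_bitmap_test(const unsigned char *map, size_t nr)
{
	return (map[nr / CHAR_BIT] >> (nr % CHAR_BIT)) & 1;
}

/* Sets len consecutive bits starting at bit start. */
static void demo_bitmap_set(unsigned char *map, size_t start, size_t len)
{
	size_t nr;

	for (nr = start; nr < start + len; nr++)
		map[nr / CHAR_BIT] |= (unsigned char)(1u << (nr % CHAR_BIT));
}

/* Clears len consecutive bits starting at bit start. */
static void demo_bitmap_clear(unsigned char *map, size_t start, size_t len)
{
	size_t nr;

	for (nr = start; nr < start + len; nr++)
		map[nr / CHAR_BIT] &= (unsigned char)~(1u << (nr % CHAR_BIT));
}

static void demo_bitmap_example(void)
{
	unsigned char map[4];

	memset(map, 0, sizeof(map));
	demo_bitmap_set(map, 6, 4);	/* bits 6..9 span two bytes */
	assert(map[0] == 0xc0 && map[1] == 0x03);

	demo_bitmap_clear(map, 7, 2);	/* clear bits 7 and 8 */
	assert(demo_bitmap_test(map, 6) && !demo_bitmap_test(map, 7));
	assert(!demo_bitmap_test(map, 8) && demo_bitmap_test(map, 9));
}
```

A sanity test in the spirit of the commit would run the same sequence against a reference bitmap implementation and compare the results bit by bit.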

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 0f3312295d3ce1d82392244236a52b3b663480ef)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
7 years ago Btrfs: add extent buffer bitmap operations
Omar Sandoval [Wed, 30 Sep 2015 03:50:30 +0000 (20:50 -0700)]
Btrfs: add extent buffer bitmap operations

These are going to be used for the free space tree bitmap items.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 3e1e8bb770dba29645b302c5499ffcb8e3906712)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
7 years ago x86/irq: Retrieve irq data after locking irq_desc
Thomas Gleixner [Sun, 5 Jul 2015 17:12:35 +0000 (17:12 +0000)]
x86/irq: Retrieve irq data after locking irq_desc

irq_data is protected by irq_desc->lock, so retrieving the irq chip
from irq_data outside the lock is racy vs. a concurrent update. Move
it into the lock held region.

While at it add a comment why the vector walk does not require
vector_lock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.331320612@linutronix.de
(cherry picked from commit 09cf92b784fae6109450c5d64f9908066d605249)

Orabug: 25671838

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago x86/irq: Use proper locking in check_irq_vectors_for_cpu_disable()
Thomas Gleixner [Sun, 5 Jul 2015 17:12:33 +0000 (17:12 +0000)]
x86/irq: Use proper locking in check_irq_vectors_for_cpu_disable()

It's unsafe to examine fields in the irq descriptor w/o holding the
descriptor lock. Add proper locking.

While at it add a comment why the vector check can run lockless.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.236544164@linutronix.de
(cherry picked from commit cbb24dc761d95fe39a7a122bb1b298e9604cae15)

Orabug: 25671838

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago x86/irq: Plug irq vector hotplug race
Thomas Gleixner [Sun, 5 Jul 2015 17:12:32 +0000 (17:12 +0000)]
x86/irq: Plug irq vector hotplug race

Jin debugged a nasty cpu hotplug race which results in leaking an irq
vector on the newly hotplugged cpu.

cpu N                           cpu M
native_cpu_up                   device_shutdown
  do_boot_cpu                     free_msi_irqs
  start_secondary                   arch_teardown_msi_irqs
    smp_callin                        default_teardown_msi_irqs
       setup_vector_irq                  arch_teardown_msi_irq
        __setup_vector_irq                 native_teardown_msi_irq
          lock(vector_lock)                  destroy_irq
          install vectors
          unlock(vector_lock)
                                               lock(vector_lock)
--->                                           __clear_irq_vector
                                               unlock(vector_lock)
    lock(vector_lock)
    set_cpu_online
    unlock(vector_lock)

This leaves the irq vector(s) which are torn down on CPU M stale in
the vector array of CPU N, because CPU M does not see CPU N online
yet. There is a similar issue with concurrent newly setup interrupts.

The alloc/free protection of irq descriptors does not prevent the
above race, because it merely prevents interrupt descriptors from
going away or changing concurrently.

Prevent this by moving the call to setup_vector_irq() into the
vector_lock held region which protects set_cpu_online():

cpu N                           cpu M
native_cpu_up                   device_shutdown
  do_boot_cpu                     free_msi_irqs
  start_secondary                   arch_teardown_msi_irqs
    smp_callin                        default_teardown_msi_irqs
       lock(vector_lock)                 arch_teardown_msi_irq
       setup_vector_irq()
        __setup_vector_irq                 native_teardown_msi_irq
          install vectors                    destroy_irq
       set_cpu_online
       unlock(vector_lock)
                                               lock(vector_lock)
                                               __clear_irq_vector
                                               unlock(vector_lock)

So cpu M either sees the cpu N online before clearing the vector or
cpu N installs the vectors after cpu M has cleared it.

Reported-by: xiao jin <jin.xiao@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.141898931@linutronix.de
(cherry picked from commit 5a3f75e3f02836518ce49536e9c460ca8e1fa290)

Orabug: 25671838

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
arch/x86/kernel/smpboot.c

7 years ago hotplug: Prevent alloc/free of irq descriptors during cpu up/down
Thomas Gleixner [Sun, 5 Jul 2015 17:12:30 +0000 (17:12 +0000)]
hotplug: Prevent alloc/free of irq descriptors during cpu up/down

When a cpu goes up some architectures (e.g. x86) have to walk the irq
space to set up the vector space for the cpu. While this needs extra
protection at the architecture level we can avoid a few race
conditions by preventing the concurrent allocation/free of irq
descriptors and the associated data.

When a cpu goes down it moves the interrupts which are targeted to
this cpu away by reassigning the affinities. While this happens
interrupts can be allocated and freed, which opens a can of race
conditions in the code which reassigns the affinities because
interrupt descriptors might be freed underneath.

Example:

CPU1                            CPU2
cpu_up/down
 irq_desc = irq_to_desc(irq);
                                remove_from_radix_tree(desc);
 raw_spin_lock(&desc->lock);
                                free(desc);

We could protect the irq descriptors with RCU, but that would require
a full tree change of all accesses to interrupt descriptors. But
fortunately these kinds of race conditions are rather limited to a few
things like cpu hotplug. The normal setup/teardown is very well
serialized. So the simpler and obvious solution is:

Prevent allocation and freeing of interrupt descriptors across cpu
hotplug.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.063519515@linutronix.de
(cherry picked from commit a899418167264c7bac574b1a0f1b2c26c5b0995a)

Orabug: 25671838

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago sysfs: replace WARN() with pr_debug when sysfs_remove_group() failed
Ethan Zhao [Wed, 23 Aug 2017 02:03:19 +0000 (11:03 +0900)]
sysfs: replace WARN() with pr_debug when sysfs_remove_group() failed

Orabug: 26374902

There is not enough error handling in the block device add/registration
path, for example,

device_add_disk()
  blk_register_queue()

When the kernel returns from device_add_disk(), there is no return value
to tell us whether it succeeded --- which suggests it always succeeds.
Under this assumption, during the block device removal/unregistration
steps,

sd_remove()
  del_gendisk()
    blk_unregister_queue()

dpm_sysfs_remove() and blk_trace_remove_sysfs() are called blindly, even
though the 'trace' and 'power' sysfs groups are likely absent because
blk_register_queue()/device_add() actually failed somewhere. This causes
a flood of WARNs from sysfs_remove_group(), as in the following output
triggered by unloading the fnic driver:

 modprobe -rv fnic

[  122.081398] WARNING: CPU: 14 PID: 11709 at fs/sysfs/group.c:224
        sysfs_remove_group+0x9c/0xa0()
[  122.081399] sysfs group 'trace' not found for kobject 'sdb'
[  122.081424] CPU: 14 PID: 11709 Comm: modprobe Tainted: G        W
        4.1.12.x86_64 #2
[  122.081425] Hardware name: Cisco Systems Inc UCSBXXxx
[  122.081425]  0000000000000286 00000000d03792ff ffff881037823ad8
        ffffffff8173605d
[  122.081427]  ffff881037823b30 ffffffff81a2b9bc ffff881037823b18
        ffffffff810862aa
[  122.081428]  ffff88103974a000 0000000000000000 ffffffff81ba4080
        ffff882037d45080
[  122.081430] Call Trace:
[  122.081432]  [<ffffffff8173605d>] dump_stack+0x63/0x81
[  122.081434]  [<ffffffff810862aa>] warn_slowpath_common+0x8a/0xc0
[  122.081435]  [<ffffffff81086335>] warn_slowpath_fmt+0x55/0x70
[  122.081437]  [<ffffffff8129321c>] ? kernfs_find_and_get_ns+0x4c/0x60
[  122.081439]  [<ffffffff81296b5c>] sysfs_remove_group+0x9c/0xa0
[  122.081441]  [<ffffffff811675a4>] blk_trace_remove_sysfs+0x14/0x20
[  122.081444]  [<ffffffff81312605>] blk_unregister_queue+0x65/0x90
[  122.081446]  [<ffffffff81320f26>] del_gendisk+0x126/0x290
[  122.081449]  [<ffffffffa0091281>] sd_remove+0x61/0xc0 [sd_mod]
[  122.081452]  [<ffffffff81492fb7>] __device_release_driver+0x87/0x120
[  122.081454]  [<ffffffff81493073>] device_release_driver+0x23/0x30
[  122.081456]  [<ffffffff814928f8>] bus_remove_device+0x108/0x180
[  122.081457]  [<ffffffff8148eca0>] device_del+0x160/0x2a0
[  122.081459]  [<ffffffff814d8feb>] __scsi_remove_device+0xcb/0xd0
[  122.081461]  [<ffffffff814d7524>] scsi_forget_host+0x64/0x70
[  122.081462]  [<ffffffff814cac0b>] scsi_remove_host+0x7b/0x130
[  122.081466]  [<ffffffffa016fc47>] fnic_remove+0x1b7/0x4a0 [fnic]
[  122.081469]  [<ffffffff8138434f>] pci_device_remove+0x3f/0xc0
[  122.081472]  [<ffffffff81492fb7>] __device_release_driver+0x87/0x120
[  122.081474]  [<ffffffff81493a38>] driver_detach+0xc8/0xd0
[  122.081478]  [<ffffffff81492c19>] bus_remove_driver+0x59/0xe0
[  122.081479]  [<ffffffff814942e0>] driver_unregister+0x30/0x70
[  122.081482]  [<ffffffff81382dba>] pci_unregister_driver+0x2a/0x80
[  122.081486]  [<ffffffffa01808cc>] fnic_cleanup_module+0x10/0x7a [fnic]
[  122.081488]  [<ffffffff8110e8ec>] SyS_delete_module+0x1ac/0x230
[  122.081490]  [<ffffffff81028666>] ? syscall_trace_leave+0xc6/0x150
[  122.081491]  [<ffffffff8173dcee>] system_call_fastpath+0x12/0x71
[  122.081502] ---[ end trace 29ba5813719045a4 ]---

WARNING: CPU: 14 PID: 11709 at fs/sysfs/group.c:224
        sysfs_remove_group+0x9c/0xa0()
[  122.095724] sysfs group 'power' not found for kobject 'target2:0:4'
[  122.095790] CPU: 14 PID: 11709 Comm: modprobe Tainted: G        W
        4.1.12.x86_64 #2
[  122.095793] Hardware name: Cisco Systems Inc UCSBXXxx
[  122.095795]  0000000000000286 00000000d03792ff ffff881037823af8
        ffffffff8173605d
[  122.095800]  ffff881037823b50 ffffffff81a2b9bc ffff881037823b38
        ffffffff810862aa
[  122.095803]  ffff88103782
[  122.095807] Call Trace:
[  122.095814]  [<ffffffff8173605d>] dump_stack+0x63/0x81
[  122.095818]  [<ffffffff810862aa>] warn_slowpath_common+0x8a/0xc0
[  122.095822]  [<ffffffff81086335>] warn_slowpath_fmt+0x55/0x70
[  122.095827]  [<ffffffff8129321c>] ? kernfs_find_and_get_ns+0x4c/0x60
[  122.095831]  [<ffffffff81296b5c>] sysfs_remove_group+0x9c/0xa0
[  122.095839]  [<ffffffff8149b7e7>] dpm_sysfs_remove+0x57/0x60
[  122.095843]  [<ffffffff8148ebc6>] device_del+0x86/0x2a0
[  122.095847]  [<ffffffff8148e1f9>] ? device_remove_file+0x19/0x20
[  122.095854]  [<ffffffff814983ae>] attribute_container_class_device_del
        +0x1e/0x30
[  122.095858]  [<ffffffff814985c2>] transport_remove_classdev+0x52/0x60
[  122.095862]  [<ffffffff81498570>] ? transport_add_class_device+0x40/0x40
[  122.095866]  [<ffffffff81497f1c>] attribute_container_device_trigger
        +0xdc/0xf0
[  122.095870]  [<ffffffff81498525>] transport_remove_device+0x15/0x20
[  122.095875]  [<ffffffff814d4df5>] scsi_target_reap_ref_release+0x25/0x40
[  122.095879]  [<ffffffff814d68fc>] scsi_target_reap+0x2c/0x30
[  122.095883]  [<ffffffff814d8fa6>] __scsi_remove_device+0x86/0xd0
[  122.095887]  [<ffffffff814d7524>] scsi_forget_host+0x64/0x70
[  122.095891]  [<ffffffff814cac0b>] scsi_remove_host+0x7b/0x130
[  122.095900]  [<ffffffffa016fc47>] fnic_remove+0x1b7/0x4a0 [fnic]
[  122.095909]  [<ffffffff8138434f>] pci_device_remove+0x3f/0xc0
[  122.095915]  [<ffffffff81492fb7>] __device_release_driver+0x87/0x120
[  122.095922]  [<ffffffff81493a38>] driver_detach+0xc8/0xd0
[  122.095930]  [<ffffffff81492c19>] bus_remove_driver+0x59/0xe0
[  122.095934]  [<ffffffff814942e0>] driver_unregister+0x30/0x70
[  122.095941]  [<ffffffff81382dba>] pci_unregister_driver+0x2a/0x80
[  122.095952]  [<ffffffffa01808cc>] fnic_cleanup_module+0x10/0x7a [fnic]
[  122.095957]  [<ffffffff8110e8ec>] SyS_delete_module+0x1ac/0x230
[  122.095961]  [<ffffffff81028666>] ? syscall_trace_leave+0xc6/0x150
[  122.095966]  [<ffffffff8173dcee>] system_call_fastpath+0x12/0x71
[  122.095968] ---[ end trace 29ba5813719045a6 ]---

However, refactoring the block device code is not worthwhile merely to
silence the noisy but relatively harmless WARN flood above.

So this patch suppresses the warning flood by replacing WARN() with
pr_debug(), as a shortcut until all the related block device code is
refactored.

This issue also could be reproduced with stable v4.12 kernel.

(Upstream maintainer Greg K-H refused to apply this "workaround / shortcut".
He insisted the issue should be fixed in the block device subsystem, which means refactoring
all block device/SCSI drivers and all relevant block layer code. That is not a practical task:
it is too expensive, and we cannot wait for the upstream refactoring.
So this patch is specific to UEK4 code.
*NOTE*: there will be no WARNING in sysfs_remove_group(); this doesn't affect
other WARN_ONCE() calls in the kernel.)

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
7 years agoKVM: nVMX: fix nested EPT detection
Ladi Prosek [Thu, 23 Mar 2017 06:18:08 +0000 (07:18 +0100)]
KVM: nVMX: fix nested EPT detection

The nested_ept_enabled flag introduced in commit 7ca29de2136 was not
computed correctly. We are interested only in L1's EPT state, not the
combined L0+L1 value.

In particular, if L0 uses EPT but L1 does not, nested_ept_enabled must
be false to make sure that PDPTRs are loaded based on CR3 as usual,
because the special case described in 26.3.2.4 Loading Page-Directory-
Pointer-Table Entries does not apply.

Fixes: 7ca29de21362 ("KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT")
Cc: qemu-stable@nongnu.org
Reported-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7ad658b693536741c37b16aeb07840a2ce75f5b9)
OraBug: 26628813 KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT - backport+regression fix
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tested-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Tested-by: Chris Kenna <chris.kenna@oracle.com>
Acked-by: Konrad Wilk <konrad.wilk@oracle.com>
7 years agoKVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry
Ladi Prosek [Wed, 30 Nov 2016 15:03:10 +0000 (16:03 +0100)]
KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry

Loading CR3 as part of emulating vmentry is different from regular CR3 loads,
as implemented in kvm_set_cr3, in several ways.

* different rules are followed to check CR3 and it is desirable for the caller
to distinguish between the possible failures
* PDPTRs are not loaded if PAE paging and nested EPT are both enabled
* many MMU operations are not necessary

This patch introduces nested_vmx_load_cr3 suitable for CR3 loads as part of
nested vmentry and vmexit, and makes use of it on the nested vmentry path.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(cherry picked from commit 9ed38ffad47316dbdc16de0de275868c7771754d)
OraBug: 26628813 KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT - backport+regression fix
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tested-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Tested-by: Chris Kenna <chris.kenna@oracle.com>
Acked-by: Konrad Wilk <konrad.wilk@oracle.com>
7 years agoKVM: nVMX: propagate errors from prepare_vmcs02
Ladi Prosek [Wed, 30 Nov 2016 15:03:09 +0000 (16:03 +0100)]
KVM: nVMX: propagate errors from prepare_vmcs02

It is possible that prepare_vmcs02 fails to load the guest state. This
patch adds the proper error handling for such a case. L1 will receive
an INVALID_STATE vmexit with the appropriate exit qualification if it
happens.

A failure to set guest CR3 is the only error propagated from prepare_vmcs02
at the moment.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(cherry picked from commit ee146c1c100dbe9ca92252be2e901b957476b253)
OraBug: 26628813 KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT - backport+regression fix
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tested-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Tested-by: Chris Kenna <chris.kenna@oracle.com>
Acked-by: Konrad Wilk <konrad.wilk@oracle.com>
7 years agoKVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT
Ladi Prosek [Wed, 30 Nov 2016 15:03:08 +0000 (16:03 +0100)]
KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT

KVM does not correctly handle L1 hypervisors that emulate L2 real mode with
PAE and EPT, such as Hyper-V. In this mode, the L1 hypervisor populates guest
PDPTE VMCS fields and leaves guest CR3 uninitialized because it is not used
(see 26.3.2.4 Loading Page-Directory-Pointer-Table Entries). KVM always
dereferences CR3 and tries to load PDPTEs if PAE is on. This leads to two
related issues:

1) On the first nested vmentry, the guest PDPTEs, as populated by L1, are
overwritten in ept_load_pdptrs because the registers are believed to have
been loaded in load_pdptrs as part of kvm_set_cr3. This is incorrect. L2 is
running with PAE enabled but PDPTRs have been set up by L1.

2) When L2 is about to enable paging and loads its CR3, we, again, attempt
to load PDPTEs in load_pdptrs called from kvm_set_cr3. There are no guarantees
that this will succeed (it's just a CR3 load, paging is not enabled yet) and
if it doesn't, kvm_set_cr3 returns early without persisting the CR3 which is
then lost and L2 crashes right after it enables paging.

This patch replaces the kvm_set_cr3 call with a simple register write if PAE
and EPT are both on. CR3 is not to be interpreted in this case.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(cherry picked from commit 7ca29de21362de242025fbc1c22436e19e39dddc)
OraBug: 26628813 KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT - backport+regression fix
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tested-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Tested-by: Chris Kenna <chris.kenna@oracle.com>
Acked-by: Konrad Wilk <konrad.wilk@oracle.com>
7 years agokvm: x86: reduce collisions in mmu_page_hash v4.1.12-110.0.20170822_0730
David Matlack [Mon, 19 Dec 2016 21:58:25 +0000 (13:58 -0800)]
kvm: x86: reduce collisions in mmu_page_hash

When using two-dimensional paging, the mmu_page_hash (which provides
lookups for existing kvm_mmu_page structs) becomes imbalanced, with
too many collisions in buckets 0 and 512. This has been seen to cause
mmu_lock to be held for multiple milliseconds in kvm_mmu_get_page on
VMs with a large amount of RAM mapped with 4K pages.

The current hash function uses the lower 10 bits of gfn to index into
mmu_page_hash. When doing shadow paging, gfn is the address of the
guest page table being shadowed. These tables are 4K-aligned, which
makes the low bits of gfn a good hash. However, with two-dimensional
paging, no guest page tables are being shadowed, so gfn is the base
address that is mapped by the table. Thus page tables (level=1) have
a 2MB aligned gfn, page directories (level=2) have a 1GB aligned gfn,
etc. This means hashes will only differ in their 10th bit.

hash_64() provides a better hash. For example, on a VM with ~200G
(99458 direct=1 kvm_mmu_page structs):

hash            max_mmu_page_hash_collisions
--------------------------------------------
low 10 bits     49847
hash_64         105
perfect         97

While we're changing the hash, increase the table size by 4x to better
support large VMs (further reduces number of collisions in 200G VM to
29).

Note that hash_64() does not provide a good distribution prior to commit
ef703f49a6c5 ("Eliminate bad hash multipliers from hash_32() and
hash_64()").
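
The collision pattern described above can be reproduced with a small
user-space model. This is only a sketch, not the KVM code: it assumes
level-1 tables whose gfn is a multiple of 512 (2MB-aligned, counted in 4K
units), and it models hash_64() as a 64-bit multiply by the golden-ratio
constant followed by a right shift, which is how the generic helper is
understood to behave after commit ef703f49a6c5. The function names and the
table count are made up for illustration.

```python
from collections import Counter

GOLDEN_RATIO_64 = 0x61C8864680B583EB  # assumed multiplier of the generic hash_64()
HASH_BITS = 10                        # 1024 buckets, matching the old low-10-bits scheme


def low_bits_hash(gfn, bits=HASH_BITS):
    """Old scheme: index by the low bits of gfn."""
    return gfn & ((1 << bits) - 1)


def hash_64_model(val, bits=HASH_BITS):
    """Model of hash_64(): multiply modulo 2^64, keep the top `bits` bits."""
    return ((val * GOLDEN_RATIO_64) & (2**64 - 1)) >> (64 - bits)


def max_collisions(gfns, hash_fn):
    """Largest number of gfns that land in a single bucket."""
    return max(Counter(hash_fn(g) for g in gfns).values())


# Hypothetical workload: 1000 level-1 tables, each mapping a 2MB-aligned range.
gfns = [i * 512 for i in range(1000)]
low_buckets = {low_bits_hash(g) for g in gfns}

print(sorted(low_buckets))                   # [0, 512]
print(max_collisions(gfns, low_bits_hash))   # 500
print(max_collisions(gfns, hash_64_model))   # far fewer than the low-bits scheme
```

With the low-bits index only bit 9 of the gfn ever varies, so every table
falls into bucket 0 or bucket 512; the multiplicative hash mixes the high
bits into the index.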

Signed-off-by: David Matlack <dmatlack@google.com>
Change-Id: I5aa6b13c834722813c6cca46b8b1ed6f53368ade
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Orabug: 26628797
(cherry picked from commit 114df303a7eeae8b50ebf68229b7e647714a9bea)
Signed-off-by: Govinda Tatti <Govinda.Tatti@Oracle.COM>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tested-by: Eyal Moscovici <eyal.moscovici@oracle.com> [Ravello/BMCS Team]
7 years agoIB/ipoib: For sendonly join free the multicast group on leave
Christoph Lameter [Sun, 11 Oct 2015 23:49:42 +0000 (18:49 -0500)]
IB/ipoib: For sendonly join free the multicast group on leave

Orabug: 26324050

When we leave the multicast group on expiration of a neighbor we
do not free the mcast structure. This results in a memory leak
that causes ib_dealloc_pd to fail and print a WARN_ON message
and backtrace.

Fixes: bd99b2e05c4d (IB/ipoib: Expire sendonly multicast joins)
Signed-off-by: Christoph Lameter <cl@linux.com>
Tested-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
(cherry picked from commit 0b5c9279e568d90903acedc2b9b832d8d78e8288)

Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
7 years agoIB/ipoib: increase the max mcast backlog queue
Doug Ledford [Sat, 26 Sep 2015 02:30:24 +0000 (22:30 -0400)]
IB/ipoib: increase the max mcast backlog queue

Orabug: 26324050

When performing sendonly joins, we queue the packets that trigger
the join until the join completes.  This may take on the order of
hundreds of milliseconds.  It is easy to have many more than three
packets come in during that time.  Expand the maximum queue depth
in order to try and prevent dropped packets during the time it
takes to join the multicast group.

Signed-off-by: Doug Ledford <dledford@redhat.com>
(cherry picked from commit 2866196f294954ce9fa226825c8c1eaa64c7da8a)

Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
7 years agoIB/ipoib: Make sendonly multicast joins create the mcast group
Doug Ledford [Fri, 25 Sep 2015 18:35:01 +0000 (14:35 -0400)]
IB/ipoib: Make sendonly multicast joins create the mcast group

Orabug: 26324050

Since IPoIB should, as much as possible, emulate how multicast
sends work on Ethernet for regular TCP/IP apps, there should be
no requirement to subscribe to a multicast group before your
sends are properly sent.  However, due to the difference in how
multicast is handled on InfiniBand, we must join the appropriate
multicast group before we can send to it.  Previously we tried
not to trigger the auto-create feature of the subnet manager when
doing this because we didn't have tracking of these sendonly
groups and the auto-creation might never get undone.  The previous
patch added timing to these sendonly joins and allows us to
leave them after a reasonable idle expiration time.  So supply
all of the information needed to auto-create the group.

Signed-off-by: Doug Ledford <dledford@redhat.com>
(cherry picked from commit c3852ab0e606212de523c1fb1e15adbf9f431619)

Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
7 years agoIB/ipoib: Expire sendonly multicast joins
Christoph Lameter [Thu, 24 Sep 2015 17:00:05 +0000 (12:00 -0500)]
IB/ipoib: Expire sendonly multicast joins

Orabug: 26324050

On neighbor expiration, check to see if the neighbor was actually a
sendonly multicast join, and if so, leave the multicast group as we
expire the neighbor.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
(cherry picked from commit bd99b2e05c4df2a428e5c9dd338289089d0e26df)

Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
7 years agoIB/ipoib: Suppress warning for send only join failures
Jason Gunthorpe [Fri, 21 Aug 2015 23:34:13 +0000 (17:34 -0600)]
IB/ipoib: Suppress warning for send only join failures

Orabug: 26324050

We expect send only joins to fail, it just means there are no listeners
for the group. The correct thing to do is silently drop the packet
at source.

E.g. avahi will full join 224.0.0.251, which causes a send only IGMP packet
to 224.0.0.22 and then, if there is no IP router listening to IGMP, a
warning level kmessage like this:

 ib0: sendonly multicast join failed for ff12:401b:ffff:0000:0000:0000:0000:0016, status -22

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
(cherry picked from commit d1178cbcdcf91900ccf10a177350d7945703c151)

Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
7 years agoIB/ipoib: Clean up send-only multicast joins
Doug Ledford [Thu, 3 Sep 2015 21:05:58 +0000 (17:05 -0400)]
IB/ipoib: Clean up send-only multicast joins

Orabug: 26324050

Even though we don't expect the group to be created by the SM we
still need to provide all the parameters to force the SM to validate
they are correct.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
(cherry picked from commit c3acdc06a95ff20d920220ecb931186b0bb22c42)

Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
7 years agofs/exec.c: account for argv/envp pointers
Kees Cook [Fri, 23 Jun 2017 22:08:57 +0000 (15:08 -0700)]
fs/exec.c: account for argv/envp pointers

Orabug: 26365008
CVE: CVE-2017-1000365

When limiting the argv/envp strings during exec to 1/4 of the stack limit,
the storage of the pointers to the strings was not included.  This means
that an exec with huge numbers of tiny strings could eat 1/4 of the stack
limit in strings and then additional space would be later used by the
pointers to the strings.

For example, on 32-bit with an 8MB stack rlimit, an exec with 1677721
single-byte strings would consume less than 2MB of stack, the max (8MB /
4) amount allowed, but the pointers to the strings would consume the
remaining additional stack space (1677721 * 4 == 6710884).

The result (1677721 + 6710884 == 8388605) would exhaust stack space
entirely.  Controlling this stack exhaustion could result in
pathological behavior in setuid binaries (CVE-2017-1000365).
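
The arithmetic in the example above can be checked directly. This is a
worked calculation only, using the numbers from the commit message (8MB
rlimit, 4-byte pointers, 1677721 single-byte strings, counted as one byte
each as the message does); the variable names are illustrative and do not
come from fs/exec.c.

```python
STACK_RLIMIT = 8 * 1024 * 1024   # 8MB stack rlimit from the example
POINTER_SIZE = 4                 # pointer size on 32-bit
NUM_STRINGS = 1677721            # number of single-byte strings in the example

string_bytes = NUM_STRINGS * 1               # one byte per string, as in the example
pointer_bytes = NUM_STRINGS * POINTER_SIZE   # storage for the argv/envp pointers
total = string_bytes + pointer_bytes

# The strings alone stay under the 1/4-of-rlimit check (2097152 bytes).
print(string_bytes <= STACK_RLIMIT // 4)   # True
print(pointer_bytes)                       # 6710884
print(total)                               # 8388605
print(STACK_RLIMIT - total)                # 3
```

Only 3 bytes of the 8388608-byte limit remain once the pointers are
counted, which is why the pointers have to be included in the check.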

[akpm@linux-foundation.org: additional commenting from Kees]
Fixes: b6a2fea39318 ("mm: variable length argument support")
Link: http://lkml.kernel.org/r/20170622001720.GA32173@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Qualys Security Advisory <qsa@qualys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 98da7d08850fb8bdeb395d6368ed15753304aa0c)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
7 years agosched/core: Use load_avg for selecting idlest group
Vincent Guittot [Thu, 8 Dec 2016 16:56:54 +0000 (17:56 +0100)]
sched/core: Use load_avg for selecting idlest group

find_idlest_group() only compares the runnable_load_avg when looking
for the least loaded group. But in fork-intensive use cases like
hackbench, where tasks block quickly after the fork, this can lead to
selecting the same CPU instead of other CPUs, which have similar
runnable load but a lower load_avg.

When the runnable_load_avg values of 2 CPUs are close, we now take into
account the amount of blocked load as a 2nd selection factor. There are
now 3 zones for the runnable_load of the rq:

 - [0 .. (runnable_load - imbalance)]:
Select the new rq which has significantly less runnable_load

 - [(runnable_load - imbalance) .. (runnable_load + imbalance)]:
The runnable loads are close so we use load_avg to choose
between the 2 rq

 - [(runnable_load + imbalance) .. ULONG_MAX]:
Keep the current rq which has significantly less runnable_load

The scale factor that is currently used for comparing runnable_load
doesn't work well with small values. As an example, the use of a
scaling factor fails as soon as this_runnable_load == 0 because we
always select local rq even if min_runnable_load is only 1, which
doesn't really make sense because they are just the same. So instead
of scaling factor, we use an absolute margin for runnable_load to
detect CPUs with similar runnable_load and we keep using scaling
factor for blocked load.
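
The three zones and the absolute margin can be sketched as a small decision
function. This is a simplified illustration, not the code in
kernel/sched/fair.c: the function name and arguments are hypothetical, and
the tie-break compares load_avg directly, whereas the patch keeps a scaling
factor for the blocked load.

```python
def pick_rq(cand_runnable, cur_runnable, cand_load_avg, cur_load_avg, imbalance):
    """Return "candidate" or "current" following the three zones (simplified)."""
    # Zone 1: the candidate has significantly less runnable load.
    if cand_runnable < cur_runnable - imbalance:
        return "candidate"
    # Zone 3: the current rq has significantly less runnable load.
    if cand_runnable > cur_runnable + imbalance:
        return "current"
    # Zone 2: runnable loads are close, so load_avg (which also reflects
    # blocked load) decides between the two.
    return "candidate" if cand_load_avg < cur_load_avg else "current"


# Runnable loads of 0 and 1 are "just the same" with an absolute margin,
# so the decision falls through to load_avg instead of always picking local.
print(pick_rq(1, 0, cand_load_avg=10, cur_load_avg=50, imbalance=8))   # candidate
print(pick_rq(1, 0, cand_load_avg=50, cur_load_avg=10, imbalance=8))   # current
print(pick_rq(100, 500, cand_load_avg=0, cur_load_avg=0, imbalance=8)) # candidate
```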

For use cases like hackbench, this enables the scheduler to select
different CPUs during the fork sequence and to spread tasks across the
system.

Tests have been done on a Hikey board (ARM based octo cores) for
several kernels. The result below gives min, max, avg and stdev values
of 18 runs with each configuration.

The patches depend on the "no missing update_rq_clock()" work.

hackbench -P -g 1

         ea86cb4b7621  7dc603c9028e  v4.8        v4.8+patches
  min    0.049         0.050         0.051       0.048
  avg    0.057         0.057(0%)     0.057(0%)   0.055(+5%)
  max    0.066         0.068         0.070       0.063
  stdev  +/-9%         +/-9%         +/-8%       +/-9%

More performance numbers here:

  https://lkml.kernel.org/r/20161203214707.GI20785@codeblueprint.co.uk

Orabug: 25862897

Tested-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten.Rasmussen@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: kernellwp@gmail.com
Cc: umgwanakikbuti@gmail.com
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1481216215-24651-3-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 6b94780e45c17b83e3e75f8aaca5a328db583c74)
Conflicts:
kernel/sched/fair.c
Signed-off-by: subhra mazumdar <subhra.mazumdar@oracle.com>
Reviewed-by: Atish Patra <atish.patra@oracle.com>
7 years agodentry name snapshots
Al Viro [Fri, 7 Jul 2017 18:51:19 +0000 (14:51 -0400)]
dentry name snapshots

Orabug: 26630800
CVE: CVE-2017-7533

take_dentry_name_snapshot() takes a safe snapshot of dentry name;
if the name is a short one, it gets copied into caller-supplied
structure, otherwise an extra reference to external name is grabbed
(those are never modified).  In either case the pointer to stable
string is stored into the same structure.

dentry must be held by the caller of take_dentry_name_snapshot(),
but may be freely dropped afterwards - the snapshot will stay
until destroyed by release_dentry_name_snapshot().

Intended use:
struct name_snapshot s;

take_dentry_name_snapshot(&s, dentry);
...
access s.name
...
release_dentry_name_snapshot(&s);

Replaces fsnotify_oldname_...(), gets used in fsnotify to obtain the name
to pass down with event.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 49d31c2f389acfe83417083e1208422b4091cd9e)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
7 years agoNFSv4.1: Use seqid returned by EXCHANGE_ID after state migration
Chuck Lever [Thu, 1 Jun 2017 16:03:38 +0000 (12:03 -0400)]
NFSv4.1: Use seqid returned by EXCHANGE_ID after state migration

Transparent State Migration copies a client's lease state from the
server where a filesystem used to reside to the server where it now
resides. When an NFSv4.1 client first contacts that destination
server, it uses EXCHANGE_ID to detect trunking relationships.

The lease that was copied there is returned to that client, but the
destination server sets EXCHGID4_FLAG_CONFIRMED_R when replying to
the client. This is because the lease was confirmed on the source
server (before it was copied).

Normally, when CONFIRMED_R is set, a client purges the lease and
creates a new one. However, that throws away the entire benefit of
Transparent State Migration.

Therefore, the client must use the contrived slot sequence value
returned by the destination server for its first CREATE_SESSION
operation after a Transparent State Migration.

Orabug: 25802443
Reported-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
(hand picked mainline 838edb9 NFSv4.1: Use seqid returned ...)
Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
7 years agoipv6: fix out of bound writes in __ip6_append_data()
Eric Dumazet [Fri, 19 May 2017 21:17:48 +0000 (14:17 -0700)]
ipv6: fix out of bound writes in __ip6_append_data()

Orabug: 26575181
CVE: CVE-2017-9242

Andrey Konovalov and idaifish@gmail.com reported crashes caused by
one skb shared_info being overwritten from __ip6_append_data().

Andrey's program leads to the following state:

copy -4200 datalen 2000 fraglen 2040
maxfraglen 2040 alloclen 2048 transhdrlen 0 offset 0 fraggap 6200

The skb_copy_and_csum_bits(skb_prev, maxfraglen, data + transhdrlen,
fraggap, 0); is overwriting skb->head and skb_shared_info

Since we apparently detect this rare condition too late, move the
code earlier to even avoid allocating skb and risking crashes.

Once again, many thanks to Andrey and syzkaller team.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Reported-by: <idaifish@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 232cd35d0804cc241eb887bb8d4d9b3b9881c64a)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
net/ipv6/ip6_output.c

7 years agomnt: Add a per mount namespace limit on the number of mounts
Eric W. Biederman [Wed, 28 Sep 2016 05:27:17 +0000 (00:27 -0500)]
mnt: Add a per mount namespace limit on the number of mounts

Orabug: 26575596
CVE: CVE-2016-6213

CAI Qian <caiqian@redhat.com> pointed out that the semantics
of shared subtrees make it possible to create an exponentially
increasing number of mounts in a mount namespace.

    mkdir /tmp/1 /tmp/2
    mount --make-rshared /
    for i in $(seq 1 20) ; do mount --bind /tmp/1 /tmp/2 ; done

Will create 2^20 or 1048576 mounts, which is a practical problem
as some people have managed to hit this by accident.

As such CVE-2016-6213 was assigned.

Ian Kent <raven@themaw.net> described the situation for autofs users
as follows:

> The number of mounts for direct mount maps is usually not very large because of
> the way they are implemented, large direct mount maps can have performance
> problems. There can be anywhere from a few (likely case a few hundred) to less
> than 10000, plus mounts that have been triggered and not yet expired.
>
> Indirect mounts have one autofs mount at the root plus the number of mounts that
> have been triggered and not yet expired.
>
> The number of autofs indirect map entries can range from a few to the common
> case of several thousand and in rare cases up to between 30000 and 50000. I've
> not heard of people with maps larger than 50000 entries.
>
> The larger the number of map entries the greater the possibility for a large
> number of active mounts so it's not hard to expect cases of a 1000 or somewhat
> more active mounts.

So I am setting the default number of mounts allowed per mount
namespace at 100,000.  This is more than enough for any use case I
know of, but small enough to quickly stop an exponential increase
in mounts.  Which should be perfect to catch misconfigurations and
malfunctioning programs.

For anyone who needs a higher limit this can be changed by writing
to the new /proc/sys/fs/mount-max sysctl.
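
As a rough illustration of why a 100,000 limit stops the bind-mount loop
early, the sketch below models the idealized case where every iteration
doubles the number of mounts, starting from one. The helper name and the
pure-doubling model are assumptions made for illustration; real mount
counts depend on what is already mounted in the namespace.

```python
def doublings_before_limit(limit, start=1):
    """Count how many doublings fit before the next one would exceed limit."""
    count, iterations = start, 0
    while count * 2 <= limit:
        count *= 2
        iterations += 1
    return iterations, count


print(2 ** 20)                          # 1048576
print(doublings_before_limit(100000))   # (16, 65536)
```

In this model the 17th iteration would need 131072 mounts and is refused,
so the loop from the example cannot reach its 20th doubling.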

Tested-by: CAI Qian <caiqian@redhat.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
(cherry picked from commit d29216842a85c7970c536108e093963f02714498)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
fs/namespace.c
kernel/sysctl.c

7 years agol2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()
Guillaume Nault [Fri, 18 Nov 2016 21:13:00 +0000 (22:13 +0100)]
l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()

Lock socket before checking the SOCK_ZAPPED flag in l2tp_ip6_bind().
Without lock, a concurrent call could modify the socket flags between
the sock_flag(sk, SOCK_ZAPPED) test and the lock_sock() call. This way,
a socket could be inserted twice in l2tp_ip6_bind_table. Releasing it
would then leave a stale pointer there, generating use-after-free
errors when walking through the list or modifying adjacent entries.

BUG: KASAN: use-after-free in l2tp_ip6_close+0x22e/0x290 at addr ffff8800081b0ed8
Write of size 8 by task syz-executor/10987
CPU: 0 PID: 10987 Comm: syz-executor Not tainted 4.8.0+ #39
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
 ffff880031d97838 ffffffff829f835b ffff88001b5a1640 ffff8800081b0ec0
 ffff8800081b15a0 ffff8800081b6d20 ffff880031d97860 ffffffff8174d3cc
 ffff880031d978f0 ffff8800081b0e80 ffff88001b5a1640 ffff880031d978e0
Call Trace:
 [<ffffffff829f835b>] dump_stack+0xb3/0x118 lib/dump_stack.c:15
 [<ffffffff8174d3cc>] kasan_object_err+0x1c/0x70 mm/kasan/report.c:156
 [<     inline     >] print_address_description mm/kasan/report.c:194
 [<ffffffff8174d666>] kasan_report_error+0x1f6/0x4d0 mm/kasan/report.c:283
 [<     inline     >] kasan_report mm/kasan/report.c:303
 [<ffffffff8174db7e>] __asan_report_store8_noabort+0x3e/0x40 mm/kasan/report.c:329
 [<     inline     >] __write_once_size ./include/linux/compiler.h:249
 [<     inline     >] __hlist_del ./include/linux/list.h:622
 [<     inline     >] hlist_del_init ./include/linux/list.h:637
 [<ffffffff8579047e>] l2tp_ip6_close+0x22e/0x290 net/l2tp/l2tp_ip6.c:239
 [<ffffffff850b2dfd>] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415
 [<ffffffff851dc5a0>] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422
 [<ffffffff84c4581d>] sock_release+0x8d/0x1d0 net/socket.c:570
 [<ffffffff84c45976>] sock_close+0x16/0x20 net/socket.c:1017
 [<ffffffff817a108c>] __fput+0x28c/0x780 fs/file_table.c:208
 [<ffffffff817a1605>] ____fput+0x15/0x20 fs/file_table.c:244
 [<ffffffff813774f9>] task_work_run+0xf9/0x170
 [<ffffffff81324aae>] do_exit+0x85e/0x2a00
 [<ffffffff81326dc8>] do_group_exit+0x108/0x330
 [<ffffffff81348cf7>] get_signal+0x617/0x17a0 kernel/signal.c:2307
 [<ffffffff811b49af>] do_signal+0x7f/0x18f0
 [<ffffffff810039bf>] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156
 [<     inline     >] prepare_exit_to_usermode arch/x86/entry/common.c:190
 [<ffffffff81006060>] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259
 [<ffffffff85e4d726>] entry_SYSCALL_64_fastpath+0xc4/0xc6
Object at ffff8800081b0ec0, in cache L2TP/IPv6 size: 1448
Allocated:
PID = 10987
 [ 1116.897025] [<ffffffff811ddcb6>] save_stack_trace+0x16/0x20
 [ 1116.897025] [<ffffffff8174c736>] save_stack+0x46/0xd0
 [ 1116.897025] [<ffffffff8174c9ad>] kasan_kmalloc+0xad/0xe0
 [ 1116.897025] [<ffffffff8174cee2>] kasan_slab_alloc+0x12/0x20
 [ 1116.897025] [<     inline     >] slab_post_alloc_hook mm/slab.h:417
 [ 1116.897025] [<     inline     >] slab_alloc_node mm/slub.c:2708
 [ 1116.897025] [<     inline     >] slab_alloc mm/slub.c:2716
 [ 1116.897025] [<ffffffff817476a8>] kmem_cache_alloc+0xc8/0x2b0 mm/slub.c:2721
 [ 1116.897025] [<ffffffff84c4f6a9>] sk_prot_alloc+0x69/0x2b0 net/core/sock.c:1326
 [ 1116.897025] [<ffffffff84c58ac8>] sk_alloc+0x38/0xae0 net/core/sock.c:1388
 [ 1116.897025] [<ffffffff851ddf67>] inet6_create+0x2d7/0x1000 net/ipv6/af_inet6.c:182
 [ 1116.897025] [<ffffffff84c4af7b>] __sock_create+0x37b/0x640 net/socket.c:1153
 [ 1116.897025] [<     inline     >] sock_create net/socket.c:1193
 [ 1116.897025] [<     inline     >] SYSC_socket net/socket.c:1223
 [ 1116.897025] [<ffffffff84c4b46f>] SyS_socket+0xef/0x1b0 net/socket.c:1203
 [ 1116.897025] [<ffffffff85e4d685>] entry_SYSCALL_64_fastpath+0x23/0xc6
Freed:
PID = 10987
 [ 1116.897025] [<ffffffff811ddcb6>] save_stack_trace+0x16/0x20
 [ 1116.897025] [<ffffffff8174c736>] save_stack+0x46/0xd0
 [ 1116.897025] [<ffffffff8174cf61>] kasan_slab_free+0x71/0xb0
 [ 1116.897025] [<     inline     >] slab_free_hook mm/slub.c:1352
 [ 1116.897025] [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
 [ 1116.897025] [<     inline     >] slab_free mm/slub.c:2951
 [ 1116.897025] [<ffffffff81748b28>] kmem_cache_free+0xc8/0x330 mm/slub.c:2973
 [ 1116.897025] [<     inline     >] sk_prot_free net/core/sock.c:1369
 [ 1116.897025] [<ffffffff84c541eb>] __sk_destruct+0x32b/0x4f0 net/core/sock.c:1444
 [ 1116.897025] [<ffffffff84c5aca4>] sk_destruct+0x44/0x80 net/core/sock.c:1452
 [ 1116.897025] [<ffffffff84c5ad33>] __sk_free+0x53/0x220 net/core/sock.c:1460
 [ 1116.897025] [<ffffffff84c5af23>] sk_free+0x23/0x30 net/core/sock.c:1471
 [ 1116.897025] [<ffffffff84c5cb6c>] sk_common_release+0x28c/0x3e0 ./include/net/sock.h:1589
 [ 1116.897025] [<ffffffff8579044e>] l2tp_ip6_close+0x1fe/0x290 net/l2tp/l2tp_ip6.c:243
 [ 1116.897025] [<ffffffff850b2dfd>] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415
 [ 1116.897025] [<ffffffff851dc5a0>] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422
 [ 1116.897025] [<ffffffff84c4581d>] sock_release+0x8d/0x1d0 net/socket.c:570
 [ 1116.897025] [<ffffffff84c45976>] sock_close+0x16/0x20 net/socket.c:1017
 [ 1116.897025] [<ffffffff817a108c>] __fput+0x28c/0x780 fs/file_table.c:208
 [ 1116.897025] [<ffffffff817a1605>] ____fput+0x15/0x20 fs/file_table.c:244
 [ 1116.897025] [<ffffffff813774f9>] task_work_run+0xf9/0x170
 [ 1116.897025] [<ffffffff81324aae>] do_exit+0x85e/0x2a00
 [ 1116.897025] [<ffffffff81326dc8>] do_group_exit+0x108/0x330
 [ 1116.897025] [<ffffffff81348cf7>] get_signal+0x617/0x17a0 kernel/signal.c:2307
 [ 1116.897025] [<ffffffff811b49af>] do_signal+0x7f/0x18f0
 [ 1116.897025] [<ffffffff810039bf>] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156
 [ 1116.897025] [<     inline     >] prepare_exit_to_usermode arch/x86/entry/common.c:190
 [ 1116.897025] [<ffffffff81006060>] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259
 [ 1116.897025] [<ffffffff85e4d726>] entry_SYSCALL_64_fastpath+0xc4/0xc6
Memory state around the buggy address:
 ffff8800081b0d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff8800081b0e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8800081b0e80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
                                                    ^
 ffff8800081b0f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8800081b0f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

==================================================================

The same issue exists with l2tp_ip_bind() and l2tp_ip_bind_table.
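
The ordering fix (take the lock first, then test the flag) can be modelled
in user space. This is an analogy only, with hypothetical names: a Python
lock stands in for lock_sock(), a boolean for SOCK_ZAPPED, and a list for
the bind table. It is not the l2tp code.

```python
import threading


class BindTable:
    def __init__(self):
        self.lock = threading.Lock()   # stands in for lock_sock()
        self.zapped = True             # stands in for SOCK_ZAPPED (not yet bound)
        self.entries = []              # stands in for the bind table

    def bind(self, sock_id):
        # The flag is tested only while the lock is held, so no other
        # caller can change it between the test and the insertion.
        with self.lock:
            if not self.zapped:
                return False
            self.entries.append(sock_id)
            self.zapped = False
            return True


table = BindTable()
results = []
threads = [
    threading.Thread(target=lambda: results.append(table.bind("sk")))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results.count(True), len(table.entries))  # 1 1
```

However the eight callers interleave, exactly one of them inserts the
socket, which is the property the racy check-then-lock order could not
guarantee.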

Fixes: c51ce49735c1 ("l2tp: fix oops in L2TP IP sockets for connect() AF_UNSPEC case")
Reported-by: Baozeng Ding <sploving1@gmail.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Baozeng Ding <sploving1@gmail.com>
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 32c231164b762dddefa13af5a0101032c70b50ef)

Orabug: 26575341
CVE: CVE-2016-10200

Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
7 years agoKEYS: Disallow keyrings beginning with '.' to be joined as session keyrings
David Howells [Tue, 18 Apr 2017 14:31:07 +0000 (15:31 +0100)]
KEYS: Disallow keyrings beginning with '.' to be joined as session keyrings

This fixes CVE-2016-9604.

Keyrings whose names begin with a '.' are special internal keyrings and so
userspace isn't allowed to create keyrings by this name to prevent
shadowing.  However, the patch that added the guard didn't fix
KEYCTL_JOIN_SESSION_KEYRING.  Not only can that create dot-named keyrings,
it can also subscribe to them as a session keyring if they grant SEARCH
permission to the user.

This, for example, allows a root process to set .builtin_trusted_keys as
its session keyring, at which point it has full access because now the
possessor permissions are added.  This permits root to add extra public
keys, thereby bypassing module verification.

This also affects kexec and IMA.

This can be tested by (as root):

keyctl session .builtin_trusted_keys
keyctl add user a a @s
keyctl list @s

which on my test box gives me:

2 keys in keyring:
180010936: ---lswrv     0     0 asymmetric: Build time autogenerated kernel key: ae3d4a31b82daa8e1a75b49dc2bba949fd992a05
801382539: --alswrv     0     0 user: a

Fix this by rejecting names beginning with a '.' in the keyctl.
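
The check described above can be sketched in isolation as follows. This is
an illustration only, not the code from the patch: the helper name
session_keyring_name_allowed() and the self_check() function are
hypothetical, and -EPERM is assumed as the error returned to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): decide whether userspace
 * may join a session keyring with the given name.
 * Returns 0 if allowed, -EPERM (assumed error code) if the name is
 * reserved because it begins with a '.'.
 */
static int session_keyring_name_allowed(const char *name)
{
	if (name == NULL)
		return 0;	/* no name: anonymous keyring, nothing to shadow */
	if (name[0] == '.')
		return -EPERM;	/* dot-prefixed names are internal keyrings */
	return 0;
}

/* Small usage example exercising both outcomes. */
void self_check(void)
{
	assert(session_keyring_name_allowed(".builtin_trusted_keys") == -EPERM);
	assert(session_keyring_name_allowed("mysession") == 0);
	assert(session_keyring_name_allowed(NULL) == 0);
}
```

Rejecting on the first character alone keeps the test independent of which
internal keyrings happen to exist on a given kernel.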

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
cc: linux-ima-devel@lists.sourceforge.net
cc: stable@vger.kernel.org
(cherry picked from commit ee8f844e3c5a73b999edf733df1c529d6503ec2f)

Orabug: 26575534
CVE: CVE-2016-9604

Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>
7 years ago sctp: do not inherit ipv6_{mc|ac|fl}_list from parent
Eric Dumazet [Wed, 17 May 2017 14:16:40 +0000 (07:16 -0700)]
sctp: do not inherit ipv6_{mc|ac|fl}_list from parent

SCTP needs fixes similar to 83eaddab4378 ("ipv6/dccp: do not inherit
ipv6_mc_list from parent"), otherwise bad things can happen.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fdcee2cbb8438702ea1b328fb6e0ac5e9a40c7f8)

Orabug: 26107745
CVE: CVE-2017-9075

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: algif_hash - Fix result clobbering in recvmsg
Herbert Xu [Mon, 21 Nov 2016 07:34:00 +0000 (15:34 +0800)]
crypto: algif_hash - Fix result clobbering in recvmsg

Recently an init call was added to hash_recvmsg so as to reset
the hash state in case a sendmsg call was never made.

Unfortunately this ended up clobbering the result if the previous
sendmsg was done with a MSG_MORE flag.  This patch fixes it by
excluding that case when we make the init call.
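
The interaction can be illustrated with a toy model. Everything below is
hypothetical: struct toy_hash_ctx, the toy_*() functions and the "hash" (a
plain byte sum) are stand-ins invented for this sketch, not the algif_hash
code. The "fixed" parameter selects between the behaviour described as
broken and the behaviour described as the fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the per-socket hash context (hypothetical). */
struct toy_hash_ctx {
	unsigned int state;	/* running "hash": a simple byte sum */
	bool result_ready;	/* a finished result is waiting to be read */
	bool more;		/* last send used MSG_MORE: data is pending */
};

static void toy_init(struct toy_hash_ctx *ctx)
{
	ctx->state = 0;
}

/* Feed data; msg_more says whether further data will follow. */
static void toy_sendmsg(struct toy_hash_ctx *ctx, const unsigned char *buf,
			size_t len, bool msg_more)
{
	size_t i;

	if (!ctx->more) {		/* start of a new message */
		toy_init(ctx);
		ctx->result_ready = false;
	}
	for (i = 0; i < len; i++)
		ctx->state += buf[i];
	ctx->more = msg_more;
	if (!msg_more)
		ctx->result_ready = true;	/* finalised on last chunk */
}

/* Read the result. With fixed == false, pending MSG_MORE data is lost. */
static unsigned int toy_recvmsg(struct toy_hash_ctx *ctx, bool fixed)
{
	/* Reset only if nothing was sent; the fix also spares pending data. */
	if (!ctx->result_ready && (!fixed || !ctx->more))
		toy_init(ctx);
	ctx->more = false;
	ctx->result_ready = false;
	return ctx->state;
}

/* Usage example: "ab" sent with MSG_MORE, then read back. */
void self_check(void)
{
	struct toy_hash_ctx ctx = { 0 };

	toy_sendmsg(&ctx, (const unsigned char *)"ab", 2, true);
	assert(toy_recvmsg(&ctx, true) == 'a' + 'b');	/* result preserved */

	toy_sendmsg(&ctx, (const unsigned char *)"ab", 2, true);
	assert(toy_recvmsg(&ctx, false) == 0);		/* result clobbered */
}
```

In this model the init call in the receive path still covers the case where
no send was ever made, which is the reason it was added in the first place.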

Fixes: a8348bca2944 ("algif_hash - Fix NULL hash crash with shash")
Reported-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 8acf7a106326eb94e143552de81f34308149121c)

Orabug: 25698521

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
7 years ago crypto: algif_hash - Fix NULL hash crash with shash
Herbert Xu [Thu, 17 Nov 2016 14:07:58 +0000 (22:07 +0800)]
crypto: algif_hash - Fix NULL hash crash with shash

Recently algif_hash has been changed to allow null hashes.  This
triggers a bug when used with an shash algorithm whereby it will
cause a crash during the digest operation.

This patch fixes it by avoiding the digest operation and instead
doing an init followed by a final which avoids the buggy code in
shash.

This patch also ensures that the result buffer is freed after an
error so that it is not returned as a genuine hash result on the
next recv call.

The shash/ahash wrapper code will be fixed later to handle this
case correctly.

Fixes: 493b2ed3f760 ("crypto: algif_hash - Handle NULL hashes correctly")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Laura Abbott <labbott@redhat.com>
(cherry picked from commit a8348bca2944d397a528772f5c0ccb47a8b58af4)

Orabug: 25698521

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>