www.infradead.org Git - users/jedix/linux-maple.git/log

uek-rpm: Enable USB/IP support (Add usbip-core driver).

Orabug: 26668125

Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

x86/nmi: Save regs in crash dump on external NMI

Now, multiple CPUs can receive an external NMI simultaneously by
specifying the "apic_extnmi=all" command line parameter. When we take
a crash dump by using external NMI with this option, we fail to save
registers into the crash dump. This happens as follows:

  CPU 0                              CPU 1
  ================================   =============================
  receive an external NMI
  default_do_nmi()                   receive an external NMI
    spin_lock(&nmi_reason_lock)      default_do_nmi()
    io_check_error()                   spin_lock(&nmi_reason_lock)
      panic()                            busy loop
      ...
        kdump_nmi_shootdown_cpus()
          issue NMI IPI -----------> blocked until IRET
                                         busy loop...

Here, since CPU 1 is in NMI context, an additional NMI from CPU 0
remains unhandled until CPU 1 IRETs. However, CPU 1 will never execute
IRET so the NMI is not handled and the callback function to save
registers is never called.

To solve this issue, we check if the IPI for crash dumping was issued
while waiting for nmi_reason_lock to be released, and if so, call its
callback function directly. If the IPI is not issued (e.g. kdump is
disabled), the actual behavior doesn't change.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20151210065245.4587.39316.stgit@softrs
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
(cherry picked from commit b279d67df88a49c6ca32b3eebd195660254be394)

Orabug: 25679037

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

aio: mark AIO pseudo-fs noexec

This ensures that do_mmap() won't implicitly make AIO memory mappings
executable if the READ_IMPLIES_EXEC personality flag is set.  Such
behavior is problematic because the security_mmap_file LSM hook doesn't
catch this case, potentially permitting an attacker to bypass a W^X
policy enforced by SELinux.

I have tested the patch on my machine.

To test the behavior, compile and run this:

    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/personality.h>
    #include <linux/aio_abi.h>
    #include <err.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <sys/syscall.h>

    int main(void) {
        personality(READ_IMPLIES_EXEC);
        aio_context_t ctx = 0;
        if (syscall(__NR_io_setup, 1, &ctx))
            err(1, "io_setup");

        char cmd[1000];
        sprintf(cmd, "cat /proc/%d/maps | grep -F '/[aio]'",
            (int)getpid());
        system(cmd);
        return 0;
    }

In the output, "rw-s" is good, "rwxs" is bad.

Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 22f6b4d34fcf039c63a94e7670e0da24f8575a5a)

Orabug: 26540416
CVE: CVE-2016-10044

Signed-off-by: Tim Tianyang Chen <tianyang.chen@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

vfs: Commit to never having exectuables on proc and sysfs.

Today proc and sysfs do not contain any executable files. Several
applications today mount proc or sysfs without noexec and nosuid and
then depend on there being no exectuables files on proc or sysfs.
Having any executable files show on proc or sysfs would cause
a user space visible regression, and most likely security problems.

Therefore commit to never allowing executables on proc and sysfs by
adding a new flag to mark them as filesystems without executables and
enforce that flag.

Test the flag where MNT_NOEXEC is tested today, so that the only user
visible effect will be that exectuables will be treated as if the
execute bit is cleared.

The filesystems proc and sysfs do not currently incoporate any
executable files so this does not result in any user visible effects.

This makes it unnecessary to vet changes to proc and sysfs tightly for
adding exectuable files or changes to chattr that would modify
existing files, as no matter what the individual file say they will
not be treated as exectuable files by the vfs.

Not having to vet changes to closely is important as without this we
are only one proc_create call (or another goof up in the
implementation of notify_change) from having problematic executables
on proc. Those mistakes are all too easy to make and would create
a situation where there are security issues or the assumptions of
some program having to be broken (and cause userspace regressions).

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
(cherry picked from commit 90f8572b0f021fdd1baa68e00a8c30482ee9e5f4)

Orabug: 26540416
CVE: CVE-2016-10044

Signed-off-by: Tim Tianyang Chen <tianyang.chen@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
include/linux/fs.h

vfs, writeback: replace FS_CGROUP_WRITEBACK with SB_I_CGROUPWB

FS_CGROUP_WRITEBACK indicates whether a file_system_type supports
cgroup writeback; however, different super_blocks of the same
file_system_type may or may not support cgroup writeback depending on
filesystem options.  This patch replaces FS_CGROUP_WRITEBACK with a
per-super_block flag.

super_block->s_flags carries some internal flags in the high bits but
it's exposd to userland through uapi header and running out of space
anyway.  This patch adds a new field super_block->s_iflags to carry
kernel-internal flags.  It is currently only used by the new
SB_I_CGROUPWB flag whose concatenated and abbreviated name is for
consistency with other super_block flags.

ext2_fill_super() is updated to set SB_I_CGROUPWB.

v2: Added super_block->s_iflags instead of stealing another high bit
    from sb->s_flags as suggested by Christoph and Jan.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-ext4@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 46b15caa7cb19b0f6e3bc8ebaee5bc1bb2e35110)

Orabug: 26540416
CVE: CVE-2016-10044

Signed-off-by: Tim Tianyang Chen <tianyang.chen@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
include/linux/backing-dev.h
include/linux/fs.h
fs/ext2/super.c

crypto: ccp - Release locks before returning

Orabug: 26644685

krobot warning: make sure that all error return paths release locks.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 30b4c54ccdebfc5a0210d9a4d77cc3671c9d1576)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - return NULL instead of 0

Orabug: 26644685

This change is to handle sparse warning. Return type of function is a pointer to the structure and
it returns 0. Instead it should return NULL.

Signed-off-by: Pushkar Jambhlekar <pushkar.iit@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 9d1fb19668207b66bc54f8e10fa91a2fcff061e4)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Add debugfs entries for CCP information

Orabug: 26644685

Expose some data about the configuration and operation of the CCP
through debugfs entries: device name, capabilities, configuration,
statistics.

Allow the user to reset the counters to zero by writing (any value)
to the 'stats' file. This can be done per queue or per device.

Changes from V1:
- Correct polarity of test when destroying devices at module unload

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 3cdbe346ed3f380eae1cb3e9febfe703e7d8a7b0)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Use IPAD/OPAD constant

Orabug: 26644685

This patch simply replace all occurrence of HMAC IPAD/OPAD value by their
define.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 6507c57bb013a22ae0f61bdbb2a63380598d34a2)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: hmac - add hmac IPAD/OPAD constant

Orabug: 26644685

Many HMAC users directly use directly 0x36/0x5c values.
It's better with crypto to use a name instead of directly some crypto
constant.

This patch simply add HMAC_IPAD_VALUE/HMAC_OPAD_VALUE defines in a new
include file "crypto/hmac.h" and use them in crypto/hmac.c

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 03d7db5654aefb78bf2ded62e2187833fda2a6cc)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Add a module author

Orabug: 26644685

CC: <stable@vger.kernel.org> # 4.9.x+
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit ac50b78b22cddb8a85b264b1f172c4239e826fd2)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Change ISR handler method for a v5 CCP

Orabug: 26644685

The CCP has the ability to perform several operations simultaneously,
but only one interrupt. When implemented as a PCI device and using
MSI-X/MSI interrupts, use a tasklet model to service interrupts. By
disabling and enabling interrupts from the CCP, coupled with the
queuing that tasklets provide, we can ensure that all events
(occurring on the device) are recognized and serviced.

This change fixes a problem wherein 2 or more busy queues can cause
notification bits to change state while a (CCP) interrupt is being
serviced, but after the queue state has been evaluated. This results
in the event being 'lost' and the queue hanging, waiting to be
serviced. Since the status bits are never fully de-asserted, the
CCP never generates another interrupt (all bits zero -> one or more
bits one), and no further CCP operations will be executed.

Cc: <stable@vger.kernel.org> # 4.9.x+
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 6263b51eb3190d30351360fd168959af7e3a49a9)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Change ISR handler method for a v3 CCP

Orabug: 26644685

The CCP has the ability to perform several operations simultaneously,
but only one interrupt. When implemented as a PCI device and using
MSI-X/MSI interrupts, use a tasklet model to service interrupts. By
disabling and enabling interrupts from the CCP, coupled with the
queuing that tasklets provide, we can ensure that all events
(occurring on the device) are recognized and serviced.

This change fixes a problem wherein 2 or more busy queues can cause
notification bits to change state while a (CCP) interrupt is being
serviced, but after the queue state has been evaluated. This results
in the event being 'lost' and the queue hanging, waiting to be
serviced. Since the status bits are never fully de-asserted, the
CCP never generates another interrupt (all bits zero -> one or more
bits one), and no further CCP operations will be executed.

Cc: <stable@vger.kernel.org> # 4.9.x+
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 7b537b24e76a1e8e6d7ea91483a45d5b1426809b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Disable interrupts early on unload

Orabug: 26644685

Ensure that we disable interrupts first when shutting down
the driver.

Cc: <stable@vger.kernel.org> # 4.9.x+
Signed-off-by: Gary R Hook <ghook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 116591fe3eef11c6f06b662c9176385f13891183)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Use only the relevant interrupt bits

Orabug: 26644685

Each CCP queue can product interrupts for 4 conditions:
operation complete, queue empty, error, and queue stopped.
This driver only works with completion and error events.

Cc: <stable@vger.kernel.org> # 4.9.x+
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 56467cb11cf8ae4db9003f54b3d3425b5f07a10a)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Rearrange structure members to minimize size

Orabug: 26644685

The AES GCM function (in ccp-ops) requires a fair amount of
stack space, which elicits a complaint when KASAN is enabled.
Rearranging and packing a few structures eliminates the
warning.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 2d158391061ec8c73898ceac148f4eddfa83efd5)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Remove redundant cpu-to-le32 macros

Orabug: 26644685

Endianness is dealt with when the command descriptor is
copied into the command queue. Remove any occurrences of
cpu_to_le32() found elsewhere.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 51de7dd02d422da11b4dff6f11936c8333a870fe)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Enable 3DES function on v5 CCPs

Orabug: 26644685

Wire up support for Triple DES in ECB mode.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 990672d48515ce09c76fcf1ceccee48b0dd1942b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Add SHA-2 384- and 512-bit support

Orabug: 26644685

Incorporate 384-bit and 512-bit hashing for a version 5 CCP
device

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit ccebcf3f224a44ec8e9c5bfca9d8e5d29298a5a8)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Make some CCP DMA channels private

Orabug: 26644685

The CCP registers its queues as channels capable of handling
general DMA operations. The NTB driver will use DMA if
directed, but as public channels can be reserved for use in
asynchronous operations some channels should be held back
as private. Since the public/private determination is
handled at a device level, reserve the "other" (secondary)
CCP channels as private.

Add a module parameter that allows for override, to be
applied to all channels on all devices.

CC: <stable@vger.kernel.org> # 4.10.x-
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit efc989fce8703914bac091dcc4b8ff7a72ccf987)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Assign DMA commands to the channel's CCP

Orabug: 26644685

The CCP driver generally uses a round-robin approach when
assigning operations to available CCPs. For the DMA engine,
however, the DMA mappings of the SGs are associated with a
specific CCP. When an IOMMU is enabled, the IOMMU is
programmed based on this specific device.

If the DMA operations are not performed by that specific
CCP then addressing errors and I/O page faults will occur.

Update the CCP driver to allow a specific CCP device to be
requested for an operation and use this in the DMA engine
support.

Cc: <stable@vger.kernel.org> # 4.9.x-
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 7c468447f40645fbf2a033dfdaa92b1957130d50)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Simplify some buffer management routines

Orabug: 26644685

The reverse-get/set functions can be simplified by
eliminating unused code.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 83d650ab78c7185da815e16d03fb579d3fde0140)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Update the command queue on errors

Orabug: 26644685

Move the command queue tail pointer when an error is
detected. Always return the error.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 4cdf101ef444e47bc8869ef3e90396e828fd9b61)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Change mode for detailed CCP init messages

Orabug: 26644685

The CCP initialization messages only need to be sent to
syslog in debug mode.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit a60496a0ca0d34a3ae92e426138eab35f0f45612)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Set the AES size field for all modes

Orabug: 26644685

Ensure that the size field is correctly populated for
all AES modes.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit f7cc02b3c3a33a10dd5bb9e5dfd22e47e09503a2)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Fix double add when creating new DMA command

Orabug: 26644685

Eliminate a double-add by creating a new list to manage
command descriptors when created; move the descriptor to
the pending list when the command is submitted.

Cc: <stable@vger.kernel.org>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit e5da5c5667381d2772374ee6a2967b3576c9483d)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Fix DMA operations when IOMMU is enabled

Orabug: 26644685

An I/O page fault occurs when the IOMMU is enabled on a
system that supports the v5 CCP. DMA operations use a
Request ID value that does not match what is expected by
the IOMMU, resulting in the I/O page fault. Setting the
Request ID value to 0 corrects this issue.

Cc: <stable@vger.kernel.org>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 500c0106e638e08c2c661c305ed57d6b67e10908)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Fix handling of RSA exponent on a v5 device

Orabug: 26644685

The exponent size in the ccp_op structure is in bits. A v5
CCP requires the exponent size to be in bytes, so convert
the size from bits to bytes when populating the descriptor.

The current code references the exponent in memory, but
these fields have not been set since the exponent is
actually store in the LSB. Populate the descriptor with
the LSB location (address).

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit e6414b13ea39e3011901a84eb1bdefa65610b0f8)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - fix typo "CPP"

Orabug: 26644685

The abbreviation for Cryptographic Coprocessor is "CCP".

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit c8d283ff8b0b6b2061dfc137afd6c56608a34bcb)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Clean up the LSB slot allocation code

Orabug: 26644685

Fix a few problems revealed by testing: verify consistent
units, especially in public slot allocation. Percolate
some common initialization code up to a common routine.
Add some comments.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 103600ab966a2f02d8986bbfdf87b762b1c6a06d)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - remove unneeded code

Orabug: 26644685

Clean up patch for an unneeded structure member.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit ec9b70df75b3600ca20338198a43173f23e6bb9b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - change bitfield type to unsigned ints

Orabug: 26644685

Bit fields are not sensitive to endianness, so use
a transparent standard data type

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit fdd2cf9db1e25a46a74c5802d18435171c92e7df)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Fix non static symbol warning

Orabug: 26644685

Fixes the following sparse warning:

drivers/crypto/ccp/ccp-dev.c:44:6: warning:
symbol 'ccp_error_codes' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit ff4f44de44dbd98feecf8fa76e14353a3993b335)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - change type of struct member lsb to signed

Orabug: 26644685

The lsb field uses a value of -1 to indicate that it
is unassigned. Therefore type must be a signed int.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 3cf799680d2612a21d50ed554848dd37241672c8)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Make syslog errors human-readable

Orabug: 26644685

Add human-readable strings to log messages about CCP errors

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 81422badb39078fde1ffcecda3caac555226fc7b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - clean up data structure

Orabug: 26644685

Change names of data structure instances. Add const
keyword where appropriate. Add error handling path.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 9ddb9dc6be095ebe393f7eb582df09cc4847c5e9)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Fix return value check in ccp_dmaengine_register()

Orabug: 26644685

Fix the retrn value check which testing the wrong variable
in ccp_dmaengine_register().

Fixes: 58ea8abf4904 ("crypto: ccp - Register the CCP as a DMA resource")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 7514e3688811e610640ec2201ca14dfebfe13442)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - use kmem_cache_zalloc instead of kmem_cache_alloc/memset

Orabug: 26644685

Using kmem_cache_zalloc() instead of kmem_cache_alloc() and memset().

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 664f570a9cee51a8c7caef042118abd2b48705b1)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - add missing release in ccp_dmaengine_register

Orabug: 26644685

ccp_dmaengine_register used to return with an error code before
releasing all resource. This patch adds a jump to the appropriate label
ensuring that the resources are properly released before returning.

This issue was found with Hector.

Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit ba22a1e2aa8ef7f8467f755cfe44b79784febefe)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Fix non static symbol warning

Orabug: 26644685

Fixes the following sparse warning:

drivers/crypto/ccp/ccp-dev.c:62:14: warning:
symbol 'ccp_increment_unit_ordinal' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit dabc7904a74c47ead9d40cc00d5e8b1946a0736c)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Enable use of the additional CCP

Orabug: 26644685

A second CCP is available, identical to the first, with
its ownn PCI ID. Make it available for use by the crypto
subsystem, as well as for DMA activity and random
number generation.

This device is not pre-configured at at boot time. The
driver must configure it (during the probe) for use.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit e14e7d126765ce0156ab5e3b250b1270998c207d)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Enable DMA service on a v5 CCP

Orabug: 26644685

Every CCP is capable of providing general DMA services.
Register the device as a provider.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 99d90b2ebd8b327c0c496798db99009b30c70945)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Add support for the RNG in a version 5 CCP

Orabug: 26644685

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 084935b208f6507ef5214fd67052a67a700bc6cf)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Let a v5 CCP provide the same function as v3

Orabug: 26644685

Enable equivalent function on a v5 CCP. Add support for a
version 5 CCP which enables AES/XTS/SHA services. Also,
more work on the data structures to virtualize
functionality.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 4b394a232df78414442778b02ca4a388d947d059)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Refactor code to enable checks for queue space.

Orabug: 26644685

Available queue space is used to decide (by counting free slots)
if we have to put a command on hold or if it can be sent
to the engine immediately.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit bb4e89b34d1bf46156b7e880a0f34205fb7ce2a5)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Refactor code supporting the CCP's RNG

Orabug: 26644685

Make the RNG support code common (where possible) in
preparation for adding a v5 device.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 8256e683113e659d9bf6bffdd227eeb1881ae9a7)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Refactor the storage block allocation code

Orabug: 26644685

Move the KSB access/management functions to the v3
device file, and add function pointers to the actions
structure. At the operations layer all of the references
to the storage block will be generic (virtual). This is
in preparation for a version 5 device, in which the
private storage block is managed differently.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 58a690b701efc32ffd49722dd7b887154eb5a205)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Refactoring: symbol cleanup

Orabug: 26644685

Form and use of the local storage block in the CCP is
particular to the device version. Much of the code that
accesses the storage block can treat it as a virtual
resource, and will under go some renaming. Device-specific
access to the memory will be moved into device file.
Service functions will be added to the actions
structure.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 956ee21a6df08afd9c1c64e0f394a9a1b65e897d)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Shorten the fields of the action structure

Orabug: 26644685

Use more concise field names; "perform_" is too verbose.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit a43eb98507574acfc435c38a6b7fb1fab6605519)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Abstract PCI info for the CCP

Orabug: 26644685

Device-specific values for the BAR and offset should be found
in the version data structure.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit fba8855cb2403707b0639bdff0d34149699f14a2)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Fix non-conforming comment style

Orabug: 26644685

Adhere to the cryptodev comment convention.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit fa242e80c7fb581eddbe636186020786f2e117da)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Use skcipher for fallback

Orabug: 26644685

This patch replaces use of the obsolete ablkcipher with skcipher.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 241118de58dace0fe24a754b6e2e3cb6f804ad47)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: skcipher - Add helper to zero stack request

Orabug: 26644685

As the size of an skcipher_request is variable, it's awkward to
zero it explicitly. This patch adds a helper to do that which
should be used when it is created on the stack.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 1aaa753d918c48c603195a468766e6a2b32b87f9)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: skcipher - Add top-level skcipher interface

Orabug: 26644685

This patch introduces the crypto skcipher interface which aims
to replace both blkcipher and ablkcipher.

It's very similar to the existing ablkcipher interface. The
main difference is the removal of the givcrypt interface. In
order to make the transition easier for blkcipher users, there
is a helper SKCIPHER_REQUEST_ON_STACK which can be used to place
a request on the stack for synchronous transforms.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 7a7ffe65c8c5fbf272b132d8980b2511d5e5fc98)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - constify ccp_actions structure

Orabug: 26644685

The ccp_actions structure is never modified, so declare it as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Gary Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit bc197b2a9c7e0129fa0ec1961881e2a0b3bef967)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Ensure all dependencies are specified

Orabug: 26644685

A DMA_ENGINE requires DMADEVICES in Kconfig

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit b3c2fee5d66b0d1e977de1a56243002e532da6a5)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Register the CCP as a DMA resource

Orabug: 26644685

The CCP has the ability to provide DMA services to the
kernel using pass-through mode of the device. Register
these services as general purpose DMA channels.

Changes since v2:
- Add a Signed-off-by

Changes since v1:
- Allocate memory for a string in ccp_dmaengine_register
- Ensure register/unregister calls are properly ordered
- Verified all changed files are listed in the diffstat
- Undo some superfluous changes
- Added a cc:

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 58ea8abf490415c390e0cc671e875510c9b66318)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Fix RT breaking #include <linux/rwlock_types.h>

Orabug: 26644685

Direct include of rwlock_types.h breaks RT, use spinlock_types.h instead.

Fixes: 553d2374db0b crypto: ccp - Support for multiple CCPs
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 7587c407540006e4e8fd5ed33f66ffe6158e830a)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - fix lock acquisition code

Orabug: 26644685

This patch simplifies an unneeded read-write lock.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 03a6f29000fdc13adc2bb2e22efd07a51d334154)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Add abstraction for device-specific calls

Orabug: 26644685

Support for different generations of the coprocessor
requires that an abstraction layer be implemented for
interacting with the hardware. This patch splits out
version-specific functions to a separate file and populates
the version structure (acting as a driver) with function
pointers.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit ea0375afa17281e9e0190034215d0404dbad7449)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - CCP versioning support

Orabug: 26644685

Future hardware may introduce new algorithms wherein the
driver will need to manage resources for different versions
of the cryptographic coprocessor. This precursor patch
determines the version of the available device, and marks
and registers algorithms accordingly. A structure is added
which manages the version-specific data.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit c7019c4d739e79d7baaa13c86dcaaedec8113d70)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Support for multiple CCPs

Orabug: 26644685

Enable management of >1 CCPs in a system. Each device will
get a unique identifier, as well as uniquely named
resources. Treat each CCP as an orthogonal unit and register
resources individually.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 553d2374db0bb3f48bbd29bef7ba2a4d1a3f325d)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Remove check for x86 family and model

Orabug: 26644685

Each x86 SoC will make use of a unique PCI ID for the CCP
device so it is not necessary to check for the CPU family
and model.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 3f19ce2054541a6c663c8a5fcf52e7baa1c6c5f5)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - use to_pci_dev and to_platform_device

Orabug: 26644685

Use to_pci_dev() and to_platform_device() instead of open-coding.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit c6c59bf2c0d60e67449190a8a95628ecd04b3969)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Use precalculated hash from headers

Orabug: 26644685

Precalculated hash for empty message are now present in hash headers.
This patch just use them.

Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit bdd75064d2b2068007f4fc5e26ac726e8617a090)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: hash - add zero length message hash for shax and md5

Orabug: 26644685

Some crypto drivers cannot process empty data message and return a
precalculated hash for md5/sha1/sha224/sha256.

This patch add thoses precalculated hash in include/crypto.

Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 0c4c78de0417ced1da92351a3013e631860ea576)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
include/crypto/md5.h

crypto: ccp - Use module name in driver structures

Orabug: 26644685

The convention is to use the name of the module in the driver structures
that are used for registering the device. The CCP module is currently
using a descriptive name. Replace the descriptive name with module name.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 166db195536f380c4545a8d2fca9789402464bc8)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Change references to accelerator to offload

Orabug: 26644685

The CCP is meant to be more of an offload engine than an accelerator
engine. To avoid any confusion, change references to accelerator to
offload.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 21dc9e8f941f8693992230d189a556b220b50f5b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Replace BUG_ON with WARN_ON and a return code

Orabug: 26644685

Replace the usage of BUG_ON with WARN_ON and return an error.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 355eba5dda6984cbe10fa914e5cc8ef45a34cce2)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Provide support to autoload CCP driver

Orabug: 26644685

Add the necessary module device tables to the platform support to allow
for autoloading of the CCP driver. This will allow for the CCP's hwrng
support to be available without having to manually load the driver. The
module device table entry for the pci support is already present.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 6170511a917679f8a1324f031a0a40f851ae91e9)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Protect against poorly marked end of sg list

Orabug: 26644685

Scatter gather lists can be created with more available entries than are
actually used (e.g. using sg_init_table() to reserve a specific number
of sg entries, but in actuality using something less than that based on
the data length). The caller sometimes fails to mark the last entry
with sg_mark_end(). In these cases, sg_nents() will return the original
size of the sg list as opposed to the actual number of sg entries that
contain valid data.

On arm64, if the sg_nents() value is used in a call to dma_map_sg() in
this situation, then it causes a BUG_ON in lib/swiotlb.c because an
"empty" sg list entry results in dma_capable() returning false and
swiotlb trying to create a bounce buffer of size 0. This occurred in
the userspace crypto interface before being fixed by

0f477b655a52 ("crypto: algif - Mark sgl end at the end of data")

Protect against this by using the new sg_nents_for_len() function which
returns only the number of sg entries required to meet the desired
length and supplying that value to dma_map_sg().

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit fb43f69401fef8ed2f72d7ea4a25910a0f2138bc)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

scatterlist: introduce sg_nents_for_len

Orabug: 26644685

When performing a dma_map_sg() call, the number of sg entries to map is
required. Using sg_nents to retrieve the number of sg entries will
return the total number of entries in the sg list up to the entry marked
as the end. If there happen to be unused entries in the list, these will
still be counted. Some dma_map_sg() implementations will not handle the
unused entries correctly (lib/swiotlb.c) and execute a BUG_ON.

The sg_nents_for_len() function will traverse the sg list and return the
number of entries required to satisfy the supplied length argument. This
can then be supplied to the dma_map_sg() call to successfully map the
sg.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit cfaed10d1f27d036b72bbdc6b1e59ea28c38ec7f)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Remove unused structure field

Orabug: 26644685

Remove the length field from the ccp_sg_workarea since it is unused.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit d725332208ef13241fc435eece790c9d0ea16a4e)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

crypto: ccp - Remove manual check and set of dma_mask pointer

Orabug: 26644685

The underlying device support will set the device dma_mask pointer
if DMA is set up properly for the device. Remove the check for and
assignment of dma_mask when it is null. Instead, just error out if
the dma_set_mask_and_coherent function fails because dma_mask is null.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit d921620e03c58bd83c4b345e36a104988d697f0d)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

uek-rpm: Enable CCP device driver and interface support

Orabug: 26644685

Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

mqueue: fix a use-after-free in sys_mq_notify()

The retry logic for netlink_attachskb() inside sys_mq_notify()
is nasty and vulnerable:

1) The sock refcnt is already released when retry is needed
2) The fd is controllable by user-space because we already
release the file refcnt

so we when retry but the fd has been just closed by user-space
during this small window, we end up calling netlink_detachskb()
on the error path which releases the sock again, later when
the user-space closes this socket a use-after-free could be
triggered.

Setting 'sock' to NULL here should be sufficient to fix it.

Reported-by: GeneBlue <geneblue.mail@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit f991af3daabaecff34684fd51fac80319d1baad1)

Orabug: 26584960
CVE: CVE-2017-11176

Signed-off-by: Tim Tianyang Chen <tianyang.chen@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

x86/acpi: Prevent out of bound access caused by broken ACPI tables

The bus_irq argument of mp_override_legacy_irq() is used as the index into
the isa_irq_to_gsi[] array. The bus_irq argument originates from
ACPI_MADT_TYPE_IO_APIC and ACPI_MADT_TYPE_INTERRUPT items in the ACPI
tables, but is nowhere sanity checked.

That allows broken or malicious ACPI tables to overwrite memory, which
might cause malfunction, panic or arbitrary code execution.

Add a sanity check and emit a warning when that triggers.

[ tglx: Added warning and rewrote changelog ]

Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: security@kernel.org
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: stable@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit dad5ab0db8deac535d03e3fe3d8f2892173fa6a4)

Orabug: 26540612
CVE: CVE-2017-11473

Signed-off-by: Tim Tianyang Chen <tianyang.chen@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

Merge remote-tracking branch 'krish/uek-4.1-next' into uek/uek-4.1-next

* krish/uek-4.1-next:
  KVM: nVMX: fix nested EPT detection
  KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry
  KVM: nVMX: propagate errors from prepare_vmcs02
  KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT

Fixes
OraBug: 26628813 KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT - backport+regression fix

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Btrfs: add free space tree mount option

Now we can finally hook up everything so we can actually use free space
tree. The free space tree is enabled by passing the space_cache=v2 mount
option. On the first mount with the this option set, the free space tree
will be created and the FREE_SPACE_TREE read-only compat bit will be
set. Any time the filesystem is mounted from then on, we must use the
free space tree. The clear_cache option will also clear the free space
tree.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 70f6d82ec73c3ae2d3adc6853c5bebcd73610097)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

Btrfs: wire up the free space tree to the extent tree

The free space tree is updated in tandem with the extent tree. There are
only a handful of places where we need to hook in:

1. Block group creation
2. Block group deletion
3. Delayed refs (extent creation and deletion)
4. Block group caching

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 1e144fb8f4a4d6d6d88c58f87e4366e3cd02ab72)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

Btrfs: add free space tree sanity tests

This tests the operations on the free space tree trying to excercise all
of the main cases for both formats. Between this and xfstests, the free
space tree should have pretty good coverage.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 7c55ee0c4afba4434d973117234577ae6ff77a1c)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

Btrfs: implement the free space B-tree

The free space cache has turned out to be a scalability bottleneck on
large, busy filesystems. When the cache for a lot of block groups needs
to be written out, we can get extremely long commit times; if this
happens in the critical section, things are especially bad because we
block new transactions from happening.

The main problem with the free space cache is that it has to be written
out in its entirety and is managed in an ad hoc fashion. Using a B-tree
to store free space fixes this: updates can be done as needed and we get
all of the benefits of using a B-tree: checksumming, RAID handling,
well-understood behavior.

With the free space tree, we get commit times that are about the same as
the no cache case with load times slower than the free space cache case
but still much faster than the no cache case. Free space is represented
with extents until it becomes more space-efficient to use bitmaps,
giving us similar space overhead to the free space cache.

The operations on the free space tree are: adding and removing free
space, handling the creation and deletion of block groups, and loading
the free space for a block group. We can also create the free space tree
by walking the extent tree and clear the free space tree.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit a5ed91828518ab076209266c2bc510adabd078df)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

Btrfs: introduce the free space B-tree on-disk format

The on-disk format for the free space tree is straightforward. Each
block group is represented in the free space tree by a free space info
item that stores accounting information: whether the free space for this
block group is stored as bitmaps or extents and how many extents of free
space exist for this block group (regardless of which format is being
used in the tree). Extents are (start, FREE_SPACE_EXTENT, length) keys
with no corresponding item, and bitmaps instead have the
FREE_SPACE_BITMAP type and have a bitmap item attached, which is just an
array of bytes.

Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 208acb8c72d7ace6b672b105502dca0bcb050162)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

Btrfs: refactor caching_thread()

We're also going to load the free space tree from caching_thread(), so
we should refactor some of the common code.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 73fa48b674e819098c3bafc47618d0e2868191e5)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

Btrfs: add helpers for read-only compat bits

We're finally going to add one of these for the free space tree, so
let's add the same nice helpers that we have for the incompat bits.
While we're add it, also add helpers to clear the bits.

Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 1abfbcdf56d9485f050149bc4968c1609f9a0773)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

Btrfs: add extent buffer bitmap sanity tests

Sanity test the extent buffer bitmap operations (test, set, and clear)
against the equivalent standard kernel operations.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 0f3312295d3ce1d82392244236a52b3b663480ef)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

Btrfs: add extent buffer bitmap operations

These are going to be used for the free space tree bitmap items.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 26274676

(cherry picked from commit 3e1e8bb770dba29645b302c5499ffcb8e3906712)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

x86/irq: Retrieve irq data after locking irq_desc

irq_data is protected by irq_desc->lock, so retrieving the irq chip
from irq_data outside the lock is racy vs. an concurrent update. Move
it into the lock held region.

While at it add a comment why the vector walk does not require
vector_lock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.331320612@linutronix.de
(cherry picked from commit 09cf92b784fae6109450c5d64f9908066d605249)

Orabug: 25671838

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

x86/irq: Use proper locking in check_irq_vectors_for_cpu_disable()

It's unsafe to examine fields in the irq descriptor w/o holding the
descriptor lock. Add proper locking.

While at it add a comment why the vector check can run lock less

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.236544164@linutronix.de
(cherry picked from commit cbb24dc761d95fe39a7a122bb1b298e9604cae15)

Orabug: 25671838

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

x86/irq: Plug irq vector hotplug race

Jin debugged a nasty cpu hotplug race which results in leaking a irq
vector on the newly hotplugged cpu.

cpu N cpu M
native_cpu_up                   device_shutdown
  do_boot_cpu   free_msi_irqs
  start_secondary                   arch_teardown_msi_irqs
    smp_callin                        default_teardown_msi_irqs
       setup_vector_irq                  arch_teardown_msi_irq
        __setup_vector_irq    native_teardown_msi_irq
          lock(vector_lock)      destroy_irq
          install vectors
          unlock(vector_lock)
       lock(vector_lock)
--->                                          __clear_irq_vector
                                            unlock(vector_lock)
    lock(vector_lock)
    set_cpu_online
    unlock(vector_lock)

This leaves the irq vector(s) which are torn down on CPU M stale in
the vector array of CPU N, because CPU M does not see CPU N online
yet. There is a similar issue with concurrent newly setup interrupts.

The alloc/free protection of irq descriptors does not prevent the
above race, because it merily prevents interrupt descriptors from
going away or changing concurrently.

Prevent this by moving the call to setup_vector_irq() into the
vector_lock held region which protects set_cpu_online():

cpu N cpu M
native_cpu_up                   device_shutdown
  do_boot_cpu   free_msi_irqs
  start_secondary                   arch_teardown_msi_irqs
    smp_callin                        default_teardown_msi_irqs
       lock(vector_lock)                arch_teardown_msi_irq
       setup_vector_irq()
        __setup_vector_irq    native_teardown_msi_irq
          install vectors      destroy_irq
       set_cpu_online
       unlock(vector_lock)
       lock(vector_lock)
                                          __clear_irq_vector
                                            unlock(vector_lock)

So cpu M either sees the cpu N online before clearing the vector or
cpu N installs the vectors after cpu M has cleared it.

Reported-by: xiao jin <jin.xiao@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.141898931@linutronix.de
(cherry picked from commit 5a3f75e3f02836518ce49536e9c460ca8e1fa290)

Orabug: 25671838

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
arch/x86/kernel/smpboot.c

hotplug: Prevent alloc/free of irq descriptors during cpu up/down

When a cpu goes up some architectures (e.g. x86) have to walk the irq
space to set up the vector space for the cpu. While this needs extra
protection at the architecture level we can avoid a few race
conditions by preventing the concurrent allocation/free of irq
descriptors and the associated data.

When a cpu goes down it moves the interrupts which are targeted to
this cpu away by reassigning the affinities. While this happens
interrupts can be allocated and freed, which opens a can of race
conditions in the code which reassignes the affinities because
interrupt descriptors might be freed underneath.

Example:

CPU1 CPU2
cpu_up/down
irq_desc = irq_to_desc(irq);
remove_from_radix_tree(desc);
raw_spin_lock(&desc->lock);
free(desc);

We could protect the irq descriptors with RCU, but that would require
a full tree change of all accesses to interrupt descriptors. But
fortunately these kind of race conditions are rather limited to a few
things like cpu hotplug. The normal setup/teardown is very well
serialized. So the simpler and obvious solution is:

Prevent allocation and freeing of interrupt descriptors accross cpu
hotplug.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: xiao jin <jin.xiao@intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20150705171102.063519515@linutronix.de
(cherry picked from commit a899418167264c7bac574b1a0f1b2c26c5b0995a)

Orabug: 25671838

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

sysfs: replace WARN() with pr_debug when sysfs_remove_group() failed

Orabug: 26374902

There is no enough error handling in block device adding/registration
path, for example,

device_add_disk()
  blk_register_queue()

When kernel returns from device_add_disk(), no return value to tell
us it was successful or not --- that suggests it would always succeed,
and according to this assumption, then during block device removal/
unregistration steps,

sd_remove()
  del_gendisk()
    blk_unregister_queue()

dpm_sysfs_remove(), blk_trace_remove_sysfs() will be called blindly,
though there is likely no 'trace' 'power' sysfs groups there because
actually blk_register_queue()/device_add() failed somewhere. thus
causes WARN flood emitted from sysfs_remove_group() as following triggered
by unloading fnic driver:

modprobe -rv fnic

[  122.081398] WARNING: CPU: 14 PID: 11709 at fs/sysfs/group.c:224
        sysfs_remove_group+0x9c/0xa0()
[  122.081399] sysfs group 'trace' not found for kobject 'sdb'
[  122.081424] CPU: 14 PID: 11709 Comm: modprobe Tainted: G        W
        4.1.12.x86_64 #2
[  122.081425] Hardware name: Cisco Systems Inc UCSBXXxx
[  122.081425]  0000000000000286 00000000d03792ff ffff881037823ad8
        ffffffff8173605d
[  122.081427]  ffff881037823b30 ffffffff81a2b9bc ffff881037823b18
        ffffffff810862aa
[  122.081428]  ffff88103974a000 0000000000000000 ffffffff81ba4080
        ffff882037d45080
[  122.081430] Call Trace:
[  122.081432]  [<ffffffff8173605d>] dump_stack+0x63/0x81
[  122.081434]  [<ffffffff810862aa>] warn_slowpath_common+0x8a/0xc0
[  122.081435]  [<ffffffff81086335>] warn_slowpath_fmt+0x55/0x70
[  122.081437]  [<ffffffff8129321c>] ? kernfs_find_and_get_ns+0x4c/0x60
[  122.081439]  [<ffffffff81296b5c>] sysfs_remove_group+0x9c/0xa0
[  122.081441]  [<ffffffff811675a4>] blk_trace_remove_sysfs+0x14/0x20
[  122.081444]  [<ffffffff81312605>] blk_unregister_queue+0x65/0x90
[  122.081446]  [<ffffffff81320f26>] del_gendisk+0x126/0x290
[  122.081449]  [<ffffffffa0091281>] sd_remove+0x61/0xc0 [sd_mod]
[  122.081452]  [<ffffffff81492fb7>] __device_release_driver+0x87/0x120
[  122.081454]  [<ffffffff81493073>] device_release_driver+0x23/0x30
[  122.081456]  [<ffffffff814928f8>] bus_remove_device+0x108/0x180
[  122.081457]  [<ffffffff8148eca0>] device_del+0x160/0x2a0
[  122.081459]  [<ffffffff814d8feb>] __scsi_remove_device+0xcb/0xd0
[  122.081461]  [<ffffffff814d7524>] scsi_forget_host+0x64/0x70
[  122.081462]  [<ffffffff814cac0b>] scsi_remove_host+0x7b/0x130
[  122.081466]  [<ffffffffa016fc47>] fnic_remove+0x1b7/0x4a0 [fnic]
[  122.081469]  [<ffffffff8138434f>] pci_device_remove+0x3f/0xc0
[  122.081472]  [<ffffffff81492fb7>] __device_release_driver+0x87/0x120
[  122.081474]  [<ffffffff81493a38>] driver_detach+0xc8/0xd0
[  122.081478]  [<ffffffff81492c19>] bus_remove_driver+0x59/0xe0
[  122.081479]  [<ffffffff814942e0>] driver_unregister+0x30/0x70
[  122.081482]  [<ffffffff81382dba>] pci_unregister_driver+0x2a/0x80
[  122.081486]  [<ffffffffa01808cc>] fnic_cleanup_module+0x10/0x7a [fnic]
[  122.081488]  [<ffffffff8110e8ec>] SyS_delete_module+0x1ac/0x230
[  122.081490]  [<ffffffff81028666>] ? syscall_trace_leave+0xc6/0x150
[  122.081491]  [<ffffffff8173dcee>] system_call_fastpath+0x12/0x71
[  122.081502] ---[ end trace 29ba5813719045a4 ]---

WARNING: CPU: 14 PID: 11709 at fs/sysfs/group.c:224
        sysfs_remove_group+0x9c/0xa0()
[  122.095724] sysfs group 'power' not found for kobject 'target2:0:4'
[  122.095790] CPU: 14 PID: 11709 Comm: modprobe Tainted: G        W
        4.1.12.x86_64 #2
[  122.095793] Hardware name: Cisco Systems Inc UCSBXXxx
[  122.095795]  0000000000000286 00000000d03792ff ffff881037823af8
        ffffffff8173605d
[  122.095800]  ffff881037823b50 ffffffff81a2b9bc ffff881037823b38
        ffffffff810862aa
[  122.095803]  ffff88103782
[  122.095807] Call Trace:
[  122.095814]  [<ffffffff8173605d>] dump_stack+0x63/0x81
[  122.095818]  [<ffffffff810862aa>] warn_slowpath_common+0x8a/0xc0
[  122.095822]  [<ffffffff81086335>] warn_slowpath_fmt+0x55/0x70
[  122.095827]  [<ffffffff8129321c>] ? kernfs_find_and_get_ns+0x4c/0x60
[  122.095831]  [<ffffffff81296b5c>] sysfs_remove_group+0x9c/0xa0
[  122.095839]  [<ffffffff8149b7e7>] dpm_sysfs_remove+0x57/0x60
[  122.095843]  [<ffffffff8148ebc6>] device_del+0x86/0x2a0
[  122.095847]  [<ffffffff8148e1f9>] ? device_remove_file+0x19/0x20
[  122.095854]  [<ffffffff814983ae>] attribute_container_class_device_del
        +0x1e/0x30
[  122.095858]  [<ffffffff814985c2>] transport_remove_classdev+0x52/0x60
[  122.095862]  [<ffffffff81498570>] ? transport_add_class_device+0x40/0x40
[  122.095866]  [<ffffffff81497f1c>] attribute_container_device_trigger
        +0xdc/0xf0
[  122.095870]  [<ffffffff81498525>] transport_remove_device+0x15/0x20
[  122.095875]  [<ffffffff814d4df5>] scsi_target_reap_ref_release+0x25/0x40
[  122.095879]  [<ffffffff814d68fc>] scsi_target_reap+0x2c/0x30
[  122.095883]  [<ffffffff814d8fa6>] __scsi_remove_device+0x86/0xd0
[  122.095887]  [<ffffffff814d7524>] scsi_forget_host+0x64/0x70
[  122.095891]  [<ffffffff814cac0b>] scsi_remove_host+0x7b/0x130
[  122.095900]  [<ffffffffa016fc47>] fnic_remove+0x1b7/0x4a0 [fnic]
[  122.095909]  [<ffffffff8138434f>] pci_device_remove+0x3f/0xc0
[  122.095915]  [<ffffffff81492fb7>] __device_release_driver+0x87/0x120
[  122.095922]  [<ffffffff81493a38>] driver_detach+0xc8/0xd0
[  122.095930]  [<ffffffff81492c19>] bus_remove_driver+0x59/0xe0
[  122.095934]  [<ffffffff814942e0>] driver_unregister+0x30/0x70
[  122.095941]  [<ffffffff81382dba>] pci_unregister_driver+0x2a/0x80
[  122.095952]  [<ffffffffa01808cc>] fnic_cleanup_module+0x10/0x7a [fnic]
[  122.095957]  [<ffffffff8110e8ec>] SyS_delete_module+0x1ac/0x230
[  122.095961]  [<ffffffff81028666>] ? syscall_trace_leave+0xc6/0x150
[  122.095966]  [<ffffffff8173dcee>] system_call_fastpath+0x12/0x71
[  122.095968] ---[ end trace 29ba5813719045a6 ]---

While, refactoring block device code seems not valuable if just
because of above noisy but not so dangerous WARN flood.

So this patch suppress the warning flood by replacing WARN() with
pr_debug() as shortcut before refactoring all related block device
code.

This issue also could be reproduced with stable v4.12 kernel.

(Upstream maintainer Greg K-H refused to apply this "workaround / shortcut",
He insisted the issue should be fixed in block device subsystem, that means refactoring
all block device/SCSI drivers and all relevant block layer code, that is not practical task,
it is too expensive, and we couldn't wait for the upstream refactoring,
So this patch is specific to UEK4 code,
*NOTE*, there will be no WARNNING in sysfs_remove_group(), this doens't affect
other WARN_ONCE() in kenrel )

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>

KVM: nVMX: fix nested EPT detection

The nested_ept_enabled flag introduced in commit 7ca29de2136 was not
computed correctly. We are interested only in L1's EPT state, not the
the combined L0+L1 value.

In particular, if L0 uses EPT but L1 does not, nested_ept_enabled must
be false to make sure that PDPSTRs are loaded based on CR3 as usual,
because the special case described in 26.3.2.4 Loading Page-Directory-
Pointer-Table Entries does not apply.

Fixes: 7ca29de21362 ("KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT")
Cc: qemu-stable@nongnu.org
Reported-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7ad658b693536741c37b16aeb07840a2ce75f5b9)
OraBug: 26628813 KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT - backport+regression fix
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tested-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Tested-by: Chris Kenna <chris.kenna@oracle.com>
Acked-by: Konrad Wilk <konrad.wilk@oracle.com>

KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry

Loading CR3 as part of emulating vmentry is different from regular CR3 loads,
as implemented in kvm_set_cr3, in several ways.

* different rules are followed to check CR3 and it is desirable for the caller
to distinguish between the possible failures
* PDPTRs are not loaded if PAE paging and nested EPT are both enabled
* many MMU operations are not necessary

This patch introduces nested_vmx_load_cr3 suitable for CR3 loads as part of
nested vmentry and vmexit, and makes use of it on the nested vmentry path.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(cherry picked from commit 9ed38ffad47316dbdc16de0de275868c7771754d)
OraBug: 26628813 KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT - backport+regression fix
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tested-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Tested-by: Chris Kenna <chris.kenna@oracle.com>
Acked-by: Konrad Wilk <konrad.wilk@oracle.com>

KVM: nVMX: propagate errors from prepare_vmcs02

It is possible that prepare_vmcs02 fails to load the guest state. This
patch adds the proper error handling for such a case. L1 will receive
an INVALID_STATE vmexit with the appropriate exit qualification if it
happens.

A failure to set guest CR3 is the only error propagated from prepare_vmcs02
at the moment.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(cherry picked from commit ee146c1c100dbe9ca92252be2e901b957476b253)
OraBug: 26628813 KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT - backport+regression fix
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tested-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Tested-by: Chris Kenna <chris.kenna@oracle.com>
Acked-by: Konrad Wilk <konrad.wilk@oracle.com>

KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT

KVM does not correctly handle L1 hypervisors that emulate L2 real mode with
PAE and EPT, such as Hyper-V. In this mode, the L1 hypervisor populates guest
PDPTE VMCS fields and leaves guest CR3 uninitialized because it is not used
(see 26.3.2.4 Loading Page-Directory-Pointer-Table Entries). KVM always
dereferences CR3 and tries to load PDPTEs if PAE is on. This leads to two
related issues:

1) On the first nested vmentry, the guest PDPTEs, as populated by L1, are
overwritten in ept_load_pdptrs because the registers are believed to have
been loaded in load_pdptrs as part of kvm_set_cr3. This is incorrect. L2 is
running with PAE enabled but PDPTRs have been set up by L1.

2) When L2 is about to enable paging and loads its CR3, we, again, attempt
to load PDPTEs in load_pdptrs called from kvm_set_cr3. There are no guarantees
that this will succeed (it's just a CR3 load, paging is not enabled yet) and
if it doesn't, kvm_set_cr3 returns early without persisting the CR3 which is
then lost and L2 crashes right after it enables paging.

This patch replaces the kvm_set_cr3 call with a simple register write if PAE
and EPT are both on. CR3 is not to be interpreted in this case.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
(cherry picked from commit 7ca29de21362de242025fbc1c22436e19e39dddc)
OraBug: 26628813 KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT - backport+regression fix
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tested-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Tested-by: Chris Kenna <chris.kenna@oracle.com>
Acked-by: Konrad Wilk <konrad.wilk@oracle.com>

kvm: x86: reduce collisions in mmu_page_hash

When using two-dimensional paging, the mmu_page_hash (which provides
lookups for existing kvm_mmu_page structs), becomes imbalanced; with
too many collisions in buckets 0 and 512. This has been seen to cause
mmu_lock to be held for multiple milliseconds in kvm_mmu_get_page on
VMs with a large amount of RAM mapped with 4K pages.

The current hash function uses the lower 10 bits of gfn to index into
mmu_page_hash. When doing shadow paging, gfn is the address of the
guest page table being shadow. These tables are 4K-aligned, which
makes the low bits of gfn a good hash. However, with two-dimensional
paging, no guest page tables are being shadowed, so gfn is the base
address that is mapped by the table. Thus page tables (level=1) have
a 2MB aligned gfn, page directories (level=2) have a 1GB aligned gfn,
etc. This means hashes will only differ in their 10th bit.

hash_64() provides a better hash. For example, on a VM with ~200G
(99458 direct=1 kvm_mmu_page structs):

hash            max_mmu_page_hash_collisions
--------------------------------------------
low 10 bits     49847
hash_64         105
perfect         97

While we're changing the hash, increase the table size by 4x to better
support large VMs (further reduces number of collisions in 200G VM to
29).

Note that hash_64() does not provide a good distribution prior to commit
ef703f49a6c5 ("Eliminate bad hash multipliers from hash_32() and
hash_64()").

Signed-off-by: David Matlack <dmatlack@google.com>
Change-Id: I5aa6b13c834722813c6cca46b8b1ed6f53368ade
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Orabug: 26628797
(cherry picked from commit 114df303a7eeae8b50ebf68229b7e647714a9bea)
Signed-off-by: Govinda Tatti <Govinda.Tatti@Oracle.COM>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Tested-by: Eyal Moscovici <eyal.moscovici@oracle.com> [Ravello/BMCS Team]

IB/ipoib: For sendonly join free the multicast group on leave

Orabug: 26324050

When we leave the multicast group on expiration of a neighbor we
do not free the mcast structure. This results in a memory leak
that causes ib_dealloc_pd to fail and print a WARN_ON message
and backtrace.

Fixes: bd99b2e05c4d (IB/ipoib: Expire sendonly multicast joins)
Signed-off-by: Christoph Lameter <cl@linux.com>
Tested-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
(cherry picked from commit 0b5c9279e568d90903acedc2b9b832d8d78e8288)

Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>

IB/ipoib: increase the max mcast backlog queue

Orabug: 26324050

When performing sendonly joins, we queue the packets that trigger
the join until the join completes.  This may take on the order of
hundreds of milliseconds.  It is easy to have many more than three
packets come in during that time.  Expand the maximum queue depth
in order to try and prevent dropped packets during the time it
takes to join the multicast group.

Signed-off-by: Doug Ledford <dledford@redhat.com>
(cherry picked from commit 2866196f294954ce9fa226825c8c1eaa64c7da8a)

Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>

IB/ipoib: Make sendonly multicast joins create the mcast group

Orabug: 26324050

Since IPoIB should, as much as possible, emulate how multicast
sends work on Ethernet for regular TCP/IP apps, there should be
no requirement to subscribe to a multicast group before your
sends are properly sent.  However, due to the difference in how
multicast is handled on InfiniBand, we must join the appropriate
multicast group before we can send to it.  Previously we tried
not to trigger the auto-create feature of the subnet manager when
doing this because we didn't have tracking of these sendonly
groups and the auto-creation might never get undone.  The previous
patch added timing to these sendonly joins and allows us to
leave them after a reasonable idle expiration time.  So supply
all of the information needed to auto-create group.

Signed-off-by: Doug Ledford <dledford@redhat.com>
(cherry picked from commit c3852ab0e606212de523c1fb1e15adbf9f431619)

Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>