www.infradead.org Git - users/jedix/linux-maple.git/log

qede: Break datapath logic into its own file

Orabug: 25933053, 26439680

This adds a new file qede_fp.c and relocates the datapath-related
logic into it [from qede_main.c].

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

MSI: Don't assign MSI IRQ vector twice

Orabug: 26275961

NVMe enables MSIx interrupts during the nvme device probe
and also during nvme reset. While enabling MSIx interrupts
using do_setup_msix_irqs(), it does it twice. It assigns
the IRQ under the function irq_alloc_hwirqs() and again
shortly using setup_msi_irq(). During the first invocation
from irq_alloc_hwirqs(), it sets the cfg->vector and
cfg->domain of the specific IRQ. During the subsequent
invocation from setup_msi_irq(), if the cfg->domain
intersects with the target cpumask (tmp_mask) then
the move_in_progress flag is set. This flag is never cleared
unless it is set via proc file system. As this flag is not
cleared, the subsequent smp affinity set via procfs fails.

Upstream introduced IRQ domain hierarchy where they assign
the IRQ only once. However, pulling in IRQ domain hierarchy
from the upstream brings with it lot of changes (85 commits).
Hence, this patch assigns the IRQ only once.

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

blk-mq: Export blk_mq_freeze_queue_wait

Drivers can start a freeze, so this provides a way to wait for frozen.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 6bae363ee3057a14eec93440826813603559273a)

Orabug: 26486098

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

blk-mq: Provide freeze queue timeout

A driver may wish to take corrective action if queued requests do not
complete within a set time.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit f91328c40a559362b6e7b7bfee01ca17fda87592)

Orabug: 26486098

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

nvme: Complete all stuck requests

If the nvme driver is shutting down its controller, the drievr will not
start the queues up again, preventing blk-mq's hot CPU notifier from
making forward progress.

To fix that, this patch starts a request_queue freeze when the driver
resets a controller so no new requests may enter. The driver will wait
for frozen after IO queues are restarted to ensure the queue reference
can be reinitialized when nvme requests to unfreeze the queues.

If the driver is doing a safe shutdown, the driver will wait for the
controller to successfully complete all inflight requests so that we
don't unnecessarily fail them. Once the controller has been disabled,
the queues will be restarted to force remaining entered requests to end
in failure so that blk-mq's hot cpu notifier may progress.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 302ad8cc09339ea261eef58a8d5f4a116a8ffda5)

Orabug: 26486098

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

nvme: Don't suspend admin queue that wasn't created

This fixes a regression in my previous commit c21377f8366c ("nvme:
Suspend all queues before deletion"), which provoked an Oops in the
removal path when removing a device that became IO incapable very early
at probe (i.e. after a failed EEH recovery).

Turns out, if the error occurred very early at the probe path, before
even configuring the admin queue, we might try to suspend the
uninitialized admin queue, accessing bad memory.

Fixes: c21377f8366c ("nvme: Suspend all queues before deletion")
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: Jay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 82469c59d222f839ded5cd282172258e026f9112)

Orabug: 26486098

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

nvme: Delete created IO queues on reset

The driver was decrementing the online_queues prior to attempting to
delete those IO queues, so the driver ended up not requesting the
controller delete any. This patch saves the online_queues prior to
suspending them, and adds that parameter for deleting io queues.

Fixes: c21377f8 ("nvme: Suspend all queues before deletion")
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 7065906096273b39b90a512a7170a6697ed94b23)

Orabug: 26486098

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

nvme: Suspend all queues before deletion

When nvme_delete_queue fails in the first pass of the
nvme_disable_io_queues() loop, we return early, failing to suspend all
of the IO queues.  Later, on the nvme_pci_disable path, this causes us
to disable MSI without actually having freed all the IRQs, which
triggers the BUG_ON in free_msi_irqs(), as show below.

This patch refactors nvme_disable_io_queues to suspend all queues before
start submitting delete queue commands.  This way, we ensure that we
have at least returned every IRQ before continuing with the removal
path.

[  487.529200] kernel BUG at ../drivers/pci/msi.c:368!
cpu 0x46: Vector: 700 (Program Check) at [c0000078c5b83650]
    pc: c000000000627a50: free_msi_irqs+0x90/0x200
    lr: c000000000627a40: free_msi_irqs+0x80/0x200
    sp: c0000078c5b838d0
   msr: 9000000100029033
  current = 0xc0000078c5b40000
  paca    = 0xc000000002bd7600   softe: 0        irq_happened: 0x01
    pid   = 1376, comm = kworker/70:1H
kernel BUG at ../drivers/pci/msi.c:368!
Linux version 4.7.0.mainline+ (root@iod76) (gcc version 5.3.1 20160413
(Ubuntu/IBM 5.3.1-14ubuntu2.1) ) #104 SMP Fri Jul 29 09:20:17 CDT 2016
enter ? for help
[c0000078c5b83920] d0000000363b0cd8 nvme_dev_disable+0x208/0x4f0 [nvme]
[c0000078c5b83a10] d0000000363b12a4 nvme_timeout+0xe4/0x250 [nvme]
[c0000078c5b83ad0] c0000000005690e4 blk_mq_rq_timed_out+0x64/0x110
[c0000078c5b83b40] c00000000056c930 bt_for_each+0x160/0x170
[c0000078c5b83bb0] c00000000056d928 blk_mq_queue_tag_busy_iter+0x78/0x110
[c0000078c5b83c00] c0000000005675d8 blk_mq_timeout_work+0xd8/0x1b0
[c0000078c5b83c50] c0000000000e8cf0 process_one_work+0x1e0/0x590
[c0000078c5b83ce0] c0000000000e9148 worker_thread+0xa8/0x660
[c0000078c5b83d80] c0000000000f2090 kthread+0x110/0x130
[c0000078c5b83e30] c0000000000095f0 ret_from_kernel_thread+0x5c/0x6c

Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: linux-nvme@lists.infradead.org
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit c21377f8366c95440d533edbe47d070f662c62ef)

Orabug: 26486098

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

nvme/pci: No special case for queue busy on IO

This driver previously required we have a special check for IO submitted
to nvme IO queues that are temporarily suspended. That is no longer
necessary since blk-mq provides a quiesce, so any IO that actually gets
submitted to such a queue must be ended since the queue isn't going to
start back up.

This is fixing a condition where we have fewer IO queues after a
controller reset. This may happen if the number of CPU's has changed,
or controller firmware update changed the queue count, for example.

While it may be possible to complete the IO on a different queue, the
block layer does not provide a way to resubmit a request on a different
hardware context once the request has entered the queue. We don't want
these requests to be stuck indefinitely either, so ending them in error
is our only option at the moment.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 9ef3932e250f8e2e11ffbc0c1f28b3ba5dc40cd6)

Orabug: 26486098

UEK4 blk-mq module doesn't have the quescing capability. So the requests
should fail if a namespace is dead.

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>

ipv6: Fix leak in ipv6_gso_segment().

If ip6_find_1stfragopt() fails and we return an error we have to free
up 'segs' because nobody else is going to.

Fixes: 2423496af35d ("ipv6: Prevent overrun when parsing v6 header options")
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e3e86b5119f81e5e2499bea7ea1ebe8ac6aab789)

Orabug: 26175248
CVE-2017-9074

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Qing Huang <qing.huang@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt()

xfrm6_find_1stfragopt() may now return an error code and we must
not treat it as a length.

Fixes: 2423496af35d ("ipv6: Prevent overrun when parsing v6 header options")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6e80ac5cc992ab6256c3dae87f7e57db15e1a58c)

Orabug: 26175248
CVE-2017-9074

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Qing Huang <qing.huang@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

ipv6: Check ip6_find_1stfragopt() return value properly.

Do not use unsigned variables to see if it returns a negative
error or not.

Fixes: 2423496af35d ("ipv6: Prevent overrun when parsing v6 header options")
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7dd7eb9513bd02184d45f000ab69d78cb1fa1531)

Orabug: 26175248
CVE-2017-9074

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Qing Huang <qing.huang@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
net/ipv6/ip6_offload.c

ipv6: Prevent overrun when parsing v6 header options

The KASAN warning repoted below was discovered with a syzkaller
program.  The reproducer is basically:
  int s = socket(AF_INET6, SOCK_RAW, NEXTHDR_HOP);
  send(s, &one_byte_of_data, 1, MSG_MORE);
  send(s, &more_than_mtu_bytes_data, 2000, 0);

The socket() call sets the nexthdr field of the v6 header to
NEXTHDR_HOP, the first send call primes the payload with a non zero
byte of data, and the second send call triggers the fragmentation path.

The fragmentation code tries to parse the header options in order
to figure out where to insert the fragment option.  Since nexthdr points
to an invalid option, the calculation of the size of the network header
can made to be much larger than the linear section of the skb and data
is read outside of it.

This fix makes ip6_find_1stfrag return an error if it detects
running out-of-bounds.

[   42.361487] ==================================================================
[   42.364412] BUG: KASAN: slab-out-of-bounds in ip6_fragment+0x11c8/0x3730
[   42.365471] Read of size 840 at addr ffff88000969e798 by task ip6_fragment-oo/3789
[   42.366469]
[   42.366696] CPU: 1 PID: 3789 Comm: ip6_fragment-oo Not tainted 4.11.0+ #41
[   42.367628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
[   42.368824] Call Trace:
[   42.369183]  dump_stack+0xb3/0x10b
[   42.369664]  print_address_description+0x73/0x290
[   42.370325]  kasan_report+0x252/0x370
[   42.370839]  ? ip6_fragment+0x11c8/0x3730
[   42.371396]  check_memory_region+0x13c/0x1a0
[   42.371978]  memcpy+0x23/0x50
[   42.372395]  ip6_fragment+0x11c8/0x3730
[   42.372920]  ? nf_ct_expect_unregister_notifier+0x110/0x110
[   42.373681]  ? ip6_copy_metadata+0x7f0/0x7f0
[   42.374263]  ? ip6_forward+0x2e30/0x2e30
[   42.374803]  ip6_finish_output+0x584/0x990
[   42.375350]  ip6_output+0x1b7/0x690
[   42.375836]  ? ip6_finish_output+0x990/0x990
[   42.376411]  ? ip6_fragment+0x3730/0x3730
[   42.376968]  ip6_local_out+0x95/0x160
[   42.377471]  ip6_send_skb+0xa1/0x330
[   42.377969]  ip6_push_pending_frames+0xb3/0xe0
[   42.378589]  rawv6_sendmsg+0x2051/0x2db0
[   42.379129]  ? rawv6_bind+0x8b0/0x8b0
[   42.379633]  ? _copy_from_user+0x84/0xe0
[   42.380193]  ? debug_check_no_locks_freed+0x290/0x290
[   42.380878]  ? ___sys_sendmsg+0x162/0x930
[   42.381427]  ? rcu_read_lock_sched_held+0xa3/0x120
[   42.382074]  ? sock_has_perm+0x1f6/0x290
[   42.382614]  ? ___sys_sendmsg+0x167/0x930
[   42.383173]  ? lock_downgrade+0x660/0x660
[   42.383727]  inet_sendmsg+0x123/0x500
[   42.384226]  ? inet_sendmsg+0x123/0x500
[   42.384748]  ? inet_recvmsg+0x540/0x540
[   42.385263]  sock_sendmsg+0xca/0x110
[   42.385758]  SYSC_sendto+0x217/0x380
[   42.386249]  ? SYSC_connect+0x310/0x310
[   42.386783]  ? __might_fault+0x110/0x1d0
[   42.387324]  ? lock_downgrade+0x660/0x660
[   42.387880]  ? __fget_light+0xa1/0x1f0
[   42.388403]  ? __fdget+0x18/0x20
[   42.388851]  ? sock_common_setsockopt+0x95/0xd0
[   42.389472]  ? SyS_setsockopt+0x17f/0x260
[   42.390021]  ? entry_SYSCALL_64_fastpath+0x5/0xbe
[   42.390650]  SyS_sendto+0x40/0x50
[   42.391103]  entry_SYSCALL_64_fastpath+0x1f/0xbe
[   42.391731] RIP: 0033:0x7fbbb711e383
[   42.392217] RSP: 002b:00007ffff4d34f28 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[   42.393235] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb711e383
[   42.394195] RDX: 0000000000001000 RSI: 00007ffff4d34f60 RDI: 0000000000000003
[   42.395145] RBP: 0000000000000046 R08: 00007ffff4d34f40 R09: 0000000000000018
[   42.396056] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000400aad
[   42.396598] R13: 0000000000000066 R14: 00007ffff4d34ee0 R15: 00007fbbb717af00
[   42.397257]
[   42.397411] Allocated by task 3789:
[   42.397702]  save_stack_trace+0x16/0x20
[   42.398005]  save_stack+0x46/0xd0
[   42.398267]  kasan_kmalloc+0xad/0xe0
[   42.398548]  kasan_slab_alloc+0x12/0x20
[   42.398848]  __kmalloc_node_track_caller+0xcb/0x380
[   42.399224]  __kmalloc_reserve.isra.32+0x41/0xe0
[   42.399654]  __alloc_skb+0xf8/0x580
[   42.400003]  sock_wmalloc+0xab/0xf0
[   42.400346]  __ip6_append_data.isra.41+0x2472/0x33d0
[   42.400813]  ip6_append_data+0x1a8/0x2f0
[   42.401122]  rawv6_sendmsg+0x11ee/0x2db0
[   42.401505]  inet_sendmsg+0x123/0x500
[   42.401860]  sock_sendmsg+0xca/0x110
[   42.402209]  ___sys_sendmsg+0x7cb/0x930
[   42.402582]  __sys_sendmsg+0xd9/0x190
[   42.402941]  SyS_sendmsg+0x2d/0x50
[   42.403273]  entry_SYSCALL_64_fastpath+0x1f/0xbe
[   42.403718]
[   42.403871] Freed by task 1794:
[   42.404146]  save_stack_trace+0x16/0x20
[   42.404515]  save_stack+0x46/0xd0
[   42.404827]  kasan_slab_free+0x72/0xc0
[   42.405167]  kfree+0xe8/0x2b0
[   42.405462]  skb_free_head+0x74/0xb0
[   42.405806]  skb_release_data+0x30e/0x3a0
[   42.406198]  skb_release_all+0x4a/0x60
[   42.406563]  consume_skb+0x113/0x2e0
[   42.406910]  skb_free_datagram+0x1a/0xe0
[   42.407288]  netlink_recvmsg+0x60d/0xe40
[   42.407667]  sock_recvmsg+0xd7/0x110
[   42.408022]  ___sys_recvmsg+0x25c/0x580
[   42.408395]  __sys_recvmsg+0xd6/0x190
[   42.408753]  SyS_recvmsg+0x2d/0x50
[   42.409086]  entry_SYSCALL_64_fastpath+0x1f/0xbe
[   42.409513]
[   42.409665] The buggy address belongs to the object at ffff88000969e780
[   42.409665]  which belongs to the cache kmalloc-512 of size 512
[   42.410846] The buggy address is located 24 bytes inside of
[   42.410846]  512-byte region [ffff88000969e780, ffff88000969e980)
[   42.411941] The buggy address belongs to the page:
[   42.412405] page:ffffea000025a780 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
[   42.413298] flags: 0x100000000008100(slab|head)
[   42.413729] raw: 0100000000008100 0000000000000000 0000000000000000 00000001800c000c
[   42.414387] raw: ffffea00002a9500 0000000900000007 ffff88000c401280 0000000000000000
[   42.415074] page dumped because: kasan: bad access detected
[   42.415604]
[   42.415757] Memory state around the buggy address:
[   42.416222]  ffff88000969e880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   42.416904]  ffff88000969e900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   42.417591] >ffff88000969e980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   42.418273]                    ^
[   42.418588]  ffff88000969ea00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   42.419273]  ffff88000969ea80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   42.419882] ==================================================================

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2423496af35d94a87156b063ea5cedffc10a70a1)

Orabug: 26175248
CVE-2017-9074

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Qing Huang <qing.huang@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

sparc64: Convert non-fatal error print to a debug print (DAX driver)

Error code HV_EWOULDBLOCK returned by the hypervisor is a
transient error. It is not considered fatal.

Orabug: 26305292

Acked-by: Jonathan Helman <jonathan.helman@oracle.com>
Acked-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Sanath Kumar <sanath.s.kumar@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Jane Chu <jane.chu@oracle.com>

bnxt_en: Fix SRIOV on big-endian architecture.

The PF driver sets up a list of firmware commands from the VF driver that
needs to be forwarded to the PF for approval.  This list is a 256-bit
bitmap.  The code that sets up the bitmap falls apart on big-endian
architecture.  __set_bit() does not work because it operates on long types
whereas the firmware interface is defined in u32 types, causing bits in
the wrong 32-bit word to be set.

Fix it by setting the proper bits on an array of u32.

Fixes: de68f5de5651 ("bnxt_en: Fix bitmap declaration to work on 32-bit arches.")
Reported-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Orabug: 26000471

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

cpuset: consider dying css as offline

In most cases, a cgroup controller don't care about the liftimes of
cgroups.  For the controller, a css becomes online when ->css_online()
is called on it and offline when ->css_offline() is called.

However, cpuset is special in that the user interface it exposes cares
whether certain cgroups exist or not.  Combined with the RCU delay
between cgroup removal and css offlining, this can lead to user
visible behavior oddities where operations which should succeed after
cgroup removals fail for some time period.  The effects of cgroup
removals are delayed when seen from userland.

This patch adds css_is_dying() which tests whether offline is pending
and updates is_cpuset_online() so that the function returns false also
while offline is pending.  This gets rid of the userland visible
delays.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Link:
http://lkml.kernel.org/r/327ca1f5-7957-fbb9-9e5f-9ba149d40ba2@oracle.com
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Orabug: 26415290

(backport upstream commit 41c25707d21716826e3c1f60967f5550610ec1c9)

Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
Signed-off-by: Tom Hromatka <tom.hromatka@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Tom Hromatka <tom.hromatka@oracle.com>
Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>

proc: sparc64 Export ADI-enabled memory

Add a maps entry to /proc/pid/adi to show memory map information about
SPARC ADI enabled memory.

Orabug: 26052545

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: chris hyser <chris.hyser@oracle.com>

proc: sparc64 ADI version tag debugging interface

To facilitate user space ADI debugging there needs to be a way for a
debugger to get/set ADI version tags in a target process. This is
accomplished with a new /proc/<pid>/adi/tags interface.  This new interface
maps linearly to the address space of the target process at a ratio
of 1:adi_blksz.  A read (or write) of offset K in the file returns
(or modifies) the ADI version tag stored in the cacheline containing
address K * adi_blksz, encoded as 1 version per byte.

Pseudocode example:

        unsigned char vers[2];
        long long addr = 0x20000;

        fd = open(â\80\9c/proc/pid/adi/tagsâ\80\9d, O_RDONLY);
        addr /= adi_blksz();
        rv = pread64(fd, &vers, 2, addr);
        /*
         * vers[0] gets version from address 0x20000,
         * vers[1] gets version from address 0x20000 + adi_blksz()
         */

Orabug: 26051178

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>

proc: Move directory functions into internal.h

Move directory macros and define proc_pident_lookup,
proc_pident_readdir and struct pid_entry within
fs/proc/internal.h. These were originally statically defined within
fs/proc/base.c and couldn't be used elsewhere.

Orabug: 26051178

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Chris Hyser <chris.hyser@oracle.com>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>

sched: Move the loadavg code to a more obvious location

A previous commit f33dfff75d968 ("sched/fair: Rewrite runnable load
and utilization average tracking") created a regression in global
load average in uptime. Active Load average computation function
should be invoked periodically to update the delta for each runqueue.

Use the following upstream commit 3289bdb42 to fix this in stead of
quick-fix.

Before the fix

procs_
    when running load average
======== ======= =================
13:32:46       1  0.65, 0.22, 0.08
13:33:47     129  0.78, 0.33, 0.12
13:34:47     129  0.74, 0.41, 0.16
13:35:47     129  0.60, 0.42, 0.18
13:36:47     129  0.77, 0.49, 0.22
13:37:47     129  0.78, 0.55, 0.26

After the fix:

  procs_
    when running load average
======== ======= =================
19:46:35       1  0.58, 0.38, 0.16
19:47:35     129  74.02, 21.09, 7.27
19:48:35     129  103.16, 39.08, 14.31
19:49:35     129  114.25, 53.95, 20.98
19:52:36     257  172.40, 97.26, 42.96
19:53:37     257  221.54, 124.95, 55.87
19:54:37     257  237.13, 147.05, 67.80

Original upstream commit message:
I could not find the loadavg code.. turns out it was hidden in a file
called proc.c. It further got mingled up with the cruft per rq load
indexes (which we really want to get rid of).

Move the per rq load indexes into the fair.c load-balance code (that's
the only thing that uses them) and rename proc.c to loadavg.c so we
can find it again.

Orabug: 26266279

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
[ Did minor cleanups to the code. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 3289bdb429884c0279bf9ab72dff7b934f19dfc6)

Conflicts:

kernel/sched/fair.c
kernel/sched/loadavg.c
kernel/sched/sched.h

Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Signed-off-by: Atish Patra <atish.patra@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Dhaval Giani <dhaval.giani@oracle.com>

sparc64: Treat ERESTARTSYS as an acceptable error (DAX driver)

get_user_pages fails if the current process calling it has a SIGKILL
posted on it. Since this is an acceptable failure catch this error and
print a debug message instead of an error message.

Orabug: 26393400

Signed-off-by: Sanath Kumar <sanath.s.kumar@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Jonathan Helman <jonathan.helman@oracle.com>

SPARC64: vcc: delay device removal until close()

    If a vcc device file is open while it's removed (due to a
    domain being unbound), delay the removal of the associated vcc
    device structure until the final close() call is made on
    the device. This preventsthe  device file cdev minor number from
    being reused which can result in ugly filesystem warnings to
    the console.

Orabug: 24594547

Signed-off-by: Aaron Young <aaron.young@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-By: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>

sparc64: fix vio handshake issue

When rebooting multiple LDoms together with bind/unbind a separate
LDom, kernel panic with vio handshake error. The panic is caused by
vio trying to allocate a buffer which is not freed properly. The
ldc_unbind should unconfigure and stop the ldc queue before freeing
the irq. If the irq is freed before stopping the queue, interrupts
can continue to happen after the irq is freed which may cause
issue.

Orabug: 26259622

Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>

sparc64: Use cpu_poke to resume idle cpu

Use cpu_poke hypervisor call to resume idle cpu if supported.

Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Orabug: 25575672
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>

sparc64: Add a new hypercall CPU_POKE

This adds a new hypercall CPU_POKE for quickly waking up an idle CPU.
CPU POKE should only be sent to valid non-local CPUs.

Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Orabug: 25575672
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>

sparc64: fix out of order spin_lock_irqsave and spin_unlock_restore

After enabling spinlocks debug option, kernel prints out call trace
suggesting that the function ldom_req_sp_token executes a might_sleep
function while IRQs is disabled. IRQs is disabled because the
spin_lock_irqsave and spin_unlock_irqrestore are out of order.

Normal correct sequence is:

spin_lock_irqsave(ds_dev_list_lock, data_flags)

        LOCK_DS_DEV()/* spin_lock_irqsave(ds_lock, ds_flags) */
        .
        .
        UNLOCK_DS_DEV()/* spin_unlock_irqrestore(ds_lock, ds_flags) */

spin_unlock_irqrestore(ds_dev_list_lock, data_flags)

Out or order sequence:

spin_lock_irqsave(ds_dev_list_lock, data_flags)

        LOCK_DS_DEV()/* spin_lock_irqsave(ds_lock, ds_flags) */

spin_unlock_irqrestore(ds_dev_list_lock, data_flags)

        UNLOCK_DS_DEV()/* spin_unlock_irqrestore(ds_lock, ds_flags) */

The last UNLOCK_DS_DEV() ends up restoring IRQs to disabled state,
because the previous LOCK_DS_DEV() is in irqs_disabled().
To fix the issue, follows the order of irqsave()/irqrestore().

Orabug: 26265190
Orabug: 25421812

Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
Reviewed-by: Aaron Young <aaron.young@oracle.com>

ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT

snd_timer_user_tselect() reallocates the queue buffer dynamically, but
it forgot to reset its indices.  Since the read may happen
concurrently with ioctl and snd_timer_user_tselect() allocates the
buffer via kmalloc(), this may lead to the leak of uninitialized
kernel-space data, as spotted via KMSAN:

  BUG: KMSAN: use of unitialized memory in snd_timer_user_read+0x6c4/0xa10
  CPU: 0 PID: 1037 Comm: probe Not tainted 4.11.0-rc5+ #2739
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
   __dump_stack lib/dump_stack.c:16
   dump_stack+0x143/0x1b0 lib/dump_stack.c:52
   kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007
   kmsan_check_memory+0xc2/0x140 mm/kmsan/kmsan.c:1086
   copy_to_user ./arch/x86/include/asm/uaccess.h:725
   snd_timer_user_read+0x6c4/0xa10 sound/core/timer.c:2004
   do_loop_readv_writev fs/read_write.c:716
   __do_readv_writev+0x94c/0x1380 fs/read_write.c:864
   do_readv_writev fs/read_write.c:894
   vfs_readv fs/read_write.c:908
   do_readv+0x52a/0x5d0 fs/read_write.c:934
   SYSC_readv+0xb6/0xd0 fs/read_write.c:1021
   SyS_readv+0x87/0xb0 fs/read_write.c:1018

This patch adds the missing reset of queue indices.  Together with the
previous fix for the ioctl/read race, we cover the whole problem.

Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit ba3021b2c79b2fa9114f92790a99deb27a65b728)

Orabug: 26267070
CVE-2017-1000380

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

ALSA: timer: Fix race between read and ioctl

The read from ALSA timer device, the function snd_timer_user_tread(),
may access to an uninitialized struct snd_timer_user fields when the
read is concurrently performed while the ioctl like
snd_timer_user_tselect() is invoked. We have already fixed the races
among ioctls via a mutex, but we seem to have forgotten the race
between read vs ioctl.

This patch simply applies (more exactly extends the already applied
range of) tu->ioctl_lock in snd_timer_user_tread() for closing the
race window.

Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit d11662f4f798b50d8c8743f433842c3e40fe3378)

Orabug: 26267070
CVE-2017-1000380

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

net/rds: Replace printk in TX path with stat variable

Printing in any form is not recommended in data path.
Fix it by replacing printk with a new statistics counter.

Orabug: 26402653

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Reviewed-by: Avinash Repaka <avinash.repaka@oracle.com>

Revert "mm: meminit: only set page reserved in the memblock region"

This reverts commit 35d7de6449327c64ed82c3e2b8c071a7664e0b19.

Orabug: 26446232
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

Revert "mm: meminit: move page initialization into a separate function"

This reverts commit aeb76e45d8809efc0c273a5aa6edf481256893c6.

Orabug: 26446232
Signed-off-by: Dhaval Giani <dhaval.giani@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

NVMe: Retain QUEUE_FLAG_SG_GAPS flag for bio vector alignment.

The nvme queue flag QUEUE_FLAG_SG_GAPS checks for the bio vector
alignment against the page size. In upstream, the QUEUE_FLAG_SG_GAPS
flag is replaced by blk_queue_virt_boundary() and pulling in the
respective patches caused instability in the driver and hence
QUEUE_FLAG_SG_GAPS flag is retained for vector alignment.

Orabug: 26402433

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: Don't post statistics to malicious VFs

Once firmware indicates that a given VF is malicious and until
that VF passes an FLR all bets are off - PF can't know anything
is happening to the VF [since VF can't communicate anything to its PF].
But PF is currently still periodically asking device to collect
statistics for the VF which might in turn fill logs by IOMMU blocking
memory access done by the VF's PCI function [in the case VF has unmapped
its buffers].

Orabug: 26440216

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3523882229b903e967de05665b871dab87c5df0f)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: Allow vfs to disable txvlan offload

VF clients are configured as enforced, meaning firmware is validating
the correctness of their ethertype/vid during transmission.
Once txvlan is disabled, VF would start getting SKBs for transmission
here vlan is on the payload - but it'll pass the packet's ethertype
instead of the vid, leading to firmware declaring it as malicious.

Orabug: 26440216

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 92f85f05caa51d844af6ea14ffbc7a786446a644)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: fix pf2vf bulletin DMA mapping leak

When freeing VF's DMA mappings, an already NULLed pointer was checked
again due to an apparent copy&paste error. Consequently, the pf2vf
bulletin DMA mapping was not freed.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 996652c7050c70008e4434af108be6f15f20fbd0)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: Fix Multi-Cos

Apparently multi-cos isn't working for bnx2x quite some time -
driver implements ndo_select_queue() to allow queue-selection
for FCoE, but the regular L2 flow would cause it to modulo the
fallback's result by the number of queues.
The fallback would return a queue matching the needed tc
[via __skb_tx_hash()], but since the modulo is by the number of TSS
queues where number of TCs is not accounted, transmission would always
be done by a queue configured into using TC0.

Orabug: 26440216

Fixes: ada7c19e6d27 ("bnx2x: use XPS if possible for bnx2x_select_queue instead of pure hash")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3968d38917eb9bd0cd391265f6c9c538d9b33ffa)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: add missing configuration of VF VLAN filters

Configuring VLANs from the VF side had no effect, because the PF ignored
filters of type VFPF_VLAN_FILTER in the VF-PF message.

Add the missing filter type to configure.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e39513259450a8312cb98a9a3b16bb924310dbcc)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: fix incorrect filter count in an error message

filters->count is the number of filters we were supposed to configure.
There is no reason to increase it by +1 when printing the count in an error
message.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 74bcbeb7d77ec92e4262fc340cb436ef7d98ba01)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: do not rollback VF MAC/VLAN filters we did not configure

On failure to configure a VF MAC/VLAN filter we should not attempt to
rollback filters that we failed to configure with -EEXIST.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 78d5505432436516456c12abbe705ec8dee7ee2b)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: fix detection of VLAN filtering feature for VF

VFs are currently missing the VLAN filtering feature, because we were
checking the PF's acquire response before actually performing the acquire.

Fix it by setting the feature flag later when we have the PF response.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 83bd9eb8fc69cdd5135ed6e1f066adc8841800fd)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: fix possible overrun of VFPF multicast addresses array

It is too late to check for the limit of the number of VF multicast
addresses after they have already been copied to the req->multicast[]
array, possibly overflowing it.

Do the check before copying.

Also fix the error path to not skip unlocking vf2pf_mutex.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 22118d861cec5da6ed525aaf12a3de9bfeffc58f)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: lower verbosity of VF stats debug messages

When BNX2X_MSG_IOV is enabled, the driver produces too many VF statistics
messages. Lower the verbosity of the VF stats messages similarly as in
commit 76ca70fabbdaa3 ("bnx2x: [Debug] change verbosity of some prints").

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 850268d320f0c7c5eb7ad0a62ef21859fa331ded)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

bnx2x: prevent crash when accessing PTP with interface down

It is possible to crash the kernel by accessing a PTP device while its
associated bnx2x interface is down. Before the interface is brought up,
the timecounter is not initialized, so accessing it results in NULL
dereference.

Fix it by checking if the interface is up.

Use -ENETDOWN as the error code when the interface is down.
-EFAULT in bnx2x_ptp_adjfreq() did not seem right.

Tested using phc_ctl get/set/adj/freq commands.

Orabug: 26440216

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 466e8bf10ac104d96e1ea813e8126e11cb72ea20)
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

Merge remote-tracking branch 'linux-uek/topic/uek-4.1/dtrace' into uek-4.1-next

Conflicts:
arch/sparc/kernel/ttable_64.S

dtrace: FBT module support and SPARCs return probes

This fix adds two features to FBT provider:
1) support for modules
2) support for return probes on SPARC

The module support of x86 was almost ready as it does not rely on trampolines and
uses hashtables of tracepoints. This works well if we know amount of probes in
advance so can reserve correct amount of memory during module load time. Unfortunately
that is not possible on SPARC and we need to allocate a trampoline dynamically.

Major part of this code is about removing all static assumptions about FBT from kernel
code and moving the responsibility to dtrace modules. Trampolines for SPARC are now
allocated dynamically (including kernel's pseudo module). This applies to SDT trampolines
too.

Second change adds scan for return probes on SPARC with small heuristics to quickly
skip over cases that are not interesting for DTrace. At the same time this patch
allocates new SPARC Trap for FBT.

Support for .init section is not available on any platform. The .init section is freed
after a module is fully loaded and it is not possible to remove its probes without
further chagnes in DTrace framework (modules). This is deffered for later work.

Orabug: 25960276
Orabug: 26384199

Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>

mm: fix use-after-free if memory allocation failed in vma_adjust()

There's one case when vma_adjust() expands the vma, overlapping with
*two* next vma.  See case 6 of mprotect, described in the comment to
vma_merge().

To handle this (and only this) situation we iterate twice over main part
of the function.  See "goto again".

Vegard reported[1] that he sees out-of-bounds access complain from
KASAN, if anon_vma_clone() on the *second* iteration fails.

This happens because we free 'next' vma by the end of first iteration
and don't have a way to undo this if anon_vma_clone() fails on the
second iteration.

The solution is to do all required allocations upfront, before we touch
vmas.

The allocation on the second iteration is only required if first two
vmas don't have anon_vma, but third does.  So we need, in total, one
anon_vma_clone() call.

It's easy to adjust 'exporter' to the third vma for such case.

[1] http://lkml.kernel.org/r/1469514843-23778-1-git-send-email-vegard.nossum@oracle.com

Link: http://lkml.kernel.org/r/1469625255-126641-1-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Orabug: 26441514

(cherry picked from commit 734537c9cb725fc8005ee7a25c48f1ad10fce5df)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

uek-rpm nano: Signature verification support in kexec_file_load

The following configuration options to support
signature verification in the kexec_file_load
syscall are enabled:
CONFIG_KEXEC_VERIFY_SIG=y
CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
CONFIG_PKCS7_MESSAGE_PARSER=y
CONFIG_SIGNED_PE_FILE_VERIFICATION=y

Orabug: 26386345
Signed-off-by: alexey.petrenko@oracle.com

lpfc update for uek4 11.4.0.2

Orabug: 26439257

Signed-off-by: rkennedy <dick.kennedy@avagotech.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Driver responds LS_RJT to Beacon Off

[backport of 6c29ac6512240659455456b36bfe7d6f3b839312]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Beacon OFF from switch is rejected by driver.

Driver fails Beacon OFF if frequency is set to 0. As per fc-ls spec,
status, capability, frequency and duration fields are only applicable
for Beacon ON.

Remove frequency and type checks. Reject Beacon ON if duration is non
zero.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Fix crash after firmware flash when

[backport of e9e6003bdcfec67c719c8e335fa5bff65c707297]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

OS crashes after the completion of firmware download.

Failure in posting SCSI SGL buffers because number of SGL buffers is
less than total count. Some of the pending IOs are not completed by
driver. SGL buffers for these IOs are not added back to the list.
Pending IOs are not completed because lpfc_wq_list list is initialized
before completion of pending IOs.

Postpone lpfc_wq_list reinitialization by moving
lpfc_sli4_queue_destroy() after lpfc_hba_down_post().

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Vport creation is failing with "Link

[backport of f0b2223a666e93c84fd42a0c5dad8a17c08aa869]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Vport creation fails for SLI-3 adapters.

Mailbox submission fails because mailbox interrupt is disabled. Mailbox
interrupt is disabled during port reset.

Do reset only for physical port.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Null pointer dereference when

[backport of 8ca7edf129441181280bc5c932e70d338d30f47f]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Kernel panic when log_verbose is set to 0xffffffff

phba->pport is dereferenced before it is initialized

Fix: Do not dereference phba->pport if it is NULL

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Fix return value of board_mode store

[backport of b18f2a8765fc072701452f710ac164d243ee4579]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

On hbacmd reset failure, observing wrong string "nline" in kernel log.

On failure, non negative value (1) is returned from sysfs store
routine. It is interpreted as count by kernel and store routine is
called again with the remaining characters as input.

Fix: Return negative error code (-EIO) in case of failure.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Fix Port going offline after

[backport of f4af07d4df21c5d02f66a0174b9669187a40bf1e]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Observing lpfc port down after issuing hbacmd reset command

Failure in posting SGL buffers. If there is only one SGL buffer and rrq
is valid for its XRI, we are rightly returning NULL but not adding the
buffer back to the SGL list. So, number of buffers become less than
total count and repost fails during reset.

Add SGL buffer back to list before returning NULL.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: fix spelling mistake "entrys"

[backport of 627efc5892d73cdfc0c9841a50714677855775a9]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Trivial fix to spelling mistake in debugfs message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Add MDS Diagnostic support.

[backport of e01b7cdb676e06310c4dba367ba153f33effc464]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Added code to support Cisco MDS loopback diagnostic. The diagnostics run
various loopbacks including one which loops-back frame through the
driver.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Fix used-RPI accounting problem.

[backport of a1d3e3605028f2b4b0ff811b91df73454d9b8792]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

With 255 vports created a link trasition can casue a crash.

When going through discovery after a link bounce the driver is using
rpis before the cmd FCOE_POST_HDR_TEMPLATES completes. By doing that the
next rpi bumps the rpi range out of the boundary.

The fix it to increment the next_rpi only when the
FCOE_POST_HDR_TEMPLATE succeeds.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Fix panic on BFS configuration

[backport of 596b05edb75f251f64373255f2a338ea7a9b5197]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

To select the appropriate shost template, the driver is issuing a
mailbox command to retrieve the wwn. Turns out the sending of the
command precedes the reset of the function. On SLI-4 adapters, this is
inconsequential as the mailbox command location is specified by dma via
the BMBX register. However, on SLI-3 adapters, the location of the
mailbox command submission area changes. When the function is first
powered on or reset, the cmd is submitted via PCI bar memory. Later the
driver changes the function config to use host memory and DMA. The
request to start a mailbox command is the same, a simple doorbell write,
regardless of submission area. So.. if there has not been a boot driver
run against the adapter, the mailbox command works as defaults are
ok. But, if the boot driver has configured the card and, and if no
platform pci function/slot reset occurs as the os starts, the mailbox
command will fail. The SLI-3 device will use the stale boot driver dma
location. This can cause PCI eeh errors.

Fix is to reset the sli-3 function before sending the mailbox command,
thus synchronizing the function/driver on mailbox location.

Note: The fix uses routines that are typically invoked later in the call
flow to reset the sli-3 device. The issue in using those routines is
that the normal (non-fix) flow does additional initialization, namely
the allocation of the pport structure. So, rather than significantly
reworking the initialization flow so that the pport is alloc'd first,
pointer checks are added to work around it. Checks are limited to the
routines invoked by a sli-3 adapter (s3 routines) as this fix/early call
is only invoked on a sli3 adapter. Nothing changes post the
fix. Subsequent initialization, and another adapter reset, still occur -
both on sli-3 and sli-4 adapters.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Fixes: 96418b5e2c88 ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.")
Cc: stable@vger.kernel.org # v4.11+
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Fix Express lane queue creation.

[backport of 24754c5e3651ef7521cd2f8d2f7ac2c38b7bb072]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

The older sli4 adapters only supported the 64 byte WQE entry size.
The new adapter (fw) support both 64 and 128 byte WQE entry sizies.
The Express lane WQ was not being created with the 128 byte WQE sizes
when it was supported.

Not having the right WQE size created for the express lane work queue
caused the the firmware to overwrite the lun indentifier in the FCP header.

This patch correctly creates the express lane work queue with the
supported size.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Fix driver usage of 128B WQEs when WQ_CREATE is

[backport of 8c434f8b07d99c4f0728ed144108839a367c72c0]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

There are two versions of a structure for queue creation and setup that the
driver shares with FW. The driver was only treating as version 0.

Verify WQ_CREATE with 128B WQEs in V0 and V1.

Code review of another bug showed the driver passing
128B WQEs and 8 pages in WQ CREATE and V0.
Code inspection/instrumentation showed that the driver
uses V0 in WQ_CREATE and if the caller passes queue->entry_size
128B, the driver sets the hdr_version to V1 so all is good.
When I tested the V1 WQ_CREATE, the mailbox failed causing
the driver to unload.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Add Fabric assigned WWN support.

[backport of a2ed2d89b451f7ca26b049ea1664f4397c5df87b]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Adding support for Fabric assigned WWPN and WWNN.

Firmware sends first FLOGI to fabric with vendor version changes.
On link up driver gets updated service parameter with FAWWN assigned port
name. Driver sends 2nd FLOGI with updated fawwpn and modifies the
vport->fc_portname in driver.

Note:
Soft wwpn will not be allowed when fawwpn is enabled.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Fix crash after issuing lip reset

[backport of 43f40add69672723389a32adab1d24715aaedff0]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

When RPI is not available, driver sends WQE with invalid RPI value and
rejected by HBA.
lpfc 0000:82:00.3: 1:3154 BLS ABORT RSP failed, data: x3/xa0320008
and
lpfc :2753 PLOGI failure DID:FFFFFA Status:x3/xa0240008

In this case, driver accesses rpi_ids array out of bounds.

Fix:
Check return value of lpfc_sli4_alloc_rpi(). Do not allocate
lpfc_nodelist entry if RPI is not available.

When RPI is not available, we will get discovery timeouts and
command drops for some of the vports as seen below.

lpfc :0273 Unexpected discovery timeout, vport State x0
lpfc :0230 Unexpected timeout, hba link state x5
lpfc :0111 Dropping received ELS cmd Data: x0 xc90c55 x0

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Remove NULL ptr check before kfree.

[backport of 600b57091effbdb680c067332c92de2a21c5d79d]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

The check for NULL ptr is not necessary, kfree will check it.

Removing NULL ptr check.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
This patch was applied manually.

Signed-off-by: Brian Maly <brian.maly@oracle.com>

lpfc: Fix spelling in comments.

[backport of 92c9d8da38785584acae3660112acadc3baf3051]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

Comment should have said Repost.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Fix PT2PT PRLI reject

[backport of 114e80db15039e248eb4e458559cef57737930a8]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

lpfc cannot establish connection with targets that send PRLI in P2P
configurations.

If lpfc rejects a PRLI that is sent from a target the target will not
resend and will reject the PRLI send from the initiator.

[mkp: applied by hand]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: correct rdp diag portnames

[backport of 91b7cd038317fa47e32f42096c1ff28de4337af0]
From: rkennedy <dick.kennedy@avagotech.com>

Orabug: 26439257

NVME merge reverted diag port names to the physical port.
They should be the vport.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Fix eh_deadline setting for sli3 adapters.

[backport of 96418b5e2c8867da3279d877f5d1ffabfe460c3d]
From: James Smart <jsmart2021@gmail.com>

Orabug: 26439257

A previous change unilaterally removed the hba reset entry point
from the sli3 host template. This was done to allow tape devices
being used for back up from being removed. Why was this done ?
When there was non-responding device on the fabric, the error
escalation policy would escalate to the reset handler. When the
reset handler was called, it would reset the adapter, dropping
link, thus logging out and terminating all i/o's - on any target.
If there was a tape device on the same adapter that wasn't in
error, it would kill the tape i/o's, effectively killing the
tape device state. With the reset point removed, the adapter
reset avoided the fabric logout, allowing the other devices to
continue to operate unaffected. A hack - yes. Hint: we really
need a transport I_T nexus reset callback added to the eh process
(in between the SCSI target reset and hba reset points), so a
fc logout could occur to the one bad target only and stop the error
escalation process.

This patch commonizes the approach so it can be used for sli3 and sli4
adapters, but mandates the admin, via module parameter, specifically
identify which adapters the resets are to be removed for. Additionally,
bus_reset, which sends Target Reset TMFs to all targets, is also removed
from the template as it too has the same effect as the adapter reset.

This patch had to be modified from the original because of the NVME changes that are in the upstream driver.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: Fix crash during Hardware error recovery on SLI3 adapters

Orabug: 26439257

if REG_VPI fails, the driver was incorrectly issuing INIT_VFI
(a SLI4 command) on a SLI3 adapter.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 5d181531bc6169e19a02a27d202cf0e982db9d0e)
Signed-off-by: Brian Maly <brian.maly@oracle.com>

scsi: lpfc: fix missing spin_unlock on sql_list_lock

Orabug: 26439257

In the case where sglq is null, the current code just returns without
unlocking the spinlock sql_list_lock. Fix this by breaking out of the
while loop and the exit path will then unlock and return NULL as was
the original intention.

Detected by CoverityScan, CID#1411635 ("Missing unlock")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Brian Maly <brian.maly@oracle.com>

drm/mgag200: Fix to always set HiPri for G200e4 V2

- Changed the HiPri value for G200e4 to always be 0.
- Added Bandwith limitation to block resolution above 1920x1200x60Hz

Signed-off-by: Mathieu Larouche <mathieu.larouche@matrox.com>
Acked-by: Dave Airlie <airlied@redhat.com>
[seanpaul removed some trailing whitespace from the patch]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/ec0f8568d7ec41904dfe593c5deccf3f062d7bd8.1497450944.git.mathieu.larouche@matrox.com
(cherry picked from commit 0cbb738108927916a659b5b0b96e386fcd7cc6e1)

Orabug: 26427049

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

char: lp: fix possible integer overflow in lp_setup()

The lp_setup() code doesn't apply any bounds checking when passing
"lp=none", and only in this case, resulting in an overflow of the
parport_nr[] array. All versions in Git history are affected.

Reported-By: Roee Hay <roee.hay@hcl.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3e21f4af170bebf47c187c1ff8bf155583c9f3b1)

Orabug: 26174869
CVE-2017-1000363

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>

NFSv4: Fix callback server shutdown

We want to use kthread_stop() in order to ensure the threads are
shut down before we tear down the nfs_callback_info in nfs_callback_down.

Tested-and-reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Reported-by: Kinglong Mee <kinglongmee@gmail.com>
Fixes: bb6aeba736ba9 ("NFSv4.x: Switch to using svc_set_num_threads()...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
(cherry picked from commit ed6473ddc704a2005b9900ca08e236ebb2d8540a)

Orabug: 26143102
CVE-2017-9059

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
fs/nfs/callback.c

SUNRPC: Refactor svc_set_num_threads()

Refactor to separate out the functions of starting and stopping threads
so that they can be used in other helpers.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Tested-and-reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
(cherry picked from commit 9e0d87680d689f1758185851c3da6eafb16e71e1)

Orabug: 26143102
CVE-2017-9059

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
net/sunrpc/svc.c

ipv6/dccp: do not inherit ipv6_mc_list from parent

Like commit 657831ffc38e ("dccp/tcp: do not inherit mc_list from parent")
we should clear ipv6_mc_list etc. for IPv6 sockets too.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 83eaddab4378db256d00d295bda6ca997cd13a52)

Orabug: 26142901
CVE-2017-9077

Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Jack Vogel <jack.vogel@oracle.com>
Conflicts:
net/dccp/ipv6.c

Merge remote-tracking branch 'linux-uek/topic/uek-4.1/dtrace' into qu6-push

btrfs: introduce device delete by devid

This introduces new ioctl BTRFS_IOC_RM_DEV_V2, which uses enhanced struct
btrfs_ioctl_vol_args_v2 to carry devid as an user argument.

The patch won't delete the old ioctl interface and so kernel remains
backward compatible with user land progs.

Test case/script:
echo "0 $(blockdev --getsz /dev/sdf) linear /dev/sdf 0" | dmsetup create bad_disk
mkfs.btrfs -f -d raid1 -m raid1 /dev/sdd /dev/sde /dev/mapper/bad_disk
mount /dev/sdd /btrfs
dmsetup suspend bad_disk
echo "0 $(blockdev --getsz /dev/sdf) error /dev/sdf 0" | dmsetup load bad_disk
dmsetup resume bad_disk
echo "bad disk failed. now deleting/replacing"
btrfs dev del 3 /btrfs
echo $?
btrfs fi show /btrfs
umount /btrfs
btrfs-show-super /dev/sdd | egrep num_device
dmsetup remove bad_disk
wipefs -a /dev/sdf

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reported-by: Martin <m_btrfs@ml1.co.uk>
[ adjust messages, s/disk/device/ ]
Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586

(Cherry picked from commit 6b526ed70cf189660d009ea6f17af77a9cca0f38)

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

btrfs: enhance btrfs_find_device_by_user_input() to check device path

The operation of device replace and device delete follows same steps upto
some depth with in btrfs kernel, however they don't share codes. This
enhancement will help replace and delete to share codes.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586

(Cherry picked from commit b3d1b1532ff9620ff5dba891a96f3e912005eb10)

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

btrfs: make use of btrfs_find_device_by_user_input()

btrfs_rm_device() has a section of the code which can be replaced
btrfs_find_device_by_user_input()

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586

(Cherry picked from commit 24fc572fe456c02ff4136c07861a3edd4b8de683)

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

btrfs: create helper btrfs_find_device_by_user_input()

The patch renames btrfs_dev_replace_find_srcdev() to
btrfs_find_device_by_user_input() and moves it to volumes.c, so that
delete device can use it.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586

(Cherry picked from commit 24e0474b59538cdb9d2b7318ec7c7ae9f6faf85d)

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

btrfs: clean up and optimize __check_raid_min_device()

__check_raid_min_device() which was pealed from btrfs_rm_device()
maintianed its original code to show the block move. This patch cleans up
__check_raid_min_device().

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586

(Cherry picked from commit bd45ffbcb1f082e3c5a0bd56b1d7310d8b707ffb,
drop the second argument of the btrfs_dev_replace_lock/unlock to
match with the current kernel api)

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

btrfs: create helper function __check_raid_min_devices()

move a section of btrfs_rm_device() code to check for min number of the
devices into the function __check_raid_min_devices()

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Orabug: 26287586

(Cherry picked from commit f1fa7f264250f2cb60119aca3fd114c3f47070c2,
drop the second argument of the btrfs_dev_replace_lock/unlock to
match with the current kernel api)

Signed-off-by: Shan Hai <shan.hai@oracle.com>
Reviewed-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>

dtrace: Add support for manual triggered cyclics

In some scenarios it is better if a client of cyclic susbstem can reprogram cyclic on
his own. This is not possible with current implementation.

This chage adds cyclic_reprogram() that can be used to schedule cyclic from inside and
outside of its handler. A manually triggered cyclic is distinguished from other types
by having its interval set to -1.

Orabug: 26384803

Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>

dtrace: LOW level cyclics should use workqueues

The HIGH level cyclics are meant to be run from interrupt handler. This works on
Linux because the hrtimer is scheduled as tasklet. The LOW level cyclics must be
interruptible and should not be scheduled as tasklests.

DTrace is currently relying on being able to call dtrace_sync() from within a cyclic
handler. On Linux it is not safe to try send IPIs from within interrupt/bottom half
handlers.

This fix changes LOW level cyclics to use workqueues. At the moment we are using
shared system workqueue but it may be required to allocate our owns if this causes
big latency in our timer routines.

Orabug: 26384779

Signed-off-by: Tomas Jedlicka <tomas.jedlicka@oracle.com>
Reviewed-by: Kris Van Hees <kris.van.hees@oracle.com>
Acked-by: Chuck Anderson <chuck.anderson@oracle.com>

SPARC64: Correct ATU IOTSB binding flow

Any PCIe device attempting to use ATU for 64-bit DMA must be successfully
bound to IOTSB otherwise DMA access from device will be rejected by
PCIe root complex.

In the current code, ATU initialization and PCIe device binding to IOTSB
is done during system boot when PCI Bus Module (PBM) is initialized.
(PBM driver traverse through the PCI tree and binds each PCI device to
IOTSB).

In case if user/system add SRIOV VFs later on (either after system is up
or when PF driver is loaded), the new VF devices never get bounded to
IOTSB. So when attempt is made from VF devices to access DMA region, ATU
HW will flag it as invalid access; generates CTE_Invalid errors and VF
driver/device fails to perform DMA.

This patch fixes the aforementioned issue by delaying PCIe device
binding to IOTSB only when driver requests DMA setting from kernel.
Doing it this way makes sure every PCIe device requesting 64bit DMA mask
(and therefore attempting to use ATU) will get bind to IOTSB and can do
successful DMA operations.

Orabug: 25177689
Orabug: 25450353

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: Govinda Tatti <Govinda.Tatti@Oracle.COM>
Reviewed-by: HÃ¥kon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

SPARC64: Introduce IOMMU BYPASS method

This change adds IOMMU BYPASS HV API that will allow PCIe devices to
bypass IOMMU during DMA map/unmap.

Orabug: 25573557

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

i40e: Revert i40e temporary workaround

ATU and Hypervisor IOMMU v2 APIs enable 64bit DMA addressing so
i40e temporary workaround no longer needed.

This patch revert uek4 commit f59e8082fb57c0009
("i40e: Temporary workaround for DMA map issue").

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Enable 64-bit DMA

ATU 64bit addressing allows PCIe devices with 64bit DMA capabilities
to use ATU for 64bit DMA.

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Enable sun4v dma ops to use IOMMU v2 APIs

Add Hypervisor IOMMU v2 APIs pci_iotsb_map(), pci_iotsb_demap() and
enable sun4v dma ops to use IOMMU v2 API for all PCIe devices with
64bit DMA mask.

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Bind PCIe devices to use IOMMU v2 service

In order to use Hypervisor (HV) IOMMU v2 API for map/demap, each PCIe
device has to be bound to IOTSB using HV API pci_iotsb_bind().

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Initialize iommu_map_table and iommu_pool

Like legacy IOMMU, use common iommu_map_table and iommu_pool for ATU.
This change initializes iommu_map_table and iommu_pool for ATU.

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Add ATU (new IOMMU) support

ATU (Address Translation Unit) is a new IOMMU in SPARC supported with
Hypervisor IOMMU v2 APIs.

Current SPARC IOMMU supports only 32bit address ranges and one TSB
per PCIe root complex that has a 2GB per root complex DVMA space
limit. The limit has become a scalability bottleneck nowadays that
a typical 10G/40G NIC can consume 300MB-500MB DVMA space per
instance. When DVMA resource is exhausted, devices will not be usable
since the driver can't allocate DVMA.

ATU removes bottleneck by allowing guest os to create IOTSB of size
32G (or more) with 64bit address ranges available in ATU HW. 32G is
more than enough DVMA space to be shared by all PCIe devices under
root complex contrast to 2G space provided by legacy IOMMU.

ATU allows PCIe devices to use 64bit DMA addressing. Devices
which choose to use 32bit DMA mask will continue to work with the
existing legacy IOMMU.

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Make FORCE_MAX_ZONEORDER to 13 for ATU

This change allows ATU (new IOMMU) in SPARC systems to request
large (32M) contiguous memory during boot for creating IOTSB backing
store.

Orabug: 23239179

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

Revert "sparc64: bypass iommu to use 64bit address space"

This reverts commit 34edb40e5e13cba76ad13646d3a1823d877851c6.

Conflicts:
arch/sparc/kernel/pci_sun4v.c

Orabug: 21149316

Reviewed-by: chris hyser <chris.hyser@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

i40e: fix annoying message

The driver was printing a message about not being able
to assign VMDq because of a lack of MSI-X vectors.

This was because a line was missing that initialized a variable,
simply a merge error.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Orabug: 26409501
(cherry picked from commit e9e53662d8130dd950885e37dc1d97008e1283f9)
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

watchdog: Move hardlockup detector to separate file

Separate hardlockup code from watchdog.c and move it to watchdog_hld.c.
It is mostly straight forward. Remove everything inside
CONFIG_HARDLOCKUP_DETECTORS. This code will go to file watchdog_hld.c.
Also update the makefile accordigly.

Orabug: 24796651

Reviewed-by: Karl Volz <karl.volz@oracle.com>
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

watchdog: Move shared definitions to nmi.h

Move shared macros and definitions to nmi.h so that watchdog.c,
new file watchdog_hld.c or any other architecture specific handler
can use those definitions.

Orabug: 24796651

Reviewed-by: Karl Volz <karl.volz@oracle.com>
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: Suppress kmalloc (DAX driver) warning due to allocation failure

dax_alloc is used by libdax to allocate 4MB chunks of physically
contiguous memory using kmalloc. It is normal for dax_alloc to fail
when kmalloc runs out of memory and this should not be treated as a warning.
This failure should be caught in libdax and attempted to allocate memory
from another source (Eg: mmap hugepage).

Orabug: 26224254

Signed-off-by: Sanath Kumar <sanath.s.kumar@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

i40evf: Use le32_to_cpu before evaluating HW desc fields.

i40e hardware descriptor fields are in little-endian format. Driver
must use le32_to_cpu while evaluating these fields otherwise on
big-endian arch we end up evaluating incorrect values, cause errors
like:

i40evf 0001:04:02.0: Expected response 0 from PF, received 285212672
i40evf 0001:04:02.1: Expected response 0 from PF, received 285212672

Orabug: 25577233

Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

sparc64: revert pause instruction patch for atomic backoff and cpu_relax()

This patch reverts the commit e9b9eb59ffcdee09ec96b040f85c919618f4043e
("sparc64: Use pause instruction when available").

This all started with our TPCC results on UEK4. During T7 testing, the TPCC results
were much lower compared to UEK2. The atomic calls like atomic_add and atomic_sub
were showing top on perf results. Karl found out that this was caused by
the upstream commit e9b9eb59ffcdee09ec96b040f85c919618f4043e
(sparc64: Use pause instruction when available). After reverting this commit on UEK4,
the TPCC numbers were back to UEK2 level. However, things changed after Atish's
scheduler fixes on UEK4. The TPCC numbers improved and the upstream commit
(sparc64: Use pause instruction when available) did not seem make any difference.
So, Karl's "revert pause instruction" patch was removed from UEK4.

Now again with T8 testing, we are seeing the same old behaviour. The atomic calls
like atomic_add and atomic_sub are showing top on perf results. After trying with
Karl's patch(revert pause instruction patch for atomic backoff) the TPCC numbers
improved(about %25 better than T7) and atomic calls are not showing on top in perf.

So, we are adding this patch back again. This is a temporary fix. Long term solution
is still in the discussion. The original patch is from Karl.
http://ca-git.us.oracle.com/?p=linux-uek-sparc.git;a=commit;h=f214eebf2223d23a2b1499be5b54719bdd7651e3

All the credit should go to Karl. Rebased it on latest sparc tree.

Orabug: 26306832

Signed-off-by: Karl Volz <karl.volz@Oracle.com>
Reviewed-by: Atish Patra <atish.patra@oracle.com>
Signed-off-by: Henry Willard <henry.willard@oracle.com>
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Reviewed-by: Karl Volz <karl.volz@Oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>

be2net: Update the driver version to 11.4.0.0

Orabug: 26403655

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Kirtikar Kashyap <kirtikar.kashyap@oracle.com>
Reviewed-by: Reviewed-by: Jack Vogel <jack.vogel@oracle.com>