www.infradead.org Git - users/jedix/linux-maple.git/log
users/jedix/linux-maple.git
8 years ago  Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Sun, 27 Nov 2016 00:48:32 +0000 (16:48 -0800)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/upstream-cherry-picks:
  Bluetooth: Fix potential NULL dereference in RFCOMM bind callback
  aacraid: Check size values after double-fetch from user
  mm: migrate dirty page without clear_page_dirty_for_io etc
  xen-netfront: cast grant table reference first to type int
  xen-netfront: do not cast grant table reference to signed short

8 years ago  Merge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Sun, 27 Nov 2016 00:47:09 +0000 (16:47 -0800)]
Merge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/stable-cherry-picks: (21 commits)
  ocfs2: fix not enough credit panic
  ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()
  ocfs2/dlm: fix race between convert and migration
  ocfs2: solve a problem of crossing the boundary in updating backups
  ocfs2: use spinlock_irqsave() to downconvert lock in ocfs2_osb_dump()
  ocfs2: access orphan dinode before delete entry in ocfs2_orphan_del
  ocfs2/dlm: do not insert a new mle when another process is already migrating
  ocfs2: fix slot overwritten if storage link down during mount
  ocfs2/dlm: return appropriate value when dlm_grab() returns NULL
  ocfs2/dlm: wait until DLM_LOCK_RES_SETREF_INPROG is cleared in dlm_deref_lockres_worker
  ocfs2/dlm: fix a race between purge and migration
  ocfs2/dlm: clear migration_pending when migration target goes down
  ocfs2: fix BUG when calculate new backup super
  ocfs2: should reclaim the inode if '__ocfs2_mknod_locked' returns an error
  ocfs2: fix race between mount and delete node/cluster
  ocfs2/dlm: unlock lockres spinlock before dlm_lockres_put
  ocfs2: avoid access invalid address when read o2dlm debug messages
  ocfs2: fix a tiny case that inode can not removed
  ocfs2: trusted xattr missing CAP_SYS_ADMIN check
  ocfs2: set filesytem read-only when ocfs2_delete_entry failed.
  ...

8 years ago  Bluetooth: Fix potential NULL dereference in RFCOMM bind callback
Jaganath Kanakkassery [Thu, 14 May 2015 07:28:08 +0000 (12:58 +0530)]
Bluetooth: Fix potential NULL dereference in RFCOMM bind callback

Orabug: 25058887
CVE: CVE-2015-8956

addr can be NULL and it should not be dereferenced before NULL checking.
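
The pattern described above can be sketched in userspace C. This is only an illustration of "check the pointer before reading through it", not the RFCOMM code itself; struct demo_addr, DEMO_FAMILY and demo_bind() are hypothetical names introduced for the example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket address; not a kernel structure. */
struct demo_addr {
    int family;
    int channel;
};

#define DEMO_FAMILY 31 /* hypothetical family id used only in this sketch */

/*
 * Validate the pointer first, and only then read its fields.
 * Reading addr->channel before the NULL check would dereference NULL.
 */
int demo_bind(const struct demo_addr *addr, int *channel_out)
{
    if (addr == NULL || addr->family != DEMO_FAMILY)
        return -EINVAL;

    *channel_out = addr->channel;
    return 0;
}
```

Calling demo_bind(NULL, &ch) returns -EINVAL without touching the address, because the left side of the || short-circuits before any field is read.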

Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
(cherry picked from commit 951b6a0717db97ce420547222647bcc40bf1eacd)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years ago  ocfs2: fix not enough credit panic
Junxiao Bi [Tue, 1 Nov 2016 06:42:20 +0000 (14:42 +0800)]
ocfs2: fix not enough credit panic

The following panic was caught when running the ocfs2 disconfig single
test (block size 512 and cluster size 8192). ocfs2_journal_dirty()
returned -ENOSPC, which means the credits were used up. The total credit
should include 3 times "num_dx_leaves" from ocfs2_dx_dir_rebalance(),
because 2 times are consumed in ocfs2_dx_dir_transfer_leaf() and
1 time is consumed in ocfs2_dx_dir_new_cluster()->
__ocfs2_dx_dir_new_cluster()->ocfs2_dx_dir_format_cluster(). But only
2 times are included in ocfs2_dx_dir_rebalance_credits(), so fix it.
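
The accounting above can be checked with a small userspace sketch. It models only the num_dx_leaves terms named in this message and ignores every other credit the real function reserves; the helper names are hypothetical, and the leaf count of 16 used in the note below is an illustrative assumption (8192 / 512).

```c
/* Credits consumed per rebalance, in journal credits, as described above. */
unsigned int demo_credits_consumed(unsigned int num_dx_leaves)
{
    unsigned int transfer_leaf = 2 * num_dx_leaves;  /* ocfs2_dx_dir_transfer_leaf() */
    unsigned int format_cluster = 1 * num_dx_leaves; /* ocfs2_dx_dir_format_cluster() */

    return transfer_leaf + format_cluster;
}

/* Credits reserved when the leaf term is multiplied by `multiplier`. */
unsigned int demo_credits_reserved(unsigned int num_dx_leaves,
                                   unsigned int multiplier)
{
    return multiplier * num_dx_leaves;
}

/* Returns 1 when the reservation is too small, 0 otherwise. */
int demo_credits_short(unsigned int num_dx_leaves, unsigned int multiplier)
{
    return demo_credits_consumed(num_dx_leaves) >
           demo_credits_reserved(num_dx_leaves, multiplier);
}
```

With 16 leaves, a multiplier of 2 reserves 32 credits against 48 consumed, so demo_credits_short() reports a shortfall; a multiplier of 3 covers it.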

[34377.331151] ------------[ cut here ]------------
[34377.332007] kernel BUG at fs/ocfs2/journal.c:775!
[34377.344107] invalid opcode: 0000 [#1] SMP
[34377.346090] Modules linked in: ocfs2 nfsd lockd grace nfs_acl auth_rpcgss sunrpc autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sd_mod sg ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ppdev xen_kbdfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea parport_pc parport acpi_cpufreq i2c_piix4 i2c_core pcspkr ext4 jbd2 mbcache xen_blkfront floppy pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod
[34377.346090] CPU: 2 PID: 10601 Comm: dd Not tainted 4.1.12-71.el6uek.bug24939243.x86_64 #2
[34377.346090] Hardware name: Xen HVM domU, BIOS 4.4.4OVM 02/11/2016
[34377.346090] task: ffff8800b6de6200 ti: ffff8800a7d48000 task.ti: ffff8800a7d48000
[34377.346090] RIP: 0010:[<ffffffffa06e7397>]  [<ffffffffa06e7397>] ocfs2_journal_dirty+0xa7/0xb0 [ocfs2]
[34377.346090] RSP: 0018:ffff8800a7d4b6d8  EFLAGS: 00010286
[34377.346090] RAX: 00000000ffffffe4 RBX: 00000000814d0a9c RCX: 00000000000004f9
[34377.346090] RDX: ffffffffa008e990 RSI: ffffffffa008f1ee RDI: ffff8800622b6460
[34377.346090] RBP: ffff8800a7d4b6f8 R08: ffffffffa008f288 R09: ffff8800622b6460
[34377.346090] R10: 0000000000000000 R11: 0000000000000282 R12: 0000000002c8421e
[34377.346090] R13: ffff88006d0cad00 R14: ffff880092beef60 R15: 0000000000000070
[34377.346090] FS:  00007f9b83e92700(0000) GS:ffff8800be880000(0000) knlGS:0000000000000000
[34377.346090] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[34377.346090] CR2: 00007fb2c0d1a000 CR3: 0000000008f80000 CR4: 00000000000406e0
[34377.346090] Stack:
[34377.346090]  00000000814d0a9c ffff88005fe61e00 ffff88006e995c00 ffff880009847c00
[34377.346090]  ffff8800a7d4b768 ffffffffa06c0999 ffff88009d3c2a10 ffff88005fe61e30
[34377.346090]  ffff8800997ce500 ffff8800997ce980 ffff880092beef60 000000100000000e
[34377.346090] Call Trace:
[34377.346090]  [<ffffffffa06c0999>] ocfs2_dx_dir_transfer_leaf+0x159/0x1a0 [ocfs2]
[34377.346090]  [<ffffffffa06c3eeb>] ocfs2_dx_dir_rebalance+0xd9b/0xea0 [ocfs2]
[34377.346090]  [<ffffffffa06dedb2>] ? ocfs2_inode_cache_io_unlock+0x12/0x20 [ocfs2]
[34377.346090]  [<ffffffffa0761180>] ? ocfs2_refcount_tree_et_ops+0x60/0xfffffffffffe4b31 [ocfs2]
[34377.346090]  [<ffffffffa06e7730>] ? ocfs2_journal_access_dl+0x20/0x20 [ocfs2]
[34377.346090]  [<ffffffffa06c6b63>] ocfs2_find_dir_space_dx+0xd3/0x300 [ocfs2]
[34377.346090]  [<ffffffffa06c9709>] ocfs2_prepare_dx_dir_for_insert+0x219/0x450 [ocfs2]
[34377.346090]  [<ffffffffa06c9b16>] ocfs2_prepare_dir_for_insert+0x1d6/0x580 [ocfs2]
[34377.346090]  [<ffffffffa06dee90>] ? ocfs2_read_inode_block+0x10/0x20 [ocfs2]
[34377.346090]  [<ffffffffa06f38e2>] ocfs2_mknod+0x5a2/0x1400 [ocfs2]
[34377.346090]  [<ffffffffa06f4933>] ocfs2_create+0x73/0x180 [ocfs2]
[34377.346090]  [<ffffffff81211de8>] vfs_create+0xd8/0x100
[34377.346090]  [<ffffffff8120f5fd>] ? lookup_real+0x1d/0x60
[34377.346090]  [<ffffffff81212535>] lookup_open+0x185/0x1c0
[34377.346090]  [<ffffffff8121571d>] do_last+0x36d/0x780
[34377.346090]  [<ffffffff812a85d6>] ? security_file_alloc+0x16/0x20
[34377.346090]  [<ffffffff81215bc2>] path_openat+0x92/0x470
[34377.346090]  [<ffffffff81215fea>] do_filp_open+0x4a/0xa0
[34377.346090]  [<ffffffff8132c570>] ? find_next_zero_bit+0x10/0x20
[34377.346090]  [<ffffffff812232ec>] ? __alloc_fd+0xac/0x150
[34377.346090]  [<ffffffff8120459a>] do_sys_open+0x11a/0x230
[34377.346090]  [<ffffffff810259d3>] ? syscall_trace_enter_phase1+0x153/0x180
[34377.346090]  [<ffffffff812046ee>] SyS_open+0x1e/0x20
[34377.346090]  [<ffffffff816cb6ee>] system_call_fastpath+0x12/0x71
[34377.346090] Code: 1d 3f 29 09 00 48 85 db 74 1f 48 8b 03 0f 1f 80 00 00 00 00 48 8b 7b 08 48 83 c3 10 4c 89 e6 ff d0 48 8b 03 48 85 c0 75 eb eb 90 <0f> 0b eb fe 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54
[34377.346090] RIP  [<ffffffffa06e7397>] ocfs2_journal_dirty+0xa7/0xb0 [ocfs2]
[34377.346090]  RSP <ffff8800a7d4b6d8>
[34377.615401] ---[ end trace 91ac5312a6ee1288 ]---
[34377.618919] Kernel panic - not syncing: Fatal exception
[34377.619910] Kernel Offset: disabled

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()
Eric Ren [Fri, 30 Sep 2016 22:11:32 +0000 (15:11 -0700)]
ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()

The testcase "mmaptruncate" of ocfs2-test deadlocks occasionally.

In this testcase, we create a 2*CLUSTER_SIZE file and mmap() it; two
processes then repeatedly perform the following operations: one does
memset(mmaped_addr + 2*CLUSTER_SIZE - 1, 'a', 1), while the other calls
ftruncate(fd, 2*CLUSTER_SIZE) and then ftruncate(fd, CLUSTER_SIZE), again
and again.

This is the backtrace when the deadlock happens:

   __wait_on_bit_lock+0x50/0xa0
   __lock_page+0xb7/0xc0
   ocfs2_write_begin_nolock+0x163f/0x1790 [ocfs2]
   ocfs2_page_mkwrite+0x1c7/0x2a0 [ocfs2]
   do_page_mkwrite+0x66/0xc0
   handle_mm_fault+0x685/0x1350
   __do_page_fault+0x1d8/0x4d0
   trace_do_page_fault+0x37/0xf0
   do_async_page_fault+0x19/0x70
   async_page_fault+0x28/0x30

In ocfs2_write_begin_nolock(), we first grab the pages and then allocate
disk space for this write; ocfs2_try_to_free_truncate_log() will be
called if -ENOSPC is returned; if we're lucky to get enough clusters,
which is usually the case, we start over again.

But in ocfs2_free_write_ctxt() the target page isn't unlocked, so we
will deadlock when trying to grab the target page again.

Also, -ENOMEM might be returned in ocfs2_grab_pages_for_write().
Another deadlock will happen in __do_page_mkwrite() if
ocfs2_page_mkwrite() returns without VM_FAULT_LOCKED set while the
target page is still locked.

These two errors fail on the same path, so fix them by unlocking the
target page manually before ocfs2_free_write_ctxt().

Jan Kara helped me sort out the JBD2 part and suggested the hint for the
root cause.

Changes since v1:
1. Also put ENOMEM error case into consideration.

Link: http://lkml.kernel.org/r/1474173902-32075-1-git-send-email-zren@suse.com
Signed-off-by: Eric Ren <zren@suse.com>
Reviewed-by: He Gang <ghe@suse.com>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit c33f0785bf292cf1d15f4fbe42869c63e205b21c)

Conflicts:

fs/ocfs2/aops.c

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2/dlm: fix race between convert and migration
Joseph Qi [Mon, 19 Sep 2016 21:43:55 +0000 (14:43 -0700)]
ocfs2/dlm: fix race between convert and migration

Commit ac7cf246dfdb ("ocfs2/dlm: fix race between convert and recovery")
checks whether the lockres master has changed to identify whether the new
master has finished recovery or not.  This introduces a race when a new
convert request arrives right after the old master does umount (which
means the master will change).

In this case, it will reset the lockres state to DLM_RECOVERING and then
retry the convert, which then fails with lockres->l_action being set to
OCFS2_AST_INVALID.  This causes an inconsistent lock level between
ocfs2 and dlm, and finally a BUG.

Since dlm recovery will clear lock->convert_pending in
dlm_move_lockres_to_recovery_list, we can use it to correctly identify
the race case between convert and recovery.  So fix it.

Fixes: ac7cf246dfdb ("ocfs2/dlm: fix race between convert and recovery")
Link: http://lkml.kernel.org/r/57CE1569.8010704@huawei.com
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit e6f0c6e6170fec175fe676495f29029aecdf486c)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2: solve a problem of crossing the boundary in updating backups
jiangyiwen [Fri, 25 Mar 2016 21:21:35 +0000 (14:21 -0700)]
ocfs2: solve a problem of crossing the boundary in updating backups

In update_backups() there is an out-of-bounds problem, as follows:

Assume that the LUN is resized to 1TB (cluster_size is 32KB), so it
contains clusters 0~33554431.  update_backups() will then back up the
super block at the 1TB offset, which is cluster 33554432, so it crosses
the boundary of the volume.
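
The numbers in this example can be reproduced with a short sketch. It only shows the arithmetic and a bounds check of the kind the description implies; the function names are hypothetical and this is not the update_backups() code.

```c
#include <stdint.h>

/* Number of clusters in a volume; valid indices are 0 .. count - 1. */
uint64_t demo_total_clusters(uint64_t volume_bytes, uint64_t cluster_bytes)
{
    return volume_bytes / cluster_bytes;
}

/* Index of the cluster that contains the given byte offset. */
uint64_t demo_cluster_of_offset(uint64_t offset_bytes, uint64_t cluster_bytes)
{
    return offset_bytes / cluster_bytes;
}

/* Returns 1 if a backup at offset_bytes lies inside the volume, else 0. */
int demo_backup_in_bounds(uint64_t offset_bytes, uint64_t volume_bytes,
                          uint64_t cluster_bytes)
{
    return demo_cluster_of_offset(offset_bytes, cluster_bytes) <
           demo_total_clusters(volume_bytes, cluster_bytes);
}
```

For a 1TB volume (2^40 bytes) with 32KB clusters (2^15 bytes) the count is 2^25 = 33554432, so the last valid index is 33554431 and a backup at the 1TB offset is out of bounds.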

Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Xue jiufei <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 584dca3440732afa84fbca07567bb66e1453936a)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2: use spinlock_irqsave() to downconvert lock in ocfs2_osb_dump()
jiangyiwen [Tue, 15 Mar 2016 21:53:01 +0000 (14:53 -0700)]
ocfs2: use spinlock_irqsave() to downconvert lock in ocfs2_osb_dump()

Commit a75e9ccabd92 ("ocfs2: use spinlock irqsave for downconvert lock")
missed one place in ocfs2_osb_dump(), so a deadlock scenario still
exists.

    ocfs2_wake_downconvert_thread
    ocfs2_rw_unlock
    ocfs2_dio_end_io
    dio_complete
    .....
    bio_endio
    req_bio_endio
    ....
    scsi_io_completion
    blk_done_softirq
    __do_softirq
    do_softirq
    irq_exit
    do_IRQ
    ocfs2_osb_dump
    cat /sys/kernel/debug/ocfs2/${uuid}/fs_state

This patch replaces spin_lock() with spin_lock_irqsave() there to resolve
this situation.

Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit bfd97a0320d338b2fce422adeddd512466ef2390)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2: access orphan dinode before delete entry in ocfs2_orphan_del
Joseph Qi [Thu, 14 Jan 2016 23:17:44 +0000 (15:17 -0800)]
ocfs2: access orphan dinode before delete entry in ocfs2_orphan_del

Currently, ocfs2_orphan_del finds and deletes the entry first, and
then accesses the orphan dir dinode.  This causes a problem if
ocfs2_journal_access_di fails.  In this case, the entry will be removed
from the orphan dir, but the inode has not actually been deleted.  In
other words, the file is missing but not actually deleted.  So we should
access the orphan dinode first, as unlink and rename do.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 074a6c655f6da12cb1123c8a84bfd8d781138800)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2/dlm: do not insert a new mle when another process is already migrating
xuejiufei [Thu, 14 Jan 2016 23:17:41 +0000 (15:17 -0800)]
ocfs2/dlm: do not insert a new mle when another process is already migrating

When two processes are migrating the same lockres,
dlm_add_migration_mle() returns -EEXIST but still inserts a new mle into
the hash list.  dlm_migrate_lockres() will detach the old mle and free the
new one, which is already in the hash list, and that corrupts the list.
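
The rule behind this fix, that an add which reports -EEXIST must leave the container untouched, can be sketched with a tiny fixed-size table. This is a userspace illustration only; struct demo_table and demo_add() are hypothetical and do not mirror the dlm mle hash.

```c
#include <errno.h>
#include <string.h>

#define DEMO_MAX_ENTRIES 8
#define DEMO_NAME_LEN 32

/* Hypothetical stand-in for a hash list of in-flight migrations. */
struct demo_table {
    char names[DEMO_MAX_ENTRIES][DEMO_NAME_LEN];
    int count;
};

/*
 * Add a name unless it is already present. On -EEXIST nothing is
 * inserted, so the caller may free its new entry without corrupting
 * the table.
 */
int demo_add(struct demo_table *t, const char *name)
{
    int i;

    if (strlen(name) >= DEMO_NAME_LEN)
        return -EINVAL;

    for (i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return -EEXIST; /* table left unchanged */

    if (t->count == DEMO_MAX_ENTRIES)
        return -ENOSPC;

    strcpy(t->names[t->count++], name);
    return 0;
}
```

Adding the same name twice returns 0 and then -EEXIST, and the entry count stays at 1 after the second call.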

Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 32e493265b2be96404aaa478fb2913be29b06887)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2: fix slot overwritten if storage link down during mount
jiangyiwen [Thu, 14 Jan 2016 23:17:33 +0000 (15:17 -0800)]
ocfs2: fix slot overwritten if storage link down during mount

The following case will lead to a slot being overwritten.

N1                               N2
mount ocfs2 volume, find and
allocate slot 0, then set
osb->slot_num to 0, begin to
write slot info to disk
                                 mount ocfs2 volume, wait for super lock
write block fail because of
storage link down, unlock
super lock
                                 got super lock and also allocate slot 0
                                 then unlock super lock

mount fail and then dismount,
since osb->slot_num is 0, try to
put invalid slot to disk. And it
will succeed if storage link
restores.
                                 N2 slot info is now overwritten

Once another node, say N3, mounts, it will find and allocate slot 0 again,
which will lead to a mount hang because the journal has already been
locked by N2.  So when writing slot info fails, invalidate the slot in
advance to avoid overwriting the slot.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 1247017f43a93eae3d64b7c25f3637dc545f5a47)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2/dlm: return appropriate value when dlm_grab() returns NULL
Xue jiufei [Thu, 14 Jan 2016 23:17:29 +0000 (15:17 -0800)]
ocfs2/dlm: return appropriate value when dlm_grab() returns NULL

dlm_grab() may return NULL when the node is unmounting.  During
code review, we found that some dlm handlers may return an error to the
caller when dlm_grab() returns NULL, causing the caller to BUG or hit
other problems.  Here is an example:

Node 1                                 Node 2
receives migration message
from node 3, and send
migrate request to others
                                     start unmounting

                                     receives migrate request
                                     from node 1 and call
                                     dlm_migrate_request_handler()

                                     unmount thread unregisters
                                     domain handlers and removes
                                     dlm_context from dlm_domains

                                     dlm_migrate_request_handler()
                                     returns -EINVAL to node 1
Exits migration without clearing the
migration state or sending the assert
master message to node 3, which
causes node 3 to hang.

Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit c372f2193a2e73d5936bf37259ae63ca388b4cbc)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2/dlm: wait until DLM_LOCK_RES_SETREF_INPROG is cleared in dlm_deref_lockres_worker
jiangyiwen [Thu, 14 Jan 2016 23:17:23 +0000 (15:17 -0800)]
ocfs2/dlm: wait until DLM_LOCK_RES_SETREF_INPROG is cleared in dlm_deref_lockres_worker

Commit f3f854648de6 ("ocfs2_dlm: Ensure correct ordering of set/clear
refmap bit on lockres") still leaves a race in which the ordering is
not guaranteed to be correct.

Node1               Node2                    Node3
umount, migrate
lockres to Node2
                    migrate finished,
                    send migrate request
                    to Node3
                                              received migrate request,
                                              create a migration_mle,
                                              respond to Node2.
                    set DLM_LOCK_RES_SETREF_INPROG
                    and send assert master to
                    Node3
                                              delete migration_mle in
                                              assert_master_handler,
                                              Node3 umount without response
                                              dlm_thread purge
                                              this lockres, send drop
                                              deref message to Node2
                    found the flag of
                    DLM_LOCK_RES_SETREF_INPROG
                    is set, dispatch
                    dlm_deref_lockres_worker to
                    clear refmap, but in function of
                    dlm_deref_lockres_worker,
                    only if node in refmap it wait
                    DLM_LOCK_RES_SETREF_INPROG
                    to be cleared. So worker is
                    done successfully

                                              purge lockres, send
                                              assert master response
                                              to Node1, and finish umount
                    set Node3 in refmap, and it
                    won't be cleared forever, thus
                    lead to umount hung

So wait until DLM_LOCK_RES_SETREF_INPROG is cleared in
dlm_deref_lockres_worker.

Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b5560143385e18b4109ad6951c7719705e3dd995)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2/dlm: fix a race between purge and migration
Xue jiufei [Thu, 14 Jan 2016 23:17:18 +0000 (15:17 -0800)]
ocfs2/dlm: fix a race between purge and migration

We found a race between purge and migration during code review.
Node A puts the lockres on the purge list before receiving the migrate
message from node B, which is the master.  Node A calls
dlm_mig_lockres_handler to handle this message.

dlm_mig_lockres_handler
  dlm_lookup_lockres
  >>>>>> race window, dlm_run_purge_list may run and send
         deref message to master, waiting the response
  spin_lock(&res->spinlock);
  res->state |= DLM_LOCK_RES_MIGRATING;
  spin_unlock(&res->spinlock);
  dlm_mig_lockres_handler returns

  >>>>>> dlm_thread receives the response from master for the deref
  message and triggers the BUG because the lockres has the state
  DLM_LOCK_RES_MIGRATING with the following message:

dlm_purge_lockres:209 ERROR: 6633EB681FA7474A9C280A4E1A836F0F: res
M0000000000000000030c0300000000 in use after deref

Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 30bee898f86506893883ffb8db20d8101a29b5f5)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2/dlm: clear migration_pending when migration target goes down
xuejiufei [Tue, 29 Dec 2015 22:54:29 +0000 (14:54 -0800)]
ocfs2/dlm: clear migration_pending when migration target goes down

We have found a BUG on res->migration_pending when migrating lock
resources.  The situation is as follows.

dlm_mark_lockres_migrating
  res->migration_pending = 1;
  __dlm_lockres_reserve_ast
  dlm_lockres_release_ast returns with res->migration_pending remains
      because other threads reserve asts
  wait dlm_migration_can_proceed returns 1
  >>>>>>> o2hb found that target goes down and remove target
          from domain_map
  dlm_migration_can_proceed returns 1
  dlm_mark_lockres_migrating returns -ESHUTDOWN with
      res->migration_pending still set.

When reentering dlm_mark_lockres_migrating(), it will trigger the BUG_ON
on res->migration_pending.  So clear migration_pending when the target is
down.

Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit cc28d6d80f6ab494b10f0e2ec949eacd610f66e3)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2: fix BUG when calculate new backup super
Joseph Qi [Tue, 29 Dec 2015 22:54:06 +0000 (14:54 -0800)]
ocfs2: fix BUG when calculate new backup super

When resizing, it first extends the last gd.  If the backup super should
be placed in that gd, it calculates the new backup super and updates the
corresponding value.

But it currently doesn't consider the situation where the backup super
has already been done.  In this case, it still sets the bit in the gd
bitmap and then decreases bg_free_bits_count, which leads to a corrupted
gd and triggers the BUG in ocfs2_block_group_set_bits:

    BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits);

So check whether the backup super is done and then do the updates.
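
A minimal userspace sketch of "check first, then update" follows. It uses a hypothetical 64-bit group (struct demo_group) rather than the real group descriptor, and only shows why skipping an already-set bit keeps the free count consistent.

```c
#include <stdint.h>

#define DEMO_GROUP_BITS 64

/* Hypothetical stand-in for a group descriptor with a bitmap and a counter. */
struct demo_group {
    uint8_t bitmap[DEMO_GROUP_BITS / 8];
    uint16_t free_bits;
};

static int demo_test_bit(const struct demo_group *g, unsigned int bit)
{
    return (g->bitmap[bit / 8] >> (bit % 8)) & 1;
}

/*
 * Reserve the bit for a backup super block.
 * Returns 1 if the bit was newly set, 0 if it was already set (nothing
 * is changed in that case), and -1 if the bit is out of range.
 */
int demo_reserve_backup(struct demo_group *g, unsigned int bit)
{
    if (bit >= DEMO_GROUP_BITS)
        return -1;
    if (demo_test_bit(g, bit))
        return 0; /* backup already done: do not touch free_bits */

    g->bitmap[bit / 8] |= (uint8_t)(1u << (bit % 8));
    g->free_bits--;
    return 1;
}
```

Reserving the same bit twice decrements free_bits only once, so the counter cannot drift below the number of clear bits.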

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 5c9ee4cbf2a945271f25b89b137f2c03bbc3be33)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2: should reclaim the inode if '__ocfs2_mknod_locked' returns an error
alex chen [Fri, 6 Nov 2015 02:44:10 +0000 (18:44 -0800)]
ocfs2: should reclaim the inode if '__ocfs2_mknod_locked' returns an error

In ocfs2_mknod_locked, if '__ocfs2_mknod_locked' returns an error, we
should reclaim the inode successfully claimed above; otherwise, the
inode will never be reused. The case is described below:

ocfs2_mknod
    ocfs2_mknod_locked
        ocfs2_claim_new_inode
                Successfully claim the inode
        __ocfs2_mknod_locked
            ocfs2_journal_access_di
            Failed because of -ENOMEM or other reasons, the inode
                        lockres has not been initialized yet.

    iput(inode)
        ocfs2_evict_inode
            ocfs2_delete_inode
                ocfs2_inode_lock
                    ocfs2_inode_lock_full_nested
                        __ocfs2_cluster_lock
                                Return -EINVAL because of the inode
                                lockres has not been initialized.

                So the following operations are not performed
                ocfs2_wipe_inode
                        ocfs2_remove_inode
                                ocfs2_free_dinode
                                        ocfs2_free_suballoc_bits

Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b1529a41f777a48f95d4af29668b70ffe3360e1b)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2: fix race between mount and delete node/cluster
Joseph Qi [Fri, 6 Nov 2015 02:44:07 +0000 (18:44 -0800)]
ocfs2: fix race between mount and delete node/cluster

There is a race between mount and delete node/cluster, which will
cause o2hb_thread to malfunction in a dead loop.

    o2hb_thread
    {
        o2nm_depend_this_node();
        <<<<<< race window, node may have already been deleted, and then
               enter the loop, o2hb thread will be malfunctioning
               because of no configured nodes found.
        while (!kthread_should_stop() &&
               !reg->hr_unclean_stop && !reg->hr_aborted_start) {
    }

So checking the return value of o2nm_depend_this_node() is needed.  If the
node has been deleted, do not enter the loop and let the mount fail.
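
The control flow described above can be sketched as follows. This is a userspace model only: demo_depend_this_node() and its -ENOENT result are hypothetical stand-ins, and the loop is bounded so the example terminates.

```c
#include <errno.h>

/* Hypothetical stand-in: fails when the local node is no longer configured. */
int demo_depend_this_node(int node_configured)
{
    return node_configured ? 0 : -ENOENT;
}

/*
 * Returns a negative error if the dependency check fails (the loop is
 * never entered), otherwise the number of heartbeat iterations run.
 */
int demo_heartbeat_thread(int node_configured, int max_iterations)
{
    int beats = 0;
    int ret = demo_depend_this_node(node_configured);

    if (ret)
        return ret; /* node already deleted: let the caller fail the mount */

    while (beats < max_iterations)
        beats++;

    return beats;
}
```

With the node missing, demo_heartbeat_thread() returns the error immediately instead of spinning in a loop that can never find a configured node.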

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 0986fe9b50f425ec81f25a1a85aaf3574b31d801)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2/dlm: unlock lockres spinlock before dlm_lockres_put
Joseph Qi [Thu, 22 Oct 2015 20:32:29 +0000 (13:32 -0700)]
ocfs2/dlm: unlock lockres spinlock before dlm_lockres_put

dlm_lockres_put will call dlm_lockres_release if it is the last
reference, and then it may call dlm_print_one_lock_resource and
take lockres spinlock.

So unlock lockres spinlock before dlm_lockres_put to avoid deadlock.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b67de018b37a97548645a879c627d4188518e907)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2: avoid access invalid address when read o2dlm debug messages
Yiwen Jiang [Fri, 4 Sep 2015 22:44:37 +0000 (15:44 -0700)]
ocfs2: avoid access invalid address when read o2dlm debug messages

The following case will lead to a lockres being freed while it is still in use.

cat /sys/kernel/debug/o2dlm/locking_state       dlm_thread
lockres_seq_start
    -> lock dlm->track_lock
    -> get resA
                                                resA->refs decrease to 0,
                                                call dlm_lockres_release,
                                                and wait for "cat" unlock.
Although resA->refs is already set to 0,
increase resA->refs, and then unlock
                                                lock dlm->track_lock
                                                    -> list_del_init()
                                                    -> unlock
                                                    -> free resA

In such a race, an invalid address access may occur.  So we should
delete the res->tracking list entry before resA->refs decreases to 0.

Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit f57a22ddecd6f26040a67e2c12880f98f88b6e00)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years ago  ocfs2: fix a tiny case that inode can not removed
Yiwen Jiang [Fri, 4 Sep 2015 22:44:25 +0000 (15:44 -0700)]
ocfs2: fix a tiny case that inode can not removed

When running dirop_fileop_racer we found a case where an inode
cannot be removed.

Two nodes, say Node A and Node B, mount the same ocfs2 volume.  Create
two dirs /race/1/ and /race/2/ in the filesystem.

  Node A                            Node B
  rm -r /race/2/
                                    mv /race/1/ /race/2/
  call ocfs2_unlink(), get
  the EX mode of /race/2/
                                    wait for A to unlock /race/2/
  decrease i_nlink of /race/2/ to 0,
  and add inode of /race/2/ into
  orphan dir, unlock /race/2/
                                    got EX mode of /race/2/. because
                                    /race/1/ is dir, so inc i_nlink
                                    of /race/2/ and update into disk,
                                    unlock /race/2/
  because i_nlink of /race/2/
  is not zero, this inode will
  always remain in orphan dir

This patch fixes this case by testing whether i_nlink of the new dir is zero.
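
A minimal user-space sketch of the check described above. The names
toy_inode and toy_rename_into_dir() are hypothetical, and the -ENOENT
return value is an assumption for illustration; this is not the ocfs2 code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for an inode; only the link count matters here. */
struct toy_inode {
    unsigned int i_nlink;
};

/*
 * Move a directory under new_dir. If new_dir was already unlinked by
 * another node (i_nlink == 0), refuse instead of raising its link count,
 * which would leave it stranded in the orphan directory.
 * The error code is an assumption made for this sketch.
 */
static int toy_rename_into_dir(struct toy_inode *new_dir)
{
    if (new_dir->i_nlink == 0)
        return -ENOENT;
    new_dir->i_nlink++;   /* the moved directory's ".." adds a link */
    return 0;
}
```

With this guard, Node B's rename fails cleanly once Node A has dropped the
link count to zero, so the orphaned inode can still be reclaimed.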

Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Xue jiufei <xuejiufei@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 928dda1f9433f024ac48c3d97ae683bf83dd0e42)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years agoocfs2: trusted xattr missing CAP_SYS_ADMIN check
Sanidhya Kashyap [Fri, 4 Sep 2015 22:44:08 +0000 (15:44 -0700)]
ocfs2: trusted xattr missing CAP_SYS_ADMIN check

The trusted extended attributes are only visible to processes that
have the CAP_SYS_ADMIN capability, but the check is missing in the ocfs2
xattr_handler trusted list.  The check is important because these
attributes are used to implement userspace mechanisms that ordinary
processes should not have access to.
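
A rough user-space sketch of a list handler that hides trusted attributes
from unprivileged callers. toy_trusted_list() and its has_cap_sys_admin
parameter are hypothetical; the flag stands in for the kernel's
capable(CAP_SYS_ADMIN) test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical list handler: copies "trusted.<name>" into buf and returns
 * the number of bytes needed (including the NUL), or 0 when the caller
 * lacks the privilege. has_cap_sys_admin is an assumption standing in for
 * the kernel's capable(CAP_SYS_ADMIN).
 */
static size_t toy_trusted_list(bool has_cap_sys_admin, const char *name,
                               char *buf, size_t buf_size)
{
    static const char prefix[] = "trusted.";
    size_t prefix_len = sizeof(prefix) - 1;
    size_t name_len = strlen(name);
    size_t total = prefix_len + name_len + 1;

    if (!has_cap_sys_admin)
        return 0;               /* hide the attribute entirely */

    if (buf && total <= buf_size) {
        memcpy(buf, prefix, prefix_len);
        memcpy(buf + prefix_len, name, name_len + 1);
    }
    return total;
}
```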

Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Taesoo kim <taesoo@gatech.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 0f5e7b41f91814447defc34e915fc5d6e52266d9)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years agoocfs2: set filesytem read-only when ocfs2_delete_entry failed.
jiangyiwen [Fri, 4 Sep 2015 22:44:06 +0000 (15:44 -0700)]
ocfs2: set filesytem read-only when ocfs2_delete_entry failed.

In ocfs2_rename, an inode ends up with two entries (old and new) if
ocfs2_delete_entry(old) fails.  The filesystem is then inconsistent.

The case is described below:

ocfs2_rename
    -> ocfs2_start_trans
    -> ocfs2_add_entry(new)
    -> ocfs2_delete_entry(old)
        -> __ocfs2_journal_access *failed* because of -ENOMEM
    -> ocfs2_commit_trans

So the filesystem should be set to read-only at that point.
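
The failure path can be sketched in user space as follows. All names
(toy_fs, toy_delete_entry, toy_rename) are hypothetical, and the
fail_delete flag simply models the -ENOMEM case in the call chain above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical filesystem state for the sketch. */
struct toy_fs {
    bool read_only;
    bool has_old_entry;
    bool has_new_entry;
};

/* Simulated delete step; fail_delete models the -ENOMEM case. */
static int toy_delete_entry(struct toy_fs *fs, bool fail_delete)
{
    if (fail_delete)
        return -ENOMEM;
    fs->has_old_entry = false;
    return 0;
}

static int toy_rename(struct toy_fs *fs, bool fail_delete)
{
    int status;

    fs->has_new_entry = true;   /* adding the new entry succeeded */
    status = toy_delete_entry(fs, fail_delete);
    if (status < 0) {
        /* Both entries now reference the inode: stop further writes. */
        fs->read_only = true;
    }
    return status;
}
```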

Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 807a7907114c7c703017ed7a96477a2eeb0d08e0)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years agoocfs2: fix NULL pointer dereference in function ocfs2_abort_trigger()
Xue jiufei [Wed, 24 Jun 2015 23:55:20 +0000 (16:55 -0700)]
ocfs2: fix NULL pointer dereference in function ocfs2_abort_trigger()

ocfs2_abort_trigger() uses bh->b_assoc_map to get sb.  But no function
in ocfs2 sets bh->b_assoc_map, so calling this function triggers a NULL
pointer dereference.  We can get sb from
bh->b_bdev->bd_super instead of b_assoc_map.

[akpm@linux-foundation.org: update comment, per Joseph]
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 74e364ad1b13fd518a0bd4e5aec56d5e8706152f)

Orabug: 24939243

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
8 years agoaacraid: Check size values after double-fetch from user
Dave Carroll [Fri, 5 Aug 2016 19:44:10 +0000 (13:44 -0600)]
aacraid: Check size values after double-fetch from user

Orabug: 25060030

In aacraid's ioctl_send_fib() we do two fetches from userspace, one to
get the fib header's size and one for the fib itself. Later we use the
size field from the second fetch to further process the fib. If for some
reason the size from the second fetch is different than from the first
fetch, we may encounter an out-of-bounds access in aac_fib_send(). We
also check the sender size to ensure it is not out of bounds. This was
reported in https://bugzilla.kernel.org/show_bug.cgi?id=116751 and was
assigned CVE-2016-6480.
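
A small user-space model of the double-fetch pattern and the re-check that
closes it. The toy_fib layout, TOY_MAX_FIB and toy_send_fib() are
hypothetical; the racy_size parameter only simulates another thread
rewriting the header between the two reads.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_FIB 64u

/* Hypothetical FIB layout: a header carrying the size, then payload. */
struct toy_fib {
    uint16_t size;                 /* total size claimed by the sender */
    uint8_t  data[TOY_MAX_FIB - sizeof(uint16_t)];
};

/*
 * user points at memory the caller can change between our two reads.
 * racy_size, when nonzero, is written into that memory between the
 * fetches to model a concurrent modification.
 */
static int toy_send_fib(struct toy_fib *user, uint16_t racy_size)
{
    struct toy_fib kfib;
    uint16_t size;

    memcpy(&size, &user->size, sizeof(size));      /* first fetch */
    if (size < sizeof(uint16_t) || size > sizeof(kfib))
        return -EINVAL;

    if (racy_size)
        user->size = racy_size;                    /* simulated race */

    memset(&kfib, 0, sizeof(kfib));
    memcpy(&kfib, user, size);                     /* second fetch */

    /* Re-check: the copied header must agree with the validated size. */
    if (kfib.size != size)
        return -EINVAL;

    return (int)kfib.size;
}
```

Without the final comparison, later code would trust kfib.size even though
only the first-fetch value was ever validated.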

Reported-by: Pengfei Wang <wpengfeinudt@gmail.com>
Fixes: 7c00ffa31 '[SCSI] 2.6 aacraid: Variable FIB size (updated patch)'
Cc: stable@vger.kernel.org
Signed-off-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit fa00c437eef8dc2e7b25f8cd868cfa405fcc2bb3)
Signed-off-by: Dan Duval <dan.duval@oracle.com>
8 years agomm: migrate dirty page without clear_page_dirty_for_io etc
Hugh Dickins [Fri, 6 Nov 2015 02:50:05 +0000 (18:50 -0800)]
mm: migrate dirty page without clear_page_dirty_for_io etc

Orabug: 25059177
CVE: CVE-2016-3070

clear_page_dirty_for_io() has accumulated writeback and memcg subtleties
since v2.6.16 first introduced page migration; and the set_page_dirty()
which completed its migration of PageDirty, later had to be moderated to
__set_page_dirty_nobuffers(); then PageSwapBacked had to skip that too.

No actual problems seen with this procedure recently, but if you look into
what the clear_page_dirty_for_io(page)+set_page_dirty(newpage) is actually
achieving, it turns out to be nothing more than moving the PageDirty flag,
and its NR_FILE_DIRTY stat from one zone to another.

It would be good to avoid a pile of irrelevant decrementations and
incrementations, and improper event counting, and unnecessary descent of
the radix_tree under tree_lock (to set the PAGECACHE_TAG_DIRTY which
radix_tree_replace_slot() left in place anyway).

Do the NR_FILE_DIRTY movement, like the other stats movements, while
interrupts still disabled in migrate_page_move_mapping(); and don't even
bother if the zone is the same.  Do the PageDirty movement there under
tree_lock too, where old page is frozen and newpage not yet visible:
bearing in mind that as soon as newpage becomes visible in radix_tree, an
un-page-locked set_page_dirty() might interfere (or perhaps that's just
not possible: anything doing so should already hold an additional
reference to the old page, preventing its migration; but play safe).

But we do still need to transfer PageDirty in migrate_page_copy(), for
those who don't go the mapping route through migrate_page_move_mapping().

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 42cb14b110a5698ccf26ce59c4441722605a3743)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
mm/migrate.c

Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agoxen-netfront: cast grant table reference first to type int
Dongli Zhang [Thu, 17 Nov 2016 05:55:27 +0000 (13:55 +0800)]
xen-netfront: cast grant table reference first to type int

IS_ERR_VALUE() in commit 87557efc27f6a50140fb20df06a917f368ce3c66
("xen-netfront: do not cast grant table reference to signed short") would
not return true for error code unless we cast ref first to type int.
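
Why the intermediate cast matters can be shown with a user-space model.
toy_is_err_value() is only loosely modeled on the kernel macro, and
uint64_t is assumed to stand in for a 64-bit unsigned long.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Loose model of IS_ERR_VALUE(): true for the top 4095 values. */
static bool toy_is_err_value(uint64_t x)
{
    return x >= (uint64_t)-TOY_MAX_ERRNO;
}

/* Widening the unsigned 32-bit ref directly zero-extends it. */
static bool toy_check_without_cast(uint32_t ref)
{
    return toy_is_err_value((uint64_t)ref);
}

/*
 * Casting to a signed 32-bit value first sign-extends a negative errno.
 * (The conversion to int32_t wraps on the common compilers assumed here.)
 */
static bool toy_check_with_cast(uint32_t ref)
{
    return toy_is_err_value((uint64_t)(int32_t)ref);
}
```

A negative errno stored in the unsigned ref is only recognized when the
value is sign-extended before widening.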

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oracle-Bug: 25138361
upstream commit: 269ebce4531b8edc4224259a02143181a1c1d77c
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed by: Jack F. Vogel <jack.vogel@oracle.com>
Acked-by: Joe Jin <joe.jin@oracle.com>
8 years agoxen-netfront: do not cast grant table reference to signed short
Dongli Zhang [Thu, 17 Nov 2016 05:54:19 +0000 (13:54 +0800)]
xen-netfront: do not cast grant table reference to signed short

While grant reference is of type uint32_t, xen-netfront erroneously casts
it to signed short in BUG_ON().

This would lead to the xen domU panic during boot-up or migration when it
is attached with lots of paravirtual devices.
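
A tiny illustration of the faulty cast, using a hypothetical helper. With
many devices attached, a valid reference can exceed 32767 and then looks
negative once truncated to 16 bits (the truncation wraps on the common
compilers assumed here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The faulty check: a large but valid ref turns negative as a short. */
static bool toy_ref_looks_invalid_short(uint32_t ref)
{
    return (int16_t)ref < 0;
}
```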

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oracle-Bug: 25138362
upstream commit: 87557efc27f6a50140fb20df06a917f368ce3c66
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed by: Jack F. Vogel <jack.vogel@oracle.com>
Acked-by: Joe Jin <joe.jin@oracle.com>
8 years agoMerge branch topic/uek-4.1/secureboot of git://ca-git.us.oracle.com/linux-uek into...
Chuck Anderson [Wed, 16 Nov 2016 20:38:22 +0000 (12:38 -0800)]
Merge branch topic/uek-4.1/secureboot of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/secureboot:
  acpi: Disable ACPI table override if securelevel is set

8 years agoacpi: Disable ACPI table override if securelevel is set
Linn Crosetto [Wed, 16 Nov 2016 20:33:52 +0000 (12:33 -0800)]
acpi: Disable ACPI table override if securelevel is set

From the kernel documentation (initrd_table_override.txt):

  If the ACPI_INITRD_TABLE_OVERRIDE compile option is true, it is possible
  to override nearly any ACPI table provided by the BIOS with an
  instrumented, modified one.

When securelevel is set, the kernel should disallow any unauthenticated
changes to kernel space. ACPI tables contain code invoked by the kernel, so
do not allow ACPI tables to be overridden if securelevel is set.

Signed-off-by: Linn Crosetto <linn@hpe.com>
Orabug: 25058372
CVE: CVE-2016-3699
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Reviewed-by: Guru Anbalagane <guru.anbalagane@oracle.com>
8 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Thu, 10 Nov 2016 14:27:04 +0000 (06:27 -0800)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/upstream-cherry-picks:
  ecryptfs: don't allow mmap when the lower fs doesn't support it
  Revert "ecryptfs: forbid opening files without mmap handler"

8 years agoecryptfs: don't allow mmap when the lower fs doesn't support it
Jeff Mahoney [Tue, 5 Jul 2016 21:32:30 +0000 (17:32 -0400)]
ecryptfs: don't allow mmap when the lower fs doesn't support it

There are legitimate reasons to disallow mmap on certain files, notably
in sysfs or procfs.  We shouldn't emulate mmap support on file systems
that don't offer support natively.
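
A hedged user-space sketch of the idea: a stacked filesystem forwards mmap
only when the lower file provides a handler. The toy_* types are
hypothetical, and the -ENODEV return value is an assumption for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_file;

/* Hypothetical operations table with an optional mmap handler. */
struct toy_file_ops {
    int (*mmap)(struct toy_file *f);
};

struct toy_file {
    const struct toy_file_ops *f_op;
};

/* A lower filesystem that does support mmap. */
static int toy_lower_mmap(struct toy_file *f)
{
    (void)f;
    return 0;
}

/* Stacked mmap: refuse when the lower file has no mmap handler. */
static int toy_stacked_mmap(struct toy_file *lower)
{
    if (!lower->f_op || !lower->f_op->mmap)
        return -ENODEV;
    return lower->f_op->mmap(lower);
}
```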

CVE-2016-1583

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: stable@vger.kernel.org
[tyhicks: clean up f_op check by using ecryptfs_file_to_lower()]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Mainline v4.7 commit f0fe970df3838c202ef6c07a4c2b36838ef0a88b
Replaces UEK4 commit e06914f2e9ac6b3f19d4461cb24b401f77ce4f17
which was reverted by UEK4 commit b1660e855b21.

Orabug: 24971905
CVE: CVE-2016-1583
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years agoRevert "ecryptfs: forbid opening files without mmap handler"
Chuck Anderson [Thu, 10 Nov 2016 14:10:42 +0000 (06:10 -0800)]
Revert "ecryptfs: forbid opening files without mmap handler"

This reverts commit UEK4 e06914f2e9ac6b3f19d4461cb24b401f77ce4f17
which was based on mainline v4.7 commit:
2f36db71009304b3f0b95afacd8eba1f9f046b87
ecryptfs: forbid opening files without mmap handler

2f36db71 was replaced by mainline v4.7 commit:
f0fe970df3838c202ef6c07a4c2b36838ef0a88b
ecryptfs: don't allow mmap when the lower fs doesn't support it
which follows in another commit.

Also see mainline 4.7 commit 78c4e172412de5d0456dc00d2b34050aa0b683b5
Revert "ecryptfs: forbid opening files without mmap handler"

Orabug: 24971905
CVE: CVE-2016-1583
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Wed, 9 Nov 2016 22:19:53 +0000 (14:19 -0800)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/upstream-cherry-picks:
  percpu: fix synchronization between synchronous map extension and chunk destruction
  percpu: fix synchronization between chunk->map_extend_work and chunk destruction
  ALSA: timer: Fix leak in events via snd_timer_user_tinterrupt
  ALSA: timer: Fix leak in events via snd_timer_user_ccallback
  ALSA: timer: Fix leak in SNDRV_TIMER_IOCTL_PARAMS

8 years agopercpu: fix synchronization between synchronous map extension and chunk destruction
Tejun Heo [Wed, 25 May 2016 15:48:25 +0000 (11:48 -0400)]
percpu: fix synchronization between synchronous map extension and chunk destruction

For non-atomic allocations, pcpu_alloc() can try to extend the area
map synchronously after dropping pcpu_lock; however, the extension
wasn't synchronized against chunk destruction and the chunk might get
freed while extension is in progress.

This patch fixes the bug by putting most of non-atomic allocations
under pcpu_alloc_mutex to synchronize against pcpu_balance_work which
is responsible for async chunk management including destruction.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: stable@vger.kernel.org # v3.18+
Fixes: 1a4d76076cda ("percpu: implement asynchronous chunk population")
Orabug: 25060076
CVE: CVE-2016-4794
Mainline v4.7 commit 6710e594f71ccaad8101bc64321152af7cd9ea28
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years agopercpu: fix synchronization between chunk->map_extend_work and chunk destruction
Tejun Heo [Wed, 25 May 2016 15:48:25 +0000 (11:48 -0400)]
percpu: fix synchronization between chunk->map_extend_work and chunk destruction

Atomic allocations can trigger async map extensions which are serviced
by chunk->map_extend_work.  pcpu_balance_work which is responsible for
destroying idle chunks wasn't synchronizing properly against
chunk->map_extend_work and may end up freeing the chunk while the work
item is still in flight.

This patch fixes the bug by rolling async map extension operations
into pcpu_balance_work.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: stable@vger.kernel.org # v3.18+
Fixes: 9c824b6a172c ("percpu: make sure chunk->map array has available space")
Orabug: 25060076
CVE: CVE-2016-4794
Mainline v4.7 commit 4f996e234dad488e5d9ba0858bc1bae12eff82c3
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years agoALSA: timer: Fix leak in events via snd_timer_user_tinterrupt
Kangjie Lu [Tue, 3 May 2016 20:44:32 +0000 (16:44 -0400)]
ALSA: timer: Fix leak in events via snd_timer_user_tinterrupt

The stack object “r1” has a total size of 32 bytes. Its fields
“event” and “val” each have 4 bytes of padding. These 8
padding bytes are sent to user space without being initialized.
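
A sketch of the usual remedy for this class of leak: zero the whole object
before filling its fields. The toy_tread layout is hypothetical and merely
has the same shape (4-byte fields next to 8-byte-aligned ones) on a typical
64-bit ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: 32 bytes on a typical 64-bit ABI. */
struct toy_tread {
    int32_t  event;     /* followed by 4 padding bytes */
    int64_t  tv_sec;
    int64_t  tv_nsec;
    uint32_t val;       /* followed by 4 padding bytes */
};

/*
 * Fill the record. Zeroing the whole object first means the padding
 * does not carry stale stack contents when the struct is copied out.
 */
static void toy_fill_tread(struct toy_tread *r, int32_t event, uint32_t val)
{
    memset(r, 0, sizeof(*r));
    r->event = event;
    r->val = val;
}
```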

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Orabug: 25059885
CVE: CVE-2016-4578
Mainline v4.7 commit e4ec8cc8039a7063e24204299b462bd1383184a5
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years agoALSA: timer: Fix leak in events via snd_timer_user_ccallback
Kangjie Lu [Tue, 3 May 2016 20:44:20 +0000 (16:44 -0400)]
ALSA: timer: Fix leak in events via snd_timer_user_ccallback

The stack object “r1” has a total size of 32 bytes. Its fields
“event” and “val” each have 4 bytes of padding. These 8
padding bytes are sent to user space without being initialized.

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Orabug: 25059885
CVE: CVE-2016-4578
Mainline v4.7 commit 9a47e9cff994f37f7f0dbd9ae23740d0f64f9fe6
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years agoALSA: timer: Fix leak in SNDRV_TIMER_IOCTL_PARAMS
Kangjie Lu [Tue, 3 May 2016 20:44:07 +0000 (16:44 -0400)]
ALSA: timer: Fix leak in SNDRV_TIMER_IOCTL_PARAMS

The stack object “tread” has a total size of 32 bytes. Its fields
“event” and “val” each have 4 bytes of padding. These 8
padding bytes are sent to user space without being initialized.

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Orabug: 25059408
CVE: CVE-2016-4569
Mainline v4.7 commit cec8f96e49d9be372fdb0c3836dcf31ec71e457e
Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
8 years agoMerge branch topic/uek-4.1/rpm-build of git://ca-git.us.oracle.com/linux-uek into...
Chuck Anderson [Wed, 9 Nov 2016 19:50:53 +0000 (11:50 -0800)]
Merge branch topic/uek-4.1/rpm-build of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/rpm-build:
  uek-rpm ol7: change uek-rpm/ol7/update-el release value from 7.1 to 7.3

8 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Wed, 9 Nov 2016 14:21:32 +0000 (06:21 -0800)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/upstream-cherry-picks:
  perf tools: handle spaces in file names obtained from /proc/pid/maps

8 years agoperf tools: handle spaces in file names obtained from /proc/pid/maps
Marcin Ślusarz [Tue, 19 Jan 2016 19:03:03 +0000 (20:03 +0100)]
perf tools: handle spaces in file names obtained from /proc/pid/maps

Steam frequently puts game binaries in folders with spaces.

Note: "(deleted)" markers are now treated as part of the file name.
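
A minimal sketch of parsing a maps line so that spaces in the path survive.
toy_parse_maps_path() is a hypothetical helper, not the perf code; it uses a
scanset that reads to end of line where "%s" would stop at the first space.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/<pid>/maps line and copy the path into name, which
 * must hold at least 256 bytes. Returns 1 when a path was found, else 0.
 * "%255[^\n]" keeps embedded spaces, including a trailing "(deleted)".
 */
static int toy_parse_maps_path(const char *line, char *name)
{
    unsigned long start, end, pgoff;
    char prot[5];
    int n;

    name[0] = '\0';
    n = sscanf(line, "%lx-%lx %4s %lx %*x:%*x %*u %255[^\n]",
               &start, &end, prot, &pgoff, name);
    return n == 5;
}
```

Because the scanset runs to the newline, the "(deleted)" marker ends up as
part of the returned name, matching the note above.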

Signed-off-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Fixes: 6064803313ba ("perf tools: Use sscanf for parsing /proc/pid/maps")
Link: http://lkml.kernel.org/r/20160119190303.GA17579@marcin-Inspiron-7720
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
(cherry picked from commit 89fee59b504f86925894fcc9ba79d5c933842f93)

Orabug: 25072114
Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
8 years agouek-rpm ol7: change uek-rpm/ol7/update-el release value from 7.1 to 7.3
Chuck Anderson [Fri, 4 Nov 2016 12:33:13 +0000 (05:33 -0700)]
uek-rpm ol7: change uek-rpm/ol7/update-el release value from 7.1 to 7.3

Change release value in uek-rpm/ol7/update-el to 7.3 so that manual builds
will pick up the new OL7.3 secure boot key.
uek-rpm/ol6/update-el is not affected.

Orabug: 25050588

Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com>
Reviewed-by: Guru Anbalagane <guru.anbalagane@oracle.com>
8 years agoMerge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek...
Chuck Anderson [Thu, 3 Nov 2016 17:43:20 +0000 (10:43 -0700)]
Merge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/ofed:
  xsigo: send nack codes
  xsigo: xve driver has excessive messages
  xsigo: hard LOCKUP in freeing paths
  xsigo: Crash in xscore_port_num
  xsigo: Resize uVNIC/PVI CQ size
  xsigo: Optimizing Transmit completions
  xsigo: Implementing Jumbo MTU support
  RDS: rds debug messages are enabled by default
  net/rds: Fix new sparse warning
  net/rds: fix unaligned memory access

8 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Thu, 3 Nov 2016 17:42:10 +0000 (10:42 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/upstream-cherry-picks: (23 commits)
  NFS: Fix an LOCK/OPEN race when unlinking an open file
  intel_idle: correct BXT support
  intel_idle: re-work bxt_idle_state_table_update() and its helper
  x86/intel_idle: Use Intel family macros for intel_idle
  x86/cpu/intel: Introduce macros for Intel family numbers
  intel_idle: add BXT support
  intel_idle: Add KBL support
  intel_idle: Add SKX support
  intel_idle: Clean up all registered devices on exit.
  intel_idle: Propagate hot plug errors.
  intel_idle: Don't overreact to a cpuidle registration failure.
  intel_idle: Setup the timer broadcast only on successful driver load.
  intel_idle: Avoid a double free of the per-CPU data.
  intel_idle: Fix dangling registration on error path.
  intel_idle: Fix deallocation order on the driver exit path.
  intel_idle: Remove redundant initialization calls.
  intel_idle: Fix a helper function's return value.
  intel_idle: remove useless return from void function.
  intel_idle: Support for Intel Xeon Phi Processor x200 Product Family
  intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled
  ...

8 years agoxsigo: send nack codes
Pradeep Gopanapalli [Tue, 1 Nov 2016 22:16:23 +0000 (22:16 +0000)]
xsigo: send nack codes

Orabug: 24442792

Sometimes uVNIC removal on OFOS won't trigger an actual removal
of the Vstar interface; in that case the uVNIC driver has to send a NACK
code so that XCM will start cleaning its database.

Added additional codes as per XCM specification

Reported-by: jie zhu <jie.x.zhu@oracle.com>
Signed-off-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Reviewed-by: Qingjun Wang <qingjun.wang@oracle.com>
Reviewed-by: Manish Kumar Singh <mk.singh@oracle.com>
Reviewed-by: UmaShankar Tumari Mahabalagiri <umashankar.mahabalagiri@oracle.com>
8 years agoxsigo: xve driver has excessive messages
Pradeep Gopanapalli [Tue, 1 Nov 2016 19:41:48 +0000 (19:41 +0000)]
xsigo: xve driver has excessive messages

Orabug: 24758335

Moved some message types from Warning to debug.

Consolidated multiple messages into single to avoid
flooding of messages on console

Added more counters to identify state of vnic.

Added a debug type xve_info

Reported-by: chien yen <chien.yen@oracle.com>
Signed-off-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Reviewed-by: Aravind Kini <aravind.kini@oracle.com>
Reviewed-by: UmaShankar Tumari Mahabalagiri <umashankar.mahabalagiri@oracle.com>
8 years agoxsigo: hard LOCKUP in freeing paths
Pradeep Gopanapalli [Tue, 1 Nov 2016 19:41:13 +0000 (19:41 +0000)]
xsigo: hard LOCKUP in freeing paths

Orabug: 24669507

When path->users becomes zero uVNIC driver starts
cleaning up the Forwarding table entries.

In some corner cases the call is invoked from transmit
function which is in interrupt context and that results
in a hard LOCKUP.

With new changes path->users is decremented in transmit
function to allow cleanup to happen from other thread.
Proper care is taken to avoid race between these
two contexts.

Reported-by: chien yen <chien.yen@oracle.com>
Signed-off-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Reviewed-by: Aravind Kini <aravind.kini@oracle.com>
Reviewed-by: viswa krishnamurthy <viswa.krishnamurthy@oracle.com>
Reviewed-by: Manish Kumar Singh <mk.singh@oracle.com>
Reviewed-by: UmaShankar Tumari Mahabalagiri <umashankar.mahabalagiri@oracle.com>
8 years agoxsigo: Crash in xscore_port_num
Pradeep Gopanapalli [Tue, 1 Nov 2016 19:38:47 +0000 (19:38 +0000)]
xsigo: Crash in xscore_port_num

Orabug: 24760465

When the Server Profile context is not present,
xcpm_get_xsmp_session_info returns an error and the uVNIC
driver has to handle that condition.

Reported-by: scarlett chen <scarlett.chen@oracle.com>
Signed-off-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Reviewed-by: viswa krishnamurthy <viswa.krishnamurthy@oracle.com>
Reviewed-by: UmaShankar Tumari Mahabalagiri <umashankar.mahabalagiri@oracle.com>
8 years agoxsigo: Resize uVNIC/PVI CQ size
Pradeep Gopanapalli [Tue, 1 Nov 2016 19:36:41 +0000 (19:36 +0000)]
xsigo: Resize uVNIC/PVI CQ size

Orabug: 24765034

uVNIC/PVI should avoid CQ overflow condition

Resize CQ's to 16k to handle multiple connections
flushed simultaneously per path.

Increase Send Queue and receive Queue to 2k for
better performance.

Added counters to print CQ sizes.
Added stats to count RC completions.

Reported-by: scarlett chen <scarlett.chen@oracle.com>
Signed-off-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Reviewed-by: Aravind Kini <aravind.kini@oracle.com>
Reviewed-by: viswa krishnamurthy <viswa.krishnamurthy@oracle.com>
Reviewed-by: Manish Kumar Singh <mk.singh@oracle.com>
Reviewed-by: UmaShankar Tumari Mahabalagiri <umashankar.mahabalagiri@oracle.com>
8 years agoxsigo: Optimizing Transmit completions
Pradeep Gopanapalli [Tue, 1 Nov 2016 19:34:29 +0000 (19:34 +0000)]
xsigo: Optimizing Transmit completions

Orabug: 24928865

Added a timer for polling Transmit completion and
removed polling completion from a thread context.

Seeing good performance improvements with the changes.
In some cases uVNIC sees a 10% increase in throughput.

Reported-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Signed-off-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Reviewed-by: sajid zia <szia@oracle.com>
8 years agoxsigo: Implementing Jumbo MTU support
Pradeep Gopanapalli [Tue, 1 Nov 2016 19:27:06 +0000 (19:27 +0000)]
xsigo: Implementing Jumbo MTU support

Orabug: 24928804

With Titan and Saturn supporting Jumbo Infiniband frames,
uVNIC can have an MTU greater than 4k and up to 10k.

Allocate multiple pages for Receive descriptors; code changes
for handling multiple page mapping and unmapping.

Took proper care for enabling Jumbo MTU only for Titan and only
in EoiB mode.

If Jumbo MTU is used for non-Titan cards uVNIC driver will NACK
the Install and OFOS will display a failure message for the
install.

Added stats to display Jumbo & removed legacy EoiB HeartBeat code.

Reported-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Signed-off-by: Pradeep Gopanapalli <pradeep.gopanapalli@oracle.com>
Reviewed-by: sajid zia <szia@oracle.com>
8 years agoNFS: Fix an LOCK/OPEN race when unlinking an open file
Chuck Lever [Mon, 11 Apr 2016 20:20:22 +0000 (16:20 -0400)]
NFS: Fix an LOCK/OPEN race when unlinking an open file

Orabug: 24476280

At Connectathon 2016, we found that recent upstream Linux clients
would occasionally send a LOCK operation with a zero stateid. This
appeared to happen in close proximity to another thread returning
a delegation before unlinking the same file while it remained open.

Earlier, the client received a write delegation on this file and
returned the open stateid. Now, as it is getting ready to unlink the
file, it returns the write delegation. But there is still an open
file descriptor on that file, so the client must OPEN the file
again before it returns the delegation.

Since commit 24311f884189 ('NFSv4: Recovery of recalled read
delegations is broken'), nfs_open_delegation_recall() clears the
NFS_DELEGATED_STATE flag _before_ it sends the OPEN. This allows a
racing LOCK on the same inode to be put on the wire before the OPEN
operation has returned a valid open stateid.

To eliminate this race, serialize delegation return with the
acquisition of a file lock on the same file. Adopt the same approach
as is used in the unlock path.

This patch also eliminates a similar race seen when sending a LOCK
operation at the same time as returning a delegation on the same file.

Fixes: 24311f884189 ('NFSv4: Recovery of recalled read ... ')
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
[Anna: Add sentence about LOCK / delegation race]
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
(cherry picked from commit 11476e9dec39d90fe1e9bf12abc6f3efe35a073d)
Signed-off-by: Todd Vierling <todd.vierling@oracle.com>
8 years agointel_idle: correct BXT support
Jan Beulich [Mon, 27 Jun 2016 06:35:48 +0000 (00:35 -0600)]
intel_idle: correct BXT support

Orabug: 24810432

Commit 5dcef69486 ("intel_idle: add BXT support") added an 8-element
lookup array with just a 2-bit value used for lookups. As per the SDM
that bit field is really 3 bits wide. While this is supposedly benign
here, future re-use of the code for other CPUs might expose the issue.
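
The effect of the mask width can be illustrated as follows. The table
values and the bit position of the field (shift by 10) are assumptions made
for this sketch, not taken from the commit text.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative 8-entry unit table indexed by a 3-bit field. */
static const unsigned int toy_units[8] = {
    1, 32, 1024, 32768, 1048576, 33554432, 0, 0
};

/* Index with a 2-bit mask: entries 4..7 can never be selected. */
static unsigned int toy_unit_2bit(uint64_t reg)
{
    return toy_units[(reg >> 10) & 0x3];
}

/* Index with a 3-bit mask: all eight entries are reachable. */
static unsigned int toy_unit_3bit(uint64_t reg)
{
    return toy_units[(reg >> 10) & 0x7];
}
```

Both versions agree while the field stays below 4, which is why the
narrower mask can go unnoticed until a value of 4 or more shows up.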

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit bef450962597ff39a7f9d53a30523aae9eb55843)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: re-work bxt_idle_state_table_update() and its helper
Jan Beulich [Mon, 27 Jun 2016 06:35:12 +0000 (00:35 -0600)]
intel_idle: re-work bxt_idle_state_table_update() and its helper

Orabug: 24810432

Since irtl_ns_units[] itself has zero entries, make sure the caller
recognizes those cases along with the MSR read returning zero, as zero
is not a valid value for exit_latency and target_residency.
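
A hedged sketch of the caller-side check: treat a zero result, whether from
a zero register or a zero unit, as "no usable value" and keep the existing
table entry. All toy_* names, the table values and the field layout are
assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative unit table; the last two entries are zero. */
static const uint64_t toy_ns_units[8] = {
    1, 32, 1024, 32768, 1048576, 33554432, 0, 0
};

/* Convert a register value to microseconds; 0 means "no usable value". */
static uint64_t toy_reg_to_usec(uint64_t reg)
{
    uint64_t ns;

    if (reg == 0)
        return 0;
    ns = toy_ns_units[(reg >> 10) & 0x7];
    return (reg & 0x3FF) * ns / 1000;
}

/* Override a latency only when the conversion produced a nonzero value. */
static void toy_update_latency(uint64_t *exit_latency, uint64_t reg)
{
    uint64_t usec = toy_reg_to_usec(reg);

    if (usec)
        *exit_latency = usec;
}
```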

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 3451ab3ebf92b12801878d8b5c94845afd4219f0)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agox86/intel_idle: Use Intel family macros for intel_idle
Dave Hansen [Fri, 3 Jun 2016 00:19:32 +0000 (17:19 -0700)]
x86/intel_idle: Use Intel family macros for intel_idle

Orabug: 24810432

Use the new INTEL_FAM6_* macros for intel_idle.c.  Also fix up
some of the macros to be consistent with how some of the
intel_idle code refers to the model.

There's one oddity here: model 0x1F is uniquely referred to here
and nowhere else that I could find.  0x1E/0x1F are just spelled
out as "Intel Core i7 and i5 Processors" in the SDM or as "Intel
processors based on the Nehalem, Westmere microarchitectures" in
the RDPMC section.  Comments between tables 19-19 and 19-20 in
the SDM seem to point to 0x1F being some kind of Westmere, so
let's call it "WESTMERE2".

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: jacob.jun.pan@intel.com
Cc: linux-pm@vger.kernel.org
Link: http://lkml.kernel.org/r/20160603001932.EE978EB9@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit db73c5a8c80decbb6ddf208e58f3865b4df5384d)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agox86/cpu/intel: Introduce macros for Intel family numbers
Dave Hansen [Fri, 3 Jun 2016 00:19:27 +0000 (17:19 -0700)]
x86/cpu/intel: Introduce macros for Intel family numbers

Orabug: 24810432

Problem:

We have a boatload of open-coded family-6 model numbers.  Half of
them have these model numbers in hex and the other half in
decimal.  This makes grepping for them tons of fun, if you were
to try.

Solution:

Consolidate all the magic numbers.  Put all the definitions in
one header.

The names here are closely derived from the comments describing
the models from arch/x86/events/intel/core.c.  We could easily
make them shorter by doing things like s/SANDYBRIDGE/SNB/, but
they seemed fine even with the longer versions to me.

Do not take any of these names too literally, like "DESKTOP"
or "MOBILE".  These are all colloquial names and not precise
descriptions of everywhere a given model will show up.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
Cc: Souvik Kumar Chakravarty <souvik.k.chakravarty@intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Vishwanath Somayaji <vishwanath.somayaji@intel.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: jacob.jun.pan@intel.com
Cc: linux-acpi@vger.kernel.org
Cc: linux-edac@vger.kernel.org
Cc: linux-mmc@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: platform-driver-x86@vger.kernel.org
Link: http://lkml.kernel.org/r/20160603001927.F2A7D828@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 970442c599b22ccd644ebfe94d1d303bf6f87c05)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: add BXT support
Len Brown [Wed, 6 Apr 2016 21:00:47 +0000 (17:00 -0400)]
intel_idle: add BXT support

Orabug: 24810432

Broxton has all the HSW C-states, except C3.
BXT C-state timing is slightly different.

Here we trust the IRTL MSRs as authority
on maximum C-state latency, and override the driver's tables
with the values found in the associated IRTL MSRs.
Further we set the target_residency to 1x maximum latency,
trusting the hardware demotion logic.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 5dcef694860100fd16885f052591b1268b764d21)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
arch/x86/include/asm/msr-index.h

8 years agointel_idle: Add KBL support
Len Brown [Wed, 6 Apr 2016 21:00:59 +0000 (17:00 -0400)]
intel_idle: Add KBL support

Orabug: 24810432

KBL is similar to SKL

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 3ce093d4de753d6c92cc09366e29d0618a62f542)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Add SKX support
Len Brown [Wed, 6 Apr 2016 21:00:58 +0000 (17:00 -0400)]
intel_idle: Add SKX support

Orabug: 24810432

SKX is similar to BDX

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit f9e71657c2c0a8f1c50884ab45794be2854e158e)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Clean up all registered devices on exit.
Richard Cochran [Wed, 6 Apr 2016 21:00:57 +0000 (17:00 -0400)]
intel_idle: Clean up all registered devices on exit.

Orabug: 24810432

This driver registers cpuidle devices when a CPU comes online, but it
leaves the registrations in place when a CPU goes offline.  The module
exit code only unregisters the currently online CPUs, leaving the
devices for offline CPUs dangling.

This patch changes the driver to clean up all registrations on exit,
even those from CPUs that are offline.

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 3e66a9ab53641a0f7a440e56f7b35bf5d77494b3)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Propagate hot plug errors.
Richard Cochran [Wed, 6 Apr 2016 21:00:56 +0000 (17:00 -0400)]
intel_idle: Propagate hot plug errors.

Orabug: 24810432

If a cpuidle registration error occurs during the hot plug notifier
callback, we should really inform the hot plug machinery instead of
just ignoring the error.  This patch changes the callback to properly
return on error.

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 08820546e4c30c84d0a1f1a49df055e1719c07ea)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Don't overreact to a cpuidle registration failure.
Richard Cochran [Wed, 6 Apr 2016 21:00:55 +0000 (17:00 -0400)]
intel_idle: Don't overreact to a cpuidle registration failure.

Orabug: 24810432

The helper function, intel_idle_cpu_init, registers one new device
with the cpuidle layer.  If the registration should fail, that
function immediately calls intel_idle_cpuidle_devices_uninit() to
unregister every last CPU's device.  However, it makes no sense to do
so, when called from the hot plug notifier callback.

This patch moves the call to intel_idle_cpuidle_devices_uninit()
outside of the helper function to the one call site that actually
needs to perform the de-registrations.

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit b69ef2c099c3e5f11bd5c33a9530d6522f72c9aa)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Setup the timer broadcast only on successful driver load.
Richard Cochran [Wed, 6 Apr 2016 21:00:54 +0000 (17:00 -0400)]
intel_idle: Setup the timer broadcast only on successful driver load.

Orabug: 24810432

This driver sets the broadcast tick quite early on during probe and does
not clean up again in case of failure.  This patch moves the setup call
after the registration, placing the on_each_cpu() calls within the global
CPU lock region.

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 2259a819a8d37e472f08c88bc0dd22194754adb4)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Avoid a double free of the per-CPU data.
Richard Cochran [Wed, 6 Apr 2016 21:00:53 +0000 (17:00 -0400)]
intel_idle: Avoid a double free of the per-CPU data.

Orabug: 24810432

The helper function, intel_idle_cpuidle_devices_uninit, frees the
globally allocated per-CPU data.  However, this function is invoked
from the hot plug notifier callback at a time when freeing that data
is not safe.

If the call to cpuidle_register_driver() should fail (say, due to lack
of memory), then the driver will free its per-CPU region.  On the
*next* CPU_ONLINE event, the driver will happily use the region again
and even free it again if the failure repeats.

This patch fixes the issue by moving the call to free_percpu() outside
of the helper function at the two call sites that actually need to
free the per-CPU data.
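
The ownership change can be sketched in user space as follows. This is
not the driver code: the per-CPU region is modelled as a plain heap
array, and every name (example_devices, example_cpu_online, and so on)
is hypothetical. The point shown is only that the helper unregisters
without freeing, so a failing hot plug event leaves the region usable.

```c
#include <assert.h>
#include <stdlib.h>

#define EXAMPLE_NR_CPUS 4

struct example_dev { int registered; };

/* Stands in for the globally allocated per-CPU data. */
static struct example_dev *example_devices;

static int example_alloc(void)
{
    example_devices = calloc(EXAMPLE_NR_CPUS, sizeof(*example_devices));
    return example_devices ? 0 : -1;
}

/* Helper: unregisters every device but no longer frees the region. */
static void example_devices_uninit(void)
{
    assert(example_devices != NULL);
    for (int cpu = 0; cpu < EXAMPLE_NR_CPUS; cpu++)
        example_devices[cpu].registered = 0;
}

/* Hot plug path: a failure may unregister, but must not free. */
static int example_cpu_online(int cpu, int simulate_failure)
{
    if (simulate_failure) {
        example_devices_uninit();
        return -1;
    }
    example_devices[cpu].registered = 1;
    return 0;
}

/* Exit path: one of the call sites that owns the free. */
static void example_exit(void)
{
    example_devices_uninit();
    free(example_devices);
    example_devices = NULL;
}
```

With the free kept at the call sites, a later online event after a
failure still finds valid memory, and the region is released exactly
once.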

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit ca42489d9ee3262482717c83428e087322fdc39c)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Fix dangling registration on error path.
Richard Cochran [Wed, 6 Apr 2016 21:00:52 +0000 (17:00 -0400)]
intel_idle: Fix dangling registration on error path.

Orabug: 24810432

In the module_init() method, if the per-CPU allocation fails, then the
active cpuidle registration is not cleaned up.  This patch fixes the
issue by attempting the allocation before registration, and then
cleaning it up again on registration failure.

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit e9df69ccd1322e87eee10f28036fad9e6c71f8dd)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Fix deallocation order on the driver exit path.
Richard Cochran [Wed, 6 Apr 2016 21:00:51 +0000 (17:00 -0400)]
intel_idle: Fix deallocation order on the driver exit path.

Orabug: 24810432

In the module_exit() method, this driver first frees its per-CPU
pointer, then unregisters a callback making use of the pointer.
Furthermore, the function, intel_idle_cpuidle_devices_uninit, is racy
against CPU hot plugging as it calls for_each_online_cpu().

This patch corrects the issues by unregistering first on the exit path
while holding the hot plug lock.

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 51319918bcc31f901646fc66348d41cf74ee0566)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Remove redundant initialization calls.
Richard Cochran [Wed, 6 Apr 2016 21:00:50 +0000 (17:00 -0400)]
intel_idle: Remove redundant initialization calls.

Orabug: 24810432

The function, intel_idle_cpuidle_driver_init, makes calls on each CPU
to auto_demotion_disable() and c1e_promotion_disable().  These calls
are redundant, as intel_idle_cpu_init() does the same calls just a bit
later on.  They are also premature, as the driver registration may yet
fail.

This patch removes the redundant code.

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 4a3dfb3fc0fb0fc9acd36c94b7145f9c9dd4d93a)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Fix a helper function's return value.
Richard Cochran [Wed, 6 Apr 2016 21:00:49 +0000 (17:00 -0400)]
intel_idle: Fix a helper function's return value.

Orabug: 24810432

The function, intel_idle_cpuidle_driver_init, delivers no error codes
at all.  This patch changes the function to return 'void' instead of
returning zero.

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 5469c827d20ab013f43d4f5f94e101d0cf7afd2c)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: remove useless return from void function.
Richard Cochran [Wed, 6 Apr 2016 21:00:48 +0000 (17:00 -0400)]
intel_idle: remove useless return from void function.

Orabug: 24810432

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit f70415496d5ddf06fe7e0a22250d60bab2b2d7cc)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Support for Intel Xeon Phi Processor x200 Product Family
Dasaratharaman Chandramouli [Fri, 5 Sep 2014 00:22:54 +0000 (17:22 -0700)]
intel_idle: Support for Intel Xeon Phi Processor x200 Product Family

Orabug: 24810432

Enables "Intel(R) Xeon Phi(TM) Processor x200 Product Family" support,
formerly code-named KNL. It is based on a modified Intel Atom Silvermont
microarchitecture.

Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
[micah.barany@intel.com: adjusted values of residency and latency]
Signed-off-by: Micah Barany <micah.barany@intel.com>
[hubert.chrzaniuk@intel.com: removed deprecated CPUIDLE_FLAG_TIME_VALID flag]
Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Pawel Karczewski <pawel.karczewski@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
(cherry picked from commit 281baf7a702693deaa45c98ef0c5161006b48257)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled
Len Brown [Sun, 13 Mar 2016 05:33:48 +0000 (00:33 -0500)]
intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled

Orabug: 24810432

Some SKL-H configurations require "intel_idle.max_cstate=7" to boot.
While that is an effective workaround, it disables C10.

This patch detects the problematic configuration,
and disables C8 and C9, keeping C10 enabled.

Note that enabling SGX in BIOS SETUP can also prevent this issue,
if the system BIOS provides that option.

https://bugzilla.kernel.org/show_bug.cgi?id=109081
"Freezes with Intel i7 6700HQ (Skylake), unless intel_idle.max_cstate=7"

Signed-off-by: Len Brown <len.brown@intel.com>
Cc: stable@vger.kernel.org
(cherry picked from commit d70e28f57e14a481977436695b0c9ba165472431)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Skylake Client Support - updated
Len Brown [Wed, 9 Sep 2015 17:35:05 +0000 (13:35 -0400)]
intel_idle: Skylake Client Support - updated

Orabug: 24810432

Addition of PC9 state, and minor tweaks to existing PC6 and PC8 states.

Signed-off-by: Len Brown <len.brown@intel.com>
(cherry picked from commit 135919a3a80565070b9645009e65f73e72c661c0)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: Skylake Client Support
Len Brown [Thu, 26 Mar 2015 03:20:37 +0000 (23:20 -0400)]
intel_idle: Skylake Client Support

Orabug: 24810432

Skylake Client CPU idle Power states (C-states)
are similar to the previous generation, Broadwell.
However, Skylake does get its own table with updated
worst-case latency and average energy-break-even residency values.

Signed-off-by: Len Brown <len.brown@intel.com>
(cherry picked from commit 493f133f47750aa5566fafa9403617e3f0506f8c)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agointel_idle: allow idle states to be freeze-mode specific
Len Brown [Wed, 27 May 2015 21:11:37 +0000 (17:11 -0400)]
intel_idle: allow idle states to be freeze-mode specific

Orabug: 24810432

intel_idle uses a NULL "enter" field in a cpuidle state
to recognize the invalid entry terminating a variable-length array.

Linux-4.0 added support for the system-wide "freeze" state
in cpuidle drivers via the new "enter_freeze" field.

The natural way to expose a deep idle state for freeze,
but not for run-time idle is to supply "enter_freeze" without "enter";
so we update the driver to accept such states.
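
A sketch of the terminator rule, assuming a simplified state structure.
The struct, table, and function names are hypothetical and only mirror
the "enter" and "enter_freeze" fields named above; the real cpuidle
structures carry more fields and different callback signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a cpuidle state entry. */
struct example_state {
    const char *name;
    int (*enter)(void);
    int (*enter_freeze)(void);
};

static int example_enter(void) { return 0; }

static const struct example_state example_table[] = {
    { "C1",          example_enter, example_enter },
    { "freeze-only", NULL,          example_enter }, /* valid, but not for run-time idle */
    { NULL,          NULL,          NULL },          /* terminator */
};

/* Old rule: a NULL "enter" ends the table, so freeze-only states are lost. */
static size_t example_count_states_old(const struct example_state *t)
{
    size_t n = 0;

    assert(t != NULL);
    while (t[n].enter != NULL)
        n++;
    return n;
}

/* New rule: the table ends only where both callbacks are NULL. */
static size_t example_count_states(const struct example_state *t)
{
    size_t n = 0;

    assert(t != NULL);
    while (t[n].enter != NULL || t[n].enter_freeze != NULL)
        n++;
    return n;
}
```

On this table the old rule stops at the freeze-only entry, while the new
rule walks past it to the real terminator.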

Signed-off-by: Len Brown <len.brown@intel.com>
(cherry picked from commit 7dd0e0af64afe4aa08ccdd167f64bd007f09b515)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agoRDS: rds debug messages are enabled by default
shamir rabinovitch [Wed, 26 Oct 2016 13:16:50 +0000 (06:16 -0700)]
RDS: rds debug messages are enabled by default

rds uses a Kconfig option called "RDS_DEBUG" to enable rds debug messages.
This option causes the rds Makefile to add -DDEBUG to the rds gcc command
line.

When CONFIG_DYNAMIC_DEBUG is enabled, the "DEBUG" macro is used by
include/linux/dynamic_debug.h to decide if dynamic debug prints should
be sent by default to the kernel log.

rds should not enable this macro for production builds.

Orabug: 24956522

Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com>
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
8 years agonet/rds: Fix new sparse warning
David Ahern [Mon, 4 May 2015 15:51:38 +0000 (11:51 -0400)]
net/rds: Fix new sparse warning

c0adf54a109 introduced new sparse warnings:
  CHECK   /home/dahern/kernels/linux.git/net/rds/ib_cm.c
net/rds/ib_cm.c:191:34: warning: incorrect type in initializer (different base types)
net/rds/ib_cm.c:191:34:    expected unsigned long long [unsigned] [usertype] dp_ack_seq
net/rds/ib_cm.c:191:34:    got restricted __be64 <noident>
net/rds/ib_cm.c:194:51: warning: cast to restricted __be64

The temporary variable for sequence number should have been declared as __be64
rather than u64. Make it so.

Orabug: 24817685

Signed-off-by: David Ahern <david.ahern@oracle.com>
Cc: shamir rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e2783717a71e9babfdd7c36c7e35b790d2c01022)
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
8 years agonet/rds: fix unaligned memory access
shamir rabinovitch [Fri, 1 May 2015 00:58:07 +0000 (20:58 -0400)]
net/rds: fix unaligned memory access

rdma_conn_param private data is copied using memcpy after headers such
as cma_hdr (see cma_resolve_ib_udp as an example), so the start of the
private data is aligned to the end of the structure that comes before
it. If that structure ends with a u32, the start of the private data is
only 4-byte aligned. Structures that use u8/u16/u32/u64 are naturally
aligned, but if the structure start is not 8-byte aligned, every u64
member of the structure is unaligned. To solve this issue we must use
special macros that allow unaligned access to those members.
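
For illustration, a portable user-space equivalent of an unaligned
64-bit access. The kernel provides get_unaligned() and put_unaligned()
for this purpose; the helpers below are hypothetical stand-ins built on
memcpy, not the macros used by the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Load a 64-bit value from an address that may be only 4-byte aligned.
 * memcpy lets the compiler pick an access sequence that is safe on the
 * target instead of issuing a single 8-byte load that may trap.
 */
static uint64_t example_load_u64(const void *p)
{
    uint64_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Store counterpart, with the same alignment tolerance. */
static void example_store_u64(void *p, uint64_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof(v));
}
```

Dereferencing a uint64_t pointer at such an address would be the access
pattern that produces the trap quoted below on strict-alignment
machines.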

Addresses the following kernel log seen when attempting to use RDMA:

Kernel unaligned access at TPC[10507a88] rds_ib_cm_connect_complete+0x1bc/0x1e0 [rds_rdma]

Orabug: 24817685

Acked-by: Chien Yen <chien.yen@oracle.com>
Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com>
[Minor tweaks for top of tree by:]
Signed-off-by: David Ahern <david.ahern@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c0adf54a10903b59037a4c5fcb933dfeeb7b2624)
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
8 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Mon, 31 Oct 2016 22:52:23 +0000 (15:52 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/upstream-cherry-picks:
  sched: panic on corrupted stack end
  ecryptfs: forbid opening files without mmap handler
  proc: prevent stacking filesystems on top

8 years agosched: panic on corrupted stack end
Jann Horn [Wed, 1 Jun 2016 09:55:07 +0000 (11:55 +0200)]
sched: panic on corrupted stack end

Orabug: 24971905
CVE: CVE-2016-1583

Until now, hitting this BUG_ON caused a recursive oops (because oops
handling involves do_exit(), which calls into the scheduler, which in
turn raises an oops), which caused stuff below the stack to be
overwritten until a panic happened (e.g.  via an oops in interrupt
context, caused by the overwritten CPU index in the thread_info).

Just panic directly.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 29d6455178a09e1dc340380c582b13356227e8df)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
Conflicts:
kernel/sched/core.c

8 years agoecryptfs: forbid opening files without mmap handler
Jann Horn [Wed, 1 Jun 2016 09:55:06 +0000 (11:55 +0200)]
ecryptfs: forbid opening files without mmap handler

Orabug: 24971905
CVE: CVE-2016-1583

This prevents users from triggering a stack overflow through a recursive
invocation of pagefault handling that involves mapping procfs files into
virtual memory.

Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Tyler Hicks <tyhicks@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 2f36db71009304b3f0b95afacd8eba1f9f046b87)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agoproc: prevent stacking filesystems on top
Jann Horn [Wed, 1 Jun 2016 09:55:05 +0000 (11:55 +0200)]
proc: prevent stacking filesystems on top

Orabug: 24971905
CVE: CVE-2016-1583

This prevents stacking filesystems (ecryptfs and overlayfs) from using
procfs as lower filesystem.  There is too much magic going on inside
procfs, and there is no good reason to stack stuff on top of procfs.

(For example, procfs does access checks in VFS open handlers, and
ecryptfs by design calls open handlers from a kernel thread that doesn't
drop privileges or so.)

Signed-off-by: Jann Horn <jannh@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit e54ad7f1ee263ffa5a2de9c609d58dfa27b21cd9)
Signed-off-by: Brian Maly <brian.maly@oracle.com>
8 years agoMerge branch topic/uek-4.1/rpm-build of git://ca-git.us.oracle.com/linux-uek into...
Chuck Anderson [Mon, 31 Oct 2016 15:30:36 +0000 (08:30 -0700)]
Merge branch topic/uek-4.1/rpm-build of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/rpm-build:
  uek-rpm nano: remove the OL6 nano kernel dependency on kernel-firmware

8 years agouek-rpm nano: remove the OL6 nano kernel dependency on kernel-firmware
Ashok Vairavan [Mon, 31 Oct 2016 19:51:50 +0000 (12:51 -0700)]
uek-rpm nano: remove the OL6 nano kernel dependency on kernel-firmware

linux-nano-firmware obsoletes kernel-firmware.  Remove the requirement
for it from the OL6 nano kernel-uek.spec.

Orabug: 25023723

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
8 years agoMerge branch topic/uek-4.1/fuse of git://ca-git.us.oracle.com/linux-uek into uek...
Chuck Anderson [Mon, 31 Oct 2016 10:50:00 +0000 (03:50 -0700)]
Merge branch topic/uek-4.1/fuse of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/fuse:
  fuse: direct-io: don't dirty ITER_BVEC pages

8 years agoMerge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Mon, 31 Oct 2016 10:48:30 +0000 (03:48 -0700)]
Merge branch topic/uek-4.1/upstream-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/upstream-cherry-picks:
  btrfs: Handle unaligned length in extent_same
  panic, x86: Fix re-entrance problem due to panic on NMI
  kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup
  Fix compilation error introduced by "cancel the setfilesize transation when io error happen"
  cancel the setfilesize transation when io error happen
  mm/hugetlb: optimize minimum size (min_size) accounting
  Btrfs: fix device replace of a missing RAID 5/6 device
  Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation
  bpf: fix double-fdput in replace_map_fd_with_map_ptr()

8 years agoMerge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux...
Chuck Anderson [Mon, 31 Oct 2016 10:47:39 +0000 (03:47 -0700)]
Merge branch topic/uek-4.1/stable-cherry-picks of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

* topic/uek-4.1/stable-cherry-picks:
  kvm:vmx: more complete state update on APICv on/off

8 years agobtrfs: Handle unaligned length in extent_same
Mark Fasheh [Mon, 8 Jun 2015 22:05:25 +0000 (15:05 -0700)]
btrfs: Handle unaligned length in extent_same

The extent-same code rejects requests with an unaligned length. This
poses a problem when we want to dedupe the tail extent of files as we
skip cloning the portion between i_size and the extent boundary.

If we don't clone the entire extent, it won't be deleted. So the
combination of these behaviors winds up giving us worst-case dedupe on
many files.

We can fix this by allowing a length that extends to i_size and
internally aligning it to the end of the block. This is what
btrfs_ioctl_clone() does, so we can just copy that check over.
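
The length rule can be sketched as plain arithmetic. This is not the
btrfs code: the function names are hypothetical, a return value of 0
stands for "rejected", and the requirement that the offset itself be
block aligned is an assumption made to keep the example small.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of bs, where bs is a power of two. */
static uint64_t example_align_up(uint64_t x, uint64_t bs)
{
    assert(bs != 0 && (bs & (bs - 1)) == 0);
    return (x + bs - 1) & ~(bs - 1);
}

/*
 * Return the internally used length for a dedupe request, or 0 if the
 * request is rejected. An unaligned length is accepted only when the
 * range ends exactly at i_size; it is then extended to the block end.
 */
static uint64_t example_dedupe_len(uint64_t off, uint64_t len,
                                   uint64_t i_size, uint64_t bs)
{
    if (len == 0 || off > i_size || len > i_size - off)
        return 0;                      /* empty or beyond end of file */
    if (off % bs != 0)
        return 0;                      /* assumed: offset must be aligned */
    if (len % bs == 0)
        return len;                    /* already aligned */
    if (off + len == i_size)
        return example_align_up(off + len, bs) - off;
    return 0;                          /* unaligned and not the tail */
}
```

For a 10000-byte file with 4096-byte blocks, a request for the 1808-byte
tail starting at offset 8192 is widened to one full block, so the whole
tail extent is covered.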

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit e1d227a42ea2b4664f94212bd1106b9a3413ffb8)
Signed-off-by: Divya Indi <divya.indi@oracle.com>
Orabug: 24696342

8 years agopanic, x86: Fix re-entrance problem due to panic on NMI
Hidehiro Kawai [Mon, 14 Dec 2015 10:19:09 +0000 (11:19 +0100)]
panic, x86: Fix re-entrance problem due to panic on NMI

If panic on NMI happens just after panic() on the same CPU, panic() is
recursively called. As a result, the kernel stalls after failing to
acquire panic_lock.

To avoid this problem, don't call panic() in NMI context if we've
already entered panic().

For that, introduce nmi_panic() macro to reduce code duplication. In
the case of panic on NMI, don't return from NMI handlers if another CPU
already panicked.
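
A user-space sketch of the decision described above, using C11 atomics.
The names (example_panic_cpu, example_nmi_panic_action) are hypothetical
and the real macro may be structured differently; the sketch only shows
how a single compare-and-exchange on a "which CPU is panicking" variable
separates the three cases.

```c
#include <assert.h>
#include <stdatomic.h>

#define EXAMPLE_CPU_INVALID (-1)

/* Records which CPU, if any, has entered panic. */
static atomic_int example_panic_cpu = EXAMPLE_CPU_INVALID;

enum example_action {
    EXAMPLE_DO_PANIC,   /* nobody panicked yet: this CPU proceeds */
    EXAMPLE_RETURN,     /* this CPU is already in panic: do not recurse */
    EXAMPLE_HALT,       /* another CPU panicked: do not return from NMI */
};

/* Decide what an NMI handler on the given CPU should do. */
static enum example_action example_nmi_panic_action(int cpu)
{
    int old = EXAMPLE_CPU_INVALID;

    assert(cpu >= 0);
    /* On failure, old is updated to the CPU that already claimed the panic. */
    if (atomic_compare_exchange_strong(&example_panic_cpu, &old, cpu))
        return EXAMPLE_DO_PANIC;
    return old == cpu ? EXAMPLE_RETURN : EXAMPLE_HALT;
}
```

The first caller claims the panic; a nested NMI on the same CPU backs
off instead of re-entering, and any other CPU is told to stay put.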

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Javi Merino <javi.merino@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: lkml <linux-kernel@vger.kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Link: http://lkml.kernel.org/r/20151210014626.25437.13302.stgit@softrs
[ Cleanup comments, fixup formatting. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
(cherry picked from commit 1717f2096b543cede7a380c858c765c41936bc35)

Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Orabug: 24327572

8 years agokernel/watchdog.c: perform all-CPU backtrace in case of hard lockup
Jiri Kosina [Fri, 6 Nov 2015 02:44:41 +0000 (18:44 -0800)]
kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup

In many cases of hardlockup reports, it's actually not possible to know
why it triggered, because the CPU that got stuck is usually waiting on a
resource (with IRQs disabled) that some other CPU is holding.

IOW, we are often looking at the stacktrace of the victim and not the
actual offender.

Introduce a sysctl / cmdline parameter that makes it possible to have
the hardlockup detector perform an all-CPU backtrace.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 55537871ef666b4153fd1ef8782e4a13fee142cc)

Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Orabug: 24327572

8 years agoFix compilation error introduced by "cancel the setfilesize transation
Ashok Vairavan [Tue, 9 Aug 2016 18:06:22 +0000 (11:06 -0700)]
Fix compilation error introduced by "cancel the setfilesize transation
when io error happen"

xfs_trans_cancel() has two args in UEK4.

Orabug: 24385189
Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
8 years agocancel the setfilesize transation when io error happen
Zhaohongjiang [Mon, 12 Oct 2015 04:28:39 +0000 (15:28 +1100)]
cancel the setfilesize transation when io error happen

When I ran the xfstest/073 case, the remount process blocked waiting for
the transaction count to drop to zero. I found that an io error had
happened and the setfilesize transaction was not released properly. We
should cancel the transaction when an io error happens.

Reproduction steps:
1. dd if=/dev/zero of=xfs1.img bs=1M count=2048
2. mkfs.xfs xfs1.img
3. losetup -f ./xfs1.img /dev/loop0
4. mount -t xfs /dev/loop0 /home/test_dir/
5. mkdir /home/test_dir/test
6. mkfs.xfs -dfile,name=image,size=2g
7. mount -t xfs -o loop image /home/test_dir/test
8. cp a file bigger than 2g to /home/test_dir/test
9. mount -t xfs -o remount,ro /home/test_dir/test

[ dchinner: moved io error detection to xfs_setfilesize_ioend() after
  transaction context restoration. ]

Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Orabug: 24385189
mainline commit: 5cb13dcd0fac071b45c4bebe1801a08ff0d89cad

Signed-off-by: Ashok Vairavan <ashok.vairavan@oracle.com>
8 years agomm/hugetlb: optimize minimum size (min_size) accounting
Mike Kravetz [Fri, 20 May 2016 00:11:01 +0000 (17:11 -0700)]
mm/hugetlb: optimize minimum size (min_size) accounting

Orabug: 24450029

It was observed that minimum size accounting associated with the
hugetlbfs min_size mount option may not perform optimally and as
expected.  As huge pages/reservations are released from the filesystem
and given back to the global pools, they are reserved for subsequent
filesystem use as long as the subpool reserved count is less than
subpool minimum size.  It does not take into account used pages within
the filesystem.  The filesystem size limits are not exceeded and this is
technically not a bug.  However, better behavior would be to wait for
the number of used pages/reservations associated with the filesystem to
drop below the minimum size before taking reservations to satisfy
minimum size.

An optimization is also made to the hugepage_subpool_get_pages() routine
which is called when pages/reservations are allocated.  This does not
change behavior, but simply avoids the accounting if all reservations
have already been taken (subpool reserved count == 0).

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Orabug: 24450029
(cherry picked from commit 09a95e29cb30a3930db22d340ddd072a82b6b0db)
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
8 years agoBtrfs: fix device replace of a missing RAID 5/6 device
Omar Sandoval [Fri, 19 Jun 2015 18:52:51 +0000 (11:52 -0700)]
Btrfs: fix device replace of a missing RAID 5/6 device

The original implementation of device replace on RAID 5/6 seems to have
missed support for replacing a missing device. When this is attempted,
we end up calling bio_add_page() on a bio with a NULL ->bi_bdev, which
crashes when we try to dereference it. This happens because
btrfs_map_block() has no choice but to return us the missing device
because RAID 5/6 don't have any alternate mirrors to read from, and a
missing device has a NULL bdev.

The idea implemented here is to handle the missing device case
separately, which better only happen when we're replacing a missing RAID
5/6 device. We use the new BTRFS_RBIO_REBUILD_MISSING operation to
reconstruct the data from parity, check it with
scrub_recheck_block_checksum(), and write it out with
scrub_write_block_to_dev_replace().

Reported-by: Philip <bugzilla@philip-seeger.de>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96141
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 24447930
Signed-off-by: Divya Indi <divya.indi@oracle.com>
(cherry picked from commit 73ff61dbe5edeb1799d7e91c8b0641f87feb75fa)

8 years agoBtrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation
Omar Sandoval [Fri, 19 Jun 2015 18:52:50 +0000 (11:52 -0700)]
Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation

The current RAID 5/6 recovery code isn't quite prepared to handle
missing devices. In particular, it expects a bio that we previously
attempted to use in the read path, meaning that it has valid pages
allocated. However, missing devices have a NULL blkdev, and we can't
call bio_add_page() on a bio with a NULL blkdev. We could do manual
manipulation of bio->bi_io_vec, but that's pretty gross. So instead, add
a separate path that allows us to manually add pages to the rbio.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Orabug: 24447930
Signed-off-by: Divya Indi <divya.indi@oracle.com>
(cherry picked from commit b4ee1782686d5b7a97826d67fdeaefaedbca23ce)

8 years ago  kvm:vmx: more complete state update on APICv on/off
Roman Kagan [Wed, 18 May 2016 14:48:20 +0000 (17:48 +0300)]
kvm:vmx: more complete state update on APICv on/off

The function to update APICv on/off state (in particular, to deactivate
it when enabling Hyper-V SynIC) is incomplete: it doesn't adjust
APICv-related fields among secondary processor-based VM-execution
controls.  As a result, Windows 2012 guests get stuck when a SynIC-based
auto-EOI interrupt intersects with e.g. an IPI in the guest.

In addition, the MSR intercept bitmap isn't updated every time "virtualize
x2APIC mode" is toggled.  This path can only be triggered by a malicious
guest, because Windows doesn't use x2APIC but rather its own synthetic
APIC access MSRs; however a guest running in a SynIC-enabled VM could
switch to x2APIC and thus obtain direct access to host APIC MSRs
(CVE-2016-4440).

The patch fixes those omissions.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Reported-by: Steve Rutherford <srutherford@google.com>
Reported-by: Yang Zhang <yang.zhang.wz@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Orabug: 23347009
CVE: CVE-2016-4440
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>

8 years ago  fuse: direct-io: don't dirty ITER_BVEC pages
Ashish Samant [Thu, 18 Aug 2016 20:54:26 +0000 (13:54 -0700)]
fuse: direct-io: don't dirty ITER_BVEC pages

Reading from a loop device backed by a fuse file deadlocks on
lock_page().

This is because the page is already locked by the read() operation done on
the loop device.  In this case we don't want to either lock the page or
dirty it.

So do what fs/direct-io.c does: only dirty the page for ITER_IOVEC vectors.
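
The rule above can be sketched as a small user-space model. Everything here is an assumption made for illustration: the toy_* names are hypothetical and do not correspond to fuse or iov_iter identifiers, and the real code operates on struct page rather than a pair of flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the iterator flavours; not the kernel's definitions. */
enum toy_iter_type { TOY_ITER_IOVEC, TOY_ITER_KVEC, TOY_ITER_BVEC };

/* Toy page: only the two bits that matter for this example. */
struct toy_page {
    bool locked;
    bool dirty;
};

/* Only user-space (IOVEC) buffers are dirtied after a direct read. */
static bool toy_should_dirty(enum toy_iter_type type)
{
    return type == TOY_ITER_IOVEC;
}

/*
 * Release the pages of a completed direct read. The decision is made
 * once from the iterator type, so pages that the caller already holds
 * locked (the BVEC case) are neither locked again nor dirtied.
 */
static void toy_release_read_pages(struct toy_page *pages, size_t nr,
                                   enum toy_iter_type type)
{
    bool dirty = toy_should_dirty(type);

    for (size_t i = 0; i < nr; i++) {
        if (dirty)
            pages[i].dirty = true;
    }
}

/* Returns whether a page that is already locked ends up dirty. */
static bool toy_dirty_after_read(enum toy_iter_type type)
{
    struct toy_page page = { .locked = true, .dirty = false };

    toy_release_read_pages(&page, 1, type);
    return page.dirty;
}
```

In this model a TOY_ITER_BVEC read leaves the page untouched, while a TOY_ITER_IOVEC read marks it dirty.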

Orabug: 22652336

Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com>

8 years ago  bpf: fix double-fdput in replace_map_fd_with_map_ptr()
Jann Horn [Tue, 26 Apr 2016 20:26:26 +0000 (22:26 +0200)]
bpf: fix double-fdput in replace_map_fd_with_map_ptr()

When bpf(BPF_PROG_LOAD, ...) was invoked with a BPF program whose bytecode
references a non-map file descriptor as a map file descriptor, the error
handling code called fdput() twice instead of once (in __bpf_map_get() and
in replace_map_fd_with_map_ptr()). If the file descriptor table of the
current task is shared, this causes f_count to be decremented too much,
allowing the struct file to be freed while it is still in use
(use-after-free). This can be exploited to gain root privileges by an
unprivileged user.
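
The imbalance described above can be modelled with a toy reference counter. This is a sketch under stated assumptions, not the kernel code: the toy_* functions are hypothetical stand-ins for fdget(), fdput(), __bpf_map_get() and replace_map_fd_with_map_ptr(), and the fixed variant shows one way to restore the balance (the caller no longer drops a reference the helper already dropped); the actual patch is not reproduced in this log.

```c
#include <assert.h>

/* Toy stand-in for struct file: a reference count and a type flag. */
struct toy_file {
    int f_count;
    int is_map;
};

static void toy_fdget(struct toy_file *f) { f->f_count++; }
static void toy_fdput(struct toy_file *f) { f->f_count--; }

/*
 * Shaped like the helper in the message: when the descriptor is not a
 * map, it drops the reference itself and reports an error.
 */
static int toy_map_get(struct toy_file *f)
{
    if (!f->is_map) {
        toy_fdput(f);
        return -1;
    }
    return 0;
}

/* Buggy caller: drops the reference again on the error path. */
static int toy_replace_buggy(struct toy_file *f)
{
    toy_fdget(f);
    if (toy_map_get(f) < 0) {
        toy_fdput(f);           /* second put for a single get */
        return -1;
    }
    toy_fdput(f);
    return 0;
}

/* Balanced caller: relies on the helper having dropped the reference. */
static int toy_replace_fixed(struct toy_file *f)
{
    toy_fdget(f);
    if (toy_map_get(f) < 0)
        return -1;
    toy_fdput(f);
    return 0;
}

/*
 * Starts from a count of 1 (the reference held by the descriptor
 * table), runs one load attempt on a non-map file, and returns the
 * resulting count.
 */
static int toy_count_after(int (*replace)(struct toy_file *))
{
    struct toy_file f = { .f_count = 1, .is_map = 0 };

    replace(&f);
    return f.f_count;
}
```

With the buggy caller the count reaches 0 while the descriptor table still refers to the file, which is the use-after-free condition; with the balanced caller it stays at 1.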

This bug was introduced in
commit 0246e64d9a5f ("bpf: handle pseudo BPF_LD_IMM64 insn"), but is only
exploitable since
commit 1be7f75d1668 ("bpf: enable non-root eBPF programs") because
previously, CAP_SYS_ADMIN was required to reach the vulnerable code.

(posted publicly according to request by maintainer)

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8358b02bf67d3a5d8a825070e1aa73f25fb2e4c7)

Orabug: 23268285
Signed-off-by: Manjunath Govindashetty <manjunath.govindashetty@oracle.com>

8 years ago  Merge branch topic/uek-4.1/rpm-build of git://ca-git.us.oracle.com/linux-uek into...
Chuck Anderson [Fri, 28 Oct 2016 21:37:03 +0000 (14:37 -0700)]
Merge branch topic/uek-4.1/rpm-build of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1

8 years ago  Merge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek...
Chuck Anderson [Fri, 28 Oct 2016 21:36:16 +0000 (14:36 -0700)]
Merge branch 'topic/uek-4.1/ofed' of git://ca-git.us.oracle.com/linux-uek into uek/uek-4.1